Learning, Selection and Coding of New Block Transforms in and for the Optimization Loop of Video Coders Saurabh Puri
Total Page:16
File Type:pdf, Size:1020Kb
Learning, selection and coding of new block transforms in and for the optimization loop of video coders Saurabh Puri To cite this version: Saurabh Puri. Learning, selection and coding of new block transforms in and for the optimization loop of video coders. Computer Science [cs]. Université Bretagne Loire; Université de Nantes; LS2N, Université de Nantes, 2017. English. tel-01779566 HAL Id: tel-01779566 https://tel.archives-ouvertes.fr/tel-01779566 Submitted on 26 Apr 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Thèse de Doctorat Saurabh PURI Mémoire présenté en vue de l’obtention du grade de Docteur de l’Université de Nantes sous le sceau de l’Université Bretagne Loire École doctorale : Sciences et technologies de l’information, et mathématiques Discipline : Traitement du signal et des images, section CNU 27 Unité de recherche : LABORATOIRE DES SCIENCES DU NUMÉRIQUE DE NANTES (LS2N) Soutenance prévue le 09 November 2017 Learning, selection and coding of new block transforms in and for the optimization loop of video coders JURY Présidente : Mme Christine GUILLEMOT, Directrice de Recherche, INRIA Rennes Rapporteurs : M. Olivier DEFORGES, Professeur, INSA Rennes M. Marco CAGNAZZO, Associate Professeur, TELECOM-ParisTech Examinateur : M. André KAUP, Professeur, FAU-Erlangen-Nürnberg Allemagne Directeur de thèse : M. Patrick LE CALLET, Polytech Nantes, Université de Nantes Co-encadrant de thèse : M. Sébastien LASSERRE, Principal Scientist, Technicolor France Contents 1 General Introduction 11 1.1 Context......................................... 11 1.2 Objective and Scope of the Research......................... 12 1.3 Outline......................................... 13 1.4 Summary of The Contributions............................ 14 I Prior art on transform-based video compression 17 2 Video Compression, HEVC and Transform-based Coding 19 2.1 Video Compression: Basic Building Blocks...................... 19 2.2 High Efficiency Video Compression Standard.................... 20 2.2.1 Sketch of the codec............................... 21 2.2.2 Quad-tree structure in HEVC......................... 22 2.2.3 Intra and Inter Prediction........................... 23 2.2.4 Transforms in HEVC.............................. 24 2.2.5 Quantization in HEVC............................. 25 2.2.6 Adaptive Coefficient Scanning and Coefficient Encoding.......... 25 2.2.7 CABAC..................................... 27 2.2.8 Special Modes in HEVC............................ 28 2.2.9 Encoder Control................................ 28 2.2.10 Bjøntegaard Delta (BD) Rates........................ 30 2.3 Block Transforms in general.............................. 31 2.3.1 Properties of block transforms........................ 31 2.3.2 The Most Popular Transforms........................ 32 2.4 Motivation to improve the transforms used in HEVC................ 34 2.5 Conclusion....................................... 36 3 Advanced Transforms 37 3.1 Introduction....................................... 37 3.2 Future foreseen standard transforms in JVET.................... 38 3.2.1 Enhanced Multiple Transforms (EMT).................... 38 3.2.2 Secondary Transforms............................. 41 3.3 Other systematic Transforms............................. 43 3.3.1 Systematic Directional Transforms...................... 43 3.4 Offline learned Transforms............................... 46 3.4.1 Adaptive Karhunen Louve Transform (KLT)................ 46 3.4.2 Mode Dependent Transforms......................... 47 3 4 CONTENTS 3.4.3 Rate Distortion Optimized Transforms (RDOT).............. 48 3.5 Online learned transforms............................... 49 3.5.1 Content Adaptive Transforms (CAT)..................... 50 3.5.2 Signal Dependent Transforms (SDT)..................... 51 3.5.3 Other online transform learning schemes................... 51 3.6 Discussion........................................ 53 3.7 Conclusion....................................... 54 II The machinery on transform learning and coding 55 4 Data-Driven Transform Learning 57 4.1 Analysis of the training process of data-driven adaptive transforms........ 57 4.1.1 Elements of the training process....................... 58 4.1.2 Classification of residual blocks........................ 58 4.1.3 Computing optimized transform for each class................ 61 4.1.4 Re-classification................................ 65 4.1.5 Finding a consistent value for λ ....................... 65 4.2 Separable versus non-separable transform learning................. 66 4.3 Discussion and Conclusion............................... 67 4.3.1 Comparison of state-of-the-art offline learning schemes........... 68 5 Online and Offline Learning 69 5.1 Introduction....................................... 69 5.2 Online Adaptive Transform Learning Scheme.................... 69 5.2.1 Principles of the Algorithm.......................... 70 5.2.2 Proposed Adaptive Directional Transforms in HEVC............ 72 5.2.3 Simulation Results............................... 75 5.2.4 Discussion and Conclusions.......................... 79 5.3 Offline Adaptive Transform Learning scheme.................... 81 5.3.1 Learning scheme................................ 81 5.3.2 Implementation in HEVC........................... 82 5.3.3 Simulation Results............................... 83 5.4 Comparison between offline and online learning................... 85 5.4.1 Short-comings of the online framework.................... 85 5.5 Conclusions and Perspectives............................. 87 5.5.1 Perspectives................................... 88 6 Advanced signaling of transform and transform index 89 6.1 Goal of this chapter.................................. 89 6.2 Proposed methods for signaling the overhead.................... 89 6.3 Scheme for coding of transform basis vectors.................... 90 6.3.1 Prior art on basis vector coding........................ 90 6.3.2 Proposed methods............................... 92 6.3.3 Simulation Results............................... 97 6.3.4 Conclusion................................... 98 6.4 Proposed transform index prediction......................... 98 6.4.1 Related Works................................. 99 6.4.2 Proposed trained model-based transform index prediction......... 99 6.4.3 Architecture and training of CNN-based model............... 101 CONTENTS 5 6.4.4 Simulation Results............................... 103 6.4.5 Conclusions................................... 106 6.5 Discussion and Conclusions.............................. 106 III Improving offline adaptive transform learning 107 7 Offline scheme improvement 109 7.1 Introduction....................................... 109 7.2 Observations on training sets used for the design of MDTC............ 110 7.3 Proposed improvements to MDTC scheme to get proposed IMDTC........ 112 7.3.1 Proposed iterative learning on training set.................. 112 7.3.2 Learning KLTs on the chroma components................. 113 7.3.3 Extension to larger residual blocks...................... 114 7.3.4 Low-complexity IMDTC using partial transforms.............. 114 7.4 Experimental results.................................. 115 7.4.1 Effect of iterating over the training set.................... 115 7.4.2 R-D performance over HEVC due to improved chroma residual coding.. 116 7.4.3 Extension of transforms to 16×16 residual blocks.............. 118 7.4.4 Comparison of low-complexity IMDTC and full IMDTC scheme..... 119 7.4.5 Comparison of R-D performance, encoding complexity and memory re- quirement with MDTC scheme........................ 120 7.5 Conclusion....................................... 122 8 Content adaptability improvement: playing with datasets 125 8.1 Introduction....................................... 125 8.2 Related works...................................... 126 8.3 Generation of a pool of multiple transform sets................... 127 8.4 Proposed pool-based transform coding scheme.................... 128 8.5 Simulation Results................................... 129 8.6 Conclusions and Perspectives............................. 130 IV Conclusion and Future Perspectives 133 9 Conclusions and Future work 135 9.1 Summary........................................ 135 9.2 What we have learnt.................................. 138 9.2.1 h.264 vs. HEVC................................ 138 9.2.2 Online vs Offline................................ 139 9.2.3 Penalization and rate models......................... 139 9.2.4 Transform genericity versus specificity.................... 139 9.3 Future Work...................................... 140 A Smart re-ordering of coefficients during learning 143 B Statistical model to determine precision drop value b 147 C Author’s publications 151 List of Tables 3.1 Transform basis functions for the DCT and the DST-type transforms [1]..... 39 3.2 Transform set with corresponding candidates [1].................. 40 3.3 Transform set implicitly (no signaling) chosen for each IP Mode [1]........ 40 4.1 Summary of different offline transform learning schemes.............