Predictive Coding Algorithms for Lossy Image and Video Compression
Total Page:16
File Type:pdf, Size:1020Kb
PREDICTIVE CODING ALGORITHMS FOR LOSSY IMAGE AND VIDEO COMPRESSION Lu´ısFilipe Ros´arioLucas Tese de Doutorado apresentada ao Programa de P´os-gradua¸c~ao em Engenharia El´etrica, COPPE, da Universidade Federal do Rio de Janeiro, como parte dos requisitos necess´arios `aobten¸c~aodo t´ıtulode Doutor em Engenharia El´etrica. Orientadores: Eduardo Ant^onioBarros da Silva S´ergioManuel Maciel de Faria Rio de Janeiro Janeiro de 2016 PREDICTIVE CODING ALGORITHMS FOR LOSSY IMAGE AND VIDEO COMPRESSION Lu´ısFilipe Ros´arioLucas TESE SUBMETIDA AO CORPO DOCENTE DO INSTITUTO ALBERTO LUIZ COIMBRA DE POS-GRADUAC¸´ AO~ E PESQUISA DE ENGENHARIA (COPPE) DA UNIVERSIDADE FEDERAL DO RIO DE JANEIRO COMO PARTE DOS REQUISITOS NECESSARIOS´ PARA A OBTENC¸ AO~ DO GRAU DE DOUTOR EM CIENCIAS^ EM ENGENHARIA ELETRICA.´ Examinada por: Prof. Eduardo Ant^onioBarros da Silva, Ph.D. Prof. S´ergioManuel Maciel de Faria, Ph.D. Prof. Sergio Lima Netto, Ph.D. Prof. Ricardo Lopes de Queiroz, Ph.D. Prof. Fernando Manuel Bernardo Pereira, Ph.D. RIO DE JANEIRO, RJ { BRASIL JANEIRO DE 2016 Lucas, Lu´ısFilipe Ros´ario Predictive coding algorithms for lossy image and video compression/Lu´ısFilipe Ros´arioLucas. { Rio de Janeiro: UFRJ/COPPE, 2016. XXIII, 211 p.: il.; 29; 7cm. Orientadores: Eduardo Ant^onioBarros da Silva S´ergioManuel Maciel de Faria Tese (doutorado) { UFRJ/COPPE/Programa de Engenharia El´etrica, 2016. Refer^enciasBibliogr´aficas:p. 197 { 211. 1. Image and video compression. 2. Directional intra prediction. 3. Linear prediction. 4. Least-squares optimisation. 5. Sparse representation. I. Silva, Eduardo Ant^onioBarros da et al. II. Universidade Federal do Rio de Janeiro, COPPE, Programa de Engenharia El´etrica.III. T´ıtulo. iii The only source of knowledge is experience. Albert Einstein iv Acknowledgements I would like to express my gratitude to my supervisors Prof. Eduardo da Silva, Prof. S´ergioFaria, Prof. Nuno Rodrigues and Prof. Carla Pagliari, without whom this thesis would not be made possible. I am thankful for their support and wise guidance during this journey; for their suggestions and fruitful discussions providing new insights and inspiring ideas that enriched this research work; for their availability to review and improve the papers and this thesis; and also for their friendship. I would also like to acknowledge Prof. Krzysztof Wegner, for helping to implement the VSO algorithm and for reviewing the journal paper on depth map coding. I would like to thank Funda¸c~aopara a Ci^enciae Tecnologia, which provided the financial support for this PhD. I am also grateful to the Instituto de Teleco- munica¸c~oesin Escola Superior de Tecnologia e Gest~aoof Instituto Polit´ecnicode Leiria and also to the Laborat´orio de Sinais, Multim´ıdiae Telecomunica¸c~oesof Uni- versidade Federal do Rio de Janeiro, which provided the computer resources and laboratory facilities with excellent conditions for my research work in Leiria and Rio de Janeiro, and provided some funds for participating in scientific conferences. I would like to acknowledge my research group colleagues of IT, namely Sylvain, Lino, Carreira, Ricardo, Santos and Andr´e,for the nice working environment, for sharing new ideas, and also for the moments outside the laboratory. I am also thankful to Danillo, Nelson and Diego, for sharing their knowledge and experience, since my first years as researcher. To all my colleagues of SMT-UFRJ, in particular Anderson, Fl´avio,Tadeu, Luiz and Renam, I express my gratitude for receiving me in Brazil, and for helping me with the statistics and algebra courses. I also thank all my friends, that accompanied me during this journey, for the distracting and social moments, running nights in Leiria and pleasant bike rides. I must thank all those people and communities around the world that contribute for the development of many free and open-source tools used in this research work, namely the GNU/Linux/KDE software, distributed by Ubuntu and Gentoo commu- nities. I also acknowledge Carreira for the PlaYUVer software, and its contribution to the Remoterun scripts, which made our lives as researchers much easier. Last but not least, I would like to thank my family and Ana for the support and love, mainly in those moments I was absent and more involved with this work. v Resumo da Tese apresentada `aCOPPE/UFRJ como parte dos requisitos necess´arios para a obten¸c~aodo grau de Doutor em Ci^encias(D.Sc.) ALGORITMOS DE PREDIC¸ AO~ PARA COMPRESSAO~ DE IMAGEM E V´IDEO COM PERDAS Lu´ısFilipe Ros´arioLucas Janeiro/2016 Orientadores: Eduardo Ant^onioBarros da Silva S´ergioManuel Maciel de Faria Programa: Engenharia El´etrica Nesta tese s~aoinvestigadas t´ecnicas de predi¸c~aointra para os atuais algoritmos estado-da-arte de compress~aode imagem e v´ıdeo.S~aopropostos v´ariosm´etodos de predi¸c~aoque permitem reduzir a redund^ancia espacial presente em v´ıdeos2D e 3D, usando algoritmos baseados em diferentes paradigmas de compress~ao. A primeira proposta desta tese consiste em um esquema de predi¸c~aomais eficiente para o algoritmo Multidimensional Multiscale Parser (MMP) baseado em casamento de padr~oes.Com a utiliza¸c~aode um maior n´umerode modos de predi¸c~aodirecional o algoritmo MMP melhora significativamente, atingindo um desempenho competitivo com os atuais padr~oesde codifica¸c~aobaseados em transformada. O estudo de novos m´etodos de predi¸c~aopara o MMP levou ao desenvolvimento de um novo algoritmo de compress~aode mapas de profundidade que combina predi¸c~ao direcional com segmenta¸c~aoflex´ıvel e aproxima¸c~oeslineares. Os resultados expe- rimentais, baseados na qualidade das vistas sintetizadas, apresentam ganhos taxa- distor¸c~aoconsistentes sobre o padr~ao3D-HEVC (High Efficiency Video Coding). Em rela¸c~ao`apredi¸c~aolinear, s~aoapresentadas novas abordagens que usam mo- delos lineares com restri¸c~oesde esparsidade. A predi¸c~ao Locally Linear Embedding ´e investigada com sucesso para compress~aode imagens holosc´opicasusando o padr~ao HEVC. E´ tamb´emdesenvolvido um novo m´etodo de predi¸c~aolinear esparsa, que melhora o desempenho do HEVC, em particular para imagens com estruturas com- plexas e repetidas. Por ´ultimo, ´eproposto um novo m´etodo de predi¸c~aogeneralizado que unifica as predi¸c~oesdirecional e linear, com base em modelos esparsos ´optimos. Os resultados experimentais mostram a vantagem do novo esquema de predi¸c~aoem rela¸c~ao`aatual predi¸c~aointra do padr~aoHEVC. vi Abstract of Thesis presented to COPPE/UFRJ as a partial fulfillment of the requirements for the degree of Doctor of Science (D.Sc.) PREDICTIVE CODING ALGORITHMS FOR LOSSY IMAGE AND VIDEO COMPRESSION Lu´ısFilipe Ros´arioLucas January/2016 Advisors: Eduardo Ant^onioBarros da Silva S´ergioManuel Maciel de Faria Department: Electrical Engineering This thesis investigates efficient intra prediction techniques for current state- of-the-art image and video compression algorithms. Several prediction methods are proposed to reduce the spatial redundancy present in 2D and 3D video signals, using several compression algorithms based on different coding paradigms. The first proposal of this thesis consists in an improved prediction framework for a pattern-matching-based image encoder known as Multidimensional Multiscale Parser (MMP). It is demonstrated that the MMP algorithm significantly benefits from using an increased number of directional prediction modes, achieving a com- petitive performance in comparison to the transform-based standards. In the context of the prediction methods investigated for the MMP, a new highly predictive algorithm is presented for depth map coding, combining directional pre- diction with flexible block partitioning and linear fitting. Based on the quality of the synthesised views, the experimental results show consistent rate-distortion gains relative to the current 3D-HEVC (High Efficiency Video Coding) standard. Regarding linear prediction, new approaches that enforce sparsity constraints on linear models are presented. The Locally Linear Embedding-based prediction method is investigated for holoscopic image coding using the HEVC standard with successful results. A new sparse linear prediction method is also developed, im- proving the coding performance of HEVC, particularly for images with complex and repeated structures. Finally, a new generalised intra prediction framework is proposed, unifying the directional and linear prediction methods, based on opti- mal sparse models. Experimental results show the advantage of the new prediction framework over the current intra prediction of HEVC standard. vii Contents List of Figures xii List of Tables xvii List of Abbreviations xix 1 Introduction 1 1.1 Motivation . .1 1.2 Main objectives and contributions . .3 1.3 Outline . .7 2 Prediction techniques for image and video coding 9 2.1 Digital video representation . .9 2.2 Image prediction overview . 11 2.3 State-of-the-art prediction methods . 15 2.3.1 Intra-frame prediction . 16 2.3.2 Inter-frame prediction . 20 2.4 Least-squares prediction methods . 23 2.4.1 Linear prediction of images and video using LSP . 24 2.4.2 Context-based adaptive LSP . 25 2.4.3 Block-based LSP . 26 2.4.4 Spatio-temporal LSP . 27 2.5 Sparse representation for image prediction . 28 2.5.1 Sparse prediction problem formulation . 29 2.5.2 Matching Pursuit methods . 31 2.5.3 Template Matching algorithm . 32 2.5.4 Neighbour embedding methods . 33 2.6 Conclusions . 36 3 Image and video coding algorithms 37 3.1 Hybrid video compression . 37 3.2 Compression of 2D video . 40 viii 3.2.1 H.265/HEVC standard . 40 3.2.2 Multidimensional Multiscale Parser algorithm . 50 3.2.3 Experimental results . 53 3.3 Compression of 3D video . 57 3.3.1 3D video systems . 57 3.3.2 3D video coding standards . 66 3.3.3 Experimental results . 70 3.4 Conclusions . 73 4 Efficient predictive coding using MMP 75 4.1 Introduction .