Robust Scalable Video Compression Using Multiple Description Coding

ROBUST SCALABLE VIDEO COMPRESSION USING MULTIPLE DESCRIPTION CODING
A Dissertation Submitted to the Graduate School of the University of Notre Dame in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
by Guanjun Zhang, B.E.E.E., M.S.E.E.
Robert L. Stevenson, Director
Graduate Program in Electrical Engineering
Notre Dame, Indiana
April 2007

Abstract
The growth of the Internet and wireless networks has boosted a wide range of digital video applications. Network video, such as digital video broadcasting, streaming, or surveillance, involves various networks and diverse clients. The networks may have varying bandwidths, loss rates, and best-effort or Quality of Service (QoS) capabilities. Video transmission over heterogeneous networks suffers from delay, congestion, losses, and errors. Video clients also tend to have various system resources and display resolutions. Often, these conditions are unknown in advance.
In such an environment, the challenge of reliably delivering video over error-prone networks requires error resilience and flexible rate control. The usual codec design tradeoff between bit rate and quality is complicated by these requirements. The design of robust and scalable video codecs is desirable and is the key to the success of these applications.
Robust video coding plays an important role in limiting error propagation and improving visual quality in case of errors. It addresses the issue of error concealment by designing proper structures and maintaining acceptable redundancy while minimizing complexity.
Scalable video coding encodes a video sequence in such a way that multiple levels of quality can be obtained, depending on the parts of the video bit stream that are received. It has the potential for high flexibility and error resilience because of its separable bit stream structure.
Conventional layered video coding enables nested scalabilities among different parts of the coded bit streams. A fixed decoding order is necessary in layered coding to improve quality. The emerging multiple description coding provides parallel scalabilities that enable decoding using all available information.
In-depth research has revealed that multiple description coding has great potential in best-effort networks, while layered coding performs better with QoS. In scenarios where tight delay constraints are imposed, which is true of most real-time applications such as video conferencing and video streaming, multiple description coding becomes the best choice. The additional coder complexity and the redundancy in the information content are part of the price for this convenience and depend on the actual algorithms.
The work in this dissertation aimed to develop new approaches for robust and scalable video coding that meet the requirements of reliable delivery of video to diverse clients over heterogeneous networks. In the study of robust video coding technologies, we proposed an error-resilient video coding algorithm using reference diversity. In an effort to increase the robustness of multiple description video streams, a modified multiple state video coding scheme is introduced for better error concealment results. Significant quality improvement is observed with the enhanced algorithm, and the complexity of error concealment is also reduced, which can benefit power-constrained clients. A hybrid scalable video coding algorithm with both layered and parallel scalabilities has been proposed based on multiple state video coding. Quality scalability is achieved by layered coding, while temporal scalability is implemented using the multiple description principle. Experimental evidence indicates that it has better error concealment in networks with high error rates.
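
As a rough illustration of the parallel scalability described above, the following Python sketch splits a toy sequence into two descriptions (even-indexed and odd-indexed frames, the temporal split commonly associated with multiple state video coding) and shows that either description alone can be decoded, with the missing frames concealed from their received temporal neighbours. The toy frame representation, the function names `split_descriptions`, `merge_descriptions`, and `conceal`, and the simple averaging concealment are all assumptions made for illustration; they are not the algorithms developed in this dissertation.

```python
from typing import List, Optional

Frame = List[float]  # hypothetical toy "frame": a flat list of pixel intensities


def split_descriptions(frames: List[Frame]):
    """Split a sequence into two descriptions: even- and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def merge_descriptions(even: Optional[List[Frame]],
                       odd: Optional[List[Frame]],
                       total: int) -> List[Optional[Frame]]:
    """Interleave whichever descriptions arrived; lost frames are None."""
    merged: List[Optional[Frame]] = [None] * total
    if even is not None:
        merged[0::2] = even
    if odd is not None:
        merged[1::2] = odd
    return merged


def conceal(frames: List[Optional[Frame]]) -> List[Frame]:
    """Fill each lost frame with the average of its received neighbours.

    Assumed concealment rule for this sketch: average the previous and next
    frames when both were received, otherwise copy the single one available.
    """
    result: List[Frame] = []
    for i, frame in enumerate(frames):
        if frame is not None:
            result.append(frame)
            continue
        prev_frame = frames[i - 1] if i > 0 else None
        next_frame = frames[i + 1] if i + 1 < len(frames) else None
        neighbours = [f for f in (prev_frame, next_frame) if f is not None]
        if not neighbours:
            raise ValueError("no received neighbour to conceal from")
        result.append([sum(px) / len(neighbours) for px in zip(*neighbours)])
    return result


if __name__ == "__main__":
    # Five two-pixel frames whose brightness rises linearly over time.
    video = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0], [8.0, 8.0]]
    even, odd = split_descriptions(video)
    # The odd description is lost in transit; decode from the even one alone.
    print(conceal(merge_descriptions(even, None, len(video))))
```

In this sketch either description is usable without the other, which is the contrast with layered coding, where an enhancement layer depends on the layers below it having been received.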

To my parents, Ching, and Kevin

ACKNOWLEDGMENTS
I want to express my gratitude to my academic advisor, Dr. Robert L. Stevenson, for the invaluable ideas he has shared with me, for all the guidance he has provided, and for his unceasing help in this work. He is always supportive, patient, and understanding. I have enjoyed our working relationship, and I am grateful to have had the opportunity to work with him.
I would also like to thank Dr. Patrick Flynn, Dr. Martin Haenggi, and Dr. Nicholas Laneman, who kindly agreed to serve on my thesis examining committee, for their constructive suggestions on both academic matters and language, and for the inspiring questions that improved the quality of this thesis. Special thanks to Dr. Laneman, whose insightful lectures and meetings early in this work helped me find the path in the maze of research.
Thanks to the Department of Electrical Engineering for the opportunity to pursue my educational goals and for the help throughout my time at Notre Dame. Many thanks to Dr. Daniel Costello, who provided me a great opportunity to come to this beautiful campus seven years ago.
My thanks to Connie, Debbie, Bong, and all other ISSA staff members who have given me help during my years at Notre Dame. They are fabulous friends and gracious people.
To Kyle Erickson, Adam Alessio, Sean Borman, Blanca Bandia, and Mark Robertson, thank you for your friendship. I am grateful for your help with academic matters and language. Also, to Hu Wenchuang, Hu Min, Ye Xin, Feng Junbo, Liu Donglin, and Gao Yongqin, the brothers, my life would be less successful without your friendship.
Thanks to my parents, for the values, support, and love they have given me throughout the years. Finally, I give my greatest and special thanks to Ching, a true friend and supportive partner, whose love, encouragement, and support have empowered me to complete this work, and to our son Kevin. It is they who make all my work meaningful and joyful. This work is dedicated to them.

CONTENTS
ACKNOWLEDGMENTS
FIGURES
TABLES
CHAPTER 1: INTRODUCTION
CHAPTER 2: FUNDAMENTALS OF VIDEO COMPRESSION
  2.1 General Video Compression Techniques
    2.1.1 Generic Video Codec
    2.1.2 Inter-frame Prediction
    2.1.3 Transforms
    2.1.4 Quantization
    2.1.5 Entropy Coding
  2.2 International Standards
    2.2.1 ITU-T Recommendations
    2.2.2 MPEG Standards
    2.2.3 Joint Video Team
CHAPTER 3: ROBUST VIDEO CODING USING VIRTUAL REFERENCE PICTURE
  3.1 Introduction to robust video coding
  3.2 Virtual Reference Picture
    3.2.1 Coding Scheme
    3.2.2 Filter Selection
  3.3 Simulation Setup
  3.4 Results and Discussion
  3.5 Conclusions
CHAPTER 4: SCALABLE VIDEO CODING
  4.1 Introduction
  4.2 Layered Video Coding
    4.2.1 Spatial Scalability
    4.2.2 Temporal Scalability
    4.2.3 Quality Scalability
    4.2.4 Fine Grain Scalability
  4.3 Multiple Description Coding
    4.3.1 Introduction
    4.3.2 Multiple Description Image Coding
    4.3.3 Multiple Description Video Coding
    4.3.4 MDC with Subband Coding
CHAPTER 5: ENHANCED ERROR CONCEALMENT OF MULTIPLE STATE VIDEO CODING
  5.1 Multiple State Video Coding
  5.2 Error Recovery of MSVC
  5.3 Improved MSVC Scheme
  5.4 Results
CHAPTER 6: HYBRID SCALABLE VIDEO USING MSVC AND SNR SCALABILITY
  6.1 Robust Video with Layered and Parallel Scalabilities
  6.2 Proposed Hybrid Scalable Video
  6.3 Results
    6.3.1 Coding Efficiency
    6.3.2 Frame Drop Environment
    6.3.3 Packet Drop Environment
  6.4 Remarks
CHAPTER 7: SUMMARY
CHAPTER 8: FUTURE RESEARCH
  8.1 Improvement on VRP
  8.2 Future Improvement on Error Concealment of MSVC
  8.3 Hybrid Scalable Video
REFERENCES

FIGURES
2.1 Hybrid Motion-Compensated Video Encoder.
2.2 Hybrid Motion-Compensated Video Decoder.
2.3 Motion estimation.
2.4 An example of logarithmic search.
2.5 Function of a seven-step uniform quantizer.
2.6 An example of the Huffman tree.
2.7 An example of arithmetic encoding procedures.
3.1 A video coding system using virtual reference pictures.
3.2 Dependence of pictures in the virtual reference video coding system.
3.3 Error recovery of median filter in VRP.
3.4 Coding overhead of QCIF sequences on VRP codecs at different bit rates against the input size of the VRP filters: a) and b) foreman; c) and d) container; e) and f) silent.
3.5 Coding overhead of SIF sequence mobile on VRP codecs at different bit rates against the input size of the VRP filters.
3.6 PSNR value comparison of sequence mobile.
3.7 Pictures from mobile: a) - c) picture No.7 to No.9 of the error free sequence; d) - f) picture No.7 to No.9 of the damaged and concealed VRP sequence.
3.8 PSNR vs. BER curves of sequence foreman, QCIF.
3.9 PSNR vs. BER curves of sequence container, QCIF.
3.10 PSNR vs. BER curves of sequence silent, QCIF.
3.11 PSNR vs. BER
