RATE DISTORTION OPTIMIZATION FOR INTERPREDICTION IN H.264/AVC CODING

Thesis

Submitted to

The School of Engineering of the

UNIVERSITY OF DAYTON

In Partial Fulfillment of the Requirements for

The Degree of

Master of Science in Electrical Engineering

By

Jonathan Patrick Skeans

UNIVERSITY OF DAYTON

Dayton, Ohio

August, 2013

RATE DISTORTION OPTIMIZATION FOR INTERPREDICTION IN H.264/AVC

VIDEO CODING

Name: Skeans, Jonathan Patrick

APPROVED BY:

Eric Balster, Ph.D.
Advisor Committee Chairman
Assistant Professor, Department of Electrical and Computer Engineering

Frank Scarpino, Ph.D.
Committee Member
Professor Emeritus, Department of Electrical and Computer Engineering

Vijayan Asari, Ph.D.
Committee Member
Professor, Department of Electrical and Computer Engineering

John G. Weber, Ph.D.
Associate Dean
School of Engineering

Tony E. Saliba, Ph.D.
Dean, School of Engineering
& Wilke Distinguished Professor

ABSTRACT

RATE DISTORTION OPTIMIZATION FOR INTERPREDICTION IN H.264/AVC VIDEO CODING

Name: Skeans, Jonathan Patrick
University of Dayton

Advisor: Dr. Eric Balster

Part 10 of MPEG-4 describes the Advanced Video Coding (AVC) method widely known as H.264. H.264 is the product of a collaborative effort known as the Joint Video Team (JVT). The final draft of the standard was completed in May of 2003, and since then H.264 has become one of the most commonly used formats for video compression [1]. H.264, unlike previous standards, describes a myriad of coding options that include variable block size inter prediction methods, nine different intra prediction modes, multi frame prediction, and B frame prediction. The many coding combinations tend to generate different numbers of coded bits and different reconstruction quality. A video encoder is challenged to minimize coded bitrate and maximize quality. However, choosing the coding mode of a macroblock to achieve this is a difficult problem due to the large number of coding combinations and parameters. Rate Distortion Optimization is an effective technique for choosing the 'best' coding mode for a macroblock. This thesis presents two features of an H.264 encoder, multi frame prediction and B frame prediction. Additionally, a Rate Distortion Optimization scheme is implemented with these features to improve overall performance of the encoder.

For my friends and family

ACKNOWLEDGMENTS

I would like to thank my family for their support during my time as a college student. I would also like to thank the following people for making the experience more rewarding:

• Thank you to Chris McGuinness for helping me learn H.264 and always being available to answer any questions I had. You have been an excellent role model, colleague, and friend.

• Thank you to William Turri and the rest of the ADDA lab for all the assistance you have given me throughout my time with UDRI.

• Thank you to Kerry Hill, Al Scarpelli, and the Air Force Research Laboratory for enabling the experience.

• Thank you to Dr. Frank Scarpino and Dr. Vijayan Asari for serving on my thesis committee.

• Thank you to Mike Ratterman and Chris Direnzi for putting up with me during undergrad.

• Special thanks to Dr. Eric Balster for taking a chance on me and serving as my advisor.

TABLE OF CONTENTS

ABSTRACT ...... iii

DEDICATION ...... iv

ACKNOWLEDGMENTS ...... v

LIST OF FIGURES ...... viii

LIST OF TABLES ...... x

I. Introduction ...... 1

1.1 Video Coding Overview ...... 1
1.2 Video Coding Standards ...... 2
1.2.1 H.264 Standard ...... 3
1.3 H.264 Overview ...... 3
1.4 Prediction ...... 4
1.4.1 Intra Prediction ...... 5
1.4.2 Inter Prediction ...... 6
1.5 Transform, Scaling, and Quantization ...... 10
1.5.1 Hadamard Transform ...... 11
1.5.2 Quantization ...... 13
1.6 Entropy Coding ...... 14
1.6.1 Exp-Golomb Coding ...... 14
1.6.2 CAVLC ...... 15
1.6.3 CABAC ...... 15
1.7 Profiles and Levels ...... 15
1.8 Mode Selection ...... 16
1.8.1 Rate Distortion Optimized Mode Selection ...... 19
1.9 Motivation and Organization ...... 19

II. Multi Frame Prediction ...... 20

2.1 Interprediction Overview ...... 20
2.2 Syntax Overview ...... 21
2.3 Picture Ordering ...... 22
2.4 Reference Picture Lists ...... 25
2.5 Exp-Golomb Coding ...... 26
2.6 Motion Vector Prediction ...... 28
2.7 Multi Frame Encoding ...... 29
2.8 Conclusions ...... 33

III. B Frame Inter Prediction ...... 38

3.1 B Frame Inter Prediction Overview ...... 38
3.2 B Frame Reference Picture Lists ...... 39
3.3 B Frame Coding ...... 41
3.3.1 SPS and PPS ...... 41
3.3.2 Decoded Picture Buffer ...... 43
3.3.3 Create Search Window ...... 43
3.3.4 Block Match, Transform and Quantize ...... 43
3.3.5 Motion Vector Prediction ...... 44
3.4 B Frame Implementation ...... 44
3.5 B Frame Conclusions ...... 45

IV. Mode Selection ...... 48

4.1 Introduction ...... 48
4.2 Proposed Low Complexity RDO Method ...... 49
4.3 Proposed RDO Method Implementation ...... 50
4.4 Conclusions ...... 52

V. Conclusions and Future Work ...... 56

5.1 Conclusions ...... 56
5.2 Future Work ...... 57

BIBLIOGRAPHY ...... 58

LIST OF FIGURES

1.1 Subdivision of Picture into Slices ...... 5

1.2 Prediction Samples for luma 4x4 prediction ...... 6

1.3 Macroblock Partitioning for Inter prediction ...... 8

1.4 Multiframe ...... 9

1.5 Extracting DC Coefficients ...... 12

1.6 H.264 Syntax Layers ...... 14

1.7 H.264 Profiles ...... 16

1.8 Available Prediction Modes ...... 18

2.1 Macroblock Layer Overview: Baseline ...... 23

2.2 mb pred syntax overview ...... 24

2.3 sub mb pred syntax overview ...... 25

2.4 Display Order Example, Type 0 ...... 26

2.5 Reference Picture Order Example: P Slices ...... 27

2.6 Current and neighboring partitions: 16x16 partitions ...... 29

2.7 Current and neighboring partitions: different partitions sizes ...... 30

2.8 Multi Frame Prediction Foreman ...... 32

2.9 Multi Frame Prediction Flower ...... 33

2.10 Multi Frame Prediction Flyby ...... 34

2.11 Multi Frame Prediction Foreman Complexity ...... 35

2.12 Multi Frame Prediction Flower Complexity ...... 36

2.13 Multi Frame Prediction Flyby Complexity ...... 37

3.1 IPBB Display Order ...... 40

3.2 List0 and List1 Ordering Example ...... 41

3.3 B MB Prediction Block Diagram ...... 42

3.4 B MB Motion Vector Prediction ...... 44

3.5 Rate Distortion Curve Foreman using B Frame Interprediction ...... 45

3.6 Rate Distortion Curve Foreman using B Frame Interprediction for QPs 24 through 28 46

3.7 Complexity using B Frame Interprediction ...... 47

4.1 Traditional RDO ...... 50

4.2 Proposed RDO Method ...... 51

4.3 Proposed RDO Results: Foreman ...... 52

4.4 Proposed RDO Complexity: Foreman ...... 53

4.5 Proposed RDO Results: Flower ...... 54

4.6 Proposed RDO Complexity: Flower ...... 55

LIST OF TABLES

1.1 Video Compression Standards ...... 2

1.2 Luma Prediction Modes, 4x4 prediction ...... 7

2.1 Exp-Golomb Codewords ...... 27

2.2 Mappings to codeNum ...... 28

3.1 Display Order Example ...... 40

CHAPTER I

Introduction

1.1 Video Coding Overview

Digital media has gone through a significant change over the past 10 years [1]. Most consumers now receive digital television, which offers a greater choice of channels, electronic guides, and high definition programming. DVDs and Blu-ray Discs are the primary medium for playing prerecorded movies and television programs. An alternative to this technology is Internet downloading and streaming. Many other changes in digital media include increased functionality of cellular telephones, increases in home internet speeds, and video calling via the internet.

Many factors have contributed to the shift towards digital video, including commercial factors, legislation, social changes, and technological advances [1]. One technical aspect that is key to the widespread adoption of digital video technology is video compression. Video compression is the process of reducing the amount of data required to represent a digital video signal, prior to transmission or storage [1]. The complementary operation, video decompression, recovers a digital signal from a compressed representation, prior to display. This entire process, known as video coding, is essential for any video application in which storage capacity or transmission bandwidth is limited.

1.2 Video Coding Standards

Standards exist to simplify interoperability between encoders and decoders from different manufacturers. Table 1.1 shows a partial history of video compression standards.

Table 1.1: Video Compression Standards

Year   Standard               Publisher
1984   H.120                  ITU-T
1988   H.261                  ITU-T
1993   MPEG-1 Part 2          ISO, IEC
1995   H.262/MPEG-2 Part 2    ISO, IEC, ITU-T
1996   H.263                  ITU-T
1999   MPEG-4 Part 2          ISO, IEC
2003   H.264/MPEG-4 AVC       ISO, IEC, ITU-T
2013   H.265                  ISO, IEC, ITU-T (under development at time of writing)

The requirements for a successful video coding standard include:

• Interoperability: should ensure that encoders and decoders from different manufacturers work together seamlessly.

• Innovation: should perform significantly better than the previous standard.

• Competition: should be flexible enough to allow competition between manufacturers based on technical merit. Only the bit-stream syntax and reference decoder are standardized.

• Independence from transmission and storage media: should be flexible enough to be used for a range of applications.

• Forward compatibility: should decode bit-streams from prior standards.

• Backward compatibility: prior generation decoders should be able to partially decode new bit-streams.

1.2.1 H.264 Standard

The most recent video compression standard, H.264, was finalized in May 2003 by the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO) [1]. H.264, also known as MPEG-4 Part 10 and Advanced Video Coding, describes and defines a method of coding video that can give better performance than any of the preceding standards. Using H.264, it is possible to compress video into a smaller space, which means less transmission bandwidth and/or less storage space is required. H.264 has more options and parameters than any of the previous standards [1]. Tuning these parameters properly delivers high compression performance; tuning them improperly leads to poor-quality pictures and/or poor bandwidth efficiency. A standout feature of H.264 is its flexibility, coming with several tools for reducing redundant video information. This allows H.264 to vary from being highly complex to a rather simple algorithm, depending on the quality requirements. In addition to these tools, the H.264 standard defines 17 sets of capabilities, or profiles, that allow for this flexibility [1]. Each profile is targeted at a specific application, ranging from high quality 3D stereoscopic video compression to relatively low quality 2D video streaming.

1.3 H.264 Overview

The H.264 encoding process consists of three steps to produce a compressed H.264 bitstream: prediction, transformation, and encoding. The H.264 decoding process performs the complementary steps to produce a decoded video sequence. The decoded version is, in general, not identical to the original sequence because H.264 is a lossy format.

The structure of a typical encoder is shown below. Data is segmented in units of a macroblock (MB), which corresponds to a 16x16 group of displayed pixels. A prediction MB is generated and subtracted from the current MB to form a residual MB. The residual MB is transformed, quantized, and encoded. Meanwhile, the quantized data is re-scaled, inverse transformed, and added to the prediction MB to form a reconstructed MB, which is stored for later predictions.

1.4 Prediction

H.264 supports a wide range of prediction options that include

• Intra prediction: Prediction formed from previously encoded data within the current frame.

• Inter prediction: Prediction formed using motion compensation from previously coded frames.

• Multiple prediction block sizes: Used in both intra and inter prediction in order to form more

accurate predictions.

• Multi Frame Prediction: Used in inter prediction in order to form more accurate predictions.

• Skip Mode: no macroblock data or residual data is coded.

Intra and inter prediction are discussed in more detail in the following sections. Macroblocks are grouped into slices, which are processed in raster scan order. A picture, an array of luma samples and two corresponding arrays of chroma samples, may be split into one or several slices as shown in Figure 1.1.

A slice may be coded using one of several coding types, which include:

• I Slice: A slice in which all MBs of the slice are coded using intra prediction.

Figure 1.1: Picture divided into three slices

• P Slice: In addition to the coding types of the I slice, some MBs of the P slice can also be coded using inter prediction with at most one motion-compensated prediction signal per prediction block.

• B Slice: In addition to the coding types available in a P slice, some MBs of the B slice can also be coded using inter prediction with two motion-compensated prediction signals per prediction block.

The three slice types mentioned above are very similar to those in previous standards, with the exception of the use of reference pictures as discussed later on. Two new coding types for slices are SP and SI slices. For more information, refer to [2].

1.4.1 Intra Prediction

A prediction that is formed based on spatial data is known as intra prediction. MBs formed using intra prediction are known as I MBs. Intra prediction uses samples from adjacent, previously coded blocks to predict values in the current MB. H.264 supports three choices of intra prediction block sizes for the luma component: 16x16, 8x8, and 4x4. Figure 1.2 shows the prediction samples for a 4x4 luma block. A single prediction block is generated for each chroma component. Each prediction block is generated using one of many possible prediction modes. Table 1.2 summarizes these modes.

Figure 1.2: Prediction Samples for luma 4x4 prediction

The choice of intra prediction block size for the luma component tends to be a trade-off between prediction efficiency and the number of bits required to code the prediction mode. Smaller blocks tend to give more accurate predictions but more bits are required to code the prediction choices.

Larger blocks tend to give less accurate predictions but require fewer bits to code the prediction choice.

H.264 supports a number of intra prediction modes for different block sizes. Block sizes of

16x16, 8x8, and chroma blocks each use a subset of the nine prediction modes for 4x4 luma blocks.

Refer to [3] for a complete description of prediction modes for all block sizes.
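To make two of the modes in Table 1.2 concrete, the following sketch (illustrative Python; the function names are ours, and the standard's integer rounding for the DC mean, (sum + 4) >> 3, is approximated here with round()) implements the vertical and DC modes for a single 4x4 luma block:

```python
# Sketch of two intra 4x4 luma prediction modes from Table 1.2.
# Mode 0 (vertical) copies the neighbor samples above the block downward;
# Mode 2 (DC) fills the block with the mean of the upper and left neighbors.

def predict_vertical(above):
    """Mode 0: above holds samples A, B, C, D directly above the 4x4 block."""
    return [list(above) for _ in range(4)]

def predict_dc(above, left):
    """Mode 2: mean of samples A..D (above) and I..L (left)."""
    dc = round((sum(above) + sum(left)) / 8)
    return [[dc] * 4 for _ in range(4)]

above = [100, 104, 108, 112]   # A, B, C, D
left = [98, 99, 101, 102]      # I, J, K, L

print(predict_vertical(above)[3])   # every row repeats the upper neighbors
print(predict_dc(above, left)[0])
```

An encoder would generate each candidate prediction block this way, subtract it from the current block, and keep the mode with the smallest residual cost.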

1.4.2 Inter Prediction

A prediction that is formed based on temporal data is known as inter prediction. This involves two processes: motion estimation and motion compensation. Motion estimation consists of locating a search region and forming a prediction MB. Motion compensation consists of subtracting the predicted block from the current MB to form the residual that is to be coded and transmitted. The

Table 1.2: Luma Prediction Modes, 4x4 prediction

Mode 0 (Vertical): The upper samples A, B, C, D are extrapolated vertically.
Mode 1 (Horizontal): The left samples I, J, K, L are extrapolated horizontally.
Mode 2 (DC): All samples in P are predicted by the mean of samples A..D and I..L.
Mode 3 (Diagonal Down-Left): The samples are interpolated at a 45° angle between lower-left and upper-right.
Mode 4 (Diagonal Down-Right): The samples are extrapolated at a 45° angle down and to the right.
Mode 5 (Vertical-Right): Extrapolation at an angle of approximately 26.6° to the right of vertical.
Mode 6 (Horizontal-Down): Extrapolation at an angle of approximately 26.6° below horizontal.
Mode 7 (Vertical-Left): Extrapolation at an angle of approximately 26.6° to the left of vertical.
Mode 8 (Horizontal-Up): Interpolation at an angle of approximately 26.6° above horizontal.

block of samples to be predicted can be predicted using a range of block sizes. The macroblock can be split into one, two, or four macroblock partitions:

• one 16x16 partition (the entire MB)

• two 8x16 partitions

• two 16x8 partitions

• four 8x8 partitions

If an 8x8 partition is chosen, then each 8x8 block of samples may be further divided into a sub- macroblock consisting of one, two, or four sub-macroblock partitions:

• one 8x8 partition

• two 8x4 partitions

• two 4x8 partitions

• four 4x4 partitions

Figure 1.3 illustrates MB partitioning.

Figure 1.3: Macroblock Partitioning for Inter prediction. Top: Segmentation of MBs, Bottom: segmentation of 8x8 partitions

The prediction signal for each MB is obtained by displacing an area of the corresponding reference picture, which is specified by a translational motion vector and a reference picture index. Each partition requires a reference picture index and motion vector. If an 8x8 partition is used, each sub-partition requires a motion vector as well. It is possible that 16 motion vectors are required for a single P MB. The motion vector components are differentially coded using either median or directional prediction from neighboring blocks.

H.264 has the capability of interpolating reference pictures. Each partition in an inter-coded MB is predicted from an area of the same size in a reference picture. The motion vector has 1/4 pixel resolution for the luma component and 1/8 pixel resolution for the chroma components. The sub-pixel positions do not exist in the reference picture, so it is necessary to create them using interpolation from nearby image samples. Interpolating the reference picture at the half and quarter pixel locations leads to more accurate motion representation. More detailed information on fractional sample accuracy is presented in [4].

H.264 supports multipicture motion-compensated prediction [5] [6]. This means that more than one prior coded picture can be used as reference for the motion-compensation process. Figure 1.4 illustrates this concept. Multipicture motion-compensated prediction requires both the encoder and decoder to store the reference pictures used for inter prediction in the decoded picture buffer (DPB).

Unless the size of the DPB is set to one picture, the index at which the reference picture is located inside the DPB must be signalled. The reference index parameter is transmitted for each motion-compensated 16x16, 16x8, 8x16, or 8x8 luma block. Motion compensation for regions smaller than 8x8 uses the same reference index for prediction of all blocks within the 8x8 region.

Figure 1.4: Multiframe Motion Compensation. Motion vector and reference index are transmitted

An additional MB mode for a P MB is called skip, or P Skip. For this coding type, neither a quantized prediction error signal, nor a motion vector or reference index parameter, is transmitted. P Skips are useful for temporally homogeneous regions, which can be represented with very few bits.

The concept of inter prediction using B slices is generalized in H.264/AVC [7]. This extension refers back to [8] and is further investigated in [9]. Unlike previous standards, pictures that contain B slices can be referenced by other pictures for motion-compensated prediction, depending on the memory management control operation of the DPB. B slices utilize two distinct lists of reference pictures, list 0 and list 1. The reference pictures may include pictures before and after the current picture in display order. A prediction block for a B MB is generated from two prediction regions in reference pictures. Optionally, the prediction block may be weighted according to the temporal distance between the current and reference picture(s), known as weighted prediction.

In B slices, four different types of inter-picture prediction are supported: list 0, list 1, bi-predictive, and direct prediction. For list 0 and list 1 prediction, a prediction block is generated from either a picture in list 0 or list 1. The respective motion vector and reference picture index (if necessary) are coded and transmitted. Bi-prediction consists of forming a prediction block from a weighted average of motion-compensated list 0 and list 1 prediction signals. Direct prediction mode is inferred from previously coded syntax elements and can be either list 0 or list 1 prediction or bi-predictive. Direct mode is similar to the P skip prediction mode, the difference being that direct mode encodes and transmits an error signal for the MB. If no prediction error is encoded and transmitted for direct mode, this is known as B Skip mode, which is coded very similarly to a P skip.

1.5 Transform, Scaling, and Quantization

Similar to previous video coding standards, H.264 utilizes transform coding of the residual data. However, H.264 performs the transformation on 4x4 blocks, and instead of a 4x4 discrete cosine transform (DCT), an integer transform is used with similar properties to the 4x4 DCT. To ease the memory requirement of H.264, the integer transform was developed such that there would be zero mismatch between the forward and inverse transforms [1]. Also, because it is an integer transform, there is no loss of decoding accuracy due to rounding. The DCT operates on X, a block of NxN samples, typically image samples or residual values after prediction, to create Y, an NxN block of coefficients.

Equation 1.1 shows the 4x4 DCT in matrix form:

Y = AXA^T = \begin{bmatrix} a & a & a & a \\ b & c & -c & -b \\ a & -a & -a & a \\ c & -b & b & -c \end{bmatrix} [X] \begin{bmatrix} a & b & a & c \\ a & c & -a & -b \\ a & -c & -a & b \\ a & -b & a & -c \end{bmatrix} \quad (1.1)

where a = \frac{1}{2}, b = \sqrt{\frac{1}{2}} \cos\left(\frac{\pi}{8}\right), c = \sqrt{\frac{1}{2}} \cos\left(\frac{3\pi}{8}\right).

The DCT shown in Equation 1.1 can be factorized to form the H.264 integer transform shown in Equation 1.2:

Y = (CXC^T) \otimes E = \left( \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix} [X] \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix} \right) \otimes \begin{bmatrix} a^2 & \frac{ab}{2} & a^2 & \frac{ab}{2} \\ \frac{ab}{2} & \frac{b^2}{4} & \frac{ab}{2} & \frac{b^2}{4} \\ a^2 & \frac{ab}{2} & a^2 & \frac{ab}{2} \\ \frac{ab}{2} & \frac{b^2}{4} & \frac{ab}{2} & \frac{b^2}{4} \end{bmatrix} \quad (1.2)

where a and b are defined as in Equation 1.1.

The element-wise scaling \otimes E performed at the end of the integer transform is absorbed into the quantization process. Thus, the core portion of the integer transform can be performed using only addition, subtraction, and shifts. The same is true for the inverse transform shown in Equation 1.3. Further discussion on the specifics of the H.264 integer transform can be found in [10].

X' = C_i^T (Y \otimes E_i) C_i = \begin{bmatrix} 1 & 1 & 1 & \frac{1}{2} \\ 1 & \frac{1}{2} & -1 & -1 \\ 1 & -\frac{1}{2} & -1 & 1 \\ 1 & -1 & 1 & -\frac{1}{2} \end{bmatrix} \left( [Y] \otimes \begin{bmatrix} a^2 & ab & a^2 & ab \\ ab & b^2 & ab & b^2 \\ a^2 & ab & a^2 & ab \\ ab & b^2 & ab & b^2 \end{bmatrix} \right) \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & \frac{1}{2} & -\frac{1}{2} & -1 \\ 1 & -1 & -1 & 1 \\ \frac{1}{2} & -1 & 1 & -\frac{1}{2} \end{bmatrix} \quad (1.3)

where a and b are defined as in Equation 1.1.
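As a concrete check on the factorization, the sketch below (illustrative pure Python, not the reference implementation) computes the core transform CXC^T using only additions, subtractions, and shifts, and verifies it against a direct matrix multiplication; the ⊗E scaling is omitted since it is absorbed into quantization:

```python
# Core 4x4 H.264 forward integer transform: Y = C X C^T (scaling by E is
# deferred to the quantizer). Illustrative sketch only.

C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul(A, B):
    """Plain 4x4 integer matrix multiply, used here only for verification."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def forward_core(X):
    """Y = C X C^T using butterfly additions and shifts only."""
    def transform_rows(M):
        out = []
        for r in M:
            s0, s1 = r[0] + r[3], r[1] + r[2]   # butterfly sums
            d0, d1 = r[0] - r[3], r[1] - r[2]   # butterfly differences
            out.append([s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)])
        return out
    # 1-D transform on rows, then on columns (via transposes).
    return transpose(transform_rows(transpose(transform_rows(X))))

# Example residual block
X = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]

assert forward_core(X) == matmul(matmul(C, X), transpose(C))
print(forward_core(X)[0][0])  # DC term equals the sum of all 16 samples
```

The butterfly form is why the core transform needs no multiplications: every row of C contains only ±1 and ±2.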

1.5.1 Hadamard Transform

If the macroblock is predicted using 16x16 intra prediction or if the data represents chrominance, the Hadamard transform is performed on the lowest frequency, or 'DC', coefficients [3]. The DC coefficients are extracted and stored as a separate 4x4 block of data if working with a 16x16 luma macroblock, as shown in Figure 1.5. In the case of chroma data, a 2x2 Hadamard transform is performed on the

2x2 block of chroma DC coefficients.

Figure 1.5: Extracting DC Coefficients

The 4x4 block of DC coefficients (X_DC) is then transformed to obtain (Y_DC) using Equation 1.4. If working with a 2x2 block of data, the coefficients are transformed using Equation 1.5. The Hadamard transform provides additional data reduction in cases where prediction methods create an abundance of DC data. Further information on the motivation for the Hadamard transform can be found in [7].

Y_{DC} = \frac{1}{2} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} [X_{DC}] \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \quad (1.4)

Y_{DC} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} [X_{DC}] \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \quad (1.5)
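The 4x4 case of Equation 1.4 can be sketched as follows (illustrative Python with a hypothetical DC block; the 1/2 scaling is kept in floating point here, whereas the real encoder folds scaling and rounding into the quantization stage):

```python
# 4x4 Hadamard transform of a block of DC coefficients (Equation 1.4),
# as extracted from a 16x16 intra-predicted luma MB. Illustrative sketch.

H = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hadamard4(x_dc):
    """Y_DC = (1/2) * H * X_DC * H, per Equation 1.4."""
    y = matmul(matmul(H, x_dc), H)
    return [[v / 2 for v in row] for row in y]

# Hypothetical 4x4 block of DC coefficients (smooth region: similar values)
x_dc = [[40, 42, 41, 39],
        [38, 40, 43, 41],
        [39, 41, 40, 42],
        [40, 39, 38, 41]]

y_dc = hadamard4(x_dc)
print(y_dc[0][0])   # top-left output: half the sum of all 16 DC values
```

For a smooth region like this one, most of the energy concentrates in the single top-left coefficient, which is exactly the data reduction the text describes.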

1.5.2 Quantization

A quantizer maps a signal with a range of values X to a quantized signal with a reduced range of values Y. It should be possible to represent the quantized signal with fewer bits than the original since the range of possible values is smaller. H.264 uses a scalar quantizer that maps one input value to one quantized output value.

A simple scalar quantization process consists of rounding a fractional number to the nearest integer.

This process is lossy since it is not possible to determine the exact value of the original signal. A quantization parameter is used for determining the quantization of transform coefficients in H.264.

The parameter can take 52 values. The values are arranged so that an increase of 1 in quantization parameter means an increase of quantization step size by approximately 12% [7]. Typically, the result is a block in which most or all of the coefficients are zero, with few non-zero coefficients.

Setting a QP to a high value means that more coefficients are set to zero, resulting in high com- pression at the expense of poor decoded image quality. Setting QP to a low value means that more non-zero coefficients remain after quantization, resulting in better image quality at the decoder but also in lower compression.

The quantization process performs both the scaling required by the integer transform and the actual data quantization. H.264 uses a scalar quantizer, which is also integer based [11]. The basic forward quantization process is shown in Equation 1.6:

Z = \mathrm{round}\left( \frac{Y}{Q_{step}} \right) \quad (1.6)
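A minimal sketch of Equation 1.6, assuming the standard relationship Qstep = 0.625 · 2^(QP/6), so the step size doubles every 6 QP increments (roughly the 12% per step noted above); a real encoder replaces the division with integer multiplier and shift tables:

```python
# Scalar quantization sketch (Equation 1.6) with the QP-to-step mapping
# Qstep = 0.625 * 2 ** (QP / 6). Illustrative floating-point version only.

def q_step(qp):
    return 0.625 * 2 ** (qp / 6)

def quantize(y, qp):
    """Z = round(Y / Qstep), per Equation 1.6."""
    return round(y / q_step(qp))

def dequantize(z, qp):
    return z * q_step(qp)

coeffs = [140, -37, 12, 5, -2, 1]    # hypothetical transform coefficients
for qp in (10, 28, 40):
    print(qp, [quantize(y, qp) for y in coeffs])
# A higher QP zeroes out more coefficients: more compression, lower quality.
```

Running this shows the behavior described in the text: at high QP only the largest coefficients survive, while at low QP most coefficients remain non-zero.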

1.6 Entropy Coding

The entropy coding portion of the H.264 encoder is where all the data necessary to recreate the video is converted to binary. An H.264 file is organized into syntax layers as shown in Figure 1.6. The H.264/AVC standard supports two methods of entropy coding.

Figure 1.6: H.264 Syntax Layers

1.6.1 Exp-Golomb Coding

The simpler method uses a single codeword table for all syntax elements except the quantized transform coefficients. This method is called exp-Golomb coding and has very simple and regular decoding properties. This coding method has the advantage of avoiding the creation of a variable length code table for each syntax element. Instead, only the mapping to the single codeword table is customized according to the data statistics.
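The single codeword table can be sketched as follows (illustrative Python): codeNum k is coded as M leading zeros, where M = floor(log2(k + 1)), followed by the binary representation of k + 1; signed syntax elements are first mapped to a codeNum:

```python
# Exp-Golomb coding sketch. An unsigned value (codeNum) k is written as
# [M zeros][binary of k + 1], where M = floor(log2(k + 1)). Signed syntax
# elements map to codeNum first: v > 0 -> 2v - 1, v <= 0 -> -2v.

def ue(code_num):
    """Unsigned Exp-Golomb codeword as a bit string."""
    x = code_num + 1
    m = x.bit_length() - 1          # number of leading zeros
    return '0' * m + format(x, 'b')

def se(v):
    """Signed Exp-Golomb codeword (used e.g. for motion vector differences)."""
    return ue(2 * v - 1 if v > 0 else -2 * v)

for k in range(5):
    print(k, ue(k))   # 0 -> '1', 1 -> '010', 2 -> '011', 3 -> '00100', ...
```

Note how small codeNum values, which the mapping arranges to be the most probable, get the shortest codewords.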

1.6.2 CAVLC

For transmitting the quantized coefficients, a more efficient method called Context-Adaptive Variable Length Coding (CAVLC) is employed. In CAVLC, the VLC tables for various syntax elements are switched depending on already transmitted syntax elements. Since the VLC tables are designed to match the corresponding conditional statistics, the entropy coding performance is improved in comparison to a single VLC table [7]. CAVLC is required for all profiles of the H.264 standard. Further details on CAVLC can be found in [7].

1.6.3 CABAC

The efficiency of entropy coding can be improved further with Context-Adaptive Binary Arithmetic Coding (CABAC) [12]. In CABAC, statistics of already coded syntax elements are used to estimate conditional probabilities. These conditional probabilities are used for switching between several estimated probability models. Compared to CAVLC, CABAC typically provides a bitrate reduction of 5%-15% [7]. More detailed information about CABAC can be found in [12].

1.7 Profiles and Levels

The H.264 standard specifies profiles, each of which contains a subset of the coding tools available in the H.264 standard. A profile defines a set of coding tools or algorithms that can be used in generating a conforming bitstream [1]. A level places constraints on certain key parameters of the bitstream, such as the maximum number of frames per second for a given resolution. All decoders that conform to a specific profile must support all the features of that particular profile. Encoders are not required to make use of any particular set of features supported in a profile but must produce conforming bitstreams. H.264 defines three profiles: baseline, main, and extended. Figure 1.7 illustrates these profiles and their respective features. A subset of baseline, constrained baseline, is also defined.

The baseline profile is intended for low delay applications such as mobile transmission [1]. The extended profile is a superset of the baseline profile, adding further tools that may be beneficial for efficient network streaming of H.264 data. The main profile is a superset of the constrained baseline profile and adds tools that may be suitable for broadcast and entertainment applications.

Figure 1.7: H.264 Profiles

1.8 Mode Selection

Many coding methods have been presented in the previous sections. Figure 1.8 illustrates all the prediction possibilities for a macroblock. These include:

• Skip Mode: no information sent for MB

• Four intra-16x16 modes

• Nine intra-4x4 modes, with a different choice possible for each 4x4 block

• 16x16 inter mode: prediction from reference picture(s), using one reference list (P or B MB) or two lists (B MB).

• 8x16 inter mode: prediction from multiple reference pictures as above, with the option of different reference picture(s) for each partition.

• 16x8 inter mode: same prediction choices as above.

• 8x8 inter mode: reference picture choices as above, with further sub-division of each 8x8 partition into 8x4, 4x8, or 4x4 sub MB partitions.

In addition to the prediction mode, the encoder can choose to change the QP, and within each inter mode the encoder has a wide choice of possible motion vectors (discussed in Chapter 2). There is an enormous number of options for coding each MB. Each combination of coding parameters will tend to generate a different number of coded bits, which can range from very low (skips) to high (intra), and a different distortion.

A video encoder aims to minimize coded bitrate and maximize decoded quality. This is a very difficult goal due to the large number of coding options as well as the need to decide the "best" tradeoff between minimizing bitrate and minimizing distortion. The three measurements that are taken into account when coding a MB are:

• Header bits: The number of bits required to signal the MB mode, plus any prediction parameters such as intra mode, reference choices, and/or motion vector difference.

• Coefficient bits: The number of bits required to code the quantized transform coefficients.

Figure 1.8: Available Prediction Modes

• SSD (Sum of Squared Differences): Distortion of the decoded, reconstructed MB, measured as the sum of squared differences (Equation 1.7), where x, y are the sample positions in a block, b(x, y) are the original sample values, and b'(x, y) are the decoded sample values at each sample position.

SSD = \sum_{x,y} \left( b'(x, y) - b(x, y) \right)^2 \quad (1.7)

1.8.1 Rate Distortion Optimized Mode Selection

Rate Distortion Optimization (RDO) mode selection is a technique for choosing the coding mode of a MB based on the rate and distortion cost [1]. The bitrate cost R and distortion cost D are combined in a single cost function J, shown in Equation 1.8. The RDO mode selection algorithm attempts to find a mode that minimizes the joint cost J. The trade-off between rate and distortion is controlled by the Lagrange multiplier λ.

J = D + λR (1.8)
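The selection process of Equation 1.8 can be sketched as follows (illustrative Python; the candidate modes and their distortion/rate numbers are hypothetical, not measurements from the encoder described in this thesis):

```python
# Rate-distortion optimized mode selection sketch: for each candidate mode,
# combine distortion D (SSD, Equation 1.7) and rate R (header + coefficient
# bits) into J = D + lambda * R, and keep the mode with minimum J.
# All numeric costs below are hypothetical, for illustration only.

def ssd(original, decoded):
    """Sum of squared differences (Equation 1.7) over paired samples."""
    return sum((b2 - b1) ** 2 for b1, b2 in zip(original, decoded))

def best_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, rate_bits) tuples."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])

# Hypothetical costs for one macroblock
candidates = [
    ("skip",        4000,   1),   # almost no bits, high distortion
    ("inter_16x16", 2500,  90),
    ("inter_8x8",   1800, 240),   # better prediction, more motion vectors
    ("intra_4x4",   1600, 400),
]

print(best_mode(candidates, lam=1.0)[0])    # low lambda favors low distortion
print(best_mode(candidates, lam=60.0)[0])   # high lambda favors low rate
```

Sweeping λ moves the winner from the high-quality, expensive modes toward skip, which is exactly the rate/distortion trade-off the Lagrange multiplier controls.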

1.9 Motivation and Organization

Because of the many encoding options, this thesis presents a rate distortion optimization technique that incorporates multiple block sizes, multiple reference frames, and B frames.

This thesis is divided into four additional chapters. Chapter 2 presents interprediction in terms of multi-frame coding. Chapter 3 presents B frame coding in more detail. Chapter 4 discusses RDO implemented on multi-frame coding and B frames and compares the compression performance to a purely error-based mode decision process. Finally, Chapter 5 provides conclusions and details on future work concerning the implementation and performance of the proposed RDO scheme.

CHAPTER II

Multi Frame Prediction

2.1 Interprediction Overview

A prediction formed temporally is referred to as interprediction. Interprediction requires the use of at least one picture that has previously been encoded, a reference picture. The basic steps of interprediction are listed below. Given the current state of the encoder, a logical stepping stone before adding B frames is to add multi frame prediction capability, since B frames require more than a single reference frame in order to be useful. The following sections discuss the process of adding multi frame prediction capabilities.

• Interpolate the picture(s) in the DPB to generate 1/4 sample positions in the luma component and 1/8 sample positions in the chroma component.

• Choose an inter prediction mode. This involves the choice of reference picture, choice of macroblock partition and sub-partition, and the choice of prediction types, which include list 0 and list 1 depending on the MB type.

• Choose motion vectors.

• Predict the motion vectors from previously-transmitted vectors and generate motion vector differences.

• Code the MB type, choice of prediction references, motion vector differences, and residual data.

• Apply a deblocking filter.

Each partition in an inter-coded MB is predicted from an area of the same size in a reference picture. The offset between the two areas, the motion vector, has 1/4 pixel resolution for the luma component and 1/8 pixel resolution for the chroma components. Sub-pixel motion estimation is possible in H.264. Refer to [1] for more information.

2.2 Syntax Overview

Figure 1.6 shows an overview of the H.264 syntax elements. It is worthwhile to discuss some of these syntax elements for the discussion that follows. Coded H.264 data is stored or transmitted in a series of packets called Network Abstraction Layer (NAL) units. The first NAL unit to be discussed is the

Sequence Parameter Set (SPS). This particular NAL unit carries coding parameters common to an entire video sequence. These parameters include the profile, constraints on the profile, the number of reference frames, and the size of the image. Picture Parameter Sets (PPS) contain parameters common to a sequence of coded frames, such as the entropy encoding type, the number of active reference pictures, and the initial QP. A complete list of SPS and PPS parameters can be found in [3].

Another NAL unit of interest is the slice layer. The slice layer is made up of the slice header, which conveys information common to all macroblocks in the slice, such as the slice type, which determines which macroblock types are allowed. For example, I MBs are permitted in P and B slices.

The slice header also provides the frame number to which the slice corresponds, the reference picture settings, and the default QP. A full list of the syntax elements found in the slice layer can be found in [3].

The slice data section contains the series of MBs that make up that particular slice. The proposed implementation contains one slice per frame, and all MBs within that slice are equivalent to the slice type. This means that if a slice type is set as P type, then only P MBs are contained within that slice.

Each MB in the slice contains what is known as the macroblock layer. An overview of the MB layer for the Baseline profile is shown in figure 2.1. If the mb pred or sub mb pred process yields a coded block pattern of zero, indicating that the transform and quantize process returned zero coefficients, the rest of the encoding process can be skipped.

Each MB contains a mb pred process or a sub mb pred process if sub-partitions are present.

A simple overview of each of these processes is shown in figure 2.2 and figure 2.3. As will be shown later, the reference indices and motion vector differences may not be present under certain conditions and thus need not be encoded. Figure 2.1 also indicates the residual process. This process involves implementing CAVLC or CABAC on the transformed and quantized coefficients.

The CAVLC process does not change for multi frame prediction and so will not be discussed any further. CABAC, on the other hand, does change, because MB data is encoded within the CABAC process itself rather than through the look-up tables used by CAVLC. CABAC has not been modified in the proposed encoder to support multi frame prediction.

2.3 Picture Ordering

Before discussing the topic of multi frame encoding, it is important to distinguish three different orderings of pictures:

• Decoding order - the order in which pictures are decoded from the bitstream.

• Display order - the order in which pictures are output for display.

• Reference order - the order in which pictures are arranged for inter prediction of other pictures.

Figure 2.1: Macroblock Layer Overview: Baseline

The parameter frame num, decoded from the slice header, determines the decoding order of pictures. In most cases, frame num for each decoded picture increases by one compared with the previous reference frame in decoding order. Display order is determined by the parameters TopFieldOrderCount and BottomFieldOrderCount, collectively known as Picture Order Count (POC).

The display order is derived from the slice header using one of three methods:

• Type 0: The least significant bits of POC are sent in every slice header. This allows maximum flexibility but requires the most bits compared to the other methods.

• Type 1: A 'cycle' of POC increments is set up in the sequence parameter set, and POC changes according to this cycle unless otherwise signalled in the slice header using a delta offset. The cycle defines the interval between frames used for reference, plus a POC offset for frames not used for reference. Only the delta offset value is encoded and transmitted.

Figure 2.2: mb pred syntax overview

• Type 2: POC is derived directly from frame num, and display order is the same as decoding order.

The proposed encoder uses Type 0 for display order. An example using only P frames is shown in figure 2.4. Starting with the I slice, frame num starts at a value of zero. If the previous picture is used for reference, frame num increments by a value of one. Since all frames in this example are used for reference, frame num is incremented by one for each frame. POC is incremented by two for every complete frame, i.e. every two fields. Display order is then determined by the values of POC. In this particular sequence, decoding order and display order are the same.

Figure 2.3: sub mb pred syntax overview

2.4 Reference Picture Lists

A picture that is coded and available for reference is stored in the DPB and marked as one of the following:

• short term reference picture, indexed according to frame num (P slice) or POC (B slice)

• long term reference picture

Figure 2.4: Display Order Example, Type 0

The proposed encoder focuses on short term reference pictures. More information on long term reference pictures can be found in [3]. Reference pictures are ordered in one or two lists prior to encoding or decoding a frame. In the case of P slices, a single list, list0, is used. The default order of list0 depends on decoding order. The list orders are important, since indices to reference pictures earlier in the list require fewer bits to signal. Hence, the default orders are organized so that the reference pictures temporally 'closer' to the current picture occur early in the list, since these are most likely to be the best prediction match for the current picture. Figure 2.5 shows an example of list0 ordering for P slices.

The default reference picture list order can be changed in order to place a particularly useful reference frame earlier in the list than its usual position. This feature is not implemented in the proposed encoder.

2.5 Exp-Golomb Coding

Exponential Golomb coding is utilized for most of the syntax elements outside of residual data. An Exp-Golomb codeword consists of a prefix of 0s followed by a stop bit '1' and a suffix containing a binary number related to the value being encoded, codeNum. The length of the suffix is identical to the number of zeros found in the prefix. The codeNum derived from the suffix is interpreted differently based on the type of Exp-Golomb code used. By assigning short codewords to frequently-occurring data symbols and long codewords to less common data symbols, the data may be represented in a compressed form. Table 2.1 illustrates the first few Exp-Golomb codes.

Figure 2.5: Reference Picture Order Example: P Slices

Table 2.1: Exp-Golomb Codewords

codeNum    Codeword
0          1
1          010
2          011
3          00100
4          00101
5          00110
6          00111
7          0001000
8          0001001
...        ...
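The construction above can be sketched in a few lines: exp_golomb_ue reproduces the codewords of Table 2.1, and exp_golomb_se applies the signed (se) mapping (function names are illustrative):

```python
def exp_golomb_ue(code_num):
    """ue: prefix of zeros, stop bit, suffix -- the binary form of codeNum + 1."""
    bits = bin(code_num + 1)[2:]          # e.g. codeNum 3 -> '100'
    return "0" * (len(bits) - 1) + bits   # prefix length equals suffix length

def exp_golomb_se(value):
    """se: map a signed value to codeNum (positive -> odd, non-positive -> even)."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return exp_golomb_ue(code_num)
```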

The variable codeNum can either directly represent the syntax element value to be coded or be used in a mapping process defined in [3]. Multiple mapping methods are necessary since Exp-Golomb by itself cannot support signed values. An additional reason is that a syntax element may belong to a set of elements in which the number of elements is less than the value of the syntax element itself.

Exp-Golomb then maps that element to an index in the set, reducing the length of the codeword.

Table 2.2 gives an overview of the mapping processes. The te mapping process is called upon very little throughout the coding process. One of the few places it is used is the reference index syntax element. Caution must be used when encoding this particular syntax element, since there are multiple factors that determine whether the reference index is part of the bitstream, as will be discussed in a later section.

Table 2.2: Mappings to codeNum

Mapping type   Description
ue             Unsigned direct mapping: used for macroblock type and others.
te             Truncated mapping: if the largest possible value of the syntax element is 1, a single bit is sent; otherwise ue mapping is used.
se             Signed mapping: used for motion vector difference, delta QP, and others.
me             Mapped symbols: the syntax element is mapped according to a table specified in [3].

2.6 Motion Vector Prediction

In order to save on the number of bits coded, a predicted motion vector is determined in the inter prediction process. The predicted motion vector is then subtracted from the actual motion vector to form the motion vector difference: syntax element mvd lX, or sub mvd lX if sub-partitions are present, where X = 0 or 1 indicates the proper list. Encoding a motion vector can cost a significant number of bits, especially if small partition sizes are chosen. Motion vectors for neighboring partitions are generally highly correlated, and so each motion vector is predicted from the surrounding, previously encoded partitions. These partitions may lie within the current MB or a neighboring MB.

The method of forming the predicted motion vector, MVP, depends on the motion compensation partition size and on the availability of nearby vectors. Figure 2.6 shows neighboring blocks A, B, and C for 16x16 MBs only. Figure 2.7 shows neighboring blocks A, B, and C for a 16x16 MB with different partition sizes as the neighboring blocks. The MVP is the median of the motion vectors for partitions A, B, and C.
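The median rule can be sketched per component, with motion vectors represented as (x, y) tuples (the function name is illustrative):

```python
def predict_mv(mv_a, mv_b, mv_c):
    """MVP: component-wise median of the neighboring motion vectors A, B, C."""
    median = lambda a, b, c: sorted((a, b, c))[1]
    return (median(mv_a[0], mv_b[0], mv_c[0]),
            median(mv_a[1], mv_b[1], mv_c[1]))
```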

Figure 2.6: Current and neighboring partitions: 16x16 partitions

If one or more of the previously transmitted blocks shown in figures 2.6 or 2.7 is unavailable, the choice of MVP is modified accordingly. Full details of these modifications can be found in [3]. A partition may be unavailable because it lies outside the current picture or the current slice, or because it was coded as a skipped MB.

Figure 2.7: Current and neighboring partitions: different partition sizes

2.7 Multi Frame Encoding

A simple block match algorithm for motion estimation can be described as:

• create a search window consisting of N search points

• calculate the SSE for each search point

• select the block with the least SSE as the best candidate

• determine the corresponding motion vector for the best candidate
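The steps above amount to an exhaustive search over candidate offsets; a minimal sketch (the ±4 search range and frame dimensions are illustrative):

```python
import numpy as np

def full_search(cur_block, ref_frame, top, left, search_range=4):
    """Exhaustive block match: return the motion vector minimizing SSE."""
    n = cur_block.shape[0]
    best_mv, best_sse = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref_frame.shape[0] or x + n > ref_frame.shape[1]:
                continue  # candidate block falls outside the reference frame
            cand = ref_frame[y:y + n, x:x + n].astype(np.int64)
            sse = int(np.sum((cur_block.astype(np.int64) - cand) ** 2))
            if sse < best_sse:
                best_mv, best_sse = (dy, dx), sse
    return best_mv, best_sse
```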

For the addition of multi frame coding, several changes must be made to the encoding process. The first of these changes is in the sequence parameter set (SPS). This is where the maximum number of reference frames is determined by the parameter num ref frames. The second change is in the picture parameter sets (PPS). The PPS contains a parameter called num ref idx l0 minus1. This parameter tells the decoder the number of reference pictures contained in list0. This is not necessarily the maximum number of reference frames, as will be shown for B frames. For this discussion, the maximum number of reference frames will also be the number of frames contained in list0. For example, if the number of reference frames is chosen as 3, the value of num ref idx l0 minus1 can be 0, 1, or 2 but cannot exceed the maximum number of reference frames minus one.

The third syntax element that must be set with caution falls within the slice header: the parameter num ref idx active override flag. The decoder determines the number of active reference frames from num ref idx l0 minus1 in the PPS. However, the number of active reference frames is not always at this value. If this flag is set, it means that the DPB is not yet full and the number of reference pictures must be overridden by the value of num ref idx l0 minus1 sent in the slice header. For instance, if num ref frames = 3, frames two and three will have only one and two reference frames available for coding, respectively. These numbers of reference frames must be recognized by the decoder. Therefore, in the slice header of each of these frames, num ref idx active override flag is set to one and num ref idx l0 minus1 is set to zero and one, respectively. When the picture buffer becomes full, num ref idx active override flag is set to zero and the number of active reference frames is inferred from the PPS in order to save bits.

The final parameter that must be adjusted from single frame coding exists in the macroblock/sub-macroblock layer. A reference index, syntax element ref idx l0, must be coded and transmitted, indicating which reference frame in list0 is being used for inter prediction. In the case of single reference frame prediction, ref idx l0 is inferred to be zero by the decoder. Multi frame prediction requires that ref idx l0 be coded and transmitted. Of course, if only one reference frame is active, indicated by num ref idx l0 minus1 in the slice header, it is not necessary to encode and transmit ref idx l0.

After applying the discussed changes to the syntax elements, the following tests are generated using the proposed encoder on the foreman, flower, and flyby video files [13]. The tests consist of coding up to five reference frames for 16x16 MBs only. QPs range from 10 (upper right) to 40 (lower left). The PSNR for the luma component is calculated for the entire sequence. As expected, in each case there is a slight improvement in bitrate for all QP values.

[Plot: Foreman Multi Frame Coding: Performance, 16x16, 100 frames at 25 FPS; PSNR (Y) versus Rate (kbps) for 1 to 5 reference frames]

Figure 2.8: Multi Frame Prediction: Foreman

[Plot: Flower Multi Frame Coding: Performance, 16x16, 100 frames at 25 FPS; PSNR (Y) versus Rate (kbps) for 1 to 5 reference frames]

Figure 2.9: Multi Frame Prediction: Flower

The encoding time for each sequence in seconds is shown in Figures 2.11 through 2.13. With the exception of Foreman, the noticeable difference for all three tests is the transition from a single reference frame to two reference frames. This appears to be most beneficial for slow moving images such as flower and flyby. Foreman has relatively high motion, particularly around the face and mouth regions. Its results show more of an increase in performance when transitioning from two reference frames to three reference frames. Flower also follows this trend to a degree due to the motion in that particular sequence.

2.8 Conclusions

In this chapter, a detailed description was given of the syntax elements of the proposed encoder that needed to be modified in order to support multiple reference frames. This functionality is imperative for the implementation of B frames later on. In addition, reference list management was also introduced for P slices. Chapter 3 will expand on this idea to incorporate B slices. A rate distortion curve was generated for three video sequences of different resolutions. The results showed an increase in performance for each additional reference frame. The negative effects of multi frame coding lie in the complexity of the encoder. More memory must be allocated for reference frame storage, as well as additional processing time for the block search algorithm. In applications where speed is a high priority, multi frame encoding must be exercised with caution. However, if quality is the primary goal, multi frame prediction certainly is useful.

[Plot: FlyBy Multi Frame Coding: Performance, 16x16, 100 frames at 25 FPS; PSNR (Y) versus Rate (kbps) for 1 to 3 reference frames]

Figure 2.10: Multi Frame Prediction: Flyby

[Plot: Foreman Multi Frame Coding: Complexity, 16x16, 100 frames at 25 FPS; Time (s) versus QP for 1 to 5 reference frames]

Figure 2.11: Multi Frame Prediction Complexity: Foreman

[Plot: Flower Multi Frame Coding: Complexity, 16x16, 100 frames at 25 FPS; Time (s) versus QP for 1 to 5 reference frames]

Figure 2.12: Multi Frame Prediction Complexity: Flower

[Plot: FlyBy Multi Frame Coding: Complexity, 100 frames at 25 FPS; Time (s) versus QP for 1 to 2 reference frames]

Figure 2.13: Multi Frame Prediction Complexity: Flyby

CHAPTER III

B Frame Inter Prediction

3.1 B Frame Inter Prediction Overview

As mentioned in Chapter 1, B frame prediction consists of predicting MB data from two prediction regions in reference pictures. The reference pictures are ordered into two lists, list0 and list1. If the specified slice in the PPS is defined to be a B slice, a MB contained within that slice can be an I, P, or B MB. As was the case with P slices, if a B slice is specified in the PPS, then all MBs in that slice will be coded as B MBs in the proposed encoder. B MBs can be coded in a variety of ways, such as:

• Single prediction from list0 or list1: MB partitions can be predicted from either list.

• Biprediction (two references): requires two motion vectors, each pointing to a region of the same size in a reference picture, one from list0 and one from list1. Each sample of the prediction is calculated as an average of the samples in the list0 and list1 regions.

• Weighted prediction: a method of scaling the samples of motion-compensated prediction data (also applies to P MBs).

– Explicit: weighting factors are determined by the encoder and are transmitted in the slice header.

– Implicit: weighting factors are calculated based on the relative temporal positions of the list0 and list1 reference frames (useful in 'fade' transitions where one scene fades into another).

• Direct mode: no motion vector is transmitted for a B MB. Instead, the motion vectors are inferred from list0 and list1 based on previously coded vectors, and these are used to carry out bipredicted motion compensation of the decoded residual samples.

• Skip mode: no MB data or residual data is coded. All MB data is inferred by the decoder.

3.2 B Frame Reference Picture Lists

Similar to P slices, any previously coded picture that is marked for reference is stored in the DPB and is marked as a short term reference or long term reference. Again, the proposed encoder focuses on short term reference frames. Recall that when coding a P slice, reference pictures are ordered in list0 such that the most recently coded picture available for reference is the first entry in the list. This default order is based on decoding order, or the parameter frame num.

If a B slice is to be coded, it must have at least two reference pictures available in the DPB. One of these pictures will occur before the current picture in display order (past picture) and one will occur after the current picture in display order (future picture). The display order and transmission order for an IPBB sequence is shown in figure 3.1. The arrows indicate the reference frames used for inter prediction of each particular frame. In this example, the B slices are not used for reference prediction of any other pictures. POC increments by two for every complete frame, which consists of two fields. Also note that frame num increments only after a reference frame has been transmitted.

Figure 3.1: IPBB Display Order

Table 3.1: Display Order Example

Slice  Type  Used for reference  frame num  POC LSBs  Display Order
1st    I     yes                 0          0         0
2nd    P     yes                 1          6         3
3rd    B     no                  2          2         1
4th    B     no                  2          4         2
5th    P     yes                 2          12        6
6th    B     no                  3          8         4
7th    B     no                  3          10        5
8th    P     yes                 3          14        8

Unlike list ordering for P slices, B slices order their lists according to display order, or the variable POC. The default orderings of the lists stay consistent with the idea that the most recently coded pictures are the first pictures in each list. The default order for list0 and list1 is as follows:

• List0: the default order is (1) decreasing order of POC for pictures with POC earlier than the current picture, then (2) increasing order of POC for pictures with POC later than the current picture.

• List1: the default order is (1) increasing order of POC for pictures with POC later than the current picture, then (2) decreasing order of POC for pictures with POC earlier than the current picture.
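The two default orderings can be sketched as sorts on POC (the function name and POC values are illustrative):

```python
def default_b_lists(ref_pocs, current_poc):
    """Default reference list ordering for a B slice, derived from POC."""
    past = sorted((p for p in ref_pocs if p < current_poc), reverse=True)
    future = sorted(p for p in ref_pocs if p > current_poc)
    return past + future, future + past  # (list0, list1)

list0, list1 = default_b_lists([0, 2, 4, 8, 10], current_poc=6)
# list0 = [4, 2, 0, 8, 10], list1 = [8, 10, 4, 2, 0]
```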

An example of picture ordering for a B slice is shown in figure 3.2.

Figure 3.2: List0 and List1 Ordering Example

3.3 B Frame Coding

Performing a prediction of a B MB is not much different from a P MB. The reference list data must be handled differently, but the overall process is the same. Figure 3.3 shows a block diagram of the proposed encoder with the addition of B frame capabilities.

3.3.1 SPS and PPS

For the addition of B slices, there is no change in the sequence parameter set from the previous version of the encoder. The picture parameter sets require that the variable num ref idx l1 active minus1 be set. This parameter is the same as num ref idx l0 active minus1, discussed in chapter 2, except that it applies to list1. Setting these parameters improperly will cause some redundancy in the block match algorithm. For example, in figure 3.2, both list0 and list1 contain the same reference frames, only stored in a different order. The block match module (discussed later) will search all frames in each list, which is not necessary in this case. One solution is to set num ref idx l0 active minus1 = 1, meaning there are at most two reference frames contained in list0: 10 and 12. Likewise, set num ref idx l1 active minus1 = 2, meaning there are at most three reference frames in list1: 20, 22, and 28.

Figure 3.3: B MB Prediction Block Diagram

3.3.2 Decoded Picture Buffer

This module reads from the SPS and PPS to set up and handle the correct number of reference frames. The outputs of this module are list0 and list1. The DPB module must ensure that the maximum number of reference frames in each list, specified in the PPS, is not exceeded. The number of active reference frames must also be determined so that the mb pred function in the MB layer of encoding knows whether or not to transmit a reference index. If the decoded picture buffer exceeds the number of reference frames it can hold, this module removes the least recently coded picture, shifts the buffer, and finally inserts the most recent picture at the front of the buffer.

Once list0 and list1 are created, they are passed to the search window function.
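The sliding-window behavior described above can be sketched as follows (the function name, max_refs, and POC values are illustrative):

```python
from collections import deque

def push_reference(dpb, picture, max_refs):
    """Sliding-window DPB update: drop the oldest picture when the buffer is full."""
    if len(dpb) == max_refs:
        dpb.pop()            # remove the least recently coded picture
    dpb.appendleft(picture)  # insert the most recent picture at the front

dpb = deque()
for poc in (0, 2, 4, 6, 8):
    push_reference(dpb, poc, max_refs=3)
# the buffer now holds the three most recently coded pictures, newest first
```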

3.3.3 Create Search Window

This function takes in the number of references in list0 and list1 and creates a search window for each reference frame. By default, the search window is set to 24x24 samples for a 16x16 MB, meaning 81 search points. The user has the option to increase or decrease the search window size.

3.3.4 Block Match, Transform and Quantize

This function is responsible for selecting the best match for the current MB. Each search window is searched until the 16x16 block with the least amount of error is found. To increase the efficiency of the search, list0 and list1 are searched concurrently. Once the block with the least amount of error is determined, the motion vector, reference index, and the list from which the prediction was made are stored. Using the prediction data, a prediction block is formed that is passed to the transformation and quantization process described in chapter 1.

3.3.5 Motion Vector Prediction

Motion vector prediction is carried out in the same manner as for P MBs, discussed in chapter 2.

Recall that a neighboring partition may be marked as unavailable if it is outside the current frame or outside the current slice. If this is the case, the process adjusts accordingly to carry out motion vector prediction. When coding a B MB, one more condition can cause a neighboring partition to be unavailable: the neighboring partition was predicted from the opposite list. Partition A in figure 3.4 would be marked as unavailable since it was predicted using list1 while the current partition, partition X, was predicted using list0. After the transformation, quantization, and motion vector prediction are complete, the MB is ready to be coded. The coding process is the same process described in chapter 2 for P MBs.

Figure 3.4: B MB Motion Vector Prediction

3.4 B Frame Implementation

Once the changes discussed above are applied, the following tests were generated using the proposed encoder on foreman (176x144 luma resolution). The tests consist of coding an IPB sequence with a single reference frame used for P frame prediction and two reference frames for B frame prediction, using 16x16 MBs only and then once again using 8x8 MBs only. QPs range from 10 (upper right) to 40 (lower left). The PSNR for the luma component is calculated for the entire sequence. The results are compared against 16x16 interprediction using only P frames with a single reference frame. Figure 3.5 shows the rate distortion plot for the simulation. Figure 3.6 shows a close-up between QP values of 24 through 28.

[Plot: Foreman B Slices, 100 frames at 25 FPS; curves for B 16x16 only, B 8x8 only, and P 16x16 only]

Figure 3.5: Rate Distortion Curve Foreman using B Frame Interprediction

3.5 B Frame Conclusions

[Plot: Foreman B Slices, 100 frames at 25 FPS; PSNR (Y) versus Rate (kbps) for B 16x16 only, B 8x8 only, and P 16x16 only]

Figure 3.6: Rate Distortion Curve Foreman using B Frame Interprediction for QPs 24 through 28

The results show a gain in performance while providing only a slight increase in complexity. The search algorithm is implemented so that both reference frames for a B MB are searched in the same iteration. This removes some of the additional computational requirements involved in searching multiple reference frames. Improved performance can be achieved with additional reference frames using B MBs at less of a cost in complexity than multi-frame coding with P MBs. The proposed implementation uses only single reference picture prediction for B MBs, meaning that it chooses only one picture for reference from the available reference pictures. This is effectively the same process as for a P MB, since P MBs use a single reference as well. Performance could potentially be further increased with the addition of bi-prediction (two pictures used for reference).

[Plot: Foreman B Slices, 100 frames at 25 FPS; Coding Time (seconds) versus QP for B 16x16 only, B 8x8 only, and P 16x16 only]

Figure 3.7: Complexity using B Frame Interprediction

CHAPTER IV

Mode Selection

4.1 Introduction

As mentioned in chapter 1, an H.264 encoder can choose from many different options or modes when it codes a macroblock. Therefore, the more features that are added to the encoder, the more necessary a mode selection algorithm becomes. Rate Distortion Optimized (RDO) mode selection is a technique for choosing the coding mode of a macroblock based on the rate and distortion cost. The bitrate cost R and distortion cost D are combined into a single cost J, as shown in equation 4.1. The RDO mode selection algorithm attempts to find a mode that minimizes the joint cost J. The tradeoff between bitrate and distortion is controlled by the Lagrange multiplier λ. Smaller values of λ favor quality over bitrate, and larger values of λ favor bitrate over quality. This technique has gained importance due to its effectiveness, conceptual simplicity, and its ability to evaluate a large number of choices in an optimized fashion [14]. More information on Lagrangian optimization can be found in [15]. RDO attempts to answer the question: "What part of the video signal should be coded using what method and parameter settings?"

J = D + λR (4.1)

This is clearly a computationally intensive process due to the large number of coding options. The proposed encoder, even with 16x16 prediction only, has many coding options, and the addition of multi frame prediction and B frame interprediction compounds the problem. A low complexity RDO method is proposed in order to increase encoder performance while keeping the complexity as minimal as possible.

4.2 Proposed Low Complexity RDO Method

Due to the computational complexity of the traditional RDO method, the following method is proposed. The traditional RDO method involves:

• Obtain a prediction block P

• Code block P to obtain R (the number of bits)

• Reconstruct the block to obtain B’

• Calculate distortion between B and B’

This means that a MB must go through the prediction, transform, quantization, reconstruction, and coding process for each mode, as shown in figure 4.1.

The proposed RDO method is as follows:

• Obtain prediction block P

• Determine Rpred.

• Calculate distortion between P and original MB.

Figure 4.1: Traditional RDO

The proposed method eliminates the transform, quantization, and MB coding processes. Observe that Rcode (the number of bits to code the residual data) is removed from the cost calculation. By doing this, it is assumed that the distortion between the predicted MB and the original MB and the number of bits required to code that MB have a linear relationship. In other words, the smaller the error, the fewer bits it takes to code the residual. While this is a plausible assumption, it may not hold in all circumstances. For the purposes of low complexity, this assumption is made. A block diagram of the proposed method is shown in figure 4.2.
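The proposed cost can be sketched as follows, with macroblocks flattened to lists of samples (the function name and sample values are illustrative):

```python
def proposed_cost(orig_mb, pred_mb, r_pred, lam):
    """Low-complexity cost: SSD between the original MB and its prediction,
    plus the rate of the prediction parameters only (Rcode is omitted)."""
    d = sum((o - p) ** 2 for o, p in zip(orig_mb, pred_mb))
    return d + lam * r_pred
```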

4.3 Proposed RDO Method Implementation

Figure 4.2: Proposed RDO Method

The goal of the encoder is to provide the best bitrate at a given quality. In order to test the proposed RDO method, 8x8 prediction for P slices is added to the encoder. Figure 4.3 shows the rate distortion curve for the current encoder in terms of 16x16 prediction only and 8x8 prediction only for Foreman. At low QP values, 16x16 outperforms 8x8, and the opposite is true at higher QP values. The proposed RDO method is compared against the previous two prediction modes. Figure 4.4 shows the complexity for the proposed RDO implementation compared with coding with 16x16 only and 8x8 only. With the exception of high QP values, the proposed method increases encoding performance. The value of λ for each QP is calculated using equation 4.2 [14].

λ = 0.85 × 2^((QP − 12)/3)   (4.2)
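Equation 4.2 in code form (the function name is illustrative):

```python
def lagrange_multiplier(qp):
    """Lambda from equation 4.2: 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2 ** ((qp - 12) / 3)
```

At QP = 12 this yields λ = 0.85, and λ doubles with every increase of 3 in QP.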

[Plot: Foreman: RDO, 100 frames at 25 FPS; PSNR (Y) versus Rate (kbps) for 16x16, 8x8, and RDO]

Figure 4.3: Proposed RDO Results: Foreman

Figures 4.5 and 4.6 show the results of the same test for the flower sequence.

4.4 Conclusions

A software RDO scheme is introduced in this chapter. Due to the extensive computation time of the full RDO algorithm, a less complex version of the scheme is proposed that removes the burden of fully encoding a macroblock twice. The proposed scheme utilizes 16x16 and 8x8 block sizes in order to increase compression performance.

[Plot: Foreman: RDO complexity; Time (seconds) versus QP for 16x16, 8x8, and RDO]

Figure 4.4: Proposed RDO Complexity: Foreman

The results in Figures 4.3 and 4.5 show that the proposed method achieves better compression for most QP values. The cost of implementing this scheme comes in the form of complexity, as shown in Figures 4.4 and 4.6. The proposed encoder is a simpler version of the full RDO method because it does not perform transformation, quantization, or entropy encoding for each tested mode. However, most of the coding time is spent in the motion compensation process, which is performed for each tested mode and is the source of the additional complexity.

[Plot: Flower, RDO, 100 frames at 25 FPS; PSNR (Y) vs. rate (kbps) for 16x16, 8x8, and RDO.]

Figure 4.5: Proposed RDO Results: Flower

The results are significant because the encoder now has a "best case" algorithm. This means that less complex mode selection algorithms can now be developed and tested with the goal of coming as close as possible to the rate-distortion curve of the proposed method.

[Plot: Flower, RDO complexity; time (seconds) vs. QP for 16x16, 8x8, and RDO.]

Figure 4.6: Proposed RDO Complexity: Flower

CHAPTER V

Conclusions and Future Work

5.1 Conclusions

Multi-frame prediction is introduced so that B frame interprediction can be implemented. For the proposed encoder, the results show that adding an additional reference frame can increase performance, but due to the added complexity, using more than two reference frames makes the coding time less desirable for the gain in compression.
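The reference selection described above amounts to running one motion search per candidate frame and keeping the best. A minimal sketch, assuming a `motion_search` callback that returns a motion vector and a cost (the names and interface are illustrative, not the thesis encoder's):

```python
def best_reference(mb, ref_frames, motion_search):
    """Search each reference frame and keep the one with the lowest cost.

    Each extra reference adds a full motion search, which is why using
    more than two references trades little compression gain for a large
    increase in encoding time.
    """
    best = None
    for idx, ref in enumerate(ref_frames):
        mv, cost = motion_search(mb, ref)
        if best is None or cost < best[2]:
            best = (idx, mv, cost)
    return best  # (reference index, motion vector, cost)
```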

B frame interprediction is introduced in order to improve coding efficiency and performance.

Simulations show that adding a B frame in between the P frames increases performance on the rate distortion curve. The complexity of B frames implemented with two reference frames is less than that of P frames with two references, since both B frame reference lists can be searched concurrently during motion estimation.
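Once a prediction is obtained from each list, the B frame prediction itself can be illustrated with H.264's default bi-prediction, which averages the list 0 and list 1 samples with rounding. This is a simplified sketch that ignores weighted prediction:

```python
def bipredict(pred_l0, pred_l1):
    # Average the list 0 and list 1 predictions with upward rounding,
    # as in H.264 default bi-prediction: (p0 + p1 + 1) >> 1 per sample.
    return [(p0 + p1 + 1) >> 1 for p0, p1 in zip(pred_l0, pred_l1)]
```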

The Rate Distortion Optimization method is necessary in H.264 due to the variety of coding options available that were not seen in previous standards. Due to the computational demands, fast implementations of RDO may be called upon for practical applications. The proposed RDO method provides better performance than using 16x16 prediction blocks only, but at the cost of complexity.

The proposed encoder now has a "best case" RDO method that can be used to test less complex methods in the future.

5.2 Future Work

The proposed RDO method has opened up new areas of research for the H.264 project. As additional features are added to the encoder, the RDO algorithm must be taken into account. A low complexity RDO module will be necessary for the hardware encoder as well. RDO is not limited to inter prediction: multiple intra prediction modes already exist in the hardware encoder that use distortion alone as the determining factor for which intra mode is chosen. Based on the research done in this thesis, a simple RDO module can be written to improve intra prediction performance. There are many existing alternatives to an RDO algorithm, such as [16] and [17].

These methods set out to reduce complexity even further while achieving performance equivalent to RDO. Looking ahead to the next coding standard, H.265 or High Efficiency Video Coding, a wide variety of coding options will be available as well. This necessitates an RDO algorithm.

BIBLIOGRAPHY

[1] I. E. Richardson, The H.264 Advanced Video Compression Standard, 2nd Edition. John Wiley and Sons, 2003.

[2] ISO/IEC 14496-2 (MPEG-4 Visual Version 1.0), "Coding of audio-visual objects - Part 2: Visual," April 1999.

[3] Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVT-G050, "Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264 / ISO/IEC 14496-10 AVC)," May 2003.

[4] T. Wedi, "Motion compensation in H.264/AVC," IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 577–586, July 2003.

[5] T. Wiegand, X. Zhang, and B. Girod, "Long-term memory motion-compensated prediction," IEEE Trans. Circuits Syst. Video Technol., vol. 9, pp. 70–84, Feb. 1999.

[6] T. Wiegand and B. Girod, Multi-Frame Motion-Compensated Prediction for Video Transmission. Norwell, MA: Kluwer, 2001.

[7] T. Wiegand, G. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the h.264/avc video coding standard,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 13, no. 7, pp. 560–576, 2003.

[8] M. Flierl, T. Wiegand, and B. Girod, "A locally optimal design algorithm for block-based multi-hypothesis motion-compensated prediction," in Data Compression Conference, 1998. DCC '98. Proceedings, 1998, pp. 239–248.

[9] M. Flierl and B. Girod, “Generalized b pictures and the draft h.264/avc video-compression standard,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 13, no. 7, pp. 587–597, 2003.

[10] H. Malvar, A. Hallapuro, M. Karczewicz, and L. Kerofsky, “Low-complexity transform and quantization in h.264/avc,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 13, no. 7, pp. 598–603, 2003.

[11] Y.-L. S. Lin, C.-Y. Kao, H.-C. Kuo, and J.-W. Chen, VLSI Design for Video Coding. New York: Springer, 1st Edition, 2010.

[12] D. Marpe, H. Schwarz, and T. Wiegand, "Context-based adaptive binary arithmetic coding in the h.264/avc video compression standard," Circuits and Systems for Video Technology, IEEE Transactions on, vol. 13, no. 7, pp. 620–636, 2003.

[13] "YUV video sequences," http://trace.eas.asu.edu/yuv/, April 2012.

[14] G. Sullivan and T. Wiegand, “Rate-distortion optimization for video compression,” Signal Processing Magazine, IEEE, vol. 15, no. 6, pp. 74–90, 1998.

[15] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” Signal Processing Magazine, IEEE, vol. 15, no. 6, pp. 23–50, 1998.

[16] F. Pan, X. Lin, S. Rahardja, K. Lim, Z. Li, D. Wu, and S. Wu, “Fast mode decision algorithm for intraprediction in h.264/avc video coding,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 15, no. 7, pp. 813–822, 2005.

[17] B.-G. Kim and C.-S. Cho, “Fast inter-mode decision algorithm for p slices in h.264/avc video standard,” in Consumer Electronics, 2007. ISCE 2007. IEEE International Symposium on, 2007, pp. 1–6.
