
On Computational Complexity of Motion Estimation Algorithms in MPEG-4 Encoder

Muhammad Shahid

This thesis report is presented as a part of the degree of Master of Science in Electrical Engineering

Blekinge Institute of Technology, 2010

Supervisor: Tech. Lic. Andreas Rossholm, ST-Ericsson
Examiner: Dr. Benny Lövström, Blekinge Institute of Technology

Abstract

Video encoding in mobile equipment is a computationally demanding feature that requires a well-designed and well-developed algorithm. The optimal solution requires a trade-off in the encoding process, e.g. motion estimation with a trade-off between low complexity on the one hand and high perceptual quality and efficiency on the other. The present thesis works on reducing the complexity of motion estimation algorithms used for MPEG-4 video encoding, taking the SLIMPEG motion estimation algorithm as reference. The inherent properties of video, such as spatial and temporal correlation, have been exploited to test new techniques of motion estimation. Four motion estimation algorithms have been proposed. Their computational complexity and encoding quality have been evaluated. The resulting encoded video quality has been compared against the standard Full Search algorithm. At the same time, the reduction in computational complexity of the improved algorithm is compared against SLIMPEG, which is already about 99 % more efficient than Full Search in terms of computational complexity. The fourth proposed algorithm, Adaptive SAD Control, offers a mechanism for choosing the trade-off between computational complexity and encoding quality dynamically.

Acknowledgements

It is a matter of great pleasure to express my deepest gratitude to my advisors Dr. Benny Lövström and Andreas Rossholm for all their guidance, support and encouragement throughout my thesis work. It was nonetheless a great opportunity to do research work at ST-Ericsson under the marvelous supervision of Andreas Rossholm.
The counseling provided by Benny Lövström was of great value to me in writing up this manuscript. I cannot forget to mention the help I received from Fredrik Nillson and Jimmy Rubin of ST-E in setting up the working environment and getting started with the ST-E algorithm. I owe my successes in life so far to all of my family members, for their magnificent kindness and love!

Contents

Abstract
Acknowledgements
Contents
1 Introduction
2 Basics of Digital Video
  2.1 Color Spaces
  2.2 Video Quality
  2.3 Representation of Digital Video
  2.4 Applications
    2.4.1 Internet
    2.4.2 Video Storage
    2.4.3 Television
    2.4.4 Games and Entertainment
    2.4.5 Video Telephony
3 Video Compression Fundamentals
  3.1 CODEC
  3.2 A Video CODEC
  3.3 Video Coding Standards
    3.3.1 MPEG-1
    3.3.2 MPEG-2
    3.3.3 MPEG-4
    3.3.4 MPEG-7
    3.3.5 MPEG-21
    3.3.6 H.261
    3.3.7 H.263
    3.3.8 H.263+
    3.3.9 H.264
  3.4 MPEG-4
  3.5 Syntax
4 Motion Estimation and its Implementation
  4.1 Block Matching
  4.2 Motion Estimation Algorithms
    4.2.1 Full Search
    4.2.2 Three-Step Search
    4.2.3 Diamond Search
    4.2.4 SLIMPEG
5 Rate Distortion Optimization and Bjontegaard Delta PSNR
  5.1 Measurement of Distortion
  5.2 Bjontegaard Delta PSNR
6 Simulation, Results and Discussion
  6.1 SAD as a Comparison Metric
  6.2 Proposed Techniques
    6.2.1 Spatial Correlation Algorithm
    6.2.2 Temporal Correlation Algorithm
    6.2.3 Adaptive SAD Control
  6.3 Simulations with different video sequences
    6.3.1 Football Sequence
    6.3.2 Foreman Sequence
    6.3.3 Claire Sequence
7 Conclusion and Future Work
List of figures
List of tables
Bibliography

Chapter 1

Introduction

Since the International Telecommunication Union (ITU) introduced the first digital video coding standard in 1984, the technology has seen great progress. The two main standard-setting bodies in this regard are the ITU and the International Organization for Standardization (ISO). The recommendations of the ITU include standards such as H.261/262/263/264, which focus on applications in the area of telecommunication. The Moving Picture Experts Group (MPEG) of ISO has released standards such as MPEG-1/-2/-4, which focus on applications in the computer and consumer electronics area. The standards defined by these two groups have some parts in common, and some of the work has been performed as a joint venture. The field of video compression has been developing continuously, with enhancements to previous versions of the standards and the introduction of new recommendations. The MPEG-4 standard is followed in this thesis work.

Video compression is a key requirement in any multimedia storage or transmission system: the video is encoded before it is sent or stored, and decoded subsequently at the receiver end or when it is viewed. Besides the presence of digital video in television, CD/DVD and similar media, cellular phones will probably be the next major platform for video content.
The limited storage capacity of mobile equipment dictates the requirement for efficient video compression tools. Video encoding in mobile equipment has developed from a high-end feature to something that is taken for granted. Nevertheless, it is a computationally demanding feature that requires well-designed and well-developed algorithms. Many different algorithms need to be evaluated in order to come closer to the optimal solution.

As early as 1929, Ray Davis Kell described a form of video compression for which he obtained a patent [1]. Given that a video is actually a series of pictures transmitted at some designated rate, Kell's patent gave rise to the idea of transmitting the difference between successive images instead of sending each whole image. It took a long time for the idea to be implemented in practice, but it remains a keystone of many video compression standards today. Connected to this idea is the concept of motion estimation, which tries to exploit the temporal correlation present at different positions between video frames. It predicts the motion found in the current frame using already encoded frames. Consequently, the residual frame contains much less energy than the actual frame. The motion vectors and the residual frame can be encoded at a bit rate much lower than that required to encode a regular frame.

Motion estimation may require a tremendous amount of computational work inside the video coding process. Several algorithms are employed for motion estimation. The most basic of these is the Full Search algorithm, which gives optimal performance but is computationally very time-consuming. To deal with this computational issue, many sub-optimal fast search algorithms have been designed, and this thesis focuses on some of them in an attempt to improve the performance of one of them.
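To make the cost of exhaustive search concrete, the following is a minimal Python sketch of block matching with the sum of absolute differences (SAD) as the matching cost. It is an illustration under stated assumptions, not the implementation used in this thesis: the helper names (`sad`, `full_search`, `make_frame`), the 8 x 8 block size, the search range of 3 pixels and the synthetic frames are all hypothetical choices made for the example.

```python
# Illustrative sketch only; names, block size and search range are assumptions.
# Frames are lists of rows of integer luma samples.

def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, n, search_range):
    """Test every displacement within +/- search_range pixels and
    return (best_dx, best_dy, best_sad)."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, cx, cy, cx, cy, n))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(width, height, ox, oy):
    """Synthetic frame: a bright 4 x 4 square at (ox, oy) on a dark background."""
    return [[200 if ox <= x < ox + 4 and oy <= y < oy + 4 else 10
             for x in range(width)] for y in range(height)]


ref_frame = make_frame(16, 16, 4, 4)  # previous (already encoded) frame
cur_frame = make_frame(16, 16, 6, 5)  # the square has moved by (+2, +1)

dx, dy, cost = full_search(cur_frame, ref_frame, cx=4, cy=4, n=8, search_range=3)
zero_motion_cost = sad(cur_frame, ref_frame, 4, 4, 4, 4, 8)
print("motion vector:", (dx, dy), "SAD:", cost, "SAD without motion:", zero_motion_cost)
```

In this toy case the search evaluates all 49 candidate positions for a single block, and the best match leaves a residual with far less energy than plain frame differencing would. Fast search algorithms aim to reach a similar match while evaluating only a small fraction of those candidates.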
The SLIMPEG motion estimation algorithm is taken as reference here, and inherent video properties such as spatial and temporal correlation have been applied to devise techniques aimed at less complex yet performance-oriented motion estimation algorithms.

The rest of the report is organized as follows. Chapter 2 and chapter 3 deal with the fundamentals of digital video and video compression respectively. Implementation aspects of motion estimation are explored in chapter 4, which ends with the introduction of the SLIMPEG motion estimation algorithm. Rate distortion and delta PSNR are the subject of chapter 5. The results of the main contribution are presented and described in chapter 6. Chapter 7 contains the conclusion and some hints about future work in the field.

Chapter 2

Basics of Digital Video

A video image is obtained by capturing the 2D plane view of a 3D scene. Digital video is then a sequence of spatially and temporally sampled frames. The spatio-temporal sampling unit, usually called a pixel (picture element), can be represented by a digital value that describes its color and brightness. The more sampling points taken to form the video frame, the higher the visual quality usually is, at the cost of higher storage requirements. The video frame is usually rectangular. The smoothness of a video is determined by the rate at which its frames are presented in succession. A video with a frame rate of thirty frames per second looks smooth enough for most purposes. A general comparison of the appearance of a video as determined by its frame rate is given in table 2.1 [2].

Table 2.1: Video frame rates. [2]

Video frame rate               Appearance
Below 10 frames per second     'Jerky', unnatural appearance to movement
10-20 frames per second        Slow movement appears OK; rapid movement is clearly jerky
20-30 frames per second        Movement is reasonably smooth
50-60 frames per second        Movement