An Efficient Inter-Coding Algorithm for H.264

15th European Signal Processing Conference (EUSIPCO 2007), Poznan, Poland, September 3-7, 2007, copyright by EURASIP

Laeeq Aslam and Nadeem Ahmad Khan
Department of Computer Science, Lahore University of Management Sciences
Opposite Sector ‘U’, DHA, 54792, Lahore, Pakistan
phone: + (92)42-5722670-9, fax: + (92)42-5722591, email: {laeeq, nkhan}@lums.edu.pk
web: http://cs.lums.edu.pk

ABSTRACT

An efficient inter-coding algorithm for the H.264 video coding standard is presented that simultaneously improves encoding time and bit rate without a loss in SNR. It extends the 3DRS algorithm, previously known for fixed block sizes, to variable block sizes. We use the central difference to find boundaries in a consistent motion vector field and use them to make the macroblock partition decision. Compared to the Full Search (FS) algorithm, the proposed algorithm on average takes 66.65% less computational time and produces 2.9% fewer bits, with an average SNR gain of 0.053 dB. Compared to the Fast Motion Estimation (FME) reference algorithm, the proposed algorithm on average saves 6.31% of computational time and 1.5% of bits, with an SNR gain of 0.072 dB. Experimental results using six different sequences are presented to demonstrate the advantage of the proposed algorithm.

1. INTRODUCTION

H.264 defines three key profiles: Baseline, Main, and Extended. The Baseline profile is the simplest profile and targets applications with limited processing resources and low delay requirements. The Main profile adds features that improve video quality at the expense of a significant increase in computational complexity. The Extended profile targets streaming video, and includes features to improve error resilience and to facilitate switching between different bit streams [1]. To overcome the high computational complexity of the codec we need fast algorithms, efficient implementation and enhanced computational resources. All three profiles use inter-coding to exploit temporal redundancy. This work concerns inter-coding and is most useful for the Baseline profile because of its limited processing resources; however, the other profiles will also benefit from the proposed scheme.

Inter-coding in H.264 is done with macroblocks partitioned in different ways, as shown in figure 1, with luma block sizes ranging from a minimum of 4x4 to a maximum of 16x16. The primary macroblock partition patterns are 16x16, 16x8, 8x16 and 8x8. If the 8x8 partitioning is selected, it can be further partitioned into 8x4, 4x8 or 4x4 [1]. If we take object coding and fixed block coding as two extreme approaches, H.264 gives the flexibility to use variable block sizes to explore the tradeoffs between the two approaches.

In general, a large area with consistent motion is more likely to be coded using a large block size, and areas containing motion boundaries are more likely to be coded using smaller block sizes. The basic idea is to exploit the homogeneity of motion in the scene. The encoder selects the best macroblock partition and mode of prediction for each macroblock, such that the video coding performance is optimized.

Figure 1 - Allowed macroblock and sub-macroblock partitioning in H.264

In the next section we give an overview of the related work from the literature. The third section covers the suggested algorithm and the fourth section contains the experimental results, after which we give conclusions.

2. OVERVIEW OF RELATED WORK

A lot of work in recent years has suggested algorithms to find the best mode and partition [2, 3, 6, 7]. If we can identify, prior to motion estimation, the macroblocks that are likely to be skipped, we can save a lot of computation. One such technique is suggested in [2]; it can reduce the encoding time by 29.67% on average without significant loss in rate-distortion performance. This is achieved by estimating a Lagrangian rate-distortion cost function.

In [3] an early termination algorithm is given for variable block-size motion estimation by concentrating on zero motion. If the rate-distortion cost of a partition in a macroblock at (0, 0) is less than a threshold, it is declared a zero motion block. For sequences with little motion this technique can save up to 93.4% of search points per macroblock while the reduction in PSNR is not more than 0.05 dB, whereas in sequences with high motion it can save up to 60% of search points with a negligible loss in PSNR.

According to [6], the homogeneity decision for a block depends on edge information, and macroblock differencing can be used to judge whether the macroblock is time stationary or not. Based on these facts, a fast algorithm for inter-coding is presented in [6] which can reduce the encoding time by up to 30% with a loss of 0.03 dB in PSNR and a 0.6% increase in bit rate.

In [7], a prediction algorithm for block size and mode is presented which has a speedup factor of 30% in comparison to the fast full search algorithm, with a PSNR loss of 0.071 dB and a bit rate increase of 2.78%.

In [9], a 3DRS based variable block size motion estimation algorithm is presented which achieves an 80% encoding time improvement over full search and up to 55% over FME by compromising around 0.15 dB of SNR. The bit-stream size is slightly smaller than for FS and FME.

3. THE PROPOSED ALGORITHM

We focus on inter-coding, which starts at the second frame of any sequence, while the first frame is intra coded. We suggest an algorithm which is motion adaptive and exploits the motion information to decide whether a certain macroblock is at a discontinuity of the motion vector field or not. A consistent or true motion field is obtained using the 3DRS algorithm with fixed block size [8]. Examples of such motion fields are shown in figure 3. The scaled absolute central difference is applied to the motion field to find the macroblocks containing the motion boundaries. Such macroblocks are partitioned to obtain high coding efficiency. The suggested algorithm consists of two steps:

1. Keep the mode of each macroblock as inter 16x16 and find the motion vectors using the 3DRS algorithm [8]. Apply a fine search to refine the motion vectors found.
2. Based on the motion vector information found in the above step, decide on the macroblock partitions and encode.

The 3-D recursive search algorithm has previously been used for fixed block size in [8], and for variable block size in [9]. The key idea of this algorithm is to use the known motion vectors of spatial and temporal neighbours to find the motion vector for the current block. Spatial neighbours are those for which we have already found the motion vector. Temporal neighbours are those for which the current frame does not have the motion vector information yet, but the information for the previous frame is still available and has not been overwritten. Two estimators ‘a’ and ‘b’ are defined with diagonally opposite convergence directions. Each estimator provides a set of candidates consisting of a temporal predictor and up to four other candidates obtained from the spatial predictor by adding random update vectors to it.

In the first step, our algorithm obtains the consistent motion field using the 3DRS algorithm with a fixed block size of 16x16. In the second step, the obtained motion field is used to determine whether a certain macroblock is on a motion vector field discontinuity or not, and what the right partition of that macroblock is. This process is performed as follows.

Assume that we have (j x k) blocks in a given frame. After the first step, the motion vector field can be represented as:

$$MV = \{ (X_{xy}, Y_{xy}) \mid 0 \le x < j,\ 0 \le y < k \}$$

where $(X_{ab}, Y_{ab})$ is the motion vector for the macroblock in the a-th row and b-th column of macroblocks. As we are interested in finding the macroblocks through which the motion field boundaries pass, we calculate the scaled central difference as:

$$D_H = \{ X_{(x+1)y} - X_{(x-1)y} \mid 0 < x < j-1,\ 0 \le y < k \}$$

$$D_V = \{ Y_{x(y+1)} - Y_{x(y-1)} \mid 0 \le x < j,\ 0 < y < k-1 \}$$

$D_H$ contains the differences of the first elements of the motion vectors (pairs), while the differences of the second elements are kept in $D_V$. Note that this is a scaled central difference because it is not divided by two, but it is still a gradient estimator. If we consider the image plane as shown in figure 2, $D_H$ will capture the horizontal edges in the MV field while $D_V$ will capture the vertical edges in the MV field.

Figure 2: Image plane considered in this paper

The strength of an edge in this case is a measure of the relative motion between two blocks in the horizontal or vertical direction.

It is also important to note that if we use the forward or backward difference, the information obtained is not sufficient to point out the blocks which contain the motion boundaries; such a difference only tells us that two blocks are moving in different directions. To find the macroblocks containing motion boundaries we have to take the central difference.

Using the central difference as an approximate derivative we can capture many local properties. If the right and left, or top and bottom, neighbouring blocks are moving in different directions, then the difference in motion is quantified by the gradient magnitude. Blocks containing boundary of two objects or surfaces which have a relative motion will also be identi-
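As an illustration of the second step, the horizontal and vertical scaled central differences of a block motion field can be sketched as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the field is assumed to be a j x k nested list indexed as mv[x][y] = (X, Y), the differences are computed only at interior positions where both neighbours exist, and the function names and the threshold used to flag boundary macroblocks are hypothetical, since the text here does not specify a decision rule.

```python
def scaled_central_difference(mv):
    """Scaled (not divided by two) central differences of a motion field.

    mv[x][y] is the (X, Y) motion vector of the macroblock in row x,
    column y of a j x k field (layout assumed for this sketch).
    Returns two dicts keyed by (x, y): dh holds differences of the first
    components along x, dv differences of the second components along y.
    """
    j, k = len(mv), len(mv[0])
    dh, dv = {}, {}
    for x in range(1, j - 1):          # 0 < x < j-1
        for y in range(k):             # 0 <= y < k
            dh[(x, y)] = mv[x + 1][y][0] - mv[x - 1][y][0]
    for x in range(j):                 # 0 <= x < j
        for y in range(1, k - 1):      # 0 < y < k-1
            dv[(x, y)] = mv[x][y + 1][1] - mv[x][y - 1][1]
    return dh, dv


def boundary_blocks(mv, threshold):
    """Flag macroblocks whose absolute scaled central difference exceeds
    a threshold. The threshold is a hypothetical parameter of this sketch."""
    dh, dv = scaled_central_difference(mv)
    flagged = {pos for pos, d in dh.items() if abs(d) > threshold}
    flagged |= {pos for pos, d in dv.items() if abs(d) > threshold}
    return flagged


# Example: the top two rows are static, the bottom two rows move with X = 4.
field = [[(0, 0)] * 4, [(0, 0)] * 4, [(4, 0)] * 4, [(4, 0)] * 4]
print(sorted(boundary_blocks(field, threshold=2)))
```

In this example the macroblocks in rows 1 and 2 are flagged, because each of them has neighbours on either side that move differently; these would be the candidates for a finer partition.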
