
FACTA UNIVERSITATIS (NIS)
Series: Electronics and Energetics vol. 9, No. 2 (1996), 173-184

SOME IMPROVEMENTS OF THE IMAGE SEQUENCE COMPRESSION

Dusan Curapov and Miodrag Popovic

Abstract. In this paper, some improvements of the standard method for image sequence compression using the discrete cosine transform and motion compensation are described. The main improvements are realized by dividing the images into blocks of various sizes and by tuning the quantization matrices according to image activity. The realized signal-to-noise ratio was 37 to 39 dB at a bit rate of 0.3 bpp, which represents an improvement of 2 to 4 dB over similar DCT-based methods.

Manuscript received January 27, 1995. A version of this paper was presented at the second Conference Telecommunications in Modern Satellite and Cables Services, TELSIKS'95, October 1995, Nis, Yugoslavia. The authors are with the Faculty of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, P.O. Box 816, 11001 Belgrade, Yugoslavia.

1. Introduction

The recent advances in communication and computer systems have made possible the transmission of images at low bit rates, using standard telephone or ISDN channels. In order to transmit a TV signal through such links, its spatial and temporal redundancies should be significantly reduced. To reduce the spatial redundancy, the discrete cosine transform (DCT) is usually used, because of its ability to compact the signal into a small number of DCT coefficients. Further, the DCT is a real transform, and fast algorithms for its computation exist. In order to realize compression in real time using modern VLSI integrated circuits for DCT computation, the image is usually divided into blocks of 8 × 8 pixels [7].

The simplest method for the removal of temporal redundancy is the coding of the difference between successive images of the sequence. More elaborate methods extract some information about the motion and use it to compensate the motion before the difference and coding operations.

In order to facilitate the transmission and storage of video signals, several international standards have been proposed, such as H.261 [1] and MPEG [3]. In these standards, the DCT is used to remove the spatial redundancy, and motion compensation is optionally used to remove the temporal redundancy. However, no specific motion compensation method is specified in these standards; only the position of the motion vector in the data stream and its size are specified. The simplified block diagrams of the coder and decoder in both systems are very similar, and they are presented in Fig. 1.

Figure 1. The block diagram of the motion compensated video signal transmission: (a) coder, (b) decoder.

In the last few years, several solutions for motion compensation in image sequences have been proposed [6]. It has been shown that motion compensation can significantly improve the visual quality of images at a fixed bit rate, or permit transmission at a lower bit rate at a fixed image quality. However, no single method appears to be the best solution.

In this paper an experimental analysis is performed to establish what can be obtained if images are divided into blocks of variable size (instead of the fixed size of 8 × 8 pixels) in the process of transformation and coding. The results are evaluated using the signal-to-noise ratio as the objective criterion, and also by a subjective evaluation. To facilitate the comparison with other results, the sequence known as Miss America was used in our experiments.

2. Motion compensation

Motion compensation is a very important part of any algorithm for the compression of image sequences. Every algorithm for motion compensation consists of two parts: the estimation of the motion, and the compensation of the estimated motion.
In order to realize efficient motion estimation and compensation, images are usually divided into small blocks, and the motion of the blocks is estimated, giving a set of motion vectors. The quality of a motion estimation algorithm depends on its computational complexity and on the accuracy of the estimated motion vectors.

In the usual video conferencing sequences, the motion of the blocks between two successive images is only a few pixels. Because of that, the motion estimation of a block of $n \times n$ pixels from the current image reduces to the determination of the best match between that block and the blocks of the same size in the previous image which lie in the search region of $(n+2p) \times (n+2p)$ pixels, where $p$ is the maximal tolerable motion. The most accurate, but the least efficient, motion estimation method is the block matching method based on the criterion of maximal cross-correlation (or minimal average absolute error) between blocks from the two images. In both cases it is necessary to examine $(2p+1)^2$ possible matches, which represents a very high computational complexity. Because of that, in the last few years several more efficient methods for motion estimation have been proposed [6], which reduce the computational complexity by up to ten times. In all these methods only a local optimum is found, but this solution is acceptable in this application.

One of the best algorithms for motion estimation is the three-step algorithm [5]. In the first step of this algorithm, the mean absolute error (MAE) is calculated at nine points with coordinates $(i \pm p_1, j \pm p_1)$, $p_1 = 0$ or 3, based on the formula [4]:

$$ \mathrm{MAE}(i,j) = \frac{1}{n^2} \sum_{u=1}^{n} \sum_{v=1}^{n} \left| f_k(u,v) - f_{k-1}(u+i, v+j) \right| \tag{1} $$

where $f_k$ and $f_{k-1}$ denote the current and the previous image. These points are denoted by 1 in Fig. 2. From these nine points, the point with the minimum MAE is chosen, which represents the first approximation of the motion vector. In Fig. 2 this is the point $(i+3, j+3)$.
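As a point of reference for the fast methods, the exhaustive block matching with the MAE criterion of equation (1) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `mae` and `full_search`, the NumPy array layout (first index vertical, second index horizontal) and the skipping of candidates that fall outside the previous image are assumptions.

```python
import numpy as np


def mae(cur, prev, top, left, n, di, dj):
    """Equation (1): mean absolute error between the n x n block of `cur`
    at (top, left) and the block of `prev` displaced by (di, dj)."""
    a = cur[top:top + n, left:left + n].astype(np.float64)
    b = prev[top + di:top + di + n, left + dj:left + dj + n].astype(np.float64)
    return float(np.abs(a - b).mean())


def full_search(cur, prev, top, left, n, p):
    """Visit all (2p + 1)**2 displacements in the search region.

    Returns (best vector, best MAE, number of candidates visited)."""
    height, width = prev.shape
    best_vec, best_err, visited = (0, 0), float("inf"), 0
    for di in range(-p, p + 1):
        for dj in range(-p, p + 1):
            visited += 1
            r, c = top + di, left + dj
            if r < 0 or c < 0 or r + n > height or c + n > width:
                continue  # assumption: ignore blocks outside the image
            err = mae(cur, prev, top, left, n, di, dj)
            if err < best_err:
                best_vec, best_err = (di, dj), err
    return best_vec, best_err, visited
```

For $p = 6$ the loops visit $(2 \cdot 6 + 1)^2 = 169$ candidate displacements per block, which is the cost that the fast methods aim to reduce.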
In the second step, the MAE is calculated at eight new points (denoted by 2 in Fig. 2), using a finer resolution $p_2 < p_1$, and the new approximation of the motion vector is determined. This is the point $(i+3, j+5)$ in Fig. 2. The second step can be repeated several times, every time with a finer resolution $p_i < p_{i-1}$. In the last step, the resolution is $p_i = 1$. When the size of the search region is $p \le 6$, only three steps are needed. The final position of the motion vector in Fig. 2 is $(i+2, j+6)$. This method is also suitable for hardware realization in VLSI.

In this paper, in order to obtain higher compression, several experiments using variable block size are performed. It is known that blocks of greater size can be coded using a smaller number of bits per pixel (bpp), but their use is not suitable if there is large motion between images. That is the reason why the block size of 8 × 8 pixels is used in standard coding methods. In our experiment, in the first phase of the motion estimation, the block size of 16 × 16 pixels was used. Each block is characterized by two parameters: the motion vector and the MAE. The value of the MAE determines the activity of the block, i.e. it represents the amount of motion in the block that cannot be compensated. Using the MAE, all blocks are classified into three classes, so that class I contains the blocks with the highest activity. Then, in the second phase of the algorithm, groups of four neighboring blocks are examined. If at least three of the blocks belong to class III and at most one to class II, these four blocks are merged into a larger block of size 32 × 32, with the motion vector equal to the arithmetic mean of the four motion vectors. If a block belongs to class I, it is divided into four smaller blocks of size 8 × 8. The motion vectors of these new blocks have to be determined again, but the existing motion vector is used as a good initial guess to reduce the computation.
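The three-step search described above, including its reuse with an initial guess for the 8 × 8 blocks, can be sketched as follows. This is a hedged illustration rather than the authors' code: the names `mae` and `three_step_search`, the `start` parameter, the step sizes (3, 2, 1) read from the example in Fig. 2, and the rule that candidates outside the image get an infinite error are all assumptions.

```python
import numpy as np


def mae(cur, prev, top, left, n, di, dj):
    """Equation (1); returns inf when the displaced block leaves `prev`
    (an assumption, the paper does not discuss image borders)."""
    r, c = top + di, left + dj
    if r < 0 or c < 0 or r + n > prev.shape[0] or c + n > prev.shape[1]:
        return float("inf")
    a = cur[top:top + n, left:left + n].astype(np.float64)
    b = prev[r:r + n, c:c + n].astype(np.float64)
    return float(np.abs(a - b).mean())


def three_step_search(cur, prev, top, left, n, steps=(3, 2, 1), start=(0, 0)):
    """Coarse-to-fine search around `start`.

    Returns (motion vector, MAE, number of MAE evaluations)."""
    best = start
    best_err = mae(cur, prev, top, left, n, *best)
    evaluations = 1
    for p in steps:
        centre = best  # each step is centred on the best point so far
        for di in (-p, 0, p):
            for dj in (-p, 0, p):
                if di == 0 and dj == 0:
                    continue  # the centre was evaluated in the previous step
                cand = (centre[0] + di, centre[1] + dj)
                err = mae(cur, prev, top, left, n, *cand)
                evaluations += 1
                if err < best_err:
                    best, best_err = cand, err
    return best, best_err, evaluations
```

With the default steps the sketch evaluates 9 + 8 + 8 = 25 positions instead of the 169 positions of an exhaustive search with $p = 6$; as noted above, the result is only a local optimum. For a block obtained by splitting, one might call `three_step_search(cur, prev, top, left, 8, steps=(1,), start=parent_vector)`, where `parent_vector` is a hypothetical name for the motion vector of the original 16 × 16 block.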
The remaining blocks, which are neither merged nor divided, retain the initial size of 16 × 16 pixels. Using this procedure, only the blocks that contain large motion are coded using the block size of 8 × 8 pixels.

Figure 2. The three-step algorithm for motion estimation.

3. Discrete cosine transform

When the motion vectors of all blocks of an image are determined, the differences between the blocks from the current image and the corresponding motion compensated blocks from the previous image are formed. The difference blocks are then transformed using the 2-D discrete cosine transform (DCT), according to the expression:

$$ F(u,v) = \frac{c_u c_v}{\sqrt{2n}} \sum_{y=0}^{n-1} \sum_{x=0}^{n-1} f(x,y) \cos\frac{(2x+1)u\pi}{2n} \cos\frac{(2y+1)v\pi}{2n} \tag{2} $$

where

$$ c_u, c_v = \begin{cases} \dfrac{1}{\sqrt{2}}, & u, v = 0 \\ 1, & u, v \neq 0 \end{cases} \tag{3} $$

and $n = 8$, 16 or 32.
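Equations (2) and (3) can be evaluated directly with two matrix products. The sketch below is an illustration under assumptions, not the authors' implementation: the function name `dct2_block` and the convention that the first array index corresponds to $x$ are hypothetical, and the scale factor $1/\sqrt{2n}$ is taken from equation (2).

```python
import numpy as np


def dct2_block(f):
    """2-D DCT of a square n x n block, written from equations (2) and (3)."""
    f = np.asarray(f, dtype=np.float64)
    n = f.shape[0]
    k = np.arange(n)
    # basis[u, x] = cos((2x + 1) * u * pi / (2n))
    basis = np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * n))
    # Equation (3): 1/sqrt(2) for index 0, 1 otherwise
    c = np.where(k == 0, 1.0 / np.sqrt(2.0), 1.0)
    # (basis @ f @ basis.T)[u, v] is the double sum of equation (2)
    return np.outer(c, c) / np.sqrt(2 * n) * (basis @ f @ basis.T)
```

For $n = 8$ this scale factor equals $2/n$, so the transform is orthonormal and preserves the energy of the block. A practical coder would normally use one of the fast DCT algorithms instead of explicit matrix products.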
