Applied Wavelet Techniques in Image Coding Standards
DRAGORAD MILOVANOVIC(1), ZORAN BOJKOVIC(2)
(1) Faculty of Electrical Engineering, University of Belgrade, Bulevar Revolucije 73, 11120 Belgrade, Serbia and Montenegro
(2) Faculty of Transport and Traffic Engineering, University of Belgrade, Vojvode Stepe 305, 11000 Belgrade, Serbia and Montenegro
Abstract: - With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address these needs, the ISO/IEC JPEG2000 and MPEG-4 VTC image coding standards have been developed. Lossless and lossy compression, progressive transmission and region-of-interest coding are vital new features based on advanced multiresolution techniques: integer wavelets, spatially segmented and line-based wavelet transforms, trellis quantization, binary arithmetic entropy coding and rate-distortion optimization. The standard-based development of wavelet technology is reviewed in the paper, along with a comparative analysis of performances and functionalities.
Key-words: multiresolution techniques, image coding algorithms.
1 Introduction

With the development of advanced multimedia applications (image archiving, network image transmission, document imaging, medical imaging, …), image compression requires higher performance as well as new features. A great effort has been made to deliver a new standard by providing features that did not exist in previous standards, and also by providing higher efficiency for features that exist in others.

JPEG2000 and MPEG-4 VTC are new image compression standards developed by the ISO/IEC [1, 2]. Their set of new features (lossy and lossless compression, progressive recovery by pixel accuracy or resolution, tiling, region of interest coding, random codestream access and processing, and error resilience) takes advantage of advanced multiresolution technologies: integer wavelets and lifting schemes, spatially segmented and line-based wavelets, trellis quantization, binary arithmetic entropy coding and rate-distortion optimization.

The advanced features of JPEG2000 and MPEG-4 VTC, as well as the development of wavelet technology, are presented in the paper, along with a comparative analysis of performances and functionalities.

2 Technology overview

2.1 JPEG2000

The JPEG2000 standard provides a set of new features that are of vital importance to many multimedia applications. It addresses areas where current standards fail to produce the best quality or performance, and provides capabilities to markets that currently do not use compression [3].

It is desired to provide lossless compression naturally in the course of progressive coding. The JPEG2000 wavelet lossy coder offers performance superior to the current standards at low bit-rates (e.g. below 0.25 bpp for highly detailed grey-scale images). This significantly improved low bit-rate performance has been achieved without sacrificing performance on the rest of the rate-distortion intervals.

Progressive transmission, which allows images to be reconstructed with increasing pixel accuracy or spatial resolution, is essential for many applications.

Often there are parts of an image that are more important than others. The region-of-interest feature allows user-defined Regions-Of-Interest (ROI) in the image to be randomly (and progressively) accessed and/or decompressed with less distortion than the rest of the image. Also, random codestream processing could allow operations such as rotation, translation, filtering, feature extraction and scaling.

Portions of the codestream may be more important than others in determining decoded image quality. Proper design of the codestream can aid subsequent error resilience.

The JPEG2000 standard is capable of compressing and decompressing images with a single sequential pass (real-time coding).

2.2 MPEG-4 VTC

MPEG-4 Visual Texture Coding (VTC) is the algorithm used in the MPEG-4 standard to compress the texture information in photo-realistic 3D models. As the texture in a 3D model is similar to a still picture, this algorithm can also be used for compression of still images. It is based on the discrete wavelet transform (DWT), scalar quantization, zero-tree coding and arithmetic coding. MPEG-4 VTC supports SNR scalability through the use of different quantization strategies: single quantization (SQ), multiple quantization (MQ) and bilevel quantization (BQ). SQ provides no SNR scalability, MQ provides limited SNR scalability and BQ provides generic SNR scalability.

Resolution scalability is supported by the use of band-by-band scanning (BB) instead of traditional zero-tree scanning (tree-depth, TD), which is also supported.

MPEG-4 VTC also supports coding of arbitrarily shaped objects by means of a shape-adaptive DWT, but does not support lossless coding. Several objects can be encoded separately, possibly at different qualities, and then composited at the decoder to obtain the final decoded image.

3 Wavelet techniques development

In the JPEG2000 coder, the discrete wavelet transform is first applied to the source image data. The transform coefficients are then quantized and entropy coded before forming the output bitstream [4]. The decoder is the reverse of the encoder. Depending on the wavelet transform and the applied quantization, JPEG2000 can be either lossy or lossless.

The original (source) image is partitioned into rectangular nonoverlapping blocks called tiles (Figure 1). This is the strongest form of spatial partitioning, in that all operations are performed independently on the different tiles of the image. All tiles have exactly the same dimensions, except possibly those that abut the right and lower boundaries of the image. Tiling reduces memory requirements and constitutes one of the methods for the efficient extraction of a region of the image.

To perform the forward DWT, the standard uses a 1D subband decomposition of a 1D set of samples into low-pass samples and high-pass samples, the latter representing a downsampled residual version of the original set, needed for the perfect reconstruction of the original set from the low-pass set. The DWT can be irreversible or reversible [5]. The default irreversible transformation is implemented by means of the Daubechies 9-tap/7-tap filter. The default reversible transformation is implemented by means of the 5-tap/3-tap filter with integer coefficients [6].

The standard supports two filtering modes: convolution-based and lifting-based. For both modes the signal should first be extended periodically. Convolution-based filtering consists of performing a series of dot products between the two filter masks and the extended 1D signal. Lifting-based filtering consists of a sequence of very simple filtering operations in which, alternately, odd sample values of the signal are updated with a weighted sum of even sample values, and even sample values are updated with a weighted sum of odd sample values. For the reversible (lossless) case the results are rounded to integer values. The lifting-based filtering for the 5/3 analysis filter is achieved by the following equations:

$$ y(2n+1) = x_{ext}(2n+1) - \left\lfloor \frac{x_{ext}(2n) + x_{ext}(2n+2)}{2} \right\rfloor \tag{1} $$

$$ y(2n) = x_{ext}(2n) + \left\lfloor \frac{y(2n-1) + y(2n+1) + 2}{4} \right\rfloor \tag{2} $$

where $x_{ext}$ is the extended input signal, $y$ is the output signal, while $\lfloor a \rfloor$ and $\lceil a \rceil$ indicate the largest integer not exceeding $a$ and the smallest integer not exceeded by $a$, respectively.

Quantization is the process by which the coefficients are reduced in precision. This operation is lossy, unless the quantization step is 1 and the coefficients are integers, as produced by the reversible integer 5/3 wavelet. Each of the transform coefficients $a_b(u,v)$ of the subband $b$ is quantized to the value $q_b(u,v)$ according to the formula

$$ q_b(u,v) = \mathrm{sign}\big(a_b(u,v)\big) \left\lfloor \frac{|a_b(u,v)|}{\Delta_b} \right\rfloor \tag{3} $$

where $\Delta_b$ is the quantization step. The dynamic range depends on the number of bits used to represent the original image tile component and on the choice of the wavelet transform. All quantized transform coefficients are signed values, even when the original components are unsigned. These coefficients are expressed in a sign-magnitude representation prior to coding. Part 1 of the JPEG2000 standard uses only simple scalar dead-zone quantization. Part 2 of the standard will probably contain a trellis coded quantizer [7].

Each subband of the wavelet decomposition is divided into rectangular blocks, called code-blocks, which are coded independently using arithmetic coding. A binary arithmetic entropy coder called the MQ-coder is used to compress the symbols output by the context model. Both its complexity and its compression efficiency are considerably higher than those of the Huffman coder typically used in JPEG. The code-blocks are coded one bit-plane at a time, starting with the most significant bit-plane containing a non-zero element and ending with the least significant bit-plane. For each bit-plane in a code-block, a special code-block scan pattern is used for each of the three passes. Each coefficient bit in the bit-plane is coded in only one of the three passes. A rate-distortion optimization method is used to allocate a certain number of bits to each block [8, 9, 10, 11].

The scaling-based ROI coding scales up the coefficients so that the bits associated with the ROI are placed in higher bit-planes. During the embedded coding process, those bits are placed in the bitstream before the non-ROI parts of the image. Thus, the ROI will be decoded, or refined, before the rest of the image. Regardless of the scaling, a full decoding of the bitstream results in a reconstruction of the whole image with the highest fidelity available. If the bitstream is truncated, or the encoding process is terminated before the whole image is fully encoded, the ROI will have a higher fidelity than the rest of the image. The ROI approach defined in JPEG2000 Part 1 allows ROI encoding of arbitrarily shaped regions without the need for shape information and shape decoding [12, 13].

There are four basic dimensions of progression/scalability in the JPEG2000 bitstream: resolution, quality, spatial location and component. Different types of progression are achieved by the ordering of packets within the bitstream (Figures 2 and 3) [14]. The JPEG2000 bitstream contains markers which identify the progression type of the bitstream. Other markers may be written which store the length of every packet in the bitstream. To change a bitstream from progressive by resolution to progressive by SNR, a parser can read all the markers, change the type of progression in the markers, write the lengths of the packets out in the new order, and write the packets themselves out in the new order. There is no need to run the MQ-coder or the context model, or even to decode the block inclusion information. The complexity is only slightly higher than that of a pure copy operation.

4 Comparative performances and functionalities

Compression efficiency is one of the top priorities in the design of image products. Lossless and lossy progressive compression efficiency have been evaluated with 7 images from the JPEG2000 test set, covering various types of imagery. JPEG2000 provides, in most cases, competitive compression ratios with the added benefit of scalability. The rate-distortion behavior of the lossy (nonreversible) JPEG2000 and the progressive JPEG is depicted in Figure 4 for a natural image. It is seen that JPEG2000 significantly outperforms the JPEG scheme [14].

In order to evaluate the error resilience features offered by the different standards, a transmission channel with random errors has been simulated in [15], together with the evaluation of the average reconstructed image quality after decompression. Table 3 shows the results for JPEG2000 with the non-reversible filter and for JPEG baseline. As can be seen, the reconstructed image quality under transmission errors is higher for JPEG2000 than for JPEG, across all encoding bitrates and error rates.

Table 1 summarizes the comparison of still image coding algorithms from a functionality point of view. The functionality matrix indicates the set of supported features in each standard.

5 Concluding remarks

JPEG2000 and MPEG-4 VTC are the new standards offering a rich set of features based on multiresolution techniques (Table 1), in an efficient manner and within an integrated algorithmic approach.

The most important technology highlights for JPEG2000 are: wavelet/subband coding, reversible integer-to-integer and nonreversible real-to-real wavelet transforms, bit-plane coding, arithmetic coding with the MQ-coder from JBIG2 (ISO/IEC 14492), a coding scheme based heavily on EBCOT (embedded block coding with optimized truncation), a code stream syntax similar to JPEG, and a file format syntax.

Overall, one can say that JPEG2000 is a successful standard that offers the richest set of features and provides superior rate-distortion performance. However, this comes at the price of additional complexity when compared to JPEG, which might currently be perceived as a disadvantage for some applications, as was the case for JPEG when it was first introduced.

References:
[1] K.R.Rao, Z.Bojkovic, D.Milovanovic, Multimedia communication systems: techniques, standards and networks (Prentice-Hall, 2002).
[2] K.R.Rao, Z.Bojkovic, Packet video communications over ATM networks (Prentice-Hall, 2000).
[3] A.N.Skodras, C.A.Christopoulos, T.Ebrahimi, JPEG2000: The upcoming still image compression standard, Proc. RECPA, pp.359-366, 2000.
[4] ISO/IEC JTC1/SC29/WG11 N750, Performance evaluation of different reversible decorrelating transforms in JPEG2000 baseline system, 1998.
[5] M.D.Adams, F.Kossentini, Reversible integer-to-integer wavelet transforms for image compression: Performance evaluation and analysis, IEEE Trans. IP, vol.9, no.6, pp.1010-1024, 2000.
[6] ISO/IEC JTC1/SC29/WG11 N868, Performance evaluation of spatially segmented wavelet transform in the JPEG2000 baseline system, 1998.
[7] P.Sriram, M.W.Marcellin, Image coding using wavelet transforms and entropy-constrained trellis quantization, IEEE Trans. IP, vol.4, pp.725-733, 1995.
[8] J.M.Shapiro, Embedded image coding using zerotrees of wavelet coefficients, IEEE Trans. SP, vol.41, no.12, pp.3445-3462, 1993.
[9] A.Said, W.Pearlman, A new, fast and efficient image codec based on set partitioning in hierarchical trees, IEEE Trans. CSVT, vol.6, no.3, pp.243-250, 1996.
[10] D.Taubman, A.Zakhor, Multirate 3D subband coding of video, IEEE Trans. IP, vol.3, no.5, pp.572-588, 1994.
[11] D.Taubman, High performance scalable image compression with EBCOT, IEEE Trans. IP, vol.9, no.7, 2000.
[12] D.S.Cruz, T.Ebrahimi, M.Larsson, J.Askelöf, C.Christopoulos, Region of Interest Coding in JPEG2000 for interactive client/server applications, Proc. IEEE Workshop on MSP, Denmark, pp.389-394, 1999.
[13] ISO/IEC JTC1/SC29/WG11 N892, Region of interest coding, 1998.
[14] ISO/IEC JTC1/SC29/WG11 N1716, Report on Core Experiments V1 (Evaluation of the distortion-adaptive progressive CSF weighting technique), 2000.
[15] ISO/IEC JTC1/SC29/WG11 N1606, Error resilience Ad-hoc sub-group report, 2000.
Figure 1. Tiling, DC level shifting and DWT on each image tile component. [Block diagram: image component, tiling, DC level shifting, DWT on each tile.]

Figure 2. Twelve code-blocks of one packet partition location at resolution level 2 of a 3-level dyadic wavelet transform. The packet partition location is presented by heavy lines. [Diagram: numbered code-blocks within the subbands.]

Figure 3. The composition of one packet partition location with 12 code-blocks. [Diagram: packet header, followed by n0 sub-bitplanes from code-block 0, n1 sub-bitplanes from code-block 1, ..., n11 sub-bitplanes from code-block 11.]
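To illustrate the per-tile DWT stage of Figure 1, the reversible 5/3 lifting steps (equations (1) and (2) in Section 3) can be sketched in a few lines of Python. This is a minimal one-dimensional sketch rather than the normative procedure: the function names are hypothetical, and the mirror (whole-sample symmetric) boundary extension is an assumption made here to give the signal extension a concrete form. The output stays interleaved, with low-pass values at even positions and high-pass values at odd positions.

```python
def _mirror(i, n):
    """Map index i onto 0..n-1 by whole-sample symmetric reflection (assumed extension)."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period  # Python's % always returns a non-negative result here
    return i if i < n else period - i


def forward_53(x):
    """Forward reversible 5/3 lifting on a list of integers."""
    n = len(x)
    if n < 2:
        return list(x)  # nothing to decorrelate in this sketch
    y = list(x)
    # Predict step, equation (1): odd samples become high-pass values.
    for i in range(1, n, 2):
        y[i] = x[i] - (x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) // 2
    # Update step, equation (2): even samples become low-pass values.
    for i in range(0, n, 2):
        y[i] = x[i] + (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) // 4
    return y


def inverse_53(y):
    """Undo forward_53 by running the lifting steps in reverse order."""
    n = len(y)
    if n < 2:
        return list(y)
    x = list(y)
    # Undo the update step first (even samples), using the stored high-pass values.
    for i in range(0, n, 2):
        x[i] = y[i] - (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) // 4
    # Then undo the predict step (odd samples), using the restored even samples.
    for i in range(1, n, 2):
        x[i] = y[i] + (x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)]) // 2
    return x


samples = [10, 12, 14, 13, 9, 7, 8, 20]
coeffs = forward_53(samples)
restored = inverse_53(coeffs)  # equals samples: integer lifting is exactly invertible
```

Because every step adds or subtracts an integer that the decoder can recompute, the inverse recovers the input exactly, which is what makes the 5/3 path usable for lossless coding.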
Table 1. Functionality matrix. A + indicates that a feature is supported; the more + signs, the more efficiently or better it is supported. A - indicates that it is not supported.

| Functionality | JPEG2000 | JPEG-LS | JPEG | MPEG-4 VTC | PNG |
|---|---|---|---|---|---|
| Lossless compression performance | +++ | ++++ | + (1) | - | +++ |
| Lossy compression performance | +++++ | + | +++ | ++++ | - |
| Progressive bitstreams | +++++ | - | ++ (2) | +++ | + |
| Region Of Interest (ROI) coding | +++ | - | - | + (3) | - |
| Arbitrary shaped objects | - | - | - | ++ | - |
| Random access | ++ | - | - | - | - |
| Low complexity | ++ | +++++ | +++++ | + | +++ |
| Error resilience | +++ | ++ | ++ | +++ | + |
| Non-iterative rate control | +++ | - | - | + | - |
| Genericity (4) | +++ | +++ | ++ | ++ | +++ |

(1) Only using the lossless mode of JPEG.
(2) Only in the progressive mode of JPEG.
(3) Tile-based only.
(4) Ability to efficiently compress different types of imagery across a wide range of bitrates.
Table 2. Wavelet technologies in JPEG2000 Part 1 and Part 2.

| Technology | Part 1 | Part 2 |
|---|---|---|
| Bitstream | Fixed and variable length markers. | New markers can be skipped by a Part 1 decoder. |
| File format | Optional. Provide intellectual property (e.g. copyright) information, color or tone-space for image, general method of including metadata. | Allow metadata to be interleaved with coded data. Define types of metadata. |
| Arithmetic coder | MQ-coder. | Same? |
| Coefficient modeling | Independent coding of fixed size blocks within subbands. Division of coefficients into 3 sub-bitplanes. Grouping of sub-bitplanes into layers. | Special models for binary or graphic data? |
| Quantization | Scalar quantizer with dead-zone, truncation of code-blocks. | Trellis coded quantization. |
| Transformation | Low complexity (5,3) and high performance Daubechies (9,7). Mallat decomposition. | Many more filters, perhaps user-defined filters. Packet and other decompositions. |
| Component decorrelation | Reversible component transform (RCT), YCrCb transform. | Arbitrary point transform or reversible wavelet transform across components. |
| Error resilience | Resynchronization markers. | Fixed length entropy coder, repeated headers. |
| Bit-stream ordering | Progressive by tile-part, then SNR, or resolution, or component. | Out of order tile-parts. |
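The dead-zone scalar quantizer listed for Part 1 in Table 2 follows the rule of equation (3): keep the sign and take the floor of the magnitude divided by the step. A minimal Python sketch is given below. The function names are hypothetical, and the midpoint reconstruction offset in `dequantize` is an assumption added for illustration, since the paper specifies only the forward rule.

```python
import math


def quantize(a, step):
    """Dead-zone scalar quantization: sign(a) * floor(|a| / step), as in equation (3)."""
    sign = -1 if a < 0 else 1
    return sign * math.floor(abs(a) / step)


def dequantize(q, step, offset=0.5):
    """Reconstruct a coefficient from its index.

    The offset (0.5 places the value mid-interval) is an assumed decoder
    choice for this sketch, not something stated in the paper.
    """
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) + offset) * step


# Every value with magnitude below one step falls into the zero bin,
# so the zero bin is twice as wide as the others (the "dead zone").
indices = [quantize(a, 2.0) for a in (-7.9, -1.9, 0.0, 1.9, 7.9)]
```

With a step of 1 and integer coefficients, `quantize` returns its input unchanged, which matches the remark in Section 3 that quantization is lossless in that case.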
Figure 4. Rate distortion results for the progressive JPEG2000 vs. the progressive JPEG for a natural image. [Plot: PSNR (dB), 30 to 46, against bitrate (bpp), 0 to 2.5; curves: JPEG2000 NR and P-JPEG.]

Table 3. Average PSNR [dB] of the decoded Café image transmitted over a noisy channel with various bit error rates (BER) and compression bitrates, for JPEG baseline and JPEG2000 (J2K).

| bpp | Codec | BER 0 | BER 1E-06 | BER 1E-05 |
|---|---|---|---|---|
| 0.25 | J2K | 23.06 | 23.00 | 21.62 |
| 0.25 | JPEG | 21.94 | 21.79 | 20.77 |
| 0.5 | J2K | 26.71 | 26.42 | 23.96 |
| 0.5 | JPEG | 25.40 | 25.12 | 22.95 |
| 1.0 | J2K | 31.90 | 30.75 | 27.08 |
| 1.0 | JPEG | 30.34 | 29.24 | 23.65 |
| 2.0 | J2K | 38.91 | 36.38 | 27.23 |
| 2.0 | JPEG | 37.22 | 30.68 | 20.78 |
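The PSNR values in Table 3 and Figure 4 are not defined in the text. The sketch below uses the conventional definition, PSNR = 10 log10(peak^2 / MSE), and assumes 8-bit samples (peak value 255); both the function name and the peak value are assumptions, since the paper does not state how its figures were computed.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(original) != len(decoded):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must not be empty")
    # Mean squared error over all samples.
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# A uniform error of 1 grey level on 8-bit data gives 20*log10(255), about 48.13 dB.
value = psnr([100, 101, 102, 103], [101, 102, 103, 104])
```

For a two-dimensional image, the same function applies to the flattened list of pixel values of each component.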