JPEG 2000 Performance Evaluation and Assessment
Diego Santa-Cruz*, Raphaël Grosbois, Touradj Ebrahimi
Signal Processing Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland
Signal Processing: Image Communication 17 (2002) 113–130
*Corresponding author. E-mail address: diego.santacruz@epfl.ch (D. Santa-Cruz).

Abstract

JPEG 2000, the new ISO/ITU-T standard for still image coding, has recently reached the International Standard (IS) status. Other new standards have been recently introduced, namely JPEG-LS and MPEG-4 VTC. This paper provides a comparison of JPEG 2000 with JPEG-LS and MPEG-4 VTC, in addition to older but widely used solutions, such as JPEG and PNG, and well established algorithms, such as SPIHT. Lossless compression efficiency, fixed and progressive lossy rate-distortion performance, as well as complexity and robustness to transmission errors, are evaluated. Region of Interest coding is also discussed and its behavior evaluated. Finally, the set of provided functionalities of each standard is also evaluated. In addition, the principles behind each algorithm are briefly described. The results show that the choice of the "best" standard depends strongly on the application at hand, but that JPEG 2000 supports the widest set of features among the evaluated standards, while providing superior rate-distortion performance in most cases. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: JPEG; JPEG 2000; Image compression; MPEG; JPEG-LS; Wavelet; Lossless; MPEG-4

1. Introduction

The standardization effort of the next ISO/ITU-T standard for compression of still images, JPEG 2000 [9], has recently reached the International Standard (IS) status, 4 years after the call for proposals [10]. Great efforts have been made by all the participants to deliver a new standard for today's and tomorrow's applications, by providing features absent from previous standards, but also by providing higher efficiency for features that exist in others. Now that the standard has been finalized and accepted, some natural questions arise: how well does it perform, which features does JPEG 2000 offer, and how well does it fulfill them when compared to other standards offering the same or similar features? This paper aims at providing an answer to this simple but somewhat complex question. Section 2 provides a brief overview of the techniques compared, Section 3 evaluates different aspects of compression efficiency, while Section 4 looks at the algorithms' complexities. Error resilience performance is studied in Section 5 and functionality in Section 6. Finally, conclusions are drawn in Section 7.

2. Overview of evaluated algorithms

For the purpose of this study we compare the coding algorithm in the JPEG 2000 standard to the following three standards: JPEG [17], MPEG-4 Visual Texture Coding (VTC) [7] and JPEG-LS [6]. In addition, we also include SPIHT [20] and PNG [25]. The reasons behind this choice are as follows. JPEG is one of the most popular coding techniques in imaging applications, ranging from the Internet to digital photography. Both MPEG-4 VTC and JPEG-LS are very recent standards that are starting to appear in various applications. It is only logical to compare the set of features offered by the JPEG 2000 standard not only to those offered by a popular but older standard (JPEG), but also to those offered by the most recent ones using newer state-of-the-art technologies. SPIHT is a well-known representative of state-of-the-art wavelet codecs and serves as a reference for comparison. Although PNG is not formally a standard and is not based on state-of-the-art techniques, it is becoming increasingly popular for Internet-based applications. PNG is also undergoing standardization by ISO/IEC JTC1/SC24 and will eventually become ISO/IEC international standard 15948. Similar, less complete, evaluations have been previously presented in [22,21,13].

2.1. JPEG

This is the very well known ISO/ITU-T standard created in the late 1980s. There are several modes defined for JPEG [17,26], including baseline, lossless, progressive and hierarchical. The baseline mode is the most popular one and supports lossy coding only. The lossless mode is less widely used; it provides lossless coding only and does not support lossy compression.

In the baseline mode, the image is divided into 8×8 blocks and each of these is transformed with the Discrete Cosine Transform (DCT). The transformed blocks' coefficients are quantized with a uniform scalar quantizer, zig-zag scanned and entropy coded with Huffman coding. The quantization step size for each of the 64 DCT coefficients is specified in a quantization table, which remains the same for all blocks. The DC coefficients of all blocks are coded separately, using a predictive scheme. Hereafter we refer to this mode simply as JPEG.
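As an illustration of this pipeline, the following is a minimal sketch of how a single block could be processed: a level shift, a 2-D DCT, uniform scalar quantization with one step size per coefficient, and a zig-zag scan. It is not the reference implementation; the flat quantization table is a placeholder (the standard's example tables are perceptually tuned), and both the DC prediction and the Huffman coding stages are omitted.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block, computed separably."""
    n = block.shape[0]
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    C = scale[:, None] * basis          # C[u, x] = s(u) * cos(pi * (2x + 1) * u / 2n)
    return C @ block @ C.T

def zigzag_order(n=8):
    """(row, col) pairs of an n x n block in JPEG zig-zag order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def encode_block(block, qtable):
    """One 8x8 block: level shift, DCT, uniform quantization, zig-zag scan."""
    coeffs = dct2(block.astype(np.float64) - 128.0)
    quantized = np.round(coeffs / qtable)               # one step size per coefficient
    return [int(quantized[r, c]) for r, c in zigzag_order()]

# Placeholder flat quantization table; the tables suggested by the standard are not flat.
qtable = np.full((8, 8), 16.0)
block = np.random.randint(0, 256, (8, 8))
print(encode_block(block, qtable))
```

Because the step sizes come from a single table shared by all blocks, the rate-distortion trade-off in baseline JPEG is controlled globally by scaling that table rather than per block.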
The lossless mode is based on a completely different algorithm, which uses a predictive scheme. The prediction is based on the three nearest causal neighbors and seven different predictors are defined (the same one is used for all samples). The prediction error is entropy coded with Huffman coding. Hereafter we refer to this mode as L-JPEG.

The progressive and hierarchical modes of JPEG are both lossy and differ only in the way the DCT coefficients are coded or computed, respectively, when compared to the baseline mode. They allow the reconstruction of a lower quality or lower resolution version of the image, respectively, by partial decoding of the compressed bitstream. Progressive mode encodes the quantized coefficients by a mixture of spectral selection and successive approximation, while hierarchical mode uses a pyramidal approach to computing the DCT coefficients in a multi-resolution way.

2.2. JPEG-LS

JPEG-LS [6] is the latest ISO/ITU-T standard for lossless coding of still images, which in addition provides support for "near-lossless" compression. The main design goal of JPEG-LS has been to deliver a low-complexity solution for lossless image coding with a compression efficiency that is very close to the best reported results, which have been obtained with much more complex schemes, such as CALIC [28]. As a design trade-off, much of the extra functionality found in other standards, such as support for scalability and error resilience, has been left out.

JPEG-LS, which is based on the LOCO-I algorithm [27], is composed of two parts: part I specifies the baseline system, which is suitable for most applications, while part II defines extensions that improve the coding efficiency for particular types of imagery at the expense of increased complexity. Part I is based on adaptive prediction, context modeling and Golomb coding. In addition, it features a flat-region detector so that such regions can be encoded as run-lengths. Part I also supports "near-lossless" compression by allowing a fixed maximum sample error. Part II introduces an arithmetic coder, which is intended to deal with the limitations that part I presents when dealing with very compressible images (e.g. computer graphics, images with sparse histograms).
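The fixed prediction core of LOCO-I is the median edge detector (MED), which switches between an edge-aware and a planar predictor based on the three causal neighbors. The sketch below illustrates only that predictor and the resulting raster-scan residuals; the context modeling, adaptive bias correction, Golomb coding and run mode are omitted, and treating missing border neighbors as 0 is a simplification of the standard's rules.

```python
def med_predict(a, b, c):
    """Median edge detector (MED) prediction from the left (a), above (b)
    and above-left (c) neighbors. Picks min(a, b) or max(a, b) when a
    horizontal or vertical edge is likely, and a + b - c otherwise."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c

def prediction_residuals(img):
    """Raster-scan prediction residuals for a 2-D list of samples.
    Missing neighbors outside the image are treated as 0 here."""
    h, w = len(img), len(img[0])
    res = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = img[y][x - 1] if x > 0 else 0
            b = img[y - 1][x] if y > 0 else 0
            c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
            res[y][x] = img[y][x] - med_predict(a, b, c)
    return res

# Example: residuals of a small ramp image are small away from the borders.
img = [[i + j for j in range(4)] for i in range(4)]
print(prediction_residuals(img))
```

In the full scheme these residuals are mapped through a context-dependent bias correction and then Golomb coded, which is what keeps the complexity low while staying close to CALIC's efficiency.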
2.3. MPEG-4 VTC

MPEG-4 Visual Texture Coding (VTC) [23,12] is the algorithm used in MPEG-4 [7,18,5] to compress visual textures and still images, which are then used in photo-realistic 3D models, animated meshes, etc., or as simple still images. It is based on the discrete wavelet transform (DWT), scalar quantization, zero-tree coding and arithmetic coding. The DWT is dyadic and uses a Daubechies (9,3) tap biorthogonal filter [2].

The quantization is scalar and can be of three types: single (SQ), multiple (MQ) and bi-level (BQ). With SQ each wavelet coefficient is quantized once, and the produced bitstream is not SNR scalable. With MQ a coarse quantizer is used first and this information is coded; a finer quantizer is then applied to the resulting quantization error and the new information is coded. This process can be repeated several times, resulting in limited SNR scalability. BQ is essentially like SQ, but the information is sent by bit-planes, providing general SNR scalability.

Two scanning modes for the generated wavelet coefficients are available: tree-depth (TD), the standard zero-tree scanning, and band-by-band (BB). The former produces an SNR scalable bitstream only, while the latter produces a resolution scalable one.

2.4. PNG

PNG (Portable Network Graphics) [25] is a W3C recommendation for coding of still images, elaborated as a patent-free replacement for GIF while incorporating more features than this last one. It is based on a predictive scheme and entropy coding. The prediction is done on the three nearest causal neighbors and there are five predictors that can be selected on a line-by-line basis. The entropy coding uses the Deflate algorithm of the popular Zip file compression utility, which is based on LZ77 coupled with Huffman coding. PNG is capable of lossless compression only and supports gray scale, paletted color and true color, an optional alpha plane, interlacing and other features.

2.5. SPIHT

Although not a standard or a widely used scheme in user applications, set partitioning in hierarchical trees (SPIHT), the algorithm introduced by Said and Pearlman [20], has become widely known in the image coding community and makes a good candidate as a basis for comparing wavelet-based coders from the compression efficiency point of view. Hence its inclusion in this comparative study.

SPIHT is based on the DWT and exploits the self-similarity of the wavelet coefficients across scales by means of set partitioning. The wavelet coefficients are ordered into sets using the parent–child relationship and their significance at successively finer quantization levels. The binary decisions can be further compressed by the use of an optional arithmetic coder.
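To make the notion of significance concrete, the sketch below shows the kind of test SPIHT applies to a set of coefficients at each bit-plane, together with a simple sweep that lists which coefficients first become significant at each threshold. This is only an illustration of the ordering information involved; the LIP/LIS/LSP lists, the parent-child set partitioning and the refinement pass of the actual algorithm are not shown.

```python
import numpy as np

def set_significant(coeffs, indices, n):
    """Significance test: the set of coefficients addressed by `indices`
    is significant at bit-plane n if any magnitude reaches 2**n."""
    return any(abs(coeffs[idx]) >= (1 << n) for idx in indices)

def bit_plane_sweep(coeffs):
    """Sweep bit-planes from coarsest to finest and report, per plane,
    the flat indices of coefficients that become significant there."""
    flat = np.abs(coeffs).ravel()
    n_max = int(np.floor(np.log2(flat.max())))
    out = []
    for n in range(n_max, -1, -1):
        newly = np.flatnonzero((flat >= (1 << n)) & (flat < (1 << (n + 1))))
        out.append((n, newly.tolist()))
    return out

# Toy coefficient block standing in for a wavelet subband hierarchy.
coeffs = np.array([[63, 10], [-34, 3], [7, -1]])
print(set_significant(coeffs, [(0, 0), (1, 1)], 5))   # True: |63| >= 2**5
print(bit_plane_sweep(coeffs))
```

Because each pass halves the threshold, the coefficients are effectively transmitted in decreasing order of magnitude, which is what gives SPIHT its embedded, progressively refinable bitstream.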