Evaluation of Floating Point Image Compression
Thomas Richter
RUS Computing Center, University of Stuttgart, 70550 Stuttgart, Germany
Email: [email protected]

978-1-4244-5654-3/09/$26.00 ©2009 IEEE, ICIP 2009

Abstract: Recently, compression of High Dynamic Range (HDR) photography gained attention in the standardization of the Microsoft HDPhoto compression scheme as JPEG-XR. While integer data of 16 bits/pixel (bpp) in the scRGB color space can represent images up to a dynamic range of about 3.5 orders of magnitude in luminance, even higher ranges are more efficiently represented by floating-point number formats. In this work, the author presents the approach taken by JPEG-XR for compressing such data, and shows that this method, when applied to JPEG 2000, generates equally good results. Furthermore, it is shown that the method performs nearly optimally under a mathematical quality index that is closely related to SSIM, and to the mean square error of the HDR images when rendered to the LDR regime.

I. INTRODUCTION

Image compression usually deals with sensor input already quantized to an integer representation; intensities are encoded as integers with bit precisions of up to 12 bpp for (traditional) lossy JPEG [1], 16 bpp for JPEG-XR [9] or up to 38 bpp for JPEG 2000 [8]. Applications for bit depths considerably higher than 8 bpp are either HDR photography or scientific and sensing applications. It is then often desirable to represent the sample values as floating point numbers. For example, the EXR format by Industrial Light & Magic (ILM, Lucasfilm) [6], [7], used in post-processing of cinematic material, deploys a 16-bit floating point representation able to cover dynamic ranges of up to 10.7 orders of magnitude in luminance. Also, modern shading units in GPUs represent pixel data in floating point [11]. While JPEG-XR introduced a floating point compression algorithm [9], it was unclear how well it performed, and how to evaluate compression efficiency for floating point data in the first place, as there are currently no agreed metrics for this regime and existing metrics are not readily applicable to floating point data.

II. QUALITY ASSESSMENT FOR FLOATING POINT DATA

Applying metrics like MSE to floating point data is not very reasonable. First, floating point formats use a half-logarithmic representation of the data and thus represent data up to an error that is proportional to the magnitude of the data itself. Second, a small relative error in a large data item may cause a huge overall error contribution, while a large relative error in a small data item is likely ignored by such a metric, though it might be very relevant in any numerical algorithm processing this data. Hence, unlike MSE, a reasonable mathematical quality index for floating point samples should be invariant under multiplication of all samples by a constant. (Footnote 1: Note that this implies that the triangle inequality cannot hold for such an index.)

A. A Mathematical Quality Index

Additional desired properties are of course symmetry and non-negativity; furthermore, the index should be zero if and only if reference and distorted image are identical. The mean relative square error (MRSE) defined in the following satisfies all these conditions:

\[ \mathrm{MRSE} := \frac{1}{N} \sum_i \frac{|x_i - y_i|^2}{|x_i|^2 + |y_i|^2}, \tag{1} \]

where the fraction is, by convention, taken to be zero in case numerator and denominator are both zero. First, note that the MRSE is always between 0 and 2, because the fraction can be written as

\[ 1 - \sin\left( 2 \angle\left( \begin{pmatrix} x_i \\ y_i \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) \right). \]

Second, this expression has an interesting relation to the SSIM index [3], [4], and to its precursor, UQI [2]. If the term comparing the average luminances is neglected (Footnote 2: It has been observed that this simplification does not much impact the overall correlation of SSIM to perceived quality [14].), one gets that

\[ \mathrm{SSIM} \approx \frac{2\sigma_x\sigma_y + C_1}{\sigma_x^2 + \sigma_y^2 + C_1} \cdot \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3} \approx 1 - \frac{\sum_j |x_j - y_j|^2}{\sum_j |x_j|^2 + \sum_j |y_j|^2}, \tag{2} \]

where \(\sigma_x^2\) and \(\sigma_y^2\) denote the variances, \(\sigma_{xy}\) denotes the covariance, and the sum over \(j\) runs over a small image block. In the limiting case of this block shrinking to one pixel, the mean SSIM is thus simply one minus the mean relative square error.

We furthermore have, writing \(y_i = x_i + \Delta x_i\), by Taylor expansion:

\[ \frac{|x_i - y_i|^2}{|x_i|^2 + |y_i|^2} = \frac{|\Delta x_i|^2}{|x_i|^2 + |x_i + \Delta x_i|^2} \approx \frac{1}{2} \frac{|\Delta x_i|^2}{|x_i|^2} \approx \frac{1}{2} |\Delta \log(x_i)|^2. \tag{3} \]

That is, for small errors the relative square error approximates, up to a constant factor, the squared error of the logarithm of the data. As noted by Sheikh and Bovik [3], [4], the logarithm can be understood as coming from Weber's Law [5], which states that perceived lightness is approximately proportional to the logarithm of the luminance. By the above equation, the mean relative square error therefore estimates the mean square error of the perceived lightness.

B. Application Specific Quality Indices

Naturally, the above estimation of the visual error oversimplifies many other aspects of the human visual system (HVS), and more elaborate quality indices have been proposed.

Two approaches to addressing perceived quality in the HDR regime shall be mentioned. Either the image is already in an output-referred color space and is to be reproduced by high-quality equipment; in this case the properties of the HVS in the HDR regime have to be taken into account, making for example HDR-VDP [13] an appropriate choice. Or one considers the HDR image as a "digital negative" taken in a scene-referred color space that still requires a tone-mapping operation, and quality is to be evaluated after rendering the image. Clearly, the output then depends on the tone-mapping curve chosen, and thus on the intent of the photographer. Though it is not perfect, the algorithm proposed in [15] derives a reasonable tone mapping curve from the image automatically.

III. FLOATING POINT IMAGE COMPRESSION ALGORITHMS

Floating point image compression algorithms will now be evaluated by the following means: First, mean relative square errors will be measured in the HDR regime. Next, reference and distorted image will be rendered into an LDR output color space by PPTM [15], and SSIM as a candidate quality index will be used to compare the resulting images. The following three codecs are evaluated: JPEG-HDR, a proprietary extension to traditional JPEG by Gregory Ward [10], JPEG-XR, and a floating-point extension of JPEG 2000. The latter two encode floating point data by first applying a one-to-one map to the integers.

JPEG-XR is an image compression scheme proposed by Microsoft [9] that is currently under standardization by the ISO; details on its codec design are found in [9]. It offers both IEEE 754 [16] compliant 32-bit single-precision pixel formats and a similar 16-bit "half-precision" format compatible with the OpenEXR [6], [7] and OpenGL [11] representations. Floating point data is encoded by first removing the sign bit and casting the 16-bit floating point bit pattern to an unsigned integer. If the input is negative, the two's complement of the negated input is taken. The resulting number is then encoded as if it were a 16-bit signed integer sample. Note that this map is not entirely lossless: it maps the IEEE representations of +0 and -0 both to the integer zero.

JPEG 2000 [8] does not offer any compression scheme for [...] where \(b\) is an exponent bias, and \(n\) is one for exponent \(e \neq 0\) and zero otherwise; hence \(n\) indicates so-called normalized numbers. Zeros are represented by \(m = 0\) and \(e = 0\). The bit pattern representing a (positive) floating point number, interpreted as an integer, however, corresponds to the value

\[ i = e \cdot 2^{\mu} + m. \tag{5} \]

Comparing the two, one sees that casting implements the following mapping between the integer and floating point interpretation:

\[ i \cdot 2^{-\mu} - \log_2 f - b = m 2^{-\mu} - \log_2\left(n + m \cdot 2^{-\mu}\right) - 1 + n, \tag{6} \]

where the right-hand side is seen to be a number between zero and one, and hence \(i \approx 2^{\mu} \log_2 f + 2^{\mu} b\). To be precise, the casting operation is a piecewise linear function that is equal to the binary logarithm, up to this scaling and offset, at values of \(f\) that are exact powers of 2.

Since MRSE is also approximately logarithmic for small errors, an MSE-optimal compression algorithm on integers is (approximately) MRSE-optimal when applied to IEEE 754 floating point numbers cast to integers. It remains to be seen whether MRSE is a reasonable metric.

IV. EXPERIMENTS

In order to evaluate the usefulness of the proposed metrics, the author performed extensive compression tests on the OpenEXR test set provided by Industrial Light and Magic [17]. This set contains images with dynamic ranges covering up to 4.5 orders of magnitude, considerably more than sRGB or even scRGB could cover. In the first set of tests, images were compressed to bit rates between 0.3 and 2.0 bits per pixel using JPEG-XR, JPEG-HDR and JPEG 2000, and SNR and MRSE were measured. Results are shown in the left column of plots 1 to 2.

For the next set of tests, the PPTM tone mapping operation [15] was applied to reference and reconstructed images in parallel, deriving optimal parameters automatically from test and reference image separately. It was configured to apply full chromatic and global light adaptation. Then, the mean square error and the SSIM index were measured in the LDR regime, and results are shown in the right column.
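
To make the measured quantities concrete, the following minimal sketch shows one way the MRSE of equation (1) and the sign-aware cast of 16-bit floating point bit patterns to integers from Section III might be computed. It is an illustration under stated assumptions, not the author's implementation: the function names `mrse` and `half_to_int` are hypothetical, samples are plain Python floats, and the half-precision bit pattern is obtained through the standard `struct` format code `e`.

```python
import struct


def mrse(ref, dist):
    """Mean relative square error between two equally long sample lists.

    Follows equation (1); a term whose numerator and denominator are
    both zero contributes zero, as the text stipulates.
    """
    if len(ref) != len(dist):
        raise ValueError("inputs must have the same number of samples")
    if not ref:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for x, y in zip(ref, dist):
        denom = x * x + y * y
        if denom > 0.0:
            total += (x - y) ** 2 / denom
    return total / len(ref)


def half_to_int(f):
    """Map a value to a signed integer via its 16-bit float bit pattern.

    Hypothetical helper: the sign bit is removed, the remaining 15 bits
    are read as an unsigned integer, and the result is negated for
    negative inputs. Both +0 and -0 therefore map to 0.
    """
    (bits,) = struct.unpack("<H", struct.pack("<e", f))
    magnitude = bits & 0x7FFF  # exponent and mantissa fields only
    return -magnitude if bits & 0x8000 else magnitude


if __name__ == "__main__":
    print(mrse([1.0, 2.0, 4.0], [1.1, 2.0, 3.5]))
    for value in (0.5, 1.0, 1.5, 2.0, -2.0):
        print(value, half_to_int(value))
```

For the half-precision format, with 10 mantissa bits and an exponent bias of 15, `half_to_int(f) / 1024 - 15` equals the binary logarithm of `f` exactly at powers of two and interpolates linearly in between, which is the piecewise linear behaviour described for the casting operation.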