Evaluation of Floating Point

Thomas Richter
RUS Computing Center, University of Stuttgart
70550 Stuttgart, Germany

978-1-4244-5654-3/09/$26.00 ©2009 IEEE, ICIP 2009

Abstract: Recently, compression of High Dynamic Range (HDR) photography gained attention in the standardization of the Microsoft HDPhoto compression scheme as JPEG-XR. While integer data of 16 bits/pixel (bpp) in the scRGB color space can represent images up to a dynamic range of about 3.5 orders of magnitude in luminance, even higher ranges are more efficiently represented by floating-point number formats. In this work, the author presents the approach taken by JPEG-XR for compressing such data, and shows that this method, when applied to JPEG 2000, generates equally good results. Furthermore, it is shown that the method performs nearly optimally under a mathematical quality index that is closely related to SSIM, and to the mean square error of the HDR images when rendered to the LDR regime.

I. INTRODUCTION

Image compression usually deals with sensor input already quantized to an integer representation; intensities are encoded as integers of bit-precisions of up to 12 bpp for (traditional) lossy JPEG [1], 16 bpp for JPEG-XR [9] or up to 38 bpp for JPEG 2000 [8]. Applications for bit-depths considerably higher than 8 bpp are either HDR photography or scientific and sensing applications. It is then often desirable to represent the sample values as floating point numbers. For example, the EXR format by Industrial Light & Magic (ILM, Lucasfilm) [6], [7], used in post-processing of cinematic material, uses a 16-bit floating point representation able to cover dynamic ranges of up to 10.7 orders of magnitude in luminance. Also, modern shading units in GPUs represent pixel data in floating point [11]. While JPEG-XR introduced a floating point compression algorithm [9], it was unclear how well it performed, and how to evaluate compression efficiency for floating point data in the first place, as there are currently no agreed metrics for this regime and existing metrics are not well suited to floating point data.

II. QUALITY ASSESSMENT FOR FLOATING POINT DATA

Applying metrics like MSE to floating point data is not very reasonable. First, floating point formats use a half-logarithmic representation of the data and can thus represent data only up to an error that is proportional to the magnitude of the data itself. Second, a small relative error in a large data item may cause a huge overall error contribution, while a large relative error in a small data item is likely ignored by such a metric, though it might be very relevant to any numerical algorithm processing this data. Hence, unlike MSE, a reasonable mathematical quality index for floating point samples should be invariant under multiplication of all samples by a constant [footnote 1].

A. A Mathematical Quality Index

Additional desired properties are of course symmetry and non-negativity; furthermore, the index should be zero if and only if reference and distorted image are identical. The mean relative square error (MRSE) defined in the following satisfies all these conditions:

$$\mathrm{MRSE} := \frac{1}{N} \sum_i \frac{|x_i - y_i|^2}{|x_i|^2 + |y_i|^2}, \tag{1}$$

where the fraction is, by convention, taken to be zero in case numerator and denominator are both zero. First, note that the MRSE is always between 0 and 2, because the fraction can be written as $1 - \sin\bigl(2\angle\bigl((x_i, y_i)^T, (1, 0)^T\bigr)\bigr)$, i.e. one minus the sine of twice the angle between the vector $(x_i, y_i)$ and the first coordinate axis.

Second, this expression has an interesting relation to the SSIM index [3], [4], and to its precursor, UQI [2]. If the term comparing the average luminances is neglected [footnote 2], one gets that

$$\mathrm{SSIM} \approx \frac{2\sigma_x\sigma_y + C_1}{\sigma_x^2 + \sigma_y^2 + C_1} \cdot \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3} \approx 1 - \frac{\sum_j |x_j - y_j|^2}{\sum_j |x_j|^2 + \sum_j |y_j|^2}, \tag{2}$$

where $\sigma_x$, $\sigma_y$ denote the standard deviations, $\sigma_{xy}$ the covariance, and the sum over $j$ runs over a small image block. In the limiting case of this block shrinking to one pixel, the mean SSIM is thus simply one minus the mean relative square error.

We furthermore have, with $y_i = x_i + \Delta x_i$ and by Taylor expansion:

$$\frac{|x_i - y_i|^2}{|x_i|^2 + |y_i|^2} = \frac{|\Delta x_i|^2}{|x_i|^2 + |x_i + \Delta x_i|^2} \approx \frac{1}{2}\,\frac{|\Delta x_i|^2}{|x_i|^2} \approx \frac{1}{2}\,|\Delta \log(x_i)|^2. \tag{3}$$

That is, for small errors the relative square error approximates, up to a constant factor, the square error of the logarithm of the data. As noted by Sheikh and Bovik [3], [4], the logarithm can be understood as coming from Weber's Law [5], stating that perceived lightness is approximately proportional to the logarithm of the luminance, or, by the above equation, that the mean relative square error estimates the mean square error of the perceived lightness.

B. Application Specific Quality Indices

Naturally, the above estimation of the visual error oversimplifies many other aspects of the human visual system (HVS), and more elaborate quality indices have been proposed.

Two approaches to perceived quality in the HDR regime shall be mentioned. Either the image is already in an output-referred color space and is to be reproduced by high-quality equipment. In this case the properties of the HVS in the HDR regime have to be taken into account, making for example HDR-VDP [13] an appropriate choice. Or one considers the HDR image as having been taken in a scene-referred color space that still requires a tone-mapping operation, and quality is to be evaluated after rendering the image. Clearly, the output then depends on the tone-mapping curve chosen, and thus on the intent of the photographer. Even though not perfect, [15] proposed an algorithm that derives a reasonable tone mapping curve from the image automatically.

III. FLOATING POINT IMAGE COMPRESSION ALGORITHMS

Floating point image compression algorithms will now be evaluated by the following means: First, mean relative square errors will be measured in the HDR regime. Next, reference and distorted image will be rendered into an LDR output color space by PPTM [15], and SSIM as a candidate quality index will be used to compare the resulting images. The following three codecs are evaluated: JPEG-HDR, a proprietary extension to traditional JPEG by Gregory Ward [10], JPEG-XR, and a floating-point extension of JPEG 2000. The latter two encode floating point data by first applying a one-to-one map to the integers.

JPEG-XR is an image compression scheme proposed by Microsoft [9] that is currently under standardization by the ISO; details on its codec design are found in [9]. It offers both IEEE 754 [16] compliant 32-bit single-precision pixel formats and a similar 16-bit "half-precision" format compatible with the OpenEXR [6], [7] and OpenGL [11] representations. Floating point data is encoded by first removing the sign bit and casting the 16-bit floating point bit pattern to an unsigned integer. If the input is negative, the two's complement of the negated input is taken. The resulting number is then encoded as if it were a 16-bit signed integer sample. Note that this map is not entirely lossless: it maps the IEEE representations of +0 and -0 both to the integer zero.

JPEG 2000 [8] does not offer any compression scheme for floating point data; however, such methods have been under discussion for a while and have been scheduled for part 2 of the ISO standard. The following algorithm to encode all types of floating point data is proposed: For positive numbers, cast the number to integers. For negative numbers, first remove the sign bit, apply the casting operation, then take the one's complement. Up to the handling of negative numbers, the steps are identical to JPEG-XR.

Even though this type of mapping looks ad hoc, it has a couple of interesting properties. IEEE 754 and the half-float format represent a floating point number $f$ as a bit pattern consisting of the sign bit $s$, exponent bits $\epsilon \ge 0$ and a mantissa $m$ consisting of $\mu$ bits:

$$f = (-1)^s \,(n + m \cdot 2^{-\mu})\, 2^{\epsilon - b + 1 - n}, \tag{4}$$

where $b$ is an exponent bias, and $n$ is one for $\epsilon \ne 0$ and zero otherwise; hence $n$ indicates so-called normalized numbers. Zeros are represented by $m = 0$ and $\epsilon = 0$. The bit pattern representing a (positive) floating point number, interpreted as an integer, however, corresponds to the value

$$i = \epsilon \cdot 2^{\mu} + m. \tag{5}$$

Comparing the two, one sees that casting implements the following mapping between the integer and floating point interpretation:

$$i \cdot 2^{-\mu} - \log_2 f - b = m\,2^{-\mu} - \log_2(n + m \cdot 2^{-\mu}) - 1 + n, \tag{6}$$

where, for normalized numbers ($n = 1$), the right-hand side is seen to be smaller than one in magnitude, and hence $i \approx 2^{\mu} \log_2 f + 2^{\mu} b$. To be precise, the casting operation is a piecewise linear function that is equal to the binary logarithm (up to this scaling and offset) at values of $f$ that are exact powers of 2.

Since MRSE is also approximately logarithmic for small errors, an MSE-optimal compression algorithm on integers is (approximately) MRSE-optimal when applied to IEEE 754 floating point numbers cast to integers. It remains to be seen whether MRSE is a reasonable metric.

IV. EXPERIMENTS

In order to evaluate the usefulness of the proposed metrics, the author performed extensive compression tests on the OpenEXR test set provided by Industrial Light & Magic [17]. This set contains images with dynamic ranges covering up to 4.5 orders of magnitude, thus considerably more than sRGB or even scRGB could cover. In the first set of tests, images have been compressed to bit rates between 0.3 and 2.0 bits per pixel using JPEG-XR, JPEG-HDR and JPEG 2000, and SNR and MRSE have been measured. Results are shown in the left column of the plots in Figs. 1 and 2.

For the next set of tests, the PPTM tone mapping operation [15] has been applied to reference and reconstructed images in parallel, deriving optimal parameters automatically from test and reference image separately. It was configured to apply full chromatic and global light adaptation. Then, the mean square error and the SSIM index were measured in the LDR regime; results are shown in the right column.

V. DISCUSSION

In terms of the SSIM metric in the LDR regime, the visually tuned JPEG 2000 shows a definite advantage over the other compression schemes, as seen for example on the "Ocean" image (Fig. 1), which is fairly typical. Even though the JPEG 2000 algorithm was never designed to compress floating point data efficiently, the proposed mapping between floating point and the integer domain works remarkably well. This advantage was already observed in the LDR regime in tests run by the JPEG committee earlier [18].

A noteworthy exception is the "StillLife" image from the ILM test set (Fig. 2). Here, and only here, JPEG-HDR outperforms the other candidates for medium to large bit rates. This performance gain is also visible in PSNR in the LDR domain and, to a lesser degree, in terms of MRSE in the HDR regime.

This image is unique insofar as it consists of a very dark, barely visible image background and candle-light spotlights in the foreground. Also note that the distortion/rate plots for this image are almost flat. A visual inspection of the PPTM-mapped images revealed quantization artifacts in the dark image regions in the output of the JPEG 2000 and JPEG-XR algorithms that are not visible in the JPEG-HDR version.

The artifacts are created here by the amplification of an initially small error in the HDR regime by a very steep tone mapping curve chosen by PPTM to keep the background visible [footnote 3]. Since the floating point to integer mapping used by JPEG 2000 and JPEG-XR is content-independent, a tone mapping that is considerably different from the logarithmic mapping will necessarily amplify errors in some amplitude region. JPEG-HDR avoids this problem by using a content-dependent tone mapping to compute its internal LDR representation, as seen in Fig. 2.

The plots also show that MRSE should not be considered a visual metric: it typically disagrees with SSIM in its ranking of algorithms. However, its ranking of algorithms typically does agree with that obtained by measuring PSNR in the LDR regime. This is clearly due to relation (3), or equivalently, this agreement holds because the tone-mapping curves are approximately logarithmic for most images.

VI. CONCLUSIONS

In this article, three compression schemes for HDR images encoded in floating point have been discussed: JPEG-HDR, JPEG-XR and JPEG 2000 with a currently non-standardized extension for floating point. Except for JPEG-HDR, floating-point compression works by a reversible mapping between the IEEE floating point representation and integers that has been shown to be a piecewise-linear approximation of the logarithm.

It has been seen that the codec design of JPEG-XR and JPEG 2000 is approximately optimal in terms of the mean relative square error (MRSE), allowing them to preserve their advantages in the floating-point domain. JPEG-HDR is, for most images, limited by its design constraint of being backwards compatible with JPEG, but its content-dependent approach shows advantages in extreme cases.

Measurements indicate that the ranking of algorithms by the mean relative square error measured in the HDR regime agrees with the ranking obtained from measuring PSNR after rendering the images to the LDR regime. This relation has been seen to hold because MRSE and MSE are equivalent under a logarithmic map, and the tone mapping operation for rendering is also approximately logarithmic. Furthermore, this logarithmic mapping is implicitly also present in the design of the compression schemes for JPEG 2000 and JPEG-XR, making both approximately MRSE-optimal compression algorithms.

VII. ACKNOWLEDGEMENTS

The author wants to thank the Computing Center of the University of Stuttgart for partially funding this work, and Accusoft-Pegasus for providing their JPEG 2000 code base and their long-term support of the author. Special acknowledgements go to Gregory Ward for providing test images and the JPEG-HDR code, and to Industrial Light & Magic for releasing the OpenEXR test image set for the purpose of this work.

REFERENCES

[1] William B. Pennebaker, Joan L. Mitchell: "JPEG Still Image Data Compression Standard", Van Nostrand Reinhold, New York, (1992)
[2] Z. Wang, A.C. Bovik: "A universal image quality index", IEEE Signal Proc. Lett., Vol. 9, No. 3, pp. 81-84, (2002)
[3] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli: "Image quality assessment: From error visibility to structural similarity", IEEE Trans. on Image Proc., Vol. 13, No. 4, pp. 600-612, (2004)
[4] Z. Wang, E.P. Simoncelli, A.C. Bovik: "Multi-scale structural similarity for image quality assessment", IEEE Asilomar Conf. on Signals, Systems and Computers, (2003)
[5] A.N. Netravali, B.G. Haskel: "Digital Pictures - Representation and Compression", Plenum Publishing Corp., USA, (1988)
[6] Industrial Light & Magic: "Technical Introduction to OpenEXR", available at http://www.openexr.com/TechnicalIntroduction., USA, (1999)
[7] R. Bogart, F. Kainz, D. Hess: "The OpenEXR ", Siggraph 2003 Technical Sketch, (2003)
[8] M. Boliek (Ed.): "Information Technology - The JPEG2000 Image Coding System: Part 1", ISO/IEC IS 15444-1, (2000)
[9] S. Srinivasan, C. Tu, Zhi Zhou, D. Ray, S. Regunathan, G. Sullivan: "An Introduction to the HDPhoto Technical Design", JPEG document WG 1 N4183
[10] G. Ward, M. Simmons: "JPEG-HDR: A Backwards Compatible, High Dynamic Range Extension to JPEG", Intl. Conf. on Comp. Graphics and Interactive Techniques, ACM SIGGRAPH 2005, (2005)
[11] M. Segal, K. Akeley: "The OpenGL Graphics System: A Specification", Version 3.0, available at http://www.opengl.org/registry/doc/glspec30.20080811.pdf, USA, (2008)
[12] Scott Daly: "The visible differences predictor: an algorithm for the assessment of image fidelity", SPIE Vol. 1666 Human Vision, Visual Processing, and Digital Display III, (1992)
[13] Rafał Mantiuk, Scott Daly, Karol Myszkowski, Hans-Peter Seidel: "Predicting Visible Differences in High Dynamic Range Images - Model and its Calibration", Proc. of Human Vision and Electronic Imaging X, IS&T/SPIE's 17th Annual Symposium on Electronic Imaging 2005, pp. 204-214
[14] D.M. Rouse, S.S. Hemami: "Understanding and Simplifying the Structural Similarity Index", Proc. of 2008 15th IEEE Intl. Conf. on Image Proc., ICIP 2008, (2008)
[15] E. Reinhard, K. Devlin: "Dynamic Range Reduction Inspired by Photoreceptor Physiology", IEEE Trans. on Visualization and Computer Graphics, (2004)
[16] ISO/IEC: "IEEE Standard for Binary Floating-Point Arithmetic for microprocessor systems", ANSI/IEEE 754, (1985)
[17] Industrial Light & Magic: OpenEXR Samples, available for download at http://www.openexr.com/samples.html
[18] Th. Richter, C. Larabi: "Towards object image quality metrics: the AIC eval program of the JPEG", in: Proc. SPIE: Applications of Digital Image Processing XXXI, A.G. Tescher (Ed.), Vol. 7073, No. 34, (2008)
[19] Th. Richter: "Application Specific Performance Measurements of Image Compression Codecs", in: Proc. SPIE: Applications of Digital Image Processing XXXI, A.G. Tescher (Ed.), Vol. 7073, No. 34, (2008)

Footnote 1: Note that this implies that the triangle inequality cannot hold for such an index.
Footnote 2: It has been observed that this simplification does not much impact the overall correlation of SSIM to perceived quality [14].
Footnote 3: It is easy to see that the amplification factor is approximately equal to the slope of this curve [19].
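The integer casting described in Section III and its logarithmic behaviour from Eqs. (4) to (6) can be illustrated with a minimal sketch. It assumes that NumPy's `float16` matches the 16-bit half-float layout discussed in the text (10 mantissa bits, bias 15); the function names `half_to_int_twos`, `half_to_int_ones` and `int_to_half_ones` are hypothetical and are not taken from any codec implementation.

```python
import numpy as np

MU = 10    # mantissa bits of the 16-bit half-float format (assumed layout)
BIAS = 15  # exponent bias b of the half-float format (assumed layout)

def _split(f):
    """Return (magnitude bits, sign flag) of half-float values."""
    bits = np.asarray(f, dtype=np.float16).view(np.uint16).astype(np.int32)
    return bits & 0x7FFF, (bits & 0x8000) != 0

def half_to_int_twos(f):
    """Sign bit removed, pattern cast to an integer, negated for negative
    input (the two's complement variant described for JPEG-XR)."""
    mag, neg = _split(f)
    return np.where(neg, -mag, mag)

def half_to_int_ones(f):
    """Same cast, but one's complement for negative input (the variant
    proposed for JPEG 2000); +0 and -0 stay distinguishable."""
    mag, neg = _split(f)
    return np.where(neg, ~mag, mag)

def int_to_half_ones(i):
    """Inverse of half_to_int_ones."""
    i = np.asarray(i, dtype=np.int32)
    neg = i < 0
    mag = np.where(neg, ~i, i)
    bits = np.where(neg, mag | 0x8000, mag).astype(np.uint16)
    return bits.view(np.float16)

# Left-hand side of Eq. (6) for positive normalized numbers:
# i * 2^-mu - log2(f) - b, which vanishes at exact powers of two.
f = np.arange(1, 2049, dtype=np.float16)   # integers exactly representable
i = half_to_int_ones(f)
gap = i / 2.0**MU - BIAS - np.log2(f.astype(np.float64))
```

In this sketch `gap` is zero at exact powers of two and stays well below one in magnitude elsewhere, which is the piecewise-linear approximation of the binary logarithm discussed above. The one's complement variant maps -0 to -1 and is therefore invertible, while the two's complement variant sends both zeros to 0.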

[Figure 1 plots (Ocean image). Four panels, each plotted against bit rate in bpp (0 to 2.5): SNR/dB, SNR[after PPTM]/dB, MRSE/dB, MS-MSSIM[after PPTM]/dB. Curves: jpeg-1 with huffman coder, HD Photo, jpeg2000, jpeg2000 (visual).]

Fig. 1. Quality plots for the Ocean image. From left to right, top to bottom: SNR in the HDR domain, SNR after applying the PPTM tone mapping, MRSE, and MS-MSSIM after applying the PPTM tone mapping.
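The MRSE panels report the index of Eq. (1) on a dB scale. A minimal sketch of how such a value could be computed is given here; the function names `mrse` and `mrse_db` are hypothetical, and the dB convention of minus ten times the decimal logarithm is an assumption (it is at least consistent with the bound MRSE ≤ 2, which corresponds to about -3 dB).

```python
import numpy as np

def mrse(x, y):
    """Mean relative square error of Eq. (1).

    Terms where numerator and denominator are both zero count as zero,
    following the convention stated in the text.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    num = (x - y) ** 2
    den = x ** 2 + y ** 2
    # den == 0 implies x == y == 0, so the fraction is defined as zero there.
    frac = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return frac.mean()

def mrse_db(x, y):
    """MRSE on a dB scale; the sign convention -10*log10 is an assumption.
    Identical inputs give +inf."""
    return -10.0 * np.log10(mrse(x, y))

# Example: reference samples and a slightly distorted version.
ref = np.array([1.0, 2.0, 0.0, 4.0])
dist = np.array([1.1, 1.9, 0.0, 3.0])
value = mrse(ref, dist)
```

Scaling both inputs by the same constant leaves `value` unchanged, which is the invariance demanded in Section II, whereas the plain MSE would scale with the square of that constant.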

[Figure 2 plots (StillLife image). Four panels, each plotted against bit rate in bpp (0 to 2.5): SNR/dB, SNR[after PPTM]/dB, MRSE/dB, MS-MSSIM[after PPTM]/dB. Curves: jpeg-1 with huffman coder, HD Photo, jpeg2000, jpeg2000 (visual).]

Fig. 2. Quality plots for the StillLife image. From left to right, top to bottom: SNR in the HDR domain, SNR after applying the PPTM tone mapping, MRSE, and MS-MSSIM after applying the PPTM tone mapping.

TABLE I
COMPRESSION RESULTS FOR OTHER IMAGES. J2K-V IS THE VISUALLY TUNED, J2K-R THE REGULAR JPEG 2000. HDR IS JPEG-HDR, XR IS JPEG-XR. ALL FIGURES IN DB.

           | SNR[HDR]                | MRSE[HDR]               | PSNR[LDR]               | SSIM[LDR]
Image      | J2K-V J2K-R   HDR    XR | J2K-V J2K-R   HDR    XR | J2K-V J2K-R   HDR    XR | J2K-V J2K-R   HDR    XR
Bonita     |  21.8  23.2  13.9  21.0 |  45.2  46.8  39.5  43.8 |  81.2  79.5  78.8  77.1 |  42.3  38.8  38.9  32.7
GoldenGate |  6.69  36.1 -9.16  32.2 |  50.5  53.3  44.8  51.0 |  80.8  85.0  51.4  83.9 |  45.2  42.8  40.0  39.7
Kapaa      |  46.8  46.1  45.9  43.1 |  45.9  49.2  38.1  46.6 |  77.5  78.9  72.4  75.8 |  43.6  40.0  38.1  35.1
KernerEnv  |  10.7  14.2 -40.3  14.4 |  55.2  57.1  44.9  53.5 |  83.1  85.4  49.8  81.2 |  46.7  44.7  39.4  38.8
MtTamWest  |  54.7  54.6  54.9  51.1 |  39.8  42.1  32.1  39.8 |  83.2  84.1  71.2  79.7 |  43.6  40.7  39.5  33.5
Ocean      | -46.8 -46.8 -62.4 -48.5 |  31.5  32.7  22.9  31.0 |  70.4  72.5  60.1  69.8 |  40.4  37.9  34.0  32.0
StillLife  | -15.5 -10.2 -18.6 -11.7 | -0.59 -0.48 -0.42 -0.92 |  69.1  70.1  71.1  63.9 |  36.4  33.4  38.2  28.5
Tree       |  5.37  3.55 -3.07  0.96 |  12.4  12.2  11.3  11.0 |  60.0  55.8  52.0  54.4 |  35.3  29.6  32.1  25.4
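The claim that MRSE in the HDR regime ranks codecs like PSNR in the LDR regime, but unlike SSIM, can be checked roughly against Table I. The sketch below only compares which codec is ranked first per image; it assumes that a larger dB figure means better quality in every column, and the names `TABLE`, `best` and `agreement` are hypothetical.

```python
CODECS = ["J2K-V", "J2K-R", "HDR", "XR"]

# MRSE[HDR], PSNR[LDR] and SSIM[LDR] columns of Table I, in dB,
# in the codec order given by CODECS.
TABLE = {
    "Bonita":     {"MRSE": [45.2, 46.8, 39.5, 43.8], "PSNR": [81.2, 79.5, 78.8, 77.1], "SSIM": [42.3, 38.8, 38.9, 32.7]},
    "GoldenGate": {"MRSE": [50.5, 53.3, 44.8, 51.0], "PSNR": [80.8, 85.0, 51.4, 83.9], "SSIM": [45.2, 42.8, 40.0, 39.7]},
    "Kapaa":      {"MRSE": [45.9, 49.2, 38.1, 46.6], "PSNR": [77.5, 78.9, 72.4, 75.8], "SSIM": [43.6, 40.0, 38.1, 35.1]},
    "KernerEnv":  {"MRSE": [55.2, 57.1, 44.9, 53.5], "PSNR": [83.1, 85.4, 49.8, 81.2], "SSIM": [46.7, 44.7, 39.4, 38.8]},
    "MtTamWest":  {"MRSE": [39.8, 42.1, 32.1, 39.8], "PSNR": [83.2, 84.1, 71.2, 79.7], "SSIM": [43.6, 40.7, 39.5, 33.5]},
    "Ocean":      {"MRSE": [31.5, 32.7, 22.9, 31.0], "PSNR": [70.4, 72.5, 60.1, 69.8], "SSIM": [40.4, 37.9, 34.0, 32.0]},
    "StillLife":  {"MRSE": [-0.59, -0.48, -0.42, -0.92], "PSNR": [69.1, 70.1, 71.1, 63.9], "SSIM": [36.4, 33.4, 38.2, 28.5]},
    "Tree":       {"MRSE": [12.4, 12.2, 11.3, 11.0], "PSNR": [60.0, 55.8, 52.0, 54.4], "SSIM": [35.3, 29.6, 32.1, 25.4]},
}

def best(values):
    """Codec with the largest dB figure (assumed to mean best quality)."""
    return CODECS[max(range(len(CODECS)), key=values.__getitem__)]

def agreement(metric_a, metric_b):
    """Number of images on which both metrics pick the same best codec."""
    return sum(best(row[metric_a]) == best(row[metric_b]) for row in TABLE.values())

mrse_vs_psnr = agreement("MRSE", "PSNR")
mrse_vs_ssim = agreement("MRSE", "SSIM")
```

With the table values as transcribed here, MRSE and PSNR pick the same best codec on 7 of the 8 images, while MRSE and SSIM agree on only 2. A full rank correlation would be a finer measure; the top-rank count is used only to keep the sketch short.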
