High Dynamic Range Imaging by Perceptual Logarithmic Exposure
Corneliu Florea [email protected]
Constantin Vertan [email protected]
Laura Florea [email protected]
Image Processing and Analysis Laboratory, University "Politehnica" of Bucharest, Romania
Address: Splaiul Independenței 313

arXiv:1411.0326v2 [cs.CV] 3 Jun 2015

Abstract

In this paper we emphasize a similarity between the Logarithmic-Type Image Processing (LTIP) model and the Naka-Rushton model of the Human Visual System (HVS). LTIP is a derivation of the Logarithmic Image Processing (LIP) model that replaces the logarithmic function with a ratio of polynomial functions. Based on this similarity, we show that it is possible to present a unifying framework for the High Dynamic Range (HDR) imaging problem, namely that performing exposure merging under the LTIP model is equivalent to standard irradiance map fusion. The resulting HDR algorithm is shown to provide high quality in both subjective and objective evaluations.

1. Introduction

Motivated by the limitations of digital cameras in capturing real scenes with a large lightness dynamic range, a category of image acquisition and processing techniques, collectively named High Dynamic Range (HDR) Imaging, has gained popularity. To acquire HDR scenes, consecutive frames with different exposures are typically acquired and combined into an HDR image that is viewable on regular displays and printers.

In parallel, Logarithmic Image Processing (LIP) models were introduced as an alternative to image processing with real-valued operations. While initially modelled from the cascade of two transmitting filters (Jourlin and Pinoli, 1987), it was later shown that LIP models can be generated by homomorphic theory and that they have a cone space structure (Deng et al., 1995). The initial model was shown to be compatible with the Weber-Fechner perception law (Pinoli and Debayle, 2007), which is not unanimously accepted (Stevens, 1961). Currently, most global Human Visual System (HVS) models are extracted from the Naka-Rushton equation of photoreceptor absorption of incident energy and are followed by further modelling of the local adaptation. We will show in this paper that the LIP extension model introduced in (Vertan et al., 2008) is consistent with the global human perception described by the Naka-Rushton model. This model no longer uses a logarithmic generative function but only a logarithmic-like one, hence it is named the Logarithmic-Type Image Processing (LTIP) model. In such a case, the generative function of the LTIP model transfers the radiometric energy domain into a human-eye-compatible image domain; thus it mimics, by itself and by its inverse, both the camera response function and the human eye's lightness perception.

The current paper claims three contributions. Firstly, we show that the previously introduced LTIP model is compatible with the Naka-Rushton/Michaelis-Menten model of the eye's global perception. Secondly, based on this finding, we show that two contrasting HDR approaches can be treated in a unified manner if the LTIP model framework is assumed. Thirdly, the reinterpretation of the exposure merging algorithm (Mertens et al., 2007) under the LTIP model produces a new algorithm that leads to qualitative results.

The paper is constructed as follows: in Section 2 we present a short overview of the existing HDR trends and emphasize their correspondence with human perception. In Section 3, state-of-the-art results in the LIP framework and the usage of the newly introduced LTIP framework for the generation of models compatible with human perception are discussed. In Section 4 we derive and motivate the proposed HDR imaging method, while in Section 5 we discuss implementation details and achieved results, ending the paper with discussion and conclusions.

2. Related work

The typical acquisition of a High Dynamic Range image relies on the "Wyckoff Principle": differently exposed images of the same scene capture different information due to the differences in exposure (Mann and Picard, 1995). Bracketing techniques are used in practice to acquire pictures of the same subject with consecutive exposure values. These pictures are then fused to create the HDR image.

For the fusion step, two directions are envisaged. The first, named irradiance fusion, acknowledges that the camera-recorded frames are non-linearly related to the scene reflectance; thus it relies on retrieving the irradiance maps from the acquired frames by inverting the camera response function (CRF), followed by fusion in the irradiance domain. The fused irradiance map is then compressed via a tone mapping operator (TMO) into a displayable low dynamic range (LDR) image. The second direction, called exposure fusion, aims at simplicity and directly combines the acquired frames into the final image. A short comparison between the two is presented in Table 1 and detailed in the following paragraphs.

2.1. Irradiance fusion

Originating in the work of Debevec and Malik (1997), the schematic of irradiance fusion may be followed in Fig. 1 (a). Many approaches were devised for determining the CRF (Grossberg and Nayar, 2004). We note that the dominant shape is that of a gamma function (Mann and Mann, 2001), a trait required by compatibility with the HVS. After inverting the CRF, the irradiance maps are combined, typically by a weighted convex combination (Debevec and Malik, 1997; Robertson et al., 1999). For proper display, a tone mapping operator (TMO) is then applied to the HDR irradiance map to ensure that all image details are preserved in the compression process. For this last step, following Ward's proposal (Ward et al., 1997), typical approaches adopt an HVS-inspired function for domain compression, followed by local contrast enhancement. For a survey of TMOs we refer to the paper of Ferradans et al. (2012) and to the book by Banterle et al. (2011).

Among other TMO attempts, a notable one was proposed by Reinhard et al. (2002): inspired by Ansel Adams' Zone System, it first applies a logarithmic scaling to mimic the exposure setting of the camera, followed by dodging-and-burning (selectively and artificially increasing and decreasing image values for better contrast) for the actual compression. Durand et al. (2002) separated, by means of a bilateral filter, the HDR irradiance map into a base layer that encodes large-scale variations (thus needing range compression) and a detail-preserving layer, forming an approximation of the image pyramid. Fattal et al. (2002) attenuated the magnitude of large gradients based on a Poisson equation. Drago et al. (2003) implemented a logarithmic compression of luminance values that matches the HVS. Krawczyk et al. (2005) implemented the Gestalt-based anchoring theory of Gilchrist (Gilchrist et al., 1999) to divide the image into frameworks and performed range compression by ensuring that the frameworks are well preserved. Banterle et al. (2012) segmented the image into luminance components and independently applied the TMOs introduced by Drago et al. (2003) and by Reinhard et al. (2005) for further adaptive fusion based on the previously found areas. Ferradans et al. (2012) proposed an elaborate model of the global HVS response and pursued local adaptation with an iterative variational algorithm.

Yet, as irradiance maps are altered with respect to reality by the camera's optical system, additional constraints are required for a perfect match with the HVS. Hence, this category of methods, while theoretically closer to the pure perceptual approach, requires supplementary and costly constraints as well as significant computational resources for the CRF estimation and the TMO implementation.

2.2. Exposure merging

Noting the high computational cost of irradiance map fusion, Mertens et al. (2007) proposed to implement the fusion directly in the image domain; this approach is described in Fig. 1 (b). The method was further improved for robustness to ghosting artifacts and detail preservation in HDR composition by Pece and Kautz (2010).

Other developments addressed the computation of local contrast so as to preserve edges and local high dynamic range. Another expansion was introduced by Zhang et al. (2012), who used the direction of the gradient in a partial-derivatives framework and two local quality measures to achieve local optimality in the fusion process. Bruce (2014) replaced the contrast computed by Mertens et al. (2007) on a Laplacian pyramid with the entropy calculated in a flat circular neighborhood for deducing weights that maximize the local contrast.

Table 1. Comparison of the two main approaches to the HDR problem.

Method                                      | CRF recovery | Fused components | Fusion method               | Perceptual
Irradiance fusion (Debevec and Malik, 1997) | Yes          | Irradiance maps  | Weighted convex combination | Yes (via TMO)
Exposure fusion (Mertens et al., 2007)      | No           | Acquired frames  | Weighted convex combination | No

Figure 1. HDR imaging techniques: (a) irradiance map fusion as described in Debevec and Malik (1997) and (b) exposure fusion as described in Mertens et al. (2007).
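To make the irradiance-fusion row of Table 1 concrete, the sketch below recovers irradiance from differently exposed frames and fuses the estimates by a weighted convex combination, in the spirit of the Debevec and Malik (1997) line of work. The global gamma-shaped CRF, the value gamma = 2.2, and the hat-shaped confidence weight are illustrative assumptions for this sketch, not the calibration used by any of the cited methods.

```python
import numpy as np

def irradiance_fusion(frames, exposure_times, gamma=2.2):
    """Fuse differently exposed LDR frames into one irradiance map.

    frames: list of float arrays with values in [0, 1].
    exposure_times: matching exposure times (seconds).
    Assumes a global gamma-shaped CRF, pixel = (E * t) ** (1 / gamma),
    so the CRF inversion is E = pixel ** gamma / t.
    """
    num = np.zeros_like(frames[0])
    den = np.zeros_like(frames[0])
    for img, t in zip(frames, exposure_times):
        # Hat-shaped weight: trust mid-range pixels, ignore clipped ones.
        w = 1.0 - (2.0 * img - 1.0) ** 2
        num += w * (img ** gamma / t)   # invert the assumed CRF
        den += w
    return num / np.maximum(den, 1e-8)
```

In the full pipeline of Fig. 1 (a), the fused map would then be passed to a TMO for display; the sketch stops at the irradiance estimate.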
Irradiance map fusion relies on inverting the Camera Response Function (CRF) in order to return to the irradiance domain, while exposure fusion works directly in the image domain, thus avoiding the CRF reversal.

The exposure fusion method is the source of inspiration for many commercial applications. Yet, in such cases, exposure fusion is followed by further processing that increases the visual impact of the final image.

…applications for the classical LIP model is presented in (Pinoli and Debayle, 2007). In parallel, other logarithmic models and logarithmic-like models were reported.
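Returning to the exposure-fusion branch of Table 1: a minimal single-scale sketch in the spirit of Mertens et al. (2007), where per-pixel quality weights are normalized into a convex combination of the frames themselves. The published method also uses a Laplacian contrast measure and blends over multi-scale pyramids; both are omitted here, and the weight formula with sigma = 0.2 is an illustrative assumption.

```python
import numpy as np

def quality_weights(img, sigma=0.2):
    """Mertens-style per-pixel weight from saturation and well-exposedness.

    img: H x W x 3 float array in [0, 1]. The contrast measure of the
    original method (a Laplacian filter response) is omitted in this sketch.
    """
    saturation = img.std(axis=2)                    # spread across channels
    well_exposed = np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2)).prod(axis=2)
    return saturation * well_exposed + 1e-12        # keep weights non-zero

def exposure_fusion(frames):
    """Blend differently exposed frames by normalized quality weights."""
    w = np.stack([quality_weights(f) for f in frames])  # N x H x W
    w /= w.sum(axis=0, keepdims=True)                   # convex combination
    return np.sum(w[..., None] * np.stack(frames), axis=0)
```

Because the weights sum to one at every pixel, the fused value never leaves the range spanned by the input frames, which is why this branch needs neither CRF recovery nor tone mapping before display.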