
Pattern Recognition 37 (2004) 1201–1217 www.elsevier.com/locate/patcog

Color balancing of digital photos using simple image statistics

Francesca Gasparini, Raimondo Schettini* — DISCO (Dipartimento di Informatica, Sistemistica e Comunicazione), Università degli Studi di Milano-Bicocca, Edificio 7, Via Bicocca degli Arcimboldi 8, Milano 20126, Italy

Received 16 May 2003; accepted 19 December 2003

Abstract

The great diffusion of digital cameras and the widespread use of the internet have produced a mass of digital images depicting a huge variety of subjects, generally acquired by unknown imaging systems under unknown conditions. This makes color balancing, the recovery of the color characteristics of the original scene, increasingly difficult. In this paper, we describe a method for detecting and removing a color cast (i.e. a superimposed color due to lighting conditions, or to the characteristics of the capturing device) from a digital photo without any a priori knowledge of its semantic content. First a cast detector, using simple image statistics, classifies the input images as presenting no cast, evident cast, ambiguous cast, a predominant color that must be preserved (such as in underwater images or single color close-ups), or as unclassifiable. A cast remover, a modified version of the white balance algorithm, is then applied in cases of evident or ambiguous cast. The method we propose has been tested with positive results on a dataset of some 750 photos. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Keywords: Cast detection; Cast removal; Von Kries; White balance; Gray world algorithm

1. Introduction

The great diffusion of digital cameras and the widespread use of the internet have produced a mass of digital images depicting a huge variety of subjects, generally acquired by non-professional photographers using unknown imaging systems under unknown lighting conditions. The quality of these real-world photographs can often be considerably improved by digital image processing. At present, color and contrast corrections are usually manually performed within the framework of specific software packages. Since these interactive processes may prove difficult and tedious, especially for amateur users, an automatic image enhancement tool would be most desirable.

Typical image properties requiring correction are color, contrast and sharpness. We have approached the open issues of designing reliable automatic tools that can improve the overall quality of digital photos pragmatically, by designing a modular enhancing procedure (Fig. 1). Each module can be considered as an autonomous element that can be combined in more or less complex algorithms.

This paper focuses on unsupervised color correction. The problem is that of automatically and reliably removing a color cast (a superimposed color due to lighting conditions, or to the characteristics of the capturing device). The solution we have designed exploits only simple image statistics to drive the color correction procedure, preventing artifacts. Section 2 illustrates the problem addressed, and the related methods available in the literature. In Section 3, we describe the automatic procedure for color balancing, which we have structured in two main parts: a cast detector and a cast remover. The detector, described in Section 3.1, classifies the input images with respect to their cast as: (i) no-cast images, (ii) evident cast images, (iii) ambiguous cast images (images with a weak cast, or for which whether or not a cast exists is a subjective opinion), (iv) images with a predominant color which must be preserved (such as in underwater images, or single color close-ups), (v) unclassifiable.

* Corresponding author. Tel.: +39-02-6448-7840; fax: +39-02-6448-7839. E-mail addresses: [email protected] (F. Gasparini), [email protected] (R. Schettini).


[Fig. 1 block diagram: Input image → Color Correction → Contrast Enhancement → Edge Sharpening → Output image]

Fig. 1. Our enhancement chain, constituted by independent enhancement modules.

The remover, a modified version of the white balance algorithm [1,2], piloted by the class of the cast, is applied to images labeled as having evident or ambiguous cast. This step is described in Section 3.2. Our experimental results are reported and commented in Section 4, while our conclusions are presented in Section 5.

2. Related work

Automatic color correction is a challenging issue, as the images recorded by an imaging device depend on the following elements:

1. The response of each color channel as a function of intensity, usually modeled as a gamma correction. For generic images the channel responses of the imaging device are usually unknown. Cardei et al. [2,3] have shown that gamma-on images (i.e. images for which gamma does not equal unity) can be color corrected without first linearizing them. Consequently, it is common practice to address color and gamma corrections independently.

2. The device white balancing, usually set by the device manufacturer so that it produces equal RGB values for a white patch under some chosen illuminant. Images of unknown color balance can still be corrected, but this poses an additional challenge to the color correction algorithm.

3. The relative spectral sensitivities of the capturing device as a function of wavelength. This information is often not available (e.g. for images downloaded from the Internet), and sensitivities can differ significantly for different cameras. As Cardei et al. [2,3] have pointed out, while two different camera models balanced for the same illuminant will have by definition the same response to white, they may have different responses to other colors.

4. The surface properties of the objects present in the scene depicted and the lighting conditions (lighting geometry and illuminant color). Unlike human vision, imaging devices (such as digital cameras) cannot adapt their spectral responses to cope with different lighting conditions. As a result, the acquired image may have a cast, i.e. an undesirable shift in the entire color range. The ability of the human visual system to discount the illuminant, rendering the perceived colors of objects almost independent of illumination, is called color constancy. This would be a useful property in any vision system performing tasks that require a stable perception of an object's color, as in object recognition, image retrieval, or image reproduction, e.g. Ref. [4].

Color constancy is an under-determined problem and thus impossible to solve in the most general case [1]. Several strategies are proposed in the literature. These, in general, require some information about the camera being used, and are based on assumptions about the statistical properties of the expected illuminants and surface reflectances. From a computational perspective, color constancy is a two-stage process: the illuminant is estimated, and the image RGB values are then corrected on the basis of this estimate. The correction generates a new image of the scene as if taken under a known illuminant. The color correction step is usually based on a diagonal model of illumination change, deriving from the Von Kries hypothesis that color constancy is an independent gain regulation of the three cone signals, through three different gain coefficients [5]. This diagonal model is generally a good approximation of change in illumination, as demonstrated by Finlayson et al. in Ref. [6]. Should the model lead to large errors, its performance can still be improved with sensor sharpening [7,8]. In formula, the Von Kries hypothesis can be written as

$$\begin{pmatrix} L' \\ M' \\ S' \end{pmatrix} = \begin{pmatrix} k_L & 0 & 0 \\ 0 & k_M & 0 \\ 0 & 0 & k_S \end{pmatrix} \begin{pmatrix} L \\ M \\ S \end{pmatrix}, \qquad (1)$$

where $L, M, S$ and $L', M', S'$ are the initial and post-adaptation cone signals and $k_L$, $k_M$, $k_S$ are the scaling coefficients [5]. The scaling coefficients can be expressed as the ratio between the cone responses to a white under the reference illuminant and those under the current one. A typical reference illuminant, which is also the one we have used here, is the D65 CIE standard illuminant [9]. In practical situations the L, M, S retinal wavebands are transformed into CIE XYZ tristimulus values by a linear transformation, or approximated by image RGB values [12].

Whatever the features used to describe the colors, we must have some criteria for estimating the illuminant and thus the scaling coefficients in Eq. (1). The gray world algorithm assumes that, given an image of sufficiently varied colors, the average surface color in a scene is gray [10]. This means that the shift from gray of the measured averages on the three channels corresponds to the color of the illuminant.
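As a concrete illustration of the diagonal model, the following Python/NumPy sketch applies Eq. (1) to RGB data with the gains estimated under the gray world assumption of Ref. [10]; the function name and the use of the global channel means as the gray estimate are our illustrative assumptions, not code from the paper.

```python
import numpy as np

def gray_world_correction(img):
    """Von Kries-style diagonal correction (Eq. (1)) on RGB data, with
    gains chosen under the gray world assumption: the per-channel
    averages are mapped onto their common mean (a gray).

    img: float array of shape (H, W, 3), values in [0, 1].
    """
    means = img.reshape(-1, 3).mean(axis=0)      # average R, G, B over the image
    k = means.mean() / np.maximum(means, 1e-6)   # diagonal gains k_R, k_G, k_B
    return np.clip(img * k, 0.0, 1.0)            # apply the diagonal transform
```

The same diagonal structure underlies all the estimation strategies discussed in this section; only the way the three gains are chosen changes.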

The three scaling coefficients in Eq. (1) are therefore set to compensate this shift.

The white balance algorithm, instead, looks for a white patch in the image, the chromaticity of which will then be the chromaticity of the illuminant. The white patch is evaluated as the maximum found in each of the three image bands separately [1,2]. The scaling coefficients are now obtained comparing these maxima with the values of the three channels of the chosen reference white.

Retinex algorithms [11,12] try to simulate the adaptation mechanisms of human vision, performing both color constancy and dynamic range enhancement. Although related to the white patch algorithms, they are not a simple diagonal model, but also take into account the spatial relationships in the scene.

Still based on the diagonal model, the gamut mapping approach [13] determines the set of all possible RGB values of world surfaces under a known canonical illuminant. This set is represented as a convex hull in the RGB space. The set of all the possible RGB values consistent with the image data under an unknown illuminant forms another convex hull. The object is to determine the diagonal mapping of these two hulls. However, there is no unique solution to this problem, and, in particular, not all the mappings found correspond to real world illuminants. Finlayson reformulated the problem in a two-dimensional chromatic space (r = R/B, g = G/B) where the diagonal model is still valid, then argued that diagonal maps can be further constrained by considering only those corresponding to expected illuminants [14–17]. Once the set of possible maps had been computed, several different models for finding a unique solution were proposed in Refs. [18,19].

Color by correlation has been introduced by Finlayson et al. as a successive improvement of the gamut mapping [20–22]. The basic idea is to pre-compute a correlation matrix which describes the extent to which the proposed illuminants are compatible with the occurrence of image chromaticities. In a further refinement, the correlation matrix has been set up to compute the probability that the observed chromaticities are due to each of the training illuminants. The best illuminant can then be chosen, using a maximum likelihood estimate for example, or other methods described in Ref. [18].

Funt et al. [23,24] have achieved a good performance in color constancy using a neural network. The network estimates the illuminant chromaticity based on the gamut present in the image: the input layer is binary, indicating the presence, or absence, in the image of a sampled chromaticity; the output is the estimated chromaticity of the illuminant.

As we have said, color correcting images of unknown origin adds to the complexities of the already difficult problem of color constancy, because the pre-processing the image was subjected to, the camera sensors, and the camera balance are all unknown [1]. Digital still cameras usually employ automatic white balance techniques, based on some illumination estimation assumption, to adjust sensor amplifier gains in order to produce an image where the white objects appear white. Moreover, an ad hoc adjustment is commonly made to force an image's maximum luminance value to pure white. Some cameras also adjust the minimum luminance to set the black point at a certain energy level. The movement of the white and black point regions of the image towards pre-defined goals typically produces pleasing peak highlights and shadows. However, the resulting images may show significant color casts in over half the image. Since the image is altered in a very non-linear way during image acquisition, the camera data available for post-processing retains only limited information about the illuminant [25]. For digital still cameras with an unknown white balance correction in which the brightest pixels may have been compromised, Cooper et al. developed an algorithm that analyzes the chromaticity of any large contiguous nearly gray objects found using adaptive segmentation techniques, to identify the presence of a cast, estimate the chromatic strength of the objects, and alter the image's colors to compensate for the cast [25,26]. Unsupervised segmentation is, in turn, another ill-posed problem.

3. Our color correction strategy

The computational strategy described here should not be interpreted as a "new color constancy algorithm"; it is instead a tool that, given as input any digital photo, produces as output a more pleasing image, that is, an image the user will perceive as more natural than the original one. The success of the method is evaluated by direct comparison of the original and the processed images.

The procedure is structured in two main parts: a cast detector and a cast remover. The basic idea of our cast detector (Section 3.1) is that by analyzing the color distribution of the image in a suitable color space with simple statistical tools, it is possible not only to evaluate whether or not a cast is present, but also to classify it. The classes considered are: (i) no cast images; (ii) evident cast images; (iii) ambiguous cast images; (iv) images with a predominant color that must be preserved; and (v) unclassifiable. The estimate of the sensor scaling coefficients is assimilated within the problem of quantifying the cast. The cast remover (Section 3.2) is applied only to those images classified as having an evident or ambiguous color cast.

3.1. Cast detection and classification

As seen in Section 2, RGB images are often referred to unknown imaging devices. We assume here that the images are coded in terms of sRGB color coordinates. The sRGB is representative of the majority of devices on which color is and will be viewed [27,28]; it refers to a monitor of average characteristics, so that the images can be directly displayed by remote systems without further manipulation.

The sRGB values are mapped into the CIELAB color space [30] (see Appendix A), where the chromatic components and lightness are separated [20,25]. The CIELAB is a perceptually uniform color space in which it is possible to effectively quantify color differences as seen by the human eye; it is widely used in color imaging and is included as a standard in the International Color Consortium (ICC) color profiles [28]. The algorithm is structured as follows:

(i) We analyze only those pixels with a lightness in an interval that excludes the brightest and the darkest points. This is because the images we consider may already have been processed during acquisition, and we assume that the imaging device is unknown. Digital cameras often force the brightest image point to white and the darkest to black, altering the chroma of very light and very dark regions. Our experience on a data set of several hundred images has suggested that we consider the interval of lightness 30 < L* < 95 in identifying a color cast. If the size of the considered portion of the image is less than 20% of the whole, the image statistics are not fully reliable. These images are considered unclassifiable and are not processed at all. This is the case of very dark or very light images, such as those shown in Fig. 2. Otherwise, the algorithm proceeds with the next step.

(ii) The two-dimensional histogram, F(a, b), of the image colors in the ab-plane is computed. For a multicolor image without cast it will present several peaks, distributed over the whole ab-plane, while for a single cast there will be a single peak, or a few peaks in a limited region (see Fig. 3). The more concentrated the histogram and the farther from the neutral axis, the more intense the cast. The color distribution is modeled using the following statistical measures, with k = a, b:

$$\mu_k = \int_k k \, F(a,b) \, dk, \qquad (2)$$

$$\sigma_k^2 = \int_k (k - \mu_k)^2 \, F(a,b) \, dk, \qquad (3)$$

respectively the mean values and the variances of the histogram projections along the two chromatic axes a and b.

(iii) An equivalent circle (EC) with center $C = (\mu_a, \mu_b)$ and radius $\sigma = \sqrt{\sigma_a^2 + \sigma_b^2}$ is associated to each histogram. To characterize the EC quantitatively we introduce a distance

$$D = \mu - \sigma, \qquad (4)$$

where $\mu = \sqrt{\mu_a^2 + \mu_b^2}$, and the ratio

$$D_\sigma = D/\sigma. \qquad (5)$$

Because D is a measure of how far the whole histogram (identified by its EC) lies from the neutral axis (a = 0, b = 0), while σ is a measure of how the histogram is spread, $D_\sigma$ makes it possible to quantify the strength of the cast.

The algorithm analyzes the color histogram distribution in the ab chromatic plane, examining its EC and computing the statistical values D and $D_\sigma$.

1. If the histogram is concentrated and far from the neutral axis, the colors of the image are confined to a small region in the ab chromatic diagram (Fig. 4). The images are then likely to have either an evident cast (to be removed), or a predominant color (to be preserved), if:

$$(D > 10 \;\text{and}\; D_\sigma > 0.6) \;\text{or}\; (D_\sigma > 1.5). \qquad (6)$$

A predominant color could correspond to an intrinsic cast (widespread areas of vegetation, skin, sky, or sea), or to a single color close-up (Fig. 5). To detect images with a predominant color corresponding to an intrinsic cast or a single color close-up, a simple classifier exploiting both color and spatial information is used [31]. A region identified as probably corresponding to skin, sky, sea, or vegetation is considered significant if it covers over 40% of the whole image; the image is then classified as having an intrinsic cast, and the cast remover is not applied. If none of the regions corresponding to skin, sky, sea, or vegetation occupies over 40% of the whole, but the image EC is extremely concentrated, $D_\sigma > 6$, and has a high average color saturation (C*/L* > 1; the ratio between the chroma radius and the lightness is correlated to the color saturation), the image is classified as a single color close-up and, also in this case, the cast remover is not applied. Images presenting a concentrated histogram which are not classified as having a predominant color, i.e. intrinsic cast images or close-ups, are held to have an evident cast and are processed for color correction as described in Section 3.2.

2. All images without a clearly concentrated color histogram are analyzed with a procedure based on the criterion that a cast has a greater influence on a neutral region than on objects with colors of high chroma. The color distribution of near neutral objects (NNO) is studied with the same statistical tools described above. A pixel of the image belongs to the NNO region if its chroma is less than an initial fixed value (set here at one fourth the maximum chroma radius of that image), and if it is not isolated, but has some neighbors that present similar chromatic characteristics. Isolated nearly gray pixels are probably due to noise.
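To make steps (ii)–(iii) concrete, the following Python/NumPy sketch computes the equivalent-circle statistics of Eqs. (2)–(5) directly from the pixel values (the means and variances of the histogram projections coincide with the per-pixel means and variances of a and b) and applies the test of Eq. (6); the 20% reliability check of step (i) is included, while the function and variable names are our own.

```python
import numpy as np

def cast_statistics(L, a, b):
    """Equivalent circle (EC) statistics of Eqs. (2)-(5) in the ab plane.

    L, a, b: CIELAB channels as equally shaped float arrays.
    Returns None when the image is unclassifiable (step (i)).
    """
    mask = (L > 30) & (L < 95)          # exclude brightest and darkest pixels
    if mask.mean() < 0.20:              # less than 20% of the image: unreliable
        return None
    mu_a, mu_b = a[mask].mean(), b[mask].mean()       # Eq. (2)
    sigma = np.sqrt(a[mask].var() + b[mask].var())    # Eq. (3): EC radius
    D = np.hypot(mu_a, mu_b) - sigma                  # Eq. (4)
    return D, D / sigma                               # D and D_sigma of Eq. (5)

def evident_cast_or_predominant(D, D_sigma):
    """Concentration test of Eq. (6)."""
    return (D > 10 and D_sigma > 0.6) or D_sigma > 1.5
```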

Fig. 2. Examples of unclassifiable images (typically very dark or very light images in which the color range evaluated is not sufficient to determine whether or not a cast exists).

Fig. 3. Left, a multicolor image with no cast, which presents several peaks in the 2D histogram in the ab-plane; right, a cast image showing a concentrated 2D histogram.

Fig. 4. Left, equivalent circles corresponding to a cast image; right, equivalent circles for a multicolor image with no cast. The parameters D and σ permit the characterization of the histogram distribution.

If the percentage of pixels that satisfies these requisites is less than a predefined percentage of the whole image, which experience has suggested setting at 5%, the radius of the neutral region is gradually increased up to the maximum chroma radius, and the region recursively evaluated.

If we define, as above, the mean values and variances of the NNO histogram, with k = a, b:

$$\mu_{NNO_k} = \int_k k \, F_{NNO}(a,b) \, dk, \qquad (7)$$

$$\sigma_{NNO_k}^2 = \int_k (k - \mu_{NNO_k})^2 \, F_{NNO}(a,b) \, dk, \qquad (8)$$

we can associate with each NNO histogram an NNO equivalent circle (EC_NNO) with center $C_{NNO} = (\mu_{NNO_a}, \mu_{NNO_b})$ and radius $\sigma_{NNO} = \sqrt{\sigma_{NNO_a}^2 + \sigma_{NNO_b}^2}$. The statistical analysis that we have performed on the whole histogram is now applied to the NNO histogram, allowing us to distinguish among three cases:

• evident cast images;
• no cast images;
• ambiguous cases (images with a weak cast, or for which the presence of a cast is a subjective opinion).

When there is a cast, we may expect the NNO color histogram to show a single small spot, or, at most, a few spots in a limited region. NNO colors spread around the neutral axis, or distributed in several distinct spots, indicate instead that there is no cast.
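A minimal sketch of the NNO selection described above, under stated assumptions: the "not isolated" requirement is approximated with a 3×3 neighborhood majority test, and the growth schedule for the chroma radius is our own choice, as the paper does not specify one.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def nno_mask(a, b, min_fraction=0.05):
    """Near neutral object (NNO) selection: start from a chroma threshold
    of 1/4 of the maximum chroma radius, keep only pixels whose 3x3
    neighborhood is mostly low-chroma (our proxy for 'not isolated'),
    and enlarge the radius until at least min_fraction of the image
    qualifies or the whole chroma range is covered."""
    chroma = np.hypot(a, b)
    radius = chroma.max() / 4.0
    while True:
        candidate = chroma < radius
        # discard isolated nearly gray pixels, which are probably noise
        mask = candidate & (uniform_filter(candidate.astype(float), size=3) > 0.5)
        if mask.mean() >= min_fraction or radius >= chroma.max():
            return mask
        radius = min(radius * 1.5, chroma.max())  # growth factor is our choice
```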

Fig. 5. Images with a predominant color and their equivalent circles. Left, single color close-up; right, intrinsic cast.

Fig. 6. If analysis of the whole image does not indicate whether or not there is a cast, the NNO region must be analyzed. The three possible cases are represented in the three rows of this figure, each showing the original image and the NNO region, together with the corresponding ECs and EC_NNOs. The top row shows the case of a no-cast image: D_NNO = −0.87, and the NNO histogram is clearly spread uniformly around the neutral axis. The middle row corresponds to an image with evident cast: D_NNO = 1.02, and the histogram shows a clear shift. The bottom row presents the case of an ambiguous cast: it is not clear whether there is a cast, and any evaluation could be a subjective opinion: D_NNO = 0.01.

NNO histograms that are well defined and concentrated far from the neutral axis will have a positive D_NNO: the image has an evident cast (Fig. 6). With negative values of D_NNO the histogram presents an almost uniformly spread distribution around the neutral axis, indicating no cast. We set as thresholds D_NNO = 0.5 for cast images and D_NNO = −0.5 for no cast images. Cases falling in the interval between −0.5 and 0.5 are considered ambiguous.
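The resulting three-way decision can be sketched as follows; we assume here that the threshold is applied to the ratio of Eq. (5) computed on the NNO histogram (the dimensionless values quoted in Fig. 6 suggest this reading).

```python
import numpy as np

def classify_by_nno(a, b, mask):
    """Three-way decision on the NNO histogram (Eqs. (7)-(8)),
    thresholded at +/-0.5 as in the text.
    a, b: CIELAB chromatic channels; mask: boolean NNO selection."""
    mu = np.hypot(a[mask].mean(), b[mask].mean())   # distance of EC_NNO center
    sigma = np.sqrt(a[mask].var() + b[mask].var())  # EC_NNO radius
    d_nno = (mu - sigma) / sigma
    if d_nno > 0.5:
        return "evident cast"
    if d_nno < -0.5:
        return "no cast"
    return "ambiguous cast"
```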


Fig. 7. The WB region and color correction for a cast image.

Fig. 8. The WB region and color correction for an ambiguous case.

3.2. Cast removal

The method we propose for cast removal is based on the Von Kries hypothesis, with the RGB channels considered an approximation of the L, M, S retinal wavebands [12]. The estimate of the sensor scaling coefficients is assimilated within the evaluation of the color balancing coefficients. Our cast remover is not used alone, but applied after cast detection, and only to those images classified as having either an evident or an ambiguous cast, in order to avoid problems such as the mistaken removal of a predominant color, or the distortion of the image's chromatic balance. The diagonal transform is

$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = \begin{pmatrix} k_R & 0 & 0 \\ 0 & k_G & 0 \\ 0 & 0 & k_B \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix}, \qquad (9)$$

where RGB and R'G'B' are the color coordinates of the input and corrected image, respectively. The gain coefficients $k_R$, $k_G$, $k_B$ are estimated by setting at white what we have called the white balance (WB) region:

$$k_R = White_R / R_{WB}, \quad k_G = White_G / G_{WB}, \quad k_B = White_B / B_{WB}, \qquad (10)$$

where $R_{WB}$, $G_{WB}$, $B_{WB}$ are the averages of the three RGB channels over the whole selected WB region and $(White_R, White_G, White_B)$ represents the reference white chosen.

The main peculiarity of the method is that it determines the WB region on the basis of the type of cast detected. To avoid the mistaken removal of an intrinsic color, regions previously identified by the cast detector as probably corresponding to skin, sky, sea or vegetation are temporarily removed from the analyzed image. The algorithm then looks for the WB region in the rest of the image which presents a low level of saturation. The region corresponds to the object, or, as is explained below, group of objects, that will be forced to white. Forcing to white regions with high saturation usually causes a color distortion in the output image. During experimentation it became apparent that a sufficiently low level of saturation corresponds to C/L* < 0.8 (ratio of the chroma radius and the lightness in the CIELAB color space).

The algorithm now differentiates between evident cast and ambiguous cast. In the first case, the WB region is formed by the objects corresponding to all the significant color peaks in the ab histogram of the analyzed portion of the image (Fig. 7). Only objects with a lightness greater than 30 will be accepted, because the candidates to be whitened must not be too dark. In the case of ambiguous cast, the WB region is composed of the near neutral objects (NNO) of the analyzed portion of the original image (Fig. 8), because the influence of a weak cast is visible only in objects near the neutral axis. By choosing the NNO region, we avoid the risk of removing the intrinsic color of an object far from the neutral axis, and the consequent chromatic distortion of the output image. Once the scaling coefficients have been evaluated with Eq. (10), the cast remover algorithm is applied to the whole image, including regions of skin, sky, sea, and vegetation.

This color balance algorithm can be considered a mixture of the white balance and gray world procedures. The WB region is formed by several objects of different colors, which are set at white not singularly, but as an average. This prevents the chromatic distortion that follows on the wrong choice of the region to be whitened. When there is a clear cast, the colors of all the objects in the image suffer the same shift due to that specific cast. If we consider the average over several objects, introducing a kind of gray world assumption, only that shift survives and will be removed. Note that for images where the WB region consists of a single object, as is usually the case for images with a strong cast, our method tends to correspond to the conventional white balance algorithm, while for images where the WB region includes more objects (as in images with a softer cast), the method tends to the gray world algorithm. The evaluation of the scaling coefficients can be conveniently performed on the thumbnail image or on a sub-sampled image, significantly reducing the computational time.
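A minimal sketch of the remover of Eqs. (9)–(10), assuming the WB region has already been selected as described above; the function name and the default reference white are illustrative.

```python
import numpy as np

def remove_cast(img, wb_mask, white=(1.0, 1.0, 1.0)):
    """Eqs. (9)-(10): force the average color of the WB region to the
    chosen reference white through a diagonal transform applied to the
    whole image.

    img: float RGB array (H, W, 3) in [0, 1]; wb_mask: boolean (H, W).
    """
    rgb_wb = img[wb_mask].mean(axis=0)                 # R_WB, G_WB, B_WB
    k = np.asarray(white) / np.maximum(rgb_wb, 1e-6)   # k_R, k_G, k_B (Eq. (10))
    return np.clip(img * k, 0.0, 1.0)                  # Eq. (9)
```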
4. Experimental results

The thresholds in our procedure were heuristically determined on a set of 40 images, indicated by our research sponsor as a good example of non-professional digital photos. These images were not used in the evaluation phase.

To verify the reliability of our method we used another data set, composed of 748 images of different sizes (from 120 × 160 to 2272 × 1704 pixels), resolutions and quality (in terms of jpeg compression, noise, dynamic range, etc.). These images were downloaded from personal web-pages, or acquired by our research sponsor using various digital cameras and scanners. The pre-processing of these images varied and was in most of the cases unknown. We tried to avoid having undesirable clusters of images (i.e. images derived from the same web site and/or of the same subject), since that might have biased in some way our classification experiments.

To evaluate the goodness of the method, the performance of the cast detector and that of the cast remover must be combined. However, we first considered the performance of the cast detector alone, as this first step could also be applied in conjunction with other cast removal algorithms.

4.1. Cast detector performance

Among the 748 images, 2 were considered unclassifiable by the cast detector; these images are shown in Fig. 2. For each of the remaining 746 images, the output of the cast detector was compared with the majority opinion of a panel of five experts and reported in the confusion matrix presented in Table 1.

The diagonal terms of the matrix correspond to images for which there is an agreement between the class assigned by the detector and the opinion of the experts, and cover 85% of the cases (635/746). Examples of correctly classified images are reported in Fig. 9 (see also Figs. 10 and 11).

If we look at the misclassified elements of the matrix, we find:

• Seven predominant color images and three no-cast images erroneously classified as cast images by the cast detector. As a consequence of this misclassification, the cast remover is applied to images the colors of which should instead be preserved. This can be considered a serious error. The second and third images of Fig. 12 are examples of this type of misclassification.

• Three cast images classified by the detector as having no cast and 17 cast images as having a predominant color. These images will not be color corrected; consequently, the output images will correspond to the input ones. These cases can be considered a failure of the method, as the cast is not removed; however, the images are preserved, and there is no color distortion. We do not consider this misclassification a serious error. Even less severe is the misclassification of eight ambiguous cast images as having no cast. The first image on the left of Fig. 12 is an example of a cast image erroneously classified as a predominant color image.

• A subset of 22 images classified by the detector as ambiguous cases but judged by the experts to have no cast. These cases must be evaluated individually; however, the application of the cast remover to these images does not at any rate create artifacts, or substantially alter the high chroma colors.

Table 1. Cast detector output (columns) vs. majority opinion of the panel of experts (rows).

Panel of experts     No cast   Ambiguous cast   Cast   Predominant color
No cast                177           22            3            2
Ambiguous cast           8          133           43            0
Cast                     3            5          219           17
Predominant color        1            0            7          106

Fig. 9. Examples of images correctly classified by the detector. Above left, cast image; right, ambiguous case; below left, no-cast image; right, predominant color image.

Fig. 10. Examples of images judged by the experts as no-cast images.

Fig. 11. Examples of images judged by the experts as predominant color images.

Fig. 12. Examples of misclassification. From left to right: a cast image classified as a predominant color image (not considered a serious error), a predominant color image classified as a cast image, and a no-cast image considered a cast image (the latter two, both cases of serious misclassification).

• A subset of 48 images (5 + 43) judged ambiguous by the experts while classified by the detector as cast images, or vice versa. These images have consequently been color corrected, but not according to the computational strategy designed for them. These cases must also be evaluated individually.

• Two images classified as predominant color and judged by the experts as no-cast, and one predominant color image classified as no-cast. These misclassifications have no consequences, since neither no-cast nor predominant color images undergo cast removal.

4.2. Cast remover performance

We consider the behavior of the cast remover only with respect to the images correctly classified as having a cast, or an ambiguous cast (the second and third elements of the diagonal of the confusion matrix):

• Of the 219 images correctly classified as having a cast, after cast removal 211 were judged by the panel of experts better than the originals, six were considered equivalent, and two were judged worse. In Fig. 13, these latter 2 (left) are compared with their color balanced output (right), while Fig. 14 shows examples of cast images judged better than the originals after the color balancing.

• Among the 133 ambiguous cases, 98 images were judged by the panel of experts better than the originals after the cast removal, and the remaining equivalent. Examples of processed ambiguous cases are shown in Fig. 15 (see also Fig. 16).

In the case of cast detector misclassifications, there are two possible kinds of error:

No color balancing of cast images: The cast images are not color balanced because classified as having no cast, or as having a predominant color: the output image corresponds to the input. The error does not introduce chromatic distortion. The more frequent situation is the classification of a cast as a predominant color. It is usually the case of a strong cast which the classifier interprets as a significant region of sky, sea, vegetation, or skin (Fig. 17). We plan to reduce this error in the future by enhancing the performance of the region classifier.

Fig. 13. The 2 cast images judged by the panel of experts to be worse after color balancing: left, the input images; right, the images after color balancing.

Color distortion of no cast images or predominant color images: Predominant color images or no-cast images erroneously classified as having a cast undergo color balancing. This may be the case of close-ups of a single object, or no-cast images with only one near neutral object, which the cast detector considers cast images. Applying the cast remover may produce color changes in the output images. The three no-cast images erroneously classified as cast images are shown, before and after color balancing, in Fig. 18.

In summary, in the great majority of the experimented cases, the processed image was held to be superior to the original; only in less than 2% of the 746 images processed was the output judged worse than the original.

We have compared our results with those obtained with the conventional white patch and gray world algorithms, those most often used in imaging devices. We again collected the majority opinion of the same five experts. This second analysis gives our algorithm as consistently better than the gray world, with the exception of only a few cases (10), when it is judged equal. It does not introduce the grayish atmosphere typical of the gray world output images, as can be seen in Fig. 19 comparing the second column (output of the gray world algorithm) with the fourth (the corresponding output of our algorithm). In the case of images with a strong cast, such as in those of the first row of Fig. 19, the gray world not only introduces a grayish overtone but at times strongly distorts the colors.

In comparison with the white patch algorithm, our method is judged better in 40% of the cases, while in the rest it is judged equal. It shows a better performance in the case of cast images with highlights, or where the acquisition system has forced the white point incorrectly. In these cases, the maxima used by the white patch corresponded to regions that had lost their chromatic information, and consequently the cast could not be removed. Examples of this kind of failure of the white patch algorithm are shown in the first and fourth rows of Fig. 19. Moreover, in the presence of a predominant color, the white patch approach generally causes severe chromatic distortion, trying to whiten the intrinsic color of the scene, as in the last row of Fig. 19.

Fig. 14. Top, 4 cast images correctly classified by the cast detector. Bottom, the same 4 images after color balancing, judged by the experts better than the originals.

4.3. Computational cost

The prototype code has been developed and tested with the Matlab 6.5 Image Processing Toolbox and with Borland C++ Builder 5.0, while its firmware implementation is under study.

To evaluate quantitatively the computational cost of our procedure, the two modules of cast detection and cast removal are considered separately. The image analysis in the cast detector is performed on the thumbnail image, already available in most digital cameras, or easily computable by sub-sampling. This strategy is justified by the fact that there are no significant differences between the adopted simple statistics over the whole image and over its sub-sampled version. In the case of the C++ implementation, the average computational cost of the thumbnail analysis is about 0.17 s on a PC Intel Pentium M, 1.5 GHz, 512 MB, with Windows XP Professional as operating system. We report here

Fig. 15. Top two rows, images classiÿedas ambiguous cases by the cast detector.Bottom two rows, the same images after color correction.

the average cost, as the procedure can follow different paths depending on the characteristics of the analyzed image. The cast remover is applied to the whole image, and its cost is about 5.6 μs/pixel. The total cost is linear with the number of image pixels, as it substantially involves only a diagonal transform.

5. Conclusions

Color balancing is in general performed interactively. Automatic color correction, without knowledge of the image source and subject, is a very challenging task.

Fig. 16. The cast and ambiguous case images of Fig. 9 after the color balancing. The other two images of Fig. 9 (second row) were classified as a no-cast image and a predominant color image, respectively, and did not undergo cast removal.

For example, an automatic white balance adjustment applied to a sunset image may drastically modify the nature of the scene, removing the characteristic reddish cast, while methods based on the gray world assumption will generate a wholly grayish scene in the case of a predominant color such as is found in underwater images, or in those of a forest of mostly bright green leaves. Color constancy algorithms work well only when prior assumptions are satisfied, for example that the scene is colorful enough, or uniformly illuminated by a single source of light. When these prerequisites do not hold, the result may be distorted, or grayed out, colors. Moreover, traditional methods of cast removal do not discriminate between images with true cast and those with predominant colors, such as underwater images, or single color close-ups, but are applied in the same way to all images. This may result in an unwanted distortion of the image chromatic content with respect to the original scene.

A fundamental aspect of our method is that it is designed to distinguish between true cast and predominant color in a completely unsupervised way, allowing us to discriminate between images requiring color correction and those in which the chromaticity must, instead, be preserved. The correction is also calibrated on the type of cast, allowing the processing of even ambiguous images without color distortion. The whole analysis is performed by simple image statistics on the thumbnail image, already available in most digital cameras, or easily computable by sub-sampling. This economy of size, together with the speed of the operations involved, is a great advantage: the algorithm is so quick, it could easily be incorporated into more complex procedures, including object recognition and image retrieval, without a significant loss of computational time. Our cast detector can also be employed in tandem with other color balance methods, to suggest which images should be color corrected, and which preserved.

Fig. 17. Cast images erroneously classified as having a predominant color, and thus not color balanced.

Acknowledgements

Our investigation was performed as a part of an Olivetti I-jet research contract. The authors thank Olivetti I-jet for the permission to present this paper.

Fig. 18. Top row, the 3 no-cast images classified by the cast detector as having a cast. Bottom row, the processed images after cast removal, judged worse than the originals.

Appendix A. sRGB to CIELAB

The sRGB color space was introduced in 1996 as a standard color space for image interchange, especially over the Internet. The sRGB color space complements current ICC strategies by enabling a method of handling color in operating systems and device drivers, using a simple and robust device independent color definition (if there is no ICC profile, then the color space is sRGB; if there is an ICC profile embedded in the image, this takes priority and provides an unambiguous color space definition [28]). The sRGB color space definition is based on the average performance of typical CRT monitors under reference viewing conditions, but it is well suited to flat panel displays, television, scanners, digital cameras, and printing systems. For these reasons it has now been widely accepted in the consumer imaging industry and has been adopted as an IEC international standard [29]. Due to the similarities of the defined reference display to real CRT monitors, often no additional color space conversion is needed to display the images. However, conversions are required to transform data into sRGB and then out to devices with different dynamic ranges and viewing conditions. More details on the sRGB standard color space and the ICC profile format can be found in Refs. [28,29].

The 8 bit integer sRGB values are converted to floating point non-linear sR'G'B' values as follows:

$$R'_{sRGB} = R_{8bit}/255.0, \quad G'_{sRGB} = G_{8bit}/255.0, \quad B'_{sRGB} = B_{8bit}/255.0.$$

The non-linear sR'G'B' values are transformed to linear $R_{sRGB}$, $G_{sRGB}$, $B_{sRGB}$ values by: if $R'_{sRGB}, G'_{sRGB}, B'_{sRGB} \le 0.04045$,

$$R_{sRGB} = R'_{sRGB}/12.92, \quad G_{sRGB} = G'_{sRGB}/12.92, \quad B_{sRGB} = B'_{sRGB}/12.92;$$

else, if $R'_{sRGB}, G'_{sRGB}, B'_{sRGB} > 0.04045$,

$$R_{sRGB} = ((R'_{sRGB} + 0.055)/1.055)^{2.4},$$
$$G_{sRGB} = ((G'_{sRGB} + 0.055)/1.055)^{2.4},$$
$$B_{sRGB} = ((B'_{sRGB} + 0.055)/1.055)^{2.4}.$$

These values are converted to XYZ (D65) by

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} 0.4124 & 0.3576 & 0.1805 \\ 0.2126 & 0.7152 & 0.0722 \\ 0.0193 & 0.1192 & 0.9505 \end{pmatrix} \begin{pmatrix} R_{sRGB} \\ G_{sRGB} \\ B_{sRGB} \end{pmatrix}.$$

Tristimulus values are then mapped into the CIELAB color space according to the following equations [30], with the D65 reference white $X_n = 0.9505$, $Y_n = 1.00$, $Z_n = 1.0891$, and $q = Y/Y_n$, $p = X/X_n$, $r = Z/Z_n$:

$$L^* = 116\,(Y/Y_n)^{1/3} - 16 \quad \text{for } (Y/Y_n) > 0.008856,$$
$$L^* = 903.3\,(Y/Y_n) \quad \text{for } (Y/Y_n) \le 0.008856,$$
$$a^* = 500\,(p_1 - q_1),$$
$$b^* = 200\,(q_1 - r_1),$$

where

$$q_1 = 7.787q + 16/116 \;\text{for}\; q \le 0.008856; \quad q_1 = q^{1/3} \;\text{for}\; q > 0.008856;$$
$$p_1 = 7.787p + 16/116 \;\text{for}\; p \le 0.008856; \quad p_1 = p^{1/3} \;\text{for}\; p > 0.008856;$$
$$r_1 = 7.787r + 16/116 \;\text{for}\; r \le 0.008856; \quad r_1 = r^{1/3} \;\text{for}\; r > 0.008856.$$
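The whole appendix pipeline can be condensed into a short NumPy function; the vectorized structure and the function name are ours, but the constants and branches follow the equations above.

```python
import numpy as np

def srgb_to_lab(img8):
    """sRGB (8-bit) to CIELAB, following Appendix A (D65 reference white)."""
    c = img8.astype(float) / 255.0                    # non-linear sR'G'B'
    lin = np.where(c <= 0.04045, c / 12.92,           # linearization branches
                   ((c + 0.055) / 1.055) ** 2.4)
    M = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = lin @ M.T                                   # to XYZ (D65)
    xyz = xyz / np.array([0.9505, 1.0, 1.0891])       # p, q, r = X/Xn, Y/Yn, Z/Zn
    f = np.where(xyz > 0.008856, np.cbrt(xyz),        # p1, q1, r1
                 7.787 * xyz + 16.0 / 116.0)
    L = np.where(xyz[..., 1] > 0.008856,
                 116.0 * np.cbrt(xyz[..., 1]) - 16.0,
                 903.3 * xyz[..., 1])
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return L, a, b
```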

Fig. 19. Comparison of the output of our algorithm (last column) with that of the gray world (second column) and of the white patch (third column). The gray world may fail in images without a sufficient variation in color; the white patch, in the case of cast images with highlights. Our algorithm shows a greater stability.

References

[1] B. Funt, K. Barnard, L. Martin, Is machine colour constancy good enough?, Proceedings of the Fifth European Conference on Computer Vision, Freiburg, Germany, 1998, pp. 445–459.

[2] V. Cardei, B. Funt, K. Barnard, White point estimation for uncalibrated images, Proceedings of the IS&T/SID Seventh Color Imaging Conference, Scottsdale, USA, 1999, pp. 97–100.
[3] V. Cardei, B. Funt, M. Brockington, Issues in color correcting digital images of unknown origin, Proceedings of the 12th International Conference on Control Systems and Computer Science, CSCS'12, Bucharest, Romania, 26–29 May, 1999.
[4] G. Ciocca, D. Marini, A. Rizzi, R. Schettini, S. Zuffi, Retinex pre-processing of uncalibrated images for color based image retrieval, J. Electron. Imaging 12 (1) (2003) 161–172.
[5] M.D. Fairchild, Color Appearance Models, Addison-Wesley, Reading, MA, 1997.
[6] G.D. Finlayson, M.S. Drew, B. Funt, Diagonal transforms suffice for color constancy, Proceedings of the IEEE International Conference on Computer Vision, Berlin, 1993, pp. 164–171.
[7] G.D. Finlayson, M.S. Drew, B. Funt, Spectral sharpening: sensor transformations for improved color constancy, J. Opt. Soc. Am. A 11 (1994) 1553–1563.
[8] K. Barnard, F. Ciurea, B. Funt, Sensor sharpening for computational color constancy, J. Opt. Soc. Am. A 18 (2001) 2728–2743.
[9] http://www.cie.co.at/cie/
[10] G. Buchsbaum, A spatial processor model for object color perception, J. Franklin Inst. 310 (1980) 1–26.
[11] E. Land, The retinex theory of color vision, Sci. Am. 237 (1977) 108–129.
[12] E.H. Land, J. McCann, Lightness and retinex theory, J. Opt. Soc. Am. 61 (1) (1971) 1–11.
[13] D. Forsyth, A novel algorithm for color constancy, Int. J. Comput. Vision 5 (1990) 5–36.
[14] G.D. Finlayson, Color in perspective, IEEE Trans. Pattern Anal. Mach. Intell. 18 (10) (1996) 1034–1038.
[15] G.D. Finlayson, S. Hordley, Selection for gamut mapping colour constancy, Proceedings of the Eighth British Machine Vision Conference, Colchester, England, Vol. 2, 1997, pp. 630–639.
[16] G.D. Finlayson, S. Hordley, A theory of selection for gamut mapping colour constancy, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Santa Barbara, CA, USA, 1998, pp. 60–65.
[17] G.D. Finlayson, S. Hordley, Improving gamut mapping colour constancy, IEEE Trans. Image Process. 9 (10) (2000) 1774–1783.
[18] K. Barnard, V. Cardei, B. Funt, A comparison of computational color constancy algorithms—Part I: methodology and experiments with synthesized data, IEEE Trans. Image Process. 11 (7) (2002) 972–983.
[19] K. Barnard, L. Martin, A. Coath, B. Funt, A comparison of computational color constancy algorithms—Part II: experiments with image data, IEEE Trans. Image Process. 11 (7) (2002) 985–995.
[20] G.D. Finlayson, P.H. Hubel, S. Hordley, Color by correlation, Proceedings of the IS&T/SID Fifth Color Imaging Conference: Color Science, Systems and Applications, Scottsdale, USA, 1997, pp. 6–11.
[21] G.D. Finlayson, P.H. Hubel, S. Hordley, Color by correlation: a simple unifying framework for color constancy, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2001) 1209–1221.
[22] G.D. Finlayson, P.H. Hubel, White point estimation using color by correlation, U.S. Patent 6038339, 2000.
[23] K. Barnard, V. Cardei, B. Funt, Learning color constancy, Proceedings of the IS&T/SID Fourth Color Imaging Conference, Scottsdale, USA, 1996, pp. 58–60.
[24] B. Funt, V. Cardei, Bootstrapping color constancy, Proceedings on Human Vision and Electronic Imaging IV, San Jose, CA, 1999, pp. 421–428.
[25] T. Cooper, A novel approach to color cast detection and removal in digital images, Proc. SPIE 3963 (2000) 167–175.
[26] T. Cooper, Color segmentation as an aid to white balancing for digital still cameras, Proc. SPIE 4300 (2001) 164–171.
[27] http://www.srgb.com.
[28] International Color Consortium (ICC), http://www.color.org.
[29] International Electrotechnical Commission, Colour measurement and management in multimedia systems and equipment, Part 2-1: colour management. Default RGB colour space—sRGB, IEC Publication 61966-2.1, 1999.
[30] R.W.G. Hunt, Measuring Color, Ellis Horwood, Chichester, 1987.
[31] C. Cusano, G. Ciocca, R. Schettini, Image annotation using SVM, in: S. Santini, R. Schettini (Eds.), Proceedings on Internet Imaging IV, Vol. SPIE 5304, San Jose, CA, USA, 2004, pp. 330–338, in press.

About the Author—RAIMONDO SCHETTINI is Associate Professor in the Department of Information Science, Systems Theory, and Communication (DISCo) at the University of Milano-Bicocca, where he is in charge of the Imaging and Vision Lab. He has worked with the Italian National Research Council (C.N.R.) since 1987. He has been team leader in several national and international research projects and has published over 140 refereed papers on image processing, analysis and reproduction, and image content-based indexing and retrieval. He was General Co-Chairman of the First Workshop on Image and Video Content-based Retrieval (1998), of the First European Conference on Color in Graphics, Imaging and Vision (CGIV'2002), and of the EI Internet Imaging Conferences (2000–2004). He has been guest editor of a special issue about Internet Imaging of the Journal of Electronic Imaging, 2002, and of another on Color Image Processing and Analysis in Pattern Recognition Letters, 2003, and is currently associate editor of the Pattern Recognition journal.

About the Author—FRANCESCA GASPARINI took her degree (Laurea) in Nuclear Engineering at the Polytechnic of Milan in 1997 and her Ph.D. in Science and Technology in Nuclear Power Plants at the Polytechnic of Milan in 2000. Since 2001 she has been a fellow at the ITC, Imaging and Vision Laboratory, of the Italian National Research Council. Francesca is currently a Post-doc Researcher at DISCo (Dipartimento di Informatica, Sistemistica e Comunicazione), University of Milano-Bicocca, where her research focuses on image and video enhancement, face detection, and image quality assessment.