MVA2002 IAPR Workshop on Machine Vision Applications, Dec. 11-13, 2002, Nara-ken New Public Hall, Nara, Japan

8-20  Decomposed Eigenface Method along with Image Correction for Robust Face Recognition

Kazuma Shigenari, Fumihiko Sakaue, Takeshi Shakunaga*
Department of Information Technology, Okayama University

Abstract

We have proposed a decomposition of the eigenface into two orthogonal eigenspaces and have shown that the decomposition is effective for realizing robust face recognition under various lighting conditions [10]. The present paper refines the decomposed eigenface method by introducing a projection-based image correction. The image correction technique is principally applicable when the object shape is fixed and a sufficient number of images are taken beforehand. However, the proposed technique can also be applied to a canonical eigenspace, which is constructed from several faces taken under various lighting conditions. Reflective noise, shadows and occlusions are detected and corrected by the projection of a facial image onto the canonical eigenface. Based on the newly proposed image correction, we develop herein a refined decomposed eigenface method. The experimental results indicate that the refinement works well for face recognition under various lighting conditions, as compared to the original decomposed eigenface method.

1 Introduction

Appearance-based face recognition can be reduced to the eigenface method [6], which is often identical to the subspace method [5, 2], if several facial images can be collected in the registration stage. The eigenface, however, cannot be stably composed when too few sample images are available or when the lighting conditions are highly similar over the sample images. In order to solve this problem, Shakunaga and Yamamoto [9] proposed the concept of the virtual subspace, and Shakunaga and Shigenari [10] refined this concept to obtain the decomposed eigenface method. The decomposed eigenface method facilitates face recognition under various lighting conditions and covers cases in which too few images are available for registration.

The present paper improves the original decomposed eigenface method [10] by introducing a projection-based image correction.

Section 2 summarizes the decomposed eigenface method [10]. After the terminology is briefly introduced in 2.1-2.3, registration and recognition schemes are described in 2.5. Refinement of the decomposed eigenface method is described in Section 3. Projection-based image correction is introduced in 3.1, using a canonical eigenspace (CS) constructed from several facial images taken under various lighting conditions; examples of the image correction are also shown in 3.1. The image correction enables the refinement of the CS and of the two schemes for registration and recognition, as summarized in 3.2 and 3.3. Section 4 compares experimentally obtained results with those of the original method on a database of facial images of 50 persons under 24 different lighting conditions.

2 Decomposed Eigenface Method

2.1 Normalized Image Space

Let an N-dimensional vector X denote an image, and let 1 denote the N-dimensional vector in which every element is equal to 1. The normalized image x of an original image X is defined as x = X / (X^T 1). After the normalization, x satisfies x^T 1 = 1. Any nonzero image X (≠ 0) can thus be mapped to a point in the Normalized Image Space (NIS).

2.2 Canonical Space

In the present paper, a facial space is defined as a space composed from a set of frontal faces obtained from several persons under various lighting conditions. In order to simplify the problem, we assume that a good segmentation is readily accomplished for each normalized image, as shown in Fig. 4. Eigenspace analysis on the face space decreases the dimension of the face space with little loss of representability [2, 6]. In our experiments, a 45-dimensional eigenspace, called the canonical space (CS), is constructed from a canonical image set which consists of facial images of 50 persons under 24 lighting conditions.

Let \bar{x} and C denote the mean image and the covariance on the canonical set, respectively. Let \Lambda denote a diagonal matrix whose diagonal terms are the eigenvalues of C in descending order, and let \Phi denote a matrix whose i-th column is the i-th eigenvector of C. Then, PCA implies \Lambda = \Phi^T C \Phi. Using a submatrix \Phi_n of \Phi, which consists of the n largest-eigenvalue eigenvectors, the projection of x to the n-dimensional CS and the residual are respectively given as

    x* = \Phi_n^T (x - \bar{x}),
    x# = x - \bar{x} - \Phi_n x*.

Thus, a normalized image x can be decomposed into the canonical component x* and the residual component x#, which are orthogonal by definition.

2.3 Eigen-projection and Eigen-residual
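As a concrete companion to the definitions of 2.1 and 2.2, the normalization and the canonical decomposition can be sketched in NumPy as follows. The random stand-in canonical set, the pixel count N, and the helper name `normalize` are illustrative assumptions, not the paper's data; only the 45-dimensional CS matches the text.

```python
import numpy as np

def normalize(X):
    """Map an image vector X into NIS: x = X / (X^T 1), so that x^T 1 = 1."""
    return X / X.sum()

rng = np.random.default_rng(0)
N, n = 256, 45                         # N-pixel images, 45-d canonical space
canonical = rng.random((300, N))       # stand-in for the canonical image set
xs = np.array([normalize(X) for X in canonical])

x_bar = xs.mean(axis=0)                # mean image on the canonical set
C = np.cov(xs, rowvar=False)           # covariance on the canonical set
w, V = np.linalg.eigh(C)               # eigenvalues in ascending order
Phi_n = V[:, ::-1][:, :n]              # n largest-eigenvalue eigenvectors

x = normalize(rng.random(N))           # an unknown normalized image
x_star = Phi_n.T @ (x - x_bar)         # canonical component x*
x_sharp = x - x_bar - Phi_n @ x_star   # residual component x#

# The two components reconstruct x, and x# is orthogonal to the CS.
assert np.allclose(x_bar + Phi_n @ x_star + x_sharp, x)
assert np.allclose(Phi_n.T @ x_sharp, 0.0)
```

Since `numpy.linalg.eigh` returns eigenvalues in ascending order, the columns are reversed before truncation so that \Phi_n holds the largest-eigenvalue eigenvectors, as in the text.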

*Address: Okayama-shi, Okayama 700-8530, Japan. E-mail: {sakaue,shaku}@chino.it.okayama-u.ac.jp

The orthogonal components x* and x# enable us to decompose the eigenface (EF) in NIS. That is, as shown in Fig. 1, two eigenspaces can be constructed independently in CS and in the orthogonal complement CS⊥. The first eigenspace, called an eigen-projection (EP), is constructed from the canonical components in CS. The second eigenspace, called an eigen-residual (ER), is constructed from the residual components in CS⊥. The EP and the ER are constructed by eigenspace analysis in CS and CS⊥, respectively.

For the EP construction of the p-th person, the mean vector \bar{x}*_p and the covariance C*_p are calculated over the canonical components of the person's sample images. Let \Phi*_p and \Lambda*_p denote the eigenvector matrix and the diagonal eigenvalue matrix of C*_p, respectively. Then, PCA implies \Lambda*_p = \Phi*_p^T C*_p \Phi*_p. Using a submatrix \Phi*_{p,m} of \Phi*_p, which consists of the first m eigenvectors, the projection of x* to the p-th EP is given by

    \tilde{x}*_p = \Phi*_{p,m}^T (x* - \bar{x}*_p).

The ER can also be constructed in the manner described for the EP. In addition, we can define \bar{x}#_p, C#_p, \Phi#_p, \Lambda#_p, and \Phi#_{p,m'} in the same manner. Consequently, the projection of x# to the p-th ER is given by

    \tilde{x}#_p = \Phi#_{p,m'}^T (x# - \bar{x}#_p).

Figure 1: Decomposition of the EF into the EP and the ER.

2.4 Virtual Eigen-projection

Virtual eigenspace is defined as a virtualized concept of eigenspace [10]. When at least one image is registered for a person, a virtual eigenspace can be directly constructed over a set of virtual images that are synthesized by lighting estimation followed by lighting transformation. The virtual eigenspace converges to the real eigenspace as additional images are taken.

We can also synthesize a set of images in CS from a single projection. Therefore, an eigenspace can be constructed in CS by PCA. Let us call this virtual eigenspace a virtual eigen-projection (VEP). The image synthesis is based on the following lighting estimation and lighting transformation:

(1) Lighting estimation in CS: We can estimate the lighting condition under which each input image is taken. In our current implementation, a nearest-neighbor discrimination is performed over the 24 lighting conditions.

(2) Lighting transformation in CS: We can prepare a linear transformation in CS which approximately transforms a facial image taken under one lighting condition into a facial image taken under another condition. The transformation is constructed from the canonical image set which is used for the CS construction. Using the lighting transformations, a set of images is synthesized.

A VEP can be constructed in the CS from the synthesized image set, even if only one image is registered for each person. When additional images are registered, the VEP is updated by nearest-neighbor selection of the virtual images.

2.5 Registration and Recognition Schemes

In the registration stage, an input facial image x is decomposed into x* and x#. Then, the EP and the ER are created independently in CS and in CS⊥, respectively. The Virtual Eigen-projection (VEP) is constructed as described in 2.4.

In the recognition stage, we can realize face identification by combining the two eigenspaces. Given an unknown face x, similarity measures are defined by normalized correlations in CS and CS⊥, where C(x, y) denotes the normalized correlation of x and y:

(1) Similarity between x* and the (V)EP in CS:

    C1_p(x) = C(\Phi*_{p,m} \tilde{x}*_p + \bar{x}*_p, x*).

(2) Similarity between x# and the ER in CS⊥:

    C2_p(x) = C(\Phi#_{p,m'} \tilde{x}#_p + \bar{x}#_p, x#).

(3) Combined similarity of C1 and C2: Because C1 and C2 are calculated independently in CS and CS⊥, they can be combined as

    C3_p(x) = C1_p(x) / C1_{q_1}(x) + C2_p(x) / C2_{q_2}(x),

where q_i = arg max_{1≤q≤P} C_{i,q}(x).

A simple discrimination rule is then created for C_i (i = 1, 2, 3) by selecting the person arg max_{1≤p≤P} C_{i,p}(x).

3 Refinement of Decomposed Eigenface Method

3.1 Projection-based Image Correction

As discussed in 2.2, the projection of an image x to CS is given as x* = \Phi_n^T (x - \bar{x}). The residual x# is then expressed as x# = x - \bar{x} - \Phi_n x*.

In this section, we refine the original decomposed eigenface method by applying image correction prior to the decomposition. For this purpose, let us define the relative residual r_i for the i-th pixel of x as

    r_i = e_i^T x# / e_i^T (\bar{x} + \Phi_n x*),

where e_i is the unit vector of which only the i-th element is 1 and the other elements are 0. Then, a pixel-wise correction is defined as follows: when |r_i| ≥ r_θ for a threshold r_θ, the i-th pixel of x is replaced by e_i^T (\bar{x} + \Phi_n x*). The image correction makes each intensity value consistent with the projection. For example, regions obscured by shadows and by reflections from eyeglasses are removed by the image correction.

The image correction does not preserve the normality of the image. Therefore, the corrected image should be re-normalized after all of the pixels have been checked and corrected.

The projection-based correction changes outliers into inliers. When more than a few pixels are corrected, x* also changes to some extent. Therefore, a few iterations of the image correction should be performed in order to obtain better noise suppression. After a few iterations, x converges to an image containing little noise and few shadows.

Figure 2 shows an example of image correction. The original image in the top row is gradually corrected in both the projection and residual domains. In particular, reflections from the person's glasses are suppressed.

Figure 2: Example of iterative image correction (original image, first correction, second correction, and final result).

3.2 Refinement of Canonical Space

Because the image correction can also be applied to any image in the canonical image set, CS is reconstructed from the corrected image set. Figure 3 shows the average face and the most significant five eigenfaces of CS before and after image correction. The image correction refines the canonical space because several reflections and shadows are removed from the canonical image set. After reconstruction of the canonical space, the lighting transformation matrices are also re-estimated for the construction of the VEP [10].

Figure 3: Comparison of the average face and the most significant five bases: upper and lower rows show images before and after image correction, respectively.

3.3 Refined Registration/Recognition Schemes

Because both the CS and the registered images are corrected, each VEP includes much less noise than in the original method. No further noise reduction is necessary for the construction of the ER because the image correction repairs the ER.

In the recognition scheme, the subspace method is applied to an image after the image correction. Because the VEP and the ER include less noise, the proposed recognition scheme works better than the original method.

4 Experimental Results

4.1 Data Specifications

The data specifications are summarized in Table 1. Facial images were taken using a fixed camera in our laboratory. Each of the 100 persons looked forward while sitting in a chair located at a fixed distance from the camera; the location of the chair was fixed in order to obtain frontal facial images of each person.

Table 1: Data specifications

                              Canonical set   Test images
    # of persons              50              50
    # of lighting conditions  24              24
    Image size                32 x 32         32 x 32
    # of persons
      wearing glasses         9               15

As shown in 2.2, CS is created from the canonical image set, which consists of 1200 images of 50 persons. For each person, images were taken under 24 lighting conditions, which were controlled by changing the position of the light. In the canonical set, nine persons wore glasses. Figure 4 shows the averages of the canonical images taken under the 24 lighting conditions. The remaining 50 persons were used as the test data, in which 15 persons wore glasses. Figures 5 (a) and (b) show seven examples of canonical images and seven example test images, respectively, taken under fixed lighting conditions.

Figure 4: Averages of canonical images under the 24 lighting conditions.

4.2 Comparison
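For reference alongside the experiments, the iterative, pixel-wise correction of 3.1 can be sketched in NumPy as below. The toy orthonormal basis, the threshold value r_theta = 0.2, the three iterations, and the epsilon guard in the division are assumptions for illustration; the paper does not specify these values.

```python
import numpy as np

def correct(x, x_bar, Phi_n, r_theta=0.2, iters=3):
    """Iterative projection-based correction of Sec. 3.1 (values assumed)."""
    for _ in range(iters):
        x_star = Phi_n.T @ (x - x_bar)                 # projection x* to CS
        recon = x_bar + Phi_n @ x_star                 # e_i^T(x_bar + Phi_n x*), all i
        r = (x - recon) / np.maximum(recon, 1e-8)      # relative residual r_i (eps guard)
        x = np.where(np.abs(r) >= r_theta, recon, x)   # replace outlier pixels
        x = x / x.sum()                                # re-normalize into NIS
    return x

rng = np.random.default_rng(1)
N, n = 64, 8
Phi_n, _ = np.linalg.qr(rng.standard_normal((N, n)))   # toy orthonormal "CS" basis
x_bar = np.full(N, 1.0 / N)
x = np.abs(x_bar + 0.02 * (Phi_n @ rng.standard_normal(n)))
x[:5] += 0.05                                          # simulated highlight/shadow outliers
x = x / x.sum()
x_corr = correct(x, x_bar, Phi_n)
```

Re-normalizing inside the loop mirrors the text: the pixel replacement breaks the NIS normality, and each iteration recomputes x* from the partially corrected image so that outliers gradually become inliers.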

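The comparison below ranks each test image by the similarity measures of 2.5. As a reading aid, here is a minimal sketch of the normalized correlation C(x, y) and the combined rule C3, where `recon1[p]` and `recon2[p]` stand in for the per-person reconstructions \Phi*_{p,m} \tilde{x}*_p + \bar{x}*_p and the ER counterpart; the random stand-ins and the choice of person 0 as the true match are illustrative assumptions, not real (V)EP/ER models.

```python
import numpy as np

def C(x, y):
    """Normalized correlation of two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def discriminate(x_star, x_sharp, recon1, recon2):
    """Rank P persons by C1 (in CS), C2 (in CS-perp) and the combined C3."""
    C1 = np.array([C(r, x_star) for r in recon1])
    C2 = np.array([C(r, x_sharp) for r in recon2])
    C3 = C1 / C1.max() + C2 / C2.max()   # C3_p = C1_p/C1_q1 + C2_p/C2_q2
    return int(np.argmax(C3)), C3

rng = np.random.default_rng(2)
P, n = 5, 45
x_star, x_sharp = rng.standard_normal(n), rng.standard_normal(n)
# person 0 is made the true match; the others are unrelated random "models"
recon1 = [x_star + 0.1 * rng.standard_normal(n)] + \
         [rng.standard_normal(n) for _ in range(P - 1)]
recon2 = [x_sharp + 0.1 * rng.standard_normal(n)] + \
         [rng.standard_normal(n) for _ in range(P - 1)]
p_hat, C3 = discriminate(x_star, x_sharp, recon1, recon2)
```

Note that each term of C3 is at most 1 by construction, so C3 is bounded by 2, which is attained when the same person maximizes both C1 and C2.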
In the refined registration scheme, the VEP is constructed in the manner described in 2.5 after image correction. For personal registration, K images were randomly sampled from the 24 images of each person in the test data. The discrimination experiment was therefore performed using the remaining 24 - K images of each registered person. This process was repeated one hundred times while the registered images for each person were varied.

Figure 5: Examples of canonical/test image sets.

Table 2 shows the average discrimination rates. Six methods were compared using the same canonical set and test set. The first three rows show the results of the original method, and the remaining three rows show the results of the refined method. In all of the methods, face symmetry was used in the registration stage. The dimension of the EP is K + 3, whereas the dimension of the ER is min(2K - 1, K + 2).

The table shows that the discrimination rates are improved by the image correction for all K and for all similarity measures. The best result is provided by C3, with recognition reaching 95.1% when only one image is registered for each person. This represents an improvement of three points over the original method. When five images are registered for each person, the result obtained using C3 reaches 99.9%.

Table 2: Discrimination rates for 50 persons [%]

                     Number of samples (K) / person
    Method           1      2      3      4      5
    C1 (original)    81.2   88.3   91.1   91.8   93.3
    C2 (original)    89.0   96.2   98.2   99.1   99.3
    C3 (original)    92.1   97.3   98.6   99.3   99.3
    C1 (refined)     83.1   91.4   93.1   94.9   94.6
    C2 (refined)     -      97.0   98.9   99.6   99.8
    C3 (refined)     95.1   98.5   99.4   99.8   99.9

4.3 Recognition on the AR Database

In order to confirm the effectiveness of the refinement, we conducted experiments on the AR Face Database [4]. These experiments employ the same canonical space used in the above experiments. Table 3 shows the experimental results.

The AR database contains images of 68 persons taken under three lighting conditions for each person. In order to obtain a clear comparison with the results on our database, 50 persons are randomly selected from the AR database, and the discrimination experiment is performed using C3. In the experiment, one or two images are registered for each person and the remaining images are used for the test. Table 3 shows that the refinement can stably discriminate persons in the AR database as well as persons in our database.

Table 3: Results for two databases [%]

                       C3 (original)      C3 (refined)
    Database    P      K=1    K=2         K=1    K=2
    Ours        50     92.1   97.3        -      -
    AR          50     -      -           -      -

5 Conclusions

Refinement of the decomposed eigenface method is established based on projection-based image correction. Experimental results show that the refinement improves the discrimination rates on both our own and the AR databases. The refined method can be applied to face recognition under natural lighting conditions, even if the lighting condition is unknown or changes with time. The image correction method can also be applied to a wide range of applications, including face and object recognition.

Acknowledgment

This work has been supported in part by the Japan Science and Technology Corporation under the Ikeuchi CREST project.

References

[1] P. N. Belhumeur, J. P. Hespanha and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997.
[2] M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve Procedure for the Characterization of Human Faces", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990.
[3] B. Moghaddam and A. Pentland, "Probabilistic Visual Learning for Object Representation", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 696-710, 1997.
[4] A. M. Martinez and R. Benavente, The AR Face Database, CVC Technical Report #24, 1998.
[5] E. Oja, Subspace Methods of Pattern Recognition, Research Studies Press Ltd., 1983.
[6] M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[7] P. N. Belhumeur and D. J. Kriegman, "What is the Set of Images of an Object under All Possible Lighting Conditions?", Proc. CVPR'96, pp. 270-277, 1996.
[8] A. L. Yuille, S. R. Epstein and P. N. Belhumeur, "Determining Generative Models of Objects Under Varying Illumination: Shape and Albedo from Multiple Images Using SVD and Integrability", International Journal of Computer Vision, vol. 35, no. 3, pp. 203-222, 1999.
[9] T. Shakunaga and N. Yamamoto, "Virtual Subspace Method for Robust Face Recognition Independent of Lighting Conditions", Proc. IAPR Workshop on Machine Vision Applications (MVA 2000), 2000.
[10] T. Shakunaga and K. Shigenari, "Decomposed Eigenface for Face Recognition under Various Lighting Conditions", Proc. CVPR2001, vol. 1, pp. 864-871, 2001.
[11] T. Shakunaga and F. Sakaue, "Natural Image Correction by Iterative Projections to Eigenspace Constructed in Normalized Image Space", Proc. ICPR2002, 2002.
[12] W. Y. Zhao and R. Chellappa, "Illumination-insensitive Face Recognition Using Symmetric Shape-from-Shading", Proc. CVPR2000, pp. 286-293, 2000.