
M.I.T Media Laboratory Perceptual Computing Section Technical Report No. 443
Appears in: The 3rd IEEE Int'l Conference on Automatic Face & Gesture Recognition, Nara, Japan, April 1998

Beyond Eigenfaces: Probabilistic Matching for Face Recognition

Baback Moghaddam, Wasiuddin Wahid and Alex Pentland
MIT Media Laboratory, 20 Ames St., Cambridge, MA 02139, USA.
Email: fbaback,wasi,[email protected]

Abstract

We propose a novel technique for direct visual matching of images for the purposes of face recognition and database search. Specifically, we argue in favor of a probabilistic measure of similarity, in contrast to simpler methods which are based on standard $L_2$ norms (e.g., template matching) or subspace-restricted norms (e.g., eigenspace matching). The proposed similarity measure is based on a Bayesian analysis of image differences: we model two mutually exclusive classes of variation between two facial images: intra-personal (variations in appearance of the same individual, due to different expressions or lighting) and extra-personal (variations in appearance due to a difference in identity). The high-dimensional probability density functions for each respective class are then obtained from training data using an eigenspace density estimation technique and subsequently used to compute a similarity measure based on the a posteriori probability of membership in the intra-personal class, which is used to rank matches in the database. The performance advantage of this probabilistic matching technique over standard nearest-neighbor eigenspace matching is demonstrated using results from ARPA's 1996 "FERET" face recognition competition, in which this algorithm was found to be the top performer.

1 Introduction

Current approaches to image matching for visual object recognition and image database retrieval often make use of simple image similarity metrics such as Euclidean distance or normalized correlation, which correspond to a standard template-matching approach to recognition [2]. For example, in its simplest form, the similarity measure $S(I_1, I_2)$ between two images $I_1$ and $I_2$ can be set to be inversely proportional to the norm $\|I_1 - I_2\|$. Such a simple formulation suffers from a major drawback: it does not exploit knowledge of which type of variations are critical (as opposed to incidental) in expressing similarity. In this paper, we formulate a probabilistic similarity measure which is based on the probability that the image intensity differences, denoted by $\Delta = I_1 - I_2$, are characteristic of typical variations in appearance of the same object. For example, for purposes of face recognition, we can define two classes of facial image variations: intrapersonal variations $\Omega_I$ (corresponding, for example, to different facial expressions of the same individual) and extrapersonal variations $\Omega_E$ (corresponding to variations between different individuals). Our similarity measure is then expressed in terms of the probability

$$S(I_1, I_2) = P(\Delta \in \Omega_I) = P(\Omega_I \mid \Delta) \tag{1}$$

where $P(\Omega_I \mid \Delta)$ is the a posteriori probability given by Bayes rule, using estimates of the likelihoods $P(\Delta \mid \Omega_I)$ and $P(\Delta \mid \Omega_E)$ which are derived from training data using an efficient subspace method for density estimation of high-dimensional data [6]. This Bayesian (MAP) approach can also be viewed as a generalized nonlinear extension of Linear Discriminant Analysis (LDA) [8, 3] or "FisherFace" techniques [1] for face recognition. Moreover, our nonlinear generalization has distinct computational/storage advantages over these linear methods for large databases.

2 Analysis of Intensity Differences

We now consider the problem of characterizing the type of differences which occur when matching two images in a face recognition task. We define two distinct and mutually exclusive classes: $\Omega_I$ representing intrapersonal variations between multiple images of the same individual (e.g., with different expressions and lighting conditions), and $\Omega_E$ representing extrapersonal variations which result when matching two different individuals. We will assume that both classes are Gaussian-distributed and seek to obtain estimates of the likelihood functions $P(\Delta \mid \Omega_I)$ and $P(\Delta \mid \Omega_E)$ for a given intensity difference $\Delta = I_1 - I_2$.

Given these likelihoods we can define the similarity score $S(I_1, I_2)$ between a pair of images directly in terms of the intrapersonal a posteriori probability as given by Bayes rule:

$$S = P(\Omega_I \mid \Delta) = \frac{P(\Delta \mid \Omega_I)\,P(\Omega_I)}{P(\Delta \mid \Omega_I)\,P(\Omega_I) + P(\Delta \mid \Omega_E)\,P(\Omega_E)} \tag{2}$$

where the priors $P(\Omega)$ can be set to reflect specific operating conditions (e.g., number of test images vs. the size of the database) or other sources of a priori knowledge regarding the two images being matched. Additionally, this particular Bayesian formulation casts the standard face recognition task (essentially an $M$-ary classification problem for $M$ individuals) into a binary pattern classification problem with $\Omega_I$ and $\Omega_E$. This much simpler problem is then solved using the maximum a posteriori (MAP) rule, i.e., two images are determined to belong to the same individual if $P(\Omega_I \mid \Delta) > P(\Omega_E \mid \Delta)$, or equivalently, if $S(I_1, I_2) > \frac{1}{2}$.

2.1 Density Modeling

One difficulty with this approach is that the intensity difference vector is very high-dimensional, with $\Delta \in \mathbb{R}^N$ and $N = O(10^4)$. Therefore we typically lack sufficient independent training observations to compute reliable 2nd-order statistics for the likelihood densities (i.e., singular covariance matrices will result). Even if we were able to estimate these statistics, the computational cost of evaluating the likelihoods is formidable. Furthermore, this computation would be highly inefficient since the intrinsic dimensionality or major degrees-of-freedom of $\Delta$ for each class is likely to be significantly smaller than $N$.

Recently, an efficient density estimation method was proposed by Moghaddam & Pentland [6] which divides the vector space $\mathbb{R}^N$ into two complementary subspaces using an eigenspace decomposition. This method relies on a Principal Components Analysis (PCA) [4] to form a low-dimensional estimate of the complete likelihood which can be evaluated using only the first $M$ principal components, where $M \ll N$. This decomposition is illustrated in Figure 1, which shows an orthogonal decomposition of the vector space $\mathbb{R}^N$ into two mutually exclusive subspaces: the principal subspace $F$ containing the first $M$ principal components and its orthogonal complement $\bar{F}$, which contains the residual of the expansion. The component in the orthogonal subspace $\bar{F}$ is the so-called "distance-from-feature-space" (DFFS), a Euclidean distance equivalent to the PCA residual error. The component of $\Delta$ which lies in the feature space $F$ is referred to as the "distance-in-feature-space" (DIFS) and is a Mahalanobis distance for Gaussian densities.

As shown in [6], the complete likelihood estimate can be written as the product of two independent marginal Gaussian densities

$$\hat{P}(\Delta \mid \Omega) = \left[ \frac{\exp\left( -\frac{1}{2} \sum_{i=1}^{M} \frac{y_i^2}{\lambda_i} \right)}{(2\pi)^{M/2} \prod_{i=1}^{M} \lambda_i^{1/2}} \right] \left[ \frac{\exp\left( -\frac{\epsilon^2(\Delta)}{2\rho} \right)}{(2\pi\rho)^{(N-M)/2}} \right] = P_F(\Delta \mid \Omega)\, \hat{P}_{\bar{F}}(\Delta \mid \Omega) \tag{3}$$

where $P_F(\Delta \mid \Omega)$ is the true marginal density in $F$, $\hat{P}_{\bar{F}}(\Delta \mid \Omega)$ is the estimated marginal density in the orthogonal complement $\bar{F}$, $y_i$ are the principal components and $\epsilon^2(\Delta)$ is the residual (or DFFS). The optimal value for the weighting parameter $\rho$ is then found to be simply the average of the $\bar{F}$ eigenvalues

$$\rho = \frac{1}{N - M} \sum_{i=M+1}^{N} \lambda_i \tag{4}$$

We note that in actual practice, the majority of the $\bar{F}$ eigenvalues are unknown but can be estimated, for example, by fitting a nonlinear function to the available portion of the eigenvalue spectrum and estimating the average of the eigenvalues beyond the principal subspace.

... images consists of hard recognition cases that have proven difficult for all face recognition algorithms previously tested on the FERET database. The difficulty posed by this dataset appears to stem from the fact that the images were taken at different times, at different locations, and under different imaging conditions. The set of images consists of pairs of frontal-views (FA/FB) and is divided into two subsets: the "gallery" (training set) and the "probes" (testing set). The gallery images consisted of 74 pairs of images (2 per individual) and the probe set consisted of 38 pairs of images, corresponding to a subset of the gallery members. The probe and gallery datasets were captured a week apart and exhibit differences in clothing, hair and lighting (see Figure 2).

Before we can apply our matching technique, we need to perform an affine alignment of these facial images. For this purpose we have used an automatic face-processing system which extracts faces from the input image and normalizes for translation, scale as well as slight rotations (both in-

Figure 1: (a) Decomposition of $\mathbb{R}^N$ into the principal subspace $F$ and its orthogonal complement $\bar{F}$ for a Gaussian density, (b) a typical eigenvalue spectrum and its division into the two orthogonal subspaces.
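As a rough illustration of the two-subspace density estimate of Eqs. (3) and (4) and of the Bayesian similarity score of Eq. (2), the following Python sketch may be helpful. It is not the authors' implementation. The names `EigenspaceDensity` and `similarity` are hypothetical, the "faces" are synthetic Gaussian vectors, the data are centred on the sample mean, and $\rho$ is computed from the sample covariance spectrum rather than from a fitted nonlinear function; all of these are assumptions made for the example.

```python
import numpy as np


class EigenspaceDensity:
    """Gaussian density estimate split into a principal subspace F and its complement."""

    def __init__(self, deltas, M):
        deltas = np.asarray(deltas, dtype=float)      # shape (n_samples, N)
        n, N = deltas.shape
        self.mean = deltas.mean(axis=0)               # assumption: centre on the sample mean
        centred = deltas - self.mean
        # SVD of the centred data yields the PCA basis without forming an N x N covariance.
        _, s, vt = np.linalg.svd(centred, full_matrices=False)
        eigvals = s ** 2 / (n - 1)                    # eigenvalues of the sample covariance
        self.M, self.N = M, N
        self.basis = vt[:M]                           # first M principal directions, shape (M, N)
        self.lam = eigvals[:M]
        # Eq. (4) applied to the sample spectrum: mean of the eigenvalues outside F.
        self.rho = (eigvals.sum() - self.lam.sum()) / (N - M)

    def log_likelihood(self, delta):
        d = np.asarray(delta, dtype=float) - self.mean
        y = self.basis @ d                            # principal components y_i
        difs = np.sum(y ** 2 / self.lam)              # Mahalanobis distance inside F (DIFS)
        dffs = max(d @ d - y @ y, 0.0)                # residual epsilon^2(delta) (DFFS)
        log_pf = (-0.5 * difs - 0.5 * self.M * np.log(2 * np.pi)
                  - 0.5 * np.sum(np.log(self.lam)))
        log_pfbar = (-0.5 * dffs / self.rho
                     - 0.5 * (self.N - self.M) * np.log(2 * np.pi * self.rho))
        return log_pf + log_pfbar                     # log of Eq. (3)


def similarity(delta, intra, extra, prior_intra=0.5):
    """Eq. (2): posterior probability of the intrapersonal class, computed in log space."""
    a = intra.log_likelihood(delta) + np.log(prior_intra)
    b = extra.log_likelihood(delta) + np.log(1.0 - prior_intra)
    m = max(a, b)                                     # subtract the maximum to avoid overflow
    return float(np.exp(a - m) / (np.exp(a - m) + np.exp(b - m)))


# Synthetic stand-ins for aligned face images (purely illustrative data).
rng = np.random.default_rng(0)
N, n_people = 400, 100
faces = rng.normal(0.0, 1.0, size=(n_people, N))
shots = faces[:, None, :] + rng.normal(0.0, 0.3, size=(n_people, 2, N))
intra_deltas = shots[:, 0] - shots[:, 1]                       # same person, two images
extra_deltas = shots[:, 0] - np.roll(shots[:, 1], 1, axis=0)   # two different people

intra_model = EigenspaceDensity(intra_deltas, M=10)
extra_model = EigenspaceDensity(extra_deltas, M=10)

# Held-out pairs: two images of one new person, and images of two new people.
new_faces = rng.normal(0.0, 1.0, size=(2, N))
a1 = new_faces[0] + rng.normal(0.0, 0.3, size=N)
a2 = new_faces[0] + rng.normal(0.0, 0.3, size=N)
b1 = new_faces[1] + rng.normal(0.0, 0.3, size=N)
s_same = similarity(a1 - a2, intra_model, extra_model)
s_diff = similarity(a1 - b1, intra_model, extra_model)
```

Under the MAP rule of Section 2, a pair would be declared the same individual when the returned score exceeds 1/2. Working with log-likelihoods is a design choice of this sketch: for $N$ in the hundreds or more, the densities in Eq. (3) are far too small to represent directly in floating point.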