Subclass Discriminant Nonnegative Matrix Factorization for Facial Image Analysis

Pattern Recognition 45 (2012) 4080–4091
Journal homepage: www.elsevier.com/locate/pr

Symeon Nikitidis b,a, Anastasios Tefas b, Nikos Nikolaidis b,a, Ioannis Pitas b,a,*

a Informatics and Telematics Institute, Center for Research and Technology, Hellas, Greece
b Department of Informatics, Aristotle University of Thessaloniki, Greece

Article history: Received 4 October 2011. Received in revised form 21 March 2012. Accepted 26 April 2012. Available online 16 May 2012.

Keywords: Nonnegative Matrix Factorization; Subclass discriminant analysis; Multiplicative updates; Facial expression recognition; Face recognition

Abstract

Nonnegative Matrix Factorization (NMF) is among the most popular subspace methods, widely used in a variety of image processing problems. Recently, a discriminant NMF method that incorporates Linear Discriminant Analysis inspired criteria has been proposed, which achieves an efficient decomposition of the provided data to its discriminant parts, thus enhancing classification performance. However, this approach possesses certain limitations, since it assumes that the underlying data distribution is unimodal, which is often unrealistic. To remedy this limitation, we regard that data inside each class have a multimodal distribution, thus forming clusters and use criteria inspired by Clustering based Discriminant Analysis. The proposed method incorporates appropriate discriminant constraints in the NMF decomposition cost function in order to address the problem of finding discriminant projections that enhance class separability in the reduced dimensional projection space, while taking into account subclass information. The developed algorithm has been applied for both facial expression and face recognition on three popular databases.
Experimental results verified that it successfully identified discriminant facial parts, thus enhancing recognition performance.

© 2012 Elsevier Ltd. All rights reserved.

* Corresponding author at: Department of Informatics, Aristotle University of Thessaloniki, Greece. Tel.: +30 2310996361.
E-mail addresses: [email protected] (S. Nikitidis), [email protected] (A. Tefas), [email protected] (N. Nikolaidis), [email protected] (I. Pitas).
http://dx.doi.org/10.1016/j.patcog.2012.04.030

1. Introduction

Nonnegative Matrix Factorization (NMF) [1] is an unsupervised matrix decomposition algorithm that requires both the data matrix being decomposed and the resulting factors to contain nonnegative elements. The nonnegativity constraint implies that the original data are reconstructed using only additive, and no subtractive, combinations of the resulting basis elements. This constraint distinguishes NMF from many other traditional dimensionality reduction algorithms, such as Principal Component Analysis (PCA) [2], Independent Component Analysis (ICA) [3,4] or Singular Value Decomposition (SVD) [5].

One of the most useful properties of NMF-based methods is that they usually produce a sparse representation of the decomposed data. Sparse coding corresponds to data representation using few basis elements that are spatially distributed and, ideally, nonoverlapping. However, because the sparseness achieved by the original NMF is a side-effect of the imposed nonnegativity constraints rather than a goal, different approaches have been proposed that attempt to control the degree to which the resulting representation is sparse. In this direction, Hoyer incorporated the notion of sparseness into the standard NMF decomposition function, so that the representation sparsity can be better controlled [6], while Li et al. [7] introduced localization constraints leading to a parts-based representation.

To interpret the NMF parts-based image representation, consider the scenario where NMF operates either on facial images or on a collection of text documents. In this scenario, the NMF training procedure aims to learn the parts of the decomposed data, which in the first case correspond to different facial parts and in the second to meaningful topics. Consequently, the identified basis elements, when combined using appropriate weight factors, accurately reconstruct the original facial images or text documents that have been decomposed. This parts-based representation property of NMF is consistent with the psychological intuition of combining parts to form the whole for object representation in the human brain [8,9].

Recently, numerous specialized NMF-based algorithms have been proposed for various problems in diverse fields. These algorithms modify the NMF decomposition cost function by incorporating additional penalty terms in order to fulfill specific requirements arising in each application domain. In [10], Projective NMF (PNMF) was introduced, which was shown to generate a projection matrix that is much sparser than that of the original NMF and approximately orthogonal, revealing strong connections between PNMF and nonnegative PCA. Extensive theoretical and practical justifications of the PNMF algorithm have been given in [11]. An extension of NMF that is applicable to mixed sign data has been attempted in [12], where the proposed framework relaxes the nonnegativity constraint on the bases matrix and considers entries with both positive and negative sign, while the weights matrix remains nonnegatively constrained. The efficiency of the presented framework has been investigated in various clustering problems.

Focusing on facial image analysis, numerous specialized NMF decomposition variants have been proposed for face recognition [7,13], facial identity verification [14] and facial expression recognition [17,18]. In such applications the entire facial image forms a feature vector and NMF aims to find its projections that optimize a given criterion. The resulting projections are then used to project unknown test facial images from the original high dimensional image space into a lower dimensional subspace, where the criterion under consideration is optimized. One limitation of NMF is that the decomposed images must be vectorized in order to perform the nonnegative decomposition. This vectorization leads to significant information loss, since the local structure of the decomposed images is no longer available. In order to remedy this limitation, the 3D Nonnegative Tensor Factorization (NTF) has been introduced in the literature [19,20].

A supervised NMF learning method that aims at extracting discriminant facial parts is the Discriminant NMF (DNMF) algorithm proposed in [14,26]. DNMF incorporates discrimination criteria in the NMF factorization and achieves a more efficient decomposition of the data to its discriminant parts, thus enhancing separability between classes compared to the conventional NMF. However, the incorporation of Linear Discriminant Analysis (LDA) inspired criteria [22] inside DNMF has certain shortcomings. Firstly, LDA assumes that the sample vectors of each class are generated from underlying multivariate Gaussian distributions having a common covariance matrix but different class means. Secondly, since LDA assumes that each class is represented by a single compact data cluster, the problem of nonlinearly separable classes cannot be solved. However, this problem can be tackled if we consider that each class is parti- [...] underlying distribution of each class by a mixture of Gaussians and employ criteria inspired by the Clustering based Discriminant Analysis (CDA) introduced in [24]. Moreover, we extend the NMF algorithm, modifying its decomposition by embedding appropriate discriminant constraints and reformulating the cost function that drives the optimization process. This extension provides discriminant projections that are expected both to be robust to illumination changes and expression variations, and to enhance class separability in the reduced dimensional space. To perform SDNMF optimization, we develop multiplicative update rules that consider both the class origin of the samples and the formation of clusters inside each class, and prove their convergence using an appropriately designed auxiliary function.

In summary, the novel contributions of this paper are the following:

- Subclass based discriminant constraints are incorporated in the NMF decomposition cost function, resulting in a specialized NMF-based method.
- Novel multiplicative update rules for optimizing SDNMF are proposed and their convergence is proven.
- A decomposition of a facial image into its discriminant parts using sparse representations is obtained.

The rest of the paper is organized as follows. The NMF algorithm and some of its most notable variants are reviewed in Section 2. Section 3 introduces the proposed SDNMF method, which incorporates subclass discriminant constraints in the NMF decomposition framework, and derives the proposed multiplicative update rules. Section 4 describes the conducted experiments, verifying the efficiency of our algorithm in face and facial expression recognition. Finally, the convergence proof of our optimization scheme is provided in Appendix A, whereas directions for future work and concluding remarks are given in Section 5.

2. Brief review of NMF and its most notable variants

In this section, we briefly present
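As an illustration of the plain NMF decomposition described in the Introduction, the following is a minimal sketch of a nonnegative factorization V ≈ WH computed with multiplicative updates. It assumes the squared Frobenius reconstruction error and the standard Lee and Seung update rules for that cost, which is not necessarily the cost function used in this paper, and it is not the SDNMF algorithm. The function names (nmf, matmul, transpose, frobenius_error), the toy matrix V and the parameters rank, iterations, seed and eps are all hypothetical choices made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rr in zip(v, wh) for x, y in zip(rv, rr))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize nonnegative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random start, so no entry is stuck at zero initially.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), elementwise; eps avoids division by zero.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


# Toy nonnegative data: each column plays the role of a vectorized image.
V = [[1.0, 2.0, 3.0, 0.0],
     [2.0, 4.0, 6.0, 0.0],
     [0.0, 1.0, 0.0, 5.0],
     [1.0, 3.0, 3.0, 5.0]]
W, H, errors = nmf(V, rank=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

Because every update multiplies a nonnegative entry by a ratio of nonnegative quantities, W and H stay nonnegative throughout, which is the additive-only reconstruction property discussed above. Variants such as the discriminant methods mentioned in the Introduction would add penalty terms to the cost and change the update rules accordingly.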
