
Image and Vision Computing 58 (2017) 214–223

Contents lists available at ScienceDirect

Image and Vision Computing

journal homepage: www.elsevier.com/locate/imavis

Semi self-training beard/moustache detection and segmentation simultaneously

T. Hoang Ngan Le*, Khoa Luu, Chenchen Zhu, Marios Savvides
CyLab Biometrics Center and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA

ARTICLE INFO

Article history:
Received 22 September 2015
Received in revised form 15 July 2016
Accepted 25 July 2016
Available online 21 September 2016

Keywords:
Beard/moustache
Detection and segmentation
Superpixel
Random Ferns
Support Vector Machine
Histogram of Gabor (HoG)
Histogram of Oriented Gradient of Gabor (HOGG)

ABSTRACT

This paper presents a robust, fully automatic and semi self-training system to detect and segment facial beard/moustache simultaneously in challenging facial images. Based on the observation that certain facial areas, e.g. cheeks, do not typically contain any facial hair, whereas others, e.g. brows, often contain hair, a self-trained model is first built using the testing image itself. To overcome the limitation that facial hair in brow regions and beard/moustache regions differs in length, density, color, etc., a pre-trained model is also constructed using training data. The pre-trained model is only pursued when the self-trained model produces low-confidence classification results. In the proposed system, we employ superpixels together with a combination of two classifiers, i.e. Random Ferns (rFerns) and Support Vector Machines (SVM), to obtain good classification performance as well as improve time efficiency. A feature vector, consisting of Histogram of Gabor (HoG) and Histogram of Oriented Gradient of Gabor (HOGG) at different directions and frequencies, is generated from both the bounding box of the superpixel and the superpixel foreground. The segmentation result is then refined by our proposed aggregate searching strategy in order to deal with inaccurate landmarking points. Experimental results have demonstrated the robustness and effectiveness of the proposed system. It is evaluated on images drawn from three databases, i.e. the Multiple Biometric Grand Challenge (MBGC) still face database, the NIST color Facial Recognition Technology (FERET) database and a large subset of the Pinellas County database.

© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Facial hair analysis has recently received significant attention from forensic and biometric researchers due to three important observations. Firstly, changing facial hair can modify a person's appearance such that it affects facial recognition systems, as illustrated in Fig. 1. In this figure, the leftmost image is the gallery image whereas the right images are from the probe set; the similarities between the gallery image and the probe set at pixel level are given under each probe image. Secondly, most females do not have a beard or moustache; therefore, detecting facial hair helps to distinguish males from females with high confidence in the gender classification problem. Finally, as opposed to babies and young adults, only male senior adults generally have a beard or moustache, so facial hair detection can help to improve the accuracy of an age estimation system, as shown in Fig. 2. In addition to beard and moustache detection, segmentation also plays a crucial role, especially in facial recognition systems, due to the following biometric observation: there is a lack of facial hair in small patches under a human mouth, and these features do not change during the lifetime of a person [2] (illustrated in Fig. 3). Accurate beard and moustache detection and segmentation enable us to perform occlusion removal, gender and ethnicity classification, age estimation, facial recognition, etc.

Nguyen et al. [3] presented a method for facial beard synthesis and editing. However, this method only works on high resolution images with the assumption that facial hair already exists. It also requires initial seeds for its Graphcuts-based method, and it lacks quantitative experimental results. To overcome those weaknesses, Pierrard [4] presented facial hair detection using GrabCut and alpha matting. Although this method was evaluated in the application of face recognition on the FERET database [5], the work also lacked quantitative results. Later, Le et al. [2,6] introduced the SparCLeS method to automatically detect and segment beard/moustache with full measurement. Their methods are robust against illumination and obtain high detection accuracy. However, they are time consuming and highly complex because of the Multiscale Self-Quotient algorithm and the dictionary learning approach. Furthermore, their facial hair detection depends on training data that build the binary decision dynamic

☆ A preliminary version of this paper was presented in the International Conference on Biometrics (ICB), May 19–22, 2015 in Phuket, Thailand [1].
☆☆ This paper has been recommended for acceptance by Michele Nappi.
* Corresponding author.

http://dx.doi.org/10.1016/j.imavis.2016.07.009
0262-8856/© 2016 Elsevier B.V. All rights reserved.

dictionary. Recently, Le et al. [1] developed a system that self-learns a transformation vector to separate a hair class and a non-hair class from the testing image itself. In their approach, a feature-based segmentation was proposed to segment the beard/moustache from a region on the face that is discovered to contain facial hair. Their system has been evaluated on a large number of images from the MBGC, FERET and Pinellas databases and obtained highly accurate detection and segmentation results with small time consumption. To overcome the time-consumption handicaps of Le et al. [2,6], Le et al. [1] presented a self-learned transformation vector for each testing image, where the training samples are extracted from the testing image itself. The approach of Le et al. [1] is more computationally efficient and is based on three observations: (1) certain areas, such as cheeks, do not typically contain any facial hair; (2) certain areas, such as brows, often contain facial hair; and (3) unlike a skin region, a facial hair region contains high frequency information in different orientations, which depends on whether the hair is straight, wavy or curly. Experiments have shown that the latter approach is faster but less accurate when the facial hair present in the beard/moustache region is quite different in length, color and density from the facial hair in the brows.

In this paper, we take advantage of both a self-trained model and a pre-trained model. Given a testing image, the pre-trained model is called if the self-trained model gives a low-confidence classification score. Our proposed algorithm first extracts features consisting of Histogram of Gabor (HoG) and Histogram of Oriented Gradient of Gabor (HOGG) at different directions and frequencies [1] from both the bounding box of the superpixel and the superpixel foreground. Then, the system is trained using a combination of two productive classifiers, i.e. Random Ferns (rFerns) and Support Vector Machines (SVM), on top of the HoG and HOGG features to assign a score to each superpixel. To avoid the problem of incorrect alignment from the landmarker, and to handle the cases where people have a long beard, we propose a greedy searching strategy that groups all superpixels which are close in the feature space.

Fig. 1. An illustration of how changes to facial hair can transform the appearance of a person. The left image is a target image whereas the right images are in a probe set; the matching scores at pixel level (100.0, 73.0, 69.9 and 56.44) are given below each image in the probe set.

Fig. 2. An illustration of how beard/moustache works in the age estimation problem.

Fig. 3. An illustration of patches under the mouth where facial hair is absent.

Unlike earlier systems, which treat detection and segmentation as two separate tasks, our proposed system is able to simultaneously detect and segment facial hair. In contrast to previous works that operate at the pixel level and rely strongly on the landmarker, our work makes use of superpixels and proposes an aggregate search strategy to overcome these handicaps. The proposed algorithm includes four main steps:

• Superpixel Generation: We start by grouping pixels into perceptually meaningful regions called superpixels. Because we want to accurately segment the object (facial hair) as well as reduce computational complexity, we use simple linear iterative clustering (SLIC) [7] to generate superpixels.
• Feature Extraction: We extract a feature vector, which is a combination of the HoG and HOGG features [1], from both the bounding box of the superpixel and the superpixel foreground.
• Superpixel Classification: We train the proposed system using a classifier, which is a combination of rFerns and SVM on top of the high frequency features, to assign a score to each superpixel. There are two models applied in the proposed system, namely, a self-trained model and a pre-trained model. We first use the self-trained model to decide the label of each superpixel. Based on the confidence of the superpixel labeling, the pre-trained model may be used if the score is low.
• Aggregate Searching: Because the region of interest (ROI) is selected and cropped by the landmarker, the facial hair may be inaccurately segmented if the landmarking points are not aligned well, or if the facial hair is longer than the facial boundary. We propose an aggregate search strategy to refine the result as well as overcome the limitations of the landmarker.

Our contributions in this paper are summarized as follows:

• Simultaneously segment and detect beard/moustache to reduce the computational time.
• Make use of the advantages of superpixels to enforce local smoothness as well as improve segmentation substantially.
• Make use of the advantages of both the self-trained model and the pre-trained model to overcome the limitations of facial hair color, length, density, near-clear shaving, illumination, etc.
• Propose an aggregate searching strategy to deal with the problem of inaccurate alignment coming from the landmarker.

The rest of this paper is organized as follows. Section 2 introduces the background of the main techniques employed in our method, i.e. the MASM landmarking algorithm, the two classifiers Random Ferns (rFerns) and SVM, and the simple linear iterative clustering (SLIC) algorithm. Section 3 describes our proposed algorithm, which consists of two key parts, i.e. feature extraction and aggregate searching. In Section 4, we demonstrate the performance of our proposed algorithms on facial images collected from three databases, i.e. the Multiple Biometric Grand Challenge (MBGC) database, the color FERET database, and a large subset of the Pinellas County face database.

2. Preliminary notions

2.1. Facial landmarking

Active Shape Model (ASM) [8] aims to fit a set of landmarks specified in its training onto an unseen instance of the object. For automatically locating facial landmarks, the ASM must be initialized so that its mean shape lies approximately on top of the face in the given image. In order to obtain the best fit, ASM follows an iterative process of moving the landmarks into place, together with a shape-based constraining step that ensures that the group of landmarks satisfies the structure of a typical face.

This paper employs the Modified Active Shape Model (MASM) approach [9] to robustly find facial landmarking points in our system. The dense facial landmarking scheme consists of 161 landmark points; Fig. 4 shows an example of our proposed landmarking scheme. The MASM approach uses a 2D texture descriptor and searches for candidate locations of a landmark in a 2D region around the initial locations. At the test stage, a 2D profile around a potential landmark is projected onto a subspace to obtain a vector of projection coefficients and is then reconstructed with these coefficients. The reconstruction error between the reconstructed and the original profiles is calculated, and the best candidate location is the one that gives the lowest reconstruction error. MASM is trained on images across several poses of the Multi-PIE (MPIE) database [10]. A commercial face detector is used to initialize MASM, which can automatically localize the 161 key points in a testing face.

Fig. 4. The dense facial landmarking scheme consisting of 161 landmarks.

2.2. Random Ferns

Random Ferns (rFerns) [11,12] are classifiers that make use of features of both decision trees and Bayesian classifiers. Random Ferns is

Fig. 5. An illustration of the real part of the Gabor filter. The 1st row: σx = σy = 15, F0 = 0.05; the 2nd row: σx = σy = 15, F0 = 0.1; the 3rd row: σx = σy = 30, F0 = 0.1; the 4th row: σx = σy = 30, F0 = 0.2. From left to right, the orientations are π/8 : π/4 : π.
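The filters illustrated in Fig. 5 can be reproduced with a short numpy sketch of the complex Gabor filter of Eq. (7) in Section 3.1. The 63×63 window size and the zero-centered grid are our own assumptions; the paper does not specify the filter support.

```python
import numpy as np

def gabor_kernel(size, sigma_x, sigma_y, f0, theta, mu_x=0.0, mu_y=0.0):
    """Complex 2D Gabor filter of Eq. (7): a Gaussian envelope times a
    complex sinusoid along the rotated axis R1."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotated coordinates R1, R2 as defined below Eq. (7)
    r1 = (x - mu_x) * np.cos(theta) + (y - mu_y) * np.sin(theta)
    r2 = -(x - mu_x) * np.sin(theta) + (y - mu_y) * np.cos(theta)
    k = 1.0 / (2.0 * np.pi * sigma_x * sigma_y)  # normalization constant K
    envelope = np.exp(-r1 ** 2 / (2 * sigma_x ** 2) - r2 ** 2 / (2 * sigma_y ** 2))
    return k * envelope * np.exp(2j * np.pi * f0 * r1)

# One row of Fig. 5: sigma_x = sigma_y = 15, F0 = 0.05,
# orientations pi/8 : pi/4 : pi, i.e. pi/8, 3pi/8, 5pi/8, 7pi/8.
thetas = np.arange(np.pi / 8, np.pi, np.pi / 4)
bank = [gabor_kernel(63, 15, 15, 0.05, t).real for t in thetas]
```

The real parts in `bank` correspond to the even (cosine) component shown in the figure; the imaginary part of the returned kernel is the odd (sine) component.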

Fig. 6. An illustration of HoG: original image (a1), Gabor filtering image (a2), HoG of four ROIs corresponding to brow (in blue), cheek (in cyan), moustache (in red) and beard (in green) (a3).

an extension of Random Forest, but simpler and more computationally efficient. A rFerns classifier is a collection of ferns, where each fern is a non-hierarchical structure defined as a simplified binary decision tree of a fixed depth D. Each leaf node of a fern stores the posterior probabilities of the classes over the objects that are directed to this node. When classifying or recognizing an object, each fern returns a vector of probabilities that the object belongs to the particular decision classes. Let Ci, where i = 1, ..., L, be the set of classes and let Fj, where j = 1, ..., N, be the set of binary features that will be classified. The classification problem is formulated as follows,

Ci* = arg max_{Ci} P(C = Ci | F1, F2, ..., FN)    (1)

where

P(C = Ci | F1, F2, ..., FN) = P(F1, F2, ..., FN | C = Ci) P(C = Ci) / P(F1, F2, ..., FN)    (2)

2.3. Support Vector Machines

This section gives a brief overview of the basics of SVMs for binary classification. Then, we explain how this technique can be expanded to deal with the regression problem.

2.3.1. Binary classification
Given N training points (x1, y1), (x2, y2), ..., (xN, yN) with xi ∈ R^n and yi ∈ {−1, 1}, i = 1, ..., N, and supposing these points are linearly separable, we have to find a set of Ns support vectors si (Ns ≤ N), coefficient weights αi, a constant b and the linear decision surface, as in Eq. (3) below, such that the distance to the support vectors is maximized:

w · x + b = 0,  where  w = Σ_{i=1}^{Ns} αi yi si    (3)

SVMs can be expanded to become nonlinear decision surfaces by first using a mapping Φ: R^n → H to map the points to some other Euclidean space H that is linearly separable, and by defining a kernel function K, where K(xi, xj) = Φ(xi) · Φ(xj). With a given regularization parameter C > 0, the decision surface is then obtained by solving

min_{w,b,ξ} (1/2) ||w||^2 + C Σ_{i=1}^{Ns} ξi
s.t.  yi (w · xi + b) ≥ 1 − ξi,  ξi ≥ 0    (4)

2.3.2. Support Vector Regression (SVR)
The goal of the SVR problem is to build a hyperplane "close" to as many of the training points as possible. Given N training points (x1, y1), (x2, y2), ..., (xN, yN) with xi ∈ R^n and yi ∈ {−1, 1}, i = 1, ..., N, we have to construct the hyperplane and the values of w and b. The hyperplane w is selected with a small norm, while simultaneously minimizing the sum of the distances from these training points to the hyperplane, measured using Vapnik's ε-insensitive loss function. The value of ε is selected by the user, and the tradeoff between finding a hyperplane with a good regression

Fig. 7. An illustration of HOGG: original image (b1), Gabor filtering image (b2), phase of Gabor filtering image (b3) and magnitude of Gabor filtering image (b4), HOGG of four ROIs corresponding to brow (in blue), cheek (in cyan), moustache (in red) and beard (in green) (b5).
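As a concrete illustration of the fern classifier of Section 2.2 (Eqs. (1)–(2)), the following numpy sketch groups randomly chosen binary features into ferns of depth D, stores class posteriors per leaf, and combines the ferns semi-naively by summing log-posteriors. The add-one smoothing of the leaf counts and the random feature assignment are our own assumptions.

```python
import numpy as np

class RandomFerns:
    """Minimal Random Ferns sketch: each fern tests D binary features,
    indexing one of 2**D leaves holding smoothed class posteriors."""

    def __init__(self, n_ferns, depth, n_classes, seed=0):
        self.n_ferns, self.depth, self.n_classes = n_ferns, depth, n_classes
        self.rng = np.random.default_rng(seed)

    def _leaves(self, F):
        # Map each sample's D binary tests per fern to a leaf index in [0, 2^D).
        weights = 2 ** np.arange(self.depth)
        return (F[:, self.feature_idx] * weights).sum(axis=2)  # (n_samples, n_ferns)

    def fit(self, F, y):
        self.feature_idx = self.rng.integers(0, F.shape[1],
                                             size=(self.n_ferns, self.depth))
        leaves = self._leaves(F)
        # Leaf counts per class with add-one smoothing (our assumption)
        counts = np.ones((self.n_ferns, 2 ** self.depth, self.n_classes))
        for f in range(self.n_ferns):
            np.add.at(counts[f], (leaves[:, f], y), 1)
        # Per-leaf posterior P(C | leaf), stored as logs
        self.log_post = np.log(counts / counts.sum(axis=2, keepdims=True))
        return self

    def predict(self, F):
        leaves = self._leaves(F)
        scores = np.zeros((F.shape[0], self.n_classes))
        for f in range(self.n_ferns):
            scores += self.log_post[f, leaves[:, f]]  # semi-naive combination
        return scores.argmax(axis=1)  # Eq. (1): arg max of the posterior
```

Summing per-fern log-posteriors corresponds to the semi-naive factorization of the likelihood in Eq. (2): features within a fern are modeled jointly, while the ferns themselves are treated as independent.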

performance is controlled via the given regularization parameter C. The QP problem associated with SVR is described as follows,

min_{w,b,ξ,ξ*} (1/2) ||w||^2 + C Σ_{i=1}^{Ns} (ξi + ξi*)
s.t.  yi − (w · xi + b) ≤ ε + ξi,  ξi ≥ 0
      −yi + (w · xi + b) ≤ ε + ξi*,  ξi* ≥ 0    (5)

2.4. Simple linear iterative clustering (SLIC)

The SLIC algorithm [7] clusters pixels in the combined five-dimensional color and image plane space to efficiently generate compact, nearly uniform superpixels. The color space is the CIELAB space [l a b], considered to be perceptually uniform for small color distances, and the image plane space is represented by the pixel coordinates [x y]. The algorithm uses the Euclidean distance in the [l a b x y] 5D space, normalizing the spatial distances by the superpixel's initial grid interval S, since the maximum possible distance between two colors in the CIELAB space is limited whereas the spatial distance in the xy plane depends on the image size. The distance measure in SLIC is formed as follows,

Ds = d_lab + (m / S) d_xy    (6)

where Ds is the sum of the lab Euclidean distance and the xy plane distance normalized by the grid interval S, and m is introduced to control the compactness of the superpixels. Despite the striking simplicity of this approach, it generates superpixels at a very low computational cost while achieving a segmentation quality similar to or greater than some mainstream methods, as measured by boundary recall and under-segmentation error.

Fig. 8. An illustration where the landmarking points along the face boundary are not always well aligned.

3. Our proposed algorithm

This section is composed of two parts. First, we present our proposed feature extraction approach in Section 3.1. These features are employed to classify whether a superpixel contains facial hair information. Then, Section 3.2 details our aggregate grouping strategy.

3.1. Feature extraction

We propose two kinds of 2D Gabor based features for both facial beard/moustache detection and segmentation. In general, a complex Gabor filter can be formulated as the product between a Gaussian function and a complex exponential sinusoid function. In the spatial domain, a complex Gabor filter includes two separate components, i.e. real and imaginary, defined as in Eq. (7). Fig. 5 shows the real part of the spatial impulse response of a set of Gabor filters constructed at various frequencies and orientations.

h(x, y) = K exp( −R1^2 / (2 σx^2) − R2^2 / (2 σy^2) ) exp( j 2π F0 R1 )    (7)

where R1 = (x − μx) cos θ + (y − μy) sin θ and R2 = −(x − μx) sin θ + (y − μy) cos θ. The other parameters, i.e. μx, μy, σx, σy, denote the means and standard deviations along the corresponding x and y directions respectively, and K = 1 / (2π σx σy). The frequency F0 and the direction θ define the sinusoid function in polar coordinates.

In this work, our first feature, the Histogram of Gabor (HoG), is defined as the histogram of a 2D Gabor feature constructed using a Gabor filter defined as in Eq. (7) at different frequencies F0 and orientations θ. Given a testing image, its corresponding HoG feature is computed in four steps as follows:

• Apply the 2D Gabor filter at different frequencies and orientations to obtain a set of Gabor filtering images

Fig. 9. An illustration where the landmarking points along the face boundary are not always well aligned.

Fig. 10. Some examples of superpixels grouping.

• Compute the histogram of each filtering image
• Normalize each histogram using the l2 norm
• Concatenate the histograms of all Gabor filtering images

Given an image, a demonstration of the HoG feature in four different ROIs, i.e. brow (in blue), cheek (in cyan), moustache (in red) and beard (in green), is shown in Fig. 6.

The second type of feature proposed in this paper, the Histogram of Oriented Gradient of Gabor (HOGG), is also extracted as a robust facial hair feature. Given an image, the 2D Gabor filter is first applied to obtain the Gabor features. Both the phase and the magnitude of the Gabor filtered image are then taken into account in our method. The HOGG feature is computed in four steps as follows,

• Apply the 2D Gabor filter at different frequencies and orientations to obtain a set of Gabor filtering images
• For each Gabor filtered image:
  – Divide the image into small blocks (cells)
  – Compute the phase and the magnitude of each block
  – Calculate the histogram of edge orientations of each block, where the gradient direction is based on the phase image and the gradient magnitude is based on the magnitude image
  – Represent the histogram of orientations of the Gabor filtering image as the set of these block histograms
• Concatenate the histograms of all Gabor filtering images
• Normalize the set of histograms using l2-norm normalization.

A demonstration of the HOGG feature in the four different ROIs, i.e. brow (in blue), cheek (in cyan), moustache (in red) and beard (in green), can be seen in Fig. 7.

3.2. Aggregate grouping

In our proposed algorithm, facial hair is simultaneously detected and segmented within a ROI which is decided by the landmarker MASM. As shown in Fig. 8, for instance, the landmarking points along the face boundary are not always well aligned, which causes the beard/moustache to be incorrectly detected and segmented. Furthermore, there are also cases where facial hair is not completely segmented from the facial image even when the landmarking points are correctly aligned. Fig. 9 illustrates the problems where a moustache is long and covers the mouth, and where a beard is longer than the chin.

Clearly, pixels within one superpixel are close in feature space and belong to one coherent region; moreover, superpixels within a coherent region are similar and tend to belong to one group. In order to overcome the limitations of incorrect alignment of the landmarker, we propose an aggregate grouping strategy which is able to connect neighboring superpixels that are close in feature space. Let s be a superpixel; the distance between s and its neighbors is defined by Eq. (8). In this equation, we choose the 8-neighborhood, and the difference is computed on three color channels, namely red (r), green (g) and blue (b), together with lightness (l) information. Let Cs be a feature vector which represents superpixel s, and denote its 8 neighbors by Csi, i = 1, ..., 8. The feature vector Cs is defined as in Eq. (9).

ds = Σ_{i=1}^{8} || Cs − Csi ||_{l2}    (8)

Table 1
Image distribution of the MBGC, FERET and Pinellas databases used in our experiments.

                          MBGC     FERET   Pinellas
Facial hair               2050     157     4782
Non-facial hair (male)    16,860   434     3296
Non-facial hair (female)  15,786   398     1922
Total                     34,696   989     10,000
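A minimal numpy sketch of the grouping test of Eqs. (8)–(9): each superpixel is summarized by its mean [r, g, b, l] vector (Eq. (9)), and neighboring superpixels are merged when they are close in that feature space. For simplicity we use the pairwise term of the Eq. (8) sum as the merge criterion (the full ds aggregates it over the 8 neighbors); the 4-connected adjacency computation and the threshold `tau` are our own assumptions.

```python
import numpy as np

def superpixel_features(rgb, lightness, labels):
    """C_s = [r_s, g_s, b_s, l_s]: channel means over each superpixel (Eq. (9))."""
    n = labels.max() + 1
    feats = np.zeros((n, 4))
    for c in range(3):
        feats[:, c] = np.bincount(labels.ravel(), weights=rgb[..., c].ravel(),
                                  minlength=n)
    feats[:, 3] = np.bincount(labels.ravel(), weights=lightness.ravel(), minlength=n)
    return feats / np.bincount(labels.ravel(), minlength=n)[:, None]

def adjacency(labels):
    """Pairs of superpixel labels that touch horizontally or vertically."""
    pairs = set()
    h = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], 1)
    v = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], 1)
    for a, b in np.vstack([h, v]):
        if a != b:
            pairs.add((min(a, b), max(a, b)))
    return pairs

def group(labels, feats, tau):
    """Union neighboring superpixels whose feature distance is below tau."""
    parent = list(range(len(feats)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for a, b in adjacency(labels):
        if np.linalg.norm(feats[a] - feats[b]) < tau:
            parent[find(a)] = find(b)
    return np.array([find(i) for i in range(len(feats))])[labels]
```

On a label map produced by SLIC, `group` returns a relabeled map in which superpixels belonging to one coherent region, e.g. a long beard crossing the cropped ROI boundary, share a single group label.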

Fig. 11. An illustration of the performance of our algorithm under different illuminations on the same subject.

Fig. 12. An illustration of the performance of our algorithm under different illuminations on different subjects.

Fig. 13. Some examples of our segmentation performance in the case of near-clear shaving and sparse hair.

Cs = [rs, gs, bs, ls]    (9)

where rs is computed as rs = (1/|s|) Σ_{p∈s} r_p, and gs, bs and ls are defined similarly. We apply this procedure at multiple scales of the image. As shown in Fig. 10, superpixels which are close in the feature space tend to belong to a coherent region. Thanks to the use of superpixels, which occupy consecutive image areas, these cues enforce local smoothness.

4. Datasets and results

We conduct experiments on large and diverse databases to evaluate our proposed facial hair detection and segmentation algorithm. The images used for our experimental results are from the NIST Multiple Biometric Grand Challenge - 2008 (MBGC) still face challenge database, which contains 34,696 still frontal images of 810 subjects with varying facial expressions and illumination conditions. There are also 989 color images of 989 subjects under different illuminations from the NIST Color Facial Recognition Technology (FERET) database [5]. In addition, we test our proposed algorithm on 10,000 images with JPEG artifacts from the first folder, DIR-000, of the Pinellas County database. The detailed distribution of the images used in our experiments is shown in Table 1 for MBGC, FERET and Pinellas. Each database is divided into three categories: having facial hair (either moustache or beard), male without facial hair, and female without facial hair.

The performance of our semi self-training simultaneous facial hair detection and segmentation is considered under many challenging conditions as follows:

• Illumination: As illustrated in Fig. 11, when the subject is captured under different lighting conditions, the beard/moustache can still be detected and segmented. More examples of our

Fig. 14. Some examples of the case where facial hair in brows is different from the one in beard/moustache region.

Fig. 15. Some examples of the cases where beard/moustache can be shown in different colors.

Fig. 16. Some examples of the case where facial hair in brows is different from the one in beard/moustache region.

performance on different databases, with different subjects under many environmental conditions, are given in Fig. 12.
• Near-clear shaving and sparse beard/moustache: Different from most previous works, which are able to detect a beard/moustache in the case of near-clear shaving or a sparse beard/moustache but cannot segment it, our proposed system is able to detect and segment both long facial hair and short facial hair, as shown in Fig. 13.
• Facial hair in the brows differs from that in the beard/moustache region: It is true that both brows and beard/moustache contain high frequency information; however, they are usually different in length (long facial hair in the brows and a near-shaved beard/moustache), color (dark brows and a white beard/moustache), density (sparse brows and a dense beard/moustache), etc., as shown in the first row of Fig. 14. In this situation, the self-training approach [1] cannot segment the beard/moustache, whereas the performance of our proposed algorithm is shown in the second row of Fig. 14.
• Different colors of beard/moustache: It is not always true that the beard/moustache is darker than the skin, because many people wear a light beard/moustache, as shown in the first row of Fig. 15. In addition, the beard/moustache of the same person can be presented in different colors, for example a gray and white beard/moustache or a blonde beard/moustache. Using high frequency information to emphasize facial hair can help to improve the segmentation performance, as shown in the second row of Fig. 15.
• Long beard/moustache: A long beard/moustache is usually incompletely segmented when using the landmarker alone. Applying the proposed aggregate grouping, the problem of a long beard/moustache can be overcome, as shown in Fig. 16; however, a long beard and neck hair are easily confused, which causes the problem of over-segmentation, also shown in Fig. 16.

The performance of our proposed facial hair detection algorithm, evaluated on the three databases in Table 1, is provided in Fig. 17. The accuracy is computed, for each category, as the ratio of the number of correct detections to the sum of the numbers of correct and incorrect detections. Compared with [1], our detection performance obtains better accuracy on the same dataset. To provide quantitative segmentation results, we used the same images from MBGC and FERET that were used in [2] and [1] to conduct the experiment. The average F-measure of our proposed semi self-training simultaneous beard/moustache detection and segmentation is 99.0%, whereas Le et al. [1] obtained 98.5% and Le et al. [2] achieved 98.8%.

Fig. 17. Accuracy of our proposed facial hair detection.

We tested our approach on the MBGC, FERET and Pinellas databases on a CPU with an Intel Core i7 processor, 2.93 GHz, 8 GB of RAM and Matlab 2011a; the average time consumption in seconds of the proposed algorithm on the different databases is reported in Table 2.

Table 2
Average time consumption in seconds of the proposed algorithm on different databases.

           Superpixel generation   Feature extraction   Classification   Aggregate grouping
MBGC       1.2                     1.2                  4.2              1.9
FERET      2.3                     0.9                  2.1              3.7
PINELLAS   1.6                     0.8                  0.5              2.1

The capability of segmenting a near-clear-shaved beard/moustache causes over-segmentation when people wear both a long and a near-shaved beard. This may be a problem if we only care about the long facial hair, which is more pronounced. However, in biometrics and forensics research we would like to study the invariant feature; that is, we want to know everything about the facial hair of the beard/moustache no matter how a subject tries to change or shave it. In addition, the fact that our proposed algorithm is able to segment a long beard brings another problem of mis-segmentation when people have hair on the neck. Our proposed beard/moustache detection system also sometimes fails in the presence of shadows and illumination artifacts. In these cases, shadow makes the beard/moustache regions and the eyebrows look alike even when no facial hair exists. Some examples are given in the first row of Fig. 18, the superpixel grouping results are given in the second row, and the beard/moustache segmentation results are shown in the third row.

5. Conclusions

In this paper, we have proposed a fast, robust, fully automatic and semi self-training simultaneous beard/moustache detection and segmentation algorithm. Our algorithm consists of four stages, namely, superpixel generation, feature extraction, classification and refinement. A feature vector composed of HoG and HOGG at different directions and frequencies is extracted from both the bounding box of the superpixel and the superpixel foreground. The classifier is built from Random Ferns and Support Vector Machines with two models: a self-trained model and a pre-trained model. The self-trained model is built from the testing image itself and is applied first to assign a labeling score to each superpixel. The pre-trained model is built from training data and is only used if the confidence of the labeling by the self-trained model is low. In order to use the landmarker flexibly, we proposed an aggregate searching strategy. The proposed simultaneous beard/moustache detection and segmentation has been considered under different challenging conditions, such as images captured under strong illumination, a near-clear-shaved beard/moustache, a sparse beard/moustache, a long beard/moustache, the beard/moustache of the same person presented in incoherent regions, and facial hair properties in the brows that are different from those of the beard/moustache. Our algorithms have been evaluated on the entire MBGC and FERET databases and a large subset of the Pinellas database, and were found to obtain highly accurate detection and segmentation results with small time consumption.

References

[1] T.H.N. Le, K. Luu, M. Savvides, Fast and robust self-training beard/moustache detection and segmentation, International Conference on Biometrics (ICB), 2015, pp. 507–512.
[2] T.H.N. Le, K. Luu, K. Seshadri, M. Savvides, SparCLeS: Dynamic L1 sparse classifiers with level sets for robust beard/moustache detection and segmentation, IEEE Trans. Image Process. (TIP) 22 (8) (2013) 3097–3107.
[3] M.H. Nguyen, J.F. Lalonde, A.A. Efros, F. de la Torre, Image-based shaving, Comput. Graphics Forum J. (Eurographics 2008) 27 (2) (2008) 627–635.
[4] J.S. Pierrard, Skin Segmentation for Robust Face Image Analysis, PhD Thesis, University of Basel, Faculty of Science, 2008.
[5] P.J. Phillips, H. Moon, P.J. Rauss, S. Rizvi, The FERET evaluation methodology for face-recognition algorithms, IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 22 (10) (2000) 1090–1104.
[6] T.H.N. Le, K. Luu, K. Seshadri, M. Savvides, Beard and mustache segmentation using sparse classifiers on self-quotient images, IEEE Int. Conf. Image Process. (ICIP) (2012) 165–168.
[7] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, S. Susstrunk, SLIC Superpixels, EPFL Technical Report 149300, 2010.

Fig. 18. Some examples when our system sometimes fails in the presence of shadows and illumination artifacts. The first row is original images under different illuminations. The second row is superpixels grouping results. The third row is beard/moustache segmenting results.

[8] T.F. Cootes, C.J. Taylor, D.H. Cooper, J. Graham, Active shape models – their training and application, Comput. Vis. Image Underst. 61 (1) (1995) 38–59.
[9] K. Seshadri, M. Savvides, Robust modified active shape model for automatic facial landmark annotation of frontal faces, Biometrics: Theory, Applications and Systems (BTAS), 2009, pp. 319–326.
[10] R. Gross, I. Matthews, J. Cohn, T. Kanade, S. Baker, Multi-PIE, Image Vis. Comput. 28 (5) (2010) 807–813.
[11] M. Ozuysal, P. Fua, V. Lepetit, Fast keypoint recognition in ten lines of code, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007, pp. 1–8.
[12] M. Ozuysal, M. Calonder, V. Lepetit, P. Fua, Fast keypoint recognition using random ferns, IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 32 (3) (2010) 448–461.