Son Lam Phung, Abdesselam Bouzerdoum, and Douglas Chai
Visual Information Processing Research Group
Edith Cowan University, Western Australia
{s.phung, a.bouzerdoum, d.chai}@ecu.edu.au

Abstract

This paper presents a new human skin color model in YCbCr color space and its application to human face detection. Skin colors are modeled by a set of three Gaussian clusters, each of which is characterized by a centroid and a covariance matrix. The centroids and covariance matrices are estimated from a large set of training samples after a k-means clustering process. Pixels in a color input image can be classified into skin or non-skin based on the Mahalanobis distances to the three clusters. Efficient post-processing techniques, namely noise removal, shape criteria, elliptic curve fitting and face/non-face classification, are proposed in order to further refine skin segmentation results for the purpose of face detection.

1. Introduction

Face detection, which aims to detect the presence and subsequently the position of human faces in an image, is often the first important step in automated facial image analysis. Results of face detection enable tasks such as face recognition and facial expression analysis to be performed on focused image regions. The major challenge in face detection is to cope with a wide range of variations in the human face pattern caused by factors such as different lighting, face orientation, face size, facial expression and ethnicity. The presence of a complex background or extra facial features such as glasses, beard, and moustache also adds to the complexity of the problem.

Several face detection algorithms, including those that use neural networks [4, 10], support vector machines [1, 8], mixtures of factor analyzers [13] and wavelets [5], have been proposed. Interested readers are referred to two comprehensive reviews on face detection published recently by Yang et al. [14], and by Hjelmas and Low [6]. In recent years, face detection in color images through the detection of skin colored regions has gained special attention. This approach not only enables fast localization of potential facial regions but also proves to be highly robust to geometric variations in the face patterns. However, the success of this approach depends heavily on the accuracy and robustness of the human skin color model.

Many existing human skin color models operate only on chrominance planes (e.g. Cb-Cr [2, 3] or H-S [11]). They are based on the observation that the chrominance of different skin colors (e.g. white, black, yellow and brown) shares a marked similarity, which can be used to distinguish skin colors from non-skin colors. These models work reasonably well under medium levels of lighting. However, as shown in Fig. 1, for low and high luminance Y the decision boundary between skin and non-skin in the chrominance plane is reduced. Therefore, a significant number of false alarms will occur if only one chrominance decision boundary is used for all levels of luminance Y. To address this issue, Hsu et al. [7] suggested a non-linear transform to the YCbCr color space that makes the skin color cluster in the chrominance plane luma-independent. Garcia and Tziritas [5] used a set of bounding planes to approximate the skin clusters in YCbCr and HSV spaces.

In this paper, we present a skin color-based face detection algorithm that employs a human skin color model, which takes into account the luminance Y in classifying skin and non-skin pixels. In our approach, the distribution of human skin colors in YCbCr space is modeled with three Gaussian clusters that correspond approximately to three levels of luminance: low, medium and high. The rest of the paper is organized as follows. The construction of the human skin color model in YCbCr color space is described in Section 2. The various stages of our face detection algorithm are presented in Section 3. Experimental results and discussions are provided in Section 4.
Conclusions are given in Section 5.

0-7803-7622-6/02/$17.00 ©2002 IEEE, IEEE ICIP 2002

2. Human skin color model in YCbCr color space

Skin colors form a concentrated cluster in YCbCr space and this cluster has a highly irregular boundary (Fig. 1). We approximate this cluster using three 3-D Gaussians.

Fig. 1. Skin colors in YCbCr color space (axes: Y, Cb, Cr). The decision boundary on the Cb-Cr plane is large for medium Y and is reduced for low or high Y.

Let $E = \{x_1, x_2, \ldots, x_N\}$ be an ensemble of skin color samples, where $x_i = [Y_i, Cb_i, Cr_i]^T$.

Step 1: Assign the samples to k (k = 3 in this case) initial clusters $E_1, E_2, \ldots, E_k$:

$$E_i = \{\, x_j \mid Y_{min} + (i-1)\,\Delta Y \le Y_j < Y_{min} + i\,\Delta Y \,\}, \tag{1}$$

where $i = 1, \ldots, k$, $[Y_{min}, Y_{max}]$ is the range of Y in the YCbCr color specification and $\Delta Y = (Y_{max} - Y_{min})/k$ is the interval.

Step 2: Compute the centroid and covariance matrix for each cluster ($i = 1, \ldots, k$):

$$\mu_i = \frac{1}{N_i} \sum_{x_j \in E_i} x_j, \tag{2}$$

$$\Sigma_i = \frac{1}{N_i} \sum_{x_j \in E_i} (x_j - \mu_i)(x_j - \mu_i)^T. \tag{3}$$

$N_i$ is the size of cluster $E_i$.

Step 3: For each sample $x_j$ ($j = 1, \ldots, N$) compute the k Mahalanobis distances to the k clusters ($i = 1, \ldots, k$):

$$M(x_j, \mu_i, \Sigma_i) = (x_j - \mu_i)^T \Sigma_i^{-1} (x_j - \mu_i). \tag{4}$$

Reassign the sample $x_j$ to the cluster i that corresponds to the smallest distance:

$$x_j \in E_i \quad \text{where} \quad i = \arg\min_i M(x_j, \mu_i, \Sigma_i). \tag{5}$$

Step 4: Repeat steps 2-3 until a maximum allowable number of iterations is reached or there is no further change in the clusters. Store the cluster centroids and covariance matrices.

Let $x = [Y, Cb, Cr]^T$ be a pixel in a color input image. The classification rule is as follows: x is skin color if it satisfies the following two tests:

(i) Its projection on the Cb-Cr plane is inside a predetermined rectangle:

$$Cb \in R_{Cb} \quad \text{and} \quad Cr \in R_{Cr}, \tag{6}$$

where $R_{Cb} = [75, 135]$ and $R_{Cr} = [130, 180]$. The ranges $R_{Cb}$ and $R_{Cr}$, which are found experimentally, are used to quickly eliminate non-skin colors.

(ii) The minimum Mahalanobis distance from x to the clusters is below a threshold d:

$$\min_i \{ M(x, \mu_i, \Sigma_i) \} < d. \tag{7}$$

3. Face detection algorithm

Let $X = [x_{ij}]_{W \times H}$ be the input color image. The output of skin color detection is a binary map $B = [b_{ij}]_{W \times H}$:

$$b_{ij} = \begin{cases} 1 & \text{if } x_{ij} \text{ is skin color} \\ 0 & \text{otherwise.} \end{cases} \tag{8}$$

Each set of connected 1's in B is called an object, which is a potential face candidate. However, even with highly accurate skin color detection, two sources of errors remain to be addressed: (i) background pixels (i.e. non-face regions such as hands or scenery) can have skin colors, and this leads to false alarms; (ii) some facial regions (e.g. eyes and mouths) do not have skin colors, and this leads to false rejections. In the following discussion, let $A_i$, $W_i$, $H_i$ and $P_i$ denote the area, the width and height of the bounding box, and the perimeter of object i in B, respectively.

Noise removal

The binary map B may contain noise, which tends to be scattered and to have relatively small area. This suggests the following noise removal technique: object i is removed if

$$\frac{A_i}{A_{max}} < \theta_{a1} \quad \text{and} \quad \frac{A_i}{A} < \theta_{a2}. \tag{9}$$

$A_{max}$ is the area of the largest object in B, $A = W \times H$ is the area of the input image, and $\theta_{a1}$ and $\theta_{a2}$ are two thresholds. Another useful technique is to remove, in each object, all cross sections (horizontal or vertical) that are relatively short compared to the longest cross sections of the object. Used in conjunction with the first technique, this will remove "spikes" connected to a facial region.

Shape criteria

The human face resembles an oval shape even under different views and has a relatively constrained aspect ratio. For each object in the binary map, two shape measures are defined:

$$\text{Circularity: } C_i = A_i / P_i^2, \tag{10}$$

$$\text{Aspect ratio: } R_i = H_i / W_i. \tag{11}$$

The object is kept only if:

$$C_i > \theta_c \quad \text{and} \quad R_i \in R_R, \tag{12}$$

where $\theta_c$ (= 0.05) is a threshold and $R_R$ (= [0.8, 2.2]) is a range of valid aspect ratios for a face.

Elliptic curve fitting

Because the human face has an ellipse-like shape, face candidates can be refined by finding the best-fit ellipse for each object in the binary map. This can be done, for example, using the Hough transform. We adopt the following technique for finding the best-fit ellipse.

Let $\{(x_j, y_j)\}$, $j = 1, \ldots, N$, be the set of pixel coordinates for object i in B. We first compute the means $\mu_x$ and $\mu_y$, the variances $\delta_x^2$ and $\delta_y^2$, and the covariance $\delta_{xy}$ for this set. The best-fit ellipse has its center at $(\mu_x, \mu_y)$ and the major axis at an

Face/non-face classification

Step 2: The face vector $v$ is projected onto the face subspace spanned by a set of d eigenfaces (d = 20), and a feature vector $z$ is extracted:

$$z = U^T (v - v_m), \tag{16}$$

where the columns of $U$ are the eigenfaces. The set of eigenfaces is found by performing principal component analysis (PCA) [12] on a large set of face vectors $v$; $v_m$ is the average face vector. This step is essentially a dimension-reduction step.

Step 3: Each feature vector $z$ in $\mathbb{R}^d$ is classified as face or non-face by a distribution-based method, which models the class-conditional densities of face $p(z \mid \text{face})$ and non-face $p(z \mid \text{non-face})$ in $\mathbb{R}^d$ as two mixtures of Gaussians:

$$p(z \mid \text{face}) = \sum_{i=1}^{G_f} \pi_{f,i}\, g(z; \mu_{f,i}, \Sigma_{f,i}) \quad \text{and} \tag{17}$$

$$p(z \mid \text{non-face}) = \sum_{i=1}^{G_{nf}} \pi_{nf,i}\, g(z; \mu_{nf,i}, \Sigma_{nf,i}). \tag{18}$$

The subscripts f and nf denote face and non-face; $\pi_i$ is a mixing factor and G is the number of Gaussian components in each mixture. Each Gaussian component is governed by a Gaussian function:

$$g(z; \mu, \Sigma) = (2\pi)^{-d/2}\, |\Sigma|^{-1/2} \exp\{ -\tfrac{1}{2} (z - \mu)^T \Sigma^{-1} (z - \mu) \}, \tag{19}$$

where $\mu$ and $\Sigma$ are the mean and covariance matrix.
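Steps 1 and 2 of the clustering procedure in Section 2 (Eqs. 1 to 3) can be illustrated with a minimal sketch in plain Python. The function names are hypothetical and not from the paper. The sketch assumes each luminance band receives at least one sample, places a sample with Y equal to the maximum in the last band (Eq. 1 uses a strict upper bound), and uses a 1/N_i normalisation for the covariance.

```python
def initial_clusters(samples, k=3):
    """Step 1 (Eq. 1): split (Y, Cb, Cr) samples into k equal-width luminance bands."""
    ys = [s[0] for s in samples]
    y_min, y_max = min(ys), max(ys)
    delta = (y_max - y_min) / k  # interval width, Delta Y
    clusters = [[] for _ in range(k)]
    for s in samples:
        if delta > 0:
            # Assumption: a sample with Y == y_max goes into the last band.
            i = min(int((s[0] - y_min) / delta), k - 1)
        else:
            i = 0  # all samples share the same luminance
        clusters[i].append(s)
    return clusters


def centroid(cluster):
    """Step 2 (Eq. 2): component-wise mean of a non-empty cluster."""
    n = len(cluster)
    return [sum(s[c] for s in cluster) / n for c in range(3)]


def covariance(cluster, mu):
    """Step 2 (Eq. 3): 3x3 covariance matrix with 1/N_i normalisation."""
    n = len(cluster)
    return [
        [sum((s[a] - mu[a]) * (s[b] - mu[b]) for s in cluster) / n
         for b in range(3)]
        for a in range(3)
    ]
```

Steps 3 and 4 would then alternate between reassigning samples by Mahalanobis distance and recomputing these statistics until the clusters stop changing.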
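The two-test skin classification rule of Section 2 (Eqs. 6 and 7) might be sketched as follows. The Cb and Cr ranges are those stated in the text. The cluster parameters in `EXAMPLE_CLUSTERS` and the threshold used with them are made-up values for illustration only, and the sketch assumes the inverse covariance matrices have been computed beforehand.

```python
R_CB = (75, 135)   # Cb range from Eq. 6
R_CR = (130, 180)  # Cr range from Eq. 6


def mahalanobis(x, mu, cov_inv):
    """Eq. 4: (x - mu)^T cov_inv (x - mu) for 3-element vectors."""
    d = [x[a] - mu[a] for a in range(3)]
    return sum(d[a] * cov_inv[a][b] * d[b]
               for a in range(3) for b in range(3))


def is_skin(x, clusters, threshold):
    """Classify a (Y, Cb, Cr) pixel.

    clusters:  list of (mu, cov_inv) pairs, one per Gaussian cluster
    threshold: the distance threshold d in Eq. 7
    """
    _, cb, cr = x
    # Test (i): quick rejection using the Cb-Cr rectangle.
    if not (R_CB[0] <= cb <= R_CB[1] and R_CR[0] <= cr <= R_CR[1]):
        return False
    # Test (ii): minimum Mahalanobis distance below the threshold.
    return min(mahalanobis(x, mu, ci) for mu, ci in clusters) < threshold


# Hypothetical single cluster with a diagonal inverse covariance (not from the paper).
EXAMPLE_CLUSTERS = [
    ((80.0, 110.0, 150.0),
     ((0.0025, 0.0, 0.0), (0.0, 0.01, 0.0), (0.0, 0.0, 0.01))),
]
```

Running the rectangle test first means the more expensive distance computation is only performed for pixels that are plausible skin candidates.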
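The area-based noise removal rule (Eq. 9) and the shape criteria (Eqs. 10 to 12) of Section 3 can be sketched in the same way. The `Obj` record and the function names are hypothetical. The thresholds for Eq. 9 are not given in this part of the text, so they are left as required parameters, while the defaults for the shape test are the values quoted with Eq. 12.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    area: float       # A_i
    width: float      # W_i, bounding-box width
    height: float     # H_i, bounding-box height
    perimeter: float  # P_i


def remove_noise(objects, image_w, image_h, theta_a1, theta_a2):
    """Eq. 9: drop objects that are small relative to both the largest
    object and the whole image."""
    if not objects:
        return []
    a_max = max(o.area for o in objects)
    a_img = image_w * image_h
    return [o for o in objects
            if not (o.area / a_max < theta_a1 and o.area / a_img < theta_a2)]


def passes_shape(o, theta_c=0.05, r_range=(0.8, 2.2)):
    """Eqs. 10-12: keep an object only if it is compact enough and its
    aspect ratio lies in the valid range."""
    circularity = o.area / o.perimeter ** 2   # Eq. 10
    aspect = o.height / o.width               # Eq. 11
    return circularity > theta_c and r_range[0] <= aspect <= r_range[1]
```

For reference, a perfect disc has circularity 1/(4π), roughly 0.08, so the 0.05 threshold admits moderately irregular oval regions while rejecting elongated or ragged ones.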