Face Detection Using Locally Linear Embedding
By Samuel Kadoury

Department of Electrical and Computer Engineering
McGill University, Montreal, Canada
November 2005

A thesis submitted to McGill University in partial fulfillment of the requirements of the Degree of Master of Engineering

© Samuel Kadoury 2005

Library and Archives Canada
Published Heritage Branch
395 Wellington Street, Ottawa ON K1A 0N4, Canada

Your file ISBN: 978-0-494-24973-4
Our file ISBN: 978-0-494-24973-4

NOTICE: The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.

The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.
While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.

Abstract

Human face detection in gray-scale images has been researched extensively over the past decade, due to the recent emergence of applications such as security access control, visual surveillance and content-based information retrieval. However, this problem remains challenging because faces are non-rigid objects that have a high degree of variability in size, shape, color and texture. Indeed, few of the proposed face detection methods have been analyzed for performance under different conditions, such as head rotation, illumination, facial expression, occlusion and aging.

Nowadays, most face detection methods are based upon statistical and learning strategies. Many of these appearance-based methods tend to increase data complexity by mapping the data onto a higher-dimensional space in order to extract the predominant features; this, however, often requires much more computational time. A novel technique that is gaining in popularity, known as Locally Linear Embedding (LLE), adopts a different approach to the problem by applying dimensionality reduction to the data for learning and classification. Proposed by Roweis and Saul, the objective of this method is to determine a locally linear fit, so that each data point can be represented by a linear combination of its closest neighbors.

The first objective of the current research is to apply the LLE algorithm to 2D facial images, so as to obtain their representation in a sub-space under the unfavorable conditions stated above. The low-dimensional data will then be used to train a Support Vector Machine (SVM) to classify images as being face or non-face.
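The locally linear fit described above can be illustrated with a minimal sketch, assuming the standard two-step formulation of Roweis and Saul: first solve for weights that reconstruct each point from its nearest neighbors, then find low-dimensional coordinates that preserve those weights. This is not the LLE software used in the thesis; the function name `lle`, the parameters `n_neighbors`, `n_components` and `reg`, and the random stand-in data are assumptions made for illustration.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Illustrative LLE sketch. X has shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared distances; a point is never its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 1: weights that best reconstruct each point from its neighbors,
    # constrained so that every row of W sums to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]          # neighbors centered on x_i
        C = Z @ Z.T                         # local Gram matrix
        # Regularize so the linear system is always solvable (assumed choice).
        C += reg * max(np.trace(C), 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 2: coordinates that preserve the weights are the bottom
    # eigenvectors of M = (I - W)^T (I - W), skipping the first one.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    Y = eigvecs[:, 1:n_components + 1]
    return W, Y

# Random vectors standing in for 40 flattened image patches (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
W, Y = lle(X, n_neighbors=6, n_components=2)
print(W.shape, Y.shape)
```

In a pipeline like the one outlined above, the rows of `Y` would then serve as the low-dimensional inputs to a face/non-face classifier such as an SVM. Training that classifier, and mapping previously unseen images into the embedding, are outside the scope of this sketch.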
For this research, six different databases of cropped facial images, corresponding to variations in head rotation, illumination, facial expression, occlusion and aging, were built to train and test the classifiers. The second objective is to evaluate the feasibility of using the combined efficacy of the six SVM classifiers in a two-stage face detection approach. Experimental results obtained with image databases demonstrated that the performance of the proposed method was similar to, and sometimes better than, that of other face detection methods, introducing a viable and accurate alternative to previously existing techniques.

Sommaire

Face detection in images has been a very active area of research in recent years. This is due in particular to the emergence of applications such as authentication, recognition and the search for faces in databases. However, few of the proposed face detection methods have had their performance analyzed under variable conditions such as face position, illumination, facial expression, occlusion or age.

Most new detection methods are now based on statistical learning. However, many of these methods seek to increase the complexity of the supplied information in order to recover the predominant features, which risks making the task more burdensome. A new technique, called Locally Linear Embedding (LLE), takes the opposite approach to this problem by reducing the complexity of the data for the purposes of learning and classification. Proposed by Roweis and Saul, the objective of this method is to select an approximately piecewise-linear domain, so that each point can be represented by a linear combination of its nearest neighbors.

The first objective of this research is to apply the LLE technique to face images in order to obtain their representations in a simplified sub-space, under the unfavorable conditions mentioned above. These features will then be used to train Support Vector Machines (SVM) to classify images as faces or non-faces. For this project, six databases of facial images, corresponding to variations in face position, illumination, facial expression, occlusion and age, were used. The second objective is to evaluate the feasibility of using a combination of six classifiers in a two-stage process for face detection. Experiments carried out on image databases show that the proposed detection method offers results similar, and sometimes even superior, to those of other detection methods, thus indicating a reliable and accurate method.

Acknowledgments

First and foremost, I would like to thank my supervisor, Professor Martin D. Levine, for having proposed such a fascinating and meaningful research project, for his guidance and patience throughout the preparation of this thesis, and for relating to me his vast experience in the field of computer vision.

But I would be remiss not to thank many others. Hence, I would like to express my gratitude to several professors from McGill University, the University of Montreal and École Polytechnique, for having helped me to understand the various aspects and challenges of computer vision. They include Prof. Doina Precup for teaching me about machine learning intelligence and probabilistic reasoning; Prof. Richard Rose for automatic speech recognition perception; Prof. Martin D. Levine for image processing; Prof. Hannah Michalska for optimization algorithms; Prof. Benoit Godbout for computer graphics; and Prof. James Clark for statistical computer vision.
I would like to thank Prof. Carl-Éric Aubin and Prof. Farida Cheriet for having accommodated my working part-time during my thesis, and for having provided me the opportunity to pursue research activities with their group. The research itself was supported by Le Fonds Québécois de la Recherche sur la Nature et les Technologies.

All through the course of this thesis, my interactions with my colleagues have been very beneficial. I would like to thank Gurman Singh Gill, Donovan Parks, Jean-Philippe Gravel, and especially Hakirat Shambi, all of whom have given great suggestions and tips towards solving my problems, as well as developing the LLE software that was fundamental to my project. Thank you again, so much, for your help and cooperation.

Finally, and most importantly, I dedicate this thesis to my wonderful wife, Pascale, and to my loving parents, Morris and Dominique, for having supported and encouraged me throughout my Master's. They have always shown me, by way of example, the meanings of perseverance and patience; and, for that, I am extremely grateful.

To my wife and parents.

Table of Contents

Abstract
Sommaire
Acknowledgments
List of Figures
List of Tables
Chapter