
ANRV346-NE31-18 ARI 17 March 2008 21:40

Mechanisms of Face Perception

Doris Y. Tsao¹ and Margaret S. Livingstone²

¹Centers for Advanced Imaging and Cognitive Sciences, Bremen University, D-28334 Bremen, Germany; email: [email protected]
²Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115; email: [email protected]

Annu. Rev. Neurosci. 2008. 31:411–37
The Annual Review of Neuroscience is online at neuro.annualreviews.org
This article’s doi: 10.1146/annurev.neuro.30.051606.094238
Copyright © 2008 by Annual Reviews. All rights reserved
0147-006X/08/0721-0411$20.00

Key Words
face processing, face cells, holistic processing, face recognition, face detection, temporal lobe

Abstract
Faces are among the most informative stimuli we ever perceive: Even a split-second glimpse of a person’s face tells us his identity, sex, mood, age, race, and direction of attention. The specialness of face processing is acknowledged in the artificial vision community, where contests for face-recognition algorithms abound. Neurological evidence strongly implicates a dedicated machinery for face processing in the human brain to explain the double dissociability of face- and object-recognition deficits. Furthermore, recent evidence shows that macaques too have specialized neural machinery for processing faces. Here we propose a unifying hypothesis, deduced from computational, neurological, fMRI, and single-unit experiments: that what makes face processing special is that it is gated by an obligatory detection process. We clarify this idea in concrete algorithmic terms and show how it can explain a variety of phenomena associated with face processing.

Contents
INTRODUCTION
    Detection
    Measurement and Categorization
COMPUTER VISION ALGORITHMS
    Detection
    Measurement
    Categorization
    Invariance
    Summary
HUMAN BEHAVIOR AND FUNCTIONAL IMAGING
    Norm-Based Coding
    Detection
    Holistic Processing of Faces
HUMAN FUNCTIONAL IMAGING
    Measurement and Categorization
    Invariance
    Summary
MONKEY fMRI AND SINGLE-UNIT PHYSIOLOGY
    Detection
    Holistic Processing of Faces
    Anatomical Specialization of Face Cells
    The Functional Significance of the Anatomical Localization of Face Processing
    Time Course of Feature-Combination Responses
    Norm-Based Coding
    Invariance
    Summary

INTRODUCTION

The central challenge of visual recognition is the same for both faces and objects: We must distinguish among often similar visual forms despite substantial changes in the image arising from changes in position, illumination, occlusion, etc. Although face identification is often singled out as demanding particular sensitivity to differences between objects sharing a common basic configuration, in fact such differences must be represented in the brain for both faces and nonface objects. Most humans can easily identify hundreds of faces (Diamond & Carey 1986), but even if one cannot recognize a hundred different bottles by name, one can certainly distinguish them in pairwise discrimination tasks. Furthermore, most of us can recognize tens of thousands of words at a glance, not letter by letter, a feat requiring expert detection of configural patterns of nonface stimuli. Thus, face perception is in many ways a microcosm of object recognition, and the solution to the particular problem of understanding face recognition will undoubtedly yield insights into the general problem of object recognition.

The system of face-selective regions in the human and macaque brain can be defined precisely using fMRI, so we can now approach this system hierarchically and physiologically to ask mechanistic questions about face processing at a level of detail previously unimaginable. Here we review what is known about face processing at each of Marr’s levels: computational theory, algorithm, and neural implementation.

Computer vision algorithms for face perception divide the process into three distinct steps. First, the presence of a face in a scene must be detected. Then the face must be measured to identify its distinguishing characteristics. Finally, these measurements must be used to categorize the face in terms of identity, gender, age, race, and expression.

Detection

The most basic aspect of face perception is simply detecting the presence of a face, which requires the extraction of features that it has in common with other faces. The effectiveness and ubiquity of the simple T-shaped schematic face (eye, eye, nose, mouth) suggest that face detection may be accomplished by a simple template-like process. Face detection and identification have opposing demands: The identification of individuals requires a fine-grained analysis to extract the ways in which each face differs from the others despite the fact that all faces share the same basic T-shaped configuration, whereas detection requires extracting what is common to all faces. A good detector should be poor at individual recognition and vice versa.

Another reason why detection and identification should be separate processes is that detection can act as a domain-specific filter, ensuring that precious resources for face recognition [e.g., privileged access to eye movement centers (Johnson et al. 1991)] are used only if the stimulus passes the threshold of being a face. Such domain-specific gating may be one reason for the anatomical segregation of face processing in primates (it is easier to gate cells that are grouped together). A further important benefit of preceding identification by detection is that detection automatically accomplishes face segmentation; i.e., it isolates the face from background clutter and can aid in aligning the face to a standard template. Many face-recognition algorithms require prior segmentation and alignment and will fail with nonuniform backgrounds or varying face sizes.

Measurement and Categorization

After a face has been detected, it must be measured in a way that allows for accurate, efficient identification. The measurement process must not be so coarse as to miss the subtle features that distinguish one face from another. On the other hand, it must output a set of values that can be efficiently compared with stored templates for identification. There is a zero-sum game between measurement and categorization: The more efficient the measurement, the easier the classification; conversely, less efficient measurement (e.g., a brute force tabulation of pixel gray values) makes the classification process more laborious.

COMPUTER VISION ALGORITHMS

A comprehensive review of computer algorithms for face recognition can be found in Zhao et al. (2003) and Shakhnarovich & Moghaddam (2004). Our goal here is to discuss algorithms that offer special insights into possible biological mechanisms.

Detection

How can a system determine if there is a face in an image, regardless of whose it is? An obvious approach is to perform template matching (e.g., search for a region containing two eyes, a mouth, and a nose, all inside an oval). In many artificial face-detection systems a template is swept across the image at multiple scales, and any part of the image that matches the template is scored as a face. This approach works, but it is slow.

To overcome this limitation, Viola & Jones (2004) introduced the use of a cascade of increasingly complex filters or feature detectors. Their reasoning was that the presence of a face can be ruled out most of the time with a very simple filter, thus avoiding the computational effort of doing fine-scale filtering on uninformative parts of the image. The first stage in their cascade consists of only two simple filters, each composed of a few rectangular light or dark regions (Figure 1a). Subsequent stages of filtering are performed only on regions scoring positive at every preceding stage. This cascade approach proved just as accurate as, but 10 times faster than, single-step face-detector algorithms.

Sinha’s face-detection algorithm (Sinha 2002a) is based on the observation that qualitative contrast relationships between different parts of a face are highly conserved, even under different lighting conditions (Figure 1b). Even though any single contrast relationship between two facial regions would be inadequate to detect a face, a set of such relationships could be adequate (because probabilities multiply). A subset of Sinha’s directed contrasts ([r2, r3] and [r4, r5]) are equivalent to the first stage of the Viola-Jones face detector.

Effective primitives for face detection can also be computed using an information theory approach by identifying fragments (subwindows) of face images that are maximally

[Figure 1: panels a and b. Only panel labels survive extraction: region labels r0 to r11 and numeric values embedded in the image.]