Gait Recognition Using Image Self-Similarity
Total Page:16
File Type:pdf, Size:1020Kb
EURASIP Journal on Applied Signal Processing 2004:4, 572–585 c 2004 Hindawi Publishing Corporation Gait Recognition Using Image Self-Similarity Chiraz BenAbdelkader Identix Corporation, One Exchange Place, Jersey City, NJ 07302, USA Email: [email protected] Ross G. Cutler Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA Email: [email protected] Larry S. Davis Department of Computer Science, University of Maryland, College Park, MD 20742, USA Email: [email protected] Received 30 October 2002; Revised 18 May 2003 Gait is one of the few biometrics that can be measured at a distance, and is hence useful for passive surveillance as well as biometric applications. Gait recognition research is still at its infancy, however, and we have yet to solve the fundamental issue of finding gait features which at once have sufficient discrimination power and can be extracted robustly and accurately from low-resolution video. This paper describes a novel gait recognition technique based on the image self-similarity of a walking person. We contend that the similarity plot encodes a projection of gait dynamics. It is also correspondence-free, robust to segmentation noise, and works well with low-resolution video. The method is tested on multiple data sets of varying sizes and degrees of difficulty. Perfor- mance is best for fronto-parallel viewpoints, whereby a recognition rate of 98% is achieved for a data set of 6 people, and 70% for adatasetof54people. Keywords and phrases: gait recognition, human identification at a distance, human movement analysis, behavioral biometrics, pattern recognition. 1. INTRODUCTION (a well-advanced multidisciplinary field that spans kinesi- ology, physiotherapy, orthopedic surgery, ergonomics, etc.) 1.1. Motivation [6, 7, 8, 9, 10] that gait dynamics contain a signature that is Gait is a relatively new and emergent behavioral biometric characteristic of, and possibly unique to, each individual. [1, 2] that pertains to the use of an individual’s walking style From a biomechanics standpoint, human gait consists of (or “the way he walks”) to determine identity. Gait recogni- synchronized, integrated movements of hundreds of mus- tion is the term typically used in the computer vision com- cles and joints of the body. These movements follow the munity to refer to the automatic extraction of visual cues that same basic bipedal pattern for all humans, and yet vary characterize the motion of a walking person in video and is from one individual to another in certain details (such as used for identification purposes. Gait is particularly an at- their relative timing and magnitudes) as a function of their tractive modality for passive surveillance since, unlike most entire musculo-skeletal structure, that is, body mass, limb biometrics, it can be measured at a distance, hence not re- lengths, bone structure, and so forth. Because this struc- quiring interaction with or cooperation of the subject. How- ture is difficult to replicate, gait is believed to be unique to ever, gait features exhibit a high degree of intraperson vari- each individual and can be completely characterized by a ability, being dependent on various physiological, psycholog- few hundred kinematic parameters, namely, the angular ve- ical, and external factors such as footwear, clothing, surface locities and accelerations at certain joints and body land- of walking, mood, illness, fatigue, and so forth. The question marks [6, 7]. Achieving such a complete characterization au- then arises as to whether there is sufficientgaitvariabilitybe- tomatically from low-resolution video remains an open re- tween people that can discriminate them even in the presence search problem in computer vision. The difficulty lies in that of large variation within each individual. feature detection and tracking is error prone due to self- There is indeed strong evidence originating from psy- occlusions, insufficient texture, and so forth. This is why chophysical experiments [3, 4, 5] and gait analysis research computer-aided motion analysis systems still rely on special Gait Recognition Using Image Self-Similarity 573 wearable instruments, such as LED markers, and walking quality outdoor data set of 54 people, and a multiview data surfaces [9]. set of 12 people taken from 8 viewpoints. Luckily, we may not need to recover 3D kinematics for gait recognition after all. In Johansson’s early psychophysical 1.2. Assumptions experiments [3],humansubjectswereabletorecognizethe The method makes the following assumptions: type of movement solely by observing light bulbs attached (i) people walk with constant velocity for about 3–4 sec- to a few joints of the moving person. The experiments were onds; filmed in total darkness so that only the bulbs, a.k.a. moving (ii) people are located sufficiently far from the camera; light displays (MLDs), are visible. Similar experiments later (iii) the frame rate is greater than twice the frequency of the suggested that the identity of a familiar person (“a friend”) walking; [5], as well as the gender of the person [4], may be recogniz- (iv) the camera is stationary. able from their MLDs. While it is widely agreed that these ex- periments provide evidence about motion perception in hu- 1.3. Organization of the paper mans, there is no consensus on how the human visual system The rest of the paper is organized as follows. Section 2 dis- actually interprets this MLD-type stimuli. Two main theories cusses related work in the computer vision literature and exist: the first maintains that people recover the 3D struc- Section 3 describes the method in detail. We assess the per- ture of the moving object (person) and subsequently uses formance of the method on a number of different data sets it for recognition; the second theory states that motion in- in Section 4, and finally conclude in Section 5. formation is directly used for recognition, without structure recovery in the interim [11]. This seems to suggest that the raw spatiotemporal (XYT) patterns generated by the person’s 2. RELATED WORK motion in an MLD video encode information that is suffi- Interest in gait recognition is best evidenced by the near- cient to recognize their movement. exponential growth of the size of related literature over the In this paper, we describe a novel gait recognition past few years [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, technique that derives classification features directly from 26, 27, 28, 29, 30, 31, 32]. Gait recognition is generally related these XYT patterns. Specifically, it computes the image self- to human movement analysis methods that automatically similarity plot (SSP), defined as the correlation of all pairs of detect and/or track human motion in video for a variety of images in the sequence. Normalized feature vectors are ex- applications-surveillance, videoconferencing, man-machine tracted from the SSP and used for recognition. Related work interfaces, smart rooms, and so forth. For good surveys on has demonstrated the effective use of SSP’s in recognizing dif- this topic, see [11, 33, 34]. It is perhaps most closely asso- ferent types of biological periodic motions, such as those of ciated with the subset of methods that analyze whole-body humans and dogs, and applied the technique for human de- movement, for example, for human detection [12, 35, 36] tectioninvideo[12]. We use them here to classify the move- and activity recognition [37, 38, 39, 40]. ment patterns of different people. We contend that the SSP A common characteristic to all of these methods is that encodes a projection of planar gait dynamics and hence a they consist of two main stages: a feature extraction stage 2D signature of gait. Whether it contains sufficient discrim- in which motion information is derived from the image se- inant power for accurate recognition is what we set to deter- quence and organized into some compact form (or represen- mine. tation), and a recognition stage in which some standard pat- As in any pattern recognition problem, these methods tern classification technique is applied to the obtained mo- typically consist of two stages: a feature extraction stage that tion patterns, such as KNN, SVM, and HMM. We distin- derives motion information from the image sequence and or- guish two main classes of gait recognition methods: holis- ganizes it into some compact form (or representation), and tic [14, 15, 16, 17, 18, 19, 23, 24, 25, 28, 29, 30, 31, 32]and a recognition stage that applies some standard pattern clas- feature-based [20, 21, 22, 26, 27, 41, 42, 43, 44]. The holis- sification technique to the obtained motion patterns, such as tic versus feature-based dichotomy can also be regarded as K-nearest neighbor (KNN), support vector machines (SVM), global versus local, nonparametric versus parametric, and and hidden Markov models (HMM). In our view, the crux of pixel-based versus geometric. This dichotomy is certainly re- the gait recognition problem lies in perfecting the first stage. currentinpatternrecognitionproblemssuchasfacerecog- The challenge is in finding motion patterns that are suffi- nition [45, 46]. In the sequel, we describe and critique ex- ciently discriminant despite the wide range of natural vari- amples from both approaches, and relate them to our gait ability of gait, and that can be extracted reliably and con- recognition method. sistently from video. The method of this paper is designed with these two requirements in mind. It is based on the SSP 2.1. Holistic approach which is robust to segmentation noise and can be computed The holistic approach characterizes body movement by the correspondence-free from fairly low-resolution images. Al- statistics of the XYT patterns generated in the image se- though this method is view-dependent (since it is inher- quence by the walking person. Although typically these pat- ently appearance-based), this is circumvented via view-based terns have no direct physical meaning, intuitively they cap- recognition.