

Adaptive Motion Model for Human Tracking Using Particle Filter

Mohammad Hossein Ghaeminia(1), Amir Hossein Shabani(2), and Shahryar Baradaran Shokouhi(1)
(1) Iran Univ. of Science & Technology, Tehran, Iran
(2) University of Waterloo, ON, Canada
[email protected], [email protected], [email protected]

Abstract

This paper presents a novel approach to model the complex motion of humans using a probabilistic autoregressive moving average model. The parameters of the model are adaptively tuned during the course of tracking by utilizing the main varying components of the pdf of the target's acceleration and velocity. This motion model, along with a color histogram as the measurement model, has been incorporated into the particle filtering framework for human tracking. The proposed method is evaluated on the PETS benchmark, in which the targets have non-smooth motion and suddenly change their motion direction. Our method competes with the state-of-the-art techniques for human tracking in real-world scenarios.

1. Introduction

Human tracking is of interest in computer vision applications such as video surveillance, human-computer interaction, and human motion analysis [1,2]. Nonlinear motion, occlusion with the background (or with other moving objects), and illumination changes challenge any human tracking approach. Human motion is especially hard to model due to its underlying complex dynamics. The particle filter (PF) approximates Bayesian estimation in a way that allows nonlinear models to be incorporated to update the state of the object of interest. However, most existing human tracking methods that use the PF approximate the nonlinear human motion with a Markov model with fixed parameters. Obviously, the complex dynamics of human motion might require a higher-order model with adaptation throughout the tracking period.

In this paper, we propose a probabilistic autoregressive moving average (PARMA) model for modeling the nonlinear motion of humans. This model is periodically adapted by an efficient learning procedure (Figure 1). The core of learning the motion model is the parameter estimation, for which the sequence of velocity and acceleration is analyzed and modeled by a Gaussian mixture model (GMM). Non-negative matrix factorization is then used for dimensionality reduction to take care of high variations during abrupt changes [3]. Utilizing this adaptive motion model along with a color histogram as the measurement model in the PF framework provides an appropriate approach for human tracking in the real-world scenario of the PETS benchmark [4].

Figure 1. Probabilistic learning of ARMA for human motion modeling. Note that the main components of the pdf of velocity (and acceleration) are used for the parameter estimation.

The rest of this paper is organized as follows. Section 2 overviews the related works. Section 3 explains the particle filter and the probabilistic ARMA model. Section 4 presents the experimental results. Finally, Section 5 concludes the paper.

2. Related works

The autoregressive moving average (ARMA) model has been used for the analysis and modeling of time-varying data series [5, 6]. Perez et al. [7] used a second-order ARMA for motion modeling, with an ad-hoc model of dynamics for parameter learning.

Elnagar et al. [6] combined an autoregressive model with conditional maximum likelihood to predict the future positions and orientations of moving objects. Mikami et al. [8] used a memory-based PF to handle the nonlinearity of motion for face tracking; this approach exploits a stored history of the state for motion modeling.

In this paper, we propose an ARMA model for motion modeling whose coefficients are updated throughout the tracking. The variation of the velocity and acceleration of the target is modeled by Gaussian mixture models whose dimensionality is reduced by non-negative matrix factorization [3].

3. Human tracking using particle filter

The particle filter is an approximation of Bayesian sequential estimation using Monte Carlo sampling. Bayesian sequential estimation determines the target's state distribution in two steps: prediction and update. Consider that the history of the state s_t is denoted by S_{1:t-1} = (s_1, ..., s_{t-1}) and the previous observations by m_{1:t-1} = (m_1, ..., m_{t-1}). The prediction step (1) utilizes the motion model p(s_t | S_{1:t-1}) and the prior model p(s_{t-1} | m_{1:t-1}) to predict the probability of the current state, p(s_t | m_{1:t-1}). In the update step (2), the observation model p(m_t | s_t) weights the predicted probability to determine the posterior distribution of the target's state, p(s_t | m_{1:t}).

p(s_t | m_{1:t-1}) = \int p(s_t | S_{1:t-1}) p(s_{t-1} | m_{1:t-1}) ds_{t-1}        (1)

p(s_t | m_{1:t}) \propto p(m_t | s_t) p(s_t | m_{1:t-1})        (2)

Most dynamic models rely on the Markov assumption, p(s_t | S_{1:t-1}) = p(s_t | s_{t-1}), and represent this dynamic by a simple linear model:

s_t = A s_{t-1} + w_t,   w_t \sim N(0, \delta)        (3)

In this modeling, the current state is obtained by applying the motion model A to the previous state. A zero-mean Gaussian noise w_t is then added to model the motion model's uncertainty.
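As a concrete illustration of the prediction/update recursion in equations (1)-(3), the following sketch propagates particles with a linear motion model A and reweights them with a caller-supplied likelihood. It is a minimal sketch, not the authors' implementation; the helper names, the constant-velocity form of A, and the noise level are illustrative assumptions (the 300-particle setting follows Section 4).

```python
import numpy as np

def predict(particles, A, noise_std=1.0):
    """Prediction step, Eq. (1): propagate each particle with the linear model of Eq. (3)."""
    noise = np.random.normal(0.0, noise_std, size=particles.shape)
    return particles @ A.T + noise

def update(weights, particles, likelihood):
    """Update step, Eq. (2): reweight particles by the observation model p(m_t | s_t)."""
    weights = weights * likelihood(particles)
    weights = weights + 1e-12              # guard against all-zero likelihoods
    return weights / weights.sum()

def resample(particles, weights):
    """Resample particles according to their weights to avoid degeneracy."""
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Illustrative constant-velocity model on the state s_t = (x, y, vx, vy), 300 particles.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
particles = np.zeros((300, 4))
weights = np.full(300, 1.0 / 300)
particles = predict(particles, A)
weights = update(weights, particles, likelihood=lambda p: np.ones(len(p)))
particles, weights = resample(particles, weights)
```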

3.1. Autoregressive moving average in PF

Given a sequence of data, an autoregressive moving average (ARMA) model typically consists of two parts: an autoregressive (AR) part and a moving average (MA) part. An ARMA(p, q) model combines a p-order AR model and a q-order MA model:

s_t = \sum_{i=1}^{p} \varphi_i s_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i+1}        (4)

where \varphi_1, ..., \varphi_p are the AR parameters and \theta_1, ..., \theta_q are the weights of the zero-mean Gaussian noise terms \varepsilon. Equation (3) is therefore an ARMA(1,1) model. In contrast, we suggest a higher-order ARMA model with adaptive weights to better model the human's complex motion. This dynamic model has two main advantages. First, we keep the linear form of the state evolution while modeling nonlinear motions. Second, a higher-order AR model can better capture the complex nature of human motion in real situations.
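The one-step prediction of equation (4) can be sketched as follows. The storage convention (newest state first) and the example coefficient values are illustrative assumptions; the Section 4 setting p = 4, q = 1 is used only as an example.

```python
import numpy as np

def arma_predict(state_history, phi, theta, noise_std=1.0):
    """One-step ARMA(p, q) prediction of Eq. (4).

    state_history: past states, newest first, shape (at least p, state_dim)
    phi:           AR coefficients phi_1..phi_p
    theta:         MA coefficients theta_1..theta_q
    """
    p, q = len(phi), len(theta)
    dim = state_history.shape[1]
    ar_part = sum(phi[i] * state_history[i] for i in range(p))    # sum_i phi_i * s_{t-i}
    eps = np.random.normal(0.0, noise_std, size=(q, dim))         # zero-mean Gaussian terms
    ma_part = sum(theta[i] * eps[i] for i in range(q))            # sum_i theta_i * eps_{t-i+1}
    return ar_part + ma_part

# Example with p = 4, q = 1 on the state (x, y, vx, vy).
history = np.zeros((4, 4))
phi = np.array([0.4, 0.3, 0.2, 0.1])   # illustrative AR weights
theta = np.array([1.0])                # illustrative MA weight
s_next = arma_predict(history, phi, theta)
```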

3.2. Probabilistic ARMA model

We utilize a probabilistic approach for modeling the human motion with the ARMA model. The parameters of this model are computed indirectly, using the pdf of the target's state history. For this purpose, we choose s_t = (x_t, y_t, vx_t, vy_t) as the state of the moving person, where at frame t, X_t = (x_t, y_t) is the 2D position of the person and V_t = (vx_t, vy_t) is the velocity. We then compute the frame-to-frame velocity (ffv) distribution and the frame-to-frame acceleration (ffa) distribution during the learning period:

ffv_t = |\Delta X_t| = |X_t - X_{t-1}|,   ffa_t = |\Delta V_t| = |V_t - V_{t-1}|        (5)

where |·| denotes the absolute value. Note that we obtain two values of ffv_t and two values of ffa_t for each frame, one for the x movement and one for the y movement. The values of ffa_t and ffv_t over time show the "process of dynamics". The last T values, ffa_{1:T} = {ffa_{t-T+1}, ..., ffa_t} and ffv_{1:T} = {ffv_{t-T+1}, ..., ffv_t}, represent the distribution of the person's recent motion in terms of the pdf of the acceleration, p(ffa_{t+\Delta t} | ffa_{1:T}), and the pdf of the velocity, p(ffv_{t+\Delta t} | ffv_{1:T}), modeled by a mixture of K Gaussians [9]:

p(ffa_{t+\Delta t} | ffa_{1:T}) := p(ffa_{t+\Delta t}; ffa_{1:T}, \Sigma) = \sum_{i=1}^{K} \tilde{\mu}_i N(ffa_{t+\Delta t}; ffa_i, \Sigma)        (6-1)

p(ffv_{t+\Delta t} | ffv_{1:T}) := p(ffv_{t+\Delta t}; ffv_{1:T}, \Sigma) = \sum_{i=1}^{K} \tilde{\rho}_i N(ffv_{t+\Delta t}; ffv_i, \Sigma)        (6-2)

where \tilde{\mu}_i and \tilde{\rho}_i are the normalized weights of the Gaussians characterizing the motion dynamics:

\tilde{\mu}_i = ffa_i / \sum_{j=1}^{K} ffa_j   and   \tilde{\rho}_i = ffv_i / \sum_{j=1}^{K} ffv_j        (7)
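A minimal sketch of how the window statistics of equation (5) and the mixture of equations (6)-(7) can be evaluated: each observed ffa value in the window contributes one Gaussian component, weighted as in equation (7). The fixed variance and the example numbers are illustrative assumptions.

```python
import numpy as np

def ffv_ffa(positions):
    """Frame-to-frame velocity and acceleration magnitudes, Eq. (5)."""
    ffv = np.abs(np.diff(positions, axis=0))   # |X_t - X_{t-1}|
    ffa = np.abs(np.diff(ffv, axis=0))         # |V_t - V_{t-1}|
    return ffv, ffa

def mixture_pdf(x, samples, sigma2=2.0):
    """Evaluate p(x; samples, Sigma) = sum_i mu_i N(x; samples_i, Sigma), Eqs. (6)-(7)."""
    weights = samples / (samples.sum() + 1e-12)                    # normalized weights, Eq. (7)
    gauss = np.exp(-(x - samples) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
    return float(np.sum(weights * gauss))

# Example: x positions over the last few frames of a track.
x_positions = np.array([[10.0], [12.0], [15.0], [17.0], [20.0], [24.0]])
ffv_x, ffa_x = ffv_ffa(x_positions)
print(mixture_pdf(3.0, ffa_x.ravel()))         # pdf value for a candidate acceleration
```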

Figure 2 shows p(ffa; ffa_{1:T}, \Sigma) with \Sigma = 2.0.

Figure 2. Gaussian mixture model for the frame-to-frame acceleration distribution ffa in the x and y directions (the final GMM of ffa in each direction, plotted over the frame index).

A similar construction can be produced for the ffv_{1:T} distribution. Note that the periodic nature of the motion is reflected in the acceleration distribution. The alternation in the ffa distribution reflects the existence of opposite movements within a single motion type (e.g., during walking, the hands might move backwards while the torso moves forward).

Since the weights are positive, the weight vectors \mu^*_{1 \times K} = [\tilde{\mu}_i]_{i=1:K} and \rho^*_{1 \times K} = [\tilde{\rho}_i]_{i=1:K} are factorized by non-negative matrix factorization (8) to omit non-important weights [3]. The factorized weights are contained in W_{K \times p} and H_{p \times K} for \mu^*; a similar minimization is performed for \rho^*:

(\mu^*)^T \mu^* \xrightarrow{nmf} W H,   F(W, H) = \arg\min \| (\mu^*)^T \mu^* - W_{K \times p} H_{p \times K} \|        (8)

where \|\cdot\| is the matrix norm and \sum_{i,j=1}^{K} W_{ij} H_{ji} = 1. We choose the p most important values of W (p < K).
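A sketch of the weight-reduction step of equation (8) using scikit-learn's NMF on the outer product of the weight vector; ranking the original weights by their total contribution in W stands in for selecting the p most important values. The library choice and the selection rule are assumptions for illustration, not the authors' code.

```python
import numpy as np
from sklearn.decomposition import NMF

def reduce_weights(mu, p):
    """Factorize (mu*)^T mu* (K x K) into W_{Kxp} H_{pxK}, Eq. (8), and rank the K weights."""
    M = np.outer(mu, mu)                       # non-negative K x K matrix
    model = NMF(n_components=p, init="nndsvda", max_iter=500)
    W = model.fit_transform(M)                 # K x p
    H = model.components_                      # p x K
    keep = np.argsort(W.sum(axis=1))[-p:]      # indices of the p most important weights
    return W, H, keep

mu = np.array([0.05, 0.30, 0.10, 0.25, 0.20, 0.10])   # example normalized weights, K = 6
W, H, keep = reduce_weights(mu, p=4)
```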

The parameters of the model are then periodically updated based on the previous tracking results.

4. Experimental results

To verify the effectiveness of our approach, we performed experiments on video sequences from PETS 2009 (the S2-L1 and S2-L2 benchmarks) [4]. We track two people ("p1" and "p2") in the S2-L1 video and person "p3" (labeled "B" in the data set) in S2-L2. Both occlusion and abrupt changes in the motion direction occur. We initially learn the motion model (p = 4, q = 1) of persons "p1" and "p2" over T = 10 frames using the result of standard mean shift [10]. We then track these people for T1 = 50 frames using a particle filter (with 300 particles) with the learnt motion model and a simple kernelized color histogram as the measurement model. The procedure repeats by updating the motion model using the last T frames and tracking for another T1 frames. As the color information of person "p3" is not very informative as a measurement model, we consider p = 5 and decrease the tracking interval to T1 = 20.

We compare the proposed approach (PARMA-PF) with an approach that uses a second-order ARMA as the motion model in the PF framework (2ARMA-PF) [7], with a kernelized color histogram as the measurement model for both approaches.
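The measurement model is a kernelized color histogram compared through the Bhattacharyya distance (see Section 5). The sketch below omits the kernel weighting for brevity; the bin count, the color space, and the likelihood bandwidth sigma are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def color_histogram(patch, bins=16):
    """Normalized color histogram of an image patch (H x W x 3, uint8)."""
    hist, _ = np.histogramdd(patch.reshape(-1, 3).astype(float),
                             bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    hist = hist.ravel()
    return hist / (hist.sum() + 1e-12)

def bhattacharyya_likelihood(candidate, reference, sigma=0.2):
    """Likelihood p(m_t | s_t) derived from the Bhattacharyya distance between histograms."""
    bc = np.sum(np.sqrt(candidate * reference))      # Bhattacharyya coefficient
    dist = np.sqrt(max(1.0 - bc, 0.0))               # Bhattacharyya distance
    return np.exp(-dist ** 2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
ref = color_histogram(rng.integers(0, 256, size=(40, 20, 3), dtype=np.uint8))
cand = color_histogram(rng.integers(0, 256, size=(40, 20, 3), dtype=np.uint8))
print(bhattacharyya_likelihood(cand, ref))
```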

Figure 3 shows the trajectories of persons "p1", "p2", and "p3" obtained with the proposed PARMA-PF and the 2ARMA-PF methods, together with the ground truth. For quantitative evaluation, we use the average tracking accuracy (ATA) metric [11]:

ATA = \frac{1}{N} \sum_{t=1}^{N} \frac{|G_t \cap D_t|}{|G_t \cup D_t|}        (10)

where G_t denotes the ground-truth bounding box at frame t, D_t is the estimated bounding box at frame t, and N is the total number of frames. Table 1 presents the ATA values for the proposed approach and the 2ARMA-PF method [7]. Clearly, our approach gives much better results than the other method on both benchmarks.
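A small sketch of the ATA computation of equation (10) for axis-aligned boxes given as (x1, y1, x2, y2); the box format is an assumption for illustration.

```python
import numpy as np

def iou(box_a, box_b):
    """|A intersect B| / |A union B| for two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def ata(ground_truth_boxes, estimated_boxes):
    """Average tracking accuracy over N frames, Eq. (10)."""
    return float(np.mean([iou(g, d) for g, d in zip(ground_truth_boxes, estimated_boxes)]))

gt = [(0, 0, 10, 10), (1, 1, 11, 11)]
est = [(0, 0, 10, 10), (3, 3, 13, 13)]
print(ata(gt, est))   # average of 1.0 and roughly 0.47
```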

Figure 4 shows the variation of the velocity coefficients over time during tracking. The discontinuities in the plots correspond to the updating of the coefficients.

Figure 3. PETS 2009 data set [4]. Results of tracking person "p1" (first row), "p2" (second row), and "p3" (third and fourth rows). The trajectory extracted by the proposed approach is drawn with a solid (magenta) line, 2ARMA-PF with a dotted (blue) line, and the ground truth with a dashed (yellow) line.

Figure 4. Graphical representation of the velocity coefficients in the x direction (left) and the y direction (right) for person "p1" over time, with p = 4 in equation (9).

Table 1. Performance comparison of the proposed approach with the 2ARMA-PF method [7] for human tracking on the S2-L1 and S2-L2 benchmarks of PETS 2009 [4], using the ATA metric [11].

Video sequence        ATA, proposed method    ATA, 2ARMA-PF method
S2-L1, person p1      0.53                    0.28
S2-L1, person p2      0.71                    0.59
S2-L2, person p3      0.66                    0.56

5. Conclusion

We proposed a novel adaptive approach to model complex human motion using a probabilistic autoregressive moving average model. The parameters of the model are extracted automatically and periodically, using a probabilistic analysis of the target's velocity and acceleration distributions during the learning stage. We then utilized this motion model, along with a kernelized color histogram with the Bhattacharyya distance (as the measurement model), in a Bayesian sequential estimation framework using a particle filter. The experimental results on the PETS 2009 S2-L1 and S2-L2 benchmarks show the effectiveness of our approach through a considerable improvement of human tracking in the presence of occlusion and abrupt motion.

References

[1] A. Elgammal and C. S. Lee, Tracking people on a torus, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 31(3), 2009.
[2] J. H. Chuang, C. W. Lee, and K. H. Lo, Human activity analysis based on a torso-less representation, 19th Int. Conf. on Pattern Recognition, 2008.
[3] S. Pathak, D. Haynor, C. Lau, and M. Hawrylycz, Non-negative matrix factorization framework for dimensionality reduction and unsupervised clustering, The Insight Journal, 2007.
[4] PETS 2009 benchmark data, http://pets2009.net
[5] A. J. McDougall, Robust methods for recursive autoregressive moving average estimation, Journal of the Royal Statistical Society, Vol. 56(1), 1994.
[6] A. Elnagar and K. Gupta, Motion prediction of moving objects based on autoregressive model, IEEE Trans. on Systems, Man and Cybernetics, Vol. 28(6), 1998.
[7] P. Perez, C. Hue, J. Vermaak, and M. Gangnet, Color-based probabilistic tracking, European Conf. on Computer Vision, 2002.
[8] D. Mikami, K. Otsuka, and J. Yamato, Memory-based particle filter for face pose tracking robust under complex dynamics, IEEE Conf. on Computer Vision and Pattern Recognition, 2009.
[9] M. Peternel and A. Leonardis, Visual learning and recognition of a probabilistic spatio-temporal model of cyclic human motion, 17th Int. Conf. on Pattern Recognition, 2004.
[10] H. Liu, Z. Yu, H. Zha, Y. Zou, and L. Zhang, Robust human tracking based on multi-cue integration and mean-shift, Pattern Recognition Letters, 2008.
[11] R. Kasturi, D. Goldgof, P. Soundararajan, V. Manohar, J. Garofolo, R. Bowers, M. Boonstra, V. Korzhova, and J. Zhang, Framework for performance evaluation of face, text, and vehicle detection and tracking in video: data, metrics, and protocol, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 31(2), 2009.
