
VISUAL TRACKING USING SEQUENTIAL IMPORTANCE SAMPLING WITH A STATE PARTITION TECHNIQUE

Yan 1, Mark Yeary 1, Joseph P. Havlicek 1, Jean-Charles Noyer 2, Patrick Lanvin 2

1 School of Electrical and Computer Engineering, University of Oklahoma, Norman, OK, USA
2 Laboratoire d’Analyse des Systèmes du Littoral, Université du Littoral Côte d’Opale, Calais Cedex, France

Corresponding Author’s E-mail: [email protected], [email protected]

ABSTRACT

Sequential importance sampling (SIS), also known as particle filtering, has drawn increasing attention recently due to its superior performance in nonlinear and non-Gaussian dynamic problems. In the SIS framework, estimation accuracy depends strongly on the choice of proposal distribution. In this paper we propose a novel SIS algorithm called PF-SP-PEKF that is based on a state partition technique and a parallel bank of extended Kalman filters designed to improve the accuracy of the proposal distribution. Our results show that this new approach yields a significantly improved estimate of the state, enabling the new particle filter to effectively track human subjects in a video sequence where the standard condensation filter fails to maintain track lock. Moreover, because of the improved proposal distribution, the new filter can achieve a given level of performance using fewer particles than its conventional SIS counterparts.

This work was supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under grant W911NF-04-1-0221.

1. INTRODUCTION

Robust, accurate visual object tracking is fundamental to a variety of computer vision applications including robotics, human tracking and biometric identification, intelligent transportation systems, smart rooms, and military targeting systems. Usually, the objects of interest in a video sequence are represented by state space models that may involve strong nonlinearities. In addition, the noise and background clutter that are almost always present in real-world video sequences make the visual tracking problem particularly challenging. Conventional methods for dealing with nonlinear and non-Gaussian problems include the extended Kalman filter (EKF), Bayesian multiple-hypothesis filters, and approximate variants including the probability data association filter (PDAF) and joint probability data association filter (JPDAF) [1]. Hidden Markov models (HMM) have also been widely used [2, 3].

More recently, a sequential importance sampling (SIS) technique, viz. particle filtering (PF), has been recognized as a significant target tracking algorithm because of its accuracy, robustness and flexibility in non-linear and non-Gaussian scenarios [4]. Various PF tracking approaches have been reported, including the condensation method [5], switching PF [6], unscented PF [7] and others. However, performance of these techniques is strongly influenced by the choice and accuracy of the proposal distribution (PD). In this paper, we introduce a new PF approach where the PD is generated using the state partition technique and multiple weighted EKFs. This technique provides highly accurate and robust state estimates in the presence of noise and strong clutter, which improves tracking performance while simultaneously reducing the need for a large number of particles. As an illustrative example, we apply the new approach to track human perambulation in a highly cluttered environment.

2. SEQUENTIAL IMPORTANCE SAMPLING

The conventional PF technique [4] is based on a nonlinear system modeled in state space according to

$$
\begin{aligned}
x(t) &= f(x(t-1)) + v(t-1) \\
y(t) &= h(x(t)) + n(t)
\end{aligned}
\tag{1}
$$

where $x(t)$ and $y(t)$ denote the state vector and observations and where $v(t)$ and $n(t)$ are the process and observation noises. The filtering objective is to simulate the posterior distribution $p(x(0:t)|y(1:t))$ via a large set of randomly generated samples called particles. Applying Bayes’ theorem, this distribution is given by

$$
p(x(0:t)|y(1:t)) = \frac{p(y(1:t)|x(0:t))\, p(x(0:t))}{\int p(y(1:t)|x(0:t))\, p(x(0:t))\, dx(0:t)} .
$$
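As a rough illustration of representing a posterior by weighted samples, the following sketch applies Bayes' theorem to a scalar toy problem. The model (a standard normal prior and an observation y = x + n with Gaussian noise), the function name `posterior_mean_by_sampling`, and all numerical values are assumptions chosen only for illustration; none of them come from the paper. Particles are drawn from the prior and weighted by the likelihood, and dividing the weights by their sum plays the role of the normalizing integral in the denominator of Bayes' formula.

```python
import numpy as np

def posterior_mean_by_sampling(y, noise_std, num_particles, seed=0):
    """Estimate E[x | y] for the toy model x ~ N(0, 1), y = x + n, n ~ N(0, noise_std^2)."""
    rng = np.random.default_rng(seed)
    # Draw particles from the prior p(x).
    particles = rng.standard_normal(num_particles)
    # Unnormalized log-weights: the log-likelihood log p(y | x) at each particle.
    log_w = -0.5 * ((y - particles) / noise_std) ** 2
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(log_w - log_w.max())
    # Dividing by the sum approximates the normalizing integral in Bayes' theorem.
    weights /= weights.sum()
    return float(np.sum(weights * particles))

# Toy check against the closed-form posterior mean y / (1 + noise_std^2),
# which holds only for this assumed linear-Gaussian model.
estimate = posterior_mean_by_sampling(y=1.0, noise_std=1.0, num_particles=50_000)
exact = 1.0 / (1.0 + 1.0 ** 2)
print(f"particle estimate: {estimate:.3f}, exact: {exact:.3f}")
```

For this linear-Gaussian toy case the exact answer is available in closed form, so the particle estimate can be checked directly. The nonlinear models considered in the paper generally admit no such closed form, which is what motivates the sampling approach.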

0-7803-9134-9/05/$20.00 ©2005 IEEE

SIS approximates the posterior distribution according to

$$
p(x(0:t)|y(1:t)) \approx \sum_{i=1}^{N_s} \tilde{\omega}^{(i)}(t)\, \delta\big(x(0:t) - x^{(i)}(0:t)\big), \tag{2}
$$

where $N_s$ is the number of particles. The normalized importance weight $\tilde{\omega}^{(i)}(t)$ is given by

$$
\tilde{\omega}^{(i)}(t) = \frac{\omega^{(i)}(t)}{\sum_{j=1}^{N_s} \omega^{(j)}(t)}, \tag{3}
$$

where $\omega^{(i)}(t)$ denotes the importance weight associated with the $i$th particle at time $t$, given by

$$
\omega^{(i)}(t) = \omega^{(i)}(t-1)\, \frac{p(y(t)|x^{(i)}(t))\; p(x^{(i)}(t)|x^{(i)}(t-1))}{q(x^{(i)}(t)|x^{(i)}(0:t-1), y(1:t))} . \tag{4}
$$

The particles $x^{(i)}(t)$ are drawn from the PD, $x^{(i)}(t) \sim q(x(t)|x^{(i)}(0:t-1), y(1:t))$. Designing a proper PD $q(\cdot)$ associated with the SIS is one of the greatest challenges in implementing a PF and can profoundly affect the filter performance. In the next section, we introduce a new, robust approach to this important task.

3. NEW PF ALGORITHM (PF-SP-PEKF)

It is well-known that using the state transition prior as the proposal (also known as condensation) is not robust in strongly cluttered environments, since this proposal does not include the most recent observations [4, 7]. In this section, we suggest a new PD based on the state partition technique and a parallel bank of EKFs (SP-PEKF) [8]. We briefly review the SP-PEKF method and then show how it can be incorporated within the PF framework.

SP-PEKF is a technique for calculating the statistics of the states of a nonlinear system. The rationale is to generate a set of samples $x_i$, $i \in [1, N]$, associated with each state $x$ according to the given initial distribution $\mathcal{N}(x(0), R(0))$. These samples are partitioned as $x(t) \triangleq x_n(t) + x_r(t)$, where $x_n(t)$ and $x_r(t)$ denote the nominal and residual parts of the true state, respectively. After partitioning, the samples are propagated through a parallel filter bank and the estimated states are generated as a weighted sum of the filtered samples. Initially, samples $x_i(0)$ are generated according to $\hat{x}(0) = x_{n_i}(0) + \hat{x}_{r_i}(0)$ and $R(0) = R_{n_i}(0) + R_{r_i}(0)$, where $x_{n_i}(0) = \hat{x}(0)$, $\hat{x}_{r_i}(0) = 0$, $R_{n_i}(0) = R(0)$, and $R_{r_i}(0) = 0$. Here, $R_{n_i}$ and $R_{r_i}$ represent the covariance of the nominal state and the residual state, respectively.

The nominal state is updated according to $x_{n_i}(t) = f(x_{n_i}(t-1)) + v_{n_i}(t-1)$, where $v_{n_i}(t)$ is distributed identically to $v(t)$ but is a different realization. The system state trajectory is then linearized about $x_{n_i}(t)$ according to

$$
x(t) \approx f(x_{n_i}(t-1)) + F(x_{n_i}(t-1))\, x_{r_i}(t-1) + v(t-1), \tag{5}
$$

where $F(\cdot)$ denotes the Jacobian. From (5), we have that

$$
x_{n_i}(t) + x_{r_i}(t) \approx f(x_{n_i}(t-1)) + v_{n_i}(t-1) - v_{n_i}(t-1) + F(x_{n_i}(t-1))\, x_{r_i}(t-1) + v(t-1),
$$

which can be simplified to obtain

$$
x_{r_i}(t) \approx F(x_{n_i}(t-1))\, x_{r_i}(t-1) + v(t-1) - v_{n_i}(t-1) . \tag{6}
$$

Manipulating the output equation in a similar way yields

$$
y(t) \approx h(x_{n_i}(t)) + H(x_{n_i}(t))\, x_{r_i}(t) + n(t), \tag{7}
$$

where $H(\cdot)$ is the Jacobian. Note that the dynamic model given by (6) and (7) approximates the nonlinear system (1) but is linear in the residual part of the state $x_{r_i}(t)$. Thus, a bank of EKFs can be applied to update $x_{r_i}(t)$ and the overall system state is estimated by the weighted sum

$$
\hat{x}(t|t) = \sum_{i=1}^{N} \hat{x}_i(t|t)\, w_i(t), \tag{8}
$$

where $\hat{x}_i(t|t) = x_{n_i}(t) + \hat{x}_{r_i}(t|t)$. The filter weights $w_i(t)$ in (8) are given by

$$
w_i(t) = \frac{L_i(t|t)\, w_i(t-1)}{\sum_{j=1}^{N} L_j(t|t)\, w_j(t-1)}, \tag{9}
$$

where

$$
L_i(t|t) = |R_{y_i}(t|t-1)|^{-0.5} \exp\left[ -\frac{1}{2} \big\| y(t) - \hat{y}_i(t|t-1) \big\|^2_{R_{y_i}^{-1}(t|t-1)} \right] . \tag{10}
$$

Finally, the overall estimation error covariance $R_{total}$ is

$$
R_{total}(t|t) = \sum_{i=1}^{N} \Big\{ R_i(t|t) + [\hat{x}(t|t) - \hat{x}_i(t|t)]\,[\hat{x}(t|t) - \hat{x}_i(t|t)]^T \Big\}\, w_i(t) . \tag{11}
$$

Our unique contribution in this paper is developing a new PF algorithm (PF-SP-PEKF) which uses the SP-PEKF state estimates for its PD $q(\cdot)$. At each iteration, particles are drawn from a normal distribution with mean (8) and covariance (11). These particles are then propagated as in the conventional PF according to (2)-(4) with a resampling step. In order to take maximum advantage of the state estimates derived from the particle filter, the final estimates from the combined approach are recursively fed back into the SP-PEKF filter. This update stage serves as a “correction” step for the SP-PEKF filter at each iteration. We typically choose the number of filter bank channels $N$ in (8)-(11) much smaller than the number of particles $N_s$; our experience is that taking $10 \le N \le 20$ is generally sufficient to obtain significantly improved performance relative to, e.g., a comparable condensation filter. The overall PF-SP-PEKF algorithm is illustrated in Fig. 1 and summarized below.
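The way the filter bank outputs feed the proposal in (8)-(11), followed by the weight update of (3)-(4), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`bank_likelihood`, `update_bank_weights`, `mixture_moments`, `draw_particles`, `sis_weights`, `resample`) are hypothetical, and the per-channel EKF updates that would produce the channel estimates and covariances are not shown. Working with log-densities and using multinomial resampling are implementation choices made here for simplicity, and the numbers in the usage lines are arbitrary.

```python
import numpy as np

def bank_likelihood(y, y_pred, R_y):
    """Eq. (10): |R_y|^(-1/2) exp(-0.5 * innov^T R_y^{-1} innov) for one channel."""
    innov = np.atleast_1d(y - y_pred).astype(float)
    R_y = np.atleast_2d(R_y).astype(float)
    maha = innov @ np.linalg.solve(R_y, innov)
    return np.linalg.det(R_y) ** -0.5 * np.exp(-0.5 * maha)

def update_bank_weights(likelihoods, w_prev):
    """Eq. (9): reweight each channel by its likelihood, then normalize."""
    w = np.asarray(likelihoods, float) * np.asarray(w_prev, float)
    return w / w.sum()

def mixture_moments(x_hat, R, w):
    """Eqs. (8) and (11): weighted mean and total covariance of the bank."""
    x_hat = np.asarray(x_hat, float)  # shape (N, d): channel estimates
    R = np.asarray(R, float)          # shape (N, d, d): channel covariances
    w = np.asarray(w, float)          # shape (N,): normalized channel weights
    mean = w @ x_hat
    diff = mean - x_hat               # spread of the channels about the mean
    cov = np.einsum("i,ijk->jk", w, R) + np.einsum("i,ij,ik->jk", w, diff, diff)
    return mean, cov

def draw_particles(mean, cov, num_particles, seed=0):
    """Sample particles from the Gaussian proposal N(mean, cov)."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=num_particles)

def sis_weights(w_prev, log_lik, log_trans, log_q):
    """Eqs. (3)-(4) in the log domain: update, then normalize the weights."""
    log_w = np.log(w_prev) + np.asarray(log_lik) + np.asarray(log_trans) - np.asarray(log_q)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def resample(particles, weights, seed=0):
    """Multinomial resampling: pick particle j with probability weights[j]."""
    rng = np.random.default_rng(seed)
    particles = np.asarray(particles)
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

# Usage with two one-dimensional channels and arbitrary numbers.
x_hat = [[0.0], [2.0]]
R = [[[1.0]], [[1.0]]]
L = [bank_likelihood(1.0, 0.0, 1.0), bank_likelihood(1.0, 2.0, 1.0)]
w = update_bank_weights(L, [0.5, 0.5])
mean, cov = mixture_moments(x_hat, R, w)
particles = draw_particles(mean, cov, num_particles=100)
```

In this symmetric example both channels are equally likely, so the proposal mean falls midway between the two channel estimates and the second term of (11) inflates the covariance to account for their disagreement.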

The PF-SP-PEKF algorithm:

• Sequential Importance Sampling (SIS) Step:
  – Sample from the proposal distribution according to (8) and (11):

    $$x^{(i)}(t) \sim q(x^{(i)}(t)|x^{(i)}(0:t-1), y(1:t)) = \mathcal{N}(\hat{x}(t|t), R_{total}(t|t));$$

  – Evaluate and normalize the importance weights according to (3) and (4);

• Resampling Step: Generate a new set of particles $x^{i\star}(t)$ from $x^{(i)}(t)$ by sampling $N_s$ times the approximate distribution of $p(x(t)|y(1:t))$ so that

    $$\Pr\big(x^{i\star}(t) = x^{(j)}(t)\big) = \tilde{\omega}^{(j)}(t);$$

• Output and Update Step: Estimate $x(t)$ according to $x(t) \approx \frac{1}{N_s} \sum_{i=1}^{N_s} x^{i\star}(t)$ and update the proposal.

[Fig. 1. Block diagram of the PF-SP-PEKF algorithm. Blocks: observations; EKF bank (EKF 1 to EKF N); update using the weighted sum given in (8)-(11); evaluate estimated mean and variance; generate particles; calculate particle weights; normalize (3); resample.]

4. PF-SP-PEKF VIDEO TRACKING

As an illustrative example, we apply the PF-SP-PEKF algorithm proposed in Section 3 to track the head of a perambulating human in a real-time video sequence. We adopt a Langevin process [2, 7] to model the head as a moving ellipse centered at image coordinates (row, col) = $(r, s)$. The state dynamics are given by

$$
\begin{bmatrix} r(t+1) \\ s(t+1) \\ \dot{r}(t+1) \\ \dot{s}(t+1) \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & a_r & 0 \\ 0 & 0 & 0 & a_s \end{bmatrix}
\begin{bmatrix} r(t) \\ s(t) \\ \dot{r}(t) \\ \dot{s}(t) \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ b_r \\ b_s \end{bmatrix} m(t),
$$

where $x(t) = [r(t)\; s(t)\; \dot{r}(t)\; \dot{s}(t)]^T$ is the state vector, $a_r = \exp(\beta_r \Delta T)$, $a_s = \exp(\beta_s \Delta T)$, $b_r = \bar{v}_r \sqrt{1 - a_r^2}$, $b_s = \bar{v}_s \sqrt{1 - a_s^2}$, parameters $\beta_r$ and $\beta_s$ are rate constants, $\Delta T$ is the discretization time step, $\bar{v}_r$ and $\bar{v}_s$ are the steady-state root-mean-square velocities, and $m(t)$ is the process noise.

$K$ equally spaced rays are drawn from the center of the ellipse, which serves as the origin of a local coordinate system $(u, v)$. The intersections of these rays with the ellipse boundary are taken as observations. In local coordinates, the intersections along the $k$th ray are obtained according to

$$
\begin{aligned}
u_k &= \sqrt{a^2 b^2 / (b^2 + a^2 \tan^2 \theta_k)} \\
v_k &= \tan \theta_k \cdot \sqrt{a^2 b^2 / (b^2 + a^2 \tan^2 \theta_k)}
\end{aligned}
\tag{12}
$$

by solving the ellipse equation $\frac{(u_k - m)^2}{a^2} + \frac{(v_k - n)^2}{b^2} = 1$ and the ray equation $v_k = u_k \tan \theta_k$, where $a$ and $b$ denote the major and minor axes of the ellipse, respectively. Here $m$ and $n$ appear to be the coordinates of the ellipse center, which are unrelated to the noise terms $m(t)$ and $n(t)$; (12) corresponds to $m = n = 0$, consistent with the local origin being at the center. Let $y$ represent the observation and convert the local coordinates $(u_k, v_k)$ back to image coordinates. The observation equation is then given by

$$
y(t) = [(u_k + r(t),\; v_k + s(t))] + n(t), \tag{13}
$$

where $n(t)$ is the measurement noise. To reject clutter, we apply 1-D edge detection along each ray. The edges are labeled as $j = 1 \cdots J_k$, where $J_k$ is the total number of edges detected along the $k$th ray. The likelihood of each ray is then $p_k(y_t|x_t) = N_m \sum_{j=1}^{J_k} \mathcal{N}\big((u_k, v_k), \sigma_{kj}^2\big)$, where $N_m$ is a normalizing factor. The overall likelihood computed across all rays is given by $p(y(t)|x(t)) = \prod_{k=1}^{K} p_k(y(t)|x(t))$. This likelihood model is a simplified version of the one given in [7]. The innovation along each ray is $y_k(t) - \bar{y}(t|t-1) = \sum_{j=1}^{J_k} \pi_{kj} \big( (u_k, v_k)_{t,j} - (u_k, v_k)_{t|t-1} \big)$.

5. RESULTS AND DISCUSSION

Digital video data were acquired with a frame size of 480 × 720 pixels in a highly cluttered laboratory environment using a high-quality full motion video camera manufactured by Cohu Electronics. The condensation method and the PF-SP-PEKF algorithm were implemented for comparison. Tracking results for the two filters are shown in Fig. 2 and Fig. 3, respectively, where the estimated target centroid is indicated by a white cross. For the first 20 frames the motion was nearly rectilinear with minimal acceleration and both filters performed well. Beginning around frame 25, however, the motion became more complicated with nontrivial accelerations and increased background clutter around the head. While the PF-SP-PEKF filter maintained a consistent track lock throughout the video sequence, it may be seen in Fig. 2 that the condensation filter progressively diverged in frames 30 through 50, ultimately locking onto the clutter structure and losing the tracked object altogether.

These results, which are typical of those we have obtained, demonstrate the performance advantage that can be gained with the PF-SP-PEKF approach by explicitly considering the most recent observation when constructing the proposal distribution. Due to its improved accuracy and robustness, the PF-SP-PEKF is also capable of achieving a given level of tracking performance with fewer particles than its conventional PF counterparts, which constitutes a significant advantage in real-time implementations. Our future research is focused on combining the SP-PEKF approach with Markov Chain Monte Carlo methods such as the Metropolis-Hastings algorithm to address the well-known “sample impoverishment” problem that is typically seen in conventional particle filters when a small number of heavily weighted particles cause a loss of sample diversity.

[Fig. 2. Condensation method results. Divergence is observed in frame 30 with complete track loss by frame 50. Panels: (a) Frame 1, (b) Frame 10, (c) Frame 20, (d) Frame 30, (e) Frame 40, (f) Frame 50.]

[Fig. 3. PF-SP-PEKF tracking result: a consistent track lock is maintained throughout the entire video sequence. Panels: (a) Frame 1, (b) Frame 10, (c) Frame 20, (d) Frame 30, (e) Frame 40, (f) Frame 50.]

6. REFERENCES

[1] Y. Bar-Shalom and T. Fortmann, Tracking and Data Association, Academic Press, 1998.

[2] Y. Chen, Y. Rui and T. , “JPDAF based HMM for real-time contour tracking,” Proc. of IEEE CVPR, pp. I-543 to 550, Dec. 2001.

[3] M. Bruno and J. Moura, “Multiframe detector/tracker: optimal performance,” IEEE Trans. Aerosp. Electron. Syst., vol. 37, pp. 925-946, July 2001.

[4] M. Arulampalam, S. Maskell, N. Gordon and T. Clapp, “A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking,” IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174-188, Feb. 2002.

[5] M. Isard and A. Blake, “Contour tracking by stochastic propagation of conditional density,” Proc. Eur. Conf. Computer Vision, 1996.

[6] T. Bando, T. Shibata, K. Doya and S. Ishii, “Switching particle filters for efficient real-time visual tracking,” Proc. Int. Conf. Pattern Recognition, pp. 720-723, Aug. 2004.

[7] Y. Rui and Y. Chen, “Better proposal distributions: object tracking using unscented particle filter,” Proc. of IEEE CVPR, pp. 786-793, Dec. 2001.

[8] D. Lainiotis and P. Papaparaskeva, “A new class of efficient adaptive nonlinear filters (ANLF),” IEEE Trans. Signal Process., vol. 46, pp. 1730-1737, June 1998.