
ICASSP 2001, Conference Proceedings CD-ROM, IMDSP-P3.1

A MATCH MOVING TECHNIQUE FOR MERGING CG AND HUMAN VIDEO SEQUENCES

Jun’ichi HOSHINO‡    Hirofumi SAITO†
‡University of Tsukuba / PRESTO, JST    †Niigata University
E-mail: [email protected]

ABSTRACT

Merging virtual objects with a human video sequence is an important technique for many applications such as special effects in movies and augmented reality. In the traditional method, an operator manually fits a 3D body model onto the human video sequence and generates virtual objects at the current 3D body pose. However, manual fitting is a time-consuming task, and automatic registration is required. In this paper, we propose a new method for merging virtual objects onto a human video sequence. First, we track the current 3D pose of the human figure by using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG objects and merge them with the human figure in the video. We demonstrate examples of merging virtual cloth with captured video images.

Figure 1: Example of merging CG cloth with a human image. (a) original image (b) merging CG cloth

1. INTRODUCTION

Merging computer-generated objects onto a human video sequence is an important technique for many applications such as special effects in movies and augmented reality. In the traditional method, an operator manually fits a 3D body model onto the human video sequence and generates virtual objects at the current 3D pose. However, manual fitting is a time-consuming task, and automatic registration is required.

The automatic registration of virtual objects with a video image is also called "match moving" [10]. Conventional techniques estimate the camera position relative to the scene; the virtual objects are generated using the estimated camera position and orientation and merged with the input image. Similar techniques have also been investigated for augmented reality applications [1,2,3,4,5,6,7]. However, merging virtual objects with a complex, articulated figure such as a human body is still a difficult problem.

In this paper, we propose a new technique for the automatic registration of virtual objects with human body images. As a typical example, we merge CG cloth onto a human video sequence. First, we track the current 3D pose of the human figure using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG cloth and merge it with the human in the video.

2. OVERVIEW

Figure 2 shows the overview of the algorithm. First, we estimate the 3D pose of the human from the video. We use the 3D human model in Figure 3 to estimate the pose parameters of each part. We represent a human body by an articulated structure consisting of 10 rigid parts: body, head, upper arms (ur-arm, ul-arm), forearms (lr-arm, ll-arm), upper legs (ur-leg, ul-leg), and lower legs (lr-leg, ll-leg). The parameters

    ψ = { T_body, θ_body, θ_head, θ_ur-arm, θ_ul-arm, θ_lr-arm, θ_ll-arm,
          θ_ur-leg, θ_ul-leg, θ_lr-leg, θ_ll-leg }                            (1)

are estimated using the spatial and temporal gradient method. The pose of the human body at each frame is obtained by integrating the sequence of inter-frame pose parameters onto the pose at the initial frame.

Then we generate CG objects using the pose parameters of the articulated structure. Self-occlusions due to the body parts are also estimated using the pose parameters and the 3D shape of the body. Figure 1 shows an example of merging virtual cloth onto a human video sequence. A possible application of this technique is a virtual fashion simulator.

Figure 2: Overview of the algorithm. 1) input images, 2) estimating the 3D pose, 3) generating virtual cloth using the 3D pose, 4) merging the virtual cloth and the input images

3. TRACKING HUMAN BODY

3.1 Human motion model

The body models are approximated by polyhedra made with a CAD modeler. The model has a tree structure with a root at the trunk, and each part has a local coordinate system whose origin is located at a joint. Rigid motion of each body part is a combination of rotation and translation, denoted by a matrix Q_bj and a vector S_bj respectively, where j identifies the part.
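The pose parameter set ψ of eq. (1) — one root translation plus a small rotation per rigid part — can be held in a simple container. The sketch below is purely illustrative (not the authors' code); the part names follow the abbreviations above:

```python
import numpy as np

# The 10 rigid parts of the articulated body model (Sec. 2).
PARTS = ["body", "head", "ur-arm", "ul-arm", "lr-arm", "ll-arm",
         "ur-leg", "ul-leg", "lr-leg", "ll-leg"]

def make_pose():
    """Return psi of eq. (1): a root translation T_body plus one
    small-rotation vector theta per rigid part (all initially zero)."""
    return {"T_body": np.zeros(3),
            **{f"theta_{p}": np.zeros(3) for p in PARTS}}

pose = make_pose()
assert len(pose) == 11  # one translation + 10 rotations
```

Integrating the per-frame increments of these parameters onto the initial pose then gives the absolute pose at each frame, as described above.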

Figure 3: Human body model

When a 3D point on part j moves from p_j to p_j', the positions are related by

    p_j' = Q_bj p_j + S_bj                                                    (2)

Assuming that the movement is small, the rotation matrix Q_bj is given by

    Q_bj = [    1     -θ_zj    θ_yj
              θ_zj      1     -θ_xj
             -θ_yj    θ_xj      1   ]                                         (3)

where θ_xj, θ_yj, and θ_zj are small rotations around the x, y, and z axes. A point p_w in the world coordinate system is related to a point p_ci in the camera coordinate system by

    p_w = R_ci p_ci + T_ci                                                    (4)

Suppose that body 1 and body 2 are a parent and a child, and a point p_i on body 2 moves to a new location p_i'. The new location can be calculated as

    p_i' = R_ci^-1 ( R_b1 ( Q_b1 R_b1^-1 ( R_b2 ( Q_b2 R_b2^-1 ( R_ci p_i + T_ci - T_b2 )
           + S_b2 ) + T_b2 - T_b1 ) + S_b1 ) + T_b1 - T_ci )                  (5)

where R_bj and T_bj are the rotation and translation of the object-oriented coordinate system of part j. The Jacobian matrix can be derived from eq. (5):

    dp_i'/dt = Σ_{j=1,2} { (∂p_i'/∂S_xj) Ṡ_xj + (∂p_i'/∂S_yj) Ṡ_yj + (∂p_i'/∂S_zj) Ṡ_zj
             + (∂p_i'/∂θ_xj) θ̇_xj + (∂p_i'/∂θ_yj) θ̇_yj + (∂p_i'/∂θ_zj) θ̇_zj }
             = J(φ) φ̇                                                        (6)

3.2 Estimation of motion parameters

The motion parameters are estimated using the spatial and temporal gradient method. The optical flow constraint [11] in three dimensions can be written as

    G ṗ = -E_t,    G = ( f E_X / z,  f E_Y / z,  -(X E_X + Y E_Y) / z )       (7)

where E_X, E_Y, and E_t are the spatial and temporal image gradients and f is the focal length. By substituting eq. (6) into eq. (7), the motion parameters can be obtained using the least-squares method:

    G J φ̇ = -E_t                                                              (8)

Eq. (8) can be extended to the multi-camera form. Supposing the number of cameras to be n, we have n systems of linear equations, one per camera, and can gather them into one system:

    [ G_1 J_1 ]         [ -E_t1 ]
    [ G_2 J_2 ]  φ̇  =   [ -E_t2 ]
    [    ⋮    ]         [   ⋮   ]
    [ G_n J_n ]         [ -E_tn ]                                             (9)

4. INITIAL REGISTRATION

The motion estimation technique in Sec. 3 only estimates the inter-frame displacement of the human body. Therefore we need an initial registration of the body model to the input image. We estimate the initial body pose by extracting the center line of each part from the 2D input image.

4.1 Modeling registration errors

Figure 4 shows the relationship between a 2D center line on the image plane and the corresponding 3D body part. The 2D center line can be obtained from the silhouette of the human body image by a principal axis calculation. The position and orientation of a body part can be represented as x = [r, t]^T, where t is the translation and r is the rotation about the xyz axes. The error between a 2D center line and a 3D body part can be calculated as the minimum distance between the 3D line segment P = (p_1, p_2) and the plane M (Figure 4):

    h(x, l) = [ (R p_1 + t) · n
                (R p_2 + t) · n ]                                             (10)

R is a 3x3 rotation matrix derived from r, and n is the unit normal vector of plane M:

    n = (q_1' × q_2') / | q_1' × q_2' |                                       (11)

h(x, l) can be linearized using the Taylor expansion around the initial estimates l = l̂_i and x = x̂_{i-1}:

    h(x_i, l_i) ≈ h(x̂_{i-1}, l̂_i) + [∂h(x̂_{i-1}, l̂_i)/∂x] (x_i - x̂_{i-1})
                + [∂h(x̂_{i-1}, l̂_i)/∂l] (l_i - l̂_i)                          (12)

Figure 4: The relationship of the 2D center line and a corresponding body part
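Solving the stacked multi-camera system of eq. (9) for the parameter velocities is a plain linear least-squares problem. A minimal sketch, assuming the per-camera matrices G_i J_i and temporal gradients E_t,i have already been computed (names and shapes are illustrative, not the authors' implementation):

```python
import numpy as np

def solve_motion(GJ_blocks, Et_blocks):
    """Least-squares solve of eq. (9).

    GJ_blocks: list of (m_i x k) matrices G_i J_i, one per camera.
    Et_blocks: list of length-m_i temporal-gradient vectors E_t,i.
    Returns the least-squares estimate of the parameter velocities phi_dot.
    """
    A = np.vstack(GJ_blocks)         # stack the n per-camera systems
    b = -np.concatenate(Et_blocks)   # right-hand side -E_t of eq. (9)
    phi_dot, *_ = np.linalg.lstsq(A, b, rcond=None)
    return phi_dot
```

Gathering constraint rows from several cameras makes the stacked system overdetermined, which is what lets the least-squares solve resolve motions that would be ambiguous from a single view.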

where ∂h/∂x and ∂h/∂l are partial derivatives.

4.2 Minimizing registration errors

We minimize the errors between the 2D center lines and the 3D body parts by using the Extended Kalman Filter (EKF) [12]. The advantage of using the EKF is that the estimate can be updated whenever a new measurement becomes available. The measurement equation is

    z_i = H_i x_i + v_i

where

    z_i = [∂h(x̂_{i-1}, l̂_i)/∂x] x̂_{i-1} - h(x̂_{i-1}, l̂_i)

    H_i = ∂h(x̂_{i-1}, l̂_i)/∂x,    v_i = [∂h(x̂_{i-1}, l̂_i)/∂l] (l_i - l̂_i)

The optimization can be done using the Kalman filter equations:

    x̂_i = x̂_{i-1} + K_i [ z_i - H_i x̂_{i-1} ]

    K_i = σ_{i-1} H_i^T ( H_i σ_{i-1} H_i^T + R_i )^-1

    σ_i = σ_{i-1} - K_i H_i σ_{i-1}

where K_i is the Kalman gain, σ_i is the covariance matrix of the estimate, and R_i is the covariance matrix of the measurement noise.

5. MERGING GRAPHIC OBJECTS

5.1 Cloth simulation

We generate the CG cloth using the pose parameters estimated in Sec. 3 and Sec. 4. We use MAYA Cloth™ for the cloth simulation; any other cloth simulation technique could be used for this purpose. We use an approximate body model with a smooth surface to simulate the natural shape of the CG cloth, with part sizes equal to those of the 3D body model used for tracking.

5.2 Dealing with occlusions

We cannot simply overlay the virtual objects, because they may be partially occluded by body parts; for example, the user's hands may be in front of the virtual objects. We need relative depth information to determine which body parts may occlude the virtual objects. Again, we use the approximate human body model to deal with occlusions.

1) First, we create a texture-mapped 3D model of the human body. Because the body model is already registered onto the input image, the intensity values of the input image can be stored as a texture of the 3D model.
2) The CG cloth and hairstyle are added to the texture-mapped 3D model of the human body.
3) The texture-mapped body model and the CG objects are rendered together to obtain the synthesized image.

In our experiments, small shape differences of the body model are not critical, because we render from the same viewpoint as the input image.

6. EXPERIMENTS

We have implemented a prototype system on a PC to demonstrate the algorithm. Figure 5 shows an example of the initial model fitting: (a) is an input image, (b) is the extracted center lines, and (c) is the result of the model fitting.

Figure 6 shows a movie sequence of a walking person. The input is 50 frames of 640x480 images. (a) shows frames 30 and 40 of the input image sequence. (b) is the tracking result; the wireframe of the human body model is overlaid onto the input images. (c) is the cloth image generated using the 3D body pose estimated from the input images. (d) is the input images merged with the CG cloth.
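The EKF measurement update of Sec. 4.2 follows directly from the three Kalman equations above. A minimal sketch in the standard Kalman form (variable names follow the paper; this is an illustration, not the authors' code):

```python
import numpy as np

def ekf_update(x_prev, sigma_prev, z, H, R):
    """One EKF measurement update (Sec. 4.2): returns x_i, sigma_i."""
    # Kalman gain K_i = sigma_{i-1} H^T (H sigma_{i-1} H^T + R)^-1
    K = sigma_prev @ H.T @ np.linalg.inv(H @ sigma_prev @ H.T + R)
    x = x_prev + K @ (z - H @ x_prev)        # innovation-weighted state update
    sigma = sigma_prev - K @ H @ sigma_prev  # covariance shrinks per measurement
    return x, sigma
```

Each newly extracted center line yields a (z, H, R) triple, so the pose estimate x̂ can be refined incrementally as measurements arrive, which is the advantage of the EKF noted above.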

Figure 5: Result of automatic registration. (a) input image (b) extracted center lines (c) registration of the 3D model

Figure 6: Example of merging virtual cloth. (a) original image sequence (b) tracking the human body (c) generated CG cloth images (d) merging the CG cloth with the input image sequence

7. CONCLUSION

In this paper, we proposed a new match moving technique for articulated figures such as humans. First, we track the current 3D pose of the human figure using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG cloth and merge it with the human figure in the video. We have demonstrated examples using captured video sequences. Future work includes a real-time implementation of the proposed technique.

8. REFERENCES

[1] V. Blanz and T. Vetter: "A Morphable Model for the Synthesis of 3D Faces", Proc. SIGGRAPH '99, pp. 187-194, 1999.
[2] M. Bajura, H. Fuchs and R. Ohbuchi: "Merging virtual objects with the real world: Seeing ultrasound imagery within the patient", SIGGRAPH '92, pp. 203-210.
[3] M. Bajura and U. Neumann: "Dynamic Registration Correction in Video-Based Augmented Reality Systems", IEEE Computer Graphics and Applications, Vol. 15, No. 5, pp. 52-60, Sep. 1995.
[4] M. Bajura and U. Neumann: "Dynamic Registration Correction in Augmented Reality Systems", IEEE VRAIS 1995 Proc., pp. 189-196, 1995.
[5] E. K. Edwards, J. P. Rolland and K. P. Keller: "Video See-Through Design for Merging of Real and Virtual Environments", IEEE VRAIS 1993 Proc., pp. 222-233.
[6] S. Gottschalk and J. Hughes: "Autocalibration for Virtual Environments Tracking Hardware", SIGGRAPH '93, pp. 65-72.
[7] R. Azuma and G. Bishop: "Improving Static and Dynamic Registration in an Optical See-Through HMD", SIGGRAPH '94, pp. 197-204.
[8] A. L. Janin, D. W. Mizell and T. P. Caudell: "Calibration of Head-Mounted Display for Augmented Reality Applications", IEEE VRAIS 1993 Proc., pp. 246-255.
[9] D. M. Gavrila: "Visual Analysis of Human Movement: A Survey", Computer Vision and Image Understanding, Vol. 73, No. 1, pp. 82-98, 1999.
[10] C. Barron: "Painting in the Digital Age", SIGGRAPH '98 Sketches, 1998.
[11] M. Yamamoto, A. Sato and S. Kawada: "Incremental tracking of human action from multiple views", Proc. of CVPR '98, pp. 2-7, 1998.
[12] H. W. Sorenson: "Kalman Filtering: Theory and Application", IEEE Press, New York, 1985.