
ICASSP 2001, Conference Proceedings CD-ROM, IMDSP-P3.1

A MATCH MOVING TECHNIQUE FOR MERGING CG AND HUMAN VIDEO SEQUENCES

Jun’ichi HOSHINO‡    Hirofumi SAITO†
‡University of Tsukuba / PRESTO, JST    †Niigata University
E-mail: [email protected]

ABSTRACT

Merging virtual objects with a human video sequence is an important technique for many applications such as special effects in movies and augmented reality. In the traditional method, an operator manually fits a 3D body model onto the human video sequence and generates virtual objects at the current 3D body pose. However, manual fitting is a time-consuming task, and automatic registration is required. In this paper, we propose a new method for merging virtual objects onto a human video sequence. First, we track the current 3D pose of the human figure by using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG objects and merge them with the human figure in the video. We demonstrate examples of merging virtual cloth with captured video images.

Figure 1: Example of merging CG cloth with a human image. (a) original image (b) merging CG cloth

1. INTRODUCTION

Merging computer-generated objects onto a human video sequence is an important technique for many applications such as special effects in movies and augmented reality. In the traditional method, an operator manually fits a 3D body model onto the human video sequence and generates virtual objects at the current 3D pose. However, manual fitting is a time-consuming task, and automatic registration is required.

The automatic registration of virtual objects with a video image is also called "match moving" [10]. Conventional techniques estimate the camera position relative to the scene; the virtual objects are generated using the estimated camera position and orientation and merged with the input image. Similar techniques have also been investigated for augmented reality applications [1,2,3,4,5,6,7]. However, merging virtual objects with a complex, articulated figure such as a human body is still a difficult problem.

In this paper, we propose a new technique for the automatic registration of virtual objects with human body images. As a typical example, we merge CG cloth onto a human video sequence. First, we track the current 3D pose of the human figure using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG cloth and merge it with the human in the video.

2. OVERVIEW

Figure 2 shows the overview of the algorithm. First, we estimate the 3D pose of the human from the video. We use the 3D human model in Figure 3 to estimate the pose parameters of each part. We represent a human body by an articulated structure consisting of 10 rigid parts: body, head, upper arms (ur-arm, ul-arm), forearms (lr-arm, ll-arm), upper legs (ur-leg, ul-leg), and lower legs (lr-leg, ll-leg). The parameters

    ψ = { T_body, θ_body, θ_head, θ_ur-arm, θ_ul-arm, θ_lr-arm, θ_ll-arm,
          θ_ur-leg, θ_ul-leg, θ_lr-leg, θ_ll-leg }                            (1)

are estimated using the spatial and temporal gradient method. The pose of the human body at each frame is obtained by integrating the sequence of inter-frame pose parameters onto the pose at the initial frame.

Then we generate CG objects using the pose parameters of the articulated structure. Self-occlusions due to the body parts are also estimated using the pose parameters and the 3D shape of the body. Figure 1 shows an example of merging virtual cloth onto a human video sequence. A possible application of this technique is a virtual fashion simulator.

Figure 2: Overview of the algorithm. 1) input images, 2) estimating the 3D pose, 3) generating virtual cloth using the 3D pose, 4) merging the virtual cloth and the input images

3. TRACKING HUMAN BODY

3.1 Human motion model

The body models are approximated by polyhedra made with a CAD modeler. The model has a tree structure with a root at the trunk, and each part has a local coordinate system whose origin is located at a joint. Rigid motion of each body part is a combination of rotation and translation, denoted by a matrix Q_bj and a vector S_bj respectively, where j identifies the part.
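The pose parameter set ψ of eq. (1) — one root translation plus a small rotation per rigid part — can be held in a simple container. The sketch below is purely illustrative (not the authors' code); the part names follow the abbreviations above:

```python
import numpy as np

# The 10 rigid parts of the articulated body model (Sec. 2).
PARTS = ["body", "head", "ur-arm", "ul-arm", "lr-arm", "ll-arm",
         "ur-leg", "ul-leg", "lr-leg", "ll-leg"]

def make_pose():
    """Return psi of eq. (1): a root translation T_body plus one
    small-rotation vector theta per rigid part (all initially zero)."""
    return {"T_body": np.zeros(3),
            **{f"theta_{p}": np.zeros(3) for p in PARTS}}

pose = make_pose()
assert len(pose) == 11  # one translation + 10 rotations
```

Integrating the per-frame increments of these parameters onto the initial pose then gives the absolute pose at each frame, as described above.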

Figure 3: Human body model

When a 3D point on part j moves from p_j to p_j', the positions are related by

    p_j' = Q_bj p_j + S_bj                                                    (2)

Assuming that the movement is small, the rotation matrix Q_bj is given by

    Q_bj = [    1     -θ_zj    θ_yj
              θ_zj      1     -θ_xj
             -θ_yj    θ_xj      1   ]                                         (3)

where θ_xj, θ_yj, and θ_zj are small rotations around the x, y, and z axes. A point p_w in the world coordinate system is related to a point p_ci in the camera coordinate system by

    p_w = R_ci p_ci + T_ci                                                    (4)

Suppose that body 1 and body 2 are a parent and a child, and a point p_i on body 2 moves to a new location p_i'. The new location can be calculated as

    p_i' = R_ci^-1 ( R_b1 ( Q_b1 R_b1^-1 ( R_b2 ( Q_b2 R_b2^-1 ( R_ci p_i + T_ci - T_b2 )
           + S_b2 ) + T_b2 - T_b1 ) + S_b1 ) + T_b1 - T_ci )                  (5)

where R_bj and T_bj are the rotation and translation of the object-oriented coordinate system of part j. The Jacobian matrix can be derived from eq. (5):

    dp_i'/dt = Σ_{j=1,2} { (∂p_i'/∂S_xj) Ṡ_xj + (∂p_i'/∂S_yj) Ṡ_yj + (∂p_i'/∂S_zj) Ṡ_zj
             + (∂p_i'/∂θ_xj) θ̇_xj + (∂p_i'/∂θ_yj) θ̇_yj + (∂p_i'/∂θ_zj) θ̇_zj }
             = J(φ) φ̇                                                        (6)

3.2 Estimation of motion parameters

The motion parameters are estimated using the spatial and temporal gradient method. The optical flow constraint [11] in three dimensions can be written as

    G ṗ = -E_t,    G = ( f E_X / z,  f E_Y / z,  -(X E_X + Y E_Y) / z )       (7)

where E_X, E_Y, and E_t are the spatial and temporal image gradients and f is the focal length. By substituting eq. (6) into eq. (7), the motion parameters can be obtained using the least-squares method:

    G J φ̇ = -E_t                                                              (8)

Eq. (8) can be extended to the multi-camera form. Supposing the number of cameras to be n, we have n systems of linear equations, one per camera, and can gather them into one system:

    [ G_1 J_1 ]         [ -E_t1 ]
    [ G_2 J_2 ]  φ̇  =   [ -E_t2 ]
    [    ⋮    ]         [   ⋮   ]
    [ G_n J_n ]         [ -E_tn ]                                             (9)

4. INITIAL REGISTRATION

The motion estimation technique in Sec. 3 only estimates the inter-frame displacement of the human body. Therefore we need an initial registration of the body model to the input image. We estimate the initial body pose by extracting the center line of each part from the 2D input image.

4.1 Modeling registration errors

Figure 4 shows the relationship between a 2D center line on the image plane and the corresponding 3D body part. The 2D center line can be obtained from the silhouette of the human body image by a principal axis calculation. The position and orientation of a body part can be represented as x = [r, t]^T, where t is the translation and r is the rotation about the xyz axes. The error between a 2D center line and a 3D body part can be calculated as the minimum distance between the 3D line segment P = (p_1, p_2) and the plane M (Figure 4):

    h(x, l) = [ (R p_1 + t) · n
                (R p_2 + t) · n ]                                             (10)

R is a 3x3 rotation matrix derived from r, and n is the unit normal vector of plane M:

    n = (q_1' × q_2') / | q_1' × q_2' |                                       (11)

h(x, l) can be linearized using the Taylor expansion around the initial estimates l = l̂_i and x = x̂_{i-1}:

    h(x_i, l_i) ≈ h(x̂_{i-1}, l̂_i) + [∂h(x̂_{i-1}, l̂_i)/∂x] (x_i - x̂_{i-1})
                + [∂h(x̂_{i-1}, l̂_i)/∂l] (l_i - l̂_i)                          (12)

Figure 4: The relationship of the 2D center line and a corresponding body part
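Solving the stacked multi-camera system of eq. (9) for the parameter velocities is a plain linear least-squares problem. A minimal sketch, assuming the per-camera matrices G_i J_i and temporal gradients E_t,i have already been computed (names and shapes are illustrative, not the authors' implementation):

```python
import numpy as np

def solve_motion(GJ_blocks, Et_blocks):
    """Least-squares solve of eq. (9).

    GJ_blocks: list of (m_i x k) matrices G_i J_i, one per camera.
    Et_blocks: list of length-m_i temporal-gradient vectors E_t,i.
    Returns the least-squares estimate of the parameter velocities phi_dot.
    """
    A = np.vstack(GJ_blocks)         # stack the n per-camera systems
    b = -np.concatenate(Et_blocks)   # right-hand side -E_t of eq. (9)
    phi_dot, *_ = np.linalg.lstsq(A, b, rcond=None)
    return phi_dot
```

Gathering constraint rows from several cameras makes the stacked system overdetermined, which is what lets the least-squares solve resolve motions that would be ambiguous from a single view.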

where ∂h/∂x and ∂h/∂l are partial derivatives.

4.2 Minimizing registration errors

We minimize the errors between the 2D center lines and the 3D body parts by using the Extended Kalman Filter (EKF) [12]. The advantage of using the EKF is that the estimate can be updated whenever a new measurement becomes available. The measurement equation is

    z_i = H_i x_i + v_i

where

    z_i = [∂h(x̂_{i-1}, l̂_i)/∂x] x̂_{i-1} - h(x̂_{i-1}, l̂_i)

    H_i = ∂h(x̂_{i-1}, l̂_i)/∂x,    v_i = [∂h(x̂_{i-1}, l̂_i)/∂l] (l_i - l̂_i)

The optimization can be done using the Kalman filter equations:

    x̂_i = x̂_{i-1} + K_i [ z_i - H_i x̂_{i-1} ]

    K_i = σ_{i-1} H_i^T ( H_i σ_{i-1} H_i^T + R_i )^-1

    σ_i = σ_{i-1} - K_i H_i σ_{i-1}

where K_i is the Kalman gain, σ_i is the covariance matrix of the estimate, and R_i is the covariance matrix of the measurement noise.

5. MERGING GRAPHIC OBJECTS

5.1 Cloth simulation

We generate the CG cloth using the pose parameters estimated in Sec. 3 and Sec. 4. We use MAYA Cloth™ for the cloth simulation; any other cloth simulation technique could be used for this purpose. We use an approximate body model with a smooth surface to simulate the natural shape of the CG cloth, with part sizes equal to those of the 3D body model used for tracking.

5.2 Dealing with occlusions

We cannot simply overlay the virtual objects, because they may be partially occluded by body parts; for example, the user's hands may be in front of the virtual objects. We need relative depth information to determine which body parts may occlude the virtual objects. Again, we use the approximate human body model to deal with occlusions.

1) First, we create a texture-mapped 3D model of the human body. Because the body model is already registered onto the input image, the intensity values of the input image can be stored as a texture of the 3D model.
2) The CG cloth and hairstyle are added to the texture-mapped 3D model of the human body.
3) The texture-mapped body model and the CG objects are rendered together to obtain the synthesized image.

In our experiments, small shape differences of the body model are not critical, because we render from the same viewpoint as the input image.

6. EXPERIMENTS

We have implemented a prototype system on a PC to demonstrate the algorithm. Figure 5 shows an example of the initial model fitting: (a) is an input image, (b) is the extracted center lines, and (c) is the result of the model fitting.

Figure 6 shows a movie sequence of a walking person. The input is 50 frames of 640x480 images. (a) shows frames 30 and 40 of the input image sequence. (b) is the tracking result; the wireframe of the human body model is overlaid onto the input images. (c) is the cloth image generated using the 3D body pose estimated from the input images. (d) is the input images merged with the CG cloth.
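The EKF measurement update of Sec. 4.2 follows directly from the three Kalman equations above. A minimal sketch in the standard Kalman form (variable names follow the paper; this is an illustration, not the authors' code):

```python
import numpy as np

def ekf_update(x_prev, sigma_prev, z, H, R):
    """One EKF measurement update (Sec. 4.2): returns x_i, sigma_i."""
    # Kalman gain K_i = sigma_{i-1} H^T (H sigma_{i-1} H^T + R)^-1
    K = sigma_prev @ H.T @ np.linalg.inv(H @ sigma_prev @ H.T + R)
    x = x_prev + K @ (z - H @ x_prev)        # innovation-weighted state update
    sigma = sigma_prev - K @ H @ sigma_prev  # covariance shrinks per measurement
    return x, sigma
```

Each newly extracted center line yields a (z, H, R) triple, so the pose estimate x̂ can be refined incrementally as measurements arrive, which is the advantage of the EKF noted above.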

Figure 5: Result of automatic registration. (a) input image (b) extracted center lines (c) registration of the 3D model

Figure 6: Example of merging virtual cloth. (a) original image sequence (b) tracking the human body (c) generated CG cloth images (d) merging the CG cloth with the input image sequence

7. CONCLUSION

In this paper, we proposed a new match moving technique for articulated figures such as humans. First, we track the current 3D pose of the human figure using spatio-temporal analysis and structural knowledge of the human body. Then we generate CG cloth and merge it with the human figure in the video. We have demonstrated examples using captured video sequences. Future work includes a real-time implementation of the proposed technique.

8. REFERENCES

[1] V. Blanz and T. Vetter: "A Morphable Model for the Synthesis of 3D Faces", Proc. SIGGRAPH '99, pp. 187-194, 1999.
[2] M. Bajura, H. Fuchs and R. Ohbuchi: "Merging virtual objects with the real world: Seeing ultrasound imagery within the patient", SIGGRAPH '92, pp. 203-210.
[3] M. Bajura and U. Neumann: "Dynamic Registration Correction in Video-Based Augmented Reality Systems", IEEE Computer Graphics and Applications, Vol. 15, No. 5, pp. 52-60, Sep. 1995.
[4] M. Bajura and U. Neumann: "Dynamic Registration Correction in Augmented Reality Systems", IEEE VRAIS 1995 Proc., pp. 189-196, 1995.
[5] E. K. Edwards, J. P. Rolland and K. P. Keller: "Video See-Through Design for Merging of Real and Virtual Environments", IEEE VRAIS 1993 Proc., pp. 222-233.
[6] S. Gottschalk and J. Hughes: "Autocalibration for Virtual Environments Tracking Hardware", SIGGRAPH '93, pp. 65-72.
[7] R. Azuma and G. Bishop: "Improving Static and Dynamic Registration in an Optical See-Through HMD", SIGGRAPH '94, pp. 197-204.
[8] A. L. Janin, D. W. Mizell and T. P. Caudell: "Calibration of Head-Mounted Display for Augmented Reality Applications", IEEE VRAIS 1993 Proc., pp. 246-255.
[9] D. M. Gavrila: "Visual Analysis of Human Movement: A Survey", Computer Vision and Image Understanding, Vol. 73, No. 1, pp. 82-98, 1999.
[10] C. Barron: "Painting in the Digital Age", SIGGRAPH '98 Sketches, 1998.
[11] M. Yamamoto, A. Sato and S. Kawada: "Incremental tracking of human action from multiple views", Proc. of CVPR '98, pp. 2-7, 1998.
[12] H. W. Sorenson: "Kalman Filtering: Theory and Application", IEEE Press, New York, 1985.