Modeling the Product Manifold of Posture and Motion

Modeling the Product Manifold of Posture and Motion Ankur Datta Yaser Sheikh Takeo Kanade Robotics Institute Carnegie Mellon University {ankurd,yaser,tk}@cs.cmu.edu Abstract Long-term human motion is composed of an ensemble of different activities with varying complexity. This makes it challenging to develop models to accurately estimate human motion. In this paper, we exploit the dependencies that exist between posture and motion for long-term human motion estimation. We propose to model the nonlinear motion manifold as a collection of local linear models, noting that given a particular posture, the variation in motion for that posture can be well-approximated by a linear model. A col- Figure 1. First column: top row – 700 posture exemplars out of lection of local linear models is easy to fit and also has the a data set of 18,000 posture exemplars and bottom row – 3D iso- expressiveness to encode several activities in any arbitrary contour for 18,000 posture exemplars. Second and third column: order. Furthermore, to account for the varying complexity tow row – Posture variability around two key postures and bottom of different activities, each local linear model can have a row – their associated 3D iso-contours. Developing global models different dimensionality. A collection of local linear models, for the nonlinear iso-contour (first column) generated by the entire thus, avoids the limitation of global models that require a data set is a difficult task. However, by exploiting the local depen- uniform dimensionality for the latent motion manifold. This dencies between posture and motion, we can compactly model the model allows us to linearly regularize motion estimation al- nonlinear manifold as a collection of Local Linear Models (LLMs) gorithms over the nonlinear human motion manifold. Our (second and third column). results demonstrate that a collection of local linear models provides an effective representation for the motion manifold by low-dimensional manifolds. However, to handle arbi- when compared to other global models such as the bilinear trary orderings of activities, extending global models would model [18] and the Principal Component Analysis [14]. require a combinatorial increase in the training data. In this paper, we propose an approach that exploits local 1. Introduction dependencies between posture and motion for human motion estimation. Given a particular posture, the variation Human motion is influenced by the internal intentions in motion for that posture is compactly approximated us- of the actor and the external constraints of the actor’s en- ing a local linear model. Taken together, a collection of vironment. Consider a driver inside a vehicle. The series local linear models can encode long-term human motion of activities performed by the driver depends on what the that may consist of human activities in any arbitrary order. driver wants to do (e.g. turn on the air conditioner, turn up Figure 1 illustrates the dependency between posture and the volume, or adjust the seat height), and what driving re- motion by comparing a global model, fit to motion across quires (e.g. shift gears, adjust the rear view mirror, and turn a sequence of 18,000 frames, to local models, fit to mo- the steering wheel). The challenge in estimating motion in tion given particular postures. It is evident from the figure such settings is that both the intentions of the driver and the that motion is highly constrained given knowledge of pos- nature of environmental constraints usually remain hidden. ture. A key benefit of our approach is that each local linear As a result, human motion manifests itself as a series of model can have a different dimensionality, thereby, well- activities with varying complexity whose order is arbitrary. approximating the irregular motion manifold. The irregu- In previous literature, activities have been modeled globally larity of the motion manifold arises from the fact that dif- ψ3 ψ5 ψ6 ferent activities have different complexity. Allowing vary- ψ4 ing dimensionality across the manifold avoids the limitation ψ2 of global models that require commitment to a uniform di- ψ1 mensionality. +"*%"# !+)#%$",- RP 2. Related Work Humans are capable of performing a wide-variety of complex motion due to the highly articulated nature of the human body. Human motion data, therefore, resides in a !"#$%&'()*%"# !+)#%$",- high-dimensional configuration space. Previous approaches P have attempted to model the high-dimensional motion man- R ifold using either the linear models or the nonlinear models [6, 7, 16, 22, 20, 1, 10]. x Linear motion models employ Principal Component x3 x4 5 x6 x1 x2 Analysis (PCA) [14] to develop models for human motion Figure 2. Product Manifold of posture and motion. Postures on in computer vision and graphics [13, 17, 1]. While it is the configuration manifold have associated Local Linear Models straightforward to fit the linear models, the complexity of (LLMs) that encode the motion variation in the neighborhood of human motion does not lend itself particularly amenable to that particular posture. modeling using the linear models [20]. Nonlinear motion models such as the Mixture of Prob- abilistic PCA [20], Locally Linear Embeddings [11] or the prone regularization penalty. Our approach, however, does Gaussian Process Dynamical Models (GPDM) [22], enable require comparatively more number of local models. The one to capture the nonlinearities in human motion at the computational cost of doing so, though, is minimal as the expense of local-minima prone fitting. Nonlinear motion local linear models are learnt using PCA over small neigh- models can be divided into two categories, those that com- borhoods on the motion manifold. mit to a single manifold dimensionality to model the motion manifold [11, 10, 22] or those that model the motion 3. The Product Manifold of Posture & Motion manifold using a mixture of several local linear or nonlinear models such as Mixture of Probabilistic PCA or Geometric The posture x ∈ RP of a human is defined as the instan- Topograhphic Mapping (GTM) [20, 19, 3]. Long-term hu- taneous configuration of a human body, in 2D image space man motion is an irregular manifold with a diverse range of or 3D world space. All human configurations x lie on a activities with varying complexity (see Figure 6). Commit- manifold X. The instantaneous change in human posture is ˙x x dx ˙x RP ting to a single dimensionality for the latent manifold, there- represented by = f( ), f(·) = dt , i ∈ . The pur- fore, neither captures the manifold structure effectively nor pose of this paper is to estimate human motion. This is chal- extracts the computational savings [11, 22]. lenging since humans are highly articulate objects and move Nonlinear approaches that model the motion manifold in complex, and unpredictable ways. If we could character- as a mixture of local linear or nonlinear models suffer ize the motion manifold X˙ then we could regularize the esti- from high subspace angles between the learnt local mod- mates of motion to lie on the manifold X˙ . A P-dimensional els [20, 19, 3, 2, 8]. This problem was highlighted by the chart on X˙ is a pair (V, ψ) consisting of a subset V of X˙ work of Roweis et al. [15, 21]. These high subspace angles and a bijection ψ from V onto an open set in RP . V is the between the local models arise because the traditional ap- domain of the chart. A point ˙x ∈ V can thus be assigned proaches employ a disjoint partition and fit approach which coordinates (a1,...,aP ) by means of the projection map- essentially makes the linear models independent causing pings vi : RP → R; ai = ui ◦ψ, i = 1,...,P . Henceforth, high subspace angles between subspaces. Roweis et al. we will denote a chart only by its bijection ψ when there is and Brant [15, 21, 4] outlined model selection algorithms no ambiguity. that imposed a nonlinear regularization penalty to encour- The human motion manifold is highly nonlinear and lies age neighboring subspace models to align. in a high-dimensional configuration space. It is therefore, a In this paper, we propose an alternative approach to en- difficult task to find a single chart covering the entire man- courage neighboring subspace models to align. We propose ifold X˙ . Linear or Non-Linear Dimensionality Reduction to learn a collection of local models by increasing the data techniques (NLDR) attempt to model the high-dimensional overlap between the neighboring models. Increased data manifold by computing a globally valid low-dimensional overlap between the models automatically forces them to mapping into a latent coordinate system, usually the real- ′ align with each other without the need for costly minima- plane RP , where P ′ ≪ P . However, the operating as- 60 50 &)% ! !& !& %$ %$ &)&( %# % %# %" %" %! %! 0#.$1)2&*34,"& & 0#.$1)2&*34,"& & %& %& (&)'&$%*(&+,-./' (&)'&$%*(&+,-./'% $ % $ ! # !"#$%&'$ ! # !"#$%&'$ 40 ' ' " " " " ( ! ( ! 30 (a) (b) PCA−2 Figure 4. Subspace angles between neighboring charts. (a) Low PCA−5 subspace angles are obtained using the proposed approach that PCA−10 20 LLM−2 uses shared data to learn the charts. (b) High subspace angles are Reprojection Error LLM−5 obtained using the Expectation-Maximization fitting of a Mixture LLM−10 10 of Probabilistic Principal Components Analysis [20]. 0 X 0 50 100 150 200 250 300 350 of charts to cover a manifold employ neighboring disjoint Frames charts [20, 19, 3, 2, 8]. Two charts, ψi and ψj are disjoint Figure 3. Comparing the representational power of PCA and a col- iff ψi ψj = ⊘. This implies that there is no apriori con- −1 lection of Local Linear Models (LLMs). This graph shows the ditionT to ensure that the mapping ψj ◦ ψi is a diffeomor- reconstruction error for held-out motion vectors from a motion phism leading to high-subspace angles between the neigh- capture data set of 70,000 frames.

Load more