15 Facial Performance Capture

Spring 2019 CSCI 621: Digital Geometry Processing 15 Facial Performance Capture Hao Li http://cs621.hao-li.com 1 Performance to Facial Animation 2 Motion Capture 3 Motion Capture 4 Facial Animation in Films 5 Weta Digital Facial Modeling and Scanning 7 Facial Modeling and Scanning acquisition registration initial merging alignment copyright Paramount Pictures 8 Facial Modeling and Scanning 9 High-End 3D Scanning 10 Low-Cost Passive Scanning (AGI Soft) stereo pair 3D scan 11 Low-Cost Active Scanning Microsoft Kinect & Kinect Fusion 12 Rigging & Animation 13 Blendshapes & Correctives for Realism 14 Motion Capture Technologies Sparse Markers Dense Markers Markerless MOVA Image Metrics 15 Using Markers tracking input video input performance with markers retargeting 16 Using Dense Markers 17 Vision-Based Tracking & Texturing LA Noire 18 Typical Animation Workﬂow in Industry Modeling + Light-weight Motion Cleanup & 3D Scanning Fitting Rigging Capture Key-Framing Source Character Complex Key-Framing + Modeling Retargeting Rigging Proc.+Sim. Target Character 19 Markerless Facial Capture 20 3D Range Sensor Motion can be Captured at the Same Resolution as the Geometry 21 Vapor Ware? (Spatial Phase Imaging) 22 USC ICT Light Stage 5 23 Template Fitting 24 Template Fitting with PCA 25 Template Based Tracking 26 Overview depth for tracking texture for tracking Li et al. Siggraph Asia Borshukov et al. 2003 2009 coupled tracking + texture and depth for reconstruction tracking Valgaerts et al. Siggraph Asia Beeler et al. Siggraph 2011 2012 27 Using Light Stage X Using Light Stage X Using Light Stage X Using Light Stage X Using Light Stage X Requirements for a Practical System 1.Real-time performance 2.Robustness to noise 3.High-level semantics 33 Realtime Facial Capture 34 Why Realtime? VFX/Game Production Virtual Avatars Robotics AR/Virtual Mirror 35 Objective 36 Building Expression Space tracked template input scan 37 Expression PCA for Reduced Dimension Principal Component Analysis = +w1 +w2 +w3 +w4 38 Realtime Systems depth sensor as input with training with little training little to no training Weise et al. SCA 09 Li et al. Siggraph 2010 Weise et al. Siggraph 2011 & Bouaziz et al. Siggraph 2013 reduced calibration and more accessible 39 Realtime Systems video as input without training without training with training Saragih et al. IJCV 2011 Image Metrics 2011 Cao et al. Siggraph 2013 increased accuracy and expressiveness 40 Automatic Facial Rigging 41 Blendshape Animation Blendshape Animation blending weights = B0 + α1B1 + α2B2 + α3B3 + ... laughing neutral face blendshapes SIGGRAPH 2010 42 Blendshape Retargeting laughing ... many blendshapes SIGGRAPH 2010 43 Expression Transfer prior blendshapes ... [Noh & Neumann ’01] [Sumner & Popovic ’04] reconstructed blendshapes ... 44 Problems prior expression transfer ground truth 45 Example Based-Facial Rigging input face ... input training poses Facial Rigging ... prior generic rig blending weights output blendshapes ... ... 46 Bilinear Problem ... input face input examples n B + α B S 0 ij i ⇡ j i=1 X blending weights output blendshapes ... ... 47 Decoupled Optimization n B + α B S 0 ij i ⇡ j i=1 X 48 Decoupled Optimization n B + α B S Step A 0 ij i ⇡ j i=1 X ... n Step B B + α B S 0 ij i ⇡ j i=1 X ... 49 Gradient Domain Optimization n n 2 ˜ 2 S 2 2 argmin B0 + αijBi Sj + β Bi Bi argmin M0 + αijMi Mj +β Mi + M0 Gi M0 k − k k − k M k − k ⇥ − · ⇥ Bi i=1 i i=1 X X 50 Comparison prior without examples with 6 examples input example whistle surprise 51 Directable Facial Animation 3D scans facial tracking 52 Blendshapes for Tracking ICP with Blendshapes 53 Animation Prior 54 Problem: Noisy Input input scans tracking goal 55 Performance-Based Facial Tracking expression model animation prior ... ... input data expression parameters Stable Tracking temporal coherence 56 Animation as Prior reference video 9500 frames artist 57 N-Dim Expression Space smile mouth open 58 Animation Manifold correction ↵t smile mouth open 59 Probabilistic Animation Prior correction ↵t smile mouth open Lau et al. 2009 60 Probabilistic Animation Prior mouth open correction ↵t smile time mouth open 61 Probabilistic Animation Prior correction ↵t smile t M ↵ − mouth open 62 Temporal Joint Probabilistic Distribution M time K t t M t t M T 2 p(α ,..,α − )= π (α ,..,α − µ ,C C + ⇥ I). kN | k k k k kX=1 MPPCA principal Gaussian weights mean model components noise 63 MAP Estimation t t 1 t M ↵ = arg max p(↵ D, ↵ − ,..,↵ − ) ↵ | MPPCA t 1 t M arg max p(D ↵) p(↵, ↵ − ,..,↵ − ) ≈ ↵ | likelihood prior | {z } | {z } V T 2 n (v v⇤) geometry p(G x)= k exp( || i i − i || ) | geo − 2σ2 i=1 geo Y V T 2 I (p p⇤) texture p(I x)= k exp( ||⇥ i i − i || ) | im − 2σ2 i=1 im Y 64 ILM’s Kinect Monster Mirror 65 Fast Calibration Li et al. SIGGRAPH 2013 Facial Performance Capture Li et al. SIGGRAPH 2013 Pipeline Pipeline Overview input output output retargeting retargeting 69 Pipeline Overview input output tracking retargeting 70 Pipeline Overview input output tracking 71 Pipeline Overview input tracking output rigid motion 72 Pipeline Overview input tracking output rigid motion 73 Pipeline Overview input tracking output rigid motion blendshape 74 Pipeline Overview input tracking output rigid motion blendshape 75 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 76 Building Initial Blendshape Model rigid ICP template deformation deformation deformation PCA fitting deformation fitting transfer transfer transfer non-rigid ICP 77 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes initial blendshapes 78 Rigid Motion Tracking S ni v c (R, t)= n>(v (R, t) p ) i i i i − i pi S E =min ci (R, t) rigid R,t i X 79 Rigid Motion Tracking input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 80 Blendshape Tracking (0) (l) vi(x)= vi + vi xl Xl S ni u c (x) = n> (vi(x) pi) v j i i − p i i F cj (x)= Hj(uj) P vj(x) E = min (cS(x))2 + w cF(x) 2 bs x i k j k i j X X x [0, 1] l 2 81 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 82 Per-Vertex Deformation P c (∆vi) = (pi vi) ∆v i − − i u j W pi vi cj (∆vj) = Hj(uj) P ∆vj L c (∆v) = L(b0) ∆v I G Q ∆v = a 2 3 L 4 5 83 Fast Laplacian Deformation I G Q ∆v = a 2 3 uj L pi vi 4 5 G K ∆v = a 1 1 ∆v = K>[KK>]− G− a sparse pre-factorized sparse constant in time constant in time trivially inversed 84 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 85 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 86 Pipeline Overview input tracking output per-vertex rigid motion blendshape deformation initial blendshapes 87 Pipeline Overview input tracking output per-vertex rigid motion blendshape adaptive PCA deformation adaptive PCA model initial blendshapes 88 Pipeline Overview input tracking output per-vertex rigid motion blendshape adaptive PCA deformation anchor corrective incremental shapes shapes PCA initial blendshapes adaptive PCA model 89 Tracking Comparison depth map & our Weise et al. ’11 our Weise et al. ’11 2D features tracking tracking retargeting retargeting 90 Tracking Basic Emotions Li et al. SIGGRAPH 2013 anger surprise joy sadness disgust fear input data our method Weise et al. 2011 Faces 2014: Discussion Bouaziz et al. 2013 Cao et al. 2014 Li et al. 2013 input 3D/2D input 2D input 3D/2D no calibration neutral face neutral face blendshape blendshape per-vertex deformation Into the Mainstream Ninja Theory / Epic Games Motivation (Human Digitization) State-of-the-art Digital Humans (Siren) Snapchat Facebook / MSQRD FaceX Robustness FaceX Instant User Switching FaceX Kids State-of-the-Art in Real-Time Facial Tracking Preliminary Findings: Segmentation Microsoft 2015 Pipeline Two-Stream Deconvolution Network Comparison Open Problems USC/ICT Activision Open Problems Melbourne Acting School 2010 Virtual Reality Once upon a dream Virtual Reality Reloaded Oculus VR 2012 / Crytek 2014 Consuming VR NextGen Communication Platform Online Virtual Worlds Oculus Oculus Virtual Training, Simulation, Education Challenges 2016 2021 ▪Accurate simulation/training ▪Building trust, communicate feelings, resolving conflicts Occlusions Preliminary Findings: Segmentation Microsoft 2015 Facial Performance Sensing HMD Facial Performance Sensing HMD Facial Performance Sensing HMD Ultra Thin Flexible Electronic Materials Live Demo HMD Prototype Design Microsoft 2015 Feature-Based Tracking Kazemi and Sullivan, CVPR 2014 Cao et al., SIGGRAPH 2014 High Dimensionality & Non-Linearity occlusions, lighting, expressions, identity sticky lips, biting, visemes, … Deep Learning Model for Facial Expressions non-linear high dimensional low dimensional Online Operation Label Transfer via Audio Alignment Eye Animation Results Retargeting Motivation (Human Digitization) AR Telepresence (Hololens 2) Full Facial Performance Capture Pinscreen Facial Tracker (2018) Pinscreen (2017) Real-Time Lighting Estimation Deep Learning-Based Face Synthesis Deep Learning-Based Face Synthesis Masked Neutral Input Deformed Neutral Generator Synthesized Expression Neutral Expression Normal + Depth Conditional GAN Deep Learning-Based Face Synthesis Deep Learning-Based Face Synthesis Deep Learning-Based Face Synthesis NextGen Photoreal Avatars conﬁdential www.pinscreen.com Full Video Inference Interaction With 3D Avatars Blade Runner 2049 (2017) http://cs621.hao-li.com Thanks! 162.

Load more