Spring 2019 CSCI 621: Digital Geometry Processing

15 Facial Performance Capture

Hao Li
http://cs621.hao-li.com

1 Performance to Facial Animation

2 Motion Capture

3 Motion Capture

4 Facial Animation in Films

5 Facial Modeling and Scanning

7 Facial Modeling and Scanning

acquisition, initial alignment, registration, merging

copyright Paramount Pictures

8 Facial Modeling and Scanning

9 High-End 3D Scanning

10 Low-Cost Passive Scanning (Agisoft)

stereo pair

3D scan

11 Low-Cost Active Scanning

Microsoft & Kinect Fusion

12 Rigging & Animation

13 Blendshapes & Correctives for Realism

14 Motion Capture Technologies

Sparse Markers

Dense Markers: MOVA

Markerless: Image Metrics

15 Using Markers

tracking

input video input performance with markers retargeting

16 Using Dense Markers

17 Vision-Based Tracking & Texturing

LA Noire

18 Typical Animation Workflow in Industry

Source Character: Modeling + 3D Scanning, Fitting, Light-weight Rigging, Motion Capture, Cleanup & Key-Framing

Target Character: Modeling, Retargeting, Complex Rigging, Key-Framing + Proc.+Sim.

19 Markerless Facial Capture

20 3D Range Sensor

Motion can be Captured at the Same Resolution as the Geometry

21 Vapor Ware? (Spatial Phase Imaging)

22 USC ICT Light Stage 5

23 Template Fitting

24 Template Fitting with PCA

25 Template Based Tracking

26 Overview

depth for tracking: Li et al. SIGGRAPH Asia 2009

texture for tracking: Borshukov et al. 2003

coupled tracking + reconstruction: Valgaerts et al. SIGGRAPH Asia 2012

texture and depth for tracking: Beeler et al. SIGGRAPH 2011

27 Using Light Stage X

Requirements for a Practical System

1. Real-time performance

2. Robustness to noise

3. High-level semantics

33 Realtime Facial Capture

34 Why Realtime?

VFX/Game Production Virtual Avatars

Robotics AR/Virtual Mirror

35 Objective

36 Building Expression Space

tracked template input scan

37 Expression PCA for Reduced Dimension

Principal Component Analysis

[face] = [mean face] + w1 [PC 1] + w2 [PC 2] + w3 [PC 3] + w4 [PC 4]
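The decomposition on this slide (a face written as a mean plus weighted principal components) can be sketched in a few lines of NumPy. This is a minimal illustration under assumptions: the tracked templates are taken to be flattened vertex arrays in dense correspondence, the data below is synthetic, and the function names (`fit_expression_pca`, `encode`, `decode`) are hypothetical.

```python
import numpy as np

def fit_expression_pca(scans, num_components):
    """scans: (num_frames, 3 * num_vertices), one flattened tracked mesh per row."""
    mean_face = scans.mean(axis=0)
    # Rows of vt are orthonormal principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(scans - mean_face, full_matrices=False)
    return mean_face, vt[:num_components]

def encode(face, mean_face, components):
    """Project a face onto the reduced expression space (the weights w_k)."""
    return components @ (face - mean_face)

def decode(weights, mean_face, components):
    """face = mean + sum_k w_k * PC_k"""
    return mean_face + weights @ components

# Synthetic stand-in for tracked scans: 50 frames of a 10-vertex mesh
# whose expressions live in a 4-dimensional subspace (assumption).
rng = np.random.default_rng(0)
true_basis = rng.normal(size=(4, 30))
scans = rng.normal(size=(50, 4)) @ true_basis + rng.normal(size=30)

mean_face, components = fit_expression_pca(scans, num_components=4)
w = encode(scans[0], mean_face, components)
reconstruction = decode(w, mean_face, components)
```

With four components the toy data is reproduced exactly because it was built from a four-dimensional subspace; real scans would leave a residual that shrinks as components are added.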

38 Realtime Systems

depth sensor as input

with training | with little training | little to no training

Weise et al. SCA 09 | Li et al. SIGGRAPH 2010 | Weise et al. SIGGRAPH 2011 & Bouaziz et al. SIGGRAPH 2013

reduced calibration and more accessible

39 Realtime Systems

video as input

without training: Saragih et al. IJCV 2011

without training: Image Metrics 2011

with training: Cao et al. SIGGRAPH 2013

increased accuracy and expressiveness

40 Automatic Facial Rigging

41 Blendshape Animation

blending weights

expression $= B_0 + \alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 + \dots$

laughing

neutral face blendshapes
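A minimal sketch of the blendshape formula on this slide: the neutral face plus a weighted sum of blendshapes. It assumes each blendshape is stored as a per-vertex offset from the neutral face (which is what adding them to B0 suggests); the function name `blend` and the toy data are hypothetical.

```python
import numpy as np

def blend(neutral, deltas, weights):
    """Evaluate B0 + sum_i alpha_i * B_i.

    neutral: (V, 3) neutral face B0
    deltas:  (n, V, 3) blendshapes stored as offsets from the neutral face
    weights: (n,) blending weights alpha_i
    """
    return neutral + np.tensordot(weights, deltas, axes=1)

# Toy face with 4 vertices and 2 blendshapes (illustrative values only).
neutral = np.zeros((4, 3))
deltas = np.zeros((2, 4, 3))
deltas[0, 0] = [0.0, 1.0, 0.0]   # shape 1 moves vertex 0 up
deltas[1, 1] = [2.0, 0.0, 0.0]   # shape 2 moves vertex 1 sideways

face = blend(neutral, deltas, np.array([0.5, 1.0]))
```

Because the model is linear in the weights, the same weight vector can drive any character that has a matching set of blendshapes.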

SIGGRAPH 2010

42 Blendshape Retargeting

laughing

... many blendshapes

43 Expression Transfer

prior blendshapes ...

[Noh & Neumann ’01] [Sumner & Popovic ’04]

reconstructed blendshapes ...

44 Problems

prior expression transfer ground truth

45 Example-Based Facial Rigging

input face

... input training poses Facial Rigging ... prior generic rig

blending weights output blendshapes

......

46 Bilinear Problem

... input face input examples

$$B_0 + \sum_{i=1}^{n} \alpha_{ij} B_i \approx S_j$$

blending weights $\alpha_{ij}$, output blendshapes $B_i$

......

47 Decoupled Optimization

$$B_0 + \sum_{i=1}^{n} \alpha_{ij} B_i \approx S_j$$

48 Decoupled Optimization

Step A: $$B_0 + \sum_{i=1}^{n} \alpha_{ij} B_i \approx S_j$$

...

Step B: $$B_0 + \sum_{i=1}^{n} \alpha_{ij} B_i \approx S_j$$

...
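The problem is bilinear because both the blending weights and the blendshapes are unknown; fixing one makes the other a linear least-squares problem. The sketch below alternates between the two. It rests on assumptions the slide does not spell out: Step A is taken to solve for the shapes with weights fixed, Step B for the weights with shapes fixed, a simple pull toward the prior rig with a hypothetical weight `beta` keeps Step A well-posed, and everything works on raw vertex offsets rather than in the gradient domain.

```python
import numpy as np

def energy(alpha, shapes, targets, prior, beta):
    """Fit term ||alpha^T B - (S - B0)||^2 plus beta * ||B - prior||^2."""
    fit = alpha.T @ shapes - targets
    reg = shapes - prior
    return float((fit ** 2).sum() + beta * (reg ** 2).sum())

def step_a(alpha, targets, prior, beta):
    """Weights fixed: solve the regularized normal equations for the shapes."""
    n = alpha.shape[0]
    lhs = alpha @ alpha.T + beta * np.eye(n)
    rhs = alpha @ targets + beta * prior
    return np.linalg.solve(lhs, rhs)

def step_b(shapes, targets):
    """Shapes fixed: unconstrained least squares for the weights of each example."""
    alpha, *_ = np.linalg.lstsq(shapes.T, targets.T, rcond=None)
    return alpha

# Synthetic setup: n blendshapes, m example poses, d coordinates per shape.
rng = np.random.default_rng(1)
n, m, d = 3, 5, 12
true_shapes = rng.normal(size=(n, d))
true_alpha = rng.uniform(size=(n, m))
targets = true_alpha.T @ true_shapes             # rows are S_j - B0
prior = true_shapes + 0.3 * rng.normal(size=(n, d))  # generic rig (noisy)
alpha = true_alpha + 0.2 * rng.normal(size=(n, m))   # rough initial weights
shapes = prior.copy()
beta = 0.1

energy_before = energy(alpha, shapes, targets, prior, beta)
for _ in range(20):
    shapes = step_a(alpha, targets, prior, beta)
    alpha = step_b(shapes, targets)
energy_after = energy(alpha, shapes, targets, prior, beta)
```

Each step minimizes the same energy exactly over one block of variables, so the energy cannot increase from one iteration to the next.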

49 Gradient Domain Optimization

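$$\operatorname*{argmin}_{B_i} \Big\| B_0 + \sum_{i=1}^{n} \alpha_{ij} B_i - S_j \Big\|^2 + \big\| B_i - \tilde{B}_i \big\|^2$$

$$\operatorname*{argmin}_{M_i} \Big\| M_0 + \sum_{i=1}^{n} \alpha_{ij} M_i - M_j^S \Big\|^2 + \big\| M_i + M_0 - G_i \cdot M_0 \big\|^2$$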

50 Comparison

prior without examples with 6 examples input example

whistle surprise

51 Directable Facial Animation

3D scans facial tracking

52 Blendshapes for Tracking

ICP with Blendshapes

53 Animation Prior

54 Problem: Noisy Input

input scans tracking goal

55 Performance-Based Facial Tracking

expression model animation prior

......

input data expression parameters

Stable Tracking

temporal coherence

56 Animation as Prior

reference video 9500 frames artist

57 N-Dim Expression Space

smile

mouth open

58 Animation Manifold

correction $\alpha^t$ smile

mouth open

59 Probabilistic Animation Prior

correction $\alpha^t$ smile

mouth open Lau et al. 2009

60 Probabilistic Animation Prior

mouth open

correction $\alpha^t$ smile time

mouth open

61 Probabilistic Animation Prior

correction $\alpha^t$ smile $\alpha^{t-M}$

mouth open

62 Temporal Joint Probabilistic Distribution

M

time

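$$p(\alpha^t, \ldots, \alpha^{t-M}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\big(\alpha^t, \ldots, \alpha^{t-M} \,\big|\, \mu_k,\; C_k C_k^T + \sigma_k^2 I\big)$$

MPPCA model: Gaussian weights $\pi_k$, mean $\mu_k$, principal components $C_k$, noise $\sigma_k$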
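To make the mixture prior concrete, the sketch below evaluates the log-density of a mixture of probabilistic PCA components for one stacked window of expression parameters. It is an illustration only: the parameters are made up, the function names are hypothetical, and fitting the mixture to artist animation data (normally done with EM) is not shown.

```python
import numpy as np

def stack_window(history):
    """Concatenate the expression vectors alpha^t, ..., alpha^(t-M) into one vector."""
    return np.concatenate(history)

def mppca_log_density(x, weights, means, components, noise_std):
    """log sum_k pi_k N(x | mu_k, C_k C_k^T + sigma_k^2 I), computed stably."""
    d = x.shape[0]
    log_terms = []
    for pi_k, mu_k, c_k, s_k in zip(weights, means, components, noise_std):
        cov = c_k @ c_k.T + (s_k ** 2) * np.eye(d)
        diff = x - mu_k
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)
        log_terms.append(np.log(pi_k) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    log_terms = np.array(log_terms)
    top = log_terms.max()
    # log-sum-exp over the K mixture components
    return float(top + np.log(np.exp(log_terms - top).sum()))

# Sanity case: a single isotropic unit Gaussian in 2D, evaluated at its mean.
unit_value = mppca_log_density(
    np.zeros(2), [1.0], [np.zeros(2)], [np.zeros((2, 1))], [1.0]
)

# Toy two-component prior over a window of two 2D expression vectors (made up).
means = [np.zeros(4), np.full(4, 3.0)]
comps = [np.eye(4)[:, :1], np.eye(4)[:, 1:2]]
near = mppca_log_density(stack_window([np.zeros(2), np.zeros(2)]),
                         [0.5, 0.5], means, comps, [0.5, 0.5])
far = mppca_log_density(stack_window([np.full(2, 10.0), np.full(2, -10.0)]),
                        [0.5, 0.5], means, comps, [0.5, 0.5])
```

A window that resembles the training animations scores a high density; an implausible jump scores low, which is what lets the prior stabilize noisy tracking.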

63 MAP Estimation

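$$\alpha^t = \arg\max_{\alpha} \; p(\alpha \mid D, \alpha^{t-1}, \ldots, \alpha^{t-M}) \approx \arg\max_{\alpha} \; \underbrace{p(D \mid \alpha)}_{\text{likelihood}} \; \underbrace{p(\alpha, \alpha^{t-1}, \ldots, \alpha^{t-M})}_{\text{prior (MPPCA)}}$$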

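geometry: $$p(G \mid x) = \prod_{i=1}^{V} k_{geo} \exp\!\left(-\frac{\| n_i^T (v_i - v_i^*) \|^2}{2\sigma_{geo}^2}\right)$$

texture: $$p(I \mid x) = \prod_{i=1}^{V} k_{im} \exp\!\left(-\frac{\| \nabla I_i^T (p_i - p_i^*) \|^2}{2\sigma_{im}^2}\right)$$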
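A short worked step shows why MAP estimation with these likelihoods becomes a least-squares problem. This assumes the two likelihoods are products over the V vertices of Gaussians in a point-to-plane residual and an image-gradient residual, as the symbols on the slide suggest. Taking negative logarithms:

```latex
\begin{aligned}
-\log p(G \mid x) &= \frac{1}{2\sigma_{geo}^2} \sum_{i=1}^{V} \big\| n_i^T (v_i - v_i^*) \big\|^2 - V \log k_{geo}, \\
-\log p(I \mid x) &= \frac{1}{2\sigma_{im}^2} \sum_{i=1}^{V} \big\| \nabla I_i^T (p_i - p_i^*) \big\|^2 - V \log k_{im}.
\end{aligned}
```

The constant terms do not depend on x, so maximizing the posterior amounts to minimizing the two weighted sums of squared residuals plus the negative log of the animation prior. The variances act as relative weights between the geometry and texture terms.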

64 ILM’s Kinect Monster Mirror

65 Fast Calibration

Li et al. SIGGRAPH 2013

Facial Performance Capture

Pipeline

Pipeline Overview

input output output

retargeting

retargeting

69 Pipeline Overview

input output

tracking

retargeting

70 Pipeline Overview

input output

tracking

71 Pipeline Overview

input tracking output

rigid motion

73 Pipeline Overview

input tracking output

rigid motion blendshape

75 Pipeline Overview

input tracking output

rigid motion, blendshape, per-vertex deformation

initial blendshapes

76 Building Initial Blendshape Model

[Figure labels: rigid ICP, PCA fitting, template fitting (non-rigid ICP), deformation transfer (×3)]

77 Pipeline Overview

input tracking output

rigid motion, blendshape, per-vertex deformation

initial blendshapes

78 Rigid Motion Tracking

$$c_i^S(R, t) = n_i^\top \big(v_i(R, t) - p_i\big)$$

$$E_{rigid} = \min_{R, t} \sum_i \big(c_i^S(R, t)\big)^2$$
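The rigid term is a point-to-plane distance between each model vertex and its corresponding scan point. One common way to minimize it, sketched below, is to linearize the rotation for small angles and solve a 6-unknown least-squares problem. This is an assumption about the solver, not something the slide states, and the correspondences are taken as given (a full ICP loop would re-find closest points and repeat). Function names are hypothetical.

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix for the axis-angle vector omega."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def point_to_plane_step(v, p, n):
    """One linearized step: minimize sum_i (n_i . (v_i + w x v_i + t - p_i))^2.

    v: (N, 3) model vertices, p: (N, 3) matched scan points, n: (N, 3) scan normals.
    """
    A = np.hstack([np.cross(v, n), n])          # uses n.(w x v) = w.(v x n)
    b = np.einsum('ij,ij->i', n, p - v)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return rodrigues(x[:3]), x[3:]

def rigid_energy(v, p, n, R, t):
    """Sum of squared point-to-plane residuals n_i^T (R v_i + t - p_i)."""
    r = np.einsum('ij,ij->i', n, v @ R.T + t - p)
    return float(r @ r)

# Synthetic check: the scan is the model shifted by a known translation.
rng = np.random.default_rng(2)
v = rng.normal(size=(50, 3))
n = rng.normal(size=(50, 3))
n /= np.linalg.norm(n, axis=1, keepdims=True)
t_true = np.array([0.05, -0.02, 0.03])
p = v + t_true

R, t = point_to_plane_step(v, p, n)
```

For larger rotations the step is only approximate, which is why it is normally iterated together with the correspondence search.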

79 Rigid Motion Tracking

input tracking output

rigid motion, blendshape, per-vertex deformation

initial blendshapes

80 Blendshape Tracking

$$v_i(x) = v_i^{(0)} + \sum_l v_i^{(l)} x_l$$

$$c_i^S(x) = n_i^\top \big(v_i(x) - p_i\big)$$

$$c_j^F(x) = H_j(u_j)\, P\, v_j(x)$$

$$E_{bs} = \min_x \sum_i \big(c_i^S(x)\big)^2 + w \sum_j \big\| c_j^F(x) \big\|^2, \qquad x_l \in [0, 1]$$
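Because each vertex is linear in the blendshape weights, the point-to-plane term is a linear least-squares problem with box constraints on the weights. The sketch below solves it with projected gradient descent. It is a simplification under stated assumptions: the 2D feature term is left out, correspondences are given, and the solver choice and function name are illustrative rather than taken from the slides.

```python
import numpy as np

def fit_blendshape_weights(neutral, deltas, points, normals, iters=500):
    """Minimize sum_i (n_i . (v_i(x) - p_i))^2 subject to 0 <= x_l <= 1.

    neutral: (N, 3) vertices v^(0); deltas: (L, N, 3) per-shape offsets v^(l);
    points:  (N, 3) matched scan points; normals: (N, 3) scan normals.
    """
    A = np.einsum('ij,lij->il', normals, deltas)          # (N, L)
    b = np.einsum('ij,ij->i', normals, points - neutral)  # (N,)
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)              # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # gradient step on 0.5 * ||A x - b||^2, then project onto the box
        x = np.clip(x - step * (A.T @ (A @ x - b)), 0.0, 1.0)
    return x

# Synthetic check: build a scan from known weights and recover them.
rng = np.random.default_rng(3)
num_points, num_shapes = 40, 3
neutral = rng.normal(size=(num_points, 3))
deltas = rng.normal(size=(num_shapes, num_points, 3))
normals = rng.normal(size=(num_points, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
x_true = np.array([0.2, 0.7, 1.0])
points = neutral + np.tensordot(x_true, deltas, axes=1)

x = fit_blendshape_weights(neutral, deltas, points, normals)
```

The clipping step is what enforces the bound on the weights; without it, noisy depth data can push weights outside the range the rig was designed for.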

81 Pipeline Overview

input tracking output

rigid motion, blendshape, per-vertex deformation

initial blendshapes

82 Per-Vertex Deformation

$$c_i^P(v_i) = (p_i - v_i)$$

$$c_j^W(v_j) = H_j(u_j)\, P\, v_j$$

$$c^L(v) = L(b_0)\, v$$

$$G \begin{bmatrix} I \\ Q \\ L \end{bmatrix} v = a$$

83 Fast Laplacian Deformation

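$$G \begin{bmatrix} I \\ Q \\ L \end{bmatrix} v = a \quad\Longrightarrow\quad G K v = a$$

$$v = K^\top \big[K K^\top\big]^{-1} G^{-1} a$$

$K^\top$: sparse, constant in time; $[K K^\top]^{-1}$: pre-factorized, constant in time; $G^{-1}$: sparse, trivially inverted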
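The point of this slide is that the expensive part of the solve does not change from frame to frame, so it can be factorized once and reused. The toy sketch below illustrates that idea only. It makes strong assumptions that may not match the actual system: G is treated as a diagonal matrix, K is a small dense matrix with full row rank (so that K K^T is invertible), and the class name is hypothetical. A real implementation would use sparse matrices and a sparse factorization.

```python
import numpy as np

class PrefactoredSolver:
    """Solve G K v = a via v = K^T (K K^T)^{-1} G^{-1} a, factoring K K^T once."""

    def __init__(self, K):
        self.K = K
        # Cholesky factor of K K^T; requires K to have full row rank (assumption).
        self.chol = np.linalg.cholesky(K @ K.T)

    def solve(self, g_diag, a):
        rhs = a / g_diag                      # G^{-1} a for a diagonal G
        # Two solves with the cached factor; a triangular solver would be faster.
        y = np.linalg.solve(self.chol, rhs)
        z = np.linalg.solve(self.chol.T, y)
        return self.K.T @ z

# Toy sizes: 4 constraints on 9 unknowns, reused for two "frames".
rng = np.random.default_rng(4)
K = rng.normal(size=(4, 9))
solver = PrefactoredSolver(K)

g1, a1 = rng.uniform(0.5, 2.0, size=4), rng.normal(size=4)
g2, a2 = rng.uniform(0.5, 2.0, size=4), rng.normal(size=4)
v1 = solver.solve(g1, a1)
v2 = solver.solve(g2, a2)
```

Only `g_diag` and `a` change per frame here, so the per-frame cost is two cheap solves and one matrix-vector product.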

84 Pipeline Overview

input tracking output

rigid motion, blendshape, per-vertex deformation

initial blendshapes

87 Pipeline Overview

input tracking output

rigid motion, blendshape, adaptive PCA, per-vertex deformation

adaptive PCA model, initial blendshapes

88 Pipeline Overview

input tracking output

rigid motion, blendshape, adaptive PCA, per-vertex deformation

anchor shapes, corrective shapes, incremental PCA, initial blendshapes, adaptive PCA model

89 Tracking Comparison

depth map & 2D features, our tracking, Weise et al. ’11 tracking, our retargeting, Weise et al. ’11 retargeting

90 Tracking Basic Emotions

Li et al. SIGGRAPH 2013

columns: anger, surprise, joy, sadness, disgust, fear

rows: input data, our method, Weise et al. 2011

Faces 2014: Discussion

Bouaziz et al. 2013: input 3D/2D, no calibration, blendshape

Cao et al. 2014: input 2D, neutral face, blendshape

Li et al. 2013: input 3D/2D, neutral face, per-vertex deformation

Into the Mainstream

Ninja Theory / Epic Games

Motivation (Human Digitization)

State-of-the-art Digital Humans (Siren)

Snapchat / MSQRD

FaceX Robustness

FaceX Instant User Switching

FaceX Kids

State-of-the-Art in Real-Time Facial Tracking

Preliminary Findings: Segmentation

Microsoft 2015

Pipeline

Two-Stream Deconvolution Network

Comparison

Open Problems

USC/ICT Activision

Open Problems

Melbourne Acting School 2010

Once upon a dream

Virtual Reality Reloaded

VR 2012 / Crytek 2014

Consuming VR

NextGen Communication Platform

Online Virtual Worlds

Oculus

Virtual Training, Simulation, Education

Challenges

2016 2021

▪ Accurate simulation/training

▪ Building trust, communicating feelings, resolving conflicts

Occlusions

Preliminary Findings: Segmentation

Microsoft 2015

Facial Performance Sensing HMD

Ultra Thin Flexible Electronic Materials

Live Demo

HMD Prototype Design

Microsoft 2015

Feature-Based Tracking

Kazemi and Sullivan, CVPR 2014

Cao et al., SIGGRAPH 2014

High Dimensionality & Non-Linearity

occlusions, lighting, expressions, identity

sticky lips, biting, visemes, …

Model for Facial Expressions

non-linear

high dimensional, low dimensional

Online Operation

Label Transfer via Audio Alignment

Eye Animation

Results

Retargeting

Motivation (Human Digitization)

AR Telepresence (HoloLens 2)

Full Facial Performance Capture

Pinscreen Facial Tracker (2018)

Pinscreen (2017)

Real-Time Lighting Estimation

Deep Learning-Based Face Synthesis

Masked Neutral

Input

Deformed Neutral Generator Synthesized Expression

Neutral Expression Normal + Depth Conditional GAN

NextGen Photoreal Avatars

confidential www.pinscreen.com

Full Video Inference

Interaction With 3D Avatars

Blade Runner 2049 (2017)

http://cs621.hao-li.com

Thanks!
