CREATING A PHOTOREAL DIGITAL ACTOR: THE DIGITAL EMILY PROJECT

Oleg Alexander 1, Mike Rogers 1, William Lambeth 1, Matt Chiang 2, Paul Debevec 2

1 Image Metrics, 1918 Main St, 2nd Floor, Santa Monica, CA 90405, USA
e-mail: {oleg.alexander, mike.rogers, [email protected]}

2 University of Southern California Institute for Creative Technologies, 13274 Fiji Way, Marina del Rey, CA 90292, USA
e-mail: {chiang, [email protected]}

Abstract

The Digital Emily Project is a collaboration between facial animation company Image Metrics and the Graphics Laboratory at the University of Southern California's Institute for Creative Technologies to achieve one of the world's first photorealistic digital facial performances. The project leverages latest-generation techniques in high-resolution face scanning, character rigging, video-based facial animation, and compositing. An actress was first filmed on a studio set speaking emotive lines of dialog in high definition. The lighting on the set was captured as a high dynamic range light probe image. The actress's face was then three-dimensionally scanned in thirty-three facial expressions showing different emotions and mouth and eye movements, using a high-resolution facial scanning process accurate to the level of skin pores and fine wrinkles. Lighting-independent diffuse and specular reflectance maps were also acquired as part of the scanning process. Correspondences between the 3D expression scans were formed using a semi-automatic process, allowing a blendshape facial animation rig to be constructed whose expressions closely mirrored the shapes observed in the rich set of facial scans; animated eyes and teeth were also added to the model. Skin texture detail showing dynamic wrinkling was converted into multiresolution displacement maps also driven by the blendshapes. A semi-automatic video-based facial animation system was then used to animate the 3D face rig to match the performance seen in the original video, and this performance was tracked onto the facial motion in the studio video. The final face was illuminated by the captured studio illumination and shaded using the acquired reflectance maps with a skin translucency shading algorithm. Using this process, the project was able to render a synthetic facial performance which was generally accepted as being a real face.

Keywords: Digital Actors; Facial Animation; 3D Scanning

1 Introduction

Creating a photoreal digital actor with computer graphics has been a central goal of the field for at least thirty years [Parke 1972]. The Digital Emily project undertaken by Image Metrics and the University of Southern California's Institute for Creative Technologies (USC ICT) attempted to achieve an animated, photoreal digital face by bringing together latest-generation results in 3D facial capture, modeling, animation, and rendering. The project aimed to cross the "uncanny valley" [20], producing a computer-generated face which appeared to be a real, relatable, animated person. Some of the key technologies employed included a fast high-resolution digital face scanning process using the light stage at USC ICT, and the Image Metrics video-based facial animation system. The result of the project was, by several accounts, the first public demonstration of a photoreal computer-generated face able to convincingly speak and emote in a medium closeup.

2 Previous Efforts at Photoreal Digital Humans

A variety of laudable efforts have been made to create realistic digital actors over the last decade, each leveraging numerous advances in computer graphics technology and artistry. In this section, we overview some of these key efforts in order to compare and contrast them with the Digital Emily project.

The SIGGRAPH 1999 Electronic Theater featured "The Jester" [15], a short animation by Life/FX of a woman reading poetry in a jester's cap. The actor's face was three-dimensionally laser scanned and textured using photographic textures stitched together with artistic effort to minimize the original shading and specular reflectance effects, and her performance was recorded with a traditional arrangement of facial motion capture markers. The motion capture dots were used to drive a volumetric finite element model which allowed a high-resolution facial mesh to produce simulated buckling and wrinkling from anisotropic stress at a scale significantly more detailed than the original motion capture data was able to record. While the skin shading lacked realistic specular reflectance and pore detail, the face conveyed realistic motion and emotion, in significant part due to the skin dynamics model. The process was later extended with significant additional artistic effort to create an aged version of the actor in a follow-up animation, "Young at Heart".

Disney's "Human Face Project" developed technology to show an older actor encountering a younger version of himself [27, 12]. A facial mold of the actor's face was taken and the resulting cast was scanned using a high-resolution face scanning process, and twenty-two Cyberware scans in various expressions were also acquired. A medium-resolution animated facial rig was sculpted to mimic the expression scans used as reference. A multi-camera facial capture setup was employed to film the face under relatively flat cross-polarized lighting. Polarization difference images isolating specular reflections of the face [5] were acquired to reveal high-resolution texture detail, and analysis of this texture was used to add skin pores and fine wrinkle detail as displacement maps to the scans. Optical flow performed on video of the actor's performance was used to drive the digital face, achieving remarkably close matches to the original performance. A younger version of the actor was artistically modeled based on photographs of him in his twenties. HDR lighting information was captured on set so that the digital actor could be rendered with image-based lighting [4] to match the on-set illumination. Final renderings achieved convincing facial motion and lighting in a two-shot, but less convincing facial reflectance; a significant problem was that no simulation of the skin's translucency [14] had been performed.

The Matrix sequels (2003) used digital actors for many scenes where Keanu Reeves acrobatically fights many copies of Hugo Weaving. High-quality facial casts of Reeves and Weaving were acquired and scanned with a high-resolution laser scanning system (the XYZRGB system, based on technology from the Canadian National Research Council) to provide 3D geometry accurate to the level of skin pores and fine wrinkles. A six-camera high-definition facial capture rig was used to film facial performance clips of the actors under relatively flat illumination. The six views in the video were used to animate a facial rig of the actor and, as in [10], to provide time-varying texture maps for the dynamic facial appearance. Still renderings using image-based lighting [4] and a texture-space approximation to subsurface scattering [2] showed notably realistic faces which greatly benefitted

Spider-Man 2 (2004) built digital stunt doubles for villain Doc Ock (Alfred Molina) and hero Spider-Man (Tobey Maguire) using facial reflectance scans in USC ICT's Light Stage 2 device [24]. Each actor was filmed with four synchronized 35mm film cameras in several facial expressions from 480 lighting directions. Colorspace techniques as in [5] were used to separate diffuse and specular reflections. The relightable texture information was projected onto a 3D facial rig based on geometry from a traditional laser scan, and illuminated variously by HDRI image-based lighting [4] and traditional CG light sources using a custom shading algorithm for approximately 40 digital double shots. Using additional cameras, the technique was also used to construct digital actors for Superman Returns (2006), Spider-Man 3 (2007), and Hancock (2008). Due to the extensive reflectance information collected, the technique yielded realistic facial reflectance for the digital characters, including close-up shots with mild degrees of facial animation, especially in Superman Returns. However, results of the process did not demonstrate emotive facial performances in closeup; significant facial animation was shown only in wide shots.

Beowulf (2007) used a multitude of digital characters for a fully computer-rendered film. The performance-capture-based film followed the approach of the 2001 film Final Fantasy of constructing characters as detailed as possible and then driving them with motion capture, keyframe animation, and simulation. Beowulf substantially advanced the state of the art in this area by leveraging greater motion capture fidelity and performance volume, employing more complex lighting simulation, and using better skin shading techniques. While some characters were based closely in appearance on their voice and motion capture actors (e.g. Angelina Jolie and Anthony Hopkins), other characters bore little resemblance (e.g. Ray Winstone), requiring additional artistic effort to model them. Static renderings of the faces achieved impressive levels of realism, although the single-layer subsurface scattering model produced a somewhat "waxy" appearance. Also, the artist-driven character creation process did not leverage high-resolution 3D scans of each actor in a multitude of facial expressions. According to Variety [3], the "digitized figures in 'Beowulf' look eerily close to storefront mannequins ... suspended somewhere between live-action and animation, fairy tale and videogame."

The Curious Case of Benjamin Button (2008) was the first
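The blendshape facial animation rig described in the abstract is, at its core, a linear combination of per-expression vertex offsets from a neutral scan. The sketch below illustrates that linear model only; the toy vertex arrays and function name are illustrative and are not the project's actual rig format.

```python
import numpy as np

def blend(neutral, expressions, weights):
    """Evaluate a linear blendshape rig:
    result = neutral + sum_i w_i * (expression_i - neutral)."""
    out = neutral.astype(float).copy()
    for shape, w in zip(expressions, weights):
        out += w * (shape - neutral)
    return out

# Toy example: 2 vertices in 3D; one "smile" target raises vertex 1.
neutral = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
smile   = np.array([[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]])
half_smile = blend(neutral, [smile], [0.5])   # vertex 1 moves halfway up
```

In a production rig the corresponded expression scans supply the target shapes, and the weights vary per animation frame; the evaluation itself remains this weighted sum.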
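The polarization difference imaging [5] used in the Human Face Project exploits the fact that specular reflection preserves the polarization of the light source while diffuse reflection does not: a cross-polarized camera filter blocks the specular component, so subtracting the cross-polarized image from a parallel-polarized one isolates it. A schematic sketch with synthetic 2x2 "images" (all array values are made up for illustration):

```python
import numpy as np

def specular_from_polarization(parallel, cross):
    """Polarization difference image: the parallel-polarized view minus
    the cross-polarized view isolates the specular reflection."""
    return np.clip(parallel.astype(float) - cross.astype(float), 0.0, None)

# Diffuse shading survives both polarizer orientations; the specular
# highlight appears only in the parallel-polarized capture.
diffuse  = np.array([[0.2, 0.2], [0.2, 0.2]])
specular = np.array([[0.0, 0.6], [0.0, 0.0]])
parallel = diffuse + specular   # parallel-polarized capture
cross    = diffuse              # cross-polarized capture
spec = specular_from_polarization(parallel, cross)
```

The cross-polarized image alone gives the diffuse texture, which is why flat cross-polarized lighting was used for the capture setup described above.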
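The texture-space approximation to subsurface scattering [2] mentioned for the Matrix sequels renders the face's irradiance into its texture parameterization and blurs it there, so hard lighting edges soften as if light had diffused laterally under the skin. The toy version below uses a single separable box blur purely for illustration; published implementations use sums of Gaussians with per-color-channel widths.

```python
import numpy as np

def box_blur_1d(a, k):
    """Box-blur a 1D signal with edge padding so output length equals input."""
    padded = np.pad(a, k // 2, mode="edge")
    return np.convolve(padded, np.ones(k) / k, mode="valid")

def texture_space_sss(irradiance, k=5):
    """Crude texture-space scattering: a separable blur of the irradiance
    map stands in for light diffusing through the skin."""
    rows = np.array([box_blur_1d(r, k) for r in irradiance])
    return np.array([box_blur_1d(c, k) for c in rows.T]).T

# A hard shadow edge in texture space softens after the blur.
irr = np.zeros((16, 16))
irr[:, :8] = 1.0                      # left half of the texture is lit
soft = texture_space_sss(irr, k=5)    # values now ramp smoothly across the edge
```

Because the blur happens in texture space rather than screen space, the softening follows the surface of the face regardless of viewpoint, which is what makes the approximation usable for rendering.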