The Digital Emily Project


CREATING A PHOTOREAL DIGITAL ACTOR: THE DIGITAL EMILY PROJECT

Oleg Alexander1, Mike Rogers1, William Lambeth1, Matt Chiang2, Paul Debevec2

1 Image Metrics, 1918 Main St, 2nd Floor, Santa Monica, CA 90405, USA
e-mail: oleg.alexander, mike.rogers, [email protected]

2 University of Southern California Institute for Creative Technologies, 13274 Fiji Way, Marina del Rey, CA 90292, USA
e-mail: chiang, [email protected]

Abstract

The Digital Emily Project is a collaboration between facial animation company Image Metrics and the Graphics Laboratory at the University of Southern California's Institute for Creative Technologies to achieve one of the world's first photorealistic digital facial performances. The project leverages latest-generation techniques in high-resolution face scanning, character rigging, video-based facial animation, and compositing. An actress was first filmed on a studio set speaking emotive lines of dialog in high definition. The lighting on the set was captured as a high dynamic range light probe image. The actress' face was then three-dimensionally scanned in thirty-three facial expressions showing different emotions and mouth and eye movements, using a high-resolution facial scanning process accurate to the level of skin pores and fine wrinkles. Lighting-independent diffuse and specular reflectance maps were also acquired as part of the scanning process. Correspondences between the 3D expression scans were formed using a semi-automatic process, allowing a blendshape facial animation rig to be constructed whose expressions closely mirrored the shapes observed in the rich set of facial scans; animated eyes and teeth were also added to the model. Skin texture detail showing dynamic wrinkling was converted into multiresolution displacement maps, also driven by the blend shapes. A semi-automatic video-based facial animation system was then used to animate the 3D face rig to match the performance seen in the original video, and this performance was tracked onto the facial motion in the studio video. The final face was illuminated by the captured studio illumination and shaded using the acquired reflectance maps with a skin translucency shading algorithm. Using this process, the project was able to render a synthetic facial performance which was generally accepted as being a real face.

Keywords: Digital Actors; Facial Animation; 3D Scanning

1 Introduction

Creating a photoreal digital actor with computer graphics has been a central goal of the field for at least thirty years [Parke 1972]. The Digital Emily project undertaken by Image Metrics and the University of Southern California's Institute for Creative Technologies (USC ICT) attempted to achieve an animated, photoreal digital face by bringing together latest-generation results in 3D facial capture, modeling, animation, and rendering. The project aimed to cross the "uncanny valley" [20], producing a computer-generated face which appeared to be a real, relatable, animated person. Some of the key technologies employed included a fast high-resolution digital face scanning process using the light stage at USC ICT, and the Image Metrics video-based facial animation system. The result of the project was by several accounts the first public demonstration of a photoreal computer-generated face able to convincingly speak and emote in a medium closeup.

2 Previous Efforts at Photoreal Digital Humans

A variety of laudable efforts have been made to create realistic digital actors over the last decade, each leveraging numerous advances in computer graphics technology and artistry. In this section, we overview some of these key efforts in order to compare and contrast them with the Digital Emily project.

The SIGGRAPH 1999 Electronic Theater featured "The Jester" [15], a short animation by LifeF/X of a woman reading poetry in a jester's cap. The actor's face was three-dimensionally laser scanned and textured using photographic textures stitched together with artistic effort to minimize the original shading and specular reflectance effects, and her performance was recorded with a traditional arrangement of facial motion capture markers. The motion capture dots were used to drive a volumetric finite element model which allowed a high-resolution facial mesh to produce simulated buckling and wrinkling from anisotropic stress at a scale significantly more detailed than the original motion capture data was able to record. While the skin shading lacked realistic specular reflectance and pore detail, the face conveyed realistic motion and emotion in significant part due to the skin dynamics model. The process was later extended with significant additional artistic effort to create an aged version of the actor in a follow-up animation, "Young at Heart".

Disney's "Human Face Project" developed technology to show an older actor encountering a younger version of himself [27, 12]. A facial mold of the actor's face was taken and the resulting cast was scanned using a high-resolution face scanning process, and twenty-two Cyberware scans in various expressions were also acquired. A medium-resolution animated facial rig was sculpted to mimic the expression scans used as reference. A multi-camera facial capture setup was employed to film the face under relatively flat cross-polarized lighting. Polarization difference images isolating specular reflections of the face [5] were acquired to reveal high-resolution texture detail, and analysis of this texture was used to add skin pores and fine wrinkle detail as displacement maps to the scans. Optical flow performed on video of the actor's performance was used to drive the digital face, achieving remarkably close matches to the original performance. A younger version of the actor was artistically modeled based on photographs of him in his twenties. HDR-based lighting information was captured on set so that the digital actor could be rendered with image-based lighting [4] to match the on-set illumination. Final renderings achieved convincing facial motion and lighting in a two-shot but less convincing facial reflectance; a significant problem was that no simulation of the skin's translucency [14] had been performed.

The Matrix Sequels (2003) used digital actors for many scenes where Keanu Reeves acrobatically fights many copies of Hugo Weaving. High-quality facial casts of Reeves and Weaving were acquired and scanned with a high-resolution laser scanning system (the XYZRGB system based on technology from the Canadian National Research Council) to provide 3D geometry accurate to the level of skin pores and fine wrinkles. A six-camera high-definition facial capture rig was used to film facial performance clips of the actors under relatively flat illumination. The six views in the video were used to animate a facial rig of the actor and, as in [10], to provide time-varying texture maps for the dynamic facial appearance. Still renderings using image-based lighting [4] and a texture-space approximation to subsurface scattering [2] showed notably realistic faces.

Spider Man 2 (2004) built digital stunt doubles for villain Doc Ock (Alfred Molina) and hero Spider-Man (Tobey Maguire) using facial reflectance scans in USC ICT's Light Stage 2 device [24]. Each actor was filmed with four synchronized 35mm film cameras in several facial expressions from 480 lighting directions. Colorspace techniques as in [5] were used to separate diffuse and specular reflections. The relightable texture information was projected onto a 3D facial rig based on geometry from a traditional laser scan, and illuminated variously by HDRI image-based lighting [4] and traditional CG light sources using a custom shading algorithm for approximately 40 digital double shots.

Using additional cameras, the technique was also used to construct digital actors for Superman Returns (2006), Spider Man 3 (2007), and Hancock (2008). Due to the extensive reflectance information collected, the technique yielded realistic facial reflectance for the digital characters, including close-up shots with mild degrees of facial animation, especially in Superman Returns. However, results of the process did not demonstrate emotive facial performances in closeup; significant facial animation was shown only in wide shots.

Beowulf (2007) used a multitude of digital characters for a fully computer-rendered film. The performance-capture film followed the approach of the 2001 film Final Fantasy, constructing characters in as much detail as possible and then driving them with motion capture, keyframe animation, and simulation. Beowulf substantially advanced the state of the art in this area by leveraging greater motion capture fidelity and performance volume, employing more complex lighting simulation, and using better skin shading techniques. While some characters were based closely in appearance on their voice and motion capture actors (e.g. Angelina Jolie and Anthony Hopkins), other characters bore little resemblance (e.g. Ray Winstone), requiring additional artistic effort to model them. Static renderings of the faces achieved impressive levels of realism, although the single-layer subsurface scattering model produced a somewhat "waxy" appearance. Also, the artist-driven character creation process did not leverage high-resolution 3D scans of each actor in a multitude of facial expressions. According to Variety [3], the "digitized figures in 'Beowulf' look eerily close to storefront mannequins ... suspended somewhere between live-action and animation, fairy tale and videogame."

The Curious Case of Benjamin Button (2008) was the first
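The blendshape rig described in the abstract evaluates each animated frame as the neutral scan plus a weighted sum of per-expression offsets derived from the corresponded 3D scans. A minimal sketch of that evaluation follows; the tiny three-vertex "mesh" and the target names are illustrative only, not the project's actual rig:

```python
import numpy as np

def blend(neutral, targets, weights):
    """Blendshape evaluation: neutral mesh plus a weighted sum of
    per-expression offsets (each target minus the neutral)."""
    result = neutral.copy()
    for target, w in zip(targets, weights):
        result += w * (target - neutral)
    return result

# Toy example: a "mesh" of 3 vertices in 3D (hypothetical data).
neutral = np.zeros((3, 3))
smile   = neutral + np.array([[0.0, 0.1, 0.0]] * 3)   # illustrative target
frown   = neutral + np.array([[0.0, -0.1, 0.0]] * 3)  # illustrative target

# A half-strength smile: every vertex moves halfway toward the smile target.
pose = blend(neutral, [smile, frown], [0.5, 0.0])
```

In a production rig the weight vector is what the video-based animation system solves for per frame, while the vertex arrays come from the expression scans.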
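The polarization-difference imaging cited above [5] exploits the fact that crossed polarizers block the specular (surface) reflection while passing the diffuse (subsurface) component, whereas parallel polarizers pass both; subtracting the two images isolates the specular layer. A rough sketch under that assumption, using synthetic image arrays:

```python
import numpy as np

def separate_reflectance(parallel, cross):
    """Split reflectance into diffuse and specular components from a
    polarization-difference image pair.

    cross:    crossed polarizers  -> diffuse component only
    parallel: aligned polarizers  -> diffuse + specular
    """
    diffuse = cross
    # Clip small negative values caused by sensor noise.
    specular = np.clip(parallel - cross, 0.0, None)
    return diffuse, specular

# Synthetic example: uniform diffuse 0.4 plus one specular highlight of 0.3.
parallel = np.full((4, 4), 0.4)
parallel[1, 1] += 0.3
cross = np.full((4, 4), 0.4)

diffuse, specular = separate_reflectance(parallel, cross)
```

The same principle underlies the lighting-independent diffuse and specular reflectance maps acquired during the Emily scanning sessions.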
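Image-based lighting [4], used across these productions and for Emily's final renders, illuminates a model with a captured light probe instead of synthetic point lights. For a purely diffuse surface this reduces to integrating the probe's radiance against the cosine lobe of the surface normal. A toy sketch with a two-sample "probe"; the sample directions, radiances, and solid angles are made up for illustration:

```python
import numpy as np

def diffuse_irradiance(env_dirs, env_radiance, solid_angles, normal):
    """Diffuse image-based lighting: sum captured radiance samples
    weighted by the clamped cosine between light direction and normal
    and by each sample's solid angle."""
    cosines = np.clip(env_dirs @ normal, 0.0, None)
    return float(np.sum(env_radiance * cosines * solid_angles))

# Hypothetical two-sample light probe.
dirs = np.array([[0.0, 1.0, 0.0],    # light arriving from above
                 [0.0, -1.0, 0.0]])  # light arriving from below
radiance = np.array([2.0, 1.0])
omega = np.array([1.0, 1.0])         # illustrative solid angles (sr)

n = np.array([0.0, 1.0, 0.0])        # surface facing up
E = diffuse_irradiance(dirs, radiance, omega, n)  # only the top light counts
```

A real renderer would sample thousands of directions from the HDR probe image and handle specular and translucent terms separately, but the cosine-weighted sum is the core of the diffuse case.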