A Facial Model and Techniques for Animated Speech

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the

Graduate School of The Ohio State University

By

Scott Alan King, B.S., M.S.

* * * * *

The Ohio State University

2001

Dissertation Committee:

Richard E. Parent, Adviser
Wayne E. Carlson
Han-Wei Shen

Approved by

Adviser
Department of Computer and Information Science

© Copyright by

Scott Alan King

2001

ABSTRACT

Creating animated speech requires a facial model capable of representing the myriad shapes the human face experiences during speech and a method to produce the correct shape at the correct time. We present a facial model designed to support animated speech. Our model has a highly deformable lip model that is grafted onto

the input facial geometry providing the necessary geometric complexity for creating

lip shapes and high-quality lip renderings. We provide a highly deformable tongue

model that can represent the shapes the tongue experiences during speech. We add

teeth, gums, and upper palate geometry to complete the inner mouth. For more

realistic movement of the skin we consider the underlying soft and hard tissue. To

decrease the processing time we hierarchically deform the facial surface.

We also present a method to animate the facial model over time to create animated

speech. We use a track-based animation system that has one facial model parameter

per track with possibly more than one track per parameter. The tracks contain control

points for a curve that describes the value of the parameter over time. We allow many

different types and orders of curves that can be combined in different manners. For

more realistic speech we develop a coarticulation model that defines visemes as curves

instead of a single position. This treats a viseme as a dynamic shaping of the vocal

tract and not as a static shape.

This work is dedicated to Tamara, my wonderful spouse, who sacrificed much during the years of research, and to our son Graham, whose arrival created the

urgency needed for me to finally decide to finish.

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr Richard Parent. The commitment that an advisor takes on when he agrees to be your mentor is tremendous, and in my case, extremely appreciated.

I thank the members of my dissertation committee, Dr Wayne E. Carlson and

Dr Han-Wei Shen, for their time, valuable comments, and careful reading of this dissertation. I also thank Dr Jacqueline C. Henninger for her careful reading of this thesis and her valuable comments during the oral defense. And I thank Dr Roger

Crawfis for his generous support of me and the lab, and for our collaboration.

This work was made possible by the help and support of many people. I would like to thank Barbara Olsafsky for her work on the procedural shaders. I thank Dr

Osamu Fujimura for sharing his knowledge of human speech. I would like to thank

Texas Instruments Inc. for their financial support of this research, particularly Bruce

Flinchbaugh. I would like to thank Dr. Maureen Stone and Andrew Lundberg for their time and data from their work on tongue surface reconstruction.

Like most big software projects, I used the work of many people out in the free software community. Their enormous efforts and generosity in releasing their software have saved countless people years of work, and allowed access to cutting-edge solutions to numerous problems. In my case I would like to thank in particular: the

Festival team (Alan W Black, Paul Taylor, Richard Caley and Rob Clark) [BTCC00];

Jonathan Richard Shewchuk for his work on Triangle [She96a]; the MBROLA project

[MBR99]; the Visible Human Project [NLH99] sponsored by the National Library of

Medicine; Viewpoint Digital, Inc. for their free scan of my head at SIGGRAPH; and

Cyberware, Inc. for their release of 3D models on the net, particularly that of teeth.

Our lab has a great working environment with lots of talented students who are always willing to talk about ideas, give advice on a problem or read a rough draft. I give thanks to all of the members of our lab and department who have shared ideas, support and an ear with me or just made my experience at OSU a better one. Particularly I'd like to thank Frank Adelstein, Mowgli Assor, Kirk Bowers, Paolo Bucci,

Tamera Cramer, Steve Demlow, Sandy Farrar, Tom Fine, Mark Fontana, Margaret

Geroch, Sandy Hill, Leslie Holton, Yair Kurzion, Matt Lewis, Nathan Loofbourrow,

Marty Marlatt, Steve May, Torsten Möller, Klaus Mueller, Barbara Olsafsky, Elizabeth O'Neill, Johan Östmann, Eleanor Quinlan, Kevin Rodgers, Steve Romig, Ferdi

Scheepers, Naeem Shareef, Po-wen Shih, Karansher Singh, Don Stredney, Brad Winemiller, Suba Varadarajan, Lawson Wade, and Pete Ware for not only helping me get through the research, but for also giving me a life during my time at OSU.

VITA

1988 ...... B.S. Computer Science, Utah State University
1994 ...... M.S. Computer and Information Science, The Ohio State University
Sep, 1988 - Sep, 1989 ...... Tandy Corp - Programmer

Sep, 1989 - Jun, 1991 ...... General Dynamics - Software Engineer

Jun, 1991 - Sep, 1992 ...... Harris Methodist Health Systems - Programmer Analyst II
1992-2000 ...... Graduate Assistant, Department of Computer and Information Science, The Ohio State University
Jan 1995-Jan 1996 ...... Graduate Research Assistant, Ohio Supercomputer Center, The Ohio State University
1995-1997 ...... Summer Intern, Texas Instruments

1997-2000 ...... Graduate Research Assistant, Department of Computer and Information Science, The Ohio State University
Sep 2000 - present ...... Lecturer, Department of Computer and Information Science, The Ohio State University

PUBLICATIONS

Research Publications

Scott A. King, Roger A. Crawfis and Wayland Reid, "Fast Volume Rendering and Animation of Amorphous Phenomena", chapter 14 in Volume Graphics edited by Min Chen, Arie E. Kaufman and Roni Yagel, Springer, London, 2000.

FIELDS OF STUDY

Major Field: Computer and Information Science

Studies in:
Prof. Richard E. Parent
Software Methodology: Prof. Spiro Michaylov
Communications: Prof. Dhabaleswar K. Panda

TABLE OF CONTENTS

Page

Abstract ...... ii

Dedication ...... iii

Acknowledgments ...... iv

Vita ...... vi

List of Tables ...... xiii

List of Figures ...... xv

Chapters:

1. Introduction ...... 1

1.1 Applications of Facial Modeling and Animated Speech ...... 3
1.2 Motivation For Using Computers To Animate Speech ...... 4
1.3 Current Solutions for Speech-Synchrony ...... 6
1.4 Thesis Overview ...... 10
1.5 Organization of Thesis ...... 12

2. The Facial Model ...... 15

2.1 Skin ...... 18
2.1.1 Characteristic Points ...... 19
2.1.2 Cylindrical space ...... 21
2.2 Skull ...... 22
2.2.1 Collision Detection ...... 24
2.3 Lips ...... 27
2.4 Tongue ...... 27

2.5 Eyes ...... 28
2.6 Other facial parts ...... 29
2.7 Summary ...... 30

3. The Lip Model ...... 31

3.1 Introduction ...... 31
3.2 Previous Work ...... 34
3.2.1 Speech Reading ...... 34
3.2.2 Computer Models ...... 36
3.2.3 ICP Lip Model ...... 37
3.3 Lip Anatomy ...... 38
3.3.1 The Mandible ...... 40
3.4 Lip Parameterization ...... 41
3.5 Implementation ...... 45
3.5.1 Grafting ...... 48
3.6 Rendering ...... 49
3.7 Results ...... 51
3.8 Summary ...... 53

4. The Tongue Model ...... 56

4.1 Introduction ...... 56
4.2 Previous Work ...... 57
4.3 Tongue Anatomy ...... 60
4.4 Tongue Model ...... 62
4.4.1 Geometry ...... 63
4.4.2 Tongue Parameterization ...... 65
4.4.3 Implementation ...... 66
4.5 Rendering ...... 69
4.6 Results ...... 69
4.7 Summary ...... 72

5. Speech-Synchronized Animation ...... 76

5.1 Coarticulation ...... 76
5.1.1 Previous Work ...... 77
5.1.2 Our Coarticulation Model ...... 80
5.2 Results ...... 92
5.3 Summary ...... 92

6. TalkingHead: A Text-to-Audiovisual-Speech System ...... 94

6.1 Introduction ...... 94
6.2 System Overview ...... 95
6.3 Text-To-Speech Synthesis ...... 96
6.4 The Viseme And Expression Generator ...... 97
6.5 The TalkingHead Animation Subsystem ...... 100
6.5.1 Different Characters ...... 100
6.6 Summary ...... 102

7. Results ...... 103

7.1 Results From Our Facial Model ...... 104
7.1.1 The Tongue Model ...... 104
7.1.2 The Lip Model ...... 109
7.2 Animation Results ...... 112
7.3 Summary ...... 125

8. Future Research ...... 130

8.1 Future Modeling Research ...... 130
8.1.1 Hair ...... 131
8.1.2 Rendering Skin ...... 131
8.2 Future Animation Research ...... 132
8.2.1 Syllables as Concatenative Units ...... 133
8.2.2 Cineradiographic Database ...... 134
8.2.3 Prosody ...... 135
8.3 Improvements to TalkingHead ...... 136
8.4 Validation Study ...... 137
8.5 Summary ...... 137

Appendices:

A. Facial Animation Overview ...... 139

A.1 Background ...... 139
A.1.1 Definition of Facial Animation ...... 141
A.2 Anatomy ...... 142
A.2.1 Skin ...... 143
A.2.2 The Skull ...... 145
A.2.3 The Tongue ...... 149
A.2.4 The Lips ...... 150

A.2.5 The Cheeks ...... 151
A.2.6 Vocal Tract ...... 152
A.2.7 Muscles ...... 152
A.2.8 Articulation of Sound ...... 159
A.3 Modeling ...... 160
A.3.1 Geometric Representations ...... 160
A.3.2 Instance Definition ...... 163
A.3.3 Geometry Acquisition ...... 168
A.3.4 Rendering ...... 176
A.3.5 Famous Models ...... 181
A.3.6 Other Facial Parts ...... 193
A.4 Parameterizations ...... 204
A.4.1 Empirical Methods ...... 205
A.4.2 Muscle Based ...... 206
A.4.3 FACS ...... 207
A.5 Animation Techniques ...... 214
A.5.1 Image Based Animation ...... 215
A.5.2 Hierarchical Control of Animation ...... 217
A.5.3 Animation Using Scripting ...... 218
A.5.4 ...... 219
A.5.5 Animating Expressions ...... 219
A.5.6 Physically Based Animation ...... 221
A.5.7 Keyframing ...... 221
A.5.8 Performance Animation ...... 223
A.5.9 Displacement ...... 224
A.5.10 Natural Language Animation ...... 225
A.6 Animating Speech ...... 225
A.6.1 Contribution of the Visual Signal in Speech Perception ...... 226
A.6.2 Head Movement During Speech ...... 228
A.6.3 Lip Synchrony ...... 228
A.6.4 Coarticulation ...... 233
A.6.5 Emotion in Speech ...... 236
A.6.6 Animating Conversations ...... 237
A.7 Surgery ...... 241
A.8 Vision ...... 243
A.8.1 Tracking ...... 244
A.8.2 Face Detection and Recognition ...... 245
A.8.3 Compression ...... 246
A.8.4 Speech Reading ...... 246
A.8.5 Expression Detection ...... 247
A.8.6 Interfaces ...... 247

A.9 Standards ...... 248
A.9.1 MPEG-4 SNHC FBA ...... 249

B. Medical Appendix ...... 252

B.1 Terms of location and orientation ...... 252
B.2 Body tissues ...... 254
B.3 Anatomical Terms of the Face ...... 256

Bibliography ...... 257

LIST OF TABLES

Table Page

3.1 Muscles of the lips ...... 39

A.1 Bones that comprise the skull...... 146

A.2 The muscles of the mandible and their actions...... 154

A.3 The muscles of the scalp and their actions...... 154

A.4 The muscles of the eyes and their actions...... 155

A.5 The muscles of the nose and their actions...... 156

A.6 The muscles of the mouth and their actions...... 156

A.7 The muscles of the tongue and their actions...... 157

A.8 The muscles of the soft palate and their actions...... 158

A.9 The muscles of the pharynx and their actions...... 158

A.10 FACS Action units for the Upper Face group...... 208

A.11 FACS Action units in the Lower Face Up/Down group...... 209

A.12 FACS Action units in the Lower Face Horizontal group...... 210

A.13 FACS Action units in the Lower Face Oblique group...... 210

A.14 FACS Action units in the Lower Face Orbital group...... 211

A.15 FACS Action units in the Lower Face Miscellaneous group...... 212

A.16 FACS head and eye positions...... 212

A.17 Translation of FACS AUs into emotions ...... 214

LIST OF FIGURES

Figure Page

2.1 Characteristic points ...... 19

2.2 Barycentric coordinates ...... 20

2.3 Slices used to create the 3D geometry of the skull...... 23

2.4 The generic skull we fit to the input geometry...... 24

2.5 An uncapped cylindrical arc...... 26

3.1 The temporomandibular joint...... 41

3.2 Lip model geometry ...... 45

3.3 The grafting process ...... 50

3.4 High quality rendered lips ...... 50

3.5 Face model saying “Oh” ...... 52

3.6 Frames from an animation rendered using our lip shaders...... 52

3.7 Frames from an animation rendered using our lip shaders...... 53

3.8 Our lip model in the position of /aw/, a happy /aw/ and a half smile. 53

4.1 Tongue images from the Visible Human Project ...... 61

4.2 The tongue geometry and its control grid...... 64

4.3 Samples from the four categories of tongue shapes from Stone . . . . 68

4.4 Our facial model in the viseme position for the phoneme /d/ . . . . . 70

4.5 The tongue model in the position of the phoneme /ay/...... 70

4.6 A side-by-side comparison of our facial model with and without the tongue model ...... 71

4.7 Frames from an animation of our facial model licking its lips. . . . . 72

4.8 Frames from an animation of our facial model sticking out its tongue. 73

5.1 Cohen and Massaro Coarticulation ...... 83

5.2 Our version of coarticulation based ...... 84

5.3 Comparison between our coarticulation and Cohen and Massaro coar- ticulation ...... 85

5.4 Curves for the orbicularis oris measured by hand from video...... 86

5.5 The curve generated by our coarticulation model for the orbicularis oris parameter...... 87

5.6 Our coarticulation model versus linear interpolation...... 88

5.7 Our coarticulation model versus cubic interpolation...... 89

5.8 Our coarticulation model versus Bezier approximation...... 90

5.9 Our coarticulation model versus rational Bezier approximation. . . . 90

5.10 Our coarticulation model versus B-spline approximation...... 91

5.11 Our coarticulation model versus rational B-spline approximation. . . 91

5.12 A curve for the Jaw Open parameter ...... 92

6.1 TalkingHead System Overview ...... 96

6.2 An example viseme definition for the phoneme /p/ ...... 98

6.3 Screen shot of TalkingHead ...... 101

7.1 Samples from the four categories of tongue shapes from Stone . . . . 105

7.2 A side-by-side comparison of our facial model with and without the tongue model ...... 106

7.3 Frames from an animation of our facial model licking its lips. . . . . 107

7.4 Frames from an animation of our facial model sticking out its tongue. 108

7.5 Our facial model sticking out its tongue...... 108

7.6 Our facial model with its lips puckered as if to kiss...... 109

7.7 Our facial model is happy...... 110

7.8 Our facial model expressing sadness...... 111

7.9 Our facial model in a very happy mood, with a very big grin. . . . . 111

7.10 Our facial model in the position of a wry smile...... 112

7.11 A comparison of the waveforms for the synthetic and natural speech. 113

7.12 Animation of speech compared to recorded speech...... 115

7.13 Animation of speech compared to recorded speech...... 116

7.14 Animation of speech compared to recorded speech...... 117

7.15 Animation of speech compared to recorded speech...... 118

7.16 Animation of speech compared to recorded speech...... 119

7.17 Animation of speech compared to recorded speech...... 120

7.18 Animation of speech compared to recorded speech...... 121

7.19 Animation of speech compared to recorded speech...... 122

7.20 Animation of speech compared to recorded speech...... 123

7.21 Animation of the phrase “Hello world.” ...... 126

7.22 Animation of the phrase “Hello world.” ...... 127

7.23 Animations produced by the techniques described in this dissertation 128

7.24 Animations produced by the techniques described in this dissertation 128

A.1 A schematic of a facial animation system...... 142

A.2 A cross section of the skin...... 144

A.3 The temporomandibular joint...... 148

A.4 Movement of the temporomandibular joint...... 148

A.5 The vocal tract...... 153

A.6 Waters vector muscle model ...... 186

A.7 Waters sheet muscle model...... 187

A.8 A typical design for speech synchronized animation...... 229

B.1 The planes that describe location and orientation in human anatomy. 253

CHAPTER 1

INTRODUCTION

Humans communicate primarily with the spoken word and have a lifetime of experience both in speaking and in listening. Talking heads, therefore, make an

excellent choice for communicating with humans. Motion pictures allow an event

or performance to be recorded and replayed at a later time and place making film

an attractive choice for delivering communication. Not all events can be recorded

on film. An animator can create a motion picture of a performance that otherwise

may not be possible to obtain. Combining talking heads with animation allows an

animator to communicate with her audience.

Speech is a multi-modal signal with an audial and visual component. The amount

of information contained in each mode is debated, but clearly speech can be compre-

hended using either of the two alone, so there is a great amount of information in

both. Using the entire signal increases the comprehension rate. Using the redundant

information is important when part of the signal is noisy. Cohen and Massaro [CM93]

add poor quality video (only one of every three frames displayed) to audio and report

phoneme¹ recognition rates of 55% with sound only, 4% with visual only, and 72%

¹ A phoneme is a phonetic element of speech. There is a lot of debate as to what constitutes the basic elements of speech, those that can be concatenated together to create an utterance. Phonemes (phonetic elements) are a popular choice, with the number of phonemes per language varying widely.

with both. Their results show that the two modes combine in a super-additive fashion

and that the visual cues are important in recognizing speech. Similar results from other researchers [BS83, ESF79, GMOB95, GMAB97, MC83, McG85] have also been reported.

When creating animated speech, an animator is generally concerned with synchronizing the lips to the sound. The motion of the lips is extremely complex and difficult for a human to reproduce on a frame-by-frame basis. If an attempt is made to animate each frame, the resulting animation generally has too much motion and does not look natural. Instead, a keyframing approach that uses just a few different lip positions [Mad69] is often more successful. The resulting animation is not always realistic, but still aids the listener in understanding the speech. Fortunately

for the , the audience often dispels their disbelief rather quickly and just

concentrates on understanding the speech.

Computer graphics has been successfully used to reduce the effort and required

skill level of animators in many areas of animation, such as crowds, physics, 3D set

layouts, , etc. Naturally, attention was turned to reducing the difficult

task of creating animation of speech using computers. For nearly three decades,

researchers [BL85, CPB+94, deG89, Ess94, EP98, Gue89a, GGW+98, McG85, Pat91,

Par72, Par74, Par75, PW96, Pel91, PHL+98, SMP81, Pla85, Wat87, Wil90a]

have actively pursued techniques to create computer animations of talking heads. For

the purpose of communication, the important areas of research have been concerned

with facial modeling, animating facial expressions, and lip synchronization.

1.1 Applications of Facial Modeling and Animated Speech

Facial modeling and animation have diverse uses in a wide range of fields including the entertainment industry, education, computer vision, surgery, human-computer interaction, and virtual environments to name a few.

The entertainment industry, which includes film, television, commercials, theme

rides, and games, has produced some of the most realistic facial animations. They

use facial animation to tell us a story, make us laugh, make us cry, make us spend

our money, and help us immerse ourselves in a new world, if only for a little while.

They use computer graphics to create scenes that would otherwise be impossible, or prohibitively expensive to create. Examples include scenes of talking Martians, animals, John F. Kennedy, John Wayne, etc. They achieve these results at great

costs in talent, equipment, and time.

Because speech is a natural method of communication, a talking head is a wonderful user interface. Instead of a person reading an encyclopedia, having a talking head recite the encyclopedia may make it more exciting for a child to learn, possibly leading to easier or quicker understanding. Additional uses of speech include a talking catalog that lists the products of a company, a talking kiosk that greets visitors to a city and tells them where to find fun and excitement, or a greeter for a web site.

Take, for example, the talking catalog for a company that sells thousands of products.

To record an actor discussing the particulars for all of the products is prohibitively expensive in time and storage. In addition, product updates need to be considered.

If the company is international, or operates in a country with a multi-lingual culture, the products would need to be described in different languages. Animation of talking heads is a great solution, especially if it can be derived from text. The company already

updates the text in the catalogs; the animation just needs to be regenerated, and if the generation is fast, only storage for the text is required.

Teleconferencing is a popular and productive method for getting groups of people

together that are physically separate. The cost in equipment, cabling, and bandwidth

of sending video can be enormous. Teleconferencing also lacks important eye-to-eye

contact; one cannot look directly at the camera and screen at the same time. Virtual

teleconferencing systems attempt to overcome these drawbacks with compression to

reduce bandwidth and immersive virtual reality techniques to allow participants to

make eye contact and also to interact with the same objects. In such a system, a

digital duplicate, called an avatar, represents a participant, generally with similar

behavior but not necessarily similar appearance.

One of the challenges of teaching is to attract and maintain the student’s attention.

A possible method to garner attention is to use animations of characters that interest

the student. Imagine that the student would like to learn physics from Albert Einstein

or maybe Donald Duck. Other educational applications include teaching the deaf to

read speech and also to speak, and teaching foreign languages.

1.2 Motivation For Using Computers To Animate Speech

Speech is important as a method of communication to a human audience. Anima-

tion makes it possible to create verbal communication that otherwise would not be

possible, such as individuals no longer living, aliens, animals, and fictitious charac-

ters. Computer graphics has the ability to decrease the costs of creating animations of

talking characters. The costs include time, skill, equipment, personnel, and storage.

Other advantages of computer graphics over hand drawn animation are that with

computer graphics it is easier to produce photo-realistic animation, and with a true 3D representation, changing camera views is simplified.

Creating convincing computer-animated speech requires a highly deformable facial model capable of shaping the visible vocal tract and facial surface in a realistic manner along with convincing animation of that model. The problem of creating computer-

animated speech can be split into modeling and animating aspects. Modeling involves creating deformable geometry to represent the visible portions of the head. The role of the underlying tissues in shaping the surface still needs to be considered even if not

explicitly modeled. Animation of a facial model involves manipulating the geometry

over time to create realistic motion.

In general, a facial animation solution should have the following properties:

• Static realism.

• Dynamic realism.

• Individual realism.

Creating static realism is a modeling problem and involves creating a realistic face

for a single frame, that is, the geometry and the rendering of the face look real. Real-

istic geometry requires a medium-resolution to high-resolution surface. Creating such

a surface is a well-understood problem. Realistic rendering is also a well-understood

problem and the most successful solutions involve texture mapping. Static realism

can be achieved by any number of methods that capture geometry, such as from pho-

tographs or using a laser to create a distance map. For a model to have static realism,

realism for a single neutral pose, or a small set of poses, is not sufficient; realism must

exist for any natural shape the face may take during the animation.

Dynamic realism requires that the motion be realistic. Dynamic realism is independent of static realism. Dynamic realism can be achieved using just outlines or wire-frame rendering. However, static realism is generally more effective when combined with dynamic realism. Dynamic realism in facial animation, particularly speech, is difficult because the face is highly deformable, small deformations can be significant, and the audience members are experts, having seen real speech their entire lives.

Individual realism requires that the facial model not only look, but also act, like

a particular person. Static and dynamic realism are necessary but not sufficient for individual realism. Often only static individual realism is sought, where the synthetic character looks like the correct person, but does not necessarily move or act like that person. Dynamic individual realism is when the synthetic character moves or acts like a particular person but does not look like that person. This is similar to doing an impersonation.

1.3 Current Solutions for Speech-Synchrony

Most facial modeling research to date has not concentrated on the specific aspects required for realistic speech-synchrony. Meanwhile, the research into speech-synchrony has mainly focused on animation and not on modeling. Facial modeling to support speech must concentrate on the mouth, which includes the lips, the tongue, the skin surrounding the mouth and the jaw.

Many different techniques have been used for facial animation with limited success. Keyframing [BL85, deG89, deG90, EP97, EP98, Gou89, Kle89,

MAH90, PW91, Pat91, Par75, PBS91, Pel91, PVY93, PALS97, PHL+98, PB95,

Wai89, WL93] has been by far the most common method. Keyframing produces adequate results for facial expressions but unsatisfactory ones for speech-synchronized animation. Keyframing is a technique borrowed from the craft of hand drawn animation. Keyframes ease the effort of the animator and produce smoother animations as compared to attempting to create each frame separately.

Using linear or spline interpolation, which works well for most animation, falls short when animating speech. The primary problem with animating speech is the difficulty of creating the keyframes. The common approach is to break the speech into phonemes and then convert them into visemes², which become the keyframes. But a phoneme does not always look the same. Its appearance depends on its emphasis, duration and the neighboring phonemes. These effects are known as coarticulation and prosody. Coarticulation is due to the fact that the vocal tract is made up of tissues that have finite acceleration and deceleration. Often, there is insufficient time between phonemes for an articulator to reach the next ideal position or hold it long enough. The articulator must therefore either start its motion earlier or lag behind, blurring the lines between phonemes. Prosodic features of speech include length, accent, stress, tone, and intonation [Fox00] and affect the timing and position of the vocal tract parts. Coarticulation is physical, while prosody is systemic.

Coarticulation methods have been employed to increase the realism of animated speech, generally in combination with keyframing. Pelachaud et al. [Pel91, PBS91,

PVY93] handle coarticulation using a look-ahead model [KM77] that considers articulatory adjustment on a sequence of consonants followed or preceded by a vowel.

² A viseme is a visual element of speech. Phonemes correspond to the audio portion of speech and visemes to the visual part of speech. Visemes grew out of speechreading education. Generally, there are far fewer visemes than phonemes, because many different sounds do not look visually distinct: the articulators of the vocal tract that make them different are not visible.

Cohen and Massaro [CM93] use the articulatory gesture model of Löfqvist [Löf90], which uses the idea of dominance functions. Each segment of speech has a dominance function for each articulator. The dominance functions overlap and are blended together to determine the correct position for the articulator. Waters and Levergood

[WL93] use cosine acceleration and deceleration and assign masses to the targets with

attached springs to simulate the elastic nature of tissue.
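As an illustrative sketch only (not the implementation used by any of the systems cited above), the following fragment shows the general dominance-and-blend idea: each speech segment contributes a negative-exponential dominance function for an articulator, and the articulator's value at any instant is the dominance-weighted average of the segment targets. The exact exponential form, the parameter values, and all names are assumptions made for illustration.

```python
import math

def dominance(t, center, alpha, theta, c=1.0):
    """Dominance of one speech segment over an articulator at time t.

    A negative-exponential shape in the spirit of Cohen and Massaro's
    description: alpha scales the peak, theta controls how quickly the
    influence decays, and c shapes the falloff (names are illustrative).
    """
    return alpha * math.exp(-theta * abs(t - center) ** c)

def blend_targets(t, segments):
    """Weighted average of segment targets; weights come from dominance.

    `segments` is a list of dicts with keys: target, center, alpha, theta.
    """
    num = den = 0.0
    for s in segments:
        d = dominance(t, s["center"], s["alpha"], s["theta"])
        num += d * s["target"]
        den += d
    return num / den if den > 0 else 0.0

# Two overlapping segments pulling one articulator toward different targets.
segments = [
    {"target": 0.8, "center": 0.10, "alpha": 1.0, "theta": 12.0},
    {"target": 0.2, "center": 0.25, "alpha": 1.0, "theta": 12.0},
]
for t in (0.05, 0.15, 0.25):
    print(f"t={t:.2f}s articulator={blend_targets(t, segments):.3f}")
```

Because neighboring dominance functions overlap, each segment's target is only partially reached when the segments are close together, which is exactly the coarticulation effect discussed above.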

Prosodic effects are rarely considered. Work at the University of Pennsylvania

[CPB+94, Pel91, PBS91, PVY93] on automated generation of conversations consid-

ers intonation. Their system automatically generates animated conversations between

multiple human-like agents, with speech, intonation, facial expressions and hand gestures. However, intonation is handled with facial expressions and does not modify

the appearance of the viseme.

Some of the best results of speech synchronization to date involve some sort of

motion capture [deG89, deG90, PLG91, SVG95, Stu98, Wil90b, Wil90a]. Motion

capture is an animation method that can drive any type of model if a correspondence

between the motion capture data and deformations of the model exists. The motion

of the lips and key facial feature points are tracked and the audio is recorded while

the actor is reciting the dialog. The tracked feature points are used to drive the

computer generated character while the recorded dialog is played. The resulting

speech-synchronized animation is of extremely good quality if good motion capture

equipment is combined with a high quality facial model. This solution has two main

problems: expense, and the requirement that the speech be recorded in advance. The expense

of the method has many facets. First, the equipment is very expensive. Talent can

also be expensive especially when using a famous actor for the voice talent. Time is an

additional expense to consider. Once the data is captured it must be post-processed to make it usable. This requires time and experience. Since motion capture is just capturing a specific performance, any animation that is to be created must previously be performed. That means no novel speech animation can be created, only speech previously recorded. A minor change in the dialog requires that new motion capture data be acquired.

2D image processing techniques [EP97, EP98, PB95] have also been used with good results for speech synchronization. The character is filmed speaking a cor-

pus that includes all the necessary phonemes or triphones (a combination of three phonemes). Triphones are popular because most of the coarticulation effects on the

middle phoneme are present in the triphone. Blending the triphones together with

image processing techniques gives very realistic motion. These systems give very

good quality and allow novel speech to be constructed. However, the character to

be animated must be available to speak the corpus. For non-human characters or

historical figures, this may not be possible. In addition, these methods are 2D techniques

so lighting changes and novel camera views are problematic.

Some of the first uses of a tongue in computer facial animation were in the creation

of animated shorts [Kle89, Ree90]. Although the tongue exists in these shorts, it is

considered secondary. Parke [Par74] uses 10 parameters to control the lips, teeth, and jaw during speech synchronized animation. The parameters are chosen empirically

and from traditional hand drawn animation methods. These parameters give control

over the lip motion, but have limited control over lip shape. Cohen and Massaro

[CM93] modify the Parke model to include a simple tongue and new lip parameters.

The tongue animates stiffly and the limitations of a simple, rigid tongue are quite

apparent during excited conversation when the mouth is opened wide and during non-speech-related animation such as licking the lips or sticking out the tongue.

Pelachaud et al. [PvOS94] develop a method of modeling and animating the tongue using soft objects, taking into account volume preservation and penetration issues. This is the first work on a highly deformable tongue model. The major drawbacks of this method are the computationally expensive rendering and processing plus the large number of parameters.

Research at the Institut de la Communication Parlée [Adj93, GM92, GMAB94] in Grenoble, France resulted in the ICP lip model. The ICP lip model is driven by

five parameters determined by regression analysis on lip contours. The ICP lip model was designed by analyzing lip contours during speech for use in a tracking system.

It is quite capable in tracking applications; however, in animation systems the model suffers from interior shape inconsistencies and does not provide enough variability in shape.

1.4 Thesis Overview

This research tackles the problem of creating realistic animated speech by attacking both modeling and animating. Without a highly deformable model, the best animation method will fall short, and conversely, without a good animation method a highly deformable model will not matter. Since the information in the visual part of speech is concentrated in the mouth area, we create highly deformable lips and tongue, plus add teeth and an inner mouth. To create photorealistic images we propose a method to render surface detail of the lips and tongue. We consider the underlying tissue since it plays an important role in the final shape of the surface. To create

realistic motion, we develop a coarticulation model that treats visemes as a dynamic

shaping of the vocal tract instead of as a single keyframe.

Our research allows us to achieve individual realism. To achieve static individual

realism our facial model accepts a facial mesh and texture map as input. This method

is chosen over fitting a prototype because:

• When adapting a prototype the final mesh can retain traits of the prototype.

• If the data being adapted to is of higher resolution than the prototype, partic-

ularly in areas of high curvature, information will be lost.

• A prototype is limited in the possible facial geometries it can represent.

To create convincing speech, a facial model must be able to represent the myriad different shapes of the lips, inner mouth and tongue, the movement of the skin around the mouth due to lip movement, the deformations due to movement of the jaw, and the deformations in the surface of the face due to the contractions of the muscles.

In addition, the facial model must be animated properly to achieve intelligible visual

speech.

Our facial model considers the surface features and also the subsurface features

and was designed to support speech synchronization as well as facial expressions. The

facial model contains:

• a realistic skull layer with a movable jaw that affects the surface shape,

• a highly deformable lip model that considers the full lips as well as the mucous

membrane and is grafted onto the input geometry,

• a parametric deformable tongue model capable of realistic tongue shapes,

• an inner mouth with teeth, gums, upper palate and uvula,

• and movable eyeballs and eyelids.

The Facial Model is used in a text-to-audiovisual-speech (TTAVS) system showing the model’s utility.

In order to achieve dynamic realism, we propose a method of animating speech that treats the production of phonemes as a dynamic process instead of as a static one. This allows for better motion of the lips, tongue and jaw. We also modify the coarticulation model of Cohen and Massaro [CM93], which is based on the articulatory gesture model of Löfqvist [Löf90]. We modify the shape of the dominance functions to

give better velocity and acceleration properties. We also modify the blending of the

dominance functions to take away global control over movement of the articulators.

1.5 Organization of Thesis

Chapters 2-4 discuss modeling issues. Chapter 2 presents the facial model we have developed to support animated speech. Our facial model concentrates on the mouth, which provides most of the information from the visual part of speech. The facial model considers the internal tissues of the face as well as the surface to give more

realistic motion. The facial model accepts input surface geometry and modifies it to include lips, tongue, teeth, gums, eyes, and eyelids.

Chapter 3 describes a new parametric lip model capable of producing the lip

shapes necessary for speech as well as lip shapes for general facial animation. The

lip model is grafted onto the input geometry to guarantee full lip geometry and the

necessary complexity. A method of rendering the lips to include surface detail and

lighting effects is also presented.

Chapter 4 presents, in detail, a new deformable tongue model to support speech animation. The tongue model is capable of all tongue shapes necessary for English speech as well as most tongue shapes required during facial animation. The tongue model is composed of a B-spline surface and a parameterization to define tongue shapes. A method of rendering the tongue for photorealism is also discussed.

Chapter 6 is an overview of TalkingHead, our text-to-audiovisual-speech (TTAVS) system. This system accepts text and produces an animation of a talking head reciting the text. Both the audio and video are created from the text, allowing for the generation of vast amounts of novel animations with minimal storage requirements.

Ultimately, the animation method determines the quality of the animation, just as

the skill of the master animator determines the quality of a hand drawn animation.

Chapter 5 describes our method of animating speech using a modification of the

coarticulation model by Cohen and Massaro [CM93].

Chapter 7 reports on results we have obtained using our system. This work is

about creating speech synchronized animation, and the only way to truly appreciate

the results is to see and hear the animation. Unfortunately, technology does not

currently allow animations in a document such as this. Instead, a comparison between

actual recorded speech and synthetic speech is shown using snapshots and motion

curves.

This thesis gives preliminary results of a partial solution that extends the state of

the art in a positive and important way. Chapter 8 sketches possible extensions to

this work as well as new applications of this work.

Appendix A contains an extensive overview of computer facial animation. The

overview gives a broad base of the knowledge necessary for facial animation including

anatomy, facial modeling, animation methods, parameterizations, vision applications,

standards and surgery. The overview covers around 200 of the more than 400 research papers written on facial animation to date.

Because of the complexity of human anatomy, descriptive anatomical terms were

devised to decrease confusion when discussing anatomical structures, their motion

and their location. Appendix B lists many of the anatomical terms that are likely to

be encountered when studying or describing the anatomy of the face and speaking

apparatus.

CHAPTER 2

THE FACIAL MODEL

Speech is multi-modal with a visual and audial component. The listener uses the

sound along with the motion of the lips to determine which words are being spoken.

The listener also looks for nonverbal cues to get information on the speaker’s intent and emotion. The nonverbal cues come from hand gestures, body language and fa-

cial expressions. To achieve communication with facial animation, a model must be

capable of synchronizing the lips and displaying expressions. Realistic facial anima-

tion is necessary for convincing the viewer that they are viewing a real human being,

but it is not necessary for communicating with the listener. Cartoons with rudimen-

tary lip synchronization and simple expressions are capable of conveying what the artist wants. Systems such as that of Cohen and Massaro [CM93] that do not try to

achieve total realism are still effective since the listener often suspends her disbelief after

a short time of listening to the message and instead concentrates on comprehending

the message.

Creating realistic animation of a deformable object starts with a good model of the

deformable object. The object must be capable of adequately portraying any shape

needed during the animation. Creating animated speech requires an adequate facial

model that can represent the large space of possible mouth shapes that includes not

only the lips but also the inner mouth. The inner mouth includes the tongue, teeth, gums, upper palate and inner cheeks. To be able to create realistic animated speech we first need a highly deformable facial model, which we describe in this chapter.

The terms model and modeling have diverse meanings in computer graphics. Here we use modeling to mean the process of defining the shape of an object, that is, creating the model. We define the term model as a description of an object that includes geometry, surface properties, animation controls, and rendering; basically everything required to create a single image of the object. The geometry can be described in any number of ways, such as with triangles, implicitly, or with splines, and can

be described in more than one way. At the minimum, the visible surface needs to

be described but often subsurface information is also included. Surface properties

include color, texture, and lighting information. Animation controls are a method of

describing the shape of the model, which may use deformations from a base shape or

may generate a complete description of the geometry. Animation controls generally

tend to have fewer degrees of freedom than the description of the geometry. A popular

choice is a high-level parameterization such as facial expressions. A model can have

any number or type of animation controls.

Our model consists of:

• Geometry of the surface of the face that is input to the model. This geometry

can come from diverse sources such as a laser scanner, from photographs, or a

human modeler. We anticipate laser scanners or photographs to be the most

common sources and therefore use points and triangles internally to represent

the surface.

• A highly deformable model of the lips that is grafted onto the input geometry

to guarantee acceptable complexity. This lip model was developed as part of

this research.

• A highly deformable tongue capable of the shapes necessary for speech. The

tongue model was created during our research to support more realistic speech.

• Full mouth geometry, including teeth, gums, upper palate and mucous mem-

brane.

• Articulable eyes.

• A parameterization for control over the tongue, lips, jaw, eyes, and head rota-

tion. This parameterization gives control to the animator over the shape of the

geometry.

• Consideration of the subsurface tissues with a realistic skull fitted to the surface

geometry based on soft tissue depths.

To achieve static realism we require a model that is highly deformable and capable of representing the huge space of possible shapes of the face. To reach our goal of animated speech, we create a facial model specifically designed for animated speech with a highly deformable mouth. This is in contrast to most facial modeling, which is designed for facial expressions. We concentrate here on solving the problem of speech-synchronized animation and ignore facial expressions. Facial expressions are very important in communicating meaning and emotion and need to be included in a final solution. However, the synchronization of speech is the focus of this research, and we leave the animation of expressions for future work.

In order to create static individual realism, accurate geometry and good texture

information is required. A common solution is to use a laser scanner to get high-resolution geometry along with surface color information. Since laser scanners generate a depth map in a regular grid, points and polygons are a natural representation for the surface. Triangles are also fast to render in hardware. In addition, a point on the surface can be placed at a specific location in ℝ³ by a translation. We thus use triangles to represent the model geometry. Other types of representations for the input geometry are acceptable; they just need to be converted to polygons. Spline

representation could be used directly with some algorithmic and procedural changes.

Dynamic realism is achieved with an animation technique in conjunction with

a highly deformable model. To help the animation method the facial model has a

high-level parameterization to control the lips, tongue, jaw, eyes and head rotation.

The high-level parameterization reduces the complexity of the animation method and

simplifies the description of an animation, thereby reducing the effort of the animator

or director.

2.1 Skin

The skin is a protective covering of the body and defines its surface. In modeling,

the skin is generally idealized as a 2D membrane. The skin of the face is loosely

attached to the underlying tissues with direct muscle attachments that move the skin.

These muscles allow for very subtle movements that convey important information.

During speech the skin of the face is very mobile. This mobility is due to stretching,

direct muscle action, and movement of the underlying tissues. In this section the

movement of the skin is discussed.


Figure 2.1: Characteristic points used to deform the facial surface. Image a) shows the characteristic points while b) shows the control network formed by triangulating the characteristic points.

2.1.1 Characteristic Points

Characteristic points are used to reduce the complexity of the geometry. They are

placed in areas that move in unison, in areas with a high contrast in behavior, and along the

borders of the articulators. A patch of skin, such as the cheeks, moves very similarly,

so the movement of a single point can be used to represent the entire patch. The

borders of these areas of similar movement must be marked, and this is done by

placing characteristic points on the borders. Points that move due to movement

of articulators, such as along the jawline and around the mouth are also marked

as characteristic points. Figure 2.1a shows a model with its characteristic points

and in Figure 2.1b the control net formed by triangulating the characteristic points

can be seen. Characteristic points are interactively selected by the user by clicking


Figure 2.2: A vertex, Pi, is projected onto the plane of the triangle △ABC as P and the barycentric coordinates for P are determined. Di is the offset of Pi from the plane of △ABC.

with the mouse on their locations. The border vertices of the grafted lip model are

automatically added to the set of characteristic points.

A fast method to relate changes in the characteristic points to changes in the

other vertices is needed. We choose to define each vertex as a weighted combina-

tion of three characteristic points, known as barycentric coordinates. To define the

barycentric coordinates, the three characteristic points must be chosen. This is done

by triangulating the set of characteristic points, Ci, in cylindrical space, which is

described in Section 2.1.2. The barycentric coordinates, wi,j, for each vertex, Pi, are

calculated using the control points that define the triangle that Pi intersects in cylin-

drical space. Here i is the index of the vertex, and j is the index of the control points that

define the triangle. Figure 2.2 shows a triangle of characteristic points A, B and C

and point Pi. Pi is projected onto △ABC as P. The barycentric coordinates for P are then calculated. Barycentric coordinates are weights for each vertex of the triangle and are the ratio of the area of the triangle formed by P and the other two vertices to the area of △ABC. That is, the weight for vertex A is the area of △BCP divided by the area of △ABC. The position of a non-characteristic point Pi is calculated by:

P_i = \sum_{j=1}^{3} w_{i,j} C_{i,j} + D_i

where Di = Pi − P is the displacement or offset of Pi from the plane of △ABC. The

displacement is needed since barycentric coordinates only define points in the plane defined by the three points of △ABC. With the displacement, any point in ℝ³ can

be defined.

The control mesh of the characteristic points is determined in cylindrical space, and does not guarantee that a perpendicular projection of Pi onto the plane of △ABC

is inside of △ABC. Therefore, care must be taken when calculating the area of the

triangle to determine the barycentric coordinates. A fast method can be found in

Graphics Gems II [Gol91].

By using characteristic points the deformation of the skin surface is done in a

hierarchical manner, thereby reducing the computational complexity to calculate the

shape of the model. Applying barycentric coordinates to a vertex is much faster than

determining the effect of all the parameters on the vertex.
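The following sketch (our own illustration, with assumed names and conventions rather than the code of this dissertation) shows the binding described above: a vertex is projected onto the plane of its control triangle, the barycentric weights and out-of-plane offset are computed once, and the vertex is later re-positioned from the possibly displaced characteristic points.

```python
import numpy as np

def bind_to_triangle(P, A, B, C):
    """Bind vertex P to control triangle ABC: return barycentric weights
    (wA, wB, wC) of P's projection onto the triangle's plane and the
    offset vector D, so that P == wA*A + wB*B + wC*C + D."""
    n = np.cross(B - A, C - A)
    n = n / np.linalg.norm(n)
    D = np.dot(P - A, n) * n        # out-of-plane offset (the Di of the text)
    Pp = P - D                      # projection of P onto the plane
    area = np.dot(np.cross(B - A, C - A), n)             # twice the signed area of ABC
    wA = np.dot(np.cross(C - B, Pp - B), n) / area
    wB = np.dot(np.cross(A - C, Pp - C), n) / area
    wC = 1.0 - wA - wB
    return (wA, wB, wC), D

def reposition(weights, D, A, B, C):
    """Recompute the vertex after the characteristic points have moved."""
    wA, wB, wC = weights
    return wA * A + wB * B + wC * C + D

# A vertex bound once to a control triangle follows the triangle's motion.
A, B, C = np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.])
P = np.array([0.3, 0.3, 0.05])
weights, D = bind_to_triangle(P, A, B, C)
A_moved = A + np.array([0., 0., 0.2])   # one characteristic point is displaced
print(reposition(weights, D, A_moved, B, C))
```

Because the binding is computed once and then reused, moving a handful of characteristic points repositions every dependent vertex with only a weighted sum per vertex, which is the source of the speedup noted above.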

2.1.2 Cylindrical space

Most problems, such as triangulation, are easier to solve in 2D than in 3D. If a mapping from 3D → 2D exists then it can be used to simplify such problems.

The face is shaped like an ellipsoid so a spherical mapping is possible. However, the

majority of the features we are interested in are in the body of the ellipsoid and not the ends. The top of the head and the neck are of less concern to us, allowing the use of

a cylindrical mapping. This is indeed how many laser scanners work. The cylindrical mapping can fail to give us the desired results in the nose and ear area, where the

surface is along the normal to the center of projection. However, for models obtained

from a laser scan using a cylindrical projection, such points would not exist. Each

vertex is projected into cylindrical space, (i, j), with:

(i, j) = \left( y,\ \tan^{-1}\frac{x}{z} \right)
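As a small illustration, the projection can be written as below; the use of atan2 to cover the full angular range and the choice of y as the vertical axis are assumptions made for this sketch, not a transcription of the system's code.

```python
import math

def to_cylindrical(x, y, z):
    """Project a head vertex (x, y, z) into the 2D cylindrical space (i, j):
    i is the height along the vertical (y) axis and j is the angle about
    that axis.  atan2 is used instead of a bare arctangent so the full
    360-degree range around the head is handled."""
    return y, math.atan2(x, z)
```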

2.2 Skull

Although the skull is not visible, it has a profound influence on the shape of the

skin at the surface. As the skin and muscles are stretched over the skull, they retain the skull’s shape. During speech and mastication, the articulation of the mandible causes deformation of most of the face. To create realistic deformations the skull must be considered.

When animating a real person, the ideal skull would be one identical to that person's skull. Although it is possible to obtain geometry of the skull with medical imaging techniques, in most situations, a less invasive technique is required. Instead, we fit a generic skull by hand using tissue depth information [RC80, RI82] in a method similar to that of forensic facial reconstruction [KI86]. The resulting skull shape is not accurate for a specific individual, other than the person the data was taken from (in this case the Visible Human Female), but it does give acceptable results since we are only interested in


Figure 2.3: CT images from the Visible Human Project used to create the generic skull. To the left is the original CT image and to the right is the hand-modified slice.

realistic animation and not accurate simulation; however, morphometric techniques

[Boo91] could be used to achieve a more realistic skull fit.

The generic skull was created by applying marching-cubes on hand-modified CT images of the Visible Human Female [NLH99] resulting in a polygonal surface. In

Figure 2.3, the original CT image is on the left and the hand-modified image is on the right. The images were filled in to achieve a one-sided surface and to ignore any of the interior features of the skull. The polygonal surface was then decimated, teeth and gums were added, and the mandible was separated and made to articulate. The resulting generic skull is shown in Figure 2.4.

Figure 2.4: The generic skull we fit to the input geometry.

2.2.1 Collision Detection.

The facial model includes the skull because of its effect on the resulting surface. A method to determine just when the skull will affect the surface is needed. One method is to attach the skin to the bone directly using springs to keep the skin vertices from penetrating the skull [KGC+96, LTW93]. This simplified skin model does not allow for tissue interaction with bone or with other tissue. Stretching of the skin is also restricted, as skin nodes tend to hover over the same area of bone.

General collision detection between two surfaces can be very expensive and dif-

ficult. Instead, we choose a method to avoid collisions that still allows interactions between the various tissues to occur. We use an implicit function to keep the skin at least a minimum distance away from the skull without restricting any other movement of the skin. This minimum distance is based on tissue depth information [RC80, RI82]

from the field of forensic facial reconstruction. This allows us to simulate the tissue

(muscle, fat and connective tissue) between the skin and the skull. We therefore do not need to calculate actual collisions and we do not need to preserve the volume of

the tissue. Our method is not accurate, but we do achieve realistic results, which is

our goal.

The skull is approximated by an implicit function that is a combination of implicit

primitives. The implicit function is then used to determine the distance to the surface

of a vertex. Each skin vertex has a limited range of skull it can intersect and using this

knowledge of the facial structure can significantly increase performance. For instance,

a cylindrical arc, described below, can approximate the teeth and only those vertices

of the upper chin need be tested for collision. When a collision is detected, the vertex

is moved away from the bone along the normal of the control triangle by modifying

the offset Di.

An open cylindrical arc with boundary A(E1, E2, r, θ1, θ2) is defined by the two

endpoints, E1 and E2, of the spine of the cylinder, the radius r of the cylinder, and

the beginning and ending angles, θ1 and θ2 of the arc as shown in Figure 2.5. The

points a distance r from the spine with an angle from the spine within the range of

the arc define a spherically capped cylindrical arc. By removing those points that do

not intersect the line of the spine between the two endpoints, an uncapped cylindrical

arc is produced. The same shape could be defined with only one angle, but may then

need to be rotated about the axis of the spine.


Figure 2.5: All the points a distance r from the spine defined by the endpoints E1 and E2 and between the angles θ1 and θ2 define a cylindrical arc.
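The sketch below illustrates how such an arc can be used to keep a skin vertex clear of the bone: the vertex is tested against the uncapped extent, the radius, and the angular range of the arc, and if it lies inside it is pushed outward along its control-triangle normal, as described above. The reference direction used to measure the angles, the size of the push, and all names are assumptions for illustration only.

```python
import numpy as np

def push_outside_cyl_arc(P, E1, E2, r, theta1, theta2, ref, tri_normal):
    """Keep vertex P at least distance r from the spine E1-E2 wherever it
    falls inside the uncapped cylindrical arc A(E1, E2, r, theta1, theta2).
    `ref` is a unit vector perpendicular to the spine marking the angle
    origin; `tri_normal` is the normal of the vertex's control triangle."""
    spine = E2 - E1
    length = np.linalg.norm(spine)
    axis = spine / length
    t = np.dot(P - E1, axis)
    if t < 0.0 or t > length:
        return P                       # beyond the uncapped ends: no test
    radial = (P - E1) - t * axis       # vector from the spine out to P
    d = np.linalg.norm(radial)
    if d >= r:
        return P                       # already clear of the bone surface
    angle = np.arctan2(np.dot(radial, np.cross(axis, ref)),
                       np.dot(radial, ref))
    if not (theta1 <= angle <= theta2):
        return P                       # outside the arc's angular range
    n = tri_normal / np.linalg.norm(tri_normal)
    return P + (r - d) * n             # adjust the offset along the triangle normal
```

Only the few vertices known to lie near a given arc (for example, the upper chin vertices near the teeth) need to be passed through such a test, which is where the performance gain described above comes from.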

2.3 Lips

Of the visual part of speech, the lips convey the most information. It is very important for both intelligibility and realism that the lips be capable of realistic shapes. To create realistic animated speech, the lips must not only move realistically, they must also be shaped correctly.

Additionally, when the lips part, the inner portion of the lips³ must exist. A common source of facial geometry is a laser scan of the subject. Generally, the face is scanned in a neutral position with the mouth closed. Such a scan misses the inner portion of the lips. When the mouth is opened, the lack of flesh is extremely noticeable. Sometimes the subject will be scanned with the mouth open giving some of the internal lip geometry, but it cannot be guaranteed that enough geometry exists. Even if it exists there is no guarantee the lip geometry will be complex enough to create the necessary shapes for speech and facial expressions involving the lips.

To solve these problems we created a parametric model of the lips that is highly deformable, has a muscle-based parameterization, and has full lip geometry for both deformations and high quality rendering. This lip model is presented fully in Chap- ter 3.

2.4 Tongue

The tongue is an important articulator of the vocal tract and helps create the correct shape to produce the desired speech. Visually, it is only critical in distinguishing

³ The inner portion of the lips is the portion inside the rima oris. The rima oris is the point of contact of the lips during closure.

a few sounds, such as /l/ and /th/. Although the tongue is not critical for distinguish-

ing between all sounds, it still conveys information and the lack of a tongue is easily

noted. Therefore, visually realistic speech is difficult to achieve without a tongue.

Our goal of realistic animated speech requires the existence of a highly deformable tongue. To that end, we developed a parametric model of the tongue capable of rep-

resenting the shapes necessary for speech as well as general facial expressions. Our

tongue model is discussed in detail in Chapter 4.

2.5 Eyes

A facial model must have articulating eyes or the model will not look alive. An eyeball is a sphere with an elliptical protrusion for the lens. Unless extreme closeups are required, a sphere will suffice. The necessary motions for the eye are rotations about its center.

We model the eye as a sphere that can be rendered using a procedural shader with or without a texture map. A full texture map of an eye is difficult to obtain since most of the eyeball is not visible at any one time. In addition, since the eye is kept moist with tears, it has a very high specular component making it difficult to obtain a texture map with only ambient lighting characteristics. The shader allows for the eye to be rendered realistically regardless of lighting conditions and regardless of the direction the eye is pointing.

Our facial model has three parameters for each eye, one for rotation about each axis. In practice, a more useful parameter is a point in space on which the eyes are fixated; the rotations are then calculated using inverse kinematics.
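As a rough sketch of that calculation, assuming a simple axis convention (forward along +Z, yaw about Y, pitch about X) that is not specified in the text, the fixation point can be converted to eye rotations as follows; for a spherical eyeball the inverse kinematics reduces to a look-at computation.

```cpp
// Sketch: converting a fixation point into per-eye rotation parameters.
// EyeAngles, fixationToAngles, and the axis conventions are illustrative
// assumptions, not the system's actual interface.
#include <cmath>

struct Vec3 { double x, y, z; };
struct EyeAngles { double yaw, pitch, roll; };   // rotations about Y, X, Z

EyeAngles fixationToAngles(const Vec3& eyeCenter, const Vec3& target) {
    double dx = target.x - eyeCenter.x;
    double dy = target.y - eyeCenter.y;
    double dz = target.z - eyeCenter.z;
    EyeAngles a;
    a.yaw   = std::atan2(dx, dz);                          // left/right
    a.pitch = std::atan2(-dy, std::sqrt(dx*dx + dz*dz));   // up/down
    a.roll  = 0.0;                                         // torsion ignored
    return a;
}
```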

2.6 Other facial parts

As stated earlier, laser scanners often miss geometry. Geometry is missing mostly because it is not visible to the sensor, except for hair, which actually dissipates the laser instead of reflecting it to the sensor. Sometimes the geometry can be specifically targeted or acquired with multiple scans; this is the case for capturing both the eyelids and the eyeballs. The missing geometry can be necessary

to convince the audience they are looking at a human and not a computer generated

image. The most common missing geometry is from the mouth, ears, nose, eyes and

hair. Since we anticipate laser scanners as a likely source of input geometry it is

assumed that some geometry may be missing. Since we are concerned with speech,

we have concentrated on the mouth so we add a complete inner mouth to guarantee

its existence. We also have added eyeballs. However, a complete solution requires the

missing geometry.

Hair is still an open problem due to its complexity. There could be in the neighbor-

hood of 100,000 individual hair strands that need to be modeled. These hair strands

have anisotropic lighting characteristics, they occlude each other, and cast shadows

on each other. As well, collisions between the strands and between the hair and the

face must be considered. Hair also has mass and follows the laws of physics, but has

highly varying physical attributes causing numerous differences in its behavior. If

that is not enough, humans also use chemicals to enhance its behavior. A general

solution is extremely complex and computationally intensive. Facial hair (eyebrows,

mustaches, and beards) on the other hand, tends to behave very stiffly and some

simplifying assumptions can be made.

2.7 Summary

To achieve realistic, intelligible, animated speech a highly deformable facial model is required. We have developed a model that supports the shapes needed for speech.

The model concentrates on the mouth, as that is where the majority of the infor-

mation in the visual portion of speech resides. To guarantee a mouth with sufficient

geometric complexity for realistic rendering and deformations, we develop deformable

parametric models of the tongue and lips and add geometry for the rest of the inner

mouth.

The mouth is also capable of facial expressions such as smiling, frowning, pouting,

sticking the tongue out, licking the lips, etc. The eyes are fully articulable and the

head is able to rotate. Other facial expressions such as moving the eyebrows, nostrils,

and eyelids can be added and are the subject of future work.

CHAPTER 3

THE LIP MODEL

3.1 Introduction

Facial animation is becoming more important as a communicative technique be-

tween man and machine. In addition, it is pivotal in the development of synthetic actors. The lips play an extremely important role in almost all facial animation.

They are a significant component of expressing emotion as well as being instrumental in the intelligibility of speech. Therefore, in order to achieve realism and effective communication, a facial animation system needs extremely good lip motion with the

deformation of the lips synchronized with the audial portion of the speech.

In order to animate a pair of lips, a mapping between the desired motion and lip

deformations is needed. For example, a mapping between speech segments and lip

shapes must exist. The mapping must then be fit to the geometry in order to be

used. Possible methods to fit the mapping are: fit the mapping directly to the input

geometry, fit a generic facial model with an embedded mapping to the geometry, or

fit a generic lip model with an embedded mapping.

Fitting the mapping directly to the input geometry requires that the geometry have enough resolution to adequately use the mapping, and the fit must be done for every input

geometry. This can be quite time consuming. Creating a generic facial model with the mapping already embedded and then changing the shape of the prototype to match the input is a popular solution. A prototype model has the required resolution and

fitting the geometry of the prototype to the desired character is less work. However, a drawback is that the resulting model retains some characteristics of the prototype after fitting.

This led us to develop a generic lip model with an embedded mapping. Using a generic lip model guarantees the required resolution for both deformation and rendering, fitting the generic lip model is easier than fitting the mapping, and the traits of the generic lip model that are retained are less noticeable than those of a complete model.

Our lip model can be used with any human-like facial model. It provides:

• a sufficiently controllable model to support lip synchronization as well as sup-

porting other motions used in expressing emotions,

• a sufficiently smooth model to support quality rendering,

• internal geometry (the part of the lips in the oral cavity not visible when the mouth is closed) usually not provided in digitized facial models,

• support for procedural texture maps for high quality rendering.

The lip model consists of a B-spline surface and high-level parameters which control the articulation of the surface. The high-level controls strive to be intuitive, precise, and as complete as possible. The controls are intuitive so an animator or director can easily and effectively use them. The controls are precise because of the desire to support lip-synced animation. The controls are complete in the sense that most useful positions and motions of the lips can be specified.

We use a B-spline surface for its C² continuity and the ease of deforming the surface by simply moving vertices of the control mesh. The drawbacks of B-splines include difficulty in placing a part of the surface exactly in R³, preserving volume, detecting collisions, and rendering. Fortunately, by polygonalizing the model, post-processing after deformations can achieve volume preservation and collision detection, while rendering the polygons is straightforward. Polygonalization loses the C² continuity of the B-spline surface, but the quality is controllable and, with Phong shading, the impact is minimal.
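For illustration, here is a minimal sketch of evaluating and sampling a uniform bi-cubic B-spline patch; the grid layout and names are assumptions, and the lip model's actual control grid, knot configuration, and tessellation code may differ.

```cpp
// Sketch: evaluating and polygonalizing a uniform bi-cubic B-spline patch.
// The control-grid layout and names are illustrative assumptions.
#include <array>
#include <vector>

struct Vec3 { double x = 0, y = 0, z = 0; };

// Uniform cubic B-spline basis functions at parameter t in [0,1).
std::array<double, 4> cubicBasis(double t) {
    double t2 = t * t, t3 = t2 * t;
    return { (1 - 3*t + 3*t2 - t3) / 6.0,
             (4 - 6*t2 + 3*t3) / 6.0,
             (1 + 3*t + 3*t2 - 3*t3) / 6.0,
             t3 / 6.0 };
}

// Evaluate the patch whose 4x4 control neighborhood starts at (i0, j0).
Vec3 evalPatch(const std::vector<std::vector<Vec3>>& P,
               int i0, int j0, double u, double v) {
    auto bu = cubicBasis(u);
    auto bv = cubicBasis(v);
    Vec3 s;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) {
            double w = bu[i] * bv[j];
            const Vec3& c = P[i0 + i][j0 + j];
            s.x += w * c.x;  s.y += w * c.y;  s.z += w * c.z;
        }
    return s;
}
// Sampling each patch on an n x n grid of (u, v) values and connecting the
// samples into triangles yields the polygonalization that is actually rendered;
// the sampling density controls the quality/speed trade-off mentioned above.
```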

The lip model is fitted to the input geometry as a preprocessing step with a user-guided process, described in Section 3.5.1. This process replaces the lip region in a given facial model and grafts the generic lip model onto the rest of the facial geometry.

The lip model is parameterized based on anatomy using the muscles that cause the lips to change shape as the basis. The anatomy of the lips is discussed in Section 3.3 and the parameterization in Section 3.4. As the parameters change, the lips deform which also drives deformation in the surrounding area. The formulas for calculating the change in the lip shapes are given in Section 3.5.

Our application is a text-to-audiovisual-speech system (TTAVS) that creates lip-synchronized speech from text input. Input facial geometry is transformed into a facial model that can be animated by our system allowing for the animation of any character. Animation is achieved by converting the input text into phonemes and using a mapping from phonemes into visemes. The mapping, which is a datafile, consists of keyframe poses of the facial model to achieve the desired phoneme. The model is rendered with procedural textures, described in Section 3.6, that create realistic surface detail and lighting.
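As an illustration of how such a phoneme-to-viseme datafile might be used, the sketch below stores one keyframe pose per phoneme label and blends between two keyframes. The type names and the simple linear blend are assumptions made for illustration, not the system's actual file format or interpolation scheme.

```cpp
// Sketch: a phoneme-to-viseme keyframe table of the kind described above.
// VisemeTable, Pose, and the linear blend are illustrative assumptions.
#include <map>
#include <string>
#include <vector>

using Pose = std::vector<double>;          // one value per facial-model parameter

struct VisemeTable {
    std::map<std::string, Pose> keyframes; // phoneme label -> keyframe pose

    // Blend two viseme keyframes for a time t in [0,1] between them.
    Pose blend(const std::string& a, const std::string& b, double t) const {
        const Pose& pa = keyframes.at(a);
        const Pose& pb = keyframes.at(b);
        Pose out(pa.size());
        for (size_t i = 0; i < pa.size(); ++i)
            out[i] = (1.0 - t) * pa[i] + t * pb[i];
        return out;
    }
};
```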

3.2 Previous Work

Over the last three decades, many techniques have been used in an attempt to create convincing speech-synchronized facial animation. It has proven a difficult task due to the complexity of the system and the low tolerance for inconsistencies in the

animation from a human audience. Concentration on the lips for the synchronization has been a theme, but only one research team has created a separate lip model.

Generally, the concept of visemes, the visual elements of speech, is used under the

guise of phonemes, the phonetic elements of speech. In this method, the facial model is deformed into a shape that represents a phoneme, which by consequence deforms the lips.

Early work in speech-synchronized facial animation involved creating animation using traditional hand-drawn animation techniques. Meanwhile, early work in the

speech and hearing community involved the use of oscilloscopes to generate lip shapes.

Later work involved automated techniques to synchronize the lips with the audio. In

this section, we discuss the early methods of generating lips and lip synchronized

animation.

3.2.1 Speech Reading

Fromkin [Fro64] reports on a set of lip parameters that characterize lip positions

for American English vowels using frontal and lateral photographs, lateral x-rays, and

plaster casts of lips. The lip parameters identified are:

1. Width of lip opening.

2. Height of lip opening.

3. Area of lip opening.

4. Distance between outer-most points of lips.

5. Protrusion of upper lip.

6. Protrusion of lower lip.

7. Distance between upper and lower front teeth.

This parameterization of the lips is very good for speech but it does not allow for other lip motions, such as expressive ones. We instead base our parameterization on muscle actions.

Research by the speech community on lip reading involved drawing lip outlines to represent speech. The results from these works are not as good as we would like, and our work is concerned with visual realism instead of just speech intelligibility.

Boston [Bos73] reports that a lip reader was able to recognize a small vocabulary and sentences of speech represented by mouth shapes drawn on an oscilloscope.

Erber [Erb77] also uses an oscilloscope to create lip patterns and claims to create motion that is more natural. Their studies show that lipreading the display gives similar results to those of reading a face directly. Erber and De Filippo [EF78b] drew eyes, nose and a face outline on cardboard, cut out the mouth area and placed it over the oscilloscope. Erber et al. [ESF79] use high-speed cameras to capture speech and determine lip positions by hand.

Brooke [Bro79] draws outlines of the face, eyes and nose with movable jawline and lip margins. Positional data is hand-captured from a video source. Brooke and

Summerfield [BS83] report on a perception study to determine if a hearing speaker

can identify the utterances. The natural vowels were identified 98% of the time and

there was good identification of the synthetic vowels, /u/ (97%) and /a/ (87%). The

vowel /i/ (28%), which is usually confused with /a/, and medial consonants were not

identified.

Montgomery [Mon80] draws lip outlines on a CRT from data hand captured from

video frames in a system designed to test lip reading ability. They augment the system by adding

nonlinear interpolation between frames as well as forward and backward coarticulation

approximation.

3.2.2 Computer Models

In Parke’s [Par74] ground-breaking work, he uses 10 parameters to control the lips,

teeth, and jaw during speech synchronized animation. The parameters are chosen

empirically and from traditional hand-drawn animation methods. These parameters give control over the lip motion, but give little consideration to lip shape, and the lip motion is limited.

Bergeron [BL85] reports that for the film short “Tony De Peltrie” keyframes

were defined and then interpolated using curves. The keyframes included those for

phonemes. This method of shaping the lips for each phoneme is a common approach.

However, it is difficult to create a complete set of poses that encompass the entire

space of motion desired. For example, to have a pose of smiling and saying /a/ at

the same time requires having to create that specific pose. By defining shapes that

differ from the neutral in a single way, such as smiling, these shapes can be used to

define a space of possible shapes. Each base shape is calculated as a displacement

[Ber87] from the neutral. By defining a percentage of each base shape, animations

that are more complex can be achieved. Our work is similar to this concept; however,

we choose our bases using muscle displacements instead of phonemes and expressions,

which we believe gives a larger space of possible lip shapes. Another difference is the

method we use to combine the displacements.

3.2.3 ICP Lip Model

Guiard-Marigny [GM92] measures the lip contours of French speakers articulating

22 visemes in the coronal plane. Assuming symmetry, the vermillion region (the red area) of the lips is split into three sections and mathematical formulas are

created to approximate the lip contours. From polynomial and sinusoidal equations,

the 14 coefficients are reduced to 3 using regression analysis. The three parameters

are internal lip width, internal lip height and lip contact protrusion. With the same

technique on lip contours in the axial plane, Adjoudani [Adj93] identifies two extra

parameters to extend the lip model to 3D. The two new parameters are upper and

lower lip protrusion. Guiard-Marigny et al. [GMAB94] describe the 3D model in

English. This work was carried out at the Institut de la Communication Parlée in

Grenoble, France and we refer to the model as the ICP lip model.

Guiard-Marigny et al. [GMTA+96] replace the polygonal lip model with an im-

plicit surface model that uses point primitives to achieve fast collision detection and

exact contact surfaces. For the polygonal model, to increase the speed of computation

they define a lip shape as the barycenter of a set of extreme lip shapes. The parame-

ters correspond to the weights and each parameter interpolates two extreme shapes.

They build the implicit surface from point primitives for each of the 10 key (extreme)

shapes. Any lip shape is found by interpolating the point primitive positions and

field functions. Implicit surfaces give an exact contact surface [Gas93], allowing the

interaction of the lips with other objects, for example a cigarette.

We originally used the ICP lip model, interpolating the 10 extreme shapes,

in our TTAVS system. The ICP lip model was designed by analyzing speech and is

only capable of representing lip shapes used during speech production. We desire a

model capable of expressing at least simple emotion such as smiling. The ICP lip

model also lacks visual realism when producing rounded lip positions, such as for the phonemes /o/ and /r/, because the corners do not move correctly. While it was

possible to significantly modify this model, it seemed more prudent to start from

scratch, creating an anatomically based model that was parameterized and deformed

based on muscle actions.

3.3 Lip Anatomy

The face is a biological system and the lips are deformed because of muscle con-

traction. We look at the anatomy of the lips and the underlying muscles to define

possible lip motion. Any good general anatomy reference (e.g. Gray [Gra77]), speech

and hearing anatomy reference (e.g. Dickson and Maue [DM70] or Bateman and

Mason [BM84]) or facial anatomy reference (e.g. Brand and Isselhard [BI82]) is a beneficial source of information. In this section, we give a description of the muscles that affect the lips as well as a description of the motion of the mandible.

Table 3.1 lists the muscles that affect the lips with a brief description of their actions. All muscles that affect the lips, except the orbicularis oris, are paired, that is, there is a corresponding muscle on the right and left side of the face with the same name.

Buccinator: Compresses the cheek against the teeth, retracts the corner of the mouth.
Depressor anguli oris: Draws the corner of the mouth downward and medialward.
Depressor labii inferioris: Depresses the lower lip.
Incisive inferior: Pulls the lower lip in towards the teeth.
Incisive superior: Pulls the upper lip in towards the teeth.
Levator anguli oris: Moves the corner of the mouth up and medially.
Levator labii superioris: Raises the upper lip and carries it a little forward.
Levator labii superioris alaeque nasi: Elevates the upper lip and the wing (ala) of the nostril, while aiding in forming the nasolabial groove.
Mentalis: Raises and protrudes the lower lip.
Orbicularis oris: Closes the lips, compresses the lips against the teeth, and protrudes the lips.
Platysma: Pulls the corner of the mouth down and back.
Risorius: Pulls the corner of the mouth back.
Zygomaticus major: Draws the corner of the mouth laterally and upward.
Zygomaticus minor: Draws the outer part of the upper lip upward, laterally and outward.

Table 3.1: Muscles that directly affect the lips and their actions.

3.3.1 The Mandible

The lips are tightly connected to muscle, with some muscles attached to the mandible, such as the depressor labii inferioris, so when the mandible moves so do

the lips. In order to properly control lip deformations, the movement of the mandible

must also be considered.

The temporomandibular joint is a diarthrodial ginglymous [NS62] (sliding hinge)

joint that allows the mandible a large scope of movements. The mandible acts mostly

like a hinge, with two separate joints acting together, and each of the joints having a

compound articulation. The first joint is between the condyle and the interarticular

fibro-cartilage while the second is between the fibro-cartilage and the glenoid fossa.

The condyle of the mandible is nested in the mandibular (or glenoid) fossa of the

temporal bone as seen in Figure 3.1. The upper part of the joint enables a sliding

movement of the condyle and the articular disk, moving together against the articular

eminence. In the lower part of the joint, the head of the condyle rotates beneath the

under-surface of the articular disk in a hinge action between the disk and the condyle.

The mandible can be thought of as a 3 DOF joint, with the jaw moving in (re-

traction) and out (protraction), side-to-side (lateral movements), and opening and

closing. The mandible lowers in a hinge-like manner except in extensive openings

when the movement is downward and forward. During protraction and retraction,

the condyles slide on the articular eminences and the teeth remain in gliding contact.

Lateral movement is achieved by fixing one condyle and drawing the other condyle

forward.


Figure 3.1: The temporomandibular joint a) shown on our 3D geometry of the skull and b) a cross-sectional drawing of the joint showing the bony parts of the joint as corresponding to a). The mandible is shaded differently from the skull and the condyle on the right side of the skull can also be seen.

3.4 Lip Parameterization

Parameterizing the motion of the lips allows us to reduce the number of degrees of freedom of the system. The goal is to minimize the number of degrees of freedom while still providing flexibility and generality. Besides a minimal set, we need a parameterization for the lip motion that is intuitive to use; easily defined and modified for different mouths; and supports speech synchronization and the wide range of other lip motions needed for facial animation.

The lips deform due to the contraction of the connected muscles and the movement of the mandible. We use the muscles that affect the lips as the basis for our param- eterization resulting in anatomy-based deformations. Movement of the mandible is followed closely by movement of the lower lips causing the upper lips to also change

shape. Therefore, the parameterization must also include mandible movements. The muscles are a good choice because their action is mostly along a vector, so their effect on the lips can be easily defined. This works for all muscles but the orbicularis oris, which actually constricts and protrudes the lips. We have a parameter for each muscle that inserts into the lips. For muscles that are paired we have a separate parameter for the left and right counterparts. Exceptions are made for the depressor inferioris, depressor oris and mentalis since individual control is rare. The incisive inferior and incisive superior are also treated as a single muscle. Some texts consider these part of the orbicularis oris, but the action of pulling the lips in tight against the teeth

is needed as a separate degree of freedom from that of the orbicularis oris. Lastly, we treat the levator labii superioris and the zygomaticus minor as a single muscle

since the zygomaticus minor is usually not well developed and their actions are very

similar.

The 21 parameters we use for our lip model along with their definitions are:

• Open Jaw - The mandible rotating open which lowers the lower lip and the

corners of the mouth causing the lower lip to rotate out slightly.

• Jaw In - Movement of the mandible anteriorly and posteriorly causing the lower lip to move inward or outward.

• Jaw Side - Lateral movement of the jaw causing the lower lip to skew to one

side or the other.

• Orbicularis Oris - Contraction of the orbicularis oris muscle causing the lips

to pucker and protrude.

• Left Risorius - Contraction of the left risorius muscle pulling the left corner

of the mouth back.

• Right Risorius - Contraction of the right risorius muscle pulling the right

corner of the mouth back.

• Left Platysma - Contraction of the left platysma muscle pulling the left corner

of the mouth down and back.

• Right Platysma - Contraction of the right platysma muscle pulling the right

corner of the mouth down and back.

• Left Zygomaticus - Contraction of the left zygomaticus muscle pulling the

left corner of the mouth up and back.

• Right Zygomaticus - Contraction of the right zygomaticus muscle pulling the

right corner of the mouth up and back.

• Left Levator Superior - Contraction of the left levator labii superioris mus-

cle raising the left part of the upper lip.

• Right Levator Superior - Contraction of the right levator labii superioris

muscle raising the right part of the upper lip.

• Left Levator Nasi - Contraction of the left levator labii superioris alaeque

nasi muscle raising the left part of the upper lip as well as the wing of the left

nostril.

• Right Levator Nasi - Contraction of the right levator labii superioris alaeque

nasi muscle raising the right part of the upper lip as well as the wing of the

right nostril.

• Depressor Inferioris - Contraction of both depressor labii inferioris muscles de-

pressing the lower lip.

• Depressor Oris - Contraction of both depressor anguli oris muscles drawing

the corners of the mouth downward and medial-ward.

• Mentalis - Contraction of both mentalis muscles raising and protruding the

lower lip, a paired muscle but treated as a single muscle.

• Left Buccinator - Contraction of the left buccinator muscle retracting the left

corner of mouth and keeping cheeks taut against teeth.

• Right Buccinator - Contraction of the right buccinator muscle retracting the

right corner of mouth and keeping cheeks taut against teeth.

• Incisive Superior - Contraction of both upper incisive muscles pulling the

upper lip in towards teeth.

• Incisive Inferior - Contraction of both lower incisive muscles pulling the lower

lip in towards teeth.

An added benefit of using a muscle-based parameterization is that the muscles also affect other parts of the face and the parameters can be used to deform these other parts as well. Examples are nose wrinkling, platysma affecting the neck, mentalis affecting the chin, the zygomaticus affecting the lower eyelid, and so forth. As well, when the muscles contract they bulge, which affects the surface of the face.

Figure 3.2: The B-spline control mesh and the surface used for the geometry of our lip model.

Higher-level parameterizations that use the basic parameters can be built to allow for easier use by an animator or director, such as smile, frown, pout, etc.

3.5 Implementation

We represent the lips as a B-spline surface with a 16x9 control grid. The parameters itemized above are mapped to changes in the positions of the control grid vertices. The geometry contains all of the vermillion zone (the red area of the lips) as well as the part of the mucous membrane that covers the lips internally. The geometry also contains a little extra of the mucous membrane to avoid observing an edge when looking at the lips from the outside. Figure 3.2 shows the control points of the lip model along with a polygonalization of the B-spline surface.

All of the muscles except the orbicularis oris are treated as vector displacements acting upon their insertion points. The orbicularis oris constricts the shape of the lips

into an oval while also extruding them. The three jaw parameters articulate a virtual mandible, and the resulting transform is used to move the lower lip. The lower lip is rotated outward with the opening of the

jaw as well.

For each control point, we calculate its position based on the parameters by the

following:

$$ p_i = O_i(\hat{p}_i + L_i + J_i + A_i) $$

where p̂i is the starting value for control point i, Li is the contribution of the linear

muscles, Ji is the contribution of the jaw, Oi is the contribution of the orbicularis oris

and Ai is the adjustment due to the interconnection of the control points.

The contribution from the linear muscles involves summing the displacements

from all of the individual muscles and is calculated by

$$ L_i = \sum_{j=0}^{m} \rho_j M_j \delta_{ij} $$

where ρj is the parameter value for muscle j, Mj is the maximum displacement for

muscle j, and δij is the influence of muscle j on control point i. δij = 0 when the

muscle has no influence, δij = 1 when the muscle is inserted very near the control

point, and 0 < δij < 1 is used to create a zone of influence for the muscle. This is

used for the upper lip only, as the lower lip moves more as a unit. δ comes into play

particularly in the middle of the upper lip with low values for a stiff upper lip, as in

most males, and high values to show more gum for a feminine appearance.
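A minimal sketch of the linear-muscle sum above, assuming a straightforward table-driven layout (the Muscle struct and container choices are illustrative, not the actual implementation):

```cpp
// Sketch of the linear-muscle sum L_i = sum_j rho_j * M_j * delta_ij.
// Muscle, rho, and the container layout are illustrative assumptions.
#include <vector>

struct Vec3 { double x = 0, y = 0, z = 0; };

struct Muscle {
    Vec3 M;                       // maximum displacement for this muscle
    std::vector<double> delta;    // delta[i]: influence on control point i
};

// rho[j] is the current parameter value for muscle j.
Vec3 linearContribution(int i, const std::vector<Muscle>& muscles,
                        const std::vector<double>& rho) {
    Vec3 L;
    for (size_t j = 0; j < muscles.size(); ++j) {
        double w = rho[j] * muscles[j].delta[i];
        L.x += w * muscles[j].M.x;
        L.y += w * muscles[j].M.y;
        L.z += w * muscles[j].M.z;
    }
    return L;
}
```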

When the jaw moves it also moves the lower lip due to direct muscle attachment.

The effect of jaw movement on the lips is calculated by

$$ J_i = J_{open} + J_{in} + J_{side} $$

where Jopen is the rotation about the axis through the condyles, Jin is the movement of

jaw in or out, and Jside is the lateral movement of the jaw.

The control points represent locations on the lips. The lips are made of muscle

fibers that can stretch slightly but will maintain a mostly constant circumference. To keep the lip model from violating this property, the control points are adjusted after the muscle forces are applied using

$$ A_i = L_D \alpha_i + \rho_{open} \gamma_i $$

where LD is the motion vector for the lower lip, αi is how much the lower lip affects the upper lip, ρopen is the parameter value for the jaw being open and γi is the effect of tightening the lips. The lower lip moves mostly uniformly and individuals rarely have control over it. LD is the lower delta and represents the movement of the lower lip.

As the lower lip moves, it pulls on the corners of the mouth and therefore the upper lips. The α weights take care of this effect. As the mouth opens, the lips stretch

and tighten. As they tighten, they move medially toward the mouth center. The γ

weights allow for this medial motion.

The orbicularis oris muscle constricts and protrudes the lips as it contracts. This

effect is handled after all the other displacements are taken into account. It is done

this way to make combining the muscle displacements less complex. The linear dis-

placements are additive with maximum values for the displacements. However, the

orbicularis oris causes complex motion and does not simply add to the other displace-

ments. The contribution of the orbicularis oris is calculated as

$$ O_i(p) = R(\rho_{oris}\theta) + \rho_{oris}[e_i(p) + \chi_i] $$

where ρoris is the parameter value for the orbicularis oris, θ is the maximum angle of rotation from puckering the lips, R(θ) is the rotation due to contraction of the orbicularis oris, ei(p) keeps the point p on the ellipse created by the lips, and χi is the maximum extrusion from contraction of the orbicularis oris.
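Putting the pieces together, the sketch below shows one plausible reading of the full update for a single control point, assuming the linear, jaw, and adjustment contributions have already been computed. The specific treatment of the orbicularis oris term (blending toward a per-point ellipse target, adding the extrusion, then rotating about a pucker axis) and every name in the code are assumptions made for illustration, not the formulas' definitive implementation.

```cpp
// Sketch only: one plausible realization of p_i = O_i(p^_i + L_i + J_i + A_i).
#include <cmath>

struct Vec3 {
    double x = 0, y = 0, z = 0;
    Vec3 operator+(const Vec3& o) const { return {x + o.x, y + o.y, z + o.z}; }
};

// rest          : p^_i, the control point's neutral position
// L, J, A       : precomputed linear-muscle, jaw, and adjustment contributions
// ellipseTarget : assumed target that e_i would pull p toward on the lip ellipse
// chi           : chi_i, the maximum extrusion for this control point
// rhoOris       : orbicularis oris parameter value; theta: maximum pucker angle
Vec3 updateControlPoint(const Vec3& rest, const Vec3& L, const Vec3& J, const Vec3& A,
                        const Vec3& ellipseTarget, const Vec3& chi,
                        double rhoOris, double theta) {
    Vec3 p = rest + L + J + A;
    // Constrict toward the lip ellipse and extrude, scaled by the parameter value.
    Vec3 q = { p.x + rhoOris * ((ellipseTarget.x - p.x) + chi.x),
               p.y + rhoOris * ((ellipseTarget.y - p.y) + chi.y),
               p.z + rhoOris * ((ellipseTarget.z - p.z) + chi.z) };
    // Rotate by rhoOris * theta about the z axis as a stand-in for R(rho * theta).
    double c = std::cos(rhoOris * theta), s = std::sin(rhoOris * theta);
    return { c * q.x - s * q.y, s * q.x + c * q.y, q.z };
}
```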

The weights and muscle displacement vectors are data to the lip model allowing the behavior of the lips to be changed by simply changing data files. Besides the different geometry, a different character will potentially have a different datafile for the lip model behavior. It may also be desirable to change the lip behavior for the same character such as modeling a muscular dysfunction on one side of the face.

An option would have been to calculate the forces of each muscle and using a

Newtonian physics model, numerically solve the ODEs to find the new locations

of the control points. This would have allowed us to constrain the lip shape using

springs, but we would have had to solve the ODEs. We wanted a closed-form solution that would avoid the rubbery look of spring-based systems. Moreover, using implicit

integration would add unnecessary complexity.

3.5.1 Grafting

Grafting of the lip model geometry onto the input face geometry is done interac-

tively. First, an interactive tool is used to align the lip model with the input geometry.

The tool allows the lip model to be shaped like the input lips in a neutral position.

Figure 3.3b shows the lips fitted to the input geometry.

A convex outline surrounding the lips of the input geometry is created. After

applying a cylindrical projection, all vertices, and thus all triangles, inside the convex

hull are removed, thereby removing the input lips, as shown in Figure 3.3c.

The fitted lip model is polygonalized and the outline of the lip model is triangulated along with the remaining input geometry using a Delaunay triangulation method [She96b, She96a] as shown in Figure 3.3d.

The new triangles are added, along with the lip model, to the input facial geometry, effectively replacing the input lips with the lip model geometry. This can be seen in

Figure 3.3e and Figure 3.3f.
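A sketch of the removal step under stated assumptions (a vertical projection axis and a standard even-odd point-in-polygon test; the actual tool's projection and data structures may differ):

```cpp
// Sketch: after a cylindrical projection, vertices of the input face that fall
// inside the convex outline around the lips are marked for removal.
// The projection axis and all names are illustrative assumptions.
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };
struct Vec2 { double u, v; };

// Cylindrical projection about the vertical (y) axis through 'center'.
Vec2 cylindricalProject(const Vec3& p, const Vec3& center) {
    double dx = p.x - center.x, dz = p.z - center.z;
    return { std::atan2(dx, dz), p.y };    // (angle, height)
}

// Standard even-odd point-in-polygon test in the projected plane.
bool insidePolygon(const Vec2& q, const std::vector<Vec2>& poly) {
    bool inside = false;
    for (size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        bool crosses = (poly[i].v > q.v) != (poly[j].v > q.v);
        if (crosses) {
            double u = poly[j].u + (q.v - poly[j].v) * (poly[i].u - poly[j].u)
                                   / (poly[i].v - poly[j].v);
            if (q.u < u) inside = !inside;
        }
    }
    return inside;
}

// Mark the vertices (and hence their triangles) to be removed.
std::vector<bool> markLipRegion(const std::vector<Vec3>& verts,
                                const Vec3& center,
                                const std::vector<Vec2>& convexOutline) {
    std::vector<bool> remove(verts.size());
    for (size_t i = 0; i < verts.size(); ++i)
        remove[i] = insidePolygon(cylindricalProject(verts[i], center), convexOutline);
    return remove;
}
```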

3.6 Rendering

In order to create realism, the rendering of the lips is important. A common

method to improve realism is the use of texture maps. The same problems associated

with gathering the geometry of the lips also exist for gathering color information.

Incomplete texture information will leave visible artifacts. We could use methods

to warp what texture information is obtained, but there is no clear-cut way to do

this. This would also exacerbate the problems associated with texture maps, such as

limited resolution and lighting inherent in texture acquisition.

We instead choose a different approach. We use a procedural texture shader to

increase realism. Besides color information, we can also add surface detail with a

bump shader. Figure 3.4 shows examples of wrinkled lips, both dry and wet. This

method only works for off-line generation of animations since it is too slow for our

real-time version, where we choose a single color for the lips.


Figure 3.3: The grafting process. The input geometry (a) is used to fit the lip model (b). Overlapping triangles are then removed (c). Then the boundary of the lip model and the boundary of the removed triangles are retriangulated (d) and the new triangles are added (e). Finally the lip model geometry is added to the input geometry (f).

Figure 3.4: Rendering of the lip model using our custom shaders. The left image is of wrinkled dry lips, and the right of wrinkled wet lips.

Lips are covered with very thin skin that tends to wrinkle easily. Besides the constant fine to medium wrinkles, when the lips are compressed (as in a pucker) there are large undulations of the surface. We currently ignore the finer wrinkles

and instead concentrate on the larger wave-like wrinkles created during compression.

Wrinkles are implemented as a bump shader.

Another shader determines the color of the lips. We can simulate natural lip colors as well as lipstick and lipgloss. When the lips are licked, differing depths of saliva

are deposited across the lips. We model this effect by creating a second layer, using

a noise function, which represents the wetness pattern. This pattern is then mixed

with the current lip color to increase the specular component. Lipstick and lipgloss

are implemented as a uniform color change across the lips with transparency and

glossiness components controlling matte versus glossy. Flecked lipstick is modeled by

adding a flecked silver pattern to the lipstick color.
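The sketch below illustrates the wetness idea in C++ rather than in a shading language: a noise value modulates how strongly a point on the lip is treated as wet, raising its specular component and tightening its highlight. The noise function, the Shading fields, and all constants are assumptions for illustration, not the shader used here.

```cpp
// Sketch of the wetness-layer idea; not the actual procedural lip shader.
#include <algorithm>
#include <cmath>

struct Color { double r, g, b; };
struct Shading { Color diffuse; double specular, roughness; };

// Cheap value-noise stand-in (a real shader would use a band-limited noise).
double noise3(double x, double y, double z) {
    double s = std::sin(x * 12.9898 + y * 78.233 + z * 37.719) * 43758.5453;
    return 2.0 * (s - std::floor(s)) - 1.0;        // roughly in [-1, 1)
}

Shading shadeLipPoint(const Color& lipColor, double x, double y, double z,
                      double wetness /* 0 = dry, 1 = freshly licked */) {
    // Wetness pattern: noise scaled by the overall wetness amount.
    double n = 0.5 * (noise3(x * 8, y * 8, z * 8) + 1.0);
    double wet = std::clamp(n * wetness, 0.0, 1.0);

    Shading s;
    s.diffuse   = lipColor;                        // color is unchanged by saliva
    s.specular  = 0.05 + 0.6 * wet;                // wet areas are far more specular
    s.roughness = 0.4 - 0.3 * wet;                 // and have tighter highlights
    return s;
}
```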

3.7 Results

We have successfully incorporated this lip model into the facial model used by our

TTAVS system. Our TTAVS system creates animations from text by generating a stream

of visemes, or keyframes, that are interpolated. Figure 3.5 shows our lip model grafted

onto an input geometry and displayed in our real-time system. Flat shading is used

to more easily see the graft.

Figure 3.6 is made up of frames from an off-line rendering using our rendering process for the lips. With our rendering technique, we can achieve wrinkled and wet lips for increased realism. Figure 3.7 depicts frames from an off-line rendering and demonstrates motion blur of the lips, which can move extremely fast during speech.

Figure 3.5: The lip model grafted onto face geometry. This is from our real-time TTAVS system using OpenInventor to render the frames.

Figure 3.6: Frames from an animation rendered using our lip shaders.

Figure 3.8 shows closeups of the mouth area from our TTAVS system. Figure 3.8a is the viseme /aw/, while Figure 3.8b is the viseme /aw/ with the zygomaticus major muscle also activated, creating a happy /aw/. Figure 3.8c is a half smile, created by activating only the right zygomaticus major.

Figure 3.7: Frames from an animation rendered using our lip shaders.


Figure 3.8: Our lip model in the position of /aw/, a happy /aw/ and a half smile.

3.8 Summary

Our anatomically based lip model improves our ability to create realistic speech-

synchronized facial animation with more realistic deformations of the lips. Because it

is muscle-based, the effects of contraction of the muscles that affect the lips on other

parts of the face are more easily calculated. Our lip model has both internal and

external lip geometry, and by replacing the input lip geometry with the lip model’s

geometry, we are guaranteed to have the internal geometry. The internal geometry is often missing, especially when the input geometry is acquired via a laser scan of the

subject. This internal geometry is important to have when the mouth opens to avoid

loss of realism. With our generic lip model that is fitted to the subject, we also do

not need to redefine the insertion of the muscles for each new subject.

Our current focus has been on the motion of the lips due to muscle contractions,

however, we also need to consider deformations due to collisions between the lips

and other parts of the face. The lips must flow around the teeth and not penetrate them. Furthermore, when the tongue presses against the lips for creating sounds or

when wetting them, a slight deformation is needed to improve realism. Finally, when

the upper and lower lips come into contact there are subtle changes that need to be

shown. However, these deformations can be done without collision detection between

the lips. In addition, the spatial relationship between the upper and lower lips makes

interpenetration hard to notice.

Our lip model does not have a concept of state, that is to say, the lips do not know

what came before. Therefore, certain positions cannot be distinguished without fur-

ther information. For example, to rotate the lower lip outward into a pout, the lower

lip is pushed upward toward the upper lip, which is tensed, causing the lower lip to

slide over the upper lip and outward. However, if the upper lip is not tensed it will

be pushed upward by the lower lip. These two distinctly different positions can have

the same parameter values and cannot be distinguished without the concept of state.

We want the parameter set to map to a single shape so we do not use state. Adding

new parameters can substitute for state, but requires a parameter to handle each of

the special cases. For speech this does not come into play. As future work, the best

of the two options needs to be determined.

Our rendering technique gives us improved realism by allowing control of the

surface detail as well as lighting, particularly highlights due to wetness of the lips.

Unfortunately, lip wetness is currently applied to the entire lip as we do not have a way to specify to the shader how far the tongue has moved across the lip, how high it has gone on the lip as it moved, and in what direction it is moving. Changing this parameter to a function which tracks the aforementioned state changes and leaves a trail of saliva behind would allow better control. We would also like to be able to represent chapped lips as well as finer wrinkles. A better model of the mucous membrane is also desirable. This section of the lip is constantly moist and has significantly fewer wrinkles than the external part. The rima oris (point of contact of the lips during closure) in our model is simply a line in texture space, but in reality should be a curve over which there is a smooth transition between the two sections.

CHAPTER 4

THE TONGUE MODEL

4.1 Introduction

For nearly thirty years, computer graphics researchers [Par72, SMP81, Wat87]

have been working on synthetic talking heads. Computer generated talking heads

are useful for human-computer interfaces, entertainment, training, automated kiosks,

and many other applications. During this period, the tongue has received very little

attention. A common justification for this oversight is that the tongue has limited vis-

ibility, only plays a role in distinguishing a few speech segments and it is an extremely

complex organ. However, our experience during development of a text-to-audiovisual-

speech (TTAVS) system [KP00] has indicated that the tongue is very important and

should be modeled.

Initially, we implemented a simple tongue model and the addition of this rigid box-

like tongue increased the intelligibility of the speech, but the unnatural appearance of

the tongue was quite noticeable. To achieve increased intelligibility and to have nat-

ural looking animation, a physically realistic tongue model was needed. We required

a tongue model that could represent the myriad different shapes of the tongue dur-

ing speech, was computationally efficient, and had a parameterization with intuitive

control. To that end, we developed a geometric model of the tongue, discussed in

Section 4.4.1, capable of representing a highly deformable structure. We also devised

a parameterization for the tongue, presented in Section 4.4.2, that requires only six

parameters to describe the tongue shape and can be used with any tongue model.

We implement a tongue model, discussed in Section 4.4.3, that allows realistic ani-

mation of the tongue and uses our parameterization. For improved visual realism of

the tongue, surface and lighting detail are added with a procedural shader, which is described in Section 4.5.

4.2 Previous Work

Several researchers recognized the need to represent the tongue in a facial animation system. These systems have used a range of methods to simulate the tongue

including a simple geometric tongue with rigid motion, a human sculpted tongue in

keyframe positions, finite element modeling, and a highly complex model using soft

objects.

Some of the first uses of a tongue in computer facial animation were in the creation

of animated shorts. Reeves [Ree90] describes the use of a teardrop shaped collection

of 12 bi-cubic patches to model the tongue in “Tin Toy”, Pixar’s Academy Award

winning animated short film. Although the tongue was modeled, it was usually left

in the back of the mouth. Klieser [Kle89] sculpts a face in phonemic positions and

then interpolates between them, paying particular attention to the tongue, as well as

the eyes, teeth, eyebrows and eyelashes.

In their work with the deaf and speech reading, Cohen and Massaro [CM93] modify

the Parke model [Par74] to work with phoneme input and add a simple tongue model

to increase intelligibility of the generated speech. The tongue model they use is quite

simple and animates very stiffly with parameters for tongue length, angle, width, and thickness. The limitations of a simple, rigid tongue are quite apparent during

excited conversation when the mouth is opened wide and during non speech-related animation such as licking the lips or sticking out the tongue.

In studying the tongue as part of the vocal tract, Stone [Sto91] describes problems with a 3D model of the tongue and proposes a preliminary model. The model divides the tongue into five lengthwise segments: anterior, middle, dorsal, posterior, and root; and five crosswise segments: medial, left lateral1, left lateral2, right lateral1 and right lateral2. This gives the model 25 functional segments to be able to reproduce 3D tongue shapes.

Later, Stone and Lundberg [SL96] reconstruct 3D tongue surfaces during the production of English consonants and vowels using a developmental 3D ultrasound device.

Electropalatography (EPG) data was collected providing tongue-palate contact patterns. Comparing the tongue shapes between the consonants and vowels reveals that only four classes of tongue shapes are needed to classify all the sounds measured. The classes are front raising, complete groove, back raising, and two-point displacement.

The first three classes contain both vowels and consonants while the last consists of only consonants. The EPG patterns indicate three types of tongue-palate contact: bilateral, cross-sectional, and a combination of the two. Vowels use only the first pat-

tern and consonants use all three. There was an observable distinction between the vowels and consonants in the EPG data, but not in the surface data. Stone concluded that the tongue actually has a limited repertoire of shapes and positions them against

the palate in different ways for consonants versus vowels to create narrow channels and divert airflow for the production of sounds.

Maeda [Mae90] creates an articulatory model of the vocal-tract that uses four parameters for the tongue by studying 1000 cineradiographic frames of spoken French.

The four parameters are jaw position, tongue-dorsal position, tongue-dorsal shape and tongue-tip position. The focus of his research is gross vocal-tract shape and is not directly applicable to generating 3D shapes.

Research by Wilhelms-Tricarico [WT95] on a physiologically based model of speech production led to the creation of a finite element model of the tongue with 22 elements and 8 muscles. This initial model does not completely simulate the tongue but shows the feasibility of the methods. In later research, Wilhelms-Tricarico and Perkell

[WTP97] create a way to control the model. This model is still under development, is quite computationally expensive, and is difficult to control, limiting its usefulness for facial animation.

Pelachaud et al. [PvOS94] develop a method of modeling and animating the tongue using soft objects, taking into account volume preservation and penetration issues. This is the first work on a highly deformable tongue model for the purpose

of computer animation of speech. Based on the research of Stone [Sto91], Pelachaud

creates a tongue skeleton composed of nine triangles. The deformation of the skeletal

triangles is based on 18 parameters: nine lengths, six angles, and a starting point in

3-space.

During deformation, the approximate areas of the triangles are preserved and

collision with the upper palate is detected and avoided. The skeletal triangles are

considered charge centers of a potential field with an equi-potential surface created

by triangulating the field using local curvature. Their method allows for a tongue that can approximate tongue shapes during speech, as well as effects on the tongue due to contact with hard surfaces. Drawbacks of this method include the computationally expensive rendering and processing plus the large number of parameters.

4.3 Tongue Anatomy

The tongue consists of symmetric halves separated from each other in the midline by a fibrous septum. Each half is composed of muscle fibers arranged in various directions with interposed fat and is supplied with vessels and nerves. The complex arrangement and direction of the muscular fibers gives the tongue the ability to assume the various shapes necessary for the enunciation of speech [Gra77].

The muscles of the tongue are broken into two types: intrinsic and extrinsic. The extrinsic muscles have external origins and place the tongue within the oral cavity.

The intrinsic muscles are contained entirely within the tongue with the purpose of shaping the tongue itself. The extrinsic muscles are the styloglossus, hyoglossus, genioglossus, palatoglossus, and chondroglossus; the intrinsic muscles are the superior lingualis, inferior lingualis, transverse lingualis and vertical lingualis. The superior lingualis is the only unpaired muscle of the tongue [Gra77, BM84].

Figure 4.1a shows a mid-sagittal view of the tongue taken from the Visible Human

Project [NLH99] data. The image shows the complexity and large size of the tongue.

Typically, only the top and the tip of the tongue are visible, giving the false impression of a thin pancake-like organ. There are also vast differences in tongue shapes between individuals as can be seen from Figure 4.1c, which is a similar view of the Visible

Human Female.


Figure 4.1: Tongue images from the Visible Human Project. a) A mid-sagittal view of the Visible Human Male. b) A coronal cross-section of the Visible Human Male. c) A mid-sagittal view of the Visible Human Female. d) A coronal cross-section of the Visible Human Female. The tongue has been highlighted in these images to better distinguish the tissue of the tongue.

Stone and Lundberg [SL96] observe that the tongue is a complex system with a

fixed volume and made entirely of muscle, and its shape is systematically related to its position since the tongue volume can be redistributed but not changed.

The shape of the tongue is also influenced by contact with hard surfaces such as the hard palate and the teeth [SL96] as evidenced in Figure 4.1. Figure 4.1b depicts a coronal cross-section of the Visible Human Male while Figure 4.1d is a coronal view of the Visible Human Female. Notice that the male has a higher hard palate, allowing the tongue to expand superiorly. The male’s teeth on the left compress the tongue, while on the right side without teeth the tongue is allowed to expand. Also note in the sagittal views, Figures 4.1a and 4.1c, that without incisors the female tongue extends beyond the mandible.

4.4 Tongue Model

Our parameterized tongue model consists of geometry that defines the surface of the tongue, a parameterization that defines a space of possible tongue shapes, and a mapping that deforms the geometry based on the values of the parameters. Our tongue model is used to improve the intelligibility of the speech and to increase the realism of facial animation. Our application is a real-time text-to-audiovisual-speech system with requirements, in decreasing order of importance, of being:

• fast to render and deform,

• realistic in shape,

• capable of any tongue shape needed for speech,

• capable of a wide array of tongue shapes needed for general facial animation,

• able to support collision detection between the tongue and the rest of the oral cavity,

• and able to preserve volume during movement and collision.

The first two requirements are addressed by defining a B-spline surface. To meet the next two requirements, we develop a tongue parameterization that we discuss in Section 4.4.2 below. We use this parameterization to abstract the specification of the tongue shape away from the actual geometry to make it easier to specify a particular tongue shape. Unfortunately, in order to meet real-time speeds, the

final two requirements could not be met in our model due to their computational requirements.

A realistic tongue model should be capable of approximating the shape of any tongue in any natural position. For applications that need a faithful representation of the tongue, this is extremely important. For facial animation, that restriction can be relaxed somewhat. It is only important for those parts of the tongue that are clearly visible to be accurate. This includes the tip, the top from the front to the middle and the front portion of the underside. Many of the complex interactions between the tongue and hard surfaces are mostly invisible and can be ignored. Those that are visible are generally short-lived, allowing approximation methods to be used.

The implementation we describe in Section 4.4.3 is only concerned with the first four requirements, and the final two are subjects of future research.

4.4.1 Geometry

B-spline patches result in a smooth surface that can be quickly deformed by displacing the control points. We created a B-spline surface capable of producing realistic tongue shapes. It is composed of an 8 x 13 grid of bi-cubic patches over 60 control

Figure 4.2: The tongue geometry and its control grid.

points. The control points, Pi,j, are arranged in a 6 x 10 grid, shown in Figure 4.2, with 6 control points along each column i representing the u parametric direction and located sagittally along the tongue from front to back. The rows j represent the v parametric direction, with 5 rows for the top of the tongue and 5 rows for the underside. Conceptually, the grid forms a rectangular parallelepiped with a cap at one end. Defining the bi-cubic patches over this grid creates two extraordinary points (with order three) at the tip, causing an acceptable loss of C² continuity.

This B-spline surface is capable of approximating any non-intersecting tongue

shape except for a pronounced medial groove with steep walls. With the addition of

another 20 control points and using an 8 x 17 grid however, the groove could also be

approximated. A more complex grid, or a different choice of surface representation

such as triangular or hierarchical [FB88] splines, can reduce the number of patches

and control points. For applications that need more realism and a larger space of

possible shapes, post-processing the triangulation after collision detection is also a

possibility.

4.4.2 Tongue Parameterization

The objective of parameterizing a model is to reduce the degrees of freedom to just the most significant while maintaining an adequate level of control. The 60

control points in our model represent 180 degrees of freedom, and specifying each one

separately is an onerous task. Instead, we would like a parameterization with just a

handful of parameters. This parameterization should also be natural to specify. For

example, one such parameterization would be to specify the forces of each individual

muscle, producing 10 to 17 parameters (depending on which muscles are used and which muscles are assumed symmetric), meeting the first criterion; however, specifying forces or even relative muscle strengths is non-intuitive.

As noted by Stone and Lundberg [SL96], and from our own empirical research, the shape is systematically related to tongue position since the tongue volume can only be redistributed and not changed. This led us to believe just a small number of parameters would be needed to specify the tongue shape.

Using the four classes of tongue shapes identified by Stone and Lundberg [SL96] as a guide to creating the parameterization, we determined that tongue tip location (front raising), tongue dorsal height (back raising), and tongue lateral location

(groove) are sufficient to deform the model into any shape necessary for the visual

representation of speech. Our parameterization is not limited to our model and can

be used with other geometric models. We define the following parameters for the

shape of the tongue model:

p1 - tongue tip X

p2 - tongue tip Y

p3 - tongue tip Z

p4 - tongue dorsal Y

p5 - tongue lateral X

p6 - tongue lateral Y

We assume symmetry between the two lateral sides of the tongue, which is ade-

quate for most tongue positions, but does not allow the tongue to twist. Twisting the

tongue is not a concern for our present system, but symmetric twisting can be added

with another parameter for twist. Moreover, if non-symmetric twisting is desired, two more parameters are needed so that each lateral wing is defined separately.

4.4.3 Implementation

The above parameterization can be implemented in various ways. A sophisticated method would take into account volume preservation and deformation due to collision with the hard surfaces of the oral cavity. In a real-time facial animation system this sophistication is too costly to implement and is not strictly required. We use a simple and fast method that approximates volume preservation and gives good results for

our real-time system.

The new control points $P'_{i,j}$ are calculated as

$$ P'_{i,j} = P_{i,j} + \sum_{k=1}^{6} \omega_{k,j}\, p_k\, \alpha_{k,i} $$

where ωk,j is the weight for row j of parameter k and αk,i is the weight for column i

of parameter k. There are separate weights for x, y and z. For example, we use the

following row weights in the x direction:

$$ \omega_x = \begin{pmatrix} 1 & .9 & .6 & .3 & 0 & 0 \\ 1 & .5 & .1 & 0 & 0 & 0 \\ 1 & .9 & .9 & .2 & .2 & 0 \\ 0 & .1 & .3 & .9 & 1 & .2 \\ .5 & .9 & 1 & 1 & 1 & 1 \\ .1 & .7 & .9 & 1 & 1 & 1 \end{pmatrix} $$

Reducing the tongue width when the tip is extended, and expanding the tongue

during retraction simulates volume preservation. This crude approximation gives

reasonable results with no performance penalty.
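A minimal sketch of the control-point update above, assuming the per-axis weight tables are stored as simple nested arrays (the AxisWeights struct and names are illustrative, not the actual data layout):

```cpp
// Sketch of P'_{i,j} = P_{i,j} + sum_k omega[k][j] * p[k] * alpha[k][i],
// applied per axis with separate weight tables for x, y and z.
#include <array>
#include <vector>

struct Vec3 { double x = 0, y = 0, z = 0; };

constexpr int kParams = 6;   // p1..p6: tip XYZ, dorsal Y, lateral XY

struct AxisWeights {
    std::vector<std::vector<double>> omega;  // omega[k][j]: row weight
    std::vector<std::vector<double>> alpha;  // alpha[k][i]: column weight
};

Vec3 deformedPoint(const Vec3& rest, int i, int j,
                   const std::array<double, kParams>& p,
                   const AxisWeights& wx, const AxisWeights& wy, const AxisWeights& wz) {
    Vec3 out = rest;
    for (int k = 0; k < kParams; ++k) {
        out.x += wx.omega[k][j] * p[k] * wx.alpha[k][i];
        out.y += wy.omega[k][j] * p[k] * wy.alpha[k][i];
        out.z += wz.omega[k][j] * p[k] * wz.alpha[k][i];
    }
    return out;
}
```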

The above implementation is highly sensitive to the shape of the tongue model

in its rest position (all parameters 0). We model the Visible Human Female tongue in our images so they will not match up exactly with the tongue of the subject used

in the Stone research. However, as seen in Figure 4.3, it still approximates the gross

shape of the tongue and is adequate for our purposes. Figure 4.3 shows samples from the four categories of tongue shapes identified by Stone and Lundberg [SL96].

In Figure 4.3, for each phoneme, the left hand picture is the surface of the top of the tongue as measured with ultrasound [SL96], while the right hand side is our tongue model shaped for the same phoneme. The tongue tip is missing from Stone’s reconstruction due to air between the ultrasound device and the tip. For /n/ the tip of our tongue model is not as curved, because it is not pressed up against the roof of the mouth as it was when captured with ultrasound on the left.

[Figure 4.3 panels: /ae/ “hat”, /ih/ “hit”, /uu/ “hut”, /n/ “pan”]

Figure 4.3: Samples from the four categories of tongue shapes from Stone and Lundberg[SL96]. The left image is a surface reconstruction from ultrasound of the tongue shape during sustained production of the phoneme, while the right hand side is our tongue model shaped to represent that phoneme. The tongue tip is missing from Stone’s reconstruction because of air between the tip and the ultrasound device.

4.5 Rendering

In our real-time version, we render the tongue using Open Inventor [Wer94] and add a hand-crafted texture. This gives good results with acceptable computation.

For off-line rendering, we use displacement and surface procedural shaders to increase

realism. The tongue is usually under poor lighting conditions, far from the viewer

and partially obscured, reducing the priority of surface realism.

The displacement shader adds taste buds as bumps along the tongue’s upper

surface and adds veins to its underside. The displacement shader also allows for a grooved tongue. The surface shader has a user-defined color with the upper surface matte in appearance and the underside shiny. Tongue fuzz is displayed along the tops of the taste buds, as fuzz never forms along the actual surface of the tongue. The

surface shader also gives the veins of the underside their blue color.

Our parameterized tongue model is capable of representing the tongue shapes produced during human speech as well as many of the non-speech tongue shapes needed during facial animation. The tongue is needed to increase the intelligibility of the speech and to improve realism.

4.6 Results

Figures 4.4, 4.5, and 4.6 show the mouth in phoneme positions and illustrate how the tongue aids

in recognizing the sound. Figure 4.4 shows the tongue model included in our facial

model in the position for the viseme /d/. Figure 4.5 shows the tongue in the shape

of the phoneme /ay/ with the teeth, mandible and gums visible. Figure 4.6 shows

a side-by-side comparison of the phoneme /n/ with and without the tongue model.

Without a tongue, the mouth looks very unnatural, and by comparison, the image

Figure 4.4: Our facial model in the viseme position for the phoneme /d/ including the tongue model we describe here.

Figure 4.5: The tongue model placed with a model of the oral cavity with the tongue in the position of the phoneme /ay/.

Figure 4.6: A side-by-side comparison of our facial model in the phoneme position /n/. On the left our tongue model has been added to the facial model, while on the right there is no tongue. The visible artifacts in the mouth on the right side are the back sides of the triangles from the rear of the head.

Figure 4.7: Frames from an animation of our facial model licking its lips.

with the tongue looks more realistic. This image shows how adding our tongue model will improve realism. The tongue model can also be used to increase the repertoire of facial expressions, such as licking the lips, shown in Figure 4.7, and sticking out the tongue, shown in Figure 4.8. Increasing the space of possible facial expressions helps bridge the believability gap.

4.7 Summary

Our parameterization of the tongue is able to approximate possible tongue shapes during speech for computer facial animation. Our parameterization is also sufficient to approximate any tongue shape.

Figure 4.8: Frames from an animation of our facial model sticking out its tongue.

However, our current interpretation of the parameters, which trades accuracy for speed, is not very accurate. While our implementation gives adequate shapes for speech and most facial animation requirements, it is not completely realistic. A much more sophisticated model is needed to use the parameterization accurately and to full effect. Figure 4.3 shows how our shape for /n/ differs in the shape of the tip; however, the visible part of the tip (the inferior surface for this phoneme) is shaped correctly. More accurate shapes can be achieved by adding a parameter for raising the middle section (immediately posterior to the tip) or with a more sophisticated implementation.

Realistic tongue shapes require collision detection and deformation due to the collisions. The collision detection method we use for the skin, described in Section 2.2.1, prevents the tongue from penetrating the hard surfaces and gives some deformation, but the deformation is dependent on the number of triangles used to represent the

tongue. A more sophisticated method that would adaptively change the surface mesh is needed for complete realism. Existing methods are computationally expensive and cannot be used in a real-time system, which is our goal.

A computer tongue model must preserve volume to be physically accurate. Our method of approximating volume preservation works well for shape changes due to muscle contractions, but not for deformation due to collisions. For the most accurate shapes, collisions and volume preservation must be handled together. The shape of the tongue is changed not only by the tissues that make up the tongue itself, but also by contact with other surfaces. In order to accurately model the tongue surface, the entire tongue body must be considered, which is a very complex problem.

Collisions between the tongue and the oral cavity also affect the shape of the soft parts of the oral cavity, such as the cheeks and lips. For more realistic animation, these interactions need to be considered. The impact of the tongue on the oral cavity is external to the tongue model and must either be processed independently of the model, or the model must be modified to understand its environment.

CHAPTER 5

SPEECH-SYNCHRONIZED ANIMATION

To create convincing animated speech, the method of deforming the facial model over time is extremely important. A good animation method can overcome inadequacies in the facial model, but a good facial model will not look good if animated poorly. This chapter discusses how we animate our facial model and particularly how we animate the vocal tract articulators using a model of coarticulation.

5.1 Coarticulation

The speech production mechanism is a target-oriented system, with the parts of the vocal tract trying to reach specified targets at specified times [DJ77]. These targets occur close together in time, at a rate of around 10 Hz. Because the parts of the vocal tract are made of tissue, they have mass, finite accelerations and decelerations, and velocity constraints. As a result, speech is actually a continuous system, with no hard boundaries between sounds or words, contrary to what the written form suggests. We may insert occasional pauses, but there is little else to observe that marks discrete units. Our auditory system is trained to register one of the expected phonemes when a specific target is approximated.

This blending between target values is called coarticulation. The movement of the vocal tract between these target locations provides important cues for identification. This motion from one target to the next must therefore be realistic to achieve intelligible animated speech.

During the act of speaking, the motions toward or away from a particular target can be as important as the actual approximation of the target. This is the case in glides and diphthongs. Glides are made by motions from a specified target to that of

the following vowel. Diphthongs can be specified as two targets, with the transition

between them dominating the targets themselves. Transitions also play a crucial

role in the production of stop and nasal consonants. The absence of sound for these

consonants carries the information, while the transition to it cues the place of the

stop production [DJ77].

The listener tries to discern meaning from the utterance, segmenting it into meaningful word and phrase units; this is referred to as segmentation. Concurrently, the speaker attempts to convey the meaning, often with minimal effort. A speaker may adjust his speech to make distinguishing the targets easier, for instance by slowing down. Such an adjustment may be made in response to a perceived lack of understanding on the part of the listener. The goal of speech is to convey information with minimal articulatory effort, not to produce speech that sounds good.

5.1.1 Previous Work

Coarticulation is an important issue to consider in order to produce realistic,

intelligible, computer-animated speech. If the visual manifestation of coarticulation

is missing, important information for the listener will be absent, making the speech harder to understand as well as visually incorrect. For these reasons, many researchers have attempted to model coarticulation effects.

Pelachaud et al. [Pel91, PBS91, PVY93] handle coarticulation using a look-ahead

model [KM77] which considers articulatory adjustment on a sequence of consonants

followed or preceded by a vowel. They integrate geometric and temporal constraints

into the rules to make up for the incompleteness of the model. The algorithm works

in three steps: 1) apply a set of coarticulation rules to context-dependent clusters, 2)

consider relaxation and contraction time of muscles in an attempt to find influences

on neighboring phonemes, and 3) check the geometric relationship between successive

actions.

Cohen and Massaro [CM93] extend Parke’s model for better speech synchrony

by adding more parameters for the lips and also a simple tongue. They include

a good discussion of coarticulation research by the speech and hearing community.

They use the articulatory gesture model of Löfqvist [Löf90], which uses the idea of

dominance functions. Each segment of speech has a dominance function for each

articulator that increases then decreases over time during articulation. Adjacent

segments have overlapping dominance functions, which are blended. They use a

variant of the negative exponential function

\[
D_{sp} =
\begin{cases}
\alpha_{sp}\, e^{-\theta_{\leftarrow sp}\,|\tau|^{c}} & \text{if } \tau \ge 0\\
\alpha_{sp}\, e^{-\theta_{\rightarrow sp}\,|\tau|^{c}} & \text{if } \tau < 0
\end{cases}
\]
where $D_{sp}$ is the dominance of parameter $p$ for speech segment $s$, $\alpha_{sp}$ gives the magnitude of the dominance function, $\tau$ is the time distance from the segment center, $c$ controls the width of the dominance function, $\theta_{\leftarrow sp}$ is the rate on the anticipatory side, and $\theta_{\rightarrow sp}$ is the rate after the articulation.
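A minimal sketch of this dominance function is given below, with the rate constant chosen by the sign of $\tau$ as in the piecewise definition above; the function and argument names are illustrative, not part of the Cohen and Massaro implementation.

import math

def cm_dominance(tau, alpha, theta_ant, theta_post, c=1.0):
    """Dominance of one speech segment for one parameter (Cohen-Massaro style).

    tau        : time distance from the segment centre (tau >= 0 is the
                 anticipatory side in the piecewise definition above)
    alpha      : magnitude of the dominance function
    theta_ant  : rate on the anticipatory side
    theta_post : rate after the articulation
    c          : exponent controlling the width of the function
    """
    rate = theta_ant if tau >= 0 else theta_post
    return alpha * math.exp(-rate * abs(tau) ** c)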

Waters and Levergood [WL93] describe DECface, a system that generates lip-synchronized facial animation from text input. Using DECTalk to generate phonemes and a waveform, they associate a keyframe with each phoneme and interpolate between them. To handle coarticulation effects between the phonemes they

assign a mass to the nodes and then use Hooke’s law for spring calculations, with

Euler integration, to approximate the elastic nature of the tissue. A method to deal

with coarticulation, popular in 2D image techniques, is to use triphones instead of

phonemes. A triphone is a phoneme within the context of every possible preceding

and succeeding phoneme. More information must be stored, but the resulting transitions are more realistic. This approach is popular in vision techniques, where triphones can be input as training data and then reused for animation.

Brooke and Scott [BS94] describe a system that uses principal component analysis

(PCA) and a hidden Markov model (HMM) to create animated speech from 2D images.

PCA is performed on a standard test set of 100 3-digit numbers spoken by a native

British-English speaker. Fifteen principal components are identified that account for

over 80% of the variance. The image data is hand-segmented, with the encoded frame sequences phonetically labeled, and the 15 principal components describe the image of any phoneme. The data is used to train the HMMs, and for each triphone

HMMs are constructed with between 3 and 9 states. Animation is achieved by fitting

a quadratic through the states, which results in a continuous stream of PCA coefficients.

The coefficients do not exactly represent the statistics of the HMMs but they only

deviate slightly and produce smoother transitions.

5.1.2 Our Coarticulation Model

Our animation system is track-based, with one parameter per track and possibly many tracks per parameter. A track consists of a curve that describes the value of the parameter over time. Our system allows multiple types of curves, including spline approximation, spline interpolation, physics-based curves, and curves defined by our coarticulation model. We use NURBS, which include all of the popular splines as a subset, and we allow any order of curve. We also allow targets to be defined as point masses with spring constraints. We use the coarticulation model for the parameters that control speech and the other curve types for animating facial expressions.

Our coarticulation model is a modification of the Cohen and Massaro [CM93] coarticulation model. Our model has four differences:

1. We define a phoneme by a curve instead of a single keyframe.

2. We allow the influence of a phoneme to be limited spatially.

3. We use a different value for c, giving the peak of the dominance function a

convex instead of a concave shape.

4. We allow combining of the target-based approximation of the base coarticulation

method with other methods of approximation, such as splines, as well as with

interpolation methods.

A phoneme is a segment of speech uttered over a finite length of time. That phoneme is generated by a dynamic shaping of the vocal tract through movement of its articulators. A common practice is to use a single keyframe position for each phoneme and interpolate those keyframes to achieve animation. However, interpolating to a single keyframe position cannot accurately recreate the motion of the articulators. This is particularly apparent for the diphthongs, which have two distinct shapes, and the stops, which not only stop but also release air.

Each parameter of the facial model has a curve that animates that parameter over time. The parameter curve is a combination of the curves for each phoneme. A phoneme curve is defined by specifying the control points of that curve. A typical control point is (u, ρ, α, θ, φ, ξ, χ, c) where u specifies how close to the beginning of the phoneme that control point is, ρ is the value of the parameter for that control point, and the remainder are optional and control the shape of the curve as explained below.

The control points define the shape of the phoneme curve, and together all of the phoneme curves define the parameter curve. To combine the phoneme curves, each phoneme control point gets mapped to the parameter curve as the tuple T =

(t, ρ, α, θ, φ, ξ, χ, c, σ). The position, t, of the control point on the curve is calculated with t = σ + ud where σ is the starting time of the phoneme, and d is the duration of the phoneme. Both σ and d are generated by the text-to-speech process.
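A sketch of this mapping step is shown below; the list-of-tuples input format and dictionary field names are assumptions for illustration, with the phoneme start times and durations coming from the text-to-speech stage.

def place_controls(phonemes, viseme_defs):
    """Map each phoneme's control points onto the parameter time line.

    phonemes    : list of (name, sigma, d) giving each phoneme's start time
                  and duration from the text-to-speech process
    viseme_defs : dict mapping a phoneme name to its control points; each
                  control point is a dict with 'u', 'rho' and any optional
                  coarticulation settings (alpha, theta, phi, xi, chi, c)
    """
    targets = []
    for name, sigma, d in phonemes:
        for cp in viseme_defs[name]:
            target = dict(cp)                  # keep the optional settings
            target['t'] = sigma + cp['u'] * d  # t = sigma + u * d
            target['sigma'] = sigma
            targets.append(target)
    return targets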

The control points are in effect target values for the parameter being animated.

These targets are then approximated by a curve using our coarticulation model. The curve is a weighted average of the targets, weighted by their dominance functions, and is calculated with
\[
C = \frac{\sum_{i=1}^{n} D_i\, T_i}{\sum_{i=1}^{n} D_i}
\]
The dominance function for target $T_i$ is denoted $D_i$ and calculated as
\[
D_i =
\begin{cases}
0 & \text{if } \tau < \xi_i \text{ or } \tau > \chi_i\\
\alpha_i\, e^{-\theta_i |\tau|^{c_i}} & \text{if } \xi_i \le \tau < 0\\
\alpha_i\, e^{-\phi_i |\tau|^{c_i}} & \text{if } 0 \le \tau \le \chi_i
\end{cases}
\]
where $\xi_i$ is the distance in $t$ before the target over which the target has influence, $\chi_i$ is how far past the target there is still influence, $\alpha_i$ is the magnitude, $\theta_i$ is the growth rate of the influence on the anticipatory side, $\phi_i$ is the rate of decay of the influence on following targets, $\tau = t - \sigma_i$ is the distance from the target in time, and $c_i$ controls the shape of the peak of the dominance function. The introduction of $\chi$ and $\xi$ gives more control over the shape of the final curve by allowing a dominance function to decay slowly but still have limited influence.
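The following sketch evaluates a parameter curve from such targets, combining the limited-influence dominance functions with the weighted average above. It assumes targets in the form produced by the mapping sketch earlier (dictionaries with 't', 'rho', and optional settings), measures tau from the target's mapped time, reads xi as a negative bound and chi as a positive bound, and uses as defaults the typical values quoted later in Section 6.4; all of these choices are illustrative rather than the exact implementation.

import math

def dominance(tau, alpha, theta, phi, xi, chi, c=2.0):
    """Limited-influence dominance D_i for one target."""
    if tau < xi or tau > chi:      # outside the window of influence
        return 0.0
    rate = theta if tau < 0 else phi
    return alpha * math.exp(-rate * abs(tau) ** c)

def parameter_curve(t, targets):
    """C(t): dominance-weighted average of the target values rho_i."""
    num = den = 0.0
    for tg in targets:
        d = dominance(t - tg['t'],
                      tg.get('alpha', 0.05), tg.get('theta', 85.0),
                      tg.get('phi', 85.0), tg.get('xi', float('-inf')),
                      tg.get('chi', float('inf')), tg.get('c', 2.0))
        num += d * tg['rho']
        den += d
    return num / den if den > 0.0 else 0.0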

Cohen and Massaro [CM93] suggest using a value of 1 for c; however, we believe a value of 2 gives better curves. Figure 5.1 depicts a curve using c = 1 along with its first and second derivatives. The dashed lines are the dominance functions, the solid line is the parameter curve, the circles depict the target values, the vertical lines depict the borders of the phonemes, and the phonemes are listed above their respective curve segments. Since the dominance function is concave on each side of the apex, there is a discontinuity in the first derivative. This causes discontinuities in the first and second derivatives of the resulting coarticulation curve. Discontinuities are unrealistic and cause visible artifacts in the animation such as jerkiness. Using c = 2, we get a convex curve at the apex of the dominance function, which does not have a discontinuity. An example curve along with its first and second derivatives is shown in Figure 5.2. The first and second derivatives are continuous and generally much smoother. Figure 5.3 shows the resulting curves and their first and second derivatives for both coarticulation models in the same plot.
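The smoothness claim can be checked directly by differentiating the dominance function near its apex (written here with a single rate $\theta$ on both sides for brevity):
\[
\frac{d}{d\tau}\,\alpha e^{-\theta|\tau|^{c}}
   = -\,\alpha\,\theta\,c\,|\tau|^{c-1}\operatorname{sgn}(\tau)\,e^{-\theta|\tau|^{c}},
   \qquad \tau \neq 0 .
\]
For $c = 1$ the one-sided limits at $\tau = 0$ are $\mp\alpha\theta$, so the derivative jumps by $2\alpha\theta$ at the apex (and by $\alpha(\theta+\phi)$ when the two sides use different rates). For $c = 2$ the derivative is $-2\alpha\theta\,\tau\,e^{-\theta\tau^{2}}$, which tends to $0$ from both sides, so the apex is smooth and the resulting parameter curve inherits continuous derivatives.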


Figure 5.1: A curve using the coarticulation method of Cohen and Massaro [CM93] along with the first and second derivative curves.


Figure 5.2: Our version of coarticulation along with the first and second derivative curves.


Figure 5.3: Comparison between our coarticulation and Cohen and Massaro coarticulation. Even though the curves look somewhat similar, the first and second derivatives from our method are smoother and without discontinuities.


Figure 5.4: Curves for the orbicularis oris measured by hand from six different video recordings. The graph on the right shows the six curves as well as their average.

The parameter curves for the two coarticulation models look similar but have very different properties.

To compare the visual speech generated by our system with real speech, a subject

was recorded repeating the same phrase. The phrase used was “Why did Ken set the

soggy net on top of his deck?”. This phrase comes from early speech research [Per69]

and is included in the cineradiographic database [MVBT95]. Using this phrase allows

us to compare the movement of the vocal tract of our synthetic speech to the vocal

tract movements of real speech that are not visible to a video recorder. The subject

listened to the audio while trying to speak in-sync with that audio. These videos were

then digitized and roughly synchronized with the animation produced by our system.

For six of the videos, the parameter for the orbicularis oris muscle was estimated

by hand for each frame. The estimate was based on the width of the lips due to

the contraction of the orbicularis oris muscle. Figure 5.4 shows the piecewise linear

curves that interpolate the data measured by hand, as well as the average of those

six curves.


Figure 5.5: The curve generated by our coarticulation model for the orbicularis oris parameter.

Figure 5.5 shows the curve generated automatically by our system for the same phrase, with the measured curve for comparison. The generated curve matches the measured curve closely, especially from just before the onset of the /w/ to the end of the segment. The beginnings of the curves do not match, which could be due to many factors. We believe the most likely reason the beginning of the curve does not match is that the subject was filmed repeating the phrase several times while trying to match the audio. It is quite likely the subject was anticipating the lip rounding in order to be synchronized with the audio playing in his ears. The high variability in the measured curves seems to support this hypothesis. During more natural speech this earlier anticipatory rounding may not occur.

In previous animated speech research, the most common method is to treat each phoneme as a keyframe and use some form of interpolation. The easiest method is to use linear interpolation. Figure 5.6 shows linear interpolation of the targets versus our coarticulation model with the measured curve overlaid. The two plots show standard linear interpolation, which is the common method of one keyframe


Figure 5.6: A comparison between the curves generated by our coarticulation model and linear interpolation as well as a measured curve.

per phoneme, and linear interpolation of the multiple keys per phoneme. The plots show that linear interpolation will not hold the lips in the rounded position for very long. In fact, when animating at 24 or 30 Hz, the frames could fall on either side of the peak, resulting in a much lower peak value. With linear interpolation, the anticipatory rounding and the lag after the rounding do not occur. This is a byproduct of interpolation.
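A quick illustration of the sampling problem, assuming a single key of value 1 at t = 0.35 s reached by linear interpolation from keys 0.1 s away on either side and sampled at 30 frames per second (the numbers are made up for the example):

def tri_peak(t, peak_t=0.35, half_width=0.10):
    """Value of a linearly interpolated peak of height 1 at peak_t."""
    return max(0.0, 1.0 - abs(t - peak_t) / half_width)

fps = 30
samples = [tri_peak(n / fps) for n in range(int(0.7 * fps) + 1)]
# The nearest frames land 1/60 s on either side of the peak, so the
# largest sampled value is about 0.83 rather than 1.0.
print(max(samples))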

To solve some of the problems of linear interpolation, an acceleration and deceleration near the keys can be given and a cubic interpolation used. Figure 5.7 shows the natural spline, or a blending of the interpolating cubic polynomials, through the targets. Again, plots using the standard technique of one key per phoneme and using multiple keys are shown. Here we do get the acceleration and deceleration effect, so the lips will stay rounded longer. However, the resulting curve will overshoot the targets, resulting in higher and lower values of the parameter, which could cause disastrous results.


Figure 5.7: A comparison between the curves generated by our coarticulation model and cubic polynomial interpolation as well as a measured curve.

Instead of interpolation we could use approximation of the keys. Bezier approximations are a popular choice and are depicted in Figure 5.8. Here we see that the curve does not get as close to the peak values as we would like. Adding a weight to a key allows the curve to get closer, resulting in a rational Bezier curve, which is depicted in Figure 5.9. Rational Bezier curves allow anticipating and lagging the lip rounding, but they lose the acceleration and deceleration. We could use an interpolating Bezier curve and just specify the tangents at each key location. This would

allow us to reach our target and have an acceleration and deceleration. However, if

we interpolate the keys we still have the problem of not allowing enough anticipatory

rounding.

B-splines are another popular curve choice for keyframe animation. As with Bezier

curves, the B-spline curve may not approach important extreme keys, as seen in

Figure 5.10. Rational B-splines, shown in Figure 5.11, can help, but again with the loss of the acceleration and deceleration properties. The order of the curve can also become a problem, as higher-order curves can be very wild. Another method to get


Figure 5.8: A comparison between the curves generated by our coarticulation model and Bezier approximation as well as a measured curve.


Figure 5.9: A comparison between the curves generated by our coarticulation model and rational Bezier approximation as well as a measured curve.


Figure 5.10: A comparison between the curves generated by our coarticulation model and B-spline approximation as well as a measured curve.


Figure 5.11: A comparison between the curves generated by our coarticulation model and rational B-spline approximation as well as a measured curve.


Figure 5.12: A curve for the Jaw Open parameter generated by our coarticulation model for the phrase “Why did Ken set the soggy net on top of his deck?”.

the curve to approach a key is to have multiple knots for the key. The curve will indeed approach that location, but a discontinuity in the derivatives will result.

5.2 Results

An example curve generated by our system for the parameter Jaw Open for the phrase “Why did Ken set the soggy net on top of his deck?” is shown in Figure 5.12.

The Jaw Open parameter controls rotation of the jaw downward, causing the mouth

to open or close. The dashed lines are the dominance functions for each target, which

control how close the parameter curve gets to the target value. The targets are

displayed as circles. The boundaries of the phonemes are denoted by vertical lines

with the phonemes displayed above the portion of the curve for that phoneme.

5.3 Summary

Our coarticulation model is in the early stages and more development is required.

In particular, we need more analysis of actual speech to verify that our model gives

92 good prediction of articulatory movement, and if not, to adjust the parameters to

get better prediction. It may also be necessary to modify our coarticulation model to

give curve shapes we have yet to encounter.

One modification that is probably necessary is to have velocity and acceleration

constraints on the curves. The tissues involved have such constraints so the model

should not produce animation that violates these constraints. Unfortunately, it is difficult to measure velocities and accelerations of the articulators themselves, and even more difficult to measure them for the parameters of our model. Video is an available source of information, but sampling the curve at 30 Hz is inadequate to get valid

velocity and acceleration information. Access to high-speed motion capture systems will help, but the measurements are mostly indirect and therefore not as accurate.

Direct measurements are difficult because attaching a sensor to the articulator will

cause the speaker to change her speech patterns.

CHAPTER 6

TALKINGHEAD: A TEXT-TO-AUDIOVISUAL-SPEECH SYSTEM

6.1 Introduction

We combine our facial model and animation techniques into a text-to-audiovisual-speech (TTAVS) system. Plain text is fed to the system and an animation of a talking head is produced. Both the audio and video are produced automatically by the system from the input text alone. Such a system is capable of rapidly generating animations of speech. Example uses include a user interface to a catalog, a talking kiosk for visitors, a web site greeter, a book reader, and rapid creation of plays or films from scripts.

The advantages of a text-to-audiovisual-speech system over using prerecorded video are storage and cost. Storing the hundreds or thousands of possible hours of video is a tremendous burden, and some of it may never be viewed. In addition, creating the video is an expensive endeavor in time and resources. An actor must be paid along with camera operators to shoot the raw footage. The equipment and location costs must also be considered. The raw footage must then be edited for use. Simple changes in the subject matter could cause a significant fraction of the existing footage to be rendered useless. Instead of recording an entire soundtrack, recording very short clips, say of words or phrases, and editing them together is a much cheaper solution. However, changing the character, lighting conditions, or even the language requires a whole new set of recorded clips. Moreover, the clips will not merge together seamlessly.

These costs make applications, such as a talking kiosk, prohibitively expensive using video footage of real humans.

On the other hand, a system such as ours has many advantages including:

• Cheap production and storage costs.

• Only video that is viewed need be created.

• Updates are immediately effective.

• Multiple characters can be used.

• Multiple languages can be used.

• Delivery of the information can be easily modified, such as using faster or slower

speech.

6.2 System Overview

Figure 6.1 shows a schematic overview of our text-to-audiovisual-speech system.

The first stage of the process, detailed in Section 6.3, converts the input text into phonemes. The phonemes are then used to create a speech waveform. The first stage is done with existing software. The next stage of the process, discussed in

Section 6.4, is the viseme and expression generator, which converts the phonemes into visemes and adds facial expressions to the animation. The visemes are a set of target values used by the coarticulation model. Facial expressions are generated


Figure 6.1: TalkingHead System Overview

from tags in the input text. Finally, the TalkingHead software uses the visemes to animate the facial model, which is combined with the speech waveform to produce speech-synchronized animation. The TalkingHead animation system is presented in

Section 6.5. The last two stages use the research described in this dissertation.

6.3 Text-To-Speech Synthesis

Text-to-speech processing is achieved using Festival [BTCC00]. Festival is a freely available, general, multi-lingual speech synthesis system offering full text-to-speech functionality. Festival parses the text into a stream of phonemes, which include the necessary information to produce the speech waveform, such as frequency and timing.

The phonemes are then used to create a speech waveform by a speech synthesizer such as a Festival voice or MBROLA [MBR99]. Festival voices consist of a set of diphones and some Scheme code to provide a front end, including text analysis, pronunciation, and prosody prediction. The voices are separate from Festival itself and are distributed as packages. They come as a language-sex pair, such as a British English female or a German male. The Festival distribution comes with many voices, and the creation of new voices has been made rather straightforward, leading to numerous voices created and distributed by other researchers.

The system is designed to create a new speech waveform for each animation produced. Although great strides have been made in the quality of computer-synthesized speech, it still does not quite sound natural. It may therefore be desirable to use a recorded waveform along with the animated visual speech. This can currently be done in our system, but the timing of the phonemes must be calculated by hand to match the recorded dialog. While there are some promising techniques that could be used to automate this process, it is not the focus of our research.
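When a recorded waveform is used, the hand-measured timings have to be supplied to the rest of the pipeline in place of the synthesizer's timings. A minimal sketch of reading such timings is shown below; the whitespace-separated "phoneme start duration" file format and the file name are assumptions made for the example, not the system's actual format.

def load_hand_timings(path):
    """Read hand-measured phoneme timings, one 'phoneme start duration' per line."""
    phonemes = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, start, duration = line.split()
            phonemes.append((name, float(start), float(duration)))
    return phonemes

# e.g. phonemes = load_hand_timings('why_did_ken.lab')  # hypothetical file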

6.4 The Viseme And Expression Generator

The job of the viseme and expression generator is to convert the phonemes into a set of visemes and generate facial expressions. Most researchers [BL85, Ber87, NHS88,

CM93, Par72, Par74, Par75, Wat87] map several phonemes into a single viseme based on speech-reading [Eri89, JB71, KBG85, NNT30, Wal82] techniques.

This is also the method used in traditional, hand-drawn animations [Mad69]. In these methods, a single keyframe is used to represent a phoneme. However, a phoneme is not a static configuration, but rather a dynamic shaping of the vocal tract.

In our method, a viseme is actually a curve that can be of any order. We do this by creating targets, which are control points for a curve. A viseme is defined by as many target positions as are necessary to create the desired shape. The targets are generally control points for our coarticulation model. However, the targets may also be approximated or interpolated with other curves, or as combinations of different curves. Each phoneme has a separate definition. However, it may be very similar to

97 (p ((.1 1 (jaw_open .08 1 300 10000)(jaw_in 0)(jaw_side 0)(orb_oris 0) (l_ris 0)(r_ris 0)(l_plat .24)(r_plat .24)(l_zyg 0.21)(r_zyg 0.21) (l_lev 0)(r_lev 0)(dep_inf 0)(dep_oris 0)(mentalis .34 2 300 9000) (l_buc 0)(r_buc 0)(r_wing_up 0)(inc_sup .17 1 300 9000)(inc_inf 0) (tip_side 0)(tip_up 0)(out 0)(mid_up 0)(l_wing_out 0)(l_wing_up 0) (r_wing_out 0)) (.8 1 (jaw_open 0.08 1 10000 300)(jaw_in 0)(jaw_side 0)(orb_oris 0) (l_ris 0)(r_ris 0)(l_plat .24)(r_plat .24)(l_zyg 0.14)(r_zyg 0.14) (l_lev 0)(r_lev 0)(dep_inf 0)(dep_oris 0)(mentalis .34 1 9000 300) (l_buc 0)(r_buc 0)(r_wing_up 0)(inc_sup .17 1 9000 300)(inc_inf 0) (tip_side 0)(tip_up 0)(out 0)(mid_up 0)(l_wing_out 0)(l_wing_up 0) (r_wing_out 0))))

Figure 6.2: An example viseme definition for the phoneme /p/. This definition is used by the viseme generator to create the list of target values that the animation module uses to create the animation.

or exactly the same as the definition of other phonemes. The set of viseme definitions is associated with particular characters and voices.

Each phoneme has an associated viseme, which is a list of control points that

define the shape of the viseme curve. Each phoneme has a length, and each control point has a value that places it at the appropriate time during the time

span of the phoneme. The control point has target values for all necessary facial

model parameters along with appropriate parameters for the coarticulation model.

The parameter values for the coarticulation model allow the facial model parameter

curve to be shaped as needed. The coarticulation model is discussed in detail in

Section 5.1.

Figure 6.2 shows an example definition of a viseme for /p/. In this case, there are

two control points to shape the viseme curves. One of the control points is located

10% of the way into the phoneme and the other at 80% along the way. The first control

point is at u = .1. For the jaw open parameter the value is .08 with a dominance of 1, and the function decays on the leading edge at a rate of 300 and on the trailing edge at a rate of 10000. For the jaw in parameter the value is 0 and default values are used for the coarticulation model. The default values are typically .05 for dominance

and a decay rate of 85. The default values are defined separately for each facial model

parameter as part of the viseme set.
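The same information as in Figure 6.2 can be pictured as the following data structure. The field interpretation (value, dominance, leading-edge rate, trailing-edge rate) follows the description above, but the Python representation itself is only an illustration of the Scheme-style definition the system actually reads.

# A viseme is a list of control points; each control point pairs a position u
# within the phoneme with per-parameter targets.  A target is either a bare
# value or (value, dominance, leading_rate, trailing_rate); parameters that
# are omitted here fall back to their per-parameter defaults.
VISEME_P = [
    (0.1, {'jaw_open': (0.08, 1, 300, 10000),
           'mentalis': (0.34, 2, 300, 9000),
           'inc_sup':  (0.17, 1, 300, 9000),
           'l_plat': 0.24, 'r_plat': 0.24,
           'l_zyg': 0.21, 'r_zyg': 0.21}),
    (0.8, {'jaw_open': (0.08, 1, 10000, 300),
           'mentalis': (0.34, 1, 9000, 300),
           'inc_sup':  (0.17, 1, 9000, 300),
           'l_plat': 0.24, 'r_plat': 0.24,
           'l_zyg': 0.14, 'r_zyg': 0.14}),
]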

Currently, visemes are created by ad hoc methods, using a mirror or film footage as

a reference. A target viseme position is created using a user interface. Appropriate coarticulation model parameters are then determined to get the parameter curves

to match the observed motion. We are researching methods to automatically obtain

motion curves for the facial model parameters. Unfortunately, measurements can only

be obtained indirectly, as direct measurements often change the method of speaking.

In addition, only the resulting motion can be determined, which must then be mapped

back to parameter values. It is possible for two different sets of parameter values to

describe the same facial shape.

At this stage, keyframes for expressions and head movement are also added to

the animation. Our system is capable of reading tags in the input text for expressions

such as smile, blink, frown, wink, etc. Conversational cues could also be inserted

automatically as is done in systems such as that of Cassell et al. [CPB+94]. We are

at an early stage of development of such improvements. These extra facial expressions

are necessary to produce life-like realistic speech. However, our focus is currently on

synchronizing movement of the visible parts of the vocal tract with the speech.

6.5 The TalkingHead Animation Subsystem

The final stage is TalkingHead, which combines the visual signal with the speech waveform to create animated speech. TalkingHead takes an animation definition, a facial model, and the speech waveform and produces animated speech. An animation definition is a collection of animation tracks, with each track controlling a facial model parameter. There may be more than one track controlling the same parameter, and they are combined in one of many possible ways based on properties of the animation tracks.
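As an illustration of how several tracks driving the same parameter might be combined, the sketch below evaluates them at a given time. The 'add' and 'average' combination modes are assumptions made for the example, since the text does not enumerate the actual combination rules.

def evaluate_parameter(t, tracks):
    """Combine all tracks that drive one facial-model parameter at time t.

    tracks: list of (curve, mode) pairs, where curve is a callable of time
    and mode is 'add' or 'average' (illustrative modes only).
    """
    added = [curve(t) for curve, mode in tracks if mode == 'add']
    averaged = [curve(t) for curve, mode in tracks if mode == 'average']
    value = sum(added)
    if averaged:
        value += sum(averaged) / len(averaged)
    return value

# e.g. jaw_open = evaluate_parameter(0.25, [(speech_curve, 'average'),
#                                           (smile_curve, 'add')])  # hypothetical curves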

Figure 6.3 shows a screen shot of TalkingHead, which is the part of our TTAVS system that animates our facial model. TalkingHead is also used to create and test the parameter values for the visemes. There is a slider for each of the facial model parameters as well as sliders to control visibility and the animation.

For near real-time display, we use Open Inventor™ [Wer94] which, unoptimized, results in a 10-15 Hz display rate on a mid-range SGI graphics workstation. For a more photo-realistic rendering, we use our custom shaders for the lips and tongue, producing

animations at less than ten seconds per frame.

6.5.1 Different Characters

The system is capable of animating different human or human-like characters. To

be human-like, the lips must look and behave similarly to a human's, and the jaw motion must be analogous to human jaw motion. A character can have different geometry, texture maps, voice, language, sex, or way of speaking. Changing the geometry requires fitting the lip model and skull to the new geometry and picking the characteristic points. Along with fitting the lip model, the displacements of the lip control points for each of the parameters need to be defined.

Figure 6.3: Screen shot of TalkingHead, which is the part of our system that animates our facial model to create animated speech.

Texture maps are changed by using new texture images and possibly changing the texture coordinates. The voice, language, and sex are features of the text-to-speech software and are changed by selecting a new voice that has the desired characteristics. Changing sex may also involve changing the motion definition files of the lip model. Speaking patterns can be changed by using a different set of text-to-phoneme rules or by changing the shape of the viseme curves.

6.6 Summary

TalkingHead, our text-to-audiovisual-speech system, is capable of quickly producing animated speech. Multiple languages and multiple characters are easily used. Because we use a 3D method, lighting conditions and camera positions are easily changed. Video does not need to be prerecorded, reducing the storage requirements, and only those animations that are viewed need to be generated. It is also a simple process to change speech properties, such as making the rate of speech faster or slower, depending on the listener's preferences. Our system design is well suited for applications such as a talking kiosk or web greeter, and the individual parts can be used in many other applications.

CHAPTER 7

RESULTS

We have thus far described our facial model designed for animated speech, as well as our techniques for animating it. We also described our text-to-audiovisual-speech system, which combines our facial model with our animation methods. Our facial model, or parts of it, could certainly be incorporated into other facial models. It could also be animated with other techniques. In addition, our animation techniques are not restricted to our facial model.

The goal of our research is to move towards individual realism. Specifically, we are interested in advancing the art of animated speech. We have designed our facial model with static individual realism in mind, but the model is still incomplete. It only deals with shapes needed for speech. Our animation techniques are designed for dynamic individual realism but we are currently only interested in dynamic realism of speech.

In this chapter, we present results of our facial model and our animation techniques. Some of these results have been presented in earlier chapters and are included here to make this chapter complete. We first give modeling results describing the capabilities of our tongue and lip models. We then present results of our animation techniques.

7.1 Results From Our Facial Model

We have created a facial model specifically designed for animating speech. The

animation of speech requires highly deformable lips and a tongue because the mouth conveys important information about the speech signal. We desired a facial model that could represent the geometry of any human. To achieve this we accept facial

geometry as input to the facial model. The anticipated source of the geometry is laser

scanners or human modelers. To guarantee enough geometric complexity in the lips

and tongue we create highly deformable models of both the tongue and lips. This

section presents results of our facial model emphasizing the tongue and lip models.

7.1.1 The Tongue Model

Figure 7.1 shows samples from the four categories of tongue shapes identified

by Stone and Lundberg [SL96]. Stone and Lundberg used an ultrasound device to

reconstruct the surface of the tongue during sustained production of phonemes. The tongue tip is missing from Stone’s reconstruction because of air between the tip and the ultrasound device. On the left are the surface reconstructions from their research

while the right hand side is our tongue model shaped to represent the same phoneme.

Our tongue model is capable of representing the general shape of the surface of the

tongue of the four categories identified by Stone and Lundberg as needed for American

English. For /n/ the tip of our tongue model is not as curved, because it is not pressed

up against the roof of the mouth as it was when captured with ultrasound on the

left. Our tongue model is shaped to represent the Visible Human Female, while the

tongue reconstructions by Stone and Lundberg are of their subject.

Figure 7.1: Samples from the four categories of tongue shapes from Stone and Lundberg [SL96] (/ae/ as in “hat”, /ih/ as in “hit”, /uu/ as in “hut”, /n/ as in “pan”). The left image is a surface reconstruction from ultrasound of the tongue shape during sustained production of the phoneme, while the right hand side is our tongue model shaped to represent that phoneme.

Figure 7.2: A side-by-side comparison of our facial model in the phoneme position /n/. On the left our tongue model has been added to the facial model, while on the right there is no tongue. The visible artifacts in the mouth on the right side are the back sides of the triangles from the rear of the head.

The tongue has not been given much consideration in facial animation and modeling in the past. The most common reasons given are that the tongue does not play an important role in discerning between sounds and that it is mostly obscured. In our opinion, the tongue gives important information and is critical for distinguishing certain sounds such as /th/ and /l/. Furthermore, a complete lack of a tongue, or a tongue that does not behave correctly, drastically reduces the believability of a facial model. Figure 7.2

shows our facial model in the position of the phoneme /l/, both with and without the

Figure 7.3: Frames from an animation of our facial model licking its lips.

tongue. The image on the left has the tongue while the image on the right is missing the tongue. In the image without the tongue, it is impossible to discern what sound is being made and the lack of tongue is extremely noticeable.

In addition to representing the tongue shapes needed during speech, our model

is capable of other tongue shapes needed for general facial expressions. Figure 7.3

is made up of a selection of frames from an animation of our facial model licking its

lips. An action such as licking the lips is a natural action that can occur during a

conversation and adds to the believability of the facial model.

A character could also be rude, taunting, or teasing and stick its tongue out at

the listener. Figure 7.4 is composed of frames from an animation of our facial model

showing some ill-mannered behavior. Figure 7.5 shows both a side and front view

of our facial model with its tongue hanging out. These images show that a highly

Figure 7.4: Frames from an animation of our facial model sticking out its tongue.

Figure 7.5: Our facial model sticking out its tongue.

Figure 7.6: Our facial model with its lips puckered as if to kiss.

deformable tongue is not only beneficial for producing animated speech, but also for

showing emotion and life functions that make an animation more realistic.

7.1.2 The Lip Model

The lip model is designed to represent both speech and facial expressions. In

Section 7.2 the results from our animation methods are presented and show the ability

of the lip model to represent the shapes needed for speech. In this section, we exhibit

the lip model in various facial expressions.

Figure 7.6 depicts a front and side view of our facial model as it puckers its lips in

anticipation of a kiss. Notice in the side view how the lips actually protrude outward.

Figure 7.7: Our facial model is happy.

The protrusion of the lips during the contraction of the orbicularis oris muscle is important for realism. Also, notice that the lips are rotated outward.

Showing emotion is very important for a facial model. Facial expressions can convey the emotional state of the speaker and are important in determining the meaning of speech. It is also important for creating animations of characters that need to show life. A talking head that only speaks without emotion seems robotic and boring. In addition, emotional emblems can be used to express an emotion to a listener that the speaker is not currently experiencing. Figure 7.7 shows our lip model in a good mood. In Figure 7.8, our facial model appears unhappy. In Figure 7.9 the facial model is depicted with a very big, open-mouthed smile. Finally, we show the facial model in a half smile, possibly a smirk, in Figure 7.10. Note that the images do not quite

Figure 7.8: Our facial model expressing sadness.

Figure 7.9: Our facial model in a very happy mood, with a very big grin.

Figure 7.10: Our facial model in the position of a wry smile.

display the intended emotion. The emotions may even seem fake. This is because the emotion is only being displayed by the mouth and not by the eyebrows and eyes.

Competing emotions in the different regions of the face (eyes, eyebrows, mouth) are a

sign of deceit [EF84, Ekm85].

7.2 Animation Results

One method of evaluating our resulting speech is to compare it against real speech.

To accomplish this, we recorded our subject speaking the phrase: “Why did Ken set

the soggy net on top of his deck?”. We created synthetic speech of the same phrase

using the techniques described in this dissertation. In order to compare the two

utterances, we specified the timing of the phonemes for the synthetic speech by hand

so that it would closely match the timing of the real speech. Figure 7.11 shows the

112 Figure 7.11: A comparison of the waveforms for the synthetic and natural speech. The top waveform is from the real speech and the bottom waveform is synthetic.

two speech waveforms. Note that the timing is indeed quite close, but not exact.

Boundaries of words can often be matched very closely, however, there is not always

a clearly distinguishable separation between phonemes. Without a clear start and

stop of each phoneme, it is impossible to match the timing exactly.

To make a better comparison, we recorded the real speech in both a head-on and

profile view by using a mirror at roughly a 45 degree angle. A mirror allows the

acquisition of synchronized, orthogonal (or near orthogonal) views without expensive

hardware and multiple cameras. The real speech was digitized and the mouth area

was cropped from the frames. The synthetic speech was also rendered using the same

two views and also cropped. The two cropped images were then combined into a

single frame resulting in an animation that showed both synthetic and real speech in

rough synchronization from two separate views.
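A minimal sketch of this cropping-and-compositing step using the Pillow imaging library is given below; the file names and crop box are placeholders, and the same crop box is applied to both frames only for simplicity.

from PIL import Image

def side_by_side(real_path, synth_path, crop_box, out_path):
    """Crop the mouth region from a real and a synthetic frame and paste the
    two crops next to each other in a single comparison frame."""
    real = Image.open(real_path).crop(crop_box)
    synth = Image.open(synth_path).crop(crop_box)
    width, height = real.size
    combined = Image.new('RGB', (2 * width, height))
    combined.paste(real, (0, 0))
    combined.paste(synth, (width, 0))
    combined.save(out_path)

# e.g. side_by_side('real_0001.png', 'synth_0001.png',
#                   (100, 200, 420, 440), 'compare_0001.png')  # hypothetical paths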

There are a few things to consider before looking at the frames. First, the synthetic

speech does not use prosodic information so the difference in emphasis and pitch will

cause different target positions. Also, consider that the real speech was filmed with the subject repeating the phrase while listening to the synthetic speech. This was done to get similar timing and similar prosodic features to the speech. As a result, the beginning mouth position of the real speech is not in a neutral position (closed mouth). Remember that the timing of the phrases is very similar but not exactly the same; a difference of only .03 seconds will cause movement to occur an entire frame later. Note that the model may indeed look similar, but it is not meant to be an exact match. For example, the lips are not exactly shaped the same way, and the tongue is shaped like that of the Visible Human Female and not like that of the subject.

Thus, the tongue is much flatter in the mouth for the synthetic speech than it is for the real speech. Finally, the synthetic speech is rendered with different lighting conditions. Specifically, there is a light shining more directly into the synthetic mouth and there are no shadows cast in the synthetic mouth.

Figures 7.12-7.20 show the frames of the comparison. You should notice that the real speech starts with the mouth in a more open position and starts rounding for the

/w/ a few frames earlier. The open mouth is just a different starting position than our neutral position for the synthetic speech. The earlier rounding could either be due to more emphasis on “why” for the real speech, or because the viseme for /w/ needs to have a little more initial dominance.

The synthetic speech matches very closely from the /w/ in frame 9 up until around frame 44, which is the start of the /eh/ in “set.” This is because the viseme for /eh/ allows for a more open mouth. This could be due to the viseme definition being slightly incorrect for this subject or due to a lower emphasis on the word “set” in this utterance.

Figure 7.12: Frames 1-15 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.13: Frames 16-30 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.14: Frames 31-45 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.15: Frames 46-60 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.16: Frames 61-75 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.17: Frames 76-90 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.18: Frames 91-105 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.19: Frames 106-120 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Figure 7.20: Frames 121-135 of a comparison of computer-generated speech with recorded speech for the phrase: “Why did Ken set the soggy net on top of his deck?”.

Frames 50-54, which span the end of the /t/ and the beginning of the /dh/ in “set the,” also show a deviation. This is mostly due to an inadequacy in the lip model. The lip model does not properly allow the lip corners to be compressed together. The best examples of this effect are in frames 51-55 for /dh/ in “the” and in frame 105, which is the release of air for /p/ in “top.”

In Frame 101, the synthetic mouth is starting to close about a frame earlier than the real speech for the /p/ in “top.” This is most likely due to a slight difference in timing, but could also be due to an incorrect dominance for the /p/. The air is also released for the /p/ a little earlier for the synthetic speech.

Frame 106 is the start of the /v/ from “of” and shows a small difference between the real and synthetic speech. However, in the synthetic speech the lower lip shows some motion blurring, which during animation gives the impression of quick movement as the lower lip tucks under the upper teeth to produce the /f/ and /v/ sound.

Frames 112-118 show a difference in the rounding of the lips for “his”. This is due to the corners of the synthetic lips not properly compressing together. At the end of the phrase the synthetic mouth moves to a closed position a little quicker than the real lips, due simply to the neutral position of the synthetic face.

Besides the differences in the lip positions, there are some differences in tongue positions. The synthetic tongue has a tendency to move a little more quickly. This is because the coarticulation model only considers one parameter at a time. It does not understand that if the jaw is closing, then the tongue is also indirectly moving upward and therefore does not necessarily need to be moved directly upward. This is best seen in frames 32-35, which show the transition between the phonemes /eh/ and

/n/ in “Ken”, where the tongue moves towards the roof of the mouth too early.

Figures 7.21 and 7.22 show a typical animation produced by this system. This is the standard rendering for an off-line animation. The synthetic head is saying the phrase: “Hello world.”

The goal of this work is to produce realistic animations. Unfortunately, the current printout of this dissertation cannot display an animation. However, the electronic version of this document can display animations. Figure 7.23 shows the first frame of two animations produced by this system. If you are viewing the electronic version,

click on an image to view the animation. The “Abstract” animation is our facial

model reciting the abstract of this dissertation. The “Comparison” animation shows

a comparison between synthetic speech and real speech, and is the animation shown in

Figures 7.12-7.20. All of these animations were produced using the research described in this dissertation.

Figure 7.24 contains images of the first frame of animations that show further capabilities (other than speech) of our facial model. Note that these animations are only viewable in the electronic version of the thesis. Click on an image to view the animation. The "Tongue Tricks" animation shows the facial model sticking out its tongue and licking its lips. The "Happy Model" animation shows the facial model speaking with a smile. Both the "Tongue Tricks" and "Happy Model" animations were created without using our coarticulation model. They were created using a single keyframe per viseme, with the keyframes linearly interpolated.

7.3 Summary

Note that our goal here is only dynamic realism, and not individual realism. We do achieve a degree of individual realism since we designed the visemes based on the


Figure 7.21: Frames 1-15 of an animation of the phrase: “Hello world.”


Figure 7.22: Frames 16-30 of an animation of the phrase: “Hello world.”


Figure 7.23: Animations produced by the techniques described in this dissertation. These animations are only available in the electronic version of the thesis. Click on an image to view the animation. The machine you are using to view the animations must be capable of playing the animations at 30 frames per second. The "Abstract" animation is our facial model reciting the abstract of this dissertation. The "Comparison" animation shows a comparison between synthetic speech and real speech, and is the animation shown in Figures 7.12-7.20.


Figure 7.24: Animations produced by the techniques described in this dissertation. Note that these animations are only viewable in the electronic version of the thesis. Click on an image to view the animation. The "Tongue Tricks" animation shows the facial model sticking out its tongue and licking its lips. The "Happy Model" animation shows the facial model speaking with a smile. Both the "Tongue Tricks" and "Happy Model" animations were created without using our coarticulation model.

subject, and the geometry of the model is that of the subject. However, our model as shown is low-resolution and the lips were fitted to the low-resolution version. The lips, therefore, do not quite match those of the real speaker. Without prosodic features, the motion will not exactly match real speech that includes prosodic features. Our results show both static and dynamic realism; however, we do not quite achieve static or dynamic individual realism.

CHAPTER 8

FUTURE RESEARCH

The work presented here is in the early stages, with more research yet to be performed. In this chapter we discuss research yet to be completed that is currently in various stages of development. This new research involves modeling, animating, and improving our text-to-audiovisual-speech system. We also have a plan for a validation study.

8.1 Future Modeling Research

Our facial model can be used to create intelligible animated speech with realistic motion, but more work remains. Our current focus is on the mouth area. General facial expressions are still required to complete the facial model. Also, as the muscles move the mouth area, the displacement of the muscle tissue causes changes in the facial surface in other areas of the face. These changes due to muscle movement need to be incorporated into the facial model.

As we have gained experience creating animations and comparing the synthetic speech with real speech, we have learned some of the inadequacies of our facial model. Collision detection, and the appropriate deformations due to collisions, would increase realism. The lack of collision handling is apparent for the tongue in phonemes such as /dh/ ("the"), where the tongue should flow around the upper teeth. For sounds that bring the lips together then apart, such as /p/, the corners of the mouth compress together and may actually stick together as they part.

Collision detection, however, is a difficult problem and solutions tend to be very slow. For a human computer interface using speech, real-time generation of the speech is required. Real-time animation leaves precious little computing time to perform accurate collision detection.

8.1.1 Hair

Hair is an important visual characteristic of human heads, so a believable talking head must include it. Not only the hair on top of the head, but also mustaches, beards and eyebrows must be modeled. The hair must not only be rendered realistically, it must also move realistically. See Section A.3.6 for a discussion of the issues involved in modeling and rendering hair and some current solutions.

8.1.2 Rendering Skin

We are currently researching a skin rendering algorithm that is a combination of the reflection model of Hanrahan and Krueger [HK93] and the rendering and modeling method of Wu et al. [WKMT96]. Hanrahan and Krueger [HK93] create a model for the subsurface scattering of light in layered surfaces using one-dimensional transport theory. This method replaces Lambertian diffuse reflection with a more accurate model of diffuse reflection that considers how the light interacts with different layers.

Wu et al. [WKMT96] model the skin with both a micro and macro structure. At the micro level, they create a tile texture pattern using a planar Delaunay triangulation with a hierarchical structure and render the micro furrows using bump mapping. At

the macro level, they model wrinkles. Combining the techniques of Wu et al. and Hanrahan and Krueger may result in a skin model that has correct surface detail and accurate lighting properties.

8.2 Future Animation Research

Phonemes are commonly used as the smallest concatenative units of speech, and that is what we use for our research. However, the same phoneme does not always look

the same. The two effects that model these differences are coarticulation and prosody.

Our current research has a coarticulation model but does not handle prosody. Adding

prosody to our system will allow us to create more realistic animated speech.

Our coarticulation model allows us to specify a viseme as a curve for each facial

model parameter. The coarticulation model also allows us to define how visemes will

influence other visemes in the utterance. The coarticulation model gives us greater

control over the shape of the curve for the parameter for the entire utterance than in

other animation techniques that use a single key position for each viseme. However,

we have not yet had sufficient experience with the model for different individuals and

languages to judge its generality. One source of information is using x-ray films of

the vocal tract during speech.
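To make the idea of curve-based visemes concrete, the C++ sketch below blends several per-viseme curves into a single parameter track using dominance-style weights. It is only an illustration of the general approach; the class names, the bell-shaped weighting, and the blending rule are assumptions made for this example and do not reproduce the coarticulation model developed in this dissertation.

    // Illustrative sketch only: blending viseme curves into one parameter
    // track with dominance-style weights. Names and formulas are assumed
    // for this example, not taken from the actual system.
    #include <cmath>
    #include <vector>

    struct VisemeCurve {
        double center;     // time (seconds) at which the viseme peaks
        double duration;   // nominal duration of the viseme (seconds)
        double target;     // target value of this facial parameter
        double dominance;  // how strongly the viseme claims the parameter

        // Bell-shaped activation around the viseme center (assumed form).
        double weight(double t) const {
            double x = (t - center) / duration;
            return dominance * std::exp(-4.0 * x * x);
        }
    };

    // Value of one parameter track at time t: a weighted average of the
    // targets of all active visemes, falling back to a neutral value when
    // no viseme has significant influence.
    double evaluateTrack(const std::vector<VisemeCurve>& visemes,
                         double t, double neutral) {
        double num = 0.0, den = 0.0;
        for (const VisemeCurve& v : visemes) {
            double w = v.weight(t);
            num += w * v.target;
            den += w;
        }
        return den > 1e-6 ? num / den : neutral;
    }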

Our coarticulation model does not consider the entire vocal tract, but only a single

parameter, at the same time. In reality, there can be more than one way to produce

the same sound. For instance, to change the volume of the oral cavity, the tongue

can be raised or lowered directly, or indirectly by movement of the jaw. Another

example is when the tip of the tongue needs to reach the roof of the mouth. When

started with the mouth opened, the tongue is quite far away. If the jaw also is going

to be closed, the tip of the tongue can move most of the way without effort using the motion of the jaw to get closer to its target location. Then, the tip can finish the

movement to the target once the jaw has slowed or halted its movement. Another

example is movement of the lower lip which can be moved outward via direct muscle

action or indirectly with movement of the jaw. A possible method to handle these

multiple path situations is to use a rule based system that recognizes such situations

and changes the normal control points for the affected tracks, to achieve the desired

motion.
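One possible form of such a rule is sketched below in C++: before the tongue-tip track is finalized, the rule checks the jaw-opening track and reduces the direct tongue displacement when the jaw is already closing. The track representation, the carry factor, and the rule itself are hypothetical; they only illustrate how a rule could edit control points, not how the system currently behaves.

    // Hypothetical rule: if the jaw is closing, let the jaw carry part of
    // the tongue-tip motion and reduce the direct displacement requested
    // of the tongue. Data structures are assumed for this sketch.
    #include <algorithm>
    #include <vector>

    struct ControlPoint { double time; double value; };
    using Track = std::vector<ControlPoint>;   // control points sorted by time

    void applyJawTongueRule(const Track& jawOpen, Track& tongueTip,
                            double carryFactor = 0.6) {
        for (ControlPoint& cp : tongueTip) {
            // Jaw opening in [0,1] at (or just before) this control point.
            double jaw = 1.0;   // assume fully open if no earlier key exists
            for (const ControlPoint& j : jawOpen)
                if (j.time <= cp.time) jaw = j.value;
            // The more closed the jaw, the less the tongue must move itself.
            double reduction = carryFactor * (1.0 - jaw);
            cp.value = std::max(0.0, cp.value - reduction);
        }
    }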

8.2.1 Syllables as Concatenative Units

A common approach, and the one we currently use, is to define speech as made

up of phonemes. That is, phonemes are the smallest concatenative units of speech.

Instead of using phonemes, syllables may be a better choice [FL78, Fuj92, SF93, Fuj00]

as the smallest units of speech. Syllables should look more similar under different

circumstances than phonemes do. However, there are many more syllables (on the

order of thousands) than phonemes, so defining the set of syllables is a much more

complex problem.

We are in the preliminary stages of testing the feasibility and realism of using

syllables. Due to the sheer number of syllables that need to be defined, we choose

to define them in stages. Using our current model of phonemes we can create the

first definitions for the syllables. These definitions will slowly be evolved to increase

realism of the syllables.

8.2.2 Cineradiographic Database

Munhall et al. [MVBT95] describe a collaborative effort to preserve cineradiographic (x-ray film) vocal tract footage made in the 1960s and 1970s. These films allow a dynamic view of the vocal tract parts during speech. The films are quite valuable for studying the vocal tract. Currently, such films are rarely made, due to the health risks to the subject of such high exposure to radiation. The films are in a format not easily accessible and there is fear of deterioration. To preserve the films and make them accessible to a wide range of researchers, Munhall et al. transferred the films to laser disc. The laser disc containing 25 films totalling 55 minutes of x-ray footage has been made publicly available. The films are of native speakers of Canadian French and Canadian English.

The films allow the rotation and transverse (forward and backward) movement of the mandible to be measured. In addition, 2D tongue and lip contours can be discerned. Many of the facial model parameters could be estimated from the films.

The resulting data could be used to validate the coarticulation model as well as to tune the model. Collecting this data for the roughly 1600 frames is a daunting task. Although our eyes are able to detect the movement of the vocal tract parts, determining them in a single frame is not trivial. The effort involved in gathering the data by hand is tremendous. Instead, we are researching an automated method using computer vision and image processing techniques to track the jaw, tongue, and lips from frame to frame.
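As a rough indication of the kind of automation intended, the C++ sketch below tracks a single labeled point from one film frame to the next by searching a small window for the best sum-of-squared-differences match of the surrounding patch. This is a deliberately simplified, hypothetical tracker; a usable system for these films would need contrast normalization, contour models, and manual correction, and the image layout assumed here is not that of any particular database.

    // Simplified, hypothetical point tracker: find where a patch from the
    // previous frame best matches inside a search window of the next frame.
    // Bounds checking is omitted for brevity; callers must keep the patch
    // and search window inside the image.
    #include <limits>
    #include <vector>

    struct Gray {
        int w, h;
        std::vector<float> px;                       // row-major pixels
        float at(int x, int y) const { return px[y * w + x]; }
    };
    struct Point { int x, y; };

    Point trackPoint(const Gray& prev, const Gray& next, Point p,
                     int patch = 8, int search = 12) {
        double best = std::numeric_limits<double>::max();
        Point bestP = p;
        for (int dy = -search; dy <= search; ++dy)
            for (int dx = -search; dx <= search; ++dx) {
                double ssd = 0.0;
                for (int v = -patch; v <= patch; ++v)
                    for (int u = -patch; u <= patch; ++u) {
                        float a = prev.at(p.x + u, p.y + v);
                        float b = next.at(p.x + dx + u, p.y + dy + v);
                        ssd += (a - b) * (a - b);
                    }
                if (ssd < best) { best = ssd; bestP = {p.x + dx, p.y + dy}; }
            }
        return bestP;   // estimated location of the feature in the next frame
    }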

8.2.3 Prosody

The prosodic features of speech are generally taken to include length, accent and stress, tone, intonation, and potentially a few others. However, there is no universal consensus among phonologists about either the nature of prosodic features themselves or the general framework for their description [Fox00]. The term prosody comes from the Greek prosodia which means something like “song sung to music”. Recently, it

has come to refer to the principles of versification, covering such things as rhythmical

patterns, rhyming schemes and verse structure. In linguistic contexts, prosody more

frequently refers to such characteristics of utterances as stress and intonation, and

moreover in prose rather than in verse.

Prosody is generally ignored in animated speech. The exception is Cassell et al.

[CPB+94]. They describe a system that automatically generates animated conversations between multiple human-like agents with speech, intonation, facial expressions and hand gestures. A dialogue planner creates the text and intonation of the utterances that are used to create synchronized speech with facial expressions, eye gaze,

head motion, and hand gestures.

Here we use the definitions of Fox [Fox00] for the prosodic features. Length is the duration of the utterance. Accent is the singling out of a particular element of the utterance with respect to its surrounding elements. Terms such as accent, accentuation, stress, prominence, emphasis, salience, intensity, and force all describe this phenomenon. Accent can be divided into pitch-accent and stress-accent, where pitch-accent is a melodic manifestation of accent while stress-accent is a dynamic one. Tone is an intrinsic property of a syllable, word or grammatical construction and describes pitch. Intonation is change in the pitch over the course of a phrase or sentence, for example a rising pitch for a question in English.

It is unclear just how prosodic functions may affect the visual manifestation of

the speech. Obviously, length will affect the positioning of the targets, but should already be handled by the coarticulation model. Accent could possibly change target values, especially if super-enunciating a word or phrase. During super-enunciation, the positions may be held longer and may be overly characterized. Without further data it is unclear if the coarticulation model is able to correctly reproduce this behavior.

A change in pitch is usually accomplished by changing the position of the mandible.

The pitch information is produced by the text-to-phoneme generation and possibly

could be used directly to drive changes in the mandible.
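The sketch below shows one hypothetical way such a mapping could look: an F0 (pitch) sample from the text-to-speech front end is normalized and turned into a small additive offset on the jaw-opening parameter. The linear mapping, the pitch range, and the parameter name are assumptions made for illustration only; whether such a mapping produces plausible motion is exactly the open question discussed above.

    // Hypothetical mapping from a pitch (F0) contour to a small additive
    // offset on the jaw-opening parameter. All constants are placeholders.
    #include <algorithm>
    #include <cmath>
    #include <vector>

    struct PitchSample { double time; double f0; };    // seconds, Hz

    double jawOffsetFromPitch(const std::vector<PitchSample>& contour,
                              double t, double f0Low = 80.0,
                              double f0High = 250.0, double maxOffset = 0.1) {
        if (contour.empty()) return 0.0;
        // Pick the sample closest in time to t (contour assumed small).
        const PitchSample* best = &contour.front();
        for (const PitchSample& s : contour)
            if (std::abs(s.time - t) < std::abs(best->time - t)) best = &s;
        double u = (best->f0 - f0Low) / (f0High - f0Low);   // normalize pitch
        u = std::min(1.0, std::max(0.0, u));
        return u * maxOffset;    // higher pitch -> slightly larger opening
    }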

We are currently evaluating the Converter and Distributor (C/D) model of Fujimura [Fuj00] for use in our system. The C/D model uses syllables as the basic units of speech (as described above) and shows promise in increasing the realism of both

the audio and video of synthetic speech. We believe the C/D model will be most

beneficial when prosodic effects are considered. Stress is an important prosodic effect

and it is an integral part of the C/D model. Therefore, implementing the C/D model should give us a foundation for realistic stress in our synthetic speech.

8.3 Improvements to TalkingHead

Adding a different character to the system involves fitting the lips and skull and picking the characteristic points. The eyes and nostrils also need to be defined. This process needs to be streamlined and partially automated. In addition, the lip, tongue, and skull models are designed for human characters. Research into adapting the system to non-human characters would increase the utility of the system.

In order to use a new language, a voice for that language is needed. In addition, the visemes for all phonemes used by the voice must be determined. There are currently many different voices for many different languages that can be used, and the process for defining new voices is well established. The visemes will still need to be determined in order to use these existing voices. The best results are obtained by using data from subjects who are native speakers of those languages. We currently have plans for German and Chinese, as we have access to native speakers of these languages.

8.4 Validation Study

Validating an animation method is not an easy task. There are rarely good explicit

metrics that can be used to calculate the effectiveness. A common method is to have

subjects give speculative answers to questions such as “Does it look realistic?”, or

“Does it look good?”.

Since we are animating speech, we can measure comprehension rates as a metric.

Our planned study is to measure rates of speech comprehension with just sound and

with both sound and video. The rates will be measured for both real and synthetic speech and compared. We also plan to compare speech-reading (video only) rates between hearing and non-hearing subjects.

8.5 Summary

We have created a facial model capable of representing realistic mouth shapes required during speech. We have also developed animation techniques that allow

realistic motion of the mouth during synthetic speech. Our research is in the early stages but still shows promise that individual realism can be obtained.

The synthetic speech generated by our system exhibits static and dynamic realism. We can quickly generate comprehensible synthetic speech using only text. A

system such as ours can be used to create human computer interfaces based on speech, which is a natural mode of communication for humans. With the addition of voice recognition techniques, a user can converse with the machine using speech, a comfortable and natural method of interaction. Such a system would allow talking kiosks for visitor information, interactive talking catalogs, interactive talking web site greeters, and interactive talking help systems, to name a few.

APPENDIX A

FACIAL ANIMATION OVERVIEW

A.1 Background

Using technology to produce and animate human faces started in the early 1970s with 2D and 3D methods. Since then, there have been hundreds of research papers describing the efforts of numerous researchers in the diverse field of facial animation. The research has covered topics on facial modeling, rendering, animating speech, animating expressions, creating caricatures, modeling hair, rendering hair, modeling skin, modeling wound closure, simulating facial muscles, simulating surgery, compression of videos with faces, virtual avatars, and virtual teleconferencing, to name just a few.

An interesting use of 2D faces is as a method to visualize multivariate data.

Chernoff [Che71, Che73, CR75] developed the technique in the early 1970s with the idea that visualizing data with more than four dimensions could be cast into a familiar domain, that of recognizing differences in facial attributes. Cartoon faces with 18 parameters are used to graphically represent data with up to 18 dimensions. Chernoff conducted experiments that suggested the order of assignment of dimensions to the parameters could change the classification error rate by 25%. Flury and Riedwyl

[FR81] develop another method using asymmetrical faces effectively doubling the number of variables that can be displayed while also removing some of the problems with Chernoff faces.

Parke [Par72] creates the first animation of a 3D face by hand digitizing expressions and facial geometry then defining key positions. Animation results from interpolation of the keyframes. This is very tedious work, so Parke [Par74] later developed a parametric facial model based on empirical and traditional hand-drawn animation methods. The parameters, which define a linear interpolation between two extremes, come in two flavors: conformation parameters and expression parameters. The conformation parameters define the shape of the face and the expression parameters are used to create the animation.

Gillenson [Gil74] presents a method to create 2D drawings of individuals, such as police artist sketches. The resulting images can represent the attributes of an individual effectively enough for recognition. This system is strictly 2D and is only used for static poses.

Some of the earliest synthetic facial animation was created using oscilloscopes, and it was the first real-time facial animation. Boston [Bos73] used an oscilloscope to create mouth shapes presented to a lip reader who was then able to recognize a small vocabulary and sentences of speech represented by the oscilloscope.

Montgomery [Mon80] augments earlier work with the addition of nonlinear interpolation between frames as well as forward and backward coarticulation approximation. The system, which draws lip outlines on a CRT from data hand-captured from video frames, is designed to test lip reading ability.

A.1.1 Definition of Facial Animation

Figure A.1 is a schematic of a typical facial animation system. A facial animation system will contain one or more animating and modeling components. Animating consists of changing the model shape over time to produce animation. This is generally done by setting the parameters, whether high-level or low-level, of the facial model for each frame. An animation technique may define its own parameterization, generally high-level, with a mapping from that parameterization to the facial model's parameterization.

Modeling is the rest of the process. Facial modeling includes defining the geometry of the face, parameterization, animation controls, and rendering. A facial model will therefore have some way to define the surface of the face, a way to specify a specific shape (a parameterization), a mapping of the parameterization to a specific shape, and a renderer. The various parts of the facial model may vary in complexity, with some possibly missing. A facial model might also have more than one of each of the parts.
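The split between animating and modeling can be summarized by the following C++ sketch of a per-frame loop: the animation layer evaluates its parameter curves, the facial model maps the resulting parameter vector to a surface, and the surface is rendered to an image. The class and method names are illustrative placeholders rather than the interface of any particular system, and the stub bodies stand in for the real work.

    // Illustrative per-frame loop for a facial animation system. The
    // classes below are placeholders with stub bodies; a real system
    // would evaluate its parameter tracks and deform real geometry.
    #include <vector>

    struct Surface {};   // deformed facial geometry (stub)
    struct Image {};     // rendered frame (stub)

    class FacialModel {
    public:
        // Map a parameter vector (jaw opening, lip rounding, ...) to a shape.
        Surface applyParameters(const std::vector<double>&) { return Surface{}; }
        Image render(const Surface&) { return Image{}; }
    };

    class AnimationMethod {
    public:
        // Evaluate every parameter track (curve) at time t.
        std::vector<double> evaluate(double /*t*/) { return {}; }
    };

    std::vector<Image> animate(AnimationMethod& anim, FacialModel& model,
                               double duration, double fps = 30.0) {
        std::vector<Image> frames;
        for (double t = 0.0; t <= duration; t += 1.0 / fps) {
            std::vector<double> p = anim.evaluate(t);   // animating
            Surface s = model.applyParameters(p);       // modeling
            frames.push_back(model.render(s));          // rendering
        }
        return frames;
    }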

The goal of the animation is to produce the desired motion. This motion could be realistic or artistic. The animation could be designed to produce intelligible speech, tell a story, entertain, teach, reproduce observed motion, support virtual reality, etc. The goal of modeling is usually to support the animation at minimal cost. Its complexity is directly related to the goal of the animation. Occasionally, especially in medical applications, there are situations where realistic static views of the face are the goal.

This overview will give the reader the broad base of knowledge needed to create a facial animation system. The anatomy of the face will be discussed along with techniques for modeling and animating faces. Parameterizations of the motion of the

face are presented. The special requirements of specific types of applications such as speech and surgery will also be discussed. Vision techniques and their uses will be reviewed as well as facial animation standards.

Figure A.1: A schematic of a facial animation system. The animation technique(s), which may or may not define a parameterization for the model, change the model over time. The facial model will accept a definition of a shape of the model in the form of a parameterization. The definition is mapped to a specific shape of the surface that is rendered to produce an image.

A.2 Anatomy

Anatomical knowledge is extremely important for solving the problems of facial animation. The underlying physical structure determines the surface of the face and a facial animation system should consider this structure. The interaction between the soft tissues and bone determines how the facial surface deforms. The muscle contractions not only pull the skin over bone, muscle and fat, but also change the

surface due to the displacement of their volume.

Consideration of these interactions is necessary for realistic facial animation. In computer graphics, we are generally interested in efficient simulations of reality. Modeling the tissues at the cellular level is computationally expensive, but we do need to understand how different tissues behave. Those viewing facial animation are very critical of the realism of the animation since we are trained from just after birth to read faces. We use this training to determine not only what message is being delivered, as in speech, but also the emotion of the message and the emotion of the speaker.

For a good feel for how the muscles and bone affect the face, refer to a book such

as Faigin [Fai90] that describes anatomy for the artist. Any good general anatomy

reference (e.g. Gray [Gra77]), speech and hearing anatomy reference (e.g. Dickson

and Maue [DM70] or Bateman and Mason [BM84]) or facial anatomy reference (e.g.

Brand and Isselhard [BI82]) can be an invaluable source of information.

A.2.1 Skin

The skin is the principal organ of the sense of touch and is also a covering for the

protection of the deeper tissue. The skin plays an important role in the regulation

of body temperature and is an excretory and absorbing organ. It consists of a layer

of vascular tissue, named the derma, corium or cutis vera, and an external covering

of epithelium, called the epidermis or cuticle. On the surface of the derma are the

sensitive papillae and embedded within the derma or just beneath it are certain organs

with special functions, namely, the sweat-glands, hair-follicles, and sebaceous glands.

The epidermis is non-vascular and consists of stratified epithelium that is molded

on top of the papillary layer of the derma forming a defensive covering and limiting

the evaporation of water. The derma is tough, flexible and highly elastic in order to defend the parts beneath. Figure A.2 shows the different layers and parts of the skin.

Figure A.2: A cross section of the skin, showing the hair, epidermis, derma, papillae, sweat glands and their ducts, hair follicles, sebaceous glands, fibrous tissue, fat cells, and subcutaneous tissue.

In developing a finite element model of the skin, described in section A.3.2,

Larrabee [Lar86b] goes into detail about the structure and behavior of skin. Skin is an anisotropic complex living organ that exhibits nonlinear stress-strain properties and viscoelastic or time-dependent behavior. An elastic object can be modeled by

Hooke’s Law (spring) and viscosity can be modeled by Newton’s Law (dashpot), while a viscoelastic material can be modeled as a combination of the two.
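As a standard illustration of such a combination (this particular form is a textbook example, not taken from Larrabee's formulation), a Kelvin-Voigt element places the spring and dashpot in parallel:

    \sigma(t) = E\,\varepsilon(t) + \eta\,\frac{d\varepsilon(t)}{dt}

where \sigma is the stress, \varepsilon is the strain, E is the elastic modulus of the spring, and \eta is the viscosity of the dashpot. Placing the two elements in series instead yields a Maxwell element, and networks of such units can approximate viscoelastic effects such as creep and stress relaxation.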

Skin gets its mechanical behavior from elastin (4% dry weight) fibers, collagen

(72%) fibers, and ground substance (20%). Elastin is the only known mammalian protein to have truly elastic properties. Elastin exists as a composite assembly of small tapering rope-like fibers, about 1.5nm in width, that form a network. They

are paired structures that are periodically linked longitudinally. Collagen is wavy

and random while relaxed and, as it becomes strained, it orients along the strain direction. The amorphous matrix or ground substance in which collagen and elastin

fibers are embedded seem to affect the time-dependent behavior of skin (creep, stress relaxation, hysteresis) and acts as a cementing connective tissue.

Initially, as strain increases, the stress stays at zero due to deformation of the delicate elastic network. As the random collagen fibers are oriented, the stress increases slowly. Then, once they align, the stress rises quickly and almost linearly. Skin behaves differently by direction, corresponding to Langer's lines, in a highly nonlinear fashion.

Properties of skin such as hysteresis, creep and stress relaxation can be found in

Larrabee [Lar86c].

A.2.2 The Skull

The skull consists of the cranium, which houses the brain, and the skeleton of the face. Except for the mandible and the auditory bones, the bones of the skull are connected in specialized joints known as sutures. The mandible is the only movable part of the skull. For facial animation, the names of the individual bones are only important when discussing the bony attachments of the muscles. Table A.1 lists the bones of the skull as well as the hyoid. The hyoid bone, which is located in the neck

separate from the skull, can be considered the skeleton of the tongue and therefore included with the bones of the face.

Ethmoid: Forms part of the cranial base and part of the skeleton surrounding the nasal cavities.
Frontal: Forms the anterior portion of the cranium and the vaults of the orbital cavities.
Hyoid: Horseshoe-shaped bone in the neck above the larynx.
Lacrimal: Located on the medial surface of the orbital cavity between the maxilla and ethmoid bones. Paired.
Mandible: Forms the lower jaw.
Maxillary: Forms the upper jaw. Paired.
Nasal: Forms the bridge of the nose between the orbital cavities. Paired.
Parietal: Forms the vault and part of the side wall of the cranium. Paired.
Occipital: Forms the posterior portion of the cranium and part of the posterior cranial floor.
Palatine: An "L"-shaped bone which forms the posterior one fourth of the hard palate and part of the lateral wall of the nasal cavity posteriorly. Paired.
Sphenoid: Forms part of the base and lateral walls of the cranium and the vault of the pharynx.
Temporal: Forms part of the cranial wall and base and houses the middle and inner ear. Paired.
Vomer: Flat, plowshare shaped bone that forms the posterior and inferior part of the nasal septum.
Zygomatic: Forms the bony prominence of the cheek and part of the lateral wall and floor of the orbital cavity. Paired.
Inferior conchal: Extends horizontally along the lateral wall of the nasal cavity inferior to the ethmoid bone. Paired.

Table A.1: Bones that comprise the skull.

The temporomandibular joint

The temporomandibular joint is the only movable joint of the skull and allows the mandible (jaw bone) to move. Movement of the mandible allows the mouth to open for the intake of sustenance, mastication (the mechanical division of food), and speech.

The temporomandibular joint is a diarthrodial ginglymous [NS62] (sliding hinge) joint that allows the mandible a large scope of movements. The mandible acts mostly like a hinge, with two separate joints acting together, and each of the joints having a compound articulation. The first joint is between the condyle and the interarticular

fibro-cartilage while the second is between the fibro-cartilage and the glenoid fossa.

The condyle of the mandible is nested in the mandibular (or glenoid) fossa of the temporal bone as seen in Figure A.3. The upper part of the joint enables a sliding movement of the condyle and the articular disk, moving together against the articular eminence. In the lower part of the joint, the head of the condyle rotates beneath the under-surface of the articular disk in a hinge action between the disk and the condyle.

The mandible can be thought of as a joint with three degrees of freedom, with the jaw moving in (retraction or retrusion) and out (protraction or protrusion); side-to- side (lateral movements); and open and closed. During opening, the mandible rotates downward in a hinge action as the condyles move forward and does the reverse during closing. Figure A.4 shows how the condyle rotates and slides with the bar representing an up axis and the sphere representing the point of rotation. In figure A.4c it can be seen how the condyle has both translated down and forward as well as rotated.
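A highly simplified way to express this sliding-hinge behavior in code is sketched below: a single opening parameter drives both a translation of the condyle (forward and downward along the eminence) and a hinge rotation about it. The path lengths, the maximum rotation, and the linear coupling are invented placeholders assumed purely for illustration; they are not measurements from the model or from anatomy.

    // Simplified sketch of the sliding-hinge jaw: one opening parameter
    // drives condyle translation and hinge rotation. Constants are
    // invented placeholders, not measured values.
    struct JawPose {
        double slideForward;   // condyle translation, forward (cm)
        double slideDown;      // condyle translation, downward (cm)
        double hingeRadians;   // rotation of the mandible about the condyle
    };

    JawPose jawOpening(double open) {   // open in [0, 1]
        const double kMaxSlideForward = 0.8;                      // assumed
        const double kMaxSlideDown    = 0.3;                      // assumed
        const double kMaxRotation     = 25.0 * 3.14159265358979 / 180.0;
        JawPose pose;
        pose.slideForward = open * kMaxSlideForward;   // glide along eminence
        pose.slideDown    = open * kMaxSlideDown;
        pose.hingeRadians = open * kMaxRotation;       // hinge about condyle
        return pose;
    }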

During protrusion, the condyles slide forward on the articular eminences with the teeth remaining in gliding contact. For retrusion, both condyles move rearward to settle in the mandibular fossa with the teeth remaining in gliding contact. Lateral movement is achieved by fixing one condyle and drawing the other condyle forward.

Figure A.3: The temporomandibular joint a) shown on our 3D geometry of the skull and b) a cross-sectional drawing of the joint showing the bony parts of the joint (glenoid fossa, articular eminence, interarticular fibro-cartilage, and condyle) as corresponding to a). The mandible is shaded differently from the skull and the condyle on the right side of the skull can also be seen.

Figure A.4: The temporomandibular joint and its center of rotation are shown a) in the rest position, b) about half open and c) at full opening of the mouth. The joint is a sliding hinge joint with the condyle of the mandible rotating as it slides from the glenoid fossa to the articular eminence.

Hyoid

The hyoid bone is considered the skeleton of the tongue. It has about 30 muscle attachments and attaches to the styloid process of the temporal bone by ligaments.

The teeth

The 32 permanent teeth are firmly anchored to either the maxillary bones or the mandible in sockets called alveoli. The upper teeth make up the maxillary arch and the mandibular arch contains the lower teeth. The biting surfaces do not conform to a flat plane. Most often the maxillary occlusal plane, as seen in profile, inclines slightly downward from the canine to the first molar and then rises again in the molar region. The resulting curve, which is slightly convex for the maxillary teeth and concave in the lower arch, is known as the curve of Spee. In the coronal plane, the posterior teeth are arranged so that the upper molars are tipped somewhat buccally and the lowers somewhat lingually; this is known as the curve of Monson. Normally in occlusion (contact between upper and lower teeth), the upper incisors cover the incisal one-third of the lower incisors (middle four teeth).

A.2.3 The Tongue

The tongue consists of symmetric halves separated from each other in the mid line by a fibrous septum. Each half is composed of muscle fibers arranged in various directions with interposed fat and is supplied with vessels and nerves. The complex

arrangement and direction of the muscular fibers give the tongue the ability to assume the various shapes necessary for the enunciation of speech [Gra77].

The muscles of the tongue are broken into two types: intrinsic and extrinsic.

The extrinsic muscles have external origins and place the tongue within the oral

cavity. The intrinsic muscles are contained entirely within the tongue with the purpose

of shaping the tongue itself. Extrinsic muscle fibers interweave with intrinsic and

intrinsic interweave with each other [BM84, Gra77].

A.2.4 The Lips

The lips, composed of muscles, nerves, vessels, areolar tissue, fat and glands, are

covered by skin on the outside and by mucous membrane on the inside. The lips,

labium superius oris and labium inferius oris (upper and lower lips respectively)

create a highly variable valve for the anterior portion of the mouth. The lips assist in

the intake of solid and liquid foods into the oral cavity and prevent the contents from

escaping during mastication and swallowing. The lips also aid in the production of

speech particularly in the consonants b, p, f, v, m, and w, known as labial sounds.

In the red or vermillion zone of the lips (a characteristic feature of mankind

only), the skin (epithelium) covering the lips is thin with a rich vascular supply,

which provides the red color. The point of contact of the lips during closure is known

as the rima oris. The lateral margins of the mouth make up the angle of the mouth

or angularis oris. A slight prominence at the midsection of the upper lip, the tubercle,

appears in most subjects. The shallow depression extending from the upper lip to the

nose is the philtrum, which along with the rounded parts of the lips forms Cupid’s

bow.

In youth, there is no lateral boundary, but with aging, a furrow beginning at or close to the mouth corner appears. This groove, the labiomarginal sulcus, runs in a posteriorly convex arch toward the lower border of the mandible. The nasolabial groove, present in almost all persons, begins at the wing of the nose and courses downward and laterally to pass at some distance from the mouth corner forming an upper border separating the cheek from the upper lip.

The point of the chin is known as the mentum and the sharp, deep groove that sep- arates the lower lip and the chin is the labiomental groove with its depth determined by age, fullness of lower lip and prominence of the bony and soft chin.

The lip coverings are tightly bound to the connective tissue covering the muscle

fibers causing the skin and mucous membrane to closely follow movements of the muscle. The skin of the face varies with sex, the male having the thicker and firmer skin. The chin does not move when the lower lip is depressed in most males, and the female is likely to show more of the upper teeth and gum during smiling.

A.2.5 The Cheeks

The cheeks, or buccae, serve as the lateral boundary of the oral cavity. Several muscles that act upon the lips at the angle of the mouth are just under the skin of the cheeks. The buccinator, which forms the mobile part of the cheeks, is made up of a deeper layer of muscle, with attachments to both upper and lower jawbones.

The buccinator draws the cheek in during contraction. The variable buccal fat pad of

Bichat, also called the suckling pad, is found in the cheeks giving it a fleshy feel.

A.2.6 Vocal Tract

Speech is produced using the vocal tract, which is made up of organs of the body whose functions include a number of biological roles. That is, there is no speaking

apparatus apart from the structures used for the intake of air and food. Speech production can be broken into four overlapping aspects which are:

• an energy source: expiratory air;

• a vibrator: the vocal folds;

• resonators: oral, nasal, and pharyngeal cavities;

• modifiers: the articulators: the lips, tongue and soft palate;

The expiratory muscles send the inhaled air in the lungs under pressure up the windpipe to operate the vocal folds in a vibratory pattern for sound production. The

sound stream (or air stream for whisper-like sounds) leaves the larynx and resonates

in the cavities of the throat, oral and nasal chambers. The articulators of the oral

cavity halt, constrict and redirect the sound stream to form chunks of sound that can

be meaningfully interpreted. Figure A.5 shows a schematic of the vocal tract cavities.

A.2.7 Muscles

A brief description of the muscles involved in facial animation is given here. Only

the muscles of the face and those of the vocal tract are considered. There is no clear

consensus among the different anatomy texts on the names of the muscles and on the

number of muscles. The muscle fibers can intermingle so much that the actions of the fibers become confusing when trying to determine whether fibers belong to one muscle or another, or to a part of a muscle or a different muscle altogether. For example, the

orbicularis oris, which is made entirely of fibers from the other muscles that insert into the lips, has a clearly defined action separate from the primary purpose of the individual muscles.

Figure A.5: A schematic of the vocal tract, which is made up of a series of valves and cavities (the nasal cavity, nostrils, soft palate, oral cavity, lips, tongue, pharynx, glottis, vocal folds, and trachea) to produce distinct sounds for communication.

The mandible, or lower jaw, is the only part of the skull that moves. Movement of the mandible allows access to the oral cavity for the intake of food and air, mastication and speech. Table A.2 lists the muscles that directly move the mandible.

In the cranial region the skin is the thickest of anywhere on the body with dense location of hair-follicles. The occipito-frontalis muscle covers the top of the skull and consists of two muscular slips with a tendinous aponeurosis between the two. The occipital portion is called the occipitalis and the frontal portion is called the frontalis.

The ears in man are almost immovable and the three muscles surrounding them are rudimentary. Table A.3 lists the muscles of the scalp and describes their actions.

Digastric: Aids in lowering the mandible, raises and moves the hyoid bone forward or posteriorly, and indirectly raises the larynx.
Geniohyoid: Aids in lowering the mandible and draws the hyoid bone forward and slightly upward.
Mylohyoid: Aids in lowering the mandible and draws the hyoid bone forward and upward.
Masseter: Raises and may retract the mandible.
Platysma: Assists in lowering the mandible.
Lateral pterygoid: Lowers and protrudes the mandible by pulling the condyles forward with lateral movements by alternate contraction.
Medial pterygoid: Raises and protrudes the mandible with lateral movements by unilateral contraction.
Temporalis: Raises and retracts the mandible.
Stylohyoid: Draws the hyoid bone up and back.

Table A.2: The muscles of the mandible and their actions.

Occipitalis: Moves the scalp backward and works with frontalis by fixating its tendinous aponeurosis.
Frontalis: Raises the eyebrows and wrinkles the forehead.
Auricularis anterior: Draws the ear slightly forward and upward.
Auricularis superior: Slightly raises the ear.
Auricularis posterior: Moves the ear feebly backward.

Table A.3: The muscles of the scalp and their actions.

Orbicularis oculi: Closes the eyelid and tightens the skin of the forehead.
Levator palpebrae: Raises the upper lid.
Corrugator supercilii: Pulls the eyebrow medially.
Tensor tarsi: Draws the eyelids inward and compresses the lachrymal sac.
Superior rectus: Turns the globe of the eye upward.
Inferior rectus: Turns the globe of the eye downward.
Internal rectus: Turns the globe of the eye inward.
External rectus: Turns the globe of the eye outward.
Superior oblique: Aids in upward eye movement by correcting rotation.
Inferior oblique: Aids in downward eye movement by correcting rotation.

Table A.4: The muscles of the eyes and their actions.

The palpebral (eyelid) and orbital (cavity containing eyeball) regions contain muscles that aid in the use of the eyes. The eyelids protect and clean the eyes while the orbits protect and contain the eyes along with muscles to rotate the eyes. Table A.4 lists the muscles of the eyes and describes their actions. The eyelids also play a part in displaying emotion.

The muscles of the nasal region aid the nose in the intake of air and for input to the sense of smell. Movements of the nose also play a part in displaying emotion.

Table A.5 describes the muscles of the nose and their actions.

The lip coverings are tightly bound to the connective tissues covering the orbicularis oris, which is the primary muscle of the lips, causing the lips to closely follow the movement of the muscles. The fibers of the orbicularis oris are made up of fibers from the other muscles inserted into the lips as well as fibers proper to the lips themselves.

The muscles of the mouth are described in Table A.6. This area is often referred to as the maxillary (or jaw) region.

Compressor naris: Constricts the nostrils.
Dilator naris: Dilates the nostril.
Levator labii superioris alaeque nasi: Elevates the upper lip and the wing (ala) of the nostril, also aids in forming the nasolabial groove.
Procerus: Draws the inner angle of the eyebrows downward and produces wrinkles over the bridge of the nose.
Depressor alae nasi: Draws the ala of the nose downward.

Table A.5: The muscles of the nose and their actions.

Buccinator: Compresses the cheek against the teeth and retracts the corner of the mouth.
Depressor anguli oris: Draws the corner of the mouth downward and medialward.
Depressor labii inferioris: Depresses the lower lip.
Incisive inferior: Pulls the lower lip in towards the teeth.
Incisive superior: Pulls the upper lip in towards the teeth.
Levator anguli oris: Moves the corner of the mouth up and medially.
Levator labii superioris: Raises the upper lip and carries it a little forward.
Levator labii superioris alaeque nasi: Elevates the upper lip and the wing (ala) of the nostril while aiding in forming the nasolabial groove.
Mentalis: Raises and protrudes the lower lip.
Orbicularis oris: Closes the lips, compresses the lips against the teeth, and protrudes the lips.
Platysma: Pulls the corner of the mouth down and back.
Risorius: Pulls the corner of the mouth back.
Zygomaticus major: Draws the corner of the mouth laterally and upward.
Zygomaticus minor: Draws the outer part of the upper lip upward, laterally and outward.

Table A.6: The muscles of the mouth and their actions.

Extrinsic Muscles
Styloglossus: Retracts and elevates the tongue.
Hyoglossus: Depresses and retracts the tongue.
Genioglossus: Protrudes the tongue, raises and protracts the hyoid bone.
Palatoglossus: Elevates the tongue and narrows the fauces.

Intrinsic Muscles
Superior longitudinal: Shortens and curls the tip of the tongue upward.
Inferior longitudinal: Shortens and curls the tip of the tongue downward.
Transverse: Elongates, narrows and raises the sides of the tongue.
Vertical: Flattens and broadens the tongue.

Table A.7: The muscles of the tongue and their actions.

The tongue is a muscular organ of the special sense of taste situated in the floor of the mouth. The base or root of the tongue is directed backward and connected via different tissues to the hyoid, the epiglottis, the soft palate and the pharynx. The tip of the tongue is directed forward against the lower incisor teeth. The tip of the tongue, its sides, dorsum and part of the under surface are free. There are two types of muscles, discussed in table A.7, associated with the tongue: extrinsic and intrinsic.

The extrinsic muscles originate outside of the tongue and position the tongue within the oral cavity. The intrinsic muscles make up the tongue itself and shape the tongue surface. The musculature of the tongue creates a highly variable surface that can shape the vocal tract to aid in the production of sounds. With the tongue connected directly to the hyoid, when the hyoid is moved the tongue also moves, so muscles of the hyoid should also be considered with the tongue.

The palate forms the roof of the mouth and has two parts, the hard palate in front and the soft palate behind. The hard palate is covered by the mucous membrane.

Tensor palatini: Tenses the soft palate.
Levator palatini: Raises the soft palate.
Uvulus: Shortens the uvula.
Palatoglossus: Elevates the tongue and narrows the palatine arches.
Palatopharyngeus: Constricts the palatopharyngeal folds.

Table A.8: The muscles of the soft palate and their actions.

Superior constrictor: Constricts the pharynx and aids in velopharyngeal closure.
Middle constrictor: Constricts the pharynx.
Inferior constrictor: Constricts the pharynx.
Stylopharyngeus: Raises and opens the pharynx.
Salpingopharyngeus: Opens the auditory tube and raises the oropharynx and nasopharynx.

Table A.9: The muscles of the pharynx and their actions.

The soft palate is a movable fold suspended from the posterior border of the hard palate that forms an incomplete septum (dividing wall) between the mouth and the pharynx. The lower border is free and in the middle is a conical-shaped projection called the uvula. The soft palate aids in sucking and swallowing and in producing speech. It plays a vital role in the plosives, /p/, /d/, /t/, etc., especially during loud speech where high pressure is required. The muscles of the soft palate are itemized in table A.8.

A.2.8 Articulation of Sound

The articulation of sound involves the soft palate (velum) and pharynx in velopharyngeal closure and movements of the tongue, mandible, and lips. All of the musculature of these parts is also involved in chewing and swallowing.

The pharynx is a muscular tube incomplete in the front where it is continuous with the openings into the nasal, oral, and laryngeal cavities. It extends from the base of the sphenoid bone above to the entrance into the esophagus at the level of the cricoid cartilage below and is wider above than below.

The soft palate is composed of muscle, connective, and gland tissues. It is attached to the posterior rim of the hard palate anteriorly, to the side walls of the pharynx and oral cavity laterally, and hangs freely into the pharynx behind the tongue posteriorly.

Velopharyngeal closure for speech involves the upward and backward movement of the soft palate against the posterior pharyngeal wall about the level of the anterior tubercle of the atlas. Medial movement of the lateral pharyngeal walls against the sides of the soft palate completes the velopharyngeal closure.

The larynx

The larynx has two principal functions: 1) voice production and 2) acting as a sphincter to prevent the entrance of foreign material into the lungs and to permit the build up of intrathoracic pressure for such activities as coughing, vomiting, urination, and defecation. Intrathoracic pressure is also necessary to bring the full force of contraction to bear on the thoracic muscles of the arms and shoulders for strenuous activity such as lifting.

A.3 Modeling

The techniques used to model a human head in facial animation are similar to

techniques in other areas of computer animation. A major difference in facial animation is that the face is highly deformable and even the smallest of these deformations can convey information. For instance, the slight movement of the lower eyelid can be the difference between truth and deceit [Ekm85].

The model must describe the geometry of the face to be animated. That is, if the goal is to animate a talking head of Napoleon Bonaparte, then the model should look like Napoleon, both in its physical geometry and in its texturing. Texture maps can create the correct colors and can hide incorrect geometry. But, of course, the more accurate the geometry, the better the results. In addition to a similar appearance, the head should act like Napoleon. That is, the model should deform in the same manner that Napoleon's real head would deform.

In this section, approaches to creating a facial model are described. First, the different model types will be discussed. Then, geometry acquisition and the use and acquisition of textures are described. Some of the better-known facial models will be presented, and finally the modeling of specific parts of the face is detailed separately.

A.3.1 Geometric Representations

Facial animation models vary widely from simple geometry to anatomy-based, with the complexity generally based on the intended use. When deciding on the construction of the model, important factors are: geometric data acquisition, animation data acquisition, animation methods, rendering quality desired, etc. This section describes the different geometric representations used for facial models. It is possible for more than one type of representation to be used by a model. It is common for the geometry of parts of the model, such as the tongue, eyes, or lips, to be represented differently than the rest of the face.

Polygonal

In facial animation, as in most computer graphics, polygons are by far the most common choice for representing the geometry of the model because they are easy to create, use, deform, and render in hardware. Acquisition of the geometry usually involves sampling the surface in a regular grid, which is trivially represented as a polygonal mesh. The surface can also be sampled randomly, which makes producing a polygonal mesh more difficult, although robust methods exist. Polygons are easy and quick to render, easy to modify with existing editors, and easy to represent. However, polygonal meshes suffer visually, particularly at the silhouette.

Spline

Spline representations create smooth surfaces for visually appealing images. They are often used in facial modeling and animation in various forms such as B-splines [DMS98, HFG94, NHS88, NHRD90, SVG95, Wai89], triangular B-splines [EG98], Bezier splines [Ree90, Tao97], Coons patches [Wil90b] and hierarchical B-splines [PB95, WF94]. Splines provide local control over the surface, allowing local deformations with changes in a control point. However, shaping the local surface in an exact manner is a complex problem. Other drawbacks of splines are that they are difficult to create, to edit, and to fit to (interpolate) a set of points. Spline surfaces are best kept at genus 0, so the holes for the mouth, nostrils, eyes and ears present problems. To model the holes, the surface can be pushed inward to give the perception of a hole at the cost of extra complexity, or alternatively, trim curves can be used to cut the surface.

Another serious drawback of spline surfaces is that local detail requires additional

global complexity. This problem can be overcome with hierarchical B-Splines [FB88],

which allow complexity to be added locally without adding global complexity. Hier-

archical splines have been used by Wang and Forsey [WF94] and Provine and Burton

[PB95] for facial animation.

Subdivision

Subdivision surfaces are a type of smooth surface. They have the advantage of

being able to create local complexity without global complexity, and make it easy for

a modeler to sculpt a new model. However, they are difficult to interpolate to specific

data. Subdivision surfaces have been used successfully by Elson [Els90a] [Els90b] and

also by Pixar [DKT98] for the animated short Geri’s Game.

Implicit

A surface can be defined as all the points that have the same value of an implicit

formula. An example function is the distance from a point c, which defines a sphere for all points of a distance r from c. Implicit primitives can be added together in a well defined manner to create more complex objects. Implicit surfaces are easy to evaluate and detecting collisions is easy. However, rendering is extremely slow and modeling objects with sharp edges, surface detail, and fine deformations is difficult.

These drawbacks keep implicit objects from being used widely in facial animation.

However, for situations where photorealism is not required, such as cartoon characters and caricatures, implicit functions work well.

Muraki [Mur91] presents a method to fit a blobby model to range data by min- imizing an energy function that measures the difference between the isosurface and the range data. Examples are given using range data of human faces. By splitting primitives, the fit is refined until a global tolerance is reached.

Implicit surfaces can be successfully used for parts of the face and for certain functions such as collision detection and avoidance. Pelachaud [PvOS94] describes a method of modeling and animating the tongue using soft objects, taking into account volume preservation and penetration issues. Guiard-Marigny [GMTA+96] use point primitives for fast collision detection and use contact surfaces for modeling lips. We use an implicit function for collision avoidance between the skin and skull described in Section 2.2.1.
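As an illustration of that last use, the minimal Python sketch below (not the actual function of Section 2.2.1) keeps a skin vertex outside a spherical stand-in for the skull by evaluating an implicit distance field and projecting penetrating points back to the isosurface; the sphere, its radius, and the test point are all invented for the example.

    # Illustrative sketch only: an implicit field f(p) = ||p - c|| - r is
    # negative inside a spherical "skull" proxy; penetrating skin vertices
    # are pushed back out along the radial gradient.
    import math

    def sphere_field(p, center, radius):
        d = math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, center)))
        return d - radius, d

    def push_outside(p, center, radius):
        """If p has penetrated the implicit surface, project it back to it."""
        f, d = sphere_field(p, center, radius)
        if f >= 0.0 or d == 0.0:
            return p                      # already outside (or exactly at the center)
        scale = radius / d                # move radially out to the isosurface
        return tuple(ci + (pi - ci) * scale for pi, ci in zip(p, center))

    skull_center, skull_radius = (0.0, 0.0, 0.0), 1.0
    print(push_outside((0.3, 0.4, 0.0), skull_center, skull_radius))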

A.3.2 Instance Definition

The geometric representation has a space of possible face shapes. This space is generally very large. For instance, a polygonal model with $N$ vertices in $\mathbb{R}^3$ has a $3N$-dimensional shape-space. Often, a method of defining an instance of the shape-space that is easier than specifying all $3N$ degrees of freedom is used. This section describes such instance definitions.

Parameterized

The concept of parameterizing is to define a k-dimensional space with a set of k parameters where k is usually small. For example, a facial mesh with n vertices in $\mathbb{R}^3$ has 3n degrees of freedom, which is more than we wish to define. Defining a parameterization where k is much smaller than 3n allows the shape of the mesh to be specified by far fewer numbers. A facial model can have one parameterization that defines the shape of the model and another that defines the motion of the model. The difference is subtle since both define shape in some sense. Parke [Par74] calls them conformation parameters for the shape and expression parameters for the deformations. Generally, the shape parameters differ between individuals and remain constant for that character over time. Deformation parameters change over time, specifying some behavior, such as a smile, and they may be the same for different characters.

Parameterizing the deformations is a very common animation method and will

be discussed in Section A.4. Parameterizing the shape of the head owes its popularity to Parke [Par74], who did the seminal work in 3D computer facial animation.

Parke created a set of parameters empirically from traditional hand drawn animation methods to define both shape and deformations. The Parke model will be described more fully in Section A.3.5. Parameterization of shape is done in combination with a

model type such as polygonal or B-spline. The parameterization defines a mapping

between the reduced set of parameters and the underlying surface definition.
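As a deliberately simplified illustration of such a mapping, the sketch below turns a few hypothetical conformation parameters into vertex positions by scaling a toy base mesh; the parameter names and the geometry are invented for the example and are not Parke's actual parameter set.

    # Minimal sketch of a shape parameterization: a few conformation parameters
    # (hypothetical names) map to the full 3n vertex coordinates of a base mesh.
    base_vertices = [(0.0, 1.0, 0.2), (0.5, 0.8, 0.1), (-0.5, 0.8, 0.1)]  # toy "face" data

    def instantiate(base, params):
        """Map k parameters to a concrete mesh instance (3n numbers)."""
        sx = params.get("face_width_scale", 1.0)
        sy = params.get("face_height_scale", 1.0)
        nose_dz = params.get("nose_depth_offset", 0.0)
        out = []
        for i, (x, y, z) in enumerate(base):
            # pretend vertex 0 belongs to the nose region in this toy mesh
            dz = nose_dz if i == 0 else 0.0
            out.append((x * sx, y * sy, z + dz))
        return out

    print(instantiate(base_vertices, {"face_width_scale": 1.1, "nose_depth_offset": 0.05}))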

Using a parameterization to define shapes has numerous benefits, including compression, ease in defining a character, intuitive definition of a character, straightforward morphing between characters, ease in making small changes in shape features, etc. The major drawback of defining the shape of a character's head with parameters is that the number of possible face shapes is limited.

Another benefit of using parameters for defining the shape is that a space of feasible face shapes can be defined, allowing a computer to quickly generate numerous plausible shapes. DeCarlo et al. [DMS98] use such a concept to modify a prototype shape to generate new models (different facial shapes). The new models are created with variational techniques and are constrained by facial anthropometry so the resultant geometry is only influenced by the prototype. The underlying model is a B-spline surface, with each face defined by anthropometric parameters. This technique is used to create numerous plausible facial shapes, not to generate a specific face.

Point-Mass Systems

In physics-based models the model is treated like a point-mass system and motion

is calculated using Newtonian physics or F = ma. Often, spring forces are used defining the skin surface as a mesh of points connected with springs [KGC+96, KGB98,

KMCT96, Lar86d, LTW93, Lee93, LTW95, PVY93, SMP81, Pie89, RCI91, TW90,

VY92, Wai89, Wat87, Wat89, WT91, WMTT95]. The representation can be given

volume by connecting the surface to underlying soft tissue and bone. The surface is

then moved by applying forces resulting from muscle contraction. The muscles can

be defined as spring forces as well or as constraints on node positions. The resulting

differential equations are solved using some form of integration, such as fourth-order

Runge-Kutta [PTVF92].
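The sketch below illustrates the general scheme with a two-node spring patch advanced by explicit Euler integration; the cited systems typically use more accurate integrators such as fourth-order Runge-Kutta, and all constants here are toy values.

    # Sketch of a point-mass/spring skin patch advanced with explicit Euler.
    # The node positions, masses, stiffness, and damping are made-up numbers.
    import math

    nodes = [{"x": [0.0, 0.0], "v": [0.0, 0.0], "m": 1.0},
             {"x": [1.2, 0.0], "v": [0.0, 0.0], "m": 1.0}]
    springs = [(0, 1, 1.0, 50.0)]          # (i, j, rest length, stiffness)
    damping, dt = 2.0, 0.01

    def step():
        forces = [[0.0, 0.0] for _ in nodes]
        for i, j, rest, k in springs:
            d = [nodes[j]["x"][a] - nodes[i]["x"][a] for a in range(2)]
            length = math.hypot(*d)
            f = k * (length - rest)        # Hooke's law along the spring
            for a in range(2):
                forces[i][a] += f * d[a] / length
                forces[j][a] -= f * d[a] / length
        for n, f in zip(nodes, forces):
            for a in range(2):
                acc = (f[a] - damping * n["v"][a]) / n["m"]   # F = ma with damping
                n["v"][a] += acc * dt
                n["x"][a] += n["v"][a] * dt

    for _ in range(100):
        step()
    print(nodes[0]["x"], nodes[1]["x"])    # the two masses relax toward the rest length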

Platt and Badler [SMP81] describe work on a system designed to recognize and

generate American Sign Language. Point masses are connected via springs while

muscles apply forces to the points to achieve deformations. They use a very simple

physics model due to the lack of computer processing power. This system is described

in more detail in Section A.3.5.

Waters [Wat87] describes a system to simulate muscles, particularly muscles of

the face, by giving each muscle a zone of influence. Waters and Terzopoulos [Wat89,

TW90] create a better model of the skin for more realistic deformations. Lee et al.

[LTW93, Lee93, LTW95] make further modifications to the model. The Waters model is described in detail in Section A.3.5.

Finite Elements

The finite element method [HO79, OH77, OH80a, OH80b, Seg84] can be used for accurate simulation of the propagation of forces through an object. However,

finite element methods are complex to use and expensive to calculate. For facial animation, where very small deformations are important, the finite element mesh must be very fine causing both increased computation time and difficulty in defining the mesh. Although it has not been a popular method for facial animation, it has been used [Gue89a]. Finite element methods have more utility in applications where very high accuracy of the tissue movement is required such as surgery simulation [Den88,

KGC+96, KGB98, RGTC98], simulation of the tongue [WT95, WTP97], simulation of the TMJ [DGZ+96], and simulation of skin closure [Lar86a].

Larrabee [Lar86d] develops a finite element model of skin deformation that ignores extreme stress and strain ranges and also ignores viscoelastic properties in initial wound closure. Skin is assumed isotropic with zero tension and modeled as a 2D system utilizing only a few elastic constants that can be estimated in vivo. The skin is idealized as an elastic membrane with nodes at regular intervals and springs are attached to an immobile subsurface representing the subcutaneous attachments.

The stress and strain properties of the skin are related by the standard equations of classical elasticity, which are theoretically accurate only up to 10-15% strain.

Guenter [Gue89a] describes a system for simulating human facial expressions. Guenter creates a mesh of nodes and uses finite element analysis to calculate the displacements of the nodes based on muscle forces.

Research by Wilhelms-Tricarico [WT95, WTP97] on a physiologically based model

of speech production led to the creation of a finite element model of the tongue with 22

elements and 8 muscles. This initial model does not completely simulate the tongue, but it does show the feasibility of the method.

Koch et al. [KGC+96, KGB98] use volume data (Visible Human Project[NLH99])

to create a facial model with a C1 finite element model to describe the facial surface

that is attached to the skull with springs. The skull is either obtained with computed

tomography, along with tissue stiffness values, or a default skull, non-proportionally scaled to fit into the face, and tissue stiffness values are used. To support springs through various tissue types they use springs attached in series. Muscles [LTW95]

are attached to the skull and surface and facial motion is described using FACS.

The effect of the muscles on the finite elements is determined and then stored as

displacement fields that are used in their emotion editor for real-time animation.

Roth et al. [RGTC98] use a finite element approach for volumetric modeling of

soft tissue for accurate facial surgery simulation. They extend linear elasticity to include incompressibility and nonlinear material behavior.

DeVocht et al. [DGZ+96] create a finite element model of the temporomandibular joint (TMJ) using data from the Visible Human Project. The model is used to study the biomechanics of the TMJ.

A.3.3 Geometry Acquisition

This section discusses issues and methods of geometry acquisition, which is how the base shape of the surface is initially defined. The base shape is the initial shape

before deformations are applied to achieve expressions. The geometry is just the surface definition: it must be deformed over time to create animation, it may need to be deformed to look like a particular person or character, it needs animation controls

added to it, and it needs to be rendered. The source of geometry may be the most

important factor in an animation system so it will play an important role in other

aspects of the model. On the other hand, the other aspects of the model may force

the geometry to be acquired in a specific manner.

Hand in hand with the source for the geometry is how the model will be created.

Another issue is how many different characters must be modeled and their importance.

If there are many different characters, the time to create the model must be minimized

so modifying the geometry of a prototype model to fit each new character may be the

way to proceed. Medical applications typically require data for both the surface and

the underlying tissues. This data must be specific to the patient. The most common

methods of creating models are modifying existing models and digitizing geometry; these methods are discussed in the following sections.

Laser Scanners

A popular choice for acquiring geometry is to digitize the geometry of a particular

person or sculpture of a character using a laser scanner. This allows quick creation of

geometry, but all you have is geometry. The rest of the model, such as deformation

control must still be added.

Laser scanners use a laser to calculate distance to the model surface and can create very accurate models for use in facial animation [deG89, Kle89, LTW93, Lee93, Mur91, NHS88, Ree90, WT91, Wil90a]. The scanners sample the surface at regular intervals, and the surface can be reconstructed in many different ways, with polygonal representation the most popular. Another advantage of laser scanners is their ability to capture color information concurrently, resulting in a texture map that matches the geometry well. However, laser scanners are bulky and expensive, the subject must be available to scan, and scan time is measured in seconds, necessitating that the subject remain still for several seconds. Laser scanners scan in a cylindrical or elliptical pattern that works well for the front and back of the head, but fails to capture the chin and the top of the head. They also have problems with surfaces that are not roughly perpendicular to the sensor, such as the sides of the nose and the ears. Another problem for laser scanners is hair, which disperses the laser, giving erroneous data for areas covered in hair such as the top of the head, eyebrows, beards, and mustaches.
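The sketch below illustrates how such a regular cylindrical range grid can be turned directly into a quad mesh; the grid resolution and range values are fabricated for the example.

    # Sketch: turning a cylindrical range scan (radius sampled on a regular
    # angle/height grid, as a laser scanner produces) into a quad mesh.
    # The 3x4 range grid below is fabricated toy data.
    import math

    heights = [0.0, 0.1, 0.2]                     # scan rows (y)
    angles = [0.0, 0.5, 1.0, 1.5]                 # scan columns (radians)
    radius = [[1.00, 1.02, 1.01, 0.99],
              [1.05, 1.04, 1.03, 1.02],
              [1.01, 1.00, 0.98, 0.97]]           # measured distance per sample

    vertices = []
    for i, y in enumerate(heights):
        for j, theta in enumerate(angles):
            r = radius[i][j]
            vertices.append((r * math.cos(theta), y, r * math.sin(theta)))

    quads = []
    cols = len(angles)
    for i in range(len(heights) - 1):
        for j in range(cols - 1):                 # quad from neighbouring samples
            a = i * cols + j
            quads.append((a, a + 1, a + cols + 1, a + cols))

    print(len(vertices), "vertices,", len(quads), "quads")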

Modification of Existing Models

The steps in creating a model of a new character vary greatly depending on the choices made in the model type, how deformations are done, how the geometry is acquired, etc. In general, creating a new character involves acquiring the geometry and defining deformations to that geometry for motion such as expressions, speech, or emotion. The deformations can be based on muscle action, parameters, finite elements, or some other technique. For each new character, the deformations must somehow be mapped onto the new geometry. Generally, this is done by either a human or a human-assisted system. Fully automated methods are very complex to create.

If numerous different characters are required, creating the model for each character

can be an expensive endeavor. For that reason, many researchers instead develop a prototype model with the animation controls built in and then adapt its geometry to match the geometry of the new character [AWS90, AS92, DMS98, EPMT98, LTW93,

Lee93, Pat91, PW91, PALS97, PHL+98, RvBSvdL95, Tao97, MTMdAT89, WT91,

XANK89].

Taking existing characters and making modifications or interpolating between

multiple characters is a common approach. Magnenat-Thalmann et al. [MTMdAT89]

describe methods to create new facial models from existing models with local trans-

formations, interpolation between models, or composition of separate parts. Patel

[Pat91] and Patel and Willis [PW91] allow interpolation between existing models to

create new models. DiPaola [DiP89, DiP91] modified Parke’s model to have more

parameters and to allow for the creation of a larger number of realistic faces as well

as stylistic or cartoon-like faces. DeCarlo et al. [DMS98] modify a prototype head

constrained by facial anthropometry using variational techniques to generate new

models. Escher et al. [EPMT98] describe how to fit a prototype facial model using

Dirichlet free form deformations in MPEG-4.

Using vision to fit a generic model to a new person has also proved successful.

Akimoto et al. [AWS90, AS92] adapt a facial model prototype by extracting facial features from images of a front and side view using image processing techniques.

Reinders et al. [RvBSvdL95] use vision techniques to modify the CANDIDE model to images. Xu et al. [XANK89] describe a system that automatically obtains a

3D model by modifying a base model from two orthogonal images using a priori knowledge, best fit, and interpolation. Pighin et al. [PALS97, PHL+98] develop an

interactive system to select correspondences in images to calculate a rough generic

model fit and an estimate of camera poses. Then a scattered data-fitting algorithm interpolates unconstrained (no correspondence) vertices to complete the fit. This new

fit is then refined by interactively making new correspondences that do not change

the pose and reinterpolating unconstrained vertices until a satisfactory fit is made.

A collection of surface points, such as those collected by a laser scanner can also be

used to fit a generic model. Waters and Terzopoulos [WT91] report on a method to

adapt their mesh to laser scanned data giving a physically-based model of a particular

person with muscles interactively painted on the model. Lee et al. [LTW93, Lee93]

extend this work by developing a predominately automatic method.

Human Modeler

One method of model creation is for a human modeler (artist) to shape a model by

hand. This is useful when the model is of a non-human, a caricature or not a specific

person. Although the artist has the most freedom in this type of system, it requires

the most skill. It is very popular in the entertainment industry where such talent

exists. There are several paradigms for such a system. Hofer [Hof93] describes a

system for creating character facial animation by sculpting. The user interface allows

the model to be modified using free form deformations. Patel [Pat91][PW91] creates

a system that allows a user hierarchical control over the model and its animation.

DiPaola [DiP89] [DiP91] creates new models by modifying parameters of Parke’s

model interactively. Kleiser [Kle89] sculpts in clay then uses a 3D digitizer to get the

geometry.

From Images

Another option is to extract the information from video or photographs. This allows model creation of subjects that are no longer available and has application for video conferencing and compression. The common approach is to use vision tech- niques to create a correspondence between two images. Each image is a projection

of the 3D world onto the 2D image plane and each square pixel corresponds to an infinite pyramid in $\mathbb{R}^3$. If the same point on an object can be detected in more than one image, creating a correspondence, an intersection volume can be calculated. The

more images used and the wider the angle between the images the more accurate

the location (smaller the volume). The most common method is to identify feature

points such as mouth corners, eye corners, nose tip, chin tip, etc. and use those to fit

a generic head model to the input photographs.

This technique requires at least two images and they should be as far apart an-

gularly as possible with the largest possible overlap of the face seen in the different

views. Two orthogonal images maximize the above relationship. Because of the

shape of the face, a front and side view allow for all points on one quarter of the

face to be seen in both images. By assuming symmetry about the midline, the en-

tire front of the head can be extracted. For a full head, a top and back image are

also necessary. Using two orthogonal images is a common method since it is easy to

code and can be achieved with minimal hardware. Akimoto et al. [AWS90, AS92]

adapt a generic facial model by extracting facial features from images of a front and

side view using image processing techniques. They use pixel color clustering to find

large regions (background, face, hair) and match a profile curve to the side view to

get height information. This information is then used to localize a search for facial

features using edge (Sobel) detection. Xu et al. [XANK89] describe a system that automatically obtains a 3D model by modifying a base model from two orthogonal images using a priori knowledge, best fit, and interpolation.
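The sketch below illustrates the underlying geometric idea for the front/side case: assuming the two views are orthogonal, share a scale, and are vertically aligned (assumptions of this toy example, not a full camera model), a feature matched in both images yields a 3D point directly.

    # Sketch of the two-orthogonal-view idea: a feature seen at (x, y) in the
    # front image and (z, y) in the side image yields a 3D point, assuming
    # calibrated, aligned views (an assumption of this toy example).
    def feature_to_3d(front_xy, side_zy):
        (x, y_front), (z, y_side) = front_xy, side_zy
        y = 0.5 * (y_front + y_side)     # the y coordinates should agree; average the noise
        return (x, y, z)

    # hypothetical measurements of the nose tip in both views
    print(feature_to_3d((0.12, 0.45), (0.30, 0.47)))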

As the cost of cameras and computing power decreases, systems using more cam- eras, video sequences, and high-resolution cameras have emerged. Guenter et al.

[GGW+98] determine the geometry and texture from a bank of six cameras. The

Michelangelo [SMC00] project uses numerous high-resolution cameras and vision tech- niques to capture high-resolution geometry at 30Hz.

Reinders et al. [RvBSvdL95] extract a face from a video sequence by adapting a generic 3D facial model to fit the video sequence. The algorithm uses robust automatic localization of semantic features using a knowledge-based selection mechanism. It adapts a generic face model to the extracted facial contours by globally transforming the generic model to account for pose and scale. The mismatch is then refined by applying local adaptation based on graph matching. Their method makes heavy use of heuristics and a priori knowledge of the makeup of a human head.

Ezzat and Poggio [EP96] use an image-based modeling technique to create photo- realistic models of human faces from a set of images of a face without a 3D model.

A learning network is trained to associate the example images with pose and ex- pression parameters. The network can synthesize a novel, intermediate view using a morphing approach, which can model both rigid and non-rigid motion. An analysis- by-synthesis algorithm extracts a set of high-level parameters from an image sequence using embedded image-based models. The model parameters are then perturbed in a local and independent manner for each image to minimize an error metric based on correspondence.

Pighin et al. [PALS97, PHL+98] develop an interactive system to select corre-

spondences in images in order to calculate a rough generic model fit and an estimate

of camera poses. Then a scattered data-fitting algorithm interpolates unconstrained

(no correspondence) vertices to complete the fit. This new fit is then refined by inter-

actively making new correspondences that do not change the pose and reinterpolating

unconstrained vertices until a satisfactory fit is made. Texture maps are created us-

ing a weighted sum of the input images based on visibility, positional certainty, and

closeness to the new viewing direction. Animation is achieved with linear interpolation

of keyframes.

The technique is also common in model-based communication [AHS89, MAH90]

with generally only a rough shape extracted. If parameters for a human head such

as shape and pose can be extracted from a frame of a video sequence, only those

parameters need to be sent instead of the entire image, resulting in high compression

ratios.

Automatic

Sometimes one simply wants to generate realistic geometry of a human head for

crowd scenes, unimportant background characters, or as a starting point for creating

a final character. DeCarlo et al. [DMS98] use facial anthropometric statistics and

proportions to generate constraints on a facial surface. Using variational techniques,

they create realistic facial geometry that fits the constraints from a deformed proto-

type. Their method allows quick creation of realistic facial geometries. However, the

resultant model is influenced by the prototype.

Volumetric Data

Another method is to create a model from computed tomography [Wat92], MRI [Tao97], ultrasound [SL96], or other medical devices. These devices allow the acquisition of the internal structure as well as the surface and are invaluable in medical applications such as plastic or facial surgery. However, they are invasive techniques that cannot be used in the general case. The Visible Human Project [NLH99] is a common source of volumetric data [KGC+96, KGB98] to test such techniques.

Waters and Terzopoulos [Wat92] describe a system that uses computed tomography data to derive a facial model, obtaining a polygonal mesh of the exosurface of the skull and skin by utilizing marching cubes. The model is a spring lattice that uses the Lagrange equations of motion and Euler integration. A single spring, $k$, connects node $i$ to node $j$ with a natural length of $l_k$, a spring stiffness of $c_k$, a node separation (spring length) of $r_k = x_j - x_i$, and a spring deformation of $e_k = \|r_k\| - l_k$. The force, $s_k$, the spring exerts on node $i$ is calculated as:

$$s_k = \frac{c_k e_k}{\|r_k\|} r_k$$

The discrete Lagrange equation of motion is the system of coupled, second-order ordinary differential equations

$$f_i = m_i \frac{d^2 x_i}{dt^2} + \gamma_i \frac{d x_i}{dt} + g_i, \qquad i = 1, \ldots, N$$

where

$$g_i(t) = \sum_{j \in N_i} s_k$$

A muscle layer is constructed equidistant between the skull and skin, with cross struts to avoid shearing, and with a spring constant lower than that of the skin. Contraction of linear muscles [Wat87] attached throughout the model results in animation.

Koch et al. [KGC+96, KGB98] use volume data to determine tissue stiffness values and depth information for their finite element model. Their system is a simulation of facial surgery, which requires an accurate method to obtain the internal structure of the intended patient. Using medical diagnosis equipment gives them the preoperative information they require so that the surgery can be practiced and the results simulated.

Konno et al. [KMCT96] describe a facial surgery simulation system designed to aid in surgery in cases of facial paralysis. They acquire the 3D geometry from CT scans of the patient. And Roth et al. [RGTC98] use a finite element approach for volumetric modeling of soft tissue for accurate facial surgery simulation.

A.3.4 Rendering

Animation achieves apparent motion by making small changes in successive still images. The quality of the individual stills affects the overall quality of the animation.

This section discusses techniques to render human faces. We first discuss rendering techniques for realistic skin. We then discuss the use of texture maps for high quality renderings. And finally, we discuss methods of producing wrinkles.

Rendering Skin

Hanrahan and Krueger [HK93] present a model for the subsurface scattering of light in layered surfaces using one-dimensional transport theory. This method replaces

Lambertian diffuse reflection with a more accurate model of diffuse reflection that considers how the light interacts with different layers. This method works well for human skin, which is a multi-layered tissue.

Wu et al. [WKMT96] develop a technique to simulate static and dynamic wrinkles

on a skin surface and also to render the skin. They model the skin with both a micro

and macro structure. At the micro level, they create a tile texture pattern using

a planar Delaunay triangulation with a hierarchical structure and render the micro

furrows using bump mapping. At the macro level, they model wrinkles, which will

be discussed in more detail later.

A method such as that of Hanrahan and Krueger is computationally expensive and complex, and it is difficult to set all the parameters to look like a particular person. The method of Wu et al. can also be expensive. Instead, the common technique of applying texture maps can be used. Texture maps allow the model to be rendered with increased visual realism and can hide many of the inadequacies in the representation of the facial surface. Another important aspect of rendering a human face is wrinkles: wrinkles are not only needed for visual realism, but their appearance also

conveys expression and emotion. Textures and wrinkles will be discussed further in

the following sections.

Textures

Laser scanners are capable of collecting intensity information [NHS88, NHRD90,

WT91] concurrently with distance to an object. This makes them a good choice for

collecting the geometry since texture information can also be obtained concurrently.

Laser scanners produce high-resolution surfaces with high-resolution textures. Parts

of the face that the laser scanner fails to collect geometry for, such as under the chin and

parts of the ear, will also be missing texture information. However, intensity can be

collected for hair even if the depth information is missing.

The same vision techniques used to create the geometry can also be used to col-

lect texture information [AS92, Tao97, XANK89]. However, each image will have different specular lighting since the angle to the light source is different. The ambient

lighting may differ so the texture information must somehow be combined. Assum-

ing symmetry, two views are enough to get a texture map with a technique such as

that by Akimoto et al. [AWS90, AS92]. They use front and side view images to

create a texture map using pixel blending where the textures overlap. More views

can be used for a higher resolution texture, with either multiple cameras or a video

sequence. Lighting effects, model registration, and deformation of the subject must

be considered.

Ishibashi and Kishino [IK91] present an efficient color/texture analysis and syn-

thesis method for model based coding of human images. Using the HSV (hue, sat-

uration, value) color model, the image is broken up into uniform color regions using

first and second order statistics determining the region’s color independently of shape

and shading. The texture and color are then synthesized onto a 3D model of the hu-

man body. The image is divided into high chromatic, low chromatic, and achromatic

regions then subdivided using segmentation methods. The hue is corrected in the

low chromatic regions then joined with the high chromatic regions and subdivided

by thresholding the hue. The achromatic regions are subdivided based on brightness.

Color is represented by hue and saturation while texture is represented by distribu-

tions of value intensity. The intensity of the value component comes from texture

and shading. The shading is removed by correcting the value component of pixel $i$ in region $r$ ($V_{ri}$): $V_{ri}' = \bar{V}_r + (V_{ri} - \bar{V}_{riw})$, where $\bar{V}_r$ is the average intensity in region $r$ and $\bar{V}_{riw}$ is the average intensity of the $w \times w$ pixels including pixel $i$.

Pighin et al. [PALS97, PHL+98] use input images to interactively fit a generic model and estimate camera poses. Texture maps are created using a weighted sum of the input images based on visibility, positional certainty, and closeness to the new viewing direction. Textures are extracted from the images by combining the input images into one texture image. A cylindrical projection of the face establishes a mapping of the face to a 2D texture, giving a texture coordinate for each 3D point. For each point $p = (x, y, z)$, they calculate the image coordinate $(x_j, y_j)$ in image $j$ of $p$, and extract the color $I_j(x_j, y_j)$. The texture color at $(u, v)$ is the sum of all $w_j(u, v) I_j(x_j, y_j)$, where $w_j$ is a weight of the image at that location based on visibility, positional certainty (surface normal at $p$ dotted with the direction to the camera), and similarity between the camera direction and the direction of the new view.
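The sketch below illustrates the per-texel weighted blend in simplified form; the sample colors and weights are invented, and normalizing by the weight sum is an assumption of the sketch rather than a detail taken from the paper.

    # Sketch of blending per-image samples into one texel via a weighted sum.
    # Weights would come from visibility, positional certainty, and view
    # similarity; here they are just toy numbers.
    def blend_texel(samples):
        """samples: list of (w_j, (r, g, b)) for one (u, v) texel."""
        total = sum(w for w, _ in samples)
        if total == 0.0:
            return (0.0, 0.0, 0.0)       # texel not visible in any image
        return tuple(sum(w * c[k] for w, c in samples) / total for k in range(3))

    samples = [(0.8, (0.9, 0.7, 0.6)),   # image seen nearly head-on: large weight
               (0.2, (0.8, 0.6, 0.5))]   # grazing view: small weight
    print(blend_texel(samples))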

Wrinkles

Guenter [Gue89a, Gue89b] uses wrinkles as surface detail by mapping a 2D spline curve onto the surface of the model. The curve can then be used to create a discon- tinuity in the surface normal during shading calculations.

Pelachaud et al. [PVY93] present a system that produces automatic facial expres- sions with wrinkles from spoken input. They focus on the integration of expressive wrinkles and the generation of synchronized animated speech. The facial model is ca- pable of facial muscle deformations and bulges. They modify an early system [PBS91] by replacing the Platt model with the model of Viaud and Yahia [VY92], which is capable of wrinkles. The wrinkles are created by post processing the model based on the muscles and creating tangency discontinuities between two isolines of the spline surface, which are perpendicular to the action line of the muscular force. Age is also used in determining wrinkles.

Wu et al. [WMTT95] describe the generation of wrinkles for expression and aging.

Expressive wrinkles appear during expression and become visible over time, due to changes in the elastin and collagen fibers. As the elastin fibers stretch over time

they lose elasticity and the collagen fibers collect into sheaves making the skin less

homogeneous. To model these effects they define the skin as a 2D lattice over 3D

points connected with springs. Muscles are interactively attached to the mesh and

are implemented as spring forces between insertion points. Expressive wrinkles are

formed during muscle contraction and micro wrinkles (aging) are created by changing

the rest constants of the springs over time.

Wu et al. [WKMT96] develop methods to simulate static and dynamic wrinkles

on a skin surface and also to render the skin. They model both a micro and macro

structure of the skin. The micro lines are defined by a Delaunay triangulation of the

2D texture image plane. The triangle edges represent furrows and the ridge is created

by ramping the height based on barycentric coordinates with the maximum height

at the center of the triangle. This can be done hierarchically with triangles being

subdivided into smaller ridges and furrows. The micro furrows are rendered using bump mapping. Macro lines are hand painted onto the model as a B-spline curve

that is projected onto the 2D texture and constrained to lie on the micro structure.

The space between macro lines is parameterized and the parameters are used along

with a shape function to create bulges between the macro lines. Varying the shape

function creates different types of bulges. For the dynamic behavior of the skin, they

calculate the strain of a triangle using the vertical and horizontal line through the

center along with one of the edges. Hooke’s law is used for computing the stress

components on the triangle, which in turn are used to find the in-plane forces along

the edges. Expressive wrinkles are defined as curves transverse to the muscle fibers, with the height of the bulge mapped from the calculated strain.
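The sketch below illustrates one way the micro-level ridge height could be ramped from barycentric coordinates, peaking at the triangle center and vanishing at the edges; the particular ramp function is an assumption of this sketch, not the exact one used by Wu et al.

    # Sketch of a bump height for the micro structure: zero along the triangle
    # edges (the furrows) and maximal at the centre of the triangle (the ridge).
    # The scaled-minimum-barycentric ramp is an assumption of this sketch.
    def ridge_height(b0, b1, b2, max_height=1.0):
        # barycentric coordinates sum to 1; min(...) is 1/3 at the centroid
        return max_height * min(b0, b1, b2) * 3.0

    print(ridge_height(1 / 3, 1 / 3, 1 / 3))   # 1.0 at the centre
    print(ridge_height(0.5, 0.5, 0.0))         # 0.0 on an edge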

A.3.5 Famous Models

Numerous facial models have been created over the nearly three decades of research

on facial animation. This subsection lists models that have had the most impact in the area. The impact is judged by the contribution made by the model, how often the model is referenced, and how often it is incorporated into other systems.

The Parke Model

While a graduate student at the University of Utah, Frederic Parke was the first person to develop a model of a human head for the purpose of facial animation. 3D graphics was a new frontier of research and the University of Utah was one of a handful of institutions where it was being conducted. Parke's parametric model [Par74] is probably the most used model in facial animation. Parke's model is used by so many independent researchers because it was the first facial model, is easy to use and modify, and is freely available. Parke released the code for his model to several researchers who shared their modifications with others, and in 1996 Parke and Waters [PW96] both released software for their respective models to support their book.

In his master's thesis, Parke [Par72] creates facial animation by keyframing a polygonal model using the brand-new technique of Gouraud [Gou71] shading, and the model is digitized by hand using two orthogonal views, both ground-breaking techniques. Polygons are chosen over patches because a shading algorithm for patches

did not exist, and patches were hard to use and implement and were extremely slow.

The keyframing was tedious, involving collecting data for the face and expressions for

each keyframe. To reduce the workload, Parke [Par74, Par75] develops a parametric

facial model.

The model is capable of representing different individuals via conformation parameters and different expressions with expression parameters. Animation can be achieved by changing the parameters, reducing the workload on the animator drastically. Interpolation between poses (expressions, phonemes) is the main concept. By

defining two extremes, a value (parameter) can be used to interpolate between the two extremes. Defining appropriate parameters and extremes makes it possible to

animate the face.

The parameters are divided into conformation and expression parameters, with

symmetry between sides of the face assumed. There are many scaling parameters to

create the correct shape including: X,Y,Z of face, forehead size, nose bridge and lower

nose width. Eyelids are done by a combination of linear interpolation and spherical

mapping. Different eyelid positions result from linearly interpolating a closed eyelid

mesh and an open eyelid mesh. The interpolation is then mapped onto a sphere 1.1

times the size of the eyeball with the same center. There are 2 parameters for the

eyebrows, 9 for the eyes, and 10 for the mouth region. The mouth region includes

the lips, teeth, and a jaw that rotates but has no lateral movement. The parameters

are chosen empirically and from traditional hand drawn animation methods. Using

Madsen [Mad69], Parke determines that the important capabilities to achieve lip

animation are:

1. Open lips for /a/,/e/,/i/

2. Closed lips for /p/,/b/,/m/

3. Oval mouth for /u/,/o/,/w/

4. Tuck lower lip under front teeth for /f/ and /v/.

5. Move between these lip positions as required.

6. The remaining sounds are formed by the tongue and do not require precise

animation.

Parke built an animation by going through the following steps:

1. Set the parameters frame by frame resulting in too much motion.

2. Allowed only smoothed jaw movement which created a smoother animation,

but still did not look natural nor match the speech well.

3. Added /f/ and /v/ tucks with no improvement.

4. Added variations to the mouth width with some improvement.

5. Modified the model to allow upper lip movement giving improved animation.

6. Finally, added eye, eyebrow and mouth expression motion resulting in more

convincing animation.

Parke found that he needed just ten parameters to produce animated speech: Jaw rotation, upper lip position, mouth width, mouth expression, eyebrow arch, eyebrow separation, eyelid opening, pupil size and eye tracking.

Parke noticed that more conformation parameters were needed, such as eyebrow shape and nostril size, and that the model should not be symmetric. His initial model lacked neck pivots, hair, neck, ears, and a tongue. Parke [Par82] later added more parameters and refined the model.

Lewis and Parke [LP87] process recorded speech using linear prediction to determine the spoken phonemes, which are converted into keyframes (visemes) of a modified

Parke model that now has a full head.

Pearce et al. [PWWH86] modify the Parke model by adding a scripting interface within the Graphicsland system. The scripting interface gives them the ability to

produce synchronized speech automatically by sending the phoneme information to

both the speech synthesizer and the facial model. The scripting interface allows for a library of expressions, such as smile, blink, and phonemes to be built and accessed by

the animator at a high level. Hill et al. [HPW88] describe the system from a speech

production point of view by extending speech synthesis by rules to also create the

facial model parameters. Wyvill and Hill [WH89] describe the system from a graphics

point of view and give parameter values for generating the phonemes.

In their work with the deaf and speech reading, Cohen and Massaro [CM93] extend

the modifications of the Parke model made by Pearce et al. [PWWH86]. They add

more parameters to support speech synchrony and include tongue geometry. They

are interested in creating speech that can be understood by a deaf person, so they

are more interested in the visual cues of speech than in the auditory ones.

DiPaola [DiP89, DiP91] adds new conformation parameters to the Parke model to

create a larger space of possible face shapes. The new parameters allow more realistic

as well as stylistic or cartoon-like faces. DiPaola also adds the ability to have skin

creases and facial hair (eyebrows, mustache and beard) using hair globs.

The Waters Model

Keith Waters [Wat87] presents a parameterized muscle model for creating facial

animation. The muscle model can be adapted to any physical model; a polygonal

mesh is used in his work. The parameterized muscles are given zones of influence and the nodes of the facial model are displaced within these zones using a cosine falloff function. He presents work using the muscle system, defining ten muscles based on FACS.

Waters uses two types of muscle: linear/parallel and sphincter. Each muscle creates spring-like forces and has a zone of influence with a falloff function. Each muscle has a direction (towards a point of attachment) and a magnitude. A simple example is a linear muscle with a circular zone of influence and a cosine falloff function. In this case, points close to the muscle attachment will move the most while points near the edge of the circle will barely move, with the influence on the points in between following a cosine function. Cosine acceleration and deceleration is used as a first-order approximation since the displacements are small and the rate of motion occurrence is very fast.

A linear muscle is defined as a muscle that is inserted into the bone, which does not move, and its point of attachment in the skin is where the maximum displacement occurs, with the displacement towards the bony attachment. Figure A.6 describes how Waters defines a linear muscle. Here, we describe the latest incarnation of the formulas [PW96], which is a slight refinement over the earlier work [Wat87]. For a linear muscle with a zone of influence with an angle $< 180^\circ$, the muscle pulls along the vector between $v_1$ and $v_2$, $\Omega$ is the half angle of the zone of influence, and $R_s$ and $R_f$ define the start and finish of the falloff, respectively. The zone of influence is the region $v_1 p_r p_s$, with the falloff zone the region between $p_n p_m$ and $p_r p_s$. Any mesh node $p = (x, y, z)$ located within the zone of influence $v_1 p_r p_s$ is displaced towards $v_1$ along

the vector $\vec{p v_1}$ to $p' = (x', y', z')$, where

$$p' = p + a k r \frac{\vec{p v_1}}{\|\vec{p v_1}\|}$$

and $a = \cos(\mu)$ is the angular displacement, $\mu$ is the angle between $\vec{v_1 v_2}$ and $\vec{v_1 p}$, $k$ is the muscle spring constant, and $r$ is the radial displacement calculated as

$$r = \begin{cases} \cos\!\left(1 - \frac{D}{R_s}\right), & \text{if $p$ is inside } v_1 p_n p_m; \\ \cos\!\left(\frac{D - R_s}{R_f - R_s}\right), & \text{if $p$ is inside } p_n p_r p_s p_m; \end{cases}$$

where $D = \|v_1 - p\|$.

Figure A.6: Waters vector muscle model. The vector muscle that acts along $\vec{v_1 v_2}$ influences the sector $v_1 p_r p_s$, with $R_s$ and $R_f$ defining the start and finish of the falloff.
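A minimal Python sketch of the linear muscle displacement, following the reconstruction above; the zone tests are reduced to simple distance and angle checks, and all input values are toy numbers.

    # Sketch of the linear (vector) muscle displacement:
    # p' = p + a * k * r * (v1 - p)/|v1 - p|, with a = cos(mu) the angular
    # falloff and r the radial falloff from the reconstruction above.
    import math

    def sub(a, b): return [a[i] - b[i] for i in range(3)]
    def norm(v):   return math.sqrt(sum(c * c for c in v))

    def linear_muscle(p, v1, v2, k, Rs, Rf, omega):
        axis = sub(v2, v1)
        to_p = sub(p, v1)
        D = norm(to_p)
        if D == 0.0 or D > Rf:
            return list(p)                               # outside the zone of influence
        cos_mu = sum(a * b for a, b in zip(axis, to_p)) / (norm(axis) * D)
        mu = math.acos(max(-1.0, min(1.0, cos_mu)))
        if mu > omega:
            return list(p)                               # outside the angular sector
        a = math.cos(mu)
        r = math.cos(1.0 - D / Rs) if D <= Rs else math.cos((D - Rs) / (Rf - Rs))
        toward_v1 = sub(v1, p)
        n = norm(toward_v1)
        return [p[i] + a * k * r * toward_v1[i] / n for i in range(3)]

    print(linear_muscle([0.3, 0.1, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                        k=0.2, Rs=0.4, Rf=0.8, omega=math.radians(60)))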

A sphincter muscle squeezes towards a single point of contraction and moves the

skin towards the center, similar to the tightening of a bag with a string. The new point $p'$ is calculated as

$$p' = 1 - \frac{\sqrt{l_y^2 p_x^2 + l_x^2 p_y^2}}{l_x l_y}$$

where $l_x$ is the major axis and $l_y$ the minor axis of the ellipse.

Figure A.7: Waters sheet muscle model.

Later, Waters [Wat89] defines a sheet muscle. The sheet muscle consists of strands

of fibers that lie in flat bundles. The sheet muscle does not contract to a localized

point nor emanate from a single source like the linear muscle, but instead it acts as

a series of almost-parallel fibers spread over an area as shown in figure A.7. The new

location $p'$ of node $p$ inside the zone of influence is $p' = p + d$, where

$$d = \begin{cases} \cos\!\left(1 - \frac{L_t}{R_f}\right) & \text{for $p$ inside sector } ABCD \\ \cos\!\left(1 - \left(\frac{L_t}{R_f}\frac{V_i}{V_l} + V_f\right)\right) & \text{for $p$ inside sector } ABEF \end{cases}$$

Waters [Wat89] describes work with Terzopoulos on a dynamic model of facial

tissue in which the face is modeled as a tri-layer mesh of springs, one layer each for

the skin, muscle, and bone. Here the muscles are attached to the middle layer and

the deformation on the surface layer is propagated through the mesh of springs. A

node, $x_j$, is affected by muscle vectors, $m_i$. The new location, $x_j'$, is calculated by

$$x_j' = x_j + \sum_{i=1}^{m} c_i b_{ij} m_i$$

where $c_i$ are weights, $m_i$ is the muscle vector (its length the muscle rest length), $r_{ij} = m_i^t - x_j$ is the distance to the tail of the muscle, and $b_{ij}$ is the muscle blend. $b_{ij}$ is calculated with

$$b_{ij} = \begin{cases} \cos\!\left(\frac{\|r_{ij}\|}{a_i}\frac{\pi}{2}\right) & \text{for } \|r_{ij}\| \le a_i \\ 0 & \text{otherwise} \end{cases}$$

where $a_i$ is the radius of influence of the cosine blend profile.
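A small sketch of this cosine-blend displacement, following the reconstruction above; the muscle weights, vectors, tails, and radii are toy values.

    # Sketch of x_j' = x_j + sum_i c_i * b_ij * m_i, with b_ij falling off as a
    # cosine of the distance from the muscle tail.  Muscle data are toy values.
    import math

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def blend(r_len, a_i):
        if r_len > a_i:
            return 0.0
        return math.cos(r_len / a_i * math.pi / 2.0)

    def displace(x_j, muscles):
        """muscles: list of (c_i, m_i vector, tail point, radius a_i)."""
        out = list(x_j)
        for c, m, tail, a_i in muscles:
            b = blend(dist(tail, x_j), a_i)
            for k in range(3):
                out[k] += c * b * m[k]
        return out

    muscles = [(0.5, (0.0, -0.1, 0.0), (0.2, 0.0, 0.0), 0.6)]
    print(displace((0.0, 0.0, 0.0), muscles))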

The displacement is propagated throughout the mesh using second-order La-

grangian dynamics [TPBF87] and second-order Runge-Kutta integration of

$$m_i \frac{d^2 x_i}{dt^2} + \gamma \frac{d x_i}{dt} + f_i^e = 0$$

where $x_i$ is the position of the nodes, $m_i$ is the mass, $\gamma$ is the coefficient of velocity-dependent damping, and $f_i^e$ are the net forces at node $i$. $f_i^e$ is calculated by

$$f_i^e = f_i^s + f_i^g + f_i^m$$

where $f_i^m$ is the force due to muscles, $f_i^g$ is the effect of gravity, and $f_i^s$ is the force due to the springs connected to node $i$. $f_i^s$ is calculated by

$$f_i^s = \sum_{k \in C_i} c_k \left(\|x_i - x_j\| - l_k\right)$$

where $l_k$ is the rest length of the spring, $c_k$ is the spring constant, and $C_i$ is the

set of springs connected to node i.

Terzopoulos and Waters [TW90] add a volume preservation force and present a

method to estimate muscle parameters from video by applying makeup and using

snakes to track the makeup. The volume preservation force is added to the total

force on node i, however, they do not give details on how to calculate this force. In

this work, they modify slightly the way spring forces are calculated as well as use

biphasic springs.

A single spring $k$ connects node $i$ to node $j$ with a natural length of $l_k$ and a spring stiffness of $c_k$. $r_k = x_j - x_i$ is the node separation (spring length) and the spring deformation is $e_k = \|r_k\| - l_k$. The force, $s_k$, the spring exerts on node $i$ is calculated as

$$s_k = \frac{c_k e_k}{\|r_k\|} r_k$$

To more accurately model tissue, which is readily extensible at low strains but exerts rapidly increasing restoring stresses after reaching a strain $e_k^c$, biphasic springs are used. The spring constant $c_k$ for a biphasic spring is

$$c_k = \begin{cases} \alpha_k & \text{if } e_k \le e_k^c \\ \beta_k & \text{if } e_k > e_k^c \end{cases}$$

where the small-strain stiffness $\alpha_k$ is smaller than the large-strain stiffness $\beta_k$.
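A short sketch of the biphasic spring force, following the reconstruction above; the stiffness values and critical strain are toy numbers.

    # Sketch of a biphasic spring force on node i: stiffness alpha below the
    # critical deformation e_c, beta above it; s_k = c_k * e_k * r_k / |r_k|.
    import math

    def biphasic_spring_force(x_i, x_j, rest, alpha, beta, e_c):
        r = [b - a for a, b in zip(x_i, x_j)]          # r_k = x_j - x_i
        length = math.sqrt(sum(c * c for c in r))
        e = length - rest                              # deformation e_k
        c = alpha if e <= e_c else beta                # biphasic stiffness
        return [c * e / length * comp for comp in r]

    print(biphasic_spring_force((0, 0, 0), (1.5, 0, 0), rest=1.0,
                                alpha=10.0, beta=100.0, e_c=0.3))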

Waters and Terzopoulos [WT91] report on a method to adapt their mesh to laser scanned data to get a physically-based model of a particular person. Muscles are interactively painted onto the model. Waters [Wat92] uses marching cubes on com- puted tomography data to obtain a polygonal mesh of the exosurface of the skull and skin. The muscle layer is constructed equidistant between the skull and skin with cross struts to avoid shearing, a spring constant lower than that of the skin and deformations calculated as above.

Waters and Levergood [WL93] describe DECface, which is a system that generates lip-synchronized facial animation from text input. Using DECTalk to generate phonemes and a waveform, they associate a keyframe with each phoneme and interpolate between them. To simulate acceleration and deceleration they use $s' = s\,(1 - \cos(\pi(s_0 - s)))/2$ for the interpolation. However, in fluent speech the target position is rarely achieved, so they assign a mass to the nodes and then use

189 Hookean spring calculations, with simple Euler integration, to approximate the elastic

nature of the tissue. Expressions are created with Waters’ muscle model [Wat87].
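The sketch below shows cosine ease-in/ease-out interpolation between two key values in the spirit of the DECface blend; since the published formula is reconstructed above, treat the exact form here as illustrative only.

    # Sketch of cosine ease-in/ease-out interpolation between two viseme key
    # values; parameter names and values are invented for the example.
    import math

    def cosine_ease(a, b, t):
        """t in [0, 1]; slow at both ends, fast in the middle."""
        s = (1.0 - math.cos(math.pi * t)) / 2.0
        return a + (b - a) * s

    jaw_closed, jaw_open_ah = 0.0, 0.8            # toy parameter values
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, round(cosine_ease(jaw_closed, jaw_open_ah, t), 3))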

Lee et al. [LTW93, Lee93] extend [WT91] by developing a heuristic method that is predominately automatic for creating a physics-based model of a particular person’s

head from laser-scan data by fitting a generic model. The physics of the model are

also modified. The spring stiffness constant, $c_s$, for biphasic dermal-fatty springs is

$$c_s = \begin{cases} \frac{I_s K_2}{l_s^r} & \text{for } l_s > l_s^r(1 + B_s^p) \\ \frac{I_s K_1}{l_s^r} & \text{for } l_s^r(1 + B_s^p) > l_s > l_s^r \\ \frac{I_s K_1 \Phi(l_s, l_s^r)}{l_s^r} & \text{for } l_s^r > l_s \end{cases}$$

where $K_1$ and $K_2$ are the spring constants for a biphasic spring, $l_s$ is the length of spring $s$, $l_s^r$ is the rest length, $B_s^p$ is the spring's biphasic point given in terms of $\frac{l_s}{l_s^r}$, and $I_s$ is the multiplicative factor ($-1$ when the spring is inverted and $1$ otherwise).

$$\Phi(l_s, l_s^r) = \tan\!\left(\frac{\sigma\pi\left(1 - \frac{l_s}{l_s^r}\right)}{2}\right)$$

is the spring compression penalty, where $0 < \sigma < 1$ is a scaling factor. For other springs the stiffness constant is found using

$$c_s = \begin{cases} \frac{I_s K}{l_s^r} & \text{for } l_s > l_s^r \\ \frac{I_s K \Phi(l_s, l_s^r)}{l_s^r} & \text{for } l_s^r > l_s \end{cases}$$

where $K$ is the spring constant for a spring.

Lee also develops a new linear muscle model where the force acting on node i by

muscle j is

$$f_i^j = \Theta_1(\epsilon_{j,i})\,\Theta_2(\omega_{j,i})\,m_j$$

where $\epsilon_{j,i}$ is the width ratio (perpendicular distance of the node from the normal of the muscle), $\omega_{j,i}$ is the length ratio (distance between the end of the muscle and the closest point on the muscle vector to the node), and $m_j$ is the unit muscle vector. $\Theta_1$ and $\Theta_2$ are scaling functions calculated as

$$\Theta_1(\epsilon_{j,i}) = \sin^{k}\!\left(\frac{\pi \epsilon_{j,i}}{2}\right) + \frac{1}{8}\left[\sin\!\left(2\pi \epsilon_{j,i} - \frac{\pi}{2}\right) + 1\right]^{k}$$

$$\Theta_2(\omega_{j,i}) = \sin^{1.8}\!\left(\frac{1}{2}\left[\cos\!\left(\pi\frac{\omega_{j,i}}{w_j}\right) + 1\right]\right)$$

where $w_j$ is the width of muscle $j$.

The discrete Lagrange Equation of motion for node i is now calculated as

$$m_i \frac{d^2 x_i}{dt^2} + \gamma_i \frac{d x_i}{dt} + g_i + q_i + s_i + h_i = f_i$$

where $g_i$ is the spring force, $q_i$ is the volume preserving force, $s_i$ is the skull penetration force, $h_i$ is a force that restores node $i$ to its rest location, and $f_i$ is the muscle force.

The Platt Model

Platt and Badler [SMP81] describe work on a system designed to recognize and

generate American Sign Language. This was the first work to apply a physics-based

force model to facial animation and they introduced the Facial Action Coding System

(FACS) [EF78a] to the facial animation community.

FACS is used to describe the motion and tension nets to describe the geometry of

the face with a net made up of:

• Points, which are 3D locations such as skin, muscle or bone.

• Arcs, which connect two points together with a spring.

• Muscle fibers, which connect bone or muscle points with one or more skin points.

• Muscles, which are made up of many muscle fibers.

The skin surface is a polygonal mesh with bone and muscle points under the surface without any other interior structure. Muscle forces are determined by hand based on FACS Action Units and when the forces are applied to the muscles, the skin deforms. When force $f$ is applied to point $p$, the change in location is $dl = \frac{df}{k}$. To animate for $N$ time steps, a force of $\frac{f}{N}$ is applied at each step.

Platt [Pla85] abandons the springs for a displacement method. The work is based on a hierarchical definition of the face with the face divided into seven areas: forehead,

left/right eye, left/right cheek, nose and mouth. The face is divided into landmarks

and transient regions. Landmarks are permanent locations on the face and can be

moved or masked with two types of landmarks: structural (eye openings) and surface

(eyebrows). Transient regions are locales of change such as the area between the

brows. Instead of muscle forces, the FACS AUs give rise to MOVE commands, which

do the actual deformations. They also modified FACS to create a more generative

(as opposed to analytical) system.

The Platt model has not been as well traveled as the Parke model or the Waters

model but it has been used by other researchers. Pelachaud [Pel91] and Pelachaud

et al. [PBS91] use the Platt model to create animated speech with expressions from

a speech source. This research also deals with coarticulation issues in speech. Essa

[Ess94] also uses the Platt model in his research on creating and analyzing facial

expressions.

CANDIDE

Rydfalk [Ryd87] describes a system called CANDIDE designed at the University

of Linköping in Sweden to display a human face quickly. CANDIDE was designed

when graphics hardware was slow and the number of polygons was very important.

When designing the system they had four constraints: use of triangles, less than 100

elements, static realism, and dynamic realism. He defines static realism as when the motionless face looks good and dynamic realism as when the animated motion looks good. FACS AUs are used to define the animation. The final geometry has less than

100 triangles and 75 vertices resulting in a very rough model that lacks adequate

complexity in the cheeks and lips. With so few triangles, it is very difficult to make

the model geometrically look like a particular person.

This technical report describes an implementation with no novel research and

because of the lack of complexity, it has little utility in facial animation. However, it

is popular among vision researchers [Red91, RvBSvdL95], mostly those in Europe, for

applications such as tracking, model based communication, and compression where a

low polygon count model is sufficient and desired.

A.3.6 Other Facial Parts

Facial animation geometry can be complex or simple. Many of the early systems, and even some of the later ones, only use a facial mask, which is only the front part

of the face. The biggest reason for this is that the rest of the head is covered with

hair and hair is very complex to model. If it is modeled as a surface, it must be very

coarse to keep the model complexity down. The tongue, eyeballs, teeth, and skull

are other parts of the face that are often ignored. This section presents methods that

deal with just parts of the face, generally to add in some of this missing geometry

that is not part of the skin of the face. Specifically modeling hair, the tongue, the

lips, and the skull will be discussed.

Hair

Creating a character with a full head of hair that realistically moves has been one

of the hard computer graphics problems. A head of hair has many tens of thousands of individual hairs, with each hair a complex surface that has anisotropic reflectance properties. Developing a system that can represent 100,000 hairs, represent particular styles, and render the hair with realism while the hairs move freely over time is a

difficult problem. Calculating collisions between hairs and between the hair and the head,

attractive static forces, and dynamic motion due to head movement is a daunting

task.

Hair has two distinct tasks: modeling and rendering. Modeling involves styling

and motion, which includes dynamics and collision detection. Solutions generally

solve just a subset, usually sacrificing dynamics, to get reasonable speeds. One ap-

proach is to use a low-resolution polygonal surface model [DiP89, DiP91] with no dy-

namics to represent the hair. Another approach is to use procedural texturing meth-

ods to create visually realistic hair sacrificing dynamics [PH89, SPS96]. The individ-

ual strands of hair can also be modeled as a polygonal shape [AUK92, RCI91, WS89]

or as lines [Mil88].

A simple method is to model the hair surface as a set of polygons. The easiest

method is to use hair color for the polygons defining the hair. This is often used for

eyebrows and mustaches [Par74]. An extension is to model many polygonal shapes to

give the hair a less solid look. DiPaola [DiP89, DiP91] allows for specifying eyebrows,

mustache and beard parametrically with the hair implemented using hair globs, which

are small irregularly shaped polygonal surfaces.

Perlin and Hoffert [PH89] define hypertextures, which are volumetric solid textures. One of the applications they describe is a hair model. This method is computationally expensive, and there is no clear method to add dynamics to the hair.

However, this method could be used for eyebrows, beards and mustaches.

Sourin et al. [SPS96] model hair using noise and implicit functions. A sphere that represents the location of the hair is used to locate the noise. For any point p = (x, y, z), the point p' = (x', y', z') is the location where the ray from p to the center of the sphere intersects the sphere. p' is used to calculate the hair value, hair(x, y, z) = noise(x', y', z'), thus all points along that ray have the same hair value. Offsetting the hair value controls its thickness. Displacing p first, as in hair(x, y, z) = noise(projection(displacement(x, y, z))) − offset, allows the hair to be styled other than straight.
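The following is a minimal sketch of this projection idea, assuming a placeholder noise() function and a hypothetical hair_value() helper; it is not the authors' implementation:

import math

def noise(x, y, z):
    # Placeholder pseudo-noise; a real implementation would use Perlin-style noise.
    return 0.5 * (math.sin(12.9898 * x + 78.233 * y + 37.719 * z) + 1.0)

def hair_value(p, center, radius, offset=0.3, displacement=None):
    # Optional styling step: displace the query point before projecting it.
    if displacement is not None:
        p = displacement(p)
    d = [pi - ci for pi, ci in zip(p, center)]
    length = math.sqrt(sum(di * di for di in d)) or 1.0
    # Project p along the ray toward the sphere surface; every point on that ray
    # maps to the same surface point and therefore gets the same hair value.
    projected = [ci + radius * di / length for ci, di in zip(center, d)]
    return noise(*projected) - offset  # positive values lie "inside" a hair

print(hair_value((0.2, 1.1, 0.0), center=(0.0, 0.0, 0.0), radius=1.0))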

Miller [Mil88] uses pseudo-reflectance maps to create a fast anisotropic lighting model of cylindrical objects that is applied to line segments to create fur. The anisotropic shading model uses tangent vector information as well as surface nor- mal vectors. To model fur, hairs are generated based on an equally spaced grid in parametric space, with jitter added to each hair. The root of the hair is placed on the surface and the end of the hair is a linear combination of the tangent vectors and surface normal.
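A minimal sketch of this style of fur placement follows; the parametric-surface callables and the blending weights are assumptions for illustration, not Miller's implementation:

import random

def make_fur(surface_point, tangent_u, tangent_v, normal,
             nu=10, nv=10, length=0.05, jitter=0.3):
    # surface_point(u, v), tangent_u(u, v), tangent_v(u, v), and normal(u, v)
    # each return an (x, y, z) triple for the parametric surface.
    hairs = []
    for i in range(nu):
        for j in range(nv):
            # Equally spaced grid in parametric space, with jitter per hair.
            u = (i + 0.5 + jitter * (random.random() - 0.5)) / nu
            v = (j + 0.5 + jitter * (random.random() - 0.5)) / nv
            root = surface_point(u, v)
            # The hair end is a linear combination of the tangents and the normal.
            a = random.uniform(-0.3, 0.3)
            b = random.uniform(-0.3, 0.3)
            direction = [a * tu + b * tv + n for tu, tv, n in
                         zip(tangent_u(u, v), tangent_v(u, v), normal(u, v))]
            tip = tuple(r + length * d for r, d in zip(root, direction))
            hairs.append((root, tip))
    return hairs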

Watanabe and Suenaga [WS89] use small trigonal prism elements to form curved hairs that are combined together in wisps and are rendered quickly using a z-buffer.

Combining the individual hairs into bundles called wisps allows them to decrease the number of parameters needed to define a full set of hair. Individual hairs, as well as wisps, are controlled by a small number of parameters to facilitate quick modeling.

Hair can then be modeled with a small number of parameters: number of wisps, hairs

per wisp, hair length, thickness and color. With visibility algorithms, two-thirds of the hair can be removed to reduce complexity.

Rosenblum et al. [RCI91] model hair as segments with point masses, springs and hinges with strand to head collision detection and shadowing effects. They describe a system for placing, rendering and simulating the motion of hair. The system employs

z-buffer and shadow buffer techniques for quick rendering with self-shadowing effects.

Hairs are modeled as segments of point masses connected with springs and hinges

between segments. Newton’s second law (f = ma) is solved to calculate the motions

of the hair strands. Strand to head collision detection is also performed. Hair is

interactively placed on the polygonal model by the animator, with support for jittering

the follicle placements and mirroring.

Anjyo et al. [AUK92] present a method to style, render and animate hair. The

head surface is approximated with an ellipsoid to reduce computation of pore loca-

tions and collision detection. Hair is styled by placing hair pores on the ellipsoid and

growing the hair outward. Bending of the hair is calculated based on a simplified

solution of a cantilever beam with collision detection between the hair strands and

the ellipsoid. Hair can then be cut and adjusted to get the final desired shape. The

cantilever beam is borrowed from the field of material strengths and is a straight

beam with a one-sided fixed support. Considering the hairs to be rigid sticks, dy-

namic behavior is reduced to solving a simple one-dimensional differential equation

of angular momentum for each hair. Hair to head collision detection is performed

under a pseudo force field to give results that are visually pleasing but not highly accurate.

They also present fast rendering of anisotropic reflection.

Tongue

In order to achieve more realism, several researchers recognized the need to rep-

resent the tongue in a facial animation system. These systems have used a range of methods to simulate the tongue including a simple geometric tongue with rigid motion, a human sculpted tongue in keyframe positions, finite elements, and a highly complex model using soft objects.

Some of the first uses of a tongue in computer facial animation were in the creation of animated shorts. Reeves [Ree90] describes the use of a teardrop-shaped collection of 12 bi-cubic patches to model the tongue in Tin Toy, Pixar's Academy Award-winning short. Although the tongue was modeled, it was usually left in the back of the mouth.

Kleiser [Kle89] sculpts a face in phonemic positions then interpolates them, paying particular attention to the tongue, eyes, teeth, eyebrows and eyelashes.

In their work with the deaf and speech reading, Cohen and Massaro [CM93] mod- ified the Parke [Par74] model to work with phoneme input and added a simple tongue model to increase intelligibility of the generated speech. The tongue model they use is quite simple and animates very stiffly with parameters for tongue length, angle, width, and thickness. A simple rigid tongue may be adequate if only the tip is visi- ble. However, in general facial animation and excited conversation where the mouth is opened wide, the limitations are apparent.

In studying the tongue as part of the vocal tract, Stone [Sto91] proposes a 3D tongue model by dividing the tongue into 25 functional segments, five lengthwise seg- ments and five crosswise. Stone and Lundberg [SL96] reconstruct 3D tongue surfaces during the production of English consonants and vowels using a developmental 3D ultrasound. Electropalatography (EPG) data was collected providing tongue-palate

contact patterns. Comparing the tongue shapes between the consonants and vowels revealed that only four classes of tongue shapes are needed to classify all the sounds measured. The classes are: front raising, complete groove, back raising, and two-point displacement. The first three classes contained both vowels and consonants while the last consisted only of consonants. The EPG patterns indicated three categories of tongue-palate contact: bilateral, cross-subsectional, and a combination of the two.

Vowels used only the first pattern and consonants used all three. There was an ob- servable distinction between the vowels and consonants in the EPG data, but not in the surface data.

Maeda [Mae90] creates an articulatory model of the vocal-tract that uses 4 pa- rameters for the tongue by studying 1000 cineradiographic frames of spoken French.

The four parameters are: jaw position, tongue-dorsal position, tongue-dorsal shape and tongue-tip position.

Wilhelms-Tricarico [WT95] presents a finite element model of the tongue with

22 elements and 8 muscles from work on a physiologically based model of speech production. This initial model does not completely simulate the tongue but shows the feasibility of the methods. In later research [WTP97], he creates a way to control the model.

Research into computer facial animation led Pelachaud et al. [PvOS94] to develop a method of modeling and animating the tongue using soft objects with consideration of volume preservation and penetration issues. This is the first work on a highly deformable tongue model for the purpose of computer animation of speech. Based on the research of Stone [Sto91], they create a tongue skeleton composed of nine triangles.

The deformation of the skeletal triangles is based on 18 parameters: 9 length, 6 angle,

and a starting point. During deformation, the approximate areas of the triangles are preserved and collision with the upper palate is detected and avoided. The skeletal triangles are considered charge centers of a potential field with an equi-potential surface created by triangulating the field using local curvature. Their method allows for a tongue that can approximate tongue shapes during speech, as well as effects on the tongue due to contact with hard surfaces.

Lips

Most of the visual signal for speech comes from the mouth area, which is shaped by the lips. In order to achieve realistic speech the lips must be capable of producing the shapes during speech. Despite this, the lips have received little special attention in modeling and instead have just been considered as part of the face geometry.

However, animation techniques have considered the lips as important for both speech and facial expressions.

Fromkin [Fro64] reports on a set of lip parameters that characterize lip positions for

American English vowels. The parameters are developed from the study of frontal and lateral photographs, lateral x-rays, and plaster casts of the lips. The lip parameters identified are:

1. Width of lip opening.

2. Height of lip opening.

3. Area of lip opening.

4. Distance between outer-most points of lips.

5. Protrusion of upper lip.

6. Protrusion of lower lip.

7. Distance between upper and lower front teeth.

This parameterization of the lips is very good for speech but it does not allow for other lip motions, such as expressive ones.

In Parke's [Par74] ground-breaking work, he used around 10 parameters to control the lips, teeth, and jaw during speech-synchronized animation. The parameters are chosen empirically and from traditional hand-drawn animation methods. From Madsen [Mad69], Parke determines that the important capabilities for lip animation are:

1. Open lips for a,e,i

2. Closed lips for p,b,m

3. Oval mouth for u,o,w

4. Tuck lower lip under front teeth for f and v.

5. Move between these lip positions as required.

Finally, the remaining sounds are formed by the tongue and do not require precise animation.

Bergeron [BL85] reports that for the film short “Tony De Peltrie” keyframes were defined then interpolated using curves. The keyframes included those for phonemes.

This method of shaping the lips for each phoneme is a common approach. However, it is difficult to create a complete set of poses that encompass the entire space of motion desired. For example, to have a pose of smiling and saying /a/ at the same

time requires creating that specific pose. By defining shapes that differ from the neutral in a single way, such as smiling, these shapes can be used to define a space of possible shapes. Each base shape is calculated as a displacement [Ber87] from the neutral. By defining a percentage of each base shape, animations that are

more complex can be achieved. Our work is similar to this concept; however, we

choose our bases using muscle displacements instead of phonemes and expressions,

which we believe gives a larger space of possible lip shapes. Another difference is in

how we combine the displacements.
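As an illustration of the displacement idea, the following minimal sketch adds weighted base-shape displacements to a neutral mesh; the shapes and weights are made up, and this is not the exact combination rule used in [Ber87] or in this work:

def blend_shapes(neutral, bases, weights):
    # neutral: list of (x, y, z) vertices; bases: name -> per-vertex displacement
    # from the neutral; weights: name -> percentage of that base shape to apply.
    result = [list(v) for v in neutral]
    for name, weight in weights.items():
        for vertex, delta in zip(result, bases[name]):
            for axis in range(3):
                vertex[axis] += weight * delta[axis]
    return result

neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
bases = {"smile": [(0.0, 0.1, 0.0), (0.0, 0.2, 0.0)],
         "jaw_open": [(0.0, -0.3, 0.0), (0.0, -0.3, 0.1)]}
print(blend_shapes(neutral, bases, {"smile": 0.5, "jaw_open": 1.0}))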

Research by the speech community on lip reading involved drawing lip outlines to

represent speech. These works concentrate on creating good motion for the purpose

of speech intelligibility and not on visual realism. Boston [Bos73] reports that a lip

reader was able to recognize a small vocabulary and sentences of speech represented

by mouth shapes drawn on an oscilloscope.

Erber [Erb77] also uses an oscilloscope to create lip patterns and claims to create motion that is more natural. The studies show that lipreading the display gives similar results to those of reading a face directly. Erber and De Filippo [EF78b] drew

eyes, nose and a face outline on cardboard, cut out the mouth area and placed it over

the oscilloscope. Erber et al. [ESF79] use high-speed cameras to capture speech and

determine lip positions by hand.

Brooke [Bro79] draws outlines of the face, eyes and nose with movable jawline

and lip margins. Positional data is hand-captured from a video source. Brooke and

Summerfield [BS83] report on a perception study to determine if a hearing speaker can

identify the utterances. The natural vowels were identified 98% of the time, while the synthetic vowels /u/ and /a/ were identified at rates of 97% and 87%, respectively. The vowel /i/, which was usually confused with /a/, had a recognition rate of 28%.

Montgomery [Mon80] draws lip outlines on a CRT using data that was hand-captured from video frames in a system designed to test lip reading ability. They augment the display with nonlinear interpolation between frames as well as forward and backward coarticulation approximation.

Guiard-Marigny [GM92] measures the lip contours of French speakers articulating

22 visemes in the coronal plane. Assuming symmetry, the vermillion region of the lips is split into three sections and mathematical formulas are created to approximate the lip contours. From polynomial and sinusoidal equations, the 14 coefficients are reduced to 3 using regression analysis. The three parameters are internal lip width, internal lip height and lip contact protrusion. With the same technique on lip contours in the axial plane, Adjoudani [Adj93] identifies two extra parameters to extend the lip model to 3D. The two new parameters are upper and lower lip protrusion. Guiard-

Marigny et al. [GMAB94] describe the 3D model in English.

Guiard-Marigny et al. [GMTA+96] replace the polygonal lip model with an im- plicit surface model using point primitives for fast collision detection and contact surfaces. For the polygonal model, to increase the speed of computation they use a keyframe animation technique. The inbetweens are calculated as the barycenter of a set of extreme lip shapes. There are two extreme shapes per parameter and the barycentric coordinates are the parameters. They build the implicit surface from point primitives for each of the 10 key shapes which are interpolated to create lip shapes. Implicit surfaces give an exact contact surface [Gas93] that allows modeling the interaction of the lips with other objects.

Guiard-Marigny et al. [GMAB97] added a jaw with 6 degrees of freedom using motion captured with a sensor device attached to the lower teeth. They determined that during speech the jaw uses only 3 degrees of freedom: pitch angle, horizontal position, and vertical position. The original sensor could not be used to determine jaw movement during speech without interfering with the lips. Instead, they assume correlation of the jaw parameters during running speech, place makeup on the chin, and use image processing techniques to extract the parameters for the jaw. Their method does not distinguish between head motion and jaw motion. They compared the lip/jaw model, the lip model alone, and audio alone; there is a noticeable improvement in perception using the lip/jaw model over just the lip model, and both improve over audio alone.

Skull

The underlying bone structure of the human head plays an important part not only in the static shape of the skin, but also in how it deforms via muscle contractions, as the skin and underlying tissue slide over the bone. You have possibly heard the phrase about models that "it's all in the bones." In fact, forensic artists can reconstruct the facial surface from the skull alone. Krogman and Iscan [KI86] report on some of the latest advances in forensic medicine. Therefore, the skull should not be ignored when considering the final shape of the face. However, it has been mostly ignored in facial animation, in part because the skull is internal and difficult to acquire noninvasively.

Koch et al. [KGC+96] report on a system to simulate facial surgery, which requires the data to be accurate for that patient. The data is collected from volumetric medical scans of the patient, which include the skull, skin, and the tissue between. The system

can be used to determine not only facial shape, but also facial function due to muscle

action. The system was extended [KGB98] to create facial expressions.

Waters [Wat92] describes a system that uses computed tomography data to derive

a facial model obtaining a polygonal mesh of the exosurface of the skull and skin

utilizing marching cubes. A muscle layer is constructed equidistant between the skull

and skin with cross struts to avoid shearing and a spring constant lower than that of

the skin. Linear muscles [Wat87] are then attached throughout the model to achieve animation.

A simple method to simulate a skull is to create a skull that is a constant depth from the surface. The resulting skull is by no means accurate but it does have a roughly accurate shape that allows for interaction between skin, muscle and bone.

Waters and Terzopoulos [WT91] have a constant depth skull and muscle layer. Lee

extends the Waters model [Lee93, LTW93, LTW95], still using a constant depth skull

with skull penetration forces. For skull penetration a normal is calculated for each

node, based on a single face, in cylindrical coordinates then converted to the generic

mesh coordinates. The force on the node in that normal direction is then negated to

simulate muscles sliding across bone.

A.4 Parameterizations

A parameterization is a way to abstract the shape or movement of the surface in

R^3 to something easier to use by an animator or director. Parameterizations are not

solely in either the modeling or the animating domain. Parameterizations that are

used to define movement may belong on the animation side, while parameterizations

that define shape may belong to the model. The issue is very gray, and therefore, we

discuss parameterizations separately from the animation methods and the modeling techniques. An example parameterization is FACS, which is a descriptive definition of the changes on a face and therefore should be a modeling issue. However, it is also used as a method to generate expressions, so it also belongs on the animation side.

Another example is the parametric Parke model, which has conformation parameters that describe static shape and expression parameters that are used to define motion.

Several layers of parameterizations or several separate parameterizations can be used in a facial animation system. The facial model may be as simple as pure geometry without a parameterization, such as CANDIDE, or it could be very complex with separate parameterizations for parts of the face, such as the facial model described in this dissertation. The system may use several methods to create animation and each of those methods may define a new parameterization between the animation and the model.

A.4.1 Empirical Methods

A common early method was to create a parameterization based on empirical study of faces or using techniques from traditional hand-drawn animation. If a new expression needed to be added or slightly modified, a new parameter would be added.

Parke [Par74] creates a set of parameters empirically and from traditional hand drawn animation methods to make animating a face simpler and more intuitive.

The parameters interpolate vertex positions between two extremes and changing the parameter values over time creates animation. Conformation parameters define the shape of the model and expression parameters are used to create motion.

Nahas et al. [NHS88] use empirical methods to create a set of parameters to

create expressions and speech. The parameters move characteristic points, which

causes deformations in the B-spline surface.

Pelachaud et al. [Pel91, PBS91, PVY93] create automatic facial expressions with

wrinkles from spoken input. They create a high-level language to drive 3D expression

animation from speech, based on a modified Pierrehumbert [PH87] notation.

A.4.2 Muscle Based

Simulated muscles have been a popular method of specifying deformations [Ess94,

Gue89a, Gue89b, KMCT96, PW91, Pat91, PBS91, PVY93, Pie89, PB95, Ree90,

SVG95, Wai89, WF94, WMTT95] of the model surface. Muscles are generally given

an insertion and attachment with a zone of influence. The zone of influence speci-

fies the area of the facial surface that moves during contraction of the muscle. This

concept is based on anatomy, where a muscle is made up of many fibers whose attachments together cover an area; over this area the zone of influence is one. Most systems further define another surrounding area where the influence drops

from one to zero. This simulates the stretching of the skin near where the muscle is attached. Muscles have been used to modify a polygonal mesh [PW91, Pat91, PBS91,

PVY93, WMTT95], a mesh attached with springs [KMCT96, Pie89], finite elements

[Ess94, Gue89a, Gue89b], or even splines [Ree90, SVG95, Wai89, WF94].

Muscle based parameterizations use anatomical knowledge of the face as a basis

for defining shape or motion. Muscle contractions directly cause the deformations on

the surface of the face making them a good choice to base a parameterization on,

especially at the model level. At the animation level, it is difficult for an animator

to specify all the muscle values. Instead, another high-level parameterization such as emotions or expressions is often used. Muscle parameters are often normalized between no contraction and full contraction, but specification of forces can also be used.
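As an illustration, a minimal sketch of such a zone-of-influence deformation follows; the pull toward a single attachment point and the cosine falloff are assumptions for illustration, not Waters' formulation [Wat87] or the method used in this dissertation:

import math

def apply_muscle(vertices, attachment, insertion, contraction,
                 inner_radius, outer_radius):
    # Vertices near the insertion are pulled toward the attachment; the influence
    # is one inside inner_radius and drops to zero at outer_radius.
    deformed = []
    for v in vertices:
        d = math.dist(v, insertion)
        if d <= inner_radius:
            weight = 1.0
        elif d < outer_radius:
            t = (d - inner_radius) / (outer_radius - inner_radius)
            weight = 0.5 * (1.0 + math.cos(math.pi * t))  # smooth falloff 1 -> 0
        else:
            weight = 0.0
        pull = [a - p for a, p in zip(attachment, v)]
        deformed.append(tuple(p + contraction * weight * c
                              for p, c in zip(v, pull)))
    return deformed

skin = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.5, 0.0, 0.0)]
print(apply_muscle(skin, attachment=(0.0, 1.0, 0.0), insertion=(0.0, 0.0, 0.0),
                   contraction=0.4, inner_radius=0.05, outer_radius=0.3))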

Waters [Wat87, Wat89] develops a muscle model that can be used to create facial

expressions. Muscles are modeled by spring forces with zones of influence and cosine

acceleration and deceleration. This method of defining muscle action can be used in a

facial model, a human full-body model or even an animal model. This muscle model

has been used in many systems [KGB98, Lee93, LTW93, LTW95, PALS97, TW90,

UGO98, WT91, Wat92].

Magnenat-Thalmann et al. [MTPT88] describe a system using Abstract Muscle

Action (AMA) procedures. An AMA is defined for roughly the action of a single

muscle and acts on a specific area of the face. AMAs are combined together to create

expressions that can be added to a script track for an actor.

A.4.3 FACS

The Facial Action Coding System [EF78a] (FACS) developed by psychologists

Paul Ekman and Wallace Friesen, is designed to describe the changes in facial ap- pearance. One application of this system is to have an observer score the actions of

the subject face in order to categorize the expression or emotion. FACS breaks the

movements of the face into Action Units (AUs) that roughly correspond to individual

muscle movements. Although the system was designed as a descriptive system, facial

animation researchers [AHS89, CPB+94, Gue89b, Gue89a, HFG94, KGB98, Pat91,

PW91, PBS91, PVY93, SMP81, Pla85, PB95, Ree90, Wai89, Wat87, Wat89, WT91,

WMTT95, WKMT96] have embraced FACS as a way to generate facial expressions by using the AUs to describe the desired expressions. It has also been used in vision [Red91, AHS89] systems for model-based tracking.

AUs are grouped based upon location and/or type of action. One group is for the upper face and the remaining five groups affect the lower face. The lower face groups are: Up/Down, Horizontal, Oblique, Orbital, and Miscellaneous Actions.

The first region is the upper face and the AUs describe changes to the eyebrows, forehead, eye cover fold, and the upper and lower eyelids. Table A.10 lists the AUs for the upper face and describes them.

AU 1 (Inner Brow Raiser): Raises the inner portion of the eyebrows and creates horizontal wrinkles on the forehead.
AU 2 (Outer Brow Raiser): Raises the outer portion of the eyebrows.
AU 4 (Brow Lowerer): Lowers the eyebrows and pulls them closer together while producing deep vertical wrinkles between them.
AU 5 (Upper Lid Raiser): Raises the upper eyelid while widening the eye opening.
AU 6 (Cheek Raiser and Lid Compressor): Draws the skin from the temple and cheeks towards the eyes while narrowing the eye opening and bagging the skin below the eye.
AU 7 (Lid Tightener): Tightens the eyelids while narrowing the eye opening.
AU 41 (Lid Droop): The eyelid droops down, reducing the eye opening.
AU 42 (Slit): The eye opening is very narrow while the lids appear relaxed.
AU 43 (Eyes Closed): The eyes are closed with no sign of tension.
AU 44 (Squint): The eye opening becomes very narrow while the lids appear tensed.
AU 45 (Blink): The eyes close and open quickly with no pause while closed.
AU 46 (Wink): One eye closes briefly with a slight pause while closed.

Table A.10: FACS Action units for the Upper Face group.

The AUs of the Lower Face Up/Down group move the skin and features in the center of the face upward towards the brow or downward towards the chin. This group of AUs is listed in Table A.11.

AU 9 (Nose Wrinkler): Pulls the skin along the sides of the nose upwards, creating wrinkles along the sides and on top of the nose, increases the infraorbital furrow, causes bagging under the eyes, lowers the brows, and pulls up the center of the upper lip. May flare the nostrils and widen the nasolabial furrow.
AU 10 (Upper Lip Raiser): Raises the upper lip, deepens the nasolabial furrow, and widens and raises the nostril wings. May increase the infraorbital furrow or part the lips.
AU 15 (Lip Corner Depressor): Pulls the corner of the mouth down and may bag or wrinkle the skin under the mouth corner.
AU 16 (Lower Lip Depressor): Pulls and stretches the lower lip down and laterally. May protrude the lower lip, cause wrinkles on the chin, or part the lips.
AU 17 (Chin Raiser): Raises the chin and the lower lip. May cause wrinkling on the chin, produce a depression under the lower lip, or protrude the lower lip.
AU 25 (Lips Part): Parts the lips while the teeth are together or almost together.
AU 26 (Jaw Drop): The jaw lowers while the lips may or may not part.
AU 27 (Mouth Stretch): Stretches the mouth open as the jaw moves down.

Table A.11: FACS Action units in the Lower Face Up/Down group.

The AUs of the Lower Face Horizontal group, which are described in Table A.12, move the skin sideways, either from the midline laterally towards the ears or vice versa.

AU 14 (Dimpler): Tightens and narrows the mouth corners, pulling them inward and causing wrinkles and/or a bulge while stretching the lips laterally. May cause a dimple or deepen the nasolabial furrow.
AU 20 (Lip Stretcher): Moves the lips laterally, elongating the mouth and stretching the lips and nostrils. May cause wrinkles near the mouth corners or move the nasolabial furrow laterally.

Table A.12: FACS Action units in the Lower Face Horizontal group.

The Lower Face Oblique group consists of AUs that move the skin angularly, both upwards and outwards towards the cheekbone. The Lower Face Oblique group is enumerated and described in Table A.13.

AU 11 (Nasolabial Furrow Deepener): Pulls the upper lip up and out, deepening the nasolabial furrow.
AU 12 (Lip Corner Puller): Pulls the lip corners back and up, deepening the nasolabial furrow.
AU 13 (Sharp Lip Puller): Pulls the lip corners sharply up, tightening and narrowing them while puffing the cheeks. May deepen the nasolabial furrow or tighten the upper lip.

Table A.13: FACS Action units in the Lower Face Oblique group.

The Lower Face Orbital group involves the muscles around the mouth opening and moves the lips and skin adjacent to the mouth. These Action Units are detailed in Table A.14.

AU 18 (Lip Puckerer): Pulls the mouth forward and medially, puckering the lips while making the mouth smaller and rounder.
AU 22 (Lip Funneler): The lips are pushed outward as in /ur/.
AU 23 (Lip Tightener): Tightens and narrows the lips, rolling them inwards.
AU 24 (Lip Pressor): Presses the lips together, while tightening and narrowing them, without pulling up the chin.
AU 28 (Lips Suck): Sucks in the red parts of the lips, covering the teeth and stretching the skin above and below the lips.

Table A.14: FACS Action units in the Lower Face Orbital group.

The Lower Face Miscellaneous group contains actions which affect the lower face that do not belong in the other groups. These miscellaneous Action Units are de- scribed in detail in Table A.15.

FACS also allows scoring of the positions of the head and eyes. The Action Units used for scoring of the eyes and head positions are listed in Table A.16. Each of these positions should have a level of intensity except for AUs 63-66.

FACS is used by scoring a static picture or movie clip using AUs to describe the observed changes in the face. Most of the AUs are on/off, that is, they are scored if the movement is seen. However, some AUs score as a low (X), medium (Y) or high

(Z) level of intensity. AUs can be bilateral or unilateral and are scored with an L or

R for the left or right side of the face respectively. There are rules for which AUs can be active at the same time and how to combine them properly. These rules are in the

FACS manual [EF78a] and will not be listed here. Scoring is done in three passes:

first the lower face, then the head and eye positions, and finally the upper face.

AU 8 (Lip Towards Each Other): The upper lip is pulled down but not over the teeth.
AU 19 (Tongue Show): The tongue protrudes, showing at least its tip.
AU 21 (Neck Tightener): Wrinkles and bulges the skin of the neck and under the chin.
AU 29 (Jaw Thrust): The jaw is pushed forward, sticking the chin out with the lower teeth in front of the upper teeth.
AU 30 (Jaw Sideways): The chin, lower lip and lower teeth are displaced from the midline.
AU 31 (Jaw Clencher): A bulge appears along the jaw bone where it is hinged.
AU 32 (Bite): The teeth bite the lip.
AU 33 (Blow): Air is blown out and the cheeks expand.
AU 34 (Puff): The cheeks puff out with the lips closed.
AU 35 (Suck): The cheeks are sucked into the mouth, producing a crevice in the cheek.
AU 36 (Bulge): The tongue is pushed against the cheeks or the lips, causing a bulge.
AU 37 (Lip Wipe): The tongue licks the lips.
AU 38 (Nostril Dilator): Flares out the nostril wing, changing the shape of the nostril.
AU 39 (Nostril Compressor): Compresses the nostril wing, decreasing the nostril opening.

Table A.15: FACS Action units in the Lower Face Miscellaneous group.

AU 51: Head Turn Left      AU 58: Head Back
AU 52: Head Turn Right     AU 61: Eyes Turn Left
AU 53: Head Up             AU 62: Eyes Turn Right
AU 54: Head Down           AU 63: Eyes Up
AU 55: Head Tilt Left      AU 64: Eyes Down
AU 56: Head Tilt Right     AU 65: Walleye
AU 57: Head Forward        AU 66: Crosseye

Table A.16: FACS head and eye positions.

An example FACS scoring is "1C or 1+4C", where the C signifies that minimum requirements are not met, and the "or" means that either AU or AU combination could

be met. The + signifies a combination of AUs. Other special codes are: @ for

alternative to, < for subordinate to, and > for dominates. Co-occurrence rules have

a few exceptions: ESQ - except when sequential: a rule applies unless two actions

occur sequentially; EMO - except when motion observed: a rule applies unless motion is seen; and APEX - the apex of an action is the point of greatest change during that

action.

Ekman and Friesen [EF78a] give suggested combinations of AUs that translate

into emotional terms. Table A.17 lists the AU or AU combinations that describe the

universal emotions. Ekman [Ekm73] describes emblems about emotion, in which a

person uses the AU or AUs for an emotion to describe an emotion just as if using

words without experiencing the emotion. Ekman and Friesen [EF78a] suggest that

some of the major emotions listed in Table A.17 may occur as emblems with different

timing than for actual emotions. They further suggest that the prototypes are less

likely to be used as emblems without quantitative evidence. In addition, certain AUs

and AU combinations are used as conversational signals tied to speech rhythm or

content and can be confused with emotion. The AUs 1+2, 1+2+5 and 4 are the most

commonly seen [EF78a].

Surprise: prototypes 1+2+5X+26, 1+2+5X+27; major variants 1+2+5X, 1+2+26, 1+2+27, 5X+26, 5X+27.
Fear: prototypes 1+2+4+5*+20*+[25,26,27], 1+2+4+5*+[25,26,27]; major variants 1+2+4+5+[L,R]20*+[25,26,27], 1+2+4+5*, 1+2+5Z+(25,26,27), 5*+20*+(25,26,27).
Happy: 6+12*, 12Y.
Sadness: 1+4+11+15X+(54+64), 1+4+11+(54+64), 1+4+15*+(54+64), 1+4+15X+(54+64), 6+15*+(54+64), 1+4+15X+17+(54+64), 11+15X+(54+64), 11+17.
Disgust: 9, 9+16+[15,26], 9+17, 10*, 10*+16+[25,26], 10+17.
Anger: prototypes 4+5*+7+(10*,22)+23+[25,26], 4+5*+7+(17)+[23,24]; major variants: any prototype without any one of 4, 5, 7, or 10.

Table A.17: Translation of FACS AUs into emotions. An * indicates the AU can be at X, Y or Z level of intensity. [a,b,c] stands for a or b or c. (a,b,c) means a and/or b and/or c or none. For sadness, 25 or 26 may appear with all prototypes.

Although FACS is a descriptive coding scheme and has no temporal information, it is used heavily to generate facial animation. Essa [Ess94] removes these deficiencies in FACS and creates FACS++ as a more generative system. Platt [Pla85] determined a more generative, as opposed to analytical, set of AUs from FACS by excluding AUs with a temporal component and those AUs with eye, tongue, or head movements.

A.5 Animation Techniques

The human face is very expressive and is used to communicate ideas as well as emotion. Humans have been practicing interpreting faces since birth and we are very sensitive to inaccuracies. This makes realistic animation a very difficult problem.

Animating faces involves speech, expressions (which includes emotion, gestures, etc.), and life motions. Life motions include: breathing, non-expressive blinking, head motion and hair motion; they are all the little things that make the model look alive, but do not convey specific information.

Facial animation can be used to tell a story [DKT98, Ree90], for low-bandwidth communication [AHS89], for education, to teach the deaf to speechread [Bro79, CM93], for a more familiar human-computer interface, in games, to give inanimate objects human characteristics [PLG91], and with avatars, to name a few.

There are several methods to create animation and all forms of animation have been applied to facial animation at one time or another. It is very common to use several of these methods together to create an animation. This section describes the different techniques that have been used to create facial animation.

A.5.1 Image Based Animation

Image-based methods have been successfully used [AHS89, BS94, CG98, EP97,

EP98, YD88a, YD88b] to create facial animation. In a strictly 2D approach, images are used as keyframes and they are morphed between to create animation. The other basic method is to blend the texture on a 3D model using morphing, while also interpolating the 3D model between keyframes. This gives motion that appears more realistic because the small deformations that occur in the skin, such as wrinkles and bulges, will appear since they are in the new texture. Texture information that may have previously been missing, such as inner mouth and eyelids, will now appear. The major drawbacks are that not only must textures for all possible articulations, and all possible combinations of articulations be acquired (and this may not be possible), but they must also be stored. In addition, the appearance and disappearance of wrinkles and shadows will be unrealistic.

Yau and Duffy [YD88b, YD88a] use textures along with a dynamic model to create facial animation. Very simple model deformations are combined with texture blending to create the animation.

Aizawa et al. [AHS89] describe model-based analysis image coding (MBASIC), which is designed to compress video streams of faces. Facial expressions are syn- thesized by a combination of clip-and-paste and facial structure deformation. The expressive regions are extracted from the input image, transmitted, and then clipped into place. The 3D model is deformed using deformation rules for each of the AUs of

FACS. Control points are defined on the model and the AUs move them based on a parameterization with other points interpolated from the control points.

Brooke and Scott [BS94] describe a system that uses principal component analysis

(PCA) and a hidden Markov model (HMM) to create animated speech of 2D images.

PCA is performed on a standard test spoken set of 100 3-digit numbers. Fifteen principal components that account for over 80% of the variance are identified. The image data is hand-segmented with the encoded frame sequences phonetically labeled and the 15 principal components describe the image of any phoneme. The data is used to train the HMM model and for each triphone HMMs are constructed with between 3 and 9 states. Animation is achieved by fitting a quadratic through the states which results in a continuous stream of PCA coefficients. The coefficients do not exactly represent the statistics of the HMMs but they only deviate slightly and produce smoother transitions.

Cosatto and Graf [CG98] use a sample based method to achieve facial animation.

They break the head into regions and store image samples for each region based on a parameterization for expressions. The parameterization includes visemes in order to

create animations of speech. To produce animation, they determine the appropriate

parameters for each frame and then build the image based on the stored samples.

Ezzat and Poggio [EP97, EP98] extract visemes from a spoken corpus, then, based

on optical flow, concatenate morphs between the visemes together to produce speech.

They record a speaker enunciating one word for each of the 42 phonemes they use.

The visemes are extracted by hand from the video sequence, reduced to 16 visemes

and the optical flow between the frames is calculated. The optical flow for all frames

of a viseme are concatenated together and used as a correspondence for a warp with inverse optical flow used for the reverse warp. The two warps are then blended

together with any holes filled in with a special background color.

Pighin et al. [PALS97, PHL+98] describe a system that uses image processing

techniques to extract the geometry and texture information. To achieve animation,

they create a face mesh and texture for each keyframe and interpolate them. The

geometry is linearly interpolated while the textures are blended. To improve the

blending, each texture is warped using optical flow to fit more closely the texture it is blended with, thereby reducing discontinuities.

A.5.2 Hierarchical Control of Animation

Creating a hierarchy of motion, description, surfaces, etc. is a technique that can be used for animating, parameterizing, and modeling. The concept is to create several simpler layers to replace a single complex layer, to facilitate increased performance or decreased resource requirements. For example, characteristic points, points whose movement is a good indicator of the movements of nearby locations, can be defined on the surface. Moving the characteristic points directly, and the points nearby based

on their relationship to the characteristic points simplifies the deformation process.

Platt [Pla85] divides the face into landmarks and transient regions. Landmarks, which can be moved or masked, are permanent locations on the face. The two types

of landmarks are structural, such as an eye opening, and surface, such as an eyebrow.

Transient regions are locales of change such as the area between brows. Each region of

the face is a frame and the entire face is a frame. By connecting the frames together,

a model is created. The 7 areas of the face are: forehead, left/right eye, left/right

cheek, nose and mouth.

Magnenat-Thalmann et al. [MTPT88] use a hierarchical framework to control a

facial model using three levels of control. At the first level, abstract muscle action

(AMA) procedures are defined for roughly the action of a muscle and each AMA acts

on specific areas of the face. At the next level, AMAs can be combined together to

create expressions, where expressions are emotions (smile, cry, etc.) or phonemes.

Finally, expressions are included in a script track for an actor to produce animation.

A.5.3 Animation Using Scripting

Scripting gives greater control over the characters to the director or animator

and facilitates the reuse of animation. A script is usually a human-readable text

file that is created with a text editor, and then sent to the animation system to

create animation. Generally, a script that animates one character can be reused to

animate another in the same manner. Scripting has been a popular method in facial

animation [Els90b, Pat91, PW91, PWWH86, MTT87, MTPT88] as an interface to

the animation system.

A.5.4 Procedural Animation

Procedural animation attempts to define motion or modeling as a mathematical description that can be written in a procedural language. Reeves [Ree90] describes how animation shorts at Pixar, such as Tin Toy, were made. The models are controlled with menv, which is a procedural language. In a typical virtual reality application, one or more users control a character, but the many supporting characters must be controlled by the computer. This is typically done with procedural and artificial intelligence methods.

A.5.5 Animating Expressions

Animating facial expressions has been the major focus of research in facial ani- mation. Facial expressions are movement or movements of part or parts of the face.

Expressions are used to convey emotion and ideas and they include head nods, eye blinks, raising eyebrows, eye gaze, and high-level emotions. During communication, expressions [Ekm73, EF84] are used to accentuate speech, for turn taking, to convey understanding, and to deceive [Ekm85]. Speech is sometimes considered an expres- sion, other times as separate. Animating speech will be discussed later as a separate topic in Section A.6. As well as creating facial expressions, there has also been re- search on detecting expressions [EG98, Ess94, Red91, RvdGG96].

Since animating expressions is the main goal of most facial animation systems, instead of discussing all that work again in this section only selected works will be discussed. When learning how to animate expressions, it is good to consult books on the psychology of expressions and emotion [Ekm73, EF84, Ekm85], books for hand-drawn animation, or even a book for animators using 3D computer graphics [FD99].

Animating the expressions involves a model capable of deformations and a method

to describe the deformations of the model to specifically create expressions. Nahas et al. [NHS88] use empirical methods to form a set of parameters for creating expres-

sion. Guenter [Gue89a, Gue89b] creates expressions by combining AUs. Patel and

Willis [Pat91, PW91] create expressions from existing expressions, from AUs or from muscles.

The expression can also be determined directly from input speech. Pelachaud et al. [Pel91, PBS91, PVY93] present a facial animation system that correlates facial

expressions with the intonation of speech. An input utterance is parsed to find into- national patterns, phonemes and timing. Lip shapes and eye blinks are created at the

phoneme level with conversational signals and punctuators at the word level. Anima-

tion is created by finding the AUs for emotions, adding the AUs for the phonemes,

then computing conversational signals, punctuators, head and eye movement and

eye blinks. Conversational signals (such as blinking, or raising eye brows) occur on

stressed segments of speech with the rate and type affected by the emotion. Punctu-

ators occur at the end of pauses or to signal punctuation, such as smiling and head

movements, and are emotion dependent. Manipulators (they only consider blinks) are

added for conversational signals and punctuators for regularity. They consider pupil

size for both showing emotion and lighting conditions. Pelachaud et al. [PVY93]

add the generation of expressive wrinkles with a spline surface to the system.

The expression and the speech itself can be generated by the computer. This is

extremely useful for computer-controlled characters in virtual worlds and games, as

well as for background characters. Cassell et al. [CPB+94] describe a system for

automated generation of conversations with expressions.

A.5.6 Physically Based Animation

In physically based animation, motion is produced by solving Newton's equation of motion, F = ma. Vertices of a mesh are treated like particles with mass connected together by springs. The springs are given natural rest lengths taken from the model in its neutral position. External forces, usually simulating muscles, are applied to nodes of the mesh, and the connecting springs cause other nodes to move. The system can be numerically integrated with a method such as Euler integration or, for more accuracy, a higher-order Runge-Kutta scheme. This is considered more of a modeling

technique than an animation technique and is discussed in Section A.3.
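A minimal sketch of one such mass-spring step, with made-up constants and explicit Euler integration, is shown below; it is illustrative only and not any particular published system:

def step(positions, velocities, springs, external_forces,
         mass=1.0, stiffness=50.0, damping=0.5, dt=0.01):
    # positions, velocities, external_forces: per-vertex [x, y, z] lists;
    # springs: list of (i, j, rest_length) taken from the neutral pose.
    forces = [list(f) for f in external_forces]
    for i, j, rest in springs:
        d = [positions[j][a] - positions[i][a] for a in range(3)]
        length = sum(c * c for c in d) ** 0.5 or 1e-9
        magnitude = stiffness * (length - rest)  # Hooke's law
        for a in range(3):
            forces[i][a] += magnitude * d[a] / length
            forces[j][a] -= magnitude * d[a] / length
    for i in range(len(positions)):
        for a in range(3):
            accel = (forces[i][a] - damping * velocities[i][a]) / mass  # F = ma
            velocities[i][a] += dt * accel        # explicit Euler step
            positions[i][a] += dt * velocities[i][a]
    return positions, velocities

A smaller time step, or a Runge-Kutta integrator in place of the Euler step, improves stability and accuracy.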

A.5.7 Keyframing

The keyframing animation technique comes from traditional hand-drawn animation, in which

a master animator would draw key locations in an animation, such as changes in

direction or grabbing an object. Animation would then be produced by drawing all

of the frames in between the key frames by interpolating the two keys. Computers

are extremely helpful in such a technique as they can produce the in between frames

quickly and accurately with a variety of interpolation methods. Keyframing [BL85,

deG89, deG90, EP97, EP98, Gou89, Kle89, Pat91, PW91, PBS91, Pel91, PVY93,

PALS97, PHL+98, PB95, Wai89] is an extremely popular animation technique in

facial animation research. It is often combined with other techniques, with the way in which the keyframes are produced, combined, and interpolated varying widely. For

instance, Kleiser [Kle89] sculpts the face in key positions, digitizes the sculptures, then

interpolates the appropriate keyframes to produce animation. The different positions

can be combined together in a single keyframe. deGraf [deG89, deG90] digitizes a

model in extreme positions then uses a puppeteering interface to quickly create the

[Pat91, PW91] allow multiple interpolation schemes between keyframes. Pelachaud et

al. [PBS91] use B-spline interpolation of keyframes. Pighin et al. [PALS97, PHL+98]

create a face mesh and texture for each keyframe and linearly interpolate the geometry

and blend the textures.
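As a simple illustration of the underlying mechanism, the following sketch linearly interpolates hypothetical facial parameters between bracketing keyframes; the parameter names are made up, and real systems use the richer schemes surveyed above:

def interpolate_keyframes(keyframes, t):
    # keyframes: list of (time, {parameter: value}) sorted by time.
    if t <= keyframes[0][0]:
        return dict(keyframes[0][1])
    if t >= keyframes[-1][0]:
        return dict(keyframes[-1][1])
    for (t0, k0), (t1, k1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            alpha = (t - t0) / (t1 - t0)  # linear in-betweening; splines also work
            return {p: (1 - alpha) * k0[p] + alpha * k1[p] for p in k0}

keys = [(0.0, {"jaw_open": 0.0, "mouth_width": 0.5}),
        (0.5, {"jaw_open": 0.8, "mouth_width": 0.3})]
print(interpolate_keyframes(keys, 0.25))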

Track Based

Specifying all of the parameters for a model for each keyframe can be quite ex-

pensive. A method to ease this burden is to create tracks for an animation. A track

controls a portion of the model such as the position of the eyes, eye blinks, speech,

lip positions, etc. allowing the animator to specify only those parts of the animation

that are important to that keyframe. The track for a specific behavior, such as a

smile, will be similar not only within an animation sequence, but across animations

and characters. This promotes the reuse of animation. A more useful method al-

lows tracks to control overlapping sets of parameters which creates the necessity for

a method to combine parameter values, such as addition, max value, min value, aver-

age value, etc. Many facial animation systems have used tracks. Hofer [Hof93] allows

tracks for each facial feature, allowing complex behavior to be built up hierarchically.

Magnenat-Thalmann et al. [MTPT88] put expressions down in a script track and use

spline interpolation. We use a track based approach in this work to avoid defining

all model parameters for each keyframe and to allow specifying a parameter for more

than one purpose such as for a phoneme and an expression.
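The sketch below illustrates the general track idea with hypothetical parameter and track names; it is not the combination scheme defined later in this dissertation:

def evaluate_tracks(tracks, t, combine="add"):
    # tracks: list of {parameter: curve}, where curve(t) gives the value at time t.
    samples = {}
    for track in tracks:
        for parameter, curve in track.items():
            samples.setdefault(parameter, []).append(curve(t))
    rules = {"add": sum, "max": max, "min": min,
             "average": lambda values: sum(values) / len(values)}
    return {parameter: rules[combine](values)
            for parameter, values in samples.items()}

speech_track = {"jaw_open": lambda t: 0.6 if 0.1 < t < 0.4 else 0.0}
smile_track = {"lip_corner_pull": lambda t: 0.5, "jaw_open": lambda t: 0.1}
print(evaluate_tracks([speech_track, smile_track], 0.2, combine="max"))

Only the parameters touched by some track need to be specified, and tracks that overlap on a parameter are resolved by the chosen combination rule.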

A.5.8 Performance Animation

Performance animation is capturing the motion of some performance and applying it to a facial model to create animation; there are many different possible methods. The first method, puppeteering, is lower tech and requires a skilled puppeteer, who uses some sort of input device that maps input motion to facial motion to create animation. Puppetry allows animations to be produced more quickly,

giving immediate feedback to actors and directors. Usually this is for a real-time

system, but it could also be captured for later playback. deGraf [deG89, deG90] uses

a puppeteering interface that controls a model that has been digitized in extreme

positions. Sturman [Stu98] describes the use of puppetry interfaces to create computer

animation. At the Media Lab, facial animation is produced using input devices, such as gloves, pedals and joysticks, to control multi-target interpolation. Lip synchronization is achieved through the skill of the puppeteer.

Another method is to use image processing and vision techniques to extract motion

from video. Motion from a sequence of images is extracted by tracking feature points

on the face across frames. Sometimes markers or makeup is used. Terzopoulos and

Waters [TW90] use snakes along with make-up to track facial features, which they tie

to muscle actuation. Williams [Wil90a] tracks reflective dots on a face and animates

corresponding control points on a facial model. Williams [Wil90b] and Patterson et

al. [PLG91] improve realism and describe using the method to animate a dog in the

film short “The Audition.”

Saulnier et al. [SVG95] describe a near real-time system that processes a video

signal and extracts shape and motion parameters for a human face then uses the

parameters to control a virtual clone that acts similarly to, but does not necessarily look like, the subject. The entertainment industry (movies, TV, games, etc.) uses other forms of motion capture, including magnetic and vision systems. The vision systems often use infrared wavelengths.

Animation can also be driven directly from a speech waveform. This technique is performance based, but it does not obtain motion directly from the performance.

Simons and Cox [SC90] report on a real-time system with a delay that generates mouth shapes from an input utterance. Pelachaud et al. [PBS91, PVY93] drive facial animation from speech and correlate facial expressions with the intonation of the speech.

A.5.9 Displacement

In displacement animation, deformations are defined as the difference from some neutral state. Bergeron and Lachapelle [BL85, Ber87] define key frames then inter- polate them using curves achieving speech-sync by hand. These notes describe the

TAARNA system used to animate the title character in the film short “Tony De Pel- trie.” TAARNA is a command driven interface that allows the creation of keys that can then be interpolated for animation. The system allows facial expressions and body movement. The character in the film short is a caricature with realistic expres- sions. Five elements are animated independently: left and right eyebrows, left and right eyelids, and general expressions (with mouth). Eyeballs are controlled with the skeletal control. 28 general expressions, including phonemes, made by a live model were photographed and of these 20 were digitized. They painted contour lines on a live model who was photographed making expressions that were then digitized. A model of the characters head was made and digitized and a correspondence from the

live model to Tony was made. Using this correspondence, they created expressions on

Tony’s face. New expressions are created by interpolating existing ones. Animation

is achieved by defining keyframes and using curved interpolation.

Elson [Els90a, Els90b] describes an animation technique using displacement states,

either absolute or relative of a base model. Animation is achieved by scripting when

each displacement needs to be done. They are layered such that the phoneme dis-

placements for speech can be done in one layer and the displacements for expressions

in another.

A.5.10 Natural Language Animation

Takashima et al. [TST87] describe an animation system that creates animation

from stories written in natural language. The system works on Japanese and is written

in Lisp. There are three modules in the system: the story understanding module, the

stage directing module and the action generating module. The first module attempts

to understand the story, based on actions, and creates a scenario. The stage directing

module adapts the story based on heuristics, creating scenes that can be animated

by the action module. The action generation module takes the adapted scenario and

creates models and motion for the characters and scene.

A.6 Animating Speech

A natural function for human faces is speech. Speech allows us to convey infor-

mation quickly. Synchronizing animation to audial speech is very important to create

smooth, understandable speech. McGurk [MM76] reported that when the lips show one sound and the audio carries a second, a third sound is perceived. Massaro and

Cohen [MC83] show that the visual part of speech is very important. Therefore, it is

crucial to synchronize properly. Without proper synchronization the message can be misinterpreted or ignored. In addition, when the intention is to create a believable human, our familiarity with human speech allows us to immediately know something is wrong, even if we do not know exactly what. It is interesting to note that when using a caricature or cartoon interface that is not intended to look completely realistic, the audience will quickly fill in the missing information and just pay attention to the message, but when attempting to be realistic, the slightest imperfection is quickly noted.

A.6.1 Contribution of the Visual Signal in Speech Perception

McGrath [McG85] describes various experiments to study visual and audio-visual speech perception using both natural and synthetic faces. A study of sentence intelli- gibility while delaying the acoustical element of an audio-visual signal showed that a delay of 80ms or more causes a disruption in understanding. This experiment implies that disruption is on a syllabic and not a phonetic time-scale and lipreading could

handle a 40ms delay. In another set of experiments, lighting and makeup were used to control visibility of the teeth, tongue, lips, and facial frame of a natural face. The

lipreading results were compared to computer generated outlines of faces [BS83] and

showed: 1) under these circumstances synthetic is lipread just as well as natural; 2)

multidimensional scaling analysis accounted for confusions among natural and syn-

thetic vowels with the same three perceptual dimensions; 3) visibility of the teeth

aided in distinguishing close-front vowels from other vowels, and rounded from unrounded vowels; and 4) appropriate consonant-vowel transitions improved lipreading accuracy. Finally, the synthetic model was able to elicit the McGurk Effect (see footnote 5).

Massaro and Cohen [MC83] present experiments to determine the contributions of audial and visual information during speech perception. They use recorded video from an actual speaker and generate speech using the Klatt system with the formants of the recorded speaker. They conclude that the visual portion of speech contributes strongly to what is perceived. They point to other research that confirms their conclusions and to research that disagrees with them.

Guiard-Marigny et al. [GMOB95, GMAB97] evaluate speech intelligibility of a lip model [GMAB94] alone and of the same lip model superimposed upon a synthetic jaw and skull, versus just audio. The lip model was able to restore about 1/3 of the missing information when the acoustic signal was degraded, and there was a noticeable gain in speech intelligibility when the synthetic jaw was added to the lips.

Brooke and Templeton [BT90] report on research concerned with determining the lowest resolution where it is still possible to discern visual cues of speech. The preliminary study only considers 5 British vowels and the early results show that an area of 16x16 pixels of the oral area with just two levels of intensity is adequate to detect the visual cues of speech. This is good news for cell phone and talking wristwatch applications.

Footnote 5: McGurk and MacDonald [MM76] took the visual signal of /ga-ga/ and dubbed it with the audial signal for /ba-ba/. Subjects viewing the video often reported hearing /da-da/.

A.6.2 Head Movement During Speech

Hadar et al. [HSGR83] present a study of head movement during speech and conclude that the head is almost constantly moving during speech and is still during pauses and while listening. They measured the velocity of the head at .4 second intervals and considered a point below a threshold (4° and .2 Hz) to be at zero velocity. During speaking turns, 75.7% of the data points were non-zero and during pauses only 12.8% were non-zero, giving strong evidence that during speech the head is constantly moving. When removing pauses greater than 1 second during speech, non-zero velocities were 89.9%, and the pauses accounted for 58.8% of the still positions during speech. They noted that as the frequency of the movement increases, the amplitude lowers. They define three classes of movements: slow movements (.2-1.8 Hz), ordinary movements (1.8-3.7 Hz), and rapid movements (3.7-7.0 Hz).

A.6.3 Lip Synchrony

Synchronizing the motion of the lips with the audio to create convincing speech is an extremely difficult problem. Figure A.8 is a schematic of the most common method of creating animation of synchronized speech. This method is an extension of the practices of traditional hand-drawn animation [Mad69]. The sound track is broken up into phonemes along with their timing, location and duration. This can be done by hand, automatically, or by a combination of the two. A very simple automated method equates loudness with jaw rotation. Other methods include spectrum matching and linear prediction [LP87, Lew91]. The sound track is either a speech waveform [deG89, deG90, MAH90, Pel91, PBS91, PVY93, PB95, SC90] or text [HPW88, MAH90, PWWH86, WL93, WH89, WH90].


Figure A.8: A typical design for speech synchronized animation.

The phonemes are then converted into speech, if required, and are converted into visemes. To create the visemes, a mapping between the phonemes and model deformations must be made. This mapping is usually done in advance and reused. The mapping can be created by extracting information from photographs or video, or by using speechreading [Fro64, KBG85, NNT30, Wal82] techniques. Animation is achieved by interpolating between the visemes. Unfortunately, it really is not that simple. During speech, the same phoneme does not always look the same visually; its appearance depends on the phonemes before and after it. This phenomenon is known as coarticulation and is discussed in Section A.6.4.
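As a concrete illustration of this pipeline, the following minimal Python sketch maps timed phonemes to viseme keyframes and linearly interpolates between them; the phoneme timings and the phoneme-to-viseme table are assumed inputs, and no coarticulation is modeled.

def naive_lip_sync(phonemes, phoneme_to_viseme, fps=30):
    """phonemes: list of (symbol, start_time, duration) tuples in seconds.
    phoneme_to_viseme: dict mapping each phoneme symbol to a list of model parameters.
    Returns one parameter list per frame, linearly interpolated between viseme keys."""
    # one viseme keyframe at the temporal centre of each phoneme
    keys = [(start + 0.5 * dur, phoneme_to_viseme[sym]) for sym, start, dur in phonemes]
    end = phonemes[-1][1] + phonemes[-1][2]
    frames = []
    for f in range(int(end * fps) + 1):
        t = f / fps
        if t <= keys[0][0]:
            frames.append(list(keys[0][1]))
        elif t >= keys[-1][0]:
            frames.append(list(keys[-1][1]))
        else:
            # locate the bracketing keyframes and blend them linearly
            for (t0, v0), (t1, v1) in zip(keys, keys[1:]):
                if t0 <= t <= t1:
                    a = (t - t0) / (t1 - t0)
                    frames.append([(1 - a) * x0 + a * x1 for x0, x1 in zip(v0, v1)])
                    break
    return frames

A coarticulation model replaces the fixed table lookup and the purely local interpolation of such a sketch.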

In Parke’s seminal works [Par72, Par74, Par75], traditional methods [Mad69] are applied to his 3D model by hand. Parke built an animation by going through the following steps:

1. Set the parameters frame by frame, resulting in too much motion.

2. Allowed only smoothed jaw movement, which created a smoother animation but still did not look natural nor match the speech well.

3. Added /f/ and /v/ tucks, with no improvement.

4. Added variations to the mouth width, with some improvement.

5. Modified the model to allow upper lip movement, giving improved animation.

6. Finally, added eye, eyebrow and mouth expression motion, resulting in more convincing animation.

This technique places heavy demands on the effort and talent of the animator. These demands have led to the development of new automated techniques.

Lewis and Parke [LP87] obtain facial animation by processing a recorded utterance

using linear prediction. They use 9 vowels and 3 consonants as phonemes in the

system. They determine the parameters for the Parke model for these phonemes

creating keyframes to interpolate.

Pearce et al. [PWWH86] modify the Parke model to have a scripting interface that

allows a phoneme as input. Hill et al. [HPW88] describe how they modify a speech-

synthesis-by-rules system to include the facial model parameters, which are obtained

from photographs. Wyvill et al. [WH89, WH90] explain the graphics portion of the

research. By inputting the phonemic information to both the facial model and the

speech synthesizer, animated speech is generated.

Nahas et al. [NHS88, NHRD90] describe a system that moves attracting points to deform the shape of a B-spline surface. They produce synchronized French speech

using 20 phonemes.

deGraf [deG89, deG90] captures a model in phonemic positions, then uses voice recognition to detect the phonemes and a puppeteer interface to control expressions and head movements.

Morishima et al. [MAH90] describe a method to generate real-time visual speech by computer from either text or a speech waveform. Text-to-image conversion is accomplished by breaking speech into phonemes, with parameters to aid in handling coarticulation effects. They control 8 feature points, located on the lips and chin, for each phoneme and rely on a muscular structure to deform the rest of the mesh.

The phoneme parameters are gathered by processing images of the person voicing the phonemes. Key frames for the phonemes are used with spline interpolation. For speech to image conversion, the waveform is processed by one of two methods, vector quantization or a neural network. For vector quantization, speech is broken up into a code word index. Using the codebook, which has the parameters for the 8 feature points, visual speech is generated. Eye movement, eyebrow movement and blinking are random to give the appearance of realism.

Pelachaud et al. [Pel91, PBS91, PVY93] describe a system that combines intonation of speech with facial expressions. The system produces automatic facial expressions with wrinkles from spoken input. They extract the phonemic information and use rule-governed translation from speech and utterance meaning to facial expressions, primarily those expressions that convey information from the intonation of the voice. Coarticulation is handled by using a look-ahead model.

Waters and Levergood [WL93] describe DECface, a system that generates lip-synchronized facial animation from text input. Both the speech and the animation are computer generated and synchronized in real-time. The system uses 45 phonemes and, for each phoneme, an associated mouth position or viseme. To generate speech, the text is converted to a series of phonemes and times, t. The visemes become keyframes at the specified times and, using

s′ = s₀ (1 − cos(π(s − s₀))) / 2,

they interpolate the keyframes, giving acceleration and deceleration near the keys. However, during fluent speech the mouth rarely reaches the ideal viseme shape due to the short interval between positions and the physical properties of the mouth. To simulate this behavior, the nodes are assigned a position, mass, and velocity and, using spring forces between them, Euler integration is performed, giving the system elastic behavior. The visemes were defined from studying snapshots of a person mouthing CVC and VCV strings.
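Read as a raised-cosine ease, the interpolation above can be sketched as follows; this is only an illustration of the ease-in/ease-out idea, with assumed keyframe values, and is not taken from the DECface implementation.

import math

def cosine_ease(v0, v1, t, t0, t1):
    """Blend keyframe values v0 (at time t0) and v1 (at time t1) with a raised-cosine
    ease, so the motion accelerates away from one key and decelerates into the next."""
    s = (t - t0) / (t1 - t0)                 # normalized time in [0, 1]
    w = 0.5 * (1.0 - math.cos(math.pi * s))  # eased blending weight
    return (1.0 - w) * v0 + w * v1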

Provine and Bruton [PB95] describe a method of synchronizing lips with audio designed for model-based communication. The speech is broken up into fundamental speech units (FUs): phonemes and diphthongs. Phonemes have one keyframe, while diphthongs have two positions associated with them, using 1/3 of the FU's time for the initial position and the rest of the time for the final position. Instead of interpolating between the geometric positions, they interpolate the muscle parameters based on FACS for each FU, allowing the speech to be encoded along with other facial motion. When a muscle is involved in both an expression and an FU, the maximum pull factor of the two actions is used. Coarticulation effects are modeled by looking ahead when doing parametric interpolation.
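The rule for sharing a muscle between an expression and an FU can be written directly; in this sketch the per-muscle pull-factor dictionaries are hypothetical inputs.

def combine_pull_factors(expression_pulls, fu_pulls):
    """Per-muscle pull factors from a facial expression and a fundamental speech
    unit (FU); when both act on the same muscle, the larger pull factor wins."""
    combined = dict(expression_pulls)
    for muscle, pull in fu_pulls.items():
        combined[muscle] = max(combined.get(muscle, 0.0), pull)
    return combined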

An approach that bypasses the coarticulation problem is to capture the motion

of the lips from the person actually speaking the sound track. The resulting motion

is synchronized with the speech and very convincing if the motion is captured with

sufficient resolution. However, novel speech and speech modification become issues.

The motion can be captured by hand or with automated techniques.

Patterson et al. [PLG91] film an actor giving the performance, capture the motion from the video sequence, and use it to drive animation of another character along with the original soundtrack, giving lip-sync for free. They are able to animate any character by creating a correspondence between the captured points and the character.

For instance, they can animate a talking fence, statue or dog.

Early work in the speech and hearing community involved drawing lip outlines extracted from photographs by hand on CRTs [Bro79, Bro82, BS83] or oscilloscopes

[Bos73, Erb77, EF78b, ESF79].

A.6.4 Coarticulation

Coarticulation is one of the most important issues for computer generated visual speech. Coarticulation is the blending effect that surrounding phonemes have on the current phoneme. Both the previous and future phonemes can affect the current phoneme. Coarticulation is a byproduct of the finite acceleration and deceleration of the tissues involved in the vocal tract. For example, when saying "to", the lips round during the production of /t/ in anticipation of /oo/, but when saying "ta" there is no rounding. Another example is the /h/ in "how", which has the start of the rounding of the lips to produce the /o/ sound, whereas in "hat" there is no rounding. The segment of animation that produces the /h/ in "how" and the /h/ in "hat" should look different, but if a system only interpolates between visemes they will look very similar. If speech is produced at a fast rate, the blending will be increased.

Pelachaud et al. [Pel91, PBS91, PVY93] handle coarticulation using a look-ahead model [KM77] that considers articulatory adjustment on a sequence of consonants followed or preceded by a vowel. They integrate geometric and temporal constraints into the rules to make up for the incompleteness of the model. The algorithm works in 3 steps: 1) apply a set of coarticulation rules to context-dependent clusters, 2) consider the relaxation and contraction time of muscles in an attempt to find influences on neighboring phonemes, and 3) check the geometric relationship between successive actions.

Cohen and Massaro [CM93] extend the Parke model for better speech synchrony by adding more parameters for the lips and a simple tongue. They include a good discussion of coarticulation research by the speech and hearing community. They use the articulatory gesture model of Löfqvist [Löf90], which uses the idea of dominance functions. Each segment of speech has a dominance function for each articulator that increases then decreases over time during articulation. Adjacent segments have overlapping dominance functions that are blended. They use a variant of the negative exponential function

D_sp = α_sp e^(−θ←sp |τ|^c)   if τ ≥ 0
D_sp = α_sp e^(−θ→sp |τ|^c)   if τ < 0

where D_sp is the dominance of parameter p for speech segment s, α gives the magnitude of the dominance function, τ is the time distance from the segment center, c controls the width of the dominance function, θ←sp is the rate on the anticipatory side, and θ→sp is the rate after the articulation.
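In the usual reading of this model, the value of a facial parameter at time t is the dominance-weighted average of the per-segment targets. The Python sketch below follows that reading; the targets, magnitudes and rate constants it takes as input are illustrative values, not ones taken from [CM93].

import math

def dominance(tau, alpha, theta_pos, theta_neg, c):
    """Dominance at signed time offset tau from the segment centre:
    alpha * exp(-theta * |tau|**c), with one rate for tau >= 0 and another for tau < 0."""
    theta = theta_pos if tau >= 0 else theta_neg
    return alpha * math.exp(-theta * abs(tau) ** c)

def blended_parameter(t, segments):
    """segments: list of (centre_time, target, alpha, theta_pos, theta_neg, c) tuples.
    Returns the dominance-weighted average of the segment targets at time t."""
    weights = [dominance(t - ct, a, tp, tn, c) for ct, _, a, tp, tn, c in segments]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * seg[1] for w, seg in zip(weights, segments)) / total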

Waters and Levergood [WL93] describe DECface, a system that generates lip-synchronized facial animation from text input. Using DECTalk to generate phonemes and a waveform, they associate a keyframe with each phoneme and interpolate between them. To simulate acceleration and deceleration they use

s′ = s₀ (1 − cos(π(s − s₀))) / 2

for the interpolation. However, in fluent speech the target position is rarely achieved, so they assign a mass to the nodes and then use Hookean spring calculations, with simple Euler integration, to approximate the elastic nature of the tissue. Expressions are created with Waters' muscle model [Wat87].

A method to deal with coarticulation, popular in 2D image techniques, is to use triphones instead of phonemes. A triphone is a combination of three phonemes. More information must be stored, but the resulting transitions are more realistic. This is popular in vision techniques, where triphones can be input as training data and then reused for animation.

Brooke and Scott [BS94] describe a system that uses principal component analysis (PCA) and a hidden Markov model (HMM) to create animated speech from 2D images. PCA is performed on a standard test set of 100 3-digit numbers spoken by a native British-English speaker, and 15 principal components were identified which accounted for over 80% of the variance. The image data is hand segmented, the encoded frame sequences are phonetically labeled, and the 15 principal components describe the image of any phoneme. The data is used to train the HMMs; for each triphone an HMM is constructed with between 3 and 9 states. Animation is achieved by fitting a quadratic through the states, resulting in a continuous stream of PCA coefficients. The coefficients do not exactly represent the statistics of the HMMs; however, they deviate only slightly and produce smoother transitions.
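Producing a continuous coefficient track by fitting a low-order curve through per-state values can be sketched as below; the state times and coefficient values are hypothetical, and NumPy's least-squares polynomial fit stands in for whatever fitting procedure was actually used.

import numpy as np

def smooth_coefficient_track(state_times, state_values, frame_times, degree=2):
    """Fit a quadratic (least-squares polynomial of the given degree) through the
    per-state coefficient values and evaluate it at the animation frame times."""
    coeffs = np.polyfit(state_times, state_values, degree)
    return np.polyval(coeffs, frame_times)

# e.g. smooth_coefficient_track([0.00, 0.04, 0.08], [1.2, 0.7, 0.9], np.linspace(0.0, 0.08, 9))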

A.6.5 Emotion in Speech

Ekman [Ekm73] describes emotional emblems, in which a person uses the AU or AUs for an emotion to describe an emotion, just as if using words, without experiencing the emotion. An example emblem is smiling when saying, "I've had a great day." Ekman and Friesen [EF78a] suggest that some of the major emotions may occur as emblems with timing different than for actual emotions. In addition, certain AUs and AU combinations are used as conversational signals tied to speech rhythm or content and can be confused with emotion.

van Bezooyen [vB84] describes research into how emotion is voiced. The aim of the work was to establish the characteristics and recognizability of vocal expressions

of emotion in Dutch as a function of sex, age, and culture. Vocal refers to how things

are said as opposed to what is said. The study uses neutral, disgust, surprise, shame,

interest, joy, fear, contempt, sadness, and anger as the different emotions. A speaker

enunciates 4 phrases using the 10 different emotions several times and the best sample

for each emotion is collected. This procedure is repeated twice for 4 different speakers

giving 320 recorded emotions. The database of emotions is used to test perceptual

characteristics, determine the base set of perceptual characteristics, find acoustic cor-

relates of the perceptual parameters, determine recognition based on age and sex and

also by native language. The perceptual characteristics used are: lip rounding, lip spreading, laryngeal tension, laryngeal laxness, creak, tremulousness, whisper, harshness, pitch level, pitch range, loudness, tempo, and precision of articulation. With statistical analysis, van Bezooyen determines that loudness, laryngeal laxness, pitch level, pitch range, and laryngeal tension are all that are needed to classify the emotions.

Murray and Arnott [MA96] develop the Helpful Automatic Machine for Language and Emotional Talk (HAMLET), which uses DECTalk for rule-based speech synthesis. The system controls parameters (underlying voice quality, pitch, and timing of phonemes) to change the basic intonation contour of the utterance.

A.6.6 Animating Conversations

Cassell et al. [CPB+94] describe automated generation of conversations with expressions. Their system automatically generates animated conversations between multiple human-like agents with speech, intonation, facial expressions and hand gestures. A dialogue planner creates the text and intonation of the utterances that are used to create synchronized speech, facial expressions, lip motion, eye gaze, head motion, and hand gestures. Communication is achieved using language and nonverbal communication such as facial expressions and hand gestures.

Movements of the head and facial expressions are characterized by their relationship to the linguistic utterance and their significance in transmitting information, and can be separated into five groups. Syntactic functions accompany speech flow and are synchronized at the verbal level. These facial movements, such as raising the eyebrows, nodding, or blinking, appear on an accented syllable or a pause. Semantic functions emphasize what is being said, substitute for a word or refer to an emotion, such as wrinkling the nose when talking about something disgusting. Dialogic functions regulate speech flow and depend on the relationship between the participants of the conversation. Speaker and listener characteristic functions convey the speaker's social identity, age, emotions and attitude. For instance, when speaking with a friend there will be more eye contact than when lying to someone. Finally, listener functions correspond to reactions to the speaker such as signals of agreement, attention

or comprehension.

Hand gestures include iconics, metaphorics, deictics, and beats. Iconics represent a

feature of the speech such as creating a circle with your hands to describe something

round. Metaphorics represent an abstract feature of the speech such as making a

grabbing motion with a hand while talking about taking something. Deictics indicate

a point in space and refer to people, places and other spatializeable entities. An

example deictic is to point to someone and say “her.” Beats are small formless waves

of the hand for heavily emphasized words, turn taking, and other special linguistic

work. An example would be to briefly raise a hand up and down while saying “all

right.”

Synchronization of hand movements and gaze is handled by Parallel Transition

Networks (PaT-Nets) that allow coordination rules to be encoded as simultaneously

executing finite state automata. The gesture generator will generate gestures dur-

ing the utterance and a coarticulation model will guarantee there is enough time to

complete the gestures by shortening or aborting them altogether. Eye gaze, head

movement and facial expressions are specified separately. Facial expressions con-

nected to intonation are automatically generated, while other kinds of expressions,

such as emblems, are specified by hand. Gaze is separated into planning, comment,

control and feedback. Facial expressions are clustered into functional groups: lip

shape, conversational signal, punctuator, manipulator and emblem.

Takeuchi and Nagao [TN93] modify a speech dialogue system to add facial displays. They then analyzed the effectiveness of the dialogue system with and without the facial displays. They found that conversations between users and the dialogue system were more successful when facial displays were used.

Facial displays are communicative signals that help coordinate conversation, and are primarily social. They use three types of displays based on Chovil [Cho89]:

• Syntactic Displays. Facial displays that mark stress on words or clauses, are

connected with the syntactic aspects of the utterance, or are connected with

the organization of the speech. The syntactic displays are:

– Exclamation Marks - eyebrow raising.

– Question marks - eyebrow raising or lowering.

– Emphasizers - eyebrow raising or lowering.

– Underliners - longer eyebrow raising.

– Punctuations - eyebrow movements.

– End of an utterance - eyebrow raising.

– Beginning of a story - eyebrow raising.

– Story continuation - avoid eye contact.

– End of a story - eye contact.

• Speaker Displays. Facial displays that illustrate the idea being verbally con-

veyed or add additional information to the ongoing verbal content. The speaker

displays are:

– Thinking/Remembering - Eyebrow raising or lowering, closing the eyes,

and pulling back one side of the mouth.

– Facial shrug/“I don’t know” - Eyebrow flashes, mouth corners pulled down,

and mouth corners pulled back.

– Interactive/“You know?” - eyebrow raising.

– Metacommunicative/Indication of sarcasm or joke - eyebrow raising and

looking up and off.

– “Yes” - Eyebrow actions.

– “No” - Eyebrow actions.

– “Not” - Eyebrow actions.

– “But” - Eyebrow actions.

• Listener Comment Displays. Facial displays made by the listener that are made

in response to the utterances of the speaker. Listener comment displays are:

– Backchannel/Indication of attendance - eyebrow raising, mouth corners

turned down.

– Indication of loudness - eyebrows drawn to center.

The listener may also use displays to acknowledge their level of understanding to the speaker. The understanding level displays that Takeuchi and Nagao [TN93] use are:

• Confident - Eyebrow raising, head nod.

• Moderately confident - Eyebrow raising.

• Not confident - Eyebrow lowering.

• “Yes” - Eyebrow raising.

The listener may also make displays that portray how they have evaluated the utterance. The evaluation of utterance displays that Takeuchi and Nagao [TN93] use

are:

• Agreement - eyebrow raising.

• Request for more info - eyebrow raising.

• Incredulity - longer eyebrow raising.

A.7 Surgery

Another application for facial animation is in surgical planning [DWD+83, KAA83, KGPG96], training [DWD+83, Pie89], and simulation [KAA83, KGPG96, KGC+96, KMCT96, Pie89, RGTC98]. A computer simulation of surgery can allow a surgeon to practice an upcoming surgery and to simulate the post-operative shape and movement for use in planning the surgery. A simulation of wound closure allows a surgeon to practice different closure techniques to gain experience, as well as to plan for a particular operation.

Dev et al. [DWD+83] describe a commercial system that takes CT images and

creates a 3D model of the bone, which can be used to mill a model. Karshmer et

al. [KAA83] describe work in progress on a system designed to view pre-operative

patient data and simulate post-operative results over time. Keeve et al. [KGPG96]

describe a system to aid in surgical planning that uses CT data semi-automatically

registered to laser scanned surface data. The data is fit to a generic finite element mesh and then the bone is modified and the resultant surface is simulated.

Koch et al. [KGC+96] report on a system to simulate facial surgery, which requires

that the data be accurate for each patient. They use a higher-order finite element

surface model with a mesh of springs attached to the skull to get greater accuracy.

The data is collected from volumetric medical scans of the patient, which includes

the skull, skin and the tissue between. The system can be used to determine not only facial shape, but also facial function due to muscle action. The system was extended

[KGB98] to allow for the creation of facial expressions to further evaluate the surgical

results.

Pieper [Pie89] discusses a system for simulating plastic surgery as well as the re-

sults of the surgery. The skin model has a tetrahedral structure connected via springs

with three layers: the epidermis, the dermis and subcutaneous cellular tissue. Springs

are also used to model constraint forces on the skin, which include muscles, gravity

and draping. The three important properties of facial tissue Pieper is concerned with

are: non-linear viscoelastic behavior, nonhomogeneity, and anisotropic response due

to irregular anatomy.

Konno et al. [KMCT96] describe Computer Aided Facial Expression Simulation

(CAFES), a facial surgery simulation system designed to aid in the surgery of cases

with facial paralysis. Determining how to correct problems of paralysis currently is

done by experience of the surgeon. This system is designed to aid in planning by

creating a tool to determine post operative facial motion. To create the 3D geometry

a Bezier contour curve is fit to the skin and skull and radial lines are projected at

regular intervals from the center of the skull in each CT slice. The intersections of the

lines with the contour curves form a set of vertices in each slice that, when joined together, form a polygonal mesh. The mesh is treated as a mass-spring lattice and spring muscles are added with a GUI by indicating the insertion points. They use

fourth order Runge-Kutta integration to animate the system.
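A one-dimensional toy version of such a mass-spring system integrated with fourth-order Runge-Kutta is sketched below; it only illustrates the integration scheme, not the CAFES system itself, and the spring constant, damping and masses are assumed values.

import numpy as np

def spring_accel(x, v, rest, k, masses, damping):
    """Accelerations of a 1-D chain of point masses joined by Hookean springs.
    x, v: positions and velocities (length N); rest: rest lengths (length N-1)."""
    f = np.zeros_like(x)
    stretch = (x[1:] - x[:-1]) - rest
    f[:-1] += k * stretch        # a stretched spring pulls its left node rightward
    f[1:] -= k * stretch         # and its right node leftward
    f -= damping * v             # simple velocity damping
    return f / masses

def rk4_step(x, v, dt, rest, k, masses, damping):
    """Advance the mass-spring state (x, v) by one fourth-order Runge-Kutta step."""
    a = lambda xx, vv: spring_accel(xx, vv, rest, k, masses, damping)
    k1x, k1v = v, a(x, v)
    k2x, k2v = v + 0.5 * dt * k1v, a(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
    k3x, k3v = v + 0.5 * dt * k2v, a(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
    k4x, k4v = v + dt * k3v, a(x + dt * k3x, v + dt * k3v)
    x_new = x + dt / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + dt / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new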

Roth et al. [RGTC98] use a finite element approach for volumetric modeling of

soft tissue for accurate facial surgery simulation. They extend linear elasticity by

adding incompressibility and nonlinear behavior with a Bernstein-Bezier formulation.

The Bernstein-Bezier form is a higher-order interpolation and suitable for rendering

and modeling. In addition it has an integral polynomial form of arbitrary degree and

an efficient subdivision.

A.8 Vision

Many of the applications and problems of facial animation overlap with those of

computer vision. For instance, in order to provide motion capture data, the subject

must be tracked, and one way to track is with computer vision techniques. Using a

3D model to aid in tracking provides more robust algorithms. This 3D model should

be deformable with a small number of parameters to describe it. Therefore, facial

modeling can aid in facial tracking, which gives data to create facial animation of the

facial model.

Another area of overlap is in user interfaces. A system that provides a talking head to give information to a user would greatly benefit from a vision system that can read the user's emotions and speech as input. Extending the paradigm of speech as a natural way for humans to communicate, it makes sense for the human to provide information as well as receive it with speech. This section provides a glimpse at some

of the research that is primarily in the computer vision domain, but has crossover

with facial animation.

A.8.1 Tracking

Yuille et al. [YCH88] use deformable templates to track facial features. An energy function tying edges, peaks, and valleys in intensity and the intensity in the image to the parameters of the deformable templates is minimized to find the best fit of the template. The function is updated by steepest descent instead of sampling for speed. The templates use a priori knowledge of the feature being searched for, have global structural forces, and have a finite number of parameters that can be used as a compact representation of the feature.
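Fitting such a template by steepest descent can be sketched generically as below; the energy function, step size and finite-difference gradient are placeholders standing in for the image potentials and update rule of the original work.

import numpy as np

def fit_template(params, energy, steps=200, lr=0.1, eps=1e-3):
    """Minimize energy(params) by steepest descent, where `energy` ties the template
    parameters to image evidence (edges, peaks and valleys in intensity).
    The gradient is estimated with central finite differences."""
    p = np.asarray(params, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(p)
        for i in range(p.size):
            dp = np.zeros_like(p)
            dp[i] = eps
            grad[i] = (energy(p + dp) - energy(p - dp)) / (2.0 * eps)
        p = p - lr * grad
    return p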

Reddy [Red91] presents algorithms for tracking facial features using the CANDIDE model and interpreting the motion as a linear combination of global and local motion. A predictive method, based on FACS, was developed for local motion tracking. Feature points are detected and tracked with deformable curves and search regions.

Blake and Isard [BI94] present a method of tracking the outline of a non-polyhedral object at frame-rate, without special hardware, using only a workstation with a video camera and framestore. The algorithms are a synthesis of B-spline curve representation, deformable models, and control theory. As an example of their tracking algorithms, they present results from tracking rigid and non-rigid lip motion. They present two algorithms: one employs Kalman filtering and the second is a learning algorithm that builds a dynamic model from examples.

Saulnier et al. [SVG95] describe a near real-time system that processes a video signal and extracts shape and motion parameters for a human face, then uses the parameters to control a virtual clone that acts like, but does not necessarily look like, the subject. The system tracks head and lip movements, closing and opening of the eyelids, gaze direction and eyebrow contractions.

A.8.2 Face Detection and Recognition

Face recognition is an important topic in computer vision. There are two goals:

detecting [SNK72] a human face or faces in an image, and recognizing [Kan77, MP95]

a particular person from an image of their face. Face detection and recognition have

applications in human computer interaction and security.

Sakai et al. [SNK72] present a method to detect human faces using a detection-

after-prediction method with feedback. The input consists of binary images of edges

from a Laplacian filter. The algorithm detects eyes, nose, mouth, chin, and chin

contour and does not proceed in a predetermined order, but instead uses the feedback of previous steps along with a priori knowledge of the organization of a face to choose the best order for the current input. Their method works well for simple faces and for faces that are looking forward; however, faces with glasses and beards are problematic.

Moghaddam and Pentland [MP95] describe a system for model-based coding of faces. They determine the distance-from-face-space (DFFS) for each pixel, which is the reconstruction error at that pixel. The image is linearly scaled and the DFFS for each pixel is calculated. By finding the global minimum, they find the best match for a face. Subdivision of the image allows them to find eigeneyes, eigennoses, eigenmouths, etc. The subdivision, along with heuristics and special knowledge, allows a more robust method of finding the face. This method allows them to detect faces, detect a particular face, and compress the signal.
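The DFFS computation itself can be sketched as the residual of a projection onto the face subspace; the mean face and the orthonormal eigenfaces below are assumed inputs, and detection amounts to evaluating this residual for every candidate patch and keeping the global minimum.

import numpy as np

def distance_from_face_space(patch, mean_face, eigenfaces):
    """Reconstruction error of a flattened image patch after projecting it onto the
    span of the leading eigenfaces (rows of `eigenfaces`, assumed orthonormal)."""
    x = patch - mean_face
    coeffs = eigenfaces @ x                  # project onto the face subspace
    reconstruction = eigenfaces.T @ coeffs   # back-project into image space
    return float(np.linalg.norm(x - reconstruction))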

A.8.3 Compression

Very high compression ratios for video and images can be achieved if the contents of the image(s) are considered instead of the pixels. For instance, in video or images of faces, extracting high-level information about the face and transmitting or storing only that high-level information results in a large reduction of storage requirements. The two common methods are to describe the face in terms of a face space or to fit a parameterized facial model to the image and use those parameters to describe the face.

Sirovich and Kirby [SK73] define eigenpictures, which can represent a picture of a face by a low-dimensional vector to within some error bound. Moghaddam and Pentland [MP95] use eigenfaces to define a face as 100 bytes of data, allowing for high compression. Reinders et al. [RvBSvdL95] describe a system that fits the CANDIDE model to a video sequence so that only parameters for the 3D model need to be sent instead of the entire video stream. Saulnier et al. [SVG95] extract shape and motion parameters of a 3D model for a human face, then use the parameters to control a virtual clone that acts like, but does not necessarily look like, the subject. Provine and Bruton [PB95] break a video sequence of a talking person into fundamental speech units (FUs): phonemes and diphthongs, and transmit only the model parameters needed to recreate the speech.

A.8.4 Speech Reading

Chandramohan and Silsbee [CS96] describe a method for modeling the shape of a speaker's mouth. They use a hybrid matching system: a pixel-based algorithm finds a coarse location of the mouth and selects an appropriate deformable template, and the selected template is then used to find the mouth. Templates are sensitive to lighting conditions and to initialization and are generally good at recognizing only a subset of the total possible shapes. Preprocessing the image to remove lighting biases and the pixel-based template matching algorithm allow good initialization. Using a set of templates representing the different mouth shapes increases accuracy.

A.8.5 Expression Detection

Reinders et al. [RvdGG96] describe a system that uses a Bayesian Belief Network to detect facial expressions. They use different modalities (audio and visual) and different detection methods within each modality to improve overall system robustness.

Since each method is just an estimation, they use the framework of a Bayesian Belief

Network to increase reliability. This algorithm is part of a man-machine interaction

system that uses facial analysis and facial synthesis.

A.8.6 Interfaces

Speech is a natural way for humans to communicate; in fact, it is a common form of communication in everyday life. Speech is therefore an excellent choice for communication between humans and machines. Giving information to a user in the form of a talking head makes the receipt of that information natural for the user. Conversely, obtaining information from a user's face or speech completes a natural way for a human to interface with a computer. Using speech may also make the experience more comfortable for users, especially those who have a phobia of machines. Some of the promising early results of this young field are discussed in this section.

Reinders et al. [RvdGG96] describe a system that uses both audio and video along

with multiple detection methods to create a man-machine interface. Ohmura et al.

[OTK88] track three markers to detect face direction for a menu selection interface.

King [Kin93] proposes a holistic model of human-computer interaction that interprets gaze, pupil size and facial expressions. They also report on an experiment to determine if humans express facial emotion during human-computer interaction.

Subjects were given six tasks to do at a computer and were secretly videotaped during the two-minute tasks. The video was analyzed using FACS and it was determined that emotions were a part of human-computer interaction. They also found a high rate of both covert and overt expressions of emotion.

A.9 Standards

Standards allow researchers who tackle different areas of the same problem to easily integrate their systems by using a common interface, data stream, communication protocol, etc. The emergence of standards also signifies the importance of an area and signals its coming of age. Facial animation is still relatively young, but it is very important in human communication. To date, only one standard has been created for facial animation, and it addresses the compression of multimedia applications.

The first attempt at creating a standard was a workshop hosted at the University of Pennsylvania on facial animation sponsored by the National Science Foundation and the Institute for Research in Cognitive Science at the University of Pennsylvania.

The workshop [PBV94] brought together a group of researchers from computer graphics, linguistics and psychology to get a view of the state of the art and define one or more “standards” in facial models. “The goal was to define scientifically defensible, computationally reasonable, and experimentally useful computational facial models as the basis for future research and development. Common facial models within as well as across discipline boundaries will accelerate applications, accessibility, interoperability, and reduce redundant developments, software costs, and animation control incompatibility.”

They attacked three areas: facial modeling, recognition/data mapping, and control. Modeling deals with facial modeling, applications, the face and its attributes, facial deformation, data control and validation of modeling results. Recognition/data mapping considers facial feature acquisition and processing, general facial data processing techniques, and visual speech feature processing. Control considers manipulation and control frameworks for facial animation, with the goal of providing a complete and succinct set of guidelines for controlling facial animation.

Although no standard was developed, the effort produced dialog between many

researchers in the field of facial animation. The final report [PBV94] contains a good

foundation for future work on a standard.

A.9.1 MPEG-4 SNHC FBA

The Motion Pictures Expert Group (MPEG) [Gro00] is a working group of the

International Organization for Standardization and the International Electrotechnical

Commission (ISO/IEC) developing standards for the coded representation of digital

audio and video. MPEG-4 is the standard for multimedia applications. A subgroup

of MPEG, Synthetic - Natural Hybrid Coding (SNHC), develops standards for the

integrated coded representation of audio and moving pictures of natural and synthetic

origin with concentration on the coding of synthetic data. The Facial and Body

Animation group (FBA) is a group within SNHC that developed a standard for

coding facial and body animation applications.

The standard specifies how parameters that calibrate and animate synthetic faces

can be coded without standardizing the models themselves [ISO99a, ISO99b]. The

Face Definition Parameters (FDPs) calibrate the model while the Face Animation

Parameters (FAPs) animate it. A decoder has a generic face that can be shaped via the FDPs, or a new mesh can be coded. An optional texture map can also be applied.

The standard has codes for visemes, as well as expressions. Feature points on the face can also be controlled. Moreover, the mesh itself can always be controlled.

A coded stream can use the default face model of the decoder, can modify it

using FDPs or send a new face model along with information on how to animate it.

The Face Animation Table (FAT) is a functional mapping from incoming FAPs to the

feature control points of the face mesh. Face Animation Parameter Units (FAPUs) are

defined as distances between key facial features (iris diameter, eye separation, mouth

width, etc) and are used to scale the FAPs. FAPs are used to animate the face and can be high or low-level. High-level FAPs for expressions and visemes exist. There are

low-level parameters to control head rotations, eye rotations, and key feature points

on the face.
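A much-simplified sketch of how a low-level FAP might be applied is given below; the FAPU values, the feature-point table and the scaling are placeholders, and the actual units, normalization constants and feature-point semantics are those defined by the standard [ISO99b].

def apply_fap(feature_points, point_id, axis, fap_value, fapu):
    """Displace one feature point along one axis by a FAP amplitude expressed in a
    face-dependent unit (FAPU). feature_points maps point ids to mutable [x, y, z]."""
    feature_points[point_id][axis] += fap_value * fapu
    return feature_points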

Ostermann [Ost98] describes how facial animation is achieved within the MPEG-4

standard. Escher and Magnenat-Thalmann [EPMT98] describe how a generic mesh

(or a calibration mesh) in MPEG-4 can be deformed to be personalized using Dirichlet free-form deformations. Eisert and Girod [EG98] estimate SNHC Facial Action Parameters (FAPs) along with object points for facial expression tracking for teleconferencing applications. Tekalp and Ostermann [TO99] give an overview of the synthetic visual objects for faces and 2D meshes.

APPENDIX B

MEDICAL APPENDIX

B.1 Terms of location and orientation

Anterior Toward the front.

Afferent For a vessel or nerve, towards the central structure.

Buccal Pertaining to the mouth or inner surfaces of the cheeks.

Caudal Toward the tail.

Cervical Towards the gum (for teeth).

Cervical Pertaining to the neck.

Cephalic Pertaining to the head. Also toward the head, the same as rostral.

Cephalocaudal From the head toward the tail.

Contralateral The opposite side.

Coronal Plane Named after the coronal suture of the skull, a vertical plane from

side to side, made at right angles to the medial plane, dividing the body into

anterior and posterior portions.

Figure B.1: The planes that describe location and orientation in human anatomy.

Costal Pertaining to the ribs.

Cranial Toward the head; upper or higher.

Deep Away from the surface.

Dorsal In the human, the same as posterior.

Efferent For a vessel or nerve, away from the central structure.

Frontal Plane The same as coronal plane.

Inferior Towards the feet; under or lower. Below. In the human, the same as caudal.

Ipsilateral On the same side.

Lateral Toward the side. Away from the midline axis of the body or structure.

Medial Toward the midline axis (mid-sagittal plane) of the body or structure.

Medial Plane Same as the sagittal plane.

Occlusally Towards the biting surface (for teeth).

Parasagittal plane Any plane parallel to the medial plane.

Posterior Toward the back.

Rostral Toward the head.

Sagittal Plane Named for the sagittal suture of the skull, is synonymous with the

medial plane and is a vertical cut dividing the body into left and right portions.

Superior Toward the head. Upper or higher. Above. In the human, the same as

rostral or cranial.

Transverse Plane A horizontal cut at any level made at a right angle to the axis

of the body, dividing the body into upper and lower portions.

Superficial Toward the surface.

Ventral In the human, the same as anterior.

B.2 Body tissues

Cells of like form and function are grouped together into tissues. Combinations

of tissues that together serve a specific body function are called organs. Primary life

functions are carried on by groups of organs called organ systems.

Bone. Dense and rigid framework of the body.

254 Cartilage Similar to bone, except slightly or extremely flexible and softer than bone.

Hyaline cartilage. Relatively rigid.

Elastic cartilage. Nonrigid, easily deformed.

Connective tissue. Binds together the other tissues and organs of the body.

Ligaments. Slightly elastic strands of tissue that interconnect the bones and

cartilages.

Membranes. Broad, flat ligaments.

Tendons. Inelastic strands of tissue that connect muscles to bones.

Aponeuroses. Broad, flat tendons.

Fascia. Bands of connective tissue that lie deep to the skin and invest the

muscles and organs.

Muscle. Uniquely capable of contraction upon neural stimulation.

Striated muscle. Capable of rapid, voluntary contraction. Has attachments

that provide for movement of body parts.

Smooth muscle. Capable of slow, involuntary contraction. Found primarily

in the digestive tract and vascular system.

Epithelial tissue. Performs the functions of protection and secretion.

Skin. Covers the external body surfaces.

Mucous membrane. Covers the internal surfaces of body cavities and pas-

sages that communicate with the exterior.

Nervous tissue. Uniquely capable of transmission of electrical impulses.

255 B.3 Anatomical Terms of the Face

Vermillion zone The red part of the lips.

Epithelium The skin of the lips.

Labium superius oris The upper lips.

Labium inferius oris The lower lips.

Rima oris The point of contact of the lips during closure.

Angularis oris The lateral margins of the mouth, also known as the angle of the

mouth.

Angle of the mouth The lateral margins of the mouth, also known as the angularis

oris.

Tubercle A slight prominence at the midsection of the upper lip which appears in

most subjects.

Philtrum The shallow depression extending from the upper lip to the nose.

Cupid’s bow Formed by the philtrum and the rounded parts of the lips.

Labiomarginal sulcus A groove running in a posteriorly convex arch from the cor-

ner of the mouth toward the lower border of the mandible, which appears with

age.

BIBLIOGRAPHY

[Adj93] Ali Adjoudani. Élaboration d'un modèle de lèvres 3D pour animation en temps réel. Master's thesis, Mémoire de D.E.A. Signal–Image–Parole, Institut National Polytechnique, Grenoble, France, 1993.

[AHS89] K. Aizawa, H. Harashima, and T. Saito. Model-based analysis- synthesis image coding (MBASIC) system for a person’s face. Signal Processing: Image Communication, 1(2):139–152, October 1989.

[AS92] Taka-aki Akimoto and Yasuhito Suenaga. 3d facial model creation us- ing generic model and front and side view of face. IEICE Transactions on Information and Systems, E75-D(2):191–197, March 1992.

[AUK92] Ken-ichi Anjyo, Yoshiaki Usami, and Tsuneya Kurihara. A simple method for extracting the natural beauty of hair. In Computer Graph- ics (SIGGRAPH ’92 Proceedings), volume 26, pages 111–120, July 1992.

[AWS90] Taka-aki Akimoto, Richard Wallace, and Yasuhito Suenaga. Feature extraction from front and side views of faces for 3d facial model cre- ation. In IAPR Workshop on Machine Vision Applications, Tokyo, pages 291–294, November 28-30 1990.

[Ber87] Philippe Bergeron. 3-d on the symbolics system. SIGGRAPH ’87 course notes: 3-D Character Animation by Computer, July 1987.

[BI82] Richard W. Brand and Donald E. Isselhard. Anatomy of orofacial structures. C.V. Mosby, St. Louis, 1982.

[BI94] Andrew Blake and Michael Isard. 3D position, attitude and shape input using video tracking of hands and lips. In Andrew Glassner, editor, Proceedings of SIGGRAPH ’94 (Orlando, Florida, July 24–29, 1994), Computer Graphics Proceedings, Annual Co, pages 185–192. ACM SIGGRAPH, ACM Press, July 1994.

257 [BL85] Philippe Bergeron and Pierre Lachapelle. Controlling facial expres- sions and body movements in the computer-generated animated short ’tony de peltrie’. SIGGRAPH ’85 Advanced Computer Animation sem- inar notes, pages 1–19, July 1985.

[BM84] Hugh E. Bateman and Robert M. Mason. Applied Anatomy and Phys- iology of the Speech and Hearing Mechanism. Charles C. Thomas, Springfield, IL., 1984.

[Boo91] Fred L. Bookstein. Morphometric Tools for Landmark Data. Cam- bridge University Press, 1991.

[Bos73] D. W. Boston. Synthetic facial communication. British Journal of Audiology, 7:95–101, 1973.

[Bro79] N. M. Brooke. Development of a video speech synthesiser. In Speech communication, recent developments in audiology, noise in industry : Autumn conference 1979 : Windermere Hydro Hotel, pages 41–44. Institute of Acoustics, July 1979.

[Bro82] N. M. Brooke. Video speech synthesis for speech perception experi- ments. Journal of the Acoustical Society of America, 71:S77(A), 1982.

[BS83] N. M. Brooke and Quentin Summerfield. Analysis, synthesis and perception of visible articulatory movements. Journal of Phonetics, 11(1):63–76, January 1983.

[BS94] N. M. Brooke and S. D. Scott. Computer graphics animations of talk- ing faces based on stochastic models. In Proceedings of the Interna- tional Symposium on Speech, Image-processing and Neural Networks, Hong Kong, pages 73–76. IEEE, 1994.

[BT90] N. Michael Brooke and Paul D. Templeton. Visual speech intelligibility of digitally processed facial images. Proceedings of the Institute of Acoustics (Autumn Conference), 12(10):483–490, 1990.

[BTCC00] Alan W. Black, Paul Taylor, Richard Caley, and Rob Clark. The festival speech synthesis system. http://www.cstr.ed.ac.uk/projects/festival/, 2000.

[CG98] Eric Cosatto and Hans Peter Graf. Sample-based synthesis of photo-realistic talking heads. In Computer Animation '98, Philadelphia, University of Pennsylvania, pages 103–110. IEEE Computer Society, June 1998.

[Che71] Herman Chernoff. The use of faces to represent points in n-dimensional space graphically. Technical Report 71, Project NR-042-993, Office of Naval Research, 1971.

[Che73] Herman Chernoff. The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association, 68(342):361–368, June 1973.

[Cho89] Nicole Chovil. Communicative Functions of Facial Displays in Con- versation. PhD thesis, University of Victoria, 1989.

[CM93] Michael Cohen and Dominic Massaro. Modeling coarticulation in syn- thetic visual speech. In Nadia Magnenat-Thalmann and Daniel Thal- mann, editors, Models and Techniques in Computer Animation, pages 139–156. Springer-Verlag, Tokyo, 1993.

[CPB+94] Justine Cassell, Catherine Pelachaud, Norman Badler, Mark Steedman, Brett Achorn, Tripp Bechet, Brett Douville, Scott Prevost, and Matthew Stone. Animated conversation: Rule-based generation of facial expression, gesture and spoken intonation for multiple conversational agents. In Andrew Glassner, editor, Proceedings of SIGGRAPH '94 (Orlando, Florida, July 24–29, 1994), Computer Graphics Proceedings, Annual Co, pages 413–420. ACM SIGGRAPH, ACM Press, July 1994.

[CR75] Herman Chernoff and M. Haseeb Rizvi. Effect of classification error of random permutations of features in representing multivariate data by faces. Journal of the American Statistical Association, 70(351):548– 554, September 1975.

[CS96] Devi Chandramohan and Peter L. Silsbee. A multiple deformable tem- plate approach for visual speech recognition. In Proceedings of ICSLP 96 the 4th International Conference on Spoken Language Processing, October 3-6, pages pp 50–53, Philadelphia, PA, October 1996.

[deG89] Brad deGraf. ’performance’ facial animation. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 8–17, July 1989.

[deG90] Brad deGraf. ’performance’ facial animation. SIGGRAPH ’90 Course Notes 26: State of the Art in Facial Animation, pages 9–20, August 1990.

[Den88] Xiao Qi Deng. A Finite Element Analysis of Surgery of the Human Facial Tissues. PhD thesis, Columbia University, 1988.

259 [DGZ+96] J. W. DeVocht, V. K. Goel, D. L. Zeitler, D. Lew, and E. A. Hoff- man. Development of a finite element model to simulate and study the biomechanics of the temporomandibular joint. In The Visible Hu- man Project Conference, Bethesda, pages 89–90. National Institutes of Health, October 7-8 1996.

[DiP89] Steve DiPaola. Implementation and use of a 3d parameterized facial modeling and animation system. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 18–33, July 1989.

[DiP91] Steve DiPaola. Extending the range of facial types. Journal of Visu- alization and Computer Animation, 2(4):129–131, 1991.

[DJ77] Donald Dew and Paul J. Jensen. Phonetic Processing: The Dynamics of Speech. Charles E. Merrill Publishing Company, Columbus, Ohio, 1977.

[DKT98] Tony DeRose, Michael Kass, and Tien Truong. Subdivision surfaces in character animation. Proceedings of SIGGRAPH 98, pages 85–94, July 1998.

[DM70] David Ross Dickson and Wilma M. Maue. Human Vocal Anatomy. Charles C Thomas, Springfield, Illinois, 1970.

[DMS98] Douglas DeCarlo, Dimitris Metaxas, and Matthew Stone. An anthro- pometric face model using variational techniques. In Proceedings of SIGGRAPH 98, pages 67–74, July 1998.

[DWD+83] Parvati Dev, Sally Wood, James P. Duncan, David N. White, and Stu- art W. Young. An interactive graphics system for planning reconstruc- tive surgery. In Proceedings National Computer Graphics Association, pages 130–135, Chicago, Ill., June 1983.

[EF78a] Paul Ekman and Wallace Friesen. Facial Action Coding System. Con- sulting Psychologists Press, Inc, Palo Alto, CA, 1978.

[EF78b] Norman P. Erber and Carol Lee De Filippo. Voice/mouth synthesis and tactual/visual perception of /pa, ba, ma/. Journal of the Acous- tical Society of America, 64:1015–1019, 1978.

[EF84] Paul Ekman and Wallace V. Friesen. Unmasking the Face. Consulting Psychologists Press, Palo Alto, CA, 1984.

[EG98] Peter Eisert and Bernd Girod. Analyzing facial expressions for virtual conferencing. IEEE Computer Graphics and Animation, 18(5):70–78, September/October 1998.

[Ekm73] Paul Ekman. Darwin and cross cultural studies of facial expression. In Paul Ekman, editor, Darwin and Facial Expression: A Century of Research in Review. Academic Press, New York, 1973.

[Ekm85] Paul Ekman. Telling lies: clues to deceit in the marketplace, politics, and marriage. Norton, New York, 1985.

[Els90a] Matt Elson. Displacement animation: Development and application. SIGGRAPH 90 Course Notes, Course 10, Character Animation by Computer, pages 15–36, 1990.

[Els90b] Matt Elson. Displacement facial animation techniques. SIGGRAPH 90 Course Notes, Course 26, State of the Art in Facial Animation, pages 21–42, 1990.

[EP96] Tony Ezzat and Tomaso Poggio. Facial analysis and synthesis using image-based models. In Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, Killington, Vermont, October 1996.

[EP97] Tony Ezzat and Tomaso Poggio. Videorealistic talking faces: A morphing approach. In Proceedings of the Audiovisual Speech Processing Workshop, Rhodes, Greece, September 1997.

[EP98] Tony Ezzat and Tomaso Poggio. Miketalk: A talking facial display based on morphing visemes. In Computer Animation '98, Philadelphia, University of Pennsylvania, pages 96–102. IEEE Computer Society, June 1998.

[EPMT98] Marc Escher, Igor Pandzic, and Nadia Magnenat-Thalmann. Facial deformations for MPEG-4. In Computer Animation '98, Philadelphia, University of Pennsylvania, pages 56–63. IEEE Computer Society, June 1998.

[Erb77] Norman P. Erber. Real-time synthesis of optical lip patterns from vowel sounds. Journal of the Acoustical Society of America, 62:S76, 1977.

[Eri89] Joan Good Erickson. Speech reading: an aid to communication. Interstate Printers & Publishers, Danville, Ill., 1989.

[ESF79] Norman P. Erber, Richard L. Sachs, and Carol Lee De Filippo. Optical synthesis of articulatory images for lipreading evaluation and instruction. In D. L. McPhearson, editor, Advances in Prosthetic Devices for the Deaf: A Technical Workshop, pages 228–231, Rochester, NY: NTID, 1979.

261 [Ess94] Irfan A. Essa. Analysis, Interpretation and Synthesis of Facial Expres- sions. PhD thesis, MIT Media Lab, 1994.

[Fai90] Gary Faigin. The Artist’s Complete Guide to Facial Expression. Watson-Guptill Publications, New York, 1990.

[FB88] David R. Forsey and Richard Bartels. Hierarchical b-spline refine- ment. In Computer Graphics (SIGGRAPH ’88 Proceedings), vol- ume 22, pages 205–212, August 1988.

[FD99] Bill Fleming and Darris Dobbs. Animating Facial Features and Ex- pressions. Charles River Media, Inc., Rockland, Massachusetts, 1999.

[FL78] O. Fujimura and J. Lovins. Syllables as concatenative phonetic units. In Alan Bell and Joan Bybee Hooper, editors, Syllables and Segments. North Holland, Amsterdam, 1978.

[Fox00] Anthony Fox. Prosodic Features and Prosodic Structure: The Phonol- ogy of Suprasegmentals. Oxford University Press, Oxford, 2000.

[FR81] Bernhard Flury and Hans Riedwyl. Graphics representation of multi- variate data by means of asymmetrical faces. Journal of the American Statistical Association, 76(376):757–765, December 1981.

[Fro64] Victoria Fromkin. Lip positions in american english vowels. Language and Speech, 7:215–225, 1964.

[Fuj92] Osamu Fujimura. Phonology and phonetics – a syllable-based model of articulatory organization. J. Acoust. Soc. Japan (E), 13, 1992.

[Fuj00] Osamu Fujimura. The C/D model and prosodic control of articulatory behavior. Phonetica, 57, 2000.

[Gas93] Marie-Paule Gascuel. An implicit formulation for precise contact mod- eling between flexible solids. In SIGGRAPH ’93, pages 313–320, Au- gust 1993.

[GGW+98] Brian Guenter, Cindy Grimm, Daniel Wood, Henrique Malvar, and Fredrick Pighin. Making faces. In SIGGRAPH ’98, August 1998.

[Gil74] ML Gillenson. The Interactive Generation of Facial Images on a CRT Using a Heuristic Strategy. PhD thesis, The Ohio State University, 1974.

[GM92] Thierry Guiard-Marigny. Animation en temps réel d'un modèle paramétrique de lèvres. Master's thesis, Mémoire de D.E.A. Signal–Image–Parole, Institut National Polytechnique, Grenoble, France, 1992.

[GMAB94] Thierry Guiard-Marigny, Ali Adjoudani, and Christian Benoit. A 3-d model of the lips for visual speech synthesis. Proceedings of the Second ESCA/IEEE Workshop, September 1994.

[GMAB97] Thierry Guiard-Marigny, Ali Adjoudani, and Christian Benoit. 3d models of the lips and jaw for visual speech synthesis. In Jan P. Van Santen, Richard Sproat, Joseph P. Olive, and Julia Hirshberg, editors, Progress in Speech Synthesis, chapter 19, pages 247–258. Springer- Verlag, New York, 1997.

[GMOB95] Thierry Guiard-Marigny, David J. Ostry, and Christian Benoit. Speech intelligibility of synthetic lips and jaw. In 13th International Congress of Phonetic Sciences, pages 222–225, Stockholm, Sweden, August 1995.

[GMTA+96] Thierry Guiard-Marigny, Nicolas Tsingos, Ali Adjoudani, Christian Benoit, and Marie-Paule Gascuel. 3d models of the lips for realistic speech animation. In Computer Graphic ’96, Geneve, 1996.

[Gol91] Ronald Goldman. Area of planar polygons and volume of polyhedra. Graphics Gems II, pages 170–171, 1991.

[Gou71] H. Gouraud. Computer display of curved surfaces. Technical Report UTEC-CSc-71-113, Dept of Computer Science, University of Utah, Salt Lake City, Utah, June 1971.

[Gou89] Roger L. Gould. Making 3-d computer character animation a great future of unlimited possibility or just tedious? SIGGRAPH ’89 Course Notes 4: 3-D Character Animation, pages 31–63, 1989.

[Gra77] Henry Gray. Anatomy, Descriptive and Surgical. Gramercy Books, New York, edited by t. pikering pick and robert howden edition, 1977.

[Gro00] Moving Picture Experts Group. MPEG home page. http://www.cselt.it/mpeg/, 2000.

[Gue89a] Brian Guenter. A computer system for simulating human facial ex- pression. PhD thesis, The Ohio State University, 1989.

263 [Gue89b] Brian Guenter. A system for simulating human facial expression. In Nadia Magnenat-Thalmann and Daniel Thalmann, editors, State of the Art in Computer Animation, pages 191–202. Springer-Verlag, 1989.

[HFG94] Michael Hoch, Georg Fleischmann, and Bernd Girod. Modeling and animation of facial expressions based on b-splines. The Visual Com- puter, 11(2):87–95, 1994.

[HK93] Pat Hanrahan and Wolfgang Krueger. Reflection from layered surfaces due to subsurface scattering. Proceedings of SIGGRAPH 93, pages 165–174, August 1993.

[HO79] Ernest Hinton and David Roger Jones Owen. An introduction to finite element computations. Pineridge Press, Swansea [Wales], 1979.

[Hof93] Elizabeth E. Hofer. A sculpting based solution for three-dimensional computer character facial animation. Master’s thesis, The Ohio State University, 1993.

[HPW88] David R. Hill, Andrew Pearce, and Brian M. Wyvill. Animating speech: an automated approach using speech synthesised by rules. The Visual Computer, 3(5):277–289, March 1988.

[HSGR83] U. Hadar, T. J. Steiner, E. C. Grant, and F. Clifford Rose. Kinematics of head movements accompanying speech during conversation. Human Movement Science, 2:35–46, 1983.

[IK91] Satoshi Ishibashi and Fumio Kishino. Color/texture analysis and syn- thesis for model-based human image coding. SPIE Visual Communi- cations and Image Processing ’91: Visual Communication, 1605:242– 252, 1991.

[ISO99a] ISO/IEC 14496-1:1999 Information technology – Coding of audio- visual objects – part 1: Systems. International Organization for Stan- dardization/International Electrotechnical Commission, 1999.

[ISO99b] ISO/IEC 14496-2:1999 Information technology – Coding of audio- visual objects – part 2: Visual. International Organization for Stan- dardization/International Electrotechnical Commission, 1999.

[JB71] Janet Jeffers and Margaret Barley. Speechreading (lipreading). Thomas, Springfield, 1971.

[KAA83] Arthur I. Karshmer, Daniel Allan, and Dolores Anderson. ’Virtual’ craniofacial surgery using interactive computer graphics: A pilot study. In Proceedings National Computer Graphics Association, pages 125–129, Chicago, Illinois, June 1983.

[Kan77] Takeo Kanade. Computer Recognition of Human Faces. Birkhäuser Verlag, Basel and Stuttgart, 1977.

[KBG85] Harriet Kaplan, Scott J. Bally, and Carol Garretson. Speechreading: a way to improve understanding. Gallaudet College Press, Washington, D.C., 1985.

[KGB98] Rolf M. Koch, Markus H. Gross, and Albert A. Bosshard. Emotion editing using finite elements. Technical Report 281, Computer Science Department, ETH Zürich, 1998.

[KGC+96] R. M. Koch, Markus H. Gross, F. R. Carls, D. F. von Büren, G. Fankhauser, and Y. I. Parish. Simulating facial surgery using finite element models. In Holly Rushmeier, editor, Proceedings of SIGGRAPH ’96 (New Orleans, Louisiana, August 4–9, 1996), Computer Graphics Proceedings, Annual Conference Series, pages 421–428. ACM SIGGRAPH, Addison Wesley, August 1996.

[KGPG96] Erwin Keeve, Sabine Girod, Paula Pfeifle, and Bernd Girod. Anatomy-based facial tissue modeling using the finite element method. In IEEE Visualization ’96. IEEE, October 1996.

[KI86] Wilton Marion Krogman and Mehmet Yasar Iscan. The human skeleton in forensic medicine. C.C. Thomas, Springfield, 1986.

[Kin93] William Joseph King. Holistic model of the face in human computer interaction. Technical report, Southwestern University, 1993.

[Kle89] Jeff Kleiser. A fast, efficient, accurate way to represent the human face. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 36–40, July 1989.

[KM77] R. D. Kent and F. D. Minifie. Coarticulation in recent speech production models. Journal of Phonetics, 5:115–135, 1977.

[KMCT96] T. Konno, H. Mitani, H. Chiyokura, and I. Tanaka. Surgical simulation of facial paralysis. In Hans B. Sieburg, Suzanne J. Weghorst, and Karen S. Morgan, editors, Medicine Meets Virtual Reality: Health Care in the Information Age, pages 488–497. IOS Press, January 17–20, 1996.

[KP00] Scott A. King and Richard E. Parent. Talkinghead: A text-to-audiovisual-speech system. Technical Report OSU-CISRC-2/80-TR05, Computer and Information Science, Ohio State University, 2000.

[Lar86a] Wayne F. Larrabee Jr. A finite element model of skin deformation. Laryngoscope, pages 399–419, April 1986.

[Lar86b] Wayne F. Larrabee Jr. A finite element model of skin deformation: Part i - biomechanics of skin. Laryngoscope, pages 399–419, April 1986.

[Lar86c] Wayne F. Larrabee Jr. A finite element model of skin deformation: Part ii - experimental model of skin deformation. Laryngoscope, pages 399–419, April 1986.

[Lar86d] Wayne F. Larrabee Jr. A finite element model of skin deformation: Part iii - the finite element model. Laryngoscope, pages 399–419, April 1986.

[Lee93] Yuencheng Victor Lee. The construction and animation of functional facial models from cylindrical range/reflectance data. Master’s thesis, University of Toronto, 1993.

[Lew91] John Lewis. Automated lip-sync: Background and techniques. The Journal of Visualization and Computer Animation, 2(4):118–122, 1991.

[Löf90] A. Löfqvist. Speech as audible gestures. In W. J. Hardcastle and A. Marchal, editors, Speech Production and Speech Modeling, pages 289–322. Kluwer Academic Publishers, Dordrecht, 1990.

[LP87] J. P. Lewis and Frederic I. Parke. Automated lip-synch and speech synthesis for character animation. In J. M. Carroll and P. P. Tanner, editors, Proceedings Human Factors in Computing Systems and Graphics Interface ’87, pages 143–147, April 1987.

[LTW93] Yuencheng Lee, Demetri Terzopoulos, and Keith Waters. Constructing physics-based facial models of individuals. In Proceedings of Graphics Interface ’93, Toronto, Ontario, Canada, pages 1–8. Canadian Information Processing Society, May 1993.

[LTW95] Yuencheng Lee, Demetri Terzopoulos, and Keith Waters. Realistic modeling for facial animation. Proceedings of SIGGRAPH 95, pages 55–62, August 1995.

[MA96] I. R. Murray and J. L. Arnott. Synthesizing emotions in speech: is it time to get excited? In Proceedings of ICSLP 96, the 4th International Conference on Spoken Language Processing, pages 1816–1819, Philadelphia, PA, October 3–6, 1996.

[Mad69] Roy P. Madsen. Animated Film: Concepts, Methods, Uses. Interland Pub., New York, 1969.

[Mae90] Shinji Maeda. Compensatory articulation during speech: Evidence from the analysis and synthesis of vocal-tract shapes using an articulatory model. In William J. Hardcastle and Alain Marchal, editors, Speech Production and Speech Modelling, pages 131–149. Kluwer Academic Publishers, Dordrecht, 1990.

[MAH90] Shigeo Morishima, Kiyoharu Aizawa, and Hiroshi Harashima. A real-time facial action image synthesis system driven by speech and text. SPIE Visual Communications and Image Processing, 1360:1151–1157, October 1990.

[MBR99] The MBROLA project. www.tcts.fpms.ac.be/synthesis/mbrola/, 1999.

[MC83] Dominic W. Massaro and Michael M. Cohen. Evaluation and integration of visual and auditory information in speech perception. Journal of Experimental Psychology, 9(5):753–771, 1983.

[McG85] Matthew McGrath. An examination of cues for visual and audio-visual speech perception using natural and computer-generated faces. PhD thesis, University of Nottingham, UK, November 1985.

[Mil88] Gavin S. Miller. From wire-frames to furry animals. In Proceedings of Graphics Interface ’88, pages 138–145, June 1988.

[MM76] H. McGurk and J. MacDonald. Hearing lips and seeing voices. Nature, 264:746–748, 1976.

[Mon80] A. A. Montgomery. Development of a model for generating synthetic animated lip shapes. Journal of the Acoustical Society of America, 68:S58(A), 1980.

[MP95] Baback Moghaddam and Alex P. Pentland. An automatic system for model-based coding of faces. In Proceedings of Data Compression Conference ’95, pages 362–370, 1995.

[MTMdAT89] N. Magnenat-Thalmann, H. T. Minh, M. de Angelis, and D. Thalmann. Design, transformation and animation of human faces. The Visual Computer, 5(1/2):32–39, March 1989.

[MTPT88] N. Magnenat-Thalmann, E. Primeau, and D. Thalmann. Abstract muscle action procedures for human face animation. The Visual Computer, 3(5):290–297, March 1988.

[MTT87] Nadia Magnenat-Thalmann and Daniel Thalmann, editors. Synthetic Actors in Computer-Generated 3D Films. Springer-Verlag, Tokyo, New York, Paris, 1987.

[Mur91] Shigeru Muraki. Volumetric shape description of range data using “blobby model”. In Thomas W. Sederberg, editor, Computer Graphics (SIGGRAPH ’91 Proceedings), volume 25, pages 227–235, July 1991.

[MVBT95] Kevin G. Munhall, Eric Vatikiotis-Bateson, and Yoh’ichi Tohkura. X-ray film database for speech research. Journal of the Acoustical Society of America, 98:1222–1224, 1995.

[NHRD90] Monique Nahas, Herve Huitric, Marc Rioux, and Jacques Domey. Registered 3D-texture imaging. In N. Magnenat-Thalmann and D. Thalmann, editors, Computer Animation ’90 (Second Workshop on Computer Animation), pages 81–91. Springer-Verlag, April 1990.

[NHS88] Monique Nahas, Herve Huitric, and Michel Saintourens. Animation of a B-spline figure. The Visual Computer, 3(5):272–276, March 1988.

[NLH99] Visible Human Project. National Library of Medicine, http://www.nlm.nih.gov/research/visible/visible_human.html, 1999.

[NNT30] Edward Bartlett Nitchie, Elizabeth Helm Nitchie, and Gertrude Torrey. Lip-reading Principles and Practise. Lippincott, Philadelphia, 1930.

[NS62] Raymond J. Nagle and Victor H. Sears. Denture Prosthetics: Complete Dentures. The C. V. Mosby Company, Saint Louis, 1962.

[OH77] David Roger Jones Owen and Ernest Hinton. Finite Element Programming. Academic Press, London; New York, 1977.

[OH80a] David Roger Jones Owen and Ernest Hinton. A simple guide to finite elements. Pineridge Press, Swansea [Wales], 1980.

[OH80b] D. R. J. Owen and E. Hinton. Finite elements in plasticity: theory and practice. Pineridge Press, Swansea, U.K., 1980.

[Ost98] Jörn Ostermann. Animation of synthetic faces in MPEG-4. In Computer Animation ’98, Philadelphia, University of Pennsylvania, pages 49–55. IEEE Computer Society, June 1998.

[OTK88] Kazunori Ohmura, Akira Tomono, and Yukio Kobayashi. Method of detecting face direction using image processing for human interface. In Proceedings SPIE Visual Communication and Image Processing ’88, pages 625–632, November 1988.

[PALS97] Frederic Pighin, Joel Auslander, Dani Lischinski, and David Salesin. Realistic facial animation using image-based 3d morphing. Technical Report UW-CSE-97-01-03, University of Washington, Department of Computer Science & Engineering, 1997.

[Par72] Frederic I. Parke. Computer generated animation of faces. In Proceedings ACM National Conference, pages 451–457, 1972.

[Par74] Frederic I. Parke. A parametric model for human faces. PhD thesis, University of Utah, Salt Lake City, Utah, December 1974.

[Par75] Frederic I. Parke. A model for human faces that allows speech synchronized animation. Journal of Computers and Graphics, 1(1):1–4, 1975.

[Par82] Frederic I. Parke. Parameterized models for facial animation. IEEE Computer Graphics and Applications, 2(9):61–68, November 1982.

[Pat91] Manjula Patel. Making Faces: The Facial Animation, Construction and Editing System. PhD thesis, University of Bath, Bath, UK, December 1991.

[PB95] J A Provine and L T Bruton. Lip synchronization in 3-d model based coding for video-conferencing. In Proc. of the IEEE Int. Symposium on Circuits and Systems, pages 453–456, Seattle, May 1995.

[PBS91] Catherine Pelachaud, Norman I. Badler, and Mark Steedman. Linguistic issues in facial animation. In Nadia Magnenat-Thalmann and Daniel Thalmann, editors, Computer Animation ’91, pages 15–30. Springer-Verlag, Tokyo, 1991.

[PBV94] Catherine Pelachaud, Norman I. Badler, and Marie-Luce Viaud. Final report to NSF of the standards for facial animation workshop. Technical report, University of Pennsylvania, October 1994.

[Pel91] Catherine Pelachaud. Communication and Coarticulation in Facial Animation. PhD thesis, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104-6389, October 1991.

[Per69] Joseph S. Perkell. Physiology of speech production: results and implications of a quantitative cineradiographic study. M. I. T. Press, Cambridge, Massachusetts, 1969.

[PH87] J. Pierrehumbert and J. Hirschberg. The meaning of intonational contours in the interpretation of discourse. Technical report, AT&T Bell Laboratories, 1987.

[PH89] K. Perlin and E. Hoffert. Hypertexture. In Computer Graphics (SIGGRAPH ’89 Proceedings), volume 23, pages 253–262, July 1989.

[PHL+98] Frederic Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David H. Salesin. Synthesizing realistic facial expressions from photographs. In Computer Graphics Proceedings SIGGRAPH ’98, pages 75–84, 1998.

[Pie89] Steve Pieper. Physically-based animation of facial tissue for surgical simulation. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 71–124, July 1989.

[Pla85] Stephen M. Platt. A Structural Model of the Human Face. PhD thesis, University of Pennsylvania, Philadelphia, Pennsylvania, 1985.

[PLG91] Elizabeth C. Patterson, Peter C. Litwinowicz, and Ned Greene. Facial animation by spatial mapping. In Nadia Magnenat-Thalmann and Daniel Thalmann, editors, Computer Animation ’91. Springer-Verlag, Tokyo, 1991.

[PTVF92] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes in C: The Art of Scientific Computing (2nd ed.). Cambridge University Press, Cambridge, 1992.

[PvOS94] Catherine Pelachaud, C. van Overveld, and Chin Seah. Modeling and animating the human tongue during speech production. In Proceedings of Computer Animation ’94, pages 40–49, May 1994.

[PVY93] Catherine Pelachaud, Marie-Luce Viaud, and H. Yahia. Rule-structured facial animation system. IJCAI, August 1993.

[PW91] Manjula Patel and Philip J. Willis. FACES — facial animation, construction and editing system. In Werner Purgathofer, editor, Eurographics ’91, pages 33–45. North-Holland, September 1991.

[PW96] Frederic I. Parke and Keith Waters. Computer Facial Animation. A K Peters, Wellesley, Massachusetts, 1996.

[PWWH86] Andrew Pearce, Brian M. Wyvill, Geoff Wyvill, and David Hill. Speech and expression: A computer solution to face animation. In M. Green, editor, Proceedings of Graphics Interface ’86, pages 136–140, May 1986.

[RC80] J. S. Rhine and H. R. Campbell. Thickness of facial tissues in American blacks. Journal of Forensic Sciences, 25(4):847–858, 1980.

[RCI91] R. E. Rosenblum, Wayne E. Carlson, and E. Tripp III. Simulating the structure and dynamics of human hair: modelling, rendering and animation. Journal of Visualization and Computer Animation, 2(4):141–148, 1991.

[Red91] Subhash Chandra Reddy. Model based analysis of facial expressions. PhD thesis, University of Illinois at Urbana-Champaign, November 1991.

[Ree90] William T. Reeves. Simple and complex facial animation: Case studies. SIGGRAPH ’90 Course Notes 26: State of the Art in Facial Animation, pages 88–106, August 1990.

[RGTC98] S. H. Roth, Markus Hans Gross, Silvio Turello, and Friedrich Robert Carls. A Bernstein-Bézier based approach to soft tissue simulation. Technical Report 282, Computer Science Department, ETH Zürich, 1998.

[RI82] J. S. Rhine and C. E. Moore II. Facial reproduction: Tables of facial tissue thicknesses of American Caucasoids in forensic anthropology. Technical Report 1, Maxwell Museum Technical Series, Albuquerque, NM, 1982.

[RvBSvdL95] M. J. T. Reinders, P. J. L. van Beek, B. Sankur, and J. C. A. van der Lubbe. Facial feature localization and adaptation of a generic face model for model-based coding. Signal Processing: Image Communication, 7:57–74, 1995.

[RvdGG96] Marcel J. Reinders, M. J. van der Goot, and J. J. Gerbrands. Estimating facial expressions by reasoning. In Proceedings of the ASCI’96 Conference, pages 259–264, 1996.

[Ryd87] Mikael Rydfalk. Candide: A parameterized face. Technical Report LiTH-ISY-I-0866, Linköping University, Sweden, October 1987.

[SC90] A. D. Simons and S. J. Cox. Generation of mouthshapes for a synthetic talking head. Proceedings of the Institute of Acoustics (Autumn Conference), 12(10):483–490, 1990.

[Seg84] Larry J. Segerlind. Applied Finite Element Analysis 2nd Edition. John Wiley & Sons, New York, 1984.

[SF93] Richard Sproat and Osamu Fujimura. Allophonic variation in English /l/ and its implications for phonetic implementation. Journal of Phonetics, 21, 1993.

[She96a] Jonathan Richard Shewchuk. Triangle: A two-dimensional quality mesh generator. http://www.cs.cmu.edu/~quake/triangle.html, 1996. Accessed Sep 15, 1999.

[She96b] Jonathan Richard Shewchuk. Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator. In Ming C. Lin and Dinesh Manocha, editors, Applied Computational Geometry: Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer Science, pages 203–222. Springer-Verlag, May 1996.

[SK73] L. Sirovich and M. Kirby. Low-dimensional procedure for the characterization of human faces. Journal of the Optical Society of America, 4(3):519–524, March 1973.

[SL96] Maureen Stone and Andrew Lundberg. Three-dimensional tongue surface shapes of English consonants and vowels. J. Acoust. Soc. Am., 99(6):3728–3737, June 1996.

[SMC00] J Paul Siebert, Roy L Middleton, and W Paul Cockshott. Michelangelo project. http://www.faraday.gla.ac.uk/michelangelo.htm, 2000.

[SMP81] Stephen M. Platt and Norman I. Badler. Animating facial expressions. Computer Graphics (Proceedings of SIGGRAPH 81), 15(3):245–252, August 1981.

[SNK72] T. Sakai, M. Nagao, and T. Kanade. Computer analysis and classification of photographs of human faces. In Proceedings of the First USA-Japan Computer Conference, pages 55–62, 1972.

[SPS96] Alexei I. Sourin, Alexander A. Pasko, and Vladimir V. Savchenko. Using real functions with application to hair modelling. Computers and Graphics, 20(1):11–19, 1996.

[Sto91] Maureen Stone. Toward a model of three-dimensional tongue movement. Journal of Phonetics, 19:309–320, 1991.

[Stu98] David J. Sturman. Computer puppetry. IEEE Computer Graphics and Applications, 18(1):38–45, January/February 1998.

[SVG95] Agnes Saulnier, Marie-Luce Viaud, and David Geldreich. Real-time facial analysis and synthesis chain. In International Workshop on Automatic Face- and Gesture-Recognition, Zurich, Switzerland, June 1995.

[Tao97] Hai Tao. Facial modeling. Technical report, University of Illinois, March 1997.

[TN93] Akikazu Takeuchi and Katashi Nagao. Communicative facial displays as a new conversational modality. In ACM/IFIP INTERCHI ’93, Amsterdam, 1993.

[TO99] A. Murat Tekalp and Jörn Ostermann. Face and 2-d mesh animation in MPEG-4. Signal Processing: Image Communication, 1999.

[TPBF87] Demetri Terzopoulos, John Platt, Alan Barr, and Kurt Fleischer. Elastically deformable models. Computer Graphics (SIGGRAPH ’87 Proceedings), 21(4):205–214, July 1987.

[TST87] Yosuke Takashima, Hideo Shimazu, and Masahiro Tomono. Story driven animation. In J. M. Carroll and P. P. Tanner, editors, Proceedings of Human Factors in Computing Systems and Graphics Interface ’87, pages 149–153, April 1987.

[TW90] Demetri Terzopoulos and Keith Waters. Physically-based facial modelling, analysis, and animation. Journal of Visualization and Computer Animation, 1(2):73–80, March 1990.

[UGO98] Baris Uz, Ugur Gudukbay, and Bulent Ozguc. Realistic speech animation of synthetic faces. In Computer Animation ’98, Philadelphia, University of Pennsylvania, pages 111–118. IEEE Computer Society, June 1998.

[vB84] Renee van Bezooyen. Characteristics and Recognizability of Vocal Expressions of Emotion. Netherlands Phonetic Archives. Foris Publications, Dordrecht, Holland, 1984.

[VY92] Marie-Luce Viaud and Hussein Yahia. Facial animation with wrinkles. In Eurographics 92, Cambridge, United Kingdom, September 1992.

[Wai89] Clea Theresa Waite. The facial action control editor, face: A parametric facial expression editor for computer generated animation. Master’s thesis, Massachusetts Institute of Technology, February 1989.

[Wal82] Edward F. Walther. Lipreading. Nelson-Hall, Chicago, 1982.

[Wat87] Keith Waters. A muscle model for animating three-dimensional facial expression. Computer Graphics (SIGGRAPH ’87 Proceedings), 21(4):17–24, July 1987.

[Wat89] Keith Waters. Modeling 3d facial expressions: Tutorial notes. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 127–152, July 1989.

[Wat92] Keith Waters. A physical model of facial tissue and muscle articulation derived from computer tomography data. In SPIE Visualization in Biomedical Computing, volume 1808, pages 574–583, 1992.

[Wer94] Josie Wernecke. The Inventor Mentor. Addison-Wesley, 1994.

[WF94] Carol L. Wang and David R. Forsey. Langwidere: A new facial animation system. In Proceedings of Computer Animation ’94, pages 59–68, 1994.

[WH89] Brian M. Wyvill and David R. Hill. Expression control using synthetic speech. SIGGRAPH ’89 Course Notes 22: State of the Art in Facial Animation, pages 163–175, 1989.

[WH90] Brian M. Wyvill and David R. Hill. Expression control using synthetic speech. SIGGRAPH ’90 Course Notes 22: State of the Art in Facial Animation, pages 185–212, 1990.

[Wil90a] L. Williams. Performance-driven facial animation. In Computer Graphics (SIGGRAPH ’90 Proceedings), volume 24, pages 235–242, August 1990.

[Wil90b] Lance Williams. Electronic mask technology. SIGGRAPH ’90 Course Notes 26: State of the Art in Facial Animation, pages 165–184, August 1990.

[WKMT96] Yin Wu, Prem Kalra, and Nadia Magnenat-Thalmann. Simulation of static and dynamic wrinkles of skin. In Proceedings Computer Animation 96, pages 90–97, 1996.

[WL93] Keith Waters and Thomas M. Levergood. Decface: An automatic lip-synchronization algorithm for synthetic faces. Technical Report CRL 93/4, Digital Equipment Corporation Cambridge Research Lab, September 1993.

[WMTT95] Yin Wu, Nadia Magnenat-Thalmann, and Daniel Thalmann. A dynamic wrinkle model in facial animation and skin ageing. The Journal of Visualization and Computer Animation, 6:195–205, 1995.

[WS89] Y. Watanabe and Y. Suenaga. Drawing human hair using wisp model. In Proceedings Computer Graphics International ’89, pages 691–700, 1989.

[WT91] Keith Waters and Demetri Terzopoulos. Modeling and animating faces using scanned data. The Journal of Visualization and Computer Animation, 2(4):123–128, 1991.

[WT95] Reiner Wilhelms-Tricarico. Physiological modeling of speech production: Methods for modeling soft-tissue articulators. J. Acoust. Soc. Am., 97(5):3085–3098, May 1995.

[WTP97] Reiner F. Wilhelms-Tricarico and Joseph S. Perkell. Biomechanical and physiological based speech modeling. In Jan P. H. van Santen et al., editors, Progress in Speech Synthesis, pages 221–233. Springer, 1997.

[XANK89] G. Xu, H. Agawa, Y. Nagashima, and Y. Kobayashi. A stereo based approach to face modeling for the ATR virtual space conferencing system. In Proc. SPIE Conference on Visual Communication and Image Processing, pages 365–379, 1989.

[YCH88] A. Yuille, D. Cohen, and P. Hallinan. Facial feature extraction by deformable templates. Technical Report 88-2, Harvard Robotics Laboratory, 1988.

[YD88a] John F. Yau and Neil D. Duffy. 3D facial animation using image samples. In N. Magnenat-Thalmann and D. Thalmann, editors, New Trends in Computer Graphics (Proceedings of CG International ’88), pages 64–73. Springer-Verlag, 1988.

[YD88b] John F. Yau and Neil D. Duffy. A texture mapping approach to 3-D facial image synthesis. Computer Graphics Forum, 7(2):129–134, June 1988.
