University of Pennsylvania ScholarlyCommons
Center for Human Modeling and Simulation, Department of Computer & Information Science
6-1-1993
Simulating Humans: Computer Graphics, Animation, and Control
Bonnie L. Webber, University of Pennsylvania, [email protected]
Cary B. Phillips, University of Pennsylvania
Norman I. Badler, University of Pennsylvania, [email protected]
Recommended Citation: Webber, B. L., Phillips, C. B., & Badler, N. I. (1993). Simulating Humans: Computer Graphics, Animation, and Control. Retrieved from https://repository.upenn.edu/hms/68
Reprinted with Permission by Oxford University Press. Reprinted from Simulating humans: computer graphics animation and control, Norman I. Badler, Cary B. Phillips, and Bonnie L. Webber (New York: Oxford University Press, 1993), 283 pages. Author URL: http://www.cis.upenn.edu/~badler/book/book.html
This paper is posted at ScholarlyCommons: https://repository.upenn.edu/hms/68. For more information, please contact [email protected].
Abstract

People are all around us. They inhabit our home, workplace, entertainment, and environment. Their presence and actions are noted or ignored, enjoyed or disdained, analyzed or prescribed. The very ubiquitousness of other people in our lives poses a tantalizing challenge to the computational modeler: people are at once the most common object of interest and yet the most structurally complex. Their everyday movements are amazingly fluid yet demanding to reproduce, with actions driven not just mechanically by muscles and bones but also cognitively by beliefs and intentions. Our motor systems manage to learn how to make us move without leaving us the burden or pleasure of knowing how we did it. Likewise we learn how to describe the actions and behaviors of others without consciously struggling with the processes of perception, recognition, and language.
SIMULATING HUMANS: COMPUTER GRAPHICS, ANIMATION, AND CONTROL

Norman I. Badler¹
Cary B. Phillips
Bonnie L. Webber

Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104-6389

Oxford University Press

© 1992 Norman I. Badler, Cary B. Phillips, Bonnie L. Webber

March 25, 1999

¹ Current address: Pacific Data Images, 1111 Karlstad Dr., Sunnyvale, CA 94089.

To Ginny, Denise, and Mark
Contents

1 Introduction and Historical Background  1
  1.1 Why Make Human Figure Models?  4
  1.2 Historical Roots  7
  1.3 What is Currently Possible?  11
    1.3.1 A Human Model must be Structured Like the Human Skeletal System  12
    1.3.2 A Human Model should Move or Respond Like a Human  12
    1.3.3 A Human Model should be Sized According to Permissible Human Dimensions  14
    1.3.4 A Human Model should have a Human-Like Appearance  15
    1.3.5 A Human Model must Exist, Work, Act and React Within a 3D Virtual Environment  15
    1.3.6 Use the Computer to Analyze Synthetic Behaviors  16
    1.3.7 An Interactive Software Tool must be Designed for Usability  18
  1.4 Manipulation, Animation, and Simulation  19
  1.5 What Did We Leave Out?  20

2 Body Modeling  23
  2.1 Geometric Body Modeling  23
    2.1.1 Surface and Boundary Models  23
    2.1.2 Volume and CSG Models  25
    2.1.3 The Principal Body Models Used  27
  2.2 Representing Articulated Figures  28
    2.2.1 Background  29
    2.2.2 The Terminology of Peabody  30
    2.2.3 The Peabody Hierarchy  31
    2.2.4 Computing Global Coordinate Transforms  33
    2.2.5 Dependent Joints  33
  2.3 A Flexible Torso Model  34
    2.3.1 Motion of the Spine  36
    2.3.2 Input Parameters  37
    2.3.3 Spine Target Position  38
    2.3.4 Spine Database  38
  2.4 Shoulder Complex  39
    2.4.1 Primitive Arm Motions  40
    2.4.2 Allocation of Elevation and Abduction  41
    2.4.3 Implementation of Shoulder Complex  41
  2.5 Clothing Models  45
    2.5.1 Geometric Modeling of Clothes  46
    2.5.2 Draping Model  48
  2.6 The Anthropometry Database  49
    2.6.1 Anthropometry Issues  49
    2.6.2 Implementation of Anthropometric Scaling  50
    2.6.3 Joints and Joint Limits  51
    2.6.4 Mass  53
    2.6.5 Moment of Inertia  53
    2.6.6 Strength  54
  2.7 The Anthropometry Spreadsheet  54
    2.7.1 Interactive Access Anthropometric Database  56
    2.7.2 SASS and the Body Hierarchy  57
    2.7.3 The Rule System for Segment Scaling  57
    2.7.4 Figure Creation  59
    2.7.5 Figure Scaling  59
  2.8 Strength and Torque Display  60
    2.8.1 Goals of Strength Data Display  61
    2.8.2 Design of Strength Data Displays  61

3 Spatial Interaction  67
  3.1 Direct Manipulation  67
    3.1.1 Translation  68
    3.1.2 Rotation  68
    3.1.3 Integrated Systems  69
    3.1.4 The Jack Direct Manipulation Operator  70
  3.2 Manipulation with Constraints  75
    3.2.1 Postural Control using Constraints  75
    3.2.2 Constraints for Inverse Kinematics  77
    3.2.3 Features of Constraints  78
    3.2.4 Inverse Kinematics and the Center of Mass  78
    3.2.5 Interactive Methodology  80
  3.3 Inverse Kinematic Positioning  83
    3.3.1 Constraints as a Nonlinear Programming Problem  86
    3.3.2 Solving the Nonlinear Programming Problem  87
    3.3.3 Assembling Multiple Constraints  91
    3.3.4 Stiffness of Individual Degrees of Freedom  93
    3.3.5 An Example  93
  3.4 Reachable Spaces  94
    3.4.1 Workspace Point Computation Module  96
    3.4.2 Workspace Visualization  97
    3.4.3 Criteria Selection  98

4 Behavioral Control  101
  4.1 An Interactive System for Postural Control  102
    4.1.1 Behavioral Parameters  103
    4.1.2 Passive Behaviors  109
    4.1.3 Active Behaviors  114
  4.2 Interactive Manipulation With Behaviors  116
    4.2.1 The Feet  117
    4.2.2 The Center of Mass and Balance  117
    4.2.3 The Torso  120
    4.2.4 The Pelvis  123
    4.2.5 The Head and Eyes  123
    4.2.6 The Arms  123
    4.2.7 The Hands and Grasping  126
  4.3 The Animation Interface  126
  4.4 Human Figure Motions  128
    4.4.1 Controlling Behaviors Over Time  129
    4.4.2 The Center of Mass  129
    4.4.3 The Pelvis  130
    4.4.4 The Torso  130
    4.4.5 The Feet  130
    4.4.6 Moving the Heels  131
    4.4.7 The Arms  132
    4.4.8 The Hands  132
  4.5 Virtual Human Control  132

5 Simulation with Societies of Behaviors  137
  5.1 Forward Simulation with Behaviors  139
    5.1.1 The Simulation Model  141
    5.1.2 The Physical Execution Environment  142
    5.1.3 Networks of Behaviors and Events  144
    5.1.4 Interaction with Other Models  145
    5.1.5 The Simulator  147
    5.1.6 Implemented Behaviors  149
    5.1.7 Simple human motion control  150
  5.2 Locomotion  150
    5.2.1 Kinematic Control  151
    5.2.2 Dynamic Control  152
    5.2.3 Curved Path Walking  154
    5.2.4 Examples  159
  5.3 Strength Guided Motion  161
    5.3.1 Motion from Dynamics Simulation  161
    5.3.2 Incorporating Strength and Comfort into Motion  163
    5.3.3 Motion Control  164
    5.3.4 Motion Strategies  167
    5.3.5 Selecting the Active Constraints  169
    5.3.6 Strength Guided Motion Examples  170
    5.3.7 Evaluation of this Approach  173
    5.3.8 Performance Graphs  173
    5.3.9 Coordinated Motion  174
  5.4 Collision-Free Path and Motion Planning  180
    5.4.1 Robotics Background  180
    5.4.2 Using Cspace Groups  181
    5.4.3 The Basic Algorithm  182
    5.4.4 The Sequential Algorithm  183
    5.4.5 The Control Algorithm  185
    5.4.6 The Planar Algorithm  186
    5.4.7 Resolving Conflicts between Different Branches  186
    5.4.8 Playing Back the Free Path  187
    5.4.9 Incorporating Strength Factors into the Planned Motion  189
    5.4.10 Examples  190
    5.4.11 Completeness and Complexity  191
  5.5 Posture Planning  192
    5.5.1 Functionally Relevant High-level Control Parameters  196
    5.5.2 Motions and Primitive Motions  197
    5.5.3 Motion Dependencies  197
    5.5.4 The Control Structure of Posture Planning  199
    5.5.5 An Example of Posture Planning  200

6 Task-Level Specifications  207
  6.1 Performing Simple Commands  208
    6.1.1 Task Environment  208
    6.1.2 Linking Language and Motion Generation  209
    6.1.3 Specifying Goals  209
    6.1.4 The Knowledge Base  210
    6.1.5 The Geometric Database  211
    6.1.6 Creating an Animation  211
    6.1.7 Default Timing Constructs  212
  6.2 Language Terms for Motion and Space  214
    6.2.1 Simple Commands  214
    6.2.2 Representational Formalism  215
    6.2.3 Sample Verb and Preposition Specifications  217
    6.2.4 Processing a sentence  219
    6.2.5 Summary  221
  6.3 Task-Level Simulation  222
    6.3.1 Programming Environment  223
    6.3.2 Task-actions  224
    6.3.3 Motivating Some Task-Actions  225
    6.3.4 Domain-specific task-actions  226
    6.3.5 Issues  228
    6.3.6 Summary  231
  6.4 A Model for Instruction Understanding  231

7 Epilogue  243
  7.1 A Roadmap Toward the Future  244
    7.1.1 Interactive Human Models  245
    7.1.2 Reasonable Biomechanical Properties  245
    7.1.3 Human-like Behaviors  245
    7.1.4 Simulated Humans as Virtual Agents  246
    7.1.5 Task Guidance through Instructions  246
    7.1.6 Natural Manual Interfaces and Virtual Reality  246
    7.1.7 Generating Text, Voice-over, and Spoken Explication for Animation  247
    7.1.8 Coordinating Multiple Agents  247
  7.2 Conclusion  248

Bibliography  249

Index  267
Preface

The decade of the 80's saw the dramatic expansion of high performance computer graphics into domains previously able only to flirt with the technology. Among the most dramatic has been the incorporation of real-time interactive manipulation and display for human figures. Though actively pursued by several research groups, the problem of providing a virtual or synthetic human for an engineer or designer already accustomed to Computer-Aided Design techniques was most comprehensively attacked by the Computer Graphics Research Laboratory at the University of Pennsylvania. The breadth of that effort as well as the details of its methodology and software environment are presented in this volume.

This book is intended for human factors engineers requiring current knowledge of how a computer graphics surrogate human can augment their analyses of designed environments. It will also help inform design engineers of the state-of-the-art in human figure modeling, and hence of the human-centered design central to the emergent notion of Concurrent Engineering. Finally, it documents for the computer graphics community a major research effort in the interactive control and motion specification of articulated human figures.

Many people have contributed to the work described in this book, but the textual material derives more or less directly from the efforts of our current and former students and staff: Tarek Alameldin, Francisco Azuola, Breck Baldwin, Welton Becket, Wallace Ching, Paul Diefenbach, Barbara Di Eugenio, Jeffrey Esakov, Christopher Geib, John Granieri, Marc Grosso, Pei-Hwa Ho, Mike Hollick, Moon Jung, Jugal Kalita, Hyeongseok Ko, Eunyoung Koh, Jason Koppel, Michael Kwon, Philip Lee, Libby Levison, Gary Monheit, Michael Moore, Ernest Otani, Susanna Wei, Graham Walters, Michael White, Jianmin Zhao, and Xinmin Zhao. Additional animation help has come from Leanne Hwang, David Haynes, and Brian Stokes. John Granieri and Mike Hollick helped considerably with the photographs and figures.

This work would not have been possible without the generous and often long term support of many organizations and individuals. In particular we would like to acknowledge our many colleagues and friends: Barbara Woolford, Geri Brown, Jim Maida, Abhilash Pandya and the late Linda Orr in the Crew Station Design Section and Mike Greenisen at NASA Johnson Space Center; Ben Cummings, Brenda Thein, Bernie Corona, and Rick Kozycki of the U.S. Army Human Engineering Laboratory at Aberdeen Proving Grounds; James Hartzell, James Larimer, Barry Smith, Mike Prevost, and Chris Neukom of the A³I Project in the Aeroflight Dynamics Directorate of NASA Ames Research Center; Steve Paquette of the U.S. Army Natick Laboratory; Jagdish Chandra and David Hislop of the U.S. Army Research Office; the Army Artificial Intelligence Center of Excellence at the University of Pennsylvania and its Director, Aravind Joshi; Art Iverson and Jack Jones of the U.S. Army TACOM; Jill Easterly, Ed Boyle, John Ianni, and Wendy Campbell of the U.S. Air Force Human Resources Directorate at Wright-Patterson Air Force Base; Medhat Korna and Ron Dierker of Systems Exploration, Inc.; Pete Glor and Joseph Spann of Hughes Missile Systems (formerly General Dynamics, Convair Division); Ruth Maulucci of MOCO Inc.; John McConville, Bruce Bradtmiller, and Bob Beecher of Anthropology Research Project, Inc.; Edmund Khouri of Lockheed Engineering and Management Services; Barb Fecht of Battelle Pacific Northwest Laboratories; Jerry Duncan of Deere and Company; Ed Bellandi of FMC Corp.; Steve Gulasy of Martin-Marietta Denver Aerospace; Joachim Grollman of Siemens Research; Kathleen Robinette of the Armstrong Medical Research Lab at Wright-Patterson Air Force Base; Harry Frisch of NASA Goddard Space Flight Center; Jerry Allen and the folks at Silicon Graphics, Inc.; Jack Scully of Ascension Technology Corp.; the National Science Foundation CISE Grant CDA88-22719 and ILI Grant USE-9152503; and the State of Pennsylvania Benjamin Franklin Partnership. Martin Zaidel contributed valuable LaTeX help. Finally, the encouragement and patience of Don Jackson at Oxford University Press has been most appreciated.

Norman I. Badler
University of Pennsylvania

Cary B. Phillips
PDI, Sunnyvale

Bonnie L. Webber
University of Pennsylvania
Chapter 1

Introduction and Historical Background

People are all around us. They inhabit our home, workplace, entertainment, and environment. Their presence and actions are noted or ignored, enjoyed or disdained, analyzed or prescribed. The very ubiquitousness of other people in our lives poses a tantalizing challenge to the computational modeler: people are at once the most common object of interest and yet the most structurally complex. Their everyday movements are amazingly fluid yet demanding to reproduce, with actions driven not just mechanically by muscles and bones but also cognitively by beliefs and intentions. Our motor systems manage to learn how to make us move without leaving us the burden or pleasure of knowing how we did it. Likewise we learn how to describe the actions and behaviors of others without consciously struggling with the processes of perception, recognition, and language.
A famous Computer Scientist, Alan Turing, once proposed a test to determine if a computational agent is intelligent [Tur63]. In the Turing Test, a subject communicates with two agents, one human and one computer, through a keyboard which effectively restricts interaction to language. The subject attempts to determine which agent is which by posing questions to both of them and guessing their identities based on the "intelligence" of their answers. No physical manifestation or image of either agent is allowed as the process seeks to establish abstract "intellectual behavior," thinking, and reasoning. Although the Turing Test has stood as the basis for computational intelligence since 1963, it clearly omits any potential to evaluate physical actions, behavior, or appearance.
Later, Edward Feigenbaum proposed a generalized definition that included action: "Intelligent action is an act or decision that is goal-oriented, arrived at by an understandable chain of symbolic analysis and reasoning steps, and is one in which knowledge of the world informs and guides the reasoning" [Bod77]. We can imagine an analogous "Turing Test" that would have the subject watching the behaviors of two agents, one human and one synthetic, while trying to determine at a better than chance level which is which. Human movement enjoys a universality and complexity that would definitely challenge an animated figure in this test: if a computer-synthesized figure looks, moves, and acts like a real person, are we going to believe that it is real? On the surface the question almost seems silly, since we would rather not allow ourselves to be fooled. In fact, however, the question is moot though the premises are slightly different: cartoon characters are hardly "real," yet we watch them and properly interpret their actions and motions in the evolving context of a story. Moreover, they are not "realistic" in the physical sense: no one expects to see a manifest Mickey Mouse walking down the street. Nor do cartoons even move like people; they squash and stretch and perform all sorts of actions that we would never want to do. But somehow our perceptions often make these characters believable: they appear to act in a goal-directed way because their human animators have imbued them with physical "intelligence" and behaviors that apparently cause them to chase enemies, bounce off walls, and talk to one another. Of course, these ends are achieved by the skillful weaving of a story into the crafted images of a character. Perhaps surprisingly, the mechanism by which motion, behavior, and emotion are encoded into cartoons is not the construction of synthetic models of little creatures with muscles and nerves. The requisite animator skills do not come easily; even in the cartoon world, refinements to the art and technique took much work, time, and study [TJ81]. Creating such movements automatically in response to real-time interactive queries posed by the subject in our hypothetical experiment does not make the problem any easier. Even Turing, however, admitted that the intelligence sought in his original test did not require the computational process of thinking to be identical to that of the human: the external manifestation in a plausible and reasonable answer was all that mattered.
So why are we willing to assimilate the truly artificial reality of cartoons, characters created and moved entirely unlike "real" people, yet be skeptical of more human-like forms? This question holds the key to our physical Turing Test: as the appearance of a character becomes more human, our perceptual apparatus demands motion qualities and behaviors which sympathize with our expectations. As a cartoon character takes on a human form, the only currently viable method for accurate motion is the recording of a real actor and the tracing or transfer ("rotoscoping") of that motion into the animation. Needless to say, this is not particularly satisfying to the modeler: the motion and actor must exist prior to the synthesized result. Even if we recorded thousands of individual motions and retrieved them through some kind of indexed video, we would still lack the freshness, variability, and adaptability of humans to live, work, and play in an infinite variety of settings.

If synthetic human motion is to be produced without the benefit of prior "real" execution and still have a shot at passing the physical Turing Test, then models must carefully balance structure, shape, and motion in a compatible package. If the models are highly simplified or stylized, cartoons or caricatures will be the dominant perception; if they look like humans, then they will be expected to behave like them. How to accomplish this without a real actor showing the way is the challenge addressed here.
Present technology can approach human appearance and motion through computer graphics modeling and three-dimensional animation, but there is considerable distance to go before purely synthesized figures trick our senses. A number of promising research routes can be explored and many are taking us a considerable way toward that ultimate goal. By properly delimiting the scope and application of human models, we can move forward, not to replace humans, but to substitute adequate computational surrogates in various situations otherwise unsafe, impossible, or too expensive for the real thing.

The goals we set in this study are realistic but no less ambitious than the physical Turing Test: we seek to build computational models of human-like figures which, though they may not trick our senses into believing they are alive, nonetheless manifest animacy and convincing behavior. Towards this end, we:

- Create an interactive computer graphics human model.
- Endow it with reasonable biomechanical properties.
- Provide it with "human-like" behaviors.
- Use this simulated figure as an agent to effect changes in its world.
- Describe and guide its tasks through natural language instructions.

There are presently no perfect solutions to any of these problems, but significant advances have enabled the consideration of the suite of goals under uniform and consistent assumptions. Ultimately, we should be able to give our surrogate human directions that, in conjunction with suitable symbolic reasoning processes, make it appear to behave in a natural, appropriate, and intelligent fashion. Compromises will be essential, due to limits in computation, throughput of display hardware, and demands of real-time interaction, but our algorithms aim to balance the physical device constraints with carefully crafted models, general solutions, and thoughtful organization.
This study will tend to focus on one particularly well-motivated application for human models: human factors analysis. While not as exciting as motion picture characters, as personable as cartoons, or as skilled as Olympic athletes, there are justifiable uses for virtual human figures in this domain. Visualizing the appearance, capabilities and performance of humans is an important and demanding application (Plate 1). The lessons learned may be transferred to less critical and more entertaining uses of human-like models. From modeling realistic or at least reasonable body size and shape, through the control of the highly redundant body skeleton, to the simulation of plausible motions, human figures offer numerous computational problems and constraints. Building software for human factors applications serves a widespread, non-animator user population. In fact, it appears that such software has broader application, since the features needed for analytic applications, such as multiple simultaneous constraints, are extremely useful for the conventional animator as well. Our software design has tried to take into account a wide variety of physical problem-oriented tasks, rather than just offer a computer graphics and animation tool for the already skilled or computer-sophisticated animator.
The remainder of this chapter motivates the human factors environment and then traces some of the relevant history behind the simulation of human figures in this and other domains. It concludes with a discussion of the specific features a human modeling and animation system should have and why we have concentrated on some and not others. In particular, we are not considering cognitive problems such as perception or sensory interpretation, target tracking, object identification, or control feedback that might be important parts of some human factors analyses. Instead we concentrate on modeling a virtual human with reasonable biomechanical structure and form, as described in Chapter 2. In Chapter 4 we address the psychomotor behaviors manifested by such a figure and show how these behaviors may be interactively accessed and controlled. Chapter 5 presents several methods of motion control that bridge the gap between biomechanical capabilities and higher level tasks. Finally, in Chapter 6 we investigate the cognition requirements and strategies needed to have one of these computational agents follow natural language task instructions.
1.1 Why Make Human Figure Models?
Our research has focused on software to make the manipulation of a simulated human figure easy for a particular user population: human factors design engineers or ergonomics analysts. These people typically study, analyze, assess, and visualize human motor performance, fit, reach, view, and other physical tasks in a workplace environment. Traditionally, human factors engineers analyze the design of a prototype workplace by building a mock-up, using real subjects to perform sample tasks, and reporting observations about design satisfaction. This is limiting for several reasons. Jerry Duncan, a human factors engineer at Deere & Company, says that once a design has progressed to the stage at which there is sufficient information for a model builder to construct the mock-up, there is usually so much inertia to the design that radical changes are difficult to incorporate due to cost and time considerations. After a design goes into production, deficiencies are alleviated through specialized training, limits on physical characteristics of personnel, or various operator aids such as mirrors, markers, warning labels, etc. The goal of computer-simulated human factors analysis is not to replace the mock-up process altogether, but to incorporate the analysis into early design stages so that designers can eliminate a high proportion of fit and function problems before building the mock-ups. Considering human factors and other engineering and functional analyses together during rather than after the major design process is a hallmark of Concurrent Engineering [Hau89].

It is difficult to precisely characterize the types of problems a human factors engineer might address. Diverse situations demand empirical data on human capabilities and performance in generic as well as highly specific tasks. Here are some examples.
- Population studies can determine body sizes representative of some group, say NASA astronaut trainees, and this information can be used to determine if space vehicle work cells are adequately designed to fit the individuals expected to work there. Will all astronauts be able to fit through doors or hatches? How will changes in the workplace design affect the fit? Will there be unexpected obstructions to zero gravity locomotion? Where should foot- and hand-holds be located?

- An individual operating a vehicle such as a tractor will need to see the surrounding space to execute the task, avoid any obstructions, and insure safety of nearby people. What can the operator see from a particular vantage point? Can he control the vehicle while looking out the rear window? Can he see the blade in order to follow an excavation line?

- Specific lifting studies might be performed to determine back strain limits for a typical worker population. Is there room to perform a lift properly? What joints are receiving the most strain? Is there a better posture to minimize torques? How does placement of the weight and target affect performance? Is the worker going to suffer fatigue after a few iterations?

- Even more specialized experiments may be undertaken to evaluate the comfort and feel of a particular tool's hand grip. Is there sufficient room for a large hand? Is the grip too large for a small hand? Are all the controls reachable during the grip?
The answers to these and other questions will either verify that the design is adequate or point to possible changes and improvements early in the design process. But once again, the diversity of human body sizes coupled with the multiplier of human action and interaction with a myriad of things in the environment leads to an explosion in possible situations, data, and tests. Any desire to build a "complete" model of human behavior, even for the human factors domain, is surely a futile effort. The field is too broad, the literature immense, and the theory largely empirical. There appear to be two directions out of this dilemma. The first would be the construction of a computational database of all the known, or at least useful, data. Various efforts have been undertaken to assemble such material, for example, the NASA sourcebooks [NAS78, NAS87] and the Engineering Data Compendium [BKT86, BL88]. The other way is to build a sophisticated computational human model and use it as a subject in simulated virtual environment tests. The model will utilize an ever-expanding human factors data set to dictate its performance. Upon some reflection, it appears that the database direction may
6 CHAPTER 1. INTRODUCTION AND HISTORICAL BACKGROUND
start out as the smoother road, but it quickly divides into numerous sinuous paths pot-holed with data gaps, empirical data collection limitations, and population-specific dependencies. The alternative direction, using a computational model underlying any data, may be harder to construct at first, and may have many detours for a while, but gradually it leads to more destinations with better roads.
This metaphor carries a philosophy for animating human movement that derives from a computer science rather than an empirical point of view. We cannot do without the efforts of the human factors community, but we cannot use their work per se as the starting point for human figure modeling. Computer scientists seek computationally general yet efficient solutions to problems. Human factors engineers often analyze a succession of specific tasks or situations. The role we play is transforming the specific needs of the engineer or analyst into a generalized setting where some large percentage of situations may be successfully analyzed. There is sufficient research required to solve general yet difficult problems to justify building suitable software in a computer science environment. The expectation is that in the long run a more specific case-by-case implementation approach will be economically impractical or technologically infeasible.
As we continue to interact with human factors specialists, we have come to appreciate the broad range of problems they must address: fit, reach, visibility, comfort, access, strength, endurance, and fatigue, to mention only some of the non-cognitive ones. Our approach is not a denial of their perception and analysis; rather, it is an alternative view of the problem as modeling. Broadly speaking, modeling is the embodiment within computer databases or programs of worldly phenomena. Models can be of many types:
Mathematical formulations: physical equations of motion, limb strength in tables of empirical data, evaluation formulas measuring workload or fatigue.
Geometric and topological models: structures representing workplace objects, human body segments, paths to follow, joints and joint limits, spaces that can be reached, attachments, and constraints.
Conceptual models: names for things, attributes such as color, flexibility, and material, relationships between objects, functional properties.
Of course, modeling (especially of the first sort) is a significant and fundamental part of many studies in the human factors domain, but it has been difficult to balance the needs of the engineer against the complexity of the modeling software. Often, the model is elaborated in only a few dimensions to study some problem while no global integration of models is attempted. Clearly the broadest interpretation of modeling draws not only from many areas of computer science such as artificial intelligence, computer graphics, simulation, and robotics, but also from the inherently relevant fields of biomechanics, anthropometry, physiology, and ergonomics.
The challenge to embed a reasonable set of capabilities in an integrated system has provided dramatic incentives to study issues and solutions in three-dimensional interaction methodologies, multiple goal positioning, visual field assessment, reach space generation, and strength guided motion, to name a few. The empirical data behind these processes is either determined from reliable published reports or supplied by the system user. By leaving the actual data open to the user, the results are as valid as the user wishes to believe. While this is not a very pleasant situation, the inherent variability in human capability data makes some error unavoidable. Better, we think, to let the user know or control the source data than to hide it. This attitude toward validation is not the only plausible one, but it does permit flexibility and generality for the computer and allows final judgment to be vested in the user.
Lest there be concern that we have pared the problem down so far that little of interest remains, we change tactics for a while and present a historical view of efforts to model humans and their movements. By doing so, we should demonstrate that the human factors domain mirrors problems which arise in other contexts such as dance, sports, or gestural communication. The criteria for success in these fields may be more stringent, so understanding the role and scope of human movement in them can only serve to strengthen our understanding of more mundane actions.
1.2 Historical Roots
Interactive computer graphics systems to support human figure modeling, manipulation, and animation have existed since the early seventies. We trace relevant developments with a sense more of history and evolution than of exhaustive survey. There are numerous side branches that lead to interesting topics, but we will sketch only a few of those here.

Three-dimensional human figure models apparently arose independently from at least six different applications.
1. Crash simulation. Automobile and aircraft safety issues led to the development of sophisticated codes for linked mass deceleration studies. These programs generally ran in batch mode for long hours on mainframe computers. The results were tabulated, and in some cases converted to a form that could animate a simple 3D mannequin model [Fet82, Wil82, BOT79]. The application was characterized by non-interactive positioning and force-based motion computations with analysis of impact forces to affected body regions and subsequent injury assessment.
2. Motion analysis. Athletes, patients with psychomotor disabilities, actors, or animals were photographed in motion by one or more fixed, calibrated cameras. The two-dimensional information was correlated between views and reconstructed as timed 3D data points [RA90]. This data could be filtered and differentiated to compute velocities, accelerations, torques and forces. Visualization of the original data validated the data collection process, but required human figure models. Often just wire-frames, they served in a support role for athletic performance improvement, biomechanical analysis, cartoon motion [TJ81], and training or physical therapy [Win90]. Related efforts substituted direct or active sensing devices for photographic processing [CCP80]. Presently, active motion sensing is used not only for performance analysis but gestural input for virtual environments (for example, [FMHR87, BBH+90] and many others).
3. Workplace assessment. The earliest system with widespread use was SAMMIE [KSC81]. This problem domain is characterized by interactive body positioning requirements and analyses based on visual inspection of 3D computer graphics models. Fast interaction with wire-frame displays provided dynamic feedback to the workplace evaluator. Other modeling tools were developed, such as CAR II [HBD80], Combiman [BEK+81], and Crew Chief [MKK+88, EI91], to provide validated anthropometric or capability data for real populations.
4. Dance or movement notation. The specification of self-generated, purposive, aesthetically-pleasing human movement has been the subject of numerous notational systems [Hut84]. Dance notations were considered as a viable, compact, computationally tractable mode of expression for human movement due to their refinement as symbolic motion descriptions [BS79]. An animation was to be the debugging tool to validate the correctness of a given notated score. The direct creation of movement through a notational or numeric interface was also considered [CCP80, CCP82, HE78].
5. Entertainment. People, or at least animate creatures, are the favorite subject of cartoons and movies. Two-dimensional animation techniques were the most widely used [Cat72, BW76, Lev77, Cat78]. In an effort to avoid rotoscoping live actors, early 3D modeling and animation techniques were developed at the University of Utah, Ohio State University, and the New York Institute of Technology [Wes73, Hac77, Stu84, Gom84, HS85a].
6. Motion understanding. There are deep connections between human motion and natural language. One of these attempted to produce a sort of narration of observed synthetic movement by characterizing changes in spatial location or orientation descriptions over time [Bad75]. More recently, the inverse direction has been more challenging, namely, producing motion from natural language descriptions or instructions [BWKE91, TST87].
Our earliest efforts were directed at language descriptions of object motion. Specifically, we created representations of 3D object motion and directional adverbials in such a way that image sequences of moving objects could be analyzed to produce English sentence motion descriptions [Bad75, Bad76]. We then extended the model to articulated figures, concentrating on graphically valid human figure models to aid the image understanding process [BOT79]. This effort led to the work of Joseph O'Rourke, who attempted model-driven analysis of human motion [OB80] using novel 3D constraint propagation and goal-directed image understanding.
To improve the motion understanding component we focused on motion representations specially designed for human movement [WSB78, BS79]. An in-depth study of several human movement notation systems such as Labanotation [Hut70] and Eshkol-Wachmann [Hut84] fostered our appreciation for the breadth and complexity of human activities. Our early attempts to re-formulate Labanotation in computational models reflected a need to cover at least the space of human skeletal motion. We investigated input systems for Labanotation [BS76, Hir77], although later they were discarded as a generally accessible means of conveying human movement information from animator to computer figure: there was simply too much overhead in learning the nuances and symbology of the notational system. Moreover, concurrent developments in three-dimensional interactive computer graphics offered more natural position and motion specification alternatives. The final blow to using Labanotation was its lack of dynamic information other than timing and crude "accent" and phrasing marks.
The movement representations that we developed from Labanotation, however, retained one critically important feature: goal-directedness for efficient motion specification. Given goals, processes had to be developed to satisfy them. A simulation paradigm was adopted and some of the special problems of human movement simulation were investigated [BSOW78, BOK80, KB82]. Others studied locomotion [CCP82, Zel82, GM85, Gir87, Bru88, BC89], while we concentrated on inverse kinematics for reach goals [KB82, Kor85]. We were especially anxious to manage multiple reach and motion goals that mutually affected many parts of the body. Solving this problem in particular led to later re-examination of constraint-based positioning and more general and robust algorithms to achieve multiple simultaneous goals [BMW87, ZB89]. Chapter 4 will discuss our current approach in detail.
Our study of movement notations also led to an appreciation of certain fundamental limitations most of them possessed: they were good at describing the changes or end results of a movement (what should be done), but were coarse or even non-specific when it came to indicating how a movement ought to be performed. The notator's justification was that the performer (for example, a dancer) was an expert system who knew from experience and training just how to do the notated motion. The transformation from notation into smooth, natural, expressive movements was part of the art. The exception to the nearly universal failure of notational systems to capture nuances of behavior was Effort-Shape notation [Del70, BwDL80]. We began a study of possible computational analogues to the purely descriptive semantics of that system. By 1986 a model of human movement emerged which integrated the kinematic and inverse kinematic approach with a dynamic, force-based model [Bad89]. Major contributions to dynamics-based animation were made by others, notably [AG85, AGL87, WB85, Wil86, Wil87, IC87, HH87, Hah88]. Recently we combined some of the characteristics of the dynamics approach (the use of physical torques at the body joints) with goal-directed behavior to achieve strength guided motion ([LWZB90] and Chapter 5).
While we were actively engaged in the study of motion representations, concurrent developments in interactive systems for the graphical manipulation of a computerized figure were being actively implemented at the University of Pennsylvania. Implementation and development has been a strong experimental component of our research, from the early positioning language based on Labanotation concepts [WSB78], to the next generation frame buffer-based system called TEMPUS [Kor85, BKK+85], to the direct manipulation of the figure with a 6-axis digitizer [BMB86], and finally to our present Silicon Graphics workstation-based system Jack [PB88, PZB90] (Chapters 2 and 4).
As our experience with interactive graphical manipulation of a figure matured, we returned to the connections between language and motion we had begun in the mid-1970's. The manipulation of the figure for task analysis begged for more efficient means of specifying the task. So we began to investigate natural language control for task animation [Gan85, BG86, Kar87, Kar88, Kal90, KB90, KB91]. New representations for motion verbs and techniques for defining and especially executing their semantics were investigated. A simple domain of panel-type objects and their motions was studied by Jeff Gangel. Robin Karlin extended the semantics to certain temporal adverbials such as repetitions and culminations in a domain of kitchen objects. Our present effort is exemplified here by the work of Jugal Kalita and Libby Levison. Kalita studied verbs of physical manipulation and used constraints in a fundamental fashion to determine generalized verb semantics. Levison makes explicit connections between a verb's semantic representation and the sorts of primitive behaviors and constraints known to be directly simulatable by the Jack animation system (Chapter 6).
Given that natural language or some other artificial language was to be used to describe tasks or processes, a suitable simulation methodology had to be adopted. In his HIRES system, Paul Fishwick investigated task and process simulation for human animation [Fis86, Fis88]. Output was produced by selecting from among pre-defined key postures. For example, an animation of the famous "Dining Philosophers" problem using five human figure models was produced by simulation of the Petri net solution in the HIRES simulator. By 1989 we replaced HIRES by a new simulation system, YAPS, which incorporated temporal planning with imprecise specifications [KKB88], task interruption, and task time estimation based on human performance models [EBJ89, EB90, BWKE91] (Chapter 6).
This brings us to the present Jack system structure, designed to accommodate as many of the historical applications as possible within an integrated and consistent software foundation. (Jack is a registered trademark of the University of Pennsylvania.) To begin to describe that, we need to review what human modeling capabilities are needed and what problem implementation choices we might make.
1.3 What is Currently Possible?
Since we wish to animate synthetic human figures primarily in the human factors engineering domain, we should decide what features are essential, desirable, optional, or unnecessary. Only by prioritizing the effort can such a large-scale undertaking be managed. Given priorities, implementation techniques and trade-offs may be investigated. Though we may often draw on existing knowledge and algorithms, there are many fundamental features which we may have to invent or evolve due to the specific structure of the human figure, characteristics of human behavior, timing demands of real-time interaction, or limitations of the display hardware. Accordingly, a variety of human figure modeling issues will be examined here to introduce and justify the choices we have made in our broad yet integrated effort.
The embodiment of our choices for human modeling is a software system called Jack. Designed to run on Silicon Graphics 4D workstations, Jack is used for the definition, manipulation, animation, and human factors performance analysis of virtual human figures. Built on a powerful representation for articulated figures, Jack offers the interactive user a simple, intuitive, and yet extremely capable interface into any three-dimensional world. Jack incorporates sophisticated yet highly usable algorithms for anthropometric human figure generation, a flexible torso, multiple limb positioning under constraints, view assessment, reach space generation, and strength guided performance simulation of human figures. Of particular importance is a simulation level which allows access to Jack by high level task control, various knowledge bases, task definitions and natural language instructions. Thus human activities can be visualized from high level task understanding and planning as well as by interactive specification.
One can think of Jack as an experimental environment in which a number of useful general variables may be readily created, adjusted, or controlled: the workplace, the task, the human agents and some responses of the workplace to internally or externally controlled actions. The results of specific instances of these input parameters are reported through computer graphics displays, textual information, and animations. Thus the field of view of a 50th percentile male while leaning over backwards in a tractor seat as far as possible may be directly visualized through the graphic display. If the figure is supposed to watch the corner of the bulldozer blade as it moves through its allowed motion, the human figure's gaze will follow in direct animation of the view.
In the following subsections, several desiderata are presented for human models. Under each, we summarize the major features, with justifications and benefits, of the Jack software. The detailed discussions of these features and their implementation constitute the bulk of the remaining chapters.
1.3.1 A Human Model must be Structured Like the Human Skeletal System
To build a biomechanically reasonable figure, the skeletal structure should resemble but need not copy that of humans. We can buffer the complexity of actual bone shapes, joint types and joint contact surfaces with requirements for interactive use and external motion approximations. For example, rotational joints are usually assumed to have a virtual center about which the adjacent body segments move. While such simplifications would not be appropriate for, say, knee prosthesis design, there appears to be little harm in variations on the order of a centimeter or so. Of course, there are situations where small departures from reality could affect the verisimilitude of the figure; accordingly we have concentrated on rather accurate models for the torso and shoulder complex. Many other software systems or manual methods presume a fixed shoulder joint, but it is obvious that this is not true as the arm is elevated.
1. Jack has a fully linked body model including a 17-segment flexible torso with vertebral joint limits. In general, individual joints may have one, two, or three degrees of freedom (DOFs). Related groups of joints, such as the spine or the shoulder complex, may be manipulated as a unit. The result is reasonable biomechanical realism with only modest computational overhead.
2. The Jack shoulder mass joint center is posture-dependent. Accurate shoulder motion is modeled through an explicit dependency between arm position and clavicle rotation. The figure therefore presents appropriate shoulder and clavicle motions during positioning. The shoulder joint has spherical (globographic [EP87]) limits for improved motion range accuracy.
3. All joint rotations are subject to limits. During manipulation, rotations propagate when joint limits would be exceeded.

4. A fully articulated hand model is attached.

5. The foot is articulated enough to provide toe and heel flexibility. If more DOFs were required, they could be easily added.
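The propagation behavior in item 3 can be sketched for a serial chain: a requested rotation is clamped at the joint's limit and the unabsorbed remainder is handed to the parent joint. This is an illustrative reconstruction of the idea, not Jack's actual algorithm, and the joint limits used are invented:

```python
def rotate_with_propagation(chain, index, delta_deg):
    """Apply delta_deg to chain[index]; any overflow beyond that joint's
    limit propagates to the parent joint (index - 1), and so on up the
    chain. Each joint is a dict with 'angle', 'lo', 'hi' (degrees).
    Returns whatever rotation the whole chain could not absorb."""
    while index >= 0 and abs(delta_deg) > 1e-9:
        j = chain[index]
        target = j["angle"] + delta_deg
        clamped = max(j["lo"], min(j["hi"], target))
        delta_deg = target - clamped   # unabsorbed remainder
        j["angle"] = clamped
        index -= 1                     # hand the remainder to the parent
    return delta_deg

# Elbow -> wrist example: asking the wrist for 50 degrees when it can
# only flex 20 more sends the extra 30 degrees to the elbow.
arm = [{"angle": 10.0, "lo": 0.0, "hi": 150.0},   # elbow (parent)
       {"angle": 60.0, "lo": -70.0, "hi": 80.0}]  # wrist (child)
leftover = rotate_with_propagation(arm, 1, 50.0)
print(arm[1]["angle"], arm[0]["angle"], leftover)  # -> 80.0 40.0 0.0
```

The interactive effect is what the text describes: dragging a hand past the wrist's range visibly recruits the elbow, and then the shoulder, rather than stopping dead.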
1.3.2 A Human Model should Move or Respond Like a Human
Ideally, the motions presented by a simulated figure will be biomechanically valid. They should not only appear "human-like," but they should be validated against empirical data for real subjects under similar conditions. This goal is desirable but difficult to reach in a generalized motion model precisely because such a model must allow interpolation and extrapolation to situations other than those originally measured. Models are needed to provide reasonable interpretations of data that by necessity must be sampled rather coarsely, in specific situations, and with a collection of specific subjects. The closest we can get to the ideal is to provide generic mechanisms that incorporate whenever possible empirical data that a user believes to be valid up to the degree of error permitted for the task. Rather than hide such data, Jack takes the view of an open database where reasonable default human anthropometric, strength or performance data is provided, but user customizing is the rule rather than the exception.
1. Jack permits multiple figures to simultaneously inhabit an environment. Multi-person environments and operator interactions may be studied for interference, view, and coordinated tasks.
2. A number of active behaviors are defined for the figure and may be selected or disabled by the user. Among the most interesting is constraining the center of mass of the entire figure during manipulation. This allows automatic balance and weight-shifting while other tasks such as reaching, viewing, bending over, etc. are being performed. Spare cycles on the workstation are used in between human operator inputs to constantly monitor the active constraints and move the body joints towards their satisfaction.
3. People usually move in ways that conserve resources except when they are deliberately trying to achieve optimum performance. If strength information as a function of body posture and joint position is available, that data may be used to predict certain end-effector motion paths under specified "comfort" conditions. Thus the exact motions involved in, say, lifting a weight are subservient to the strength model, comfort and fatigue parameters, and heuristics for selecting among various movement strategies. By avoiding "canned" or arbitrary (for example, straight-line) motion paths, great flexibility in executing tasks is provided.
4. The Jack hand model has an automatic grip. This feature saves the user from the independent manipulation of large numbers of joints and DOFs. The user specifies a grip type and an optional site on the object to be grasped. When possible, the hand itself chooses a suitable approach direction. Though frictional forces are not modeled, the positioning task is greatly aided by the hand's skill. Once gripped, the object stays attached to the hand and moves along with it until explicitly freed by the user.
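The balance behavior in item 2 rests on a simple computation: the whole-body center of mass is the mass-weighted average of the segment centers, and balance holds while its ground projection stays inside the support region. A sketch follows, with the support region simplified to the one-dimensional span between the feet and with entirely invented segment masses and positions:

```python
def center_of_mass(segments):
    """Mass-weighted average of segment centers.
    segments is a list of (mass_kg, (x, y, z)) tuples."""
    total = sum(m for m, _ in segments)
    return tuple(sum(m * c[i] for m, c in segments) / total
                 for i in range(3))

def balanced(com, left_foot_x, right_foot_x):
    """Crude 1D support test: the COM's ground projection must lie
    between the feet (a stand-in for the full support polygon)."""
    return left_foot_x <= com[0] <= right_foot_x

# Torso, two legs, an outstretched arm, and a heavy held load.
body = [(40.0, (0.0, 1.1, 0.0)),    # torso
        (10.0, (-0.1, 0.5, 0.0)),   # left leg
        (10.0, (0.1, 0.5, 0.0)),    # right leg
        (5.0,  (0.5, 1.3, 0.0)),    # arm reaching forward
        (15.0, (0.7, 1.2, 0.0))]    # held load, well in front of the feet
com = center_of_mass(body)
print(balanced(com, -0.15, 0.15))   # -> False
```

The load pushes the projected center of mass past the toes; a balance constraint would respond by shifting the hips back or bending the knees until the test succeeds, which is exactly the automatic weight-shifting described above.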
1.3.3 A Human Model should be Sized According to Permissible Human Dimensions
Effectively, there is no such thing as an "average" human. Statistically, one must always prescribe a target population when talking about percentiles of size, weight, or stature. A person may be 50th percentile in stature, 75th percentile in weight, but 40th percentile in lower leg length. Human dimensional variability is enormous but not arbitrary. Within a given population, for example, 5th percentile legs might never be found on anybody with 95th percentile arms, even though the population allows such sizes individually. Moreover, dimensions are not just limited to lengths, stature, and weight, but include joint limits, moments of inertia for each body segment, muscle strengths, fatigue rates, and so on. For proper behaviors, one must be able to instantiate a properly sized figure with appropriately scaled attributes, preferably from a known population suitable for the required task analysis.
1. The Jack anthropometric database is not proprietary. All data is readily available and accessible. Consequently, it is easily customized to new populations or sets of individuals. Some databases are available, such as NASA astronaut trainees, Army soldiers, and Society of Automotive Engineers "standard people."
2. A database may consist of either population statistics or individuals. If populations, then percentile data points are expected in order to define body dimensions. If individuals, then explicit information for each person in the collection is separately stored. For example, the NASA astronaut trainees constitute an explicit list of individuals, while the Army soldier data is statistically derived.
3. Enough information about a human figure must be stored to permit body sizing, display, and motion. We use overall segment dimensions (length, width, thickness), joint limits, mass, moment of inertia, and strength. Geometric properties are used during graphical manipulation; physical properties aid active behaviors (balance, dynamic simulation, and strength guided motion).
4. With so many DOFs in a body and many useful physical attributes, there must be a convenient way of accessing, selecting, and modifying any data. Jack uses a spreadsheet-like interface to access anthropometry data. This paradigm permits simple access and data interdependencies: for example, changing leg length should change stature; changing population percentile should change mass distribution.
5. Seeing the results of changing body dimensions is important to understanding how different bodies fit, reach, and see in the same workplace. Jack allows interactive body sizing while the body itself is under active constraints. Changing a figure seated in a cockpit from 95th percentile to 5th percentile, for example, creates interesting changes in foot position, arm postures and view.
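Sizing from population statistics (item 2 above) is commonly done by treating a dimension as normally distributed, so that a percentile maps to a value through the inverse normal CDF. A sketch follows using invented stature statistics and a crude fixed-proportion rule for a segment; none of these numbers come from the databases named above:

```python
from statistics import NormalDist

# Hypothetical stature statistics for some population, in millimeters.
STATURE = NormalDist(mu=1755.0, sigma=70.0)

def stature_at_percentile(p):
    """Stature (mm) at population percentile p (0 < p < 100)."""
    return STATURE.inv_cdf(p / 100.0)

# Crude proportional scaling: treat lower leg length as a fixed fraction
# of stature (real anthropometric data refines this per population).
LOWER_LEG_FRACTION = 0.246

for p in (5, 50, 95):
    s = stature_at_percentile(p)
    print(f"{p:2d}th percentile: stature {s:6.0f} mm, "
          f"lower leg ~{s * LOWER_LEG_FRACTION:5.0f} mm")
```

A real anthropometric database goes further than this sketch in two ways the text emphasizes: each dimension carries its own distribution rather than a fixed fraction of stature, and correlations between dimensions are respected, which is why 5th percentile legs rarely co-occur with 95th percentile arms.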
1.3.4 A Human Model should have a Human-Like Appearance
As we argued earlier, a human model's appearance has a lot to do with our perception of acceptable behavior. The more accurate the model, the better the motions ought to be. Providing a selection of body models is a convenient way to handle a spectrum of interactive analysis and animation requirements. For quick assessments, a human model with simplified appearance might be fine. If the skin surface is not totally realistic, the designer can move the view around to check for sufficient clearances, for example. When the completed analysis is shown to the boss, however, an accurate skin model might be used so that the robotic nature of the simpler model does not obscure the message. Looking better is often associated in computer graphics with being better.
1. The "standard" or default body model in Jack strikes a balance between detail and interactive manipulation speed. It appears solid, has a 17-segment flexible torso, has a reasonable shoulder/clavicle mass, has a full hand, and has a generic face. A hat is added to avoid modeling hairstyles. The figure is modeled by surface polygons to take advantage of the available workstation display capabilities.
2. There are times when more accurate skin surface models are needed. For that, computerized models of real people are used. These are derived from a database of biostereometrically-scanned bodies.
3. For extra realism, clothing is added to body segments by expanding and coloring the existing segment geometry. Besides the inherent desirability of having a virtual figure in a work environment appear to be dressed, clothing will also affect task performance if adjustments are made to joint limits or if collision tests are performed.
4. Facial features may be provided by graphical texture maps. By showing a specific face, a particular individual may be installed in the scene, the figure may be uniquely identified throughout an animation, or generic gender may be conveyed. Moreover, if the face is animated, an entire communication channel is enabled.
1.3.5 A Human Model must Exist, Work, Act and React Within a 3D Virtual Environment
We do not live in Flatland [Abb53] and neither should virtual figures. Three-dimensional environment modeling is the hallmark of contemporary computer-aided design (CAD) systems. The workplace will frequently be constructed electronically for design, analysis, and manufacturing reasons; we merely add human factors analysis to the list. Design importation facilitates up-front analyses before commitment to production hardware. Since geometric models are readily constructed, we must be sure that our body models and interactive software are compatible with the vast majority of CAD modeling schemes so the two can work together in the same virtual space.
1. Since Jack manipulates surface polygon geometry, simple features are provided to interactively construct and edit the geometry of the virtual workplace. Jack is not intended as a substitute for a good CAD system, but CAD features are provided for convenience. For example, if a designer finds a problem with the imported workplace, the offending region can be immediately edited in Jack. When the changes prove acceptable, the designer can use the data values from Jack to modify the "real" model in the external CAD system. While not optimal, this avoids any tendency to migrate the working model into Jack and bypass the more generalized features provided by most CAD systems.
2. Standardized geometric data transfer to and from Jack would be highly desirable, but the state of standards in geometric modeling still leaves some important gaps. For example, few CAD systems adequately model articulated objects. Currently, Jack imports models from several common CAD vendors through a less satisfactory scheme of explicit translators from external geometry files. This situation will change to a standard as soon as possible.
1.3.6 Use the Computer to Analyze Synthetic Behaviors
What would a real person do in a real environment? How can we get a virtual human to behave appropriately and report similar experiences? People are constantly managing multiple simultaneous constraints or tasks, for example, staying balanced while reaching to lift a box, or walking while carrying a cup of coffee. The essential parallelism of human motion demands an approach to behavior animation that does not just cope with parallelism but exploits it. We will be interested mostly in psychomotor and viewing behaviors; clearly auditory and cognitive tasks are worthy of inclusion but are not dealt with here.
1. Jack allows the user to specify, and the system maintains, multiple
simultaneous position and orientation goals. For instance, a typical
posture might involve reaching with two hands while looking at a target
and staying balanced. There are constraints on the positions of the
hands and the orientation of the eyes dictated by the task, and balance
constraints required by gravity. Additionally, we might want the torso
to remain upright or, if seated, the pelvis to tilt to a more relaxed
posture. There are too many possible human body configurations to
manage every combination by specialized rules; our approach is to use a
global solution technique, inverse kinematics, to satisfy the given
constraints subject to the inherent joint limits of the body.
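Jack's global inverse kinematics solver is a nonlinear optimization
over all constraints at once; as a much simpler sketch of the
underlying idea (reaching a position goal subject to joint limits),
here is a cyclic coordinate descent loop for a planar three-link
chain. The CCD choice and all names are ours, purely for illustration:

```python
import math

def fk(angles, lengths):
    """Forward kinematics: positions of all joints of a planar chain."""
    pts, x, y, a = [(0.0, 0.0)], 0.0, 0.0, 0.0
    for t, L in zip(angles, lengths):
        a += t
        x += L * math.cos(a)
        y += L * math.sin(a)
        pts.append((x, y))
    return pts

def ccd_ik(angles, lengths, goal, limits, iters=100):
    """Cyclic coordinate descent toward a position goal, clamped to joint limits."""
    angles = list(angles)
    for _ in range(iters):
        for i in reversed(range(len(angles))):
            pts = fk(angles, lengths)
            jx, jy = pts[i]
            ex, ey = pts[-1]
            # Swing joint i so the end effector rotates toward the goal.
            cur = math.atan2(ey - jy, ex - jx)
            tgt = math.atan2(goal[1] - jy, goal[0] - jx)
            delta = math.atan2(math.sin(tgt - cur), math.cos(tgt - cur))
            lo, hi = limits[i]
            angles[i] = max(lo, min(hi, angles[i] + delta))
    return angles

lengths = [1.0, 1.0, 0.5]
limits = [(-math.pi, math.pi)] * 3
sol = ccd_ik([0.3, 0.2, 0.1], lengths, (1.2, 1.0), limits)
print(fk(sol, lengths)[-1])
```

Tightening the joint limits shows the same trade-off noted above: the
solver satisfies the goal only as far as the limits allow.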
2. While in a posture, the strength requirements of the figure may be
displayed on screen. This interactive strength data display shows all
torque loads along any selected chain of body segments. If a load is
attached to an end-effector, for example, it permits easy visualization
of the distribution of that additional weight on the body.
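The torque loads shown by the strength display can be sketched with
elementary statics: a force F applied at a load point produces a torque
r x F about each joint, where r runs from the joint to the load point.
A minimal planar illustration (our own, not Jack's code):

```python
def joint_torques(joint_positions, load_point, force):
    """Static torque about each joint from a force applied at one point.
    Planar case: torque is the scalar z-component of (r x F)."""
    fx, fy = force
    torques = []
    for (jx, jy) in joint_positions:
        rx, ry = load_point[0] - jx, load_point[1] - jy
        torques.append(rx * fy - ry * fx)
    return torques

# A 10 N weight hanging from a hand at (0.6, 1.2), viewed along a
# chain of shoulder and elbow joints (coordinates invented).
chain = [(0.0, 1.4), (0.3, 1.3)]
print(joint_torques(chain, (0.6, 1.2), (0.0, -10.0)))
```

Joints farther from the load carry larger torques, which is exactly the
distribution the display makes visible.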
3. What the figure is looking at is often a critical question in human
factors analysis. In Jack, the direction of eye gaze is controlled
through constraints to some environmental location or object site. If
that site moves, the gaze will follow. Since eye movement will affect
head orientation, the effect of gaze direction can propagate (because
of joint limits) to the neck and torso and hence influence overall body
posture.
4. In Jack, the user can see what the figure sees from an internal or
external perspective. Internally, a separate graphics window may be
opened which shows the view from the selected eye. The image appears
naturally shaded and it moves as the figure's gaze is adjusted by
direct manipulation or constraints. Concurrent changes to the
environment and visible parts of the figure's own "self" are displayed
in the view window. If field of view is critical, a retinal projection
may be used, where a polar projection displays workplace features based
on their angle from the fovea. Although the image thereby appears
distorted, the actual field of view may be superimposed to assess the
range of foveal or peripheral perception. For the external perspective,
a selected field of view is displayed as a translucent pair of view
cones, one for each eye. The cones move with the eyes. Objects in view
are shadowed by the translucent cones. Any overlapped region is clearly
in the area of binocular vision.
5. It is not yet feasible to do real-time collision detection between
all the moving objects in a complex environment. By various
simplifications, however, sufficient capabilities may be presented. On
fast displays with real-time viewing rotation, such as the Silicon
Graphics workstations, the ability to rapidly change the view means
that the user can quickly move about to check clearances. Another
method used in Jack is to optionally project three orthogonal views of
a figure and other selected objects onto back, side, and bottom planes.
These three additional views give simultaneous contact and interference
information and are used during interactive manipulations. Often, users
will work with shaded images and detect collisions by noting when one
object visually passes into another. The alternative to visual
collision detection is direct object-object interference computation.
Usually limited to a selected body segment and a convex object, this
method is slower but guarantees to detect a collision even if it would
be difficult to see.
6. Sometimes it is valuable to visualize the entire reachable space of
an end-effector of the figure. Empirical data are sometimes available
in the literature [NAS78, NAS87], but they are collected on specific
subjects and do not readily extend to figures with differing
anthropometry or joint limits. By constructing the reach space for a
given figure in a given posture as a geometric object, the reach space
may be viewed and objects may be readily classified as in or out of the
reach space.
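One simple way to construct such a reach-space object is to sweep the
joint angles through their limits and collect the end-effector
positions; the resulting point cloud (or a surface fitted to it) then
supports in/out queries. A rough planar sketch, with link lengths and
joint limits invented for illustration:

```python
import math
from itertools import product

def reach_cloud(lengths, limits, steps=41):
    """Sample end-effector positions over a grid of joint angles."""
    grids = [[lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
             for lo, hi in limits]
    pts = []
    for combo in product(*grids):
        x = y = a = 0.0
        for t, L in zip(combo, lengths):
            a += t
            x += L * math.cos(a)
            y += L * math.sin(a)
        pts.append((x, y))
    return pts

# Planar two-link "arm" with invented lengths and joint limits.
cloud = reach_cloud([0.3, 0.25], [(-1.5, 1.5), (0.0, 2.5)])
# Classify a target by the distance to the nearest sampled point.
target = (0.4, 0.2)
nearest = min(math.hypot(x - target[0], y - target[1]) for x, y in cloud)
print("reachable" if nearest < 0.05 else "out of reach")
```

A production system would build a polygonal hull from such samples so
the classification is exact rather than nearest-neighbor.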
1.3.7 An Interactive Software Tool must be Designed for Usability
A system to build, move, and analyze virtual humans should be usable by
mere mortals with a modest training period. Isolating the potential
user community by requiring unusual artistic skills would be
counter-productive to our wider purposes of aiding design engineers.
Existing interaction paradigms such as pop-up menus or command line
completions should be followed when they are the most efficacious for a
particular task, but new techniques will be needed to manage and
control three-dimensional articulated structures with standard
graphical input tools. The interface should be simple yet powerful,
comprehensive but easy to learn.
1. The Jack user interface is designed for fast response to multiple
constraint situations. Real-time end-effector interactive dragging
through arbitrary-length joint chains means that the user can watch the
figure respond to reach or movement tasks. The paradigm of manipulating
one joint angle at a time is possible, but almost useless. The
goal-directed behaviors provide an enormous benefit to the user in
allowing the specification of what is to be done while constraint
satisfaction handles the how through positioning interdependencies of
the entire body structure. Dragging also permits quick experimentation
with and manual optimization of postures.
2. By taking advantage of the Silicon Graphics display hardware, Jack
shows the user shaded or wireframe displays during interaction for
natural images and easy real-time visualization.
3. For the highest quality images, Jack provides its own multi-featured
ray-tracing and radiosity programs.
4. Besides direct user input, Jack may be controlled through scripted
commands in the Jack command language built in the course of
interactive manipulation. This saves time and trouble in setting up
complex situations, establishing a body posture, or trying a series of
actions.
5. Jack also allows external control through operating system "sockets"
to other programs, simulations, or real sensors, providing hooks into
external data sources for virtual environments or networked remote
systems sharing a common virtual dataspace. Jack itself can share an
environment with other Jacks on the network. This could be used for
novel cooperative workgroup applications.
6. The standard Jack user interface consists of just a three-button
mouse and keyboard. It is simple to learn, and no special hardware
devices are required unless a virtual environment setup is desired. The
interaction paradigms in Jack include menu-driven or typed commands,
on-line help, and command completion. It is easy to use after only a
day or so of training.
7. The direct manipulation interface into three dimensions implemented
in Jack is both highly efficient and natural to use. Depending only on
the mouse and keyboard, it offers a friendly, kinesthetic
correspondence between manually comfortable hand motions, on-screen
displays, and three-dimensional consequences. Three-dimensional
cursors, rotation wheels, and orthogonal projections of the principal
view provide excellent visual feedback to the user.
8. Jack provides multiple windows with independent camera views for
complex analyses, multiple points of view, and internal and external
eye views.
1.4 Manipulation, Animation, and Simulation
There are important distinctions between manipulation, animation, and
simulation. Geometric manipulation is the process of interactive scene
composition, or the interactive specification of positions and postures
for geometric figures, usually on a trial and error basis. Manipulation
usually involves movement of the figures, but the movement serves to
assist in the control process and is generally not worth saving as a
memorable motion sequence. Rather, the purpose of manipulation is to
get the figures into a desired static posture, although the posture
need not remain static afterwards. Manipulation is inherently
real-time: objects move as a direct response to the actions of the
user. In short, interactive manipulation is not necessarily
choreography.
Animation, on the other hand, is choreography. In computer animation,
the goal is to describe motion, and the animator usually imagines the
desired motion before beginning the animation process. Of course,
experimentation may lead to revisions, like an illustrator who erases
lines in a drawing, but the computer does not serve so much to answer
questions as to obey orders. Animators measure the success of a
computer animation system in terms of how well it serves as a medium
for expressing ideas.
Simulation is automated animation, and the concern is again with
motion. The system generates the motion based on some kind of input
from the user ahead of time. The input usually consists of objectives
and rules for making decisions, and it is generally less specific than
with animation. The user knows less about what motion should result.
The job of the simulator is to predict what would happen under certain
circumstances and inform the user of the results. Sometimes simulation
can generate animation, as in the case of the animation of physics and
natural phenomena. Simulation of human figures generally implies some
modeling of human capabilities to deliberately off-load some of the
low-level positioning overhead from the animator.

Figure 1.1: Is it the Motion or the Posture?
Animation and simulation have been studied extensively, but
manipulation of articulated figures has not received the attention it
deserves. Volumes of research discuss animation techniques and
simulation algorithms, but most research directed at interactive
manipulation deals either with the low-level input mechanisms of
describing 3D translations and rotations, or with the numerical issues
of real-time dynamics. For example, consider the task of bending a
figure over to touch its toes (Fig. 1.1). Is the bending motion
important, or is it just the final posture that is critical? In
animation, it's the motion: the motion must look realistic. In
simulation, the motion must be realistic. In manipulation, the finer
points of the posture are critical. Is the figure balanced? Are the
knees bent? Where is the head pointed? How are the feet oriented? The
motion through which the manipulation system positions the figure is
not important in itself. It serves only to assist the user in arriving
at the posture.
1.5 What Did We Leave Out?
It is only fair that in this exposition we are clear about what our
existing software does not do. There are choices to be made in any
implementation, but the vastness of the human performance problem
demands scope boundaries as well. There are fascinating problems
remaining that we have not touched. Some of these problems are being
examined now, but the early results are too preliminary to report. The
activity in this field is amazing, and there will surely be advances in
modeling, animation, and performance simulation reported each year.
A glance at the Engineering Data Compendium [BL88], for example, will
quickly show how much information has been collected on human factors
and simultaneously how little is available interactively on a computer.
But even without much thought, we can place some bounds on this study.
We have ignored auditory information processing, environmental factors
such as temperature and humidity, vibration sensitivity, and so on.
While critical for harsh environments, we seek useful approximations to
first order geometric problems to see if a figure can do the task in
the absence of external signals, distress, or threats. The probable
degradation of task performance may be established from intelligent
manual search through existing publications. While it would be
attractive to include such data, we have not begun to acquire or use it
yet.
We have bypassed super-accurate skin models because the enfleshment of
a figure is so dependent on individual physiology, muscle tone,
muscle/fat ratio, and gender. For the kinds of analyses done with whole
body models, errors of a centimeter or so in skin surfaces are
subordinate to anthropometric errors in locating true joint centers and
joint geometry.
We have avoided injury assessment, as that is another whole field
developed from anatomy, crash studies, or national exertion standards.
Of course, Jack could be used in conjunction with such systems, but we
have not tried to connect them yet.
Because we have taken a generalized view of human factors and
restricted analyses to reach, fit, view, and strength, we have
necessarily avoided any workplace-specific performance data. For
example, action timings will be most accurate when measured with real
subjects in a real environment. We have concentrated on measurements in
environments prior to physical construction to save cost and possible
personnel dangers.
We have only touched on perceptual, reactive, and cognitive issues.
Many workers are engaged in control-theoretic activities: sensing the
environment and reacting to maintain some desired state. We see this
capability eventually driving the figure through the simulation
interface, but at present we have not modeled such situations. Instead
we are investigating natural language instructions which generate an
animation of the simulated agent's "artificial intelligence"
understanding of the situation.
Finally, we are leaving issues of learning for later study. There is
much to be said for an agent who not only follows instructions but who
also learns how to do similar but not identical actions in comparable
situations in the future. Learning might first occur at the psychomotor
level, for example, to figure out the best way to lift a heavy weight.
Later, learning can extend to the task level: to repeat a variant of a
previously successful plan when the overall goals are encountered
again.
These issues are exciting, but we need to start with the basics and
describe what exists today.
Chapter 2
Body Modeling
In order to manipulate and animate a human figure with computer
graphics, a suitable figure must be modeled. This entails constructing
a satisfactory surface skin for the overall human body shape, defining
a skeletal structure which admits proper joint motions, adding clothing
to improve the verisimilitude of analyses (as well as providing an
appropriate measure of modesty), sizing body dimensions according to
some target individual or population, and providing visualization tools
to show physically-relevant body attributes such as torque loads and
strength.
2.1 Geometric Body Modeling
In computer graphics, the designer gets a wide choice of
representations for the surfaces or volumes of objects. We will briefly
review current geometric modeling schemes with an emphasis on their
relevance to human figures. We classify geometric models into two broad
categories: boundary schemes and volumetric schemes. In a boundary
representation the surface of the object is approximated by or
partitioned into non-overlapping 0-, 1-, or 2-dimensional primitives.
We will examine in turn those representations relevant to human
modeling: points and lines, polygons, and curved surface patches. In a
volumetric representation the 3D volume of the object is decomposed
into possibly overlapping primitive volumes. Under volumetric schemes
we discuss voxels, constructive solid geometry, ellipsoids, cylinders,
spheres, and potential functions.
2.1.1 Surface and Boundary Models
The simplest surface model is just a collection of 3D points or lines.
Surfaces represented by points require a fairly dense distribution of
points for accurate modeling. Clouds of points with depth shading were
used until the early 1980's for human models on vector graphics
displays. They took advantage of the display's speed and hierarchical
transformations to produce the perceptual depth effect triggered by
moving points [Joh76]; see, for example, [GM86].
A related technique to retain display speed while offering more shape
information is to use parallel rings or strips of points. This
technique is used in LifeForms [Lif91, Cal91] (LifeForms is a
registered trademark of Kinetic Effects, Inc.). Artistically positioned
"sketch lines" were used in one of the earliest human figure models
[Fet82] and subsequently in a Mick Jagger music video, "Hard Woman"
from Digital Productions.
Polygons
Polygonal (polyhedral) models are one of the most commonly encountered
representations in computer graphics. The models are defined as
networks of polygons forming 3D polyhedra. Each polygon primitive
consists of some connected vertex, edge, and face structure. The
polygons are sized, shaped, and positioned so that they completely tile
the required surface at some resolution. Polygon models are relatively
simple to define, manipulate, and display. They are the most common
models processed by workstation hardware and commercial graphics
software. In general, polygons are best at modeling objects meant to
have flat surfaces, though with a large enough number of polygons quite
intricate and complex objects can be represented. "Large enough" may
mean hundreds of thousands of polygons!
All viable interactive human figure models are done with polygons,
primarily because the polygon is the primitive easiest to manage for a
modern workstation such as the Silicon Graphics machines. With
real-time smooth shading, polygon models of moderate complexity
(several hundred polygons) can look acceptably human-like; accurate
skin models require thousands. A realistic face alone may require two
or three thousand polygons if it is to be animated and if the polygons
are the only source of detail.
Polygon models are too numerous to cite extensively. Ones with an
interesting level of detail have been used in Mannequin software from
Biomechanics Corporation of America [Pot91], various movies such as
"Tony de Peltrie" from the University of Montreal [Emm85], and detailed
synthetic actor models of Marilyn Monroe and Humphrey Bogart from
Daniel Thalmann and Nadia Magnenat-Thalmann [MTT90, MTT91b]. They even
model details of skin deformation by applying physical forces to
polygon meshes [GMTT89, MTT91b].
The models used in Jack are polygonal, with two different levels of
detail. The normal models have a few hundred polygons. More accurate
models obtained from actual scans of real bodies have several thousand
polygons. See Section 2.1.3.
Curved Surfaces
Since polygons are good at representing flat surfaces, considerable
effort has been expended determining mathematical formulations for true
curved surfaces. Most curved surface object models are formed by one or
more parametric functions of two variables (bivariate functions). Each
curved surface is called a patch; patches may be joined along their
boundary edges into more complex surfaces. Usually patches are defined
by low order polynomials (typically cubics), giving the patch easily
computed mathematical properties such as well-defined surface normals
and tangents, and computable continuity conditions between
edge-adjacent patches. The shape of a patch is derived from control
points or tangent vectors; there are both approximating and
interpolating types. The former take the approximate shape of the
control vertices; the latter must pass through them. There are numerous
formulations of curved surfaces, including: Bezier, Hermite, bi-cubic,
B-spline, Beta-spline, and rational polynomial [Far88, BBB87].
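As a concrete instance of one of these formulations, a bicubic Bezier
patch maps (u, v) in the unit square through a 4 x 4 grid of control
points, and, being an approximating type, stays within the convex hull
of those points. A small sketch using de Casteljau evaluation
(illustrative only; the control grid is invented):

```python
def de_casteljau(points, t):
    """Evaluate a Bezier curve at t by repeated linear interpolation."""
    pts = [p[:] for p in points]
    while len(pts) > 1:
        pts = [[a + t * (b - a) for a, b in zip(p, q)]
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def bezier_patch(ctrl, u, v):
    """Bicubic patch: evaluate each row at u, then the resulting curve at v."""
    return de_casteljau([de_casteljau(row, u) for row in ctrl], v)

# A flat 4x4 control grid with one raised interior control point.
ctrl = [[[i / 3.0, j / 3.0, 1.0 if (i, j) == (1, 1) else 0.0]
         for j in range(4)] for i in range(4)]
print(bezier_patch(ctrl, 0.5, 0.5))
```

Note the approximating behavior: the surface at the center rises well
below the raised control point's height of 1.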
Various human figure models have been constructed from curved patches,
but display algorithm constraints make these figures awkward for
real-time manipulation. They are excellent for animation, provided that
sufficient care is taken to model joint connections. This is a good
example of where increased realism in the body segments demands
additional effort in smoothing and bending joint areas properly. Curved
surface models were used in a gymnastic piece [NHK86] and the Academy
Award-winning "Tin Toy" [GP88].
2.1.2 Volume and CSG Models
The volume and CSG models divide the world into three-dimensional
chunks. The models may be composed of non-intersecting elements within
a spatial partition, such as voxels or oct-trees, or created from
possibly overlapping combinations of inherently 3D primitive volumes.
Voxel Models
The first volumetric model we examine is the voxel model. Here space is
completely filled by a tessellation of cubes or parallelepipeds called
voxels (volume elements). Usually there is a density or other numerical
value associated with each voxel. Storing a high resolution
tessellation is expensive in space but simple in data structure (just a
large 3D array of values). Usually some storage optimization schemes
are required for detailed work (1K x 1K x 1K spaces). Special
techniques are needed to compute surface normals and shading to
suppress the boxiness of the raw voxel primitive. Voxel data is
commonly obtained in the medical domain; it is highly regarded for
diagnostic purposes as the 3D model does not speculate on additional
data (say by surface fitting) nor suppress any of the original data,
however convoluted.
Voxel models are the basis for much of the scientific visualization
work in biomedical imaging [FLP89]. The possible detail for human
models is only limited by the resolution of the sensor. Accurate bone
joint shapes may be visualized, as well as the details of internal and
external physiological features. These methods have not yet found
direct application in the human factors domain, since biomechanical
rather than anatomical issues are usually addressed. Real-time display
of voxel images is also difficult, requiring either low resolution
image sets or special hardware [GRB+85].
Constructive Solid Geometry
One of the most efficient and powerful modeling techniques is
constructive solid geometry (CSG). Unlike the voxel models, there is no
requirement to regularly tessellate the entire space. Moreover, the
primitive objects are not limited to uniform cubes; rather there are
any number of simple primitives such as cube, sphere, cylinder, cone,
half-space, etc. Each primitive is transformed or deformed and
positioned in space. Combinations of primitives or of previously
combined objects are created by the Boolean operations. An object
therefore exists as a tree structure which is "evaluated" during
rendering or measurement.
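The tree "evaluation" mentioned above can be illustrated with
point-membership classification: leaf primitives answer whether a point
is inside, and Boolean nodes combine the answers. A toy sketch, not a
production CSG engine:

```python
import math

# Primitives are predicates on points; Boolean nodes combine children.
def sphere(cx, cy, cz, r):
    return lambda p: math.dist(p, (cx, cy, cz)) <= r

def box(lo, hi):
    return lambda p: all(l <= c <= h for c, l, h in zip(p, lo, hi))

def union(a, b):
    return lambda p: a(p) or b(p)

def intersect(a, b):
    return lambda p: a(p) and b(p)

def subtract(a, b):
    return lambda p: a(p) and not b(p)

# A unit cube with a spherical bite taken out of one corner.
solid = subtract(box((0, 0, 0), (1, 1, 1)), sphere(1, 1, 1, 0.5))
print(solid((0.1, 0.1, 0.1)), solid((0.9, 0.9, 0.9)))
```

Rendering or measuring the object amounts to asking such questions at
many points or along many rays, which is why ray-tracing CSG directly
is slow.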
CSG has been used to great advantage in modeling machined parts, but
has not been seriously used for human body modeling. Besides the
mechanical look created, real-time display is not possible unless the
CSG primitives are polygonized into surfaces. When the set of
primitives is restricted in one way or other, however, some useful or
interesting human models have been built.
Single Primitive Systems
The generality of the constructive solid geometry method (with its
multiplicity of primitive objects and its expensive and slow
ray-tracing display method) is frequently reduced to gain efficiency in
model construction, avoid Boolean combinations other than union, and
increase display speed. The idea is to restrict primitives to one type,
then design manipulation and display algorithms to take advantage of
the uniformity of the representation. Voxels might be considered such a
special case, where the primitives are all coordinate axis aligned and
integrally positioned cubes. Other schemes are possible, for example,
using ellipsoids, cylinders, superquadrics, or spheres.
Ellipsoids have been used to model cartoon-like figures [HE78, HE82].
They are good for elongated, symmetric, rounded objects. Unfortunately,
the shaded display algorithm is nearly the same as the general
ray-tracing process.
Cylinders have also been used to model elongated, symmetric objects.
Elliptic cylinders were used in an early human modeling system [Wil82].
These primitives suffer from joint connection problems and rather poor
representations of actual body segment cross-sections.
Superquadrics are a mathematical generalization of spheres which
include an interesting class of shapes within a single framework:
spheres, ellipsoids, and objects which arbitrarily closely look like
prisms, cylinders, and stars. Simple parameters control the shape so
that deformations through members of the class are simple and natural.
Superquadrics are primarily used to model man-made objects, but when
overlapped can give the appearance of faces and figures [Pen86].
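The "simple parameters" are two shape exponents. The standard
superquadric (superellipsoid) inside-outside function generalizes the
ellipsoid equation, with the surface at the level f = 1; the radii a
and exponents e1, e2 below follow the usual convention, but the code is
only an illustration:

```python
def superquadric_f(p, a=(1.0, 1.0, 1.0), e1=1.0, e2=1.0):
    """Inside-outside function: < 1 inside, == 1 on the surface,
    > 1 outside. e1 = e2 = 1 gives an ellipsoid; small exponents
    approach a box shape, large ones a star-like pinch."""
    x, y, z = (abs(c) / r for c, r in zip(p, a))
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)

# Unit sphere case: a point on the x axis lies exactly on the surface.
print(superquadric_f((1.0, 0.0, 0.0)))
# Boxy case: a near-corner point is inside the "rounded cube".
print(superquadric_f((0.9, 0.9, 0.9), e1=0.1, e2=0.1) < 1.0)
```

Sliding e1 and e2 continuously is what makes deformations through the
class "simple and natural" as noted above.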
Spheres as a single primitive form an intriguing class. Spheres have a
simplicity of geometry that rivals that of simple points: just add a
radius. There are two methods of rendering spheres. Normally they are
drawn as regular 3D objects. A human modeled this way tends to look
like a large bumpy molecule. Alternatively, spheres may be treated like
"scales" on the modeled object; in this case a sphere is rendered as a
flat shaded disk. With sufficient density of overlapping spheres, the
result is a smoothly shaded solid which models curved volumes rather
well. A naturalistic human figure was done this way in our earlier
TEMPUS system [BB78, BKK+85, SEL84]. We stopped using this method as we
could not adequately control the sphere/disk overlaps during animation
and newer workstation display technology favored polygons.
Potential Functions
An interesting generalization of spheres which solves some major
modeling problems is to consider the volume as a potential function
with a center and a field function that decreases monotonically (by an
exponential or polynomial function) from the center outward. There is
no "radius" or size of the potential function; rather, the size or
surface is determined by setting a threshold value for the field. What
makes this more interesting is that potential functions act like energy
sources: adjacent potential functions have overlapping fields and the
resultant value at a point in space is in fact the sum of the fields
active at that point. Thus adjacent fields blend smoothly, unlike the
"creases" that are obtained with fixed radius spheres [Bli82].
Recently, directional dependence and selective field summation across
models have been added to create "soft" models that blend with
themselves but not with other modeled objects in the environment
[WMW86, NHK+85]. Potential functions were originally used to model
molecules, since atoms exhibit exactly this form of field behavior, but
the models have an amazing naturalistic "look" and have been used to
great effect in modeling organic forms including human and animal
figures [NHK+85, BS91]. The principal disadvantages to potential
functions lie in properly generating the numerous overlapping functions
and very slow display times. They remain an interesting possibility for
highly realistic models in the future.
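The field-summation idea can be sketched briefly: each source
contributes a monotonically decreasing field, the fields are summed,
and the surface is the level set at a chosen threshold. The polynomial
falloff below is our own stand-in for the falloff functions used in the
cited systems:

```python
import math

def field(p, center, radius):
    """Polynomial falloff: 1 at the center, 0 at distance >= radius."""
    d = math.dist(p, center) / radius
    return (1.0 - d * d) ** 2 if d < 1.0 else 0.0

def total_field(p, sources):
    """Sum the fields of all sources active at p."""
    return sum(field(p, c, r) for c, r in sources)

# Two overlapping blobs; the threshold picks out the blended surface.
sources = [((0.0, 0.0, 0.0), 1.0), ((1.0, 0.0, 0.0), 1.0)]
threshold = 0.25
midpoint = (0.5, 0.0, 0.0)
print(total_field(midpoint, sources) >= threshold)
```

Because the fields add, the midpoint between the two blobs stays above
threshold and the shapes merge smoothly instead of creasing.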
2.1.3 The Principal Body Models Used
The default polyhedral human figure in Jack, credited to Pei-Hwa Ho, is
composed of 69 segments, 68 joints, 136 DOFs, and 1183 polygons
(including cap and glasses). The appearance is a compromise between
realism and display speed. No one is likely to mistake the figure for a
real person; on the other hand, the movements and speed of control are
good enough to convey a suitably responsive attitude. The stylized
face, hat, and glasses lend a bit of character and actually assist in
the perception of the forward-facing direction.
For more accurate human bodies, we have adapted a database of actual
body scans of 89 subjects (31 males and 58 females) supplied by
Kathleen Robinette of Wright-Patterson Air Force Base and used with her
permission. The original data came in contours, that is, slices of the
body in the transverse plane [GQO+89]. Each body segment was supplied
as a separate set of contours. A polygon tiling program was used to
transform the contours of each body segment into a surface
representation.
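A contour tiler of the kind mentioned can be sketched by walking two
adjacent contours in step and bridging them with triangles. Real tilers
must handle rings with different point counts and branching anatomy;
this sketch assumes equal-length rings:

```python
def tile_contours(ring_a, ring_b):
    """Bridge two closed contours (equal point counts) with triangles.
    Each quad between matching edges is split into two triangles."""
    n = len(ring_a)
    tris = []
    for i in range(n):
        j = (i + 1) % n
        tris.append((ring_a[i], ring_a[j], ring_b[i]))
        tris.append((ring_b[i], ring_a[j], ring_b[j]))
    return tris

# Two square cross-sections, one unit apart along z.
ring0 = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
ring1 = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
print(len(tile_contours(ring0, ring1)))
```

Applied slice by slice up a body segment, this produces the closed
polygonal surface used in place of the raw contour stack.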
In order to represent the human body as an articulated figure we needed
to first compute the proper joint centers to connect the segments
together. Joint center locations were computed through the coordinates
of anthropometric landmarks provided with the contour data.
Real humans are not symmetrical around the sagittal plane: our left
half is not identical to our right half. This was the case with the
contour data. For consistency, rather than accuracy, we used the right
half of the body data to construct a left half and then put them
together. We also sliced the upper torso to take advantage of the
seventeen segment spine model (Section 2.3). The resulting human body
model has thirty-nine segments and about 18,700 polygons (compared to
the 290 slices in the original data).
2.2 Representing Articulated Figures
Underneath the skin of a human body model is a representation of the
skeleton. This skeletal representation serves to define the moving
parts of the figure. Although it is possible to model each of the bones
in the human body and encode in the model how they move relative to
each other, for most types of geometric analyses it is sufficient to
model the body segments in terms of their lengths and dimensions, and
the joints in terms of simple rotations. There are some more complex
joint groups such as the shoulder and spine where inherent dependencies
across several joints require more careful and sophisticated modeling.
The increasing interest in recent years in object-oriented systems is
largely due to the realization that the design of a system must begin
with a deep understanding of the objects it manipulates. This seems
particularly true in a geometric modeling system, where the word
"object" takes on many of its less abstract connotations. It has long
been an adage in the user interface software community that a system
with a poorly designed basic structure cannot be repaired by improving
the interface, and likewise that a well designed system lends itself
easily to an elegant interface.
This section describes Peabody, which represents articulated figures
composed of segments connected by joints. The Peabody data structure
has a companion language and an interactive interface in Jack for
specifying and creating articulated figures. The data structure itself
maintains geometric information about segment dimensions and joint
angles, but it also provides a highly efficient mechanism for
computing, storing, and accessing various kinds of geometric
information. One of the principal tasks requested of Peabody is to map
segment dimensions and joint angles into global coordinates for end
effectors.
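That mapping is a forward-kinematics traversal: walk the figure tree,
composing each joint's local transform onto its parent's global
transform. A minimal planar sketch (the Segment/traverse names are
ours, not Peabody's):

```python
import math

class Segment:
    """A segment with a joint angle relative to its parent and a fixed
    length along its own local x axis."""
    def __init__(self, name, length=0.0, angle=0.0, children=()):
        self.name, self.length, self.angle = name, length, angle
        self.children = list(children)

def traverse(seg, base=(0.0, 0.0, 0.0), out=None):
    """Compose transforms down the tree.
    Returns {name: global (x, y, heading)} for every segment."""
    out = {} if out is None else out
    bx, by, bh = base
    h = bh + seg.angle
    x = bx + seg.length * math.cos(h)
    y = by + seg.length * math.sin(h)
    out[seg.name] = (x, y, h)
    for child in seg.children:
        traverse(child, (x, y, h), out)
    return out

# A tiny "arm": shoulder raised 90 degrees, elbow bent back 90 degrees.
arm = Segment("shoulder", 0.0, math.pi / 2,
              [Segment("elbow", 1.0, -math.pi / 2,
                       [Segment("wrist", 1.0)])])
coords = traverse(arm)
print(coords["wrist"])
```

A full system would use 4 x 4 homogeneous matrices and three rotational
degrees of freedom per joint, but the tree-composition pattern is the
same.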
Peabody was designed with several criteria in mind:
- It should be general purpose. It should be able to represent many
types of figures of tree-structured topology. It should not be hard
coded to represent a specific type of figure, such as a human figure or
a particular robot manipulator.
- It should have a well developed notion of articulation. Rather than
concentrating on representations for primitive geometric shapes,
Peabody addresses how such shapes can be connected together and how
they behave relative to each other.
- It should represent tree-structured objects through a hierarchy. The
inverse kinematics positioning algorithm can calculate and maintain the
information necessary to simulate closed loops.
- It should be easy to use. The external user view of the figures
should be logical, clear, and easy to understand. Understanding the
figures should not require any knowledge of the internal
implementation, and it should not require any advanced knowledge of
robotics or mechanics.
2.2.1 Background

Kinematic Notations in Robotics

The most common kinematic representation in robotics is the notation of Denavit and Hartenberg [Pau81, DH55]. This representation derives a set of parameters for describing a linkage based on measurements between the axes of a robot manipulator. The notation defines four parameters that measure the offset between subsequent coordinate frames embedded in the links, or segments: (1) the angle of rotation for a rotational joint, or distance of translation for a prismatic joint; (2) the length of the link, or the distance between the axes at each end of a link along the common normal; (3) the lateral offset of the link, or the distance along the length of the axis between subsequent common normals; and (4) the twist of the link, or the angle between neighboring axes. The notation prescribes a formal procedure for assigning the coordinate systems to the links in a unique way.
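The four parameters combine into a single homogeneous transform between successive link frames. A minimal sketch, using the common textbook convention (which may differ in detail from the formulation in [Pau81]):

```python
import math

def dh_transform(theta, d, a, alpha):
    """4x4 homogeneous transform between successive link frames,
    built from the four Denavit-Hartenberg parameters:
    theta -- joint angle (rotation about the previous z axis)
    d     -- link offset (translation along the previous z axis)
    a     -- link length (translation along the common normal)
    alpha -- link twist (rotation about the new x axis)
    """
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ]
```

Chaining one such transform per link, from the base outward, yields the end-effector frame of the manipulator.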
The objective behind these kinematic notations in robotics is to develop a standard representation that all researchers can use in the analysis and description of manipulators. There are several types of manipulators that are extremely common in the robotics research community. The adoption of a standard representation would greatly simplify the process of analyzing and implementing robotics algorithms, since so many algorithms are described in the literature using these manipulators.
Animation Systems

Computer graphics and animation literature seldom addresses syntactic, or even semantic, issues in representations for mechanisms, except as background for some other discussion of an animation technique or system.

Most interactive animation systems such as GRAMPS [OO81], TWIXT [Gom84], and BBOP [Stu84, Ste83], as well as commercial animation packages such as Alias [Ali90] and Wavefront [Wav89], only provide a mechanism of attaching one object to another. In this way, the user can construct hierarchies. When the user manipulates one object, its child objects follow, but there is no real notion of articulation. The attachments simply state that the origin of the child object is relative to the origin of the parent.
Many animation systems are non-interactive and are based on scripts that provide a hierarchy only through a programming language interface. Examples of such systems are ANIMA-II [Hac77], ASAS [Rey82], and MIRA-3D [MTT85]. In this kind of system, the hierarchy is hard-coded into the script, possibly through an interaction loop. A hierarchy designed in this way is very limited, except in the hands of a talented programmer/animator who can write into the animation a notion of behavior.
Physically-Based Modeling Systems

Physically based modeling systems such as those of Witkin, Fleischer, and Barr [WFB87] and Barzel and Barr [BB88] view the world as objects and constraints. Constraints connect objects together through desired geometric relationships or keep them in place. Otherwise, the objects float in space under the appropriate laws of physics. There is no notion of articulation other than constraints. This shifts the burden of maintaining object positions entirely to the algorithms that do the positioning. For simple objects, this is conceptually pleasing, although for complex objects it is computationally difficult. If a system represents a joint like the elbow as a constraint, the constraint must have a very high weighting factor in order to ensure that it never separates, requiring very small time steps in the simulation. This may also complicate the user's view of objects such as robots or human figures, which are inherently articulated. We believe it is important to differentiate the relationship between body segments at the elbow from the relationship between a hand and a steering wheel.
2.2.2 The Terminology of Peabody

Peabody uses the term environment to refer to the entire world of geometric objects. The environment consists of individual figures, each of which is a collection of segments. The segments are the basic building blocks of the environment. Each segment has a geometry. It represents a single physical object or part, which has shape and mass but no movable components. The geometry of each segment is represented by a psurf, which is generally a polyhedron or a polygonal mesh but can be of a more general nature.

The term figure applies not only to articulated, jointed figures such as a human body: any single "object" is a figure. It need not have moving parts. A figure may have only a single segment, such as a coffee cup, or it may be composed of several segments connected by joints, such as a robot. Sometimes the term "object" denotes any part of the Peabody environment.
Joints connect segments through attachment frames called sites. A site is a local coordinate frame relative to the coordinate frame of its segment. Each segment can have several sites. Joints connect sites on different segments within the same figure. Sites need not lie on the surface of a segment. A site is a coordinate frame that has an orientation as well as a position. Each site has a location that is the homogeneous transform that describes its placement relative to the base coordinate frame of its segment.
Segments do not have specific dimensions, such as the length, offset, and twist of Denavit and Hartenberg notation, because the origin can lie anywhere on the segment. The locations of the axes of the joints that connect the segment are phrased in terms of this origin, rather than the other way around. The measurement of quantities such as length is complicated, because segments may have several joints connected to them, and none of these joints is designated in the definition as the "parent."
Joints may have several DOFs, which are rotational and translational axes. Each axis and its corresponding angle form a single rotation or translation, and the product of the transforms at the DOFs defines the transform across the joint, defining the placement of the sites, and thus the segments, that the joint connects.
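The ordered product of per-DOF transforms can be illustrated with a hypothetical two-DOF joint (a rotational z axis followed by a rotational x axis). Since rotations do not commute, reversing the order changes the transform across the joint:

```python
import math

def rot_x(t):
    """3x3 rotation about the x axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    """3x3 rotation about the z axis by angle t (radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# The transform across the hypothetical joint: z rotation, then x rotation.
across_joint = matmul(rot_z(0.5), rot_x(0.3))

# Concatenating the same DOF transforms in the opposite order gives a
# different result, which is why directionality must be well-defined.
other_order = matmul(rot_x(0.3), rot_z(0.5))
```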
The directionality of a joint is important because it defines the order in which the DOF transforms are concatenated. Because these transforms are not commutative, it is essential that the order is well-defined. This is an especially important feature of Peabody, since it is sometimes convenient to define the direction of the joint in a way different from the way the joint occurs in the figure hierarchy. An example of this is the human knee. Although it may be useful to structure the hierarchy of a human body with the root at the foot, it is also appealing to have the joints at both knees defined in the same manner.
2.2.3 The Peabody Hierarchy

Peabody avoids imposing a predefined hierarchy on the figures by encouraging the user to think of figures as collections of segments and joints, none with special importance. However, there must exist an underlying hierarchy because Peabody is not equipped to handle closed-loop mechanisms. Closed-loop structures are managed through the constraint satisfaction mechanism. The structure of the Peabody tree is defined by designating one site on the figure as the root. The root site roughly corresponds to the origin of the figure, and it provides a handle by which to specify the location of the figure. Viewing the figure as a tree, the root of the figure is the root of the tree. The root site of a figure may change from time to time, depending upon the desired behavior of the figure.
This means there are two representations for the hierarchy, one internal and one external. There are many advantages to having a dual representation of the hierarchy. First, it allows the hierarchy to be inverted on the fly. Most models have a natural order to their hierarchy, emanating from a logical origin, but this hierarchy and origin may or may not correspond to how a model is placed in the environment and used.
The choice of the figure root is particularly important to the inverse kinematics algorithm, since the algorithm operates on chains of joints within the figure. At least one point on the figure must remain fixed in space. Because the internal representation of the hierarchy is separate, the user maintains a consistent view of the transform across a joint, regardless of how the figure is rooted.
The example below illustrates the Peabody hierarchy. Each segment has its base coordinate frame in the middle and an arc leading to each of its sites. The transform along this arc is the site's location. Each site may have several joints branching out from it, connecting it downwards in the tree to sites on other segments.
figure table {
    segment leg {
        psurf = "leg.pss";
        attribute = plum;
        site base->location = trans(0.00cm,0.00cm,0.00cm);
        site top->location = trans(5.00cm,75.00cm,5.00cm);
    }
    segment leg0 {
        psurf = "leg.pss";
        attribute = springgreen;
        site base->location = trans(0.00cm,0.00cm,0.00cm);
        site top->location = trans(5.00cm,75.00cm,5.00cm);
    }
    segment leg1 {
        psurf = "leg.pss";
        attribute = darkslategray;
        site base->location = trans(0.00cm,0.00cm,0.00cm);
        site top->location = trans(5.00cm,75.00cm,5.00cm);
    }
    segment leg2 {
        psurf = "leg.pss";
        attribute = darkfirebrick;
        site base->location = trans(0.00cm,0.00cm,0.00cm);
        site top->location = trans(5.00cm,75.00cm,5.00cm);
    }
    segment top {
        psurf = "cube.pss" * scale(1.00,0.10,2.00);
        site base->location = trans(0.00cm,0.00cm,0.00cm);
        site leg0->location = trans(95.00cm,0.00cm,5.00cm);
        site leg2->location = trans(5.00cm,0.00cm,5.00cm);
        site leg3->location = trans(95.00cm,0.00cm,195.00cm);
        site leg4->location = trans(5.00cm,0.00cm,195.00cm);
    }
    joint leg1 {
        connect top.leg3 to leg1.top;
        type = Rz;
    }
    joint leg2 {
        connect top.leg2 to leg0.top;
        type = Rz;
    }
    joint leg3 {
        connect top.leg4 to leg2.top;
        type = Rz;
    }
    joint leg4 {
        connect top.leg0 to leg.top;
        type = Rz;
    }
    root = top.base;
    location = trans(0.00cm,75.00cm,0.00cm);
}
2.2.4 Computing Global Coordinate Transforms

The root site for the figure is the one at the top of the tree, and its global location is taken as given, that is, not dependent on any other element of the environment. The root, the site locations, and the joint displacements uniquely determine the global location of every site and segment in the tree in terms of a product of transforms from the root downward.

The computation of the coordinate transforms for each segment and site in the downward traversal of the tree requires inverting the site locations that connect the segment to other segments lower in the tree. It may also require inverting joint displacements if the joint is oriented upwards in the tree. Computationally, this is not expensive because the inverse of a homogeneous transform is easy to compute, through a transpose and a dot product.
2.2.5 Dependent Joints³

The human figure can be abstracted as an object which is to be instantiated into a certain pose by any specification of all the joint angles. While any pose can be represented by a set of joint angles, it is not always possible to supply a full and reasonable set of angles. Often, for example, there is a natural grouping of joints, such as the torso or shoulder mass, that typically work together. Arbitrary admissible joint angles for the joints in the group may not represent a legitimate posture: they are functionally dependent on each other.

³Jianmin Zhao.
Conceptually, these dependencies compromise the notion of the joint and joint angle and blur the boundary of the object definition. It seems that not all joints are created equal. Rather than have a system which tries to cope with every joint the same way, we take a more practical approach. We use a joint group concept in Peabody to accommodate joint dependency so that the relationship is coded into the object definition rather than the application program.
A joint group is a set of joints which are controlled as one entity. Internal joint angles are not visible outside of the group: they are driven by the group driver. The driver is nothing but a mapping from a number of parameters (counterparts of the joint angles of the independent joint) to the joint angles of its constituent joints. Those independent parameters will be called group angles. Similar to the joint angles of the independent joint, the group angles of the joint group are subject to linear constraints of the form

    Σ_{i=1}^{n} a_i θ_i ≤ b_i                (2.1)

where the θ's are group angles and n is the number of θ's, or the number of DOFs of the joint group. There may be many such constraints for each group.
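Constraints of the form (2.1) can be checked directly. A minimal sketch, in which the coefficient layout is illustrative rather than the actual Peabody representation:

```python
def satisfies_constraints(group_angles, constraints):
    """Check a set of linear constraints of the form of equation (2.1):
    for each constraint (a, b), require sum_i a[i]*theta[i] <= b,
    where theta are the group angles."""
    return all(
        sum(ai * theta for ai, theta in zip(a, group_angles)) <= b
        for a, b in constraints
    )

# Example: two group angles whose sum may not exceed 90 degrees, and
# whose first angle must stay non-negative (written as -theta_0 <= 0).
constraints = [([1.0, 1.0], 90.0), ([-1.0, 0.0], 0.0)]
```

A group driver would accept only group angles that satisfy all such constraints before mapping them onto the constituent joint angles.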
There can be many applications of the joint group. Forearm pronation and supination change the segment geometry, so one way to manage that within the geometry constraints of psurfs is to divide the forearm into a number of nearly cylindrical sub-segments. As the wrist moves, its rotation is transmitted to the forearm segments such that distal sub-segments rotate more than proximal ones. The segment adjacent to the elbow does not pronate or supinate at all. Fingers could be managed in a similar fashion by distributing the desired orientation of the fingertip over the three joints in the finger chain. The most interesting examples, though, involve the torso and the shoulder. We address these cases in the next sections.
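The distal-more-than-proximal distribution of wrist rotation can be sketched as a simple linear ramp over the sub-segments. The ramp itself is an illustrative assumption; the actual weighting used in Jack may differ:

```python
def distribute_twist(wrist_twist, n_subsegments):
    """Distribute a wrist pronation/supination angle (degrees) over
    nearly cylindrical forearm sub-segments, as a linear ramp:
    sub-segment 0 (adjacent to the elbow) does not twist at all, and
    the most distal sub-segment carries the full wrist twist.
    Assumes n_subsegments >= 2."""
    if n_subsegments < 2:
        return [0.0] * n_subsegments
    return [wrist_twist * i / (n_subsegments - 1)
            for i in range(n_subsegments)]
```

With four sub-segments and a 90-degree pronation, the sub-segments would twist by 0, 30, 60, and 90 degrees respectively.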
2.3 A Flexible Torso Model⁴

Human figure models have been studied in computer graphics almost since the introduction of the medium. Through the last dozen years or so, the structure, flexibility, and fidelity of human models have increased dramatically: from the wire-frame stick figure, through simple polyhedral models, to curved surfaces, and even finite element models. Computer graphics modelers have tried to maximize detail and realism while maintaining a reasonable overall display cost. The same issue pertains to control: improving motion realism requires a great number of DOFs in the body linkage, and such redundancy strains effective and intuitively useful control methods. We can either simplify control by simplifying the model, thereby risking unrealistic movements; or complicate control with a complex model and hope the resulting motions appear more natural. The recent history of computer animation of human figures is focused on the quest to move the technology from the former situation towards the latter, while simultaneously forcing the control complexity into algorithms rather than skilled manual manipulation.

⁴Gary Monheit.
This point of view motivates our efforts in human figure modeling and animation, as well as those of several other groups. Though notable algorithms for greater animation power have addressed kinematics, dynamics, inverse kinematics, available torque, global optimization, locomotion, deformation, and gestural and directional control, the human models themselves tended to be rather simplified versions of real human flexibility. In the early 1980s we warned that increased realism in the models would demand ever more accurate and complicated motion control; now that the control regimes are improving, we must return to the human models and ask whether we must re-evaluate their structure to take advantage of algorithmic improvements. When we considered this question, we determined that a more accurate model of the human spine and torso would be essential to further realism in human motion.
Although many models have appeared to have a flexible torso, they have been computer constructions of the surface shape manipulated by skilled animators [Emm85]. We needed a torso that was suitable for animation, but also satisfied our requirements for anthropometric scalability. Thus a single model of fixed proportions is unacceptable, as human body types manifest considerable differences. A similar type of flexible figure is found in snakes [Mil88, Mil91], but the anthropometry issues do not arise. Moreover, this snake animation is dynamics-based; humans do not need to locomote by wiggling their torsos, and so a kinematic model was deemed adequate. Zeltzer and Stredney's "George" skeleton model has a detailed vertebral column, but it is not articulated, nor is it bent during kinematic animation [Zel82]. Limited neck vertebral motion in the sagittal plane was simulated by Willmert [Wil82]. Various body models attempt a spine with a 3D curve but shape and control it in an ad hoc fashion.
If the spine were realistically modeled, then the torso, a vessel connected to and totally dependent on the spine, could be viewed and manipulated interactively. So we undertook the development of a far more satisfactory and highly flexible vertebral model of the spine and its associated torso shape.

The conceptual model of the spinal column is derived from medical data and heuristics related to human kinesiology. The spine is a collection of vertebrae connected by ligaments, small muscles, vertebral joints called processes, and intervertebral discs [BBA88]. Nature has designed the spine for support of the body's weight, stability of the torso, flexibility of motion, and protection of the spinal cord [AM71, Hol82].

The spine moves as a column of vertebrae connected by dependent joints, meaning that it is impossible to isolate movement of one vertebral joint from the surrounding vertebrae [Lou83]. Muscle groups of the head, neck, abdomen, and back initiate the movement of the spine, and the interconnecting ligaments allow the movement of neighboring vertebrae [BBA88, Wel71].
2.3.1 Motion of the Spine

Anatomy of the Vertebrae and Disc

The spinal column consists of 33 vertebrae organized into 5 regions [BBA88]: cervical, thoracic, lumbar, sacral, and coccyx.

The vertebrae are labeled by medical convention in vertical descending order: C1–C7, T1–T12, L1–L5, and S1–S5. Which regions should be considered part of the torso? The cervical spine lies within the neck. The sacrum and coccyx contain vertebrae that are fixed through fusion [AM71]. Since the mobile part of the torso includes the 12 thoracic and 5 lumbar vertebrae, altogether 17 vertebrae and 18 joints of movement are included in the torso model.
Each vertebra is uniquely sized and shaped, but all vertebrae contain a columnar body and an arch. The body is relatively large and cylindrical, supporting most of the weight of the entire spine. The vertebral bodies increase gradually in size from the cervical to the lumbar region [AM71].

The arch supports seven processes: four articular, two transverse, and one spinous [AM71]. The processes are bony protrusions on the vertebra that aid and limit the vertebral motion. The transverse and spinous processes serve as levers for both muscles and ligaments [BBA88]. The articular processes provide a joint facet for the joint between successive vertebral arches. These processes, due to their geometry, cause the vertebrae to rotate with 3 DOFs. Ligaments and small muscles span successive vertebral processes. They give the spinal column its stability. Because of this strong interconnectivity, spinal movement is modeled as interdependent movements of neighboring joints.
Vertebrae are each separated by intervertebral discs. The disc has 3 parts [Lou83]:

- nucleus pulposus: the sphere in the center, consisting of 85% water

- annulus fibrosus: the fibers running as concentric cylinders around the nucleus

- cartilaginous plates: a thin wall separating the disc from the vertebral body.

The disc changes shape as the neighboring vertebrae bend. But, since the nucleus is 85% water, there is very little compression. The disc can bulge out spherically as force is applied to the columnar body above or below. Therefore, overall the disc does not function as a spring, but as a deformable cylindrical separation between vertebrae, supporting the theory that the vertebrae do not slide, but rotate around an axis [Lou83].
Range of Movement of Each Vertebra

Vertebral movement is limited by the relative size of the discs, the attached ligaments, and the shape and slant of the processes and facet joints. Statistics for joint limits between each successive vertebra have been recorded and compiled [Lou83]. Also, the spine has a natural shape at rest position. The initial joint position of each vertebra is input to the model.

The range of movement of each region of the spine is different. For instance, the optimum movement of the lumbar region is flexion or extension. The thoracic area easily moves laterally, while flexion/extension in the sagittal plane is limited. The cervical area is very flexible for both axial twisting and lateral bending. The joint limits for each region affect how much that joint is able to participate in any given movement. The posture of the torso is a result of the specialization of the spinal regions [Wil75].
Effect of the Surrounding Ligaments and Muscles

The vertebrae are interconnected by a complex web of ligaments and muscles. If the force initiated by a muscle group is applied at one joint, the joint moves and the neighboring joints also move to a lesser degree. Some joints farther away might not be affected by the initiator joint's movement.

It is possible to deactivate joints that are not initiating the movement. This action is achieved by simultaneous contractions of extensor and flexor muscles around the spinal column [Wil75]. Depending on the force of these resisting muscles, the joints on or near the joint closest to the resistor will move less than they would if the resisting force had not been applied. The final position of the spine is a function of the initiator force, the resisting muscle, and the amount of resistance.
2.3.2 Input Parameters

The spine is modeled as a black box with an initial state, input parameters, and an output state [MB91]. To initiate movement of the spine, several input parameters are introduced. These parameters are:

- joint range (FROM and TO): Within the total number of joints in the spine, any non-empty contiguous subset of vertebral joints may be specified by two joint indices. These joints indicate which part of the spine is active in the movement. For example, the user specifies movement in the range between T5 and T10. All other joints are frozen during the movement.

- initiator joint: The joint where movement begins, usually the joint with greatest motion.

- resistor joint: The joint that resists the movement. This may be equated to a muscle that contracts and tries to keep part of the spine immobile.

- resistance: The amount of resistance provided by the resistor joint.

- spine target position: This is a 3D vector describing the target position after rotation around the x, y, and z axes. The target position is the sum of all joint position vectors in the spine after movement succeeds.

- zero interpolation: A value of "yes" indicates that movement is interpolated through the joint rest position. A value of "no" indicates that only the joint limits are used to interpolate movement.
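The parameter list above can be gathered into a single record. A sketch with illustrative field names, not the actual Jack interface:

```python
from dataclasses import dataclass

@dataclass
class SpineMovement:
    """One spine movement request, mirroring the input parameters listed
    above. Field names and types are illustrative assumptions."""
    joint_from: str          # e.g. "T5": first active vertebral joint
    joint_to: str            # e.g. "T10": last active vertebral joint
    initiator: str           # joint where movement begins
    resistor: str            # joint that resists the movement
    resistance: float        # amount of resistance at the resistor joint
    target: tuple            # (rx, ry, rz): total rotation summed over joints
    zero_interpolation: bool # interpolate through the joint rest position?

# Example: 20 degrees of flexion distributed over T5..T10, with the
# resistor at T10 holding back half of the movement.
move = SpineMovement("T5", "T10", "T5", "T10", 0.5, (20.0, 0.0, 0.0), True)
```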
2.3.3 Spine Target Position

The joint between each vertebra has three degrees of rotation. The spine will move toward the target position by rotating around the three possible axes [Lou83]:

ROTATION OF THE SPINE
  flexion/extension   Forward/backward bending   Rotation around x axis
  axial rotation      Twisting                   Rotation around y axis
  lateral bending     Side bending               Rotation around z axis

The position of the flexion rotational axis for each vertebral joint has been measured from cadavers, and is not equidistant to the two adjacent vertebrae, but is closer to the bottom vertebra [Lou83]. The origin of the axis of movement determines how the vertebrae move. When the torso is modeled on the spine, the axis also directly determines how the torso changes shape.
Elongation and compression are absent from the model. The hydrophilic intervertebral disc, when submitted to prolonged compression, induces a slight decrease in height due to fluid leakage. Conversely, after a long period of rest or zero-gravity, the spine elongates by maximum filling of the nucleus pulposus at the center of the disc [Lou83]. Dehydration during a day's activity can result in a loss of height of 2 cm in an adult person. In any short duration of movement the disc is essentially incompressible, and therefore elongation is imperceptible [Hol81].

Shearing or sliding translational movements of the vertebrae would lead to variation in the intervertebral separation. This would not be allowed by the mechanics of the intervertebral disc [Lou83]. Therefore, the assumption is made that for normal activities the three degrees of rotational movement are the only ones possible for each vertebral joint.
2.3.4 Spine Database

Any human figure can have a wide variety of torso shapes. Also, each person has a different degree of flexibility and range of movement. In order to model the position and shape changes of an individual's spine, a database has been designed for creating a unique set of features for the spine and torso. Medical data is the source of the database elements of an average person [Lou83]. The database consists of the size of each vertebra in the x, y, z dimensions, the intervertebral disc size, the joint limits (3 rotations with 2 limits per rotation), and the joint rest (initial) position. In Section 4.2.3 we will see how spine movement is realized.

Figure 2.1: Spherical Trajectory of the Shoulder.
2.4 Shoulder Complex⁵

It is well known that the movement of the humerus (the upper arm) is not a matter of simple articulation, as most computer graphics models would have it. The movement is caused by articulations of several joints: the glenohumeral joint, the claviscapular joint, and the sternoclavicular joint. Collectively, they are called the shoulder complex [EP87, ET89].

In Jack, the shoulder complex is simplified with two joints: one connecting the sternum to the clavicle and the other connecting the clavicle to the humerus [GQO+89]. We call the former joint the clavicle joint and the latter the shoulder joint. This simplification implies that the rotational center of the humerus lies on a spatial sphere when the upper arm moves. It turns out that this is very close to empirical data collected with a 6-D sensor attached to the external midpoint between the dorsal and ventral sides of the right upper arm [Mau91]. The z-axis of the sensor was along the longitudinal axis of the upper arm with the positive z axis direction pointing proximally, and the negative y-axis pointing into the upper arm. The sensor was placed 10 inches from the shoulder (the extremal point of the humerus).
From the experimental data the trajectory of the shoulder can be easily computed. We fitted the trajectory by the sphere which yields minimum average residual error. The result is quite satisfactory: the radius of the sphere is 6.05 cm, and the average error (the distance from the trajectory point to the sphere) is 0.16 cm (Figure 2.1). Notice that the radius of the sphere is not the same as the dimension of the clavicle. This is due to the fact that the shoulder complex indeed has three joints instead of two. The net result can be modeled by two joints, however, if we put the center of the clavicle joint in between the two clavicle extremes.

⁵Jianmin Zhao.

Figure 2.2: A Neutral Human Figure.

In our applications we feel that two joints are adequate to model the shoulder complex. So we modeled the shoulder complex by grouping these two joints into a joint group. Now we need to define the group angles and the way they drive the internal joint angles.
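The sphere fit described above can be reproduced with ordinary linear least squares. This is a sketch of the technique, not the authors' actual code: squaring out |p - c|² = r² turns the problem into a linear system in the center c and a combined term d = r² - |c|².

```python
def fit_sphere(points):
    """Least-squares sphere fit. Linearizing |p - c|^2 = r^2 gives
    2*p.c + d = |p|^2 with d = r^2 - |c|^2, so each point contributes
    one row of a linear system in (cx, cy, cz, d)."""
    A = [[2 * x, 2 * y, 2 * z, 1.0] for x, y, z in points]
    b = [x * x + y * y + z * z for x, y, z in points]
    n = 4
    # Normal equations (A^T A) v = A^T b, solved by Gaussian elimination.
    M = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
         for i in range(n)]
    v = [sum(A[k][i] * b[k] for k in range(len(A))) for i in range(n)]
    for i in range(n):                       # elimination with pivoting
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        v[i], v[p] = v[p], v[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= f * M[i][c]
            v[r] -= f * v[i]
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution
        x[i] = (v[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    cx, cy, cz, d = x
    radius = (d + cx * cx + cy * cy + cz * cz) ** 0.5
    return (cx, cy, cz), radius
```

Given noisy sensor samples, the residual of each point against the fitted sphere would correspond to the 0.16 cm average error reported above.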
2.4.1 Primitive Arm Motions

It will be convenient to have the group angles of the shoulder complex describe the arm's motion in a natural way. To focus on this motion, we assume that the elbow, the wrist, and all joints down to the fingers are fixed. What are the arm's primitive motions? Mathematically, any three independent arm motions will suffice. But careful selection will pay off in positioning ease.

The arm's motion can be decomposed into two parts: spherical and twisting motions. The motion which moves the vector from the proximal end to the distal end of the upper arm is called spherical motion, and the motion which leaves this vector unchanged is called twisting.
In a neutral body stance, consider a coordinate system where the z axis points vertically downward, the x axis points towards the front, and the y axis points to the right. If we place the starting end of the vector from the proximal end to the distal end of the upper arm at the origin of the coordinate system, the terminating end will stay on a sphere with the center at the origin when the arm moves. The spherical motion can be further decomposed into two motions: one which moves the arm vector along the longitude of the sphere (elevation), and another which moves the vector along the latitude (abduction). To describe the current status of the arm vector, we need a convention for zero elevation or abduction. Let us define the amount of elevation as the unsigned angle between the z axis and the arm vector and, for the left arm, the amount of abduction as the signed angle between the