Graphically Speaking

Editor: Miguel Encarnação

Carnival—Combining Speech Technology and Computer Animation

Michael A. Berger, Speech Graphics Ltd.
Gregor Hofer, Speech Graphics Ltd.
Hiroshi Shimodaira, University of Edinburgh

Speech is powerful information technology and the basis of human interaction. By emitting streams of buzzing, popping, and hissing noises from our mouths, we transmit thoughts, intentions, and knowledge of the world from one mind to another. We're accustomed to thinking of speech as an acoustic, auditory phenomenon. However, speech is also visible. Although the primary function of speech is to manipulate air in the vocal tract to produce sound, this action has an ancillary effect of changing the face's appearance. In particular, the action of the lips and jaw during speech causes constant deformation of the facial surface, generating a robust visual signal highly correlated with the acoustic one1—that is, visual speech.

In computer graphics, this means that speech is something that must be studied, understood, and simulated. Speech animation, or lip synchronization, is a major challenge to animators, owing to its intrinsic complexity and viewers' innate sensitivity to the face. (Actually, the term lip synchronization—lip sync—incorrectly implies that only the lips move, whereas actually almost all the facial surface below the eyes gets deformed during speech.) At the same time, demand for lip sync is sharply increasing, in terms of both realism and quantity. Automated solutions are now absolutely necessary. (For more on why this is the case, see the sidebar.)

The past two decades have seen the emergence of techniques for animating speech automatically using speech technology—an interdisciplinary concept called visual speech synthesis. Since the late 1980s, two applications have been in development. Audio-driven animation automatically synthesizes facial animation from audio. Text-driven animation (or audiovisual text-to-speech synthesis) synthesizes both auditory and visual speech from text. The former is used for automatic lip sync with recorded audio, the latter for entirely text-based avatars.

But speech technology and computer graphics remain worlds apart, and the development of visual speech synthesis suffers from lack of a unified conceptual and technological framework. To meet this need, researchers at Speech Graphics (www.speech-graphics.com) and the University of Edinburgh's Centre for Speech Technology Research (CSTR; www.cstr.ed.ac.uk) are developing Carnival, an object-oriented environment for integrating speech processing with real-time graphics. Carnival comprises an unlimited number of modules that can be dynamically loaded and assembled into a mutable animation production system.

Visual Speech Synthesis

Both audio- and text-driven animation involve a series of operations converting a representation of speech from an input form to an output form.

Audio-Driven Animation

In audio-driven animation (see Figure 1), the first step is acoustic analysis to extract useful information from the audio signal. This information might be of two kinds:

Figure 1. A typical processing pipeline for audio-driven facial animation. Acoustic analysis extracts continuously and categorically valued representations of the audio. Both can be used as input to motion synthesis, which produces audio-synchronous motion in some parameter space, which must be mapped to a facial model’s animation parameters (adaptation). From these parameters, we can render the animation using standard methods.

■ continuous acoustic parameters, such as pitch, intensity, or mel-frequency cepstral coefficients; or
■ discrete speech categories, such as phonemes or visemes.

Both can be the basis for the next step, synthesizing audio-synchronous motion. Given some regression model, we can map continuous acoustic parameters directly to motion parameters. On the other hand, a categorical analysis (see Figure 2) provides a semantic description of speech events. This description abstracts the speech from the audio domain, allowing its reconstruction in the motion domain.

After synthesizing facial motion in some form, we must still map it to a facial model's animation parameters, determined by its deformers. In a 3D facial rig with blendshapes and bones (also called joints), the parameters are blendshape weights and bone transformation parameters. Using these parameters, we can render the animation. This is the fundamental process flow in any audio-driven lip-sync method. For an example of audio-driven animation produced using Carnival, visit http://doi.ieeecomputersociety.org/10.1109/MCG.2011.71.
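To make the regression step concrete, here is a minimal C++ sketch that maps one frame of continuous acoustic parameters (for instance, a vector of mel-frequency cepstral coefficients) to motion parameters through a fixed linear model. The class and names are ours, not Carnival's, and real systems typically use richer statistical models trained on audiovisual data; the sketch only illustrates the frame-by-frame mapping idea.

#include <cstddef>
#include <vector>

// One frame of continuous acoustic parameters (e.g., MFCCs) is mapped to
// motion parameters by a linear model: motion = W * acoustic + b.
// W and b would be estimated offline from audiovisual training data; here
// they are simply placeholders.
struct LinearRegressor {
    std::vector<std::vector<double>> W;  // rows: motion channels, cols: acoustic channels
    std::vector<double> b;               // one bias per motion channel

    std::vector<double> map(const std::vector<double>& acoustic) const {
        std::vector<double> motion(b);   // start from the bias vector
        for (std::size_t i = 0; i < W.size(); ++i)
            for (std::size_t j = 0; j < acoustic.size(); ++j)
                motion[i] += W[i][j] * acoustic[j];
        return motion;
    }
};

Applying such a mapping frame by frame yields a motion-parameter trajectory, which the adaptation step then translates into a particular rig's blendshape weights and bone transformations.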


Why Automate Speech?

"Realistic facial synthesis is one of the most fundamental problems in computer graphics—and one of the most difficult."1

Traditionally, lip synchronization (lip sync) has been done manually, by keyframing or rotoscoping. However, as 3D animation reaches increasing heights of realism, all aspects of the animation industry must keep up, including lip sync. And realistic lip sync is extremely labor intensive and difficult to achieve manually. This difficulty is due to four characteristics of visual speech: dynamic complexity, audio synchronicity, high sensitivity, and high volume.

Dynamic Complexity

Speech is arguably one of the most complex human motor activities. Our alphabetic writing system can deceive us into thinking that speech is just a succession of discrete, sound-producing events, corresponding to letters. But as people discovered in the 19th century with the advent of instrumental acoustics, this isn't the physical reality. Speech is a continuous activity with no real "units" of any kind.

Like other task-oriented motor behaviors, speech is highly efficient: energy expenditure is minimized. So instead of producing one sound after another sequentially, we begin producing each sound well before concluding the previous one. The movements of the tongue, lips, and jaw in speech are like an athlete's coordinated movements: different body parts acting in concert, future movements efficiently overlapping with current ones, all efficiently compressed in time. This simultaneous production of sounds, called coarticulation, means that what we think to be a given consonant or vowel is actually realized quite differently depending on the sounds preceding and following it. Consequently, it's difficult or impossible to define units of speech in a context-invariant way. Such dynamic complexity is understandably difficult to reproduce by hand.

Audio Synchronicity

Unlike other animated behaviors such as walking, visual speech must be tightly and continuously synchronized with an audio channel. This synchronization makes speech animation a uniquely double-edged problem. The visual speech must be not only dynamically realistic in itself but also sufficiently synchronous and commensurate with the auditory speech to create the illusion that the two signals are physically tied—that is, that the face we're watching is the source of the sound we hear.

High Sensitivity

Beyond the intrinsic difficulties of synthesizing visual speech—complexity and audio synchronicity—there's an extrinsic perceptual problem. Humans are innately well attuned to faces, which makes us sensitive to unrealistic facial animation or bad lip sync.

This sensitivity might serve a communicative function. With our highly expressive faces, we seem designed for face-to-face communication. This obviously includes nonverbal communication: facial expressions modify the spoken word's meaning and transmit emotional states and signals in the absence of speech. But faces are also integral to speech communication.

Figure 2. The waveform and categorical analysis of the utterance "Quick shots rang out" (phone labels k w ih k sh aa t s r ae ng aw t, between roughly 13.65 and 14.77 seconds in the recording). Such analyses provide a semantic description of speech events that we can use to reconstruct speech in the motion domain.


Humans have an innate ability to lip-read, or recognize words by sight; this is true for both hearing-impaired and normally hearing people. The next time you're in a noisy place such as a bar, notice how much you rely on seeing someone's face in order to "hear" them. Even in noise-free environments, speech perception is still a function of both auditory and visual channels. Visual speech so strongly influences speech perception that it can even override the auditory percept, causing us to hear a sound different from the one the ear received—the famous McGurk effect.2

High Volume

As the bar for realism in 3D animation continues to rise, and with it the demand for higher-quality lip sync, the quantity of speech and dialogue in animation is also rising exponentially. These are antagonistic sources of pressure: animators can't satisfy quantity demands without sacrificing quality, and vice versa.

A case in point is the video game industry. Video game characters are becoming increasingly realistic, in both static appearance and behavior. Poor lip sync will be reflected in game reviews, which often include rants about lip sync. At the same time, games are becoming increasingly story-driven and cinematic, more like interactive movies than games. This means much more speech and dialogue, all of which must be animated.

As big-title games move online, the amount of assets in a game, including recorded audio, can increase by an order of magnitude. Rockstar Games' Grand Theft Auto IV, released in 2008, had 660 speaking parts with 80,000 lines of dialogue. This was considered a staggering amount of speech at the time.3 However, the new Star Wars massively multiplayer online game, The Old Republic, to be released in 2012 by BioWare, will feature some "hundreds of thousands of lines of dialogue," or the equivalent of about 40 novels.4 At just one-third of the way through the voice work for the game, the amount of audio recorded reportedly had already exceeded the entire six-season run of The Sopranos.5 As if that weren't enough, most games are released in multiple languages; to avoid unsynchronized or "dubbed" speech, all the speech animation must be redone for every language. Game developers simply can't do this by hand; they need automated solutions.

References
1. F. Pighin et al., "Synthesizing Realistic Facial Expressions from Photographs," Proc. Siggraph, ACM Press, 1998, pp. 75–84.
2. H. McGurk and J. MacDonald, "Hearing Lips and Seeing Voices," Nature, vol. 264, 1976, pp. 746–748.
3. "Rockstar Games' Dan Houser on Grand Theft Auto IV and Digitally Degentrifying New York," 2 May 2008; http://nymag.com/daily/entertainment/2008/05/rockstar_games_dan_houser.html.
4. L. Smith, "E3 2009: Star Wars: The Old Republic Is World's First 'Fully Voiced' MMO," blog, 1 June 2009; http://massively.joystiq.com/2009/06/01/e3-2009-star-wars-the-old-republic-is-worlds-first-fully-voi.
5. B. Crecente, "The Old Republic Wordier Than Entire Run of the Sopranos," Kotaku, 3 June 2009; http://kotaku.com/#!5278008/the-old-republic-wordier-than-entire-run-of-the-sopranos.

Text-Driven Animation

In the text-driven pipeline (see Figure 3), the first step is to apply pronunciation and duration rules to produce a categorical time series, like that derived from audio in Figure 1. From this semantic representation, we synthesize both audio and articulatory motion. After motion synthesis, the left side of Figure 3 is identical to Figure 1. This is the typical process flow for text-driven methods, but variations exist that synthesize audio and motion in a more unified manner from a single speech model.

Acoustic vs. Visual Synthesis

Many parallels exist between acoustic-speech and visual-speech synthesis because acoustic and visual speech signals have similar underlying dynamics. This is because they're generated by the same physiological system: the vocal tract, of which the mouth and surrounding facial features are the visible terminus. For instance, both channels exhibit coarticulation—the overlapping production of sounds. So, visual speech synthesis can and does borrow techniques from acoustic-speech synthesis.

At the same time, the two types of synthesis differ considerably. The mediums are entirely different: synthesizing facial shapes or images as opposed to acoustic energy. Also, in the visual domain, it's important to synthesize nonverbal motion, such as emotional expressions, eye gaze, and autonomic activities such as blinking and breathing. There's no such thing as "silence" in visual speech because the face is never really still.

Figure 3. A typical processing pipeline for text-driven facial animation. Pronunciation and duration rules generate a categorical representation of the speech over time, providing common input to audio synthesis and motion synthesis. Thereafter, the process is similar to audio-driven animation.
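As a rough illustration of the first stage in Figure 3, the following C++ sketch converts a word sequence into a categorical time series of timed phones using a toy pronunciation lexicon and a fixed duration per phone. The lexicon entries (borrowed from the Figure 2 utterance), the duration value, and the type names are invented for this example; a real front end would use a full pronunciation dictionary, letter-to-sound rules, and a trained duration model.

#include <map>
#include <string>
#include <vector>

// A labeled interval in a categorical time series: one phone with its start
// and end times in seconds.
struct PhoneSegment {
    std::string phone;
    double start;
    double end;
};

// Toy front end: look each word up in a small pronunciation lexicon and give
// every phone the same fixed duration. A real system would use a complete
// dictionary, letter-to-sound rules, and a trained duration model.
std::vector<PhoneSegment> textToPhoneSegments(const std::vector<std::string>& words,
                                              double phoneDuration = 0.08) {
    static const std::map<std::string, std::vector<std::string>> lexicon = {
        {"quick", {"k", "w", "ih", "k"}},
        {"shots", {"sh", "aa", "t", "s"}},
        {"rang",  {"r", "ae", "ng"}},
        {"out",   {"aw", "t"}},
    };
    std::vector<PhoneSegment> segments;
    double t = 0.0;
    for (const std::string& word : words) {
        const auto entry = lexicon.find(word);
        if (entry == lexicon.end()) continue;          // unknown word: skipped in this sketch
        for (const std::string& phone : entry->second) {
            segments.push_back({phone, t, t + phoneDuration});
            t += phoneDuration;
        }
    }
    return segments;
}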

Facial Performance Capture vs. Visual Speech Synthesis

Besides visual speech synthesis, another technique for automating lip sync is facial performance capture, which captures the motions of a live actor's face and maps them onto a facial model. But performance-driven animation is expensive—requiring a studio, trained actors, careful application of facial markers or makeup (or both), a director to oversee the performance, and a capture system run by a trained operator. Audio recording conditions are often poor, so audio has to be dubbed in later, with no automatic synchronization. Moreover, capturing motion and transferring it to the animated character is error-prone and requires manual supervision and cleanup. Tracking the lips in particular is notoriously difficult. Owing to the labor and expense, this method can't scale to large volumes of speech. However, we believe performance-driven and audio-driven methods can complement one another: the former for nonverbal body and facial motion, and the latter for high-volume speech.

Bridging the Divides

Unlike performance-driven animation, which simply maps motion to motion, visual speech synthesis involves more distal mappings from audio or text to motion. As we've seen, this requires modeling complex dynamic phenomena specific to speech. So, our field is interdisciplinary, combining speech technology and computer graphics. But although it spans two disciplines, it's a stepchild to both. Speech technologists might have a good grasp of speech but tend to show little interest in high-quality facial modeling or computer graphics. In the graphics community, the problem is reversed. Practitioners specialize in realistic 3D models but tend to underestimate the speech issues, resulting in poor lip sync. Few individuals sufficiently grasp both speech and computer facial animation to make the entire process work. Thus, we need a more collaborative approach.

Besides the cultural divide, a technological divide also exists. No standard software infrastructure incorporates speech technology and computer graphics. Visual-speech-synthesis platforms tend to be patchwork, with speech processing handled by research code, and rendering done offline in external programs such as Maya or Blender. A few systems have built-in rendering—for example, the Baldi project at the University of California Santa Cruz's Perceptual Science Lab (http://mambo.ucsc.edu), the Semaine platform at Télécom ParisTech (www.semaine-project.eu), and a few commercial applications over the years. The tendency, however, is to build monolithic systems for a single application. Building on or extending such a software base is difficult, such that if you want to do one part differently, it's often easier to start from scratch. The current state of affairs clearly isn't conducive to accelerated development and collaboration.

Carnival's Genesis

We needed something better for our own work.

The Carnival system arose out of the need for a research tool with which to try out various audio-driven facial-animation methods that CSTR colleagues and Speech Graphics colleagues were developing. The tool had to be

■ a flexible system in which we could easily interchange and compare different methods,
■ a well-structured system to which other developers could contribute easily without it breaking, and
■ an interactive, real-time system letting us analyze animation against time-varying data.

Nothing like this existed at the time.

We didn't just want to engineer software for a particular application embodying a particular set of synthesis methods. Instead, we asked, how do we step back and provide a more general foundation for this work? The answer was to first perform a conceptual and structural analysis of the field, which would let us view speech technology and computer graphics components in a common ontological universe. Then, we implemented that analysis in an object-oriented system that emphasizes modularity and flexibility. The name Carnival is a play on Festival—a widely used speech synthesis platform previously developed at CSTR—with the added connotation of faces.

The Carnival API

At its core, Carnival is a C++ API. The API provides a flexible animation-system architecture, which consists of any number of dynamically loadable, combinable modules, all belonging to the superclass Component. Figure 4 shows the Component class hierarchy, which contains five subtypes: Sequence, Visualizer, Event, Processor, and Pipeline.

Figure 4. The Carnival API's Component class hierarchy. Components are self-contained modules—processors, data objects, and output systems—that can be dynamically loaded and assembled to form a mutable animation production system.

Sequences

A Sequence is any sequence of values, which might be a signal (for example, audio or video) or some encoding thereof. For example, each input and output object in Figures 1 and 3 is a Sequence. Sequence has two subtypes: String and TimeSeries. A String is an atemporal sequence (usually of characters—that is, text), whereas a TimeSeries is a sequence of values at specific time points.

TimeSeries has two subtypes. A NumericalTimeSeries has floating-point values on multiple channels, which we can sample continuously by interpolation. A CategoricalTimeSeries has categorical values, which extend over intervals and change at discrete boundaries (see Figure 2). NumericalTimeSeries divides into two further subtypes: Regular, with values spaced at a uniform time interval, and Irregular, with values spaced nonuniformly. The former includes Signals, which are Sequences that can be output in real time, including the subclasses Audio and Video.
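In C++ terms, the data-object side of this hierarchy might be skeletonized as follows. This is our own simplified paraphrase of the classes named above, not the actual Carnival headers, and the method signatures are guesses for illustration.

#include <string>
#include <vector>

// Root of the hierarchy: every module in the system is a Component.
class Component {
public:
    virtual ~Component() = default;
};

// The remaining Component subtypes (Visualizer, Event, Processor, Pipeline)
// are sketched in later listings; only the Sequence branch is shown here.
class Sequence : public Component {};              // any sequence of values

class String : public Sequence {                   // atemporal sequence, usually text
public:
    std::string text;
};

class TimeSeries : public Sequence {               // values at specific time points
public:
    virtual double duration() const = 0;
};

class NumericalTimeSeries : public TimeSeries {    // multichannel floating-point values
public:
    // Sample every channel at time t, interpolating between stored frames.
    virtual std::vector<float> sample(double t) const = 0;
};

struct Interval {                                  // one labeled span of time
    std::string label;
    double start;
    double end;
};

class CategoricalTimeSeries : public TimeSeries {  // labels over intervals
public:
    std::vector<Interval> intervals;
    double duration() const override {
        return intervals.empty() ? 0.0 : intervals.back().end;
    }
};

class RegularTimeSeries : public NumericalTimeSeries {};    // uniform frame spacing
class IrregularTimeSeries : public NumericalTimeSeries {};  // nonuniform spacing

class Signal : public RegularTimeSeries {};        // can be output in real time
class Audio  : public Signal {};
class Video  : public Signal {};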

Visualizers

We considered it essential for Carnival to have real-time rendering capabilities.


Figure 5. Our 3D implementation of the Visualizer, consisting of a control interface and an OGRE (Object-Oriented Graphics Rendering Engine) scene containing a facial model. The interface comprises a set of deformation parameters (DPs), which can be bound to the current time point in a NumericalTimeSeries or to other DPs by linking functions. Ultimately, DPs link to low-level animation parameters of the facial model in the OGRE scene, such as blendshape weights or bone transformation parameters.

Users can always export animation to packages such as Maya, 3ds Max, or Softimage for high-quality in-scene rendering. However, they should also be able to preview animation in real time in the same system that produced it, so that feedback is immediate and they can view the animation in synchrony with relevant time-varying data.

So, we designed a real-time rendering engine. But in keeping with Carnival's design philosophy, we made it a Component—modular and self-contained. It's called a Visualizer. A Visualizer has a control interface consisting of a set of deformation parameters (DPs). Each DP has the range [0, 1] for unidirectional or [–1, 1] for bidirectional deformations of the face, with a rest state of 0. A Visualizer is essentially an image decoder, converting a vector of DPs into an image on the screen. Thanks to encapsulation, how it does this—2D or 3D rendering—is of no concern to external callers.

For real-time animation, a Visualizer can be bound to a NumericalTimeSeries whose channels are the Visualizer's DPs. Whatever the current time point is in the NumericalTimeSeries, any bound Visualizer will display an image visualizing that time point's vector of values. Figure 5 illustrates binding to a NumericalTimeSeries.

Figure 5 also displays our 3D implementation of the Visualizer, which is based on OGRE (Object-Oriented Graphics Rendering Engine; www.ogre3d.org) and can accommodate any facial model created in standard 3D modeling packages. As Figure 5 shows, the 3D Visualizer consists of a control interface and an OGRE scene. The DPs can be bound not only to a NumericalTimeSeries but also to other DPs by linking functions. Ultimately, DPs link to low-level animation parameters of the facial model in the OGRE scene.

Events

A Sequence is essentially a representation of some temporal event, such as an utterance or an action. For example, when we record someone speaking, the audio signal is a representation of the event of their speaking. If video was recorded at the same time, it too represents the same event. Extracting some features from the audio, such as mel-frequency cepstral coefficients or pitch, will result in yet another representation. We can also have a text transcript of what the person said, which represents the event in still another way. A group of Sequences like this that all represent the same event in different ways form a natural grouping, which Carnival supports with the class Event. An Event is a container Component that contains an ordered list of Sequences. All the TimeSeries members of the Event share a common time domain. The String members have no time dimension, but they textually refer to the same interval. Figure 6 illustrates an Event.

Besides forming a grouping of related Sequences, Event also functions as a playback system, performing synchronous output of its real-time members. As it outputs each Signal, it also updates the current time in all TimeSeries, which is automatically reflected by an image change in any bound Visualizers. So, by simply changing the current time, the Event automatically produces animation.
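A minimal sketch of the Visualizer binding described above might look like this. The rendering itself is reduced to an abstract display() call (in Carnival's 3D implementation that work is done in an OGRE scene), and the method names are our own simplification rather than the real API.

#include <vector>

// Stand-in for the multichannel time series described earlier.
class NumericalTimeSeries {
public:
    virtual ~NumericalTimeSeries() = default;
    virtual std::vector<float> sample(double t) const = 0;  // interpolated channel values at t
};

// A Visualizer decodes a vector of deformation parameters (DPs), each in
// [0, 1] or [-1, 1] with a rest state of 0, into an image. Whether it renders
// in 2D or 3D is hidden behind display().
class Visualizer {
public:
    virtual ~Visualizer() = default;
    virtual void display(const std::vector<float>& dps) = 0;

    // Bind the DP channels to a time series; showTime() then visualizes the
    // series' values at whatever the current time point is.
    void bind(const NumericalTimeSeries* series) { bound_ = series; }
    void showTime(double t) {
        if (bound_) display(bound_->sample(t));
    }

private:
    const NumericalTimeSeries* bound_ = nullptr;
};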

Figure 6. An Event, containing an ordered list of Sequences representing the same temporal event, such as an utterance. The TimeSeries members share a common time domain with current elapsed time t. This Event includes a String, an Audio, two NumericalTimeSeries (to one of which a Visualizer is bound), a CategoricalTimeSeries, and a Video. An Event has playback functions such as play, pause, and seek, which control synchronous output of the set’s real-time members (Signals and bound NumericalTimeSeries).
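The playback behavior summarized in Figure 6 could be sketched as follows. This is a deliberately reduced model: real audio and video output, threading, and clock synchronization are omitted, and the member names are illustrative rather than Carnival's.

#include <memory>
#include <vector>

class Sequence {                        // minimal stand-in for the Sequence class
public:
    virtual ~Sequence() = default;
};

class Visualizer {                      // minimal stand-in; see the earlier sketch
public:
    virtual ~Visualizer() = default;
    virtual void showTime(double t) = 0;
};

// An Event groups Sequences that represent the same temporal event and plays
// them back against a shared clock.
class Event {
public:
    void add(std::shared_ptr<Sequence> s) { sequences_.push_back(std::move(s)); }
    void bind(Visualizer* v)              { visualizers_.push_back(v); }

    // Move the shared current time; every bound Visualizer refreshes. A real
    // implementation would also push Audio and Video Signals to the output
    // devices here, locked to the same clock.
    void seek(double t) {
        currentTime_ = t;
        for (Visualizer* v : visualizers_) v->showTime(currentTime_);
    }

    // Crude play loop: step the clock at a fixed frame rate.
    void play(double duration, double frameRate = 30.0) {
        for (double t = 0.0; t <= duration; t += 1.0 / frameRate) seek(t);
    }

private:
    std::vector<std::shared_ptr<Sequence>> sequences_;
    std::vector<Visualizer*> visualizers_;
    double currentTime_ = 0.0;
};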

Processors and Pipelines

As Figures 1 and 3 illustrate, automated animation production basically entails converting Sequences from one form to another. The Component to handle Sequence conversion is the Processor class. Simply, a Processor takes one or more Sequences as input and gives one or more Sequences as output.

Processors are modeled after Unix commands in that they do relatively simple jobs, and they can be concatenated or piped. The analog of the Unix pipeline is the Pipeline class, a container Component holding an ordered list of Processors. In a Unix pipeline, each process's output is redirected as input to the next one. But a Carnival Pipeline works differently: instead of connecting Processors by direct feeds, it passes an Event to each of them in order. Each Processor searches the Event for its required input Sequences, by traversing the list from end to beginning and performing type checking. If it finds the required input, it runs the process and adds its output Sequences to the end of the Event. The next Processor will then have access to this output and the output of earlier Processors. Figure 7 shows an example Pipeline.

In contrast to Unix pipelines, the searchable-list method of concatenation lets Processors have multiple inputs and outputs and receive input produced by nonimmediate predecessors. It's similar in nature to concatenative programming languages, such as Joy or PostScript, but by using a list instead of a stack, it preserves all output instead of "popping" items off.
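One plausible way to realize this searchable-list scheme in C++ is sketched below. It is our own illustration, not the Carnival source: the Event is reduced to an ordered list of Sequences, each Processor scans that list from the end for inputs of the types it needs (here via dynamic_pointer_cast), and appends whatever it produces.

#include <memory>
#include <vector>

class Sequence {                  // minimal stand-in for the Sequence hierarchy
public:
    virtual ~Sequence() = default;
};

// Here an Event is reduced to its ordered list of Sequences.
using Event = std::vector<std::shared_ptr<Sequence>>;

// Search the Event from end to beginning for the most recent Sequence of type T.
// Concrete Processors call this to locate their required inputs.
template <typename T>
std::shared_ptr<T> findLatest(const Event& event) {
    for (auto it = event.rbegin(); it != event.rend(); ++it)
        if (auto match = std::dynamic_pointer_cast<T>(*it))
            return match;
    return nullptr;
}

// A Processor reads the Sequences it needs from the Event and, if it finds
// them, appends its output Sequences to the end of the Event.
class Processor {
public:
    virtual ~Processor() = default;
    virtual void run(Event& event) = 0;
};

// A Pipeline is an ordered list of Processors; running it hands the same
// Event to each Processor in turn, so later Processors can also consume
// output produced by nonimmediate predecessors.
class Pipeline {
public:
    void append(std::shared_ptr<Processor> p) { processors_.push_back(std::move(p)); }
    void run(Event& event) {
        for (const auto& p : processors_) p->run(event);
    }

private:
    std::vector<std::shared_ptr<Processor>> processors_;
};

The dynamic_pointer_cast stands in for whatever type checking Carnival actually performs; the important property is that output is appended rather than consumed, so every intermediate representation remains available to later Processors.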


Figure 7. An example Pipeline in action (Processors MFCC_Extractor, Aligner, Syllabifier, and MotionGenerator), demonstrating audio-driven animation in which a text transcript accompanies the audio. At the start, the Event contains two Sequences: Audio and String. Each Processor searches the Event backwards for its required input Sequences and adds its output to the end of the Event. So, Processors have access to multiple Sequences, including those produced by nonimmediate predecessors.
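Echoing the previous sketch (types are redeclared compactly here so the example stands alone), assembling and running a pipeline like the one in Figure 7 might look like the following. The concrete Processor classes (MFCC_Extractor, Aligner, Syllabifier, MotionGenerator) are hypothetical stubs standing in for real implementations, and the Sequence subtypes are placeholders.

#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Compact stand-ins so the example stands alone; real Carnival types differ.
struct Sequence { virtual ~Sequence() = default; };
struct Audio   : Sequence {};
struct String  : Sequence { std::string text; };
struct Phones  : Sequence {};                      // categorical time series stand-in
struct Motion  : Sequence {};                      // motion-parameter track stand-in

using Event = std::vector<std::shared_ptr<Sequence>>;

struct Processor {
    virtual ~Processor() = default;
    virtual void run(Event& event) = 0;
};

// Hypothetical Processors matching Figure 7; the bodies are stubs.
struct MFCC_Extractor  : Processor { void run(Event&) override { /* Audio -> acoustic features */ } };
struct Aligner         : Processor { void run(Event& e) override { e.push_back(std::make_shared<Phones>()); } };
struct Syllabifier     : Processor { void run(Event&) override { /* Phones -> syllable structure */ } };
struct MotionGenerator : Processor { void run(Event& e) override { e.push_back(std::make_shared<Motion>()); } };

int main() {
    Event event;                                   // starts with an Audio and a String
    event.push_back(std::make_shared<Audio>());
    auto transcript = std::make_shared<String>();
    transcript->text = "One night her grandmother woke up";
    event.push_back(transcript);

    std::vector<std::shared_ptr<Processor>> pipeline = {
        std::make_shared<MFCC_Extractor>(), std::make_shared<Aligner>(),
        std::make_shared<Syllabifier>(),    std::make_shared<MotionGenerator>()};
    for (const auto& p : pipeline) p->run(event);  // hand the Event to each Processor in turn

    std::cout << "Event now holds " << event.size() << " Sequences\n";
    return 0;
}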

Like other Components, Processors are dynamically loaded at runtime. So, while an application runs, various different Pipelines can be assembled and used, for a runtime-programmable system. The modular and programmable nature of Pipelines fulfills the design objective that Carnival should be mutable and should allow for easy interchange of methods.

Accelerated Development

Carnival-based applications extend the modular paradigm by further subclassing Components. Typically, they also add a GUI and other layers that tailor the user experience by providing higher-level commands and restricting what users can do or see. For example, you can make an application that's exclusively text-driven or audio-driven. But underneath, each application inherits the parent system's dynamically modular design.

Application development is accelerated because the API supplies so much structure and function in advance, but without confining the application to a particular form. Once an application is established, it can grow quickly by adding more and more Components. The modularity of Components allows for concurrent development. Researchers can focus on coding their own algorithms by writing Processors; the Processor interface lets them immediately join their Processors to any Pipeline. Processors are black boxes that snap together—a desirable outcome in object-oriented design.

The developer determines the user interface. In designing our in-house application, we've studied GUIs from a variety of exemplars: audio analysis tools, video-editing systems, speech synthesis programs, and 3D modeling and animation packages. We also obtained feedback from animation professionals about the functionality they would like. Our application includes graphical editors for each type of Component. For example, the Visualizer interface (see Figure 8) gives manual access to deformation parameters.

Figure 8. The graphical user interface for a Visualizer, with manual access to deformation parameters. Application developers can extend Carnival by giving Components graphical interfaces for user editing and viewing.

The Timeline interface provides playback controls and an interactive display of time-series data, with a cursor that can be scrubbed (dragged back and forth in the timeline for framewise output).

Animators are increasingly looking for automated solutions to their problems. Visual speech synthesis is an attractive option from the viewpoints of cost and quality.

Proper integration of speech technology requires stepping back from specific programming tasks and looking at the big picture. Animation is just another form of synthesis, and visual output is just another form of output. Speech technologists are used to separating the problem of synthesis into generation of underlying dynamics followed by reconstruction of signal output. That the output is visual in this case shouldn't hinder us from seeing the problem as a unified synthesis problem. Breaking that problem down in a modular, abstract, object-oriented manner is correct from both a conceptual and a technological viewpoint.

We plan to continue our two-pronged approach, using Carnival in both the collaborative research environment at the University of Edinburgh and the commercial setting at Speech Graphics. Applications in this framework will become available for licensing by animators and video game producers, as well as for academic research.

Reference
1. H. Yehia, P. Rubin, and E. Vatikiotis-Bateson, "Quantitative Association of Vocal-Tract and Facial Behavior," Speech Communication, vol. 26, nos. 1–2, 1998, pp. 23–43.

Michael A. Berger is a cofounder and the chief technology officer of Speech Graphics, and a PhD candidate at the University of Edinburgh School of Informatics, in the Centre for Speech Technology Research. Contact him at [email protected].

Gregor Hofer is a cofounder and the chief executive officer of Speech Graphics. Contact him at [email protected].

Hiroshi Shimodaira is a lecturer in the University of Edinburgh School of Informatics and a member of the Centre for Speech Technology Research. Contact him at [email protected].

Contact Department Editor Miguel Encarnação at [email protected].
