UDC 001.81:003.6:681.32

Human-machine Interface Using Humanoid Cartoon Character

Satoshi Iwata, Takahiro Matsuda, Takashi Morihara (Manuscript received June 6, 1999)

This paper describes our Cartoon Character Interface (CCIF). The CCIF is a graphical user interface that features a humanoid cartoon character that acts as an anthropomorphic agent. First, we report on the effectiveness of humanoid characters in human-machine interactions and how users accept such characters. Next, we describe a simulator for creating scenes for the CCIF for use in a sales terminal and describe the results of an experiment to evaluate how the CCIF affected the terminal's operability. We confirmed that the CCIF made the interface of the terminal easier to use. Finally, we introduce a cartoon animation composer for creating CCIFs.

1. Introduction
The use of animated humanoid characters to assist in interactions between humans and machines is quite common these days. Some examples are provided below.
1) Characters on terminals
Animated characters are used on Automated Teller Machines (ATMs) and ticket vending machines.
2) Characters on PCs
Some mailer systems use animated characters that indicate the status of mail delivery.
3) Characters in communication networks
Users perform tasks in a virtual world by controlling a character (called an avatar).
There have been several attempts to use computer graphics (CG) characters as anthropomorphous agents on graphical user interfaces (GUIs). Hasegawa et al. designed multi-modal systems with an anthropomorphous agent.1) Tani et al. used a character as the navigator of a digital archive.2) Kawabata designed "Noddy," which is an agent that can recognize the speech of a user.3) Ebihara et al. designed the humanoid agent "Shall we dance" for network applications, in which the agent dances with a partner on the network.4)
These examples illustrate that humanoid-type characters can play important roles in interfacing between humans and machines.
On the other hand, in the case of terminal use, there are many occasions when people need to operate terminals with which they are unfamiliar. Some of these terminals are complicated to use and require many operations. The introduction of an anthropomorphous agent on the GUI would be very useful in these cases.
Our efforts are focused on terminal applications, so we developed an animated character interface and a new GUI with cartoon characters acting as anthropomorphic agents. We also made an experimental application of the character interface and then verified the effects of these interfaces.
We then developed an animation composition tool called the Cartoon Character Interface Composer (CCIC) for the animated character interface. In this paper we describe these developments and present the results of our experiments.


2. Our approach
2.1 Cartoon characters
We applied cartoon characters that were produced by connecting a series of cartoon animation cells created in advance to the animated character interface and called it the Cartoon Character Interface (CCIF).
It is sometimes difficult for users to view computer-generated characters as anthropomorphous or autonomous agents because the machine-rendered image limits the resolution of the characters, making it difficult to design features that move naturally.
We chose cartoon characters for the following reasons.
1) Most people are familiar with cartoon characters.
2) Although it is difficult to build a cartoon animation program, once built, it is relatively easy to use.

2.2 Effects of humanoid cartoon character
We investigated the effects of using a humanoid cartoon character. We sampled facial expression clips from well-known TV animation characters to acquire an understanding of animation methods and then categorized the clips according to expression. By joining the clips together and incorporating a voice, we made several experimental applications of interfaces that use a humanoid cartoon character. The applications included an Animation Narrator who acts as a weather reporter and as a guide on a web browser and ATM display, and an Interactive Tutor who acts out a simple Q & A scenario. The voices were created using a text-to-speech system designed by our company.5)
We then asked 10 people between 20 and 30 years old to operate the interfaces and give their subjective evaluations. The subjects all use a personal computer in their daily work.
The subjects' comments after the experiment included the following.
1) The characters of the interfaces seem to give a personality to the system being operated.
2) The characters' voices are easier to catch with mouth movements than without.
3) The characters give the impression that a real human is assisting the user, which makes the terminal's responses seem autonomous.
4) The characters convey sufficient emotion and feeling to affect the user.
5) Repetition of the same character actions becomes boring and spoils the illusion of character autonomy.
6) Installing various gestures would be an improvement.
After considering the above comments, we came to the following conclusions with regard to the use of humanoid cartoon characters for the interface.
1) The humanoid cartoon characters attract users to the applications, so they can enhance the usability of practical GUIs.
2) The users are affected by the characters' movements and expressions, so the characters can be used to convey non-verbal information.
3) Natural looking movements increase the likelihood that the characters appear autonomous.

3. Cartoon Character Interface
3.1 Sales terminal
We used the CCIF simulator to simulate a character's expressions and gestures as described above for a sales terminal. Figure 1 shows an example of a sales terminal produced by our company that is in current use. It has a display attached to a touch panel for input, a speaker, and a printer for issuing tickets, receipts, etc. It can handle various kinds of tickets and sales by catalogue, and it can convey product information and sell software directly. Using this display, a user can select and purchase products and services.


Figure 1
Sales terminal.

Figure 2
Example interaction sequence: Start → 1) Start menu → 2) Ticket menu → 3) Selection index → 4) Artist index → 5) Hall & number selection → 6) Confirmation 1 → 7) CM → 8) Response → 9) Seat selection → 10) Payment selection → 11) Input ID → 12) Confirmation 2 → 13) CM 2 → 14) Bye → End.

Table 1
Instruction sheet.
Instructions: Please provide the following information.
  Artists: Rolling Stones
  Date: 11/3 or 11/4
  Place: Dome
  Tickets: 2
  Seat: NA
  Purchase by: Credit

Figure 3
Example scenes in sequence: (a) Start menu (#1), (b) Selection of hall and ticket number (#5), (c) Confirmation (#6), (d) CM (#7).

3.2 CCIF simulator
3.2.1 Example interaction sequences
Figure 2 shows the sequence of interactions for the sales terminal, and Figure 3 shows some example scenes in this sequence. The cartoon character is placed in the center of the display and explains the functions and confirms user inputs. This particular application is for concert ticket purchasing. In the sequence, users operate the sales terminal by following an instruction sheet (see Table 1).
The interaction flow of this terminal is as follows (the numbers correspond to the scene numbers in Figure 2).
1) Start Menu: The user selects the ticket.
2) Ticket Menu: The user selects the concert.
3) Selection Index: The user selects the Artist Index.
4) Artist Index: The user selects the "Rolling Stones."


5) Hall & Number Selection: The user selects the hall name, date, and number of tickets.
6) to 13) The user completes the sequence.
14) Bye: The character thanks the user.

3.2.2 Gestures and facial expressions
After the sequence was determined, five categories of gestures were designed for this application. Categories 2) to 5) are gestures aimed at attracting users. In categories 2) and 5), five actions are displayed randomly to avoid repetition.
The five categories are as follows.
1) Pointing
Pointing gestures are intended to direct the user's attention to a button or space on the display.
2) Gestures made during speech
Hand movements improve communication. For a speaking character, five types of hand gestures are used randomly.
3) Operation
When the character is waiting for an input from the user, he brings out a pen and note pad. After the user performs the input, the character makes a note of the input and replaces the pen and note pad. When the user inquires about seat availability, the character inputs the seat number on the display using the keyboard. When the user inputs his or her ID and password, the character touches the touch panel on the display.
4) Emotional expression
If a seat is sold out, the character takes on an apologetic expression.
5) Unconscious movements
When the character is waiting for a user input, he makes various facial expressions and body movements unconsciously. These movements maintain the sense of autonomy.

3.2.3 Expression units
We designed various expression units that show mouth movements, body movements, neck movements, eye movements, and blinking. To connect two expression units, the end of one unit is joined to the beginning of the next unit via intermediate units which provide a smooth transition of the character's gesture and facial expression. Figure 4 shows three of the expression units: bowing, speaking, and pointing to the upper right of the display.

Figure 4
Example expression units: (a) bowing, (b) speaking, (c) pointing to upper right of display.


There are a total of 61 units. The scenes shown in Figure 3 were made by connecting various expression units with other graphics units and were linked with voice data.

3.3 Experiment using CCIF
We conducted an experiment using a display terminal installed with the example sequence. The experiment is outlined below.
1) The system has 14 scenes (greeting, purchasing tickets, commercials, etc.).
2) The experiment consisted of two trials. In one trial, the character gave guidance during the scenes and explained the buttons by using his voice and making gestures. In the other trial, the character was not displayed on the screen and the user was given voice guidance only.
3) Nine of the scenes (Nos. 1 to 5 and 8 to 11) have menu buttons. These scenes are changed to the next scene by touching one or more buttons. The user can exit the guidance or explanation by touching these buttons. In this experiment, there were no branches, so the user could select only one button at a time. The other buttons were dummy buttons that had no responses.
4) The scenes do not change until the user selects the correct choice as indicated on an instruction sheet. If the user did not understand a scene, the user could not proceed to the next scene.
5) In the other five scenes, there were no buttons and the next scene was started automatically after the guidance had finished.
The 10 subjects involved in this experiment were between 20 and 30 years old and worked in the same laboratory. They attempted to follow the same operation sequence. The experiment was intended to make a comparison between a voice-only interface and the CCIF. We divided the subjects into two groups (see Table 2).

Table 2
Experiment.
Group A: instructions by voice and the character in the 1st trial; instructions by voice only in the 2nd trial.
Group B: instructions by voice only in the 1st trial; instructions by voice and the character in the 2nd trial.

After the experiment, we investigated two points.
1) Scene display time
This is the total amount of time during which a scene is displayed. This time is a measure of the user's performance in each scene. It varies with the complexity of the scene and the number of choices the user can make. Some scenes are intuitively understood after viewing them for only a short time.
2) Post-experiment interview
We asked the users whether they thought the character made the sequence easier to follow and asked them for their subjective feelings about the character.

3.4 Experimental results
3.4.1 Evaluation of display times
Figure 5 shows the experimental results. The x-axis indicates the scene number and the y-axis indicates the average display time of each scene.
Figure 5 (a) shows the results of the first trials for group A and group B. In the 1st scene, the display time with the character displayed is longer than the display time without the character. This gap decreased gradually as the sequence moved to the 4th scene, which indicates that the users were interested and listening attentively to the character speaking. Then, after the users learnt the system operations scene by scene, the listening time decreased.
Figures 5 (c) and (d) show the results for both trials by group A and group B, respectively. In these figures, the results for the 5th scene are particularly interesting. The 5th scene is a complex scene in which users have many choices to make; for example, they must select a hall and ticket number.


Group A's average time in their first trial on this scene, which was done with the assistance of the voice and the character, was almost the same as their average time in the second trial (voice only). However, group B were only given voice guidance in their first trial and then made a 6-second improvement when the character was added in the second trial. We attribute this 6-second improvement not only to the help the character gave to speed up the performance of the slower users but also to the braking effect it had on users who tended to make mistakes because they proceeded too quickly.
These results indicate that the character helped the users understand each scene and operate the terminal correctly.

3.4.2 Interview results
The users confirmed the effectiveness of the CCIF by making the following observations.
1) Intuitive interaction
It is possible to understand the operations intuitively by watching the character performing them on the display.
2) Confirmation
The user can visually check the information he or she inputs.
3) Voice assistance
The character's mouth movements make the voice assistance easier to understand.
4) Attraction
The users were attracted to the character and felt comfortable with the terminals.
Observations 1) and 3) suggest the potential of intuitive communication, which reduces the operation time of interfaces. Observation 2) suggests that the non-verbal communication given by the CCIF was effective.

4. Cartoon character interface composer
4.1 Configuration of composer
We designed an animation composer to help us compose the cartoon character animations for the CCIF. Using the composer, the designer of a terminal sequence can create cartoon character animations with no professional experience.

Figure 5
Evaluation results (average display time in seconds per scene, scenes 1 to 5 and 8 to 11):
(a) First trials of both groups (Group A: voice and character; Group B: voice only)
(b) Second trials of both groups (Group A: voice only; Group B: voice and character)
(c) First and second trials of Group A
(d) First and second trials of Group B
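The per-scene averages plotted in Figure 5 amount to straightforward averaging over the subjects in each trial. The sketch below illustrates that computation; the log format and all numeric values are invented for illustration and are not the experimental data.

```python
# Hedged sketch: averaging per-scene display times across subjects in one trial.
# The log format (scene number -> measured display time in seconds) is assumed.
from collections import defaultdict
from statistics import mean

# Each session maps a scene number to its measured display time in seconds.
group_a_first_trial = [
    {1: 32.0, 2: 18.5, 3: 12.0, 4: 10.5, 5: 25.0},   # subject 1 (illustrative values)
    {1: 28.0, 2: 20.0, 3: 11.0, 4: 9.5, 5: 27.5},    # subject 2 (illustrative values)
]

def average_display_times(sessions):
    """Return the mean display time per scene over all subjects in a trial."""
    per_scene = defaultdict(list)
    for session in sessions:
        for scene, seconds in session.items():
            per_scene[scene].append(seconds)
    return {scene: mean(times) for scene, times in sorted(per_scene.items())}

print(average_display_times(group_a_first_trial))
# e.g. {1: 30.0, 2: 19.25, 3: 11.5, 4: 10.0, 5: 26.25}
```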


Figure 6 shows an outline of the composer. The composer makes character animation a resource of the Visual Basic developer tool. The principal elements are the animated character database, animated character engine, and composer GUI.

4.2 Animated character database
This database contains the properties of the expression units for the gestures and facial expressions of the character. The properties are as follows:
1) Unit ID
Identifies an expression unit.
2) Repetition/talking
Indicates talking or the repetition of a movement (e.g., operating the keyboard, writing with a pencil).
3) Frequency of use
Used to create "natural behavior" and to avoid repetition.
4) Number of contiguous cells
Indicates the number of image cells in an expression unit.
5) Playback order
Indicates the order in which the image cells are displayed.
6) Principal cell ID
Indicates the ID of a cell in the expression unit that has been chosen as the key image of the unit. The key images are used to retrieve the expression units.
We will add emotional information in the future.

4.3 Animated character engine
This engine is an executable component that is used to make the character's movements appear natural. It automatically gives the animation a "natural behavior" and is designed for application by the composer and the terminal.
The principal functions are as follows:
1) Connecting
This function checks the IDs of two connecting expression units and inserts the correct intermediate unit for a natural-looking transition between them.
2) Talking
This function inserts the appropriate number of speaking units by counting the words. (A speaking unit consists of image cells depicting two mouth movements.)
3) Idling
When the terminal is waiting for a user input, this function selects parts automatically and provides animation. It prevents the display from remaining static and improves user-understanding.
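The six database properties listed above map naturally onto a small record type, and the talking function can be pictured as word counting. The field names, the one-speaking-unit-per-word rule, and the helper below are assumptions for illustration; the paper does not specify the actual data layout or counting rule.

```python
# Hedged sketch: an expression-unit record with the six listed properties,
# plus a talking function that inserts speaking units by counting words.
# Field names and the one-unit-per-word rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class ExpressionUnitRecord:
    unit_id: str                 # 1) identifies the expression unit
    repetition_or_talking: str   # 2) "talking" or a repeated movement such as "keyboard"
    frequency_of_use: int        # 3) used to vary behaviour and avoid repetition
    cell_count: int              # 4) number of contiguous image cells in the unit
    playback_order: list[int]    # 5) order in which the cells are displayed
    principal_cell_id: str       # 6) key image used to retrieve the unit

SPEAKING_UNIT = ExpressionUnitRecord(
    unit_id="speak_a", repetition_or_talking="talking", frequency_of_use=0,
    cell_count=2, playback_order=[0, 1], principal_cell_id="speak_a_0",
)

def talking_units(utterance: str) -> list[ExpressionUnitRecord]:
    """Insert one two-cell speaking unit per word of the utterance
    (assumed counting rule; each unit depicts two mouth movements)."""
    return [SPEAKING_UNIT for _ in utterance.split()]

print(len(talking_units("Please select your concert")))   # 4 speaking units
```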

Figure 6
Configuration of character composer (components: OS, Windows API, terminal application, Visual Basic developer environment, composer GUI, animated character engine, animated character database with expression parts and voice clips, and text-to-speech engine).

Figure 7
Character composer.


4.4 Graphic user interface of composer
Figure 7 shows the GUI of our composer.
The terminal designer refers to the animated character database to select sequences of expression units called animation parts and then places them at the appropriate spaces in the time chart on the composer GUI. There is one animation part for each set of communications between the character and the user. Some examples of animation parts are a greeting, a request for input, an instruction, and a confirmation.
Then, the animated character engine determines which animation parts should be connected together and makes the transition between them appear natural.
If the designer wants to use voice clips, they can be attached as text to the animation parts on the composer GUI. Then, the animated character engine selects the appropriate expression units for speaking.
The composed data is used to develop the scene. The animation parts described in the composed data are played back in the target application on the terminals.
This composer reduced the time required to create the CCIF described in Chapter 3 by about 99%.

5. Summary
Humanoid cartoon characters are being used more frequently. We studied the use of such characters for human-machine interface applications. First, we incorporated facial expression clips from well-known animations. We then made experimental applications using these clips and completed a subjective evaluation.
We found that users accept animated characters because they are familiar to them. We also found that an animated character can communicate non-verbally and that natural movement is necessary for users to view the character's behavior as autonomous.
After considering the above results, we designed a simulator for sales terminals so that the Cartoon Character Interface could be put to practical use. We conducted an experiment and completed an evaluation. We found that the CCIF made an existing sales terminal interface easier to use.
Finally, we introduced a cartoon animation composer we developed for producing CCIFs.

Acknowledgments
Some parts of this work were carried out in the Fujitsu Design Laboratory. The authors thank Mr. Kakiuchi and Mr. Watanabe for their contributions to this work and for the fruitful discussions we had with them.

References
1) O. Hasegawa: Computer Vision Inspired by Biological Visual Systems. (in Japanese), Journal of the IPSJ (The Information Processing Society of Japan), 39, 2, pp.133-138 (1998).
2) M. Tani, T. Kamiya, and S. Ichikawa: User Interfaces for Information Strolling in a Digital Library. NEC Research & Development, 37, 4, pp.535-545 (1996).
3) T. Kawabata: Spoken dialog system "Noddy." (in Japanese), NTT R&D, 47, 4, pp.405-410 (1998).
4) K. Ebihara, L. S. Davis, J. Kurumisawa, T. Horprasert, R. I. Haritaoglu, T. Sakaguchi, and J. Ohya: Shall we dance? – Real time 3D control of a CG puppet –. SIGGRAPH'98 Enhanced Realities, conference abstracts and applications, July 1998, p.124.
5) N. Katae and S. Kimura: Natural Prosody Generation for Domain Specific Text-to-speech Systems. ICSLP'96, October 1996.


Satoshi Iwata received the B.S. and M.S. degrees in Electrical Engineering from Keio University, Yokohama, Japan in 1983 and 1985, respectively. He joined Fujitsu Laboratories Ltd., Atsugi, Japan in 1985 and has been engaged in research and development of imaging technology for peripheral systems. He is a member of the Institute of Electronics, Information and Communication Engineers (IEICE) of Japan and a member of the Institute of Electrical and Electronics Engineers (IEEE).

Takashi Morihara received the B.E. degree in Electronic Engineering from Iwate University, Morioka, Japan in 1980. He joined Fujitsu Laboratories Ltd. in 1980 and has been engaged in research and development of peripheral systems.

Takahiro Matsuda graduated in Electrical Engineering from Kurume College of Technology, Kurume, Japan in 1991. He joined Fujitsu Laboratories Ltd. in 1991 and has been engaged in research and development of peripheral systems.
