Exploring Miscommunication and Collaborative Behaviour in Human-Robot Interaction

Theodora Koulouri
Department of Information Systems and Computing
Brunel University
Middlesex UB8 3PH
[email protected]

Stanislao Lauria
Department of Information Systems and Computing
Brunel University
Middlesex UB8 3PH
[email protected]

Proceedings of SIGDIAL 2009: the 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 111–119, Queen Mary University of London, September 2009. © 2009 Association for Computational Linguistics

Abstract

This paper presents the first step in designing a speech-enabled robot that is capable of natural management of miscommunication. It describes the methods and results of two WOz studies, in which dyads of naïve participants interacted in a collaborative task. The first WOz study explored human miscommunication management. The second study investigated how shared visual space and monitoring shape the processes of feedback and communication in task-oriented interactions. The results provide insights for the development of human-inspired and robust natural language interfaces in robots.

1 Introduction

Robots are now escaping laboratory and industrial environments and moving into our homes and offices. Research activities have focused on offering richer and more intuitive interfaces, leading to the development of several practical systems with Natural Language Interfaces (NLIs). However, there are numerous open challenges arising from the nature of the medium itself as well as the unique characteristics of Human-Robot Interaction (HRI).

1.1 Miscommunication in Human-Robot Interaction

HRI involves embodied interaction, in which humans and robots coordinate their actions sharing time and space. As most speech-enabled robots remain in the labs, people are generally unaware of what robots can understand and do, resulting in utterances that are out of the functional and linguistic domain of the robot. Physical co-presence will lead people to make strong but misplaced assumptions of mutual knowledge (Clark, 1996), increasing the use of underspecified referents and deictic expressions. Robots operate in and manipulate the same environment as humans, so failure to prevent and rectify errors has potentially severe consequences. Finally, these issues are aggravated by unresolved challenges with automatic speech recognition (ASR) technologies. In conclusion, miscommunication in HRI grows in scope, frequency and costs, impelling researchers to acknowledge the necessity of integrating miscommunication into the design process of speech-enabled robots.

1.2 Aims of study

The goal of this study is two-fold: first, to incorporate "natural" and robust miscommunication management mechanisms (namely, prevention and repair) into a mobile personal robot which is capable of learning by means of natural language instruction (Lauria et al., 2001); secondly, to offer some insights that are relevant for the development of NLIs in HRI in general. This research is largely motivated by models of human communication. It is situated within the language-as-action tradition, and its approach is to explore and build upon how humans manage miscommunication.

2 Method

We designed and performed two rounds of Wizard of Oz (WOz) simulations. Given that the general aim of the study is to determine how robots should initiate repair and provide feedback in collaborative tasks, the simulations departed from the typical WOz methodology in that the wizards were also naïve participants. The domain of the task is navigation. In particular, the user guided the robot to six designated locations in a simulated town. The user had full access to the map, whereas the wizard could only see the surrounding area of the robot. Thus, the wizard relied on the user's instructions on how to reach the destination. In this section we outline the aim and approach of each WOz study, the materials used and the experimental procedure. Sections 4 and 5 focus on each study individually and their results.

2.1 The first WOz study

This study is a continuation of previous work by the authors (Koulouri and Lauria, 2009). In that study, the communicative resources of the wizard were incrementally restricted, from "normal" dialogue capabilities towards the capabilities of a dialogue system, in three experimental conditions:

- The wizard simulates a super-intelligent robot capable of using unconstrained, natural language with the user (henceforth, the Unconstrained condition).
- The wizard can select from a list of default responses but can also ask for clarification or provide task-related information (henceforth, the Semi-Constrained condition).
- The wizard is restricted to choosing from a limited set of canned responses, similar to a typical spoken dialogue system (SDS).

The current study investigates the first two conditions and presents new findings.

2.2 The second WOz study

The second round of WOz experiments explored the effects of monitoring and shared visual information on the dialogue.

2.3 Set-up

A custom Java-based system was developed, designed to simulate the existing prototype (the mobile robot). The system consisted of two applications which sent and received coordinates and dialogue and were connected using the TCP/IP protocol over a LAN. The system kept a log of the interaction and of the robot's coordinates.

The user's interface displayed the full map of the town (Figure 1). The dialogue box was below the map. Similar to an instant messaging application, the user could type his/her messages and see the robot's responses appearing in the lower part of the box. In the first WOz study, the user's interface included a small "monitor" in the upper right corner of the screen that displayed the current surrounding area of the robot, but not the robot itself. Then, for the purposes of the second study, this feature was removed (see Figure 1 in Appendix A).

Figure 1. The user's interface.

The wizard's interface was modified according to the two experimental conditions. For both conditions, the wizard could only see a fraction of the map: the area around the robot's current position. The robot was operated by the wizard using the arrow keys on the keyboard. The dialogue box of the wizard displayed the most recent messages of both participants as well as a history of the user's messages. The buttons on the right side of the screen simulated the actual robot's ability to remember previous routes: the wizard clicked on the button that corresponded to a known route and the robot automatically executed it. In the interface for the Unconstrained condition, the wizard could freely type and send messages (Figure 2).

Figure 2. The wizard's interface in the Unconstrained condition.

In the version for the Semi-Constrained condition, the wizard could interact with the user in two ways. First, they could click on the buttons, situated on the upper part of the dialogue box, to automatically send the canned responses, "Hello", "Goodbye", "Yes", "No", "Ok", and the problem-signalling responses, "What?", "I don't understand" and "I cannot do that". The second way was to click on the "Robot Asks Question" and "Robot Gives Info" buttons, which allowed the wizard to type his/her own responses (see Figure 2 in Appendix A).

2.4 Procedure

A total of 32 participants were recruited, 16 users and 16 wizards. The participants were randomly assigned to the studies, the experimental conditions and the roles of wizard or user. The pairs were seated in different rooms, each equipped with a desktop PC. The wizards were given a short demonstration and a trial period to familiarise themselves with the operation of the system, and were also informed about whether the users would be able to monitor them. The users were told that they would interact with a robot via a computer interface; this robot was very fluent in understanding spatial language and could give appropriate responses, and it could learn routes but had limited vision. The users were asked to begin each task whenever they felt ready by clicking on the links on their computer screen, to start the interaction with "Hello", which opened the wizard's application, and to end it with "Goodbye", which closed both applications. The participants received verbal and written descriptions of the experiment. They were not given any specific guidelines on how to interact or what routes to take.

ID    (x,y@time)            T.S.   DA         Utterance
...                                            ... back?
U7    (351,68@10:46:55)            instruct   go forward and at the crossroads keep going forward and the tube is at the end of the road
R8    (351,0@10:47:14)      WE     explain    Out of bounds.
R9    (351,608@10:47:47)           query-w    Where to go?
U10   (364,608@10:48:12)           instruct   the tube is in front of you
R11   (402,547@10:48:23)    BOT    query-yn   Is it this one?
U12   (402,547@10:49:7)     SUC    reply-y    yes it is.

Table 1. Example of an annotated dialogue. ID denotes the speaker (User or Robot), T.S. stands for task status and MISC for miscommunication.

3.1 Annotation of dialogue acts

The DAs in the corpus were annotated following the HCRC coding scheme (Carletta et al., 1996). Motivated by Skantze (2005), the last column in Table 1 contains information on the explicitness of the response. This feature was only relevant for repair initiations by the wizards. For instance, responses like "What?" and the ones in Table 3 were considered to be explicit (EX) signals of miscommunication, whereas lines 2 and 4 in the dialogue above were labelled as implicit (IMP).

3.2 Annotation of task execution status

The coordinates (x,y) of the robot's position re-