A Data-Driven Model of Explanations for a Chatbot That Helps to Practice Conversation in a Foreign Language
Sviatlana Höhn
AI Minds
Vianden, Luxembourg
[email protected]

Proceedings of the SIGDIAL 2017 Conference, pages 395–405, Saarbrücken, Germany, 15-17 August 2017. © 2017 Association for Computational Linguistics

Abstract

This article describes a model of other-initiated self-repair for a chatbot that helps to practice conversation in a foreign language. The model was developed using a corpus of instant messaging conversations between German native and non-native speakers. Conversation Analysis helped to create computational models from a small number of examples. The model has been validated in an AIML-based chatbot. Unlike typical retrieval-based dialogue systems, the explanations are generated at run-time from a linguistic database.

1 Introduction

Conversational agents tailored for communication with language learners are studied in the area of Communicative Intelligent Computer-Assisted Language Learning (CommICALL). Starting with the idea of creating a machine that behaves like a language expert in an informal chat, specific interactional practices need to be described where linguistic identities of interaction participants become visible. Such practices include repair with a linguistic trouble source, where non-native speakers address troubles in comprehension or production (Danilava et al., 2013).

Repair is a building block of conversation that helps to deal with troubles in understanding and production of talk. Depending on who produced a trouble source and who initiates a repair, we distinguish between self-initiated and other-initiated repair. A repair can be carried out by the same speaker who produced the trouble source or by the other speaker (self-repair and other-repair).

Because there is a preference for self-repair, other-initiated self-repair is the most frequent repair type. It may become even more frequent in conversations where one of the speakers is more knowledgeable in some matters than the other, for instance in mastering professional terminology or communication in a second language not yet fully mastered. Therefore it is crucial for conversational agents acting in such environments to recognize and to handle repair initiations properly.

Repair sequences where the machine is the trouble-speaker are the focus of this article. The learner initiates a repair in response to something not (fully) understood, and the machine explains. This type of repair corresponds to other-initiated self-repair with a linguistic trouble source where the language learner is the recipient of the trouble talk (OISRL).

CommICALL research is mainly grounded in Second Language Acquisition (SLA) theory (Petersen, 2010; Wilske, 2014). The model of explanation sequences, so-called negotiations of meaning, introduced by Varonis and Gass (1985) received a lot of attention and was widely reused in subsequent CALL research (Fredriksson, 2012; Satomi Kawaguchi, 2012). The model includes a trigger, an indicator, a response and a reaction to response. However, this model has been criticized for its view on repair as something "marring the flow" of a conversation and for being inapplicable to non-institutional settings (Markee, 2000). Although repair in native/non-native speaker talk has been intensively studied in Conversation Analysis (CA) (Markee, 2000; Gardner and Wagner, 2004; Hosoda, 2006), the results have not been operationalized for an implementation in a CommICALL system. Therefore, this article has two objectives:

1. Identify typical interactional resources employed for initiation and carry-out of repair using methods of Conversation Analysis.

2. Create a computational model of the repair of the type OISRL to be implemented in a CommICALL application.

We use a dataset of German native/non-native instant messaging conversations (Höhn, 2015) to analyze practices of repair in native/non-native speaker informal chat. All repair sequences have been annotated. Collections of similar cases have been built. Interactional resources used by language learners for repair initiations have been analyzed. Patterns of repair initiations have been obtained through generalization. In this way, rules for recognition of repair initiations have been created. An implementation case study was set up to validate the resulting computational models in an AIML-based chatbot.

2 Repair in Conversational Agents

Non-native speakers are usually not considered as the main user group of general-purpose dialogue systems. The assumption dominates that human users understand everything that an agent may say. This assumption is reflected in the two main problems addressed by research on repair for conversational agents: dealing with the user's self-corrections, which may make speech recognition difficult, and managing the system's lack of information needed to satisfy the user's request.

These two research areas may be found under the keywords self-repairs, sometimes speech repairs (Zwarts et al., 2010) or disfluencies (Shriberg, 1994; Martin and Jurafsky, 2009), and clarification dialogues or clarification requests (CRs) in AI and NLP publications. What is referred to by the term self-repair in the speech recognition domain corresponds to the user's self-initiated self-repair in CA terminology.

Shriberg (1994) uses the term reparandum to refer to what is called trouble source in CA. The model considers pauses (moment of interruption) and lexicalised means to focus on the replacement (editing terms). These are interactional resources used by speakers to signal trouble in production and to pre-announce a coming replacement.

The term clarification dialogues is mostly used to describe repairs dealing with insufficient information available for a system after speech recognition and language understanding (Kruijff et al., 2008; Jian et al., 2010; Buß and Schlangen, 2011). The term miscommunication was introduced to distinguish between non-understandings (the system could not match the user's input to a representation) and misunderstandings (the system matched the user's input to a wrong representation) (Dzikovska et al., 2009; Meena et al., 2015). These repair types correspond to other-initiated self-repair when the user is the trouble-speaker.

Clarification requests in AI and NLP publications should not be confused with clarification requests in SLA publications, where this term is used to refer to only a particular form of corrective feedback (Lyster et al., 2013), or to a dialogue move in meaning negotiations (Varonis and Gass, 1985).

Emphasising the importance of correct recognition of the user's clarification requests, Purver (2004) provides a study of various types of clarification requests; see also follow-up publications (Purver, 2006; Ginzburg et al., 2007; Ginzburg, 2012). Purver (2004) uses the HPSG framework to cover the main classes of the identified classification scheme. Because different functions might be expressed by a clarification request of the same form, Purver (2004) analyses the clarification readings to cover the correspondence between the form and the meaning of the repair initiations. However, several points of critique arise. For instance, some utterances may be formatted as repair initiations but have a different interactional function, such as expressing surprise and topicalization (not listed as possible readings). In addition, repair initiations designed to deal with troubles in understanding are put together with strategies for dealing with troubles in production (e.g. gap fillers). From the CA perspective, the gap fillers of Purver (2004) correspond to self-initiated other-repair and are thus sequentially completely different. Therefore, modifications in the classification proposed by Purver (2004) are needed in order to better comply with studies in CA, and therefore better reflect the state-of-the-art in CA-informed dialogue research.

Example 2.1. Different types of causes for clarification used in (Schlangen, 2004, Ex. (12)).

a. A: I ate a Pizza with chopsticks the other day
   B: A Pizza with chopsticks on it?
b. A: Please give me a double torx.
   B: What's a torx?
c. A: Please give me a double torx.
   B: Which one?
d. A: Every wire has to be connected to a power source.
   B: Each to a different one, or can it be the same for every wire?

Schlangen (2004) analyses communication problems leading to clarification requests, focusing on trouble source types (what caused the communication problem). Schlangen (2004) makes clear that a more fine-grained classification of causes for requesting clarification in dialogue may be needed, specifically, a model distinguishing between different cases in Example 2.1.

From the CA perspective, speakers' linguistic and professional identities and preferences play a role in a speaker's selection of a specific format of a repair initiation. Speaker B in Example 2.1.b positions herself as a novice in torx matters with her repair initiation, while speaker B in Example 2.1.c positions herself as knowledgeable in torx matters. In addition, utterances may be designed as repair initiations, but may in fact have a different function. For instance, the repair initiation produced by B in Example 2.1.a may be analysed as a joke not requiring any explanation.

Other-initiated self-repair when the machine is [...]

[...] source, that is to signal trouble and to reference the trouble source. Turn formats are particularly important for the future recognition of repair initiations by chatbots.

3.1 Repair initiations

Two abstract types of repair other-initiations were identified in the dataset: statements of non-understanding, where a part of the partner's utterance is marked as unclear, and candidate understandings, where the speaker's own version of understanding of the problematic unit is provided. Non-understandings require an explanation of the trouble source in the repair, while candidate understandings require a yes/no answer.

Repair other-initiations were found at two distinct types of position: immediate and delayed. The first type comes immediately after the trouble source turn.
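The distinction between the two abstract types of repair other-initiation, and the kind of repair each one calls for, can be illustrated with a minimal Python sketch. It is not the model described in this article: the surface patterns (`was bedeutet ...`, `meinst du ...`), the class names and the `classify` function are hypothetical and were invented for illustration only. The only part taken from the text is the mapping from initiation type to required repair.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InitiationType(Enum):
    NON_UNDERSTANDING = "statement of non-understanding"
    CANDIDATE_UNDERSTANDING = "candidate understanding"


# Required repair per initiation type, as stated in the text:
# non-understandings need an explanation, candidate understandings a yes/no answer.
EXPECTED_REPAIR = {
    InitiationType.NON_UNDERSTANDING: "explanation",
    InitiationType.CANDIDATE_UNDERSTANDING: "yes/no answer",
}


@dataclass
class RepairInitiation:
    kind: InitiationType
    trouble_source: str  # the unit of the partner's talk that is referenced


# ASSUMPTION: toy German surface patterns, not taken from the corpus.
PATTERNS = [
    (InitiationType.NON_UNDERSTANDING,
     re.compile(r"^was (?:bedeutet|ist) (?P<ts>.+?)\?*$", re.IGNORECASE)),
    (InitiationType.CANDIDATE_UNDERSTANDING,
     re.compile(r"^(?:meinst du|ist das) (?P<ts>.+?)\?*$", re.IGNORECASE)),
]


def classify(turn: str) -> Optional[RepairInitiation]:
    """Return the recognised repair initiation, or None if no pattern matches."""
    text = turn.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return RepairInitiation(kind, match.group("ts").strip())
    return None


if __name__ == "__main__":
    for turn in ("Was bedeutet Torx?", "Meinst du einen Schraubendreher?"):
        found = classify(turn)
        print(turn, "->", found.kind.value, "/", EXPECTED_REPAIR[found.kind])
```

A real recognizer would also have to handle utterances that are formatted as repair initiations but serve another function, such as the joke reading of Example 2.1.a; this sketch ignores that problem.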