Asking Clarification Questions in Knowledge-Based Question Answering
Jingjing Xu1,∗ Yuechen Wang2, Duyu Tang3, Nan Duan3, Pengcheng Yang1, Qi Zeng1, Ming Zhou3, Xu Sun1
1 MOE Key Lab of Computational Linguistics, School of EECS, Peking University
2 University of Science and Technology of China
3 Microsoft Research Asia, Beijing, China
{jingjingxu,yang_pc,pkuzengqi,[email protected]
[email protected]
{dutang,nanduan,[email protected]

Abstract

The ability to ask clarification questions is essential for knowledge-based question answering (KBQA) systems, especially for handling ambiguous phenomena. Despite its importance, clarification has not been well explored in current KBQA systems. Further progress requires supervised resources for training and evaluation, and powerful models for clarification-related text understanding and generation. In this paper, we construct a new clarification dataset, CLAQUA, with nearly 40K open-domain examples. The dataset supports three serial tasks: given a question, identify whether clarification is needed; if yes, generate a clarification question; then predict answers based on external user feedback. We provide representative baselines for these tasks and further introduce a coarse-to-fine model for clarification question generation. Experiments show that the proposed model achieves better performance than strong baselines. Further analysis demonstrates that our dataset brings new challenges and that several problems remain unsolved, like reasonable automatic evaluation metrics for clarification question generation and powerful models for handling entity sparsity.1

[Figure 1: An example of a clarification question in KBQA. There are two entities named "Midori" using different programming languages (C and M#), which confuses the system. The system asks "When you say the source code language used in the program Midori, are you referring to web browser Midori or operating system Midori?" and the user replies "I mean the first one."]

1 Introduction

Clarification is an essential ability for knowledge-based question answering, especially when handling ambiguous questions (Demetras et al., 1986). In real-world scenarios, ambiguity is a common phenomenon, as many questions are not clearly articulated, e.g., "What are the languages used to create the source code of Midori?" in Figure 1.
There are two "Midori" entities using different programming languages, which confuses the system. For such ambiguous or confusing questions, it is hard to give satisfactory responses directly unless systems can ask clarification questions to confirm the participant's intention. Therefore, this work explores how to use clarification to improve current KBQA systems.

We introduce an open-domain clarification corpus, CLAQUA, for KBQA. Unlike previous clarification-related datasets with limited annotated examples (De Boni and Manandhar, 2003; Stoyanchev et al., 2014) or in specific domains (Li et al., 2017; Rao and Daumé III, 2018), our dataset covers various domains and supports three tasks. A comparison of our dataset with relevant datasets is shown in Table 1. Our dataset considers two kinds of ambiguity, for single-turn and multi-turn questions. In the single-turn case, an entity name refers to multiple possible entities in a knowledge base while the current utterance lacks the necessary identifying information. In the multi-turn case, ambiguity mainly comes from omission, where a pronoun refers to multiple possible entities from the previous conversation turn.

∗ The work was done while Jingjing Xu and Yuechen Wang were interns at Microsoft Research Asia.
1 The dataset and code will be released at https://github.com/msra-nlc/MSParS_V2.0

Dataset | Domain | Size | Task
De Boni and Manandhar (2003) | Open domain | 253 | Clarification question generation.
Stoyanchev et al. (2014) | Open domain | 794 | Clarification question generation.
Li et al. (2017) | Movie | 180K | Learning to generate responses based on previous clarification questions and user feedback in dialogue.
Guo et al. (2017) | Synthetic | 100K | Learning to ask clarification questions in reading comprehension.
Rao and Daumé III (2018) | Operating system | 77K | Ranking clarification questions in an online QA forum, StackExchange.2
CLAQUA (our dataset) | Open domain | 40K | Clarification in KBQA, supporting clarification identification, clarification question generation, and clarification-based question answering.

Table 1: Comparison of our dataset with relevant datasets. The first two datasets focus on generating clarification questions, but their small size limits their applications. The middle three datasets are either not designed for KBQA or limited to specific domains. Unlike them, we present an open-domain dataset for KBQA.

2 An online QA community.

Unlike the CSQA dataset (Saha et al., 2018), which constructs clarification questions based on predicate-independent templates, our clarification questions are predicate-aware and more diverse. Based on the clarification-raising pipeline, we formulate three tasks in our dataset: clarification identification, clarification question generation, and clarification-based question answering. These tasks can naturally be integrated into existing KBQA systems, as sketched in the code below. Since asking too many clarification questions significantly lowers the conversation quality, we first design the task of identifying whether clarification is needed. If no clarification is needed, systems can respond directly. Otherwise, systems need to generate a clarification question to request more information from the participant. Then, external user feedback is used to predict answers.

Our main contribution is constructing a new clarification dataset for KBQA. To build high-quality resources, we elaborately design a data annotation pipeline, which can be divided into three steps: sub-graph extraction, ambiguous question annotation, and clarification question annotation. We design different annotation interfaces for single-turn and multi-turn cases. We first extract "ambiguous sub-graphs" from a knowledge base as raw materials. To enable systems to perform robustly across domains, the sub-graphs are extracted from an open-domain KB, covering domains like music, TV, book, film, etc. Based on the sub-graphs, annotators are required to write ambiguous questions and clarification questions.

We also contribute by implementing representative neural networks for the three tasks and further developing a new coarse-to-fine model for clarification question generation. Take the multi-turn setting as an example: in the task of clarification identification, the best-performing system obtains an accuracy of 86.6%; for clarification question generation, our proposed coarse-to-fine model achieves a BLEU score of 45.02, better than strong baseline models; for clarification-based question answering, the best accuracy is 74.7%. We also conduct a detailed analysis of the three tasks and find that our dataset brings new challenges that need to be further explored, like reasonable automatic evaluation metrics and powerful models for handling the sparsity of entities.
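To make the serial structure of the three tasks concrete, the following is a minimal sketch of how they compose in a KBQA system. All component names here (needs_clarification, generate_clarification, answer, ask_user) are hypothetical stand-ins for the trained models and user interaction, illustrating the control flow described above rather than the paper's released implementation.

from typing import Callable, Optional

def answer_with_clarification(
    question: str,
    needs_clarification: Callable[[str], bool],    # task 1: clarification identification
    generate_clarification: Callable[[str], str],  # task 2: clarification question generation
    answer: Callable[[str, Optional[str]], str],   # task 3: clarification-based QA
    ask_user: Callable[[str], str],                # obtains external user feedback
) -> str:
    """Serial composition of the three CLAQUA tasks."""
    # Task 1: only ask when needed, since too many clarification
    # questions lower the conversation quality.
    if not needs_clarification(question):
        return answer(question, None)  # unambiguous: respond directly

    # Task 2: generate a clarification question, e.g. "... are you
    # referring to web browser Midori or operating system Midori?"
    clarification_question = generate_clarification(question)

    # Task 3: predict the answer conditioned on the user's feedback.
    feedback = ask_user(clarification_question)
    return answer(question, feedback)

# Toy usage with stand-in components (a real system would plug in
# the trained classifier, generator, and QA models):
reply = answer_with_clarification(
    "What are the languages used to create the source code of Midori?",
    needs_clarification=lambda q: "Midori" in q,
    generate_clarification=lambda q: "Do you mean the web browser or the operating system?",
    answer=lambda q, feedback: "C" if feedback == "the web browser" else "M#",
    ask_user=lambda cq: "the web browser",
)
print(reply)  # -> C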
2 Data Collection

The construction process consists of three steps: sub-graph extraction, ambiguous question annotation, and clarification question annotation. We design different annotation interfaces for single-turn and multi-turn cases.

2.1 Single-Turn Annotation

Sub-graph Extraction. As shown in Figure 2, we extract ambiguous sub-graphs from an open-domain knowledge base, Satori. For simplification, we set the maximum number of ambiguous entities to 2. In single-turn cases, we focus on shared-name ambiguity, where two entities have the same name and there is a lack of necessary distinguishing information. To construct such cases, we extract two entities sharing the same entity name and the same predicate, where a predicate represents the relation between two entities. The sub-graphs provide reference for human annotators to write diverse ambiguous questions based on the shared predicates; a sketch of this extraction step follows Table 3 below.

[Figure 2: An extracted ambiguous sub-graph in the single-turn case. Two entities share the same name "Midori" and the same predicate "computer.software.language_used", one linked to C and the other to M#.]

Shared Name | Midori
Predicate #1 | computer.software.language_used
Predicate Description | A property relating an instance of computer.software to a programming language that was used to create the source code for the software.
Qa | What are the languages used to create the source code of Midori?

Table 2: An ambiguous question annotation example.

E1 Name | Midori
E1 Type | computer.software; computer.web_browser; media_common.creative_work; ...
E1 Description | Midori is a free and open-source light-weight web browser. It uses the WebKit rendering engine and the GTK+ 2 or GTK+ 3 ...
E2 Name | Midori
E2 Type | computer.operating_system; computer.software
E2 Description | Midori was the code name for a managed code operating system being developed by Microsoft with joint effort of Microsoft Research ...
Qa | What are the languages used to create the source code of Midori?
Qc | When you say the source code language used in the program Midori, are you referring to [web browser Midori] or [operating system Midori]?

Table 3: A clarification question annotation example.
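Since Satori is not publicly available, the following is a minimal sketch of the sub-graph extraction step under stated assumptions: kb_rows is illustrative stand-in data (the entity ids are invented), and the grouping key (entity name, predicate) mirrors the description above rather than reproducing the paper's actual extraction code.

from collections import defaultdict
from itertools import combinations

# (entity_id, entity_name, predicate) rows as they might be read from a
# triple store; these rows are illustrative, not real Satori data.
kb_rows = [
    ("m.browser_midori", "Midori", "computer.software.language_used"),
    ("m.os_midori",      "Midori", "computer.software.language_used"),
    ("m.liqueur_midori", "Midori", "food.drink.ingredients"),
]

def extract_ambiguous_subgraphs(rows, max_entities=2):
    """Group entities by (shared name, shared predicate) and pair them up."""
    by_name_and_predicate = defaultdict(set)
    for entity_id, name, predicate in rows:
        by_name_and_predicate[(name, predicate)].add(entity_id)

    subgraphs = []
    for (name, predicate), entities in by_name_and_predicate.items():
        if len(entities) < 2:
            continue  # no shared-name ambiguity under this predicate
        # Cap the number of ambiguous entities (the paper uses 2).
        for pair in combinations(sorted(entities), max_entities):
            subgraphs.append({
                "shared_name": name,
                "predicate": predicate,
                "entities": list(pair),
            })
    return subgraphs

# Yields one sub-graph pairing the two software entities named "Midori";
# annotators then write an ambiguous question (Qa) and a clarification
# question (Qc) from such sub-graphs, as in Tables 2 and 3.
print(extract_ambiguous_subgraphs(kb_rows))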