Learning to Learn Semantic Parsers from Natural Language Supervision
Igor Labutov* (LAER AI, Inc., New York, NY), Bishan Yang* (LAER AI, Inc., New York, NY), Tom Mitchell (Carnegie Mellon University, Pittsburgh, PA)
* Work done while at Carnegie Mellon University.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1676-1690, Brussels, Belgium, October 31 - November 4, 2018. (c) 2018 Association for Computational Linguistics.

Abstract

As humans, we often rely on language to learn language. For example, when corrected in a conversation, we may learn from that correction, over time improving our language fluency. Inspired by this observation, we propose a learning algorithm for training semantic parsers from supervision (feedback) expressed in natural language. Our algorithm learns a semantic parser from users' corrections such as "no, what I really meant was before his job, not after", by also simultaneously learning to parse this natural language feedback in order to leverage it as a form of supervision. Unlike supervision with gold-standard logical forms, our method does not require the user to be familiar with the underlying logical formalism, and unlike supervision from denotations, it does not require the user to know the correct answer to their query. This makes our learning algorithm naturally scalable in settings where existing conversational logs are available and can be leveraged as training data. We construct a novel dataset of natural language feedback in a conversational setting, and show that our method is effective at learning a semantic parser from such natural language supervision.

1 Introduction

Semantic parsing is the problem of mapping a natural language utterance into a formal meaning representation, e.g., an executable logical form (Zelle and Mooney, 1996). Because the space of all logical forms is large but constrained by an underlying structure (i.e., all trees), the problem of learning a semantic parser is commonly formulated as an instance of structured prediction.

Historically, approaches based on supervised learning of structured prediction models emerged as some of the first and still remain common in the semantic parsing community (Zettlemoyer and Collins, 2005, 2009; Kwiatkowski et al., 2010). A well-recognized practical challenge in supervised learning of structured models is that the fully annotated structures (e.g., logical forms) needed for training are often highly labor-intensive to collect. This problem is further exacerbated in semantic parsing by the fact that these annotations can only be produced by people familiar with the underlying logical language, making it challenging for non-experts to construct large-scale datasets.

Over the years, this practical observation has spurred many creative solutions for training semantic parsers that are capable of leveraging weaker forms of supervision, amenable to non-experts. One such weaker form of supervision relies on logical form denotations (i.e., the results of a logical form's execution) - rather than the logical form itself - as the "supervisory" signal (Clarke et al., 2010; Liang et al., 2013; Berant et al., 2013; Pasupat and Liang, 2015; Liang et al., 2016; Krishnamurthy et al., 2017). In Question Answering (QA), for example, this means the annotator needs only to know the answer to a question, rather than the full SQL query needed to obtain that answer. Paraphrasing of utterances already annotated with logical forms is another practical approach to scaling up annotation without requiring experts with knowledge of the underlying logical formalism (Berant and Liang, 2014; Wang et al., 2015).

Although these and similar methods do reduce the difficulty of the annotation task, collecting even these weaker forms of supervision (e.g., denotations and paraphrases) still requires a dedicated annotation event that occurs outside of the normal interactions between the end-user and the semantic parser (although in some cases existing datasets can be leveraged to extract this information). Furthermore, expert knowledge may still be required even if the underlying logical form does not need to be given by the annotator (QA denotations, for example, require the annotator to know the correct answer to the question - an assumption which does not hold for end-users who asked the question with the goal of obtaining the answer). In contrast, our goal is to leverage the natural language feedback and corrections that occur naturally as part of continuous interaction with the non-expert end-user as a training signal for learning a semantic parser.

The core challenge in leveraging natural language feedback as a form of supervision for training semantic parsers, however, is correctly parsing that feedback to extract the supervisory signal embedded in it. Parsing the feedback, just like parsing the original utterance, requires its own semantic parser trained to interpret that feedback. Motivated by this observation, our main contribution in this work is a semi-supervised learning algorithm that learns a task parser (e.g., a question parser) from feedback utterances while simultaneously learning a parser to interpret the feedback utterances. Our algorithm relies on only a small number of annotated logical forms, and can continue learning as more feedback is collected from interactions with the user. Because our model learns from supervision that it simultaneously learns to interpret, we call our approach learning to learn semantic parsers from natural language supervision.

2 Problem Formulation

Formally, the setting proposed in this work can be modelled as follows: (i) the user poses a natural language input u_i (e.g., a question) to the system; (ii) the system parses the user's utterance u_i, producing a logical form ŷ_i; (iii) the system communicates ŷ_i to the user in natural language in the form of a confirmation (i.e., "did you mean ..."); (iv) in response to the system's confirmation, the user generates a feedback utterance f_i, which may be a correction of ŷ_i expressed in natural language. The observed variables in a single interaction are the task utterance u_i, the predicted task logical form ŷ_i, and the user's feedback f_i; the true logical form y_i is hidden. See Figure 1 for an illustration.

    User Utterance:  Before [September 10th], how many places was [Brad] employed?
    System NLG:      find employment [Brad] had, prior to [September 10th]
    User Feedback:   Yes, but I also want you to count how many places Brad was employed at.

Figure 1: Example of (i) the user's original utterance, (ii) the confirmation query generated by inverting the original parse, and (iii) the user's feedback towards the confirmation query (i.e., the original parse).

A key observation that we make from the above formulation is that learning from such interactions can be done effectively in an offline (i.e., non-interactive) setting, using only the logs of past interactions with the user. Our aim is to formulate a model that can learn a task semantic parser (i.e., one parsing the original utterance u) from such interaction logs, without access to the true logical forms (or denotations) of the users' requests.

2.1 Modelling conversational logs

Formally, we propose to learn a semantic parser from conversational logs represented as follows:

    D = {(u_i, ŷ_i, f_i)}_{i=1,...,N}

where u_i is the user's task utterance (e.g., a question), ŷ_i is the system's original parse (logical form) of that utterance, f_i is the user's natural language feedback towards that original parse, and N is the number of dialog turns in the log. Note that the original parse ŷ_i of utterance u_i could come from any semantic parser that was deployed at the time the data was logged - no assumptions are made about how ŷ_i was produced.

Contrast this with the traditional learning settings for semantic parsers, where the user (or annotator) provides the correct logical form y_i or the execution ⟦y_i⟧ of the correct logical form (denotation) (Table 1). In our setting, instead of the correct logical form y_i, we only have access to the logical form produced by whatever semantic parser was interacting with the user at the time the data was collected, i.e., ŷ_i is not necessarily the correct logical form (though it could be). In this work, however, we focus only on corrective feedback (i.e., cases where ŷ ≠ y), as our main objective is to evaluate the ability to interpret rich natural language feedback (which is only given when the original parse is incorrect). Our key hypothesis in this work is that although we do not observe y_i directly, combining ŷ_i with the user's feedback f_i should be sufficient to obtain a better estimate of the correct logical form (closer to y_i), which we can then leverage as a "training example" to improve the task parser (and the feedback parser).
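The structure of the conversational log D described above can be summarized in a short sketch. The record and field names below are illustrative rather than from the paper, and the logical-form string is a made-up stand-in; the key point is that each logged turn contains only the observed triple (u_i, ŷ_i, f_i), while the true parse y_i never appears.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Interaction:
    """One logged dialog turn (u_i, y_hat_i, f_i); the true parse y_i is hidden."""
    utterance: str        # u_i: the user's task utterance (e.g., a question)
    predicted_parse: str  # y_hat_i: the deployed parser's logical form for u_i
    feedback: str         # f_i: the user's natural language feedback on y_hat_i

# D = {(u_i, y_hat_i, f_i)}_{i=1,...,N}: an offline log of N dialog turns,
# here populated with the single example from Figure 1 (parse string invented).
D: List[Interaction] = [
    Interaction(
        utterance="Before [September 10th], how many places was [Brad] employed?",
        predicted_parse="find_employment(person=Brad, before=September 10th)",
        feedback="Yes, but I also want you to count how many places Brad was employed at.",
    ),
]

def corrective_turns(log: List[Interaction]) -> List[Interaction]:
    """The paper trains only on corrective feedback (y_hat != y). Since y is
    hidden, corrections must be detected from the feedback text itself; the
    prefix check below is a trivial placeholder heuristic, not the paper's."""
    return [t for t in log if not t.feedback.lower().startswith("yes, that's right")]
```

Note that the feedback in Figure 1 begins with "Yes, but ...": it partially confirms the parse while still correcting it, which is exactly why interpreting the feedback requires its own parser rather than a surface heuristic.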
    Supervision                Dataset
    Full logical forms         {(u_i, y_i)}_{i=1,...,N}
    Denotations                {(u_i, ⟦y_i⟧)}_{i=1,...,N}
    Binary feedback            {(u_i, ⟦ŷ_i⟧ = ⟦y_i⟧)}_{i=1,...,N}
    NL feedback (this work)    {(u_i, ŷ_i, f_i)}_{i=1,...,N}

Table 1: Different types of supervision used in the literature for training semantic parsers, and the corresponding data needed for each type of supervision.

utterance to agree with the "stronger" model that does (i.e., using the feedback parser's prediction as a noisy "label" to bootstrap the task parser), while conversely encouraging a more "complex" model to agree with a "simpler" model (i.e., using the simpler task parser model as a "regularizer" for the
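The four supervision regimes in Table 1 differ only in what is stored per training example. A minimal sketch, in which the utterance, logical forms, and the denotation function are all invented for illustration (the paper does not specify them):

```python
# Per-example data under each supervision regime in Table 1.
# All strings and values below are hypothetical.

u     = "Before [September 10th], how many places was [Brad] employed?"
y     = "count(employment(Brad, before=2017-09-10))"  # true logical form (hypothetical)
y_hat = "employment(Brad, before=2017-09-10)"         # deployed parser's (incorrect) parse
f     = "Yes, but I also want you to count how many places Brad was employed at."

def denote(logical_form: str) -> object:
    """Stand-in for executing a logical form against a database (⟦·⟧)."""
    return {
        "count(employment(Brad, before=2017-09-10))": 3,
        "employment(Brad, before=2017-09-10)": ["ACME", "LAER", "CMU"],
    }[logical_form]

full_logical_form = (u, y)                            # annotator must know the formalism
denotation        = (u, denote(y))                    # annotator must know the answer
binary_feedback   = (u, denote(y_hat) == denote(y))   # user only says right/wrong
nl_feedback       = (u, y_hat, f)                     # this work: no y_i needed
```

The contrast is visible in the last tuple: natural language feedback requires neither the logical formalism (row 1), nor the correct answer (row 2), yet carries far more signal than a single right/wrong bit (row 3).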