Exploiting Explicit Paths for Multi-Hop Reading Comprehension
Total Page:16
File Type:pdf, Size:1020Kb
Exploiting Explicit Paths for Multi-hop Reading Comprehension Souvik Kunduy∗ and Tushar Khotz and Ashish Sabharwalz and Peter Clarkz yDepartment of Computer Science, National University of Singapore zAllen Institute for Artificial Intelligence, Seattle, WA, U.S.A. [email protected], ftushark,ashishs,[email protected] Abstract Query:(always breaking my heart, record label, ?) Supporting Passages: We propose a novel, path-based reasoning (p1) “Always Breaking My Heart” is the second sin- approach for the multi-hop reading compre- gle from Belinda Carlisle’s A Woman and a Man al- hension task where a system needs to com- bum , released in 1996 ( see 1996 in music ) . It ... (p2) A Woman and a Man is the sixth studio al- bine facts from multiple passages to answer bum by American singer Belinda Carlisle, released a question. Although inspired by multi-hop in the United Kingdom on September 23, 1996 by reasoning over knowledge graphs, our pro- Chrysalis Records (then part of the EMI Group, ... posed approach operates directly over unstruc- Candidates: chrysalis records, emi group, virgin tured text. It generates potential paths through records, ... passages and scores them without any di- Answer: chrysalis records rect path supervision. The proposed model, Paths: named PathNet, attempts to extract implicit (“Always Breaking My Heart” ... single from ... A relations from text through entity pair repre- Woman and a Man) sentations, and compose them to encode each (A Woman and a Man ... released ... by ... Chrysalis path. To capture additional context, Path- Records) Net also composes the passage representations Figure 1: Example illustrating our proposed path ex- along each path to compute a passage-based traction and reasoning approach. representation. Unlike previous approaches, our model is then able to explain its reason- ing via these explicit paths through the pas- requiring a system to combine information from sages. We show that our approach outper- forms prior models on the multi-hop Wikihop multiple sentences in order to arrive at the answer, dataset, and also can be generalized to apply referred to as multi-hop reasoning. to the OpenBookQA dataset, matching state- Multi-hop reasoning has been studied for ques- of-the-art performance. tion answering (QA) over structured knowledge graphs (Lao et al., 2011; Guu et al., 2015; Das 1 Introduction et al., 2017). Many of the successful models ex- Many reading comprehension (RC) datasets (Ra- plicitly identify paths in the knowledge graph that jpurkar et al., 2016; Trischler et al., 2017; Joshi led to the answer. A strength of these models et al., 2017) have been proposed recently to eval- is high interpretability, arising from explicit path- arXiv:1811.01127v2 [cs.CL] 8 Jul 2019 uate a system’s ability to answer a question from based reasoning over the underlying graph struc- a given text passage. However, most of the ques- ture. However, they cannot be directly applied to tions in these datasets can be answered by using QA in the absence of such structure. only a single sentence or passage. As a result, Consequently, most multi-hop RC models over systems designed for these tasks may not be able unstructured text (Dhingra et al., 2017; Hu et al., to compose knowledge from multiple sentences or 2018) extend standard attention-based models passages, a key aspect of natural language under- from RC by iteratively updating the attention to standing. To remedy this, new datasets (Weston indirectly “hop” over different parts of the text. et al., 2015; Welbl et al., 2018; Khashabi et al., Recently, graph-based models (Song et al., 2018; 2018a; Mihaylov et al., 2018) have been proposed, Cao et al., 2018) have been proposed for the Wik- ∗ Work performed while doing an internship at the Allen iHop dataset (Welbl et al., 2018). Nevertheless, Institute for Artificial Intelligence. these models still only implicitly combine knowl- edge from all passages, and are therefore unable to based approaches. provide explicit reasoning paths. We make three main contributions: We propose an approach1 for multiple choice (1) A novel path-based reasoning approach for RC that explicitly extracts potential paths from multi-hop QA over text that produces explanations text (without direct path supervision) and encodes in the form of explicit paths; (2) A model, PathNet, the knowledge captured by each path. Figure1 which aims to extract implicit relations from text shows how to apply this approach to an exam- and compose them; and (3) Outperforming prior ple in the WikiHop dataset. It shows two sam- models on the target WikiHop dataset2 and gen- ple paths connecting an entity in the question eralizing to the open-domain science QA dataset, (Always Breaking My Heart) to a candidate an- OpenBookQA, with performance comparable to swer (Chrysalis Records) through a singer (Be- prior models. linda Carlisle) and an album (A Woman and a Man). 2 Related Work To encode the path, our model, named PathNet, We summarize related work in QA over text, semi- first aims to extract implicit (latent) relations be- structured knowledge, and knowledge graphs. tween entity pairs in a passage based on their con- Multi-hop RC. Recent datasets such as textual representations. For example, it aims to ex- bAbI (Weston et al., 2015), Multi-RC (Khashabi tract the implicit single from relation between the et al., 2018a), WikiHop (Welbl et al., 2018), and song and the name of the album in the first pas- OpenBookQA (Mihaylov et al., 2018) have en- sage. Similarly, it extracts the released by relation couraged research in multi-hop QA over text. The between the album and the record label in the sec- resulting multi-hop models can be categorized ond passage. It learns to compose the extracted into state-based and graph-based reasoning mod- implicit relations such that they map to the main els. State-based reasoning models (Dhingra et al., relation in the query, in this case record label. In 2017; Shen et al., 2017; Hu et al., 2018) are closer essence, the motivation is to learn to extract im- to a standard attention-based RC model with an plicit relations from text and to identify their valid additional “state” representation that is iteratively compositions, such as: (x, single from, y), (y, re- updated. The changing state representation re- leased by, z) ! (x, record label, z). Due to the sults in the model focusing on different parts of absence of direct supervision on these relations, the passage during each iteration, allowing it to PathNet does not explicitly extract these relations. combine information from different parts of the However, our qualitative analysis on a sampled set passage. Graph-based reasoning models (Dhingra of instances from WikiHop development set shows et al., 2018; Cao et al., 2018; Song et al., 2018), on that the top scoring paths in 78% of the correctly the other hand, create graphs over entities within answered questions have implied relations in the the passages and update entity representations via text that could be composed to derive the query recurrent or convolutional networks. In contrast, relations. our approach explicitly identifies paths connecting In addition, PathNet also learns to compose ag- entities in the question to the answer choices. gregated passage representations in a path to cap- Semi-structured QA. Our model is closer to ture more global information: encoding(p1), en- Integer Linear Programming (ILP) based meth- coding(p2) ! (x, record label, z). This passage- ods (Khashabi et al., 2016; Khot et al., 2017; based representation is especially useful in do- Khashabi et al., 2018b), which define an ILP mains such as science question answering where program to find optimal support graphs for con- the lack of easily identifiable entities limits the ef- necting the question to the choices through a fectiveness of the entity-based path representation. semi-structured knowledge representation. How- While this passage-based representation is less in- ever, these models require a manually authored terpretable than the entity-based path representa- and tuned ILP program, and need to convert text tion, it still identifies the two passages used to se- into a semi-structured representation—a process lect the answer, compared to a spread out attention that is often noisy (such as using Open IE tu- over all documents produced by previous graph- 2Other systems, such as by Zhong et al.(2019), have 1The source code is available at https://github. recently appeared on the WikiHop leaderboard (http:// com/allenai/PathNet qangaroo.cs.ucl.ac.uk/leaderboard.html). ples (Khot et al., 2017), SRL frames (Khashabi tity and r represents the relation between he and et al., 2018b)). Our model, on the other hand, the unknown tail entity. The task is to select the is trained end-to-end, and discover relevant rela- unknown tail entity from a given set of candidates tional structure from text. Instead of an ILP pro- fc1; c2; : : : cN g, by reasoning over supporting pas- gram, Jansen et al.(2017) train a latent ranking sages P = p1;:::; pM . To perform multi-hop rea- perceptron using features from aggregated syntac- soning, we extract multiple paths P (cf. Section4) tic structures from multiple sentences. However, connecting he to each ck from the supporting pas- their system operates at the detailed (and often sages P. The j-th 2-hop path for candidate ck is noisy) level of dependency graphs, whereas we denoted pkj, where pkj = he ! e1 ! ck, and e1 identify entities and let the model learn implicit is referred to as the intermediate entity. relations and their compositions. In OpenBookQA, different from WikiHop, the Knowledge Graph QA. QA datasets on knowl- questions and candidate answer choices are plain edge graphs such as Freebase (Bollacker et al., text sentences. To construct paths, we extract 2008), require systems to map queries to a sin- all head entities from the question and tail enti- gle relation (Bordes et al., 2015), a path (Guu ties from candidate answer choices, considering et al., 2015), or complex structured queries (Be- all noun phrases and named entities as entities.