arXiv:1905.05529v4 [cs.CL] 4 Sep 2019

Entity-Relation Extraction as Multi-turn Question Answering

Xiaoya Li¹*, Fan Yin¹*, Zijun Sun¹, Xiayu Li¹, Arianna Yuan¹,², Duo Chai¹, Mingxin Zhou¹ and Jiwei Li¹
¹ Shannon.AI
² Computer Science Department, Stanford University
{xiaoya_li, fan_yin, zijun_sun, xiayu_li, arianna_yuan, duo_chai, mingxin_zhou, jiwei_li}@shannonai.com

* indicates equal contribution. To appear in ACL 2019.

Abstract

In this paper, we propose a new paradigm for the task of entity-relation extraction. We cast the task as a multi-turn question answering problem, i.e., the extraction of entities and relations is transformed to the task of identifying answer spans from the context. This multi-turn QA formalization comes with several key advantages: firstly, the question query encodes important information for the entity/relation class we want to identify; secondly, QA provides a natural way of jointly modeling entity and relation; and thirdly, it allows us to exploit the well developed machine reading comprehension (MRC) models. Experiments on the ACE and the CoNLL04 corpora demonstrate that the proposed paradigm significantly outperforms previous best models. We are able to obtain state-of-the-art results on all of the ACE04, ACE05 and CoNLL04 datasets, increasing the SOTA results on the three datasets to 49.4 (+1.0), 60.2 (+0.6) and 68.9 (+1.1), respectively. Additionally, we construct a newly developed dataset RESUME in Chinese, which requires multi-step reasoning to construct entity dependencies, as opposed to the single-step dependency extraction in the triplet extraction of previous datasets. The proposed multi-turn QA model also achieves the best performance on the RESUME dataset.

1 Introduction

Identifying entities and their relations is the prerequisite of extracting structured knowledge from unstructured raw texts, which has received growing interest these years. Given a chunk of natural language text, the goal of entity-relation extraction is to transform it to a structural knowledge base. For example, given the following text:

In 2002, Musk founded SpaceX, an aerospace manufacturer and space transport services Company, of which he is CEO and lead designer. In 2003, he helped fund Tesla, Inc., an electric vehicle and solar panel manufacturer, and became its CEO and product architect. In 2006, he inspired the creation of SolarCity, a solar energy services Company, and operates as its chairman. In 2016, he co-founded Neuralink, a Company focused on developing brain-computer interfaces, and founded The Boring Company, an infrastructure and tunnel-construction Company.

We need to extract four different types of entities, i.e., Person, Company, Time and Position, and three types of relations, FOUND, FOUNDING-TIME and SERVING-ROLE. The text is to be transformed into a structural dataset shown in Table 1.

Person   Corp                 Time   Position
Musk     SpaceX               2002   CEO
Musk     Tesla                2003   CEO & product architect
Musk     SolarCity            2006   chairman
Musk     Neuralink            2016   CEO
Musk     The Boring Company   2016   -

Table 1: An illustration of an extracted structural table.
Most existing models approach this task by extracting a list of triples from the text, i.e., REL(e1, e2), which denotes that relation REL holds between entity e1 and entity e2. Previous models fall into two major categories: the pipelined approach, which first uses tagging models to identify entities, and then uses relation extraction models to identify the relation between each entity pair; and the joint approach, which combines the entity model and the relation model through different strategies, such as constraints or parameter sharing.

There are several key issues with current approaches, both in terms of the task formalization and the algorithm. At the formalization level, the REL(e1, e2) triplet structure is not enough to fully express the data structure behind the text. Take the Musk case as an example: there is a hierarchical dependency between the tags. The extraction of Time depends on Position, since a Person can hold multiple Positions in a Company during different Time periods. The extraction of Position also depends on Company, since a Person can work for multiple companies. At the algorithm level, for most existing relation extraction models (Miwa and Bansal, 2016; Wang et al., 2016a; Ye et al., 2016), the input to the model is a raw sentence with two marked mentions, and the output is whether a relation holds between the two mentions. As pointed out in Wang et al. (2016a); Zeng et al. (2018), it is hard for neural models to capture all the lexical, semantic and syntactic cues in this formalization, especially when (1) entities are far away; (2) one entity is involved in multiple triplets; or (3) relation spans have overlaps³.

³ e.g., in the text ABCD, (A, C) is a pair and (B, D) is a pair.

In the paper, we propose a new paradigm to handle the task of entity-relation extraction. We formalize the task as a multi-turn question answering task: each entity type and relation type is characterized by a question answering template, and entities and relations are extracted by answering template questions. Answers are text spans, extracted using the now standard machine reading comprehension (MRC) framework: predicting answer spans given context (Seo et al., 2016; Wang and Jiang, 2016; Xiong et al., 2017; Wang et al., 2016b). To extract structural data like Table 1, the model needs to answer the following questions sequentially:

• Q: who is mentioned in the text? A: Musk;
• Q: which companies did Musk work for? A: SpaceX, Tesla, SolarCity, Neuralink and The Boring Company;
• Q: when did Musk join SpaceX? A: 2002;
• Q: what was Musk's Position in SpaceX? A: CEO.

Treating the entity-relation extraction task as a multi-turn QA task has the following key advantages: (1) the multi-turn QA setting provides an elegant way to capture the hierarchical dependency of tags. As the multi-turn QA proceeds, we progressively obtain the entities we need for the next turn. This is closely akin to the multi-turn slot filling dialogue system (Williams and Young, 2005; Lemon et al., 2006); (2) the question query encodes important prior information for the relation class we want to identify. This informativeness can potentially solve issues that existing relation extraction models fail to solve, such as distantly-separated entity pairs, relation span overlap, etc.; (3) the QA framework provides a natural way to simultaneously extract entities and relations: most MRC models support outputting special NONE tokens, indicating that there is no answer to the question. Through this, the original two tasks, entity extraction and relation extraction, can be merged into a single QA task: a relation holds if the returned answer to the question corresponding to that relation is not NONE, and this returned answer is the entity that we wish to extract.

In this paper, we show that the proposed paradigm, which transforms the entity-relation extraction task to a multi-turn QA task, introduces significant performance boost over existing systems. It achieves state-of-the-art (SOTA) performance on the ACE and the CoNLL04 datasets. The tasks on these datasets are formalized as triplet extraction problems, in which two turns of QA suffice.
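To make the turn structure listed above concrete, the following is a minimal sketch of the question sequence in Python. It is illustrative only: the `ask` callable and `extract_table` name are our assumptions, standing in for whatever MRC model answers the template questions (returning an empty list when the model would output NONE); they are not code from the paper.

```python
# A minimal sketch of the multi-turn question sequence from the Musk example.
# `ask(question, text)` is an assumed interface to an MRC model that returns
# a list of answer spans (empty when the model would output NONE).

def extract_table(text, ask):
    rows = []
    # Turn 1: head entities (Person).
    for person in ask("who is mentioned in the text?", text):
        # Turn 2: Companies related to the Person.
        for company in ask(f"which companies did {person} work for?", text):
            # Turns 3-4: Time and Position depend on both earlier answers.
            times = ask(f"when did {person} join {company}?", text)
            roles = ask(f"what was {person}'s position in {company}?", text)
            rows.append((person, company,
                         times[0] if times else None,
                         roles[0] if roles else None))
    return rows
```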
We thus build a more complicated and more difficult dataset called RESUME, which requires extracting biographical information of individuals from raw texts. The construction of a structural knowledge base from RESUME requires four or five turns of QA. We also show that this multi-turn QA setting can easily integrate reinforcement learning (just as in multi-turn dialog systems) to gain additional performance boost.

The rest of this paper is organized as follows: Section 2 details related work. We describe the dataset and setting in Section 3, the proposed model in Section 4, and experimental results in Section 5. We conclude this paper in Section 6.

2 Related Work

2.1 Extracting Entities and Relations

Many earlier entity-relation extraction systems are pipelined (Zelenko et al., 2003; Miwa et al., 2009; Chan and Roth, 2011; Lin et al., 2016): an entity extraction model first identifies entities of interest and a relation extraction model then constructs relations between the extracted entities. Although pipelined systems have the flexibility of integrating different data sources and learning algorithms, they suffer significantly from error propagation.

To tackle this issue, joint learning models have been proposed. Earlier joint learning approaches connect the two models through various dependencies, including constraints solved by integer linear programming (Yang and Cardie, 2013; Roth and Yih, 2007), card-pyramid parsing (Kate and Mooney, 2010), and global probabilistic graphical models (Yu and Lam, 2010; Singh et al., 2013). In later studies, Li and Ji (2014) extract entity mentions and relations using a structured perceptron with efficient beam-search, which is significantly more efficient and less time-consuming than constraint-based approaches. Miwa and Sasaki (2014); Gupta et al.
(2016); Zhang et al. (2017) proposed the table-filling approach, which provides an opportunity to incorporate more sophisticated features and algorithms into the model, such as search orders in decoding and global features. Neural network models have been widely used in the literature as well. Miwa and Bansal (2016) introduced an end-to-end approach that extracts entities and their relations using neural network models with shared parameters, i.e., extracting entities using a neural tagging model and extracting relations using a neural multi-class classification model based on tree LSTMs (Tai et al., 2015). Wang et al. (2016a) extract relations using multi-level attention CNNs. Zeng et al. (2018) proposed a new framework that uses sequence-to-sequence models to generate entity-relation triples, naturally combining entity detection and relation detection.

Another way to bind the entity and the relation extraction models is to use reinforcement learning or Minimum Risk Training, in which the training signals are given based on the joint decision by the two models. Sun et al. (2018) optimized a global loss function to jointly train the two models under the framework of Minimum Risk Training. Takanobu et al. (2018) used hierarchical reinforcement learning to extract entities and relations in a hierarchical manner.

2.2 Machine Reading Comprehension

Main-stream MRC models (Seo et al., 2016; Wang and Jiang, 2016; Xiong et al., 2017; Wang et al., 2016b) extract text spans in passages given queries. Text span extraction can be simplified to two multi-class classification tasks, i.e., predicting the starting and the ending positions of the answer. A similar strategy can be extended to multi-passage MRC (Joshi et al., 2017; Dunn et al., 2017), where the answer needs to be selected from multiple passages. Multi-passage MRC tasks can be easily simplified to single-passage MRC tasks by concatenating passages (Shen et al., 2017; Wang et al., 2017b). Wang et al. (2017a) first rank the passages and then run single-passage MRC on the selected passage. Tan et al. (2017) train the passage ranking model jointly with the reading comprehension model. Pretraining methods like BERT (Devlin et al., 2018) or ELMo (Peters et al., 2018) have proved to be extremely helpful in MRC tasks.

There has been a tendency of casting non-QA NLP tasks as QA tasks (McCann et al., 2018). Our work is highly inspired by Levy et al. (2017). Levy et al. (2017) and McCann et al. (2018) focus on identifying the relation between two pre-defined entities, and the authors formalize the task of relation extraction as a single-turn QA task. In the current paper we study a more complicated scenario, where hierarchical tag dependency needs to be modeled and the single-turn QA approach no longer suffices. We show that our multi-turn QA method is able to solve this challenge and obtain new state-of-the-art results.

3 Datasets and Tasks

3.1 ACE04, ACE05 and CoNLL04

We use ACE04, ACE05 and CoNLL04 (Roth and Yih, 2004), the widely used entity-relation extraction benchmarks, for evaluation. ACE04 defines 7 entity types, including Person (PER), Organization (ORG), Geographical Entities (GPE), Location (LOC), Facility (FAC), Weapon (WEA) and Vehicle (VEH). For each pair of entities, it defines 7 relation categories, including Physical (PHYS), Person-Social (PER-SOC), Employment-Organization (EMP-ORG), Agent-Artifact (ART), PER/ORG Affiliation (OTHER-AFF), GPE-Affiliation (GPE-AFF) and Discourse (DISC). ACE05 was built upon ACE04. It kept the PER-SOC, ART and GPE-AFF categories from ACE04, but split PHYS into PHYS and a new relation category PART-WHOLE. It also deleted DISC and merged EMP-ORG and OTHER-AFF into a new category EMP-ORG. As for CoNLL04, it defines four entity types (LOC, ORG, PER and OTHERS) and five relation categories (LOCATED_IN, WORK_FOR, ORGBASED_IN, LIVE_IN and KILL).

For ACE04 and ACE05, we followed the training/dev/test split in Li and Ji (2014) and Miwa and Bansal (2016)⁴.
For the CoNLL04 dataset, we followed Miwa and Sasaki (2014).

⁴ https://github.com/tticoin/LSTM-ER/.

3.2 RESUME: A newly constructed dataset

           Total #   Average # per passage
Person     961       1.09
Company    1988      2.13
Position   2687      1.33
Time       1275      1.01

Table 2: Statistics for the RESUME dataset.

The ACE and the CoNLL04 datasets are intended for triplet extraction, and two turns of QA are sufficient to extract the triplet (one turn for head-entities and another for joint extraction of tail-entities and relations). These datasets do not involve hierarchical entity relations as in our previous Musk example, which are prevalent in real-life applications.

Therefore, we construct a new dataset called RESUME. We extracted 841 paragraphs from chapters describing management teams in IPO prospectuses. Each paragraph describes some work history of an executive. We wish to extract the structural data from the resume. The dataset is in Chinese. The following shows an example:

郑强先生,本公司监事,1973年出生,中国国籍,无境外永久居留权。1995年,毕业于南京大学经济管理专业;1995年至1998年,就职于江苏常州公路运输有限公司,任主办会计;1998年至2000年,就职于越秀会计师事务所,任项目经理;2000年至2010年,就职于国富浩华会计师事务所有限公司广东分所,历任项目经理、部门经理、合伙人及副主任会计师;2010年至2011年,就职于广东中科招商创业投资管理有限责任公司,任副总经理;2011年至今,任广东中广投资管理有限公司董事、总经理;2016年至今,任湛江中广创业投资有限公司董事、总经理;2016年3月至今,担任本公司监事。

Mr. Zheng Qiang, a supervisor of the Company. He was born in 1973. His nationality is Chinese, with no permanent residency abroad. He graduated from Nanjing University with a major in economic management in 1995. From 1995 to 1998, he worked for Jiangsu Changzhou Road Transportation Co., Ltd. as an organizer of accounting. From 1998 to 2000, he worked as a project manager in Yuexiu Certified Public Accountants. From 2000 to 2010, he worked in the Guangdong branch of Guofu Haohua Certified Public Accountants Co., Ltd., and served as a project manager, department manager, partner and deputy chief accountant. From 2010 to 2011, he worked for Guangdong Zhongke Investment Venture Capital Management Co., Ltd.
as a deputy general manager; since 2011, he has served as the director and general manager of Guangdong Zhongguang Investment Management Co., Ltd.; since 2016, he has served as director and general manager of Zhanjiang Zhongguang Venture Capital Co., Ltd.; since March 2016, he has served as the supervisor of the Company.

We identify four types of entities: Person (the name of the executive), Company (the company that the executive works/worked for), Position (the position that he/she holds/held) and Time (the time period that the executive occupies/occupied that position). It is worth noting that one person can work for different companies during different periods of time, and that one person can hold different positions in different periods of time for the same company.

We recruited crowdworkers to fill the slots in Table 1. Each passage is labeled by two different crowdworkers. If labels from the two annotators disagree, one or more annotators were asked to label the sentence and a majority vote was taken as the final decision. Since the wording of the text is usually very explicit and formal, the inter-agreement between annotators is very high, achieving a value of 93.5% for all slots. Some statistics of the dataset are shown in Table 2. We randomly split the dataset into training (80%), validation (10%) and test (10%) sets.

4 Model

4.1 System Overview

The overview of the algorithm is shown in Algorithm 1. The algorithm contains two stages:

(1) The head-entity extraction stage (lines 4-9): each episode of multi-turn QA is triggered by an entity. To extract this starting entity, we transform each entity type to a question using EntityQuesTemplates (line 4), and the entity e is extracted by answering the question (line 5). If the system outputs the special NONE token, it means that s does not contain any entity of that type.

(2) The relation and the tail-entity extraction stage (lines 10-24): ChainOfRelTemplates defines a chain of relations, the order of which we need to follow to run multi-turn QA. The reason is that the extraction of some entities depends on the extraction of others.

Relation Type | head-e | tail-e | Natural Language Question | Template Question
GEN-AFF       | FAC    | GPE    | find a geo-political entity that connects to XXX | XXX; has affiliation; geo-political entity
PART-WHOLE    | FAC    | FAC    | find a facility that geographically relates to XXX | XXX; part whole; facility
PART-WHOLE    | FAC    | GPE    | find a geo-political entity that geographically relates to XXX | XXX; part whole; geo-political entity
PART-WHOLE    | FAC    | VEH    | find a vehicle that belongs to XXX | XXX; part whole; vehicle
PHYS          | FAC    | FAC    | find a facility near XXX? | XXX; physical; facility
ART           | GPE    | FAC    | find a facility which is made by XXX | XXX; agent artifact; facility
ART           | GPE    | VEH    | find a vehicle which is owned or used by XXX | XXX; agent artifact; vehicle
ART           | GPE    | WEA    | find a weapon which is owned or used by XXX | XXX; agent artifact; weapon
ORG-AFF       | GPE    | ORG    | find an organization which is invested by XXX | XXX; organization affiliation; organization
PART-WHOLE    | GPE    | GPE    | find a geo-political entity which is controlled by XXX | XXX; part whole; geo-political entity
PART-WHOLE    | GPE    | LOC    | find a location geographically related to XXX | XXX; part whole; location

Table 3: Some of the question templates for different relation types in ACE.

Q1 Person:   who is mentioned in the text? A: e1
Q2 Company:  which companies did e1 work for? A: e2
Q3 Position: what was e1's position in e2? A: e3
Q4 Time:     during which period did e1 work for e2 as e3? A: e4

Table 4: Question templates for the RESUME dataset.

For example, in the RESUME dataset, the Position held by an executive relies on the Company he works for. The extraction of the Time entity likewise relies on the extraction of both the Company and the Position. The extraction order is manually pre-defined. ChainOfRelTemplates also defines the template for each relation. Each template contains slots to be filled. To generate a question (line 14), we insert previously extracted entity/entities into the slot/slots of a template. The relation REL and the tail-entity e are then jointly extracted by answering the generated question (line 15). A returned NONE token indicates that there is no answer in the given sentence.

It is worth noting that entities extracted in the head-entity extraction stage may not all be head-entities. In the subsequent relation and tail-entity extraction stage, entities extracted in the first stage are initially assumed to be head-entities and are fed to the templates to generate questions. If an entity e extracted in the first stage is indeed a head-entity of a relation, the QA model will extract the tail-entity by answering the corresponding question. Otherwise, the answer will be NONE and is thus ignored.

For the ACE04, ACE05 and CoNLL04 datasets, only two QA turns are needed. ChainOfRelTemplates thus only contains chains of length 1. For RESUME, we need to extract 4 entities, so ChainOfRelTemplates contains chains of length 3.

Input: sentence s, EntityQuesTemplates, ChainOfRelTemplates
Output: a list of lists (table) M = []
 1:
 2: M ← ∅
 3: HeadEntList ← ∅
 4: for entity_question in EntityQuesTemplates do
 5:     e1 = Extract_Answer(entity_question, s)
 6:     if e1 ≠ NONE do
 7:         HeadEntList = HeadEntList + {e1}
 8:     endif
 9: end for
10: for head_entity in HeadEntList do
11:     ent_list = [head_entity]
12:     for [rel, rel_temp] in ChainOfRelTemplates do
13:         for (rel, rel_temp) in List of [rel, rel_temp] do
14:             q = GenQues(rel_temp, rel, ent_list)
15:             e = Extract_Answer(q, s)
16:             if e ≠ NONE
17:                 ent_list = ent_list + e
18:             endif
19:         end for
20:     end for
21:     if len(ent_list) = len([rel, rel_temp])
22:         M = M + ent_list
23:     endif
24: end for
25: return M

Algorithm 1: Transforming the entity-relation extraction task to a multi-turn QA task.
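For readers who prefer executable form, the following is a Python transcription of Algorithm 1. It is a sketch under stated assumptions: `extract_answer` and `gen_ques` are assumed wrappers around the MRC model and the question templates, and none of the names come from a released implementation.

```python
# A Python sketch of Algorithm 1. `extract_answer(question, s)` and
# `gen_ques(rel_temp, rel, ent_list)` are assumed wrappers around the MRC
# model and the question templates; NONE is the model's no-answer output.
NONE = None

def multi_turn_qa(s, entity_ques_templates, chain_of_rel_templates,
                  extract_answer, gen_ques):
    M = []                                     # the output table
    head_ent_list = []
    # Stage 1 (lines 4-9): head-entity extraction.
    for entity_question in entity_ques_templates:
        e1 = extract_answer(entity_question, s)
        if e1 is not NONE:
            head_ent_list.append(e1)
    # Stage 2 (lines 10-24): relation and tail-entity extraction, following
    # the pre-defined chain of (relation, template) pairs.
    for head_entity in head_ent_list:
        ent_list = [head_entity]
        for rel, rel_temp in chain_of_rel_templates:
            q = gen_ques(rel_temp, rel, ent_list)    # line 14
            e = extract_answer(q, s)                 # line 15
            if e is not NONE:
                ent_list.append(e)
        # Keep only fully-filled rows: the head entity plus one answer
        # per turn of the chain.
        if len(ent_list) == len(chain_of_rel_templates) + 1:
            M.append(ent_list)
    return M
```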
4.2 Generating Questions using Templates

Each entity type is associated with a type-specific question generated by the templates. There are two ways to generate questions based on templates: natural language questions or pseudo-questions. A pseudo-question is not necessarily grammatical. For example, the natural language question for the Facility type could be "Which facility is mentioned in the text?", while the pseudo-question could just be "entity: facility".

At the relation and tail-entity joint extraction stage, a question is generated by combining a relation-specific template with the extracted head-entity. The question could be either a natural language question or a pseudo-question. Examples are shown in Table 3 and Table 4.

4.3 Extracting Answer Spans via MRC

Various MRC models have been proposed, such as BiDAF (Seo et al., 2016) and QANet (Yu et al., 2018). In the standard MRC setting, given a question Q = {q_1, q_2, ..., q_{N_q}}, where N_q denotes the number of words in Q, and a context C = {c_1, c_2, ..., c_{N_c}}, where N_c denotes the number of words in C, we need to predict the answer span. For the QA framework, we use BERT (Devlin et al., 2018) as a backbone. BERT performs bidirectional language model pretraining on large-scale datasets using transformers (Vaswani et al., 2017) and achieves SOTA results on MRC datasets like SQuAD (Rajpurkar et al., 2016). To align with the BERT framework, the question Q and the context C are combined by concatenating the list [CLS, Q, SEP, C, SEP], where CLS and SEP are special tokens, Q is the tokenized question and C is the context. The representation of each context token is obtained using multi-layer transformers.

Traditional MRC models (Wang and Jiang, 2016; Xiong et al., 2017) predict the starting and ending indices by applying two softmax layers to the context tokens. This softmax-based span extraction strategy only fits single-answer extraction tasks, but not ours, since one sentence/passage in our setting might contain multiple answers. To tackle this issue, we formalize the task as a query-based tagging problem (Lafferty et al., 2001; Huang et al., 2015; Ma and Hovy, 2016). Specifically, we predict a BMEO (beginning, inside, ending and outside) label for each token in the context given the query. The representation of each word is fed to a softmax layer to output a BMEO label. One can think of this as transforming two N-class classification tasks of predicting the starting and the ending indices (where N denotes the length of the sentence) into N 5-class classification tasks⁵.

⁵ For some of the relations that we are interested in, their corresponding questions have single answers. We tried the strategy of predicting the starting and the ending index and found the results no different from the ones in the QA-based tagging setting.

Training and Test At training time, we jointly train the objectives for the two stages:

L = (1 − λ) L(head-entity) + λ L(tail-entity, rel)    (1)

λ ∈ [0, 1] is the parameter controlling the trade-off between the two objectives. Its value is tuned on the validation set. Both models are initialized using the standard BERT model and share parameters during training. At test time, head-entities and tail-entities are extracted separately based on the two objectives.
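As a deliberately minimal sketch of this setup, the code below implements a query-based BMEO tagging head on top of a BERT encoder and the joint objective of Eq. (1), using PyTorch and the `transformers` library. The backbone name, the 4-label BMEO inventory, and the -100 ignore index are our illustrative assumptions, not released details of the paper's system.

```python
# A sketch of the query-based tagging model: BERT encodes
# [CLS] question [SEP] context [SEP] and a linear layer emits one
# BMEO label per token.
import torch.nn as nn
from transformers import BertModel

class QueryBasedTagger(nn.Module):
    NUM_LABELS = 4  # B, M, E, O (the paper phrases this as a 5-class decision)

    def __init__(self, backbone="bert-base-chinese"):  # assumed checkpoint
        super().__init__()
        self.bert = BertModel.from_pretrained(backbone)
        self.head = nn.Linear(self.bert.config.hidden_size, self.NUM_LABELS)

    def forward(self, input_ids, attention_mask, token_type_ids):
        out = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        return self.head(out.last_hidden_state)  # (batch, seq_len, 4) logits

def joint_loss(head_logits, head_labels, tail_logits, tail_labels, lam):
    """Eq. (1): L = (1 - lambda) L(head-entity) + lambda L(tail-entity, rel).
    Question and special-token positions carry label -100 and are ignored."""
    ce = nn.CrossEntropyLoss(ignore_index=-100)
    l_head = ce(head_logits.flatten(0, 1), head_labels.flatten())
    l_tail = ce(tail_logits.flatten(0, 1), tail_labels.flatten())
    return (1 - lam) * l_head + lam * l_tail
```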

4.4 Reinforcement Learning

Note that in our setting, the extracted answer from one turn not only affects its own accuracy, but also determines how a question will be constructed for the downstream turns, which in turn affects later accuracies. We use reinforcement learning to tackle this, which has proved to be successful in multi-turn dialogue generation (Mrkšić et al., 2015; Li et al., 2016; Wen et al., 2016), a task that has the same challenge as ours.

Action and Policy In an RL setting, we need to define action and policy. In the multi-turn QA setting, the action is selecting a text span in each turn. The policy defines the probability of selecting a certain span given the question and the context. As the algorithm relies on the BMEO tagging output, the probability of selecting a certain span {w_1, w_2, ..., w_n} is the joint probability of w_1 being assigned to B (beginning), w_2, ..., w_{n−1} being assigned to M (inside) and w_n being assigned to E (end), written as follows:

p(y(w_1, ..., w_n) = answer | question, s) = p(w_1 = B) × p(w_n = E) × ∏_{i ∈ [2, n−1]} p(w_i = M)    (2)

Reward For a given sentence s, we use the number of correctly retrieved triples as the reward. We use the REINFORCE algorithm (Williams, 1992), a kind of policy gradient method, to find the optimal policy, which maximizes the expected reward E_π[R(w)]. The expectation is approximated by sampling from the policy π, and the gradient is computed using the likelihood ratio:

∇E(θ) ≈ [R(w) − b] ∇ log π(y(w) | question, s)    (3)

where b denotes a baseline value. For each turn in the multi-turn QA setting, getting an answer correct leads to a reward of +1. The final reward is the accumulative reward of all turns. The baseline value is set to the average of all previous rewards. We do not initialize policy networks from scratch, but use the pre-trained head-entity and tail-entity extraction models described in the previous section. We also use the experience replay strategy (Mnih et al., 2015): for each batch, half of the examples are simulated and the other half are randomly selected from previously generated examples.

For the RESUME dataset, we use the strategy of curriculum learning (Bengio et al., 2009), i.e., we gradually increase the number of turns from 2 to 4 at training.
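The sketch below spells out how Eqs. (2) and (3) could be computed from the tagger's output: the span log-probability is summed from per-token BMEO log-softmax scores, and a REINFORCE step scales it by the baseline-adjusted reward. The label-index convention, the per-turn episode layout, and the baseline bookkeeping are our illustrative assumptions.

```python
# A sketch of the policy (Eq. (2)) and its REINFORCE update (Eq. (3)).
# `tag_log_probs` is a (seq_len, 4) tensor of log-softmax BMEO scores;
# the index convention B=0, M=1, E=2, O=3 is an assumption.
import torch

B, M, E = 0, 1, 2

def span_log_prob(tag_log_probs, start, end):
    """Eq. (2): log p(span) = log p(w_start = B) + log p(w_end = E)
    plus the sum of log p(w_i = M) over interior tokens."""
    lp = tag_log_probs[start, B] + tag_log_probs[end, E]
    return lp + tag_log_probs[start + 1:end, M].sum()

def reinforce_step(optimizer, turn_log_probs, reward, baseline):
    """Eq. (3): ascend (R(w) - b) * grad log pi(y(w) | question, s).
    `turn_log_probs` holds one span log-probability per turn; `reward`
    is the accumulative reward over the episode and `baseline` the
    running average of previous rewards."""
    loss = -(reward - baseline) * torch.stack(turn_log_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```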
5 Experimental Results

5.1 Results on RESUME

Answers are extracted in the order of Person (first turn), Company (second turn), Position (third turn) and Time (fourth turn), and the extraction of each answer depends on those prior to it.

For baselines, we first implement a joint model in which entity extraction and relation extraction are trained together (denoted by tagging+relation). As in Zheng et al. (2017), entities are extracted using BERT tagging models, and relations are extracted by applying a CNN to representations output by BERT transformers.

Existing baselines which involve entity and relation identification stages (either pipelined or joint) are well suited for triplet extraction, but not really tailored to our setting, because in the third and fourth turns we need more information than just two entities to decide the relation. For instance, to extract Position we need both Person and Company, and to extract Time we need Person, Company and Position. This is akin to a dependency parsing task, but at the tag level rather than the word level (Dozat and Manning, 2016; Chen and Manning, 2014). We thus propose the following baseline, which modifies the previous entity+relation strategy to entity+dependency, denoted by tagging+dependency. We use the BERT tagging model to assign tagging labels to each word, and modify the current SOTA dependency parsing model Biaffine (Dozat and Manning, 2016) to construct dependencies between tags. The Biaffine dependency model and the entity-extraction model are jointly trained.

Results are presented in Table 5. As can be seen, the tagging+dependency model outperforms the tagging+relation model. The proposed multi-turn QA model performs the best, with RL adding an additional performance boost. Notably, for Person extraction, which only requires single-turn QA, the multi-turn QA+RL model performs the same as the multi-turn QA model; this is also the case for tagging+relation and tagging+dependency.

5.2 Results on ACE04, ACE05 and CoNLL04

For ACE04, ACE05 and CoNLL04, only two turns of QA are required. For evaluation, we report micro-F1 scores, precision and recall on entities and relations (Tables 6, 7 and 8), as in Li and Ji (2014); Miwa and Bansal (2016); Katiyar and Cardie (2017); Zhang et al. (2017). For ACE04, the proposed multi-turn QA model outperforms the previous SOTA by +1.8% for entity extraction and +1.0% for relation extraction. For ACE05, the proposed multi-turn QA model outperforms the previous SOTA by +1.2% for entity extraction and +0.6% for relation extraction. For CoNLL04, the proposed multi-turn QA model leads to a +1.1% gain on relation F1.

6 Ablation Studies

6.1 Effect of Question Generation Strategy

In this subsection, we compare the effects of natural language questions and pseudo-questions. Results are shown in Table 9. We can see that natural language questions lead to a strict F1 improvement across all datasets. This is because natural language questions provide more fine-grained semantic information and can help entity/relation extraction. By contrast, pseudo-questions provide very coarse-grained, ambiguous and implicit hints of entity and relation types, which might even confuse the model.
           multi-turn QA     multi-turn QA+RL  tagging+dependency  tagging+relation
           p    r    f       p    r    f       p    r    f         p    r    f
Person     98.1 99.0 98.6    98.1 99.0 98.6    97.0 97.2 97.1      97.0 97.2 97.1
Company    82.3 87.6 84.9    83.3 87.8 85.5    81.4 87.3 84.2      81.0 86.2 83.5
Position   97.1 98.5 97.8    97.3 98.9 98.1    96.3 98.0 97.0      94.4 97.8 96.0
Time       96.6 98.8 97.7    97.0 98.9 97.9    95.2 96.3 95.7      94.0 95.9 94.9
all        91.0 93.2 92.1    91.6 93.5 92.5    90.0 91.7 90.8      88.2 91.5 89.8

Table 5: Results for different models on the RESUME dataset.

Models                      EntityP  EntityR  EntityF  RelationP  RelationR  RelationF
Li and Ji (2014)            83.5     76.2     79.7     60.8       36.1       45.3
Miwa and Bansal (2016)      80.8     82.9     81.8     48.7       48.1       48.4
Katiyar and Cardie (2017)   81.2     78.1     79.6     46.4       45.3       45.7
Bekoulis et al. (2018)      -        -        81.6     -          -          47.5
Multi-turn QA               84.4     82.9     83.6     50.1       48.7       49.4 (+1.0)

Table 6: Results of different models on the ACE04 test set. Results for pipelined methods are omitted since they consistently underperform joint models (see Li and Ji (2014) for details).

Models                      EntityP  EntityR  EntityF  RelationP  RelationR  RelationF
Li and Ji (2014)            85.2     76.9     80.8     65.4       39.8       49.5
Miwa and Bansal (2016)      82.9     83.9     83.4     57.2       54.0       55.6
Katiyar and Cardie (2017)   84.0     81.3     82.6     55.5       51.8       53.6
Zhang et al. (2017)         -        -        83.5     -          -          57.5
Sun et al. (2018)           83.9     83.2     83.6     64.9       55.1       59.6
Multi-turn QA               84.7     84.9     84.8     64.8       56.2       60.2 (+0.6)

Table 7: Results of different models on the ACE05 test set. Results for pipelined methods are omitted since they consistently underperform joint models (see Li and Ji (2014) for details).

Models                      EntityP  EntityR  EntityF  RelationP  RelationR  RelationF
Miwa and Sasaki (2014)      -        -        80.7     -          -          61.0
Zhang et al. (2017)         -        -        85.6     -          -          67.8
Bekoulis et al. (2018)      -        -        83.6     -          -          62.0
Multi-turn QA               89.0     86.6     87.8     69.2       68.2       68.9 (+1.1)

Table 8: Comparison of the proposed method with the previous models on the CoNLL04 dataset. Precision and recall values of baseline models were not reported in the previous papers.

RESUME
Model       Overall P  Overall R  Overall F
Pseudo Q    90.2       92.3       91.2
Natural Q   91.0       93.2       92.1

ACE04
Model       EP    ER    EF    RP    RR    RF
Pseudo Q    83.7  81.3  82.5  49.4  47.2  48.3
Natural Q   84.4  82.9  83.6  50.1  48.7  49.4

ACE05
Model       EP    ER    EF    RP    RR    RF
Pseudo Q    83.6  84.7  84.2  60.4  55.9  58.1
Natural Q   84.7  84.9  84.8  64.8  56.2  60.2

CoNLL04
Model       EP    ER    EF    RP    RR    RF
Pseudo Q    87.4  86.4  86.9  68.2  67.4  67.8
Natural Q   89.0  86.6  87.8  69.6  68.2  68.9

Table 9: Comparing the effect of natural language questions with pseudo-questions.

6.2 Effect of Joint Training

In this paper, we decompose the entity-relation extraction task into two subtasks: a multi-answer task for head-entity extraction and a single-answer task for joint relation and tail-entity extraction. We jointly train the two models with parameters shared. The parameter λ controls the trade-off between the two subtasks:

L = (1 − λ) L(head-entity) + λ L(tail-entity)    (4)

Results regarding different values of λ on the ACE05 dataset are given as follows:

λ         EntityF1  RelationF1
λ = 0     85.0      55.1
λ = 0.1   84.8      55.4
λ = 0.2   85.2      56.2
λ = 0.3   84.8      56.4
λ = 0.4   84.6      57.9
λ = 0.5   84.8      58.3
λ = 0.6   84.6      58.9
λ = 0.7   84.8      60.2
λ = 0.8   83.9      58.7
λ = 0.9   82.7      58.3
λ = 1.0   81.9      57.8

When λ is set to 0, the system is essentially only trained on the head-entity prediction task. Interestingly, λ = 0 does not lead to the best entity-extraction performance. This demonstrates that the second-stage relation extraction actually helps the first-stage entity extraction, which again confirms the necessity of considering these two subtasks together. For the relation extraction task, the best performance is obtained when λ is set to 0.7.

6.3 Case Study

Table 10 compares outputs from the proposed multi-turn QA model with those of the previous SOTA MRT model (Sun et al., 2018). In the first example, MRT is not able to identify the relation between john scottsdale and iraq because the two entities are too far apart, but the proposed QA model is able to handle this issue. In the second example, the sentence contains two pairs of the same relation. The MRT model has a hard time handling this situation: it is not able to locate the ship entity or the associated relation, whereas the multi-turn QA model handles the case correctly.

EXAMPLE 1   [john scottsdale] PER: PHYS-1 is on the front lines in [iraq] GPE: PHYS-1.
MRT         [john scottsdale] PER is on the front lines in [iraq] GPE.
MULTI-QA    [john scottsdale] PER: PHYS-1 is on the front lines in [iraq] GPE: PHYS-1.

EXAMPLE 2   The [men] PER: ART-1 held on the sinking [vessel] VEH: ART-1 until the [passenger] PER: ART-2 [ship] VEH: ART-2 was able to reach them.
MRT         The [men] PER: ART-1 held on the sinking [vessel] VEH: ART-1 until the [passenger] PER ship was able to reach them.
MULTI-QA    The [men] PER: ART-1 held on the sinking [vessel] VEH: ART-1 until the [passenger] PER: ART-2 [ship] VEH: ART-2 was able to reach them.

Table 10: Comparing the multi-turn QA model with MRT (Sun et al., 2018).

7 Conclusion

In this paper, we propose a multi-turn question answering paradigm for the task of entity-relation extraction. We achieve new state-of-the-art results on 3 benchmark datasets. We also construct a new entity-relation extraction dataset that requires hierarchical relation reasoning, on which the proposed model achieves the best performance.

References

Giannis Bekoulis, Johannes Deleu, Thomas Demeester, and Chris Develder. 2018. Adversarial training for multi-context joint entity and relation extraction. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2830–2836. Association for Computational Linguistics.

Yoshua Bengio, Jérôme Louradour, Ronan Collobert, and Jason Weston. 2009. Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 41–48. ACM.

Yee Seng Chan and Dan Roth. 2011. Exploiting syntactico-semantic structures for relation extraction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, pages 551–560. Association for Computational Linguistics.

Danqi Chen and Christopher Manning. 2014. A fast and accurate dependency parser using neural networks. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 740–750.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Timothy Dozat and Christopher D. Manning. 2016. Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734.
Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, and Kyunghyun Cho. 2017. SearchQA: A new Q&A dataset augmented with context from a search engine. arXiv preprint arXiv:1704.05179.

Pankaj Gupta, Hinrich Schütze, and Bernt Andrassy. 2016. Table filling multi-task recurrent neural network for joint entity and relation extraction. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2537–2547.

Zhiheng Huang, Wei Xu, and Kai Yu. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint arXiv:1508.01991.

Mandar Joshi, Eunsol Choi, Daniel S. Weld, and Luke Zettlemoyer. 2017. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551.

Rohit J. Kate and Raymond J. Mooney. 2010. Joint entity and relation extraction using card-pyramid parsing. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pages 203–212. Association for Computational Linguistics.

Arzoo Katiyar and Claire Cardie. 2017. Going out on a limb: Joint extraction of entity mentions and relations without dependency trees. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 917–928. Association for Computational Linguistics.

John Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data.

Oliver Lemon, Kallirroi Georgila, James Henderson, and Matthew Stuttle. 2006. An ISU dialogue system exhibiting reinforcement learning of dialogue policies: Generic slot-filling in the TALK in-car system. In Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations, pages 119–122. Association for Computational Linguistics.

Omer Levy, Minjoon Seo, Eunsol Choi, and Luke Zettlemoyer. 2017. Zero-shot relation extraction via reading comprehension. arXiv preprint arXiv:1706.04115.

Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, and Jason Weston. 2016. Dialogue learning with human-in-the-loop. arXiv preprint arXiv:1611.09823.

Qi Li and Heng Ji. 2014. Incremental joint extraction of entity mentions and relations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 402–412.

Yankai Lin, Shiqi Shen, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. 2016. Neural relation extraction with selective attention over instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 2124–2133.

Xuezhe Ma and Eduard Hovy. 2016. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. arXiv preprint arXiv:1603.01354.

Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, and Richard Socher. 2018. The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730.

Makoto Miwa and Mohit Bansal. 2016. End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770.

Makoto Miwa, Rune Sætre, Yusuke Miyao, and Jun'ichi Tsujii. 2009. A rich feature vector for protein-protein interaction extraction from multiple corpora. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1, pages 121–130. Association for Computational Linguistics.

Makoto Miwa and Yutaka Sasaki. 2014. Modeling joint entity and relation extraction with table representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1858–1869.

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature, 518(7540):529.

Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, and Steve Young. 2015. Multi-domain dialog state tracking using recurrent neural networks. arXiv preprint arXiv:1506.07190.

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365.

Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.

Dan Roth and Wen-tau Yih. 2004. A linear programming formulation for global inference in natural language tasks. Technical report, University of Illinois at Urbana-Champaign, Department of Computer Science.

Dan Roth and Wen-tau Yih. 2007. Global inference for entity and relation identification via a linear programming formulation. Introduction to Statistical Relational Learning, pages 553–580.

Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. 2016. Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603.
Yelong Shen, Po-Sen Huang, Jianfeng Gao, and Weizhu Chen. 2017. ReasoNet: Learning to stop reading in machine comprehension. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1047–1055. ACM.

Sameer Singh, Sebastian Riedel, Brian Martin, Jiaping Zheng, and Andrew McCallum. 2013. Joint inference of entities, relations, and coreference. In Proceedings of the 2013 Workshop on Automated Knowledge Base Construction, pages 1–6. ACM.

Changzhi Sun, Yuanbin Wu, Man Lan, Shiliang Sun, Wenting Wang, Kuang-Chih Lee, and Kewen Wu. 2018. Extracting entities and relations with joint minimum risk training. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2256–2265.

Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075.

Ryuichi Takanobu, Tianyang Zhang, Jiexi Liu, and Minlie Huang. 2018. A hierarchical framework for relation extraction with reinforcement learning. arXiv preprint arXiv:1811.03925.

Chuanqi Tan, Furu Wei, Nan Yang, Bowen Du, Weifeng Lv, and Ming Zhou. 2017. S-Net: From answer extraction to answer generation for machine reading comprehension. arXiv preprint arXiv:1706.04815.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.

Linlin Wang, Zhu Cao, Gerard de Melo, and Zhiyuan Liu. 2016a. Relation classification via multi-level attention CNNs.

Shuohang Wang and Jing Jiang. 2016. Machine comprehension using match-LSTM and answer pointer. arXiv preprint arXiv:1608.07905.
Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, and Murray Campbell. 2017a. Evidence aggregation for answer re-ranking in open-domain question answering. arXiv preprint arXiv:1711.05116.

Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, and Ming Zhou. 2017b. Gated self-matching networks for reading comprehension and question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 189–198.

Zhiguo Wang, Haitao Mi, Wael Hamza, and Radu Florian. 2016b. Multi-perspective context matching for machine comprehension. arXiv preprint arXiv:1612.04211.

Tsung-Hsien Wen, David Vandyke, Nikola Mrkšić, Milica Gašić, Lina M. Rojas-Barahona, Pei-Hao Su, Stefan Ultes, and Steve Young. 2016. A network-based end-to-end trainable task-oriented dialogue system. arXiv preprint arXiv:1604.04562.

Jason D. Williams and Steve Young. 2005. Scaling up POMDPs for dialog management: The "Summary POMDP" method. In IEEE Workshop on Automatic Speech Recognition and Understanding, 2005, pages 177–182. IEEE.

Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256.

Caiming Xiong, Victor Zhong, and Richard Socher. 2017. DCN+: Mixed objective and deep residual coattention for question answering. arXiv preprint arXiv:1711.00106.

Bishan Yang and Claire Cardie. 2013. Joint inference for fine-grained opinion extraction. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 1640–1649.

Hai Ye, Wenhan Chao, Zhunchen Luo, and Zhoujun Li. 2016. Jointly extracting relations with class ties via effective deep ranking. arXiv preprint arXiv:1612.07602.

Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. 2018. QANet: Combining local convolution with global self-attention for reading comprehension. arXiv preprint arXiv:1804.09541.

Xiaofeng Yu and Wai Lam. 2010. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pages 1399–1407. Association for Computational Linguistics.

Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella. 2003. Kernel methods for relation extraction. Journal of Machine Learning Research, 3(Feb):1083–1106.

Xiangrong Zeng, Daojian Zeng, Shizhu He, Kang Liu, and Jun Zhao. 2018. Extracting relational facts by an end-to-end neural model with copy mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pages 506–514.

Meishan Zhang, Yue Zhang, and Guohong Fu. 2017. End-to-end neural relation extraction with global optimization. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1730–1740.

Suncong Zheng, Yuexing Hao, Dongyuan Lu, Hongyun Bao, Jiaming Xu, Hongwei Hao, and Bo Xu. 2017. Joint entity and relation extraction based on a hybrid neural network. Neurocomputing, 257:59–66.