Multi-View Inference for Relation Extraction with Uncertain Knowledge
The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

Bo Li¹,², Wei Ye¹*, Canming Huang³, Shikun Zhang¹
¹ National Engineering Research Center for Software Engineering, Peking University
² School of Software and Microelectronics, Peking University
³ Beijing University of Posts and Telecommunications
* Corresponding author.
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Knowledge graphs (KGs) are widely used to facilitate relation extraction (RE) tasks. While most previous RE methods focus on leveraging deterministic KGs, uncertain KGs, which assign a confidence score for each relation instance, can provide prior probability distributions of relational facts as valuable external knowledge for RE models. This paper proposes to exploit uncertain knowledge to improve relation extraction. Specifically, we introduce ProBase, an uncertain KG that indicates to what extent a target entity belongs to a concept, into our RE architecture. We then design a novel multi-view inference framework to systematically integrate local context and global knowledge across three views: mention-, entity- and concept-view. The experimental results show that our model achieves competitive performance on both sentence- and document-level relation extraction, which verifies the effectiveness of introducing uncertain knowledge and of the multi-view inference framework that we design.

Introduction

The goal of relation extraction (RE) is to classify the semantic relation between entities in a given context. It has plenty of practical applications, such as question answering (Yu et al. 2017) and information retrieval (Kadry and Dietz 2017). Knowledge graphs (KGs) (Bollacker et al. 2008; Ruppenhofer et al. 2006; Speer, Chin, and Havasi 2017; Wu et al. 2012), which contain pre-defined nodes (usually entities or concepts) and their relations, have been widely incorporated into RE tasks to integrate various prior knowledge in recent years.

The interaction between KGs and RE lies in two main aspects. On the one hand, we can use relational facts in existing KGs to assign relation labels for entity pairs in unlabeled corpora to build datasets for distant supervision RE (Mintz et al. 2009). On the other hand, we can leverage the external knowledge in KGs to boost the performance of RE models, which is the focus of this paper. Generally, there are two ways to integrate prior knowledge into RE. One way is to utilize structured information explicitly. For example, Can et al. (2019) retrieved the synonyms of each entity from WordNet¹; Lei et al. (2018) extracted the n-gram text matching words of each entity from FreeBase (Bollacker et al. 2008); Li et al. (2019a) used relative semantic frames from FrameNet (Ruppenhofer et al. 2006). The other way is to explore the latent semantics of KGs. Researchers may incorporate pre-trained entity and relation embeddings (Wang et al. 2018; Hu et al. 2019), e.g., from TransE (Bordes et al. 2013), or jointly learn the entity embedding and RE models' parameters (Han, Liu, and Sun 2018; Xu and Barbosa 2019).

¹ https://wordnet.princeton.edu/

Despite various ways of leveraging KGs, most previous RE methods focus on deterministic KGs, where a specific relation either exists between two entities or not. However, in real-world scenarios, prior knowledge resources may contain inconsistent, ambiguous, and uncertain information (Wu et al. 2012). It would not be easy to capture highly-related external information if we treat all relations between entities as equally important. Another type of KG, called uncertain KGs (Chen et al. 2019), such as ConceptNet (Speer, Chin, and Havasi 2017) and ProBase (Wu et al. 2012), comes to the rescue. Given two words (typically two entities or an entity and a concept) and a relation, uncertain KGs provide a confidence score that measures the possibility of the two words having the given relation. This inspires us to exploit the prior probability distribution of relational facts in uncertain KGs to improve relation extraction.

As a representative uncertain KG, ProBase provides the IsA relation between an entity and a concept, indicating to what extent the target entity belongs to the concept, which is essential information for RE. Meanwhile, the concise structure of ProBase makes it convenient to couple with supervised relation extraction datasets. Therefore, we choose to use ProBase as our external uncertain KG and retrieve highly-related concepts of each entity. The relational facts and their confidence scores provide a prior probability distribution of concepts for an entity, which can serve as valuable supplementary knowledge given limited local contextual information.

ProBase brings global knowledge about related concepts for an entity pair, while the target document (or sentence) provides local semantic information about related mentions. We now have three views for the contextual information of a relational fact: concept view, entity view, and mention view. It is a non-trivial task to aggregate information across different views effectively to obtain discriminative feature vectors. Inspired by Hao et al. (2019), we designed a multi-view inference framework to synthesize contextual semantics from different sources and representation levels systematically. For a given entity, we first retrieve the top K concepts that the target entity most likely belongs to, and then perform cross-view interaction to aggregate the local and global information based on the confidence scores in ProBase. Figure 1 demonstrates the overview of our multi-view inference framework.

[Figure 1 panels: Concept-view, Entity-view, Mention-view. Legend: Intra-view Links, Cross-view Links, Relation in Context, Information Aggregation.]

Figure 1: An example of a three-view inference framework for document-level relation extraction. In a document, an entity may have many mentions, usually in different forms. For a given entity, the local interactive representation is aggregated by different mention representations. This is a cross-view interaction between mention- and entity-view. Moreover, by retrieving uncertain KG, the concepts and weighting scores are used to produce the concept representation, which will gather descriptions of entities and concepts to obtain global interactive representations. This is a cross-view interaction between entity- and concept-view. Finally, entity-view is used for contextual information aggregation and relation prediction. Sentence-level relation extraction only involves entity- and concept-view. The figure is better viewed in color.

Last but not least, since the text descriptions of entities and concepts on Wikipedia also provide rich semantic information for RE, we retrieved the text descriptions from Wikipedia to enrich the representations of entities and concepts, resulting in a corpus that couples with ProBase, which we call ProBase Desp. The dataset provides high-quality descriptions of more than 15,000,000 entities and concepts, serving as another valuable external knowledge resource for our method. We have released ProBase Desp² to facilitate further research.

In summary, we leverage uncertain KG, ProBase, to im- [...] sults show that our method achieves competitive results on both sentence- and document-level relation extraction tasks.

Our contributions are summarized as follows:

• We propose MIUK, a Multi-view Inference framework for relation extraction with Uncertain Knowledge. Our work is pioneering research on introducing uncertain KG into relation extraction and investigating interactions among mentions, entities, and concepts.
• We conduct extensive experiments to verify MIUK and achieve competitive results on both sentence- and document-level relation extraction tasks.
• We build and release a corpus with high-quality descriptions of more than 15,000,000 entities and concepts. It can serve as a valuable external knowledge resource for relation extraction and many other natural language processing tasks.

Problem Definition and Input Formalization

Problem Definition

MIUK can handle both sentence- and document-level relation extraction tasks by leveraging the multi-view inference framework. For sentence-level relation extraction, MIUK uses a two-view inference framework that exploits the entity- and concept-view representations; for document-level relation extraction, the mention-view representation is added to constitute a three-view inference framework. In this section, we introduce the architecture of MIUK by taking document-level relation extraction as an example, as sentence-level relation extraction is just its simplified case.

For an input document $D$ that contains $n$ sentences ($D = \{s_1, s_2, \ldots, s_n\}$) and $p$ different entities, there will be $p \cdot (p - 1)$ entity pair candidates. The goal of MIUK is to predict the relations of all possible entity pairs in a parallel way. We use $m$, $e$, $c$, $s$ to denote the mention, entity, concept, and sentence respectively, and their corresponding low-dimensional vectors are denoted as $\mathbf{m}$, $\mathbf{e}$, $\mathbf{c}$, $\mathbf{s}$. Besides, the weighting score between an entity and a concept is a real number denoted as $w$.

Input Formalization

This section details the data preparation for our method, which consists of three parts: input document (the context), uncertain KG (ProBase), and descriptions from Wikipedia (ProBase Desp), as shown in Figure 2. Note that we use Uncased BERT-Base (Devlin et al. 2019) to encode the input document and descriptions.

Input Document Since the input document may contain a lot of entities with various mentions, it is not applicable to add position embeddings (Zeng et al.