
Using Entity Information from a Knowledge Base to Improve Relation Extraction

Lan Du1,∗ Anish Kumar2, Mark Johnson2 and Massimiliano Ciaramita3
1Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia
2Department of Computing, Macquarie University, Sydney, NSW 2109, Australia
3Google Research, Zurich, Switzerland

Abstract

Relation extraction is the task of extracting predicate-argument relationships between entities from natural language text. This paper investigates whether background information about entities available in knowledge bases such as FreeBase can be used to improve the accuracy of a state-of-the-art relation extraction system. We describe a simple and effective way of incorporating FreeBase's notable types into a state-of-the-art relation extraction system (Riedel et al., 2013). Experimental results show that our notable type-based system achieves an average 7.5% weighted MAP score improvement. To understand where the notable type information contributes the most, we perform a series of ablation experiments. Results show that the notable type information improves relation extraction more than NER labels alone across a wide range of entity types and relations.

1 Introduction

The goal of relation extraction is to extract relational information about entities from a large text collection. For example, given the text "Michael Bay, the director of Transformers, visited Paris yesterday," a relation extraction system might extract the relationship film director(Michael Bay, Transformers). These tuples can then be used to extend a knowledge base. With the increase in the amount of textual data available on the web, relation extraction has gained wide applications in information extraction from both general newswire texts and specialised document collections such as biomedical texts (Liu et al., 2007).

A typical relation extraction system functions as a pipeline, first performing named entity recognition (NER) and entity disambiguation to link the entity mentions found in sentences to their database entries (e.g., "Michael Bay" and "Transformers" would both be linked to their respective database ids). Then the context in which these entity mentions co-occur is used to predict the relationship between the entities. For example, the path in a syntactic parse between two mentions in a sentence can be used as a feature to predict the relation holding between the two entities. Continuing our example, the text pattern feature X-the-director-of-Y (or a corresponding parse subtree fragment) might be used to predict the database relation film director(X,Y). In such a pipeline architecture, information about the entities from the database is available and can be used to help determine the most appropriate relationship between the entities. The goal of this paper is to identify whether that information is useful in a relation extraction task, and to study such information about the entities with a set of ablation experiments.

We hypothesise that information from database entries can play the role of background knowledge in human sentence comprehension. There is strong evidence that humans use world knowledge and contextual information in both syntactic and semantic interpretation (Spivey-Knowlton and Sedivy, 1995), so it is reasonable to expect a machine might benefit from it as well. Continuing with our example, if our database contained the information that one particular entity with the name Michael Bay had died a decade before the movie Transformers was released, then it might be reasonable to conclude that this particular individual was unlikely to have directed Transformers. Clearly, modelling all the ways in which such background information about entities might be used would be extremely complex. This paper explores a simple way of using some of the background information about entities available in FreeBase (Bollacker et al., 2008).

∗This work was partially done while Lan Du was at Macquarie University.

Lan Du, Anish Kumar, Mark Johnson and Massimiliano Ciaramita. 2015. Using Entity Information from a Knowledge Base to Improve Relation Extraction. In Proceedings of Australasian Language Technology Association Workshop, pages 31−38.

Here we focus on one particular kind of background information about entities: the information encoded in FreeBase's notable types. FreeBase's notable types are simple atomic labels given to entities that indicate what the entity is notable for, and so serve as a useful information source that should be relatively easy to exploit. For example, the search results for "Jim Jones" given by FreeBase contain several different entities. Although they all have the same named entity (NE) category PERSON, their notable types are different. The notable types for the top 4 "Jim Jones" results are organization/organization_founder, music/composer, baseball/baseball_player and government/politician. It is clear that the notable type information provides much finer-grained information about "Jim Jones" than just the NE category. It is reasonable to expect that notable types would be useful for relation extraction; e.g., the politician Jim Jones is likely to stand for election, while the baseball player is likely to be involved in sport activities.

We extend the state-of-the-art relation extraction system of Riedel et al. (2013) to exploit this notable type information. Our notable type extensions significantly improve the mean averaged precision (MAP) by 7.5% and the weighted MAP by 6% over a strong state-of-the-art baseline. With a set of ablation experiments we further evaluate how and where the notable type information contributes to relation extraction.

The rest of this paper is structured as follows. The next section describes related work on relation extraction. Section 3 describes how a state-of-the-art relation extraction system can be extended to exploit the notable type information available in FreeBase. Section 4 specifies the procedures used to identify the values of the model parameters, while Section 5 explains how we evaluate our models and presents a systematic experimental comparison of the models by ablating the notable type information in different ways based on entities' NE categories. Section 6 concludes the paper and discusses future work.

2 Related work

Most approaches to relation extraction are either supervised or semi-supervised. Supervised approaches require a large set of manually annotated text as training data (Culotta and Sorensen, 2004), but creating these annotations is both expensive and error-prone. Semi-supervised approaches, by contrast, rely on correlations between relations and other large data sources.

In relation extraction, most semi-supervised approaches use distant supervision, which aligns facts from a large database, e.g., Freebase, to unlabelled text by assuming some systematic relationship between the documents and the database (Bunescu and Mooney, 2007; Mintz et al., 2009; Riedel et al., 2010; Yao et al., 2010). Typically, we assume that (a) an entity linker can reliably identify entity mentions in the text and map them to the corresponding database entries, and (b) for all tuples of entities that appear in a relation in the database, if we observe that entity tuple co-occurring in a suitable linguistic construction (e.g., a sentence) then that construction expresses the database relationship about those entities. Previous work (Weston et al., 2013; Riedel et al., 2013; Bordes et al., 2013; Chang et al., 2014) has shown that models leveraging rich information from the database often yield improved performance.

In this work we are particularly interested in exploring entity type information in relation extraction, as semantic relations often have selectional preferences over entity types. Yao et al. (2010), Singh et al. (2013), Yao et al. (2013), Koch et al. (2014) and Chang et al. (2014) have shown that the use of type information, e.g., NE categories, significantly improves relation extraction. Our work here is similar except that we rely on Freebase's notable types, which provide much finer-grained information about entities. One of the challenges in relation extraction, particularly when attempting to extract a large number of relations, is to generalise appropriately over both entities and relations. Techniques for inducing distributed vector-space representations can learn embeddings of both entities and relations in a high-dimensional vector space, providing a natural notion of similarity (Socher et al., 2013) that can be exploited in the relation extraction task (Weston et al., 2013). Instead of treating notable types as features (Ling and Weld, 2012), here we learn distributed vector-space representations for notable types as well as entities, entity tuples and relations.

3 Relation extraction as matrix completion

Riedel et al. (2013) formulated the relation extraction task as a matrix completion problem. In this section we extend this formulation to exploit notable types in a simple and effective way. Specifically, we follow Riedel et al. (2013) in assuming that our data O consists of pairs ⟨r, t⟩, where r ∈ R is a relation and t ∈ T is a tuple of entities. The tuples are divided into training and test sets depending on which documents they are extracted from. In this paper, the tuples in T are always pairs of entities, but nothing depends on this. There are two kinds of relations in R: syntactic patterns found in the document collection, and those appearing in the database (including target relations for extraction). For our notable type extension we assume we have a function n that maps an entity e to its FreeBase notable type n(e).

For example, given the text "Michael Bay, the director of Transformers, visited Paris yesterday" we extract the pair ⟨r, t⟩ where t = ⟨Michael Bay, Transformers⟩ and r = X-the-director-of-Y (actually, the path in a dependency parse between the named entities). From FreeBase we extract the pair ⟨r′, t⟩ where r′ = film/director. FreeBase also tells us that n(Michael Bay) = Person and n(Transformers) = Film.

Our goal is to learn a matrix Θ whose rows are indexed by entity tuples in T and whose columns are indexed by relations in R. The entry θ_{r,t} is the log odds of relation r ∈ R holding of tuple t ∈ T; equivalently, the probability that relation r holds of tuple t is given by σ(θ_{r,t}), where σ is the logistic function: σ(x) = (1 + e^{−x})^{−1}.

Riedel et al. (2013) assume that Θ is the sum of three submodels: Θ = Θ^N + Θ^F + Θ^E, where Θ^N is the neighbourhood model, Θ^F is the latent feature model and Θ^E is the entity model (these will be defined below). Here we extend these submodels using FreeBase's notable types.

3.1 A notable type extension to the neighbourhood model

The neighbourhood model Θ^N captures dependencies between the syntactic relations extracted from the text documents and the database relations extracted from FreeBase. This is given by:

    θ^N_{r,t} = Σ_{⟨r′,t⟩ ∈ O \ {⟨r,t⟩}} w_{r,r′}

where O is the set of relation/tuple pairs in the data and O \ {⟨r,t⟩} is O with the pair ⟨r,t⟩ removed. w is a matrix of parameters, where w_{r,r′} is a real-valued weight with which relation r′ "primes" relation r, learnt from the training data. The neighbourhood model can be regarded as predicting an entry θ_{r,t} by using entries along the same row. It functions as a logistic regression classifier predicting the log odds of a FreeBase relation r applying to the entity tuple t, using as features the syntactic relations r′ that hold of t.

Our notable type extension to the neighbourhood model enriches the syntactic patterns in the training data O with notable type information. For example, if there is a syntactic pattern X-director-of-Y in our training data (say, as part of the pair ⟨X-director-of-Y, ⟨Michael Bay, Transformers⟩⟩), then we add a new syntactic pattern Person(X)-director-of-Film(Y), where Person and Film are notable types, and add the pair ⟨Person(X)-director-of-Film(Y), ⟨Michael Bay, Transformers⟩⟩ to our data O. Each new relation corresponds to a new column in our matrix completion formulation. More precisely, the new relations are members of the set N = {⟨r, n(t)⟩ : ⟨r, t⟩ ∈ O}, where n(t) is the tuple of notable types corresponding to the entity tuple t. For example, if t = ⟨Michael Bay, Transformers⟩ then n(t) = ⟨Person, Film⟩. Then the notable type extension of the neighbourhood model is:

    θ^{N′}_{r,t} = Σ_{⟨r′,t⟩ ∈ O \ {⟨r,t⟩}} (w_{r,r′} + w′_{r,⟨r′,n(t)⟩})

where w′ is a matrix of weights relating the relations in N to the target FreeBase relation r.

3.2 A notable type extension to the latent feature model

The latent feature model generalises over relations and entity tuples by associating each of them with a 100-dimensional real-valued vector. Intuitively, these vectors organise the relations and entity tuples into clusters where conceptually similar relations and entity tuples are "close," while those that are dissimilar are far apart. In more detail, each relation r ∈ R is associated with a latent feature vector a_r of size K = 100. Similarly, each entity tuple t ∈ T is also associated with a latent feature vector v_t of size K. Then the latent feature score for an entity tuple t and relation r is just the dot product of the corresponding relation and entity tuple vectors, i.e.: θ^F_{r,t} = a_r · v_t.
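To make the scoring functions concrete, the neighbourhood and latent feature scores can be sketched in a few lines of Python. This is a toy illustration only, not the released system: the function names and data structures are ours.

```python
import math

def neighbourhood_score(r, t, w, observed):
    """theta^N_{r,t}: sum of priming weights w[(r, r')] over the
    relations r' observed with the same tuple t, excluding (r, t) itself."""
    return sum(w.get((r, r2), 0.0)
               for (r2, t2) in observed
               if t2 == t and (r2, t2) != (r, t))

def latent_feature_score(a_r, v_t):
    """theta^F_{r,t}: dot product of the relation and tuple embeddings."""
    return sum(x * y for x, y in zip(a_r, v_t))

def sigma(theta):
    """Logistic function: maps the log odds theta_{r,t} to a probability."""
    return 1.0 / (1.0 + math.exp(-theta))
```

For instance, with observed = {("X-the-director-of-Y", ("Michael Bay", "Transformers"))} and a learnt weight w[("film/director", "X-the-director-of-Y")] = 2.0, the neighbourhood score of film/director for that tuple is 2.0, and sigma turns such log-odds scores into probabilities.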

We extend the latent feature model by associating a new latent feature vector with each notable type sequence observed in the training data, and use this vector to enrich the vector-space representations of the entity tuples. Specifically, let T′ = {n(t) : t ∈ T} be the set of notable type tuples for all of the tuples in T, where n(t) is the tuple of notable types corresponding to the tuple of entities t as before. We associate each tuple of notable types t′ ∈ T′ with a latent feature vector v′_{t′} of dimensionality K. Then we define the notable type extension to the latent feature model as:

    θ^{F′}_{r,t} = a_r · (v_t + v′_{n(t)})

This can be understood as associating each entity tuple t ∈ T with a pair of latent feature vectors v_t and v′_{n(t)}. The vector v′_{n(t)} is based on the notable types of the entities, so it can capture generalisations over those notable types. The L2 regularisation employed during inference prefers latent feature vectors in which v_t and v′_{n(t)} are small, thus encouraging generalisations which can be stated in terms of notable types to be captured by v′_{n(t)}.

3.3 A notable type extension of the entity model

The entity model represents an entity e with a K-dimensional (K = 100) feature vector u_e. Similarly, the ith argument position of a relation r is also represented by a K-dimensional feature vector d_{r,i}. The entity model associates a score θ^E_{r,t} with a relation r ∈ R and entity tuple t ∈ T as follows:

    θ^E_{r,t} = Σ_{i=1}^{|t|} d_{r,i} · u_{t_i}

where |t| is the arity of (i.e., number of elements in) the entity tuple t, t_i is the ith entity in the entity tuple t, and d_{r,i} and u_{t_i} are K-dimensional vectors associated with the ith argument slot of relation r and the entity t_i respectively. The intuition is that the latent feature vectors of co-occurring entities and argument slots should be close to each other in the K-dimensional latent feature space, while entities and argument slots that do not co-occur should be far apart.

Our notable type extension of the entity model is similar to our notable type extension of the latent feature model. We associate each notable type m with a K-dimensional feature vector u′_m, and use those vectors to define the entity model score. Specifically, the entity model score is defined as:

    θ^{E′}_{r,t} = Σ_{i=1}^{|t|} d_{r,i} · (u_{t_i} + u′_{n(t_i)})

where n(e) is the notable type for entity e and |t| is the length of tuple t. The L2 regularisation again should encourage generalisations that can be expressed in terms of notable types to be encoded in the u′_{n(t_i)} latent feature vectors.

4 Inference for model parameters

The goal of inference is to identify the values of the model's parameters, i.e., w, a, v, d and u in the case of the Riedel et al. (2013) model, and these plus w′, v′ and u′ in the case of the notable type extensions. The inference procedure is inspired by Bayesian Personalised Ranking (Rendle et al., 2009). Specifically, while the true value of θ_{r,t} is unknown, it is reasonable to assume that if ⟨r, t+⟩ ∈ O (i.e., is observed in the training data) then θ_{r,t+} > θ_{r,t−} for all ⟨r, t−⟩ ∉ O (i.e., not observed in the training data). Thus the training objective is to maximise

    ℓ = Σ_{⟨r,t+⟩ ∈ O} Σ_{⟨r,t−⟩ ∉ O} ℓ_{⟨r,t+⟩,⟨r,t−⟩}

where ℓ_{⟨r,t+⟩,⟨r,t−⟩} = log σ(θ_{r,t+} − θ_{r,t−}), and θ_{r,t} = θ^N_{r,t} + θ^F_{r,t} + θ^E_{r,t} or θ_{r,t} = θ^{N′}_{r,t} + θ^{F′}_{r,t} + θ^{E′}_{r,t}, depending on whether the submodels with notable type extensions are used. The objective function ℓ is then maximised by using stochastic gradient ascent. The stochastic gradient procedure sweeps through the training data, and, for each observed pair ⟨r, t+⟩ ∈ O, samples a negative evidence pair ⟨r, t−⟩ ∉ O not in the training data, adjusting weights to prefer the observed pair.

In our experiments below we ran stochastic gradient ascent with a step size of 0.05 and an L2 regulariser constant of 0.1 for the neighbourhood model and 0.01 for the latent feature and entity models (we used the same regulariser constants for models both with and without the notable type extensions). We ran 2,000 sweeps of stochastic gradient ascent.

5 Experimental evaluation

We used a set of controlled experiments to see to what extent the notable type information improves the state-of-the-art relation extraction system. We used the New York Times corpus (Sandhaus, 2008) in our experiments, assigning articles from the year 2000 as the training corpus and the articles from 1990 to 1999 for testing. The entity tuples T were extracted from the New York Times corpus (tuples that did not appear at least 10 times and also appear in one of the FreeBase relations were discarded). The relations R are either syntactic patterns found in the New York Times corpus, FreeBase relations, or (in our extension) notable types extracted from FreeBase. Our evaluation focuses on 19 FreeBase relations, as in Riedel et al. (2013).

Relation                        #    NF    NF^T  NFE   NFE^T
person/company                  131  0.83  0.89  0.83  0.86
location/containedby            88   0.68  0.69  0.68  0.69
person/nationality              51   0.11  0.55  0.15  0.45
author/works_written            38   0.51  0.53  0.57  0.53
person/parents                  34   0.14  0.31  0.11  0.28
parent/child                    31   0.48  0.58  0.49  0.58
person/place_of_birth           30   0.51  0.48  0.56  0.57
person/place_of_death           22   0.75  0.77  0.75  0.77
neighbourhood/neighbourhood_of  17   0.48  0.55  0.52  0.54
broadcast/area_served           8    0.21  0.41  0.26  0.30
company/founders                7    0.46  0.27  0.40  0.28
team_owner/teams_owned          6    0.21  0.25  0.25  0.25
team/arena_stadium              5    0.06  0.07  0.06  0.09
film/directed_by                5    0.21  0.25  0.24  0.35
person/religion                 5    0.20  0.28  0.21  0.23
composer/compositions           4    0.42  0.44  0.06  0.42
sports_team/league              4    0.70  0.62  0.63  0.64
film/produced_by                3    0.17  0.30  0.12  0.26
structure/architect             2    1.00  1.00  1.00  1.00
MAP                                  0.43  0.49  0.42  0.48
Weighted MAP                         0.55  0.64  0.56  0.62

Table 1: Averaged precision and mean average precision results. The rows correspond to FreeBase relations, and the columns indicate the combination of sub-models (N = neighbourhood model, F = latent feature model, E = entity model). The superscript "T" indicates the combined models that incorporate the notable type extensions, and the # column gives the number of true facts.

5.1 Notable type identification

Our extension requires a FreeBase notable type for every entity mention, which in turn requires a Freebase entity id because a notable type is a property associated with entities in FreeBase. We found the entity id for each named entity as follows. We used the FreeBase API to search for the notable type for each named entity mentioned in the training or test data. In cases where several entities were returned, we used the notable type of the first entity returned by the API. For example, the FreeBase API returns two entities for the string "Canada": a country and a wine (in that order), so we use the notable type "country" for "Canada" in our experiments. This heuristic is similar to the method of choosing the most likely entity id for a string, which provides a competitive baseline for entity linking (Hoffart et al., 2011).

5.2 Evaluation procedure

After the training procedure is complete and we have estimates for the model's parameters, we can use these to compute estimates for the log odds θ_{r,t} for the test data. These values quantify how likely it is that the FreeBase relation r holds of an entity tuple t from the test set, according to the trained model.

In evaluation we follow Riedel et al. (2013) and treat each of the 19 relations r as a query, and evaluate the ranking of the entity tuples t returned according to θ_{r,t}. For each relation r we pool the highest-ranked 100 tuples produced by each of the models and manually evaluate their accuracy (e.g., by inspecting the original document if necessary). This gives a set of results that can be used to calculate a precision-recall curve. Averaged precision (AP) is a measure of the area under that curve (higher is better), and mean average precision (MAP) is average precision averaged over all of the relations we evaluate on. Weighted MAP is a version of MAP that weights each relation by the true number of entity tuples for that relation (so more frequent relations count more).

An unusual property of this evaluation is that increasing the number of models being evaluated generally decreases their MAP scores: as we evaluate more models, the pool of "true" entity tuples for each relation grows in size and diversity (recall that this pool is constructed by manually annotating the highest-scoring tuples returned by each model). Thus in general the recall scores of the existing models are lowered as the number of models increases.

[Figure 1 (plot omitted): Averaged 11-point precision-recall curve for the four models shown in Table 1. Axes: Recall (x) vs. Precision (y); curves for NF, NF^T, NFE and NFE^T.]

5.3 Experiments with Notable Types

We obtained the best performance from the model that incorporates all submodels (which we call NFE^T) and from the model that only incorporates the neighbourhood and latent feature submodels (which we call NF^T), so we concentrate on them here. Table 1 presents the MAP and weighted MAP scores for these models on the 19 FreeBase relations in the testing set. The MAP scores are 6% higher for both NF^T and NFE^T, and the weighted MAP scores are 9% and 6% higher for NF^T and NFE^T respectively.
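The per-relation averaged precision and the weighted MAP described in Section 5.2 can be sketched as follows (a minimal illustration; the function names are ours):

```python
def average_precision(ranked_tuples, true_tuples):
    """Average precision for one relation treated as a query:
    the mean of precision@k at each rank k where a true tuple appears."""
    hits, precision_sum = 0, 0.0
    for k, t in enumerate(ranked_tuples, start=1):
        if t in true_tuples:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(true_tuples) if true_tuples else 0.0

def weighted_map(ap_by_relation, n_true_by_relation):
    """MAP weighted by the number of true facts per relation
    (the # column of Table 1), so frequent relations count more."""
    total = sum(n_true_by_relation.values())
    return sum(ap_by_relation[r] * n_true_by_relation[r]
               for r in ap_by_relation) / total
```

For example, a ranking ["a", "b", "c", "d"] against the true set {"a", "c"} gives AP = (1/1 + 2/3)/2 = 5/6, and weighting two relations' AP scores by their fact counts reproduces the "Weighted MAP" aggregation.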

Relation                        #    NE    NFE^T NE+P  NE+L  NE+O  NE+M
person/place_of_birth           30   0.52  0.57  0.54  0.50  0.50  0.54
author/works_written            38   0.57  0.53  0.61  0.56  0.57  0.49
team/arena_stadium              5    0.08  0.09  0.10  0.09  0.07  0.09
composer/compositions           4    0.35  0.42  0.51  0.37  0.35  0.45
person/company                  131  0.81  0.86  0.84  0.82  0.83  0.86
film/directed_by                5    0.30  0.35  0.41  0.27  0.27  0.41
neighbourhood/neighbourhood_of  17   0.59  0.54  0.59  0.49  0.59  0.62
film/produced_by                3    0.20  0.26  0.29  0.18  0.19  0.40
person/religion                 5    0.22  0.23  0.21  0.22  0.28  0.53
location/containedby            88   0.66  0.69  0.68  0.64  0.64  0.70
sports_team/league              4    0.53  0.64  0.54  0.52  0.75  0.75
person/parents                  34   0.33  0.28  0.30  0.32  0.35  0.34
parent/child                    31   0.55  0.58  0.56  0.55  0.59  0.56
person/place_of_death           22   0.71  0.77  0.74  0.74  0.78  0.72
company/founders                7    0.22  0.28  0.28  0.21  0.29  0.22
team_owner/teams_owned          6    0.34  0.25  0.27  0.34  0.36  0.35
person/nationality              51   0.19  0.45  0.23  0.50  0.20  0.21
broadcast/area_served           8    0.32  0.30  0.33  0.38  0.31  0.29
structure/architect             2    1.00  1.00  1.00  1.00  1.00  1.00
MAP                                  0.45  0.48  0.48  0.46  0.47  0.50
Weighted MAP                         0.57  0.62  0.59  0.60  0.58  0.60

Table 2: Results of ablation experiments on the NFE^T model. The columns correspond to experiments, and the column labels are explained in Table 3.

NE:    All entities are labelled with their NE class instead of their notable type.
NE+P:  Only PERSON entities have notable type information; the notable type of other entities is replaced with their NE class.
NE+L:  Only LOCATION entities have notable type information; the notable type of other entities is replaced with their NE class.
NE+O:  Only ORGANISATION entities have notable type information; the notable type of other entities is replaced with their NE class.
NE+M:  Only MISC entities have notable type information; the notable type of other entities is replaced with their NE class.

Table 3: Descriptions of the ablation experiments in Table 2.

A sign test shows that the difference between the models with notable types and those without the notable types is statistically significant (p < 0.05). Clearly, the notable type extensions significantly improve the accuracy of the existing relation extraction models. Figure 1 shows an averaged 11-point precision-recall curve for these four models. This makes clear that across the range of precision-recall trade-offs, the models with notable types offer the best performance.

5.4 Ablation Experiments

We performed a set of ablation experiments to determine exactly how and where the notable type information improves relation extraction. In these experiments entities are divided into 4 "named entity" (NE) classes, and we examine the effect of providing notable type information only for the entities of a single NE class. The 4 NE classes we used were PERSON, LOCATION, ORGANISATION, and MISC (miscellaneous). We classified all entities into these four categories using their FreeBase types, which provide a more coarse-grained classification than notable types. For example, if an entity has a FreeBase "people/person" type, then we assigned it to the NE class PERSON; if an entity has a "location/location" type, then its NE class is LOCATION; and if an entity has an "organisation/organisation" type, then its NE class is ORGANISATION. All entities not classified as PERSON, LOCATION, or ORGANISATION were labelled MISC.

We ran the ablation experiments as follows. For each NE class c in turn, we replaced the notable type information for entities not classified as c with their NE class. For example, when c = PERSON, only entities with the NE label PERSON had notable type information, and the notable types of all other entities were replaced with their NE labels. Table 3 lists the different ablation experiments, which are designed to study which NE classes the notable types help most on. The results are reported in Table 2, and clearly indicate that different relations benefit from different kinds of notable type information about entities.

Column "NE+P" shows that relations such as "author/works_written", "composer/compositions" and "film/directed_by" benefit the most from notable type information about PERSONs. We noticed that there are about 43K entities classified as PERSON, including 8,888 book authors, 802 music composers, 1,212 film directors, etc. These entities have 214 distinct notable types. Our results show that it is helpful to distinguish the PERSON entities with their notable types for relations involving professions. For example, not all people are authors, so knowing that a person is an author increases the accuracy of extracting "author/works_written".

Similarly, Column "NE+L" shows that "person/nationality" and "broadcast/area_served" gain the most from the notable type information about locations. There are about 8.5K entities classified as LOCATION, including 4,807 city towns, 301 countries, and so on. There are 170 distinct notable types for LOCATION entities.

Column "NE+O" shows that the notable type information about ORGANISATION entities improves the accuracy of extracting relations involving organisations.
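The coarse NE-class assignment described above can be sketched as follows. This is an illustration only: the FreeBase type strings are those quoted in the text, the function name is ours, and the check order (which gives PERSON precedence for entities carrying several of these types) is our assumption.

```python
def ne_class(freebase_types):
    """Map an entity's set of FreeBase types to one of the four coarse
    NE classes used in the ablation experiments (Section 5.4)."""
    if "people/person" in freebase_types:
        return "PERSON"
    if "location/location" in freebase_types:
        return "LOCATION"
    if "organisation/organisation" in freebase_types:
        return "ORGANISATION"
    # Entities with none of the three types above are labelled MISC,
    # e.g., films and books.
    return "MISC"
```

This also reproduces the sports-team observation discussed below: a team whose FreeBase entry lacks the "organisation/organisation" type falls through to MISC.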

Indeed, there are more than 3K business companies and 200 football teams. Notable type information about organisations improves extraction of the "parent/child" relation because this relation involves entities such as companies. For example, in our corpus the sentence "CNN, a unit of the Turner Broadcasting, says that 7000 schools have signed up for The Newsroom" expresses the parent/child(Turner Broadcasting, CNN) relation.

The ablation results in Column "NE+M" show that information about MISC entities is the most useful of all, as this ablation experiment yielded the highest overall MAP score. There are about 13.5K entities labelled MISC. The most frequent notable types for entities in the MISC NE class are "film/film" and "book/book". Therefore it is reasonable that notable type information for MISC entities would improve AP scores for relations such as "film/directed_by" and "person/religion". For example, "George Bush reached a turning point in his life and became a born-again Christian" is an example of the "person/religion" relation, and it is clearly useful to know that "born-again Christian" belongs to the religion notable type.

The "sports_team/league" relation is interesting because it performs best with notable type information for entities in the ORGANISATION or MISC NE classes. It turns out that roughly half the sports teams are classified as ORGANISATIONs and half are classified as MISC. The sports teams that are classified as MISC are missing the "organisation/organisation" type in their FreeBase entries, otherwise they would be classified as ORGANISATIONs.

In summary, the ablation results show that the contribution of notable type information depends on the relation being extracted. The results demonstrate that relations involving organisations benefit from the notable type information about those organisations, and that certain relations benefit more from notable type information than others. Further research is needed to understand some of the ablation experiment results (e.g., why does person/place_of_death perform best with notable type information about ORGANISATIONs?).

6 Conclusion and future work

In this paper we investigated the hypothesis that background information about entities present in a large database such as FreeBase can be useful for relation extraction. We modified a state-of-the-art relation extraction system (Riedel et al., 2013) by extending each of its submodels to exploit the "notable type" information about entities available in FreeBase. We demonstrated that these extensions improve the MAP score by 6% and the weighted MAP score by 7.5%, which is a significant improvement over a strong baseline. Our ablation experiments showed that the notable type information improves relation extraction more than NER tags across a wide range of entity types and relations.

In future work we would like to develop methods for exploiting other information available in FreeBase to improve a broad range of natural language processing and information extraction tasks. We would like to explore ways of exploiting entity information beyond (distant) supervision approaches, for example, in the direction of OpenIE (Wu and Weld, 2010; Fader et al., 2011; Mausam et al., 2012). The temporal information in a large database like FreeBase might be especially useful for named entity linking and relation extraction: e.g., someone that has died is less likely to release a hit single. In summary, we believe that there are a large number of ways in which the rich and diverse information present in FreeBase might be leveraged to improve natural language processing and information retrieval, and exploiting notable types is just one of many possible approaches.

Acknowledgments

This research was supported by a Google award through the Natural Language Understanding Focused Program, and under the Australian Research Council's Discovery Projects funding scheme (project numbers DP110102506 and DP110102593).

References

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 1247–1250.

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems 26, pages 2787–2795.

Razvan Bunescu and Raymond Mooney. 2007. Learning to extract relations from the web using minimal supervision. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 576–583.

Kai-Wei Chang, Wen-tau Yih, Bishan Yang, and Christopher Meek. 2014. Typed tensor decomposition of knowledge bases for relation extraction. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1568–1579.

Aron Culotta and Jeffrey Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL’04), pages 423–429.

Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP ’11), pages 1535–1545.

Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 782–792.

Mitchell Koch, John Gilmer, Stephen Soderland, and Daniel S. Weld. 2014. Type-aware distantly supervised relation extraction with linked arguments. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1891–1901.

Xiao Ling and Daniel S. Weld. 2012. Fine-grained entity recognition. In AAAI.

Yudong Liu, Zhongmin Shi, and Anoop Sarkar. 2007. Exploiting rich syntactic information for relation extraction from biomedical articles. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL-Short’07, pages 97–100.

Mausam, Michael Schmitz, Stephen Soderland, Robert Bart, and Oren Etzioni. 2012. Open language learning for information extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 523–534.

Mike Mintz, Steven Bills, Rion Snow, and Daniel Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 1003–1011.

Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pages 452–461.

Sebastian Riedel, Limin Yao, and Andrew McCallum. 2010. Modeling relations and their mentions without labeled text. In Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part III, ECML PKDD’10, pages 148–163.

Sebastian Riedel, Limin Yao, Andrew McCallum, and Benjamin M. Marlin. 2013. Relation extraction with matrix factorization and universal schemas. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 74–84.

Evan Sandhaus. 2008. The New York Times Annotated Corpus. Linguistic Data Consortium, Philadelphia.

Sameer Singh, Sebastian Riedel, Brian Martin, Jiaping Zheng, and Andrew McCallum. 2013. Joint inference of entities, relations, and coreference. In Proceedings of the 2013 Workshop on Automated Knowledge Base Construction, pages 1–6.

Richard Socher, Danqi Chen, Christopher D. Manning, and Andrew Ng. 2013. Reasoning with neural tensor networks for knowledge base completion. In Advances in Neural Information Processing Systems (NIPS 26), pages 926–934.

Michael Spivey-Knowlton and Julie Sedivy. 1995. Resolving attachment ambiguities with multiple constraints. Cognition, 55:227–267.

Jason Weston, Antoine Bordes, Oksana Yakhnenko, and Nicolas Usunier. 2013. Connecting language and knowledge bases with embedding models for relation extraction. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1366–1371.

Fei Wu and Daniel S. Weld. 2010. Open information extraction using Wikipedia. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL’10, pages 118–127.

Limin Yao, Sebastian Riedel, and Andrew McCallum. 2010. Collective cross-document relation extraction without labelled data. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1013–1023.

Limin Yao, Sebastian Riedel, and Andrew McCallum. 2013. Universal schema for entity type prediction. In Proceedings of the 2013 Workshop on Automated Knowledge Base Construction, pages 79–84.