Arxiv:2001.04425V4 [Cs.IR] 8 Sep 2020
Total Page:16
File Type:pdf, Size:1020Kb
Negative Statements Considered Useful Hiba Arnaouta,∗, Simon Razniewskia, Gerhard Weikuma, Jeff Z. Panb aMax Planck Institute for Informatics, Saarland Informatics Campus, Saarbr¨ucken66123, Germany bSchool of Informatics, The University of Edinburgh, Informatics Forum, Edinburgh EH8 9AB, Scotland Abstract Knowledge bases (KBs) about notable entities and their properties are an important asset in applications such as search, question answering and dialogue. All popular KBs capture virtually only positive statements, and abstain from taking any stance on statements not stored in the KB. This paper makes the case for explicitly stating salient statements that do not hold. Negative statements are useful to overcome limitations of question answering systems that are mainly geared for positive questions; they can also contribute to informative summaries of entities. Due to the abundance of such invalid statements, any effort to compile them needs to address ranking by saliency. We present a statistical inference method for compiling and ranking negative statements, based on expectations from positive statements of related entities in peer groups. Experimental results, with a variety of datasets, show that the method can effectively discover notable negative statements, and extrinsic studies underline their usefulness for entity summarization. Datasets and code are released as resources for further research. Keywords: knowledge bases, negative knowledge, information extraction, statistical inference, ranking 1. Introduction applications. In medicine, for instance, it is important to distinguish between knowing about the absence of a bio- Motivation and Problem. Structured knowledge is chemical reaction between substances, and not knowing crucial in a range of applications like question answering, about its existence at all. In corporate integrity, it is im- dialogue agents, and recommendation systems. The re- portant to know whether a person was never employed by a quired knowledge is usually stored in KBs, and recent years certain competitor, while in anti-corruption investigations, have seen a rise of interest in KB construction, query- absence of family relations needs to be ascertained. In ing and maintenance, with notable projects being Wiki- data science and machine learning, on-the-spot counterex- data [60], DBpedia [8], Yago [56], or the Google Knowl- amples are important to ensure the correctness of learned edge Graph [55]. These KBs store positive statements such extraction patterns and associations. as \Ren´eeZellweger won the 2020 Oscar for the best ac- State of the Art and its Limitations. Absence of tress", and are a key asset for many knowledge-intensive explicit negative knowledge has consequences for usage of AI applications. KBs: for instance, today's question answering (QA) sys- A major limitation of all these KBs is their inability to tems are well geared for positive questions, and questions deal with negative information [25]. At present, most ma- where exactly one answer should be returned (e.g., quiz jor KBs only contain positive statements, whereas state- questions or reading comprehension tasks) [24, 66]. In con- ments such as that \Tom Cruise did not win an Oscar" trast, for answering negative questions like \Actors with- could only be inferred with the major assumption that out Oscars", QA systems lack a data basis. Similarly, they the KB is complete - the so-called closed-world assump- arXiv:2001.04425v6 [cs.IR] 25 Sep 2021 struggle with positive questions that have no answer, like tion (CWA). Yet as KBs are only pragmatic collections \Children of Emmanuel Macron", too often still returning of positive statements, the CWA is not realistic to as- a best-effort answer even if it is incorrect. Materialized sume, and there remains uncertainty whether statements negative information would allow a better treatment of not contained in a KBs are false, or truth is merely un- both cases. known to the KB. Approach and Contribution. In this paper, we make Not being able to formally distinguish whether a state- the case that important negative knowledge should be ex- ment is false or unknown poses challenges in a variety of plicitly materialized. We motivate this selective materi- alization with the challenge of overseeing a near-infinite space of false statements, and with the importance of ex- ∗Corresponding author Email addresses: [email protected] (Hiba Arnaout), plicit negation in search and question answering. [email protected] (Simon Razniewski), We consider three classes of negative statements: (i) [email protected] (Gerhard Weikum), grounded negative statements \Tom Cruise is not a British http://knowledge-representation.org/j.z.pan/ (Jeff Z. Pan) Preprint submitted to Elsevier September 28, 2021 citizen", (ii) conditional negative statements \Tom Cruise studied deleted statements between two Wikidata versions has not won an award from the Oscar categories" and (iii) from 1/2017 and 1/2018, focusing in particular on state- universal negative statements \Tom Cruise is not member ments for people (close to 0.5m deleted statements). On of any political party". In a nutshell, given a KB and an a random sample of 1k deleted statements, we found that entity e, we select highly related entities to e (we call them over 82% were just caused by ontology modifications, gran- peers). We then use these peers to derive positive expec- ularity changes, rewordings, or prefix modifications. An- tations about e, where the absence of these expectations other 15% were statements that were actually restored a might be interesting for e. In this approach, we are assum- year later, so presumably reflected erroneous deletions. ing completeness within a group of peers. More precisely, The remaining 3% represented actual negation, yet we if the KB does not mention the Nobel Prize in Physics as found them to be rarely noteworthy, i.e., presenting mostly an award won by Stephen Hawking, but does mention it things like corrections of birth dates or location updates for at least one of his peers, it is assumed to be false for reflecting geopolitical changes. Hawking, and not a missing statement. This is followed by In Wikidata, erroneous changes can also be directly a ranking step where we use predicate and object promi- recorded via the deprecated rank feature [38]. Yet again nence, frequency, and textual context in a learning-to-rank we found that this mostly relates to errors coming from model. various import sources, and did not concern the active The salient contributions of this paper are: collection of interesting negations, as advocated in this article. 1. We make the first comprehensive case for material- Count and Negated Predicates. Another way of ex- izing useful negative statements, and formalize im- pressing negation is via counts matching with instances, portant classes of such statements. for instance, storing 5 children statements for Trump and 2. We present a judiciously designed method for collect- numerical statement (number of children; 5) allow to in- ing and ranking negative statements based on knowl- fer that anyone else is not a child of Trump. Yet while such edge about related entities. count predicates exist in popular KBs, none of them has a formal way of dealing with these, especially concerning 3. We show the usefulness of our models in use cases like linking them to instance-based predicates [29]. entity summarization, decision support, and ques- Moreover, some KBs contain relations that carry a neg- tion answering. ative meaning. For example, DBpedia has predicates like Experimental datasets and code are released as re- carrier never available (for phones), or never exceed alt 1 sources for further research . (for airplanes), Knowlife [22] contains medical predicates The present article extends the earlier conference pub- like is not caused by and is not healed by, and Wikidata lication [5] in several directions: contains does not have part and different from. Yet these present very specific pieces of knowledge, and do not gener- 1. We extend the statistical inference to ordered sets of alize. Although there have been discussions to extend the related entities, thereby removing the need to select Wikidata data model to allow generic opposites2, these a single peer set, and obtaining finer-grained contex- have not been worked out so far. tualizations of negative statements (Section 5); Wikidata No-Values. Wikidata can capture state- ments about universal absence via the \no-value" sym- 2. To bridge the gap between overly fine-grained groun- bol [23]. This allows KB editors to add a statement where ded negative statements and coarse universal nega- the object is empty. For example, what we express as tive statements, we introduce a third notion of nega- :9x(Angela Merkel; child; x), the current version of Wiki- tive statement, conditional negative statements, and data allows to be expressed as (Angela Merkel; child; show how to compute them post-hoc (Section 6); no-value)3. As of 8/2021, there exist 135k of such \no- 3. We evaluate the value of negative statements in an value" statements, yet only used in narrow domains. For additional use case, with hotels from Booking.com instance, 53% of these statements come for just two prop- (Section 8). erties country (used almost exclusively for geographic fea- tures in Antarctica), and follows (indicating that an art- work is not a sequel). 2. State of the Art 2.1. Negation in Existing Knowledge Bases 2.2. Negation in Logics and Data Management Deleted Statements. Statements that were once part Negation has a long history in logics and data man- of a KB but got subsequently deleted are promising can- agement. Early database paradigms usually employed the didates for negative information [58]. As an example, we 2https://www.wikidata.org/wiki/Wikidata:Property_ 1https://www.mpi-inf.mpg.de/departments/ proposal/fails_compliance_with databases-and-information-systems/research/ 3https://www.wikidata.org/wiki/Q567 knowledge-base-recall/interesting-negations-in-kbs/ 2 closed-world assumption (CWA), i.e., assumed that all RuDiK [44] is a rule mining system that can learn rules statements not stated to be true were false [52], [40].