Predicting Conceptnet Path Quality Using Crowdsourced Assessments of Naturalness
Total Page:16
File Type:pdf, Size:1020Kb
Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness Yilun Zhou Steven Schockaert Julie A. Shah MIT CSAIL Cardiff University MIT CSAIL [email protected] [email protected] [email protected] ABSTRACT 1 INTRODUCTION In many applications, it is important to characterize the way in Many applications require information about the semantic relation which two concepts are semantically related. Knowledge graphs between two (or more) words, concepts, or entities. For example, a such as ConceptNet provide a rich source of information for such recommender system needs to recommend items related to a user’s characterizations by encoding relations between concepts as edges browsing history, a task allocation agent needs to match people’s in a graph. When two concepts are not directly connected by an experiences and skills to a pool of problems to maximize problem- edge, their relationship can still be described in terms of the paths solving efficiency, and a household robot given the instruction that connect them. Unfortunately, many of these paths are uninfor- to “wash the plates” needs to infer that a dishwasher could be mative and noisy, which means that the success of applications that used. Open-domain knowledge graphs such as ConceptNet (27) use such path features crucially relies on their ability to select high- allow us to characterize such semantic relationships in the form of quality paths. In existing applications, this path selection process relational paths. Compared to the use of word embeddings (21) for is based on relatively simple heuristics. In this paper we instead characterizing relatedness, knowledge graphs have the potential propose to learn to predict path quality from crowdsourced human advantage of producing easier-to-understand characterizations, and assessments. Since we are interested in a generic task-independent they can capture relationships that go beyond what is encoded in notion of quality, we simply ask human participants to rank paths standard word embeddings (33). according to their subjective assessment of the paths’ naturalness, Typical knowledge graphs (KGs), such as DBpedia and Wiki- without attempting to define naturalness or steering the partici- Data, are concerned with named entities and their relations (e.g. pants towards particular indicators of quality. We show that a neural “Abraham Lincoln” is one of “United States Presidents”). In this network model trained on these assessments is able to predict hu- paper, however, we are concerned with capturing semantic rela- man judgments on unseen paths with near optimal performance. tionships between common nouns or concepts, which requires a Most notably, we find that the resulting path selection method is commonsense KG such as ConceptNet. Despite its indisputable substantially better than the current heuristic approaches at identi- value, however, effectively using ConceptNet in applications comes fying meaningful paths. with a unique set of challenges. The knowledge captured in Concept- Net is, by design, often informal, subjective and vague. Due to the CCS CONCEPTS fact that ConceptNet partly consists of unverified crowdsourced assertions, it is also noisier than many other KGs. Furthermore, • Information systems Crowdsourcing Answer ranking ; ; many commonsense assertions are true only under some circum- Computing methodologies Knowledge representa- • stances and to some extent. For example, ConceptNet encodes that tion and reasoning Natural language processing ; . “popcorn” is required for “watching a movie”, which captures the useful commonsense knowledge that eating popcorn is associated KEYWORDS with watching a movie, but the statement that popcorn is required Crowdsourcing, Knowledge Graph, Feature Selection, Common- is nonetheless false. In addition, ConceptNet only disambiguates sense Knowledge, ConceptNet concepts to a coarse part-of-speech level (e.g. noun meaning vs verb meaning of the word “watch”). Finally, a lot of concepts are ACM Reference Format: linked by the generic “RelatedTo” relation, which covers relation- RelatedTo Yilun Zhou, Steven Schockaert, and Julie A. Shah. 2019. Predicting Con- ships as diverse as collocations (“Tire −−−−−−! Spare”), hypernyms ceptNet Path Quality Using Crowdsourced Assessments of Naturalness. (“Tire RelatedTo All Seasons Tire”), co-meronyms (“Tire RelatedTo In Proceedings of the 2019 World Wide Web Conference (WWW’19), May −−−−−−! −−−−−−! RelatedTo 13–17, 2019, San Francisco, CA, USA. ACM, New York, NY, USA, 12 pages. Exhaust”), homonyms (“Tire −−−−−−! Tier”), and very loosely re- RelatedTo https://doi.org/10.1145/3308558.3313486 lated terms (“Tire −−−−−−! Clothes”). Because of those challenges, few authors use relational paths from ConceptNet in applications. In particular, while several au- This paper is published under the Creative Commons Attribution 4.0 International thors have found the knowledge encoded in ConceptNet to be (CC-BY 4.0) license. Authors reserve their rights to disseminate the work on their highly valuable, they typically restrict themselves to using relation- personal and corporate Web sites with the appropriate attribution. ships that are directly encoded as an edge. For example, Speer et al. WWW ’19, May 13–17, 2019, San Francisco, CA, USA © 2019 IW3C2 (International World Wide Web Conference Committee), published (27) showed that ConceptNet can be used to improve word embed- under Creative Commons CC-BY 4.0 License. dings by incorporating the intuition that words which are linked ACM ISBN 978-1-4503-6674-8/19/05. by an edge in ConceptNet should be represented by similar vectors. https://doi.org/10.1145/3308558.3313486 2460 We believe that this common restriction to direct edges is due to and nearly 3 million edges (i.e. triples). ConceptNet partly obtained the lack of sufficiently accurate methods for filtering nonsensical through crowdsourcing, but also incorporates external sources such paths, or conversely, for identifying the most natural paths. This as OpenCyc (18), WordNet (22), Verbosity (29), DBPedia (2), and path selection problem is the focus of our paper. Wiktionary1. In existing work (11, 19), the problem of selecting high-quality Knowledge base construction by crowdsourcing: While the ConceptNet paths has been addressed in a heuristic way (see Section use of crowdsourcing for constructing knowledge bases has already 4). While some intuitions about high-quality paths can easily be been well studied, most existing approaches focus on knowledge ac- formulated (e.g. shorter paths tend to be more informative than quisition. This can involve, for instance, direct collaborative editing longer paths, the nodes occurring in natural paths tend to have of a knowledge base (3, 31) or indirect construction of a knowledge similar word vector representations), such heuristic methods still base through the use of an interactive game (29). In contrast to fail to filter out many nonsensical paths, and conversely sometimes these works, our focus is on using crowdsourcing for learning how erroneously filter out highly valuable paths. For example, thebest to detect noisy paths within an existing knowledge graph. path between “lead” and “poison” found by one of the standard An important challenge with crowdsourcing is the fact that Synonym heuristics (pairwise baseline, see Section 4.3) is “Lead −−−−−−! Take there will inevitably be disagreements within the collected data. DistinctFrom RelatedTo −−−−−−−−−! Give −−−−−−! Poison”, which uses the verb meaning One framework (1) suggests that this can be due to (i) the inclusion of “lead”, despite the fact that the noun meaning (i.e. the poisonous of low-quality participants, (ii) an ambiguous interpretation of chemical element Pb) is more relevant in this case. the input data and (iii) an ambiguous definition of the labels that Rather than trying to construct increasingly intricate heuristics, crowdsourcers are required to provide. In our work, we mitigated we propose to learn to predict path quality using a neural network the first issue with a quality control mechanism (Section 4.1).To model. To this end, we rely on crowdsourced assessments about address the remaining two issues, we will rely on a probabilistic the naturalness of ConceptNet paths. Specifically, to collect training generative model to interpret the provided ratings. data, human annotators were asked to choose the more natural path Word embeddings: Word embedding methods (21, 24) represent among a pair of paths, without any guidance on how to interpret each word in a relatively low-dimensional space of typically around naturalness. This notion of naturalness was chosen because it is 300 dimensions, which is estimated from word co-occurrence sta- intuitively easy to understand for crowdsourcers (as opposed to tistics. With most training procedures, vector differences represent e.g. terms such as “semantic coherence” or “predictive value”). The abstract relations in the resulting embedding space: for example, term is also deliberately vague, as we do not want to steer anno- vking − vqueen ≈ vman − vwoman, in which vword represents the tators towards particular types of features. The resulting pairwise embedding of a given word. judgments are then used to train a neural network to predict a Both KGs and word embeddings capture aspects of meaning, and latent naturalness score, which can be used for path ranking or can thus to some extent be seen as alternatives. One advantage of path selection. word embeddings is that they