Commonsense Knowledge Mining from Pretrained Models



Commonsense Knowledge Mining from Pretrained Models

Joshua Feldman*, Joe Davison*, Alexander M. Rush
School of Engineering and Applied Sciences, Harvard University
{joshua_feldman@g, jddavison@g, srush@seas}.harvard.edu

Abstract

Inferring commonsense knowledge is a key challenge in natural language processing, but due to the sparsity of training data, previous work has shown that supervised methods for commonsense knowledge mining underperform when evaluated on novel data. In this work, we develop a method for generating commonsense knowledge using a large, pre-trained bidirectional language model. By transforming relational triples into masked sentences, we can use this model to rank a triple's validity by the estimated pointwise mutual information between the two entities. Since we do not update the weights of the bidirectional model, our approach is not biased by the coverage of any one commonsense knowledge base. Though this method performs worse on a test set than models explicitly trained on a corresponding training set, it outperforms these methods when mining commonsense knowledge from new sources, suggesting that unsupervised techniques may generalize better than current supervised approaches.

1 Introduction

Commonsense knowledge consists of facts about the world which are assumed to be widely known. For this reason, commonsense knowledge is rarely stated explicitly in natural language, making it challenging to infer this information without an enormous amount of data (Gordon and Van Durme, 2013). Some have even argued that machine learning models cannot learn common sense implicitly (Davis and Marcus, 2015).

One method for mollifying this issue is directly augmenting models with commonsense knowledge bases (Young et al., 2018), which typically contain high-quality information but with low coverage. These knowledge bases are represented as a graph, with nodes consisting of conceptual entities (i.e. dog, running away, excited, etc.) and the pre-defined edges representing the nature of the relations between concepts (IsA, UsedFor, CapableOf, etc.). Commonsense knowledge base completion (CKBC) is a machine learning task motivated by the need to improve the coverage of these resources. In this formulation of the problem, one is supplied with a list of candidate entity-relation-entity triples, and the task is to distinguish which of the triples express valid commonsense knowledge and which are fictitious (Li et al., 2016).

Several approaches have been proposed for training models for commonsense knowledge base completion (Li et al., 2016; Jastrzebski et al., 2018). Each of these approaches uses some sort of supervised training on a particular knowledge base, evaluating the model's performance on a held-out test set from the same database. These works use relations from ConceptNet, a crowd-sourced database of structured commonsense knowledge, to train and validate their models (Liu and Singh, 2004). However, it has been shown that these methods generalize poorly to novel data (Li et al., 2016; Jastrzebski et al., 2018). Jastrzebski et al. (2018) demonstrated that much of the data in the ConceptNet test set were simply rephrased relations from the training set, and that this train-test set leakage led to artificially inflated test performance metrics. This problem of train-test leakage is typical in knowledge base completion tasks (Toutanova et al., 2015; Dettmers et al., 2018).

Instead of training a predictive model on any specific database, we attempt to utilize the world knowledge of large language models to identify commonsense facts directly. By constructing a candidate piece of knowledge as a sentence, we can use a language model to approximate the likelihood of this text as a proxy for its truthfulness. In particular, we use a masked language model to estimate point-wise mutual information between entities in a possible relation, an approach that differs significantly from fine-tuning approaches used for other language modeling tasks. Since the weights of the model are fixed, our approach is not biased by the coverage of any one dataset. As we might expect, our method underperforms when compared to previous benchmarks on the ConceptNet common sense triples dataset (Li et al., 2016), but demonstrates a superior ability to generalize when mining novel commonsense knowledge from Wikipedia.

Related Work. Schwartz et al. (2017) and Trinh and Le (2018) demonstrate a similar approach to using language models for tasks requiring common sense, such as the Story Cloze Task and the Winograd Schema Challenge, respectively (Mostafazadeh et al., 2016; Levesque et al., 2012). Bosselut et al. (2019) and Trinh and Le (2019) use unidirectional language models for CKBC, but their approach requires a supervised training step. Our approach differs in that we intentionally avoid training on any particular database, relying instead on the language model's general world knowledge. Additionally, we use a bidirectional masked model, which provides a more flexible framework for likelihood estimation and allows us to estimate point-wise mutual information. Although it is beyond the scope of this paper, it would be interesting to adapt the methods presented here for the related task of generating new commonsense knowledge (Saito et al., 2018).
2 Method

Given a commonsense head-relation-tail triple x = (h, r, t), we are interested in determining the validity of that tuple as a representation of a commonsense fact. Specifically, we would like to determine a numeric score y ∈ ℝ reflecting our confidence that a given tuple represents true knowledge.

We assume that heads and tails are arbitrary-length sequences of words in a vocabulary V, so that h = (h_1, h_2, ..., h_n) and t = (t_1, t_2, ..., t_m). We further assume that we have a known set of possible relations R, so that r ∈ R. The goal is to determine a function f that maps relational triples to validity scores. We propose decomposing f(x) = σ(τ(x)) into two sub-components: a sentence generation function τ, which maps a triple to a single sentence, and a scoring model σ, which then determines a validity score y.

Our approach relies on two types of pretrained language models. Standard unidirectional models are typically represented as autoregressive probabilities:

    p(w_1, w_2, \ldots, w_m) = \prod_{i=1}^{m} p(w_i \mid w_1, \ldots, w_{i-1})

Masked bidirectional models such as BERT, proposed by Devlin et al. (2018), instead model in both directions, training word representations conditioned both on future and past words. The masking allows any number of words in the sequence to be hidden. This setup provides an intuitive framework to evaluate the probability of any word in a sequence conditioned on the rest of the sequence,

    p(w'_i \mid w'_{1:i-1}, w'_{i+1:m})

where w' ∈ V ∪ {κ} and κ is a special token indicating a masked word.

2.1 Generating Sentences from Triples

We first consider methods for turning a triple such as (ferret, AtLocation, pet store) into a sentence such as "the ferret is in the pet store". Our approach is to generate a set of candidate sentences via hand-crafted templates and select the best proposal according to a language model.

For each relation r ∈ R, we hand-craft a set of sentence templates. For example, one template in our experiments for the relation AtLocation is, "you are likely to find HEAD in TAIL". For the above example, this would yield the sentence, "You are likely to find ferret in pet store".

Because these sentences are not always grammatically correct, such as in the above example, we apply a simple set of transformations. These consist of inserting articles before nouns, converting verbs into gerunds, and pluralizing nouns which follow numbers. See the supplementary materials for details and Table 1 for an example. We then enumerate a set of alternative sentences S = {S_1, ..., S_j} resulting from each template and from all combinations of transformations. This yields a set of candidate sentences for each data point. We then select the candidate sentence with the highest log-likelihood according to a pre-trained unidirectional language model P_coh:

    S^* = \arg\max_{S \in \mathcal{S}} \log P_{\mathrm{coh}}(S)

We refer to this method of generating a sentence from a triple as COHERENCY RANKING.

    Candidate Sentence S_i                       log p(S_i)
    "musician can playing musical instrument"       -5.7
    "musician can be play musical instrument"       -4.9
    "musician often play musical instrument"        -5.5
    "a musician can play a musical instrument"      -2.9

Table 1: Example of generating candidate sentences. Several enumerated sentences for the triple (musician, CapableOf, play musical instrument). The sentence with the highest log-likelihood according to a pretrained language model is selected.
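As a concrete illustration, the following is a minimal sketch of coherency ranking, assuming GPT-2 via the HuggingFace transformers library as the pretrained unidirectional model P_coh (the paper does not prescribe this particular implementation); the hard-coded candidates are the enumerated sentences from Table 1.

```python
# Minimal coherency-ranking sketch. Assumptions: HuggingFace transformers,
# GPT-2 standing in for the pretrained unidirectional model P_coh.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_log_likelihood(sentence):
    """Summed log p(w_i | w_1, ..., w_{i-1}) under the unidirectional model."""
    ids = tokenizer.encode(sentence, return_tensors="pt")
    with torch.no_grad():
        # labels=ids yields the mean next-token negative log-likelihood;
        # multiplying by the number of predicted tokens recovers the sum.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

# Enumerated candidate sentences from Table 1 for
# (musician, CapableOf, play musical instrument).
candidates = [
    "musician can playing musical instrument",
    "musician can be play musical instrument",
    "musician often play musical instrument",
    "a musician can play a musical instrument",
]
best = max(candidates, key=sentence_log_likelihood)
print(best)  # the most fluent candidate, cf. Table 1
```

In the full method the candidate set comes from the relation templates and the grammatical transformations described above; the hard-coded list here simply reuses the Table 1 example.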
To score a triple, we use the masked model to estimate the point-wise mutual information between the head and the tail under the relation,

    PMI(t, h \mid r) = \log p(t \mid h, r) - \log p(t \mid r)

Because the tail may span several words, we approximate p(t | h, r) greedily: we mask every tail word, query the model for the probability of each masked position, unmask the word the model is most confident about, and repeat this process j times. Finally, we calculate the total conditional likelihood of the tail by the product of these terms,

    p(t \mid h, r) = \prod_{k=1}^{j} p_k

The marginal p(t | r) is computed similarly, but in this case we mask the head throughout. For example, to compute the marginal tail probability for the sentence, "You are likely to find a ferret in the pet store", we mask both the head and the tail and then sequentially unmask the tail words only: "You are likely to find a κ_h1 in the κ_t1 κ_t2". If κ_t2 = "store" has a higher probability than κ_t1 = "pet", we unmask "store" and compute "You are likely to find a κ_h1 in the κ_t1 store". The marginal likelihood p(t | r) is then the product of the two probabilities.
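For concreteness, here is a minimal sketch of the greedy unmasking procedure just described, assuming bert-base-uncased via the HuggingFace transformers library (the paper does not tie the method to this exact setup). Span positions are word indices, and each word is assumed to map to a single WordPiece token, which holds for this example but not in general.

```python
# Greedy masked-span scoring sketch (assumption: bert-base-uncased).
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def span_log_prob(sentence_tokens, span, keep_masked=()):
    """Approximate log p(span | rest): mask the span (plus any positions in
    keep_masked, which stay hidden), then repeatedly unmask the span word the
    model is most confident about, accumulating its log-probability."""
    tokens = list(sentence_tokens)
    originals = {i: tokens[i] for i in span}
    for i in list(span) + list(keep_masked):
        tokens[i] = tokenizer.mask_token
    total, remaining = 0.0, set(span)
    while remaining:
        ids = torch.tensor([tokenizer.convert_tokens_to_ids(
            [tokenizer.cls_token] + tokens + [tokenizer.sep_token])])
        with torch.no_grad():
            log_probs = torch.log_softmax(model(ids).logits[0], dim=-1)
        # Row i + 1 accounts for the [CLS] token prepended above.
        scores = {i: log_probs[i + 1, tokenizer.convert_tokens_to_ids(
            originals[i])].item() for i in remaining}
        best = max(scores, key=scores.get)
        total += scores[best]
        tokens[best] = originals[best]  # unmask the chosen word
        remaining.remove(best)
    return total

words = tokenizer.tokenize("a musician can play a musical instrument")
head, tail = [1], [5, 6]  # "musician"; "musical instrument"
cond = span_log_prob(words, tail)                    # log p(t | h, r)
marg = span_log_prob(words, tail, keep_masked=head)  # log p(t | r)
print(cond - marg)                                   # estimated PMI(t, h | r)
```

Working in log space, the difference of the two quantities is the PMI estimate; ranking triples by PMI rather than by raw conditional likelihood controls for how probable the tail is regardless of the head.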
Recommended publications
  • Commonsense Reasoning for Natural Language Processing
    Introductory Tutorial: Commonsense Reasoning for Natural Language Processing
    Maarten Sap (1), Vered Shwartz (1,2), Antoine Bosselut (1,2), Yejin Choi (1,2), Dan Roth (3)
    (1) Paul G. Allen School of Computer Science & Engineering, Seattle, WA, USA
    (2) Allen Institute for Artificial Intelligence, Seattle, WA, USA
    (3) Department of Computer and Information Science, University of Pennsylvania
    {msap, vereds, antoineb, yejin}@cs.washington.edu

    1 Introduction
    Commonsense knowledge, such as knowing that "bumping into people annoys them" or "rain makes the road slippery", helps humans navigate everyday situations seamlessly (Apperly, 2010). Yet, endowing machines with such human-like commonsense reasoning capabilities has remained an elusive goal of artificial intelligence research for decades (Gunning, 2018). Commonsense knowledge and reasoning have received renewed attention from the natural language processing (NLP) community in recent years, yielding multiple exploratory research directions into automated commonsense understanding.

    …settings, such as social interactions (Sap et al., 2019b; Rashkin et al., 2018a) and physical situations (Zellers et al., 2019; Talmor et al., 2019). We hope that in the future, machines develop the kind of intelligence required to, for example, properly assist humans in everyday situations (e.g., a chatbot that anticipates the needs of an elderly person; Pollack, 2005). Current methods, however, are still not powerful or robust enough to be deployed in open-domain production settings, despite the clear improvements provided by large-scale pretrained language models. This shortcoming is partially due to inadequacy in acquiring, understanding and reasoning about commonsense knowledge, topics which remain understudied by…
  • Review Articles
    review articles
    DOI: 10.1145/2701413
    Commonsense Reasoning and Commonsense Knowledge in Artificial Intelligence
    By Ernest Davis and Gary Marcus
    AI has seen great advances of many kinds recently, but there is one critical area where progress has been extremely slow: ordinary commonsense.

    WHO IS TALLER, Prince William or his baby son Prince George? Can you make a salad out of a polyester shirt? If you stick a pin into a carrot, does it make a hole in the carrot or in the pin? These types of questions may seem silly, but many intelligent tasks, such as understanding texts, computer vision, planning, and scientific reasoning require the same kinds of real-world knowledge and reasoning abilities. For instance, if you see a six-foot-tall person holding a two-foot-tall person in his arms, and you are told they are father and son, you do not have to ask which is which. If you need to make a salad for dinner and are out of lettuce, you do not waste time considering improvising by taking a shirt out of the closet and cutting it up.

    Key insights:
    • To achieve human-level performance in domains such as natural language processing, vision, and robotics, basic knowledge of the commonsense world (time, space, physical interactions, people, and so on) will be necessary.
    • Although a few forms of commonsense reasoning, such as taxonomic reasoning and temporal reasoning, are well understood, progress has been slow.
    • Extant techniques for implementing commonsense include logical analysis, handcrafting large knowledge bases, Web mining, and crowdsourcing. Each of these is valuable, but none by itself is a full solution.
    • Intelligent machines need not replicate human cognition directly, but a better understanding of human commonsense might be a good place to start.
  • Extracting Common Sense Knowledge from Text for Robot Planning
    Extracting Common Sense Knowledge from Text for Robot Planning
    Peter Kaiser (1), Mike Lewis (2), Ronald P. A. Petrick (2), Tamim Asfour (1), Mark Steedman (2)

    Abstract: Autonomous robots often require domain knowledge to act intelligently in their environment. This is particularly true for robots that use automated planning techniques, which require symbolic representations of the operating environment and the robot's capabilities. However, the task of specifying domain knowledge by hand is tedious and prone to error. As a result, we aim to automate the process of acquiring general common sense knowledge of objects, relations, and actions, by extracting such information from large amounts of natural language text, written by humans for human readers. We present two methods for knowledge acquisition, requiring only limited human input, which focus on the inference of spatial relations from text. Although our approach is applicable to a range of domains and information, we only consider one type of knowledge here, namely object locations in a kitchen environment. As a proof of concept, we test our approach using an automated planner and show how the addition of common sense knowledge can improve the quality of the generated plans.

    Fig. 1: The humanoid robots ARMAR-IIIa (left) and ARMAR-IIIb working in a kitchen environment ([5], [6]).

    I. INTRODUCTION AND RELATED WORK
    Autonomous robots that use automated planning to make decisions about how to act in the world require symbolic … domain knowledge based on information gathered from natural language texts. These methods will provide the set of object and action types for the domain, as well as certain relations between entities of these types, of the kind that are commonly used in planning. As an evaluation, we build a domain for a robot working in a kitchen environment (see Fig. …
  • Open Mind Common Sense: Knowledge Acquisition from the General Public
    Open Mind Common Sense: Knowledge Acquisition from the General Public
    Push Singh, Grace Lim, Thomas Lin, Erik T. Mueller, Travell Perkins, Mark Tompkins, Wan Li Zhu
    MIT Media Laboratory, 20 Ames Street, Cambridge, MA 02139 USA
    {push, glim, tlin, markt, wlz}@mit.edu

    Abstract: Open Mind Common Sense is a knowledge acquisition system designed to acquire commonsense knowledge from the general public over the web. We describe and evaluate our first fielded system, which enabled the construction of a 400,000 assertion commonsense knowledge base. We then discuss how our second-generation system addresses weaknesses discovered in the first. The new system acquires facts, descriptions, and stories by allowing participants to construct and fill in natural language templates. It employs word-sense disambiguation and methods of clarifying entered knowledge, analogical inference to provide feedback, and allows participants to validate knowledge and in turn each other.

    1 Introduction
    We would like to build software agents that can engage in commonsense reasoning about ordinary human affairs. … underpinnings for commonsense reasoning (Shanahan 1997), there has been far less work on finding ways to accumulate the knowledge to do so in practice. The most well-known attempt has been the Cyc project (Lenat 1995), which contains 1.5 million assertions built over 15 years at the cost of several tens of millions of dollars. Knowledge bases this large require a tremendous effort to engineer. With the exception of Cyc, this problem of scale has made efforts to study and build commonsense knowledge bases nearly non-existent within the artificial intelligence community.

    Turning to the general public
    In this paper we explore a possible solution to this problem of scale, based on one critical observation: every ordinary person has common sense of the kind we want to …
  • Futures of Artificial Intelligence through Technology Readiness Levels
    Futures of Artificial Intelligence through Technology Readiness Levels
    Fernando Martínez-Plumed (1,2), Emilia Gómez (1), José Hernández-Orallo (2)

    Abstract: Artificial Intelligence (AI) offers the potential to transform our lives in radical ways. However, the main unanswered questions about this foreseen transformation are its depth, breadth and timelines. To answer them, not only do we lack the tools to determine what achievements will be attained in the near future, but we do not even know what various technologies in present-day AI are capable of. Many so-called breakthroughs in AI are associated with highly-cited research papers or good performance in some particular benchmarks. However, research breakthroughs do not directly translate into a technology that is ready to use in real-world environments. In this paper, we present a novel exemplar-based methodology to categorise and assess several AI technologies, by mapping them onto Technology Readiness Levels (TRL) (representing their depth in maturity and availability). We first interpret the nine TRLs in the context of AI, and identify several categories in AI to which they can be assigned. We then introduce a generality dimension, which represents increasing layers of breadth of the technology. These two dimensions lead to the new readiness-vs-generality charts, which show that higher TRLs are achievable for low-generality technologies, focusing on narrow or specific abilities, while high TRLs are still out of reach for more general capabilities. We include numerous examples of AI technologies in a variety of fields, and show their readiness-vs-generality charts, serving as exemplars. Finally, we show how the timelines of several AI technology exemplars at different generality layers can help forecast some short-term and mid-term trends for AI.
  • Commonsense Knowledge in Wikidata
    Commonsense Knowledge in Wikidata
    Filip Ilievski (1), Pedro Szekely (1), and Daniel Schwabe (2)
    (1) Information Sciences Institute, University of Southern California
    (2) Dept. of Informatics, Pontifícia Universidade Católica, Rio de Janeiro

    Abstract: Wikidata and Wikipedia have been proven useful for reasoning in natural language applications, like question answering or entity linking. Yet, no existing work has studied the potential of Wikidata for commonsense reasoning. This paper investigates whether Wikidata contains commonsense knowledge which is complementary to existing commonsense sources. Starting from a definition of common sense, we devise three guiding principles, and apply them to generate a commonsense subgraph of Wikidata (Wikidata-CS). Within our approach, we map the relations of Wikidata to ConceptNet, which we also leverage to integrate Wikidata-CS into an existing consolidated commonsense graph. Our experiments reveal that: 1) albeit Wikidata-CS represents a small portion of Wikidata, it is an indicator that Wikidata contains relevant commonsense knowledge, which can be mapped to 15 ConceptNet relations; 2) the overlap between Wikidata-CS and other commonsense sources is low, motivating the value of knowledge integration; 3) Wikidata-CS has been evolving over time at a slightly slower rate compared to the overall Wikidata, indicating a possible lack of focus on commonsense knowledge. Based on these findings, we propose three recommended actions to improve the coverage and quality of Wikidata-CS further.

    Keywords: Commonsense Knowledge · Wikidata · Knowledge Graphs

    1 Introduction
    Common sense is "the basic ability to perceive, understand, and judge things that are shared by nearly all people and can be reasonably expected of nearly all people without need for debate" [10].
  • Commonsense Knowledge Base Completion with Structural and Semantic Context
    Commonsense Knowledge Base Completion with Structural and Semantic Context
    Chaitanya Malaviya (1), Chandra Bhagavatula (1), Antoine Bosselut (1,2), Yejin Choi (1,2)
    (1) Allen Institute for Artificial Intelligence, (2) University of Washington

    Abstract: Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes compared to conventional KBs (∼18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes. In this paper, we present novel KB completion models that can address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method to incorporate information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet.

    Figure 1: Subgraph from ConceptNet illustrating semantic diversity of nodes. Dashed blue lines represent potential edges to be added to the graph.
  • Natural Language Understanding with Commonsense Reasoning
    E.T.S. DE INGENIEROS INFORMÁTICOS, UNIVERSIDAD POLITÉCNICA DE MADRID
    Master Thesis, MSc in Artificial Intelligence (MUIA)
    Natural Language Understanding with Commonsense Reasoning: Application to the Winograd Schema Challenge
    Author: Alfonso López Torres. Supervisor: Martín Molina González. June, 2016.

    This is for my children Carla and Alonso, and my wife Véronique. Thanks for their unconditional support and patience (also for the coming adventures…).

    Acknowledgments: I would like to thank Martín for his advice and help. I was very lucky to be your student.

    Abstract: In 1950, Alan Turing proposed a test to evaluate the degree of human intelligence that a machine could exhibit. The main idea was really simple: to hold an open conversation between an evaluator and the machine. If the evaluator was unable to discern whether the examinee was a person or a machine, the test could be said to have been passed. Since then, over the last 60 years, numerous proposals have been put forward that have exposed certain weaknesses of the test. Perhaps the most important is its focus on human intelligence, leaving aside other types of intelligence; the test largely forces the machine to adopt anthropomorphic, imitative behavior with the sole aim of passing the test. To overcome these and other weak points, in 2011 Hector Levesque proposed a new challenge, the Winograd Schema Challenge: a simple question-and-answer test about a sentence describing an everyday situation.
  • Evaluating Commonsense Reasoning in Neural Machine Translation
    The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation
    Jie He (1*), Tao Wang (2*), Deyi Xiong (1), and Qun Liu (3)
    (1) College of Intelligence and Computing, Tianjin University, Tianjin, China
    (2) School of Computer Science and Technology, Soochow University, Suzhou, China
    (3) Huawei Noah's Ark Lab, Hong Kong, China

    Abstract: Does neural machine translation yield translations that are congenial with common sense? In this paper, we present a test suite to evaluate the commonsense reasoning capability of neural machine translation. The test suite consists of three test sets, covering lexical and contextless/contextual syntactic ambiguity that requires commonsense knowledge to resolve. We manually create 1,200 triples, each of which contain a source sentence and two contrastive translations, involving 7 different common sense types. Language models pretrained on large-scale corpora, such as BERT, GPT-2, achieve a commonsense reasoning accuracy of lower than 72% on target translations of this test suite.

    …evidence for the need of commonsense knowledge in machine translation is "The box is in the pen", where machine translation is expected to perform reasoning on the relative sizes of "box" and "pen". Bar-Hillel also doubts that a machine, even equipped with extra-linguistic knowledge, would be able to reason with such knowledge spontaneously as human translators do (Bar-Hillel, 1960a; Macklovitch, 1995). Modern natural language processing (NLP) has made tremendous progress, not only in building abundant resources to develop linguistic insights, but also in plenty of methodological practices. On the one hand, machine translation has been substantially advanced with large-scale parallel data…
  • Common Sense Reasoning with the Semantic Web
    Common Sense Reasoning with the Semantic Web
    Christopher C. Johnson and Push Singh
    MIT Summer Research Program, Massachusetts Institute of Technology, Cambridge, MA 02139
    http://groups.csail.mit.edu/dig/2005/08/Johnson-CommonSense.pdf

    Abstract: Current HTML content on the World Wide Web has no real meaning to the computers that display the content. Rather, the content is just fodder for the human eye. This is unfortunate as in fact Web documents describe real objects and concepts, and give particular relationships between them. The goal of the World Wide Web Consortium's (W3C) Semantic Web initiative is to formalize web content into Resource Description Framework (RDF) ontologies so that computers may reason and make decisions about content across the Web. Current W3C work has so far been concerned with creating languages in which to express formal Web ontologies and tools, but has overlooked the value and importance of implementing common sense reasoning within the Semantic Web. As Web blogging and news postings become more prominent across the Web, there will be a vast source of natural language text not represented as RDF metadata. Common sense reasoning will be needed to take full advantage of this content. In this paper we will first describe our work in converting the common sense knowledge base, ConceptNet, to RDF format and running N3 rules through the forward chaining reasoner, CWM, to further produce new concepts in ConceptNet. We will then describe an example in using ConceptNet to recommend gift ideas by analyzing the contents of a weblog.
  • Conceptnet 5.5: an Open Multilingual Graph of General Knowledge
    Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)
    ConceptNet 5.5: An Open Multilingual Graph of General Knowledge
    Robyn Speer (Luminoso Technologies, Inc.), Joshua Chin (Union College), Catherine Havasi (Luminoso Technologies, Inc.)

    Abstract: Machine learning about language can be improved by supplying it with specific knowledge and sources of external information. We present here a new version of the linked open data resource ConceptNet that is particularly well suited to be used with modern NLP techniques such as word embeddings. ConceptNet is a knowledge graph that connects words and phrases of natural language with labeled edges. Its knowledge is collected from many sources that include expert-created resources, crowd-sourcing, and games with a purpose. It is designed to represent the general knowledge involved in understanding language, improving natural language applications by allowing the application to better understand…

    In this paper, we will concisely represent assertions such as the above as triples of their start node, relation label, and end node: the assertion that "a dog has a tail" can be represented as (dog, HasA, tail). ConceptNet also represents links between knowledge resources. In addition to its own knowledge about the English term astronomy, for example, ConceptNet contains links to URLs that define astronomy in WordNet, Wiktionary, OpenCyc, and DBPedia. The graph-structured knowledge in ConceptNet can be particularly useful to NLP learning algorithms, particularly those based on word embeddings, such as (Mikolov et al. 2013). We can use ConceptNet to build semantic spaces that are more effective than distributional semantics alone.
  • Analogyspace: Reducing the Dimensionality of Common Sense
    Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)
    AnalogySpace: Reducing the Dimensionality of Common Sense Knowledge
    Robyn Speer (CSAIL, Massachusetts Institute of Technology), Catherine Havasi (Laboratory for Linguistics and Computation, Brandeis University), Henry Lieberman (Software Agents Group, MIT Media Lab)

    Abstract: We are interested in the problem of reasoning over very large common sense knowledge bases. When such a knowledge base contains noisy and subjective data, it is important to have a method for making rough conclusions based on similarities and tendencies, rather than absolute truth. We present AnalogySpace, which accomplishes this by forming the analogical closure of a semantic network through dimensionality reduction. It self-organizes concepts around dimensions that can be seen as making distinctions such as "good vs. bad" or "easy vs. hard", and generalizes its knowledge by judging where concepts lie along these dimensions. An evaluation demonstrates that users often agree with the predicted knowledge, and that its accuracy is an improvement over previous…

    …to reduce the dimensionality of that matrix. This results in computing principal components which represent the most salient aspects of the knowledge, which can then be used to organize it along the most semantically meaningful dimensions. The key idea is that semantic similarity can be determined using linear operations over the resulting vectors.

    What AnalogySpace Can Do
    AnalogySpace provides a computationally efficient way to calculate a wide variety of semantically meaningful operations: AnalogySpace can generalize from sparsely-collected knowledge.
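    Since the passage above spells out a concrete pipeline (sparse assertion matrix, truncated SVD, similarity via linear operations on the reduced vectors), a small sketch may help. This is not the authors' code: the toy triples are invented, and scipy's svds is just one way to compute a truncated SVD.

    ```python
    # AnalogySpace-style sketch: sparse concept-by-feature matrix + truncated SVD.
    # The triples below are toy examples, not taken from a real knowledge base.
    import numpy as np
    from scipy.sparse import lil_matrix
    from scipy.sparse.linalg import svds

    triples = [("dog", "HasA", "tail"), ("cat", "HasA", "tail"),
               ("dog", "IsA", "pet"), ("cat", "IsA", "pet"),
               ("car", "HasA", "wheel"), ("car", "UsedFor", "driving")]

    concepts = sorted({h for h, _, _ in triples})
    features = sorted({(r, t) for _, r, t in triples})
    c_index = {c: i for i, c in enumerate(concepts)}
    f_index = {f: j for j, f in enumerate(features)}

    A = lil_matrix((len(concepts), len(features)))
    for h, r, t in triples:
        A[c_index[h], f_index[(r, t)]] = 1.0  # assertion is present

    # Truncated SVD keeps the k most salient "axes" of the knowledge.
    U, s, Vt = svds(A.tocsr(), k=2)
    concept_vecs = U * s  # concepts embedded in the reduced space

    def similarity(a, b):
        """Cosine similarity between two concepts in the reduced space."""
        u, v = concept_vecs[c_index[a]], concept_vecs[c_index[b]]
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

    print(similarity("dog", "cat"))  # high: the concepts share features
    print(similarity("dog", "car"))  # low: little feature overlap
    ```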