Combined Distributional and Logical Semantics


Mike Lewis and Mark Steedman
School of Informatics, University of Edinburgh
Edinburgh, EH8 9AB, UK
[email protected], [email protected]

Transactions of the Association for Computational Linguistics, 1 (2013) 179–192. Action Editor: Johan Bos. Submitted 1/2013; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Abstract

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics. Distributional models have been successful in modelling the meanings of content words, but logical semantics is necessary to adequately represent many function words. We follow formal semantics in mapping language to logical representations, but differ in that the relational constants used are induced by offline distributional clustering at the level of predicate-argument structure. Our clustering algorithm is highly scalable, allowing us to run on corpora the size of Gigaword. Different senses of a word are disambiguated based on their induced types. We outperform a variety of existing approaches on a wide-coverage question answering task, and demonstrate the ability to make complex multi-sentence inferences involving quantifiers on the FraCaS suite.

1 Introduction

Mapping natural language to meaning representations is a central challenge of NLP. There has been much recent progress in unsupervised distributional semantics, in which the meaning of a word is induced based on its usage in large corpora. This approach is useful for a range of key applications including question answering and relation extraction (Lin and Pantel, 2001; Poon and Domingos, 2009; Yao et al., 2011). Because such a semantics can be automatically induced, it escapes the limitation of depending on relations from hand-built training data, knowledge bases or ontologies, which have proved of limited use in capturing the huge variety of meanings that can be expressed in language.

However, distributional semantics has largely developed in isolation from the formal semantics literature. Whilst distributional semantics has been effective in modelling the meanings of content words such as nouns and verbs, it is less clear that it can be applied to the meanings of function words. Semantic operators, such as determiners, negation, conjunctions, modals, tense, mood, aspect, and plurals, are ubiquitous in natural language, and are crucial for high performance on many practical applications—but current distributional models struggle to capture even simple examples. Conversely, computational models of formal semantics have shown low recall on practical applications, stemming from their reliance on ontologies such as WordNet (Miller, 1995) to model the meanings of content words (Bobrow et al., 2007; Bos and Markert, 2005).

For example, consider what is needed to answer a question like Did Google buy YouTube? from the following sentences:

1. Google purchased YouTube
2. Google's acquisition of YouTube
3. Google acquired every company
4. YouTube may be sold to Google
5. Google will buy YouTube or Microsoft
6. Google didn't takeover YouTube

All of these require knowledge of lexical semantics (e.g. that buy and purchase are synonyms), but some also need interpretation of quantifiers, negatives, modals and disjunction. It seems unlikely that distributional or formal approaches can accomplish the task alone.
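A toy illustration of this gap: cosine similarity over word vectors readily captures that buy and purchase are near-synonyms, but a similarity score computed over content words alone cannot see the negation in sentence 6. The sketch below uses invented vectors, not trained embeddings, purely to make the point concrete.

    import numpy as np

    # Invented toy vectors; trained embeddings would show the same pattern.
    vectors = {
        "buy":      np.array([0.90, 0.10, 0.05]),
        "purchase": np.array([0.85, 0.15, 0.05]),
        "sell":     np.array([0.10, 0.90, 0.05]),
    }

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(vectors["buy"], vectors["purchase"]))  # high: near-synonyms
    print(cosine(vectors["buy"], vectors["sell"]))      # lower: different relation

    # But "Google didn't takeover YouTube" and "Google took over YouTube"
    # contain the same content words, so any bag-of-words distributional
    # score treats them as near-identical: the negation is lost.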
We propose a method for mapping natural language to first-order logic representations capable of capturing the meanings of function words such as every, not and or, but which also uses distributional statistics to model the meaning of content words. Our approach differs from standard formal semantics in that the non-logical symbols used in the logical form are cluster identifiers. Where standard semantic formalisms would map the verb write to a write' symbol, we map it to a cluster identifier such as relation37, which the noun author may also map to. This mapping is learnt by offline clustering.

Unlike previous distributional approaches, we perform clustering at the level of predicate-argument structure, rather than syntactic dependency structure. This means that we abstract away from many syntactic differences that are not present in the semantics, such as conjunctions, passives, relative clauses, and long-range dependencies. This significantly reduces sparsity, so we have fewer predicates to cluster and more observations for each.

Of course, many practical inferences rely heavily on background knowledge about the world—such knowledge falls outside the scope of this work.

2 Background

Our approach is based on Combinatory Categorial Grammar (CCG; Steedman, 2000), a strongly lexicalised theory of language in which lexical entries for words contain all language-specific information. The lexical entry for each word contains a syntactic category, which determines which other categories the word may combine with, and a semantic interpretation, which defines the compositional semantics. For example, the lexicon may contain the entry:

    write ⊢ (S\NP)/NP : λyλx.write'(x,y)

Crucially, there is a transparent interface between the syntactic category and the semantics. For example, the transitive verb entry above defines the verb syntactically as a function mapping two noun phrases to a sentence, and semantically as a binary relation between its two argument entities. This means that it is relatively straightforward to deterministically map parser output to a logical form, as in the Boxer system (Bos, 2008). This form of semantics captures the underlying predicate-argument structure, but fails to license many important inferences—as, for example, write and author do not map to the same predicate.

In addition to the lexicon, there is a small set of binary combinators and unary rules, which have a syntactic and semantic interpretation. Figure 1 gives an example CCG derivation.

    Every                     dog            barks
    NP↑/N                     N              S\NP
    λpλq.∀x[p(x) ⇒ q(x)]      λx.dog'(x)     λx.bark'(x)
    ------------------------------------- >
    NP↑ : λq.∀x[dog'(x) ⇒ q(x)]
    ---------------------------------------------------- >
    S : ∀x[dog'(x) ⇒ bark'(x)]

Figure 1: A standard logical form derivation using CCG. The NP↑ notation means that the subject is type-raised, taking the verb phrase as an argument—so it is an abbreviation of S/(S\NP). This is necessary in part to support a correct semantics for quantifiers.

3 Overview of Approach

We attempt to learn a CCG lexicon which maps equivalent words onto the same logical form—for example, learning entries such as:

    author ⊢ N/PP[of] : λxλy.relation37(x,y)
    write ⊢ (S\NP)/NP : λxλy.relation37(x,y)
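The derivation in Figure 1 can be checked mechanically. The following is a minimal sketch that interprets the lambda terms over a small finite domain, so the universal quantifier can be evaluated directly; the domain and the extensions of dog' and bark' are invented for illustration.

    # A sketch of the Figure 1 semantics, evaluated in a finite model.
    domain = ["fido", "rex", "tweety"]
    dog  = lambda x: x in {"fido", "rex"}   # λx.dog'(x)
    bark = lambda x: x in {"fido", "rex"}   # λx.bark'(x)

    # every := λpλq.∀x[p(x) ⇒ q(x)], with ⇒ as material implication
    every = lambda p: lambda q: all((not p(x)) or q(x) for x in domain)

    # Two forward applications, mirroring the derivation:
    # NP↑/N applied to N gives NP↑; NP↑ applied to S\NP gives S.
    np_raised = every(dog)   # λq.∀x[dog'(x) ⇒ q(x)]
    s = np_raised(bark)      # ∀x[dog'(x) ⇒ bark'(x)]
    print(s)                 # True in this model: every dog barks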
The only change to the standard CCG derivation is that the symbols used in the logical form are arbitrary relation identifiers. We learn these by first mapping to a deterministic logical form (using predicates such as author' and write'), using a process similar to Boxer, and then clustering predicates based on their arguments. This lexicon can then be used to parse new sentences, and integrates seamlessly with CCG theories of formal semantics.

Typing predicates—for example, determining that writing is a relation between people and books—has become standard in relation clustering (Schoenmackers et al., 2010; Berant et al., 2011; Yao et al., 2012). We demonstrate how to build a typing model into the CCG derivation, by subcategorizing all terms representing entities in the logical form with a more detailed type. These types are also induced from text, as explained in Section 5, but for convenience we describe them with human-readable labels, such as PER, LOC and BOOK.

A key advantage of typing is that it allows us to model ambiguous predicates. Following Berant et al. (2011), we assume that different type signatures of the same predicate have different meanings, but given a type signature a predicate is unambiguous. For example, a different lexical entry for the verb born is used in the contexts Obama was born in Hawaii and Obama was born in 1961, reflecting a distinction in the semantics that is not obvious in the syntax. Typing also greatly improves the efficiency of clustering, as we only need to compare predicates with the same type. Clustering predicates in this way does not exactly preserve meaning, but still captures the most important relations. Note that this allows us to compare semantic relations across different syntactic types—for example, both transitive verbs and argument-taking nouns can be seen as expressing binary semantic relations between entities.

Figure 2 shows the layers used in our model.

    Input Sentence:
        Shakespeare wrote Macbeth
            ⇓
    Initial semantic analysis:
        write_{arg0,arg1}(shakespeare, macbeth)
            ⇓
    Entity typing:
        write_{arg0:PER,arg1:BOOK}(shakespeare:PER, macbeth:BOOK)
            ⇓
    Distributional semantic analysis:
        relation37(shakespeare:PER, macbeth:BOOK)

Figure 2: Layers used in our model.

4 Initial Semantic Analysis

The initial semantic analysis maps parser output onto a logical form, in a similar way to Boxer. The semantic formalism is based on Steedman (2012).

The first step is syntactic parsing. We use the C&C parser (Clark and Curran, 2004), trained on CCGBank (Hockenmaier and Steedman, 2007), using the refined version of Honnibal et al. (2010), which brings the syntax closer to the predicate-argument structure. An automatic post-processing step makes a number of minor changes to the parser output, which converts the grammar into one more suitable for our semantics. PP (prepositional phrase) and PR (phrasal verb complement) categories are subcategorised with the relevant preposition. Noun compounds with the same MUC named-entity type (Chinchor and Robinson, 1997) are merged into a single non-compositional node (we otherwise ignore named-entity types).
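A hedged sketch of the typing and clustering layers from Figure 2: different type signatures of a predicate select different cluster identifiers, so born in Obama was born in Hawaii and Obama was born in 1961 resolves to different relations, while write and author share one. The cluster table, the typing function, and the identifiers below are invented stand-ins for the structures the paper induces from text.

    # Invented cluster table: (predicate, argument-type signature) -> cluster id.
    CLUSTERS = {
        ("born",   ("PER", "LOC")):  "relation12",  # born-in-place sense
        ("born",   ("PER", "DAT")):  "relation54",  # born-on-date sense
        ("write",  ("PER", "BOOK")): "relation37",
        ("author", ("PER", "BOOK")): "relation37",  # same cluster as write
    }

    def entity_type(entity):
        # Stand-in typing model; the paper induces types from text (Section 5).
        if entity.isdigit():
            return "DAT"
        return {"obama": "PER", "hawaii": "LOC",
                "shakespeare": "PER", "macbeth": "BOOK"}.get(entity, "PER")

    def distributional_predicate(pred, args):
        # Disambiguate the predicate by its arguments' type signature,
        # then look up the induced relation cluster.
        signature = tuple(entity_type(a) for a in args)
        return CLUSTERS.get((pred, signature), pred + str(signature))

    print(distributional_predicate("born",  ["obama", "hawaii"]))         # relation12
    print(distributional_predicate("born",  ["obama", "1961"]))           # relation54
    print(distributional_predicate("write", ["shakespeare", "macbeth"]))  # relation37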