WordNet Embeddings


WordNet Embeddings

Chakaveh Saedi, António Branco, João António Rodrigues, João Ricardo Silva
University of Lisbon
NLX-Natural Language and Speech Group, Department of Informatics
Faculdade de Ciências, Campo Grande, 1749-016 Lisboa, Portugal
{chakaveh.saedi, antonio.branco, joao.rodrigues, jsilva}@di.fc.ul.pt

Proceedings of the 3rd Workshop on Representation Learning for NLP, pages 122–131, Melbourne, Australia, July 20, 2018. © 2018 Association for Computational Linguistics

Abstract

Semantic networks and semantic spaces have been two prominent approaches to represent lexical semantics. While a unified account of the lexical meaning relies on one being able to convert between these representations, in both directions, the conversion direction from semantic networks into semantic spaces started to attract more attention recently. In this paper we present a methodology for this conversion and assess it with a case study. When it is applied over WordNet, the performance of the resulting embeddings in a mainstream semantic similarity task is very good, substantially superior to the performance of word embeddings based on very large collections of texts like word2vec.

1 Introduction

The study of lexical semantics has been at the core of the research on language science and technology, as the meaning of linguistic forms results from the meaning of their lexical units and from the way these are combined (Pelletier, 2016). How to represent lexical semantics has thus been a central topic of inquiry. Three broad families of approaches have emerged in this respect, namely those advocating that lexical semantics is represented as a semantic network (Quillan, 1966), a feature-based model (Minsky, 1975; Bobrow and Norman, 1975), or a semantic space (Harris, 1954; Osgood et al., 1957).

In terms of data structures, under a semantic network approach, the meaning of a lexical unit is represented as a node in a graph whose edges between nodes encode different types of semantic relations holding among the units (e.g. hypernymy, meronymy, etc.). In a feature-based model, the semantics of a lexicon is represented by a hash table where a key is the lexical unit of interest and the respective value is a set of other units denoting typical characteristics of the denotation of the unit in the key (e.g. role, usage or shape, etc.). Under a semantic space perspective, in turn, the meaning of a lexical unit is represented by a vector in a high-dimensional space, where each component is based on some frequency level of co-occurrence with the other units in contexts of language usage.

The motivation for these three families of lexical representation is to be found in their different suitability and success in explaining a wide range of empirical phenomena, in terms of how these are manifest in ordinary language usage and how they are elicited in laboratory experimentation. These phenomena are related to the acquisition, storage and retrieval of lexical knowledge (e.g. the spread activation effect (Meyer and Schvaneveldt, 1971), the fan effect (Anderson, 1974), among many others) and to how this knowledge interacts with other cognitive faculties or tasks, including categorization (Estes, 1994), reasoning (Rips, 1975), problem solving (Holyoak and Koh, 1987), learning (Ross, 1984), etc.

In the scope of the formal and computational modeling of lexical semantics, these approaches have inspired a number of initiatives to build repositories of lexical knowledge. Popular examples of such repositories are, for semantic networks, WordNet (Fellbaum, 1998), for feature-based models, Small World of Words (De Deyne et al., 2013), and for the semantic space, word2vec (Mikolov et al., 2013a), among many others.

Interestingly, to achieve the highest quality, repositories of different types typically resort to different empirical sources of data. For instance, WordNet is constructed on the basis of systematic lexical intuitions handled by human experts; the information encoded in Small World of Words is evoked from laypersons; and word2vec is built on the basis of the co-occurrence frequency of lexical units in a collection of documents.

Even when motivated in the first place by psycholinguistic research goals, these repositories of lexical knowledge have been extraordinarily important for language technology. They have been instrumental for major advances in language processing tasks and applications such as word sense disambiguation, part-of-speech tagging, named entity recognition, sentiment analysis (e.g. (Li and Jurafsky, 2015)), parsing (e.g. (Socher et al., 2013)), textual entailment (e.g. (Baroni et al., 2012)), discourse analysis (e.g. (Ji and Eisenstein, 2014)), among many others.¹

The proliferation of different types of representation for the same object of research is common in science, and searching for a unified rendering of a given research domain has been a major goal in many disciplines. To a large extent, such search focuses on finding ways of converting from one type of representation into another. Once this is made possible, it brings not only the theoretical satisfaction of getting a better unified insight into the research object, but also important instrumental rewards of reapplying results, resources and tools that had been obtained under one representation to the other representations, thus opening the potential for further research advances.

This is the case also in what concerns the research on lexical semantics. Establishing whether and how any given lexical representation can be converted into another representation is important for a more unified account of it. On the language science side, this will likely enhance the plausibility of our empirical modeling about how the mind-brain handles lexical meaning. On the language technology side, in turn, this will permit to reuse resources and find new ways to combine different sources of lexical information for better application results.

In the present paper, we seek to contribute towards a unified account of lexical semantics. We report on the methodology we used to convert from a semantic network based representation of lexical meaning into a semantic space based one, and on the successful evaluation results obtained when applying that methodology. We resorted to Princeton WordNet version 3 as a repository of the lexical semantics of the English language, represented as a semantic graph, and converted a subgraph of it with half of its concepts into wnet2vec, a collection of vectors in a high-dimension space. These WordNet embeddings were evaluated under the same conditions that semantic space based repositories like word2vec are, namely under the processing task of determining the semantic similarity between pairs of lexical units. The evaluation results obtained for wnet2vec are around 15% superior to the results obtained for word2vec with the same mainstream evaluation data set SimLex-999 (Hill et al., 2016).

2 Distributional vectors from ontological graphs

For a given word w, its distributional representation w⃗ (aka word embedding) is a high-dimension vector whose elements w⃗ᵢ record real-valued scores expressing the strength of the semantic affinity of w with other words in the vocabulary. The usual source of these scores, and ultimately the empirical base of word embeddings, has been the frequency of co-occurrence between words taken from large collections of text.

The goal here instead is to use semantic networks as the empirical source of word embeddings. This will permit that the lexical knowledge that is encoded in a semantic graph be re-encoded as an embeddings matrix compiling the distributional vectors of the words in the vocabulary.

To determine the strength of semantic affinity of two words from their representation in a semantic graph, we follow this intuition: the larger the number of paths and the shorter the paths connecting any two nodes, the stronger is their affinity.

To make this intuition operative we resort to the following procedure, to be refined later on. First, the semantic graph G is represented as an adjacency matrix M such that iff two nodes of G with words wᵢ and wⱼ are related by an edge representing a direct semantic relation between them, the element Mᵢⱼ is set to 1 (to 0 otherwise).

Second, to enrich M with scores that represent the strength of semantic affinity of nodes not directly connected with each other by an edge, the following cumulative iteration is resorted to:

    M_G^(n) = I + αM + α²M² + … + αⁿMⁿ    (1)

where I is the identity matrix; the n-th power of the transition matrix, Mⁿ, is the matrix where each Mᵢⱼ counts the number of paths of length n between nodes i and j; and α < 1 is a decay factor determining how longer paths are dominated by shorter ones.

Third, this iterative procedure is pursued until it converges into matrix M_G, which is analytically obtained by an inverse matrix operation:

    M_G = Σ_{e=0}^{∞} (αM)^e = (I − αM)^{−1}    (2)

For the sound application of the conversion, each line in M_G was normalized, using the L2-norm, so that it corresponds to a vector whose scores sum to 1, corresponding to a transition matrix.

Finally, we used Principal Component Analysis (PCA) (Wold et al., 1987) to transform the matrix, […]

3 WordNet embeddings

In order to assess this procedure, we use it to convert a mainstream ontological graph into an embeddings matrix.

Model     Similarity
wnet2vec  0.50
word2vec  0.44

Table 1: Performance in semantic similarity task over SimLex-999 given by Spearman's coefficient (higher score is better).

¹ For the vast number of applications of WordNet, see http://lit.csci.unt.edu/∼wordnet
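Equations (1) and (2) translate directly into code. Below is a minimal numpy sketch of the conversion applied to a toy graph; the vocabulary, the edges, and the choice of α are illustrative assumptions, not the paper's actual WordNet subgraph or settings.

```python
import numpy as np

# Toy semantic graph standing in for a WordNet subgraph (illustrative only).
words = ["dog", "canine", "animal", "cat", "car"]
edges = [("dog", "canine"), ("canine", "animal"), ("cat", "animal")]

n = len(words)
idx = {w: i for i, w in enumerate(words)}
M = np.zeros((n, n))
for u, v in edges:
    M[idx[u], idx[v]] = M[idx[v], idx[u]] = 1.0  # adjacency matrix of equation (1)

# The series sum_e (alpha*M)^e converges when alpha < 1/spectral_radius(M);
# the 0.5 safety margin below is our assumption, not the paper's value.
alpha = 0.5 / np.abs(np.linalg.eigvals(M)).max()

# Equation (2): closed form of the cumulative iteration.
M_G = np.linalg.inv(np.eye(n) - alpha * M)

# Row-wise L2 normalization, as described for the conversion step.
M_G = M_G / np.linalg.norm(M_G, axis=1, keepdims=True)

# PCA (here via SVD on the centered matrix) to reduce dimensionality;
# 2 dimensions is a toy choice.
X = M_G - M_G.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
embeddings = X @ Vt[:2].T

# Indirect paths give "dog" affinity with "animal"; the isolated "car" gets none.
print(M_G[idx["dog"], idx["animal"]], M_G[idx["dog"], idx["car"]])
```

On the full WordNet graph such a naive inversion is computationally demanding, which may be one reason the authors work with a subgraph; the toy run nonetheless shows the intended effect: words connected only through indirect paths acquire nonzero affinity, while disconnected ones do not.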
Recommended publications
  • Automatic Wordnet Mapping Using Word Sense Disambiguation*
Automatic WordNet mapping using word sense disambiguation* — Changki Lee, Seo JungYun, Geunbae Lee. Natural Language Processing Lab, Dept. of Computer Science and Engineering, Pohang University of Science & Technology, San 31, Hyoja-Dong, Pohang, 790-784, Korea ({leeck,gblee}@postech.ac.kr); Natural Language Processing Lab, Dept. of Computer Science, Sogang University, Sinsu-dong 1, Mapo-gu, Seoul, Korea ([email protected]).

Abstract: This paper presents the automatic construction of a Korean WordNet from pre-existing lexical resources. A set of automatic WSD techniques is described for linking Korean words collected from a bilingual MRD to English WordNet synsets. We will show how individual linking provided by each WSD method is then combined to produce a Korean WordNet for nouns.

1 Introduction: There is no doubt on the increasing importance of using wide coverage ontologies for NLP tasks especially for information […] bilingual Korean-English dictionary. The first sense of 'gwan-mog' has 'bush' as a translation in English and 'bush' has five synsets in WordNet. Therefore the first sense of 'gwan-mog' has five candidate synsets. Somehow we decide a synset {shrub, bush} among five candidate synsets and link the sense of 'gwan-mog' to this synset. As seen from this example, when we link the senses of Korean words to WordNet synsets, there are semantic ambiguities. To remove the ambiguities we develop new word sense disambiguation heuristics and an automatic mapping method to construct a Korean WordNet based on the existing English WordNet. This paper is organized as follows. In section 2, we describe multiple heuristics for word sense disambiguation for sense linking.
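The 'gwan-mog' example above is a candidate-generation step followed by disambiguation. Here is a minimal sketch of the candidate step using NLTK's WordNet interface; the bilingual dictionary entry and the sense naming are hypothetical stand-ins for the paper's MRD, and the final synset choice would be made by the WSD heuristics the paper develops.

```python
# Requires: pip install nltk; then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

# Hypothetical bilingual MRD entry: a Korean word sense and its translations.
bilingual_mrd = {"gwan-mog#1": ["bush"]}

def candidate_synsets(korean_sense):
    """Collect every WordNet noun synset of every English translation."""
    candidates = []
    for translation in bilingual_mrd[korean_sense]:
        candidates.extend(wn.synsets(translation, pos=wn.NOUN))
    return candidates

# 'bush' yields several noun synsets; a WSD heuristic must then select one,
# e.g. the {shrub, bush} synset for the plant sense of 'gwan-mog'.
for synset in candidate_synsets("gwan-mog#1"):
    print(synset.name(), "-", synset.definition())
```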
  • Probabilistic Topic Modelling with Semantic Graph
Probabilistic Topic Modelling with Semantic Graph — Long Chen, Joemon M. Jose, Haitao Yu, Fajie Yuan, and Huaizhi Zhang. School of Computing Science, University of Glasgow, Sir Alwyns Building, Glasgow, UK. [email protected]

Abstract. In this paper we propose a novel framework, topic model with semantic graph (TMSG), which couples a topic model with the rich knowledge from DBpedia. To begin with, we extract the disambiguated entities from the document collection using a document entity linking system, i.e., DBpedia Spotlight, from which two types of entity graphs are created from DBpedia to capture local and global contextual knowledge, respectively. Given the semantic graph representation of the documents, we propagate the inherent topic-document distribution with the disambiguated entities of the semantic graphs. Experiments conducted on two real-world datasets show that TMSG can significantly outperform the state-of-the-art techniques, namely, the author-topic model (ATM) and the topic model with biased propagation (TMBP).

Keywords: Topic model · Semantic graph · DBpedia

1 Introduction. Topic models, such as Probabilistic Latent Semantic Analysis (PLSA) [7] and Latent Dirichlet Analysis (LDA) [2], have been remarkably successful in analyzing textual content. Specifically, each document in a document collection is represented as random mixtures over latent topics, where each topic is characterized by a distribution over words. Such a paradigm is widely applied in various areas of text mining. In view of the fact that the information used by these models is limited to the document collection itself, some recent progress has been made on incorporating external resources, such as time [8], geographic location [12], and authorship [15], into topic models.
  • Semantic Memory: a Review of Methods, Models, and Current Challenges
Psychonomic Bulletin & Review, https://doi.org/10.3758/s13423-020-01792-x

Semantic memory: A review of methods, models, and current challenges — Abhilasha A. Kumar. © The Psychonomic Society, Inc. 2020

Abstract: Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional "static" conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges […]
• Knowledge Graphs on the Web – an Overview
January 2020. Knowledge Graphs on the Web – an Overview — Nicolas Heist, Sven Hertling, Daniel Ringler, and Heiko Paulheim. Data and Web Science Group, University of Mannheim, Germany. arXiv:2003.00719v3 [cs.AI] 12 Mar 2020

Abstract. Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowledge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap.

Keywords. Knowledge Graph, Linked Data, Semantic Web, Profiling

1. Introduction. Knowledge Graphs are increasingly used as means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations. [8,9] Besides such domain-specific knowledge graphs which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO. [33] Such knowledge graphs may be used, e.g., for automatically enriching data with background knowledge to be used in knowledge-intensive downstream applications.
  • Large Semantic Network Manual Annotation 1 Introduction
Large Semantic Network Manual Annotation — Václav Novák, Institute of Formal and Applied Linguistics, Charles University, Prague. [email protected]

Abstract: This abstract describes a project aiming at manual annotation of the content of natural language utterances in parallel text corpora. The formalism used in this project is MultiNet – Multilayered Extended Semantic Network. The annotation should be incorporated into the Prague Dependency Treebank as a new annotation layer.

1 Introduction. A formal specification of the semantic content is the aim of numerous semantic approaches such as TIL [6], DRT [9], MultiNet [4], and others. As far as we can tell, there is no large "real life" text corpus manually annotated with such markup. The projects usually work only with automatically generated annotation, if any [1, 6, 3, 2]. We want to create a parallel Czech-English corpus of texts annotated with the corresponding semantic network.

1.1 Prague Dependency Treebank. From the linguistic viewpoint there are language resources such as the Prague Dependency Treebank (PDT) which contain a deep manual analysis of texts [8]. PDT contains annotations of three layers, namely morphological, analytical (shallow dependency syntax) and tectogrammatical (deep dependency syntax). The units of each annotation level are linked with corresponding units on the preceding level. The morphological units are linked directly with the original text. The theoretical basis of the treebank lies in the Functional Generative Description of language system [7]. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted for the current computational-linguistics research needs. The corpus itself is embedded into the latest annotation technology.
  • Universal Or Variation? Semantic Networks in English and Chinese
Universal or variation? Semantic networks in English and Chinese

Understanding the structures of semantic networks can provide great insights into lexico-semantic knowledge representation. Previous work reveals small-world structure in English, the structure that has the following properties: short average path lengths between words and strong local clustering, with a scale-free distribution in which most nodes have few connections while a small number of nodes have many connections [1]. However, it is not clear whether such semantic network properties hold across human languages. In this study, we investigate the universal structures and cross-linguistic variations by comparing the semantic networks in English and Chinese.

Network description. To construct the Chinese and the English semantic networks, we used Chinese Open Wordnet [2,3] and English WordNet [4]. The two wordnets have different word forms in Chinese and English but common word meanings. Word meanings are connected not only to word forms, but also to other word meanings if they form relations such as hypernyms and meronyms (Figure 1).

1. Cross-linguistic comparisons. Analysis: The large-scale structures of the Chinese and the English networks were measured with two key network metrics, small-worldness [5] and scale-free distribution [6]. Results: The two networks have similar size and both exhibit small-worldness (Table 1). However, the small-worldness is much greater in the Chinese network (σ = 213.35) than in the English network (σ = 83.15); this difference is primarily due to the higher average clustering coefficient (ACC) of the Chinese network. The scale-free distributions are similar across the two networks, as indicated by ANCOVA, F(1, 48) = 0.84, p = .37.
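The two measures this abstract relies on, small-worldness σ and the average clustering coefficient, can be computed with networkx. A small sketch on a synthetic graph follows; the graph and its parameters are stand-ins for the wordnet-derived networks, not a reproduction of the study.

```python
import networkx as nx

# Synthetic small-world graph as a placeholder for a semantic network.
G = nx.connected_watts_strogatz_graph(n=120, k=6, p=0.1, seed=42)

acc = nx.average_clustering(G)                  # average clustering coefficient (ACC)
spl = nx.average_shortest_path_length(G)        # short paths: one mark of small worlds
sigma = nx.sigma(G, niter=5, nrand=2, seed=42)  # small-worldness; sigma > 1 suggests a small world

print(f"ACC = {acc:.3f}, average path length = {spl:.3f}, sigma = {sigma:.2f}")
```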
  • Structure at Every Scale: a Semantic Network Account of the Similarities Between Very Unrelated Concepts
De Deyne, S., Navarro, D. J., Perfors, A. and Storms, G. (2016). Structure at every scale: A semantic network account of the similarities between very unrelated concepts. Journal of Experimental Psychology: General, 145, 1228-1254. https://doi.org/10.1037/xge0000192

Structure at every scale: A semantic network account of the similarities between unrelated concepts — Simon De Deyne, Danielle J. Navarro, Amy Perfors (University of Adelaide); Gert Storms (University of Leuven). Word Count: 19586

Abstract: Similarity plays an important role in organizing the semantic system. However, given that similarity cannot be defined on purely logical grounds, it is important to understand how people perceive similarities between different entities. Despite this, the vast majority of studies focus on measuring similarity between very closely related items. When considering concepts that are very weakly related, little is known. In this paper we present four experiments showing that there are reliable and systematic patterns in how people evaluate the similarities between very dissimilar entities. We present a semantic network account of these similarities showing that a spreading activation mechanism defined over a word association network naturally makes correct predictions about weak similarities and the time taken to assess them, whereas, though simpler, models based on direct neighbors between word pairs derived using the same network cannot.

Keywords: word associations, similarity, semantic networks, random walks.

This work was supported by a research grant funded by the Research Foundation - Flanders (FWO), ARC grant DE140101749 awarded to the first author, and by the interdisciplinary research project IDO/07/002 awarded to Dirk Speelman, Dirk Geeraerts, and Gert Storms.
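A rough sketch of the spreading-activation account described above: accumulate damped random walks over a word association network, so that words without a direct link can still receive a weak, nonzero similarity through indirect paths. The tiny graph, the damping factor, and the walk length are illustrative assumptions, not the paper's network or model.

```python
import numpy as np

# Hypothetical word association graph (symmetric, unweighted).
words = ["tea", "cup", "coffee", "bean", "soup"]
idx = {w: i for i, w in enumerate(words)}
A = np.zeros((len(words), len(words)))
for u, v in [("tea", "cup"), ("coffee", "cup"), ("coffee", "bean"), ("soup", "cup")]:
    A[idx[u], idx[v]] = A[idx[v], idx[u]] = 1.0

P = A / A.sum(axis=1, keepdims=True)  # random-walk transition probabilities

# Sum damped walks of length 1..10: longer (weaker) paths contribute less.
damping = 0.8
step = np.eye(len(words))
spread = np.zeros_like(P)
for k in range(1, 11):
    step = step @ P
    spread += damping ** k * step

# "tea" and "bean" are not directly associated, yet get a weak similarity.
print(spread[idx["tea"], idx["bean"]])
```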
  • Semantic Computing
SEMANTIC COMPUTING — Lecture 7: Introduction to Distributional and Distributed Semantics. Dagmar Gromann, International Center For Computational Logic, TU Dresden, 30 November 2018

Overview: • Distributional Semantics • Distributed Semantics – Word Embeddings

Distributional Semantics — Definition: • meaning of a word is the set of contexts in which it occurs • no other information is used than the corpus-derived information about word distribution in contexts (co-occurrence information of words) • semantic similarity can be inferred from proximity in contexts • at the very core: the Distributional Hypothesis

Distributional Hypothesis: "similarity of meaning correlates with similarity of distribution" – Harris, Z. S. (1954) "Distributional structure". Word, Vol. 10, No. 2-3, pp. 146-162. meaning = use = distribution in context => semantic distance

Remember Lecture 1? Types of Word Meaning: • Encyclopaedic meaning: words provide access to a large inventory of structured knowledge (world knowledge) • Denotational meaning: reference of a word to object/concept or its "dictionary definition" (signifier <-> signified) • Connotative meaning: word meaning is understood by its cultural or emotional association (positive, negative, neutral connotation; e.g. "She's a dragon" in Chinese and English) • Conceptual meaning: word meaning is associated with the mental concepts it gives access to (e.g. prototype theory) • Distributional meaning: "You shall know a word by the company it keeps" (J.R. Firth 1957: 11). John Rupert Firth (1957). "A synopsis of linguistic theory 1930-1955." In Special Volume of the Philological Society. Oxford: Oxford University Press.

Distributional Hypothesis in Practice — Study by McDonald and Ramscar (2001): • The man poured from a balack into a handleless cup.
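The distributional hypothesis on the slides above can be made concrete in a few lines: represent each word by the counts of the words occurring within a small window around it, and compare those count vectors. The toy corpus and window size are illustrative choices.

```python
from collections import Counter, defaultdict

corpus = "the cat drank milk . the dog drank water . the cat chased the dog".split()
window = 2  # context words considered on each side

cooc = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[word][corpus[j]] += 1

# "cat" and "dog" occur in similar contexts ("the", "drank"), so their
# co-occurrence vectors are close: distributional similarity in miniature.
print(cooc["cat"])
print(cooc["dog"])
```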
  • Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models
Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models. Farhan Samir (Department of Computer Science, University of Toronto, [email protected]), Barend Beekhuizen (Department of Language Studies, University of Toronto, Mississauga, [email protected]), Suzanne Stevenson (Department of Computer Science, University of Toronto, [email protected])

Abstract: Distributional semantic models (DSMs) are substantially varied in the types of semantic similarity that they output. Despite this high variance, the different types of similarity are often conflated as a monolithic concept in models of behavioural data. We apply the insight that word2vec's representations can be used for capturing both paradigmatic similarity (substitutability) and syntagmatic similarity (co-occurrence) to two sets of experimental findings (semantic priming and the effect of semantic neighbourhood density) that have previously been modeled with monolithic conceptions of DSM-based semantic similarity. Using paradigmatic and syntagmatic similarity based on word2vec, we show that for some tasks and types of items the two types of similarity play complementary explanatory roles, whereas for others, only syntagmatic similarity seems to matter.

DSMs thought to instantiate one but not the other kind of similarity have been found to explain different priming effects on an aggregate level (as opposed to an item level). The underlying conception of semantic versus associative similarity, however, has been disputed (Ettinger & Linzen, 2016; McRae et al., 2012), as has one of the main ways in which it is operationalized (Günther et al., 2016). In this paper, we instead follow another distinction, based on distributional properties (e.g. Schütze & Pedersen, 1993), namely, that between syntagmatically related words (words that occur in each other's near proximity, such as drink–coffee), and paradigmatically related words (words that can be substituted…
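One common way to operationalize the two similarity types with word2vec, in the spirit of the abstract above, uses the model's two embedding matrices: comparing two word (input) vectors approximates paradigmatic similarity, while comparing a word vector with a context (output) vector approximates syntagmatic similarity. The sketch below uses random stand-in matrices rather than a trained model, so only the mechanics are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"drink": 0, "coffee": 1, "tea": 2}
W = rng.normal(size=(3, 50))  # stand-in word (input) embeddings
C = rng.normal(size=(3, 50))  # stand-in context (output) embeddings

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Paradigmatic (substitutability): word vector vs. word vector.
print("coffee ~ tea:", cos(W[vocab["coffee"]], W[vocab["tea"]]))

# Syntagmatic (co-occurrence): word vector vs. context vector.
print("drink -> coffee:", cos(W[vocab["drink"]], C[vocab["coffee"]]))
```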
  • A Comparison of Word Embeddings and N-Gram Models for Dbpedia Type and Invalid Entity Detection †
Information — Article. A Comparison of Word Embeddings and N-gram Models for DBpedia Type and Invalid Entity Detection † — Hanqing Zhou *, Amal Zouaq and Diana Inkpen. School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada; [email protected] (A.Z.); [email protected] (D.I.)

* Correspondence: [email protected]; Tel.: +1-613-562-5800
† This paper is an extended version of our conference paper: Hanqing Zhou, Amal Zouaq, and Diana Inkpen. DBpedia Entity Type Detection using Entity Embeddings and N-Gram Models. In Proceedings of the International Conference on Knowledge Engineering and Semantic Web (KESW 2017), Szczecin, Poland, 8–10 November 2017, pp. 309–322.

Received: 6 November 2018; Accepted: 20 December 2018; Published: 25 December 2018

Abstract: This article presents and evaluates a method for the detection of DBpedia types and entities that can be used for knowledge base completion and maintenance. This method compares entity embeddings with traditional N-gram models coupled with clustering and classification. We tackle two challenges: (a) the detection of entity types, which can be used to detect invalid DBpedia types and assign DBpedia types for type-less entities; and (b) the detection of invalid entities in the resource description of a DBpedia entity. Our results show that entity embeddings outperform n-gram models for type and entity detection and can contribute to the improvement of DBpedia's quality, maintenance, and evolution.

Keywords: semantic web; DBpedia; entity embedding; n-grams; type identification; entity identification; data mining; machine learning

1. Introduction. The Semantic Web is defined by Berners-Lee et al. […]
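A hedged sketch of the type-detection setup the abstract describes: train a classifier over entity embeddings to predict DBpedia types. The embeddings, labels, and classifier below are placeholders, not the paper's actual features or pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 100))   # stand-in entity embeddings
y = rng.integers(0, 3, size=300)  # stand-in type labels (e.g. Person, Place, Work)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=1)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("type-detection accuracy:", clf.score(X_te, y_te))
```

With random features the score hovers around chance; with real entity embeddings the classifier would exploit the geometry the article reports on.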
  • Distributional Semantics
Distributional Semantics — Computational Linguistics: Jordan Boyd-Graber, University of Maryland. Slides adapted from Yoav Goldberg and Omer Levy

From Distributional to Distributed Semantics — The new kid on the block: deep learning / neural networks; "distributed" word representations. Feed text into a neural net, get back "word embeddings": each word is represented as a low-dimensional vector, and the vectors capture "semantics" — word2vec (Mikolov et al).

This part of the talk: word2vec as a black box; a peek inside the black box; the relation between word embeddings and the distributional representation; tailoring word embeddings to your needs using word2vec.

word2vec example neighbours:
dog: cat, dogs, dachshund, rabbit, puppy, poodle, rottweiler, mixed-breed, doberman, pig
sheep: cattle, goats, cows, chickens, sheeps, hogs, donkeys, herds, shorthorn, livestock
november: october, december, april, june, february, july, september, january, august, march
jerusalem: tiberias, jaffa, haifa, israel, palestine, nablus, damascus, katamon, ramla, safed
teva: pfizer, schering-plough, novartis, astrazeneca, glaxosmithkline, sanofi-aventis, mylan, sanofi, genzyme, pharmacia

Working with Dense Vectors — Word Similarity: similarity is calculated using cosine similarity:

    sim(dog, cat) = (dog · cat) / (||dog|| ||cat||)

where dog and cat denote the word vectors. For normalized vectors (||x|| = 1), this is equivalent to a dot product:

    sim(dog, cat) = dog · cat

Normalize the vectors when loading them.
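The slide's cosine formula in executable form, as a small numpy helper; the vectors here are random placeholders standing in for trained word2vec vectors.

```python
import numpy as np

def cosine(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

rng = np.random.default_rng(0)
dog, cat = rng.normal(size=300), rng.normal(size=300)
print(cosine(dog, cat))

# With unit-normalized vectors, cosine reduces to a plain dot product,
# which is why the slides advise normalizing vectors when loading them.
dog_n, cat_n = dog / np.linalg.norm(dog), cat / np.linalg.norm(cat)
print(dog_n @ cat_n)
```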
  • NL Assistant: a Toolkit for Developing Natural Language: Applications
NL Assistant: A Toolkit for Developing Natural Language Applications — Deborah A. Dahl, Lewis M. Norton, Ahmed Bouzid, and Li Li. Unisys Corporation

Introduction. We will be demonstrating a toolkit for developing natural language-based applications and two applications. The goals of this toolkit are to reduce development time and cost for natural language based applications by reducing the amount of linguistic and programming work needed. Linguistic work has been reduced by integrating large-scale linguistics resources---Comlex (Grishman, et al., 1993) and WordNet (Miller, 1990). Programming work is reduced by automating some of the programming tasks. The toolkit is designed for both speech- and text-based interface applications. It runs in a Windows NT environment. Applications can run in either Windows NT or Unix.

System Components. The NL Assistant toolkit consists of […]

[…] scale lexical resources have been integrated with the toolkit. These include Comlex and WordNet as well as additional, internally developed resources. The second strategy is to provide easy to use editors for entering linguistic information.

Servers. Lexical information is supplied by four external servers which are accessed by the natural language engine during processing. Syntactic information is supplied by a lexical server based on the 50K word Comlex dictionary available from the Linguistic Data Consortium. Semantic information comes from two servers, a KB server based on the noun portion of WordNet (70K concepts), and a semantics server containing case frame information for 2500 English verbs. A denotations server using unique concept names generated for each WordNet synset at ISI links the words in the lexicon to the concepts in the KB.