A Semantic Network Approach to Measuring Relatedness

A Semantic Network Approach to Measuring Relatedness

Brian Harrington
Oxford University Computing Laboratory
[email protected]
(Coling 2010: Poster Volume, pages 356-364, Beijing, August 2010)

Abstract

Humans are very good at judging the strength of relationships between two terms, a task which, if it can be automated, would be useful in a range of applications. Systems attempting to solve this problem automatically have traditionally either used relative positioning in lexical resources such as WordNet, or distributional relationships in large corpora. This paper proposes a new approach, whereby relationships are derived from natural language text by using existing NLP tools, then integrated into a large-scale semantic network. Spreading activation is then used on this network in order to judge the strengths of all relationships connecting the terms. In comparisons with human measurements, this approach was able to obtain results on par with the best purpose-built systems, using only a relatively small corpus extracted from the web. This is particularly impressive, as the network creation system is a general tool for information collection and integration, and is not specifically designed for tasks of this type.

1 Introduction

The ability to determine semantic relatedness between terms is useful for a variety of NLP applications, including word sense disambiguation, information extraction and retrieval, and text summarisation (Budanitsky and Hirst, 2006). However, there is an important distinction to be made between semantic relatedness and semantic similarity. As Resnik (1999) notes, "Semantic similarity represents a special case of semantic relatedness: for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar". Budanitsky and Hirst (2006) further note that "Computational applications typically require relatedness rather than just similarity; for example, money and river are cues to the in-context meaning of bank that are just as good as trust company".

Systems for automatically determining the degree of semantic relatedness between two terms have traditionally either used a measurement based on the distance between the terms within WordNet (Banerjee and Pedersen, 2003; Hughes and Ramage, 2007), or used co-occurrence statistics from a large corpus (Mohammad and Hirst, 2006; Padó and Lapata, 2007). Recent systems have, however, shown improved results using extremely large corpora (Agirre et al., 2009) and existing large-scale resources such as Wikipedia (Strube and Ponzetto, 2006).

In this paper, we propose a new approach to determining semantic relatedness, in which a semantic network is automatically created from a relatively small corpus using existing NLP tools and a network creation system called ASKNet (Harrington and Clark, 2007), and then spreading activation is used to determine the strength of the connections within that network. This process is more analogous to the way the task is performed by humans. Information is collected from fragments and assimilated into a large semantic knowledge structure which is not purposely built for a single task, but is constructed as a general resource containing a wide variety of information. Relationships represented within this structure can then be used to determine the total strength of the relations between any two terms.
2 Existing Approaches

2.1 Resource-Based Methods

A popular method for automatically judging semantic distance between terms is through WordNet (Fellbaum, 1998), using the lengths of paths between words in the taxonomy as a measure of distance. While WordNet-based approaches have obtained promising results for measuring semantic similarity (Jiang and Conrath, 1997; Banerjee and Pedersen, 2003), the results for the more general notion of semantic relatedness have been less promising (Hughes and Ramage, 2007).

One disadvantage of using WordNet for evaluating semantic relatedness is its hierarchical taxonomic structure. This results in terms such as car and bicycle being close in the network, but terms such as car and gasoline being far apart. Another difficulty arises from the non-scalability of WordNet. While the quality of the network is high, the manual nature of its construction means that arbitrary word pairs may not occur in the network. Hence in this paper we pursue an approach in which the resource for measuring semantic relatedness is created automatically, based on naturally occurring text.

A similar project, not using WordNet, is WikiRelate (Strube and Ponzetto, 2006), which uses the existing link structure of Wikipedia as its base network, and uses path-based measurements similar to those found in WordNet approaches to compute semantic relatedness. This project has seen improved results over most WordNet-based approaches, largely due to the nature of Wikipedia, where articles tend to link to other articles which are related, rather than just ones which are similar.

2.2 Distributional Methods

An alternative method for judging semantic distance is to use word co-occurrence statistics derived from a very large corpus (McDonald and Brew, 2004; Padó and Lapata, 2007) or from the web using search engine results (Turney, 2001).

In a recent paper, Agirre et al. (2009) parsed 4 billion documents (1.6 terawords) crawled from the web, and then used a search function to extract syntactic relations and context windows surrounding key words. These were then used as features for a vector space model, in a manner similar to the work of Padó and Lapata (2007) using the British National Corpus (BNC). This system has produced excellent results, indicating that the quality of the results for these types of approaches is related to the size and coverage of the corpus. This does, however, present problems moving forward: 1.6 terawords is an extremely large corpus, and it is likely that there would be diminishing returns on investment for increasingly large corpora. In the same paper, another method was shown which used the PageRank algorithm, run over a network formed from WordNet and the WordNet gloss tags, to produce equally impressive results.
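To make the distributional approach concrete, the following is a minimal sketch that builds co-occurrence vectors from a toy corpus and compares terms by cosine similarity. The three-sentence corpus and the window size are illustrative assumptions, not details of any of the cited systems:

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus standing in for a large collection (illustrative assumption).
corpus = [
    "the car needs gasoline to run",
    "the bicycle and the car share the road",
    "gasoline prices worry car owners",
]

WINDOW = 2  # context window size; an assumed, typical small value

# word -> sparse co-occurrence vector (a Counter over context words)
vectors = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        for j in range(max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)):
            if j != i:
                vectors[word][tokens[j]] += 1

def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

print(cosine(vectors["car"], vectors["gasoline"]))
print(cosine(vectors["car"], vectors["bicycle"]))
```

At realistic scale the same idea is applied to billions of tokens, typically with raw counts reweighted (e.g. by pointwise mutual information) before comparison.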
3 A Semantic Network Approach

The resource we use is a semantic network, automatically created by the large-scale network creation program ASKNet. The relations between nodes in the network are based on the relations returned by a parser and semantic analyser, which are typically the arguments of predicates found in the text. Hence terms in the network are related by the chain of syntactic/semantic relations which connect the terms in documents, making the network ideal for measuring the general notion of semantic relatedness.

Distinct occurrences of terms and entities are combined into a single node using a novel form of spreading activation (Collins and Loftus, 1975). This combining of distinct mentions produces a cohesive connected network, allowing terms and entities to be related across sentences and even larger units such as documents. Once the network is built, spreading activation is used to determine semantic relatedness between terms. For example, to determine how related car and gasoline are, activation is given to one of the nodes, say car, and the network is "fired" to allow the activation to spread to the rest of the network. The amount of activation received by gasoline is then a measure of the strength of the semantic relation between the two terms.

We use three datasets derived from human judgements of semantic relatedness to test our technique. Since the datasets contain general terms which may not appear in an existing corpus, we create our own corpus by harvesting text from the web via Google. This approach has the advantage of requiring little human intervention and being extensible to new datasets. Our results using the semantic network derived from the web-based corpus are comparable to those of the best performing existing methods tested on the same datasets.

4 Creating the Semantic Networks

ASKNet creates the semantic networks using existing NLP tools to extract syntactic and semantic information from text. This information is then combined using a modified version of the update algorithm of Harrington and Clark (2007) to create an integrated large-scale network. By mapping together concepts and objects that relate to the same real-world entities, the system is able to transform the output of various NLP tools into a single network.

As an example of the usefulness of information integration, consider the monk-asylum example, taken from the RG dataset (described in Section 5.1). It is possible that even a large corpus could contain sentences linking monk with church, and linking church with asylum, but no direct links between monk and asylum. However, with an integrated semantic network, activation can travel across multiple links, and through multiple paths, and will show a relationship, albeit probably not a very strong one, between monk and asylum, which corresponds nicely with our intuition.

Figure 1, which gives an example network built from DUC documents describing the Elian Gonzalez custody battle, gives an indication of the kind of network that ASKNet builds. This figure does not show the full network, which is too large for a single figure, but shows the "core" of the network, where the core is determined using the technique described in Harrington and Clark (2009). The black boxes represent named entities mentioned in the text, which may have been mentioned a number of times across documents, possibly under different names (e.g. Fidel Castro vs. President Castro). The diamonds are named directed edges, which represent relationships between entities.

A manual evaluation using human judges has been performed to measure the accuracy of ASKNet networks.
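To illustrate the firing process described above, here is a minimal sketch of spreading activation over a small hand-built weighted graph. The decay model and the toy network are assumptions for exposition, not ASKNet's actual update and firing algorithm:

```python
from collections import defaultdict

# Hypothetical weighted semantic network: node -> {neighbour: edge weight}.
graph = {
    "car":      {"gasoline": 0.8, "road": 0.6, "bicycle": 0.3},
    "gasoline": {"car": 0.8, "price": 0.5},
    "road":     {"car": 0.6, "bicycle": 0.7},
    "bicycle":  {"car": 0.3, "road": 0.7},
    "price":    {"gasoline": 0.5},
}

def spread_activation(graph, source, decay=0.5, steps=3):
    """Fire `source` and let activation spread for a fixed number of steps.

    At each step every active node passes a decayed share of its activation
    to its neighbours, in proportion to edge weight.
    """
    activation = defaultdict(float)
    activation[source] = 1.0
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, act in activation.items():
            nxt[node] += act                      # nodes keep what they have
            edges = graph.get(node, {})
            total = sum(edges.values())
            for nbr, weight in edges.items():
                nxt[nbr] += decay * act * (weight / total)
        activation = nxt
    return activation

# Relatedness of car and gasoline: fire "car", read activation at "gasoline".
result = spread_activation(graph, "car")
print(result["gasoline"])  # strong, direct relation
print(result["price"])     # weaker, multi-hop relation
```

Note that price receives some activation from car via gasoline even though the two are not directly linked, mirroring the monk-asylum behaviour discussed in Section 4.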
Recommended publications
  • Combining Semantic and Lexical Measures to Evaluate Medical Terms Similarity
    Combining semantic and lexical measures to evaluate medical terms similarity
    Silvio Domingos Cardoso (1,2), Marcos Da Silveira (1), Ying-Chi Lin (3), Victor Christen (3), Erhard Rahm (3), Chantal Reynaud-Delaître (2), and Cédric Pruski (1)
    1 LIST, Luxembourg Institute of Science and Technology, Luxembourg; 2 LRI, University of Paris-Sud XI, France; 3 Department of Computer Science, Universität Leipzig, Germany
    Abstract. The use of similarity measures in various domains is a cornerstone for different tasks ranging from ontology alignment to information retrieval. To this end, existing metrics can be classified into several categories, among which the lexical and semantic families of similarity measures predominate but have rarely been combined to complete the aforementioned tasks. In this paper, we propose an original approach combining lexical and ontology-based semantic similarity measures to improve the evaluation of term relatedness. We validate our approach through a set of experiments based on a corpus of reference constructed by domain experts of the medical field, and further evaluate the impact of ontology evolution on the semantic similarity measures used. Keywords: Similarity measures · Ontology evolution · Semantic Web · Medical terminologies
    1 Introduction. Measuring the similarity between terms is at the heart of many research investigations. In ontology matching, similarity measures are used to evaluate the relatedness between concepts from different ontologies [9]. The outcomes are mappings between the ontologies, increasing the coverage of domain knowledge and optimizing semantic interoperability between information systems. In information retrieval, similarity measures are used to evaluate the relatedness between units of language (e.g., words, sentences, documents) to optimize search [39].
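As a rough illustration of the combination idea (not the measures, weights, or ontology used in the paper), the sketch below mixes a string-level score with a simple ontology path-based score; the ontology fragment and the mixing weight alpha are assumptions:

```python
from difflib import SequenceMatcher

# Hypothetical fragment of a medical ontology: term -> parent term.
parents = {
    "myocardial infarction": "heart disease",
    "cardiac infarction": "heart disease",
    "heart disease": "cardiovascular disease",
}

def lexical_sim(a: str, b: str) -> float:
    """String-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def path_to_root(term: str) -> list:
    path = [term]
    while path[-1] in parents:
        path.append(parents[path[-1]])
    return path

def semantic_sim(a: str, b: str) -> float:
    """Path-based score: 1 / (1 + #edges between terms via a common ancestor)."""
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    if not common:
        return 0.0
    return 1.0 / (1.0 + min(pa.index(c) + pb.index(c) for c in common))

def combined_sim(a: str, b: str, alpha: float = 0.5) -> float:
    # alpha is an assumed mixing weight, not a value from the paper
    return alpha * lexical_sim(a, b) + (1 - alpha) * semantic_sim(a, b)

print(combined_sim("myocardial infarction", "cardiac infarction"))
```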
  • Probabilistic Topic Modelling with Semantic Graph
    Probabilistic Topic Modelling with Semantic Graph
    Long Chen, Joemon M. Jose, Haitao Yu, Fajie Yuan, and Huaizhi Zhang
    School of Computing Science, University of Glasgow, Sir Alwyns Building, Glasgow, UK
    Abstract. In this paper we propose a novel framework, topic model with semantic graph (TMSG), which couples a topic model with the rich knowledge of DBpedia. To begin with, we extract the disambiguated entities from the document collection using a document entity linking system, i.e., DBpedia Spotlight, from which two types of entity graphs are created from DBpedia to capture local and global contextual knowledge, respectively. Given the semantic graph representation of the documents, we propagate the inherent topic-document distribution with the disambiguated entities of the semantic graphs. Experiments conducted on two real-world datasets show that TMSG can significantly outperform the state-of-the-art techniques, namely, the author-topic model (ATM) and the topic model with biased propagation (TMBP). Keywords: Topic model · Semantic graph · DBpedia
    1 Introduction. Topic models, such as Probabilistic Latent Semantic Analysis (PLSA) [7] and Latent Dirichlet Allocation (LDA) [2], have been remarkably successful in analyzing textual content. Specifically, each document in a document collection is represented as random mixtures over latent topics, where each topic is characterized by a distribution over words. Such a paradigm is widely applied in various areas of text mining. In view of the fact that the information used by these models is limited to the document collection itself, some recent progress has been made on incorporating external resources, such as time [8], geographic location [12], and authorship [15], into topic models.
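A schematic sketch of the propagation step (not the TMSG model itself): document-topic distributions are pulled toward the distributions of the entities they mention, under an assumed mixing weight:

```python
import numpy as np

# Hypothetical document-topic matrix (3 documents, 4 topics); rows sum to 1.
doc_topic = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.40, 0.40, 0.10, 0.10],
])

# Document-entity incidence: docs 0 and 2 mention entity 0; docs 1 and 2, entity 1.
doc_entity = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])

LAM = 0.3  # assumed propagation weight

# Entity-topic distributions: average over the documents mentioning each entity.
entity_topic = (doc_entity.T @ doc_topic) / doc_entity.sum(axis=0)[:, None]

# Each document's neighbourhood distribution: average over its entities.
neighbourhood = (doc_entity @ entity_topic) / doc_entity.sum(axis=1)[:, None]

# One propagation step pulls each document toward its entity neighbourhood.
smoothed = (1 - LAM) * doc_topic + LAM * neighbourhood
print(smoothed.round(3))
```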
  • Semantic Memory: a Review of Methods, Models, and Current Challenges
    Psychonomic Bulletin & Review, https://doi.org/10.3758/s13423-020-01792-x
    Semantic memory: A review of methods, models, and current challenges
    Abhilasha A. Kumar. © The Psychonomic Society, Inc. 2020
    Abstract. Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional "static" conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges
  • Knowledge Graphs on the Web – an Overview (arXiv:2003.00719v3 [cs.AI])
    January 2020
    Knowledge Graphs on the Web – an Overview
    Nicolas Heist, Sven Hertling, Daniel Ringler, and Heiko Paulheim
    Data and Web Science Group, University of Mannheim, Germany
    Abstract. Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowledge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. Keywords. Knowledge Graph, Linked Data, Semantic Web, Profiling
    1. Introduction. Knowledge Graphs are increasingly used as a means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations [8,9]. Besides such domain-specific knowledge graphs, which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO [33]. Such knowledge graphs may be used, e.g., for automatically enriching data with background knowledge to be used in knowledge-intensive downstream applications.
  • Evaluating Lexical Similarity to Build Sentiment Similarity
    Evaluating Lexical Similarity to Build Sentiment Similarity
    Grégoire Jadi*, Vincent Claveau†, Béatrice Daille*, Laura Monceaux-Cachard*
    * LINA - Univ. Nantes; † IRISA-CNRS, France
    Abstract. In this article, we propose to evaluate the lexical similarity information provided by word representations against several opinion resources using traditional Information Retrieval tools. Word representations have been used to build and to extend opinion resources such as lexicons and ontologies, and their performance has been evaluated on sentiment analysis tasks. We question this method by measuring the correlation between the sentiment proximity provided by opinion resources and the semantic similarity provided by word representations using different correlation coefficients. We also compare the neighbors found in word representations with lists of similar opinion words. Our results show that the proximity of words in state-of-the-art word representations is not very effective for building sentiment similarity. Keywords: opinion lexicon evaluation, sentiment analysis, spectral representation, word embedding
    1. Introduction. The goal of opinion mining or sentiment analysis is to identify and classify texts according to the opinion, sentiment or emotion that they convey. It requires a good knowledge of the sentiment properties of each word. These properties can either be discovered automatically by machine learning algorithms, or embedded into language resources, which can in turn be used to extend or build sentiment lexicons. For example, (Toh and Wang, 2014) use the syntactic categories from WordNet as features for a CRF to extract aspects, and WordNet relations such as antonymy and synonymy to extend existing opinion lexicons. Similarly, (Agarwal et al., 2011) look for synonyms in WordNet to find the polarity of words.
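The paper's evaluation idea can be shown in miniature: score word pairs by embedding cosine similarity and by lexicon-based sentiment proximity, then correlate the two rankings. The random vectors and the toy polarity lexicon below are stand-ins for real word representations and opinion resources:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Stand-in word vectors (in practice: word2vec, GloVe, spectral representations).
words = ["good", "great", "bad", "awful", "table"]
vecs = {w: rng.normal(size=50) for w in words}

# Stand-in opinion lexicon: polarity scores in [-1, 1].
polarity = {"good": 0.8, "great": 0.9, "bad": -0.7, "awful": -0.9, "table": 0.0}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

pairs = [(a, b) for i, a in enumerate(words) for b in words[i + 1:]]
embedding_sim = [cosine(vecs[a], vecs[b]) for a, b in pairs]
sentiment_sim = [1 - abs(polarity[a] - polarity[b]) / 2 for a, b in pairs]

rho, p = spearmanr(embedding_sim, sentiment_sim)
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
```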
  • Large Semantic Network Manual Annotation
    Large Semantic Network Manual Annotation
    Václav Novák, Institute of Formal and Applied Linguistics, Charles University, Prague
    Abstract. This abstract describes a project aiming at manual annotation of the content of natural language utterances in parallel text corpora. The formalism used in this project is MultiNet – Multilayered Extended Semantic Network. The annotation should be incorporated into the Prague Dependency Treebank as a new annotation layer.
    1 Introduction. A formal specification of semantic content is the aim of numerous semantic approaches such as TIL [6], DRT [9], MultiNet [4], and others. As far as we can tell, there is no large "real life" text corpus manually annotated with such markup. The projects usually work only with automatically generated annotation, if any [1, 6, 3, 2]. We want to create a parallel Czech-English corpus of texts annotated with the corresponding semantic network.
    1.1 Prague Dependency Treebank. From the linguistic viewpoint, there are language resources such as the Prague Dependency Treebank (PDT) which contain a deep manual analysis of texts [8]. PDT contains annotations on three layers, namely morphological, analytical (shallow dependency syntax) and tectogrammatical (deep dependency syntax). The units of each annotation level are linked with corresponding units on the preceding level. The morphological units are linked directly with the original text. The theoretical basis of the treebank lies in the Functional Generative Description of language system [7]. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted for the current computational-linguistics research needs. The corpus itself is embedded into the latest annotation technology.
  • Universal Or Variation? Semantic Networks in English and Chinese
    Universal or variation? Semantic networks in English and Chinese
    Understanding the structures of semantic networks can provide great insights into lexico-semantic knowledge representation. Previous work reveals a small-world structure in English: short average path lengths between words and strong local clustering, with a scale-free distribution in which most nodes have few connections while a small number of nodes have many connections [1]. However, it is not clear whether such semantic network properties hold across human languages. In this study, we investigate the universal structures and cross-linguistic variations by comparing the semantic networks in English and Chinese.
    Network description. To construct the Chinese and the English semantic networks, we used Chinese Open Wordnet [2,3] and English WordNet [4]. The two wordnets have different word forms in Chinese and English but common word meanings. Word meanings are connected not only to word forms, but also to other word meanings if they form relations such as hypernyms and meronyms (Figure 1).
    1. Cross-linguistic comparisons. Analysis. The large-scale structures of the Chinese and the English networks were measured with two key network metrics, small-worldness [5] and scale-free distribution [6]. Results. The two networks have similar size and both exhibit small-worldness (Table 1). However, the small-worldness is much greater in the Chinese network (σ = 213.35) than in the English network (σ = 83.15); this difference is primarily due to the higher average clustering coefficient (ACC) of the Chinese network. The scale-free distributions are similar across the two networks, as indicated by ANCOVA, F(1, 48) = 0.84, p = .37.
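A sketch of how small-worldness can be computed with networkx, using a generated graph in place of a wordnet-derived network; a σ well above 1 indicates small-world structure:

```python
import networkx as nx

# Generated stand-in for a wordnet-derived semantic network.
G = nx.connected_watts_strogatz_graph(n=200, k=6, p=0.1, seed=42)

C = nx.average_clustering(G)
L = nx.average_shortest_path_length(G)

# Random reference graph with the same number of nodes and edges.
R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=42)
if not nx.is_connected(R):  # path lengths are undefined on disconnected graphs
    R = R.subgraph(max(nx.connected_components(R), key=len)).copy()
C_rand = nx.average_clustering(R)
L_rand = nx.average_shortest_path_length(R)

# Small-worldness: high clustering relative to random, similar path length.
sigma = (C / C_rand) / (L / L_rand)
print(f"C = {C:.3f}, L = {L:.2f}, sigma = {sigma:.1f}")
```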
  • Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
    Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks
    Tingyu Xia (School of Artificial Intelligence and Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University), Yue Wang (School of Information and Library Science, University of North Carolina at Chapel Hill), Yuan Tian* (School of Artificial Intelligence and Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University), Yi Chang* (School of Artificial Intelligence, International Center of Future Science, and Key Laboratory of Symbolic Computation and Knowledge Engineering, Jilin University)
    ABSTRACT. We study the problem of incorporating prior knowledge into a deep Transformer-based model, i.e., Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and analyzing what BERT has already known when solving this task, we obtain a better understanding of what task-specific knowledge BERT needs the most and where it is most needed. The analysis further motivates us to take a different approach than most existing works. Instead of using prior knowledge to create a new training task for fine-tuning
    1 INTRODUCTION. Measuring semantic similarity between two pieces of text is a fundamental and important task in natural language understanding. Early works on this task often leverage knowledge resources such as WordNet [28] and UMLS [2], as these resources contain well-defined types and relations between words and concepts. Recent works have shown that deep learning models are more effective on this task by learning linguistic knowledge from large-scale text data and representing text (words and sentences) as continuous trainable embedding vectors [10, 17].
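For context, a generic BERT-based similarity baseline of the kind such work builds on (mean-pooled token embeddings compared by cosine); this is not the prior-knowledge attention mechanism proposed in the paper:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    """Mean-pooled BERT token embeddings as a crude sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)    # ignore padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

a = embed("A man is playing a guitar.")
b = embed("Someone plays an instrument.")
print(torch.cosine_similarity(a, b).item())
```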
  • Structure at Every Scale: a Semantic Network Account of the Similarities Between Very Unrelated Concepts
    De Deyne, S., Navarro, D. J., Perfors, A. and Storms, G. (2016). Structure at every scale: A semantic network account of the similarities between very unrelated concepts. Journal of Experimental Psychology: General, 145, 1228-1254. https://doi.org/10.1037/xge0000192
    Structure at every scale: A semantic network account of the similarities between unrelated concepts
    Simon De Deyne, Danielle J. Navarro, Amy Perfors (University of Adelaide) and Gert Storms (University of Leuven)
    Abstract. Similarity plays an important role in organizing the semantic system. However, given that similarity cannot be defined on purely logical grounds, it is important to understand how people perceive similarities between different entities. Despite this, the vast majority of studies focus on measuring similarity between very closely related items. When considering concepts that are very weakly related, little is known. In this paper we present four experiments showing that there are reliable and systematic patterns in how people evaluate the similarities between very dissimilar entities. We present a semantic network account of these similarities showing that a spreading activation mechanism defined over a word association network naturally makes correct predictions about weak similarities and the time taken to assess them, whereas, though simpler, models based on direct neighbors between word pairs derived using the same network cannot. Keywords: word associations, similarity, semantic networks, random walks.
    This work was supported by a research grant funded by the Research Foundation - Flanders (FWO), ARC grant DE140101749 awarded to the first author, and by the interdisciplinary research project IDO/07/002 awarded to Dirk Speelman, Dirk Geeraerts, and Gert Storms.
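A minimal sketch of the random-walk idea: the visit probabilities of a walk with restart over a toy association network serve as relatedness scores, assigning weakly related pairs small but nonzero values. The graph and the restart probability are illustrative assumptions:

```python
import numpy as np

# Toy word-association network (row-stochastic transition matrix).
words = ["monk", "church", "asylum", "priest"]
adj = np.array([
    [0, 1, 0, 1],   # monk    -> church, priest
    [1, 0, 1, 1],   # church  -> monk, asylum, priest
    [0, 1, 0, 0],   # asylum  -> church
    [1, 1, 0, 0],   # priest  -> monk, church
], dtype=float)
P = adj / adj.sum(axis=1, keepdims=True)

def walk_relatedness(source: int, restart: float = 0.2, iters: int = 50):
    """Visit probabilities of a random walk that restarts at `source`."""
    r = np.zeros(len(words))
    r[source] = 1.0
    v = r.copy()
    for _ in range(iters):
        v = (1 - restart) * (P.T @ v) + restart * r
    return v

scores = walk_relatedness(words.index("monk"))
print(dict(zip(words, scores.round(3))))  # small but nonzero score for "asylum"
```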
  • Latent Semantic Analysis for Text-Based Research
    Behavior Research Methods, Instruments, & Computers, 1996, 28 (2), 197-202
    Latent semantic analysis for text-based research
    Peter W. Foltz, New Mexico State University, Las Cruces, New Mexico
    Latent semantic analysis (LSA) is a statistical model of word usage that permits comparisons of semantic similarity between pieces of textual information. This paper summarizes three experiments that illustrate how LSA may be used in text-based research. Two experiments describe methods for analyzing a subject's essay to determine from what text the subject learned the information and for grading the quality of information cited in the essay. The third experiment describes using LSA to measure the coherence and comprehensibility of texts.
    One of the primary goals in text-comprehension research is to understand what factors influence a reader's ability to extract and retain information from textual material. The typical approach in text-comprehension research is to have subjects read textual material and then have them produce some form of summary, such as answering questions or writing an essay. This summary permits the experimenter to determine what information the subject retained. A theoretical approach to studying text comprehension has been to develop cognitive models of the reader's representation of the text (e.g., Kintsch, 1988; van Dijk & Kintsch, 1983). In such a model, semantic information from both the text and the reader's summary are represented as sets of semantic components called propositions. Typically, each clause in a text is represented by a single proposition.
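A compact LSA pipeline with scikit-learn (TF-IDF followed by truncated SVD), shown on a toy document set rather than the essay data used in the experiments:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the student wrote an essay about the text",
    "the essay summarised information from the text",
    "rainfall was heavy across the region this week",
]

tfidf = TfidfVectorizer().fit_transform(docs)        # term-document weights
lsa = TruncatedSVD(n_components=2, random_state=0)   # latent semantic space
vecs = lsa.fit_transform(tfidf)

print(cosine_similarity(vecs[0:1], vecs[1:2]))  # related documents: high
print(cosine_similarity(vecs[0:1], vecs[2:3]))  # unrelated document: lower
```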
  • A Similar Legal Case Retrieval System by Multiple Speech Question and Answer (Ying Liu, Xi Yang)
    Liu, Y. & Yang, X. (2018). A similar legal case retrieval system by multiple speech question and answer. In Proceedings of The 18th International Conference on Electronic Business (pp. 72-81). ICEB, Guilin, China, December 2-6.
    A Similar Legal Case Retrieval System by Multiple Speech Question and Answer (Full paper)
    Ying Liu, School of Computer Science and Information Engineering, Guangxi Normal University, China
    Xi Yang*, School of Computer Science and Information Engineering, Guangxi Normal University, China
    ABSTRACT. In real life, on the one hand, people often lack legal knowledge and legal awareness; on the other hand, consulting lawyers is time-consuming and expensive. As a result, a series of legal consultation problems cannot be properly handled. Although some legal robots can answer basic legal-knowledge questions matched by keywords, they cannot perform similar case retrieval and sentencing prediction according to criminal facts described in natural language. To overcome this difficulty, we propose a similar case retrieval system based on natural language understanding. The system uses IFLYTEK online speech synthesis and speech reading and writing technology, and integrates natural language semantic processing technology with a multi-round question-and-answer dialogue mechanism to realise legal knowledge question answering with memory-based context processing ability, and finally retrieves the case most similar to the criminal facts the user described. After trial use, the system shows a good level of human-computer interaction and high accuracy of information retrieval, which can largely satisfy people's consulting needs for legal issues.
  • Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models
    Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models
    Farhan Samir (Department of Computer Science, University of Toronto), Barend Beekhuizen (Department of Language Studies, University of Toronto, Mississauga), Suzanne Stevenson (Department of Computer Science, University of Toronto)
    Abstract. Distributional semantic models (DSMs) are substantially varied in the types of semantic similarity that they output. Despite this high variance, the different types of similarity are often conflated as a monolithic concept in models of behavioural data. We apply the insight that word2vec's representations can be used for capturing both paradigmatic similarity (substitutability) and syntagmatic similarity (co-occurrence) to two sets of experimental findings (semantic priming and the effect of semantic neighbourhood density) that have previously been modeled with monolithic conceptions of DSM-based semantic similarity. Using paradigmatic and syntagmatic similarity based on word2vec, we show that for some tasks and types of items the two types of similarity play complementary explanatory roles, whereas for others, only syntagmatic similarity seems to matter.
    DSMs thought to instantiate one but not the other kind of similarity have been found to explain different priming effects on an aggregate level (as opposed to an item level). The underlying conception of semantic versus associative similarity, however, has been disputed (Ettinger & Linzen, 2016; McRae et al., 2012), as has one of the main ways in which it is operationalized (Günther et al., 2016). In this paper, we instead follow another distinction, based on distributional properties (e.g. Schütze & Pedersen, 1993), namely, that between syntagmatically related words (words that occur in each other's near proximity, such as drink–coffee) and paradigmatically related words (words that can be substituted for one another).
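The distinction can be sketched with word2vec's two embedding matrices: comparing input vectors with input vectors captures paradigmatic similarity, while comparing input vectors with output (context) vectors captures syntagmatic similarity. The random matrices below are stand-ins for trained weights:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["drink", "coffee", "tea", "cup"]
DIM = 16

# Stand-ins for word2vec's two weight matrices (assumed, not trained):
# W_in holds "input" vectors; W_out holds "output" (context) vectors.
W_in = {w: rng.normal(size=DIM) for w in vocab}
W_out = {w: rng.normal(size=DIM) for w in vocab}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Paradigmatic similarity: input vs. input (do the words fill similar slots?)
print("paradigmatic:", cosine(W_in["coffee"], W_in["tea"]))

# Syntagmatic similarity: input vs. output (do the words co-occur?)
print("syntagmatic:", cosine(W_in["drink"], W_out["coffee"]))
```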