Using a Weighted Semantic Network for Lexical Semantic Relatedness

Using a Weighted Semantic Network for Lexical Semantic Relatedness

Reda Siblini and Leila Kosseim
Concordia University, 1400 de Maisonneuve Blvd. West, Montreal, Quebec, Canada, H3G 1M8

Abstract

The measurement of semantic relatedness between two words is an important metric for many natural language processing applications. In this paper, we present a novel approach for measuring semantic relatedness that is based on a weighted semantic network. This approach explores the use of a lexicon, semantic relation types as weights, and word definitions as a basis to calculate semantic relatedness. Our results show that our approach outperforms many lexicon-based methods to semantic relatedness, especially on the TOEFL synonym test, achieving an accuracy of 91.25%.

1 Introduction

Lexical semantic relatedness is a measurement of how two words are related in meaning. Many natural language processing applications such as textual entailment, question answering, or information retrieval require a robust measurement of lexical semantic relatedness. Current approaches to this problem can be categorized into three main categories: those that rely on a lexicon and its structure, those that use the distributional hypothesis on a large corpus, and hybrid approaches.

In this paper, we propose a new lexicon-based approach to measuring semantic relatedness that is based on a weighted semantic network, one that includes all 26 semantic relations found in WordNet in addition to information found in the glosses.

2 Related Work

Approaches to computing semantic relatedness can be classified into three broad categories: lexicon-based, corpus-based, and hybrid approaches.

Lexicon-based methods use the features of a lexicon to measure semantic relatedness. The most frequently used lexicon is Princeton's WordNet (Fellbaum, 1998), which groups words into synonym sets (called synsets) and includes various semantic relations between those synsets, in addition to their definitions (or glosses). WordNet contains 26 semantic relations, including hypernymy, hyponymy, meronymy, and entailment.

To measure relatedness, most lexicon-based approaches rely on the structure of the lexicon, such as the semantic link path, depth (Leacock and Chodorow, 1998; Wu and Palmer, 1994), direction (Hirst and St-Onge, 1998), or type (Tsatsaronis et al., 2010). Most of these approaches exploit the hypernym/hyponym relations, but a few have also included the use of other semantic relations. Leacock and Chodorow (1998), for example, computed semantic relatedness as the length of the shortest path between synsets over the depth of the taxonomy. Wu and Palmer (1994) also used the hyponym tree to calculate relatedness, using the depth of the words in the taxonomy and the depth of the least common superconcept between the two words. Hirst and St-Onge (1998), on the other hand, used the lexical chains between words based on their synsets and the semantic edges that connect them. In addition to using the hypernym relations, they classified the relations into classes: "extra strong" for identical words, "strong" for synonyms, "medium strong" for when there is a path between the two, and "not related" for no paths at all. The semantic measurement is then based on the path length and the path direction changes. Tsatsaronis et al. (2010) used a combination of semantic path length, node depth in the hierarchy, and the types of the semantic edges that compose the path.
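A minimal sketch of the two classic path-based measures above, using NLTK's WordNet interface (NLTK is assumed here purely for illustration; it is not the authors' implementation):

    # Sketch of the Leacock-Chodorow and Wu-Palmer measures via NLTK.
    # Requires the WordNet data: nltk.download('wordnet')
    from nltk.corpus import wordnet as wn

    car = wn.synset('car.n.01')
    bus = wn.synset('bus.n.01')

    # Leacock & Chodorow (1998): shortest path length between the
    # synsets, scaled by the depth of the taxonomy
    print(car.lch_similarity(bus))

    # Wu & Palmer (1994): depths of the two synsets relative to the
    # depth of their least common superconcept
    print(car.wup_similarity(bus))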
Figure 1: Example of the semantic network around the word car.

Corpus-based approaches, on the other hand, rely mainly on distributional properties of words learned from a large corpus to compute semantic relatedness, such as the work of Finkelstein et al. (2001), which used Latent Semantic Analysis, and the work of Strube and Ponzetto (2006) and Gabrilovich and Markovitch (2007), which both used the distributional hypothesis on Wikipedia.

Finally, hybrid approaches use a combination of corpus-based and lexicon-based methods. For example, the approach proposed by Hughes and Ramage (2007) used a random walk method over a lexicon-based semantic graph supplemented with corpus-based probabilities. Another example is the work of Agirre et al. (2009), which used a supervised machine learning approach to combine three methods: a WordNet-based similarity, a bag-of-words-based similarity, and a context-window-based similarity.

The approach presented in this paper belongs to the lexicon-based category. However, as opposed to the typical lexicon-based approaches described above, our approach uses all 26 semantic relations found in WordNet in addition to information found in glosses. The novelty of this approach is that these relations are used to create an explicit semantic network, where the edges of the network representing the semantic relations are weighted according to the type of the semantic relation. The semantic relatedness is computed as the lowest-cost path between a pair of words in the network.

3 Our Approach

Our method to measure semantic relatedness is based on the idea that the types of relations that relate two concepts are a suitable indicator of the semantic relatedness between the two. The types of relations considered include not only the hyponym/hypernym relations but also all other available semantic relations found in WordNet, in addition to word definitions.

3.1 WordNet's Semantic Network

To implement our idea, we created a weighted and directed semantic network based on the content of WordNet. To build the semantic network, we used WordNet 3.1's words and synsets as the nodes of the network. Each word is connected by an edge to its synsets, and each synset is in turn connected to other synsets based on the semantic relations included in WordNet. In addition, each synset is connected to the content words contained in its gloss. For example, Figure 1 shows part of the semantic network created around the word car. In this graph, single-line ovals represent words, while double-line ovals represent synsets.

By mining WordNet entirely, we created a network of 265,269 nodes connected through a total of 1,919,329 edges. The nodes include all words and synsets, and the edges correspond to all 26 semantic relations in WordNet in addition to the relation between a synset and every content word of the synset's definition.
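A rough sketch of this construction, assuming NLTK's WordNet and networkx as stand-ins for the authors' implementation, and showing only one of the 26 synset-to-synset relations:

    # Sketch of building the word/synset network of Section 3.1.
    # NLTK and networkx are assumptions of this illustration; only
    # the hypernym relation is shown, whereas the paper uses all 26.
    import networkx as nx
    from nltk.corpus import wordnet as wn

    G = nx.DiGraph()
    for synset in wn.all_synsets():
        for lemma in synset.lemmas():
            word = lemma.name().lower()
            # word <-> synset edges (the lemma-synset 'sense' relation)
            G.add_edge(word, synset.name(), relation='sense')
            G.add_edge(synset.name(), word, relation='sense')
        for hyper in synset.hypernyms():
            # synset -> related synset edge
            G.add_edge(synset.name(), hyper.name(), relation='hypernym')
        for word in synset.definition().split():
            # synset -> gloss content word edge (a real implementation
            # would filter out stop words)
            G.add_edge(synset.name(), word.lower(), relation='gloss')

    print(G.number_of_nodes(), G.number_of_edges())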
3.2 Semantic Classes of Relations

To compute the semantic relatedness between nodes in the semantic network, it is necessary to take into consideration the semantic relation involved between two nodes. Indeed, WordNet's 26 semantic relations do not contribute equally to the semantic relatedness between words. The hypernym relation (relation #2), for example, is a good indicator of semantic relatedness, while the relation of member of this domain - topic (relation #15) is less significant. This can be seen in Figure 1, for example, where the word car is more closely related to motor vehicle than to renting.

Category    Weight       Semantic Relations in WordNet
Similar     α            antonym, cause, entailment, participle of verb, pertainym, similar to, verb group
Hypernym    2 × α        derivationally related, instance hypernym, hypernym
Sense       4 × α + β    lemma-synset
Gloss       6 × α        lemma-gloss content words
Part        8 × α        holonym (part, member, substance), inverse gloss, meronym (part, member, substance)
Instance    10 × α       instance hyponym, hyponym
Other       12 × α       also see, attribute, domain of synset (topic, region, usage), member of this domain (topic, region, usage)

Table 1: Relation categories and corresponding weights.

In order to determine the contribution of each relation, we compared a manually created set of 210 semantic relations for their degree of relatedness. For example, for the concept car we compared the sense automobile with the hypernym motor vehicle, the gloss word wheel, the part meronym air bag, the member of this topic renting, and another sense of car such as cable car. This comparison led us to classify the relations into seven categories, and to rank these categories from the most related to the least related as follows: Similar (highest contribution), Hypernym, Sense, Gloss, Part, Instance, and Other (lowest contribution). By classifying WordNet's relations into these classes, we are able to weight the contribution of a relation based on the class it belongs to, as opposed to assigning a contributory weight to each individual relation. For example, all relations of type Similar will contribute equally to the semantic relatedness of words, and will contribute more than any relation of the class Hypernym.

The Similar class of relations includes the relations that are the most useful for computing semantic relatedness as per our manual corpus analysis, and that are the rarest available relations in the semantic network; it was hence assigned the lowest weight of all categories of relations: α.

The second category of semantic relations is the Hypernym, which includes WordNet's relations of hypernym, instance hypernym, and derivationally related. Being less important than the Similar relations for computing relatedness, as shown in Table 1, the Hypernym category was assigned a weight of (2 × α).

The Sense category represents the relationship between a word and its synset. Because a word can belong to several synsets, in order to favor the most frequent senses as opposed to the infrequent ones, the weight of this category is modulated by a factor β. Specifically, we use (4 × α + β), where β is computed as the ratio of the frequency of the sense number in WordNet over the maximum number of senses for that word.

The fourth category of semantic relations is the
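A minimal sketch of scoring relatedness as the lowest-cost path over such a network, reusing the graph G from the sketch above; the value of α and the handling of β here are illustrative assumptions:

    # Sketch of relatedness as the lowest-cost path (Section 3.2).
    # ALPHA and the per-edge beta handling are illustrative assumptions.
    import networkx as nx

    ALPHA = 1.0
    CATEGORY_WEIGHTS = {
        'similar':  1 * ALPHA,
        'hypernym': 2 * ALPHA,
        'gloss':    6 * ALPHA,
        'part':     8 * ALPHA,
        'instance': 10 * ALPHA,
        'other':    12 * ALPHA,
    }

    def edge_cost(u, v, data):
        if data['relation'] == 'sense':
            # Sense edges cost 4*alpha + beta, where beta (stored on the
            # edge in a full implementation) favors frequent senses.
            return 4 * ALPHA + data.get('beta', 0.0)
        return CATEGORY_WEIGHTS.get(data['relation'], 12 * ALPHA)

    def relatedness(word1, word2, graph):
        # A lower path cost means the two words are more closely related.
        return nx.dijkstra_path_length(graph, word1, word2, weight=edge_cost)

    print(relatedness('car', 'motor_vehicle', G))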
Recommended publications
  • Automatic Wordnet Mapping Using Word Sense Disambiguation*
    Automatic WordNet Mapping Using Word Sense Disambiguation. Changki Lee, Seo JungYun, Geunbae Lee. Natural Language Processing Lab, Dept. of Computer Science and Engineering, Pohang University of Science & Technology, San 31, Hyoja-Dong, Pohang, 790-784, Korea ({leeck,gblee}@postech.ac.kr); Natural Language Processing Lab, Dept. of Computer Science, Sogang University, Sinsu-dong 1, Mapo-gu, Seoul, Korea (seojy@ccs.sogang.ac.kr)

    Abstract: This paper presents the automatic construction of a Korean WordNet from pre-existing lexical resources. A set of automatic WSD techniques is described for linking Korean words collected from a bilingual MRD to English WordNet synsets. We will show how the individual linking provided by each WSD method is then combined to produce a Korean WordNet for nouns.

    1 Introduction: There is no doubt about the increasing importance of using wide-coverage ontologies for NLP tasks, especially for information … bilingual Korean-English dictionary. The first sense of 'gwan-mog' has 'bush' as a translation in English, and 'bush' has five synsets in WordNet. Therefore the first sense of 'gwan-mog' has five candidate synsets. Somehow we decide on a synset {shrub, bush} among the five candidate synsets and link the sense of 'gwan-mog' to this synset. As seen from this example, when we link the senses of Korean words to WordNet synsets, there are semantic ambiguities. To remove the ambiguities we develop new word sense disambiguation heuristics and an automatic mapping method to construct a Korean WordNet based on the existing English WordNet. This paper is organized as follows. In section 2, we describe multiple heuristics for word sense disambiguation for sense linking.
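    The candidate-generation step described in this example can be sketched as follows, assuming NLTK's English WordNet and a toy stand-in for the bilingual dictionary:

        # Toy sketch of candidate synset generation for bilingual mapping.
        # The bilingual dictionary below is a hypothetical stand-in.
        from nltk.corpus import wordnet as wn

        bilingual_dict = {'gwan-mog': ['bush']}  # Korean word -> translations

        def candidate_synsets(korean_word):
            candidates = []
            for translation in bilingual_dict.get(korean_word, []):
                candidates.extend(wn.synsets(translation, pos=wn.NOUN))
            return candidates

        # WSD heuristics must then pick one candidate, e.g. {shrub, bush}.
        for synset in candidate_synsets('gwan-mog'):
            print(synset.name(), synset.definition())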
  • Probabilistic Topic Modelling with Semantic Graph
    Probabilistic Topic Modelling with Semantic Graph. Long Chen, Joemon M. Jose, Haitao Yu, Fajie Yuan, and Huaizhi Zhang. School of Computing Science, University of Glasgow, Sir Alwyns Building, Glasgow, UK

    Abstract: In this paper we propose a novel framework, topic model with semantic graph (TMSG), which couples a topic model with the rich knowledge from DBpedia. To begin with, we extract the disambiguated entities from the document collection using a document entity linking system, i.e., DBpedia Spotlight, from which two types of entity graphs are created from DBpedia to capture local and global contextual knowledge, respectively. Given the semantic graph representation of the documents, we propagate the inherent topic-document distribution with the disambiguated entities of the semantic graphs. Experiments conducted on two real-world datasets show that TMSG can significantly outperform the state-of-the-art techniques, namely, the author-topic model (ATM) and the topic model with biased propagation (TMBP). Keywords: Topic model, Semantic graph, DBpedia

    1 Introduction: Topic models, such as Probabilistic Latent Semantic Analysis (PLSA) [7] and Latent Dirichlet Allocation (LDA) [2], have been remarkably successful in analyzing textual content. Specifically, each document in a document collection is represented as random mixtures over latent topics, where each topic is characterized by a distribution over words. Such a paradigm is widely applied in various areas of text mining. In view of the fact that the information used by these models is limited to the document collection itself, some recent progress has been made on incorporating external resources, such as time [8], geographic location [12], and authorship [15], into topic models.
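    The entity-linking step can be sketched against DBpedia Spotlight's public REST endpoint (the endpoint URL and response fields below follow the commonly documented interface; its availability is an assumption):

        # Sketch of disambiguated entity extraction with DBpedia Spotlight.
        # The public endpoint and its JSON shape are assumptions here.
        import requests

        def link_entities(text, confidence=0.5):
            resp = requests.get(
                "https://api.dbpedia-spotlight.org/en/annotate",
                params={"text": text, "confidence": confidence},
                headers={"Accept": "application/json"},
            )
            resp.raise_for_status()
            # Each resource carries the DBpedia URI of a linked entity.
            return [r["@URI"] for r in resp.json().get("Resources", [])]

        print(link_entities("Berlin is the capital of Germany."))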
  • Semantic Memory: a Review of Methods, Models, and Current Challenges
    Semantic memory: A review of methods, models, and current challenges. Abhilasha A. Kumar. Psychonomic Bulletin & Review, https://doi.org/10.3758/s13423-020-01792-x

    Abstract: Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional "static" conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges
  • Knowledge Graphs on the Web – An Overview
    Knowledge Graphs on the Web – An Overview. Nicolas Heist, Sven Hertling, Daniel Ringler, and Heiko Paulheim. Data and Web Science Group, University of Mannheim, Germany (arXiv:2003.00719v3 [cs.AI], January 2020)

    Abstract: Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowledge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. Keywords: Knowledge Graph, Linked Data, Semantic Web, Profiling

    1. Introduction: Knowledge Graphs are increasingly used as a means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations. [8,9] Besides such domain-specific knowledge graphs, which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO. [33] Such knowledge graphs may be used, e.g., for automatically enriching data with background knowledge to be used in knowledge-intensive downstream applications.
  • Large Semantic Network Manual Annotation
    Large Semantic Network Manual Annotation. Václav Novák, Institute of Formal and Applied Linguistics, Charles University, Prague

    Abstract: This abstract describes a project aiming at manual annotation of the content of natural language utterances in parallel text corpora. The formalism used in this project is MultiNet – Multilayered Extended Semantic Network. The annotation should be incorporated into the Prague Dependency Treebank as a new annotation layer.

    1 Introduction: A formal specification of semantic content is the aim of numerous semantic approaches such as TIL [6], DRT [9], MultiNet [4], and others. As far as we can tell, there is no large "real life" text corpus manually annotated with such markup. The projects usually work only with automatically generated annotation, if any [1, 6, 3, 2]. We want to create a parallel Czech-English corpus of texts annotated with the corresponding semantic network.

    1.1 Prague Dependency Treebank: From the linguistic viewpoint, there are language resources such as the Prague Dependency Treebank (PDT) which contain a deep manual analysis of texts [8]. PDT contains annotations on three layers, namely morphological, analytical (shallow dependency syntax) and tectogrammatical (deep dependency syntax). The units of each annotation level are linked with corresponding units on the preceding level. The morphological units are linked directly with the original text. The theoretical basis of the treebank lies in the Functional Generative Description of language system [7]. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted for the current computational-linguistics research needs. The corpus itself is embedded into the latest annotation technology.
  • Universal Or Variation? Semantic Networks in English and Chinese
    Universal or variation? Semantic networks in English and Chinese

    Understanding the structures of semantic networks can provide great insights into lexico-semantic knowledge representation. Previous work reveals a small-world structure in English, a structure with the following properties: short average path lengths between words and strong local clustering, with a scale-free distribution in which most nodes have few connections while a small number of nodes have many connections [1]. However, it is not clear whether such semantic network properties hold across human languages. In this study, we investigate the universal structures and cross-linguistic variations by comparing the semantic networks in English and Chinese.

    Network description: To construct the Chinese and the English semantic networks, we used Chinese Open Wordnet [2,3] and English WordNet [4]. The two wordnets have different word forms in Chinese and English but common word meanings. Word meanings are connected not only to word forms, but also to other word meanings if they form relations such as hypernyms and meronyms (Figure 1).

    1. Cross-linguistic comparisons. Analysis: The large-scale structures of the Chinese and the English networks were measured with two key network metrics, small-worldness [5] and scale-free distribution [6]. Results: The two networks have similar size and both exhibit small-worldness (Table 1). However, the small-worldness is much greater in the Chinese network (σ = 213.35) than in the English network (σ = 83.15); this difference is primarily due to the higher average clustering coefficient (ACC) of the Chinese network. The scale-free distributions are similar across the two networks, as indicated by ANCOVA, F(1, 48) = 0.84, p = .37.
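    The two metrics referred to here can be computed with networkx, shown on a synthetic graph rather than on the wordnet-derived networks themselves:

        # Sketch of the network metrics mentioned above, on a toy graph.
        import networkx as nx

        G = nx.connected_watts_strogatz_graph(n=300, k=10, p=0.1)

        acc = nx.average_clustering(G)            # local clustering (ACC)
        apl = nx.average_shortest_path_length(G)  # short average paths
        # Small-world coefficient sigma; tiny niter/nrand keep it fast.
        sigma = nx.sigma(G, niter=5, nrand=2)
        print(acc, apl, sigma)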
  • Structure at Every Scale: a Semantic Network Account of the Similarities Between Very Unrelated Concepts
    De Deyne, S., Navarro, D. J., Perfors, A., and Storms, G. (2016). Structure at every scale: A semantic network account of the similarities between very unrelated concepts. Journal of Experimental Psychology: General, 145, 1228-1254. https://doi.org/10.1037/xge0000192

    Structure at every scale: A semantic network account of the similarities between unrelated concepts. Simon De Deyne, Danielle J. Navarro, Amy Perfors (University of Adelaide); Gert Storms (University of Leuven)

    Abstract: Similarity plays an important role in organizing the semantic system. However, given that similarity cannot be defined on purely logical grounds, it is important to understand how people perceive similarities between different entities. Despite this, the vast majority of studies focus on measuring similarity between very closely related items. When considering concepts that are very weakly related, little is known. In this paper we present four experiments showing that there are reliable and systematic patterns in how people evaluate the similarities between very dissimilar entities. We present a semantic network account of these similarities showing that a spreading activation mechanism defined over a word association network naturally makes correct predictions about weak similarities and the time taken to assess them, whereas, though simpler, models based on direct neighbors between word pairs derived using the same network cannot. Keywords: word associations, similarity, semantic networks, random walks.

    This work was supported by a research grant funded by the Research Foundation - Flanders (FWO), ARC grant DE140101749 awarded to the first author, and by the interdisciplinary research project IDO/07/002 awarded to Dirk Speelman, Dirk Geeraerts, and Gert Storms.
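    A spreading-activation measure over a word association network can be approximated with a personalized random walk; the sketch below uses networkx's personalized PageRank on a toy network as a stand-in for the paper's own model:

        # Toy sketch: graded relatedness via a random walk restarted at
        # the cue word (personalized PageRank stands in for the model).
        import networkx as nx

        G = nx.DiGraph()
        G.add_weighted_edges_from([
            ('teacher', 'school', 0.5), ('school', 'book', 0.3),
            ('book', 'paper', 0.4), ('teacher', 'book', 0.2),
        ])

        def relatedness(cue, target):
            # Activation spread from the cue reaches even weakly related
            # targets that share no direct edge with it.
            scores = nx.pagerank(G, personalization={cue: 1.0}, weight='weight')
            return scores.get(target, 0.0)

        print(relatedness('teacher', 'paper'))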
  • Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models
    Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models. Farhan Samir (Department of Computer Science, University of Toronto), Barend Beekhuizen (Department of Language Studies, University of Toronto, Mississauga), Suzanne Stevenson (Department of Computer Science, University of Toronto)

    Abstract: Distributional semantic models (DSMs) are substantially varied in the types of semantic similarity that they output. Despite this high variance, the different types of similarity are often conflated as a monolithic concept in models of behavioural data. We apply the insight that word2vec's representations can be used for capturing both paradigmatic similarity (substitutability) and syntagmatic similarity (co-occurrence) to two sets of experimental findings (semantic priming and the effect of semantic neighbourhood density) that have previously been modeled with monolithic conceptions of DSM-based semantic similarity. Using paradigmatic and syntagmatic similarity based on word2vec, we show that for some tasks and types of items the two types of similarity play complementary explanatory roles, whereas for others, only syntagmatic similarity seems to matter.

    DSMs thought to instantiate one but not the other kind of similarity have been found to explain different priming effects on an aggregate level (as opposed to an item level). The underlying conception of semantic versus associative similarity, however, has been disputed (Ettinger & Linzen, 2016; McRae et al., 2012), as has one of the main ways in which it is operationalized (Günther et al., 2016). In this paper, we instead follow another distinction, based on distributional properties (e.g. Schütze & Pedersen, 1993), namely, that between syntagmatically related words (words that occur in each other's near proximity, such as drink–coffee), and paradigmatically related words (words that can be sub-
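    One way to operationalize the two similarity types on a trained word2vec model: comparing two input vectors captures paradigmatic similarity, while comparing an input vector with an output (context) vector captures syntagmatic similarity. A sketch assuming gensim with negative sampling, whose syn1neg array holds the output vectors:

        # Sketch of paradigmatic vs. syntagmatic similarity from word2vec.
        # Assumes a gensim model trained with negative sampling, so that
        # model.syn1neg holds the output (context) vectors.
        import numpy as np
        from gensim.models import Word2Vec

        sentences = [['drink', 'hot', 'coffee'], ['drink', 'cold', 'tea']] * 100
        model = Word2Vec(sentences, vector_size=50, negative=5, min_count=1)

        def cosine(a, b):
            return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

        def paradigmatic(w1, w2):
            # input vs. input vector: substitutable words (coffee ~ tea)
            return cosine(model.wv[w1], model.wv[w2])

        def syntagmatic(w1, w2):
            # input vs. output vector: co-occurring words (drink ~ coffee)
            return cosine(model.wv[w1], model.syn1neg[model.wv.key_to_index[w2]])

        print(paradigmatic('coffee', 'tea'), syntagmatic('drink', 'coffee'))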
  • Ontology-Based Approach to Semantically Enhanced Question Answering for Closed Domain: a Review
    Ontology-Based Approach to Semantically Enhanced Question Answering for Closed Domain: A Review. Ammar Arbaaeen (Department of Computer Science, Faculty of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia) and Asadullah Shah (Faculty of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia)

    Abstract: For many users of natural language processing (NLP), it can be challenging to obtain concise, accurate and precise answers to a question. Systems such as question answering (QA) enable users to ask questions and receive feedback in the form of quick answers to questions posed in natural language, rather than in the form of lists of documents delivered by search engines. This task is challenging and involves complex semantic annotation and knowledge representation. This study reviews the literature detailing ontology-based methods that semantically enhance QA for a closed domain, by presenting a literature review of the relevant studies published between 2000 and 2020. The review reports that 83 of the 124 papers considered acknowledge the QA approach, and recommend its development and evaluation using different methods. These methods are evaluated according to accuracy, precision, and recall. An ontological approach to semantically enhancing QA is found to be adopted in a limited way, as many of the studies reviewed concentrated instead on NLP and information retrieval (IR) processing. While the majority of the studies reviewed focus on open domains, this study investigates the closed domain.
  • NL Assistant: A Toolkit for Developing Natural Language Applications
    NL Assistant: A Toolkit for Developing Natural Language Applications. Deborah A. Dahl, Lewis M. Norton, Ahmed Bouzid, and Li Li (Unisys Corporation)

    Introduction: We will be demonstrating a toolkit for developing natural language-based applications, and two applications. The goals of this toolkit are to reduce development time and cost for natural language based applications by reducing the amount of linguistic and programming work needed. Linguistic work has been reduced by integrating large-scale linguistic resources: Comlex (Grishman et al., 1993) and WordNet (Miller, 1990). Programming work is reduced by automating some of the programming tasks. The toolkit is designed for both speech- and text-based interface applications. It runs in a Windows NT environment. Applications can run in either Windows NT or Unix.

    System Components: The NL Assistant toolkit consists of … scale lexical resources have been integrated with the toolkit. These include Comlex and WordNet as well as additional, internally developed resources. The second strategy is to provide easy-to-use editors for entering linguistic information.

    Servers: Lexical information is supplied by four external servers which are accessed by the natural language engine during processing. Syntactic information is supplied by a lexical server based on the 50K-word Comlex dictionary available from the Linguistic Data Consortium. Semantic information comes from two servers: a KB server based on the noun portion of WordNet (70K concepts), and a semantics server containing case frame information for 2500 English verbs. A denotations server, using unique concept names generated for each WordNet synset at ISI, links the words in the lexicon to the concepts in the KB.
  • Similarity Detection Using Latent Semantic Analysis Algorithm
    Similarity Detection Using Latent Semantic Analysis Algorithm. Priyanka R. Patil (PG Student) and Shital A. Patil (Associate Professor), Department of Computer Engineering, North Maharashtra University, Jalgaon, Maharashtra, India. International Journal of Emerging Research in Management & Technology, ISSN: 2278-9359, Volume-6, Issue-8, August 2017

    Abstract: Similarity View is an application for visually comparing and exploring multiple models of text and collections of documents. Friendbook finds the lifestyles of users from user-centric sensor data, measures the similarity of lifestyles between users, and recommends friends to users if their lifestyles have high similarity. Motivated by modeling a user's daily life as life documents, their lifestyles are extracted by using the Latent Dirichlet Allocation algorithm. Manual techniques cannot be used for checking research papers, as the assigned reviewer may have insufficient knowledge of the research disciplines, and differing subjective views can cause possible misinterpretations. There is an urgent need for an effective and feasible approach to check submitted research papers with the support of automated software. A text mining method can solve the problem of automatically checking research papers semantically. The proposed method finds the similarity of text in a collection of documents by using the Latent Dirichlet Allocation (LDA) algorithm and Latent Semantic Analysis (LSA) with a synonym algorithm, which finds synonyms of text index-wise by using the English WordNet dictionary; another algorithm, LSA without synonyms, is used to find the similarity of text based on the index.
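    The LSA step can be sketched with scikit-learn (an assumption; the paper does not name an implementation): TF-IDF vectors are reduced with truncated SVD and compared by cosine similarity.

        # Sketch of LSA-based document similarity: TF-IDF + truncated SVD,
        # compared by cosine similarity (scikit-learn assumed).
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import TruncatedSVD
        from sklearn.metrics.pairwise import cosine_similarity

        docs = [
            "latent semantic analysis finds hidden topics in text",
            "LSA uncovers latent topics across documents",
            "friend recommendation from lifestyle sensor data",
        ]
        tfidf = TfidfVectorizer().fit_transform(docs)
        reduced = TruncatedSVD(n_components=2).fit_transform(tfidf)
        print(cosine_similarity(reduced))  # pairwise document similarities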
  • Introduction to Wordnet: an On-Line Lexical Database
    Introduction to WordNet: An On-line Lexical Database. George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller (Revised August 1993)

    WordNet is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, and adjectives are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets. Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list. Unfortunately, there is no obvious alternative, no other simple way for lexicographers to keep track of what has been done or for readers to find the word they are looking for. But a frequent objection to this solution is that finding things on an alphabetical list can be tedious and time-consuming. Many people who would like to refer to a dictionary decide not to bother with it because finding the information would interrupt their work and break their train of thought. In this age of computers, however, there is an answer to that complaint. One obvious reason to resort to on-line dictionaries (lexical databases that can be read by computers) is that computers can search such alphabetical lists much faster than people can. A dictionary entry can be available as soon as the target word is selected or typed into the keyboard. Moreover, since dictionaries are printed from tapes that are read by computers, it is a relatively simple matter to convert those tapes into the appropriate kind of lexical database.