Words, Lexicons and Ontologies

Total Page:16

File Type:pdf, Size:1020Kb

Words, Lexicons and Ontologies HG8003 Technologically Speaking: The intersection of language and technology. Words, Lexicons and Ontologies Francis Bond Division of Linguistics and Multilingual Studies http://www3.ntu.edu.sg/home/fcbond/ [email protected] Lecture 4 Location: LT8 HG8003 (2014) Schedule Lec. Date Topic 1 01-16 Introduction, Organization: Overview of NLP; Main Issues 2 01-23 Representing Language 3 02-06 Representing Meaning 4 02-13 Words, Lexicons and Ontologies 5 02-20 Text Mining and Knowledge Acquisition Quiz 6 02-27 Structured Text and the Semantic Web Recess 7 03-13 Citation, Reputation and PageRank 8 03-20 Introduction to MT, Empirical NLP 9 03-27 Analysis, Tagging, Parsing and Generation Quiz 10 Video Statistical and Example-based MT 11 04-03 Transfer and Word Sense Disambiguation 12 04-10 Review and Conclusions Exam 05-06 17:00 ➣ Video week 10 Words, Lexicons and Ontologies 1 Review of Meaning Words, Lexicons and Ontologies 2 Review of Representing Meaning ➣ Three ways of defining meaning ➢ Attributional (Compositional) ➢ Relational ➢ Distributional ➣ the Syntax-Semantic Interface ➢ Usage ↽⇀ Meaning Words, Lexicons and Ontologies 3 Attributional Meaning ➣ Give a semantic description of word use in isolation of the categorisation of other lexical items ➢ definitions ➢ decompositional semantics (break down into primitives) ➣ Easy for humans to understand ➣ Hard to decide on sense boundaries (granularity: splitters vs. lumpers) ➣ Definitions are circular (the grounding problem) ➣ Hard to be consistent Words, Lexicons and Ontologies 4 Relational Meaning ➣ Capture correspondences between lexical items by way of a finite set of pre-defined semantic relations ➣ Methodologies: ➢ lexical relations ➢ constructional relations ➣ Captures many generalizations usefully ➣ Hard to make complete ➣ Leads to large, complex graphs Words, Lexicons and Ontologies 5 Distributional Meaning ➣ Capture word meanings as collections of contexts in which words appear ➢ n-grams ➢ syntactic relations ➢ sentences ➢ documents ➣ Good for synonymy, not so good for antonymy ➣ Computationally tractable Words, Lexicons and Ontologies 6 Why are dictionaries important? ➣ For humans ➢ find meaning of unknown words ➢ find more information about known words ➢ codify knowledge about word usage (glossaries) ➣ For machines ➢ store information about words ➢ link between text and knowledge Words, Lexicons and Ontologies 7 Words Words, Lexicons and Ontologies 8 Introduction to Words, Lexicons and Ontologies ➣ Design and implementation ➢ Machine Readable Dictionaries ➢ Morphological lexicons ➢ Syntactic lexicons ➢ Semantic lexicons ➢ Ontologies ➣ Construction and Maintenance ➢ Construction from scratch ➢ Boot-strapping from existing resources ➢ Ensuring consistency Words, Lexicons and Ontologies 9 Machine Readable Dictionaries (MRDs) ➣ Human dictionaries made available on machine ➢ Electronic Dictionaries ➢ Dictionary Applications ∗ often with automatic word lookup ➢ On-line dictionaries ∗ Sometimes with glosses Words, Lexicons and Ontologies 10 A typical entry definition (n) a concise explanation of the meaning of a word or phrase or symbol ➣ Headword: definition ➣ Part of Speech: n (noun) ➣ Definition: ➢ genus: explanation ➢ differentia: concise; of the meaning of a word or phrase or symbol ? Implied: countable (a), regular plural Words, Lexicons and Ontologies 11 Parts-of-Speech (POS) ➣ Traditional Grammar has eight: Noun, Verb, Adjective, Adverb (open class) Conjunction, Preposition, Pronoun, Interjection (closed class) ➣ In the US, the Penn Treebank POS set is de-facto standard: ➢ http://www.comp.leeds.ac.uk/ccalas/tagsets/upenn.html ➢ 45 tags (including punctuation) ➣ In Europe, CLAWS tagset is popular ➢ http://ucrel.lancs.ac.uk/claws7tags.html ➢ 137 tags (without punctuation) Words, Lexicons and Ontologies 12 Penn Treebank Examples (14/45) Tag Description Tag Description NN Noun,singularormass VB Verb,baseform NNS Noun,plural VBD Verb,pasttense NNP Propernoun,singular VBG Verb,gerundorpresentparticiple NNPS Propernoun,plural VBN Verb,pastparticiple PRP Personalpronoun VBP Verb,non-3rdpersonsingularpresent IN Preposition VBZ Verb,3rdpersonsingularpresent TO to . SentenceFinalpunct(.,?,!) ➣ The tags include inflectional information ➢ If you know the tag, you can generally find the lemma ➣ Some tags are very specialized: I/PRP wanted/VBD to/TO go/VB ./. Words, Lexicons and Ontologies 13 Good Definitions ➣ a definition should be simpler than the word being explained ➣ the definition should match the part of speech definition (n) a concise explanation of the meaning of a word or phrasel define (v) – give a definition for the meaning of a word; “Define ‘sadness”’ ➣ the definition should not be circular ➣ all words in the definition should be defined (somewhere) ➢ prefer small defining vocabulary ➢ only use metalanguage (NSM: Natural Semantic Metalanguage) Words, Lexicons and Ontologies 14 Circular definitions beauty the state of being beautiful beautiful full of beauty bobcat a lynx lynx a bobcat Words, Lexicons and Ontologies 15 Other useful information http://en.wiktionary.org/wiki/lynx ➣ Pronunciation ➣ Usage Examples ➣ Illustrations ➣ Etymology (history of the word) ➣ Links to other resources Easier to do without the space restrictions of a paper dictionary. Words, Lexicons and Ontologies 16 Dictionaries for NLP Minimize content in order to minimize acquisition problem. Declarativity and human readability with compilation into a machine- friendly representation. Modularity so components are reusable: e.g. distinct monolingual and transfer lexicons in an MT system. Capture generalizations with inheritance (and lexical rules etc). Avoids errors, easier to maintain and expand. Underspecification to reduce disambiguation for a particular application. (Copestake, 1992) 17 Morphological Analysis 森 永 前 日 銀 総 裁 rin ei zen hi gin sou sai mori mae nichi morinaga zennichi gin sousai morinaga zen nichigin sousai ➣ 森永 前 日銀 総裁 Morinaga former Bank of Japan President Words, Lexicons and Ontologies 18 Morphological Lexicons ➣ Stem ➣ Inflectional Class ➣ Part of Speech (often 1-200) ➣ Arguments (?) ➣ For example ➢ Relations: 前 - 総裁 ➢ Arguments: 前(総裁) ➢ Abstraction: 前(title); 総裁 ⊂ title Words, Lexicons and Ontologies 19 Morphological Lexicon ➣ I fabricate for a living ➣ I make things for a living ➣ I fabricated yesterday ➣ I made things for a living ➣ These are differences in the inflectional class Words, Lexicons and Ontologies 20 Inflection ➣ Inflection: In many languages, words appear in different forms to show small differences in meaning: for example number (dog/dogs; child/children) or tense/aspect (make/made/making/made; take/took/taking/taken) ➣ Many words pattern the same way, this is called an inflectional class (or paradigm). For example, one class of plurals in English is words that end in y: fly/flies; sky/skies. ➣ The inflectional class is normally not predictable from the meaning or syntax; and so must be stored for each word ➣ The root form (lemma) and the inflected form have the same meaning modulo the number/tense/...and the same basic part-of-speech ➣ Normally a word only undergoes one inflection Words, Lexicons and Ontologies 21 Derivation ➣ New words can also be created by changing the form. If the part of speech or meaning changes, we call it derivation: (happy/happiness; happy/unhappy; happy/happily) ➣ You can also get zero derivation, where the meaning changes without a change in form (I butter the bread/I like butter. ➣ The root form in derivation is called the stem and the process of stripping off derivational affixes is stemming ➣ You can have multiple derivations: anti-dis-establish-ment-arian-ism: the stem is establish ➣ Derivation is largely but not entirely productive: employer, teacher, *studier, actor, contractor Words, Lexicons and Ontologies 22 Syntactic Lexicon ➣ I fabricated the results ➣ I made up the results ➣ = I made the results up ➣ I walked down the road ➣ 6= I walked the road down ➣ These are differences in the syntactic lexical type Words, Lexicons and Ontologies 23 Differences in Argument Structure ➣ These are also differences in the syntactic lexical type ➢ I gave the book to him ➢ I gave him the book ➢ Cats eat mice ➢ Cats eat ➢ Cats devour mice ➢ *Cats devour mice ➣ The information about what arguments a verb can take is also called subcategorization, valence or argument frame Words, Lexicons and Ontologies 24 Semantic Lexicon ➣ I deposited the money in the bank (financial) ➣ The river overflowed its bank (riverside) ➣ I had lunch by the bank (???) ➣ These are differences in the semantic class Words, Lexicons and Ontologies 25 All the possibilities combine ➣ I saw her duck ➢ see/saw, saw/sawed ➢ duckN , duckV ➢ duckN:cloth, duckN:bird ➣ Still useful to keep separate ➢ inflectional paradigm ➢ arguments (subcategorization) ➢ semantics (selectional preferences) Words, Lexicons and Ontologies 26 Transfer Lexicons ➣ bank ↔ 銀行 ginkou ➣ bank ↔ 土手 dote ➣ 鼻 hana ↔ nose ➢ trunk [of elephant] ➢ muzzle [of horse] ➢ snout [of boar] Words, Lexicons and Ontologies 27 Dictionaries in Processing ➣ Lexical lookup is slow (disk-based) ➢ Compile dictionaries into compressed format ➢ Index ➢ Cache the index cache = load it into memory ➢ Cache already accessed entries ➢ Keep a list of frequent entries and cache them the most frequent words are very frequent ➣ Batch check for consistency off-line Words, Lexicons and Ontologies 28 Dictionaries and Intellectual Property Rights (IPR) ➣ Lexicography has along tradition of extending other’s work ➢ Johnson, Murray, . ➢ Language itself should not be restricted
Recommended publications
  • Lexical Semantics I
    Semantic Theory: Lexical Semantics I Summer 2007 M.Pinkal/ S. Thater • 03-07-07 Lecture: Lexical Semantics I • 05-07-07 Lecture: Lexical Semantics II • 10-07-07 Lecture: Everything else • 12-07-07 Exercise: lexical Semantics • 17-07-07 Question time, Sample exam • 19-07-07 Individual question time • 25-07-07 Final exam, 11:00 (s.t.) Semantic Theory, SS 2007 © M. Pinkal, S. Thater 2 Structure of this course • Sentence semantics • Discourse semantics • Lexical semantics Semantic Theory, SS 2007 © M. Pinkal, S. Thater 3 Dolphins in First-order Logic Dolphins are mammals, not fish. !d (dolphin'(d)"mammal'(d) #¬fish'(d)) Dolphins live-in pods. !d (dolphin'(d)" $x (pod'(p)live-in'(d,p)) Dolphins give birth to one baby at a time. !d (dolphin'(d)" !x !y !t (give-birth-to' (d,x,t)give-birth-to' (d,y,t) " x=y) Semantic Theory, SS 2007 © M. Pinkal, S. Thater 4 Dolphins in First-order Logic Dolphins are mammals, not fish. !d (dolphin'(d)"mammal'(d) #¬fish'(d)) Dolphins live-in pods. !d (dolphin'(d)" $x (pod'(p)live-in'(d,p)) Dolphins give birth to one baby at a time. !d (dolphin'(d)" !x !y !t (give-birth-to' (d,x,t)give-birth-to' (d,y,t) " x=y) Semantic Theory, SS 2007 © M. Pinkal, S. Thater 5 The dolphin text Dolphins are mammals, not fish. They are warm blooded like man, and give birth to one baby called a calf at a time. At birth a bottlenose dolphin calf is about 90-130 cms long and will grow to approx.
    [Show full text]
  • LEXADV - a Multilingual Semantic Lexicon for Adverbs
    LEXADV - a multilingual semantic Lexicon for Adverbs Sanni Nimb Center for Sprogteknologi (CST) – Københavns Universitet Njalsgade 80, Copenhagen, Denmark [email protected] Abstract The LEXADV-project is a Scandinavian research project (2004-2006, financed by Nordplus Sprog) with the aim of extending three Scandinavian semantic lexicons building on the SIMPLE lexicon model (Lenci et al., 2000) with the word class of adverbs. In the lexicons of approx. 400 Danish, Norwegian and Swedish adverbs the different senses are described with a semantic type and a set of semantic features. A classification covering the many meanings that adverbs can have has been established and integrated in the original SIMPLE ontology. Similarly new features have been added to the model in order to describe the adverb senses. The working method of the project builds on the fact that the vocabularies of Danish, Norwegian and Swedish are closely related. An encoding tool has been developed with the special purpose of permitting easy transfer of semantic types and features between entries in the three languages. The Danish adverb senses have been described first, based on the definition in a modern, comprehensive Danish dictionary. Afterwards the lemmas have been translated and the semantic data have been copied into the Swedish as well as into the Norwegian equivalent entry. Finally these copies have been evaluated and when necessary adjusted by native speakers. ’Telic’, ’Agentive’ and ’Constitutive’ has been extended 1. The Scandinavian SIMPLE Lexicons with three more semantic types. The first one, ’Functional’, covers adverb senses of 1.1. Background which the meaning must be described by logic expressions referring to sets, either in the context or in The Danish and the Swedish SIMPLE lexicons are the discourse.
    [Show full text]
  • A New Semantic Lexicon and Similarity Measure in Bangla
    A New Semantic Lexicon and Similarity Measure in Bangla Manjira Sinha, Abhik Jana, Tirthankar Dasgupta, Anupam Basu Indian Institute of Technology Kharagpur {manjira87, abhikjana1, iamtirthankar, anupambas}@gmail.com ABSTRACT The Mental Lexicon (ML) refers to the organization of lexical entries of a language in the human mind.A clear knowledge of the structure of ML will help us to understand how the human brain processes language. The knowledge of semantic association among the words in ML is essential to many applications. Although, there are works on the representation of lexical entries based on their semantic association in the form of a lexicon in English and other languages, such works of Bangla is in a nascent stage. In this paper, we have proposed a distinct lexical organization based on semantic association between Bangla words which can be accessed efficiently by different applications. We have developed a novel approach of measuring the semantic similarity between words and verified it against user study. Further, a GUI has been designed for easy and efficient access. KEYWORDS : Bangla Lexicon, Synset, Semantic Similarity, Hierarchical Graph 1 Introduction The lexicon of a language is a collection of lexical entries consisting of information regarding words and expressions, comprising both form and meaning (Levelt,). Form refers to the orthography, phonology and morphology of the lexical item and Meaning refers to its syntactic and semantic information. The term Mental Lexicon refers to the organization and interaction of lexical entries of a language in the human mind. Depending on the definition of word , an adult knows and uses around 40000 to 150000 words.
    [Show full text]
  • The Diachronic Semantic Lexicon of Dutch As Linked Open Data
    The Diachronic Semantic Lexicon of Dutch as Linked Open Data Katrien Depuydt, Jesse de Does Instituut voor de Nederlandse Taal Rapenburg 61, 2311GJ Leiden, The Netherlands [email protected], [email protected] Abstract This paper describes the Linked Open Data (LOD) model for the diachronic semantic lexicon DiaMaNT, currently under development at the Instituut voor de Nederlandse Taal (INT; Dutch Language Institute). The lexicon is part of a digital historical language infrastructure for Dutch at INT. This infrastructure, for which the core data is formed by the four major historical dictionaries of Dutch covering Dutch language from ca. 500 - ca 1976, currently consists of three modules: a dictionary portal, giving access to the historical dictionaries, a computational lexicon GiGaNT, providing information on words, their inflectional and spelling variation, and DiaMaNT, aimed at providing information on diachronic lexical variation (both semasiological and onomasiological). The DiaMaNT lexicon is built by adding a semantic layer to the word form lexicon GiGaNT, using the semantic information in the historical dictionaries. Ontolex-Lemon is a good point of departure for the LOD model, but we need extensions to be able to deal with the historical dictionary content incorporated in our lexicon. Keywords: Linked Open Data, Ontolex, Diachronic Lexicon, Semantic Lexicon, Historical Lexicography, Language Resources 1. Background digital, based on a closed corpus, in digital format. Hav- Even though Dutch lexicography1 can be dated back to the ing four scholarly dictionaries of Dutch in digital format 13th century with the glossarium Bernense, a Latin-Middle opened up opportunities for further exploitation of the con- Dutch word list, we had to wait until the 19th century for tents of these dictionaries.
    [Show full text]
  • Teaching and Learning Academic Vocabulary
    Musa Nushi Shahid Beheshti University Homa Jenabzadeh Shahid Beheshti University Teaching and Learning Academic Vocabulary Developing learners' lexical competence through vocabulary instruction has always been high on second language teachers' agenda. This paper will be focusing on the importance of academic vocabulary and how to teach such vocabulary to adult EFL/ESL learners at intermediate and higher levels of proficiency in the English language. It will also introduce several techniques, mostly those that engage students’ cognitive abilities, which can in turn facilitate the process of teaching and learning academic vocabulary. Several online tools that can assist academic vocabulary teaching and learning will also be introduced. The paper concludes with the introduction and discussion of four important academic vocabulary word lists which can help second language teachers with the identification and selection of academic vocabulary in their instructional planning. Keywords: Vocabulary, Academic, Second language, Word lists, Instruction. 1. Introduction Vocabulary is essential to conveying meaning in a second language (L2). Schmitt (2010) notes that L2 learners seem cognizant of the importance of vocabulary in language learning, as evidenced by their tendency to carry dictionaries, and not grammar books, around with them. It has even been suggested that the main difference between intermediate and advanced L2 learners lies not in how complex their grammatical knowledge is but in how expanded and developed their mental lexicon is (Lewis, 1997). Similarly, McCarthy (1990) observes that "no matter how well the student learns grammar, no matter how successfully the sounds of L2 are mastered, without words to express a wide range of meanings, communication in an L2 just cannot happen in any meaningful way" (p.
    [Show full text]
  • Natural Language Processing
    Chowdhury, G. (2003) Natural language processing. Annual Review of Information Science and Technology, 37. pp. 51-89. ISSN 0066-4200 http://eprints.cdlr.strath.ac.uk/2611/ This is an author-produced version of a paper published in The Annual Review of Information Science and Technology ISSN 0066-4200 . This version has been peer-reviewed, but does not include the final publisher proof corrections, published layout, or pagination. Strathprints is designed to allow users to access the research output of the University of Strathclyde. Copyright © and Moral Rights for the papers on this site are retained by the individual authors and/or other copyright owners. Users may download and/or print one copy of any article(s) in Strathprints to facilitate their private study or for non-commercial research. You may not engage in further distribution of the material or use it for any profitmaking activities or any commercial gain. You may freely distribute the url (http://eprints.cdlr.strath.ac.uk) of the Strathprints website. Any correspondence concerning this service should be sent to The Strathprints Administrator: [email protected] Natural Language Processing Gobinda G. Chowdhury Dept. of Computer and Information Sciences University of Strathclyde, Glasgow G1 1XH, UK e-mail: [email protected] Introduction Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform the desired tasks.
    [Show full text]
  • Defining Vocabulary Words Grades
    Vocabulary Instruction Booster Session 2: Defining Vocabulary Words Grades 5–8 Vaughn Gross Center for Reading and Language Arts at The University of Texas at Austin © 2014 Texas Education Agency/The University of Texas System Acknowledgments Vocabulary Instruction Booster Session 2: Defining Vocabulary Words was developed with funding from the Texas Education Agency and the support and talent of many individuals whose names do not appear here, but whose hard work and ideas are represented throughout. These individuals include national and state reading experts; researchers; and those who work for the Vaughn Gross Center for Reading and Language Arts at The University of Texas at Austin and the Texas Education Agency. Vaughn Gross Center for Reading and Language Arts College of Education The University of Texas at Austin www.meadowscenter.org/vgc Manuel J. Justiz, Dean Greg Roberts, Director Texas Education Agency Michael L. Williams, Commissioner of Education Monica Martinez, Associate Commissioner, Standards and Programs Development Team Meghan Coleman, Lead Author Phil Capin Karla Estrada David Osman Jennifer B. Schnakenberg Jacob Williams Design and Editing Matthew Slater, Editor Carlos Treviño, Designer Special thanks to Alice Independent School District in Alice, Texas Vaughn Gross Center for Reading and Language Arts at The University of Texas at Austin © 2014 Texas Education Agency/The University of Texas System Vaughn Gross Center for Reading and Language Arts at The University of Texas at Austin © 2014 Texas Education Agency/The University of Texas System Introduction Explicit and robust vocabulary instruction can make a significant difference when we are purposeful in the words we choose to teach our students.
    [Show full text]
  • Aspects of Idiom*
    University of Calgary PRISM: University of Calgary's Digital Repository Calgary (Working) Papers in Linguistics Volume 07, Winter 1982 1982-01 Aspects of idiom* Sayers, Coral University of Calgary Sayers, C. (1982). Aspects of idiom*. Calgary Working Papers in Linguistics, 7(Winter), 13-32. http://hdl.handle.net/1880/51297 journal article Downloaded from PRISM: https://prism.ucalgary.ca Aspects of Idiom* Coral Sayers • 1. Introduction Weinreich (1969) defines an idiom as "a complex expression • whose meaning cannot be derived from the meanings of its elements." This writer has collected examples of idioms from the English, German, Australian English, and Quebec French dialects (Appendix I) in order to examine the properties of the idiom, to explore in the literature the current concepts of idiom, and to relate relevant knowledge gained by these processes to the teaching of English as a Second Language. An idiom may be a word, a "lexical idiom," or, if it has a more complicated constituent structure, a "phrasal idiom." (Katz and Postal, 1964, cited by Fraser, 1970:23) As the lexical idiom, domi­ nated by a single branch syntactic category (e.g. "squatter," a noun) does not present as much difficulty as does the phrasal idiom, this paper will focus upon the latter. Despite the difficulties presented by idioms to the Trans­ formational Grammar model, e.g. the contradiction of the claim that virtually all sentences have a low occurrence probability and frequency (Coulmas, 1979), the consensus of opinion seems to be that idioms can be accommodated in the grammatical description. Phrasal idioms of English should be considered as a "more complicated variety of mono­ morphemic lexical entries." (Fraser, 1970:41) 2.
    [Show full text]
  • SELECTING and CREATING a WORD LIST for ENGLISH LANGUAGE TEACHING by Deny A
    Teaching English with Technology, 17(1), 60-72, http://www.tewtjournal.org 60 SELECTING AND CREATING A WORD LIST FOR ENGLISH LANGUAGE TEACHING by Deny A. Kwary and Jurianto Universitas Airlangga Dharmawangsa Dalam Selatan, Surabaya 60286, Indonesia d.a.kwary @ unair.ac.id / [email protected] Abstract Since the introduction of the General Service List (GSL) in 1953, a number of studies have confirmed the significant role of a word list, particularly GSL, in helping ESL students learn English. Given the recent development in technology, several researchers have created word lists, each of them claims to provide a better coverage of a text and a significant role in helping students learn English. This article aims at analyzing the claims made by the existing word lists and proposing a method for selecting words and a creating a word list. The result of this study shows that there are differences in the coverage of the word lists due to the difference in the corpora and the source text analysed. This article also suggests that we should create our own word list, which is both personalized and comprehensive. This means that the word list is not just a list of words. The word list needs to be accompanied with the senses and the patterns of the words, in order to really help ESL students learn English. Keywords: English; GSL; NGSL; vocabulary; word list 1. Introduction A word list has been noted as an essential resource for language teaching, especially for second language teaching. The main purpose for the creation of a word list is to determine the words that the learners need to know, in order to provide more focused learning materials.
    [Show full text]
  • Linking Wordnet to the SIL Semantic Domains
    Bringing together over- and under-represented languages: Linking Wordnet to the SIL Semantic Domains Muhammad Zulhelmy bin Mohd Rosman Francis Bond and Frantisekˇ Kratochv´ıl Linguistics and Multilingual Studies, Nanyang Technological University, Singapore [email protected], [email protected], [email protected] Abstract English, Japanese and Finnish (Bond and Paik, 2012). We have created an open-source mapping SD is designed for rapid construction and in- between the SIL’s semantic domains (used tuitive organization of lexicons, not primarily for for rapid lexicon building and organiza- the analysis of the resulting data. As a result, tion for under-resourced languages) and many potentially interesting relationships are only WordNet, the standard resource for lexical implicitly realized. By linking SD to WN we semantics in natural language processing. can take advantage of the relationships modeled in We show that the resources complement WN to make more of these explicit. For example, each other, and suggest ways in which the the semantic relations in WN would be a useful mapping can be improved even further. input into SD while the domains hierarchy would The semantic domains give more general enforce the existing WN relations. This will al- domain and associative links, which word- low more quantitative computational modeling of net still has few of, while wordnet gives under-resourced languages. explicit semantic relations between senses, It is currently an exciting time for field lexicog- which the domains lack. raphy with better tools and hardware allowing for rapid digitization of lexical resources. Typically, 1 Introduction linguists tag text soon after they collect it.
    [Show full text]
  • Semanticnet-Perception of Human Pragmatics
    SemanticNet-Perception of Human Pragmatics Amitava Das 1 and Sivaji Bandyopadhyay 2 Department of Computer Science and Engineering Jadavpur University [email protected] 1 [email protected] 2 technical. It is often used in ordinary language Abstract to denote a problem of understanding that comes down to word selection or connotation. SemanticNet is a semantic network of We studied with various Psycholinguistics ex- lexicons to hold human pragmatic periments to understand how human natural knowledge. So far Natural Language intelligence helps to understand general se- Processing (NLP) research patronized mantic from nature. Our study was to under- much of manually augmented lexicon stand the human psychology about semantics resources such as WordNet. But the beyond language. We were haunting for the small set of semantic relations like intellectual structure of the psychological and Hypernym, Holonym, Meronym and neurobiological factors that enable humans to Synonym etc are very narrow to cap- acquire, use, comprehend and produce natural ture the wide variations human cogni- languages. Let’s come with an example of tive knowledge. But no such informa- simple conversation about movie between two tion could be retrieved from available persons. lexicon resources. SemanticNet is the Person A: Have you seen the attempt to capture wide range of con- movie ‘ No Man's Land’? How text dependent semantic inference is it? among various themes which human Person B: Although it is beings perceive in their pragmatic good but you should see knowledge, learned by day to day cog- ‘The Hurt Locker’? nitive interactions with the surrounding May be the conversation looks very casual, physical world.
    [Show full text]
  • An Analysis of the Merriam-Webster's Advanced Learner's English Dictionary^ Second Edition
    An Analysis of the Merriam-Webster's Advanced Learner's English Dictionary^ Second Edition Takahiro Kokawa Y ukiyoshi Asada JUNKO SUGIMOTO TETSUO OSADA Kazuo Iked a 1. Introduction The year 2016 saw the publication of the revised Second Edition of the Merriam-Webster^ Advanced Learner }s English Dictionary (henceforth MWALED2)y after the eight year interval since the first edition was put on the market in 2008. We would like to discuss in this paper how it has, or has not changed through nearly a decade of updating between the two publications. In fact,when we saw the striking indigo color on the new edition ’s front and back covers, which happens to be prevalent among many EFL dictionaries on the market nowadays, yet totally different from the emerald green appearance of the previous edition, MWALED1, we anticipated rather extensive renovations in terms of lexicographic fea­ tures, ways of presentation and the descriptions themselves, as well as the extensiveness of information presented in MWALED2. However, when we compare the facts and figures presented in the blurbs of both editions, we find little difference between the two versions —‘more than 160,000 example sentences ,’ ‘100,000 words and phrases with definitions that are easy to understand/ ‘3,000 core vocabulary words ,’ ’32,000 IPA pronunciations ,,‘More than 22,000 idioms, verbal collocations, and commonly used phrases from Ameri­ can and British English/ (More than 12,000 usage labels, notes, and paragraphs’ and 'Original drawings and full-color art aid understand ­ ing [sic]5 are the common claims by the two editions.
    [Show full text]