D.5.1, WP5, Balkanet, IST-2000-29388
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Problem of Creating a Professional Dictionary of Uncodified Vocabulary
Utopía y Praxis Latinoamericana ISSN: 1315-5216 ISSN: 2477-9555 [email protected] Universidad del Zulia Venezuela Problem of Creating a Professional Dictionary of Uncodified Vocabulary MOROZOVA, O.A; YAKHINA, A.M; PESTOVA, M.S Problem of Creating a Professional Dictionary of Uncodified Vocabulary Utopía y Praxis Latinoamericana, vol. 25, no. Esp.7, 2020 Universidad del Zulia, Venezuela Available in: https://www.redalyc.org/articulo.oa?id=27964362018 DOI: https://doi.org/10.5281/zenodo.4009661 This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. PDF generated from XML JATS4R by Redalyc Project academic non-profit, developed under the open access initiative O.A MOROZOVA, et al. Problem of Creating a Professional Dictionary of Uncodified Vocabulary Artículos Problem of Creating a Professional Dictionary of Uncodified Vocabulary Problema de crear un diccionario profesional de vocabulario no codificado O.A MOROZOVA DOI: https://doi.org/10.5281/zenodo.4009661 Kazan Federal University, Rusia Redalyc: https://www.redalyc.org/articulo.oa? [email protected] id=27964362018 http://orcid.org/0000-0002-4573-5858 A.M YAKHINA Kazan Federal University, Rusia [email protected] http://orcid.org/0000-0002-8914-0995 M.S PESTOVA Ural State Law University, Rusia [email protected] http://orcid.org/0000-0002-7996-9636 Received: 03 August 2020 Accepted: 10 September 2020 Abstract: e article considers the problem of lexicographical fixation of uncodified vocabulary in a professional dictionary. e oil sublanguage, being one of the variants of common language realization used by a limited group of its medium in conditions of official and also non-official communication, provides interaction of people employed in the oil industry. -
Automated Method of Hyper-Hyponymic Verbal Pairs Extraction from Dictionary Definitions Based on Syntax Analysis and Word Embeddings
Rhema. Рема. 2020. № 2 Лингвистика DOI: 10.31862/2500-2953-2020-2-60-75 O. Antropova, E. Ogorodnikova Ural Federal University named after the first President of Russia B.N. Yeltsin, Yekaterinburg, 620002, Russian Federation Automated method of hyper-hyponymic verbal pairs extraction from dictionary definitions based on syntax analysis and word embeddings This study is dedicated to the development of methods of computer-aided hyper-hyponymic verbal pairs extraction. The method’s foundation is that in a dictionary article the defined verb is commonly a hyponym and its definition is formulated through a hypernym in the infinitive. The first method extracted all infinitives having dependents. The second method also demanded an extracted infinitive to be the root of the syntax tree. The first method allowed the extraction of almost twice as many hypernyms; whereas the second method provided higher precision. The results were post-processed with word embeddings, which allowed an improved precision without a crucial drop in the number of extracted hypernyms. Key words: semantics, verbal hyponymy, syntax models, word embeddings, RusVectōrēs Acknowledgments. The reported study was funded by RFBR according to the research project № 18-302-00129 FOR CITATION: Antropova O., Ogorodnikova E. Automated method of hyper- hyponymic verbal pairs extraction from dictionary definitions based on syntax analysis and word embeddings. Rhema. 2020. No. 2. Pp. 60–75. DOI: 10.31862/2500- 2953-2020-2-60-75 © Antropova O., Ogorodnikova E., 2020 Контент доступен по лицензии Creative Commons Attribution 4.0 International License 60 The content is licensed under a Creative Commons Attribution 4.0 International License Rhema. -
Methods in Lexicography and Dictionary Research* Stefan J
Methods in Lexicography and Dictionary Research* Stefan J. Schierholz, Department Germanistik und Komparatistik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany ([email protected]) Abstract: Methods are used in every stage of dictionary-making and in every scientific analysis which is carried out in the field of dictionary research. This article presents some general consid- erations on methods in philosophy of science, gives an overview of many methods used in linguis- tics, in lexicography, dictionary research as well as of the areas these methods are applied in. Keywords: SCIENTIFIC METHODS, LEXICOGRAPHICAL METHODS, THEORY, META- LEXICOGRAPHY, DICTIONARY RESEARCH, PRACTICAL LEXICOGRAPHY, LEXICO- GRAPHICAL PROCESS, SYSTEMATIC DICTIONARY RESEARCH, CRITICAL DICTIONARY RESEARCH, HISTORICAL DICTIONARY RESEARCH, RESEARCH ON DICTIONARY USE Opsomming: Metodes in leksikografie en woordeboeknavorsing. Metodes word gebruik in elke fase van woordeboekmaak en in elke wetenskaplike analise wat in die woor- deboeknavorsingsveld uitgevoer word. In hierdie artikel word algemene oorwegings vir metodes in wetenskapfilosofie voorgelê, 'n oorsig word gegee van baie metodes wat in die taalkunde, leksi- kografie en woordeboeknavorsing gebruik word asook van die areas waarin hierdie metodes toe- gepas word. Sleutelwoorde: WETENSKAPLIKE METODES, LEKSIKOGRAFIESE METODES, TEORIE, METALEKSIKOGRAFIE, WOORDEBOEKNAVORSING, PRAKTIESE LEKSIKOGRAFIE, LEKSI- KOGRAFIESE PROSES, SISTEMATIESE WOORDEBOEKNAVORSING, KRITIESE WOORDE- BOEKNAVORSING, HISTORIESE WOORDEBOEKNAVORSING, NAVORSING OP WOORDE- BOEKGEBRUIK 1. Introduction In dictionary production and in scientific work which is carried out in the field of dictionary research, methods are used to reach certain results. Currently there is no comprehensive and up-to-date documentation of these particular methods in English. The article of Mann and Schierholz published in Lexico- * This article is based on the article from Mann and Schierholz published in Lexicographica 30 (2014). -
Towards a Conceptual Representation of Lexical Meaning in Wordnet
Towards a Conceptual Representation of Lexical Meaning in WordNet Jen-nan Chen Sue J. Ker Department of Information Management, Department of Computer Science, Ming Chuan University Soochow University Taipei, Taiwan Taipei, Taiwan [email protected] [email protected] Abstract Knowledge acquisition is an essential and intractable task for almost every natural language processing study. To date, corpus approach is a primary means for acquiring lexical-level semantic knowledge, but it has the problem of knowledge insufficiency when training on various kinds of corpora. This paper is concerned with the issue of how to acquire and represent conceptual knowledge explicitly from lexical definitions and their semantic network within a machine-readable dictionary. Some information retrieval techniques are applied to link between lexical senses in WordNet and conceptual ones in Roget's categories. Our experimental results report an overall accuracy of 85.25% (87, 78, 89, and 87% for nouns, verbs, adjectives and adverbs, respectively) when evaluating on polysemous words discussed in previous literature. 1 Introduction Knowledge representation has traditionally been thought of as the heart of various applications of natural language processing. Anyone who has built a natural language processing system has had to tackle the problem of representing its knowledge of the world. One of the applications for knowledge representation is word sense disambiguation (WSD) for unrestricted text, which is one of the major problems in analyzing sentences. In the past, the substantial literature on WSD has concentrated on statistical approaches to analyzing and extracting information from corpora (Gale et al. 1992; Yarowsky 1992, 1995; Dagan and Itai 1994; Luk 1995; Ng and Lee 1996). -
CS460/626 : Natural Language Processing/Speech, NLP and the Web
CS460/626 : Natural Language Processing/Speech, NLP and the Web Lecture 24, 25, 26 Wordnet Pushpak Bhattacharyya CSE Dept., IIT Bombay 17th and 19th (morning and night), 2013 NLP Trinity Problem Semantics NLP Trinity Parsing Part of Speech Tagging Morph Analysis Marathi French HMM Hindi English Language CRF MEMM Algorithm NLP Layer Discourse and Corefernce Increased Semantics Extraction Complexity Of Processing Parsing Chunking POS tagging Morphology Background Classification of Words Word Content Function Word Word Verb Noun Adjective Adverb Prepo Conjun Pronoun Interjection sition ction NLP: Thy Name is Disambiguation A word can have multiple meanings and A meaning can have multiple words Word with multiple meanings Where there is a will, Where there is a will, There are hundreds of relatives Where there is a will There is a way There are hundreds of relatives A meaning can have multiple words Proverb “A cheat never prospers” Proverb: “A cheat never prospers but can get rich faster” WSD should be distinguished from structural ambiguity Correct groupings a must … Iran quake kills 87, 400 injured When it rains cats and dogs run for cover Should be distinguished from structural ambiguity Correct groupings a must … Iran quake kills 87, 400 injured When it rains, cats and dogs runs for cover When it rains cats and dogs, run for cover Groups of words (Multiwords) and names can be ambiguous Broken guitar for sale, no strings attached (Pun) Washington voted Washington to power pujaa ne pujaa ke liye phul todaa (Pujaa plucked -
Machine Reading = Text Understanding! David Israel! What Is It to Understand a Text? !
Machine Reading = Text Understanding! David Israel! What is it to Understand a Text? ! • To understand a text — starting with a single sentence — is to:! • determine its truth (perhaps along with the evidence for its truth)! ! • determine its truth-conditions (roughly: what must the world be like for it to be true)! • So one doesn’t have to check how the world actually is — that is whether the sentence is true! • calculate its entailments! • take — or at least determine — appropriate action in light of it (and some given preferences/goals/desires)! • translate it accurately into another language! • and what does accuracy come to?! • ground it in a cognitively plausible conceptual space! • …….! • These are not necessarily competing alternatives! ! For Contrast: What is it to Engage Intelligently in a Dialogue? ! • Respond appropriately to what your dialogue partner has just said (and done)! • Typically, taking into account the overall purpose(s) of the interaction and the current state of that interaction (the dialogue state)! • Compare and contrast dialogue between “equals” and between a human and a computer, trying to assist the human ! • Siri is a very simple example of the latter! A Little Ancient History of Dialogue Systems! STUDENT (Bobrow 1964) Question-Answering: Word problems in simple algebra! “If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?” ! Overview of the method ! . 1 Map referential expressions to variables. ! ! . 2 Use regular expression templates to identify and transform mathematical expressions. -
Inferring Parts of Speech for Lexical Mappings Via the Cyc KB
to appear in Proceeding 20th International Conference on Computational Linguistics (COLING-04), August 23-27, 2004 Geneva Inferring parts of speech for lexical mappings via the Cyc KB Tom O’Hara‡, Stefano Bertolo, Michael Witbrock, Bjørn Aldag, Jon Curtis,withKathy Panton, Dave Schneider,andNancy Salay ‡Computer Science Department Cycorp, Inc. New Mexico State University Austin, TX 78731 Las Cruces, NM 88001 {bertolo,witbrock,aldag}@cyc.com [email protected] {jonc,panton,daves,nancy}@cyc.com Abstract in the phrase “painting for sale.” In cases like this, semantic or pragmatic criteria, as opposed We present an automatic approach to learn- to syntactic criteria only, are necessary for deter- ing criteria for classifying the parts-of-speech mining the proper part of speech. The headword used in lexical mappings. This will fur- part of speech is important for correctly identifying ther automate our knowledge acquisition sys- phrasal variations. For instance, the first term can tem for non-technical users. The criteria for also occur in the same sense in “paint for money.” the speech parts are based on the types of However, this does not hold for the second case, the denoted terms along with morphological since “paint for sale” has an entirely different sense and corpus-based clues. Associations among (i.e., a substance rather than an artifact). these and the parts-of-speech are learned us- When lexical mappings are produced by naive ing the lexical mappings contained in the Cyc users, such as in DARPA’s Rapid Knowledge For- knowledge base as training data. With over mation (RKF) project, it is desirable that tech- 30 speech parts to choose from, the classifier nical details such as the headword part of speech achieves good results (77.8% correct). -
Symbols Used in the Dictionary
Introduction 1. This dictionary will help you to be able to use Einglish - English Dictionary well & use it in your studying. 2. The exercies will provide you with what information your dictionary contains & how to find as well use them quickly & correctly. So you MUST: 1. make sure you understand the instructuion of the exercise. 2. check on yourself by working out examples, using your dictionay. 3. Do it carefully but quickly as you can even if you think you know the answer. 4. always scan the whole entry first to get the general idea, then find the part that you’re interested in. 5. Read slowly & carefully. 6. Help to make the learning process interesting. 7. If you make a mistake, find out why. Remember, the more you use the dictionary, the more uses you will find for it. 1 Chapter one Dictionary Entry • Terms used in the Dictionary. •Symbols used in the Dictionary. •Different typfaces used in the Dictionary. •Abbreviation used in the Dictionary. 2 1- Terms used in the Dictionay: -What is Headword? -What is an Entry? -What is a Definition? -What is a Compound? -What is a Derivative? -What information does an entry contain? 3 1- What is Headword? It’s the word you look up in a dictionary. Also, it’s the name given to each of the words listed alphabetically. 4 2- What is an Entry? It consists of a headword & all the information about the headword. 5 3- What is a Definition? It’s a part of an entry which describes a particular meaning & usage of the headword. -
DICTIONARY News
Number 17 y July 2009 Kernerman kdictionaries.com/kdn DICTIONARY News KD’s BLDS: a brief introduction In 2005 K Dictionaries (KD) entered a project of developing We started by establishing an extensive infrastructure both dictionaries for learners of different languages. KD had already contentwise and technologically. The lexicographic compilation created several non-English titles a few years earlier, but those was divided into 25 projects: 1 for the French core, 8 for the were basic word-to-word dictionaries. The current task marked dictionary cores of the eight languages, another 8 for the a major policy shift for our company since for the first time we translation from French to eight languages, and 8 more for the were becoming heavily involved in learner dictionaries with translation of each language to French. target languages other than English. That was the beginning of An editorial team was set up for developing each of the nine our Bilingual Learners Dictionaries Series (BLDS), which so (French + 8) dictionary cores. The lexicographers worked from far covers 20 different language cores and keeps growing. a distance, usually at home, all over the world. The chief editor The BLDS was launched with a program for eight French for each language was responsible for preparing the editorial bilingual dictionaries together with Assimil, a leading publisher styleguide and the list of headwords. Since no corpora were for foreign language learning in France which made a strategic publicly available for any of these languages, each editor used decision to expand to dictionaries in cooperation with KD. different means to retrieve information in order to compile the The main target users were identified as speakers of French headword list. -
The Use of Ostensive Definition As an Explanatory Technique in the Presentation of Semantic Information in Dictionary-Making By
The Use of Ostensive Definition as an Explanatory Technique in the Presentation of Semantic Information in Dictionary-Making by John Lubinda The University of Zambia School of Humanities and Social Sciences Email: [email protected] Abstract Lexicography – The theory and practice of dictionary making – is a multifarious activity involving a number of ordered steps. It essentially involves the management of a language’s word stock. Hartmann (1983: vii) summarises the main aspects of the lexicographic management of vocabulary into three main activities, viz. recording, description and presentation. The stages involved in dictionary-making may be outlined as follows (paraphrasing Gelb’s(1958) five-step model): (a) Identification, selection and listing of lexical items (b) sequential arrangement of lexical units (c) parsing and excerpting of entries (d) semantic specification of each unit (e) dictionary compilation from processed data and publication of the final product. The present article focuses specifically on one aspect of the intermediate stages of this rather complex process of dictionary-making: the stage that lexicographers generally consider to be the principal stage in the entire process and one that is at the very heart of the raison d’être of every monolingual linguistic dictionary, namely semantic specification. The main thesis being developed here is that semantic specification can be greatly enhanced by the judicious 21 Journal of Lexicography and Terminology, Volume 2, Issue 1 use of pictorial illustrations (as a complement to a lexicographic definition) in monolingual dictionaries, and as an essential lexicographic device. This feature is discussed under the broad concept of ostensive definition as an explanatory technique in dictionary- making. -
Automatic Labeling of Troponymy for Chinese Verbs
Automatic labeling of troponymy for Chinese verbs 羅巧Ê Chiao-Shan Lo*+ s!蓉 Yi-Rung Chen+ [email protected] [email protected] 林芝Q Chih-Yu Lin+ 謝舒ñ Shu-Kai Hsieh*+ [email protected] [email protected] *Lab of Linguistic Ontology, Language Processing and e-Humanities, +Graduate School of English/Linguistics, National Taiwan Normal University Abstract 以同©^Æ與^Y語意關¶Ë而成的^Y知X«,如ñ語^² (Wordnet)、P語^ ² (EuroWordnet)I,已有E分的研v,^²的úË_已øv完善。ú¼ø同的目的,- 研b語言@¦已úË'規!K-文^Y²路 (Chinese Wordnet,CWN),è(Ð供完t的 -文YK^©@分。6而,(目MK-文^Y²路ûq-,1¼目M;要/¡(ºº$ 定來標記同©^ÆK間的語意關Â,因d這些標記KxÏ尚*T成可L應(K一定規!。 因d,,Ç文章y%針對動^K間的上下M^Y語意關 (Troponymy),Ðú一.ê動標 記的¹法。我們希望藉1句法上y定的句型 (lexical syntactic pattern),úË一個能 ê 動½取ú動^上下M的ûq。透N^©意$定原G的U0,P果o:,dûqê動½取ú 的動^上M^,cº率將近~分K七A。,研v盼能將,¹法應(¼c(|U-的-文^ ²ê動語意關Â標記,以Ê知X,體Kê動úË,2而能有H率的úË完善的-文^Y知 XÇ源。 關關關uuu^^^:-文^Y²路、語©關Âê動標記、動^^Y語© Abstract Synset and semantic relation based lexical knowledge base such as wordnet, have been well-studied and constructed in English and other European languages (EuroWordnet). The Chinese wordnet (CWN) has been launched by Academia Sinica basing on the similar paradigm. The synset that each word sense locates in CWN are manually labeled, how- ever, the lexical semantic relations among synsets are not fully constructed yet. In this present paper, we try to propose a lexical pattern-based algorithm which can automatically discover the semantic relations among verbs, especially the troponymy relation. There are many ways that the structure of a language can indicate the meaning of lexical items. For Chinese verbs, we identify two sets of lexical syntactic patterns denoting the concept of hypernymy-troponymy relation. -
Japanese-English Cross-Language Headword Search of Wikipedia
Japanese-English Cross-Language Headword Search of Wikipedia Satoshi Sato and Masaya Okada Graduate School of Engineering Nagoya University Chikusa-ku, Nagoya, Japan [email protected] masaya [email protected] Abstract glish article. To realize CLHS, term translation is required. This paper describes a Japanese-English cross-language headword search system of Term translation is not a mainstream of machine Wikipedia, which enables to find an appro- translation research, because a simple dictionary- priate English article from a given Japanese based method is widely used and enough for query term. The key component of the sys- general terms. For technical terms and proper tem is a term translator, which selects an ap- nouns, automatic extraction of translation pairs propriate English headword among the set from bilingual corpora and the Web has been in- of headwords in English Wikipedia, based on the framework of non-productive ma- tensively studied (e.g., (Daille et al., 1994) and chine translation. An experimental result (Lin et al., 2008)) in order to fill the lack of en- shows that the translation performance of tries in a bilingual dictionary. However, the qual- our system is equal or slightly better than ity of automatic term translation is not well exam- commercial machine translation systems, ined with a few exception (Baldwin and Tanaka, Google translate and Mac-Transer. 2004). Query translation in cross-language infor- mation retrieval is related to term translation, but it concentrates on translation of content words in 1 Introduction a query (Fujii and Ishikawa, 2001). Many people use Wikipedia, an encyclopedia on In the CLHS situation, the ultimate goal of the Web, to find meanings of unknown terms.