Some Features of Monolingual LSP Dictionaries G.-R
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Method for Automatically Building and Evaluating Dictionary Resources
A Method for Automatically Building and Evaluating Dictionary Resources Smaranda Muresan∗, Judith Klavansy ∗Computer Science Department, Columbia University 500 West 120th, New York, USA [email protected] yCenter for Research on Information Access, Columbia University 535 West 114th St, New York, USA [email protected] Abstract This paper describes a method toward automatically building dictionaries from text. We present DEFINDER, a rule-based system for extraction of definitions from on-line consumer-oriented medical articles. We provide an extensive evaluation on three dimensions: i) performance of the definition extraction technique in terms of precision and recall, ii) quality of the built dictionary as judged both by specialists and lay users, iii) coverage of existing on-line dictionaries. The corpus we used for the study is publicly available. A major contribution of the paper is the range of quantitative and qualitative evaluation methods. 1. Introduction used in the context of summarization of technical articles to Most machine readable dictionaries or glossaries are ei- provide explanation of the technical terms in lay language. ther manually built by human experts or transformed in Our system, was extensively evaluated. First we eval- electronic forms from hard-copy versions through an ex- uated the performance of the definition extraction method pensive digitization process. Also for some particular do- by comparing the results against a gold standard in terms mains, such as medical domain, the effort is concentrated in of precision and recall. Second, we evaluated the quality of building technical dictionaries for specialists that are of lit- the dictionary as judged both by specialists and lay users. -
Biomedical Entity Representations with Synonym Marginalization
Biomedical Entity Representations with Synonym Marginalization Mujeen Sung Hwisang Jeon Jinhyuk Leey Jaewoo Kangy Korea University fmujeensung,j hs,jinhyuk lee,[email protected] Abstract Unlike named entities from general domain text, typical biomedical entities have several different Biomedical named entities often play impor- surface forms, making the normalization of biomed- tant roles in many biomedical text mining ical entities very challenging. For instance, while tools. However, due to the incompleteness two chemical entities ‘motrin’ and ‘ibuprofen’ be- of provided synonyms and numerous varia- tions in their surface forms, normalization of long to the same concept ID (MeSH:D007052), biomedical entities is very challenging. In this they have completely different surface forms. paper, we focus on learning representations of On the other hand, mentions having similar sur- biomedical entities solely based on the syn- face forms could also have different meanings onyms of entities. To learn from the incom- (e.g. ‘dystrophinopathy’ (MeSH:D009136) and plete synonyms, we use a model-based candi- ‘bestrophinopathy’ (MeSH:C567518)). These ex- date selection and maximize the marginal like- amples show a strong need for building latent rep- lihood of the synonyms present in top candi- dates. Our model-based candidates are itera- resentations of biomedical entities that capture se- tively updated to contain more difficult neg- mantic information of the mentions. ative samples as our model evolves. In this In this paper, we propose a novel framework for way, we avoid the explicit pre-selection of learning biomedical entity representations based on negative samples from more than 400K can- the synonyms of entities. -
The Effectiveness of Monolingual and Bilingual Dictionaries Features Toward the Students Narrative Writing of Uin Malang: Comparative Study
THE EFFECTIVENESS OF MONOLINGUAL AND BILINGUAL DICTIONARIES FEATURES TOWARD THE STUDENTS NARRATIVE WRITING OF UIN MALANG: COMPARATIVE STUDY THESIS WINDA FIFI PUSPITA SARI 10320118 ENGLISH LANGUAGE AND LETTERS DEPARTMENT FACULTY OF HUMANITIES MAULANA MALIK IBRAHIM STATE ISLAMIC UNIVERSITY MALANG 2014 THE EFFECTIVENESS OF MONOLINGUAL AND BILINGUAL DICTIONARIES FEATURES TOWARD THE STUDENTS NARRATIVE WRITING OF UIN MALANG: COMPARATIVE STUDY THESIS Presented to Maulana Malik Ibrahim State Islamic University of Malang In a Partial of the Requirements for the Degree of Sarjana Sastra (S.S.) By: WindaFifiPuspita Sari 10320118 Advisor: Drs. H.DjokoSusanto, M.Ed., Ph.D. NIP 19670529 2000031 001 ENGLISH LANGUAGE AND LETTERS DEPARTMENT FACULTY OF HUMANITIES MAULANA MALIK IBRAHIM STATE ISLAMIC UNIVERSITY MALANG 2014 i ii iii iv MOTTO ُ ……يَ ْزفَ ِع هَّللاُ اله ِذ َين َآمنُوا ِم ْن ُك ْم َواله ِذ َين أوتُوا ْال ِع ْل َم َد َر َج ٍات َو هَّللا ُ مِ َاا َت ْع َالُو َ َبيِ ٌز ….. Allah will exalt those of you who believe, and those who are given knowledge, in high degrees; and Allah is aware of what you do.(Surah Al Mujadilah 58:11) v DEDICATION I dedicate my thesis to my beloved family: I would like to say thank you very much to my beloved parents, brother and sister who always support me during my thesis progress. And thank you very much to my family who never stop bring me in their prayer, cares, and loves that make me always spirit. vi ACKNOWLEDGMENT First of all, I would like to thank Allah SWT who always gives me a blessing and strength to finish my thesis. -
Methods in Lexicography and Dictionary Research* Stefan J
Methods in Lexicography and Dictionary Research* Stefan J. Schierholz, Department Germanistik und Komparatistik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany ([email protected]) Abstract: Methods are used in every stage of dictionary-making and in every scientific analysis which is carried out in the field of dictionary research. This article presents some general consid- erations on methods in philosophy of science, gives an overview of many methods used in linguis- tics, in lexicography, dictionary research as well as of the areas these methods are applied in. Keywords: SCIENTIFIC METHODS, LEXICOGRAPHICAL METHODS, THEORY, META- LEXICOGRAPHY, DICTIONARY RESEARCH, PRACTICAL LEXICOGRAPHY, LEXICO- GRAPHICAL PROCESS, SYSTEMATIC DICTIONARY RESEARCH, CRITICAL DICTIONARY RESEARCH, HISTORICAL DICTIONARY RESEARCH, RESEARCH ON DICTIONARY USE Opsomming: Metodes in leksikografie en woordeboeknavorsing. Metodes word gebruik in elke fase van woordeboekmaak en in elke wetenskaplike analise wat in die woor- deboeknavorsingsveld uitgevoer word. In hierdie artikel word algemene oorwegings vir metodes in wetenskapfilosofie voorgelê, 'n oorsig word gegee van baie metodes wat in die taalkunde, leksi- kografie en woordeboeknavorsing gebruik word asook van die areas waarin hierdie metodes toe- gepas word. Sleutelwoorde: WETENSKAPLIKE METODES, LEKSIKOGRAFIESE METODES, TEORIE, METALEKSIKOGRAFIE, WOORDEBOEKNAVORSING, PRAKTIESE LEKSIKOGRAFIE, LEKSI- KOGRAFIESE PROSES, SISTEMATIESE WOORDEBOEKNAVORSING, KRITIESE WOORDE- BOEKNAVORSING, HISTORIESE WOORDEBOEKNAVORSING, NAVORSING OP WOORDE- BOEKGEBRUIK 1. Introduction In dictionary production and in scientific work which is carried out in the field of dictionary research, methods are used to reach certain results. Currently there is no comprehensive and up-to-date documentation of these particular methods in English. The article of Mann and Schierholz published in Lexico- * This article is based on the article from Mann and Schierholz published in Lexicographica 30 (2014). -
Determining the Characteristic Vocabulary for a Specialized Dictionary Using Word2vec and a Directed Crawler Gregory Grefenstette, Lawrence Muchemi
Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler Gregory Grefenstette, Lawrence Muchemi To cite this version: Gregory Grefenstette, Lawrence Muchemi. Determining the Characteristic Vocabulary for a Special- ized Dictionary using Word2vec and a Directed Crawler. GLOBALEX 2016: Lexicographic Resources for Human Language Technology, May 2016, Portoroz, Slovenia. hal-01323591 HAL Id: hal-01323591 https://hal.inria.fr/hal-01323591 Submitted on 30 May 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler Gregory Grefenstette Lawrence Muchemi Inria Saclay/TAO, Rue Noetzlin - Bât 660 Inria Saclay/TAO, , Rue Noetzlin - Bât 660 91190 Gif sur Yvette, France 91190 Gif sur Yvette, France [email protected] [email protected] ABSTRACT crawler works. This is followed by a description of one Specialized dictionaries are used to understand concepts in distributional semantics tool, word2vec. Then we show how specific domains, especially where those concepts are not these two tools can be used together to extract the basis of a part of the general vocabulary, or having meanings that specialized vocabulary for a domain. -
DICTIONARY News
Number 17 y July 2009 Kernerman kdictionaries.com/kdn DICTIONARY News KD’s BLDS: a brief introduction In 2005 K Dictionaries (KD) entered a project of developing We started by establishing an extensive infrastructure both dictionaries for learners of different languages. KD had already contentwise and technologically. The lexicographic compilation created several non-English titles a few years earlier, but those was divided into 25 projects: 1 for the French core, 8 for the were basic word-to-word dictionaries. The current task marked dictionary cores of the eight languages, another 8 for the a major policy shift for our company since for the first time we translation from French to eight languages, and 8 more for the were becoming heavily involved in learner dictionaries with translation of each language to French. target languages other than English. That was the beginning of An editorial team was set up for developing each of the nine our Bilingual Learners Dictionaries Series (BLDS), which so (French + 8) dictionary cores. The lexicographers worked from far covers 20 different language cores and keeps growing. a distance, usually at home, all over the world. The chief editor The BLDS was launched with a program for eight French for each language was responsible for preparing the editorial bilingual dictionaries together with Assimil, a leading publisher styleguide and the list of headwords. Since no corpora were for foreign language learning in France which made a strategic publicly available for any of these languages, each editor used decision to expand to dictionaries in cooperation with KD. different means to retrieve information in order to compile the The main target users were identified as speakers of French headword list. -
WME 3.0: an Enhanced and Validated Lexicon of Medical Concepts
WME 3.0: An Enhanced and Validated Lexicon of Medical Concepts Anupam Mondal1 Dipankar Das1 Erik Cambria2 Sivaji Bandyopadhyay1 1Department of Computer Science and Engineering 2School of Computer Science and Engineering Jadavpur University, Kolkata, India Nanyang Technological University, Singapore [email protected], [email protected] [email protected], 1sivaji cse [email protected] Abstract However, medical text is in general unstructured since doctors do not like to fill forms and pre- Information extraction in the medical do- fer free-form notes of their observations. Hence, main is laborious and time-consuming due a lexical design is difficult due to lack of any to the insufficient number of domain- prior knowledge of medical terms and contexts. specific lexicons and lack of involve- Therefore, we are motivated to enhance a med- ment of domain experts such as doctors ical lexicon namely WordNet of Medical Events and medical practitioners. Thus, in the (WME 2.0) which helps to identify medical con- present work, we are motivated to de- cepts and their features. In order to enrich this sign a new lexicon, WME 3.0 (WordNet lexicon, we have employed various well-known of Medical Events), which contains over resources like conventional WordNet, SentiWord- 10,000 medical concepts along with their Net (Esuli and Sebastiani, 2006), SenticNet (Cam- part of speech, gloss (descriptive expla- bria et al., 2016), Bing Liu (Liu, 2012), and nations), polarity score, sentiment, sim- Taboada’s Adjective list (Taboada et al., 2011) ilar sentiment words, category, affinity and a preprocessed English medical dictionary1 on score and gravity score features. -
Lexicographic Interchange Between a Specialized and a General Language Dictionary1
Lexicographic interchange between a specialized and a general language dictionary1 Marie-Claude Demers, Ilan Kernerman & Marie-Claude L’Homme Keywords: general language dictionary, terminological database, specialized meaning, term, wordlist. Abstract One of the important issues lexicographers need to address concerns the desired coverage of a dictionary’s wordlist. This paper addresses the issue from a practical angle. We propose a method for comparing the contents of two resources and evaluating to what extent each can contribute to increase and improve the coverage of the other. Concretely, the project consists of comparing the contents of the English version of DiCoInfo (a dictionary of computing and Internet terms) with the appropriate entries of the Random House Webster’s College Dictionary (RHWCD). The entries missing in one resource are considered for inclusion in the other, and vice versa. The approach proves beneficial for both resources. Approximately 100 entries were added to DiCoInfo and over 500 lexical items or meanings are being included in the RHWCD. 1. Introduction An important issue lexicographers need to address concerns the coverage of a dictionary’s wordlist. The question is relevant from the points of view of general as well as specialized lexicography, although it leads to different answers in each area. Specialized dictionaries should include all items that are related to the field they aim to cover. General language dictionaries include many specialized lexical items but attempt to cover fields and terms of more general interest (Alonso Campo 2008; Béjoint 1988; Boulanger 1996; Josselin-Leray 2005; Svensén 2009; Wiegand 1999). This paper addresses the issue of coverage from a practical angle. -
A Study of Chinese Medical Students As Dictionary Users and Potential Users for an Online Medical Termfinder
Lexicography ASIALEX https://doi.org/10.1007/s40607-017-0031-9 ORIGINAL PAPER A study of Chinese medical students as dictionary users and potential users for an online medical termfinder Jun Ding1 Received: 2 May 2017 / Accepted: 12 November 2017 Ó Springer-Verlag GmbH Germany, part of Springer Nature 2017 Abstract This paper examines the dictionary use of Chinese medical students as ESP learners and their trial experience with the online Health Termfinder (HTF), a specialized dictionary tool for medical terms. Empirical data were collected first from a survey among a group of medical students at Fudan University on their previous dictionary-use behaviors and initial response to the introduction to the HTF and its bilingualization, and second from an assignment of using the bilingualized HTF for reading comprehension with the same group of students. The survey findings reveal that the medical students in general are aware of the importance of dictionaries in their ESP learning in spite of the alarming fact that the majority of them have not even used any English medical dictionaries (monolingual or bilin- gual). They are thus quite open to the idea of using an online termbank of spe- cialized medical terms. But the HTF assignment findings also show that proper guidance and training is necessary for the termbank to be made actually useful for the lexical needs of Chinese ESP learners. It is also expected that the students’ feedback on their HTF experience, especially their demand for the termbank to include more cross-disciplinary medical terms, could be taken into consideration by the HTF builders. -
DICTIONARY News Kernerman
Number 20 y July 2012 Kernerman kdictionaries.com/kdn DICTIONARY News What’s in a name? This is the twentieth issue of this newsletter, and our company name replacement seemed to be Kernerman, which by that time is now in its twentieth year. The first issue of the newsletter was had become well known in the dictionary world, but which published by Kernerman Publishing and Password Publishers I personally avoided — because that would be too close to in July 1994. It was entitled Password News, and had the goal the name of Kernerman Publishing, and because I considered of serving as a “forum for discussion about the semi-bilingual that carrying my surname was somewhat vain. On the other English dictionary.” The title was changed in the next issue hand, I liked the short form K as an abbreviation of Kernerman to Kernerman Dictionary News. Issues No. 2 and 3 appeared (and for 1,000), which was already in our logo, its anonymity, in 1995, and since then the newsletter has been published and the nice counterbalance it produced against the long word regularly in July each year. Over the years the scope of topics Dictionaries. Thus, K Dictionaries emerged — and may a has expanded, and the look and size have changed as well. thousand dictionaries bloom! Kernerman Publishing was established in 1969, as an The irony of fate, however, is that on numerous occasions independent ELT publishing house. Since the 1980s it has been we are referred to as Kernerman Dictionaries, most notably by conducted by my father, Ari (Lionel) Kernerman, who initiated dictionary professionals… Accepting that there is no escape the semi-bilingual dictionary, and became the from your name, and assuming the weight it leading English dictionary publisher in Israel implies, we began to introduce Kernerman to and renowned globally. -
Examining the Usefulness of Medical Bilingual Dictionaries for Translation Purposes
International Journal of Linguistics ISSN 1948-5425 2019, Vol. 11, No. 6 Examining the Usefulness of Medical Bilingual Dictionaries for Translation Purposes Fatima Al Qaisiya (Corresponding author) Department of English Language and Literature The University of Jordan, Jordan E-mail: [email protected] Rajai Rasheed Al-Khanji University of Jordan, Jordan Received: October 29, 2019 Accepted: November 11, 2019 Published: December 22, 2019 doi:10.5296/ijl.v11i6.15723 URL: https://doi.org/10.5296/ijl.v11i6.15723 Abstract This paper aims at examining the effectiveness of four medical bilingual English-Arabic dictionaries for translational purposes. This is done by investigating the provided information used in the presentation of a number of medical words in the examined dictionaries. The results reveal an inconsistency in the presentation of the selected words in the dictionaries; which might be correlated to the lack of provision policies given by the compilers of the dictionaries. Moreover, an inadequacy in the provision of semantic, pragmatic, and encyclopedic information was noticed which would be inadequate for translational purposes. However, it was found that the Unified Medical dictionary covered more types of information like the provision of encyclopedic illustrations and pictorial illustration. Keywords: Medical words, Bilingual medical dictionaries, Dictionary and translation 1. Introduction The disciplines of translation and lexicography are linked to each other in 'give and take' operations (Hartmann, 2004). In this regard, translation is considered as the supplier of translation equivalents to bilingual dictionaries and a customer of information provided by lexicographers to translators, (Hartmann, 2004). In his article 'Lexicography and Translation', www.macrothink.org/ijl 229 International Journal of Linguistics ISSN 1948-5425 2019, Vol. -
From Wordnet to Medical Wordnet
Journal of Biomedical Informatics 39 (2006) 321–332 www.elsevier.com/locate/yjbin Towards new information resources for public health—From WORDNET to MEDICALWORDNET Christiane Fellbaum a,b,*, Udo Hahn c, Barry Smith d,e a Department of Psychology, Princeton University Green Hall, Washington Rd., Princeton, NJ 08544, United States b Berlin-Bradenburg Academy of Science, Berlin, Germany c Jena University Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universita¨t Jena, Jena, Germany d IFOMIS, Universita¨t des Saarlandes, Saarbru¨cken, Germany e Department of Philosophy, State University of New York at Buffalo, USA Received 21 July 2005 Available online 24 October 2005 Abstract In the last two decades, WORDNET has evolved as the most comprehensive computational lexicon of general English. In this article, we discuss its potential for supporting the creation of an entirely new kind of information resource for public health, viz. MEDICALWORDNET. This resource is not to be conceived merely as a lexical extension of the original WORDNET to medical terminology; indeed, there is already a considerable degree of overlap between WORDNET and the vocabulary of medicine. Instead, we propose a new type of repository, consisting of three large collections of (1) medically relevant word forms, structured along the lines of the existing Princeton WORDNET; (2) medically validated propositions, referred to here as medical facts, which will consti- tute what we shall call MEDICALFACTNET; and (3) propositions reflecting laypersonsÕ medical beliefs, which will constitute what we shall call the MEDICALBELIEFNET. We introduce a methodology for setting up the MEDICALWORDNET. We then turn to the discus- sion of research challenges that have to be met to build this new type of information resource.