Building Specialized Bilingual Lexicons Using Large Scale

Total Page:16

File Type:pdf, Size:1020Kb

Building Specialized Bilingual Lexicons Using Large Scale Building Specialized Bilingual Lexicons Using Large-Scale Background Knowledge Dhouha Bouamor1, Adrian Popescu1, Nasredine Semmar1, Pierre Zweigenbaum2 1 CEA, LIST, Vision and Content Engineering Laboratory, 91191 Gif-sur-Yvette CEDEX, France; [email protected] 2LIMSI-CNRS, F-91403 Orsay CEDEX, France; [email protected] Abstract Translate 1 and Systran 2. However, due to the intrin- sic difficulty of the task, a number of related prob- lems remain open, including: the gap between text Bilingual lexicons are central components of machine translation and cross-lingual infor- semantics and statistically derived translations, the mation retrieval systems. Their manual con- scarcity of resources in a large majority of languages struction requires strong expertise in both lan- and the quality of automatically obtained resources guages involved and is a costly process. Sev- and translations. While the first challenge is general eral automatic methods were proposed as an and inherent to any automatic approach, the second alternative but they often rely on resources and the third can be at least partially addressed by available in a limited number of languages an appropriate exploitation of multilingual resources and their performances are still far behind the quality of manual translations. We intro- that are increasingly available on the Web. duce a novel approach to the creation of spe- In this paper we focus on the automatic creation of cific domain bilingual lexicon that relies on domain-specific bilingual lexicons. Such resources Wikipedia. This massively multilingual en- play a vital role in Natural Language Processing cyclopedia makes it possible to create lexi- (NLP) applications that involve different languages. cons for a large number of language pairs. At first, research on lexical extraction has relied on Wikipedia is used to extract domains in each the use of parallel corpora (Och and Ney, 2003). language, to link domains between languages and to create generic translation dictionaries. The scarcity of such corpora, in particular for spe- The approach is tested on four specialized do- cialized domains and for language pairs not involv- mains and is compared to three state of the art ing English, pushed researchers to investigate the approaches using two language pairs: French- use of comparable corpora (Fung, 1998; Chiao and English and Romanian-English. The newly in- Zweigenbaum, 2003). These corpora include texts troduced method compares favorably to exist- which are not exact translation of each other but ing methods in all configurations tested. share common features such as domain, genre, sam- pling period, etc. The basic intuition that underlies bilingual lexi- 1 Introduction con creation is the distributional hypothesis (Harris, 1954) which puts that words with similar meanings The plethora of textual information shared on the occur in similar contexts. In a multilingual formu- Web is strongly multilingual and users’ information lation, this hypothesis states that the translations of needs often go well beyond their knowledge of for- a word are likely to appear in similar lexical envi- eign languages. In such cases, efficient machine ronments across languages (Rapp, 1995). The stan- translation and cross-lingual information retrieval dard approach to bilingual lexicon extraction builds systems are needed. Machine translation already has a decades long history and an array of commercial 1http://translate.google.com/ systems were already deployed, including Google 2http://www.systransoft.com/ 479 Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 479–489, Seattle, Washington, USA, 18-21 October 2013. c 2013 Association for Computational Linguistics on the distributional hypothesis and compares con- Markovitch, 2007) is fitted to our application con- text vectors for each word of the source and tar- text. ESA was already successfully tested in differ- get languages. In this approach, the comparison of ent NLP tasks, such as word relatedness estimation context vectors is conditioned by the existence of a or text classification, and we modify it to mine spe- seed bilingual dictionary. A weakness of the method cialized domains, to characterize these domains and is that poor results are obtained for language pairs to link them across languages. that are not closely related (Ismail and Manandhar, The evaluation of the newly introduced approach 2010). Another important problem occurs whenever is realized on four diversified specialized domains the size of the seed dictionary is small due to ignor- (Breast Cancer, Corporate Finance, Wind Energy ing many context words. Conversely, when dictio- and Mobile Technology) and for two pairs of lan- naries are detailed, ambiguity becomes an important guages: French-English and Romanian-English. drawback. This choice allows us to study the behavior of dif- We introduce a bilingual lexicon extraction ap- ferent approaches for a pair of languages that are proach that exploits Wikipedia in an innovative richly represented and for a pair that includes Roma- manner in order to tackle some of the problems nian, a language that has fewer associated resources mentioned above. Important advantages of using than French and English. Experimental results show Wikipedia are: that the newly introduced approach outperforms the three state of the art methods that were implemented • The resource is available in hundreds of lan- for comparison. guages and it is structured as unambiguous con- cepts (i.e. articles). 2 Related Work • The languages are explicitly linked through In this section, we first give a review of the stan- concept translations proposed by Wikipedia dard approach and then introduce methods that build contributors. upon it. Finally, we discuss works that rely on Ex- plicit Semantic Analysis to solve other NLP tasks. • It covers a large number of domains and is thus potentially useful in order to mine a wide array 2.1 Standard Approach (SA) of specialized lexicons. Most previous approaches that address bilingual lex- icon extraction from comparable corpora are based Mirroring the advantages, there are a number of on the standard approach (Fung, 1998; Chiao and challenges associated with the use of Wikipedia: Zweigenbaum, 2002; Laroche and Langlais, 2010). • The comparability of concept descriptions in This approach is composed of three main steps: different languages is highly variable. 1. Building context vectors: Vectors are first • The translation graph is partial since, when extracted by identifying the words that ap- considering any language pair, only a part of pear around the term to be translated Wcand the concepts are available in both languages in a window of n words. Generally, asso- and explicitly connected. ciation measures such as the mutual infor- mation (Morin and Daille, 2006), the log- • Domains are unequally covered in Wikipedia likelihood (Morin and Prochasson, 2011) or the (Halavais and Lackaff, 2008) and efficient do- Discounted Odds-Ratio (Laroche and Langlais, main targeting is needed. 2010) are employed to shape the context vec- tors. The approach introduced in this paper aims to draw on Wikipedia’s advantages while appropri- 2. Translation of context vectors: To enable the ately addressing associated challenges. Among comparison of source and target vectors, source the techniques devised to mine Wikipedia content, vectors are translated intoto the target language we hypothesize that an adequate adaptation of Ex- by using a seed bilingual dictionary. When- plicit Semantic Analysis (ESA) (Gabrilovich and ever several translations of a context word exist, 480 all translation variants are taken into account. 2.3 Explicit Semantic Analysis Words not included in the seed dictionary are Explicit Semantic Analysis (ESA) (Gabrilovich and simply ignored. Markovitch, 2007) is a method that maps textual documents onto a structured semantic space using 3. Comparison of source and target vectors: classical text indexing schemes such as TF-IDF. Ex- Given W , its automatically translated con- cand amples of semantic spaces used include Wikipedia text vector is compared to the context vectors or the Open Directory Project but, due to superior of all possible translations from the target lan- performances, Wikipedia is most frequently used. guage. Most often, the cosine similarity is In the original evaluation, ESA outperformed state used to rank translation candidates but alterna- of the art methods in a word relatedness estimation tive metrics, including the weighted Jaccard in- task. dex (Prochasson et al., 2009) and the city-block distance (Rapp, 1999), were studied. Subsequently, ESA was successfully exploited in other NLP tasks and in information retrieval. Radin- 2.2 Improvements of the Standard Approach sky and al. (2011) added a temporal dimension to word vectors and showed that this addition improves Most of the improvements of the standard approach the results of word relatedness estimation. (Hassan are based on the observation that the more repre- and Mihalcea, 2011) introduced Salient Semantic sentative the context vectors of a candidate word Analysis (SSA), a development of ESA that relies are, the better the bilingual lexicon extraction is. At on the detection of salient concepts prior to map- first, additional linguistic resources, such as special- ping words to concepts. SSA and the original ESA ized dictionaries (Chiao and Zweigenbaum, 2002) or implementation were tested on several word related- transliterated words (Prochasson
Recommended publications
  • The Effectiveness of Monolingual and Bilingual Dictionaries Features Toward the Students Narrative Writing of Uin Malang: Comparative Study
    THE EFFECTIVENESS OF MONOLINGUAL AND BILINGUAL DICTIONARIES FEATURES TOWARD THE STUDENTS NARRATIVE WRITING OF UIN MALANG: COMPARATIVE STUDY THESIS WINDA FIFI PUSPITA SARI 10320118 ENGLISH LANGUAGE AND LETTERS DEPARTMENT FACULTY OF HUMANITIES MAULANA MALIK IBRAHIM STATE ISLAMIC UNIVERSITY MALANG 2014 THE EFFECTIVENESS OF MONOLINGUAL AND BILINGUAL DICTIONARIES FEATURES TOWARD THE STUDENTS NARRATIVE WRITING OF UIN MALANG: COMPARATIVE STUDY THESIS Presented to Maulana Malik Ibrahim State Islamic University of Malang In a Partial of the Requirements for the Degree of Sarjana Sastra (S.S.) By: WindaFifiPuspita Sari 10320118 Advisor: Drs. H.DjokoSusanto, M.Ed., Ph.D. NIP 19670529 2000031 001 ENGLISH LANGUAGE AND LETTERS DEPARTMENT FACULTY OF HUMANITIES MAULANA MALIK IBRAHIM STATE ISLAMIC UNIVERSITY MALANG 2014 i ii iii iv MOTTO ُ ……يَ ْزفَ ِع هَّللاُ اله ِذ َين َآمنُوا ِم ْن ُك ْم َواله ِذ َين أوتُوا ْال ِع ْل َم َد َر َج ٍات َو هَّللا ُ مِ َاا َت ْع َالُو َ َبيِ ٌز ….. Allah will exalt those of you who believe, and those who are given knowledge, in high degrees; and Allah is aware of what you do.(Surah Al Mujadilah 58:11) v DEDICATION I dedicate my thesis to my beloved family: I would like to say thank you very much to my beloved parents, brother and sister who always support me during my thesis progress. And thank you very much to my family who never stop bring me in their prayer, cares, and loves that make me always spirit. vi ACKNOWLEDGMENT First of all, I would like to thank Allah SWT who always gives me a blessing and strength to finish my thesis.
    [Show full text]
  • Methods in Lexicography and Dictionary Research* Stefan J
    Methods in Lexicography and Dictionary Research* Stefan J. Schierholz, Department Germanistik und Komparatistik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany ([email protected]) Abstract: Methods are used in every stage of dictionary-making and in every scientific analysis which is carried out in the field of dictionary research. This article presents some general consid- erations on methods in philosophy of science, gives an overview of many methods used in linguis- tics, in lexicography, dictionary research as well as of the areas these methods are applied in. Keywords: SCIENTIFIC METHODS, LEXICOGRAPHICAL METHODS, THEORY, META- LEXICOGRAPHY, DICTIONARY RESEARCH, PRACTICAL LEXICOGRAPHY, LEXICO- GRAPHICAL PROCESS, SYSTEMATIC DICTIONARY RESEARCH, CRITICAL DICTIONARY RESEARCH, HISTORICAL DICTIONARY RESEARCH, RESEARCH ON DICTIONARY USE Opsomming: Metodes in leksikografie en woordeboeknavorsing. Metodes word gebruik in elke fase van woordeboekmaak en in elke wetenskaplike analise wat in die woor- deboeknavorsingsveld uitgevoer word. In hierdie artikel word algemene oorwegings vir metodes in wetenskapfilosofie voorgelê, 'n oorsig word gegee van baie metodes wat in die taalkunde, leksi- kografie en woordeboeknavorsing gebruik word asook van die areas waarin hierdie metodes toe- gepas word. Sleutelwoorde: WETENSKAPLIKE METODES, LEKSIKOGRAFIESE METODES, TEORIE, METALEKSIKOGRAFIE, WOORDEBOEKNAVORSING, PRAKTIESE LEKSIKOGRAFIE, LEKSI- KOGRAFIESE PROSES, SISTEMATIESE WOORDEBOEKNAVORSING, KRITIESE WOORDE- BOEKNAVORSING, HISTORIESE WOORDEBOEKNAVORSING, NAVORSING OP WOORDE- BOEKGEBRUIK 1. Introduction In dictionary production and in scientific work which is carried out in the field of dictionary research, methods are used to reach certain results. Currently there is no comprehensive and up-to-date documentation of these particular methods in English. The article of Mann and Schierholz published in Lexico- * This article is based on the article from Mann and Schierholz published in Lexicographica 30 (2014).
    [Show full text]
  • Determining the Characteristic Vocabulary for a Specialized Dictionary Using Word2vec and a Directed Crawler Gregory Grefenstette, Lawrence Muchemi
    Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler Gregory Grefenstette, Lawrence Muchemi To cite this version: Gregory Grefenstette, Lawrence Muchemi. Determining the Characteristic Vocabulary for a Special- ized Dictionary using Word2vec and a Directed Crawler. GLOBALEX 2016: Lexicographic Resources for Human Language Technology, May 2016, Portoroz, Slovenia. hal-01323591 HAL Id: hal-01323591 https://hal.inria.fr/hal-01323591 Submitted on 30 May 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler Gregory Grefenstette Lawrence Muchemi Inria Saclay/TAO, Rue Noetzlin - Bât 660 Inria Saclay/TAO, , Rue Noetzlin - Bât 660 91190 Gif sur Yvette, France 91190 Gif sur Yvette, France [email protected] [email protected] ABSTRACT crawler works. This is followed by a description of one Specialized dictionaries are used to understand concepts in distributional semantics tool, word2vec. Then we show how specific domains, especially where those concepts are not these two tools can be used together to extract the basis of a part of the general vocabulary, or having meanings that specialized vocabulary for a domain.
    [Show full text]
  • Lexicographic Interchange Between a Specialized and a General Language Dictionary1
    Lexicographic interchange between a specialized and a general language dictionary1 Marie-Claude Demers, Ilan Kernerman & Marie-Claude L’Homme Keywords: general language dictionary, terminological database, specialized meaning, term, wordlist. Abstract One of the important issues lexicographers need to address concerns the desired coverage of a dictionary’s wordlist. This paper addresses the issue from a practical angle. We propose a method for comparing the contents of two resources and evaluating to what extent each can contribute to increase and improve the coverage of the other. Concretely, the project consists of comparing the contents of the English version of DiCoInfo (a dictionary of computing and Internet terms) with the appropriate entries of the Random House Webster’s College Dictionary (RHWCD). The entries missing in one resource are considered for inclusion in the other, and vice versa. The approach proves beneficial for both resources. Approximately 100 entries were added to DiCoInfo and over 500 lexical items or meanings are being included in the RHWCD. 1. Introduction An important issue lexicographers need to address concerns the coverage of a dictionary’s wordlist. The question is relevant from the points of view of general as well as specialized lexicography, although it leads to different answers in each area. Specialized dictionaries should include all items that are related to the field they aim to cover. General language dictionaries include many specialized lexical items but attempt to cover fields and terms of more general interest (Alonso Campo 2008; Béjoint 1988; Boulanger 1996; Josselin-Leray 2005; Svensén 2009; Wiegand 1999). This paper addresses the issue of coverage from a practical angle.
    [Show full text]
  • DICTIONARY News Kernerman
    Number 20 y July 2012 Kernerman kdictionaries.com/kdn DICTIONARY News What’s in a name? This is the twentieth issue of this newsletter, and our company name replacement seemed to be Kernerman, which by that time is now in its twentieth year. The first issue of the newsletter was had become well known in the dictionary world, but which published by Kernerman Publishing and Password Publishers I personally avoided — because that would be too close to in July 1994. It was entitled Password News, and had the goal the name of Kernerman Publishing, and because I considered of serving as a “forum for discussion about the semi-bilingual that carrying my surname was somewhat vain. On the other English dictionary.” The title was changed in the next issue hand, I liked the short form K as an abbreviation of Kernerman to Kernerman Dictionary News. Issues No. 2 and 3 appeared (and for 1,000), which was already in our logo, its anonymity, in 1995, and since then the newsletter has been published and the nice counterbalance it produced against the long word regularly in July each year. Over the years the scope of topics Dictionaries. Thus, K Dictionaries emerged — and may a has expanded, and the look and size have changed as well. thousand dictionaries bloom! Kernerman Publishing was established in 1969, as an The irony of fate, however, is that on numerous occasions independent ELT publishing house. Since the 1980s it has been we are referred to as Kernerman Dictionaries, most notably by conducted by my father, Ari (Lionel) Kernerman, who initiated dictionary professionals… Accepting that there is no escape the semi-bilingual dictionary, and became the from your name, and assuming the weight it leading English dictionary publisher in Israel implies, we began to introduce Kernerman to and renowned globally.
    [Show full text]
  • Phraseme Analysis and Concept Analysis: Exploring a Symbiotic Relationship in the Specialized Lexicon
    Ingrid Meyer and Kristen Mackintosh University of Ottawa, Canada Phraseme Analysis and Concept Analysis: Exploring a Symbiotic Relationship in the Specialized Lexicon Abstract This paper analyzes ways in which phraseme analysis can facilitate concept analysis, and vice versa, in terminography work. We compare the phraseology of a number of conceptually related terms with conceptual information in our terminological knowledge base on optical storage technologies. We propose that a better understanding of phraseme-concept relations is important for both knowledge- and corpus-based approaches to terminography, approaches which we believe will merge in the next generation of term banks. 1. Introduction This paper is concerned with phraseology as it pertains to terminography, by which we mean the identification, analysis and description of terms (=specialized lexical items). With the help of specialized texts and interviews with domain (=subject-field) experts, the terminographer carries out three basic tasks, which are not strictly sequential. 1) Identification of object of study. The terminographer circumscribes the domain and finds related domains; he identifies the principal concepts and terms, and eliminates lexical items belonging to general language. 2) Analysis. The terminographer analyzes the specialized corpus from both a conceptual and linguistic point of view. On the conceptual side, the concepts' principal attributes and relations (collectively, characteristics) are determined, a process which goes hand-in-hand with building up the conceptual structure of the domain, and mapping out links between these systems and those of related domains. On the linguistic side, various aspects of the terms are identified, such as collocational behaviour, grammatical features, and usage restrictions. 3) Synthesis.
    [Show full text]
  • Comparative Study on Currently Available Wordnets
    International Journal of Applied Engineering Research ISSN 0973-4562 Volume 13, Number 10 (2018) pp. 8140-8145 © Research India Publications. http://www.ripublication.com Comparative Study on Currently Available WordNets Sreedhi Deleep Kumar Reshma E U PG Scholar PG Scholar Department of Computer Science and Engineering Department of Computer Science and Engineering Vidya Academy of Science and Technology Vidya Academy of Science and Technology Thrissur, India. Thrissur, India. Sunitha C Amal Ganesh Associate Professor Assistant Professor Department of Computer Science and Engineering Department of Computer Science and Engineering Vidya Academy of Science and Technology Vidya Academy of Science and Technology Thrissur, India. Thrissur, India. Abstract SinoTibetan, Tibeto-Burman and Austro-Asiatic. The major ones are the Indo-Aryan, spoken by the northern to western WordNet is an information base which is arranged part of India and Dravidian, spoken by southern part of India. hierarchically in any language. Usually, WordNet is The Eighth Schedule of the Indian Constitution lists 22 implemented using indexed file system. Good WordNets languages, which have been referred to as scheduled available in many languages. However, Malayalam is not languages and given recognition, status and official having an efficient WordNet. WordNet differs from the encouragement. dictionaries in their organization. WordNet does not give pronunciation, derivation morphology, etymology, usage notes, A Dictionary can be called as are source dealing with the or pictorial illustrations. WordNet depicts the semantic relation individual words of a language along with its orthography, between word senses more transparently and elegantly. In this pronunciation, usage, synonyms, derivation, history, work, a general comparison of currently browsable WordNets etymology, etc.
    [Show full text]
  • Detailed Terminographic Analysis of “Glossary of Terminology Used in the Standardization of Geographical Names *
    UNITED NATIONS WORKING PAPER GROUP OF EXPERTS ON NO. 38/10 GEOGRAPHICAL NAMES Twenty-eighth session New York, 28 April-2 May 2014 Item 10 of the provisional agenda Activities relating to the Working Group on Toponymic Terminology Detailed Terminographic Analysis of “Glossary of Terminology Used in the Standardization of Geographical Names * Submitted by Poland * Prepared by Marek Łukasik (Department of English Studies, Pomeranian University in Słupsk, Poland; Corpus Linguistics Laboratory, Faculty of Applied Linguistics, University of Warsaw) and Maciej Zych (Commission on Standardization of Geographical Names Outside the Republic of Poland). DETAILED TERMINOGRAPHIC ANALYSIS OF GLOSSARY OF TERMINOLOGY USED IN THE STANDARDIZATION OF GEOGRAPHICAL NAMES (SUMMARY) The Commission on Standardization of Geographical Names Outside the Republic of Poland is planning to publish a new edition of Glossary of Terminology Used in the Standardization of Geographical Names [Polish: Słownik terminów używanych przy standaryzacji nazw geograficznych] (the latest edition was released in 1998). Before commencing any work, the Commission decided to analyze the compositional correctness of the existing dictionary. To this end, an expert lexicographic opinion had been drafted. On account of the fact that the Polish version is in many places a faithful translation of the UNGEGN dictionary, whose main part, i.e. the English version, has not undergone considerable changes since the release of the Polish version, remarks included in the opinion and this paper may, to a great extent, be applicable to the English original. These comments may prove useful for the UNGEGN Working Group on Toponymic Terminology as guidelines before undertaking a revision of the Glossary of Terms for the Standardization of Geographical Names.
    [Show full text]
  • A Comparison Between Specialized and General Dictionaries with Reference to Legal Ones
    Alexandria University Faculty of Arts- English Department Language and Translation Section ------------- A Comparison between Specialized and General Dictionaries With Reference to Legal ones Nada At. Sharaan Alexandria University 36 Abstract Dictionaries play a major role when it comes to different linguistic processes. Indeed, they are quite essential for beginners as they are considered the main sources for obtaining meaning. However, dictionaries do not only provide information related to meanings of words, but they also provide different kinds of linguistic information such as spelling, pronunciation, etymology and syntax. Thus, not only do learners depend on dictionaries, but specialists also consult them for different purposes. In other words, dictionaries have a variety of different functions, that is why Lexicography, which is compiling dictionaries, is quite an essential field. Dictionary compilers try to make things easier for their users. In fact, the way a dictionary is presented is affected by its purpose or aim. In other words, there are different kinds of dictionaries depending on the type of users. There are general dictionaries and specialized ones. What is the purpose of these types? What are the differences between them? Which is more useful for users? My research is an attempt to provide answers to these questions. Keywords: linguistic, dictionary, general, specialized, user, purpose 37 Introduction Lexicography is concerned with the way different linguistic information should be presented in a dictionary. Different principles and approaches are created in order to make things easier for learners and specialists. However, things are not that easy as there are different types of users. That is why different types of dictionaries are created.
    [Show full text]
  • Enriching the MCR with a Terminological Dictionary
    WNTERM: Enriching the MCR with a terminological dictionary Eli Pociello, Antton Gurrutxaga Elhuyar R&D Zelai Haundi kalea, 3. Osinalde Industrialdea, 20170 Usurbil. Basque Country E-mail: {agurrutxaga, eli}@elhuyar.com Eneko Agirre, Izaskun Aldezabal, German Rigau IXA NLP Research Group 649 pk. 20.080 – Donostia. Basque Country E-mail: {e.agirre, izaskun.aldezabal, german.rigau}@ehu.es Abstract In this paper we describe the methodology and the first steps for the creation of WNTERM (from WordNet and Terminology), a specialized lexicon produced from the merger of the EuroWordNet-based Multilingual Central Repository (MCR) and the Basic Encyclopaedic Dictionary of Science and Technology (BDST). As an example, the ecology domain has been used. The final result is a multilingual (Basque and English) light-weight domain ontology, including taxonomic and other semantic relations among its concepts, which is tightly connected to other wordnets. the MCR framework, which we also used to build the 1. Introduction Basque WordNet). The MCR model provides English- In the last seven years the IXA research group has been Basque equivalence links across the English and Basque working on the Basque WordNet (Agirre et al., 2006). wordnets, and it also offers the possibility of enriching The focus have been on the representation of common both the Basque WordNet and WNTERM vocabulary (we have incorporated around 27,000 words), simultaneously. but now, we have turned our attention to specialized Domain ontology terms are imported into WNTERM language. from the MCR itself (more precisely, from the Basque Due to the unstoppable development of specialized and English wordnets) and the in-house Basic language and terminology, it is becoming increasingly Encyclopaedic Dictionary of Science and Technology 4 difficult to capture and organize terminological (BDST) data-base .
    [Show full text]
  • Composition of the Entry in a Bilingual Dictionary
    Composition of the Entry in a Bilingual Dictionary Edita Hornáčková Klapicová This paper focuses on the essential concepts of the composition of the entry in a bilingual dictionary, on the general standpoints of some lexicographers as well as on the practical usage of certain rules to be observed when elaborating the entry in a bilingual dictionary. The introductory part consists of a research of ideas of some authors regarding the science of bilingual lexicography and the analysis of their theories. The following paragraphs discuss questions of terminological dictionaries and dictionaries containing both the definitions as well as the translation of terms. Moreover, the paper ponders upon the ideas of Ella Sekaninová on the composition of the entry and the set of parameters according to which a lexical unit must appear in a bilingual dictionary. Accordingly, each of the seven parameters mentioned by Sekaninová is applied on some Slovak, English and Spanish voices. 1. The Entry in a Bilingual Dictionary 1.1 A Bilingual Dictionary Dictionaries may be classified by manifold criteria, some of them obvious to everyone, such as size, but there is no standard, agreed-upon taxonomy for dictionaries. Perspective is based on how the compiler views the work and what approach he takes. First, is the work diachronic (covering an extended time) or synchronic (confined to one period)? Second, how is it organized – alphabetically, by sound (as in rhyming dictionaries), by concept (as in some thesauruses), or by some other means? Third, is the level of tone detached, perceptive (didactic), or facetious? Martínez de Sousa defines a bilingual dictionary as a “plurilingual dictionary which registers the equivalences of meanings in two languages” (Martínez de Sousa 1995: 129).
    [Show full text]
  • ONLINE LEGAL DICTIONARIES and Terminology Databases
    ONLINE LEGAL DICTIONARIES and Terminology Databases US < > FR A Selection for Translators and Legal Professionals > Monolingual US/UK Legal Dictionaries > Monolingual French Legal Dictionaries > Bilingual English < > French Dictionaries > Multilingual Terminology Databases MONOLINGUAL US/UK LEGAL DICTIONARIES – and Business and Financial Glossaries Black’s Law Dictionary– An online version of the latest edition can be accessed through the paid Westlaw legal information service. West Academic has published Black’s Law Dictionary Digital, 8th edition with toolbars that integrate with Microsoft Word, Mozilla Firefox and Internet Explorer. http://www.blackslawdictionary.com/ Merriam-Webster’s Dictionary of Law – Published under license with Merriam-Webster on FindLaw. : http://dictionary.lp.findlaw.com. Ballantine’s Lawl Dictionary and Thesaurus – Search online: http://www.mindserpent.com/American_History/reference/1969_Ballentine_3/1969_Ballentine_3_index.ht ml. Duhaime’s Legal Dictionary – “Legal concepts and law terms painstakingly researched and written in plain language.” http://www.duhaime.org/legaldictionary.aspx Jurisdictionary – Provides very detailed definitions of most terms. http://www.jurisdictionary.com/ Criminal Justice Brief – An excellent English-language glossary. https://www.pearson.com/us/higher- education/professional—career/legal-studies—paralegal/legal-studies—paralegal.html Bouviers Law Dictionary – This Dictionary is based on the 1856 Edition of the Bouviers Law Dictionary. Although the legal dictionary is pretty
    [Show full text]