Creating Reverse Bilingual Dictionaries

Khang Lam and Jugal Kalita
Department of Computer Science
University of Colorado at Colorado Springs
Colorado Springs, USA

Abstract

Bilingual dictionaries are expensive resources and not many are available when one of the languages is resource-poor. In this paper, we propose algorithms for creation of new reverse bilingual dictionaries from existing bilingual dictionaries in which English is one of the two languages. Our algorithms exploit the similarity between word-concept pairs using the English Wordnet to produce reverse dictionary entries. Since our algorithms rely on available bilingual dictionaries, they are applicable to any bilingual dictionary as long as one of the two languages has a Wordnet type lexical ontology.

1 Introduction

The Ethnologue organization [1] lists 6,809 distinct languages in the world, most of which are resource-poor. Most existing online bilingual dictionaries are between two resource-rich languages (e.g., English, Spanish, French or German) or between a resource-rich language and a resource-poor language. There are languages for which we are lucky to find a single bilingual dictionary online. For example, the University of Chicago hosts bilingual dictionaries for 29 Southeast Asian languages [2], but many of these languages have only one bilingual dictionary online.

Existing algorithms for creating new bilingual dictionaries use intermediate languages or intermediate dictionaries to find chains of words with the same meaning. For example, Gollins and Sanderson (2001) use lexical triangulation to translate in parallel across multiple intermediate languages and fuse the results. They query several existing dictionaries and then merge results to maximize accuracy. They use four pivot languages, German, Spanish, Dutch and Italian, as intermediate languages. Another existing approach for creating bilingual dictionaries is probabilistic inference (Mausam et al., 2010). They organize dictionaries in a graph topology and use random walks and probabilistic graph sampling. Shaw et al. (2011) propose a set of algorithms to create a reverse dictionary in the context of a single language by using converse mapping. In particular, given an English-English dictionary, they attempt to find the original words or terms given a synonymous word or phrase describing the meaning of a word.

The goal of this research is to study the feasibility of creating a reverse dictionary by using only one existing dictionary and a Wordnet lexical ontology. For example, given a Karbi [3]-English dictionary, we will construct an ENG-AJZ dictionary. The remainder of this paper is organized as follows. In Section 2, we discuss the nature of bilingual dictionaries. Section 3 describes the algorithms we propose to create new bilingual dictionaries from existing dictionaries. Results of our experiments are presented in Section 4. Section 5 concludes the paper.

2 Existing Online Bilingual Dictionaries

Powerful online translators developed by Google and Bing provide pairwise translations (including for individual words) for 65 and 40 languages, respectively. Wiktionary, a dictionary created by volunteers, supports over 170 languages. We find a large number of bilingual dictionaries at PanLex [4], including an ENG-Hindi [5] and a Vietnamese [6]-ENG dictionary. The University of Chicago has a number of bilingual dictionaries for South Asian languages. Xobdo [7] has a number of dictionaries, focused on Northeast India.

We classify the many freely available dictionaries into three main kinds.

• Word to word dictionaries: These are dictionaries that translate one word in one language to one word or a phrase in another language. An example is an ENG-HIN dictionary at PanLex.
• Definition dictionaries: One word in one language has one or more meanings in the second language. It also may have pronunciation, parts of speech, synonyms and examples. An example is the VIE-ENG dictionary, also at PanLex.
• One language dictionaries: A dictionary of this kind is found at dictionary.com.

We have examined several hundred online dictionaries and found that they occur in many different formats. Extracting information from these dictionaries is arduous. We have experimented with five existing bilingual dictionaries: VIE-ENG, ENG-HIN, and a dictionary supported by Xobdo with 4 languages: Assamese [8], ENG, AJZ, and Dimasa [9]. We consider the last one to be a collection of 3 bilingual dictionaries: ASM-ENG, AJZ-ENG, and DIS-ENG. We choose these languages since one of our goals is to work with resource-poor languages to enhance the quantity and quality of resources available.

3 Proposed Solution Approach

A dictionary entry, called LexicalEntry, is a 2-tuple <LexicalUnit, Definition>. A LexicalUnit is a word or a phrase being defined, also called the definiendum (Landau, 1984). A list of entries sorted by LexicalUnit is called a lexicon or a dictionary. Given a LexicalUnit, the Definition associated with it usually contains its class and pronunciation, its meaning, and possibly additional information. The meaning associated with it can have several Senses. A Sense is a discrete representation of a single aspect of the meaning of a word. Thus, a dictionary entry is of the form <LexicalUnit, Sense_1, Sense_2, ...>.

In this section, we propose a series of algorithms, each one of which automatically creates a reverse dictionary, or ReverseDictionary, from a dictionary that translates a word in language L1 to a word or phrase in language L2. We require that at least one of these two languages has a Wordnet type lexical ontology (Miller, 1995). Our algorithms create reverse dictionaries at various levels of accuracy and sophistication.

3.1 Direct Reversal (DR)

The existing dictionary has alphabetically sorted LexicalUnits in L1 and each of them has one or more Senses in L2. To create ReverseDictionary, we simply take every pair <LexicalUnit, Sense> in SourceDictionary and swap the positions of the two.

Algorithm 1 DR Algorithm
ReverseDictionary := ∅
for all LexicalEntry_i ∈ SourceDictionary do
    for all Sense_j ∈ LexicalEntry_i do
        Add tuple <Sense_j, LexicalEntry_i.LexicalUnit> to ReverseDictionary
    end for
end for

This is a baseline algorithm against which we can compare improvements as we create new algorithms. If the sense definitions in our input dictionary are mostly single words, and occasionally a simple phrase, even such a simple algorithm gives fairly good results. In case there are long or complex phrases in senses, we skip them. The approach is easy to implement, and produces a high-accuracy ReverseDictionary. However, the number of entries in the created dictionaries is limited because this algorithm just swaps the positions of LexicalUnit and Sense of each entry in the SourceDictionary and does not have any method to find additional words having the same meanings.

3.2 Direct Reversal with Distance (DRwD)

To increase the number of entries in the output dictionary, we compute the distance between words in the Wordnet hierarchy. For example, the words "hasta-lipi" and "likhavat" in HIN have the meanings "handwriting" and "script", respectively. The distance between "handwriting" and "script" in the Wordnet hierarchy is 0.0, so "handwriting" and "script" likely have the same meaning. Thus, each of "hasta-lipi" and "likhavat" should have both meanings "handwriting" and "script". This approach helps us find additional words having the same meanings and possibly increase the number of lexical entries in the reverse dictionaries.

To create a ReverseDictionary, for every LexicalEntry_i in the existing dictionary, we find all LexicalEntry_j, i ≠ j, with distance to LexicalEntry_i equal to or smaller than a threshold α. As a result, we have new pairs of entries <LexicalEntry_i.LexicalUnit, LexicalEntry_j.Sense>; then we swap positions in the two-tuples, and add them into the ReverseDictionary. The value of α affects the number of entries and the quality of created dictionaries. The greater the value of α, the larger the number of lexical entries, but the smaller the accuracy of the ReverseDictionary.

The distance between the two LexicalEntrys is the distance between the two LexicalUnits if the LexicalUnits occur in the Wordnet ontology; otherwise, it is the distance between the two Senses.

Algorithm 2 DRwD Algorithm
ReverseDictionary := ∅
for all LexicalEntry_i ∈ SourceDictionary do
    for all Sense_j ∈ LexicalEntry_i do
        for all LexicalEntry_u ∈ SourceDictionary do
            for all Sense_v ∈ LexicalEntry_u do
                if distance(<LexicalEntry_i.LexicalUnit, Sense_j>, <LexicalEntry_u.LexicalUnit, Sense_v>) ≤ α then
                    Add tuple <Sense_j, LexicalEntry_u.LexicalUnit> to ReverseDictionary
                end if
            end for
        end for
    end for
end for

3.3 Direct Reversal with Similarity (DRwS)

The DRwD approach simply computes the distance between two senses, but does not look at the meanings of the senses in any depth. The DRwS approach represents a concept in terms of its Wordnet synset, synonyms, hyponyms and hypernyms. This approach is like the DRwD approach, but instead of computing the distance between lexical entries in each pair, we calculate the similarity, called simValue. If the simValue of a <LexicalEntry_i, LexicalEntry_j>, i ≠ j, pair is equal to or larger than α, we conclude

[1] http://www.ethnologue.com/
[2] http://dsal.uchicago.edu/dictionaries/list.html
[3] Karbi is an endangered language spoken by 492,000 people (2007 Ethnologue data) in Northeast India, ISO 639-3 code AJZ. The ISO 639-3 code for English is ENG.
[4] http://panlex.org/
[5] ISO 639-3 code HIN
[6] ISO 639-3 code VIE
[7] http://www.xobdo.org/
[8] Assamese is an Indo-European language spoken by about 30 million people, but it is resource-poor, ISO 639-3 code ASM.
[9] Dimasa is another endangered language from Northeast India, spoken by about 115,000 people, ISO 639-3 code DIS.