Online English-English Learner Dictionaries Boost Word Learning

Ulugbek Nurmukhamedov

2012 Number 4 | English Teaching Forum

“Vocabulary is one of those things where the rich get richer.”
Dr. Eli Hinkel (Editorial Board 2009)

Learners of English might be familiar with several online monolingual dictionaries that are not necessarily the best choices for the English as Second/Foreign Language (ESL/EFL) context. In my experience while teaching ESL at an intensive English program in the United States, learners of English often consult sources like Dictionary.com (www.dictionary.com) and Merriam-Webster (www.merriam-webster.com). Although these monolingual online dictionaries contain definitions, pronunciation guides, and other elements normally found in general-use dictionaries, they are compiled with native or near-native speakers of English in mind, not for learners of English. English learners tend to gravitate to these dictionaries because they are seemingly unaware of other dictionaries that are specifically designed for them.

Research on dictionary use in second language teaching indicates learners of English gain much when they become familiar with dictionary features (Nesi and Haill 2002; Rizo-Rodríguez 2004). For this reason, ESL/EFL teachers should learn about and introduce the several excellent online dictionaries with features especially designed for learners of English. These online dictionaries promote strategic and effective word learning, but it is critical for learners to be trained by teachers who are familiar with proper dictionary use so that the online dictionaries are used to maximum benefit in the classroom. The best online learner dictionaries include (1) a corpus-based compilation of words; (2) word frequency data; (3) collocation guides; (4) authentic examples of how words are used; and (5) topical vocabulary from different disciplines. This article will help teachers become aware of these important features and also provide suggestions on how to integrate dictionary-related exercises into their ESL/EFL lesson plans.

Learner dictionaries

Learner dictionaries designed for ESL/EFL students offer information on a word’s different meanings, the ways that certain words are used together, example sentences, explanatory notes, and many other features that are included with the learner in mind. Due to advances in technology, learner dictionaries come in many forms: hard-copy books, CD-ROM, online, and even in applications for smartphones, tablet computers, and other mobile devices. Most of the recent English-English learner dictionaries, particularly those published after 2005, come with a hard copy and a CD-ROM. The CD-ROM offers useful features not found in the hard copy, which include:

• pronunciation guides (usually for both British and American language varieties)
• electronic writing tools (designed to improve learners’ writing)
• quick find (clicking on a word in a text gives its definition)
• picture dictionaries (mostly in color)
• extra grammar and vocabulary exercises (to enhance learning)
• vocabulary notes (an electronic version of a vocabulary diary)
• wildcard functions (looking up words without knowing their exact spelling)

It is important to point out that all of these features are designed to help students learn new words strategically and enjoyably (see Rizo-Rodríguez [2008] for more information about CD-ROM dictionaries). When these features become the focus, they help learners with efficient dictionary use and effective vocabulary learning (McCarthy and O’Dell 2005).

The six online English-English learner dictionaries in Table 1, freely available on the Internet, share many of the features of CD-ROM dictionaries. They provide learners with information about word meanings and word frequency, easy-to-read definitions, word combination choices, and explanatory notes. However, even though these online dictionaries share similar features, some of the features differ in terms of information representation, usability, interface, and user friendliness. For this reason, a gentle learning curve faces students and teachers alike.

1. Cambridge Learner’s Dictionary (CLD)
   http://dictionary.cambridge.org/dictionary/learner-english
2. Cambridge Advanced Learner’s Dictionary (CALD)
   http://dictionary.cambridge.org/dictionary/british
3. Longman Dictionary of Contemporary English (LDCE)
   www.ldoceonline.com
4. Macmillan Dictionary (MD)
   www.macmillandictionary.com
5. Merriam-Webster Learner’s Dictionary (MWLD)
   www.learnersdictionary.com
6. Oxford Advanced Learner’s Dictionary (OALD)
   www.oxfordadvancedlearnersdictionary.com

Table 1. Online English-English learner dictionaries

When teachers become familiar with the most useful elements of learner dictionaries, they can easily incorporate dictionary-based activities into their lesson plans. The following sections focus on five important features and provide suggestions on how teachers can enable students to recognize and use the features to help them learn English.

Dictionary feature 1: Corpus-based dictionaries

The dictionaries listed in Table 1 are corpus-based, which means the content is based on real-world spoken and written discourse when words, definitions, and examples are selected and organized. Church (2008) claims that corpus-based dictionaries “describe how language is actually used, as opposed to how it ought to be used” (334). Thus, definitions for a word are sequenced based on their frequency of use, and example sentences in corpus-based dictionaries are authentic. This sequencing makes it easy for learners of English to focus first on frequent words and meanings that will give them the most mileage. When both teachers and learners of English appreciate the influence of corpus-based research on learner dictionaries, they begin to see the distinct advantage of corpus-based learner dictionaries as opposed to non-corpus dictionaries.

Teacher application

It is important to raise both the teacher’s and the learner’s awareness about differences in the way that general-use dictionaries and corpus-based dictionaries are compiled. Language teachers can provide learners with eye-opening activities to explain the distinctions between the two types of dictionaries (see Reppen 2010 for detailed activities). For example, teachers can pick two or three words that their students will profit from learning (e.g., appointment, breakfast, and discriminate). The words could come from a reading passage or a video that the students have just read or watched, or will be assigned to read or watch. The teacher writes these words on a blackboard or has the students write the words on a piece of paper, and asks them to check the definition from one general-use dictionary (e.g., dictionary.com or www.merriam-webster.com) and one online learner dictionary from Table 1 as a homework assignment. Teachers can do this activity in a classroom if there is access to the Internet. After the students finish the activity at home or in class, the teacher asks whether the information presented in the general-use and the learner dictionary is different or the same. Students state which type of dictionary contains more useful information about the words. When contrasting these two types of dictionaries, students are likely to observe that learner dictionaries recommend more level-appropriate useful tips about words than the general-use dictionaries. After this activity, the students might look for additional helpful elements in the learner dictionaries.

Dictionary feature 2: Word frequency

Knowing whether a word is used a lot (high frequency) or a little (low frequency) in overall written and spoken discourse is very helpful information to English learners; that makes corpus-based dictionaries a must for the ESL/EFL classroom. Because high-frequency words are important for language learners, several vocabulary researchers encourage language teachers to spend considerable time on them (Nation 2001; Schmitt 2000). According to Nation (2001, 17), the first 1,000 of 2,000 high-frequency words cover 84.3 percent of conversations, [...] words in academic texts. It is important to note that the second 1,000 words on the list drop considerably in frequency, covering 6 percent of conversations, 4.7 percent of newspaper articles, and 4.6 percent of academic texts. Word-frequency issues merit a separate article, and interested readers can learn more about frequency lists at www.lextutor.ca/freq/lists_download (maintained by Tom Cobb).

Some online corpus-based dictionaries use special designations to indicate particular words that will be encountered a large number of times. For example, the OALD places a key-shaped icon next to high-frequency words. In addition to the first and second 1,000-word frequency lists, the OALD indicates the words that are commonly used in academic texts (Coxhead 2006). The MD uses a red font to identify more than 7,000 words that are most often used in oral and written communication and also places one star (lower frequency) to three stars (higher frequency) after each word to further distinguish levels of frequency use.

Reinforcing the importance of the high-frequency word lists makes English learning more efficient and strategic because learners know which words to learn first. Teachers can use the following activity to help the learners notice and acquire high- and low-frequency words.

Teacher application

Conscientious students often want to learn each and every word they encounter when they read a text, listen to a lecture, or talk to other people. These types of learners, often lower-proficiency or beginning-level students, attempt to learn all the words or expressions they encounter by making a huge list in their vocabulary notebooks. Unavoidably, these lists include both high- and low-frequency words and expressions. While we do not want to dissuade learners from being word collectors, we can help them become more strategic word learners. To do so, teachers can raise their students’ awareness about word frequency and its importance in word learning. This also helps learners select dictionaries with the best word-frequency features.
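
For teachers who are comfortable with a little scripting, the coverage figures and the notebook problem described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the six-word ranked list and the band size of 3 are toy stand-ins for a real frequency list with 1,000-word bands, and the function names (build_bands, coverage_by_band, sort_notebook) are hypothetical, not part of any dictionary or tool mentioned in this article.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def build_bands(ranked_words, band_size=1000):
    """Map each word to its frequency band: 1 = first band, 2 = second, ...

    ranked_words is assumed to be ordered from most to least frequent.
    """
    return {word: index // band_size + 1
            for index, word in enumerate(ranked_words)}


def coverage_by_band(text, bands):
    """Return the share of running words in the text that falls in each band.

    Words that are not on the list are counted under the key None.
    """
    tokens = tokenize(text)
    if not tokens:
        return {}
    counts = Counter(bands.get(token) for token in tokens)
    return {band: count / len(tokens) for band, count in counts.items()}


def sort_notebook(words, bands):
    """Group notebook words by band so the most frequent can be studied first."""
    groups = {}
    for word in words:
        groups.setdefault(bands.get(word.lower()), []).append(word)
    return groups


if __name__ == "__main__":
    # Toy data (assumption): a real list would hold thousands of ranked words.
    toy_ranked = ["the", "of", "and", "to", "a", "in"]
    toy_bands = build_bands(toy_ranked, band_size=3)

    print(coverage_by_band("The cat sat in the hat", toy_bands))
    print(sort_notebook(["Of", "zebra", "to"], toy_bands))
```

With a real ranked list and the default band size of 1,000, the first function would give the kind of percentages quoted from Nation (2001) for a particular text, and the second would split a student's notebook into first-1,000, second-1,000, and off-list words.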