Making the Dictionary of the Frisian Language Available in the Dutch Historical Dictionary Portal
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Robust Ontology Acquisition from Machine-Readable Dictionaries
Robust Ontology Acquisition from Machine-Readable Dictionaries Eric Nichols Francis Bond Daniel Flickinger Nara Inst. of Science and Technology NTT Communication Science Labs CSLI Nara, Japan Nippon Telegraph and Telephone Co. Stanford University [email protected] Keihanna, Japan California, U.S.A. [email protected] [email protected] Abstract Our basic approach is to parse dictionary definition sen- tences with multiple shallow and deep processors, generating In this paper, we outline the development of a semantic representations of varying specificity. The seman- system that automatically constructs ontologies by tic representation used is robust minimal recursion semantics extracting knowledge from dictionary definition (RMRS: Section 2.2). We then extract ontological relations sentences using Robust Minimal Recursion Se- using the most informative semantic representation for each mantics (RMRS), a semantic formalism that per- definition sentence. mits underspecification. We show that by com- In this paper we discuss the construction of an ontology for bining deep and shallow parsing resources through Japanese using the the Japanese Semantic Database Lexeed the common formalism of RMRS, we can extract [Kasahara et al., 2004]. The deep parser uses the Japanese ontological relations in greater quality and quan- Grammar JACY [Siegel and Bender, 2002] and the shallow tity. Our approach also has the advantages of re- parser is based on the morphological analyzer ChaSen. quiring a very small amount of rules and being We carried out two evaluations. The first gives an automat- easily adaptable to any language with RMRS re- ically obtainable measure by comparing the extracted onto- sources. logical relations by verifying the existence of the relations in exisiting WordNet [Fellbaum, 1998]and GoiTaikei [Ikehara 1 Introduction et al., 1997] ontologies. -
Hittite Etymologies and Notes*
Studia Linguistica Universitatis Iagellonicae Cracoviensis 129 (2012) DOI 10.4467/20834624SL.12.015.0604 ROBERT WOODHOUSE The University oo Queensland, Brisbane [email protected] HITTITE ETYMOLOGIES AND NOTES* Keywords: Hittite, etymology, Proto-Indo-European, historical phonology, semantics Abstract Discussed are the etymologies of twelve Hittite words and word groups (alpa- ‘cloud’, aku- ‘seashell’, ariye/a-zi ‘determine by or consult an oracle’, heu- / he(y)aw- ‘rain’, hāli- ‘pen, corral’, kalmara- ‘ray’ etc., māhla- ‘grapevine branch’, sūu, sūwaw- ‘full’, tarra-tta(ri) ‘be able’ and tarhu-zi ‘id.; conquer’, idālu- ‘evil’, tara-i / tari- ‘become weary, henkan ‘death, doom’) and some points of Hittite historical phonology, such as the fate of medial *-h2n- (sub §7) and final *-i (§13), all of which appear to receive somewhat inadequate treatment in Kloekhorst’s 2008 Hittite etymological dictionary. Several old etymologies are defended and some new ones suggested. The following notes were compiled while writing a response (in press b) to that part of the (2006) paper, recently kindly brought to my attention by its author, Professor Witold Mańczak, that purports to unseat the laryngeal theory on the basis of al- legedly incompatible Hittite material collected over three decades ago by Tischler (1980). The massive debate on the laryngeal theory that essentially followed Tisch- ler’s paper was no doubt in part a response to it and produced solutions to most if not all of the problems raised by Tischler, a position I attempt to summarize in my own paper noted above with reference to the superb Hittite etymological diction- ary recently published by Kloekhorst (2008, hereinafter referred to as K:) with its several innovations in the areas of Hittite and Anatolian historical phonology and morphology. -
Lemma Selection in Domain Specific Computational Lexica – Some Specific Problems
Lemma selection in domain specific computational lexica – some specific problems Sussi Olsen Center for Sprogteknologi Njalsgade 80, DK-2300, Denmark Phone: +45 35 32 90 90 Fax: +45 35 32 90 89 e-mail: [email protected] URL: www.cst.dk Abstract This paper describes the lemma selection process of a Danish computational lexicon, the STO project, for domain specific language and focuses on some specific problems encountered during the lemma selection process. After a short introduction to the STO project and an explanation of why the lemmas are selected from a corpus and not chosen from existing dictionaries, the lemma selection process for domain specific language is described in detail. The purpose is to make the lemma selection process as automatic as possible but a manual examination of the final candidate lemma lists is inevitable. The lemmas found in the corpora are compared to a list of lemmas of general language, sorting out lemmas already encoded in the database. Words that have already been encoded as general language words but that are also found with another meaning and perhaps another syntactic behaviour in a specific domain should be kept on a list and the paper describes how this is done. The recognition of borrowed words the spelling of which have not been established constitutes a big problem to the automatic lemma selection process. The paper gives some examples of this problem and describes how the STO project tries to solve it. The selection of the specific domains is based on the 1. Introduction potential future applications and at present the following The Danish STO project, SprogTeknologisk Ordbase, domains have been selected, while – at least – one is still (i.e. -
Etytree: a Graphical and Interactive Etymology Dictionary Based on Wiktionary
Etytree: A Graphical and Interactive Etymology Dictionary Based on Wiktionary Ester Pantaleo Vito Walter Anelli Wikimedia Foundation grantee Politecnico di Bari Italy Italy [email protected] [email protected] Tommaso Di Noia Gilles Sérasset Politecnico di Bari Univ. Grenoble Alpes, CNRS Italy Grenoble INP, LIG, F-38000 Grenoble, France [email protected] [email protected] ABSTRACT a new method1 that parses Etymology, Derived terms, De- We present etytree (from etymology + family tree): a scendants sections, the namespace for Reconstructed Terms, new on-line multilingual tool to extract and visualize et- and the etymtree template in Wiktionary. ymological relationships between words from the English With etytree, a RDF (Resource Description Framework) Wiktionary. A first version of etytree is available at http: lexical database of etymological relationships collecting all //tools.wmflabs.org/etytree/. the extracted relationships and lexical data attached to lex- With etytree users can search a word and interactively emes has also been released. The database consists of triples explore etymologically related words (ancestors, descendants, or data entities composed of subject-predicate-object where cognates) in many languages using a graphical interface. a possible statement can be (for example) a triple with a lex- The data is synchronised with the English Wiktionary dump eme as subject, a lexeme as object, and\derivesFrom"or\et- at every new release, and can be queried via SPARQL from a ymologicallyEquivalentTo" as predicate. The RDF database Virtuoso endpoint. has been exposed via a SPARQL endpoint and can be queried Etytree is the first graphical etymology dictionary, which at http://etytree-virtuoso.wmflabs.org/sparql. -
Problem of Creating a Professional Dictionary of Uncodified Vocabulary
Utopía y Praxis Latinoamericana ISSN: 1315-5216 ISSN: 2477-9555 [email protected] Universidad del Zulia Venezuela Problem of Creating a Professional Dictionary of Uncodified Vocabulary MOROZOVA, O.A; YAKHINA, A.M; PESTOVA, M.S Problem of Creating a Professional Dictionary of Uncodified Vocabulary Utopía y Praxis Latinoamericana, vol. 25, no. Esp.7, 2020 Universidad del Zulia, Venezuela Available in: https://www.redalyc.org/articulo.oa?id=27964362018 DOI: https://doi.org/10.5281/zenodo.4009661 This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International. PDF generated from XML JATS4R by Redalyc Project academic non-profit, developed under the open access initiative O.A MOROZOVA, et al. Problem of Creating a Professional Dictionary of Uncodified Vocabulary Artículos Problem of Creating a Professional Dictionary of Uncodified Vocabulary Problema de crear un diccionario profesional de vocabulario no codificado O.A MOROZOVA DOI: https://doi.org/10.5281/zenodo.4009661 Kazan Federal University, Rusia Redalyc: https://www.redalyc.org/articulo.oa? [email protected] id=27964362018 http://orcid.org/0000-0002-4573-5858 A.M YAKHINA Kazan Federal University, Rusia [email protected] http://orcid.org/0000-0002-8914-0995 M.S PESTOVA Ural State Law University, Rusia [email protected] http://orcid.org/0000-0002-7996-9636 Received: 03 August 2020 Accepted: 10 September 2020 Abstract: e article considers the problem of lexicographical fixation of uncodified vocabulary in a professional dictionary. e oil sublanguage, being one of the variants of common language realization used by a limited group of its medium in conditions of official and also non-official communication, provides interaction of people employed in the oil industry. -
Germanic Standardizations: Past to Present (Impact: Studies in Language and Society)
<DOCINFO AUTHOR ""TITLE "Germanic Standardizations: Past to Present"SUBJECT "Impact 18"KEYWORDS ""SIZE HEIGHT "220"WIDTH "150"VOFFSET "4"> Germanic Standardizations Impact: Studies in language and society impact publishes monographs, collective volumes, and text books on topics in sociolinguistics. The scope of the series is broad, with special emphasis on areas such as language planning and language policies; language conflict and language death; language standards and language change; dialectology; diglossia; discourse studies; language and social identity (gender, ethnicity, class, ideology); and history and methods of sociolinguistics. General Editor Associate Editor Annick De Houwer Elizabeth Lanza University of Antwerp University of Oslo Advisory Board Ulrich Ammon William Labov Gerhard Mercator University University of Pennsylvania Jan Blommaert Joseph Lo Bianco Ghent University The Australian National University Paul Drew Peter Nelde University of York Catholic University Brussels Anna Escobar Dennis Preston University of Illinois at Urbana Michigan State University Guus Extra Jeanine Treffers-Daller Tilburg University University of the West of England Margarita Hidalgo Vic Webb San Diego State University University of Pretoria Richard A. Hudson University College London Volume 18 Germanic Standardizations: Past to Present Edited by Ana Deumert and Wim Vandenbussche Germanic Standardizations Past to Present Edited by Ana Deumert Monash University Wim Vandenbussche Vrije Universiteit Brussel/FWO-Vlaanderen John Benjamins Publishing Company Amsterdam/Philadelphia TM The paper used in this publication meets the minimum requirements 8 of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984. Library of Congress Cataloging-in-Publication Data Germanic standardizations : past to present / edited by Ana Deumert, Wim Vandenbussche. -
Methods in Lexicography and Dictionary Research* Stefan J
Methods in Lexicography and Dictionary Research* Stefan J. Schierholz, Department Germanistik und Komparatistik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany ([email protected]) Abstract: Methods are used in every stage of dictionary-making and in every scientific analysis which is carried out in the field of dictionary research. This article presents some general consid- erations on methods in philosophy of science, gives an overview of many methods used in linguis- tics, in lexicography, dictionary research as well as of the areas these methods are applied in. Keywords: SCIENTIFIC METHODS, LEXICOGRAPHICAL METHODS, THEORY, META- LEXICOGRAPHY, DICTIONARY RESEARCH, PRACTICAL LEXICOGRAPHY, LEXICO- GRAPHICAL PROCESS, SYSTEMATIC DICTIONARY RESEARCH, CRITICAL DICTIONARY RESEARCH, HISTORICAL DICTIONARY RESEARCH, RESEARCH ON DICTIONARY USE Opsomming: Metodes in leksikografie en woordeboeknavorsing. Metodes word gebruik in elke fase van woordeboekmaak en in elke wetenskaplike analise wat in die woor- deboeknavorsingsveld uitgevoer word. In hierdie artikel word algemene oorwegings vir metodes in wetenskapfilosofie voorgelê, 'n oorsig word gegee van baie metodes wat in die taalkunde, leksi- kografie en woordeboeknavorsing gebruik word asook van die areas waarin hierdie metodes toe- gepas word. Sleutelwoorde: WETENSKAPLIKE METODES, LEKSIKOGRAFIESE METODES, TEORIE, METALEKSIKOGRAFIE, WOORDEBOEKNAVORSING, PRAKTIESE LEKSIKOGRAFIE, LEKSI- KOGRAFIESE PROSES, SISTEMATIESE WOORDEBOEKNAVORSING, KRITIESE WOORDE- BOEKNAVORSING, HISTORIESE WOORDEBOEKNAVORSING, NAVORSING OP WOORDE- BOEKGEBRUIK 1. Introduction In dictionary production and in scientific work which is carried out in the field of dictionary research, methods are used to reach certain results. Currently there is no comprehensive and up-to-date documentation of these particular methods in English. The article of Mann and Schierholz published in Lexico- * This article is based on the article from Mann and Schierholz published in Lexicographica 30 (2014). -
Organizing Knowledge: Comparative Structures of Intersubjectivity in Nineteenth-Century Historical Dictionaries
Organizing Knowledge: Comparative Structures of Intersubjectivity in Nineteenth-Century Historical Dictionaries Kelly M. Kistner A dissertation submitted in partial fulfillment for the requirements for the degree of Doctor of Philosophy University of Washington 2014 Reading Committee: Gary G. Hamilton, Chair Steven Pfaff Katherine Stovel Program Authorized to Offer Degree: Sociology ©Copyright 2014 Kelly M. Kistner University of Washington Abstract Organizing Knowledge: Comparative Structures of Intersubjectivity in Nineteenth-Century Historical Dictionaries Kelly Kistner Chair of the Supervisory Committee: Professor Gary G. Hamilton Sociology Between 1838 and 1857 language scholars throughout Europe were inspired to create a new kind of dictionary. Deemed historical dictionaries, their projects took an unprecedented leap in style and scale from earlier forms of lexicography. These lexicographers each sought to compile historical inventories of their national languages and were inspired by the new scientific approach of comparative philology. For them, this science promised a means to illuminate general processes of social change and variation, as well as the linguistic foundations for cultural and national unity. This study examines two such projects: The German Dictionary, Deutsches Worterbuch, of the Grimm Brothers, and what became the Oxford English Dictionary. Both works utilized collaborative models of large-scale, long-term production, yet the content of the dictionaries would differ in remarkable ways. The German dictionary would be characterized by its lack of definitions of meaning, its eclectic treatment of entries, rich analytical prose, and self- referential discourse; whereas the English dictionary would feature succinct, standardized, and impersonal entries. Using primary source materials, this research investigates why the dictionaries came to differ. -
LCSH Section W
W., D. (Fictitious character) William Kerr Scott Lake (N.C.) Waaddah Island (Wash.) USE D. W. (Fictitious character) William Kerr Scott Reservoir (N.C.) BT Islands—Washington (State) W.12 (Military aircraft) BT Reservoirs—North Carolina Waaddah Island (Wash.) USE Hansa Brandenburg W.12 (Military aircraft) W particles USE Waadah Island (Wash.) W.13 (Seaplane) USE W bosons Waag family USE Hansa Brandenburg W.13 (Seaplane) W-platform cars USE Waaga family W.29 (Military aircraft) USE General Motors W-cars Waag River (Slovakia) USE Hansa Brandenburg W.29 (Military aircraft) W. R. Holway Reservoir (Okla.) USE Váh River (Slovakia) W.A. Blount Building (Pensacola, Fla.) UF Chimney Rock Reservoir (Okla.) Waaga family (Not Subd Geog) UF Blount Building (Pensacola, Fla.) Holway Reservoir (Okla.) UF Vaaga family BT Office buildings—Florida BT Lakes—Oklahoma Waag family W Award Reservoirs—Oklahoma Waage family USE Prix W W. R. Motherwell Farmstead National Historic Park Waage family W.B. Umstead State Park (N.C.) (Sask.) USE Waaga family USE William B. Umstead State Park (N.C.) USE Motherwell Homestead National Historic Site Waahi, Lake (N.Z.) W bosons (Sask.) UF Lake Rotongaru (N.Z.) [QC793.5.B62-QC793.5.B629] W. R. Motherwell Stone House (Sask.) Lake Waahi (N.Z.) UF W particles UF Motherwell House (Sask.) Lake Wahi (N.Z.) BT Bosons Motherwell Stone House (Sask.) Rotongaru, Lake (N.Z.) W. Burling Cocks Memorial Race Course at Radnor BT Dwellings—Saskatchewan Wahi, Lake (N.Z.) Hunt (Malvern, Pa.) W.S. Payne Medical Arts Building (Pensacola, Fla.) BT Lakes—New Zealand UF Cocks Memorial Race Course at Radnor Hunt UF Medical Arts Building (Pensacola, Fla.) Waʻahila Ridge (Hawaii) (Malvern, Pa.) Payne Medical Arts Building (Pensacola, Fla.) BT Mountains—Hawaii BT Racetracks (Horse racing)—Pennsylvania BT Office buildings—Florida Waaihoek (KwaZulu-Natal, South Africa) W-cars W star algebras USE Waay Hoek (KwaZulu-Natal, South Africa : USE General Motors W-cars USE C*-algebras Farm) W. -
Historical Astrolexicography and Old Publications
Library and Information Services in Astronomy III ASP Conference Series, Vol. 153, 1998 U. Grothkopf, H. Andernach, S. Stevens-Rayburn, and M. Gomez (eds.) Historical Astrolexicography and Old Publications Terry J. Mahoney Instituto de Astrof´ısica de Canarias, E-38200 La Laguna, Tenerife, Spain Abstract. I describe how the principles of lexicography have been ap- plied in limited ways in astronomy and look at the revision work under way for the third edition of the Oxford English Dictionary,which,when completed, will contain the widest and most detailed coverage of the as- tronomical lexicon in the English language. Finally, I argue the need for a dedicated historical dictionary of astronomy based rigorously on a corpus of quotations from sources published in English from the beginnings of written English to the present day. 1. The Role of Lexicography in Astronomy Lexicography is defined in the Oxford English Dictionary1 as “the writing or com- pilation of a lexicon or dictionary.” Among the various senses for lexicon listed in OED2 is, “The vocabulary proper to some department of knowledge or sphere of activity.” I define astrolexicography as the application of the practices of lex- icography to the astronomical lexicon. However, the discipline of lexicography may be applied to many types of reference works other than dictionaries (for example, “encyclopaedic dictionaries”, glossaries of technical terms, thesauri, lists of key words, etc.). A full survey of astronomical lexicographical works will be the subject of a future paper. Here I shall limit myself to identifying what I consider to be an important lacuna in astronomical literature. -
An Amerind Etymological Dictionary
An Amerind Etymological Dictionary c 2007 by Merritt Ruhlen ! Printed in the United States of America Library of Congress Cataloging-in-Publication Data Greenberg, Joseph H. Ruhlen, Merritt An Amerind Etymological Dictionary Bibliography: p. Includes indexes. 1. Amerind Languages—Etymology—Classification. I. Title. P000.G0 2007 000!.012 00-00000 ISBN 0-0000-0000-0 (alk. paper) This book is dedicated to the Amerind people, the first Americans Preface The present volume is a revison, extension, and refinement of the ev- idence for the Amerind linguistic family that was initially offered in Greenberg (1987). This revision entails (1) the correction of a num- ber of forms, and the elimination of others, on the basis of criticism by specialists in various Amerind languages; (2) the consolidation of certain Amerind subgroup etymologies (given in Greenberg 1987) into Amerind etymologies; (3) the addition of many reconstructions from different levels of Amerind, based on a comprehensive database of all known reconstructions for Amerind subfamilies; and, finally, (4) the addition of a number of new Amerind etymologies presented here for the first time. I believe the present work represents an advance over the original, but it is at the same time simply one step forward on a project that will never be finished. M. R. September 2007 Contents Introduction 1 Dictionary 11 Maps 272 Classification of Amerind Languages 274 References 283 Semantic Index 296 Introduction This volume presents the lexical and grammatical evidence that defines the Amerind linguistic family. The evidence is presented in terms of 913 etymolo- gies, arranged alphabetically according to the English gloss. -
Towards a Conceptual Representation of Lexical Meaning in Wordnet
Towards a Conceptual Representation of Lexical Meaning in WordNet Jen-nan Chen Sue J. Ker Department of Information Management, Department of Computer Science, Ming Chuan University Soochow University Taipei, Taiwan Taipei, Taiwan [email protected] [email protected] Abstract Knowledge acquisition is an essential and intractable task for almost every natural language processing study. To date, corpus approach is a primary means for acquiring lexical-level semantic knowledge, but it has the problem of knowledge insufficiency when training on various kinds of corpora. This paper is concerned with the issue of how to acquire and represent conceptual knowledge explicitly from lexical definitions and their semantic network within a machine-readable dictionary. Some information retrieval techniques are applied to link between lexical senses in WordNet and conceptual ones in Roget's categories. Our experimental results report an overall accuracy of 85.25% (87, 78, 89, and 87% for nouns, verbs, adjectives and adverbs, respectively) when evaluating on polysemous words discussed in previous literature. 1 Introduction Knowledge representation has traditionally been thought of as the heart of various applications of natural language processing. Anyone who has built a natural language processing system has had to tackle the problem of representing its knowledge of the world. One of the applications for knowledge representation is word sense disambiguation (WSD) for unrestricted text, which is one of the major problems in analyzing sentences. In the past, the substantial literature on WSD has concentrated on statistical approaches to analyzing and extracting information from corpora (Gale et al. 1992; Yarowsky 1992, 1995; Dagan and Itai 1994; Luk 1995; Ng and Lee 1996).