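Metaphor in the Translated Media Discourse on Shares and Markets in Finweek (original Afrikaans title: Metafoor in die Vertaalde Mediadiskoers oor Aandele en Markte in Finweek)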

METAPHOR IN THE TRANSLATED MEDIA DISCOURSE ON SHARES AND MARKETS IN FINWEEK

by Erica du Preez

Thesis submitted in fulfilment of the requirements for the degree Philosophiae Doctor in Language Practice (Ph.D. Language Practice) in the Department of Afroasiatic Studies, Sign Language and Language Practice, Faculty of the Humanities, University of the Free State, Bloemfontein

November 2009

Promoter: Prof. J.A. Naudé

DECLARATION

I, the undersigned, declare that the thesis hereby submitted by me for the qualification Ph.D. (Language Practice) at the University of the Free State is my own independent work and has not previously been submitted by me for a degree at another university/faculty.

Signature: .................................... E S du Preez Date: 27 November 2009

I hereby cede the copyright of this product to the University of the Free State.

Signature: .................................... E S du Preez Date: 27 November 2009

ACKNOWLEDGEMENTS

Everything I achieve is a gift from Above. All glory to God alone! I wish to thank and acknowledge the following people, without whom I would not have reached the finish line:

• Prof. J.A. Naudé, my supervisor, for your patience, encouragement and thorough professional academic guidance during my studies, and especially with the writing and finalisation of this thesis.
• To my husband James and my dearest children Liesl and Eric, Inge and Ian, and Werner: thank you very much for your compassion and loving support.
• Sincere thanks to my friend and colleague Igna du Plooy for your help with the searches.
• Special thanks to Prof. Tienie Crous, Dean of the Faculty of Economic and Management Sciences, University of the Free State, for your understanding and for the opportunity to undertake this study within the obligations of my work as faculty manager.
• I greatly appreciate the care and loyal support of all my colleagues.
• To my special friends Hettie and Gerrit van Wyk: thank you for your sincere friendship and encouragement over many years.
• To the language editors Marlie van Rooyen, Marti Gerber and Tessa Moller: your contributions were indispensable.

TABLE OF CONTENTS

DECLARATION
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS AND ACRONYMS
LIST OF APPENDICES
SUMMARY
ABSTRACT
QOTSO

CHAPTER 1 INTRODUCTION
1.1 Necessity of the study
1.2 The aim of the study and problem statement
1.3 Hypothesis
1.4 Framework of the study
  1.4.1 Corpus-based translation studies
  1.4.2 Previous investigations
  1.4.3 Methodology
  1.4.4 Delimitation of the study
  1.4.5 Compilation of the corpus material
1.5 Organisation of the study

CHAPTER 2 METAPHORS IN FINANCIAL MEDIA DISCOURSE
2.1 Introduction
2.2 Discourse
  2.2.1 Defining "discourse"
    2.2.1.1 Discourse analysis
      (i) Strata
      (ii) Metafunctions
    2.2.1.2 Critical theory
    2.2.1.3 Poststructural tradition
      (i) Cultural Studies
      (ii) Michel Foucault and discourse
  2.2.2 Media discourse
    2.2.2.1 Fairclough and media discourse
      (i) Linguistic and sociolinguistic analytical approach to media discourse
      (ii) Conversation-analytical approach to media discourse
      (iii) Semiotic approach to media discourse
      (iv) Critical linguistic and social-semiotic approach to media discourse
      (v) Socio-cognitive approach to media discourse
      (vi) Cultural-generic approach to media discourse
    2.2.2.2 Summary and implication of this study
2.3 Metaphor
  2.3.1 The classical cognitive metaphor theory
  2.3.2 Conceptual metaphor theory
    2.3.2.1 Lakoff and Johnson
    2.3.2.2 Kövecses
  2.3.3 Poetic metaphor theory
    2.3.3.1 Extension
    2.3.3.2 Elaboration
    2.3.3.3 Questioning
    2.3.3.4 Composition
    2.3.3.5 Mapping
    2.3.3.6 Blending theory
2.4 The role of metaphor in media discourse and financial media discourse
  2.4.1 The war metaphor in financial discourse
  2.4.2 Cognitive semantics and critical metaphor analysis
  2.4.3 The identification of the metaphors in the corpus
2.5 Summary

CHAPTER 3 CORPUS-BASED TRANSLATION STUDIES AS RESEARCH FRAMEWORK
3.1 Introduction
3.2 What is a corpus?
  3.2.1 Different definitions
3.3 The machine-readable corpus and the use of computers in corpus-based translation studies
3.4 The development of translation studies
  3.4.1 Early corpus linguistics
  3.4.2 The revival of corpus linguistics
3.5 Corpus-based research
  3.5.1 Design and development
    3.5.1.1 First-generation corpora
    3.5.1.2 Second-generation corpora
  3.5.2 Parallel corpora
    3.5.2.1 Definitions
    3.5.2.2 Parallel corpora and translation
    3.5.2.3 The use of parallel texts in the training of translators
    3.5.2.4 The investigation of parallel texts
3.6 ParaConc
  3.6.1 A basic concordance search
    3.6.1.1 Results in two languages
    3.6.1.2 Sorting the results
    3.6.1.3 Suggested translations
    3.6.1.4 SWIK (keyword in context)
    3.6.1.5 Hot words
  3.6.2 Corpus frequency information
    3.6.2.1 Frequency options
  3.6.3 Collocations and collocating
    3.6.3.1 Collocation frequencies
3.7 Summary

CHAPTER 4 A PARALLEL CORPUS OF METAPHORS ON SHARES AND MARKETS IN FINWEEK AS LANGUAGE FOR SPECIAL PURPOSES
4.1 Introduction
4.2 Language for special purposes (LSP)
  4.2.1 Description of LSP
  4.2.2 Users of LSP
  4.2.3 Different levels of LSP communication
4.3 The design of an LSP corpus
  4.3.1 The size of the corpus
  4.3.2 Copyright and permission
  4.3.3 The compilation of the LSP corpora
4.4 The compilation of the parallel corpora for the study
  4.4.1 Number of texts
  4.4.2 Medium
  4.4.3 Design
  4.4.4 Language
4.5 The use of an LSP corpus to investigate metaphor
4.6 Bilingual corpora: pre-processing and alignment
  4.6.1 The method
    4.6.1.1 Downloading from the web
    4.6.1.2 Storing of texts
    4.6.1.3 Register of articles
4.7 Corpus processing
4.8 Determining the lexical fields for analysis
4.9 Summary

CHAPTER 5 THE CRITICAL ANALYSIS OF THE METAPHORS IN THE CORPORA
5.1 Introduction
5.2 Quantitative analysis of the sample texts
  5.2.1 Lexical fields
    5.2.1.1 Afrikaans texts with examples of WAR AND POWER
    5.2.1.2 English texts with examples of WAR AND POWER
    5.2.1.3 Afrikaans texts with examples of SPORT AND GAMES
    5.2.1.4 English texts with examples of SPORT AND GAMES
    5.2.1.5 Afrikaans texts with examples of ROMANCE AND COURTSHIP
    5.2.1.6 English texts with examples of ROMANCE AND COURTSHIP
  5.2.2 Absolute and relative frequencies in the identified metaphor clusters
    5.2.2.1 Afrikaans corpus
    5.2.2.2 English corpus
  5.2.3 Absolute and relative frequencies in the identified alternative metaphor clusters
    5.2.3.1 Afrikaans corpus
    5.2.3.2 English corpus
5.3 Summary and conclusions regarding the quantitative analysis
5.4 Qualitative analysis of the corpus
  5.4.1 The sample texts
  5.4.2 Qualitative analysis of the Afrikaans articles
  5.4.3 Qualitative analysis of the English articles
5.5 Summary and conclusions regarding the qualitative analysis
5.6 Patterns in the text
  5.6.1 English articles
  5.6.2 Afrikaans articles
5.7 A hierarchical cognitive model
5.8 The use of LSP, the expansion of the vocabulary and the concept-forming functions of metaphor
5.9 Summary

CHAPTER 6 SUMMARISING CONCLUSIONS AND FUTURE PERSPECTIVES
6.1 Introduction
6.2 Summary of the study
  6.2.1 Metaphors and financial media discourse
  6.2.2 Corpus-based translation studies as research framework
  6.2.3 A parallel corpus of metaphors on shares and markets in Finweek as language for special purposes
  6.2.4 The critical analysis of metaphors in the corpora
6.3 The identified coherent metaphor clusters in the corpus
6.4 The concept-forming functions of the metaphors in the corpus
6.5 The author's aim in the choice of metaphor used
  6.5.1 The contribution of metaphors to the evolution of economic thought
  6.5.2 The ideological effect of the metaphors in the corpus
6.6 New ideas and meanings formed by the combination of different domains
6.7 A comparison between this study and the study of Koller (2004)
  6.7.1 Similarities
  6.7.2 Differences
6.8 Perspectives on future research

BIBLIOGRAPHY
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
Appendix I
Appendix J
Appendix K
Appendix L

LIST OF TABLES

Table 2.1: The metafunctions of language
Table 2.2: Order from source to target
Table 2.3: Differences and similarities between classical metaphor theory and blending theory
Table 2.4: Comparison between English and Afrikaans orientational metaphors
Table 5.1: Lexical field WAR AND POWER in the Afrikaans corpus
Table 5.2: Lexical field WAR AND POWER in the English corpus
Table 5.3: Lexical field SPORT AND GAMES in the Afrikaans corpus
Table 5.4: Lexical field SPORT AND GAMES in the English corpus
Table 5.5: Lexical field ROMANCE AND COURTSHIP in the Afrikaans corpus
Table 5.6: Lexical field ROMANCE AND COURTSHIP in the English corpus
Table 5.7: Metaphorical expressions of WAR AND POWER, and SPORT AND GAMES in the Afrikaans corpus per word class
Table 5.8: Lemmas of WAR AND POWER and SPORT AND GAMES that were investigated but do not occur in the Afrikaans text
Table 5.9: Metaphorical expressions of WAR AND POWER, and SPORT AND GAMES in the English corpus per word class