Eliciting Specialized Frames from Corpora Using Argument-Structure Extraction Techniques Beatriz Sanchez Cardenas, Carlos Ramisch

To cite this version: Beatriz Sanchez Cardenas, Carlos Ramisch. Eliciting specialized frames from corpora using argument-structure extraction techniques. Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication, John Benjamins Publishing, 2019, 25 (1), pp.1-31. 10.1075/term.00026.san. hal-02318280

HAL Id: hal-02318280 https://hal.archives-ouvertes.fr/hal-02318280 Submitted on 16 Oct 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

This preprint version has been produced by the authors upon acceptance and reflects changes requested by reviewers. The official ‘version of record’ https://doi.org/10.1075/term.00026.san is under copyright and the publisher should be contacted for permission to re-use or reprint the material in any form.

Reference: Sánchez-Cárdenas, Beatriz & Carlos Ramisch (2019). Eliciting specialized frames from corpora using argument-structure extraction techniques. Terminology: An International Journal of Theoretical and Applied Issues in Specialized Communication, 25(1).
DOI: https://doi.org/10.1075/term.00026.san
Authors: Beatriz Sánchez Cárdenas and Carlos Ramisch
Length: 8702 words (excluding references)

Beatriz Sánchez-Cárdenas, Research group LexiCon, Department of Translation and Interpreting, University of Granada, Calle Buensuceso, -- 18002 Granada (Spain). (+34) 958244104. bsc@ugr.es. http://lexicon.ugr.es/sanchezcardenas

Carlos Ramisch, Aix Marseille Univ, Université de Toulon, CNRS, LIS, Parc Scientifique et Technologique de Luminy, 163 Avenue de Luminy - Case 901, 13288 Marseille Cedex 9 (France). (+33) 0 86 09 06 72. Carlos.Ramisch@lis-lab.fr. http://pageperso.lis-lab.fr/~carlos.ramisch

Abstract

Frame Semantics provides a powerful cross-lingual model to describe the conceptual structure underlying specialized language. However, building specialized frames is challenging because of the complex nature of predicate-argument structures, and because of the domain-specific uses of general-language predicates. This article presents a semi-automatic method to elicit semantic frames from specialized corpora. Its goal is to discover lexical patterns that reveal the structure of specialized frames and to populate them with corpus-based data. Firstly, we automatically extracted verb-noun triples from corpora using bootstrapping to identify noun-verb-noun phraseological patterns. Secondly, we annotated each noun-verb-noun triple with the lexical domain of the verbs and the semantic class and role of the noun filling each argument slot. We then used these annotations and patterns to classify similar triples. This allowed us to make generalizations and infer the structure as well as the types of lexical units that belong to these specialized frames. We evaluated our methodology using specialized corpora of environmental science texts in English and in Spanish.

Keywords: Frame semantics, frame-based terminology, corpora, corpus-based extraction, argument structure
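The annotation and classification steps summarized in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation. The class names, the `signature` and `classify` helpers, the example triples, and the labels (`NATURAL_EVENT`, `CAUSATION`, `CAUSE`, `RESULT`) are all assumptions made for illustration. The sketch only shows how annotated noun-verb-noun triples might be grouped by a shared annotation pattern and how that pattern might be carried over to an unannotated triple.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    lemma: str           # noun filling the argument slot
    semantic_class: str  # hypothetical label, e.g. "NATURAL_EVENT"
    role: str            # hypothetical label, e.g. "CAUSE"


@dataclass(frozen=True)
class Triple:
    arg1: Argument
    verb: str
    lexical_domain: str  # hypothetical label for the verb's lexical domain
    arg2: Argument


def signature(t):
    """Annotation pattern shared by triples that may belong to one frame."""
    return (t.lexical_domain,
            t.arg1.semantic_class, t.arg1.role,
            t.arg2.semantic_class, t.arg2.role)


def group_by_signature(triples):
    """Group annotated triples that have the same annotation pattern."""
    groups = defaultdict(list)
    for t in triples:
        groups[signature(t)].append(t)
    return dict(groups)


def classify(raw, annotated):
    """Assign a signature to an unannotated (noun1, verb, noun2) lemma triple.

    A signature is returned only if the verb and both nouns already occur
    in the annotated data with matching semantic classes; otherwise None.
    """
    noun_class = {}
    for t in annotated:
        noun_class[t.arg1.lemma] = t.arg1.semantic_class
        noun_class[t.arg2.lemma] = t.arg2.semantic_class
    n1, verb, n2 = raw
    for t in annotated:
        if (t.verb == verb
                and noun_class.get(n1) == t.arg1.semantic_class
                and noun_class.get(n2) == t.arg2.semantic_class):
            return signature(t)
    return None


# Invented example data, not taken from the article's corpora.
ANNOTATED = [
    Triple(Argument("earthquake", "NATURAL_EVENT", "CAUSE"), "cause",
           "CAUSATION", Argument("tsunami", "NATURAL_EVENT", "RESULT")),
    Triple(Argument("storm", "NATURAL_EVENT", "CAUSE"), "cause",
           "CAUSATION", Argument("flood", "NATURAL_EVENT", "RESULT")),
]

GROUPS = group_by_signature(ANNOTATED)
NEW_LABEL = classify(("storm", "cause", "tsunami"), ANNOTATED)
```

In this toy setting both annotated triples fall into one group, and the unseen combination of `storm`, `cause` and `tsunami` inherits that group's pattern. A triple containing a noun that was never annotated is left unclassified, which is where manual annotation would come back in.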
1. Introduction

The study of phraseology in scientific texts tends to focus either on general scientific formulaic templates or on the study of terms for their inclusion in specialized dictionaries. However, the description of the language used in a given scientific or technical domain should go far beyond merely collecting an inventory of terms that are used to instantiate general-language constructs (L’Homme 2004, Hanks 2004, Williams 2005, Granger and Meunier 2008, Faber 2012). In fact, a significant part of specialized language is composed of structured lexico-grammatical constructs used to express complex concepts that are typical of a given domain. There is thus the need to develop specialized lexicons that provide this type of information. This is particularly evident in translation. Translators dealing with specialized texts often have problems transposing the meaning of a sentence across languages because a superficial knowledge of the terms in a text is not sufficient. In addition to translating terms, it is necessary to translate actions and processes along with the entities that participate in them. For instance, a description of earthquake should include the entities that generally cause this event as well as its effect on other entities. This would afford translators a more in-depth knowledge of the concept and allow them to express it more idiomatically in the target language. In our opinion, such a description should stem from the analysis of specialized corpora in the source and target languages.
In this endeavor, domain-specific corpora are a rich source of information. Given that verbs carry most of the semantic load of the sentence, they are essential to define the underlying conceptual structure of specialized texts (Fellbaum 1990; L'Homme 2012, 1998). Thus, the identification of noun-verb combinations in corpora is crucial to build structured descriptions. The corpus-based construction of specialized lexical resources requires both linguistic and domain expertise, as well as suitable tools for performing corpus inquiries. Computational tools can support, enhance and facilitate corpus analysis to confirm and generalize linguistic introspection. Therefore, one often needs to run complex queries to model morphosyntactic and syntactic co-occurrence patterns, which in turn are proxies for predicate-argument structure. Our research combined the principles of Frame-based Terminology (Faber 2012, 2015; Faber and León Araúz 2014) with computational tools for corpus searches, semantic annotation, and frame specification. For automatic corpus searches, we used the MWEtoolkit, a software application that extracts co-occurrence patterns from corpora using multi-level queries that support regular-expression operators (Ramisch 2015). This approach is rooted in a considerable body of literature from the last 20 years on the identification of knowledge patterns in specialized texts (Faber et al. 2009, Feliu 2004, Condamines 2002, Condamines and Rebeyrolle, Meyer et al. 2001, Meyer et al. 1999, inter alia).
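To make the idea of a multi-level query with regular-expression operators more concrete, here is a small sketch in plain Python. It does not use the MWEtoolkit or its pattern syntax. The `Token` type, the coarse tag names (`NOUN`, `VERB`, `DET`, `ADJ`), the one-letter encoding, and the example sentences are assumptions chosen for illustration. Each token carries several levels (surface form, lemma, part of speech). The query constrains the part-of-speech level with a regular expression and reads the result off the lemma level.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    surface: str
    lemma: str
    pos: str  # coarse part-of-speech tag; the tag set is an assumption


# One letter per tag so that a sentence becomes a string a regex can scan.
POS_CODE = {"NOUN": "N", "VERB": "V", "DET": "D", "ADJ": "A"}

# Noun, verb, any number of determiners or adjectives, then a noun.
PATTERN = re.compile(r"(N)(V)[DA]*(N)")


def extract_triples(sentence):
    """Return (noun lemma, verb lemma, noun lemma) for each pattern match."""
    codes = "".join(POS_CODE.get(tok.pos, "O") for tok in sentence)
    triples = []
    for m in PATTERN.finditer(codes):
        # Character offsets in `codes` are token indices in `sentence`.
        n1, v, n2 = (sentence[m.start(g)].lemma for g in (1, 2, 3))
        triples.append((n1, v, n2))
    return triples


# Invented example sentences, already tokenized and tagged by hand.
SENT_MATCH = [
    Token("Hurricanes", "hurricane", "NOUN"),
    Token("destroy", "destroy", "VERB"),
    Token("the", "the", "DET"),
    Token("coastal", "coastal", "ADJ"),
    Token("buildings", "building", "NOUN"),
]
SENT_NO_MATCH = [
    Token("Waves", "wave", "NOUN"),
    Token("slowly", "slowly", "ADV"),
    Token("erode", "erode", "VERB"),
    Token("cliffs", "cliff", "NOUN"),
]
```

Calling `extract_triples(SENT_MATCH)` yields the single triple `("hurricane", "destroy", "building")`, while the second sentence yields nothing because an adverb separates the subject noun from the verb. Note that `finditer` returns non-overlapping matches, so a noun that closes one match cannot open the next. A real query would have to decide how to handle such cases.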
The output of the initial
Recommended publications
  • Frame Blending in Specialized Language: Harmful Algal Bloom
    Terminology 19:2 © 2013. John Benjamins Publishing Company. Frame blending in specialized language: Harmful algal bloom. José Manuel Ureña Gómez-Moreno, Pamela Faber and Miriam Buendía Castro. According to Frame-Based Terminology (Faber et al. 2005, 2006, 2007), a crucial issue in terminology management is how specialized concepts should be represented within the knowledge structure of a scientific domain. This paper proposes a model of specialized concept representation based on conceptual frames (Faber et al. 2006, 2007; Faber 2011) and blends (Fauconnier 1999; Fauconnier and Turner 1998, 2002). Although frame-blending has been documented in general language (cf. Coulson 2005), it has not as yet been studied in specialized language. In this paper, we show how it can be applied to harmful algal bloom in the field of marine biology.
  • The Lexical Constructional Model: Genesis, Strengths and Challenges
    The Lexical Constructional Model: genesis, strengths and challenges. Christopher S. Butler, Honorary Professor, Swansea University, UK. 1. Introduction. The early years of the 21st century are proving to be an interesting time for linguistic theory. The last few years have seen a welcome increase in the discussion of similarities and differences both within groupings of theories (e.g. those which would claim to be functionalist in orientation) and across such groupings (e.g. formalist, functionalist, cognitivist, constructionist). This in turn is leading to increasing awareness of the possibilities for rapprochement between models, and to the realisation that a combination of ideas from different approaches may turn out to be much more powerful than any of the models taken by itself. It is against this background that the present article is written. In it, I shall first review briefly some recent work on relationships across a spectrum of functionalist, cognitivist and constructionist theories, as a background to the rest of the discussion. I shall then look at one recent model, the Lexical Constructional Model (henceforth LCM), which richly embodies the principle of combining good ideas from a variety of compatible sources, and which is described in greater detail in other papers in this collection. The LCM is a complex model, with antecedents in a whole range of functional, cognitivist and constructionist approaches. For this reason it is useful, in understanding and evaluating the current proposals, to look at how the model came into being. The first part of this article therefore demonstrates how and why the model has arisen, taking material from various approaches.
  • Specialized Knowledge Representation: From Terms to Frames
    Research in Language, 2019, vol. 17:2. DOI: 10.18778/1731-7533.17.2.06. Specialized Knowledge Representation: From Terms to Frames. Pamela Faber (University of Granada, Spain) and Melania Cabezas-García (University of Granada, Spain). Abstract: Understanding specialized discourse requires the identification and activation of knowledge structures underlying the text. The expansion and enhancement of knowledge is thus an important part of the specialized translation process (Faber 2015). This paper explores how the analysis of terminological meaning can be addressed from the perspective of Frame-Based Terminology (FBT) (Faber 2012, 2015), a cognitive approach to domain-specific language, which directly links specialized knowledge representation to cognitive linguistics and cognitive semantics. In this study, context expansion was explored in a three-stage procedure: from single terms to multi-word terms, from multi-word terms to phrases, and from phrases to frames. Our results showed that this approach provides valuable insights into the identification of the knowledge structures underlying specialized texts. Keywords: context expansion, frame, multi-word term, phrase, specialized discourse. 1. Introduction. An important issue in translation is how to achieve sameness of meaning across languages and at all levels of the text. In the case of the translation of scientific and technical texts, a considerable percentage of translation quality depends on finding optimal correspondences for the specialized language units or terms used to convey the text message. These units, which may be single or multi-word terms, designate objects, events, processes, and attributes in the specialized field (Faber 2012). Terms, semantic clusters of terms, and their configurations activate segments of the conceptual structure of a knowledge domain (Sager et al.
  • Table of Contents
    Electronic lexicography in the 21st century: Lexicography from scratch. Table of contents:
    From Thesaurus to Framenet (Sanni NIMB, Anna BRAASCH, Sussi OLSEN, Bolette SANDFORD PEDERSEN, Anders SØGAARD)
    Bilingual Dictionary Drafting: Bootstrapping WordNet and BabelNet (David LINDEMANN, Fritz KLICHE)
    The Main Features of the e-Glava Online Valency Dictionary (Matea BIRTIĆ, Ivana BRAČ, Siniša RUNJAIĆ)
    Specifying Hyponymy Subtypes and Knowledge Patterns: A Corpus-based Study (Juan Carlos GIL-BERROZPE, Pilar LEÓN-ARAÚZ, Pamela FABER)
    From Translation Equivalents to Synonyms: Creation of a Slovene Thesaurus Using Word Co-occurrence Network Analysis (Simon KREK, Cyprian LASKOWSKI, Marko ROBNIK-ŠIKONJA)
    An Ontology-terminology Model for Designing Technical e-dictionaries: Formalisation and Presentation of Variational Data (Laura GIACOMINI)
    The Translation Equivalents Database (Treq) as a Lexicographer’s Aid (Michal ŠKRABAL, Martin VAVŘÍN)
    Cognitive Features in a Corpus-based Dictionary of Commonly Confused Words (Petra STORJOHANN)
    From Monolingual to Bilingual Dictionary: The Case of Semi-automated Lexicography on the Example of Estonian–Finnish Dictionary (Margit LANGEMETS, Indrek HEIN, Tarja HEINONEN, Kristina KOPPEL, Ülle VIKS)
    The Croatian Web Dictionary Project – Mrežnik (Lana HUDEČEK and Milica MIHALJEVIĆ)
    Dicționariul Limbei Române (LM) by A. T. Laurian and I. C. Massim – the Digital Form of the First Romanian Academic Dictionary (Marius-Radu CLIM, Mădălin-Ionel PATRAȘCU, Elena Isabelle TAMBA)
    What Do Users of General Electronic Monolingual Dictionaries Search for? The Most Popular Entries in the Polish Academy of Sciences Great Dictionary of Polish (Ewa KOZIOŁ-CHRZANOWSKA)
    Pictorial Illustrations in Encyclopaedias and in Dictionaries – a Comparison (Monika BIESAGA)
  • Written in the Wind: Cultural Variation in Terminology
    Chapter Nineteen. Written in the Wind: Cultural Variation in Terminology. Pamela Faber and Laura Medina Rull. Introduction. The interface between language, culture, and conceptualization is an explicit focus in both Cognitive Linguistics and Cultural Linguistics (Palmer 1996; Sharifian 2011). Culture encompasses the beliefs, behavior, objects, traditions, language, and other characteristics common to a particular sociocultural group. As the primary vehicle of cultural transmission, language encodes shared cultural knowledge, which can be reflected in word or term meaning in its most encyclopedic sense. In Cognitive Linguistics, meaning is identified with conceptualization, which encompasses any kind of mental experience (Langacker 2007: 431). Meanings are thus regarded as access points to extensive bodies of knowledge that are not specifically linguistic (Langacker 2014: 28). This is applicable not only to general language, but also to specialized language. This paper explores the cultural dimension of the conceptual category of WIND. From a meteorological perspective, winds are generally classified in terms of the following: spatial scale, speed, direction, region of occurrence, and effect. Many of these parameters are derived from cultural perceptions, especially when the wind is typical of a certain geographic area or region. The analysis of dictionary definitions as well as the study of micro-contexts extracted from a corpus of specialized environmental texts highlighted a common core of conceptual relations used to describe local winds. These relations are also the basis of a cultural frame or semplate (Burenhult and Levinson 2008: 144) for the concept of WIND. Although terms or specialized meaning units have always possessed a cultural dimension (Temmerman and Campenhoudt 2014), they are not generally perceived as cultural objects.
  • Phraseology in Specialized Resources: An Approach to Complex Nominals
    Cabezas-García, Melania and Pamela Faber (2018). Phraseology in specialized resources: an approach to complex nominals. Lexicography 5(1), 55-83. Available at: http://link.springer.com/article/10.1007/s40607-018-0046-x University of Granada. Abstract. In English, the international language of communication (Tono 2014), complex nominals (CNs) are frequently used to convey specialized concepts (Sager et al. 1980; Nakov 2013). These phraseological units have a nominal head that is modified by another element (e.g. hydropower production). Problems can arise in relation to their identification, their bracketing or internal structure disambiguation, their meaning access, and their translation or production in another language. Although they are not marginal phenomena in specialized language, they are rarely included in specialized resources. Even when they are included, their treatment is not systematic (Cabezas-García and Faber 2017a). This article describes the representation of CNs in EcoLexicon (www.ecolexicon.ugr.es), a terminological knowledge base, whose new phraseological module will include verb collocations (e.g. a volcano spews lava) as well as CNs. For that purpose, we used a wind power corpus in English and Spanish for term extraction, semantic analysis, establishment of interlinguistic correspondences, and definition crafting. We propose different access points to information (Kwary 2012), such as the CNs formed from a given term, a bilingual view in English and Spanish, or the syntactic-semantic combinations in CNs. The structure of the CN module is based on the semantics of these phraseological units, which facilitates the specification of mapping rules as well as knowledge acquisition (Faber 2012).
  • PAMELA FABER and RICARDO MAIRAL (1999). Constructing a Lexicon of English Verbs
    PAMELA FABER AND RICARDO MAIRAL (1999). Constructing a Lexicon of English Verbs. Berlin: Mouton. Review by Christopher S. Butler In July 1995, the functional linguistics community was shocked and deeply saddened to hear of the sudden and untimely death of Professor Leocadio Martín Mingorance, of the University of Córdoba, Spain. Martín Mingorance’s work, combining the Functional Grammar of Simon Dik with the lexematics of Eugene Coseriu into the lexically-based Functional Lexematic Model, began the process of developing the Functional Grammar conception of the lexicon into a model which integrates semantic, syntactic and pragmatic aspects of lexemes within a framework in which both paradigmatic and syntagmatic patterning find their place. Prominent among Martín Mingorance’s collaborators were Pamela Faber and Ricardo Mairal Usón, whose determination to carry on and develop the line of research pioneered by their friend and mentor has resulted in the present volume. Their aim in this book is impressively ambitious: to give an account of the English verbal lexicon which not only systematises the meanings of lexemes within a hierarchical framework, but also demonstrates the principled connections between meaning and, on the one hand, the syntactic complementation patterns of verbs, and on the other hand, patterns of conceptualization in the human mind. Such an endeavor is entirely compatible with the tendency towards lexically-based approaches in modern grammatical theory. This shift in paradigm is explored in the first part of Chapter 1 of the book, where developments in lexicology and lexicography are reviewed in relation to their impact on linguistic theorising. Matters of psychological adequacy and computational implementation are also discussed.
  • The World Meets the Body: Sociocultural Aspects of Terminological Metaphor
    The World Meets the Body: Sociocultural Aspects of Terminological Metaphor. José Manuel Ureña and Pamela Faber, Department of Translation and Interpreting, University of Granada, Spain. Proceedings of the 37th Annual Meeting of the Berkeley Linguistics Society, 37(37) (2013), pp. 359-374. ISSN 2377-1666. Peer reviewed. Permalink: https://escholarship.org/uc/item/9b38g5k2 Editors: Chundra Cathcart, I-Hsuan Chen, Greg Finley, Shinae Kang, Clare S. Sandy, and Elise Stickles. Introduction. The experientialist view of the embodied mind is condensed in Gibbs' (1999: 155) affirmation that cognition is what happens when the body meets the world. Yet, it is also necessary to ask what happens when the world meets the body. In our opinion, conceptual metaphor analysis, whatever the knowledge field, is traceable to both sensory-motor inferences and cultural factors. On this basis, this paper analyzes a number of resemblance metaphor term pairs in English and Spanish, which were extracted from a text corpus of marine biology academic journals.
  • Specialized Knowledge Representation and the Parameterization of Context
    Hypothesis and Theory, published 23 February 2016, doi: 10.3389/fpsyg.2016.00196. Specialized Knowledge Representation and the Parameterization of Context. Pamela Faber and Pilar León-Araúz, Department of Translation and Interpreting, University of Granada, Granada, Spain. Edited by: Marco Cruciani, University of Trento, Italy. Reviewed by: Elisabetta Lalumera, Università degli Studi di Milano-Bicocca, Italy; Rita Temmerman, Vrije Universiteit Amsterdam, Belgium; Pius Ten Hacken, Universität Innsbruck, Austria. Though instrumental in numerous disciplines, context has no universally accepted definition. In specialized knowledge resources it is timely and necessary to parameterize context with a view to more effectively facilitating knowledge representation, understanding, and acquisition, the main aims of terminological knowledge bases. This entails distinguishing different types of context as well as how they interact with each other. This is not a simple objective to achieve despite the fact that specialized discourse does not have as many contextual variables as those in general language (i.e., figurative meaning, irony, etc.). Even in specialized text, context is an extremely complex concept. In fact, contextual information can be specified in terms of scope or according to the type of information conveyed. It can be a textual excerpt or a whole document; a pragmatic convention or a whole culture; a concrete situation or a prototypical scenario. Although these versions of context are useful for the users of terminological resources, such resources rarely support context modeling. In this paper, we propose a taxonomy of context primarily based on scope (local and global) and further divided into syntactic, semantic, and pragmatic facets. These facets cover the specification of different types of terminological information, such as predicate-argument structure, collocations, semantic relations, term variants, grammatical and lexical cohesion, communicative situations, subject fields, and cultures.
  • EcoLexicon: New Features and Challenges
    EcoLexicon: New Features and Challenges. Pamela Faber, Pilar León-Araúz, Arianne Reimerink. Department of Translation and Interpreting, Universidad de Granada, Buensuceso 11, 18071 Granada (Spain). Abstract. EcoLexicon is a terminological knowledge base (TKB) on the environment with terms in six languages: English, French, German, Modern Greek, Russian, and Spanish. It is the practical application of Frame-based Terminology, which uses a modified version of Fillmore’s frames coupled with premises from Cognitive Linguistics to configure specialized domains on the basis of definitional templates and create situated representations for specialized knowledge concepts. The specification of the conceptual structure of (sub)events and the description of the lexical units are the result of a top-down and bottom-up approach that extracts information from a wide range of resources. This includes the use of corpora, the factorization of definitions from specialized resources and the extraction of conceptual relations with knowledge patterns. Similarly to a specialized visual thesaurus, EcoLexicon provides entries in the form of semantic networks that specify relations between environmental concepts. All entries are linked to a corresponding (sub)event and conceptual category. In other words, the structure of the conceptual, graphical, and linguistic information relative to entries is based on an underlying conceptual frame. Graphical information includes photos, images, and videos, whereas linguistic information specifies not only the grammatical category of each term, but also phraseological and contextual information. The TKB also provides access to the specialized corpus created for its development and a search engine to query it.
  • Assessing EcoLexiCAT: Terminology Enhancement and Post-editing
    Proceedings of eLex 2019. Assessing EcoLexiCAT: Terminology Enhancement and Post-editing. Pilar León-Araúz, Arianne Reimerink, Pamela Faber. Department of Translation and Interpreting, University of Granada. E-mail: {pleon, arianne, pfaber}@ugr.es. Abstract. EcoLexiCAT is a freely available online application, which integrates all features of the professional translation workflow in a stand-alone interface where a source text is interactively enriched with terminological information (i.e. definitions, translations, images, compound terms, corpus access, etc.) from different external resources. EcoLexiCAT is powered by MateCat and the external sources include EcoLexicon, BabelNet, the EcoLexicon English Corpus (powered by Sketch Engine) and IATE, as well as other common resources (e.g. Wordreference, Wikipedia, Linguee, etc.). Machine translation (MT) can also be optionally added. In order to evaluate the functionalities and performance of the tool, two experiments were carried out. In the first, one subject group used EcoLexiCAT and the other used MateCat, acting as the control group. In the second, both subject groups used EcoLexiCAT and only one used MT. Both experiments shed interesting light on user behaviour, performance and satisfaction while using EcoLexiCAT. Keywords: EcoLexiCAT; CAT tools; terminology management; MT post-editing. 1. Introduction: EcoLexiCAT. Today, machine translation (MT) and computer-assisted translation (CAT) are a crucial part of the professional translation workflow. Nevertheless, the post-editing of MT output has only recently started to become more widely accepted, and terminology management is often not seamlessly integrated into the translation process. As a possible solution to this problem in the field of environmental translation we developed EcoLexiCAT, a terminology-enhanced CAT tool that provides easy access to domain-specific terminological knowledge in context and MT (León-Araúz, Reimerink & Faber, 2017; León-Araúz & Reimerink, 2018; León-Araúz, Reimerink & Faber, 2019).