TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION

MAY 23 – 28, 2016 GRAND HOTEL BERNARDIN CONFERENCE CENTRE Portorož , SLOVENIA

WORKSHOP ABSTRACTS

Editors: please refer to the list of editors of each individual workshop.

Assistant Editors: Sara Goggi, Hélène Mazo

The LREC 2016 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

LREC 2016, TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION

Title: LREC 2016 Workshop Abstracts

Distributed by: ELRA – European Language Resources Association 9, rue des Cordelières 75013 Paris France

Tel.: +33 1 43 13 33 33 Fax: +33 1 43 13 33 30 www.elra.info and www.elda.org Email: [email protected] and [email protected]

ISBN 978-2-9517408-9-1 EAN 9782951740891

TABLE OF CONTENTS

W12 Joint 2nd Workshop on Language & Ontology (LangOnto2) & Terminology & Knowledge Structures (TermiKS) …… 1
W5 Cross-Platform Text Mining & Natural Language Processing Interoperability …… 12
W39 9th Workshop on Building & Using Comparable Corpora (BUCC 2016) …… 23
W19 Collaboration & Computing for Under-Resourced Languages (CCURL 2016) …… 31
W31 Improving Social Inclusion using NLP: Tools & Resources …… 43
W9 Resources & ProcessIng of linguistic & extra-linguistic Data from people with various forms of cognitive / psychiatric impairments (RaPID) …… 49
W25 Emotion & Sentiment Analysis (ESA 2016) …… 58
W28 2nd Workshop on Visualization as added value in the development, use & evaluation of LRs (VisLR II) …… 68
W15 Text Analytics for Cybersecurity & Online Safety (TA-COS) …… 75
W13 5th Workshop on Linked Data in Linguistics (LDL-2016) …… 83
W10 Lexicographic Resources for Human Language Technology (GLOBALEX 2016) …… 93
W18 Translation Evaluation: From Fragmented Tools & Data Sets to an Integrated Ecosystem? …… 106
W16 2nd Workshop on Arabic Corpora & Processing Tools (OSACT2) …… 116
W38 3rd Workshop on Indian Language Data Resource & Evaluation (WILDRE-3) …… 123
W40 ETHics In Corpus collection, Annotation and Application (ETHI-CA² 2016) …… 136
W1 Sideways 2016 …… 148
W7 Emotions, Metaphors, Ontology & Terminology during disaster (EMOT) …… 155
W35 Multimodal Corpora 2016 (MMC 2016) …… 163
W8 7th Workshop on the Representation & Processing of Sign Languages …… 172
W4 12th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-12) …… 186
W34 Natural Language Processing for Translation Memories 2016 …… 194
W41 Controlled Language Applications …… 201
W23 Workshop on Research Results Reproducibility and Resources Citation in Science and Technology of Language (4REAL) …… 207
W43 Novel Incentives for Collecting Data & Annotation from People …… 214
W36 Normalisation & Analysis of Social Media Texts (NormSoMe) …… 221
W6 4th Workshop on Challenges in the Management of Large Corpora (CMLC-4) …… 228
W27 Just Talking: Casual Talk among Humans and Machines …… 235
W30 Workshop on Collecting & Generating Resources for Chatbots & Conversational Agents Development & Evaluation (RE-WOCHAT 2016) …… 242
W11 Quality Assessment for Text Simplification (QATS) …… 249


Language and Ontology (LangOnto2) & Terminology and Knowledge Structures (TermiKS)

23 May 2016

ABSTRACTS

Editors:

Larisa Grčić Simeunović, Špela Vintar, Fahad Khan, Pilar León Araúz, Pamela Faber, Francesca Frontini, Artemis Parvizi, Christina Unger

Workshop Programme

Session 1 09:00 – 09:30 – Introductory Talk Pamela Faber, The Cultural Dimension of Knowledge Structures

09:30 – 10:10 – Introductory Talk John McCrae, Putting ontologies to work in NLP: The lemon model and its applications

10:10 – 10:30 Gregory Grefenstette, Karima Rafes, Transforming Wikipedia into an Ontology-based Information Retrieval Search Engine for Local Experts using a Third-Party Taxonomy

10:30 – 11:00 Coffee break

Session 2 11:00 – 11:30 Juan Carlos Gil-Berrozpe, Pamela Faber, Refining Hyponymy in a Terminological Knowledge Base

11:30 – 12:00 Pilar León-Araúz, Arianne Reimerink, Evaluation of EcoLexicon Images

12:00 – 12:30 Špela Vintar, Larisa Grčić Simeunović, The Language-Dependence of Conceptual Structures in Karstology

12:30 – 13:00 Takuma Asaishi, Kyo Kageura, Growth of the Terminological Networks in Junior-high and High School Textbooks

13:00 – 14:00 Lunch break

Session 3 14:00 – 14:20 Gabor Melli, Semantically Annotated Concepts in KDD’s 2009-2015 Abstracts

14:20 – 14:40 Bhaskar Sinha, Somnath Chandra, Neural Network Based Approach for Relational Classification for Ontology Development in Low Resourced Indian Language

14:40 – 15:00 Livy Real, Valeria de Paiva, Fabricio Chalub, Alexandre Rademaker, Gentle with the Gentilics

15:00 – 15:30 Dante Degl’Innocenti, Dario De Nart and Carlo Tasso, The Importance of Being Referenced: Introducing Referential Semantic Spaces

15:30 – 16:00 Haizhou Qu, Marcelo Sardelich, Nunung Nurul Qomariyah and Dimitar Kazakov, Integrating Time Series with Social Media Data in an Ontology for the Modelling of Extreme Financial Events


16:00 – 16:30 Coffee break

Session 4 16:30 – 16:50 Christian Willms, Hans-Ulrich Krieger, Bernd Kiefer, ×-Protégé: An Ontology Editor for Defining Cartesian Types to Represent n-ary Relations

16:50 – 17:10 Alexsandro Fonseca, Fatiha Sadat and François Lareau, A Lexical Ontology to Represent Lexical Functions

17:10 – 17:40 Ayla Rigouts Terryn, Lieve Macken and Els Lefever, Dutch Hypernym Detection: Does Decompounding Help?

17:40 – 18:30 Discussion and Closing

Workshop Organizers

Fahad Khan  Istituto di Linguistica Computazionale “A. Zampolli” - CNR, Italy
Špela Vintar  University of Ljubljana, Slovenia
Pilar León Araúz  University of Granada, Spain
Pamela Faber  University of Granada, Spain
Francesca Frontini  Istituto di Linguistica Computazionale “A. Zampolli” - CNR, Italy
Artemis Parvizi  Oxford University Press
Larisa Grčić Simeunović  University of Zadar, Croatia
Christina Unger  University of Bielefeld, Germany

Workshop Programme Committee

Guadalupe Aguado-de-Cea  Universidad Politécnica de Madrid, Spain
Amparo Alcina  Universitat Jaume I, Spain
Nathalie Aussenac-Gilles  IRIT, France
Caroline Barrière  CRIM, Canada
Maja Bratanić  Institute of Croatian Language and Linguistics, Croatia
Paul Buitelaar  Insight Centre for Data Analytics, Ireland
Federico Cerutti  Cardiff University
Béatrice Daille  University of Nantes, France
Aldo Gangemi  LIPN University, ISTC-CNR Rome
Eric Gaussier  University of Grenoble, France
Emiliano Giovannetti  ILC-CNR
Ulrich Heid  University of Hildesheim, Germany
Caroline Jay  University of Manchester
Kyo Kageura  University of Tokyo, Japan
Hans-Ulrich Krieger  DFKI GmbH
Roman Kutlak  Oxford University Press
Marie-Claude L'Homme  OLST, Université de Montréal, Canada
Monica Monachini  ILC-CNR
Mojca Pecman  University of Paris Diderot, France
Silvia Piccini  ILC-CNR
Yuan Ren  Microsoft China
Fabio Rinaldi  Universität Zürich, Switzerland
Irena Spasic  University of Cardiff
Markel Vigo  University of Manchester


Introductory talk I Monday 23 May, 9:00 – 9:30

The Cultural Dimension of Knowledge Structures

Pamela Faber

Frame-Based Terminology (FBT) is a cognitive approach to terminology, which directly links specialized knowledge representation to cognitive linguistics and cognitive psychology (Faber 2011, 2012, 2014). More specifically, the FBT approach applies the notion of frame as a schematization of a knowledge structure, which is represented at the conceptual level and held in long-term memory and which emphasizes both hierarchical and non-hierarchical conceptual relations. Frames also link elements and entities associated with general human experience or with a particular culturally embedded scene or situation. Culture is thus a key element in knowledge structures. Cultural frames are directly connected to what has been called ‘design principle’ (O’Meara and Bohnemeyer 2008), ‘template’, ‘model’, ‘schema’ or ‘frame’ (Brown 2008; Burenhult 2008; Cablitz 2008; Levinson 2008). In EcoLexicon, a frame is a representation that integrates various ways of combining semantic generalizations about one category or a group of categories. In contrast, a template is a representational pattern for individual members of the same category. Burenhult and Levinson (2008: 144) even propose the term ‘semplate’, which refers to the cultural themes or linguistic patterns that are imposed on the environment to create, coordinate, subcategorize, or contrast categories. Although rarely explored, cultural situatedness has an impact on semantic networks, which reflect differences between terms used in closely related language cultures. Nevertheless, the addition of a cultural component to term meaning is considerably more complicated than the inclusion of terms that designate new concepts specific to other cultures. One reason for this is that certain conceptual categories are linked, for example, to the habitat of the speakers of a language and derive their meaning from the geographic and meteorological characteristics of a given area or region. This paper explains the need for a typology of cultural frames or profiles linked to the most prominent semantic categories. As an example, the terms for different types of local wind are analyzed and a set of meaning parameters is established that structures and enriches the cultural schemas defining meteorological concepts. These parameters highlight the cultural dimension of wind as a meteorological force.

Introductory talk II Monday 23 May, 9:30 – 10:10

Putting ontologies to work in NLP: The lemon model and its applications

John McCrae

From the early development of lexicons such as WordNet it has been a goal to record rich information about the meanings of words and how they relate. In particular, there has been an ambition to provide full and formal definitions of concepts so that they can be clearly disambiguated and understood. Moreover, it is important to be able to represent the meaning of ontological concepts relative to how they are expressed in natural language, and this syntax-ontology mapping is still poorly understood. To this end, we developed the lemon model, firstly in the Monnet project and secondly in the context of the W3C OntoLex Community Group, and I will discuss how this model can express the mapping of words into ontological contexts. This model has found application in several practical areas, which I will describe in detail, but has not yet achieved the goal of unifying ontological reasoning with natural language processing, and I will describe the next steps I envision to achieve this goal.

Session 1 Monday 23 May, 10:10 – 10:30

Transforming Wikipedia into an Ontology-based Information Retrieval Search Engine for Local Experts using a Third-Party Taxonomy

Gregory Grefenstette, Karima Rafes

Wikipedia is widely used for finding general information about a wide variety of topics. Its vocation is not to provide local information. For example, it provides plot, cast, and production information about a given movie, but not showing times in your local movie theatre. Here we describe how we can connect local information to Wikipedia, without altering its content. The case study we present involves finding local scientific experts. Using a third-party taxonomy, independent from Wikipedia’s category hierarchy, we index information connected to our local experts, present in their activity reports, and we re-index Wikipedia content using the same taxonomy. The connections between Wikipedia pages and local expert reports are stored in a relational database, accessible through a public SPARQL endpoint. A Wikipedia gadget (or plugin), activated by the interested user, accesses the endpoint as each Wikipedia page is accessed. An additional tab on the Wikipedia page allows the user to open up a list of teams of local experts associated with the subject matter of the Wikipedia page. The technique, though presented here as a way to identify local experts, is generic, in that any third-party taxonomy can be used in this way to connect Wikipedia to any non-Wikipedia data source.
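To illustrate the kind of lookup such a gadget performs, the sketch below queries a SPARQL endpoint for expert teams linked to the topic of the current Wikipedia page. It is a minimal illustration rather than the system described above: the endpoint URL, graph layout and property names (ex:title, ex:topic, ex:hasExpertTeam) are hypothetical placeholders.

# Minimal sketch: ask a (hypothetical) SPARQL endpoint for expert teams
# associated with a Wikipedia page title via a shared taxonomy topic.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/sparql"  # placeholder endpoint URL

def experts_for_page(page_title: str) -> list[str]:
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setReturnFormat(JSON)
    # ex:title, ex:topic and ex:hasExpertTeam are illustrative names only.
    sparql.setQuery(f"""
        PREFIX ex: <http://example.org/schema#>
        SELECT DISTINCT ?team WHERE {{
            ?page  ex:title         "{page_title}" ;
                   ex:topic         ?topic .
            ?topic ex:hasExpertTeam ?team .
        }}
    """)
    results = sparql.query().convert()
    return [b["team"]["value"] for b in results["results"]["bindings"]]

if __name__ == "__main__":
    print(experts_for_page("Machine learning"))

A gadget would issue the equivalent query from the browser and render the returned teams in an additional tab on the page.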

Session 2 Monday 23 May, 11:00 – 11:30

Refining Hyponymy in a Terminological Knowledge Base

Juan Carlos Gil-Berrozpe, Pamela Faber

Hyponymy, or the type_of relation, is the backbone of all hierarchical semantic configurations. Although recent work has focused on other relations such as meronymy and causality, hyponymy maintains its special status since it implies property inheritance. As reflected in EcoLexicon, a multilingual terminological knowledge base on the environment, conceptual relations are a key factor in the design of an internally and externally coherent concept system. Terminological knowledge bases can strengthen their coherence and dynamicity when the set of conceptual relations is wider than the typical generic-specific and part-whole relations, which entails refining both the hyponymy and meronymy relations. This paper analyzes how hyponymy is built in the EcoLexicon knowledge base and discusses the problems that can ensue when the type_of relation is too simplistically defined or systematically represented. As a solution, this paper proposes the following: (i) the correction of property inheritance; (ii) the specification of different subtypes of hyponymy; (iii) the creation of ‘umbrella concepts’. This paper focuses on the first two solutions and proposes a set of parameters that can be used to decompose hyponymy.

Session 2 Monday 23 May, 11:30 – 12:00

Evaluation of EcoLexicon Images

Pilar León-Araúz, Arianne Reimerink

The multimodal knowledge base EcoLexicon includes images to enrich conceptual description and enhance knowledge acquisition. These images have been selected according to the conceptual propositions contained in the definitional templates of concepts. Although this ensures coherence and systematic selection, the images are related to each specific concept and are not annotated according to other possible conceptual propositions contained in the image. Our aim is to create a separate repository for images, annotate all knowledge contained in each one of them and then link them to all concept entries that contain one or more of these propositions in their definitional template. This would not only improve the internal coherence of EcoLexicon but it would also improve the reusability of the selected images and avoid duplicating workload. The first step in this process and the objective of the research here described is to evaluate the images already contained in EcoLexicon to see if they were adequately selected in the first place, how knowledge is conveyed through the morphological features of the image and if they can be reused for other concept entries. This analysis has provided preliminary data to further explore how concept type, conceptual relations, and propositions affect the relation between morphological features and image types chosen for visual knowledge representation.

Session 2 Monday 23 May, 12:00 – 12:30

The Language-Dependence of Conceptual Structures in Karstology

Špela Vintar, Larisa Grčić Simeunović

We explore definitions in the domain of karstology from a cross-language perspective with the aim of comparing the cognitive frames underlying defining strategies in Croatian and English. The experiment involved the semi-automatic extraction of definition candidates from our corpora, manual selection of valid examples, identification of functional units and semantic annotation with conceptual categories and relations. Our results are in line with related frame-based approaches in that they clearly demonstrate the multidimensionality of concepts and the key factors affecting the choice of defining strategy, e.g. the concept category, its place in the conceptual system of the domain and the communicative setting. Our approach extends related work by applying the frame-based view to a new language pair and a new domain, and by performing a more detailed semantic analysis. The most interesting finding, however, concerns the cross-language comparison; it seems that definition frames are language- and/or culture-specific in that certain conceptual structures may exist in one language but not in the other. These results imply that a cross-linguistic analysis of conceptual structures is an essential step in the construction of knowledge bases, ontologies and other domain representations.


Session 2 Monday 23 May, 12:30 – 13:00

Growth of the Terminological Networks in Junior-high and High School Textbooks

Takuma Asaishi, Kyo Kageura

In this paper, we analyze the mode of deployment of concepts when reading through textbooks. In order to do so, we construct terminological networks guided by the discourse, with vertices representing index terms and edges representing the co-occurrence of index terms in a paragraph. We observe the growth of the terminological networks in junior-high and high school Japanese textbooks in four domains (physics, chemistry, biology and earth science). Our primary results are as follows: (1) The rate of occurrence of new terms in high school textbooks is faster than in junior-high school textbooks regardless of the domain. (2) While several connected components grow independently in junior-high school textbooks, the largest component remains dominant in high school textbooks, regardless of the domain. (3) The average number of terms that are directly connected to a term increases, and many more terms are directly connected in high school textbooks than in junior-high school textbooks in all domains except physics. In addition, terms are indirectly connected through a few terms on average and are strongly connected in partial groups throughout the text, regardless of the domain and school level. (4) The degree of centralization (i.e., few terms are connected to many terms directly or facilitate indirect connection between terms) deteriorates regardless of the domain and school level.
Keywords: complex network, terminology, textbook, knowledge, network analysis
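A terminological network of the kind analysed here can be built with standard graph tooling. The following sketch is a minimal illustration, not the authors' implementation: paragraphs are given as lists of index terms, vertices are terms, edges mark co-occurrence within a paragraph, and two of the statistics discussed above (size of the largest connected component, average degree) are reported.

# Minimal sketch (illustrative only): build a terminological network whose
# vertices are index terms and whose edges link terms co-occurring in a
# paragraph, then report simple connectivity statistics.
from itertools import combinations
import networkx as nx

# Toy input: each paragraph is represented by the index terms it contains.
paragraphs = [
    ["force", "mass", "acceleration"],
    ["force", "friction"],
    ["cell", "nucleus"],
]

G = nx.Graph()
for terms in paragraphs:
    for u, v in combinations(set(terms), 2):
        G.add_edge(u, v)

largest = max(nx.connected_components(G), key=len)
avg_degree = sum(dict(G.degree()).values()) / G.number_of_nodes()
print(f"terms: {G.number_of_nodes()}, largest component: {len(largest)}, "
      f"average degree: {avg_degree:.2f}")

Tracking these statistics paragraph by paragraph, rather than on the full text at once, yields the growth curves that the paper compares across domains and school levels.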

Session 3 Monday 23 May, 14:00 – 14:20

Semantically Annotated Concepts in KDD’s 2009-2015 Abstracts

Gabor Melli

We introduce a linguistic resource composed of a semantically annotated corpus and a lexicalized ontology that are interlinked on mentions of concepts and entities. The corpus contains the paper abstracts from within the proceedings of ACM’s SIGKDD conferences for the years 2009 through 2015. Each abstract was internally annotated to identify the concepts mentioned within the text. Then, where possible, each mention was linked to the appropriate concept node in the ontology focused on data science topics. Together they form one of the few semantic resources within a subfield of computing science. The joint dataset enables tasks such as temporal modeling of concepts over time, and the development of semantic annotation methods for documents with a large proportion of mid-level concept mentions. Finally, the resource also prepares for the transition into semantic navigation of computing science research publications. Both resources are publicly available at gabormelli.com/Projects/kdd/.


Session 3 Monday 23 May, 14:20 – 14:40

Neural Network Based Approach for Relational Classification for Ontology Development in Low Resourced Indian Language

Bhaskar Sinha, Somnath Chandra

Processing natural language for text-based information, especially for low-resource languages, is a challenging task. Feature extraction and classification of domain-specific terms using Natural Language Processing (NLP) techniques such as pre-processing and semantic analysis provide substantial support for better evaluation of the accuracy of natural-language-based electronic databases. In this paper we explore a neural-network-based machine learning approach to relation extraction for Indian languages. A Convolutional Neural Network (CNN) learning method is applied to semantic relational information extracted from domain-specific terms, matched against the multilingual IndoWordNet database. The results of the machine-learning-based technique show a significant improvement over other classification methods such as SVM, Normalized Web Distance (NWD) and statistical evaluation methods. The objective of using this technique along with semantic web technology is to initiate a proof of concept for ontology generation by extraction and classification of relational information from IndoWordNet. This paper also highlights domain-specific challenges in developing ontologies in Indian languages.

Session 3 Monday 23 May, 14:40 – 15:00

Gentle with the Gentilics

Livy Real, Valeria de Paiva, Fabricio Chalub, Alexandre Rademaker

To get from ‘Brasília is the Brazilian capital’ to ‘Brasília is the capital of Brazil’ is obvious for a human, but it requires effort from a Natural Language Processing system, which should be helped by a lexical resource to retrieve this information. Here, we investigate how to deal with this kind of lexical information related to location entities, focusing on how to encode data about demonyms and gentilics in the Portuguese OpenWordnet-PT.
Keywords: lexical resources, ontology, gentilics, Wordnet

Session 3 Monday 23 May, 15:00 – 15:30

The Importance of Being Referenced: Introducing Referential Semantic Spaces

Dante Degl’Innocenti, Dario De Nart and Carlo Tasso

The Web is constantly growing and, to cope with its ever-increasing expansion, semantic technologies are an extremely valuable ally. The major drawback of such technologies, however, is that providing a formal model of a domain is a time-consuming task that requires expert knowledge and, on the other hand, extracting semantic data from text in an automatic way, although possible, is still extremely hard since it requires extensive human-annotated training corpora and non-trivial document pre-processing. In this work we introduce a vector space representation of concept associations that can be built in an unsupervised way with minimal pre-processing effort and allows for associative reasoning supporting word sense disambiguation and related entity retrieval tasks.

Session 3 Monday 23 May, 15:30 – 16:00

Integrating Time Series with Social Media Data in an Ontology for the Modelling of Extreme Financial Events

Haizhou Qu, Marcelo Sardelich, Nunung Nurul Qomariyah and Dimitar Kazakov

This article describes a novel dataset aiming to provide insight into the relationship between stock market prices and news on social media, such as Twitter. While several financial companies advertise that they use Twitter data in their decision process, it has been hard to demonstrate whether online postings can genuinely affect market prices. By focussing on an extreme financial event that unfolded over several days and had dramatic and lasting consequences, we have aimed to provide data for a case study that could address this question. The dataset contains the stock market price of Volkswagen, Ford and the S&P 500 index for the period immediately preceding and following the discovery that Volkswagen had found a way to manipulate in its favour the results of pollution tests for their diesel engines. We also include a large number of relevant tweets from this period alongside key phrases extracted from each message, with the intention of providing material for subsequent sentiment analysis. All data is represented as an ontology in order to facilitate its handling, and to allow the integration of other relevant information, such as the link between a subsidiary company and its holding or the names of senior management and their links to other companies.

Session 4 Monday 23 May, 16:30 – 16:50

×-Protégé: An Ontology Editor for Defining Cartesian Types to Represent n-ary Relations

Christian Willms, Hans-Ulrich Krieger, Bernd Kiefer

Arbitrary n-ary relations (n ≥ 1) can, in principle, be realized through binary relations obtained by a reification process which introduces new individuals to which the additional arguments are linked via “accessor” properties. Modern ontologies which employ standards such as RDF and OWL have mostly obeyed this restriction, but have struggled with it nevertheless. In (Krieger and Willms, 2015), we laid the foundations for a theory-agnostic extension of RDFS and OWL, and over the last year we have implemented an extension of Protégé, called ×-Protégé, which supports the definition of Cartesian types to represent n-ary relations and relation instances. Not only do we keep the distinction between the domain and the range of an n-ary relation, but we also introduce so-called extra arguments, which can be seen as position-oriented unnamed annotation properties and which are accessible to entailment rules. As the direct representation of n-ary relations abolishes RDF triples, we have backed up ×-Protégé with the semantic repository and entailment engine HFC, which supports tuples of arbitrary length. ×-Protégé is programmed in Java and is made available under the Mozilla Public License.
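For readers unfamiliar with the reification workaround mentioned at the start of the abstract, the sketch below shows it in generic RDF terms. It illustrates that standard pattern, not the ×-Protégé or HFC machinery, and all class and property names are invented for the example.

# Minimal sketch of classical reification: the ternary fact
# employs(alice, acme, 2015) becomes a new individual whose arguments are
# attached via "accessor" properties. Names are illustrative only.
from rdflib import Graph, Namespace, BNode, Literal, RDF

EX = Namespace("http://example.org/schema#")
g = Graph()

employment = BNode()                          # new individual standing for the n-ary fact
g.add((employment, RDF.type, EX.Employment))
g.add((employment, EX.employee, EX.alice))    # accessor property for argument 1
g.add((employment, EX.employer, EX.acme))     # accessor property for argument 2
g.add((employment, EX.since, Literal(2015)))  # accessor property for the extra argument

print(g.serialize(format="turtle"))

×-Protégé instead lets such a relation be stated directly over a Cartesian type, avoiding the auxiliary individual, which is why the backing store (HFC) must handle tuples longer than triples.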


Session 4 Monday 23 May, 16:50 – 17:10

A Lexical Ontology to Represent Lexical Functions

Alexsandro Fonseca, Fatiha Sadat, François Lareau

Lexical functions are a formalism that describes the combinatorial, syntactic and semantic relations among individual lexical units in different languages. Those relations include both paradigmatic relations, i.e. vertical or “in absence”, such as synonymy, antonymy and meronymy, and syntagmatic relations, i.e. horizontal or “in presence”, such as intensification (deeply committed), confirmative (valid argument) and support verbs (give an order, subject to an interrogation). We present in this paper a new lexical ontology, called Lexical Function Ontology (LFO), as a model to represent lexical functions. The aim is for our ontology to be combined with other lexical ontologies, such as the Lexical Model for Ontologies (lemon) and the Lexical Markup Framework (LMF), and to be used for the transformation of lexical networks into the semantic web formats, enriched with the semantic information given by the lexical functions, such as the representation of syntagmatic relations (e.g. collocations) usually absent from lexical networks.

Session 4 Monday 23 May, 17:10 – 17:40

Dutch Hypernym Detection: Does Decompounding Help?

Ayla Rigouts Terryn, Lieve Macken and Els Lefever

This research presents experiments carried out to improve the precision and recall of Dutch hypernym detection. To do so, we applied a data-driven semantic relation finder that starts from a list of automatically extracted domain-specific terms from technical corpora, and generates a list of hypernym relations between these terms. As Dutch technical terms often consist of compounds written in one orthographic unit, we investigated the impact of a decompounding module on the performance of the hypernym detection system. In addition, we also improved the precision of the system by designing filters taking into account statistical and linguistic information. The experimental results show that both the precision and recall of the hypernym detection system improved, and that the decompounding module is especially effective for hypernym detection in Dutch.

Cross-Platform Text Mining and Natural Language Processing Interoperability

23 May 2016

ABSTRACTS

Editors:

Richard Eckart de Castilho, Sophia Ananiadou, Thomas Margoni, Wim Peters, Stelios Piperidis


Workshop Programme

Opening Session 1

09.00 – 09.10 – Introduction

09.10 – 10.00 – Alessandro di Bari, Interoperability — Can a model driven approach help to overcome organizational constraints? (invited talk)

Session 1: Lightning talks part I

10.00 – 10.30 – Petr Knoth and Nancy Pontika, Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?

Dominique Estival, Alveo: making data accessible through a unified interface – a pipe-dream?

Nancy Ide, Keith Suderman, James Pustejovsky, Marc Verhagen, Christopher Cieri and Eric Nyberg, The Language Application Grid

Mouhamadou Ba and Robert Bossy, Interoperability of corpus processing workflow engines: the case of AlvisNLP/ML in OpenMinTeD

Sven Hodapp, Sumit Madan, Juliane Fluck and Marc Zimmermann, Integration of UIMA Text Mining Components into an Event-based Asynchronous Microservice Architecture

Richard Eckart de Castilho, Interoperability = f (community, division of labour)

10.30 – 11.00 – Coffee break

Session 2: Lightning talks part II

11.00 – 11.45 – John P. McCrae, Georgeta Bordea and Paul Buitelaar, Linked Data and Text Mining as an Enabler for Reproducible Research

Wim Peters, Tackling Resource Interoperability: Principles, Strategies and Models

Lana Yeganova, Sun Kim, Grigory Balasanov, Kristin Bennett, Haibin Liu and W. John Wilbur, The DDINCBI Corpus — Towards a Larger Resource for Drug-Drug Interactions in PubMed

Rodrigo Agerri, Itziar Aldabe, Egoitz Laparra, German Rigau, Antske Fokkens, Paul Huijgen, Marieke van Erp, Ruben Izquierdo Bevia, Piek Vossen, Anne-Lyse Minard and Bernardo Magnini, Multilingual Event Detection using the NewsReader pipelines

Shashank Sharma, PYKL Srinivas and Rakesh Balabantaray, Emotion Detection using Online Machine Learning Method and TLBO on Mixed Script

Gil Francopoulo, Joseph Mariani and Patrick Paroubek, Text mining for notability computation

Thomas Margoni and Giulia Dore, Why We Need a Text and Data Mining Exception (but it is not enough)

Penny Labropoulou, Stelios Piperidis, Thomas Margoni, Legal Interoperability Issues in the Framework of the OpenMinTeD Project: a Methodological Overview

Hege van Dijke and Stelios Piperidis, eInfrastructures: crossing boundaries, discovering common work, achieving common goals

Session 3: Discussion rounds I

11.45 – 12.00 – Constitution of breakout groups

12.00 – 13.00 – Breakout groups – suggested topics:
- Discovery of content and access to content
- Interoperability across tools, frameworks, and platforms
- Interoperability across language resources and knowledge resources
- Legal and policy issues
- Interoperability in multi-lingual and cross-lingual scenarios
- eInfrastructures

13.00 – 14.00 – Lunch break

Session 4: Discussion rounds II

14.00 – 16.00 – Breakout groups (continued)

16.00 – 16.30 – Coffee break

Session 5: Closing session

16.30 – 18.00 – Presentation of the breakout group results Plenary discussion

18.00 – End of workshop


Organising Committee

Richard Eckart de Castilho  Technische Universität Darmstadt, Germany
Sophia Ananiadou  University of Manchester, UK
Thomas Margoni  University of Stirling, UK
Wim Peters  University of Sheffield, UK
Stelios Piperidis  ILSP/ARC, Greece

Programme Committee

Dominique Estival  Western Sydney University, Australia
Iryna Gurevych  Technische Universität Darmstadt, Germany
Jens Grivolla  Universitat Pompeu Fabra, Spain
John Philip McCrae  National University of Ireland, Galway, Ireland
Joseph Mariani  LIMSI/CNRS, France
Kalina Bontcheva  University of Sheffield, UK
Lucie Guibault  University of Amsterdam, The Netherlands
Menzo Windhouwer  Meertens Institute, The Netherlands
Nancy Ide  Vassar College, USA
Natalia Manola  ILSP/ARC, Greece
Nicolas Hernandez  University of Nantes, France
Pei Chen  Wired Informatics, USA
Peter Klügl  Averbis GmbH, Germany
Rafal Rak  UberResearch and University of Manchester, UK
Renaud Richardet  EPFL, Switzerland
Robert Bossy  INRA, France
Thilo Götz  IBM, Germany
Steven Bethard  University of Alabama at Birmingham, USA
Torsten Zesch  University of Duisburg-Essen, Germany
Yohei Murakami  Kyoto University, Japan


Preface

Recent years have witnessed an upsurge in the quantity of available digital research data, offering new insights and opportunities for improved understanding. Following advances in Natural Language Processing (NLP), text and data mining (TDM) is emerging as an invaluable tool for harnessing the power of structured and unstructured content and data. Hidden and new knowledge can be discovered by using TDM at multiple levels and in multiple dimensions. However, text mining and NLP solutions are not easy to discover and use, nor are they easy to combine for end users.

Multiple efforts are being undertaken world-wide to create TDM and NLP platforms. These platforms are targeted at specific research communities, typically researchers in a particular location, e.g. OpenMinTeD, CLARIN (Europe), ALVEO (Australia), or LAPPS (USA). All of these platforms face similar problems in the following areas: discovery of content and analytics capabilities, integration of knowledge resources, legal and licensing aspects, data representation, and analytics workflow specification and execution.

The goal of cross-platform interoperability raises many problems. At the level of content, metadata, language resources, and text annotations, we use different data representations and vocabularies. At the level of workflows, there is no uniform process model that allows platforms to smoothly interact. The licensing status of content, resources, analytics, and of the output created by a combination of such licenses is difficult to determine, and there is currently no way to reliably exchange such information between platforms. User identity management is often tightly coupled to the licensing requirements and is likewise an impediment to cross-platform interoperability.

R. Eckart de Castilho, S. Ananiadou, T. Margoni, W. Peters, S. Piperidis May 2016


Abstracts

Invited talk Monday, 23 May 2016, 9.10 – 10.00

Interoperability — Can a model driven approach help to overcome organizational constraints?

Alessandro di Bari

Building Natural Language Processing applications and services is a complex task by its very nature, but also because of the variety of data formats and frameworks involved. We have to deal with several standards, different pipeline management tools and knowledge representation models. Keeping everything at the pure implementation level can be overwhelming and very costly. This is even more important in the healthcare field, where security, privacy and compliance in general play a key role.

Therefore, as the historical evolution of software development shows us, raising the level of abstraction of the development languages and having the proper automation in place can be key to a smoother and faster delivery process for NLP applications. In this session, we will walk through the current major approaches to knowledge representation in the NLP field and we will show how a model driven approach can help in improving the interoperability among different applications and services, both within and outside the boundaries of the organization.

Session 1: Lightning talks part I Monday, 23 May 2016, 10.00 – 10.30

Aggregating Research Papers from Publishers’ Systems to Support Text and Data Mining: Deliberate Lack of Interoperability or Not?

Petr Knoth and Nancy Pontika

In the current technology-dominated world, interoperability of systems managed by different organisations is an essential property enabling the provision of services at a global scale. In the Text and Data Mining (TDM) field, interoperability of systems offering access to text corpora offers the opportunity of increasing the uptake and impact of TDM applications. The global corpus of all research papers, i.e. a collection of human knowledge so large that no one could ever read it in their lifetime, represents one of the most exciting opportunities for TDM. Although the Open Access movement, which has been advocating for free availability of research papers and the right to reuse them for TDM, has achieved some major successes on the legal front, the technical interoperability of systems offering free access to research papers continues to be a challenge. COnnecting REpositories (CORE) aggregates the world’s open access full-text scientific manuscripts from repositories, journals and publisher systems. One of the main goals of CORE is to harmonise and pre-process these data to lower the barrier for TDM. In this paper, we report on the preliminary results of an interoperability survey of systems provided by journal publishers, both open access and toll access. This helps us to assess the current level of systems’ interoperability and suggest ways forward.


Alveo: making data accessible through a unified interface – a pipe-dream?

Dominique Estival

This paper addresses an old issue in corpus management which is still problematic in real-life systems: allowing users to explore and access data from various sources using a single simple interface, thus creating a tension between ease of use and over-simplification. This is then mirrored in the similar difficulty encountered with a simple data upload facility. In Alveo, the Virtual Lab for Human Communication Science, the original unified interface was sufficient for most of the datasets but proved inadequate in some cases. This paper is intended to facilitate a discussion on best practice with developers who may propose different solutions and with researchers who may have other requirements for their own datasets. We describe specific challenges posed by some datasets for Alveo and issues faced by users, identify the problems with the current state of development, and propose several solutions.

The Language Application Grid

Nancy Ide, Keith Suderman, James Pustejovsky, Marc Verhagen, Christopher Cieri and Eric Nyberg

We describe the LAPPS Grid and its Galaxy front-end, focusing on its ability to interoperate with a variety of NLP platforms. The LAPPS Grid project has been a leading force in the development of specifications for web service interoperability on syntactic and semantic levels. Syntactic interoperability among services is enabled through LIF, the LAPPS Interchange Format, which is expressed using the JSON-LD exchange format. JSON-LD is a widely accepted format that allows data represented in the international standard JSON format to interoperate at Web-scale. Semantic interoperability is achieved through the LAPPS Web Service Exchange Vocabulary, which has been developed in close collaboration with interested and invested groups as a lightweight, web-accessible, and readily mappable hierarchy of concepts, built bottom-up on an “as needed” basis.

Interoperability of corpus processing workflow engines: the case of AlvisNLP/ML in Open- MinTeD

Mouhamadou Ba and Robert Bossy

AlvisNLP/ML is a corpus processing engine developed by the Bibliome group. It has been used in several experiments and end-user applications. We describe its design principles and its data and workflow models, then we discuss interoperability challenges in the context of the OpenMinTeD project. The objective of OpenMinTeD (EC/H2020) is to create an infrastructure for Text and Data Mining (TDM) of scientific and scholarly publications. In order to offer infrastructure users a single entry point and the widest possible range of tools, the major European corpus processing engines will be made interoperable, including Argo, DKPro, and GATE. We show that AlvisNLP/ML can be fully integrated into the OpenMinTeD platform while maintaining its originality.

Integration of UIMA Text Mining Components into an Event-based Asynchronous Microservice Architecture

Sven Hodapp, Sumit Madan, Juliane Fluck and Marc Zimmermann

Distributed compute resources are necessary for compute-intensive tasks processing large collections of heterogeneous documents (e.g. patents). For optimal usage of such resources, complex workflows and document sets must be broken down into independent smaller units. The UIMA framework facilitates the implementation of modular workflows, which represents an ideal structure for parallel processing. Although UIMA AS already includes parallel processing functionality, we tested two other approaches for distributed computing. First, we integrated UIMA workflows into the grid middleware UNICORE, which allows high performance distributed computing using control structures like loops or branching. While good distribution management and performance are key requirements, portability, flexibility, interoperability, and easy usage are also desired features. Therefore, as an alternative, we deployed UIMA applications in a microservice architecture that supports all these aspects. We show that UIMA applications are well-suited to run in a microservice architecture while using an event-based asynchronous communication method. These applications communicate through the standardized STOMP message protocol via a message broker. Within this architecture, new applications can easily be integrated, portability is simple, and interoperability, also with non-UIMA components, is given. Notably, a first test shows an increase in processing performance in comparison to the UNICORE-based HPC solution.

Interoperability = f(community; division of labour)

Richard Eckart de Castilho

This paper aims to motivate the hypothesis that practical interoperability can be seen as a function of whether and how stakeholder communities duplicate or divide work in a given area or market. We focus on the area of language processing, which traditionally produces many diverse tools that are not immediately interoperable. However, there is also a strong desire to combine these tools into processing pipelines and to apply these to a wide range of different corpora. The space opened between generic, inherently “empty” interoperability frameworks that offer no NLP capabilities themselves and dedicated NLP tools gave rise to a new class of NLP-related projects that focus specifically on interoperability: component collections. This new class of projects drives interoperability in a very pragmatic way that could well be more successful than, e.g., past efforts towards standardised formats which ultimately saw little adoption or support by software tools.

Session 2: Lightning talks part II Monday, 23 May 2016, 11.00 – 11.45

Linked Data and Text Mining as an Enabler for Reproducible Research

John P. McCrae, Georgeta Bordea and Paul Buitelaar

Research data is one of the most important outcomes of many research projects and key to enabling reproducibility in the analytic data sciences. In this paper, we explain three main challenges that complicate reproducibility, namely the difficulty of identifying datasets unambiguously, the lack of open repositories for scientific data and, finally, the lack of tools for understanding published science. We consider the use of linked data and text mining as two tools to address these challenges and discuss how they may ameliorate these issues.

Tackling Resource Interoperability: Principles, Strategies and Models

Wim Peters

In order to accommodate the flexible exploitation and creation of knowledge resources in text and data mining (TDM) workflows, the TDM architecture will need to enable the re-use of resources encoding linguistic/terminological/ontological knowledge, such as ontologies, thesauri, lexical databases and the output of linguistic annotation tools. For this purpose resource interoperability is required in order to enable text mining tools to uniformly handle these knowledge resources and operationalise interoperable workflows. The Open Mining Infrastructure for Text and Data (OpenMinTeD) aims at defining this interoperability by adhering to standards for modelling and knowledge representation, and by defining a mapping structure for the harmonisation of information contained in heterogeneous resources.

The DDINCBI Corpus — Towards a Larger Resource for Drug-Drug Interactions in PubMed

Lana Yeganova, Sun Kim, Grigory Balasanov, Kristin Bennett, Haibin Liu and W. John Wilbur

Manually annotated corpora are of great importance for the development of NLP systems, both as training and evaluation data. However, the shortage of annotated corpora frequently presents a key bottleneck in the process of developing reliable applications in the health and biomedical domain and demonstrates a need for creating larger annotated corpora. Utilizing and integrating existing corpora appears to be a vital, yet not trivial, avenue towards achieving the goal. Previous studies have revealed that drug-drug interaction (DDI) extraction methods when trained on DrugBank data do not perform well on PubMed articles. With the ultimate goal of improving the performance of our DDI extraction method on PubMed® articles, we construct a new gold standard corpus of drug-drug interactions in PubMed that we call the DDINCBI corpus. We combine it with the existing DDIExtraction 2013 PubMed corpus and demonstrate that by merging these two corpora higher performance is achieved compared to when either source is used separately. We release the DDINCBI corpus and make it publicly available for download in BioC format at: http://bioc.sourceforge.net/. In addition, we make the existing DDIExtraction 2013 corpus available in BioC format.

Multilingual Event Detection using the NewsReader pipelines

Rodrigo Agerri, Itziar Aldabe, Egoitz Laparra, German Rigau, Antske Fokkens, Paul Huijgen, Marieke van Erp, Ruben Izquierdo Bevia, Piek Vossen, Anne-Lyse Minard and Bernardo Magnini

We describe a novel modular system for cross-lingual event extraction for English, Spanish, Dutch and Italian texts. The system consists of a ready-to-use modular set of advanced multilingual Natural Language Processing (NLP) tools. The pipeline integrates modules for basic NLP processing as well as more advanced tasks such as cross-lingual Named Entity Linking and time normalization. Thus, our cross-lingual framework allows for the interoperable semantic interpretation of events, participants, locations and time, as well as the relations between them.

Emotion Detection using Online Machine Learning Method and TLBO on Mixed Script

Shashank Sharma, PYKL Srinivas and Rakesh Balabantaray

Due to the rapid modernization of our societies, most people, if not all, have access to online social media and mobile communication devices. These people hail from diverse cultures and ethnicities, and interact with each other more often on these social media sites. Moreover, due to their distinct backgrounds, they all have an influence on the common language in which they communicate. Also, many users employ a myriad of shorthand, emoticons and abbreviations in their statements to reduce their effort. This calls for a means to assist in better communication through social media. In our work, we have worked on understanding the underlying emotions and sentiments of these interactions and communications. Our focus was on analyzing conversations by Indians in the code-mix of English and Hindi and identifying the usage patterns of various words and parts of speech. We have categorized statements into 6 groups based on emotions and improved the model using the TLBO technique and online learning algorithms. These features were integrated into our application to assist mobile device users in quickly sorting and prioritizing their messages based on the emotions attached to the statements and to provide much more immersive communication with their friends and family.

Text mining for notability computation

Gil Francopoulo, Joseph Mariani and Patrick Paroubek

In this article, we propose an automatic computation of the notability of an author based on four criteria: production, citation, collaboration and innovation. The algorithms and formulas are formally presented, and then applied to a given scientific community: the Natural Language Processing (NLP) community of scientific authors, gathering 48,894 people. For this purpose, a large corpus of NLP articles produced from 1965 up to 2015 has been collected and labeled as NLP4NLP, with 65,003 documents. This represents a large part of the existing published articles in the NLP field over the last 50 years. The approach has two main points: first, the computation combines pure graph algorithms and NLP systems; second, it addresses interoperability aspects both for the corpus and the tools.

Why We Need a Text and Data Mining Exception (but it is not enough)

Thomas Margoni and Giulia Dore

Text and Data Mining (TDM) has become a key instrument in the development of scientific research. Its ability to derive new informational value from existing text and data makes this analytical tool a necessary element in the current scientific environment. TDM's crucial importance is particularly evident at a historical moment when the extremely high amount of information produced (scholarly publications, databases and datasets, social networks, etc.) makes it unlikely, if not impossible, for humans to read it all. Nevertheless, TDM, at least in the EU, is often a copyright infringement. This situation illustrates how certain legal provisions stifle scientific development, instead of fostering it, with significant damage for EU-based researchers and research institutions and for European socio-economic competitiveness more generally. Other countries leading scientific and technological development have already implemented legislative or judicial solutions permitting TDM, including for commercial purposes. This extended abstract suggests, as has already been advocated in the literature and in policy documents, that a mandatory TDM exception, not limited to non-commercial research, is needed to put the EU on the same level playing field as other jurisdictions, such as the US and Japan.


Legal Interoperability Issues in the Framework of the OpenMinTeD Project: a Methodological Overview

Penny Labropoulou, Stelios Piperidis, Thomas Margoni

This paper is a first analysis of the legal interoperability issues in the framework of the OpenMinTeD (OMTD) project (www.openminted.eu), which aims to create an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content. The paper offers an overview of the methods for achieving such interoperability.

Breakout Groups Monday, 23 May 2016, 12.00 – 16.00

The workshop is planned as an open-space event in which the workshop participants host and participate in discussions related to the topics of interest. Based on the submissions to the workshop, we see preliminary clusters forming around the following topics:
• Discovery of content and access to content;
• Interoperability across tools, frameworks, and platforms;
• Interoperability across language resources and knowledge resources;
• Legal and policy issues;
• Interoperability in multi-lingual and cross-lingual scenarios.
However, as part of pre-workshop discussions, based on the number of participants, and possibly even based on feedback generated during the workshop itself, these may be adapted.

eInfrastructures: crossing boundaries, discovering common work, achieving common goals Monday, 23 May 2016, 12.00 – 16.00

Moderators: Stelios Piperidis (Athena Research Centre), Hege van Dijke (LIBER Europe)

The quantity and variety of digital research data of all types, the increasing number of solutions for mining such data to discover new knowledge, and the variety of requirements of users of such data mining tools and services pose new challenges to the numerous infrastructural initiatives. When it comes to language data and processing, we see multiple efforts towards generic or domain-specific platforms for data aggregation and delivery, tool and service discovery, and workflow orchestration and execution. Common questions in these efforts concern interoperability at different levels: metadata of language resources and annotations, data representations and vocabularies, services across platforms and frameworks, cloud computing infrastructures, and access restrictions and permissions for certain operations. In this session, we bring together strategic players in order to discuss parallel efforts being undertaken and opportunities for long-term harmonisation and strategic collaboration among existing (and future) e-Infrastructures in Europe, and the world at large, with a focus on, inter alia:
• What are the necessary strategies to enable crossing the boundaries of platforms, scientific domains, languages, national legislations?
• What can we learn from similar attempts so far?
• What can the research community do to promote these goals?
• What sort of alliances and policies are necessary to support overcoming the current barriers?

Building and Using Comparable Corpora

23 May 2016

ABSTRACTS

Editors:

Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff

Workshop Programme

Monday, May 23, 2016

09.15–09.25 Opening Remarks

Session 1: Invited Presentation
09.25–10.30 Ruslan Mitkov, The Name of the Game is Comparable Corpora
10.30–11.00 Coffee Break

Session 2: Building Comparable Corpora
11:00–11:30 Andrey Kutuzov, Mikhail Kopotev, Tatyana Sviridenko and Lyubov Ivanova, Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints
11:30–12:00 Yong Xu and François Yvon, A 2D CRF Model for Sentence Alignment
12:00–12:30 Mehdi Mohammadi, Parallel Document Identification using Zipf’s Law
12.30–14.00 Lunch Break

Session 3: Invited Presentation
14.00–15.00 Gregory Grefenstette, Exploring the Richness and Limitations of Web Sources for Comparable Corpus Research

Session 4: Applications of Comparable Corpora
15.00–15.30 Zede Zhu, Xinhua Zeng, Shouguo Zheng, Xiongwei Sun, Shaoqi Wang and Shizhuang Weng, A Mutual Iterative Enhancement Model for Simultaneous Comparable Corpora and Bilingual Lexicons Construction
15:30–16:00 Ana Sabina Uban, Hard Synonymy and Applications in Automatic Detection of Synonyms and
16.00–16.30 Coffee Break

Session 5: Discussion
16:30–17:30 Pierre Zweigenbaum, Serge Sharoff and Reinhard Rapp, Towards Preparation of the Second BUCC Shared Task: Detecting Parallel Sentences in Comparable Corpora
17.30–17.35 Closing

Workshop Organizers

Reinhard Rapp  University of Mainz, Germany
Pierre Zweigenbaum  LIMSI, CNRS, Université Paris-Saclay, Orsay, France
Serge Sharoff  University of Leeds, UK

Workshop Programme Committee

Ahmet Aker  University of Sheffield, UK
Hervé Déjean  Xerox Research Centre Europe, Grenoble, France
Éric Gaussier  Université Joseph Fourier, Grenoble, France
Vishal Goyal  Punjabi University, Patiala, India
Gregory Grefenstette  INRIA, Saclay, France
Silvia Hansen-Schirra  University of Mainz, Germany
Hitoshi Isahara  Toyohashi University of Technology
Kyo Kageura  University of Tokyo, Japan
Philippe Langlais  Université de Montréal, Canada
Shervin Malmasi  Harvard Medical School, Boston, MA, USA
Michael Mohler  Language Computer Corp., US
Emmanuel Morin  Université de Nantes, France
Dragos Stefan Munteanu  Language Weaver, Inc., US
Lene Offersgaard  University of Copenhagen, Denmark
Ted Pedersen  University of Minnesota, Duluth, US
Reinhard Rapp  University of Mainz, Germany
Serge Sharoff  University of Leeds, UK
Michel Simard  National Research Council Canada
Pierre Zweigenbaum  LIMSI-CNRS, Orsay, France

Introduction

In the language engineering and the linguistics communities, research on comparable corpora has been motivated by two main reasons. In language engineering, on the one hand, it is chiefly motivated by the need to use comparable corpora as training data for statistical Natural Language Processing applications such as statistical machine translation or cross-lingual retrieval. In linguistics, on the other hand, comparable corpora are of interest in themselves by making possible inter-linguistic discoveries and comparisons. It is generally accepted in both communities that comparable corpora are documents in one or several languages that are comparable in content and form in various degrees and dimensions. We believe that the linguistic definitions and observations related to comparable corpora can improve methods to mine such corpora for applications of statistical NLP. As such, it is of great interest to bring together builders and users of such corpora.

Comparable corpora are collections of documents that are comparable in content and form in various degrees and dimensions. This definition includes many types of parallel and non-parallel multilingual corpora, but also sets of monolingual corpora that are used for comparative purposes. Research on comparable corpora is active but used to be scattered among many workshops and conferences. The workshop series on “Building and Using Comparable Corpora” (BUCC) aims at promoting progress in this exciting emerging field by bundling its research, thereby making it more visible and giving it a better platform.

Following the eight previous editions of the workshop which took place in Africa (LREC’08 in Marrakech), America (ACL’11 in Portland), Asia (ACL-IJCNLP’09 in Singapore and ACL-IJCNLP’15 in Beijing), Europe (LREC’10 in Malta, ACL’13 in Sofia, and LREC’14 in Reykjavik) and also on the border between Asia and Europe (LREC’12 in Istanbul), the workshop this year is co-located with LREC’16 in Portorož, Slovenia.

We would like to thank all people who in one way or another helped in making this workshop once again a success. Our special thanks go to Ruslan Mitkov and Gregory Grefenstette for accepting to give invited presentations, to the members of the program committee who did an excellent job in reviewing the submitted papers under strict time constraints, and to the LREC’16 workshop chairs and organizers. Last but not least we would like to thank our authors and the participants of the workshop.

Reinhard Rapp, Pierre Zweigenbaum, Serge Sharoff May 2016

Session 1: Invited Presentation Monday 23 May, 9:25 – 10:30

The Name of the Game is Comparable Corpora

Ruslan Mitkov

Corpora have long been the preferred resource for a number of NLP applications and language users. They offer a reliable alternative to dictionaries and lexicographical resources, which may offer only limited coverage. In the case of terminology, for instance, new terms are coined on a daily basis, and dictionaries or other lexical resources, however up-to-date they are, cannot keep up with the rate of emergence of new terms. As a result, terminologists (or term extraction programs) seek to analyse the use and/or identify the translation of a specific term using corpora. Ideally, parallel data would be the best resource both for multilingual NLP applications such as Machine Translation systems and for users such as translators, interpreters or language learners. However, parallel corpora or translation memories may not be available, may be time-consuming to develop, or may be difficult to acquire because they are expensive or proprietary. An alternative and more promising approach is to benefit from comparable corpora, which are easier to compile for a specific purpose or task. Comparable corpora, whether strictly comparable by definition or 'loosely' comparable, have already been used in applications such as Machine Translation (Rapp, Sharoff and Zweigenbaum 2016) and term extraction, and have been used by translators (Corpas and Seghiri 2009). The good news is that comparable corpora can facilitate almost any multilingual application and can be beneficial to almost any language user. The view of the speaker is that comparable corpora are the most versatile, valuable and practical resource for multilingual NLP. The invited talk at the BUCC workshop at LREC'2016 will show that comparable corpora can offer more in terms of value and can support a wider range of applications than has been demonstrated so far in the state of the art. The talk will present completed and ongoing work conducted by the speaker and his colleagues at the Research Group in Computational Linguistics at the University of Wolverhampton in the domain of comparable corpora. The talk will start with a discussion of the notion of comparable corpora and issues related to their use and compilation, and will briefly outline work by the speaker and his colleagues on the methodology related to the extraction of comparable documents and the building of purpose-specific comparable corpora. Next, the work carried out by the author on the automatic identification of cognates and false friends using comparable data will be presented. This will be followed by the presentation of three novel approaches developed by the speaker which use comparable data but do not resort to any dictionaries or parallel corpora, together with extensive evaluations of their performance. The speaker will then focus on the use of purpose-built comparable corpora and NLP methodology in a project whose objective was to test the validity of so-called translation universals. In particular, the experiments on validating the universals of simplification, convergence and transfer will be detailed. Following from this study, the speaker will outline the work on the use of comparable corpora to track language change over time, in particular the recent changes in lexical density and lexical richness in two consecutive thirty-year time periods in British English (1931–1961 and 1961–1991) and in American English from the 1960s to the 1990s (1961–1992).
Finally, the speaker will share the latest results from his work with colleagues on the use of comparable corpora for extracting and translating multiword expressions. The methodology developed does not rely on any dictionaries or parallel corpora, nor does it use any (bilingual) grammars. The only information comes from comparable corpora, inexpensively compiled with the help of the ACCURAT toolkit (Su and Babych 2012a), where only documents above a specific comparability threshold were considered for inclusion. The presentation will conclude with the results of an interesting experiment, conducted as part of this study, which sought to establish whether large, loosely comparable data would yield better results than smaller but strictly comparable corpora.

Session 2: Building Comparable Corpora Monday 23 May, 10:30 – 12:30

Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints

Andrey Kutuzov, Mikhail Kopotev, Tatyana Sviridenko and Lyubov Ivanova

We present our experience in applying neural word embeddings to the problem of representing and clustering documents in a bilingual comparable corpus. Our data is a collection of Russian and Ukrainian academic texts, for which topics are their academic fields. In order to build language-independent semantic representations of these documents, we train neural distributional models on monolingual corpora and learn the optimal linear transformation of vectors from one language to another. The resulting vectors are then used to produce 'semantic fingerprints' of documents, serving as input to a clustering algorithm. The presented method is compared to several baselines, including 'orthographic translation' with Levenshtein edit distance, and outperforms them by a large margin. We also show that language-independent 'semantic fingerprints' are superior to multilingual clustering algorithms proposed in previous work, while requiring fewer linguistic resources.
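As a rough illustration of the pipeline sketched in this abstract (monolingual embeddings, a learned linear transformation between languages, averaged document 'fingerprints', clustering), here is a minimal Python sketch. The least-squares mapping, the averaging step, and all names and seed data are assumptions for illustration; the authors' actual models and transformation may differ.

```python
# Minimal sketch: map Russian vectors into the Ukrainian space with a
# least-squares linear transform learned from a small seed dictionary, then
# cluster averaged document vectors ("semantic fingerprints").
# Assumes pre-trained monolingual embeddings given as {word: vector} dicts.
import numpy as np
from sklearn.cluster import KMeans

def learn_mapping(seed_pairs, src_emb, tgt_emb):
    """Least-squares W such that src @ W ~ tgt (a simplification of the
    transformation learned in the paper)."""
    X = np.vstack([src_emb[s] for s, t in seed_pairs])
    Y = np.vstack([tgt_emb[t] for s, t in seed_pairs])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def fingerprint(tokens, emb, W=None):
    """Average the vectors of known tokens; optionally map them first.
    Assumes at least one token is in the embedding vocabulary."""
    vecs = [emb[t] for t in tokens if t in emb]
    doc_vec = np.mean(vecs, axis=0)
    return doc_vec @ W if W is not None else doc_vec

# Hypothetical usage: ru_emb / uk_emb are monolingual embeddings, seed is a
# small Russian-Ukrainian dictionary, ru_docs / uk_docs are token lists.
# W = learn_mapping(seed, ru_emb, uk_emb)
# X = np.vstack([fingerprint(d, ru_emb, W) for d in ru_docs] +
#               [fingerprint(d, uk_emb) for d in uk_docs])
# labels = KMeans(n_clusters=5, n_init=10).fit_predict(X)
```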

A 2D CRF Model for Sentence Alignment

Yong Xu and François Yvon

The identification of parallel segments from parallel or comparable corpora can be performed at various levels. Alignments at the sentence level are useful for many downstream tasks, and also simplify the identification of finer-grained correspondences. Most state-of-the-art sentence aligners are unsupervised, and attempt to infer endogenous alignment clues based on the analysis of the bitext alone. In decoding, they typically make simplifying assumptions so that efficient dynamic programming techniques can be applied. Owing to such assumptions, high-precision sentence alignment remains difficult for certain types of corpora, in particular for literary texts. In this paper, we propose to learn a supervised alignment model, which represents the alignment matrix as a two-dimensional Conditional Random Field (2D CRF), turning sentence alignment into a structured prediction problem. This formalism enables us to take advantage of a rich set of overlapping features. Furthermore, it also allows us to relax some assumptions in decoding.
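For readers unfamiliar with the unsupervised baselines this paper argues against, the sketch below shows the kind of length-based dynamic-programming aligner that relies on the simplifying assumptions mentioned above (monotonic 1-1 / 1-0 / 0-1 links scored by a length heuristic). It is not the 2D CRF model of the paper; the cost function and skip penalty are illustrative assumptions.

```python
# Minimal length-based DP sentence aligner (a classic unsupervised baseline,
# not the supervised 2D CRF model described in the abstract).
def align_by_length(src_lens, tgt_lens, skip_cost=3.0):
    n, m = len(src_lens), len(tgt_lens)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == INF:
                continue
            if i < n and j < m:  # 1-1 link, penalise length mismatch
                c = cost[i][j] + abs(src_lens[i] - tgt_lens[j]) / max(src_lens[i], tgt_lens[j], 1)
                if c < cost[i + 1][j + 1]:
                    cost[i + 1][j + 1], back[i + 1][j + 1] = c, (i, j, "1-1")
            if i < n and cost[i][j] + skip_cost < cost[i + 1][j]:  # unaligned source sentence
                cost[i + 1][j], back[i + 1][j] = cost[i][j] + skip_cost, (i, j, "1-0")
            if j < m and cost[i][j] + skip_cost < cost[i][j + 1]:  # unaligned target sentence
                cost[i][j + 1], back[i][j + 1] = cost[i][j] + skip_cost, (i, j, "0-1")
    links, i, j = [], n, m
    while (i, j) != (0, 0):          # backtrack, keeping only 1-1 links
        pi, pj, kind = back[i][j]
        if kind == "1-1":
            links.append((pi, pj))
        i, j = pi, pj
    return list(reversed(links))

# Toy example: sentence lengths in characters for the two sides of a bitext.
print(align_by_length([40, 55, 12], [42, 50, 30, 11]))
# -> [(0, 0), (1, 1), (2, 3)]: the third target sentence is left unaligned.
```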

Parallel Document Identification using Zipf’s Law

Mehdi Mohammadi

Parallel texts are an essential resource in many NLP tasks. One main issue in taking advantage of these resources is to distinguish parallel or comparable documents that may contain parallel fragments of text from those that have no corresponding text. In this paper we propose a simple and efficient method to identify parallel documents based on the Zipfian frequency distribution of available parallel corpora. In our method, we introduce a score called Cumulative Frequency Log by which we can measure the similarity of two documents that fit into a simple linear regression model. The regression model is generated based on the word ranks and frequencies of an available parallel corpus. The evaluation of the proposed approach over three language pairs achieves an accuracy of up to 0.86.
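The abstract does not give the exact definition of the Cumulative Frequency Log score, so the following sketch only illustrates the general idea under stated assumptions: fit a Zipfian log-log regression on a reference corpus and measure how closely a document's frequency profile follows that line; documents with similarly good fits would be candidate pairs.

```python
# Minimal sketch, not the paper's exact Cumulative Frequency Log score:
# fit log(frequency) ~ log(rank) on a reference corpus (Zipf's law), then
# score how well a document's frequency profile fits the fitted line.
import math
from collections import Counter

def zipf_fit(reference_tokens):
    freqs = sorted(Counter(reference_tokens).values(), reverse=True)
    xs = [math.log(r + 1) for r in range(len(freqs))]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx          # Zipf exponent and intercept

def fit_score(tokens, slope, intercept):
    """Mean absolute deviation of a document's log-frequencies from the
    reference Zipf line (lower = closer to the reference distribution)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    devs = [abs(math.log(f) - (slope * math.log(r + 1) + intercept))
            for r, f in enumerate(freqs)]
    return sum(devs) / len(devs)

# Hypothetical usage: document pairs whose scores are both low (and similar)
# would be candidate parallel/comparable pairs under this simplified criterion.
# slope, intercept = zipf_fit(reference_corpus_tokens)
# print(fit_score(doc_en_tokens, slope, intercept),
#       fit_score(doc_fr_tokens, slope, intercept))
```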

Session 3: Invited Presentation Monday 23 May, 14:00 – 15:00

Exploring the Richness and Limitations of Web Sources for Comparable Corpus Research

Gregory Grefenstette

Comparable corpora have been used to improve statistical machine translation, to augment linked open data, to find terminology equivalents, and to create other linguistic resources for natural language processing and language learning applications. Recently, continuous vector space models, which create and exploit word embeddings, have been gaining popularity as more powerful means of creating, and sometimes replacing, these resources. Both classical comparable corpora solutions and vector space models require the presence of a large quantity of multilingual content. In this talk, we will discuss the breadth of this content on the internet, to provide some intuition about how successful comparable corpus approaches will be in achieving their goals of providing multilingual and cross-lingual resources. We examine current estimates of language presence and growth on the web, and of the availability of the type of resources needed to continue and extend comparable corpus research.

Session 4: Applications of Comparable Corpora Monday 23 May, 15:00 – 16:00

A Mutual Iterative Enhancement Model for Simultaneous Comparable Corpora and Bilingual Lexicons Construction

Zede Zhu, Xinhua Zeng, Shouguo Zheng, Xiongwei Sun, Shaoqi Wang and Shizhuang Weng

Constructing bilingual lexicons from comparable corpora has been investigated as a two-stage process: building comparable corpora and mining bilingual lexicons, respectively. However, two potential challenges remain: out-of-vocabulary words and the differing comparability degrees of corpora. To solve the above problems, a novel iterative enhancement model is proposed for constructing comparable corpora and bilingual lexicons simultaneously, under the assumption that both processes can be mutually reinforced. Compared with the separate two-stage process, both simultaneous processes are shown to perform better on data sets from different domains, via a small-volume general bilingual seed dictionary.
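A schematic sketch of the mutual reinforcement loop described above is given below. The two helper functions are hypothetical placeholders for the paper's corpus-building and lexicon-mining components; only the control flow (lexicon improves corpus, corpus improves lexicon) is illustrated.

```python
# Schematic sketch of the mutual reinforcement loop described in the abstract.
# build_comparable_corpus() and mine_bilingual_lexicon() are hypothetical
# placeholders for the paper's two sub-processes.
def iterative_enhancement(seed_lexicon, src_docs, tgt_docs,
                          build_comparable_corpus, mine_bilingual_lexicon,
                          max_iters=5):
    lexicon = dict(seed_lexicon)
    corpus = []
    for _ in range(max_iters):
        # 1) Pair documents using the current lexicon as the bridge.
        new_corpus = build_comparable_corpus(src_docs, tgt_docs, lexicon)
        # 2) Mine new translation pairs from the paired documents.
        new_lexicon = mine_bilingual_lexicon(new_corpus, lexicon)
        # Stop when neither resource changes any more.
        if new_corpus == corpus and new_lexicon == lexicon:
            break
        corpus, lexicon = new_corpus, dict(new_lexicon)
    return corpus, lexicon
```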

Hard Synonymy and Applications in Automatic Detection of Synonyms and Machine Translation

Ana Sabina Uban

We investigate in this paper the property of hard synonymy, defined as synonymy which is maintained across two or more languages. We use synonym dictionaries for four languages, as well as parallel corpora and tools for distributional synonym extraction, in order to perform experiments investigating the potential applications of hard synonymy for the automatic detection of synonyms and for machine translation. We show that hard synonymy can be used to discriminate between distributionally similar words that are true synonyms and those that are merely semantically related or even antonyms. We also investigate whether hard synonym word-translation pairs can be useful for lexical machine translation, by analyzing their occurrences in word-aligned parallel corpora. We build a database of words, synonyms and their translations for the four languages, including a generally low-resourced language (Romanian), and show how it can be used to investigate properties of words and their synonyms cross-lingually.
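The following toy sketch illustrates one plausible reading of the hard-synonymy test (a pair of synonyms in language A whose translations overlap, or are themselves synonyms, in language B). The dictionaries are invented miniature examples, not the resources used in the paper.

```python
# Minimal sketch of the "hard synonymy" idea: two words that are synonyms
# in language A count as hard synonyms if their translations into language
# B overlap or are themselves synonyms.  The dictionaries below are
# hypothetical toy data, not the paper's resources.
def are_hard_synonyms(w1, w2, synonyms_a, translations_ab, synonyms_b):
    if w2 not in synonyms_a.get(w1, set()):
        return False                      # not even synonyms in language A
    t1 = translations_ab.get(w1, set())
    t2 = translations_ab.get(w2, set())
    if t1 & t2:                           # direct overlap of translations
        return True
    # ... or translations that are synonyms of each other in language B
    return any(b2 in synonyms_b.get(b1, set()) for b1 in t1 for b2 in t2)

synonyms_en = {"begin": {"start", "commence"}}
translations_en_ro = {"begin": {"a începe"}, "start": {"a începe", "a porni"}}
synonyms_ro = {"a începe": {"a porni"}}
print(are_hard_synonyms("begin", "start", synonyms_en, translations_en_ro, synonyms_ro))
# -> True
```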

Session 5: Discussion Monday 23 May, 16:30 – 17:30

Towards Preparation of the Second BUCC Shared Task: Detecting Parallel Sentences in Comparable Corpora

Pierre Zweigenbaum, Serge Sharoff and Reinhard Rapp

In this paper we provide a summary of the rationale and the dataset contributing to the second shared task of the BUCC workshop. The shared task is aimed at detecting the best candidates for parallel sentences in a large text collection. The dataset for the shared task is based on a careful mix of parallel and non-parallel corpora. It contains 1.4 million French sentences and 1.9 million English sentences, in which 17 thousand sentence pairs are known to be parallel. The shared task itself is scheduled for the 2017 edition of the workshop.

LREC 2016 Workshop

CCURL 2016 Collaboration and Computing for Under-Resourced Languages: Towards an Alliance for Digital Language Diversity

23 May 2016

ABSTRACTS

Editors

Claudia Soria, Laurette Pretorius, Thierry Declerck, Joseph Mariani, Kevin Scannell, Eveline Wandl-Vogt

Workshop Programme

Opening Session
09.15 – 09.30 Introduction
09.30 – 10.30 Jon French, Oxford Global Languages: a Defining Project (Invited Talk)
10.30 – 11.00 Coffee Break
Session 1
11.00 – 11.25 Antti Arppe, Jordan Lachler, Trond Trosterud, Lene Antonsen, and Sjur N. Moshagen, Basic Language Resource Kits for Endangered Languages: A Case Study of Plains Cree
11.25 – 11.50 George Dueñas and Diego Gómez, Building Bilingual Dictionaries for Minority and Endangered Languages with Mediawiki
11.50 – 12.15 Dorothee Beermann, Tormod Haugland, Lars Hellan, Uwe Quasthoff, Thomas Eckart, and Christoph Kuras, Quantitative and Qualitative Analysis in the Work with African Languages
12.15 – 12.40 Nikki Adams and Michael Maxwell, Somali Spelling Corrector and Morphological Analyzer
12.40 – 14.00 Lunch Break
Session 2
14.00 – 14.25 Delyth Prys, Mared Roberts, and Gruffudd Prys, Reprinting Scholarly Works as e-Books for Under-Resourced Languages
14.25 – 14.50 Cat Kutay, Supporting Language Teaching Online
14.50 – 15.15 Maik Gibson, Assessing Digital Vitality: Analytical and Activist Approaches
15.15 – 15.40 Martin Benjamin, Digital Language Diversity: Seeking the Value Proposition
15.40 – 16.00 Discussion
16.05 – 16.30 Coffee Break
16.30 – 17.30 Poster Session
Sebastian Stüker, Gilles Adda, Martine Adda-Decker, Odette Ambouroue, Laurent Besacier, David Blachon, Hélène Bonneau-Maynard, Elodie Gauthier, Pierre Godard, Fatima Hamlaoui, Dmitry Idiatov, Guy-Noel Kouarata, Lori Lamel, Emmanuel-Moselly Makasso, Markus Müller, Annie Rialland, Mark Van de Velde, François Yvon, and Sabine Zerbian, Innovative Technologies for Under-Resourced Language Documentation: The BULB Project
Dirk Goldhahn, Maciej Sumalvico, and Uwe Quasthoff, Corpus Collection for Under-Resourced Languages with More than One Million Speakers
Dewi Bryn Jones and Sarah Cooper, Building Intelligent Digital Assistants for Speakers of a Lesser-Resourced Language
Justina Mandravickaite and Michael Oakes, Multiword Expressions for Capturing Stylistic Variation Between Genders in the Lithuanian Parliament
Richard Littauer and Hugh Paterson III, Open Source Code Serving Endangered Languages
Uwe Quasthoff, Dirk Goldhahn, and Sonja Bosch, Morphology Learning for Zulu
17.30 – 18.00 Discussion and Conclusions

Workshop Organizers

Thierry Declerck, DFKI GmbH, Language Technology Lab, Germany
Joseph Mariani, LIMSI-CNRS & IMMI, France
Laurette Pretorius, University of South Africa, South Africa
Kevin Scannell, St. Louis University, USA
Claudia Soria, CNR-ILC, Italy
Eveline Wandl-Vogt, Austrian Academy of Sciences, ACDH, Austria

Workshop Programme Committee

Gilles Adda, LIMSI-CNRS & IMMI, France
Tunde Adegbola, African Languages Technology Initiative, Nigeria
Eduardo Avila, Rising Voices, Bolivia
Martin Benjamin, The Kamusi Project, Switzerland
Delphine Bernhard, LiLPa, Université de Strasbourg, France
Paul Bilbao Sarria, Euskararen Gizarte Erakundeen KONTSEILUA, Spain
Vicent Climent Ferrando, NPLD, Belgium
Daniel Cunliffe, Prifysgol De Cymru / University of South Wales, School of Computing and Mathematics, UK
Nicole Dolowy-Rybinska, Polska Akademia Nauk / Polish Academy of Sciences, Poland
Mikel Forcada, Universitat d'Alacant, Spain
Maik Gibson, SIL International, UK
Tjerd de Graaf, De Fryske Akademy, The Netherlands
Thibault Grouas, Délégation Générale à la langue française et aux langues de France, France
Auður Hauksdóttir, Vigdís Finnbogadóttir Institute of Foreign Languages, Iceland
Peter Juel Henrichsen, Copenhagen Business School, Denmark
Davyth Hicks, ELEN, France
Kristiina Jokinen, Helsingin Yliopisto / University of Helsinki, Finland
John Judge, ADAPT Centre, Dublin City University, Ireland
Steven Krauwer, CLARIN, The Netherlands
Steven Moran, Universität Zürich, Switzerland
Silvia Pareti, Google Inc., Switzerland
Daniel Pimienta, MAAYA
Steve Renals, University of Edinburgh, UK
Kepa Sarasola Gabiola, Euskal Herriko Unibertsitatea / University of the Basque Country, Spain
Felix Sasaki, DFKI GmbH and W3C fellow, Germany
Virach Sornlertlamvanich, Sirindhorn International Institute of Technology / Thammasat University, Thailand
Ferran Suay, Universitat de València, Spain
Jörg Tiedemann, Uppsala Universitet, Sweden
Francis M. Tyers, Norges Arktiske Universitet, Norway

Preface

The LREC 2016 Workshop on "Collaboration and Computing for Under-Resourced Languages: Towards an Alliance for Digital Language Diversity" (CCURL 2016) explores the relationship between language and the Internet, specifically the web of documents and the web of data, as well as the emerging Internet of Things; this relationship is a growing area of research, development, innovation and policy interest. The emerging picture is one where language profoundly affects a person's experience of the Internet by determining the amount of accessible information and the range of services that can be available, e.g. by shaping the results of a search engine, and the number of everyday tasks that can be carried out virtually.

The extent to which a language can be used over the Internet or in the Web not only affects a person's experience and choice of opportunities; it also affects the language itself. If a language is poorly supported, or not supported at all, for use on digital devices, for instance if the keyboard of a device is not equipped with the characters and diacritics necessary to write in the language, or if there is no such support for the language whatsoever, then its usability becomes severely affected, and it might never be used online. The language could become "digitally endangered", and its value and profile could be lessened, especially in the eyes of new generations. On the other hand, concerted efforts to develop a language technologically could contribute to the digital ascent and digital vitality of a language, and therefore to digital language diversity.

These considerations call for a closer examination of a number of related issues. First, there is the issue of "digital language diversity": the Internet appears to be far from linguistically diverse. With a handful of languages dominating the Web, there is a linguistic divide that parallels and reinforces the digital divide. The amount of information and the range of services that are available in digitally less widely used languages are reduced, thus creating inequality in the digital opportunities and linguistic rights of citizens. This may ultimately lead to unequal digital dignity, i.e. an uneven perception of a language's importance as a function of its presence on digital media, and unequal opportunities for digital language survival.

Second, it is important to reflect on the conditions that make it possible for a language to be used on digital devices, and on what can be done in order to grant this possibility to languages other than the so-called "major" ones. Despite its increasing penetration in daily applications, language technology is still under development even for these major languages, and with the current pace of technological development, there is a serious risk that some languages will be left wanting in terms of advanced technological solutions such as smart personal assistants, adaptive interfaces, or speech-to-speech translation. We refer to such languages as under-resourced. The notion of digital language diversity may therefore be interpreted as a digital universe that allows the comprehensive use of as many languages as possible.

All the papers accepted for the Workshop address at least one of these issues, thereby making a noteworthy contribution to the relevant scholarly literature and to the technological development of a wide variety of under-resourced languages. Each of the fifteen accepted papers was reviewed by at least three members of the Programme Committee; eight are presented as oral presentations and six as posters.
We look forward to collaboratively and computationally building on this growing tradition of CCURL in the future for the continued benefit of all the under-resourced languages of the world!

C. Soria, L. Pretorius, T. Declerck, J. Mariani, K. Scannell, E. Wandl-Vogt May 2016

Invited Talk Monday, 23 May 2016, 09:30–10:30

Oxford Global Languages: a Defining Project Jon French

Oxford Global Languages is a major new initiative which will enable millions of people across the globe to find answers online to their everyday language questions in 100 of the world’s languages. For the first time, large quantities of quality lexical information for these languages will be systematically created, collected, and made available, in a single linked repository, to speakers and learners. The OGL programme will:

• Enable the development of new digital tools and resources to revitalise and support under-represented world languages.

• Give these languages a living, growing, vibrant presence in the digital landscape.

• Document and include living languages including their variants and dialects, truly recording how they are used today.

• Provide an interactive community in which people can suggest new content, ask questions, and discuss language; these are living dictionaries of real languages that the community will help to build.

Which languages will be included? This ambitious initiative includes major global languages and digitally under-represented ones – those which are actively spoken and used by large communities but which have little digital capacity or accessibility. These digitally under-represented languages and their speakers are increasingly disadvantaged in social, business, and cultural areas of life because resources in the digital world are focused on a small number of globally predominant languages. Oxford Global Languages launched its first two language sites, isiZulu and Northern Sotho, in September 2015. Many more will be added over the next few years. See the first sites for yourself: isiZulu zu.oxforddictionaries.com; Northern Sotho nso.oxforddictionaries.com

Session 1 Monday 23 May, 11:00–12:40

Basic Language Resource Kits for Endangered Languages: A Case Study of Plains Cree

Antti Arppe, Jordan Lachler, Trond Trosterud, Lene Antonsen, and Sjur N. Moshagen

Using Plains Cree as an example case, we describe and motivate the adaptation of the BLARK approach for endangered, less-resourced languages (resulting in an EL-BLARK), based on (1) what linguistic resources are most likely to be readily available, (2) which end-user applications would be of most practical benefit to these language communities, and (3) which computational linguistic technologies would provide the most reliable benefit with respect to the development efforts required.

Building Bilingual Dictionaries for Minority and Endangered Languages with Mediawiki

George Dueñas and Diego Gómez

In this paper, we present the ongoing work of the indigenous languages team at the Caro and Cuervo Institute in developing language resources and technologies to document and revitalize minority languages which are to some degree endangered. This work consists of creating not only electronic dictionaries, but also a space where linguistic and cultural information about the language, such as personal names and toponyms, is stored. In order to do this, we have created different templates for placing the information. The main software that we have used to create these dictionaries is MediaWiki and the Semantic MediaWiki extension (free and open-source software). The MediaWiki software, adapted to lexicographic needs, has become an important tool in this project. When the information has been stored in variables, we can display it through queries in the Semantic MediaWiki syntax. All of these tools have enabled us to show and recover information from each lexicographic entry. Two bilingual (uni- and bidirectional) dictionaries of the Saliba and Carijona indigenous languages of Colombia were built with these tools. These dictionaries will increase the amount of content available for such languages on the Internet and somewhat reduce the lack of places for using these languages on digital media.

Quantitative and Qualitative Analysis in the Work with African Languages

Dorothee Beermann, Tormod Haugland, Lars Hellan, Uwe Quasthoff, Thomas Eckart, and Christoph Kuras

We discuss the development of a combined search environment for the Leipzig Corpora Collection (LCC) and the TypeCraft Interlinear Glossed Text Repository (TC). This digital infrastructure facilitates corpus methodologies for under-resourced languages. By providing multiple-site accessibility of all data, we hope to give a new impetus to linguists and language experts to employ digital services for data analytics. The definition of export and import APIs using Web Services is shown to be useful for collaboration between two different projects, and for extending and combining existing linguistic material. In this way we also increase the access to data from under-resourced languages.

Somali Spelling Corrector and Morphological Analyzer

Nikki Adams and Michael Maxwell

For any language, a basic requirement for a natural language processing capability is the availability of an electronic dictionary in a form which other NLP tools can use. For all but isolating (or nearly isolating) languages, another basic requirement is the capability to both generate and analyze all inflected forms. This second requirement is usually fulfilled by a finite state transducer that uses the morphological (and perhaps phonological) rules of the written language, together with the dictionary. A third need, for languages where there is significant variation in spelling, is a spell corrector, which can also be implemented as a finite state transducer. These three resources are mutually supportive: the morphological transducer requires the dictionary, and because of the properties of finite state grammars, it is simple to couple finite state transducers together, giving inflectional lookup of misspelled forms. And testing the parser and spell corrector on a web corpus can supply new words for the dictionary, completing the cycle. This paper reports on the cleaning of an electronic dictionary for Somali, the construction of a Somali morphological analyzer and a spelling corrector, and their resulting composed form. Somali is an Afro-Asiatic language in the Cushitic sub-family with complex morphology, complex morpho-phonological rules, and an orthography which, though officially standardized, is often not used consistently among speakers of the language. The electronic dictionary is showing signs of age; in particular, we believe there is need for expansion of its vocabulary to cover modern forms. While we have not as yet implemented dictionary expansion for Somali, we describe similar work in Yemeni and Sudanese Arabic, which could be extended to Somali.
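The paper composes finite state transducers; a faithful example would require an FST toolkit, so the sketch below is a simplified dictionary-based stand-in for the same idea: generate edit-distance-1 variants of a token and keep only those a (toy) analyzer accepts. The stem lexicon and the suffix rule are hypothetical illustrations, not actual Somali morphology.

```python
# Simplified stand-in for "spell corrector composed with analyzer": generate
# edit-distance-1 candidates and keep those the toy analyzer can parse.
ALPHABET = "abcdefghijklmnopqrstuvwxyz'"

def edits1(word):
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    replaces = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return deletes | replaces | inserts | swaps

def analyze(word, stems, suffixes=("", "o", "ka")):
    """Toy analyzer: stem + optional suffix (placeholder for the real FST)."""
    return [(word[: len(word) - len(s)] if s else word, s) for s in suffixes
            if word.endswith(s) and (word[: len(word) - len(s)] if s else word) in stems]

def correct_and_analyze(word, stems):
    candidates = [word] + sorted(edits1(word))   # prefer the original form
    for cand in candidates:
        analyses = analyze(cand, stems)
        if analyses:
            return cand, analyses
    return word, []

stems = {"buug"}                                  # hypothetical dictionary entry
print(correct_and_analyze("buugka", stems))       # well-formed: analyzed directly
print(correct_and_analyze("buukga", stems))       # misspelled: corrected, then analyzed
```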

Session 2 Monday, 23 May 2016, 14:00–16:00

Reprinting Scholarly Works as e-Books for Under-Resourced Languages

Delyth Prys, Mared Roberts, and Gruffudd Prys

This paper on the DECHE project for Digitization, E-publishing, and Electronic Corpus reports on a project undertaken for the Coleg Cymraeg Cenedlaethol, the virtual Welsh-medium College for universities in Wales. The DECHE project's aim is to digitize out-of-print scholarly works across multiple disciplines and help create a library of e-books available to Welsh-speaking academics and students. The context of e-book publication in Wales, and the digitization and e-book production process is described, together with the software tools used. The criteria for selecting a shortlist of books for inclusion are given, as are the types of books chosen. Attitudes and take-up of the students surveyed are also discussed, as are the dissemination of the resulting e-books, and statistics of use. This takes place in the context of the increasing popularity of e-books for education, including at university level, and their value for less-resourced language communities because of the lower production and distribution costs, and their contribution to raising the image and status of those languages as fit for purpose in a digital age.

Supporting Language Teaching Online

Cat Kutay

This paper presents the work done in a project to support the teaching of New South Wales (NSW) Indigenous languages through online and mobile systems. The process has incorporated languages with a variety of resources available and involved community workshops to engage speakers and linguists in developing and sharing these resources with learners. This research looks at Human Computer Interaction (HCI) for developing interfaces for Indigenous language learning, by considering the knowledge sharing practices used in the communities, and we compare this work in Australia with similar findings on language reclamation with the Penan indigenous people of Malaysia. The HCI studies have been conducted in workshops with linguists and community members interested in studying and teaching their language. The web services developed and used for various languages use processes of tacit knowledge sharing in an online environment.

Assessing Digital Vitality: Analytical and Activist Approaches

Maik Gibson

The digital vitality of a language is of concern to many, including linguists and members of the communities that use the language. Kornai (2013) has established a framework for measuring digital vitality which he has used to map the numbers of languages at four different levels: Digitally Thriving, Vital, Heritage and Still. This overview is very useful in understanding the overall challenge that minority languages face in digital use. However, when working with a particular community which speaks a minority language, we argue that it is useful to add two more levels of analysis, which would be difficult to justify in an empirical study using mass comparison. As activists we need to be able to identify Emergent use, which may not be visible to web crawling, for example in private messaging. Furthermore, identifying where conditions exist for possible digital ascent (e.g. literacy practices, intergenerational transmission of the language in the home) justifies a level we name Latent, which is by its very nature not something to be observed empirically. However, the potential for digital development of a language at this level is, we argue, different from that of one which is truly Still and unlikely to ascend. The inclusion of these categories might assist digital language activists and communities in effective planning for digital ascent, while not being useful categories for empirical mass comparison. Therefore we propose that action research with a linguistic community will benefit from a five-level framework, and demonstrate how it may be used.

Digital Language Diversity: Seeking the Value Proposition

Martin Benjamin

This paper is a response to the CCURL workshop call for discussion about issues pertaining to the creation of an Alliance for Digital Language Diversity. As a global project, Kamusi has been building collaborative relationships with numerous organizations, becoming more familiar than most with global activities and the global funding situation for less-resourced languages. This paper reviews the experiences of many involved with creating or using digital resources for diverse languages, with an analysis of who finds such resources important, who does not, what brings such resources into existence, and what the barriers are to the wider development of inclusive language technology. It is seen that practitioners face obstacles to maximizing the effects of their own work and gaining from the advances of others due to a funding environment that does not recognize the value of linguistic resources for diverse languages, as either a social or an economic good. Proposed solutions include the normalization of the expectation that digital services will be available in major local languages, international legal requirements for language provision on par with European regulations, involvement of speaker communities in the guided production of open linguistic resources, and the formation of a research consortium that can together build a common linguistic data infrastructure.

Poster Session Monday, 23 May 2016, 16:30–17:30

Innovative Technologies for Under-Resourced Language Documentation: The BULB Project

Sebastian Stüker, Gilles Adda, Martine Adda-Decker, Odette Ambouroue, Laurent Besacier, David Blachon, Hélène Bonneau-Maynard, Elodie Gauthier, Pierre Godard, Fatima Hamlaoui, Dmitry Idiatov, Guy-Noel Kouarata, Lori Lamel, Emmanuel-Moselly Makasso, Markus Müller, Annie Rialland, Mark Van de Velde, François Yvon, and Sabine Zerbian

The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. To achieve this, we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this, we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps: 1) Collection of a large corpus of speech (100h per language) at a reasonable cost. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French. 2) Automatic transcription of the Bantu languages at phoneme level and of the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned to extract Bantu morphemes. 3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists' needs and technology capabilities. Data collection has begun for the three languages. We have been using standard mobile devices and a dedicated piece of software, LIG-AIKUMA, which offers a range of speech collection modes (recording, re-speaking, translation and elicitation). LIG-AIKUMA's improved features include smart generation and handling of speaker metadata as well as re-speaking and parallel audio data mapping.

Corpus Collection for Under-Resourced Languages with More than One Million Speakers

Dirk Goldhahn, Maciej Sumalvico, and Uwe Quasthoff

For only 40 of the about 350 languages with more than one million speakers, the situation concerning text resources is comfortable. For the remaining languages, the number of speakers indicates a need for both corpora and tools. This paper describes a corpus collection initiative for these languages. While random Web crawling has serious limitations, native speakers with knowledge of web pages in their language are of invaluable help. The aim is to give interested scholars and language enthusiasts the possibility to contribute to corpus creation or extension by simply entering a URL into the Web Interface. Using a Web portal, URLs of interest are collected with the help of the respective communities. A standardized corpus processing chain for daily newspaper corpora creation is adapted to append newly added web pages to an increasing corpus. As a result we will be able to collect larger corpora for under-resourced languages by a community effort. These corpora will be made publicly available.

Building Intelligent Digital Assistants for Speakers of a Lesser-Resourced Language

Dewi Bryn Jones and Sarah Cooper

This paper reports on the work to develop intelligent digital assistants for speakers of a lesser-resourced language, namely Welsh. Such assistants, provided by commercial vendors such as Apple (Siri), Amazon (Alexa), Microsoft (Cortana) and Google (Google Now), increasingly allow users to speak in natural English with their devices and computers in order to complete tasks, obtain assistance and request information. We demonstrate how these systems' architectures do not provide the means for external developers to build intelligent speech interfaces for additional languages, which, in the case of less-resourced languages, are likely to remain unsupported. Consequently we document how such an obstacle has been tackled with open alternatives. The paper highlights how previous work on Welsh language speech recognition was improved, utilized and integrated into an existing open source intelligent digital assistant software project. The paper discusses how this work hopes to stimulate further developments and include Welsh and other lesser-resourced languages in as many developments of intelligent digital assistants as possible.

Multiword Expressions for Capturing Stylistic Variation Between Genders in the Lithuanian Parliament

Justina Mandravickaite and Michael Oakes

The relation between gender and language has been studied by many authors, but there is no general agreement regarding the influence of gender on language usage in professional environments. This could be because in most of the studies the data sets are too small or the texts of individual authors are too short to successfully capture differences in language usage according to gender. This study draws on a larger corpus of transcribed speeches in the Lithuanian Parliament (1990-2013) to explore gender differences in language in a setting of political debate, using stylometric analysis. The experimental set-up uses multiword expressions as features (formulaic language allows a more detailed interpretation of the results in comparison to character n-grams or even most frequent words) combined with unsupervised machine learning algorithms to avoid the class imbalance problem. MWEs as features, in combination with distance measures and hierarchical clustering, were successful in capturing and mapping differences in speech according to gender in the Lithuanian Parliament. Our results agree with the experimental outcomes of Hoover (2002) and Hoover (2003), where frequent word sequences and collocations combined with clustering showed more accurate results than frequent words alone.
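A minimal sketch of the experimental set-up described above (MWE-style bigram frequency profiles per speaker, a distance matrix, hierarchical clustering) follows. The three toy 'speeches' and the choice of cosine distance with average linkage are assumptions for illustration, not the study's actual features or settings.

```python
# Minimal sketch: bigram "MWE" frequency profiles per speaker, a cosine
# distance matrix, and hierarchical clustering.  Toy data only.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def bigram_profile(tokens, vocab):
    counts = np.zeros(len(vocab))
    for bg in zip(tokens, tokens[1:]):
        if bg in vocab:
            counts[vocab[bg]] += 1
    return counts / max(len(tokens) - 1, 1)     # relative frequencies

speeches = {
    "speaker_a": "gerbiamieji kolegos prašau balsuoti prašau balsuoti".split(),
    "speaker_b": "gerbiamieji kolegos dėkoju už dėmesį dėkoju už dėmesį".split(),
    "speaker_c": "prašau balsuoti gerbiamieji kolegos prašau balsuoti".split(),
}
all_bigrams = sorted({bg for toks in speeches.values() for bg in zip(toks, toks[1:])})
vocab = {bg: i for i, bg in enumerate(all_bigrams)}

X = np.vstack([bigram_profile(toks, vocab) for toks in speeches.values()])
Z = linkage(pdist(X, metric="cosine"), method="average")
print(dict(zip(speeches, fcluster(Z, t=2, criterion="maxclust").tolist())))
```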

Open Source Code Serving Endangered Languages

Richard Littauer and Hugh Paterson III

We present a database of open source code that can be used by low-resource language communities and developers to build digital resources. Our database is also useful to software developers working with those communities and to researchers looking to describe the state of the field when seeking funding for development projects.

Morphology Learning for Zulu

Uwe Quasthoff, Dirk Goldhahn, and Sonja Bosch

Morphology is known to follow structural regularities, but there are always exceptions. The number and complexity of the exceptions depend on the language under consideration. Substring classifiers are shown to perform well for different languages with different amounts of training data. For a less-resourced Bantu language like Zulu, the learning curve is compared to that of a well-resourced language like German, whose learning curve might be considered as an extrapolation for the less-resourced languages in the case of larger training set sizes. Two substring classifiers, TiMBL and the Compact Patricia Tree Classifier, are tested and shown to give comparable results.
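A minimal sketch of a substring classifier in the spirit of those compared here (a longest-matching-suffix model with back-off) is shown below. The training pairs are toy English-style examples, purely illustrative; they are not Zulu data, and TiMBL and the Compact Patricia Tree Classifier are considerably more sophisticated than this sketch.

```python
# Minimal longest-matching-suffix classifier for morphological labels.
from collections import Counter, defaultdict

def train_suffix_classifier(pairs, max_len=4):
    """Map each suffix (up to max_len chars) to its most frequent class."""
    votes = defaultdict(Counter)
    for word, label in pairs:
        for k in range(1, min(max_len, len(word)) + 1):
            votes[word[-k:]][label] += 1
    return {suf: cnt.most_common(1)[0][0] for suf, cnt in votes.items()}

def classify(word, model, max_len=4, default="UNKNOWN"):
    """Back off from the longest known suffix to shorter ones."""
    for k in range(min(max_len, len(word)), 0, -1):
        if word[-k:] in model:
            return model[word[-k:]]
    return default

train = [("walked", "PAST"), ("talked", "PAST"), ("jumped", "PAST"),
         ("walking", "PROG"), ("talking", "PROG"), ("cats", "PL"), ("dogs", "PL")]
model = train_suffix_classifier(train)
print(classify("painted", model), classify("painting", model), classify("birds", model))
# -> PAST PROG PL
```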

LREC 2016 Workshop: Abstracts "CCURL 2016 – Collaboration and Computing for Under-Resourced Languages: Towards an Alliance for Digital Language Diversity"

23 May 2016 – Portorož, Slovenia

Edited by Claudia Soria, Laurette Pretorius, Thierry Declerck, Joseph Mariani, Kevin Scannell, Eve- line Wandl-Vogt http://www.ilc.cnr.it/ccurl2016/

Acknowledgments: the CCURL 2016 Workshop is endorsed by the Erasmus+ DLDP project (grant agreement no.: 2015-1-IT02-KA204-015090).

Improving Social Inclusion Using NLP: Tools and Resources

23 May 2016

ABSTRACTS

Editors:

Ineke Schuurman, Vincent Vandeghinste, Horacio Saggion

Workshop Programme

09:00 – 09:15 Opening

09:15 – 09:40 Liz Tilly, Issues relating to using a co-productive approach in an accessible technology project

09:40 – 10:05 Krzysztof Wróbel, Dawid Smoleń, Dorota Szulc and Jakub Gałka, Development of the First Polish Sign Language Part-of-Speech Tagger

10:05 – 10:30 Leen Sevens, Tom Vanallemeersch, Ineke Schuurman, Vincent Vandeghinste and Frank Van Eynde, Automated Spelling Correction for Dutch Internet Users with Intellectual Disabilities

10:30 – 11:00 Coffee break

11:00 – 11:25 Victoria Yaneva, Richard Evans and Irina Temnikova, Predicting Reading Difficulty for Readers with Autism Spectrum Disorder

11:25 – 11:50 Estela Saquete, Ruben Izquierdo Bevia and Sonia Vazquez, SimplexEduReading: Simplification of Natural Language for Reading Comprehension Improvement in Education

11:50 – 12:40 Lucia Specia (invited speaker), Text Simplification for Social Inclusion

12:40 – 12:55 General discussion

12:55 – 13:00 Closing

Workshop Organizers

Ineke Schuurman KU Leuven (BE) Vincent Vandeghinste KU Leuven (BE) Horacio Saggion Universitat Pompeu Fabra, Barcelona (ES)

Workshop Programme Committee

Susana Bautista, Federal University of Rio Grande do Sul (BR)
Heidi Christensen, University of Sheffield (UK)
Onno Crasborn, Radboud University (NL)
Koenraad De Smedt, University of Bergen (NO)
Nuria Gala, Aix-Marseille Université (FR)
Peter Ljunglöf, University of Gothenburg (SE)
Isa Maks, VU University Amsterdam (NL)
Davy Nijs, UC Leuven-Limburg (BE)
Jean-Pierre Martens, Ghent University (BE)
Martin Reynaert, Tilburg University and Radboud University (NL)
Horacio Saggion, Universitat Pompeu Fabra (ES)
Ineke Schuurman, University of Leuven (BE)
Liz Tilly, University of Wolverhampton (UK)
Vincent Vandeghinste, University of Leuven (BE)
Hugo Van hamme, University of Leuven (BE)

Introduction

Social media are an inherent part of life in the 21st century and should be accessible to anyone. People who are to some extent functionally illiterate are currently excluded from properly using social media such as Twitter, Facebook and WhatsApp. Tools and resources are therefore needed that aim to bridge this social divide by providing technology that allows more or less functionally illiterate users to be included in current and future textual social media. Such technologies and resources allow or facilitate the conversion of non-verbal input (like signs or pictograms) into textual output, or help turn ill-formed input into proper text. In the other direction, these technologies and resources make it possible to automatically augment textual communication with other media, such as pictures, signs, video, and other forms of Augmentative and Alternative Communication, or to simplify and reduce messages so that the information is disclosed to the targeted users, and the users are no longer excluded from taking part in social media.

In the following, some of these tools and resources are presented.

Issues relating to using a co-productive approach in an accessible technology project Liz Tilly; Faculty of Education, Health and Wellbeing, University of Wolverhampton, United Kingdom

This paper discusses the issue of accessibility of being online and using social media for people with a learning disability, and the challenges of using a co-production approach in an accessible technology project. While an increasing number of daily living tasks are now completed online, people with a learning disability frequently experience digital exclusion due to limited literacy and IT skills. The Able to Include project sought to engage people with a learning disability as active partners to test and give feedback on the use and development of a pictogram app used to make social media more accessible. The challenges mainly related to the feedback needing to be sent electronically to the partners; there was only minimal contact with them and no face-to-face contact. The paper also outlines how other challenges were overcome to enable genuine and meaningful co-production. These included addressing online safety and ethical issues regarding anonymity.

Development of the First Polish Sign Language Part-of-Speech Tagger Krzysztof Wróbel, Dawid Smoleń, Dorota Szulc and Jakub Gałka; Department of Computational Linguistics, Jagiellonian University, Kraków, Poland

This article presents the development of the first part-of-speech (POS) tagger for Polish Sign Language (PJM). Due to the lack of PJM corpora, a data set consisting of 34.5 thousand sentences was automatically created and annotated. This was done using a machine translation (MT) system from Polish to PJM. The annotation with POS tags is done concurrently, by transferring and mapping them from Polish. The POS tagger is trained using a sequence classifier and tested on a manually developed PJM corpus. The results are compared to other taggers for various languages, and an error analysis is performed. This paper shows that it is possible to develop a POS tagger with promising results using a transfer-based MT system. The created PJM corpus will be publicly shared.
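The annotation-projection step described above can be pictured with the following schematic sketch: POS tags assigned to the Polish source are carried over to PJM glosses through a word alignment. The tagged sentence, the glosses and the alignment are toy examples, not output of the actual MT system.

```python
# Schematic sketch of POS-tag projection from Polish to PJM glosses through
# a word alignment (toy data, hypothetical alignment).
def project_tags(src_tagged, tgt_tokens, alignment, default="X"):
    """alignment maps target positions to source positions (or None)."""
    projected = []
    for j, gloss in enumerate(tgt_tokens):
        i = alignment.get(j)
        tag = src_tagged[i][1] if i is not None else default
        projected.append((gloss, tag))
    return projected

src = [("mama", "NOUN"), ("gotuje", "VERB"), ("obiad", "NOUN")]   # Polish, tagged
tgt = ["MAMA", "OBIAD", "GOTOWAĆ"]                                # PJM glosses
alignment = {0: 0, 1: 2, 2: 1}            # target index -> source index
print(project_tags(src, tgt, alignment))
# -> [('MAMA', 'NOUN'), ('OBIAD', 'NOUN'), ('GOTOWAĆ', 'VERB')]
```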

Automated Spelling Correction for Dutch Internet Users with Intellectual Disabilities Leen Sevens, Tom Vanallemeersch, Ineke Schuurman, Vincent Vandeghinste and Frank Van Eynde; Centre for Computational Linguistics, University of Leuven, Belgium

We present the first version of an automated spelling correction system for Dutch Internet users with Intellectual Disabilities (ID). The normalization of ill-formed messages is an important preprocessing step before any conventional Natural Language Processing (NLP) process can be applied. As such, we describe the effects of automated correction of Dutch ID text within the larger framework of a Text-to-Pictograph translation system. The present study consists of two main parts. First, we thoroughly analyze email messages that have been written by users with cognitive disabilities in order to gain insights on how to develop solutions that are specifically tailored to their needs. We then present a new, generally applicable approach toward context-sensitive spelling correction, based on character-level fuzzy matching techniques. The resulting system shows significant improvements, although further research is still needed.
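As a minimal stand-in for the character-level fuzzy matching idea (not the authors' system), the sketch below ranks lexicon entries by character-level similarity and replaces out-of-lexicon tokens with the closest match above a threshold. The toy lexicon, message and threshold value are assumptions.

```python
# Minimal character-level fuzzy matching against a lexicon (toy data).
from difflib import SequenceMatcher

LEXICON = ["morgen", "ik", "kom", "naar", "huis", "vandaag", "misschien"]

def best_match(token, lexicon, threshold=0.75):
    """Return the lexicon word with the highest character-level similarity,
    or the token itself if it is in the lexicon or nothing is close enough."""
    scored = [(SequenceMatcher(None, token.lower(), w).ratio(), w) for w in lexicon]
    score, word = max(scored)
    return word if score >= threshold and token.lower() not in lexicon else token

message = "ik kom morgn naar huus"
print(" ".join(best_match(tok, LEXICON) for tok in message.split()))
# -> "ik kom morgen naar huis"
```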

Predicting Reading Difficulty for Readers with Autism Spectrum Disorder Victoria Yaneva*, Richard Evans* and Irina Temnikova**; *Research Institute in Information and Language Processing, University of Wolverhampton, UK and **Qatar Computing Research Institute, HBKU, Doha, Qatar

People with autism experience various reading comprehension difficulties, which is one explanation for the early school dropout, reduced academic achievement and lower levels of employment in this population. To overcome this issue, content developers who want to make their textbooks, websites or social media accessible to people with autism (and thus for every other user), but who are not necessarily experts in autism, can benefit from tools which are easy to use, which can assess the accessibility of their content, and which are sensitive to the difficulties that autistic people might have when processing texts/websites. In this paper we present a preliminary machine learning readability model for English developed specifically for the needs of adults with autism. We evaluate the model on the ASD corpus, which has been developed specifically for this task and is, so far, the only corpus for which readability for people with autism has been evaluated.

SimplexEduReading: Simplification of Natural Language for Reading Comprehension Improvement in Education Estela Saquete*, Ruben Izquierdo Bevia** and Sonia Vazquez*; *University of Alicante, Spain and **Vrije Universiteit Amsterdam, The Netherlands

The main aim of this paper is to present a system, known as SimplexEduReading, capable of transforming educational natural language texts in Spanish into simpler and enriched texts in order to improve the reading comprehension process. The goal is to help people with comprehension problems, for instance, deaf people or people who are learning a language. The transformation process consists of applying different Natural Language Processing techniques in order to automatically detect the linguistic features involved in the problem and: a) simplify the text while preserving the original meaning, and b) enrich the text with very simple additional information. The transformations and supporting information that the system provides include: 1) detecting named entities and providing extra information about them, for example, related images, information from Wikipedia or synonyms; 2) detecting and resolving temporal expressions, giving also a chronological timeline of the events; 3) simplifying complex sentences by dividing them into simpler ones; 4) providing the definition of any word of the text; and 5) detecting the main topics of the text in order to easily provide the reader with the context of the text.

Text Simplification for Social Inclusion (Invited talk) Lucia Specia; Natural Language Processing group, Department of Computer Science, University of Sheffield, United Kingdom

LREC 2016 Workshop

Resources and Processing of Linguistic and Extra-Linguistic Data from People with Various Forms of Cognitive/Psychiatric Impairments (RaPID-2016)

Date: Monday, 23rd of May 2016

ABSTRACTS

Editor:

Dimitrios Kokkinakis


LREC 2016 Workshop: Abstracts “Resources and Processing of Linguistic and Extra-Linguistic Data from People with Various Forms of Cognitive/Psychiatric impairments (RaPID-2016)”

23 May 2016 – Portorož, Slovenia

Edited by Dimitrios Kokkinakis http://spraakbanken.gu.se/eng/rapid-2016

Acknowledgments: This work has received support from Riksbankens Jubileumsfond - The Swedish Foundation for Humanities and Social Sciences through the grant agreement no:NHS14-1761:1; the Centre for Ageing and Health (AGECAP) and the Speech and Language Processing for Assistive Technologies (SIG-SLPAT).

Workshop Programme

14:00 – 14:10 Welcome and Introduction

14:10-14:55 Invited keynote talk by Dr Peter Garrard, St George's, University of London: Neurobehavioural disease signatures in language corpora

15:00-16:00 Session A (20+10 min x 2 papers) Kathleen C. Fraser and Graeme Hirst, Detecting semantic changes in Alzheimer's disease with vector space models

Christine Howes, Mary Lavelle, Patrick G.T. Healey, Julian Hough and Rose McCabe: Helping hands? Gesture and self-repair in schizophrenia

16:00 – 16:30 Coffee break

16:30-17:50 Session B (15+5 min x 4 papers) Spyridoula Varlokosta, Spyridoula Stamouli, Athanassios Karasimos, Georgios Markopoulos, Maria Kakavoulia, Michaela Nerantzini, Aikaterini Pantoula, Valantis Fyndanis, Alexandra Economou and Athanassios Protopapas: A Greek Corpus of Aphasic Discourse: Collection, Transcription, and Annotation Specifications

Mariya Khudyakova, Mira Bergelson, Yulia Akinina, Ekaterina Iskra, Svetlana Toldova and Olga Dragoy: Russian CliPS: a Corpus of Narratives by Brain-Damaged Individuals

Maksim Belousov, Mladen Dinev, Rohan M. Morris, Natalie Berry, Sandra Bucci and Goran Nenadic: Mining Auditory Hallucinations from Unsolicited Twitter Posts

Christopher Bull, Dommy Asfiandy, Ann Gledson, Joseph Mellor, Samuel Couth, Gemma Stringer, Paul Rayson, Alistair Sutcliffe, John Keane, Xiaojun Zeng, Alistair Burns, Iracema Leroi, Clive Ballard and Pete Sawyer: Combining data mining and text mining for detection of early stage dementia: the SAMS framework

17:50 – 18:00 Closing remarks

Workshop Organizers

Dimitrios Kokkinakis, University of Gothenburg, Sweden
Graeme Hirst, University of Toronto, Canada
Natalia Grabar, Université de Lille, France
Arto Nordlund, The Sahlgrenska Academy, Sweden
Jens Edlund, KTH - Royal Institute of Technology, Sweden
Åsa Wengelin, University of Gothenburg, Sweden
Simon Dobnik, University of Gothenburg, Sweden
Marcus Nyström, University of Lund, Sweden

Workshop Programme Committee

Jan Alexandersson, DFKI GmbH, Germany
Jonas Beskow, KTH - Royal Institute of Technology, Sweden
Heidi Christensen, University of Sheffield, UK
Simon Dobnik, University of Gothenburg, Sweden
Jens Edlund, KTH - Royal Institute of Technology, Sweden
Gerardo Fernández, Universidad Nacional del Sur, Argentina
Peter Garrard, St George's, University of London, UK
Kallirroi Georgila, University of Southern California, USA
Natalia Grabar, Université de Lille, France
Nancy L. Green, U. of North Carolina at Greensboro, USA
Katarina Heimann Mühlenbock, University of Gothenburg, Sweden
Graeme Hirst, University of Toronto, Canada
Kristy Hollingshead, Florida Institute for Human & Machine Cognition (IHMC), USA
William Jarrold, Nuance Communications, USA
Richard Johansson, University of Gothenburg, Sweden
Dimitrios Kokkinakis, University of Gothenburg, Sweden
Yiannis Kompatsiaris, Centre for Research & Technology Hellas, Greece
Alexandra König, Toronto Rehabilitation Institute, Canada
Peter Ljunglöf, Chalmers University of Technology, Sweden
Karmele López-de-Ipiña, U. of the Basque Country (UPV/EHU), Spain
Arto Nordlund, The Sahlgrenska Academy, Sweden
Marcus Nyström, University of Lund, Sweden
François Portet, Laboratoire d'informatique de Grenoble, France
Vassiliki Rentoumi, SKEL, NCSR Demokritos, Greece
Frank Rudzicz, University of Toronto, Canada
Paul Thompson, Dartmouth College, USA
Magda Tsolaki, Aristotle University of Thessaloniki, Greece
Spyridoula Varlokosta, National & Kapodistrian U. of Athens, Greece
Åsa Wengelin, University of Gothenburg, Sweden
Maria Wolters, University of Edinburgh, UK

Preface/Introduction

The purpose of the Workshop on "Resources and ProcessIng of linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric impairments" (RaPID-2016) was threefold: to provide a snapshot of the current technological landscape, resources, data samples, needs and challenges in the area of processing data from individuals with various types of mental and neurological health impairments and similar conditions at various stages; to increase the knowledge, understanding, awareness and ability to achieve useful outcomes in this area; and to strengthen the collaboration between researchers and workers in the fields of clinical/nursing/medical sciences and those in the fields of language technology, computational linguistics and Natural Language Processing (NLP).

Although many of the causes of cognitive and neuropsychiatric impairments are difficult to foresee and accurately predict, physicians and clinicians work with a wide range of factors that potentially contribute to such impairments, e.g., traumatic brain injuries, genetic predispositions, side effects of medication, and congenital anomalies. In this context, there is new evidence that the acquisition and processing of linguistic data (e.g., spontaneous story telling) and extra-linguistic and production measures (e.g., eye tracking) could be used as a complement to clinical diagnosis and provide the foundation for future development of objective criteria to be used for identifying progressive decline or degeneration of normal mental and brain functioning.

An important new area of research in NLP emphasizes the processing, analysis, and interpretation of such data. Current research in this field, based on linguistically oriented analysis of text and speech produced by such populations compared to healthy adults, has shown promising outcomes, manifested in the early diagnosis and prediction of individuals at risk, the differentiation of individuals with varying degrees of severity of brain and mental illness, and the monitoring of the progression of such conditions through the diachronic analysis of language samples or other extra-linguistic measurements. Initially, work was based on written data, but there is a rapidly growing body of research based on spoken samples and other modalities.

Nevertheless, there remains significant work to be done to arrive at more accurate estimates for prediction purposes in the future and more research is required in order to reliably complement the battery of medical and clinical examinations currently undertaken for the early diagnosis or monitoring of, e.g., neurodegenerative and other brain and mental disorders and accordingly, aid the development of new, non-invasive, time and cost-effective and objective (future) clinical tests in neurology, psychology, and psychiatry.

Papers were invited in all of the areas outlined in the topics of interest below particularly emphasizing multidisciplinary aspects of processing such data and also on the exploitation of results and outcomes and related ethical questions. Specifically, in the call for papers we solicited papers on the following topics:

• Building and adapting domain-relevant linguistic resources, data, and tools, and making them available.
• Data collection methodologies.
• Acquisition of novel data samples, e.g. from digital pens (i.e., digital pen strokes) or keylogging, and integrating them with data from various sources (i.e., information fusion).
• Guidelines, annotation schemas, and tools (e.g., for semantic annotation of data sets).

53  Addressing the challenges of representation, including dealing with data sparsity and dimensionality issues, and feature combination from different sources and modalities,  Adaptation of standard NLP tools to the domain.  Syntactic, semantic, and pragmatic analysis of data, including modelling of perception (e.g., eye-movement measures of reading) and production processes (e.g., recording the writing process with digital pens, keystroke logging, etc.), use of gestures accompanying speech and non-linguistic behaviour.  Machine learning approaches for early diagnosis, prediction, monitoring, classification, etc. of various cognitive, psychological, and psychiatric impairments, including unsupervised methods (e.g., distributional semantics).  Evaluation of tools, systems, components, metrics, applications, and technologies that make use of NLP in the domain.  Evaluation, comparison, and critical assessment of resources.  Evaluation of the significance of extracted features.  Involvement of medical professionals and patients and ethical questions.  Deployment of resources.  Experiences, lessons learned, and the future of NLP in the area.

Most of these topics lie at the heart of the papers accepted to the workshop, which features six oral presentations.

We would like to thank all the authors who submitted papers, as well as the members of the Program Committee for the time and effort they contributed in reviewing them. We are also grateful to Dr Peter Garrard for agreeing to give an invited talk at the workshop, entitled “Neurobehavioural disease signatures in language corpora”.

The Editor

Session A Date / Time Monday 23rd of May, 15:00 – 16:00 Chairperson: Dimitrios Kokkinakis

Detecting semantic changes in Alzheimer's disease with vector space models

Kathleen C. Fraser and Graeme Hirst

Numerous studies have shown that language impairments, particularly semantic deficits, are evident in the narrative speech of people with Alzheimer’s disease from the earliest stages of the disease. Here, we present a novel technique for capturing those changes, by comparing distributed word representations constructed from healthy controls and Alzheimer’s patients. We investigate examples of words with different representations in the two spaces, and link the semantic and contextual differences to findings from the Alzheimer’s disease literature.
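A minimal sketch of this kind of comparison, assuming gensim 4.x and two files of tokenised transcripts (one per speaker group); the file names and probe words are hypothetical placeholders and this is not the authors' implementation:

```python
# Illustrative sketch: compare a word's semantic neighbourhood in embeddings
# trained separately on control vs. patient transcripts. File names and probe
# words are hypothetical; real work would use much larger corpora.
from gensim.models import Word2Vec

def load_sentences(path):
    """Read one tokenised utterance per line (whitespace-separated tokens)."""
    with open(path, encoding="utf-8") as f:
        return [line.lower().split() for line in f if line.strip()]

controls = Word2Vec(load_sentences("controls.txt"), vector_size=100, window=5, min_count=5)
patients = Word2Vec(load_sentences("patients.txt"), vector_size=100, window=5, min_count=5)

def neighbour_overlap(word, k=20):
    """Jaccard overlap of the word's top-k neighbours in the two spaces.
    Low overlap suggests the word is used in different contexts by the two groups."""
    if word not in controls.wv or word not in patients.wv:
        return None
    a = {w for w, _ in controls.wv.most_similar(word, topn=k)}
    b = {w for w, _ in patients.wv.most_similar(word, topn=k)}
    return len(a & b) / len(a | b)

for probe in ["boy", "water", "cookie"]:  # picture-description vocabulary, as an example
    print(probe, neighbour_overlap(probe))
```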

Helping hands? Gesture and self-repair in schizophrenia Christine Howes, Mary Lavelle, Patrick G.T. Healey, Julian Hough and Rose McCabe

Successful social encounters require mutual understanding between interacting partners, and patients with schizophrenia are known to experience difficulties in social interaction. Several studies have shown that in general people compensate for verbal difficulties by employing additional multimodal resources such as hand gesture. We hypothesise that this will be impaired in patients with schizophrenia, and present a preliminary study to address this question. The results show that during social interaction, schizophrenia patients repair their own speech less. In addition, although increased hand gesture is correlated with increased self-repair in healthy controls, there is no such association in patients with schizophrenia, or their interlocutors. This suggests that multimodal impairments are not merely seen on an individual level but may be a feature of patients’ social encounters.

Session B Date / Time Monday 23rd of May, 16:30 – 17:50 Chairperson: Mark Liberman

A Greek Corpus of Aphasic Discourse: Collection, Transcription, and Annotation Specifications

Spyridoula Varlokosta, Spyridoula Stamouli, Athanassios Karasimos, Georgios Markopoulos, Maria Kakavoulia, Michaela Nerantzini, Aikaterini Pantoula, Valantis Fyndanis, Alexandra Economou and Athanassios Protopapas

In this paper, the process of designing an annotated Greek Corpus of Aphasic Discourse (GREECAD) is presented. Given that resources of this kind are quite limited, a major aim of the GREECAD was to provide a set of specifications which could serve as a methodological basis for the development of other relevant corpora, and, therefore, to contribute to future research in this area. The GREECAD was developed with the following requirements: a) to include a rather homogeneous sample of Greek as spoken by individuals with aphasia; b) to document speech samples with rich metadata, which include demographic information, as well as detailed information on the patients’ medical record and neuropsychological evaluation; c) to provide annotated speech samples, which encode information at the micro-linguistic (words, POS, grammatical errors, clause types, etc.) and discourse level (narrative structure elements, main events, evaluation devices, etc.). In terms of the design of the GREECAD, the basic requirements regarding data collection, metadata, transcription, and annotation procedures were set. The discourse samples were transcribed and annotated with the ELAN tool. To ensure accurate and consistent annotation, a Transcription and Annotation Guide was compiled, which includes detailed guidelines regarding all aspects of the transcription and annotation procedure.

Russian CliPS: a Corpus of Narratives by Brain-Damaged Individuals

Mariya Khudyakova, Mira Bergelson, Yulia Akinina, Ekaterina Iskra, Svetlana Toldova and Olga Dragoy

In this paper we present a multimedia corpus of Pear film retellings by people with aphasia (PWA), right hemisphere damage (RHD), and healthy speakers of Russian. Discourse abilities of brain-damaged individuals are still under discussion, and the Russian CliPS (Clinical Pear Stories) corpus was created for the thorough analysis of the micro- and macro-linguistic levels of narratives by PWA and RHD. The current version of Russian CliPS contains 39 narratives by people with various forms of aphasia due to left hemisphere damage, 5 narratives by people with right hemisphere damage and no aphasia, and 22 narratives by neurologically healthy adults. The annotation scheme of Russian CliPS 1.0 includes the following tiers: quasiphonetic, lexical, lemma, part-of-speech tags, grammatical properties, errors, laughter, and segmentation into clauses and utterances. An analysis of measures such as informativeness, local and global coherence, anaphora, and macrostructure is planned as the next stage of the corpus development.

Mining Auditory Hallucinations from Unsolicited Twitter Posts

Maksim Belousov, Mladen Dinev, Rohan M. Morris, Natalie Berry, Sandra Bucci and Goran Nenadic

Auditory hallucinations are common in people who experience psychosis and psychotic-like phenomena. This exploratory study aimed to establish the feasibility of harvesting and mining datasets from unsolicited Twitter posts to identify potential auditory hallucinations. To this end, several search queries were defined to collect posts from Twitter. A training sample was annotated by research psychologists for relatedness to auditory hallucinatory experiences, and a text classifier was trained on that dataset to identify tweets related to auditory hallucinations. A number of features were used, including sentiment polarity and mentions of specific semantic classes, such as fear expressions, communication tools and abusive language. We then used the classification model to generate a dataset with potential mentions of auditory hallucinatory experiences. A preliminary analysis of a dataset (N = 4957) revealed that posts linked to auditory hallucinations were associated with negative sentiments. In addition, such tweets had a higher proportionate distribution between the hours of 11pm and 5am in comparison to other tweets.
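A minimal sketch of a classifier along these lines, combining bag-of-words features with lexicon-category counts; the lexicons, tweets and labels are toy placeholders, not the study's data or feature set:

```python
# Illustrative sketch of a tweet classifier that combines TF-IDF features with
# counts of lexicon categories (fear expressions, abusive language, ...).
# Lexicon entries, example tweets and labels are hypothetical placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

LEXICONS = {
    "fear": {"scared", "terrified", "afraid"},
    "abuse": {"idiot", "stupid"},
}

def lexicon_counts(texts):
    """One count per lexicon category for each tweet."""
    rows = []
    for t in texts:
        toks = set(t.lower().split())
        rows.append([len(toks & words) for words in LEXICONS.values()])
    return np.array(rows)

model = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
        ("lexicons", FunctionTransformer(lexicon_counts, validate=False)),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

tweets = ["i keep hearing voices and i am terrified", "great voice acting in this game"]
labels = [1, 0]  # 1 = related to auditory hallucinations (toy labels)
model.fit(tweets, labels)
print(model.predict(["the voices won't stop, i'm scared"]))
```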

Combining data mining and text mining for detection of early stage dementia: the SAMS framework

Christopher Bull, Dommy Asfiandy, Ann Gledson, Joseph Mellor, Samuel Couth, Gemma Stringer, Paul Rayson, Alistair Sutcliffe, John Keane, Xiaojun Zeng, Alistair Burns, Iracema Leroi, Clive Ballard and Pete Sawyer

In this paper, we describe the open-source SAMS framework whose novelty lies in bringing together both data collection (keystrokes, mouse movements, application pathways) and text collection (email, documents, diaries) and analysis methodologies. The aim of SAMS is to provide a non-invasive method for large scale collection, secure storage, retrieval and analysis of an individual’s computer usage for the detection of cognitive decline, and to infer whether this decline is consistent with the early stages of dementia. The framework will allow evaluation and study by medical professionals in which data and textual features can be linked to deficits in cognitive domains that are characteristic of dementia. Having described requirements gathering and ethical concerns in previous papers, here we focus on the implementation of the data and text collection components.

Emotion and Sentiment Analysis

ABSTRACTS

Editors:

J. Fernando Sánchez-Rada, Björn Schuller

Workshop Programme

23 May 2016

9:00 – 9:10 Introduction by Workshop Chair

9:10 – 10:30 Social Media
Cristina Bosco et al., Tweeting in the Debate about Catalan Elections
Ian D. Wood and Sebastian Ruder, Emoji as Emotion Tags for Tweets
Antoni Sobkowicz and Wojciech Stokowiec, Steam Review Dataset - new, large scale sentiment dataset

10:30 – 11:00 Coffee break

11:00 – 13:00 Corpora and Data Collection
Ebuka Ibeke et al., A Curated Corpus for Sentiment-Topic Analysis
Jasy Liew Suet Yan and Howard R. Turtle, EmoCues-28: Extracting Words from Emotion Cues for a Fine-grained Emotion Lexicon
Lea Canales et al., A Bootstrapping Technique to Annotate Emotional Corpora Automatically
Francis Bond et al., A Multilingual Sentiment Corpus for Chinese, English and Japanese

13:00 – 14:00 Lunch break

14:00 – 15:00 Personality and User Modelling
Shivani Poddar et al., PACMAN: Psycho and Computational Framework of an Individual (Man)
Veronika Vincze, Klára Hegedűs, Gábor Berend and Richárd Farkas, Telltale Trips: Personality Traits in Travel Blogs

15:00 – 16:00 Linked Data and Semantics
Minsu Ko, Semantic Classification and Weight Matrices Derived from the Creation of Emotional Word Dictionary for Semantic Computing
J. Fernando Sánchez-Rada et al., Towards a Common Linked Data Model for Sentiment and Emotion Analysis

16:00 – 16:30 Coffee break

16:30 – 18:00 Beyond Text Analysis
Bin Dong, Zixing Zhang and Björn Schuller, Empirical Mode Decomposition: A Data-Enrichment Perspective on Speech Emotion Recognition
Rebekah Wegener, Christian Kohlschein, Sabina Jeschke and Björn Schuller, Automatic Detection of Textual Triggers of Reader Emotion in Short Stories
Andrew Moore, Paul Rayson and Steven Young, Domain Adaptation using Stock Market Prices to Refine Sentiment Dictionaries

Workshop Organizers

J. Fernando Sánchez-Rada ∗ – UPM, Spain
Björn Schuller ∗ – Imperial College London, United Kingdom
Gabriela Vulcu – Insight Centre for Data Analytics, NUIG, Ireland
Carlos A. Iglesias – UPM, Spain
Paul Buitelaar – Insight Centre for Data Analytics, NUIG, Ireland
Laurence Devillers – LIMSI, France

Workshop Programme Committee

Elisabeth André – University of Augsburg, Germany
Noam Amir – Tel-Aviv University, Israel
Rodrigo Agerri – EHU, Spain
Cristina Bosco – University of Torino, Italy
Felix Burkhardt – Deutsche Telekom, Germany
Antonio Camurri – University of Genova, Italy
Montse Cuadros – VicomTech, Spain
Julien Epps – NICTA, Australia
Francesca Frontini – CNR, Italy
Diana Maynard – University of Sheffield, United Kingdom
Sapna Negi – Insight Centre for Data Analytics, NUIG, Ireland
Viviana Patti – University of Torino, Italy
Albert Salah – Boğaziçi University, Turkey
Jianhua Tao – CAS, P.R. China
Michel Valstar – University of Nottingham, United Kingdom
Benjamin Weiss – Technische Universität Berlin, Germany
Ian Wood – Insight Centre for Data Analytics, NUIG, Ireland

Preface/Introduction

ESA 2016 is the sixth edition of the highly successful series of workshops on Corpora for Research on Emotion. Like its predecessors, the aim of this workshop is to connect the related fields around sentiment, emotion and social signals, exploring the state of the art in applications and resources, with a special interest in multidisciplinarity, multilingualism and multimodality. This workshop is a much needed effort to fight the scarcity of quality annotated resources for emotion and sentiment research, especially for different modalities and languages.

This year’s edition once again puts an emphasis on common models and formats, as a standardization process would foster the creation of interoperable resources. In particular, researchers have been encouraged to share their experience with Linked Data representation of emotions and sentiment, or any other application of Linked Data in the field, such as enriching existing data or publishing corpora and lexica in the Linked Open Data cloud. Approaches to semi-automated and collaborative labeling of large data archives are also of interest, such as efficient combinations of active learning and crowdsourcing, in particular for combined annotations of emotion, sentiment, and social signals. Multi- and cross-corpus studies (transfer learning, standardisation, corpus quality assessment, etc.) are further highly relevant, given their importance for testing the generalisation power of models.

The workshop is supported by the Linked Data Models for Emotion and Sentiment Analysis W3C Community Group 1, the Association for the Advancement of Affective Computing 2 and the SSPNet 3; some of the members of the organizing committee of the present workshop are executive members of these bodies.

As organising committee of this workshop, we would like to thank the organisers of LREC 2016 for their tireless efforts and for accepting ESA as a satellite workshop. We also thank every single member of the programme committee for their support since the announcement of the workshop, and for their hard work with the reviews and feedback. Last, but not least, we are thankful to the community for the overwhelming interest and number of high-quality submissions. This is yet another proof that the emotion and sentiment analysis community is thriving. Unfortunately, not all submitted works could be represented in the workshop.

J. F. Sánchez-Rada, B. Schuller, G. Vulcu, C. A. Iglesias, P. Buitelaar, L. Devillers
May 2016

1 http://www.w3.org/community/sentiment/
2 http://emotion-research.net/
3 http://sspnet.eu/

Social Media Monday 23 May, 9:10 – 10:30 Chair: J. Fernando Sánchez-Rada

Tweeting in the Debate about Catalan Elections

Cristina Bosco, Mirko Lai, Viviana Patti, Francisco M. Rangel Pardo and Paolo Rosso

The paper introduces a new annotated Spanish and Catalan data set for Sentiment Analysis about the Catalan separatism and the related debate held in social media at the end of 2015. It focuses on the collection of data, where we dealt with the exploitation in the debate of two languages, i.e. Spanish and Catalan, and on the design of the annotation scheme, previously applied in the development of other corpora about political debates, which extends a polarity label set by making available tags for irony and semantic oriented labels. The annotation process is presented and the detected disagreement discussed.

Emoji as Emotion Tags for Tweets

Ian D. Wood and Sebastian Ruder

In many natural language processing tasks, supervised machine learning approaches have proved most effective, and substantial effort has been put into collecting and annotating corpora for building such models. Emotion detection from text is no exception; however, research in this area is in its relative infancy, and few emotion-annotated corpora exist to date. A further issue regarding the development of emotion-annotated corpora is the difficulty of the annotation task and the resulting inconsistencies in human annotations. One approach to address these problems is to use self-annotated data, using explicit indications of emotions included by the author of the data in question. We present a study of the use of Unicode emoji as self-annotation of a Twitter user’s emotional state. Emoji are found to be used far more extensively than hashtags, and we argue that they present a more faithful representation of a user’s emotional state. A substantial set of tweets containing emotion-indicative emoji is collected and a sample annotated for emotions. The accuracy and utility of emoji as emotion labels are evaluated directly (with respect to annotations) and through trained statistical models. Results are cautiously optimistic and suggest further study of emoji usage.
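A minimal sketch of the self-annotation step, assuming a hand-picked emoji-to-emotion mapping; the mapping and tweets below are illustrative, not the paper's label set or data:

```python
# Illustrative sketch of using emoji as distant emotion labels: tweets containing
# exactly one emotion-indicative emoji type are labelled with that emotion and the
# emoji are removed from the text before any model training.
EMOJI_TO_EMOTION = {
    "\U0001F602": "joy",      # face with tears of joy
    "\U0001F622": "sadness",  # crying face
    "\U0001F620": "anger",    # angry face
    "\U0001F628": "fear",     # fearful face
}

def self_annotate(tweet):
    """Return (text_without_emoji, label) or None if no single clear emotion cue."""
    found = {EMOJI_TO_EMOTION[ch] for ch in tweet if ch in EMOJI_TO_EMOTION}
    if len(found) != 1:          # skip tweets with no cue or with conflicting cues
        return None
    text = "".join(ch for ch in tweet if ch not in EMOJI_TO_EMOTION)
    return text.strip(), found.pop()

corpus = ["missed my flight again \U0001F622", "best day ever \U0001F602"]
labelled = [pair for pair in map(self_annotate, corpus) if pair]
print(labelled)  # [('missed my flight again', 'sadness'), ('best day ever', 'joy')]
```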

Steam Review Dataset - new, large scale sentiment dataset

Antoni Sobkowicz and Wojciech Stokowiec

In this paper we present a new binary sentiment classification dataset containing over 3,640,386 reviews from Steam User Reviews, with a detailed analysis of the dataset’s properties and initial results of sentiment analysis on the collected data.

Corpora and Data Collection Monday 23 May, 11:00 – 13:00 Chair: Viviana Patti

A Curated Corpus for Sentiment-Topic Analysis

Ebuka Ibeke, Chenghua Lin, Chris Coe, Adam Wyner, Dong Liu, Mohamad Hardyman Barawi and Noor Fazilla Abd Yusof

There has been a rapid growth of research interest in natural language processing that seeks to better understand sentiment or opinion expressed in text. However, most research focuses on developing new models for opinion mining, with little effort devoted to the development of curated datasets for training and evaluating these models. This work provides a manually annotated corpus of customer reviews, which has two unique characteristics. First, the corpus captures sentiment and topic information at both the review and sentence levels. Second, it is time-variant, which preserves the sentiment and topic dynamics of the reviews. The annotation was performed in a two-stage approach by three independent annotators, achieving a substantial level of inter-annotator agreement. In another set of experiments, we performed supervised sentiment classification using our manual annotations as the gold standard. Experimental results show that both the Naive Bayes model and the Support Vector Machine achieved more than 92%.

EmoCues-28: Extracting Words from Emotion Cues for a Fine-grained Emotion Lexicon

Jasy Liew Suet Yan and Howard R. Turtle

This paper presents a fine-grained emotion lexicon (EmoCues-28) consisting of words associated with 28 emotion categories. Words in the lexicon are extracted from emotion cues (i.e., any segment of text, including words and phrases, that constitutes the expression of an emotion) identified by annotators in a corpus of 15,553 tweets (microblog posts on Twitter). In order to distinguish between emotion categories at this fine-grained level, we introduce cue term weight and describe an approach to determine the primary and secondary terms associated with each emotion category. The primary and secondary terms form the foundation of our emotion lexicon. These terms can function as seed words to enrich the vocabulary of each emotion category. The primary terms can be used to retrieve synonyms or other semantically related words associated with each emotion category, while secondary terms can be used to capture contextual cues surrounding these terms.

A Bootstrapping Technique to Annotate Emotional Corpora Automatically

Lea Canales, Carlo Strapparava, Ester Boldrini and Patricio Martínez-Barco

In computational linguistics, increasing interest in the detection of emotional and personality profiles has given birth to the creation of resources that allow such profiles to be detected. This is due to the large number of applications that the detection of emotional states can have, such as in e-learning environments or suicide prevention. The development of resources for emotional profiles can help to improve emotion detection techniques such as supervised machine learning, where the development of annotated corpora is crucial.

Generally, these annotated corpora are produced by a manual annotation process, a tedious and time-consuming task. Thus, research on developing automatic annotation processes has increased. For this reason, in this paper we propose a bootstrapping process to label an emotional corpus automatically, employing the NRC Word-Emotion Association Lexicon (Emolex) to create the seed and generalised similarity measures to increase the initial seed. In the evaluation, the emotional model and the agreement between automatic and manual annotations are assessed. The results confirm the soundness of the proposed approach for automatic annotation and hence the possibility of creating stable resources, such as an emotional corpus, that can be employed in supervised machine learning for emotion detection systems.
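A minimal sketch of the bootstrapping idea under strong simplifications (a tiny seed lexicon, random stand-in word vectors, and nearest-neighbour propagation in place of the paper's generalised similarity measures):

```python
# Step 1: sentences containing a seed-lexicon word are labelled directly.
# Step 2: unlabelled sentences receive the label of the most similar labelled
# sentence (cosine similarity over averaged word vectors).
# Lexicon, vocabulary vectors and sentences are hypothetical toy data.
import numpy as np

SEED_LEXICON = {"happy": "joy", "delighted": "joy", "furious": "anger", "angry": "anger"}
rng = np.random.default_rng(0)
WORD_VECS = {w: rng.normal(size=50) for w in
             ["happy", "delighted", "furious", "angry", "glad", "mad", "today", "am", "i", "so"]}

def embed(sentence):
    vecs = [WORD_VECS[w] for w in sentence.split() if w in WORD_VECS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(50)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

sentences = ["i am so happy today", "i am furious", "i am so glad today", "so mad today"]

seed = {s: SEED_LEXICON[w] for s in sentences for w in s.split() if w in SEED_LEXICON}

auto = dict(seed)
for s in sentences:
    if s not in auto:
        nearest = max(seed, key=lambda t: cosine(embed(s), embed(t)))
        auto[s] = seed[nearest]
print(auto)  # with real embeddings, propagated labels would follow semantic similarity
```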

A Multilingual Sentiment Corpus for Chinese, English and Japanese

Francis Bond, Tomoko Ohkuma, Luís Morgado da Costa, Yasuhide Miura, Rachel Chen, Takayuki Kuribayashi and Wenjie Wang

In this paper, we present the sentiment tagging of a multi-lingual corpus. The goal is to investigate how different languages encode sentiment, and compare the results with those given by existing resources. The results of annotating a corpus for both concept level and chunk level sentiment are analyzed.

Personality and User Modelling Monday 23 May, 14:00 – 15:00 Chair: Björn Schuller

PACMAN: Psycho and Computational Framework of an Individual (Man)

Shivani Poddar, Sindhu Kiranmai Ernala and Navjyoti Singh

Several models have tried to understand the formation of an individual’s distinctive character, i.e. personality, from the perspectives of multiple disciplines, including cognitive science, affective neuroscience and psychology. While these models (e.g. the Big Five) have so far attempted to summarize the personality of an individual as a uniform, static image, no one model comprehensively captures the mechanisms which lead to the formation and evolution of personality traits over time. This mechanism of evolving personality is what we attempt to capture by means of our framework. Through this study, we leverage the Abhidhamma tradition of Buddhism to propose a theoretical model of an individual as a stochastic finite state machine. The machine models moment-to-moment states of consciousness of an individual in terms of a formal ontology of mental factors that constitute any individual. To achieve an empirical evaluation of our framework, we use social media data to model a user’s personality as an evolution of his/her mental states (by conducting psycho-linguistic inferences on their Facebook (FB) statuses). We further analyze the user’s personality as a composition of these recurrent mental factors over a series of subsequent moments. As the first attempt to solve the problem of evolving personality explicitly, we also present a new dataset and machine learning module for the analysis of a user’s mental states from his/her social media data.

Telltale Trips: Personality Traits in Travel Blogs

Veronika Vincze, Klára Hegedűs, Gábor Berend and Richárd Farkas

Here we present a corpus that contains blog texts about traveling. The main focus of our research is the personality traits of the author; hence we do not just annotate opinions in the classical sense, but we also mark those phrases that refer to the personality type of the author. We illustrate the annotation principles with several examples and we calculate inter-annotator agreement rates. In the long run, our main goal is to employ personality data in a real-world application, e.g. a recommendation system.

Linked Data and Semantics Monday 23 May, 15:00 – 16:00 Chair: Ian D. Wood

Semantic Classification and Weight Matrices Derived from the Creation of Emotional Word Dictionary for Semantic Computing

Minsu Ko

This paper introduces a general creation method for an emotional word dictionary (EWD) which contains a semantic weight matrix (SWM) and a semantic classification matrix (SCM), to be used as an efficient foundation for opinion mining. These two matrices are combined into a single n-by-7 matrix called a classification and weight matrix (CWM) in a machine-processable format. Such a matrix would also have applications in the field of semantic computing. This paper also details investigations which were performed in order to gather information on the efficiency of using the CWM based on categorizing synonymous relations and frequencies. The multilingual extensibility of the EWD will benefit the semantic processing of opinion mining as a generic linguistic resource which has an emotional ontology structure and linked data.

Towards a Common Linked Data Model for Sentiment and Emotion Analysis

J. Fernando Sánchez-Rada, Björn Schuller, Viviana Patti, Paul Buitelaar, Gabriela Vulcu, Felix Burkhardt, Chloé Clavel, Michael Petychakis and Carlos A. Iglesias

The different formats currently used to encode information in sentiment analysis and opinion mining are heterogeneous and often custom-tailored to each application. Besides a number of existing standards, there are still plenty of open challenges, such as representing sentiment and emotion in web services, integrating different models of emotions, or linking to other data sources. In this paper, we motivate the switch to a linked data approach in sentiment and emotion analysis that would overcome these and other current limitations. This paper includes a review of the existing approaches and their limitations, an introduction of the elements that would make this change possible, and a discussion of the challenges behind that change.

Beyond Text Analysis Monday 23 May, 16:30 – 18:00 Chair: J. Fernando Sánchez-Rada

Empirical Mode Decomposition: A Data-Enrichment Perspective on Speech Emotion Recognition

Bin Dong, Zixing Zhang and Björn Schuller

To deal with the data scarcity problem in Speech Emotion Recognition, a novel data-enrichment perspective is proposed in this paper by applying Empirical Mode Decomposition (EMD) to the existing labelled speech samples. Each speech sample is decomposed by EMD into a set of Intrinsic Mode Functions (IMFs) plus a residue. After that, we extract features from the primary IMFs of the speech sample. A single classification model is first trained for each corresponding IMF. Then, all the trained models of the IMFs, plus that of the original speech, are combined to classify the emotion by majority vote. Four popular emotional speech corpora and three feature sets are used in an extensive evaluation of the recognition performance of our proposed method. The results show that our method can improve the classification accuracy of the prediction of valence and arousal at different significance levels, as compared to the baseline.
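A minimal sketch of the ensemble step, assuming the EMD decomposition and per-IMF acoustic feature extraction have already been performed (the arrays below are random stand-ins with hypothetical shapes):

```python
# One classifier per IMF-derived feature set plus one for the original signal,
# combined by majority vote. Feature matrices are random placeholders; in
# practice they would come from an EMD library and an acoustic feature extractor.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_train, n_test, n_feats, n_imfs = 80, 20, 32, 4
y_train = rng.integers(0, 2, n_train)          # binary arousal labels (toy)

# feature matrices: index 0 = original speech, 1..n_imfs = per-IMF features
train_views = [rng.normal(size=(n_train, n_feats)) for _ in range(n_imfs + 1)]
test_views = [rng.normal(size=(n_test, n_feats)) for _ in range(n_imfs + 1)]

models = [SVC().fit(X, y_train) for X in train_views]                  # one model per view
votes = np.stack([m.predict(X) for m, X in zip(models, test_views)])   # shape (views, n_test)

majority = (votes.mean(axis=0) > 0.5).astype(int)  # majority vote across views
print(majority)
```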

Automatic Detection of Textual Triggers of Reader Emotion in Short Stories

Rebekah Wegener, Christian Kohlschein, Sabina Jeschke and Björn Schuller

This position paper outlines our experimental design and platform development for remote data collection, annotation and analysis. The experimental design captures reader response to 5 short stories, including reading time, eye and gaze tracking, pupil dilation, facial gestures, combined physiological measures, spoken reflection, comprehension and reflection. Data will be gathered on a total corpus of 250 short stories and over 700 readers. We describe both the experiment design and the platform that will allow us to remotely crowd-source both reader response and expert annotation, as well as the means to analyse and query the resulting data. In the paper we outline our proposed approach for gaze-text linkage for remote low-quality webcam input and the proposed approach to the capture and analysis of low arousal affect data. The final platform will be open-source and fully accessible. We also plan to release all acquired data to the affective computing research community.

Domain Adaptation using Stock Market Prices to Refine Sentiment Dictionaries

Andrew Moore, Paul Rayson and Steven Young

As part of a larger project where we are examining the relationship and influence of news and social media on stock price, here we investigate the potential links between the sentiment of news articles about companies and stock price change of those companies. We describe a method to adapt sentiment word lists based on news articles about specific companies, in our case downloaded from the Guardian. Our novel approach here is to adapt word lists in sentiment classifiers for news articles based on the relevant stock price change of a company at the time of web publication of the articles. This adaptable word list approach is compared against the financial lexicon from Loughran and McDonald (2011) as well as the more general MPQA word list (Wilson et al., 2005). Our experiments investigate the need for domain specific word lists and demonstrate how general word lists miss indicators of sentiment by not creating or adapting lists that come directly from news about the company. The companies in our experiments are BP, Royal Dutch Shell and Volkswagen.
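A minimal sketch of the adaptation idea: score each word by the average signed price move of the articles it occurs in and keep the strongest words as a company-specific list. The articles and returns below are invented placeholders, not the paper's data or exact method:

```python
# Word-level association with stock movement: each word is scored by the mean
# percentage price change of the articles containing it; strongly positive or
# negative words form an adapted, company-specific sentiment word list.
from collections import defaultdict

articles = [
    ("profits surge after strong quarter", +2.1),        # (text, % price change)
    ("spill investigation widens, shares slide", -3.4),
    ("record output lifts outlook", +1.2),
]

scores, counts = defaultdict(float), defaultdict(int)
for text, change in articles:
    for word in set(text.lower().split()):
        scores[word] += change
        counts[word] += 1

avg = {w: scores[w] / counts[w] for w in scores}
positive = sorted((w for w in avg if avg[w] > 1.0), key=avg.get, reverse=True)
negative = sorted((w for w in avg if avg[w] < -1.0), key=avg.get)
print(positive, negative)
```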

VisLR II: Visualization as Added Value in the Development, Use and Evaluation of Language Resources

23 May 2016

ABSTRACTS

Editors:

Annette Hautli-Janisz, Verena Lyding

Workshop Programme

09:00 – 10:35 Morning Session, Part I

09.00 – 09.05 Introduction

09:05 – 09:35 Paul Meurer, Victoria Rosén, Koenraad De Smedt, Interactive Visualizations in the INESS Treebanking Infrastructure
09:35 – 10:05 Christin Schätzle, Dominik Sacha, Visualizing Language Change: Dative Subjects in Icelandic
10:05 – 10:35 Annette Hautli-Janisz, See the Forest AND the Trees: Visual Verb Class Identification in Urdu/Hindi VerbNet

10:35 – 11:00 Coffee break

11:00 – 12:40 Morning Session, Part II

11:00 – 11:30 Thomas Wielfaert, Kris Heylen, Dirk Speelman, Dirk Geeraerts, Visual Analytics for Distributional Semantic Model Comparisons
11:30 – 12:00 Erik Tjong Kim Sang, Visualizing Literary Data
12:00 – 12:30 Andrew Caines, Christian Bentz, Dimitrios Alikaniotis, Fridah Katushemererwe, Paula Buttery, The Glottolog Data Explorer: Mapping the World’s Languages

12:30 – 12:40 Closing

Workshop Organizers

Mennatallah El-Assady – University of Konstanz, Germany
Annette Hautli-Janisz – University of Konstanz, Germany
Verena Lyding – EURAC Research Bozen/Bolzano, Italy

Workshop Programme Committee

Noah Bubenhofer – University of Zürich, Switzerland
Miriam Butt – University of Konstanz, Germany
Jason Chuang – Independent Researcher, USA
Christopher Collins – University of Ontario Institute of Technology, Canada
Chris Culy – Independent Consultant, USA
Gerhard Heyer – University of Leipzig, Germany
Kris Heylen – University of Leuven, Belgium
Daniel Keim – University of Konstanz, Germany
Steffen Koch – University of Stuttgart, Germany
Victoria Rosén – University of Bergen, Norway

Preface

This workshop aims at providing a follow-up forum to the successful first VisLR workshop at LREC 2014, which addressed visualization designers and users from computational and linguistic domains alike. Since the last workshop, the concern with visualizing language data has further increased, as the recurrence of specialized symposia in linguistic and NLP contexts shows (cf. e.g. the ACL workshop 2014, AVML 2014, Herrenhäuser Symposium 2014, QueryVis 2015). Moreover, the application of visualization techniques to various use cases is becoming ever more agile.

As a specialized subfield of information visualization, the visualization of language continues to face particular challenges: language data is complex, only partly structured and, as with today’s language resources, comes in large quantities. Moreover, due to the variety of data types, from textual data to spoken or signed language data, the challenges for visualization are necessarily varied. The overall challenge lies in breaking down the multidimensionality into intuitive visual features that enable an at-a-glance overview of the data. The second edition of the workshop therefore aims at advancing the field of linguistic visualization by particularly focusing on more advanced visualization techniques that represent the complexity of language and contribute to resolving these challenges.

Morning Session, Part I Monday 23 May, 9:00 – 10:35 Chairperson: Verena Lyding

Interactive Visualizations in the INESS Treebanking Infrastructure

Paul Meurer, Victoria Rosén, Koenraad De Smedt

The visualization of syntactic analyses may be challenging due to the number of readings, the size and detail of the structures, and the interrelations between levels of linguistic description. We present a range of interactive visualization techniques applied to complex syntactic analyses in INESS, an online infrastructure for the annotation and exploration of syntactically annotated corpora. Although INESS caters to many syntactic formalisms, we focus on LFG, which allows for multiple levels of syntactic structure, in particular c-structures and f-structures. Interactive dynamic renderings of the relations between components of these structures are presented, with options on the level of detail to be displayed. Furthermore, the disambiguation of sentences with multiple possible parses needs techniques for visualizing the differences between readings. For this purpose, we present and discuss packed representations, the interactive visualization of discriminants, and the previewing of disambiguation choices. The interactive querying of treebanks benefits from appropriate ways of displaying search results. We present the highlighting of matching items in matching sentences. We also present tabular overviews with frequencies of obtained variable values, as well as the inspection of matching structures without having to navigate away from the overview.

Visualizing Language Change: Dative Subjects in Icelandic

Christin Schätzle, Dominik Sacha

This paper presents a visualization tool for the analysis of diachronic multidimensional language data. Our tool was developed with respect to a corpus study of dative subjects in Icelandic based on the Icelandic Parsed Historical Corpus (Wallenberg et al., 2011), which investigates determining factors for the appearance of dative subjects in the history of Icelandic. The visualization provides interactive access to the underlying multidimensional data and significantly facilitates the analysis of the complex diachronic interactions of the factors at hand. We were able to identify various interactions of conditioning factors for dative subjects in Icelandic via the visualization tool and showed that dative subjects are increasingly associated with experiencer arguments in Icelandic across time. We also found that the rise of dative subjects with experiencer arguments is correlated with an increasing use of middle voice. This lexical semantic change argues against dative subjects as a Proto-Indo-European inheritance. Moreover, the visualization helped us to draw conclusions about uncertainties and problems of our lexical semantic data annotation, which will be revised in future work.

See the Forest AND the Trees: Visual Verb Class Identification in Urdu/Hindi VerbNet

Annette Hautli-Janisz

Constructing a lexical resource like VerbNet involves the crucial task of forming syntactically motivated subclasses within larger, semantically motivated verb classes. This is particularly challenging when the syntactic behavior of verbs varies considerably within a class, making well-motivated subclasses hard to establish by hand. The present paper shows that a visual clustering approach substantially facilitates the development process: Based on a careful theoretical analysis of the syntactic properties of Urdu/Hindi motion verbs, each verb can be represented by an 8-dimensional feature vector which serves as the basis for the automatic clustering with k-Means. In order to overcome the blackbox of machine learning approaches and to make the resulting clusters interpretable, the visualization reduces the high-dimensional verb vectors to two dimensions and visualizes the clusters and their members via an interactive interface, allowing for an inspection of the underlying data. This leads to the formation of subclasses in Urdu/Hindi VerbNet that are theoretically as well as computationally well-motivated.
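A minimal sketch of the clustering-plus-projection pipeline with scikit-learn, using invented verb names and random feature values in place of the Urdu/Hindi VerbNet data:

```python
# 8-dimensional syntactic feature vectors per verb, k-Means clustering, and a
# 2-D PCA projection that a visual interface could plot for inspection.
# Verb lemmas and feature values are hypothetical placeholders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

verbs = ["jaanaa", "aanaa", "girnaa", "daurnaa"]          # hypothetical verb lemmas
X = np.random.default_rng(1).random((len(verbs), 8))      # 8 syntactic features each

labels = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)
xy = PCA(n_components=2).fit_transform(X)                 # 2-D coordinates for plotting

for verb, (x, y), c in zip(verbs, xy, labels):
    print(f"{verb}\tcluster {c}\t({x:.2f}, {y:.2f})")
```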

Morning Session, Part II Monday 23 May, 11:00 – 12:40 Chairperson: Annette Hautli-Janisz

Visual Analytics for Distributional Semantic Model Comparisons

Thomas Wielfaert, Kris Heylen, Dirk Speelman, Dirk Geeraerts

Distributional semantic models have been shown to be a successful technique for Word Sense Disambiguation and Word Sense Induction tasks. However, these models, and more specifically the token-level variants, are extremely parameter-rich. We are still in the dark on how the different parameters can be efficiently set, and even more on how to evaluate the outcome when no gold standard is readily available. To gain better insight, we are developing a visual analytics approach which shows these models in two ways: a scatterplot matrix for inter-model parameter comparison and zoomable individual scatter plots allowing for more details on demand. More specifically, we first use a scatterplot matrix to compare models with different parameter settings in a single view. This enables us to track selections of tokens over different models. On top of this, we create a scatter plot for each individual model, enriched with both model-dependent and model-independent features. This way, we can carry out a more in-depth visual analysis of what is going on and visualise the distinct properties or parameters of the individual model.

Visualizing Literary Data

Erik Tjong Kim Sang

We look at different aspects of Dutch magazines, from the perspectives of both literary studies and linguistic studies. We explore the background of authors with respect to birth locations, ages and gender, and also how language use in the magazines evolved over a period of several decades. We have created several interactive visualizations which enable researchers to browse and analyze text data and their metadata. The design of these visualizations was nontrivial, raising questions about how to deal with missing data and documents with multiple authors. The data required for some of the visualizations useful for researchers proved infeasible for the software architecture to generate within a reasonable time span. In a case study, we look at some of the research questions that can be answered by the data visualizations and suggest another data view that could be interesting for literary research. Interesting topics for future research rely heavily on improvements to the search architecture used and on including extra annotation layers in our text corpora.

The Glottolog Data Explorer: Mapping the World’s Languages

Andrew Caines, Christian Bentz, Dimitrios Alikaniotis, Fridah Katushemererwe, Paula Buttery

We present THE GLOTTOLOG DATA EXPLORER, an interactive web application in which the world’s languages are mapped using a JavaScript library in the ‘Shiny’ framework for R (Chang et al., 2016). The world’s languages and major dialects are mapped using coordinates from the Glottolog database (Hammarström et al., 2016). The application is primarily intended to portray the endangerment status of the world’s languages, and hence the default map shows the languages colour-coded for this factor. Subsequently, the user may opt to hide (or re-introduce) data subsets by endangerment status, and to resize the datapoints by speaker counts. Tooltips allow the user to view language family classification and link to the relevant Glottolog webpage for each entry. We provide a data table for exploration of the languages by various factors, and users may download subsets of the dataset via this table interface. The web application is freely available at http://cainesap.shinyapps.io/langmap

TA-COS 2016 Text Analytics for Cybersecurity and Online Safety

23 May 2016

ABSTRACTS

Editors:

Guy De Pauw, Ben Verhoeven, Bart Desmet, Els Lefever

Workshop Programme

14:00 - 15:00 – Keynote

Anna Vartapetiance and Lee Gillam, Protecting the Vulnerable: Detection and Prevention of Online Grooming

15:00 - 16:00 – Workshop Papers I

Haji Mohammad Saleem, Kelly P Dillon, Susan Benesch and Derek Ruths, A Web of Hate: Tackling Hateful Speech in Online Social Spaces

Stéphan Tulkens, Lisa Hilte, Elise Lodewyckx, Ben Verhoeven and Walter Daelemans, A Dictionary-based Approach to Racism Detection in Dutch Social Media

16:00 - 16:30 – Coffee break

16:30 - 18:00 – Workshop Papers II

Adeola O Opesade, Mutawakilu A Tiamiyu and Tunde Adegbola, Forensic Investigation of Linguistic Sources of Electronic Scam Mail: A Statistical Language Modelling Approach

Richard Killam, Paul Cook, Natalia Stakhanova, Android Malware Classification through Analysis of String Literals

Shomir Wilson, Florian Schaub, Aswarth Dara, Sushain K. Cherivirala, Sebastian Zimmeck, Mads Schaarup Andersen, Pedro Giovanni Leon, Eduard Hovy and Norman Sadeh, Demystifying Privacy Policies with Language Technologies: Progress and Challenges

Workshop Organizers

Guy De Pauw – CLiPS - University of Antwerp, Belgium
Ben Verhoeven – CLiPS - University of Antwerp, Belgium
Bart Desmet – LT3 - Ghent University, Belgium
Els Lefever – LT3 - Ghent University, Belgium

Workshop Programme Committee

Walter Daelemans (chair) – CLiPS - University of Antwerp, Belgium
Véronique Hoste (chair) – LT3 - Ghent University, Belgium

Fabio Crestani – University of Lugano, Switzerland
Maral Dadvar – Twente University, The Netherlands
Lee Gillam – University of Surrey, UK
Chris Emmery – University of Antwerp, Belgium
Giacomo Inches – Fincons Group AG, Switzerland
Eva Lievens – Ghent University, Belgium
Shervin Malmasi – Harvard Medical School, USA
Nick Pendar – Skytree Inc, USA
Karolien Poels – University of Antwerp, Belgium
Awais Rashid – Lancaster University, UK
Cynthia Van Hee – Ghent University, Belgium
Anna Vartapetiance – University of Surrey, UK

Preface

Text analytics technologies are widely used as components in Big Data applications, allowing for the extraction of different types of information from large volumes of text. A growing number of research efforts is now investigating the applicability of these techniques for cybersecurity purposes. Many applications use text analytics techniques to provide a safer online experience, by detecting unwanted content and behavior on the Internet. Other text analytics approaches attempt to detect illegal activity on online networks or monitor social media against the background of real-life threats. Alongside this quest, many ethical concerns arise, such as privacy issues and the potential abuse of such technology.

The first workshop on Text Analytics for Cybersecurity and Online Safety (TA-COS 2016) aims to bring together researchers who have an active interest in the development and application of such tools. Following our call for papers, we received papers on a wide range of topics and, with the help of our varied team of reviewers, were able to select the most relevant and most interesting contributions.

The first two papers presented at this workshop deal with the issue of identifying hate speech on social media. Saleem and colleagues describe a technique that automatically detects hateful communities, while Tulkens et al. present research on how to develop techniques that identify hateful words and phrases on social media. In the second session of this workshop, Opesade et al. present a study on the origin of 419 scam e-mails, using text classification techniques that identify varieties of English. Killam et al. identify malware on the Android platform by applying text analytics to the apps’ binary files. Finally, Wilson et al. describe work on developing techniques that can aid people in understanding the often lengthy and complex terms of use that they agree to online.

We are very pleased with the wide variety of topics among the submitted papers and are furthermore very pleased to be able to kick off our workshop with a keynote lecture by Anna Vartapetiance of the University of Surrey’s Centre for Cyber Security. She will present ongoing research on the automatic detection of online grooming. We are sure that the presentations at TA-COS 2016 will trigger fruitful discussions and will help foster awareness of the increasingly important role text analytics can play in cybersecurity applications.

The TA-COS 2016 Organizers, Guy De Pauw Ben Verhoeven Bart Desmet Els Lefever www.ta-cos.org

Keynote 23 May, 14:00 – 15:00 Chairperson: Guy De Pauw

Protecting the Vulnerable: Detection and Prevention of Online Grooming

Anna Vartapetiance and Lee Gillam

The 2012 EU Kids Online report revealed that 30% of 9-16 year-olds have made contact online with someone they did not know offline, and 9% have gone to an offline meeting with someone they first met online. The report suggests that this is “rarely harmful”, but is hoping against harm really the wisest course of action? This talk presents details of our ongoing research and development on the prevention and detection of unsavoury activities which involve luring vulnerable people into ongoing abusive relationships. We will focus specifically on online grooming of children, discussing the potential to detect and prevent such grooming, and relevant theories and systems. The talk will address some of the challenges involved with the practical implementation and use of such safeguards, in particular with respect to legal and ethical issues. We conclude by discussing the opportunities for protecting further groups vulnerable to grooming for emotional, financial, or other purposes.

Biography Dr Anna Vartapetiance is a graduate entrepreneur and postdoctoral researcher at the University of Surrey’s Centre for Cyber Security, as well as a committee member of the BCS ICT Ethics Specialist Group. Anna is currently on the advisory committee of the “Automatic Monitoring for Cyberspace Applications (AMiCA)” project as an international expert, and an associate of the Internet Service Providers Association (ISPA). She is also a member of the International Federation for Information Processing (IFIP) Special Interest Group 9.2.2 (Framework on Ethics of Computing) and the Working Group on Social Accountability and Computing (WG 9.2). Prior to her work as an entrepreneur and postdoctoral researcher, she was awarded a PhD from the Department of Computer Science at the University of Surrey for her work on deception detection, using Natural Language Processing (NLP) to develop (semi-)automated detection systems. Her research has found application in systems that enhance outcomes and address issues related to national DNA databases, online gambling, virtual worlds and machine ethics, and has been published in over 15 peer-reviewed journal papers, proceedings and book chapters. Currently, Anna is working on parental controls with a Child Online Safety project prototype for the detection/prevention of online grooming.

Workshop Papers I 23 May, 15:00 – 16:00 Chairperson: Véronique Hoste

A Web of Hate: Tackling Hateful Speech in Online Social Spaces

Haji Mohammad Saleem, Kelly P Dillon, Susan Benesch and Derek Ruths

Online social platforms are beset with hateful speech - content that expresses hatred for a person or group of people. Such content can frighten, intimidate, or silence platform users, and some of it can inspire other users to commit violence. Despite widespread recognition of the problems posed by such content, reliable solutions even for detecting hateful speech are lacking. In the present work, we establish why keyword-based methods are insufficient for detection. We then propose an approach to detecting hateful speech that uses content produced by self-identifying hateful communities as training data. Our approach bypasses the expensive annotation process often required to train keyword systems and performs well across several established platforms, making substantial improvements over current state-of-the-art approaches.

A Dictionary-based Approach to Racism Detection in Dutch Social Media

Stéphan Tulkens, Lisa Hilte, Elise Lodewyckx, Ben Verhoeven and Walter Daelemans

We present a dictionary-based approach to racism detection in Dutch social media comments, which were retrieved from two public Belgian social media sites likely to attract racist reactions. These comments were labeled as racist or non-racist by multiple annotators. For our approach, three discourse dictionaries were created: first, we created a dictionary by retrieving possibly racist and more neutral terms from the training data, and then augmenting these with more general words to remove some bias. A second dictionary was created through automatic expansion using a model trained on a large corpus of general Dutch text. Finally, a third dictionary was created by manually filtering out incorrect expansions. We trained multiple Support Vector Machines, using the distribution of words over the different categories in the dictionaries as features. The best-performing model used the manually cleaned dictionary and obtained an F-score of 0.46 for the racist class on a test set consisting of unseen Dutch comments, retrieved from the same sites used for the training set. The automated expansion of the dictionary only slightly boosted the model’s performance, and this increase in performance was not statistically significant. The fact that the coverage of the expanded dictionaries did increase indicates that the words that were automatically added did occur in the corpus, but were not able to meaningfully impact performance. The dictionaries, code, and the procedure for requesting the corpus are available at: https://github.com/clips/hades.
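A minimal sketch of the dictionary-based feature representation with a linear SVM; the two-category dictionary and comments are toy English placeholders, not the paper's Dutch resources:

```python
# Each comment is represented by the relative frequency of tokens from each
# dictionary category; a linear SVM is trained on top of these features.
# Dictionary entries, comments and labels are invented placeholders.
import numpy as np
from sklearn.svm import LinearSVC

DICTIONARIES = {
    "racist_terms": {"slur1", "slur2"},              # placeholder entries
    "neutral_terms": {"people", "country", "news"},
}

def features(comment):
    toks = comment.lower().split()
    n = max(len(toks), 1)
    return [sum(t in cat for t in toks) / n for cat in DICTIONARIES.values()]

comments = ["slur1 should leave this country", "interesting news about the country"]
labels = [1, 0]  # 1 = racist, 0 = non-racist (toy)
clf = LinearSVC().fit(np.array([features(c) for c in comments]), labels)
print(clf.predict([features("more news about people")]))
```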

Workshop Papers II 23 May, 16:30 – 18:00 Chairperson: Walter Daelemans

Forensic Investigation of Linguistic Sources of Electronic Scam Mail: A Statistical Language Modelling Approach

Adeola O Opesade, Mutawakilu A Tiamiyu, Tunde Adegbola

Electronic handling of information is one of the defining technologies of the digital age. These same technologies have been exploited by unethical hands in what is now known as cybercrime. Cybercrime is of different types, but of importance to the present study is the 419 Scam, because it is generally (yet controversially) linked with a particular country - Nigeria. Previous research that attempted to unravel the controversy applied the Internet Protocol address tracing technique. The present study applied a statistical language modelling technique to investigate the propensity of Nigeria’s involvement in authoring these fraudulent mails. Using a hierarchical modelling approach proposed in the study, 28.85% of anonymous electronic scam mails were classified as being from Nigeria among four other countries. The study concluded that linguistic cues have the potential to be used for investigating transnational digital breaches and that the electronic scam mail problem cannot be pinned down to Nigeria, as is generally believed, though Nigeria could be one of the countries that are prominent in authoring such mails.
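A minimal sketch of perplexity-style classification with per-class character trigram models and add-one smoothing; the training snippets below stand in for the per-country corpora used in the study:

```python
# Train one character trigram model per candidate source, then assign a new
# mail to the model with the lowest cross-entropy. Training texts are invented.
import math
from collections import Counter

def train_trigram(text):
    text = f"  {text.lower()}"
    return (Counter(text[i:i+3] for i in range(len(text) - 2)),
            Counter(text[i:i+2] for i in range(len(text) - 2)))

def cross_entropy(text, model, vocab_size=10000):
    tri, bi = model
    text = f"  {text.lower()}"
    n = len(text) - 2
    logp = 0.0
    for i in range(n):
        t, b = text[i:i+3], text[i:i+2]
        logp += math.log((tri[t] + 1) / (bi[b] + vocab_size))  # add-one smoothing
    return -logp / max(n, 1)

corpora = {"source_A": "dear friend i am contacting you concerning funds",
           "source_B": "hello sir kindly assist with the transfer of money"}
models = {label: train_trigram(text) for label, text in corpora.items()}

mail = "kindly assist me to transfer the funds"
print(min(models, key=lambda lab: cross_entropy(mail, models[lab])))
```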

Android Malware Classification through Analysis of String Literals

Richard Killam, Paul Cook and Natalia Stakhanova

As the popularity of the Android platform grows, the number of malicious apps targeting this platform grows along with it. Accordingly, as the number of malicious apps increases, so too does the need for an automated system which can effectively detect and classify these apps and their families. This paper presents a new system for classifying malware by leveraging the text strings present in an app’s binary files. This approach was tested using over 5,000 apps from 14 different malware families and was able to classify samples with over 99% accuracy while maintaining a false positive rate of 2.0%.
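A minimal sketch of classifying apps from their string literals, assuming the literals have already been extracted from the binaries (e.g. with a disassembler or a strings-extraction utility); app data and family labels are invented:

```python
# The extracted string literals of each app are joined into one document;
# a TF-IDF representation feeds a linear classifier over malware families.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

apps = [
    ["http://ad-server.example", "SEND_SMS", "premium_rate"],   # strings from app 1
    ["openStream", "weather_api", "units=metric"],              # strings from app 2
]
families = ["sms_trojan", "benign"]  # toy family labels

docs = [" ".join(strings) for strings in apps]
clf = make_pipeline(TfidfVectorizer(token_pattern=r"\S+"), LinearSVC()).fit(docs, families)
print(clf.predict([" ".join(["SEND_SMS", "premium_rate", "http://ad-server.example"])]))
```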

Demystifying Privacy Policies with Language Technologies: Progress and Challenges

Shomir Wilson, Florian Schaub, Aswarth Dara, Sushain K. Cherivirala, Sebastian Zimmeck, Mads Schaarup Andersen, Pedro Giovanni Leon, Eduard Hovy and Norman Sadeh

Privacy policies written in natural language are the predominant method that operators of websites and online services use to communicate privacy practices to their users. However, these documents are infrequently read by Internet users, due in part to the length and complexity of the text. These factors also inhibit the efforts of regulators to assess privacy practices or to enforce standards. One proposed approach to improving the status quo is to use a combination of methods from crowdsourcing, natural language processing, and machine learning to extract details from privacy policies and present them in an understandable fashion. We sketch out this vision and describe our ongoing work to bring it to fruition. Further, we discuss challenges associated with bridging the gap between the contents of privacy policy text and website users’ abilities to understand those policies. These challenges are motivated by the rich interconnectedness of the problems as well as the broader impact of helping Internet users understand their privacy choices. They could also provide a basis for competitions that use the annotated corpus introduced in this paper.

LDL 2016 5th Workshop on Linked Data in Linguistics: Managing, Building and Using Linked Language Resources

24 May 2016

ABSTRACTS

Editors:

John P. McCrae, Christian Chiarcos, Elena Montiel Ponsoda, Thierry Declerck, Petya Osenova, Sebastian Hellmann

Workshop Programme

09:00 – 09:30 Welcome and introduction

09:30 – 10:30 Invited talk by Damir Cavar (Indiana University and The LINGUIST List), On the role of Linked Open Language Data for Language Documentation, Linguistic Research, and Speech and Language Technologies

10:30 – 11:00 Coffee break

11:00 – 11:30 Vladimir Alexiev and Gerard Casamayor, FN goes NIF: Integrating FrameNet in the NLP Interchange Format
11:30 – 12:00 Frank Abromeit, Christian Chiarcos, Christian Fäth and Maxim Ionov, Linking the Tower of Babel: Modelling a Massive Set of Etymological Dictionaries as RDF
12:00 – 12:20 Jim O'Regan, Kevin Scannell and Elaine Uí Dhonnchadha, lemonGAWN: WordNet Gaeilge as Linked Data
12:20 – 12:40 Vít Baisa, Sara Može and Irene Renau, Linking Verb Pattern Dictionaries of English and Spanish

12:40 – 14:00 Lunch break

14:00 – 14:40 Poster Session

Fahad Khan, Javier Díaz-Vera and Monica Monachini, The Representation of an Old English Emotion Lexicon as Linked Open Data
Elena Gonzalez-Blanco, Gimena Del Rio Riande and Clara Martínez Cantón, Linked open data to represent multilingual poetry collections. A proposal to solve interoperability issues between poetic repertoires
Sotiris Karampatakis, Sofia Karampataki, Charalampos Bratsas and Ioannis Antoniou, Linked Open Lexical Resources for the Greek Language
Invited posters from related projects and initiatives

14:40 – 15:10 Petya Osenova and Kiril Simov, Linked Open Data Dataset from Related Documents
15:10 – 15:40 Andrea Salfinger, Caroline Salfinger, Birgit Pröll, Werner Retschitzegger and Wieland Schwinger, Pinpointing the Eye of the Hurricane - Creating a Gold-Standard Corpus for Situative Geo-Coding of Crisis Tweets Based on Linked Open Data
15:40 – 16:00 Thierry Declerck, Representation of Polarity Information of Elements of German Words

16:00 – 16:30 Coffee break

16:30 – 16:50 Vanya Dimitrova, Christian Fäth, Christian Chiarcos, Heike Renner-Westermann and Frank Abromeit, Building an ontological model of the BLL: First step towards an interface with the LLOD cloud
16:50 – 17:10 Claus Zinn and Thorsten Trippel, Enhancing the Quality of Metadata by using Authority Control
17:10 – 17:30 Christian Chiarcos, Christian Fäth and Maria Sukhareva, Developing and Using Ontologies of Linguistic Annotation

17:30 – 18:00 Wrap up

Workshop Organizers

John P. McCrae – National University of Ireland, Galway, Ireland
Christian Chiarcos – Goethe University Frankfurt, Germany
Elena Montiel Ponsoda – Universidad Politécnica de Madrid, Spain
Thierry Declerck – Saarland University, Germany
Petya Osenova – IICT-BAS, Bulgaria
Sebastian Hellmann – AKSW/KILT, Universität Leipzig, Germany
Julia Bosque-Gil – Universidad Politécnica de Madrid, Spain
Bettina Klimek – AKSW/KILT, Universität Leipzig, Germany

Workshop Programme Committee

Guadalupe Aguado Universidad Politécnica de Madrid, Spain Núria Bel Universitat Pompeu Fabra, Spain Claire Bonial University of Colorado at Boulder, USA Paul Buitelaar National University of Ireland, Galway, Ireland Steve Cassidy Macquarie University, Australia Nicoletta Calzolari ILC-CNR, Italy Damir Cavar Eastern Michigan University, USA Philipp Cimiano Bielefeld University, Germany Gerard de Melo Tsinghua University, China Alexis Dimitriadis Universiteit Utrecht, The Netherlands Judith Eckle-Kohler Technische Universität Darmstadt, Germany Francesca Frontini ILC-CNR, Italy Jeff Good University at Buffalo, USA Asunción Gómez Pérez Universidad Politécnica de Madrid, Spain Jorge Gracia Universidad Politécnica de Madrid, Spain Yoshihiko Hayashi Waseda University, Japan Nancy Ide Vassar College, USA Fahad Khan ILC-CNR, Italy Vanessa Lopez IBM Europe, Ireland Universität Zürich, Switzerland/ Steven Moran Ludwig MaximilianUniversity, Germany Roberto Navigli University of Rome, “La Sapienza”, Italy Max Planck Institute for Evolutionary Sebastian Nordhof Anthropology, Leipzig, Germany Antonio Pareja-Lora Universidad Complutense Madrid, Spain Maciej Piasecki Wroclaw University of Technology, Poland Francesca Quattri Hong Kong Polytechnic University, Hong Kong Mariano Rico Universidad Politécnica de Madrid, Spain Laurent Romary INRIA, France Deutsches Forschungszentrum für Künstliche Felix Sasaki Intelligenz, Germany Andrea Schalley Griffith University, Australia Gilles Sérraset Joseph Fourier University, France Kiril Simov Bulgarian Academy of Sciences, Sofia, Bulgaria

Milena Slavcheva, JRC-Brussels, Belgium
Aitor Soroa, University of the Basque Country, Spain
Armando Stellato, University of Rome, Tor Vergata, Italy
Cristina Vertan, University of Hamburg, Germany
Piek Vossen, Vrije Universiteit Amsterdam, The Netherlands

Preface

Since its establishment in 2012, the Linked Data in Linguistics (LDL) workshop series has become the major forum for presenting, discussing and disseminating technologies, vocabularies, resources and experiences regarding the application of Semantic Web standards and the Linked Open Data paradigm to language resources, in order to facilitate their visibility, accessibility, interoperability, reusability, enrichment, combined evaluation and integration. The Linked Data in Linguistics workshop series is organized by the Open Linguistics Working Group of the Open Knowledge Foundation and has contributed greatly to the development of the Linguistic Linked Open Data (LLOD) cloud.

This workshop builds on the success of its predecessors over the last four years: the first edition at the 34th Annual Conference of the German Linguistics Society (DGfS) in 2012, a second edition at the 6th Annual Conference on Generative Approaches to the Lexicon (GLCON), a third edition co-located with LREC 2014 in Reykjavik, which attracted a very large number of interested participants, and, last year, a fourth edition co-located with ACL-IJCNLP 2015 in Beijing, China.

Publishing language resources under open licenses and linking them together has attracted increasing interest in academic circles, including applied linguistics, lexicography, natural language processing and information technology, as a way to facilitate the exchange of knowledge and information across the boundaries between disciplines as well as between academia and the IT business. By collocating the 5th edition of the workshop series with LREC, we encourage this interdisciplinary community to present and discuss use cases, experiences, best practices, recommendations and technologies among each other and in interaction with the language resource community. We particularly invite contributions discussing the application of the Linked Open Data paradigm to linguistic data, as it might provide an important step towards making linguistic data: i) easily and uniformly queryable, ii) interoperable and iii) sharable over the Web using open standards such as the HTTP protocol and the RDF data model.

While it has been shown that linked data has significant value for the management of language resources on the Web, the practice is still far from being an accepted standard in the community. It is therefore important that we continue to push the development and adoption of linked data technologies among creators of language resources. In particular, linked data’s ability to increase the quality, interoperability and availability of data on the Web has led us to choose the management, improvement and use of language resources on the Web as the key focus of this year’s workshop.

Session 1: Invited talk Tuesday 24 May, 09:30 – 10:30 Chairperson: Christian Chiarcos

On the role of Linked Open Language Data for Language Documentation, Linguistic Research, and Speech and Language Technologies Damir Cavar

We will discuss the potential roles of Linked and Open Language Data in the fields of documentary linguistics, theoretical linguistic research, and speech and language technology engineering, its potential in establishing new synergies between these fields, and its role in enabling emerging new research questions and results. We will address questions on interoperability and standards for resource annotation, linking and sharing of open language resources, and how this can improve accessibility and ease common restrictions on the half-life of language resources, in particular for digital language archives. The questions that we want to address are, among others: Can a linked open language data approach turn digital language archives from language graveyards into a useful resource for various sub-fields of linguistics and speech- and language-related technologies? How can linking and openness be achieved given the various opposing constraints and needs with respect to language data and resources?

Session 2: oral presentations Tuesday 24 May, 11:00 – 12:40 Chairperson: Elena Montiel Ponsoda

FN goes NIF: Integrating FrameNet in the NLP Interchange Format Vladimir Alexiev and Gerard Casamayor

FrameNet (FN) is a large-scale lexical database for English developed at ICSI Berkeley that describes word senses in terms of Frame semantics. FN has been converted to RDF LOD by ISTC-CNR, together with a large corpus of text annotated with FN. NIF is an RDF/OWL format and protocol for exchanging text annotations between NLP tools as Linguistic Linked Data. This paper reviews the FN-LOD representation, compares it to NIF, and describes a simple way to integrate FN in NIF, which does not use any custom classes or properties.
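
As a minimal illustration of the general idea (not the mapping proposed in the paper), the following Python/rdflib sketch attaches a frame reference to a NIF phrase annotation using only existing NIF and ITS RDF vocabulary; the document and frame URIs below are placeholder namespaces, and the choice of property is an assumption.

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

NIF = Namespace("http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#")
ITSRDF = Namespace("http://www.w3.org/2005/11/its/rdf#")
DOC = Namespace("http://example.org/doc1#")          # hypothetical document namespace
FN = Namespace("http://example.org/framenet-lod/")   # placeholder for FN-LOD frame URIs

g = Graph()
text = "She bought a new car."

# The whole sentence as a NIF context.
ctx = DOC["char=0,21"]
g.add((ctx, RDF.type, NIF.Context))
g.add((ctx, NIF.isString, Literal(text)))

# The word "bought" (characters 4-10), annotated with a frame reference.
ann = DOC["char=4,10"]
g.add((ann, RDF.type, NIF.Phrase))
g.add((ann, NIF.referenceContext, ctx))
g.add((ann, NIF.beginIndex, Literal(4, datatype=XSD.nonNegativeInteger)))
g.add((ann, NIF.endIndex, Literal(10, datatype=XSD.nonNegativeInteger)))
g.add((ann, ITSRDF.taClassRef, FN.Commerce_buy))  # reuse an existing property, no custom vocabulary

print(g.serialize(format="turtle"))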

Linking the Tower of Babel: Modelling a Massive Set of Etymological Dictionaries as RDF Frank Abromeit, Christian Chiarcos, Christian Fäth and Maxim Ionov

This paper describes the development of a Linked Data representation of the Tower of Babel (Starling), a major resource for short- and long-range etymological relations. Etymological dictionaries are highly multilingual by design: they usually involve cross-references to additional monolingual and etymological dictionaries, and thus represent an ideal application of Linked Data principles. So far, however, the Linguistic Linked Open Data (LLOD) community has rarely addressed etymological relations. In line with state-of-the-art LLOD practice, we represent Starling data in accordance with the lemon vocabulary developed by the W3C Ontolex Community Group; we discuss the state of the art and our experiences, and suggest extensions for etymological dictionaries.
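
To give a flavour of what such a representation can look like (a simplified sketch, not the authors' actual modelling), the Python/rdflib fragment below creates two ontolex lexical entries and links them with an illustrative etymological property; the entry URIs and the etymology vocabulary are invented placeholders.

from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import RDF

ONTOLEX = Namespace("http://www.w3.org/ns/lemon/ontolex#")
STAR = Namespace("http://example.org/starling/")  # hypothetical namespace for converted entries
ETY = Namespace("http://example.org/etym#")       # stand-in for an etymology extension vocabulary

g = Graph()

def lexical_entry(graph, uri, written_rep, lang):
    """Create a minimal ontolex:LexicalEntry with one canonical form."""
    form = BNode()
    graph.add((uri, RDF.type, ONTOLEX.LexicalEntry))
    graph.add((uri, ONTOLEX.canonicalForm, form))
    graph.add((form, RDF.type, ONTOLEX.Form))
    graph.add((form, ONTOLEX.writtenRep, Literal(written_rep, lang=lang)))
    return uri

old_english = lexical_entry(g, STAR["ang_word"], "word", "ang")   # Old English
gothic = lexical_entry(g, STAR["got_waurd"], "waurd", "got")      # Gothic

# A cross-dictionary etymological link; the property name is illustrative only.
g.add((old_english, ETY.cognateOf, gothic))

print(g.serialize(format="turtle"))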

WordNet Gaeilge as Linked Data Jim O'Regan, Kevin Scannell and Elaine Uí Dhonnchadha

We introduce lemonGAWN, a conversion of WordNet Gaeilge, a wordnet for the Irish language, with synset relations projected from EuroWordNet. lemonGAWN is linked to the English WordNet, as well as the wordnets for four Iberian languages covered by MCR, and additionally contains links to both the Irish and English editions of DBpedia.

Linking Verb Pattern Dictionaries of English and Spanish Vít Baisa, Sara Može and Irene Renau

The paper presents the first step in the creation of a new multilingual and corpus-driven lexical resource by means of linking existing monolingual pattern dictionaries of English and Spanish verbs. The two dictionaries were compiled through Corpus Pattern Analysis (CPA) – an empirical procedure that associates word meaning with word use by means of the analysis of phraseological patterns and collocations found in corpus data. This paper provides a first look into a number of practical issues arising from the task of linking corresponding patterns across languages via both manual and automatic procedures. In order to facilitate manual pattern linking, we implemented a heuristic-based algorithm to generate automatic suggestions for candidate verb pattern pairs, which obtained 80% precision. Our goal is to kick-start the development of a new resource for verbs that can be used by language learners, translators, editors and the research community alike.
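
The heuristic pattern-pairing step can be pictured with a toy Python sketch like the one below, which scores candidate English-Spanish pattern pairs by the overlap of CPA-style semantic types in their argument slots; the data, the slots and the scoring are invented for illustration and need not match the algorithm used in the paper.

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def score_pair(pat_en, pat_es):
    """Average slot-wise Jaccard overlap of semantic types."""
    slots = ("subject", "object")
    return sum(jaccard(pat_en.get(s, []), pat_es.get(s, [])) for s in slots) / len(slots)

def suggest_pairs(patterns_en, patterns_es, threshold=0.5):
    """Return (english_index, spanish_index, score) triples above the threshold."""
    suggestions = []
    for i, pe in enumerate(patterns_en):
        for j, ps in enumerate(patterns_es):
            s = score_pair(pe, ps)
            if s >= threshold:
                suggestions.append((i, j, round(s, 2)))
    return sorted(suggestions, key=lambda t: -t[2])

# Two hypothetical patterns each for "abandon" / "abandonar".
abandon = [{"subject": ["Human"], "object": ["Human", "Animal"]},
           {"subject": ["Human"], "object": ["Activity", "Plan"]}]
abandonar = [{"subject": ["Human"], "object": ["Human"]},
             {"subject": ["Human"], "object": ["Activity"]}]

print(suggest_pairs(abandon, abandonar))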

Session 3: poster presentations 24 May 2016, 14:00 – 14:40 Chairperson: Thierry Declerck

The Representation of an Old English Emotion Lexicon as Linked Open Data Fahad Khan, Javier Díaz-Vera and Monica Monachini

We present the ongoing conversion of a lexicon of emotion terms in Old English (OE) into RDF using an extension of lemon called lemonDIA and which we briefly describe. We focus on the translation of the subset of the lexicon dealing with terms for shame and guilt and give a number of illustrative examples.

Linked Open Data to Represent Multilingual Poetry Collections: A Proposal to Solve Interoperability Issues between Poetic Repertoires Elena Gonzalez-Blanco, Gimena Del Rio Riande and Clara Martínez Cantón

This paper describes the creation of a poetic ontology in order to use it as a basis to link the different databases and projects working on metrics and poetry. It is built on the model of the Spanish digital repertoire ReMetCa, but its aim is to be enlarged and improved so as to accommodate every poetic system. The conceptual semantic model, written in OWL, includes classes and metadata from standard ontological models related to humanities fields (such as CIDOC or Dublin Core), and adds specific elements and properties to describe poetic phenomena. Its final objective is to interconnect, reuse and locate data disseminated through poetic repertoires, in order to boost interoperability among them.

Linked Open Lexical Resources for the Greek Language Sotiris Karampatakis, Sofia Karampataki, Charalampos Bratsas and Ioannis Antoniou

The continuous rise of information technology has led to a remarkable quantity of linguistic data that is accessible on the Web. Linguistic resources are more useful when linked, but their distribution on the Web in various and often closed formats makes it difficult to interlink them with one another, so they often end up restricted in data "silos". To overcome this problem, Linked Data offers a way of publishing data using Web technologies in order to make feasible the connection of data coming from different sources. This paper presents how Greek WordNet was converted and published using a Resource Description Framework (RDF) model in order to expose it as Linked Open Data.

Session 4: oral presentations Tuesday 24 May, 14:40 – 16:00 Chairperson: John P. McCrae

Linked Open Data Dataset from Related Documents Petya Osenova and Kiril Simov

In the paper we present a methodology for the creation of a LOD dataset over domain documents. The domain is European Law in its connection to national laws. The documents are interlinked over the web. They are also linked to other Linked Open Data datasets (such as GeoNames). The data includes five languages: Bulgarian, English, French, Italian and German. Thus, the paper discusses the first step towards the creation of a domain corpus with linguistically linked documents, namely the reference linking among the documents and their linguistic processing.

Pinpointing the Eye of the Hurricane - Creating a Gold-Standard Corpus for Situative Geo-Coding of Crisis Tweets Based on Linked Open Data Andrea Salfinger, Caroline Salfinger, Birgit Pröll, Werner Retschitzegger and Wieland Schwinger

Crisis management systems would benefit from exploiting human observations of disaster sites shared in near-real time via microblogs, but they crucially require location information in order to make use of these observations. Whereas the popularity of microblogging services, such as Twitter, is on the rise, the percentage of GPS-stamped Twitter microblog articles (i.e., tweets) is stagnating. Geo-coding techniques, which extract location information from text, represent a promising means to overcome this limitation. However, whereas geo-coding of news articles represents a well-studied area, the brevity, informal nature and lack of context encountered in tweets introduce novel challenges for their geo-coding. Few efforts so far have been devoted to analyzing the different types of geographical information users mention in tweets, and the challenges of geo-coding these in the light of omitted context by exploiting situative information. To overcome this limitation, we propose a gold-standard corpus building approach for evaluating such situative geo-coding, and contribute a human-curated, geo-referenced tweet corpus covering a real-world crisis event, suited for benchmarking of geo-coding tools. We demonstrate how incorporating a semantically rich Linked Open Data resource facilitates the analysis of types and prevalence of geospatial information encountered in crisis-related tweets, thereby highlighting directions for further research.

Representation of Polarity Information of Elements of German Compound Words Thierry Declerck

We present on-going work on using formal representation frameworks for encoding polarity information that can be attached to elements of German compound words. As a departure point we have a polarity lexicon for German words that was compiled and ranked on the basis of the integration of four pre-existing polarity lexicons that were available in different formats. As for the

formal representation frameworks, we are considering the lexicon model for ontologies (lemon), more specifically its modules ontolex (ontology-lexicon interface) and decomp (decomposition), which have been developed in the context of the W3C Ontology-Lexica Community Group, for the encoding of the lexical data. For the encoding of the polarity information we adopt a slightly modified version of the Marl ontological modelling, developed at the Universidad Politécnica de Madrid.
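
Independently of the RDF encoding, the underlying idea can be illustrated with a small Python sketch: a compound is decomposed into elements (the role played by the decomp module) and each element is looked up in a polarity lexicon. The lexicon, the greedy splitter and the values below are toy placeholders, not the authors' resources.

# Toy element-level polarity lexicon; the values are invented.
POLARITY = {"glück": 1.0, "un": -1.0, "fall": -0.5}

def decompose(compound, elements):
    """Greedy left-to-right split into known elements (a crude stand-in for decomp)."""
    parts, rest = [], compound.lower()
    while rest:
        match = next((e for e in sorted(elements, key=len, reverse=True)
                      if rest.startswith(e)), None)
        if match is None:
            return None
        parts.append(match)
        rest = rest[len(match):]
    return parts

def element_polarities(compound):
    """Attach polarity information to each element of the compound."""
    parts = decompose(compound, POLARITY)
    return None if parts is None else [(p, POLARITY[p]) for p in parts]

print(element_polarities("Unglück"))  # [('un', -1.0), ('glück', 1.0)]
print(element_polarities("Unfall"))   # [('un', -1.0), ('fall', -0.5)]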

Session 5: oral presentations Tuesday 24 May, 16:30 – 17:30 Chairperson: Bettina Klimek

Building an ontological model of the BLL Thesaurus: First step towards an interface with the LLOD cloud Vanya Dimitrova, Christian Fäth, Christian Chiarcos, Heike Renner-Westermann and Frank Abromeit

This paper describes the on-going efforts to position the Bibliography of Linguistic Literature (BLL) within the wider context of Linguistic Linked Open Data (LLOD) and to enhance the functionality of the Lin|gu|is|tik portal, a virtual library for the field of linguistics, with an LOD interface. Being the connecting point between the portal and the LLOD cloud, the BLL Thesaurus has to fulfil certain formal and conceptual requirements, i.e., it has to be remodelled as an ontology. The remodelling of the BLL Thesaurus is the main subject of the paper. Starting with the specificity of the Thesaurus, its scope and nature, we describe our general methodological approach and design solutions. We present the basic ontological framework and give concrete examples from our work in progress. Challenging cases from the domains of morphology and syntax illustrate the complexity of the task. Additionally, we specify the next steps towards an LOD interface and long-term perspectives.

Enhancing the Quality of Metadata by using Authority Control Claus Zinn and Thorsten Trippel

The Component MetaData Infrastructure (CMDI) is the dominant framework for describing language resources according to ISO 24622. Within the CLARIN world, CMDI has become a huge success. The Virtual Language Observatory (VLO) now holds over 800,000 resources, all described with CMDI-based metadata. With the metadata being harvested from about thirty centres, there is a considerable amount of heterogeneity in the data. In part, there is some use of controlled vocabularies to keep data heterogeneity in check, say when describing the type of a resource, or the country the resource originates from. However, when CMDI data refers to the names of persons or organisations, strings are used in a rather uncontrolled manner. Here, the CMDI community can learn from libraries and archives, which maintain standardised lists for all kinds of names. In this paper, we advocate the use of freely available authority files that support the unique identification of persons, organisations, and more. The systematic use of authority records enhances the quality of the metadata, hence improves the faceted browsing experience in the VLO, and also paves the way for sharing CMDI-based metadata with the data in library catalogues.
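
The effect of authority control can be sketched in a few lines of Python: free-text creator strings are normalised and replaced by a single authority identifier, so that differently spelled references to the same person collapse onto one record. The lookup table and URIs are invented placeholders, not real VIAF or GND entries.

import re

# Invented authority file: normalised name -> authority URI.
AUTHORITY = {
    "lovelace, ada": "http://example.org/authority/person/0001",
}

def normalize(name):
    """Lower-case, drop periods, and reorder 'First Last' to 'last, first'."""
    name = re.sub(r"\.", "", name).strip().lower()
    if "," not in name:
        parts = name.split()
        name = f"{parts[-1]}, {' '.join(parts[:-1])}"
    return name

def resolve_creator(record):
    """Replace the creator string with a label plus authority URI, if one is known."""
    uri = AUTHORITY.get(normalize(record["creator"]))
    return {**record, "creator": {"label": record["creator"], "authority": uri}}

print(resolve_creator({"title": "Resource A", "creator": "Ada Lovelace"}))
print(resolve_creator({"title": "Resource B", "creator": "Lovelace, Ada"}))
# Both records now point to the same authority URI despite different name strings.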

Developing and Using Ontologies of Linguistic Annotation Christian Chiarcos, Christian Fäth and Maria Sukhareva

This paper describes the Ontologies of Linguistic Annotation (OLiA) as one of the data sets currently available as part of the Linguistic Linked Open Data (LLOD) cloud. Within the LLOD cloud, the OLiA ontologies serve as a reference hub for annotation terminology for linguistic phenomena across a great bandwidth of languages. They have been used to facilitate interoperability and information integration of linguistic annotations in corpora, NLP pipelines, and lexical-semantic resources, and to mediate their linking with multiple community-maintained terminology repositories. The purpose of this paper is to provide an overview of progress, recent applications and prospective developments, and to introduce two novel applications of OLiA.
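
The kind of interoperability OLiA provides can be illustrated with a minimal Python sketch: tags from two different annotation schemes are mapped onto shared reference concepts, so corpora annotated with either scheme can be queried uniformly. The mapping table below is a heavily simplified stand-in for the OLiA annotation and linking models; only the namespace is assumed to come from OLiA.

OLIA = "http://purl.org/olia/olia.owl#"

# Simplified mapping from (annotation scheme, tag) to a shared reference concept.
TAG_TO_CONCEPT = {
    ("STTS", "NN"): OLIA + "CommonNoun",
    ("STTS", "NE"): OLIA + "ProperNoun",
    ("PennTreebank", "NN"): OLIA + "CommonNoun",
    ("PennTreebank", "NNP"): OLIA + "ProperNoun",
}

def reference_concept(scheme, tag):
    return TAG_TO_CONCEPT.get((scheme, tag))

# Two corpora with different tagsets agree at the level of reference concepts.
print(reference_concept("STTS", "NE") == reference_concept("PennTreebank", "NNP"))  # True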

GLOBALEX 2016 Lexicographic Resources for Human Language Technology

24 May 2016

ABSTRACTS

Editors:

Ilan Kernerman, Iztok Kosem, Simon Krek, Lars Trap-Jensen

Workshop Programme

09:00 – 09:10 Introduction by Ilan Kernerman and Simon Krek

09:10 – 10:30 Session 1
Patrick Hanks, A common-sense paradigm for linguistic research
Pamela Faber, Pilar León-Araúz and Arianne Reimerink, EcoLexicon: New features and challenges
Sara Carvalho, Rute Costa and Christophe Roche, Ontoterminology meets lexicography: The Multimodal Online Dictionary of Endometriosis (MODE)
Gregory Grefenstette and Lawrence Muchemi, Determining the characteristic vocabulary for a specialized dictionary using Word2vec and a directed crawler

10:30 – 11:00 Coffee break

11:00 – 12:40 Session 2
Malin Ahlberg, Lars Borin, Markus Forsberg, Olof Olsson, Anne Schumacher and Jonatan Uppström, Karp: Språkbanken’s open lexical infrastructure
Ivelina Stoyanova, Svetla Koeva, Maria Todorova and Svetlozara Leseva, Semi-automatic compilation of a very large multiword expression dictionary for Bulgarian
Raffaele Simone and Valentina Piunno, CombiNet: Italian word combinations in an online lexicographic tool
Irena Srdanović and Iztok Kosem, GDEX for Japanese: Automatic extraction of good dictionary example candidates
Jana Klímová, Veronika Kolářová and Anna Vernerová, Towards a corpus-based valency lexicon of Czech nouns

12:40 – 14:20 Lunch break

14:20 – 16:00 Session 3
Jan Hajič, Eva Fučíková, Jana Šindlerová and Zdeňka Urešová, Verb argument pairing in a Czech-English parallel treebank
Sonja Bosch and Laurette Pretorius, The role of computational Zulu verb morphology in multilingual lexicographic applications
Martin Benjamin, Toward a global online living dictionary: A model and obstacles for big linguistic data
Luís Morgado da Costa, Francis Bond and František Kratochvíl, Linking and disambiguating Swadesh lists
Luis Espinosa Anke, Roberto Carlini, Horacio Saggion and Francesco Ronzano, DEFEXT: A semi-supervised definition extraction tool

16:00 – 16:30 Coffee break

16:30 – 17:10 Session 4
Julia Bosque-Gil, Jorge Gracia, Elena Montiel-Ponsoda and Guadalupe Aguado-de-Cea, Modelling multilingual lexicographic resources for the web of data: The K Dictionaries case
Ivett Benyeda, Péter Koczka and Tamás Váradi, Creating seed lexicons for under-resourced languages

17:10 – 18:00 GLOBALEX Discussion

Workshop Organizers

Andrea Abel, EURALEX
Ilan Kernerman*, ASIALEX
Steven Kleinedler, DSNA
Iztok Kosem, eLex
Simon Krek*, eLex
Julia Miller, AUSTRALEX
Maropeng Victor Mojela, AFRILEX
Danie J. Prinsloo, AFRILEX
Rachel Edita O. Roxas, ASIALEX
Lars Trap-Jensen, EURALEX
Luanne von Schneidemesser, DSNA
Michael Walsh, AUSTRALEX

* Co-chairs of the Organising Committee

Workshop Programme Committee

Michael Adams, Indiana University
Philipp Cimiano, University of Bielefeld
Janet DeCesaris, Universitat Pompeu Fabra
Thierry Declerck, German Research Center for Artificial Intelligence
Anne Dykstra, Fryske Akademie
Edward Finegan, University of Southern California
Thierry Fontenelle, Translation Center for the Bodies of the EU
Polona Gantar, University of Ljubljana
Alexander Geyken, Berlin-Brandenburg Academy of Sciences and Humanities
Rufus Gouws, Stellenbosch University
Jorge Gracia, Madrid Polytechnic University
Orin Hargraves, University of Colorado
Ulrich Heid, Hildesheim University
Chu-Ren Huang, Hong Kong Polytechnic University
Miloš Jakubíček, Lexical Computing
Jelena Kallas, Institute of Estonian Language
Ilan Kernerman, K Dictionaries
Annette Klosa, German Language Institute
Iztok Kosem, Trojina
Simon Krek, “Jožef Stefan” Institute
Robert Lew, Adam Mickiewicz University
Marie-Claude L’Homme, Université de Montréal
Nikola Ljubešić, University of Zagreb
Stella Markantonatou, Institute for Language and Speech Processing, ATHENA
John McCrae, National University of Ireland Galway
Roberto Navigli, Sapienza University of Rome
Vincent Ooi, National University of Singapore
Michael Rundell, Lexicography Masterclass

Mary Salisbury, Massey University
Adam Smith, Macquarie University
Pius ten Hacken, Innsbruck University
Carole Tiberius, Institute of Dutch Lexicology
Yukio Tono, Tokyo University of Foreign Studies
Lars Trap-Jensen, Society for Danish Language and Literature
Tamás Váradi, Hungarian Academy of Sciences
Elena Volodina, Gothenburg University
Eveline Wandl-Vogt, Austrian Academy of Sciences
Shigeru Yamada, Waseda University

Introduction

The field of lexicography has been shifting to digital media, with an effect on all stages of research, development, design, evaluation, publication, marketing and usage. Modern lexicographic content is created with the help of dictionary writing tools, corpus query systems and QA applications, and becomes more easily accessible and useful for integration with numerous LT solutions, as part of bigger knowledge systems and collaborative intelligence.

At the same time, extensive interlinked language resources, primarily intended for use in Human Language Technology (HLT), are being created through projects, movements and initiatives, such as Linguistic Linked (Open) Data (LLOD), meeting requirements for optimal use in HLT, e.g. unique identification and use of web standards (RDF or JSON-LD), leading to better federation, interoperability and flexible representation. In this context, lexicography constitutes a natural and vital part of the LLOD scheme, currently represented by wordnets, FrameNets, and HLT-oriented lexicons, ontologies and lexical databases. However, a new research paradigm is still lacking, and so are common standards for the interoperability of lexicography with HLT applications and systems.

The aim of this workshop is to explore the development of global standards for the evaluation of lexicographic resources and their incorporation into new language technology services and other devices. The workshop is the first-ever joint initiative by all the major continental lexicography associations, seeking to promote cooperation with related fields of HLT for all languages worldwide, and it is intended to bridge various existing gaps within and among these different research fields and interest groups. The target audience includes lexicographers, computational and corpus linguists and researchers working in the fields of HLT, Linked Data, the Semantic Web, Artificial Intelligence, etc.

GLOBALEX 2016 is sponsored by the five existing continental lexicography associations and the international conferences on electronic lexicography:

• AFRILEX – The African Association for Lexicography
• ASIALEX – The Asian Association for Lexicography
• AUSTRALEX – The Australasian Association for Lexicography
• DSNA – The Dictionary Society of North America
• EURALEX – The European Association for Lexicography
• eLex – Electronic Lexicography in the 21st Century conferences

This workshop constitutes the initial step in forming GLOBALEX – a global constellation for all continental, regional, local, topical or special interest communities concerned with lexicography.

GLOBALEX will promote knowledge sharing and cooperation among its members and with other parties concerned with language and linguistics. It will aim to establish global standards for the creation, evaluation, dissemination and usage of lexicographic resources and solutions, and for the interoperability of lexicography with other relevant disciplines and branches of HLT academia and industry worldwide.

Tuesday 24 May, 9:10 – 9:30 Chairperson: Michael Rundell

A common-sense paradigm for linguistic research Patrick Hanks

During the past century, studies in linguistics have contributed greatly to the understanding of syntax and phonology, while philosophers have illuminated functions of language. This paper proposes that the time has now come to focus on something more obvious: the empirical investigation of meaning by studying word use and phraseology. It seems unlikely that even the most abstract theoretician would deny that a natural language consists primarily of words (among other phenomena), and that the primary function of word use is to make meaningful statements. But word use has not yet been adequately investigated. Astonishingly, there is still much to be discovered about the mechanisms of meaningful word use. This has been a topic in corpus linguistics since its inception a quarter of a century ago, but a firmer theoretical foundation is needed. When we attempt to build one, some surprising issues rapidly become apparent. Corpus linguistics has established, in the words of John Sinclair, that “Many if not most meanings require the presence of more than one word for their normal realization. [...] Patterns of co-selection among words, which are much stronger than any description has yet allowed for, have a direct connection with meaning.” This limpid observation implies a vast programme of research in collocation and phraseology in all the world’s languages. How exactly do people put words together in order to create meanings? For many years, I (a Sinclairian) have claimed, “Words don’t have meaning; they only have meaning potential.” How can such a claim be reconciled with the common-sense observation, “Of course words have meaning: I know what the word elephant means, because that is part of my competence as a speaker of English”? A first answer would be to cite phrases such as the elephant in the room and a white elephant. You will say that these are idioms, which are different from the word in isolation. I will then ask, what about phraseology like “The competition has come to resemble an elephant’s graveyard” and “[His] bitter mien suggests inner torment and an elephant’s memory for grievance”? It seems that there is a lot more going on than mere concatenation. Such phrases exploit attributed properties (real or imagined) of the noun. We might agree that concrete nouns such as elephant have a central default meaning, but in phrase after phrase corpus evidence shows that something more dramatic is going on, not adequately accounted for in current dictionaries. Some other questions that arise (and that can be investigated empirically through corpus analysis) are:
• Are meanings evanescent? Are they events that pass away as soon as uttered, or are they more stable abstract entities?
• Are noun meanings different in kind from verb or adjective meanings?
• How important is context, and what counts as relevant context?
• Does phraseology determine meaning?
• What is the nature of linguistic creativity?
• What is the nature of linguistic salience? Some senses of a word are clearly more frequent than others, but should we distinguish between social salience (frequency) and cognitive salience (recallability)?
• Is scientific discourse, with its heavy reliance on technical terminology, different in kind from ordinary language?
An adequate research paradigm for investigating meaning must provide a framework for investigating such questions. This paper will suggest how such questions might be approached.

Tuesday 24 May, 9:30 – 9:50 Chairperson: Michael Rundell

EcoLexicon: New features and challenges Pamela Faber, Pilar León-Araúz and Arianne Reimerink

EcoLexicon is a terminological knowledge base (TKB) on the environment with terms in six languages: English, French, German, Greek, Russian, and Spanish. It is the practical application of Frame-based Terminology, which uses a modified version of Fillmore’s Frames coupled with premises from Cognitive Linguistics to configure specialized domains on the basis of definitional templates and create situated representations for specialized knowledge concepts. The specification of the conceptual structure of (sub)events and the description of the lexical units are the result of a top-down and bottom-up approach that extracts information from a wide range of resources. This includes the use of corpora, the factorization of definitions from specialized resources and the extraction of conceptual relations with knowledge patterns. Similarly to a specialized visual thesaurus, EcoLexicon provides entries in the form of semantic networks that specify relations between environmental concepts. All entries are linked to a corresponding (sub)event and conceptual category. In other words, the structure of the conceptual, graphical, and linguistic information relative to entries is based on an underlying conceptual frame. Graphical information includes photos, images, and videos, whereas linguistic information specifies not only the grammatical category of each term, but also phraseological and contextual information. The TKB also provides access to the specialized corpus created for its development and a search engine to query it. One of the challenges for EcoLexicon in the near future is its inclusion in the Linguistic Linked Open Data cloud.

Tuesday 24 May, 9:50 – 10:10 Chairperson: Michael Rundell

Ontoterminology meets lexicography: The Multimodal Online Dictionary of Endometriosis (MODE) Sara Carvalho, Rute Costa and Christophe Roche

With the advent of the Semantic Web and, more recently, of the Linked Data initiative, the need to operationalise lexicographic resources, i.e. to represent them in a computer-readable format, has become increasingly important, as it contributes to paving the way to the ultimate goal of interoperability. Moreover, the collaborative work involving Terminology and ontologies has led to the emergence of new theoretical perspectives, namely to the notion of Ontoterminology, which aims to reconcile Terminology’s linguistic and conceptual dimensions whilst preserving their core identities. This can be particularly relevant in subject fields such as Medicine, where concept-oriented and ontology-based approaches have become the cornerstone of the most recent (bio)medical terminological resources, and where non-verbal concept representations play a key role. Due to the lack of specialised lexicographic resources in the field of endometriosis, this paper presents the MODE project, i.e. the Multimodal Online Dictionary of Endometriosis, a multilingual resource comprising several types of data, namely video articles, a new type of scholarly communication in Medicine. It is believed that introducing a medical lexicographic resource supported by ontoterminological principles and encompassing scientific video articles may constitute a relevant window of opportunity in the research field of Lexicography.

Tuesday 24 May, 10:10 – 10:30 Chairperson: Michael Rundell

Determining the characteristic vocabulary for a specialized dictionary using Word2vec and a directed crawler Gregory Grefenstette and Lawrence Muchemi

Specialized dictionaries are used to understand concepts in specific domains, especially where those concepts are not part of the general vocabulary or have meanings that differ from those in ordinary language. The first step in creating a specialized dictionary involves detecting the characteristic vocabulary of the domain in question. Classical methods for detecting this vocabulary involve gathering a domain corpus, calculating statistics on the terms found there, and then comparing these statistics to a background or general language corpus. Terms which are found significantly more often in the specialized corpus than in the background corpus are candidates for the characteristic vocabulary of the domain. Here we present two tools, a directed crawler and a distributional semantics package, that can be used together, circumventing the need for a background corpus. Both tools are available on the web.
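
A rough sketch of how the two kinds of tool can be combined, assuming gensim's Word2Vec as the distributional semantics component: embeddings are trained on the crawled domain corpus and a few seed terms are expanded with their nearest neighbours, which become candidates for the characteristic vocabulary. The three-sentence corpus below is only a placeholder for the output of a directed crawler, and the exact procedure in the paper may differ.

from gensim.models import Word2Vec

# Placeholder for the tokenized domain corpus gathered by the directed crawler.
domain_corpus = [
    "the patient was treated with a low dose of the drug".split(),
    "the drug reduced the symptoms of the disease in most patients".split(),
    "a higher dose increased the risk of side effects for the patient".split(),
]

model = Word2Vec(sentences=domain_corpus, vector_size=50, window=3,
                 min_count=1, epochs=200, seed=1)

seed_terms = ["drug", "patient"]
candidates = set()
for term in seed_terms:
    for neighbour, _ in model.wv.most_similar(term, topn=5):
        candidates.add(neighbour)

print(sorted(candidates))  # candidate characteristic vocabulary (noisy on toy data)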

Tuesday 24 May, 11:00 – 11:20 Chairperson: Jelena Kallas

Karp: Språkbanken’s open lexical infrastructure Malin Ahlberg, Lars Borin, Markus Forsberg, Olof Olsson, Anne Schumacher and Jonatan Uppström

Karp is the open lexical infrastructure of Språkbanken (the Swedish Language Bank). As of today, there are 25+, mostly Swedish, lexical resources available in Karp, including modern lexicons designed for LT use, as well as older digitized dictionaries. Most resources, including the historical ones, have been at least partially linked to a pivot resource, SALDO, defining a connected network of Swedish lexical information. There are also multi-lingual resources representing more than 30 languages, as well as a lexicon for the ideographic writing system Bliss. Karp is being developed in collaboration with Swe-Clarin, and we pay close attention to its standards and best practices. Karp has been designed to support the creation and development of lexical resources. There are three main components: a REST-based web service, a graphical user interface, and an authentication server for managing user access. Users can add, update and remove entries, and a revision history is kept for each resource. A resource may have a group of authorized editors, but the system also allows for unauthorized users to give suggestions that can later be approved by editors. Karp provides user support during editing, such as feedback on the formatting, the compliance to a standard, or similar. The editing functionality in Karp has been a central component in several projects, among them are the Swedish Framenet++ (Ahlberg et al., 2014) and the Swedish Constructicon (Lyngfelt et al., 2014).

Tuesday 24 May, 11:20 – 11:40 Chairperson: Jelena Kallas

Semi-automatic compilation of a very large multiword expression dictionary for Bulgarian Ivelina Stoyanova, Svetla Koeva, Maria Todorova and Svetlozara Leseva

The paper presents a very large dictionary of multiword expressions in Bulgarian which is compiled semi-automatically. We outline the main features of Bulgarian MWEs and their classification based on morphosyntactic, structural and semantic criteria. Further, we discuss the multi-layered organisation of the Dictionary and the components of the description of the MWEs, as well as the links to other lexicographic and general resources. Finally, we present the semi-automatic procedures for the compilation of the MWE entries. The work on the Dictionary is ongoing, aimed at extending its contents in terms of adding new entries and new layers of description, as well as at improving the quality of the resource.

Tuesday 24 May, 11:40 – 12:00 Chairperson: Jelena Kallas

CombiNet: Italian word combinations in an online lexicographic tool Raffaele Simone and Valentina Piunno

This paper introduces the CombiNet dictionary, an online corpus-based lexicographic tool representing the combinatorial properties of Italian lexemes, developed by Roma Tre University, the University of Pisa and the University of Bologna. The lexicographic layout of CombiNet is designed to include different sets of information, such as i) syntactic configurations and ii) syntactic functions of word combinations, and iii) the degree of lexical variation associated with specific types of multiword units. In fact, CombiNet records word combinations showing different degrees of lexicalization and paradigmatic variability, which is a novelty in lexicography. This investigation intends to tackle several issues associated with CombiNet, and in particular it aims at a) showing the procedures and methods used to create and compile CombiNet's entries, b) describing particular types of combinatorial phenomena that emerged from the analysis of corpus-based data, c) illustrating the lexicographic layout that has been elaborated for the representation of word combinations, and d) describing the advanced research tool CombiNet is equipped with, a useful device for lexicographic investigations as well as for lexicological analysis.

Tuesday 24 May, 12:00 – 12:20 Chairperson: Jelena Kallas

GDEX for Japanese: Automatic extraction of good dictionary example candidates Irena Srdanović and Iztok Kosem

The GDEX tool, devised to assist lexicographers in identifying good dictionary examples, was initially created for the English language (Kilgarriff et al., 2008) and proved very useful in various dictionary projects (cf. Rundell & Kilgarriff, 2011). Later on, GDEX configurations were developed for Slovene (Kosem et al., 2011, 2013) and other languages. This paper employs similar methods to design GDEX for Japanese in order to extract good example candidates from Japanese language corpora available inside the Sketch Engine system. Criteria and parameters, which were adapted to the needs of the Japanese language, were based on the configuration for Slovene as well as the default language-independent configuration available in the Sketch Engine. A number of different configurations were devised and compared in order to identify optimal values for good example identification. The paper also explores a language-learner-oriented approach to good example extraction by taking into account the different difficulty levels of lexemes based on the Japanese Language Proficiency Test list of words and levels. For this purpose, additional configurations were devised, which are tailored to individual levels and thus useful for language learners and lexicographers of learners' dictionaries.
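
A toy scorer in the spirit of a GDEX configuration is sketched below: each candidate sentence receives a score from simple, tunable heuristics such as a length window and a penalty for vocabulary above the target proficiency level. The thresholds, penalties and the small word list standing in for a JLPT-style level list are invented, not the configurations developed in the paper.

# Tiny stand-in for a target-level vocabulary list (e.g. an easy JLPT level).
KNOWN_EASY_WORDS = {"学生", "は", "本", "を", "読む", "毎日", "図書館", "で"}

def gdex_score(tokens, min_len=5, max_len=20):
    """Score a tokenized candidate sentence; higher is a better dictionary example."""
    score = 1.0
    if not (min_len <= len(tokens) <= max_len):
        score -= 0.5                      # penalise sentences that are too short or too long
    unknown = [t for t in tokens if t not in KNOWN_EASY_WORDS]
    score -= 0.1 * len(unknown)           # penalise vocabulary above the target level
    return max(score, 0.0)

candidates = [
    ["学生", "は", "毎日", "図書館", "で", "本", "を", "読む"],
    ["難解", "な", "専門", "用語", "が", "頻出", "する", "文"],
]
ranked = sorted(candidates, key=gdex_score, reverse=True)
print([("".join(c), round(gdex_score(c), 2)) for c in ranked])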

Tuesday 24 May, 12:20 – 12:40 Chairperson: Jelena Kallas

Towards a corpus-based valency lexicon of Czech nouns Jana Klímová, Veronika Kolářová and Anna Vernerová

The Corpus-based Valency Lexicon of Czech Nouns is a starting project picking up the threads of our previous work on nominal valency. It builds upon the solid theoretical foundations of the theory of valency developed within the Functional Generative Description. In this paper, we describe the ways of treating the valency of nouns in a modern corpus-based lexicon, available as machine-readable data in a format suitable for NLP applications, and report on the limitations that the most commonly used corpus interfaces impose on research into nominal valency. The linguistic material is extracted from the Prague Dependency Treebank, the synchronic written part of the Czech National Corpus, and Araneum Bohemicum. We will utilize lexicographic software and partially also data developed for the valency lexicon PDT-Vallex, but the treatment of entries will be more exhaustive, for example, in the coverage of senses and in the semantic classification added to selected lexical units (meanings). The main criteria for including nouns in the lexicon will be semantic class membership and the complexity of valency patterns. The valency of nouns will be captured in the form of valency frames, an enumeration of all possible combinations of adnominal participants, and corpus examples.

Tuesday 24 May, 14:20 – 14:40 Chairperson: Paul Cook

Verb argument pairing in a Czech-English parallel treebank Jan Hajič, Eva Fučíková, Jana Šindlerová and Zdeňka Urešová

We describe CzEngVallex, a bilingual Czech-English valency lexicon which aligns verbal valency frames and their arguments. It is based on a parallel Czech-English corpus, the Prague Czech-English Dependency Treebank, where for each occurrence of a verb a reference to the underlying Czech and English valency lexicons is explicitly recorded. The CzEngVallex lexicon pairs the entries (verb senses) of these two lexicons and allows for detailed studies of verb valency and argument structure in translation. While some related studies have already been published on certain phenomena, we concentrate here on basic statistics, showing that the variability of verb argument mapping between verbs in the two languages is richer than it might seem and than the studies published so far might suggest.

Tuesday 24 May, 14:40 – 15:00 Chairperson: Paul Cook

The role of computational Zulu verb morphology in multilingual lexicographic applications Sonja Bosch and Laurette Pretorius

Performing cross-lingual natural language processing and developing multilingual lexicographic applications for languages with complex agglutinative morphology pose specific challenges that are aggravated when such languages are also under-resourced. In this paper, Zulu, an under-resourced language spoken in Southern Africa, is considered. The verb is the most complex word category in Zulu. Due to the agglutinative nature of Zulu morphology, limited information can be computationally extracted from running Zulu text without the support of sufficiently reliable computational morphological analysis by means of which the essential meanings of, amongst others, verbs can be exposed. The central research question that is addressed in this paper is as follows: How could ZulMorph (http://gama.unisa.ac.za/demo/demo/ZulMorph), a finite state morphological analyser for Zulu, be employed to support multilingual lexicography and cross-lingual natural language processing applications, with specific reference to Zulu verbs?

Tuesday 24 May, 15:00 – 15:20 Chairperson: Paul Cook

Toward a global online living dictionary: A model and obstacles for big linguistic data Martin Benjamin

Kamusi has labelled itself the “Global Online Living Dictionary” (GOLD), and this paper explores how its approach to multilingual lexicography is designed to fulfil that mission. The paper discusses recent under-the-hood work intended to produce an online compendium of linguistic data that satisfies many contemporary lexicographic concerns, working through hurdles to develop implementable systems for a viable lexicographic data resource for HLT knowledge and NLP uses across hundreds of languages.

Tuesday 24 May, 15:20 – 15:40 Chairperson: Paul Cook

Linking and disambiguating Swadesh lists Luís Morgado da Costa, Francis Bond and František Kratochvíl

In this paper we describe two main contributions in the fields of lexicography and Linked Open Data: a human-corrected disambiguation, using the Princeton WordNet’s sense inventory (PWN, Fellbaum, 1998), of Swadesh lists maintained in the Internet Archive by the Rosetta Project, and the distribution of this data through an expansion of the Open Multilingual Wordnet (OMW, Bond and Foster, 2013). Disambiguating word lists is not always a straightforward task. The PWN is a vast resource with many fine-grained senses, and word lists often fail to help resolve the inherent ambiguity of words. In this work we describe the corner cases of this disambiguation and, when necessary, motivate our choice over other possible senses. We take the results of this work as a great example of the benefits of sharing linguistic data under open licenses, and will continue linking other openly available data. All the data will be released in future OMW releases, and we will encourage the community to contribute to correcting and adding to the data made available.

Tuesday 24 May, 15:40 – 16:00 Chairperson: Paul Cook

DEFEXT: A semi-supervised definition extraction tool Luis Espinosa Anke, Roberto Carlini, Horacio Saggion and Francesco Ronzano

We present DEFEXT, an easy-to-use semi-supervised Definition Extraction Tool. DEFEXT is designed to extract from a target corpus those textual fragments where a term is explicitly mentioned together with its core features, i.e. its definition. It works on the back of a Conditional Random Fields-based sequential labeling algorithm and a bootstrapping approach. Bootstrapping enables the model to gradually become more aware of the idiosyncrasies of the target corpus. In this paper we describe the main components of the toolkit as well as experimental results from both automatic and manual evaluation. We release DEFEXT as open source along with the necessary files to run it on any machine. We also provide access to training and test data for immediate use.
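
A minimal sketch of the sequential-labelling core, using sklearn-crfsuite as a stand-in for the CRF implementation: tokens are labelled as term, definition or other on the basis of simple token-level features, and bootstrapping would then feed confidently labelled sentences from the target corpus back into the training data. The training sentences, labels and features here are toy placeholders, not DEFEXT's actual feature set.

import sklearn_crfsuite

def features(sentence, i):
    """Very small token-level feature set for illustration."""
    token = sentence[i]
    return {
        "lower": token.lower(),
        "is_capitalised": token[0].isupper(),
        "prev": sentence[i - 1].lower() if i > 0 else "<BOS>",
        "next": sentence[i + 1].lower() if i < len(sentence) - 1 else "<EOS>",
    }

train_sents = [
    "A lexicon is a list of words".split(),
    "The workshop starts on Tuesday".split(),
]
train_labels = [
    ["O", "TERM", "DEF", "DEF", "DEF", "DEF", "DEF"],
    ["O", "O", "O", "O", "O"],
]

X = [[features(s, i) for i in range(len(s))] for s in train_sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, train_labels)

test = "A corpus is a collection of texts".split()
print(list(zip(test, crf.predict([[features(test, i) for i in range(len(test))]])[0])))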

Tuesday 24 May, 16:30 – 16:50 Chairperson: Laurette Pretorius

Modelling multilingual lexicographic resources for the web of data: The K Dictionaries case Julia Bosque-Gil, Jorge Gracia, Elena Montiel-Ponsoda and Guadalupe Aguado-de-Cea

Lexicographic resources can highly benefit from Semantic Web technologies, specifically linked data technologies, since such resources can not only become easy to access and query, but also easy to share and link to resources that contain complementary information, contributing to the creation of a huge graph of interlinked lexical and linguistic resources. In this paper, we present the methodology we have followed for the transformation of a lexicographic resource, namely the Spanish dataset of K Dictionaries’ Global series, from its proprietary XML format to RDF, according to the lemon-ontolex model, a de-facto standard for representing lexical information in the Web of Data. We describe in detail the original resource, the design decisions taken for the transformation process, the model chosen for the representation of the dataset, as well as the extensions made to the model to accommodate specific modelling needs of the original source. The core of the representation model is described in detail in order to illustrate the issues encountered and how they have been solved in this first prototype, which could serve to lay the foundations for future transformations.

Tuesday 24 May, 16:50 – 17:10 Chairperson: Laurette Pretorius

Creating seed lexicons for under-resourced languages Ivett Benyeda, Péter Koczka and Tamás Váradi

In this paper we present methods of creating seed dictionaries for an under-resourced language, Udmurt, paired with four thriving languages. As reliable machine-readable dictionaries do not exist in the desired quantities, this step is crucial to enable further NLP tasks, as seed dictionaries can be considered the first connecting element between two sets of texts. For the language pairs discussed in this paper, a detailed description is given of various methods of translation pair extraction, namely Wik2Dict, triangulation and Wikipedia article title pair extraction, and of handling the problematic aspects, such as multiword expressions (MWEs) among others. After merging the created dictionaries we were able to create seed dictionaries for all language pairs with approximately a thousand entries, which will be used for sentence alignment in future steps and thus will aid the extraction of larger dictionaries.
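
The triangulation step can be pictured with a toy Python sketch: if existing dictionaries provide Udmurt-Russian and Russian-English pairs, candidate Udmurt-English pairs are obtained through the shared Russian pivot and would then be filtered. The entries below are illustrative placeholders, not data from the project.

# Toy bilingual dictionaries: Udmurt -> Russian and Russian -> English.
udm_rus = {"ву": ["вода"], "нянь": ["хлеб"]}
rus_eng = {"вода": ["water"], "хлеб": ["bread", "loaf"]}

def triangulate(src_pivot, pivot_tgt):
    """Build a candidate source-target seed dictionary through the pivot language."""
    seed = {}
    for src, pivots in src_pivot.items():
        targets = {t for p in pivots for t in pivot_tgt.get(p, [])}
        if targets:
            seed[src] = sorted(targets)
    return seed

print(triangulate(udm_rus, rus_eng))
# {'ву': ['water'], 'нянь': ['bread', 'loaf']}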

Translation Evaluation: From Fragmented Tools and Data Sets to an Integrated Ecosystem

24 May 2016

ABSTRACTS

Editors:

Georg Rehm, Aljoscha Burchardt, Ondřej Bojar, Christian Dugast, Marcello Federico, Josef van Genabith, Barry Haddow, Jan Hajič, Kim Harris, Philipp Koehn, Matteo Negri, Martin Popel, Lucia Specia, Marco Turchi, Hans Uszkoreit

Workshop Programme

09.00 – 09.10 – Welcome – introduction – context

Session 1: Tools, Methods and Resources for Research

09.10 – 09.30 – Julia Ive, Aurélien Max, François Yvon, Philippe Ravaud, Diagnosing High-Quality Statistical Machine Translation Using Traces of Post-Edition Operations

09.30 – 09.50 – Ondřej Bojar, Filip Děchtěrenko, Maria Zelenina, A Pilot Eye-Tracking Study of WMT-Style Ranking Evaluation

09.50 – 10.00 – Anabela Barreiro, Francisco Raposo, Tiago Luís, CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Translation Units (short presentation)

10.00 – 10.10 – Zijian Győző Yang, László János Laki, Borbála Siklósi, HuQ: An English-Hungarian Corpus for Quality Estimation (short presentation)

10.10 – 10.30 – Discussion of the papers presented in Session 1

10.30 – 11.00 – Coffee break

Session 2: Shared Tasks

11.00 – 11.20 – Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post, Lucia Specia, Ten Years of WMT Evaluation Campaigns: Lessons Learnt

11.20 – 11.40 – Luisa Bentivogli, Marcello Federico, Sebastian Stüker, Mauro Cettolo, Jan Niehues, The IWSLT Evaluation Campaign: Challenges, Achievements, Future Directions

11.40 – 12.00 – Discussion of the papers presented in Session 2

Session 3: Evaluation Tools and Metrics (part A)

12.00 – 12.20 – Katrin Marheinecke, Can Quality Metrics Become the Drivers for Machine Translation Uptake? An Industry Perspective

12.20 – 12.40 – Kim Harris, Aljoscha Burchardt, Georg Rehm, Lucia Specia, Technology Landscape for Quality Evaluation: Combining the Needs of Research and Industry

12.40 – 13.00 – Eleftherios Avramidis, Interoperability in MT Quality Estimation or wrapping useful stuff in various ways

13.00 – 14.00 – Lunch break

Session 3: Evaluation Tools and Metrics (part B)

14.00 – 14.20 – Arle Lommel, Blues for BLEU: Reconsidering the Validity of Reference-Based MT Evaluation

14.20 – 14.40 – Roman Sudarikov, Martin Popel, Ondřej Bojar, Aljoscha Burchardt, Ondřej Klejch, Using MT-ComparEval

14.40 – 15.00 – Michal Tyszkowski, Dorota Szaszko, CMT: Predictive Machine Translation Quality Evaluation Metric

15.00 – 15.20 – Aljoscha Burchardt, Kim Harris, Georg Rehm, Hans Uszkoreit, Towards a Systematic and Human-Informed Paradigm for High-Quality Machine Translation

15.20 – 16.00 – Discussion of the papers presented in Session 3 (parts A and B)

16.00 – 16.30 – Coffee break

16.30 – 17.30 – Summary – final discussion – next steps: towards an integrated ecosystem?

17.30 – End of workshop

Organising Committee

Ondřej Bojar, Charles University in Prague, Czech Republic
Aljoscha Burchardt, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany*
Christian Dugast, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Marcello Federico, Fondazione Bruno Kessler (FBK), Italy
Josef van Genabith, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Barry Haddow, University of Edinburgh, UK
Jan Hajič, Charles University in Prague, Czech Republic
Kim Harris, text&form, Germany
Philipp Koehn, Johns Hopkins University, USA, and University of Edinburgh, UK
Matteo Negri, Fondazione Bruno Kessler (FBK), Italy
Martin Popel, Charles University in Prague, Czech Republic
Georg Rehm, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany*
Lucia Specia, University of Sheffield, UK
Marco Turchi, Fondazione Bruno Kessler (FBK), Italy
Hans Uszkoreit, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany

* main editors and chairs of the Organising Committee

Programme Committee

Nora Aranberri, University of the Basque Country, Spain
Ondřej Bojar, Charles University in Prague, Czech Republic
Aljoscha Burchardt, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Christian Dugast, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Marcello Federico, Fondazione Bruno Kessler (FBK), Italy
Christian Federmann, Microsoft, USA
Rosa Gaudio, Higher Functions, Portugal
Josef van Genabith, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Barry Haddow, University of Edinburgh, UK
Jan Hajič, Charles University in Prague, Czech Republic
Kim Harris, text&form, Germany
Matthias Heyn, SDL, Belgium
Philipp Koehn, Johns Hopkins University, USA, and University of Edinburgh, UK
Christian Lieske, SAP, Germany
Lena Marg, Welocalize, UK
Katrin Marheinecke, text&form, Germany
Matteo Negri, Fondazione Bruno Kessler (FBK), Italy
Martin Popel, Charles University in Prague, Czech Republic
Jörg Porsiel, Volkswagen AG, Germany
Georg Rehm, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany
Rubén Rodriguez de la Fuente, PayPal, Spain
Lucia Specia, University of Sheffield, UK
Marco Turchi, Fondazione Bruno Kessler (FBK), Italy
Hans Uszkoreit, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Germany

Preface

Current approaches to Machine Translation (MT) or professional translation evaluation, both automatic and manual, are characterised by a high degree of fragmentation, heterogeneity and a lack of interoperability between methods, tools and data sets. As a consequence, it is difficult to reproduce, interpret, and compare evaluation results. In an attempt to address this issue, the main objective of this workshop is to bring together researchers working on translation evaluation and practitioners (translators, users of MT, language service providers etc.).

This workshop takes an in-depth look at an area of ever-increasing importance. Two clear trends have emerged over the past several years. The first trend involves standardising evaluations in research through large shared tasks in which actual translations are compared to reference translations using automatic metrics or human ranking. The second trend focuses on achieving high quality (HQ) translations with the help of increasingly complex data sets that contain many levels of annotation based on sophisticated quality metrics – often organised in the context of smaller shared tasks. In industry, we also observe an increased interest in workflows for HQ outbound translation that combine Translation Memories, MT, and post-editing. In stark contrast to this trend towards quality translation and its inherent overall approach and complexity, the data and tooling landscapes remain rather heterogeneous, uncoordinated and not interoperable.

The event brings together researchers, users and providers of tools, and users and providers of manual and automatic translation evaluation methodologies. We want to initiate a dialogue and discuss whether the current approach involving a diverse, heterogeneous and distributed set of data, tools, scripts, and evaluation methodologies is appropriate enough or if the community should, instead, collaborate towards building an integrated ecosystem that provides better and more sustainable access to data sets, evaluation workflows, tools, approaches, and metrics that support processes such as annotations, quality comparisons and post-editing.

The workshop is meant to stimulate a dialogue about the commonalities, similarities and differences of the existing solutions in the three areas (1) tools, (2) methodologies, (3) data sets. A key question concerns the high level of flexibility and lack of interoperability of heterogeneous approaches, while a homogeneous approach would provide less flexibility but higher interoperability and thus allow, e.g., integrated research by means of an MT app store (cf. The Translingual Cloud anticipated in the META-NET Strategic Research Agenda). How much flexibility and interoperability does the translation community need? How much does it want? How can communication and collaboration between industry and research be intensified?

We hope that the papers presented and discussed at the workshop provide at least partial answers on these, and other, crucial questions around the complex and interdisciplinary topic of evaluating translations, either produced by machines or by human experts.

G. Rehm, A. Burchardt, O. Bojar, C. Dugast, M. Federico, J. van Genabith, B. Haddow, J. Hajič, K. Harris, P. Koehn, M. Negri, M. Popel, L. Specia, M. Turchi, H. Uszkoreit May 2016

Abstracts

Session 1: Tools, Methods and Resources for Research Tuesday, 24 May 2016, 09:00 – 10:30

Diagnosing High-Quality Statistical Machine Translation Using Traces of Post-Edition Operations

Julia Ive, Aurélien Max, François Yvon, Philippe Ravaud

This paper proposes a fine-grained flexible analysis methodology to reveal the residual difficulties of a high-quality Statistical Machine Translation (SMT) system. This proposal is motivated by the fact that the traditional automated metrics are not informative enough to indicate the nature and reasons of those residual difficulties. Their resolution is, however, a key point towards improving the high-quality output. The novelty of our approach consists in diagnosing Machine Translation (MT) performance by making a connection between errors, the characteristics of source sentences and some internal parameters of the system, using traces of Post-Edition (PE) operations as well as Quality Estimation (QE) techniques. Our methodology is illustrated on an SMT system adapted to the medical domain, based on a high-quality English-French parallel corpus of Cochrane systematic review abstracts. Our experimental results show that the main difficulties that the system faces are in the domains of term precision and source language syntactic and stylistic peculiarities. We furthermore provide general information regarding the corpus structure and its specificities, including internal stylistic varieties characteristic of this sub-genre.

A Pilot Eye-Tracking Study of WMT-Style Ranking Evaluation

Ondřej Bojar, Filip Děchtěrenko, Maria Zelenina

The shared translation task of the Workshop on Statistical Machine Translation (WMT) is one of the key annual events of the field. Participating machine translation systems in the WMT translation task are manually evaluated by relatively ranking five candidate translations of a given sentence. This style of evaluation has been used since 2007, with some discussion on interpreting the collected judgements but virtually no insight into what the annotators are actually doing. The scoring task is relatively cognitively demanding and many scoring strategies are possible, influencing the reliability of the final judgements. In this paper, we describe our first steps towards explaining the scoring task: we run the scoring under an eye-tracker and monitor what the annotators do. At the current stage, our results are more of a proof of concept, testing the feasibility of eye tracking for the analysis of such a complex MT evaluation setup.

CLUE-Aligner: An Alignment Tool to Annotate Pairs of Paraphrastic and Translation Units

Anabela Barreiro, Francisco Raposo, Tiago Luís

Currently available alignment tools and procedures for marking up alignments overlook non-contiguous multiword units because they are too complex to handle within the bounds of the proposed alignment methodologies. This paper presents the CLUE-Aligner (Cross-Language Unit Elicitation Aligner), a web alignment tool designed for manual annotation of pairs of paraphrastic and translation units, representing both contiguous and non-contiguous multiwords and phrasal expressions found in monolingual or bilingual parallel sentences. Non-contiguous block alignments are necessary to express alignments between multiwords or phrases which contain insertions, i.e., words that are not part of the multiword unit or phrase. CLUE-Aligner also allows the alignment of smaller individual or multiword units inside non-contiguous multiword units. The interactive web application was developed within the scope of the eSPERTo project, which aims to build a linguistically enhanced paraphrasing system. However, a tool for manual annotation of alignments and for visualization of automatic phrase alignment can prove useful in human and machine translation evaluation.

HuQ: An English-Hungarian Corpus for Quality Estimation

Zijian Győző Yang, László János Laki, Borbála Siklósi

Quality estimation for machine translation is an important task. The standard automatic evaluation methods that use reference translations cannot perform the evaluation task well enough; they produce low correlation with human evaluation for English-Hungarian. Quality estimation is a new approach to this problem: it is a prediction task that estimates the quality of translations using features extracted only from the source and translated sentences. Quality estimation systems have not been implemented for Hungarian before, so no training corpus existed either. In this study, we created a dataset to build quality estimation models for English-Hungarian. We also ran experiments to optimize the quality estimation system for Hungarian. In the optimization task we investigated feature engineering and feature selection, and we created optimized feature sets which produced better results than the baseline feature set.
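For readers unfamiliar with reference-free quality estimation, the snippet below sketches a few illustrative surface features computed only from a source sentence and its machine translation; the feature set and the example sentences are simplified assumptions for exposition, not the feature set built for HuQ.

    def qe_surface_features(source: str, translation: str) -> dict:
        """Toy reference-free QE features from source and MT output only.

        These are generic surface indicators (length ratio, punctuation and
        digit mismatches), not the optimized English-Hungarian feature set
        described in the paper.
        """
        src_tokens = source.split()
        tgt_tokens = translation.split()
        return {
            "src_len": len(src_tokens),
            "tgt_len": len(tgt_tokens),
            "len_ratio": len(tgt_tokens) / max(len(src_tokens), 1),
            "punct_diff": abs(sum(c in ".,;:!?" for c in source)
                              - sum(c in ".,;:!?" for c in translation)),
            "digit_diff": abs(sum(c.isdigit() for c in source)
                              - sum(c.isdigit() for c in translation)),
        }

    print(qe_surface_features("The patient was treated for 10 days.",
                              "A beteget 10 napig kezeltek."))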

Session 2: Shared Tasks Tuesday, 24 May 2016, 11:00 – 12:00

Ten Years of WMT Evaluation Campaigns: Lessons Learnt

Ondřej Bojar, Christian Federmann, Barry Haddow, Philipp Koehn, Matt Post, Lucia Specia

The WMT evaluation campaign (http://www.statmt.org/wmt16) has been run annually since 2006. It is a collection of shared tasks related to machine translation, in which researchers compare their techniques against those of others in the field. The longest running task in the campaign is the translation task, where participants translate a common test set with their MT systems. In addition to the translation task, we have also included shared tasks on evaluation: both on automatic metrics (since 2008), which compare the reference to the MT system output, and on quality estimation (since 2012), where system output is evaluated without a reference. An important component of WMT has always been the manual evaluation, wherein human annotators are used to produce the official ranking of the systems in each translation task. This reflects the belief of the WMT organizers that human judgement should be the ultimate arbiter of MT quality. Over the years, we have experimented with different methods of improving the reliability, efficiency and discriminatory power of these judgements. In this paper we report on our experiences in running this evaluation campaign, the current state of the art in MT evaluation (both human and automatic), and our plans for future editions of WMT.

The IWSLT Evaluation Campaign: Challenges, Achievements, Future Directions

Luisa Bentivogli, Marcello Federico, Sebastian Stüker, Mauro Cettolo, Jan Niehues

Evaluation campaigns are the most successful modality for promoting the assessment of the state of the art of a field on a specific task. Within the field of Machine Translation (MT), the International Workshop on Spoken Language Translation (IWSLT) is a yearly scientific workshop, associated with an open evaluation campaign on spoken language translation. The IWSLT campaign, which is the only one addressing speech translation, started in 2004 and will feature its 13th installment in 2016. Since its beginning, the campaign has attracted around 70 different participating teams from all over the world. In this paper we present the main characteristics of the tasks offered within IWSLT, as well as the evaluation framework adopted and the data made available to the research community. We also analyse and discuss the progress made by the systems over the years on the most addressed and long-standing tasks, and we share ideas about new challenging data and interesting application scenarios to test the utility of MT systems in real tasks.

Session 3: Evaluation Tools and Metrics (part A) Tuesday, 24 May 2016, 12:00 – 13:00

Can Quality Metrics Become the Drivers of Machine Translation Uptake? An Industry Perspective

Katrin Marheinecke

Language service providers (LSPs) who want to make use of Machine Translation (MT) have to fight on several fronts. The skepticism within the language industry is still very high: end customers worry about paying too much for translations that no human has interfered with. Translators refuse to get involved in post-editing activities because they fear that MT will take away their actual work, rendering them useless in the long run. And competitors try to outperform each other by either spoiling the market with low-quality MT offered at dumping rates or rejecting MT altogether as inappropriate for commercial usage. This paper seeks to show that well-defined quality metrics can help all stakeholders of the translation market to specify adequate benchmarks for the desired translation quality, to use an agreed-upon consistent mark-up, and to evaluate translation quality accordingly, for MT and human translation output alike. As a by-product, this professionally error-annotated MT output will help researchers to further improve MT quality, which in turn will help to make this technology more popular in the industry.

Technology Landscape for Quality Evaluation: Combining the Needs of Research and Industry

Kim Harris, Aljoscha Burchardt, Georg Rehm, Lucia Specia

Translation quality evaluation (QE) has gained significant uptake in recent years, in particular in light of increased demand for automated translation workflows and machine translation. Despite the need for innovative and forward-looking quality evaluation solutions, the technology landscape remains highly fragmented, and the two major constituencies in need of collaborative and ground-breaking technology are still very divided. This paper demonstrates that closer cooperation between users of QE technology in research and industry, aimed at creating a holistic but highly adaptable environment for all aspects of the translation improvement process, most significantly quality evaluation, can lead the way to novel and ground-breaking achievements in the accelerated improvement of machine translation results.

Interoperability in MT Quality Estimation or wrapping useful stuff in various ways

Eleftherios Avramidis

The situation regarding the interoperability of Natural Language Processing software is outlined through a use case on Quality Estimation of Machine Translation output. The focus is on the development efforts for the QUALITATIVE tool, which integrates a multitude of state-of-the-art external tools into one single Python program through an interoperable framework. The presentation includes 9 approaches taken to connect 25 external components developed in various programming languages. The conclusion is that the current landscape lacks important interoperability principles and that developers should be encouraged to equip their programs with some of the standard interaction interfaces.

Session 3: Evaluation Tools and Metrics (part B) Tuesday, 24 May 2016, 14:00 – 16:00

Blues for BLEU: Reconsidering the Validity of Reference-Based MT Evaluation

Arle Lommel

This article describes a set of experiments designed to test whether reference-based machine translation evaluation methods (represented by BLEU) (a) measure translation “quality” and (b) generate scores that are reliable as a measure for systems (rather than for particular texts). It considers these questions via three methods. First, it examines the impact on BLEU scores of changing reference translations and of using them in combination. Second, it examines the internal consistency of BLEU scores, i.e., the extent to which reference-based scores for a part of a text represent the score of the whole. Third, it applies BLEU to human translation to determine whether BLEU can reliably distinguish human translation from MT output. The results of these experiments, conducted on a Chinese>English news corpus with eleven human reference translations, bring the validity of BLEU as a measure of translation quality into question and suggest that the score differences cited in a considerable body of MT literature are likely to be unreliable indicators of system performance due to an inherent imprecision in reference-based methods. Although previous research has found that human quality judgments largely correlate with BLEU, this study suggests that the correlation is an artefact of experimental design rather than an indicator of validity.
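To make the reference-dependence concrete, the following minimal sketch uses NLTK's BLEU implementation to score the same hypothesis against two different single references and against both combined; the sentences are invented, and the experiment is far smaller than the eleven-reference study described above.

    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    hyp = "the cat sat on the mat".split()
    ref_a = "the cat sat on the mat".split()
    ref_b = "a cat was sitting on the mat".split()

    smooth = SmoothingFunction().method1  # avoid zero scores on short sentences

    print(sentence_bleu([ref_a], hyp, smoothing_function=smooth))         # single reference A
    print(sentence_bleu([ref_b], hyp, smoothing_function=smooth))         # single reference B
    print(sentence_bleu([ref_a, ref_b], hyp, smoothing_function=smooth))  # both references combined

The same hypothesis receives very different scores depending on which reference set is chosen, which is the kind of imprecision the experiments above probe at corpus scale.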

Using MT-ComparEval

Roman Sudarikov, Martin Popel, Ondřej Bojar, Aljoscha Burchardt, Ondřej Klejch

The paper showcases the MT-ComparEval tool for qualitative evaluation of machine translation (MT). MT-ComparEval is an open-source tool that has been designed in order to help MT developers by providing a graphical user interface that allows the comparison and evaluation of different MT engines/experiments and settings. The tool implements several measures that represent the current best practice of automatic evaluation. It also provides guidance in the targeted inspection of examples that show a certain behavior in terms of n-gram similarity/dissimilarity with alternative translations or the reference translation. In this paper, we provide an applied, “hands-on” perspective on the actual usage of MT-ComparEval. In a case study, we use it to compare and analyze several systems submitted to the WMT 2015 shared task.

CMT: Predictive Machine Translation Quality Evaluation Metric

Michal Tyszkowski, Dorota Szaszko

Machine Translation quality is usually evaluated using metrics that rely on human translations as reference material. This means that the existing methods cannot predict the quality of Machine Translation when no human translations of the same material exist. To use Machine Translation in the translation industry, it is essential to have a metric that makes it possible to evaluate the quality of a newly created Machine Translation in order to decide whether or not it can increase productivity. As a translation company that uses Statistical Machine Translation to generate translation memories for many projects, we decided to develop a metric that can predict its usability on a project basis. This metric, called CMT, is a combination of human assessment and statistics comparable to existing metrics. Our investigations showed that the mean difference between CMT and BLEU is 9.10, and between CMT and METEOR it is 9.69, so it correlates with the existing metrics at more than 90%. CMT is very easy to use and makes it possible to evaluate each translation memory, regardless of its size, in 5 minutes without any reference material.

Towards a Systematic and Human-Informed Paradigm for High-Quality Machine Translation

Aljoscha Burchardt, Kim Harris, Georg Rehm, Hans Uszkoreit

Since the advent of modern statistical machine translation (SMT), much progress in system performance has been achieved, going hand in hand with ever more sophisticated mathematical models and methods. Numerous small improvements have been reported whose lasting effects are hard to judge, especially when they are combined with other newly proposed modifications of the basic models. Often the measured enhancements are hardly visible with the naked eye, and two performance advances of the same measured magnitude are difficult to compare in their qualitative effects. We sense a strong need for a paradigm in MT research and development (R&D) that pays more attention to the subject matter, i.e., translation, and that analytically concentrates on the many different challenges for quality translation. The approach we propose utilizes the knowledge and experience of professional translators throughout the entire R&D cycle. It focuses on empirically confirmed quality barriers with the help of standardised error metrics that are supported by a system of interoperable methods and tools and are shared by research and the translation business.

115 The 2nd Workshop on Arabic Corpora and Processing Tools 2016 Theme: Social Media

Date: 24 May 2016

ABSTRACTS

Editors:

Hend Al-Khalifa, Abdulmohsen Al-Thubaity, Walid Magdy and

Kareem Darwish

116 Workshop Programme

09:00 – 09:20 – Welcome and Introduction by Workshop Chairs

09:20 – 10:30 – Session 1 (Keynote speech) Nizar Habash, Computational Processing of Arabic Dialects: Challenges, Advances and Future Directions

10:30 – 11:00 Coffee break

11:00 – 13:00 – Session 2 Soumia Bougrine, Hadda Cherroun, Djelloul Ziadi, Abdallah Lakhdari and Aicha Chorana, Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects

Maha Alamri and William John Teahan, Towards a New Arabic Corpus of Dyslexic Texts

Ossama Obeid, Houda Bouamor, Wajdi Zaghouani, Mahmoud Ghoneim, Abdelati Hawwari, Mona Diab and Kemal Oflazer, MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization

Wajdi Zaghouani and Dana Awad, Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation

Muhammad Abdul-Mageed, Hassan Alhuzali, Dua'a Abu-Elhij'a and Mona Diab, DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis

Nora Al-Twairesh, Mawaheb Al-Tuwaijri, Afnan Al-Moammar and Sarah Al-Humoud, Arabic Spam Detection in Twitter

117 Workshop Organizers

Hend Al-Khalifa, King Saud University, KSA
Abdulmohsen Al-Thubaity, King Abdul Aziz City for Science and Technology, KSA
Walid Magdy, Qatar Computing Research Institute, Qatar
Kareem Darwish, Qatar Computing Research Institute, Qatar

Workshop Programme Committee

Abdullah Alfaifi, Imam University, KSA
Abdulrhman Almuhareb, King Abdul Aziz City for Science and Technology, KSA
Abeer ALDayel, King Saud University, KSA
Ahmed Abdelali, Qatar Computing Research Institute, Qatar
Areeb AlOwisheq, Imam University, KSA
Auhood Alfaries, King Saud University, KSA
Hamdy Mubarak, Qatar Computing Research Institute, Qatar
Hazem Hajj, American University of Beirut, Lebanon
Hind Al-Otaibi, King Saud University, KSA
Houda Bouamor, Carnegie Mellon University, Qatar
Khurshid Ahmad, Trinity College Dublin, Ireland
Maha Alrabiah, Imam University, KSA
Mohammad Alkanhal, King Abdul Aziz City for Science and Technology, KSA
Mohsen Rashwan, Cairo University, Egypt
Mona Diab, George Washington University, USA
Muhammad M. Abdul-Mageed, Indiana University, USA
Nizar Habash, New York University Abu Dhabi, UAE
Nora Al-Twairesh, King Saud University, KSA
Nouf Al-Shenaifi, King Saud University, KSA
Tamer Elsayed, Qatar University, Qatar
Wajdi Zaghouani, Carnegie Mellon University in Qatar, Qatar

118 Preface

Our first Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools at LREC 2014 was a success: three of the presented papers have received 15 citations up to now. The second workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools (OSACT2), with special emphasis on Arabic social media and applications, aims to encourage researchers and developers to foster the utilization of freely available Arabic corpora and open-source Arabic corpora processing tools, to help highlight the drawbacks of these resources, and to discuss techniques and approaches for improving them.

OSACT2 had an acceptance rate of 55%: we received 11 papers, of which 6 papers were accepted. We believe the accepted papers are of high quality and present a mixture of interesting topics. Three papers are about corpus development and annotation guidelines for different domains such as speech and dyslexic texts, two papers are about spam detection and emotion analysis, and finally, one paper presents a web-based system for manual annotation of Arabic diacritization.

We would like to thank all people who in one way or another helped in making this workshop a success. Our special thanks go to Dr. Nizar Habash for accepting to give the workshop keynote talk, to the members of the program committee who did an excellent job in reviewing the submitted papers, and to the LREC organizers. Last but not least we would like to thank our authors and the participants of the workshop.

Hend Al-Khalifa, Abdulmohsen Al-Thubaity, Walid Magdy and Kareem Darwish Portorož (Slovenia), 2016

119 Session 1 Tuesday 24 May, 9:20 – 10:30

Computational Processing of Arabic Dialects: Challenges, Advances and Future Directions Nizar Habash

Abstract The Arabic language consists of a number of variants among which Modern Standard Arabic (MSA) has a special status as the formal, mostly written, standard of the media, culture and education across the Arab World. The other variants are informal, mostly spoken, dialects that are the languages of communication of daily life. Most of the natural language processing resources and research in Arabic have focused on MSA. However, recently, more and more research is targeting Arabic dialects. In this talk, we present the main challenges of processing Arabic dialects, and discuss common solution paradigms, current advances, and future directions.

Session 2 Tuesday 24 May, 11:00 – 13:00

Toward a rich Arabic Speech Parallel Corpus for Algerian sub-Dialects Soumia Bougrine, Hadda Cherroun, Djelloul Ziadi, Abdallah Lakhdari and Aicha Chorana

Abstract Speech datasets and corpora are crucial for both developing and evaluating accurate Natural Language Processing systems. While Modern Standard Arabic has received considerable attention, dialects are drastically underestimated, even though they are the most used in daily life and, recently, on social media. In this paper, we present the methodology for building an Arabic speech corpus for Algerian dialects, and a preliminary version of that dataset of dialectal Arabic speech uttered by Algerian native speakers selected from different departments of Algeria. Using a direct recording procedure, we have taken into account numerous aspects that foster the richness of the corpus and that provide a representation of the phonetic, prosodic and orthographic varieties of Algerian dialects. Among these considerations, we have designed rich speech topics and content. The annotations provided include useful information about the speakers and a time-aligned orthographic word transcription. Many potential uses can be considered, such as speaker/dialect identification and computational linguistics for Algerian sub-dialects. In its preliminary version, our corpus encompasses 17 sub-dialects, 109 speakers and more than 6K utterances.

Towards a New Arabic Corpus of Dyslexic Texts Maha Alamri and William John Teahan

Abstract This paper presents a detailed account of the preliminary work for the creation of a new Arabic corpus of dyslexic text. The analysis of errors found in the corpus revealed that there are four types of spelling errors made as a result of dyslexia in addition to four common spelling errors. The subsequent aim was to develop a spellchecker capable of automatically correcting the spelling mistakes of dyslexic writers in Arabic texts using statistical techniques. The purpose was to provide a tool to assist Arabic dyslexic writers. Some initial success was achieved in the automatic correction of dyslexic errors in Arabic text.

120

MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization Ossama Obeid, Houda Bouamor, Wajdi Zaghouani, Mahmoud Ghoneim, Abdelati Hawwari, Mona Diab and Kemal Oflazer

Abstract In this paper, we introduce MANDIAC, a web-based annotation system designed for rapid manual diacritization of Standard Arabic text. To expedite the annotation process, the system provides annotators with a choice of automatically generated diacritization possibilities for each word. Our framework provides intuitive interfaces for annotating text and managing the diacritization annotation process. In this paper we describe the annotation and the administration interfaces as well as the back-end engine. Finally, we demonstrate that our system doubles the annotation speed compared to using a regular text editor.

Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation Wajdi Zaghouani and Dana Awad

Abstract We present our effort to build a large-scale punctuated corpus for Arabic. We illustrate in detail our punctuation annotation guidelines, designed to improve the annotation workflow and the inter-annotator agreement. We summarize the guidelines created, discuss the annotation framework and show the Arabic punctuation peculiarities. Our guidelines were used by trained annotators, and regular inter-annotator agreement measures were performed to ensure the annotation quality. We highlight the main difficulties related to Arabic punctuation annotation that arose during this project.

DINA: A Multi-Dialect Dataset for Arabic Emotion Analysis Muhammad Abdul-Mageed, Hassan Alhuzali, Dua'a Abu-Elhij'a and Mona Diab

Abstract Although there has been a surge of research on sentiment analysis, less work has been done on the related task of emotion detection. Especially for the Arabic language, there is no literature that we know of for the computational treatment of emotion. This situation is due partially to lack of labelled data, a bottleneck that we seek to ease. In this work, we report efforts to acquire and annotate a multi-dialect dataset for Arabic emotion analysis.

Arabic Spam Detection in Twitter Nora Al-Twairesh, Mawaheb Al-Tuwaijri, Afnan Al-Moammar and Sarah Al-Humoud

Abstract Spam in Twitter has emerged due to the proliferation of this social network among users worldwide coupled with the ease of creating content. Since it has different characteristics than Web or mail spam, Twitter spam detection has become a new research problem. This study aims to analyse the content of Saudi tweets to detect spam by developing both a rule-based approach that exploits a spam lexicon extracted from the tweets and a supervised learning approach that utilizes statistical methods based on the bag-of-words model and several features. The focus is on spam in trending hashtags in the Saudi Twittersphere, since most of the spam in Saudi tweets is found in hashtags. The features used were identified through empirical analysis and then applied in the classification approaches developed. Both approaches showed comparable results in terms of the performance measures reported, reaching an average F-measure of 85% for the rule-based approach and 91.6% for the supervised learning approach.
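As an illustration of the supervised, bag-of-words side of such a pipeline (the rule-based lexicon component and the paper's actual features and data are not reproduced here), a minimal sketch with scikit-learn might look as follows; the example tweets and labels are invented.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Invented toy data: 1 = spam, 0 = not spam.
    tweets = [
        "Win a free prize now, click the link",
        "Follow back for thousands of followers",
        "Traffic is heavy on the main road this morning",
        "Great match tonight, what a goal",
    ]
    labels = [1, 1, 0, 0]

    # Bag-of-words features (word unigrams) feeding a linear SVM classifier.
    model = make_pipeline(CountVectorizer(), LinearSVC())
    model.fit(tweets, labels)

    print(model.predict(["Click here to win followers"]))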

122 WILDRE3 – 3rd Workshop on Indian Language Data: Resources and Evaluation

24th May 2016

ABSTRACTS

Editors:

Girish Nath Jha, Kalika Bali, Sobha L, Atul Kr. Ojha

123 Workshop Programme 24th May 2016 14:00 – 15:00 hrs: Inaugural session

14:00 – 14:10 hrs – Welcome by Workshop Chairs
14:10 – 14:30 hrs – Inaugural Address
14:30 – 15:00 hrs – Keynote Lecture

15:00 – 16:00 hrs – Paper Session I (Oral and Short Oral Presentation) Chairperson: Kalika Bali

Oral Presentation
• Alok Parlikar, Sunayana Sitaram, Andrew Wilkinson and Alan W Black, The Festvox Indic Frontend for Grapheme to Phoneme Conversion
• Bornali Phukon, Biswajit Dev Sarma, Shakuntala Mahanta and S R M Prasanna, Automatic Phonetic Alignment Tool Based on Hidden Markov Model as a Plug-in Tool of Praat for the Languages of Northeast India

Short Oral Presentation
• Tehreem Waseem, Noorulain Ashraf, Shireen Gull and Sadaf Abdul Rauf, Text Normalization in Urdu Text-to-Speech Synthesis System
• Renu Singh, Atul Kr. Ojha and Girish Nath Jha, Classification and Identification of Reduplicated Multi-Word Expressions in Hindi
• Pattabhi RK Rao and Sobha Lalitha Devi, Semantic Representation of Tamil Texts using Conceptual Graphs
• Srishti Singh, Valance Annotation of Hindi on Typecraft
• Pitambar Behera, Evaluation of SVM-based Automatic Parts of Speech Tagger for Odia

16:00 – 16:30 hrs – Coffee break + Poster/Demo Chairperson: Massimo Moneglia/Lars Hellan

• Divyanshu Bhardwaj and Amitava Das, Part-of-Speech Tagging System for Maithili Tweets
• Sharmin Muzaffar, Pitambar Behera and Girish Nath Jha, Classification and Resolution of Linguistic Divergences in English-Urdu Machine Translation
• Milind Shivolkar, Jyoti D. Pawar and S. Baskar, Extractive based Email Summarization: An Unsupervised Hybrid approach using Graph Based Sentence Ranking and K-Means Clustering algorithm
• Ashweta Fondekar, Jyoti D. Pawar and Ramdas Karmali, Konkani SentiWordNet - Resource For Sentiment Analysis Using Supervised Learning Approach
• Kumar Nripendra Pathak and Girish Nath Jha, Verb Mapping: A Dilemma in Sanskrit-Hindi Machine Translation
• Subhash Chandra, Bhupendra Kumar, Vivek Kumar and Sakshi, Lexical Resources for Recognition, Analysis and Word Formation process for Sanskrit Morphology
• Archana Tiwari, Building a Statistical tagger for Sanskrit
• Ritesh Kumar, Atul Kr. Ojha and Bornini Lahiri, Developing annotated multimodal corpus for automatic recognition of verbal aggression in Hindi
• Jalpa Zaladi and Caroline Gasperin, Demo: SwiftKey Keyboard for Indian Languages
• Shruti Rijhwani, Royal Sequeira, Monojit Choudhury and Kalika Bali, Translating Code-Mixed Tweets: A Language Detection Based System
• Sobha Lalitha Devi, Vijay Sundar Ram, Sindhuja Gopalan, Pattabhi RK Rao and Lakshmi S, Demo Proposal: Tamil – Hindi Automatic Machine Translation – A detailed Description
• Sobha Lalitha Devi, Pattabhi RK Rao, Vijay Sundar Ram and C.S Malarkodi, A Demo Proposal: Tamil – English Cross Lingual Information Access (CLIA) System

16:30 – 17:30 hrs – Paper Session II (Oral Presentation) Chairperson: Zygmunt Vetulani

• Lars Borin, Shafqat Mumtaz Virk and Anju Saxena, Towards a Big Data View on South Asian Linguistic Diversity
• Atul Kr. Ojha, Srishti Singh, Pitambar Behera and Girish Nath Jha, A Hybrid Chunker for Hindi and Indian English
• Akshay Gadre, Rafiya Begum, Kalika Bali and Monojit Choudhury, Machine Translating Code Mixed Text: Pain Points and Sweet Spots
• Kavita Asnani and Jyoti D. Pawar, Discovering Thematic Knowledge from Code-Mixed Chat Messages Using Topic Model
• Pitambar Behera, Renu Singh and Girish Nath Jha, Evaluation of Anuvadaksh (EILMT) English-Odia Machine-assisted Translation Tool

17:30 – 18:10 hrs – Panel discussion Coordinator: TBD Panellists – TBD

18:10 – 18:25 hrs – Valedictory Address

18:25 – 18:30 hrs – Vote of Thanks

125 Workshop Organizers
Girish Nath Jha, Jawaharlal Nehru University, New Delhi
Kalika Bali, Microsoft Research Lab India, Bangalore
Sobha L, AU-KBC Research Centre, Anna University, Chennai

Workshop Programme Committee

A G Ramakrishnan, I.I.Sc Bangalore
A Kumaran, Sri Foundation
Arul Mozhi, University of Hyderabad
Asif Iqbal, IIT Patna
Amba Kulkarni, University of Hyderabad
Anil Kumar Singh, IIT BHU, Benaras
Awadhesh Kumar Mishra, CIIL, Mysore
Dafydd Gibbon, Universität Bielefeld, Germany
Daya Krishan Lobiyal, Jawaharlal Nehru University, New Delhi
Dipti Mishra Sharma, IIIT, Hyderabad
Elizabeth Sherley, IITM-Kerala, Trivandrum
Girish Nath Jha, Jawaharlal Nehru University, New Delhi
Hans Uszkoreit, DFKI, Berlin
Indranil Dutta, EFLU, Hyderabad
Jolanta Bachan, Adam Mickiewicz University, Poland
Joseph Mariani, LIMSI-CNRS, France
Jyoti D. Pawar, Goa University
Kalika Bali, MSRI, Bangalore
Karunesh Arora, CDAC Noida
Khalid Choukri, ELRA, France
Lars Hellan, NTNU, Norway
Malhar Kulkarni, IIT Bombay
Manji Bhadra, Bankura University, West Bengal
Massimo Moneglia, University of Florence, Italy
Monojit Choudhury, MSRI Bangalore
Nicoletta Calzolari, ILC-CNR, Pisa, Italy
Niladri Shekhar Dash, ISI Kolkata
Narayan Choudhary, EZDI, Ahmedabad
Panchanan Mohanty, University of Hyderabad
Pushpak Bhattacharya, Director, IIT Patna
Ritesh Kumar, Agra University
R M K Sinha, JSS Academy of Technical Education, Noida
Shivaji Bandhopadhyay, Jadavpur University, Kolkata
Sobha L, AU-KBC Research Centre, Anna University
Soma Paul, IIIT, Hyderabad
S S Aggarwal, KIIT, Gurgaon, India
Subhash Chandra, Delhi University
Swaran Lata, Head, TDIL, MCIT, Govt. of India
Umamaheshwar Rao, University of Hyderabad
Vishal Goyal, Punjabi University Patiala
Zygmunt Vetulani, Adam Mickiewicz University, Poland

126 Introduction

WILDRE – the 3rd Workshop on Indian Language Data: Resources and Evaluation is being organized in Portorož, Slovenia on 24th May 2016 under the LREC platform. India has a huge linguistic diversity and has seen concerted efforts from the Indian government and industry towards developing language resources. The European Language Resources Association (ELRA) and its associate organizations have been very active and successful in addressing the challenges and opportunities related to language resource creation and evaluation. It is therefore a great opportunity for resource creators of Indian languages to showcase their work on this platform and also to interact with and learn from those involved in similar initiatives all over the world.

The broader objectives of the 3rd WILDRE will be
• to map the status of Indian Language Resources
• to investigate challenges related to creating and sharing various levels of language resources
• to promote a dialogue between language resource developers and users
• to provide opportunity for researchers from India to collaborate with researchers from other parts of the world

The call for papers received a good response from the Indian language technology community. Out of 39 full papers received for review, we selected 7 papers for oral, 5 for short oral, 8 for poster and 4 for demo presentation.

127 Paper Session I (Oral and Short Oral Presentation) Tuesday 24 May, 15:00 – 16:00 hrs Chairperson: Kalika Bali

The Festvox Indic Frontend for Grapheme to Phoneme Conversion Alok Parlikar, Sunayana Sitaram, Andrew Wilkinson and Alan W Black

Text-to-Speech (TTS) systems convert text into phonetic pronunciations which are then processed by Acoustic Models. TTS frontends typically include text processing, lexical lookup and Grapheme-to-Phoneme (g2p) conversion stages. This paper describes the design and implementation of the Indic frontend, which provides explicit support for many major Indian languages, along with a unified framework with easy extensibility for other Indian languages. The Indic frontend handles many phenomena common to Indian languages such as schwa deletion, contextual nasalization, and voicing. It also handles multi-script synthesis between various Indian-language scripts and English. We describe experiments comparing the quality of TTS systems built using the Indic frontend to grapheme-based systems.

Automatic Phonetic Alignment Tool Based on Hidden Markov Model as a Plug-in Tool of Praat for the Languages of Northeast India

Bornali Phukon, Biswajit Dev Sarma, Shakuntala Mahanta and S R M Prasanna

In this paper we present the details of an automatic phonetic alignment method based on a Hidden Markov Model (HMM), which has been developed as a plug-in tool for the software Praat. This tool has been developed to serve, for the first time, the research interests of the languages of Northeast India, which are under-described and under-resourced in many respects. At present, three Tibeto-Burman languages of Northeast India, namely Tiwa, Dimasa and Kokborok, are considered for the development of this tool, while work on some other languages of the region is in progress. We have collected original speech databases and also prepared phone-level transcriptions for all three languages. In this phonetic alignment tool, we build HMM models for each phone. Applying the HMM approach, a forced alignment is generated whereby phone boundaries are obtained. The tool consists primarily of Praat scripts which can be executed from Praat by adding a plug-in. To use the tool, the only required input is a speech audio file and its corresponding phonemes. The outcome is a tier-wise sentence, word and phonetic alignment of the corresponding spectrograms.

Text Normalization in Urdu Text-to-Speech Synthesis System Tehreem Waseem, Noorulain Ashraf, Shireen Gull and Sadaf Abdul Rauf

Speech synthesis for the Urdu language using Natural Language Processing (NLP) has been of interest to researchers for the past three decades, and many major developments in this area have been made recently. Natural Language Processing is essential for a Text-to-Speech (TTS) system; this process consists of three steps, namely “Text Normalization”, “Text Annotation” and “Phonological Annotation”. This paper focuses on text normalization techniques for a TTS system for the Urdu language and elaborates the effects of these techniques on the speech produced by the Urdu TTS system.

128 Classification and Identification of Reduplicated Multi-Word Expressions in Hindi

Renu Singh, Atul Kr. Ojha and Girish Nath Jha

Disambiguating Multi-Word Expressions (MWEs) is often a critical task in NLP applications. Reduplications are an important subclass of MWEs and occur with high frequency compared to other kinds of MWEs in Hindi. There are linguistic challenges in the classification and identification of Reduplicated Multiword Expressions (RMWEs) in Hindi. The aim of this paper is to demonstrate linguistic issues pertaining to the distribution of RMWEs, their formalization using the CRF-based CRF++ tool, and the testing and evaluation of the trained system. To the best of our knowledge, there are no available guidelines for annotation of MWEs in Hindi. Therefore, we present the first detailed guidelines for annotation of MWEs in Hindi, which can be applicable to other Indian languages as well.
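For readers unfamiliar with CRF++, the tool is trained with a feature template file such as the generic sketch below, which assumes a token-per-line training file with the word form in column 0 and the RMWE label in the last column; this is an illustration in CRF++'s own template format, not the feature set actually used by the authors.

    # Unigram features over a small window of word forms (column 0)
    U00:%x[-2,0]
    U01:%x[-1,0]
    U02:%x[0,0]
    U03:%x[1,0]
    U04:%x[2,0]
    # Current word combined with its neighbours
    U05:%x[-1,0]/%x[0,0]
    U06:%x[0,0]/%x[1,0]
    # Bigram feature over output labels
    B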

Semantic Representation of Tamil Texts using Conceptual Graphs Pattabhi RK Rao and Sobha Lalitha Devi

This paper presents our work on the semantic representation of Tamil documents from sources such as newswires and Wikipedia articles using conceptual graphs. A conceptual graph (CG) is a graph representation for logic based on the semantic networks of artificial intelligence and the existential graphs of Charles Sanders Peirce; it is a network of concept nodes and relation nodes. The challenge in automatically representing a natural language text in CGs is the identification of the concepts and the relationships between them. A concept is an abstract idea conceived mentally by a person; the words of the language are not the concepts themselves but symbols or signs of the concepts. Tamil is a Dravidian language and is morphologically rich. In this work we pre-process the text for morphological information, part-of-speech and NP-VP chunks. After preprocessing, the concepts and the kinds of relationships between the concepts are identified and the CG is generated. The CGs thus obtained can be further used in applications such as machine translation and information retrieval. In the literature, CGs have been used for English texts and applied to improve the performance of systems such as information retrieval systems, but the CG extraction has not been automatic. This is one of the first attempts at semantic representation using CGs for Indian languages.

Valance Annotation of Hindi on Typecraft Srishti Singh

The present paper describes the suitability of the TypeCraft framework for valence annotation and construction labelling for Hindi. It deals with the different levels of annotation provided on TypeCraft and with the sources for initiating construction labelling using the syntactic and semantic information embedded in the language. The annotation challenges in representing some major constructions in Hindi, such as idiomatic expressions, reduplication, conjunct verbs and explicator verbs, are discussed along with construction labelling for Hindi, which is a new technique for closer syntactic analysis. The platform also supports more than one free translation and discourse sense labelling for each sentence.

129 Evaluation of SVM-based Automatic Parts of Speech Tagger for Odia Pitambar Behera

The authors present an SVM-based POS tagger for the Odia language. The tagger has been trained and tested with Indian Languages Corpora Initiative (ILCI) data of 2,36,793 and 1,28,646 tokens respectively, annotated following the Bureau of Indian Standards (BIS) annotation scheme. The evaluation has been undertaken in two parts, statistical and human, guided by the two approaches of research: quantitative and qualitative. Evaluation results on the precision, recall and F-measure metrics demonstrate accuracy rates of 93.99%, 92.9971% and 93.49% respectively. As far as the human evaluation is concerned, the agreements are 93.89% (percentage agreement) and 0.87 (Fleiss' Kappa). Finally, the issues and challenges are discussed in relation to manual annotation and statistical tagger-related issues, with a linguistic analysis of errors. On the basis of the evaluation results, it can be stated that the present POS tagger is more efficient than the earlier Odia Neural Network tagger (81%) and SVM tagger (82%) in terms of both accuracy and reliability of the tagger output.
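As a quick sanity check of the reported figures, the balanced F-measure is the harmonic mean of precision and recall, F = 2PR/(P+R); the short computation below reproduces the reported F score from the reported precision and recall.

    precision, recall = 93.99, 92.9971
    f_measure = 2 * precision * recall / (precision + recall)
    print(round(f_measure, 2))  # 93.49, matching the reported F-measure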

Poster/Demo Session Tuesday 24 May, 16:00 – 16:30 hrs Chairperson: Massimo Moneglia/Lars Hellan

Part-of-Speech Tagging System for Maithili Tweets Divyanshu Bhardwaj and Amitava Das

Part-of-Speech (POS) tagging is a prerequisite for all Natural Language Processing (NLP) applications, be it Sentiment Analysis, Natural Language Parsing, Word Sense Disambiguation, Text-to-Speech Processing or Information Retrieval, among others. Maithili is one of the staple languages of Bihar; it was spoken by approximately 34.7 million people as of 2000. Although POS tagging for major Indian languages has been addressed in recent years, Maithili has not been delved into much. Consequently, developing a POS tagging system for Maithili is vital. This paper deals with the collection of Maithili tweets and the development of an automated word-level POS tagging system for them, using both coarse-grained and fine-grained tagsets. Although code-mixing with English, and Indian languages written in romanized form, is the prevalent practice on Indian social media, only monolingual tweets written in Devanagari are considered in this paper. The efficiency of different tagging systems based upon four machine learning algorithms (Naive Bayes, Sequential Minimal Optimization, Random Forest and Conditional Random Fields) is reported.

Classification and Resolution of Linguistic Divergences in English-Urdu Machine Translation Sharmin Muzaffar, Pitambar Behera and Girish Nath Jha

The present paper attempts to explore, classify and resolve various types of divergence patterns in the context of English-Urdu Machine Translation, where English and Urdu are the SL and TL respectively. As far as the methodology is concerned, we observed the English-Urdu sentence pairs and analyzed the translated output with respect to different areas of translational divergence. We took a corpus of one thousand ILCI English sentences for this study and analyzed the translated Urdu output in bulk on web-based Machine Translation platforms, namely Bing and Google Translate. Dorr's (1994, 1995) theoretical framework has been adopted for the classification and resolution of the linguistic divergences in this study. Dorr classified the divergences into two broad categories (lexical-semantic and syntactic divergences) and proposed a Lexical Conceptual Structure-based resolution for them. This study helps identify, classify and resolve the underlying divergence patterns between these languages so as to develop MT systems that take the divergence errors into account and to enhance the performance of the MT.

Extractive based Email Summarization: An Unsupervised Hybrid approach using Graph Based Sentence Ranking and K-Means Clustering algorithm Milind Shivolkar, Jyoti D. Pawar and S. Baskar

Over the years, Automatic Text Summarization has been widely studied by many researchers. Here, an attempt is made to generate an automatic summary of a given text document based on an unsupervised hybrid model. The model comprises an extractive method, graph-based text ranking, and the K-means clustering algorithm. Ranked sentences are obtained using the graph-theoretic ranking model, where word frequency, word position and string-pattern-based rankings are calculated. The K-means algorithm generates coherent topic clusters. Using the output of the graph-based method and the K-means clusters, a Sentence Importance Score (SIS) is calculated for each sentence, where the top 70% ranked sentences and the central topics of each cluster (excluding topics that fall in the outlier zone) are used. The unsupervised hybrid approach is an attempt to imitate the human practice of reading a text and then summarizing it briefly while keeping its original insight, by virtue of important sentences and keywords. The system is tested on a dataset for Summarization and Keyword Extraction from Emails, on which it achieves an average score of 0.57 with the ROUGE 2.0 tool.
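A minimal sketch of the two building blocks named above, graph-based sentence ranking and K-means clustering, is given below using TF-IDF cosine similarity, PageRank and scikit-learn; the sentences are invented, and the paper's SIS combination, ranking features and email-specific processing are not reproduced.

    import networkx as nx
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    sentences = [
        "The meeting is moved to Friday at ten.",
        "Please confirm whether Friday at ten works for you.",
        "The quarterly report is attached for review.",
        "Send comments on the report by Thursday.",
    ]

    # Sentence vectors and a similarity graph for graph-based ranking.
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(tfidf)
    graph = nx.from_numpy_array(sim)
    rank = nx.pagerank(graph)          # importance score per sentence index

    # K-means clustering of the same vectors into topic clusters.
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(tfidf.toarray())

    for i, s in enumerate(sentences):
        print(f"cluster={clusters[i]} rank={rank[i]:.3f} {s}")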

Konkani SentiWordNet-Resource For Sentiment Analysis Using Supervised Learning Approach Ashweta Fondekar, Jyoti D. Pawar and Ramdas Karmali

Sentiment Analysis (SA) is the process of analyzing and predicting the attitude or opinion expressed by an individual in a given text. Until now, an ample amount of work has been carried out for the English language, but no work has been performed for Konkani in the field of Sentiment Analysis. Lexicon-based SA is a good starting point for any language, especially if the digital content is limited. Hence, the main aim of this paper is to present a sentiment lexicon, SentiWordNet, for the Konkani language. The creation of the Konkani SentiWordNet is in progress using a supervised learning approach, in which the training set is generated using a Synset Projection Approach and a Support Vector Machine (SVM) algorithm is used to classify the data. The reason for using the Synset Projection Approach to build the training dataset is that the English SentiWordNet was developed using a semi-supervised approach in which the training dataset is generated using WordNet lexical relations, but in the Konkani WordNet lexical relations have not yet been developed; hence, the Synset Projection Approach is preferred. Experimental results for the proposed approach are reported in this paper.

Verb Mapping: A Dilemma in Sanskrit-Hindi Machine Translation Kumar Nripendra Pathak and Girish Nath Jha

Creating a fully automated Machine Translation system is a challenge. MT system developers have to take care of all minute aspects of both languages in a pair (i.e., the Source Language and the Target Language). The issue of verb mapping between language pairs needs a careful study of the verb patterns of those languages. A close look at verb patterns indicates the importance of the conditional use of verb forms in a language, and this conditional use of verbs is a challenge for MT systems. As Sanskrit-Hindi Machine Translation (SHMT) is an ongoing task and the Sanskrit Consortium, funded by the DIT, Govt. of India, has already finished its first phase, this study becomes more relevant. The proposed SHMT-Sampark system is not functional yet; an interface of SHMT-Anusaaraka is available on the website of the Sanskrit Department, HCU, Hyderabad. In this study, some challenging aspects of verb mapping have been identified as a dilemma in SHMT.

Lexical Resources for Recognition, Analysis and Word Formation process for Sanskrit Morphology Subhash Chandra, Bhupendra Kumar, Vivek Kumar and Sakshi

Adhyyī (AD) which particularly contains about 3,959 rules of Sanskrit morphology, syntax and semantics. Sanskrit word formation process is taught in all major Indian Universities offering Sanskrit courses at Undergraduate (UG) and post graduate (PG) level. This paper introduces a web based word formation process tools for students and teachers of Sanskrit with the aim of teaching and learning Sanskrit morphological inflectional process based on Pini rules and prakriygranthas of AD. The system is developed by combining rule and example based approaches used by Pini. There are three components entitled Recognizer, Analyzer and Word Formation Process (WFP) Generator in the system that generate word formation process for subanta (nominal), primary verb (tianta) and secondary verb (sandyanta). Currently this system covers subanta and tianta only and it is being used by the Sanskrit students and teachers for learning and teaching Sanskrit Grammar.

Building a Statistical tagger for Sanskrit Archana Tiwari

Abstract In this paper, the author discusses part-of-speech tagging for Sanskrit and the development of a tagger using Support Vector Machines. The data for training the machine has been taken from literature and the general domain: from the 'Panchatantra', the 'Sudharama' Sanskrit newspaper and various blogs. For data tagging, the BIS tagset for Sanskrit has been used. The system adopts a statistical learning approach, in which it learns from an annotated corpus and applies the model to unseen text. The tagger has achieved 82% and 80.89% accuracy so far. The paper is divided into three parts. In the first part, part-of-speech tagging, its importance, the development of the corpus and the tagging methods are explored. In the second part, the development of the tagger, its training, evaluation and results are discussed. In the third part, the issues and challenges are analysed.

Developing annotated multimodal corpus for automatic recognition of verbal aggression in Hindi Ritesh Kumar, Atul Kr. Ojha and Bornini Lahiri

Verbal aggression could be defined as any act which seeks to disturb the social and relational equilibrium. In a large number of cases, verbal aggression could be a precursor to certain kinds of criminal activities; in others, as in political speeches, it might be desirable to take note of specific kinds of aggression. In this paper we discuss the development of a multimodal corpus of Hindi which could be used for automatically recognising verbal aggression in Hindi. The complete raw corpus currently consists of approximately 1000 hours of audio-visual debates and discussions carried out in the Indian Parliament (and made available for research) as well as some recordings from different news channels available in the public domain over the web. Out of this, approximately 30 hours have already been transcribed and around 5 hours have been annotated using the aggression tagset. In this paper, we also discuss this tagset, which is being used for annotation.

Demo: SwiftKey Keyboard for Indian Languages Jalpa Zaladi and Caroline Gasperin

This paper describes a text input method for touch-screen typing in Indian languages on mobile phones and tablets. Typing in a native script on mobile is quite difficult due to the complicated structure of Indic scripts. Moreover, the limited mobile screen space restricts the number of characters and symbols displayed on the screen. People typing in a native script face difficulties in learning various layouts and in finding the required letters among a relatively large set of consonants, vowels and conjunct consonants. The SwiftKey app provides a solution that makes typing in a native script easier and more enjoyable by carefully packaging a text prediction system with dynamic language-specific layouts. It also supports features such as emoji prediction, gesture typing and themes. SwiftKey supports all 22 official Indian languages.

Translating Code-Mixed Tweets: A Language Detection Based System Shruti Rijhwani, Royal Sequeira, Monojit Choudhury and Kalika Bali

We demonstrate a system for the machine translation of code-mixed text in several Indian and European languages. We first perform word-level language detection and matrix language identification. We then use this information and an existing translator in order to translate code-mixed tweets into a language of the user's choice.

Demo Proposal: Tamil-Hindi Automatic Machine Translation – A detailed Description Sobha Lalitha Devi, Vijay Sundar Ram, Sindhuja Gopalan, Pattabhi RK Rao and Lakshmi S

Automatic machine translation between two languages poses various challenges. We present a detailed description of machine translation between Tamil and Hindi, where Tamil belongs to the Dravidian language family and Hindi belongs to the Indo-Aryan language family. These two languages share similarities such as verb-final word order, morphological richness and relatively free word order, yet they are structurally dissimilar. We describe the methodology and the challenges addressed in each module.

A Demo Proposal: Tamil-English Cross Lingual Information Access (CLIA) System Sobha Lalitha Devi, Pattabhi RK Rao, Vijay Sundar Ram and C.S Malarkodi

In this paper we present the Tamil-English Cross Lingual Information Access (CLIA) system. The objective of this system is to enable users who are not English speakers to access information available on the internet in their own native language, Tamil. The system enables users to give queries in Tamil and retrieve documents in Tamil as well as in English. The user-given query is translated to English for retrieving English documents; query translation is done using synset dictionaries, bilingual dictionaries and transliteration. The content of English documents is translated to Tamil using template translation. The main modules of the system are i) Crawling, ii) Input Processing, iii) Indexing, iv) Query Processing, v) Ranking and vi) Output Processing. We have obtained a MAP score of 0.3980 for the Tamil-English cross-lingual search. The results are encouraging and comparable with the state of the art. The system is deployed and accessible through a web link.

Paper Session II (Oral Presentation) Tuesday 24 May, 16:30 – 17:30 hrs Chairperson: Zygmunt Vetulani

Towards a Big Data View on South Asian Linguistic Diversity Lars Borin, Shafqat Mumtaz Virk and Anju Saxena

South Asia with its rich and diverse linguistic tapestry of hundreds of languages, including many from four major language families, and a long history of intensive language contact, provides rich empirical data for studies of linguistic genealogy, linguistic typology, and language contact. South Asia is often referred to as a linguistic area, a region where, due to close contact and widespread multilingualism, languages have influenced one another to the extent that both related and unrelated languages are more similar on many linguistic levels than we would expect. However, with some rare exceptions, most studies are largely impressionistic, drawing examples from a few languages. In this paper we present our ongoing work aiming at turning the linguistic material available in Grierson's Linguistic Survey of India (LSI) into a digital language resource, a database suitable for a broad array of linguistic investigations of the languages of South Asia. In addition to this, we aim to contribute to the methodological development of large-scale comparative linguistics drawing on digital language resources, by exploring NLP techniques for extracting linguistic information from free-text language descriptions of the kind found in the LSI.

A Hybrid Chunker for Hindi and Indian English Atul Kr. Ojha, Srishti Singh, Pitambar Behera and Girish Nath Jha

The paper presents a CRF-based hybridized chunker for Hindi and Indian English. The immediate goal is to chunk text data in the ILCI project funded by DeitY, Govt of India. The experiment was conducted on 25k annotated sentences from the health and tourism domains; 23k sentences were used for training and the remaining 2k sentences were used for evaluation. The experiment involved the following stages: training the chunker, automatic chunking and validation of the chunked output for Hindi and Indian English, and finding measures to solve issues detected at different levels of the experiment. The chunker for Indian English is developed on the ILMT chunk tag scheme to meet the necessary mapping requirements of the translation tool for English to Indian languages. The accuracies of the Hindi and Indian English chunkers are 88.84% and 89.04%, respectively. As far as the Hindi chunker is concerned, we have observed errors in chunk categories such as noun (pronominal), verb finite, verb non-finite (conjunct verb), adjectival phrase, etc. Errors such as finite/non-finite, adverb-conjunction, wh-determiner and conjunction chunks are discussed in detail for the development of the English chunker. Implementation of a hybrid approach for error resolution has also been attempted.

134 Machine Translating Code Mixed Text: Pain Points and Sweet Spots Akshay Gadre, Rafiya Begum, Kalika Bali and Monojit Choudhury

In this qualitative evaluation of a state-of-the-art SMT system, we study the performance on Hindi-English code-mixed tweets. Our study indicates that (a) language identification and transliteration can go a long way in improving the performance, (b) translation to the matrix language gives better results, and (c) quality of translation heavily depends on the number of switch-points and the nature of the embedded linguistic unit(s).

Discovering Thematic Knowledge from Code-Mixed Chat Messages Using Topic Model Kavita Asnani and Jyoti D. Pawar

In current times, the trend of mixing two or more languages together (code-mixing) in communication on social media is very popular. Such code-mixed chat data is generated in enormous quantities and is usually noisy and sparse, and the useful topics that people discuss are highly dispersed in it. In such a scenario, it is very challenging to automatically extract relevant thematic information that contributes to useful knowledge. In order to discover latent themes from multilingual data, a standard topic model called Probabilistic Latent Semantic Analysis (PLSA) is used in the existing literature; however, it addresses only inter-sentence multilingualism. In this paper, we propose a novel method based on co-occurrences of words within a code-mixed message. The co-occurrence matrix thus built for the chat data is given to PLSA, which is used to discover thematic knowledge from it. In such code-mixed chat text, inter-sentence, intra-sentence and intra-word level code-mixing may occur randomly. We show with extensive experiments that it is possible to use this strategy to discover latent themes from semantic topic clusters. We tested our system using the FIRE 2014 dataset.
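For readers unfamiliar with PLSA, the following toy sketch shows the EM updates of a plain PLSA model applied to a small count matrix, with each chat message treated as a document over a mixed vocabulary; the messages and vocabulary are invented, and the paper's co-occurrence-matrix construction and evaluation are not reproduced.

    import numpy as np

    def plsa(counts, n_topics, n_iter=100, seed=0):
        """Plain PLSA via EM on a (messages x vocabulary) count matrix."""
        rng = np.random.default_rng(seed)
        n_docs, n_words = counts.shape
        p_z_d = rng.random((n_docs, n_topics))
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)          # P(z|d)
        p_w_z = rng.random((n_topics, n_words))
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)          # P(w|z)
        for _ in range(n_iter):
            # E-step: P(z|d,w) proportional to P(z|d) * P(w|z)
            joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # shape (d, z, w)
            p_z_dw = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
            # M-step: re-estimate parameters from expected counts
            expected = counts[:, None, :] * p_z_dw
            p_w_z = expected.sum(axis=0)
            p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
            p_z_d = expected.sum(axis=2)
            p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
        return p_z_d, p_w_z

    # Invented toy data: rows are chat messages, columns are vocabulary items.
    vocab = ["match", "goal", "khela", "exam", "padhai", "marks"]
    counts = np.array([[2, 1, 1, 0, 0, 0],
                       [1, 2, 0, 0, 0, 0],
                       [0, 0, 0, 2, 1, 1],
                       [0, 0, 1, 1, 2, 1]], dtype=float)

    p_z_d, p_w_z = plsa(counts, n_topics=2)
    for z in range(2):
        top = np.argsort(p_w_z[z])[::-1][:3]
        print("topic", z, [vocab[i] for i in top])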

Evaluation of Anuvadaksh (EILMT) English-Odia Machine-assisted Translation Tool Pitambar Behera, Renu Singh and Girish Nath Jha

The authors present an evaluation of the Anuvadaksh English-to-Odia Machine Translation system, which has been developed by the Applied Artificial Intelligence Group (AAI-G) of C-DAC, Pune, applying the MANTRA technology to translation from English into eight Indian languages in the EILMT (English-Indian Languages Machine Translation) Consortium, under the TDIL (Technology Development for Indian Languages) programme of DeitY (Dept. of Electronics and Information Technology), Government of India. In this study, a 1k ILCI English sentence corpus from the health domain has been used as input to evaluate the web-based system's output in Odia. For evaluating the output qualitatively, the inter-translator agreement of three human evaluators, with scores on a five-point scale, has been taken into consideration. The scores have been analysed with Fleiss' Kappa statistics in terms of reliability and adequacy, on the basis of which a linguistic error analysis and suggestions for improvement have been provided. The Kappa scores for reliability and adequacy are 0.30 and 0.28 respectively, which indicates that agreement on both is fair.
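For readers unfamiliar with the statistic, the snippet below implements Fleiss' Kappa from its standard definition and runs it on an invented toy rating matrix (items by categories counts from three raters); it does not reproduce the evaluators' actual judgements reported above.

    import numpy as np

    def fleiss_kappa(ratings):
        """Fleiss' Kappa for a (n_items x n_categories) matrix of rating counts.

        Each row must sum to the same number of raters.
        """
        M = np.asarray(ratings, dtype=float)
        n_items, _ = M.shape
        n_raters = M[0].sum()
        p_cat = M.sum(axis=0) / (n_items * n_raters)   # overall category proportions
        p_item = (np.square(M).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
        p_bar, p_exp = p_item.mean(), np.square(p_cat).sum()
        return (p_bar - p_exp) / (1 - p_exp)

    # Invented example: 5 items rated by 3 raters on a 5-point scale
    # (each row counts how many raters chose each of the 5 score values).
    toy = [[2, 1, 0, 0, 0],
           [0, 3, 0, 0, 0],
           [0, 1, 1, 1, 0],
           [0, 0, 2, 1, 0],
           [1, 0, 0, 1, 1]]
    print(round(fleiss_kappa(toy), 3))

Values between 0.21 and 0.40 are conventionally read as "fair" agreement, which is how the reported scores of 0.30 and 0.28 are interpreted above.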

135 ETHI-CA2 2016: ETHics In Corpus Collection, Annotation & Application

24 May 2016

ABSTRACTS

Editors:

Laurence Devillers, Björn Schuller, Emily Mower Provost, Peter Robinson, Joseph Mariani, Agnes Delaborde

Workshop Programme

09:00 – Introduction Laurence Devillers

09:10 – Keynote Chairperson: Laurence Devillers Edouard Geoffrois, Interactive System Adaptation: Foreseen Impacts on the Organisation and Ethics of System Development

09:50 – Talk 1 Chairperson: Bjorn¨ Schuller Teresa Scantamburlo and Marcello Pelillo, Contextualizing Privacy in the Context of Data Science

10:10 – Talk 2 Chairperson: Björn Schuller Kevin Bretonnel Cohen, Karen Fort, Gilles Adda, Sophia Zhou and Dimeji Farri, Ethical Issues in Corpus Linguistics And Annotation: Pay Per Hit Does Not Affect Effective Hourly Rate For Linguistic Resource Development On Amazon Mechanical Turk

10:30 – Coffee and Poster session Chairperson: Laurence Devillers

• Jana Diesner and Chieh-Li Chin, Gratis, Libre, or Something Else? Regulations and Misassumptions Related to Working with Publicly Available Text Data

• Wessel Reijers, Eva Vanmassenhove, David Lewis and Joss Moorkens, On the Need for a Global Declaration of Ethical Principles for Experimentation with Personal Data

• Agnes Delaborde and Laurence Devillers, Diffusion of Memory Footprints for an Ethical Human-Robot Interaction System

• Björn Schuller, Jean-Gabriel Ganascia and Laurence Devillers, Multimodal Sentiment Analysis in the Wild: Ethical considerations on Data Collection, Annotation, and Exploitation

• Jocelynn Cu, Merlin Teodosia Suarez and Madelene Sta. Maria, Subscribing to the Belmont Report: The Case of Creating Emotion Corpora

• Lucile Béchade, Agnes Delaborde, Guillaume Dubuisson Duplessis and Laurence Devillers, Ethical Considerations and Feedback from Social Human-Robot Interaction with Elderly People

11:10 – Talk 3 Chairperson: Joseph Mariani Joss Moorkens, David Lewis, Wessel Reijers and Eva Vanmassenhove, Language Resources and Translator Disempowerment

11:30 – Talk 4 Chairperson: Joseph Mariani Simone Hantke, Anton Batliner and Björn Schuller, Ethics for Crowdsourced Corpus Collection, Data Annotation and its Application in the Web-based Game iHEARu-PLAY

11:50 – Talk 5 Chairperson: Joseph Mariani Agnes Delaborde, Noémie Enser, Alexandra Bensamoun and Laurence Devillers, Liability Specification in Robotics: Ethical and Legal Transversal Regards

12:10 – Panel and open discussion Chairperson: Björn Schuller

12:30 – Conclusion Laurence Devillers

Workshop Organizers

Laurence Devillers – LIMSI-CNRS / Paris-Sorbonne University, France
Björn Schuller – Imperial College London, UK / University of Passau, Germany
Emily Mower Provost – University of Michigan, USA
Peter Robinson – University of Cambridge, UK
Joseph Mariani – IMMI / LIMSI-CNRS / Paris-Saclay University, France
Agnes Delaborde – LIMSI-CNRS / CERDI / Paris-Saclay University, France

Workshop Programme Committee

Gilles Adda – LIMSI-CNRS, France
Jean-Yves Antoine – University of Tours, France
Nick Campbell – TCD, Ireland
Alain Couillault – GFII, France
Anna Esposito – UNINA, Italy
Karën Fort – Université Paris-Sorbonne, France
Jean-Gabriel Ganascia – UPMC, France
Alexei Grinbaum – CEA, France
Hatice Gunes – Queen Mary University of London, UK
Dirk Heylen – University of Twente, Netherlands
Catherine Tessier – ONERA, France
Isabelle Trancoso – INESC, Portugal
Guillaume Dubuisson Duplessis – LIMSI-CNRS, France

Preface

Description
The focus of ETHI-CA2 spans ethical aspects around the entire processing pipeline, from speech, language and multimodal resource collection and annotation to system development and application. In the recent time of ever-more collection "in the wild" of individual and personal multimodal and multi-sensorial "Big Data", of crowd-sourced annotation by large groups of individuals with often unknown reliability and high subjectivity, and of "deep" and autonomous learning with limited transparency of what is being learnt and of how applications depending on such data, for instance in health or robotics, may behave, ethics have become more crucial than ever in the field of language and multimodal resources. This makes ethics a key concern of the LREC community. There is, however, a surprising if not shocking white spot in the landscape of workshops, special sessions, or journal special issues in this field, which ETHI-CA2 aims to fill.

The goal is thus to connect individuals ranging across LREC's fields of interest, such as human-machine and robot- and computer-mediated human-human interaction and communication, and affective, behavioural, and social computing, whose work touches on crucial ethical issues (e.g. privacy, traceability, explainability, evaluation, responsibility, etc.). Such systems increasingly interact with and exploit data from humans of all kinds (e.g. children, adults, vulnerable populations), including non-verbal and verbal data occurring in a variety of real-life contexts (e.g. at home, in the hospital, on the phone, in the car, in the classroom, or on public transportation), and act as assistive and partially instructive technologies, companions, and/or commercial or even decision-making systems. Obviously, an immense responsibility lies at the different ends, from data recording, labelling, and storage to its processing and usage.

Motivation
Emerging interactive systems have changed the way we connect with our machines, modifying how we socialize, our reasoning capabilities, and our behavior. These areas inspire critical questions centering on the ethics, the goals, and the deployment of innovative products that can change our lives and society.

Many current systems operate on private user data, including identifiable information, or data that provides insight into an individual's life routine. The workshop will provide discussions about user consent and the notion of informed data collection. Cloud-based storage systems have grown in popularity as the scope of user content and user-generated content has greatly increased in size. The workshop will provide discussions on best practices for data annotation and storage and on evolving views on data ownership.

Systems have become increasingly capable of mimicking human behavior through research in affective computing. These systems have demonstrated utility for interactions with vulnerable populations (e.g. the elderly, children with autism). The workshop will provide discussions on considerations for vulnerable populations.

The common mantra for assistive technology is "augmenting human care, rather than replacing human care". It is critical that the community anticipates this shift and understands the implications of machine-in-the-loop diagnostic and assessment strategies.

Topics of interest
Topics include, but are not limited to:

• Ethics in recording of private content

• Ethics in multimodal, sensorial data collection

• Ethics in annotation (crowd-sourced) of private data

• Data storage/sharing/anonymization

• Transparency in Machine Learning

• Ethics in Affective, Behavioural, and Social Computing

• Responsibility in Educational Software and Serious Games

• Human-machine interaction for vulnerable populations

• Computer-mediated Human-Human Communication

• Responsibility in Decision-Support based on Data

• The role of assistive technology in health care

Summary of the call
The ETHI-CA2 2016 workshop is the crucially needed first edition of a planned longer series. The goal of the workshop is to connect individuals ranging across LREC's fields of interest, such as human-machine and robot- and computer-mediated human-human interaction and communication, and affective, behavioural, and social computing, whose work touches on crucial ethical issues (e.g. privacy, traceability, explainability, evaluation, responsibility, etc.). These areas inspire critical questions centering on the ethics, the goals, and the deployment of innovative products that can change our lives and, consequently, society. It is critical that our notion of ethical principles evolves with the design of technology. As humans put increasing trust in systems, we must understand how best to protect privacy, explain what information the systems record, the implications of these recordings, what a system can learn about a user, what a third party could learn by gaining access to the data, changes in human behavior resulting from the presence of the system, and many other factors. It is important that technologists and ethicists maintain a conversation over the development and deployment lifecycles of the technology. The ambition of this workshop is to collect the main questions of our community on ethics, goals and societal impact, including experts in sociology, psychology, neuroscience or philosophy. At LREC 2016, the workshop shall encourage a broad range of the community's researchers to reflect on and exchange about the ethical issues inherent in their research, providing an environment in which ethics co-evolve with technology.

Keynote Edouard Geoffrois

Edouard Geoffrois did his PhD in the field of automatic speech recognition, after graduating from the École Polytechnique and obtaining a Master's degree in cognitive science. He joined the French defense procurement agency (DGA) in 1996, where he created its activities in speech and language processing. He has initiated and managed several projects and programs in the field of multimedia information processing, including evaluation campaigns, and has elaborated new metrics and evaluation protocols when needed to steer the developments. Since 2015, he has been seconded to the French national research agency (ANR), where he coordinates two international ERA-NET programs (CHIST-ERA and FLAG-ERA).

Interactive System Adaptation: Foreseen Impacts on the Organisation and Ethics of System Development

Interactive System Adaptation is the ability of a system to learn from user feedback. It can be seen as a special case of autonomous learning, which is the ability of a system to learn from a new environment, without any intervention from its initial developers, and to increase its performance in this new environment while maintaining it in the initial one. While system adaptation is an active topic of research, interactive system adaptation and autonomous learning have never been evaluated in a comparative way, and much research is still needed to develop such capabilities. Following the proposal of new evaluation protocols for interactive system adaptation, such developments can be expected to happen at a much faster pace in the coming years. These capabilities, when available, will have important implications for the organisation and ethics of system development. The most obvious one is that the user's data will not have to be provided to the initial developer to ensure the adaptation, and will instead remain under the control of the user. Another is that the behavior of the systems will be influenced by several types of actors. The objective evaluation of such systems will therefore become important not only to measure their technical capabilities but also to attribute the responsibilities of these actors in case of inadequate system behavior.

Talk 1 09:50 – 10:10 Chairperson: Björn Schuller

Contextualizing Privacy in the Context of Data Science

Teresa Scantamburlo, Marcello Pelillo

Privacy is one of the most long-standing social issues associated with the development of information and communication technology and, over the years, from the diffusion of large databases to the rise of the World Wide Web and today's Internet of Things, to name just a few examples, the concerns have intensified, with consequences for public discussion, design practices and policy making. Unfortunately, the plethora of technical details on this topic has often discouraged people from tackling the problem of privacy and, hence, from being really active in the discussion of proposed approaches and solutions. In this paper we would like to provide fresh motivations for the inclusion of privacy in the overall pipeline of data processing, making this notion and its related issues somewhat more accessible to a non-expert audience. We will do that not by promoting new design standards or methodologies, which would add further technicalities, but by developing a critical perspective on the current approaches. In this way, we aim at providing data specialists with novel conceptual frameworks (e.g. the theory of contextual integrity) to evaluate and better understand the place of privacy in their own work.

Talk 2 10:10 – 10:30 Chairperson: Björn Schuller

Ethical Issues in Corpus Linguistics And Annotation: Pay Per Hit Does Not Affect Effective Hourly Rate For Linguistic Resource Development On Amazon Mechanical Turk

Kevin Bretonnel Cohen, Karen Fort, Gilles Adda, Sophia Zhou and Dimeji Farri

Ethical issues reported with paid crowdsourcing include unfairly low wages. It is assumed that such issues are under the control of the task requester. Can one control the amount that a worker earns by controlling the amount that one pays? 412 linguistic data development tasks were submitted to Amazon Mechanical Turk. The pay per HIT was manipulated through a range of values. We examined the relationship between the pay that is offered per HIT and the effective pay rate. There is no such relationship. Paying more per HIT does not cause workers to earn more: the higher the pay per HIT, the more time workers spend on them (R = 0.92). So, the effective hourly rate stays roughly the same. The finding has clear implications for language resource builders who want to behave ethically: other means must be found in order to compensate workers fairly. The findings of this paper should not be taken as an endorsement of unfairly low pay rates for crowdsourcing workers. Rather, the intention is to point out that additional measures, such as pre-calculating and communicating to the workers an average hourly rate, rather than a per-task rate, must be found in order to ensure an ethical rate of pay.
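The arithmetic behind the finding can be sketched as follows (with invented pay and timing values, not the study's measurements): the effective hourly rate is the pay per HIT divided by the time spent on it, and the reported R is the correlation between pay per HIT and working time.

# Sketch: effective hourly rate and the pay-vs-time correlation (illustrative numbers only).
import numpy as np
from scipy.stats import pearsonr

pay_per_hit = np.array([0.01, 0.02, 0.05, 0.10, 0.20])         # USD offered per HIT
seconds_per_hit = np.array([40.0, 75.0, 190.0, 370.0, 760.0])  # mean observed working time

effective_hourly = pay_per_hit / (seconds_per_hit / 3600.0)
r, p = pearsonr(pay_per_hit, seconds_per_hit)

print("effective hourly rate (USD):", np.round(effective_hourly, 2))
print(f"correlation between pay per HIT and time spent: r = {r:.2f}")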

Coffee and Poster session 10:30 – 11:10 Chairperson: Laurence Devillers

Gratis, Libre, or Something Else? Regulations and Misassumptions Related to Working with Publicly Available Text Data

Jana Diesner and Chieh-Li Chin

Raw, marked-up, and annotated language resources have enabled significant progress in science and applications. Continuing to innovate requires access to user-generated and professionally produced, publicly available content, such as data from online production communities, social networking platforms, customer review sites, discussion forums, and expert blogs. However, researchers do not always have a comprehensive or correct understanding of what types of online data are permitted to be collected and used in what ways. This paper aims to clarify this point. The way in which a dataset is "open" is not defined by its accessibility, but by its copyright agreement, license, and possibly other regulations. In other words, the fact that a dataset is visible free of charge and without logging in to a service does not necessarily mean that the data can also be collected, analyzed, modified, or redistributed. The open software movement introduced the distinction between free as in "free speech" (freedom from restriction, "libre") versus free as in "free beer" (freedom from cost, "gratis"). A possible risk or misassumption related to working with publicly available text data is to mistake gratis data for libre when some online content is really just free to look at. We summarize approaches to responsible and rule-compliant research with respect to "open data".

On the Need for a Global Declaration of Ethical Principles for Experimentation with Personal Data

Wessel Reijers, Eva Vanmassenhove, David Lewis and Joss Moorkens

In this paper, we argue that there is a growing need for a globally accepted set of ethical principles for experimentation that makes use of collections of personal multimodal data. Just as the Helsinki declaration was signed in 1964 to provide ethical principles that would guide all experimentation with human subjects, we argue that today the “digital personae” ought to be protected by a similar, globally endorsed declaration that informs legal regulations and policy making. The rationale for such a declaration lies in the increasing pervasiveness of the use of personal data in many aspects of our daily lives, as well as in the scattered nature of data research for which particular implementations of research ethics at the level of a single institution would not suffice. We argue that the asymmetry between ethical standards of public and of commercial entities, the borderless and boundless nature of online experimentation and the increasing ambiguity of the meaning of online “experiments” are compelling reasons to propose a global declaration of ethical principles for experimentation with personal data.

Diffusion of Memory Footprints for an Ethical Human-Robot Interaction System

Agnes Delaborde and Laurence Devillers

Along with the constant improvement of artificial intelligence skills in robots, a strong will has arisen in the community to define ethical limits on the behaviors of robots. The implementation of ethics and morality in an autonomous system represents a research challenge, and several practical propositions have been offered by the community. The authors propose avenues for a Human-Robot Interaction architecture in which the selective diffusion of footprints and logs extracted from the robot's memory (low-level inputs, interpretation, decisions, actions) would improve the traceability of the robot's internal decision-making, which could for example offer a guarantee of transparency in case of faulty or contentious situations. The description of the proposed architecture is based on the authors' studies of social Human-Robot Interaction systems designed in the context of the French robotic project ROMEO. The authors' proposition will be subsequently assessed in the course of a French transdisciplinary project involving the fields of robotics, law and artificial intelligence.

Multimodal Sentiment Analysis in the Wild: Ethical considerations on Data Collection, Annotation, and Exploitation

Björn Schuller, Jean-Gabriel Ganascia and Laurence Devillers

Some ethical issues that arise for data collection and annotation of audiovisual and general multimodal sentiment, affect, and emotion data "in the wild" are of types that have been well explored, and there are good reasons to believe that they can be handled in routine ways. They mainly involve two areas, namely research with human participants, and protection of personal data. Some other ethical issues coming with such data, such as its exploitation in real-life recognition engines and evaluation in long-term usages, are, however, less explored. Here, we aim to discuss both – the more "routine" aspects as well as the white spots in the literature of the field. The discussion will be guided by needs and observations as well as plans made during and for the European SEWA project to provide a showcase example.

Subscribing to the Belmont Report: The Case of Creating Emotion Corpora

Jocelynn Cu, Merlin Teodosia Suarez and Madelene Sta. Maria

Work on human emotion and behavior analysis has been on the increase in recent years. This is primarily the result of the maturity of information technology, which has enmeshed itself in humans' everyday activities. Current approaches to emotion and behavior modeling require the creation of corpora from human subjects typically engaged in interactions. Collection of information and other data from humans necessitates following certain guidelines to ensure their privacy, security and overall well-being. This work presents how the Belmont Report may be adapted in the practice and review of acceptable standards in the creation of emotion corpora for developing countries, given that these communities do not possess high awareness of their privacy rights. It describes the creation of several multi-modal corpora of patients and students, including the ethical practices employed following the principles indicated in the Belmont Report. At the end of the paper, recommendations on how to improve research practices are shared, including possible research directions.

Ethical Considerations and Feedback from Social Human-Robot Interaction with Elderly People

Lucile Béchade, Agnes Delaborde, Guillaume Dubuisson Duplessis and Laurence Devillers

Field studies in Human-Robot Interaction are essential for the design of socially acceptable robots. This paper describes a data collection carried out with twelve elderly people interacting with the Nao robot via a Wizard-of-Oz system. This interaction involves two scenarios implementable in a social robot as an affective companion in everyday life. One of the scenarios involves humor strategies while the other one involves negotiation strategies. In this paper, the authors detail the designs of the system, the scenarios and the data collection. The authors take a closer look at the opinions of the elderly collected through self-reports and through a private verbal exchange with one experimenter. These opinions include recommendations about the features of the robot (such as speech rate, volume and voice) as well as points of view about the ethical usage of affective robots.

Talk 3 11:10 – 11:30 Chairperson: Joseph Mariani

Language Resources and Translator Disempowerment

Joss Moorkens, David Lewis, Wessel Reijers and Eva Vanmassenhove

Language resources used for machine translation are created by human translators. These translators have legal rights with regard to copyright ownership of translated texts and databases of parallel bilingual texts, but may not be in a position to assert these rights due to employment practices widespread in the translation industry. This paper examines these employment practices in detail, and looks at the legal situation for ownership of translation resources. It also considers the situation from the standpoint of current owners of resources.

Talk 4 11:30 – 11:50 Chairperson: Joseph Mariani

Ethics for Crowdsourced Corpus Collection, Data Annotation and its Application in the Web-based Game iHEARu-PLAY

Simone Hantke, Anton Batliner and Björn Schuller

We address ethical considerations concerning iHEARu-PLAY, a web-based, crowdsourced, multiplayer game for large-scale, real-life corpus collection and multi-label, holistic data annotation for advanced paralinguistic tasks. While playing the game, users are recorded or perform labelling tasks, compete with other players, and are rewarded with scores and different prizes. Players will have fun playing the game and at the same time support science. With this modular, cross-platform crowdsourcing game, different ethical and privacy issues arise. A closer look is taken at ethics in the recording of private content, data collection, data annotation, and storage, as well as at sharing the data within iHEARu-PLAY. Further, we address the interplay of science and society in ethics and relate this to our application iHEARu-PLAY.

Talk 5 11:50 – 12:10 Chairperson: Joseph Mariani

Liability Specification in Robotics: Ethical and Legal Transversal Regards

Agnes Delaborde, Noémie Enser, Alexandra Bensamoun and Laurence Devillers

Beyond the implementation of moral considerations, an ethical robot should be designed in a way that foresees the potential damage it could cause, and that also anticipates the ways in which the human beings in its environment (from the designer to the user) could be held responsible for its acts. In the present study, the authors propose to consider the actions of the robot under the French liability regime for the actions of things. Under this regime, the designer, the manufacturer and the user could be held liable for the actions of the robot. As a preventive measure, the robot would be subject to mandatory insurance taken out by the user, and the designer would endow the interface with a log of data that could be assessed by an expert. As part of a curative approach, the authors present realistic case studies in which the robot could cause damage, with a determination of liability based upon the liability regime for the actions of things.

International Workshop on Social Media World Sensors (Sideways)

24 May 2016

ABSTRACTS

Editors:

Mario Cataldi, Luigi Di Caro, Claudio Schifanella

Workshop Programme

09:00 – 09:30 – Introduction by Workshop Chairs

09:30 – 10:00 – Dane Bell, Daniel Fried, Luwen Huangfu, Mihai Surdeanu, Stephen Kobourov, Challenges for using social media for early detection of T2DM (invited talk)

10:00 – 10:30 – Tim Kreutz and Malvina Nissim, Catching Events in the Twitter Stream: A showcase of student projects

10:30 – 11:00 Coffee break

11:00 – 11:30 – Udo Krushwitz, Ayman Al Helbawy, Massimo Poesio, Exploiting Social Media to Address Fundamental Human Rights Issues (invited talk)

11:30 – 12:00 – Emanuele Di Rosa, Alberto Durante, App2Check: a Machine Learning-based system for Sentiment Analysis of App Reviews in Italian Language

12:00 – 12:30 – Christian Colella, Distrusting Science on Communication Platforms: Socio-anthropological Aspects of the Science-Society Dialectic within a Phytosanitary Emergency

12:30 – 13:00 – Luca Vignaroli, Claudio Schifanella, K. Selcuk Candan, Ruggero Pensa, Maria Luisa Sapino, Tracking and analyzing the "second life" of TV content: a media and social-driven framework (invited industrial demo)

Workshop Organizers

Mario Cataldi – Université Paris 8, France
Luigi Di Caro – Department of Computer Science, University of Turin, Italy
Claudio Schifanella – RAI, Centre for Research and Technological Innovation, Turin, Italy

Workshop Programme Committee

Luca Aiello – Yahoo! Labs, London
Andrea Ballatore – University of California, Santa Barbara, USA
Iván Cantador – Universidad Autónoma de Madrid, Spain
Federica Cena – University of Torino, Italy
Martin Chorley – Cardiff University, Wales
Emilio Ferrara – Indiana University Bloomington, USA
Simon Harper – University of Manchester, England
Dino Ienco – Irstea, UMR TETIS, Montpellier, France
Séamus Lawless – Trinity College Dublin, Ireland
Emmanuel Malherbe – Multiposting, France
Rosa Meo – University of Torino, Italy
Ruggero G. Pensa – University of Torino, Italy
Rossano Schifanella – University of Torino, Italy
Thomas Steiner – Google, USA
Luca Vignaroli – RAI, Centre for Research and Technological Innovation, Turin, Italy

Introduction
It is our great pleasure to welcome you to the 2016 Workshop on Social Media World Sensors – Sideways 2016, which is held in conjunction with the 10th edition of the Language Resources and Evaluation Conference, LREC 2016, in Portorož, Slovenia. This second edition of the workshop aims at bringing together academics and practitioners from different areas to promote the vision of social media as social sensors.

Nowadays, social platforms have become the most popular communication system all over the world. Due to the short format of messages and the accessibility of these systems, users tend to shift from traditional communication tools (such as blogs, web sites and mailing lists) to social networks for various purposes. Billions of messages appear daily in services such as Twitter, Tumblr, Facebook, etc. The authors of these messages share content about their private lives, exchange opinions on a variety of topics and discuss a wide range of news. Even if these systems cannot represent an alternative to the authoritative information media, considering the number of their users and the impressive response time of their contributions, they represent a sort of real-time news sensor that can even outpace the best newspapers in informing the web community about emerging topics and trends. The most important information media always need a certain amount of time to react to a news event: professional journalists require time, collaborators and/or technology support to provide a professional report. A user, however, can easily report, in a few characters, what is happening in front of their eyes, without any concern about the readers or the writing style. These aspects make social services the most powerful sensor for event detection and automatic news generation. The aim of this workshop was to ask researchers to adopt this view, by studying how social platforms can be used in real-time scenarios to detect emerging events and enrich them with contextual information.

First, we would like to thank the organizing committee of LREC 2016 for giving us the opportunity to organize the workshop. Second, we would like to thank our program committee members. And of course, we would like to thank all the authors of the workshop for submitting their research works and for their participation. We hope you will enjoy the second edition of the Sideways workshop and the LREC Conference, and have a great time in Portorož.

The call for papers attracted submissions from India, Europe, Africa, and the United States. The program committee reviewed and accepted the following:

Venue or Track    Reviewed    Accepted    Acceptance Rate
Full Papers       5           3           60%
Short Papers      2           0           0%
Total             7           3           42.8%

We also encourage attendees to attend the invited talk presentations:
• Challenges for using social media for early detection of T2DM – Dane Bell – University of Arizona
• Exploiting Social Media to Address Fundamental Human Rights Issues – Udo Krushwitz, Ayman Al Helbawy, Massimo Poesio – University of Essex

• Tracking and analyzing the "second life" of TV content: a media and social-driven framework – Luca Vignaroli, Claudio Schifanella, K. Selcuk Candan, R. Pensa, Maria Luisa Sapino – RAI Research Center, Arizona State University, University of Turin

Putting together Sideways 2016 was a team effort. We thank the authors for providing the content of the program and we are grateful to the program committee who worked very hard in reviewing papers and providing feedback for authors. Finally, we thank the hosting organization. We hope that you will find this program interesting and that the workshop will provide you with a valuable opportunity to share ideas with other researchers and practitioners from institutions around the world.

Co-Chairs Luigi Di Caro Mario Cataldi Claudio Schifanella

Tuesday 24 May, 9:00 – 13:00 Chairs: Mario Cataldi, Luigi Di Caro, Claudio Schifanella

Challenges for using social media for early detection of T2DM Dane Bell, Daniel Fried, Luwen Huangfu, Mihai Surdeanu, Stephen Kobourov Twitter and other social media data are utilized for a wide variety of applications such as marketing and stock market prediction. Each application and appropriate domain of social media text presents its own challenges and benefits. We discuss methods for detecting obesity, a risk factor for Type II Diabetes Mellitus (T2DM), from the language of food on Twitter on community data, the peculiarities of this data, and the development of individual-level data for this task.

Catching Events in the Twitter Stream: A showcase of student projects Tim Kreutz and Malvina Nissim A group of bachelor students in information science at the University of Groningen applied off-the-shelf tools to the detection of events on Twitter, focusing on Dutch. Systems were built in four socially relevant areas: sports, emergencies, local life, and news. We show that (i) real time event detection is a feasible and suitable way for students to learn and employ data mining and analysis techniques, while building end-to-end potentially useful applications; and (ii) even just using off-the-shelf resources for such applications can yield very promising results.

Exploiting Social Media to Address Fundamental Human Rights Issues Udo Krushwitz, Ayman Al Helbawy, Massimo Poesio

App2Check: a Machine Learning-based system for Sentiment Analysis of App Reviews in Italian Language Emanuele Di Rosa, Alberto Durante Sentiment Analysis nowadays has a crucial role in social media analysis and, more generally, in analysing user opinions about general topics or user reviews about products/services, enabling a huge number of applications. Many methods and software tools implementing different approaches exist, and there is no clear best approach for sentiment classification/quantification. We believe that the performance reached by machine learning approaches is a key advantage for sentiment analysis, allowing results very close to those obtained by groups of humans evaluating subjective sentences such as user reviews. In this paper, we present the App2Check system, developed mainly by applying supervised learning techniques, and the results of our experimental evaluation, showing that App2Check outperforms state-of-the-art research tools on Italian-language user reviews related to apps published to app stores.
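As a generic illustration of the supervised-learning setup underlying such systems (a toy sketch, not App2Check itself, with invented review snippets), a bag-of-words polarity classifier can be assembled as follows:

# Toy supervised sentiment classifier for app reviews (illustration only; not App2Check).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = ["app fantastica e veloce", "si blocca sempre, pessima",
           "molto utile la consiglio", "interfaccia confusa e lenta"]
labels = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(reviews, labels)
print(clf.predict(["app veloce e utile"]))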

Distrusting Science on Communication Platforms: Socio-anthropological Aspects of the Science-Society Dialectic within a Phytosanitary Emergency Christian Colella The work aims to investigate the conspiracy-like and pseudo-scientific beliefs that arose in Salento during the spread of a plant disease affecting olive tree crops, known as "OQDS", and, more generally, it tries to analyze from a socio-anthropological perspective the communication biases in the dialectical relationship between scientific research and the general public, and how social media platforms act as a conceptual container for pseudo-scientific beliefs and sentiments of distrust toward scientific research.

Tracking and analyzing the "second life" of TV content: a media and social-driven framework Luca Vignaroli, Claudio Schifanella, K. Selcuk Candan, Ruggero Pensa, Maria Luisa Sapino People on the Web talk about television. TV users' social activities implicitly connect the concepts referred to by videos, news, comments, and posts. The strength of such connections may change as the perception of users on the Web changes over time. With the goal of leveraging users' social activities to better understand how TV programs are perceived by the TV public and how users' interests evolve in time, this work presents a framework that allows one to manage, explore and analyze the heterogeneous and dynamic data coming from the different information sources which play a role in what we call the "second life" of TV content.

EMOT: Emotions, Metaphors, Ontology and Terminology during Disasters

Date: 28 May 2016

ABSTRACTS

Editors:

Khurshid Ahmad, Stephen Kelly, Xiubo Zhang

Abstracts of the LREC 2016 Workshop “EMOT – Emotions, Metaphors, Ontology and Terminology during Disasters”

28 May 2016 – Portorož, Slovenia

Edited by Khurshid Ahmad, Stephen Kelly, Xiubo Zhang

This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 607691.

Workshop Programme

09:30 – 09:40 Introduction by Workshop Chair

Khurshid Ahmad, Emotion, Metaphor, Ontology and Terminology

09:40 – 10:30 Emotions

Grazia Busa, Alice Cravotta, Detecting Emotional Involvement in Professional News Reporters: an Analysis of Speech and Gestures

Maria Spyropoulou, Emotive and Terminological Content in Disaster Related Messages

10:30 – 11:00 Coffee break

11:00 – 11:25 Metaphors

Carl Vogel, Emotion, Quantification, Genericity, Metaphoricity

11:25 – 11:50 Ontology

Antje Schlaf, Sabine Gründer-Fahrer, Patrick Jähnichen, Topics in Social Media for Disaster Management – A German Case Study on the Flood 2013

11:50 – 12:40 Terminology

Maria Teresa Musacchio, Raffaella Panizzon, Xiubo Zhang, Virginia Zorzi, A Linguistically-driven Methodology for Detecting Impending and Unfolding Emergencies from Social Media Messages

Xiubo Zhang, Raffaella Panizzon, Maria Teresa Musacchio, Khurshid Ahmad, Terminology Extraction for and from Communications in Multi-disciplinary Domains

12:40 – 13:00 Closing Session

Workshop Organizers

Khurshid Ahmad – Trinity College Dublin, Republic of Ireland
Stephen Kelly – Trinity College Dublin, Republic of Ireland
Xiubo Zhang – Trinity College Dublin, Republic of Ireland

Workshop Programme Committee

Khurshid Ahmad – Trinity College Dublin, Republic of Ireland
Gerhard Heyer – Leipzig University, Germany
Bodil Nistrup Madsen – Copenhagen Business School, Denmark
Maria Teresa Musacchio – University of Padova, Italy
Henrik Selsoe Sorensen – Copenhagen Business School, Denmark
Carl Vogel – Trinity College Dublin, Republic of Ireland

Preface

An unexpected event induces an emotive reaction and metaphorical use of language; natural disasters, a class of unexpected events, induce the use of disaster-specific terminology and ontological descriptions in addition to emotive/metaphorical use of language. Disasters are characterised equally well by the fact that the onset, duration, and aftermath of such events all, to a greater or lesser degree, involve a greater demand for information in a situation where information is generally scarce. One has to use all modalities of communication, including written and spoken language, visual communications, and non-verbal communications, especially gestures. The authors contributing to this workshop have been working on how to specify, design and prototype disaster management systems that use social media, including microblogging and social networks, as one of their inputs. Their linguistic coverage includes English, German and Italian. Their focus is on extracting information from continuous data streams including text, speech and images.

There are two papers on emotional expressions used in disasters: Busa and Cravotta have analysed gestures and speech of reporters working in areas threatened by natural disasters like floods and suggest that hand gestures and the modulation of voice correlate with the severity of an impending disaster. Spyropoulou looks at the terminological and affect content of text messages and speech excerpts and finds that speech comprises more information about the sentiment of the public at large than, say, their text messages.

Topic modelling is one of the essential techniques that can be used to automatically categorise the contents of a text data stream of messages. This technique uses machine learning and information extraction, and has been successfully used by Schlaf, Gründer-Fahrer and Jähnichen to analyse German social media texts, especially on Facebook and Twitter: they find that the machine discovered categories that correspond quite 'naturally' to the categories of texts used in disaster management – warnings before disasters, re-location information during disasters, and requests after the disasters.

Vogel presents a theoretical discussion of the relationship between emotions and metaphors, and how affect-based language, laden with sentiment, is used. The selective use of terminology and ontology plays a key role in a methodology for detecting impending and current emergencies in social media streams which has been developed by Musacchio, Panizzon and Zorzi. Finally, we have a paper by Zhang et al. that deals with the automatic extraction of terminology and ontology from text, especially social media: these authors have designed, implemented and evaluated an ontology/terminology extraction system (CiCui). The authors appear to be aware of the problems relating to the factual and ethical provenance of data, especially on social media, involving as it does issues such as privacy, dignity, copyright ownership, rumours and many other legal and ethical considerations.

The presenters wish to acknowledge the support of the EU FP7 Programme focussing on the impact of social media in emergencies. The research leading to these results has received funding from the European Community's Seventh Framework Programme under grant agreement No. 607691 (SLANDAIL, 2014–2017).

Emotion Time 09:40 – 10:30

Detecting Emotional Involvement in Professional News Reporters: an Analysis of Speech and Gestures G. Busa, A. Cravotta

Abstract This study aims to investigate the extent to which reporters' voice and body behaviour may betray different degrees of emotional involvement when reporting on emergency situations. The hypothesis is that emotional involvement is associated with an increase in body movements and pitch and intensity variation. The object of investigation is a corpus of 21 10-second videos of Italian news reports on flooding taken from Italian nation-wide TV channels. The gestures and body movements of the reporters were first inspected visually. Then, measures of the reporters' pitch and intensity variations were calculated and related to the reporters' gestures. The effects of the variability in the reporters' voice and gestures were tested with an evaluation test. The results show that the reporters vary greatly in the extent to which they move their hands and body in their reporting. Two gestures seem to characterise reporters' communication of emergencies: beats and deictics. The reporters' use of gestures partially parallels their variations in pitch and intensity. The evaluation study shows that increased gesturing is associated with greater emotional involvement and less professionalism. The data was used to create an ontology of gestures for the communication of emergencies.
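The acoustic side of such an analysis, measuring pitch and intensity variation in a reporter's speech, can be sketched with the praat-parselmouth library (the file name and the choice of standard deviation as the variation measure are assumptions for illustration, not the authors' procedure):

# Sketch: pitch and intensity variation from a 10-second report clip (illustrative only).
import numpy as np
import parselmouth  # praat-parselmouth

snd = parselmouth.Sound("report_clip.wav")   # hypothetical audio file

f0 = snd.to_pitch().selected_array["frequency"]
f0 = f0[f0 > 0]                              # drop unvoiced frames

db = snd.to_intensity().values.flatten()

print(f"pitch variation (std, Hz): {np.std(f0):.1f}")
print(f"intensity variation (std, dB): {np.std(db):.1f}")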

Emotive and Terminological Content in Disaster Related Messages M. Spyropoulou

Abstract As the availability of information during a disaster is low, research has started to focus on other modes of communication that can complement text in information extraction and sentiment analysis applications. This paper attempts an initial estimation of what kind of results we should expect to get from disaster-related text and speech data.

Metaphor Time 11:00 – 11:25

Emotion, Quantification, Genericity, Metaphoricity C. Vogel

Ontology Time 11:25 – 11:50

Topics in Social Media for Disaster Management – A German Case Study on the Flood 2013 A. Schlaf, S. Gründer-Fahrer, P. Jähnichen

Abstract The paper presents the results of a German case study on social media use during the 2013 flood in Central Europe. During this event, thousands of volunteers organized themselves via social media without being motivated or guided by professional disaster management. The aim of our research was to show and analyze the real potential of social media for disaster management and to enable public organizations to get in touch with the people and take advantage of, as well as control of, the power of social media. In our investigation, we applied state-of-the-art technology from Natural Language Processing, mainly topic modeling, to test and demonstrate its usefulness for computer-based media analysis in modern disaster management. At the same time, the analysis was a comparative study of social media content in the context of a disaster. We found that Twitter played its most prominent part in the exchange of current factual information on the state of the event, while Facebook was prevalently used for emotional support and the organization of volunteer help. Accordingly, social media are powerful not only with respect to their volume, velocity and variety but also come with their own content, language and ways of structuring information.
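As a pointer to the kind of tooling involved, the sketch below runs LDA topic modeling over a handful of toy German flood-related posts with gensim; it is an illustration with invented data, not the authors' pipeline.

# Sketch: LDA topic modeling over toy German flood-related posts (not the study's pipeline).
from gensim import corpora, models

posts = [
    "hochwasser pegel steigt aktuelle lage elbe",    # factual status updates
    "sandsaecke helfer gesucht treffpunkt schule",   # volunteer organisation
    "pegel elbe aktuelle warnung hochwasser",
    "danke an alle helfer unterkunft spenden",
]
texts = [p.split() for p in posts]

dictionary = corpora.Dictionary(texts)
bow_corpus = [dictionary.doc2bow(t) for t in texts]

lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary,
                      passes=20, random_state=1)
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)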

Terminology Time 11:50 – 12:40

A Linguistically-driven Methodology for Detecting Impending and Unfolding Emergencies from Social Media Messages M. Musacchio, R. Panizzon, X. Zhang, V. Zorzi

Abstract Natural disasters have demonstrated the crucial role of social media before, during and after emergencies (Haddow & Haddow 2013). Within our EU project Slándáil, we aim to ethically improve the use of social media in enhancing the response of disaster-related agencies. To this end, we have collected corpora of social and formal media to study newsroom communication of emergency management organisations in English and Italian. Currently, emergency management agencies in English-speaking countries use social media in different measure and to different degrees, whereas the Italian national Protezione Civile only uses Twitter at the moment. Our method is developed with a view to identifying communicative strategies and detecting sentiment in order to distinguish warnings from actual disasters and major from minor disasters. Our linguistic analysis uses humans to classify alert/warning messages or emergency response and mitigation ones based on the terminology used and the sentiment expressed. The results of the linguistic analysis are then used to train an application by tagging messages and detecting disaster- and/or emergency-related terminology and emotive language, to simulate human rating and forward information to an emergency management system.

Terminology Extraction for and from Communications in Multi-disciplinary Domains X. Zhang, R. Panizzon, M. Musacchio, K. Ahmad

Abstract Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and uni-lingual environment such as engineering, medical, physical and geological sciences, or administration, business and leisure. However, as human enterprises get more and more complex, it has become increasingly important for teams in one discipline to collaborate with others from not only a non-cognate discipline but also speaking a different language. Disaster mitigation and recovery, and conflict resolution, are amongst the areas where there is a requirement to use standardised multilingual terminology for communication. This paper presents a feasibility study conducted to build terminology (and ontology) in the domain of disaster management and is part of the broader work conducted for the EU project Slándáil (FP7 607691). We have evaluated CiCui (from the Chinese name 词萃, which translates to 'words gathered'), a corpus-based text analytic system that combines frequency, collocation and linguistic analyses to extract candidate terms from corpora comprised of domain texts from diverse sources. CiCui was assessed against four terminology extraction systems and the initial results show that it has above-average precision in extracting terms.
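The general recipe of combining frequency counts with a collocation statistic for term-candidate extraction (only the generic idea, not CiCui) can be sketched with NLTK as follows:

# Sketch of frequency- plus collocation-based term candidate extraction (generic idea, not CiCui).
from collections import Counter
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

# Toy domain tokens; a real run would use tokenised disaster-management corpora.
tokens = ("flood warning issued the flood warning covers the river basin "
          "emergency management agency issued evacuation orders for the river basin").split()
stop = {"the", "for"}

# Single-word candidates by raw frequency (stop words crudely filtered).
unigram_counts = Counter(t for t in tokens if t not in stop)
print(unigram_counts.most_common(5))

# Multi-word candidates ranked by a log-likelihood collocation measure.
measures = BigramAssocMeasures()
finder = BigramCollocationFinder.from_words(tokens)
finder.apply_word_filter(lambda w: w in stop)
print(finder.nbest(measures.likelihood_ratio, 5))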

Multimodal Corpora: Computer Vision and Language Processing (MMC 2016)

24 May 2016

ABSTRACTS

Editors:

Jens Edlund, Dirk Heylen, Patrizia Paggio

Workshop Programme

09:00 – 09:15 – Welcome!

09:15 – 10:30 – Oral Session 1 Emiel van Miltenburg: Stereotyping and Bias in the Flickr30k Dataset István Szekrényes, Laszlo Hunyadi and Tamas Varadi: The Multimodal HuComTech Corpus: Principles of Annotation and Discovery of Hidden Patterns of Behaviour Michael Amory and Olesya Kisselev: The Annotation of Gesture Designed for Classroom Interaction

10:30 – 11:00 Coffee break

11:00 – 12:15 – Oral Session 2 Minghao Yang, Ronald Böck, Dawei Zhang, Tingli Gao, Linlin Chao, Hao Li and Jianhua Tao: "Do You Like a Cup of Coffee?" - The CASIA Coffee House Corpus Paul Hongsuck Seo and Gary Geunbae Lee: A Corpus for a Multimodal Dialog System for Presentation Controls Michael Tornow, Martin Krippl, Svea Bade, Angelina Thiers, Julia Krüger, Ingo Siegert, Sebastian Handrich, Lutz Schega and Andreas Wendemuth: Integrated Health and Fitness (iGF)-Corpus - ten-Modal Highly Synchronized Subject-Dispositional and Emotional Human Machine Interactions

12:15 – 13:45 – Lunch break

13:45 – 15:00 – Oral Session 3 Claire Bonial, Taylor Cassidy, Susan Hill, Judith Klavans, Matthew Marge, Douglas Summers-Stay, Garrett Warnell and Clare Voss: A Robotic Exploration Corpus for Scene Summarization and Image-Based Question Answering Anna Matamala and Marta Villegas: Building an Audio Description Multilingual Multimodal Corpus: The VIW Project Jindřich Libovický and Pavel Pecina: A Dataset and Evaluation Metric for Coherent Text Recognition from Scene Images

15:00 – 16:00 – Posters and demos Ian Wood: Thinspiration and Anorexic Tweets Emer Gilmartin, Ketong Su, Yuyun Huang, Christy Elias, Benjamin R. Cowan and Nick Campbell: Collecting a human-machine, human-human, and human-woz comparative social talk corpus Kristin Hagen, Janne Bondi Johannessen, Anders Nøklestad and Joel Priestley: Search and Annotation Tools for Heritage Language Spoken Corpora

16:00 – 16:30 Coffee break

16:30 – 17:45 – Oral Session 4 Patrice Boucher, Pierrich Plusquellec, Pierre Dufour, Najim Dehak, Patrick Cardinal and Pierre Dumouchel: PHYSIOSTRESS: A Multimodal Corpus of Data on Acute Stress and Physiological Activation Dimitra Anastasiou and Kirsten Bergmann: A Gesture-Speech Corpus on a Tangible Interface Costanza Navarretta: Filled pauses, Fillers and Familiarity in Spontaneous Conversations

17:45 – 18:00 – Concluding remarks

Workshop Organizers

Jens Edlund – KTH Royal Institute of Technology
Dirk Heylen – University of Twente
Patrizia Paggio – University of Copenhagen / University of Malta

Preface

The creation of a multimodal corpus involves the recording, annotation and analysis of several communication modalities such as speech, hand gesture, facial expression, body posture, gaze, etc. An increasing number of research areas have moved, or are in the process of moving, from focused single-modality research to full-fledged multimodality research, and multimodal corpora are becoming a core research asset and an opportunity for interdisciplinary exchange of ideas, concepts and data.

Given this, we are pleased to present the 11th Workshop on Multimodal Corpora, once again in the form of an LREC workshop.

As always, we aimed for a wide cross-section of the field, with contributions ranging from collection efforts, coding, validation, and analysis methods to tools and applications of multimodal corpora.

Success stories of corpora that have provided insights into both applied and basic research are welcome, as are presentations of design discussions, methods and tools. This year, we wanted to pay special attention to the integration of computer vision and language processing techniques – a combination that is becoming increasingly important as the amount of accessible video and speech data increases – and the suggested theme for this instalment of Multimodal Corpora was how processing techniques for vision and language can be combined to manage, search, and process digital content.

This workshop follows similar events held at LREC 00, 02, 04, 06, 08, 10, ICMI 11, LREC 2012, IVA 2013, and LREC 2014.

Welcome! Tuesday 24 May, 9:00 – 9:15

Oral Session 1 Tuesday 24 May, 9:15 – 10:30 Chairperson: TBD

Stereotyping and Bias in the Flickr30k Dataset Emiel van Miltenburg

An untested assumption behind the crowdsourced descriptions of the images in the Flickr30k dataset (Young et al., 2014) is that they “focus only on the information that can be obtained from the image alone” (Hodosh et al., 2013, p. 859). This paper presents some evidence against this assumption, and provides a list of biases and unwarranted inferences that can be found in the Flickr30k dataset. Finally, it considers methods to find examples of these, and discusses how we should deal with stereotype-driven descriptions in future applications.

The Multimodal HuComTech Corpus: Principles of Annotation and Discovery of Hidden Patterns of Behaviour István Szekrényes, Laszlo Hunyadi and Tamas Varadi

The HuComTech Corpus represents a detailed and extensive annotation of verbal and nonverbal human behaviour as manifested in formal and informal dialogues of more than 50 hours of audiovisual recordings. The participants were 110 university students, and the language of the dialogues in both settings was Hungarian. The initial aim of building the corpus was to acquire a wide range of data characteristic of human-human interaction in order to make generalisations for their implementation in more advanced human-machine interaction systems. We were especially interested in the formal and pragmatic ways of how dialogues are managed according to specific contexts. The view is becoming increasingly shared that, in order to make a conversation successful, the formal verbal, syntactic and semantic aspects of a conversation need to go hand in hand with its nonverbal aspects (the suprasegmentals of speech as well as a wide variety of gestures). It is especially important in a human-machine interaction, where a proper emphasis on multimodality can significantly add to the robustness of such systems: the recognition of a speaker’s gestures by the machine agent can contribute to the generation of its contextually proper responses and, consequently, to the sense of cooperativeness, a crucial component of a successful interaction. The need to cooperate goes beyond understanding the propositional content of the verbal component that is enhanced by the gestural one: the participants (either human or machine) need to interpret the partner’s intentions, as well as his/her emotions as complements to their actions and manifestations of reactions to such intentions. Therefore the annotation of the corpus was extended to those formal and pragmatic markers of the given dialogues that were considered both characteristic of human-human interactions and implementable in a human-machine interaction system.

The Annotation of Gesture Designed for Classroom Interaction Michael Amory and Olesya Kisselev

In the past decade, the field of Applied Linguistics has witnessed an increased interest in the study of multimodal aspects of language and language acquisition, and the number of multimodal corpora that are designed to investigate classroom interactions, second language acquisition and second language pedagogy is on the rise. The promise is that these digital repositories of video recordings will be able to take advantage of Corpus Linguistics tools and procedures in order to maximize and diversify analytical capabilities. However, transcription conventions (i.e., annotation schemas) for multimodal features (such as gestures, gaze, and body movement) that are simple, systematic and searchable are not readily available. The current project focuses on developing an annotation schema for the transcription of gestures, integrating the research traditions of Conversation Analysis, gesture research, ASL and Corpus Linguistics. The goal of the project is to create a set of conventions that have the analytical and descriptive power required for gesture research, are manageable for the transcriber and reader to engage with, and are systematic enough to allow for searchability. The study utilizes video-recorded data from the Corpus of English for Academic and Professional Purposes developed at the Pennsylvania State University.

Coffee Break Tuesday 24 May, 10:30 – 11:00

Oral Session 2 Tuesday 24 May, 11:00 – 12:15 Chairperson: TBD

"Do You Like a Cup of Coffee?" - The CASIA Coffee House Corpus Minghao Yang, Ronald Böck, Dawei Zhang, Tingli Gao, Linlin Chao, Hao Li and Jianhua Tao

Virtual agents are perfect options to represent a technical device in an interaction. However, their appearance may in particular influence the user's behaviour. Therefore, we present a corpus containing naturalistic communication between a user and two virtual agents in a coffee house environment. The scenario is set up in a pleasant way, dealing with various topics like discussions on coffee, drinks, weather, and gaming. Thus, behavioural studies are feasible. The combination of the agents is inspired by the "Utah scenario" but fitted to two screens, which allows for non-static behaviour of the user. The multimodally recorded dataset provides audio-visual material in Mandarin from more than 50 participants. Further, Kinect recordings and transcripts are available as well. Besides the corpus description, first insights into the material are given. Based on the Kinect data, movement and behaviour analyses of users in a Human-Computer Interaction are possible. Currently we have identified two prototypical movement patterns in this context. Furthermore, we investigated in parallel the user's movement, head direction information, and the dialogue course. An interesting finding is that in such an interaction a full turn of the head and body happens relatively late, even if the turn has already fully shifted between agents.

A Corpus for a Multimodal Dialog System for Presentation Controls Paul Hongsuck Seo and Gary Geunbae Lee

Research studies on multimodal data have recently received great interest. In particular, combining information from human verbal and gestural modalities is rapidly emerging, based on findings in the literature that the verbal and gestural modalities are highly correlated with each other, even in their production. However, there are not many available resources comprising the verbal and gestural modalities, which creates a barrier to starting research on practical applications. In this ongoing work, we aim to build a multimodal corpus comprising the verbal and gestural modalities, designed for a dialog system for presentation controls having eight different features with 18 user intents. The collected data is very rich in modality, containing active IR and HD colour video with an audio stream from a four-microphone array. We present the annotation process for utterances and gestures in the collected presentation recordings for such an application, together with its statistics.

Integrated Health and Fitness (iGF)-Corpus - ten-Modal Highly Synchronized Subject-Dispositional and Emotional Human Machine Interactions Michael Tornow, Martin Krippl, Svea Bade, Angelina Thiers, Julia Krüger, Ingo Siegert, Sebastian Handrich, Lutz Schega and Andreas Wendemuth

A multimodal corpus on human machine interaction in the area of health and fitness is introduced in this paper. It shows the interaction of users with a gait training system. The subjects pace through a training course four times. In the intermissions, they interact with a multimodal platform, where they are given feedback, re-assess their performance and plan the next steps. The subjects are highly involved. By design, the interaction further evokes cognitive underload and overload as well as emotional reactions. The platform interaction was arranged as a Wizard-of-Oz setup. In the interaction phase, 10 modalities are recorded in 20 sensory channels with a high degree of hardware synchronicity, including several high-resolution cameras, headset and directional microphones, biophysiology, 3D data as well as skeleton and face detection information. In the corpus, 65 subjects are recorded in the interaction sessions for a total of 100 minutes per subject, including self-ratings from eight time points during the experiment. Additionally, several questionnaires are available for all subjects regarding personality traits, including technical and stress coping behavior.

Lunch Tuesday 24 May, 12:15 – 13:45

Oral Session 3 Tuesday 24 May, 13:45 – 15:00 Chairperson: TBD

A Robotic Exploration Corpus for Scene Summarization and Image-Based Question Answering Claire Bonial, Taylor Cassidy, Susan Hill, Judith Klavans, Matthew Marge, Douglas Summers-Stay, Garrett Warnell and Clare Voss

The focus of this research is the development of training corpora that will facilitate multimodal human-robot communication. The corpora consist of video and images recorded during a robot's exploration of an office building and several types of associated natural language text, gathered in four distinct annotation tasks. Each annotation task supports the development of training data for a foundational technology integral to facilitating multimodal communication with robots, including object recognition, scene summarization, image retrieval, and image-based question answering. Although each of these technology areas has been addressed in work on computer vision, our research examines the unique requirements of these technologies when the visual data is collected by a robot and analyzed from a first-person perspective. For our purposes, the robot must be able to provide efficient natural language summaries of what it is seeing and respond to natural language queries with visual data. Here, we describe the progress of this ongoing research thus far, including the robotic exploration data and the development of annotation tasks.

Building an Audio Description Multilingual Multimodal Corpus: The VIW Project Anna Matamala and Marta Villegas

This paper presents an audio description multilingual and multimodal corpus developed within the VIW (Visual Into Words) Project. A short fiction film was created in English for the project and was dubbed into Spanish and Catalan. Then, 10 audio descriptions in Catalan, 10 in English and 10 in Spanish were commissioned to professional describers. All these data were annotated at two levels (cinematic and linguistic) and were analysed using ELAN. The corpus is an innovative tool in the field of audiovisual translation research which allows for comparative analyses both intralingually and interlingually. Examples of possible analyses are put forward in the paper.

A Dataset and Evaluation Metric for Coherent Text Recognition from Scene Images Jindřich Libovický and Pavel Pecina

In this paper, we deal with the extraction of textual information from scene images. So far, the task of Scene Text Recognition (STR) has focused only on the recognition of isolated words and, for simplicity, omits words which are too short. Such an approach is not suitable for further processing of the extracted text. We define a new task which aims at extracting coherent blocks of text from scene images with regard to their future use in natural language processing tasks, mainly machine translation. For this task, we enriched the annotation of existing STR benchmarks in English and Czech and propose a string-based evaluation measure that correlates highly with human judgement.

Posters and Demos Tuesday 24 May, 15:00 – 16:00 Chairperson: TBD

Thinspiration and Anorexic Tweets Ian Wood

The online presence of anorexics and other people with eating disorders, the "pro-ana" (pro-anorexia) movement, has received much attention both in the media and in eating disorder research, with much contention about harmful and beneficial effects on the individuals who take part in the community. Several micro-blogging platforms are used by this community, notably Tumblr, Instagram and Twitter. I present a collection of tweets containing a selection of hashtags used by people with eating disorders, collected over a three-year period from November 2012. Images are a very prominent and important part of the collected data: 71% of the tweets contain images, and many of those tweets contain few or no words. In this demonstration, I will present some overall statistics and initial analyses of this data set, as well as opportunities for enriching the data by detecting image features specific to this context. I will provide examples of common image types and features and explain their relevance as symbols of the "thin ideal" and as indicators of the psychology of tweet authors, and how that relates to eating disorder research. A multimodal analysis of this data, with image features combined with text, promises to yield greater and more informative insights into the Twitter eating disorder and thinspiration community than text analysis alone.

Collecting a human-machine, human-human, and human-woz comparative social talk corpus Emer Gilmartin, Ketong Su, Yuyun Huang, Christy Elias, Benjamin R. Cowan and Nick Campbell

This paper describes the motivation for and collection of a multimodal corpus of short social interactions in three conditions - human-human, human-machine, and human-WOZ. The corpus has been designed for use in multimodal data-driven analysis of engagement in HMI, modelling of dyadic social talk for use in social spoken dialogue applications, and comparison of human-machine and human-human dialogue. The corpus design closely follows that of a similar French language corpus and will be used to compare social talk in the two languages.

Search and Annotation Tools for Heritage Language Spoken Corpora Kristin Hagen, Janne Bondi Johannessen, Anders Nøklestad and Joel Priestley

Spoken corpora come in many shapes and sizes, often with their own, unique requirements. This demonstration looks at work with heritage languages, using the Corpus of American Norwegian Speech to show some of the challenges and design considerations. Following a brief account of data collection, focus will be on automatic annotation and providing multimodal access to the corpus.

Coffee Break Tuesday 24 May, 16:00 – 16:30

Oral Session 4 Tuesday 24 May, 16:30 – 17:45 Chairperson: TBD

PHYSIOSTRESS: A Multimodal Corpus of Data on Acute Stress and Physiological Activation Patrice Boucher, Pierrich Plusquellec, Pierre Dufour, Najim Dehak, Patrick Cardinal and Pierre Dumouchel

This paper presents PHYSIOSTRESS, our data corpus of acute stress and physiological activation. It includes 26 experiments of acute stress based on the Trier Social Stress Test, and 24 experiments which combine a social task and a physical activity. These experiments were completed by 13 men and 13 women between 20 and 49 years old without identified cardiac, respiratory or mental disease. The psychosocial situation of each participant was evaluated with 5 questionnaires: the Rosenberg Self-Esteem Scale, the Perceived Stress Scale, the Trier Inventory for the Assessment of Chronic Stress, the International Physical Activity Questionnaire and a socio-demographic questionnaire. During each experiment, we recorded an ECG derivation, respiratory data, the momentum of the subject (from accelerometers), the internal sounds of the body, audio and video of the subject, and the subject's 3D movements (from a Microsoft Kinect sensor). The level of stress of each subject (no stress, low stress, medium stress or high stress) is annotated according to three references: the stress felt by the subject, the apparent stress (annotated by two observers) and the subject's level of salivary cortisol. PHYSIOSTRESS is a rare corpus of acute stress which combines measurements of heart and respiration with annotations of the salivary cortisol level, a standard in medical research for the evaluation of acute stress.

A Gesture-Speech Corpus on a Tangible Interface Dimitra Anastasiou and Kirsten Bergmann

This paper presents a corpus of hand gestures and speech which is created using a tangible user interface (TUI) for collaborative problem solving tasks. We present our initial work within the European Marie Curie project GETUI (GEstures in Tangible User Interfaces). The project mainly involves creating a taxonomy of gestures used in relation to a tangible tabletop which is placed at the Luxembourg Institute of Science and Technology (LIST). A preliminary user study showed that gesturing encourages the use of rapid epistemic actions by lowering cognitive load. Ongoing corpus collection studies provide insights into the impact of gestures on learning, collaboration and cognition, while also identifying cultural differences in gestures.

Filled pauses, Fillers and Familiarity in Spontaneous Conversations Costanza Navarretta

This paper presents a pilot study of the use of fillers, filled pauses and co-occurring gestures in Danish spontaneous conversations between two or three people who know each other well and in dyadic first encounters. Fillers, such as the English uh and um, are very common in spoken language, and previous research has indicated that they function as interaction management and/or discourse planning signals. In particular, filled pauses have been found to be very frequent when speakers have to express something challenging. Previous research has also found a correlation between the degree of familiarity of the conversants and the frequency of speech overlaps and of unimodal and multimodal feedback signals. In this study, we investigate whether the frequency of fillers and filled pauses and the familiarity degree of the conversants are connected, and we hypothesize that fillers and filled pauses will occur more frequently in the first encounters than in conversations between persons who know each other, because the communicative setting is more challenging in the former case. The results of our study confirm this hypothesis. Our study also shows that fillers and filled pauses co-occur with gestures more frequently in the first encounters than in the conversations between acquainted participants. This might be due to the fact that people who are familiar do not need to signal that they want to take or give the turn as strongly as people who do not know each other. However, since many factors can influence communication, these findings should be confirmed on more data.

Concluding Remarks Tuesday 24 May, 17:45 – 18:00

7th Workshop on the Representation and Processing of Sign Languages:

Corpus Mining

28 May 2016

ABSTRACTS

Editors:

Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie Hochgesang, Jette Kristoffersen, Johanna Mesch

Workshop Programme

09:00 – 10:30 On-stage Session A: Corpus Mining

10:30 – 11:00 Coffee break

11:00 – 13:00 Poster Session B: Corpora and Mining

13:00 – 14:00 Lunch break

14:00 – 16:00 Poster Session C: New Challenges for SL Corpora and Resources

16:00 – 16:30 Coffee break

16:30 – 18:00 On-stage Session D: SL Resources: Collaboration and Sharing

Workshop Organizers

Eleni Efthimiou Institute for Language and Speech Processing, Athens GR
Stavroula-Evita Fotinea Institute for Language and Speech Processing, Athens GR
Thomas Hanke Institute of German Sign Language, University of Hamburg, Hamburg DE
Julie Hochgesang Gallaudet University, Washington US
Jette Kristoffersen Centre for Sign Language, University College Capital, Copenhagen DK
Johanna Mesch Stockholm University, Stockholm SE

Workshop Programme Committee

Penny Boyes Braem Center for Sign Language Research, Basel CH
Annelies Braffort LIMSI/CNRS, Orsay FR
Onno Crasborn Radboud University, Nijmegen NL
Athanasia-Lida Dimou Institute for Language and Speech Processing, Athens GR
Sarah Ebling Institute of Computational Linguistics, University of Zurich, Zurich CH
Eleni Efthimiou Institute for Language and Speech Processing, Athens GR
Michael Filhol CNRS–LIMSI, Université Paris-Saclay, Orsay FR
Stavroula-Evita Fotinea Institute for Language and Speech Processing, Athens GR
Thomas Hanke Institute of German Sign Language, University of Hamburg, Hamburg DE
Julie Hochgesang Gallaudet University, Washington US
Matt Huenerfauth Rochester Institute of Technology, Rochester, NY, USA
Hernisa Kacorri City University New York (CUNY), New York, USA
Athanasios Katsamanis Computer Vision, Speech Communication and Signal Processing Group, National Technical University of Athens, Athens GR
Jette Kristoffersen Centre for Sign Language, University College Capital, Copenhagen DK
John McDonald DePaul University, Chicago US
Johanna Mesch Stockholm University, Stockholm SE
Carol Neidle Boston University, Boston US
Rosalee Wolfe DePaul University, Chicago US

Session A: Corpus Mining Saturday 28 May, 09:00 – 10:30 Chairperson: Julie Hochgesang On-stage Session

The Importance of 3D Motion Trajectories for Computer-based Sign Recognition Mark Dilsizian, Zhiqiang Tang, Dimitri Metaxas, Matt Huenerfauth and Carol Neidle Computer-based sign language recognition from video is a challenging problem because of the spatiotemporal complexities inherent in sign production and the variations within and across signers. However, linguistic information can help constrain sign recognition to make it a more feasible classification problem. We have previously explored recognition of linguistically significant 3D hand configurations, as start and end handshapes represent one major component of signs; others include hand orientation, place of articulation in space, and movement. Thus, although recognition of handshapes (on one or both hands) at the start and end of a sign is essential for sign identification, it is not sufficient. Analysis of hand and arm movement trajectories can provide additional information critical for sign identification. In order to test the discriminative potential of the hand motion analysis, we performed sign recognition based exclusively on hand trajectories while holding the handshape constant. To facilitate this evaluation, we captured a collection of videos involving signs with a constant handshape produced by multiple subjects; and we automatically annotated the 3D motion trajectories. 3D hand locations are normalized in accordance with invariant properties of ASL movements. We trained time-series learning-based models for different signs of constant handshape in our dataset using the normalized 3D motion trajectories. Results show significant computer-based sign recognition accuracy across subjects and across a diverse set of signs. Our framework demonstrates the discriminative power and importance of 3D hand motion trajectories for sign recognition, given known handshapes.
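
As an aside on the normalization step mentioned above, the paper does not spell out the exact scheme, so the following minimal Python sketch is purely illustrative and not the authors' method: it shows one common way to make 3D hand trajectories comparable across signers, by translating each frame to a body-centred origin (the mid-shoulder point) and scaling by shoulder width. The function name and array layout are assumptions made here for the example.

    import numpy as np

    def normalize_trajectory(hand, l_shoulder, r_shoulder):
        # hand, l_shoulder, r_shoulder: (T, 3) arrays of per-frame 3D positions.
        # Express the hand relative to the mid-shoulder point and scale by the
        # mean shoulder width, so trajectories are invariant to translation and
        # overall body size.
        origin = (l_shoulder + r_shoulder) / 2.0
        scale = np.linalg.norm(l_shoulder - r_shoulder, axis=1).mean()
        return (hand - origin) / scale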

Towards a Visual Sign Language Corpus Linguistics Thomas Hanke Visualisations have a long tradition in linguistics, as in many fields dealing with complex structure. New forms of representations have been introduced to Visual Linguistics in the recent past, e.g. to help the researcher find the needle in a haystack, i.e. corpus. Here we present visualisation services available in iLex making a combined corpus and lexical database visually accessible. While many approaches suggested for textual languages transfer to sign language data as well, others explore sign-specific structure, such as multi-dimensional concordances not being restricted to sequentiality. Experimental combinations of animated visualisation and image processing might support the researcher to compensate for incomplete high-quality (=manual) annotation. In the long run, we see the potential that visualisation and data manipulation go hand in hand, allowing future user interfaces that are less text-heavy than today’s sign language annotation environments.

Using Sign Language Corpora as Bilingual Corpora for Data Mining: Contrastive Linguistics and Computer-assisted Annotation Laurence Meurant, Anthony Cleve and Onno Crasborn More and more sign languages are now documented by large-scale digital corpora. But exploiting sign language (SL) corpus data remains subject to the time-consuming and expensive manual task of annotating. In this paper, we present ongoing research that aims at testing a new approach to better mine SL data. It relies on the methodology of corpus-based contrastive linguistics, exploiting SL corpora as bilingual corpora. We present and illustrate the main improvements we foresee in developing such an approach: downstream, for the benefit of the linguistic description and the bilingual (signed - spoken) competence of teachers, learners and the users; and upstream, in order to enable the automatisation of the annotation process of sign language data. We also describe the methodology we are using to develop a system able to turn SL corpora into searchable translation corpora, and to derive from it tool support for annotation.

Session B: Corpora and Mining Saturday 28 May, 11:00 – 13:00 Chairperson: Johanna Mesch Poster Session

Visualizing Lects in a Sign Language Corpus: Mining Lexical Variation Data in Lects of Swedish Sign Language Carl Börstell and Robert Östling In this paper, we discuss the possibilities for mining lexical variation data across (potential) lects in Swedish Sign Language (SSL). The data come from the SSL Corpus (SSLC), a continuously expanding corpus of SSL, whose latest release contains 43,307 annotated sign tokens, distributed over 42 signers and 75 time-aligned video and annotation files. After extracting the raw data from the SSLC annotation files, we created a database for investigating lexical distribution/variation across three possible lects, by merging the raw data with an external metadata file containing information about the age, gender, and regional background of each of the 42 signers in the corpus. We go on to present a first version of an easy-to-use graphical user interface (GUI) that can be used as a tool for investigating lexical variation across different lects, and demonstrate a few interesting findings. This tool makes it easier for researchers and non-researchers alike to have the corpus frequencies for individual signs visualized in an instant, and the tool can easily be updated with future expansions of the SSLC.
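
To make the database-building step above concrete, the following minimal pandas sketch joins per-token gloss annotations with signer metadata and tabulates gloss frequencies per region. It is not the SSLC tooling; the file names and column names (signer, gloss, region, age, gender) are invented for illustration, assuming tab-separated exports.

    import pandas as pd

    # Hypothetical exports: one row per annotated sign token, one row per signer.
    tokens = pd.read_csv("sslc_tokens.tsv", sep="\t")    # columns: signer, gloss
    meta = pd.read_csv("signer_metadata.tsv", sep="\t")  # columns: signer, region, age, gender

    # Attach sociolinguistic metadata to every token, then count glosses per region.
    merged = tokens.merge(meta, on="signer", how="left")
    freq_by_region = merged.groupby(["gloss", "region"]).size().unstack(fill_value=0)

    # Show the ten most frequent glosses and their regional distribution.
    top = freq_by_region.sum(axis=1).sort_values(ascending=False).index[:10]
    print(freq_by_region.loc[top])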

Linking Lexical and Corpus Data for Sign Languages: NGT Signbank and the Corpus NGT Onno Crasborn, Richard Bank, Inge Zwitserlood, Els van der Kooij, Anique Schüller, Ellen Ormel, Ellen Nauta, Merel van Zuilen, Frouke van Winsum and Johan Ros How can lexical resources for sign languages be integrated with corpus annotations? We answer this question by discussing an increasingly frequent scenario for sign language resources, where the lexical data are stored in an online lexical database that may also serve as a sign language dictionary, while the annotation data are offline files in the ELAN Annotation Format (EAF). There is by now broad consensus on the need for ID-glosses in corpus annotation, which in turn requires having at least a list of ID-glosses with a description of the phonological form and meaning of the signs. There is less of a consensus on standards for glossing, on practices of sign , and on the types of information that need to be stored in the lexical database. This paper contributes to the establishment of standards for sign language resources by discussing how two data resources for Sign Language of the Netherlands (NGT) are currently being integrated, using the ELAN annotation software for corpus annotation and an adaptation of the Auslan Signbank software as a lexical database. We discuss some of the present relations between two large NGT data sets, and outline some future developments that are foreseen.

From a Sign Lexical Database to an SL Golden Corpus – the POLYTROPON SL Resource Eleni Efthimiou, Evita Fotinea, Athanasia-Lida Dimou, Theodore Goulas, Panagiotis Karioris, Kyriaki Vasilaki, Anna Vacalopoulou and Michalis Pissaris The POLYTROPON lexicon resource is being created in an attempt i) to gather and recapture already available lexical resources of Greek Sign Language (GSL) in an up-to-date homogeneous manner, ii) to enrich these resources with new lemmas, and iii) to end up with a multipurpose-multiuse resource which can be equally exploited in end-user oriented educational/communication services and in supporting various SL technologies. The database that hosts the newly acquired resource incorporates various SL-oriented fields of information, including information on compounding, GSL synonyms, classifier qualities, lemma-related senses, semantic groupings etc., and also lemma coding for manual and non-manual articulation activity. It also provides linking of GSL and Modern Greek equivalent lemma pairs to serve bilingual use purposes. A by-product of considerable value is the parallel corpus derived from the GSL examples of use accompanying each lemma entry in the dictionary and their translations into Modern Greek. The annotation of the corpus for the entailed signs and the assignment of respective glosses, in combination with data capturing by both HD and Kinect cameras in three repetitions, allowed for the creation of a golden parallel corpus available to the community of SL technologies for experimentation with various approaches to SL recognition, MT and information retrieval.

Annotated Video Corpus on FinSL with Kinect and Computer-vision Data Tommi Jantunen, Outi Pippuri, Tuija Wainio, Anna Puupponen and Jorma Laaksonen This paper presents an annotated video corpus of Finnish Sign Language (FinSL) to which Kinect and computer-vision data have been appended. The video material consists of signed retellings of the stories Snowman and Frog, where are you?, elicited from 12 native FinSL signers in a dialogue setting. The recordings were carried out with 6 cameras directed toward the signers from different angles, and 6 signers were also recorded with one Kinect motion and depth sensing input device. All the material has been annotated in ELAN for signs, translations, grammar and prosody. To further facilitate research into FinSL prosody, computer-vision data describing the head movements and the aperture changes of the eyes and mouth of all the signers have been added to the corpus. The total duration of the material is 45 minutes, and the part of it that is permitted by research consents is available for research purposes via the LAT online service of the Language Bank of Finland. The paper briefly demonstrates the linguistic use of the corpus.

Methods for Recognizing Interesting Events within Sign Language Motion Capture Data Pavel Jedlička, Zdeněk Krňoul and Miloš Železný The rising popularity of motion capture in movie production has made this technology more robust and more accessible, and its utility for sign language capture and analysis is evident. The article deals with the usability of motion capture in creating sign language corpora. A large amount of the data acquired by motion capture has to be processed to provide usable data for a wide range of research areas, e.g. sign language recognition, translation, synthesis, linguistics, etc. The aim of this article is to explore possible methods to detect interesting events in the data using machine learning techniques. The result is a method for detecting the beginning and the end of a sign, hand location, finger and palm orientation, whether the sign is one- or two-handed, and symmetry in two-handed signs.

Centroid-Based Exemplar Selection of ASL Non-Manual Expressions using Multidimensional Dynamic Time Warping and MPEG4 Features Hernisa Kacorri, Ali Raza Syed, Matt Huenerfauth and Carol Neidle We investigate a method for selecting recordings of human face and head movements from a sign language corpus to serve as a basis for generating animations of novel sentences of American Sign Language (ASL). Drawing from a collection of recordings that have been categorized into various types of non-manual expressions (NMEs), we define a method for selecting an exemplar recording of a given type using a centroid-based selection procedure, with multivariate dynamic time warping (DTW) as the distance function. Through intra- and inter-signer methods of evaluation, we demonstrate the efficacy of this technique, and we note the useful potential of the DTW visualizations generated in this study for linguistic researchers collecting and analyzing sign language corpora.
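
The selection procedure described above can be sketched in a few lines of Python; this is not the authors' implementation, only an illustration of the idea: compute pairwise multivariate DTW distances (with Euclidean local cost over the per-frame feature vectors, e.g. MPEG-4 facial parameters) and pick the recording whose summed distance to the others is smallest.

    import numpy as np

    def multivariate_dtw(a, b):
        # DTW distance between two sequences of feature vectors (arrays of shape
        # (T, D)), using the Euclidean distance between frames as local cost.
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def centroid_exemplar(recordings):
        # Return the index of the recording with the smallest summed DTW
        # distance to all other recordings of the same NME type.
        totals = [sum(multivariate_dtw(r, s) for s in recordings) for r in recordings]
        return int(np.argmin(totals))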

The Usability of the Annotation Jarkko Keränen, Henna Syrjälä, Juhana Salonen and Ritva Takkinen Several corpus projects for sign languages have tried to establish conventions and standards for the annotation of signed data. When discussing corpora, it is necessary to develop a way of considering and evaluating holistically the features and problems of annotation. This paper aims to develop a conceptual framework for the evaluation of the usability of annotations. The purpose of the framework is not to give conventions for annotating but to offer tools for the evaluation of the usability of the annotation, in order to make annotations more usable and make it possible to justify and explain decisions about annotation conventions. Based on our experience of annotation in the corpus project of Finland’s Sign Languages (CFINSL), we have developed six principles for the evaluation of annotation. In this article, using these six principles, we evaluate the usability of the annotations in CFINSL and other corpus projects. The principles have offered benefits in CFINSL: We are able to evaluate our annotations more systematically and holistically than ever before. Our work can be seen as an effort to bring a framework of usability to corpus work.

Transitivity in RSL: A Corpus-based Account Vadim Kimmelman A recent typological study of transitivity (Haspelmath, 2015) demonstrated that verbs can be ranked according to transitivity prominence, that is, according to how likely they are to be transitive cross-linguistically. This ranking can be argued to be cognitively rooted (based on the properties of the events and their participants) or frequency-related (based on the frequency of different types of events in the real world). Both types of explanation imply that the transitivity ranking should apply across modalities. To test this, we analysed the transitivity of frequent verbs in the corpus of Russian Sign Language by calculating the proportion of overt direct and indirect objects and clausal complements. We found that transitivity, as expressed by the proportion of overt direct objects, is highly positively correlated with the transitivity prominence determined cross-linguistically. We thus confirmed the modality-independent nature of the transitivity ranking.
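
A hedged sketch of the kind of computation described above (not the author's scripts; column names and file formats are invented): estimate, for each frequent verb, the proportion of tokens with an overt direct object, and correlate that proportion with a cross-linguistic transitivity prominence ranking using Spearman's rho.

    import pandas as pd
    from scipy.stats import spearmanr

    # Hypothetical exports: one row per verb token with an overt-object flag,
    # and one row per verb with its cross-linguistic prominence score.
    tokens = pd.read_csv("rsl_verb_tokens.csv")   # columns: verb, has_direct_object (0/1)
    prominence = pd.read_csv("prominence.csv")    # columns: verb, transitivity_prominence

    props = (tokens.groupby("verb")["has_direct_object"]
                   .mean()                        # proportion of overt direct objects
                   .rename("prop_overt_do")
                   .reset_index())

    merged = prominence.merge(props, on="verb")
    rho, p = spearmanr(merged["transitivity_prominence"], merged["prop_overt_do"])
    print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")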

Automatic Alignment of HamNoSys Subunits for Continuous Sign Language Recognition Oscar Koller, Hermann Ney and Richard Bowden This work presents our recent advances in the field of automatic processing of sign language corpora targeting continuous sign language recognition. We demonstrate how generic annotations at the articulator level, such as HamNoSys, can be exploited to learn subunit classifiers. Specifically, we explore cross-language-subunits of the hand orientation modality, which are trained on isolated signs of publicly available lexicon data sets for Swiss German and Danish sign language and are applied to continuous sign language recognition of the challenging RWTH-PHOENIX-Weather corpus featuring German sign language. We observe a significant reduction in word error rate using this method.

Designing a Lexical Database for a Combined Use of Corpus Annotation and Dictionary Editing Gabriele Langer, Thomas Troelsgård, Jette Kristoffersen, Reiner Konrad, Thomas Hanke and Susanne König In a combined corpus-dictionary project, you would need one lexical database that could serve as a shared “backbone” for both corpus annotation and dictionary editing, but it is not that easy to define a database structure that applies satisfactorily to both these purposes. In this paper, we will exemplify the problem and present ideas on how to model structures in a lexical database that facilitate corpus annotation as well as dictionary editing. The paper is a joint work between the DGS Corpus Project and the DTS Dictionary Project. The two projects come from opposite sides of the spectrum (one adjusting a lexical database grown from dictionary making for corpus annotating, one building a lexical database in parallel with corpus annotation and editing a corpus-based dictionary), and we will consider requirements and feasible structures for a database that can serve both corpus and dictionary.

A New Tool to Facilitate Prosodic Analysis of Motion Capture Data and a Data-driven Technique for the Improvement of Avatar Motion John McDonald, Rosalee Wolfe, Ronnie Wilbur, Robyn Moncrief, Evie Malaia, Sayuri Fujimoto, Souad Baowidan and Jessika Stec Researchers have been investigating the potential rewards of utilizing motion capture for linguistic analysis, but have encountered challenges when processing it. A significant problem is the nature of the data: along with the signal produced by the signer, it also contains noise. The first part of this paper is an exposition of the origins of noise and its relationship to motion capture data of signed utterances. The second part presents a tool, based on established mathematical principles, for removing or isolating noise to facilitate prosodic analysis. This tool yields surprising insights into a data-driven strategy for a parsimonious model of life-like appearance in a sparse key-frame avatar.

The French Belgian Sign Language Corpus. A User-Friendly Corpus Searchable Online Laurence Meurant, Aurélie Sinte and Eric Bernagou This paper presents the first large-scale corpus of French Belgian Sign Language (LSFB) available via an open access website (www.corpus-lsfb.be). Visitors can search within the data and the metadata. Various tools allow the users to find sign language video clips by searching through the annotations and the lexical database, and to filter the data by signer, by region, by task or by keyword. The website includes a lexicon linked to an online LSFB dictionary.

Sign Classification in Sign Language Corpora with Deep Neural Networks Lionel Pigou, Mieke Van Herreweghe and Joni Dambre Automatic and unconstrained sign language recognition (SLR) in image sequences remains a challenging problem. The variety of signers, backgrounds, sign executions and signer positions makes the development of SLR systems very challenging. Current methods try to alleviate this complexity by extracting engineered features to detect hand shapes, hand trajectories and facial expressions as an intermediate step for SLR. Our goal is to approach SLR based on feature learning rather than feature engineering. We tackle SLR using recent advances in the domain of deep learning with deep neural networks. The problem is approached by classifying isolated signs from the Corpus VGT (Flemish Sign Language Corpus) and the Corpus NGT (Dutch Sign Language Corpus). Furthermore, we investigate cross-domain feature learning to boost performance and cope with the smaller number of Corpus VGT annotations.

A Digital Moroccan Sign Language STEM Thesaurus Abdelhadi Soudi and Corinne Vinopol This paper presents a gesture-based linguistic approach to assisting Moroccan Sign Language (MSL) users in understanding and appropriately using Science, Technology, Engineering and Mathematics (STEM) terminology by creating the first-ever digital MSL STEM Thesaurus. The thesaurus enables Deaf individuals to describe signs and obtain Standard Arabic word equivalents, concept graphics, and definitions in both MSL and Arabic. This is accomplished not only by providing words comparable to signs that they know, but also by providing other information (e.g., signed definitions) that helps differentiate Arabic word choices. The thesaurus is supported by a Concordancer for better illustration and disambiguation of STEM terms. The thesaurus will likely prove to be an invaluable tool that will enable children and adults who rely on MSL for communication, both deaf and otherwise communication impaired, to better understand and write knowledgeably and clearly on STEM topics, and pass standardized assessments.

Online Concordancer for the Slovene Sign Language Corpus SIGNOR Špela Vintar and Boštjan Jerko We present the first version of an online concordancing tool for the Slovene Sign Language SIGNOR corpus. The corpus search tool allows querying the SIGNOR annotated database by glosses and displays the hits in a keyword-in-context (KWIC) format, accompanied by frequency information, HamNoSys transcription and metadata. The main purpose of the tool is linguistic research, more specifically sign language lexicography, but also providing general public access to the corpus.

Session C: New Challenges for SL Corpora and Resources Saturday 28 May, 14:00 – 16:00 Chairperson: Jette Kristoffersen Poster Session

The SIGNificant Chance Project and the Building of the First Hungarian Sign Language Corpus Csilla Bartha, Margit Holecz and Szabolcs Varjasi The Act CXXV of 2009 on Hungarian Sign Language and the Use of Hungarian Sign Language recognizes Hungarian Sign Language (HSL) as an independent natural language; moreover, it provides the legal framework to introduce bilingual education (HSL-Hungarian) in 2017. In order to establish the linguistic background for bilingual education, it was crucial to carry out linguistic research on HSL that is sociolinguistically underpinned and includes corpus-based research. This research also aims to standardize HSL for educational purposes with the highest possible degree of community engagement. During the SIGNificant Chance project, a sign language corpus (approximately 1750 hours) was created. Nation-wide fieldwork was conducted (five regions, nine venues). 147 sociolinguistic interviews and 27 grammatical tests (with 54 participants) were recorded in multiple-camera settings. There were also Hungarian competency tests and narrative interviews conducted with selected participants in order to make possible a complex description of their different linguistic practices in different discursive contexts. We are using ELAN and three different templates to analyze the collected data for different purposes (a sociolinguistic-grammatical template, another for short-term project purposes, and one for the dictionary). Some parts of the annotation work have been finished, which contributed to the writing of the basic grammar of HSL and the creation of a small corpus-based dictionary of HSL.

Collecting and Analysing a Motion-Capture Corpus of French Sign Language Mohamed-El-Fatah Benchiheub, Bastien Berret and Annelies Braffort This paper presents a 3D corpus of motion capture data on French Sign Language (LSF), which is the first one available for the scientific community for pluridisciplinary studies. The paper also exhibits the usefulness of performing kinematic analysis on the corpus. The goal of the analysis is to acquire informative and quantitative knowledge for the purpose of better understanding and modelling LSF movements. Several LSF native signers are involved in the project. They were asked to describe 25 pictures in a spontaneous way while the 3D position of various body parts was recorded. Data processing includes identifying the markers, interpolating the information of missing frames, and importing the data to an annotation software to segment and classify the signs. Finally, we present the results of an analysis performed to characterize information-bearing parameters and use them in a data mining and modelling perspective.

Digging into Signs: Emerging Annotation Standards for Sign Language Corpora Kearsy Cormier, Onno Crasborn and Richard Bank This paper describes the creation of annotation standards for glossing sign language corpora as part of the Digging into Signs project (2014-2015). This project was based on the annotation of two major sign language corpora, the BSL Corpus (British Sign Language) and the Corpus NGT (Sign Language of the Netherlands). The focus of the gloss annotations in these data sets was in line with the starting point of most sign language corpora: to make general corpus annotation maximally useful regardless of the particular research focus. Therefore, the joint annotation guidelines that were the output of the project focus on basic annotation of hand activity, aiming to ensure that annotations can be made in a consistent way irrespective of the particular sign language. The annotation standard provides annotators with the means to create consistent annotations for various types of signs that in turn will facilitate cross-linguistic research. At the same time, the standard includes alternative strategies for some types of signs. In this paper we outline the key features of the joint annotation conventions arising from this project, describe the arguments around providing alternative strategies in a standard, as well as discuss reliability measures and improvement to annotation tools.

Recognition of Sign Language Hand Shape Primitives With Leap Motion Burcak Demircioğlu, Güllü Bülbül and Hatice Köse In this study, a rule-based heuristic method is proposed to recognize the primitive hand shapes of Turkish Sign Language (TID) as sensed by a Leap Motion device. The hand shape data set was also tested with a selected machine learning method (Random Forest), and the results of the two approaches were compared. The proposed system required less data than the machine learning method, and its success rate was higher.
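
For readers unfamiliar with the machine-learning baseline mentioned above, a generic scikit-learn Random Forest set-up looks roughly as follows. This is not the authors' code; the feature files, their shapes and the hyperparameters are placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Placeholder data: one row of Leap-Motion-derived features (e.g. joint angles,
    # fingertip distances) per sample, one hand-shape label per sample.
    X = np.load("hand_features.npy")   # hypothetical file, shape (n_samples, n_features)
    y = np.load("hand_labels.npy")     # hypothetical file, shape (n_samples,)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))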

Linking a Web Lexicon of DSGS Technical Signs to iLex Sarah Ebling and Penny Boyes Braem A website for a lexicon of Swiss German Sign Language equivalents of technical terms was developed several years ago using Flash technology. In the intervening years, the backend research database was migrated from FileMaker to iLex. Here, we report on the development of a web platform that provides access to the same technical signs by extracting the relevant information directly from iLex. This new platform has many advantages: New sets of signs for technical terms can be added or existing ones modified in iLex at any time, and changes are reflected in the web platform upon refreshing the browser. Just as importantly, the new platform can now also be accessed through all major mobile operating systems, as it does not rely on Flash. We describe how information on the glosses, keywords, videos of citation forms, status, and uses of the technical signs is represented in iLex and how the corresponding web platform was built.

Juxtaposition as a Form Feature - Syntax Captured and Explained rather than Assumed and Modelled Michael Filhol and Mohamed Nassime Hadjadj In this article, we report on a study conducted to further the design of a formal grammar model (AZee), confronting it with the traditional notion of syntax along the way. The model was initiated to work as an unambiguous linguistic input for signing avatars, accounting for all simultaneous articulators while doing away with the generally assumed and separate levels of lexicon, syntax, etc. Specifically, the work presented here focuses on juxtaposition in signed streams (a fundamental feature of syntax), which we propose to consider as a mere form feature and use as the starting point of data-driven searches for grammatical rules. The result is tremendous progress in the coverage of LSF grammar, and fairly strong evidence that our initial goal is attainable. We give concrete examples of rules and a clear illustration of the recursive mechanics of the grammar producing LSF forms, and conclude with theoretical remarks on the AZee paradigm in terms of syntax, word/sign order and the like.

Examining Variation in the Absence of a 'Main' ASL Corpus: The Case of the Philadelphia Signs Project N. Fisher, Julie Hochgesang and Meredith Tamminga The Philadelphia Signs Project emerged from the community’s desire to document their local ASL variety, originating at the Pennsylvania School for the Deaf. This variety is anecdotally reported to be notably different from other ASL varieties. This project is founded upon the consistent observations of this marked difference. We aim to uncover what, if anything, makes the Philadelphia variety distinct from other varieties in the United States. Beyond some lexical items, it is unknown what linguistic features mark this variety as “different.” Comparison to other ASL varieties is difficult given the absence of a main and representative ASL corpus. This paper describes our sociolinguistic data collection methods, annotation procedures, and archiving approach. We summarize several preliminary observations about potentially dialect-specific features beyond the lexicon, such as unusual phonological alternations and word orders. Finally, we outline our plans to test these features with surveys for non-Philadelphians using Philadelphia lexical items, extending to more abstract phonological and syntactic features. This line of inquiry supplements our current archiving practices, facilitating comparison with a main corpus in the future. We maintain that even without a main corpus for comparison, it is essential to document a language variety when the community wishes to preserve it.

Slicing your SL data into Basic Discourse Units (BDUs). Adapting the BDU model (syntax + prosody) to Signed Discourse Sílvia Gabarró-López and Laurence Meurant This paper aims to propose a model for the segmentation of signed discourse by adapting the Basic Discourse Units (BDU) Model. This model was conceived for spoken data and allows the segmentation of both monologues and dialogues. It consists of three steps: delimiting syntactic units on the basis of Dependency Grammar (DG), delimiting prosodic units on the basis of a set of acoustic cues, and finding the convergence points between syntactic and prosodic units in order to establish BDUs. A corpus containing data from French Belgian Sign Language (LSFB) will first be segmented according to the principles of DG. After establishing a set of visual cues equivalent to the acoustic ones, a prosodic segmentation will be carried out independently. Finally, the convergence points between syntactic and prosodic units will give rise to BDUs. The ultimate goal of adapting the BDU Model to the signed modality is not only to allow the study of the position of discourse markers (DMs) as in the original model, but also to address a controversial issue in SL research, namely the segmentation of SL corpus data, for which no satisfactory solution has been found so far.

Evaluating User Experience of the Online Dictionary of the Slovenian Sign Language Ines Kozuh, Primož Kosec and Matjaž Debevc The extensive use of mobile devices and tablets has resulted in an increasing need for the ubiquitous availability of different types of dictionaries online. The purpose of our study was to evaluate the user experience and usability of the online dictionary of the Slovenian sign language. Six Slovenian hearing non-signers were included in the study. While using the online dictionary, participants were asked to complete six tasks: searching for a letter, a word, a written explanation of a word, a thematic section and a particular fairy tale, as well as completing a quiz. In addition, the participants evaluated the usability of the online dictionary with the System Usability Scale. The findings revealed that participants perceived the tasks "searching for the word" and "searching for the thematic section" as the most difficult and "completing the quiz" as the easiest. Regarding the time measured, the task "searching for the word" was the most time-consuming and the task "searching for the letter" the least. This study provides insights into how Slovenian hearing users perceive the online dictionary of the Slovenian sign language and could be the basis for future research with users of Slovenian sign language.
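
The System Usability Scale scoring used above is standardized (Brooke, 1996): odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to give a 0–100 score. A small Python sketch with invented answers (not data from this study):

    def sus_score(responses):
        # responses: ten Likert answers (1-5), in SUS item order.
        odd = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
        even = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
        return (odd + even) * 2.5

    # One made-up participant:
    print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 1]))  # -> 82.5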

Semiautomatic Data Glove Calibration for Sign Language Corpora Building Zdeněk Krňoul, Jakub Kanis, Miloš Železný and Luděk Müller The article deals with a recording procedure for sign language dataset building, mainly for avatar synthesis systems. A combined data-glove and optical capture technique is considered. We present initial experiences with the motion capture data produced by the CyberGlove3 gloves and a set of new tools to ease the recording process, glove calibration and proper interpretation by the 3D model. The result is a more flexible solution for sign language capture, integrating manual glove calibration with automatic initialization, time synchronization and high-resolution sensor readings.

"Non-tokens": When Tokens Should not Count as Evidence of Sign Use Gabriele Langer, Thomas Hanke, Reiner Konrad and Susanne König Lemmatised corpora consist of tokens as instantiations of signs (types). Tokens usually count as evidence of the signs' use. The frequency of tokens is an important criterion for the lexical status of a sign. In combination with metadata on the signers' sociolinguistic backgrounds, such as age, gender, and origin, these tokens can also be analysed for regional and sociolinguistic variation. However, corpora may also contain instances of sign use that do not reflect the sign use of the person uttering them. This is particularly true for metalinguistic discussions of signs, malformed signing and slips of the hand, as well as other phenomena such as copying/repeating signs of the interlocutors or from stimulus material. In our presentation we list and discuss different kinds of sign use (tokens) that should either not be counted as proof of a sign type at all, or at least not as evidence of regular sign use by that particular person. Examples of these "non-tokens" are taken either from the DGS Corpus or from uploaded video answers of the DGS Feedback. We also discuss some implications for how to annotate these cases.

Creating Corpora of Finland's Sign Languages Juhana Salonen, Ritva Takkinen, Anna Puupponen, Henri Nieminen and Outi Pippuri This paper discusses the process of creating corpora of the sign languages used in Finland, Finnish Sign Language (FinSL) and Finland-Swedish Sign Language (FinSSL). It describes the process of recruiting informants and collecting data, editing and storing the data, the general principles of annotation, and the creation of a web-based lexical database, the FinSL Signbank, developed on the basis of the NGT Signbank, which is a branch of the Auslan Signbank. The corpus project of Finland's Sign Languages (CFINSL) started in 2014 at the Sign Language Centre of the University of Jyväskylä. Its aim is to collect conversations and narrations from 80 FinSL users and 20 FinSSL users who are living in different parts of Finland. The participants are filmed in signing sessions led by a native signer in the Audio-visual Research Centre at the University of Jyväskylä. The edited material is stored in the IDA storage service provided by CSC – IT Center for Science, and the metadata will be saved in CMDI format. Every informant is asked to sign a consent form in which they state for what kinds of purposes their signing can be used. The corpus data are annotated using the ELAN tool. At the moment, annotations are created on the levels of glosses and translation.

Session D: SL Resources: Collaboration and Sharing Saturday 28 May, 16:30 – 18:00 Chairperson: Thomas Hanke On-stage Session

Towards an Annotation of Syntactic Structure in the Swedish Sign Language Corpus Carl Börstell, Mats Wirén, Johanna Mesch and Moa Gärdenfors This paper describes on-going work on extending the annotation of the Swedish Sign Language Corpus (SSLC) with a level of syntactic structure. The basic annotation of the SSLC in ELAN consists of six tiers: four for sign glosses (two tiers for each signer, one for each of a signer's hands) and two for written Swedish translations (one for each signer). In an additional step by Östling et al. (2015), all glosses of the corpus have been further annotated for parts of speech. Building on these previous steps, we are now developing annotation of clause structure for the corpus, based on meaning and form. We define a clause as a unit in which a predicate asserts something about one or more elements (the arguments). The predicate can be a (possibly serial) verbal or a nominal. In addition to predicates and their arguments, criteria for delineating clauses include non-manual features such as body posture, head movement and eye gaze. The goal of this work is to arrive at two additional annotation tier types in the SSLC: one in which the sign language texts are segmented into clauses, and the other in which the individual signs are annotated for their argument types.

Preventing Too Many Cooks from Spoiling the Broth: Some Questions and Suggestions for Collaboration between Projects in iLex Penny Boyes Braem and Sarah Ebling Collaborative development of sign language resources is fortunately becoming increasingly common. In the spirit of collaboration, having one shared lexicon for sign language projects is a big advantage. However, this poses challenges to aspects pertaining to consistency of data, privacy of informants, and intellectual property. This contribution points out some problems that arise, especially if the common data comes from projects of different institutions. We describe what we have found to be a sustainable legal framework for our collaborative iLex corpus lexicon, giving an overview of the different kinds of partners involved in the creation and exploitation of a shared iLex corpus lexicon and providing our answers to the questions we faced along with an outlook for the future.

Community Input on Re-consenting for Data Sharing Deborah Chen Pichler, Julie Hochgesang, Doreen Simons and Diane Lillo-Martin Development of large sign language corpora is on the rise, and online sharing of such corpora promises unprecedented access to high quality sign language data, with significant time-saving benefits for sign language acquisition research. Yet data sharing also brings complex logistical challenges for which few standardized practices exist, particularly with regard to the protection of participant rights. Although some ethical guidelines have been established for large-scale archiving of spoken or transcribed language data, not all of these are feasible for sign language video data, especially given the relatively small and historically vulnerable communities from which sign language data are typically collected. Our primary focus is the process of re-consenting participants whose original informed consent did not address the possibility of sharing their video data. We describe efforts to develop ethically sound, community-supported practices for data sharing and archiving, summarizing feedback collected from two focus groups including a cross-section of community stakeholders. Finally, we discuss general themes that emerged from the focus groups, placing them in the wider context of similar discussions previously published by other researchers grappling with these same issues, with the goal of contributing to best-practices guidelines for data archiving and sharing in the sign language research community.

ISA-12: 12th Joint ACL-ISO Workshop on Interoperable Semantic Annotation

Saturday, May 28, 2016

ABSTRACTS

Editor:

Harry Bunt, Tilburg University

Workshop Programme

08:45 – 09:00 Registration
09:00 – 09:10 Opening

09:15 – 09:45 Claire Bonial, Susan Brown and Martha Palmer: A Lexically-Informed Upper Level Event Ontology
09:45 – 10:00 James Pustejovsky, Martha Palmer, Annie Zaenen, and Susan Brown: Integrating VerbNet and GL Predicative Structures
10:00 – 10:15 Elisabetta Jezek, Anna Feltracco, Lorenzo Gatti, Simone Magnolini and Bernardo Magnini: Mapping Semantic Types onto WordNet Synsets
10:15 – 10:30 Petya Osenova and Kiril Simov: Cross-level Semantic Annotation of Bulgarian Treebank

10:30 – 11:00 Coffee break

11:00 – 11:30 Ielka van der Sluis, Shadira Leito and Gisela Redeker: Text-Picture Relations in Cooking Instructions
11:30 – 12:00 Kiyong Lee: An Abstract Syntax for ISOspace with its Reformulated
12:00 – 12:15 James Pustejovsky, Kiyong Lee and Harry Bunt: Proposed ISO Standard Amendment AMD 24617-7 ISOspace

12:15 – 14:00 Lunch break, during which:
12:30 – 13:30 ISO TC 37/SC 4 Working groups 2 and 5 plenary meeting

14:00 – 14:30 Ludivine Crible: Discourse Markers and Disfluencies: Integrating Functional and Formal Annotations
14:30 – 15:00 Harry Bunt and Rashmi Prasad: ISO DR-Core: Core Concepts for the Annotation of Discourse Relations
15:00 – 15:15 Benjamin Weiss and Stefan Hillmann: Feedback Matters: Applying Dialog Act Annotation to Study Social Attractiveness in Three-Party Conversations
15:15 – 15:30 Andreea Macovei and Dan Cristea: Time Frames: Rethinking the Way to Look at Texts
15:30 – 15:45 Tuan Do, Nikhil Krishnaswamy, and James Pustejovsky: ECAT: Event Capture Annotation Tool

15:45 – 16:15 Tea break

16:15 – 16:45 Julia Lavid, Marta Carretero and Juan Rafael Zamorano: Contrastive (English-Spanish) Annotation of Epistemicity in the MULTINOT Project: Preliminary Steps
16:45 – 17:15 Elisa Ghia, Lennart Kloppenburg, Malvina Nissim and Paola Pietrandrea: A Construction-Centered Approach to the Annotation of Modality
17:15 – 17:30 Kyeongmin Rim: MAE 2: Portable Annotation Tool for General Natural Language Use

17:30 – 17:31 ISA-12 Workshop Closing, followed by discussion of proposed new ISO activities:

17:31 – 17:45 James Pustejovsky and Kiyong Lee: Proposal for New ISO activity PWI 24617-x ISOspaceSem
17:45 – 18:00 James Pustejovsky: Proposal for New ISO activity PWI 24617-x VoxML

Workshop Organizers

Harry Bunt Tilburg University
Nancy Ide Vassar College, Poughkeepsie, NY
Kiyong Lee Korea University, Seoul
James Pustejovsky Brandeis University, Waltham, MA
Laurent Romary INRIA and Humboldt Universität Berlin

Workshop Programme Committee

Jan Alexandersson DFKI, Saarbrücken
Harry Bunt Tilburg University
Nicoletta Calzolari CNR-ILC, Pisa
Thierry Declerck DFKI, Saarbrücken
Liesbeth Degand UCL, Louvain-la-Neuve
David DeVault USC, Playa Vista, CA
Alex Chengyu Fang City University of Hong Kong
Robert Gaizauskas University of Sheffield
Daniel Hardt Copenhagen Business School
Koiti Hasida Tokyo University
Elisabetta Jezek University of Pavia
Michael Kipp Augsburg University of Applied Sciences
Kiyong Lee Korea University, Seoul
Philippe Muller Université Paul Sabatier, Toulouse
Malvina Nissim CLCG, University of Groningen
Volha Petukhova Universität des Saarlandes, Saarbrücken
Paola Pietrandrea Université de Tours and CNRS LLL
Andrei Popescu-Belis IDIAP, Martigny
Laurent Prévot Université Aix-Marseille
James Pustejovsky Brandeis University, Waltham, MA
Laurent Romary INRIA and Humboldt Universität Berlin
Ted Sanders University of Utrecht
Manfred Stede Universität Potsdam
Thorsten Trippel University of Tübingen
Piek Vossen Free University Amsterdam
Annie Zaenen Stanford University
Sandrine Zufferey Université de Fribourg

Session 1 9:00 – 10:30

A Lexically-Informed Upper Level Event Ontology Claire Bonial, Susan Windisch Brown and Martha Palmer

This paper summarizes ISO 24617-8 (ISO DR-Core), a new part of the ISO SemAF framework for semantic annotation. Within this framework a range of standards is developed to support the inter- operable annotation of semantic phenomena. The effort to develop a standard for the annotation of semantic relations in discourse is split into two parts, of which ISO 24617-8 concerns the first part, formulating desiderata for the annotation of discourse relations and providing clear definitions for a set of ‘core’ discourse relations, based on an analysis of a range of theoretical approaches and annotation efforts. Following the ISO principles for semantic annotation, an abstract syntax as well as a concrete XML-based syntax for annotations were defined, together with a formal semantics. Mappings are provided between the ISO core relations and various other annotation schemes.

Verb Meaning in Context: Integrating VerbNet and GL Predicative Structures James Pustejovsky, Martha Palmer, Annie Zaenen, and Susan Brown

This paper reports on aspects of a new research project aimed at enriching VerbNet’s predicative structures with representations and mechanisms from Generative Lexicon Theory. This involves the introduction of systematic predicative enrichment to the verb’s predicate structure, including an explicit identification of the mode of opposition structure inherent in the predicate. In addition, we explore a GL-inspired semantic componential analysis over VerbNet classes, in order to identify coherent semantic cohorts within the classes.

Mapping Semantic Types onto WordNet Synsets Elisabetta Jezek, Anna Feltracco, Lorenzo Gatti, Simone Magnolini and Bernardo Magnini

In this paper, we report the results of an experiment aimed at automatically mapping corpus-derived Semantic Types to WordNet synsets. The algorithm for the automatic alignment of Semantic Types with WordNet synsets relies on lexical correspondence, i.e. it performs an automatic alignment of Semantic Type labels with the corresponding WordNet entry nouns, when present (for example, the Semantic Type [[Activity]] is mapped to synsets containing the entry noun activity#n). In this way, 150 Types out of 180 are mapped automatically, while 30 gaps have to be resolved manually. Automatic mapping based on lexical correspondence, however, does not guarantee that the mapping is good, i.e. that the items which make up the extension of a certain Semantic Type match the set of hyponyms of the corresponding synset(s). An evaluation of 43 Semantic Types against a gold standard reveals that, for 30% of them, a manual revision is needed.
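To make the lexical-correspondence idea concrete, here is a minimal Python sketch (not the authors' implementation) that looks up a Semantic Type label such as [[Activity]] among WordNet noun synsets via NLTK; the label normalisation and the filtering step are illustrative assumptions, and the WordNet data is assumed to have been downloaded with nltk.download('wordnet').

# Minimal sketch: map a Semantic Type label to WordNet synsets by lexical
# correspondence, i.e. look the label up as a noun and keep noun synsets
# that contain it as a lemma.
from nltk.corpus import wordnet as wn

def map_type_to_synsets(type_label):
    """Return candidate noun synsets for a Semantic Type label such as '[[Activity]]'."""
    label = type_label.strip("[]").lower()           # [[Activity]] -> activity
    candidates = wn.synsets(label, pos=wn.NOUN)      # all noun senses for the label
    # keep only synsets that actually contain the label as a lemma name
    return [s for s in candidates if label in [l.name().lower() for l in s.lemmas()]]

if __name__ == "__main__":
    for syn in map_type_to_synsets("[[Activity]]"):
        print(syn.name(), "-", syn.definition())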

189 Cross-level Semantic Annotation of Bulgarian Treebank Petya Osenova and Kiril Simov

The paper focuses on the cross-level semantic annotation of BulTreeBank. It discusses the annotation of distinct lexemes as well as multiword expressions with senses from the BTB WordNet, the valency frames dictionary, and DBpedia URIs and classes. One important application of the semantically annotated treebank is also discussed, namely improving the knowledge-based Word Sense Disambiguation task via the extraction of new semantic relations.

Session 2 11:00 – 12:15

Text-Picture Relations in Cooking Instructions Ielka van der Sluis, Shadira Leito and Gisela Redeker

Like many other instructions, recipes on packages with ready-to-use ingredients for a dish combine a series of pictures with short text paragraphs. The information presentation in such multimodal instructions can be compact (either text or picture) and/or cohesive (text and picture). In an exploratory corpus study, 30 instructions for semi-prefabricated meals were annotated for text-picture relations. A slight majority of the 452 actions in the corpus were presented only textually. A third were presented in text and picture, indicating a moderate amount of cohesion. A minority of 31 actions (7%) were presented only pictorially, suggesting that the potential for compact multimodal presentation may be rather limited in these instructions.

An Abstract Syntax for ISOspace with its <moveLink> Reformulated Kiyong Lee

ISOspace introduces the movement link, tagged <moveLink>, to annotate how motions are related to spatial entities in language. As pointed out in the SemAF principles (ISO 24617-6), ISOspace overlaps with SemAF-SR (ISO 24617-5), which treats semantic roles in general. It also fails to conform to the link structure <η, E, ρ> as formulated in Bunt et al. (2016). To resolve these problems, we first construct the general abstract syntax ASyn of annotation structures, on which the abstract syntax ASyn_isoSpace of ISOspace is instantiated. Following the two axioms on motion-events and event-paths, discussed by Pustejovsky and Yocum (2013), we then propose to restore the event-path, introduced earlier by Pustejovsky et al. (2010), as a genuine basic entity in the abstract syntax, while implementing it as such in a concrete syntax. We finally reformulate the movement link as relating the mover of a motion-event to an event-path, as triggered by that motion-event. We also illustrate how the newly formulated movement link (<moveLink>) interacts with the other links in ISOspace.

190 Session 3 14:00 – 15:45

Discourse Markers and Disfluencies: Integrating Functional and Formal Annotations Ludivine Crible

While discourse markers (DMs) and (dis)fluency have been extensively studied in the past as independent phenomena, combining DM-level and disfluency-level annotations at a fine-grained level of informativeness has never been carried out before. It is argued that the integration of formal and functional annotations, while facing a number of methodological and theoretical challenges, is not only possible and innovative (addressing the lack of consensus in the field) but also highly relevant to the investigation of form-meaning patterns. This paper reports the methodological aspects of an annotation protocol which integrates formal identification of (dis)fluency markers and a multi-layered description of discourse markers featuring, among others, semantic-pragmatic variables such as their domain and function in context. The challenges and merits of this integration are illustrated by a comparison of clustering tendencies between different functions of DMs in DisFrEn, a French-English comparable dataset. Quantitative results allow us to generate tentative interpretations of the relative fluency of some DM functions based on co-occurrence patterns, in line with a cognitive-functional approach to spoken language.

ISO DR-Core: Core Concepts for the Annotation of Discourse Relations Harry Bunt and Rashmi Prasad

This paper summarizes ISO 24617-8 (ISO DR-Core), a new part of the ISO SemAF framework for semantic annotation. Within this framework a range of standards is developed to support the interoperable annotation of semantic phenomena. The effort to develop a standard for the annotation of semantic relations in discourse is split into two parts, of which ISO 24617-8 concerns the first part, formulating desiderata for the annotation of discourse relations and providing clear definitions for a set of ‘core’ discourse relations, based on an analysis of a range of theoretical approaches and annotation efforts. Following the ISO principles for semantic annotation, an abstract syntax as well as a concrete XML-based syntax for annotations were defined, together with a formal semantics. Mappings are provided between the ISO core relations and various other annotation schemes.

Feedback Matters: Applying Dialog Act Annotation to Study Social Attractiveness in Three-Party Conversations Benjamin Weiss and Stefan Hillmann

The relationship between verbal behavior and social attractiveness ratings is studied based on three-party conversation scenarios. The recorded conversations are annotated according to ISO 24617-2:2012, applying 11 classes. Intra-group likability ratings given by each interlocutor are correlated with the frequencies of each dialog-act class. A linear model shows significant relations between likability ratings given to interlocutors and the frequencies of three dialog act classes uttered by the rater. Two classes, “positive” and “negative allo-feedback”, are negatively related to likability, whereas “positive auto-feedback” shows a positive relation. No effect was found on the receiver's side. All participants met briefly before starting the experiment and also conducted a training conversation, which is why no assumptions about cause and effect have been made. This exploratory study motivates looking more deeply into the interdependence between verbal behavior and social relationships than just surface features such as speaking time and number of turns.
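As an illustration of the kind of analysis described, the following sketch (not the paper's actual model or data) fits an ordinary least-squares model relating made-up per-rater dialog-act frequencies to likability ratings with NumPy; the three predictor columns merely stand in for the feedback classes mentioned above.

# Minimal sketch: relate dialog-act frequencies of a rater to the likability
# rating that rater gave, via ordinary least squares on toy data.
import numpy as np

# rows: one (rater, ratee) pair; columns: frequencies of three dialog-act classes
X = np.array([
    [0.20, 0.05, 0.02],
    [0.10, 0.12, 0.08],
    [0.25, 0.03, 0.01],
    [0.08, 0.10, 0.09],
])
likability = np.array([5.1, 3.2, 5.6, 2.9])   # rating given by the rater

# add an intercept column and solve the least-squares problem
design = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(design, likability, rcond=None)
print("intercept and per-class coefficients:", coefs)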

191 Time Frames: Rethinking the Way to Look at Texts Andreea Macovei and Dan Cristea

In this paper, we discuss the challenging task of identifying time frames that may intersect in a text (a novel, a news article, a Facebook post, etc.), in a form more or less visible to the reader. By time frame, we mean a sequence of events or statements that an author exposes voluntarily; these time frames can be considered specific writing techniques where diverse narrative threads are used for the purpose of capturing the reader’s attention as the story develops. A particularity of time frames is that the transition from one time frame to another seems rather difficult to discern and make explicit, even for a forewarned annotator, while the consequences of the temporal discontinuities are grasped naturally by a casual reader of the text. We explain this notion and determine whether a remodelled temporal annotation is necessary for this issue.

Session 4 16:15 – 17:30

ECAT: Event Capture Annotation Tool Tuan Do, Nikhil Krishnaswamy, and James Pustejovsky

This paper introduces the Event Capture Annotation Tool (ECAT), a user-friendly interface tool for annotating events and their participants in video, capable of extracting 3D positional and orientational data about objects in video captured by Microsoft’s Kinect hardware. The modeling language VoxML (Pustejovsky and Krishnaswamy, 2016) underlies ECAT’s object, program, and attribute representations, although ECAT uses its own specification for explicit labeling of motion instances. The demonstration will show the tool’s workflow and the options available for capturing event-participant relations and browsing visual data. Mapping ECAT’s output to VoxML will also be addressed.

Contrastive (English-Spanish) Annotation of Epistemicity in the MULTINOT Project: Preliminary Steps Julia Lavid, Marta Carretero and Juan Rafael Zamorano

In this paper we describe the preliminary steps undertaken for the annotation of the conceptual domain of epistemicity in English and Spanish, as part of a larger annotation effort of modal meanings in the context of the MULTINOT project. These steps focus on: a) the instantiation of existing linguistic theories in the area of epistemicity, identifying and defining the categories to be used as tags for annotation; b) the design of an annotation scheme which captures both the functional-semantic dimension of epistemicity, on the one hand, and the language-specific realisations of epistemic meanings in both languages, on the other. These two dimensions are shown to be necessary for investigating relevant contrasts between English and Spanish in the area of epistemicity and for the large-scale annotation of comparable and parallel texts belonging to different registers in English and Spanish.

192 A Construction-Centered Approach to the Annotation of Modality Elisa Ghia, Lennart Kloppenburg, Malvina Nissim and Paola Pietrandrea

We propose a comprehensive annotation framework for modality, which encompasses and supports existing annotation schemes, by adopting a construction-centered view. Rather than seeing modality as a feature of a trigger or of a target, we view it as a feature of the triad “trigger-target-relation”, which we name construction. We motivate the need for such an approach from a theoretical perspective, and we also show that a construction-centered annotation scheme is operationally valid. We evaluate inter-annotator agreement via a pilot study, and find that modalised constructions identified by different annotators can be successfully aligned, as a first crucial step towards further agreement evaluations.

MAE2: Portable Annotation Tool for General Natural Language Use Kyeongmin Rim

A pair of general-purpose annotation/adjudication tools, MAE and MAI, has been available for years and has proven itself useful in many semantic annotation projects. We are releasing a newer version, MAE2, which inherits the original pair’s strengths of being adaptable, flexible, and portable. The new version is enhanced with new features to help with rapid prototyping of the design of natural language annotation tasks, naturally modeling complex semantic structures, setting up a more focused and consistent annotation workflow, and assuring the quality of annotations. Also, as an open source project, MAE2 has adopted common software design patterns to make it easier to modify the software with specialized features for a specific annotation task.

193 2nd Workshop on Natural Language Processing for Translation Memories (NLP4TM 2016)

28 May 2016

ABSTRACTS

Editors:

Constantin Orasan, Carla Parra, Eduard Barbu, Marcello Federico

194 Workshop Programme

9:00 – 10:30: Session 1: Invited talks

9:00 – 9:30 Marcello Federico, Machine translation adaptation from translation memories in ModernMT

9:30 – 10:00 Núria Bel, Data fever in the 21st century. Where to mine Language Resources

10:00 – 10:30 Samuel Läubli, Data, not Systems: a Better Way to Conduct the Business of Translation

10:30 – 11:00 Coffee break

11:00 – 12:00: Session 2: Research papers

11:00 – 11:20 A. Bellandi, G. Benotto, G. Di Segni, E. Giovannetti, Investigating the Application and Evaluation of Distributional Semantics in the Translation of Humanistic Texts: a Case Study.

11:20 – 11:40 Tapas Nayak, Santanu Pal, Sudip Kumar Naskar, Sivaji Bandyopadhyay, Josef van Genabith, Beyond Translation Memories: Generating Translation Suggestions based on Parsing and POS Tagging

11:40 – 12:00 Friedel Wolff, Laurette Pretorius, Loïc Dugast, Paul Buitelaar, Methodological pitfalls in automated translation memory evaluation

12:00 – 13:30: Session 3: Cleaning of translation memories shared task

12:00 – 12:20 Eduard Barbu, Carla Parra Escartín, Luisa Bentivogli, Matteo Negri, Marco Turchi, Marcello Federico, Luca Mastrostefano, Constantin Orasan, 1st Shared Task on Automatic Translation Memory Cleaning: Preparation and Lessons Learned

12:20 – 13:00 Presentations of the systems that took part in the shared task

13:00 – 13:30 Round table

195 Workshop Organizers

Constantin Orasan, University of Wolverhampton, UK
Carla Parra, Hermes, Spain
Eduard Barbu, Translated, Italy
Marcello Federico, FBK, Italy

Workshop Programme Committee

Juanjo Arevalillo, Hermes, Spain
Yves Champollion, WordFast, France
Gloria Corpas, University of Malaga, Spain
Maud Ehrmann, EPFL, Switzerland
Kevin Flanagan, Swansea University, UK
Corina Forascu, University “Al. I. Cuza”, Romania
Gabriela Gonzalez, eTrad, Argentina
Rohit Gupta, University of Wolverhampton, UK
Manuel Herranz, Pangeanic, Spain
Samuel Läubli, Autodesk, Switzerland
Liangyou Li, DCU, Ireland
Qun Liu, DCU, Ireland
Ruslan Mitkov, University of Wolverhampton, UK
Aleksandros Poulis, Lionbridge, Sweden
Gabor Proszeky, Morphologic, Hungary
Uwe Reinke, Flensburg University of Applied Sciences, Germany
Michel Simard, NRC, Canada
Mark Shuttleworth, UCL, UK
Masao Utiyama, NICT, Japan
Mihaela Vela, Saarland University, Germany
Andy Way, DCU, Ireland
Joern Wuebker, Lilt, United States
Marcos Zampieri, Saarland University and DFKI, Germany

196 Preface

Translation Memories (TM) are amongst the most used tools by professional translators, if not the most used. The underlying idea of TMs is that a translator should benefit as much as possible from previous translations by being able to retrieve how a similar sentence was translated before. Moreover, the usage of TMs aims at guaranteeing that new translations follow the client’s specified style and terminology. Despite the fact that the core idea of these systems relies on comparing segments (typically of sentence length) from the document to be translated with segments from previous translations, most of the existing TM systems hardly use any language processing for this. Instead of addressing this issue, most of the work on translation memories focused on improving the user experience by allowing processing of a variety of document formats, intuitive user interfaces, etc.

The term second generation translation memories has been around for more than ten years and it promises translation memory software that integrates linguistic processing in order to improve the translation process. This linguistic processing can involve matching of subsentential chunks, edit distance operations between syntactic trees, incorporation of semantic and discourse information in the matching process. Terminologies, glossaries and ontologies are also very useful for translation memories, by facilitating the task of the translator and ensuring a consistent translation. The field of Natural Language Processing (NLP) has proposed numerous methods for terminology extraction and ontology extraction. The building of translation memories from corpora is another field where methods from NLP can contribute to improving the translation process.

We are happy we could include in the workshop programme four contributions dealing with the aforementioned issues. In addition, the programme of the workshop is complemented by the presentations of three well-known researchers.

The first edition of this workshop, organised at RANLP 2015, confirmed that there is interest in the research community in the proposed topics. In addition, it highlighted the need for automatic methods for cleaning translation memories. For this reason, the second edition of the NLP4TM workshop also organises a shared task on cleaning translation memories, in an attempt to make the creation of resources for translation memories easier.

The Organising Committee would like to thank the Programme Committee, who responded with very fast but also substantial reviews for the workshop programme. This workshop would not have been possible without the support received from the EXPERT project (FP7/2007-2013 under REA grant agreement no. 317471, http://expert-itn.eu).

197 Session 1: Invited talks Saturday 28 May, 9:00 – 10:30

Machine translation adaptation from translation memories in ModernMT Marcello Federico, FBK Trento, Italy

Adapting machine translation systems to specific customers or domains usually requires a long time and extensive effort in training and optimizing the system. ModernMT sets out to do away with this by developing a new MT technology that seamlessly integrates translation memories into the MT system and trains it on the fly, without any disruption of service or any user intervention. ModernMT is a new MT technology funded by the European Union that will overcome the typical limitations of traditional MT systems. From a software engineering perspective, ModernMT will provide an easy-to-install, fast-to-train, and simple-to-scale platform, capable of serving tens of thousands of translators working simultaneously. Users will be able to integrate ModernMT in their favorite CAT tool, such as MateCat and Trados Studio, via a plugin that will merge translation memory and machine translation functions. In particular, by uploading and connecting their private translation memory to ModernMT, and by updating it during their work, they will not only receive better matches, but also more appropriate machine translation suggestions, which will significantly enhance their user experience and productivity. Real-time training and adaptation of machine translation from multiple translation memories, however, requires very efficient processing, e.g. for text cleaning, tokenisation, tag management, word alignment, adaptation, etc. In my talk I will present the overall ModernMT architecture, discuss its development roadmap and report preliminary results.

Data fever in the 21st century. Where to mine Language Resources. Núria Bel, Universitat Pompeu Fabra, Spain

Language Resources, especially parallel corpora and bilingual glossaries, are raw materials for a number of application tasks and are considered a critical supply for Natural Language Processing- based applications such as Machine Translation. More and more efforts are being made with the aim of finding and exploiting deposits of multilingual data. The web is considered the most obvious mine: special crawlers are devised to find multilingual webs from where to extract parallel corpora. But other sources are also being explored such as open data. Open data is a quite promising source of data, particularly in the case of public administration data, although some issues concerning access and formats must be taken into consideration. Moreover, Linked Open Data has also proved to be useful for producing multilingual glossaries.

In this presentation, I will review different initiatives to mine Multilingual Language Resources which might be of interest for producing domain-specific Translation Memories for different language pairs. Besides the creation of new Translation Memories, these resources may also be used for enriching, curating or quality assuring already existing ones.

Data, not Systems: a Better Way to Conduct the Business of Translation Samuel Läubli, Autodesk Development S.à.r.l., Switzerland

Over the past two decades, Autodesk has been acquiring large volumes of professional translations through localizing software products into more than 20 languages. Besides classical translation memory leveraging, we use this data for providing natural language processing services such as full text search, terminology harvesting, or domain-specific statistical machine translation.

198 While these services have been shown to positively impact translation quality and/or throughput (e.g., Plitt & Masselot, 2010), the fact that they are normally accessed via purpose-built systems creates inefficiencies for providers and consumers alike. From a corporate perspective, the effort needed for their maintenance and support would be better spent on improving the services as such. Translators, on the other hand, lose time in familiarizing themselves with client-specific systems as well as switching between them and their usual translation workbench. Recent research suggests that bundling these components into mixed-initiative interfaces (Horvitz, 1999) makes translation much more efficient and rewarding (Green et al., 2015).

In this talk, I will detail our transition towards providing translators with the data, rather than the software, we think they need.

Session 2: Research papers Saturday 28 May, 11:00 – 12:00

Investigating the Application and Evaluation of Distributional Semantics in the Translation of Humanistic Texts: a Case Study A. Bellandi, G. Benotto, G. Di Segni, E. Giovannetti

Digital Humanities are steadily on the rise, and the need for translating humanistic texts using Computer Assisted Translation (CAT) tools demands a specific investigation both of the available technologies and of the evaluation techniques. Indeed, humanistic texts can differ deeply from the texts that are usually translated with CAT tools, due to complex interpretative issues, the need for heavy rephrasing, and the addition of explicative parts in order to make the translation fully comprehensible to readers and, also, stylistically pleasant to read. In addition, these texts are often written in peculiar languages for which no linguistic analysis tool may be available. We faced this situation in the context of the project for the translation of the Babylonian Talmud from Ancient Hebrew and Aramaic into Italian. In this paper we describe work in progress on the application of distributional semantics to inform the Translation Memory, and on the evaluation issues arising from its assessment.

Beyond Translation Memories: Generating Translation Suggestions based on Parsing and POS Tagging Tapas Nayak, Santanu Pal, Sudip Kumar Naskar, Sivaji Bandyopadhyay, Josef van Genabith

This paper explores how translations of unmatched parts of an input sentence can be discovered and inserted into Translation Memory (TM) suggestions generated by a Computer Aided Translation (CAT) tool, using a parse tree and part-of-speech (POS) tags to form a new translation that is more suitable for post-editing. CATaLog (Nayek et al., 2015) is a CAT tool based on TM and a modified Translation Error Rate (TER) (Snover et al., 2006) metric. Unmatched parts of the sentence to be translated can often be found in other TM suggestions or in sentences which are not part of the TM suggestions; therefore, we can find the translations of those unmatched parts within the TM database itself. If we can merge the translations of the unmatched parts into one single sentence in a meaningful way, then post-editing effort will be reduced. Inserting the translations for the unmatched parts into TM suggestions may, however, lead to a loss of fluency in the generated target sentence. To avoid this, we use parsing and POS tagging together with a back-off POS n-gram model to generate new translation suggestions.

199 Methodological pitfalls in automated translation memory evaluation Friedel Wolff, Laurette Pretorius, Loïc Dugast, Paul Buitelaar

A translation memory system attempts to retrieve useful suggestions from previous translations to assist a translator in a new translation task. While assisting the translator with a specific segment, some similarity metric is usually employed to select the best matches from previously translated segments to present to the translator. Automated methods for evaluating a translation memory system usually use reference translations and also use some similarity metric. Such evaluation methods might be expected to assist in choosing between competing systems. No single evaluation method has gained widespread use; additionally, the similarity metrics used in these methods are not standardised either. This paper investigates the choice of fuzzy threshold during evaluation, and the consequences of different choices of similarity metric in such an evaluation method. Important considerations for the automated evaluation of translation memory systems are presented.
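To make the evaluation setting concrete, here is a hedged Python sketch (not the paper's protocol): it retrieves the most similar TM source segment, applies a fuzzy-match threshold, and scores the retrieved target against a reference translation; the character-level difflib ratio merely stands in for whichever similarity metric a real evaluation would choose, and the data is invented.

# Minimal sketch of an automated TM evaluation with a fuzzy-match threshold.
import difflib

def similarity(a, b):
    return difflib.SequenceMatcher(None, a, b).ratio()  # 0.0 .. 1.0

def evaluate_tm(tm, test_pairs, fuzzy_threshold=0.7):
    """tm: list of (source, target); test_pairs: list of (source, reference)."""
    scores, retrieved = [], 0
    for src, ref in test_pairs:
        best_src, best_tgt = max(tm, key=lambda seg: similarity(src, seg[0]))
        if similarity(src, best_src) < fuzzy_threshold:
            continue                      # below threshold: no suggestion shown
        retrieved += 1
        scores.append(similarity(best_tgt, ref))   # proxy for usefulness of the match
    coverage = retrieved / len(test_pairs)
    quality = sum(scores) / len(scores) if scores else 0.0
    return coverage, quality

tm = [("press the red button", "appuyez sur le bouton rouge")]
print(evaluate_tm(tm, [("press the green button", "appuyez sur le bouton vert")]))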

Session 3: Shared task on cleaning of translation memories Saturday 28 May, 12:00 – 13:30

1st Shared Task on Automatic Translation Memory Cleaning: Preparation and Lessons Learned Eduard Barbu, Carla Parra Escartín, Luisa Bentivogli, Matteo Negri, Marco Turchi, Marcello Federico, Luca Mastrostefano, Constantin Orasan

This paper summarizes the work done to prepare the first shared task on automatic translation memory cleaning. This shared task aims at finding automatic ways of cleaning TMs that, for some reason, have not been properly curated and include wrong translations. Participants in this task are required to take pairs of source and target segments from TMs and decide whether they are correct translations. For this first edition of the task, three language pairs have been prepared: English – Spanish, English – Italian, and English – German. In this paper, we report on how the shared task was prepared and explain the process of data selection and data annotation, the building of the training and test sets, and the baselines implemented for the comparison of automatic classifiers.
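The following toy sketch (not one of the shared-task baselines) illustrates the classification setting: shallow, language-independent features over a (source, target) pair feed a logistic-regression classifier that labels the pair as a correct or wrong translation; the features and training pairs are invented for illustration only.

# Minimal sketch of a TM-cleaning classifier over (source, target) segment pairs.
from sklearn.linear_model import LogisticRegression
import re

def features(src, tgt):
    src_tok, tgt_tok = src.split(), tgt.split()
    len_ratio = len(tgt_tok) / max(len(src_tok), 1)
    src_nums = set(re.findall(r"\d+", src))
    tgt_nums = set(re.findall(r"\d+", tgt))
    num_match = 1.0 if src_nums == tgt_nums else 0.0   # numbers should carry over
    return [len_ratio, num_match, float(src.strip() == tgt.strip())]

# training pairs labelled 1 (correct translation) or 0 (wrong/noisy)
train = [("two cats", "dos gatos", 1), ("page 5", "página 7", 0)]
X = [features(s, t) for s, t, _ in train]
y = [label for _, _, label in train]
clf = LogisticRegression().fit(X, y)
print(clf.predict([features("three dogs", "tres perros")]))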

200 Controlled Language Applications Workshop (CLAW)

28 May 2016

ABSTRACTS

Editors:

Key-Sun Choi, Sejin Nam

201 Workshop Programme

14:00 – 14:15 Key-Sun Choi, Hitoshi Isahara, Kiyong Lee and Christian Galinski: Introduction about ISO and CNL

14:15 – 14:30 Hitoshi Isahara and Tetsuzo Nakamura: Report from Japan

14:30 – 14:55 Adam Wyner, Francois Lévy and Adeline Nazarenko: An Underspecified Approach to a Controlled Language for Legal Texts - a Position Paper -

14:55 – 15:20 Christian Galinski and Blanca Stella Giraldo Pérez: Rule-Based Technical Writing: A Meta-Standard on Controlled Language Extended towards Controlled Communication

15:20 – 15:45 Sylviane Cardey: A Controlled Language for Sense Mining and Machine Translation for Applications in Mission-Critical Domains

15:45 – 16:10 Christina Lohr and Robert Herms: A Corpus of German Clinical Reports for ICD and OPS-based Language Modeling

16:10 – 16:30 Coffee break

16:30 – 16:55 Xiaofeng Wu, Liangyou Li, Jinhua Du and Andy Way: ProphetMT: Controlled Language Authoring Aid System Description

16:55 – 17:30 Rei Miyata, Anthony Hartley, Cécile Paris and Kyo Kageura: Evaluating and Implementing a Controlled Language Checker

17:30 – 17:40 Closing

202 Workshop Organizers

Key-Sun Choi, KAIST
Hitoshi Isahara, Toyohashi University of Technology
Christian Galinski, Infoterm
Andy Way, Dublin City University
Teruko Mitamura, Carnegie Mellon University

Workshop Programme Committee

Hitoshi Isahara, Toyohashi University of Technology
Andy Way, Dublin City University
Christian Galinski, Infoterm
Teruko Mitamura, Carnegie Mellon University
Kiyong Lee, ISO/TC37
Key-Sun Choi, KAIST

203 Preface

Following the highly successful workshop on Controlled Natural Language Simplifying Language Use at LREC 2014, we are pleased to announce the 6th CLAW workshop, embracing an open range of topics from applications to standardization, held in conjunction with the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), 23-28 May 2016, Grand Hotel Bernardin Conference Center, Portorož, Slovenia.

This workshop focuses on issues such as standardization of Controlled Language Applications and the related supporting research and implementation issues, in cooperation with the controlled language application, ISO/TC37 standardization, and semantic web communities.

The workshop on Controlled Language Applications invites papers on current progress and results towards the standardization of controlled language. The workshop also encourages submissions on any of (but not limited to) the following topics: human communication protocols, controlled text authoring, conformance checking systems, controlled language authoring aids, memory-based authoring, (re-)authoring combined with translation, issues in controlled language design, industrial experience and evolving requirements, models, processing algorithms, terminology aspects, R&D projects, use cases, and related topics in summarization, question answering, machine translation, and quality and usability evaluation of controlled languages.

The workshop will give equal emphasis to the academic, corporate and industrial perspectives, while bringing together researchers, developers, users, and potential users of controlled language systems from around the world. The goal of this workshop will be to bridge the gap between the theory, practice and applications of controlled language, to identify existing and possible future controlled language applications, and to determine what should be kept in a standard for controlled language applications.

204 An Underspecified Approach to a Controlled Language for Legal Texts - a Position Paper -

Adam Wyner, Francois Lévy, Adeline Nazarenko

The texts of legislation and regulation must be structured and augmented in order to allow for semantic web services (querying, linking, and inference). However, it is difficult to accurately parse and semantically represent such texts due to the conventional practices of the legal community, the length and complexity of legal language, and the textual ground of the law. Controlled natural languages have been proposed as an approach to addressing these difficulties, whereby the source text is rewritten in some standard form. However, such an approach has not suited legal language, due to its requirements and complexities, so standardization has been difficult to achieve. To navigate between the requirements and complexities of legal language, standardization, and a fully controlled natural language, we propose and exemplify an approach to a high-level controlled language which is adapted to the legal domain, correlates with the source text, and also facilitates analysis for semantic web applications. The approach can make use of some available NLP processing tools.

Rule-Based Technical Writing: A Meta-Standard on Controlled Language Extended towards Controlled Communication

Christian Galinski, Blanca Stella Giraldo Pérez

Standardization of rule-based technical writing (RBTW) emerged in English in certain industries. It started with Simplified Technical English (STE), or Simplified English, the original name of a controlled language standard originally developed for aerospace industry maintenance manuals. Formerly called AECMA Simplified English, it was renamed ASD Simplified Technical English by ASD (the Aerospace and Defence Industries Association of Europe). ASD-STE became so widely used by other industries and for such a wide range of document types that ‘simplified English’ is often used as a generic term for ‘controlled language’. Today the controlled language approach is applied in probably about a hundred languages, particularly in user instructions of all sorts. Increasingly, such user instructions have to be rendered eAccessible, as the Convention on the Rights of Persons with Disabilities (CRPD) has been adopted into national legislation by numerous countries. As the needs of persons with disabilities (PwD) should be taken into account, whether in paper form or on websites, a systematic approach is recommended for the development of such content on paper and as equivalent web content. For this purpose, a meta-standard with rules for the formulation of RBTW guides or standards would be useful.

A Controlled Language for Sense Mining and Machine Translation for Applications in Mission-Critical Domains

Sylviane Cardey

In this paper we present methodologies as well as the theoretical contributions involving the analysis and generation of texts for the application of controlled languages in multilingual mission-critical domains, particularly safety-critical domains such as aeronautics, medicine and civil protection, where reliable results are obligatory. We show that the analysis involves the extraction of sense, that is, sense mining, and the generation involves that of controlled texts and their machine translation. This work has involved language modelling based on micro-systemic linguistic analysis, itself underpinned by a formal mathematical model, which also inherently provides traceability, mandatory in safety-critical applications. A norms-based approach is described, involving the extraction and application of norms in order to use them in the methodologies, both for analysis and generation. Applications, application domains and applicability are discussed.

205

A Corpus of German Clinical Reports for ICD and OPS-based Language Modeling

Christina Lohr, Robert Herms

In health care and the corresponding clinical institutions, all treatments need to be registered and documented in a comprehensive manner. For clinical documentation, complex reports (e.g., on surgical interventions) are dictated by doctors and subsequently typed by secretaries. These reports are annotated with standardized codes for diagnosed diseases (ICD) and executed procedures (OPS). In this paper, we present a corpus of 450 written German clinical reports constructed for evaluation purposes, in particular for language modeling. We investigated the potential of the hierarchical structures of ICD and OPS codes in order to construct content-based language models for the clinical context. Experimental results show that OPS-based language modeling performed best when using the highest level of the corresponding standard.
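As an illustration of code-based language modeling of the kind described, the sketch below (not the paper's setup) groups toy reports by the first character of their ICD/OPS code, trains add-one-smoothed unigram models per group, and assigns a new report to the lowest-perplexity group; the grouping rule and the data are assumptions made purely for illustration.

# Minimal sketch: per-category unigram language models over clinical reports.
import math
from collections import Counter, defaultdict

def top_level(code):
    return code[0]            # e.g. "5-788" -> "5"; a stand-in for the real hierarchy

def train_unigram_lms(reports):
    """reports: list of (code, text). Returns {category: (counts, total, vocab_size)}."""
    grouped = defaultdict(Counter)
    for code, text in reports:
        grouped[top_level(code)].update(text.lower().split())
    vocab = {w for c in grouped.values() for w in c}
    return {cat: (counts, sum(counts.values()), len(vocab)) for cat, counts in grouped.items()}

def perplexity(lm, text):
    counts, total, vocab_size = lm
    logp, n = 0.0, 0
    for w in text.lower().split():
        p = (counts[w] + 1) / (total + vocab_size)   # add-one smoothing
        logp += math.log(p)
        n += 1
    return math.exp(-logp / n)

lms = train_unigram_lms([("5-788", "osteotomie am fuss"), ("J45", "asthma bronchiale")])
print(min(lms, key=lambda cat: perplexity(lms[cat], "osteotomie geplant")))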

ProphetMT: Controlled Language Authoring Aid System Description

Xiaofeng Wu, Liangyou Li, Jinhua Du, Andy Way

This paper presents ProphetMT, a monolingual Controlled Language (CL) authoring tool which allows users to easily compose an in-domain sentence with the help of tree-based SMT-driven auto-suggestions. The interface also visualizes target-language sentences as they are built by the SMT system. When the user is finished composing, the final translation(s) are generated by a tree-based SMT system using the text and structural information provided by the user. With this domain-specific controlled language, ProphetMT will produce highly reliable translations. The contributions of this work are: 1) we develop a user-friendly auto-completion-based editor which guarantees that the vocabulary and grammar chosen by a user are compatible with a tree-based SMT model; 2) by applying a shift-reduce-like parsing feature, this editor allows users to write from left to right and generates the parsing results on the fly. Accordingly, with this in-domain composing restriction as well as the gold-standard parsing result, a highly reliable translation can be generated.

Evaluating and Implementing a Controlled Language Checker

Rei Miyata, Anthony Hartley, Cécile Paris, Kyo Kageura

This paper describes the evaluation of the detection component of a controlled language (CL) checker designed to assist non-professional writers in creating Japanese source texts that conform to a set of writing rules. We selected 23 Japanese CL rules shown to be effective both for two Japanese-to-English machine translation systems and for human readability, and implemented them with simple pattern matching using an existing morphological analyser. To benchmark the performance of the component, we created a comprehensive test set of noncompliant and compliant sentences from Japanese municipal websites, with all rule violations manually annotated. The results showed that 15 rules achieved high F-measure scores (more than 0.8), with nine obtaining a score of 1.0, while the precision scores of eight rules were low (less than 0.7). A detailed analysis of the results indicated ways to improve performance. Finally, based on the evaluation, we created an interface designed to alleviate the low-precision issue and implemented a prototype CL checker that is operational online.
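A minimal sketch of the evaluation logic (not the authors' checker): a single invented regex rule flags violations, and the detector is scored with precision, recall and F-measure against manual annotations; real CL rules would operate on the output of a morphological analyser rather than raw strings.

# Minimal sketch: pattern-based rule checking scored against gold annotations.
import re

RULES = {
    "R01_double_negation": re.compile(r"\bnot\b.*\bnot\b"),   # made-up example rule
}

def detect(sentence):
    return {rid for rid, pat in RULES.items() if pat.search(sentence)}

def evaluate(sentences, gold):
    """sentences: list of str; gold: list of sets of violated rule ids."""
    tp = fp = fn = 0
    for sent, gold_ids in zip(sentences, gold):
        pred = detect(sent)
        tp += len(pred & gold_ids)
        fp += len(pred - gold_ids)
        fn += len(gold_ids - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(evaluate(["it is not not allowed", "submit the form"],
               [{"R01_double_negation"}, set()]))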

206 4REAL Workshop: Workshop on Research Results Reproducibility and Resources Citation in Science and Technology of Language

28 May 2016

ABSTRACTS

Editors:

António Branco, Nicoletta Calzolari and Khalid Choukri

207 Workshop Programme

09:00 – 10:30 Reproducibility

Luís Gomes, Gertjan van Noord, António Branco, Steve Neale, Seeking to Reproduce "Easy Domain Adaptation"

Kevin Cohen, Jingbo Xia, Christophe Roeder and Lawrence Hunter, Reproducibility in Natural Language Processing: A Case Study of two R Libraries for Mining PubMed/MEDLINE

Filip Graliński, Rafał Jaworski, Łukasz Borchmann and Piotr Wierzchoń, Gonito.net - Open Platform for Research Competition, Cooperation and Reproducibility

10:30 – 11:00 Coffee break

11:00 – 12:00 Citation

Jozef Milšutka, Ondřej Košarko and Amir Kamran, SHORTREF.ORG - Making URLs Easy-to-Cite

Gil Francopoulo, Joseph Mariani and Patrick Paroubek, Linking Language Resources and NLP Papers

12:00 – 13:00 Round table

Reproducibility in Language Science and Technology: Ready for the Integrity Debate? Chair: António Branco

208 Workshop Organizers

António Branco, University of Lisbon
Nicoletta Calzolari, ILC-CNR / ELRA
Khalid Choukri, ELDA

Workshop Programme Committee

António Branco, University of Lisbon
Iryna Gurevych, Universität Darmstadt
Isabel Trancoso, INESC-ID / IST, University of Lisbon
Joseph Mariani, CNRS/LIMSI
Justus Roux, NWVU
Khalid Choukri, ELDA
Maria Gavrilidou, ILSP
Marko Grobelnik, Jozef Stefan Institute
Marko Tadic, University of Zagreb
Mike Rosner, University of Malta
Nicoletta Calzolari, ILC-CNR/ELRA
Nick Campbell, Trinity College Dublin
Senja Pollak, Jozef Stefan Institute
Stelios Piperidis, ILSP
Steven Krauwer, University of Utrecht
Thierry Declerck, DFKI
Torsten Zesch, University of Duisburg-Essen
Yohei Murakami, Language Grid Japan

209 Preface/Introduction

The discussion on research integrity has grown in importance as the resources allocated to and the societal impact of scientific activities have been expanding (e.g. Stodden, 2013; Aarts et al., 2015), to the point that it has recently crossed the borders of the research world and made its appearance in important mass media, being brought to the attention of the general public (e.g. Naik, 2011; Zimmer, 2012; Begley, 2012; The Economist, 2013).

The immediate motivation for this increased interest is to be found in a number of factors, including the realization that for some published results, their replication is not being obtained (e.g. Prinz et al., 2011; Begley and Ellis, 2012); that there may be problems with the commonly accepted reviewing procedures, where deliberately falsified submissions, with fabricated errors and fake authors, get accepted even in respectable journals (e.g. Bohannon, 2013); and that the expectations of researchers vis-à-vis misconduct, as revealed in inquiries to scientists on questionable practices, score higher than one might expect or would be ready to accept (e.g. Fanelli, 2009); among several others.

Doing justice to and building on the inherent ethos of scientific inquiry, this issue has been under thorough examination, leading to scrutiny of its possible immediate causes and underlying factors, and to initiatives to respond to its challenge, namely by the setting up of dedicated conferences (e.g. WCRI – World Conference on Research Integrity), dedicated journals (e.g. RIPR – Research Integrity and Peer Review), support platforms (e.g. COS – Center for Open Science), revised and more stringent procedures (e.g. Nature, 2013), batch replication studies (e.g. Aarts et al., 2015), investigations of misconduct (e.g. Hvistendahl, 2013), etc.

This workshop seeks to foster the discussion and the advancement of a topic that has so far been given insufficient attention in the research area of language processing tools and resources (Branco, 2013; Fokkens et al., 2013) and that has been an important topic emerging in other scientific areas: the reproducibility of research results and the citation of resources, and their impact on research integrity.

We invited submissions of articles presenting pioneering cases, with either positive or negative results, of actual replication exercises of previously published results in our area. We were also interested in articles discussing the challenges, the risk factors, the procedures, etc. specific to our area or that should be adopted, or adapted from other neighboring areas, possibly including of course the new risks raised by the replication articles themselves and their own integrity, in view of preserving the reputation of colleagues and works whose results are reported as having been replicated, etc. By the same token, this workshop was also interested in articles addressing methodologies for monitoring, maintaining or improving the citation of language resources and tools, and in assessing the importance of data citation for research integrity and for the advancement of natural language science and technology.

The present volume gathers the papers that were selected for presentation and publication after having received a sufficiently positive evaluation from three reviewers from the workshop's program committee.
We hope this workshop, collocated with the LREC 2016 conference, will help to open and foster the discussion on research results reproducibility and resources citation in the domain of science and technology of language.

18 April 2016

António Branco, Nicoletta Calzolari and Khalid Choukri

210

References:

Aarts et al., 2015, “Estimating the Reproducibility of Psychological Science”, Science.
Begley, 2012, "In Cancer Science, Many 'Discoveries' don't hold up", Reuters, March 28, online.
Begley and Ellis, 2012, "Drug Development: Raise Standards for Preclinical Cancer Research", Nature.
Bohannon, 2013, "Who’s Afraid of Peer Review?", Science.
Branco, 2013, "Reliability and Meta-reliability of Language Resources: Ready to initiate the Integrity Debate?", in Proceedings of the 12th Workshop on Treebanks and Linguistic Theories (TLT12).
COS, Center for Open Science.
Fanelli, 2009, "How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data", PLOS ONE.
Fokkens, van Erp, Postma, Pedersen, Vossen and Freire, 2013, "Offspring from Reproduction Problems: What Replication Failure Teaches Us", in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013).
Hvistendahl, 2013, “China’s Publication Bazaar”, Science.
Naik, 2011, "Scientists’ Elusive Goal: Reproducing Study Results", The Wall Street Journal.
Nature, 2013, “Announcement: Reducing our Irreproducibility”, Nature, Editorial.
Prinz, Schlange and Asadullah, 2011, "Believe it or not: How much can We Rely on Published Data on Potential Drug Targets?", Nature Reviews Drug Discovery 10, 712.
RIPR, Research Integrity and Peer Review.
Stodden, 2013, "Resolving Irreproducibility in Empirical and Computational Research", IMS Bulletin Online.
The Economist, 2013, "Unreliable Research: Trouble at the Lab", The Economist, October 19, online.
WCRI, World Conference on Research Integrity.
Zimmer, 2012, "A Sharp Rise in Retractions Prompts Calls for Reform", The New York Times.

211 Session 1: Reproducibility Saturday 28 May, 9:00 - 10:30

Seeking to Reproduce "Easy Domain Adaptation" Luís Gomes, Gertjan van Noord, António Branco and Steve Neale

The frustratingly easy domain adaptation technique proposed by Daume III (2007) is simple, easy to implement, and reported to be very successful in a range of NLP tasks (named entity recognition, part-of-speech tagging, and others), giving us high hopes of successfully replicating it and applying it to an English↔Portuguese hybrid machine translation system. While our hopes turned to ‘frustration’ in one translation direction – as the results obtained with the domain-adapted model do not improve upon the in-domain baseline model – our results are more encouraging in the opposite direction. This paper describes our replication of the technique and our application of it to machine translation, and offers a discussion of possible reasons for our mixed success in doing so.

Reproducibility in Natural Language Processing: A Case Study of two R Libraries for Mining PubMed/MEDLINE Kevin Cohen, Jingbo Xia, Christophe Roeder and Lawrence Hunter

There is currently a crisis in science related to highly publicized failures to reproduce large numbers of published studies. This work proposes, by way of case studies, a methodology for moving the study of reproducibility in computational work a full stage beyond that of earlier work. Specifically, it presents a case study in attempting to reproduce the reports of two R libraries for text mining of the PubMed/MEDLINE repository of scientific publications. The main findings are that a rational paradigm for the reproduction of natural language processing papers can be established; that the advertised functionality was difficult, but not impossible, to reproduce; and that reproducibility studies can produce additional insights into the functioning of the published system. Additionally, the work on reproducibility led to the production of novel user-centered documentation that has been accessed 260 times since its publication, an average of once a day per library.

Gonito.net - Open Platform for Research Competition, Cooperation and Reproducibility Filip Graliński, Rafał Jaworski, Łukasz Borchmann and Piotr Wierzchoń

This paper presents the idea of applying an open source, web-based platform - Gonito.net - for hosting challenges for researchers in the field of natural language processing. Researchers are encouraged to compete in well-defined tasks by developing tools and running them on provided test data. The researcher who submits the best results becomes the winner of the challenge. Apart from the competition, Gonito.net also enables collaboration among researchers by means of source code sharing mechanisms. Gonito.net itself is fully open source, i.e. its source is available for download and compilation, as well as a running instance of the system, available at gonito.net. The key design feature of Gonito.net is using Git for managing solutions of problems submitted by competitors. This allows for research transparency and reproducibility.

212 Session 2: Citation Saturday 28 May, 11:00 - 12:00

SHORTREF.ORG - Making URLs Easy-to-Cite Jozef Milšutka, Ondřej Košarko and Amir Kamran

In this paper, we present an easy-to-cite and persistent infrastructure (shortref.org) for research and data citation in the form of a URL shortener service. The reproducibility of results is very important for the reuse of research and directly depends on the availability of research data. Advancements in web technologies have made the redistribution of data much easier; however, due to the dynamic nature of the internet, content constantly moves from one destination to another. The URLs researchers use for citing their work do not directly account for such changes, and when users try to access the cited URLs, they may no longer work. In our proposed solution, the shortened URLs are not plain URLs but use persistent identifiers and provide a reliable mechanism to keep the target always accessible, which can directly improve the impact of research.
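The core mechanism can be sketched as follows (not the shortref.org implementation): a short code maps to a persistent identifier, which in turn resolves to the current location, so the citation keeps working when the data moves; the registries, the example.org domain and the handle below are placeholders.

# Minimal sketch: a citable short URL resolving through a persistent identifier.
import hashlib

pid_registry = {}    # persistent identifier -> current location (updatable)
short_registry = {}  # short code -> persistent identifier (never changes)

def register(pid, current_url):
    pid_registry[pid] = current_url
    code = hashlib.sha1(pid.encode()).hexdigest()[:6]   # deterministic short code
    short_registry[code] = pid
    return f"https://example.org/{code}"                # hypothetical shortener domain

def resolve(short_url):
    code = short_url.rsplit("/", 1)[-1]
    return pid_registry[short_registry[code]]

short = register("hdl:11234/1-DEMO", "https://old-host.example/dataset")
pid_registry["hdl:11234/1-DEMO"] = "https://new-host.example/dataset"  # the data moved
print(resolve(short))   # still resolves, now to the new location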

Linking Language Resources and NLP Papers Gil Francopoulo, Joseph Mariani and Patrick Paroubek

The Language Resources and Evaluation Map (LRE Map) is an accessible database of resources based on records collected during the submissions to various major Natural Language Processing (NLP) conferences, including the Language Resources and Evaluation Conferences (LREC). NLP4NLP is a very large corpus of scientific papers in the field of Speech and Natural Language Processing covering a large number of conferences and journals. In this article, we make the link between these two elements in order to study the mentions of LRE Map resource names within the NLP4NLP corpus.
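A minimal sketch of such linking (not the authors' pipeline): match a list of known resource names against paper texts and count, per resource, how many papers mention it; the resource list and the papers below are toy stand-ins for the LRE Map records and the NLP4NLP corpus.

# Minimal sketch: count papers mentioning each known resource name.
import re
from collections import Counter

resource_names = ["Europarl", "Penn Treebank", "WordNet"]
papers = {
    "paper_001": "We train on Europarl and evaluate against WordNet glosses.",
    "paper_002": "The Penn Treebank and WordNet are used as gold data.",
}

def count_mentions(names, texts):
    patterns = {n: re.compile(r"\b" + re.escape(n) + r"\b", re.IGNORECASE) for n in names}
    counts = Counter()
    for text in texts.values():
        for name, pat in patterns.items():
            if pat.search(text):
                counts[name] += 1          # count papers mentioning the resource
    return counts

print(count_mentions(resource_names, papers))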

Session 3: Round table Saturday 28 May, 12:00 - 13:00

Reproducibility in Language Science and Technology: Ready for the Integrity Debate? Chair: António Branco

213 Novel Incentives for Collecting Data and Annotation from People: types, implementation, tasking requirements, workflow and results

28 May 2016

ABSTRACTS

Editors:

Christopher Cieri

214 Workshop Programme

14:00 – 14:25 – Introduction by Workshop Chair

Christopher Cieri, Novel Incentives in Language Resource Development

14:25 – 15:15 – Novel Incentives in Data Collection and Requisite Processing

Nick Campbell, Herme & Beyond; the Collection of Natural Speech Data

Kensuke Mitsuzawa, Maito Tauchi, Mathieu Domoulin, Masanori Nakashima, Tomoya Mizumoto, FKC Corpus: a Japanese Corpus from New Opinion Survey Service

15:15 – 16:05 – Novel Incentives and Workflows for Annotation

Kara Greenfield, Kelsey Chan, Joseph P. Campbell, A Fun and Engaging Interface for Crowdsourcing Named Entities

Massimo Poesio, Jon Chamberlain, Udo Kruschwitz and Chris Madge, Novel Incentives for Phrase Detectives

16:05 – 16:30 – Afternoon Coffee Break

16:30 – 17:20 – Understanding and Exploiting Data from Alternative Sources

Na’im Tyson, Jonathan Roberts, Jeff Allen, Matt Lipson, Evaluation of Anchor Texts for Automated Link Discovery in Semi-structured Web Documents

Maxine Eskenazi, Sungjin Lee, Tiancheng Zhao, Ting Yao Hu, Alan W Black, Unconventional Approaches to Gathering and Sharing Resources for Spoken Dialog Research

17:20 – 18:00 – The Future of Incentives, Workforces, Workflows and Data Exploitation

Mark Liberman, Oral Histories: Linguistic Documentation as Social Media

Wrap-Up and Discussion

215 Workshop Organizers

Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania
Chris Callison-Burch, University of Pennsylvania
Nick Campbell, Trinity College Dublin
Maxine Eskenazi, Carnegie Mellon University
Massimo Poesio, University of Essex, University of Trento

Workshop Programme Committee

Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania
Chris Callison-Burch, University of Pennsylvania
Nick Campbell, Trinity College Dublin
Maxine Eskenazi, Carnegie Mellon University
Massimo Poesio, University of Essex, University of Trento
Stephanie Strassel, Linguistic Data Consortium, University of Pennsylvania
Jonathan Wright, Linguistic Data Consortium, University of Pennsylvania

216 Preface/Introduction

Despite more than two decades of effort from many research groups and large data centers, the supply of LRs falls far short of need even in the languages with the greatest number of speakers, controlling the largest shares of the world economy. For languages with less international recognition, resources are scarce, fragmentary or absent. Recent programs such as DARPA LORELEI recognize and attempt to address this gap, but even they will provide only core resources for a few dozen languages, a small proportion of the more than 7000 currently in use worldwide.

In Language Resource (LR) development, the commonest incentives for contributors are monetary. Whether motivated by convenience or ethical beliefs, that bias limits the Human Language Technology (HLT) community’s ability to collect data and to understand how different incentives impact collection. Because linguistic innovation is effectively limitless, relying upon a limited resource, monetary compensation, to generate the data needed to document the world’s languages is certain to fall short. Instead, LR developers and users must develop and employ incentives that scale beyond the budget of a 3- or 5-year program.

A few HLT projects have employed alternate incentives. Phrase Detectives provides entertainment, challenge and access to interesting reading in exchange for anaphora annotation. Herme gave participants the unusual experience of interacting verbally with a tiny, cute robot while recording their interactions. Let’s Go mediates access to Pittsburgh Port Authority Transit bus schedules and route information while recording the interactions to improve system performance in real world situations, especially for ‘extreme’ users such as non-native and elderly speakers.

However, outside our field, collections employ variable incentives to much greater effect, creating massive data resources. LibriVox offers contributors the chance to create audio recordings of classic works of literature, develop their skills as readers and voice actors, work within a community of similarly minded volunteers and enable access for the blind, illiterate and others. Zooniverse includes linguistic exercises such as the transcription of originally handwritten birdwatching journals and artists' diaries, or of the typewritten labels of insect collections. Social media has employed a wide range of incentives, including:

• access to information and entertainment
• possibilities for self-expression, sharing and publicizing intellectual or creative work
• chances to vent frustrations or convey thoughts, sometimes anonymously
• forums for socializing; exercises which develop competence that may lead to new prospects
• competition, status, prestige, and recognition
• payment or discounts in real and virtual worlds
• access to services and infrastructure based on contributions
• novel experiences and improved interactions, for example in a customer service encounter
• opportunities to contribute to a greater cause or good

While lagging behind in the use of novel incentives, HLT researchers have productively used crowdsourcing to lower collection and annotation costs and have developed techniques for customizing tasking to meet the capacity of the crowd and for fusing highly variable results into data sets that advance technology development. Similar techniques apply to the use of alternate incentives in collecting data from a non-traditional workforce.
This half-day workshop will open the discussion on incentives in data collection, describing novel approaches and comparing them with traditional monetary incentives. Related topics include: descriptions of projects that use the alternate incentives listed above or others; modifications of the data collection and annotation tasking or workflow to accommodate a new workforce, including crowdsourcing; and techniques for exploiting the results of alternate incentives and novel workflows.

217 Introduction by Workshop Chair Saturday 28 May, 14:00 – 14:25 Session Chairperson: Liberman

Novel Incentives in Language Resource Development

Christopher Cieri

The gap between supply of and demand for Language Resources continues to impede progress in linguistic research and technology development, even in the face of immense international effort to create the requisite data and tools. This deficiency affects all languages in some way, even those with worldwide economic and political influence, though for most of the world’s 7000 linguistic varieties the absence is acute. Current approaches cannot hope to meet the resource demand for even a reasonable subset of the languages currently spoken because they seek to document phenomena of great variability using resources such as national funding that are highly constrained in terms of amount, duration and scope. This paper describes efforts to augment the traditional incentives of monetary compensation with alternate incentives in order to elicit greater contributions of linguistic data, metadata and annotation. It also touches on the adjustments to workforces, workflows and post-processing needed to collect and exploit data elicited under novel incentives.

Novel Incentives in Data Collection and Requisite Processing Saturday 28 May, 14:25 – 15:15 Session Chairperson: Cieri

Herme & Beyond: the Collection of Natural Speech Data

Nick Campbell This paper describes our approach to the collection of ‘natural’ (i.e., representative) data from spoken interactions in a social setting in the context of the development (through time) of expressive speech synthesis. Over the past ten years or so, we have collected several corpora of unprompted social conversations that illustrate the ‘contact’ element of speech that was lacking in many of the corpora collected by use of a specific ‘task’ with paid participants. The paper discusses the technical and ethical issues of collecting such spoken material, and highlights some of the problems we have encountered in the processing of this much-needed data. Through the use of attractive conversational devices, we have found that natural human curiosity, and an element of social programming combine to provide us with a rich source of material that complements the task-based collections from paid informants.

FKC Corpus: a Japanese Corpus from New Opinion Survey Service

Kensuke Mitsuzawa, Maito Tauchi, Mathieu Domoulin, Masanori Nakashima, Tomoya Mizumoto In this paper, we present the FKC corpus, which comes from the Fuman Kaitori Center (FKC). The FKC is a Japanese consumer opinion data collection and analysis service. The main advantage of the FKC is the system that awards greater points to user input containing more information, which encourages users to input categorical information. Thanks to this system, the FKC corpus contains consumers’ opinions with abundant category information and user demographics, and is considered to serve multiple NLP tasks: opinion mining, author inferring and sentiment classification. The FKC corpus consists of 254,683 posts coming from 25,092 users. All posts are checked by annotators who work for the FKC via crowdsourcing. The posts in the FKC corpus mainly come from mobile devices, and one third of them are about products or events related to daily life. We also show some correlations between point incentives and users’ motivations, which keep them posting their opinions with abundant category information. The FKC corpus is available under an original license of the FKC. Currently, the FKC gives permission for direct use; thus, those who hope to use the FKC corpus need to send their request to the first author.

Novel Incentives and Workflows for Annotation Saturday 28 May, 15:15 – 16:05 Session Chairperson: Cieri

A Fun and Engaging Interface for Crowdsourcing Named Entities

Kara Greenfield, Kelsey Chan, Joseph P. Campbell There are many current problems in natural language processing that are best solved by training algorithms on an annotated in-language, in-domain corpus. The more representative the training corpus is of the test data, the better the algorithm will perform, but also the less likely it is that such a corpus has already been annotated. Annotating corpora for natural language processing tasks is typically a time-consuming and expensive process. In this paper, we provide a case study in using crowdsourcing to curate an in-domain corpus for named entity recognition, a common problem in natural language processing. In particular, we present our use of fun, engaging user interfaces as a way to entice workers to partake in our crowdsourcing task while avoiding inflating our payments in a way that would attract more mercenary workers than conscientious ones. Additionally, we provide a survey of alternate interfaces for collecting annotations of named entities and compare our approach to those systems.

Novel Incentives for Phrase Detectives

Massimo Poesio, Jon Chamberlain, Udo Kruschwitz and Chris Madge The Phrase Detectives Game-With-A-Purpose for anaphoric annotation is a moderately successful example of use of novel incentives to create resources for computational linguistics. In this paper we summarize the Phrase Detectives experience in terms of incentives and discuss our future plans to improve such incentives.

Understanding and Exploiting Data from Alternative Sources Saturday 28 May, 16:30 – 17:20 Session Chairperson: Cieri

Evaluation of Anchor Texts for Automated Link Discovery in Semi-structured Web Documents

Na’im Tyson, Jonathan Roberts, Jeff Allen, Matt Lipson Using an English noun phrase grammar defined by Hulth (2004a) as a starting point, we created an English noun phrase chunker to extract anchor text candidates identified within web-based articles. These phrases served as candidates for anchor texts linking articles within the About.com network of content sites. Freelance writers—serving as annotators with little to no training outside the domain authority of their respective fields—evaluated articles that received these machine-generated anchor texts using an annotation environment. Unlike other large-scale linguistic annotation projects, where annotators receive an evaluation based on a reference corpus, there was not sufficient time or funding to create a corpus of documents for anchor text comparisons amongst the annotators—thereby complicating the computation of inter-labeler agreement. Instead of using a reference corpus, we assumed that the anchor text generator was another annotator. We then computed the average Cohen’s Kappa Coefficient (Landis and Koch, 1977) across all pairings of the anchor text generator and an annotator. Our approach showed a fair agreement level on average (as described in Pustejovsky and Stubbs (2013, pp. 131–132)).
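The agreement computation described in the abstract above, treating the anchor text generator as one more annotator and averaging pairwise kappa, can be sketched as follows. This is a hypothetical illustration, not the authors' code; the annotator names and accept/reject labels are invented, and scikit-learn's cohen_kappa_score is assumed to be available.

```python
# Hypothetical sketch: average pairwise Cohen's kappa between the anchor
# text generator and each human annotator (labels are illustrative only).
from sklearn.metrics import cohen_kappa_score

# One accept/reject label sequence per "annotator"; the generator is treated
# as an additional annotator, as described in the abstract.
judgements = {
    "generator":   [1, 1, 0, 1, 0, 1],
    "annotator_A": [1, 0, 0, 1, 0, 1],
    "annotator_B": [1, 1, 0, 0, 0, 1],
}

# Pair the generator with every human annotator and average the kappas.
pairs = [("generator", a) for a in judgements if a != "generator"]
kappas = [cohen_kappa_score(judgements[x], judgements[y]) for x, y in pairs]
print(sum(kappas) / len(kappas))  # interpreted against the Landis & Koch scale
```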

Unconventional Approaches to Gathering and Sharing Resources for Spoken Dialog Research

Maxine Eskenazi, Sungjin Lee, Tiancheng Zhao, Ting Yao Hu, Alan W Black The DialRC and DialPort projects have employed unconventional approaches to data gathering and resource sharing. The projects started sharing by distributing the speech, transcription and logfile data gathered by the Let’s Go system. That system has responded to over 220,000 calls from real users of the Allegheny County Port Authority. The Let’s Go platform proved to be a very successful way to run studies, with a dataflow of about 1300 dialogs per month. Thus, DialRC built a research platform that was used by other researchers, enabling them to run studies with the Let’s Go real users. Challenges were also run on this platform. Finally, DialPort follows in the footsteps of DialRC by creating a spoken dialog portal with real users that other dialog systems can be connected to. This paper examines the impact that these activities have had on the spoken dialog research community.

The Future of Incentives, Workforces, Workflows and Data Exploitation Saturday 28 May, 17:20 – 18:00 Session Chairperson: Cieri

Oral Histories: Linguistic Documentation as Social Media

Mark Liberman Oral history recordings were pioneered by anthropologists in the early 20th century, collected by Alan Lomax and by the Federal Writers’ Project during the 1930s and 1940s, and popularized by authors like Oscar Lewis and Studs Terkel in the 1950s and 1960s. Inexpensive tape recorders allowed the form to spread in the 1960s and 1970s. Now a new opportunity is provided by the combination of ubiquitous multimedia-capable digital devices, inexpensive mass storage, and universally accessible networking. The potential popularity of oral history-like recordings is demonstrated by the tens of thousands of people who have made recordings for StoryCorps. However, there is still no easy way for a motivated group – a family, an athletic group, a school class, a business, a scholarly discipline, a club, a church – to create and publish a collection of oral histories or similar forms of cultural documentation. But the software required to make and edit such recordings, and to transcribe, index, document, publish, and comment on them, is relatively simple and easy to create. And with the right infrastructure, millions of people around the world would participate, creating linguistic and cultural documentation on an unprecedented scale.

Normalisation and Analysis of Social Media Texts (NormSoMe)

28 May 2016

ABSTRACTS

Editors:

Andrius Utka, Jurgita Vaičenonienė, Rita Butkienė

Workshop Programme

28 May, 2016 (morning session)

9:00 – 9:05 – Introduction to the Workshop (by Andrius Utka)

9:05 – 9:45 – Session 1 (Chair: Martin Volk)
Torsten Zesch (keynote speech): Your noise is my research question! – Limitations of normalizing social media data

9:45 – 10:35 – Session 2 (Chair: Michi Amsler)
Judit Ács, József Halmi: Hunaccent: Small Footprint Diacritic Restoration for Social Media
Andrius Utka, Darius Amilevičius: Normalisation of Lithuanian Social Media Texts: Towards Morphological Analysis of User-Generated Comments

10:30 – 11:00 Coffee break

11:00 – 13:00 – Session 3 (Chair: Andrius Utka)
Jaka Čibej, Darja Fišer, Tomaž Erjavec: Normalisation, Tokenisation and Sentence Segmentation of Slovene Tweets
Hans van Halteren, Nelleke Oostdijk: Listening to the Noise: Model Improvement on the Basis of Variation Patterns in Tweets
Rob van der Goot: Normalizing Social Media Texts by Combining Word Embeddings and Edit Distances in a Random Forest Regressor
Ronja Laarmann-Quante, Stefanie Dipper: An Annotation Scheme for the Comparison of Different Genres of Social Media with a Focus on Normalization
Tatjana Scheffler, Elina Zarisheva: Dialog Act Recognition for Twitter Conversations

Workshop Organizing Committee

Andrius Utka – Vytautas Magnus University, Kaunas
Jolanta Kovalevskaitė – Vytautas Magnus University, Kaunas
Danguolė Kalinauskaitė – Vytautas Magnus University, Kaunas
Martin Volk – University of Zurich
Rita Butkienė – Kaunas University of Technology
Jurgita Vaičenonienė – Vytautas Magnus University, Kaunas

Workshop Programme Committee

Darius Amilevičius – Vytautas Magnus University, Kaunas
Michi Amsler – University of Zurich
Loic Boizou – Vytautas Magnus University, Kaunas
Gintarė Grigonytė – Stockholm University
Jurgita Kapočiūtė-Dzikienė – Vytautas Magnus University, Kaunas
Tomas Krilavičius – Vytautas Magnus University, Kaunas
Joakim Nivre – Uppsala University
Raivis Skadinš – Tilde, Riga, Latvia
Andrius Utka – Vytautas Magnus University, Kaunas
Martin Volk – University of Zurich

Preface

Social media texts provide large quantities of interesting and useful data as well as new challenges for NLP. Social media texts include chats, online commentaries, reviews, blogs, emails, forums, and other genres. Typically, the texts are informal and notoriously noisy. Thus, many NLP tools have difficulties processing and normalizing the data.

As English social media has been investigated most widely, we also invited papers on other languages, especially those rich in inflections and diacritics, which cause additional processing problems. In the programme of NormSoMe, besides English, papers on Dutch, German, Hungarian, Lithuanian, and Slovene are included.

The workshop is aimed at researchers who have solutions, insights, and ideas for tackling the processing of social media texts, or who are interested in this field of research.

Session 2 Date / Time [Saturday 28 May, 9:45 – 10:35] Chairperson: Michi Amsler

Hunaccent: Small Footprint Diacritic Restoration for Social Media

Judit Ács, József Halmi

We present a language-independent method for automatic diacritic restoration. The method focuses on low computational resource usage, making it suitable for mobile devices. We train a decision tree classifier on character-based features without involving a dictionary. Since our features require at most a few characters of context, this approach can be applied to very short text segments such as tweets and text messages. We test the method on a Hungarian web corpus and on Hungarian Facebook comments. It achieves state-of-the-art results on web data and over 92% on Facebook comments. A C++ implementation for Hungarian diacritics is publicly available; support for other languages is under development.
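To illustrate the character-window, dictionary-free setup described above, here is a minimal sketch in Python (not the authors' C++ implementation). The toy training pairs, the window size and the use of a scikit-learn decision tree are illustrative assumptions only.

```python
# Minimal sketch of character-window diacritic restoration with a decision
# tree, in the spirit of the approach above (not the actual Hunaccent code).
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_extraction import DictVectorizer

def window_features(text, i, size=3):
    """Features: the characters surrounding position i (padded with '_')."""
    padded = "_" * size + text.lower() + "_" * size
    j = i + size
    return {f"c{k}": padded[j + k] for k in range(-size, size + 1)}

# Toy training data: deaccented word -> correct (possibly accented) form.
train_pairs = [("szolo", "szőlő"), ("haz", "ház"), ("viz", "víz")]
X, y = [], []
for plain, gold in train_pairs:
    for i in range(len(plain)):
        X.append(window_features(plain, i))
        y.append(gold[i])

vec = DictVectorizer()
clf = DecisionTreeClassifier().fit(vec.fit_transform(X), y)

def restore(text):
    feats = [window_features(text, i) for i in range(len(text))]
    return "".join(clf.predict(vec.transform(feats)))

print(restore("szolo"))
```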

Normalisation of Lithuanian Social Media Texts: Towards Morphological Analysis of User- Generated Comments

Andrius Utka, Darius Amilevičius

In this paper, we present preliminary research on the normalisation of Lithuanian social media texts. Specifically, the paper deals with language normalisation issues in Lithuanian user-generated comments on three popular websites: Lietuvos Rytas (Lithuanian Morning), Verslo žinios (Business News), and Delfi.lt. We have established the proportion of out-of-vocabulary (OOV) words in the dataset by using a standard Lithuanian tokenizer and a morphological analyser from the newly developed Information System for Semantic and Syntactic Analysis of the Lithuanian Language (LKKSAIS). A detailed qualitative analysis of extracted OOV words is presented, where specific aspects of Lithuanian social media texts are determined: namely, the extent of missing diacritics, as well as other prevalent error types. A standard Lithuanian spell checker is used for the restoration of missing diacritics and correction of other errors in user-generated comments, which considerably improves the morphological analysis.

Session 3 Date / Time [Saturday 28 May, 11:00 – 13:00] Chairperson: Andrius Utka

Normalisation, Tokenisation and Sentence Segmentation of Slovene Tweets

Jaka Čibej, Darja Fišer, Tomaž Erjavec

Online user-generated content such as posts on social media, blogs, and forums is becoming an increasingly important source of information, as shown by numerous rapidly growing NLP fields such as sentiment analysis and data mining. However, user-generated content is well-known to contain a significant degree of noise, e.g. abbreviations, missing spaces, as well as non-standard spelling, lexis, and use of punctuation. All this hinders the effectiveness of NLP tools when processing such data, and to overcome this obstacle, data normalisation is required. In this paper, we present a training set that will be used to improve the tokenisation, normalisation, and sentence segmentation of Slovene tweets. We describe some of the most Twitter-specific aspects of our annotation guidelines as well as the workflow of our annotation campaign, the goal of which was to create a manually annotated gold-standard dataset of 4,000 tweets extracted from the JANES corpus of Internet Slovene.

Listening to the Noise: Model Improvement on the Basis of Variation Patterns in Tweets

Hans van Halteren, Nelleke Oostdijk

In this paper, we take the view that the wide diversity in the language (use) found on Twitter can be explained by the fact that language use varies between users and from one use situation to another: what users are tweeting about and to what audience will influence the choices users make. We propose to model the language use of Twitter tribes, i.e. peer groups of users tweeting in different use situations. We argue that the use of tribal models can improve the modeling of the substantial variation present in Twitter (and other social media), and that the resulting models can be used in the normalization of text for NLP tasks. In our discussion of variation at the linguistic levels of orthography, spelling, and syntax, we give numerous examples of various types of variation, and indicate how tribal models could help process text in which such variation occurs. All examples are derived from our own experience with the Dutch part of Twitter, for which we could draw on a multi-billion-word dataset.

Normalizing Social Media Texts by Combining Word Embeddings and Edit Distances in a Random Forest Regressor

Rob van der Goot

In this work, we adapt the traditional framework for spelling correction to the more novel task of normalization of social media content. To generate possible normalization candidates, we complement the traditional approach with a word embeddings model. To rank the candidates we use a random forest regressor, combining the features from the generation with some N-gram features. The N-gram model contributes significantly to the model, because no other features account for short-distance relations between words. A random forest regressor fits this task very well, presumably because it can model the different types of corrections. Additionally, we show that 500 annotated sentences should be enough training data to train this system reasonably well on a new domain. Our proposed system performs slightly worse than the state of the art. The main advantage is the simplicity of the model, allowing for easy expansions.
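A rough sketch of the candidate-ranking idea described above is given below. It is not the author's system: the candidate words, feature values and use of scikit-learn's RandomForestRegressor are assumptions for illustration; in practice the embedding similarities and N-gram scores would come from models trained on social media data.

```python
# Illustrative sketch (not the paper's system): rank normalization candidates
# for a noisy token by combining features in a random forest regressor.
from difflib import SequenceMatcher
from sklearn.ensemble import RandomForestRegressor

def edit_similarity(a, b):
    # Cheap stand-in for an edit-distance feature.
    return SequenceMatcher(None, a, b).ratio()

def features(noisy, candidate, embed_sim, ngram_logprob):
    # embed_sim would come from a word-embeddings model; ngram_logprob from an
    # n-gram language model (both assumed to exist, values here are invented).
    return [edit_similarity(noisy, candidate), embed_sim, ngram_logprob]

# Toy training data: (noisy token, candidate, embedding sim, LM score) -> gold label.
train = [
    (("2moro", "tomorrow", 0.8, -2.1), 1.0),
    (("2moro", "tumor", 0.3, -5.7), 0.0),
    (("gr8", "great", 0.7, -1.9), 1.0),
    (("gr8", "grate", 0.4, -4.2), 0.0),
]
X = [features(*x) for x, _ in train]
y = [label for _, label in train]
ranker = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Rank candidates for a new noisy token by predicted correctness score.
candidates = [("tonight", 0.75, -2.0), ("tons", 0.2, -4.8)]
scored = [(c, ranker.predict([features("2nite", c, e, p)])[0]) for c, e, p in candidates]
print(max(scored, key=lambda t: t[1]))
```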

An Annotation Scheme for the Comparison of Different Genres of Social Media with a Focus on Normalization

Ronja Laarmann-Quante, Stefanie Dipper

Most work on automatic normalization of social media data is restricted to a specific communication medium and often guided by noncomprehensive notions of which phenomena have to be normalized. This paper aims to shed light on the questions (a) what kinds of deviations from the ‘standard’ can be found in German social media and (b) how these differ across different genres of computer-mediated communication (CMC). To study these issues systematically, we propose a comprehensive annotation scheme which categorizes ‘nonstandard’ (defined as out-of-vocabulary, OOV) tokens of various genres of CMC with a focus on the challenges they pose to automatic normalization. In a pilot study, we achieved a high inter-annotator agreement (Cohen’s κ > .8), which suggests good applicability of the scheme. Primary results indicate that the predominant phenomena are rather diverse across genres and, furthermore, that in some genres, general OOV-tokens, which are not CMC-specific (like named entities or regular non-listed words), play a more dominant role than one might guess at first sight.

Dialog Act Recognition for Twitter Conversations

Tatjana Scheffler, Elina Zarisheva

In this paper, we present our approach to dialog act classification for German Twitter conversations. In contrast to previous work, we took the entire conversation context into account and classified individual segments within tweets (a tweet can contain more than one segment). In addition, we used fine-grained dialog act annotations with a taxonomy of 56 categories. We trained three classifiers with different feature sets. The best results are achieved with CRF sequence taggers. For the full DA taxonomy, we achieved an f-measure of up to 0.31, for the reduced taxonomy (12 DAs), up to 0.51, and minimal taxonomy (8 DAs), 0.72, showing that dialog act recognition on Twitter conversations is quite reliable for small taxonomies. The results improve on previous work on speech act classification for social media posts. The improvement is due to two factors: (i) Our classifiers explicitly model the sequential structure of conversations, whereas previous approaches classify individual social media posts without taking dialog structure into account. (ii) We segmented tweets into utterances first (segmentation is not a part of this work), while all previous approaches assign exactly one speech act to a post. In our corpus, over 30% of tweets consist of several speech acts.
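To illustrate how a sequence tagger can exploit the conversational context in the way described above, here is a minimal sketch using the sklearn-crfsuite package. The toy conversation, feature set and labels are invented and do not reflect the authors' 56-category taxonomy or their actual features.

```python
# Minimal sketch of CRF-based dialog act tagging over a tweet conversation,
# assuming the sklearn-crfsuite package; segments and labels are invented.
import sklearn_crfsuite

def segment_features(conv, i):
    seg = conv[i]
    feats = {
        "lower": seg.lower(),
        "has_question_mark": "?" in seg,
        "has_mention": "@" in seg,
        "position": i,
    }
    if i > 0:
        feats["prev_lower"] = conv[i - 1].lower()  # model the dialog context
    return feats

# Each training instance is one conversation: a sequence of segments with DAs.
conversations = [
    ["Kommst du heute?", "Ja, bin um 8 da.", "Super, danke!"],
]
labels = [["question", "answer", "thanking"]]

X = [[segment_features(c, i) for i in range(len(c))] for c in conversations]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, labels)
print(crf.predict(X))
```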

4th Workshop on Challenges in the Management of Large Corpora

28 May 2016

ABSTRACTS

Editors:

Piotr Bański, Adrien Barbaresi, Hanno Biber, Evelyn Breiteneder, Simon Clematide, Marc Kupietz, Harald Lüngen, Andreas Witt

Workshop Programme 28 May 2016

14:00-16:00 – Session A

Introduction

Jochen Tiepmar, CTS Text Miner – Text Mining Framework based on the Canonical Text Services Protocol

Jelke Bloem, Evaluating Automatically Annotated Treebanks for Linguistic Research

Marcin Junczys-Dowmunt, Bruno Pouliquen and Christophe Mazenc, COPPA V2.0: Corpus of Parallel Patent Applications. Building Large Parallel Corpora with GNU Make

Johannes Graën, Simon Clematide and Martin Volk, Efficient Exploration of Translation Variants in Large Multiparallel Corpora using a Relational Database

16:00-16:30 Coffee break

16:30-18:00 – Session B

Adrien Barbaresi, Collection and Indexation of Tweets with a Geographical Focus

Ruxandra Cosma, Dan Cristea, Marc Kupietz, Dan Tufiș and Andreas Witt, DRuKoLA – Towards Contrastive German-Romanian Research based on Comparable Corpora

Svetla Koeva, Ivelina Stoyanova, Maria Todorova, Svetlozara Leseva and Tsvetana Dimitrova, Metadata Extraction, Representation and Management within the Bulgarian National Corpus

Closing Remarks

Editors/Workshop Organizers

Piotr Bański, Marc Kupietz, Harald Lüngen, Andreas Witt – Institut für Deutsche Sprache, Mannheim

Adrien Barbaresi, Hanno Biber, Evelyn Breiteneder – Institute for Corpus Linguistics and Text Technology, Vienna

Simon Clematide – Institute of Computational Linguistics, Zurich

Workshop Programme Committee

Steve Cassidy – Macquarie University
Damir Ćavar – Indiana University, Bloomington
Isabella Chiari – Sapienza University of Rome
Dan Cristea – "Alexandru Ioan Cuza" University of Iaşi
Václav Cvrček – Charles University Prague
Koenraad De Smedt – University of Bergen
Tomaž Erjavec – Jožef Stefan Institute
Andrew Hardie – Lancaster University
Serge Heiden – ENS de Lyon
Nancy Ide – Vassar College
Miloš Jakubíček – Lexical Computing Ltd.
Piotr Pęzik – University of Łódź
Uwe Quasthoff – Leipzig University
Paul Rayson – Lancaster University
Laurent Romary – INRIA, DARIAH
Roland Schäfer – FU Berlin
Serge Sharoff – University of Leeds
Marko Tadić – University of Zagreb, Faculty of Humanities and Social Sciences
Ludovic Tanguy – University of Toulouse
Dan Tufiş – Romanian Academy, Bucharest
Tamás Váradi – Research Institute for Linguistics, Hungarian Academy of Sciences

Workshop Homepage http://corpora.ids-mannheim.de/cmlc-2016.html

Preface/Introduction

Creating very large corpora no longer appears to be a challenge. With the constantly growing amount of born-digital text – be it available on the web or only on the servers of publishing companies – and with the rising number of printed texts digitized by public institutions or technological giants such as Google, we may safely expect the upper limits of text collections to keep increasing for years to come. Although some of this was already true 20 years ago, we have a strong impression that the challenge has now shifted from an increase in terms of size to the effective and efficient processing of the large amounts of primary data and much larger amounts of annotation data.

On the one hand, some fundamental technical methods and strategies call for re-evaluation. These include, for example, efficient and sustainable curation of data, management of collections that span multiple volumes or that are distributed across several centres, innovative corpus architectures that maximize the usefulness of data, and techniques that allow for efficient search and analysis.

On the other hand, the new challenges require research into language-modelling methods and new corpus-linguistic methodologies that can make use of extremely large, structured datasets. These methodologies must re-address the tasks of investigating rare phenomena involving multiple lexical items, of finding and representing fine-grained sub-regularities, and of investigating variations within and across language domains. This should be accompanied by new methods to structure both content and search results, in order to, among others, cope with false positives, assess data quality, or ensure interoperability. Another much-needed research goal is visualization techniques that facilitate the interpretation of results and formulation of new hypotheses.

Due to the interest that the first meeting of CMLC (at LREC 2012 in Istanbul) enjoyed, the workshop has become a cyclic event. The second meeting took place at LREC again, in 2014 in Reykjavík; the third edition of CMLC was part of Corpus Linguistics 2015 in Lancaster. The coming fourth meeting will take place in Portorož, Slovenia, as part of LREC-2016.


Session A Saturday, 28th May, 14:00 – 16:00

CTS Text Miner – Text Mining Framework based on the Canonical Text Services Protocol Jochen Tiepmar

The purpose of this paper is to describe a modular framework for text mining that uses Canonical Text Service (CTS) as a data source. By combining standardized functionalities with standardized access to text data, this framework intends to reduce the heterogeneity of workflows in today’s Digital Humanities and act as an important element of a text research infrastructure. For this work the implementation of the CTS protocol described in (Tiepmar, 2015) is used. It uses advanced functionalities that are not part of the specifications of CTS. This means that, while most current modules should work with different implementations of the CTS protocol, it cannot be guaranteed that any future module will work.

Evaluating Automatically Annotated Treebanks for Linguistic Research Jelke Bloem

This study discusses evaluation methods for linguists to use when employing an automatically annotated treebank as a source of linguistic evidence. While treebanks are usually evaluated with a general measure over all the data, linguistic studies often focus on a particular construction or a group of structures. To judge the quality of linguistic evidence in this case, it would be beneficial to estimate annotation quality over all instances of a particular construction. I discuss the relative advantages and disadvantages of four approaches to this type of evaluation: manual evaluation of the results, manual evaluation of the text, falling back to simpler annotation and searching for particular instances of the construction. Furthermore, I illustrate the approaches using an example from Dutch linguistics, two-verb cluster constructions, and estimate precision and recall for this construction on a large automatically annotated treebank of Dutch. From this, I conclude that a combination of approaches on samples from the treebank can be used to estimate the accuracy of the annotation for the construction of interest. This allows researchers to make more definite linguistic claims on the basis of data from automatically annotated treebanks.

COPPA V2.0: Corpus of Parallel Patent Applications. Building Large Parallel Corpora with GNU Make Marcin Junczys-Dowmunt, Bruno Pouliquen and Christophe Mazenc

WIPO seeks to help users and researchers to overcome the language barrier when searching patents published in different languages. Having collected a big multilingual corpus of translated patent applications, WIPO decided to share this corpus in a product called COPPA (Corpus Of Parallel Patent Applications) to stimulate research in machine translation and in language tools for patent texts. A first version was released in 2011 but contained only French and English. It has been decided to release a major update of this product containing newer data (from 2011 up to 2014) but also other languages (German, English, French, Japanese, Korean, Portuguese, Spanish, Russian and Chinese). This corpus can be used for terminology extraction, cross-language information retrieval or statistical machine translation. With the new version a huge number of files (more than 26 million) has to be processed. We describe the technical process in detail.


Efficient Exploration of Translation Variants in Large Multiparallel Corpora using a Relational Database Johannes Graën, Simon Clematide and Martin Volk

We present an approach for searching and exploring translation variants of multi-word units in large multiparallel corpora based on a relational database management system. Our web-based application Multilingwis, which allows for multilingual lookups of phrases and words in English, French, German, Italian and Spanish, is of interest to anybody who wants to quickly compare expressions across several languages, such as language learners without linguistic knowledge. In this paper, we focus on the technical aspects of how to represent and efficiently retrieve all occurrences that match the user’s query in one of five languages simultaneously with their translations into the other four languages. In order to identify such translations in our corpus of 220 million tokens in total, we use statistical sentence and word alignment. By using materialized views, composite indexes, and pre-planned search functions, our relational database management system handles large result sets with only moderate requirements on the underlying hardware. As our systematic evaluation on 200 search terms per language shows, we can achieve retrieval times below 1 second in 75% of the cases for multi-word expressions.
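As a rough illustration of how a relational schema with a composite index can support this kind of lookup, the following toy sketch uses SQLite rather than the full RDBMS setup (materialized views and pre-planned functions) described in the abstract; the schema, tokens and alignments are invented.

```python
# Toy sketch (invented schema, SQLite instead of the paper's RDBMS) showing how
# a composite index can support looking up a word with its aligned translations.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE token (id INTEGER PRIMARY KEY, lang TEXT, sent INTEGER, surface TEXT);
CREATE TABLE alignment (src INTEGER, tgt INTEGER);             -- word alignment pairs
CREATE INDEX idx_token_lang_surface ON token (lang, surface);  -- composite index
""")
con.executemany("INSERT INTO token VALUES (?,?,?,?)", [
    (1, "en", 1, "however"), (2, "de", 1, "allerdings"), (3, "fr", 1, "cependant"),
])
con.executemany("INSERT INTO alignment VALUES (?,?)", [(1, 2), (1, 3)])

# All translations of an English surface form, retrieved via the alignment table.
rows = con.execute("""
SELECT t2.lang, t2.surface
FROM token t1
JOIN alignment a ON a.src = t1.id
JOIN token t2    ON t2.id = a.tgt
WHERE t1.lang = 'en' AND t1.surface = 'however'
""").fetchall()
print(rows)
```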

Session B Saturday, 28th May, 16:30 – 18:00

Collection and Indexation of Tweets with a Geographical Focus Adrien Barbaresi

This paper introduces a Twitter corpus currently focused geographically in order to (1) test selection and collection processes for a given region and (2) find a suitable database to query, filter, and visualize the tweets. Due to access restrictions, it is not possible to retrieve all available tweets, which is why corpus construction implies a series of decisions described below. The corpus focuses on Austrian users, as data collection is grounded in a two-tier detection process addressing corpus construction and user location issues. The emphasis lies on short messages whose sender mentions a place in Austria as his/her hometown or tweets from places located in Austria. The resulting user base is then queried and enlarged using focused crawling and random sampling, so that the corpus is refined and completed in the manner of a monitor corpus. Its current volume is 21.7 million tweets from approximately 125,000 users. The tweets are indexed using Elasticsearch and queried via the Kibana frontend, which allows for queries on metadata as well as for the visualization of geolocalized tweets (currently about 3.3% of the collection).

DRuKoLA – Towards Contrastive German-Romanian Research based on Comparable Corpora Ruxandra Cosma, Dan Cristea, Marc Kupietz, Dan Tufiș and Andreas Witt

This paper introduces the recently started DRuKoLA-project that aims at providing mechanisms to flexibly draw virtual comparable corpora from the DeReKo and the Reference Corpus of Contemporary Romanian Language CoRoLa in order to use these virtual corpora as empirical basis for contrastive linguistic research.

Metadata Extraction, Representation and Management within the Bulgarian National Corpus Svetla Koeva, Ivelina Stoyanova, Maria Todorova, Svetlozara Leseva and Tsvetana Dimitrova

This paper presents the extraction, representation and management of metadata in the Bulgarian National Corpus. We briefly present the current state of the Corpus and the general principles on which its development rests: uniformity, diversity of text samples, automatic compilation, extensive metadata, multi-layered linguistic annotation. The relevant information for the texts in the Corpus is stored into different types of metadata categories: administrative, editorial, structural, descriptive, classificational, analytical, and statistical metadata. The structure and the design of the Bulgarian National Corpus is flexible and can incorporate new metadata categories and values. Further, we discuss some of the automatic procedures for extraction of metadata applied in the compilation of the Bulgarian National Corpus: (i) metatextual techniques – extracting information from the HTML/XML markup of the original files through a combination of automatic and manual procedures; and (ii) textual techniques – applying text analysis and heuristics using a set of language resources. We briefly present the MetadataEditor – a tool for manual metadata editing and verification. Directions for future work on the extraction, representation and management of metadata include development of more advanced techniques for language processing, domain-specific analysis, and verification procedures.

Just Talking – Casual Talk among Humans and Machines

28 May 2016

ABSTRACTS

Editors:

Emer Gilmartin, Nick Campbell

Workshop Programme

28 May 2016

09:00 – 09:10 – Introduction by Workshop Chair

09:10 – 10:30 – Paper Session 1
Jens Allwood and Elisabeth Ahlsen, Small talk and its role in different social activities - a corpus based analysis
Stefan Olafsson and Timothy Bickmore, “That reminds me...”: Towards a Computational Model of Topic Development Within and Across Conversations
Hanae Koiso, Yayoi Tanaka, Ryoko Watanabe and Yasuharu Den, A Large-Scale Corpus of Everyday Japanese Conversation: On Methodology for Recording Naturally Occurring Conversations
Saturnino Luz, Nick Campbell and Fasih Haider, Data Collection Using a Real Time Feedback Tool for Non Verbal Presentation Skills Training

10:30 – 11:00 Coffee break

11:00 – 12:00 – Paper Session 2
Katri Hiovain and Kristiina Jokinen, Different Types of Laughter in North Sami Conversational Speech
Kevin El Haddad, Huseyin Cakmak, Stéphane Dupont and Thierry Dutoit, Laughter and Smile Processing for Human-Computer Interactions
Emer Gilmartin, Ketong Su, Yuyun Huang, Kevin El Haddad, Christy Elias, Benjamin R. Cowan and Nick Campbell, Making Idle Talk: Designing and Implementing Casual Talk


12:00 – 12:45 – Discussion Session – Getting Natural Talk

Workshop Organizers

Nick Campbell – Trinity College, Dublin
Emer Gilmartin – Trinity College, Dublin
Laurence Devillers – LIMSI, Paris
Sophie Rosset – LIMSI, Paris
Guillaume Dubuisson Duplessis – LIMSI, Paris

Workshop Programme Committee

Nick Campbell – Trinity College, Dublin
Emer Gilmartin – Trinity College, Dublin
Laurence Devillers – LIMSI, Paris
Sophie Rosset – LIMSI, Paris
Guillaume Dubuisson Duplessis – LIMSI, Paris

Introduction

This workshop focusses on the collection and analysis of resources, novel research, and applications in both human-human and human-machine casual interaction. A major distinction between different types of spoken interaction is whether the goal is ‘transactional’ or ‘interactional’. Transactional, or task-based, talk has short-term goals which are clearly defined and known to the participants – as in service encounters in shops or business meetings. Task-based conversations rely heavily on the transfer of linguistic or lexical information. In technology, most spoken dialogue systems have been task-based for reasons of tractability, concentrating on practical activities such as travel planning. However, in real-life social talk there is often no obvious short term task to be accomplished through speech and the purpose of the interaction is better described as building and maintaining social bonds and transferring attitudinal or affective information – examples of this interactional talk include greetings, gossip, and social chat or small talk. A tenant’s short chat about the weather with the concierge of an apartment block is not intended to transfer important meteorological data but rather to build a relationship which may serve either of the participants in the future. Of course, most transactional encounters are peppered with social or interactional elements as the establishment and maintenance of friendly relationships contributes to task success. There is increasing interest in modelling interactional talk for applications including social robotics, education, health and companionship. In order to successfully design and implement these applications, there is a need for greater understanding of the mechanics of social talk, particularly its multimodal features. This understanding relies on relevant language resources (corpora, analysis tools), analysis, and experimental technologies. This workshop provides a focal point for the growing research community on social talk to discuss available resources and ongoing work.

Paper Session 1 Saturday 28 May, 9:10 – 10:30

Small talk and its role in different social activities - a corpus based analysis

Jens Allwood and Elisabeth Ahlsen

This study investigates the occurrence and role of small talk in a number of different social activities, based on video-recorded corpus data from the GSLC (The Gothenburg Spoken Language Corpus) which represents a broad range of different social activities. The study builds on findings from studying communication in different social activity types and compares them with respect to the occurrence, content and role of small talk. The purpose is (i) to describe the characteristics of small talk in general and (ii) to investigate whether and, in that case, in what respects the nature of small talk varies depending on the social activity where it occurs. General types of small talk are found, which can occur in most activity types, for example talk about the weather, the family or the activity at hand. Other types of small talk depend on activity specific factors, such as how formal or informal the activity is, the background and activity roles of the participants, factors in the environment and typical interaction patterns for the activity.

“That reminds me...”: Towards a Computational Model of Topic Development Within and Across Conversations

Stefan Olafsson and Timothy Bickmore

We aim to increase user engagement with dialog systems over long periods of time by developing a computational model of longitudinal topic development. Examining a corpus of interactional talk made it clear that shifts and changes in topic are not random and have different manifestations depending on how long the participants have known each other. We developed and evaluated an annotation scheme based on interactional dialog theories that will inform the creation of a computational model of topic development across conversations.

A Large-Scale Corpus of Everyday Japanese Conversation: On Methodology for Recording Naturally Occurring Conversations

Hanae Koiso, Yayoi Tanaka, Ryoko Watanabe and Yasuharu Den

In 2016, we set about building a large-scale corpus of everyday Japanese conversation – a collection of conversations embedded in naturally occurring activities in daily life. We will collect more than 200 hours of recordings over six years, publishing the corpus in 2022. Before building such a corpus, we have conducted a pilot project, whose purposes are i) to establish a corpus design for collecting various kinds of everyday conversations in a balanced manner, ii) to develop a methodology for recording naturally occurring conversations, and iii) to create a transcription system suitable for precisely and efficiently transcribing natural conversations. This paper focuses on the second issue. We first describe two types of methods for recording various kinds of conversations embedded in naturally occurring activities in everyday situations. Next, we show recording devices we use and sample data. Finally, we discuss ethical and other issues, including the collection of consent forms and questionnaires.

Data Collection Using a Real Time Feedback Tool for Non Verbal Presentation Skills Training

Saturnino Luz, Nick Campbell and Fasih Haider

This paper describes the data recording setting and systems used in the first pilot data collection activity for the METALOGUE project. The objective of this activity is to provide an opportunity for participants to use a ‘real time presentation trainer’ feedback tool during debate. In total 2 sessions have been recorded and the data is made available to all participants of the project.

Paper Session 2 Saturday 28 May, 11:00 – 12:00

Different Types of Laughter in North Sami Conversational Speech

Katri Hiovain and Kristiina Jokinen

This paper describes how various laughter types differ from each other acoustically in a North Sami conversational speech corpus collected and annotated within the DigiSami project. The laughter annotations were done with Praat and included two tag types, the first of which indicated if the laugh was a free laugh or speech-talk (laughing speech), and the second one indicating a more specific laughter type. In our study, pitch, duration and intensity were extracted for laughter bouts representing every laughter type, and the paper describes the first analysis of the data. The conversational speech of the Sami languages has not yet been systematically studied, so our analysis can be compared with the results of laughter studies conducted in other languages, while also contributing to empirical observations of the North Sami language.

Laughter and Smile Processing for Human-Computer Interactions

Kevin El Haddad, Huseyin Cakmak, Stéphane Dupont and Thierry Dutoit

This paper provides a short summary of the importance of taking into account laughter and smile expressions in Human-Computer Interaction systems. Based on the literature, we mention some important characteristics of these expressions in our daily social interactions. We describe some of our own contributions and ongoing work to this field.

Making Idle Talk: Designing and Implementing Casual Talk

Emer Gilmartin, Ketong Su, Yuyun Huang, Kevin El Haddad, Christy Elias, Benjamin R. Cowan and Nick Campbell

While casual conversation or talk is ubiquitous in human life, it has not been taxonomised or indeed recreated in spoken dialogue applications to the extent that more tractable task-based or practical dialogues have been. With the recent surge in interest in more sociable speaking systems for use in companionship, educational, and healthcare applications, there is increasing need for clearer understanding at several levels of what casual talk is and how it may be modelled. In this paper we describe our current work on exploring, creating, and evaluating human-machine casual social talk.

RE-WOCHAT: Workshop on Collecting and Generating Resources for Chatbots and Conversational Agents - Development and Evaluation

May 28th, 2016

ABSTRACTS

Editors:

Rafael E. Banchs, Ryuichiro Higashinaka, Wolfgang Minker, Joseph Mariani, David Traum

Workshop Programme

13:50 – 14:00 – Welcome Message by the Organizing Team
14:00 – 15:00 – Short Paper Session I
Data Collection for Interactive Learning through the Dialog, Miroslav Vodolán, Filip Jurčíček
A Context-aware Natural Language Generation Dataset for Dialogue Systems, Ondřej Dušek, Filip Jurčíček
Dead Man Tweeting, David Nilsson, Magnus Sahlgren, Jussi Karlgren

15:00 – 16:00 – Short Paper Session II
Chatbot Evaluation and Database Expansion via Crowdsourcing, Zhou Yu, Ziyu Xu, Alan W. Black, Alexander I. Rudnicky
Framework for the Formulation of Metrics for Conversational Agent Evaluation, Mohammed Kaleem, Omar Alobadi, James O’Shea, Keeley Crockett
Tracking Conversational Gestures of Extraverts and Introverts in Multimodal Interaction, David Novick, Adriana Camacho, Ivan Gris, Laura M. Rodriguez

16:00 – 16:30 – Coffee break

16:30 – 17:00 – Long Paper Session
Automatic Extraction of Chatbot Training Data from Natural Dialogue Corpora, Bayan Abu Shawar, Eric Atwell

17:00 – 17:50 – Shared Task Presentation and Poster Session
Shared Task on Data Collection and Annotation, Luis Fernando D'Haro, Bayan Abu Shawar, Zhou Yu
Shared Task Chatbot Description Reports:
POLITICIAN, David Kuboň, Barbora Hladká
JOKER, Guillaume Dubuisson Duplessis, Vincent Letard, Anne-Laure Ligozat, Sophie Rosset
IRIS – Informal Response Interactive System, Rafael E. Banchs, Haizhou Li
PY-ELIZA – A Python-based implementation of Eliza, Luis Fernando D'Haro
TICK TOCK, Zhou Yu, Ziyu Xu, Alan W Black, Alexander I. Rudnicky
SARAH Chatbot, Bayan AbuShawar

17:50 – 18:00 – Concluding Remarks by the Organizing Team

Workshop Organizers

Rafael E. Banchs – Institute for Infocomm Research, Singapore
Ryuichiro Higashinaka – Nippon Telegraph and Telephone Corporation, Japan
Wolfgang Minker – Ulm University, Germany
Joseph Mariani – IMMI & LIMSI-CNRS, France
David Traum – University of Southern California, USA

Shared Task Co-organizers

Bayan Abu Shawar – Arab Open University, Jordan
Luis Fernando D'Haro – Agency for Science, Technology and Research, Singapore
Zhou Yu – Carnegie Mellon University, USA

Workshop Programme Committee

Björn Schuller – Imperial College London, UK
David Suendermann – Educational Testing Service (ETS), USA
Diane Litman – University of Pittsburgh, USA
Dilek Hakkani-Tur – Microsoft Research, USA
Gabriel Skantze – KTH Royal Institute of Technology, Sweden
Haizhou Li – Institute for Infocomm Research, Singapore
Jason Williams – Microsoft Research, USA
Jiang Ridong – Agency for Science, Technology and Research, Singapore
Justine Cassell – Carnegie Mellon University, USA
Kristiina Jokinen – University of Helsinki, Finland
Kotaro Funakoshi – Honda Research Institute, Japan
Laurence Devillers – LIMSI-CNRS, France
Luisa Coheur – Lisbon University, Portugal
Matthew Henderson – Google Research, USA
Michael McTear – University of Ulster, UK
Mikio Nakano – Honda Research Institute, Japan
Nick Campbell – Trinity College Dublin, Ireland
Oliver Lemon – Heriot-Watt University, UK
Ramón López-Cózar – University of Granada, Spain
Sakriani Sakti – Nara Institute of Science and Technology, Japan
Satoshi Nakamura – Nara Institute of Science and Technology, Japan
Seokhwan Kim – Agency for Science, Technology and Research, Singapore
Sophie Rosset – LIMSI-CNRS, France
Stefan Ultes – Cambridge University, UK
Suraj Nair – Technische Universität München, Germany
Teruhisa Misu – Honda Research Institute, USA
Tomoki Toda – Nara Institute of Science and Technology, Japan
Yasuharu Den – Chiba University, Japan
Shiv Vitaladevuni – Amazon, USA

Introduction to RE-WOCHAT Workshop on Collecting and Generating Resources for Chatbots and Conversational Agents – Development and Evaluation

Although non-goal-oriented dialogue systems have been around for many years (more than forty years indeed, if we consider Weizenbaum’s Eliza system as the starting milestone), they have recently been gaining a lot of popularity in both research and commercial arenas. From the commercial standpoint, non-goal-oriented dialogue seems to provide an excellent means to engage users for entertainment purposes, as well as to give a more human-like appearance to established vertical goal-driven dialogue systems.

From the research perspective, on the other hand, this kind of system poses interesting challenges and problems to the research community. Different from goal-oriented dialogue systems, which are domain-focused, non-goal-oriented or chat-oriented dialogue requires dealing with knowledge on a wide diversity of domains as well as common sense knowledge related to daily life experiences. Additionally, due to the lack of specific goals in chat-oriented dialogue, this kind of system cannot be objectively evaluated by using goal completion rates, as in the case of goal-oriented dialogue engines. Moreover, many task-oriented dialogue systems use the length of the dialogue as a metric, with a penalty in the reward function or assumed user satisfaction for longer dialogues. For chat dialogue, however, this metric is often reversed: the more interesting and enjoyable the chat dialogue is, the longer users will talk with a system.

In this regard, the RE-WOCHAT initiative aims at providing and consolidating a venue for the research community to explore and discuss the state of the art in non-goal-oriented dialogue and its related problems, including resource generation and evaluation. The workshop also accommodates a shared task on “Data Collection and Annotation” aiming at developing and testing a new evaluation framework for non-goal-oriented dialogue engines.

RE-WOCHAT is the result of a working committee initiative on “Automatic Evaluation and Resources” generated during the Shonan Meeting “The Future of Human-Robot Spoken Dialogue: from Information Services to Virtual Assistants” held in Shonan, Japan, at the end of March 2015. The main objective of this meeting was to discuss the most relevant and promising future directions of research in dialogue systems. The discussion was centred on how these directions should address the different problems and limitations of current dialogue systems, as well as how they can provide the basis for the next generation of intelligent artificial agents. (More information is available at http://shonan.nii.ac.jp/shonan/wp-content/uploads/2011/09/No.2015-7.pdf).

The workshop also constitutes the natural extension of two successful Special Sessions on “Chatbots and Dialogue Agents” collocated with APSIPA conferences in 2014 and 2015 (http://www.apsipa2014.org/home/program/special-sessions and http://www.apsipa2015.org/), and two evaluation workshops related to chat-oriented dialogue systems: dialogue breakdown detection challenge (https://sites.google.com/site/dialoguebreakdowndetection/) and NTCIR short text conversation (http://ntcir12.noahlab.com.hk/stc.htm).

Rafael E. Banchs, Ryuichiro Higashinaka, Wolfgang Minker, Joseph Mariani, David Traum Portorož, Slovenia, May 28th, 2016

Short Paper Session I Saturday 28 May, 14:00 – 15:00 Chairperson: Joseph Mariani

Data Collection for Interactive Learning through the Dialog Miroslav Vodolán, Filip Jurčíček This paper presents a dataset collected from natural dialogs which makes it possible to test the ability of dialog systems to learn new facts from user utterances throughout the dialog. This interactive learning will help with one of the most prevalent problems of open-domain dialog systems, which is the sparsity of facts a dialog system can reason about. The proposed dataset, consisting of 1900 collected dialogs, allows simulation of interactively gaining denotations and question explanations from users, which can be used for the interactive learning.

A Context-aware Natural Language Generation Dataset for Dialogue Systems Ondřej Dušek, Filip Jurčíček We present a novel dataset for natural language generation (NLG) in spoken dialogue systems which includes preceding context (user utterance) along with each system response to be generated, i.e., each pair of source meaning representation and target natural language paraphrase. We expect this to allow an NLG system to adapt (entrain) to the user’s way of speaking, thus creating more natural and potentially more successful responses. The dataset has been collected using crowdsourcing, with several stages to obtain natural user utterances and corresponding relevant, natural, and contextually bound system responses. The dataset is available for download under the Creative Commons 4.0 BY-SA license.

Dead Man Tweeting David Nilsson, Magnus Sahlgren, Jussi Karlgren This paper presents a prototype — Dead Man Tweeting — of a system that learns semantic avatars from (dead) people’s texts, and makes the avatars come alive on Twitter. The system includes a language model for generating sequences of words, a topic model for ensuring that the sequences are topically coherent, and a semantic model that ensures the avatars can be productive and generate novel sequences. The avatars are connected to Twitter and are triggered by keywords that are significant for each particular avatar.

Short Paper Session II Saturday 28 May, 15:00 – 16:00 Chairperson: Wolfgang Minker

Chatbot Evaluation and Database Expansion via Crowdsourcing Zhou Yu, Ziyu Xu, Alan W. Black, Alexander I. Rudnicky Chatbots use a database of responses often culled from a corpus of text generated for a different purpose, for example film scripts or interviews. One consequence of this approach is a mismatch between the data and the inputs generated by participants. We describe an approach that while starting from an existing corpus (of interviews) makes use of crowdsourced data to augment the response database, focusing on responses that people judge as inappropriate. The long term goal is to create a data set of more appropriate chat responses; the short term consequence appears to be the identification and replacement of particularly inappropriate responses. We found the version with the expanded database was rated significantly better in terms of the response level appropriateness and the overall ability to engage users. We also describe strategies we developed that target certain breakdowns discovered during data collection. Both the source code of the chatbot, TickTock, and the data collected are publicly available.


Framework for the Formulation of Metrics for Conversational Agent Evaluation Mohammed Kaleem, Omar Alobadi, James O’Shea, Keeley Crockett The evaluation of conversational agents is an area that has not seen much progress since the initial developments during the early 1990s. The initial challenge faced when evaluating conversational agents is trying to formulate the metrics that need to be captured and measured in order to gauge the success of a particular conversational agent. Although frameworks exist, they overlook the individual objectives of modern conversational agents, which are much more than just question answering systems. This paper presents a new framework that has been utilised to formulate metrics to evaluate two conversational agents deployed in two significantly different contexts.

Tracking Conversational Gestures of Extraverts and Introverts in Multimodal Interaction David Novick, Adriana Camacho, Ivan Gris, Laura M. Rodriguez Much of our current research explores differences between extraverts and introverts in their perception and production of gestures in multimodal interaction with an embodied conversational agent (ECA). While several excellent corpora of conversational gestures have been collected, these corpora do not distinguish conversants by personality dimensions. To enable study of the differences in distribution of conversational gestures between extraverts and introverts, we tracked and automatically annotated gestures of 59 subjects interacting with an ECA in the “Survival on Jungle Island” immersive multimodal adventure. Our work provides an initial corpus for analysis of gesture differences, based on Jung’s four personality-type dimensions, of humans interacting with an ECA and suggests that it may be feasible to annotate gestures automatically in real time, based on a gesture lexicon. Preliminary analysis of the automated annotations suggests that introverts more frequently performed gestures in the gesture lexicon than did extraverts.

Long Paper Session Saturday 28 May, 16:30 – 17:00 Chairperson: Rafael E. Banchs

Automatic Extraction of Chatbot Training Data from Natural Dialogue Corpora Bayan Abu Shawar, Eric Atwell A chatbot is a conversational agent that interacts with the users turn by turn using natural language. Different chatbots or human-computer dialogue systems have been developed using spoken or text communication and have been applied in different domains such as linguistic research, language education, customer service, web site help, and for fun. However, most chatbots are restricted to knowledge that is manually “hand coded” in their files, and to a specific natural language which is written or spoken. This paper presents the program we developed to convert a machine readable text (corpus) to a specific chatbot format, which is then used to retrain a chatbot and generate a chat which is closer to human language. Different corpora were used: dialogue corpora such as the British National Corpus of English (BNC); the holy book of Islam, the Qur’an, which is a monologue corpus where verse and following verse are turns; and FAQs, where questions and answers are pairs of turns. The main goal of this automation process is the ability to generate different chatbot prototypes that speak different languages based on a corpus.

Shared Task Presentation and Poster Session Saturday 28 May, 17:00 – 17:50 Chairperson: Ryuichiro Higashinaka

Shared Task on Data Collection and Annotation Luis Fernando D'Haro, Bayan AbuShawar, Zhou Yu This report presents and describes the shared task on “Data Collection and Annotation” conducted in RE-WOCHAT. We describe the main road map envisaged for this and future shared tasks, as well as the proposed collection and annotation schemes. We also summarize the result of the shared task in terms of chatbot platforms made available for it and the amount of collected chatting sessions and annotations.

Politician David Kuboň, Barbora Hladká We present a question-answering system Politician designed as a chatbot imitating a politician. It answers questions on political issues. The questions are analyzed using natural language processing techniques and no complex knowledge base is involved. The language used for the interviews is Czech.

Joker Chatterbot Guillaume Dubuisson Duplessis, Vincent Letard, Anne-Laure Ligozat, Sophie Rosset The Joker chatterbot is an example-based system that uses a database of indexed dialogue examples automatically built from a television drama subtitle corpus to manage social open-domain dialogue.

IRIS (Informal Response Interactive System) Rafael E. Banchs, Haizhou Li This report describes IRIS (Informal Response Interactive System), a chat-oriented dialogue system based on the vector space model framework. IRIS was one of the systems made available as part of the RE-WOCHAT Shared Task platform for collecting human-chatbot dialogue sessions.

Py-Eliza: A Python-based Implementation of the Famous Computer Therapist Luis Fernando D'Haro In this report we provide information about the functionalities and capabilities of pyEliza, a Python-based implementation of the famous Eliza chatbot proposed by Weizenbaum in 1966. PyEliza implements a rule-based chatbot that encourages users to talk about their lives and feelings. In addition to the chat capabilities, this chatbot offers management functionalities to keep track of all interactions with users by using logs and by automatically generating annotated and anonymized XML files.
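
A rule-based therapist chatbot of this kind can be sketched in a few lines of Python; the patterns below are illustrative and are not pyEliza's actual rule base.

```python
# Minimal Eliza-style sketch (illustrative rules, not pyEliza's actual rule base):
# regular-expression patterns are tried in order and the first match produces a reply.
import re

RULES = [
    (re.compile(r"\bI feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bI am (.+)", re.I), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.I), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."

def respond(utterance):
    """Apply the first matching rule, echoing the captured fragment back to the user."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return DEFAULT

print(respond("I feel a bit tired today."))  # -> Why do you feel a bit tired today?
```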

TickTock Zhou Yu, Ziyu Xu, Alan W Black, Alexander I. Rudnicky This is a description of the TickTock chatbot system, a retrieval-based system that utilizes conversational strategies to improve system performance. It has two versions: one takes multimodal signals as input, the other takes text input through typing. The multimodal version is a stand-alone system, while the text version is exposed as a web API. In this report, we focus on describing the web-API version of TickTock, which is used in the shared task.
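
The interplay between retrieval and conversational strategies can be illustrated with the sketch below: when no stored turn is similar enough, the system falls back to a generic strategy prompt. The similarity measure, threshold, database, and strategy list are all assumptions, not TickTock's actual implementation.

```python
# Sketch of retrieval backed by conversational strategies (assumed threshold and
# strategy list; not TickTock's actual implementation): when no stored turn is
# similar enough, fall back to a generic strategy instead of a poor match.
import random
from difflib import SequenceMatcher

DATABASE = {
    "what do you think about robots": "They work hard and never complain.",
    "do you watch movies": "I prefer documentaries about humans.",
}
STRATEGIES = ["Could you tell me more about that?",
              "Let's switch topics. What did you do today?"]
THRESHOLD = 0.5  # assumed confidence cut-off

def respond(user_turn):
    """Pick the best stored response, or a conversational strategy if confidence is low."""
    scored = [(SequenceMatcher(None, user_turn.lower(), prompt).ratio(), reply)
              for prompt, reply in DATABASE.items()]
    score, reply = max(scored)
    return reply if score >= THRESHOLD else random.choice(STRATEGIES)

print(respond("what do you think about robots?"))
print(respond("quantum entanglement is strange"))
```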

Sarah Chatbot Bayan AbuShawar The Sarah chatbot is a prototype based on the ALICE chatbot and uses the same knowledge base (AIML) files as ALICE. Sarah was created to enable public chatting through the Pandorabots hosting service. The Loebner Prize competition has been used to evaluate machine conversation chatbots. The Loebner Prize is a Turing test which evaluates the ability of a machine to fool people into believing they are talking to a human. In essence, judges are allowed a short chat (10 to 15 minutes) with each chatbot and are asked to rank them in terms of “naturalness”.

Quality Assessment for Text Simplification

28 May 2016

ABSTRACTS

Editors:

Sanja Štajner, Maja Popović, Horacio Saggion, Lucia Specia and Mark Fishel

Workshop Programme

28 May 2016

09:00 – 09:20 Introduction by Sanja Štajner

09:20 – 10:00 Invited Talk by Advaith Siddharthan

Session: General Track

10:00 – 10:30 Gustavo H. Paetzold and Lucia Specia PLUMBErr: An Automatic Error Identification Framework for Lexical Simplification

10:30 – 11:00 Coffee break

11:00 – 11:30 Sandeep Mathias and Pushpak Bhattacharyya How Hard Can it Be? The E-Score - A Scoring Metric to Assess the Complexity of Text

11:30 – 12:00 Sanja Štajner, Maja Popović and Hanna Béchara Quality Estimation for Text Simplification

12:00 – 12:15 Shared Task: Introduction by Maja Popović

Session: Shared Task 1

12:15 – 12:45 Maja Popović and Sanja Štajner Machine Translation Evaluation Metrics for Quality Assessment of Automatically Simplified Sentences

12:45 – 13:15 Sandeep Mathias and Pushpak Bhattacharyya Using Machine Translation Evaluation Techniques to Evaluate Text Simplification Systems

13:15 – 14:30 Lunch break


Session: Shared Task 2

14:30 – 15:00 Gustavo H. Paetzold and Lucia Specia SimpleNets: Evaluating Simplifiers with Resource-Light Neural Networks

15:00 – 15:30 Sergiu Nisioi and Fabrice Nauze An Ensemble Method for Quality Assessment of Text Simplification

15:30 – 16:00 Elnaz Davoodi and Leila Kosseim CLaC @ QATS: Quality Assessment for Text Simplification

16:00 – 16:30 Coffee break

16:30 – 17:30 Round Table

17:30 – 17:45 Closing


Workshop Organizers

Sanja Štajner, University of Mannheim, Germany
Maja Popović, Humboldt University of Berlin, Germany
Horacio Saggion, Universitat Pompeu Fabra, Spain
Lucia Specia, University of Sheffield, UK
Mark Fishel, University of Tartu, Estonia

Workshop Programme Committee

Sandra Aluisio, University of São Paulo
Eleftherios Avramidis, DFKI Berlin
Susana Bautista, Federal University of Rio Grande do Sul
Stefan Bott, University of Stuttgart
Richard Evans, University of Wolverhampton
Mark Fishel, University of Tartu
Sujay Kumar Jauhar, Carnegie Mellon University
David Kauchak, Pomona College
Elena Lloret, Universidad de Alicante
Ruslan Mitkov, University of Wolverhampton
Gustavo Paetzold, University of Sheffield
Maja Popović, Humboldt University of Berlin
Miguel Rios, University of Leeds
Horacio Saggion, Universitat Pompeu Fabra
Carolina Scarton, University of Sheffield
Matthew Shardlow, University of Manchester
Advaith Siddharthan, University of Aberdeen
Lucia Specia, University of Sheffield
Miloš Stanojević, University of Amsterdam
Sanja Štajner, University of Mannheim
Irina Temnikova, Qatar Computing Research Institute
Sowmya Vajjala, Iowa State University
Victoria Yaneva, University of Wolverhampton

Preface

In recent years, there has been increasing interest in automatic text simplification (ATS) and text adaptation to various target populations. However, studies concerning the evaluation of ATS systems are still very scarce, and no methods have been proposed for directly comparing the performance of different systems. This workshop addresses this problem and provides an opportunity to establish metrics for the automatic evaluation of ATS systems.

Given the close relatedness of the problem of automatic evaluation of ATS systems to the well-studied problems of automatic evaluation and quality estimation in machine translation (MT), the workshop also features a shared task on automatic evaluation (quality assessment) of ATS systems.

We accepted three papers in the general track and five papers describing the systems which participated in the shared task. The papers describe a variety of interesting approaches to this task.

We wish to thank all the people who helped make this workshop a success. Our special thanks go to Advaith Siddharthan for agreeing to give the invited presentation, to the members of the program committee, who did an excellent job in reviewing the submitted papers, and to the LREC organisers, as well as to all authors and participants of the workshop.

Sanja Štajner, Maja Popović, Horacio Saggion, Lucia Specia and Mark Fishel

May 2016

Session: General Track Saturday 28 May, 10:30 – 12:00 Chairperson: Advaith Siddharthan

PLUMBErr: An Automatic Error Identification Framework for Lexical Simplification Gustavo H. Paetzold and Lucia Specia

Lexical Simplification is the task of replacing complex words with simpler alternatives. Using human evaluation to identify the errors made by simplifiers throughout the simplification process can help to highlight their weaknesses, but it is a costly process. To address this problem, we introduce PLUMBErr, an automatic alternative. Using PLUMBErr, we analyze over 40 systems and find that the best combination is the one pairing the winner of the SemEval 2016 Complex Word Identification task with a modern simplifier. Comparing PLUMBErr to human judgments, we find that, although reliable, PLUMBErr could benefit from resources annotated in a different way.
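
As a rough illustration of automatic error identification for a lexical simplification pipeline, the sketch below checks a simplifier's decisions against gold annotations and tallies errors per pipeline step. The error categories, data structures, and examples are invented and do not reflect PLUMBErr's actual taxonomy.

```python
# Hedged sketch of checking a lexical simplification pipeline against gold
# annotations (the error categories below are illustrative, not PLUMBErr's actual
# taxonomy): errors are tallied separately for identification and replacement.
from collections import Counter

def check(sentence_outputs, gold):
    """Tally assumed error categories for a batch of simplifier decisions."""
    errors = Counter()
    for out, ref in zip(sentence_outputs, gold):
        if out["identified"] != ref["complex_word"]:
            errors["missed complex word"] += 1
        elif out["replacement"] not in ref["valid_substitutions"]:
            errors["invalid substitution"] += 1
        else:
            errors["no error"] += 1
    return errors

# Hypothetical decisions from a simplifier and matching gold annotations.
outputs = [{"identified": "perplexed", "replacement": "confused"},
           {"identified": "cat", "replacement": "kitty"}]
gold = [{"complex_word": "perplexed", "valid_substitutions": {"confused", "puzzled"}},
        {"complex_word": "apprehensive", "valid_substitutions": {"worried"}}]
print(check(outputs, gold))
```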

How Hard Can it Be? The E-Score - A Scoring Metric to Assess the Complexity of Text Sandeep Mathias and Pushpak Bhattacharyya

In this paper, we present an evaluation metric, the E-Score, for calculating the complexity of text. It combines the structural complexity of sentences with language modelling of simple and normal English to produce a score that indicates how simple or complex a document is. We gather gold-standard human data by having participants take a comprehension test in which they read articles from the English and Simple English Wikipedias. We use this data to evaluate our metric against two popular existing metrics: the Flesch Reading Ease score and the Lexile Framework.
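
The general idea of combining structural complexity with language modelling of simple and normal English can be illustrated as follows. The specific combination below (average sentence length plus a unigram likelihood ratio, equally weighted) is only an assumed stand-in and is not the paper's actual E-Score formula.

```python
# Hedged stand-in for a complexity score of this kind (NOT the paper's actual
# E-Score formula): combine a structural proxy (average sentence length) with a
# unigram likelihood ratio between "simple" and "normal" English samples.
import math
from collections import Counter

def unigram_logprob(text, counts, total, vocab):
    """Add-one smoothed unigram log-probability of a text under a word-count model."""
    return sum(math.log((counts[w] + 1) / (total + vocab)) for w in text.lower().split())

# Tiny invented samples standing in for corpora of simple and normal English.
simple_sample = "the cat sat on the mat . the dog ran ."
normal_sample = "the feline positioned itself upon the rug while the canine sprinted ."
simple_counts = Counter(simple_sample.split())
normal_counts = Counter(normal_sample.split())
vocab = len(set(simple_sample.split()) | set(normal_sample.split()))

def complexity_score(document):
    """Higher values suggest harder text: longer sentences, wording closer to normal English."""
    sentences = [s.split() for s in document.split(".") if s.strip()]
    avg_len = sum(len(s) for s in sentences) / len(sentences)
    lm_ratio = (unigram_logprob(document, normal_counts, sum(normal_counts.values()), vocab)
                - unigram_logprob(document, simple_counts, sum(simple_counts.values()), vocab))
    return avg_len + lm_ratio  # assumed equal weighting of the two components

print(complexity_score("The cat sat on the mat."))
print(complexity_score("The feline positioned itself upon the rug."))
```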

Quality Estimation for Text Simplification Sanja Štajner, Maja Popović and Hanna Béchara

The quality of the output generated by automatic Text Simplification (TS) systems is traditionally assessed by human annotators. Although automating that process would enable faster and more consistent evaluation, there have been almost no studies addressing this problem. We propose several decision-making procedures for automatically classifying simplified sentences into three classes (bad, OK, good) depending on their grammaticality, meaning preservation, and simplicity. We experiment with ten different classification algorithms and 12 different feature sets on three TS datasets obtained using different text simplification strategies, achieving results significantly above the state of the art. Additionally, we propose a single measure (Total2 or Total3) for classifying the quality of automatically simplified sentences into two (discard or keep) or three (bad, OK, good) classes.
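
One simple way to fold the three aspect labels into overall two- and three-class decisions, in the spirit of the Total2 and Total3 measures mentioned above, is sketched below. The "worst aspect wins" rule is an assumption, not the paper's exact definition.

```python
# Hedged sketch of folding three aspect labels into one overall class (the mapping
# rule below is an assumption, not the paper's exact definition of Total2/Total3):
# a sentence is treated as only as good as its worst aspect.
ORDER = {"bad": 0, "ok": 1, "good": 2}

def total3(grammaticality, meaning, simplicity):
    """Overall three-class label: the minimum of the three aspect labels."""
    return min((grammaticality, meaning, simplicity), key=ORDER.get)

def total2(grammaticality, meaning, simplicity):
    """Binary keep/discard decision derived from the three-class label."""
    return "keep" if total3(grammaticality, meaning, simplicity) != "bad" else "discard"

print(total3("good", "ok", "good"))   # -> ok
print(total2("good", "bad", "good"))  # -> discard
```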

Session: Shared Task 1 Saturday 28 May, 12:15 – 13:15 Chairperson: Lucia Specia

Machine Translation Evaluation Metrics for Quality Assessment of Automatically Simplified Sentences Maja Popović and Sanja Štajner

We investigate whether it is possible to automatically evaluate the output of automatic text simplification (ATS) systems using automatic metrics designed for the evaluation of machine translation (MT) outputs. In the first step, we select a set of the most promising metrics based on the Pearson correlation coefficients between those metrics and human scores for the overall quality of automatically simplified sentences. Next, we build eight classifiers on the training dataset using the subset of the 13 most promising metrics as features, and apply the two best classifiers to the test set. Additionally, we apply an attribute selection algorithm to further select the best subset of features for our classification experiments. Finally, we report on the success of our systems in the shared task and present confusion matrices which can help to gain better insight into the most challenging problems of this task.
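
The two-step recipe described above, selecting metrics by their Pearson correlation with human scores and then using them as classifier features, can be sketched as follows with synthetic scores. scipy and scikit-learn are assumed, and the metric names and cut-off are illustrative.

```python
# Sketch of the two-step recipe: rank candidate metrics by Pearson correlation with
# human judgments, keep the best ones, and train a classifier on them.
# Synthetic data throughout; scipy and scikit-learn assumed to be available.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
human = rng.integers(0, 3, size=100)            # synthetic human labels: 0=bad, 1=ok, 2=good
metrics = {                                      # synthetic per-sentence metric scores
    "BLEU": human + rng.normal(0, 0.5, 100),     # correlates with the labels by construction
    "TER": rng.normal(0, 1, 100),                # pure noise
    "METEOR": human + rng.normal(0, 0.8, 100),
}

# Step 1: keep the metrics whose |Pearson r| with the human scores exceeds a cut-off.
selected = []
for name, scores in metrics.items():
    r, _ = pearsonr(scores, human)
    if abs(r) > 0.3:
        selected.append(name)

# Step 2: use the selected metrics as features of a simple classifier.
X = np.column_stack([metrics[name] for name in selected])
clf = LogisticRegression(max_iter=1000).fit(X, human)
print(selected, "training accuracy:", clf.score(X, human))
```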

Using Machine Translation Evaluation Techniques to Evaluate Text Simplification Systems Sandeep Mathias and Pushpak Bhattacharyya

In this paper, we discuss our approaches to evaluating automated text simplification systems based on the grammaticality and simplicity of their output, the meaning preserved from the input, and the overall quality of the simplification. We discuss existing techniques currently used in the area of machine translation, as well as a novel technique for text complexity analysis, to assess the quality of text simplification systems.

Session: Shared Task 2 Saturday 28 May, 14:30 – 16:00 Chairperson: Sanja Štajner

SimpleNets: Evaluating Simplifiers with Resource-Light Neural Networks Gustavo H. Paetzold and Lucia Specia

We present our contribution to the shared task on Quality Assessment for Text Simplification at QATS 2016: the SimpleNets systems. We introduce a resource-light Multi-Layer Perceptron classifier, as well as a deep Recurrent Neural Network that predicts the quality of a simplification by assessing the quality of its n-grams. Our Recurrent Neural Networks achieved the state-of-the-art solution for the task, outperforming all other systems in terms of overall simplification accuracy.
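
A resource-light quality classifier in the spirit of the lighter of the two SimpleNets systems might look like the sketch below. The surface features, labels, and network size are synthetic assumptions, not the authors' actual architecture or feature set.

```python
# Sketch of a resource-light MLP quality classifier (synthetic features and labels,
# scikit-learn assumed; not the authors' actual SimpleNets architecture).
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
# Hypothetical surface features per simplified sentence, e.g. length ratio,
# word overlap with the original, and average word-length difference.
X = rng.random((200, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic good/bad labels

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```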

An Ensemble Method for Quality Assessment of Text Simplification Sergiu Nisioi and Fabrice Nauze

In this paper we describe the Oracle Service Cloud Machine Learning (OSVCML) systems used for the 2016 Quality Assessment for Text Simplification (QATS) shared task. We construct an ensemble method using particle swarm optimization and different scoring methods (SVM, string kernels, logistic regression, boosting trees, BLEU). The purpose is to capture relevant combinations of classifiers and features for each of the different aspects of text simplification: simplicity, grammaticality, meaning preservation, and overall score. In addition, we compare our approach with a deep neural network architecture and show that the generated models are stronger when combined.

CLaC @ QATS: Quality Assessment for Text Simplification Elnaz Davoodi and Leila Kosseim

This paper describes our approach to the 2016 QATS quality assessment shared task. We trained three independent Random Forest classifiers to assess the quality of the simplified texts in terms of grammaticality, meaning preservation, and simplicity. We used a Google N-gram language model as a feature to predict grammaticality. Meaning preservation is predicted using two complementary approaches based on and WordNet synonyms. A wider range of features, including TF-IDF, sentence length, and the frequency of cue phrases, is used to evaluate the simplicity aspect. Overall, the accuracy of the system ranges from 33.33% for the overall aspect to 58.73% for grammaticality.
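
Training one Random Forest per quality aspect, as described above, can be sketched as follows. The features and labels are synthetic placeholders rather than the actual CLaC feature set.

```python
# Sketch of one independent Random Forest per quality aspect, as the abstract
# describes (placeholder features and synthetic labels; not the CLaC feature set).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
# Hypothetical per-sentence features, e.g. sentence length, TF-IDF overlap with
# the original sentence, and cue-phrase count.
X = rng.random((150, 3))
labels = {                                   # synthetic three-class labels per aspect
    "grammaticality": rng.integers(0, 3, 150),
    "meaning": rng.integers(0, 3, 150),
    "simplicity": rng.integers(0, 3, 150),
}

classifiers = {aspect: RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
               for aspect, y in labels.items()}
for aspect, clf in classifiers.items():
    print(aspect, "training accuracy:", round(clf.score(X, labels[aspect]), 2))
```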
