Contextualising Computer-Assisted Translation Tools and Modelling Their Usability
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
An Artificial Neural Network Approach for Sentence Boundary Disambiguation in Urdu Language Text
The International Arab Journal of Information Technology, Vol. 12, No. 4, July 2015 395 An Artificial Neural Network Approach for Sentence Boundary Disambiguation in Urdu Language Text Shazia Raj, Zobia Rehman, Sonia Rauf, Rehana Siddique, and Waqas Anwar Department of Computer Science, COMSATS Institute of Information Technology, Pakistan Abstract: Sentence boundary identification is an important step for text processing tasks, e.g., machine translation, POS tagging, text summarization etc., in this paper, we present an approach comprising of Feed Forward Neural Network (FFNN) along with part of speech information of the words in a corpus. Proposed adaptive system has been tested after training it with varying sizes of data and threshold values. The best results, our system produced are 93.05% precision, 99.53% recall and 96.18% f-measure. Keywords: Sentence boundary identification, feed forwardneural network, back propagation learning algorithm. Received April 22, 2013; accepted September 19, 2013; published online August 17, 2014 1. Introduction In such conditions it would be difficult for a machine to differentiate sentence terminators from Sentence boundary disambiguation is a problem in natural language processing that decides about the ambiguous punctuations. beginning and end of a sentence. Almost all language Urdu language processing is in infancy stage and processing applications need their input text split into application development for it is way slower for sentences for certain reasons. Sentence boundary number of reasons. One of these reasons is lack of detection is a difficult task as very often ambiguous Urdu text corpus, either tagged or untagged. However, punctuation marks are used in the text. -
Dissertationes Legilinguisticae 11 Legilinguistic Studies 11
Dissertationes legilinguisticae 11 Legilinguistic studies 11 Studies in Legal Language and Communication Dissertationes legilinguisticae Legilinguistic studies Studies in Legal Language and Communication Editor-in-chief: Aleksandra Matulewska Co-editors: Karolina Gortych-Michalak Editor of the volume: Paulina Nowak-Korcz © Copyright the Author and Institute of Linguistics of Adam Mickiewicz University Volume 11 ADVISORY BOARD Marcus Galdia Fernando Prieto Ramos Hannes Kniffka Artur Kubacki Maria Teresa Lizisowa Judith Rosenhouse Reviewer: Onorina Botezat ISBN 978-83-65287-50-2 Wydawnictwo Naukowe CONTACT Poznań 2017 2 Dissertationes legilinguisticae 11 Legilinguistic studies 11 Studies in Legal Language and Communication Methodology for Interlingual Comparison of Legal Terminology. Towards General Legilinguistic Translatology Paulina Kozanecka Aleksandra Matulewska Paula Trzaskawka Wydawnictwo Naukowe CONTACT Poznań 2017 4 Table of Contents Acknowledgements ..................................................................... 7 Abbreviations .............................................................................. 9 Introduction ............................................................................. 11 I Purpose and Scope of Research ......................................... 11 II Research Methods.. ............................................................ 14 III Research Material.. ............................................................ 14 IV Research Hypotheses ........................................................ -
Translation of Humor in the Dubbed Version of the Sitcom “How I Met Your Mother” in Vietnamese
VIETNAM NATIONAL UNIVERSITY, HANOI UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES FACULTY OF ENGLISH LANGUAGE TEACHER EDUCATION GRADUATION PAPER TRANSLATION OF HUMOR IN THE DUBBED VERSION OF THE SITCOM “HOW I MET YOUR MOTHER” IN VIETNAMESE Supervisor : Nguyễn Thị Diệu Thúy, MA Student : Nguyễn Thị Hòa Course : QH2014.F1.E20 HÀ NỘI – 2018 ĐẠI HỌC QUỐC GIA HÀ NỘI TRƯỜNG ĐẠI HỌC NGOẠI NGỮ KHOA SƯ PHẠM TIẾNG ANH KHÓA LUẬN TỐT NGHIỆP CÁCH DỊCH YẾU TỐ HÀI HƯỚC TRONG BẢN LỒNG TIẾNG PHIM HÀI TÌNH HUỐNG “KHI BỐ GẶP MẸ” Giáo viên hướng dẫn : Th.S Nguyễn Thị Diệu Thúy Sinh viên : Nguyễn Thị Hòa Khóa : QH2014.F1.E20 HÀ NỘI – 2018 ACCEPTANCE PAGE I hereby state that I: Nguyễn Thị Hòa (QH14.F1.E20), being a candidate for the degree of Bachelor of Arts (English Language) accept the requirements of the College relating to the retention and use of Bachelor’s Graduation Paper deposited in the library. In terms of these conditions, I agree that the origin of my paper deposited in the library should be accessible for the purposes of study and research, in accordance with the normal conditions established by the librarian for the care, loan or reproduction of the paper. Signature Date May 4th, 2018 ACKNOWLEDGEMENTS First and foremost, I feel grateful beyond measure for the patient guidance that my supervisor, Ms. Nguyễn Thị Diệu Thúy has shown me over the past few months. Without her critical comments and timely support, this paper would not be finished. In addition, I would like to express my sincere thanks to 80 students from class 15E12, 15E13, 15E14 and 15E16 at the University of Languages and International Studies who eagerly participated in the research. -
A Comparison of Knowledge Extraction Tools for the Semantic Web
A Comparison of Knowledge Extraction Tools for the Semantic Web Aldo Gangemi1;2 1 LIPN, Universit´eParis13-CNRS-SorbonneCit´e,France 2 STLab, ISTC-CNR, Rome, Italy. Abstract. In the last years, basic NLP tasks: NER, WSD, relation ex- traction, etc. have been configured for Semantic Web tasks including on- tology learning, linked data population, entity resolution, NL querying to linked data, etc. Some assessment of the state of art of existing Knowl- edge Extraction (KE) tools when applied to the Semantic Web is then desirable. In this paper we describe a landscape analysis of several tools, either conceived specifically for KE on the Semantic Web, or adaptable to it, or even acting as aggregators of extracted data from other tools. Our aim is to assess the currently available capabilities against a rich palette of ontology design constructs, focusing specifically on the actual semantic reusability of KE output. 1 Introduction We present a landscape analysis of the current tools for Knowledge Extraction from text (KE), when applied on the Semantic Web (SW). Knowledge Extraction from text has become a key semantic technology, and has become key to the Semantic Web as well (see. e.g. [31]). Indeed, interest in ontology learning is not new (see e.g. [23], which dates back to 2001, and [10]), and an advanced tool like Text2Onto [11] was set up already in 2005. However, interest in KE was initially limited in the SW community, which preferred to concentrate on manual design of ontologies as a seal of quality. Things started changing after the linked data bootstrapping provided by DB- pedia [22], and the consequent need for substantial population of knowledge bases, schema induction from data, natural language access to structured data, and in general all applications that make joint exploitation of structured and unstructured content. -
ACL 2019 Social Media Mining for Health Applications (#SMM4H)
ACL 2019 Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task Proceedings of the Fourth Workshop August 2, 2019 Florence, Italy c 2019 The Association for Computational Linguistics Order copies of this and other ACL proceedings from: Association for Computational Linguistics (ACL) 209 N. Eighth Street Stroudsburg, PA 18360 USA Tel: +1-570-476-8006 Fax: +1-570-476-0860 [email protected] ISBN 978-1-950737-46-8 ii Preface Welcome to the 4th Social Media Mining for Health Applications Workshop and Shared Task - #SMM4H 2019. The total number of users of social media continues to grow worldwide, resulting in the generation of vast amounts of data. Popular social networking sites such as Facebook, Twitter and Instagram dominate this sphere. According to estimates, 500 million tweets and 4.3 billion Facebook messages are posted every day 1. The latest Pew Research Report 2, nearly half of adults worldwide and two- thirds of all American adults (65%) use social networking. The report states that of the total users, 26% have discussed health information, and, of those, 30% changed behavior based on this information and 42% discussed current medical conditions. Advances in automated data processing, machine learning and NLP present the possibility of utilizing this massive data source for biomedical and public health applications, if researchers address the methodological challenges unique to this media. In its fourth iteration, the #SMM4H workshop takes place in Florence, Italy, on August 2, 2019, and is co-located with the -
Full-Text Processing: Improving a Practical NLP System Based on Surface Information Within the Context
Full-text processing: improving a practical NLP system based on surface information within the context Tetsuya Nasukawa. IBM Research Tokyo Resem~hLaborat0ry t623-14, Shimotsurum~, Yimmt;0¢sl{i; I<almgawa<kbn 2421,, J aimn • nasukawa@t,rl:, vnet;::ibm icbm Abstract text. Without constructing a i):recige filodel of the eohtext through, deep sema~nfiCamtlys~is, our frmne= Rich information fl)r resolving ambigui- "work-refers .to a set(ff:parsed trees. (.r~sltlt~ 9 f syn- ties m sentence ~malysis~ including vari- : tacti(" miaiysis)ofeach sexitencd in t.li~;~i'ext as (:on- ous context-dependent 1)rol)lems. can be ob- text ilfformation, Thus. our context model consists tained by analyzing a simple set of parsed of parse(f trees that are obtained 1)y using mi exlst- ~rces of each senten('e in a text withom il!g g¢lwral syntactic parser. Excel)t for information constructing a predse model of the contex~ ()It the sequence of senl;,en('es, olIr framework does nol tl(rough deep senmntic.anMysis. Th.us. pro- consider any discourse stru(:~:ure mwh as the discourse cessmg• ,!,' a gloup" of sentem' .., '~,'(s togethel. i ': makes.' • " segmenm, focus space stack, or dominant hierarclty it.,p.(,)ss{t?!e .t.9 !~npl:ovel-t]le ~ccui'a~'Y (?f a :: :.it~.(.fi.ii~idin:.(cfi.0szufid, Sht/!er, dgs6)i.Tli6refbi.e, om< ,,~ ~-. ~t.ehi- - ....t sii~ 1 "~g;. ;, ~-ni~chin¢''ti-mslat{b~t'@~--..: .;- . .? • . : ...... - ~,',". ..........;-.preaches ":........... 'to-context pr0cessmg,,' • and.m" "tier-ran : ' le d at. •tern ; ::'Li:In i. thin.j.;'P'..p) a (r ~;.!iwe : .%d,es(tib~, ..i-!: .... -
Natural Language Processing
Chowdhury, G. (2003) Natural language processing. Annual Review of Information Science and Technology, 37. pp. 51-89. ISSN 0066-4200 http://eprints.cdlr.strath.ac.uk/2611/ This is an author-produced version of a paper published in The Annual Review of Information Science and Technology ISSN 0066-4200 . This version has been peer-reviewed, but does not include the final publisher proof corrections, published layout, or pagination. Strathprints is designed to allow users to access the research output of the University of Strathclyde. Copyright © and Moral Rights for the papers on this site are retained by the individual authors and/or other copyright owners. Users may download and/or print one copy of any article(s) in Strathprints to facilitate their private study or for non-commercial research. You may not engage in further distribution of the material or use it for any profitmaking activities or any commercial gain. You may freely distribute the url (http://eprints.cdlr.strath.ac.uk) of the Strathprints website. Any correspondence concerning this service should be sent to The Strathprints Administrator: [email protected] Natural Language Processing Gobinda G. Chowdhury Dept. of Computer and Information Sciences University of Strathclyde, Glasgow G1 1XH, UK e-mail: [email protected] Introduction Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform the desired tasks. -
Anatomy of a Multilingual IFU
Anatomy of a Multilingual IFU A project lifecycle, from content creation to final print. September 26, 2019 www.idemtranslations.com EXECUTIVE BRIEF Anatomy of a Multilingual IFU Ready to commercialize your product abroad? With regulatory approvals in the works, it’s time to plan the next step in your product distribution cycle. For many medical devices and biotech products, this means creating a multilingual IFU that ships directly with your product around the globe. This executive brief walks you through the decision-making steps necessary to plan and execute an in-box multilingual IFU, from drafting the content to sealing the box. A bit of advance planning will go a long way to ensuring that your product launch is stress-free. Step #1: Content Clearly, you need to draft your English IFU content as the first step on the road to a multilingual IFU. When possible, it is worth spending a little extra effort on the IFU content to ensure it is crystal-clear before the verbiage is locked down. This linguistic clarity will save you time (and money) in the future. Here are several easy things that you can do that pay off future dividends in terms of managing a multilingual IFU: 1. use international measurements Outside the US, metric units of measurement (mm, cm, m, ml) and temperatures in Celsius are far more common than feet, inches, and degrees Fahrenheit. If you only include the typical US measurements, your translators will need to ask how you wish to convert the numbers for use abroad. By including the international measurements (either alone or in tandem with the US measurements), you can perform those conversions up front and avoid questions during the translation process. -
Entity Extraction: from Unstructured Text to Dbpedia RDF Triples Exner
Entity extraction: From unstructured text to DBpedia RDF triples Exner, Peter; Nugues, Pierre Published in: Proceedings of the Web of Linked Entities Workshop in conjuction with the 11th International Semantic Web Conference (ISWC 2012) 2012 Link to publication Citation for published version (APA): Exner, P., & Nugues, P. (2012). Entity extraction: From unstructured text to DBpedia RDF triples. In Proceedings of the Web of Linked Entities Workshop in conjuction with the 11th International Semantic Web Conference (ISWC 2012) (pp. 58-69). CEUR. http://ceur-ws.org/Vol-906/paper7.pdf Total number of authors: 2 General rights Unless other specific re-use rights are stated the following general rights apply: Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal Read more about Creative commons licenses: https://creativecommons.org/licenses/ Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. LUND UNIVERSITY PO Box 117 221 00 Lund +46 46-222 00 00 Entity Extraction: From Unstructured Text to DBpedia RDF Triples Peter Exner and Pierre Nugues Department of Computer science Lund University [email protected] [email protected] Abstract. -
Ruslan - an Nt System Between Closely Related Languages
RUSLAN - AN NT SYSTEM BETWEEN CLOSELY RELATED LANGUAGES Jan Haji~ J , , . Vyzkumny ustav matematxckych stroju , P J Loretanske nam. 3 118 55 Praha 1, Czechoslovakia Machinery) at the Department of Software ABSTRACT in cooperation with the Department of Mathematical Linguistics, Faculty of A project of machine translation of Mathematics and Physics, Charles Czech computer manuals into Russian is University, Prague. described, presenting first a description of the overall system structure and concentrating then mainly Input texts on input text preparation and a parsing algorithm based on bottom-up parser The texts our system should translate programmed in Colmerauer's Q-systems. are software manuals to V~MS-developed DOS-4 operating system which is an advanced extension to the common DOS. The texts are currently maintained on INTRODUCTION tapes under the editing and formatting system PES (Programmed Editing System). In mid-1985, a project of machine This system allows for preparation, translation of Czech computer manuals editing and binding-ready printout using into Russian was started, thus national printer chain(s). Texts are constituting a second MT project of the stored on tapes using an internal format group of mathematical linguistics at containing upper/lowercase letters, Charles University (for a full editing & formatting commands, version description of the first project, see number/identification, info on (Kirschner, 1982) and (Kirschner, in last-changed pages etc.; most of this press)). can be used to improve the overall translation quality. On the other hand, Our goals are both practical part of it is somewhat confusing and (translation or re-translation of new or must be handled carefully. -
Mining Text Data
Mining Text Data Charu C. Aggarwal • ChengXiang Zhai Editors Mining Text Data Editors Charu C. Aggarwal ChengXiang Zhai IBM T.J. Watson Research Center University of Illinois at Urbana-Champaign Yorktown Heights, NY, USA Urbana, IL, USA [email protected] [email protected] ISBN 978-1-4614-3222-7 e-ISBN 978-1-4614-3223-4 DOI 10.1007/978-1-4614-3223-4 Springer New York Dordrecht Heidelberg London Library of Congress Control Number: 2012930923 © Springer Science+Business Media, LLC 2012 All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com) Contents 1 An Introduction to Text Mining 1 Charu C. Aggarwal and ChengXiang Zhai 1. Introduction 1 2. Algorithms for Text Mining 4 3. Future Directions 8 References 10 2 Information Extraction from Text 11 Jing Jiang 1. Introduction 11 2. Named Entity Recognition 15 2.1 Rule-based Approach 16 2.2 Statistical Learning Approach 17 3. -
International Standard Iso 17100:2015(E)
INTERNATIONAL ISO STANDARD 17100 First edition 2015-05-01 Translation services — Requirements for translation services Services de traduction — Exigences relatives aux services de traduction Reference number ISO 17100:2015(E) Licensed to Nubveto / Eva Feldbrugge ([email protected]) ISO Store Order: OP-257234 / Downloaded: 2017-12-18 Single user licence only, copying and networking prohibited. © ISO 2015 ISO 17100:2015(E) COPYRIGHT PROTECTED DOCUMENT © ISO 2015, Published in Switzerland All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form orthe by requester. any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of Ch. de Blandonnet 8 • CP 401 ISOCH-1214 copyright Vernier, office Geneva, Switzerland Tel. +41 22 749 01 11 Fax +41 22 749 09 47 www.iso.org [email protected] Licensed to Nubveto / Eva Feldbrugge ([email protected]) ISO Store Order: OP-257234 / Downloaded: 2017-12-18 Single user licence only, copying and networking prohibited. ii © ISO 2015 – All rights reserved ISO 17100:2015(E) Contents Page Foreword ..........................................................................................................................................................................................................................................v Introduction ................................................................................................................................................................................................................................vi