October/November 2009
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Can You Give Me Another Word for Hyperbaric?: Improving Speech
“CAN YOU GIVE ME ANOTHER WORD FOR HYPERBARIC?”: IMPROVING SPEECH TRANSLATION USING TARGETED CLARIFICATION QUESTIONS Necip Fazil Ayan1, Arindam Mandal1, Michael Frandsen1, Jing Zheng1, Peter Blasco1, Andreas Kathol1, Frédéric Béchet2, Benoit Favre2, Alex Marin3, Tom Kwiatkowski3, Mari Ostendorf3, Luke Zettlemoyer3, Philipp Salletmayr5∗, Julia Hirschberg4, Svetlana Stoyanchev4 1 SRI International, Menlo Park, USA 2 Aix-Marseille Université, Marseille, France 3 University of Washington, Seattle, USA 4 Columbia University, New York, USA 5 Graz Institute of Technology, Austria ABSTRACT We present a novel approach for improving communication success between users of speech-to-speech translation systems by automatically detecting errors in the output of automatic speech recognition (ASR) and statistical machine translation (SMT) systems. Our approach initiates system-driven targeted clarification about errorful regions in user input and repairs them given user responses. Our system has been evaluated by unbiased subjects in live mode, and results show improved success of communication between users of the system. Index Terms: Speech translation, error detection, error correction, spoken dialog systems. Our previous work on speech-to-speech translation systems has shown that there are seven primary sources of errors in translation: • ASR named entity OOVs: Hi, my name is Colonel Zigman. • ASR non-named entity OOVs: I want some pristine plates. • Mispronunciations: I want to collect some +de-MOG-raf-ees about your family? (demographics) • Homophones: Do you have any patients to try this medication? (patients vs. patience) • MT OOVs: Where is your father-in-law? • Word sense ambiguity: How many men are in your company? (organization vs. -
Translation of Humor in the Dubbed Version of the Sitcom “How I Met Your Mother” in Vietnamese
VIETNAM NATIONAL UNIVERSITY, HANOI UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES FACULTY OF ENGLISH LANGUAGE TEACHER EDUCATION GRADUATION PAPER TRANSLATION OF HUMOR IN THE DUBBED VERSION OF THE SITCOM “HOW I MET YOUR MOTHER” IN VIETNAMESE Supervisor : Nguyễn Thị Diệu Thúy, MA Student : Nguyễn Thị Hòa Course : QH2014.F1.E20 HÀ NỘI – 2018 ACCEPTANCE PAGE I hereby state that I: Nguyễn Thị Hòa (QH14.F1.E20), being a candidate for the degree of Bachelor of Arts (English Language) accept the requirements of the College relating to the retention and use of Bachelor’s Graduation Paper deposited in the library. In terms of these conditions, I agree that the origin of my paper deposited in the library should be accessible for the purposes of study and research, in accordance with the normal conditions established by the librarian for the care, loan or reproduction of the paper. Signature Date May 4th, 2018 ACKNOWLEDGEMENTS First and foremost, I feel grateful beyond measure for the patient guidance that my supervisor, Ms. Nguyễn Thị Diệu Thúy, has shown me over the past few months. Without her critical comments and timely support, this paper would not be finished. In addition, I would like to express my sincere thanks to 80 students from class 15E12, 15E13, 15E14 and 15E16 at the University of Languages and International Studies who eagerly participated in the research. -
A Survey of Voice Translation Methodologies - Acoustic Dialect Decoder
International Conference On Information Communication & Embedded Systems (ICICES-2016) A SURVEY OF VOICE TRANSLATION METHODOLOGIES - ACOUSTIC DIALECT DECODER Hans Krupakar, Keerthika Rajvel, Bharathi B, Angel Deborah S, Vallidevi Krishnamurthy, Dept. of Computer Science and Engineering, SSN College Of Engineering Abstract: Speech Translation has always been about giving source text/audio input and waiting for system to give translated output in desired form. In this paper, we present the Acoustic Dialect Decoder (ADD) – a voice to voice ear-piece translation device. We introduce and survey the recent advances made in the field of Speech Engineering, to employ in the ADD, particularly focusing on the three major processing steps of Recognition, Translation and Synthesis. We tackle the problem of machine understanding of natural language by [...] in order to make the process of communication amongst humans better, easier and efficient. However, the existing methods, including Google voice translators, typically handle the process of translation in a non-automated manner. This makes the process of translation of word(s) and/or sentences from one language to another, slower and more tedious. We wish to make that process automatic – have a device do what a human translator does, inside our ears. -
Learning Speech Translation from Interpretation
Learning Speech Translation from Interpretation. Dissertation approved by the Department of Informatics of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Engineering, by Matthias Paulik from Karlsruhe. Date of the oral examination: 21 May 2010. First reviewer: Prof. Dr. Alexander Waibel. Second reviewer: Prof. Dr. Tanja Schultz. I hereby declare that I wrote this thesis independently and used no sources or aids other than those indicated, that I have marked as such all passages taken verbatim or in substance from other works, and that I have observed the statutes of KIT, formerly Universität Karlsruhe (TH), for safeguarding good scientific practice in their currently valid version. Karlsruhe, 21 May 2010, Matthias Paulik. Abstract The basic objective of this thesis is to examine the extent to which automatic speech translation can benefit from an often available but ignored resource, namely human interpreter speech. The main contribution of this thesis is a novel approach to speech translation development, which makes use of that resource. The performance of the statistical models employed in modern speech translation systems depends heavily on the availability of vast amounts of training data. State-of-the-art systems are typically trained on: (1) hundreds, sometimes thousands of hours of manually transcribed speech audio; (2) bi-lingual, sentence-aligned text corpora of manual translations, often comprising tens of millions of words; and (3) monolingual text corpora, often comprising hundreds of millions of words. The acquisition of such enormous data resources is highly time-consuming and expensive, rendering the development of deployable speech translation systems prohibitive to all but a handful of economically or politically viable languages. -
Is 42 the Answer to Everything in Subtitling-Oriented Speech Translation?
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? Alina Karakanta (Fondazione Bruno Kessler, University of Trento, Trento - Italy), Matteo Negri (Fondazione Bruno Kessler, Trento - Italy), Marco Turchi (Fondazione Bruno Kessler, Trento - Italy) Abstract Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily. Although Neural Machine Translation (NMT) can speed up the process of translating audiovisual content, large manual effort is still required for transcribing the source language, and for spotting and segmenting the text into proper subtitles. Creating proper subtitles in terms of timing and segmentation highly depends on information present in the audio (utterance duration, natural pauses). In this work, we explore two methods for applying Speech Translation (ST) to subtitling: a) a direct end-to-end and b) a classical cascade approach. [...] content, still rely heavily on human effort. In a typical multilingual subtitling workflow, a subtitler first creates a subtitle template (Georgakopoulou, 2019) by transcribing the source language audio, timing and adapting the text to create proper subtitles in the source language. These source language subtitles (also called captions) are already compressed and segmented to respect the subtitling constraints of length, reading speed and proper segmentation (Cintas and Remael, 2007; Karakanta et al., 2019). In this way, the work of an NMT system is already simplified, since it only needs to translate matching the length of the source text (Matusov et al., 2019; Lakew et al., 2019). However, the essence of a good subtitle goes beyond matching a prede- -
Speech Translation Into Pakistan Sign Language
Master Thesis Computer Science Thesis No: MCS-2010-24 2012 Speech Translation into Pakistan Sign Language Ahmed Abdul Haseeb, Asim Ilyas School of Computing Blekinge Institute of Technology SE – 371 79 Karlskrona Sweden This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Computer Science. The thesis is equivalent to 20 weeks of full time studies. Contact Information: Authors: Ahmed Abdul Haseeb, Asim Ilyas University advisor: Prof. Sara Eriksén, School of Computing, Blekinge Institute of Technology, SE – 371 79 Karlskrona, Sweden Internet: www.bth.se/com Phone: +46 457 38 50 00 Fax: +46 457 271 25 ABSTRACT Context: Communication is a primary human need and language is the medium for this. Most people have the ability to listen and speak and they use different languages like Swedish, Urdu and English etc. to communicate. Hearing impaired people use signs to communicate. Pakistan Sign Language (PSL) is the preferred language of the deaf in Pakistan. Currently, human PSL interpreters are required to facilitate communication between the deaf and hearing; they are not always available, which means that communication among the deaf and other people may be impaired or nonexistent. In this situation, a system with voice recognition as an input and PSL as an output will be highly helpful. Objectives: As part of this thesis, we explore challenges faced by deaf people in everyday life while interacting with unimpaired. -
Ruslan - an MT System Between Closely Related Languages
RUSLAN - AN MT SYSTEM BETWEEN CLOSELY RELATED LANGUAGES Jan Hajič, Výzkumný ústav matematických strojů, Loretánské nám. 3, 118 55 Praha 1, Czechoslovakia ABSTRACT A project of machine translation of Czech computer manuals into Russian is described, presenting first a description of the overall system structure and concentrating then mainly on input text preparation and a parsing algorithm based on bottom-up parser programmed in Colmerauer's Q-systems. INTRODUCTION In mid-1985, a project of machine translation of Czech computer manuals into Russian was started, thus constituting a second MT project of the group of mathematical linguistics at Charles University (for a full description of the first project, see (Kirschner, 1982) and (Kirschner, in press)). Our goals are both practical (translation or re-translation of new or [...] Machinery) at the Department of Software in cooperation with the Department of Mathematical Linguistics, Faculty of Mathematics and Physics, Charles University, Prague. Input texts The texts our system should translate are software manuals to VÚMS-developed DOS-4 operating system which is an advanced extension to the common DOS. The texts are currently maintained on tapes under the editing and formatting system PES (Programmed Editing System). This system allows for preparation, editing and binding-ready printout using national printer chain(s). Texts are stored on tapes using an internal format containing upper/lowercase letters, editing & formatting commands, version number/identification, info on last-changed pages etc.; most of this can be used to improve the overall translation quality. On the other hand, part of it is somewhat confusing and must be handled carefully. -
Between Flexibility and Consistency: Joint Generation of Captions And
Between Flexibility and Consistency: Joint Generation of Captions and Subtitles Alina Karakanta1,2, Marco Gaido1,2, Matteo Negri1, Marco Turchi1 1 Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento - Italy 2 University of Trento, Italy {akarakanta,mgaido,negri,turchi}@fbk.eu Abstract Speech translation (ST) has lately received growing interest for the generation of subtitles without the need for an intermediate source language transcription and timing (i.e. captions). However, the joint generation of source captions and target subtitles does not only bring potential output quality advantages when the two decoding processes inform each other, but it is also often required in multilingual scenarios. In this work, we focus on ST models which generate consistent captions-subtitles in terms of structure and lexical content. We further introduce new metrics for evaluating subtitling consistency. Our findings show that joint decoding leads to increased performance and consistency between the generated cap- [...] material but also between each other, for example in the number of blocks (pieces of time-aligned text) they occupy, their length and segmentation. Consistency is vital for user experience, for example in order to elicit the same reaction among multilingual audiences, or to facilitate the quality assurance process in the localisation industry. Previous work in ST for subtitling has focused on generating interlingual subtitles (Matusov et al., 2019; Karakanta et al., 2020a), a) without considering the necessity of obtaining captions consistent with the target subtitles, and b) without examining whether the joint generation leads to improvements in quality. We hypothesise that knowledge sharing between the tasks of transcription and translation could lead to such improvements. -
Translating Game Achievements: Case Study of the Long Dark and Spyro the Dragon
Translating game achievements: Case study of The Long Dark and Spyro the Dragon Venla Virtanen MA Thesis English, Degree Programme for Multilingual Translation Studies School of Languages and Translation Studies Faculty of Humanities University of Turku May 2020 The originality of this thesis has been checked in accordance with the University of Turku quality assurance system using the Turnitin OriginalityCheck service. UNIVERSITY OF TURKU School of Languages and Translation Studies / Faculty of Humanities VIRTANEN, VENLA: Translating game achievements: Case study of The Long Dark and Spyro the Dragon MA thesis, 45 p., 3 appendices. English, Degree Programme for Multilingual Translation Studies May 2020 Though localisation in general and video game localisation in particular is increasing in popularity as a subject of research within Translation Studies, it is still a recent phenomenon, and there are many subjects and perspectives in it left to explore. One such subject is the translation of video game achievements, which currently remains completely unstudied. It is the purpose of this thesis to fix this research gap to the extent it is able. Because of the lack of research on the subject in Translation Studies, much of the background of this thesis comes from the domains of Video Game Studies and gamification research. While achievements are a particularly popular topic of research in gamification, translation has not been taken into account in any of that research. This thesis aims to examine features of achievement translation by comparing the source and target achievements of the games The Long Dark and Spyro the Dragon and classifying the translation strategies used in them. -
Implementing Machine Translation and Post-Editing to the Translation of Wildlife Documentaries Through Voice-Over and Off-Screen Dubbing
WARNING. Access to the contents of this doctoral thesis is conditional on acceptance of the conditions of use set by the following Creative Commons license (the notice is given in Catalan, Spanish and English): http://cat.creativecommons.org/?page_id=184 http://es.creativecommons.org/blog/licencias/ https://creativecommons.org/licenses/?lang=en Universitat Autònoma de Barcelona Departament de Traducció i d’Interpretació i d’Estudis de l’Asia Oriental Doctorat en Traducció i Estudis Interculturals Implementing Machine Translation and Post-Editing to the Translation of Wildlife Documentaries through Voice-over and Off-screen Dubbing A Research on Effort and Quality PhD dissertation presented by: Carla Ortiz Boix Supervised and tutored by: Dr. Anna Matamala 2016 To my family: to those who are here, to those who are not, and to those who are only halfway here. Acknowledgments The road to finishing this PhD has not been easy and it would not have been accomplished without the priceless support of many: First of all, I want to thank my supervisor, Dr. Anna Matamala, for all her hard work. It has not been an easy road and sometimes I would have lost the right path if she had not been there to support, encourage, and challenge me. The PhD would not have come out the way it has without you. On a professional level, I also want to thank Dr. -
Breeding Gender-Aware Direct Speech Translation Systems
Breeding Gender-aware Direct Speech Translation Systems Marco Gaido1,2 †, Beatrice Savoldi2 †, Luisa Bentivogli1, Matteo Negri1, Marco Turchi1 1 Fondazione Bruno Kessler, Trento, Italy 2 University of Trento, Italy Abstract In automatic speech translation (ST), traditional cascade approaches involving separate transcription and translation steps are giving ground to increasingly competitive and more robust direct solutions. In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g. speaker’s vocal characteristics) that is otherwise lost in the cascade framework. Although such ability proved to be useful for gender translation, direct ST is nonetheless affected by gender bias just like its cascade counterpart, as well as machine translation and numerous other natural language processing applications. Moreover, direct ST systems that exclusively rely on vocal biometric features as a gender cue can be unsuitable and potentially harmful for certain users. Going beyond speech signals, in this paper we compare different approaches to inform direct ST models about the speaker’s gender and test their ability to handle gender translation from English into Italian and French. To this aim, we manually annotated large datasets with speakers’ gender information and used them for experiments reflecting different possible real-world scenarios. Our results show that gender-aware direct ST solutions can significantly outperform strong – but gender-unaware – direct ST models. In particular, the translation of gender-marked words can increase up to 30 points in accuracy while preserving overall translation quality. -
María Fernández-Parra* the Workflow of Computer-Assisted Translation
María Fernández-Parra* The Workflow of Computer-Assisted Translation Tools in Specialised Translation 1. Introduction Since the wide availability of computers, the work profile of the professional translator has radically changed. Translators no longer work in isolation relying on a typewriter and a pile of books as their only aids. However, the goal of a totally independent translating machine producing high quality output has not been achieved either, and may never be achieved. It would be practically unthinkable for translators nowadays to work without a computer, but integrating computers into the translation workplace does not mean replacing human translators altogether. The term computer-assisted translation (henceforth CAT) refers to the integration of computers into the workplace, whereas the term machine translation (MT) refers to fully automating the translation process. The workflow described here is that of CAT, but some differences and similarities with MT are also pointed out as appropriate. 1.1. Aims The first aim of this paper is to explain in simple terms what is meant by the term computer-assisted translation and how this type of translation differs from other types of computer-based translation. This is broadly discussed in section 2 below. The second aim of this paper is to raise awareness of the new working methods of translators of specialised texts by describing a simplified but typical workflow in the specialised translation scenario using CAT tools, in section 3 below. In other words, I aim to describe what computers already can do at present.