Translation in Transition: Book of Abstracts

Translation in Transition: Human and Machine Intelligence October 15–17, 2020 Book of Abstracts

Contents

TT5 Program Committee and Organizers

Foreword

Abdullah A. Hasan & Husam Qasim
Aspire: an MT-based Bilingual Corpus Aligner Utilizing Triangulated Parameters to Achieve Higher Accuracy

Alina Karakanta
Subtitling in Transition: The case of TED talks

Ana Guerberof Arenas & Antonio Toral
CREAMT: creative shifts, readers engagement and translation reception in literary texts translated in three modalities

Deniz Özkan, Ena Hodzik, & Ebru Diriker
Predictive eye-movements in student vs. professional Turkish (A) - English (B) simultaneous interpreters: between-group and individual differences

Devin Gilbert
Using Commercially Available Customizable NMT to Study Translator Style

Haruka Ogawa
Production Duration and Translator Characteristics

Jacob T. Høgh
Testing of Perceived Sound Quality on Interprefy App

Jia Feng, Shirong Chen, Michael Carl, Jinjin Chen, & Yueqi Zhu
Eye-Voice Span in Sight Interpreting: Evidence from Both Process and Product

Kamal Deep Garg, Ajit Kumar, & Vishal Goyal
Addressing the Rare Word Problem in Punjabi to English Neural Machine Translation

Kamal Kumar Gupta, Rejwanul Haque, Asif Ekbal, & Pushpak Bhattacharyya
Augmenting Dependency Tags in Interactive Neural Machine Translation

Kara Warburton
Supporting Translators Through Keyword Mining

Katarzyna Stachowiak-Szymczak
Psychosocial awareness in community translation: A report on human and machine translation and post-editing

M. Cristina Toledo-Báez & Michael Carl
Assessing low and high translation variation in post-editing

Masaru Yamada, Mayuka Yamamoto, Nanami Onish, Atsushi Fujita, Rei Miyata, & Kyo Kageura
Metalanguage for the translation process

Michael Carl
Explorations in Empirical Translation Process Research: A Book Presentation

Miguel A. Jiménez-Crespo & Joseph Casillas
Are literal translations always easier to process?: a process-based study of effort based on comparable corpus data

Oliver Czulo
FSEM: Current developments and perspectives

Valentina Ragni & Lucas Nunes Vieira
Going off track? Activity tracking in translation. Translators' perceptions, ethical issues and positive uses

Wenchao Su & Xiaoxing Zhao
Researching Automatic Speech Recognition on the Process and Quality of Written Translation: Discussions of Indicators

TT5 Program Committee

Chantal Gagnon, Koen Plevoets, Daniel Gallego-Hernández, Marie-Aude Lefer, Dorothy Kenny, Oliver Czulo, Ekaterina Lapshinova-Koltunski, Ralph Kruger, Federico Gaspari, Silvia Hansen-Schirra, Isabelle Delaere, Sonia González Cruz, Jennifer Fest, Stella Neumann, Joke Daems, Éric André Poirier

TT5 Organizers

Michael Carl, Isabel Lacruz, Kairong Xiao, Devin Gilbert

Foreword

Since its inauguration in Copenhagen (2014), Translation in Transition has become a central meeting point for empirical translation studies in Europe through successive editions of the conference in Germersheim (2015), Ghent (2017), and Barcelona (2019). The fifth iteration of this conference (TT5) was held outside Europe, at Kent State University (USA). It was, as before, a forum of discussion focused on empirical research in the fields of translation and interpreting.

As in the previous iterations, TT5 provides a forum for discussion to learn more about how human translators exercise their skill cognitively and how computer programs can be designed to help human translators: by automatically translating written text, by recognizing and translating spoken utterances, or, more indirectly, by logging translation events and analyzing recorded process data.

The special focus of TT5 is on human and machine intelligence. In times of increasing machine intelligence, translation aids and translation technologies are changing at a rapid pace, fundamentally transforming the status of translation and the translation profession. TT5 aims at discussing related questions, including: How can humans cope with the machine intelligence that is being developed, and how can machines adapt to the human condition? What are the fundamental mechanisms that underlie human translation performance? What is the effect of technology on the translation process, translation performance, job satisfaction, the translation product, and society?

The following extended abstracts represent this discussion in translation process research. They were submitted and accepted by poster presenters at the fifth iteration of Translation in Transition. Book of Abstracts: Translation in Transition (TT5) October 15 - 17, 2020

Aspire: an MT-based Bilingual Corpus Aligner Utilizing Triangulated Parameters to Achieve Higher Accuracy

Abdullah A. Hasan, Mayo Clinic, 912 Newcastle Dr., Champaign, IL 61822

Husam Qasim, Kent State University, 475 Janik Dr, Kent OH 44240

[email protected] [email protected]

1 Introduction

Aligning bilingual corpora is essential for machine translation (MT) engine training, corpus linguistics studies, term extraction, concordance search, and the creation of translation memories for use in computer-assisted translation (CAT) tools.

While segment alignment based on semantic features has recently offered more promising results than length-based algorithms, there is still room for improvement. For example, the MT-based tool Bleualign (Sennrich & Volk, 2010) offers little customizability and does not perform adequately when handling corpora containing clutter and noise, which is typical of content harvested from the web, or when there are version differences between the source and the target. These shortcomings can also be observed in algorithms that do not use MT, like the length-based approach (Gale & Church, 1991) and Hunalign's dictionary-based approach (Varga et al., 2007).

On the other hand, proprietary tools that rely in great part on manual segment alignment or user correction using a graphical editor require significant effort and time to produce high-quality translation memories. Without manual correction, alignments created using these graphical editors may produce poor results.

In light of the above, a new MT-based alignment tool, Aspire, was created in Python to tolerate higher levels of noise in data and to achieve higher accuracy in alignment. Aspire addresses the problem of misalignments and noise by using a semantic scoring algorithm that relies on scikit-learn's implementation of Tf-idf (term frequency-inverse document frequency), a term weighting model that is useful for measuring semantic similarity. In addition to Tf-idf, Aspire uses dynamic programming and triangulated parameters such as positional features, anchors, and length correspondence, as well as other strong indicators, such as numbers, punctuation marks, and acronyms, to achieve highly accurate alignment. Even without human review, Aspire can achieve high levels of accuracy. When tested on content obtained from the web, it correctly aligned 96% of the possible pairs, and its error rate was less than 1%. The remaining pairs were false negatives.

2 Methodology

Forming a probability model as a whole, the triangulated, interdependent parameters take into account semantic, positional, and length correspondence to align segments even when two segments that should align are not positioned similarly in the source and target texts or when they achieve a low semantic score.

The main factors taken into consideration and measured by the tool to achieve higher alignment accuracy are:

a. A lexical score based on machine translation of the target text, which is compared to the source text. Semantic comparison is performed using Tf-idf cosine similarity after the text is tokenized and stemmed.
b. A length score. A pair with both high semantic and length scores is used for setting anchors, which are highly reliable points of reference that neighboring pairs can use as hints for calculating positional scores.
c. A positional score that represents the segment's distance from the last known anchor.
d. A special tokens score based on acronyms, numbers, and other unique symbols. This score is used for elevating the score of low-accuracy pairs surrounded by high-confidence pairs.

Each factor is assigned a level of importance, i.e. a weight, which can be customized. Semantic correspondence is typically given the highest weight.

Finally, scores and weights are used to produce a weighted average score for each segment pair. The more two segments correspond in terms of semantic equivalence, position, and length, the higher the average score is. The following may serve as a typical weighting scheme:

a. Semantic: 70% importance.
b. Length: 10% importance.
c. Positional: 10% importance.
d. Special tokens: 10% importance.

After aligning the most likely pairs, the algorithm looks for any unpaired fragments and joins them with their most likely matches using the same scoring parameters, thus forming one-to-many and many-to-one pairs even when the source and the target do not employ similar sentence boundaries.

Weights and minimums for semantic and length scores are adjustable. Accordingly, advanced users can make them stricter if higher accuracy is desirable at the expense of coverage (the good alignments that are captured by the tool). This flexibility may come in handy depending on how a user wants to employ the tool.

3 Evaluation method and results

Manual alignments were conducted on bilingual texts obtained from the websites of Human Rights Watch and the World Health Organization. The sample size was approximately 20,000 words divided across the following language pairs: English-Arabic, English-Spanish, and English-French. The same sample was then automatically aligned using three tools: Aspire, Hunalign, and Bleualign. The outputs of the three tools were compared with the manual alignments in terms of coverage rate and error rate (the rate of bad alignments and noise). The same segmentation was used for both manual and automatic alignments, and the same MT was used for Aspire and Bleualign.

The following table offers a comparison of accuracy tests:

Algorithm    Coverage rate    Error rate
Aspire       96%              0.30%
Hunalign     91%              10%
Bleualign    43%              56%

Table 1: Aspire's coverage rate and error rate compared to Hunalign and Bleualign.

4 Interface

Aspire can be run with or without user editing, but a visual interface is available if manual correction is needed. The user can also customize the weight assigned to each parameter as needed. The user can preview alignment results and scores, filter aligned segments by weighted scores, reject false positives, and override false negatives as necessary before exporting the final alignments.


Figure 1: A preview of Aspire's scoring results interface
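The scoring idea described in the Methodology section can be illustrated with a minimal sketch. It is not Aspire's code: the abstract states that Aspire relies on scikit-learn's Tf-idf implementation, whereas this sketch uses a small pure-Python stand-in so that it runs without dependencies. The function names, the length-score formula, the example sentences, and the fixed positional and special-token values are all assumptions made for illustration; only the 70/10/10/10 weighting scheme comes from the text.

```python
import math
import re
from collections import Counter

# Weighting scheme quoted in the abstract (70/10/10/10).
WEIGHTS = {"semantic": 0.7, "length": 0.1, "positional": 0.1, "special": 0.1}


def tokenize(text):
    """Lowercase and split on non-word characters (stemming is omitted here)."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(segments):
    """Return one Tf-idf vector (term -> weight dict) per segment, using a smoothed idf."""
    docs = [Counter(tokenize(s)) for s in segments]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def length_score(a, b):
    """Hypothetical length score: ratio of the shorter to the longer segment."""
    la, lb = len(a), len(b)
    return min(la, lb) / max(la, lb) if la and lb else 0.0


def combined_score(scores, weights=WEIGHTS):
    """Weighted average of the factor scores, each assumed to lie in [0, 1]."""
    total = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total


# Made-up example: source segments and the MT output of the target segments.
source = ["The report was published in 2019.", "Health is a human right."]
mt_of_target = ["The report was released in 2019.", "Health is a human right."]

vectors = tfidf_vectors(source + mt_of_target)
src_vecs, tgt_vecs = vectors[:2], vectors[2:]

semantic = cosine(src_vecs[0], tgt_vecs[0])
score = combined_score({
    "semantic": semantic,
    "length": length_score(source[0], mt_of_target[0]),
    "positional": 1.0,  # placeholder: segment assumed to sit right at an anchor
    "special": 1.0,     # placeholder: both segments contain the number 2019
})
print(f"semantic={semantic:.2f} combined={score:.2f}")
```

Raising the semantic weight or adding a minimum threshold on the semantic score would make such a scheme stricter, which corresponds to the accuracy-versus-coverage trade-off mentioned in the Methodology section.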

5 Future Implications

Tests so far show that Aspire captures a greater number of alignments with a smaller risk of false positives. This helps create translation memories that are both larger and cleaner. By extracting highly accurate alignments, Aspire could potentially be used to offer better MT training data. The tool could also be used to recycle legacy translations to build better translation memories for use in CAT tools, term extraction, or concordance search.

With the rise of affordable and self-hosted machine translation engines, MT-based alignment will likely become a standard alignment method.

6 Limitations

The tool was tested on three language pairs only. More testing on a larger sample and on more language pairs should prove helpful in demonstrating alignment quality for other content types.

References

Gale, W. A., & Church, K. W. 1991. A program for aligning sentences in bilingual corpora. Proceedings of the 29th Annual Meeting on Association for Computational Linguistics.

Sennrich, Rico, & Volk, Martin. 2010. MT-based Sentence Alignment for OCR-generated Parallel Texts. Proceedings of AMTA 2010.

Varga, D., Halácsy, P., Kornai, A., Nagy, V., Németh, L., & Trón, V. 2007. Parallel corpora for medium density languages.


Subtitling in Transition: The case of TED talks

Alina Karakanta

Fondazione Bruno Kessler

University of Trento

[email protected]

1 Introduction

Subtitles are a vital tool for facilitating accessibility and the spread of information to audiences with different linguistic needs and preferences. With the advances in digital technologies, subtitling, as a form of audiovisual translation, is no longer a task practiced exclusively by professionals, but has become extremely popular among volunteers and fans (Bogucki, 2009; Díaz-Cintas & Muñoz Sánchez, 2006; Ferrer Simó, 2005). One of the largest projects of volunteer subtitling is the TED Talk Translators programme, with ~3000 talks on various topics subtitled in 116 languages by more than 33,000 volunteers. The programme has been constantly growing and adapting to respond to the increasing global interest in TED talks. Over the years, subtitling guidelines have been drafted, new tools have been introduced and a thorough quality check process has been set up. In this work, we pose the question of whether this constant transition and refinement of the workflows of the programme is mirrored in the quality of the produced subtitles, in terms of their form and conformity to traditional subtitling constraints (Díaz-Cintas and Remael, 2007). We conduct a corpus analysis based on TED talk subtitles, focusing here on their conformity to the subtitling constraint of length (maximum number of characters per subtitle).

2 Methodology

Data: We use MuST-Cinema (Karakanta et al., 2020), a multilingual speech-to-subtitles corpus compiled from the subtitle files (.srt) of TED talks in 7 language pairs (English to German, Spanish, Italian, French, Portuguese, Dutch, Romanian). The corpus contains (audio, source language subtitles, target language subtitles) triplets.

Metric: We investigate the conformity to the subtitling constraint of length. Due to the limited space on screen, a subtitle cannot exceed a specific length. According to the TED translation guidelines, conformity is measured with a maximum subtitle length of <=84 characters (maximum 2 lines of up to 42 characters each).

3 Results

We compute the percentage of non-conforming subtitles per year for each language. The year corresponds to the date of release of the talk, but due to the popularity of the programme, talks are normally translated a short time after their release. Figure 1 shows the percentage of non-conforming subtitles per year for French. We observe a consistent decrease in the percentage of subtitles not respecting the length constraint, especially after 2012. The same tendency is observed for all languages, even though some languages have consistently higher non-conformity percentages than others (higher for Spanish and French, lowest for Portuguese). For all languages, there is a sharp reduction in the number of non-conforming subtitles in 2009, gradually reaching almost full conformity around 2012. This suggests a shift from a plain translation of the source transcript, which does not consider subtitling constraints, to well-formed subtitles.

Figure 1. Percentage of non-conforming subtitles per year (French)

This shift can be attributed to several factors: as the pool of volunteers grew, multiple roles could be allocated to review and quality-check the translation. Subtitling guidelines were drafted and training material was released for the volunteers. Technology and tools seem to have played an important role. To ensure better management, the translation project was transferred to the Amara collaborative platform. The task of subtitling is performed through the Amara subtitle editor, which allows for automatically controlling length and reading speed constraints. This facilitates the work of subtitlers by notifying them when subtitles exceed the maximum length, leading them to rephrase, condense and omit information in order to avoid delivering a translation with error messages. Therefore, it is possible that the adoption of the subtitling tool increased the subtitle quality in terms of length conformity.

By conducting a qualitative evaluation of several talks from different years, we observed that the main reasons for the non-conformity to the length constraint are the lack of compression and/or improper subtitle segmentation. The following examples in French and Italian come from Al Gore's 2006 talk "Averting the climate crisis" (https://www.ted.com/talks/al_gore_averting_the_climate_crisis#t-353391). The number of characters is shown in parentheses. The source language subtitles (English) here are verbatim transcriptions of the spoken utterances and have a frequent segmentation, which leads to one-liners and keeps the number of characters low:

116
00:06:03,464 --> 00:06:07,281
What can you do about the climate crisis? (41)

117
00:06:08,137 --> 00:06:11,476
I want to start with a couple of -- (35)

118
00:06:11,500 --> 00:06:14,069
I'm going to show some new images, (34)

119
00:06:14,093 --> 00:06:19,216
and I'm going to recapitulate just four or five. (48)

French:

71
00:06:02,820 --> 00:06:10,820
Qu'est-ce que vous pouvez faire contre le réchauffement climatique? J'aimerais commencer avec -- (96)

72
00:06:10,820 --> 00:06:19,820
Je vais vous montrer de nouvelles diapositives et je vais juste en détailler quatre ou cinq. (92)

Italian:

71
00:06:02,820 --> 00:06:10,820
Cosa si può fare per la crisi del clima? (40)

72
00:06:10,820 --> 00:06:19,820
Voglio iniziare mostrando alcune nuove immagini, solo 4 o 5. (60)

While both languages use the same timestamps, the French subtitles exceed the 84-character limit by a large margin. The false start "J'aimerais commencer avec" is reproduced from the English captions and included in subtitle 71 for French, but it is omitted in Italian. In the French subtitle 72 there is an unnecessary repetition of the phrase "je vais", leading to a subtitle of 92 characters. In addition, the French subtitle follows the phrasing of the English one. The Italian subtitle 72, however, is much more compact; the subtitler has merged the two main clauses into one and rephrased them in order to provide a shorter solution. In general, the Italian version has a larger degree of compression and omission than the French. This example shows that, despite a transition point towards higher conformity being present in 2009, there is some variation among the languages, which should be investigated in other factors, such as degree of expansion of the target language, volunteers' experience, and properties of the talks (speech rate, disfluencies etc.).

4 Conclusion

Our findings show that there is a chronological shift towards higher subtitling quality (in terms of subtitling constraints) in TED Talks, as a result of a thorough quality assurance process, well-defined subtitling guidelines, training material and the use of a subtitling tool. These elements could be adopted as best practices for further projects of volunteer subtitling to ensure the efficient diffusion of knowledge through audiovisual content.

References

Amara - Award-winning Subtitle Editor and Enterprise Offerings. https://amara.org/en/ Last accessed on 3 June 2020.

Bogucki, Ł. 2009. Amateur Subtitling on the Internet. In: Cintas J.D., Anderman G. (eds) Audiovisual Translation. Palgrave Macmillan, London.

Díaz-Cintas, J. and Aline Remael. 2007. Audiovisual Translation: Subtitling. Translation practices explained. Routledge.

Díaz-Cintas, J. & Muñoz Sánchez, P. 2006. Fansubs: Audiovisual Translation in an Amateur Environment. The Journal of Specialised Translation 6, 37-52.

Ferrer Simó, M. R. 2005. Fansubs y scanlations: la influencia del aficionado en los criterios profesionales. Puentes 6: 27-43.

Karakanta, A.; Negri M. & Turchi M. 2020. MuST-Cinema: a Speech-to-Subtitle corpus. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), Marseille, France, May 13-15 2020.

TED Our organization. https://www.ted.com/about/our-organization. Last accessed on 3 June 2020.

TED Translators Cheat-sheet. https://translations.ted.com/TED_Translators_Cheat-sheet. Last accessed on 3 June 2020.
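The length-conformity metric from the Methodology and Results sections of this abstract can be sketched in a few lines. This is an illustrative sketch rather than the author's pipeline: the function names and the input format (pairs of release year and subtitle text) are assumptions, and the over-long subtitle is a synthetic string. The 84-character limit and the 41-character English example are taken from the abstract.

```python
from collections import defaultdict

MAX_CHARS = 84  # 2 lines of up to 42 characters each, as stated in the abstract


def is_conforming(subtitle, max_chars=MAX_CHARS):
    """A subtitle conforms if its character count (spaces included) is within the limit."""
    return len(subtitle) <= max_chars


def non_conforming_rate(subtitles, max_chars=MAX_CHARS):
    """Percentage of subtitles that exceed the length limit."""
    if not subtitles:
        return 0.0
    too_long = sum(1 for s in subtitles if not is_conforming(s, max_chars))
    return 100.0 * too_long / len(subtitles)


def rate_per_year(records, max_chars=MAX_CHARS):
    """Group (year, subtitle_text) pairs by year and compute the rate for each year."""
    by_year = defaultdict(list)
    for year, text in records:
        by_year[year].append(text)
    return {year: non_conforming_rate(texts, max_chars)
            for year, texts in sorted(by_year.items())}


english = "What can you do about the climate crisis?"  # 41 characters in the abstract
records = [
    (2006, english),
    (2006, "x" * 96),  # synthetic stand-in for an over-long subtitle
    (2012, "Cosa si può fare per la crisi del clima?"),
]
rates = rate_per_year(records)
print(rates)
```

Counting characters with spaces matches the figures given in parentheses for the English examples above; how line breaks inside a two-line subtitle are counted is left open here.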


CREAMT: creative shifts, readers engagement and translation reception in literary texts translated in three modalities

Ana Guerberof Arenas, Centre for Translation Studies, University of Surrey / MSC IF at University of Groningen

Antonio Toral, Center for Language and Cognition, University of Groningen

[email protected] [email protected]

1 Introduction

Technology in general and machine translation (MT) in particular are intrinsically included in the translation process in the language industry. In fact, the 2020 Language Industry Survey by the European Association of Translation Companies (https://ec.europa.eu/info/sites/info/files/2020_language_industry_survey_report.pdf) shows machine translation post-editing (MTPE) to be the most popular service and MT the strongest technology trend. Automation has focused primarily on how technical or scientific translations are produced and processed because of their repetitive nature and the priority given to accuracy. In recent years, however, there has been an interest in seeing how MT can benefit or hinder the translation of more creative texts, i.e. literary works or marketing.

2 Related Work

Some studies show that MT might help professional literary translators to be more productive (Toral et al. 2018). However, translators' perception is that the more creative the literary text, the less useful MT is (Moorkens et al. 2018). On the other hand, there are few studies in machine-assisted translation that focus on the ultimate user of the translation (e.g. Castilho and Guerberof 2018; Guerberof, Moorkens and O'Brien 2019).

2.1 Objectives

The study presented here seeks to redress this focus by applying MT to literary texts and focusing on the final user, to answer these questions:

RQ1: Can we quantify the creativity in texts translated by humans as opposed to those produced with the aid of machines?

RQ2: Do users reading translated material produced with and without the aid of machines have different reading experiences?

2.2 Pilot project

At the beginning of 2019, we carried out a pilot experiment (Guerberof Arenas & Toral, 2019) in one language direction (English-to-Catalan) that included a questionnaire to assess narrative engagement (Mangen and Kuiken 2014), using a scale created for this purpose (Busselle and Bilandzic 2009), and to gauge the translation reception of a fictional piece translated in three conditions: from scratch (fully human translation by professional translators), machine translated, and a post-edited version (machine translation output corrected by professional translators). We found that readers were more engaged with human translation, as expected, followed by, surprisingly, the machine translated text, and lastly by the post-edited text.

2.3 Current Project

We present here the results of a follow-up experiment that seeks to explore these questions further by looking at the three modalities: machine-translated (MT), post-edited (MTPE) and translated without any aid (HT), applied to a short story.

We analyzed, on the one hand, the two main aspects of creativity, acceptability and novelty, by looking at errors and creative shifts respectively (Bayer-Hohenwarter, 2012), and, on the other, the narrative engagement and reception of these translated texts in three modalities with a larger cohort of Catalan readers (n > 80).

The results show that, in response to RQ1, creativity could be quantified when comparing translated texts in different modalities. We found that the HT texts score higher in creativity than MTPE, and MTPE scores higher than MT, especially when analyzing novelty in the translated texts. The full results of this experiment will be available in Guerberof-Arenas, A. & Toral, A. (forthcoming). The Impact of Post-editing and Machine Translation on Creativity and Reading Experience. Translation Spaces. John Benjamins Publishing.

In response to RQ2, the clear answer is yes: readers engage differently depending on the modality. We saw a pattern where HT always shows higher scores in narrative engagement and translation reception, except that MTPE scores slightly higher in enjoyment. However, there were statistically significant differences only between HT and MT, and MTPE and MT, but not between HT and MTPE.

Our objective is ultimately to understand how the use of MT technology might or might not constrain creativity in the translators' process, and how readers engage with and receive texts that have been translated with the assistance of this technology. Our next steps are to use this methodology to analyze English literary texts translated into Catalan and Dutch in the three aforementioned modalities (https://cordis.europa.eu/project/id/890697).

Funding information

This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 890697 and from CLCG's 2020 budget for research participants and has been partially funded by the Expanding Excellence in England Programme funded by Research England.

References

Bayer-Hohenwarter, G. (2012). Creative Shifts as a Means of Measuring and Promoting Translational Creativity. Meta, 56(3), 663-692. https://doi.org/10.7202/1008339ar

Busselle R., & Bilandzic, H. (2009). Measuring Narrative Engagement. Media Psychology, 12:4, 321-347. Routledge, Taylor and Francis Group.

Castilho, S., & Guerberof Arenas, A. (2018). Reading Comprehension of Machine Translation Output: What makes for a better read? In Proceedings of 21st Annual Conference EAMT, Alicante, Spain.

Guerberof, A., Moorkens, J., and O'Brien, S. (2019). What is the impact of raw MT on Japanese users of Word: preliminary results of a usability study using eye-tracking. In Proceedings of the Seventh Machine Translation Summit (MT Summit 2019), Dublin, Ireland.

Guerberof Arenas, Ana, and Antonio Toral. Measuring Reader Engagement in Literary Texts: A Study Comparing Human Translation to Machine Assisted Translation. In Book of Abstracts EST Congress 2019, 239. Stellenbosch: EST, 2019.

Mangen, A., & Kuiken, D. (2014). Lost in an iPad: Narrative engagement on paper and tablet. Scientific Study of Literature, 4(2), 150-177.

Moorkens, J., Toral, A., Castilho S., & Way, A. (2018). Perceptions of Literary Post-editing using Statistical and Neural Machine Translation. Translation Spaces 7:2.

Toral, A., Wieling, M., & Way, A. (2018). Post-editing effort of a novel with statistical and neural machine translation. Frontiers in Digital Humanities, 5, 9.

Predictive eye-movements in student vs. professional Turkish (A) - English (B) simultaneous interpreters: between-group and individual differences

Deniz Özkan, Koç University, Istanbul, Turkey, [email protected]

Ena Hodzik, Boğaziçi University, Istanbul, Turkey, [email protected]

Ebru Diriker, Boğaziçi University, Istanbul, Turkey, [email protected]

1. Introduction

Studies on predictive processing of spoken language have found that language users can form predictions on different levels as an integral part of language comprehension. They can pre-activate the semantic/conceptual (Federmeier & Kutas, 1999; Federmeier & Kutas, 2001) and the structural features (Van Berkum et al., 2005; Wicha et al., 2004) of an upcoming argument based on various linguistic elements in the spoken input, such as semantic and/or morphosyntactic cues (Altmann & Kamide, 1999; Kamide, Altmann, & Haywood, 2003). Such pre-activation has been demonstrated as anticipatory looks to the referents of upcoming arguments in visual scenes (e.g. Altmann & Kamide, 1999; Huettig & Altmann, 2005) using the visual world paradigm (VWP; Cooper, 1974; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995).

In Turkish, a fairly under-investigated language in this context, Brouwer, Özkan, & Küntay (2018) found that Turkish-speaking adults, similar to Dutch-speaking adults, could predict the upcoming argument based on the verb semantics in verb-medial sentences. Moreover, Özge, Küntay, & Snedeker (2019) recently showed that Turkish-speaking adults and 4-year-old children can predict the upcoming argument based on morphosyntactic cues (i.e., case markers) in both verb-medial and verb-final sentences. This finding implies that the case markers are interpreted incrementally and the thematic roles of the upcoming nouns can be assigned based on those case marker cues.

Recent psycholinguistic research has started investigating individual differences in predictive processing. In an investigation of spoken language processing using the VWP, Huettig & Janse (2016) found that the higher an individual's working memory capacity (WMC), the more they engaged in prediction. Huettig (2015) argues that language-based predictions are mediated by working memory, where information in the sensory and linguistic input (in visuospatial memory) is connected to information in long-term memory (including phonological and semantic representation). The higher an individual's WMC, the faster these connections will be made. As such, WMC appears as one important correlate of predictive processing.

Prediction has also been found in the interpreting studies literature, where professional interpreters have shown more instances of prediction than student interpreters (Jörg 1997; Riccardi 1996). In addition, expertise-related advantages have also been observed in tasks measuring working memory capacity (WMC) (Padilla et al., 2005; Yudes et al., 2013). Nevertheless, while individual variations in WMC have been related to SI performance in untrained bilinguals (Christoffels et al., 2003) and trainee interpreters (Zhang and Yu, 2019; Lin et al., 2018), studies with professional interpreters have found either non-significant or weaker associations of WMC with SI performance (Timarová et al., 2015). Similarly, Liu et al. (2004) found differences in SI performance between professional and novice interpreters despite the two groups being matched for WMC.

To our knowledge, predictive processing based on verb semantic cues has not been previously studied in the interpreter population using an online measure, such as eye tracking. What is more, even though WMC has been associated with prediction skill and found to be enhanced in interpreters, the relationship between the two has not been previously investigated in the context of interpreting studies. As such, this study aims to identify whether the professional interpreters' semantic prediction skills during spoken language processing differ from those of students and whether WMC is associated with prediction skills in those two groups. Based on previous literature, we expected (1) the professional interpreters to exhibit better prediction skills compared to the student interpreters (i.e., earlier initiation of prediction or a stronger effect); (2) the professional interpreters to have larger WMC than the student interpreters; and (3) WMC skills to be involved in predictive processing differently for the two groups (i.e., larger involvement in the student interpreters than the professional interpreters).

2. Method

Our study investigated eye movements based on verb semantic cues in student (N = 21) and professional (N = 20) interpreters with Turkish as their A and English as their B language in a VWP eye tracking experiment. Participants completed a verb-based semantic prediction task and an operation span task. All participants had normal or corrected-to-normal vision.

The eye tracking experiment was the same as the one that was used with Turkish-speaking participants in Brouwer et al. (2018). The experiment had 24 visual displays with two colored line drawings, corresponding to a target (e.g., a cake) and a competitor item (e.g., a tree). The visual displays were paired with prerecorded sentences that were either semantically constraining (1) or neutral (2), corresponding to semantic and neutral conditions, respectively.

(1) Çocuk yiyor bu büyük keki [semantic]
    'The boy eats the big cake'
(2) Çocuk görüyor bu büyük keki [neutral]
    'The boy sees the big cake'

A female native adult speaker of Turkish recorded the sentences, which were minimally manipulated in PRAAT to adjust the prediction window (from verb onset to noun onset) to 2400 ms (see Brouwer et al., 2018 for the reasoning). Because our participants were adults, we did not use the motivational filler items, as was done in Brouwer et al.'s original experiment with children. The onset of the prerecorded sentences was synchronized with the onset of the presentation of the visual display, so that the participants did not have a preview of the visual display. Experimental stimuli were presented by E-Prime software, and a Tobii T120 eye tracker recorded the participants' gaze behavior with a sampling rate of 60 Hz. The procedure lasted about 15 minutes.

We used an automated version of the operation span task for assessing participants' working memory span (Mizrak & Oztekin, 2016; Unsworth, Heitz, Schrock, & Engle, 2005). The participants were required to decide whether the answer for an arithmetic operation was correct or not, as they tried to remember a set of unrelated letters (F, H, J, K, L, N, P, Q, R, S, T, Y) in the same order as presented. The participants first completed three practice sessions before the experimental procedure began. The experiment involved a letter presented on the screen for 1000 ms after each arithmetic operation. The list length for the letters varied from 3 to 7. The operation span score represented the number of correct items

Page 11 Book of Abstracts: Translation in Transition (TT5) October 15 - 17, 2020 recalled in the correct position. The procedure experience, working memory is exploited less, lasted about 10 minutes. This task constitutes a not more, for the purpose of predictive complex span task as it involves both a processing processing. This in turn may imply more selective and storage component, but what sets it apart from processing of information in the auditory input by other complex span tasks like reading or listening virtue of expertise (Liu et al., 2004). span, which only involve processing of verbal information, is the fact that it also engages more Given the strategic importance of prediction general (math) problem solving skills (Turner & during interpreting, our results may help identify Engle, 1989). the experience-related effects on predictive processing of spoken language. Thus, the current 3. Results and discussion exploratory study may inform both conference interpreting and psycholinguistics literatures. We found a prediction effect (i.e., greater likelihood of looks to the target object in the semantic versus the neutral condition) for both References groups in the verb-based semantic prediction task, Altmann, G. T., & Kamide, Y. (1999). as indicated by previous research with Turkish- Incremental interpretation at verbs: speaking adult participants (Brouwer et al., 2019). Restricting the domain of subsequent However, this effect seemed to be stronger for the reference. Cognition, 73(3), 247-264. professionals than the students. Based on Brouwer, S., Özkan, D., & Küntay, A. C. (2018). preliminary interpretation of the data, the two Verb-based prediction during language groups exhibited different gaze behavior: the processing: the case of Dutch and professional interpreters were earlier than the Turkish. Journal of child language, 46(1), 80- 97. students in initiating predictive looks to the target objects by about 300 ms. Christoffels, I. K., De Groot, A. M. 
B. & Waldorp, L. J. (2003). Basic skills in a complex task: A Unlike previous literature, there were no graphical model relating memory and lexical significant group differences in the WMCs of our retrieval to simultaneous interpreting. participants. However, when the two groups were Bilingualism: Language and Cognition, 6, 201- divided into high versus low WMC sub-groups 211. based on the median WMC scores, the high WMC Cooper, R. M. (1974). The control of eye fixation by student interpreter group was found to outperform the meaning of spoken language: A new the low WMC student interpreter group in the methodology for the real-time investigation of prediction task, whereas no such performance speech perception, memory, and language difference was observed for the high versus low processing. Cognitive Psychology, 6(1), 84107. WMC professional groups. Federmeier, K. D., & Kutas, M. (1999). A rose by 4. Conclusions any other name: Long-term memory structure and sentence processing. Journal of memory and In line with our hypothesis, these findings Language, 41(4), 469-495. suggested that the professional interpreters Federmeier, K. D., & Kutas, M. (2001). Meaning and exploited the semantic prediction cue faster and modality: Influences of context, semantic memory more competently than the students did. In organization, and perceptual predictability on picture addition, it is the only type of cue that the processing. Journal of Experimental Psychology: professional interpreters made use of, and perhaps Learning, Memory, and Cognition, 27(1), 202. needed, unlike the students, who also relied on their working memory span. It seems that with


Huettig, F., & Altmann, G. T. (2005). Word meaning and the control of eye fixation: Semantic competitor effects and the visual world paradigm. Cognition, 96(1), B23-B32.

Huettig, F. (2015). Four central questions about prediction in language processing. Brain Research, 1626, 118-135.

Huettig, F., & Janse, E. (2016). Individual differences in working memory and processing speed predict anticipatory spoken language processing in the visual world. Language, Cognition and Neuroscience, 31(1), 80-93.

Jörg, U. (1997). Bridging the gap: Verb anticipation in German-English simultaneous interpreting. In M. Snell-Hornby, Z. Jettmarova & K. Kaindl (eds.), Translation as Intercultural Communication: Selected Papers from the EST Congress, Prague 1995. Amsterdam: John Benjamins Publishing.

Kamide, Y., Scheepers, C., & Altmann, G. T. (2003). Integration of syntactic and semantic information in predictive processing: Cross-linguistic evidence from German and English. Journal of Psycholinguistic Research, 32(1), 37-55.

Lin, Y., Lv, Q., & Liang, J. (2018). Predicting Fluency With Language Proficiency, Working Memory, and Directionality in Simultaneous Interpreting. Frontiers in Psychology, 9, 1543.

Liu, M., Schallert, D. L., & Carroll, P. J. (2004). Working memory and expertise in simultaneous interpreting. Interpreting, 6(1), 19-42.

Mızrak, E., & Öztekin, I. (2016). Working memory capacity and controlled serial memory search. Cognition, 153, 52-62.

Özge, D., Küntay, A., & Snedeker, J. (2019). Why wait for the verb? Turkish speaking children use case markers for incremental language comprehension. Cognition, 183, 152-180.

Padilla, F., Bajo, M. T., & Macizo, P. (2005). Articulatory suppression in language interpretation: Working memory capacity, dual tasking and word knowledge. Bilingualism: Language and Cognition, 8(3), 207-219.

Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632-1634.

Timarová, Š., Čeňková, I., & Meylaerts, R. (2015). Simultaneous interpreting and working memory capacity. In A. Ferreira & J. W. Schwieter (eds.), Psycholinguistic and cognitive inquiries into translation and interpreting. Amsterdam, Netherlands: Benjamins.

Turner, M. L., & Engle, R. W. (1989). Is working memory capacity task dependent? Journal of Memory and Language, 28(2), 127-154.

Unsworth, N., Heitz, R. P., Schrock, J. C., & Engle, R. W. (2005). An automated version of the operation span task. Behavior Research Methods, 37(3), 498-505.

Van Berkum, J. J. A., Brown, C. M., Zwitserlood, P., Kooijman, V., & Hagoort, P. (2005). Anticipating Upcoming Words in Discourse: Evidence From ERPs and Reading Times. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(3), 443-467.

Wicha, N. Y., Moreno, E. M., & Kutas, M. (2004). Anticipating words and their gender: An event-related brain potential study of semantic integration, gender expectancy, and gender agreement in Spanish sentence reading. Journal of Cognitive Neuroscience, 16(7), 1272-1288.

Yudes, C., Macizo, P., Morales, L., & Bajo, M. (2013). Comprehension and error monitoring in simultaneous interpreters. Applied Psycholinguistics, 34, 1039-1057.

Zhang, W., & Yu, D. W. (2019). A duet and/or a concerto? Simultaneous interpreters' working memory and interpreting expertise. Babel-Revue Internationale De La Traduction-International Journal of Translation, 65, 519-537.


Using Commercially Available Customizable NMT to Study Translator Style

Devin Gilbert

Center for Research and Innovation in Translation and Translation Technology (CRITT) Kent State University, Kent, Ohio, USA

[email protected]

1 Introduction

Winters and Kenny remarked during last year's Translation in Transition conference that there has been a shift in the last 10-20 years in how translation scholars consider the use of machine translation (MT) with literary texts, arguing that researchers are now considering MT to aid literary translators in the creation of fluent output, whereas the field of translation studies used to only look at applying MT to literature as a source of inspiration to create texts that violated target-language norms (2019: 58). The focus of this current research trend is whether or not prose can be translated by MT to the point that it could be considered usable for post-editing (Besacier, 2014; Besacier & Schwartz, 2015; Toral & Way, 2015a; Toral & Way, 2015b; Moorkens et al., 2018).

Neural MT (NMT) systems have performed relatively well on literary texts that are similar to the data used to train these systems (Toral & Way, 2018: 277), but no research on the application of commercially available, customizable NMT for translating niche literary texts has yet been conducted. This study adds to previous research by training a commercially available, customizable NMT engine with data from a single author-translator combination (Angolan Portuguese (Ondjaki 2009) to American English (Gilbert, 2020)) in order to use the customized engine's translations of literary texts from the same author to analyze translator style.

While the customized engine did yield significant improvements in TER1 score (base engine average TER score: 53.8; customized engine average TER score: 48.2; t(57) = -2.2, p = 0.035), this study's approach of customizing an NMT engine with very little data to translate literary texts would not be recommended for applications in the language industry. Instead, this study looks beyond post-editing to a novel application of a novel technology: using a customized NMT engine as a tool for studying translator style. Specifically, what will the differences between the customized engine and the base engine reveal about a particular translator's style? Also, what research methodologies can be employed to leverage a customized NMT engine as a productive tool for textual analysis of literary translations?

1 Snover et al. (2006). Note that these TER scores were calculated on segments from the held-out test set described in Section 2.

2 Procedure

Google's base NMT engine was customized using just 1,367 sentence pairs (1,095 for training, 136 for validation, and 136 for testing; altogether 19,921 source words). The customized engine's output on a held-out test set (58 segments, 942 source words), taken from the same work as its training data, was compared with Google's base NMT engine's output, and both were compared to the human translation.

These three translated versions of the held-out test set were analyzed to identify textual features that the customized engine "learned" from the human translator. A textual feature was considered to have been "learned" from the human translator if the output of the customized engine was different from the base engine and this difference could be tied to something in the human-translated training data. These "learned" textual features were then categorized according to their commonalities. Therefore, everything in Table 1 represents something that the customized engine did differently than the base engine and that could be tied to a discrete textual feature in the training data.

Subsequently, a consistency percentage was calculated for each textual feature category (Table 1, "%" column under "Test Set"). This consistency percentage was calculated by dividing the number of observed occurrences of each category in the held-out test set (this number is shown in Table 1 in the "No." column under "Test Set") by the total number of times each corresponding textual feature could have been employed as a translation option by the customized engine.

Finally, similar calculations were made for the training data by using "find" functions on the TMX file containing all the sentence pairs used to train the customized engine (calculation results shown in columns labeled "Training Data"). For categories that were not amenable to a simple "find" function, a sample of 30 sentence pairs was randomly selected from the training data and was then analyzed. These categories are marked in Table 1 with a dash in the "No." column under "Training Data."

3 Results and Discussion

Analysis of textual features in this study's parallel corpus revealed discrete categories that the customized engine learned from its training data, which in turn can be tied to the translator's style. Google states that the main advantage of customizable NMT is accuracy with domain-specific vocabulary.2 Rows 5-10 of Table 1 correspond to what is meant by this. Rows 1-4 correspond to textual features that we could consider as dealing with more than mere lexical choice since they deal with patterns that are more complex than what are essentially terminological one-to-one correspondences, as in rows 5-10. Rows 11-13 deal with punctuation. Comparing the consistency percentages found in the customized engine's output on the held-out test set and the percentages found for the training data, there appears to be a general correlation between the two.

2 See the subsection titled "Is the Translation API or AutoML Translation the right tool for me?" under the section "AutoML Translation beginner's guide" on https://cloud.google.com/translate/automl/docs/beginners-guide

The customized engine was quite consistent when it came to what we can consider domain-specific vocabulary. We can see this with examples such as bocadinho [little bit], which is Angolan slang; cacimbo [mist/dew], a meteorological phenomenon in Angola where mornings and nights, during a certain part of the year, come with thick fog/mist; and camarada professor(a) [comrade teacher], which is peculiar to Angolan Portuguese due to the country's pro-communist exit from colonialism. The lack of consistency with camarada professor is puzzling, however, given the consistency that the training data enjoyed. Relating to this gamut of textual features, one interesting example where the customized engine avoided mistranslation comes from the phrase "ou as plantas a darem ares duma primeira respiração na frescura da manhã, entre silêncios e cacimbos molhados" [human translation: "or like plants stretching out as if they're taking their first breath on a chilly morning, in the middle of quietness and damp cacimbos."]. The customized engine learned to translate cacimbo via borrowing even though the term occurs only four times in the training data, but the base engine translated the last part of the phrase as "amid silences and wet [...]".

Table 1 shows that the customized engine succeeded in learning more than just domain-specific vocabulary. For example, it omitted optional "that" and used contractions more often than the base engine (although it did not use contractions as often as was found in the training data). Also, it translated the imperfect verb tense using the auxiliary verb "would" (i.e., "I would look at things" in the sense of "I often used to look at things"). While it did not employ this strategy in the held-out test set in all of the cases that the human translator did, the overall percentage is comparable to results from the training data. A cursory reading of the parallel corpus shows this textual feature is a salient characteristic of the translator's usual writing style and is rather frequent because the narrator in Os da minha rua often reminisces about recurrent past events using the imperfect tense.

                                                                      Test Set        Training Data
     Category                                                         No.    %        No.    %
 1   contractions instead of full form                                3      21.4%    -      78.6%
 2   elision of optional "that"                                       4      66.7%    -      75.0%
 3   "get" instead of higher-register verb                            3      75.0%    -      54.5%
 4   "would" as translation of imperfect tense                        5      33.3%    -      20.0%
 5   "bit" as translation of bocadinho                                1      100%     11     84.6%
 6   borrowing cacimbo instead of translating it                      1      100%     4      100%
 7   borrowing of school name Juventude em Luta                       2*     100%     2      100%
 8   "comrade teacher" as translation of camarada professor(a)        8      38.1%    29     100%
 9   "sun gone down" as translation of sol (ter) ido embora           1      100%     3      100%
10   "goodbye" instead of "farewell" as translation of despedida      3      75.0%    4      100%
11   capitalizing all words in title of a work                        1      100%     1      100%
12   standard English quotation marks                                 3      75.0%    284    99.3%
13   use of right single quotation mark (’) instead of (')            14     66.7%    788    99.2%

* One of the times, the customized engine partially borrowed the school's name as "Juventude in Luta"

Table 1. Textual Features of Translator's Style
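The consistency percentages in Table 1 follow the ratio described in the Procedure: observed occurrences divided by the number of places where the feature could have been used. A minimal Python sketch of that arithmetic is given below. The function name is an assumption made for illustration, and the abstract reports only the observed counts and the resulting percentages, so the denominator in the example (21) is a hypothetical value chosen to be consistent with a reported percentage, not a figure taken from the study.

```python
def consistency_percentage(observed: int, opportunities: int) -> float:
    """Percentage (one decimal) of opportunities in which a feature was used.

    observed:      times the feature appears in the engine's output
    opportunities: times the feature could have been used (hypothetical
                   input here; these counts are not reported in the abstract)
    """
    if opportunities <= 0:
        raise ValueError("opportunities must be a positive count")
    if not 0 <= observed <= opportunities:
        raise ValueError("observed must lie between 0 and opportunities")
    return round(100 * observed / opportunities, 1)


# Hypothetical example: a feature used 8 times out of an assumed
# 21 places where it could have been used.
print(consistency_percentage(8, 21))
```

Categories with a single opportunity always come out at either 0% or 100%, which is why the text treats the 100% rows that occur only once with caution.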

In line with the above categories, there were other examples of the customized engine following a general trend of using a lower register than the base engine. For example, the base engine translated the phrase um frasco grande e bonito as "a beautiful and large jar," while the customized engine translated it as "a nice big jar." This was similar to the human translator's solution: "a big pretty jar." Another example of the customized engine lowering the register can be observed when the customized engine translated "A camarada professora tá muito bonita" as "Comrade teacher looks really pretty," whereas the base engine rendered "Comrade teacher is very beautiful." Such examples are not represented in Table 1 because it is difficult to place them into any one, discrete category. Nonetheless, the customized engine's ability to pick up on patterns of lower register is remarkable given the very low number of sentence pairs that were used in training.

As for the textual feature categories relating to punctuation, it was remarkable to see the customized engine capitalize all of the words in the title of a book that was cited in the held-out test set, whereas the base engine did not. Only one such example that the customized engine could have "learned" from was found in the training data. However, as with other categories that were treated 100% consistently by the customized engine (but that also only occurred once), we must take the consistency with a grain of salt because we are essentially only looking at a single data point that happened to be translated in a way that was consistent with the human training data. This becomes even more apparent when we see that the other two punctuation categories, which have multiple occurrences in the held-out test set, were much less consistent than in the human training data. In the case of quotation marks, however, this could be due to the fact that Portuguese orthography uses two different types of quotation marks (“ ” and « »), and dashes can also be used for purposes other than quotation.

Despite little training data, the consistency with which the customized engine exhibited the textual features listed in Table 1 is very similar to that observed in the human-derived training data, with the exception of contractions and how it handled camarada professor(a). It is curious that this fairly heavily represented term was not handled very consistently at all while lexical items with very few instances in the training data (such as cacimbo and despedida) were "learned" very consistently by the customized engine. Perhaps multi-word terms experience more interference than single-word terms.

4 Conclusion

This study not only illustrates how commercially available, customizable NMT engines can be used to elucidate certain aspects of a translator's style, it also sheds light on how this technology can make considerable gains with very few examples in the training data. While the customized engine sometimes applied textual features learned from its human training data inconsistently, this is nonetheless a useful tool for scholars studying translator style. The research methodology used in this study could be used as a starting point for further qualitative analysis of a translator's work, and its inclusion in research on translator style could lend such studies more objectivity, or at least more systematicity.

It would be especially interesting to use this methodology multiple times for the same translator, training different customized engines with data from different source authors. For example, one engine could be customized with data from Gregory Rabassa's translation of Julio Cortázar's Rayuela [Hopscotch] while another engine could be trained with Rabassa's translation of Gabriel García Márquez's Cien años de soledad [One Hundred Years of Solitude]. One could even use the Rayuela engine to translate a test set from Cien años de soledad to see which aspects of translator style are specific to the translator and which are specific to when they are translating a certain author or a certain work.

This study also brings up some questions about commercially available, customizable NMT systems. There were multiple textual features that the customized engine "learned" from the translator despite there being only one instance of each in the training data. How consistently will a customized NMT engine copy a textual feature that it has only seen once? Conversely, since there were textual features that were not consistently applied by the customized engine despite there being many instances in the training data, how much does such a customized engine's consistency depend on the consistency of the training data? Or what other factors impinge on a customized NMT engine's ability to apply textual features present in its training data?


References

Laurent Besacier. 2014. "Traduction automatisée d'une œuvre littéraire: Une étude pilote," Traitement Automatique du Langage Naturel (TALN), Marseille, France.

Laurent Besacier and Lane Schwartz. 2015. "Automated Translation of a Literary Work: A Pilot Study," Proceedings of the Fourth Workshop on Computational Linguistics for Literature, 114-122.

Devin Gilbert. 2020. The People On My Street, unpublished manuscript.

Joss Moorkens, Antonio Toral, Sheila Castilho, and Andy Way. 2018. "Translators' perceptions of literary post-editing using statistical and neural machine translation," Translation Spaces, 7(2), 240-262.

Ondjaki. 2009. Os da minha rua. Livro de Bolso, BIS, Barcelona.

Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. "A study of translation edit rate with targeted human annotation," Proceedings of association for machine translation in the Americas.

Antonio Toral and Andy Way. 2015a. "Machine-assisted translation of literary text: A case study," Translation Spaces, 4(2), 240-267.

Antonio Toral and Andy Way. 2015b. "Translating literary text between related languages using SMT," Proceedings of the Fourth Workshop on Computational Linguistics for Literature, 123-132. https://www.aclweb.org/anthology/W15-0714

Antonio Toral and Andy Way. 2018. "What level of quality can neural machine translation attain on literary text?", Translation Quality Assessment, 263-87.

Marion Winters and Dorothy Kenny. 2019. "Towards the study of computer-aided literary translation in real-world settings," Translation in Transition 4 Book of Abstracts, 58-59. https://eventum.upf.edu/_files/_event/_23119/_editorFiles/file/book-abs.pdf


Production Duration and Translator Characteristics

Haruka Ogawa

Kent State University 475 Janik Drive, Kent, Ohio 44242, USA

[email protected]

1 Introduction

It is commonly acknowledged that more difficult tasks take more time. In translation process research, however, it has been found that translation production duration is not always associated with text difficulty (Hvelplund, 2011). Hence, it should be analyzed in consideration of translators' individual differences (Sun, 2015; Akbari & Segers, 2017). This study aims to better understand production duration in relation to translators' characteristics. The research question asks whether sentence-level production duration is associated with translators' backgrounds, self-evaluation of their own translation, and translation styles.

2 Data

The data was extracted from a study called ENJA15 (Carl et al., 2016) in CRITT TPR-DB.1 The .sg tables for the from-scratch translation task were used, where the textual and process data were organized at the sentence level. In this dataset, 39 participants translated two English source texts (STs) into Japanese without any external resources. Production duration was divided by the number of ST words per sentence (henceforth nDur), which was then analyzed in terms of the translator characteristics using mixed effect models. All the data was handled in RStudio (1.3.1073); the lme4 package (Bates et al. 2015) was used for the analysis, and the effects package (Fox & Weisberg, 2019) for visualization.

1 It represents "Center for Research and Innovation in Translation and Translation Technology Translation Process Research Database," available at https://sites.google.com/site/centretranslationinnovation/tprdb/public-studies?authuser=0

The translator characteristics examined here were: years of training, years of experience, L1, satisfaction rate on the task, initial orientation, and end revision. The first four features were extracted from the metadata of ENJA15, which is a summary of the questionnaires that the participants filled out after the experiment. Their years of training range from 0 to 7 and those of experience from 0 to 24. Out of 39 participants, 36 use Japanese and 3 English as their L1. The satisfaction on the task is the participants' response to the question "How satisfied are you with the translation you have produced through from-scratch translation?" They chose one of the following: HD (highly dissatisfied), SD (somewhat dissatisfied), N (neutral), SS (somewhat satisfied), and HS (highly satisfied). The last two features were added to the dataset at the session level2 following the classification of translation styles in Dragsted and Carl (2013). For the initial orientation (i.e., the phase of translation before the participant starts typing their translation), each session was annotated as a head-starter (who immediately started typing), a quick-planner (who read the first few ST sentences before typing), a scanner (who quickly scanned the ST), or a systemic planner (who read the entire ST). For the end revision, sessions were classified into three groups: Long (where the participant spent more than 25% of the session duration for revision after producing all the translation), Short (where the participant spent some time but shorter than 25% of the session duration for end revision), and None (where the participant did not spend any time for end revision).

2 Although we are interested in the characteristics of the translators, I decided to annotate the sessions, not the participants, because some translators used different approaches in two different sessions.

3 Results

Figure 1 shows the result of the mixed effect model analysis. Although we can observe some interesting tendencies in the figure, the years of experience and the satisfaction rate were the only characteristics that produced statistically significant results.

The top middle plot indicates that the more years of experience the participants had, the less time they spent on translation. This can be explained by more automated processes as a result of their experience (Jääskeläinen & Tirkkonen-Condit 1991). The bottom left plot reveals a trend that the participants who were satisfied with their own work (SS and HS) spent less time than those who were dissatisfied (HD and SD). If nDur is taken as a measure of translation difficulty, this result supports some previous studies demonstrating that pre-task subjective assessment of their own performance correlates with translation difficulty (Sun & Shreve, 2014; Liu et al., 2019).

Figure 1. Averaged Production Duration and Translator Characteristics

Further examination using mixed effect modeling revealed that the years of training have statistically significant interaction effects with the initial orientation, end revision, and years of experience, as shown in Figure 2. Let us first compare the most experienced group (the orange line) and the least experienced group (the blue line) in the bottom left plot, which visualizes the interaction effect between the years of training and experience. In the most experienced group, the years of training correlate positively with nDur, whereas the correlation is negative in the least experienced group. Yildiz (2020) suggests that more advanced translation students are more sensitive to potential problems and therefore spend more time than students at lower levels. Although the top left plot in Figure 1 seems to corroborate his finding as a general tendency, the result was not statistically significant. Moreover, the plot in Figure 2 suggests that the students with no or little experience have the opposite tendency in the present study.

Figure 2. Interaction Effects

The top left plot, which shows the interaction effect between the years of training and the initial orientation, suggests that i) the head-starters spent about the same amount of time on translation regardless of the years of training, ii) the more training the participants had, the more time they spent on translation in the group of quick-planners, but iii) this tendency was reversed in the group of scanners and systemic-planners. It is interesting to note that the most trained group spent a considerably different amount of time depending on the initial orientation, while there is not much difference in the group with zero training.

The top right plot suggests that the more training the participants had, the less time they spent on the task when they did not have the end revision phase. This tendency is again reversed when they spent some time on end revision, although the differences among the five groups are not as evident as when they had no end revision time. It remains to be investigated why two of the least trained groups spent more time when they did not have the end revision phase.
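The two derived measures defined in Section 2, nDur and the end-revision class, are simple enough to state as code. The following Python sketch is only an illustration of those definitions: the study itself handled the data in R, and the function names, the millisecond unit, and the treatment of a share of exactly 25% (counted as Short here) are assumptions that the abstract does not specify.

```python
def normalized_duration(production_duration_ms: float, st_word_count: int) -> float:
    """nDur: production duration of a sentence divided by its number of ST words."""
    if st_word_count <= 0:
        raise ValueError("st_word_count must be positive")
    return production_duration_ms / st_word_count


def end_revision_class(revision_ms: float, session_ms: float) -> str:
    """Classify a session as 'Long', 'Short', or 'None' by end-revision share."""
    if session_ms <= 0:
        raise ValueError("session_ms must be positive")
    if revision_ms <= 0:
        return "None"   # no time spent on end revision
    share = revision_ms / session_ms
    # More than 25% of the session is 'Long'; exactly 25% is treated
    # as 'Short' (an assumption, since the abstract leaves it open).
    return "Long" if share > 0.25 else "Short"


# Hypothetical values: a 12-second sentence with 8 ST words, and a
# 10-minute session with 1 minute of end revision.
print(normalized_duration(12_000, 8))
print(end_revision_class(60_000, 600_000))
```

Normalizing by ST word count keeps long and short sentences comparable, which is what allows nDur to be modeled at the sentence level.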


4 Concluding Remarks

This study has shed light on some of the relationships between production duration and the characteristics of individual translators. Although there were only two characteristics that were statistically significant, one may be able to find better results if more granular units of analysis (i.e., word or phrase level) are adopted. At the same time, the visualization of interaction effects has revealed that different participant groups have contradicting tendencies in terms of production duration, which demands careful examination of translation process data in future research.

References

Akbari, A., & Segers, W. 2017. Translation Difficulty: How to Measure and What to Measure. Lebende Sprachen, 62(1), 3-29.

Bates, D., Mächler, M., Bolker, B., & Walker, S. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48.

Carl, M., Aizawa, A., & Yamada, M. 2016. ENJA15: a free corpus of English to Japanese Translation Process Data. 22nd Annual Meeting of the Association for Natural Language Processing.

Dragsted, B., & Carl, M. 2013. Towards a Classification of Translation Styles Based on Eye-tracking and Key-logging Data. Journal of Writing Research, 5(1), 133-157.

Fox, J., & Weisberg, S. 2019. An R Companion to Applied Regression, 3rd edition. Sage, Thousand Oaks CA. https://socialsciences.mcmaster.ca/jfox/Books/Companion/index.html.

Hvelplund, K. T. 2011. Allocation of Cognitive Resources in Translation. An Eye-tracking and Key-logging Study. PhD thesis. Copenhagen: Copenhagen Business School.

Jääskeläinen, R., & Tirkkonen-Condit, S. 1991. Automated processes in professional vs. non-professional translation: a think-aloud protocol study. In S. Tirkkonen-Condit (ed.). Empirical Research in Translation and Intercultural Studies. Tübingen: Gunter Narr. 89-109.

Liu, Y., Zheng, B., & Zhou, H. 2019. Measuring the difficulty of text translation: The combination of text-focused and translator-oriented approaches. Target: International Journal of Translation Studies, 31(1), 125-149.

Sun, S. 2015. Measuring translation difficulty: Theoretical and methodological considerations. Across Languages and Cultures, 16(1), 29-54.

Sun, S., & Shreve, G. M. 2014. Measuring translation difficulty: An empirical study. Target: International Journal of Translation Studies, 26(1), 98-127.

Yildiz, M. 2020. How Do Translation Students' Cognitive Efforts Vary? - An Answer in Consideration of Pauses. Journal of Education and Practice, 11(2), 48-55.


Testing of Perceived Sound Quality on Interprefy App

Jacob T. Høgh

Co-founder Interprefy Ltd. Bellerivestrasse 11 CH-8004 ZURICH

[email protected]

1 Introduction

The 2015 global market for interpreting is estimated to be worth US $8 billion; the Swiss market is valued at CHF 180 million, plus CHF 115 million for the rental equipment, making a total of approximately CHF 300 million.

Interprefy Ltd. has developed a cloud-based platform for Remote Simultaneous Interpreting (RSI) at online meetings, conferences and multilingual events. Current audio/visual technology dates from the 60s and is outdated in many conference halls, and the equipment used is both expensive and cumbersome to transport.

A Swiss patent application was submitted in June 2015. In the same month, Interprefy Ltd. won the prestigious Language Technology Innovate Award 2015 in Brussels.

Over time, a variety of additional services and products have been added, such as text-into-speech, CAI (computer-aided interpreting), and a moderator UI, where the moderator can reduce the resolution of the video from the conference/floor while always maintaining full-range, uncompressed HD audio built into the WebRTC coding.

2 Speaker Microphone testing

The goal was to test the perceived sound quality on the Interprefy App using the Mean Opinion Score method, with a jury of as many people/interpreters as possible on earphones, back-headsets, or USB microphones via smartphone or laptop.

2.1 Proposal

The purpose of this proposal is to describe how we tested the sound quality of the Interprefy RSI platform. It is our ambition to be known as a full-range (uncompressed) HD audio-quality SaaS provider. We want 90% of all interpreters to say that they would recommend and use the Interprefy App again. This allows clients not only to source and use specialized interpreters in many, also marginal, languages from anywhere in the world, but also to eliminate the travel and accommodation expenses of the interpreters.

2.2 Layout

In close collaboration with more than 50 interpreters, or so-called wild-testers, and our strategic partner in Chicago, InterpreNet, Ltd., we have built a platform for remote simultaneous interpretation. We identified the trend-setting and most influential key actors in the industry and offered them the use of the Interprefy App in their teaching master classes for linguistics students free of charge in exchange for valuable input.

Five top short-term actions:

● We are co-operating with Professor Franz Pöchhacker at the University of Vienna and Professor Sabine Braun at the University of Surrey (UK), getting help from master classes with feedback about the Interprefy platform itself, and tested the sound quality.

● We tested the perceived sound quality on the Interprefy App with a jury of as many people/interpreters as possible using earphones, headsets, or USB microphones via smartphone or laptop (with an Interpreter-token).

● We tested three scenarios: 1. P/A system (Hybrid-In), 2. Remote speaker (microphone in front of the mouth), 3. Court room / RDSI (Remote Dialogue Simul Interpretation) setting (more than one speaker sitting 1 m from the microphone).

● Reference speech level was set to 60 dB.

● We used the Mean Opinion Score method (rating the sound quality or the proportion of speech transmission from 1-5) according to the categories:
1. Ambiance/Background noise
2. Reverberation/(Echo)
3. Overall Impression/Timbre
4. Speech Transmission Index (% words intelligible)

Footnote: Higher bitrate gives better sound, but packets are bigger, so packet loss and audio drops might increase in poor internet conditions. A bitrate setting has long been available for the Opus codec, but it was only implemented for open-source WebRTC in January 2017.

2.3 Conclusion

The connection between frequency range and bitrate is not direct, but the chance of getting a 20 kHz range on a 64 kbps connection is much higher than on a 32 kbps connection. Notice that the interpreters' ISO requires 20 kHz being transmitted to the interpreter. Until now these sound quality tests have not been documented, and might never be released to the public, but I would like to share the results with you after we've correlated the jury's responses carefully with our internal test results.

2.4 Final statement

I would like to invite you to a demo of the Interprefy platform and to test the full-range and uncompressed audio quality. The Interprefy RSI solution is easily scalable and, in cooperation with interpreters as ambassadors, we can reduce the marketing expenditures. The main benefits are these:
i. old-fashioned, heavy hardware is no longer required;
ii. interpreters in numerous languages can be sourced world-wide, with their travel and accommodation expenses eliminated;
iii. Interprefy RSI offers low latency and lip-sync.

We only broadcast the audio/interpretation to the end-users without video of the interpreters, and the audio files are 1/30 the size of a video file. Numerous use-cases with clients have validated the technology. The present COVID-19 situation has made Zoom.us and Interprefy evolve into one-stop boutiques for online web meetings and remote interpreting. In addition, interpreters now welcome the chance to work remotely on Interprefy's game-changing technology and not be limited to their metropolitan area during and after the lockdown.

3 Acknowledgement

Thank you for taking the time to read this proposal and for inviting me to join you virtually during a remote demonstration of the Interpreter UI at the Translation in Transition (TT5) conference in October.

References

Please confer the following articles:
https://wiki.xiph.org/Opus_Recommended_Settings
http://wiki.hydrogenaud.io/index.php?title=Opus
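As an illustration of the Mean Opinion Score method used in the tests above, the following minimal sketch averages 1-5 jury ratings per scenario and category. The ratings, the function name `mean_opinion_scores`, and the data layout are hypothetical and only serve the example; they are not Interprefy's test data or tooling.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical jury ratings: (scenario, category, score on the 1-5 scale).
ratings = [
    ("Hybrid-In", "Background noise", 4),
    ("Hybrid-In", "Background noise", 5),
    ("Hybrid-In", "Reverberation", 3),
    ("Remote speaker", "Background noise", 2),
    ("Remote speaker", "Background noise", 4),
    ("Remote speaker", "Reverberation", 4),
]


def mean_opinion_scores(rows):
    """Average the 1-5 ratings for every (scenario, category) pair."""
    buckets = defaultdict(list)
    for scenario, category, score in rows:
        if not 1 <= score <= 5:
            raise ValueError(f"score {score} is outside the 1-5 scale")
        buckets[(scenario, category)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


mos = mean_opinion_scores(ratings)
for (scenario, category), score in sorted(mos.items()):
    print(f"{scenario:15s} {category:18s} MOS = {score:.2f}")
```

Keeping the scores keyed by scenario and category makes it easy to compare, for instance, background noise in the hybrid setting against the remote-speaker setting.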


Eye-Voice Span in Sight Interpreting: Evidence from Both Process and Product

Jia Feng
Renmin University of China
[email protected]

Shirong Chen
Renmin University of China
[email protected]

Michael Carl
Kent State University
[email protected]

Jinjin Chen
University of Macau
[email protected]

Yueqi Zhu
Renmin University of China
[email protected]

1 Introduction

Sight interpreting, as one of the basic modes of interpreting, is a hybrid form, in that a written source text is turned into an oral - or signed - target text in another language in real time (Cenkova, 2015: 374). The eye-voice span (EVS) in sight interpreting measures the time lag, or the temporal delay, between reading the source text and producing the target text. Like the other two time lag metrics in interpreting/translation, namely ear-voice span in simultaneous interpreting and eye-key span in written translation, eye-voice span in sight interpreting is seriously under-researched, although time lag metrics have long been regarded as having the potential to become very valuable measures in translation/interpreting process-oriented research (Timarová, Dragsted, & Hansen, 2011: 121), as they could provide insights into the temporal characteristic of simultaneity in interpreting, the temporal development of the target speech in relation to the source speech, the speed of translation, as well as into the cognitive load and cognitive processing involved in the interpreting/translation process. This dearth of research so far can be largely attributed to the methodological challenges involved, for example, recording eye movements during reading in sight interpreting, as well as synchronizing eye movement data with spoken production data, which was a strenuous and heavily time-consuming task that was previously done manually.

Thanks to the innovative breakthrough made by Carl and Yamada (2017), the spoken production process during sight interpreting was recorded by Audacity, later transcribed by Automatic Speech Recognition (ASR), and further synchronized with the eye movement data. On top of that, product data are integrated with the process data, as the ST and TT are aligned at word level and the word-level process data related to ST and TT are mapped to the product data. This innovation enables a more accurate and finer-grained analysis of eye-voice span.

2 Research Design

The experiments were carried out in an eye-tracking lab with sound-proof walls and doors, as well as stable illumination. Eye calibration was done every time before the interpreter started sight interpreting. Immediately after the interpreting task, participants filled in the modified NASA Task Load Index (NASA-TLX) (Sun 2012) in order to measure their translation difficulty in sight interpreting each text. A practice exercise was done in the beginning, in which interpreters sight interpreted a short text to get familiarized with the procedure. They took a five-minute break after each text.

2.1 Participants

We recruited 9 professional interpreters (7 females and 2 males) whose L1 is Chinese and L2 English. All of them are graduates of a world-renowned interpreting institute, after two years of formal training in interpreting during their graduate study. Their average length of working as a professional interpreter was 6.4 years.

2.2 Materials

Each of the participants sight interpreted 6 texts from English into Chinese. The average length of the texts is about 130 words (ranging from 100 words to 150 words). Four texts are news articles (Texts 1-4) and 2 texts are excerpts from articles of an encyclopedia (Texts 5-6).

2.3 Procedures

We used Translog II (Carl 2012) to display the source text (ST) and an eye-tracker (Tobii TX300) to record the eye movements of the interpreter during sight interpreting. The spoken production of the target text (TT) was recorded simultaneously by an audio recorder (namely Audacity). With the help of automatic speech recognition, the spoken translation output was later automatically transcribed into text. It was then corrected and synchronized using ELAN, an annotation tool that allows annotations for audio data. In this way, the audio output in sight interpreting was successfully transcribed into written product data with timestamps, which was then merged with Translog II data and uploaded to the CRITT TPRDB platform (https://sites.google.com/site/centretranslationinnovation/tpr-db). After that, based on the BOLT Chinese-English Word Alignment Guidelines (https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/bolt-chinese-alignment-guidelines-v2.pdf), the English source texts and the transcribed Chinese target texts were manually aligned, providing a meaning linkage between aligned words.

3 Research Questions

We tried to explore the following research questions:

1) How long is the average EVS in English-Chinese sight interpreting?

2) Do ST text difficulty and sight interpreting task difficulty influence EVS?

3) Is EVS influenced by source text features such as part-of-speech?

4) Is EVS related to metrics from process data, e.g. reading measures, key-logging measures?

4 Preliminary Findings

What follows are our preliminary findings:

1) In this study, the word-based EVS of professional interpreters when they sight interpreted English source texts to Chinese was about 2.25s (median=2.25, mean=4.86, SD=10.79).

2) The EVS performance of professional interpreters was quite consistent across all six source texts. In other words, their EVS was not influenced significantly by ST text difficulty and the sight interpreting task difficulty. This consistency may be considered as an indicator of their expertise in interpreting.

3) Comparatively speaking, parts of speech like verbs, numbers, WH words (such as "what" and "who"), adjectives, and determiners led to longer EVS, while that of nouns, adverbs, conjunctions, prepositions, and "that" was shorter.

4) Professional interpreters' EVS significantly correlated with some eye-tracking measurements, such as fixation durations and total number of fixations. It is suggested that the EVS performance was closely related to interpreters' reading behaviors.

References

Carl, M. 2012. Translog-II: A Program for Recording User Activity Data for Empirical Reading and Writing Research. Presented at The Eighth International Conference on Language Resources and Evaluation. Istanbul.

Carl, M., & Yamada, M. 2017. Tutorial on Multimodal Integration of Written and Spoken Translation Production. Presented at the 4th International Conference on Cognitive Research on Translation and Interpreting. Beijing.

Cenkova, I. 2015. Sight Interpreting/Translation. In The Routledge Encyclopedia of Interpreting Studies (pp. 374-375). London and New York: Routledge.

Sun, S. 2012. Measuring Difficulty in English-Chinese Translation: Towards a General Model of Translation Difficulty [D]. Ph.D. Dissertation. Kent: Kent State University.

Team, A. D. 2008. Audacity (version 1.2.6) [computer software]. Available: audacity.sourceforge.net/download.

Timarová, S., Dragsted, B., & Hansen, I. G. 2011. Time lag in translation and interpreting: A methodological exploration. Methods and strategies in process research: Integrative approaches in translation studies, 121-146.
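To make the word-based EVS measure more concrete, here is a minimal sketch. It assumes, as one plausible operationalization that the abstract does not spell out, that the EVS of an aligned word pair is the onset of the spoken target word minus the time of the first fixation on the source word. All timestamps and field names are hypothetical.

```python
from statistics import mean, median

# Hypothetical aligned word pairs: time (in seconds) of the first fixation on
# the source word and onset time of the spoken target word.
aligned_words = [
    {"st": "market", "first_fixation": 1.20, "speech_onset": 3.10},
    {"st": "grew", "first_fixation": 1.90, "speech_onset": 4.30},
    {"st": "rapidly", "first_fixation": 2.40, "speech_onset": 4.60},
    {"st": "however", "first_fixation": 3.00, "speech_onset": 21.00},
]


def eye_voice_spans(pairs):
    """Return speech onset minus first fixation for each aligned word pair."""
    return [round(p["speech_onset"] - p["first_fixation"], 2) for p in pairs]


spans = eye_voice_spans(aligned_words)
print("EVS per word:", spans)
print("median:", round(median(spans), 2), "mean:", round(mean(spans), 2))
```

A single long delay pulls the mean well above the median, which is one reason a skewed EVS distribution is often summarized by both values.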


Addressing the Rare Word Problem in Punjabi to English Neural Machine Translation

Kamal Deep Garg1, Ajit Kumar2, Vishal Goyal3

1Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India

2Department of Computer Science, Multani Mal Modi College, Punjab, India

3Department of Computer Science, Punjabi University, Punjab, India

[email protected], [email protected], [email protected]

Abstract

Neural Machine Translation (NMT) is an approach used these days to develop machine translation systems. This approach shows improved results compared with rule-based translation and Statistical Machine Translation. An NMT system uses a fixed vocabulary due to the computational complexity, but machine translation is an open-vocabulary problem. Previous work handles out-of-vocabulary words by using a bilingual dictionary at the postprocessing step. In this paper, a more effective and simple approach is used to handle out-of-vocabulary words in the NMT system by encoding the unknown words. We propose a Punjabi to English NMT system using the Byte-Pair-Encoding (BPE) compression algorithm along with word embedding, which overcomes the out-of-vocabulary problem for languages that do not have much translation available online. The proposed system is evaluated by using the BLEU score and Word Error Rate (WER).

Keywords: NMT, OOV, BPE, BLEU, LSTM

1 Introduction

India is a big country with several languages spoken in different regions. The language changes from region to region. A Machine Translation (MT) system can be used to translate one regional language to another regional language, as well as a regional language to English and vice versa. MT is a very challenging task for Indian languages. Morphological richness, word order differences, and the size of parallel corpora of the Indian languages make MT a complicated task. In our case of a Punjabi to English NMT system, Punjabi has Subject-Object-Verb (SOV) order, whereas English has Subject-Verb-Object (SVO) order. Moreover, Punjabi is an agglutinative language, whereas English is an inflected language. A huge parallel corpus is not available for the Punjabi-English pair.

In this paper, the baseline NMT system is developed by using the Punjabi-English parallel corpus. To improve the baseline model, two more variants are developed: one using BPE, and a second using word embedding and BPE combined. To evaluate all models, BLEU and WER are used.
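The BPE step that this abstract relies on can be illustrated with a small sketch of merge learning in the style of Sennrich, Haddow, and Birch (2016). This is not the authors' code: the toy vocabulary, the `</w>` end-of-word marker, and the function names are illustrative assumptions, and a real system would learn far more merges on the full training corpus.

```python
import re
from collections import Counter


def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for left, right in zip(symbols, symbols[1:]):
            counts[(left, right)] += freq
    return counts


def merge_pair(pair, vocab):
    """Fuse every occurrence of the symbol pair into a single symbol."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}


def learn_bpe(vocab, num_merges):
    """Return the learned merge operations and the re-segmented vocabulary."""
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(vocab)
        if not counts:
            break
        best = max(counts, key=counts.get)  # most frequent pair
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges, vocab


# Toy corpus: words split into characters, "</w>" marks the end of a word.
toy_vocab = {
    "l o w </w>": 5,
    "l o w e r </w>": 2,
    "n e w e s t </w>": 6,
    "w i d e s t </w>": 3,
}
merges, segmented = learn_bpe(toy_vocab, num_merges=4)
print(merges)
print(segmented)
```

Because frequent character sequences are merged into larger units, a rare or unseen word can still be represented as a sequence of known subwords instead of a single unknown token.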


2 Neural Machine Translation

The neural machine translation architecture is based on neural networks and the conditional probability of the translated (target) sentence given the source sentence (Revanuru, Turlapaty, and Rao 2017) (Chaudhary and Patel 2018). NMT is based on the sequence-to-sequence architecture, which is used to map the source language text to the target language text (Sutskever, Vinyals, and Le 2014). It consists of two parts, an encoder and a decoder. The encoder takes the source sentence $x_1, x_2, x_3, \ldots$ and converts it into a vector of fixed dimensions. The decoder generates the output one word at a time, using the conditional probability given the encoded vector and the previously generated words (Chaudhary and Patel 2018) (Garg and Agarwal 2018).

There are multiple ways to handle out-of-vocabulary or unknown words in MT. These are the use of a bilingual dictionary, character embedding, and subwords. There are different subword algorithms: Byte Pair Encoding (BPE) (Sennrich, Haddow, and Birch 2016), WordPiece (Schuster and Nakajima 2012), and SentencePiece (Kudo and Richardson 2018). From these three, we have used BPE to build a subword dictionary. By using the BPE algorithm, we learned independent encodings on our source and target training data with 30000 words and then applied them to the train, validation, and test sets for both source and target data.

Word embedding is a feature learning technique where words with similar meanings have similar representations in the continuous vector space (Stahlberg 2019) (Liu, Lu, and Neubig 2018). The most commonly used methods to learn word embeddings are Word2vec, GloVe, and fastText. fastText is a neural network-based library for learning word representations and sentence classification, created by Facebook's AI Research (FAIR) lab (Joulin et al. 2016) (Facebook n.d.). From fastText, we downloaded pre-trained word embeddings for Punjabi and English. After downloading, we converted the Punjabi and English vocabularies into 300-dimensional vectors, which are used in the training process.

3 Experiments

The Punjabi-English dataset has been created by collecting data from various sources. To conduct the experiments in NMT, the first task is to divide the dataset into three sets: training set, validation set, and test set. A Python script has been written to randomly divide the dataset into three sets. Table 1 shows the division of the corpus into the different sets. After this, various pre-processing steps have been performed on the training and validation sets. These include tokenization of the Punjabi and English text, truecasing of the English sentences, removal of sentences having more than 40 tokens, and removal of contractions.
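The random split and the 40-token length filter described above might look roughly like the following sketch. It is not the authors' script: the function names, the fixed seed, and the toy corpus are assumptions made for illustration, and whitespace splitting stands in for real tokenization.

```python
import random


def split_corpus(pairs, n_valid, n_test, seed=0):
    """Shuffle sentence pairs and cut them into train/validation/test sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid:n_valid + n_test]
    train = shuffled[n_valid + n_test:]
    return train, valid, test


def drop_long_pairs(pairs, max_tokens=40):
    """Keep pairs whose source and target both have at most max_tokens tokens."""
    return [
        (src, tgt) for src, tgt in pairs
        if len(src.split()) <= max_tokens and len(tgt.split()) <= max_tokens
    ]


# Toy stand-in for the parallel corpus: (source, target) sentence pairs,
# plus one pair whose source side is 41 tokens long.
corpus = [(f"src sentence {i}", f"tgt sentence {i}") for i in range(20)]
corpus.append(("word " * 41, "short target"))

train, valid, test = split_corpus(corpus, n_valid=3, n_test=3)
train = drop_long_pairs(train)
print(len(train), len(valid), len(test))
```

Fixing the random seed keeps the split reproducible, so that all model variants are trained and evaluated on the same sentences.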

Corpus Set        Sentences (Parallel)
Training Set      155678
Validation Set    1212
Testing Set       1212

Table 1: Division of Dataset into three sets

3.1 Training Details

The OpenNMT toolkit (Klein et al. 2017) has been used to train the NMT models. Three models have been developed by using this toolkit. For all three models, the batch size of 64 and the number of training epochs, 20, are fixed. The baseline model consists of four layers of bi-directional LSTM encoder and four layers of LSTM decoder of 500 dimensions, with a vocabulary size of 49345 and 51003 words for source and target. For the second model, BPE is used with the baseline model. For the third model, word embedding (WE) with BPE is used to train the baseline model. Each model took 10 to 12 hours to train on an NVIDIA GTX1050Ti 4GB GPU.

3.2 Results

All three proposed models are evaluated by using the BLEU score (Papineni et al. 2002) and the WER score. The BLEU and WER scores obtained from the three models are shown in Table 2 and Table 3.

Model                     BLEU Score
Baseline Model            31.56
Baseline Model+BPE        32.31
Baseline Model+WE+BPE     33.04

Table 2: BLEU Score of all models

Model                     WER Score
Baseline Model            48.90%
Baseline Model+BPE        47.80%
Baseline Model+WE+BPE     46.10%

Table 3: WER Score of all models

3.3 Prediction Analysis

From Figure 1, it is clear that by integrating BPE and WE into the NMT model, the BLEU score increases and the WER decreases. Taking one prediction from all three models, Table 4 shows that ਆਪ (āpa) is correctly translated to "aap" by the third model, whereas the first two models translate it to "one", which is an incorrect translation.

Figure 1: BLEU and WER score of all proposed models (bar chart comparing Baseline Model, Baseline Model+BPE, and Baseline Model+WE+BPE)

Input Punjabi Text: ਇਹ ਕਾਂਗਰਸ ਪਾਰਟੀ ਦਾ ਵੱਡਾ ਝਟਕਾ ਹੈ, ਇਕ ਆਪ ਨੇਤਾ ਨੇ ਕਿਹਾ ।
Iha kāṅgarasa pāraṭī dā vaḍā jhaṭakā hai, ika āpa nētā nē kihā.
Baseline Model: "it is a big setback of congress party ," said one leader .
Baseline+BPE: "it is a big setback of congress party , "; said one .
Baseline+WE+BPE: "it is a big blow of the congress party ," one aap leader said .

Table 4: Prediction given by all three proposed models

4 Conclusion

In this research, we developed an NMT system for a low-resource language pair, i.e. Punjabi to English. We found that adding word embedding and byte pair encoding to the baseline NMT model increases the performance of the overall system. We obtained a BLEU score of 33.04 by adding word embedding and byte pair encoding to the baseline NMT model. In future, we would like to create a large corpus to increase the effectiveness of the system.

References

Chaudhary, Janhavi R, and Ankit C Patel. 2018. Machine Translation Using Deep Learning: A Survey. International Journal of Scientific Research in Science, Engineering and Technology 4(2): 145-50.

Facebook. fastText. https://en.wikipedia.org/wiki/FastText (June 14, 2020).

Garg, Ankush, and Mayank Agarwal. 2018. Machine Translation: A Literature Review. http://arxiv.org/abs/1901.01122.

Joulin, Armand, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2016. Bag of Tricks for Efficient Text Classification. 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference 2: 427-31.

Klein, Guillaume et al. 2017. OpenNMT: Open-Source Toolkit for Neural Machine Translation. ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of System Demonstrations: 67-72. http://arxiv.org/abs/1709.03815.

Kudo, Taku, and John Richardson. 2018. SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing. EMNLP 2018 - Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Proceedings: 66-71.

Liu, Frederick, Han Lu, and Graham Neubig. 2018. Handling Homographs in Neural Machine Translation. 1336-45.

Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. (July): 311-18.

Revanuru, Karthik, Kaushik Turlapaty, and Shrisha Rao. 2017. Neural Machine Translation of Indian Languages. ACM International Conference Proceeding Series: 11-20.

Schuster, Mike, and Kaisuke Nakajima. 2012. Japanese and Korean Voice Search. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 1: 5149-52.

Sennrich, Rico, Barry Haddow, and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers 3: 1715-25.

Stahlberg, Felix. 2019. Neural Machine Translation: A Review. 1-88. http://arxiv.org/abs/1912.02047.

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems 4(January): 3104-12. http://arxiv.org/abs/1409.3215.
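For readers unfamiliar with the WER metric reported in Table 3, the sketch below computes it as the word-level edit distance divided by the reference length. It is a generic textbook formulation, not the authors' evaluation script, and the reference sentence is a hypothetical one written for the example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical reference and system output, for illustration only.
reference = "it is a big blow to the congress party"
hypothesis = "it is a big setback of congress party"
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")
```

Lower values are better: a WER of 0 means the hypothesis matches the reference word for word, which is why the decrease from 48.90% to 46.10% counts as an improvement.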


Augmenting Dependency Tags in Interactive Neural Machine Translation

Kamal Kumar Gupta1, Rejwanul Haque2, Asif Ekbal1, Pushpak Bhattacharyya1

1Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India

2ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland

1{kamal.pcs17, asif, pb}@iitp.ac.in

[email protected]

Abstract

In recent times, Neural Machine Translation (NMT) has attracted the attention of machine translation researchers and developers due to its end-to-end sequence learning flexibility. It has been showing great success in machine translation for many language pairs and domains. Despite the impressive performance of NMT, it is not completely error free, and its applicability is largely limited by the amount of data available for training the models. In a real-time translation environment, where different users have different preferences regarding translation and vocabulary selection, it becomes difficult for a trained model to generate an output which is acceptable to everyone. In that case, the output generated by the model is corrected by humans in the post-editing phase. Interactive neural machine translation (INMT) provides a human-machine collaboration platform where an interactive translation framework is backed by a neural machine translation model. INMT provides users with a highly productive environment in which to insert their choice of words into the hypothesis generated by the model. The model regenerates a new hypothesis which preserves the user's choice of word and contextually depends on the inserted token. Multiple attempts at token replacement may be required by a user to get the desired output.

In this work, we devise a mechanism to introduce syntactic information in the form of dependency tags (on the decoder side) to increase the prediction accuracy of the model and to reduce the effort, i.e. the number of token replacement attempts needed by the user to obtain the desired output. We augment the target-side sentences in the training data with dependency tags as external syntax information. We try to model the predicted output token on the previously generated token as well as the dependency tags. We perform experiments for English to Hindi, a low-resource language pair, and English to German, a high-resource language pair. We use the state-of-the-art Transformer (Vaswani et al. 2017) model as our baseline. We use a reference-simulated environment where the reference sentence is considered as the user's choice sequence. Each time, the newly generated hypothesis is matched with the reference from left to right, token by token. Each mismatch is considered an error, and the corresponding token from the reference is taken as the user's choice word. To measure the performance of our technique in the INMT environment, we use two evaluation metrics, i.e. WPA (Knowles and Koehn 2016) and WSR (Peris et al. 2017). WPA stands for word prediction accuracy, which is the ratio of correctly predicted tokens to the total sequence length. WSR stands for word stroke ratio, which is the ratio of the total tokens replaced to the total sequence length. Our objective is to increase the WPA (prediction accuracy) and reduce the WSR (replacement effort). By using dependency tags as syntax information at the target side, we achieve an absolute improvement of 2.24% and 3.48% in WPA and a reduction of 4.75% and 3.03% in WSR over the baseline for English to Hindi and English to German translation respectively.

References

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. In Advances in neural information processing systems 2017 (pp. 5998-6008).

Peris, Á., Domingo, M. and Casacuberta, F., 2017. Interactive neural machine translation. Computer Speech & Language, 45, pp.201-220.

Knowles, R. and Koehn, P., 2016, October. Neural interactive translation prediction. In Proceedings of the Association for Machine Translation in the Americas (pp. 107-120).
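The reference-simulated evaluation described in this abstract can be sketched as follows. This is a simplified reading of the WPA and WSR definitions given in the text, not the authors' implementation; `regenerate` and `toy_model` are hypothetical stand-ins for the NMT model, and the candidate sentences are invented.

```python
def simulate_session(reference, regenerate):
    """Correct a hypothesis token by token against the reference.

    `regenerate(prefix)` is a hypothetical stand-in for the NMT model: it
    returns a full hypothesis that starts with the already validated prefix.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    prefix = []
    replaced = 0
    hypothesis = regenerate(prefix)
    for position, ref_token in enumerate(reference):
        predicted = hypothesis[position] if position < len(hypothesis) else None
        if predicted != ref_token:
            # Mismatch: the simulated user inserts the reference token and
            # the model produces a new hypothesis from the corrected prefix.
            replaced += 1
            hypothesis = regenerate(prefix + [ref_token])
        prefix.append(ref_token)
    total = len(reference)
    return (total - replaced) / total, replaced / total  # (WPA, WSR)


def toy_model(prefix):
    """Toy 'model': complete the prefix from a fixed list of candidates."""
    candidates = [
        ["he", "goes", "to", "school"],
        ["she", "goes", "to", "school", "daily"],
        ["she", "walks", "to", "school", "daily"],
    ]
    for candidate in candidates:
        if candidate[:len(prefix)] == prefix:
            return candidate
    return list(prefix)


reference = ["she", "walks", "to", "school", "daily"]
wpa, wsr = simulate_session(reference, toy_model)
print(f"WPA = {wpa:.2f}, WSR = {wsr:.2f}")
```

In this simplified form every token is either predicted correctly or replaced, so the two ratios sum to one; the published metrics may count predictions differently.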


Supporting Translators Through Keyword Mining

Dr. Kara Warburton

Masters Program in Translation and Interpreting, University of Illinois at Urbana-Champaign

[email protected]

Abstract

Termbases need to include the terms that translators actually need in order to effectively serve this user group. The methodology described herein involves using concordancing software to determine a term's relevancy to the organization's corpus. Salient unigrams are identified and confirmed to be statistically interesting. They are subsequently used as search pivots for identifying productive bigrams and trigrams. If adopted, the method can increase the effectiveness of termbases for translators and other producers of corporate content.

1 Introduction and background

Managing terminology to support translators implies that the terms that translators actually need should be the primary focus and therefore these terms should be predominant in termbases that are used by translators.

This raises some key questions. What terms do translators actually need? Do termbases that serve translators include enough of those terms? And if not, how can this gap be addressed?

This paper summarizes portions of the PhD research completed by the author at the City University of Hong Kong (Warburton, 2014), which attempted to address those questions and propose solutions.

2 A translation-oriented view of termhood

The types of terms that translators need contribute to a definition of termhood -- or what constitutes a term -- that is oriented towards the translation use-case. According to traditional terminology theory, i.e. the Wusterian General Theory of Terminology (GTT), a term is the designation of a concept belonging to a language for special purposes (LSP), and further, these concepts are universal (language-independent) and can be classified systematically. In recent years, this purely semantic referential view has been challenged by scholars and researchers who, having access to large-scale corpora and computing technologies, observe terms in their linguistic and communicative environment. Further, what constitutes an LSP, the core GTT criterion, is subject to ongoing debate (Condamines 1995: 227), with some scholars claiming that business genres constitute a form of LSP (Rey 1995: 144). Previous research has also shown that so-called semi-technical vocabulary is often more challenging for translators than strictly technical terms (Warburton 2014: 73).

It has also been declared that the end-use and end-users of a terminological resource have a bearing on termhood (L'Homme 2005: 1130).


Considering the larger body of scholarly literature and research, the notion of termhood is shifting towards recognition of syntagmatic, cognitive, communicative, lexico-semantic, and textual features, and we maintain that the relative importance of these features for establishing termhood in a given production environment ultimately depends on the target application of the terminological resource (see L'Homme 2020: 59 and Condamines 2010).

From a purely pragmatic standpoint, based on the author's experience in commercial enterprises, when considering translators as the primary end user, termhood is shaped by the need for:

1. designations of concepts from LSPs (the Wusterian view) - the meaning of such units may not be known to the translator;

2. multi-word units (MWU) that designate a concept (specialized or not) and which, due to variations in TL (target language) syntax or morphology, could have multiple TL equivalents (which should be avoided if consistency is a goal);

3. any designation (specialized or not) that occurs frequently, and is also subject to potential inconsistency in the TL;

4. any designation (specialized or not) that is highly visible, such as on packaging or user interfaces.

A pragmatic definition of termhood, one that addresses the production-oriented processes of large organizations, thus needs to extend beyond the purely semantic criteria of the GTT to include aspects such as frequency of occurrence, visibility, and potential for inconsistency. This view is shared by other scholars (Schreve 2001: 785; Martin 2011; Van Campenhoudt 2006: 4, to name a few).

3 The failures of term extraction

A recognized best practice in translation, particularly for large projects distributed among multiple translators, is to identify and pretranslate the key terms in a given project. Doing so ensures a greater level of terminological accuracy and consistency. Given the large size of these projects, manual term identification is impossible. Automatic term extraction (ATE) has attracted much interest among researchers in Natural Language Processing, and a number of effective solutions have been developed at various research institutes. Few of these innovative solutions, however, have been productized. The few term extraction tools that are available on the market underperform in comparison, and most terminologists find them totally ineffective. Furthermore, the algorithms used to identify and extract terms do not align with the specific termhood criteria that serve the organization's needs for term selection. At most, they are based on (a) frequency of occurrence, (b) a few syntactic patterns for multi-word terms, and (c) in some of the more advanced ATE tools, a level of semantic domain specificity by comparison with a general reference corpus. Term extraction -- a key enabler of high quality translation -- is therefore rarely carried out.

4 Statement of the problem and aim of the research

Given the failures of ATE and the somewhat loosely defined termhood criteria for translation-oriented termbases, one has to consider how well these termbases serve their users with respect to the terms that they contain. Our first assumption is that terms in corporate termbases should align, relatively speaking, with the terms in the company's corpus, since those are the terms that are contained in content that is translated.


Another assumption is that the terms in the termbase should reflect translation-oriented termhood criteria. Our study sought to (a) establish that institutional (specifically, corporate) termbases are under-optimized with respect to providing translators with the terms that they need, (b) explain why they are under-optimized, and (c) determine ways to address this weakness.

5 Methodology

Four case studies were completed using data (termbases and corpora) from commercial companies. Using empirical statistical methods, the termbases were compared to the corresponding corpora, considering criteria such as normalized frequency (of occurrence in both the termbase and the corpus), part of speech prevalence, term length (n-grams), and more. We used a range of text processing tools and one terminology management system, TermWeb (from Interverbum Technologies), to manipulate and analyze the data.

The comparisons of the termbases and the corpora enabled us to confirm that with respect to term correspondence, there is a large gap between the corpus and the termbase. We identified two types of gap, which became the focus for our research: (1) un-documented terms, and (2) under-optimized terms. Un-documented terms are terms that do not exist in the termbase but are prevalent in the corpus. This pool of terms is therefore a promising source of terms that may be needed by translators. Under-optimized terms are terms in the termbase that do not occur at all in the corpus, or are extremely rare, and therefore, their value in the termbase is questionable. While both these situations are problematic and reduce the effectiveness of the termbase, the former is of greater concern, since it has a direct impact on the efficiency and quality of overall content production in multiple languages.

One of the ways to address the un-documented terms is to use ATE on the company's corpus. However, as stated earlier, commercially available ATE tools are ineffective. The "noise" produced by term extraction tools refers to the term candidates that are uninteresting for the purposes of providing translation support and therefore need to be removed, or "cleaned", from the raw output. These tools produce so much noise (60% or more of the output, according to trials that we have conducted) that the cleaning effort renders them impractical.

Our approach therefore turned to concordancing software. Previous research has shown that multiword terms (MWT) are abundant in terminological resources (Meyer and Mackintosh 1996: 259; Nagao 1994: 406; Maynard and Ananiadou 2001: 265; Daille et al 1996: 207, to mention a few). We further demonstrated that bigrams and trigrams are of particular interest and value for terminology work (Warburton 2014: 138). We therefore considered that bigrams and trigrams might offer potential for closing the termbase/corpus gap. With this in mind, we developed and tested a method that involves identifying salient unigrams, and then using those unigrams as search keywords to identify multi-word terms of which the unigrams are a component. We chose WordSmith Tools since it allows batch concordancing with an input list of search keywords.

To identify salient unigrams we used the Keyword function of WordSmith Tools. This function creates a word list from the corpus, and compares it to a word list created from a general reference corpus. Using a normalized keyness formula, it identifies unigrams that may be salient (statistically relevant). From the candidates offered, we determined through statistical analysis which of the candidates were under-represented in the termbase. We also identified cases where there were few MWT in the termbase that contained the keyword.
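The keyword step can be illustrated with a minimal sketch. It assumes a log-likelihood keyness score, which is one statistic commonly used for keyword analysis; the exact formula and settings used in WordSmith Tools are not given in this abstract. The function names and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness of a word seen `a` times in a study corpus
    of `c` tokens and `b` times in a reference corpus of `d` tokens."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_study)
    if b > 0:
        ll += b * math.log(b / expected_ref)
    return 2 * ll


def salient_unigrams(study_tokens, reference_tokens, top_n=10):
    """Rank unigrams that are relatively more frequent in the study corpus
    than in the reference corpus, highest keyness first."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only positive keywords (over-represented in the study corpus).
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Toy data standing in for a company corpus and a general reference corpus.
company = "backup server backup data the the".split()
general = "the the the a of data".split()
keywords = salient_unigrams(company, general)
print(keywords)
```

The resulting keyword list could then be compared with the termbase to spot keywords that appear in few or no termbase entries, which is the under-representation check described above.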


These strategies enabled us to focus our concordance searches on the keywords that had the most potential. We then used those keywords in a batch concordance to identify bigrams and trigrams formed with those keywords.

Figure 1 Sample set of keywords

WordSmith offers various formulae for identifying collocates of a search word, which is the procedure we used to find the productive bigrams and trigrams. We tested all formulae against our corpora and determined that the DICE method provided the best results.

6 Observations

Our research confirmed that bigrams and trigrams are highly effective in representing terms of interest in the corpus and serve a key translation need. Longer terms (4 words, 5 words etc., in length) tend to have lower repurposing potential, whereas unigrams are problematic because they are highly polysemic. Our concordancing methodology using salient unigrams as search pivots enabled us to identify both un-documented terms and under-optimized terms and, further, it revealed how to adjust the boundaries of those terms in order to make them fully optimized.

For example, the simple removal of a non-essential premodifier can have dramatic effects on the leveragability of the terminological unit. The term "automatic incremental backup," fully documented with multiple translations in one company's termbase, did not occur a single time in its corpus. But the term "incremental backup" occurs over 500 times. Likewise, "remote notification server," found in the termbase, does not exist in the corpus, but "notification server" occurs over 1,000 times. These latter two more productive terms were missing from the termbase.

Another example is the term "data." We identified six un-documented bigrams containing the word data (e.g. "data set", "response data," etc.) which together occur 5,500 times, exceeding by far the total number of times that the 53 existing termbase terms containing "data" occur in the corpus.

The number of under-optimized terms in the termbases was alarming. In the companies we studied, they range from 35 percent to 73 percent of the terms in the termbase. This is a significant source of redundancy and wasted resources.

7 Conclusions

The main motivation for managing terminology in a commercial setting is to reduce costs for content authoring and translation. To meet this goal, the termbase must reflect the company's corpus. Documenting terms in the termbase that do not occur in the corpus, while failing to document those that occur frequently, significantly decreases the value of the termbase and wastes company resources. When selecting terms for a termbase, it is crucial to determine the most productive boundaries of MWTs for two reasons: (1) to avoid redundancy in the termbase caused by under-optimized terms, not to mention the wasted resources incurred for documenting and translating them, and (2) to make the termbase more representative of the corpus by selecting statistically and semantically relevant terms. Both require a corpus-based approach to term identification and research.
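As an illustration of the collocation step, the following sketch scores the bigrams that contain a given keyword with the Dice coefficient, 2·f(xy) / (f(x) + f(y)). It is only a sketch under stated assumptions: how WordSmith computes its Dice-based collocate scores (window size, thresholds) is not described in this abstract, and the function name and toy sentence are hypothetical.

```python
from collections import Counter


def dice_bigrams(tokens, keyword):
    """Score every adjacent word pair that contains `keyword` with the
    Dice coefficient and return the pairs sorted by descending score."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), pair_freq in bigrams.items():
        if keyword in (w1, w2):
            scores[(w1, w2)] = 2 * pair_freq / (unigrams[w1] + unigrams[w2])
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy corpus: "incremental backup" recurs, other pairs occur once.
text = "incremental backup runs before full backup and incremental backup ends"
ranked = dice_bigrams(text.split(), "backup")
for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
```

In this toy run the recurring pair "incremental backup" ranks first, which mirrors the kind of productive bigram the study set out to find around each keyword.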


References

Condamines, Anne. 1995. "Terminology: New needs, new perspectives." Terminology. Amsterdam. John Benjamins, V. 2, No. 2, p. 219-238.

Condamines, Anne. 2010. "Variations in terminology. Application to the management of risks related to language use in the workplace." Terminology. Amsterdam. John Benjamins, V. 16, No. 1, p. 30-50.

Daille, Beatrice, Benoit Habert, Christian Jacquemin and Jean Royaute. 1996. "Empirical observation of term variations and principles for their description." Terminology. Amsterdam. John Benjamins, V. 3, No. 2, p. 197-257.

L'Homme, Marie-Claude. 2005. "Sur la notion de terme." Meta: Translators' Journal. Montreal. Les Presses de l'Universite de Montreal, V. 50, No. 4, p. 1112-1132.

L'Homme, Marie-Claude. 2020. Lexical Semantics for Terminology - An introduction. Amsterdam. John Benjamins.

Maynard, Diana and Sophia Ananiadou. 2001. "Term extraction using a similarity-based approach." Recent Advances in Computational Terminology. Didier Bourigault, Christian Jacquemin, Marie-Claude L'Homme, Eds. Amsterdam. John Benjamins, p. 261-278.

Meyer, Ingrid and Kristen Mackintosh. 1996. "The Corpus from a Terminographer's Viewpoint." International Journal of Corpus Linguistics. Amsterdam. John Benjamins, V. 1, No. 2, p. 257-285.

Nagao, Makoto. 1994. "A Methodology for the Construction of a Terminology Dictionary." Computational Approaches to the Lexicon. B.T.S. Atkins and A. Zampolli, Eds. Oxford. Oxford University Press, p. 397-411.

Rey, Alain. 1995. Essays on Terminology. Amsterdam. John Benjamins.

Van Campenhoudt, Marc. 2006. Que nous reste-t-il d'Eugen Wuster? Intervention dans le cadre du colloque international Eugen Wuster et la terminologie de l'Ecole de Vienne. Paris. Universite de Paris 7.

Warburton, Kara. 2014. Narrowing the gap between termbases and corpora in commercial environments. PhD Dissertation. City University of Hong Kong.


Psychosocial awareness in community translation: A report on human and machine translation and post-editing

Katarzyna Stachowiak-Szymczak

Department of Interpreting Studies and Audiovisual Translation Institute of Applied Linguistics University of Warsaw, Poland

[email protected]

Abstract

Community translators enable written communication between parties who do not (fluently) speak/read/write the same language, or the communication of messages, alerts, documents, etc. to individuals and groups whose dominant language differs from the one of the country or region they live/work in (Taibi 2011). In many cases translation is affected by psychosocial factors or calls for psychosocial knowledge and awareness (Assis Rosa 2006; Taibi and Ozolins 2016), e.g. when conducted for healthcare professionals and allophone patients, where knowledge and power imbalance makes it difficult to come up with one uniform nomenclature for each target client. In turn, the mentioned knowledge and awareness may ensure successful information transfer, as well as the patient's dignity, autonomy and the quality of life and dying. At the same time, building psychosocial and cultural knowledge requires time, effort and resources in the process of the translator's education and skilling. In many countries of relatively homogeneous societies and little experience in training translators in understanding and responding to psychosocial needs and differences, resources can be limited. At the same time, there is a strong need to ensure proper standards of psychosocially adjusted translation by means of machine work and post-editing.

This presentation will report on a pilot study conducted remotely on two groups of participants differing in translation expertise. The participants (N = 40, M = 18, F = 22) were engaged in two tasks: 1) translating two medical texts from Polish into English for two target clients: doctors and patients (lay persons), as well as 2) post-editing two texts from Polish into English for two target clients: doctors and patients. Briefings did not order the participants to act in a specific way, e.g. by adjusting register to the target audience, while they clearly specified the type of audience and the purpose of the texts subject to translation/post-editing. The order of the tasks, as well as the order of the texts, was counterbalanced across the participants. Next, the results of these translations and post-editings were compared with machine translation of the experimental texts. Variables included: lexical accuracy (assessed by independent judges by means of a self-designed accuracy score sheet), word frequency, word class, word emotional load, as well as sentence structure and complexity. The results indicate that translators of less expertise tend to lexically and grammatically over-complicate texts for lay persons, i.e. patients. The outcomes also point to those types and complexities of lexical and grammatical items that are best translated by machines vs. humans. The presentation can point to possible lines of training in post-editing and translation per se, as well as indicate potential risks of machine mistranslation depending on the target client.

References

Assis Rosa, Alexandra. 2006. Defining Target Text Reader. Translation Studies and Literary Theory. In Translation Studies at the Interface of Disciplines, edited by J.F. Duarte, Alexandra Assis Rosa and T. Seruya. John Benjamins, Amsterdam, The Netherlands. 99-109.

Taibi, M. 2011. Public service translation. In The Oxford Handbook of Translation Studies edited by K. Malmkjær and K. Windle. Oxford University Press, Oxford, UK.

Taibi, M. and Ozolins, U. 2016. Community Translation. Bloomsbury Advances in Translation, Sydney, Australia.


Assessing low and high translation variation in post-editing1

M. Cristina Toledo-Báez Michael Carl

University of Málaga, Department of Translation and Interpreting, Málaga (Spain)
Kent State University, Modern and Classical Language Studies, Kent, OH, USA

[email protected] [email protected]

1. Introduction and background

Previous research (Carl and Schaeffer, 2017; Carl, 2020/in press) introduced the word translation entropy (HTra) to quantify the observed variation of word (or phrase) translation choices and the word distortion entropy (HCross), which quantifies the amount of translation reordering in the target language. Several interesting findings have been reported:

• There is a strong correlation between HTra and HCross, indicating that the more translators choose different words or word forms, the more they will also produce the translations in a different word order (Schaeffer et al, 2017).
• HTra and HCross values correlate across languages. For instance, English words with high HTra values when translated into Spanish tend to have high HTra also when translated into German or Japanese (Carl et al, 2019).
• Similar correlations of HTra can be found both between human and machine translation and also across languages. These correlations of HTra may indicate that features of the source text seem to influence both human and machine translation (Ogawa et al, 2020/in press).

2. Aim, corpus of study and research questions

The aim of this paper is to analyze some English-to-Spanish post-edited translations with low and high HTra and with low and high HCross values. Given that, according to Ogawa et al. (2020/in press), correlations of HTra might be caused by linguistic features of the source text, we will focus on showing whether the linguistic features of the English source texts in our corpus explain HTra and HCross values.

Specifically, we will try to answer the following research questions:

1. What are the linguistic features behind high translation variation (HTra)? Are there common linguistic patterns that explain high translation variation?
2. What are the linguistic features behind high or low translation distortion (HCross)? Are there common linguistic patterns that explain high or low translation reordering?

1 The research presented in this study has been (partially) carried out in the framework of research projects VIP (FFI2016-75831-P), TRIAJE (UMA18-FEDERJA-067), MI4ALL (CEI-Andalucía Tech) and PROFETA (PIE 19- 33).


3. What is the correlation between translation variation (HTra) and translation distortion (HCross) for our corpus of study?

Our analysis is based on a corpus belonging to the multiLing data set available through the Translation Process Research Database (TPR-DB) housed by the Center for Research and Innovation in Translation and Translation Technology (CRITT). Our corpus consists of 30 alternative post-editions of six short English texts, each of which has between 110 and 160 words. Each text was translated with the Google Phrase-based Machine Translation in 2012. Each text was then post-edited by approximately 30 different translators whose mother tongue was Spanish2.

Regarding the corpus, it is worth mentioning that, as Carl (2020/in press) details, translations in the multiLing corpus were manually aligned. Given that manual alignment may be inconsistent or incoherent, it should be mentioned that our findings may be influenced by this manual alignment.

3. Findings and analysis

This section is organized in the following subsections: 1) high HTra values; 2) high and low HCross values; 3) high HTra and low HCross values and 4) low HTra and high HCross values. They correspond to the four main groups that we have found in our analysis.

3.1. High HTra values

Regarding HTra values, it should be noted that we have only focused on high HTra values because low HTra values imply that there is a great overlap between source text and target text representations in terms of lexico-semantics. Low HTra values have only been analyzed when they appear with high HCross values, i.e., when no effect size was given.

After analyzing our corpus, we have found that high HTra values can be due to the following linguistic features:

1. Choice of different phrasings. An example of this is "cough up" in "have to cough up". It presents a high HTra value (3.43) and a low HCross value (0.76). As can be seen in Table 1, the high HTra value can be due to the fact that 15 different translations have been used. In contrast, HCross of "cough up" is only 0.56 for "cough" and 0.76 for "up" due to the fact that, despite being a phrasal verb without literal or direct translation into Spanish, there is a high syntactic similarity in all the translations as all of them are infinitive verbs after deben/tienen (que).

Source    have    to      cough   up
HTra      1,38    1,45    3,37    3,44
HCross    0,65    1,09    0,56    0,77

30 Translations (#Occ Cross Cross Cross Cross #Occ): 1 3 deben 1 hacer 1 un 1 gasto 1 21 3 tienen 1 que 1 2 frente_a 1 hacer 5 4 tienen_que 1 1 frente 1 1 gastar 7 1 pagar 5 1 soltar 4 1 desembolsar 3 1 afrontar 1 4 pagar_un 1 1 asumir 1 1 lidiar_con_un_gasto 1 1 gastarse 1 1 3 deben 1 soportar 1 1 3 han 3 tenido_que_gastar 1 1 3 gastan 1

Table 1. HTra and HCross values of "have to cough up" along with the 30 translations and their occurrences

2. Morphological variation of the same lexeme. An example is the past participle "uncultivated" in "left uncultivated", whose HTra value is 2.9 and whose translations are variations of the "cultivar" lexeme: "sin cultivar", "sin ser cultivada", "que no se cultivan", "que permanece sin cultivar", "no se cultiva", "incultivada", "no cultivada", "a las que no se aplican los cultivos" and "cultivar".

3. Differences in phrasal translations. An example is the phrasal verb "flare up" (in

2 The data of our study is available on the CRITT website (https://sites.google.com/site/centretranslationinnovation/tpr -db) under the BML12 study.


"in the wake of flaring up"), the HTra value is very high (4.32) and the HCross value is also high (2.56). The difficulty of translating this phrasal verb into Spanish explains the high number of translations (see Table 2). It is worth stressing that most translations are wrong or without an accurate meaning. All these different translations and the very different distribution of these translations explain the high level of HTra and HCross for "flaring up".

Source    fighting   flaring   up
HTra      4.19       4.32      4.32
HCross    2.62       2.56      2.56

32 Translations (#Occ, then Cross value and target segment):
1    4 la_lucha_que_se    3 lleva_a_cabo
1    1 combates    3 que_han_surgido
1    3 la_lucha_contra    2 la_quema
6    2 la_lucha    3 contra_la_quema
1    2 armada    -1 intensificación
1    2 las_luchas    2 continúan_sin
1    -2 la_lucha    5 quema
1    5 de_las_rebeliones    2 que_surgieron
1    2 la_lucha    5 contra_la_nueva_guerra_prevista
1    1 que    2 el_conflicto
1    3 lucha_que_surge
1    3 contra_los_asesinatos
1    7 de_su_lucha_contra_lo_que_ocurre
1    3 la_lucha_emergente
1    6 de_la_lucha_contra_la_lucha
2    5 lucha_contra_quema
1    3 violencia
2    4 lucha_contra_la_quema
1    2 que_resurja
1    3 quema
1    2 la_quema
1    2 situación_actual
1    7 la_lucha_contra_la_quema_de_nuevo
1    1 conflictos
1    4 esta_más_oprimida

Table 2. HTra and HCross values of "fighting flaring up" along with the 32 translations and their occurrences

3.2. High and low HCross values

We have also investigated the linguistic reasons behind the high and low HCross values for some words.

1. Choice of different target structures. An example is "set" in "set to embarrass", with a high HCross (2.24). This passive construction presents different syntactic re-orderings with very different syntactic constructions, as shown in Table 3. This high value of HCross can be explained by the difficulty of translating passive voice into Spanish. Literal translation is generally avoided and, therefore, other constructions such as active voice, reflexive passive or impersonal construction with "se" are preferred. However, as explained below (see section 3.3.), instances of passive voice also present low HCross values.

Source    set     to      embarrass
HCross    2.24    2.08    1.91

32 Translations (#Occ, then Cross value and target segment):
1    2 lo_hizo    1 para    2 avergonzar
1    1 poner    1 en    2 evidencia_a
1    1 sirve    1 para    1 avergonzar
1    1 previsto    1 para    1 avergonzar
1    2 previsto    2 para    2 avergonzar_a
4    2 previsto_que    2 a    -1 avergüence_a
2    1 previsto    1 que    2 avergüence_a
2    2 está_previsto    1 que    2 avergüence_a
1    3 significará    5 sin_duda_para_una_mella_pública
1    6 tuvo_lugar_con_el_objetivo_de    2 avergonzar_a
1    3 ha_planeado    2 para_avergonzar_a
1    4 por_eso_ha_querido    2 avergonzar_a
1    3 que_va_a    2 avergonzar_a
1    -2 su_intención    3 avergonzar
1    4 su_objetivo_es_que    21 se_avergüence_de_ello
1    1 que    2 avergüenza_a
1    1 para    2 avergonzar_a
1    1 prevee    1 avergonzar
1    8 con_el_objetivo    1 de    5 poner_en_evidencia
1    1 ha    4 hecho_para_avergonzar_a
1    1 buscaba    2 avergonzar_a
1    4 tiene_toda_la_intención    3 de_avergonzar_a
1    1 intenta    2 avergonzar_a
1    3 tiene_como_objetivo    2 avergonzar_a
1    3 su_objetivo_es    4 dejar_en_evidencia_a
2    1 pretende    2 avergonzar_a

Table 3. HCross values of "set to embarrass" along with the 32 translations and their occurrences

2. Syntactic shift as a consequence of a different target language PoS tag. An example is the past participle "sought", whose HCross value is low (1.09). The reason might be the fact that its translations are VB (verb, base form, iene, "intenta"), VBD (verb, past tense, neg, "intentó"), IN (preposition or subordinating conjunction, negaia a me la) and VBG (verb, gerund or present participle, "tratando").

3.3. High HTra and low HCross values

We have found that there is a high effect size between HTra and HCross (R = 0.698). However, there are instances of words with high HTra and low HCross as well as words with low HTra and high HCross values:


Regarding instances with the highest HTra and the lowest HCross values, four groups of words can be found:

a. Verbs as past participles in passive voice. Two examples are "resolved" (in "it is difficult to be resolved"; very high HTra: 4.67 and low HCross: 1.93) and "given" (in "he was given four life sentences"; high HTra: 3.93 and low HCross: 1.62).
b. Highly polysemous verbs without literal translation into Spanish. An example is "serve" in "he will have to serve" (high HTra: 3.84 and low HCross: 1.97). Again, as explained above in "have to cough up", the low HCross value can be due to the fact that there is a high syntactic similarity in all the translations as all of them are infinitive verbs after deberá/tendrá (que).
c. Abstract nouns which tend to be translated as verbs in Spanish. Two examples are "awareness" (in "only the awareness of the hospital staff"; high HTra: 3.84 and low HCross: 1.97) and "breakdown" (in "the breakdown of social solidarity"; high HTra: 3.45 and low HCross: 0.99).
d. Adverbs which tend to be omitted in Spanish. An example is "understandably" (in "although developing countries are understandably reluctant to compromise"; high HTra: 3.72 and low HCross: 1.72). This adverb has been omitted by 23 out of 30 translators.

3.4. Low HTra and high HCross values

As to words with low HTra and high HCross values, two groups of instances can be found:

a. Nouns belonging to a compound noun (noun + noun):

Compound noun                                HTra    HCross
Salary (salary increases)                    2.53    2.87
Nurse (hospital nurse)                       2.19    2.58
Systems (agricultural subsistence system)    0.20    1.22
Professionals (healthcare professionals)     0.83    1.17

Table 4. Compound nouns with low HTra and high HCross values

The higher HCross can be explained by the necessary inversion of order when translating into Spanish.

b. Verbs having an auxiliary function: a (in a cined; HTra: 0.85, HCross: 1.07), "need" (in "need to adapt", HTra: 1.22, HCross: 1.30), "can" (in can ; HTra: 0.20, HCross: 0.74).

4. Concluding remarks

In this paper we have analyzed translation entropy and translation distortion in our TPR-DB English-Spanish corpus of post-edited translations. We have found that:

1. High translation variation can be due to the choice of different phrasings, morphological variation of the same lexeme and differences in phrasal translations.
2. High and low translation distortion can be explained by choice of different target structures as well as syntactic shift as a consequence of a different target language.
3. There is a high effect size between translation variation and translation distortion (R = 0.698).
4. Four groups of words present instances of words with high translation variation and low translation distortion: a. Verbs as past participles in passive voice; b. Highly polysemous verbs without literal translation into Spanish; c. Abstract nouns which tend to be translated as verbs in Spanish; and d. Adverbs which tend to be omitted in Spanish.
5. Two groups of words present instances of words with low translation variation: a. Nouns belonging to a noun + noun compound noun and b. Verbs having an auxiliary function.
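To make the HTra measure concrete, the sketch below computes a word translation entropy as the Shannon entropy (in bits) of the distribution of observed renderings. This is an illustrative assumption about the computation, not the TPR-DB implementation; the function name is hypothetical, and the occurrence counts are those of the 15 renderings of "cough up" as read from the flattened Table 1 (7, 5, 4, 3 and eleven renderings used once).

```python
import math


def translation_entropy(counts):
    """Shannon entropy (bits) of a distribution given as occurrence counts,
    one count per distinct rendering of the same source word."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Occurrences of the 15 distinct renderings across 30 post-edited versions
# (read from Table 1; treat the exact counts as an assumption).
cough_up_counts = [7, 5, 4, 3] + [1] * 11

htra = translation_entropy(cough_up_counts)
print(round(htra, 2))  # 3.44
```

A single rendering shared by all translators gives an entropy of 0, while 30 different renderings would give log2(30), roughly 4.91, which helps put values such as 3.44 or 4.32 in perspective.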


Given that HCross and HTra values correlate across languages (cf. Ogawa et al., 2020/in press), it would be interesting to carry out a cross-lingual study to check whether similar linguistic phenomena of translation variation are also found for other language pairs.

References

Carl, Michael (2020/in press). Information and Entropy Measures of Rendered Literal Translation. In Michael Carl (Ed.). Explorations in Empirical Translation Process Research. New York: Springer, 128-149.

Carl, Michael and Moritz Schaeffer (2017). Why Translation Is Difficult: A Corpus-based Study of Non-literality in Post-editing and From-scratch Translation. Hermes 56, 43-57.

Carl, Michael, Andrew Tonge and Isabel Lacruz (2019). A systems theory perspective on the translation process. Translation, Cognition & Behavior, 2(2): 211-232.

Ogawa, Haruka, Devin Gilbert and Samar Almazroei (2020/in press). redBird: Rendering Entropy Data and ST-based Information into a Rich Discourse on Translation. Investigating relationships between MT output and human translation. In Carl, Michael (Ed.). Explorations in Empirical Translation Process Research. New York: Springer, 150-173.

Schaeffer, Moritz, Kevin Paterson, Victoria A. McGowan, Sarah J. White and Kirsten Malmkjær (2017). Reading for translation. In Arnt Lykke Jakobsen and Bartolomé Mesa-Lao (Eds.), Translation in transition. Amsterdam: Benjamins, 18-54.


Metalanguage for the translation process

Masaru Yamada Mayuka Yamamoto Nanami Onish Kansai University Kansai University Kansai University [email protected] [email protected] [email protected]

Atsushi Fujita Rei Miyata Kyo Kageura NICT Nagoya University University of Tokyo [email protected] [email protected] [email protected] u.ac.jp

Keywords: metalanguage, ISO 17100, translation process, translation strategy, decision list

1 Introduction

The expression "translation process" has two major usages. In translation process research (TPR), which works within a "behavioural-cognitive experimental methodological paradigm" (Jakobsen, 2017), "translation process" basically refers to the cognitive and related processes of individual translators. TPR has contributed significantly to our understanding of hitherto underaddressed aspects of translation, i.e., the cognitive process in the act of translation and the factors affecting that process, such as the relation between cognitive effort and specific linguistic features (e.g., Lacruz et al., 2018). On the other hand, in translation industries, "translation process" refers to the external process consisting of such modules as a client-TSP contract, brief definition, project management, translation, revision, reviewing, quality assurance, etc. In this presentation, we address the translation process in the latter meaning.

Descriptions of translation processes in the latter sense, such as those most typically given in ISO 17100 (ISO, 2015), generally seem to assume already knowledgeable actors as their target readers and remain coarse. As translation industries are growing and incorporating rapidly changing technologies, we observe a serious issue here, i.e., a lack of common understanding of translation processes among different actors. MT developers and other natural language processing (NLP) researchers and engineers deal with a very limited part of translation processes without consciously noticing it. Translation learners need to gain competences and norms through practice without being given a scaffold of explicit and detailed knowledge on the processes. We see a dearth of concrete and systematized terms and expressions for talking about translation processes as one of the main causes of this situation.

Against this backdrop, we set out a project of developing a set of metalanguages for facilitating and promoting the level of shared understanding of translation in general and of the concrete translation processes going on in translation industries. An overview of our metalanguages is given in this presentation. We begin by explaining ISO 17100 as a basis for describing our translation process, and then elucidate issues with its descriptions of the operations pertaining to each subprocess. In the following sections, we provide brief descriptions of our proposed metalanguages tailored for talking about translation issues (Section 4), translation process (Section 5), and translation strategies (Section 6). We also touch on how such metalanguage can be used to annotate TPR databases (Section 7).

2 ISO 17100 as a basis of translation process

This section describes the translation process and subprocesses for which we aim to create metalanguages. We utilize ISO 17100 as a basis for describing details of the translation process. As illustrated in Figure 1, the process is divided into three phases: pre-translation, translation, and post-translation. Translatorial actions (Holz-Mänttäri, 1984), meta-actions taken by translators and actors involved in the processes before and after the translational action, may correspond to the pre/post-translation processes in Figure 1. For instance, in "Client-TSP agreement" (4.4), the "specification" for the translation task is decided by the project manager through negotiation with the client prior to the beginning of the actual job. The content of the specification or so-called translation brief plays an important role, as it influences the core process of translation or translational action where translation is produced by a translator through

Page 46 Book of Abstracts: Translation in Transition (TT5) October 15 - 17, 2020 manipulations of semiotics such as the source and the Although the description contains some important target texts. ked ch a jec ecificain and e, i i n fficienl cncee be ed f Tanlain cee (Secin 5) in Fige 1 ha translator training or for establishing a common follow the pre-anlain (Secin 4) i -called understanding of translation by specifying actions TEP (translation, editing, and proofreading), which necessary to achieve the translation for people who are consists of translation, check, revision, review, and not experienced in translation. Once again, ISO 17100 proofreading in the ISO 17100. Actors to be involved reads, The translator shall translate in accordance with in the translation phase are: 1) the translator, who is in he e. Hee, he deciin ih hi leel chage f Tanlain (5.3.1) and Check (5.3.2) of granularity is the same as saying that the translator or self-revision, 2) the reviewer, who is responsible for can translate, unless any further explanations are Reiin (5.3.3) and Reie (5.3.4), and 3) he provided as to what one should do for achieving it. feade, h efm Pfeading (5.3.5). Otherwise, one must acquire or infer the tacit skills through practice or just by guessing, without knowing 3 Description of operations exactly what to do. Given the overall flow of processes and subprocesses The d e i al age in he ene ha n and operations described above, an issue arises as to explanation of the components of purpose is provided the description of the operations for each subprocess in in he ISO andad. In he hae f Clien-TSP terms of its granularity as to whether the actor can ageemen 4.4, he TSP i nl inced eain a take actions according to the given description record of the agreement in writing. After all, especially for translation training and establishing a anlaing in accdance ih he e i j shared understanding of each operation in detail. 
Let us anhe a f aing anlaing a he clien examine examples of descriptions excerpted from ISO an,f and inevitably most parts of the translation job 17100 (ISO, 2015) egading Tanlain 5.3.1 and m deend n he en ecein aci kill Reiin 5.3.3, as follows: if he or she happens to have them.

5.3.1 Translation: The translator shall translate in While fully recognizing that ISO 17100 is for TSPs and accordance with the purpose of the translation not for learners or NLP engineers, it is used as an project, including the linguistic conventions of the important point of reference for translation practices in target language and relevant project specifications. translation industries, the aspect that has not been fully (p. 10) explored in translation studies. Enriching the

5.3.3 Revision: The reviser shall examine the descriptions of translation processes based on ISO target language content against the source language 17100 would thus benefit wider range of actors content for any errors and other issues, and its including learners and engineers. Therefore, one aim of suitability for purpose. (pp. 10-11) this study is to increase the granularity of descriptions pertaining to translation processes in the form of

Figure 1. ISO 17100 Translation Process

Page 47 Book of Abstracts: Translation in Transition (TT5) October 15 - 17, 2020

metalanguages through which learners and different actors will be able to operationalize the tacit knowledge of translation for better practice and communication.

4 Metalanguage of translation issues

Our benchmark for achieving a higher degree of detail is the translation quality assurance (QA) scheme, a set of issue categories used by reviewers to check for translation errors in the industries and in translation training. The QA scheme can serve as the criterion for the granularity of descriptions in our metalanguage. One well-known example is MQM, multidimensional quality metrics (Burchardt and Lommel, 2014), which is now the de facto QA scheme.

The use of such a typology makes the coarse descriptions more operationalizable. For instance, consider the aforementioned ISO description of "Revision" 5.3.3. The concepts of "errors" and "issues" are vague; they can be operationalized at a much higher granularity, with consistency in detection, when the quality metrics are employed. For these reasons, the error categories are now utilized even by researchers who annotate errors of both human and machine translation for research assessment purposes (e.g., Specia and Shah, 2014).
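The "consistency in detection" that such a typology is expected to bring can be made measurable. The following is a minimal sketch, assuming that two annotators label the same list of issues and that agreement is summarized with Cohen's kappa; kappa is an illustrative choice made here, not a measure named in this abstract, and the category labels and data are hypothetical placeholders.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same category by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently,
    # following his or her own category frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels assigned to six issues by two annotators.
reviser_1 = ["omission", "omission", "addition", "grammar", "grammar", "omission"]
reviser_2 = ["omission", "addition", "addition", "grammar", "grammar", "omission"]
print(round(cohen_kappa(reviser_1, reviser_2), 3))
```

Values near 1 indicate that the categories are applied consistently; values near 0 indicate agreement no better than chance.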

We have also developed our own issue typology, the MNH-TT issue typology, which is specialized for university-level training and based on the MeLLANGE issue categories (Castagnoli et al., 2006; Secară, 2005). The issue categories and their structure were systematically fine-tuned to improve usability for learners and instructors. To that end, we have compiled the list of issue categories in the form of a decision tree, a step-by-step guide that helps learners classify issues, as provided in Figure 2 (Fujita et al., 2017).

Figure 2. Decision tree for classifying a given issue (Fujita et al., 2017)

In order to identify an issue in accordance with the issue typology, a learner first finds the issue and then assigns it to the appropriate category. For instance, if a part of the source text is found to be untranslated, i.e., the part that corresponds to the source does not appear in the target text, the learner works through the questions of the decision tree, answering Yes or No, starting with Q1a. When she or he answers Yes to a question, the issue is classified as instructed at that point. If not, the learner proceeds to the following question until she or he reaches a classification. In this example, Q1a is answered No, Q1b is answered No, Q2a is answered No, and Q2b is answered No. As a result, the issue is classified as X1 omission. In this way, the issue typology will help equip learners with common knowledge to talk about translation issues with peers and the instructor.

Of greater importance is that the set of issue categories will serve as a metalanguage for the learners (Piao et al., 2019). Learners who could not distinctly recognize errors will be able to identify 16 types of errors after having learnt with the issue typology. That is to say, meta-recognition is made possible by metalanguage. The expected acquired skill is not only an improved ability to check errors during revision but also improved translation competence, as students become conscious of making such errors during translation.

A benefit in consistency of classification is also expected in terms of both intra-reviser agreement (within one learner revising multiple issues) and inter-reviser agreement (among multiple learners revising the same issue). The advantage is expected not only for training but also for enhancing shared communication among actors in interdisciplinary fields. For instance, when MT developers and translation practitioners talk about translation issues produced by an MT engine, the level of each participant's understanding of issue categories should improve with the use of a shared metalanguage. These points show that the level of concreteness of descriptions represented by the QA scheme and the MNH-TT issue typology facilitates understanding and communication among actors with different backgrounds or different degrees of experience.

5 Metalanguages of translation process

Here we introduce the modules of the translation process, for each of which a subset of metalanguages is developed. Based on the framework of the translation process provided in ISO 17100, we divided the translation process into the five sub-phases shown in Figure 3, and then designed a set of metalanguages, each of which corresponds to a specific sub-phase. A brief explanation of the metalanguage for each sub-phase is as follows:

(1) Translation project management process: Based on the descriptions provided in ISO 17100 illustrating subprocesses, we break them down into finer-grained and operationalizable items which a person in training can complete as if answering a questionnaire. For instance, it is not clear what kind of information one needs to obtain in order to determine the purpose of translation during the Client-TSP agreement phase. More specifically, if information such as the target audience is required for determining the purpose, the item should be broken down into the variables that define the target audience, such as gender, age, interests, and community membership, which are easily identified even by learners of translation. In order to accomplish this degree of detail, we consulted the literature on translation project management, and also obtained authentic documents used in the actual industry, together with interview data, from two ISO-certified TSPs in Japan.

(2) Source document property and element: The "Translation" process (5.3.1) is divided into understanding the source document and semiotic transferring. Given that, it is important to understand the source document (SD) before semiotic transferring, but the operations required for SD understanding have not been sufficiently clarified. For this purpose, we divided the task of SD understanding into two phases: SD profiling, to identify the properties of the SD that are important for translation, and SD analysis, to identify the elements within the SD to be properly translated. As metalanguages, we are developing fine-grained, comprehensive typologies of SD properties and elements.

(3) Translation strategies: Given that translation or textual transformation requires semiotic manipulations of source and target languages, a good translation results from the translator's skilled manipulation. In order to describe the manipulations, we have drawn on the translation strategy categories proposed by Chesterman (2016). Detailed explanations are given in the next section.

(4) Effect of refinement: "Check" (5.3.2) is a self-revision, carried out by the translator before submitting the translation to the reviser. The metalanguage we designed to talk about this phase is a set of categories that describe the effect of the refinements added by the translator over the course of self-revising.

(5) Issue typology: As discussed, the MNH-TT issue typology is available for the "Revision" (5.3.3) and "Review" (5.3.4) phases, where a reviewer will check the translation for any issues.

Figure 3. Five sets of metalanguages in the translation process

Each type of metalanguage is designed so that it can be used in both classroom settings and translation industries to describe and explain the translation subprocesses in detail for better practice and communication. For training purposes, we are planning to implement these metalanguages as an independent module in MNH-TT, a browser-based learning platform for translation training, so that learners using this platform can operationalize the explicit knowledge for every process by either filling out a template or annotating texts according to the metalanguages.

6 Metalanguage of translation strategies

This section explains the description of translation strategies in more detail, using a specific example to show what it can provide and how it is operationalized.

Amongst the translation strategies proposed in previous literature (Lörscher, 1991; Newmark, 1988; Vinay and Darbelnet, 1958/2000), we have selected the list of translation strategies presented by Chesterman (2016) as a point of departure. Most proposals consist of one group of categories, which makes it difficult for translation learners to distinguish, for example, between semantic and pragmatic shifts, both of which involve message-related manipulations of the target text. Chesterman's set of translation strategies (2016) helps to disambiguate this aspect with granular sets of categories which comprise three pillars (syntax, semantics, and pragmatics), each of which branches into approximately ten subcategories. This is advantageous for learners and researchers who want to dissect translation shifts or strategies.

The categories were originally designed for translation between English and German. We recompiled them so that they are applicable to English-Japanese translation. Nearly 300 translation samples extracted from translation instruction books (Okada, 2013; Tanabe and Mitsufuji, 2008) were used to verify the categories. Our rearranged list of translation categories will serve as a decision list that helps translation learners identify an appropriate strategy, starting from the top of the list, as provided in Figure 4.

Figure 4: Decision list of translation strategies

Referring to one category for each of the three pillars in the decision list, we can describe a given translation shift. An example is below.

Source text: CNN reported on (a)Thursday that a giant tornado and hailstorm had killed 51 people in China.

Target text: CNNが(a)23日、中国で巨大竜巻と雹で51人が死亡したと報じた。
[Back translation: CNN reported on (a) the 23rd that 51 people had been killed in a giant tornado and hailstorm in China.]

In this example, the underlined unit (a) "Thursday" in the source text is translated into "23日" ("the 23rd") in the target text. This translation shift is annotated with three categories, one from each pillar: G1, S1, and Pr12 in Figure 4. This means that no specific strategies are taken in terms of syntactic and semantic strategies, i.e., G1 (Literal Translation) and S1 (Semantically Equivalent), but a pragmatic strategy, Pr12 (the domain adaption), applies to this case.

7 Application to TPR and the future

Our metalanguage of translation strategies could potentially be used for deeper analysis in TPR. Translation strategy is concerned with a process normally attributed to individual translators and thus can correspond to the translation process as defined in TPR. In the existing CRITT TPR-DB (Carl, 2012), the relationship between the source and target texts is represented only by annotations on alignments of formal linguistic units (at the word or phrase level). If we also annotate the process data with our strategies, it may increase the granularity of data analysis as well. For example, we can investigate the relationship between each translation strategy and the translator's cognitive effort required for the manipulation.
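As a rough illustration of how process data annotated with the three strategy pillars might be analyzed, the sketch below groups annotated units by one pillar and averages an effort measure. This is a minimal sketch under stated assumptions: the record layout, the field names, the placeholder code "Pr0" (standing for "no pragmatic strategy"), and all duration values are hypothetical and are not taken from the CRITT TPR-DB or from Figure 4; only G1, S1, and Pr12 come from the example above.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ShiftAnnotation:
    source_unit: str
    target_unit: str
    syntactic: str      # e.g. "G1" (Literal Translation)
    semantic: str       # e.g. "S1" (Semantically Equivalent)
    pragmatic: str      # e.g. "Pr12"; "Pr0" is a made-up placeholder
    duration_ms: float  # hypothetical production duration for the unit


def mean_duration_by(annotations, pillar):
    """Average duration per category of one pillar.

    `pillar` is one of "syntactic", "semantic", or "pragmatic".
    """
    groups = defaultdict(list)
    for ann in annotations:
        groups[getattr(ann, pillar)].append(ann.duration_ms)
    return {category: mean(values) for category, values in groups.items()}


# Toy data: durations are invented purely for illustration.
shifts = [
    ShiftAnnotation("Thursday", "23日", "G1", "S1", "Pr12", 2400.0),
    ShiftAnnotation("CNN", "CNN", "G1", "S1", "Pr0", 600.0),
    ShiftAnnotation("China", "中国", "G1", "S1", "Pr0", 1000.0),
]
print(mean_duration_by(shifts, "pragmatic"))
```

In a real study the duration field would be replaced by whichever effort measure the process database provides, and the comparison across categories would need appropriate statistical modelling instead of a plain mean.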


8 Conclusion

Fine-grained metalanguages are expected to make it possible to externalize the tacit knowledge of skilled translators for use in translator training and in communication among different actors. By operationalizing this knowledge through metalanguages, translation learners can consciously practice the behaviors of professionals, and different actors can accurately talk about translation.

Acknowledgement

This work is partly supported by JSPS KAKENHI Grant Number 19H05660.

References

Burchardt, B. and A. Lommel (2014) QT-LaunchPad supplement 1: Practical guidelines for the Use of MQM in Scientific Research on Translation Quality. http://www.qt21.eu/downloads/MQM-usage-guidelines.pdf.

Carl, M. (2012) "The CRITT TPR-DB 1.0: A database for empirical human translation process research", Proceedings of the AMTA 2012 Workshop on Post-Editing Technology and Practice (WPTP 2012), Association for Machine Translation in the Americas (AMTA), pp. 9-18.

Castagnoli, S., D. Ciobanu, N. Kübler, K. Kunz and A. Volanschi (2006) "Designing a learner translator corpus for training purposes", Teaching and Language Corpora Conference (TALC) 2006.

Chesterman, A. (2016) Memes of translation: The spread of ideas in translation theory. Amsterdam: John Benjamins.

Fujita, A., K. Tanabe, C. Toyoshima, M. Yamamoto, K. Kageura and A. Hartley (2017) "Consistent classification of translation revisions: A case study of English-Japanese student translations", Proceedings of the 11th Linguistic Annotation Workshop (LAW), pp. 57-66.

Holz-Mänttäri, J. (1984) Translatorisches Handeln: Theorie und Methode. Helsinki: Suomalainen Tiedeakatemia.

International Organization for Standardization (ISO) (2015) ISO 17100:2015. Translation services - Requirements for translation services. First edition.

Jakobsen, A. (2017) "Translation process research", in J. Schwieter and A. Ferreira (eds) Handbook of translation and cognition (pp. 386-401). Blackwell Handbooks in Linguistics. Malden, MA: John Wiley & Sons.

Lacruz, I., M. Carl and M. Yamada (2018) "Literality and cognitive effort: Japanese and Spanish", Proceedings from 11th Edition of the Language Resources and Evaluation Conference (LREC 2018).

Lörscher, W. (1991) Translation performance, translation process and translation strategies: A psycholinguistic investigation. Tübingen: Gunter Narr.

Newmark, P. (1988) A textbook of translation. New York: Prentice Hall.

Okada, N. (2013) Honyaku no fuseki to jouseki [Strategy and tactics of translation]. Sanseido.

Piao, H., S. Han and K. Kageura (2019) "The use of meta-language in translation revision", 2019 International Conference on Translation Education: Computer-Aided Translator Training (CATT) of Machines and Man.

Secară, A. (2005) "Translation evaluation: A state of the art survey", Proceedings of the eCoLoRe/MeLLANGE Workshop, pp. 39-44.

Specia, L. and K. Shah (2014) "Predicting human translation quality", Proceedings of the 11th Conference of the Association for Machine Translation in the Americas (AMTA), pp. 288-300.

Tanabe, K. and K. Mitsufuji (2008) Puro ga oshieru kiso karano honyaku sukiru [Building translation skills]. Tokyo: Sanshusha.

Vinay, J.-P. and J. Darbelnet (1958/2000) "A methodology for translation", in L. Venuti (ed.) The Translation Studies Reader (pp. 84-95). London: Routledge.


Explorations in Empirical Translation Process Research: A Book Presentation

Michael Carl

Kent State University, Ohio, USA

[email protected]

1 Introduction

The book Explorations in Empirical Translation Process Research, to appear in the Springer series Machine Translation: Technologies and Applications, assembles fifteen original, interdisciplinary research chapters that explore methodological and conceptual considerations as well as user and usage studies to elucidate the relation between the translation product and translation/post-editing processes. It introduces numerous innovative empirical/data-driven measures as well as novel classification schemes and taxonomies to investigate and quantify the relation between translation quality and translation effort in from-scratch translation, machine translation post-editing, and computer-assisted audiovisual translation. Translation experiments are conducted for several language pairs in different translation modes using eye-tracking and/or key-logging technology, to compare different types of translator expertise, different types of texts, and various types of linguistic expressions. The research addresses questions in the translation of cognates, neologisms, metaphors, idioms, and figurative and culture-specific expressions; it re-assesses the notions of translation universals and translation literality, elaborates on the definition of translation units and syntactic equivalence, investigates the impact of translation ambiguity and translation entropy, suggests alternative interpretations of the human translation edit rate, and explores the possibilities of computer-assisted translation via a pivot language. The results and findings are interpreted in the context of psycholinguistic models of bilingualism and re-frame empirical translation process research within the context of modern dynamic cognitive theories of the mind. The book aims at bridging the gap between translation process research and machine translation research.

2 Structure of the Volume

The 15 chapters in the volume are structured in four parts. The first part, Translation Technology, Quality, and Effort, starts with chapters that have an applied orientation, investigating the (psychological) reality of translation edit rate (TER) in post-editing. The second part, Translation and Entropy, presents four contributions that address various aspects of word translation entropy. The third part, Translation Segmentation and Translation Difficulty, deploys qualitative and quantitative methods to address topics in translation segmentation and translation difficulty. Part four, Translation Process Research and Post-cognitivism, provides conceptual, methodological, and theoretical support for a post-cognitivist perspective in TPR.

2.1 Translation Technology, Quality, and Effort

Since its beginnings, TPR has been concerned with investigating the impact of translation technology on the human translation process. The aim of using computer assistance in translation has been to support translators at work, to offer possibilities to offload memory and cognitive load into the environment, to provide translators with customized collocation and retrieval tools, and to suggest targeted translation solutions at the right time. Specialized editing interfaces are being produced and tested under many different conditions to facilitate the MT post-editing process. Two factors are of crucial importance in this endeavor: 1) the possibility to assess translation quality in an objective manner and 2) the possibility to model, measure, and explain the hoped-for reduction of translation effort.

1) With the wide-spread deployment of data-driven MT systems at the beginning of this century, automatic evaluation metrics were needed to compare and fine-tune the systems towards a reference or "gold standard". The Translation Edit Rate (TER) is such a measure; it assumes four edit operations in MT post-editing: deleting, inserting, replacing, and moving words and groups of words. It computes the minimum number of assumed edit operations needed to match the MT output to a reference translation, for instance the post-edited version. The number of assumed edit operations is then taken as an indicator of MT quality, where fewer edit operations indicate better MT output.

2) Numerous models have been proposed to explain and understand the reported and observed translation behavior and to relate translation behavior to translation quality. Countless publications refer to Krings' (1986) categories of temporal, technical, and cognitive effort: while temporal effort is often used as a proxy for cognitive effort, gaze data provides a more direct insight into the mental activities, though often not without much noise. However, more fine-grained models are increasingly being used to explain these findings, and the experimental validation of the models themselves becomes a matter of research.

The first three chapters in this section relate TER scores to human post-editing activities and assess to what extent TER is suited to describe the actual post-editing process. Do Carmo proposes in Chapter 1 to reverse the view: instead of looking at the Translation Edit Rate, he suggests taking a view on the Human Edit Rate (HER), which may close the gap between MT research and translation studies. Based on Do Carmo's proposal, Huang and Carl introduce, in Chapter 2, a new measure, the word-based HER (WHER) score, which they show correlates well with measures of translation effort. In Chapter 3, Cumbreño and Aranberri also correlate various measures of cognitive effort with TER scores, but they come to different conclusions. The last chapter in this part reports a study using translation technologies with different goals. Similar to Huang and Carl, Tardel (Chapter 4) also studies aspects of cognitive effort, in computer-assisted subtitling. However, she compares different settings to find the best information environment for computer-assisted subtitling.

2.2 Translation and Entropy

Entropy is a basic physical measure that quantifies the interaction between two entities. The entropy of an entity (e.g., an ST expression) with regard to another entity (e.g., the set of its possible translations) counts the number of indistinguishable configurations (i.e., different translations for an ST expression). It is the only physical measure that is irreversible and directional (i.e., non-symmetrical), and thus tightly linked to the notion of time, which is also directional and irreversible. The word translation entropy (HTra) represents one of three criteria to measure translation literality (cf. e.g. Carl and Schaeffer 2016). HTra has since then been used as a powerful predictor for several translation measures. It has been shown to correlate with various behavioral observations of the translation process, such as translation production duration, gazing time, and the number of revisions, but also with properties of the translation product, including translation errors of humans and machine translation systems. HTra is an information-theoretic measure that quantifies whether there are strong translation preferences. More strongly entrenched translation solutions are less translation-ambiguous, they carry less translation information, they are easier to retrieve, and their production requires less cognitive effort as compared to more ambiguous and less entrenched translations. More entrenched translations are also thought to be semantically closer to their ST equivalent.

In this section, Carl (Chapter 5) describes the implementation of a "rendered literality" measure which extends the three previous literality criteria (monotonicity, compositionality, and entrenchment) with the additional constraint of TL compliance. In Chapter 6, Ogawa et al. observe HTra correlations for human and machine translation across different languages and investigate in detail to what extent this is also the case for different word classes and sets of collocations. Wei investigates, in Chapter 7, in great detail the scan paths that are triggered by a high-HTra word and discusses patterns of visual search to find disambiguating clues in low-entropy context. Heilmann and Llorca-Bofí detect, in Chapter 8, an effect of cognateness, as computed with a Levenshtein distance, on HTra and suggest adding a formal similarity criterion to the list of literality criteria.

2.3 Translation Segmentation and Translation Difficulty

Segmentation during translation production has been a topic of research for many years and has been in some ways at the very core of TPR. Since its beginning, TPR has produced many models to describe, explain, and conceptualize the basic units of translation, but has, up to date, not reached a generally accepted conclusion about their nature.

Two fundamentally different approaches can be distinguished to describe translation units. By looking into the translation product, one can try to find linguistic and/or cross-linguistic clues that indicate coherent translation segments, such as sequences of monotonous or isomorphic translational correspondence. Incoherent and smaller segments of translational correspondence, a larger amount of translation re-ordering, and less isomorphic ST-TT representations are taken to engender more translation difficulties and potentially increased translation effort.

Another approach investigates behavioral data directly, mainly logs of fixations and/or keystroke data, to determine the assumed mental processes of text segmentation and integration during translation production. Less fluent typing, longer keystroke pauses, and more dispersed visual attention and search are taken as indicators of translation difficulty and extended effort. The assumption is that both approaches, the view from the process and the view from the product, would converge and allow us to come to the same conclusions about translation difficulty and translation effort.

This part of the volume addresses translation segmentation and translation difficulty from those different angles. In Chapter 9, Carl provides a definition of the micro (translation) unit that integrates properties of the translation process and product and makes it possible to investigate the relation between the first translational response and the final translation product. Vanroy et al. examine in Chapter 10 the translation product from a computational-linguistics view and base their notion of translation difficulty on various definitions of cross-lingual syntactic equivalence. In Chapter 11, Lacruz et al. explore novel types of segmentation, based on the Japanese bunsetsu, to assess translation difficulties of culturally and contextually dependent expressions, based on the variation of HTra values. Chen takes a process view on translation difficulty in Chapter 12 when assessing the success of several translation strategies depending on whether the meaning and background knowledge of neologisms were available.

2.4 Translation Process Research and Post-cognitivism¹

Translation process research has primarily been concerned with technologically heavy methodologies to collect and analyze translation process data that help elucidate the human translation processes. Various explanatory models have been deployed that were borrowed, among others, from cognitive sciences, psycholinguistics, and bilingualism research so as to interpret the TPR findings in a coherent theoretical framework. With the development of those disciplines in the past 20 years, utilizing more sophisticated data acquisition tools, new translation devices, and their technological possibilities combined with the collection of big data sets and more rigorous analysis methods, the explanatory models in TPR have also changed and adapted to the new situation. As pointed out by several scholars in the field (e.g., Sun and Wen 2018; Shreve and Angelone 2010), new process models have to be developed that are able to accommodate those novel developments and research findings. A trend towards post-cognitivist theories can be noticed in recent TPR, and also in this volume, where several chapters refer to connectionist models as the explanatory framework.

¹ Post-cognitivists reject the assumption that the mind performs computations on objects that are faithful representations of an outside world, which is usually associated with the computational theory of mind and its protagonists.

The last part of this volume underpins this post-cognitivist perspective of TPR. Chapter 13 postulates that TPR has mainly developed as, and been, a methodology that suggests a mechanistic view on computation. It postulates that representation and computation are independent concepts, that computational devices are useful for developing and verifying theories of the mind, and that the status of a statement as methodological or ontological has perhaps not always been clearly marked. Chapter 14 introduces a new triangulation method based on artificial neural networks. It integrates findings from bilingualism and translation research and assesses to what extent results from single-word translations may carry over to translation in context. Chapter 15 develops a radical embodied post-cognitivist perspective on the translation process. The chapter extends translation affordances with a probabilistic recursive layer and maps this framework onto a dynamic systems approach, capable of explaining "representation-hungry" cognition as covariation between the model and the world.
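To make the word translation entropy (HTra) of Section 2.2 more concrete, the following is a minimal sketch. It assumes that HTra for a source word is computed as the Shannon entropy, in bits, over the relative frequencies of the alternative translations observed for that word across several translators; the abstract does not spell out the formula, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter


def word_translation_entropy(translations):
    """Entropy (in bits) of the observed translation choices for one source word.

    `translations` holds one target-language rendering per translator.
    """
    if not translations:
        raise ValueError("need at least one observed translation")
    counts = Counter(translations)
    total = len(translations)
    # Sum of p * log2(1 / p) over the distinct renderings.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy data: four hypothetical translators rendering the same source word.
unanimous = ["t1", "t1", "t1", "t1"]   # one strongly entrenched solution
split = ["t1", "t1", "t2", "t2"]       # two equally preferred solutions
scattered = ["t1", "t2", "t3", "t4"]   # every translator differs

for sample in (unanimous, split, scattered):
    print(word_translation_entropy(sample))
```

Under this assumption, a word that all translators render identically has zero entropy, and the value grows as the choices spread over more alternatives, which matches the description of entrenched solutions as carrying less translation information.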

References

Albir, A.H., Alves, F., Dimitrova, B.D., Lacruz, I. (2015). A retrospective and prospective view of translation research from an empirical, experimental, and cognitive perspective: the TREC network. Translation & Interpreting Vol 7 No 1.

Carl, M., Bangalore, S., & Schaeffer, M. (2016). New Directions in Empirical Translation Process Research. Springer. ISBN 978-3-319-20357-7

Ehrensberger-Dow, M., Hunziker Heeb, A., Jud, P., Angelone, E. (2017). Insights from translation process research in the workplace. In: Doing Applied Linguistics, De Gruyter (116-123). DOI: 10.1515/9783110496604-014

Jakobsen, Arnt Lykke. (2017). Translation Process Research. In: John W. Schwieter and Aline Ferreira (eds.), The Handbook of Translation and Cognition. Wiley. https://doi.org/10.1002/9781119241485.ch2


Are literal translations always easier to process?: a process-based study of effort based on comparable corpus data

Miguel A. Jiménez-Crespo Joseph Casillas

Department of Spanish and Portuguese Department of Spanish and Portuguese

Rutgers University, 15 Seminary Place Rutgers University, 15 Seminary Place

New Brunswick, NJ 08901 New Brunswick, NJ 08901

[email protected] [email protected]

1 Introduction

Recently, the interest in the notion of literal translation has primarily evolved through empirical inquiry into the relation of post editing MT (PEMT) output and cognitive effort (i.e. Schaeffer and Carl 2014; Lacruz and Shreve 2014; Carl and Schaeffer 2017a, 2017b; Lacruz 2017; Lacruz, Carl and Yamada 2018). In this context, it has been found that non-literal translations are "more difficult and time consuming [...] to produce than literal ones" (Schaeffer and Carl 2014: 55) in terms of cognitive or temporal effort. The present study contributes to this area using data from a previous product-based comparable corpus study (Jiménez-Crespo and Tercedor 2018). The tertium comparationis, that is, what the effort incurred in processing literal translations is compared to, is that of post editing the most frequent non-literal renderings in recurring conventionalized units that express a specific communicative purpose in non-translational or naturally produced texts.

These non-literal translation candidates involved either (1) lexical or syntactical differences (e.g. [...] was translated as [...], and the "glucose" nouns in STs were rendered through a verbal structure in TTs), as well as (2) completely different renderings (e.g. "halitosis" or "bad breath" was matched in the Spanish TT with a rendering such as "halitosis or foul smell coming out of the mouth").

2 Research Question

RQ1 - are non-literal translations always more effortful to process than literal ones? And more specifically, are recurring conventionalized units to express a specific communicative purpose found in natural or non-translational corpora through frequency analysis, regardless of their literality relation to the source text, more taxing to process in terms of time and cognitive effort?

Hypothesis 1 (H1). We hypothesize that literal translations (adequate and inadequate) from the translational corpus imply less cognitive effort than non-literal translations that represent the most frequent renderings to express recurring communicative purposes in non-translated texts, and, therefore, should be associated with faster processing times.


1 Methodology

Ten professional translators took part in the experiment. They post edited two texts using a mock CAT tool set up on a PC with the keylogging software Inputlog. The task was presented as a regular PE process with a full cohesive text presented segment pair by segment pair. Time was used as a proxy for cognitive effort in line with previous studies (Lacruz and Shreve 2014; Lacruz, Carl and Yamada 2018). The experimental task included an instrument containing ST-TT pairs with randomized literal translations from the translational section of the corpus, as well as the most frequent renderings from the non-translational section of the corpus. The independent variables were whether the ST-TT pair was edited (edit_yes, edit_no), the time from the moment the translation pair was presented to the moment the subjects typed any key (time_to_type), as well as when the post-editing of the translation was accepted (time_to_completion). This last variable is an indication of the completion or complete editing events, that is, coherent groups of editing actions quite similar to production units (Lacruz 2016: 392). This made it possible to investigate the effort associated with the orientation (Jakobsen 2002) or translation onset time (Vandepitte, Hartsuiker and Van Assche 2015) when segments are edited or just simply accepted, as well as the overall time effort when segments are edited in both conditions.

2 Statistical analysis

We fitted a series of Bayesian linear mixed effects regression models. These models examined how reaction time (time_to_type, time_to_completion) varied as a function of a series of predictors (see below). For statistical inferences we report mean posterior point estimates for each parameter of interest, along with the 95% highest density interval (HDI), the percent of the region of the HDI contained within a region of practical equivalence (ROPE), and the maximum probability of effect (MPE). We established a ROPE of ± 0.87 for time_to_type and ± 0.95 for time_to_completion around a point null value of 0. We consider a posterior distribution of a parameter in which 95% of the HDI falls outside the ROPE and a high MPE (i.e., values close to 1) as compelling evidence for a given effect. Finally, we use Bayes Factors (BF) to compare null and alternative hypotheses in order to determine under which model the observed data are more probable.

3 Results

In order to assess whether non-literal translations are more effortful to process than literal ones (RQ1) we analyzed time_to_type and time_to_completion as a function of type of translation rendering (Original, Translation) in separate models. Figure 1 plots time_to_type and time_to_completion response times for original and translational types of TT renderings. The models included by-subject and by-item random intercepts with a random slope for type of translation rendering for each subject. Table 1 provides summaries of both models.

Time_to_type response times did not vary as a function of type of translation rendering (estimate = 0.01, HDI = [-0.12, 0.14], ROPE = 100, MPE = 0.58), nor did time to completion (estimate = -0.07, HDI = [-0.24, 0.09], ROPE = 100, MPE = 0.82). That is, neither model provided compelling evidence suggesting literal translations from the translational corpus were more effortful to process than non-literal renderings from the non-translational corpus.

Figure 1: Response times for time to type and time to completion as a function of type of translation rendering (Original, Translation). Small colored points represent raw data. Large points summarize the posterior distributions (in grey) with posterior means ± 95% and 66% credible intervals.
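The posterior summaries named under Statistical analysis (HDI, percentage of the HDI inside the ROPE, MPE) can be illustrated with a minimal sketch. It assumes a plain list of posterior draws for one parameter; the function names `hdi`, `rope_percentage` and `mpe` are hypothetical, the abstract does not state which software the authors used, and the ROPE percentage follows one common convention (the share of draws inside the HDI that also fall inside the ROPE).

```python
import math


def hdi(draws, mass=0.95):
    """Return the narrowest interval that contains `mass` of the draws."""
    xs = sorted(draws)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def rope_percentage(draws, rope, mass=0.95):
    """Percentage of the draws inside the HDI that also lie inside the ROPE."""
    low, high = hdi(draws, mass)
    inside_hdi = [x for x in draws if low <= x <= high]
    inside_rope = [x for x in inside_hdi if rope[0] <= x <= rope[1]]
    return 100.0 * len(inside_rope) / len(inside_hdi)


def mpe(draws):
    """Maximum probability of effect: the larger of P(x > 0) and P(x < 0)."""
    positive = sum(x > 0 for x in draws) / len(draws)
    negative = sum(x < 0 for x in draws) / len(draws)
    return max(positive, negative)


if __name__ == "__main__":
    # Made-up draws for illustration only; these are not the study's data.
    example = [i / 100 for i in range(-20, 81)]
    print(hdi(example), rope_percentage(example, (-0.87, 0.87)), mpe(example))
```

Under the decision rule stated above, a parameter would count as showing a compelling effect only if the ROPE percentage is low and the MPE is close to 1; a ROPE percentage of 100, as reported for both models, points the other way.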


On the contrary, we find the effect of origin of translation renderings has a probability of 0.58 (time to type) and 0.82 (time to completion) of being negative and would not be considered statistically significant under a frequentist framework. The time to type and time to completion models both provide strong evidence in favor of an absence of effect of origin of translation rendering (time to type: BF = 0.013; time to completion: BF = 0.027).

Model  Parameter  Estimate  HDI          ROPE  MPE   BF
TT     Intercept  9.1       [8.9, 9.3]   0     1     3.2e+101
TT     Type       0.01      [-0.1, 0.1]  100   0.58  0.013
TC     Intercept  9.69      [9.4, 9.9]   0     1     6.9e+74
TC     Type       -0.07     [-0.2, 0.1]  100   0.82  0.027

Table 1: Model summary of the posterior distribution modeling response times (TT: time to type, TC: time to completion) as a function of type of translation rendering (Type). The table includes posterior means (Estimate), the 95% HDI, the percentage of the HDI within the ROPE, the maximum probability of effect (MPE), and the Bayes Factor for each parameter.

4 Conclusions

This study started with the question of whether non-literal translations are "more difficult and time consuming [...] to produce than literal ones" (Schaeffer & Carl 2014: 55). The specific contribution of this paper was to use the most frequent renderings to express a communicative purpose from a comparable corpus as tertium comparationis. Results suggest that the processing effort of the most frequent ST-TT literal translation candidates from the translational corpus was indistinguishable from that of the default translation ones from the non-translational corpus for both time_to_type and time_to_completion. The results also show that when subjects decide not to post edit a proposed translation candidate, ST-TT literal translation pairs incur higher translation onset times that are statistically significant.

This study therefore highlights the need to incorporate notions such as default translation (i.e. Halverson 2015, 2019) from 4EA approaches that combine process and product based perspectives (Halverson 2019: 207).

References

Carl, Michael and Schaeffer, Moritz. (2017a). "Why Translation is Difficult: A Corpus-based Study of Non-literality in Post-editing and From-scratch Translation." Hermes 56: 43-57.

Carl, Michael and Schaeffer, Moritz. (2017b). "Measuring Translation Literality." In A.L. Jakobsen and B. Mesa (eds.). Translation in Transition. Between Cognition, Computing, and Technology, pp. 81-105. Amsterdam: John Benjamins.

Halverson, Sandra. (2015). "Cognitive Translation Studies and the merging of empirical paradigms: the case of 'literal translation'." Translation Spaces 4(2): 310-340.

Halverson, Sandra. (2019). "'Default translation': a construct for cognitive translation studies." Translation, Cognition & Behavior 2(2): 187-210.

Jakobsen, Arnt Lykke. (2002). "Translation drafting by professional translators and by translation students." In G. Hansen (ed.), Empirical translation studies: Process and product, pp. 191-204. Copenhagen: Samfundslitteratur.

Jiménez-Crespo, Miguel A. and Maribel Tercedor. (2018). "Lexical variation, register and explicitation in medical translation: a comparable corpus study of medical terminology in US websites translated into Spanish." TIS: Translation and Interpreting Studies 12(3): 405-426.

Lacruz, Isabel and Gregory Shreve. (2014). "Pauses and cognitive effort in post-editing." In S. O'Brien, M. Simard, L. Specia, M. Carl, & L. W. Balling (eds.). Post-editing of machine translation: Processes and applications, pp. 246-274. Cambridge: Cambridge Scholars Publishing.

Lacruz, Isabel. (2017). "Cognitive effort in translation, editing and post-editing." In John Schwieter and Aline Ferreira (eds.). Handbook of translation and cognition, pp. 386-401. Blackwell Handbooks in Linguistics. Malden, MA: John Wiley & Sons.

Lacruz, Isabel, Carl, Michael and Masaru Yamada. (2018). "Literality and Cognitive Effort: Japanese and Spanish." In N. Calzolari et al. (eds.), The LREC 2018 Proceedings: Eleventh International Conference on Language Resources and Evaluation, pp. 3818-3821. Paris: European Language Resources Association.

Schaeffer, Moritz and Michael Carl. (2014). "Measuring the Cognitive Effort of Literal Translation Processes." Workshop on Humans and Computer-assisted Translation, pp. 29-37. Gothenburg, Sweden: Association for Computational Linguistics.

Vandepitte, Sonia, Hartsuiker, Robert J. and Eva Van Assche. (2015). "Process and text studies of a translation problem." In Aline Ferreira and John W. Schwieter (eds.), Psycholinguistic and Cognitive Inquiries into Translation and Interpreting, pp. 127-143. Amsterdam/Philadelphia: John Benjamins.

Vieira, Lucas Nunes. (2019). "Post-Editing of Machine Translation." In Minako O'Hagan (ed.), The Routledge Handbook of Translation and Technology, pp. 319-335. New York-London: Routledge.


FSEM: Current developments and perspectives

Oliver Czulo

Institute for Applied Linguistics and Translatology

University of Leipzig, Leipzig, Germany

[email protected]

The Frame-Semantic Evaluation Metric

The Frame-Semantic Evaluation Measure FSEM (Czulo et al. 2019) is designed to be a fully automated, referenceless machine translation evaluation measure for semantic similarity of originals and their machine translations. This contribution addresses some of the currently more prevalent research questions around FSEM.

In terms of linguistic theory, FSEM is based on Frame semantics (Fillmore 1982; 1985), a cognitive-linguistic, ethnographically oriented semantics of understanding. A frame is a system of concepts which structures knowledge about cognitive entities: events, states, objects, attributes etc. These mental categorization schemata are linked to linguistic expressions by which they are evoked. The word Christmas, for example, will activate extensive knowledge such as typical events (e. g. exchanging gifts) and contexts (e. g. winter) in many speakers of English.

Frames are shaped by backgrounds, beliefs, experiences etc., where different levels of abstraction will have group- or individuum-specific configurations. As an example: For some users of the lexeme Christmas, gifts are exchanged on the evening of December 24th, for others on the morning of the 25th. Even though many speakers of English may not celebrate Christmas, the more abstract concept of holiday, including the fact that holidays often come with rituals, will make it possible for them to have a basic understanding of the concept.

The Berkeley FrameNet (Ruppenhofer et al. 2016) is a lexical-semantic project using empirical data to derive a lexicon of frames, often on a very general level of abstraction, for English. It has served as blueprint for various framenets in other languages (e. g. Subirats Rüggeberg and Petruck 2003; Ohara et al. 2004; Burchardt et al. 2006; Torrent and Ellsworth 2013).

The primacy of frame model of translation (Czulo 2017) provides the translatological background for FSEM. The basic hypothesis, or more precisely the idealized model, is that for each frame evoked in the original, there should be a maximally comparable counterpart in the translation that is linguistically realized. The constraint "maximally comparable" accounts for differences in how frames are structured across (sub-)cultures, such as the Family frame, where the understanding of what types of families exist and what the relations between family members are may differ, but can be compared as there is an underlying abstract notion of Kinship. There are, however, a number of factors which may lead to overrides of this principle, such as (potentially) grammatically induced shifts in perspective:


Tray 1 [...] holds up to 125 sheets [...]

In Fach 1 können bis zu 125 Blatt Papier eingelegt werden [...]

'Into tray 1, up to 125 sheets of paper can be inserted'

German has a considerably lower tendency to agentivize inanimate objects, which may have led to the above-seen shift of the subject Tray 1 to the prepositional object In Fach 1. The two frames evoked by the main verbs, Fullness for English and Filling for German, can, however, be easily related to one another by means of exploiting the frame hierarchy in Berkeley FrameNet: Filling is inchoative of Fullness.

FSEM makes use of the Frame hierarchy to calculate a similarity measure between an original and a translated sentence. In one of the currently existing pilot implementations (see Czulo et al. 2019), a spread-activation algorithm makes use of activated frames and their subsumed frames to calculate a similarity measure on the basis of (so far) manually annotated frames. A penalty is given for the distance of frames in the translation which are linked to, but not identical with, the original frames.

Systematic description of shifts

We can think of the frame level as a level of abstraction which allows us to describe shifts in (lexically and grammatically realized) meaning in a more systematic way than e. g. by means of an individual lexical comparison. On top of this, a semantic level of description allows us to investigate reasons for shifts from various perspectives, either on the level of semantics itself (e. g. culture-specific differences) or when combined with other levels of description (such as typological or pragmatic factors, or contrastive differences as in the case of the above sentence pair).

Some systematic patterns of semantic divergences have already been described in the literature, such as differences in how motion events are framed in various types of languages (Talmy 2000; Slobin 2004), and this knowledge has been applied to the analysis of translation (Ellsworth et al. 2006). The FSEM algorithm can relate different types of motion framings and thus state similarity for such cases. It is not intended, however, to test on typicality of framing for a language, which is where language models should come into play and tasks could be shared between different components.

It is conceivable that the level of abstraction that frames provide could help assess other cases which are not as easily covered by a lexeme-based approach, such as metaphors. This could cover cases where two metaphorical expressions in two languages are very different in their structure, or where a metaphor is translated non-metaphorically or vice versa (those are indeed frequent cases, as shown e. g. by Samaniego Fernández 2013). An evaluation of two expressions such as pull a leg vs. German veralbern (roughly: 'make someone look stupid') does not yield a match on a lexical level. If, however, both metaphorical and non-metaphorical expression are annotated with the targeted meaning in terms of frames, a match is created where a purely lexical mapping would fail. This may be true for other types of expressions such as raining cats and dogs vs. German schütten.

Another case in which frame-semantic annotation can be of help is that of differences in information distribution. Padó and Erk (2005) introduced the notion of frame groups, describing a case where combined causation and scale information in English were split into two lexemes in German. A frame group can then be defined as an equivalent to a single frame in another language, covering cases of different lexicalization strategies.
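The idea of relating frames through the frame hierarchy and penalizing distance can be illustrated with a small sketch. It is not the pilot implementation of Czulo et al. (2019): the graph `RELATIONS`, the decay factor, the hop limit and the averaging in `similarity` are all assumptions made for illustration, and only the Filling/Fullness link is taken from the text.

```python
from collections import deque

# Toy frame graph with undirected frame-to-frame relations.
# Only the Filling/Fullness link comes from the text; "Taking" is a
# hypothetical unrelated frame added for contrast.
RELATIONS = {
    "Filling": {"Fullness"},
    "Fullness": {"Filling"},
    "Taking": set(),
}


def activation(source, decay=0.5, max_hops=3):
    """Spread activation from one frame; every hop multiplies it by `decay`."""
    levels = {source: 1.0}
    queue = deque([(source, 0)])
    while queue:
        frame, hops = queue.popleft()
        if hops == max_hops:
            continue
        for neighbour in RELATIONS.get(frame, ()):
            if neighbour not in levels:  # breadth-first: first visit is the shortest path
                levels[neighbour] = levels[frame] * decay
                queue.append((neighbour, hops + 1))
    return levels


def similarity(original_frames, translation_frames, decay=0.5):
    """Average, over the original frames, of the best activation reaching any translation frame."""
    if not original_frames:
        return 0.0
    total = 0.0
    for frame in original_frames:
        levels = activation(frame, decay)
        total += max((levels.get(t, 0.0) for t in translation_frames), default=0.0)
    return total / len(original_frames)


if __name__ == "__main__":
    # English "holds" evokes Fullness, German "eingelegt werden" evokes Filling.
    print(similarity(["Fullness"], ["Filling"]))
```

In this sketch an identical frame scores 1.0, a frame one relation away scores the decay value, and an unrelated frame scores 0.0, which mirrors the penalty for frames that are linked to but not identical with the original frames.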


Challenges for FSEM

With the basic principles laid out, a number of questions open up, of which I can here only describe some of the currently most prominent.

When applying FrameNet frames, we can assume that the more abstract ones are better applicable across languages, while the more specific ones will be more tied to a (sub-)culture. An automatic method of how to compare frames is proposed by Sikos and Padó (2018) using word embeddings, relying on existing annotations. But the number and granularity of frames in different framenets can still vary, and automatic induction methods are needed to increase coverage.

The evaluation of machine translated texts cannot, of course, be done by FSEM alone, as the high level of abstraction may generate false positives. Both of the following two sentences (and their literal translations) will generate the same frame annotation, namely People, Taking and Containers.

The woman grabbed the cup.

The man took the glass.

This also raises questions on how granular frame definitions should be, which would require a deeper discussion than this short contribution can offer. Assuming that a certain level of abstraction will always be desirable, FSEM evaluation needs to be complemented with other methods of evaluation tackling more fine-grained lexical information.

Conclusion

The list of benefits of and challenges for developing a frame-semantic machine translation evaluation measure discussed in this contribution is certainly incomplete, but it should give a glimpse into the current state of conceptual development of FSEM. Various questions still need to be answered in order to understand how informative FSEM can be for the evaluation of machine translation, and potentially human translation, where more creative translation strategies may be applied. One phenomenon in human translation, which seems unlikely in machine translations, is that of differences in sentence segmentation resulting in patterns such as A B. C. → A. B C. By means of frame semantic analysis, such differences in information distribution might be tackled as well, elevating the metric towards text level rather than limiting it to the sentence level. This is a future goal of evolving FSEM.

References

Burchardt, Aljoscha, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Pado, and Manfred Pinkal. 2006. "The SALSA Corpus: A German Corpus Resource for Lexical Semantics." In Proceedings of LREC 2006, 969-74. Genoa, Italy.

Czulo, Oliver. 2017. "Aspects of a Primacy of Frame Model of Translation." In Empirical Modelling of Translation and Interpreting, edited by S. Hansen-Schirra, Oliver Czulo, and Sascha Hofmann, 465-90. Translation and Multilingual Natural Language Processing 6. Berlin: Language Science Press.

Czulo, Oliver, Tiago Timponi Torrent, Ely Matos, Alexandre Diniz da Costa, and Debanjana Kar. 2019. "Designing a Frame-Semantic Machine Translation Evaluation Metric." In Proceedings of The Second Workshop on Human-Informed Translation and Interpreting Technology (HiT-IT 2019). Varna, Bulgaria.

Ellsworth, Michael, Kyoko Ohara, Carlos Subirats, and Thomas Schmidt. 2006. "Frame-Semantic Analysis of Motion Scenarios in English, German, Spanish, and Japanese." Presented at the Fourth International Conference on Construction Grammar, Tokyo, Japan.

Fillmore, Charles J. 1982. "Frame Semantics." In Linguistics in the Morning Calm, edited by Charles J Fillmore, 111-137. Hanshin.

Fillmore, Charles J. 1985. "Frames and the Semantics of Understanding." Quaderni Di Semantica 6: 222-254.

Ohara, Kyoko, Seiko Fujii, Toshio Ohori, Ryoko Suzuki, Hiroaki Saito, and Shun Ishizaki. 2004. "The Japanese FrameNet Project: An Introduction." In Proceedings of the Satellite Workshop "Building Lexical Resources from Semantically Annotated Corpora", 9-11. European Language Resources Association.

Padó, Sebastian, and Katrin Erk. 2005. "To Cause or Not to Cause: Cross-Lingual Semantic Matching for Paraphrase Modelling." In Proceedings of the Cross-Language Knowledge Induction Workshop. Cluj-Napoca, Romania.

Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, Collin F. Baker, and Jan Scheffczyk. 2016. "FrameNet: Theory and Practice." https://framenet2.icsi.berkeley.edu/docs/r1.7/book.pdf.

Samaniego Fernández, Eva. 2013. "The Impact of Cognitive Linguistics on Descriptive Translation Studies: Novel Metaphors in English-Spanish Newspaper Translation as a Case in Point." In Cognitive Linguistics and Translation: Advances in Some Theoretical Models and Applications, edited by Ana Rojo and Iraide Ibarretxe-Antuñano, 159-98. Applications of Cognitive Linguistics 23. Berlin: De Gruyter Mouton.

Sikos, Jennifer, and Sebastian Padó. 2018. "Using Embeddings to Compare Framenet Frames across Languages." In Proceedings of the First Workshop on Linguistic Resources for Natural Language Processing, 91-101.

Slobin, Dan I. 2004. "The Many Ways to Search for a Frog: Linguistic Typology and the Expression of Motion Events." In Relating Events in Narrative: Typological Perspectives, edited by S. Strömqvist and L. Verhoeven, 219-57. Mahwah, N.J.: Lawrence Erlbaum Associates. http://ihd.berkeley.edu/linguistictypologyofmotionevents.pdf.

Subirats Rüggeberg, Carlos, and Miriam Petruck. 2003. "Surprise: Spanish FrameNet!" In Proceedings of the Workshop on Frame Semantics, edited by Eva Hajičová, Anna Kotěšovcová, and Jiří Mírovský. Prague: Matfyzpress.

Talmy, Leonard. 2000. Toward a Cognitive Semantics. 2. Typology and Process in Concept Structuring. Cambridge, Mass.: MIT Press.

Torrent, Tiago Timponi, and Michael Ellsworth. 2013. "Behind the Labels: Criteria for Defining Analytical Categories in FrameNet Brasil." Veredas 17: 44-65.


Going off track? Activity tracking in translation. Translators' perceptions, ethical issues and positive uses.

Dr Valentina Ragni Dr Lucas Nunes Vieira

University of Bristol University of Bristol

School of Modern Languages School of Modern Languages

19, Woodland Road, BS8 1TE, UK 19, Woodland Road, BS8 1TE, UK

[email protected] [email protected]

Abstract

In an increasingly automated translation industry, where recent machine translation (MT) developments are producing higher quality output (Way, 2018) with higher potential to speed up the translation process, productivity is often deemed imperative by clients and translators alike (e.g. Marg, 2016; Tabor, 2010; Vallianatou, 2020). Activity-tracking tools are used to measure productivity by LSPs, translation vendors, as well as freelance translators (see Moran, 2018). These tools can collect a wealth of information about the translation process, from average speed or total words translated for a client, to fine-grained by-segment data about the translator's typing and editing activity. However, the practice is controversial, and currently completely unregulated. Tracking translation activity can help to diagnose problems in the translation process (e.g. below-average MT quality) and may also provide translators with insights into their working patterns (Vieira, 2018). Depending on the circumstances, however, these tools can also be misused as instruments of surveillance, which raises questions and warrants a wider discussion of the implications of the technology's use. In this paper, we report data from a longitudinal study that examines translators' use and perceptions of activity-tracking.

Participants used activity-tracking tools within their CAT software (memoQ and Trados Studio) to calculate their productivity over a period of 16 weeks. Every week, they sent us quantitative self-reported project data such as translation speed and number of edited words,1 as well as qualitative perception data related to productivity self-assessments via rating scales and open questions. This combination of qualitative and quantitative data was then complemented by focus-group interviews with all participants.

This presentation focuses on the weekly data collection process, which provides valuable insights into the work (and related productivity) of professional translators in real settings over an extended period. Moreover, several relevant themes emerged from the analysis, including issues related to tool-specific feedback, productivity definition and calculation issues, and ethical implications. Most notably, it emerged that translators are not averse to activity-tracking technology per se. In fact, most translators in our study recognized as paramount the importance of keeping track of one's productivity whilst using CAT tools and related technologies. A core difference in translator attitudes was noted between spontaneously tracking one's own productivity vs. being required to do so by a client or company owner, where only the latter was highlighted as potentially problematic. It became clear that issues with activity tracking often revolve around the client-translator relationship, especially in cases of power imbalances and market structures that risk compromising translators' agency. The presentation will address the above-mentioned themes and discuss translators' perceptions of productivity. Some tentative recommendations on the practice of activity-tracking will also be put forward, in order to raise awareness of the technology's implications and highlight situations where translators deemed it to be useful and worthwhile.

1 The study was reviewed by the Faculty of Arts Ethics Committee at the authors' institution. For ethical and confidentiality reasons, translators did not share any information that could identify the texts they worked with or the clients. They were reminded that this was strictly prohibited.

References

Lena Marg. 2016. The Trials and Tribulations of Predicting Machine Translation Post-Editing Productivity. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (pp. 23-26). Portorož, Slovenia. 23-28 May 2016.

John Moran. 2018. Extensive CAT Tool Logging: Big Brother or language technology evaluation panacea? In: ITI Research Network e-book 2018. The Human and the Machine (p. 3). Institute of Translation and Interpreting.

Jared Tabor. 2010. Productivity for Translators: an Overview. ProZ.com Wiki. Available from: https://wiki.proz.com/wiki/index.php/Productivity_for_translators:_an_overview [accessed June 2020]

Fotini Vallianatou. 2020. CAT Tools and Productivity: Tracking Words and Hours. TranslationDirectory.com. Available from: https://www.translationdirectory.com/article752.htm [accessed June 2020]

Lucas Nunes Vieira. 2018. Human Challenges in the Use of Machine Translation in Professional Translation Processes. In: ITI Research Network e-book 2018. The Human and the Machine (p. 7). Institute of Translation and Interpreting.

Andy Way. 2018. Quality Expectations of Machine Translation. In: J. Moorkens, S. Castilho, F. Gaspari, and S. Doherty (Eds.), Translation Quality Assessment. From Principles to Practice (pp. 159-178). Springer International Publishing.


Researching Automatic Speech Recognition on the Process and Quality of Written Translation: Discussions of Indicators

Wenchao Su Xiaoxing Zhao

Guangdong University of Foreign Studies Sun Yat-sen University

[email protected] [email protected]

Abstract

Automatic Speech Recognition (ASR) has been used by professional translators for some years. Instead of typing the translations, professional translators prefer to speak out their translations, for they find it faster to produce high-quality texts in this way (Bowker 2002). With technologies maturing, more and more technological tools have been playing a central role in the practice of translators and interpreters in recent years. Technological competence is now regarded as one of the essential components that define a qualified professional translator or interpreter. ASR is one of the technologies that deserve the attention of translators and interpreters.

Some studies have assessed the performance of different ASR systems (Zapata & Kirkedal 2015), and other studies have investigated the application of ASR to simultaneous interpreting (Li & Wang 2018), post editing (Zapata et al., 2017) and written translation (Carl et al. 2016; Dragsted et al. 2011; Baxter 2017). However, the empirical data that support the use of ASR in written translation are limited, and more systematic research is needed to examine the function of ASR in the process and product of written translation on various language pairs using a variety of ASR tools (Ciobanu & Secara 2019).

This study explores the application of an ASR tool in written translation from English to Chinese. Using a within-subjects design, a group of undergraduate students with one year's formal training in translation were invited to complete two tasks, written translation with ASR and written translation from scratch. Screen recording was used to record the real-time process of written translation. The data extracted from the screen recording results, including the overall production time and the time spent on pre-drafting, drafting and post-drafting, were used as the indicators of speed. The number of changes in content (i.e., replacement, addition, and omission) and expressions (i.e., punctuations, spellings, lexical changes, syntactic changes and textual changes) were adopted as the indicators of revision behaviors. Translation outputs were analyzed and scored in terms of accuracy and target language quality to suggest the ASR effect on the quality of written translation.

References

Baxter, R.N. (2017). Exploring the effects of computerized sight translation on written translation speed and quality. Perspectives, 25(4):622-639.

Bowker, L. (2002). Computer-aided Translation Technology (p.34). Ottawa: University of Ottawa Press.

Carl, M., Aizawa, A., Yamada, M. (2016). English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting. 10th Language Resources and Evaluation Conference (LREC 2016). Portorož, Slovenia, May 2016.

Ciobanu, D., & Secara, A. (2019). Speech recognition and synthesis technologies in the translation workflow. In M. O'Hagan (Ed.), The Routledge Handbook of Translation and Technology (pp. 91-106). New York: Routledge.

Dragsted, B., Mees, I.M., & Hansen, I.G. (2011). Speaking your translation: Students' first encounter with speech recognition technology. Translation & Interpreting, 3(1):10-28.

Li, X.L. & Wang, M. J. (2018). Construction and Research of the Teaching Model of Using Automatic Speech Recognition APP in Simultaneous Interpreting Training Course: A Case Study of Voice Note as an Auxiliary Tool. Technology Enhanced Foreign Language Education, 179: 12-18.

Zapata, J., & Kirkedal, A.S. (2015). Assessing the Performance of Automatic Speech Recognition Systems When Used by Native and Non-Native Speakers of Three Major Languages in Dictation Workflows. Proceedings of the 20th Nordic Conference of Computational Linguistics (pp. 201-210). Vilnius, Lithuania.

Zapata, J., Castilho, S., & Moorkens, J. (2017). Translation Dictation vs. Post-editing with Cloud-based Voice Recognition: A Pilot Experiment. Proceedings of MT Summit XVI, Vol2: Users and Translators Track (pp. 123-136). Nagoya, Japan.
