MT Postediting
Total Page:16
File Type:pdf, Size:1020Kb
MT post-editing: How to shed light on the "unknown task" Experices made at SAP Dr. Falko Schäfer Neurottstr. 16, SAP AG 69190 Walldorf Germany [email protected] invested in an increasing number of MT systems Abstract over the past years. Currently there are four MT systems that are deployed in the translation of SAP This paper describes a project currently offline texts, i.e. texts extracted from SAP systems, under way at SAP dealing with the task of converted into an "MT-suitable" format before ma- post-editing MT output. As a concrete re- chine translation, and re-imported into the systems sult of the project, a standard post-editing after the translation has been completed. These guide to be used by translator end users is systems are: currently being created. The purpose of this post-editing guide is not only to train LOGOS (used for English–French and and support translators in the daily work English–Spanish) with MT but also to make the post-editing PROMT (used for English–Russian and task more accessible to them, thus en- English–Portuguese) couraging an open-minded attitude to- METAL (used for German–English) wards translation technology. LOGOVISTA (used for English– Furthermore the systematic error typology Japanese) underlying the guide serves not only as a methodological framework for the re- Translation is done by external vendors and search on post-editing but also as a diag- translation projects are coordinated by the depart- nosis for necessary corrections and ment SAP Language Services (SLS) in cooperation enhancements to be carried out in the cor- with Multilingual Technology (MLT, in the case of responding MT systems used. LOGOS and METAL). However, the translation In the context of the project descrip- workflow varies from system to system, the tion, the related research in the field of LOGOS and the PROMT processes being very automated translation processes as well as similar to each other and differing greatly from the the experiences made with MT at SAP are METAL process. LOGOS and PROMT are used to illustrated. translate SAP documentation material and training courses, whereas METAL and LOGOVISTA are used exclusively for the translation of “SAP notes” 1 Introduction (standardized documents for troubleshooting and customer support). For the purposes of this study, only the major differences between the workflows 1.1 Machine Translation at SAP connected with the four MT systems are described, and two system-specific processes are illustrated In order to cope with its large and constantly grow- (PROMT and METAL). ing translation volumes faster and at lower costs, SAP is one of the few big industrial groups to have PROMT has been productively deployed in the Translation Workflow: METAL translation of SAP documentation and training courses since August 2000. The MT software is SAP Vendor installed locally at SAP, and MT output is sent to CS-System Transfer Server external translation agencies for post-editing, one Manual Download Automatic Upload Remote of them being the PROMT translation department Of SAP Notes itself. The following slide provides an overview of Source Text METAL the PROMT translation workflow. MT & TM Transfer Server Post-Editing Local Translation Workflow: PROMT Target Text Proofreading SAP Vendor Final Translation New Document SAP AG 2001, Title of Presentation, Speaker Name 1 Post-Editing Figure 2 - Translation Workflow: METAL Terminology Extraction MT & TM Output Proofreading PROMT for TRADOS In every translation project, post-editing the raw (P4T) MT output is perhaps the most important stage in Final Document TRADOS TWB the process, since the quality expected from the final translations (after post-editing) must in prin- PROMT System ciple meet the same high requirements as that of any human translation. SAP AG 2001, Title of Presentation, Speaker Name 1 However, despite the various experiences gained Figure 1 - Translation Workflow: PROMT in this field at SAP, approaches to examine post- editing as a linguistic task and to provide common The METAL technology has been used in the guidelines for post-editors have remained rather translation of SAP notes since 1993/94. Initially insular. This is the motivation behind SLS's in- Machine Translation and post-editing were done creasing efforts. internally at SAP. In 1996, the translation of notes was outsourced to an external translation agency, 1.2 Post-editing: Defining the “Unknown which also meant a change in the translation work- Task” flow itself. The reasons for this are among other things the high and constantly growing translation Translation is not only a creative process but also volumes as well as the special requirements this implies some routine and tedious work. And that is text type implies. Since notes deal with very spe- where MT systems come in. Machine Translation cific problems related to the use of SAP software cannot replace human translation, at least not in the and aim at enabling customers worldwide to access foreseeable future. Nevertheless, it can make a support on their own with the goal of reducing the translator's work easier and more efficient by fa- number of incoming phone calls and customer cilitating many tedious tasks such as lexical que- messages at SAP's support departments, these need ries and ensuring terminological consistency. to be available in German, English and Japanese Secondly, MT can make drafting a lot easier for a within very tight deadlines. translator since the MT system already provides a The biggest difference between the notes transla- textual framework to "polish up," provided that the tion process and the PROMT / LOGOS workflow relevant terminology is sufficiently coded in the is that the MT software is installed at translation system dictionaries. partners' sites rather than at SAP, and external This "polishing" of machine-translated texts, translators generate their worklists themselves. generally referred to as "post-editing," is required Therefore, unlike the PROMT / LOGOS workflow, in almost every instance where MT is used. And the whole translation workflow takes place exter- yet, post-editing is still a largely unknown task, nally. The following slide gives an overview of the largely neglected by research in linguistics and MT METAL translation process. studies, so that the borderlines between post- editing and proofreading often appear rather proaching the job, and this is why post-editing im- blurry. plies a positive and open-minded attitude on the part of the translator towards MT technology. 1.3 Objective of the Study Since machine-translated texts are linguistically different from texts translated by a human transla- The objective of this study is twofold. Firstly, it tor, post-editing also requires certain experience sets out to define post-editing as a linguistic task in and skills in recognizing typical "machine" errors. its own right, similar to translating, yet not quite These skills can usually be applied only after some the same and not identical to proofreading either. analysis of output texts. Nevertheless, mistakes Furthermore, the different steps into which the produced by a machine translation system are post-editing task can be broken down are described normally typological and recurring. Once the post- in the light of the experiences gained in the daily editor is able to identify these mistakes, his work translation business at SAP. will be facilitated. Still, many MT mistakes do not Secondly, a general typology of MT errors is immediately catch the post-editor's eye since the presented. This classification can be used as a stan- sentence appears "comprehensible" at first glance. dard benchmark for post-editing MT output, Therefore it is absolutely crucial for post-editing regardless, in principle, of language pair, MT sys- that every machine-translated sentence be thor- tem, or text type. This common typology was made oughly checked against the source text in order to possible by the observations that not only the re- identify "tricky" MT mistakes, especially those quirements relevant for post-editing are largely resulting from wrongly analyzed syntactic struc- language and system-independent, and that the er- tures or from defects in the input text. rors detected in raw MT output often bear great Post-editing must also be distinguished from the resemblance to each other, even between language task of proofreading, which can be defined as the pairs as different as English-Portuguese and Eng- last step in the post-editing process upon comple- lish-Japanese. tion of the target text. The aim of proofreading is to make sure that the target text renders the content of the source text, preserves coherence, and is 2 Translation v. Post-editing v. Proof- idiomatic in the target language. The text reviser reading does not necessarily have to be the translator or post-editor of the text him/herself. In many cases, As mentioned above, the task of “polishing up" the it is even recommended to give post-editing tasks raw MT output to an acceptable, end-user friendly to one person and proofreading tasks to another. text quality is commonly referred to as post- editing, and this terminological convention should As a result, post-editing must be seen as a lin- be maintained in MT research. The scope of the guistic task in its own right and the cognitive proc- post-editing task depends on the quality of the out- esses involved in this task are neither identical to put, the text type and the purpose of the target text translation per se nor to proofreading. One of the with regard to its recipient. reasons for this is related to the nature of the post- In most cases, the post-editor is also a translator. editing process itself, as this process includes three In many cases, this can lead to a situation in which different texts, the source text, the machine transla- the translator will expect the same quality from a tion and the translator's own target text, as Krings machine-translated text as from a text translated by points out (Krings, 2001).