The MT developer/provider and the global corporation

Terence Lewis1 & Rudolf M.Meier2 1 Hook & Hatton Ltd, 2 Siemens Nederland N.V. [email protected], [email protected]

Abstract. This paper describes the collaboration between MT developer/service provider, Terence Lewis (Hook & Hatton Ltd) and the Office of Siemens Nederland N.V. (Siemens Netherlands). It involves the use of the developer’s Dutch-English MT software to translate technical documentation for divisions of the Siemens Group and for external customers. The use of this language technology has resulted in significant cost savings for Siemens Netherlands. The authors note the evolution of an ‘MT culture’ among their regu- lar users.

1. Introduction 2. The Developer’s perspective In 2002 Siemens Netherlands and Hook & Hat- Under the current arrangement, the Developer ton Ltd signed an exclusive agreement on the (working from home in the United Kingdom) marketing of Dutch-English MT services in the currently acts as the sole operator of an Auto- Netherlands. This agreement set the seal on an matic Translation Environment (Trasy), process- informal partnership that had evolved over the ing documents forwarded by e-mail by the Trans- preceding six-seven years between Siemens and lation Office of Siemens Netherlands in The the Developer (Terence Lewis, trading through Hague. Ideally, these documents are in MS Word the private company Hook & Hatton Ltd). Un- or rtf format, but the Translation Office also re- der the contract, Siemens Netherlands under- ceives pdf files or OCR output files for automatic takes to market the MT Services provided by translation. After conversion to a machine-pro- the Developer to divisions within the Siemens cessable format, these files need to be cleaned Group and to prospective external customers. up, otherwise the MT output will be markedly This arrangement has proved to be a win- inferior to the agreed baseline. This clean-up is win situation for both Siemens Netherlands and done by either party by arrangement. the Developer. It has given Siemens Nether- The documents are processed and returned lands access to translation technology at a time on the basis of an agreed delivery date. Short when it has been involved in major cross-border documents are often returned within the hour. infrastructure projects with consortium partners They are not post-edited or revised line by line, that need rapid access to usable English transla- but the Developer does quickly scan each page tions of Dutch technical and legal documenta- to spot any gross errors or omissions. Sentences tion (technical requirements, relevant legisla- containing gross errors are usually sent back to tion and regulations etc.). Siemens is responsi- the MT system after further preparation work. ble for the complete administration of the MT Broadly speaking, users receive automatically customers, and this set-up has provided the De- generated documents. These are either edited by veloper with a stable stream of revenue ena- the end users themselves or used “as is” by en- bling him to devote more time to further devel- gineers familiar with the technical background opment of the software. This paper provides a to the documents. mostly practical account of this relationship, In addition to the run-of-the-mill jobs that first through the eyes of the Developer, and pass through a corporate translation office, the then from the perspective of the global corpora- automatic translation environment has been tion. used to handle huge volumes of documentation

EAMT 2005 Conference Proceedings 176 The MT developer/provider and the global corporation from such major projects as the HSL-Zuid (the nal delivery tool and only sending to the MT high-speed rail link between Amsterdam and engine sentences that are not already present in Paris), the RandstadRail light-rail project in the the database. The MT soft- Hague Region, and a huge construction project ware produces a file that can imported directly for an external customer, Shell International. into the translation memory. This selection, ex- Trasy is built around a small LAN compris- traction and integration is performed semi- ing 4 PCs: automatically. Every job the Developer does is ultimately stored in the translation memory da- • a Linux gateway to the Internet tabase. Figure 1 shows an example of transla- • two Linux translation servers tion unit in an aligned file produced by the MT • a translator workstation/client running un- software. It should be noted that the MT soft- der Windows 98 ware can also be used on a stand-alone basis, This configuration has been set up to facilitate producing an html output file. the handling of a flow of jobs in an office envi- ronment. Of course, the number of translation servers and workstation/clients can be increased 98 to meet any required number of translator/ope- LEWIS! rators. All the dictionary building work is done 15042004,08:38:04 on the translator workstation and the up-to-date De hoofddraagconstructie is van lexical resources are then uploaded together with staal. the files to be translated to the Linux servers. The main supporting structure is of Typically, the Developer/Provider will start the steel. terminology work for the next job while the cur- rent job is being processed. This client/server Figure 1: produced by MT software arrangement has been found to the most practi- cal one, but it is possible (although somewhat The following software tools are used in this inconvenient) to put the whole set-up on a pow- translation environment: erful notebook running either Linux (or Solaris) or Windows XP (or both) that functions as a • Resource Building Tools mobile translation workstation. • Dutch-English software The use of two translation servers by a sin- • Trados Translator’s Workbench gle operator makes it easy to jump the queue to The Resource Building Tools, written by the handle shorter jobs when a long job is already Developer, include conventional single-word dic- running. It also enables the Developer to run the tionary-building tools plus tools for building re- same job simultaneously on two successive builds sources consisting of multiword expressions up of the software, thus providing a simple mecha- to idiomatic of complete sentences. nism for observing the success or failure of in- The utility called “ResourceBuilder” can pro- tended software improvements. This is vital since duce useful sub-sentence segments from trans- the Dutch-English MT application, written en- lation memory databases, aligned source/target tirely in Java, is seen as a dynamic process – files and even different language versions of most large jobs will generate at least one im- web pages. It is the key factor in securing user provement in the software. And the process of acceptance of the automatically generated improvement has been customer-driven: mis- translations since many of the constituent parts takes that were tolerated six years ago are now of sentences are taken from “human transla- pointed out with alarm. tions”. The single-word dictionary and the Re- sources Repository are basically simple re- 3. Software tools source files (enhanced text files) which are only All jobs are processed using the already de- compiled at run-time. The Resources Repository scribed approach combining Machine Transla- contains all lexical items that are not single- tion and Translation Memory. This involves us- word entries. ing a translation memory application as the fi- Examples of entries in this repository are:

EAMT 2005 Conference Proceedings 177 Lewis and Meier

“Tot het werk behoort”, “The work and, with sufficient data, can produce complete includes„ translations independently within limited do- ( =„to the work belongs”) mains. Figure 3 shows the translation of a com- “van den Berg”, “van den Berg„ sources Repository with one word being looked (Dutch surname, not “of the Mountain”!) up in the single-word dictionary. After this module has run, the sentence is “in dit document wordt beschreven”, “this then passed to a Rule-Based Machine Transla- document describes„ tion module which applies various known MT (literal translation: “in this document is described”) techniques to the sentence. These include the “Centrale Velsen”, “Velsen Power parsing of “clusters” or chunks of text, recogni- Plant„ tion of known patterns and techniques for “certificerende instantie van de opdrachtgever, achieving semantic disambiguation based on “client’s certification agency„ the identification of the subject-matter of the (literal translation: “certifying instance of the document. At this stage, for instance, the string client”) “niet zal kunnen worden gekozen” would al- “Dit systeem is gelaagd opgebouwd”, “this ready be identified and tagged as a verbal phrase system has been built up in layers„ and the provisional translation “will not be able (literal translation: “This system is layered built to be chosen” would be allocated for it (the up”) word-for-word translation being “not will can be chosen”). The term “provisional” is used Figure 2: Examples of repository entries since other modules might “decide” to change The Dutch-English translation software is a mo- this translation at a later stage. These various dular application incorporating a hybrid approach techniques are applied “sentence down”, i.e. the to machine translation. Since the Developer be- last series of rules (contained in an Issu- lieves that this approach is critical in achieving eResolver module) deals with single words. In an acceptable quality level for Siemens Nether- the final part of the ‘run-time sequence’, the lands users, a few paragraphs will be devoted to sentence is passed to the EnglishStyleEditor the internal flow of the application. module which works on the target sentence and The program first of all applies known “data- performs some of the tasks of a human post- driven” techniques to match the strings in the editor. For example: “the decision is taken by the input file with segments and sub-segments stored board” becomes “the board takes the decision”. in the Resources Repository. The software at- Of all these modules, the semantic disam- tempts to provide grammatical tagging for the biguation module, particularly the business of results of these matching operations, and ap- getting the technical terms right, is a very im- plies some rules for extracting usable transla- portant aspect for the non-Dutch Siemens engi- tions from the Resources Repository by con- neers who are the main users of the MT output. sulting the single-word dictionary. This module If they know they are dealing with transformers, is a self-contained application in its own right, substations, catenaries and the like, they are not too bothered with a clumsily constructed Eng- „tot het werk behoort”=„the work in- lish sentence, nor are they in the least interested cludes„ in the software tools used to generate their „het programmeren en laden van software”= document. Much of the success of the use of „the programming and loading of soft- MT at Siemens Netherlands is due to the De- ware„ veloper’s commitment to getting the termi- „die gebruikt wordt voor de sturing van”= nology right. In the important field of railway „which is used to control„ engineering, for instance, every major piece of „fabrieken”=„factories transformer names, has been entered in the re- pository, which even includes the names of the Figure 3: Example of a sentence constructed using key personnel involved in the HSL-Zuid pro- ‘data-driven techniques’

178 EAMT 2005 Conference Proceedings The MT developer/provider and the global corporation ject. So high is the confidence in this terminol- mens Nederland at market prices while cover- ogy that Siemens staff translators working on ing its costs in full.’ “human translations” sometimes send the De- Although the translation department has suc- veloper their terminology queries. But this is a ceded in fulfilling its mission over the years, two-way process, and the Developer is also able this task has become more and more difficult on to forward terminology queries to the Siemens account of the developments described above Translation Office before processing a job. To and the specific situation of the translation mar- conclude this first part of the paper, the main ket which is characterized by a constant down- advantages for the Developer in this close rela- ward movement of prices, while at the same tionship with Siemens Netherlands are: time the internal cost allocations have con- stantly risen. • a guaranteed flow of revenues • an opportunity to test new developments on 4.3. Impact of introducing the MT “real texts” service of Hook&Hatton (‘Trasy’) • the possibility of building in knowledge of Siemens’s core areas of expertise 4.3.1. General desciption of the Trasy service • a constant challenge to meet the growing In the early 90’s, the translation department came quality demands made by Siemens custom- into contact with Hook&Hatton which offered ers over the years MT services from Dutch into English. The inci- dental cooperation between the translation de- 4. The Siemens perspective partment and Hook&Hatton has developed into a firm relationship on the basis of an exclusive 4.1. General situation of in-company contract for the delivery of MT services for the translations departments Netherlands which was concluded in 2002. The Trasy service of Hook&Hatton which The increasing globalization, liberalization of the translation department of Siemens is offer- markets, and ever growing global (tele)commu- ing to internal as well as to external customers nication speeds and possibilities as well as glo- is a unique selling point. Constant reviews of bal competition have fundamentally changed existing competing MT system show that the the way in which enterprises conduct their busi- Trasy service is the best available choice for ness. Lifetime employment in a company or even machine translations from Dutch into English. in a department is a phenomenon of the past. There is no organization in the world which Instead, companies are in a constant move of can offer machine translations from Dutch into internal reorganizations, assessing outsourcing English of higher quality than the translations paths, redefining core business, selling activities department of Siemens! to other companies. Through its preediting approach the system Internal departments such as a translation is constantly learning and thus preserving its department can no longer trust that the once competitive edge over the other systems in the made decision to set it up is a future-proof deci- market. sion. Internal translation departments are sub- The recognition of the system under internal ject to shareholder-value-driven profitability customers has grown over the years. Situations considerations. They must prove to be able to in which customers by themselves ask for a MT produce an added value and increase profita- translation are everyday practice. The price/per- bilty in order to safeguard their survival. formance of Trasy is second to none. In the most favorable case Trasy can deliver 150.000 4.2. Situation of the in-company words a day, seven days a week! translation office at Siemens Netherlands 4.3.2. Financial impact for the translation The translation department of Siemens in the department Netherlands was founded in 1988. Its mission is The MT business has in the meantime grown to to ‘satisfy the translation requirements of Sie- a volume of € 220.000 (in fiscal 2004), that is 24% of the total turnover of about € 900.000.

EAMT 2005 Conference Proceedings 179 Lewis and Meier

Roughly 30% of this turnover is realized with On the other hand, technical developments external customers. are continously creating new professions. In other words, 24% of our sales come from As far as machine translations systems at the a sector in which the translation department is current stage of development are concerned: they and will stay the top-quality supplier. often create additional translation needs. Docu- ments for which no human translation would be 4.3.3. Financial impact for Siemens ordered, become translation jobs. Cost savings Netherlands due to machine translation free funds for – in For Trasy machine translations the translation much cases more interesting – human translations. department charges 40% of the price charged for human translations. On the basis of the MT 5. Conclusion turnover with internal Siemens Netherlands cus- The authors are aware of the fact that the sub- tomers of about € 154.000 in 2004 this means ject of this paper is limited to the effect of cost savings of about € 231.000 for this period working with the Trasy MT system in the trans- for Siemens Netherlands. lation department of Siemens Netherlands. This paper does not claim to be a general study about 4.3.4. Organizational impact for the the implications of MT systems for translation translation department agencies and for the whole translation market. It These figures contributed to the fact that the does, however, describe the practical benefits of translation department of Siemens Netherlands collaboration between a developer working has been able to prove its added value for the with a ‘lesser used language’ and a major cor- organization. poration with a requirement for rapid, low-cost translations from that language. 4.3.5. Side-effects of offering MT services The following paragraph was delivered to The translation department of Siemens Nether- Siemens Netherlands by Trasy on 28/01/2005: lands has experienced a very positive side-effect “The Contractor should be fully aware that of offering MT services to big customers. These the activities to be carried out by him at the lo- customers have decided to also order human cations take place during the normal operational translation from us. The advantages of one-stop- management of the production facilities. The shopping may have been the reason for this. It production and the related supply logistics may may be claimed without any exaggeration that never be impeded by activities relating to the the introduction of the Trasy MT service has access control. Therefore installation operations safeguarded the existence of the translation de- should be planned at all times in proper consul- partment of Siemens and caused a material cost tation with the Principal, and where necessary reduction not only for Siemens Netherland but temporary provisions should be made by the also for external customers. Contractor to limit obstacles for the production to an acceptable level for the Principal.” 4.3.6. Translators’ professional considerations Does Trasy threaten the existence of human 6. References translators? This is a very difficult question which cannot be answered satisfactorily here. Techni- LEWIS, Terence (2001). ‘Combining Tools to Im- prove Automatic Translation’. Paper given at the cal and scientific developments have always had ‘MT Summit VIII’ conference, 18-22 September an effect on existing professions. The rise of the 2001, Santiago de Compostela, Galicia, Spain. motor-driven vehicles has caused the disappear- ance of coachmen, for example.

180 EAMT 2005 Conference Proceedings