
WRITING FOR TRANSLATION

Getting Started Guide

October/November 2009

Planning and Writing for Translation

Optimizing the Source Using Translation Memory

Elements of Style for Machine Translation

Optimized MT for Higher Translation Quality

Controlled Authoring to Improve Localization


Getting Started: Writing for Translation

Believe it or not, setting out to write lyrically beautiful copy for a manual or even the web may not be the most straightforward way to get to clear translation. The authors in this guide offer some better approaches. Barb Sichel begins this Getting Started Guide with an overview on planning and writing for translation, and then Joseph Campo offers the findings from a project using a translation tool to find already-translated phrases to write the original copy. Ken Clark gives a short guide on writing for machine translation (MT), and Lori Thicke outlines why MT allows for quality translation in the first place. Ultan Ó Broin finishes things with a discussion on controlled authoring.

The Editors

CONTENTS

Planning and Writing for Translation
Barb Sichel
Barb Sichel, director of business development at International Language Services, Inc., has over 25 years of sales, marketing and management experience.

Optimizing the Source Using Translation Memory
Joseph Campo
Joseph Campo holds a senior role at Dassault Systèmes SolidWorks Corporation in Concord, Massachusetts, and has ten years of experience.

Elements of Style for Machine Translation
Ken Clark
Ken Clark, CEO of 1-800-Translate, worked previously as a speech writer, among other roles, for Japanese and American government officials.

Optimized MT for Higher Translation Quality
Lori Thicke
Lori Thicke is cofounder and general manager of Lexcelera (formerly Eurotexte), established in 1986, as well as cofounder of Translators Without Borders.

Controlled Authoring to Improve Localization
Ultan Ó Broin
Ultan Ó Broin, MultiLingual editorial board member and Blogos contributor, works for Oracle in Ireland. He has an MSc from Trinity College Dublin.

Editor-in-Chief, Publisher: Donna Parrish
Managing Editor: Laurel Wagers
Assistant Editor: Katie Botkin
Proofreader: Jim Healey
News: Kendra Gray
Illustrator: Doug Jones
Production: Sandy Compton
Editorial Board: Jeff Allen, Julieta Coirini, Bill Hall, Aki Ito, Nancy A. Locke, Ultan Ó Broin, Angelika Zerfaß
Advertising Director: Jennifer Del Carlo
Advertising: Kevin Watson, Bonnie Merrell
Webmaster: Aric Spence
Technical Analyst: Curtis Booker
Data Administrator: Cecilia Spence
Assistant: Shannon Abromeit
Subscriptions: Terri Jadick
Special Projects: Bernie Nova

Advertising: [email protected], www.multilingual.com/advertising, 208-263-8178
Subscriptions, customer service, back issues: [email protected], www.multilingual.com/subscribe
Submissions: [email protected]. Editorial guidelines are available at www.multilingual.com/editorialWriter
Reprints: [email protected]

This guide is published as a supplement to MultiLingual, the magazine about language technology, localization, web globalization and international software development. It may be downloaded at www.multilingual.com/gsg


page 2 The Guide From MultiLingual


Planning and Writing for Translation

BARB SICHEL

Documents and online communications are translated to achieve specific objectives. Your goal may be to execute a global communication plan, meet regulatory requirements, avoid liability or drive revenue by addressing target audiences in their native language. Whatever the outcome, you will need clear communication of a single message across all of the languages involved to get there.

Lately, cost considerations have become just as important as the accuracy of the translation. Consequently, writing for successful translation today involves planning your project so that you can convey your message within a reasonable budget.

Message and scope

First, and most obviously, decide what you need to communicate, and communicate it as simply and directly as possible. Determine what is most relevant to your target audience and what you must translate to achieve your particular objectives. Take the time to think your project through from the perspective of the recipient, and do some research if you don’t know the recipient’s perspective.

Translating everything you publish in English may not maximize the return on your translation investment. You might not have the luxury of translating every one of your product data sheets, for example, so focusing on product line summary brochures instead may be less costly. If it is beyond your budget to translate your entire 200-page employee manual, perhaps you can focus on only those critical policies most needed to protect your firm’s interests.

Some projects, such as catalog or website translations, may warrant the creation of abbreviated or revised versions for target audiences. Sections dealing with customer support or how to locate a sales representative, for instance, may need modification so that they are relevant in the geographic locale in which they will be used. Other types of projects require translation of ancillary materials that may not immediately come to mind. Technical documentation for large-scale industrial equipment may also involve translating warning labels and software user interfaces. Again, to save money, perhaps you can omit a section such as the corresponding parts list. If your customers can’t order parts in Japanese by calling your customer service line, why provide a Japanese parts list?

Understanding the intent and full scope of your project will enable you to plan your budget and work with your translator to determine the correct order in which to proceed. A phased implementation may be easiest to manage while allowing you to complete the highest priority requirements first.

Layout

For printed materials, properly planning your layout even before you start writing copy can greatly influence the ease and cost of managing your project. Quite literally, it pays to understand which factors affect the cost and quality of your translation. Then you can craft your presentation to achieve the desired outcome within your allotted budget.

A few things to consider are the choice of desktop publishing application and layout. If this is going to be a printed document with color plates, you might look at whether enough room is left for text expansion to accommodate any graphics. Text will expand in some languages and may contract in others. This has implications for the font sizes and page margins you select, as well as graphics. Chinese characters that need to be reduced to a 6-point font in order to fit on a page will be illegible.

Also, check the graphics accessibility. Don’t plan to embed words into layer upon layer of graphics. Your translator may not be able to access them for translation at all or may be able to do so only at great expense to you. Plan to place your text labels beneath graphics rather than inside of them. Text must be “live,” that is, accessible independently of the graphics in order to be translated and reinserted in the same position.

The same concept applies to screen shots. Unless you translate your software first and provide new screen shots, the English copy locked within your graphics cannot be accessed for translation. If you must use preexisting graphics, your translator may be able to recommend solutions such as a reference table so that the reader can still understand your message.

Too often, project costs are unnecessarily high or the quality of the finished translation is compromised because translation was never considered when a document was originally created.

Your copy

Simple, straightforward text is easiest to translate. Say what you mean as concisely as possible. Word count is a key factor in the cost of your translation, so, if possible, keep sentences short and limited to a single idea. If English copy already exists for your pending translation, review and revise the content. Formal copy style with correct grammar, spelling and punctuation will be most easily understood by your translator.

Consider also your audience’s level and communication style, and write accordingly. Instructions to a physician prescribing medication should be written differently than instructions to the patient taking the medication.

Avoid words with double meanings and references or metaphors that may not make sense in other cultures. Don’t rely on buzzwords, abbreviations, industry jargon, colloquial expression or humor.

Create standardized text whenever possible. If you can reuse blocks of copy from one document to the next, you will save time and money on your translations and

October/November 2009 • www.multilingual.com/gsg page 3


ensure consistency across all of your written and online communications.

If your content is highly technical in nature or your industry-specific terms are prone to multiple meanings, supply your translator with reference material or glossaries for key terms. Links to websites or product catalogs can minimize the need for research during the translation process.

Some copy may not translate well or may translate into some languages but not others. Be particularly aware of this if you are creating ad copy or marketing materials. It is worthwhile to check with your translator early, before you have invested heavily in developing graphics or a tagline to accompany your corporate logo. Choosing the right words and the right images or colors for your presentation may make the difference between a seamless translation and one that falls completely flat with your target audience.

Acronyms should be avoided. The problem in trying to translate an acronym is that once you translate the word, the letters change and they no longer cross-reference to the supporting ideas you want to convey in your target languages.

A native-speaking translator is a good resource for spotting things that won’t play well with your target audience. Basic localization (gearing your translated document to a particular country, region or target audience) is usually part of any well-executed translation project. Extensive localization, to the point of creative strategizing, however, is a specialized skill beyond the scope of typical translation projects. If you suspect your project requires an unusual amount of attention, check with your translator.

Provide only fully proofread, final copy for translation. Drafts are fine for budgetary quotes, but works-in-progress are unsuitable for translation and will leave your project prone to errors, inconsistencies and higher costs. If you intend to update documents later with new product models or next year’s catalog, the level of attention you devote to tracking changes and version control now will be well worth your effort.

Formatting

Locate your source files for older documents. This includes all of the desktop publishing and accompanying graphics files. Are they with your graphics design firm or archived somewhere within your organization? Your translator may not be able to replicate your formatting and graphics without them. Provide files to your translator in the same format you would like to receive back.

PDFs are fine for reference, but depending on the size of your document and the application used, having the source files available may significantly impact the time to quote your project, the cost of your project and the appearance of the final output. If you are working from hard copies or scanned documents, manual processes will have to be employed that will similarly affect your project.

Given the source files, most translation firms can replicate standard file formats, even for software code. How you present content for translation impacts cost, timeline and the ease of implementing your project. If you do any cutting and pasting at your end, have your translator provide a “post-format review.” This ensures proper text flow and the overall quality of your presentation before you print or post it on the internet. Costs for this service are usually nominal and can prevent potential embarrassment.

Formatting foreign character sets on your own can be a challenge, even for an experienced graphics person, and you may not have the right tool set. Languages that read right to left require special software versions and the ability to reorient everything on a page. It is best not to attempt this on your own.

If you are translating software for user interfaces, handheld LCD screens or similar uses, be prepared to answer questions about your ability to handle foreign character sets, space limitations and other factors that specifically affect these types of projects.

If you need to resize short translations to fit an ad or label, ask for an Adobe Illustrator EPS file that has been “outlined.” This provides the best of both worlds. It is locked down like a graphic to eliminate the possibility of introducing errors during formatting, but leaves flexibility for resizing. You can format the text to meet your needs, even for a character set that you may not have installed.

Lastly, use the right application for your project. Some applications play well with the automated tools employed by translation firms while others require a lot of manual manipulation. Microsoft Word works fine for short documents, but FrameMaker may be a better choice for large manuals. If you use charts, live embedded links or manually inserted multiple carriage returns, the level of difficulty in working with your files for translation will increase, and this will impact your cost.

Timelines

Translation is a meticulous, skilled process similar in nature to technical writing. Though you provide the concept and the source files, your translator must take time to fully comprehend your meaning and find the best way to replicate the tone and content in his or her native tongue. Often there is research involved or requests for you to provide clarification.

Your project involves much more than merely translation. Numerous details are involved in preparing your files for translation, gaining commitment from the best qualified translators, proofreading, formatting and ensuring proper quality control. For multiple language translations, managing your project becomes even more complex. If you make a single change, it needs to be disseminated across teams of translators and proofreaders for each language.

Allow realistic timelines for your projects to be completed. A simple brochure may take several days, while a 300-page manual may take several weeks. Advise your project manager in advance if you must meet a specific deadline so that your project can be managed accordingly.

Partnering with a vendor

Since the quality of the translations you publish reflects on you and your organization, establishing a comfortable working relationship with your vendor is essential.

Carefully crafted branding strategies can be derailed in an instant by sloppy or inaccurate work. Even a single poorly chosen word can alter your intended meaning. And just imagine your customer purchasing a piece of equipment only to find that the documentation doesn’t make sense or that the table of contents doesn’t match the order of the text. You will rely on your translation vendor to provide you with accurate translations that are audience appropriate and delivered, print ready, within the specified timeframe. You should also educate yourself as to their quality processes and experience level with projects similar to yours so that you can move forward with full confidence.

While there is no single industry certification for translations, there are third parties such as TÜV or the American Translators Association that provide quality testing and auditing. It is perfectly acceptable to ask for credentials. In many cases, your own in-house quality policies or regulatory requirements demand that you do.


Optimizing the Source Using Translation Memory

JOSEPH CAMPO

How many times have you written something and known that you wrote something similar, but can’t remember where it was or how it was written? If you could only find that text and replicate it, you would save money and time for your translation team by reusing already-translated text strings and would produce more consistent documentation. This article describes a pilot project that tested a potential solution to this issue using translation memory (TM).

I hypothesized that if our technical writers could tie our authoring process into an English TM that contains already-translated text strings, we could find existing English text strings, reuse them on new topics and lower our translation costs. In effect, the documentation team would pretranslate their new English documentation to maximize matches against existing English text strings before sending topics to the translators who use the same TM.

We would use the English (source language) TM to improve the quality of fuzzy matches and reduce the number of words. Fuzzy matches indicate a percentage match of new or changed text against existing already-translated text. A higher percentage fuzzy match means the text string more closely matches existing translated text. The higher the fuzzy match percentage, the lower the cost to translate the text. Totally new text strings are the most expensive to translate, so I tried to reduce new words used. Because we translate into 12 languages, there is a great potential for cost savings.

After approval of the pilot project, I worked with my manager to schedule two months of project time. The translation team manager provided me with a TM tool license (Trados, in my case) and I was ready to start the project after several days of training.

Project design

We use RoboHelp HTML to create online help and deliver multiple compiled help files (chms). I chose the main SolidWorks help to use in the pilot because it is our largest chm, with approximately 2,000 topics. I went back in time and created an English TM.

I collected 73 new and 39 changed topics that documentation had actually sent to the translation team during the SolidWorks 2007 development cycle. I used the Analyze tool in Workbench to obtain an original estimate of a full-cost translation. I also obtained an original estimate for a full-cost translation for the same topics from our outsourcing localization vendor, using German as the target language.

A dual monitor setup was essential to this project. On my right monitor, I opened Workbench and ran topics individually through the English TM to pretranslate them. On the left monitor, I opened the original HTML topic that had been sent to translation. When I ran a topic through Workbench, it provided a percentage match of the new text against the existing TM on a string-by-string basis. I used these suggestions to change the English source text in HTML on the left monitor and to improve the percentage of fuzzy match. I also paid close attention to reducing the number of new words.

After pretranslating each topic, I used the Analyze tool in Workbench to gauge and record the amount of savings for each topic. When I completed pretranslating all the topics, I calculated the costs and savings using the research data. I also obtained a translation cost estimate from our outsourcing localization vendor for the now pretranslated topics.

Results

In both Table 1 and Table 2, results show a modest reduction in per-language translation costs when comparing the original cost estimates to the cost estimates after using Trados to research the TM (post-Trados).

                  Original cost    Post-Trados project cost    Savings
New topics        $4,807.64        $4,301.79                   $505.85 (10.5%)
Changed topics    $1,957.97        $1,554.78                   $403.19 (20.6%)
Grand total       $6,765.61        $5,856.57                   $909.04 (13.4%)

Table 1: Cost estimate, vendor full-cost translation (includes translation, review and layout/DTP).

                  Original cost    Post-Trados project cost    Savings
New topics        $4,463.57        $3,837.79                   $625.78 (14.0%)
Changed topics    $1,322.25        $1,049.49                   $272.76 (20.6%)
Grand total       $5,785.82        $4,887.28                   $898.54 (15.5%)

Table 2: Cost estimate, Trados Workbench Analyze tool full-cost translation (includes translation, review and layout/DTP).
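To make the fuzzy-match idea concrete, the sketch below scores a new English string against a short list of existing TM strings. It is an illustration only: it uses the similarity ratio from Python's standard `difflib` module, which is not the algorithm Trados Workbench uses, and the function names, the sample strings and the 50% cutoff for "no match" are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def best_fuzzy_match(new_string, tm_strings):
    """Return (percent, tm_string) for the closest entry in the TM list.

    The percentage is difflib's similarity ratio, a stand-in for a real
    TM tool's fuzzy-match score (an assumption for illustration).
    """
    best_percent, best_entry = 0, None
    for entry in tm_strings:
        ratio = SequenceMatcher(None, new_string, entry).ratio()
        percent = int(ratio * 100)  # round down so a near-match never reports 100
        if percent > best_percent:
            best_percent, best_entry = percent, entry
    return best_percent, best_entry


def match_band(percent, no_match_below=50):
    """Classify a score; the 50% cutoff is a hypothetical threshold."""
    if percent == 100:
        return "100% match"
    if percent >= no_match_below:
        return "fuzzy match"
    return "no match"


# Made-up TM entries and a new sentence a writer is about to send out.
tm = ["Click OK to save the file.", "Select a face on the model."]
percent, entry = best_fuzzy_match("Click OK to save the part.", tm)
print(percent, match_band(percent), "->", entry)
```

A writer following the approach in this article would reword the new sentence toward the suggested TM entry whenever the meaning allows, which moves the string into a higher match band.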


Details: new topics

I created charts to display the percentages of fuzzy matches in the original versus the post-Trados topics. The post-Trados topics (Figure 1) showed an increase in the number of 100% matches and a decrease in the number of No Matches. In terms of percentages, there was also an increase in the number of 50%-74% fuzzy matches. The remaining fuzzy match ranges were approximately equal to or less than the percentages of the original new topics.

• Total words reduced by 10% (2,028 words).
• 100% match increased by 439 words.
• No match reduced by 1,613 words.

Figure 1: Fuzzy matches in original versus post-Trados topics.

Details: changed topics

Changed topics are existing topics with changes to already-translated text. These charts revealed a similar trend as with new topics. In terms of percentages, the post-Trados changed topics showed an increase in the number of 100% matches of about 10%. The remaining fuzzy match ranges were less than the percentages of the original changed topics. Overall, there is a greater percentage of 100% matches and a smaller percentage of no matches compared with the new topics.

Analysis

The cost estimates are within acceptable deviations that permit me to say that the Analyze tool results are defensible. I discussed the deviation with a senior employee in our research department. Standard deviations are complex to calculate and vary based on many parameters. When I provided the deviations, particularly for the new topics, the research employee felt that the 3.5% difference was within an acceptable deviation range (Table 3).

I then met with the translation manager to discuss the difference in costs between our outsourcing localization vendor and the Analyze tool. The translation manager confirmed that translation costs will vary depending on the vendor, the language, and the services provided. Having multiple variables makes it impossible to provide an exact cost estimate to fit all situations.

The translation manager informed me that only about 50% of our outsourced translation items require full-cost translation. She suggested we apply a different cost metric to the other 50% of our outsourced translation items; this metric is called raw translation, which includes translation of the text only. The savings in percent are similar to full-cost translation using the Analyze tool. Notably, for new topics, raw translation saved 14.1% while full-cost translation saved 14% (Table 4).

                  Original cost    Post-Trados project cost    Savings
New topics        $2,409.98        $2,071.03                   $338.95 (14.1%)
Changed topics    $815.93          $629.67                     $186.26 (22.8%)
Grand total       $3,225.91        $2,700.70                   $525.21 (16.3%)

Table 4: Cost estimate, Trados Workbench Analyze tool raw translation (includes translation only).

For the purposes of this pilot, I decided to split the difference between the outsourcing localization vendor savings of 10.5% and the Analyze tool full-cost translation savings of 14% and to use an estimated savings of 12.2% for new topics. This seemed like a reasonable compromise.

Savings projection

Outsourcing costs for a typical release vary from $100,000 minimum to $400,000 maximum, depending on how many new products and services requiring localization are added to our suite of products, their length, and the number of languages supported. If the process were applied to all new documentation that we send to translation for outsourced localization, an estimated cost savings of 12.2% (between $12,200 and $48,800) would be achieved (Table 5).

                New topics             Changed topics         Grand total
                Savings   Deviation    Savings   Deviation    Savings   Deviation
Vendor          10.5%     33.3%        20.6%     0%           13.4%     17.9%
Analyze tool    14.0%                  20.6%                  15.8%

Table 3: Full-cost savings comparison/deviation between vendor and Analyze tool.
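The article does not say how the deviation figures in Table 3 were calculated. They are consistent with a relative difference, that is, the gap between the two savings percentages divided by the vendor figure, so the sketch below assumes that formula. The function name is hypothetical.

```python
def relative_deviation(vendor_pct, analyze_pct):
    """Gap between the two savings figures, as a percentage of the vendor figure.

    Assumed formula; the source does not state how deviation was computed.
    """
    return (analyze_pct - vendor_pct) / vendor_pct * 100


# (vendor savings %, Analyze tool savings %) as listed in Table 3
table_3 = {
    "New topics": (10.5, 14.0),
    "Changed topics": (20.6, 20.6),
    "Grand total": (13.4, 15.8),
}

for label, (vendor, analyze) in table_3.items():
    print(f"{label}: {relative_deviation(vendor, analyze):.1f}%")
```

Under this assumption the three rows come out at 33.3%, 0.0% and 17.9%, which matches the deviation columns of the table. The 3.5% mentioned in the text is the simple difference between 14.0% and 10.5%.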


Changed documentation is typically localized by our in-house translation team, so for us there will be no “cost savings” per se. However, the translation team would experience a time savings of between 20.6% and 22.8% because of the increased quality and number of matches as well as the reduced word count.

            Outsourced localization cost savings    SolidWorks translation team time savings
New         $12,200 to $48,800                      n/a
Changed     n/a                                     20.6% to 22.8%

Table 5: Annual estimated savings if Trados is implemented for all new and changed documentation.
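The projection for new documentation can be reproduced in a few lines. This is a sketch using only figures quoted in the article; the variable and function names are made up for the example, and the results are rounded to whole dollars.

```python
VENDOR_SAVINGS_PCT = 10.5    # vendor estimate for new topics (Table 1)
ANALYZE_SAVINGS_PCT = 14.0   # Analyze tool estimate for new topics (Table 2)

# "Splitting the difference" gives 12.25; the article works with 12.2%.
midpoint = (VENDOR_SAVINGS_PCT + ANALYZE_SAVINGS_PCT) / 2
ESTIMATED_SAVINGS_PCT = 12.2


def projected_savings(outsourcing_cost, savings_pct=ESTIMATED_SAVINGS_PCT):
    """Estimated savings in whole dollars for a given outsourcing spend."""
    return round(outsourcing_cost * savings_pct / 100)


low = projected_savings(100_000)    # minimum outsourcing cost per release
high = projected_savings(400_000)   # maximum outsourcing cost per release
print(f"${low:,} to ${high:,}")     # prints "$12,200 to $48,800"
```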

Pilot project conclusions

Using a TM tool is viable in pretranslation only if we consider its value in increasing the consistency and quality of documentation. I could not justify using the tool on a cost-savings basis alone.

Savings were achieved by both reuse of existing text and aggressive word-count reduction. However, the anticipated translation savings only partially offset the cost of the skilled writer’s time. I spent approximately 30 minutes per topic using my TM tool.

For the 73 new topics, I spent approximately 37 writer hours. Using $50 cost per writer hour, I spent $1,850 in time to achieve only $625 savings in outsourcing costs. Labor costs were triple the savings achieved, for one language. Actual cost savings are only achieved when factoring in that we translate into 12 languages. Savings = total cost savings ($7,500) - time spent ($1,850) = $5,650. If the TM tool were used to only search for reusable text (no word reduction), the results would be even less impressive (estimated 2.4% savings in outsourcing costs).

Beyond the case study: related research

I queried translation experts as to whether any similar projects had been undertaken. Authoring memory tools have been around for over ten years. An article by Jeff Allen in 1999 discussed how authoring memory could be used in conjunction with controlled language to aid in translation (www.transref.org/default.asp?docsrc=/u-articles/allen2.asp). The new Author-it product, for example, uses fuzzy matching within a content management system.

I contacted Nabil Freij, president of GlobalVision, and according to him, this pilot project was a unique approach. The key to reducing localization costs is to reduce word count. Some companies are implementing controlled English to reduce word count, increase the 100% matches, and also to transition to machine translation (MT). According to Freij, “MT engines can perform better under restricted and controlled vocabulary.” In his experience, most tech pub writers do not want to deal with localization issues while authoring: “they are simply too overworked. . . . We often can’t get them to edit their work, let alone reduce the word count or make it consistent.”

I have been following the progress of the SDLX AuthorAssistant (SDLXAA) product, which seems similar to my pilot project. SDLXAA lets writers write, then runs the document against a TM to offer suggestions for improved matches and reuse. According to Sue Blaisdell, information architect at Avaya, “with AuthorAssistant, you can connect to TMs for your project, and it will display 100% and fuzzy matches to the writer. It also gives the writers insight into the way that changes they make in their English content affect the localization costs.”

This pilot project indicates that translation cost savings can be achieved using TM, but at a cost in labor and time. With usage, writers would become more proficient with the system and save time. Your company would have to be ready to absorb license and time costs. If you are going through a major restructuring of your documentation, perhaps upgrading to XML, this might be the perfect time to examine your documentation in detail with your translation costs in mind.

One benefit I found was that while using my TM tool, I was fully focused on reducing word count because I kept translation as my main focus. Word reduction is hard to achieve in normal writing because the technical writer is normally not thinking about it. According to Freij, “verbosity is the enemy. It pays to be concise and straight to the point, eliminating unnecessary text when localization is imminent. When writing technical documents, remember that simplicity is also very much desired by the end-user.” It is also important that your TM be as clean as possible.

What writers need is a TM tool that runs side-by-side with an authoring application and can semi-automatically offer suggestions on how to better match new text to the existing TM. The development of SDLXAA and Author-it’s new application should give us hope that tools are becoming available to bring technical writers and translators closer together to achieve cost savings by leveraging valuable memory resources.
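The cost-benefit arithmetic in the pilot project conclusions can be checked with a short script. The inputs are the figures quoted in the article; the variable names are made up for this sketch, and the break-even count at the end is derived from those figures rather than stated by the author.

```python
import math

NEW_TOPICS = 73
HOURS_PER_TOPIC = 0.5         # "approximately 30 minutes per topic"
WRITER_RATE = 50              # dollars per writer hour
SAVINGS_PER_LANGUAGE = 625    # outsourcing savings for one language, in dollars
LANGUAGES = 12

# 73 topics at half an hour each is 36.5 hours; the article rounds to 37.
writer_hours = math.ceil(NEW_TOPICS * HOURS_PER_TOPIC)
labor_cost = writer_hours * WRITER_RATE
total_savings = SAVINGS_PER_LANGUAGE * LANGUAGES
net_savings = total_savings - labor_cost

# Derived, not from the article: languages needed before savings cover labor.
break_even_languages = math.ceil(labor_cost / SAVINGS_PER_LANGUAGE)

print(f"Writer time: {writer_hours} hours, ${labor_cost:,}")
print(f"Savings across {LANGUAGES} languages: ${total_savings:,}")
print(f"Net savings: ${net_savings:,}")
print(f"Break-even at {break_even_languages} languages")
```

With these inputs the labor cost is $1,850, the savings across 12 languages are $7,500 and the net is $5,650, as in the article. On the same figures the writer's time is only recovered once the content goes into at least three languages.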


Elements of Style for Machine Translation

KEN CLARK

We have entered the Machine Translation Age. Demand for human translation is still increasing dramatically (or was until this year), but the vast majority of the world’s translation is now done by computer. And the vast majority of machine translation (MT) transactions are completed using free online translation tools such as Babel Fish or Google.

The result usually leaves much to be desired, and there’s not much you can do about it when you are translating someone else’s content, particularly if you don’t understand the source language. But you can dramatically improve translation of content you write yourself and share with others in a foreign language, without using the special software and workflows of powerhouse automated translation systems. Just a few simple writing tricks can make a dramatic difference in MT quality. It’s not controlled language, but language control.

Writing clearly, whether for man or machine, is always a struggle (at least for me), and the dim machine minds of the translation tools are unforgiving when it comes to bad [...] in source. Unlike us humans, MT tools have no sense of context, no appreciation of an author’s intent and definitely no sense of humor. With Strunk and White’s The Elements of Style as inspiration, here’s an abbreviated guide to good English style for improved MT.

• Use short sentences. Keep it simple. Cut the clauses. Ditch the sentence fragments. Simple sentences and grammatical structure (subject-verb-object) are the only way to go.

• Avoid ambiguity, as in “I saw her duck.” Well, which is it? A quacking duck that belongs to her? Or was she avoiding a flying object? Look for multiple meanings when proofing. Good luck. If you don’t find it, your MT tool may just do it for you.

• Remove extra words. Editing out unessential phrases and extra words will make for a simpler, better translation. Because the algorithms have fewer translation variables to wrestle with, the translation will also be more accurate, with better style in fewer words.

• Don’t remove necessary words, and don’t go too far with editing. In English we drop a lot of words when we write, especially when writing informally. Keep those articles, prepositions, pronouns and so on where the machine can find them. English speakers are able to fill in the blanks and fully understand; not so when the reader is a translation engine.

• Misspelling does not compute. A misspelled word will not translate: end of story, end of translation.

• Ditto on punctuation. One accidental period can completely change the meaning of a sentence and trash your translation. Spell-checking and proofreading after you write and before you translate are pretty basic quality assurance steps.

• Slang is so like, whatever. No slang and no [...] for MT. [...] is the first thing lost in MT. Stay earnest and formal. That’s why pithy headlines and snappy newspaper copy so often translate badly with these tools. Rule of thumb: Good MT style comes in one flavor . . . plain vanilla.

• Use “Do not translate” coding. Some MT tools will allow you to place code around a word or phrase, which allows the word to pass through the engine without getting translated.

• Check your translation. So after closely studying and applying all these rules before translation, how can you know if your MT output makes any sense? Translate the output back using an MT tool. That reverse translation may help you to spot the most glaring errors. Recast those problem sentences in English and see if the back translation gets any clearer. Don’t expect miracles here. But it may be some comfort to know that the original translation is better than the back translation.

• Keep source and target together. No garbage in the MT tool, less garbage out. But garbage there will be. That’s why we like to keep a copy of the source with the target translation to create a bilingual output so that those errors can be spotted and corrected later if need be.

• Identify MT. Avoid blame by giving credit. Letting people know that you used a machine to communicate with them allows them to read with caution, and keeps them from feeling they’ve been short-changed on a real translation.

On the writers’ craft

Using a little bit of discipline to prepare content for MT extends the functionality of these tools for people busily engaged with others in multiple languages.

Writing for MT, just like writing for human translation, is good writing practice. Translation has a way of highlighting communication errors that are invisible or ignored in just a single language. The simplicity and clarity of expression demanded by MT tools would meet the approval of Strunk and White, I hope.

Their [...], The Elements of Style, prized for its focus on clear, concise language, has been a source of guidance for writers, copyeditors and college students for half a century. To mark the 50-year edition published this spring, commentators in The New York Times’ Room for Debate consigned it to the dustbin of history (http://roomfordebate..nytimes.com/2009/04/24/happy-birthday-strunk-and-white). I’ve still got a dog-eared, ratty, old copy on my desk, where it shall remain.
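The “Do not translate” tip above can be approximated even when a tool offers no such markup. The following is only a minimal sketch, assuming the engine passes an opaque placeholder token through unchanged (many engines do not guarantee this). The names `protect_terms`, `restore_terms` and the `__DNT0__` token format are hypothetical, and `fake_translate` is a stand-in that merely uppercases text so the effect is visible; it is not a real MT call.

```python
import re


def protect_terms(text, terms):
    """Swap each do-not-translate term for a placeholder token.

    Returns the masked text and a token-to-term mapping for later restoration.
    Assumes terms start and end with a letter or digit (needed for \\b to match).
    """
    mapping = {}
    # Longest terms first, so a short term never splits a longer one.
    for i, term in enumerate(sorted(terms, key=len, reverse=True)):
        token = f"__DNT{i}__"
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = term
    return text, mapping


def restore_terms(text, mapping):
    """Put the original terms back in place of the placeholder tokens."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def fake_translate(text):
    """Hypothetical stand-in for an MT engine: it only uppercases the text."""
    return text.upper()


if __name__ == "__main__":
    source = "Paste the text into Babel Fish."
    masked, mapping = protect_terms(source, ["Babel Fish"])
    print(restore_terms(fake_translate(masked), mapping))
```

Before relying on this with a real engine, it is worth checking on a few sentences that the placeholder really comes back intact, since some engines split or reorder unfamiliar tokens.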

page 8 The Guide From MultiLingual


Optimized MT for Higher Translation Quality

LORI THICKE

Everyone knows machine translation (MT) has enormous potential for dramatically reducing translation cost and increasing speed. But who thinks of MT as a way to improve quality?

My company has been ISO 9001-certified for the last decade, and our quest for quality has unexpectedly led us to MT. Along the way we’ve developed and tested a number of different processes for MT and discovered that correctly optimized MT can actually improve quality, at lower cost and with higher rates of productivity. Under the right conditions, MT actually breaks those compromises we’ve come to accept in the traditional localization paradigm. You may want price, speed and quality, but here’s the kicker: you only get to pick two out of three.

[Figure: The current localization paradigm (speed, price, quality).]

MT can offer all three. However, the truth is that for most people, quality MT is still an oxymoron. And who could blame them?

MT: always five years from perfection

Just about any of us with an internet connection has had first-hand experience with MT. We have probably used SYSTRAN to translate an e-mail or ProMT to give us the gist of a web page. We may have conversed with someone in another language via Google’s translation center, read Wikipedia in Thai thanks to Asia Online or solved an IT problem using Microsoft’s automatically translated knowledge base. Along the way, MT will have amused us with its inadvertent twisting of human language.

Most people would agree that “out-of-the-box” MT is far from what it is supposed to be: fully automatic quality translation (FAQT). This has been the promise held out to our industry since the very first MT system translated 49 Russian sentences into English using a 250-word vocabulary and six grammar rules. Fifty years later we’re still waiting. As Hans Fenstermacher of Translations.com says, “MT has been five years from perfection since 1952.”

It could be that our overwrought expectations for MT partially explain the slow uptake of MT by the translation industry. Against the benchmark of FAQT, MT is sure to disappoint.

For those resigned to the lack of quality with unoptimized MT, there’s always the unfortunately named FAUT (fully automatic useful translation). FAUT is essentially “gisting” translation, which is a more or less accurate approximation of the source text.

Today, gisting is overwhelmingly the use to which MT is being applied, and it accounts for even more words than are translated by humans. If the claim that MT translates more than humans seems outrageous, consider that an estimated 30 million e-mails are translated by MT every day.

For internauts, instantaneous gisting (gist-in-time) provides a basic understanding of an e-mail or a website. In the corporate space, gisting is used for legal discovery, for patent or technology searches, or to identify parts of larger corpora that merit being translated by a human. But how much gisting do we humans really need? Not much, as it turns out: for all the profusion of free, software-as-a-service and off-the-shelf MT solutions, commercial translations, which need more than gisting quality, are by and large assured by humans. For the vast majority of corporate needs, MT is staying on the shelf.

If FAQT is still “five years away” and FAUT is simply not that useful after all, the question is whether to wait for MT to catch up to our aspirations for it or to invest in processes that can optimize the MT we have today.

How MT improves quality

Once we stop waiting for quality MT to emerge fully clothed from the loins of a research and development lab somewhere, we can start to see MT for what it is: an efficient solution that can assist human translators by taking out a large part of the drudgery in translation.

The reality we are seeing every day is that for technical translations ranging from software to manuals to catalogs, quality MT is achievable. But like any relationship, you have to work at it. In fact, correctly optimized MT (that’s the “working at it” part) paired with human post-editors can actually improve quality. How could this be possible?

In the first place, correctly customized MT (customizing MT engines is a skill in itself) removes terminological inconsistencies. If the source document always uses the same term, so will the MT engine. This resolves the real problem of teams of translators working on the same project but employing divergent terminology. Across a large project, MT can also ensure a more consistent tone, with fewer stylistic discrepancies. Furthermore, MT removes that human element of non-quality: omissions. Enforced, validated terminology, consistency and completeness are MT’s strengths. But what about mistranslations? There’s no question that MT delivers more than its fair share of sentences that mangle the meaning of the source text.

This is where the post-editors come in. Working on a bitext format, a post-editor correcting MT output will frequently scrutinize texts more carefully than a reviewer working on human output. On large-volume localization projects, T + E + P (translate + edit + proof) as a process may be interpreted differently by different language service providers. T + E + P on a million-word project may consist of T + a sampling review of 10-20. The source text may or may not be consulted at the same time.


MT affords you no such luxury. Because MT can and does go completely off the rails from time to time, each and every segment must be examined in bitext format and approved or rewritten by a human post-editor. If only every translation received that type of attention!

This process for review and correction, if properly managed, should not only catch and fix the errors, but should also yield an accounting of what changes need to be made to the MT engine itself. This goes to the heart of any good quality system, such as ISO 9001: ensuring quality at the source (that is, catching errors at the beginning rather than correcting them downstream) and, crucially, instituting processes for continuous improvement.

Correcting systematic errors and then feeding these corrections back into the MT engine is what we call “the Virtuous Circle of MT Quality.” This, too, is an integral part of the optimization process.

What quality do you need?

But what quality is good enough? Any good process defines its quality expectations up front, and working with MT is no exception.

MT quality has been measured by the wrong yardsticks to the detriment of the elegant solution that MT can be when matched to the type of result needed. The question is not whether MT is “better” than a human translation on a given text. Rather, the question is what quality is necessary for a particular project and what process (human only, human + translation memory (TM), human + TM + MT) will best allow you to achieve that exact level of quality.

The 2008 version of the ISO 9001 standard introduces the idea of customer-defined quality to the international norm. This is an important distinction to make. Accuracy, consistency of style, correct terminology, spelling and punctuation, and completeness are all inarguably elements of a quality translation. But how much quality is required for a given situation? “Doesn’t read like a translation,” for example, is the type of quality that a marketing translation would need to have in buckets. We may not have a specific metric for defining marketing quality, but we sure know when it’s not there! But what about software? A catalog? E-learning courseware? A knowledge base? This is where the quality question begins to get more nuanced.

For software, quality may be defined as accurate, understandable and rapid enough for simship. For a catalog, correct terminology on each of thousands of items is paramount. For courseware, the material needs to promote learning. For a knowledge base, customers need to be able to resolve their problems without further recourse to the help desk staff.

Since MT allows you to calibrate the human effort (linguistic training, post-editing) that you put into achieving the quality levels you need, setting quality requirements in advance is an essential step. The example of online help and knowledge bases above demonstrates the importance of customer-defined quality. It’s well known that human reviewers will often designate only extremely high quality as acceptable. However, when the choice is between an imperfect translation and no translation (information available only in the original language), customers themselves weigh in heavily in favor of raw (that is, fully automatic) MT.

[Figure: Five factors influence increasing MT quality.]

Don DePalma of the Common Sense Advisory says, “Whether it’s FAQT, FAUT, or perfectly rendered output, the biggest decision that companies will have to make about machine translation is whether any of those are a worse alternative than no translation at all. Given the enormous volumes of content that companies and government should make available for other markets, for me and many of the organizations that we talk to, the quality question is ultimately a non-issue. What we call the ‘zero translation’ option of doing nothing means no information, service or customer satisfaction at all.”

Customers also report that support articles translated by MT are just about as effective in solving their problems as human localized content, and at a price far below what human translations would cost.

This is not about depriving translators of work. Human translations would not have been economically feasible for the hundreds of thousands of knowledge base articles in various languages (including Chinese, Japanese, Portuguese, French, German and Spanish) that Microsoft publishes online. This would have required an initial outlay of approximately $30 million per language, according to Microsoft itself, not including weekly updates. Instead, Microsoft chose its own hybrid MT system to translate content that would otherwise not have been translated. Measuring the results, the company found that across all languages, MT helped solve customer problems on average 23% of the time. This figure may seem low, but it’s only slightly below the success rate of 29% for human translation. Microsoft concluded at a presentation to the 11th Machine Translation Summit in Copenhagen, Denmark, in September 2007 that “customer satisfaction numbers for machine translated articles is comparable to and sometimes exceeds original English!”

Optimizing MT

Regardless of the quality level MT is to achieve, whether publishable quality or simply understandable quality, unoptimized MT is just not up to the job. While some sentences coming out of untrained MT engines may be stunningly good, others will be pure gibberish. And without effective training, there is no way to ensure that the terminology you want will be consistently applied by the MT engine.

Training, then, is the secret sauce of good MT, even more important than what system you choose, whether rule-based or statistical (see sidebar). This is also one of the areas that requires the greatest investment.

For statistical machine translation (SMT) systems, this training involves not only extensive corpora of bitext (think in terms of millions of segments), but also glossaries and monolingual texts. The more the better. Imagine Steven Spielberg’s little alien, ET, saying “Need more data.” That’s SMT in a nutshell.


However, and this is a big however, the data must be good, clean data, following the garbage-in-garbage-out truism. As Microsoft says, “we can never have enough clean, parallel data.” And it must be domain and client-specific data: no point training the system on EU corpora if you’re a car manufacturer.

In rule-based machine translation (RBMT), this training is even more specific, involving data mining to create domain-specific dictionaries for terminology entries, including “Do Not Translates” and graphic user interface strings. This expert training of the engine creates the grammatically coded glossaries that will do the work of imprinting in-house terminology on the system. This is actually trickier than it sounds and requires a linguist trained in MT’s idiosyncrasies to avoid inadvertently creating errors and making the output worse, rather than better. This can occur when terms are coded incorrectly (a verb as a noun) but also when terms are coded correctly without considering how the system will react to exceptions. If training the engine is the sine qua non of quality MT, it is also one of the greatest barriers because relatively few linguists know how to correctly tune MT systems, and few resources exist to tell them how to do it.

Upstream of the actual MT processing, another activity is important to optimizing MT output: controlled authoring, or language control of the source content. Long, convoluted sentences do not lend themselves to MT, no matter how well trained the system is. Authoring guidelines specify, for example, that technical writers use short, simple, declarative sentences, employ the active and not the passive voice, avoid parenthetical expressions in the middle of a sentence and so on. And while we humans may understand text that is rife with grammatical errors, no MT system will.

Where the source text already exists or where in-house documentation teams are resistant to applying the principles of controlled language for authoring, there is another solution. Using automatic normalization or running source text through a QA program may bring a noticeable improvement to the ability of your MT engine to understand and translate your text.

TM leveraging is another step in MT optimization. Even a well-trained MT engine is no replacement for the human translations contained in TMs, assuming they’re of high quality. It’s important therefore to develop the processes that will increase TM leveraging. Testing will provide information on the level of fuzzy match that should be discarded in favor of MT segments. However, it’s usually useful to make sure that new MT segments are identified as such to distinguish them from validated TM segments.

The capacity of MT to function as a standalone will depend on the quality required and on how well the engine is optimized through stringent training, ongoing maintenance, controlled authoring and so on. But for publishable quality, human post-editors are essential.

In this regard, MT can be seen as just another tool in the translator’s toolkit, much like any CAT tool, albeit one that’s more complex and expensive to set up. In optimizing MT, post-editors need to be trained in post-editing techniques, and they need to know what level of quality is expected. Besides post-editing, other post-production optimization techniques include use of QA tools, automatic post-editing through regular expressions, text normalization, updating of the TMs and so on. And above all, it is essential that there be ongoing tuning of the engine with new and modified terminology and error corrections in a continuous, virtuous cycle of feedback and improvement.

If all these processes, from pre-production to post-production, are instituted to optimize MT output, what kind of quality can be expected? Recently one of our clients, a major software publisher, noted in the report “Leveraging a crisis for innovation (or never let a good crisis go to waste)” that “contrary to all expectations, using MT in [our company] has improved the translation quality . . . with the reviewer commenting ‘It was nearly 9; it was the best translation of courseware I ever read.’”

It has long been believed that buyers of translation services must compromise. In the traditional localization paradigm, if you want speed and quality, you have to compromise on price; if you want speed and price, you have to compromise on quality. MT is often associated with a compromise of quality in favor of cost and turnaround improvements. However, the reality is that correctly optimized MT can break these compromises by offering faster throughput, lower costs and higher quality. But you have to work at it.

Sidebar: Rule-based versus statistical MT

There are two major streams in MT technology: rule-based MT (RBMT) and statistical MT (SMT). These two methods, espoused by various MT technology vendors, represent two different routes to the same place.

The earliest systems were rule-based, among them SYSTRAN. For the development of RBMT systems (SYSTRAN, ProMT, Lucy), various languages were broken down into their parts of speech and grammatical rules were hard coded, along with dictionaries. An RBMT system would never say un noir chat but un chat noir, coded, as it is, with the knowledge that adjectives follow nouns in French. Exceptions such as une vieille dame would also be coded in the system.

SMT, on the other hand (Google, Asia Online), uses an algorithm to parse vast numbers of bilingual sentences (preferably in the millions) in order to extrapolate relationships, including word order. Un chat noir would appear as the translation of a black cat if the system had seen that in the training phase. However, blissfully ignorant of the rules of grammar (with the exception of Asia Online), SMT would be likely to incorrectly translate a green cat as un vert chat because it wouldn’t have encountered any green cats, unless trained on Dr. Seuss.

Both RBMT and SMT systems have their advantages and disadvantages. Both are capable of delivering accurate, fluid sentences, depending on how they were trained. Both can also deliver utter gibberish, again depending on how they were trained. RBMT wins the day when you don’t have millions of words of training corpora; SMT is the victor when it comes to adding a new language pair, a major multiyear undertaking when preparing an RBMT system. Hybrid systems such as SYSTRAN’s are capable of bridging the gap between RBMT and SMT.
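The TM leveraging step described in the article (keep TM matches above a tested fuzzy-match level, send everything else to MT, and label which is which) might be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular CAT tool scores matches: the similarity measure is Python’s `difflib` ratio, the `threshold` default of 0.85 is an arbitrary placeholder for the level that testing would determine, and `choose_segment` and the `machine_translate` callback are hypothetical names.

```python
from difflib import SequenceMatcher


def fuzzy_score(a, b):
    """Similarity from 0.0 to 1.0 (difflib ratio, not a CAT-tool match score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def choose_segment(source, tm, machine_translate, threshold=0.85):
    """Return (target, origin) for one source segment.

    tm maps source segments to validated human translations.
    origin is "TM" when the best fuzzy match reaches the threshold,
    otherwise "MT", so new MT segments stay distinguishable from TM ones.
    """
    best_source, best_score = None, 0.0
    for tm_source in tm:
        score = fuzzy_score(source, tm_source)
        if score > best_score:
            best_source, best_score = tm_source, score
    if best_source is not None and best_score >= threshold:
        return tm[best_source], "TM"
    return machine_translate(source), "MT"


if __name__ == "__main__":
    tm = {"Click the Save button.": "Cliquez sur le bouton Enregistrer."}
    # Hypothetical stand-in for an MT engine call.
    mt = lambda text: "<MT> " + text
    for segment in ["Click the Save button.",
                    "Restart the server before installing updates."]:
        print(choose_segment(segment, tm, mt))
```

Carrying the origin label alongside each target segment is what lets post-editors give MT output the full bitext scrutiny the article calls for while treating validated TM segments differently.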


Controlled Authoring to Improve Localization

ULTAN Ó BROIN

Controlled authoring, broadly speaking, is the process of applying a set of predefined style, grammar, punctuation rules and approved terminology to content (documentation or software) during its development. Many companies offer some form of guidance to their content developers, either through tools or more ad hoc means, of course, so this may not seem at all remarkable. In the last few years, however, innovations in linguistic processing technology and its commoditization indicate that controlled authoring holds great potential for anyone seeking a tool-driven approach to maximizing returns from the localization process. This has particularly paved the way for the adoption of cost-effective machine translation (MT).

Controlled authoring and languages are complex, so this article concentrates on the localization-related aspects of introducing controlled authoring into an organization that must localize its content.

Controlled authoring itself is frequently conflated with other parts of the overall content delivery process, notably that of content management, a separate but contributing function. Isolating the nontechnical essence of controlled authoring is made all the more difficult by the range and interplay of tool functionality offered by various vendors. Whereas seemingly subtle distinctions do not always make a great deal of sense from an overall business process engineering viewpoint, it’s important to understand from a localization perspective just how controlled authoring technology works. For example, if the storage of objects in the content management system (CMS) allows reuse at a level higher than the translation memory (TM) segmentation does, localization savings are limited. It may be more helpful from a business requirements position to regard controlled authoring as an information quality process that consists of many different parts: data mining for rule and terminology research and creation, new terminology harvesting and rule development, reuse management, reporting on quality, and so on, rather than purely approved rule and term application during the actual text-editing phase.

Controlled languages

It’s not uncommon for organizations to have no serious control over their content style rules and terminology or to rely on manual processes, combining in-house guidelines with the commonly applicable rules and recommendations of sources such as The Chicago Manual of Style, while working off spreadsheets of terms and applying simple checks for consistency and using human editing to meet their “quality” requirements. For some this is acceptable; however, it is hardly a scalable, enforceable or measurable process. We’ve all seen the waste of many possible opportunities for localization efficiencies (let’s save the content development efficiencies for another audience) because manual enforcement and voluntary uptake of authoring guidelines allow for a good deal of subjectivity in interpretation and application. Controlled authoring is much more objective as the selection, application and enforcement of such guidance is programmatic. The application of rules “controls” the authoring, so to speak, allowing content developers to avail of the rules directly through the authoring user interface: looking up alternative words, phrases and terms, reusing already written phrases, harvesting and storing new ones, and reporting on the content’s compliance with the rules immediately or afterwards.

The origins of the controlled language concept are far from the needs of modern day localization, rather being designed to improve comprehension of the source language by simplifying matters for nonnative speakers of English (“human orientation”) or computers (“machine orientation”). Often, these nonnative readers worked in the maintenance and service field, an audience targeted by probably the best-known iteration of a controlled language: ASD-STE100 Simplified (Technical) English. The genesis of the controlled language concept can be traced back to Ogden’s Basic English from the 1930s and was established over the years through such developments as Caterpillar Technical English, Nortel Standard English, the Plain English Campaign, GM’s Controlled Automotive Service Language, Global English and so on.

The introduction of structured authoring through SGML and later XML, along with more innovations in linguistic processing and database storage, allowed for the development and application of targeted rules to meet customer requirements driven by content type and market, reflected by the ability to now apply a controlled authoring process through common authoring tools such as Microsoft Word, PTC Arbortext Editor and Adobe FrameMaker through plug-ins.

The use of an approved set of terminology, where each term has only one meaning in that context (consider the different translations for the out-of-context word job, for example), and clear and enforceable authoring rules allow writers to achieve a high degree of consistency in the source texts they create, not only in the words and terms they use, but how they use them. Consistency in constructing phrases, along with eliminating complexity, ambiguity and verbosity, is the key to maximizing TM use and MT potential (and large efficiencies on the production side).

What might these controlled language rules entail? Well, the number can vary, from as few as 10 to between 50 and 100, but the rules typically might relate to standardized spelling, length of sentence, number of clauses, use of active versus passive, simplifying tenses, rules for noun phrases, modifiers, syntactic cues, past participles, gerunds, avoidance of Latin phrases, slang and so on. I recommend John R. Kohl’s The Global English Style Guide if you need a valuable starting point and reference material for possible rules as well as Sharon O’Brien’s “Controlling Controlled English” paper (www.mt-archive.info/CLT-2003-Obrien.pdf) for recommendations on the rules central to content intended for MT.

Naturally, the rules vary by content type and audience. Gerunds may be acceptable in headings, but not main text without qualification; delimiters may not be required


on software strings but are required on messages or documentation, and so on. Thus, solutions that allow for forms of semantic checking have an advantage.

Controlled authoring solution: which one?

Commercially available controlled authoring technology solutions range from the more sophisticated, scalable technology-based solutions based on advanced linguistic processing to less complex content management-based offerings, “methodologies” and combinations of same. Possible options include acrolinx IQ, IAI CLAT, Author-it, Smart MAXit, Boeing Simplified English Checker, SDL AuthorAssistant, Shufra and more. For those interested in researching a controlled authoring option, possible sources of information are IDC reports, ELDA, International Journal of Language and Documentation, CLAW proceedings, Localisation Research Centre publications, DCU papers from Sharon O’Brien and MA research by Patrick Cadwell (www.localisation.ie/resources/Awards/Theses/PatrickCadwell_Thesis.pdf), the publications of Jeff Allen (www.geocities.com/controlled language), the MT, Localization Professional, and Information Quality groups on LinkedIn, and so on.

The decision as to which controlled authoring solution to adopt is driven by business requirements. Localization teams should ensure they’re involved in the identification of these, so come armed with facts and figures for the business case. Initial business requirements when applied to a range of solution possibilities might look something like Figure 1.

Requirements vary by organization, naturally. Prioritize and weight each point before making a decision among competing alternatives.

Figure 1: Business requirements can be weighed against a variety of solutions. The columns are Weighting, Solution 1, Solution 2 and Solution 3; the rows are:
• Price
• NLP-level verification of terms, grammar and style according to our requirements
• Prompting of writer to reuse existing segments from CMS
• Scalability (multiple users, concurrent users, performance)
• Easy maintenance of rules by existing, in-house resources
• Percentage of existing rules from style guide that can be automated
• Integration with existing translation glossaries and exchange formats
• New terminology harvesting
• Basic rule set supports translation memory requirements
• Basic rule set supports machine translation readiness
• Automatic reporting on quality in batch and single mode
• Interactive quality assurance through editing environment
• Allows prioritization of rules for grandfathering of content
• Supports multiple rules and terms by content type
• Plug-ins and integrations for existing authoring tools
• Automatic indexing capability
• Customer references include MT and TM savings
• Open standards or proprietary architecture
• Established user group and conference
• Global 24 x 7 support

Benefits to localization

The clear benefits of controlled authoring in the localization space are derived from the improved source quality making content easier to translate by humans and machine. It’s a fundamental recognition that the basic internationalization concept of assuring translatability and high-quality source content results in greater savings accruing to the organization at the localization stage than trying to continually negotiate lower prices with vendors or praying for a quantum leap in translation technology to turn garbage source into localized gospel. Efficiencies are magnified in a one-to-many relationship as the number of languages translated increases.

Controlled authored content is consistently expressed in an understandable way. This results in translators not needing clarifications, maximizing TM matches, eliminating the need for terminology creation after localization starts, and providing texts more easily processed by MT, cutting down on post-editing needs and recalibrations. Volumes, too, are generally smaller, reducing cost and time-to-localize per se.

Bear in mind, however, that these savings are a function of the rules created, as well as how and when the texts are translated. Overarching internationalization rules also affect the efficiencies, as does the technical review of the source text by domain experts. If a switch is documented as being “off” when it should be “on,” then don’t expect controlled authoring to eliminate any language version testing issues.

Do you need controlled authoring technology in order to use MT? The simple answer is “no.” But if you need a scalable approach to ensure your source text meets realistic MT business requirements by providing easily processed source text that minimizes the need for post-editing, thus making MT cost-effective, then controlled authoring technology is a must-have. Moving past the “writing for translation guidelines” approach is the way to go here.

The business case for controlled authoring: the big picture

It’s often said that the biggest risk to the introduction of controlled authoring, other than the cost (nontrivial even at the best of times), is the political one. The term controlled authoring itself must be found guilty on all counts of contributing to the problem of user acceptance as it conjures up images of mass layoffs, stilted, boring texts, loss of control by authors, inducing an immediate negative reaction, mostly based on understandable fear and ignorance, frequently exacerbated by a belief


that controlled authoring can somehow offer automatic creation of content, and a narrow focus on just localization benefits. There are ways of dealing with these issues too, beyond the scope of this article.

In general then, beyond the clear hard-sell on the TM and MT front, localization cost and time-to-market savings, localization teams can emphasize the quality aspects of the source content for native speakers too: superior user experience, consistent terminology, fewer support calls, improved accessibility and so on. Leverage the global user experience, not just the localized one. It should be pointed out there are controlled authoring solutions for Japanese, German and so on, so do not assume it is an English-only concept now, whatever the origins.

Introduction and changing processes: localization’s role
Introduction of controlled authoring requires a serious management decision as to timing, not least the provision of a significant budget. However, research would indicate that using pilot projects to develop the process and achieve maximum buy-in by the stakeholders is key, as is using training techniques that rely less on computer science and linguistics and more on content development approaches.

In general, localization groups might consider the following when faced with the opportunity to introduce controlled authoring:
• Identify a localization strategy for TM and MT tools and how controlled authoring business requirements fit into this.
• Help kick-start the controlled authoring process of adoption and pilot projects by providing rules and terminology already harvested to the implementers of the technology.
• Because the creation of rules and terminology is central to controlled authoring and to the impact on localization tools, it is critical that localization groups remain visible and active as stakeholders in their development and maintenance over time.
• Recognize the best kind of texts: large volumes of structured, technical, procedural texts such as software and user assistance strings or online documentation. These texts require a consistent user experience between components. Seeking a controlled authoring solution for a few thousand words of marketing material would not be a strong business case!
• Prioritize rules. Decide which ones are more important to you than others. Aim for automatable and therefore measurable ones. For example, a rule called “one strong idea per sentence” is not automatable, whereas repeating the noun instead of switching it for a pronoun or checking for the passive voice is.
• Look for leverage points between localization and authoring teams. Many rules for localization maximization are obviously ones that should be applied to text even if never intended for localization in the first place. Other, more “severe” MT rules may not be optimal for the source language depending on the user experience required. For example, text intended for mobile applications may be fragmented, clipped, dropping articles and so on for user experience reasons. Be prepared for compromise. Err on the side of user experience trumping localization unless it’s a complete showstopper.
• One particular challenge to the introduction of controlled authoring can also come from localization groups themselves: the disruption of TM match rates for previously localized content. This requires careful management. Solutions include the introduction of controlled authoring on new content yet to be localized, phased introductions based on content that is going to change anyway, grandfathering of content that has shown little change over years, or a reassessment as to how a one-time hit on localization assets results in longer term cost savings, time-to-market improvements and quality uptake.
• Localization group input to the rule creation process must be matched by an evaluation of the localized source output, too, iteratively maximizing returns through the fine-tuning of rules. An MT pilot makes a fine adjunct to a controlled authoring pilot. Provide content development teams with the feedback, qualitative and quantitative.

Acknowledgements
Publicly available sources from the following were used in this article: Patrick Cadwell (DCU), Sharon O’Brien (DCU), Jeff Allen (Translations.com), Uwe Muegge (Muegge.cc), Jon Kohl (SAS), Andres Heuberger (ForeignExchange Translations) and Fred Hollowood (Symantec).
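The advice to prioritize and weight each requirement in Figure 1 before choosing among competing solutions can be sketched as a simple weighted score. This is an illustration only: the four requirement names come from Figure 1, but the weights, the 0 to 5 scores and the function names are hypothetical placeholders, not figures from the article or from any vendor evaluation.

```python
# Hypothetical importance weights (1 = low, 5 = high) for a few Figure 1 rows.
WEIGHTS = {
    "Price": 3,
    "New terminology harvesting": 2,
    "Basic rule set supports machine translation readiness": 5,
    "Global 24 x 7 support": 1,
}

# Hypothetical scores (0 = not met, 5 = fully met) per candidate solution.
SCORES = {
    "Solution 1": {
        "Price": 4,
        "New terminology harvesting": 2,
        "Basic rule set supports machine translation readiness": 3,
        "Global 24 x 7 support": 5,
    },
    "Solution 2": {
        "Price": 2,
        "New terminology harvesting": 4,
        "Basic rule set supports machine translation readiness": 5,
        "Global 24 x 7 support": 3,
    },
    "Solution 3": {
        "Price": 5,
        "New terminology harvesting": 1,
        "Basic rule set supports machine translation readiness": 2,
        "Global 24 x 7 support": 2,
    },
}


def weighted_total(weights, scores):
    """Sum weight * score over every requirement; a missing score counts as 0."""
    return sum(weight * scores.get(req, 0) for req, weight in weights.items())


def rank_solutions(weights, all_scores):
    """Return (solution, total) pairs sorted from highest to lowest total."""
    totals = {name: weighted_total(weights, s) for name, s in all_scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_solutions(WEIGHTS, SCORES):
        print(f"{name}: {total}")
```

With these made-up numbers the heavily weighted MT-readiness row decides the ranking, which is the point of weighting: a cheaper solution does not win if it scores poorly on the requirement the organization cares about most.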

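The distinction drawn above between automatable and non-automatable rules can be made concrete with a rough sketch. The two checks below (passive voice, and a pronoun used where the noun could be repeated) are deliberately crude regular-expression heuristics chosen for illustration. They stand in for the linguistic processing a real controlled authoring tool would perform and will produce false positives and misses. The rule names and word lists are assumptions, not taken from any product.

```python
import re

# Crude passive-voice heuristic: a form of "to be" followed by a word ending
# in -ed, or by one of a few common irregular participles (assumed list).
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+"
    r"(?:\w{2,}ed|written|given|taken|shown|done|made|sent|set)\b",
    re.IGNORECASE,
)

# Pronouns that may hide the noun a translator or MT engine needs (assumed list).
PRONOUNS = re.compile(r"\b(?:it|they|them|this|these|those)\b", re.IGNORECASE)


def check_sentence(sentence):
    """Return the names of the rules the sentence appears to break."""
    findings = []
    if PASSIVE.search(sentence):
        findings.append("passive-voice")
    if PRONOUNS.search(sentence):
        findings.append("pronoun-instead-of-noun")
    return findings


def check_text(text):
    """Split text into sentences naively and list sentences with findings."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    report = []
    for sentence in sentences:
        findings = check_sentence(sentence)
        if findings:
            report.append((sentence, findings))
    return report


if __name__ == "__main__":
    sample = "The file is saved by the application. The application saves the file. Close it."
    for sentence, findings in check_text(sample):
        print(f"{findings}: {sentence}")
```

Because each check either fires or does not, the number of findings per thousand words can be counted and tracked, which is what makes such rules measurable in a way that “one strong idea per sentence” is not.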

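The concern about disrupting TM match rates for previously localized content can also be estimated before a rule set is rolled out. The sketch below compares rewritten source segments against the segments already in a translation memory. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for a TM tool's fuzzy-match score. Commercial TM tools calculate matches differently, so the percentages are indicative only, and the function names and sample segments are hypothetical.

```python
from difflib import SequenceMatcher


def match_percent(new_segment, tm_segment):
    """Character-level similarity as a percentage (stand-in for a TM fuzzy score)."""
    return round(100 * SequenceMatcher(None, new_segment, tm_segment).ratio())


def best_match(new_segment, tm_segments):
    """Return (score, segment) for the closest TM entry, or (0, None) if none."""
    best = (0, None)
    for segment in tm_segments:
        score = match_percent(new_segment, segment)
        if score > best[0]:
            best = (score, segment)
    return best


if __name__ == "__main__":
    # Hypothetical TM content and a controlled-language rewrite of one segment.
    tm = ["The Save button should be clicked.", "Open the file."]
    for segment in ["Open the file.", "Click the Save button."]:
        score, source = best_match(segment, tm)
        print(f"{score:3d}%  {segment!r}  <-  {source!r}")
```

Running a comparison like this over a sample of rewritten content gives localization groups a figure to bring to the discussion of phased introduction or grandfathering.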



An invitation to subscribe to MultiLingual

This guide is a component of the magazine MultiLingual. The ever-growing easy international access to information, services and goods underscores the importance of language and culture awareness. What issues are involved in reaching an international audience? Are there technologies to help? Who provides services in this area? Where do I start?

Savvy people in today’s world use MultiLingual to answer these questions and to help them discover what other questions they should be asking.

MultiLingual’s eight issues a year are filled with news, technical developments and language information for people who are interested in the role of language, technology and translation in our twenty-first-century world. A ninth issue, the Resource Directory and Index, provides listings of companies in the language industry and an index to the previous year’s content.

Two issues each year include Getting Started Guides such as this one, which are primers for moving into new territories both geographically and professionally.

The magazine itself covers a multitude of topics.

Translation
How are translation tools changing the art and science of communicating ideas and information between speakers of different languages? Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual.

Language technology
From multiple keyboard layouts and input methods to Unicode-enabled operating systems, language-specific encodings, systems that recognize your handwriting or your speech in any language, language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level, changing the requirements for international software and changing how business is done all over the world. MultiLingual is your source for the best information and insight into these developments and how they will affect you and your business.

Global web
Every website is a global website, and even a site designed for one country may require several languages to be effective. Experienced web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep the site current. Whether you use the internet and worldwide web for e-mail, for purchasing services, for promoting your business or for conducting fully international e-commerce, you’ll benefit from the information and ideas in each issue of MultiLingual.

Managing content
How do you track all the words and the changes that occur in a multilingual website? How do you know who’s doing what and where? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management and global management systems (CMS and GMS), customer relations management (CRM) and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they work together.

Internationalization
Making software ready for the international market requires more than just a good idea. How does an international developer prepare a product for multiple locales? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? Elements such as date and currency formats sound like simple components, but developers who ignore the many international variants find that their products may be unusable. You’ll find sound ideas and practical help in every issue.

Localization
How can you make your product look and feel as if it were built in another country for users of that language and culture? How do you choose a localization service vendor? Developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

And there’s much more
Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of computing in various languages, explain and update encoding schemes, and evaluate software and systems. Other articles focus on particular countries or regions; specific languages; translation and localization training programs; the uses of language technology in specific industries: a wide array of current topics from the world of multilingual computing.

If you are interested in reaching an international audience in the best way possible, you need to read MultiLingual.

Subscribe to MultiLingual at www.multilingual.com/subscribe
