Translation Memory: State of the Technology Fuzzy Matching In
MultiLingual
Language | Technology | Business
#90 Volume 18 Issue 6, September 2007

Translation memory: state of the technology
Fuzzy matching in theory and practice
A rule-based environment for Swahili development

Up Front
www.multilingual.com
Post Editing

News
News
Calendar

Reviews
Logoport. Reviewed by Ignacio Garcia & Vivian Stevenson
The Defence of French: A Language in Crisis? Reviewed by Fabien Côté

Columns and Commentary
Off the Map — Tom Edwards
World Savvy — John Freivalds
Takeaway — Ultan Ó Broin

Feature Articles

Tech
Translation memory: state of the technology — Jost Zetzsche
What’s next for TMS? — Benjamin B. Sargent
CAT tools and standards: a brief summary — Yves Savourel
Fuzzy matching in theory and practice — Richard Sikes
The conveyor belt approach and terminology management — Christie Fidura
Automating MT post-editing using regular expressions — Rafael Guzmán

Languages
A rule-based environment for Swahili development — Arvi Hurskainen
Open-source software for South African languages — Linda Martindale

Industry Focus
E-government — citizen access in the US — Earl Mardle

Basics
Buyer’s Guide
Advertiser Index

About the Cover
The hand-carved woodcut print block was used to print labels for fabrics produced for export in the nineteenth-century cotton mill located in Gravensteen Castle (Castle of the Counts) in the center of old Ghent, Belgium.

on the web at www.multilingual.com

Downloads — Free internationalization course
Have you wondered about software internationalization but weren’t quite sure where to start? We have the information for you — at no cost! A course on this topic created by G. Watson Internationalization Services can now be downloaded from www.multilingual.com
The materials cover a range of topics, including general internationalization issues, C, C++, Java, International Components for Unicode and testing issues. These materials have been used to deliver commercial, instructor-led courses. Each topic was covered in a half-day course and includes between 100 and 150 slides.
Because these course materials are being placed in the public domain, they can be used for any purpose without obligation.
Download the course free at www.multilingual.com/internationalizationCourseMaterial.php

Downloads — Getting Started Guides
All of our Getting Started Guides are available to readers for free download. You may download a print-quality (larger size) or screen-quality PDF file of each of our 23 guides, including our latest, Getting Started Guide: South America. These guides are valuable introductory overviews to topics such as localization, writing for translation, internationalization, and different geographic regions.
Download guides free at www.multilingual.com/gsg

How to use www.multilingual.com
GO TO the home page to see daily news updates and links to new website content as well as current job postings.
MANAGE your print or digital subscription at www.multilingual.com/subscribe
FIND a technology or service by searching our database of more than 1,600 industry resources at www.multilingual.com/industryResources
CHECK OUT CURRENT THOUGHTS from the MultiLingual editorial board at www.multilingualblog.com
PLAN your travels by checking the calendar of events at www.multilingual.com/calendar

MultiLingual #90 Volume 18 Issue 6 September 2007
Editor-in-Chief, Publisher: Donna Parrish
Managing Editor: Laurel Wagers
Translation Dept. Editor: Jim Healey
Copy Editor: Cecilia Spence
News: Kendra Gray
Illustrator: Doug Jones
Production: Sandy Compton
Cover Photograph: Doug Jones
Webmaster: Aric Spence
Assistant: Shannon Abromeit
Intern: Callie Welch
Circulation: Terri Jadick
Advertising Director: Jennifer Del Carlo
Advertising: Kevin Watson, Bonnie Merrell
Editorial Board: Jeff Allen, Julieta Coirini, Bill Hall, Aki Ito, Nancy A. Locke, Ultan Ó Broin, Angelika Zerfaß

Advertising: [email protected], www.multilingual.com/advertising, 208-263-8178
Subscriptions, back issues, customer service: [email protected], www.multilingual.com/subscribe
Submissions, letters: [email protected]
Editorial guidelines are available at www.multilingual.com/editorialWriter
Reprints: [email protected]

MultiLingual Computing, Inc.
319 North First Avenue, Suite 2
Sandpoint, Idaho 83864-1495 USA
[email protected]
www.multilingual.com

© MultiLingual Computing, Inc. All rights reserved. Reproduction without permission is prohibited. For reprints and e-prints, please e-mail [email protected] or call 208-263-8178.
MultiLingual (ISSN 1523-0309), September 2007, is published monthly except Jan-Feb, Apr-May, Jul-Aug, Oct-Nov for US $58, international $85 per year by MultiLingual Computing, Inc., 319 North First Avenue, Suite 2, Sandpoint, ID 83864-1495. Periodicals postage paid at Sandpoint, ID and additional mailing offices. POSTMASTER: Send address changes to MultiLingual, 319 North First Avenue, Suite 2, Sandpoint, ID 83864-1495.

Post Editing
Laurel Wagers

Carrying on

If July was watching for rain on the Fourth, catching the scent of fresh-cut hay along a back road, reading purely for pleasure (okay, it was the Harry Potter finale), sunshine and fresh air, August is a different story. More than a million acres in Idaho and Montana are on fire, the smoke invading and settling into river valleys, the sun red, the air “unhealthy” in mountain towns. Fire crews carry on against the odds, and at 92°F, everyone hopes for cool nights and the relief of rain — for September.

Julieta Coirini in Argentina calls August a “transition month.” “Nothing really interesting or extreme tends to happen in the middle of the year,” she says, but this August is a cold one with snow all around, warming into the 40s°F, 8° to 10°C. Political campaigns heat up as the country approaches a spring (October) presidential election.

Worlds away, translators are facing dangerous environments — with Iraq at the top of the list.
“Translators who work for the US military in Iraq are as important to the overall mission as armor is to soldiers who patrol the streets of Baghdad — both are essential,” says a writer in the US military newspaper Stars and Stripes. (Note: translators don’t get body armor.) Last year, 50 of them received US visas. Many more than that have been killed. The Iraqi parliament and the US Congress are on vacation while soldiers, interpreters and “ordinary people” carry on, and Baghdad is “cooling down” to 87°F — 31°C — at night.

Congressional Quarterly (July 30) says the US government “is still struggling” to find qualified personnel such as linguists, as we’ve heard before. And once they’re hired, “it takes between five and seven years to fully develop an intelligence analyst,” says Ronald P. Sanders of the Office of the Director of National Intelligence. However sophisticated the technology they use may become, people with language skills are needed — teachers, translators, analysts and trainers. Carry on.

In this issue

We’re focusing on the tech side of the language industry — mostly translation tools — and who better to provide an overview of the current trends in tools than Jost Zetzsche? Benjamin B. Sargent and Yves Savourel add forward-looking commentary and a summary of standards. Then Richard Sikes explains how fuzzy matching works, Rafael Guzmán shows how to use regular expressions in post-editing Spanish machine translation, and Christie Fidura addresses the conveyor belt approach to terminology management.

In a look at African languages, Arvi Hurskainen describes the development of tools for working with Swahili, and Linda Martindale tells the story of a nonprofit organization building localized software for the languages of South Africa. Earl Mardle continues his examination of trends in e-government with a look at citizen access in the United States.
Tom Edwards makes a strong case for geocultural literacy, John Freivalds illustrates the differences between two localization approaches, and Ultan Ó Broin explores social translation in his Takeaway. Ignacio Garcia and Vivian Stevenson review Lionbridge’s Logoport, and Fabien Côté reviews the book The Defence of French: A Language in Crisis?

Language technology, translation and international software — the tools and activities of the language industry — are helping to change our world for the better. We’re planning an issue with a focus on those projects, and, if you know of one, please pass the word to [email protected] so that we may include it.