
Publisher: Routledge. Informa Ltd, registered in England and Wales, registered number 1072954. Registered office: 5 Howick Place, London SW1P 1WG, UK

The Routledge Encyclopedia of Translation Technology

Chan Sin-wai

Computer-Aided Translation

Publication details: https://www.routledgehandbooks.com/doi/10.4324/9781315749129.ch3
Ignacio Garcia
Published online on: 03 Nov 2014

How to cite: Ignacio Garcia. 03 Nov 2014, Computer-Aided Translation, from: The Routledge Encyclopedia of Translation Technology, Routledge. Accessed on: 01 Oct 2021. https://www.routledgehandbooks.com/doi/10.4324/9781315749129.ch3


3

COMPUTER-AIDED TRANSLATION

Systems

Ignacio Garcia
University of Western Sydney, Australia

Introduction

Computer-aided Translation (CAT) systems are software applications created with the specific purpose of facilitating the speed and consistency of human translators, thus reducing the overall costs of translation projects while maintaining the earnings of contracted translators and an acceptable level of quality. At its core, every CAT system divides a text into ‘segments’ (normally sentences, as defined by punctuation marks) and searches a bilingual memory for identical (exact match) or similar (fuzzy match) source and translation segments. Search and recognition of terminology in analogous bilingual glossaries are also standard. The corresponding search results are then offered to the human translator as prompts for adaptation and reuse.

CAT systems were developed from the early 1990s to respond to the increasing need of corporations and institutions to target products and services toward other languages and markets (localization). Sheer volume and tight deadlines (simultaneous shipment) required teams of translators to work concurrently on the same source material. In this context, the ability to reuse vetted translations and consistently apply the same terminology became vital. Once restricted to technical translation and large localization projects in the nineties, CAT systems have since expanded to cater for most types of translation, and most translators, including non-professionals, can now benefit from them.

This overview of CAT systems includes only those computer applications specifically designed with translation in mind. It does not discuss word processors, spelling and grammar checkers, and other electronic resources which, while certainly of great help to translators, have been developed for a broader user base. Nor does it include applications such as concordancers which, although potentially incorporating features similar to those in a typical CAT system, have been developed for computational linguists.

Amongst the general class of translation-focused computer systems, this overview will centre only on applications that assist human translators by retrieving human-mediated solutions, not those that can fully provide a machine-generated version in another language. Such Machine Translation (MT) aids will be addressed only in the context of their growing presence as optional adjuncts in modern-day CAT systems.

CAT systems fundamentally enable the reuse of past (human) translation held in so-called translation memory (TM) databases, and the automated application of terminology held in terminology databases. These core functionalities may be supplemented by others such as alignment tools, to create TM databases from previously translated documents, and term extraction tools, to compile searchable term bases from TMs, bilingual glossaries, and other documents. CAT systems may also assist in extracting the translatable text out of heavily tagged files, and in managing complex translation projects with large numbers and types of files, translators and language pairs while ensuring basic linguistic and engineering quality assurance.

CAT systems have variously been known in both the industry and literature as CAT tools, TM, TM tools (or systems or suites), translator workbenches or workstations, translation support tools, or latterly translation environment tools (TEnTs). Despite describing only one core component, the vernacular term of TM has been widely employed: as a label for a human-mediated process, it certainly stands in attractive and symmetrical opposition to MT. Meanwhile, the CAT acronym has been considered rather too catholic in some quarters, for encompassing strict translation-oriented functionality plus other more generic features (word processing, spell checking etc.).

While there is presently no consensus on an ‘official’ label, CAT will be used here to designate the suites of tools that translators will commonly encounter in modern workflows. Included within this label will be the so-called localization tools – a specific sub-type which focuses on the translation of software user interfaces (UIs), rather than the ‘traditional’ user help and technical text. Translation Memory or TM will be used in its actual and literal sense as the database of stored translations.

Historically, CAT system development was somewhat ad hoc, with most concerted effort and research going into MT instead. CAT grew organically, in response to the democratization of processing power (personal computers as opposed to mainframes) and perceived need, with the pioneer developers being translation agencies, corporate localization departments, and individual translators. Some systems were built for in-house use only, others to be sold.

Hutchins’ Compendium of Translation Software: Directory of Commercial Machine Translation Systems and Computer-aided Translation Support Tools lists (from 1999 onwards) ‘all known systems of machine translation and computer-based translation support tools that are currently available for purchase on the market’ (Hutchins 1999–2010: 3). In this Compendium, CAT systems are included under the headings of ‘Terminology management systems’, ‘Translation memory systems/components’ and ‘Translator workstations’. By January 2005, said categories boasted 23, 31 and 9 products respectively (with several overlaps), and although a number have been discontinued and new ones created, the overall figures have not changed much during the past decade. Some Compendium entries have left a big footprint in the industry while others do not seem to be used outside the inner circle of their developers.

The essential technology, revolving around sentence-level segmentation, was fully developed by the mid-1990s. The offerings of leading brands would later increase in sophistication, but for over a decade the gains centred more on stability and processing power than any appreciably new ways of extracting extra language-data leverage. We refer to this as the classic period, discussed below in the next section. From 2005 onwards, a more granular approach towards text reuse emerged; the amount of addressable data expanded, and the potential scenarios for CAT usage widened. These new trends are explored in the Current CAT Systems section.

Classic CAT systems (1995–2005)

The idea of computers assisting the translation process is directly linked to the development of MT, which began c. 1949. Documentary references to CAT, as we understand it today, are already found in the Automatic Language Processing Advisory Committee (ALPAC) report of 1966, which halted the first big wave of MT funding in the United States. In that era of vacuum tube mainframes and punch-cards, the report understandably found that MT (mechanical translation, as it was mostly known then) was a more time-consuming and expensive process than the traditional method, then frequently facilitated by dictation to a typist. However, the report did support funding for Computational Linguistics, and in particular what it called the ‘machine-aided human translation’ then being implemented by the Federal Armed Forces Translation Agency in Mannheim. A study included in the report (Appendix 12) showed that translators using electronic glossaries could reduce errors by 50 per cent and increase productivity by over 50 per cent (ALPAC 1996: 26, 79–86).

CAT systems grew out of MT developers’ frustration at being unable to design a product which could truly assist in producing faster, cheaper and yet still useable translation. While terminology management systems can be traced back to Mannheim, the idea of databasing translations per se did not surface until the 1980s. During the typewriter era, translators presumably kept paper copies of their work and simply consulted them when the need arose. The advent of the personal computer allowed document storage as softcopy, which could be queried in a more convenient fashion. Clearly, computers might somehow be used to automate those queries, and that is precisely what Kay ([1980] 1997: 323) and Melby (1983: 174–177) proposed in the early 1980s.

The Translation Support System (TSS) developed by ALPS (Automated Language Processing Systems) in Salt Lake City, Utah, in the mid-1980s is considered the first prototype of a CAT system. It was later re-engineered by INK Netherlands as TextTools (Kingscott 1999: 7). Nevertheless, while the required programming was not overly complicated, the conditions were still not ripe for the technology’s commercialization.

By the early 1990s this had changed: micro-computers with word processors displaced the typewriter from the translators’ desks. Certain business-minded and technologically proficient translators saw a window of opportunity. In 1990, Hummel and Knyphausen (two German entrepreneurs who had founded Trados in 1984 and had already been using TextTools) launched their MultiTerm terminology database, with the first edition of the Translator’s Workbench TM tool following in 1992. Also in 1992, IBM Deutschland commercialized its in-house developed Translation Manager 2, while large language service provider STAR AG (also German) launched its own in-house system, Transit, onto the market (Hutchins 1998: 418–419).

Similar products soon entered the arena. Some, such as Déjà Vu, first released in 1993, still retain a profile today; others such as the Eurolang Optimiser, well-funded and marketed at its launch (Brace 1992), were shortly discontinued. Of them all, it was Trados – thanks to successful European Commission tender bids in 1996 and 1997 – that found itself the tool of choice of the main players and, thus, the default industry standard.

By the mid-1990s, translation memory, terminology management, alignment tools, file conversion filters and other features were all present in the more advanced systems. The main components of that technology, which would not change much for over a decade, are described below.

The editor

A CAT system allows human translators to reuse translations from translation memory databases, and apply terminology from terminology databases. The editor is the system front-end that translators use to open a source file for translation, and query the memory and terminology databases for relevant data. It is also the workspace in which they can write their own translations if no matches are found, and the interface for sending finished sentence pairs to the translation memory and terminology pairs to the term base.

Some classic CAT systems piggy-backed their editor onto third-party word processing software, typically Microsoft Word. Trados and Wordfast were the best known examples during this classic period. Most, however, decided on a proprietary editor. The obvious advantage of using a word-processing package such as Word is that users would already be familiar with its environment. The obvious disadvantage, however, is that if a file could not open normally in Word, then it could not be translated without prior processing in some intermediary application capable of extracting its translatable content. A proprietary editor already embodies such an intermediate step, without relying on Word to display the results.

Whether bolt-on or standalone, a CAT system editor first segments the source file into translation units, enabling the translator to work on them separately and the program to search for matches in the memory. Inside the editor window, the translator sees the active source segment displayed together with a workspace into which the system will import any hits from the memory and/or allow a translation to be written from scratch. The workspace can appear below (vertical presentation) or beside (horizontal or tabular presentation) the currently active segment.

The workflow for classic Trados in both its configurations, as a Word macro and the later proprietary ‘TagEditor’, is the model for vertical presentation. The translator opens a segment, translates with assistance from matches if available, then closes this segment and opens the next. Any TM and glossary information relevant to the open segment appeared in a separate window, called Translator’s Workbench. The inactive segments visible above and below the open segment provided the translator with co-text. Once the translation was completed and edited, the result was a bilingual (‘uncleaned’) file requiring ‘clean up’ into a monolingual target-language file. This model was followed by other systems, most notably Wordfast.

When the source is presented in side-by-side, tabular form, Déjà Vu being the classic example, the translator activates a particular segment by placing the cursor in the corresponding cell; depending on the (user adjustable) search settings, the most relevant database information is imported into the target cell on the right, with additional memory and glossary data presented either in a sidebar or at the bottom of the screen.

Independently of how the editor presents the translatable text, translators work either in interactive mode or in pre-translation mode. When using their own memories and glossaries they most likely work in interactive mode, with the program sending the relevant information from the databases as each segment is made ‘live’. When memories and glossaries are provided by an agency or the end client, the source is first analysed against them and then any relevant entries are either sorted and sent to the translators, or directly inserted into the source file in a process known as pre-translation. Translators apparently prefer the interactive mode but, during this period, most big projects involved pre-translation (Wallis 2006).

The translation memory

A translation memory or TM, the original coinage attributed to Trados founders Knyphausen and Hummel, is a database that contains past translations, aligned and ready for reuse in matching pairs of source and target units. As we have seen, the basic database unit is called a segment, and is normally demarcated by explicit punctuation – it is therefore commonly a sentence, but can also be a title, caption, or the content of a table cell.

A typical TM entry, sometimes called a translation unit or TU, consists of a source segment linked to its translation, plus relevant metadata (e.g. time/date and author stamp, client name, subject matter, etc.). The TM application also contains the algorithm for retrieving a matching translation if the same or a similar segment arises in a new text.

When the translator opens a segment in the editor window, the program compares it to existing entries in the database:

• If it finds a source segment in the database that precisely coincides with the segment the translator is working on, it retrieves the corresponding target as an exact match (or a 100 per cent match); all the translator need do is check whether it can be reused as-is, or whether some minor adjustments are required for potential differences in context.
• If it finds a databased source segment that is similar to the active one in the editor, it offers the target as a fuzzy match together with its degree of similarity, indicated as a percentage and calculated on the Levenshtein distance, i.e. the minimum number of insertions, deletions or substitutions required to make it equal; the translator then assesses whether it can be usefully adapted, or if less effort is required to translate from scratch; usually, only segments above a 70 per cent threshold are offered, since anything less is deemed more distracting than helpful.
• If it fails to find any stored source segment exceeding the pre-set match threshold, no suggestion is offered; this is called a no match, and the translator will need to translate that particular segment in the conventional way.

How useful a memory is for a particular project will not only depend on the number of segments in the database (simplistically, the more the better), but also on how related they are to the source material (the closer, the better). Clearly, size and specificity do not always go hand-in-hand.

Accordingly, most CAT tools allow users to create as many translation memories as they wish – thereby allowing individual TMs to be kept segregated for use in specific circumstances (a particular topic, a certain client, etc.), and ensuring internal consistency. It has also been common practice among freelancers to periodically dump the contents of multiple memories into one catch-all TM, known in playful jargon as a ‘big mama’.

Clearly, any active TM is progressively enhanced because its number of segments grows as the translator works through a text, with each translated segment sent by default to the database. The more internal repetition, the better, since as the catch cry says ‘with TM one need never translate the same sentence twice’. Most reuse is achieved when a product or a service is continually updated with just a few features added or altered – the ideal environment being technical translation (Help files, manuals and documentation), where consistency is crucial and repetition may be regarded stylistically as a virtue rather than a vice.

There have been some technical variations of strict sentence-based organization for the memories. Star-Transit uses file pairs as reference materials indexed to locate matches. Canadian developers came up with the concept of bi-texts, linking the match not to an isolated sentence but to the complete document, thus providing context. LogiTerm (Terminotix) and MultiTrans (MultiCorpora) are the best current examples, with the latter referring to this as Text Based (rather than sentence-based) TM. In current systems, however, the lines of differentiation between stress on text or sentence are blurred, with conventional TM indicating also when an exact match comes from the same context by naming it, depending on the brand, context, 101%, guaranteed or perfect match, and text-based TM able to import and work with sentence-based memories. All current systems can import and export memories in Translation Memory eXchange (TMX) format, an open XML standard created by OSCAR (Open Standards for Container/Content Allowing Re-use), a special interest group of LISA (Localization Industry Standards Association).

The terminology feature

To fully exploit its data-basing potential, every CAT system requires a terminology feature. This can be likened conceptually to the translation memory of reusable segments, but instead functions at term level by managing searchable/retrievable glossaries containing specific pairings of source and target terms plus associated metadata.

Just as the translation memory engine does, the terminology feature monitors the currently active translation segment in the editor against a database – in this case, a bilingual glossary. When it detects a source term match, it prompts with the corresponding target rendering. Most systems also implement some fuzzy terminology recognition to cater for morphological inflections.

As with TMs, bigger is not always better: specificity being equally desirable, a glossary should also relate as much as possible to a given domain, client and project. It is therefore usual practice to compile multiple term bases which can be kept segregated for designated uses (and, of course, periodically dumped into a ‘big mama’ term bank too).

Term bases come in different guises, depending upon their creators and purposes. The functionalities offered in the freelance and enterprise versions of some CAT systems tend to reflect these needs.

Freelance translators are likely to prefer unadorned bilingual glossaries which they build up manually – typically over many years – by entering source and target term pairings as they go. Entries are normally kept in local computer memory, and can remain somewhat ad hoc affairs unless subjected to time-consuming maintenance. A minimal approach offers ease and flexibility for different contexts, with limited (or absent) metadata supplemented by the translator’s own knowledge and experience.

By contrast, big corporations can afford dedicated bureaus staffed with trained terminologists to both create and maintain industry-wide multilingual term bases. These will be enriched with synonyms, definitions, examples of usage, and links to pictures and external information to assist any potential users, present or future. For large corporate projects it is also usual practice to construct product-specific glossaries which impose uniform usages for designated key terms, with contracting translators or agencies being obliged to abide by them.

Glossaries are valuable resources, but compiling them more rapidly via database exchanges can be complicated due to the variation in storage formats. It is therefore common to allow export/import to/from intermediate formats such as spreadsheets, simple text files, or even TMX. This invariably entails the loss or corruption of some or even all of the metadata. In the interests of enhanced exchange capability, a Terminology Base eXchange (TBX) open standard was eventually created by OSCAR/LISA. Nowadays most sophisticated systems are TBX compliant.

Despite the emphasis traditionally placed on TMs, experienced users will often contend that it is the terminology feature which affords the greatest assistance. This is understandable if we consider that translation memories work best in cases of incremental changes to repetitive texts, a clearly limited scenario. By contrast, recurrent terminology can appear in any number of situations where consistency is paramount.

Interestingly, terminology features – while demonstrably core components – are not always ‘hard-wired’ into a given CAT system. Trados is one example, with its MultiTerm tool presented as a stand-alone application beside the company’s translation memory application (historically the Translator’s Workbench). Déjà Vu on the other hand, with its proprietary interface, has bundled everything together since inception.

Regardless, with corporations needing to maintain lexical consistency across user interfaces, Help files, documentation, packaging and marketing material, translating without a terminology feature has become inconceivable. Indeed, the imposition of specific vocabulary can be so strict that many CAT systems have incorporated quality assurance (QA) features which raise error flags if translators fail to observe authorised usage from designated term bases.

Translation management

Technical translation and localization invariably involve translating great numbers (perhaps thousands) of files in different formats into many target languages using teams of translators. Modest first-generation systems, such as the original Wordfast, handled files one at a time and catered for freelance translators in client-direct relationships. As globalization pushed volumes and complexities beyond the capacities of individuals and into the sphere of translation bureaus or language service providers (LSPs), CAT systems began to acquire a management dimension.

Instead of the front end being the translation editor, it became a ‘project window’ for handling multiple files related to a specific undertaking – specifying global parameters (source and target languages, specific translation memories and term bases, segmentation rules) and then importing a number of source files into that project. Each file could be opened in the editor and translated in the usual way.

These changes also signalled a new era of remuneration. Eventually all commercial systems were able to batch-process incoming files against the available memories, and pre-translate them by populating the target side of the relevant segments with any matches. Effectively, that same analysis process meant quantifying the number and type of matches as well as any internal repetition, and the resulting figures could be used by project managers to calculate translation costs and time. Individual translators working with discrete clients could clearly project-manage and translate alone, and reap any rewards in efficiency themselves. However, for large agencies with demanding clients, the potential savings pointed elsewhere.

Thus by the mid-1990s it was common agency practice for matches to be paid at a fraction of the standard cost per word. Translators were not enthused with these so-called ‘Trados discounts’ and complained bitterly on the Lantra-L and Yahoo Groups CAT systems users’ lists.

As for the files themselves, they could be of varied types. CAT systems would use the relevant filters to extract from those files the translatable text to present in the translator’s editor. Translators could then work on text that kept the same appearance, regardless of its native format. Inline formatting (bold, italics, font, colour etc.) would be displayed as read-only tags (typically numbers displayed in colours or curly brackets) while structural formatting (paragraphs, justification, indenting, pagination) would be preserved in a template to be reapplied upon export of the finished translation. The proper filters made it possible to work on numerous file types (desktop publishers, HTML encoders etc.) without purchasing the respective licence or even knowing how to use the creator software.

Keeping abreast of file formats was clearly a challenge for CAT system developers, since fresh converter utilities were needed for each new release or upgrade of supported types. As the information revolution gathered momentum and file types multiplied, macros that sat on third-party software were clearly unwieldy, so proprietary interfaces became standard (witness Trados’ shift from Word to TagEditor).

There were initiatives to normalize the industry so that different CAT systems could talk effectively between each other. The XML Localisation Interchange File Format (XLIFF) was created by the Organization for the Advancement of Structured Information Standards (OASIS) in 2002, to simplify the processes of dealing with formatting within the localization industry. However, individual CAT designers did not embrace XLIFF until the second half of the decade.

By incorporating project management features, CAT systems had facilitated sharing amongst teams of translators using the same memories and glossaries. Nevertheless, their role was limited to assembling a translation ‘kit’ with source and database matches. Other in-house or third-party systems (such as LTC Organiser, Project-Open, Projetex, and Beetext Flow) were used to exchange files and financial information between clients, agencies and translators. Workspace by Trados, launched in 2002 as a first attempt at whole-of-project management within a single CAT system, proved too complex and was discontinued in 2006. Web-based systems capable of dealing with these matters in a much simpler and more effective fashion started appearing immediately afterwards.

Alignment and term extraction tools

Hitherto the existence of translation memories and term bases has been treated as a given, without much thought as to their creation. Certainly, building them barehanded is easy enough, by sending source and target pairings to the respective database during actual translation. But this is slow, and ignores large amounts of existing matter that has already been translated, known variously as parallel corpora, bi-texts or legacy material.

Consider for example the Canadian Parliament’s Hansard record, kept bilingually in English and French. If such legacy sources and their translations could be somehow lined up side-by-side (as if already in a translation editor), then they would yield a resource that could be easily exploited by sending them directly into a translation memory. Alignment tools quickly emerged at the beginnings of the classic era, precisely to facilitate this task. The first commercial alignment tool was T Align, later renamed Trados WinAlign, launched in 1992.

In the alignment process parallel documents are paired, segmented and coded appropriately for import into the designated memory database. Segmentation would follow the same rules used in the translation editor, theoretically maximizing reuse by treating translation and alignment in the same way within a given CAT system. The LISA/OSCAR Segmentation Rules eXchange (SRX) open standard was subsequently created to optimize performance across systems.

Performing an alignment is not always straightforward. Punctuation conventions differ between languages, so the segmentation process can frequently chunk a source and its translation differently. An operator must therefore work manually through the alignment file, segment by segment, to ensure exact correspondence. Alignment tools implement some editing and monitoring functions as well, so that segments can be split or merged as required and extra or incomplete segments detected, to ensure a perfect 1:1 mapping between the two legacy documents. When determining whether to align apparently attractive bi-texts, one must assess whether the gains achieved through future reuse from the memories will offset the attendant cost in time and effort.

Terminology extraction posed more difficulties. After all, alignments could simply follow punctuation rules; consistently demarcating terms (with their grammatical and morphological inflections, noun and adjectival phrases) was another matter. The corresponding tools thus began appearing towards the end of the classic period, and likewise followed the same well-worn path from standalones (Xerox Terminology Suite being the best known) to full CAT system integration.

Extraction could be performed on monolingual (usually the source) or bilingual text (usually translation memories) and was only semi-automated. That is, the tool would offer up terminology candidates from the source text, with selection based on frequency of appearance. Since an unfiltered list could be huge, users set limiting parameters such as the maximum number of words a candidate could contain, with a stopword list applied to skip the function words. When term-mining from translation memories, some programs were also capable of proposing translation candidates from the target text. Whatever their respective virtues, term extractors could only propose: everything had to be vetted by a human operator.

Beyond purely statistical methods, some terminology extraction tools eventually implemented specific parsing for a few major European languages. After its acquisition of Trados in 2006, SDL offered users both its SDLX PhraseFinder and Trados MultiTerm Extract. PhraseFinder was reported to work better with those European languages that already had specific algorithms, while MultiTerm Extract seemed superior in other cases (Zetzsche 2010: 34).

Quality assurance

CAT systems are intended to help translators and translation buyers by increasing productivity and maintaining consistency even when teams of translators are involved in the same project. They also contribute significantly to averting errors through automated quality assurance (QA) features that now come as standard in all commercial systems.

CAT QA modules perform linguistic controls by checking terminology usage, spelling and grammar, and confirming that any non-translatable items (e.g. certain proper nouns) are left unaltered. They can also detect if numbers, measurements and currency are correctly rendered according to target language conventions. At the engineering level, they ensure that no segment is left untranslated, and that the target format tags match the source tags in both type and quantity. With QA checklist conditions met, the document can be confidently exported back to its native format for final proofing and distribution.

The first QA tools (such as QA Distiller, Quintillian, or ErrorSpy) were developed as third-party standalones. CAT systems engineers soon saw that building in QA made technical and business sense, with Wordfast leading the way.

What is also notable here is the general trend of consolidation, with QA tools following the same evolutionary path as file converters, word count and file analysis applications, alignment tools and terminology extraction software. CAT systems were progressively incorporating additional features, leaving fewer niches where third-party developers could remain commercially viable by designing plug-ins.

Localization tools: a special CAT systems sub-type

The classic-era CAT systems described above worked well enough with Help files, manuals and web content in general; they fell notably short when it came to software user interfaces (UIs) with their drop-down menus, dialogue boxes, pop-up help, and error messages. The older class of texts retained a familiar aspect, analogous to a traditional, albeit electronically enhanced, ‘[...]’ presentation of sequential paragraphs and pages. The new texts of the global age operated in a far more piecemeal, visually oriented and random-access fashion, with much of the context coming from their on-screen display. The contrast was simple yet profound: the printable versus the viewable.

Moreover, with heavy computational software (for example, 3D graphics) coded in programming languages, it could be problematic just identifying and extracting the translatable (i.e. displayable) text from actual instructions. Under the circumstances, normal punctuation rules were of no use in chunking, so localizers engineered a new approach centred on ‘text strings’ rather than segments. They also added a visual dimension – hardly a forte of conventional CAT – to ensure the translated text fitted spatially, without encroaching on other allocated display areas.

These distinctions were significant enough to make localization tools notably different from the CAT systems described above. However, to maintain consistency within the UI and between the UI per se and its accompanying Help documentation, linguistic resources (glossaries, and later memories too) were shared by both technologies.

The best known localization tools are Passolo (now housed in the SDL stable) and Catalyst (acquired by major US agency TransPerfect). There are also many others, both commercial (Multilizer, Sisulizer, RCWinTrans) and open source (KBabel, PO-Edit). Source material aside, they all operated in much the same way as their conventional CAT brethren, with translation memories, term bases, alignment and term extraction tools, project management and QA.

Eventually, as industry efforts at creating internationalization standards bore fruit, software designers ceased hard-coding translatable text and began placing it in XML-based formats instead. Typical EXE and DLL files gave way to Java and .NET, and more and more software (as opposed to text) files could be processed within conventional CAT systems.

Nowadays, the distinctions which engendered localization tools are blurring, and they no longer occupy the field exclusively. They are unlikely to disappear altogether, however, since new software formats will always arise and specialized tools address them faster.

CAT systems uptake

The uptake of CAT systems by independent translators was initially slow. Until the late 1990s, the greatest beneficiaries of leveraging and savings were those with computer power – corporate buyers and language service providers. But CAT ownership conferred an aura of professionalism, and proficient freelancers could access the localization industry (which, as already remarked, could likewise access them). In this context, from 2000 most professional associations and training institutions became keen allies in CAT system promotion. The question of adoption became not if but which one – with the dilemma largely hinging on who did the translating and who commissioned it, and epitomized by the legendary Déjà Vu versus Trados rivalry.

Trados had positioned itself well with the corporate sector, and for this reason alone was a pre-requisite for certain jobs. Yet by and large freelancers preferred Déjà Vu, and while today the brand may not be so recognizable, it still boasts a loyal user base.

There were several reasons why Déjà Vu garnered such a loyal following. Freelancers considered it a more user-friendly and generally superior product. All features came bundled together at an accessible and stable price, and the developer (Atril) offered comprehensive – and free – after-sales support. Its influence was such that its basic template can be discerned in other CAT systems today. Trados meanwhile remained a rather unwieldy collection of separate applications that required constant and expensive upgrades. For example, freelancers purchasing Trados 5.5 Freelance got rarefied engineering or management tools such as WorkSpace, T-Windows, and XML Validator, but had to buy the fundamental terminology application MultiTerm separately (Trados 2002). User help within this quite complex scenario also came at a price.

[...] (66) and SDLX (30).

[...] TTX files generated by Trados. Windows was the default platform in all cases, with only Wordfast natively supporting Mac.

[...] (FOSS) community also needed to localize software and translate documentation. That task fell less to conventional professional translators, and more to computer-savvy and multilingual collectives who could design perfectly adequate systems without the burden of commercial imperatives. OmegaT, written in Java and thus platform independent, was and remains the most developed open software system.

[...] LISA 2002, eColore 2003, and 2004, with the most detailed so far by London’s Imperial College in 2006. Its most intriguing finding was perhaps not the degree of adoption (with 82.5 per cent claiming ownership) or satisfaction (a seeming preference for Déjà Vu), but the 16 per cent of recipients who reported buying a system without ever managing to use it (Lagoudaki 2006: 17).
weretoppedbyDéjàVu(1169),followedWordfast(1003),Trados(438),Transit (2205) andTrados(2138),thenDéjàVu(1233)SDLX(537).Monthlymessageactivity By June2003themostpopularCATproducts,rankedbytheirlistmembers,wereWordfast Groups, andmembernumberstrafficontheselistsgiveanideaofrespectiveimportance. classic era.From1998onwards,CATsystemusersbegancreatingdiscussionlistsonYahoo the-shelf (mostlikelyTrados),orlaunchtheirownofferings(asSDLdidwithitsSDLX). tendency amongstmostlargetranslationagencieswastoeitherstopdevelopingandbuyoff- in-house onlysystems,suchLogos’MnemeandLionbridge’sForeignDesk.However,the base thatshowsintheirusers’YahooGroups.Completingthepanoramawereanumberof of thecentury.MetaTexis,WordFisherandTransSuite2000hadalsoasmallbutdedicated price thedevelopereventuallysetinOctober2002. to overtakeevenDéjàVuinfreelancers’affections.Usersreadilyacceptedthesmallpurchase maintained compatibility. It also came free at a time when alternatives werecostly,and began this environment.ItbeganasasimpleWordmacroakintotheearlyTrados,withwhichit Déjà Vu‘holywars’,thelastbeingwagedinAugust2002. of themostactiveattime)wouldfrequentlyreflectthis,especiallyinfamedTradosvs. passions runhigh.TheLantra-Ltranslators’discussionlist(foundedin1987,theoldestandone SDL’s website. (Studio 2011attimeofwriting) haveaccesstoallpriorversionsthroughdownloadsfrom preserves theoldTranslator’sWorkbench andTagEditor.HoldersofcurrentTradoslicences SDL hasbeenattheTradoshelm: itremainsWinAlign,stillpartofthe2007packagewhich licence, butstill installed separately. Curiously, there hasbeen no newalignment tool while integrated allfunctionsintoa proprietaryinterface;MultiTermwasnowincludedin the SDL Trados2006and2007.ThereleaseofStudio 2009sawashiftthatfinally Trados wasacquiredbySDLin2005,tobeultimatelybundled withSDLXandmarketedas Various surveysonfreelancerCATsystemadoptionhavebeen published,amongstthem Not all activity occurred in a commercial context. 
The Free and OpenSource Software All commercialproductswereTradoscompatible,abletoimportandexporttheRTF There areusefulrecordsforassemblingasnapshotofrelativeCATsystemacceptanceinthe LogiTerm andespeciallyMultiTransalsogainedasignificantuserbaseduringthefirstyears Wordfast, whichfirstappearedin1999its‘classic’guise,provedanagilecompetitor The prosandconsofthetwomaincompetingpackages,adegreeideology,saw Current CATsystems I. Garcia 78 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 demarcate thecloseofclassicCATsystemsera. too. All these emergingenhanced capabilities, which are coveredbelow, appropriately native format,CATsystemsnowofferadvancedaids−including TrackChanges–forrevisers etc.) insteadofcodedtags.Whereasmanyeditingtaskswere ideally leftuntilafterre-exportto more visual,withtranslationeditorscapableofdisplayingin-line formatting(fonts,bolding segmental matchingisalsobeingattempted.On-screenenvironments arelessclutteredand grammatical knowledgetocomplementthepurelystatistical algorithmsofthepast.Sub- to achieveoutbound-qualityresults.Advancesincomputational linguisticsaresupplying (including crowd-basedQA)havemadeitpossibletoharness armiesoftranslationaficionados a myriad of smallchanges with little manual supervision. The concept and crowd sourcing databases, and implement more agile translation management systems capable of dealing with and Web2.0,withusersplayingamoreactiveroleinwebexchanges. cloud computing,whereremote(internet)displacedlocal(harddrive)storageandprocessing; systems andthoseinthe1990scanbebetterunderstoodwithinframeworkoftwotrends: computer processingpowerandconnectivity.ThedifferenceinscopebetweencurrentCAT showing andgarneringsupport. their own applications. Organizing conferences, as memoQfest does, is anotherway of both extend functionality,thenwithSDLOpenExchange,allowingthemoreambitioustodevelop feedback. 
SDLTradosledwithitsIdeas,whereuserscouldproposeandvoteonfeaturesto comparative basis.Nowdevelopersseektightercontroloverhowtheyreceiveandaddress clear-cut thanitwastenyearsagowhenYahooGroupsuserlistsatleastaffordedsome freelance following. the classic era. Of them, MemoQ (Kilgray), launched in 2009, seems to have gained considerable mention below,whenillustratingnewfeaturesnowsupplementingtheonescarriedoutfrom writing aidFlare,hasmovedintothetranslationspherewithLingo. first, linkingtocrossAuthor.Theflowisnotjustone-way:Madcap,thedeveloperoftechnical consolidation patternwehaveseen,CATsystemsbeganincorporatingthem.Acrosswasthe authoring tools for precisely the same gains of consistency and reuse. Continuing the software developers had looked at this supply side of the content equation and begun creating Wordfast Classic)toJava-codedProfessionalandweb-basedAnywhere. versions atwriting)haveallkeptaprofile.Wordfastmovedbeyonditsoriginalmacro(now professionals. DéjàVualongwithX2,TransitNXTandMultiTransPrism(latest web-based TranslatorToolkitin2009,aCATsystempitchedforthefirsttimeatnon- web-based system and pioneered the integration of TM with MT. Google released its own as aservice(SaaS). the middle2000s,someCAT systems werealreadymakingtheconnectivityleaptosoftware power, certainfunctionalities would beaccessedoveraLANandeventuallyonserver. By Wordfast simplyranasmacros withinWord.Asthetechnologyexpandedwithcomputer Conventional CATsystemsof the1990sinstalledlocallyonahard-drive;somesuch as Cloud computinginparticularhasmadeitpossibletomeldTMwithMT,accessexternal The greatestdeterminingfactorsthroughouttheevolutionofCAThavebeenavailable The statusofCATsystems–theirmarketshare,andhowtheyarevaluedbyusersisless Many otherCATsystemssawthelightin last yearsofthedecadeandwillalsogaina Translation presupposes a source text, and textshaveto be written by someone. 
Other Other significantmoveswereoccurring:Lingotek,launchedin2006,wasthefirstfully From thehard-driveto web-browser CAT: systems 79 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 very response lagtimes seem less problematic too. Freelancer resistance thus presumably centres on the glossaries, dictionariesandcorpora.Ascountriescompaniesinvestinbroadbandinfrastructure, explain this,sincemostprofessionaltranslatorsalreadyrelyoncontinuousbroadbandforconsulting free WordfastAnywhere.Internetconnectivityrequirementsalonedonotseemtoadequately adherents, forexample,thepaidClassicversionisstillpreferredoveritsonlinecounterpart, circumvent toolobsolescenceandupgradedilemmas(Muegge2012:17−21).AmongWordfast have givenwaytostreamingchanges. automated −mostconvenientinanerawithshortcontentlifecycles,whereperiodicupdates becomes centralizedandstraightforward.Managementtaskscanalsobesimplified a segmentjustenteredbyonecanbealmostinstantlyreusedall.Databasemaintenance SDL Trados(WorldServer)andAcross. and Boltran. Traditional hard drive-based products also boast web-based alternatives, including Crowd.in, TextUnited,WordbeeandXTMCloud,plusopensourceGlobalSight(Welocalize) based systems soon followed: first Google Translator Toolkit and Wordfast Anywhere, then into theagency’scurrentGeoWorkzTranslationWorkspace. performed ontheserver.PurchasedbyLionbridgeforin-houseuse,ithassincebeendeveloped for MicrosoftWord,withthemajorityofcomputationaltasks(databasingandprocessing)now more controlanduniformity. often createdengineeringhitchesthroughcorruptedfileexports.Webaccesstodatabasesgave buyers, however,rejoiced.TheextendeduseofTrados-compatibletoolsinsteadTradoshad partially dependent on internet connection speed. 
Language service providers and translation it gave them less control over their own memories and glossaries, and made work progress accessing client-designateddatabasesremotelyviaalogin.Itdidnotmakealltranslatorshappy: forced translatorstoworkin‘web-interactive’mode−runningtheirCATsystemslocally,but bases. Thesewerevaluableresources,andclientswantedtosafeguardthemonservers.This Meer 2011). way behindtheinteroperabilityachievedinotherindustriessuch asbankingortravel(Vander being developedtoaddressthis.YetasTAUShasnoted,the translationindustrystillisalong communicability. Anewopenstandard,theLanguageInteroperability Portfolio(Linport),is to someextent,retreatingisolatedlog-inaccesshashobbled furtheradvancesincross-system With TMXhavingalreadybeenuniversallyadoptedandmost systemsbeingXLIFFcompliant there isagrowing emphasisonstatisticalmachine translation(SMT)forwhich, withappropriate be accessednowondemandthrough awebbrowser. renewed asprocessingcapabilities expanded.SophisticatedandcontinuallyevolvingMT can automation. Thelackofcomputational firepowerstalledMTprogressforatime,butit was management andtranslation memory happenedtobeanoffshootofresearchinto full Research intomachinetranslationbeganinthemid-twentieth century.Terminology Translators themselves have been less enthused, even though browser-based systems neatly Translators themselveshavebeenlessenthused,eventhoughbrowser-basedsystemsneatly The advantages of web-based systems are obvious. Where teams of translators are involved, The firstfully-onlinesystemarrivedintheformofLingotek,launched2006.Otherweb- The next jump came with Logoport.The original version installed locally as a small add-in The movehadcommencedattheturnofthiscenturywithtranslationmemoriesandterm Moving to the browser has not favoured standardization and interoperability ideals either. Although conventionalrule-based machinetranslation(RBMT)isstillholdingitsground, raison d’êtreofweb-basedsystems:remoteadministrationandresourcecontrol. 
Integrating machinetranslation I. Garcia 80 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 everything had to builtupfromzero.Nowadays thatisnottheonlyoption, andfromdayone purchasers weresomehowgranted externalmemoriesandglossaries(fromclients,say) Traditionally, whenusersfirst boughtaCATsystem,itcamewithemptydatabases.Unless workflow makestruebusinesssense. engines, post-editors,andspecificjobsforwhichMTintegration intoCATlocalization might shortlyanticipateevidence-baseddecisionsregarding thelanguagepairs,domains, and timeemployedachievingit.Withthepowerfulanalytic toolscurrentlyemerging,we metadata toindicatewhetherthetranslationderivesfromMT (andifsowhich),andthesteps (Specia 2011:73−80),stillrequirefinetuning. untranslated sentence. Non-referenced methods, such as those based on confidence estimations reference translation,andthuscannothelptoexactlypredict performanceonapreviously such astheBLEUscore(Papinenietal.2002:311−318)measureMTmatchqualityagainsta the engine’srawoutputquality,asyetthereisnoclearwayof quantifyingit.Standardmethods so howdoesonepredictthesuitabilityofatextbeforeMTprocessing? hindrance. Thisplacestranslationmanagersatadecisionalcrossroad:trialanderroriswasteful, viable; similarly,MTsolutionsshouldatleastbeofgistingqualitytoanythingotherthana As notedbefore,fixingfuzzymatchesbelowacertainthreshold(usually70percent)isnot and translatefromscratch. for treatmentakintoconventionalfuzzymatches:modifyifdeemedhelpfulenough,ordiscard translating fromthesourcenomatches)ortopopulatethosewithMTsolutions traditional way(acceptingorrepairingexactmatches,rejectingthefuzzyonesand (their translationmemoriesandterminologydatabases). 
so in a familiarenvironment(theirchosenCATsystem),whilstleveragingfrom legacy data generated outputtotranslatorsviatheirCATeditor.Thepayoffistwofold:enterprisescando machines nowproducinguseablefirstdrafts,therearepotentialgainsinpipeliningMT- mainframe poweredMT;SDLTradossoonfollowedsuit,andthenalltheothers.With remarked above,Lingotekin2006wasthefirsttolaunchaweb-basedCATintegratedwith MT wasnotreallypowerfuloragileenough,tricklingoutasdiscretebuildsonCD-ROM.As productivity’ (PlittandMasselott2010:15). appropriate conditionsMTpost-editingalso‘allowstranslatorstosubstantiallyincreasetheir professional post-editors. As an Autodesk experiment conductedin 2010 showed, under modern localizationprojects,enterprisesmayevenprefercustomizedMTenginesandtrained gradually developeditsownprinciples,procedures,training,andpractitioners.Forsome important thanstylisticcorrectness. Translator, withlight(orevenno)post-editingmaysuffice,especiallywhengistingismore these conditions,even free on-line MT engines such asGoogle Translate and Bing MT inmind(see‘authoringtools’above)outputcanbesignificantlyimprovedagain.Under existing onesforspecificdomains.Whatismore,ifsourcetextsarewrittenconsistentlywith bilingual andmonolingualdata,itiseasiertocreatenewlanguage-pairenginescustomize The nextgenerationofCATsystemswillforeseeablyascribe segmentsanotherlayerof Unfortunately, while the utility of MT and post-editing for a given task clearly depends on While theprocessmayseemstraightforward,desiredgainsintimeandqualityarenot. The integrationofTMwithMTgivesCATusersthechoicecontinuingworking Attempts ataugmentingCATwithautomationbeganinthe1990s,butavailabledesktop Post-editing, themanual‘cleaningup’ofrawMToutput,onceasmarginalitself,has Massive externaldatabases CAT: systems 81 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 massive databaseaccess. 
sheer translation volume and demand is pushing irrevocably toward a world of open and best minds.Yetrecentinitiatives(e.g.TAUS)wouldindicate thatthestrainofcopingwith issues, andthetrade-offbetweengoingpublicstaying private isexercisingtheindustry’s Commercial secrecy,ownership,priorinvestedvalue,andcopyrightareclearlycounterbalancing reach throughopenparticipation,albeitquarantiningsensitiveareasfrompublicuse. even themosthighlyresourcedcorporateplayersmightalsoseeabenefittoincreasingtheir Trados StudioandmemoQhadMyMemoryfunctionalitysoonafterwards. memories: MultiTranshasenabledaccesstoTDAandMyMemorysince2010,SDL Toolkit. OtherCATsystemshavealreadybegunincorporatinglinkstoonlinepublictranslation Memory (VLTM);itwascloselyfollowedbytheGlobal,sharedTMofGoogleTranslator and fuzzymatchesdirectly. enable translatorstoqueryworldwiderepositoriesoftranslationsolutionsandimportanyexact ability toaccesssuchdatawithouteverneedingleavetheCATeditorwindow.Itwould within aseparate application, andtransferringresultsacross:whatwouldbetrulyuseful is the concordance featureintheirownCATsystemsandmemories.Theonlyhitchisworking problematic sentences and phrases by querying the database, just as they would with the can alsosignificantlyassisthumantranslation.Freeon-lineaccessallowstranslatorstotackle Automation UsersSocietyTAUS),MyMemory(Translated.com)andLinguee.com. most notableincludetheTAUSDataAssociation(TDA,promotedbyTranslation all availabledatainsuchawaythatitcanbesortedbylanguage,clientandsubjectmatter.The entire TMstocktoo.Butambitionsdidnotceasethere,andinitiativeshaveemergedtopool using CATsystemswereobviousandattractivecandidates. 
greater thevolumeonehas,better,andtranslationmemoriescreatedsince1990s results foranygiventaskdependonfeedingtheSMTenginedomain-specificinformation, improvement, notjustinthealgorithmsbutdataqualityandquantityaswell.Sinceoptimal European Union.ThehighlyuseabletranslationsachievedwithSMTwereaspurtofurther using publishedbilingualcorpora–thetranslationmemories(minusmetadata)of company’s −lifetimeoutput. it ispossibletoaccessdatainquantitiesthatdwarfanytranslator’s−orformatter,entire True, theconcordancing toolcanbeusedto conductasearch,butthis is inefficient(and overburdened translators,somost toolsweresettoignoreanythingunderacertainthreshold. begging. sentences whichdidnotreturn fuzzymatchestocontainshorterperfectthatweregoing significant partofwriting.This posedanigglingproblem,sinceitwasentirelypossible for sentence level,withthestockexpressionsandconventional phraseologythatmakeupa a matchfortheaveragesizesentenceiscoincidence.Most repetitionhappensbelowthe applied toasourcecreatedforthesameclientandwithin sameindustry.Otherthanthat, Translation memoryhelpsparticularlywithinternalrepetition andupdates,alsowhen Now thatmemoriesandglossariesareincreasinglyaccessedonline,itisconceivable Wordfast wasthefirsttoprovideapracticalimplementationwithitsVeryLargeTranslation Now, thesesamemassivetranslationmemoriesthathavebeenassembledtoempowerSMT Accordingly, corporationsandmajorlanguageserviceprovidersbegancompilingtheir Interestingly, thissituationhascomeaboutpartlythroughSMT,whichbeganitsdevelopment Research andexperienceshowed thatlow-valuematches(usuallyunder70percent) Sub-segmental reuse I. Garcia 82 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 term extraction components.Earlyinthepast decadetermextractionwasconsidered aluxury, SDL, TerminotixandMultiCorpora havealsocreatedsystemswithstronglanguagespecific computational linguisticscanpay dividends.FollowingtheXeroxTerminologySuitemodel, Factory. 
unbundling thealignmenttool fromitsLogiTermsystemandmarketingitseparatelyasAlign more relevantthanever.Inthisclimate,Terminotixhas bucked theestablishedtrendby that thereissomuchdemandforSMTbi-texts,quickandaccuratealignmentshavebecome converters, QA,alignment,termextraction),ultimatelydisplacing theirpioneers.Butnow noted, CATsystemdesignershaveprogressivelyintegrated third-partystandalones(file without manualverification.Hereaninterestingbusiness reversal hasoccurred.Asalready to theextentthatitsalignmentsyieldoutputwhichforsomepurposes isdeemedusefulenough linguistic analysisfunction. and sub-segmentalmatchingforthesevenEuropeanUnion languagessupportedbyits ‘second-generation translation memory’, Similis boasts enhanced alignment, term extraction, in thesecondhalfofdecadeSimilissystem(Lingua etMachina).Advertisedasa specific knowledgewithinaCATenvironment.Nowdiscontinued,itstechnologyresurfaced and storedones;astranslationaidstheycouldbepowerful,butnot‘smart’. generation CATsystemsworkedbyseekingpurelystatisticalmatch-upsbetweennewsegments apply thesamedatabasingprinciplestowhateverlanguagecombinationuserchose.First own specialalgorithms;CAT systems werethe opposite, comingasempty vessels that could In theclassicera,itwasMTapplicationsthatwerelanguagespecific,witheachpairhavingits segmental matchqueriesfrominternaldatabasestomassiveexternalones. Once thatisattained,onecanonlyspeculateonthepotentialandgainsofelevatingsub- others, sotherightbalanceisneededbetweenwhat(andhowmany)suggestionstooffer. reuse atsentencelevelonly. leveraging inTAUS-speak),increasedreusebyanaverageof30percentoverconventional Predictive typingisvariouslydescribedasAutoSuggest,AutoComplete,AutoWriteetc. Substring Concordance;Star-TransithasDualFuzzy,andDéjàVuX2DeepMiner. 
and Lingotek,followingTAUS,callitAdvancedLeveraging;memoQreferstoLongest Each developerhasitsownimplementationandjargonforsub-segmentalmatching:MultiTrans indexing withpredictivetyping,suggestionspoppingupasthetranslatortypesfirstletters. (Garcia 2003). memory whennomatcheswereavailable.Sometranslatorslovedit;othersfounditdistracting feature, whichofferedportionsthathadbeenenteredintothetermbase,lexiconor have provenelusivetoachieve.TheearlyleaderinthisfieldwasDéjàVuwithitsAssemble level (or‘sub-segmental’)matchesallbyitself−automatedconcordancing,sotospeak. additional time.Itwouldbemuchbetterifthecomputercouldfindandofferthesephrase- random) sinceitreliesonthetranslator’sfirstidentifyingneedtodoso,andtakes Apart fromalignment,term extractionisanotherareawheretrackingadvancesin Canada-based Terminotixhasalsostoodoutforitsabilityto mixlinguisticswithstatistics, The termextractiontoolXeroxTerminologySuitewasapioneerinintroducinglanguage- As discoveredwiththeoriginalDéjàVuAssemble,whatisahelptosomedistraction A studysponsoredbyTAUSin2007reportedthatsub-segmentalmatching(oradvanced It isonlyrecentlythatallmajordevelopershaveengagedwiththetask,usuallycombining Potential methodshavebeenexploredforyears(SimardandLanglais2001:335−339),but CAT systemsacquirelinguisticknowledge CAT: systems 83 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 MemoQ. developed forCATsystems,havingemergedalmostsimultaneouslyinSDLTradosand document for another to approve. It is only at the time of writing that this feature is being update, exporttonativeformat). target (plusmetadata)toatableinWordforediting,thenimportitbackfinalization(TM decade ago,thebestavailableoptionwasprobablyinDéjàVu,whichcouldexportsourceand ‘what-you-see-is-what-you-get’ view. That situationhaschangedsomewhat,withmanyproprietaryeditorsedgingclosertoaseamless was amajorpointofdifferentiationbetweenconventionalCATsystemsandlocalizationtools. 
driven industry.Tagswereseeminglythebaneofatranslator’sexistence.Thevisualpresentation translation couldnotbeexportedtonativeformat–aharrowingexperienceindeadline- negate anyproductivitybenefitsentirely.Ifatagweremissing,anotherwisecompleted in fromaPDF,OCRoutputwithunevenfontsetc.),thenumberoftagscouldexplodeand brackets). (displayed asiconsinTagEditor,paint-brushedsectionsSDLX,oranumericcodecurly handle otherfiletypes,butcouldbecomeuselesslyclutteredwithin-lineformattingtags oblivious totheunderlyingcodingthatmadefiledisplay.Earlyproprietaryinterfacescould blessing: translatorscould operate within a familiar environment (Word) whilst remaining Microsoft Word-basedTMeditors(suchasTradosWorkbenchandWordfast)hadonegreat remains demandingandexpensive,moreCATlanguagespecializationwillassuredlyfollow. CAT paradigmnolongerstands,andalthoughbuildingalgorithmsforspecificlanguagepairs (Fluency, Fortis,Snowball,Wordbee,XTM)wereincludingitwithintheirstandardofferings. marketed byonlytheleadingbrandsatapremiumprice.Bydecade’send,allnewcomers done bybilingual usersoremployees,inthe beliefthatsubjectmatterexpertise willoffseta translator engagedinsporadic work.Sometranslationbuyersmightprefertohaveprojects around. monthly oronthevolumeof wordstranslated.Thisallowsuserstobothshopandmove Text United,Wordbee)haveskirted thisobstaclebyadoptingasubscriptionapproach,charged then feeling‘lockedin’byone’sinvestment.Web-basedapplications (MadcapLingo,Snowball, Snowball, TextUnited,Wordbeeandothers. person performing the translation does not: Across, Lingotek memoQ, MemSource, Similis, Many atleasthaveafreesatelliteversion,sothatwhiletheproject creatorneedsalicence,the other opensourcetools,butalsotheGoogleTranslationToolkit andWordfastAnywhere. and costs are falling. 
Several suites are even free, such asOmegaT,Virtaal, GlobalSight and and tendedtobeexpensivecumbersome.Thepotential userbaseisnowmuchbroader, A decadeagoCATsystemswereaimedattheprofessionaltranslator workingontechnicaltext, In wordprocessing,TrackChangeshasbeenoneeffectivewaytopresentalterationsina Conventional CAThasnotparticularlyfacilitatedthepost-drafteditingstageeither.A If for some reason the file had notbeen properly optimized atthe source (e.g., textpasted Now atleastwherethemajorEuropeanlanguagesareconcerned,classic‘tabularasa’ Modern CATsystemsnowassist withmosttypesoftranslation,andsuiteventhecasual One stickingpointforpotentialpurchaserswastheoftenhefty up-frontlicencefee,and Upgrades tothetranslator’seditor Where tofromhere? I. Garcia 84 Downloaded By: 10.3.98.104 At: 10:32 01 Oct 2021; For: 9781315749129, chapter3, 10.4324/9781315749129.ch3 it containedonly 27entriesatthetimeofwriting (evenmajornamessuch asDéjàVuor become thebestinformationrepository forproductsunderactivedevelopment.Justreleased, With theHutchingsCompendium nowdiscontinued,theTAUS Trackerwebpagemaysoon need tocomprehensivelydocumenttheirevolutionbeforeit recedestoofarfromview. we wishtofullyunderstandwhatCATsystemshaveachieved intheirfirsttwentyyears,we past. Thatisnoteasywhenchangepropellingusdizzyingly anddistractinglyforward.Butif by then,thesystemsoftodaywilllookasoutdatedDOS-based softwarelooksnow. envision state-of-the-artin2020wouldbeguessworkatbest. Whatisvirtuallycertainthat have evolved, any prediction is risky. Change ishardly expected to slacken, so attempting to altogether exceptforverynarrowdomains. 
back totheeditor,willmakelanguageitselffuzzier;theyadvocate avoidanceofthetechnology memory toeditorwindow,frommassivedatabases andSTMengines,then of todayredundant.Pessimistsworryevennowthatcontinuousreusematchesfrominternal MT post-editingwillbetheanswerinmostsituations,makingtranslator-focusedsystems software willappearbytheendofpresentdecade.Technologyoptimistsseemtothinkthat recognition, orcombinationsthereof)toachieveoptimalresults. matched againstdifferentapproaches(MTpluspost-editing,sub-segmentalmatching,speech empirical studiestodescribehowbasicvariables(texttype,translatorskillprofile)canbe typing. Thepossibilitiesseemsuggestiveandattractive.Unfortunately,therearestillno translating’ than from MT post-editing or assembling sub-segmental strings or predictive speech recognition. Aided Translation)isthefirstsystemthat purpose-builttopackage TM(andMT)with Trados andDragonNaturallySpeaking)canstraincomputerresources.Aliado.SAT(Speech environments overthelastfewyears.However,runningheavyprogramsconcurrently(say speech recognitionsoftware,dictationhasreturnedformajorsupportedlanguagesatleast. speed couldbeincreasedbyhavingexperttranslatorsdictatetotypists.Withthehelpof segmental matchingfromexistingdatabases. and post-editingor,ifpreferred,enhancingmanualtranslationwithpredictivetypingsub- terminology. Nowadays,theycanalsoassistwithnon-matchsegmentsbypopulatingMT CAT systemsaimedatboostingproductivitybyreusingexactandfuzzymatchesapplying and concordancing;LogitermcanaccessTermiumothermajortermbanks.Inthepast, MultiTrans, SDL Trados Studio and memoQ can directly access massive databases for matches developed specificallywithmasscollaborationinmind. work inteams,butsome−likeCrowd.in,LingotekorTranslationWorkSpacehavebeen translate itssiteintovariouslanguagesvoluntarily.AllCATsystemsallowfortranslatorsto or repaired. 
A decade ago, CAT systems came with empty memory and terminology databases. Now, [...]

As for typing per se, history is being revisited with a modern twist. In the typewriter era, [...]

Translators have been using stand-alone speech recognition applications in translation editor [...]

Translators who are also skilled interpreters might perhaps achieve more from ‘sight [...]

Given all this technological ferment, one might wonder how professional translation [...]

Considering recent advances, and how computing in general and CAT systems in particular [...]

While it is tempting to peer into possible futures, it is also important not to lose track of the [...]

[...] possible lack of linguistic training. Another compensating factor is sheer numbers: if there are enough people engaged in a task, results can be constantly monitored and if necessary corrected [...]

This is often referred to as crowdsourcing. For example, Facebook had its user base [...]

Further reading and relevant resources

[...] Lingotek have not made its list yet). ProZ’s CAT Tool comparison, successor to the popular ‘CAT Fight’ feature that was shelved some years ago, also proposes to help freelance translators make informed decisions by compiling all relevant information on CAT systems in one place.

ProZ, the major professional networking site for translators, also includes ‘CAT Tools Support’ technical forums and group buy schemes. There are also user bases on Yahoo Groups, some of which (Déjà Vu, Wordfast, the old Trados) are still quite active; these CAT Tool Support forums allow for a good appraisal of how translators engage with these products.

The first initiative to use the web to systematically compare features of CAT systems was Jost Zetzsche’s TranslatorsTraining.com. Zetzsche is also the author of The Tool Kit newsletter, now rebranded The Tool Box, which has been an important source of information and education on CAT systems (which he calls TEnTs, or ‘translation environment tools’). Zetzsche has also authored and regularly updated the electronic book A Translator’s Tool Box for the 21st Century: A Computer Primer for Translators, now in its tenth edition.

Of the several hard copy industry journals available in the nineties (Language Industry Monitor, Language International, Multilingual Computing and Technology and others), only Multilingual remains, and continues offering reviews of new products (and new versions of established ones) as well as general comments on the state of the technology. Reviews and comments can also be found in digital periodicals such as Translation Journal, ClientSide News, or TC World; they can also be found in newsletters published by translators’ professional organizations (The ATA Chronicle, ITI Bulletin), and in academic journals such as Machine Translation and Journal of Specialised Translation.

Articles taken from these and other sources may be searched within the Machine Translation Archives, a repository of articles also compiled by Hutchins. Most items related to CAT systems will be found in the ‘Methodologies, techniques, applications, uses’ section under ‘Aids and tools for translators’, and also under ‘Systems and project names’.

References

ALPAC (Automatic Language Processing Advisory Committee) (1966) Language and Machines: Computers in Translation and Linguistics, A Report by the Automatic Language Processing Advisory Committee, Division of Behavioral Sciences, National Academy of Sciences, National Research Council, Washington, DC: National Research Council.

Brace, Colin (1992, March–April) ‘Bonjour, Eurolang Optimiser’, Language Industry Monitor. Available at: http://www.lim.nl/monitor/optimizer.html.

Garcia, Ignacio (2003) ‘Standard Bearers: TM Brand Profiles at Lantra-L’, Translation Journal 7(4).

Hutchins, W. John (1998) ‘Twenty Years of Translating and the Computer’, Translating and the Computer 20. London: The Association for Information Management.

Hutchins, W. John (1999–2010) Compendium of Translation Software: Directory of Commercial Machine Translation Systems and Computer-aided Support Tools. Available at: http://www.hutchinsweb.me.uk/Compendium.htm.

Kay, Martin (1980/1997) ‘The Proper Place of Men and Machines in Language Translation’, Machine Translation 12(1–2): 3–23.

Kingscott, Geoffrey (1999, November) ‘New Strategic Direction for Trados International’, Journal for Language and Documentation 6(11). Available at: http://www.crux.be/English/IJLD/trados.pdf.

Lagoudaki, Elina (2006) Translation Memories Survey, Imperial College London. Available at: http://www3.imperial.ac.uk/portal/pls/portallive/docs/1/7294521.PDF.

Melby, Alan K. (1983) ‘Computer-Assisted Translation Systems: The Standard Design and a Multi-level Design’, in Proceedings of the ACL-NRL Conference on Applied Natural Language Processing, Santa Monica, CA, USA, 174–177.

Muegge, Uwe (2012) ‘The Silent Revolution: Cloud-based Translation Management Systems’, TC World 7(7): 17–21.

Papineni, Kishore A., Salim Roukos, Todd Ward, and Zhu Wei-Jing (2002) ‘BLEU: A Method for Automatic Evaluation of Machine Translation’, in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL-2002, 7–12 July 2002, University of Pennsylvania, PA, 311–318.

Plitt, Mirko and François Masselot (2010) ‘A Productivity Test of Statistical Machine Translation: Post-editing in a Typical Localisation Context’, The Prague Bulletin of Mathematical Linguistics 93: 7–16.

Simard, Michel and Philippe Langlais (2001) ‘Sub-sentential Exploitation of Translation Memories’, in Proceedings of the MT Summit VIII: Machine Translation in the Information Age, Santiago de Compostela, Spain, 335–339.

Specia, Lucia (2011) ‘Exploiting Objective Annotations for Measuring Translation Post-editing Effort’, in Proceedings of the 15th Conference of the European Association for Machine Translation (EAMT 2011), Leuven, Belgium, 73–80.

TAUS (Translation Automation User Society) (2007) Advanced Leveraging: A Report. Available at: http://www.translationautomation.com/technology-reviews/advanced-leveraging.html.

Trados (2002) 5.5 Getting Started Guide, Dublin, Ireland: Trados.

van der Meer, Jaap (2011) Lack of Interoperability Costs the Translation Industry a Fortune: A TAUS Report. Available at: http://www.translationautomation.com/reports/lack-of-interoperability-costs-the-translation-industry-a-fortune.

Wallis, Julian (2006) ‘Interactive Translation vs. Pre-translation in the Context of Translation Memory Systems: Investigating the Effects of Translation Method on Productivity, Quality and Translator Satisfaction’, unpublished MA Thesis in Translation Studies, Ottawa, Canada: University of Ottawa.

Zetzsche, Jost (2004–) The Tool Box Newsletter, Winchester Bay, OR: International Writers’ Group.

Zetzsche, Jost (2010) ‘Get Those Things Out of There!’ The ATA Chronicle 34–35, March.

Zetzsche, Jost (2012) A Translator’s Tool Box for the 21st Century: A Computer Primer for Translators (version 10), Winchester Bay, OR: International Writers’ Group.