GeekSpeak


More Is Better?
Jost Zetzsche, jzetzsche@internationalwriters.com

The GeekSpeak column has two goals: to inform the community about technological advances and at the same time encourage the use and appreciation of technology among translation professionals. Jost is the co-author of Found in Translation: How Language Shapes Our Lives and Transforms the World, a perfect source for replenishing your arsenal of information on how human translation and machine translation each play important parts in the broader world of translation.

I took some time the other day to take an inventory of the out-of-the-box machine translation (MT) connectors or plug-ins that translation environment tools (TEnTs) come readily equipped with these days. Just a few months ago, most TEnTs came with only a connector to Google Translate, but Google’s decision to start charging for its MT service (if it is integrated into a third-party tool) made most tool vendors look for other solutions alongside Google Translate.

What follows is an (admittedly incomplete) list of tools and their connectors, along with some thoughts about their usefulness. (Note that for the actual use of most of these MT tools you will need a license key.)

Trados Studio 2011
• Integrated plugins for Language Weaver (BeGlobal), SDL MT, Google Translate, and Microsoft Translator.
• Free installable plugins on SDL’s app store OpenExchange for itranslate4.eu, Systran, MyMemory, Google Translate, and Microsoft Translator. The two last plugins extend the ability of the out-of-the-box connectors. For instance, this allows you, in the case of Microsoft Translator, to translate without a license in exchange for your translation data from the current project.
• How it works: You can select various engines at the same time and all matches are shown in the order of the preference you determine. Exceptions: When selecting itranslate4.eu or MyMemory, only those matches are shown.

memoQ 6
• Integrated plugins for Google Translate, Microsoft Translator, itranslate4.eu, Systran, LetsMT!, Asia Online, and MyMemory.
• How it works: You can select various engines at the same time and all matches are shown with one selectable preferential engine.

Déjà Vu X2
• Integrated plugins for Google Translate, Microsoft Translator, itranslate4.eu, PROMT, and Systran.
• How it works: It is only possible to select one MT engine at a time. Déjà Vu X2 uses MT hits in combination with translation memory hits.

Wordfast Classic 6
• Integrated plugins for Google Translate, Microsoft Translator, itranslate4.eu, WorldLingo, and MyMemory.
• How it works: You can select up to three engines at a time and all matches are shown, including various matches from itranslate4.eu.

Wordfast Pro 3
• Integrated plugins for Google Translate, Microsoft Translator, and WorldLingo.
• How it works: You can select several engines at a time and all matches are shown.

Fluency 2011
• Integrated plugins for Google Translate, Microsoft Translator, Systran, and MyMemory.
• How it works: You can select only one engine at a time and matches are not shown automatically. Features an interface for writing scripts to ease post-editing.

Lingotek
• Integrated plugins for Google Translate and Microsoft Translator.
• Possible connectors to SDL LanguageWeaver, SAIC Omnifluent, and Asia Online.
• How it works: You can select several engines at a time and all matches are shown.

Multitrans
• Possible connectors to PROMT and MyMemory.

MemSource
• Integrated plugins for Google Translate, Microsoft Translator, Microsoft Translator Hub, and Systran.

OmegaT
• Integrated plugin for Google Translate, Belazar (for Russian<>Belarussian), Microsoft Translator, and Apertium.

Across
• Possible connectors to Google Translator, Lucy LT, Reverso, Language Weaver, Moses, and Asia Online.
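To make the "How it works" notes a little more concrete, here is a minimal Python sketch of the behavior several tools are described as offering: querying a few selected engines and showing all matches in the user's order of preference. Everything in it is an assumption for illustration. The engines are stubs that return canned strings (no real MT service, plug-in API, or license key is involved), and `make_stub` and `collect_suggestions` are hypothetical names, not functions of any of the tools listed.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An "engine" here is just a function from a source segment to a suggestion.
Engine = Callable[[str], str]


def make_stub(name: str) -> Engine:
    """Build a hypothetical stand-in for an MT connector (no network calls)."""
    def translate(segment: str) -> str:
        return f"<{name} suggestion for: {segment}>"
    return translate


# Engine names are borrowed from the list above; the behavior is invented.
ENGINES: Dict[str, Engine] = {
    name: make_stub(name)
    for name in ("Google Translate", "Microsoft Translator", "Systran")
}


def collect_suggestions(
    segment: str,
    engines: Dict[str, Engine],
    preference: List[str],
    max_engines: Optional[int] = None,
) -> List[Tuple[str, str]]:
    """Return (engine name, suggestion) pairs in the user's preferred order.

    Names in `preference` that are not available are skipped. `max_engines`
    caps how many engines are queried, e.g. 1 for a tool that allows only
    one engine at a time, or 3 for a tool that allows up to three.
    """
    selected = [name for name in preference if name in engines]
    if max_engines is not None:
        selected = selected[:max_engines]
    return [(name, engines[name](segment)) for name in selected]


# Show all matches for one segment, preferred engine first.
for engine, suggestion in collect_suggestions(
    "Guten Morgen", ENGINES, ["Systran", "Google Translate"]
):
    print(f"{engine}: {suggestion}")
```

Passing `max_engines=1` mimics the one-engine-at-a-time tools, while leaving it unset mimics the tools that display every selected engine's match.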
If you need to know what these different MT engines are and what language combinations they support, I encourage you to check on their respective websites or Wikipedia pages.

So, what is all this good for? I will leave this up to your preferences (and language combination, and kinds of translation you do, and the many likes and dislikes that you might have about this kind of technology). But there is one thing that interests me in particular: Is it helpful to have several MT suggestions shown as you translate?

Consider the example in Figure 1 from Wordfast (with MTs from Google Translate, Microsoft Translator, Linguatec, Systran, and Trident MT, the last three through itranslate4.eu). We do not need to argue about how “good” these matches are, but most of them contain some material that in some kind of combination might be useful in the actual and final translation. What role does this information play for the translator, though? Does it help or hinder? Is it different, for instance, than having a lot of matches from a general translation memory shown?

[Figure 1: Suggested translations in Wordfast]

It was interesting to discuss this question at a workshop I gave recently. Not surprisingly, the translators in attendance expressed very divergent opinions. Some felt that this would stifle creativity, whereas others liked the idea of having four or five different MTs displayed. And chances are that the answer does indeed differ for each translator and that translator’s individual style of processing data.

There is, however, one way that this simultaneous display of different results will be helpful for anyone who is looking into using these out-of-the-box MT engines (as opposed to not using MT at all or using a customized engine). It allows you to compare the results very easily. Chances are that you will quickly find one of the engines better than the others for your particular project, and this will allow you to disable the less helpful ones (and stop paying for their suggestions).

One of the (very unscientific) tests that we did during the above-mentioned workshop was to look at different MT providers with different kinds of texts in about 10 of the represented language combinations. The result? We noticed that often there was a clear “winner” on a per-project basis.

MT might not be your cup of tea as a productivity tool, but it is important to remember that the results of one MT system are always unlike those of another.

The ATA Chronicle, January 2013, pp. 32-33
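The idea of finding a per-project "winner" among engines can be tried in a rough, equally unscientific way with a few lines of Python. The sketch below is not the procedure used at the workshop. It assumes you have kept each engine's suggestions alongside your own final translations, and it uses the character-level similarity ratio from Python's standard `difflib` module as a crude stand-in for how little editing a suggestion needed. The engine names and sentences are invented sample data, and `rank_engines` is a hypothetical helper.

```python
from difflib import SequenceMatcher
from statistics import mean
from typing import Dict, List, Tuple


def similarity(suggestion: str, final: str) -> float:
    """Similarity between 0.0 and 1.0; 1.0 means the suggestion was kept as is."""
    return SequenceMatcher(None, suggestion, final).ratio()


def rank_engines(
    suggestions: Dict[str, List[str]], finals: List[str]
) -> List[Tuple[str, float]]:
    """Rank engines by average similarity to the final translations.

    `suggestions` maps an engine name to its outputs, one per segment, in the
    same order as `finals`. A higher average is taken here as a rough sign
    that less post-editing was needed; it says nothing about quality as such.
    """
    scores = {
        engine: mean(similarity(s, f) for s, f in zip(outputs, finals))
        for engine, outputs in suggestions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample data for one small "project" of two segments.
finals = ["The cat sat on the mat.", "Please close the door."]
suggestions = {
    "Engine A": ["The cat sat on the mat.", "Please close the door."],
    "Engine B": ["Cat sitting mat on.", "Door please to close."],
}

for engine, score in rank_engines(suggestions, finals):
    print(f"{engine}: {score:.2f}")
```

A ranking like this only hints at which engines to keep enabled for a given project. A different text type or language combination may well produce a different order, which is the column's point.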