Issues in Machine Translation: A Case of Mobile Apps in the Lithuanian and English Language Pair
Recommended publications
-
Integrating Optical Character Recognition and Machine
Integrating Optical Character Recognition and Machine Translation of Historical Documents Haithem Afli and Andy Way ADAPT Centre School of Computing Dublin City University Dublin, Ireland {haithem.afli, andy.way}@adaptcentre.ie Abstract Machine Translation (MT) plays a critical role in expanding capacity in the translation industry. However, many valuable documents, including digital documents, are encoded in non-accessible formats for machine processing (e.g., Historical or Legal documents). Such documents must be passed through a process of Optical Character Recognition (OCR) to render the text suitable for MT. No matter how good the OCR is, this process introduces recognition errors, which often renders MT ineffective. In this paper, we propose a new OCR to MT framework based on adding a new OCR error correction module to enhance the overall quality of translation. Experimentation shows that our new system correction based on the combination of Language Modeling and Translation methods outperforms the baseline system by nearly 30% relative improvement. 1 Introduction While research on improving Optical Character Recognition (OCR) algorithms is ongoing, our assessment is that Machine Translation (MT) will continue to produce unacceptable translation errors (or non-translations) based solely on the automatic output of OCR systems. The problem comes from the fact that current OCR and Machine Translation systems are commercially distinct and separate technologies. There are often mistakes in the scanned texts as the OCR system occasionally misrecognizes letters and falsely identifies scanned text, leading to misspellings and linguistic errors in the output text (Niklas, 2010). Works involved in improving translation services purchase off-the-shelf OCR technology but have limited capability to adapt the OCR processing to improve overall machine translation performance. -
Neuro Informatics 2020
Neuro Informatics 2019 September 1-2 Warsaw, Poland PROGRAM BOOK What is INCF? About INCF INCF is an international organization launched in 2005, following a proposal from the Global Science Forum of the OECD to establish international coordination and collaborative informatics infrastructure for neuroscience. INCF is hosted by Karolinska Institutet and the Royal Institute of Technology in Stockholm, Sweden. INCF currently has Governing and Associate Nodes spanning 4 continents, with an extended network comprising organizations, individual researchers, industry, and publishers. INCF promotes the implementation of neuroinformatics and aims to advance data reuse and reproducibility in global brain research by: • developing and endorsing community standards and best practices • leading the development and provision of training and educational resources in neuroinformatics • promoting open science and the sharing of data and other resources • partnering with international stakeholders to promote neuroinformatics at global, national and local levels • engaging scientific, clinical, technical, industry, and funding partners in collaborative, community-driven projects INCF supports the FAIR (Findable Accessible Interoperable Reusable) principles, and strives to implement them across all deliverables and activities. Learn more: incf.org neuroinformatics2019.org Welcome Welcome to the 12th INCF Congress in Warsaw! It makes me very happy that a decade after the 2nd INCF Congress in Plzen, Czech Republic took place, for the second time in Central Europe, the meeting comes to Warsaw. The global neuroinformatics scenery has changed dramatically over these years. With the European Human Brain Project, the US BRAIN Initiative, Japanese Brain/MINDS initiative, and many others world-wide, neuroinformatics is alive more than ever, responding to the demands set forth by the modern brain studies. -
Machine Translation for Academic Purposes Grace Hui-Chin Lin
Proceedings of the International Conference on TESOL and Translation 2009 December 2009: pp.133-148 Machine Translation for Academic Purposes Grace Hui-chin Lin PhD Texas A&M University College Station Master of Science, University of Southern California Paul Shih Chieh Chien Center for General Education, Taipei Medical University Abstract Due to the globalization trend and knowledge boost in the second millennium, multi-lingual translation has become a noteworthy issue. For the purposes of learning knowledge in academic fields, Machine Translation (MT) should be noticed not only academically but also practically. MT should be informed to the translating learners because it is a valuable approach to apply by professional translators for diverse professional fields. For learning translating skills and finding a way to learn and teach through bi-lingual/multilingual translating functions in software, machine translation is an ideal approach that translation trainers, translation learners, and professional translators should be familiar with. In fact, theories for machine translation and computer assistance had been highly valued by many scholars. (e.g., Hutchines, 2003; Thriveni, 2002) Based on MIT’s Open Courseware into Chinese that Lee, Lin and Bonk (2007) have introduced, this paper demonstrates how MT can be efficiently applied as a superior way of teaching and learning. This article predicts the translated courses utilizing MT for residents of global village should emerge and be provided soon in industrialized nations and it exhibits an overview about what the current developmental status of MT is, why the MT should be fully applied for academic purpose, such as translating a textbook or teaching and learning a course, and what types of software can be successfully applied. -
Machine Translation for Language Preservation
Machine translation for language preservation Steven BIRD1,2 David CHIANG3 (1) Department of Computing and Information Systems, University of Melbourne (2) Linguistic Data Consortium, University of Pennsylvania (3) Information Sciences Institute, University of Southern California [email protected], [email protected] ABSTRACT Statistical machine translation has been remarkably successful for the world’s well-resourced languages, and much effort is focussed on creating and exploiting rich resources such as treebanks and wordnets. Machine translation can also support the urgent task of documenting the world’s endangered languages. The primary object of statistical translation models, bilingual aligned text, closely coincides with interlinear text, the primary artefact collected in documentary linguistics. It ought to be possible to exploit this similarity in order to improve the quantity and quality of documentation for a language. Yet there are many technical and logistical problems to be addressed, starting with the problem that – for most of the languages in question – no texts or lexicons exist. In this position paper, we examine these challenges, and report on a data collection effort involving 15 endangered languages spoken in the highlands of Papua New Guinea. KEYWORDS: endangered languages, documentary linguistics, language resources, bilingual texts, comparative lexicons. Proceedings of COLING 2012: Posters, pages 125–134, COLING 2012, Mumbai, December 2012. 1 Introduction Most of the world’s 6800 languages are relatively unstudied, even though they are no less important for scientific investigation than major world languages. For example, before Hixkaryana (Carib, Brazil) was discovered to have object-verb-subject word order, it was assumed that this word order was not possible in a human language, and that some principle of universal grammar must exist to account for this systematic gap (Derbyshire, 1977). -
Machine Translation and Computer-Assisted Translation: a New Way of Translating? Author: Olivia Craciunescu E-Mail: Olivia [email protected]
Machine Translation and Computer-Assisted Translation: a New Way of Translating? Author: Olivia Craciunescu E-mail: [email protected] Author: Constanza Gerding-Salas E-mail: [email protected] Author: Susan Stringer-O’Keeffe E-mail: [email protected] Source: http://www.translationdirectory.com
The Need for Translation Technology
Advances in information technology (IT) have combined with modern communication requirements to foster translation automation. The history of the relationship between technology and translation goes back to the beginnings of the Cold War, as in the 1950s competition between the United States and the Soviet Union was so intensive at every level that thousands of documents were translated from Russian to English and vice versa. However, such high demand revealed the inefficiency of the translation process, above all in specialized areas of knowledge, increasing interest in the idea
IT has produced a screen culture that tends to replace the print culture, with printed documents being dispensed with and information being accessed and relayed directly through computers (e-mail, databases and other stored information). These computer documents are instantly available and can be opened and processed with far greater flexibility than printed matter, with the result that the status of information itself has changed, becoming either temporary or permanent according to need. Over the last two decades we have witnessed the enormous growth of information technology with the accompanying advantages of speed, visual -
Language Service Translation SAVER Technote
TechNote August 2020 LANGUAGE TRANSLATION APPLICATIONS AEL Number: 09ME-07-TRAN Language translation equipment used by emergency services has evolved into commercially available, mobile device applications (apps). The wide adoption of cell phones has enabled language translation applications to be useful in emergency scenarios where first responders may interact with individuals who speak a different language. Most language translation applications can identify a spoken language and translate in real-time. Applications vary in functionality, such as the ability to translate voice or text offline and are unique in the number of available translatable languages. These applications are useful tools to remedy communication barriers in any emergency response scenario. Overview During an emergency response, language barriers can challenge a first responder’s ability to quickly assess and respond to a situation. First responders must be able to accurately communicate with persons at an emergency scene in order to provide a prompt, appropriate response. The quick translations provided by mobile apps remove the language barrier and are essential to the safety of both first responders and civilians with whom they interact. Before the adoption of smartphones, first responders used language interpreters and later language translation equipment such as hand-held language translation devices (Figure 1).
The U.S. Department of Homeland Security (DHS) established the System Assessment and Validation for Emergency Responders (SAVER) Program to help emergency responders improve their procurement decisions. Located within the Science and Technology Directorate, the National Urban Security Technology Laboratory (NUSTL) manages the SAVER Program and conducts objective operational assessments of commercial equipment and systems relevant to the emergency responder community. The SAVER Program gathers and reports information about equipment that falls within the categories listed in the DHS Authorized Equipment List -
Terminology Extraction, Translation Tools and Comparable Corpora Helena Blancafort, Béatrice Daille, Tatiana Gornostay, Ulrich Heid, Claude Méchoulam, Serge Sharoff
TTC: Terminology Extraction, Translation Tools and Comparable Corpora Helena Blancafort, Béatrice Daille, Tatiana Gornostay, Ulrich Heid, Claude Méchoulam, Serge Sharoff To cite this version: Helena Blancafort, Béatrice Daille, Tatiana Gornostay, Ulrich Heid, Claude Méchoulam, et al. TTC: Terminology Extraction, Translation Tools and Comparable Corpora. 14th EURALEX International Congress, Jul 2010, Leeuwarden/Ljouwert, Netherlands. pp.263-268. hal-00819365 HAL Id: hal-00819365 https://hal.archives-ouvertes.fr/hal-00819365 Submitted on 2 May 2013 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. TTC: Terminology Extraction, Translation Tools and Comparable Corpora Helena Blancafort, Syllabs, Universitat Pompeu Fabra, Barcelona, Spain Béatrice Daille, LINA, Université de Nantes, France Tatiana Gornostay, Tilde, Latvia Ulrich Heid, IMS, Universität Stuttgart, Germany Claude Mechoulam, Sogitec, France Serge Sharoff, CTS, University of Leeds, England The need for linguistic resources in any natural language application is undeniable. Lexicons and terminologies -
Acknowledging the Needs of Computer-Assisted Translation Tools Users: the Human Perspective in Human-Machine Translation Annemarie Taravella and Alain O
The Journal of Specialised Translation Issue 19 – January 2013 Acknowledging the needs of computer-assisted translation tools users: the human perspective in human-machine translation AnneMarie Taravella and Alain O. Villeneuve, Université de Sherbrooke, Quebec ABSTRACT The lack of translation specialists poses a problem for the growing translation markets around the world. One of the solutions proposed for the lack of human resources is automated translation tools. In the last few decades, organisations have had the opportunity to increase their use of technological resources. However, there is no consensus on the way that technological resources should be integrated into translation service providers (TSP). The approach taken by this article is to set aside both 100% human translation and 100% machine translation (without human intervention), to examine a third, more realistic solution: interactive translation where humans and machines co-operate. What is the human role? Based on the conceptual framework of information systems and organisational sciences, we recommend giving users, who are mainly translators for whom interactive translation tools are designed, a fundamental role in the thinking surrounding the implementation of a technological tool. KEYWORDS Interactive translation, information system, implementation, business process reengineering, organisational science. 1. Introduction Globalisation and the acceleration of world trade operations have led to an impressive growth of the global translation services market. According to EUATC (European Union of Associations of Translation Companies), translation industry is set to grow around five percent annually in the foreseeable future (Hager, 2008). With its two official languages, Canada is a good example of an important translation market, where highly skilled professional translators serve not only the domestic market, but international clients as well. -
Three Directions for the Design of Human-Centered Machine Translation
Three Directions for the Design of Human-Centered Machine Translation Samantha Robertson Wesley Hanwen Deng University of California, Berkeley University of California, Berkeley samantha [email protected] [email protected] Timnit Gebru Margaret Mitchell Daniel J. Liebling Black in AI [email protected] Google [email protected] [email protected] Michal Lahav Katherine Heller Mark Díaz Google Google Google [email protected] [email protected] [email protected] Samy Bengio Niloufar Salehi Google University of California, Berkeley [email protected] [email protected]
Abstract As people all over the world adopt machine translation (MT) to communicate across languages, there is increased need for affordances that aid users in understanding when to rely on automated translations. Identifying the information and interactions that will most help users meet their translation needs is an open area of research at the intersection of Human-Computer Interaction (HCI) and Natural Language Processing (NLP). This paper advances work in this area by drawing on a survey of users’ strategies in assessing translations. We identify three directions for the design of translation systems that support more reliable and effective use of machine translation: helping users craft good inputs, helping users understand translations, and expanding interactivity and adaptivity. We describe how these can be introduced in current MT systems and highlight open questions for HCI and NLP research.
Figure 1: TranslatorBot mediates interlingual dialog using machine translation. The system provides extra support for users, for example, by suggesting simpler input text.
what kinds of information or interactions would best support user needs (Doshi-Velez and Kim, 2017; Miller, 2019). For instance, users may be more invested in knowing when they can rely on an AI system and when it may be making a mistake, -
Seimas of the Republic of Lithuania Resolution No Xiv
SEIMAS OF THE REPUBLIC OF LITHUANIA RESOLUTION NO XIV-72 ON THE PROGRAMME OF THE EIGHTEENTH GOVERNMENT OF THE REPUBLIC OF LITHUANIA 11 December 2020 Vilnius In pursuance of Articles 67(7) and 92(5) of the Constitution of the Republic of Lithuania and having considered the Programme of the Eighteenth Government of the Republic of Lithuania, the Seimas of the Republic of Lithuania, has resolved: Article 1. To approve the programme of the eighteenth Government of the Republic of Lithuania presented by Prime Minister of the Republic of Lithuania Ingrida Šimonytė (as appended). SPEAKER OF THE SEIMAS Viktorija Čmilytė-Nielsen APPROVED by Resolution No XIV-72 of the Seimas of the Republic of Lithuania of 11 December 2020 PROGRAMME OF THE EIGHTEENTH GOVERNMENT OF THE REPUBLIC OF LITHUANIA CHAPTER I INTRODUCTION 1. As a result of the world-wide pandemic, climate change, globalisation, ageing population and technological advance, Lithuania and the entire world have been changing faster than ever before. However, these global changes have led not only to uncertainty and anxiety about the future but also to a greater sense of togetherness and growing trust in each other and in the state, thus offering hope for a better future. 2. This year, we have celebrated the thirtieth anniversary of the restoration of Lithuania’s independence. The state that we have all longed for and taken part in its rebuilding has reached its maturity. The time has come for mature political culture and mature decisions too. The time has come for securing what the Lithuanian society has always held high: openness, responsibility, equal treatment and respect for all. -
The MT Developer/Provider and the Global Corporation
The MT developer/provider and the global corporation Terence Lewis1 & Rudolf M.Meier2 1 Hook & Hatton Ltd, 2 Siemens Nederland N.V. [email protected], [email protected]
Abstract. This paper describes the collaboration between MT developer/service provider, Terence Lewis (Hook & Hatton Ltd) and the Translation Office of Siemens Nederland N.V. (Siemens Netherlands). It involves the use of the developer’s Dutch-English MT software to translate technical documentation for divisions of the Siemens Group and for external customers. The use of this language technology has resulted in significant cost savings for Siemens Netherlands. The authors note the evolution of an ‘MT culture’ among their regular users.
1. Introduction In 2002 Siemens Netherlands and Hook & Hatton Ltd signed an exclusive agreement on the marketing of Dutch-English MT services in the Netherlands. This agreement set the seal on an informal partnership that had evolved over the preceding six-seven years between Siemens and the Developer (Terence Lewis, trading through the private company Hook & Hatton Ltd). Under the contract, Siemens Netherlands undertakes to market the MT Services provided by
2. The Developer’s perspective Under the current arrangement, the Developer (working from home in the United Kingdom) currently acts as the sole operator of an Automatic Translation Environment (Trasy), processing documents forwarded by e-mail by the Translation Office of Siemens Netherlands in The Hague. Ideally, these documents are in MS Word or rtf format, but the Translation Office also receives pdf files or OCR output files for automatic translation. -
NONVIOLENT RESISTANCE in LITHUANIA a Story of Peaceful Liberation
NONVIOLENT RESISTANCE IN LITHUANIA A Story of Peaceful Liberation Grazina Miniotaite The Albert Einstein Institution www.aeinstein.org 2 CONTENTS Acknowledgments Introduction Chapter 1: Nonviolent Resistance Against Russification in the Nineteenth Century The Goals of Tsarism in Lithuania The Failure of Colonization The Struggle for the Freedom of Religion The Struggle for Lithuanian Press and Education Chapter 2: Resistance to Soviet Rule, 1940–1987 An Overview Postwar Resistance The Struggle for the Freedom of Faith The Struggle for Human and National Rights The Role of Lithuanian Exiles Chapter 3: The Rebirth From Perestroika to the Independence Movement Test of Fortitude The Triumph of Sajudis Chapter 4: Towards Independence The Struggle for Constitutional Change Civil Disobedience Step by Step The Rise of Reactionary Opposition Chapter 5: The Struggle for International Recognition The Declaration of Independence Independence Buttressed: the Battle of Laws First Signs of International Recognition The Economic Blockade The January Events Nonviolent Action in the January Events International Reaction 3 Chapter 6: Towards Civilian-Based Defense Resistance to the “Creeping Occupation” Elements of Civilian-Based Defense From Nonviolent Resistance to Organized Civilian-Based Defense The Development of Security and Defense Policy in Lithuania since 1992 Concluding Remarks Appendix I Appeal to Lithuanian Youth by the Supreme Council of the Republic of Lithuania Appendix II Republic in Danger! Appendix III Appeal by the Government of the Republic