Translating and the Computer 36

Total Page:16

File Type:pdf, Size:1020Kb

Translating and the Computer 36 seeeeeeeeeeeeeeee Editions Tradulex, Geneva © AsLing, The International Association for Advancement in Language Technology, 2014 Distribution without the authorisation from ASLING is not allowed. These proceedings are downloadable from www.tradulex.com 1 Conference Chairs and Editors of the Proceedings João Esteves-Ferreira, Tradulex – International Association for Quality Translation. Juliet Macan, Arancho Doc srl. Ruslan Mitkov, University of Wolverhampton. Olaf-Michael Stefanov, JIAMCATT - International Annual Meeting on Computer-Assisted Translation and Terminology, United Nations (ret.). Programme Commitee Alain Désilets, National Research Council of Canada (NRC). David Chambers, World Intellectual Property Organisation (ret.). Gloria Corpas Pastor, University of Malaga. Estelle Delpech, Nomao. David Flip, LRC - Localisation Research Centre (Ireland), CNGL - Centre for Global Intelligent Content (Ireland). Web, University of Limerick. Pamela Mayorcas, Fellow of the Institute of Translation and Interpreting (FITI). Paola Valli Nelson Verástegui, International Telecommunications Union (ITU). Conference Manager Nicole Adamides Technical Advisor Jean-Marie Vande Walle Editorial Assistants Míriam Urbano Mendaña Petya Petkova 2 Acknowledgements AsLing wishes to thank and acknowledge the support of the sponsors of TC36 3 Preface 2014 has been one of profound changes for the Translating and the Computer conference that has managed to maintain over 36 years its character of a unique forum for researchers, developers and users. Bringing together academics involved in language technology research and in teaching translation and terminology with those who develop and market tools for language transformation and both of these groups with practitioners: translators, terminologists, interpreters, and voice-over specialists, whether freelancers or working in translation departments of large organisations, international companies as well as Language Services Providers (LSPs), large and small. At the end of April the Chairs were invited to take over the entire organisation of the event previously run by Aslib. In the midst of many doubts and uncertainties we decided to accept this challenge. To do this, we set up a legal entity, AsLing (Association internationale pour la promotion des technologies linguistiques), a not for profit association, to run the conference and promote the advancement of language technology. In June the adventure began! Despite the inevitable delays in the call for papers and deadlines that dangerously clashed with the holiday season, we had a generous response. We would therefore like to thank all those who believed in our initiative and provided the worthy submissions contained in this volume. In particular we would like to thank our keynotes who so generously gave us credit. Again we are also grateful to those who agreed to the new opportunity to present Posters and Workshops to extend the programme and provide a rich and interesting opportunity for our delegates. We were very fortunate in convincing the Programme Committee members to step in and provide expert support in selecting the submissions and providing feedback to the authors as well as helping with the conference itself. You will also be pleased to know that the volume has been assembled with care by our attentive editorial assistants under the guidance and supervision of the Editors. Finally, we would like to say a big thank you to: Nicole Adamides, whom we are delighted to welcome back, Jean-Marie Vande Walle, and naturally our sponsors without whom the conference would not have taken place and consequently these Proceedings would not exist. Conference Chairs João Esteves-Ferreira, Juliet Margaret Macan, Ruslan Mitkov, Olaf-Michael Stefanov. 4 Programme Session chair: THURSDAY 27 November 2014 MORNING Session (incl. Lunch) Olaf-Michael Stefanov Lecture Theatre George Stephenson Room Time Speakers Title Time Title – Presenter/Moderator 08:30 Registration Lead Chair & Welcome and Everyone is invited to the Lecture 09:00 Session Chair Introduction Theatre for the conference Opening Fill in the gaps: what we 09:15 Kevin Flanagan need from TM subsegment recall Some thoughts on progress Gábor Prószéky in MT — almost fifty years Everyone is invited to the Lecture 09:45 Keynote – Day 1 after the (first?) ALPAC Theatre for the Keynote report 10:35 Break Break Everyone is invited to the Lecture Workshop infos Theatre for an overview of the Workshop moderators 10:55 Session Chair (8 x 5 min.) and Workshops and Poster Poster overview Presentations to be held in the George Stephenson Room Beyond prescription: What Miguel A. empirical studies are telling 12:00 Workshop 1: Jimenez Crespo us about localization SDL trados (Gold Sponsor) crowdsourcing Using Cross-Language 12:00–13:00 “Productivity beyond the Information Retrieval and Translation Memory” Nasredine 12:30 Meaning-Text Theory in - Lydia Simplicio Semmar Example-Based Machine Technical Business Consultant Translation P1 13:25–13:50 Lunch 13:00 Lunch 13:55–14:20 (incl. 2 Poster sessions) P2 Poster Presenter Additional Author(s) Title of Poster Presentation Representing intra-and interlingual terminological P1 Koen variation in a new type of translation resource: a 13:30 Kerremans prototype proposal iCompileCorpora: A Web-based Application to Semi- P2 Gloria Corpas Pastor, Hernani Costa automatically Compile Multilingual Comparable 14:00 Miriam Seghiri Corpora 5 Session chair: THURSDAY 27 November 2014 AFTERNOON Session (Lunch listed here again) Ruslan Mitkov Lecture Theatre George Stephenson Room Time Speakers Title Time Title – Presenter/Moderator 13:25–13:50 Lunch P1 13:00 Lunch 13:55–14:20 (incl. 2 Poster sessions) P2 Nizar Ghoula, Terminology Management 14:30 Jaques Guyot, Revisited Workshop 2: & Gilles Falquet Victoria Porro, “Quality Assurance Process in Rule-based automatic post- 14:30–15:30 Johanna Gerlach, Translation” processing of SMT output to 15:00 Pierrette reduce human post-editing - Jerzy Czopik Bouillon, & effot: a case study Violeta Seretan Break 15:30 Break 15:30–15:50 P3 (incl. 1 Poster session) Tom Improving fuzzy matching Vanallemeersch Workshop 3: 15:50 through syntactic & Vicent MateCat (Gold Sponsor) knowledge Vandeghinste “Free. A new business model for 15:50–16:50 CAT tools” Integrating Machine - Translation (MT) in the Alessandro Cattelan Marion 16:20 Higher Education of Director of Localization, Wittkowsky Translators and Technical Translated.net Writers. AsLing Everyone is invited to the Special Award 16:50 Executive Lecture Theatre for the Award Ceremony Committee Ceremony Workshop 4: 17:00 Gold sponsors Product presentations “Top-down or bottom-up: what do industry approaches to 17:00–18:00 translation quality mean for effective integration of 17:30 Close of Day 1 standards and tools?” - Joanna Drugan Poster Presenter Additional Author(s) Title of Poster Presentation Representing intra-and interlingual terminological P1 Koen variation in a new type of translation resource: a 13:30 Kerremans prototype proposal iCompileCorpora: A Web-based Application to Semi- P2 Gloria Corpas Pastor, Hernani Costa automatically Compile Multilingual Comparable 14:00 Miriam Seghiri Corpora P3 Rohit Gupta, Constantin Intelligent Translation Memory Matching and Hanna Bechara 15:30 Orasan Retrieval Metric Exploiting Linguistic Technology 6 Session chair: FRIDAY 28 November 2014 Juliet Macan MORNING Session (incl. Lunch) Lecture Theatre George Stephenson Room Time Speakers Title Time Title – Presenter/Moderator “Terminology finding in the Sketch Getting the best out of a 09:00 Terence Lewis 09:00–09:30 Engine: an Evaluation” mixed bag - Adam Kilgarriff Angelika Translation Tools are Everyone is invited to the Lecture 09:30 Zerfass TOOLS. How do we make Theatre for the Keynote Keynote – Day2 the most of them? Break 10:20 Break 10:20–10:40 P4 (incl. 1 Poster session) The Dos and Don’ts of XML Workshop 5: 10:50 Andrzej Zydroń document localization “Quality Evaluation today: the 10:50–11:50 Dynamic Quality Framework” 11:20 Kurt Eberle AutoLearn<word> - A. Görög Far From the Maddening Crowd: Integrating Collaborative Translation 11:50 Erin Lyons Technologies into Workshop 6: Healthcare Services in the 11:50–12:50 Kilgray / MemoQ Developing World (Silver Sponsor) Antonio Toral, Is Machine Translation 12:20 & Andy Way Ready for Literature? Lunch 12:50 Lunch 13:45–14:05 P5 (incl. 2 Poster sessions) Poster Presenter Additional Author(s) Title of Poster Presentation P4 Sabine Machine Translation Quality Estimation Adapted to Alexandru Ceausu 10:20 Hunsicker the Translation Workflow Alejandro Armando, P5 A Tool for Building Multilingual Voice Manny Rayner Pierrette Bouillon, Nikos 13:00 Questionnaires Tsourakis 7 Session chair: FRIDAY 28 November 2014 AFTERNOON Session (Lunch listed here again) David Chambers Lecture Theatre George Stephenson Room Time Speakers Title Time Title – Presenter/Moderator Lunch 12:50 Lunch 13:45–14:05 P5 (incl. 2 Poster sessions) Toolkit 2020. What have we got? What do Both rooms will be integrated into the panel 14:15 Panel Debate we need? How do we debate get there? 15:30 Break Break Translating implicit 15:45 Irina Burukina elements in RBMT Losses and Gains in Workshop 7: Mozhgan Computer-Assisted “Solving Terminology Problems Ghassemiazghandim Translation: Some 16:10 More Quickly whith & Tengku Sepora Remarks on Online 15:45–16:45 ‘IntelliWebSearch (Almost) Tengku Mahadi Translation of English to Malay Unlimited’ ” Affective Impact of the - Michael Farrell
Recommended publications
  • A Semantic Model for Integrated Content Management, Localisation and Language Technology Processing
    A Semantic Model for Integrated Content Management, Localisation and Language Technology Processing Dominic Jones1, Alexander O’Connor1, Yalemisew M. Abgaz2, David Lewis1 1 & 2 Centre for Next Generation Localisation 1Knowledge and Data Engineering Group, 1School of Computer Science and Statistics, Trinity College Dublin, Ireland {Dominic.Jones, Alex.OConnor, Dave.Lewis}@scss.tcd.ie 2School of Computing, Dublin City University, Dublin, Ireland 2 [email protected] Abstract. Providers of products and services are faced with the dual challenge of supporting the languages and individual needs of the global customer while also accommodating the increasing relevance of user-generated content. As a result, the content and localisation industries must now evolve rapidly from manually processing predicable content which arrives in large jobs to the highly automated processing of streams of fast moving, heterogeneous and unpredictable content. This requires a new generation of digital content management technologies that combine the agile flow of content from developers to localisers and consumers with the data-driven language technologies needed to handle the volume of content required to feed the demands of global markets. Data-driven technologies such as statistical machine translation, cross-lingual information retrieval, sentiment analysis and automatic speech recognition, all rely on high quality training content, which in turn must be continually harvested based on the human quality judgments made across the end-to-end content processing flow. This paper presents the motivation, approach and initial semantic models of a collection of research demonstrators where they represent a part of, or a step towards, documenting in a semantic model the multi-lingual semantic web.
    [Show full text]
  • Euractiv Proposal
    EurActiv Proposal Andrzej Zydroń MBCS CTO XTM International, Balázs Benedek CTO Easyling Andrzej Zydroń CTO XTM-Intl • 37 years in IT , 25 of those related to Localization • Member of British Computer Society • Chief Technical Architect @ Xerox , Ford , XTM International • 100% track record design and delivery of complex systems for European Patent Office , Xerox , Oxford University , Ford , XTM International • Expert on computional aspects of L10N and related Open Standards • Co-founder XTM International • Technical Architect of XTM Cloud • Open Standard Technical Committees: LISA OSCAR GMX LISA OSCAR xml:tm LISA OSCAR TBX W3C ITS OASIS XLIFF OASIS Translation Web Services OASIS DITA Translation OASIS OAXAL ETSI LIS Interoperability Now! TIPP and XLIFF:doc XTM International – company background • XTM International was formed in 2002 • Independent TMS & CAT tool developer • Software development & support teams based in Poland • Sales & Marketing in UK and Ireland • XTM Cloud was launched 2010 & is available as: – Public cloud – Accessed on XTM International’s servers – Private cloud – Installed on your servers – 100% Open Standards based : OAXAL – Open APIs, infinitely scalable modular SOA Architecture design – Next Generation SaaS TMS/CAT solution XTM International – The Team • 50 People • 30 man software development team • All Software engineers have Computer Science MSc • Efficient and effective Software Support Team • Extensive experience in 3rd party systems integration , including: XTM Cloud design principles • XML - XTM is built
    [Show full text]
  • Metadata-Group Report
    Open and Flexible Metadata in Localization How often have you tried to open a Translation Memory (TM) created with one computer-aided translation (CAT) tool in another CAT tool? I assume pretty often. In the worst case, you cannot open the TM. In the best case, you can open it, but data and metadata are lost. You aren’t able to tell which strings have been locked, which are under review and so on. The standard Translation Memory eXchange (TMX), developed by the Localization Industry Standards Association (LISA)’s standards committee called OSCAR (Open Standards for Container/content Allowing Reuse) undoubtedly makes the exchange of TM data easier and does not lock the translators in a specific tool or tool provider. Also, the standard XML Localisation Interchange File Format (XLIFF), developed under the auspices of the Organization for the Advancement of Structured Information Standards (OASIS) is an interchange file format which exchanges localization data and can be used to exchange data between companies, such as a software publisher and a localization vendor, or even between localization tools. Both TMX and XLIFF are important standards for the localization process. These standards have their own format, though the synergy is there: XLIFF’s current version 1.2 borrows from the TMX 1.2 specification, and the inline markup XLIFF support in TMX 2.0 is currently in progress. There is a range of standard data formats, apart from TMX and XLIFF, such as darwin information typing architecture (DITA), attached to OASIS, Internationalization Tag Set (ITS), put out by W3C, Segmentation Rules eXchange (SRX), affiliated with LISA/OSCAR along with Global Information Management Metrics eXchange-Volume (GMX-V) and so on.
    [Show full text]
  • OAXAL Open Architecture for XML Authoring and Localization
    OAXAL Open Architecture for XML Authoring and Localization June 2010 Andrzej Zydroń: [email protected] Why OAXAL? Why OAXAL? Globalization Standards Globalization Standards Interoperability Interoperability Globalization Standards • Can we imagine world trade without the Shipping container • World trade would be significantly hampered • World GDP would be significantly reduced • Billions of people would be condemned to a life of constant poverty Why Open Standards? • Usability • Interoperability • Exchange • Risk reduction • Investment protection • Reduced implementation costs Why Open standards • Free – no fees • Input is from accredited volunteers • Democratic process • Extensive peer review • Extensive public review • Well documented Localization Standards . Parent organization • LISA OSCAR, ISO, W3C, OASIS . Constitution • IP policy, membership and committee rules . Membership • Company, academic, individual . Technical committee • OASIS XLIFF, LISA OSCAR GMX, W3C ITS . Extensive peer review and discussions . Public review process Localization Standards . Standards matter . Reduce costs . Improve interoperability . Improve quality . Create more competition True Cost of Translation Localization without Standards Too Many Standards? • W3C ITS Document Rules • Unicode TR29 • LISA OSCAR SRX • LISA OSCAR xml:tm • LISA OSCAR TMX • LISA OSCAR GMX • OASIS XLIFF • W3C/OASIS DITA, XHTML, DocBook, or any component based XML Vocabulary DITA XLIFF SRX TMX ? GMX Unicode TR29 xml:tm W3C ITS OAXAL Open Architecture for XML Authoring and Localization
    [Show full text]
  • App Localization SDL LSP Partner Program Our Partners
    Language | Technology | Business January/February 2015 Focus: Cloud Technology Technology: App localization SDL LSP Partner Program Our Partners 1 to 1 Translations Etymax Ltd Language Connect RWS Group 1-800-translate Eurhode Traduction Language Line Translation RWS Group Deutschland GmbH 3ic International Euris Consult Ltd Solutions Sandberg Translation Partners Ltd À Propos eurocomTranslation Services Language Link Uk Santranslate Ltd A.C.T. Fachübersetzungen GmbH GmbH Language Translation Inc. sbv anderetaal a+a boundless communication Eurotext AG Languages for Life Ltd Semantix AA Translations Limited Eurotext Translations LATN Inc. SH3 Inc. Absolute Translations Ltd Exiga Solutions GmbH Lemoine International GmbH Soget s.r.l. Accurate Translations Ltd Ferrari Studio Lexilab Sprachendienst Dr. Herrlinger Accu Translation Services Ltd Five Office Ltd Lingsoft Inc GmbH Advanced Language Translation Fokus Translatørerne Traductores jurados LinguaVox S.L. SprachUnion Inc Foreign Translations, Inc Linguistic Systems Inc Supertext AG AKAB SRL Fry & Bonthrone Partnerschaft Link Translation Bureau Synergium.eu Alaya Inc Gedev LIT - Lost in Translation srl Synergy Language Services Alexika Ltd Geneva Worldwide Inc Local Concept Tech-Lingua Bt. ALINEA Financial Translations Geotext Translations LocalEyes Ltd Techtrans GmbH All Translations Company GFT GmbH Locasoft GmbH Techworld Language Solutions Alpha CRC Ltd GIB consult SPRL Logos Group - Multilingual TECKNOTRAD ALTA Language Services Global LT LTD Translation Services TecTrad Altica Traduction Global Translations GmbH Louise Killeen Translations Limited Ten-Nine Communications Inc. AmeriClic Global Translations Solutions Ltconsult Texo SRL Apostroph AG Global Words aps Mastervoice Texpertec GmbH Arc Communications Gradus Multilingual Services Inc MasterWord Services Inc. TextMinded Arkadia Translations Srl HCR Mc Lehm Translation Services textocreativ AG Arvato Technical Information Sl hCtrans Company Limited MCL Corporation Textualis Babylon Translations Ltd Hedapen Global Services Media Research Inc.
    [Show full text]
  • How to Link Productivity and Quality Andrzej Zydron
    How to link produc<vity and quality Andrzej Zydroń CTO XTM Intl Translang Europe, Warszawa 2016 Language is difficult Language is difficult Language is organic Language is diverse Language is human 30 billion cells, 100 trillion synapses UG UG Morphology Spectrum Primi've morphology Extremely rich Informaon Technology Evolu'on 1945- 1975- 1985- 2000- 2010- 2006- Mainframe Mini WorKstaon/PC Laptop Tablet Cloud Informaon Technology Evolu'on The Cloud A Connected World Unimaginable Scales Algorithmic advances Turing/von Neumann architecture John von Neumann Alan Turing 1903-1957 1912-1954 von Neumann architecture von Neumann architecture limitaons = The right tool for the job The right tool for the job The Human Brain 30 billion cells, 100 trillion synapses Von Neuman architecture does not scale DARPA Cat brain project Pung Things Into Perspec've Innovaon in Translaon Technology • Standards ✴ Unicode ✴ XML ✴ L10N Interoperability (TMX, XLIFF, TBX, TIPP etc.) ✴ Quality QT Launchpad, TAUS DQF • Internet ✴ Resources (sharing, accessing) ✴ Communication • Automation ✴ Translation Management Systems (TMS) ✴ Automated file processing ✴ Collaborative workflows ✴ Connected real time resource sharing • Advanced algorithmic technology ✴ Web Services ✴ Voice Recognition ✴ NLP ✴ SMT, NMT ✴ POS analysers ✴ Stemmers ✴ Terminology extraction (monolingual, bilingual) ✴ High quality dictionary based bilingual text alignment ✴ Linked data lexicons Why Standards? Why Standards? Why have Standards? ISO Standard Standards = Efficiency Standards = Lower Costs Standards
    [Show full text]
  • Dita + Xliff + Cms
    www.oasis-open.org AutomatingAutomating ContentContent Localization:Localization: DITA + XLIFF + CMS TonyTony JewtushenkoJewtushenko Co-Chair, OASIS XLIFF TC Director, Product Innovator Ltd, Dublin Ireland [email protected] What is a CMS? Creates or acquires content Organizes and structures the content Stores content and metadata in a repository Business and workflow rules that customize and retrieve content Publishes format & output Why use a CMS? Benefits any organization producing content with: z Lots of contributors z Lots of revisions z Lots of content z Lots of different publication formats z Need for fast turnaround z Need for high quality www.oasis-open.org Content Management Systgem (CMS) Workflow ng> i Document Review / Review / hor t Creation Edit Approve u A < > y r o t i s Store/ Store/ Deploy Archive update update epo <R ng> hi HTML s i PDF XHTML Help ubl <P > e t a l ans <Tr What is a GMS? A Globalization Management System consists of: z A CMS to connect with z CMS Connectors z Workflow Engine z Translation Repository z CAT Software: TM, TermBase, MT In a Nutshell: z A CMS + Workflow for Pre-translation www.oasis-open.org CMS Workflow with GMS > ng i Document Review / Review / hor Creation Edit Approve ut A < y> or t i Store/ Store/ Deploy Archive update update epos R < > ng hi HTML s i PDF XHTML Help ubl P < > te a l GMS ans r T < Elements of a GMS CMS’s / GMS’s that support DITA and XLIFF An XLIFF document is a container for all localization data 1.
    [Show full text]
  • Limitations of Localisation Data Exchange Standards and Their Effect on Interoperability
    Lossless Exchange of Data: Limitations of Localisation Data Exchange Standards and their Effect on Interoperability A Thesis Submitted for the Degree of Doctor of Philosophy By Ruwan Asanka Wasala Department of Computer Science and Information Systems, University of Limerick Supervisors: Reinhard Schäler, Dr. Chris Exton Submitted to the University of Limerick, May, 2013 Abstract Localisation is a relatively new field, one that is constantly changing with the latest technology to offer services that are cheaper, faster and of higher quality. As the complexity of localisation projects increases, interoperability between different localisation tools and technologies becomes ever more important. However, lack of interoperability is a significant problem in localisation. One important aspect of interoperability in localisation is the lossless exchange of data between different technologies and different stages of the overall process. Standards play a key role in enabling interoperability. Several standards exist that address data-exchange interoperability. The XML Localisation Interchange File Format (XLIFF) has been developed by a Technical Committee of Organization for the Advancement of Structured Information Standards (OASIS) and is an important standard for enabling interoperability in the localisation domain. It aims to enable the lossless exchange of localisation data and metadata between different technologies. With increased adoption, XLIFF is maturing. Solutions to significant issues relating to the current version of the XLIFF standard and its adoption are being proposed in the context of the development of XLIFF version 2. Important matters for a successful adoption of the standard, such as standard-compliance, conformance and interoperability of technologies claiming to support the standard have not yet been adequately addressed.
    [Show full text]
  • Translations in Libre Software
    UNIVERSIDAD REY JUAN CARLOS Master´ Universitario en Software Libre Curso Academico´ 2011/2012 Proyecto Fin de Master´ Translations in Libre Software Autor: Laura Arjona Reina Tutor: Dr. Gregorio Robles Agradecimientos A Gregorio Robles y el equipo de Libresoft en la Universidad Rey Juan Carlos, por sus consejos, tutor´ıa y acompanamiento˜ en este trabajo y la enriquecedora experiencia que ha supuesto estudiar este Master.´ A mi familia, por su apoyo y paciencia. Dedicatoria Para mi sobrino Dar´ıo (C) 2012 Laura Arjona Reina. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 license, available in http://creativecommons.org/licenses/by-sa/3.0/ Source files for this document are available at http://gitorious.org/mswl-larjona-docs 4 Contents 1 Introduction 17 1.1 Terminology.................................... 17 1.1.1 Internationalization, localization, translation............... 17 1.1.2 Free, libre, open source software..................... 18 1.1.3 Free culture, freedom definition, creative commons........... 20 1.2 About this document............................... 21 1.2.1 Structure of the document........................ 21 1.2.2 Scope................................... 22 1.2.3 Methodology............................... 22 2 Goals and objectives 23 2.1 Goals....................................... 23 2.2 Objectives..................................... 23 2.2.1 Explain the phases of localization process................ 24 2.2.2 Analyze the benefits and counterparts.................. 24 2.2.3 Provide case studies of libre software projects and tools......... 24 2.2.4 Present personal experiences....................... 24 3 Localization in libre software projects 25 3.1 Localization workflow.............................. 25 3.2 Prepare: Defining objects to be localized..................... 27 3.2.1 Some examples.............................
    [Show full text]
  • IJCNLP 2011 Proceedings of the Workshop on Language Resources,Technology and Services in the Sharing Paradigm
    IJCNLP 2011 Proceedings of the Workshop on Language Resources,Technology and Services in the Sharing Paradigm November 12, 2011 Shangri-La Hotel Chiang Mai, Thailand IJCNLP 2011 Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm November 12, 2011 Chiang Mai, Thailand We wish to thank our sponsors Gold Sponsors www.google.com www.baidu.com The Office of Naval Research (ONR) Department of Systems Engineering and The Asian Office of Aerospace Research and Devel- Engineering Managment, The Chinese Uni- opment (AOARD) versity of Hong Kong Silver Sponsors Microsoft Corporation Bronze Sponsors Chinese and Oriental Languages Information Processing Society (COLIPS) Supporter Thailand Convention and Exhibition Bureau (TCEB) We wish to thank our sponsors Organizers Asian Federation of Natural Language National Electronics and Computer Technolo- Processing (AFNLP) gy Center (NECTEC), Thailand Sirindhorn International Institute of Technology Rajamangala University of Technology Lanna (SIIT), Thailand (RMUTL), Thailand Chiang Mai University (CMU), Thailand Maejo University, Thailand c 2011 Asian Federation of Natural Language Proceesing vii Introduction The Context Some of the current major initiatives in the area of language resources – FLaReNet (http://www.flarenet.eu/), Language Grid (http://langrid.nict.go.jp/en/index.html) and META-SHARE (www.meta-share.org, www.meta-net.eu) – have agreed to organise a joint workshop on infrastructural issues that are critical in the age of data sharing and open data, to discuss the state of the art, international cooperation, future strategies and priorities, as well as the road-map of the area. It is an achievement, and an opportunity for our field, that recently a number of strategic-infrastructural initiatives have started all over the world.
    [Show full text]
  • Language Industry Standards and Guidelines
    CELAN WP 2 – DELIVERABLE D2.1 ANNOTATED CATALOGUE OF BUSINESS-RELEVANT SERVICES, TOOLS, RESOURCES, POLICIES AND STRATEGIES AND THEIR CURRENT UPTAKE IN THE BUSINESS COMMUNITY ANNEX 2 INVESTIGATION OF BUSINESS-RELEVANT STANDARDS AND GUIDELINES IN THE FIELDS OF THE LANGUAGE INDUSTRY Project Title: CELAN Project Type: Network Project Programme: LLP – KA2 Project No: 196466-LLP-1-2010-1-BE-KA2-KA2PLA Version: 1.2 Date: 2013-01-30 Author: Blanca Stella Giraldo Pérez (sub-contract for standards investigation and analysis) Contributors: Infoterm (supervision), other CELAN partners (comments) The CELAN network project has been funded with support from the European Commission, LLP programme, KA2. This communication reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein. CELAN D2.1 ANNEX 2_fv1.2 Executive Summary The investigation of industry&business-relevant standards and guidelines in the fields of the language industry (LI) was subdivided into four parts: General standardization framework relevant to CELAN, Basic standards related to the ICT infrastructure with particular impact on the LI, Specific standards pertaining to language technologies, resources, services and LI related competences and skills, Latest developments with respect to the complementarity of LI standards and assistive technologies standards. At the end of the investigation a summary and recommendations are given. Standards can play an important role in support of developing LI policies/strategies, educational schemes, language technology tools (LTT) and language and other content resources (LCR), language services or using the offers of language service providers (LSP). They can definitely contribute to the design of better products and the saving of financial resources from the outset and thus to avoid the need of having to retrofit products in later stages of their life cycle at significantly higher costs.
    [Show full text]
  • Multilingualweb – Language Technology a New W3C Working Group
    Why Localisation Standardisation Activities Matter to You MultilingualWeb – Language Technology A New W3C Working Group Pedro L. Díez Orzas A few words about who is talking: Pedro L. Díez-Orzas CEO at Linguaserve I.S. S.A Professor at Univ. Complutense de Madrid PhD in Computational Linguistics Member of MultilingualWeb-LT W3C Member CTN 191 – AENOR – Terminology [email protected] Standards They are great. Everyone should have their own. New Standards for Old Needs http://xkcd.com/927/ New Standards for New Needs Viability of Multilinguality on the Web depends on the level of mechanisation of methods and processes. A certain level of standard metadata can decisively help to make this happen. But it cannot always take that long… The Space Shuttle and the Horse's Rear End http://www.astrodigital.org/space/stshorse.html MultilingualWeb‐LT •New W3C Working Group under I18n Activity – http://www.w3.org/International/multilingualweb/lt/ • Aims: define meta‐data for web content that facilitates its interaction with language technologies and localization processes. •Already have 28 participants from 20 organisations – Chairs: Felix Sasaki, David Filip, Dave Lewis • Timeline: –Feature Freeze Nov 2012 – Recommendation complete Dec 2013 EU Project FP7‐ICT MLW‐LT Approach • Standardise Data Categories –ITS (1.0) has: Translate, Loc note, Terminology, Directionality, Ruby, Language Info, Element Within Text –MLW‐LT could add: MT‐specific instructions, workflow, quality‐related provenance, legal? •Map to formats –ITS focussed on XML •useful for XHTML, DITA, DocBook –MLW‐LT also targets HTML5 and CMS‐based ‘deep web’ –Use of microdata and RDFa • Uses Cases MLS‐LT Main Tasks • Develop this metadata standard through broad consensus across communities, involving content producers, localisation workers, language technology experts, browser vendors and users.
    [Show full text]