Best Practices for XML Internationalization

W3C Working Group Note 13 February 2008

This version: http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/
Latest version: http://www.w3.org/TR/xml-i18n-bp/
Previous version: http://www.w3.org/TR/2007/WD-xml-i18n-bp-20071031/

Editors:
Yves Savourel, ENLASO Corporation
Jirka Kosek, Invited Expert
Richard Ishida, W3C

This document is also available in these non-normative formats: PDF version [1] and XHTML Diff markup to publication from 31 October 2007 [2].

Copyright [3] © 2008 W3C® [4] (MIT [5], ERCIM [6], Keio [7]), All Rights Reserved. W3C liability [8], trademark [9] and document use [10] rules apply.

Abstract

This document provides a set of guidelines for developing XML documents and schemas that are internationalized properly. Following the best practices described here allows both the developer of XML applications and the author of XML content to create material in different languages.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a W3C Working Group Note [11] of "Best Practices for XML Internationalization". This document was developed by the Internationalization Tag Set (ITS) Working Group [12], part of the W3C Internationalization Activity [13].

Feedback about this document is encouraged. Send your comments to www-i18n-com- [email protected]. Use "[Comment on xml-i18n-bp WD]" in the subject line of your email, followed by a brief subject. The archives [14] for this list are publicly available.

Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy [15]. W3C maintains a public list of any patent disclosures [16] made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) [17] must disclose the information in accordance with section 6 of the W3C Patent Policy [18].

1 ⇢ http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/xml-i18n-bp.pdf
2 ⇢ http://www.w3.org/TR/2008/NOTE-xml-i18n-bp-20080213/xml-i18n-bp-diff.html
3 ⇢ http://www.w3.org/Consortium/Legal/ipr-notice#Copyright
4 ⇢ http://www.w3.org/
5 ⇢ http://www.csail.mit.edu/
6 ⇢ http://www.ercim.org/
7 ⇢ http://www.keio.ac.jp/
8 ⇢ http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer
9 ⇢ http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks
10 ⇢ http://www.w3.org/Consortium/Legal/copyright-documents
11 ⇢ http://www.w3.org/2005/10/Process-20051014/tr.html#WGNote
12 ⇢ http://www.w3.org/International/its/
13 ⇢ http://www.w3.org/International/Activity
14 ⇢ http://lists.w3.org/Archives/Public/www-i18n-comments/
15 ⇢ http://www.w3.org/Consortium/Patent-Policy-20040205/
16 ⇢ http://www.w3.org/2004/01/pp-impl/37139/status
17 ⇢ http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential
18 ⇢ http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure

Table of Contents

1 Introduction
  1.1 Who should use this document
  1.2 How to use this document
    1.2.1 Designers and developers of XML applications
    1.2.2 Users and authors of XML content
2 When Designing an XML Application
  Best Practice 1: Defining markup for natural language labelling
  Best Practice 2: Defining markup to specify text direction
  Best Practice 3: Avoiding translatable attribute values
  Best Practice 4: Indicating which elements and attributes should be translated
  Best Practice 5: Defining markup to override translate information
  Best Practice 6: Providing information related to text segmentation
  Best Practice 7: Defining markup for ruby text
  Best Practice 8: Defining markup for notes to localizers
  Best Practice 9: Defining markup for unique identifiers
  Best Practice 10: Identifying terminology-related elements
  Best Practice 11: Defining markup for specifying or overriding terminology-related information
  Best Practice 12: Working with multilingual documents
  Best Practice 13: Naming elements and attributes
  Best Practice 14: Defining a span-like element
  Best Practice 15: Documenting internationalization and localization features of your schema
3 When Authoring XML Content
  Best Practice 16: Specifying the language of content
  Best Practice 17: Specifying text directionality
  Best Practice 18: Overriding information about what should be translated
  Best Practice 19: Assigning unique identifiers
  Best Practice 20: Avoiding CDATA sections
  Best Practice 21: Providing notes for localizers
  Best Practice 22: Working with inserted text
  Best Practice 23: Identifying terms
  Best Practice 24: Storing markup from another format
4 Generic Techniques
  4.1 Writing ITS Rules
    4.1.1 Precedence and Inheritance
    4.1.2 Dealing with namespaces
    4.1.3 Create your XPath expressions with care
  4.2 Example of adding an attribute to an existing schema
    4.2.1 Including xml:lang in XML Schema
    4.2.2 Including xml:lang in RELAX NG
    4.2.3 Including xml:lang in an XML DTD
5 ITS Applied to Existing Formats
  5.1 ITS and XHTML 1.0
    5.1.1 Integrating ITS into XHTML
    5.1.2 Using XHTML Modularization 1.1 for the Definition of ITS
    5.1.3 Using NVDL to integrate ITS into XHTML
    5.1.4 Associating existing XHTML markup with ITS
  5.2 ITS and TEI
    5.2.1 Integrating ITS into TEI
  5.3 ITS and XML Spec
    5.3.1 Integration of ITS into XML Spec
    5.3.2 Associating existing XML Spec markup with ITS
  5.4 ITS
