Transforming Data

XML Professional Publisher: Transforming Data
for use with XPP 9.3
November 2018

Notice

© SDL Group 1999, 2003-2005, 2009, 2012-2018. All rights reserved. Printed in U.S.A.

SDL Group has prepared this document for use by its personnel, licensees, and customers. The information contained herein is the property of SDL and shall not, in whole or in part, be reproduced, translated, or converted to any electronic or machine-readable form without prior written approval from SDL.

Printed copies are also covered by this notice and subject to any applicable confidentiality agreements.

The information contained in this document does not constitute a warranty of performance. Further, SDL reserves the right to revise this document and to make changes from time to time in the content thereof. SDL assumes no liability for losses incurred as a result of out-of-date or incorrect information contained in this document.

Trademark Notice

See the Trademark file at http://docs.sdl.com for trademark information.

U.S. Government Restricted Rights Legend

Use, duplication or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 or other similar regulations of other governmental agencies, which designate software and documentation as proprietary. Contractor or manufacturer is SDL Group, 201 Edgewater Drive, Wakefield, MA 01880-6216.

Contents

Part I Importing and Transforming Text

Chapter 1 Unicode Support in XPP

Working with text and Unicode
XML/SGML
ASCII Text
XyASCII Text
XSF Text
The Xyvision Character Set
Transforming Content from Earlier XPP Versions
Running the Transformation Process Manually
Character Transformation
Page Transformation
Style Transformation
Transforming Between ASCII and Unicode
ASCII File to XSF Summary
XSF to ASCII Summary
Using ASCII Files
Working with ASCII Files
Default Naming Conventions
Edit ASCII File or View File as ASCII
Using Tag Primitives to Separate Text into Streams
Understanding Tag Primitives
Using Stream Diversion in the Export Process
Stream Order
Stream-End Delimiter
Using Stream Diversion in the Import Process
Footnote and Pickup References
Marking Footnotes in ASCII Text
Marking Pickups in ASCII Text
Marking Pickup Elements in ASCII Text
Arguments for Graphic Tag Primitives
Arguments for Pickup Element Blocks
Marking Stories in ASCII Text
Converting Microns to Standard Units
XSF Text / ASCII Examples
Example 1: XSF Text
Example 2: ASCII Text Strings

Chapter 2 FromXSF and ToXSF

Overview
Using FromXSF
Running FromXSF from PathFinder
Running FromXSF from the Command Line
Using FromXSF on Multiple Divisions within a Job
Inserting Header and Trailer Strings With FromXSF
Inserting Header and Trailer Strings in XML/SGML Mode
/FOOT
/FOOTPG
Preserving Editing Traces with FromXSF
Examining a File Containing Trace Xyps
Using FromXSF in XML/SGML Mode
Suppressing XPP PI in XML/SGML Mode
Selecting Type of Output
Specifying Units for Layout Field Information
Using ToXSF
ToXSF Support
XyASCII
CJK Characters
Autoprocessing
Numeric Character References
Running ToXSF
Running ToXSF from PathFinder
Running ToXSF from the Command Line
Replace Modes
Page Replace/Insert/Append Mode (-Rep and -Ret)
Replacing Main Pages
Inserting Main Pages
Appending Main Pages
Replace/Insert/Append Footnotes, Pickups and Stories
Replacing Split Pickups
Updating Frozen Pickups
–Rep versus –Ret
Page Re-Create Mode (–Rec)
Page Replace Mode (–Rep and –Ret) and Maximum Page Sizes
Layout Control Designators (–layout and –nofrills)
SGML Character Entities (–asc)
ToXSF and Line Endings
Loop Processing of Input Text (–Ln)
ToXSF Error Checking
Legal Non-printing Control Characters
Missing Begin/End Tag, Xycode, or Tag Primitive Characters
ToXSF Page Formats and Sizes
Footnote Tag Primitives
ToXSF Help File
Tag Primitives Help File
Restoring Divisions

Chapter 3 XyChange

Understanding XyChange
An Overview of the Transformation Process
XyChange/XSLT
What Do You Need To Transform?
Setting Up XyChange Transformation Tables
Running XyChange
Transformation Types
Unconditional Transformation
Conditional Transformation
Writing Transformation Tables
Setting Up XPP Transformation Tables
Completing the Job Ticket
Creating Files for Transformation Tables, Scripts, and Style Sheets
Sample Perl Script
Accessing a Transformation Table, Script, Style Sheet
Structure of a Transformation Table
XPP Transformation Table Fields
Using Match and Output Strings
Guidelines for Match and Output Strings
Escape Sequences in Match and Output Strings
Escape Sequence Formats
XSF Characters and ASCII Strings
Notes on Translating Special XPP Characters
Running the XyChange Program
XyChange Processing Sequence
Guidelines for Running XyChange
Running XyChange from PathFinder
Running XyChange from the Command Line
Sample XyChange Transformation Process
ASCII File Before Transformation
Sample Transformation Table Rules
ASCII File After Translation
Composed File in Page Mode
XyChange with Xalan Processing

Chapter 4 Advanced XyChange Techniques

Using Wildcards in Transformation Rules
Single-Character Wildcards in Match Strings
Guidelines for Using Single-Character Wildcards
Multiple-Character Wildcards in Match Strings
Guidelines for Using Multiple-Character Wildcards
Using Wildcards in Output Strings
Using If True and Then Do Fields
Guidelines for the If True Field
Comparative Operators
Binary
