Configurable Editing of XML-Based Variable-Data Documents John Lumley, Roger Gimson, Owen Rees HP Laboratories HPL-2008-53

Configurable Editing of XML-based Variable-Data Documents John Lumley, Roger Gimson, Owen Rees HP Laboratories HPL-2008-53 Keyword(s): XSLT, SVG, document construction, functional programming, document editing Abstract: Variable data documents can be considered as functions of their bindings to values, and this function could be arbitrarily complex to build strongly-customised but high-value documents. We outline an approach for editing such documents from example instances, which is highly configurable in terms of controlling exactly what is editable and how, capable of being used with a wide variety of XML-based document formats and processing pipelines, if certain reasonable properties are supported and can generate appropriate editors automatically, including web- service deployment. External Posting Date: October 6, 2008 [Fulltext] Approved for External Publication Internal Posting Date: October 6, 2008 [Fulltext] Published and presented at DocEng’08, September 16-19, 2008, São Paulo, Brazil © Copyright 2008 ACM Configurable Editing of XML-based Variable-Data Documents John Lumley, Roger Gimson, Owen Rees Hewlett-Packard Laboratories Filton Road, Stoke Gifford BRISTOL BS34 8QZ, U.K. {john.lumley,roger.gimson,owen.rees}@hp.com ABSTRACT al form of the final document (WYSIWYG rather than declaring Variable data documents can be considered as functions of their intent such as using LaTeX), but when the document is highly vari- bindings to values, and this function could be arbitrarily complex able and there are very many different possible instances, how to to build strongly-customised but high-value documents. We outline do this is not immediately obvious. an approach for editing such documents from example instances, We were also keen to consider that, especially in complex commer- which is highly configurable in terms of controlling exactly what ical document workflows, there may be many distinctly different is editable and how, capable of being used with a wide variety of roles of ‘editor’ and ‘author’ for such documents. Original design- XML-based document formats and processing pipelines, if certain ers may generate branding material and document exemplars that reasonable properties are supported and can generate appropriate can result in various elements and styles of document templates editors automatically, including web-service deployment. being fixed for other downstream editors. Other individuals may have edit capability which is restricted (‘you can choose the style Categories and Subject Descriptors but not edit the style’; ‘you can edit the mapping to data sources, I.7.2[Computing Methodologies]: Document Preparation — but not style nor fixed content’) . desktop publishing, format and notation, languages and systems, markup languages, scripting languages Editing used to be the province of dedicated standalone editing tools, of increasing complexity and cost. But in the system and service General Terms: Languages deployment scenarios we envisage, many of these editing actions must also be supported through web-deployed facilities. Keywords: XSLT, SVG, Document construction, Functional programming, Document editing We sought an extensible architecture that could support the possible editing and authoring of sets of variable-data documents that: 1. INTRODUCTION & MOTIVATION Was highly configurable in terms of tuning what editing can be In recent years there has been much research on formats, semantics performed on documents and by whom and implementation techniques for variable-data documents, driv- Can edit variable-data documents of high complexity, from en in part by new business opportunities in customised publishing. understandable visual presentations of sample document Our own exploration has focussed on the Document Description instances Framework [1] (DDF), an architecture which treats a variable content document as an extensible function with distinct separation of Can support a wide variety of XML-based document types data, logical structure and presentation. It is heavily dependent upon Can operate within workflows where multiple documents are XML representations and technologies, using an XML tree as the merged and correctly edit appropriate components. main syntactic representation, with constructional semantics sup- Does not depend upon any intimate knowledge of the processing ported by sections of XSLT[2] and an SVG-based geometric pipeline that creates instances of variable documents, nor even presentation by a hierarchical tree of layout instructions[3]. the detailed semantics of the documents themselves During early research such documents (which can of course be Requires minimal or zero disturbance of the document pro- highly programmatic) have been constructed ‘by hand’, but devel- cessing pipeline tools when supporting editing. oping innovative methods of authoring and editing has always been Is deployable both in standalone and web-service situations, one of the goals. Users have become very used to editing on a visu- with very little change in configuration. In this paper we will usually discuss examples using DDF, but the techniques are applicable to a wide range of potential XML-based variable document technologies, provided a few key requirements are satisfied. We'll show that the basic technique is very effective and highly flexible when the causality between source and result Permission to make digital or hard copies of all or part of this work for is ‘local’. Adding more knowledge of the semantics of the source personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies language supports modifying documents in less local scopes. bear this notice and the full citation on the first page. To copy otherwise, We'll start by illustrating editing on an instance of a variable doc- or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ument then outline the basic implementation, including how ‘edit- DocEng’08, September 16-19, 2008, São Paulo, Brazil. Copyright 2008 ACM 978-1-60558-081-4/08/09...$5.00. ability’ may be described and how modifications are actually per- er instances that will be generated subsequently from the template? formed on source (template) documents. We'll then discuss using Figure 3 shows such a case, where we've selected one of the text a ‘view’ of a document, remote editing using the technique and how blocks in the result document and have an editing dialogue that we can edit the extent of modification other users can perform. Lim- allows us to alter various properties, such as borders and margins: itations and specialist treaments are then outlined, followed by a survey of prior art and future directions. 2. EDITING: ON A DOCUMENT INSTANCE A variable content document is just that - variable; its appearance usually depends upon the data that is bound to it. The human con- sumer reads the intended message in a visualisation of a document instance. Consequently editing within such a visualisation is prefer- able, provided we don't have to compromise the ‘smartness’ of the document substantially. We'll outline the issues and approaches on an example 2-page tourist flyer that is a true variable document: Honest John's Honest John's HeliTours HeliTours Bugaboos Chamonix Specially selected for Otis B. Driftwood Nam libero Very little is known about how and tempore, cum where they live. The one certainty is soluta nobis est that they are fearsome opponents. Figure 3. Editing an element of the template eligendi optio Nam libero tempore, cum soluta cumque nihil nobis est eligendi optio cumque nihil impedit quo minus impedit quo minus id quod maxime FlyBy id quod maxime, placeat facere possimus, omnis 1 The helicopter blades have turned BackBowl Offer! omnis voluptas voluptas assumenda est, omnis dolor and the first fresh tracks have been Special assumenda est, repellendus. Temporibus autem omnis dolor quibusdam et aut officiis debitis aut In this example we have selected a text block for editing and this carved to kick off the 2008 season. If you have yet to heliski during the repellendus. rerum necessitatibus saepe eveniet early season, consider it a must. As As temps remain cooler at this time ut et voluptates repudiandae sint et temps remain cooler at this time of of year, conditions at lower elevations molestiae non recusandae. year, conditions at lower elevations provide us with excellent tree skiing has caused the ‘text block’ editing dialogue to be displayed and pop- provide us with excellent tree skiing in poorer weather conditions. in poorer weather conditions. ulated with the correct properties for the text block in question (and indeed for all text blocks, such as that under the right hand picture, Nam libero tempore, cum soluta All-inclusive: £1005 nobis est eligendi optio cumque nihil All-inclusive: £1365 impedit quo minus id quod maxime that were generated from the same element in the template source.) placeat facere possimus, omnis voluptas assumenda est, omnis dolor All-inclusive: £1234 repellendus. Temporibus autem quibusdam et aut officiis debitis aut How to book: rerum necessitatibus saepe eveniet Please contact us via our representatives at any high-street stationer, These properties can be altered and eventually an ‘Apply’ action ut et voluptates repudiandae sint et molestiae non recusandae. where cheques drawn on a Cayman Island or States of Jersey bank will be accepted with alacrity, or alternatively

Configurable Editing of XML-Based Variable-Data Documents John Lumley, Roger Gimson, Owen Rees HP Laboratories HPL-2008-53

Open Source Software Used in Cisco Unified Web and E-Mail Interaction

SVG Tutorial

Stylesheet Translations of SVG to VML

SVG-Based Knowledge Visualization

Pearls of XSLT/Xpath 3.0 Design

Create an HTML Output from Your Own Project with XSLT

Xpath and Xquery

Session 2: Markup Language Technologies

A Proposal for XSLT

Xpath, Xquery and XSLT

Python, XML and Webservices

Introduction to XSL Max Froumentin - W3C