Presentation of XML Documents

Patryk Czarnik

Institute of Informatics, University of Warsaw

XML and Modern Techniques of Content Management – 2012/13

Separating content and formatting

According to XML best practices, documents should contain:

* “pure” content / data,
* markup for structure and meaning (“semantic” or “descriptive” markup),
* no formatting.

How to present them?

* “Hard-coded” interpretation of known document types, e.g. LibreOffice rendering a Writer document.
* Importing or pasting content into a word processor or DTP tool, followed by manual or automatic formatting, e.g. the Adobe InDesign approach.
* External style sheets.

Patryk Czarnik 05 – Presentation XML 2012/13

Idea of stylesheet

Example data:

* Dawid Paszkiewicz, +48223213203, +48223213200, [email protected]
* Arkadiusz Gierasimczyk, +48223213213, +48501502503, [email protected]

Example style rules:

* font 'Times 10pt'
* yellow background
* 12pt for name
* blue font and border
* abbreviation before phone number
* italic font for name
* typewriter font for email

Benefits of content and formatting separation

General advantages of descriptive markup:

* better content understanding and easier analysis.

Ability to present in the same way:

* the same document after modification,
* another document of the same structure.

Formatting managed in one place:

* easy to change the style of a whole class of documents.

Ability to define many stylesheets for a document class, depending on purpose and expectations:

* medium (screen / paper / voice),
* expected level of detail (full report / summary),
* reader preferences (font size, colours, ...).

Standards related to XML presentation

General:

* Associating Style Sheets with XML Documents

Stylesheet languages:

* DSSSL (Document Style Semantics and Specification Language): historical, applied to SGML
* CSS (Cascading Style Sheets)
* XSL (Extensible Stylesheet Language)

Associating Style Sheets with XML Documents

A W3C Recommendation. A style sheet is associated with a document by an xml-stylesheet processing instruction.

One style sheet:

    <?xml-stylesheet ... .xsl"?>

Alternative style sheets:

    <?xml-stylesheet ... css" href="blue.css"?>
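The association mechanism can be sketched as follows. This is a minimal illustration, assuming a hypothetical `company` document and hypothetical file names (only `blue.css` appears in the slides); the pseudo-attributes `type`, `href`, `title` and `alternate` are those defined by the W3C Recommendation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Preferred style sheet (blue.css is the file name that survives in the slides) -->
<?xml-stylesheet type="text/css" href="blue.css" title="Blue"?>
<!-- Alternative style sheet the reader may switch to; yellow.css is hypothetical -->
<?xml-stylesheet type="text/css" href="yellow.css" title="Yellow" alternate="yes"?>
<!-- For an XSLT stylesheet, browsers conventionally expect type="text/xsl",
     for instance type="text/xsl" href="staff.xsl" (hypothetical file name) -->
<company>
  <name>Extremely professional company</name>
</company>
```

The processing instructions must appear before the root element; a style sheet with a title and without `alternate="yes"` is treated as the preferred one.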

Cascading Style Sheets

* Beginnings of the style sheet idea: 1970s
  * motivation: different printer languages and properties
* CSS beginnings: 1994
* CSS Level 1 Recommendation: December 1996
* CSS Level 2 Recommendation: May 1998
  * implemented (more or less completely) in present-day internet browsers
* CSS 2.1 Recommendation: June 2011
  * CSS 2 corrected and made less ambiguous
  * also a bit less ambitious (unsupported features thrown out)
* CSS Level 3: work still in progress
  * modularisation
  * new features supported in modern browsers

Applications of CSS

* First and main application: style for Web sites
  * separation of content and formatting for HTML
* Simple style sheets for XML

Idea of accessibility

Content should be accessible to all people, regardless of their personal disabilities or technical limitations. Support in CSS:

* media-dependent styles
* alternative presentation methods (e.g. voice)
* enabling readers to provide their own style (e.g. larger font and contrasting colours for people with impaired eyesight)

Example XML

Text content of the example document:

    Extremely professional company
    Accountancy
    Dawid Paszkiewicz  +48223213203  +48223213200  [email protected]
    Monika Domżałowicz  +48223213200  +48501502513  [email protected]
    Arkadiusz Gierasimczyk  ...
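One possible markup for this content is sketched below. It is an assumption, not the original document: `person`, `position`, `first-name`, `last-name`, `department` and `name` are taken from the stylesheet and selectors shown on the following slides, while `company`, `phone`, `email` and the choice of which person is the chief are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: company, phone and email are assumed element names,
     and the position values are assigned arbitrarily for illustration. -->
<company>
  <name>Extremely professional company</name>
  <department>
    <name>Accountancy</name>
    <person position="chief">
      <first-name>Dawid</first-name>
      <last-name>Paszkiewicz</last-name>
      <phone>+48223213203</phone>
      <phone>+48223213200</phone>
      <!-- e-mail addresses are copied as they appear in the slides -->
      <email>[email protected]</email>
    </person>
    <person>
      <first-name>Monika</first-name>
      <last-name>Domżałowicz</last-name>
      <phone>+48223213200</phone>
      <phone>+48501502513</phone>
      <email>[email protected]</email>
    </person>
    <!-- further persons omitted -->
  </department>
</company>
```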

Example stylesheet

    person {
      display: block;
      margin: 10px auto 10px 20px;
      padding: 0.75em 1em;
      width: 200px;
      border-style: solid;
      border-width: 2px;
      border-color: #003399;
      background-color: #F6F6F6;
    }
    person[position='chief'] {
      background-color: #DDFFDD;
    }
    first-name, last-name {
      display: inline;
      font-size: 12pt;
    }
    person[position='chief'] last-name {
      font-weight: bold;
    }
    ...

Terminology: a rule has the form

    selector {
      property: value;
      property: value
    }

Result of formatting

[Figure: the example document rendered with the example stylesheet]

Selectors

    person                       element name
    first-name, last-name        two elements
    department name              descendant
    department > name            child
    A + B                        element B directly following A
    name:first-child             ‘name’ being the first child
    person[position]             test of attribute existence
    person[position='chief']     attribute value test
    person[position~='chief']    value is an element of a list
    person#k12                   value of ID-type attribute
    ol.staff                     equivalent to ol[class~='staff'] (HTML specific)
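To see the selectors at work, the fragment below annotates each element with the selectors from the list that would match it. It is a hypothetical document: the element and attribute names follow the selectors, while the internal DTD subset, the `id` attribute and the value `clerk` are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The ATTLIST declaration makes id an ID-type attribute, which person#k12 relies on -->
<!DOCTYPE department [
  <!ATTLIST person id ID #IMPLIED>
]>
<department>
  <!-- matched by: name, department name, department > name, name:first-child -->
  <name>Accountancy</name>
  <!-- matched by: person, person[position], person[position='chief'],
       person[position~='chief'], name + person, and person#k12
       (provided the processor recognises id as an ID-type attribute) -->
  <person id="k12" position="chief">
    <!-- matched by: the group "first-name, last-name" and first-name:first-child -->
    <first-name>Dawid</first-name>
    <!-- matched by: the group "first-name, last-name" and first-name + last-name -->
    <last-name>Paszkiewicz</last-name>
  </person>
  <!-- matched by: person, person[position], person + person;
       not matched by person[position='chief'] -->
  <person position="clerk">
    <first-name>Monika</first-name>
    <last-name>Domżałowicz</last-name>
  </person>
</department>
```

Comments and whitespace between elements do not affect `A + B` or `:first-child`, since those selectors only consider element nodes.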

CSS visualisation principles

Blocks

* Content is arranged into blocks of different kinds.
* The display property establishes the kind of block:
  * block: regular block with height and width,
  * inline: fragment of text, subject to line wrapping,
  * other specific objects (tables, lists, etc.).

Box model

* Nesting of blocks reflects the nesting of elements in the source document.
* Additionally: borders, margins, relative or absolute positioning.

CSS visualisation principles (2)

Inheritance

* Nested elements inherit properties from their parents.
* This does not apply to all properties.

Formatting

Properties for various aspects of formatting:

* size, position, borders, and margins
* text and fonts
* colour and background
* basic properties for user interaction
* aural properties

Some other capabilities of CSS

CSS 2

* Medium-dependent style
* Inserting content that does not exist in the source document (labels etc.)
* Multi-level counters

CSS 3

* Animation and effects
* Multi-column text
* More selectors
* Support for XML namespaces

CSS capabilities and advantages

* Rich visual formatting capabilities
  * especially for single elements
* Elements distinguished based on:
  * name
  * placement within the document tree
  * existence and value of attributes
* Support:
  * internet browsers
  * authoring tools
* Easy to write simple stylesheets :)

CSS limitations and drawbacks

* Only visualisation
  * no conversion to other formats
* Some visual formatting issues
  * e.g. confusing computation of block size (moreover, for years incorrectly implemented in IE)
* Not possible:
  * distinguishing elements based on their contents
  * repeating the same content several times
  * presenting elements in a different order than in the source (no general solution, only relative/absolute positioning)
  * complex logical conditions
  * data processing (e.g. summing numeric values)

Presentation by transformation

[Diagram: XML document (“pure data”) + definition of transformation (in the role of stylesheet) → document for direct presentation]

In order to present a document, we can transform it into a format that we know how to present.

Examples of “presentable” XML formats:

* XHTML
* XSL Formatting Objects

Extensible Stylesheet Language (XSL)

First generation of recommendations (1999 and 2001):

* XSL (general framework, XSL Formatting Objects language)
* XSLT (transformation language)
* XPath (expression language, in particular paths addressing document fragments)

XSL stylesheet: a transformation from XML to XSL-FO

* usually written for a class of documents (XML application)
* in practice, also used for transformation to other formats, often (X)HTML

Original idea of XSL

[Figure; source: W3C, XSL 1.0]

XSLT: XSL Transformations

* An application of XML (of course)
* A transformation definition (“transformation” or “stylesheet”) specifies how a source document translates into a result document
* Transformation at the document tree level

XSLT: stylesheet example

Transformation from staff-XML to HTML:

    <xsl:text>Employees of </xsl:text>
    <xsl:value-of select="/firma/nazwa" />
    ...


XSLT: stylesheet structure

* A stylesheet (Polish: arkusz) consists of templates.
* A template (Polish: szablon) specifies how to transform a source node into a result tree fragment.
* Inside templates:
  * text and elements not from the XSLT namespace → copied to the result
  * XSLT instructions → affect processing
  * XPath in instructions → access to the source document, arithmetic, condition testing, ...

XSLT can be considered a programming language specialised for transforming XML documents.

XSLT: operation overview

* Transformation at the tree level
* The template for the document node (root) is started first
  * such a template exists even if we have not written it
* apply-templates within a template: other templates are called for other nodes, usually for children
* Templates are matched according to node kind and name, location within the document tree, etc.
* Additional features: conditional processing, copying values and nodes from the source, generating new nodes, grouping, sorting, ...
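The point that a template exists even when none is written refers to the built-in template rules. As a sketch, the stylesheet below spells out templates that behave like the built-in rules of XSLT 1.0; writing them explicitly is normally unnecessary and is done here only to show what the processor does by default.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Document node and elements: continue processing with the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

A consequence is that a stylesheet with no templates at all still produces a result, consisting of the text content of the source document.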

XPath in XSLT

XPath expressions are used for:

* accessing nodes and values from the source document
* computing values
* checking logical conditions

XPath constructs:

* paths (navigating through the document tree)
* arithmetic and logical operators
* functions for numbers, strings, etc.
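The uses listed above can be illustrated with a small XSLT 1.0 stylesheet producing plain text. It is a sketch under assumptions: the source document is taken to have a hypothetical `company` root with `name`, and `person` elements with `first-name`, `last-name` and a `position` attribute; the threshold in the condition is arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Path: access a value from the source document -->
    <xsl:value-of select="/company/name"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Function: count the selected nodes -->
    <xsl:text>Employees: </xsl:text>
    <xsl:value-of select="count(//person)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Arithmetic: everyone who is not a chief -->
    <xsl:text>Other than chiefs: </xsl:text>
    <xsl:value-of select="count(//person) - count(//person[@position='chief'])"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Logical condition combined with 'and' and not() -->
    <xsl:if test="count(//person) &gt; 10 and not(//person[@position='chief'])">
      <xsl:text>Note: more than 10 employees and no chief&#10;</xsl:text>
    </xsl:if>

    <!-- String function: build "Surname, First name" for each person -->
    <xsl:for-each select="//person">
      <xsl:value-of select="concat(last-name, ', ', first-name)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```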

Basic approach to presentational XSLT

Main template

* Setting the shape of the result document

Typical internal template

* Translation of source elements into result elements (or fragments)
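The two kinds of template can be sketched as follows. This is an illustrative stylesheet, not the one from the slides: the element names `company`, `name`, `department` and `person` are English stand-ins assumed for the source vocabulary, and the HTML structure is one choice among many.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Main template: sets the shape of the whole result document -->
  <xsl:template match="/">
    <html>
      <head>
        <title>
          <xsl:text>Employees of </xsl:text>
          <xsl:value-of select="/company/name"/>
        </title>
      </head>
      <body>
        <h1><xsl:value-of select="/company/name"/></h1>
        <!-- Hand over to the internal templates -->
        <xsl:apply-templates select="/company/department"/>
      </body>
    </html>
  </xsl:template>

  <!-- Internal template: a department becomes a heading and a list -->
  <xsl:template match="department">
    <h2><xsl:value-of select="name"/></h2>
    <ul>
      <xsl:apply-templates select="person">
        <xsl:sort select="last-name"/>
      </xsl:apply-templates>
    </ul>
  </xsl:template>

  <!-- Internal template: a person becomes a list item -->
  <xsl:template match="person">
    <li>
      <xsl:value-of select="first-name"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="last-name"/>
      <xsl:if test="@position='chief'">
        <xsl:text> (chief)</xsl:text>
      </xsl:if>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Elements such as `html` and `li` are outside the XSLT namespace, so they are copied to the result, while the `xsl:` instructions control processing.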

XSL Formatting Objects

* An XML application designed for visual presentation
* Focused on printed media
  * page templates (“masters”)
  * automatic pagination
* Normally, XSL-FO documents are not written by hand but generated via XSLT.

XSL-FO document structure

Minimal example of an XSL-FO document (text content only):

    Hello World!
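A complete minimal document of this kind might look as follows. It is a sketch: the master name `A4` and the page dimensions and margins are assumptions; the element structure (`layout-master-set`, `simple-page-master`, `page-sequence`, `flow`, `block`) is the standard XSL-FO skeleton.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Page templates ("masters") are declared first -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="A4"
        page-height="297mm" page-width="210mm"
        margin-top="20mm" margin-bottom="20mm"
        margin-left="25mm" margin-right="25mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- The content refers to a master by name and flows into its body region -->
  <fo:page-sequence master-reference="A4">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Hello World!</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```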

Transformation to XSL-FO

    ...
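A transformation of this kind can be sketched as an XSLT stylesheet whose literal result elements belong to the XSL-FO namespace. The source vocabulary (`company`, `name`, `person`, `first-name`, `last-name`, `phone`) and all dimensions are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Main template: page layout and the overall document skeleton -->
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="297mm" page-width="210mm"
            margin-top="20mm" margin-bottom="20mm"
            margin-left="25mm" margin-right="25mm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-size="16pt" font-weight="bold" space-after="6pt">
            <xsl:value-of select="/company/name"/>
          </fo:block>
          <xsl:apply-templates select="//person"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Internal template: one block per person, with nested blocks for phones -->
  <xsl:template match="person">
    <fo:block space-after="4pt">
      <fo:inline font-weight="bold">
        <xsl:value-of select="concat(first-name, ' ', last-name)"/>
      </fo:inline>
      <xsl:for-each select="phone">
        <fo:block>
          <xsl:text>tel. </xsl:text>
          <xsl:value-of select="."/>
        </fo:block>
      </xsl:for-each>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

The result of running this stylesheet is an XSL-FO document, which an XSL-FO formatter then turns into pages.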

mob. tel.

Example visualisation

[Figure: example visualisation of the XSL-FO output]

page-master: page template

* Single page layout
* A document may be split into many such pages.
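A page template with a header and a footer region might be declared as in the sketch below. The master name and all dimensions are assumptions; `region-body`, `region-before` and `region-after` are standard XSL-FO regions.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- A4 portrait; the master name and all dimensions are assumptions -->
  <fo:simple-page-master master-name="page"
      page-height="297mm" page-width="210mm"
      margin-top="10mm" margin-bottom="10mm"
      margin-left="20mm" margin-right="20mm">
    <!-- Main content area; its margins leave room for the header and footer -->
    <fo:region-body margin-top="20mm" margin-bottom="15mm"/>
    <!-- Header area at the top of the page -->
    <fo:region-before extent="15mm"/>
    <!-- Footer area at the bottom of the page -->
    <fo:region-after extent="10mm"/>
  </fo:simple-page-master>
</fo:layout-master-set>
```

The body margins are chosen to be at least as large as the extents of the header and footer regions, so that the regions do not overlap.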

Content of pages

* page-sequence: results in a number of pages
* flow: content split into pages
* static-content: content repeated on all pages
* flow-name: page region reference

Stylesheet fragment (text content only):

    Company staff
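How these elements fit together can be sketched as below. The fragment assumes a page master named `page` that declares `region-before` and `region-after` in addition to `region-body`; the body paragraphs are placeholders.

```xml
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-reference="page">
  <!-- Repeated on every page, in the header region -->
  <fo:static-content flow-name="xsl-region-before">
    <fo:block text-align="center" font-size="9pt">Company staff</fo:block>
  </fo:static-content>

  <!-- Repeated on every page, in the footer region -->
  <fo:static-content flow-name="xsl-region-after">
    <fo:block text-align="center"><fo:page-number/></fo:block>
  </fo:static-content>

  <!-- Main content, split into as many pages as needed -->
  <fo:flow flow-name="xsl-region-body">
    <fo:block>First paragraph of the body text.</fo:block>
    <fo:block>Second paragraph of the body text.</fo:block>
  </fo:flow>
</fo:page-sequence>
```

The `flow-name` values are the default names of the regions, which is how each piece of content is bound to a region of the page master.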

XSL-FO tree elements

* Block level: block, list-block, list-item, list-item-label, table, table-row, table-cell, ...
* Inline level: inline, character, external-graphic
* Special features: basic-link, bookmark, footnote, flow, ...

Style of visual elements

* CSS roots of the visual formatting model
* Attributes for style properties:
  * margin, padding, border-style, ...
  * background-color, background-image, ...
  * font-family, font-weight, font-style, font-size, ...
  * text-align, text-align-last, text-indent, start-indent, end-indent, wrap-option, break-before, ...

Lists: example

    First name: Dawid
    Surname: Paszkiewicz
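In XSL-FO such a label and value layout is typically expressed with a list-block. The sketch below is an assumed reconstruction, not the code from the slides; the distances are arbitrary.

```xml
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
    provisional-distance-between-starts="30mm"
    provisional-label-separation="3mm">
  <fo:list-item>
    <!-- label-end() and body-start() derive the indents from the
         provisional-* properties set on the list-block -->
    <fo:list-item-label end-indent="label-end()">
      <fo:block font-weight="bold">First name:</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Dawid</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block font-weight="bold">Surname:</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Paszkiewicz</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
```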

Example visualisation: lists

[Figure: example visualisation of the list]

Tables: example

    Surname        First name    ...
    Paszkiewicz    Dawid         ...
    ...
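A table with these two columns might be written in XSL-FO as sketched below. Column widths, borders and padding are assumptions; further columns and rows from the slide are omitted.

```xml
<fo:table xmlns:fo="http://www.w3.org/1999/XSL/Format"
    table-layout="fixed" width="100%">
  <fo:table-column column-width="50mm"/>
  <fo:table-column column-width="50mm"/>

  <!-- Header row -->
  <fo:table-header>
    <fo:table-row font-weight="bold">
      <fo:table-cell border-style="solid" border-width="0.5pt" padding="2pt">
        <fo:block>Surname</fo:block>
      </fo:table-cell>
      <fo:table-cell border-style="solid" border-width="0.5pt" padding="2pt">
        <fo:block>First name</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-header>

  <!-- One row per person; when generated by XSLT this would be an internal template -->
  <fo:table-body>
    <fo:table-row>
      <fo:table-cell border-style="solid" border-width="0.5pt" padding="2pt">
        <fo:block>Paszkiewicz</fo:block>
      </fo:table-cell>
      <fo:table-cell border-style="solid" border-width="0.5pt" padding="2pt">
        <fo:block>Dawid</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>
```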

Example visualisation: table

[Figure: example visualisation of the table]

XSL-FO: some more features

* Pagination (odd/even pages etc.)
* Markers (referring to page content from header/footer)
* Multi-flow documents (from XSL 1.1, not available in Apache FOP 1.0)

XSL-FO: support

Commercial software:

* Antenna House XSL Formatter
* RenderX
* Ecrion
* Lunasil LTD Xinc
* ...

Open software:

* Apache FOP
* xmlroff

XSL-FO: critique

Main XSL-FO advantages

* “in line” with XSL
* an easy and direct way to obtain a printout (e.g. PDF) from XML data
* general advantages of stylesheets over “hard-coded” formatting

Main XSL-FO disadvantages

* too complex for simple needs
* too limited for advanced needs
  * lack of pagination feedback: one cannot say “if these two elements occur on the same page...”
  * some advanced DTP- and typography-related features are missing
* hard to format particular elements in a very special way (though this is a general drawback of using stylesheets)
