
University of Dublin Trinity College

Transforming XML Documents

What is XSL?

• XSL (eXtensible Stylesheet Language) consists of
  – A language for transforming XML documents
  – A language for specifying formatting properties
• Based on existing style sheet languages
  – Document Style Semantics and Specification Language (DSSSL)
  – Cascading Style Sheets (CSS)
• Designed specifically for XML documents
• Uses an XML syntax
• XSL is a combination of three specifications
  – XML Path Language (XPath)
  – XSL Transformations (XSLT)
  – XSL-FO (Formatting Objects)

More @ http://www.w3.org/Style/XSL/

What are XSL Transformations?

• Language for transforming XML documents into other XML or non-XML documents
  – More generally, the input can be any document as long as it is represented as a tree
• Uses a collection of templates (rules) to transform the source document
• XPath is used to select the parts of a source document to transform
• Rule-based (declarative) language for transformations
• Server-side/client-side execution

Why XSL Transformations?

• Powerful transformation functionality
• An XML document may be completely restructured
  – Add/remove elements and attributes, re-arrange and sort elements
  – E.g. into an SQL CREATE TABLE/INSERT script
• Dynamic documents
  – XML to XML, XML to (X)HTML, XML to PDF
• Multiple renderings
• Good software support for transformations
• Easy and fast prototyping possible

XSLT versions

• XSLT 1.0 (W3C recommendation since 1999)
  – Defined in two documents: the XSLT 1.0 and XPath 1.0 specifications
• XSLT 2.0 (W3C recommendation since 2007)
  – Defined in a set of eight documents: XSLT 2.0, XPath 2.0, XQuery 1.0 …
  – Major enhancement to the language
    • Support for XML Schema - nodes and variables can have basic and custom datatypes
    • Multiple output documents
    • User-defined functions
  – Integration with XQuery

XSLT Processing Model

Figure: an XML Doc and a Style/Transform Sheet enter the Transformation Process, which produces the Result Doc

• Different output formats
  – xml, html, text
• Multiple inputs
  – via document()
• Multiple outputs
  – <xsl:result-document>
• Multiple programs
  – via <xsl:include> and <xsl:import>
  – <xsl:apply-imports> applies a template rule from an imported style sheet

Client-side Example

• https://www.cs.tcd.ie/Owen.Conlan/php/xpath/xml2xsl.html

Nodes in a Tree Model

Figure: the kinds of Node in the tree model (Root, Element, Attribute, Namespace, Text, Comment)

Example Tree

Figure: node tree for a sample fragment with the text "John Smith is generic". Node labels: root, element, firstname (text "John"), surname (text "Smith") and the text "is generic".

Fundamental Notion of xsl:template

• Elements in a template body are classified as either data or instructions
• When the template is executed
  – Data nodes get copied to the result tree
  – Instructions are executed

Figure: a sample template applied to the text "John", with labels marking the pattern to match, the data nodes and the instruction
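The split between data nodes and instructions can be sketched with a single rule. This is only an illustration: the `firstname` element comes from the example tree, while the `<b>` wrapper is an assumed literal chosen for the sketch.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pattern to match: every firstname element -->
  <xsl:template match="firstname">
    <!-- Data node: the literal <b> element (an assumption for this sketch)
         is copied to the result tree as written -->
    <b>
      <!-- Instruction: executed, it inserts the string value of the current node -->
      <xsl:value-of select="."/>
    </b>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a `firstname` element containing John, the rule would add a `b` element holding the text John to the result tree.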

Templates

• Templates represent a set of rules
• Rule matching is done within the current context
• Rules are not executed in order
• Default behaviour
  – Always happens unless overridden by a specific rule
• Matching templates
  – <xsl:template match="pattern">
    • Template instantiated for every element satisfying the match's XPath expression
• Named templates
  – <xsl:template name="template-name">
    • Template instantiated explicitly

xsl:template

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

Built in Templates

• XSLT provides default built-in template rules for each of the 7 kinds of nodes • If there is no template rule that matches the node, then the default rule is invoked
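The default rules behave roughly as if the templates below were part of every stylesheet. Writing them out is only an illustration; a processor supplies them implicitly, and namespace nodes cannot be matched by any pattern at all.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root and element nodes: keep going with the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result tree -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Processing instructions and comments: do nothing -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

A stylesheet with no templates of its own therefore outputs just the concatenated text of the source document.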

Node kind               Built-in rule
root                    calls <xsl:apply-templates/> to process its children
element                 calls <xsl:apply-templates/> to process its children
attribute               copies the attribute value to the result tree
text                    copies the text to the result tree
comment                 does nothing
processing-instruction  does nothing
namespace               does nothing

Gaining control of the processing

• Built-in template rule for element nodes
  – <xsl:apply-templates/>
  – "select all children of the current node in the source tree and for each one find the matching template rule in the stylesheet and execute it"
• Controlling the sequence of processing - explicitly use instructions
  – With pattern match
    • <xsl:apply-templates select="..."/>
  – With specific invocation
    • <xsl:call-template name="..."/> to invoke a specific template by name rather than by pattern matching

Template Modes

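• Modes are used to modify the rule set and context
• Allow us to process the same node more than once, but in a different way each time
• Example
  – <xsl:template match="pattern" mode="mode-name">
  – <xsl:apply-templates select="..." mode="mode-name"/>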
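A minimal sketch of modes, assuming an input shaped like `<persons><p><name>Mike</name><age>24</age></p>…</persons>`. The root element name `persons`, the mode names `summary` and `detail`, and all output element names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <report>
      <!-- First pass over the p elements -->
      <names><xsl:apply-templates select="p" mode="summary"/></names>
      <!-- Second pass over the same nodes, handled by a different rule -->
      <details><xsl:apply-templates select="p" mode="detail"/></details>
    </report>
  </xsl:template>

  <!-- Used only when templates are applied in mode "summary" -->
  <xsl:template match="p" mode="summary">
    <n><xsl:value-of select="name"/></n>
  </xsl:template>

  <!-- Same pattern, different mode: no conflict between the two rules -->
  <xsl:template match="p" mode="detail">
    <person name="{name}" age="{age}"/>
  </xsl:template>

</xsl:stylesheet>
```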

Conflict Resolution Policy

• What if more than one template rule has a pattern that matches a particular node?
• The conflict can be resolved…
  – By priority, either assigned in the template definition or allocated by the system; a higher value indicates higher priority
  – Rules in the current stylesheet take precedence over rules in imported stylesheets
  – Precedence is given to more specific rules
  – Where rules have the same import precedence and the same priority… it is down to the way the XSLT engine is implemented!
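The priority rules can be sketched with three overlapping patterns. The input shape (`persons` containing `p` elements with `name` and `age`) and the output element names are assumptions; the default priorities in the comments are those defined by XSLT 1.0.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <people><xsl:apply-templates select="p"/></people>
  </xsl:template>

  <!-- Default priority 0: the pattern is a plain element name -->
  <xsl:template match="p">
    <person><xsl:value-of select="name"/></person>
  </xsl:template>

  <!-- Default priority 0.5: a pattern with a predicate counts as more specific -->
  <xsl:template match="p[age &gt; 21]">
    <senior><xsl:value-of select="name"/></senior>
  </xsl:template>

  <!-- Explicit priority: beats both defaults for the nodes it matches -->
  <xsl:template match="p[name = 'Tanja']" priority="2">
    <special><xsl:value-of select="name"/></special>
  </xsl:template>

</xsl:stylesheet>
```

With Mike (24) and Tanja (21) as input, Mike is handled by the predicate rule and Tanja by the rule with the explicit priority; the plain `p` rule is never chosen for either.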

XSLT Elements

1. xsl:stylesheet
2. xsl:transform
3. xsl:output
4. xsl:template
5. xsl:apply-templates
6. xsl:call-template
7. xsl:text
8. xsl:value-of
9. xsl:copy-of
10. xsl:element
11. xsl:attribute
12. xsl:sort
13. xsl:variable
14. xsl:param
15. xsl:with-param
16. xsl:if
17. xsl:choose
18. xsl:when
19. xsl:otherwise
20. xsl:for-each

xsl:transform, xsl:stylesheet

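<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" ...> [...] </xsl:stylesheet>

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" ...> [...] </xsl:transform>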

xsl:output

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" ...>
  <xsl:output ... indent="yes"/>
  [...]
</xsl:stylesheet>

xsl:apply-templates

XML Source: two p entries, Mike (age 24) and Tanja (age 21)
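A possible stylesheet for this kind of source, as a sketch only: it assumes the input is `<persons><p><name>…</name><age>…</age></p>…</persons>`, where `persons` and all output element names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <people>
      <!-- Hand every p child to whichever rule matches it best -->
      <xsl:apply-templates select="p"/>
    </people>
  </xsl:template>

  <xsl:template match="p">
    <person>
      <!-- select controls both which nodes are processed and in what order -->
      <xsl:apply-templates select="name"/>
      <xsl:apply-templates select="age"/>
    </person>
  </xsl:template>

  <xsl:template match="name">
    <n><xsl:value-of select="."/></n>
  </xsl:template>

  <xsl:template match="age">
    <a><xsl:value-of select="."/></a>
  </xsl:template>

</xsl:stylesheet>
```

Without a `select` attribute, `xsl:apply-templates` would process all children of the current node in document order, including whitespace text nodes.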

xsl:call-template

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: the number of days is computed with select="365*number(age)"

Result
Mike
24 (more than 8760 days)
Tanja
21 (more than 7665 days)
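One way to arrive at such a result is sketched below. Only the expression `365*number(age)` is taken from the slide; the input shape (`persons`/`p`/`name`/`age`) and the template name `days` are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <xsl:for-each select="p">
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:value-of select="age"/>
      <!-- Invoke the template by name; the current p stays the context node -->
      <xsl:call-template name="days"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Named template (hypothetical name): never selected by pattern matching -->
  <xsl:template name="days">
    <xsl:text> (more than </xsl:text>
    <xsl:value-of select="365*number(age)"/>
    <xsl:text> days)</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For Mike (24) this produces the lines `Mike` and `24 (more than 8760 days)`, and for Tanja (21) the lines `Tanja` and `21 (more than 7665 days)`.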
Result Tree Creation

• Generate the output file
• Literals (data nodes) - e.g. HTML tags
• xsl:text - send content directly to output (retain whitespace)
• xsl:value-of - extract element values (anywhere in the tree)
• xsl:copy-of - deep copy selected nodes
• xsl:copy - shallow copy selected nodes
• xsl:element - instantiate an element
• xsl:attribute - instantiate an attribute
• xsl:comment - instantiate a comment
• xsl:processing-instruction - instantiate an instruction

xsl:text & text()

XML Source: p entries with a name and an age, including Mike (24) and Tanja (21)

XSLT: the names are selected with select="p/name"

Result
• Name: Mike
• Name:
• Name: Tanja
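A sketch of how `xsl:text` and `text()` might combine to give such a list. The `select="p/name"` expression is from the slide; the root element `persons` and the HTML list markup are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <ul>
      <xsl:for-each select="p/name">
        <li>
          <!-- xsl:text keeps the space after the colon exactly as written -->
          <xsl:text>Name: </xsl:text>
          <!-- text() selects the text node children of the current name element -->
          <xsl:value-of select="text()"/>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

A `name` element with no text would still produce a list item, containing only the label.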

© 2003 B. Jung

xsl:value-of

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: each entry is selected with select="p" and its age with select="age"

Result
• Mike 24
• Tanja 21
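A minimal sketch of `xsl:value-of` for this source. The `select="p"` and `select="age"` expressions appear on the slide; the root element `persons` and the list markup are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <ul>
      <xsl:for-each select="p">
        <li>
          <!-- value-of writes the string value of the selected node -->
          <xsl:value-of select="name"/>
          <xsl:text> </xsl:text>
          <xsl:value-of select="age"/>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

In XSLT 1.0, when the expression selects several nodes, `xsl:value-of` uses only the first one in document order.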

xsl:copy-of

XSLT: copies a fragment containing Fred, Freddy and Smith

Result: Fred Freddy Smith
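A sketch of a deep copy, assuming the values sit in a variable inside the stylesheet. The variable name `person` and the element names `first`, `nick`, `last` and `result` are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical fragment holding the values from the slide -->
  <xsl:variable name="person">
    <first>Fred</first>
    <nick>Freddy</nick>
    <last>Smith</last>
  </xsl:variable>

  <xsl:template match="/">
    <result>
      <!-- Deep copy: the elements and their text are copied, not just the string value -->
      <xsl:copy-of select="$person"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Using `xsl:value-of` on the same variable would instead output only the concatenated text, without the elements.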

xsl:element

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

Result
Mike 24
Tanja 21
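`xsl:element` is mainly useful when the element name is only known at run time. The sketch below assumes the input shape `persons`/`p`/`name`/`age` and names each new element after the person; these choices are illustrative, not taken from the slide.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <people>
      <xsl:for-each select="p">
        <!-- The braces form an attribute value template: the element
             name is computed from the content of the name child -->
        <xsl:element name="{name}">
          <xsl:value-of select="age"/>
        </xsl:element>
      </xsl:for-each>
    </people>
  </xsl:template>

</xsl:stylesheet>
```

For the two entries this would create a `Mike` element containing 24 and a `Tanja` element containing 21. The computed name has to be a legal XML name, so the approach only suits values without spaces.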
xsl:attribute

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: uses select="local-name()"

Result
Mike 24
Tanja 21
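A sketch of `xsl:attribute`, using the `local-name()` call that survives on the slide. The input shape and the names `people`, `person` and `source` are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <people>
      <xsl:for-each select="p">
        <person>
          <!-- Attributes have to be added before any child content -->
          <xsl:attribute name="name">
            <xsl:value-of select="name"/>
          </xsl:attribute>
          <!-- local-name() returns the name of the current node, here "p" -->
          <xsl:attribute name="source">
            <xsl:value-of select="local-name()"/>
          </xsl:attribute>
          <xsl:value-of select="age"/>
        </person>
      </xsl:for-each>
    </people>
  </xsl:template>

</xsl:stylesheet>
```

Each `person` element then carries a `name` attribute and a `source` attribute, with the age as its text content.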
Example

• The stylesheet prints out each student's surname (bolded) and any hobbies

XML Source: students Mr John Paul Murphy (hobbies: Football, Racing) and Mary Donnelly

xsl:sort

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: appends the text "years old" to each age

Result
Tanja
21 years old
Mike
24 years old
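The ordering in this result could come from sorting on the age, as in the sketch below. The input shape (`persons`/`p`/`name`/`age`) and the choice of sort key are assumptions; sorting on the name in descending order would give the same order.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <xsl:for-each select="p">
      <!-- xsl:sort must come first inside xsl:for-each;
           data-type="number" avoids comparing ages as strings -->
      <xsl:sort select="age" data-type="number" order="ascending"/>
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:value-of select="age"/>
      <xsl:text> years old&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With Mike (24) and Tanja (21) as input, Tanja's two lines are written before Mike's.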
Transformation into HTML

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: every entry is selected with select="//p"

Result
Mike
Tanja
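A sketch of an HTML-producing stylesheet built around the `select="//p"` expression from the slide. The page structure and the title text are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- method="html" asks the processor to serialize using HTML rules -->
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/">
    <html>
      <head><title>Persons</title></head>
      <body>
        <ul>
          <!-- //p selects every p element anywhere in the source document -->
          <xsl:for-each select="//p">
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```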

Variables

• Scope
  – Global variables - accessible throughout the whole stylesheet
  – Local variables - available only within the element that contains them
• Declaring variables
  – <xsl:variable name="..." select="..."/>
    • Instantiates a variable
• Reference
  – Can be referenced in XPath expressions as $variable-name

xsl:variable

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: each entry is selected with select="p"

Result
• Mike 24
• Tanja 21
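Global and local scope can be sketched as follows. The variable names `unit` and `label`, the root element `persons` and the list markup are all hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Global variable: visible in every template of the stylesheet -->
  <xsl:variable name="unit" select="'years'"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <ul>
      <xsl:for-each select="p">
        <!-- Local variable: visible only inside this xsl:for-each -->
        <xsl:variable name="label" select="concat(name, ' ', age)"/>
        <li><xsl:value-of select="concat($label, ' ', $unit)"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

A variable cannot be reassigned once it is bound; each pass through the loop binds a fresh `label`.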
Parameters

• Scope
  – Global parameters - accessible throughout the whole stylesheet. Values can be assigned to these parameters from outside the stylesheet.
  – Local parameters - available only within a template
• Declaration
  – <xsl:param name="sum" select="0"/> (defined with 0 as a default value)
• Reference
  – Can be referenced in XPath expressions as $sum

xsl:param, xsl:with-param

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

Result
• Mike 24
• Tanja 21
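A sketch combining a global parameter with a local one passed through `xsl:with-param`. The parameter name `sum` and its default of 0 follow the slide; `heading`, the template name `entry` and the element names are hypothetical.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global parameter: a processor may override this default from outside -->
  <xsl:param name="heading" select="'People'"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <list title="{$heading}">
      <xsl:for-each select="p">
        <xsl:call-template name="entry">
          <!-- Pass a value for the template's local parameter -->
          <xsl:with-param name="sum" select="age"/>
        </xsl:call-template>
      </xsl:for-each>
    </list>
  </xsl:template>

  <xsl:template name="entry">
    <!-- Local parameter: falls back to 0 when the caller passes nothing -->
    <xsl:param name="sum" select="0"/>
    <item><xsl:value-of select="concat(name, ' ', $sum)"/></item>
  </xsl:template>

</xsl:stylesheet>
```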

Conditional Processing

• One conditional expression
  – <xsl:if test="...">
  – There is NO 'else' part
• Multiple branches
  – <xsl:choose>, <xsl:when test="...">, <xsl:otherwise>

xsl:if

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: contains the literal texts "male" and "female"

Result
Mike 24
Tanja 21
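A sketch of `xsl:if`. Since the slide's test expression is not recoverable, the condition below tests the age instead; the threshold of 21 and the appended text are assumptions made purely for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <ul>
      <xsl:for-each select="p">
        <li>
          <xsl:value-of select="name"/>
          <!-- Content is emitted only when the test is true; there is no else branch.
               ">" may be written as &gt; inside the attribute -->
          <xsl:if test="age &gt; 21">
            <xsl:text> (over 21)</xsl:text>
          </xsl:if>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```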

xsl:choose, xsl:when, xsl:otherwise

XML Source: two p entries, Mike (age 24) and Tanja (age 21)

XSLT: the branches output "child", "teen", "twenties" or "adult"

Result: Mike, Tanja
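A sketch of a multi-branch choice using the four labels from the slide. The age boundaries (13, 20, 30) are assumptions, as are the root element `persons` and the list markup.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed input root: persons (hypothetical name) -->
  <xsl:template match="/persons">
    <ul>
      <xsl:for-each select="p">
        <li>
          <xsl:value-of select="name"/>
          <xsl:text>: </xsl:text>
          <!-- Branches are tested in order; only the first true one is used.
               "<" must be written as &lt; inside an attribute -->
          <xsl:choose>
            <xsl:when test="age &lt; 13">child</xsl:when>
            <xsl:when test="age &lt; 20">teen</xsl:when>
            <xsl:when test="age &lt; 30">twenties</xsl:when>
            <xsl:otherwise>adult</xsl:otherwise>
          </xsl:choose>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```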
Iteration

• <xsl:for-each select="..."> ... </xsl:for-each>
• Used to select every XML element of a specified node-set
• Instantiates the template once for each item
• Inside the template, the context node (.) refers to the "current" item
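A sketch of iteration over a customer list. The path `customers/customer` and `match="/"` appear in the example that follows; the child element names `name` and `address` are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The body is instantiated once per selected customer element -->
    <xsl:for-each select="customers/customer">
      <!-- Relative paths start at the current customer (the context node) -->
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
      <!-- "./address" spells out the context node explicitly (assumed child name) -->
      <xsl:value-of select="./address"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```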

xsl:for-each

XML Source: customers John Smith (123 Oak St.) and Zack Zwyker (368 Elm St.)

XSLT: a template with match="/" iterates with select="customers/customer"

Result
John Smith
123 Oak St.
Zack Zwyker
...

Associating an XML doc with XSLT

XML Source: a document containing the names Jack, Harry, Rebecca and Mr. Bean
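The usual way to link a document to a stylesheet is the `xml-stylesheet` processing instruction in the prolog. In this sketch the file name `names.xsl` and the element names `names` and `name` are hypothetical.

```xml
<?xml version="1.0"?>
<!-- href points at the stylesheet to apply; the file name is hypothetical -->
<?xml-stylesheet type="text/xsl" href="names.xsl"?>
<names>
  <name>Jack</name>
  <name>Harry</name>
  <name>Rebecca</name>
  <name>Mr. Bean</name>
</names>
```

A browser that opens this file can fetch the stylesheet and run the transformation on the client side.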

XSLT Example

<xsl:apply-templates select="doc/title"/>

XSLT Processors

• To apply an XSL stylesheet to an XML document we can use:
  – Saxon
  – Xalan
  – Microsoft XSLT Processor

XSLT Tools

• Application
  – EditiX (free or evaluation versions) http://www.editix.com/download.html
• Online
  – Free Online XSLT Test Tool http://xslttest.appspot.com/

Summary

• Selects (a set of) elements within an XML document based on
  – Conditions
  – Hierarchy
• Usage
  – Retrieving info from a single XML document
  – Applying XSL style sheet rules
  – Making XQuery queries