Transforming XML Documents

University of Dublin, Trinity College

What is XSL?
• XSL (eXtensible Stylesheet Language) consists of
  – A language for transforming XML documents
  – A language for specifying formatting properties
• Based on existing style sheet languages
  – Document Style Semantics and Specification Language (DSSSL)
  – Cascading Style Sheets (CSS)
• A style sheet language specifically for XML documents
• Uses an XML syntax
• XSL is a combination of three specifications
  – XML Path Language (XPath)
  – XSL Transformations (XSLT)
  – XSL-FO (Formatting Objects)
More @ http://www.w3.org/Style/XSL/

What are XSL Transformations?
• Language for transforming XML documents into other XML or non-XML documents
  – More generally, the input can be any document as long as it is represented as a tree
• Uses a collection of templates (rules) to transform the source document
• XPath is used to select the parts of a source document to transform
• Rule-based (declarative) language for transformations
• Server-side/client-side execution

Why XSL Transformations?
• Powerful transformation functionality
• An XML document may be completely restructured
  – Add/remove elements and attributes, re-arrange and sort elements
  – E.g. into an SQL CREATE TABLE/INSERT script
• Dynamic documents
  – XML to XML, XML to (X)HTML, XML to PDF
• Multiple renderings
• Good software support for transformations
• Easy and fast prototyping possible

XSLT versions
• XSLT 1.0 (W3C Recommendation since 1999)
  – Defined in two documents: the XSLT 1.0 and XPath 1.0 specifications
• XSLT 2.0 (W3C Recommendation since 2007)
  – Defined in a set of eight documents: XSLT 2.0, XPath 2.0, XQuery 1.0 …
  – Major enhancement to the language
    • Support for XML Schema: nodes and variables can have basic and custom datatypes
    • Multiple output documents
    • User-defined functions
  – Integration with XQuery

XSLT Processing Model
[Figure: an XML document and a style/transform sheet are fed into the transformation process, which produces the result document.]

XSLT Processing Model
• Different output formats
  – xml, html, text
• Multiple inputs
  – via document()
  – <xsl:value-of select="document('test.xml')/photo/title"/>
• Multiple outputs
  – <xsl:result-document> (XSLT 2.0)
• Multiple programs
  – via <xsl:include>
  – and <xsl:import>
    • <xsl:apply-imports> applies a template rule from an imported style sheet

Client-side Example
• https://www.cs.tcd.ie/Owen.Conlan/php/xpath/xml2xsl.html

Nodes in a Tree Model
Node types: Root, Element, Attribute, Namespace, Text, Comment, Processing Instruction

Example Tree
    <firstname> John </firstname>
    <surname> Smith </surname>

[Figure: the corresponding tree]
    root
      element firstname
        text "John"
      element surname
        text "Smith"

Fundamental Notion of xsl:template
• Elements in a template body are classified as either data nodes or instructions
• When the template is executed
  – Data nodes get copied to the result tree
  – Instructions are executed

Pattern to match (element firstname containing the text "John"):
    <firstname> John </firstname>

Template (<h2> and </h2> are data nodes, <xsl:value-of> is an instruction):
    <xsl:template match="firstname">
      <h2> <xsl:value-of select="."/> </h2>
    </xsl:template>

Result:
    <h2> John </h2>

Templates
• Templates represent a set of rules
• Rule matching is done within the current context
• Rules are not executed in order
• Default behaviour
  – Always happens unless overridden by a specific rule
• Matching templates
  – <xsl:template match='...' mode='...'>
    • Template instantiated for every node satisfying the match pattern
• Named templates
  – <xsl:template name='...'>
    • Template instantiated explicitly

xsl:template
XML Source:
    <?xml version='1.0'?>
    <grp id="G4">
      <p id="P3" sex="m">
        <name>Mike</name>
        <age>24</age>
      </p>
      <p id="P7" sex="f">
        <name>Tanja</name>
        <age>21</age>
      </p>
    </grp>

XSLT:
    <xsl:template match="/"> </xsl:template>
    <xsl:template match="name"> </xsl:template>
    <xsl:template match="//age"> </xsl:template>
    <xsl:template match="p/name"> </xsl:template>
    <xsl:template match="p[age='21']"> </xsl:template>
    <xsl:template match="p[@sex='f']"> </xsl:template>
    <xsl:template match="name[ancestor::grp/@id='G4']"> </xsl:template>

Built-in Templates
• XSLT provides default built-in template rules for each of the 7 kinds of nodes
• If no template rule matches a node, the default rule is invoked

    Node kind                 Built-in rule
    root                      calls <xsl:apply-templates> to process its children
    element                   calls <xsl:apply-templates> to process its children
    attribute                 copies the attribute value to the result tree
    text                      copies the text to the result tree
    comment                   does nothing
    processing instruction    does nothing
    namespace                 does nothing

Gaining control of the processing
• Built-in template rule for element nodes
  – <xsl:apply-templates/>
  – "select all children of the current node in the source tree and for each one find the matching template rule in the stylesheet and execute it"
• Controlling the sequence of processing: explicitly use instructions
  – With pattern match
    • <xsl:apply-templates/>
    • <xsl:apply-templates select="some_pattern"/>
  – With specific invocation
    • <xsl:call-template> to invoke a specific template by name rather than by pattern matching

Template Modes
• Modes are used to modify the rule set and context
• Allow us to process the same node more than once, but in a different way
• Example
  – <xsl:template match='...' mode='...'>

Conflict Resolution Policy
• What if more than one template rule has a pattern that matches a particular node?
• Conflict can be resolved…
  – By priority, either assigned explicitly in the template definition or allocated by the system. A higher value indicates a higher priority
  – Current stylesheet rules take precedence over imported stylesheet rules
  – Precedence is given to more specific rules
  – Where rules have the same import precedence and the same priority… it is down to the way the XSLT engine is implemented!

XSLT Elements
1. xsl:stylesheet
2. xsl:transform
3. xsl:output
4. xsl:template
5. xsl:apply-templates
6. xsl:call-template
7. xsl:text
8. xsl:value-of
9. xsl:copy-of
10. xsl:element
11. xsl:attribute
12. xsl:sort
13. xsl:variable
14. xsl:param
15. xsl:with-param
16. xsl:if
17. xsl:choose
18. xsl:when
19. xsl:otherwise
20. xsl:for-each

xsl:transform, xsl:stylesheet
    <?xml version="1.0"?>
    <xsl:transform version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      [...]
    </xsl:transform>

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      [...]
    </xsl:stylesheet>

xsl:output
    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="html" indent="yes"/>
      [...]
    </xsl:stylesheet>

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml"/>
      [...]
    </xsl:stylesheet>

xsl:apply-templates
XML Source:
    <?xml version='1.0'?>
    <grp id="G4">
      <p id="P3" sex="m">
        <name>Mike</name>
        <age>24</age>
      </p>
      <p id="P7" sex="f">
        <name>Tanja</name>
        <age>21</age>
      </p>
    </grp>

XSLT:
    <xsl:template match="/">
      <xsl:apply-templates select="grp"/>
    </xsl:template>

    <xsl:template match="grp">
      <xsl:apply-templates select="p"/>
    </xsl:template>

    <xsl:template match="p">
      <xsl:apply-templates />
    </xsl:template>

    <xsl:template match="name"> </xsl:template>

    <xsl:template match="age"> </xsl:template>

xsl:call-template
XML Source: the same grp document as above.

XSLT:
    <xsl:template match="grp">
      <dl>
        <xsl:apply-templates />
      </dl>
    </xsl:template>

    <xsl:template match="p">
      <dt>
        <xsl:value-of select="name" />
      </dt>
      <dd>
        <xsl:value-of select="age" />
        <xsl:call-template name="d" />
      </dd>
    </xsl:template>

    <xsl:template name="d">
      <xsl:text> (more than </xsl:text>
      <xsl:value-of select="365*number(age)" />
      <xsl:text> days)</xsl:text>
    </xsl:template>

Result:
    <dl>
      <dt>Mike</dt>
      <dd>24 (more than 8760 days)</dd>
      <dt>Tanja</dt>
      <dd>21 (more than 7665 days)</dd>
    </dl>

Result Tree Creation
• Generate the output file
• Literals (data nodes) - e.g. <p>, <li>, </p>, </li>
• <xsl:text> - send content directly to output (retains whitespace)
• <xsl:value-of> - extract element values (anywhere in the tree)
• <xsl:copy-of> - deep copy of the selected nodes
• <xsl:copy> - shallow copy of the current node
• <xsl:element> - instantiate an element
• <xsl:attribute> - instantiate an attribute
• <xsl:comment> - instantiate a comment
• <xsl:processing-instruction> - instantiate a processing instruction

xsl:text & text()
XML Source: the same grp document as above.

XSLT:
    <xsl:template match="/">
      <xsl:apply-templates select="grp"/>
    </xsl:template>

    <xsl:template match="grp">
      <ul>
        <xsl:apply-templates select="p/name"/>
      </ul>
    </xsl:template>

    <xsl:template match="name">
      <li>
        <xsl:text>Name: </xsl:text>
        <xsl:apply-templates />
      </li>
    </xsl:template>

    <xsl:template match="text()">
      <xsl:value-of select="."/>
    </xsl:template>

Result:
    <ul>
      <li>Name: Mike</li>
      <li>Name: Tanja</li>
    </ul>

© 2003 B. Jung

xsl:value-of
XML Source: the same grp document as above.

XSLT:
    <xsl:template match="/">
      <xsl:apply-templates select="grp"/>
    </xsl:template>

    <xsl:template match="grp">
      <ul>
        <xsl:apply-templates select="p"/>
      </ul>
    </xsl:template>

    <xsl:template match="p">
      <li>
        <xsl:value-of select="name"/>
        <xsl:text> </xsl:text>
        <xsl:apply-templates select="age" />
      </li>
    </xsl:template>

    <xsl:template match="age">
      <xsl:value-of select="."/>
    </xsl:template>

Result:
    <ul>
      <li>Mike 24</li>
      <li>Tanja 21</li>
    </ul>

xsl:copy-of
<?xml
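
The xsl:copy-of example is cut off in the source at this point. As a stand-in, here is a minimal sketch of how xsl:copy-of could be applied to the grp sample document used on the earlier slides. It is not the original slide's code: the wrapper element name "females" and the choice of select expression are assumptions made purely for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch, not the original slide's example. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Start at the document root and wrap the copies in a new element.
       "females" is an illustrative name, not taken from the slides. -->
  <xsl:template match="/">
    <females>
      <!-- Deep copy: every matching p element is copied together with
           its attributes and all of its descendants (name, age, text). -->
      <xsl:copy-of select="grp/p[@sex='f']"/>
    </females>
  </xsl:template>
</xsl:stylesheet>
```

Run against the grp sample, this should yield a females element containing the complete p element for Tanja, including its id and sex attributes. Replacing xsl:copy-of with value-of on the same expression would instead emit only the concatenated text of that element, which is the practical difference between the two instructions.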
