XML for Developers G22.3033-002

Session 7 - Main Theme XML Information Rendering (Part I)

Dr. Jean-Claude Franchitti

New York University Computer Science Department Courant Institute of Mathematical Sciences


Agenda

n Summary of Previous Session

n Extensible Stylesheet Language Transformation (XSL-T)

n Extensible Stylesheet Language Formatting Object (XSL-FO)

n XML and Document/Content Management

n Assignment 4a+4b (due in two weeks)

Summary of Previous Session

n Advanced XML Parser Technology

n JDOM: Java-Centric API for XML

n JAXP: Java API for XML Processing

n Parsers comparison

n Latest W3C APIs and Standards for Processing XML

n XML Infoset, DOM Level 3, Canonical XML

n XML Signatures, XBase, XInclude

n XML Schema Adjuncts

n Java-Based XML Data Processing Frameworks

n Assignment 3a+3b (due next week)

XML-Based Rendering Development

n XML Software Development Methodology

n Language + Stepwise Process + Tools

n Rational Unified Process (RUP) vs. “XML Unified Process”

n XML Application Development Infrastructure

n Metadata Management (e.g., XMI)

n XSLT, XPath, XSL-FO, APIs (JAXP, JAXB, JDOM, SAX, DOM)

n XML Tools (e.g., XML Editors, Apache’s FOP, Antenna House’s XSL Formatter), HTML/CSS1/2/3, XHTML, XForms, WCAG

n XML Applications Involved in the Rendering Phase:

n Application(s) of XML

n XML-based applications/services (markup language mediators)

n MOM, POP, Other Services (e.g., persistence)

n Application Infrastructure Frameworks

What is XSL?

n XSL is a language for expressing stylesheets. It consists of two parts:

n A language for transforming XML documents

n An XML vocabulary for specifying formatting semantics

n See http://www.w3.org/Style/XSL for the XSLT 1.0/XPath 1.0 Recs, the XSL-FO 1.0 candidate rec, and working drafts of XSLT 1.1/2.0 and XPath 2.0

n An XSL stylesheet specifies the presentation of a class of XML documents. It describes how an instance of the class is transformed into an XML document that uses the formatting vocabulary

XML Data Rendering Patterns

n Manipulating and Rendering XML Structures Using Java

n XSL-T

n Transform

n Sort

n Output

n XSL-T + -FO

n Format

n Output

n Querying will be covered separately
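The Transform / Sort / Output steps listed above can be sketched in Java with the JAXP transformation API that ships with the JDK. This is a minimal illustration, not the course's reference code: the `books`/`book`/`title` vocabulary, the class name `SortDemo`, and the helper `transform` are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SortDemo {
    // Hypothetical sample input; the element names are illustrative only.
    static final String XML =
        "<books>"
      + "<book><title>XML and Java</title></book>"
      + "<book><title>Beginning XML</title></book>"
      + "<book><title>Professional Java XML</title></book>"
      + "</books>";

    // Transform: one template; Sort: xsl:sort; Output: plain text.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/books'>"
      + "<xsl:for-each select='book'>"
      + "<xsl:sort select='title'/>"
      + "<xsl:value-of select='title'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Titles are emitted in alphabetical order, separated by semicolons.
        System.out.println(transform(XML, XSL));
    }
}
```

Changing `xsl:output` to `method='html'` or `method='xml'` switches the output step without touching the sorting logic.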

eXtensible Stylesheet Language (XSL)

n DSSSL & DSSSL-O

n CSS 1, 2, 3 …

n http://www.w3.org/Style/CSS/

n XSLT

n XPath

n XSL-FO

n XSLT Processors

n Stylus Studio XSL development environment

n IBM XSL Editor

n Saxon and Xalan XSLT processors

n XSL-FO Processors

n Antenna House

n FOP

XSL Processing

n http://www.w3.org/Style/XSL/

n Processing Alternatives:

n HTML + CSS -> Presentation

n XML + CSS -> Presentation

n XML + XSLT -> XSL-FO -> Presentation

n XML + XSLT -> XML/HTML + CSS -> Presentation

n Client or Server Processing?

n See Session 2 handout on IE5’s implementation of the XSL Spec.

n Examples

n See Session 2 Sub-Topic 1 Presentation: Beginning XML

n See Session 2 handouts on XSL Tree Transformation Language

n See Session 2 handout on Cascading Stylesheets

n See Session 2 handout on Styling Documents Using XSL

A Language for “Mapping XML” (LMX)

n LMX is a sample textbook application

n LMX can convert a document in one DTD into another DTD and vice versa

n LMX uses rules to describe bi-directional “MOM” conversions between two sets of documents

n Rules have a “from-pattern” and a “to-pattern”, used respectively to match the source document and to construct the target document

n Some restrictions apply to LMX patterns in order to keep the program as simple as possible

n LMX can also be used to convert an XML document to HTML (“POP” application)

How Does the LMX Processor Work?

n LMX makes heavy use of the DOM 1.0 API

n LMX uses XML4J internally to:

n Parse a rule file

n Parse a source document

n Generate a target document

n See chapter 4.3 in the XML and Java textbook for a detailed description of the LMX implementation

LMX vs. the eXtensible Stylesheet Language (XSL)

n LMX and XSL both provide a syntax to encode “Style Sheets”

n Each XML document can be associated with a style sheet that describes how elements should be organized and formatted for presentation

n XSL style sheets provide custom appearances that give a web site a unified look and feel


How Does XSL Work?

n An XSL style sheet is an XML document

n XSL elements in an XSL style sheet correspond to a series of XSL “transformation” rules (i.e., XML tree transformation and/or formatting rules)

n XSL rules describe how particular XML tags are to be converted to “flow objects” as the document is read

Part I

Extensible Stylesheet Language Transformation (XSLT)


XSL Transformations

n Assume root element of style sheet is

n Each element contains one or more rule elements

n Each rule has a target and an action

n Target is a pattern defining to which XML elements the rule applies (in XSLT 1.0, an XPath-based match pattern rather than a regular expression)

n Action is the list of flow objects generated when the rule is applied:

n Actions output a series of HTML tags in combination with the content of the element

n Actions may output XML tags obtained via transformation of original XML data

n Actions may output non-markup text, or run simple scripts or programs

n Actions may use JavaScript to provide more complex and dynamic behaviors
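As a rough illustration of the target/action idea, the sketch below expresses two rules in XSLT 1.0 syntax and runs them through the JDK's JAXP processor. It assumes that a "rule" maps to `xsl:template`, the "target" to its `match` pattern, and the "action" to the template body; the `doc`/`para` vocabulary and the class name `RuleDemo` are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RuleDemo {
    // Hypothetical source document.
    static final String XML = "<doc><para>Hello</para><para>World</para></doc>";

    // Two rules: the match attribute is the target, the template body is the action.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      // Target: doc elements. Action: wrap the processed children in html/body.
      + "<xsl:template match='doc'>"
      + "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      // Target: para elements. Action: emit an HTML p tag around the element content.
      + "<xsl:template match='para'>"
      + "<p><xsl:apply-templates/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```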

XSL Transformations (continued)

n Conceptual Representation of XSL Transformations: action (…)

XSL-T and Templates

n XSLT rules are also called “Templates”

n There may not be rules to match every element

n Elements can be reordered on the output.

n XSL style sheet must be well-formed

n e.g., an HTML empty tag must be written in XML empty-element form (closed with “/>”) within an XSL style sheet action

n XSLT elements used as a basis for a simple stylesheet are:

n , , , , and
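The well-formedness requirement can be seen in a small Java sketch: the stylesheet writes the line break as an XML empty element, and the processor's HTML output method serializes it back as a plain HTML tag. The `address`/`line` vocabulary and the class name `WellFormedDemo` are hypothetical, and the choice of `br` as the empty tag is an assumption made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class WellFormedDemo {
    // Hypothetical source document.
    static final String XML =
        "<address><line>First line</line><line>Second line</line></address>";

    // The stylesheet is itself XML, so the break must be written as <br/>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='address'>"
      + "<p><xsl:apply-templates select='line'/></p>"
      + "</xsl:template>"
      + "<xsl:template match='line'>"
      + "<xsl:value-of select='.'/><br/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // With method='html' the serializer writes the break in HTML form.
        System.out.println(transform(XML, XSL));
    }
}
```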


XSLT Elements and Functions

n Creating Elements and Attributes

n xsl:element, xsl:attribute

n Iteration and Sorting (e.g., xsl:sort)

n Conditional Processing

n xsl:apply-templates select=“…”, xsl:if, xsl:choose

n Copying Nodes (e.g., xsl:copy)

n Combining Stylesheets

n xsl:import, xsl:include

n Defining Variables & Parameters (e.g., xsl:variable)

n Scripting with XPath functions
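Several of the elements listed above can be combined in one short stylesheet. The Java sketch below uses `xsl:element`, `xsl:attribute`, `xsl:choose`, and a stylesheet parameter set from Java via `Transformer.setParameter`. The `items`/`item`/`product` vocabulary, the `threshold` parameter, and the class name `ElementsDemo` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ElementsDemo {
    // Hypothetical source document.
    static final String XML =
        "<items><item price='5'>pen</item><item price='50'>bag</item></items>";

    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      // Stylesheet parameter with a default; Java may override it.
      + "<xsl:param name='threshold' select='10'/>"
      + "<xsl:template match='/items'>"
      + "<products><xsl:for-each select='item'>"
      // xsl:element and xsl:attribute build the output node by node.
      + "<xsl:element name='product'>"
      + "<xsl:attribute name='tier'>"
      + "<xsl:choose>"
      + "<xsl:when test='number(@price) &lt; number($threshold)'>cheap</xsl:when>"
      + "<xsl:otherwise>premium</xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:attribute>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:element>"
      + "</xsl:for-each></products>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet with the given threshold parameter value.
    static String transform(String xml, String xsl, String threshold) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            t.setParameter("threshold", threshold);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL, "10"));
    }
}
```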

Parsers with XSLT Support

n SAX 2.0 or DOM Level 2 1.0 Support Required

n Apache’s Xalan XSLT parser

n org.apache.xalan.processor/templates/transformer

n org.apache.

n Saxon XSLT parser

n JAXP 1.1 (javax.xml.transform)

n TrAX

n Supported by Xalan 2.0 and Saxon 6.1

n Sun’s XSLTC

n Converts stylesheets to class files (“translets”)
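The `javax.xml.transform` API separates compiling a stylesheet from applying it, which is similar in spirit to the translet idea: a `Templates` object is built once and then reused for many documents. The sketch below is a minimal example under that assumption; the `greeting` vocabulary and the class name `TemplatesDemo` are hypothetical, and whichever processor the JDK's `TransformerFactory.newInstance()` selects is used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatesDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>"
      + "Hello, <xsl:value-of select='@to'/>!"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet once; the Templates object can be shared and reused.
    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates a fresh Transformer per document from the compiled stylesheet.
    static String apply(Templates templates, String xml) {
        try {
            Transformer t = templates.newTransformer();
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Templates compiled = compile(XSL);
        System.out.println(apply(compiled, "<greeting to='Xalan'/>"));
        System.out.println(apply(compiled, "<greeting to='Saxon'/>"));
    }
}
```

Reusing one `Templates` instance avoids re-parsing the stylesheet for every request, which matters for server-side rendering.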

Part II

Extensible Stylesheet Language Formatting Object (XSL-FO)


XSL Formatting

n XSL flow objects are markup text

n Markup language output flow objects can be HTML, DSSSL, VRML, etc.

n We will focus on HTML output flow objects (simpler, more widely understood, better supported by current tools, and do not require an extra level of translation)


XSL Formatting Characteristics

n XSL formatting is simpler than DSSSL (Document Style Semantics and Specification Language, pronounced “dissal”, ISO std 10179:1996)

n XSL formatting is more powerful than CSS (Cascading Style Sheets)

n XSL’s basic formatting syntax is understandable by anybody acquainted with DSSSL or CSS

Part III

XML and Document/Content Management


What is an XSL Processor?

n An XML document and its associated style sheet are combined by an XSL processor to produce an HTML document

n The XSL Processor applies the style sheet to the XML document and outputs static HTML

n The process can be automated with CGI scripts, Java servlets, or ActiveX controls to convert XML to HTML on the fly

n An XSL processor is a standalone program or is part of a larger XML browser

How Does an XSL Processor Work?

n The XSL processor consults the style sheet to find the rule that matches the element

n The XSL processor takes whatever action is associated with the rule:

n outputs element’s content plus assorted markup

n performs more complicated operations (sorting XML data before outputting it, running a Javascript program on the XML data, adding missing content to XML data, etc.)

How Does an XSL Processor Work? (continued)

n XSL processor formats each element upon receipt

n XSL processor may process elements recursively

n XSL processor receives input from XML processor and outputs formatted data based on the nature of the elements it receives

n E.g., XSL processor receives element

n XSL processor may output same content as bold text

n If processor is an audio renderer, it may pump up the volume a notch...


Mainstream XSL Processors

n See Microsoft’s XML and XSL Samples and Demos at http://msdn.microsoft.com/xml

n See IBM’s LotusXSL, Apache’s xalan, and fop. Look at Appendix E of the class textbook for relevant information on XSL

n A comprehensive list of XSL formatters and XSLT engines/editors/utilities is available at http://www.xmlsoftware.com

n Includes links to latest product pages

n Includes version numbers, licensing information, and platform details

DOM 1.0 XSL Processing Support

n The DOM Level 1 specification does not support XSL stylesheets

n Microsoft’s initial version of MSXML DOM included a DOM Level 1 extension that added support for XSL stylesheets

n The function transformNode(…) was used to apply an XSL stylesheet to an existing XML document

n Similar extensions were emulated early on by other XSL processors (LotusXSL, xalan, fop, etc.)

n DOM Level 2 1.0 formalizes rendering support
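In Java, a comparable "apply a stylesheet to an in-memory DOM tree" operation is available through JAXP rather than through a DOM extension. The sketch below is not the MSXML `transformNode` call; it is a JAXP analogue that uses `DOMSource` and `DOMResult`. The `note`/`subject` vocabulary, the class name `DomTransformDemo`, and the helper names `parse` and `applyStylesheet` are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DomTransformDemo {
    // Hypothetical source document.
    static final String XML = "<note><subject>XSL</subject></note>";

    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/note'>"
      + "<html><body><h1><xsl:value-of select='subject'/></h1></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses XML text into a namespace-aware DOM Document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Applies the stylesheet to an existing DOM tree and returns a new DOM tree.
    static Document applyStylesheet(Document source, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            DOMResult result = new DOMResult(); // the transformer creates the Document
            t.transform(new DOMSource(source), result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document html = applyStylesheet(parse(XML), XSL);
        System.out.println(html.getDocumentElement().getNodeName());
    }
}
```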


Xalan

n Xalan-J version 2.1.0 is the latest

n Provides XSL-T processing for transforming XML documents into HTML, text, or other XML document types

n Built on top of SAX 2.0, DOM Level 2 1.0, JAXP 1.1

n Implements the TrAX subset of JAXP 1.1

FOP

n Latest version is 0.19

n xml.apache.org/fop, www.jtauber.com

n Print formatter driven by XSL-FO objects

n Formatted output is in PDF format for now

n Can be embedded in a Java application by instantiating org.apache.fop.apps.Driver
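A print formatter such as FOP consumes an XSL-FO document, which is typically produced by an XSLT step. The sketch below covers only that first step (XML + XSLT -> XSL-FO) using the JDK's JAXP processor; it does not call FOP, whose embedding API is not shown here. The `report`/`para` vocabulary and the class name `FoDemo` are hypothetical, and the formatting-object names follow the XSL 1.0 Recommendation, which may differ from what an early FOP release accepts.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FoDemo {
    // Hypothetical source document.
    static final String XML = "<report><para>Quarterly summary</para></report>";

    // Maps the source vocabulary onto formatting objects in the fo: namespace.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "<xsl:template match='/report'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='page'><fo:region-body/></fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='page'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<xsl:apply-templates select='para'/>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      // Each para becomes one block-level formatting object.
      + "<xsl:template match='para'>"
      + "<fo:block font-size='12pt'><xsl:value-of select='.'/></fo:block>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the XSL-FO document as text; a formatter would render it to PDF.
    static String toFo(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toFo(XML, XSL));
    }
}
```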


Frameworks

n Cocoon 2

n Xang

n Batik

Part IV

Conclusions


Summary

n XSL style sheets describe how individual elements are displayed in HTML

n An XSL processor like LotusXSL converts an XML document and its associated style sheet into an HTML document that can be read by current web browsers

n Style instructions are stored in rule elements

Summary (continued)

n Each rule has a pattern and an action

n The pattern defines the elements to which the rule applies

n The action specifies the flow objects that the XSL processor outputs when the rule fires

n When multiple rules apply to one element, only the most specific rule is applied

n Flow objects usually include the content of the element, along with some combination of HTML markup
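The "most specific rule wins" behavior can be observed with two XSLT 1.0 templates that both match the same element. In the sketch below, a wildcard template and a named-element template compete for `title`; under XSLT 1.0 default priorities the named pattern is preferred. The `doc`/`title`/`para` vocabulary and the class name `PriorityDemo` are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PriorityDemo {
    // Hypothetical source document.
    static final String XML = "<doc><title>T</title><para>P</para></doc>";

    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      // Generic rule: matches every element, then descends into child elements.
      + "<xsl:template match='*'>"
      + "<xsl:text>generic:</xsl:text><xsl:value-of select='name()'/><xsl:text>;</xsl:text>"
      + "<xsl:apply-templates select='*'/>"
      + "</xsl:template>"
      // Specific rule: also matches title, and takes precedence over the wildcard.
      + "<xsl:template match='title'>"
      + "<xsl:text>specific:title;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Only the specific rule fires for title; doc and para fall to the generic rule.
        System.out.println(transform(XML, XSL));
    }
}
```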

Readings

n Readings

n XML Development with Java 2: Chapter 5

n Professional Java XML: Chapters 7, 8, and Appendix G

n XML and Java: Chapter 4

n Handouts posted on the course web site

n Review WCAG status on W3C web site

n Project Frameworks Setup (ongoing)

n Apache’s Web Server, Tomcat/JRun, and Cocoon

n Apache’s Xerces, Xalan, Saxon

n Antenna House XML Formatter, Apache’s FOP, X-smiles

n Publishing Systems at http://www.xmlsoftware.com

n Visibroker 4.5, WebLogic 6.1

n POSE & KVM (See Session 3 handout)

Assignment

n Assignment #4:

n This part of the project focuses on the application content model design/development using XML information rendering technology. The design/development process should adhere to the following steps: (a) Identifying rendering/transformation targets, (b) Defining the optimal rendering approach for each target, (c) Considering data rendering issues when designing an overall application data model

n More specific project related information, and extra credit assignments will be provided during the session


Next Session: XML Information Rendering (Part II)

n XML/XSL and JSP/JavaBeans Rendering Technology

n Internationalization Issues

n Web Content Accessibility Guidelines (WCAG)
