2.3 JAXP: Java API for XML Processing JAXP 1.1 How can applications use XML processors? Included in Java since JDK 1.4 – A Java -based answer: through JAXP An interface for “plugging -in ” and using XML – An overview of the JAXP interface processors in Java applications » What does it specify? – includes packages » What can be done with it? » org.xml.sax : SAX 2.0 » How do the JAXP components fit together? » org.w3c.dom: DOM Level 2 » javax.xml.parsers : [Partly based on tutorial “An Overview of the APIs ” at initialization and use of parsers http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview » javax.xml.transform : /3_apis.html, from which also some graphics are borrowed] initialization and use of transformers (XSLT processors) XPT 2006 XML APIs: JAXP 1 XPT 2006 XML APIs: JAXP 2 Later Versions JAXP: XML processor plugin (1) JAXP 1.2 (2002) adds property -strings for Vendor -independent method for selecting setting the language and source of a schema processor implementation at run time used for (non -DTD -based) validation – principally through system properties JAXP 1.3 included in JDK 5.0 (2005) javax.xml.parsers.SAXParserFactory javax.xml.parsers.DocumentBuilderFactory – more flexible validation (decoupled from parsing) javax.xml.transform.TransformerFactory – support for DOM3 and XPath – Set on command line (for example, to use Apache Xerces as the DOM implementation): We'll restrict to basic ideas from JAXP 1.1 java -Djavax.xml.parsers.DocumentBuilderFactory = org.apache.xerces.jaxp.DocumentBuilderFactoryImpl XPT 2006 XML APIs: JAXP 3 XPT 2006 XML APIs: JAXP 4 JAXP: XML processor plugin (2) JAXP: Functionality −> – Set during execution ( Saxon as the XSLT impl ): Parsing using SAX 2.0 or DOM Level 2 System.setProperty ( "javax.xml.transform.TransformerFactory ", Transformation using XSLT "com.icl.saxon.TransformerFactoryImpl "); – (We ’ll study XSLT in detail later) By default, reference implementations used Adds functionality missing from SAX 2.0 and – Apache Crimson/ Xerces as the XML parser – Apache Xalan as the XSLT processor DOM Level 2: Supported by a few compliant processors: – controlling validation and handling of parse errors can – Parsers: Apache Crimson and Xerces , Aelfred , » error handling be controlled in SAX, Oracle XML Parser for Java by implementing ErrorHandler methods – XSLT transformers: Apache Xalan , Saxon – loading and saving of DOM Document objects XPT 2006 XML APIs: JAXP 5 XPT 2006 XML APIs: JAXP 6 JAXP Parsing API JAXP: Using a SAX parser (1) .newSAXParser () Included in JAXP package .getXMLReader () javax.xml.parsers Used for invoking and using SAX … Used for invoking and using SAX … XML SAXParserFactory spf = SAXParserFactory .newInstance (); .parse ( ”f.xml ”) and DOM parser implementations: f.xml DocumentBuilderFactory dbf = DocumentBuilderFactory .newInstance (); XPT 2006 XML APIs: JAXP 7 XPT 2006 XML APIs: JAXP 8 JAXP: Using a SAX parser (2) JAXP: Using a DOM parser (1) We have already seen this : .newDocumentBuilder () SAXParserFactory spf = SAXParserFactory .newInstance (); try { .newDocument () SAXParser saxParser = spf. newSAXParser (); XMLReader xmlReader = saxParser. getXMLReader (); ... .parse( ”f.xml ”) xmlReader .setContentHandler (handler ); xmlReader .parse (fileNameOrURI ); ... f.xml } catch (Exception e) { System.err.println(e.getMessage ()); System.exit(1); }; XPT 2006 XML APIs: JAXP 9 XPT 2006 XML APIs: JAXP 10 JAXP: Using a DOM parser (2) DOM building in JAXP Document Parsing a file into a DOM Document : Builder DocumentBuilderFactory dbf = (Content DocumentBuilderFactory .newInstance (); Handler ) try { // to get a new DocumentBuilder : XML Error DOM Document DocumentBuilder builder = Reader Handler dbf. newDocumentBuilder (); XML Document domDoc = (SAX DTD builder. parse (fileNameOrURI ); Parser ) Handler } catch (ParserConfigurationException e) { Entity e.printStackTrace ()); Resolver System.exit(1); }; DOM on top of SAX - So what ? XPT 2006 XML APIs: JAXP 11 XPT 2006 XML APIs: JAXP 12 JAXP: Controlling parsing (1) JAXP: Controlling parsing (2) Errors of DOM parsing can be handled – by creating a SAX ErrorHandler Further DocumentBuilderFactory » to implement error , fatalError and warning methods configuration methods to control the form of and passing it to the DocumentBuilder : the resulting DOM Document : builder. setErrorHandler (new myErrHandler ()); domDoc = builder. parse (fileName ); setIgnoringComments (true /false) Parser properties can be configured : setIgnoringElementContentWhitespace (true /false) – for both SAXParserFactories and setCoalescing (true /false) DocumentBuilderFactories (before parser /builder • combine CDATA sections with surrounding text? creation ): factory. setValidating (true /false) setExpandEntityReferences (true /false) factory. setNamespaceAware (true /false) XPT 2006 XML APIs: JAXP 13 XPT 2006 XML APIs: JAXP 14 JAXP Transformation API JAXP: Using Transformers (1) earlier known as TrAX Allows application to apply a Transformer to a .newTransformer (…) Source document to get a Result document .transform (.,.) Transformer can be created – from XSLT transformation instructions (to be discussed later ) – without instructions Source » gives an identity transformation , which simply XSLT copies the Source to the Result XPT 2006 XML APIs: JAXP 15 XPT 2006 XML APIs: JAXP 16 JAXP Transformation APIs Source -Result combinations javax.xml.transform : Source Transformer Result – Classes Transformer and TransformerFactory ; initialization similar XML Content Reader Handler to parsers and parser factories (SAX Parser ) Transformation Source object can be DOM – a DOM tree, a SAX XMLReader or an input stream DOM Output Input Output Transformation Result object can be Stream Stream – a DOM tree, a SAX ContentHandler or an output stream XPT 2006 XML APIs: JAXP 17 XPT 2006 XML APIs: JAXP 18 JAXP Transformation Packages (2) Serializing a DOM Document as XML text Classes to create Source and Result objects By an identity transformation to an output stream: from DOM, SAX and I/O streams defined in packages TransformerFactory tFactory = – javax.xml.transform.dom , TransformerFactory .newInstance (); javax.xml.transform.sax , and // Create an identity transformer : javax.xml.transform.stream Transformer transformer = tFactory. newTransformer (); Identity transformation to an output stream is a DOMSource source = new DOMSource (myDOMdoc ); vendor -neutral way to serialize DOM documents StreamResult result = (and the only option in JAXP) new StreamResult (System.out ); – “I would recommend using the JAXP interfaces until the DOM ’s own load/save module becomes available ” » Joe Kesselman , IBM & W3C DOM WG transformer. transform (source , result ); XPT 2006 XML APIs: JAXP 19 XPT 2006 XML APIs: JAXP 20 Controlling the form of the result ? Creating an XSLT Transformer We could specify the requested form of the result by Then create a tailored transfomer : an XSLT script, say, in file saveSpec.xslt : StreamSource saveSpecSrc = <xsl:transform version="1.0" new StreamSource ( xmlns:xsl="http ://www.w3.org/1999/XSL/ Transform "> new File( ”saveSpec.xslt ”) ); Transformer transformer = <xsl:output encoding="ISO -8859 -1" indent="yes " tFactory. newTransformer (saveSpecSrc ); doctype -system="reglist.dtd " /> // and use it to transform a Source to a Result , <xsl:template match ="/"> // as before <! -- copy whole document : -- > <xsl:copy -of select ="." /> </ xsl:template > The Source of transformation instructions could be </ xsl:transform > given also as a DOMSource , URL, or character reader XPT 2006 XML APIs: JAXP 21 XPT 2006 XML APIs: JAXP 22 DOM vs. Other Java/XML APIs JAXP: Summary JDOM ( www.jdom.org ), DOM4J ( www.dom4j.org ), An interface for using XML Processors JAXB ( java.sun.com /xml/ jaxb ) JAXB ( ) – SAX/DOM parsers, XSLT transformers The others may be more convenient to use, Supports pluggability of XML processors but … but … “The DOM offers not only the ability to move Defines means to control parsing, and between languages with minimal relearning, but handling of parse errors (through SAX to move between multiple implementations in ErrorHandlers ) a single language – which a specific set of classes such as JDOM can ’t support ” Defines means to write out DOM Documents » J. Kesselman , IBM & W3C DOM WG Included in Java 2 XPT 2006 XML APIs: JAXP 23 XPT 2006 XML APIs: JAXP 24.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages3 Page
-
File Size-