Session 3: XML Information Processing

Session 3: XML Information Processing

Extreme Java G22.3033-007 Session 3 - Sub-Topic 4 XML Information Processing Dr. Jean-Claude Franchitti New York University Computer Science Department Courant Institute of Mathematical Sciences 1 Agenda Q XML applications development tools for Java Q XML application Development using the XML Java APIs Q Java-based XML application support frameworks Q Advanced XML Parser Technology Q JDOM: Java-Centric API for XML Q JAXP: Java API for XML Processing Q Parsers comparison Q Latest W3C APIs and Standards for Processing XML Q XML Infoset, DOM Level 3, Canonical XML Q XML Signatures, XBase, XInclude Q XML Schema Adjuncts Q Java-Based XML Data Processing Frameworks 2 1 XML-Based Software Development Q Business Engineering Methodology Q Language + Process + Tools Q e.g., Rational Unified Process (RUP) Q XML Application Development Infrastructure Q Metadata Management (e.g., XMI) Q XML APIs (e.g., JAXP, JAXB) Q XML Tools (e.g., XML Editors, XML Parsers) Q XML Applications: Q Application(s) of XML Q XML-based applications/services (markup language mediators) Q MOM & POP Q Other Services (e.g., persistence, transaction, etc.) Q Application Infrastructure Frameworks 3 More on XML Information Modeling Q Using UML use cases to support the development of DTDs and XML Schemas Q Establish linking relationship Q See Family tree application of XML 4 2 Part I XML Application Development Tools for Java 5 Java-enabled XML Technologies Q XML provides a universal syntax for Java semantics (behavior) Q Portable, reusable data descriptions in XML Q Portable Java code that makes the data behave in various ways Q XML standard extension Q Basic plumbing that translates XML into Java Q parser, namespace support in the parser, simple API for XML (SAX), and document object model (DOM) Q XML data binding standard extension 6 3 XML Processors Characteristics Q An XML engine is a general purpose XML data processor Q An XML processor/parser is a software engine that checks the syntax (well-formedness)of XML documents Q If a schema (or DTD) is included, the parser can (optionally) validate the correctness of XML documents’ structure against it Q A parser reads the XML document’s information and makes it accessible to the XML application via a standard API 7 Common XML APIs Q Document Object Model (DOM) API Q Tree structure-based API Q Issued as a W3C recommendation (10/98) Q See Session 5 Sub-Topic 1 Presentation Q Simple API for XML (SAX) Q Event-driven API Q Developed by David Megginson Q ElementHandler API Q Event-driven proprietary API provided by IBM’s XML4J Q Pure Java APIs: JDOM (Open Source) and JAXP 8 4 Java API Packages Q java.xml.parsers Q The JAXP APIs, which provide a common interface for different vendors' SAX and DOM parsers. Q Two vendor-neutral factory classes: SAXParserFactory and DocumentBuilderFactory that give you a SAXParser and a DocumentBuilder, respectively. The DocumentBuilder, in turn, creates DOM-compliant Document object. Q org.w3c.dom Q Defines the Document class (a DOM), as well as classes for all of the components of a DOM. Q org.xml.sax Q Defines the basic SAX APIs. Q jaxax.xml.transform 9 Q Defines the XSLT APIs that let you transform XML into other forms. Simple API for XML (SAX) Parsing APIs 10 5 SAX API Packages Q org.xml.sax Q Defines the SAX interfaces. Q org.xml.sax.ext Q Defines SAX extensions that are used when doing more sophisticated SAX processing, for example, to process a document type definitions (DTD) or to see the detailed syntax for a file. Q org.xml.sax.helpers Q Contains helper classes that make it easier to use SAX -- for example, by defining a default handler that has null-methods for all of the interfaces, so you only need to override the ones you actually want to implement. Q javax.xml.parsers Q Defines the SAXParserFactory class which returns the SAXParser. Also defines exception classes for reporting errors. 11 DOM Parsing APIs 12 6 DOM API Packages Q org.w3c.dom Q Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C. Q javax.xml.parsers Q Defines the DocumentBuilderFactory class and the DocumentBuilder class, which returns an object that implements the W3C Document interface. The factory that is used to create the builder is determined by the javax.xml.parsers system property, which can be set from the command line or overridden when invoking the newInstance method. This package also defines the ParserConfigurationException class for reporting errors. 13 XSLT APIs 14 7 XSLT API Packages Q See Session 3 handout on “Processing XML Documents in Java Using XPath and XSLT” Q javax.xml.transform Q Defines the TransformerFactory and Transformer classes, which you use to get a object capable of doing transformations. After creating a transformer object, you invoke its transform() method, providing it with an input (source) and output (result). Q javax.xml.transform.dom Q Classes to create input (source) and output (result) objects from a DOM. Q javax.xml.transform.sax Q Classes to create input (source) from a SAX parser and output (result) objects from a SAX event handler. Q javax.xml.transform.stream Q Classes to create input (source) and output (result) objects from an I/O stream. 15 JAXP and Associated XML APIs Q JAXP: Java API for XML Parsing Q Common interface to SAX, DOM, and XSLT APIs in Java, regardless of which vendor's implementation is actually being used. Q JAXB: Java Architecture for XML Binding Q Mechanism for writing out Java objects as XML (marshalling) and for creating Java objects from such structures (unmarshalling). Q JDOM: Java DOM Q Provides an object tree which is easier to use than a DOM tree, and it can be created from an XML structure without a compilation step. Q JAXM: Java API for XML Messaging Q Mechanism for exchanging XML messages between applications. Q JAXR: Java API for XML Registries Q Mechanism for publishing available services in an external registry, 16 and for consulting the registry to find those services. 8 Content of Jar Files Q jaxp.jar (interfaces) Q javax.xml.parsers Q javax.xml.transform Q javax.xml.transform.dom Q javax.xml.transform.sax Q javax.xml.transform.stream Q crimson.jar (interfaces and helper classes) Q org.xml.sax Q org.xml.sax.helpers Q org.xml.sax.ext Q org.w3c.dom Q xalan.jar (contains all of the above implementation classes) 17 Sample XML parsers and engines Q XML parsers Q RXP, Dan Connolly’s XML parser, XML-Toolkit, LTXML, expat, TCLXML, xparse, XP, DataChannel XPLparser (DXP), XML:Parse, PyXMLTok, Lark, Microsoft’s XML parser, IBM’s XML for Java, Apache’s Xerces-J, Aefred, xmlproc, xmllib, Windows foundation classes, Java Project X Parser (Crimson), OpenXML Parser, Oracle XML Parser, etc. Q SGML/XML parsers Q SGMLSpm, SP 18 9 Sample XML Parsers and Engines (continued) Q XML middleware: Xpublish (Media Design), XML middleware 1.0 Q DSSSL engines: Jade 1.1, DAE SDK, DAE Server SDK Q XSL processors: Sparse, Microsoft XSL processor, doproc, xslj, LotusXSL, Xalan, XSL:P Q XLink processors: xmllinks 19 Comprehensive List of XML Processors Q A comprehensive list of parsers is available at http//www.xmlsoftware.com/parsers Q Includes links to latest product pages Q Includes Version numbers, Licensing information, and Platform details Q Research work being done around MetaParsers and parallel XML parsers 20 10 Mainstream Java-Based XML Processors Q Sun’s Java Project X Parser Q Donated on April 13, 2000 to the Apache’s XML Project under the name “Crimson” Q Apache’s XercesJ Q XercesJ is strongly recommended for this course Q Oracle’s XML Parser for Java 21 Other Java-Based XML Processors Q Sun’s JAXP Q Jason Hunter and Brett McLaughlin’s OpenSource JDOM Q IBM Alphaworks’s XML for Java (XML4J) Q Based on the Apache Xerces XML Parser Q DataChannel’s XJParser 22 11 XML Data Binding Standard Extension Q Aims to automatically generate substantial portions of the Java platform code that processes XML data Q A Sun project, codenamed “Adelard” Q See JSR-31 XML Data Binding Specification Q see http://java.sun.com/xml/jaxp-1.0.1/docs/binding/DataBinding.html 23 Part II XML Application Development Using the XML Java APIs 24 12 Typical XML Processor Installation Q Pick a processor based on the features it provides to match your requirements Q Download and install the latest (or supported) version of the JDK from http://www.javasoft.com Q Install the XML processor Q Update the PATH and CLASSPATH variables as needed, and test the processor 25 Reading XML Documents Q Use Apache’s XercesJ or Alphaworks’ XML Parser for Java Q The “SimpleParse.java” application provided in section 2.4 of “XML and Java” will need to be adapted to support the latest version of the parsers Q We suggest looking at the source for the sample applications located in XercesSamples.jar Q For testing, use XML and Java’s sample document or the “personal.xml” sample XML document provided with XML4J 26 13 Presenting XML Documents Using Java Tools Q Presenting an XML document requires processing of the XML document by accessing its internal stucture Q An XML document’s structure can be accessed using the various XML APIs Q Various third party tools have been implemented using such APIs to apply XSL style sheets to XML documents and generate HTML output (e.g., Xalan, LotusXSL) 27 XML Data Exchange Protocols Q Message format alternatives Q Text-based (e.g., EDI, RFC822, SGML, XML) Q Binary (e.g., ASN.1, CORBA/IIOP) Q See XML and Java sections 7.2, and 7.4 Q An API that provide a common interface to work with EDI or XML/EDI objects is supported

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    32 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us