Session 3: XML Information Modeling (Part I)
XML for Java Developers
G22.3033-002
Session 3 - Main Theme
XML Information Modeling (Part I)
Dr. Jean-Claude Franchitti
New York University
Computer Science Department
Courant Institute of Mathematical Sciences

Agenda
- Summary of Previous Session
- XML Physical Entities
- Logical Structure of XML Documents
- XML Document Navigation
- Java APIs
- Custom Markup Languages
- Readings
- Assignment #1b (1 week)
- Assignment #2a+2b (2 weeks)

Summary of Previous Session
- History and Current State of XML Standards
- Advanced Applications of XML
- XML’s Extensible Stylesheet Language (XSL)
- Character Encodings and Text Processing
- XML and DBMSs
- Course Approach ...
- XML Application Development
- XML References and Class Project
- Readings
- Assignment #1a (reminder?) / Assignment #1b (1 week)

XML Physical and Logical Structure
- Physical Structure
  - Governs the content of a document in the form of storage units
  - Storage units are referred to as entities
  - See http://www.w3.org/TR/REC-xml#sec-physical-struct
- Logical Structure
  - Which elements are to be included in a document
  - In what order the elements should be included
  - See http://www.w3.org/TR/REC-xml#sec-logical-struct

XML Physical Entities
- Allow you to assign a name to some content, and to use that name to refer to it
- Eight Possible Combinations:
  - Parsed vs. Unparsed
  - General vs. Parameter
  - Internal vs. External
- Five Actual Categories:
  - Internal parsed general
  - Internal parsed parameter
  - External parsed general
  - External parsed parameter
  - External unparsed general

Logical Structure: Namespaces
- See Namespaces 1.0
- Sample Element:
    <z:a z:b="x" c="y" xmlns:z="http://www.foo.com/"/>
- Corresponding DTD Declaration:
    <!ELEMENT z:a EMPTY>
    <!ATTLIST z:a
      z:b     CDATA #IMPLIED
      c       CDATA #IMPLIED
      xmlns:z CDATA #FIXED "http://www.foo.com/">

Logical Structure: DTDs
- Shortcomings
  - Separate Syntax
      <!ELEMENT Para (#PCDATA)*>
      <Para>Some paragraph</Para>
    vs.
      <ElementType name="Para">
        <ContentModel><PCData/></ContentModel>
      </ElementType>
  - Lack of Support for Data-typing
    - A DTD treats element content as a string of characters, so it accepts both of the following:
        <Price currency="USD">1450</Price>
        <Price currency="USD">too high</Price>

Logical Structure: XML Schemas
- Structures
  - How elements and attributes are set up in an XML document
- Datatypes
  - Built-in datatypes (e.g., String, Boolean, numbers)
  - Generated datatypes (e.g., dates, times, real values)
  - Support for user-generated datatypes
- Backward compatibility with functional subset (DTD)
  - ID, IDREF, NMTOKEN, and SGML-based types
- Grouping of Elements/Attributes
  - Archetypes and Attribute Groups
- Inheritance
  - Via Basetypes, and Archetypes/Attribute Groups

Logical Structure: XML Schemas (continued)
    <datatype name='AgeInYears'>
      <basetype name='integer' URI="http://www.w3.org/xmlschemas/datatypes"/>
      <minInclusive>0</minInclusive>
      <maxInclusive>140</maxInclusive>
    </datatype>
    <attribute name="employeesAge" type="AgeInYears"/>

Logical Structure: Navigation
- URIs/URLs
  - Syntax for encapsulating a name in any registered namespace and labeling it with that namespace
  - Produce a member of the universal set of reachable objects
  - See http://www.w3.org/Addressing/
- XPath
  - Used to locate certain parts of an XML document
  - See Session 3 handout on “Processing XML documents in Java using XPath and XSLT”

JAXP and Associated XML APIs
- JAXP: Java API for XML Parsing
  - Common interface to the SAX, DOM, and XSLT APIs in Java, regardless of which vendor's implementation is actually being used.
- JAXB: Java Architecture for XML Binding
  - Mechanism for writing out Java objects as XML (marshalling) and for creating Java objects from such structures (unmarshalling).
- JDOM: Java DOM
  - Provides an object tree that is easier to use than a DOM tree and that can be created from an XML structure without a compilation step.
- JAXM: Java API for XML Messaging
  - Mechanism for exchanging XML messages between applications.
- JAXR: Java API for XML Registries
  - Mechanism for publishing available services in an external registry, and for consulting the registry to find those services.

Simple API for XML (SAX) Parsing APIs

DOM Parsing APIs

XSLT APIs

Java API Packages
- javax.xml.parsers
  - The JAXP APIs, which provide a common interface for different vendors' SAX and DOM parsers.
  - Two vendor-neutral factory classes, SAXParserFactory and DocumentBuilderFactory, give you a SAXParser and a DocumentBuilder, respectively. The DocumentBuilder, in turn, creates a DOM-compliant Document object.
- org.w3c.dom
  - Defines the Document interface (a DOM), as well as interfaces for all of the components of a DOM.
- org.xml.sax
  - Defines the basic SAX APIs.
- javax.xml.transform
  - Defines the XSLT APIs that let you transform XML into other forms.

SAX API Packages
- org.xml.sax
  - Defines the SAX interfaces.
- org.xml.sax.ext
  - Defines SAX extensions that are used for more sophisticated SAX processing, for example, to process a document type definition (DTD) or to see the detailed syntax of a file.
- org.xml.sax.helpers
  - Contains helper classes that make it easier to use SAX, for example, by defining a default handler that has null methods for all of the interfaces, so you only need to override the ones you actually want to implement.
- javax.xml.parsers
  - Defines the SAXParserFactory class, which returns the SAXParser. Also defines exception classes for reporting errors.

DOM API Packages
- org.w3c.dom
  - Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C.
- javax.xml.parsers
  - Defines the DocumentBuilderFactory class and the DocumentBuilder class, which returns an object that implements the W3C Document interface.
    The factory that is used to create the builder is determined by the javax.xml.parsers.DocumentBuilderFactory system property, which can be set from the command line or overridden when invoking the newInstance method. This package also defines the ParserConfigurationException class for reporting errors.

XSLT API Packages
- javax.xml.transform
  - Defines the TransformerFactory and Transformer classes, which you use to get an object capable of doing transformations. After creating a transformer object, you invoke its transform() method, providing it with an input (source) and output (result).
- javax.xml.transform.dom
  - Classes to create input (source) and output (result) objects from a DOM.
- javax.xml.transform.sax
  - Classes to create input (source) objects from a SAX parser and output (result) objects from a SAX event handler.
- javax.xml.transform.stream
  - Classes to create input (source) and output (result) objects from an I/O stream.

Content of Jar Files
- jaxp.jar (interfaces)
  - javax.xml.parsers
  - javax.xml.transform
  - javax.xml.transform.dom
  - javax.xml.transform.sax
  - javax.xml.transform.stream
- crimson.jar (interfaces and helper classes)
  - org.xml.sax
  - org.xml.sax.helpers
  - org.xml.sax.ext
  - org.w3c.dom
- xalan.jar (contains all of the above implementation classes)

XML Information Modeling
- Steps
  - Documenting the Information Structure
  - Representing the Information Structure in XML Form
  - Defining XML DTDs and/or Schemas
- Modeling Techniques
  - UML: object modeling
  - XML: content modeling
  - ORM: data modeling
  - See Session 3 handout on “XML Information Modeling”
- UML, MOF and XMI
  - See Session 3 handouts on “UML, MOF, and XMI” and “OMG’s XML Metadata Interchange Format (XMI)”

Open Information Model
- Analysis and Design Model
  - Unified Modeling Language (UML) - uml.dtd
  - UML Extensions - umlx.dtd
  - Common Data Types - dtm.dtd
  - Generic Elements - gen.dtd
- Components and Object Model
  - Component Description Model - cde.dtd
- Database and Warehousing Model
  - Database Schema Elements - dbm.dtd
  - Data Transformation Elements - tfm.dtd
  - OLAP Schema Elements - olp.dtd
  - Record Oriented Legacy Databases - rec.dtd
- Knowledge Management Model
  - Semantic Definition Elements - sim.dtd

Custom Markup Languages
- Mathematical Markup Language (MathML)
- OpenMath
- Chemical Markup Language (CML)
- Geography Markup Language (GML)
- Wireless Markup Language (WML)
- Synchronized Multimedia Integration Language (SMIL)
- Scalable Vector Graphics (SVG)
- Extensible 3D (X3D)
- XML-Based User Interface Language (XUL)
- Extensible Log Format (XLF)

Readings
- Readings
  - XML Development with Java 2: Chapter 4
  - Professional Java XML: Chapter 5
  - XML and Java: Chapter 2
  - Handouts posted on the course web site
  - Review XML 1.0, XPath 1.0, XML Schema W3C Recs
- Project Frameworks Setup (ongoing)
  - Apache’s Web Server, Tomcat/JRun, and Cocoon
  - Apache’s Xerces, Xalan, Saxon
  - Antenna House XML Formatter, Apache’s FOP, X-Smiles
  - Visibroker 4.5 (or AppServer), WebLogic 6.1
  - POSE & KVM (MIDP/PADP)

Assignment
- Assignment #2a:
  - This part of the project focuses on discovering the application business model using XML information modeling technology. The discovery process should adhere to the following steps: (a) documenting the information structure, (b) representing the information structure in XML form, (c) defining XML DTDs and/or Schemas
  - More specific project-related information and extra credit assignments will be provided during the session

Assignment (continued)
- Assignment #2b:
  - This part of the project relies on the business model discovery process suggested in Assignment #2a, and should demonstrate the use of UML use cases to support the development of XML DTDs and/or Schemas

Next Session: XML Information Modeling (Part II)
- Advanced Logical Structuring and XML Schemas
- XML Metadata Management
- XML Linking/Pointer Language, XML Base, and XML Inclusions
- XML Data Binding
- Industry Specific Markup Languages
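
Supplementary sketch: JAXP factories

The Java API Packages slides describe two vendor-neutral factories, SAXParserFactory and DocumentBuilderFactory, and a default SAX handler whose methods you override selectively. The following is a minimal sketch of that pattern, not a definitive implementation. The class name JaxpSketch and the method names rootNameWithDom and countElementsWithSax are hypothetical, chosen only for illustration; the sample input reuses the Price element from the DTD slide.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the course material.
class JaxpSketch {

    /** Builds a DOM tree from the XML text and returns the root element's name. */
    public static String rootNameWithDom(String xml) throws Exception {
        // The factory hides which vendor's DOM parser is actually used.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    /** Streams the XML text through SAX and counts start-element events. */
    public static int countElementsWithSax(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        // DefaultHandler supplies empty methods; only startElement is overridden.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Price currency=\"USD\">1450</Price>";
        System.out.println(rootNameWithDom(xml));       // prints "Price"
        System.out.println(countElementsWithSax(xml));  // prints "1"
    }
}
```

The DOM path holds the whole document in memory as a tree, while the SAX path only reacts to events as they stream past, so neither method needs to name a concrete parser implementation.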