Chapter 7 XML.Pptx

Chapter 7 XML.Pptx

Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 [email protected] Chapter 7 - XML XML Origin and Usages n Defined by the WWW Consortium (W3C) n Originally intended as a document markup language, not a database language n Documents have tags giving extra information about sections of the document n For example: n <title> XML </title> n <slide> XML Origin and Usages </slide> n Meta-language: used to define arbitrary XML languages/vocabularies (e.g. XHTML) n Derived from SGML (Standard Generalized Markup Language) n standard for document description n enables document interchange in publishing, office, engineering, … n main idea: separate form from structure n XML is simpler to use than SGML n roughly 20% complexity achieves 80% functionality © Prof.Dr.-Ing. Stefan Deßloch XML Origin and Usages (cont.) n XML documents are to some extent self-describing n Tags represent metadata n Metadata and data are combined in the same document n semi-structured data modeling n Example <bank> <account> <account-number> A-101 </account-number> <branch-name> Downtown </branch-name> <balance> 500 </balance> </account> <depositor> <account-number> A-101 </account-number> <customer-name> Johnson </customer-name> </depositor> </bank> © Prof.Dr.-Ing. Stefan Deßloch Forces Driving XML n Document Processing n Goal: use document in various, evolving systems n structure – content – layout n grammar: markup vocabulary for mixed content n Data Bases and Data Exchange n Goal: data independence n structured, typed data – schema-driven – integrity constraints n Semi-structured Data and Information Integration n Goal: integrate autonomous data sources n data source schema not known in detail – schemata are dynamic n schema might be revealed through analysis only after data processing © Prof.Dr.-Ing. Stefan Deßloch XML Language Specifications XSL XML Link XML Pointer XPath XQuery XSLT XSL-FO XML Schema XML Namespace XML Metadata Interchange XHML eXtensible Markup Language Cascading Style Sheets Unified Modeling Language Standardized Generalized Markup Language Meta Object Facility Unicode Document Type Definition © Prof.Dr.-Ing. Stefan Deßloch XML Documents n XML documents are text (unicode) n markup (always starts with '<' or '&') n start/end tags n references (e.g., &lt, &amp, …) n declarations, comments, processing instructions, … n data (character data) n characters '<' and '&' need to be indicated using references (e.g., &lt) or using the character code n alternative syntax: <![CDATA[ (a<b)&(c<d) ]]> n XML documents are well-formed n logical structure n (optional) prolog (XML version, …) n (optional) schema n root element (possibly nested) n comments, … n correct sequence of start/end tags (nesting) n uniqueness of attribute names n … © Prof.Dr.-Ing. Stefan Deßloch XML Documents: Elements n Element: section of data beginning with <tagname> and ending with matching </tagname> n Elements must be properly nested n Formally: every start tag must have a unique matching end tag, that is in the context of the same parent element. n Mixture of text with sub-elements is legal in XML n Example: <account> This account is seldom used any more. <account-number> A-102</account-number> <branch-name> Perryridge</branch-name> <balance> 400 </balance> </account> n Useful for document markup, but discouraged for data representation © Prof.Dr.-Ing. Stefan Deßloch XML Documents: Attributes n Attributes: can be used to describe elements n Attributes are specified by name=value pairs inside the starting tag of an element n Example <account acct-type = "checking" > <account-number> A-102 </account-number> <branch-name> Perryridge </branch-name> <balance> 400 </balance> </account> n Attribute names must be unique within the element <account acct-type = “checking” monthly-fee=“5”> © Prof.Dr.-Ing. Stefan Deßloch XML Documents: IDs and IDREFs n An element can have at most one attribute of type ID n The ID attribute value of each element in an XML document must be distinct è ID attribute (value) is an 'object identifier' n An attribute of type IDREF must contain the ID value of an element in the same document n An attribute of type IDREFS contains a set of (0 or more) ID values. Each ID value must contain the ID value of an element in the same document n IDs and IDREFs are untyped, unfortunately n Example below: The owners attribute of an account may contain a reference to another account, which is meaningless; owners attribute should ideally be constrained to refer to customer elements © Prof.Dr.-Ing. Stefan Deßloch XML data with ID and IDREF attributes <bank-2> <account account-number=“A-401” owners=“C100 C102”> <branch-name> Downtown </branch-name> <balance>500 </balance> </account> . <customer customer-id=“C100” accounts=“A-401”> <customer-name>Joe</customer-name> <customer-street>Monroe</customer-street> <customer-city>Madison</customer-city> </customer> <customer customer-id=“C102” accounts=“A-401 A-402”> <customer-name> Mary</customer-name> <customer-street> Erin</customer-street> <customer-city> Newark </customer-city> </customer> </bank-2> © Prof.Dr.-Ing. Stefan Deßloch XML Document Schema n XML documents may optionally have a schema n standardized data exchange, … n Schema restricts the structures and data types allowed in a document n document is valid, if it follows the restrictions defined by the schema n Two important mechanisms for specifying an XML schema n Document Type Definition (DTD) n contained in the document, or n stored separately, referenced in the document n XML Schema © Prof.Dr.-Ing. Stefan Deßloch Document Type Definition - DTD n Original mechanism to specify type and structure of an XML document n What elements can occur n What attributes can/must an element have n What subelements can/must occur inside each element, and how many times. n DTD does not constrain data types n All values represented as strings in XML n Special DTD syntax n <!ELEMENT element (subelements-specification) > n <!ATTLIST element (attributes) > © Prof.Dr.-Ing. Stefan Deßloch Element Specification in DTD n Subelements can be specified as n names of elements, or n #PCDATA (parsed character data), i.e., character strings n EMPTY (no subelements) or ANY (anything can be a subelement) n Structure is defined using regular expressions n sequence (subel, subel, …), alternative (subel | subel | …) n number of occurences n “?” - 0 or 1 occurrence n “+” - 1 or more occurrences n “*” - 0 or more occurrences n Example <!ELEMENT depositor (customer-name account-number)> <!ELEMENT customer-name(#PCDATA)> <!ELEMENT account-number (#PCDATA)> <!ELEMENT bank ( ( account | customer | depositor)+)> © Prof.Dr.-Ing. Stefan Deßloch Example: Bank DTD <!DOCTYPE bank-2[ <!ELEMENT account (branch-name, balance)> <!ATTLIST account account-number ID #REQUIRED owners IDREFS #REQUIRED> <!ELEMENT customer(customer-name, customer-street, customer-city)> <!ATTLIST customer customer-id ID #REQUIRED accounts IDREFS #REQUIRED> … declarations for branch, balance, customer-name, customer-street and customer-city ]> © Prof.Dr.-Ing. Stefan Deßloch Describing XML Data: XML Schema n XML Schema is closer to the general understanding of a (database) schema n XML Schema supports n Typing of values n E.g. integer, string, etc n Constraints on min/max values n Typed references n User defined types n Specified in XML syntax (unlike DTDs) n Integrated with namespaces n Many more features n List types, uniqueness and foreign key constraints, inheritance .. n BUT: significantly more complicated than DTDs © Prof.Dr.-Ing. Stefan Deßloch XML Schema Structures n Datatypes (Part 2) Describes Types of scalar (leaf) values n Structures (Part 1) Describes types of complex values (attributes, elements) n Regular tree grammars repetition, optionality, choice recursion n Integrity constraints Functional (keys) & inclusion dependencies (foreign keys) n Subtyping (similar to OO models) Describes inheritance relationships between types n Supports schema reuse © Prof.Dr.-Ing. Stefan Deßloch XML Schema Structures (cont.) n Elements : tag name & simple or complex type <xs:element name=“sponsor” type=“xsd:string”/> <xs:element name=“action” type=“Action”/> n Attributes : tag name & simple type <xs:attribute name=“date” type=“xsd:date”/> n Complex types <xs:complexType name=“Action”> <xs:sequence> <xs:elemref name =“action-date”/> <xs:elemref name =“action-desc”/> </xs:sequence> </xs:complexType> © Prof.Dr.-Ing. Stefan Deßloch XML Schema Structures (cont.) n Sequence <xs:sequence> <xs:element name=“congress” type=xsd:string”/> <xs:element name=“session” type=xsd:string”/> </xs:sequence> n Choice <xs:choice> <xs:element name=“author” type=“PersonName”/> <xs:element name=“editor” type=“PersonName”/> </xs:choice> n Repetition <xs:element name =“section” type=“Section” minOccurs=“1” maxOccurs=“unbounded”/> © Prof.Dr.-Ing. Stefan Deßloch Namespaces n A single XML document may contain elements and attributes defined for and used by multiple software modules n Motivated by modularization considerations, for example n Name collisions have to be avoided n Example: n A Book XSD contains a Title element for the title of a book n A Person XSD contains a Title element for an honorary title of a person n A BookOrder XSD reference both XSDs n Namespaces specifies how to construct universally unique names © Prof.Dr.-Ing. Stefan Deßloch XML Schema Version of Bank DTD <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.banks.org" xmlns ="http://www.banks.org" > <xsd:element

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    48 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us