RELAX NG, XML Schemas, Schematron
RELAX NG

RELAX NG is a schema language for XML. The key features of RELAX NG are that it:
• is simple
• is easy to learn
• has both an XML syntax and a compact non-XML syntax
• does not change the information set of an XML document
• supports XML namespaces
• treats attributes uniformly with elements so far as possible
• has unrestricted support for unordered content
• has unrestricted support for mixed content
• has a solid theoretical basis
• can partner with a separate datatyping language (such as W3C XML Schema Datatypes)

XML Schemas

XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents. XML Schema was approved as a W3C Recommendation on 2 May 2001, and a second edition incorporating many errata was published on 28 October 2004.

Schematron

The Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. If you know XPath or the XSLT expression language, you can start to use The Schematron immediately.

The Schematron allows you to develop and mix two kinds of schemas:
• Report elements allow you to diagnose which variant of a language you are dealing with.
• Assert elements allow you to confirm that the document conforms to a particular schema.

The Schematron is based on a simple action:
• First, find the context nodes in the document (typically elements) based on XPath path criteria;
• Then, check whether some other XPath expressions are true for each of those nodes.

The Schematron can be useful in conjunction with many grammar-based structure-validation languages: DTDs, XML Schemas, RELAX, TREX, etc.
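The two-step action described above (select context nodes, then test XPath expressions against each of them) can be sketched as a small schema. This is only an illustration: it assumes the ISO Schematron namespace, and the names `book`, `title` and `draft` are hypothetical and do not come from any real vocabulary.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Step 1: the context attribute selects the nodes to check
         (here every "book" element; a hypothetical name). -->
    <rule context="book">
      <!-- Step 2a: an assert emits its message when the test is FALSE,
           i.e. when the document does not conform. -->
      <assert test="title">A book must contain a title element.</assert>
      <!-- Step 2b: a report emits its message when the test is TRUE,
           which suits diagnosing a variant of the language. -->
      <report test="@draft = 'yes'">This book is marked as a draft.</report>
    </rule>
  </pattern>
</schema>
```

Both tests are ordinary XPath expressions evaluated relative to each context node, which is why knowledge of XPath is enough to get started.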
Schematron is itself part of an ISO standard (DSDL: Document Schema Definition Languages) designed to allow multiple, well-focussed XML validation languages to work together. You can even embed a Schematron schema inside an XML Schema <appinfo> element or inside a RELAX NG schema!

Comparison table (columns: DTD (W3C) | XML Schemas | RELAX | TREX | RELAX NG | Schematron). Where a row has fewer cells than columns, the cells are listed in their original left-to-right order, and a column name is given only where the content identifies it.

overview
    DTD (W3C): an XML structure definition with a list of legal elements
    XML Schemas: an object-oriented XML schema language
    RELAX: a pattern-based, user-friendly XML schema language
    TREX: a pattern specification language for the structure and content of an XML document
    RELAX NG: a schema language created by unifying RELAX Core and TREX (Tree Regular Expressions for XML)
    Schematron: a rules-based XML schema language
grammar
    DTD (W3C): possesses its own compact but non-XML grammar
    XML Schemas: object-like, XML syntax
    RELAX NG: both an XML syntax and a compact non-XML syntax
datatyping
    DTD (W3C): no (yes, but weak; applies only to attributes)
    XML Schemas: yes
    RELAX NG: yes (datatype systems can be plugged in)
support for XML namespaces: none; yes; yes
can partner with others
    RELAX NG: yes, with a separate datatyping language
Vendor support: (no entries)
Post-Schema-Validation-Infoset: yes; yes; no
complexity: high
can express non-determinism: no; yes
rules expression: no; no; no; yes, using XPath (Schematron)
?structures?: yes; yes; yes; no
?integrity?: yes
flexibility: poor; intermediate (weak support for unordered content); high for structures; top, but all must be defined
notes
    XML Schemas: a Schema is relatively easy to extend and good for data-oriented applications
    TREX: TREX has been merged with RELAX to create RELAX NG. All future development of TREX will take place as part of the RELAX NG effort.

• How to obviate parsing problems? (i.e. how to have a well-formed XML document containing the needed data)
  => find another way to express the unparsed data
  => use external data files (non-XML) containing the unparsed data
• RELAX NG: is it possible to reference a single definition of an element from another file.rng?
• Data structure is easily represented by RELAX NG!
• How to translate the following text into RELAX NG?

    <!-- Definition of Annotation follows -->
    <xs:complexType name="Annotation">
      <xs:annotation>
        <xs:documentation>Concise processing directives for downstream applications.</xs:documentation>
      </xs:annotation>
      <xs:sequence>
        <xs:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>

• How is the "xs:anyType" translated into RELAX NG from <xs:element name="type" type="xs:anyType"/>?
• With RELAX NG, the style of your schema (Russian doll, DTD-like, or content-oriented) has an impact on its extensibility.
  => The content-oriented option is the most extensible.
• Some interesting RELAX NG features:
  => the ability to define attributes wherever you want in your patterns,
  => the flexibility and freedom with which you can combine patterns, and the lack of restrictions associated with these combinations,
  => regular expressions can be used to specify or constrain datatypes,
  => previously defined structures can be reused very easily.
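A tentative answer to the two translation questions above, offered as a sketch and not a definitive mapping: an `xs:any` wildcard with `processContents="skip"` only requires well-formed content, which RELAX NG can express with the `anyName` name class in a recursive definition. `xs:anyType` additionally allows arbitrary attributes and mixed content. The wrapper elements `record` and `annotation` are hypothetical and exist only so that the grammar has a start pattern; the definition names `anyElement` and `anyContent` are likewise chosen for illustration.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <!-- "record" and "annotation" are hypothetical wrapper names. -->
    <element name="record">
      <element name="annotation">
        <ref name="Annotation"/>
      </element>
      <!-- Counterpart of <xs:element name="type" type="xs:anyType"/> -->
      <element name="type">
        <ref name="anyContent"/>
      </element>
    </element>
  </start>

  <!-- Counterpart of the Annotation complex type:
       zero or more elements of any name, contents not constrained. -->
  <define name="Annotation">
    <zeroOrMore>
      <ref name="anyElement"/>
    </zeroOrMore>
  </define>

  <!-- Any element, with any attributes, text and child elements. -->
  <define name="anyElement">
    <element>
      <anyName/>
      <ref name="anyContent"/>
    </element>
  </define>

  <!-- Arbitrary attributes and mixed content, in any order. -->
  <define name="anyContent">
    <zeroOrMore>
      <choice>
        <attribute>
          <anyName/>
        </attribute>
        <text/>
        <ref name="anyElement"/>
      </choice>
    </zeroOrMore>
  </define>
</grammar>
```

The named definitions also illustrate how previously defined structures are reused through `ref`. On the earlier question about referencing definitions held in another file, RELAX NG provides `include` (which merges another grammar's definitions so that they can be referenced with `ref`) and `externalRef` (which references the whole pattern in another file).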