Introduction to XML Tutorial Description

With your HTML knowledge, you have a solid foundation for working with markup languages. However, unlike HTML, XML is more flexible, Bebo White allowing for custom tag creation. This course [email protected] introduces the fundamentals of XML and its related technologies so that you can create your own markup language. InterLab 2006 FermiLab October 2006

Topics* What Is Markup? • XML well-formed documents • Information added to a text to make its structure • Validation concepts comprehensible • DTD syntax and constructs • Pre-computer markup (punctuational and presentational) • W3C Schema syntax and constructs • Word divisions • XSL(T) syntax and processing • Punctuation • XPath addressing language • Copy-editor and typesetters marks • Development and design considerations • Formatting conventions • XML processing model • XML development and processing tools

* Tutorial plus references Computer Markup (1/3) Computer Markup (2/3)

• Any kind of codes added to a document • Declarative markup (cont) • Typesetting (presentational markup) • Names and structure • Macros embedded in ASCII • Framework for indirection • Commands to define the layout • Finer level of detail (most human-legible signals are • MS Word, TeX, RTF, Scribe, Script, nroff, etc. overloaded) • *Hello* Æ Hello • Independent of presentation (abstract) ••/Hello//Hello/ Æ Hello • Often called “semantic” • Declarative markup • HTML (sometimes) ••XMLXML

Computer Markup (3/3) Markup – ISO-Definitions • Semantic Markup • Markup – Text that is added to the data of a • Authors put annotations into their texts to help the document in order to convey information about it publisher to understand what type of text this is (e.g. • Descriptive Markup – Markup that describes “this is a heading”) the structure and other attributes of a document • Annotations are agreed between author and publisher in a non-system-specific way, independently of • Publisher decides on the layout any processing that may be performed on it • Descriptive markup • (PI) – Markup • Describing content not the layout consisting of system-specific data that controls • Markup to support search in documents how a document is to be processed • Words in headings are more important than in footnotes • Markup for machines vs. markup for humans Markup Language Features • Stylistic (appearance) • • Structural (layout) Hypertext Markup •


Language (HTML) • Semantic (meaning) ••<TITLE> • <META NAME=keywords CONTENT = " …... " > • Functional (action) • <BLINK> • <A HREF = "[link]">Click here</A></p><p>Hypertext Markup Language Some Problems (1/2)</p><p>• HTML – The Markup Language used to represent Web pages for viewing by people • Rendered and viewed in a <a href="/tags/Web_browser/" rel="tag">Web Browser</a> • Not extensible • Documents • Easy to write – Markup your data with tags • Platform independent • Can contain links to Images, documents, and other pages • HTML is an application/instance of SGML (Standard Generalized Markup Language, ISO 8879:1986 – used for defining Markup Languages) • For further information: http://www.w3.org/MarkUp/ Some Problems (2/2) Observations on HTML <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> • Powerful for Presentation (Focus on Client-Side) <html><head> <title>The Some Problems Example • Cascading Style Sheets (CSS) • Allows for dynamic behavior using scripting/ DHTML

Separation Of Concerns

There are a lot of problems using • Allows for proprietary extension (ActiveX, plug-ins, HTML for Web Application development,, etc.). ifif youyou dodo notnot separateseparate concerns.concerns.

• Easy to write and generate, but: The Bold and Italictalic example:example:

• Difficult to parse While rendering is easy nowadays. The semantic of this markup is not clear. • No support for extending semantics, e.g. using your own tags • REMEMBER: Do not develop AApplicationspplications in this manner! • Difficult to apply disciplined approaches

XML (1/2) • The eXtensible Markup Language • XML is a universal format for structured eXtensible Markup documents and data on the Web • XML is a standard, interoperable way to Language (XML) describe data for flexible processing • Multi-format delivery • Schema-aware information retrieval • Transformation and dynamic data customization • Archival: standardized, self-describing XML (2/2) XML History • http://www.w3.org/XML/ • 1996 Development started • XML looks like markup (e.g., HTML) but in this • 1997 Public Drafts context the interpretation of data is the job of the application • E.g. Provided in paper form at WWW6, Santa Clara, CA • XML tags/elements/attributes are not predefined • XML uses a Document Type Definition (DTD) or • February, 1998 W3C REC an XML Schema to describe data • Based on experience: simplified form of SGML • XML with a DTD or XML Schema is designed to • XML derived from SGML – both are used for be self-descriptive defining Markup Languages • XML = 80% of SGML´s capabilities, 20% of SGML´s complexity

The W3C Standards* Process XML Facts • Consortium (W3C) • Important for Web development because it • Development is organized into WGs. removes two constraints: • Dependence on a single, inflexible Document • Working Group (~10) - set agenda /decide Type (HTML); • Special Interest Group (~100) - • The complexity of full SGML, whose syntax discuss/recommend allows many powerful but hard-to-program • W3C members (~500) - vote options • W3C Director (TimBL) - may veto • XML was not designed to do anything • The public--comment on public WDs; • XML is free and extensible adopt/reject • XML complements (not replaces) HTML XML and HTML XML Characteristics • XML was designed to “carry” data • Well-Formed – An XML document is well- • Two different goals: formed if it complies to the following rules: • Elements have an open and close tag: • XML – describe data and focus on what it is content • HTML – display data and focus on how it looks • Empty elements are closed by “ / ” e.g. • Attribute values are quoted • Valid – An XML document is well-formed and if its content conforms to the rules in its document type definition or schema • Validity allows an application to make sure the XML data is complete, is formated properly, and has appropriate attribute values.

The Two Worlds of XML The Two Worlds United • Markup of documents: the original • Documents and “semi-structured” data share • This perspective is our focus here features • Document representation was the primary problem • Hierarchical structure XML was created to solve • String content • Data exchange and protocol design • Variations in structure • XML turned out to fill important gaps • Their applications also share needs • Relational databases needed a way to share records • Need for a lingua franca, independent of APIs and multi-table data • Ability to cope with international characters • Protocol designers wanted a way to encapsulate • “Fit” with WWW and HTTP. structured data XML is More General Better Rendering than HTML • Tags label arbitrary information units • Fully internationalized • More suited to multiple purposes • Also better for visually-impaired users • “Looking right” is needed but not enough • Supports multiple renderings • Supports custom information structures • Customize to the user, time, situation, device • If you have “price” or “procedure”, you can make a • Separates formatting from structure tag for it, and validate its usage • And processing other than rendering • Can support many different information models • Large documents don’t break it • E.g., molecular models, vector graphics, etc. • Easy to trade off server/client work • More “teeth” to enforce consistent syntax • Artificial “next tiny bit” links no longer necessary • Works hard to avoid semi-interoperable docs • No searches that fail because big doc was split • XHTML is XML-conforming flavor of HTML • Clean existing HTML is already close...

XML Treats Documents like Databases XML Example • XML brings benefits of DBs to documents • A way of representing information • Schema to model information directly • XML documents (application of XML) are • Formal validation, locking, versioning, rollback... composed of elements and attributes ••ButBut because documents are fundamentally different in item some ways room item room OrderDate 200.00 dollars 200.00 dollars What is Structure When Structure is Essential • To Relational Database theorists, structure is: • Large scale data • Tables with fixed sets of non-repeating named fields, • Data with individual parts you care about that have little internal structure • (like price-tag, tool-list, citation, author,...) • E-R diagrams with fixed number of nodes • Need for good navigation tools • Structured documents are different: • Mission-critical information • The order of SECs, Ps, etc. matters (a lot) • Many hierarchical layers (which text crosses) • Information that must last • Text/graphic data mixes with aggregate objects • Multi-author publishing process • Optional or repeatable sub-parts abound • Multiple delivery media • Interaction with natural language phenomena • These are very different requirements

What’s the Difference? XML Design Principles • Without structure • Straightforwardly usable over the Internet • Data conversion is far more expensive • Support for a wide variety of applications • Multi-platform and/or multi-media delivery require re- • Compatible with SGML authoring and hand-work • Paper production is inconsistent • Make writing XML programs easy • Late format changes are far more risky • Avoid optional features • Retrieval is prone to many false hits • Human-readable (if not terse) markup • “Pay me now, or pay me later” • Formal and concise design • Design produced quickly Opportunities with XML Web Usage • Scalability and openness of Web solutions • XML works with familiar Web paradigms • “Rich clients” for complex information • Locations are expressed as URIs • Dynamic user views • High interoperability because of few options • XML as interprocess communication protocol for • Easily implementable and usable “data” (as opposed to “text”) • Robust against network failures • eCommerce integration • Avoids serving schemas every time with documents • (but can do better validation anyway, when • New methods of creation needed) • Schema combination/composition • Free-form, schema-less data development

Some Additional XML Details Well-formedness (1/2) • Well-formedness • Document has a single root element, and • Error handling • Elements nest properly • Case sensitivity • Entities are whole subtrees (not

) • HTML compatibility • No elements omission (close what you open) • Attributes must be quoted ••<< and & must always be escaped in some way • A document can be well-formed (and parsable) whether or not it fits a given schema Well-formedness (2/2) Tutorial Outline • Part 1: The basics of creating an XML document • Part 2: Developing constraints for a well …… formed XML document • Part 3: XML and supplementary technologies

Part 1: Background for XML Main Components of an XML Document • An eXtensible Markup Language (XML) document describes the structure of data • Elements: • XML and HTML have a similar syntax … • Attributes: both derived from SGML • Entities: < (<) • XML has no mechanism to specify the • Comments: format for presenting data to the user • Advanced Components • An XML document resides in its own file • CData Sections with an ‘.’ extension • Processing Instructions The Basic Rules Common Errors for Element Naming • XML is case sensitive Naming • All start elements must have end elements • Do not use white space when creating • Empty elements may be “self-closing” – e.g., names for elements ,
• Element names cannot begin with a digit, • Elements must be properly nested although names can contain digits • XML declaration is the first statement • Only certain punctuation allowed – • Every document must contain a root element periods, colons, and hyphens • Attribute values must have quotation marks • Certain characters are reserved for parsing

Try It! • Open XML Editor (www.philo.de/xmledit) • Peter’s XML Editor (www.iol.ie/~pxe)

• Notepad (but cannot check for well- formedness and validity Elements vs. Attributes Problems Using Attributes • Data can be stored in child elements or in • Attributes cannot contain multiple values (child attributes elements can) • Attributes are not easily expandable (for future modifications) • Attributes cannot describe structures (child elements can) • Attributes are more difficult to manipulate with program code • Attribute values are not easily tested against DTDs or Schemas • Metadata should be attributes; data should be elements

Part 2: Legal Building Blocks of What are the Parts of an XML Document? XML • A Document Type Definition (DTD) or XML • The DTD • Comments Schema allows the developer to create a • Elements • Marked sections set of rules to specify legal content and • Attributes • Processing place restrictions on an XML file • General entities instructions • If the XML document does not follow the ••CharacterCharacter • Notations rules contained within the DTD or Schema, references • Identifiers and a parser generates an error catalogs • An XML document that conforms to the rules within a DTD or Schema is said to be valid Error Handling Case Sensitivity

•• “Draconian error handling” ••HTMLHTML is • Major errors cause processor to stop • Case-insensitive for tag names:

=

passing data in the “normal way” • Case-sensitive for entity names: < ≠ < • Fatal errors: • XML is case-sensitive for both! • Unicode standard advises against case-folding • Ill-formed document • Folding is not well-defined for all languages • Certain entity references in incorrect places • Turkish has two lower-case i’s, only one upper • Misplaced character-encoding declarations • In languages with no accented caps, can’t reverse • This helps save huge $ on error-recovery • Error-prone for programmers • XHTML uses lower case

Practice Validating XHTML

XML System Architectures An (X)HTML System How Do You Get the Data?

Documents, stylesheets, and other data can all be expressed in XML. Any application can plug in via an API But their information is accessed directly. (X)HTML •Web Server called “” document

XML Parser Information data structure DOM Interface Internet (tree+links) Web Client This model can work locally or over a Parser, DTD/ network. Parsing, tree-building, and access formatter, Schema interface can shift between client/server

Server-side XML Publishing XML Everywhere • XML separates representation from structure • So you can use the same parsers, network protocols, Server transforms to HTML/CSS; Ship to client browser for display tree managers, and APIs to access documents, stylesheets, search and query, etc.

Browser/ XML XSL(T) HTML • XML allows separating application parts Interface data +CSS http • So you can mix and match formatters, search engines, networks and protocols, etc.

Stylesheet • XML separates out semantics • So you can control style or search semantics without Very common current strategy; having to mangle your documents to do it Leverages current technology HTML Compatibility Footnote • XHTML is an XML application • XML is text, but isn't meant to be read • One schema among many (probably a • Applications can store their data or respond in popular one, of course) Web-compliant style (Text) instead of binary • Web browser should start supporting format generic XML regardless of tag-set. • XML is verbose by design • Don’t hard-code sizes and names • Data + Markup is in most cases larger than a binary format – but disk space is cheap, HTTP supports compression on the fly (gzip) • As XML defines Markup Languages… • XHTML, XSL, XForms, etc. are applications of XML

The XML “Alphabet Soup” (1/3) The XML “Alphabet Soup”(2/3)

DTD Document Type Non-XML schema XML Extensible Markup Defines XML documents 9 Definition Language 9 DOM Document Object Model API to read, create and edit XML 9 documents; creates in-memory object XSL Extensible Stylesheet Language for expressing stylesheets; model 9 Language consists of XSL(T) and XSL-FO SAX Simple API for XML API to parse XML documents; event- 9 driven XSL(T) XSL Transformations Language for transforming XML 9 documents XML XML Schema Definition an XML based alternative to DTD 9 Schema XSL-FO XSL Formatting Language to describe precise layout of (XSD) Objects text on a page XPath XML Path Language A language for addressing parts of an 9 XML document, designed to be used Data Island XML data embedded in a HTML page by both XSL(T) and XPointer XPointer XML Pointer Language Supports addressing into the internal Data Binding Automatic population of HTML elements from XML data 9 structures of XML documents A collection of names, identified by a URI reference, which are XLink XML Linking Language Describes links between XML Namespace 9 documents used in XML documents XQuery XML Query Language Flexible mechanism for querying XML (draft) data as if it were a database The XML “Alphabet Soup”(3/3)

SOAP Simple Object Access A simple XML based protocol to let 9 Protocol applications exchange information over HTTP WSDL Web Services An XML-based language for describing 9 Description Language Web services and how to access them WAP Wireless Application The leading standard for information Protocol services on wireless terminals like digital mobile phones WML Wireless Markup WAP uses the mark-up language WML Language (not HTML)

XML Information Set Document Analysis • What data in an XML document “counts”? • Cycle of steps; repeat until out of time • Elements, attributes, content • Identify project requirements/audience • Order and hierarchy of elements • Using those, identify information items in the • No whitespace within tags document that could be important • All whitespace within elements • Make sure you have a way to use that • Not which kind of quotes around attributes information • Required for interoperability • Identify restrictions on those items • Applications must not count nodes differently • Identify structural constraints that may be • W3C “Document Object Model” is related needed • DOM is an API for XML, not an O.M. • Identify non-semantic features that may be important for presentation, etc. Project Requirements Identifying Information Items • Know the audience/readers • This is pretty much a manual process • Know the authors • Often best done with paper and • Don’t forget the editorial/clerical staff highlighters and post-its • These 3 groups are the experts, you are • In later stages, adding tags to a text the detail person transcript can be useful. • Don’t make a lifetime commitment to your • The more documents you’ve looked at processing model, but have one in mind; and thought about, the easier this analysis without limitations is dangerous becomes.

Issues to Think About Restrictions on Data Items • Cross-references • Content model • Structural divisions (headings, blurbs, • Data values (are there controlled or semi- ambiguities) controlled vocabularies?) • Tradeoff between freedom and • Are there “authority files” for large open processing sets (like lists of authors) • Normalization of data items • How variable is the content, and how • What external data and catalogs may realistic the idea to normalize it. exist What is a Data Island? • XML data embedded in an (X)HTML document • Unique to IE IE Data Islands • Uses the “unofficial” element Content Models • These are modeled on regular expressions • In DTD, each element has one content DTD model for all time (Document Type Definition) • Similarly, each element has one set of attributes for all time • Attributes and content models are completely independent

Ambiguity Web-compliant Data Definitions • Extensible Markup Language (XML) 1.0 (Third Edition) • A content model is ambiguous if it W3C Recommendation 04 February 2004 contains an alternation (a | b) where the http://www.w3.org/TR/REC-xml/ “The function of the markup in an XML document is to describe its content models a and b cannot be storage and logical structure and to associate attribute-value distinguished by their first element. pairs with its logical structures. XML provides a mechanism, the document type declaration, to defineine constraintsconstraints onon thethe logicallogical • A content model is ambiguous if an structure and to support the use of predefined storage units.” optional occurrence indicator is followed Document Type Declaration – Contains or points to markup by a submodel whose first element is not declarations that provide a grammar for a class of documents. different. This grammar is known as a document type definition, or DTD. Document Type Definition – Set of markup declarations includedincluded inin oror referencedreferenced byby anan XMLXML Document.Document.

• Design using e.g. Diagramming Technique Why Use a DTD? XML 1.0 DTDs

• A single DTD ensures a common format for • DTDs let you say: each XML document that references it • What element types can occur and where • An application can use a standard DTD to • What attributes each element type can have verify that data that it receives from the outside world is valid • What notations are in use • A description of legal, valid data further • What external entities can be referenced contributes to the interoperability and • Standard DTDs exist in almost every efficiency of using XML domain • Some repositories exist, such as xml.org

XML Declaration Document Type Definition • Document Type Definition Syntax • Document Type Definition ::= XMLDecl? Misc* • Not required, but typically used (doctypedecl Misc*)? • Attributes include: • XMLDecl ::= '‘ • encoding – the character encoding in the document • VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"') • standalone – if yes no external DTD required • Eq ::= S? '=' S?[26] • • VersionNum ::= ([a-zA-Z0-9_.:] | '-')+ • • Misc ::= Comment | PI | S • S ::= White Space • Example - Document Type Definition DOCTYPE • Document Type Definition Syntax • Doctypedecl ::= '‘ • DeclSep ::= PEReference | S • Specify a DTD for the document • Markupdecl ::= elementdecl | AttlistDecl | EntityDecl | • Refer to a DTD using a URI NotationDecl | PI | Comment • Include a DTD inline as part of the • Types of Markup Declaration: document • Element Type Declaration • Attribute-List Declaration • Example: Refer to a DTD • Entity Declaration •

Mixed Specification Basic Operators • Mixed ::= • Joining '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*' • Sequence a,b,c | '(' S? '#PCDATA' S? ')' • Alternation a | b | c • Name must not appear more than once • Grouping (a) • Example • Repetition • (#PCDATA) – Only parsed Character Data allowed (= Text). Restricts all Child-Elements • 0 or more a* to be of Type Text. • 1 or more a+ • Optional a? Data PCDATA ••#PCDATA#PCDATA • Parsed character data ••CDATACDATA • Text occurring in a context in which • Element names markup and entity references may occur • Model groups • Mixed content (#PCDATA | x | …)* • ANY • EMPTY

CDATA Diagramming Technique (1/2) Description Notation A Component called A (A) Allows to define special sections of character data, which the processor does A Component that can not (#PCDATA) not interpret as markup be decomposed called A

Anything inside is treated as plain text A A is optional or occurs A? Example: once A • ]]>

A A occurs 0 to n times A* Diagramming Technique (2/2) Declaration, Definition, Data Description Notation A B Concatenation/Series • (A,B) B after A

A • (A|B) Selection A or B B • Example Account= (Room, Example (Minibar,Food+)?, Total) Account: Total= (#PCDATA)

Room Minibar Food Total

Cautions Concerning DTDs • All element declarations begin with • The ELEMENT declaration is case sensitive • The programmer must declare all elements within an XML file • Elements declared with the #PCDATA content model can not have children • When describing sequences, the XML document must contain exactly those elements in exactly that order. The DTD (schema) Terminology (1/4) • A DTD is a simple schema, based on SGML • Element: a text feature distinguished by markup • They consist of declarations for the parts: • Tag: a string in angle brackets. or . Two • tags delimit an element • • Content: anything in an element (children in the • parse tree) tags and characters between an • Can reference from DOCTYPE, or include: element’s tags • … element ]> • Element Type Name: a string like “p” or “img” • Other schema languages are available that identifies the type of an element • They use XML syntax (why not?)

Terminology (2/4) Terminology (3/4) • Entity: abstraction of an item of data storage. • Document Type declaration (DOCTYPE): • General entity: entity whose text is contained in declaration of root element of a document its declaration. instance, can refer to: • External entity: entity whose content is stored • External subset: DTD (XML declarations) externally to its declaration stored as an external entity. • Declaration: meta-markup that declares entities, • Internal subset: declarations contained content models, etc. within a DOCTYPE declaration. ATTLIST • Document instance: the tags and content in an declarations must be parsed, and XML document, not counting declarations interpreted. Terminology (4/4) Anatomy of an Element • Content Model: description of restrictions on the content of an element

• Model Group: content model Attribute (character) entity subexpression in parentheses Attribute Attribute reference Element type name value Element type • Repetition indicator: *, +, ?

Use a hyphen: • Prolog: All of the stuff before the document ­.

instance starts. Start-tag Content End-tag

Element

Elements Element Type Declaration • Identify structural/semantic components • Allows to define name of an element and • Can (usually do) have children its Content Model • Represented by start-tags and end-tags: •

Hello, world.

• Name is the element type being declared • Some elements are EMPTY • Content-Specification: • Special syntax so parser knows:
ANY – Any use (assumed when no content • Schemas control what sub-element patterns can model is provided) occur with any given type of element EMPTY – No sub-elements allowed • Order matters / Context does not Mixed or Children specification Children Specification Attributes (1/2) • Each name is the type of an element which may • Specify properties/characteristics of elements appear as a child, as described in the grammar: • That generally apply to the elements as wholes • Syntax: • Values are atomic strings • Children ::= (choice | seq) ('?' | '*' | '+')? • Though applications may impose more structure • cp ::= (Name | choice | seq) ('?' | '*' | '+')? • Represented by assignments within start-tags: • choice ::= '(' S? cp ( S? '|' S? cp )+ S? ')‘ •

• seq ::= '(' S? cp ( S? ',' S? cp )* S? ')' • Schemas control what attributes can occur on • Example any given type of element • (room,total) – Sequence of two elements of type • One special type: ID, unique per document room and total. • Attributes are not ordered

Attributes (2/2) Attribute-List Declaration • Data types • Attributes – Used to associate name-value pairs with elements. • Default values / omissability • Attribute-List Declaration defines • AttlistDecl ::= '' AttDef ::= S AttributeName S AttType S DefaultDecl Attribute Types (1/2) Attribute Types (2/2) • String Type • Enumerated Types CDATA – Value is any literal string (v1|…|vn) – Value is one of the values • Tokenized Types provided in the declaration ID – Value must match name production and appear • Example: not more than once (only one ID per Element!) some element in the document ENTITY, ENTITIES – Value(s) must match name of unparsed entity NMTOKEN, NMTOKENS – Values(s) must match NMToken production

Attribute Defaults General Entities (1/2) • Attribute Default – Defines whether an • A lexical mechanism for inclusion attribute's presence is required and if not how to • But, constrained to including subtrees deal with it • This preserves fragment parsability • Syntax: • This allows lazy evaluation of structure nodes DefaultDecl ::= '#REQUIRED' | '#IMPLIED' | • Also used for referring to graphic or other non- (('#FIXED' S)? AttValue) directly-XML data objects #Required – Attribute must be specified • References occur in the document instance: #Implied – Attribute is optional • #Fixed – Required attribute; value is specified in &warn37;&warn12;... quotes AttValue – Contains the declared default value • Declarations associate the name with a URI or a “public identifier” General Entities (2/2) Entity Examples • Simple • Common Entity Declarations • • • • External • • • “http://www.w3.org//”> • Character and Entity Reference ••Character:Character: < • Entity (Declaration above): < > • Parameter-Entity Reference for order.dtd: • Declaration: • Usage: • Means: item = (book | cdrom) +

Entity Declaration Parameter Entities • Entities – define storage units of an XML- • Declaring document. They are either parsed or unparsed. • • Allow for better maintenance • Parsed entity – content is text replacement • may not be text (text may be other than XML) ••UsingUsing • Very powerful tool for advanced usage, cf. XML • %include-file; Specification for full details. • ]]> • Cannot redefine predefined entities. Character References Comments (1/2) • Can be used to obtain untype-able characters Contents are ignored by the XML • Such as Kanji for users with English processor keyboards Cannot come before the XML declaration • Map directly to a Unicode code point Cannot appear inside an element tag • Represented much like entity references: • Decimal: ㋱ • May not include double hyphens • Hex: 뻯 • Schemas do not affect these

Comments (2/2) Marked Sections • Can go most anywhere • Two purposes: • (though not inside tags) • Escaping a lot of markup • Represented as: • Conditional inclusion • • In XML: • Have simpler syntax than in SGML/HTML • Escaping only in the document instance: ••NotNot • Hello

]]> ••NotNot • Thus a transform with no templates except for the root, strips markup from a document Template Priority What Goes in a Template? • Multiple templates may match an element • Literal XML to output •