Using XML to Build the Annotated XML Specification
Total Page:16
File Type:pdf, Size:1020Kb
XML.COM - Building the Annotated XML Specification Building the Annotated XML Specification Sept. 12, 1998 Java, Publishing, Tools Books about XML Java, Publishing, Tools Using XML to Build the Amazon.com Book Search Annotated XML Specification Current Issue by Tim Bray Article Search Previous Issues The design of XML 1.0 stretched over 20 months ending in February 1998, with input from a couple of Annotated XML 1.0 hundred of the world's best experts in the area of What is XML? markup, publishing, and Web design. The result of that work, the XML 1.0 Specification, is a highly XML Resource Guide condensed document that contains little or no The Standards List information about how it came to read the way it does. NEW: The Submissions List Even before the release of XML 1.0, it became Authoring Tools Guide obvious that some parts of the spec were Content Mgmt. Tools self-explanatory, while others were causing headaches XML Events Calendar for its users. The Annotated XML Specification addresses both of View Messages these problems. It supplements the basic specification, Register first with historical background and explanation of Login how things came to be the way they are, and second Change your profile with detailed explanations of the portions of the spec that have proved difficult. Commercially, it has been a Puzzlin' Evidence success; in its first month on the Web, it had over XML Q&A 100,000 page views from over 26,000 unique Internet XML:Geek addresses. It remains, by a substantial margin, the most popular item available at the XML.com site. Seybold Publications W3 Journal: XML This article explains how I created the Annotated XML Specification. If you haven't looked at it, you might want to give it a glance before reading about it, or even better, open it in another browser window while you read about it here. XML Testbed Who We Are How the Annotated XML Specification Works Our Mission Tim describes the architecture of the AXML system Become a Sponsor and the design decisions he made. Flipping the Links How Java was used to convert the XML to HTML. http://www.xml.com/xml/pub/98/09/exexegesis-0.html (1 von 2) [07.07.1999 14:52:43] XML.COM - Building the Annotated XML Specification Conclusion: How Much Work Was It? The conclusion of Tim Bray's explanation of how he created the Annotated XML Specification. Next: How the Annotated XML Specification Works Seybold Seminars Web Review Perl.com WebCoder WebFonts W3 Journal A Songline PACE Production Copyright © 1998-1999 Seybold Publications and O'Reilly & Associates, Inc. XML is a trademark of MIT and a product of the World Wide Web Consortium. http://www.xml.com/xml/pub/98/09/exexegesis-0.html (2 von 2) [07.07.1999 14:52:43] XML.COM - How the Annotated XML Specification Works Sept. 12, 1998 Java, Publishing, Tools Building the Annotated XML Specification Books about XML Java, Publishing, Tools How the Annotated XML Specification Works Amazon.com Book Search by Tim Bray Current Issue The architecture of the system, illustrated below, is simple enough: Article Search Previous Issues Annotated XML 1.0 What is XML? XML Resource Guide The Standards List NEW: The Submissions List Authoring Tools Guide Content Mgmt. Tools XML Events Calendar View Messages Register Login The XML 1.0 specification is accessed in read-only mode. For convenience, I keep a local Change your profile copy in a file called xml.xml. All the annotations live in another single file (about 25% larger than the XML specification itself) named notes.xml. A Java program, based on my Lark Puzzlin' Evidence processor, reads both xml.xml and notes.xml and builds in-memory tree-structured XML Q&A representations of both. After processing all the links in notes.xml, the program writes out a XML:Geek file called target.html, which is the annotated version of the spec, and a large number of small files, each containing one of the annotations. The rest of this article outlines what's in those Seybold Publications files and what the program does. W3 Journal: XML Design Choices for the Annotation The first question I had to face in constructing the annotation was whether or not to use the highly-unfinished XLink and XPointer technologies, currently under development in the XML Testbed World Wide Web Consortium (W3C) XML Activity. XLink and XPointer have two large advantages: they are built for XML and can point at arbitrary locations inside a document. Who We Are Our Mission On the other hand, neither spec is nearly finished, and they have changed (in syntax, if not at a Become a Sponsor conceptual level) from draft to draft. Furthermore, there were (in early 1998) neither commercial nor freeware implementations available. So if I were going to use this technology, I was going to have to write all the software myself. I decided to go ahead with XLink and XPointer, and while the syntax described here is about a year behind the latest drafts, I believe that what I've done is conceptually in tune with the current thinking, so the syntax will be easy to upgrade once the spec settles down. XLink: A Quick Review Seybold Seminars Web Review The XLink spec is concerned with recognizing which elements are being used as links, and Perl.com with giving those links some useful structure and properties. HTML has a very simple solution WebCoder to this set of problems: all linking elements have to be named A, and they have to point at one WebFonts "resource", and the resource's address is found in the HREF= attribute. W3 Journal <A HREF="resource_address"> A Songline PACE Production XLink tries to be more general than HTML: any element can serve as a link, and is identified as such by using a magic reserved attribute, xml:link=. There are two kinds of XLinks, identified by the values xml:link="simple" and xml:link="extended". In the annotation, I only used the extended flavor, so I won't discuss simple XLinks here. Extended XLinks can have a bunch of useful associated information: in-line If a linking element is "in-line", the element itself is one of the ends of the link. All HTML links are in-line, as are all the links in the Annotated Spec. Out-of-line links are really the Wild Blue Yonder of hypertext theory and practice. http://www.xml.com/xml/pub/98/09/exexegesis-1.html (1 von 4) [07.07.1999 14:53:19] XML.COM - How the Annotated XML Specification Works labels XLinks can have labels, both machine-readable (provided in the role= attribute) and human-readable (in the title= attribute). behavior XLink includes some tools for controlling the behavior of a link when it's being followed. In HTML, links really have only one behavior; you are at one page, you follow a link, then you're somewhere else. Since I had to use the Web as it is today to deliver the annotations, I wasn't able to get fancy with behaviors. The x Element In the Annotated Spec, I used an element named x as the chief linking element. If you were to dress it up with all the necessary attributes, one of these elements would look like this: <x xml:link="extended" inline="true" content-role="commentary" content-title="Annotation" > ... contents of the linking element go here ... </x> If every one of the 312 annotations had to carry around all those attributes, the Annotated Spec would be hard to write and to work with. Fortunately, XML has attribute defaulting, so I could provide all these attributes just once, in the document header: <!DOCTYPE Annotations [ <!ELEMENT x (here|spec)+> <!ATTLIST x xml:link CDATA #FIXED "extended" inline CDATA #FIXED "true" content-role CDATA #FIXED "commentary" content-title CDATA #FIXED "Annotation" id ID #REQUIRED> ]> <Annotations> <x id="first-link-id"> ... content of first link ... </x> <x id="second-link-id"> ... content of second link ... </x> ... </Annotations> The real role of the x element is to hold one here element and a bunch of spec elements. The here element actually contains the text of the annotation, while each of the spec elements points at a location in the XML spec where the annotation applies. Most of the annotations apply to only one location, but there are a few that attach to many places. For example, there is an annotation saying that DTD keywords (such as DOCTYPE , SYSTEM, ELEMENT, and ATTLIST) must be in upper-case; this annotation is attached to the first definition of each of these keywords. The here Element This element is what XLink calls a "Locator" - it serves as one end of the extended link, and contains the annotation. It has a lot of attributes, most of which are defaulted and don't actually appear in the body of the document. Here's the declaration, with all those attributes: <!ELEMENT here ANY> <!ATTLIST here xml:link CDATA #FIXED "locator" actuate CDATA #FIXED "auto" show CDATA #FIXED "replace" role CDATA #FIXED "annotation" title CDATA #REQUIRED href CDATA #FIXED "here()" index CDATA #IMPLIED"> Here's what all those attributes mean: xml:link="locator" tells the processing program that this is a locator actuate="auto" specifies that the link should be processed as soon as it's found - this differs from the http://www.xml.com/xml/pub/98/09/exexegesis-1.html (2 von 4) [07.07.1999 14:53:19] XML.COM - How the Annotated XML Specification Works Web browser behavior of just displaying the link (as underlined blue text) then waiting for the user activate it.