Copyright

• Copyright © 2003 John Cowan MOVING TOWARD • Licensed under the GNU General Public License • ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK XHTML 2 • Black and white for readability • The Gentium font available at http://www.sil.org/~gaultney/gentium John Cowan Associated Press

1 2

Roadmap Roadmap

• Introduction (10 slides) • The Common Attributes (19 slides) • Goals and Principles (8 slides) • XML Events (11 slides) • New and Nifty (8 slides) • XForms (19 slides) • What's History (7 slides) • To Be Continued... (1 slide) • The Elements (27 slides)

3 4 Versions of HTML

• HTML 1.0 – simple hypertext INTRODUCTION • HTML 2.0 – images and forms • HTML 3.2 – the union of the browsers • HTML 4.01 – the kitchen sink • XHTML 1.0 – it's XML time • XHTML 1.1 – modularization • XHTML 2 – simple and powerful hypertext

5 6

What is XHTML 2? XHTML 2 Abstract

• An evolving standard currently being XHTML 2 is a general purpose markup worked on by the HTML Working Group of language designed for representing the W3C. documents for a wide range of purposes • The first major change to the semantics of across the World Wide Web. To this end it HTML since HTML 4.0. does not attempt to be all things to all people, supplying every possible markup idiom, but to – Obsolete syntax has been discarded supply a generally useful set of elements. – Separation of content and presentation essentially complete – Genuinely new features being introduced 7 8 Why design XHTML 2? Why plan for XHTML 2?

• More usable • Predicting the degree of market uptake not • More accessible to different kinds of users yet possible and display devices • Support for legacy HTML will probably be • Easier to write by hand or with tools provided forever for practical reasons • Less reliant on embedded scripting • Avoiding most deprecated and obsolete languages for commonly used functionality features is already possible • The planner's rule of thumb: “Plan or be planned for” 9 10

Making early adoption easy Who supports XHTML 2?

• Much HTML is already generated using • Too soon for formal commitments XSLT from one of many XML formats • Server-side support is straightforward • XHTML 2 can be generated just as easily • Cross-platform open-source browsers will • Early-adopting clients can get a better almost certainly support XHTML 2 fairly browsing experience quickly and easily • The WG contains several major browser companies

11 12 Constructing document types Everything is subject to change

• XHTML 2 has a fully modular definition • This presentation is based on the July 2006 • Defining subsets for use within other public working draft of XHTML 2 document types will be easy, for: • Disclaimer: not every slide is 100% updated – rich textual annotations to the very latest draft – embedded presentation-ready material • There are still open issues, which are rather poorly documented at present

13 14

Working Group(s)

• XHTML 2 has been developed by the W3C's HTML WG GOALS AND PRINCIPLES • I am not on the WG, so I have limited inside information • XHTML 2 is being handed off to a new WG soon, according to the latest reports

15 16 Goals: ease of use Goals: universal access

• Generic XML when possible • Better internationalization for the World • Less presentation, more structure Wide Web • Make the language easy to write • Device independence: – Author once, not once per device • Make the resulting documents easy to use – Render differently on different devices • Be as inclusive as possible (“design for our future selves”)

17 18

Goal: reduced use of scripting Breaking backward compatibility

• Writing scripts is programming, not • Earlier versions of HTML were special- document authoring purpose languages • Only high-powered user agents (lots of • Backward compatibility required so that memory and CPU speed) can process scripts new documents would still be usable in • XHTML 2 tries to make declarative, not older browsers procedural, pages straightforward • Modern browsers can render arbitrary XML with stylesheets

19 20 Breaking backward compatibility Modular design

• Much of XHTML 2 already works in existing • XHTML 2 is divided into modules which browsers together define the elements, attributes, • XML Events and XForms still require user and content models of the language agents that understand that functionality • Modules are meant to be reusable • Clever use of Javascript under the table can independently help here • If your XML design needs a bit of rich text, recycle some XHTML 2!

21 22

Namespaces Schema language independence

• XHTML 2 introduces a new namespace • The working drafts define XHTML 2 syntax name: http://www.w3.org/2002/06/xhtml2/ normatively in RELAX NG (www.relaxng.org) . • Ruby, XForms, and XML Events have their • DTD and W3C XML Schema normative own namespace names definitions will be added in later drafts

23 24 Ubiquitous hyperlinks

• Any element can be the target of a NEW AND NIFTY hyperlink (through the id attribute) • Any element can be the source of a hyperlink (through the href attribute) • The a element is kept for backward compatibility

25 26

Hyperlinking example Generalized transclusion

• Any element can have a src attribute

Security providing external content to be rendered Considerations

instead of the element ... • This means rich text can be embedded as

This system well as images and objects is just inherently insecure. • The img element is likewise kept for Deal with it.

backward compatibility

27 28 Transclusion example How to do images

becomes

Hmm, transclusion isn't working. text Sorry. Click here and then use the Back text button.

29 30

External modules XForms

• Ruby provides support for Japanese-style • Widgets can appear anywhere in a XHTML annotation document • XML Events is a greatly extended events • Arbitrary XML is embedded in the module based on DOM Level 2 Events document as a template • XForms is a completely new forms language • Widgets are bound using XPaths to • Other languages (MathML, SVG) aren't part appropriate parts of the template of XHTML 2 but can be used with it • The filled-out template is submitted

31 32 Streamlined language

• HTML 4.0: 91 elements, 188 attributes (counting shared attributes only once) WHAT'S HISTORY • XHTML 2: 59 elements (plus 1 for XEvents and 29 for XForms), 59 attributes • Almost any attribute can be attached to almost any element • XHTML 2 is the true (refactored) second version of HTML 33 34

Presentational markup is It's XML, you know dead, dead, dead • Make sure all start-tags have end-tags and • Originally, HTML was designed to represent empty-tags end in /> the structure of a document, not its • Lower-case elements and attributes presentation • Quote attribute values • Presentation-oriented elements were later added to the language by browser • Wrap script and style content in CDATA manufacturers sections • XHTML 2 takes HTML back to its roots • HTML Tidy (tidy.sf.net) is your friend

35 36 Presentational markup is Presentational markup is dead, dead, dead dead, dead, dead • All presentational elements and attributes • Presentation handled solely by stylesheets are gone • Standardizing on stylesheets means • Even old warhorses like b, i, and br bite enhanced flexibility of presentation the dust • CSS can do more than presentational HTML • HTML Tidy is still your friend ever did • XSLT is also your friend • A standard stylesheet will be provided to give the default (traditional) presentation

37 38

The img and applet elements No more frames

• Superseded by the new and generalized • Everything to do with frames is gone object element, which is now used to • The target attribute of hyperlinks embed any non-XHTML content remains, but its use is not defined • XSLT is your friend once again

39 40 New forms and sublanguages • The new XForms language completely supersedes HTML Forms THE ELEMENTS • It's still possible for forms to return attribute-value pairs, but full XML documents are the recommended method • Event handling now uses XML Events

41 42

The Document Module The Structural Module

• Elements: , head , title , body • Elements: p, div , separator , address , • Attributes specific to the html element: blockquote , pre , blockcode , section , h, h1 -h6 – version replaces DOCTYPE declaration as an effective version signature • blockcode is a hybrid of blockquote – xsi:schemaLocation points to W3C and pre , intended to set off code XML Schemas for namespaces • separator replaces hr • p can now contain lists and tables, but not nested paragraphs

43 44 The section and h elements The Inline Text Module

• section elements • Elements: abbr , cite , code , dfn , em , – contain major divisions of text kbd , l, q, samp , span , strong , sub , – can be nested to convey subdivisions, sub- sup , var subdivisions, etc. • No presentation elements: not even b and i • The h element works like h1 within a main survive document, like h2 within a top-level • q is for quotations, but no longer tries to section, like h3 within a sub-section, etc. supply quotation marks automatically (the • h1 -h6 are no longer necessary rules depend on language and country)

45 46

What are those elements for, The l element anyhow? • Container for a single line (of verse, • abbr : an abbreviation ( full attribute is an computer code, etc.) IDREF to the full name) • Replaces HTML's br • cite : a citation or source reference

Take what you need, • code : inline computer code need what you take. • dfn : a definition

• kbd : keyboard input • samp : sample output • var : a variable 47 48 The Hypertext and Image The List Module Modules • Elements: a, img • Elements: dl , dd , dt , dl , ol , ul , li , nl , • Redundant thanks to ubiquitous hypertext label (same as span ) • Definition lists are as before • Being kept in XHTML 2 because they: • Ordered and unordered lists are as before – are familiar – the value attribute on li tells where to – have well-known semantics start numbering – have well-known default presentations – specialized numbering only in stylesheet)

49 50

The List Module Navigation lists

• The new label child element can be used • nl elements are initially displayed in to label any list (it must be the first child) collapsed form, with just the required label • The new di element groups term(s) with child shown definition(s) in a dl list, showing what • Clicking on the label causes the li children defines what to be popped up/pulled down • Nested navigation lists supported • No more scripting just to get decent menus!

51 52 Access Module The Metainformation Module

• access element specifies keyboard access • Elements: meta , link to elements in the document • RDF is a standard way of making statements • access must be in the head about a resource • key attribute specifies the single-character • RDF statements contain a subject , a predicate , key mapping and an object • targetid and targetrole specifies the • link and meta elements can only appear target element either by id or by matching in the head roles

53 54

The Metainformation Module The Object Module

• The meta element can contain plain text, • Elements: object, param, caption , possibly decorated with Text Module standby elements • Supersedes img and applet link • The element can only contain other • src specifies the object's source link and meta elements • The content of object is used if the object • Browsers are encouraged to render link can't be loaded (perhaps as part of a dropdown list of links) • Nested object elements provide for a • The metainformation attributes control the series of fallbacks semantics of both kinds of elements 55 56 Attributes of object The param element

• content-length is a hint about the size • Embedded in an object element, but not part of the object of the fallback • archive specifies resources (jar files, e.g.) • Passes a name-value pair to the object to be preloaded • Attributes: name , value , valuetype, • declare=”declare” means: type only; does not support Common – object will only be loaded Attributes – must be launched by a user action

57 58

The caption and standby The param element elements • valuetype specifies the kind of value: • caption specifies a caption; this is – data for a simple string borrowed from the Tables Module – ref for an external reference • standby provides textual content to be – object for the ID of an object element displayed while an object is being loaded elsewhere in the document • When valuetype is ref , type specifies the content type of the external resource

59 60 The Ruby module The Handler Module

• Supports annotations written above or next • Elements: handler to the base text • handler specifies a script or other event • A Japanese requirement to explain obscure handler kanji characters • Can appear in head or body as either a • The ruby element is added to Text Module Structural or a Text element • Child elements: rbc , rtc , rb , rt , rp • If src is not given, the content is a fallback • For details, see www.w3.org/TR/ruby handler if it is wrapped in a child handler element, or else the script itself 61 62

Attributes of handler The Style Sheet Module

• type specifies the and is • Element: style now required • Must be in the head element encoding • gives the character set of an • Used for embedded sheets only external script, since scripting languages typically aren't self-identifying • External stylesheets use the link element from the Linking Module • declare=”declare” means don't run the script until an event activates it • The processing instruction is also supported

63 64 External stylesheets Types of stylesheets

• Appropriate attributes on the link • Persistent: rel=”stylesheet” element specify different flavors of external • Preferred: rel=”stylesheet” stylesheet title=”...” • href specifies the location of the stylesheet • Alternate: rel=”alternate (if omitted, the content is the stylesheet) stylesheet” title=”...” • type specifies the stylesheet language • User agents should allow different stylesheets to be selected

65 66

The Style Attribute Module The Tables Module

• Attribute name: style • Elements: table , caption , summary , • Added to the Common Attributes Collection col , colgroup , thead , tfoot , tbody , tr , th , td • Specifies the style of a single element using an unspecified style language • Essentially the same structure as older versions of HTML • Strongly discouraged in favor of the Style Sheet Module • As always, presentation is done via stylesheets; width, border, rules, spacing, padding attributes are gone

67 68 Tables Module attributes

• col and colgroup support the span attribute (number of columns spanned) THE COMMON ATTRIBUTES • th and td continue to support the HTML attributes abbr , axis , headers , scope , rowspan , colspan (all fairly advanced features)

69 70

List of Attribute Modules Common Attribute Collection (including some attributes) • In XHTML 2, all elements support the same • Core: class , :id , title list of attributes! • Internationalization: xml:lang , dir • Exceptions: param , elements from external • Hypertext: href , cite modules • Embedding: src , type • html , abbr , li , style , handler , role access , th , td , col , and colgroup • Role: have their own special-purpose attributes • Editing: edit , datetime

71 72 Attribute Modules Core attributes (including some attributes) • Media: media • class used for stylesheet control and • Meta: about , content , property general-purpose text processing; multiple names are allowed • Image Map: usemap , ismap , shape • id (alternatively xml:id ) used for: event handler observer • Events: , , , – Hypertext anchors (replaces name ) target – Stylesheet control • Style: style

73 74

Core attributes Internationalization attributes

• title used for tooltips, multiple • xml:lang specifies the language of stylesheets content (supersedes HTML's lang ) • layout indicates whether whitespace is • dir specifies directional properties of text relevant or irrelevant to the (for use with Hebrew, Arabic, Syriac scripts) element's meaning; default is – values are ltr , rtl , lro , rlo irrelevant . – supersedes the HTML bdo element

75 76 Hypertext attributes Hypertext attributes

• href is unchanged, but any element can • nextfocus and prevfocus are IDREFs bear it to the element that will get the focus when • hreflang , hrefmedia , hreftype are this element has the focus and the user tabs the language, CSS2 media type, and MIME or shift-tabs away type of the referenced document • target specifies the destination of the • cite points to a source document that hyperlinked resource (values not defined) gives more information about the element • xml:base sets the base URI

77 78

Embedding attributes Role attribute

• XHTML 2 allows general-purpose • Elements can have roles, which are transclusion, including hypertext arbitrary QNames that specify the purpose • src specifies a resource to be embedded in of parts of a document the document in place of the element to • role specifies one or more QNames which it is attached • Standard values: main , secondary , • srctype and encoding give the media navigation , banner , contentinfo , type (and, if needed, the character definition , note , seealso , search encoding) of the resource

79 80 Editing attributes Media attribute

• Provide a simple changelog facility • Some elements should only be presented for • edit can specify inserted , deleted , particular media changed , or moved • media specifies a CSS2 media type such as • datetime specifies time of last change print or screen • edit=”deleted” defaults to invisible • If the current presentation medium matches the attribute, the element is presented; otherwise, it is ignored

81 82

Metainformation attributes Metainformation attributes

• Allows RDF statements to be made locally to • If a rel attribute is present, the RDF object arbitrary elements is specified by the href attribute • about specifies an RDF subject (if missing, • If a rev (reverse relationship) attribute is the first ancestor with an about or id present, the RDF subject is specified by attribute is the subject) href and the RDF object by about • property , rel , and rev specify the RDF • If a property attribute is present, the RDF property value (can be a QName like object is the content of the element dc:creator ) – datatype is the XML Schema datatype

83 of the RDF object 84 Standardized property values Standardized rel /rev values

• description : a basic description • alternate : alternate version • generator : identifies the software used to • start , next , prev , up : adjacent generate the document resources in the collection this resource • keywords : comma-separated list belongs to • title : specifies a title • contents , index , glossary , copyright appendix • robots : advice to web-crawlers , : a resource serving as a specialized part of the whole reference • : default property collection

85 86

Standardized rel /rev values More Common Attributes

• chapter , section , subsection : the • Image Map: usemap , ismap , shape , current part(s) of this collection coords are basically unchanged • help : resource offering help • Events (XML Events namespace): event , • bookmark : key entry point within a handler , observer , phase , collection propagate , target , defaultAction • meta : a resource that provides metadata • Style ( strongly discouraged ): style • icon : icon for this resource or collection

87 88 Repurposing Attributes

• HTML attributes attached to HTML elements are in no namespace XML EVENTS • However, if you want to attach an HTML attribute like href to a non-HTML element, put it in the XHTML 2.0 namespace using html:href • This rule applies both in XHTML 2.0 documents and other documents

89 90

Basic XML Events model Capturing and bubbling

• Every event has a target element (the one • Most events can be captured by an ancestor being clicked on, inserted, removed, gaining of the observer before the observer even or losing focus, etc.) sees the event • Any element can have one or more handlers • If the observer element has no handler for for event types (typically scripts) the event, it bubbles up the tree until some • Handlers are attached to observer elements ancestral element handles it • The observer must be either the same as the • Capturing and bubbling handlers can be target or an ancestor of it distinct, even on the same element

91 92 Stopping propagation The XML Events Module and canceling • A handler can stop the propagation of an • Uses a separate namespace: event during either capturing or bubbling http://www.w3.org/2001/xml-events • Events have associated default behaviors • Element: ev:listener • The default behavior can be canceled or • Attributes: event , observer , handler , allowed to happen (after all handlers are target , phase , propagate , processed) defaultAction , id

93 94

Attributes of listener Attributes of listener

• event is the name of the event • phase is capture for capturing-phase • handler is the URI of a script handlers, default for bubbling-phase • observer is the ID of the observer • propagate is stop to stop propagation continue • target is the ID of the target after this handler runs, to continue looking for handlers • If target is not specified, any descendant defaultAction cancel of the observer is allowed as a target • is to cancel the default action for this event or perform to do it

95 96 Putting attributes Putting attributes on the observer on the handler • Attributes other than observer can be • Attributes other than handler can be placed directly on the observer element placed directly on the handler element • Such attributes must be qualified with the • Such attributes must be qualified with the XML events namespace XML events namespace • If observer is omitted too, the observer is the parent of the handler

97 98

Standard event types Standard event types

• UI events: DOMFocusIn , DOMFocusOut , • Mutation events: DOMSubtreeModified , DOMActivate DOMNodeInserted , DOMNodeRemoved , • Mouse events: click , mousedown , DOMNodeInsertedIntoDocument , mouseup , mouseover , mousemove , DOMNodeRemovedFromDocument , mouseout DOMAttrModified , DOMCharacterDataModified • Key events: not yet defined

99 100 HTML-specific event types

• Events: load , unload , abort , error , select , change , reset , focus , blur , XFORMS resize , scroll • Some of these are redundant with previously specified events • Some are probably obsolete

101 102

Goals XForms models

• Controls and values are internationalized, • Models contain: accessible, and device-independent – Instance data • Submissions are type-validated XML – Bindings from UI controls to data • Validity is defined by XML Schema, – Submission instructions augmented by XForms-specific declarations – Optional W3C XML Schema reference used to validate the instance data • Reduced dependence on scripting • There can be any number of models in a document

103 104 Instance data Bindings

• A shell or skeleton of what is going to be • Associate part of the instance document returned to the server when the form is with a specific forms control submitted • XPaths specify the relevant part of the • Provides the XML structure and any fixed instance attributes or character data • The association is bidirectional: change • Can be loaded from an external resource either, the other changes too • Data already in the instance becomes the • Can be a separate element or an attribute on initial value of bound controls the control

105 106

Submissions Form controls

• Not to be confused with the submit control • Can occur anywhere in a document – form • Specifies the URI, method, and encoding is dead where the form is to be submitted • Represent abstract input devices, not • Specifies what to do with data returned by specific UI widgets the server: • Rendering is up to the browser – Replace the whole page • Can have labels, help, hints, and alerts – Replace just the instance data attached to them – Discard the returned data

107 108 Input controls Selection controls

• input specifies simple data; browsers • select1 allows the user to select a single should provide datatype-specific input item from a list methods • select is similar, but allows multiple • secret is similar, but conceals its value selections • textarea is similar, but supports multi- • Selections can be grouped (this replaces line text with no datatype optgroup ) • range provides a range of values to choose one from

109 110

Other controls Grouping of controls

• trigger signals a single event when • group sets off a group of controls activated (replaces button ) • switch is a container for sets of controls • submit provides a submit widget (enclosed in case elements) to be displayed • upload provides a file browser for alternatively, controlled by toggle elements uploading bulk content • repeat allows groups of items to be added • output displays the value of an XPath or deleted, allowing for shopping-cart-like expression, often referring to the instance behavior

111 112 Controlling control values XForms datatypes

• Constraints apply to a value and are • All simple datatypes except duration , extended to all controls bound to that value ENTITY , ENTITIES , NOTATION • Values can be: • Durations use yearMonthDuration and – Required dayTimeDuration , which are not in – Disabled or made read-only XML Schema but are in XQuery as well as – Constrained by an XPath expression XForms – Constrained by an XForms datatype – Calculated from other values

113 114

XForms datatypes XPath functions for XForms

• XForms-specific listItem and • Boolean: boolean-from-string , if listItems are like NMTOKEN and • Numeric: avg , min , max , count-non- NMTOKENS but without lexical restrictions empty , index • Derivation by restriction and by list are • Date/time: now , days-from-date , supported seconds-from-dateTime , seconds , months • Nodeset: instance

115 116 XForms events XForms events

• Too many event types to list here • The most important events: • General types: – -ready (initialization) – Standard events – xforms-focus (when a control gains focus) – Initialization – xforms-submit (before submission) – Interaction – xforms-value-changed (changed data) – Notification of changes – xforms-invalid (data is invalid) – Errors and exceptions • The sequence of events is carefully defined

117 118

XForms actions XForms actions

• Events can trigger built-in XForms actions • More actions: as well as scripts – Recalculate instance data • Some of the actions: – Revalidate instance data – Submit – Refresh form controls from instance data – Reset – Change focus – Display message box – Redispatch event to another control – Traverse a link • A single event can cause multiple actions to – Set an instance value occur

119 120 Some existing XForms implementations • X-Smiles browser (open source, written in Java) TO BE CONTINUED... • FormsPlayer (a plugin for IE6 SP1) • Novell's XForms (a testbench browser analogous to appletviewer ) • Mozquito DENG (an XML + CSS + XForms + SVG browser written in Flash MX)

121 122

Some open issues

• Currently anything can have image-map attributes, but it only works if src specifies MORE INFORMATION an included image • There's no document-type independent way to detect inline style attributes http://www.w3.org/TR/xhtml2/ http://www.ccil.org/~cowan/xhtml2{.pp • Character entities: staying or going? t,.sxi,.pdf} • Further review will probably create more issues 123 124