XML Tutorial Description


Introduction to XML
Bebo White
[email protected]
InterLab 2006, FermiLab, October 2006

Tutorial Description
With your HTML knowledge, you have a solid foundation for working with markup languages. However, unlike HTML, XML is more flexible, allowing for custom tag creation. This course introduces the fundamentals of XML and its related technologies so that you can create your own markup language.

Topics*
• XML well-formed documents
• Validation concepts
• DTD syntax and constructs
• W3C Schema syntax and constructs
• XSL(T) syntax and processing
• XPath addressing language
• Development and design considerations
• XML processing model
• XML development and processing tools
* Tutorial plus references

What Is Markup?
• Information added to a text to make its structure comprehensible
• Pre-computer markup (punctuational and presentational)
  • Word divisions
  • Punctuation
  • Copy-editor and typesetters marks
  • Formatting conventions

Computer Markup (1/3)
• Any kind of codes added to a document
• Typesetting (presentational markup)
  • Macros embedded in ASCII
  • Commands to define the layout
  • MS Word, TeX, RTF, Scribe, Script, nroff, etc.
  • *Hello* → Hello
  • /Hello/ → Hello
• Declarative markup
  • HTML (sometimes)
  • XML

Computer Markup (2/3)
• Declarative markup (cont)
  • Names and structure
  • Framework for indirection
  • Finer level of detail (most human-legible signals are overloaded)
  • Independent of presentation (abstract)
  • Often called “semantic”

Computer Markup (3/3)
• Semantic Markup
  • Authors put annotations into their texts to help the publisher to understand what type of text this is (e.g. “this is a heading”)
  • Annotations are agreed between author and publisher
  • Publisher decides on the layout
• Descriptive markup
  • Describing content not the layout
• Markup to support search in documents
  • Words in headings are more important than in footnotes
• Markup for machines vs. markup for humans

Markup – ISO-Definitions
• Markup – Text that is added to the data of a document in order to convey information about it
• Descriptive Markup – Markup that describes the structure and other attributes of a document in a non-system-specific way, independently of any processing that may be performed on it
• Processing Instruction (PI) – Markup consisting of system-specific data that controls how a document is to be processed

Markup Language Features
• Stylistic (appearance)
  • <I><B><U>
• Structural (layout)
  • <P><BR><H2>
• Semantic (meaning)
  • <TITLE>
  • <META NAME=keywords CONTENT = " …... " >
• Functional (action)
  • <BLINK>
  • <A HREF = "[link]">Click here</A>

Hypertext Markup Language (HTML)

Hypertext Markup Language
• HTML – The Markup Language used to represent Web pages for viewing by people
  • Rendered and viewed in a Web Browser
• Documents
  • Easy to write – Markup your data with tags
  • Platform independent
  • Can contain links to Images, documents, and other pages
• HTML is an application/instance of SGML (Standard Generalized Markup Language, ISO 8879:1986 – used for defining Markup Languages)
• For further information: http://www.w3.org/MarkUp/

Some Problems (1/2)
• Not extensible

Some Problems (2/2)
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
    <html><head>
    <title>The Some Problems Example</title>
    </head><body>
    <H1>Separation Of Concerns</h1>
    There are a lot of problems using HTML for
    <WebEngineering>Web Application development</WebEngineering>,
    if you do not separate concerns.
    <P>
    The <b>Bold</b> and <i>Italic</i> example: <br>
    While rendering <b> is easy nowadays.<i> The semantic of this markup </b> is not </i> clear.
    </BODY></HTML>
• REMEMBER: Do not develop Applications in this manner!

Observations on HTML
• Powerful for Presentation (Focus on Client-Side)
  • Cascading Style Sheets (CSS)
  • Allows for dynamic behavior using scripting/DHTML
  • Allows for proprietary extension (ActiveX, plug-ins, etc.)
• Easy to write and generate, but:
  • Difficult to parse
  • No support for extending semantics, e.g. using your own tags
  • Difficult to apply disciplined approaches

eXtensible Markup Language (XML)

XML (1/2)
• The eXtensible Markup Language
• XML is a universal format for structured documents and data on the Web
• XML is a standard, interoperable way to describe data for flexible processing
  • Multi-format delivery
  • Schema-aware information retrieval
  • Transformation and dynamic data customization
  • Archival: standardized, self-describing

XML (2/2)
• http://www.w3.org/XML/
• XML looks like markup (e.g., HTML) but in this context the interpretation of data is the job of the application
• XML tags/elements/attributes are not predefined
• XML uses a Document Type Definition (DTD) or an XML Schema to describe data
• XML with a DTD or XML Schema is designed to be self-descriptive

XML History
• 1996 Development started
• 1997 Public Drafts
  • E.g. Provided in paper form at WWW6, Santa Clara, CA
• February, 1998 W3C REC
• Based on experience: simplified form of SGML
• XML derived from SGML – both are used for defining Markup Languages
• XML = 80% of SGML's capabilities, 20% of SGML's complexity

The W3C Standards* Process
• World Wide Web Consortium (W3C)
• Development is organized into WGs.
  • Working Group (~10): set agenda / decide
  • Special Interest Group (~100): discuss / recommend
  • W3C members (~500): vote
  • W3C Director (TimBL): may veto
  • The public: comment on public WDs; adopt / reject

XML Facts
• Important for Web development because it removes two constraints:
  • Dependence on a single, inflexible Document Type (HTML);
  • The complexity of full SGML, whose syntax allows many powerful but hard-to-program options
• XML was not designed to do anything
• XML is free and extensible
• XML complements (not replaces) HTML

XML and HTML
• XML was designed to “carry” data
• Two different goals:
  • XML – describe data and focus on what it is
  • HTML – display data and focus on how it looks

XML Characteristics
• Well-Formed – An XML document is well-formed if it complies with the following rules:
  • Elements have an open and close tag: <tag>content</tag>
  • Empty elements are closed by “/”, e.g. <emptyelem/>
  • Attribute values are quoted
• Valid – An XML document is valid if it is well-formed and its content conforms to the rules in its document type definition or schema
• Validity allows an application to make sure the XML data is complete, is formatted properly, and has appropriate attribute values.

The Two Worlds of XML
• Markup of documents: the original
  • This perspective is our focus here
  • Document representation was the primary problem XML was created to solve
• Data exchange and protocol design
  • XML turned out to fill important gaps
  • Relational databases needed a way to share records and multi-table data
  • Protocol designers wanted a way to encapsulate structured data

The Two Worlds United
• Documents and “semi-structured” data share features
  • Hierarchical structure
  • String content
  • Variations in structure
• Their applications also share needs
  • Need for a lingua franca, independent of APIs
  • Ability to cope with international characters
  • “Fit” with WWW and HTTP

XML is More General
• Tags label arbitrary information units
  • More suited to multiple purposes
  • “Looking right” is needed but not enough
• Supports custom information structures
  • If you have “price” or “procedure”, you can make a tag for it, and validate its usage
  • Can support many different information models
  • E.g., molecular models, vector graphics, etc.
• More “teeth” to enforce consistent syntax
  • Works hard to avoid semi-interoperable docs
• XHTML is XML-conforming flavor of HTML
  • Clean existing HTML is already close...

Better Rendering than HTML
• Fully internationalized
  • Also better for visually-impaired users
• Supports multiple renderings
  • Customize to the user, time, situation, device
• Separates formatting from structure
  • And processing other than rendering
• Large documents don’t break it
  • Easy to trade off server/client work
  • Artificial “next tiny bit” links no longer necessary
  • No searches that fail because big doc was split

XML Treats Documents like Databases
• XML brings benefits of DBs to documents
  • Schema to model information directly
  • Formal validation, locking, versioning, rollback...
• But
  • Not all traditional database concepts map cleanly, because documents are fundamentally different in some ways

XML Example
• A way of representing information
• XML documents (application of XML) are composed of elements and attributes
    <?xml version="1.0" encoding="ISO-8859-1" ?>
    <order OrderID="10643">
      <item>
        <room id="Room10"/>
      </item>
      <item>
        <room id="Room11"/>
      </item>
      <OrderDate ts="2005-10-17T00:00:00"/>
      <price>200.00 dollars</price>
    </order>
(Tree diagram: order contains item/room, item/room, OrderDate, and price with the text "200.00 dollars".)

What is Structure
• To Relational Database theorists, structure is:
  • Tables with fixed sets of non-repeating named fields that have little internal structure
  • E-R diagrams with fixed number of nodes
• Structured documents are different:
  • The order of SECs, Ps, etc. matters (a lot)
  • Many hierarchical layers (which text crosses)
  • Text/graphic data mixes with aggregate objects
  • Optional or repeatable sub-parts abound
  • Interaction with natural language phenomena
• These are very different

When Structure is Essential
• Large scale data
• Data with individual parts you care about
  • (like price-tag, tool-list, citation, author,...)
• Need for good navigation tools
• Mission-critical information
• Information that must last
• Multi-author publishing process
• Multiple delivery media
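
The well-formedness rules and the order document from the slides can be tried out with a short script. What follows is only a sketch using Python's standard-library xml.etree.ElementTree; the helper names is_well_formed and room_ids are illustrative and do not come from the tutorial, and the XML declaration is omitted from the sample for brevity. ElementTree is a non-validating parser, so this checks well-formedness only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, overlapping tags, unquoted attributes, etc.
        return False


# The order document from the "XML Example" slide (declaration omitted).
ORDER_XML = """
<order OrderID="10643">
  <item><room id="Room10"/></item>
  <item><room id="Room11"/></item>
  <OrderDate ts="2005-10-17T00:00:00"/>
  <price>200.00 dollars</price>
</order>
""".strip()


def room_ids(order_xml):
    """Collect the id attribute of every <room> element, in document order."""
    root = ET.fromstring(order_xml)
    return [room.get("id") for room in root.iter("room")]


if __name__ == "__main__":
    print(is_well_formed(ORDER_XML))
    # Overlapping tags, as in the "Some Problems" HTML example, are rejected.
    print(is_well_formed("<p><b>bold <i>both</b> italic</i></p>"))
    print(room_ids(ORDER_XML))
```

The overlapping <b> and <i> tags that a browser tolerates in the HTML example are exactly what an XML parser refuses, which is the "teeth" the slides refer to.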
