Using XML for Long-Term Preservation: Experiences from the DiVA Project


Title: Using XML for Long-term Preservation
Subtitle: Experiences from the DiVA Project
Authors: Müller, Eva; Klosa, Uwe; Hansson, Peter; Andersson, Stefan; Siira, Erik
Organization: Uppsala University Library, Electronic Publishing Centre
Address: Box 510, 75 120 Uppsala, Sweden
Url: http://publications.uu.se
Keyword: long-term preservation, XML, XML Schema, DiVA, DiVA Document Format, DiVA Archive, URN, URN:NBN

Abstract

One of the objectives of the DiVA project is to explore the possibility of using XML as a format for long-term preservation. For this reason, the practical use of XML in different parts of the system was evaluated before deciding on the design. The DiVA Document Format, defined by an XML schema, has been developed to describe the inter-relationships amongst the various data elements and processes, and to support long-term preservation of the actual documents.

XML Schema provides a means for defining the structure, content and semantics of XML documents. It is an XML-based alternative to the XML Document Type Definition (DTD). Because one of the primary reasons for using XML was to support long-term preservation, the most popular DTDs for documents, DocBook and TEI, were evaluated. Limitations regarding metadata descriptions were found in both of these DTDs, so the decision was made to develop a new structure for DiVA using XML Schema. This schema combines the DocBook schema (derived from the DocBook DTD) for the textual parts of the document with the internal schema for all metadata (bibliographic and administrative data).

Several applications were developed that use the DiVA Document Format for content management and inter-process communication.
Some of their purposes are essential for long-term preservation:

• Make persistent National Bibliographic Numbers (NBN) available to the URN resolution service [1] at the Royal Library in Stockholm.
• Send MARC21 records in MARC-XML to the National Library.
• Create archival file packages for long-term preservation, checksum them, store them in the DiVA Archive and send a copy of them to the Swedish Royal Library.

Currently the file archives for long-term preservation contain the original full-text file in various formats and the DiVA Document Format file, which contains all the metadata about the document. Furthermore, the DiVA Document Format file contains all parts of the full-text file that can be converted into XML. In the future it might be possible to transfer the whole full text into XML, in which case the file archives would contain only DiVA Document Format files.

[1] http://urn.kb.se/resolve

Table of Contents

1 XML as Long-term Preservation Format
1.1 XML Schema
1.2 Comparison of DocBook and TEI
1.3 DiVA Document Format
2 Long-term Preservation in the DiVA Project
2.1 Uniform Resource Name (URN) and National Bibliographic Number (NBN)
2.2 The DiVA Archive
3 Conclusions

Preface

DiVA - Digitala vetenskapliga arkivet (DiVA Archive) - is a comprehensive description of a searchable archive containing all documents that are published in electronic form at Uppsala University in Sweden. Other Swedish universities are also co-operating in the project within the DiVA framework. One part of this archive is the database containing theses published at Uppsala University from 1998 to date.

In September 2000 an Electronic Publishing Centre was established at Uppsala University Library. Its primary assignment was a project to create technical solutions and a well-functioning workflow for electronic posting and full-text publication of doctoral theses, essays, working papers and other types of scientific publications.
The first phase of the project was completed in 2002 and the result was the DiVA Publishing System, a system for electronic publishing of different types of publications.

One of the goals has been to create a long-term archive containing all digital documents published at Uppsala University. The assignment involves both technical and organisational issues. The development team faced many questions. How can the loss of data be avoided? What kind of descriptive and administrative metadata is useful for archiving? What is the appropriate metadata format for long-term preservation? How important is the layout of the objects and how is it to be handled? How can images and formulas be handled?

Because of those questions, XML was discussed early on as a format for storing descriptive and administrative metadata, as well as for the complete content of the documents. XML is a format that both humans and machines can easily restore and understand.

This paper describes the current status of the XML implementation in the DiVA Archive and the surrounding applications, and explains why XML is an important format for long-term preservation.

1 XML as Long-term Preservation Format

One of the objectives of the DiVA project is to explore the possibility of using XML as a format for long-term archiving. There are several advantages to using XML-encoded documents for long-term archiving. XML is an open and established notation. XML documents are in a human-readable text format, and internationalised character sets are supported. These characteristics facilitate data migration, so the documents are likely to have longevity. For these reasons XML seemed like a good choice, but to ensure success, the practical use of XML in different parts of the system was evaluated before a decision about the design was made.

In the DiVA project XML is not used only for archiving.
It is also used for communication between different processes within the system and for internal communication in the development team. It also helps to validate data with the help of an XML schema. The dynamic web interface is built on XML and XSLT.

1.1 XML Schema

XML Schema provides a means for defining the structure, content and semantics of XML documents. XML Schema is an XML-based alternative to the XML Document Type Definition (DTD). Because the primary reason for using XML was to support long-term archiving, the most popular DTDs and schemas for documents, namely DocBook and TEI, were evaluated. Limitations regarding the metadata descriptions needed in the DiVA project were found. Because of the need to combine administrative metadata, descriptive metadata and content, a new schema was developed that meets the needs of the DiVA project. This schema combines the DocBook schema (derived from the DocBook DTD) for the textual parts of the document with the bibliographic metadata and administrative metadata for long-term preservation.

XML Schema was chosen over XML DTD because it is written in XML and supports many data types, self-defined data types and different namespaces. The support for different data types offers several advantages. It is possible to describe permissible document content, to validate the correctness of data, to define restrictions on data (data facets), to define data formats (data patterns) and to convert between different data types. It is also easier to work with data coming from a database. During development, it was noticed that XML Schema facilitated communication between the developers by providing a simple mechanism for writing formal specifications of subsystem interfaces.

1.2 Comparison of DocBook and TEI

TEI [2] and DocBook [3] are two widely used recommendations for encoding textual material in electronic form.
These two recommendations were compared to find which is most appropriate and convenient to use when representing full-text documents in the DiVA Archive.

A logical unit, i.e. a combination of XML elements and/or XML attributes that have a certain well-defined meaning, can be expressed differently in TEI and DocBook. A logical unit that consists of only one well-defined element in DocBook is often composed of both a general element and an attribute in the TEI representation. Attribute values are not defined in the TEI recommendation and therefore have to be defined locally. It is therefore likely that others would not correctly interpret a TEI-encoded document without prior agreement.

Elements that define the structure of documents, e.g. headers, chapters, lists and tables, are more specifically defined in DocBook than in TEI. For publication of documents like PhD theses or scientific papers it is therefore more convenient to use DocBook, because the relevant structure elements are well defined. But if a text is to be marked up in detail both semantically and structurally, for example in order to create scholarly archives of diverse kinds of historical sources or for linguistic purposes, the more general TEI scheme would be a better choice. The main purpose in the DiVA project is to store the structure of the contents of the documents and not to store the semantics. Therefore DocBook was chosen to mark up the content.

[2] See: http://www.tei-c.org/
[3] See: http://www.docbook.org/

Element      TEI                               DocBook
Heading 1    <div1 type="chapter" n='1'>       <chapter id="1">
               <head n="1">Heading 1</head>      <title>Heading 1</title>
             </div1>                           </chapter>
Superscript  <hi rend="sup">text</hi>          <superscript>text</superscript>
Lists        <list type="…"></list>            <orderedlist numeration="…">…</orderedlist>

Table 1: Some elements in TEI and DocBook

1.3 DiVA Document Format

Version 1.0 of the DiVA Document Format, defined by an XML Schema, consists of 99 elements [4].
Administrative elements are combined with descriptive elements to make it possible to describe a publication in the same XML document file that contains its content. Many element names exist in both singular and plural form. The plural form is always used to name container elements. A container element contains one or more elements in its corresponding singular form. For example <creators> contains one or more <creator> elements, <titles> contains <title> elements and so on.
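
The container convention can be illustrated with a minimal sketch in Python. Only the element names <creators>, <creator>, <titles> and <title> come from the text above; the <document> root element, the sample values and the helper function singular_children are assumptions made for illustration and are not taken from the actual DiVA schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. The plural/singular element names follow the text;
# the <document> root and the sample values are assumed for illustration.
SAMPLE = """\
<document>
  <creators>
    <creator>Müller, Eva</creator>
    <creator>Klosa, Uwe</creator>
  </creators>
  <titles>
    <title>Using XML for Long-term Preservation</title>
  </titles>
</document>
"""


def singular_children(root, container_tag, item_tag):
    """Return the text of each singular element inside its plural container.

    An absent container yields an empty list rather than an error.
    """
    container = root.find(container_tag)
    if container is None:
        return []
    return [item.text for item in container.findall(item_tag)]


root = ET.fromstring(SAMPLE)
creators = singular_children(root, "creators", "creator")
titles = singular_children(root, "titles", "title")
print(creators)
print(titles)
```

Because every repeatable element sits inside a container named with its plural form, a consumer can locate all items of one kind with a single lookup, which is presumably part of what makes the format convenient for inter-process communication.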