User-Friendly Structured Document Editing: Identifying and Removing Barriers for Acceptance by Authors

Fredrik Geers
Master thesis, Content and Knowledge Engineering
Utrecht University
25-07-2010
First supervisor: J.B. Voorbij
Second supervisor: L. Breure
INF/SCR-09-94

Acknowledgements

I would like to thank everyone who helped me in some way with this thesis. First of all my university supervisors Hans Voorbij and Leen Breure. My Xopus colleagues Laurens van den Oever, Jeroen Pulles, Robbert Broersma, Sjoerd Visscher, Carl Giesberts and Suzanne Waalberg. Edo Plantinga and Johan Klevant-Groen for bringing me into contact with the right people. Everyone who participated and gave me insight into their practices: Marco Hoffman, Marco Veltien, Mirjam van Immerzeel, Jos Overbeeke, Gelland Veldman, Margareth Hop, Jasper Enklaar, Mirjam Bedaf, Edith van Gameren, Frits Boer, Eddy Pronk, Aadje van der Giessen, Marijke Hessing, Jolanda van der Salm, Jorik van Engeland, Julia Krasenberg, Joel van Beek, Welmoed Wagenaar, Charlotte Pauwe, Luella de Regt and Yke Koopmans. Thank you all! Without your help this research would not have been possible.

Abstract

Editing structured documents can be complicated for non-technical authors, even in current WYSIWYG structured document editors. In this research, the way authors work with conventional word processors and with structured document editors is investigated. This results in a number of recommendations on how to improve these editors. These recommendations are verified by developing a prototype which implements a number of these recommendations and performing a usability test on that prototype.
It can be concluded that editors should be more flexible, allow presentation-oriented operations, treat elements with different semantics on varying levels in the structure as different elements, keep the document valid at all times, format every element distinctly, visualize the right level of structure, increase awareness of the fact that there is an explicitly defined structure, behave consistently with other editors and offer controls to modify structure in the text margin.

Contents

1 Introduction
1.1 Introduction to structured documents
1.2 Problem definition
2 Theoretical background
2.1 Related work
2.2 Structured versus conventional document editing
2.3 Structured document models
2.4 Acceptance of structured documents
3 Author analysis method
3.1 Analysis objectives
3.2 Authors without prior experience in using structured documents
3.3 Authors with prior experience in using structured documents
3.4 Author groups
3.4.1 Freelance journalists
3.4.2 Research staff
3.4.3 Civil servants
3.4.4 Web content editors
4 Interview results
4.1 Authors without experience in structured document editing
4.1.1 Process
4.1.2 Headings
4.1.3 Navigating
4.1.4 Formatting
4.1.5 Creating elements
4.1.6 Hidden characters and codes
4.1.7 Requirements
4.1.8 Experience level
4.1.9 On-object interface
4.1.10 Mental models of structure
4.2 Authors with experience in structured document editing
4.2.1 Correcting existing documents
4.2.2 Document-oriented operations
4.2.3 New interface methods
4.2.4 Visualization of document structure
4.2.5 Other problems
4.3 Implications for editing software
4.3.1 Flexible document structure
4.3.2 Allow presentation-oriented operations
4.3.3 Treat elements with different semantics in a different way
4.3.4 Keep the document valid
4.3.5 Visualize differences in structure as differences in formatting
4.3.6 Visualization of document structure: find a balance
4.3.7 Increase awareness of document structure
4.3.8 Conform where possible to behavior of industry standard
4.3.9 Make use of controls near the text
4.4 Editing operations
4.4.1 Heading hierarchy
4.4.2 Block-level elements
4.4.3 Inline formatting
4.4.4 Inline objects
4.4.5 Tables
4.4.6 Lists
4.4.7 Metadata
5 User testing method
5.1 Subjects
5.2 Material
5.3 Procedure
5.4 Measures
6 Results
6.1 Effectivity, efficiency and satisfaction
6.2 Learnability
6.3 Functions used
6.4 Recommendations reviewed
6.4.1 Flexible document structure
6.4.2 Allow presentation-oriented operations
6.4.3 Treat elements with different semantics in a different way
6.4.4 Keep the document valid
6.4.5 Visualize differences in structure as differences in formatting
6.4.6 Visualization of document structure: find a balance
6.4.7 Increase awareness of document structure
6.4.8 Conform where possible to behavior of industry standard
6.4.9 Make use of controls near the text
7 Conclusion
8 Discussion and future work
9 Appendix
9.1 Interview questions
9.1.1 Users without experience in using structured documents
9.1.2 Users with experience in using structured documents
9.1.3 Restrictions
9.1.4 Computer literacy
9.1.5 Questionnaire
9.2 Mental models
9.3 CVDR system
9.4 Tasks
9.5 SUS results
9.6 Article used in experiment

Chapter 1 Introduction

This research aims to improve structured document editing for non-technical authors by giving a number of recommendations for structured document editors. Structured documents are currently used most often by technical writers, who work with structured content all the time and are familiar with the technology.
Projects that require other authors to create structured documents are often unsuccessful from a usability point of view, and their editing processes can be greatly improved. In this chapter, the concept of structured documents is described, together with an overview of the problems with the current state of the art.

The remainder of this report is divided into a number of chapters. First, in chapter 2, the related literature is discussed in the theoretical background. Next, in chapter 3, the method for a preliminary study of the working methods of authors (with and without structured documents) is described, leading to the results of this study and recommendations for a structured document editor in chapter 4. Chapter 5 describes the method for confirming these results with user testing, followed by the results of those tests in chapter 6. Finally, the conclusions of this research are given in chapter 7, and remarks are placed in chapter 8: discussion and future research.

1.1 Introduction to structured documents

When talking about structure in documents, not everyone is talking about the same kind of structure. Documents are structured in implicit ways that only become apparent by actually reading the content, and in more explicit ways by the formatting of the document, which shows structure in the form of chapters and sections, for instance. The structure discussed in this research is the latter type of structure.

Structured documents can be defined as generally textual documents with a more or less fixed structure. Examples are product manuals or legal reports, but recipes are also frequently used as an example. All these types of documents consist of a variety of textual items with different semantics. In a conventional document, these pieces of text would be identified by their formatting; a big bold piece of text is the title, and if a smaller italic piece of text is found below it, there is a good chance that it is the name of the author.
In a structured document, the semantics of the pieces of text do not have to be guessed based on the formatting; they are defined by the structure. All documents possess structure, but in conventional documents this is not explicitly defined.

In these structured documents, the structure (or at least which elements can be used in which position in the hierarchy) is defined in a document model. There are a number of groups developing general models, for example DITA (Day et al., 2005), DocBook (Walsh & Muellner, 1999) and TEI (TEI Consortium, 2007). Many organizations are also developing their own models; for example, publishers such as Elsevier or Kluwer have developed their own extensive document models for use in their journals and books. The design of these models is expected to influence the usability of the authoring process. An overly complicated or over-specified schema can make it harder to write documents (Usdin, 2002). It can change the way people author documents in a positive or negative way, just as the interface of an editor changes the way people work with it. Initiatives like DITA in particular are meant to facilitate an entirely different style of writing. In this research I will focus on the influence of the editor interface.

The usage of structured documents is not a new concept. GML, one of the first languages for describing structured documents, was developed at IBM in 1969. This eventually evolved into the ISO standard SGML. SGML is in turn the origin of HTML, the language most used on the Internet, and XML, which is currently the most influential standard for structured document development. Another structured document technique is LaTeX, which is frequently used for formatting academic papers and books. It is built on top of the formatting language TeX, giving documents semantic structure instead of formatting information. A detailed description of LaTeX can be found in Wonneberger (1990).
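As an illustration of the point made earlier in this section, that a document model defines which elements may be used at which position in the hierarchy, the following is a minimal sketch in Python. It is not taken from the thesis: the element names (article, title, author, section, para, emphasis), the DOCUMENT_MODEL table and the find_violations helper are all hypothetical, and real models such as DITA, DocBook or TEI are far richer than a table of allowed children.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document model: for each element name, the set of
# child elements it may contain. This only captures the idea of
# "which element is allowed where"; it ignores order and cardinality.
DOCUMENT_MODEL = {
    "article": {"title", "author", "section"},
    "section": {"title", "para", "section"},
    "title": set(),
    "author": set(),
    "para": {"emphasis"},
    "emphasis": set(),
}

# A small structured document. The title and author are identified by
# their element names, not by being bold or italic.
SAMPLE = """
<article>
  <title>Editing structured documents</title>
  <author>A. Writer</author>
  <section>
    <title>Background</title>
    <para>Semantics come from <emphasis>structure</emphasis>.</para>
  </section>
</article>
"""


def find_violations(element, model=DOCUMENT_MODEL, path=""):
    """Return messages for every child element the model does not allow."""
    here = f"{path}/{element.tag}"
    if element.tag not in model:
        return [f"{here}: unknown element"]
    problems = []
    for child in element:
        if child.tag not in model[element.tag]:
            problems.append(f"{here}: <{child.tag}> is not allowed here")
        else:
            # Allowed at this position, so check its own children too.
            problems.extend(find_violations(child, model, here))
    return problems


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE.strip())
    # The title is looked up by name instead of guessed from formatting.
    print(root.findtext("title"))
    # An empty list means the sample conforms to the toy model.
    print(find_violations(root))
```

With this toy model, a paragraph placed directly under the article element would be reported as a violation, because the model only allows paragraphs inside sections. An editor that keeps the document valid at all times would have to prevent or repair such an insertion.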