User-Friendly Structured Document Editing Identifying and Removing Barriers for Acceptance by Authors

Fredrik Geers
Master thesis, Content and Knowledge Engineering
Utrecht University
25-07-2010
First supervisor: J.B. Voorbij
Second supervisor: L. Breure
INF/SCR-09-94

Acknowledgements

I would like to thank everyone who helped me in any way with this thesis. First of all, my university supervisors Hans Voorbij and Leen Breure. My Xopus colleagues Laurens van den Oever, Jeroen Pulles, Robbert Broersma, Sjoerd Visscher, Carl Giesberts and Suzanne Waalberg. Edo Plantinga and Johan Klevant-Groen for bringing me into contact with the right people. Everyone who participated and gave me insight into their practices: Marco Hoffman, Marco Veltien, Mirjam van Immerzeel, Jos Overbeeke, Gelland Veldman, Margareth Hop, Jasper Enklaar, Mirjam Bedaf, Edith van Gameren, Frits Boer, Eddy Pronk, Aadje van der Giessen, Marijke Hessing, Jolanda van der Salm, Jorik van Engeland, Julia Krasenberg, Joel van Beek, Welmoed Wagenaar, Charlotte Pauwe, Luella de Regt and Yke Koopmans. Thank you all! Without your help this research would not have been possible.

Abstract

Editing structured documents can be complicated for non-technical authors, even in current WYSIWYG structured document editors. This research investigates how authors work with conventional word processors and with structured document editors. The investigation results in a number of recommendations on how to improve these editors. The recommendations are verified by developing a prototype that implements a number of them and performing a usability test on that prototype.
It can be concluded that editors should be more flexible, allow presentation-oriented operations, treat elements with different semantics on varying levels in the structure as different elements, keep the document valid at all times, format every element distinctly, visualize the right level of structure, increase awareness of the fact that there is an explicitly defined structure, behave consistently with other editors and offer controls to modify structure in the text margin.

Contents

1 Introduction
    1.1 Introduction to structured documents
    1.2 Problem definition
2 Theoretical background
    2.1 Related work
    2.2 Structured versus conventional document editing
    2.3 Structured document models
    2.4 Acceptance of structured documents
3 Author analysis method
    3.1 Analysis objectives
    3.2 Authors without prior experience in using structured documents
    3.3 Authors with prior experience in using structured documents
    3.4 Author groups
        3.4.1 Freelance journalists
        3.4.2 Research staff
        3.4.3 Civil servants
        3.4.4 Web content editors
4 Interview results
    4.1 Authors without experience in structured document editing
        4.1.1 Process
        4.1.2 Headings
        4.1.3 Navigating
        4.1.4 Formatting
        4.1.5 Creating elements
        4.1.6 Hidden characters and codes
        4.1.7 Requirements
        4.1.8 Experience level
        4.1.9 On-object interface
        4.1.10 Mental models of structure
    4.2 Authors with experience in structured document editing
        4.2.1 Correcting existing documents
        4.2.2 Document-oriented operations
        4.2.3 New interface methods
        4.2.4 Visualization of document structure
        4.2.5 Other problems
    4.3 Implications for editing software
        4.3.1 Flexible document structure
        4.3.2 Allow presentation-oriented operations
        4.3.3 Treat elements with different semantics in a different way
        4.3.4 Keep the document valid
        4.3.5 Visualize differences in structure as differences in formatting
        4.3.6 Visualization of document structure: find a balance
        4.3.7 Increase awareness of document structure
        4.3.8 Conform where possible to behavior of industry standard
        4.3.9 Make use of controls near the text
    4.4 Editing operations
        4.4.1 Heading hierarchy
        4.4.2 Block-level elements
        4.4.3 Inline formatting
        4.4.4 Inline objects
        4.4.5 Tables
        4.4.6 Lists
        4.4.7 Metadata
5 User testing method
    5.1 Subjects
    5.2 Material
    5.3 Procedure
    5.4 Measures
6 Results
    6.1 Effectivity, efficiency and satisfaction
    6.2 Learnability
    6.3 Functions used
    6.4 Recommendations reviewed
        6.4.1 Flexible document structure
        6.4.2 Allow presentation-oriented operations
        6.4.3 Treat elements with different semantics in a different way
        6.4.4 Keep the document valid
        6.4.5 Visualize differences in structure as differences in formatting
        6.4.6 Visualization of document structure: find a balance
        6.4.7 Increase awareness of document structure
        6.4.8 Conform where possible to behavior of industry standard
        6.4.9 Make use of controls near the text
7 Conclusion
8 Discussion and future work
9 Appendix
    9.1 Interview questions
        9.1.1 Users without experience in using structured documents
        9.1.2 Users with experience in using structured documents
        9.1.3 Restrictions
        9.1.4 Computer literacy
        9.1.5 Questionnaire
    9.2 Mental models
    9.3 CVDR system
    9.4 Tasks
    9.5 SUS results
    9.6 Article used in experiment

Chapter 1 Introduction

This research aims to improve structured document editing for non-technical authors by giving a number of recommendations for structured document editors. Structured documents are currently used most often by technical writers, who work with structured content all the time and are familiar with the technology.
Projects in which other authors are required to create structured documents are often unsuccessful from a usability point of view, and the editing processes can be greatly improved. This chapter describes the concept of structured documents and gives an overview of the problems with the current state of the art.

The remainder of this report is divided into a number of chapters. First, chapter 2 discusses the related literature as theoretical background. Next, chapter 3 describes the method for a preliminary study of the working methods of authors (with and without structured documents), leading to the results of this study and recommendations for a structured document editor in chapter 4. Chapter 5 describes the method for confirming these results with user testing, followed by the results of those tests in chapter 6. Finally, the conclusions of this research are presented in chapter 7, and further remarks are made in chapter 8: discussion and future research.

1.1 Introduction to structured documents

When talking about structure in documents, not everyone is talking about the same kind of structure. Documents are structured in implicit ways that only become apparent by actually reading the content, and in more explicit ways by the formatting of the document, which shows structure in the form of chapters and sections, for instance. The structure discussed in this research is the latter type.

Structured documents can be defined as generally textual documents with a more or less fixed structure. Examples are product manuals or legal reports, but recipes are also frequently used as an example. All these types of documents consist of a variety of textual items with different semantics. In a conventional document, these pieces of text would be identified by their formatting: a big bold piece of text is the title, and if a smaller italic piece of text is found below it, there is a good chance that it is the name of the author.
In a structured document, the semantics of the pieces of text do not have to be guessed from the formatting; they are defined by the structure. All documents possess structure, but in conventional documents it is not explicitly defined.

In structured documents, the structure (or at least which elements can be used in which position in the hierarchy) is defined in a document model. A number of groups are developing general models, for example DITA (Day et al., 2005), DocBook (Walsh & Muellner, 1999) and TEI (TEI Consortium, 2007). Many organizations are also developing their own models; publishers such as Elsevier and Kluwer, for example, have developed their own extensive document models for use in their journals and books. The design of these models is expected to influence the usability of the authoring process. An overly complicated or over-specified schema can make it harder to write documents (Usdin, 2002). It can change the way people author documents in a positive or negative way, just as the interface of an editor changes the way people work with it. Initiatives like DITA in particular are meant to facilitate an entirely different style of writing. In this research I will focus on the influence of the editor interface.

The use of structured documents is not a new concept. GML, one of the first languages for describing structured documents, was developed at IBM in 1969. It eventually evolved into the ISO standard SGML. SGML is in turn the origin of HTML, the language most used on the Internet, and of XML, which is currently the most influential standard for structured document development.

Another structured document technique is LaTeX, which is frequently used for formatting academic papers and books. It is built on top of the formatting language TeX and gives documents semantic structure instead of formatting information. A detailed description of LaTeX can be found in Wonneberger (1990).
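
As a rough illustration of what a document model does, the sketch below encodes, for a hypothetical recipe document type, which elements may appear inside which parent element, and checks two small XML documents against that table. It is not the model or the editor studied in this thesis: the element names, the ALLOWED_CHILDREN table and the find_violations helper are assumptions made for this example. Real models such as DITA or DocBook are written in DTDs or schema languages and also constrain the order and number of elements, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical document model for a recipe (illustration only):
# for every element name, the set of element names allowed directly inside it.
ALLOWED_CHILDREN = {
    "recipe": {"title", "author", "ingredients", "steps"},
    "ingredients": {"ingredient"},
    "steps": {"step"},
    "title": set(),
    "author": set(),
    "ingredient": set(),
    "step": set(),
}


def find_violations(element, path=""):
    """Return a message for every element found in a position the model forbids."""
    here = f"{path}/{element.tag}"
    allowed = ALLOWED_CHILDREN.get(element.tag, set())
    violations = []
    for child in element:
        if child.tag not in allowed:
            violations.append(f"<{child.tag}> is not allowed inside {here}")
        # Keep descending so that nested problems are reported as well.
        violations.extend(find_violations(child, here))
    return violations


# The semantics of each piece of text come from its element, not its formatting.
VALID = (
    "<recipe>"
    "<title>Pancakes</title>"
    "<author>A. Cook</author>"
    "<ingredients><ingredient>Flour</ingredient></ingredients>"
    "<steps><step>Mix and bake.</step></steps>"
    "</recipe>"
)

# An ingredient placed among the steps: well-formed XML, but invalid for the model.
INVALID = (
    "<recipe>"
    "<title>Pancakes</title>"
    "<steps><ingredient>Flour</ingredient></steps>"
    "</recipe>"
)

valid_violations = find_violations(ET.fromstring(VALID))
invalid_violations = find_violations(ET.fromstring(INVALID))

print(valid_violations)
print(invalid_violations)
```

The second document shows the distinction between a well-formed document and a valid one: the parser accepts it, but the model rejects the position of the ingredient element. An editor that keeps the document valid at all times would have to prevent or repair such a placement while the author is typing.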
