Adaptation of Mathematical Documents
Christine Müller

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science

Date of Defense: 5 May 2010

Jacobs University Bremen
School of Engineering and Science

Dissertation Committee
Prof. Michael Kohlhase, Jacobs University Bremen, Germany
Prof. Adalbert Wilhelm, Jacobs University Bremen, Germany
Prof. Cristian Calude, University of Auckland, New Zealand

To my parents.

I hereby declare that this thesis has been written independently except where sources and collaborations are acknowledged and has not been submitted at another university for the conferral of a degree. Parts of this thesis are based on or closely related to previously published material or material that is prepared for publication at the time of this writing. These are [MK09a, CM09, KLM+09, MK09b, Mül09a, MK08a, MK08b, Mül08b, Mül08a, MK08c, KMR08, WM07, MK07, KMM07a, KMM07b]. Parts of this work are the result of collaborations with other researchers: for Part II with Michael Kohlhase, Christoph Lange, Normen Müller, and Florian Rabe and for Part III with Michael Kohlhase, Achim Mahnke, and Normen Müller. In each case the precise connection to this work is detailed in the relevant passages of the text.

Bremen, 10 May 2010, Christine Müller

Abstract

Modern web technologies have empowered users to create and share documents across the world. Today, users are confronted with an immense number of documents, including documents in a more traditional understanding such as publications, manuals, and textbooks as well as documents in a wider interpretation such as forum postings, ratings, and tags. Confronted with this immense amount of information, users struggle to find appropriate documents and to acquire the essential knowledge conveyed in these documents.
Essentially, the web’s usability depends on whether or not it can respond to the individual user's preferences and needs (called the user context) and personalise web documents accordingly.

Personalising documents is widely addressed by applications in industry and research. For example, eLearning systems address the personalisation of online documents, in particular the presentation and content planning of documents. They offer the most advanced adaptation services and are thoroughly discussed in this thesis. Unfortunately, all of these systems apply a topic-oriented approach: They can only handle self-contained information units (called topics or learning objects) that omit transitional phrases and cross-references. The resulting topic-oriented documents lack the coherence and guidance that traditional narrative writings provide. However, these narrative documents cannot be easily modularised into reusable document parts that can be arbitrarily combined and arranged in a document.

This work applies the topic-oriented principles of modularisation and reuse to the narrative authoring paradigm. Narrative documents are modularised into infoms, for which transitional phrases and cross-references are marked. Infoms and their semantic/narrative dependencies as well as variant relations are modelled as graphs. These graphs are processed during the content planning of narrative documents, in which appropriate infoms are selected and arranged according to the user’s context. Since the narrative transitions are visualised by words and phrases, they can reduce the adaptability of infoms. To improve the exchangeability of infoms, infoms are thus enriched with alternative transitions and cross-references. With these enrichments, appropriate narrative transitions can be selected according to the combination and arrangement of document parts into user-specific documents.
To illustrate the proposed adaptation services for narrative documents, mathematical documents are used. This decision required the author to take an essential aspect of mathematical text into account: Mathematics is a mixture of natural language text, symbols, and formulae. Symbols and formulae can be presented with different notations. These notations can complicate communication and acquisition processes since notations are context-dependent and can vary considerably among different communities and individuals. These variations can cause ambiguities and misunderstandings.

In this situation, the author proposes a comprehensive framework that allows users to configure notations as well as to guide the content planning of mathematical documents regarding a semantic, narrative, and user context. To prioritise these contexts and to guide the adaptation, a combination of extensional and intensional options is provided. These give users full control over any step of the adaptation workflow: The specification of which document parts should be adapted and which should remain unchanged, the collection of adaptation objects (notation preferences, infoms, and context parameters), and the user-specific selection of the most appropriate objects to be applied in the rendering, substitution, and ordering of document parts. Since mathematics is the foundation of many other disciplines, the author expects that the findings of her work can be applied to other domains and a wide range of documents.

I am most grateful to Michael Kohlhase for having the overview to find the right research topic for me and for making it possible for me to explore it. I am especially indebted to Cristian Calude whose insights and advice have been very compelling and fruitful. I have benefited from many discussions and collaborations with numerous researchers, in particular, Normen Müller and the members of the KWARC group at Jacobs University Bremen.
Finally, I would like to thank Jacobs University, the German National Merit Foundation, and the JEM-Thematic-Network ECP-038208, who have provided me with financial independence for the past three years.

Contents

I. Introduction & State of the Art
  1. Introduction
    1.1. The Document-Centered and Topic-Oriented Paradigms
    1.2. Research Contribution
    1.3. Adaptation of Mathematical Documents
    1.4. Context-based Adaptation
  2. Preliminaries & State of the Art
    2.1. XML-based Markup Formats
    2.2. Systems for Document Adaptation

II. Adaptation of Mathematical Notations
  3. Introduction
    3.1. Introduction into Mathematical Notations
    3.2. Representation of Mathematical Notations
    3.3. State of the Art
    3.4. Requirements Specification
  4. Pattern-based Rendering of Notations
    4.1. Information Model
    4.2. Specification of Notation Definitions
    4.3. The Rendering Algorithm
    4.4. Guiding the Rendering Algorithm
  5. Summary & Evaluation of Notations
    5.1. Services & Limitations
    5.2. Proof of Concept
    5.3. Chapter Summary

III. Content Planning for Mathematical Documents
  6. Introduction
    6.1. Representation of Variants/Alternatives
    6.2. State of the Art
    6.3. Requirements Specification
  7. The Content Planning Approach
    7.1. Modularisation of Mathematical Documents
    7.2. Variants & Variant Relations
    7.3. Specification for the Content Planning
  8. Substitution of Document Parts
    8.1. Information Model
    8.2. The Abstractor
    8.3. The Substitution Algorithm
    8.4. Summary & Evaluation of the Substitution
  9. Reordering of Mathematical Documents
    9.1. Information Model
    9.2. The Abstractor
    9.3. The Reordering Algorithm
    9.4. Summary & Evaluation of the Reordering

IV. Adaptation in Practice
  10. The Panta Rhei System
    10.1. The Adaptable Panta Rhei
    10.2. The Social Panta Rhei
    10.3. The Adaptive Panta Rhei
    10.4. Chapter Summary
  11. Conclusion & Future Work
    11.1. Conclusion
    11.2. Future Work
    11.3. Epilog

Part I. Introduction & State of the Art

1. Introduction

Information technologies have transformed our present era into an information age, in which individuals can freely transfer information and have instant access to data that was formerly difficult or impossible to access. In particular, the rise of the World Wide Web (WWW, W3 or Web) has led to an explosion of digital materials that are highly interlinked and available at our fingertips. Originally developed by Tim Berners-Lee in 1989 [BL90] to provide an easy way to exchange research results, the WWW evolved into a global source of information for everyone: "The WWW is a wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents" (Tim Berners-Lee). Summarised