Content Markup Language Design Principles Andreas Strotmann
Florida State University Libraries
Electronic Theses, Treatises and Dissertations, The Graduate School
2003

THE FLORIDA STATE UNIVERSITY
COLLEGE OF ARTS AND SCIENCES

CONTENT MARKUP LANGUAGE DESIGN PRINCIPLES

By ANDREAS STROTMANN

A Dissertation submitted to the Department of Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Spring Semester, 2003

The members of the Committee approve the dissertation of Andreas Strotmann defended on April 3, 2003.

Ladislav J. Kohout, Professor Directing Thesis
Mika Seppälä, Outside Committee Member
Robert A. van Engelen, Committee Member
Kyle Gallivan, Committee Member
Hilbert Levitz, Committee Member

Approved: Sudhir Aggarwal, Chair, Department of Computer Science

The Office of Graduate Studies has verified and approved the above named committee members.

ACKNOWLEDGEMENTS

First and foremost, I wish to thank Ladislav J. Kohout, without whose understanding, trust, and guidance this dissertation would have remained forever unwritten, Mika Seppälä, whose unwavering trust and support gave me so many opportunities over so many years to gain the necessary experience for this project, and the other members of my committee, R. van Engelen, K. Gallivan, and H. Levitz, for their interest, encouragement, and support.

R. Esser, F.-W. Hehl, and A.C. Hearn introduced me to the theory and practice of symbolic computing in general, and of computer algebra in particular, while J. Lenerz, P.-O. Samuelsdorff and the many participants of the Arbeitskreis Linguistik at the Institut für Deutsche Sprache und Literatur at Universität zu Köln introduced me to the fascinating research on formal linguistics of human languages. Without them, this dissertation would have looked quite different. D. Krekel, R. Schrader, B.
Haas and the members of the Graduiertenkolleg Scientific Computing at the University of Cologne got me started on the long journey which culminated in this dissertation.

Numerous members of the OpenMath working group provided invaluable opportunities to discuss and share ideas and experience in the field, especially J.A. Abbott, O. Caprotti, D. Carlisle, S. Dalmas, J. Davenport, M. Gaëtano, M. Kohlhase, A. van Leeuwen, B. Miller, M. Roelofs, M. Seppälä, R. Sutor, and S. Vorkoetter. The OpenMath Steering Committee, especially M. Seppälä, G. Gonnet, and A. Cohen, provided invaluable support, both moral and financial. P. Marti and M. Rueher were great hosts and collaborators at ESSI/ISSS, University of Nice at Sophia-Antipolis until I was forced to stop working on this topic due to circumstances beyond their or my control in 1995. I still owe them an apology.

I wish to thank the faculty and fellow students at the Department of Computer Science, at the Supercomputer Computing Research Institute and later the School of Computational Science and Information Technology, at the Mathematics Department, and at the Oceanography Department of The Florida State University. Special thanks also go to the people at the FSU International Student Center, especially its director, R. Christie, and the numerous members of its International Coffee Hour: they have been a haven of sanity in Tallahassee.

A.C. Hearn, W. Neun, H. Melenk, A.C. Norman, and J.P. Fitch generously provided me access to the source code of their respective implementations of REDUCE and the underlying LISP systems. S. Vorkoetter of Waterloo Maple kindly allowed me the use of his implementation of his OpenMath prototype language component in Maple. Ph. Marti provided crucial parts of our joint implementation of the OpenMath prototype language in Prolog III and Delphia Prolog. R.
van Engelen provided the source code and advice on his Ctadel system as well as the funding for my implementation of OpenMath 1.0 in Ctadel.

There have been a multitude of sources of funding for the research reported upon here. The Deutsche Forschungsgemeinschaft (DFG) through its Graduiertenkolleg Scientific Computing at the University of Cologne, Germany, provided funding in the form of a PhD fellowship for the first phase of this project, and the Regional Computer Center at the University of Cologne provided the use of their facilities and travel funding. The European Community projects Editing and Computing (later renamed OpenMath) provided travel funding, as did the Research Institute for Applications of Computer Algebra (RIACA), the Symbolic Computation Group at the University of Waterloo, and, at The Florida State University, the Department of Mathematics, the Department of Computer Science, the School of Computational Science and Information Technology, the Oceanography Department, and the Research Foundation. The Florida State University provided funding through a University Fellowship, through teaching assistantships at the Department of Computer Science, and through research assistantships at the Mathematics Department (in collaboration with the Office for Distributed and Distance Learning) and at the Oceanography Department (through a grant from the FSU Research Foundation and in collaboration with the School of Computational Science and Information Technology).

Finally, my personal and very special thanks go to D. Zhao for going through this together with me.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract

1. OVERVIEW
1.1 The Problem Area
1.1.1 A Deceptively Simple Problem
1.1.2 Problems of Quick-and-Dirty Solutions
1.2 A Better Understanding for More Principled Solutions
1.2.1 The Linguistics Parallel
1.2.2 Content Markup Language Architecture
1.2.3 The Compositionality Principle
1.2.4 Categorial Semantics

2. INTRODUCTION
2.1 The Research Area
2.2 Markup
2.3 Markup Language
2.4 Content Markup
2.5 The Mathematical Markup Language: Content and Presentation Markup for Mathematics
2.6 Presentation Markup
2.7 Markup Language Design
2.8 The Research Topic
2.9 Understanding Content Markup Language Design
2.10 Towards Understanding Content Markup Language Design
2.11 The Linguistics Approach
2.11.1 Compositionality
2.11.2 Categorial Semantics
2.11.3 Language Architecture: Layers and Components

3. HISTORICAL TIME-LINES
3.1 A Modern Prophecy
3.2 Recent Wish Lists
3.3 Early Beginnings
3.4 Many Parallel Strands – 1993–1994
3.5 Two Bundles and Some Loose Strands – 1995
3.6 Regrouping – 1996
3.7 Consolidation – 1997
3.8 Cleaning Up and Shaking Down
3.9 Onwards

4. RELATED TOPICS
4.1 Applications and Environments
4.1.1 Collaborating Symbolic Computation Systems
4.1.2 Symbolic-Numeric Interfaces
4.1.3 User Input and Output
4.1.4 Symbolic Computation Web Services
4.1.5 Interactive Textbooks
4.1.6 Communication Architectures
4.1.7 The “Semantic Web”
4.1.8 Mobile Communication
4.2 Existing Markup Languages
4.2.1 Symbolic Computing Languages in Artificial Intelligence and Computer Algebra
4.2.2 (Semi-) Numeric Computations: Scientific Data Formats
4.2.3 Textual Information: Meta-Data and Text Encoding
4.2.4 Application-Specific Data Formats
4.2.5 General-Purpose Data Formats
4.2.6 The Extensible Markup Language
4.3 Theoretical Background
4.3.1 Language, Logic, Semiotics
4.3.2 Linguistics and Cognitive Science
4.3.3 Design Issues

5. APPLICATIONS AND IMPLEMENTATIONS
5.1 The Computer Algebra Information Interchange Format Project
5.1.1 A Distributed REDUCE System
5.1.2 “Toy” Application
5.1.3 Architecture
5.1.4 Implementation
5.1.5 Early Lessons
5.2 First Full Prototypes
5.2.1 An OpenMath Prototype
5.2.2 A World Premiere
5.2.3 More Lessons
5.3 An Application: A Cooperative Multi-Solver Constraint Resolution System
5.3.1 The Problem
5.3.2 The Proposed Solution
5.3.3 Lessons Learned
5.3.4 Practical Problems
5.4 Teaching Ctadel to Speak OpenMath 1.0
5.4.1 Parsing and Generating OpenMath