Towards Flexiformal Mathematics By
Total Page:16
File Type:pdf, Size:1020Kb
Towards Flexiformal Mathematics by Mihnea Iancu A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science Approved Dissertation Committee Prof. Dr. Michael Kohlhase Jacobs University Bremen, Germany Prof. Dr. Herbert Jaeger Jacobs University Bremen, Germany Prof. Dr. William M. Farmer McMaster University, Canada Priv.-Doz. Dr. Florian Rabe Jacobs University Bremen, Germany Date of defense: Jan 5, 2017 Computer Science & Electrical Engineering Statutory Declaration (Declaration on Authorship of a Dissertation) I, Mihnea Iancu, hereby declare, under penalty of perjury, that I am aware of the consequences of a deliberately or negligently wrongly submitted affidavit, in particular the punitive provisions of x156 and x161 of the Criminal Code (up to 1 year imprisonment or a fine at delivering a negligent or 3 years or a fine at a knowingly false affidavit). Furthermore I declare that I have written this PhD thesis independently, unless where clearly stated otherwise. I have used only the sources, the data and the support that I have clearly mentioned. This PhD thesis has not been submitted for the conferral of a degree elsewhere. Place Date Signature ii Parts of this thesis are based on or closely related to previously published material or material that is prepared for publication at the time of this writing. These are [IKR], [IK15b], [IK15a], [KI14], [IKP14], [IJKW14], [GIJ+16], [IKRU13] and [IKR11]. Parts of this work are collaborations with other researchers: For Part I with Michael Kohlhase, for Part II with Michael Kohlhase and Florian Rabe and for Part III with Constantin Jucovschi, Michael Kohlhase, Enxhell Luzhnica, Akbar Oripov, Corneliu Prodescu, Florian Rabe and Tom Wiesing. In each case the precise connection to this work is detailed in the relevant passages of the text. iii Abstract The application of computer-based methods to mathematics, while meaningful, is constrained by the fact the most mathematical knowledge exists in forms that can only be understood by humans. In order for it to be also understood by machines, those aspects of mathematical knowledge that are relevant for machine-driven applications need to be made explicit. One important aspect of mathematics is the underlying semantics which can be made understand- able to machines by formalizing it. Formalization makes explicit the implicit definitions and inference steps that occur naturally in mathematical documents. But, despite numerous attempts, only a small fragment of mathematics has been formalized because formalization is prohibitively expensive. Furthermore, existing formalized mathematics typically lacks in other aspects such as presentation information and narrative structure that are common in mathematics and critical for human-oriented practical applications. Therefore, existing applications for formal mathematics are mostly limited to verification so there is little practical incentive for formalization. Relying on the idea of flexiformality we propose flexiformalizing mathematics which addresses the bottlenecks discussed above in two ways. First, flexiformality allows content of flexible formality, so it minimizes the starting cost of flexiformalization compared to formalization. Sec- ond, flexiformality involves co-representing the narration, structure and meaning of mathemati- cal knowledge. Therefore, it forms a basis for not only machine processing but also for building practical, human-oriented applications. We call the result of flexiformalization as described here flexiformal knowledge and we believe mathematical knowledge is fundamentally flexiformal. In this thesis, we survey both informal and formal mathematics to identify flexiformal phenom- ena in mathematical practice. Then, we design and implement a formal language for flexiformal knowledge to adequately represent them. Furthermore, we evaluate the language and implemen- tation by importing existing mathematical libraries at different levels of formality and developing applications for them. Finally, we integrate the libraries and applications into a generic system as an extensive case study in flexiformal knowledge management. First and foremost, I would like to express my gratitude to Michael Kohlhase for his continuous guidance, support and encouragement, as well as for suggesting this research topic in the first place. I am especially indebted to Florian Rabe whose comments and advice have always proved compelling and insightful. I am also grateful to Herbert Jaeger and William Farmer for their support and feedback. Finally, I would like to thank the members of the KWARC group whose ideas and remarks proved often fruitful and always interesting. ii Contents 1 Introduction 1 I Flexiformal Phenomena and State of the Art 6 2 Flexiformal Phenomena in Mathematics 7 2.1 Phenomena of Mathematical Vernacular . .7 2.1.1 Phrase Structure . .9 2.1.2 Discourse Structure . 12 2.1.3 Document Context . 16 2.1.4 Level-Independent Phenomena . 20 2.1.5 Additional Requirements . 21 2.2 Narration and Structure in Formal Mathematics . 22 3 State of the Art 28 II Representing Flexiformal Knowledge 36 4 Preliminaries 37 4.1 The OMDoc Format . 37 4.2 The MMT Language and System . 41 4.2.1 The MMT Language . 41 4.2.2 Inference System . 44 4.2.3 The MMT System . 45 iii CONTENTS 5 A Formal Language for Flexiformal Knowledge 48 5.1 MMT Extensions for Flexiformal Knowledge . 48 5.1.1 Opaque Terms . 49 5.1.2 Bipartite Terms . 52 5.1.3 Bipartite Declarations . 55 5.1.4 Derived Declarations . 59 5.2 The IMMT Language and System . 62 5.2.1 Grammar . 62 5.2.2 Referencing Declarations . 63 5.2.3 Implementation . 65 6 Representing Flexiformal Phenomena in IMMT 67 6.1 Informal Mathematics . 68 6.1.1 Notations and Verbalizations . 70 6.1.2 Mathematical Structures . 72 6.1.3 Structured Proofs . 75 6.1.4 Dynamic Theories . 77 6.1.5 Recaps . 83 6.1.6 Foundational Ambiguity . 88 6.2 Formal Mathematics . 91 6.2.1 Local Scopes . 91 6.2.2 Sections . 92 6.2.3 MMT Structures . 92 6.2.4 Documents . 93 III Flexiformal Knowledge Management 95 7 Flexiformal Mathematical Libraries 96 7.1 Importing the Mizar Library into IMMT ...................... 97 7.2 Importing STEX Libraries into IMMT ....................... 100 7.2.1 Object Level . 100 7.2.2 STEX primitives as IMMT declarations . 101 iv CONTENTS 7.2.3 Implementation . 104 7.3 Importing the OEIS Library into IMMT ...................... 104 8 Applications for Flexiformal Knowledge 108 8.1 The IMMT System as a Semantic Database . 110 8.1.1 Accessing Explicitly Represented Content . 110 8.1.2 Accessing Induced Material . 111 8.2 The MathHub System . 115 8.2.1 System Architecture and Realization . 116 8.2.2 Services and Applications . 117 8.2.3 Implementation Overview . 125 9 Conclusion and Future Work 127 v Chapter 1 Introduction Throughout history, knowledge has been a crucial building block for human civilization, and its creation, distribution and consumption have been driving forces for human progress. Mathemat- ics, in particular, is among the oldest areas of human knowledge and is the foundation for most of modern science. Moreover, mathematical knowledge still grows relentlessly as the total amount of published mathematics has been estimated to double every ten to fifteen years [Odl95]. The digital revolution has already transformed mathematical practice. Most mathematical knowl- edge is rendered and read using computer systems such as PDF readers and web browsers. It is organized in electronic repositories of mathematical knowledge such as arXiv [ArX], eu- DML [EuD], Math Reviews [MRv] or ZentralBlatt math [ZBM]. Furthermore, computer algebra systems (CAS) such as Mathematica [Wol02], Maple [Mpl16], SageMath [Sag16], GAP [GAP16] or MuPAD [MPD16] are routinely used for computation. However, computer support for mathematics is limited by the fact that most mathematical knowl- edge exists as human-oriented content that is inaccessible to computer processing. Currently, most mathematics is laid down in the form of documents ranging from research articles to engi- neering whitepapers to math textbooks. These are often encoded electronically as digitizations of printed material or born-digital in the form of PDF generated from LATEX (predominatly in math research) but also other formats (in education and engineering). However, they are usually targeted at and exclusively consumed by human readers while their semantics is inaccessible to computers systems. In order to support “doing mathematics” by computer we have to re- cover the knowledge in these documents in a way that computers can act on. [Far08] defines this as practical expressivity as opposed to theoretical expressivity which disregards how readily the knowledge can be encoded and used by computer applications. In this thesis, motivated by building practical applications, we will focus on practical expressivity. We identify two distinct, existing approaches towards achieving that goal. The first is to use computer methods to semanticize the existing mathematical documents. This process is essentially the syntactic and semantic analysis phase of natural language processing applied to mathematical documents. The analysis process is traditionally viewed as a transforma- tion from utterances or documents into a knowledge representation language. First-order logic 1 INTRODUCTION can seem a plausible target format for mathematical knowledge because it is (together with ax- iomatic set theory [Ber91]) the generally accepted theoretical foundation of mathematics. But, in