Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web

Undefined 0 (0) 1 1 IOS Press Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web Christoph Lange fully integrated with RDF representations in order to con- tribute existing mathematical knowledge to the Web of Data. Computer Science, Jacobs University Bremen, We conclude with a roadmap for getting the mathematical Web of Data started: what datasets to publish, how to inter- Germany link them, and how to take advantage of these new connec- E-mail: [email protected] tions. Keywords: mathematics, mathematical knowledge management, ontologies, knowledge representation, formalization, linked data, XML Abstract. Mathematics is a ubiquitous foundation of science, technology, and engineering. Specific areas, such as numeric and symbolic computation or logics, enjoy consid- erable software support. Working mathematicians have re- 1. Introduction: Mathematics on the Web – State cently started to adopt Web 2.0 environment, such as blogs of the Art and Challenges and wikis, but these systems lack machine support for knowledge organization and reuse, and they are disconnected from tools such as computer algebra systems or interactive proof A review of the state of the art of mathematics assistants. We argue that such scenarios will benefit from Se- on the Web has to acknowledge Web 1.0 sites that mantic Web technology. are in day to day use: review and abstract services Conversely, mathematics is still underrepresented on the Web of [Linked] Data. There are mathematics-related such as Zentralblatt MATH [30] and MathSciNet [35], Linked Data, for example statistical government data or sci- the arχiv pre-print server [39], libraries of formalized entific publication databases, but their mathematical seman- and machine-verified mathematical content such as the tics has not yet been modeled. We argue that the services for Mizar Mathematical Library (MML [9]), and refer- the Web of Data will benefit from a deeper representation of ence works such as the Digital Library of Mathemati- mathematical knowledge. Mathematical knowledge comprises logical and func- cal Functions (DLMF [3]) or Wolfram MathWorld [8]. tional structures – formulæ, statements, and theories –, a These sites have facilitated the access to mathemati- mixture of rigorous natural language and symbolic nota- cal knowledge. However, (i) they offer a limited degree tion in documents, application-specific metadata, and dis- of interaction and do not facilitate collaboration, and cussions about conceptualizations, formalizations, proofs, (ii) the means of automatically retrieving, using, and and (counter-)examples. Our review of approaches to representing these structures covers ontologies for mathemat- adaptively presenting knowledge through automated ical problems, proofs, interlinked scientific publications, agents are restricted. Concerning the Web in general, scientific discourse, as well as mathematical metadata vo- problem (i) has been addressed by Web 2.0 applica- cabularies and domain knowledge from pure and applied tions, and problem (ii) by the Semantic Web. This sec- mathematics. tion reviews to what extent these developments have Many fields of mathematics have not yet been imple- mented as proper Semantic Web ontologies; however, we been adopted for mathematical applications and sug- show that MathML and OpenMath, the standard XML-based gests a new combination of Web 2.0 and Semantic Web exchange languages for mathematical knowledge, can be to overcome the remaining problems. 0000-0000/0-1900/$00.00 c 0 – IOS Press and the authors. All rights reserved 2 Lange / Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web 1.1. How Working Mathematicians have Embraced ers mathematics [187]. Targeting a general audience, it Web 2.0 Technology omits most formal proofs but embeds the pure mathematical knowledge into a wider context, including, An increasing number of working mathematicians e.g., the history of mathematics, biographies of math- has recently started to use Web 2.0 technology for col- ematicians, and information about application areas. laboratively developing new ideas, but also as a new The lack of proofs is sometimes compensated by link- publication channel for established knowledge. Re- ing to the technically similar ProofWiki [17], contain- search blogs and wiki encyclopedias are typical repre- ing over 2,500 proofs, or to PlanetMath. Finally, Con- sentatives. nexions [72], technically driven by a more traditional Content Management System, is an open web reposi- 1.1.1. Research Blogs and Wiki Encyclopedias tory specialized on courseware. Connexions promotes Researchers have found blogs useful to gather feed- the contribution of small, reusable course modules – back about preliminary findings before the traditional more than 17,000, about 4,000 from mathematics and peer review. Successful collaborations among mathe- statistics, and about 6,000 from science and technol- maticians not knowing each other before have started ogy – to its content commons, so that the original au- in blogs and converged into conventional articles [47]. thor, but also others can flexibly combine them into GOWERS, an active blogger, initiated the successful collections, such as the notes for a particular course. Polymath series, where blogs are the exclusive com- The wikis mentioned so far have been set up from munication medium for proving theorems in a mas- scratch, hardly reusing content from existing knowl- sive collaborative effort [14,100,50], including the re- edge bases, but maintainers of established knowl- cent collaborative review of a claimed proof of P 6= edge bases are also starting to employ Web 2.0 fron- NP [15]. Compared to research blogs, the MathOver- tends – for example the recently developed proto- flow forum [7], where users can post their problems typical wiki frontend for the Mizar Mathematical Li- and solutions to others’ problems, offers more instant brary (MML) [178], a large library of formalized and help with smaller problems. By its reputation mecha- machine-verified mathematical content. The wiki in- nism, it acts as an agile simulation of the traditional tends to support common workflows in enhancing and scientific publication and peer review processes. maintaining the MML and thus to disburden the human For evolving ideas emerged from a blog discus- library committee. sion, or for creating permanent, short, interlinked de- scriptions of topics, wikis have been found more ap- 1.1.2. Critique – Little Reuse, Lack of Services propriate. The nLab wiki [28], a companion to the Web 2.0 sites facilitate collaboration but still require n-Category Café blog [27], is a prominent example a massive investment of manpower for compiling a for that, and also for the emerging practice of Open knowledge collection. Machine-supported intelligent Notebook Science, i.e. “making the entire primary knowledge reuse, e.g. from other knowledge collec- record of a research project public”, including “failed, tions on the Web, does not take place. Different knowl- less significant, and otherwise unpublished experi- edge bases are technically separated from each other ments” [188]. The Polymath maintainers have also set by using document formats that are merely suitable up a companion wiki for “collect[ing] pertinent back- for knowledge presentation but not for representation, ground information which was no longer part of the such as XHTML with LATEX formulæ. The only way of active ‘foreground’ of exchanges on the [. ] blog en- referring to other knowledge bases is an untyped hy- tries” [50]. Finally, where MathOverflow focuses on perlink. The proof techniques collected in the Tricki concrete problems and solution, the Tricki [21], also cannot be automatically applied to a problem devel- initiated by GOWERS, is a wiki repository of general oped in a research blog, as neither of them are suffi- mathematical techniques – reminiscent of a Web 2.0 ciently formalized. remake of PÓLYA’s classic “How to Solve It” [161]. Intelligent information retrieval, a prerequisite for Wikis that collect existing mathematical knowledge, finding knowledge to reuse and to apply, is poorly sup- for educational and general purposes, are more widely ported. For example, Wikipedia states the Pythagorean known. PlanetMath [159], counting more than 8,000 theorem as a2 + b2 = c2 and files it into the cate- entries at the time of this writing, is a mathemati- gories “Articles containing proofs” and “Mathematical cal encyclopedia. The general-purpose Wikipedia with theorems” [189]. The LATEX representation of the for- 15 million articles in over 250 languages also cov- mulæ does not support search by functional structure. Lange / Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web 3 Putting the fact aside that Wikipedia cannot search for- ematical textbooks and disseminate new mathemati- mulæ at all, a search for equivalentp expressions such cal results; and (iv) librarians and mathematicians who as x2 + y2 = z2 or c = a2 + b2 would not yield catalog and organize mathematical knowledge” [93]1. the theorem, unless they explicitly occur in the article. They hoped that Semantic Web technologies would From the categorization it is neither clear for a ma- help to address their challenges. This seemed techni- chine (albeit very likely for a human) whether the ar- cally feasible, particularly due to the common founda- ticle contains a proof of that theorem, nor whether it tion of XML and URIs [140]. is correct. Similarly,

Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web

Documents with Flexible Notation Contexts As Interfaces To

Download Slides

Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web

Datatone: Managing Ambiguity in Natural Language Interfaces for Data Visualization Tong Gao1, Mira Dontcheva2, Eytan Adar1, Zhicheng Liu2, Karrie Karahalios3

Editing Knowledge in Large Mathematical Corpora. a Case Study with Semantic LATEX(STEX)

Arxiv:1910.13561V1 [Cs.LG] 29 Oct 2019 E-Mail: [email protected] M

Building Dialogue Structure from Discourse Tree of a Question

A New Kind of Science

Problem of Extracting the Knowledge of Experts Fkom the Perspective of Experimental Psychology

Towards Unsupervised Knowledge Extraction

RIO: an AI Based Virtual Assistant

Ontology and Information Systems