Ontologies and Languages for Representing Mathematical Knowledge on the Semantic Web
Editor(s): Aldo Gangemi, ISTC-CNR Rome, Italy Solicited review(s): Claudio Sacerdoti Coen, University of Bologna, Italy; Alexandre Passant, DERI, National University of Galway, Ireland; Aldo Gangemi, ISTC-CNR Rome, Italy
Christoph Lange data vocabularies and domain knowledge from pure and ap- plied mathematics. FB 3 (Mathematics and Computer Science), Many fields of mathematics have not yet been imple- University of Bremen, Germany mented as proper Semantic Web ontologies; however, we Computer Science, Jacobs University Bremen, show that MathML and OpenMath, the standard XML-based exchange languages for mathematical knowledge, can be Germany fully integrated with RDF representations in order to con- E-mail: [email protected] tribute existing mathematical knowledge to the Web of Data. We conclude with a roadmap for getting the mathematical Web of Data started: what datasets to publish, how to inter- link them, and how to take advantage of these new connec- tions. Abstract. Mathematics is a ubiquitous foundation of sci- Keywords: mathematics, mathematical knowledge manage- ence, technology, and engineering. Specific areas of mathe- ment, ontologies, knowledge representation, formalization, matics, such as numeric and symbolic computation or logics, linked data, XML enjoy considerable software support. Working mathemati- cians have recently started to adopt Web 2.0 environments, such as blogs and wikis, but these systems lack machine sup- 1. Introduction: Mathematics on the Web – State port for knowledge organization and reuse, and they are dis- of the Art and Challenges connected from tools such as computer algebra systems or interactive proof assistants. We argue that such scenarios will benefit from Semantic Web technology. A review of the state of the art of mathematics on Conversely, mathematics is still underrepresented on the Web has to acknowledge traditional web sites that the Web of [Linked] Data. There are mathematics-related are in day to day use: review and abstract services Linked Data, for example statistical government data or sci- such as Zentralblatt MATH [34] and MathSciNet [41], entific publication databases, but their mathematical seman- the arXiv pre-print server [46], libraries of formalized tics has not yet been modeled. We argue that the services for and machine-verified mathematical content such as the the Web of Data will benefit from a deeper representation of Mizar Mathematical Library (MML [9]), and refer- mathematical knowledge. ence works such as the Digital Library of Mathemati- Mathematical knowledge comprises structures given in cal Functions (DLMF [3]) or Wolfram MathWorld [8]. a logical language – formulae, statements (e.g. axioms), These sites have facilitated the access to mathemati- and theories –, a mixture of rigorous natural language and cal knowledge. However, (i) they offer a limited degree symbolic notation in documents, application-specific meta- data, and discussions about conceptualizations, formaliza- of interaction and do not facilitate collaboration, and tions, proofs, and (counter-)examples. Our review of vocab- (ii) the means of automatically retrieving, using, and ularies for representing these structures covers ontologies for adaptively presenting knowledge through automated mathematical problems, proofs, interlinked scientific publi- agents are restricted. Concerning the Web in general, cations, scientific discourse, as well as mathematical meta- problem (i) has been addressed by Web 2.0 applica- tions, and problem (ii) by the Semantic Web. This sec- ments” [228]. The Polymath maintainers have also set tion reviews to what extent these developments have up a companion wiki for “collect[ing] pertinent back- been adopted for mathematical applications and sug- ground information which was no longer part of the gests a new combination of Web 2.0 and Semantic Web active ‘foreground’ of exchanges on the [. . . ] blog en- to overcome the remaining problems. tries” [58]. Finally, where MathOverflow focuses on concrete problems and solution, the Tricki [25], also 1.1. How Working Mathematicians have Embraced initiated by GOWERS, is a wiki repository of general Web 2.0 Technology mathematical techniques – reminiscent of a Web 2.0 remake of PÓLYA’s classic “How to Solve It” [194]. An increasing number of working mathematicians Wikis that collect existing mathematical knowledge, has recently started to use Web 2.0 technology for col- for educational and general purposes, are more widely laboratively developing new ideas, but also as a new known. PlanetMath [192], counting more than 8,000 publication channel for established knowledge. Re- entries at the time of this writing, is a mathemati- search blogs and wiki encyclopedias are typical repre- cal encyclopedia. The general-purpose Wikipedia with sentatives. 15 million articles in over 250 languages also cov- ers mathematics [227]. Targeting a general audience, it 1.1.1. Research Blogs and Wiki Encyclopedias omits most formal proofs but embeds the pure math- As a publication channel for established knowledge, ematical knowledge into a wider context, including, blogs reflect the traditional mathematical practice of e.g., the history of mathematics, biographies of math- publishing short reviews of previously published ma- ematicians, and information about application areas. terial, as done by the review and abstract services men- The lack of proofs is partly compensated by linking tioned initially. Researchers have also found blogs use- to the technically similar ProofWiki [17], containing ful to gather early feedback about preliminary find- over 2,500 proofs, or to PlanetMath. Finally, Connex- ings, whereas traditional publications, even in the state ions [86], technically driven by a more traditional con- submitted for peer review, would rather present mature tent management system, is an open web repository ideas while hiding alternatives that had once been con- specialized on courseware. Connexions promotes the sidered and then discarded – a working method that contribution of small, reusable course modules – more effectively hinders collaboration outside of small re- than 17,000, about 4,000 from mathematics and statis- search groups. Successful collaborations among math- tics, and about 6,000 from science and technology – ematicians not knowing each other before have started to its content commons, so that the original author, but in blogs and converged into conventional articles [55]. also others can flexibly combine them into collections, GOWERS, an active blogger, initiated the successful such as the notes for a particular course. Polymath series, where blogs are the exclusive com- The wikis mentioned so far have been set up from munication medium for proving theorems in a massive scratch, hardly reusing content from existing knowl- collaborative effort [14,121,58], including the recent edge bases, but maintainers of established knowl- collaborative review of a claimed but wrong proof of edge bases are also starting to employ Web 2.0 fron- P 6= NP [15]. Compared to research blogs, the Math- tends – for example the recently developed prototypi- Overflow forum [7], where users can post their prob- cal wiki frontend for the Mizar Mathematical Library lems and solutions to others’ problems, offers more (MML) [39,216], a large library of formalized and instant help with smaller problems. By its reputation machine-verified mathematical content. The wiki in- mechanism, it acts as an agile simulation of the tradi- tends to support common workflows in enhancing and tional scientific publication and peer review processes. maintaining the MML and thus to disburden the human For evolving ideas emerged from a blog discus- library committee. sion, or for creating permanent, short, interlinked de- scriptions of topics, wikis have been found more ap- 1.1.2. Critique – Little Reuse, Lack of Services propriate. The nLab wiki [32], a companion to the Web 2.0 sites facilitate collaboration but still require n-Category Café blog [31], is a prominent example a massive investment of manpower for compiling a for that, and also for the emerging practice of Open knowledge collection. Machine-supported intelligent Notebook Science, i.e. “making the entire primary knowledge reuse, e.g. from other knowledge collec- record of a research project public”, including “failed, tions on the Web, does not take place. Different knowl- less significant, and otherwise unpublished experi- edge bases are technically separated from each other by using document formats that are merely suitable 1.2. Early Adoption of the Semantic Web in for knowledge presentation but not for representation, Mathematical Knowledge Management on the such as XHTML with LATEX formulae. The only way Semantic Web of referring to other knowledge bases is an untyped hy- perlink. The proof techniques collected in the Tricki In the early 2000s, when XML was increasingly cannot be automatically applied to a problem devel- used for mathematics, particularly for formulae (cf. oped in a research blog, as neither of them are suffi- section 4.1.1), the first building blocks of the Seman- ciently formalized. Conversely, the Polymath commu- tic Web vision, such as RDFS, approached standard- nity does not have any automated verification tools at ization. These developments sparked interest in the hand but exclusively relies on the crowdsourcing prin- emerging interdisciplinary mathematical knowledge ciple that “given enough eyeballs, all bugs [here: errors management (MKM) community, consisting of com- in a proof] are shallow” [226].1 puter scientists, computer-savvy mathematicians, and Intelligent information retrieval, a prerequisite for digital library researchers, whose objective is “to de- finding knowledge to reuse and to apply, is poorly sup- velop new and better ways of managing mathematical ported. For example, Wikipedia states the Pythagorean knowledge using sophisticated software tools” [114]4, theorem as a2 + b2 = c2 and files it into the cate- or, more specifically, “to serve (i) mathematicians, sci- gories “Articles containing proofs” and “Mathematical entists, and engineers who produce and use mathe- theorems” [229]. The LATEX representation of the for- matical knowledge; (ii) educators and students who mulae does not support search by functional structure. teach and learn mathematics; (iii) publishers who offer Putting aside the fact that Wikipedia cannot search for- mathematical textbooks and disseminate new mathe- matical results; and (iv) librarians and mathematicians mulae at all, a search for equivalent√ expressions such as x2 + y2 = z2 or c = a2 + b2 would not yield who catalog and organize mathematical knowledge” 5 the theorem, unless they explicitly occur in the arti- [114] . They hoped that Semantic Web technologies cle.2 From the categorization it is neither clear for a would help to address their challenges. This seemed machine (albeit very likely for a human) whether the technically feasible, particularly as both communities article contains a proof of that theorem, nor whether it made use of XML as a serialization format and URIs is correct. Similarly, the Polymath collaborators had to for identifying things [170]. search previous publications of refutations of P 6= NP The two main lines of applying Semantic Web tech- “proofs” by keyword. nologies to MKM focused on digital libraries – im- Formalized repositories such as the MML use spe- proving information retrieval and giving readers ac- cialized search engines [56]. While they support inter- cess to automated reasoning and computation services nal knowledge reuse by formalizing new mathematical –, and web services – providing self-describing inter- concepts of existing ones and proving new theorems faces to automated reasoning and computation on the by applying ones that have already been proven, they Web, so that they could solve problems sent to them by do not support links to external repositories. Thus, the humans or other agents. maintainers of each knowledge collection, informal or 1.2.1. Digital Libraries – MathNet, HELM, and their formalized, hope to receive a critical mass of contri- Spin-Offs butions that makes it sufficiently self-contained for the Mathematical institutes participating in MathNet [6], desired application. an early effort to build “a distributed, efficient and Finally, the integration of mathematical Web 2.0 user-driven information and communication system sites with automated reasoning and computation ser- for mathematics” [93], were advised to put up uni- vices is scarce. Interactive computation is available formly structured homepages, to publish preprints, and in mathematical e-learning systems, such as Active- to annotate both with RDF. Some of the 180 Math- Math [36] or MathDox [174] – where document au- Net homepages that existed in 2002 [208] are still on- thors have sufficiently formalized the underlying math- line; however, the central services, including a preprint ematics in separate editing tools before publishing –, search engine6 and a browser for MathNet pages, have but less so in general-purpose digital libraries and col- been out of order since 2007. laboration environments. Mashups, which have other- Independently, HELM, the Hypertextual Electronic wise been a driving force of Web 2.0 development, Library of Mathematics [5,49], aimed at “integrat[ing] scarcely exist for mathematical tasks.3 the current tools for the automation of formal rea- soning and the mechanization of mathematics [. . . ] efficiently deal with a large number of instances (here: with the most recent technologies for the develop- concrete problems instantiating problem descriptions), ment of web applications and electronic publishing” which required a specific database/reasoner hybrid to [5]. In contrast to MathNet and other traditional digi- be developed, but then, again, the separate treatment tal libraries, HELM intended to explicitly represent the of classes and instances constrained the design of the fine-grained structures of mathematical expressions to MONET ontologies in that they had to model every ob- expose them, e.g., to automated reasoners, but also ject as a class [82]. Part of MONET’s query language is to enrich their publication on the Web. For exam- still used in the MathDox e-learning system [174,88]. ple, mathematical formulae were rendered in Math- More importantly, MONET and the competing Math- ML in such a way that actions could be performed on Broker architecture for symbolic computation web ser- them, e.g. simplifying a selected (sub)expression us- vices [57] influenced each other. The latter, however, ing an automated reasoning backend attached to the li- made less use of Semantic Web service technologies. brary. HELM completely relied on XML and RDF not The MathServe architecture, influenced by both of the only for publishing, but also for its internal knowledge former but focusing on automated reasoning, made ex- representation. Formalizations of mathematical state- tensive use of more recent Semantic Web service tech- ments and proofs were encoded in one XML dialect nologies, such as OWL-S service profiles [232]. per underlying logical system; primarily, the library of 1.2.3. Critique – Frustration and Discontinuation the Coq higher-order proof assistant (cf. section 4.2) Semantic Web approaches to MKM have so far was used in HELM. Relevant structural properties, in- failed to fulfill the hopes set in them, the aftermath terrelations, and metadata were represented in RDF of HELM and MONET being an instructive exam- (cf. section 4.3). ple. Both groups of researchers were initially enthu- The HELM developers had to carry out a lot of siastic about the possibilities of the emerging Seman- foundational research and development, as suitable tic Web, but then it turned out that few stable and reusable implementations were not available for many reusable implementations existed, and hence a con- of the planned features. As none of the prototypi- siderable amount of resources had to be invested into cal RDF query engines available in 2003 satisfied the developing fundamental building blocks.8 The appli- HELM requirements7, a new one was developed [131, cation of Semantic Web technology to MKM was 129]. As browsers did not sufficiently support Math- stopped even before solutions had matured to an extent ML, a MathML rendering widget suitable for embed- that would have allowed for, e.g., a wide deployment ding into desktop applications was developed [189]. to working mathematicians and usability studies with 1.2.2. Web Services – MONET and Related them. Architectures After 2004, when the HELM and MONET activi- The MONET project pioneered an architecture for ties had ended, the MKM community has given up us- mathematical web services built on Semantic Web ing Semantic Web technologies on a large scale, and technologies [179,81]. MONET services give access the Semantic Web community has focused on different to numeric and symbolic computation systems; ac- application areas. It has been suggested that, after the cess to proof assistants or digital libraries was envis- pioneering phase, it was hard to obtain research fund- aged but not pursued. MONET services come with a ing for applications of Semantic Web technologies to machine-comprehensible description and can be regis- MKM.9 tered with a central broker. Mathematical expressions From the MKM perspective, the further develop- in queries or computation requests to the broker were ment was as follows: Parts of the HELM technology represented by their functional structure using Open- have survived in an interactive desktop proof assis- Math (cf. section 4.3). MONET also required founda- tant [50], whereas the web frontend and the RDF- tional work to be done. OWL and the RACER reasoner based components have been discontinued. Contempo- were already found suitable for the internal descrip- rary mathematical web services work without Seman- tion of services and problems and computing matches. tic Web technology. While large parts of the influen- However, the frontend XML languages for service de- tial OpenMath community had been involved into MO- scriptions and queries (which the broker then trans- NET, which heavily relied on Semantic Web technolo- lated to OWL) had to be designed from scratch. Fur- gies, the current driving force of research symbolic thermore, the OWL reasoners of that time could not computation web services, the SCIEnce project (Sym- bolic Computation Infrastructure for Europe [203]), ical semantics is needed to improve, or even enable, does not use “standard” Semantic Web service tech- certain applications of Linked (Open) Data. nologies at all: SCSCP (Symbolic Computation Soft- Mathematics is a ubiquitous foundation of science, ware Composability Protocol [133]) is a lightweight technology, and engineering. Some of these applica- XML protocol using TCP sockets, or alternatively tion areas are already well represented on the Web of SOAP, whose communication semantics heavily relies Data, but their mathematical foundations are not. Hav- on a custom OpenMath vocabulary. ing them represented as well would enable a whole range of new applications: 1.3. Mathematics on the Semantic Web – Why Retry General-purpose Mathematical Knowledge: The in- Now? adequate representation of mathematical knowl- edge in Wikipedia has been criticized in sec- Web 2.0 applications are attracting an increasing tion 1.1.2. DBpedia, the linked open dataset ob- number of working mathematicians. The usage of Se- tained from Wikipedia [99], inherits these limi- mantic Web technologies to improve MKM has been tations. Such limitations – in DBpedia and else- investigated, albeit without becoming mainstream yet. where – forced the Polymath collaborators men- This section argues why a new combination of Web 2.0 tioned in section 1.1.1 to search for previous pub- and Semantic Web technologies is needed to address lications of refutations of P 6= NP “proofs” by them, and why such a solution is now feasible. keyword. 1.3.1. Combining Sem. Web and Web 2.0 for MKM Statistics: Public sector information, increasingly be- The combination of Web 2.0 and Semantic Web ing published as Linked Data by the US, UK, technology has already proven successful in some and other governments [205,103], has been used fields, including semantic wikis and Linked Data to provide, e.g., localized information retrieval mashups [43]; however, it has hardly been applied to about political representatives, crime statistics, MKM yet. From the point of view that mathematicians and hospital waiting list statistics [186]. Statisti- are already using the Web 2.0, it seems feasible to in- cal datasets contain values derived from ground crementally enrich such existing applications with Se- values, or from other derived values using math- mantic Web technology without scaring users away – ematical functions. The derivation can be as sim- an approach that has succeeded with general-purpose ple as counting; note that counts not only exist systems such as Semantic MediaWiki [22] or the RDF- in proper statistical datasets, but also in statis- enabled Drupal 7 [91], which are now mainstream. tics about any kind of datasets: VoID (Vocabu- A similar rewrite of the system powering the Planet- lary of Interlinked Datasets) defines properties for Math encyclopedia (cf. section 1.1.1) is currently in expressing statistical characteristics of datasets, progress [152]. From the point of view that applying such as the number of distinct subjects [40, sec- Semantic Web to MKM had been tried without success tion 4.6]. Planning data collection from statistical datasets and interpreting collected data requires a before the Web 2.0 era, ZACCHIROLI gave two reasons why a hypothetical retry of HELM (cf. section 1.2.3) notion of mathematical provenance of their data would benefit from Web 2.0 technology [231]: Math- points. (Section 5.2 outlines a possible solution.) ematical content would become interactively editable Publication Databases: The RKB Explorer ACM directly on the Web, and projects like PlanetMath have linked dataset [18] classifies the scientific publi- proven that there is “a community of people interested cations of the ACM according to their Comput- in collaboratively authoring rigorous mathematics on ing Classification System (cf. section 4.3.5). Still, the web” [231].10 Furthermore, HELM or MONET it is impossible for a Linked Data agent to un- derstand that a publication merely classified as would now benefit from a much wider availability of “F.1.3 Complexity Measures and Classes” actu- stable libraries and tools. With SPARQL, for exam- ally deals with the P and NP complexity classes, ple, there is now a standardized and widely supported and how they are defined. Thus, a mathematician query language for RDF. who has developed a theory is unable to find out 1.3.2. What MKM can Contribute to the Sem. Web whether or how other mathematicians have built Conversely, there are now also opportunities for on it: Contemporary publication datasets only MKM to give back to the Semantic Web. Mathemat- show what publications cited the original one, but not if they reused the mathematical concept in 1. ontologies defining the domain of interest do question. not exist; Enterprise Applications: Linked Data do not have to 2. they exist but are difficult to find because de- be open; the architecture, as defined in [65], also veloped by small groups for experimentation, works in enterprise intranets. Renault has used lacking advertisement; them for retrieving information about spare car 3. they exist and can be found but they are of parts [204]. Now consider decisions to be made poor quality, not complying with standards or when designing whole cars: They ultimately re- best practices; quire mathematical understanding. An engineer 4. they exist and can be found but there are too looking for an efficient engine for a projected city many, of mixed quality, and it is difficult to car might feed inputs such as the weight of the assess which ones are appropriate for a spe- car, the average length and duration of a trip, the cific use case. most widely available type of fuel and the aver- High-quality machine-readable vocabularies for math- age environment temperature when starting the ematics do exist: The official OpenMath 2.0 Content engine into a mathematical model of the engine in Dictionaries (CDs), for example, defining 260 mathe- order to predict its fuel consumption under these matical symbols – operators, functions, sets, constants constraints. –, have undergone a strict human-driven review pro- e-Science: The above use case is actually about re- cess (cf. section 4.1.2), and there are large machine- producing an experiment – one of the key prin- verified libraries of formalized mathematics (cf. sec- ciples of e-science [60]. Publishing descriptions tion 4.2). Large parts of the MKM community accept of scientific experiments as Linked Data not only them as standard vocabularies for representing math- makes the provenance of their result data explicit ematical expressions, but for the rest of the world – [177] but also makes whole experiments more including the publishers and ultimately the consumers easily accessible and thus reproducible. Fine- of linked data – ZIMMERMANN’s criterion (2) applies. grained reproducibility once more demands a rep- Besides a technical mismatch – they are not available 11 resentation of the mathematical models. Some e- as RDF – there is a cultural mismatch. Mathemat- science datasets include them, e.g. the SysMO ics, due to its practice of rigorously reasoning about SEEK “‘assets catalogue’ describing data, mod- abstract concepts in a self-contained way using a sym- els, . . . , workflows and experiment[s]” [60] from bolic notation, is generally perceived as hard and inac- systems biology of microorganisms [23], whose cessible (see, e.g., [112]). The average computer sci- publication as Linked Data is in progress (cf. entist, whose work builds on a very restricted area of [60]). Currently, the mathematical models are applied discrete mathematics, is not immune to such stereotypes. By integrating mathematics into the Web given as Content MathML formulae (cf. sec- of Data, using the techniques explained in this article, tion 4.1.1) deeply nested into XML files and thus we can take it out of the Ivory Tower. not directly accessible via URIs. A mathemati- cian who has developed the mathematical model 1.3.3. Structure of this Article for a scientific experiment cannot see how the This article reviews vocabularies – ontologies and model is applied. (The OntoMODEL tool, re- languages – that are suitable for contributing mathe- viewed in sections 2.4 and 4.3.6, is a step into that matics to the Semantic Web, particularly the Web of direction.) [Linked] Data. The remaining sections are structured as follows: Section 2 provides an abstract overview Thus, in order to enhance current applications of of the structures of mathematical knowledge and thus Linked Data towards mathematics, dataset publish- the background knowledge about the domain that is ers need a mathematical vocabulary. The quality of needed to assess the aptitude of existing ontologies Linked Data vocabularies – often designed in an ad and languages for representing mathematical knowl- hoc mapping of existing database structures to RDF edge adequately to the applications described above. – and hence of the linked datasets is often low (see, Section 3 defines the scope of this survey and estab- e.g., [142]). ZIMMERMANN has observed the follow- lishes requirements for ontologies and languages. Sec- ing reasons for vocabularies being of a low quality tion 4 provides a comprehensive review and concludes [233]: with recommendations on what ontologies and lan- guages should be used on the Web of Data. Section 5 ties of the mathematical concepts can be inferred. In explains techniques for integrating non-RDF represen- a usual mathematical document, such properties are tations, which are still ubiquitous in the MKM domain, first asserted and then proven – or refuted. Often, the into the Web of Data, using the ontologies reviewed. choice of what properties of a concept to model as ax- Section 6 tries to predict the benefits of that and points ioms is arbitrary and merely follows established con- out further research directions. ventions. All kinds of properties of concepts are some- times subsumed under the term statements. This is the case in the OMDoc representation language (cf. sec- 2. Background: Structures of Mathematical tion 4.1.3), which distinguishes symbol declarations Knowledge and axioms, definitions13, assertions (theorems, lem- mas, corollaries, etc.), proofs (which prove assertions Before we can represent mathematical knowledge by applying inference rules to axioms and previously on the Semantic Web, we have to understand its struc- proven theorems), and examples. Not all assertions in a tures. Realistic MKM applications, in domains where realistic mathematical knowledge base have to be true: mathematics is applied but also in pure mathemat- There can be conjectures whose truth is not yet known, ics, do not only operate on logical and functional as well as wrong assertions that have been refuted by structures but also require information about the (non- counter-examples but are kept for instructive purposes. mathematical) application context, about project orga- Groups of closely related symbols and their properties nization and management (such as “What theorems form theories. When reusing mathematical symbols, are still lacking a proof”), about discussions that au- their names are often qualified by their theory for dis- thors and users hold about the mathematical knowl- ambiguation, i.e. theories act as namespaces for sym- edge (such as “I don’t understand what we need this bols; this is also reflected by speaking of the “home definition for”), etc. There is little literature about these theory” of a statement. For example, both the theory of structures. Working mathematicians often use them real numbers and the theory of functions on real num- without reflecting on them. Computer scientists and bers have an “addition” operator. The latter can be de- knowledge engineers have to reflect on them but often fined pointwise in terms of the former, but both remain do so from the point of view of a system specialized different; for example, one cannot use either of them for a particular task – e.g. checking first-order logic to add a number to a function. proofs – and its particular conceptual model and rep- In the context of theories, statements can be dis- resentation language. Thus, the review given here is tinguished more precisely into constitutive statements influenced by literature on concrete systems, models, (axioms and definitions), which determine the mean- languages, and ontologies, but tries to abstract from ing of a theory, and non-constitutive ones, such as the- that. orems or proofs, which “only illustrate the mathemati- cal objects in the theory by explicitly stating the prop- 2.1. Logical and Functional Structures erties that are implicitly determined by the constitutive statements” [147, chapter 15.1]. Moreover, the logical Mathematical knowledge has a three-layered log- language used to express the statements in a theory can ical structure of objects – composed of symbols –, itself be modeled as a theory, then called meta-theory. statements, and theories12. Symbols comprise opera- For example, the theory of commutative groups can be tors, functions, sets, and constants. New mathematical formalized with FOL as a meta-theory. FOL provides concepts (i.e. symbols) can be defined, possibly based the universal quantifier that is needed for stating the on concepts defined previously. A mathematical ob- group axiom of commutativity as ∀a, b ∈ G.a ◦ b = ject can be a single symbol, or a compound, such as b ◦ a. a complex number, an application of a function to ar- For knowledge management tasks, which involve, guments, or a derivative. Here, we call their structure e.g., reuse of theories or management of theory changes, “functional”, as they are built recursively from apply- it has been found useful to build theories on a minimal ing function or constructor symbols to other objects. set of axioms, and to model a whole field of mathe- Some of the properties of mathematical symbols or ob- matics as a strongly interconnected graph of “little the- jects are specified as axioms. Axioms are expressed ories” reusing each other (cf. [113]). The connections as formulae in a logical language, e.g. first-order logic are called theory morphisms or views. Some of these (FOL). By applying rules of that logic, other proper- views are given by definition – then called imports –, others are postulated and then have to be proven. The- 2.2. Rigorous Language and Symbolic Notation ory graphs allow for modeling hierarchies from ab- stract to concrete concepts; for example, the theory of Mathematical textbooks and other publications pre- real numbers would, via some intermediate steps, im- dominantly consist of natural language (in its very own port the theory of groups, and the morphisms along style; cf. figure 1 and [234,215]) intermixed with for- these imports would map the general operator of a mulae. In such sources of mathematical knowledge, group via multiple inheritance to the specific addition the higher-level logical structures (e.g. theories) are of- ten less obvious, which suggests making the knowl- and multiplication operators of the real numbers. This edge accessible from its discourse structure. Rhetori- use of theories extends beyond mere namespacing and cal Structure Theory (RST [169]) is a general-purpose has particularly been adopted for the structured speci- model for that, which divides a text into spans, often fication and verification of software (see, e.g., [53]). down to the level of subordinate clauses. RST has a Logical/functional structures can be expressed at rich vocabulary of relations between nuclei (essential different degrees of formality: Often, an author starts text spans) and their satellites (spans that provide ad- a document by sketching a few formulae and some ditional information); for example, a satellite can give textual notes. Later, the content is elaborated both evidence to a nucleus, provide background informa- into the formal and into the informal direction: A tion to facilitate understanding, or define the context in sloppy formula is written more rigorously, rigorous which the nucleus is to be interpreted. Figure 2 models text is formalized in a certain mathematical foundation a phrase from figure 1 according to RST. (meta-theory)14, taking previously formalized knowl- Note, however, that statement- and object-level edge into account, and natural language explanations structures are usually obviously identifiable in mathe- are added to formalized knowledge (see, e.g., [147, matical text as shown in figure 1. Here, a sufficiently chapter 4]). Both directions can, in principle, be au- fine-grained model of these structures – covering, e.g., tomated: Natural language processing techniques can inline definitions and proof steps – may lead to a mathematics-specific refinement of RST. aid formalization (see, e.g., [117]), whereas proof ex- For sections and chapters on the upper levels of planation helps to generate natural language from for- a document, several models of discourse in scientific malized knowledge (see, e.g., [115]). These solutions, publications have introduced more convenient coarse- however, can not yet cope with the full complexity grained blocks that correspond to the usual sections of of mathematical knowledge as it occurs in practice. a publication, and reserve RST for markup of an in- Particularly the automated disambiguation of symbolic termediate granularity [128,77]. A typical document in notation (see below) is hard, as the surrounding text one of these models starts with an abstract and a mo- often has to be taken into account for disambigua- tivation and ends with a conclusion and a list of ref- tion [117]. erences, and has some sections in between that pro- One aspect that is not restricted to logical/func- vide background knowledge, explain the actual contri- tional structures but has been investigated most deeply bution of the paper, demonstrate practical applications, for them is dependency. For the purpose of manag- summarize the results of experiments or evaluations, ing mathematical knowledge, dependency can be de- review the state of the art and related work, etc. Mathematical formulae are communicated to human fined such that B depends on A in the way dp iff a change to A may have an impact on the property readers in a two-dimensional notation, whose com- plexity is owed to the possibility to define new seman- p of B. To make this definition precise, one has to tic symbols at will.15 Choosing an intuitive notation fix the property p. Different conceptual models and for the concepts dealt with is of great importance to representation languages have done that in different understanding and communication (see, e.g., [194]). ways. The formal language MMT, which constitutes The notation of a symbol is usually introduced with its a subset of the above-mentioned OMDoc, considers first declaration, typical phrases being “We will denote dependency w.r.t. logical well-formedness [195, chap- by Z the set . . . ”, “The notation aRb means that . . . ”, ter 8.4]; examples for that are given in 4.3.2. The Math- etc. [215]. Notation can be conceived as a many-to- Lang language (cf. section 4.1.4) considers depen- many mapping of structures of mathematical knowl- dency w.r.t. the reader’s ability to understand a knowl- edge – primarily logical/functional structures – to an edge item [199,143]. arrangement of glyphs on paper. The notation chosen
Suppose that ...... provided m 6= 1.
Let M be ...... Assume that ...... Then ...... , unless m = 1.
Write ...... with g a constant satisfying ......
Fig. 1. Typical phrase patterns for theorems [215]
with g a constant Let m be . . . Suppose that . . . Then . . . Elaboration Condition Elaboration satisfying . . .
Fig. 2. RST markup of a theorem (nuclei with thick outline) for a particular object in a particular document is deter- Metadata can be embedded into the data they de- mined by a number of presentation context dimensions scribe, or point to the data from outside (“standoff (examples taken from [176] and [182]): markup”). This section focuses on metadata that are so closely related to the data that embedding them language and culture: the French/Russian notation makes most sense in the interest of uniform manage- k of the binomial coefficient Cn vs. the German/En- ment workflows. In mathematical practice, subjects n glish notation k ; see [167] for details annotated with metadata can be very fine-grained, as level of expertise: the explicit notation of multiplica- the following blackboard-style example shows: tion as a · b, which is common in primary school, vs. the more advanced omission of the operator symbol in the notation ab b−1(( a−1a )b) = b−1(eb) = ... area of application: The square root of −1 is written - as i in most fields, whereas electrical engineers We learned that last week write it as j to distinguish it from the current I. community of practice: People with a set theory Administrative metadata describe the lifecycle and background tend to include 0 in the set of natural revision history of a resource, the data format and the numbers , whereas those with a number theory N usage requirements, copyright information, as well as background tend to start with 1.16 other general-purpose information. The most widely individual preference: Some mathematicians, who used metadata vocabulary is Dublin Core [26], which prefer completely idiosyncratic notations when covers general bibliographical information, but also el- working on their own, translate other articles into ementary licensing and versioning information. Its se- their own notation and translate their own articles mantics is rather weak, but it is widely supported, e.g. to a more conventional notation before publica- by search engines. The Dublin Core Metadata Ele- tion [138, pp. 166–167]. ment Set [108] provides a basic vocabulary, which the The greatest notational variety has been observed more modern DCMI Metadata Terms vocabulary ex- for mathematical symbols. From the level of state- tends in a backwards-compatible way [100]. Dublin ments upwards, notation – such as the font chosen for Core also paves the path to a more comprehensive keywords like “Definition” – is more standardized and domain-specific description of resources, in that it therefore usually not a subject of research.17 is designed to be complemented by domain-specific classification schemes, whose entries are usually al- 2.3. Mathematics-Specific Metadata phanumeric codes. For mathematical publications, the MSC (Mathematics Subject Classification [35]) pre- In an expressive representation of mathematical vails; this article would be classified as 68T30, where knowledge, it is hard to draw a line between data and 68 is computer science, 68T is artificial intelligence, metadata. This article, in line with prior work on meta- and 68T30 is “knowledge representation”. GAMS, the data in MKM [119], considers all of the previously Guide to Available Mathematical Software [4], clas- mentioned structures data of primary interest, whereas sifies more fine-grained things, namely mathematical the remaining, mainly administrative and application- problems, for example, H2a1 = “one-dimensional fi- specific information are considered metadata. nite interval quadrature”. Further metadata vocabularies cover the settings in into scientific publications, which, e.g., make claims which mathematical knowledge is applied. For exam- and argue about claims made in other, cited publica- ple, the Learning Object Metadata (LOM [140]) de- tions. This has often been studied in combination with scribe educational properties of resources, such as their rhetorical structures; see [128] for an overview. level of difficulty or interactivity, their coverage of top- The other kind of scientific discourse is held exter- ics (e.g. in terms of classification systems), and their nally of the representations of its subjects, e.g. in dis- intended audience. A vocabulary inspired by LOM cussion forums. This perspective has been studied in has been used in the mathematical e-learning system the context of collaborative problem solving; the most ActiveMath [176]. common models employed in knowledge engineering have been derived from IBIS (Issue-Based Information 2.4. The Application Environment System [153]). The DILIGENT argumentation model has been developed in the context of the namesake Again, the application environment can be modeled collaborative ontology engineering methodology with as data rather than metadata, given an appropriate con- the design goal of making arguments more focused ceptual model of the application domain. We mention than in plain IBIS in order to make design decisions three notable examples: more traceable [213,214]. A DILIGENT argumenta- SWEET (Semantic Web Earth and Environmen- tive thread starts with raising an issue, e.g. verbalizing tal Terminology [21,198]) is an OWL ontol- a requirement for the ontology to be designed or point- ogy that describes 4600 concepts in 150 mod- ing out a problem with its current state. An issue can ules from fields related to mathematics, such be resolved by implementing a proposed and approved as physics, chemistry, biology, geology, and as- solution idea. About issues and ideas, the participants tronomy. These modules build on a foundation can state objective arguments or their subjective posi- of general concepts of mathematics (e.g. func- tion. tions), natural science, and space (e.g. coordi- The generic notion of issues and ideas can be refined nates). SWEET’s model of mathematics does not to capture domain-specific problem and solution types. intend to be as elaborate as the structural on- The most common problem with items of mathemat- tologies reviewed here, but SWEET provides a ical knowledge in knowledge collections, as reported good showcase of how to integrate knowledge by the 25 participants of a survey that we have con- about mathematics with knowledge about its sci- ducted among domain experts, is that they are wrong, entific application domains. One example of how followed by being incomprehensible, their truth being SWEET integrates mathematics and science is the uncertain, being underspecified, or redundant [164]. concept of a gravity field, defined as a vector field Further cases include knowledge items of which it whose force is gravity. A vector field is a subcon- was not clear whether they were useful, and knowl- cept of a function whose result is a vector, and a edge items expressed in an uncommon style. Prob- vector is defined as an array of scalar elements. lems are most commonly solved by directly improv- GeoSkills is an OWL ontology describing topics, ing the affected knowledge item, by splitting it into competencies and educational contexts related to more than one, or by deleting it altogether. Knowledge interactive geometry [166], albeit without a con- items, issues, and ideas cannot be combined arbitrar- nection to a structural model. ily. For example assertions, proofs, and examples can OntoMODEL is a tool for managing mathematical be wrong, whereas a notation can rather be inappro- models in pharmaceutical product development priate, misleading, or hard to read and write. Then, if [211]. Its underlying OWL ontology of mathe- some knowledge item is wrong, it could be deleted, or matical models captures notions such as assump- fixed in place, or kept as an instructive bad example, tions, parameters, and dependent variables in a whereas splitting it into two parts would not solve that way independent from the specific application do- problem. main. Combining both perspectives on discourse remains to be done for mathematics but would allow for cap- 2.5. Discussions in Mathematical Collaboration turing further important mathematical practices. In his work on “Proofs and Refutations” [154], LAKATOS Previous research has produced ontologies for two has studied how discussions about mathematical knowl- kinds of scientific discourse: One kind is embedded edge items materialize into new mathematical knowl- edge. Consider a discussion thread in which a problem able subsets of first-order logic, such as Descrip- with a proof is pointed out, e.g. that it only covers a tion Logic (DL) and Horn Rules (cf. [111] for an specific case and should be generalized. This discus- overview). In contrast to that, most areas of math- sion provides the rationale for a later, generalized re- ematics require first-order or even higher-order statement of the respective theorem and its new proof logic to be captured faithfully. Tools for verify- and therefore could be integrated into the text that en- ing or even constructing proofs in these logics closes the theorem and its proof. (section 4.2 mentions some of them) exist, within the inherent limitations of calculi for these log- ics. They usually do not scale across the Web but 3. Problem Statement and Requirements for instead require a full representation of a prob- Representing Mathematical Knowledge on the lem and all of its prerequisites and foundations Semantic Web in main memory. A proof of concept for how well a subset of first-order logic can approxi- This article reviews existing ontologies and lan- mate the formalization of mathematical concepts guages for representing mathematical knowledge w.r.t. has been presented by BRÖCHELER (cf. [76], their potential to enable better MKM on the Web, i.e. which includes references to earlier related ap- to advance the state of the art discussed in section 1. In proaches). However, this investigation, as well as accordance with the wide definition of “MKM” cited related ones, had not been motivated by the de- in section 1.2, we are interested in what potential the sire to provide an alternative to existing proof as- representational capabilities of these ontologies and sistants, but by information retrieval use cases, languages have (i) for supporting mathematicians, sci- which we briefly discuss in section 4.3.6. In sum- entists, and engineers in producing and using mathe- mary, we treat mathematical reasoning as a spe- matical knowledge, (ii) for supporting educators and cial task to be supported by specialized tools. We students in teaching and learning mathematics, (iii) for expect knowledge representations in terms of the publishing mathematical textbooks and disseminating ontologies and languages that we review to guide new mathematical results, and (iv) for supporting li- agents and their users in finding the right special- brarians and mathematicians in cataloging and orga- ized tool for a mathematical reasoning task – in nizing mathematical knowledge. analogy to the position that the MONET archi- tecture reviewed in section 1.2.2 took on compu- 3.1. Limiting the Scope of this Survey tation. For that reason, we require the ontologies and languages reviewed to capture the logical and We consider the following two aspects out of scope functional structures of mathematical knowledge of this article: as comprehensively as needed (see requirement Tools and Services: This article does not review ex- S.L below), so that it can be passed on to special- isting MKM tools and services based on the on- ized tools – possibly after a translation into their tologies and languages reviewed. Instead we re- native language, but preferably without losing in- quire ontologies and languages to provide for a formation. machine-comprehensible representation of math- ematical knowledge (see requirement C.A be- 3.2. Requirements low), and conjecture that for such languages it should in principle be possible to develop any de- From the review of the state of the art of mathe- sired tool support; we refer to [160, chapter 6] matics on the Web, we can infer as design goals for for an in-depth treatment of state-of-the art MKM Semantic Web applications for MKM the ability to tools. reuse knowledge across knowledge bases, information Reasoning: Another aspect that we consider out of retrieval adequate to the structures of the knowledge, scope is mathematical reasoning in the narrow and integration with mathematical services, such as au- sense, i.e. within the dimension of logical and tomated reasoning and computation, without compro- functional structures. While Semantic Web tech- mising comprehensibility for human end-users. Hav- nology is concerned with reasoning, it is, in the ing reviewed the structures of mathematical knowl- interest of scalability, largely limited to decid- edge in section 2 and having defined the scope of this survey in section 3.1, we are now ready to specify more C.A: to arbitrary automated agents – therefore, precise requirements for ontologies and languages18: the knowledge SHOULD be self-describing S: All of the previously reviewed structures of mathe- in a machine-comprehensible way. matical knowledge SHOULD be supported; where C.H: to human users – therefore, published human- this is impossible, missing dimensions MUST be comprehensible documents generated from compensated for by language extensions along representations in the given language SHOULD the criteria L.E and L.→ below. We subdivide this retain semantic annotations, so that assistive criterion as follows: services can retrace the original knowledge S.L.{O,S,T}: logical/functional structures: math- and make it available to the user on request, ematical objects, statements, theories e.g. integrated into a user interface. S.R: rigorous language or rhetorical structures Tables 1 and 2 at the end of this section summarize S.N: notation how well the languages and ontologies reviewed sat- S.M: metadata isfy these requirements. S.D: discussions F: Mathematical knowledge occurs in different de- grees of formality; applications targeting human 4. Review of Languages and Ontologies users and automated agents require both formal and informal representations. Therefore, This section reviews existing languages and on- F.R: the language SHOULD be able to represent tologies for representing mathematical knowledge knowledge in a wide range from informally according to the requirements established in sec- to fully formalized, and tion 3.2. Besides Semantic Web ontologies in the nar- F.C: many degrees of formality, including for- row sense, other machine-comprehensible represen- malizations in multiple foundations, SHOULD tation languages are taken into account. This is be- be able to coexist in one document, inter- cause they are widely used in science, technology, en- linked with each other. gineering, and mathematics, even preferred over on- L: In real-world applications, mathematical knowl- tologies in certain settings, and most existing machine- edge is combined with multiple dimensions of comprehensible mathematical knowledge is available non-mathematical knowledge. Therefore, a lan- in these languages rather than RDF. We do, however, guage SHOULD support interlinking of these di- pay attention to the possibility to translate such repre- mensions by rich annotation facilities, but also sentations to RDF. give authors the freedom to represent some knowl- edge by external means and link it to representa- 4.1. XML-based Semantic Markup Languages tions in the given language. In detail, L.A: the language MUST allow for attaching non- XML languages share the URI foundation with mathematical metadata and annotations to RDF. All semantic XML languages allow for assign- mathematical knowledge items, regardless ing IDs for the inner nodes of their tree-structured rep- of their granularity, resentation (i.e. for the elements), via XML ID [171]. L.→1: it MUST allow for linking mathematical Together with the URI of the XML document, that knowledge items to external mathematical allows for global identification and linking and thus or non-mathematical resources, and satisfies requirement L.←. XPointer [122] allows for L.←2: it MUST be possible to address all math- identifying more complex subsets of an XML repre- ematical knowledge items expressed in the sentation, such as node or text ranges, but is rarely sup- given language from outside, in order to link ported. Note, however, that additional work is required external representations to them, for exam- to integrate semantic XML markup with RDF-based ple standoff markup or existing representa- Linked Data; this is discussed in section 5.1. tions in different languages. C: Knowledge represented in a language SHOULD be 4.1.1. Content MathML 3 and OpenMath 2 Objects comprehensible MathML (Mathematical Markup Language [52]) is an XML language that was originally conceived 1read “L out[going]” for embedding mathematical formulae into HTML 2read “L in[coming]” web pages – which is still its main purpose. It fea- tures a presentation-oriented sublanguage (Presenta- reference encoding of that model is a lightweight XML tion MathML) but also a semantics-oriented one (Con- language, roughly comparable to RDFS in expressiv- tent MathML); the latter is covered here. MathML al- ity; the OMDoc language (cf. section 4.1.3) offers a lows for a fine-grained mix of semantic and presen- more expressive but compatible alternative. Alterna- tation markup (“parallel markup”) – thus addressing tively, the model has been implemented by ontologies requirements F.* but also C.H, considering it as an (cf. section 4.3.2). enabling technology for interacting with formulae in The official CD collection reviewed by the Open- published documents (cf. [116]). It is possible to em- Math Society (cf. [80, section 4.5]) defines 260 sym- bed or link non-mathematical information (require- bols from arithmetics, set theory, FOL, algebra, cal- ments L.A and L.→). The task of defining notation, culus, as well as transcendental and statistical func- i.e. mapping semantic structures to a human-readable tions [27]. These CDs do not provide a full formal- presentation, is left to other, non-mathematical lan- ization; instead, developers of, e.g., CAS are supposed guages such as XSLT. to use the CDs as specification manuals when imple- The related OpenMath [80] language has originally menting phrasebooks, which translate OpenMath ob- been invented in the mid-1990s to facilitate data ex- jects into the native languages of such systems. change between computer algebra systems (CAS) but has since been aligned closely with Content MathML, 4.1.3. More Expressive CDs and Documents: leaving only syntactic differences in the latest versions OMDoc 1.3 of both languages [97]. OMDoc (Open Mathematical Documents [12,147]) Both languages are limited to representing the func- is an XML language for representing mathematical tional tree structure of mathematical objects (require- knowledge that has a particularly rich vocabulary ment S.L.O) but are frequently integrated into host for logical/functional structures (requirements S.L.*). languages that cover further structures. Many XML OMDoc supports MathML and OpenMath objects, and languages already embed MathML or OpenMath offi- its theories are compatible to OpenMath CDs. Beyond cially (see below), whereas others allow for extending that, OMDoc adds vocabulary for formal and informal their vocabularies accordingly. Listing 1 shows a doc- statements, modular theories, as well as narratively ument that contains MathML and OpenMath objects. ordered documents, rhetorical structures (requirement Mathematical objects consist of numbers, variables, S.R; cf. SALT in section 4.3.3), notation definitions symbols, and applications of objects to other objects. (requirement S.N). By integrating RDFa [37] for arbi- Content MathML comes with a default supply of sym- trary metadata and links [160, chapter 5], conforming bols that cover high school and introductory univer- to the specification of an RDFa host language, OM- sity education. Mathematical objects reference these Doc furthermore satisfies requirements S.M, L.A and symbols by URI. Their semantics is defined in ex- L.→, and makes representations more comprehensible ternal vocabularies called OpenMath Content Dictio- to agents (if they are aware of RDFa and if the vo- naries (CDs); authors can create and use additional cabularies used for annotation are published as Linked CDs as needed. The machine-comprehensibility of Data; requirement C.A). On each structural level, OM- MathML/OpenMath representations depends on the Doc supports a wide range of degrees of formality, degree of formality of the CDs. Suggested alternative from unstructured text to a full formalization – thus RDF representations of MathML are discussed in sec- eliminating the need for phrasebooks at least on object tion 4.3.2. level –, which can coexist in an interspersed, literate 4.1.2. Extensible Mathematical Vocabularies: programming style to serve the needs of human- and OpenMath 2 Content Dictionaries machine-oriented applications. OMDoc particularly An OpenMath CD is a collection of (usually closely allows for statement-level parallel markup that inter- related) definitions of symbols. The abstract model of weaves textbook-style natural language with a purely a CD covers mathematical objects – used to formally formal representation of the same statements, down represent properties of a symbol, as opposed to plain to linking subclauses in the text to the corresponding text descriptions –, a weak variant of axioms/defini- subterms in formulae, and linking words to symbols or tions and examples on the statement level, and a ba- variables.19 sic notion of theories, plus a limited metadata vocabu- Listing 1 gives an example of the syntax of OM- lary [80, section 4] (requirements S.L.* and S.M). The Doc 1.3, whose specification is currently being final- Listing 1: An OMDoc theory with a declaration (type given in OpenMath) and implicit definition of the exponential function (given in Content MathML)
imports imports, From metaTheory NonConstitutive
hasType Import Statement Notation Definition Constitutive Statement Proof Example Assertion proves
Symbol hasDefinition exemplifies Axiom Definition rendersSymbol
Fig. 3. The core of the OMDoc ontology (slightly simplified) [160]
The o:usesSymbol property flattens the functional that notation definition: structure of a mathematical object by treating all oc- currences of symbols equally, regardless of the depth o:usesSymbol ◦ o:hasNotationDefinition of the expression tree in which they occur. For depen- v o:possiblyUsesNotationDefinition dency w.r.t. validity, there is so far merely a proof of v o:presentationDependsOn v o:dependsOn concept, namely the dependency of a proof on an infer- ence rule or any other axiom or proven assertion used By means of the ontology, this cannot be decided to justify a proof step: definitely, as the knowledge base might have alterna- tive notations for different presentation contexts. Static o:hasStep ◦ o:stepJustifiedBy context matching exceeds the expressivity of a DL on- tology, and in a dynamic setting, where the presenta- v o:validityDependsOn v o:dependsOn tion context depends on the profile of the user view- ing the published document, its notational dependen- cies can only be determined at runtime. Note that, with a structural ontology alone, we can- not check whether an expression is well-formed, or MathLang DRa: The “Document Rhetorical aspect” whether a proof is valid, as we are outside of a particu- (DRa) of the MathLang representation language cov- lar foundation; the validity itself has to be determined ers larger chunks of mathematical text – document sections as well as mathematical statements – and by other means (cf. the discussion in section 3.1). their interrelations, such as a proof justifying a the- The OMDoc ontology also covers notation defini- orem [199,144]. A generic dependency relation has tions for symbols; for example, the exp symbol from been defined, which is used for validating whether the x our example could be defined to render as e . When narrative order of a document respects the logical de- an OMDoc document is published, the presentation of pendencies. Conceptually, this is similar to the state- any formula using that symbol possibly depends on ment level of the OMDoc ontology. The OWL imple- mentation of the DRa vocabulary merely serves as a – Both are, in principle, open for integration with formal specification of the DRa semantics, whereas arbitrary domain knowledge – which would be the validator processes an XML representation of the mathematical knowledge in our case. DRa [199]. A drawback of the DRa ontology is that it cannot easily be extended by, e.g., additional statement Both approaches comprise three ontologies; here, types and additional dependency relations. we explain the model of SALT: PML: PML (Proof Markup Language), an “inter- The Document Ontology models the outline of the lingua for sharing explanations generated by various document – sections, paragraphs, sentences, and automated systems such as hybrid web-based ques- text chunks (in OntoReST: “spans”) below sen- tion answering systems, text analytics, theorem prov- tence level [124]. The latter remain in the orig- ing, task processing, web services execution, rule en- inal representation of the document; SALT pro- gines, and machine learning components” [175], has vides standoff markup via start and end pointers been implemented as an OWL ontology consisting of to their positions in the full text. Additionally, one modules for provenance, information manipulation or can represent the linear order of document units justification, and trust. PML assumes that facts and by numbering them. proofs have been written in some other language and The Annotation Ontology connects instances of the merely adds standoff markup. Resources annotated document ontology with annotations of their that way can be referenced by URI or, in the case rhetorical structure and with background domain of text-based languages such as KIF, by byte off- knowledge, such as the topic of a section [123]. set. The justification module supports unproven con- While rhetorical structures are the primary focus clusions or goals, assumptions, direct assertions, and of SALT, the mechanism is sufficiently general to antecedent→consequent justifications backed by infer- also permit annotation of other structural dimen- ence rules. The provenance module has a vocabulary sions. for describing inference rules – again, not down to the The Rhetorical Ontology covers RST-style rhetori- object level. So far, this is similar to the OMDoc on- cal relations [125]. Their nuclei and satellites, tology. Finally, the trust module allows for expressing subsumed as “rhetorical elements”, are linked degrees of belief in informations and trust in agents. to text spans in the document via the annota- tion ontology. Coarse-grained rhetorical blocks 4.3.3. Scientific Documents that can be applied on top level of a document While logical/functional structures of mathematical are offered as an alternative. (This part of SALT knowledge may occur on their own, e.g. in formal- has now evolved into the Ontology of Rhetorical ized knowledge bases, rhetorical structures are usually Blocks (ORB [84]); the Document Components studied in the context of documents written in, e.g., Ontology (DoCo [206]) provides another recent, A LTEX or an XML language. Two very similar families more comprehensive alternative.) In previous re- of ontologies suitable for modeling rhetorical struc- search, we have investigated the particular suit- tures in mathematical documents are SALT [126,127] ability of SALT for modeling rhetorical structures and OntoReST [183]; further related models and on- in mathematical textbooks by aligning the rhetor- tologies have been reviewed in [128,77]. SALT and ical markup of OMDoc (cf. section 4.1.3) to it OntoReST are relevant for the following reasons: [160, chapter 3.3]. OntoReST provides a stronger – Both have a good coverage of RST-style rhetori- OWL formalization of RST that supports consis- cal structures. tency checking [183]. – Either use case is related to mathematical collab- About the above-mentioned DoCo, note further- oration: SALT focuses on annotating and linking more that it is just one member of a family of Semantic scientific publications on the Web. OntoReST fo- Publishing and Referencing Ontologies (SPAR [20]), cuses on consistency checking in concurrent col- which additionally cover citations, bibliographies, as laborative writing. well as publishing workflows and people involved. – Both allow for an arbitrarily fine-grained anno- tation of phrases. SALT additionally focuses on 4.3.4. Scientific Discourse Ontologies cross-document links for justifying statements by Ontologies formalizing discourse inside scientific citing the claims made [and justified] in external publications are closely related to the above-mentioned publications [127]. ontologies for rhetorical structures; in fact, SALT sup- Fig. 4. The three-layered architecture of the SALT ontologies (simplified) [127] ports both. GROZA et al. provide an overview of fur- When a classification scheme is not available as an ther related document formats and ontologies [128]. ontology, one can use the identifiers of its categories The DILIGENT argumentation model introduced as literal values of metadata fields such as dc:subject. in section 2.5 has been implemented in several vari- A proper ontology implementation, where each cat- ants [213,214,102]. The SIOC (Semantically Inter- egory is a resource of its own, has further advan- linked Online Communities [71,68,70]) ontology, which tages: The hierarchy of categories can be made ex- models user-generated content on the Web and is plicit, and URIs can be used more flexibly in queries. widely supported by Web 2.0 applications, has a In the MONET project, a simple class hierarchy of DILIGENT-inspired argumentation module that al- the GAMS problems has been implemented [178]. lows for a slightly more flexible thread structure DOLOG et al. have turned the ACM CCS, a computer than the original DILIGENT implementations, which science classification scheme, into an ontology [106], makes it applicable in a wider range of Web 2.0 set- drawing on the “classification” vocabulary of LOM; tings [163]. An ontology of mathematical problems the ACM themselves are working on an official Linked and solutions has been provided as an extension of the Dataset36. Similarly, the MSC 2010 has been imple- SIOC argumentation module [164] – which is the only mented as a Linked Dataset, using the SKOS ontol- known occurrence of an ontology specifically captur- ogy [33] (cf. [44,162]), an official release being ex- ing (a subset of) mathematical discourse. pected in early 2012. Combining both models of scientific discourse in one ontology has been pioneered by the align- 4.3.6. Pure and Applied Mathematical Domain ment of SWAN (Semantic Web Applications for Neu- Ontologies romedicine [128]) with SIOC [191]. SWAN models Any instance document of the XML languages and scientific discourse, not exactly in publications, but formalized languages and any dataset expressed in in a distributed knowledge base by pointers to biblio- terms of the structural ontologies reviewed in this sec- graphic records and entities from domain ontologies. tion can be considered a mathematical domain ontol- Its primary target is neuromedicine, but, as SALT, it ogy. In particular, the following knowledge collections supports arbitrary domain ontologies in principle. have been designed for reuse, reviewed or machine- 4.3.5. Mathematical Metadata Vocabularies verified to ensure a high quality, and published with Most of the metadata vocabularies mentioned in sec- stable identifiers – however, usually neither satisfying tion 2.3 have been implemented as ontologies. This the Linked Data criteria nor linked to other collections: is the case with Dublin Core [184] and LOM [141, The official OpenMath CDs have been mentioned in 185]. The mathematics-specific metadata vocabulary section 4.1.2. Reusable OMDoc implementations of a of OpenMath CDs (cf. section 4.1.2) has been imple- large number of logics and translations between them mented as an extension to Dublin Core [156]. Their have been published in a “Logic Atlas”, including var- review status is documented by metadata fields such ious variants of first-order, higher-order, modal, and as status (official, experimental, private, or obsolete), description logic [151,87].37 The collection of math- version, and the date of the next review. ematical, scientific and technological courseware in the Connexions repository has been mentioned in sec- as a tool for aligning domain-specific ontologies – tion 1.1.1, libraries of formalized mathematics in sec- such as the various ontologies that model structures of tion 4.2; however, the latter employ custom identifi- mathematical knowledge as well as mathematical do- cation mechanisms that would first have to be trans- main knowledge. In a scenario where multiple agents lated into URIs. The N3 “math” vocabulary (cf. sec- perform different actions on a heterogeneous collec- tion 4.3.2) declares a fixed set of basic mathematical tion of mathematical knowledge, whose representa- functions, roughly corresponding to the arith1, rela- tion makes use of multiple domain-specific ontologies, tion1, and transc1 OpenMath CDs, without the pur- such an alignment helps to reduce misunderstandings pose of formalizing them. among different services (cf. [173]). Formalization of mathematical domain knowledge As an example, we point out relations between in the decidable first-order logic subsets employed on the DOLCE upper level ontology [173] and domain- the Semantic Web has been attempted, but the possi- specific ontologies reviewed before; however, this bilities are limited. For example, the facts that a dif- merely serves as a pointer towards possible future re- ferentiable function is a function that has a derivative search, as no such alignment has been performed so function and that differentiable functions are continu- far for any of the ontologies reviewed. From a struc- ous functions can be represented in DL, as exemplified tural point of view, DOLCE covers, for example, sev- by BRÖCHELER [76] – but the fact that a differentiable eral notions of parthood, which are sufficiently generic function satisfies a certain ε/δ criterion cannot, as it to comprise both, e.g., a symbol being part of a theory would require higher order logic. BRÖCHELER never- and a section being part of a document, and thus may theless points out use cases for querying a DL formal- serve an agent that explores a knowledge collection ization of domain knowledge in alternation with a rep- along these different structural dimensions. From a resentation of logical structures in terms of a DL on- mathematical domain knowledge point of view, some tology: finding (via the structural ontology) examples of DOLCE’s concepts can be considered abstractions for, e.g., groups (instances of a mathematical concept, of mathematical concepts; for example, a quality such determined via the domain ontology), or finding ap- as “the value of the sin function for x = 0” corre- plicable theorems or definitions of mathematical con- sponds to a mathematical property of a mathematical cepts and (again via the domain ontology) all related object, and its quale (here: the value 0), which is a concepts. (possibly point-sized) region in a quality space, cor- In addition to ontologies representing knowledge responds to a member or subset of some set (here: from the mathematical domain, there are domain on- 0 ∈ R). tologies from fields related to mathematics: GAMS (cf. section 4.3.2) features a directory of software that 4.4. Conclusion solves mathematical problems [4], but only a subset has been made available within MONET. The SWEET Table 1 shows at a first glance that no single seman- domain ontology for science and GeoSkills for inter- tic markup language satisfies all of our requirements active geometry have already been reviewed in sec- for representing mathematical knowledge on the Se- tion 2.4. The mathematical model ontology of the On- mantic Web, but that expressive XML languages and toMODEL tool, also introduced in section 2.4, is par- RDF complement each other. OMDoc’s good results ticularly notable for its usage of Content MathML (em- have, in fact, been achieved only recently, by integrat- bedded into RDF as XML literals) for equations and ing RDFa into the language [160, chapter 6]. The XML variable dependencies [211]. languages, headed by OMDoc (which includes Math- 4.3.7. Upper Level Ontologies ML or OpenMath for mathematical objects), lead the Upper level ontologies, also called foundational on- way w.r.t. coverage of different structures of mathe- tologies, describe general concepts shared by many matical knowledge, as well as combining formal and domains; see [172] for an overview. They often aim informal representations. Moreover, they are reason- at capturing common sense and providing ontological ably well accepted by the MKM community, as op- background knowledge to natural language processing posed to RDF, and most of today’s mathematical do- applications [172]. As upper level ontologies provide main knowledge is available in these XML languages, a shared foundation to which designers of domain- or in formalized languages that have XML transla- specific ontologies can link the latter, they may serve tions. Table 1: How the languages reviewed satisfy the knowledge representation requirements St.a Requirement Structures Formality Linking Compr. S.L.* S.R S.N S.M S.D F.R F.C L.A L.→ L.← C.A C.H OST W MathML 3 ++ – – – – – – ++ ++39 + + + – + W OpenMath 2 Objects ++ – – – – – – + – + – U OpenMath 2 CDs ++c –– – ##–––– # c ef f f f f U OMDoc 1.2/1.3 ++ ++## + ++ + +/++# – ++## + /++ –/++ +/++ –/ #+ C MathLang ++ ++ – – – – – ++ +# – – # – ~W DocBook 5 ++c – – – – +g – – – – +## + – – ~W TEI P5 ++d –––– – ++ ++ + – + – – d g ~W DITA 1.1 ++ – – – – +# – – – + + + – – ~W EPUB 2.0.1/ ++d ––––+–––––+–– DTBook 3 W CNXML 0.7/ ++c + – – – – – – – – + – – CollXML/mdml # W Formalized languages ++ ++ +/++ – + – – – – – + – ~W RDF(a)h 1.0/1.1 (depends on vocabulary, see table 2) ##+ ++ ++ ++ + a Current status of development/deployment: W = widely used; ~W = widely used, but rarely for mathematical knowledge;# U = used in a small, open community; C = used# in a small, closed community; O = obsolete, no longer in use b Legend (summarizing section 3.2): S.* = coverage of knowledge structures (S.L.{O,S,T} = logical/functional structures: mathematical objects, statements, theories; S.{R,N,M,D} = rigorous language or rhetorical structures, notation, metadata, discussions), F.* = degrees of formality supported (F.R = range of degrees, F.C = coexistence of different degrees), L.* = ability to link to non-mathematical knowledge (L.A = annotation, L.→ = outgoing links, L.← = incoming links), C.{A,H} = comprehensibility to agents/humans Symbols: ++ requirement satisfied excellently (and setting an example for other languages), + very well (but leaving room for improvement), sufficiently well for basic modeling tasks, – insufficiently (includes the case that a feature is missing by design) # c built-in support for MathML/OpenMath objects d via MathML extension e Dublin Core and similar vocabularies built in, others available via built-in RDFa f an improvement of OMDoc 1.3 over the previous version 1.2 that has been achieved by integrating RDFa g Dublin Core and similar vocabularies built in, others available via non-RDFa extensions h While most features summarized here apply to RDF in general, we specifically consider the serialization of RDF as RDFa embedded into XML here, as it facilitates the annotation of XML markup (requirements L.A and L.→). Table 2 Structural coverage of vocabularies/ontologies St.a Structures Logical/functional Rhetorical Notation Metadata Discussion Objects Statements Theories ~W N3 Vocabularies + ––––– b O/P OpenMath CD # –– – c O HELM ###+ + –– #–– c O MoWGLI + ++ # –– + – d P OMDoc ++# + – + – – C MathLang DRa #– + – – – – – P PML – ++e ––––– P SALT – – – ++ – – +f P OntoReST – – – ++ – – – P DILIGENT – – – – – – +f P SIOC Argumenta- – – – – – – ++ tion W Dublin Core – – – – – ++g – a see table 1 for a symbol legend b This line merges both OpenMath CD ontologies reviewed; the older one (cf. [79]) is no longer in use. c While the XML markup languages employed by HELM and MoWGLI allow for describing notations, their RDF vocabularies do not. d intentionally delegated to SALT e proofs only f does not cover mathematical discourse, but is extensible to specific domains g can be combined with mathematical classification schemes
RDF, above all, has superior linking capabilities. In MKM practice, RDF standoff markup pointing to Table 2 shows that, given a combination of suitable full XML representations has so far proven most use- 40 structural ontologies , RDF is capable of covering all ful. Particularly in the case of mathematical objects, structures of mathematical knowledge. Finally, RDF’s only selected information relevant for, e.g., informa- linking capabilities and the vast supply of RDF vocab- tion retrieval, is represented in RDF, as opposed to ularies for non-mathematical knowledge – such as vo- representing full n-ary ordered trees using RDF col- cabularies describing application domains of mathe- lections. RDFa embedded into XML has so far only matics (cf. the discussion in section 1.3.2), or user pro- files (e.g., FOAF [74]), or projects (e.g., DOAP [109]) been used in a few OMDoc documents but also seems – provide a unique possibility that is not offered by a promising way to satisfy the given representation re- any other representation language reviewed: model- quirements. ing and processing mathematical knowledge in the Besides these considerations, the availability of context where it is applied or reused, in a machine- tools also advises a division of responsibilities between comprehensible way. However, some structures of XML and RDF.42 Editors and publishing tools for se- mathematical knowledge are not yet covered by on- mantic representations of mathematical knowledge are tologies as deeply as by markup languages (most almost exclusively available for XML or formalized prominently theories), and others have hardly been languages. For most XML languages, translations to covered at all; the latter affects mathematical discourse the native languages of computer algebra systems or (both rhetorical structures and discussions) and mathe- matical notation. The circumstance that each structural proof assistants have been implemented. Conversely, ontology only covers at most two structural dimen- RDF is preferable for information retrieval, except on sions is not a problem, as the RDF data model allows the object level. Both representations are good to have for combining different ontologies by design.41 for a thorough validation; browsers also exist for both. 5. Representing Mathematical Knowledge on the
title=Deolalikar_P_vs_NP_paper&oldid= [36] ActiveMath. ACTIVEMATH. URL http://www. 3654. activemath.org. [16] ProgrammableWeb. Mashups, APIs, and the Web as Platform. [37] Ben Adida, Mark Birbeck, Shane McCarron, and Ivan URL http://www.programmableweb.com. Herman. RDFa Core 1.1. Syntax and processing [17] ProofWiki. URL http://www.proofwiki.org. rules for embedding RDF through attributes. W3C [18] acm.rkbexplorer.com. Advanced Knowledge Technologies Working Draft, World Wide Web Consortium (W3C), (AKT). URL http://acm.rkbexplorer.com. March 2011. URL http://www.w3.org/TR/2011/ [19] Rhaptos Trac: Collection structure redesign / WD-rdfa-core-20110331/. inception. 2009. URL https://trac. [38] Waseem Akhtar, Jacek Kopecký, Thomas Krennwallner, and rhaptos.org/trac/rhaptos/wiki/ Axel Polleres. XSPARQL: Traveling between the XML and CollectionStructureRedesign/Inception? RDF worlds – and avoiding the XSLT pilgrimage. In Bech- version=54. hofer et al. [59], pages 432–447. [20] SPAR – Semantic Publishing and Referencing. 2011. URL [39] Jesse Alama, Kasper Brink, Lionel Mamane, and Josef Urban. http://purl.org/spar. Large Formal Wikis: Issues and Solutions. In Davenport et al. [21] Semantic Web for Earth and Environmental Terminology [96], pages 133–148. (SWEET). NASA, 2011. URL http://sweet.jpl. [40] Keith Alexander, Richard Cyganiak, Michael Hausenblas, nasa.gov/. and Jun Zhao. Describing Linked Datasets with the VoID [22] Semantic MediaWiki. URL http:// Vocabulary. W3C Interest Group Note, World Wide Web semantic-mediawiki.org. Consortium (W3C), March 2011. URL http://www.w3. [23] SysMO-DB SEEK. URL http://www.sysmo-db. org/TR/2011/NOTE-void-20110303/. org/seek/. [41] American Mathematical Society. MathSciNet Mathemati- [24] TEI – ontologies SIG. URL http://www.tei-c.org/ cal Reviews on the Net. URL http://www.ams.org/ Activities/SIG/Ontologies/. mathscinet/. [25] Tricki. A repository of mathematical know-how. URL http: [42] Ola Andersson, Robin Berjon, Erik Dahlström, Andrew Em- //www.tricki.org. mons, Jon Ferraiolo, Anthony Grasso, Vincent Hardy, Scott [26] Dublin Core Metadata Initiative. URL http://www. Hayman, Dean Jackson, Chris Lilley, Cameron McCormack, dublincore.org. Andreas Neumann, Craig Northway, Antoine Quint, Nan- [27]O PENMATH content dictionaries. URL http://www. dini Ramani, Doug Schepers, and Andrew Shellshear. Scal- openmath.org/cd/. able Vector Graphics (SVG) Tiny 1.2 specification. W3C [28] Wolfram|Alpha widgets. URL http://developer. Recommendation, World Wide Web Consortium (W3C), De- wolframalpha.com/widgets/. cember 2008. URL http://www.w3.org/TR/2008/ [29] Connexions – XML languages. URL http://cnx.org/ REC-SVGTiny12-20081222/. help/authoring/xml. [43] Anupriya Ankolekar, Markus Krötzsch, Thanh Tran, and [30] The Coq proof assistant. URL http://coq.inria.fr/. Denny Vrandeciˇ c.´ The two cultures: Mashing up Web 2.0 and [31] The n-Category Café. A group blog on math, physics and the Semantic Web. Web Semantics, 6(1):70–75, 2008. philosophy. URL http://golem.ph.utexas.edu/ [44] Ioannis Antoniou, Charalampos Bratsas, Anastasia Dimou, category/. Patrick Ion, Christoph Lange, and Wolfram Sperber. Math- [32] nLab. URL http://ncatlab.org/. ematics Subject Classification Linked Wiki. URL http: [33] SKOS Simple Knowledge Organization System. World Wide //sci-class.math.auth.gr/MSCLW/. Web Consortium (W3C). URL http://www.w3.org/ [45] Lora Aroyo, Grigoris Antoniou, Eero Hyvönen, Annette ten 2004/02/skos/. Teije, Heiner Stuckenschmidt, Liliana Cabral, and Tania Tu- [34] Zentralblatt MATH. URL http://www. dorache, editors. The Semantic Web: Research and Applica- zentralblatt-math.org. tions (Part I), number 6088 in Lecture Notes in Computer [35] Mathematics Subject Classification MSC2010, 2010. URL Science, 2010. Springer Verlag, Berlin/Heidelberg. http://msc2010.org. [46] ArXiv. arxiv.org e-Print archive. URL http://www. and Applications, number 5021 in Lecture Notes in Computer arxiv.org. Science, 2008. Springer Verlag, Berlin/Heidelberg. [47] Andrea Asperti and Claudio Sacerdoti Coen. Some consider- [60] Sean Bechhofer, John Ainsworth, Jiten Bhagat, Iain Buchan, ations on the usability of interactive provers. In Autexier et al. Philip Couch, Don Cruickshank, David De Roure, Mark [54], pages 147–156. Delderfield, Ian Dunlop, Matthew Gamble, Carole Goble, Da- [48] Andrea Asperti, Bruno Buchberger, and James Harold Dav- nius Michaelides, Paolo Missier, Stuart Owen, David New- enport, editors. Mathematical Knowledge Management, man, and Shoaib Sufi. Why linked data is not enough for sci- MKM’03, number 2594 in Lecture Notes in Computer Sci- entists. In 6th IEEE e-Science conference, 2010, pages 300– ence, 2003. Springer Verlag, Berlin/Heidelberg. 307. [49] Andrea Asperti, Luca Padovani, Claudio Sacerdoti Coen, Fer- [61] Dave Beckett. RDF/XML syntax specification (revised). ruccio Guidi, and Irene Schena. Mathematical knowledge W3C Recommendation, World Wide Web Consortium management in HELM. Annals of Mathematics and Artificial (W3C), February 2004. URL http://www.w3.org/TR/ Intelligence, Special Issue on Mathematical Knowledge Man- 2004/REC-rdf-syntax-grammar-20040210/. agement, Kluwer Academic Publishers, 38(1–3):27–46, May [62] Anders Berglund, Scott Boag, Don Chamberlin, Mary F. Fer- 2003. nández, Michael Kay, Jonathan Robie, and Jérôme Siméon. [50] Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi, and XML Path Language (XPath) 2.0. W3C Recommendation, Stefano Zacchiroli. User interaction with the Matita proof World Wide Web Consortium (W3C), 2007. URL http:// assistant. Journal of Automated Reasoning, 39(2):109–139, www.w3.org/TR/2007/REC-xpath20-20070123/. 2007. [63] Tim Berners-Lee. cwm. 2009. URL http://www.w3. [51] Andrea Asperti, Herman Geuvers, and Raja Natarajan. So- org/2000/10/swap/doc/cwm.html. cial processes, program verification and all that. Mathemati- [64] Tim Berners-Lee. Notation 3 (N3) – a readable RDF syn- cal Structures in Computer Science, 19(5):877–896, October tax. Technical report, World Wide Web Consortium (W3C), 2009. 2006. URL http://www.w3.org/DesignIssues/ [52] Ron Ausbrooks, Stephen Buswell, David Carlisle, Giorgi Notation3.html. Chavchanidze, Stéphane Dalmas, Stan Devitt, Angel Diaz, [65] Tim Berners-Lee. Design issues: Linked data. Tech- Sam Dooley, Roger Hunter, Patrick Ion, Michael Kohlhase, nical report, World Wide Web Consortium (W3C), Azzeddine Lazrek, Paul Libbrecht, Bruce Miller, Robert 2006. URL http://www.w3.org/DesignIssues/ Miner, Murray Sargent, Bruce Smith, Neil Soiffer, Robert LinkedData.html. Sutor, and Stephen Watt. Mathematical Markup Language [66] Tim Berners-Lee, Roy T. Fielding, and Larry Masinter. Uni- (MathML) version 3.0. W3C Recommendation, World Wide form resource identifier (URI): Generic syntax. RFC 3986, Web Consortium (W3C), 2010. URL http://www.w3. Internet Engineering Task Force (IETF), 2005. URL http: org/TR/MathML3. //www.ietf.org/rfc/rfc3986.txt. [53] Serge Autexier, Dieter Hutter, Till Mossakowski, and Axel [67] Diego Berrueta and Jon Phipps. Best practice recipes Schairer. Maya: Maintaining structured documents. In OM- for publishing RDF vocabularies. W3C Working Group DOC – An open markup format for mathematical documents Note, World Wide Web Consortium (W3C), August [Version 1.2] Kohlhase [147], chapter 26.12. URL http: 2008. URL http://www.w3.org/TR/2008/ //omdoc.org/pubs/omdoc1.2.pdf. NOTE-swbp-vocab-pub-20080828/. [54] Serge Autexier, Jacques Calmet, David Delahaye, Patrick [68] Diego Berrueta, Dan Brickley, Stefan Decker, Sergio Fernán- D. F. Ion, Laurence Rideau, Renaud Rioboo, and Alan P. Sex- dez, Christoph Görn, Andreas Harth, Tom Heath, Kingsley ton, editors. Intelligent Computer Mathematics, number 6167 Idehen, Kjetil Kjernsmo, Alistair Miles, Alexandre Passant, in Lecture Notes in Artificial Intelligence, 2010. Springer Ver- Axel Polleres, and Luis Polo. SIOC Core Ontology Speci- lag, Berlin/Heidelberg. ISBN 3642141277. fication, March 2010. URL http://rdfs.org/sioc/ [55] John Baez. Math blogs. Notices of the AMS, page 333, spec/. March 2010. URL http://www.ams.org/notices/ [69] Chris Bizer, Sören Auer, and Gunnar AAstrand Grimnes, 201003/rtx100300333p.pdf. editors. Scripting and Development for the Semantic [56] Grzegorz Bancerek. Information retrieval and rendering with Web (SFSW), number 449 in CEUR Workshop Proceed- MML Query. In Borwein and Farmer [72], pages 266–279. ings, Aachen, May 2009. URL http://CEUR-WS.org/ [57] Rebhi S. Baraka. A Framework for Publishing and Vol-449/. Discovering Mathematical Web Services. PhD thesis, [70] Uldis Bojars¯ and John G. Breslin. SIOC core ontology spec- Johannes Kepler Universität Linz, 2006. URL http: ification. W3C Member Submission, World Wide Web Con- //www.risc.uni-linz.ac.at/publications/ sortium (W3C), June 2007. URL http://www.w3.org/ download/risc_2945/06-04.pdf. Submission/2007/SUBM-sioc-spec-20070612/. [58] Michael J. Barany. ‘[b]ut this is blog maths and we’re free [71] Uldis Bojars,¯ John G. Breslin, Vassilios Peristeras, Giovanni to make up conventions as we go along’: Polymath1 and Tummarello, and Stefan Decker. Interlinking the Social Web the modalities of ‘massively collaborative mathematics’. In with Semantics. IEEE Intelligent Systems, 23(3), pages 29– Phoebe Ayers and Felipe Ortega, editors, Proceedings of the 40, 2008. 6th International Symposium on Wikis and Open Collabora- [72] Jon Borwein and William M. Farmer, editors. Mathemati- tion (WikiSym), ACM Press, 2010. URL http://www. cal Knowledge Management (MKM), number 4108 in Lec- wikisym.org/ws2010/Proceedings/. ture Notes in Artificial Intelligence, 2006. Springer Verlag, [59] Sean Bechhofer, Manfred Hauswirth, Jörg Hoffmann, and Berlin/Heidelberg. Manolis Koubarakis, editors. The Semantic Web: Research [73] Scott Bradner. Key words for use in RFCs to indicate re- CEUR Workshop Proceedings, Aachen, 2009. URL http: quirement levels. RFC 2119, Internet Engineering Task //CEUR-WS.org/Vol-523/. Force (IETF), 1997. URL http://www.ietf.org/ [86] CNX. Connexions. URL http://cnx.org. rfc/rfc2119.txt. [87] Mihai Codescu, Fulya Horozal, Michael Kohlhase, Till [74] Dan Brickley and Libby Miller. FOAF vocabulary specifi- Mossakowski, and Florian Rabe. Project Abstract: Logic At- cation 0.98. Technical report, ILRT Bristol. URL http: las and Integrator (LATIN). In Davenport et al. [96], pages //xmlns.com/foaf/spec/20100809.html. 289–291. [75] Dan Brickley, Vinay K. Chaudhri, Harry Halpin, and Deb- [88] Arjeh M. Cohen, Hans Cuypers, and Rikko Verrijzer. Math- orah L. McGuinness, editors. Proceedings of the AAAI ematical context in interactive documents. Mathematics in Spring Symposium on Linked Data Meets Artificial Intelli- Computer Science, 3(3):331–347, 2010. gence, 2010. [89] Sarah Coppins and Brent Hendricks. Content MathML (Con- [76] Matthias Bröcheler. A mathematical semantic web. nexions web site). URL http://cnx.org/content/ Bachelor’s thesis, Computer Science, Jacobs Univer- m9008/2.15/. sity, Bremen, 2007. URL http://www.eecs. [90] Olivier Corby, Leila Kefi-Khelif, Hacène Cherfi, Fabien jacobs-university.de/archive/bsc-2007/ Gandon, and Khaled Khelif. Querying the semantic broecheler.pdf. web of data using SPARQL, RDF and XML. Techni- [77] Simon Buckingham Shum, Tim Clark, Anita de Waard, cal Report 6847, INRIA Sophia Antipolis, February 2009. Tudor Groza, Siegfried Handschuh, and Àgnes Sándor. URL http://hal.inria.fr/docs/00/36/23/81/ Scientific discourse on the semantic web: A survey of PDF/RR-6847.pdf. models and enabling technologies. accepted for publica- [91] Stéphane Corlosquet, Renaud Delbru, Tim Clark, Axel tion in the Semantic Web Journal, 2011. URL http: Polleres, and Stefan Decker. Produce and Consume Linked //www.semantic-web-journal.net/content/ Data with Drupal! In Abraham Bernstein, David R. Karger, scientific-discourse-semantic-web-survey- Tom Heath, Lee Feigenbaum, Diana Maynard, Enrico Motta, models-and-enabling-technologies. and Krishnaprasad Thirunarayan, editors, The Semantic Web [78] Lou Burnard and Syd Bauman. TEI P5: Guidelines for – ISWC 2009, number 5823 in Lecture Notes in Computer electronic text encoding and interchange. Technical re- Science, pages 763–778. Springer Verlag, Berlin/Heidelberg, port, TEI Consortium. URL http://www.tei-c.org/ October 2009. release/doc/tei-p5-doc/en/html/. [92] Hans Cuypers, Arjeh M. Cohen, Jan Willem Knopper, Rikko [79] Stephen Buswell. RDF/XML test cases for RDF logic, Verrijzer, and Mark Spanbroek. MathDox, a system for inter- web ontology and maths content. Deliverable 5.3b, Se- active mathematics. In Proceedings of the World Conference mantic Web Advanced Development for Europe (SWAD- on Educational Multimedia, Hypermedia & Telecommunica- Europe), 2001. URL http://www.w3.org/2001/sw/ tions 2008 (ED-MEDIA’08), pages 5177–5182. AACE, June Europe/reports/xml_test_cases/wp53.html. 2008. URL http://go.editlib.org/p/29092. [80] Stephen Buswell, Olga Caprotti, David P. Carlisle, Michael C. [93] Wolfgang Dalitz, Wolfram Sperber, and Winfried Neun. Dewar, Marc Gaëtano, and Michael Kohlhase. The Open Math-Net, a model for information and communication sys- Math standard, version 2.0. Technical report, The Open- tems in sciences. In Jan von Knop, Peter Schirmbacher, and Math Society, 2004. URL http://www.openmath. Viljan Mahnic, editors, EUNIS, number 13 in Lecture Notes org/standard/om20. in Informatics, pages 140–145. GI, 2001. ISBN 3-88579-339- [81] Olga Caprotti, Mike Dewar, and Daniele Turi. Mathematical 3. service matching using description logic and OWL. In An- [94] Ron Daniel Jr. Harvesting RDF statements from XLinks. drea Asperti, Grzegorz Bancerek, and Andrej Trybulec, edi- W3C Note, World Wide Web Consortium (W3C), Septem- tors, Mathematical Knowledge Management, MKM’04, num- ber 2000. URL http://www.w3.org/TR/2000/ ber 3119 in Lecture Notes in Artificial Intelligence, pages 73– NOTE-xlink2rdf-20000929/. 87. Springer Verlag, 2004, Berlin/Heidelberg. [95] James Davenport. A small OpenMath type system. Bulletin of [82] Olga Caprotti, Mike Dewar, and Daniele Turi. Math- the ACM Special Interest Group on Symbolic and Automated ematical service matching using description logic and Mathematics (SIGSAM), 34(2):16–21, 2000. OWL. Technical report, The MONET Consortium, [96] James Davenport, William Farmer, Florian Rabe, and Josef 2004. URL http://monet.nag.co.uk/monet/ Urban, editors. Intelligent Computer Mathematics, num- publicdocs/monet_onts.pdf. ber 6824 in Lecture Notes in Artificial Intelligence, 2011. [83] Jacques Carette, Lucas Dixon, Claudio Sacerdoti Coen, and Springer Verlag, Berlin/Heidelberg. Stephen M. Watt, editors. MKM/Calculemus Proceed- [97] James H. Davenport and Michael Kohlhase. Unifying ings, number 5625 in Lecture Notes in Artificial Intelli- Math Ontologies: A tale of two standards. In Carette gence, July 2009. Springer Verlag, Berlin/Heidelberg. ISBN et al. [83], pages 263–278. ISBN 9783642026133. 9783642026133. URL http://kwarc.info/kohlhase/papers/ [84] Paolo Ciccarese and Tudor Groza. Ontology of Rhetorical mkm09-MMLOM3.pdf. Blocks (ORB). W3C Interest Group Note, World Wide Web [98] Catalin David, Michael Kohlhase, Christoph Lange, Florian Consortium (W3C), October 2011. URL http://www. Rabe, Nikita Zhiltsov, and Vyacheslav Zholudev. Publishing w3.org/TR/2011/NOTE-hcls-orb-20111020/. math lecture notes as linked data. In Lora Aroyo, Grigoris [85] Tim Clark, Joanne S. Luciano, M. Scott Marshall, Eric Antoniou, Eero Hyvönen, Annette ten Teije, Heiner Stuck- Prud’hommeaux, and Susie Stephens, editors. Semantic Web enschmidt, Liliana Cabral, and Tania Tudorache, editors, The Applications in Scientific Discourse (SWASD), number 523 in Semantic Web: Research and Applications (Part II), number 6089 in Lecture Notes in Computer Science, pages 370–375. [114] William M. Farmer. MKM: A new interdisciplinary field of Springer Verlag, Berlin/Heidelberg, 2010. research. Bulletin of the ACM Special Interest Group on Sym- [99] DBpedia. URL http://dbpedia.org. bolic and Automated Mathematics (SIGSAM), 38(2):47–52, [100] DCMI Usage Board. DCMI metadata terms. DCMI 2004. recommendation, Dublin Core Metadata Initiative, 2008. [115] Armin Fiedler. P.rex: An interactive proof explainer. In Ra- URL http://dublincore.org/documents/2008/ jeev Goré, Alexander Leitsch, and Tobias Nipkow, editors, 01/14/dcmi-terms/. Automated Reasoning — 1st International Joint Conference, [101] Jos De Roo. EulerSharp. URL http://eulersharp. IJCAR 2001, number 2083 in Lecture Notes in Artificial Intel- sourceforge.net/. ligence, pages 416–420, Siena, Italy, 2001. Springer Verlag, [102] Klaas Dellschaft, Hendrik Engelbrecht, José Monte Barreto, Berlin/Heidelberg. Sascha Rutenbeck, and Steffen Staab. Cicero: Tracking de- [116] Jana Giceva, Christoph Lange, and Florian Rabe. Integrating sign rationale in collaborative ontology engineering. In Bech- web services into active mathematical documents. In Carette hofer et al. [59], pages 782–786. et al. [83], pages 279–293. ISBN 9783642026133. URL [103] Li Ding, Dominic DiFranzo, Alvaro Graves, James R. https://svn.omdoc.org/repos/jomdoc/doc/ Michaelis, Xian Li, Deborah L. McGuinness, and Jim pubs/mkm09/jobad/jobad-server.pdf. Hendler. Data-gov wiki: Towards linking government data. [117] Deyan Ginev, Constantin Jucovschi, Stefan Anca, Mi- In Brickley et al. [75]. hai Grigore, Catalin David, and Michael Kohlhase. An [104] DITA. OASIS Darwin Information Typing Architec- architecture for linguistic and semantic analysis on the ture (DITA). URL http://www.oasis-open.org/ arXMLiv corpus. In Applications of Semantic Technolo- committees/dita/. gies (AST) Workshop at Informatik 2009, 2009. URL [105] Leigh Dodds and Ian Davis. Linked Data Patterns. A pat- http://www.kwarc.info/projects/lamapun/ tern catalogue for modelling, publishing, and consuming pubs/AST09_LaMaPUn+appendix.pdf. Linked Data. August 2011. URL http://patterns. [118] Birte Glimm and Chimezie Ogbuji. SPARQL 1.1 entailment dataincubator.org/book/. regimes. W3C Working Draft, World Wide Web Consortium [106] Peter Dolog, Rita Gavrioloaie, Wolfgang Nejdl, and Jan (W3C), October 2010. URL http://www.w3.org/TR/ Brase. Integrating adaptive hypermedia techniques and open 2010/WD-sparql11-entailment-20101014/. RDF-based environments. In Proceedings of the 12th WWW [119] George Goguadze. Metadata for mathematical libraries. De- conference. ACM Press, 2003. liverable D3.a, MoWGLI, 2003. URL http://mowgli. [107] Nicholas Drummond, Alan Rector, Robert Stevens, Georgina cs.unibo.it/misc/deliverables/metadata/ Moulton, Matthew Horridge, Hai Wang, and Julian Seden- D3a_metadata_for_math/math_metadata.pdf. berg. Putting owl in order: Patterns for sequences in OWL. [120] George Goguadze. Metadata model. Deliverable D3.b, MoW- In Bernardo Cuenca Grau, Pascal Hitzler, Connor Shankey, GLI, 2003. URL http://mowgli.cs.unibo.it/ and Evan Wallace, editors, OWL: Experiences and Directions misc/deliverables/metadata/D3bmetadata_ (OWLED), November 2006. model/metadata_model.pdf. [108] Dublin Core Metadata Element Set. Dublin Core metadata [121] Timothy Gowers and Michael Nielsen. Massively collabora- element set. DCMI recommendation, Dublin Core Meta- tive mathematics. Nature, 461(15):879–881, 2009. data Initiative, 2008. URL http://dublincore.org/ [122] Paul Grosso, Eve Maler, Jonathan Marsh, and Nor- documents/2008/01/04/dces/. man Walsh. W3C XPointer framework. W3C [109] Edd Dubmill. DOAP – description of a project. URL http: Recommendation, World Wide Web Consortium (W3C), //trac.usefulinc.com/doap. March 2003. URL http://www.w3.org/TR/2003/ [110] Bob DuCharme. Using RDFa with DITA and DocBook. REC-xptr-framework-20030325/. 2009. URL http://www.devx.com/semantic/ [123] Tudor Groza and Siegfried Handschuh. SALT annota- Article/42543/. tion ontology. Technical report, Digital Enterprise Re- [111] Thomas Eiter, Giovambattista Ianni, Thomas Krennwall- search Institute (DERI), 2009. URL http://salt. ner, and Axel Polleres. Rules and ontologies for the semanticauthoring.org/ontologies/sao. semantic web. In Cristina Baroglio, Piero A. Bon- [124] Tudor Groza and Siegfried Handschuh. SALT docu- atti, Jan Małuszynski,´ Massimo Marchiori, Axel Polleres, ment ontology. Technical report, Digital Enterprise Re- and Sebastian Schaffert, editors, Reasoning Web, number search Institute (DERI), 2009. URL http://salt. 5224 in Lecture Notes in Computer Science, pages 1– semanticauthoring.org/ontologies/sdo. 53. Springer, 2008. URL http://axel.deri.ie/ [125] Tudor Groza and Siegfried Handschuh. SALT rhetor- publications/eite-etal-2008.pdf. ical ontology. Technical report, Digital Enterprise Re- [112] Hans Magnus Enzensberger. Drawbridge up. A K PETERS, search Institute (DERI), 2009. URL http://salt. LTD., 1999. German original: Zugbrücke außer Betrieb. Die semanticauthoring.org/ontologies/sro. Mathematik im Jenseits der Kultur. English translation by [126] Tudor Groza, Siegfried Handschuh, Knud Möller, and Ste- Tom Artin. fan Decker. SALT – semantically annotated LATEX for sci- [113] William Farmer, Josuah Guttman, and Xavier Thayer. Little entific publications. In Enrico Franconi, Michael Kifer, and theories. In D. Kapur, editor, Proceedings of the 11th Confer- Wolfgang May, editors, The Semantic Web: Research and ence on Automated Deduction, number 607 in Lecture Notes Applications, number 4519 in Lecture Notes in Computer in Computer Science, pages 467–581, Saratoga Springs, NY, Science, pages 518–532. Springer Verlag, Berlin/Heidelberg, USA, 1992. Springer Verlag, Berlin/Heidelberg. 2007. ISBN 978-3-540-72666-1. [127] Tudor Groza, Knud Möller, Siegfried Handschuh, Diana [141] IEEE Learning Technology Standards Committee. Stan- Trif, and Stefan Decker. SALT: Weaving the claim web. dard for Resource Description Framework (RDF) binding for In Karl Aberer, Key-Sun Choi, Natasha Fridman Noy, Learning Object Metadata data model. Technical Report Dean Allemang, Kyung-Il Lee, Lyndon J. B. Nixon, Jen- 1484.12.4, IEEE, 2002. nifer Golbeck, Peter Mika, Diana Maynard, Riichiro Mi- [142] Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, and zoguchi, Guus Schreiber, and Philippe Cudré-Mauroux, ed- Amit P. Sheh. Linked data is merely more data. In Brickley itors, ISWC/ASWC, number 4825 in Lecture Notes in Com- et al. [75]. puter Science, pages 197–210. Springer Verlag, Berlin/Hei- [143] Fairouz Kamareddine, Manuel Maarek, Krzysztof Retel, and delberg, 2007. ISBN 978-3-540-76297-3. J. B. Wells. Narrative structure of mathematical texts. In [128] Tudor Groza, Siegfried Handschuh, Tim Clark, Simon Buck- Manuel Kauers, Manfred Kerber, Robert Miner, and Wolf- ingham Shum, and Anita de Waard. A short survey of dis- gang Windsteiger, editors, Towards Mechanized Mathemat- course representation models. In Clark et al. [85]. URL ical Assistants. MKM/Calculemus, number 4573 in Lecture http://CEUR-WS.org/Vol-523/. Notes in Artificial Intelligence, pages 296–312. Springer Ver- [129] Ferruccio Guidi. Searching and Retrieving in Content-based lag, Berlin/Heidelberg, 2007. ISBN 978-3-540-73083-5. Repositories of Formal Mathematical Knowledge. PhD thesis, [144] Fairouz Kamareddine, J. B. Wells, and Christoph Zengler. Università di Bologna, 2003. Computerising mathematical text with MathLang. Electron. [130] Ferruccio Guidi and Claudio Sacerdoti Coen. Query- Notes Theor. Comput. Sci., 205:5–30, 2008. ISSN 1571-0661. ing distributed digital libraries of mathematics. In URL http://www.cedar-forest.org/forest/ Thérèse Hardin and Renaud Rioboo, editors, Proceed- papers/drafts/mathlang-coq-short.pdf. ings of the 11th Symposium on the Integration of Sym- [145] Andrea Kohlhase and Michael Kohlhase. Semantic knowl- bolic Computation and Mechanized Reasoning (Calcule- edge management for education. Proceedings of the IEEE; mus 2003), pages 17–30, Rome, Italy, September 2003. Special Issue on Educational Technology, 96(6):970–989, URL http://www.calculemus.net/meetings/ June 2008. URL http://kwarc.info/kohlhase/ rome03/Proceedings/final.pdf. papers/semkm4ed.pdf. [131] Ferruccio Guidi and Irene Schena. A query language for a [146] Andrea Kohlhase and Michael Kohlhase. Spread- metadata framework about mathematical resources. In As- sheet interaction with frames: Exploring a mathemati- perti et al. [48], pages 105–118. cal practice. In Carette et al. [83], pages 341–356. [132] Thomas C. Hales, John Harrison, Sean McLaughlin, Tobias ISBN 9783642026133. URL http://kwarc.info/ Nipkow, Steven Obua, and Roland Zumkeller. A revision of kohlhase/papers/mkm09-framing.pdf. the proof of the Kepler conjecture. Discrete and Computa- [147] Michael Kohlhase. OMDOC – An open markup format tional Geometry, pages 1–34, 2010. for mathematical documents [Version 1.2]. Number 4180 [133] Kevin Hammond, Peter Horn, Alexander Konovalov, Steve in Lecture Notes in Artificial Intelligence. Springer Verlag, Linton, Dan Roozemond, Abdallah Al Zain, and Phil Trinder. Berlin/Heidelberg, August 2006. URL http://omdoc. Easy composition of symbolic computation software: A new org/pubs/omdoc1.2.pdf. lingua franca for symbolic computation. In Proceedings of the [148] Michael Kohlhase. Using LATEX as a Semantic Markup 2010 International Symposium on Symbolic and Algebraic Format. Mathematics in Computer Science, 2(2):279– Computation (ISSAC), pages 339–346. ACM Press, 2010. 304, 2008. URL https://svn.kwarc.info/repos/ [134] Steve Harris and Andy Seaborne. SPARQL 1.1 query lan- stex/doc/mcs08/stex.pdf. guage. W3C Working Draft, World Wide Web Consortium [149] Michael Kohlhase. Towards bootstrapping the prag- (W3C), October 2010. URL http://www.w3.org/TR/ matic to strict mapping in OMDoc. August 2009. URL 2010/WD-sparql11-query-20101014/. https://svn.omdoc.org/repos/omdoc/trunk/ [135] Olaf Hartig, Hannes Mühleisen, and Johann-Christoph Frey- doc/blue/p2s-bootstrap/note.pdf. tag. Linked data for building a map of researchers. In Bizer [150] Michael Kohlhase and Ioan ¸Sucan.A search engine for math- et al. [69]. URL http://CEUR-WS.org/Vol-449/. ematical formulae. In Tetsuo Ida, Jacques Calmet, and Dong- [136] Philipp Heim, Steffen Lohmann, and Timo Stegemann. Inter- ming Wang, editors, Proceedings of Artificial Intelligence active relationship discovery via the semantic web. In Aroyo and Symbolic Computation, AISC’2006, number 4120 in Lec- et al. [45], pages 303–317. ture Notes in Artificial Intelligence, pages 241–253. Springer [137] Tom Heath and Christian Bizer. Linked Data: Evolving the Verlag, Berlin/Heidelberg, 2006. URL http://kwarc. Web into a Global Data Space. Morgan & Claypool, San info/kohlhase/papers/aisc06.pdf. Rafael, CA, 2011. URL http://linkeddatabook. [151] Michael Kohlhase, Till Mossakowski, and Florian Rabe. com. Latin: Logic atlas and integrator. URL http://latin. [138] Bettina Heintz. Die Innenwelt der Mathematik. Zur Kultur omdoc.org. und Praxis einer beweisenden Disziplin. Springer Verlag, [152] Michael Kohlhase, Joe Corneli, Catalin David, Deyan Wien, 2000. Ginev, Constantin Jucovschi, Andrea Kohlhase, Christoph [139] Ian Hickson. HTML5. A vocabulary and associated APIs for Lange, Bogdan Matican, Stefan Mirea, and Vyacheslav HTML and XHTML. W3C Working Draft, World Wide Web Zholudev. The Planetary system: Web 3.0 & active docu- Consortium (W3C), May 2011. URL http://www.w3. ments for STEM. Procedia Computer Science, 4:598–607, org/TR/2011/WD-html5-20110525/. 2011. URL https://svn.mathweb.org/repos/ [140] IEEE Learning Technology Standards Committee. Standard planetary/doc/epc11/paper.pdf. Finalist at the for Learning Object Metadata. Technical Report 1484.12.1, Executable Papers Challenge. IEEE, 2002. [153] Werner Kunz and Horst W. J. Rittel. Issues as elements of ceedings, Aachen, 2008. CEUR-WS.org. URL http:// information systems. Working paper 131, Institute of Urban ceur-ws.org/Vol-385/. and Regional Development, University of California, Berke- [167] Paul Libbrecht. Notations around the world: Census and ex- ley, July 1970. ploitation. In Autexier et al. [54], pages 398–410. ISBN [154] Imre Lakatos. Proofs and Refutations. Cambridge University 3642141277. Press, 1976. [168] Paul Libbrecht and Erica Melis. Methods for Ac- [155] Christoph Lange. Krextor – an extensible XML→RDF ex- cess and Retrieval of Mathematical Content in Active- traction framework. In Bizer et al. [69]. URL http: Math. In N. Takayama and A. Iglesias, editors, //ceur-ws.org/Vol-449/ShortPaper2.pdf. Proceedings of ICMS-2006, number 4151 in Lec- [156] Christoph Lange. wiki.openmath.org – how it works, ture Notes in Artificial Intelligence, pages 331–342. how you can participate. In James H. Davenport, editor, 22nd Springer Verlag, Berlin/Heidelberg, 2006. URL OpenMath Workshop, July 2009. URL http://staff. http://www.activemath.org/publications/ bath.ac.uk/masjhd/OM2009.html. Libbrecht-Melis-Access-and-Retrieval- [157] Christoph Lange. Towards OpenMath content dictionaries ActiveMath-ICMS-2006.pdf. as linked data. In Michael Kohlhase and Christoph Lange, [169] William C. Mann and Maite Taboada. Rhetorical structure editors, 23rd OpenMath Workshop, July 2010. URL http: theory. URL http://www.sfu.ca/rst/. //cicm2010.cnam.fr/om/. [170] Massimo Marchiori. The mathematical semantic web. In As- [158] Christoph Lange. The OMDoc ontology, 2010. URL http: perti et al. [48], pages 216–223. Keynote. //kwarc.info/projects/docOnto/omdoc.html. [171] Jonathan Marsh, Daniel Veillard, and Norman Walsh. [159] Christoph Lange. Krextor – an extensible framework for xml:id version 1.0. W3C Recommendation, World Wide contributing content math to the web of data. In Davenport Web Consortium (W3C), September 2005. URL http:// et al. [96], pages 304–306. URL http://kwarc.info/ www.w3.org/TR/2005/REC-xml-id-20050909/. clange/pubs/krextor-system.pdf. [172] Viviana Mascardi, Valentina Cordì, and Paolo Rosso. A com- [160] Christoph Lange. Enabling Collaboration on Semiformal parison of upper ontologies. In Matteo Baldoni, Antonio Boc- Mathematical Knowledge by Semantic Web Integration. PhD calatte, Flavio De Paoli, Maurizio Martelli, and Viviana Mas- thesis, Jacobs University Bremen, 2011, number 11 in Studies cardi, editors, WOA, pages 55–64. Seneca Edizioni Torino, on the Semantic Web. AKA Verlag and IOS Press, Heidelberg 2007. ISBN 978-88-6122-061-4. and Amsterdam, 2011. [173] Claudio Masolo, Stefano Borgo, Aldo Gangemi, Nicola Guar- [161] Christoph Lange and Michael Kohlhase. A mathematical ino, Alessandro Oltramari, and Luc Schneider. The wonder- approach to ontology authoring and documentation. In web library of foundational ontologies. WonderWeb Deliv- Carette et al. [83], pages 389–404. ISBN 9783642026133. erable 17, Laboratory for Applied Ontology – ISTC-CNR, URL http://kwarc.info/kohlhase/papers/ May 2003. URL http://wonderweb.semanticweb. mkm09-omdoc4onto.pdf. org/deliverables/documents/D17.pdf. [162] Christoph Lange, Patrick Ion, Anastasia Dimou, Charalam- [174] MathDox. MathDox – interactive mathematics. URL http: pos Bratsas, Wolfram Sperber, Michael Kohlhase, and Ioan- //www.mathdox.org. nis Antoniou. Getting Mathematics Towards the Web [175] Deborah L. McGuinness, Li Ding, Paulo Pinheiro da Silva, of Data: the Case of the Mathematics Subject Classifica- and Cyntha Chang. PML 2: A modular explanation tion. Submitted to Extended Semantic Web Conference interlingua. In Thomas Roth-Berghofer, Stefan Schulz, 2012. URL http://kwarc.info/clange/pubs/ and David B. Leake, editors, Proceedings of the AAAI eswc2012-msc-skos.pdf. Workshop on Explanation-Aware Computing (ExaCt), 2007. [163] Christoph Lange, Uldis Bojars,¯ Tudor Groza, John Breslin, URL http://www.aaai.org/Papers/Workshops/ and Siegfried Handschuh. Expressing argumentative discus- 2007/WS-07-06/WS07-06-008.pdf. sions in social media sites. In John Breslin, Uldis Bojars,¯ [176] Erica Melis, Giorgi Goguadze, Paul Libbrecht, and Carsten Alexandre Passant, and Sergio Fernández, editors, Social Ullrich. Culturally adapted mathematics education with Data on the Web (SDoW), Workshop at the 7th International ActiveMath. AI & Society, 24(3):251–265, 2009. Semantic Web Conference, number 405 in CEUR Workshop [177] Paolo Missier, Satya S. Sahoo, Jun Zhao, Carole Goble, and Proceedings, Aachen, 2008. URL http://ceur-ws. Amit Sheth. Janus: from workflows to semantic provenance org/Vol-405/paper4.pdf. and linked open data. In Proceedings of the 3rd Provenance [164] Christoph Lange, Tuukka Hastrup, and Stéphane Corlosquet. and Annotation Workshop, 2010. Arguing on issues with mathematical knowledge items in a [178] MKMNET. MKMNET (Mathematical Knowledge Man- semantic wiki. In Joachim Baumeister and Martin Atzmüller, agement Network). URL http://monet.nag.co.uk/ editors, Wissens- und Erfahrungsmanagement LWA (Lernen, mkm/. Wissensentdeckung und Adaptivität) Conference Proceed- [179] Monet. MONET – Mathematics on the net. web page at ings, volume 448, October 2008. http://monet.nag.co.uk. URL http://monet. [165] Christoph Lange et al. Krextor – the KWARC RDF extractor. nag.co.uk. URL http://kwarc.info/projects/krextor/. [180] Till Mossakowski. Hets: the Heterogeneous Tool Set. URL [166] Paul Libbrecht. Cross curriculum search through the geoskills http://www.dfki.de/sks/hets. ontology. In D. Massart, J.-N. Colin, F. Van Asche, and [181] Till Mossakowski, Christian Maeder, and Klaus Lüttich. The M. Wolpers, editors, Proceedings of the 2nd International Heterogeneous Tool Set. In Orna Grumberg and Michael Workshop on Search and Exchange of e-le@rning Material Huth, editors, Proceedings of the 13th International Confer- (SE@M), located at EC-TEL-2008, CEUR Workshop Pro- ence on Tools and Algorithms for the Construction and Anal- ysis of Systems TACAS-2007, number 4424 in Lecture Notes burgh, April 2009. in Computer Science, pages 519–522, Berlin, Germany, 2007. [200] Andrew Robbins. Semantic MathML. 2009. URL Springer Verlag. http://straymindcough.blogspot.com/2009/ [182] Christine Müller. Adaptation of Mathematical Documents. 06/semantic-mathml.html. PhD thesis, Jacobs University Bremen, 2010. URL http: [201] Leo Sauermann and Richard Cyganiak. Cool URIs for the //kwarc.info/cmueller/papers/thesis.pdf. semantic web. W3C Interest Group Note, World Wide Web [183] Hala Naja-Jazzar, Nishadi de Silva, Hala Skaf-Molli, Char- Consortium (W3C), December 2008. URL http://www. bel Rahhal, and Pascal Molli. OntoReST: A RST-based on- w3.org/TR/2008/NOTE-cooluris-20081203/. tology for enhancing documents content quality in collabora- [202] Irene Schena. Towards a Semantic Web for Formal Mathemat- tive writing. INFOCOMP Journal of Computer Science, 8(3): ics. PhD thesis, University of Bologna, March 2002. Techni- 1–10, September 2009. cal Report UBLCS–2002–6. [184] Mikael Nilsson, Andy Powell, Pete Johnston, and Am- [203] SCIEnce. The SCIEnce project – Symbolic Compu- björn Naeve. Expressing Dublin Core metadata using tation Infrastructure for Europe. URL http://www. the Resource Description Framework (RDF). DCMI symbolic-computation.org/. recommendation, Dublin Core Metadata Initiative. URL [204] François-Paul Servant. Linking enterprise data. In Chris- http://dublincore.org/documents/2008/01/ tian Bizer, Tom Heath, Kingsley Idehen, and Tim Berners- 14/dc-rdf/. Lee, editors, Linked Data on the Web (LDOW), number 369 [185] Mikael Nilsson, Matthias Palmér, and Jan Brase. The LOM in CEUR Workshop Proceedings, Aachen, April 2008. URL RDF binding – principles and implementation. In 3rd Annual http://CEUR-WS.org/Vol-369/. Ariadne Conference, 2003. [205] John Sheridan and Jeni Tennison. Linking UK govern- [186] Tope Omitola, Christos L. Koumenides, Igor O. Popov, Yang ment data. In Christian Bizer, Tom Heath, Tim Berners- Yang, Manuel Salvadores, Martin Szomszor, Tim Berners- Lee, and Michael Hausenblas, editors, Linked Data on the Lee, Nicholas Gibbins, Wendy Hall, mc schraefel, and Nigel Web (LDOW), number 628 in CEUR Workshop Proceed- Shadbolt. Put in your postcode, out comes the data: A case ings, Aachen, April 2010. URL http://CEUR-WS.org/ study. In Aroyo et al. [45], pages 318–332. Vol-628/. [187] OpenLink Software. OpenLink universal integration mid- [206] David Shotton and Silvio Peroni. DoCo, the Document Com- dleware – Virtuoso product family. URL http:// ponents Ontology. May 2011. URL http://purl.org/ virtuoso.openlinksw.com. spar/doco. [188] Christian-Emil Ore and Øyvind Eide. TEI and cultural her- [207] Neil J. A. Sloane. The on-line encyclopedia of integer se- itage ontologies: Exchange of information? Literary and Lin- quences. Notices of the AMS, 50(8):912, 2003. guistic Computing, 24(2):161–172, 2009. [208] Wolfram Sperber. Math-Net international and the Math-Net [189] Luca Padovani. GtkMathView. URL http://helm.cs. page. In Fengshan Bai and Bernd Wegner, editors, Elec- unibo.it/mml-widget/. tronic Information and Communication in Mathematics, num- [190] Luca Padovani and Stefano Zacchiroli. From notation to se- ber 2730 in Lecture Notes in Computer Science, pages 169– mantics: There and back again. In Borwein and Farmer [72], 177. Springer, 2003. pages 194–207. [209] Manu Sporny, Benjamin Adrian, Nathan Rixham, Mark [191] Alexandre Passant, Paolo Ciccarese, John G. Breslin, and Tim Birbeck, and Ivan Herman. RDFa API. W3C Clark. SWAN/SIOC: Aligning scientific discourse represen- Working Draft, World Wide Web Consortium (W3C), tation and social semantics. In Clark et al. [85]. April 2011. URL http://www.w3.org/TR/2011/ [192] PlanetMath.org. PlanetMath.org – math for the people, by the WD-rdfa-api-20110419/. people. URL http://planetmath.org. [210] Heinrich Stamerjohanns, Michael Kohlhase, Deyan Ginev, [193] Judith Plümer. Erinnerungen an Prof. Dr. Schwänzl, Catalin David, and Bruce Miller. Transforming large collec- 2004. URL http://d-mathnet.preprints.org/ tions of scientific publications to XML. Mathematics in Com- research/wissinfo/NachrufRS.html. puter Science, 3(3):299–307, 2010. URL http://kwarc. [194] George Pólya. How to Solve it. Princeton University Press, info/kohlhase/papers/mcs10.pdf. 1973. [211] Pradeep Suresh, Shuo-Huan Hsu, Pavan Akkisetty, Gintaras [195] Florian Rabe. Representing Logics and Logic Translations. V. Reklaitis, and Venkat Venkatasubramanian. OntoMODEL: PhD thesis, Jacobs University Bremen, 2008. URL http:// Ontological Mathematical Modeling Knowledge Manage- kwarc.info/frabe/Research/phdthesis.pdf. ment in Pharmaceutical Product Development, 1: Conceptual [196] Florian Rabe and Michael Kohlhase. A scalable module Framework. Industrial & Engineering Chemistry Research, system. Manuscript, submitted to Information & Com- 49(17):7758–7767, 2010. putation, 2011. URL http://kwarc.info/frabe/ [212] Carlos Tejo-Alonso, Sergio Fernández, Diego Berrueta, Luis Research/mmt.pdf. Polo, María Jesús Fernández, and Víctor Morlán. eZaragoza, [197] Ricardo Radaelli-Sanchez and Connexions. The intermediate a tourist promotional mashup. In Adrian Giurca, Brigitte CNXML (Connexions web site). URL http://cnx.org/ Endres-Niggemeyer, Christoph Lange, Lutz Maicher, and content/m9006/2.22/. Pascal Hitzler, editors, AI Mashup Challenge at ESWC, [198] Robert G. Raskin and Michael J. Pan. Knowledge representa- June 2010. URL http://sites.google.com/a/ tion in the semantic web for Earth environmental terminology fh-hannover.de/aimashup/home/ezaragoza. (SWEET). Computers & Geosciences, 31:1119–1125, 2005. [213] Christoph Tempich, H. Sofia Pinto, York Sure, and Steffen [199] Krzysztof Retel. Gradual Computerisation and Verification Staab. An argumentation ontology for DIstributed, Loosely- of Mathematics. PhD thesis, Heriot-Watt University, Edin- controlled and evolvInG Engineering processes of oNTolo- gies (DILIGENT). In Asunción Gómez-Pérez and Jérôme Eu- isar-ref.pdf. zenat, editors, The Semantic Web: Research and Applications, [223] Freek Wiedijk. Statistics on digital libraries of mathematics. number 3532 in Lecture Notes in Computer Science, pages Studies in Logic, Grammar and Rhetoric, 18(31), 2009. 241–256. Springer Verlag, Berlin/Heidelberg, 2005. ISBN 3- [224] Wikipedia. Wikipedia, the free encyclopedia. URL http: 540-26124-9. //www.wikipedia.org. [214] Christoph Tempich, Elena Simperl, Markus Luczak, Rudi [225] Wikipedia: Knowledge management. Knowl- Studer, and H. Sofia Pinto. Argumentation-based ontology edge management, December 2009. URL http: engineering. IEEE Intelligent Systems, 22(6):52–59, 2007. //en.wikipedia.org/w/index.php?title= ISSN 1541-1672. Knowledge_management&oldid=329227520. [215] Jerzy Trzeciak. Writing Mathematical Papers in English. [226] Wikipedia: Linus’ Law. Linus’ Law, March 2011. Gdanskie´ Wydawnictwo Oswiatowe,´ 1995. URL http://en.wikipedia.org/w/index.php? [216] Josef Urban, Jesse Alama, Piotr Rudnicki, and Herman Geu- title=Linus%27_Law&oldid=421629750. vers. A wiki for Mizar: Motivation, considerations, and ini- [227] Wikipedia: Mathematics. Portal: Mathematics, December tial prototype. In Autexier et al. [54], pages 455–469. ISBN 2009. URL http://en.wikipedia.org/w/index. 3642141277. php?title=Portal:Mathematics&oldid= [217] Josef Urban. Content-based encoding of mathematical and 329137789. code libraries. In Christoph Lange and Josef Urban, ed- [228] Wikipedia: Open Notebook Science. Open Notebook itors, ITP Workshop on Mathematical Wikis (MathWikis), Science, July 2010. URL http://en.wikipedia. number 767 in CEUR Workshop Proceedings, Aachen, 2011, org/w/index.php?title=Open_Notebook_ pages 49–53. URL http://ceur-ws.org/Vol-767/ Science&oldid=372235042. paper-09.pdf. [229] Wikipedia: Pythagorean theorem. Pythagorean theorem, [218] Denny Vrandeciˇ c,´ Markus Krötzsch, Sebastian Rudolph, November 2009. URL http://en.wikipedia. and Uta Lösch. Leveraging non-lexical knowledge for org/w/index.php?title=Pythagorean_ the Linked Open Data Web. In Rodolphe Héliot theorem&oldid=328597679. and Antoine Zimmermann, editors, The Fifth RAFT, [230] Gregory Todd Williams. SPARQL 1.1 Service Description. the yearly bilingual publication on nonchalant research, W3C Working Draft, World Wide Web Consortium (W3C), 2010. URL http://km.aifb.kit.edu/projects/ May 2011. URL http://www.w3.org/TR/2010/ numbers/linked_open_numbers.pdf. WD-sparql11-service-description-20110512/. [219] Denny Vrandeciˇ c,´ Christoph Lange, Michael Hausenblas, Jie [231] Stefano Zacchiroli. User Interaction Widgets for Interactive Bao, and Li Ding. Semantics of governmental statistics Theorem Proving. PhD thesis, Università di Bologna, 2007. data. In Proceedings of WebSci’10: Extending the Fron- [232] Jürgen Zimmer. MathServe – A Framework for Semantic Rea- tiers of Society On-Line. Web Science Trust, 2010. URL soning Services. PhD thesis, Universität des Saarlandes, July http://journal.webscience.org/400/. 2008. [220] Norman Walsh. The DocBook schema. Committee specifica- [233] Antoine Zimmermann. Ontology recommendations for the tion, OASIS, August 2008. URL http://www.docbook. data publishers. In Mathieu d’Aquin, Alexander García Cas- org/specs/docbook-5.0-spec-cs-01.html. tro, Christoph Lange, and Kim Viljanen, editors, 1st Work- [221] Norman Walsh and Leonard Muellner. DocBook 5.0: The shop on Ontology Repositories and Editors, number 596 in Definitive Guide. O’Reilly, 2008. CEUR Workshop Proceedings, Aachen, 2010. URL http: [222] Makarius Wenzel, Clemens Ballarin, Stefan Berghofer, Tim- //ceur-ws.org/Vol-596/. othy Bourke, Lucas Dixon, Florian Haftmann, Gerwin Klein, [234] Claus Zinn. Understanding Informal Mathematical Dis- Alexander Krauss, Tobias Nipkow, David von Oheimb, Larry course. PhD thesis, Technischen Fakultät der Universität Paulson, and Sebastian Skalberg. The Isabelle/Isar Reference Erlangen-Nürnberg, 2004. Manual. URL http://isabelle.in.tum.de/doc/