MathServe – A Framework for Semantic Reasoning Services

Jürgen Zimmer

Dissertation for obtaining the degree of Doctor of Engineering (Doktor der Ingenieurwissenschaften) of the Faculties of Natural Sciences and Technology (Naturwissenschaftlich-Technische Fakultäten) of the Universität des Saarlandes

Saarbrücken, July 2008

Dean: Prof. Dr. Joachim Weickert
Chair: Prof. Dr. Christoph Weidenbach
Reviewers: Prof. Dr. (PhD) Jörg Siekmann, Universität des Saarlandes; Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster, Universität des Saarlandes; Prof. Alan Bundy, University of Edinburgh, Edinburgh, UK
Committee member (Beisitzer): Dr. Dominikus Heckmann
Date of the colloquium: 11 July 2008

Contents

Kurzzusammenfassung

Abstract

Acknowledgements

I Preliminaries

1 Introduction
  1.1 Contributions of this Thesis
    1.1.1 Contributions to the Mathematical Semantic Web and Automated Reasoning Research
    1.1.2 Contributions to Semantic Web Research
  1.2 What this Thesis is not about
  1.3 Outline of this Thesis

2 Mechanised Reasoning in Artificial Intelligence
  2.1 History of Mechanised Reasoning
  2.2 Mechanised Reasoning Systems
    2.2.1 First-Order Automated Theorem Proving Systems
    2.2.2 Propositional Satisfiability Solvers
    2.2.3 Decision Procedures
    2.2.4 Finite Model Generators
    2.2.5 Proof Transformation Systems
  2.3 Distributed Automated Reasoning
    2.3.1 Frameworks for Distributed Automated Reasoning
    2.3.2 Frameworks for the Integration of Reasoning Systems
    2.3.3 The MathWeb Software Bus
    2.3.4 Computation Systems as Semantic Web Services
  2.4 Reasoning about Actions and Change
    2.4.1 The Situation Calculus
    2.4.2 Golog – High-level programming in the Situation Calculus
    2.4.3 Classical AI Planning
    2.4.4 Stochastic Actions and Decision-Theoretic Planning
    2.4.5 DTGolog – Golog and Decision Theory
  2.5 Summary

3 Semantic Web Services
  3.1 The Semantic Web
    3.1.1 The Extensible Markup Language
    3.1.2 The Resource Description Framework
    3.1.3 Knowledge Representation and Description Logics
    3.1.4 The Web Ontology Language
    3.1.5 Description Logics and OWL-DL
  3.2 Semantic Web Services
    3.2.1 Web Services
    3.2.2 The OWL-S Upper Ontology for Web Services
    3.2.3 WSMO
  3.3 Composition of Semantic Web Services
    3.3.1 AI Planning for Web Service Composition
    3.3.2 Golog and the Situation Calculus
    3.3.3 Markov Decision Processes
    3.3.4 Program Synthesis in Linear Logic
  3.4 Definitions and Notation
    3.4.1 OWL Ontologies – Classes, Properties and Individuals
    3.4.2 The Semantic Web Rule Language
    3.4.3 OWL-S Service Descriptions
  3.5 Summary

II Brokering Semantic Reasoning Services

4 Semantic Reasoning Services
  4.1 A Domain Ontology
  4.2 A Taxonomy of Reasoning Systems
  4.3 Problem Transformation Services
    4.3.1 Clause Normal Form Generators
    4.3.2 Higher-Order to First-Order Translation
  4.4 First-Order Automated Theorem Proving Services
    4.4.1 First-order Theorem Proving Problems
    4.4.2 An Ontology of ATP Statuses
    4.4.3 Results of First-Order ATP Systems
    4.4.4 Specialist Problem Classes and System Performance
    4.4.5 ATP Services in OWL-S
  4.5 A Service for TPTP Problem Analysis
  4.6 Finite Model Generation Services
  4.7 Decision Procedure Services
  4.8 Proof Transformation Services
    4.8.1 The Otterfier Service – Transforming CNF Derivations
    4.8.2 The TRAMP Service – Generating Natural Deduction Proofs
  4.9 Summary

5 The MathServe Framework
  5.1 Queries and Composite Services in MathServe
    5.1.1 OWL-S Query Profiles
    5.1.2 Composite Services as Golog Procedures
  5.2 The MathServe Broker
    5.2.1 The Service Registry
    5.2.2 The Query Manager
    5.2.3 The Service Matchmaker
    5.2.4 The Service Composer
    5.2.5 The Pellet Ontology Reasoner
    5.2.6 The Golog Interpreter
    5.2.7 The Automated Theorem Proving Interface
  5.3 System Implementation
  5.4 Availability & Usability
  5.5 Summary

6 Composition of Reasoning Services
  6.1 Requirements on Web Service Composition
  6.2 Analysis of Existing Approaches
    6.2.1 Planning for Web Service Composition
    6.2.2 Golog and the Situation Calculus
    6.2.3 Classical Markov Decision Processes
    6.2.4 Program Synthesis
    6.2.5 Summary
  6.3 Automated Service Composition in MathServe
  6.4 Planning with Deterministic Agent Actions
    6.4.1 The PRODIGY System
    6.4.2 PRODIGY for Web Service Composition
  6.5 Service Profiles and Plans as DTGolog Domains
    6.5.1 Generating DTGolog Action Domains
    6.5.2 Generating a Golog Procedure
  6.6 Computing an Optimal Policy
  6.7 Summary

III Applications and Evaluation

7 MathServe at CADE System Competitions
  7.1 The CADE System Competition
  7.2 Training the MathServe Broker
  7.3 MathServe at CASC-20
    7.3.1 Comparison with the E System
    7.3.2 Comparison with the Vampire System
  7.4 Improving MathServe
  7.5 MathServe at CASC-J3
  7.6 MathServe on SAT Problems
  7.7 Summary

8 MathServe on Higher-Order Problems
  8.1 A Set of Higher-Order Problems
  8.2 Leo and First-Order ATP Systems
  8.3 The LEO Services
  8.4 A Definition and Extensionality Expansion Service
  8.5 Two Composite Services
  8.6 Evaluation of Composite Services
  8.7 Summary

IV Conclusion

9 Conclusions and Related Work
  9.1 Semantic Computation Services
  9.2 Online Access to ATP Systems
  9.3 Optimal Choice of Reasoning Systems

10 Limitations and Future Work
  10.1 Limitations of MathServe
    10.1.1 Management of Time Resources
    10.1.2 Concurrent Service Invocations
    10.1.3 Reasoning Services with States
    10.1.4 Limitations of First-order Theorem Provers
  10.2 Future Work
    10.2.1 Management of Time Resources
    10.2.2 Concurrency
    10.2.3 Stateful Reasoning Services
    10.2.4 Extending the Range of Reasoning Services
    10.2.5 Networks of MathServe Brokers
    10.2.6 Online Update of System Performance
  10.3 Summary

Bibliography

V Appendices

A The MathServe Domain Ontology

B Statuses of ATP Problems

C First-Order ATP Systems and their Performance
  C.1 ATP Systems in MathServe
  C.2 Performance of the ATP Systems

D An OWL-S Service Description

E Translation Functions for Web Service Composition
  E.1 Translation to Planning Domains and Problems
  E.2 Translation to the Situation Calculus

F Planning and Situation Calculus Domains
  F.1 The PRODIGY Planning Domain for ResultQuery
  F.2 A Stochastic situation calculus Domain

List of Figures

List of Tables

List of Acronyms and Symbols

Index

Kurzzusammenfassung

This thesis describes the design and implementation of the MathServe system. MathServe is based on a service-oriented architecture and offers the functionality of automated deduction systems and related systems as Semantic Web Services. The semantics of these Web Services is described using the OWL-S ontology. Information about the performance of services on different problem classes is described by conditional probabilistic effects in OWL-S service profiles. This information is used to find the most suitable service for a given proof problem. The MathServe broker is a specialised software agent that offers service matchmaking and composition services. Our approach to service composition combines classical planning with decision-theoretic reasoning in the situation calculus. Composite services are described by Golog procedures and can be read and edited very easily by human users. MathServe has been evaluated successfully both on a set of proof problems in higher-order logic and in the demonstration divisions of two CADE competitions for theorem provers (CASC). This work represents a significant contribution to the development of a Mathematical Semantic Web. It also presents new contributions in the fields of automated theorem proving and the automated composition of Web Services.

Abstract

In this thesis we describe the design and implementation of the MathServe framework for semantic reasoning services. MathServe is based on a service-oriented architecture and integrates automated reasoning systems and related systems as Semantic Web Services. The semantics of MathServe’s reasoning Web Services is described using the OWL-S upper ontology. Data about the performance of reasoning services is provided in OWL-S service profiles as conditional probabilistic effects. This data can be used to select suitable services for specialised reasoning tasks. The MathServe broker is a middle agent which provides service matchmaking and composition facilities. Service composition in MathServe combines classical planning and decision-theoretic reasoning in the situation calculus language Golog. Composite reasoning services are represented, in a human-readable way, as Golog procedures. MathServe has been evaluated successfully in the demonstration divisions of two CADE ATP System Competitions (CASC) and on a set of higher-order conjectures about sets, relations and functions. The work presented in this thesis contributes to the ongoing development of a Mathematical Semantic Web, in which mathematical objects, documents and services are described with the help of machine-processable and machine-understandable semantic meta-data. It also makes contributions to automated reasoning research and to research on automated Web Service composition.

Acknowledgements

First of all I would like to thank my supervisor Jörg Siekmann for giving me the opportunity to work in the Ωmega group for all those years. It was Jörg’s unvarying enthusiasm for AI that first brought me into the field.

For many fruitful discussions and valuable feedback I would like to thank all my colleagues in the Ωmega group at Saarland University. They always worked together as a team and provided a perfect working environment. My special thanks go to my co-supervisors Christoph Benzmüller and Serge Autexier, who accompanied me in my everyday work and proof-read substantial parts of this thesis.

I am also grateful to the European Union Research Training Network Calculemus, which allowed me to visit many outstanding research groups in Europe and made me realise that I have always felt like a European rather than a citizen of any nation state.

I would also like to express my gratitude to Alan Bundy, whose unmatched dedication to his students and research made my stay at the University of Edinburgh a most pleasant experience.

I thank Geoff Sutcliffe for his invaluable support. Without his kind and unwavering help, the work presented in this thesis would not have been possible. Over the years, I have sent approximately 300 e-mails to Geoff to discuss the subtleties of first-order ATP systems and the TPTP Library, and I always got an answer straight away.

Above all, I thank my wife, Sandra, for her continuing emotional and practical support over recent years. Last but not least, I thank our son, André, for all the joy he has brought to our life.

Part I

Preliminaries

Chapter 1

Introduction

The Internet and the World Wide Web have changed the way many people live and work. People communicate via electronic mail, talk to friends, listen to radio programs, and do their daily shopping via the Internet and the Web. Furthermore, large parts of global business activity are performed on the Internet. But the Internet has also changed the way research is done. This is particularly true for large-scale scientific endeavours such as the International Human Genome Project (HGP), in which hundreds of researchers from all over the world collaborated to determine the 3 billion base pairs of human DNA. Projects like the HGP would simply be impossible without the communication and data-sharing facilities of the Internet and the Web.

With approximately 10 billion pages available, the Web has become the world’s largest source of information. However, with an ever growing number of Web pages and other resources available on the Web, it becomes increasingly difficult to find the information one is looking for. State-of-the-art search engines are based on syntactic matching only and, more often than not, deliver unsatisfactory results. This is mainly because information on the Web is primarily designed for humans to process and not for machines to interpret meaningfully.

The Semantic Web. The Semantic Web is the vision of a more powerful Web in which information is not only machine-readable but machine-understandable [Berners-Lee et al., 2001]. In the Semantic Web, services and static content are annotated with semantic markup, or semantic meta-data. This allows sophisticated search engines to deliver more accurate search results. Semantic meta-data can also be used by software agents to automatically perform tasks for a user or find exactly the information a user is looking for. Formal ontologies play a key role in the Semantic Web by providing the vocabulary used in semantic markup.
Ontologies describe hierarchies of precisely defined concepts and specify properties of instances of these concepts. In the last decades, research in the field of knowledge representation has produced a plethora of languages for describing ontological structures [Bobrow and Winograd, 1977, Brachman and Schmolze, 1985, Genesereth and Fikes, 1990]. This culminated in the development of families of Description Logics with different expressivities and computational properties [Baader et al., 2002]. In the context of the Semantic Web, the Web Ontology Language (OWL) [Bechhofer et al., 2004] has been recommended by the World Wide Web Consortium. OWL is seamlessly integrated with other Web standards and is designed to be easily processable by machines. It consists of three sublanguages with different expressivities. Of particular interest is the Description Logic fragment OWL-DL, which combines high expressivity with good computational properties for the most important reasoning tasks. Several tools for reasoning on OWL-DL ontologies have already been developed [Sirin et al., 2006, Haarslev et al., 2004].

The OWL language builds upon a hierarchy of less expressive languages. These languages are also part of the Semantic Web and can be arranged in a stack as shown in Figure 1.1. Generally speaking, languages on a higher level in the stack have a more expressive semantics. They are defined by using the languages on the levels below as meta-languages. Uniform Resource Identifiers (URIs) and the Extensible Markup Language (XML) are the basic building blocks on the lowest layer¹. They are followed by the Resource Description Framework (RDF) for simple statements about Web resources, and the RDF-Schema (RDFS) for describing the vocabulary used in RDF documents. It is important to note that the Semantic Web stack has been finalised up to the level of OWL and rule languages. Higher levels, such as an overarching “Logic Framework” and the top layers “Proof” and “Trust”, are still subject to ongoing research.

[Figure 1.1: The Semantic Web stack (adapted from [Kifer et al., 2005]). Layer labels, top to bottom: Trust; Proof; Logic Framework; OWL, Rules; RDF Schema; RDF Core; XML Namespaces; URI, Unicode. The labels Signature and Encryption appear beside the middle layers.]

Despite its obvious benefits, the Semantic Web has not yet reached the desired momentum, and large parts of the Web still consist of classical content [Shadbolt et al., 2006]. One reason for this is surely the fact that producing semantic markup is a time- and money-consuming effort. Another reason is the problem of getting a sufficiently large number of content providers to agree on common ontologies, i.e. on common conceptualisations of a domain. Because it is infeasible to develop one commonly accepted ontology that captures all aspects of human knowledge, Semantic Web researchers started to focus on restricted application domains. As a consequence, the first successful use cases of Semantic Web technology have been presented by specialised user groups (such as the e-science communities). For example, in the life sciences numerous initiatives are continuously developing ontologies for biology, medicine, genomics and related fields².

¹ URIs identify resources in the Web. XML defines a certain class of machine-processable languages. We refer the reader to Chapter 3 for further details.
² See, for example, the Open Biomedical Ontologies initiative at http://obo.sourceforge.net.
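The role of concept hierarchies described above can be illustrated with a small sketch. It is not OWL or RDF tooling, only a toy triple store in plain Python that assumes two predicates, `subClassOf` and `type`; every identifier in it (`FirstOrderProver`, `proverX`, and so on) is invented for illustration. It shows how a machine can derive facts that were never stated explicitly, which is the kind of inference an ontology reasoner performs at much larger scale.

```python
# Toy triple store: (subject, predicate, object) statements in the spirit of RDF.
# All identifiers below are hypothetical and exist only for this illustration.
TRIPLES = {
    ("FirstOrderProver", "subClassOf", "TheoremProver"),
    ("TheoremProver", "subClassOf", "ReasoningSystem"),
    ("proverX", "type", "FirstOrderProver"),
}


def superclasses(cls, triples):
    """Return all classes reachable from cls via subClassOf (transitively)."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(individual, triples):
    """Asserted types of an individual plus every class those types imply."""
    asserted = {o for s, p, o in triples if s == individual and p == "type"}
    result = set(asserted)
    for cls in asserted:
        result |= superclasses(cls, triples)
    return result


if __name__ == "__main__":
    # proverX is only asserted to be a FirstOrderProver; the rest is derived.
    print(sorted(inferred_types("proverX", TRIPLES)))
```

A query for all individuals of the class `ReasoningSystem` would find `proverX` even though that statement was never written down, which is what distinguishes semantic matching from purely syntactic matching.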

Future Change in Mathematical Practice. It is reasonable to expect that the growing influence of the Internet and the Semantic Web will also change the way mathematics is done in the future. To this day, mathematics is mostly done with paper and pencil. Mathematical documents are typically written in the typesetting language LaTeX, which only provides presentation markup and does not capture the structure or meaning of mathematical objects. As a consequence, mathematical documents and objects are not machine-understandable.

Most of mathematics has never been formalised in a logical language or verified by one of the formal reasoning systems developed in the field of automated deduction. Formal methods and logic-based reasoning tools are rarely used by mathematicians. Only computer algebra systems, such as Maple [Char et al., 1992] or Mathematica [Wolfram, 1999], and software packages for numerical computations [Moler, 2006] have been adopted by the mathematical community.

The reasons why formal reasoning systems are not accepted by mathematicians are manifold. On the one hand, reasoning systems use their own idiosyncratic input syntax and can typically only be used by their developers, or by experts in the field. On the other hand, the results of reasoning systems are hard to interpret (and check) and are not accepted by mathematicians as valid arguments. For example, proofs found by automated theorem provers are typically large sequences of very low-level logical inference steps. These proofs are not designed for humans to read or verify. Most mathematicians, however, expect a deep understanding of a proof and rely on a social process in which complex proofs are checked by the mathematical community and not by computers [Millo et al., 1979]. In general, such a process is not possible with complex machine-found proofs³. However, the work presented in this thesis addresses the first problem by simplifying the use of formal reasoning systems and by making them accessible to a wider user community.

The Mathematical Semantic Web. The work presented in this thesis is inspired by the vision of a future mathematician doing mathematics in an integrated, Web-based mathematical development environment rather than using paper and pencil. In such an environment, the world’s mathematical knowledge could be accessed via the Web and automated tools could support the mathematician, for example, by automatically looking for related work, or by automatically checking the correctness of mathematical statements. We envision the basic infrastructure for a mathematical development environment to be a Mathematical Semantic Web in which the semantics of mathematical objects, documents, and services is described with the help of machine-processable and machine-understandable meta-data. As a consequence, Web-based mathematical software systems could be developed that support many mathematical activities, such as writing formal mathematical documents, performing (symbolic or numeric) computations, or proving a theorem. These systems could communicate with human users and amongst themselves by exchanging structured mathematical documents. Commonly accepted document formats would make the context of the communication clear and disambiguate the meaning of the mathematical objects involved. Libraries of formalised mathematical theories would be accessible via networks of data bases.

³ This was, for example, the case for the computer parts of Hales’ proof of the Kepler conjecture.

[Figure 1.2: The Mathematical Semantic Web and an integrated mathematical development environment built on top of it. The MathServe framework contributes by provisioning semantic reasoning services. Labels in the figure: Integrated Mathematical Development Environment; Structured Mathematical Documents (MBase, OMDoc, MathML, OpenMath); Semantic Mathematical Services (Semantic Computation Services, Semantic Reasoning Services, MSDL, OWL-S, Computation Systems (CAS), Logic-Based Reasoning Systems); Semantic Web (OWL, Rules, RDF Schema, RDF Core, XML Namespaces, URI, Unicode). Legend: Contribution of MathServe.]

Figure 1.2 illustrates how the Mathematical Semantic Web can be seen as an extension of the Semantic Web. The building blocks of the Mathematical Semantic Web are divided into three main groups: markup languages for structured mathematical documents, semantic mathematical services, and the classical Semantic Web stack.

In recent years, a significant amount of research has been done on the development of markup languages for structured mathematical documents, with the most prominent languages being the OpenMath standard [Caprotti and Cohen, 1998], content MathML [Carlisle et al., 2003] and, last but not least, the OMDoc format [Kohlhase, 2006]. Specialised data bases for storing and retrieving structured mathematical documents have also been developed. Among the most promising approaches are MBase [Kohlhase and Franke, 2001] and the HELM data base [Asperti et al., 2001].

Research on offering mathematical services on the Web and annotating them with semantic information is still at an early stage. Next to our work, only the projects MathBroker [Schreiner and Caprotti, 2001] and MONET [MONET, 2002] worked in this direction by describing the functionality of computation systems as Web Services. The semantics of these services is described with the help of the proprietary Mathematical Service Description Language (MSDL) [Buswell et al., 2003] and the OpenMath language. Formal (logic-based) reasoning services have not yet been developed. In this thesis, we present the MathServe framework, which is a major step towards closing this gap.

The MathServe Framework. The main contribution of this thesis is the design and implementation of the MathServe framework following the programming paradigm of service-oriented architectures (SOA) [Bieberstein et al., 2005]. In a service-oriented software environment, all components of a software system are modelled as independent services. These services are typically stateless and provide well-defined interfaces that can be called in a standard way without pre-knowledge of the calling application. This enables the (dynamic) creation of applications by combining loosely coupled, interoperable services. The services interoperate based on formal descriptions of their interfaces that are independent of the underlying hardware platform or programming language. Despite the fact that the SOA paradigm has been developed in an industrial context, most of its features are also desirable for the services in a Mathematical Semantic Web as described above.

With a network of independent mathematical services available, tasks can be performed flexibly via the Internet. Semantic descriptions of the services’ interfaces can be used to automatically discover services, and to combine services in case one service alone cannot perform a task. At the same time, services can be re-used and dynamically replaced by other services that offer the same functionality.

The MathServe framework integrates automated reasoning systems and related tools as Semantic Web Services. MathServe complements the work of the projects MathBroker and MONET and contributes to the development of the Mathematical Semantic Web as shown with the honeycomb pattern in Figure 1.2. MathServe has an open architecture and new reasoning services can be integrated by service providers via established Web protocols.

The core reasoning services available in MathServe are provided by automated theorem proving systems, finite model generators, and decision procedures for (decidable fragments of) classical first-order logic with equality.
Furthermore, MathServe offers services for transforming reasoning problems into different formats and formal proofs into different logical calculi. Reasoning services in MathServe are deployed as Web Services, i.e. as processes that are accessible via standard Web protocols. The semantics of these Web Services is described using the Description Logic ontology OWL-S [Martin et al., 2004]. OWL-S defines the basic concepts needed to characterise the inputs, outputs, preconditions and effects of Web Services. It supports semantic descriptions of atomic Web Services, i.e. single processes in the Web, or composite services which combine several more primitive services using standard programming constructs.

MathServe is based on a domain ontology formalised in OWL-DL. This ontology defines the Description Logic concepts and properties needed for the OWL-S service descriptions and the communication with MathServe services. Furthermore, the ontology provides a set of well-defined problem statuses that disambiguate the results delivered by reasoning services. State-of-the-art reasoning tools for OWL-DL, such as the Pellet reasoner [Sirin et al., 2003], are used to derive new knowledge from the asserted domain ontology.

It is unrealistic to expect that reasoning problems can always be solved by a single service available in MathServe. Sometimes, several services have to be composed to solve a problem. MathServe’s service broker is a specialised middle-agent that can automatically combine reasoning services if necessary. Service composition in MathServe is based on a combination of classical AI planning and decision-theoretic reasoning in a stochastic variant of the situation calculus [McCarthy, 1968, Reiter, 2001, Soutchanski, 2003]. Composite reasoning services are represented in the high-level programming language Golog [Levesque et al., 1997]. The MathServe broker maintains a persistent data base of available reasoning services.
New services can easily be registered by service providers. Software applications can ask the broker to automatically find services that match a certain profile or to compose services in case no single service can answer a query.

The performance of several first-order theorem proving systems and decision procedures has been measured on large libraries of problems. The resulting performance data has been added to the OWL-S descriptions of the corresponding reasoning services. This data is used by the MathServe broker to choose the most promising reasoning service for a problem at hand. The MathServe system entered two system competitions associated with the annual Conference on Automated Deduction (CADE) [Sutcliffe, 2006b]. In these competitions, the optimal choice of reasoning services by the MathServe broker was evaluated. At CASC-J3 [Sutcliffe, 2006a], MathServe performed better than any standalone automated theorem proving system participating in the competition. The evaluation also emphasised the stability and flexibility of the MathServe framework and its reasoning services (see Chapter 7).

The MathServe framework is not restricted to classical first-order logic. We also evaluated the system on a set of higher-order conjectures formalised in a variant of Church’s simply-typed λ-calculus. With this evaluation we showed that, in some cases, composite reasoning services can solve more problems than atomic services that perform the same task (see Chapter 8).

Previous Experience. The development of the MathServe framework benefited from our experience with the MathWeb Software Bus [Zimmer and Kohlhase, 2002], a platform in which theorem proving systems, computer algebra systems and other tools are integrated into a networked environment. The MathWeb-SB has been used successfully in several projects.
However, over the years we have identified several shortcomings of the system which make it unsuitable as a platform of reasoning systems in a Mathematical Semantic Web:

-- The MathWeb-SB includes middle agents called MathWeb brokers. Despite their name, these agents do not provide any service brokering facilities but are yellow-page services for available reasoning systems. In the MathWeb-SB, the user has to know exactly which reasoning system to use to solve a problem and how to access that system.
-- The MathWeb-SB does not offer machine-interpretable descriptions of the capabilities of the integrated reasoning systems. Therefore, an automated retrieval or composition of reasoning systems is not possible.
-- Last but not least, the MathWeb-SB is implemented in the concurrent constraint programming language Mozart. The use of this proprietary programming language caused problems with the stability and reliability of the MathWeb-SB.

The new MathServe framework overcomes the first two drawbacks of the MathWeb-SB by offering access to semantic reasoning services rather than reasoning systems. Furthermore, MathServe achieves a high degree of stability and reliability by using established Web technologies and protocols.

1.1 Contributions of this Thesis

In this thesis we investigate the use of state-of-the-art Semantic Web technologies for constructing a framework for distributed automated reasoning services. Our work makes contributions to the field of automated reasoning, the development of the Mathematical Semantic Web, and Semantic Web research.

1.1.1 Contributions to the Mathematical Semantic Web and Automated Reasoning Research

In summary, the contributions to the development of the Mathematical Semantic Web and to automated reasoning research are:

• MathServe, a robust framework for sharing logic-based reasoning services in a Mathematical Semantic Web.
• Formalising the semantics and expertise of reasoning services in the state-of-the-art service description language OWL-S.
• The MathServe broker, a middle-agent that automatically finds and combines reasoning services for solving reasoning problems.
• Optimal choice of reasoning services according to features of the reasoning problems to be solved.
• A Description Logic ontology for disambiguating the results delivered by reasoning services.
• Uniform, application- and user-friendly interfaces to automated reasoning systems.
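The optimal choice of reasoning services listed above can be sketched as a lookup over performance data. The following is a minimal illustration in Python, not the MathServe broker itself: it assumes each service profile carries an estimated success probability per problem class, and the service names, class names and numbers are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceProfile:
    """Hypothetical stand-in for a service profile with performance data."""
    name: str
    # Estimated probability of solving a problem, keyed by problem class.
    success_by_class: dict = field(default_factory=dict)


def choose_service(profiles, problem_class):
    """Pick the profile with the highest success estimate for the class.

    Returns None when no registered profile has data for the class.
    """
    candidates = [p for p in profiles if problem_class in p.success_by_class]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.success_by_class[problem_class])


# Invented example data; real estimates would come from measured performance.
PROFILES = [
    ServiceProfile("prover_a", {"unit_equality": 0.55, "horn": 0.80}),
    ServiceProfile("prover_b", {"unit_equality": 0.70, "horn": 0.60}),
]

if __name__ == "__main__":
    best = choose_service(PROFILES, "unit_equality")
    print(best.name if best else "no suitable service")
```

The sketch selects purely on success probability. A broker that also accounts for time limits or the cost of invoking a service would need a richer utility function than the single number used here.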

1.1.2 Contributions to Semantic Web Research

The main contributions to Semantic Web research can be summarised as follows:

• Semantic descriptions of Web Services with uncertain outcomes and probabilistic effects.
• Automated Web Service composition incorporating services with probabilistic effects. Web Service composition with a combination of classical AI planning and decision-theoretic reasoning in the situation calculus.
• Automatically generated, human-readable representations of composite services in the high-level programming language Golog.
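To give a feeling for how probabilistic effects can guide composition, the sketch below compares candidate service sequences by expected utility. It is a deliberately simplified illustration in Python and not the situation calculus or Golog machinery used in this thesis: it assumes that step outcomes are independent, that all step costs are paid up front, and that a fixed reward is earned only if every step succeeds. All service names, probabilities, costs and the reward are invented.

```python
from math import prod


def expected_utility(plan, reward):
    """Expected utility of a sequential plan.

    plan is a list of (service_name, success_probability, cost) tuples.
    Simplifying assumptions: independent outcomes, all costs paid up front,
    reward earned only if every step succeeds.
    """
    p_success = prod(p for _, p, _ in plan)
    total_cost = sum(c for _, _, c in plan)
    return p_success * reward - total_cost


def best_plan(plans, reward):
    """Return the name of the candidate plan with the highest expected utility."""
    return max(plans, key=lambda name: expected_utility(plans[name], reward))


# Invented candidates: solve a problem directly, or translate it first.
PLANS = {
    "direct": [("ho_prover", 0.5, 1.0)],
    "translate_then_prove": [
        ("ho_to_fo_translator", 0.9, 0.5),
        ("fo_prover", 0.8, 1.0),
    ],
}

if __name__ == "__main__":
    for name, plan in PLANS.items():
        print(name, round(expected_utility(plan, reward=10.0), 3))
    print("chosen:", best_plan(PLANS, reward=10.0))
```

With these invented numbers the composite plan is preferred even though it invokes two services, because its combined success probability outweighs the additional cost.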

1.2 What this Thesis is not about

To prevent misunderstandings, we would like to point the reader to some issues that are not addressed in this thesis.

Logic Translations and Logic Morphisms. Different reasoning systems are typically based on different logics and logical calculi. Thus, results obtained in one system cannot be easily used by another system. For example, valid proofs in a classical logic are not necessarily valid in the intuitionistic variant of that logic. The problem of incompatible logical formalisms is as old as automated reasoning itself and a satisfactory solution has not yet been presented. Among the few projects aiming at a solution for this problem is the Logosphere project. Logosphere is mainly concerned with the design of a logical framework as a representation language for logical formalisms, individual theories, and proofs. It is planned to extend this framework with interfaces to several theorem proving systems (see Section 2.2.5). Mediation between different logical formalisms is a problem with high theoretical and practical complexity and we do not attempt to solve this problem in this thesis. However, if (partial) solutions for the problem become available in the future, they can be easily integrated into the MathServe framework as logic mediation and translation services. Currently, most of the reasoning services available in MathServe are based on classical first-order logic with equality and are compatible with each other.

Formal Verification of Service Compositions. In this thesis, we do not formally prove properties of composite Semantic Web Services as described in [Baader et al., 2005], [Narayanan and McIlraith, 2002] and [Berardi, 2005]. Composite services in MathServe are described in the situation calculus programming language Golog. Although a formal semantics of Golog is provided by the situation calculus, that semantics does not capture the semantics of the processes underlying the service descriptions. For the verification of properties of composite services, other representation formalisms for Semantic Web Services, such as Petri Nets [Narayanan and McIlraith, 2002] or finite state machines [Berardi, 2005], seem to be more appropriate. However, these formalisms are not suitable for practical applications such as MathServe, in which composite services are presented to and edited by humans.

1.3 Outline of this Thesis

In Chapter 2 and Chapter 3 we provide comprehensive introductions to the fields of mechanised reasoning and the Semantic Web, respectively. Readers with sufficient background knowledge in these fields can skip these chapters. However, we point out that Section 3.4 contains important definitions and notations used throughout this thesis. Section 3.1.5 is also important because it introduces the Description Logic syntax used to describe OWL-DL ontologies, concepts and properties. In Chapter 4 we show how different types of reasoning systems are described as Semantic Web Services. We also present the domain ontology underlying these descriptions. The MathServe framework is presented in Chapter 5 with a focus on the MathServe broker which provides service matchmaking and composition facilities. The service composition approach used by the MathServe broker is described separately in Chapter 6. MathServe can automatically choose reasoning services that are most suitable for a given problem. We evaluated this functionality in two CADE system competitions in the domain of automated theorem proving systems. The preparation of MathServe for the system competition and the results of our evaluation are discussed in Chapter 7. In Chapter 8, the advantage of using composite reasoning services is shown on a set of higher-order theorem proving problems. We conclude and discuss related work in Chapter 9. Some limitations of MathServe and possible future work are presented in Chapter 10.

Chapter 2

Mechanised Reasoning in Artificial Intelligence

In this thesis we model general-purpose automated reasoning systems as reasoning services to improve their availability and usability. In Chapter 6 we will show how classical planning and decision-theoretic planning (i.e. two methodologies for reasoning about actions and change) are applied to automatically combine reasoning services. In this chapter we first provide introductions to both general-purpose mechanised reasoning and the specialised field of reasoning about actions and change. Because of the vastness of both fields we focus on those aspects important for the work presented in later chapters. We begin with an introduction to the history of mechanised reasoning.

2.1 History of Mechanised Reasoning

The origin of mechanised reasoning dates back to the 17th century when Leibniz presented his dream of developing a universal formal language (the lingua characteristica universalis) in which to express all human thoughts [Leibniz, 1986]. The vision of Leibniz was that with the help of this language and a formal calculus (the calculus ratiocinator) it would be possible to mechanically resolve every logical dispute by a calculation similar to the calculations performed in arithmetic.

In the 19th century the ideas of Leibniz were picked up by Boole and Frege and the evolution of formal mathematical logic began. Boole introduced propositional or Boolean logic [Boole, 1854]. Frege defined the first formal logic which achieved a separation of syntax and semantics [Frege, 1879]. His language is known today as the first-order predicate calculus [Hodges, 1983, Fitting, 1990].

Besides the development of logical formalisms themselves, the question of the decidability of these formalisms was also of major interest. If a formalism was proved to be undecidable, the questions of semi-decidability and the existence of complete logical calculi had to be answered. Of particular interest at the beginning of the 20th century was the question whether Hilbert’s program could be achieved, i.e. whether all of mathematics could be formalised in axiomatic form, and a proof could be given that this axiomatisation was consistent. At first, this question seemed to have a positive answer, in particular because of Gödel’s completeness theorem for first-order logic [Gödel, 1930]¹. However, it was also Gödel who showed in his famous incompleteness theorems that every formal system expressive enough to formalise arithmetic is incomplete, i.e. there are arithmetical statements which are true in that system but are not provable inside the system.

Later, Church and Turing proved the undecidability of the halting problem [Church, 1936, Turing, 1937]. This set-back to the goals of mechanised reasoning was relativised by Herbrand’s proof of the semi-decidability of first-order logic [Herbrand, 1930]. Herbrand introduced special models which add syntactic aspects to the semantics of quantified formulae. The resulting Herbrand universes and their relationship to arbitrary models of formulae² paved the way for fully automated mechanised reasoning based on computer programs.

Soon, it became clear that an enumeration of the Herbrand universe was not a feasible method for determining the validity of a formula (i.e. the unsatisfiability of its negation). It took several years though until Robinson introduced the resolution calculus [Robinson, 1965]. Roughly speaking, resolution is a least-commitment approach to enumerating the Herbrand universe based on unification. The calculus is particularly suitable for automation, which is why it is the basis of almost all modern automated theorem proving systems. We will discuss resolution and its extensions in further detail in Section 2.2.1.

In parallel to the developments in classical first-order logic, research on the use of set theory as a foundation for mathematics commenced with the work of Cantor [Hallett, 1984]. He defined naive set theory which, later, turned out to allow the formulation of antinomies (e.g., the famous “set of all sets that don’t contain themselves”) which were discovered by Russell. As a consequence, Russell developed the theory of types and used it, together with Whitehead, as a foundation of mathematics in the seminal “Principia Mathematica” [Whitehead and Russell, 1910]. However, Gödel called the vague syntax in the Principia “a considerable step backwards as compared with Frege” [Gödel, 1964]. Church presented a more elegant formulation in the typed λ-calculus [Church, 1940], a formalism to investigate function definition, function application and recursion. The types of the simply typed λ-calculus are either base types, type variables, or function types σ → τ. Lambda calculi with dependent types form the basis of intuitionistic type theory, the calculus of constructions, and the Logic Framework (LF) [Harper et al., 1987].

2.2 Mechanised Reasoning Systems

Research on the development of mechanised reasoning systems, i.e. computer programs performing mechanised reasoning, followed two main directions. One line of research has been concerned with the development of proof assistant systems (also called interactive theorem provers) which allow a human user to develop a proof in a formal environment while the system guarantees soundness of the proof steps. One of the few proof assistant systems based on first-order logic (with induction) is ACL2 [Kaufmann and Moore, 1996]. All other popular systems are based on some variant of higher-order logic, such as PVS [Owre et al., 1996], HOL [Gordon and Melham, 1993] and Isabelle [Paulson, 1994, Nipkow et al., 2002], or on intuitionistic type theory, such as the Coq system [Coq, 2002].

One disadvantage of most proof assistant systems is that proofs have to be derived manually at the very fine-grained level of logical calculi. This can lead to many, often tedious and repetitive user interactions and, typically, little or no automation is provided to the user. One way to overcome this deficit is to add proof automation support inside the system (e.g., in the form of decision procedures as it is done in ACL2) or to integrate specialised external reasoning systems or tactics. The latter approach has been studied extensively in the Ωmega system [Siekmann et al., 2002], a proof assistant based on Church’s simple type theory. The integration of external automated reasoning systems into proof assistants is also one of the main motivations for the work described in this thesis.

Another line of research was concerned with the design and implementation of general-purpose, fully automated, black-box style reasoning systems which try to solve reasoning problems without user intervention. However, these systems are typically not designed to be used by other software applications. Furthermore, it is hard for non-experts to make efficient use of reasoning systems. This is mainly due to the fact that it is difficult for potential users to find information that supports them in selecting the most suitable reasoning system for a specific application. Another problem is that users have to cope with the, often idiosyncratic, concrete input syntax of these systems. Last but not least, it is often hard to interpret the output of automated reasoning systems.

The MathServe framework described in this thesis helps to overcome these difficulties and provides user-friendly, well-defined interfaces to automated reasoning systems as well as semantic descriptions of these interfaces. In what follows we introduce the types of automated reasoning systems integrated in MathServe. We focus on first-order automated theorem proving which plays a central role in MathServe.

¹ Gödel’s completeness theorem states that every valid formula is provable, e.g., using Hilbert’s calculus.
² A formula F in Skolem form is unsatisfiable if and only if there is a finite subset of the Herbrand universe of F which is unsatisfiable.

2.2.1 First-Order Automated Theorem Proving Systems

Modern first-order Automated Theorem Proving systems (ATP systems) are typically based on Robinson’s resolution principle and its extensions, such as the superposition calculus. Typically, ATP systems try to show that a first-order conjecture follows from a set of first-order axioms. For many ATP systems, the proving problem has to be presented in clause normal form³ (CNF), i.e. as a conjunction of disjunctions of literals which is satisfiability-equivalent to the original formula. For this, the formulae are translated into prenex form. Then, existentially quantified variables are replaced with fresh Skolem terms indicating the dependence on universally quantified variables, and the universal quantifiers are dropped. For example, Skolemisation transforms the prenex form formula $\forall x\,\exists y\,\forall z.\,P(x, y, z)$ into $P(x, f_y(x), z)$, where $f_y$ is a new Skolem function symbol. We refer the reader to [Fitting, 1990] for details about this process.

For a set $\{A_1, \ldots, A_n\}$ of axioms and a conjecture $C$, a proof with the resolution calculus consists of a refutation of the clause normal form of $A_1 \wedge \ldots \wedge A_n \wedge \neg C$. The central inference rule for deriving such a refutation (the empty clause) is the binary resolution rule⁴:

\[
\text{Resolution:}\quad
\frac{\{L_1,\ldots,L_n,\neg P\} \qquad \{L'_1,\ldots,L'_m,Q\}}
     {\{L_1,\ldots,L_n,L'_1,\ldots,L'_m\}\sigma}
\quad \text{where } \sigma \text{ is the most general unifier of } P \text{ and } Q
\]

³ The clause normal form is also called conjunctive normal form.

Together with the factoring rule, refutational completeness is achieved, i.e. the empty clause can be derived from all inconsistent clause sets in finite time:

\[
\text{Factoring:}\quad
\frac{\{L_1,\ldots,L_i,\ldots,L_j,\ldots,L_n\}}
     {\{L_1,\ldots,L_i,\ldots,L_{j-1},L_{j+1},\ldots,L_n\}\sigma}
\quad \text{where } \sigma \text{ is the most general unifier of } L_i \text{ and } L_j
\]
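To make the role of the most general unifier in the two rules above concrete, the following is a minimal Python sketch of syntactic unification and of the binary resolution step. The term encoding is an assumption made only for this illustration (variables are strings starting with `?`, constants are other strings, compound terms and atoms are tuples `(symbol, arg1, ...)`, literals are pairs `(sign, atom)`), and the function names are hypothetical; the sketch does not reflect how any particular ATP system implements these rules, and it assumes the two parent clauses have already been renamed apart.

```python
def is_var(t):
    # Assumed convention: variables are strings that start with '?'.
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable is reached.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    # Occurs check: does variable v appear in term t under subst?
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, a, subst) for a in t[1:])
    return False

def unify(s, t):
    """Return a most general unifier of s and t as a dict, or None if none exists."""
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, subst):
                return None
            subst[a] = b
        elif is_var(b):
            if occurs(b, a, subst):
                return None
            subst[b] = a
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and a[0] == b[0] and len(a) == len(b)):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None
    return subst

def apply(t, subst):
    # Apply a substitution to a term or atom.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply(a, subst) for a in t[1:])
    return t

def resolve(c1, c2):
    """All binary resolvents of two clauses given as frozensets of (sign, atom)."""
    resolvents = []
    for s1, a1 in c1:
        for s2, a2 in c2:
            if s1 == s2:
                continue  # need one positive and one negative literal
            sigma = unify(a1, a2)
            if sigma is not None:
                rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                resolvents.append(frozenset((s, apply(a, sigma)) for s, a in rest))
    return resolvents
```

For instance, resolving `{¬P(?x), Q(?x)}` with `{P(a)}` in this encoding yields the single resolvent `{Q(a)}`, and resolving `{Q(a)}` with `{¬Q(a)}` yields the empty clause (an empty `frozenset`).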

Since the introduction of the resolution calculus by Robinson many important refinements have been developed which have significantly improved the strength of ATP systems. For example, the paramodulation rule has been developed to enable the application of equality inside clauses⁵:

\[
\text{Paramodulation:}\quad
\frac{\{L_1,\ldots,L_i[r],\ldots,L_n\} \qquad \{K_1,\ldots,K_m,\, s = t\}}
     {\{L_1,\ldots,L_i[t],\ldots,L_n,K_1,\ldots,K_m\}\sigma}
\quad \text{where } \sigma \text{ is the most general unifier of } r \text{ and } s
\]

Paramodulation was further developed into the superposition calculus, mainly through the work of Bachmair and Ganzinger [Bachmair and Ganzinger, 1994]. Superposition is the calculus employed by most modern ATP systems for solving equational problems⁶. In superposition, the applicability of inference rules is restricted with ordering-based equality handling as developed in the context of the (unfailing) Knuth-Bendix completion [Knuth and Bendix, 1970], while preserving refutational completeness. This requires a well-founded ordering on terms and clauses. The idea behind the superposition rule is to use a positive equation in a clause as a rewrite rule for reducing another clause with respect to the ordering chosen.

One important property of the superposition calculus is the fact that, by introducing the notions of redundant clauses and inferences, one can define a saturated set of clauses on which all inferences are redundant. A saturated set of clauses that does not contain the empty clause is satisfiable, i.e. the original conjecture does not follow from the axioms. However, detecting a saturated set of clauses is a very difficult task. Checking whether a newly derived clause is subsumed by some other clause (and therefore redundant) is NP-complete in general. However, various algorithms have been proposed that produce an acceptable performance in special cases [Gottlob and Leitsch, 1985]. Further improvements in performance can be achieved with clause reduction rules, literal selection functions, and sophisticated term indexing techniques.

Currently, approximately 30 implementations of resolution-based ATP are available. The most promising systems are evaluated at the annual CADE ATP System Competition (CASC) [Pelletier et al., 2002]. Otter [McCune, 1994b] was one of the first ATP systems with considerable strength. However, the leading first-order ATP systems of today are DCTP [Letz and Stenz, 2001a], E [Schulz, 2001], Paradox [Claessen and Sörensson, 2003], SPASS [Weidenbach et al., 1999], Vampire [Riazanov and Voronkov, 2002] and Waldmeister [Hillenbrand et al., 1999].

⁴ Clauses are often presented as sets.
⁵ The notation L[r] indicates a literal which contains r as a sub-term at a specific position.
⁶ Non-equational problems can be easily transformed into equational problems.
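The Skolemisation step described at the beginning of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula is already in prenex form, the quantifier prefix is given as a list of hypothetical `("forall" | "exists", variable)` pairs, the matrix is a nested tuple `(symbol, arg1, ...)`, and the Skolem symbol names `f_<var>` and `c_<var>` are invented here and assumed not to clash with existing symbols.

```python
def skolemise(prefix, matrix):
    """Replace existential variables by Skolem terms and drop all quantifiers.

    prefix: list of (quantifier, variable) pairs, outermost quantifier first.
    matrix: quantifier-free formula as a nested tuple (symbol, arg1, ...).
    """
    universals = []   # universally quantified variables seen so far
    skolem = {}       # existential variable -> Skolem term
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        elif universals:
            # Skolem function applied to all enclosing universal variables
            skolem[var] = ("f_" + var, *universals)
        else:
            # no enclosing universal quantifier: a Skolem constant suffices
            skolem[var] = "c_" + var

    def replace(t):
        if isinstance(t, tuple):
            return (t[0],) + tuple(replace(a) for a in t[1:])
        return skolem.get(t, t)

    return replace(matrix)
```

Applied to the example from the text, `skolemise([("forall", "x"), ("exists", "y"), ("forall", "z")], ("P", "x", "y", "z"))` returns `("P", "x", ("f_y", "x"), "z")`, which corresponds to P(x, f_y(x), z).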

2.2.2 Propositional Satisfiability Solvers

Propositional Satisfiability Solvers (SAT solvers) are systems for solving the NP-complete problem of determining the satisfiability of formulae in propositional logic. This problem occurs frequently in general AI applications and in areas such as planning, model checking, theorem proving, and software verification. A naive enumeration of all possible assignments of truth values to the propositional variables in a problem is generally not feasible. Therefore, SAT solving systems are typically based on the Davis-Putnam-Logemann-Loveland procedure (DPLL) [Davis et al., 1962], a search procedure based on the notions of unit clauses and pure literals.⁷ DPLL is typically implemented as an iterative algorithm with an explicit backtrack stack. Various refinements of the standard DPLL procedure have been proposed in recent years. Different branching heuristics as well as control techniques, such as lookahead, backjumping and restart (while keeping learnt clauses), have been developed to improve the performance of SAT solvers. SAT solving systems are compared at the annual SAT competition⁸. Among the strongest state-of-the-art SAT solvers are MiniSAT [Eén and Sörensson, 2004], zChaff [Moskewicz et al., 2001] and March_eq [Heule et al., 2005].
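The basic DPLL procedure with unit propagation and pure literal elimination can be sketched as follows. This is a minimal, unoptimised illustration and not the implementation of any of the solvers named above: it is written recursively for brevity (rather than iteratively with an explicit backtrack stack), uses no branching heuristics, learning or restarts, and assumes clauses are given as sets of non-zero integers in the style of the DIMACS format, where `-v` denotes the negation of variable `v`.

```python
def dpll(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment {var: bool}, or None."""
    assignment = dict(assignment or {})
    clauses = [frozenset(c) for c in clauses]
    while True:
        # Simplify the clause set under the current assignment.
        simplified = []
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            rest = frozenset(l for l in clause if abs(l) not in assignment)
            if not rest:
                return None  # all literals false: conflict
            simplified.append(rest)
        clauses = simplified
        if not clauses:
            return assignment  # every clause is satisfied

        # Unit rule: a clause with a single literal forces that literal.
        unit = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        if unit is not None:
            assignment[abs(unit)] = unit > 0
            continue

        # Pure literal rule: a literal whose complement never occurs can be set true.
        literals = {l for c in clauses for l in c}
        pure = next((l for l in literals if -l not in literals), None)
        if pure is not None:
            assignment[abs(pure)] = pure > 0
            continue
        break

    # Branch on some literal of the first open clause, trying both truth values.
    lit = next(iter(clauses[0]))
    for value in (lit > 0, not (lit > 0)):
        result = dpll(clauses, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None
```

Variables that do not appear in the returned dictionary may be assigned either truth value. For example, `dpll([{1}, {-1}])` returns `None`, since the two unit clauses contradict each other.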

2.2.3 Decision Procedures

Decision procedures fill the gap between (semi-decidable) full first-order theorem proving on the one hand and (decidable) propositional satisfiability checking on the other. This is why they can be found at the core of many verification systems. Decision procedures solve the problem of determining the satisfiability of first-order formulae modulo a decidable background theory, the so-called SMT problem. They work on decidable fragments of first-order logic such as the theories of lists and arrays, Presburger Arithmetic [Cooper, 1972, Hodes, 1972], FO² [Mortimer, 1975]⁹, or the Guarded Fragment [Andréka et al., 1998]. Single decidable theories are typically not expressive enough for the specification of non-trivial programs and systems. For example, proving the formula

\[
f(f(x) - f(y)) \neq f(z) \;\wedge\; y \leq x \;\wedge\; y \geq x + z \;\wedge\; z \geq 0
\]

to be unsatisfiable involves reasoning in the combination of several theories including equality over uninterpreted terms, arithmetic, and the Boolean theory. This is why the problem of combining decision procedures for decidable theories has always attracted great interest. Theorem proving based on cooperating decision procedures was first proposed by Nelson and Oppen in 1979 [Nelson and Oppen, 1979]. An alternative way of combining decision procedures was presented by Shostak [Shostak, 1984, Rueß and Shankar, 2001]. Later Boyer and Moore integrated a decision procedure for linear arithmetic in the Nqthm theorem prover [Boyer and Moore, 1979, Boyer and Moore, 1988]. The decision procedure was proved to be sound in the theorem prover itself.

Originally, decision procedures were special purpose algorithms developed for each decidable theory, such as the SUP-INF method of Bledsoe [Shostak, 1977]. It turned out later that efficient techniques from other areas of automated reasoning could be used to solve SMT problems. Satisfiability modulo decidable theories can be seen as an extended form of propositional satisfiability, where propositions are either simple Boolean propositions or constraints in a specific theory. Therefore extensions of SAT techniques, such as DPLL(T) [Ganzinger et al., 2004], can be used as decision procedures. Joyner showed that resolution-style theorem proving can be used as a decision procedure for a large class of clause sets corresponding to formulae in decidable theories [Joyner, 1976]. As a consequence, resolution-based methods also play an important role in modern SMT systems [Fermüller et al., 2001].

Since the year 2002, the strength of systems combining decision procedures is compared at the annual SMT-Competition [Barrett et al., 2006]. Among the most successful standalone reasoning systems based on the combination of decision procedures are the MathSAT system [Bozzano et al., 2005], the Barcelogic Toolkit [Nieuwenhuis and Oliveras, 2005], Yices [Dutertre and de Moura, 2006], and CVC Lite [Barrett and Berezin, 2004].

⁷ A unit clause contains only one literal. Pure literals are literals which occur only positively (or only negatively) in the set of unresolved clauses.
⁸ See http://www.satcompetition.org.
⁹ FO² is the language of first-order sentences with at most two distinct variables.
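To illustrate why the example formula of this section needs several theories at once, its unsatisfiability can be derived by hand as follows. The derivation is only a sketch of the reasoning involved (it assumes the usual interpretation of $-$, $+$, $\leq$ and $\geq$ over the integers or reals and treats $f$ as an uninterpreted function); it does not describe the procedure used by any particular SMT system.

\begin{align*}
y \leq x \;\wedge\; y \geq x + z &\;\Rightarrow\; x + z \leq x \;\Rightarrow\; z \leq 0 && \text{(arithmetic)}\\
z \leq 0 \;\wedge\; z \geq 0 &\;\Rightarrow\; z = 0 && \text{(arithmetic)}\\
y \geq x + z \;\wedge\; z = 0 \;\wedge\; y \leq x &\;\Rightarrow\; x = y && \text{(arithmetic)}\\
x = y &\;\Rightarrow\; f(x) = f(y) \;\Rightarrow\; f(x) - f(y) = 0 && \text{(congruence, arithmetic)}\\
f(x) - f(y) = 0 \;\wedge\; z = 0 &\;\Rightarrow\; f(f(x) - f(y)) = f(z) && \text{(congruence)}
\end{align*}

The last line contradicts the first conjunct $f(f(x) - f(y)) \neq f(z)$, so the formula has no model. Each step uses either linear arithmetic or the congruence property of equality, and the equalities derived in one theory ($z = 0$, $x = y$) are exactly the facts that have to be passed to the other.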

2.2.4 Finite Model Generators

Finite model generation is concerned with the problem of finding models for satisfiable first-order clause sets. Model generation has many applications in AI, computer science and mathematics. It can be used to show the consistency of a formal specification, to find counterexamples to false conjectures, to solve open mathematical problems or to guide resolution-based theorem provers. There are two main approaches to tackle the model generation problem. They are named after the first systems they have been implemented in, namely the model generator MACE [McCune, 1994a, McCune, 2003] and the SEM system [Zhang and Zhang, 1995].

In MACE-style model generation, the problem is reduced to a propositional logic problem by flattening and instantiating the clauses. First-order clauses together with a concrete domain size are transformed into propositional clauses by introducing propositional variables representing the function and predicate tables. A SAT solver is then used to try to solve the resulting problem.

The SEM approach works directly on the first-order representation. A basic search algorithm with backtracking, supported by powerful constraint propagation methods, tries to find satisfying interpretations of function and predicate symbols. The method of symmetry reduction is used to avoid searching for isomorphic models multiple times. The SEM-style method is particularly suitable for equational problems because equality can be exploited by the constraint propagation module. Next to the SEM system, the systems FINDER [Slaney, 1995] and Gandalf implement SEM-style methods or modifications thereof.
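The instantiation step of the MACE-style approach can be illustrated with a small Python sketch. It makes several simplifying assumptions that are not part of the original systems: clauses are function-free (so no flattening is needed), variables are strings starting with `?`, constants are integers that denote themselves and must be smaller than the domain size, and a naive enumeration of truth assignments stands in for the SAT solver. The function names `ground` and `find_model` are hypothetical.

```python
from itertools import product

def ground(clauses, n):
    """Instantiate every variable with each element of the domain {0, ..., n-1}.

    A clause is a list of literals (sign, (predicate, arg1, ...)).
    """
    grounded = []
    for clause in clauses:
        variables = sorted({a for _, (_, *args) in clause for a in args
                            if isinstance(a, str) and a.startswith("?")})
        for values in product(range(n), repeat=len(variables)):
            env = dict(zip(variables, values))
            grounded.append([(sign, (pred,) + tuple(env.get(a, a) for a in args))
                             for sign, (pred, *args) in clause])
    return grounded

def find_model(clauses, n):
    """Search for a model of domain size n; returns a predicate table or None."""
    grounded = ground(clauses, n)
    # Each ground atom acts as one propositional variable (one predicate table entry).
    atoms = sorted({atom for clause in grounded for _, atom in clause})
    # Naive enumeration of assignments; a SAT solver would be used in practice.
    for bits in product([False, True], repeat=len(atoms)):
        table = dict(zip(atoms, bits))
        if all(any(table[atom] == sign for sign, atom in clause) for clause in grounded):
            return table
    return None
```

As a usage example, the clause set consisting of symmetry of R, the fact R(0, 1) and the negated fact ¬R(0, 0) has a model of size 2, so `find_model` returns a table in which R(0, 1) and R(1, 0) are true and R(0, 0) is false. Such a model is a counterexample to the false conjecture that R(0, 0) follows from symmetry and R(0, 1).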

2.2.5 Proof Transformation Systems

While the generation of formal proofs has attracted increasing interest from research and industry, the transformation of such proofs into different logical calculi or into human-oriented formats has been mostly neglected. However, proof transformation is, for instance, important whenever formal proofs have to be presented to humans. In the context of the proof assistant system Ωmega [Siekmann et al., 2002, Siekmann et al., 2006] the proof transformation system TRAMP has been developed [Meier, 2000]. TRAMP can transform resolution proofs as produced by certain ATP systems into Natural Deduction (ND) proofs [Prawitz, 1965] at the assertion level [Huang, 1994]. In the latter, the classical ND calculus is extended by a one-step application of assertions. For example, at the assertion level, the application of the definition of the subset relation ⊆ in a proof about sets consists of one inference step:

\[
\frac{a \in F \qquad F \subseteq G}{a \in G}\;\subseteq\text{-Def}
\]
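For comparison, the same step expressed in the plain ND calculus needs several inferences. The following derivation is a sketch that assumes the usual definition $F \subseteq G :\Leftrightarrow \forall x.\,(x \in F \rightarrow x \in G)$; the rule names are chosen here only for illustration:

\[
\dfrac{a \in F \qquad
       \dfrac{\dfrac{F \subseteq G}{\forall x.\,(x \in F \rightarrow x \in G)}\;\text{Def-}\subseteq}
             {a \in F \rightarrow a \in G}\;\forall E}
      {a \in G}\;{\rightarrow}E
\]

The assertion-level rule collapses the unfolding of the definition, the $\forall$-elimination and the $\rightarrow$-elimination into a single step.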

As a consequence, the proofs resulting from TRAMP’s transformation are much more abstract and readable than pure ND-calculus proofs. They are particularly suitable for further transformation, for instance, into natural language. We describe the TRAMP system in detail in Section 4.8.2.

Transforming formal proofs into “informal” natural language proofs is particularly suitable for bridging the gap between formal theorem proving systems and non-expert human users of those systems. In Ωmega natural language transformation has been pioneered by the systems PROVERB [Huang and Fiedler, 1996, Huang and Fiedler, 1997] and P.rex [Fiedler, 2001b, Fiedler, 2001a]. The system ClamNL [Alexoudi et al., 2004] has been developed to present induction proofs found by the proof planner CLAM [Bundy et al., 1990] to human users. ClamNL presents proofs at different levels of abstraction using proof plans. Finally, schematic sentences are used to create a natural language proof.

Translating formal proofs into standardised calculi also eases the use of independent proof checkers to ensure soundness of reasoning systems. For resolution proofs of first-order theorems, such a transformation is performed by the Otterfier system [Zimmer et al., 2004]. Otterfier transforms arbitrary resolution proofs (which may include applications of splitting and other prover-specific rules) into resolution proofs which use only binary resolution, factoring, and paramodulation as inference rules. The output of Otterfier can be easily checked by independent proof verification systems¹⁰. The Otterfier system is described in Section 4.8.1.

The Logosphere project¹¹ aims at developing a formal digital library of mathematics. The project is mainly concerned with the design of a logical framework as a representation language for different logical formalisms, individual theories, and proofs, with an interface to theorem proving systems. With the help of the Logic Framework (LF) and logic morphisms, already existing formal libraries and proofs can be translated between different logical formalisms. So far a translation has been realised between the higher-order logic of HOL and the polymorphic, extensional type theory of the Nuprl system [Constable et al., 1986].

¹⁰ A proof verification system for first-order resolution proofs is DVDV, see http://www.cs.miami.edu/~tptp/Seminars/TSTP/SPV.html.
¹¹ See http://www.logosphere.org/.

2.3 Distributed Automated Reasoning

The advent of computer networks and distributed computing motivated the idea of using these new technical facilities in automated reasoning. This is not surprising given that many reasoning systems are search-based programs which have to handle huge search spaces. A distributed search and the parallel use of different search heuristics can therefore improve the overall success of reasoning systems dramatically. In this section, we briefly introduce existing frameworks for distributed automated reasoning and frameworks for the integration of reasoning systems in networked environments.

2.3.1 Frameworks for Distributed Automated Reasoning

The TECHS system [Denzinger and Dahn, 1998] builds on Teamwork technology [Denzinger, 1993, Avenhaus and Denzinger, 1993] and combines heterogeneous state-of-the-art theorem provers while minimising the changes that have to be made to these provers. TECHS combines the ATP systems SPASS and DISCOUNT and the tableau-based prover SETHEO that communicate by exchanging clauses. The provers perform two kinds of cooperation. They send requests for needed information (demand-driven) and autonomously send information they found useful to all other agents (success-driven). In order to reduce the communication overhead several heuristics are used to select the clauses that are sent to other agents. TECHS has also been combined with the ILF environment to allow interactive cooperation of a human user with the prover network.

Clause-Diffusion [Bonacina and Hsiang, 1995, Bonacina, 2000] is a general theoretical framework for distributed deduction. A major principle inspiring Clause-Diffusion is distributed search: the search space of the theorem proving problem is subdivided among concurrent deductive processes. The processes are asynchronous and there is no master-slave organisation: each process has its own database and generates its own derivation. A distributed derivation is obtained from the derivations generated by the processes. The distributed search stops successfully as soon as one of the distributed processes succeeds. A Clause-Diffusion strategy is defined by an inference system and a parallel search plan, which is executed by each process. The parallel search plan controls the selection of the inferences, as in sequential search, their subdivision, and the diffusion of clauses by broadcasting. A key feature of Clause-Diffusion is that there is no need for a master or top-level scheduler to distribute the work: the subdivision is done dynamically by all processes, each of them working independently.

2.3.2 Frameworks for the Integration of Reasoning Systems The Prosper project [Dennis et al., 2000] aims at developing the technology needed to deliver the benefits of formal specification and verification to system designers in 2.3 Distributed Automated Reasoning 21 industry. The central idea of Prosper is that of a proof engine (a custom-built verifi- cation engine) which can be accessed by applications via an application programming interface (API). This API is designed to support the construction of CASE-tools12 incorporating user-friendly access to formal techniques. Much work of the Prosper project went into the definition and implementation of the API, the development of the Prosper-toolkit, and into intensive case studies. The Prosper-toolkit currently integrates the higher-order theorem prover HOL, and the verification system ACL2. The API of Prosper mainly allows the modularisation of CASE software systems and the inter-operation of these modules with proof engines. However, the components of Prosper use proprietary protocols and communication languages. The Logic Broker Architecture (LBA) [Armando and Zini, 2000] has been developed by the Mechanised Reasoning Group at the Universit`adi Genova, Italy. It is based on the communication functionality provided by the Common Object Request Broker Ar- chitecture (Corba) [Siegel, 1996] and the OpenMath standard [Caprotti and Cohen, 1998]. In the 1990s, Corba was a widely used standard for the development of robust, platform and language independent client-server applications. The LBA provides the infrastructure needed for making mechanised reasoning systems inter-operate by a sim- ple registration/subscription mechanism. Additionally, it offers location transparency and a translation mechanism which ensures the transparent and provably sound ex- change of logical services. 
It was planned to ensure the logical correctness of the interaction of services by using a logic service matcher which implements a morphism between the logic of a client reasoning system and the logic of a server (see [Armando and Zini, 2000] for details). However, this line of research was not pursued any further. Because of this and because of the low performance of the Corba middle-ware the LBA remained a research prototype which was not used by other projects. The Ω-Ants theorem proving approach [Benzm¨uller and Sorge, 2000] is based on the homonymous command suggestion mechanism within the Ωmega proof assistant. The core of Ω-Ants is a central hierarchy of a suggestion-blackboard and several command blackboards. Command argument agents write information about a central proof data structure on a corresponding command blackboard. Other argument agents are triggered by this information and in turn write their suggestions to the command blackboard. The suggestion agents read from the argument-blackboards and finally write command suggestions on the suggestion blackboard. A human user can then select one of the commands suggested in the current proof state. Benzm¨uller and Sorge enhanced this process to a full automated theorem proving procedure by adding an agent which automatically selects a command and stores all other selections for possible backtracking. External reasoning systems are also integrated into the proof search with specialised agents. For this, Ω-Ants relies on the MathWeb Software Bus (described in the next section). Although the principles of Ω-Ants apply to other domains, the Ω-Ants suggestion mechanism for reasoning steps is hardwired in the Ωmega system and can, therefore, not be used by other applications or inside other proof assistant systems.

12 CASE: Computer Aided Software Engineering.

2.3.3 The MathWeb Software Bus

In our own research group we developed the MathWeb Software Bus (MathWeb-SB), a platform for distributed automated theorem proving that supports the connection of a wide range of reasoning systems by a common software bus [Franke and Kohlhase, 1999, Zimmer and Kohlhase, 2002]. In the MathWeb-SB, theorem proving systems, computer algebra systems and other tools are homogeneously integrated into a networked proof development environment. The MathWeb-SB also offers a network of middle-agents called MathWeb brokers. However, these agents only provide a yellow-page service for available reasoning systems.

[Figure 2.1 (diagram not reproduced): MathWeb brokers at Edinburgh, Saarbrücken, Eindhoven, Birmingham and Pittsburgh connect MathWeb servers (among them λClam, HR, Spass, SEM, Maple, MBase, Otter and E) with MathWeb clients (among them ActiveMath and Ωmega, written in Java, Mozart or Lisp). Arrows are labelled request, service reference, forward request and (un-)register. Legend: broker-to-broker communication; client-to-broker communication (Mozart, XMLRPC, HTTP); server-to-broker communication (service offers/requests).]

Figure 2.1: The MathWeb Software Bus

Figure 2.1 shows the main components of the MathWeb-SB and their interaction (footnote 13). Local MathWeb brokers run on machines in the Internet and offer reasoning systems. Client applications connect to their local broker and request instances (service objects) of a particular reasoning system. All MathWeb brokers are connected to each other. If a requested reasoning system is not available locally, the request is forwarded to all known remote brokers. Due to the distributed programming model of the implementation language Mozart [Smolka, 1995] (footnote 14), client applications can access remote reasoning systems in the same way as systems available locally.

13 The figure shows the state of the MathWeb-SB in September 2002.
14 Mozart is an implementation of the Oz language and is available at http://www.mozart-oz.org/.

The development of the MathWeb-SB originated in the effort to integrate external reasoning systems into the mathematical proof assistance system Ωmega. But over the years, the services of the MathWeb-SB have also been used by other projects, such as DORIS [Blackburn et al., 1999] and ActiveMath [Melis and Siekmann, 2004]. Despite the successful use of the MathWeb-SB in these projects, it has several drawbacks:

-- In spite of their name, MathWeb brokers do not provide any service brokering facilities. Instead, the user has to know exactly which reasoning system to use to solve a problem and how to access that system.

-- The MathWeb-SB does not offer machine-interpretable descriptions of the capabilities of the integrated reasoning systems. Therefore, an automated retrieval or composition of reasoning systems is not possible.

-- The MathWeb-SB is implemented in the concurrent constraint programming language Mozart. The use of this proprietary programming language caused problems with the stability and reliability of the MathWeb-SB.

-- The proprietary communication protocol of Mozart also made it difficult to write client applications for the MathWeb-SB.

The problems encountered with the MathWeb-SB motivated the development of the MathServe framework described in this thesis. Indeed, MathServe overcomes all the above problems.
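The yellow-page behaviour of MathWeb brokers described in this section (local registration, and forwarding of requests to remote brokers when a system is not available locally) can be illustrated with a minimal Python sketch. The class `Broker` and its methods are hypothetical names chosen for this illustration; they are not the MathWeb-SB interface, which was implemented in Mozart. The sketch also asks remote brokers one after another and stops at the first hit, which is a simplification of forwarding to all known brokers.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical yellow-page broker; an illustration, not the MathWeb-SB API."""
    name: str
    local_services: dict = field(default_factory=dict)  # system name -> factory
    remote_brokers: list = field(default_factory=list)

    def register(self, system, factory):
        # A reasoning system announces itself to its local broker.
        self.local_services[system] = factory

    def unregister(self, system):
        self.local_services.pop(system, None)

    def request(self, system, visited=None):
        """Return a service object for `system`, or None if no broker offers it."""
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if system in self.local_services:
            return self.local_services[system]()
        # Not available locally: forward the request to known remote brokers.
        for remote in self.remote_brokers:
            if remote.name in visited:
                continue  # avoid forwarding cycles between connected brokers
            service = remote.request(system, visited)
            if service is not None:
                return service
        return None


saarbruecken = Broker("Saarbruecken")
edinburgh = Broker("Edinburgh")
saarbruecken.remote_brokers.append(edinburgh)
edinburgh.remote_brokers.append(saarbruecken)
edinburgh.register("Spass", lambda: "Spass service object @ Edinburgh")

found = saarbruecken.request("Spass")
missing = saarbruecken.request("Otter")
```

Note that the client has to name the reasoning system exactly; the broker performs a lookup by name only and does no matching of problems to capabilities, which is the first drawback listed above.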

2.3.4 Computation Systems as Semantic Web Services

Another motivation for the development of MathServe was the research pursued by the projects MathBroker [Schreiner and Caprotti, 2001] (a national project at RISC Linz, Austria) and MONET [MONET, 2002] (a project funded by the European Union). The main goal of these projects was to offer symbolic and numeric computations as services over the Internet. For this, computer algebra systems, such as GAP [GAP, 1998], and numerical computation packages, such as the NAG Library (see http://www.nag.co.uk), were described as Web Services. The Mathematical Service Description Language (MSDL) has been developed to describe the semantics of these Web Services using the OpenMath format. The follow-up project MathBroker-II [Baraka and Schreiner, 2006] investigates brokering techniques for services described in MSDL.

The projects MathBroker and MONET investigated the description of computation systems only. The MathServe framework complements this work by offering logic-based reasoning systems as Semantic Web Services. A comparison between MathServe's approach and the approach of MathBroker and MONET is presented in Chapter 9.

2.4 Reasoning about Actions and Change

Part of the work presented in this thesis is concerned with automated matchmaking and composition of Semantic Web Services. Service composition is typically performed by middle agents, i.e. specialised software agents to which service providers register and which service requesters can use to find and compose available services. The task of these middle agents is to reason about the input and output parameters as well as the preconditions and effects of available services. This is why we regard the problem of Web Service matchmaking and composition as the problem of reasoning about actions – the Web Services – and change – change in a middle agent's knowledge base and the creation of new objects. Newly created objects are, for instance, the proofs found by reasoning systems or the results produced by translation systems.

In this section we provide an introduction to the most prominent methodologies for reasoning about actions and change. In particular, we describe the situation calculus (Section 2.4.1) as well as classical STRIPS-style planning and its derivatives (Section 2.4.3). A variant of STRIPS-style planning will be used in Chapter 6 to automatically compose reasoning services. Since the outcome of many reasoning services (agent actions) is not known in advance, we also present formalisms for reasoning about actions with stochastic effects and in domains with uncertain knowledge (Section 2.4.4). We conclude the section with a discussion of the DTGolog language (Section 2.4.5) which we will use in Chapter 6 to represent composite reasoning services.

2.4.1 The Situation Calculus

The situation calculus [McCarthy, 1968] is a sorted first-order language specifically designed for representing dynamically changing worlds. All changes to the world are the result of named agent actions, i.e. function symbols of the sort action. A possible world history, which is simply a sequence of actions, is represented by a first-order term of the sort situation. The constant S0 is used to denote the initial situation. The execution of actions is modelled using the distinguished binary function symbol do, where do(a, s) denotes the successor situation to s resulting from performing action a in s. All objects that are not actions or situations fall into the sort object.

The values of some relations and functions in a dynamic world will vary from one situation to another. Relations (functions) whose truth (value) varies from situation to situation are called relational (functional) fluents. They are denoted by predicate (function) symbols that take a situation term as their last argument. For example, in a world in which objects can be painted, we might have a functional fluent colour(x, s) which denotes the colour of object x in situation s. Of course, the same information could be expressed using a relational fluent colour(x, c, s).

We distinguish between the foundational axioms of the situation calculus, which define the properties of situations and their relation to actions and the function do, and so-called action theories, which are needed to model a concrete domain. We refer the reader to Reiter's comprehensive introduction to the situation calculus to learn more about the foundational axioms [Reiter, 2001]. In what follows, we focus on basic action theories.

Actions in the situation calculus have preconditions, i.e. requirements that must be satisfied for an action to be executable in a situation. The execution of an action a in a situation s has certain effects on s which lead to a new situation do(a, s). Action preconditions are defined in action precondition axioms using the distinguished predicate symbol Poss. Action effects are defined in action effect axioms. A third type of axioms required are the frame axioms. Frame axioms specify the action invariants of a domain, i.e. those fluents unaffected by the performance of an action. They are needed to solve the frame problem (see Section 2.4.1.2).

Example 2.4.1 To illustrate the different kinds of axioms in a basic action theory we model a situation calculus domain in which robots can pick up and drop objects, and walk around. Furthermore, a bomb can explode and break objects. The domain contains the relational fluents holding(r, x, s) (robot r is holding object x in situation s), nextTo(x, y, s) (x is next to y in s), broken(x, s) (x is broken in s), and onFloor(x, s) (x is lying on the floor in s). Furthermore, objects can be fragile. The truth values of these fluents are influenced by actions performed by the agent. The precondition axioms for the agent actions are the following:

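$$
\begin{aligned}
\mathit{Poss}(\mathit{pickup}(r,x),s) &\equiv \mathit{robot}(r) \land [\forall z.\ \neg \mathit{holding}(r,z,s)] \land \mathit{nextTo}(r,x,s) \\
\mathit{Poss}(\mathit{drop}(r,x),s) &\equiv \mathit{robot}(r) \land \mathit{holding}(r,x,s) \\
\mathit{Poss}(\mathit{walk}(r,y),s) &\equiv \mathit{robot}(r).
\end{aligned}
$$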

Some action effect axioms for the actions drop and explode are

$$
\begin{aligned}
& \mathit{onFloor}(x,\mathit{do}(\mathit{drop}(r,x),s)) \\
& \mathit{fragile}(x) \supset \mathit{broken}(x,\mathit{do}(\mathit{drop}(r,x),s)) \\
& \mathit{nextTo}(b,x,s) \supset \mathit{broken}(x,\mathit{do}(\mathit{explode}(b),s)),
\end{aligned}
$$

i.e. objects are on the floor after a robot dropped them and fragile objects break when being dropped. All objects next to a bomb break if the bomb explodes. The following frame axioms specify that an exploding bomb does not influence the fluent nextTo and that a walking robot keeps holding an object:

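$$
\begin{aligned}
\mathit{nextTo}(x,y,s) &\supset \mathit{nextTo}(x,y,\mathit{do}(\mathit{explode}(b),s)) \\
\mathit{holding}(r,x,s) &\supset \mathit{holding}(r,x,\mathit{do}(\mathit{walk}(r,y),s)).
\end{aligned}
$$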

Typically, a large number of frame axioms is needed to model all action invariants of a domain. However, frame axioms can be combined with action effect axioms into the far more compact successor state axioms. In Section 2.4.1.2 we will show how successor state axioms can be derived systematically from action effect axioms. The successor state axioms for the fluents holding, nextTo, broken, and onFloor of Example 2.4.1 combine the axioms mentioned above and many other frame axioms with all effect axioms:

$$
\begin{aligned}
\mathit{holding}(r,x,\mathit{do}(a,s)) &\equiv a = \mathit{pickup}(r,x) \lor \mathit{holding}(r,x,s) \land a \neq \mathit{drop}(r,x) \\
\mathit{nextTo}(x,y,\mathit{do}(a,s)) &\equiv a = \mathit{walk}(x,y) \lor \exists r.\ [\mathit{nextTo}(r,y,s) \land a = \mathit{drop}(r,x)] \\
&\qquad \lor\ \mathit{nextTo}(x,y,s) \land \neg\exists z.\ [a = \mathit{walk}(x,z) \land z \neq y], \\
\mathit{onFloor}(x,\mathit{do}(a,s)) &\equiv \exists r.\ a = \mathit{drop}(r,x) \lor \mathit{onFloor}(x,s) \land \neg\exists r.\ a = \mathit{pickup}(r,x).
\end{aligned}
$$

The axioms define that a robot r is holding an object x in a situation do(a, s) if it picked up x with action a or it was holding x already before it performed a and it did not drop x. Similarly, an object or robot x is next to an object y if either x walked to y, or a robot r is next to y and dropped x, or x was already next to y in the previous situation and, in case x is a robot, x has not walked away from y. The last axiom states a similar condition for the fluent onFloor.
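To make the reading of these axioms concrete, the following Python sketch evaluates the fluents holding and onFloor by applying their successor state axioms recursively until the initial situation is reached. It is only an illustration under stated assumptions: situations are represented as tuples of actions (with the empty tuple playing the role of S0), actions as tuples such as `("pickup", r, x)`, and the initial facts (nothing is held, a vase lies on the floor) are invented for the example. As in the axioms themselves, action preconditions are not checked.

```python
S0 = ()  # the initial situation: an empty action history


def do(action, situation):
    """Successor situation do(a, s): the history extended by one action."""
    return situation + (action,)


# Assumed facts about the initial situation (hypothetical example data).
INITIAL_HOLDING = set()      # pairs (robot, object)
INITIAL_ON_FLOOR = {"vase"}  # objects lying on the floor


def holding(r, x, s):
    """holding(r,x,do(a,s)) iff a = pickup(r,x), or holding(r,x,s) and a != drop(r,x)."""
    if s == S0:
        return (r, x) in INITIAL_HOLDING
    prev, a = s[:-1], s[-1]
    return a == ("pickup", r, x) or (holding(r, x, prev) and a != ("drop", r, x))


def on_floor(x, s):
    """onFloor(x,do(a,s)) iff some r dropped x, or onFloor(x,s) and nobody picked x up."""
    if s == S0:
        return x in INITIAL_ON_FLOOR
    prev, a = s[:-1], s[-1]
    dropped = a[0] == "drop" and a[2] == x
    picked_up = a[0] == "pickup" and a[2] == x
    return dropped or (on_floor(x, prev) and not picked_up)


s1 = do(("pickup", "r1", "vase"), S0)
s2 = do(("walk", "r1", "door"), s1)
s3 = do(("drop", "r1", "vase"), s2)
```

Here `holding("r1", "vase", s2)` holds because walking is neither a pickup nor a drop, so the truth value is inherited from `s1` without any explicit frame axiom for walk.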

2.4.1.1 Basic Action Theories

We will now present a more general and formal account of the situation calculus action theories which will serve as a schema instantiated later in this thesis. A basic action theory D in the situation calculus is composed of five different axiom sets,

$$
\mathcal{D} = \Sigma \cup \mathcal{D}_{ap} \cup \mathcal{D}_{ss} \cup \mathcal{D}_{una} \cup \mathcal{D}_{S_0},
$$

where $\Sigma$ contains four foundational axioms that define the set of situations as the smallest set containing $S_0$ which is closed under the application of do. They also define equality as well as an ordering on situations.

Action precondition axioms in $\mathcal{D}_{ap}$ name requirements that must be satisfied in a situation for an action to be executable. Action precondition axioms have the general form (footnote 15)

$$
\mathit{Poss}(A(\vec{x}), s) \equiv \Pi_A(\vec{x}, s),
$$

where $\Pi_A(\vec{x}, s)$ is an arbitrary formula that can contain $\vec{x}$ and $s$ as free variables.

The set $\mathcal{D}_{ss}$ contains successor state axioms that determine the values of functional and relational fluents. They are a combination of action effect axioms and frame axioms and are used to solve the frame problem (see next section). For a relational fluent F the successor state axiom can be written in the normal form

$$
F(\vec{x}, \mathit{do}(a,s)) \equiv \gamma_F^+(\vec{x},a,s) \lor (F(\vec{x},s) \land \neg\gamma_F^-(\vec{x},a,s)),
$$

i.e. a fluent F has either become true by executing a in s ($\gamma_F^+(\vec{x},a,s)$) or it was true in s ($F(\vec{x},s)$) and the execution of a did not cause it to become false ($\neg\gamma_F^-(\vec{x},a,s)$). We will see later in this chapter how successor state axioms can be obtained automatically from axioms specifying the effects of actions.

Unique name axioms are needed to distinguish all actions in a domain. For every pair of distinct action names A and B with the same arity, $\mathcal{D}_{una}$ contains two axioms:

$$
A(\vec{x}) \neq B(\vec{x}), \quad\text{and}
$$

$$
A(x_1,\dots,x_n) = A(y_1,\dots,y_n) \supset x_1 = y_1 \land \dots \land x_n = y_n.
$$

$\mathcal{D}_{S_0}$ is a set of first-order sentences that describe the initial situation. Among others, they can contain truths about the domain being modelled that do not depend on the situation. From here on, we focus on action precondition and successor state axioms and simply assume the presence of the foundational axioms and unique name axioms for every domain we model in the situation calculus.

2.4.1.2 General Problems with Actions and Change

Soon after the introduction of the situation calculus by McCarthy, the first problems were detected. One of the most prominent is the frame problem, i.e. the problem of determining which things do not change from one situation to another.

15 In the remainder of this thesis, $\vec{x}$ represents a list $x_1,\dots,x_n$ of variables or terms.

The Frame Problem. The frame problem originally centred on the representational frame problem, i.e. the apparently unavoidable need for a large number of frame axioms, which led to an inelegant and inefficient description of actions. Frame axioms are axioms that specify the action invariants of a domain, i.e. those fluents unaffected by the performance of an action. An example of a frame axiom in the robot domain (see Example 2.4.1) is

$$
\mathit{holding}(r,x,s) \land a \neq \mathit{drop}(r,x) \supset \mathit{holding}(r,x,\mathit{do}(a,s)).
$$

Successor state axioms, such as the ones shown in Example 2.4.1, lead to a more compact representation. However, the problem of obtaining successor state axioms from logical action descriptions remains. In [Reiter, 2001] Reiter presents an elegant way of deriving successor state axioms automatically from action effect axioms under the causal completeness assumption. We will follow Reiter's approach in Section 6.5.1 where we automatically create situation calculus action domains from descriptions of Semantic Web Services. A formal presentation of the latter can be found in Appendix E.

Reiter's approach is based on a normal form of action effect axioms. For example, the effect axioms for the fluent broken(x, s) in Example 2.4.1 are the following:

$$
\begin{aligned}
\mathit{fragile}(x,s) &\supset \mathit{broken}(x,\mathit{do}(\mathit{drop}(r,x),s)) \\
\mathit{nextTo}(b,x,s) &\supset \mathit{broken}(x,\mathit{do}(\mathit{explode}(b),s)).
\end{aligned}
$$

These axioms can be rewritten in the logically equivalent (normal) form

$$
[\exists r.\ a = \mathit{drop}(r,x) \land \mathit{fragile}(x,s) \ \lor\ \exists b.\ a = \mathit{explode}(b) \land \mathit{nextTo}(b,x,s)] \supset \mathit{broken}(x,\mathit{do}(a,s)).
$$

Similarly, the negative effect axiom $\neg \mathit{broken}(x,\mathit{do}(\mathit{repair}(r,x),s))$ can be rewritten to

$$
\exists r.\ a = \mathit{repair}(r,x) \supset \neg \mathit{broken}(x,\mathit{do}(a,s)).
$$

In general, we assume that, for a fluent F, each positive effect axiom has the form

$$
\Phi_F^+ \supset F(\vec{t},\mathit{do}(\alpha,s)), \tag{2.1}
$$

which is equivalent to

$$
a = \alpha \land \vec{x} = \vec{t} \land \Phi_F^+ \supset F(\vec{x},\mathit{do}(a,s)),
$$

where $\vec{x}$ consists of new variables distinct from all free variables in (2.1). If $\{y_1,\dots,y_m\}$ is the set of free variables, except s, in (2.1) (i.e. $\{y_1,\dots,y_m\}$ contains all the variables that are implicitly universally quantified) then the effect axiom is equivalent to

$$
\underbrace{[\exists y_1,\dots,\exists y_m.\ a = \alpha \land \vec{x} = \vec{t} \land \Phi_F^+]}_{=:\,\Psi_F} \supset F(\vec{x},\mathit{do}(a,s)).
$$

Thus, we can write all k positive effect axioms for a fluent F as

$$
\Psi_F^1 \supset F(\vec{x},\mathit{do}(a,s)), \quad \dots, \quad \Psi_F^k \supset F(\vec{x},\mathit{do}(a,s)),
$$

which is equivalent to

$$
\underbrace{\Psi_F^1 \lor \dots \lor \Psi_F^k}_{=:\,\gamma_F^+(\vec{x},a,s)} \supset F(\vec{x},\mathit{do}(a,s)).
$$

A similar transformation can be achieved with the negative effect axioms, which can be summarised in the formula

$$
\underbrace{\Phi_F^1 \lor \dots \lor \Phi_F^l}_{=:\,\gamma_F^-(\vec{x},a,s)} \supset \neg F(\vec{x},\mathit{do}(a,s)).
$$

To be able to obtain successor state axioms from normal form effect axioms we need to rely on the following assumption:

Causal Completeness Assumption: The positive and negative normal form effect axioms for a fluent F have the form

$$
\gamma_F^+(\vec{x},a,s) \supset F(\vec{x},\mathit{do}(a,s)) \quad\text{and} \tag{2.2}
$$

$$
\gamma_F^-(\vec{x},a,s) \supset \neg F(\vec{x},\mathit{do}(a,s)), \quad\text{respectively.} \tag{2.3}
$$

They characterise all conditions under which action a causes a fluent F to become true (respectively, false) in the successor situation. ♦

The causal completeness assumption, together with the unique name axioms in $\mathcal{D}_{una}$, implies the explanation closure axioms

$$
F(\vec{x},s) \land \neg F(\vec{x},\mathit{do}(a,s)) \supset \gamma_F^-(\vec{x},a,s) \quad\text{and} \tag{2.4}
$$

$$
\neg F(\vec{x},s) \land F(\vec{x},\mathit{do}(a,s)) \supset \gamma_F^+(\vec{x},a,s). \tag{2.5}
$$

If T is a first-order theory that is consistent, i.e. if T implies

$$
\neg[\exists \vec{x}, a, s.\ \gamma_F^+(\vec{x},a,s) \land \gamma_F^-(\vec{x},a,s)],
$$

then T also implies that the effect axioms (2.2) and (2.3) together with the explanation closure axioms (2.4) and (2.5) are equivalent to the successor state axiom

$$
F(\vec{x},\mathit{do}(a,s)) \equiv \gamma_F^+(\vec{x},a,s) \lor F(\vec{x},s) \land \neg\gamma_F^-(\vec{x},a,s). \tag{2.6}
$$

Thus, under the assumption of causal completeness, we can mechanically generate successor state axioms from the action effect axioms provided by classical action representations.
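As a worked instance of this construction, consider the fluent broken with the positive and negative effect axioms quoted earlier in this section. Assuming that drop, explode and repair are the only actions that affect broken (this is the causal completeness assumption for this fluent, and fragile is written with a situation argument as in the normal form above), the two conditions and the resulting instance of the successor state axiom (2.6) can be read off as follows:

```latex
\begin{align*}
\gamma^{+}_{\mathit{broken}}(x,a,s) &\;=\;
  [\exists r.\ a = \mathit{drop}(r,x) \land \mathit{fragile}(x,s)]
  \;\lor\; [\exists b.\ a = \mathit{explode}(b) \land \mathit{nextTo}(b,x,s)] \\
\gamma^{-}_{\mathit{broken}}(x,a,s) &\;=\; \exists r.\ a = \mathit{repair}(r,x) \\[4pt]
\mathit{broken}(x,\mathit{do}(a,s)) &\;\equiv\;
  [\exists r.\ a = \mathit{drop}(r,x) \land \mathit{fragile}(x,s)]
  \;\lor\; [\exists b.\ a = \mathit{explode}(b) \land \mathit{nextTo}(b,x,s)] \\
  &\qquad \lor\; \mathit{broken}(x,s) \land \neg\exists r.\ a = \mathit{repair}(r,x)
\end{align*}
```

The consistency condition is satisfied here because, by the unique name axioms, no action can be both a repair and a drop or an explode, so the positive and the negative condition can never hold for the same action.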

The Qualification and Ramification Problem. Two more problems, next to the frame problem, have occurred in the study of reasoning about actions. The qualification problem arises from the difficulty of defining exactly, in a real-world domain, all the circumstances under which a given action is guaranteed to work. For example, picking up an object may not work because the object is slippery, too big to fit in the robot's hand, glued to the floor, and so on. If any of these conditions is not mentioned in the action precondition axioms, an agent might believe that it executed an action although the execution failed (footnote 16).

The ramification problem concerns implicit effects of agent actions. For example, if a robot picks up an object that is covered with dust, it also picks up the dust, or if a robot moves a car from one place to another, it also moves all the parts of the car. This problem is typically solved by axioms which express that dust is stuck to an object and define which parts belong to a car. The movement of dust and car parts is then inferred from the axioms. However, performing this inference efficiently requires special-purpose reasoning systems.

Despite the sometimes elegant solutions to the above problems, the situation calculus did not become the method of choice for reasoning about actions and change. This was mainly due to inefficiencies in determining the truth values of fluents via regression over situations (footnote 17). Instead, classical STRIPS-style planning and its extensions became the most widely used approach to reason about actions and change (see Section 2.4.3).

2.4.2 Golog – High-level programming in the Situation Calculus

With the work of Reiter, Lin and Pirri on the meta-theory of the situation calculus the calculus experienced a renaissance and was finally taken seriously as a foundation for practical work in planning, database updates, agent programming and robotics [Lin and Reiter, 1997, Reiter, 1993, Pirri and Reiter, 1999]. Their work led to the development of the language Golog [Levesque et al., 1997], a situation-calculus-based, high-level programming language for defining complex actions in terms of a set of primitive actions (footnote 18). In Chapters 5 and 6 we will show how composite reasoning services within the MathServe framework can be represented in Golog. Therefore, we present the essential features of the language here.

Golog was initially developed for robot programming by the Cognitive Robotics Group at the University of Toronto. Golog offers the standard control constructs found in most imperative programming languages as well as some specific constructs.

16 A brief discussion of plan execution failure and ways to handle it can be found in Section 2.4.3.1.
17 In the worst case, a fluent's value can only be determined by regressing back to the initial situation S0.
18 The name Golog is an abbreviation for "AlGol in logic" and suggests the availability of Algol-like programming constructs.

Definition 2.1 (Golog Statements and Procedures) Let A be the set of primitive actions defined in a situation calculus action theory D. Then the set of all Golog statements δ based on D is defined as:

$$
\begin{array}{rcll}
\delta & := & a \in A & \text{(primitive action)} \\
 & \mid & \varphi? & \text{(test action)} \\
 & \mid & \delta_1 ; \delta_2 & \text{(sequence)} \\
 & \mid & \delta_1 \ '|'\ \delta_2 & \text{(nondeterministic choice)} \\
 & \mid & (\pi x)\,\delta & \text{(nondet. argument)} \\
 & \mid & \mathbf{if}\ \varphi\ \mathbf{then}\ \delta_1\ \mathbf{else}\ \delta_2\ \mathbf{endIf} & \text{(conditional)} \\
 & \mid & \mathbf{while}\ \varphi\ \mathbf{do}\ \delta\ \mathbf{endWhile} & \text{(loop)} \\
 & \mid & \mathit{bind}(x, y), & \text{(variable binding)}
\end{array}
$$

where ϕ stands for an arbitrary situation calculus formula, and x and y are arbitrary situation calculus variables. The test action ϕ? succeeds if the proposition ϕ is true in the state the action is executed in. An expression δ1 | δ2 allows a Golog interpreter to nondeterministically choose either the program δ1 or the program δ2. The nondeterministic argument choice (πx)δ binds x to an arbitrary object in the current or any previous situation and then evaluates δ, which may contain x as a free variable.

For our purposes we extended Golog with a distinguished action for variable assignment. Executing the action bind(x, y) in situation s leads to a new situation in which x is bound to the value of y (see below).

Golog also supports the definition of procedures that can be used like primitive actions. The set of all Golog programs ρ with procedures p1,...,pn is defined as:

$$
\rho := \mathbf{proc}\ p_1(\vec{v}_1)\ \delta_1\ \mathbf{endProc}\ ;\ \dots\ ;\ \mathbf{proc}\ p_n(\vec{v}_n)\ \delta_n\ \mathbf{endProc}\ ;\ \delta,
$$

where $\delta$ and $\delta_i$ ($1 \le i \le n$) are Golog statements and $\vec{v}_1,\dots,\vec{v}_n$ are the parameters of the procedures. For an action domain D, $\mathit{GOLOG}_D$ denotes the set of all Golog programs based on D. ♦

Example 2.4.2 We illustrate Golog statements and the use of procedures with an example in the blocks world domain. If put(y, x) is the primitive action of putting a block y on a block x then the following program stacks 5 blocks on top of block A if A is free (footnote 19):

    proc stack(x, n)
        if ¬(n = 0) then (πy)(put(y, x)) ; stack(y, n − 1) endIf
    endProc;
    stack(A, 5)

The procedure stack(x, n) non-deterministically chooses a block and stacks it on x ((πy) put(y, x)) if n is bigger than 0. The action put(y, A) will fail (i.e. Poss(put(y, A), S0) is false) if A is not free in the initial situation S0.

19 For the sake of simplicity we assume linear arithmetic to be available.

The Golog language is given an evaluation semantics by macro expansion using a ternary relation Do. More precisely, Do(δ, s, s′) is an abbreviation for a situation calculus formula whose meaning is that s′ is a situation reachable from situation s by one of the sequences of actions specified by the program δ. The possible sequences of actions of a Golog program can be determined from δ by proving the formula ∃s. Do(δ, S0, s) from the axioms of the background theory D. Because Golog programs macro-expand to situation calculus formulae, properties of these programs (such as termination, correctness, etc.) can be proved directly in the situation calculus. To illustrate macro expansion we present the definition of Do for primitive actions as well as sequences and nondeterministic choice of actions:

$$
\begin{aligned}
\mathit{Do}(a,s,s') &:= \mathit{Poss}(a[s],s) \land s' = \mathit{do}(a[s],s), \\
\mathit{Do}(\delta_1 ; \delta_2,s,s') &:= \exists s''.\ \mathit{Do}(\delta_1,s,s'') \land \mathit{Do}(\delta_2,s'',s'), \text{ and} \\
\mathit{Do}(\delta_1 \mid \delta_2,s,s') &:= \mathit{Do}(\delta_1,s,s') \lor \mathit{Do}(\delta_2,s,s').
\end{aligned}
$$

The notation a[s] stands for a with restored situation arguments for all functional fluents in a. For example, if a is goTo(location(Sam)) and location(.) is a functional fluent, then a[s] is the action goTo(location(Sam, s)). For the special variable binding action bind, macro expansion is defined as

$$
\mathit{Do}(\mathit{bind}(x,y),s,s') := x = y \land s' = s.
$$

For a detailed description of the evaluation semantics of Golog we refer the reader to [Levesque et al., 1997] and [Reiter, 2001].

Interpreters for Golog are typically implemented in Prolog. The justification for using Prolog is a fundamental result by Clark about the relationship between a logical theory consisting of axioms, all of which are definitions, and a corresponding Prolog program: whenever the Prolog program succeeds on a sentence, that sentence is logically entailed by the theory; whenever it fails, the negation of the sentence is entailed [Clark, 1978]. As a consequence, the translation of situation calculus basic action theories into Prolog programs is straightforward.

In the literature, a distinction is made between off-line and on-line Golog interpreters. Using an off-line interpreter, executing a program δ amounts to finding a ground situation term σ such that

$$
\mathit{Axioms} \models \mathit{Do}(\delta, S_0, \sigma).
$$

This is done by trying to constructively prove

$$
\mathit{Axioms} \models \exists\sigma.\ \mathit{Do}(\delta, S_0, \sigma).
$$

If a proof is found, a ground term do([a1, a2,...,an], S0) is obtained as a binding for the variable σ. The sequence [a1, a2,...,an] of actions can be sent to the execution module for primitive actions.

In the presence of sensing actions and external stimuli, an off-line interpreter is no longer adequate because the truth of certain conditions is not known before some actions are actually executed in a real world. An on-line Golog interpreter can perform sensing actions to determine the state of the world after executing an action.
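The macro expansion of Do and the off-line search for a situation σ can be imitated in a few lines of Python. The sketch below is not a Golog interpreter and makes several assumptions: programs are nested tuples with the hypothetical tags `"act"`, `"test"`, `"seq"` and `"choice"`; situations are tuples of action names; and the domain (a door that can be opened, closed and entered) is invented for the illustration. The clause for tests follows the informal description in Definition 2.1 (the test succeeds without changing the situation); the clauses for primitive actions, sequence and choice follow the three definitions of Do given in this section.

```python
S0 = ()  # initial situation: empty action history


def is_open(s):
    """Fluent evaluated by regression; the door is assumed closed in S0."""
    if s == S0:
        return False
    prev, a = s[:-1], s[-1]
    return a == "open" or (is_open(prev) and a != "close")


def poss(a, s):
    """Action precondition axioms of the toy domain (assumed for this sketch)."""
    if a == "open":
        return not is_open(s)
    if a in ("close", "enter"):
        return is_open(s)
    return False


def golog_do(prog, s):
    """Yield every situation s2 such that Do(prog, s, s2) holds."""
    kind = prog[0]
    if kind == "act":        # Do(a,s,s') := Poss(a,s) and s' = do(a,s)
        if poss(prog[1], s):
            yield s + (prog[1],)
    elif kind == "test":     # succeeds iff the condition holds; situation unchanged
        if prog[1](s):
            yield s
    elif kind == "seq":      # exists s''. Do(d1,s,s'') and Do(d2,s'',s')
        for mid in golog_do(prog[1], s):
            yield from golog_do(prog[2], mid)
    elif kind == "choice":   # Do(d1,s,s') or Do(d2,s,s')
        yield from golog_do(prog[1], s)
        yield from golog_do(prog[2], s)
    else:
        raise ValueError(f"unknown construct: {kind!r}")


# (enter | (open ; enter)) ; is_open?
program = (
    "seq",
    ("choice", ("act", "enter"), ("seq", ("act", "open"), ("act", "enter"))),
    ("test", is_open),
)
solutions = list(golog_do(program, S0))
```

Each element of `solutions` plays the role of a binding for σ: the action sequence it contains is what an off-line interpreter would hand to the execution module. The first branch of the choice contributes nothing because entering is not possible while the door is closed.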

2.4.3 Classical AI Planning

STRIPS [Fikes et al., 1971] was one of the first planning systems and was developed in the 1970s by Fikes and Nilsson. STRIPS, and most of the planning systems derived from it, perform state-based planning, where states are conjunctions of ground first-order literals. STRIPS-style planners overcome the frame problem by relying on the closed world assumption, which assumes that all literals that are not explicitly mentioned in a planning state are false in that state (footnote 20). Planning operators in STRIPS are modelled with three lists of literals which specify the preconditions of an operator (pre list) as well as the literals added to (add list) and deleted from (delete list) a state when the operator is applied. STRIPS planners have been extensively studied in the blocks world domain, in which a robot arm can pick up blocks and stack them on top of each other, or unstack towers of blocks. A typical operator in the blocks world domain is the following:

    PUTDOWN(x)
        pre:    holding(x)
        delete: holding(x)
        add:    on_table(x), hand_empty

The operator is applicable in a planning state s if the object x is held by the robot arm. If the operator is applied, a new planning state is created in which the precondition is deleted and new literals are added which denote that x is on the table and the robot arm is empty. STRIPS-style planners are given an initial state and a goal state and typically perform backward reasoning on the state space, i.e. they apply planning operators backwards to achieve open goal literals, producing subgoals in a new goal state. This approach has well-known limitations. For example, classical STRIPS planning cannot solve problems with interleaving goals such as the Sussman anomaly.

Partial order planning [Russell and Norvig, 1995, Minton et al., 1994] is a least-commitment planning algorithm and involves searching over the space of possible plans, rather than the space of possible states. Each refinement introduces either a new planning step (an instantiated operator) or an ordering constraint between steps in an existing plan. A threat occurs when a new step deletes the preconditions of another step already in the plan. Threats are resolved by promotion or demotion, which both introduce explicit ordering constraints on the planning steps involved. Unlike total-order planning, such as STRIPS, partial order planning can solve planning problems with interleaving goals.

Hierarchical Task Network planning (HTN planning) [Russell and Norvig, 1995, Erol et al., 1994] is a planning approach that differs from STRIPS planning in that the common sequences of (primitive) operators have to be defined in advance. The description of an HTN planning domain includes a set of primitive operators similar to those of classical planning, and a set of methods, each of which is a prescription for how to decompose complex tasks into more primitive subtasks. Given a planning domain, the description of a planning problem will contain an initial state and a partially ordered set of tasks to accomplish. The objective of an HTN planner is to produce a sequence of primitive operators that perform some activity or task. Planning proceeds by using the methods to decompose tasks recursively into smaller and smaller subtasks, until the planner reaches primitive tasks that can be performed directly using the planning operators. Among the most prominent HTN planning systems are the GIPO-II toolkit [McCluskey et al., 2003], SHOP2 [Nau et al., 2003] and O-Plan [Tate et al., 1996].

20 This is related to the negation by failure approach used in logical programming languages.
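The STRIPS operator model introduced at the beginning of this section (pre list, delete list and add list applied to a state of ground literals) can be sketched in Python as follows. This is an illustrative sketch, not the representation used by any particular planner: the class `Operator`, the helper `putdown` and the encoding of literals as strings are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A ground STRIPS operator given by its three lists of literals."""
    name: str
    pre: frozenset
    delete: frozenset
    add: frozenset

    def applicable(self, state):
        # Closed world assumption: a literal holds iff it is in the state.
        return self.pre <= state

    def apply(self, state):
        if not self.applicable(state):
            raise ValueError(f"{self.name} is not applicable in this state")
        # Everything not deleted persists, so no frame axioms are needed.
        return (state - self.delete) | self.add


def putdown(x):
    """Ground instance of the PUTDOWN(x) operator for block x."""
    return Operator(
        name=f"PUTDOWN({x})",
        pre=frozenset({f"holding({x})"}),
        delete=frozenset({f"holding({x})"}),
        add=frozenset({f"on_table({x})", "hand_empty"}),
    )


state = frozenset({"holding(A)", "on(B,C)"})
new_state = putdown("A").apply(state)
```

The literal `on(B,C)` is carried over to `new_state` simply because it is not on the delete list, which is how STRIPS-style planners sidestep the frame problem.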

2.4.3.1 Contingency Planning vs. Execution Monitoring

Classical planning assumes that the world the planning agent acts in is accessible, static and deterministic. However, in real-world applications, agents have to deal with both incomplete and incorrect information. Furthermore, the effects of agent actions may not be deterministic. Last but not least, exogenous events might change the agent's environment without the agent interfering with it.

There are two major approaches to deal with inaccessible, dynamic and nondeterministic domains: contingency planning and execution monitoring with replanning [Russell and Norvig, 1995]. Contingency planning (also called conditional planning) deals with incomplete information by constructing conditional plans that account for each possible situation (contingency) that could occur. Sensing actions are introduced in the plan to test for appropriate conditions during plan execution. One of the earliest conditional planners was WARPLAN-C [Warren, 1976], a variant of the WARPLAN system. More recent systems for conditional planning are UWL [Etzioni et al., 1992] and CNLP [Peot and Smith, 1992].

Execution monitoring simply tries to execute a classical plan. In the case of plan failure the planning agent uses replanning to achieve its goals from the situation in which the plan failed. The earliest major work on execution monitoring was PLANEX [Fikes et al., 1972] which, together with the STRIPS planner, controlled the robot Shakey. PLANEX used triangle tables to allow recovery from partial execution failure without complete replanning. IPEM [Ambros-Ingerson and Steel, 1988] was the first system to smoothly integrate partial-order planning and plan execution with replanning.

2.4.4 Stochastic Actions and Decision-Theoretic Planning

In many real-world domains the outcome of an action is not known in advance. However, in some domains, the possible outcomes of actions and the probabilities they occur with are known in advance. Actions with these properties are called probabilistic or stochastic agent actions. The execution of a stochastic action leads to one of a set of possible states with a certain probability. Consequently, the notion of a plan changes: while classical planners try to find a plan which will achieve a goal under all circumstances, decision-theoretic planners compute sequences of decision rules which, according to the current situation, suggest actions which maximise the likelihood with which the goal becomes true.
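The notion of a stochastic action can be made concrete with a small Python sketch that propagates a probability distribution over states through a fixed sequence of actions and reports how likely a goal is to hold afterwards. All names and numbers are assumptions made for this illustration: a stochastic action is encoded as a list of (probability, effect) pairs, states are frozensets of facts, and `try_prover` models a hypothetical reasoning service that finds a proof with a given probability. The sketch evaluates a plain action sequence; it does not compute the decision rules of a decision-theoretic planner.

```python
def outcome_distribution(state, action):
    """Return {successor_state: probability} for one stochastic action."""
    dist = {}
    for prob, effect in action:
        nxt = effect(state)
        dist[nxt] = dist.get(nxt, 0.0) + prob
    return dist


def goal_probability(state, actions, goal):
    """Probability that `goal` holds after executing `actions` in order."""
    dist = {state: 1.0}
    for action in actions:
        next_dist = {}
        for s, p in dist.items():
            for s2, q in outcome_distribution(s, action).items():
                next_dist[s2] = next_dist.get(s2, 0.0) + p * q
        dist = next_dist
    return sum(p for s, p in dist.items() if goal(s))


def try_prover(p_success):
    """Hypothetical stochastic action: a prover that succeeds with p_success."""
    return [
        (p_success, lambda s: s | {"proved"}),  # a proof was found
        (1.0 - p_success, lambda s: s),         # no proof, state unchanged
    ]


start = frozenset()
plan = [try_prover(0.6), try_prover(0.5)]
p_goal = goal_probability(start, plan, lambda s: "proved" in s)
```

With these assumed numbers the goal fails only if both provers fail, so `p_goal` is 1 − 0.4 · 0.5 = 0.8 up to floating-point rounding.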

2.4.4.1 Probabilistic STRIPS Planning

Little work has been done on trying to extend classical STRIPS-style planning for actions with probabilistic effects. Hanks and McDermott presented a theoretical framework which uses a probabilistic model to reason about the effects of an agent’s proposed action on a dynamic and uncertain world. Their model computes the projected probability of a proposition being true at a point in time based on evidence provided by a so-called probabilistic causal theory [Hanks and McDermott, 1994]. Although their work has not been implemented, it is seen as the first step towards probabilistic STRIPS.

Kushmerick, Hanks and Weld extended the classical plan representation to handle uncertainty in the initial world state and in the effects of actions [Kushmerick et al., 1995]. They developed the C-BURIDAN planner which constructs a sequence of actions such that executing each action in turn results in a final probability distribution in which the goal expression holds with a sufficiently high likelihood. First experiments with C-BURIDAN on well-known probabilistic planning problems were promising. However, the system only allows purely propositional domain descriptions and non-parameterised actions. This makes it difficult to model domains with complex agent actions. Furthermore, C-BURIDAN assumes that no additional information is provided during plan execution, i.e. it does not support sensing actions and cannot handle exogenous events.

More recently, Kaelbling and colleagues have presented a variant of probabilistic planning which allows an agent to act quickly in a real world. Their work will be discussed in further detail in Section 2.4.4.3 after we have introduced Markov Decision Processes.

2.4.4.2 Markov Decision Processes

For some decision-theoretic problems, a well-defined goal state is not known in advance. Instead, a decision-making agent receives a higher reward for visiting certain states, and less reward for others. In this case, the aim is to maximise the reward the decision maker receives while making decisions rather than reaching a specified goal state. Markov Decision Processes (MDPs) [Puterman, 1994, Boutilier et al., 1999] model probabilistic dynamical systems as discrete-time stochastic control processes. In contrast to symbolic methods, such as probabilistic STRIPS planning, systems are modelled by concrete sets of states²¹. In each state there are several actions from which the decision maker must choose. For every state s and action a, a state transition function determines the probability with which a follow-up state s′ is reached if a is chosen in s.

An MDP is called fully observable if the agent can precisely observe the state resulting from an action execution once it is reached. Another extreme type of MDPs are non-observable MDPs in which the agent receives no information at all about the system state. Between those extremes are partially observable MDPs or POMDPs. In POMDPs the agent has partial information about the state reached by the execution of an action, but the state determined might be wrong. Typically, this is modelled by an explicit observation space of the agent and a function which models the probability that the agent makes certain observations if it executes a certain action in a state.

In our work we consider fully observable MDPs with finite sets of states and actions over a discrete time domain. They can be formally defined as follows:

²¹ Similar to (finite) state machines, every state and action in an MDP is identified with a unique name. States and actions cannot be characterised with logical descriptions as in classical planning.

Definition 2.2 (Finite Fully Observable Markov Decision Process) A fully observable MDP is a five-tuple M = (S, A, T, R, C) with the following components:

-- S is a finite set of states of the system being controlled.

-- A is a finite set of actions the controlling agent can perform.

-- T : S × A × S → [0, 1] is the state transition distribution function.

-- R : S → ℝ is a function that associates a real-valued reward with every state.

-- C : S × A → ℝ is a cost function that determines the cost of executing an action in a certain state.
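
The five components of Definition 2.2 map directly onto a small data structure. The following is a minimal sketch, not part of the original work: the class name `MDP`, its field names, and the two-state toy example are all hypothetical, and the set A_s of available actions is assumed to be given implicitly by the state-action pairs for which a transition distribution is defined.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

State = str
Action = str


@dataclass
class MDP:
    """Finite fully observable MDP M = (S, A, T, R, C) (hypothetical encoding)."""
    states: List[State]
    actions: List[Action]
    # transitions[(s, a)] maps each successor s' to T(s, a, s').
    transitions: Dict[Tuple[State, Action], Dict[State, float]]
    reward: Dict[State, float]
    cost: Dict[Tuple[State, Action], float]

    def available(self, s: State) -> List[Action]:
        """A_s: the actions that have a transition distribution in state s."""
        return [a for a in self.actions if (s, a) in self.transitions]

    def validate(self, tol: float = 1e-9) -> None:
        """Check that every T(s, a, .) is a probability distribution over S."""
        for (s, a), dist in self.transitions.items():
            if s not in self.states or a not in self.actions:
                raise ValueError(f"unknown state or action in {(s, a)}")
            for successor, p in dist.items():
                if successor not in self.states or p < 0:
                    raise ValueError(f"invalid entry T{(s, a, successor)} = {p}")
            if abs(sum(dist.values()) - 1.0) > tol:
                raise ValueError(f"T({s}, {a}, .) does not sum to 1")


# Hypothetical toy system: working succeeds with probability 0.8.
toy = MDP(
    states=["idle", "done"],
    actions=["work", "wait"],
    transitions={
        ("idle", "work"): {"done": 0.8, "idle": 0.2},
        ("idle", "wait"): {"idle": 1.0},
        ("done", "wait"): {"done": 1.0},
    },
    reward={"idle": 0.0, "done": 10.0},
    # Following the sign convention of the text, a negative cost
    # means the agent loses accumulated reward.
    cost={("idle", "work"): -1.0, ("idle", "wait"): 0.0, ("done", "wait"): 0.0},
)
toy.validate()
```

Storing T sparsely, keyed by state-action pairs, keeps the encoding compact when most transitions have probability zero.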

We assume that the state of the system changes if the decision-making agent performs an action a ∈ A. Typically, not all actions in A are executable in a state s ∈ S. Therefore, the set $A_s \subseteq A$ denotes the set of actions available in state s. R(s) is the instantaneous reward (utility) an agent receives if the system enters state s. The cost function C assigns a cost to taking an action in a certain state. We assume that, for an action a and a state s, the cost C(s, a) is negative if the agent loses accumulated reward by executing a.²² The state transition function T and the reward function R are defined such that for all $s_i, s_j \in S$ and $a \in A$,

$$T(s_i, a, s_j) = \Pr(X_{t+1} = s_j \mid X_t = s_i, U_t = a), \text{ and} \tag{2.7}$$

$$R(s_i) = \mathrm{Reward}(X_t = s_i), \tag{2.8}$$

where the random variables $X_t$ and $U_t$ denote the state of the system at time t and the action taken at time t, respectively, i.e.

$$X : \mathbb{N} \to S \quad \text{and} \quad U : \mathbb{N} \to A.$$

$\Pr(X_{t+1} = s_j \mid X_t = s_i, U_t = a)$ denotes the probability that action $a \in A_{s_i}$, when executed in state $s_i$ at time t, leads to a transition of the system to state $s_j$ at time t + 1. The definition of Pr (and thus T) depends on the assumption that the successor state of a system depends solely on the current state and the action chosen and does not depend on any previous states or actions. This assumption is called the Markov Property. The MDP model also assumes that the environment the decision-making agent is acting in is stationary, i.e. the probability function Pr does not change over time.

In this work we consider only finite MDPs, i.e. the sets S of states and A of actions are both finite sets. However, research has also been done on modelling and solving MDPs with infinite state and action spaces [Bertsekas and Shreve, 1978, Hernández-Lerma and Lasserre, 1990].

²² The terms “reward” and “cost” are thus slightly overloaded. Rewards could be negative, in which case “penalty” would be more appropriate. Likewise, costs can be positive (beneficial).

Decision Epochs and Policies

In Decision Theory, the goal of an agent is to maximise the expected reward earned by executing actions over a certain time period. Typically, the decision process is divided into discrete time epochs or stages. The horizon H is the number of decision epochs the agent must plan for. The horizon can be finite, indefinite or infinite. In finite-horizon problems the agent’s performance is evaluated over a fixed, finite number H of stages. In indefinite-horizon problems, the MDP has a clearly defined terminal decision epoch, but the precise value of the horizon may be unknown until the decision process terminates (i.e. it enters an absorbing state). In this case, one cannot choose a fixed finite horizon unless an upper bound for solving a problem is known.

If the horizon H is infinite, one way to guarantee convergence of the sum of earned rewards at different stages is to multiply rewards by a discount factor γ with 0 ≤ γ < 1. The discount factor is also used to indicate how immediate and future rewards should be weighted. Generally, the more delayed a reward is, the smaller its weight will be; a reward earned m steps in the future is scaled down by the factor $\gamma^m$.

A Markov decision problem is an MDP together with an objective function which is supposed to be maximised. Here we restrict our attention to one particular objective function: the expected cumulative discounted reward with discount rate γ. A candidate solution for a Markov decision problem is a policy.
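
As a brief aside on the discount factor introduced above, the convergence claim can be made explicit. The bound $R_{\max}$ on the absolute value of the rewards is an assumption introduced here for illustration; for a finite state set S it always exists. Since $0 \le \gamma < 1$, the discounted sum is dominated by a geometric series:

$$\left| \sum_{t=0}^{\infty} \gamma^t R(X_t) \right| \;\le\; \sum_{t=0}^{\infty} \gamma^t R_{\max} \;=\; \frac{R_{\max}}{1 - \gamma}.$$

For example, with $\gamma = 0.9$ and $R_{\max} = 10$ the discounted sum can never exceed 100 in absolute value, and a reward earned three steps in the future is weighted by $\gamma^3 = 0.729$.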
A policy²³ specifies the decision rule to be used at each stage. Generally, a policy π is a sequence of decision rules $\mu_i$, i.e. $\pi = (\mu_1, \mu_2, \ldots)$. In principle, any computable function can underlie the decision rules $\mu_i$. Here we only regard stationary policies, for which the action chosen by $\mu_t$ at stage t depends only on the state of the process at stage t. Thus, a policy π can be described as a function

$$\pi : S \to A$$

that maps states to actions.

²³ Policies are also called contingency plans, universal plans, or strategies.

Optimality Criteria and Solution Algorithms

Let us now consider the problem of building a policy that maximises the discounted sum of expected rewards over an infinite horizon. Howard [Howard, 1960] showed that there exists an optimal stationary policy for such problems. To be able to compare different policies we employ the notion of a value function. For a fixed policy π we define $V^\pi : S \to \mathbb{R}$ using the following Bellman equations:

$$V^\pi(s) = R(s) + \Big\{ C(s, \pi(s)) + \gamma \sum_{s' \in S} T(s, \pi(s), s') \cdot V^\pi(s') \Big\} \quad \text{for every } s \in S. \tag{2.9}$$

For a policy π, this is a system of |S| linear equations with |S| unknown variables $V^\pi(s)$. Each equation states that the value of a state s is the reward for reaching s plus the cost of the action chosen by π and the discounted expected value of the successor states. Howard showed that the optimal value function $V^*$ for an optimal policy $\pi^*$ satisfies

$$V^*(s) = R(s) + \max_{a \in A} \Big\{ C(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \cdot V^*(s') \Big\} \quad \text{for every } s \in S. \tag{2.10}$$

The optimal value function can be computed using the method of successive approximation by value iteration [Bellman, 1957, Puterman, 1994]. Following that method we begin with the value function $V_0(s)$ that assigns an arbitrary value to each s ∈ S and then define:

$$V_{t+1}(s) = R(s) + \max_{a \in A} \Big\{ C(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \cdot V_t(s') \Big\} \quad \text{for } s \in S \text{ and } t \ge 0. \tag{2.11}$$

This sequence of functions $V_t$ converges linearly to the optimal value function $V^*$. The choice of the maximising action a for each s forms an optimal policy $\pi^*$, i.e.:

$$\pi^*(s) := \operatorname*{argmax}_{a \in A} \Big\{ C(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \cdot V^*(s') \Big\} \quad \text{for every } s \in S.$$

The function $V_n$ approximates the value of $\pi^*$. The number n of iterations is based on a stopping criterion that generally involves measuring the difference between $V_t$ and $V_{t+1}$. We refer the reader to [Puterman, 1994] for a discussion of stopping criteria and the convergence of the algorithm.
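
Value iteration as described in this section can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the implementation used in this work: the function names, the toy system and its numbers are hypothetical, the stopping criterion is a plain threshold on the largest change between successive value functions, every state is assumed to have at least one available action, and the greedy policy is read off using the maximised expression of the optimality equation (cost plus discounted expected value).

```python
from typing import Dict, List, Tuple

State = str
Action = str
Transitions = Dict[Tuple[State, Action], Dict[State, float]]


def available(s: State, T: Transitions) -> List[Action]:
    """A_s: actions with a transition distribution defined in state s."""
    return [a for (s0, a) in T if s0 == s]


def q_value(s, a, V, T, C, gamma) -> float:
    """C(s, a) + gamma * sum over s' of T(s, a, s') * V(s')."""
    return C[(s, a)] + gamma * sum(p * V[nxt] for nxt, p in T[(s, a)].items())


def value_iteration(states, T, R, C, gamma, epsilon=1e-8):
    """Successive approximation; returns (V, greedy policy)."""
    V = {s: 0.0 for s in states}  # V_0: an arbitrary initial value function
    while True:
        V_next = {
            s: R[s] + max(q_value(s, a, V, T, C, gamma) for a in available(s, T))
            for s in states
        }
        delta = max(abs(V_next[s] - V[s]) for s in states)
        V = V_next
        if delta < epsilon:  # assumed stopping criterion
            break
    policy = {
        s: max(available(s, T), key=lambda a, s=s: q_value(s, a, V, T, C, gamma))
        for s in states
    }
    return V, policy


# Hypothetical two-state example.
states = ["idle", "done"]
T = {
    ("idle", "work"): {"done": 0.8, "idle": 0.2},
    ("idle", "wait"): {"idle": 1.0},
    ("done", "wait"): {"done": 1.0},
}
R = {"idle": 0.0, "done": 10.0}
C = {("idle", "work"): -1.0, ("idle", "wait"): 0.0, ("done", "wait"): 0.0}

V, policy = value_iteration(states, T, R, C, gamma=0.9)
print(policy)
```

In this toy system the fixed point can be checked by hand: V(done) = 10 + 0.9 · V(done) gives V(done) = 100, and working in state idle gives V(idle) = 71 / 0.82, which is larger than the value of waiting.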

2.4.4.3 Modelling Complex Systems as MDPs

Classical MDPs require an explicit enumeration of all possible states and actions of the modelled system. For some domains this leads to unnatural and complex representations with a large number of states. Factored MDPs [Boutilier et al., 2000a] can partially overcome this limitation by representing states with sets of propositional variables. Hoey and colleagues combined factored MDPs with algebraic decision diagrams (ADDs²⁴) to build a very efficient solution method for MDPs. Their algorithm has been implemented in the SPUDD system [Hoey et al., 1999] which can handle propositional MDPs with thousands of variables. SPUDD is also one of the few available systems for decision-theoretic planning.

However, many realistic planning domains are best represented in first-order terms, exploiting the existence of domain objects, relations over these objects, and the ability to express goals and action effects using quantification. Existing algorithms for solving MDPs can only be applied to these problems by grounding or “propositionalising” the domain, which can be problematic because the number of propositions grows quickly with the number of domain objects and relation symbols. The number of ground actions also grows dramatically with the domain size. This is why Boutilier and colleagues suggested the use of first-order MDPs in which the logical structure of the planning domain is preserved and exploited for efficient decision-theoretic planning [Boutilier et al., 2001]. Their work and the subsequent work of Reiter will be described in Section 2.4.5.

For a compact representation of MDPs, Kaelbling and colleagues suggest a relational representation of state and action spaces [Gardiol and Kaelbling, 2004]. They show how the dynamics of a domain can be compactly represented in a small set of probabilistic, relational rules. These rules are used by an envelope-based planning approach which quickly finds an optimal policy for a restricted state space. Recently, Pasula, Zettlemoyer and Kaelbling showed how probabilistic rules can be effectively learnt for several (simulated) dynamic domains [Pasula et al., 2007]. In general, the work of Kaelbling et al. strikes a balance between fully ground (discrete) and purely logical (first-order) representations of domains with uncertainty.

²⁴ ADDs extend Binary Decision Diagrams [Drechsler and Becker, 1998] by allowing values from an arbitrary finite domain to be associated with the terminal nodes.
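
The growth caused by grounding, mentioned above, is easy to quantify: a relation symbol of arity k over n domain objects yields n^k ground atoms, and an action schema with k parameters yields n^k ground actions. The sketch below illustrates this with a hypothetical blocks-world-like vocabulary (`on/2`, `clear/1`, `move/3`); the names and arities are assumptions chosen for illustration only.

```python
from typing import Dict


def ground_count(arities: Dict[str, int], num_objects: int) -> int:
    """Number of ground instances of symbols with the given arities."""
    return sum(num_objects ** arity for arity in arities.values())


# Hypothetical vocabulary: two relation symbols and one action schema.
PREDICATES = {"on": 2, "clear": 1}
ACTION_SCHEMAS = {"move": 3}

for n in (3, 10, 20):
    atoms = ground_count(PREDICATES, n)
    actions = ground_count(ACTION_SCHEMAS, n)
    # Every truth assignment to the ground atoms is a candidate state.
    print(f"{n} objects: {atoms} ground atoms, {actions} ground actions, "
          f"up to 2**{atoms} explicit states")
```

Already for ten objects this small vocabulary produces 110 ground atoms and 1000 ground actions, which is the effect that first-order and relational representations try to avoid.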

2.4.4.4 Dynamic Bayesian Networks and Influence Diagrams

Stochastic action theories can also be represented as influence diagrams [Pearl, 1988] and by dynamic Bayesian Networks [Dean and Kanazawa, 1989]. However, Boutilier and Goldszmidt demonstrated that the representation of complex planning problems as Bayesian networks and influence diagrams is not as compact and natural as the specifications provided by logical approaches such as first-order MDPs [Boutilier and Goldszmidt, 1996]. They showed that for a set A of deterministic actions and a set of propositional variables P, a representation of all actions and their effects has size $|A \times P| \cdot 2^m$, where m is the expected number of fluents that are relevant to post-action nodes under any action. This is significantly larger than the size of a situation calculus representation which requires $|A \times P|$ axioms.

2.4.5 DTGolog – Golog and Decision Theory

In the previous section we motivated the representation of complex stochastic planning domains as first-order MDPs. In this section we show how first-order MDPs can be modelled in the situation calculus. Preliminary work on modelling uncertainty and probabilistic effects in the situation calculus was presented by Bacchus [Bacchus et al., 1995] and Poole [Poole, 1996]. Later, Reiter and colleagues showed how to extend the situation calculus to model stochastic action theories [Boutilier et al., 2000b, Reiter, 2001]. In the following sections, we briefly describe the approach of Reiter.

2.4.5.1 Stochastic Actions in the Situation Calculus

Stochastic actions are introduced in the situation calculus with the following representational trick: each stochastic agent action is associated with a finite set of deterministic actions, from which “nature” chooses nondeterministically. Nature’s actions cannot be executed by the agent itself and the agent has no control over which deterministic action nature will choose. Successor state axioms have to be provided for all deterministic actions, including nature’s deterministic actions (i.e. successor state axioms never mention stochastic actions). Whenever a stochastic action is executed, nature chooses one of the associated actions with a specified probability. Basic action theories, as introduced in Section 2.4.1.1, can be extended to stochastic action theories by adding a set $D_{dt}$ of axioms which define a reward function as well as nature’s choices for an action, and the probabilities they are chosen with, i.e.

$$D_p := \Sigma \cup D_{ap} \cup D_{ss} \cup D_{dt} \cup D_{S_0} \cup D_{una}.$$

In general, the set of nature’s choices can depend on the situation they are made in. However, in our work we only regard stochastic actions for which nature’s choices are independent of the situation, i.e. for every stochastic action $a(\vec{x})$ with nature’s choices $c_1(\vec{x}), \ldots, c_n(\vec{x})$, $D_{dt}$ contains an axiom

$$choice(a(\vec{x}), b) \equiv (b = c_1(\vec{x}) \lor \cdots \lor b = c_n(\vec{x})).$$

The probability of a certain choice being made by nature can also depend on the situation the choice is made in. For every stochastic action $a(\vec{x})$ and for every choice $c(\vec{x})$ for $a(\vec{x})$, $D_{dt}$ contains an axiom

$$prob(c(\vec{x}), a(\vec{x}), pr, s) \equiv (\varphi_1(s) \supset pr = P_1) \land \cdots \land (\varphi_m(s) \supset pr = P_m),$$

where the $\varphi_j(s)$, $j = 1, \ldots, m$, are formulae that may contain the situation argument s. In accordance with the classical Kolmogorov axioms for probability theories, the $P_j \in \mathbb{R}^+$ are positive real numbers, i.e.

$$Poss(c_i(\vec{x}), s) \land prob(c_i(\vec{x}), a(\vec{x}), pr_i, s) \supset pr_i > 0, \quad i = 1, \ldots, n.$$

Furthermore, the author of a stochastic action theory has to ensure that these probabilities add to 1 if any of nature’s choices is possible in a situation s. More precisely, the following condition must hold:

$$\Big( \big( Poss(c_1, s) \lor \cdots \lor Poss(c_n, s) \big) \land \bigwedge_{i=1}^{n} prob(c_i(\vec{x}), a(\vec{x}), p_i, s) \Big) \supset \sum_{i=1}^{n} p_i = 1.$$

The reward function in classical MDPs is defined on states, which correspond to situations in the situation calculus. This is why $D_{dt}$ contains axioms of the form

$$reward(n, do(a, s)),$$

where n is an integer²⁵. The axiom states that the reward for performing action a in situation s is n. The reward for the initial situation $S_0$ is assumed to be 0 (i.e. $reward(0, S_0)$).
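
The decomposition of a stochastic action into nature’s deterministic choices, together with the two conditions on the probabilities, can be sketched as follows. This is a simplified illustration and not the axiomatisation itself: the action name `pickup`, the choices `pickupS` and `pickupF`, and their probabilities are hypothetical, the probabilities are assumed to be situation-independent, and all of nature’s choices are assumed to be possible.

```python
import random
from fractions import Fraction
from typing import Dict

# Hypothetical stochastic action with two outcomes chosen by nature.
NATURE_CHOICES: Dict[str, Dict[str, Fraction]] = {
    "pickup": {"pickupS": Fraction(9, 10), "pickupF": Fraction(1, 10)},
}


def check_distribution(choices: Dict[str, Fraction]) -> None:
    """Every probability must be positive and all of them must add to 1."""
    for choice, p in choices.items():
        if p <= 0:
            raise ValueError(f"probability of {choice} is not positive: {p}")
    if sum(choices.values()) != 1:
        raise ValueError("probabilities of nature's choices do not add to 1")


def nature_chooses(stochastic_action: str, rng: random.Random) -> str:
    """Sample the deterministic action that nature executes."""
    choices = NATURE_CHOICES[stochastic_action]
    check_distribution(choices)
    names = list(choices)
    weights = [float(choices[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


print(nature_chooses("pickup", random.Random(0)))
```

Exact fractions are used so that the check for a total probability of 1 is not disturbed by floating-point rounding.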

2.4.5.2 Decision-Theoretic Golog

Programs in decision-theoretic Golog (DTGolog) [Soutchanski, 2003] are Golog programs based on a stochastic action theory as introduced above. DTGolog allows the programmer to specify MDPs in a first-order language, and provide “advice” in the form of high-level programs that constrain the search for policies. A DTGolog program can be regarded as a partially-specified policy. One can write purely nondeterministic DTGolog programs (containing only stochastic primitive actions) that allow an off-line DTGolog interpreter to compute an optimal policy, or purely deterministic programs that leave no decisions to the interpreter.

Policies in classical MDPs are functions which choose a discrete agent action in every state of the system. Policies in DTGolog are conditional Golog statements which select agent actions according to properties of the current situation. After executing a primitive stochastic action, a policy senses the choice made by nature to determine follow-up decisions. More precisely, Soutchanski defines policies in DTGolog as follows:

²⁵ In general, rewards can be real-valued, but for the work presented here, integers are sufficient.

Definition 2.3 (DTGolog Policy) A DTGolog policy for a stochastic action domain Dp is a Golog program inductively defined by:

1. Any deterministic agent action in Dp and the distinguished actions stop and nil are policies.

2. If a is a deterministic action in Dp and π is a policy then (a; π) is a policy.

3. Let α be a stochastic agent action and {n1,...,nk} be nature’s choices associated with α. Let further senseEffect(α) be a procedure sensing the outcome of α, senseCond(n1, ϕ1), ..., senseCond(nk, ϕk) be axioms in Dp associating nature’s choices with formulae ϕ1,...,ϕk, and π1,...,πk be policies. Then each of the following expressions is a policy:

(k = 0) α ; senseEffect(α) ; Stop
(k = 1) α ; senseEffect(α) ; (ϕ1)? ; π1
(k = 2) α ; senseEffect(α) ; if ϕ1 then π1 else (ϕ2)? ; π2 endIf
(k ≥ 3) α ; senseEffect(α) ; if ϕ1 then π1 ··· else if ϕk−1 then πk−1 else (ϕk)? ; πk endIf ... endIf
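
The four cases of the third clause follow one pattern: all but the last of nature’s choices are tested with nested conditionals, and the last one is handled by a test action. The sketch below builds the corresponding policy expression as a plain string. It is only meant to illustrate the shape of such policies; the function `build_policy` and the example arguments are hypothetical and not part of DTGolog.

```python
from typing import List, Tuple


def build_policy(alpha: str, branches: List[Tuple[str, str]]) -> str:
    """Build the policy text for a stochastic action alpha.

    branches holds one (phi_i, pi_i) pair per choice of nature:
    the sensed condition and the policy to follow if it holds.
    """
    prefix = f"{alpha} ; senseEffect({alpha}) ; "
    k = len(branches)
    if k == 0:
        return prefix + "Stop"
    if k == 1:
        phi, pi = branches[0]
        return prefix + f"({phi})? ; {pi}"
    # k >= 2: nested if-then-else over the first k - 1 conditions.
    tests = [f"if {phi} then {pi} else" for phi, pi in branches[:-1]]
    phi_k, pi_k = branches[-1]
    closing = " endIf" * (k - 1)
    return prefix + " ".join(tests) + f" ({phi_k})? ; {pi_k}" + closing


print(build_policy("a", [("phi1", "pi1"), ("phi2", "pi2"), ("phi3", "pi3")]))
```

For three choices the call prints a policy with two nested conditionals and two closing endIf keywords, matching the (k ≥ 3) case.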

It is convenient to wrap policies in a Golog procedure. We call the result a DTGolog policy procedure:

Definition 2.4 (DTGolog Policy Procedure) If π is a DTGolog policy then proc p(~x) π endProc is called a policy procedure for π. ♦

2.4.5.3 Computing an Optimal Policy

DTGolog programs are interpreted relative to a stochastic action theory Dp as defined above. The DTGolog interpreter presented in [Soutchanski, 2003] computes an optimal policy using a variant of directed value iteration (see [Boutilier et al., 1999], pp. 34–36). In DTGolog, an optimal policy is known to start in the initial state (situation) S0. Therefore, the reachability structure of the underlying MDP can be exploited, restricting value iteration to states reachable by some sequence of actions from S0. This form of directed value iteration can be realised by building a decision tree with the root S0. An abstract version of such a decision tree is shown in Figure 2.2. The successor nodes of S0 at level 1 of the tree are all actions possible in S0. The successor nodes of any action node are those states that can be reached with nonzero probability when that action is executed. Deeper levels of the tree are defined recursively in the same way. For an MDP with finite horizon T, the tree is built to level 2T. The value of a node labelled with a state s is the sum of R(s) and the maximum value of its successor (action) nodes. The value of a node labelled with an action is the expected value of its successor (state) nodes. Values are computed with a rollback procedure, whereby values at the leaves of the tree are computed first and then the values at successively higher levels of the tree are determined from the values previously computed. In this way, the value of a state s is exactly V∗(s) and the maximising actions form an optimal policy.

[Figure: decision tree. Level 0: root S0 with V = max(V1, V2). Level 1: action nodes a1 and a2 with V1 = p1 V5 + p2 V6 and V2 = p3 V3 + p4 V4. Level 2: state nodes S1, S2, S3, S4, reached with probabilities p1, p2, p3, p4.]

Figure 2.2: The first three levels of a decision tree for evaluating action choices in situation S0. The value of an action is the expected value of its successor states. The value of a state (situation) is the maximum of the values of its successor actions.

In Soutchanski’s DTGolog interpreter, directed value iteration is implemented using a meta-predicate BestDo(δ,γ,s,h,pol,v,pr) which macro-expands to situation calculus formulae depending on the instructions in δ. The situation calculus formula corresponding to BestDo(δ,γ,s,h,pol,v,pr) is true if pol is the optimal policy computed for program δ, starting from situation s, with a discount factor γ and a horizon h. Furthermore, pr is the probability of successful execution of the policy pol and v is its total accumulated expected reward. On deterministic agent actions the predicate BestDo() behaves similarly to the Golog predicate Do (see Section 2.4.2).
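
The rollback procedure on such a decision tree can be sketched as a short recursive function. This is an illustrative sketch only and not Soutchanski’s interpreter: the function name, the dictionary encoding of the tree and all numbers are hypothetical, and action costs and discounting are left out, as in the description of the tree above.

```python
from typing import Dict, List, Optional, Tuple

State = str
Action = str
# outcomes[s][a] lists the successor states of action a in s
# together with their (nonzero) probabilities.
Outcomes = Dict[State, Dict[Action, List[Tuple[State, float]]]]


def rollback(s: State, horizon: int, outcomes: Outcomes,
             reward: Dict[State, float]) -> Tuple[float, Optional[Action]]:
    """Return the value of state node s and a maximising action."""
    options = outcomes.get(s, {})
    if horizon == 0 or not options:
        return reward[s], None  # leaf of the decision tree
    best_value, best_action = float("-inf"), None
    for a, successors in options.items():
        # Value of an action node: expected value of its successor states.
        expected = sum(p * rollback(nxt, horizon - 1, outcomes, reward)[0]
                       for nxt, p in successors)
        if expected > best_value:
            best_value, best_action = expected, a
    # Value of a state node: R(s) plus the best action value.
    return reward[s] + best_value, best_action


# Hypothetical tree with the shape of the first three levels in Figure 2.2.
outcomes = {"S0": {"a1": [("S1", 0.5), ("S2", 0.5)],
                   "a2": [("S3", 0.9), ("S4", 0.1)]}}
reward = {"S0": 0.0, "S1": 10.0, "S2": 0.0, "S3": 4.0, "S4": 6.0}

value, action = rollback("S0", 1, outcomes, reward)
print(value, action)
```

With these numbers the action a1 has the expected value 0.5 · 10 + 0.5 · 0 = 5, which is larger than the expected value of a2, so a1 is chosen at the root.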

We illustrate the semantics of BestDo by showing its definition for the central case: If a is the first action of a DTGolog program (a ; δ) and a is a stochastic action with nature’s choices {c1,...,cn}, then

$$\begin{aligned}
BestDo((a ; \delta), s, h, \pi, v, pr) \equiv{} & h > 0 \;\land \\
& \exists \pi', v'.\ BestDoAux(\{c_1, \ldots, c_n\}, a, \delta, s, h - 1, \pi', v', pr) \;\land \\
& \pi = (a ; senseEffect(a) ; \pi') \land reward(l, s) \;\land \\
& v = l + v',
\end{aligned}$$

where BestDoAux is a new extra-logical predicate which is true if and only if π′ is the optimal policy for δ in all situations produced by nature’s choices {c1,...,cn}, and v′ is the expected accumulated reward for π′. The evaluation of full DTGolog requires similar definitions of BestDo for all other cases that might occur in a DTGolog program. We refer the reader to [Soutchanski, 2003] for details.

2.5 Summary

In this chapter we first presented an overview of the history of mechanised reasoning in Artificial Intelligence. We then described the most important automated reasoning tools available today, ranging from full first-order ATP systems over decision procedures to propositional satisfiability solvers. A few existing approaches for distributed automated reasoning were also discussed. The focus was then set on automated reasoning about actions and change, which plays an important role in AI in general and in our work in particular. We introduced the situation calculus, the Golog language, and classical AI planning techniques which will all be used in Chapter 6 of this thesis. Finally, we introduced techniques for reasoning about actions with probabilistic effects. We explained why classical (discrete) Markov Decision Processes are not suitable to model the complex domains that occur in our work. First-order MDPs, and in particular their implementation in the DTGolog language, allow for more abstract descriptions and can use the structure of Golog procedures to efficiently compute optimal policies.

Chapter 3

Semantic Web Services

In the last decade, the World Wide Web (WWW) has developed into one of the biggest pools of information. With billions of Web pages and an estimated number of one billion users it has become the default source for information for many individuals and businesses. However, the growing amount of data has also made it more and more difficult to find the desired information. State-of-the-art search engines are based on syntactic matching and often deliver unsatisfactory results. One major reason for this is that most of the content in the Web today is designed for humans to read, not for computer programs to interpret and manipulate meaningfully. This drawback led to the vision of a more powerful Web referred to as the Semantic Web. The Semantic Web aims at making information on the Web not only machine-readable but machine-understandable. Resources in the Semantic Web, e.g. static content or services, are annotated with a layer of machine-understandable, semantic meta-data. This meta-data is based on well-defined concepts and enables automated agents and sophisticated search engines to automatically perform tasks for a user or find exactly the information a user is looking for.

In this chapter we provide a comprehensive introduction to the Semantic Web (Section 3.1) with a focus on Semantic Web Services (Section 3.2). In Section 3.3 we present common approaches to automated composition of Semantic Web Services. We define important concepts and introduce convenient notations for ontologies and semantic service descriptions in Section 3.4. The following sections contain many acronyms commonly used in the Semantic Web literature. For an overview of all acronyms and their meaning we refer the reader to page 265.

3.1 The Semantic Web

The aim of the Semantic Web initiative [Berners-Lee et al., 2001] is to give well-defined meaning to the information in Web pages and other resources available in the Web, such as Web Services and the contents of databases. For this, information is annotated with machine-processable meta-data which specifies the semantics of the data. The vision of the Semantic Web is that this semantic markup can be used to build more powerful search engines which give intelligent answers to queries instead of the often unsatisfactory results delivered by state-of-the-art search engines based on pure syntactic matching. Furthermore, Semantic Web technologies can support intelligent software agents in performing complex tasks for their users. The Semantic Web is based on four main principles:

1. All resources in the Web should be uniquely identified by Uniform Resource Identifiers (URIs) [Berners-Lee et al., 1998].

2. Documents and meta-data are expressed in the Extensible Markup Language (XML) [Bray et al., 2004] which is designed for hard- and software-independent communication across networks.

3. Simple statements about Web resources are expressed in the Resource Description Framework (RDF) [Manola et al., 2004, Beckett, 2004] and in the RDF schema language (RDFS).

4. Ontologies are used for more expressive statements about classes of objects and their properties. Rules based on ontologies can be used to infer new knowledge from already known facts.

The languages needed to realise the Semantic Web can be ordered in a hierarchical way according to their expressiveness. Figure 3.1 depicts this hierarchy. Generally speaking, languages on a higher level in the stack are more expressive than the languages below them. URIs and XML languages are the basic building blocks on the lowest layer. They are followed by the RDF(S) layer for simple statements about properties of resources. The Web Ontology Language (OWL) [Bechhofer et al., 2004] and rule languages are more expressive and add advanced inference capabilities. An overarching “Logic Framework” and the top layers “Proof” and “Trust” do not refer to languages but name issues that are not fully developed yet and are still subject to research. In parallel to the main tower, security aspects such as encryption and electronic signatures are built upon XML. The Semantic Web is still very young and its design is still in flux. Hence, the single stack architecture shown in Figure 3.1 is subject to ongoing discussion [Kifer et al., 2005, Horrocks et al., 2005].

[Figure: layered stack, from top to bottom: Trust; Proof; Logic Framework; OWL and Rules; RDF Schema; RDF Core; XML and Namespaces; URI and Unicode. Signature and Encryption are shown alongside the main stack.]

Figure 3.1: The Semantic Web stack (adapted from [Kifer et al., 2005])

In the following sections we will present some of the building blocks of the above Semantic Web tower in greater detail. However, we will only cover the layers that have already been finalised by the World Wide Web Consortium (W3C), i.e. up to the layer of the Web Ontology Language and inference rules.

3.1.1 The Extensible Markup Language The Extensible Markup Language (XML) describes a class of data objects called XML documents and partially describes the behaviour of computer programs which process them. XML is a restricted form of SGML, the Standard Generalised Markup Lan- guage [Charles F. Goldfarb, 1991]. XML was designed to be easily processable by machines independent of the hard- ware, operating system, or software employed. In recent years, XML has developed into the state-of-the-art language scheme for storing data and transporting data be- tween different applications in a platform-independent way. XML documents are made up of storage units called entities, which contain either parsed or unparsed data1. Parsed data is made up of characters, some of which form character data, and some of which form markup. Markup encodes a description of the document’s storage layout and logical structure. XML provides mechanisms to impose constraints on the storage layout and logical structure, such as Document Type Definitions (DTDs) and XML Schemas [Fallside and Walmsley, 2004]. URIs are closely coupled with XML documents by the XML namespace mechanism. All languages in the Semantic Web are defined by their XML/RDF syntax. How- ever, XML is designed to be easily processable by machines, not by humans. This is why we avoid presenting XML documents whenever possible throughout this the- sis. We will not describe XML in further detail but refer the interested reader to the language specification and introductory texts [Auld et al., 2002] instead.

3.1.2 The Resource Description Framework The Resource Description Framework (RDF) [Manola et al., 2004] is an XML language for representing meta-data about Web resources, such as a Web document’s title, au- thor, modification date, or copyright and licensing information. RDF is intended for situations in which this information needs to be processed by applications rather than humans. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. RDF is based on the idea of identifying Web resources using URIs and describ- ing resources in terms of simple properties and property values. RDF properties may be thought of as attributes of resources and in this sense correspond to traditional attribute-value pairs. Attribute-value pairs can be represented as a graph of nodes and edges representing the resources, their properties, and property values. For in- stance, the RDF graph in Figure 3.2 represents some information about Tim Berners- Lee. The graph contains four nodes representing a resource uniquely identifying Tim Berners-Lee (tbl#me), the string “Tim Berners-Lee”, an e-mail address, and the string

1Roughly speaking, XML entities can be regarded as extended HTML tags. 46 Chapter 3. Semantic Web Services

tbl#me contact:fullName contact:personalTitle

contact:mailbox

"Mr." "Tim Berners−Lee" mailto:[email protected]

xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact" xmlns:tbl="http://www.w3.org/People/TBL/contact"

Figure 3.2: Example RDF graph providing information about Tim Berners-Lee

“Mr.”. The properties contact:fullName, contact:mailbox and contact:personalTitle define how these resources are related to each other. For example, “Tim Berners-Lee” is the full name of the person identified by the resource tbl#me. The XML namespace prefixes tbl and contact represent the namespaces http://www.w3.org/People/TBL/contact and http://www.w3.org/2000/10/swap/pim/contact, respectively.

RDF also provides an XML-based syntax (RDF/XML) for recording and exchanging RDF graphs. The following document corresponds to the graph in Figure 3.2:

[RDF/XML listing; of its content only the literal values "Tim Berners-Lee" and "Mr." are preserved.]
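To illustrate how a graph such as the one in Figure 3.2 can be written down as RDF/XML, the following Python sketch serialises three triples with the standard library. It is only an illustration under stated assumptions: the `#` separator appended to the namespace URIs, the use of `rdf:Description`, and the mailbox value `mailto:someone@example.org` are placeholders chosen here and are not taken from the original listing.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: a '#' separates the namespace from the local property name.
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
SUBJECT = "http://www.w3.org/People/TBL/contact#me"  # the node tbl#me

# (subject, property, value, value_is_resource)
triples = [
    (SUBJECT, CONTACT + "fullName", "Tim Berners-Lee", False),
    (SUBJECT, CONTACT + "personalTitle", "Mr.", False),
    # Hypothetical placeholder address, not the one shown in the figure.
    (SUBJECT, CONTACT + "mailbox", "mailto:someone@example.org", True),
]


def to_rdfxml(triples):
    """Serialise triples as RDF/XML, one rdf:Description per subject."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("contact", CONTACT)
    root = ET.Element(f"{{{RDF}}}RDF")
    descriptions = {}
    for subj, prop, value, is_resource in triples:
        node = descriptions.get(subj)
        if node is None:
            node = ET.SubElement(
                root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": subj}
            )
            descriptions[subj] = node
        namespace, local = prop.rsplit("#", 1)
        elem = ET.SubElement(node, f"{{{namespace}#}}{local}")
        if is_resource:
            # Resource-valued properties point to a URI.
            elem.set(f"{{{RDF}}}resource", value)
        else:
            # Literal-valued properties carry their value as text.
            elem.text = value
    return ET.tostring(root, encoding="unicode")


rdfxml = to_rdfxml(triples)
print(rdfxml)
```

The sketch distinguishes literal values (element text) from resource values (an `rdf:resource` attribute), which mirrors the difference between the string nodes and the e-mail node in the graph.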

RDF provides no mechanisms for describing properties themselves, nor does it provide any mechanisms for describing the relationships between properties and other resources. These mechanisms are provided by the RDF vocabulary description language RDF Schema (RDFS). RDFS expressions describe either classes of resources, or properties together with domain and range information. The following RDFS code defines the classes of animals and carnivores and an RDF property eats:


[RDFS listing; of its content only the comment "Carnivores eat other animals." is preserved.]

The expressivity of RDFS is restricted to simple class and property definitions and class and property subsumption. For instance, it does not allow a class to be defined as the complement of another class or as the union or intersection of other classes. This is why more expressive languages, based on Description Logics, have been developed. The simple examples presented in RDF/XML syntax demonstrate that this syntax is too verbose to be presented to users.
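The kind of inference that RDFS does support can be sketched in a few lines of Python. The sketch below applies four of the standard RDFS entailment rules (subclass transitivity, subclass inheritance, domain, and range) until no new triples appear. The `ex:` names, the individuals, and the choice of Carnivore as domain and Animal as range of eats are assumptions made for illustration; they are not taken from the RDFS listing above.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Hypothetical schema and data in the spirit of the animal example.
schema_and_data = {
    ("ex:Carnivore", SUBCLASS, "ex:Animal"),
    ("ex:eats", DOMAIN, "ex:Carnivore"),   # assumed domain
    ("ex:eats", RANGE, "ex:Animal"),       # assumed range
    ("ex:simba", "ex:eats", "ex:zebra1"),
}


def rdfs_closure(triples):
    """Apply four RDFS entailment rules until a fixed point is reached."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            if p == SUBCLASS:
                # Subclass transitivity: s <= o and o <= o2 give s <= o2.
                new |= {(s, SUBCLASS, o2) for s2, p2, o2 in closure
                        if p2 == SUBCLASS and s2 == o}
                # Inheritance: instances of s are also instances of o.
                new |= {(x, TYPE, o) for x, p2, c in closure
                        if p2 == TYPE and c == s}
            elif p == DOMAIN:
                # Every subject of property s is an instance of o.
                new |= {(x, TYPE, o) for x, p2, _ in closure if p2 == s}
            elif p == RANGE:
                # Every object of property s is an instance of o.
                new |= {(y, TYPE, o) for _, p2, y in closure if p2 == s}
        if new <= closure:
            return closure
        closure |= new


inferred = rdfs_closure(schema_and_data)
print(("ex:simba", TYPE, "ex:Animal") in inferred)
```

Note that nothing in this rule set could express that a class is the complement, union, or intersection of other classes, which is the limitation that motivates the Description Logic based languages discussed next.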

3.1.3 Knowledge Representation and Description Logics

Semantic Web research has been strongly influenced by research on Knowledge Representation and Description Logics. Knowledge Representation focuses on the design of formalisms that are both epistemologically and computationally adequate for expressing the knowledge an agent has about a particular domain. One line of research in the field of Knowledge Representation has been concerned with the idea that knowledge structures should be expressed in terms of the classes of objects (the concepts) that are of interest in a domain, as well as the relevant relationships holding among such classes (the roles). This idea led to the development of frame systems and semantic networks [Hendrix, 1979, Bobrow and Winograd, 1977]. In general, these systems were not formally defined. A fundamental step towards a logic-based characterisation of such systems was accomplished by the work on the KL-ONE language [Brachman and Schmolze, 1985]. One of the main goals of a logical basis was the precise characterisation of the set of constructs used to build class and role expressions. Later, Brachman and Levesque addressed the trade-off between the expressiveness of KL-ONE-like languages and the computational complexity of reasoning in such languages [Brachman and Levesque, 1984]. With this work, research on Terminological Logics (or concept description languages) was born. Later, this line of research was also pursued by Baader and others at the German Research Center for Artificial Intelligence [Baader, 1990, Baader et al., 1990]. Nowadays, the term Description Logics is more commonly used.

Description Logics (DLs) are a family of class-based knowledge representation formalisms [Baader et al., 2002, Calvanese et al., 2001]. They are fragments of classical first-order logic which are characterised by the use of various constructors to build complex classes from simpler ones, by an emphasis on the decidability of key reasoning problems (such as consistency and class subsumption checks), and by the provision of sound, complete and tractable reasoning services. Almost all Description Logics are based on the base language ALC (the “Attributive Language with Complements”) [Schmidt-Schauß and Smolka, 1991] and extend it with various constructs for enhanced expressivity, such as number restrictions on roles or inverse roles.²

A knowledge base (KB) expressed in a DL is traditionally constituted by two components called TBox and ABox (from “Terminological Box” and “Assertional Box”, respectively). The TBox stores a set of universally quantified assertions, stating general properties of concepts and roles. For example, one assertion of this kind states that a concept Carnivore is defined by a given expression over concepts and roles, say “Animal that eats other Animals”. The ABox comprises assertions on individual objects, also called instance assertions. For example, one can assert that Simba is an instance of the class “Animal that eats other Animals”.

Several reasoning tasks can be carried out on a DL knowledge base. The simplest form of reasoning involves computing the subsumption relation between two concept expressions, i.e. verifying whether one expression always denotes a subset of the objects denoted by another expression. In the above example, one can easily derive that Carnivore is a specialisation of Animal, i.e. Animal subsumes Carnivore. A more complex reasoning task consists in checking whether a certain assertion is logically implied by a knowledge base. For example, we can infer from the above assertions that Simba is an instance of Carnivore. We note that a DL system is characterised by four aspects:

-- The set of constructs constituting the language for building the concepts and roles defined in the TBox.

-- The kind of assertions that can appear in the TBox.

-- The kind of assertions that can appear in the ABox.

-- The inference mechanism provided for reasoning on the knowledge bases expressible in the system.

The standard technique for specifying the meaning of DLs is via a model-theoretic semantics. An interpretation $(\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ over a set $\mathbf{A}$ of atomic concepts and a set $\mathbf{R}$ of atomic roles consists of a nonempty domain $\Delta^{\mathcal{I}}$ and an interpretation function $\cdot^{\mathcal{I}}$. The domain $\Delta^{\mathcal{I}}$ is a set of objects and the interpretation function $\cdot^{\mathcal{I}}$ is a mapping from individual, class and property names to elements of the domain, subsets of the domain, and binary relations over the domain, respectively. In particular, the interpretation function maps every atomic concept $A \in \mathbf{A}$ to a set $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ and every atomic role $R \in \mathbf{R}$ to a set $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$. Objects in a domain do not have any meaning by themselves. Thus, the choice of any particular set of objects for a domain $\Delta^{\mathcal{I}}$ is not relevant. Given an interpretation $(\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, an entity $i$ is an instance of a class $C$ if $i$ is interpreted as an element of the interpretation of $C$ (i.e. $i^{\mathcal{I}} \in C^{\mathcal{I}}$). Similarly, a class $C$ is a subclass of class $D$ in case $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$. The interpretation function can be extended to arbitrary concepts and roles as shown in Table 3.1.

Whole families of DLs with different expressivity and computational requirements have been studied in the last two decades. The SH family of Description Logics is of particular interest for our work due to its close relationship to ontology languages developed for the Semantic Web. The constructors and axioms of SH Description Logics include the boolean connectives (intersection, union and complement), restrictions on properties, transitive properties, and a property hierarchy. Among others, the SH family contains the SHIQ language [Horrocks et al., 1999], which adds inverse properties and generalised cardinality restrictions³, and SHOQ(D) [Horrocks and Sattler, 2001], which allows a class to be defined by enumerating its instances and supports data values like integers and strings.

² A nice presentation of the different constructs of Description Logics and their influence on the complexity of reasoning problems can be found at http://www.cs.man.ac.uk/~ezolin/dl/.
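The model-theoretic semantics can be made concrete for a finite interpretation. The following Python sketch computes the extension of a concept expression by structural recursion, covering the boolean connectives, existential and universal restrictions, and unqualified number restrictions. The tuple encoding of concepts and the tiny domain with the individuals simba, zebra and grass are assumptions made for illustration. The function only evaluates concepts in one given interpretation; it is not a reasoner and says nothing about entailment.

```python
def successors(x, role, roles):
    """All y such that (x, y) is in the interpretation of `role`."""
    return {y for (a, y) in roles.get(role, ()) if a == x}


def extension(c, domain, concepts, roles):
    """Return the set of domain elements that belong to concept `c`.

    Concepts are atomic names (strings) or tuples such as
    ("and", C, D), ("not", C), ("some", R, C), ("min", n, R).
    """
    if isinstance(c, str):
        return set(concepts.get(c, ())) & domain

    def ext(d):
        return extension(d, domain, concepts, roles)

    op = c[0]
    if op == "not":
        return domain - ext(c[1])
    if op == "and":
        return ext(c[1]) & ext(c[2])
    if op == "or":
        return ext(c[1]) | ext(c[2])
    if op == "some":   # at least one R-successor lies in the filler
        filler = ext(c[2])
        return {x for x in domain if successors(x, c[1], roles) & filler}
    if op == "all":    # every R-successor lies in the filler
        filler = ext(c[2])
        return {x for x in domain if successors(x, c[1], roles) <= filler}
    if op == "min":    # at least n R-successors
        return {x for x in domain if len(successors(x, c[2], roles)) >= c[1]}
    if op == "max":    # at most n R-successors
        return {x for x in domain if len(successors(x, c[2], roles)) <= c[1]}
    raise ValueError(f"unknown constructor: {op}")


# A small, hypothetical interpretation.
domain = {"simba", "zebra", "grass"}
concepts = {"Animal": {"simba", "zebra"}}
roles = {"eats": {("simba", "zebra"), ("zebra", "grass")}}

carnivore = ("and", "Animal", ("some", "eats", "Animal"))
print(extension(carnivore, domain, concepts, roles))
```

In this interpretation only simba is an animal that eats another animal, so the extension of the defined concept is a subset of the extension of Animal, as the subclass condition requires.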

3.1.4 The Web Ontology Language

For the Semantic Web to function, computers must have access to commonly agreed ontologies that define the concepts used in meta-data annotations of Web resources. This is a strong assumption since the size and the liberal nature of the Web make it practically impossible to require all publishers of Web content to use one centralised ontology⁴.

One prerequisite for developing widely-used ontologies is the use of common languages to express ontological structures. The Web-Ontology Working Group of the World Wide Web Consortium has developed the Web Ontology Language (OWL) as a recommendation for such a language [Bechhofer et al., 2004]. OWL is intended to be used when the information contained in documents needs to be processed by applications, as opposed to situations where the content only needs to be presented to humans. OWL is a revision of the DAML+OIL ontology language [Horrocks et al., 2001] and is an extension of the previously described recommendations RDF and RDFS. The OWL language provides three increasingly expressive sublanguages designed for use by specific communities of implementers and users:

OWL Lite supports those users primarily needing a classification hierarchy and simple constraint features. For example, while OWL Lite supports cardinality restrictions for properties, it only permits cardinality values of 0 or 1. Due to the restricted expressiveness, reasoning tools for OWL Lite are much simpler and more efficient than tools for the more expressive relatives OWL-DL and OWL Full.

OWL-DL supports those users who want the maximum expressiveness without losing computational completeness and decidability of reasoning tasks. OWL-DL includes all OWL language constructs with restrictions such as type separation (a class cannot be an individual or property, a property cannot be an individual or class). The name OWL-DL stems from the correspondence to the Description Logic SHOIN(D).

OWL Full is meant for users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees. For example, in OWL Full a class can be treated simultaneously as a collection of individuals and as an individual in its own right. Another significant difference from OWL-DL is that

³ Cardinality restrictions constrain the number of values a property can have.
⁴ However, some researchers are concerned with the problem of how to repair mismatches between different ontologies [McNeill et al., 2004, Burstein, 2004].

a data type property can be marked as an inverse functional property. OWL Full also allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary.

An OWL ontology contains a sequence of annotations, axioms, and facts. Annotations on OWL ontologies can be used to record authorship and other information associated with an ontology, including import references to other ontologies. Axioms define the classes and properties of an ontology. Facts define the individuals of an ontology and their properties.

In recent years, support for the OWL language family has steadily grown. First tools for editing OWL ontologies and for various reasoning tasks have become available. These tools mostly support the OWL-DL language which proved to be expressive enough to describe the domain ontologies used in our framework. This is why we will focus on OWL-DL in the remainder of this thesis.

3.1.5 Description Logics and OWL-DL

Description Logics had a strong influence on the design of the Web Ontology Language OWL. Indeed, the Description Logic fragment OWL-DL can be mapped to the SHOIN(D) language, which extends the SHOQ(D) Description Logic with inverse roles but restricts it to unqualified number restrictions. The translation of OWL-DL constructs into SHOIN(D) is shown in Table 3.1. The first column of the table presents the (frame-like) OWL abstract syntax for the construction, while the second column shows the corresponding standard DL constructor. The third column describes the model-theoretic semantics of the construct with respect to an interpretation $(\Delta^{\mathcal{I}} \cup \Delta^{\mathcal{I}}_{\mathbf{D}}, \cdot^{\mathcal{I}})$. The domain $\Delta^{\mathcal{I}}$ contains the individuals of a model and the set $\Delta^{\mathcal{I}}_{\mathbf{D}}$ denotes the domain of data values.

OWL-DL descriptions use the constructs presented in Table 3.1 to describe classes, individuals, properties and range restrictions. These descriptions are used in axiomatisations of ontologies. The set of possible OWL-DL axioms is shown in Table 3.2. The SHOIN(D) syntax of OWL-DL is a very concise and compact representation formalism and we will use this syntax throughout this thesis to present OWL-DL ontologies and their parts. However, we introduce a more convenient notation for values of OWL datatype properties. Instead of the DL formula $\langle o, v \rangle \in R$ we will also write $o.R = v$.

OWL-DL is designed as a Semantic Web Language and its semantics does include some unusual aspects: 1) Annotations can be used to associate information with classes, properties and individuals. 2) Whole Ontologies also live within the semantics and can be given annotation information. 3) The owl:imports construct is given a meaning that involves finding a referenced ontology (if possible) and adding its semantics to the semantics of the current ontology. We do not deal with these aspects here and simply assume all axioms and facts to be defined in a single ontology. The interested reader is referred to [Patel-Schneider et al., 2003].

From a DL point of view, an OWL-DL ontology $\mathcal{O}$ is a pair $\langle \mathcal{A}, \mathcal{E} \rangle$ where $\mathcal{A}$ is a set of SHOIN(D) axioms, and $\mathcal{E}$ is a set of SHOIN(D) facts, i.e. OWL-DL ontologies are Knowledge Bases (KBs) as introduced in Section 3.1.3. We will therefore use the terms “Ontology” and “Knowledge Base” interchangeably to denote OWL-DL ontologies. The axioms and facts stored

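| Abstract Syntax | DL Syntax | Semantics |
|---|---|---|
| Descriptions (C) | | |
| A (URI reference) | $A$ | $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ |
| owl:Thing | $\top$ | $\text{owl:Thing}^{\mathcal{I}} = \Delta^{\mathcal{I}}$ |
| owl:Nothing | $\bot$ | $\text{owl:Nothing}^{\mathcal{I}} = \{\}$ |
| intersectionOf(C1 C2 ...) | $C_1 \sqcap C_2$ | $(C_1 \sqcap C_2)^{\mathcal{I}} = C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$ |
| unionOf(C1 C2 ...) | $C_1 \sqcup C_2$ | $(C_1 \sqcup C_2)^{\mathcal{I}} = C_1^{\mathcal{I}} \cup C_2^{\mathcal{I}}$ |
| complementOf(C) | $\neg C$ | $(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ |
| oneOf(o1 ...) | $\{o_1, \ldots\}$ | $\{o_1, \ldots\}^{\mathcal{I}} = \{o_1^{\mathcal{I}}, \ldots\}$ |
| restriction(R someValuesFrom(C)) | $\exists R.C$ | $(\exists R.C)^{\mathcal{I}} = \{x \mid \exists y. \langle x,y \rangle \in R^{\mathcal{I}} \text{ and } y \in C^{\mathcal{I}}\}$ |
| restriction(R allValuesFrom(C)) | $\forall R.C$ | $(\forall R.C)^{\mathcal{I}} = \{x \mid \forall y. \langle x,y \rangle \in R^{\mathcal{I}} \rightarrow y \in C^{\mathcal{I}}\}$ |
| restriction(R hasValue(o)) | $R : o$ | $(R : o)^{\mathcal{I}} = \{x \mid \langle x, o^{\mathcal{I}} \rangle \in R^{\mathcal{I}}\}$ |
| restriction(R minCardinality(n)) | $\geq n\, R$ | $(\geq n\, R)^{\mathcal{I}} = \{x \mid \sharp\{y \mid \langle x,y \rangle \in R^{\mathcal{I}}\} \geq n\}$ |
| restriction(R maxCardinality(n)) | $\leq n\, R$ | $(\leq n\, R)^{\mathcal{I}} = \{x \mid \sharp\{y \mid \langle x,y \rangle \in R^{\mathcal{I}}\} \leq n\}$ |
| restriction(U someValuesFrom(D)) | $\exists U.D$ | $(\exists U.D)^{\mathcal{I}} = \{x \mid \exists y. \langle x,y \rangle \in U^{\mathcal{I}} \text{ and } y \in D^{\mathbf{D}}\}$ |
| restriction(U allValuesFrom(D)) | $\forall U.D$ | $(\forall U.D)^{\mathcal{I}} = \{x \mid \forall y. \langle x,y \rangle \in U^{\mathcal{I}} \rightarrow y \in D^{\mathbf{D}}\}$ |
| restriction(U hasValue(v)) | $U : v$ | $(U : v)^{\mathcal{I}} = \{x \mid \langle x, v^{\mathcal{I}} \rangle \in U^{\mathcal{I}}\}$ |
| restriction(U minCardinality(n)) | $\geq n\, U$ | $(\geq n\, U)^{\mathcal{I}} = \{x \mid \sharp\{y \mid \langle x,y \rangle \in U^{\mathcal{I}}\} \geq n\}$ |
| restriction(U maxCardinality(n)) | $\leq n\, U$ | $(\leq n\, U)^{\mathcal{I}} = \{x \mid \sharp\{y \mid \langle x,y \rangle \in U^{\mathcal{I}}\} \leq n\}$ |
| Data Ranges (D) | | |
| D (URI reference) | $D$ | $D^{\mathbf{D}} \subseteq \Delta^{\mathcal{I}}_{\mathbf{D}}$ |
| oneOf(v1 ...) | $\{v_1, \ldots\}$ | $\{v_1, \ldots\}^{\mathcal{I}} = \{v_1^{\mathcal{I}}, \ldots\}$ |
| Object Properties (R) | | |
| R (URI reference) | $R$ | $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ |
| | $R^{-}$ | $(R^{-})^{\mathcal{I}} = \{\langle x,y \rangle \mid \langle y,x \rangle \in R^{\mathcal{I}}\}$ |
| Datatype Properties (U) | | |
| U (URI reference) | $U$ | $U^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}_{\mathbf{D}}$ |
| Individuals (o) | | |
| o (URI reference) | $o$ | $o^{\mathcal{I}} \in \Delta^{\mathcal{I}}$ |
| Data Values (v) | | |
| v (RDF literal) | $v$ | $v^{\mathcal{I}} = v^{\mathbf{D}}$ |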

Table 3.1: OWL-DL descriptions, data ranges, properties, individuals, and data values (source [Horrocks et al., 2003])

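| Abstract Syntax | DL Syntax | Semantics |
|---|---|---|
| Class(A partial C1 ... Cn) | $A \sqsubseteq C_1 \sqcap \ldots \sqcap C_n$ | $A^{\mathcal{I}} \subseteq C_1^{\mathcal{I}} \cap \ldots \cap C_n^{\mathcal{I}}$ |
| Class(A complete C1 ... Cn) | $A = C_1 \sqcap \ldots \sqcap C_n$ | $A^{\mathcal{I}} = C_1^{\mathcal{I}} \cap \ldots \cap C_n^{\mathcal{I}}$ |
| EnumeratedClass(A o1 ... on) | $A = \{o_1, \ldots, o_n\}$ | $A^{\mathcal{I}} = \{o_1^{\mathcal{I}}, \ldots, o_n^{\mathcal{I}}\}$ |
| SubClassOf(C1 C2) | $C_1 \sqsubseteq C_2$ | $C_1^{\mathcal{I}} \subseteq C_2^{\mathcal{I}}$ |
| EquivalentClasses(C1 ... Cn) | $C_1 = \ldots = C_n$ | $C_1^{\mathcal{I}} = \ldots = C_n^{\mathcal{I}}$ |
| DisjointClasses(C1 ... Cn) | $C_i \sqcap C_j = \bot,\ i \neq j$ | $C_i^{\mathcal{I}} \cap C_j^{\mathcal{I}} = \{\},\ i \neq j$ |
| Datatype(D) | | $D^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}_{\mathbf{D}}$ |
| DatatypeProperty(U super(U1)...super(Un) | $U \sqsubseteq U_i$ | $U^{\mathcal{I}} \subseteq U_i^{\mathcal{I}}$ |
| domain(C1) ... domain(Cm) | $\geq 1\, U \sqsubseteq C_i$ | $U^{\mathcal{I}} \subseteq C_i^{\mathcal{I}} \times \Delta^{\mathcal{I}}_{\mathbf{D}}$ |
| range(D1) ... range(Dl) | $\top \sqsubseteq \forall U.D_i$ | $U^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times D_i^{\mathcal{I}}$ |
| [Functional]) | $\top \sqsubseteq\ \leq 1\, U$ | $U^{\mathcal{I}}$ is functional |
| SubPropertyOf(U1 U2) | $U_1 \sqsubseteq U_2$ | $U_1^{\mathcal{I}} \subseteq U_2^{\mathcal{I}}$ |
| EquivalentProperties(U1 ... Un) | $U_1 = \ldots = U_n$ | $U_1^{\mathcal{I}} = \ldots = U_n^{\mathcal{I}}$ |
| ObjectProperty(R super(R1)...super(Rn) | $R \sqsubseteq R_i$ | $R^{\mathcal{I}} \subseteq R_i^{\mathcal{I}}$ |
| domain(C1) ... domain(Cm) | $\geq 1\, R \sqsubseteq C_i$ | $R^{\mathcal{I}} \subseteq C_i^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ |
| range(C1) ... range(Cl) | $\top \sqsubseteq \forall R.C_i$ | $R^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times C_i^{\mathcal{I}}$ |
| [inverseOf(R0)] | $R = R_0^{-}$ | $R^{\mathcal{I}} = (R_0^{\mathcal{I}})^{-}$ |
| [Symmetric] | $R = R^{-}$ | $R^{\mathcal{I}} = (R^{\mathcal{I}})^{-}$ |
| [Functional] | $\top \sqsubseteq\ \leq 1\, R$ | $R^{\mathcal{I}}$ is functional |
| [InverseFunctional] | $\top \sqsubseteq\ \leq 1\, R^{-}$ | $(R^{\mathcal{I}})^{-}$ is functional |
| [Transitive]) | $Tr(R)$ | $R^{\mathcal{I}} = (R^{\mathcal{I}})^{+}$ |
| SubPropertyOf(R1 R2) | $R_1 \sqsubseteq R_2$ | $R_1^{\mathcal{I}} \subseteq R_2^{\mathcal{I}}$ |
| EquivalentProperties(R1 ... Rn) | $R_1 = \ldots = R_n$ | $R_1^{\mathcal{I}} = \ldots = R_n^{\mathcal{I}}$ |
| AnnotationProperty(S) | | |
| Individual(o type(C1) ... type(Cn) | $o \in C_i$ | $o^{\mathcal{I}} \in C_i^{\mathcal{I}}$ |
| value(R1 o1) ... value(Rn on) | $\langle o, o_i \rangle \in R_i$ | $\langle o^{\mathcal{I}}, o_i^{\mathcal{I}} \rangle \in R_i^{\mathcal{I}}$ |
| value(U1 v1) ... value(Un vn)) | $\langle o, v_i \rangle \in U_i$ | $\langle o^{\mathcal{I}}, v_i^{\mathcal{I}} \rangle \in U_i^{\mathcal{I}}$ |
| SameIndividual(o1 ... on) | $o_1 = \ldots = o_n$ | $o_i^{\mathcal{I}} = o_j^{\mathcal{I}}$ |
| DifferentIndividuals(o1 ... on) | $o_i \neq o_j,\ i \neq j$ | $o_i^{\mathcal{I}} \neq o_j^{\mathcal{I}},\ i \neq j$ |

Table 3.2: OWL-DL axioms and facts (source [Horrocks et al., 2003])

in a KB only provide limited knowledge. Even more important are the axioms and facts that are entailed by a knowledge base. For example, if a KB contains the axioms $C_1 \sqsubseteq C_2$ and $C_2 \sqsubseteq C_3$, the statement $C_1 \sqsubseteq C_3$ is entailed by the KB. We formally define the notions of satisfying interpretations, consistent ontologies, and KB entailment: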

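Definition 3.1 (Satisfying OWL Interpretation) An interpretation $(\Delta^{\mathcal{I}} \cup \Delta^{\mathcal{I}}_{\mathbf{D}}, \cdot^{\mathcal{I}})$ satisfies an OWL-DL ontology $\mathcal{O} = \langle \mathcal{A}, \mathcal{E} \rangle$ if and only if $\mathcal{I}$ satisfies each axiom $A \in \mathcal{A}$ and each fact $F \in \mathcal{E}$ according to Table 3.2. ♦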

Definition 3.2 (Consistent OWL Ontologies) An OWL-DL ontology $\mathcal{O} = \langle \mathcal{A}, \mathcal{E} \rangle$ is consistent if there is a satisfying interpretation for $\mathcal{O}$. ♦

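Definition 3.3 (Knowledge Base Entailment) An OWL-DL ontology $\mathcal{O}$ entails an axiom $\varphi$, a fact $\psi$, or an ontology $\mathcal{O}'$ if and only if every satisfying interpretation for $\mathcal{O}$ also satisfies $\varphi$, $\psi$, or $\mathcal{O}'$ respectively. We write

$$\mathcal{O} \models \varphi, \quad \mathcal{O} \models \psi, \quad \text{or} \quad \mathcal{O} \models \mathcal{O}'.$$

For an ontology $\mathcal{O} = \langle \mathcal{A}, \mathcal{E} \rangle$, axioms $\varphi, \varphi'$ and a fact $\psi$, we also write $\mathcal{O}, \varphi' \models \varphi$ instead of $\langle \mathcal{A} \cup \{\varphi'\}, \mathcal{E} \rangle \models \varphi$, and $\mathcal{O}, \psi \models \varphi$ instead of $\langle \mathcal{A}, \mathcal{E} \cup \{\psi\} \rangle \models \varphi$. ♦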
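As a worked instance of Definitions 3.1 and 3.3, the subsumption example given before the definitions can be verified directly from the SubClassOf row of Table 3.2. The derivation below is a sketch; the particular ontology with an empty set of facts is an assumption made to keep the example minimal.

```latex
\begin{align*}
&\text{Let } \mathcal{O} = \langle \{ C_1 \sqsubseteq C_2,\; C_2 \sqsubseteq C_3 \},\, \emptyset \rangle
  \text{ and let } \mathcal{I} \text{ be any interpretation satisfying } \mathcal{O}. \\
&C_1^{\mathcal{I}} \subseteq C_2^{\mathcal{I}}
  && \text{($\mathcal{I}$ satisfies } C_1 \sqsubseteq C_2\text{)} \\
&C_2^{\mathcal{I}} \subseteq C_3^{\mathcal{I}}
  && \text{($\mathcal{I}$ satisfies } C_2 \sqsubseteq C_3\text{)} \\
&C_1^{\mathcal{I}} \subseteq C_3^{\mathcal{I}}
  && \text{(transitivity of } \subseteq\text{)} \\
&\text{Since } \mathcal{I} \text{ was arbitrary, every satisfying interpretation of } \mathcal{O}
  \text{ satisfies } C_1 \sqsubseteq C_3, \text{ hence } \mathcal{O} \models C_1 \sqsubseteq C_3.
\end{align*}
```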

As mentioned above, DLs are carefully designed in such a way that a mechanical procedure can decide whether a fact is entailed by a KB or not. Typically, decision procedures for DLs employ the tableaux proof method [Hähnle, 2001]. Tableaux methods decide consistency of an ontology by constructing an abstraction of a model for it, a so-called completion graph. All DLs have some form of the tree model property [Vardi, 1997] which allows tableaux methods to restrict the search to tree-shaped completion graphs. Horrocks and Sattler presented a tableaux decision procedure for the Description Logic underlying OWL-DL [Horrocks and Sattler, 2005]. The procedure has been implemented in the Pellet reasoner [Sirin et al., 2003] which we will present in Section 5.2.
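The flavour of a tableaux procedure can be conveyed by a minimal Python sketch for concept satisfiability in plain ALC without a TBox. It is deliberately much simpler than the procedure of Horrocks and Sattler for the logic underlying OWL-DL and is not how Pellet is implemented: there are no nominals, number restrictions, inverse roles, datatypes, or blocking. The tuple encoding of concepts and all function names are assumptions made for this illustration.

```python
def nnf(c):
    """Rewrite a concept into negation normal form (negation only before atoms)."""
    if isinstance(c, str):
        return c
    op = c[0]
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    if op in ("some", "all"):
        return (op, c[1], nnf(c[2]))
    # op == "not": push the negation one level down
    d = c[1]
    if isinstance(d, str):
        return c
    if d[0] == "not":
        return nnf(d[1])
    if d[0] == "and":
        return ("or", nnf(("not", d[1])), nnf(("not", d[2])))
    if d[0] == "or":
        return ("and", nnf(("not", d[1])), nnf(("not", d[2])))
    if d[0] == "some":
        return ("all", d[1], nnf(("not", d[2])))
    return ("some", d[1], nnf(("not", d[2])))  # d[0] == "all"


def has_op(c, op):
    return isinstance(c, tuple) and c[0] == op


def satisfiable(label):
    """Decide whether one individual can satisfy all NNF concepts in `label`."""
    label = set(label)
    # Clash: an atomic concept together with its negation.
    if any(isinstance(c, str) and ("not", c) in label for c in label):
        return False
    # Conjunction rule: replace C and D by both conjuncts.
    for c in label:
        if has_op(c, "and"):
            return satisfiable((label - {c}) | {c[1], c[2]})
    # Disjunction rule: branch on the two disjuncts.
    for c in label:
        if has_op(c, "or"):
            rest = label - {c}
            return satisfiable(rest | {c[1]}) or satisfiable(rest | {c[2]})
    # Existential rule: each "some" needs a successor that also
    # satisfies the fillers of all "all" restrictions on the same role.
    for c in label:
        if has_op(c, "some"):
            successor = {c[2]} | {d[2] for d in label
                                  if has_op(d, "all") and d[1] == c[1]}
            if not satisfiable(successor):
                return False
    return True


def subsumed_by(c, d):
    """C is subsumed by D iff (C and not D) is unsatisfiable."""
    return not satisfiable({nnf(("and", c, ("not", d)))})


# The Carnivore definition from Section 3.1.3, unfolded by hand.
carnivore = ("and", "Animal", ("some", "eats", "Animal"))
print(subsumed_by(carnivore, "Animal"), subsumed_by("Animal", carnivore))
```

Because each successor is explored independently of its siblings, the search only ever builds tree-shaped structures, which is the simplification that the tree model property justifies.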

3.2 Semantic Web Services

The principles of the Semantic Web (to annotate resources with semantic information) can be extended to services available in the Web, so-called Web Services. However, Web Services differ from classical Web content insofar as they are not static documents but executable processes. Semantic descriptions of Web Services have to describe the functionality of these processes. These descriptions should support (at least partially) the automation of some of the following tasks:

Discovery: Locate different services suitable for a given task.

Selection: Choose the most appropriate services among the available ones.

Composition: Combine services manually or automatically to perform a task.

Mediation: Resolve mismatches (in data, protocols, or processes) among combined services.

Execution: Invoke services following standardised protocols and programmatic conventions.

Since Semantic Web Services build upon classical Web Services we will first introduce the languages recommended by the W3C for describing Web Services.

3.2.1 Web Services

According to the W3C a Web Service (WS) is “a software system designed to support interoperable machine-to-machine interaction over a network”. It has an interface described in a machine-processable format and other systems can invoke a Web Service in a manner prescribed by its description using Internet protocols. Web Services can, for instance, perform tasks or business transactions, or provide information for their users. Web Services can be regarded as the successor of previous architectures for distributed systems, such as Corba [Siegel, 1996].

In recent years Web Services have attracted huge attention from industry [Clark et al., 2003] and have become one of the standard means to build interoperable systems (for heterogeneous platforms). As a result, readily usable software and tools for the development, deployment, and invocation of Web Services are now available. In what follows, we describe the current standards for describing Web Services, as developed by the World Wide Web Consortium.

3.2.1.1 The Web Service Description Language

The Web Services Description Language (WSDL) is an XML format for describing Web Services as collections of communication endpoints or ports, which send and receive messages according to specified protocols, such as HTTP or SOAP-RPC [Walsh, 2002] (see Section 3.2.1.2). WSDL aims to automate communication between Web Services by distinguishing such abstract Web Service descriptions from the concrete data formats and protocols that are used to implement the Web Service. A WSDL binding maps between the abstract description of a Web Service and its specific realisation [Christensen et al., 2001].

WSDL assumes a stateless client-server model of synchronous or uncorrelated asynchronous interactions. Each WSDL port is associated with a port type, which describes the message exchanges (operations) the port can take part in. Four basic kinds of operations are possible⁵: a one-way message, a (two-way) request-response, a (two-way) solicit-response and a (one-way) notification message. Message definitions normally employ XML Schema types and thus support a broad range of type definitions. WSDL builds on the Simple Object Access Protocol (SOAP) by providing a binding for WSDL operations to SOAP messages and WSDL ports to SOAP endpoints.
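In WSDL 1.1 the four operation kinds are distinguished by which of the `input` and `output` elements an operation declares and in which order. The following Python sketch classifies the operations of a port type on that basis. The port type, its operation names, and the message names are hypothetical examples invented for this illustration, and `fault` elements are ignored.

```python
import xml.etree.ElementTree as ET

WSDL = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical port type with one operation of each kind.
PORT_TYPE = """
<portType xmlns="http://schemas.xmlsoap.org/wsdl/" name="ExamplePortType">
  <operation name="submit">
    <input message="tns:Job"/>
  </operation>
  <operation name="prove">
    <input message="tns:Problem"/>
    <output message="tns:Result"/>
  </operation>
  <operation name="askClient">
    <output message="tns:Question"/>
    <input message="tns:Answer"/>
  </operation>
  <operation name="notify">
    <output message="tns:Event"/>
  </operation>
</portType>
"""

# The order of input/output children determines the operation kind.
KINDS = {
    ("input",): "one-way",
    ("input", "output"): "request-response",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}


def classify_operations(port_type_xml):
    """Map each operation name of a port type to its WSDL 1.1 kind."""
    root = ET.fromstring(port_type_xml)
    result = {}
    for op in root.findall(f"{{{WSDL}}}operation"):
        local_names = [child.tag.split("}")[1] for child in op]
        order = tuple(n for n in local_names if n in ("input", "output"))
        result[op.get("name")] = KINDS[order]
    return result


print(classify_operations(PORT_TYPE))
```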

⁵ In the WSDL specification version 1.1.

It is worth mentioning that WSDL is a purely syntactical interface description language (IDL) and does not provide sufficient means to capture the semantics of Web Services and their operations in a machine-interpretable fashion. To exploit Web Services for the envisioned tasks of automatic application synthesis via service retrieval and composition, the missing semantics has to be incorporated into service descriptions. Furthermore, the algorithms best suited to operate on semantically rich Web Service descriptions have to be identified. One approach to capturing the semantics of Web Services is the OWL-S upper ontology. We will describe this approach in further detail in Section 3.2.2.

3.2.1.2 The Simple Object Access Protocol

The Simple Object Access Protocol (SOAP) [Walsh, 2002] is a stateless, one-way message exchange protocol that is most commonly transported over HTTP. In recent years it has become the standard protocol for the invocation of Web Services. Despite the fact that SOAP is a one-way protocol, applications can create more complex interaction patterns by combining one-way exchanges with features provided by an underlying protocol and/or application-specific information.

A SOAP message consists of a SOAP envelope which itself contains a message header and a body. The optional message header contains control information which includes, for instance, passing directives or contextual information related to the processing of the message. This allows a SOAP message to be extended in an application-specific manner. The SOAP body is mandatory and contains the actual data conveyed with the message.

SOAP does not specify the semantics of any application-specific data it conveys, nor does it contain specifications for issues such as the routing of SOAP messages, reliable data transfer, and firewall traversal. However, SOAP provides the framework by which application-specific information may be conveyed in an extensible manner. It also provides a full description of the required actions taken by a SOAP node on receiving a SOAP message.
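The envelope structure described above can be sketched with the Python standard library. The example assumes the SOAP 1.1 envelope namespace; the application namespace `urn:example:prover` and the element names `prove`, `problem`, and `timeout` are hypothetical and only serve to show where application-specific data goes.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 (assumed version)
APP = "urn:example:prover"                               # hypothetical namespace


def build_envelope(body_element, header_elements=()):
    """Wrap application data in a SOAP envelope with an optional header."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_elements:
        # The header is optional and carries control information.
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        header.extend(header_elements)
    # The body is mandatory and carries the actual payload.
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(body_element)
    return envelope


# Hypothetical payload: a request to some reasoning service.
request = ET.Element(f"{{{APP}}}prove")
ET.SubElement(request, f"{{{APP}}}problem").text = "p | ~p"

# Hypothetical control information placed in the header.
timeout = ET.Element(f"{{{APP}}}timeout")
timeout.text = "30"

envelope = build_envelope(request, [timeout])
print(ET.tostring(envelope, encoding="unicode"))
```

SOAP itself assigns no meaning to the `prove` or `timeout` elements; their interpretation is left entirely to the communicating applications.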

3.2.1.3 Web Service Registries

A transaction in a web services marketplace involves three parties: service requesters, service providers, and infrastructure components. A service requester seeks a service to complete its work. A service provider offers a service sought by requesters. In an open environment with a large number of Web Services available, such as the Internet, the requester may not know ahead of time of the existence of the provider, so the requester relies on infrastructure components that act like registries to find the appropriate provider. For instance, a requester may need a news service that reports stock quotes with no delay with respect to the market. The role of registries is to match the request with the services offered by service providers to identify which of the available services is most suitable for the service requester.

In the industrial context the prevalent initiatives for the discovery of Web Services are the Universal Description, Discovery and Integration (UDDI) [UDDI, 2000, Walsh, 2002] specification and the Electronic Business XML (ebXML) registry [Hofreiter et al., 2002].

UDDI is an initiative to develop a standard for an online registry and to enable the publishing and dynamic discovery of Web Services offered by businesses. UDDI allows programmers and other representatives of a business to locate potential business partners and to form business relationships on the basis of the services they provide. It thus facilitates the creation of new business relationships.

ebXML is a modular suite of specifications that enables enterprises of any size and in any geographical location to conduct business over the Internet. Similar to UDDI, ebXML provides companies with a standard method to exchange business messages, communicate data in common terms and define and register business processes. An ebXML registry provides a mechanism by which XML artifacts can be stored, maintained, and automatically discovered. ebXML enables Web Services to describe the business processes they support and the services they offer using Collaboration Protocol Profiles (CPP). CPPs contain industry classification, contact information, supported business processes, interface requirements etc. They are registered within an ebXML registry (similar to a UDDI registry), in which other Web Services and their business processes can be discovered.

Despite their successful use in industrial contexts, UDDI and ebXML registries only support keyword search over the names of businesses, services, and technical models (TModels)⁶. Thus queries on UDDI registries typically return many matching Web Services, most of which are not useful for tackling the given task. Also, an automated composition of Web Services based on UDDI and ebXML descriptions is virtually impossible. The main reason for this is that WSDL descriptions and keyword-based descriptions do not cover the semantics of a Web Service, i.e. what the service actually performs. This is why more expressive Web Service description and discovery mechanisms, which take the capabilities of Web Services into account, are being developed.

3.2.1.4 Business Processes and BPEL4WS

BPEL4WS (Business Process Execution Language for Web Services) [Andrews et al., ] enables the specification of executable business processes (including Web Services) and business process protocols in terms of their execution logic or control flow. Executable business processes specify the behaviour of individual participants within Web Services interactions and can be invoked directly, whereas business process protocols (also called abstract processes) abstract from internal behaviour to describe the messages exchanged by the various Web Services within an interaction.

Abstract processes only consider protocol-relevant data and ignore process-internal data and computation. The effects of such computation on the business protocol are then described using non-deterministic data values. Executable processes, on the other hand, are described using a rich process description language which deals with both protocol-relevant and process-internal data. BPEL4WS also defines several mechanisms for recovery from faults, such as catching and handling of faults, and compensation handlers which specify compensatory activities for actions that cannot be explicitly undone.

⁶ TModels contain information about the specifications and versions of specifications used to design advertised services.

3.2.2 The OWL-S Upper Ontology for Web Services

OWL-S [Martin et al., 2004] is an OWL-DL ontology which supplies Web Service providers with a core set of markup language constructs for describing the properties and capabilities of Web Services in an unambiguous, computer-interpretable form. OWL-S markup of Web Services facilitates the automation of tasks such as automated Web Service discovery, execution, composition and interoperation.

Figure 3.3 shows the essential parts of an OWL-S service description. Every service presents a service profile which can be used for advertising or discovering services. The process model of a service provides a detailed operational description of the service. The service grounding describes details on how to interoperate with a service. In what follows, we describe the three parts of an OWL-S service in greater detail.

[Figure: an OWL-S Service presents a Service Profile (what the service does), is describedBy a Service Model consisting of processes (how it works), and supports a Service Grounding (how to access it).]

Figure 3.3: Parts of an OWL-S service description

3.2.2.1 The Service Model

To give a detailed perspective on how to interact with a service, it is useful to regard a service as a process. A process is not a program to be executed but a specification of the ways a client may interact with a service. The service model contains definitions of atomic and/or composite processes. Atomic OWL-S processes represent a single Web-accessible computer program, sensor, or device. An atomic process is a description of a service that receives exactly one message and replies with exactly one message in response. A composite process is one that maintains some state; each message the client sends advances the client to the next state of the composite process. Composite processes are typically composed of multiple more primitive processes and may require an extended interaction or conversation between a client application and the set of services that are being used.

Atomic and Simple OWL-S Processes. Atomic processes correspond to the actions a service can perform by engaging in a single interaction, i.e. they correspond to a single invocable Web Service. Atomic processes are specified by their input and output parameters, their preconditions, and their effects. Input and output parameters are annotated with type information in the form of OWL-DL classes defined in an underlying domain ontology. For each atomic process a grounding has to be provided that enables a service requester to construct messages for invoking the underlying Web Service and to deconstruct replies. A grounding maps the input and output parameters of an atomic process to the corresponding parameters in a WSDL document.

Simple processes are not invocable and are not associated with a grounding, but, like atomic processes, they are conceived of as having single-step executions. Simple processes are used as elements of abstraction. A simple process may be used either to provide a view of some atomic process, or a simplified representation of some composite process (e.g., for purposes of planning and reasoning).

Composite OWL-S Processes. Composite processes are decomposable into other (atomic or composite) processes. Their decomposition can be specified by means of control constructs. Because many of the control constructs have names reminiscent of control structures in programming languages, it is important to note a fundamental difference to programming: a composite process does not describe the behaviour of a service, but a behaviour (or set of behaviours) a client application can perform by sending and receiving a series of messages. If the composite process has an overall effect, then the client must perform the entire process in order to achieve that effect.

In this section we only provide an overview of constructs offered by OWL-S to build composite services. For a more detailed description of these constructs we refer the reader to the OWL-S technical overview [Martin et al., 2004]. The following OWL-S composition constructs are available:

Sequence: A list of processes to be executed in order.

Split: The components of a Split process are a bag of process components to be executed concurrently. No further specification about synchronisation is given in the description of OWL-S.

Split+Join: Here the process consists of a concurrent execution of a set of process components with barrier synchronisation: The execution of a Split+Join process terminates as soon as the parallel execution of all process components has terminated.

Unordered: Allows the process components to be executed in some unspecified order, or concurrently. All components must be executed. As with Split+Join, completion of all components is required.

Choice: A control construct for choosing a number of processes from a given set. The actual choice is left to the client application.

If-Then-Else: A control construct for conditional statements. Conditions are expressed in the Semantic Web Rule Language (SWRL) [Horrocks et al., 2004], or the first-order Knowledge Interchange Format (KIF) [Genesereth and Fikes, 1990, Ginsberg, 1991].

Repeat-While & Repeat-Until: Both constructs iterate until a condition becomes false or true, following the familiar programming language conventions [7].

[7] Repeat-While tests for the condition, exits if it is false and performs the operation if the condition is true, then loops. Repeat-Until performs the operation, tests for the condition, exits if it is true, and otherwise loops.

When a process is performed as a step in a composite process, there must be a description of where the inputs to the performed process come from and which process handles the output values. For this purpose, OWL-S offers constructs to specify bindings of process parameters either to concrete values or to values of output parameters of previously performed processes.
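The control constructs and parameter bindings described above can be pictured as a small tree-shaped data type. The following Python sketch is only an illustration under our own assumptions: the class names, the `bindings` encoding and the `atomic_steps` helper are hypothetical and are not part of OWL-S or MathServe.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Perform:
    """One step that performs an atomic process (identified by name)."""
    process: str
    # Input name -> constant value or "<process>.<output>" of an earlier step
    # (a hypothetical string encoding of OWL-S parameter bindings).
    bindings: Dict[str, str] = field(default_factory=dict)


@dataclass
class Sequence:
    """Components are performed in order."""
    components: List["Construct"]


@dataclass
class SplitJoin:
    """Components run concurrently; the construct ends when all have ended."""
    components: List["Construct"]


@dataclass
class Choice:
    """The client application chooses among the components."""
    components: List["Construct"]


Construct = Union[Perform, Sequence, SplitJoin, Choice]


def atomic_steps(construct: Construct) -> List[str]:
    """Collect the names of all atomic processes mentioned in a construct."""
    if isinstance(construct, Perform):
        return [construct.process]
    steps: List[str] = []
    for sub in construct.components:
        steps.extend(atomic_steps(sub))
    return steps


# A client first translates a problem, then chooses one of two provers.
workflow = Sequence([
    Perform("Translate"),
    Choice([
        Perform("ProverA", {"problem": "Translate.result"}),
        Perform("ProverB", {"problem": "Translate.result"}),
    ]),
])
```

Here `atomic_steps(workflow)` lists the three atomic processes in the order in which they appear in the tree; the two `bindings` entries state that the prover input is taken from the output of the earlier `Translate` step.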

3.2.2.2 The Service Profile

The OWL-S service profile provides sufficient information for service matchmaking and composition purposes. This information can, for instance, be used to advertise a service to service registries similar to the ones described in Section 3.2.1.3. More precisely, the service profile provides a functional description which specifies the inputs required by the service and the outputs generated. Furthermore, since a service may require external conditions to be satisfied, the profile describes the preconditions required by the service. The profile also describes the expected effects that result from the execution of the service. For example, a selling service may require as a precondition a valid credit card and as input the credit card number and its expiry date. The service generates a receipt, and produces the effect of the credit card balance being charged. Next to the functional aspects, the service profile allows the service provider to describe three types of additional information that might be of interest for service requesters:

1. The category of a service coarsely specifies which type of service is offered. Typically the category of a service is a reference to an established taxonomy of services such as United Nations Standard Products and Services Code (UNSPSC).

2. The quality rating of the service can, for instance, contain statements about the reliability and average response time of a service.

3. An arbitrary list of service parameters can contain any type of information. For example, the list might include parameters that provide an estimation of the maximal response time or information about the geographic availability of a service.

3.2.2.3 The Service Grounding

The grounding of a service specifies the details of how to access a service, i.e. how to execute the corresponding atomic processes. A grounding is a mapping from the abstract description of the input and output parameters of atomic processes to a concrete specification of those parameters required for interacting with the service. So far, OWL-S supports WSDL as concrete specification of Web Services. An OWL-S-WSDL grounding is based on three correspondences: (1) An OWL-S atomic process corresponds to a WSDL operation. (2) The inputs of an OWL-S atomic process correspond to the parts of an input message of a WSDL operation, and the outputs correspond to the parts of an output message of a WSDL operation. (3) The types (OWL-DL classes) of the inputs and outputs of an atomic process correspond to WSDL abstract types and may be used in WSDL specifications of message parts. Using these correspondences a client application can create the SOAP messages required for invoking a service and the client can interpret SOAP replies.

3.2.3 WSMO

Similar to OWL-S, the Web Service Modelling Ontology (WSMO) [Roman et al., 2005] provides ontological specifications for the core elements of Semantic Web Services. WSMO has been developed by the European Semantic Systems Initiative and is based on the Web Service Modelling Framework (WSMF), which “provides the conceptual model for developing and describing web services and their components” [Fensel and Bussler, 2002]. Descriptions of Semantic Web Services in WSMO consist of four main parts:

Ontologies provide the terminology used by other WSMO elements to describe the relevant aspects of the domains of discourse.

Web services descriptions comprise the capabilities, interfaces and internal working of the Web Service.

Goals represent user desires, for which fulfilment could be sought by executing a Web Service. Goals model the user view in the Web Service usage process and are therefore a separate top-level entity in WSMO.

Mediators describe elements that overcome interoperability problems between different WSMO elements. Mediators are the core concept to resolve incompatibilities on the data, process and protocol level, i.e. in order to resolve mismatches between different terminologies (data level), in how to communicate between Web Services (protocol level) and on the level of combining Web Services (process level).

WSMO shares with OWL-S the vision that ontologies are essential to support automatic discovery, composition and interoperation of Web Services. But despite sharing a unifying vision, OWL-S and WSMO differ greatly in the details and the approach to achieve these results. Whereas OWL-S explicitly defines a set of ontologies that support reasoning about Web Services, WSMO defines a conceptual framework within which these ontologies will have to be created. Another difference between OWL-S and WSMO is that while OWL-S does not make any distinction between types of Web Services, WSMO stresses the specification of mediators: mapping programs that solve the interoperation problems between Web Services. In WSMO’s vision, mediators perform tasks such as translation between ontologies, or between the messages that one Web Service produces and those that another Web Service expects. In the process of defining mediators, WSMO produces a taxonomy of possible mediators that helps to define and classify the different tasks that mediators are supposed to solve. However, it can be difficult to map this taxonomy onto the classical problems of Web Service interoperation; i.e. discovery, composition and invocation. For example, it is unclear how mediators can help during discovery, since discovery is intrinsically a selection problem, while mediators attempt to reconcile the differences between goals of Web Services.

At the time the MathServe framework was designed and implemented, the development of WSMO was still in an early stage and software and tool support did not exist. For this reason, we adopted OWL-S for our work.

3.3 Composition of Semantic Web Services

In the context of business applications numerous languages and technologies have been developed that support the specification of Web Services, their execution, and their composition as workflows. However, workflow composition is mainly performed manually by skilled engineers. With semantic descriptions of Web Services available the question arises how much of the composition process can be automated and which techniques can be used for this automation. Several approaches to automated Web Service Composition (WSC) have been presented in the literature. In the following sections we briefly describe the most prominent ones.

3.3.1 AI Planning for Web Service Composition

AI planning [Fikes et al., 1971] is probably the most prominent approach to automated Web Service composition. This is not surprising since the semantic descriptions of atomic OWL-S processes, for instance, are very similar to planning operators: Planning operators and atomic OWL-S processes define the parameters, preconditions and effects of agent actions and processes, respectively. One of the first works on planning for Web Service composition was presented in [McDermott, 2002]. McDermott argues that automatic web service composition can be achieved by AI planning because Web Services fit in the classical model of discrete actions. However, he presents necessary extensions to classical AI planning that are motivated by the peculiarities of Web Services. McDermott notes that, although Web Services can be seen as discrete agent actions, their composition requires eliminating closed-world assumptions and allowing for branching plans. Furthermore, he claims that the Planning Domain Description Language [Gerevini and Long, 2005] (PDDL) [8] has to be extended by

-- a more robust type notation for complex data structures,

-- a :value field to express types of newly created objects, and

-- a know-value predicate for new knowledge obtained by the execution of a Web Service.

However, these extensions have not yet been realised in one single planning system [9]. The closed-world assumption was originally introduced to overcome the frame problem (see Section 2.4.1.2). Simply eliminating this assumption will re-introduce this problem and lead to a less efficient planning process. Despite this problem, the use of planning for Web Service composition has been suggested throughout the literature.

[8] PDDL is the language most commonly used by planning systems to describe planning domains and planning problems.
[9] McDermott is developing the OPTOP planning system which is supposed to include all these extensions.

Wu and colleagues presented work on using the SHOP2 planner for automatic Web Service composition [Wu et al., 2003]. SHOP2 is based on the paradigm of Hierarchical Task Networks (HTN) and creates plans by task decomposition. The authors claim that HTN planning is particularly suitable for Web Service composition due to the direct correspondence of atomic and composite processes to HTN operators and methods, respectively (see Section 2.4.3).

A hybrid planning approach combining a graph-plan based Fast Forward (FF) planner with a variant of HTN planning has been described in [Klusch et al., 2005]. The authors present the OWLS-XPlan system which translates OWL-S service descriptions into planning operators in PDDL [Gerevini and Long, 2005]. Unlike pure HTN planning OWLS-XPlan performs method decomposition by only using the relevant parts of a method. If method decomposition is not successful, a graph plan algorithm, guided by heuristics, is used to search for a solution (see [Hoffmann, 2000]). The planner is complete, i.e. it always finds a solution if one exists.

A proprietary, state-based planning approach based on instance patterns has been described in [Pfalzgraf, 2006]. Instance patterns are constraint-based descriptions of (classes of) OWL instances [10]. The states of Pfalzgraf’s planning approach are constraint graphs representing instance patterns. Consequently, planning operators are graph transformations obtained from OWL-S service profiles. A graph-matching algorithm is used to check the applicability of planning operators.
The depth-first, forward chaining planning algorithm GOAL is employed to try to reach the goal state (goal graph) from the initial state (initial graph). The GOAL planner is used for automated Web Service composition in the interactive environment SmartWeb [Sonntag et al., 2007] which is a multimodal user interface for accessing the Semantic Web and Semantic Web Services. In case of an underspecified user request the planning system tries to resolve the truth of unfulfilled preconditions by querying the user of SmartWeb.

Martínez and Lespérance did research on the use of knowledge-based planning for Web Service composition [Martínez and Lespérance, 2004]. They employ the PKS planning system which constructs plans by reasoning about the effects of actions on an agent’s knowledge, rather than the state of the world. The authors show that their approach scales relatively well with respect to different sets of user constraints.

Last but not least, work on using the model-based planner MBP [Bertoli et al., 2003] for automated Web Service composition was presented in [Traverso and Pistore, 2004, Pistore et al., 2004] and [Pistore et al., 2005]. MBP is based on the planning as model checking approach [Jensen et al., 2001] and subsequent extensions. MBP uses a compact representation of belief states in BDDs for planning under incomplete knowledge. It can produce strong, weak and cyclic plans [11]. The authors show how OWL-S atomic processes can be translated into corresponding state transition systems which can be transformed into a planning domain for the MBP planner. MBP produces conditional plans that are translated into executable BPEL4WS processes.

[10] The constraint language used by Pfalzgraf has a similar expressivity as established rule languages, such as SWRL (see Section 3.4.2).
[11] A strong plan is guaranteed to achieve the goal for all possible executions. A weak plan has a chance of success, i.e. some of its possible executions achieve the goal. A cyclic plan reaches the goal with an iterative trial-and-error strategy, such that all the associated executions always have a possibility of terminating and, when they do, they achieve the goal.
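The correspondence between atomic processes and planning operators that underlies this subsection can be made concrete with a very small forward-search planner. The sketch below is our own illustration and does not reproduce any of the planners cited above (SHOP2, OWLS-XPlan, GOAL, PKS, MBP). It assumes propositional facts, positive effects only, and a breadth-first search; the operator names and facts are hypothetical.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Operator(NamedTuple):
    """A planning operator derived from a service description (hypothetical)."""
    name: str
    preconds: FrozenSet[str]
    effects: FrozenSet[str]  # positive effects only, i.e. an empty delete list


def plan(initial: FrozenSet[str], goal: FrozenSet[str],
         operators: List[Operator]) -> Optional[List[str]]:
    """Breadth-first forward search; returns a shortest operator sequence."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:                 # all goal facts hold
            return steps
        for op in operators:
            if op.preconds <= state:      # operator is applicable
                successor = state | op.effects
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, steps + [op.name]))
    return None                           # no composition exists


ops = [
    Operator("LookupPrice", frozenset({"have_isbn"}),
             frozenset({"have_price_gbp"})),
    Operator("ConvertCurrency", frozenset({"have_price_gbp"}),
             frozenset({"have_price_eur"})),
]
result = plan(frozenset({"have_isbn"}), frozenset({"have_price_eur"}), ops)
```

Because the operators never delete facts, states only grow during the search, which is what keeps this toy planner simple; the systems discussed above deal with typed parameters, incomplete knowledge and branching plans, none of which is modelled here.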

3.3.2 Golog and the Situation Calculus

McIlraith and Son presented an extension of Golog (see Section 2.4.2) for Semantic Web Service composition [McIlraith and Son, 2002]. Their extension incorporates user constraints as well as a new order construct which relaxes the notion of sequences of actions, enabling the insertion of actions to achieve the failing preconditions for the next action to be performed by a program. McIlraith and Son define the notion of knowledge and physically self-sufficient programs which are executable with minimal assumptions on the agent’s initial state of knowledge, or the world state. The work of McIlraith and Son was one of the main motivations for using Golog as a representation language for composite services in MathServe.

3.3.3 Markov Decision Processes

Doshi et al. claim that, even with suitable extensions, STRIPS-like planning, and variations thereof, does not fully account for the nondeterministic behaviour of Web Services because classical static plans require execution monitoring and re-planning to deal with possible failures [Doshi et al., 2005]. They suggest modelling the composition of simple workflows as a Markov Decision Process (MDP) instead. Standard solution methods for MDPs can be used to generate robust and adaptive workflows that are tolerant towards service failures and uncertainties. However, Doshi et al. do not discuss how their approach scales to complex problem domains. Furthermore, using standard MDP solution methods requires a fully propositional representation of the problem domain which leads to unnatural and complex problem descriptions (see also Section 2.4.4.2).
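To indicate what a "standard solution method" looks like in this setting, the following sketch runs value iteration on a two-state MDP in which a client may call either a cheap but unreliable service or an expensive but reliable one. It is a minimal illustration under our own assumptions and not the model of Doshi et al.; the states, actions, probabilities and costs are invented.

```python
from typing import Dict, List, Tuple

# transitions[state][action] = list of (probability, next_state, reward).
# A failed call leaves the client in "start", so it may simply try again.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

transitions: Transitions = {
    "start": {
        "fast":     [(0.3, "done", -1.0), (0.7, "start", -1.0)],
        "reliable": [(0.9, "done", -2.0), (0.1, "start", -2.0)],
    },
    "done": {},  # absorbing goal state without actions
}


def q_value(outcomes, values, gamma):
    """Expected return of one action, given the current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(t: Transitions, gamma: float = 1.0, sweeps: int = 200):
    """Repeated Bellman backups, followed by greedy policy extraction."""
    values = {s: 0.0 for s in t}
    for _ in range(sweeps):
        values = {
            s: max((q_value(o, values, gamma) for o in acts.values()),
                   default=0.0)
            for s, acts in t.items()
        }
    policy = {
        s: max(acts, key=lambda a: q_value(acts[a], values, gamma))
        for s, acts in t.items() if acts
    }
    return values, policy


values, policy = value_iteration(transitions)
```

With these numbers the expected cost of repeatedly calling the cheap service (1/0.3) exceeds that of the reliable one (2/0.9), so the computed policy prefers the reliable service. The retry behaviour is part of the policy itself, which is the robustness argument made above.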

3.3.4 Program Synthesis in Linear Logic

Rao investigated the use of logic-based program synthesis for an automated composition of Web Services described in DAML-S [Rao, 2004, Rao et al., 2004]. The approach is based on the propositional fragment of Linear Logic (LL), a logic introduced by Girard in [Girard, 1987] which offers a way of coping with resources and resource control. Rao uses a process calculus (derived from the π-calculus [Milner, 1999]) to represent the process model of the composite service generated. The calculus rules are attached to the Linear Logic inference rules in the form of a type theory. Thus, the process model of a composite process can be directly extracted from a completed proof of a specification of the composite process. The approach is complemented by a set of subtyping rules that define valid workflows for composite services. Rao mentions that the use of propositional LL limits the use of his framework. An extension to first-order LL might lead to a non-terminating or less efficient program synthesis.

Dixon, Smaill and Bundy applied deductive synthesis in Intuitionistic Linear Logic to the problem of composing simplified versions of some of the reasoning services described in Chapter 4 [Dixon et al., 2006].

3.4 Definitions and Notation

In this section we define concepts that are important for the rest of this thesis. In particular, we introduce convenient notations for the various languages and formalisms necessary to work with OWL-S service descriptions. Such service descriptions depend on a domain ontology, written in OWL, in which the domain-specific classes and properties are defined. The classes in this ontology are used to declare the types of the input and output parameters of atomic OWL-S processes. OWL properties are used in the preconditions and effects of processes and profiles. We will use the DL syntax shown in Table 3.1 and Table 3.2 (pages 51 and 52) to present OWL ontologies and their parts. In the Semantic Web Services community it is still common to present OWL-S service descriptions and their parts in the RDF/XML syntax of the OWL ontology language. XML languages are suitable for computers to read and translate; however, they are not designed to be read and understood by humans. This is why we introduce our own frame-like surface syntax for OWL-S processes and profiles. For the interested reader, a complete OWL-S service description in RDF/XML syntax is presented in Appendix D.

3.4.1 OWL Ontologies – Classes, Properties and Individuals

An OWL ontology is defined by its axioms and facts. They provide information about classes, properties, and individuals in the ontology. It is important to note that, in the context of OWL, the notions ontology and knowledge base are equivalent. Next to terminological statements (definitions of classes and properties) OWL ontologies can also contain assertional statements (definitions of individuals). For the following definitions we assume that U is a countably infinite set of URI references as defined in [Berners-Lee et al., 1998].

Definition 3.4 (OWL-DL Ontology) Let $\mathcal{A}$ be a set of OWL axioms and $\mathcal{E}$ be a set of OWL-DL facts (individual definitions) according to Tables 3.1 and 3.2. We call the pair $O = \langle \mathcal{A}, \mathcal{E} \rangle$ an OWL-DL ontology. The set $OWLDL$ is the set of all OWL-DL ontologies. ♦

Restrictions on OWL properties introduce new abstract classes that are not explicitly named. We differentiate those from the set of simple, named classes of an OWL-DL ontology, which is the set of all classes that are explicitly defined and referred to by a URI [12]:

Definition 3.5 (Simple Named Classes) Let $O = \langle \mathcal{A}, \mathcal{E} \rangle$ be an OWL-DL ontology. We define the set $\mathcal{C}_O$ of simple, named classes in $O$ as

$$\mathcal{C}_O := \{\, C \mid {}'(C \sqsubseteq C_1 \sqcap \dots \sqcap C_n)' \in \mathcal{A} \;\vee\; {}'(C = C_1 \sqcap \dots \sqcap C_n)' \in \mathcal{A} \text{ with } n \in \mathbb{N} \,\},$$

where the $C_i \neq C$ ($i = 1, \dots, n$) are either classes defined in $O$, the superclass $\top$ (owl#Thing), or OWL-DL descriptions for class union ($D_1 \sqcup D_2$), complement ($\neg D$), enumeration ($\{o_1, \dots, o_m\}$), or cardinality or range descriptions on properties as shown in Table 3.1. The same holds recursively for $D$, $D_1$ and $D_2$. ♦

[12] We use single quotes to distinguish between Description Logic formulae and the mathematical meta level. For example, $'\varphi' \in \mathcal{A}$ simply means that the description logic formula $\varphi$ is in the set $\mathcal{A}$.

Names of ontology classes are URI references, which typically share the same XML namespace. For the sake of readability we therefore allow an additional axiom type in OWL-DL ontologies that introduces abbreviations for XML namespaces. For exam- ple, we write prefix = http://www.example.org/ontology.owl to introduce prefix as an abbreviation for the URI http://www.example.org/ontology.owl, which can then be used throughout an ontology. Given an OWL-DL ontology we are also interested in the set of individuals defined in the ontology:

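Definition 3.6 (Individuals of an Ontology) Let $O = \langle \mathcal{A}, \mathcal{E} \rangle$ be an OWL-DL ontology. We define the set $\mathcal{I}_O$ of individuals in $O$ as

$$\mathcal{I}_O := \{\, i \mid {}'(i \in C)' \in \mathcal{E} \text{ for some } C \in \mathcal{C}_O \,\}$$ ♦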

Example 3.4.1 We would like to illustrate the above definitions with an example ontology that is a simple conceptualisation of a part of the animal world. The ontology consists of the following axioms and facts:

    an = http://www.biology.org/animals.owl
    an#LivingThing ⊑ owl#Thing
    an#Animal ⊑ an#LivingThing
    an#Plant ⊑ an#LivingThing
    ≥ 1 an#eats ⊑ an#Animal
    ⊤ ⊑ ∀ an#eats. an#LivingThing
    an#Carnivore = an#Animal ⊓ ∃ an#eats. an#Animal
    an#Herbivore = an#Animal ⊓ ∀ an#eats. an#Plant
    an#Fido ∈ an#Carnivore

The ontology introduces five classes (the classes of living things, animals, plants, carnivores and herbivores) and an object property “eats” which models that every animal eats some living things. Furthermore, Fido is defined as an individual of type Carnivore. All classes, properties and individuals are defined in the same XML namespace abbreviated by an. The superclass ⊤ of all classes in OWL is named owl#Thing. The concepts Carnivore and Herbivore are then defined as subclasses of Animal with a value restriction on the eats property. Thus, the simple, named classes in our ontology are LivingThing, Animal, Plant, Carnivore and Herbivore which all live in the namespace an. Figure 3.4 shows a graphical representation of the subsumption hierarchy of our ontology.

We define the set of datatype and object properties in an ontology as those properties that have explicit domain and range statements. They are crucial for the definition of preconditions and effects of OWL-S processes:

Figure 3.4: Subsumption hierarchy of the animals ontology

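Definition 3.7 (Datatype and Object Properties) Let $O = \langle \mathcal{A}, \mathcal{E} \rangle$ be an OWL-DL ontology. We define the set $\mathcal{R}_O$ of properties in $O$ as the set of all properties with explicit domain and range definitions, i.e.

$$\mathcal{R}_O := \{\, R \mid {}'(\geq 1\, R \sqsubseteq C)' \in \mathcal{A} \text{ and } {}'(\top \sqsubseteq \forall R.\, D)' \in \mathcal{A} \text{ for some } C, D \in \mathcal{C}_O \,\}.$$ ♦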
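Definitions 3.5 to 3.7 can be read as three simple set computations over the axioms and facts of an ontology. The following sketch evaluates them on the animals ontology of Example 3.4.1. The tagged-tuple encoding of axioms and the function names are our own assumptions, made only for illustration; class expressions on the right-hand side of definitions are kept as opaque strings.

```python
from typing import List, Set, Tuple

# Hypothetical encoding of the axioms of Example 3.4.1:
#   ("sub", C, D)     C is a subclass of D
#   ("eq", C, expr)   C is defined as the class expression expr
#   ("domain", R, C)  domain statement for property R
#   ("range", R, D)   range statement for property R
axioms: List[Tuple[str, str, str]] = [
    ("sub", "an#LivingThing", "owl#Thing"),
    ("sub", "an#Animal", "an#LivingThing"),
    ("sub", "an#Plant", "an#LivingThing"),
    ("domain", "an#eats", "an#Animal"),
    ("range", "an#eats", "an#LivingThing"),
    ("eq", "an#Carnivore", "an#Animal and (an#eats some an#Animal)"),
    ("eq", "an#Herbivore", "an#Animal and (an#eats only an#Plant)"),
]
# Facts are pairs (individual, class).
facts: List[Tuple[str, str]] = [("an#Fido", "an#Carnivore")]


def named_classes(axioms) -> Set[str]:
    """Classes on the left-hand side of a subclass or definition axiom."""
    return {a[1] for a in axioms if a[0] in ("sub", "eq")}


def individuals(facts, classes) -> Set[str]:
    """Individuals asserted to belong to some simple, named class."""
    return {i for i, c in facts if c in classes}


def properties(axioms, classes) -> Set[str]:
    """Properties with both a domain and a range statement."""
    with_domain = {a[1] for a in axioms
                   if a[0] == "domain" and a[2] in classes}
    with_range = {a[1] for a in axioms
                  if a[0] == "range" and a[2] in classes}
    return with_domain & with_range


classes = named_classes(axioms)
```

For the example ontology, `classes` contains the five named classes, `individuals(facts, classes)` contains only Fido, and `properties(axioms, classes)` contains only the eats property.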

3.4.2 The Semantic Web Rule Language

The Semantic Web Rule Language (SWRL, pronounced “swirl”) [Horrocks et al., 2004] combines OWL with the rule language RuleML [Boley et al., 2001]. SWRL is one of the languages directly supported by OWL-S to describe the preconditions and effects of atomic processes and service profiles. Rules in SWRL are implications over conjunctions of atoms. Atoms in SWRL rules may refer to individuals, data literals, individual variables or data variables. Variable names are URI references in $U$ just as the names of OWL classes, properties and individuals. In what follows, we assume that a set $V_D \subset U$ of data variables and a set $V_I \subset U$ of individual variables are given. Next to the recommended XML/RDF syntax, an abstract syntax has been proposed for SWRL [Horrocks et al., 2004]. In our work we use an extension of the Description Logic syntax of OWL-DL to represent SWRL expressions. We use a restricted set of SWRL atoms, namely the set of individual property atoms:

Definition 3.8 (SWRL Atoms) Let $O$ be an OWL-DL ontology, $\mathcal{I}_O$ be the set of individuals in $O$, and $V_I$ be a set of individual variables. The set $\mathcal{A}^{O}_{SWRL}$ of SWRL atoms over $O$ is the smallest set such that

$${}'(\langle i_1, i_2 \rangle \in R)' \in \mathcal{A}^{O}_{SWRL} \quad \text{for all } R \in \mathcal{R}_O \text{ and } i_1, i_2 \in \mathcal{I}_O \cup V_I.$$ ♦

The language $SWRL^{O}_{\wedge\neg}$ ($SWRL^{O}_{\wedge\vee}$) contains conjunctions (and disjunctions) of (positive) SWRL individual property literals. The language $SWRL^{O}_{\wedge\neg}$ will be used to define the preconditions of atomic OWL-S processes and service profiles. $SWRL^{O}_{\wedge\vee}$ will be used for the effects of services (see Section 3.4.3.3).

Definition 3.9 (Disjunctions and Conjunctions of SWRL Literals)

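Let $O$ be an OWL-DL ontology and $\mathcal{A}^{O}_{SWRL}$ be the set of all SWRL atoms over $O$. The sets $SWRL^{O}_{\wedge\neg}$ and $SWRL^{O}_{\wedge\vee}$ are defined as the smallest sets fulfilling:

1. $\mathcal{A}^{O}_{SWRL} \subset SWRL^{O}_{\wedge\neg}$ and $\mathcal{A}^{O}_{SWRL} \subset SWRL^{O}_{\wedge\vee}$ (Individual Property Atoms)

2. if $A \in \mathcal{A}^{O}_{SWRL}$ then $\neg A \in SWRL^{O}_{\wedge\neg}$ (Negation)

3. if $A_1, \dots, A_n \in \mathcal{A}^{O}_{SWRL}$ then $L_1 \wedge \dots \wedge L_n \in SWRL^{O}_{\wedge\neg}$ and $A_1 \wedge \dots \wedge A_n \in SWRL^{O}_{\wedge\vee}$, where $L_i = A_i$ or $L_i = \neg A_i$. (Conjunction)

4. if $A_1, \dots, A_n \in \mathcal{A}^{O}_{SWRL}$ then $A_1 \vee \dots \vee A_n \in SWRL^{O}_{\wedge\vee}$. (Disjunction) ♦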

Note that the set $SWRL^{O}_{\wedge\vee}$ is not a superset of $SWRL^{O}_{\wedge\neg}$. We do allow negated SWRL atoms in $SWRL^{O}_{\wedge\neg}$, but not in $SWRL^{O}_{\wedge\vee}$. Allowing only positive atoms in service effects simplifies several reasoning tasks described later in this thesis: In Chapter 5 we will describe how a Description Logic reasoner can be used to check whether the effects of a query profile are entailed by the effects of a service profile. Negative effects would require a more complicated reasoning procedure for this entailment test. In Section 6.4.2.2 we will show how OWL-S service profiles can be translated into planning operators for the planning system PRODIGY. The planning operators will not have any negative effects (i.e. they have empty delete lists). Thus, interleaving goals cannot occur during planning and a more efficient version of PRODIGY can be used. So far, all services in MathServe could be described without negative effects. This is mainly due to the fact that almost all reasoning services create new objects, and the service effects simply make statements about these new objects. Thus, it is not necessary to negate old facts about existing objects.

We extend the entailment relation introduced in Section 3.1.5 to disjunctions and conjunctions of SWRL literals:

Definition 3.10 (Entailment of SWRL Formulae) Let O be an OWL-DL ontology. We define DL entailment on SWRL formulae as follows:

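1. For every $A \in SWRL^{O}_{\wedge\neg}$, $O \models \neg A$ if and only if $O \not\models A$,

2. for every $A_1 \wedge \dots \wedge A_n \in SWRL^{O}_{\wedge\neg}$, $O \models A_1 \wedge \dots \wedge A_n$ if and only if $O \models A_i$ for all $1 \leq i \leq n$, and

3. for every $A_1 \vee \dots \vee A_n \in SWRL^{O}_{\wedge\vee}$, $O \models A_1 \vee \dots \vee A_n$ if and only if $O \models A_i$ for some $1 \leq i \leq n$. ♦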
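The three clauses of Definition 3.10 reduce entailment of a formula to entailment of its atoms. The following sketch mirrors that recursion. It is an illustration under our own assumptions: formulae are encoded as tagged tuples, and the Description Logic reasoner is replaced by a caller-supplied function `holds` that decides atoms (here a plain membership test in a hypothetical set of facts).

```python
from typing import Callable, Tuple

# Hypothetical encoding of SWRL formulae:
#   ("atom", R, i1, i2)         individual property atom
#   ("not", formula)            negated atom
#   ("and", (f1, ..., fn))      conjunction of literals
#   ("or",  (f1, ..., fn))      disjunction of atoms
Formula = Tuple


def entails(formula: Formula, holds: Callable[[Formula], bool]) -> bool:
    """Decide entailment of a formula, given an oracle for atoms."""
    tag = formula[0]
    if tag == "atom":
        return holds(formula)
    if tag == "not":   # clause 1: the negation holds iff the atom does not
        return not entails(formula[1], holds)
    if tag == "and":   # clause 2: every conjunct is entailed
        return all(entails(f, holds) for f in formula[1])
    if tag == "or":    # clause 3: some disjunct is entailed
        return any(entails(f, holds) for f in formula[1])
    raise ValueError(f"unknown formula tag: {tag!r}")


# A stand-in for the ontology: the set of atoms taken to be entailed.
known = {("atom", "eats", "Fido", "Rex")}


def holds(atom: Formula) -> bool:
    return atom in known


precondition = ("and", (
    ("atom", "eats", "Fido", "Rex"),
    ("not", ("atom", "eats", "Rex", "Fido")),
))
```

Note that clause 1 treats negation as failure to entail the atom, so `entails(precondition, holds)` succeeds simply because the second atom is absent from `known`.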

We also define functions for determining the variables and predicates occurring in SWRL formulae:

Definition 3.11 (Variables and Properties in SWRL Formulae) Let $O$ be an OWL-DL ontology. We define the set of (free) variables of SWRL atoms and formulae on $O$ as follows:

-- $vars(\langle i_1, i_2 \rangle \in R) := \{i_1, i_2\} \cap V_I$,

-- $vars(\neg \langle i_1, i_2 \rangle \in R) := \{i_1, i_2\} \cap V_I$, and

-- $vars(L_1\, \omega \dots \omega\, L_n) := \bigcup_{i=1,\dots,n} vars(L_i)$, where $\omega \in \{\wedge, \vee\}$.

Similarly, the set of OWL properties (predicates) occurring in SWRL formulae is defined as:

-- $props(\langle i_1, i_2 \rangle \in R) := \{R\}$,

-- $props(\neg \langle i_1, i_2 \rangle \in R) := \{R\}$, and

-- $props(L_1\, \omega \dots \omega\, L_n) := \bigcup_{i=1,\dots,n} props(L_i)$, where $\omega \in \{\wedge, \vee\}$.

The set of literals occurring in SWRL formulae is defined as:

-- $lits(\langle i_1, i_2 \rangle \in R) := \{\langle i_1, i_2 \rangle \in R\}$,

-- $lits(\neg \langle i_1, i_2 \rangle \in R) := \{\neg \langle i_1, i_2 \rangle \in R\}$, and

-- $lits(L_1\, \omega \dots \omega\, L_n) := \{L_1, \dots, L_n\}$, where $\omega \in \{\wedge, \vee\}$.

The set of atoms occurring positively or negatively in a SWRL formula $\varphi$ is defined as: $atoms(\varphi) := \{A \mid A \in lits(\varphi) \text{ or } \neg A \in lits(\varphi)\}$. ♦
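The four functions of Definition 3.11 translate almost literally into code. The sketch below uses a tagged-tuple encoding of formulae and, as a purely illustrative assumption, treats every name starting with "?" as an individual variable; the definition itself only requires membership in the set of individual variables.

```python
from typing import Set, Tuple

# Hypothetical encoding: ("atom", R, i1, i2), ("not", atom),
# ("and", (l1, ..., ln)) and ("or", (l1, ..., ln)).
Formula = Tuple


def is_variable(term: str) -> bool:
    """Assumption for this sketch: variables are written with a leading '?'."""
    return term.startswith("?")


def lits(formula: Formula) -> Set[Formula]:
    """Literals of a formula; a single literal is its own only literal."""
    if formula[0] in ("and", "or"):
        return set(formula[1])
    return {formula}


def atoms(formula: Formula) -> Set[Formula]:
    """Atoms occurring positively or negatively in a formula."""
    return {l[1] if l[0] == "not" else l for l in lits(formula)}


def vars_(formula: Formula) -> Set[str]:
    """Individual variables occurring in a formula."""
    return {t for a in atoms(formula) for t in a[2:] if is_variable(t)}


def props(formula: Formula) -> Set[str]:
    """OWL properties (predicates) occurring in a formula."""
    return {a[1] for a in atoms(formula)}


example = ("and", (
    ("atom", "eats", "?x", "Fido"),
    ("not", ("atom", "likes", "?x", "?y")),
))
```

For `example`, `vars_` yields the two variables `?x` and `?y`, `props` yields the properties `eats` and `likes`, and `lits` returns the two literals unchanged, including the negation.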

Sometimes, variables in SWRL atoms or formulae have to be replaced by other variables or individuals, i.e. we need a notion of a substitution:

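Definition 3.12 (Substitutions for SWRL Formulae) Let $O$ be an OWL-DL ontology, $\mathcal{I}_O$ be the set of individuals in $O$, and $V_I$ be a set of individual variables. A substitution for $O$ is a mapping

$$\sigma : V_I \rightarrow (V_I \cup \mathcal{I}_O).$$

For every $'(\langle i_1, i_2 \rangle \in R)' \in \mathcal{A}^{O}_{SWRL}$ and for $A_1, \dots, A_n \in \mathcal{A}^{O}_{SWRL}$ we define substitution application as:

-- $(\langle i_1, i_2 \rangle \in R)\sigma = (\langle j_1, j_2 \rangle \in R)$, where $j_k := \begin{cases} \sigma(i_k) & \text{if } i_k \in V_I \\ i_k & \text{otherwise,} \end{cases}$

-- $(\neg A)\sigma = \neg(A\sigma)$,

-- $(A_1 \wedge \dots \wedge A_n)\sigma = A_1\sigma \wedge \dots \wedge A_n\sigma$, and

-- $(A_1 \vee \dots \vee A_n)\sigma = A_1\sigma \vee \dots \vee A_n\sigma$.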
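Substitution application as in Definition 3.12 is a structural recursion over the formula. The following sketch shows one way to write it; the tagged-tuple encoding and the representation of a substitution as a Python dictionary are our own assumptions. Variables that are missing from the dictionary are mapped to themselves, which corresponds to extending a partial mapping by the identity.

```python
from typing import Dict, Tuple

# Hypothetical encoding: ("atom", R, i1, i2), ("not", atom),
# ("and", (f1, ..., fn)) and ("or", (f1, ..., fn)).
Formula = Tuple


def apply_substitution(formula: Formula, sigma: Dict[str, str]) -> Formula:
    """Replace variables according to sigma; other terms stay unchanged."""
    tag = formula[0]
    if tag == "atom":
        _, prop, i1, i2 = formula
        return ("atom", prop, sigma.get(i1, i1), sigma.get(i2, i2))
    if tag == "not":
        return ("not", apply_substitution(formula[1], sigma))
    # Conjunctions and disjunctions are substituted component-wise.
    return (tag, tuple(apply_substitution(f, sigma) for f in formula[1]))


formula = ("and", (
    ("atom", "eats", "?x", "Fido"),
    ("not", ("atom", "eats", "?y", "?x")),
))
instance = apply_substitution(formula, {"?x": "Rex"})
```

In `instance`, both occurrences of `?x` have been replaced by the individual `Rex`, while `?y` and the individual `Fido` are left untouched. The dictionary keys should be variables only, since the definition does not substitute for individuals.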

SWRL rules are simple implications whose antecedents and succedents are conjunctions of SWRL atoms.

Definition 3.13 (SWRL Rules) The set $RL_O$ of SWRL rules over an ontology $O$ contains all implications over conjunctions of SWRL atoms, i.e.

$$RL_O = \{\, A_1, \dots, A_m \Rightarrow B_1, \dots, B_n \mid A_i, B_j \in \mathcal{A}^{O}_{SWRL} \,\}.$$ ♦

Notation 3.14: For a SWRL rule $(A_1, \dots, A_m \Rightarrow B_1, \dots, B_n) \in RL_O$ with $m, n \in \mathbb{N}_0$ we also write:

$B_1 \wedge \dots \wedge B_n$ if $m = 0$ and

$\neg(A_1 \wedge \dots \wedge A_m)$ if $n = 0$.

The definitions of the functions vars (.), props (.), lits (.) and atoms (.) (see Defini- tion 3.11) are extended to SWRL rules in the natural way:

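Definition 3.15 (Properties in SWRL Rules) The sets of variables, OWL properties (predicates), literals and atoms occurring in a SWRL rule $\varphi \Rightarrow \psi$ are defined as:

$vars(\varphi \Rightarrow \psi) := vars(\varphi) \cup vars(\psi)$
$props(\varphi \Rightarrow \psi) := props(\varphi) \cup props(\psi)$
$lits(\varphi \Rightarrow \psi) := lits(\varphi) \cup lits(\psi)$
$atoms(\varphi \Rightarrow \psi) := atoms(\varphi) \cup atoms(\psi)$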

3.4.3 OWL-S Service Descriptions

We saw in Section 3.2.2 that an OWL-S service description consists of three parts: a service profile, a process model, and a service grounding. Service profiles are of particular interest because they can be used for service discovery and automated service composition as realised in the MathServe broker (see Chapter 5). In what follows, we introduce convenient (and human-readable) notations for OWL-S atomic processes and service profiles. [13]

[13] By the time our work was done, a human-readable surface syntax for OWL-S processes was under development. However, this development had not been finished yet and the surface syntax was designed for version 1.0 of OWL-S. This is why we define our own surface syntax for OWL-S processes and profiles.

3.4.3.1 Atomic Processes

A crucial part of an OWL-S service description is the process model which gives a detailed perspective of how a service operates. Atomic processes are directly related to concrete Web Services and define the input and output parameters as well as the preconditions and effects of Web Services.

Process Parameters. OWL-S parameter declarations consist of an RDF identifier (a URI reference), an OWL class which provides type information, and, optionally, a value field that may contain arbitrary XML content. Process parameters are defined as SWRL variables coupled with a parameter type. This allows SWRL expressions to refer to the input and output parameters of a process. Input parameters may also be bound to a value (an OWL individual).

Definition 3.16 (Process Parameters) Let $O$ be the domain ontology underlying an OWL-S service description. Furthermore, let $V_I$ be a set of individual variables. Let also $x_1, x_2 \in V_I$, $C_1, C_2 \in \mathcal{C}_O$, and $v$ be an OWL individual of class $C_2$, then

-- $(x_1, C_1)$ is a valid process input parameter and a valid process output parameter with respect to $O$, and

-- $(x_2, C_2, v)$ is also a valid process input parameter with value $v$. ♦

Notation 3.17: We also write x :: C for a parameter (x, C) and x :: C ← v for an input parameter (x, C, v) with value v. If the OWL individual v has an RDF identifier ⋆ns#id_v then we also allow references to that identifier. The reference is marked with ’⋆’ as in x :: C ← ⋆ns#id_v.

OWL-S atomic processes are richer descriptions of concrete web services. They are characterised by their Inputs, Outputs, Preconditions and Effects (the IOPE scheme).

Definition 3.18 (OWL-S Atomic Process) Let O = hA, Ei be an OWL-DL ontology. An OWL-S atomic process based on O is a five-tuple (u,I,O,P,E) where

-- u ∈ U is a URI reference identifying the service,

-- I is a finite set of process input parameters with types in C_O,

-- O is a finite set of process output parameters with types in C_O,

-- P ⊂ SWRL_O^{∧¬} is a finite set of precondition formulae,

-- E ⊂ SWRL_O^{∧∨} is a finite set of effect formulae, and

-- |E ∩ (SWRL_O^{∧∨} − SWRL_O^{∧¬})| ≤ 1.

The last condition ensures that the effects of an atomic process do not contain more than one disjunctive effect. This restriction will become important in Chapter 6 where we discuss the decision-theoretic composition of OWL-S services. ♦
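The five-tuple of Definition 3.18 and its last condition can be sketched in Python as follows. This is only an illustration under stated assumptions: the nested-tuple formula encoding, the class names Parameter and AtomicProcess, and the reading of precondition formulae as "formulae without disjunction" are hypothetical choices made for this sketch and are not part of OWL-S. The example data is taken from the CurrencyConverter service of Figure 3.5.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical formula encoding (an assumption of this sketch):
#   ("atom", predicate, arg1, ...), ("not", f), ("and", f1, ...), ("or", f1, ...)


def has_disjunction(formula):
    """Return True if an "or" node occurs anywhere in the formula."""
    tag = formula[0]
    if tag == "atom":
        return False
    if tag == "or":
        return True
    return any(has_disjunction(sub) for sub in formula[1:])


@dataclass(frozen=True)
class Parameter:
    name: str                    # SWRL variable, e.g. "InputPrice"
    owl_class: str               # parameter type, e.g. "book#Price"
    value: Optional[str] = None  # optional bound individual (inputs only)


@dataclass(frozen=True)
class AtomicProcess:
    uri: str
    inputs: tuple
    outputs: tuple
    preconds: tuple
    effects: tuple

    def __post_init__(self):
        # Assumption: precondition formulae contain no disjunction.
        if any(has_disjunction(p) for p in self.preconds):
            raise ValueError("preconditions must not contain disjunctions")
        # Last condition of the definition: at most one disjunctive effect.
        if len(self.disjunctive_effects()) > 1:
            raise ValueError("at most one disjunctive effect is allowed")

    def disjunctive_effects(self):
        """Return the effect formulae that contain a disjunction."""
        return [e for e in self.effects if has_disjunction(e)]


converter = AtomicProcess(
    uri="CurrencyConverter",
    inputs=(Parameter("InputPrice", "book#Price"),
            Parameter("OutputCurrency", "curr#Currency")),
    outputs=(Parameter("OutputPrice", "book#Price"),),
    preconds=(("atom", "currency", "InputPrice", "curr#GBP"),),
    effects=(("atom", "currency", "OutputPrice", "OutputCurrency"),),
)
```

Constructing an AtomicProcess whose effects contain two formulae with disjunctions raises a ValueError in this sketch, which mirrors the cardinality restriction of the definition.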

Process parameters and ontology classes are identified by URI references that are made unique by the XML namespace mechanism [Bray et al., 1999]. Since XML namespace expressions can be lengthy we allow for an abbreviation mechanism similar to XML namespace and entity declarations [Bray et al., 2004]. This is reflected in our notation for atomic OWL-S processes:

Notation 3.19: Let A be an atomic process (u,I,O,P,E) with I={in1 :: C1, ..., inm :: Cm}, O={out1 :: D1, ..., outn :: Dn}, P = {ϕ1,...,ϕk}, and E = {ψ1,...,ψl} where m, n, k, l ∈ N0. Furthermore, let decls be a list of namespace declarations in the domain ontology underlying A then we write A as

    atomic u:
      inputs:   in1 :: C1 . . . inm :: Cm
      outputs:  out1 :: D1 . . . outn :: Dn
      preconds: ϕ1 ∧ . . . ∧ ϕk
      effects:  ψ1 ∧ . . . ∧ ψl
      decls

with all occurrences of XML namespaces in the inputs, outputs, preconditions and effects replaced by the abbreviations declared in decls. We will often use only the local name of the full URI reference identifying the service.

As an illustration of our notation we present the atomic process of a CurrencyConverter service in Figure 3.5 (see footnote 14). The service takes a price and a desired output currency as inputs and returns the converted price.

    atomic CurrencyConverter:
      inputs:   InputPrice :: book#Price
                OutputCurrency :: curr#Currency
      outputs:  OutputPrice :: book#Price
      preconds: currency(InputPrice, curr#GBP)
      effects:  currency(OutputPrice, OutputCurrency)
      book = http://www.mindswap.org/2004/owl-s/concepts.owl
      curr = http://www.daml.ecs.soton.ac.uk/ont/currency.owl

Figure 3.5: The atomic process of the CurrencyConverter service

The precondition of CurrencyConverter requires the input price (InputPrice) to be in British pounds (GBP). The effect of the service is that the currency of the resulting price is the one given as an input (OutputCurrency). The process description does not define how the input and the output price are related.
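The abbreviation mechanism and the surface notation can be illustrated with a short Python sketch. The namespace declarations are copied from Figure 3.5; the function names expand and render_atomic are hypothetical, and we assume here that an abbreviated name prefix#local stands for the declared namespace URI followed by "#" and the local name.

```python
# Namespace declarations copied from Figure 3.5.
DECLS = {
    "book": "http://www.mindswap.org/2004/owl-s/concepts.owl",
    "curr": "http://www.daml.ecs.soton.ac.uk/ont/currency.owl",
}


def expand(name, decls):
    """Expand "prefix#local" to a full URI reference; leave other names as-is."""
    prefix, sep, local = name.partition("#")
    if sep and prefix in decls:
        return f"{decls[prefix]}#{local}"
    return name


def render_atomic(name, inputs, outputs, preconds, effects, decls):
    """Render an atomic process in the 'atomic u:' surface notation."""
    def params(pairs):
        return " ".join(f"{var} :: {cls}" for var, cls in pairs)

    lines = [
        f"atomic {name}:",
        "  inputs:   " + params(inputs),
        "  outputs:  " + params(outputs),
        "  preconds: " + " ∧ ".join(preconds),
        "  effects:  " + " ∧ ".join(effects),
    ]
    lines += [f"  {prefix} = {uri}" for prefix, uri in decls.items()]
    return "\n".join(lines)


text = render_atomic(
    "CurrencyConverter",
    inputs=[("InputPrice", "book#Price"), ("OutputCurrency", "curr#Currency")],
    outputs=[("OutputPrice", "book#Price")],
    preconds=["currency(InputPrice, curr#GBP)"],
    effects=["currency(OutputPrice, OutputCurrency)"],
    decls=DECLS,
)
print(text)
print(expand("curr#GBP", DECLS))
```

Names without a declared prefix, such as the parameter InputPrice, are returned unchanged by expand.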

14 The OWL-S description is available at http://www.mindswap.org/2004/owl-s/1.1/CurrencyConverter.owl.

3.4.3.2 Conditional Probabilistic Effects

Not all Web Services are as deterministic as the currency converter service described above. One can imagine that the invocation of more dynamic Web Services can produce many different outcomes. This is the case for some of the reasoning services available in the MathServe framework. Sometimes, statistical data about the likelihood of certain outcomes can be collected (see Section 4.4). In our framework, statistical data about possible outcomes of a reasoning service are modelled as (conditional) probabilistic effects. Originally, OWL-S did not offer the means to define probabilistic effects. Therefore, we define this new class of effects as an extension of SWRL rules with a (possibly empty) body and a non-empty head:

Definition 3.20 (Conditional Probabilistic Effects) The set CPE_O of conditional probabilistic effects is defined as the set of all triples (ϕ, pr, c) where

-- ϕ ∈ RL_O is a SWRL rule with a nonempty head, i.e.

   ϕ = A1 ∧ . . . ∧ Am ⇒ C1 ∧ . . . ∧ Cn with m, n ∈ N0 and n > 0,

-- pr ∈ [0; 1] is a probability, and

-- c ∈ N is a cost value.

(ϕ, pr, c) ∈ CPE_O is called a probabilistic effect if ϕ = C1 ∧ . . . ∧ Cn. ♦

Intuitively, the meaning of a conditional probabilistic effect (ϕ,pr,c) with ϕ = A1 ∧ . . . ∧ Am ⇒ C1 ∧ . . . ∧ Cn is the following:

If the condition A1 ∧ . . . ∧ Am holds in the current world state then the invocation of a service with the effect (ϕ, pr, c) will, with probability pr, result in a state where C1 ∧ . . . ∧ Cn is true. The average cost of “successful” invocations of the service (i.e. invocations after which C1 ∧ . . . ∧ Cn indeed becomes true) is c.

A formal semantics of probabilistic effects will be provided in Chapter 6 and Appendix E where we translate service profiles (containing such effects) into a stochastic situation calculus action domain.

Notation 3.21: For a conditional probabilistic effect (ϕ ⇒ ψ, pr, c) we also write ϕ → ψ (pr)(c)
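The intuitive reading of a conditional probabilistic effect can be made concrete with a small Python sketch. It is not the formal semantics (which is given in Chapter 6 and Appendix E). The class name CondProbEffect, the representation of atoms as strings, the treatment of cost values as non-negative integers, and the example atoms and numbers are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CondProbEffect:
    body: frozenset  # atoms A1, ..., Am (may be empty)
    head: frozenset  # atoms C1, ..., Cn (must be non-empty)
    pr: float        # probability in [0, 1]
    cost: int        # average cost of "successful" invocations

    def __post_init__(self):
        if not self.head:
            raise ValueError("the head of the rule must be non-empty")
        if not 0.0 <= self.pr <= 1.0:
            raise ValueError("pr must lie in [0, 1]")
        if self.cost < 0:
            raise ValueError("cost must not be negative")

    def is_unconditional(self):
        """True for a plain 'probabilistic effect', i.e. an empty body."""
        return not self.body

    def applicable(self, state):
        """The condition holds if every body atom is in the current state."""
        return self.body <= state


def success_probability(effect, state):
    """Probability that the head holds after invocation, or None if the
    condition does not hold (the intuitive reading says nothing then)."""
    return effect.pr if effect.applicable(state) else None


# Hypothetical example: made-up atoms, probability and cost.
effect = CondProbEffect(
    body=frozenset({"problemClass(p, c1)"}),
    head=frozenset({"status(r, Theorem)"}),
    pr=0.8,
    cost=12,
)
state = frozenset({"problemClass(p, c1)", "language(p, TPTP)"})
print(success_probability(effect, state))
```

For a state in which the body atoms do not all hold, success_probability returns None rather than a number, since the intuitive reading only covers the case where the condition is satisfied.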

The definitions of the functions props(.) and atoms(.) (see Definition 3.11) can be extended to probabilistic effects in a natural way:

Definition 3.22 (Properties and Atoms of Probabilistic Effects) For an ontology O, the sets of OWL properties and atoms occurring in probabilistic effects are defined as:

    props((ϕ, pr, c)) := props(ϕ), and
    atoms((ϕ, pr, c)) := atoms(ϕ),

for all (ϕ, pr, c) ∈ CPE_O. ♦

3.4.3.3 OWL-S Service Profiles

OWL-S service profiles contain information needed for service discovery and matchmaking purposes. They can be regarded as extensions of atomic processes. In particular, OWL-S requires the inputs, outputs, preconditions and effects of a profile to exist in the corresponding process model. Additionally, service profiles contain further information about a service:

-- The service category of a profile is used for classifying a service and refers to an entry in some ontology or taxonomy of services.

-- An expandable list of service parameters can be used to express additional properties of a service. It is worth mentioning that service parameters have nothing in common with process or profile parameters.

The categorisation information in an OWL-S profile can simply be seen as a URI pointing to an entry in a taxonomy. We will introduce a taxonomy for semantic reasoning services in Section 4.2. In Section 4.4.5 we show how to use service parameters to encode conditional probabilistic effects. In what follows, we define profiles as atomic processes with categories and conditional probabilistic effects:

Definition 3.23 (OWL-S Service Profile) Let O = ⟨A, E⟩ be an OWL-DL ontology. An OWL-S service profile F based on O is a tuple (u, I, O, P, E, C, PE) where (u, I, O, P, E) is an atomic process based on O, C = {uri1, ..., urin} is a finite set of URI references to service categories, and PE = {e1, ..., ep} is a finite set of conditional probabilistic effects. The probabilistic effects in PE have to be related to the disjunctive effect in E_∨ := E ∩ (SWRL_O^{∧∨} − SWRL_O^{∧¬}) as follows:

    ∀(ϕ ⇒ ψ, pr, c) ∈ PE. atoms(ψ) ⊆ ⋃_{e ∈ E_∨} atoms(e),

i.e. all SWRL atoms occurring in the heads of SWRL rules of probabilistic effects have to occur in the disjunctive effect in E. This ensures that probabilistic effects only assign probabilities to effects explicitly mentioned as one of the possible effects of a nondeterministic service. In case E_∨ is empty but there are atoms in the heads of probabilistic effects, the tuple (u, I, O, P, E, C, PE) is not a valid service profile. The set OWLSP_O is the set of all possible OWL-S profiles based on O. ♦
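The side condition of Definition 3.23 can be checked mechanically. The following Python sketch does so under the assumption that formulae are encoded as nested tuples and that the disjunctive effects are exactly those effect formulae containing an "or" node; the encoding, the function names and the example atoms are hypothetical.

```python
# Hypothetical formula encoding (an assumption of this sketch):
#   ("atom", predicate, arg1, ...), ("not", f), ("and", f1, ...), ("or", f1, ...)


def atoms(formula):
    """Collect all atoms occurring in a formula."""
    if formula[0] == "atom":
        return {formula}
    result = set()
    for sub in formula[1:]:
        result |= atoms(sub)
    return result


def has_disjunction(formula):
    """Return True if an "or" node occurs anywhere in the formula."""
    tag = formula[0]
    if tag == "atom":
        return False
    if tag == "or":
        return True
    return any(has_disjunction(sub) for sub in formula[1:])


def valid_probabilistic_effects(effects, prob_effects):
    """Check that every head atom of every probabilistic effect occurs in
    a disjunctive effect. prob_effects holds (body, head, pr, cost) tuples,
    where body and head are sets of atoms."""
    allowed = set()
    for effect in effects:
        if has_disjunction(effect):
            allowed |= atoms(effect)
    return all(set(head) <= allowed for _body, head, _pr, _cost in prob_effects)


# Hypothetical example: a service whose result is one of two statuses.
proved = ("atom", "status", "r", "Theorem")
timeout = ("atom", "status", "r", "Timeout")
effects = [("or", proved, timeout)]
prob_effects = [(set(), {proved}, 0.7, 10), (set(), {timeout}, 0.3, 300)]
print(valid_probabilistic_effects(effects, prob_effects))
```

If the effects contain no disjunction, the set of allowed atoms is empty and any probabilistic effect (whose head is non-empty by definition) makes the check fail, which corresponds to the final remark of the definition.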

The services described by OWL-S service profiles are also called atomic Semantic Web Services to differentiate them from composite services. The following notation introduces a convenient way to represent service profiles.

Notation 3.24: For a service profile (u, I, O, P, E, C, PE) with I = {in1 :: C1, ..., inm :: Cm}, O = {out1 :: D1, ..., outn :: Dn}, P = {ϕ1, ..., ϕk}, E = {ψ1, ..., ψl}, C = {uri1, ..., urin}, and PE = {e1, ..., ep} we write

    profile u:
      inputs:   in1 :: C1 . . . inm :: Cm
      outputs:  out1 :: D1 . . . outn :: Dn
      preconds: ϕ1 ∧ . . . ∧ ϕk
      effects:  ψ1 ∧ . . . ∧ ψl
      categs:   uri1 ... urin
      params:   e1, ..., ep
      decls

The correspondence between this notation and the standard RDF/XML syntax of OWL-S is shown in Appendix D. The following set of projection functions will become useful in Chapter 6 and Appendix E.

Definition 3.25 (Projection Functions for Service Profiles) We define a set of projection functions on service profiles as follows:

    name((u,I,O,P,E,C,PE))         := u
    inputs((u,I,O,P,E,C,PE))       := I
    outputs((u,I,O,P,E,C,PE))      := O
    preconds((u,I,O,P,E,C,PE))     := P
    statPreconds((u,I,O,P,E,C,PE)) := {ϕ ∈ P | vars(ϕ) ⊆ vars(I)}
    dynPreconds((u,I,O,P,E,C,PE))  := {ϕ ∈ P | vars(ϕ) ⊈ vars(I)}
    effects((u,I,O,P,E,C,PE))      := E
    disjEffects((u,I,O,P,E,C,PE))  := E ∩ (SWRL_O^{∧∨} − SWRL_O^{∧¬})
    cats((u,I,O,P,E,C,PE))         := C, and
    cpEffects((u,I,O,P,E,C,PE))    := PE,

for all (u,I,O,P,E,C,PE) ∈ OWLSP_O. The function vars (see Definition 3.11) is extended to lists of input and output parameters:

vars({in1 :: C1←v1,..., inm :: Cm←vm}) = {in1,...,inm} and

vars({par1 :: C1,..., parn :: Cn}) = {par1,...,parn}. ♦

The functions statPreconds and dynPreconds determine the static and dynamic preconditions of a service profile, respectively. Static preconditions contain only input parameters as variables. The distinction between static and dynamic preconditions is important in the context of planning for automated Web Service composition (see Chapter 6). Static preconditions can be checked during planning without a concrete binding of the preconditions’ variables to objects. Dynamic preconditions express properties of concrete objects (OWL individuals) that are not known during planning.
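The split into static and dynamic preconditions can be sketched in Python as follows. The sketch assumes that preconditions are single atoms given as tuples (predicate, arg1, ...) and, as a simplification that is not taken from Definition 3.11, that every argument without a namespace prefix (i.e. without "#") is a variable. The second precondition in the example is hypothetical and only serves to show a dynamic precondition.

```python
def vars_of(atom):
    """Variables of an atom: arguments that are not namespaced individuals."""
    _predicate, *args = atom
    return {arg for arg in args if "#" not in arg}


def stat_preconds(preconds, inputs):
    """Preconditions whose variables are all input parameters."""
    input_vars = {name for name, _owl_class in inputs}
    return [p for p in preconds if vars_of(p) <= input_vars]


def dyn_preconds(preconds, inputs):
    """Preconditions that mention at least one non-input variable."""
    input_vars = {name for name, _owl_class in inputs}
    return [p for p in preconds if not vars_of(p) <= input_vars]


inputs = [("InputPrice", "book#Price"), ("OutputCurrency", "curr#Currency")]
preconds = [
    ("currency", "InputPrice", "curr#GBP"),   # from Figure 3.5
    ("rateKnown", "OutputCurrency", "Rate"),  # hypothetical, "Rate" is no input
]
print(stat_preconds(preconds, inputs))
print(dyn_preconds(preconds, inputs))
```

By construction the two functions partition the set of preconditions, in line with the ⊆ and ⊈ conditions of the projection functions statPreconds and dynPreconds.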

3.5 Summary

The Semantic Web is an extension of the classical Web in which resources are given well-defined meaning. Simple statements about resources in the Semantic Web can be expressed in RDF. Ontologies written in OWL can be used for more expressive statements about classes of objects and their properties. With the help of the specialised ontology OWL-S, the semantics of Web Services can be described. OWL-S service descriptions can be used to locate services, to choose the most appropriate services for a task, or to combine services manually or automatically. We introduced convenient definitions and notations for OWL ontologies and OWL-S descriptions. Our notation is far more compact and readable than the RDF/XML syntax used by many other authors. We extended OWL-S service profiles by conditional probabilistic effects which will be used in Chapters 4 and 6 to describe the performance of different reasoning systems.

Part II

Brokering Semantic Reasoning Services

Chapter 4

Semantic Reasoning Services

In this chapter we present our work on modelling automated reasoning systems as atomic Semantic Web Services. Reasoning systems are made accessible as Web Services. The semantics of these services is described using the OWL-S upper ontology. We present the systems integrated in MathServe and the OWL-S descriptions of the services they offer.

Since OWL-S descriptions are based on a domain ontology we start with an abstract view of the MathServe OWL-DL ontology in Section 4.1. We will describe further refinements of the ontology as we proceed through the chapter and introduce different kinds of reasoning services. MathServe’s reasoning services are divided into different categories. The service categories are defined in the MathServe service taxonomy that is described in Section 4.2. The remainder of the chapter is organised according to the different service categories defined in the taxonomy. We define the services provided by problem transformation systems (Section 4.3), first-order theorem proving systems (Section 4.4), problem analysing systems (Section 4.5), finite model generators (Section 4.6), decision procedures (Section 4.7), and proof transformation systems (Section 4.8).

4.1 A Domain Ontology

The MathServe domain ontology consists of 86 simple, named OWL-DL classes and 34 object and datatype properties. Figure 4.1 shows the top level classes of the MathServe ontology (see footnote 1). Most of these classes have, again, subclasses that we omit for the sake of readability. Some of the subclasses will be introduced later in this chapter. For a presentation of the complete MathServe ontology we refer the reader to Appendix A.

The class Problem is the superclass for different types of reasoning problems. Research in the projects MONET [MONET, 2002] and Calculemus identified three types of reasoning problems that typically occur in mathematical activities: proving, deciding, and computing. The MathServe ontology contains a class for each of these problem types:

ProvingProblem: The class of theorem proving problems, i.e. problem descriptions that contain a set of axioms and a conjecture in a logical theory. A solution to a proving problem can range from a natural language argumentation that shows that the conjecture follows from the axioms up to a checkable formal proof object.

1 All OWL classes are subclasses of the distinguished class owl#Thing.

DecisionProblem: The class of decision problems, i.e. problems consisting of a formula in a decidable background theory. A solution for a decision problem should indicate whether the formula in the problem is valid (unsatisfiable) with respect to the background theory or not (see footnote 2).

ComputationProblem: Computation problems typically refer to algebraic and numeric computations as they are performed by Computer Algebra Systems (CASs). The projects MathBroker [Schreiner and Caprotti, 2001] and MONET [MONET, 2002] have defined several numeric and symbolic computation problems and modelled several computation services as Semantic Web Services (cf. Section 9.1).

In the remaining sections of this chapter we will describe the class of proving problems and its subclasses in more detail. The MathServe ontology also defines the class Resource of resources that might be consumed by services (e.g., CPU time and machine memory), and the class Syntax of concrete syntaxes for encoding problems and formal objects such as formulae, terms and formal proofs. TPTP [Sutcliffe and Suttner, 1998] (see footnote 3), SMT [Ranise and Tinelli, 2006], OMDoc [Kohlhase, 2006] and content MathML [Carlisle et al., 2003] are examples of such syntaxes.

[Figure: class diagram rooted at owl:Thing showing the classes Problem, ProvingProblem, DecisionProblem, ComputationProblem, ModGenProblem, Resource, Syntax, Logic, Result, ProverResult, ModGenResult, DecProcResult, ComputationResult, Failure, Proof, Model and Calculus.]

Figure 4.1: Top level classes of the MathServe domain ontology and the direct subclasses of Problem and Result

2 As a matter of fact, some decision procedures internally use theorem provers (e.g., with a superposition calculus). Thus, these systems (internally) regard decision problems as theorem proving problems. Therefore, the classes ProvingProblem and DecisionProblem could be modelled with a non-empty intersection. However, what matters for the MathServe framework are only the inputs, outputs, preconditions and effects of a service and not its internal workings.

The class Logic consists of all formal logics such as propositional logic, classical first-order predicate logic, modal and temporal logics, λ-calculi, and higher-order logics.

Result stands for all possible results produced by reasoning services. So far we distinguish results from theorem proving systems (ProverResult), model generators (ModGenResult), decision procedures (DecProcResult), and computation systems (ComputationResult). For failed service invocations we introduce the class Failure (see Section 5.2.6).

The MathServe domain ontology has been developed and maintained with the help of the Protégé ontology editor [Gennari et al., 2002]. Protégé fully supports the OWL language family and provides a graphical user interface for editing ontologies and checking their consistency. Figure 4.2 shows the Protégé tool with the MathServe domain ontology loaded.

Figure 4.2: The Protégé tool showing the MathServe domain ontology

4.2 A Taxonomy of Reasoning Systems

For matchmaking and other brokering tasks it is useful to annotate service descriptions with the category the service belongs to. In OWL-S, services are categorised by using URI references to existing taxonomies of services. Figure 4.3 shows the service categories in the MathServe taxonomy. It contains categories for each of the problem classes described in the previous section, i.e. the categories of services for proving (Prover), deciding (DecProc), and computing (CompServ). Theorem proving services are divided into services for first-order (FO-ATP) and higher-order logic (HO-ATP). Furthermore, our taxonomy contains the categories of services that analyse reasoning problems (Analyser) or transform reasoning problems (ProbTrans). Services for the transformation of formal proofs into different formal languages, logics and logical calculi are in the category ProofTrans. Services offered by model generators fall into the category ModelGen.

3 TPTP stands for “Thousands of Problems for Theorem Provers” and denotes both a library of first-order theorem proving problems and a Prolog-style language for encoding these problems. Newer versions of TPTP (v3.0.0 or later) also allow refutation proofs to be encoded.

[Figure: taxonomy tree. Root: MathServe Service. Second level: Prover, DecProc, CompServ, Analyser, TransServ, ModelGen. Third level: ATP, ProbTrans, ProofTrans. Fourth level: FO-ATP, HO-ATP.]

Figure 4.3: Different categories of MathServe services

4.3 Problem Transformation Services

In this section we present atomic Semantic Web Services for the transformation of problems that occur in the context of automated reasoning, such as proving problems (i.e. individuals of the class ProvingProblem shown in Figure 4.1). Transformation services are particularly important in the field of automated reasoning because of the huge variety of different systems and languages developed by different research groups. Despite efforts for developing standard interchange formats for reasoning systems, such as TPTP [Sutcliffe et al., 2004], OpenMath [Caprotti and Cohen, 1998], MathML [Carlisle et al., 2003] and OMDoc [Kohlhase, 2006], most of these systems have their own peculiar input syntax and different systems may have different logical foundations. Furthermore, reasoning problems in a certain logic are sometimes found to belong to a logic of lower expressivity for which more efficient solution mechanisms exist. For instance, theorem proving problems in classical first-order logic without variables are effectively propositional satisfiability problems and may be translated into a format suitable for efficient SAT solvers. First-order problems with variables might also be abstracted to SAT problems for heuristic reasons (see Section 10.1.3).

We first focus on clause normal form generators which are important for automated theorem proving systems based on the resolution calculus.

4.3.1 Clause Normal Form Generators

Many reasoning systems for first-order logic, in particular most first-order Automated Theorem Proving systems, work on clause normal forms (CNFs; see footnote 4) of logical formulae. When translating first-order formulae (FOF) into CNF, the worst-case size of the CNF can be exponential in the size of the original formula (see footnote 5). Thus, efficiently computing small CNFs for FOF problems is a difficult problem which has been addressed by many authors (see, e.g., [Boy de la Tour, 1992, Plaisted and Greenbaum, 1986]). We integrated three CNF generating systems into MathServe. Each system has particular strengths needed in different applications: 1) The tptp2X utility covers the input formats of almost all available ATP systems. 2) The FLOTTER system can efficiently create small CNFs. 3) The TRAMP system computes CNFs as well as a mapping from the literals in that CNF to subformulae of the original first-order formulae. In what follows, we briefly describe the systems and the services they offer.
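The exponential worst case can be reproduced with a few lines of Python. The sketch below is not the algorithm of any of the three systems; it simply distributes disjunction over conjunction for the propositional formula (X1 ∧ Y1) ∨ . . . ∨ (Xn ∧ Yn) and counts the resulting clauses. The function names are hypothetical.

```python
from itertools import product


def dnf_to_cnf(disjuncts):
    """Distribute ∨ over ∧ for a disjunction of conjunctions of literals.

    Each clause of the result picks exactly one literal from every
    conjunction, so the clause set is the Cartesian product of the disjuncts.
    """
    return {frozenset(choice) for choice in product(*disjuncts)}


def example_formula(n):
    """(X1 ∧ Y1) ∨ (X2 ∧ Y2) ∨ ... ∨ (Xn ∧ Yn) as a list of literal lists."""
    return [[f"X{i}", f"Y{i}"] for i in range(1, n + 1)]


for n in (1, 2, 3, 10):
    print(n, len(dnf_to_cnf(example_formula(n))))
```

For n = 2 the result consists of the four clauses X1 ∨ X2, X1 ∨ Y2, Y1 ∨ X2 and Y1 ∨ Y2, and in general the formula with n disjuncts yields 2^n clauses. Optimised clause normal form generators avoid this growth, for instance by the renaming of subformulae.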

The tptp2X utility. The TPTP Library [Sutcliffe and Suttner, 1998] contains the tptp2X utility for reformatting, transforming, and generating TPTP problem files. The transformations currently available in tptp2X include conversion of FOF problems to CNF, random reordering of formulae and literals, addition and removal of equality axioms from problems, and the application of Stickel’s set transformation [Stickel, 1994]. Although the tptp2X utility is not very efficient and the CNFs it produces are not optimised, it proved to be very useful for our purposes mainly because it can transform TPTP problems into the input formats of all major ATP systems. In fact, every first-order ATP service, as described below in Section 4.4, uses tptp2X to translate incoming TPTP problems into the input format of the underlying ATP system.

The FLOTTER system. In [Nonnengart et al., 1998] an efficient method for generating compact CNFs is presented. It integrates an improved version of renaming (as introduced in [Boy de la Tour, 1992]), optimised and strong Skolemisation, and efficient redundancy tests. This method has been implemented in the FLOTTER system which is part of the ATP system SPASS. Although FLOTTER produces small CNFs, its use is restricted to problems in DFG syntax [Hähnle et al., 1996] which is not widely used. We adopted the TPTP syntax (as of Version 3.0.0 of the TPTP Library) as our default syntax for first-order problems. For FLOTTER to be able to transform problems in TPTP syntax, the problems have to be translated to DFG syntax using the tptp2X utility and FLOTTER’s result has to be translated back to TPTP with the same utility. The two calls to tptp2X increase the overall runtime drastically. Nevertheless, we decided to offer FLOTTER as the semantic reasoning web service FlotterCNF because for big problems the size of the CNF can have a crucial impact on the performance of an ATP system tackling the problems. The service profile of FlotterCNF looks as follows (see footnote 6):

4 Clause Normal Forms consist of sets of sets of literals representing conjunctions of disjunctions of those literals.
5 The CNF of the propositional formula (X1 ∧ Y1) ∨ (X2 ∧ Y2) ∨ · · · ∨ (Xn ∧ Yn) contains 2^n clauses.
6 See Section 4.4.1 for definitions of the classes TptpFOFProblem and TptpCNFProblem.

    profile mw#FlotterCNF:
      inputs:   fof_problem :: mw#TptpFOFProblem
      outputs:  cnf_problem :: mw#TptpCNFProblem
      preconds:
      effects:  cnfFor(cnf_problem, fof_problem)
      categs:
      params:
      mw = http://www.mathweb.org/owl/mathserve.owl

The Tramp system. The TRAMP system [Meier, 2000] employs a naïve algorithm for CNF generation which does not incorporate any of the optimisations used, for instance, in the FLOTTER system. However, TRAMP’s CNF transformation computes a ∆-Relation which maps the literals in the CNF clauses to the corresponding literals (term positions) in the original FOF formulae. This mapping is crucial for another module of TRAMP which constructs ND proofs from CNF refutation proofs. For our work we split the two functionalities of the TRAMP system (CNF generation and proof transformation) into two services. The proof transformation service is described in Section 4.8.2. By decoupling the clause normalisation and the proof transformation we leave the choice of the ATP system used to the client application or user. This also allows the CNF problem to be processed by services or combinations of services other than ATP systems. The profile of TRAMP’s service looks very similar to FlotterCNF. However, it has two outputs, the CNF problem and the ∆-Relation generated:

    profile mw#TrampCNF:
      inputs:   fof_problem :: mw#TptpFOFProblem
      outputs:  cnf_problem :: mw#TptpCNFProblem
                delta_relation :: mw#DeltaRelation
      preconds:
      effects:  cnfFor(cnf_problem, fof_problem) ∧
                relatesCNF(delta_relation, cnf_problem) ∧
                toFOF(delta_relation, fof_problem)
      categs:
      params:
      mw = http://www.mathweb.org/owl/mathserve.owl

The additional effects (compared to FlotterCNF) determine the output cnf_problem as the domain and the input fof_problem as the range of the ∆-Relation computed. We hope that future versions of other CNF generators, such as FLOTTER, offer a special mode in which they also compute the ∆-Relation. This would lead to efficient CNFs without losing the information made explicit by TrampCNF.

4.3.2 Higher-Order to First-Order Translation

In the previous section we introduced the TRAMP system which was developed as part of the proof assistant Ωmega [Siekmann et al., 2002]. Ωmega and TRAMP are based on the language POST, a variant of Church’s simply typed λ-calculus with prefix polymorphism (cf. [Church, 1940]). In classical type theory, terms and all their subterms have concrete types. Polymorphism allows the introduction of type variables such that statements can be made for all types. For instance, in POST, a variable X_{oα} denotes a mapping from objects of some type α to objects of the type o of truth values. In Church’s standard notation, the type βα stands for the functional type α → β. We refer the reader to [Andrews, 1986] for a more detailed introduction to the simply typed λ-calculus.

TRAMP was designed to integrate first-order ATP systems into Ωmega and, consequently, it can translate many higher-order problems formalised in POST into first-order problems. TRAMP’s translation mechanism maps a higher-order term P_{oα}(f_{αα}(a_α)) to a first-order term @^1_{oα}(P, @^1_{αα}(f, a)), where @^1_{oα} and @^1_{αα} are newly defined function symbols used to represent applications of unary symbols with the λ-calculus types oα and αα, respectively. Similar first-order function symbols are introduced for higher-order symbols with different arities and types. This approach has been presented in [Kerber, 1992] and was proved to be sound. However, this translation does not work for higher-order terms with embedded λ-abstraction, such as the equation λx_α P_{oα}(x) =_{o(oα)(oα)} λy_α Q_{oα}(y), which states that two sets of objects of type α are equal (see footnote 7). Such terms can be dealt with by a translation using λ-combinators as described in [Meng and Paulson, 2004].

With TRAMP already being integrated in MathServe, it is relatively easy to offer the higher-order to first-order translation of TRAMP as a separate service. The following OWL-DL axioms define the new class of higher-order problems formalised in POST:

{mw#PostProvingProblem = mw#ProvingProblem ⊓ ∀ mw#inLogic. mw#SimpTypLamCalc ⊓ ∀ mw#language. mw#POST }

The class is a subclass of the class mw#ProvingProblem with range restrictions on the properties inLogic and language. The service is offered under the name TrampHo2Fo and has the following profile:

    profile mw#TrampHo2Fo:
      inputs:   hof_problem :: mw#PostProvingProblem
      outputs:  cnf_problem :: mw#TptpCNFProblem
      preconds:
      effects:  cnfFor(cnf_problem, hof_problem)
      categs:
      params:
      mw = http://www.mathweb.org/owl/mathserve.owl

7 In the simply typed λ-calculus sets are typically formalised as λ-terms with type oα.
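The application-operator translation described in this section can be sketched for the unary case. The Python code below is a minimal illustration and not TRAMP’s implementation: the term encoding, the function name translate and the textual rendering of the new function symbols as @1_τ are hypothetical, and only applications whose head is a typed symbol are handled.

```python
def translate(term):
    """Translate a higher-order term into a first-order term string.

    A term is either ("sym", name, type), with the type written in Church's
    notation (e.g. "oα" for α → o), or ("app", function_term, argument_term).
    Every unary application of a symbol of type τ becomes @1_τ(f, a).
    """
    if term[0] == "sym":
        return term[1]
    _tag, fun, arg = term
    if fun[0] != "sym":
        # Nested applications and λ-abstractions are outside this sketch.
        raise NotImplementedError("only applications of symbols are handled")
    fun_type = fun[2]
    return f"@1_{fun_type}({translate(fun)}, {translate(arg)})"


# The example from the text: P_{oα}(f_{αα}(a_α)).
P = ("sym", "P", "oα")
f = ("sym", "f", "αα")
a = ("sym", "a", "α")
example = ("app", P, ("app", f, a))
print(translate(example))
```

Since the name of the generated function symbol depends on the type of the applied symbol, each combination of type and arity gets its own first-order symbol, which is the idea behind the family of @ symbols mentioned above.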

4.4 First-Order Automated Theorem Proving Services

In this section we model Automated Theorem Proving (ATP) systems for classical first-order predicate logic with equality as semantic reasoning web services. In Chapter 2 we have already mentioned why it is hard for human users as well as computer programs to use ATP systems. By modelling ATP systems as Semantic Web Reasoning Services we make them more accessible to both humans and machines. Our OWL-S descriptions of ATP services provide information supporting an automated retrieval of suitable ATP services for a given problem. The services themselves return machine- and human-interpretable outputs in well-established formats and provide precise information about what has been established by the underlying ATP system.

To be able to describe ATP services in OWL-S we extend the MathServe domain ontology with four classes:

• First-order theorem proving problems contain the actual conjectures to be proved.

• ATP problem statuses describe, in an unambiguous way, the logical status of a problem as achieved by an ATP system.

• Results of ATP services contain all the relevant information established by the underlying theorem prover.

• Specialist Problem Classes (SPCs) are used to describe the performance of ATP services on classes of problems.

In the following sections we briefly describe these parts of the MathServe ontology and conclude with the service profiles of some first-order ATP services.

4.4.1 First-order Theorem Proving Problems

In MathServe, first-order theorem proving problems are defined in classical first-order logic with equality and can refer to a predefined first-order theory. A problem can also be annotated with its problem class if this is known. The actual conjecture is provided in the formal description of a problem and is formulated in one of the languages introduced in Section 4.1. In addition to its RDF ID, a problem can also be annotated with a name. A time resource in seconds is provided which defines the time in which an answer to a problem should be delivered. The following set of OWL-DL axioms defines the class of first-order proving problems and the corresponding properties.

{mw#FOProvingProblem = mw#ProvingProblem ⊓
     ∀ mw#inLogic. mw#FOLogic ⊓
     ∀ mw#inTheory. mw#FOTheory ⊓
     ∀ mw#problemClass. mw#FOProblemClass,
 ≥ 1 mw#formalDescription ⊑ mw#FOProvingProblem,
 ⊤ ⊑ ∀ mw#formalDescription. xsd#string,
 ≥ 1 mw#language ⊑ mw#FOProvingProblem,
 ⊤ ⊑ ∀ mw#language. mw#Syntax,
 ≥ 1 mw#time ⊑ mw#FOProvingProblem,
 ⊤ ⊑ ∀ mw#time. mw#TimeResource }

All ATP services in MathServe accept at least TPTP as an input syntax. This is why we also define the classes of TPTP proving problems in FOF and CNF format:

{mw#TptpProblem = mw#FOProvingProblem
 mw#TptpFOFProblem = mw#TptpProblem ⊓ mw#language: mw#TPTP ⊓ mw#format: mw#FOF,
 mw#TptpCNFProblem = mw#TptpProblem ⊓ mw#language: mw#TPTP ⊓ mw#format: mw#CNF }

4.4.2 An Ontology of ATP Statuses

The output from current ATP systems varies widely in quantity, quality, and meaning. At the low end of the scale, systems that search for a refutation of a set of clauses may output only an assurance that a refutation exists. At the high end of the scale, a system such as AProS (see footnote 8) may output a natural deduction proof [Gentzen, 1935, Prawitz, 1965] of a problem expressed in FOF. In some cases the output is misleading, e.g., when a CNF based system claims that a FOF input problem is “unsatisfiable” it means that the (negated) CNF of the problem is unsatisfiable, i.e. the problem is proved.

We developed an ontology which provides a precise and reasonably fine grained set of status values that can be used to specify the logical status of an ATP problem, as it may be established by an ATP system. The ontology is based on initial work by Armando, Kohlhase, and Ranise [Armando et al., 2000] on establishing communication protocols for systems on the MathWeb Software Bus (see Section 2.3.3). A first version of our ontology was presented in [Sutcliffe et al., 2003]. Since then it has been further refined and extended and now contains 28 statuses. However, not all of these statuses can be determined by reasoning systems.

We defined a new sub-ontology of the MathServe ontology that contains OWL classes and individuals for all the statuses in our ontology. By describing ATP problem statuses in OWL they can be used in the preconditions and effects of OWL-S processes and profiles to express that a service tries to establish a certain status for a given problem or expects an input problem to have a certain status. The classes and properties of the sub-ontology are the following:

{stat#SystemStatus ⊑ owl#Thing,
 stat#FoAtpStatus = stat#SystemStatus ⊓ ∀ stat#specialises. stat#FoAtpStatus,
 stat#FoSolvedStatus ⊑ stat#FoAtpStatus,
 stat#FoDeductiveStatus ⊑ stat#FoSolvedStatus,
 stat#FoPreservingStatus ⊑ stat#FoSolvedStatus,
 stat#FoUnSolvedStatus ⊑ stat#FoAtpStatus,
 ≥ 1 stat#specialises ⊑ stat#SystemStatus,
 ⊤ ⊑ ∀ stat#specialises. stat#SystemStatus,
 ≥ 1 stat#description ⊑ stat#SystemStatus,
 ⊤ ⊑ ∀ stat#description. xsd:string,
 ≥ 1 stat#code ⊑ stat#SystemStatus,
 ⊤ ⊑ ∀ stat#code. xsd:string,
 stat = http://www.mathweb.org/owl/status.owl,
 xsd = http://www.w3.org/2001/XMLSchema }

8 See http://www.phil.cmu.edu/projects/apros/index.php?page=overview.

The actual problem statuses are individuals of one of the subclasses of FoAtpStatus. Figure 4.4 contains the individuals of the class FoDeductiveStatus and the specialises property that relates statuses with their super-status. The complete status ontology is presented in Appendix B.

[Figure 4.4, graph not reproduced. Status nodes: CounterSatisfiable, NoConsequence, Satisfiable, Theorem, ContradictoryAxioms, CounterTheorem, Equivalent, TautologousConclusion, CounterEquivalent, UnsatisfiableConclusion, Tautology, Unsatisfiable. Legend entries: specialises, conjunction.]

Figure 4.4: Hierarchy of deductive statuses for first-order ATP problems

The description property of each status contains a natural language definition of the status. Our definitions expect a problem F (for which a status is determined) to be of the form Ax ⇒ C, where Ax is a set (conjunction) of axioms and C is a single conjecture formula. The status value indicates the relationship between Ax and C. By showing that F is valid, an ATP system shows that C is a theorem (a logical consequence) of Ax, i.e. Ax |= C, where |= is the standard classical first-order entailment. For instance, the OWL-DL definition of the status (OWL individual) Satisfiable looks as follows:

{ stat#Satisfiable ∈ stat#FoDeductiveStatus,
  stat#Satisfiable.stat#code = "SAT",
  stat#Satisfiable.stat#description = "Some models of Ax (and there are some) are models of C." },

while CounterTheorem is defined as

{ stat#CounterTheorem ∈ stat#FoDeductiveStatus,
  stat#CounterTheorem.stat#code = "CTH",
  stat#CounterTheorem.stat#description = "Every model of Ax (and there are some) is a model of ¬C." }

If F is not of the form Ax ⇒ C, it is treated as a single monolithic conjecture formula. This is equivalent to Ax being ⊤. In this case not all of the statuses are appropriate. For instance, systems that report Theorem for a monolithic formula F must have established Tautology. A set of axioms is treated as a conjecture formed from the conjunction of the formulae. This also allows CNF problems to be encoded as a conjunction of axioms (the clauses) that are disjunctions.
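The natural language definitions above can be made concrete with a small brute-force check. The following is only a sketch under strong assumptions: it works on propositional formulae instead of first-order ones, formulae are represented as Python callables over truth assignments, and all function names are hypothetical and not part of MathServe.

```python
from itertools import product


def models(variables, formula):
    """Yield every truth assignment over `variables` that satisfies `formula`."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment


def is_theorem(variables, ax, c):
    # Ax |= C: every model of Ax is a model of C
    # (vacuously true if Ax has no models).
    return all(c(m) for m in models(variables, ax))


def is_satisfiable(variables, ax, c):
    # "Some models of Ax (and there are some) are models of C."
    return any(c(m) for m in models(variables, ax))


def is_counter_theorem(variables, ax, c):
    # "Every model of Ax (and there are some) is a model of ¬C."
    ax_models = list(models(variables, ax))
    return bool(ax_models) and all(not c(m) for m in ax_models)


# Example: Ax = p ∧ (p ⇒ q), C = q.
VARIABLES = ["p", "q"]
AX = lambda m: m["p"] and ((not m["p"]) or m["q"])
print(is_theorem(VARIABLES, AX, lambda m: m["q"]))
print(is_counter_theorem(VARIABLES, AX, lambda m: not m["q"]))
```

Enumerating all assignments is only feasible for tiny propositional examples; it is meant to illustrate how the statuses relate the models of Ax to the models of C, not how an ATP system establishes them.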

The class FoPreservingStatus represents statuses in which there is not a direct correspondence (but some mapping) between the models of Ax and the models of C. These statuses are of minor importance for our work but are necessary for the completeness of the ontology. They are included in the complete presentation of the status ontology in Appendix B.

If an ATP system is not able to determine a deductive or a preserving status for a given problem, it should return an individual of the class FoUnSolvedStatus. Figure 4.5 shows all the statuses (individuals) in this class.

[Figure 4.5, graph not reproduced. Status nodes: Unknown, InputError, GaveUp, ResourceOut, Assumed, Timeout, MemoryOut. Legend entry: specialises.]

Figure 4.5: Hierarchy of unsolved statuses for first-order ATP systems

The status InputError signals that the system encountered a problem with one of the inputs. Timeout and MemoryOut are returned if the system exhausted a given time or memory resource, respectively. GaveUp indicates that the system gave up the attempt to solve a problem although it did not run out of resources. The status Assumed expresses that the system only assumes that the given problem has a certain status. If a system reports Assumed, it should also name the assumed FoSolvedStatus and provide a natural language description of the reason for the assumption made.
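The specialises property can be pictured as a child-to-parent map. The sketch below is illustrative only: the concrete edges are an assumption read off the layout of Figure 4.5 (Timeout and MemoryOut below ResourceOut, everything else directly below Unknown), and the helper names are hypothetical.

```python
# Assumed edges of the "specialises" relation for the unsolved statuses
# (child -> super-status). The exact shape is an assumption, not taken
# from the status ontology file itself.
SPECIALISES = {
    "InputError": "Unknown",
    "GaveUp": "Unknown",
    "ResourceOut": "Unknown",
    "Assumed": "Unknown",
    "Timeout": "ResourceOut",
    "MemoryOut": "ResourceOut",
}


def super_statuses(status):
    """Return the chain of super-statuses of `status`, nearest first."""
    chain = []
    while status in SPECIALISES:
        status = SPECIALISES[status]
        chain.append(status)
    return chain


def specialises(status, ancestor):
    """True if `status` equals `ancestor` or (transitively) specialises it."""
    return status == ancestor or ancestor in super_statuses(status)
```

A client that only distinguishes "ran out of resources" from other failures could then test `specialises(reported_status, "ResourceOut")` instead of enumerating Timeout and MemoryOut explicitly.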

4.4.3 Results of First-Order ATP Systems

The undecidability of first-order predicate logic implies that the result of calling an ATP system on a proving problem cannot be determined in advance. The system might find a proof, run out of resources in the proof attempt, or terminate for various other reasons that are captured by the ATP statuses introduced in the previous section. Besides the status, the result of an ATP system should also contain a reference (URI) to the problem the result was produced for, a proof object (if one was generated) and, optionally, additional information, for instance the complete output of the ATP system. Furthermore, saturation-based ATP systems (such as Bliksem, Otter, SPASS or E) can produce clause sets that are saturated with respect to the underlying calculus but do not contain the empty clause (statuses Satisfiable or CounterSatisfiable). In this case some systems return the saturation found. This can be important for certain applications and should also be part of the result of ATP services.

To meet these particular requirements of ATP systems we extended the MathServe ontology by a subclass of ProverResult which represents all possible results of first-order ATP systems:

{ mw#FoAtpResult ⊑ mw#ProverResult,
  ≥ 1 mw#status ⊑ mw#FoAtpResult,       ⊤ ⊑ ∀ mw#status. stat#FoAtpStatus,
  ≥ 1 mw#resultFor ⊑ mw#FoAtpResult,    ⊤ ⊑ ∀ mw#resultFor. mw#ProvingProblem,
  ≥ 1 mw#proof ⊑ mw#FoAtpResult,        ⊤ ⊑ ∀ mw#proof. mw#FormalProof,
  ≥ 1 mw#output ⊑ mw#FoAtpResult,       ⊤ ⊑ ∀ mw#output. xsd:string,
  ≥ 1 mw#saturation ⊑ mw#FoAtpResult,   ⊤ ⊑ ∀ mw#saturation. xsd:string,
  ≥ 1 mw#time ⊑ mw#FoAtpResult,         ⊤ ⊑ ∀ mw#time. mw#TimeResource }

The class FoAtpResult has six properties with the following meaning: “status” refers to the problem status established, “resultFor” links to the problem the result has been produced for, “proof” may contain a formal proof object, “output” contains the complete output of the underlying ATP system as a string, “saturation” may contain a saturation if it has been established for a clause normal form, and, finally, “time” specifies the time resource used by the ATP service to produce a result.
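For readers more familiar with record types than with Description Logic, the six properties can be mirrored in a plain data class. This is a hedged sketch only: the Python field names, the use of strings for the status and the proof, the millisecond unit and the example URI are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FoAtpResult:
    """Illustrative stand-in for an individual of class mw#FoAtpResult."""
    status: str                        # name of a stat#FoAtpStatus individual
    result_for: str                    # URI of the proving problem
    time_ms: int                       # time resource used (unit assumed)
    proof: Optional[str] = None        # formal proof object, if one was produced
    output: Optional[str] = None       # complete output of the underlying system
    saturation: Optional[str] = None   # saturated clause set, if one was found

    def has_proof(self) -> bool:
        return self.proof is not None


# Hypothetical example: a proof attempt that hit the time limit.
timed_out = FoAtpResult(
    status="Timeout",
    result_for="http://example.org/problems/p1",  # made-up URI
    time_ms=300000,
)
```

Making proof, output and saturation optional reflects the wording above: only status, resultFor and time are always meaningful, while the other three "may contain" a value.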

4.4.4 Specialist Problem Classes and System Performance

The performance of fully-automated ATP systems depends on many factors. The computational resources, time and memory, given to the prover are two of these factors. However, the search space of resolution-based ATP systems typically grows super-exponentially and the search for a refutation quickly becomes intractable. Sutcliffe and Suttner have found that every system has a so-called Peter Principle Point (PPP), a point beyond which a linear increase of the computational resources does not lead to the solution of significantly more problems [Sutcliffe and Suttner, 2001].

Another factor that influences the performance of ATP systems is the type of the problem to solve. This is why modern ATP systems have their own specific classification scheme for incoming problems. For every proof attempt they choose a suitable search strategy according to their classification of the problem. A prover's particular set of problem features for the internal classification is typically known to the system developers only. Furthermore, these features are not useful for an objective comparison of provers because the features of one ATP system can have no meaning at all for another ATP system.

Sutcliffe and Suttner identified “objective problem features” with the aim of comparing the performance of ATP systems [Sutcliffe and Suttner, 2001]. They distinguish six general features of problems in classical first-order logic that have an impact on the performance of ATP systems:

Theoremhood: The problem is a theorem (THM) or a non-theorem (SAT for satisfiable or CSA for countersatisfiable first-order problems).

Form: The problem is in clause normal form (CNF) or first-order form (FOF).

Clause Form: The clauses are all Horn9 (HRN) or some are non-Horn (NHN).

Order: The problem is effectively propositional (EPR) or real first-order (RFO).

9 A Horn clause is a clause with at most one positive literal.

Equality: The problem contains no equality (NEQ), some equality (SEQ), or it is pure equality (PEQ).

Unit Equality: All clauses are unit equalities (UEQ) or there are some non-unit equality clauses (NUE).
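Three of the features listed above (Clause Form, Equality and Unit Equality) can be read directly off a clause set. The following sketch shows one way to do so under stated assumptions: clauses are lists of (positive, predicate) pairs, equality literals use the predicate symbol "=", and a unit equality is taken to be a clause consisting of exactly one equality literal. It is not the implementation of the TptpAnalyser service.

```python
EQ = "="  # assumed predicate symbol for equality literals


def clause_form(clauses):
    """HRN if every clause has at most one positive literal, else NHN."""
    horn = all(
        sum(1 for positive, _ in clause if positive) <= 1
        for clause in clauses
    )
    return "HRN" if horn else "NHN"


def equality(clauses):
    """NEQ: no equality literal, PEQ: only equality literals, SEQ: mixed."""
    literals = [lit for clause in clauses for lit in clause]
    eq_count = sum(1 for _, predicate in literals if predicate == EQ)
    if eq_count == 0:
        return "NEQ"
    return "PEQ" if eq_count == len(literals) else "SEQ"


def unit_equality(clauses):
    """UEQ if all clauses are single equality literals, else NUE."""
    all_unit_eq = all(
        len(clause) == 1 and clause[0][1] == EQ for clause in clauses
    )
    return "UEQ" if all_unit_eq else "NUE"


# Example clause set: { p ∨ q,  ¬p ∨ (s = t) }
example = [[(True, "p"), (True, "q")], [(False, "p"), (True, EQ)]]
print(clause_form(example), equality(example), unit_equality(example))
```

Each function makes a single pass over the literals, so the analysis stays linear in the size of the clause set.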

All of these features, except the first, can be determined by a syntactic analysis of a problem. The service TptpAnalyser, which is described in Section 4.5, performs this analysis for problems in TPTP format. For problems whose logical status (THM, SAT, or CSA) is not known, the abbreviations NKS and NKC have been introduced. NKS stands for clause normal form problems that are not known to be satisfiable and NKC for first-order form problems that are not known to be countersatisfiable. For first-order problems without a conjecture formula the abbreviation NKS NUN indicates that the problem is Not Known to be Satisfiable and Not Unknown, i.e. one wants to show that the formulae in the problem are satisfiable.

The meaningful combinations10 of the above features define 21 problem classes that are called Specialist Problem Classes (SPCs). The abbreviations used for the problem features are used to form the name of an SPC. For instance, the SPC FOF NKC EPR stands for all conjectures in first-order form that are potential theorems (not known to be countersatisfiable) and are effectively propositional (and therefore decidable due to a finite Herbrand universe). Of course, the logical status (see Section 4.4.2), such as theoremhood or unsatisfiability, is typically not known in advance.

So far we have integrated seven ATP systems as Web Services in MathServe (see next section) and measured the performance of these systems on the TPTP Library.11 The interested reader can find the performance data for all the ATP systems integrated in MathServe in Appendix C. Table 4.1 compares the performance of the ATP systems E 0.82 and SPASS 2.1 on the 21 SPCs of the TPTP Library with a fixed time limit of 300 seconds per problem. The 7260 problems of the TPTP Library are spread over 19 SPCs (two SPCs are not populated yet). The second column of Table 4.1 contains the number of problems in each SPC.
The third and fourth columns show the percentage of problems solved by E and the average CPU time used for the successful proving attempts. The fifth and sixth columns display the same data for the SPASS system. For some SPCs the provers show comparable strength. For instance, in the SPC FOF SAT EPR SPASS and E solved the same number of problems, but the average CPU time of SPASS is a third of the time of E. However, in the case of FOF NKS NUN EPR, SPASS could solve all problems whereas E solved only 26% of them. A visual comparison of the performance of E and SPASS is presented in Figure 4.6. Every star in the scatter graph represents one SPC12. It can be used for a rough comparison of the two ATP systems. For example, the stars above the diagonal line in Figure 4.6 show that, for 10 SPCs, E solved more problems than SPASS. Similarly, the average CPU times used by the two ATP systems are compared in Figure 4.7. For example, the diamonds below the diagonal show that, for 10 SPCs, SPASS used more time than E to solve problems.

10 HRN, for instance, does not apply for problems in first-order form (FOF).
11 Measurements were done on a Linux machine with Intel Xeon 2.80GHz CPU and 2GB memory.
12 Due to readability issues, it was not possible to produce a more detailed figure containing the names of the SPCs.

                                   E              SPASS
Specialist               No.  Solved     CPU  Solved     CPU
Problem Class          Probs     (%)    (ms)     (%)    (ms)
FOF CSA EPR              319      91   10341      86    4391
FOF CSA RFO               18      67     391      67    1740
FOF SAT EPR                3      34     100      34      30
FOF SAT RFO               17      41     143      29     720
FOF NKC EPR              395      80   12160      98    5227
FOF NKC RFO EQU          915      35   12795      53   30638
FOF NKC RFO NEQ           28     100     114     100       3
FOF NKS NUN EPR           50      26   15561     100   13072
FOF NKS NUN RFO NEQ        0      50       0      50       0
FOF NKS NUN RFO EQU        0      50       0      50       0
CNF NKS EPR              476      96    9656      99    5836
CNF SAT EPR              220      64   17425      66   27333
CNF SAT RFO NEQ          275      52    5186      45     437
CNF NKS RFO NEQ NHN      540      68    7979      54    9335
CNF SAT RFO EQU NUE      224      57     340      55    1536
CNF SAT RFO PEQ UEQ       54      13     100      11       2
CNF NKS RFO NEQ HRN      461      13    6951      66   16768
CNF NKS RFO SEQ HRN      390      87    7248      49    6265
CNF NKS RFO SEQ NHN     1816      51    4676      34   14960
CNF NKS RFO PEQ NUE      337      87    3240      75    4635
CNF NKS RFO PEQ UEQ      722      85    3412      71    8398

Table 4.1: Performance of the ATP systems E 0.82 and SPASS 2.1 on the TPTP Library 3.0.1 with a 300sec time limit. Columns four and six contain average CPU times for solved problems in milliseconds

[Figure 4.6, scatter plot not reproduced. x-axis: % solved by SPASS (0 to 100); y-axis: % solved by E (0 to 100).]

Figure 4.6: Scatter graph comparing the percentage of problems solved by E and SPASS in 21 SPCs. Every star represents one SPC

4.4.5 ATP Services in OWL-S

MathServe currently offers the ATP systems DCTP 10.21p [Letz and Stenz, 2001a], E 0.91 [Schulz, 2001], Otter 3.3 [McCune, 1994b], Paradox 1.3 [Claessen and Sörensson, 2003], SPASS 2.2 [Weidenbach et al., 1999], Vampire 8.0 [Riazanov and Voronkov, 2002] and Waldmeister 704 [Löchner and Hillenbrand, 2002] as Semantic Web Services described in OWL-S. Brief descriptions of these systems can be found in Appendix C. All ATP services work on classical first-order logic with equality and present the same Web Service interface. They are invoked with a theorem proving problem, a time limit, and (optionally) prover-specific options. ATP services return a first-order ATP result (mw#FoAtpResult) as introduced in Section 4.4.3.

The OWL-S service profiles of ATP services were annotated with the performance data described in the previous section. Performance data is stored in the service parameters as conditional probabilistic effects. The data is, for instance, used by the MathServe broker to choose between different provers as shown in Chapter 7.

As an example of a first-order theorem proving service, we present the OWL-S profile of the service provided by the system EP (or simply E)13. E [Schulz, 2001] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e. resolution is simulated via paramodulation and equality resolution. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. Supported term orderings are several parameterised instances of Knuth-Bendix-Ordering and Lexicographic Path Ordering. Another feature of E is the maximally shared term representation. This includes parallel rewriting for all instances of a particular subterm.

13 While EP creates refutation proof objects, E does not.

[Figure 4.7, scatter plot not reproduced. x-axis: Avg. time used by SPASS (0 to 4 x 10^4); y-axis: Avg. time used by E (0 to 4 x 10^4).]

Figure 4.7: Scatter graph comparing the time used by E and SPASS on the problems of 21 SPCs. Every diamond represents one SPC

E is currently one of the strongest general purpose ATP systems available. The service profile for E looks as follows:

profile mw#EpATP:
  inputs:   tptp problem :: mw#TptpProblem
            time res :: mw#TimeResource
  outputs:  atp result :: mw#FoAtpResult
  preconds:
  effects:  resultFor(atp result, tptp problem) ∧
            (status(atp result, stat#Theorem) ∨
             status(atp result, stat#Unsatisfiable) ∨
             status(atp result, stat#Satisfiable) ∨
             status(atp result, stat#CounterSatisfiable) ∨
             status(atp result, stat#Unknown) ∨
             status(atp result, stat#Timeout))
  categs:   mw#FO-ATP
  params:   problemClass(tptp problem, mw#FOF NKC EPR) →
              status(atp result, stat#Theorem) (0.80) (12160ms)
            . . .
            problemClass(tptp problem, mw#CNF NKS RFO PEQ UEQ) →
              status(atp result, stat#Unsatisfiable) (0.85) (3412ms)
  stat = http://www.mathweb.org/owl/status.owl
  mw = http://www.mathweb.org/owl/mathserve.owl

The service expects a problem in TPTP format and a time resource as inputs. The time resource restricts the CPU and the wall-clock time for a proving attempt. The service returns an OWL-DL individual of class FoAtpResult defined in Section 4.4.3. The performance of E on the TPTP Library (shown in Table 4.1) is represented as conditional probabilistic effects in the parameters of the service profile (also see Section 3.4). The parameter

problemClass(tptp problem, mw#FOF NKC EPR) → status(atp result, stat#Theorem) (0.80) (12160ms),

for instance, is interpreted as follows: if the input problem belongs to the SPC FOF NKC EPR then the result of a service invocation will have the status Theorem with a probability of 80%. The average runtime of a service invocation resulting in the status Theorem is 12160 milliseconds. It is worth mentioning that, for every SPC, the probabilities for all possible problem statuses have to add up to one.

We would like to point out that modelling performance data as probabilistic effects is a strong generalisation. Of course, the success of a theorem proving attempt also depends on the problem at hand and not just on general syntactic features. However, with a sufficiently large number of problems in the different SPCs, the relative frequency of successful proving attempts can be regarded as a probability.

Typically, the status of a problem given to a theorem prover is not known in advance. Therefore, the performance of a prover on problems with known statuses (SPCs containing SAT and CSA) is interpreted as the performance on the corresponding SPC with unknown status. For example, E could solve 64% of the satisfiable problems in the SPC CNF SAT EPR with an average time of 17425msecs. This information is used to predict the performance of E on satisfiable problems in CNF NKS EPR. Therefore, the profile of E contains the probabilistic effect

problemClass(tptp problem, mw#CNF NKS EPR) → status(atp result, stat#Satisfiable) (0.64) (17425ms).

However, the SPC CNF NKS EPR also contains unsatisfiable problems and therefore, the service profile contains an entry for the likelihood that E can determine unsatisfiability:

problemClass(tptp problem, mw#CNF NKS EPR) → status(atp result, stat#Unsatisfiable) (0.96) (9656ms).

The probability of success of E, and all other ATP systems, on certain SPCs depends on what the system is supposed to achieve. This becomes important in Chapter 6 where we describe how MathServe uses probabilistic effects to choose the most promising ATP system depending on the SPC of a problem.
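To illustrate how such probabilistic effects could drive a choice between provers, the sketch below looks up the effects of E and SPASS for a given SPC and desired status and picks the prover with the highest success probability, breaking ties by the lower average time. The numbers are taken from Table 4.1 (with the SAT classes reused for the corresponding NKS class as described above); the data layout, the function name and the selection rule are assumptions and do not describe the actual MathServe broker.

```python
# (SPC, desired status) -> {prover: (success probability, avg. CPU time in ms)}
# Values copied from Table 4.1; the dictionary layout is illustrative only.
EFFECTS = {
    ("FOF NKC EPR", "Theorem"): {
        "E": (0.80, 12160), "SPASS": (0.98, 5227)},
    ("CNF NKS EPR", "Unsatisfiable"): {
        "E": (0.96, 9656), "SPASS": (0.99, 5836)},
    ("CNF NKS EPR", "Satisfiable"): {
        "E": (0.64, 17425), "SPASS": (0.66, 27333)},
    ("CNF NKS RFO PEQ UEQ", "Unsatisfiable"): {
        "E": (0.85, 3412), "SPASS": (0.71, 8398)},
}


def most_promising(spc, goal_status):
    """Pick the prover with the best chance of establishing `goal_status`.

    Higher probability wins; ties are broken by lower average time.
    """
    candidates = EFFECTS[(spc, goal_status)]
    return max(
        candidates,
        key=lambda prover: (candidates[prover][0], -candidates[prover][1]),
    )


print(most_promising("CNF NKS RFO PEQ UEQ", "Unsatisfiable"))
print(most_promising("FOF NKC EPR", "Theorem"))
```

Keying the table by both the SPC and the desired status mirrors the observation that the probability of success depends on what the system is supposed to achieve.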

The Otter System

Otter [McCune, 1994b] is a fourth-generation Argonne National Laboratory deduction system whose ancestors (dating from the early 1960s) include the TP series, NIUTP, AURA, and ITP. It is designed to prove theorems stated in first-order logic with equality. Otter’s inference rules are based on resolution and paramodulation, and it includes facilities for term rewriting, term orderings, Knuth-Bendix completion, weighting, and strategies for directing and restricting searches for proofs. Otter can also be used as a symbolic calculator and has an embedded equational programming system. Although Otter could prove interesting theorems in the past, it is no longer one of the strongest provers available. However, one advantage of the Otter system is that its default calculus is restricted to a set of standard inference rules (binary resolution, factoring, paramodulation and demodulation), which makes it easy to process Otter refutation proofs with other tools (see Section 4.8.2).14

To emphasise the special role of the restricted Otter calculus we extended our domain ontology by one individual and two classes. The new individual BrFP of type ResolutionCalculus represents the restricted Otter calculus15. The class TptpCnfBrFPRefutation stands for TPTP refutation proofs in that calculus. The class FoBrFPResult represents all results of ATP systems that only contain proofs in the restricted calculus:

{ mw#BrFP ∈ mw#ResolutionCalculus,
  mw#TptpCnfBrFPRefutation = mw#TptpCnfRefutation ⊓ mw#calculus: mw#BrFP,
  mw#FoBrFPResult = mw#FoAtpResult ⊓ ∀ mw#proof. mw#TptpCnfBrFPRefutation }

With these new ontology classes we can define the profile of the service OtterATP as follows:

14 Since 1997 Otter also contains an experimental implementation of splitting which has to be activated manually [McCune, 1994b].
15 BrFP is an abbreviation for the combination of Binary resolution, Factoring and Paramodulation.

profile mw#OtterATP:
  inputs:   tptp problem :: mw#TptpProblem
            time res :: mw#TimeResource
  outputs:  atp result :: mw#FoBrFPResult
  preconds:
  effects:  resultFor(atp result, tptp problem) ∧
            (status(atp result, stat#Theorem) ∨
             status(atp result, stat#Unsatisfiable) ∨
             status(atp result, stat#Unknown) ∨
             status(atp result, stat#Timeout))
  categs:   mw#FO-ATP
  params:   problemClass(tptp problem, mw#FOF NKC EPR) →
              status(atp result, stat#Theorem) (0.55) (8009ms)
            . . .
            problemClass(tptp problem, mw#CNF NKS RFO PEQ UEQ) →
              status(atp result, stat#Unsatisfiable) (0.73) (2384ms)
  stat = http://www.mathweb.org/owl/status.owl
  mw = http://www.mathweb.org/owl/mathserve.owl

4.5 A Service for TPTP Problem Analysis

In Section 4.4.4 we mentioned that the performance of ATP systems depends on the Specialist Problem Class of the problem. To be able to determine the SPC of a proving problem we integrated a TPTP problem analyser as a semantic web service into MathServe. The service expects a TPTP problem as an input and returns one of the 13 SPCs of problems that are not known to be satisfiable (NKS) or not known to be countersatisfiable (NKC):

profile mw#TptpAnalyser:
  inputs:   tptp problem :: mw#TptpProblem
  outputs:  problem class :: mw#ProvingProblemClass
  preconds:
  effects:  (problemClass(tptp problem, stat#FOF NKC EPR) ∨
             problemClass(tptp problem, stat#FOF NKS NUN RFO NEQ) ∨
             . . .
             problemClass(tptp problem, stat#CNF NKS RFO PEQ UEQ))
  categs:
  params:
  stat = http://www.mathweb.org/owl/status.owl
  mw = http://www.mathweb.org/owl/mathserve.owl

The time needed for an analysis is linear in the length of the given problem. The service always succeeds for non-empty, syntactically correct TPTP problems.

4.6 Finite Model Generation Services

MathServe offers the services of the finite model generators MACE [McCune, 2003] and SEM [Zhang and Zhang, 1995] as semantic reasoning services. Both services accept problems in TPTP format. A uniform format for finite first-order models has not been developed yet. Therefore, the results of model generation services contain the output of the underlying system.

{ mw#ModGenResult ⊑ mw#Result,
  ≥ 1 mw#status ⊑ mw#ModGenResult,      ⊤ ⊑ ∀ mw#status. mw#ModGenStatus,
  ≥ 1 mw#resultFor ⊑ mw#ModGenResult,   ⊤ ⊑ ∀ mw#resultFor. mw#TptpProblem,
  ≥ 1 mw#output ⊑ mw#ModGenResult,      ⊤ ⊑ ∀ mw#output. xsd:string,
  ≥ 1 mw#time ⊑ mw#ModGenResult,        ⊤ ⊑ ∀ mw#time. mw#TimeResource }

profile MaceMG:
  inputs:   tptp problem :: mw#TptpProblem
            time res :: mw#TimeResource
  outputs:  result :: mw#ModGenResult
  preconds:
  effects:  resultFor(result, tptp problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserve.owl

4.7 Decision Procedure Services

In Section 2.2.3 we described how a combination of decision procedures can be used to determine the satisfiability of formulae in many interesting decidable fragments of first-order logic. With more and more reasoning systems based on decision procedures available, it also became necessary to develop standardised benchmarks for such systems. The SMT Library is a library of approximately 1500 SMT benchmark problems. Similar to the TPTP Library for first-order theorem proving, it also defines a standardised syntax to encode SMT problems. The performance of reasoning systems based on combined decision procedures is compared at the annual SMT system competitions.

All problems in the SMT Library are annotated with the theory they are defined in. The SMT Library defines three combined theories containing quantified formulae with free function and/or predicate symbols, namely the theories of
• formulae on arrays of integers and linear integer arithmetic (AUFLIA),
• formulae on arrays of arrays of integer index and real values containing only linear atoms (AUFLIRA), and
• possibly non-linear formulae on arrays of arrays of integer index and real values (AUFNIRA).

Furthermore, the library defines seven combined theories containing only quantifier-free (QF) formulae, namely the theories of:
• formulae with uninterpreted sort, function and predicate symbols (QF UF),
• real difference logic (QF RDL),
• integer difference logic (QF IDL),
• linear real arithmetic (QF LRA),
• linear integer arithmetic (QF LIA),
• integer difference logic and uninterpreted functions (QF UFIDL), and
• linear integer arithmetic, arrays, and uninterpreted functions (QF UFLIA).

The theories of the SMT Library are reflected in the MathServe domain ontology with the help of new classes and properties. The Description Logic class SmtProblem of SMT problems is very similar to the class of first-order theorem proving problems, with the restriction that the problem must be defined in one of the decidable theories of the SMT Library:

{ mw#SmtProblem ⊑ mw#DecisionProblem ⊓ mw#language: mw#SMT ⊓ ∀ mw#theory. mw#DecidableTheory }

Results of decision procedures are much simpler than the results of ATP systems. A result contains the name of the problem the result was produced for and the name of the underlying system which produced the result. Furthermore, it contains a problem status (DecProcStatus), which is one of Unsatisfiable, Satisfiable, Timeout, or MemoryOut as defined in Section 4.4.2.

{ mw#DecProcResult ⊑ mw#Result,
  ≥ 1 mw#status ⊑ mw#DecProcResult,      ⊤ ⊑ ∀ mw#status. mw#DecProcStatus,
  ≥ 1 mw#resultFor ⊑ mw#DecProcResult,   ⊤ ⊑ ∀ mw#resultFor. mw#SmtProblem,
  ≥ 1 mw#time ⊑ mw#DecProcResult,        ⊤ ⊑ ∀ mw#time. mw#TimeResource }

MathServe integrates the decision procedure systems MathSAT 3.3.1 [Bozzano et al., 2005] and Yices 1.0 [Dutertre and de Moura, 2006] as Semantic Web Services. Both systems are sound and performed well in the 2005 SMT competition. Both systems accept problems in the SMT syntax.

We measured the systems’ performance on the quantifier-free problems of the SMT Library (version of 24th July 2006).16 For every theory we determined the sets of satisfiable and unsatisfiable problems in the SMT Library. For each set we counted the number of problems solved by the systems within 600 seconds. Table 4.2 compares the performance of MathSAT and Yices on the satisfiable quantifier-free problems. Yices can generally solve more satisfiable problems than MathSAT except for problems in the integer difference logic (QF IDL).

Table 4.3 compares MathSAT and Yices on the unsatisfiable quantifier-free problems of the SMT Library. Again, MathSAT performs better on problems in the integer difference logic. But MathSAT could also solve as many problems as Yices in the real difference logic (QF RDL), and MathSAT could solve these problems in less than half of the CPU time.

16 The theories AUFLIA, AUFLIRA and AUFNIRA contain quantified formulae and are not supported by MathSAT or Yices.

                    MathSAT          Yices
SMT          No.  Solved     CPU  Solved     CPU
Theory     Probs     (%)    (ms)     (%)    (ms)
QF IDL       548      62   29043      59   35692
QF LRA       297      72   71366      91   73031
QF UFIDL     293      95   15888     100    2411
QF UF        135      92    9855     100    1717
QF LIA       179      74   10642      93   10515
QF RDL       113      55   34916      77   62270
QF UFLIA      27       4   13233     100   15508

Table 4.2: Performance of the decision procedures MathSAT 3.3.1 and Yices 1.0 on satisfiable quantifier-free SMT Library problems of seven theories (600sec time limit)

                    MathSAT          Yices
SMT          No.  Solved     CPU  Solved     CPU
Theory     Probs     (%)    (ms)     (%)    (ms)
QF IDL       548      81   22903      77   20881
QF LRA       204      81   30773      83   23670
QF UFIDL     106      60   78115      75   52250
QF UF         12      67   77557      71   51907
QF LIA        58      78   22951      86   12345
QF RDL        56      66   19252      66   49312
QF UFLIA      83      19   25260      59   20875

Table 4.3: Performance of the decision procedures MathSAT 3.3.1 and Yices 1.0 on unsatisfiable quantifier-free SMT Library problems of seven theories (600sec time limit)

The service profiles of the Semantic Web reasoning services for MathSAT and Yices have been annotated with the data in Tables 4.2 and 4.3. For example, the service profile of the service provided by MathSAT is the following:

profile MathSAT:
  inputs:   smt problem :: mw#SmtProblem
            time res :: mw#TimeResource
  outputs:  dc result :: mw#DecProcResult
  preconds:
  effects:  resultFor(dc result, smt problem) ∧
            (status(dc result, stat#Satisfiable) ∨
             status(dc result, stat#Unsatisfiable) ∨
             status(dc result, stat#Timeout) ∨
             status(dc result, stat#MemoryOut))
  categs:
  params:   theory(smt problem, mw#QF UF) →
              status(dc result, stat#Satisfiable) (0.92) (9855ms)
            . . .
            theory(smt problem, mw#QF RDL) →
              status(dc result, stat#Unsatisfiable) (0.66) (19252ms)
  mw = http://www.mathweb.org/owl/mathserve.owl

The authors of the service profiles have to ensure that, for every decidable theory, the probabilities for the possible problem statuses add up to one. However, the probabilities for two different theories are not related.

4.8 Proof Transformation Services

Proof objects generated by interactive or automated theorem proving systems are formalised in the system’s own logic and calculus. Typically, it is very hard for humans to read and understand these proofs. This is due to the often idiosyncratic syntax the proofs are expressed in, as well as the low level of the calculi used. The inference steps used in machine-generated proofs are much more fine-grained than inferences performed, for instance, by mathematicians. Furthermore, the development of independent proof checkers is hindered by the large number of different logical formalisms, languages and calculi implemented in theorem proving systems. Proof transformation systems can help to overcome the difficulties that humans and machines face when interacting with theorem proving systems: by transforming proofs into a more human-readable calculus, a graphical representation, or even natural language, they become more accessible for humans. A translation of proofs into a more widely used calculus eases the use of these proofs by other systems and the development of independent proof checking tools.

However, research on proof transformation is still at an early stage and developers of theorem proving systems are typically interested in improving the problem solving capabilities of their systems rather than improving the output produced by their systems (cf. Section 2.2). We integrated two prototypical implementations of proof transformation tools into MathServe: the Otterfier tool can translate refutation proofs in different resolution calculi into proofs in a restricted Otter calculus. TRAMP translates resolution proofs into proofs in an abstract Natural Deduction calculus. In what follows, we describe both systems and the services they offer in MathServe.

4.8.1 The Otterfier Service – Transforming CNF Derivations

The derivations (typically refutations) produced by contemporary CNF based ATP systems are built from inference steps, which have one or more parent clauses and one resultant inferred clause. The inference rules that create the steps vary depending on the ATP system, ranging from simple binary resolution through to complex rules such as superposition [Bachmair and Ganzinger, 1994]. In almost all cases the inferred clauses are logical consequences of their parent clauses, the most common exception being clauses resulting from the various forms of splitting that have been implemented in ATP systems such as Vampire, E, and SPASS [Riazanov and Voronkov, 2001]. While a wider range and complexity of inference rules typically improves the performance of ATP systems, it is impractical to require proof post-processing tools to be able to process inference steps created by all the various rules (and new ones that may be invented in the future). It is therefore desirable to transform derivations so that each inference step uses one of a limited selection of inference rules.

The Otterfier system is a transformation tool that transforms a source derivation containing source inference steps of logical consequence, to a derivation whose inference steps use only inference rules available in Otter. Thus, the profile for the service offered by Otterfier is straightforward:

profile mw#Otterfier:
  inputs:   old result :: mw#FoAtpResult
  outputs:  new result :: mw#FoBrFPResult
  preconds: resultFor(old result, proving problem)
  effects:  resultFor(new result, proving problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserve.owl

The preconditions of the profile expect the given ATP system result to be the result for some proving problem and to contain a proof. The effects state that the produced system result contains the new resolution proof and is also a valid result for the original proving problem.

The transformation performed by Otterfier is independent of the inference rules used in the source inference steps, relying only on the inferred clauses being logical consequences of their parent clauses. Otterfier uses a modified version of Otter to do this transformation. The standard Otter system includes the hints strategy. Hints are normally used as a heuristic for guiding the search, in particular in selecting from the given clauses and in deciding whether to keep derived clauses. The fast version of the hints strategy, called hints2, allows the user to specify a set of clauses against which newly inferred clauses are tested for subsumption. For Otterfier the hints strategy has been modified so that when a newly inferred clause is equal to or subsumes a hint, the search is halted and the derivation of the newly inferred clause is output. This modified strategy is called the target strategy.

The basic mechanism of Otterfier is to place the parents of a source inference step into Otter’s set-of-support list, and the inferred clause into Otter’s hint list. The inferred clause is called the target clause in this context. Otter is then run with a complete selection of inference rules, e.g., binary resolution, factoring, and paramodulation. As Otter derives the logical consequences of the parent clauses, the target strategy checks each logical consequence against the target clause in the hints list. When the target clause is derived or subsumed, the derivation output by Otter provides a transformed version of the source inference step, using only Otter’s inference rules.
A source derivation is transformed by performing this transformation on each source inference step, and the combined transformed steps form a complete transformed derivation. We refer the interested reader to the more detailed description of the Otterfier system in [Zimmer et al., 2004].
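The per-step mechanism can be illustrated with a deliberately small sketch. It is not Otter and not the Otterfier implementation: it assumes propositional clauses (sets of integer literals, with negation as sign change), uses binary resolution as the only inference rule, and approximates subsumption by a subset test. The names `resolvents` and `transform_step` are hypothetical.

```python
from itertools import combinations

def resolvents(c1, c2):
    """Return all non-tautological binary resolvents of two propositional clauses."""
    found = []
    for lit in c1:
        if -lit in c2:
            res = (c1 - {lit}) | (c2 - {-lit})
            if not any(-l in res for l in res):  # drop tautologies
                found.append(frozenset(res))
    return found

def transform_step(parents, target, max_rounds=10):
    """Derive a clause equal to or subsuming `target` from `parents`.

    Returns a list of (premise1, premise2, resolvent) triples ending in the
    clause that hits the target, or None if no such clause is found.
    """
    target = frozenset(target)
    clauses = {frozenset(p) for p in parents}
    origin = {}  # derived clause -> the two premises it was resolved from

    def derivation(clause):
        # Recursively collect the steps needed to derive `clause`.
        if clause not in origin:
            return []  # a parent clause needs no derivation
        p1, p2 = origin[clause]
        return derivation(p1) + derivation(p2) + [(p1, p2, clause)]

    for _ in range(max_rounds):
        new = set()
        for c1, c2 in combinations(sorted(clauses, key=sorted), 2):
            for res in resolvents(c1, c2):
                if res in clauses or res in new:
                    continue
                origin[res] = (c1, c2)
                if res <= target:  # propositional subsumption: subset test
                    return derivation(res)  # halt, as the target strategy does
                new.add(res)
        if not new:
            return None  # saturated without reaching the target
        clauses |= new
    return None

# A source step inferring {3} from the parents {1}, {-1, 2}, {-2, 3}
# is replaced by a chain of binary resolution steps.
steps = transform_step([{1}, {-1, 2}, {-2, 3}], {3})
```

A whole source derivation would be handled by calling such a function once per source inference step and concatenating the results.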

4.8.2 The TRAMP Service – Generating Natural Deduction Proofs

The TRAMP system [Meier, 2000] can transform resolution proofs as produced by certain ATP systems into natural deduction proofs at the assertion level [Huang, 1994]. The assertion level allows for human-oriented macro-steps justified by the application of theorems, lemmas, or definitions, which are collectively called assertions. For instance, the assertion level step

$$\frac{F \subset G \qquad c \in F}{c \in G}\ \mathrm{DEF}{\subset}$$

derives the conclusion c ∈ G by an application of the subset definition DEF⊂¹⁷ from the premises c ∈ F and F ⊂ G. A corresponding base ND proof, including the expansion of the subset definition, consists of a sequence of seven ND steps.

TRAMP consists of transformation procedures that translate the proof output of an ATP system into an internal resolution proof object. Internalised resolution proofs can be translated into an ND proof at the assertion level which can be further processed. In particular, each assertion application can be expanded such that the resulting proof is a pure ND proof without assertion application steps. TRAMP can output its proofs in LaTeX format as well as in the formal languages POST and Twega [Fiedler, 2001b]. The latest version of TRAMP is able to process the output of the ATP systems Bliksem, Otter, Waldmeister, ProTeIn, EQP, and an old version of the SPASS system.

The original TRAMP system has one input: a description of the original conjecture in POST, an extension of Church’s simple type theory [Church, 1940]. TRAMP computes the clause normal form (CNF) of the problem and calls one of the supported ATP systems on the CNF. The result of the ATP system is then converted into TRAMP’s internal format and an ND proof for the original conjecture is created.
The reason for the CNF generation within TRAMP is that, in order to create an ND proof, TRAMP has to know the ∆-Relation, which maps the literals in the CNF clauses to the corresponding literals (term positions) in the original first-order formulae. We extended

17 A formalisation of DEF⊂ could be ∀S1.∀S2.(S1 ⊂ S2 ⇔ ∀x.(x ∈ S1 ⇒ x ∈ S2)).

TRAMP in order to employ it as a stand-alone proof transformation system. A new input module for the TPTP format has been developed. TRAMP now accepts three inputs: the TPTP FOF description of a conjecture, a TPTP resolution proof of the clause normal form of the conjecture and the ∆-Relation computed during the clause normalisation. Of the ATP systems integrated in MathServe, only Otter and Waldmeister are supported by TRAMP (cf. Section 4.4.5). We express this limitation by demanding the ATP system result to be of type FoBrFPResult as defined in Section 4.4.5. The service profile for TRAMP’s proof transformation service reflects both the role of the ∆-Relation and the restriction on the ATP system result containing the resolution proof:

profile mw#TrampNDforFOF:
  inputs:   fof problem :: mw#TptpFOFProblem
            atp result :: mw#FoBrFPResult
            delta relation :: mw#DeltaRelation
  outputs:  nd proof :: mw#TwegaNDProof
  preconds: resultFor(atp result, cnf problem) ∧
            status(atp result, mw#Unsatisfiable) ∧
            cnfFor(cnf problem, fof problem) ∧
            relatesCNF(delta relation, cnf problem) ∧
            toFOF(delta relation, fof problem)
  effects:  proofOf(nd proof, fof problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserve.owl

4.9 Summary

We have shown how the services provided by different types of reasoning systems can be modelled as Semantic Web Reasoning Services. Some service descriptions, such as TrampNDforFOF, contain strong preconditions defining the exact circumstances under which a service can be invoked. Other descriptions, like FlotterCNF, simply define the I/O parameters of a service and the relation between those parameters.

The results of the problem and proof transformation services presented are predictable in the sense that, given a correct problem (proof), they will produce a new problem (proof) after a reasonable time. The results of reasoning services, however, are far less predictable: Within a given time limit theorem provers may not find a proof and decision procedures may not be able to decide whether a given formula is satisfiable or not. The likelihood of successfully solving reasoning problems typically depends on certain problem features: First-order ATP systems perform differently on different SPCs, decision procedures perform differently on different theory combinations. The performance of reasoning systems on large corpora of problems was measured. The resulting performance data was modelled as conditional probabilistic effects which are added to the OWL-S service profiles. We will show in Chapter 6 how the resulting performance data can be used to automatically choose suitable reasoning services for concrete problems.

Chapter 5

The MathServe Framework

MathServe [Zimmer and Autexier, 2006] is a framework for modelling reasoning systems as services in the Semantic Web, and for advertising, retrieving, and composing these services. Reasoning systems in MathServe are accessible as Web Services. The semantics of these services is described using the OWL-S upper ontology as shown in the previous chapter. All service descriptions are based on the MathServe domain ontology described in OWL-DL. Furthermore, MathServe provides a middle agent, the MathServe broker, which performs automated service matchmaking and service composition. Composite services are represented in a human-readable way as Golog procedures.

In this chapter, we describe the central parts of the MathServe framework and discuss some practical aspects and implementation details of the framework. We start with an introduction to OWL-S query profiles and composite services in Section 5.1. Section 5.2 provides a description of the different modules of the MathServe broker. Some implementation details are presented in Section 5.3.

5.1 Queries and Composite Services in MathServe

In service-oriented architectures, such as MathServe, service providers offer services to a community of service requesters. With a growing number of reasoning services available it is hard for a service requester (e.g., a human user or a client application) to find the service(s) needed for solving a particular reasoning problem, in particular in open, large-scale networks such as the Internet. Introducing mediation facilities or middle agents (such as service brokers) can help to overcome this problem. In MathServe, service providers register descriptions of their services with the MathServe broker. Service requesters send queries to the broker which tries to find services or compositions of services which can potentially answer these queries.

5.1.1 OWL-S Query Profiles

Service providers in MathServe register services in OWL-S. However, no official query language for OWL-S service profiles has yet been defined. RDF query languages, such as Triple [Sintek and Decker, 2002], RDQL [Seaborne, 2004] and SPARQL [Prud’hommeaux and Seaborne, 2006], follow ideas of the Structured Query Language (SQL) and work on the low level of RDF triples. Querying for an OWL-S service with an RDF query language requires tediously breaking the description of the service down into RDF triples. The resulting query is even less readable than the original description of the service sought. Therefore, MathServe uses extended OWL-S profiles, so-called query profiles, as a query language. Query profiles are OWL-S profiles whose input parameters take OWL individuals as values. Furthermore, a query profile limits the time given to answer a query.

Definition 5.1 (OWL-S Query Profile)
Let $\mathcal{O}$ be an OWL-DL ontology. If $(u, \{in_1 :: C_1, \ldots, in_m :: C_m\}, O, P, E, C, \emptyset)$ is an OWL-S service profile based on $\mathcal{O}$ and $t \in \mathbb{N}$ is a time limit in seconds, then $(u, I, O, P, E, C, t)$ is an OWL-S query profile based on $\mathcal{O}$, where $I = \{in_1 :: C_1 \leftarrow v_1, \ldots, in_m :: C_m \leftarrow v_m\}$ and $v_i$ is an OWL individual of class $C_i$ ($1 \le i \le m$). The set $OWLSQP_{\mathcal{O}}$ is the set of all OWL-S query profiles based on $\mathcal{O}$. ♦

A middle agent receiving a query profile $(u, I, O, P, E, C, t)$ is expected to answer that query within $t$ seconds of wall-clock time. The list $C$ of service categories tells the broker to restrict its search to services with categories in $C$ only. An empty set $C$ is interpreted as no restriction on the set of services.
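As an informal illustration of Definition 5.1, a query profile can be pictured as a plain record. The following sketch is only one possible encoding: the class and field names (`Param`, `QueryProfile`, `allows_category`) and the underscore spelling of identifiers are assumptions made for the example and are not taken from the MathServe implementation.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Param:
    """A typed parameter `name :: owl_class` (hypothetical encoding)."""
    name: str
    owl_class: str

@dataclass
class QueryProfile:
    uri: str
    inputs: dict            # Param -> OWL individual bound to it
    outputs: list           # list of Param
    preconds: list = field(default_factory=list)
    effects: list = field(default_factory=list)
    categories: frozenset = frozenset()
    timeout_secs: int = 0   # wall-clock time limit t

    def allows_category(self, category):
        """An empty category set places no restriction on the services searched."""
        return not self.categories or category in self.categories

# A query in the spirit of ND Query1: one bound input, one output, 30 seconds.
q = QueryProfile(
    uri="mw#ND_Query1",
    inputs={Param("puz_problem", "mw#TptpFOFProblem"): "tptp#PUZ0001+1"},
    outputs=[Param("my_proof", "mw#NDProof")],
    effects=["proofOf(my_proof, puz_problem)"],
    timeout_secs=30,
)
```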

Example 5.1.1 An exemplary query profile is shown in Figure 5.1. The query ND Query1 asks for a service which accepts the first-order proving problem ⋆tptp#PUZ0001+1 available in the TPTP Library. The matching service should be able to deliver a proof of PUZ0001+1 in the Natural Deduction calculus. The effects of ND Query1 simply state that the resulting proof should actually be a proof of PUZ0001+1. A middle agent receiving the query can use services of all categories and should deliver an answer within 30 seconds of wall-clock time.

query mw#ND Query1:
  inputs:   puz problem :: mw#TptpFOFProblem = ⋆tptp#PUZ0001+1
  outputs:  my proof :: mw#NDProof
  preconds:
  effects:  proofOf(my proof, puz problem)
  categs:
  timeout:  30 secs
  mw = http://www.mathweb.org/owl/mathserve.owl
  tptp = http://www.tptp.org/Problems

Figure 5.1: The profile of a query asking for a (composite) service that can generate Natural Deduction proofs for first-order conjectures

The projection functions introduced in Definition 3.25 are extended canonically to query profiles. Additionally, we define the notion of static query effects analogous to static preconditions of service profiles:

Definition 5.2 (Static Query Effects)
Let $\mathcal{O}$ be an OWL-DL ontology. Let $q \in OWLSQP_{\mathcal{O}}$ be a query profile with input parameters $I = inputs(q)$, output parameters $O = outputs(q)$ and effects $E = effects(q)$. The static effects of $q$ are defined as

$$statEffects(q) := \{\psi \in E \mid vars(\psi) \subseteq I \cup O\}$$ ♦
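Read operationally, Definition 5.2 is a filter over the effects of a query. The sketch below assumes a toy encoding that is not part of the thesis: an atom is a pair of a predicate name and an argument tuple, and variables are marked with a leading `?` so that they can be told apart from constants such as `stat#Theorem`.

```python
def variables(atom):
    """Variables of an atom: by assumption, arguments written with a leading '?'."""
    _, args = atom
    return {a for a in args if a.startswith("?")}

def stat_effects(effects, inputs, outputs):
    """Keep the effects whose variables all occur among the inputs or outputs."""
    allowed = set(inputs) | set(outputs)
    return [atom for atom in effects if variables(atom) <= allowed]

effects = [
    ("proofOf", ("?my_proof", "?puz_problem")),
    ("derivedFrom", ("?my_proof", "?cnf_problem")),  # hypothetical atom
]
static = stat_effects(effects, {"?puz_problem"}, {"?my_proof"})
```

Here only the `proofOf` atom is static, because `?cnf_problem` is neither an input nor an output of the query.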

In Section 6.4.2.3 we describe how the static effects of a query profile can be translated into the goal state of a classical planning system.

5.1.2 Composite Services as Golog Procedures

In the previous chapter we have seen how different types of atomic reasoning services can be described in OWL-S. More often than not, one atomic service is not sufficient to solve a reasoning problem. In OWL-S, composite processes can be described using the control constructs presented in Section 3.2.2.1 (e.g., Sequence, Split and Choice). However, the resulting process models are very difficult for human users to read and modify. Moreover, a complete execution engine for composite OWL-S processes (including conditional choices) is not yet available.

In [McIlraith and Son, 2002] it was shown how composite processes can be represented in human-readable form in the Golog language, which we introduced in Section 2.4.2. Indeed, almost all control constructs available in OWL-S (except Split and Split+Join) are also available in Golog. In MathServe we follow the approach of McIlraith and Son and represent composite reasoning services as Golog procedures. An atomic service is modelled as a primitive situation calculus action with the service’s URI as the action name. The input and output parameters of atomic services become the parameters of the corresponding situation calculus action.¹ The lists of input and output parameters are both sorted lexicographically with respect to the parameters’ RDF identifiers. The concatenation of both lists becomes the parameter list of the situation calculus action. For example, the situation calculus action term for the service TrampCNF (see page 83) with input parameter fof problem and output parameters cnf problem and delta relation (see Section 4.3.1) is

http://www.mathweb.org/owl/trans#TrampCNF(fof problem, cnf problem, delta relation)

For the sake of readability we will typically use the local name of the service’s URI whenever the namespace can be neglected, i.e. the above action is written²:

TrampCNF(fof problem, cnf problem, delta relation)

Since the situation calculus is an untyped language, OWL-S parameter types are simply ignored.³

1 Note that OWL-S parameters are SWRL variables. They become free variables in the corresponding situation calculus action term.
2 For Prolog implementations of the situation calculus, action names are uncapitalised.
3 The situation calculus could be extended by adding new unary predicates for all types together with subtyping rules. However, we did not work with such an extension because it would have required major changes to the decision-theoretic situation calculus reasoner used in Chapter 6.

If needed, type information can always be reconstructed from the OWL-S service profiles. A list of OWL-S service profiles can be translated into a stochastic situation calculus action domain (see Section 2.4.5.1) as shown in Section 6.5.1 and Appendix E.2.
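The construction of an action term from a service profile, as described in Section 5.1.2, amounts to two sorts and a concatenation. The following sketch is illustrative only: the function name `action_term` is hypothetical, parameters are sorted by their plain names rather than by full RDF identifiers, and the underscore spelling of the parameter names is an assumption.

```python
def action_term(service_uri, inputs, outputs, local_name=True):
    """Build a situation calculus action term from a service URI and its parameters.

    Inputs and outputs are each sorted lexicographically; the sorted input list
    is followed by the sorted output list. Parameter types are ignored.
    """
    name = service_uri.rsplit("#", 1)[-1] if local_name else service_uri
    params = sorted(inputs) + sorted(outputs)
    return f"{name}({', '.join(params)})"

term = action_term(
    "http://www.mathweb.org/owl/trans#TrampCNF",
    inputs=["fof_problem"],
    outputs=["delta_relation", "cnf_problem"],
)
# term is "TrampCNF(fof_problem, cnf_problem, delta_relation)"
```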

proc DecisionAttempt (fof problem, result)
  (EpATP (fof problem, atp result) | SpassATP (fof problem, atp result)) ;
  if status (atp result, stat#Theorem) then
    bind (result, atp result) ;
  else
    MaceMG (fof problem, result) ;
  endIf
endProc

Figure 5.2: A simple composite service in Golog

In our work, composite OWL-S services are modelled as Golog procedures. The OWL-S composition constructs (see Section 3.2.2) are translated into the corresponding Golog expressions (Section 2.4.2). Figure 5.2 shows the Golog representation of the simple composite service DecisionAttempt. The procedure expects a first-order theorem proving problem (fof problem) in TPTP format as its only input. It first nondeterministically chooses one of the theorem-proving services EpATP or SpassATP (see Section 4.4.5) and invokes it. If the status of the result of EpATP or SpassATP is Theorem (i.e. one of the theorem provers could prove the conjecture) then the parameter result of the procedure is bound to atp result. Otherwise the model generation service MaceMG is invoked to try to find a counter-example for the conjecture. In Chapter 6, more composite services will be presented.
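For readers less familiar with Golog, the control flow of DecisionAttempt can be paraphrased in a conventional language. The sketch below is not how MathServe executes the procedure: services are stand-in Python callables supplied in a dictionary, results are plain dictionaries with a `status` entry, and every status value other than `Theorem` is made up for the example.

```python
import random

def decision_attempt(fof_problem, services, rng=random):
    """Try one of two provers; fall back to model generation if no proof is found."""
    # Nondeterministic choice between the two theorem-proving services.
    prover = rng.choice(["EpATP", "SpassATP"])
    atp_result = services[prover](fof_problem)
    if atp_result["status"] == "Theorem":
        return atp_result                    # bind(result, atp_result)
    return services["MaceMG"](fof_problem)   # look for a counter-example

# Stub services standing in for remote Web Service invocations.
stubs = {
    "EpATP": lambda p: {"status": "Theorem", "by": "EpATP"},
    "SpassATP": lambda p: {"status": "Theorem", "by": "SpassATP"},
    "MaceMG": lambda p: {"status": "CounterSatisfiable", "by": "MaceMG"},
}
result = decision_attempt("PUZ0001+1", stubs)
```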

5.2 The MathServe Broker

The MathServe broker is a middle agent which provides semantic mediation services to service requesters. If necessary, the broker performs automated service matchmaking and service composition based on the OWL-S service profiles of reasoning services and queries. It is worth mentioning that the broker provides a Web Service interface just like any other service in MathServe.

Figure 5.3 depicts the different modules of the MathServe broker and their interaction. Service providers can register OWL-S descriptions of reasoning services with the broker’s service registry. Service requesting applications send reasoning problems as OWL-S query profiles to the query manager. The service matchmaker is used to determine services that match a given query (see Section 5.2.3). Automated service composition is performed by the service composer. The OWL-DL reasoner Pellet [Sirin et al., 2006, Sirin et al., 2003] is used by several modules to perform Description Logic reasoning. The MathServe broker offers a convenient interface for theorem proving problems which is currently used by several proof assistants and verification environments (see Section 5.4).

[Diagram omitted. Labels: Service Requester 1, Service Requester 2, Proof Assistant, Query Manager, Service Matchmaker, Service Composer, Service Registry, OWL-S Service Profiles, Golog Interpreter, Pellet OWL Reasoner, KB, ATP Interface, Service Provider 1, ..., Service Provider n; operations recommendAtomic(q), brokerComposite(q), evalComposite(s,q), getServices, registerService(s1), ..., registerService(sn), proveTPTPOpt(p).]

Figure 5.3: The different modules of the MathServe broker and the broker’s interfaces to service providers and service requesters

In what follows we provide more detailed descriptions of most of the MathServe broker’s modules. The service composer is presented separately in Chapter 6.

5.2.1 The Service Registry

MathServe’s service registry maintains a persistent data base of OWL-S service profiles. Service profiles are added to the data base if advertised by service providers. The registry maintains a blacklist of services that are temporarily not available. Whenever a component of MathServe fails to invoke a Web Service, the service is added to the service registry’s blacklist. A counter is associated with each blacklisted service to keep track of the number of failed invocations of the service. After 10 failures the service is removed from the service registry and has to be advertised again by the corresponding service provider. At regular intervals, the registry checks whether blacklisted services are available again by sending the most recent SOAP request. If a service becomes available again, it is removed from the blacklist.
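The blacklist bookkeeping can be sketched as a small in-memory class. This is a simplification under stated assumptions: persistence and the periodic SOAP probe are left out, the method names are invented for the example, and the choice to hide blacklisted services from `get_services` is an assumption rather than something the text above specifies.

```python
class ServiceRegistry:
    """In-memory sketch of a registry with a failure-counting blacklist."""

    MAX_FAILURES = 10  # after this many failed invocations the service is dropped

    def __init__(self):
        self.profiles = {}   # service URI -> OWL-S profile (opaque here)
        self.blacklist = {}  # service URI -> number of failed invocations

    def register(self, uri, profile):
        """Advertise (or re-advertise) a service."""
        self.profiles[uri] = profile
        self.blacklist.pop(uri, None)

    def report_failure(self, uri):
        """Record a failed invocation; remove the service after MAX_FAILURES."""
        if uri not in self.profiles:
            return
        self.blacklist[uri] = self.blacklist.get(uri, 0) + 1
        if self.blacklist[uri] >= self.MAX_FAILURES:
            del self.profiles[uri]
            del self.blacklist[uri]

    def report_available(self, uri):
        """Called when a periodic availability check succeeds."""
        self.blacklist.pop(uri, None)

    def get_services(self):
        """Registered services that are not currently blacklisted (assumption)."""
        return [u for u in self.profiles if u not in self.blacklist]
```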

5.2.2 The Query Manager

The query manager is the main interface between service requesters and the MathServe broker. For every query sent to the broker, a new instance of the query manager is created. This ensures that MathServe can answer several queries concurrently.

The query manager offers three WSDL operations: The operation recommendAtomic is used by service requesters to ask for a list of atomic OWL-S services matching a given query. To determine the list of matching services the query is forwarded to the service matchmaker. The operation brokerComposite is used to ask the broker to find a suitable composite service to answer a query, invoke the service and return the result of the invocation. When using brokerComposite(q), the query q is first passed on to the service composer. If the composer returns a Golog procedure containing an atomic or a composite service, the procedure is evaluated with the Golog interpreter. The result of the evaluation is returned to the service requester. With the operation evalComposite(s,q), a client application can ask the broker to evaluate the given Golog procedure s with respect to the query q. The separation of operations is based on the assumption that service requesters can invoke atomic services but cannot evaluate Golog procedures representing composite services.

5.2.3 The Service Matchmaker

The service matchmaker tries to find suitable atomic reasoning services that can potentially answer a query. OWL-S services matching a query must fulfil several requirements: The number of input parameters of a matching service must be smaller than or equal to the number of inputs of the query profile, but the service can have more outputs than the query. However, mappings between the input (output) parameters of the query and the input (output) parameters of the service must exist. These mappings must respect the parameter types, i.e. the type of a query input must be subsumed by the type of a service input. Similarly, the type of a service output must be subsumed by the type of a query output. Furthermore, the preconditions of the service must be entailed by the preconditions of the query. Similarly, the effects of the query should be entailed by the knowledge base extended with the effects of the service.

A disjunctive service effect plays a special role because it names different outcomes that a service potentially achieves. If some of these outcomes entail some of the effects of a given query, the service can potentially answer that query. Currently, the probabilities with which certain literals of a disjunctive effect become true are ignored by MathServe’s matchmaking algorithm. However, one could change the matching algorithm such that only services which achieve a query effect with a certain probability are returned. Formally, we define a matching service for a query q as follows.

Definition 5.3 (Matching Service)
Let $\mathcal{O}$ be an OWL-DL ontology and $KB_{\mathcal{O}}$ be an OWL-DL knowledge base containing $\mathcal{O}$. Let further $q = (u_q, I_q, O_q, P_q, E_q, C_q, t)$ be a query profile based on $\mathcal{O}$ and $s = (u_s, I_s, O_s, P_s, E_s, C_s, PE_s)$ be the profile of a reasoning service $s$ based on $\mathcal{O}$. We say that $s$ matches $q$ if and only if the following conditions hold:

1. The service shares at least one category with the query, i.e. if $C_q \neq \emptyset$ then $C_q \cap C_s \neq \emptyset$.

2. There exists a surjective mapping $m_{in} : I_q \to I_s$ such that for all $(in_q :: C_q \leftarrow v_q) \in I_q$ with $m_{in}(in_q :: C_q \leftarrow v_q) = in_s :: C_s$ the entailment $KB_{\mathcal{O}} \models C_q \sqsubseteq C_s$ holds.

3. There exists an injective mapping $m_{out} : O_q \to O_s$ such that for all $(out_q :: C_q) \in O_q$ with $m_{out}(out_q :: C_q) = out_s :: C_s$ the entailment $KB_{\mathcal{O}} \models C_s \sqsubseteq C_q$ holds.

4. $\sigma : vars(I_s \cup O_s) \to vars(I_q \cup O_q)$ is the substitution corresponding to the inverse of $m_{in}$ and $m_{out}$, i.e. $\sigma(in_s) = in_q$ iff $m_{in}(in_q :: C_q \leftarrow v_q) = in_s :: C_s$ and

$$\sigma(out_s) = \begin{cases} out_q & \text{if } \exists out_q\, \exists C_q.\ m_{out}(out_q :: C_q) = out_s :: C_s \\ out_s & \text{otherwise.} \end{cases}$$

5. If $\Phi = \bigwedge_{\varphi \in P_q} \varphi$ then for all preconditions $\psi \in P_s$ the entailment $KB_{\mathcal{O}}, \Phi \models \psi\sigma$ holds.

6. If $\{A_1, \ldots, A_n\} = lits(E_s)$ and $\Psi := \bigwedge_{1 \le i \le n} A_i$ then for all effects $\varphi \in E_q$ the entailment $KB_{\mathcal{O}}, \Psi\sigma \models \varphi$ holds. ♦

We remind the reader that according to Definition 3.18 an OWL-S service profile can contain at most one disjunctive effect (see Sections 3.4.3.1 and 3.4.3.3).

Figure 5.4 shows the pseudo-code of the matchmaking algorithm used by the MathServe broker. The algorithm is a direct realisation of Definition 5.3. It expects a query profile, a service registry, and an OWL-DL domain ontology as inputs. It returns the set of services matching the given query.

The functions matchInputs and matchOutputs respect conditions 2–5 in Definition 5.3 and try to match the input (output) parameters of the service with the parameters of the query. The functions use the Pellet reasoner to perform class subsumption tests. They return the substitutions σin and σout which can be applied to the preconditions and effects of the service. In the case of a match, the Pellet reasoner is used to check the entailment between the instantiated preconditions (effects) of the service and the preconditions (effects) of the query. If both entailment tests are successful the service is added to the answer set.

Preliminary experiments showed that matchmaking based on DL reasoning is not very efficient. A large part of the time of a matchmaking attempt is spent on the entailment tests between the instantiated effects of the service profiles and the effects of the query profile. This is why most high-performance matchmakers for Semantic Web Services, such as OWLS-MX [Klusch et al., 2006], ignore preconditions and effects and only try to match the input and output parameters of services and queries.
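The parameter-level part of the matching (conditions 1 to 3 of Definition 5.3) can be prototyped without a DL reasoner. The sketch below makes several simplifying assumptions that do not hold for MathServe itself: class subsumption is a walk along a single-parent map instead of a Pellet query, the mappings are found by brute-force enumeration, preconditions and effects are not checked at all, and the class hierarchy and service shown are hypothetical.

```python
from itertools import permutations, product

def subsumed_by(sub, sup, parents):
    """True if `sub` equals `sup` or reaches it by following parent links."""
    while sub is not None:
        if sub == sup:
            return True
        sub = parents.get(sub)
    return False

def match_inputs(q_in, s_in, parents):
    """Search for a surjective, type-respecting map from query to service inputs."""
    for choice in product(range(len(s_in)), repeat=len(q_in)):
        if set(choice) != set(range(len(s_in))):
            continue  # some service input would stay unbound
        if all(subsumed_by(q_in[i][1], s_in[j][1], parents)
               for i, j in enumerate(choice)):
            # Substitution from service names to query names.
            return {s_in[j][0]: q_in[i][0] for i, j in enumerate(choice)}
    return None

def match_outputs(q_out, s_out, parents):
    """Search for an injective, type-respecting map from query to service outputs."""
    for image in permutations(range(len(s_out)), len(q_out)):
        if all(subsumed_by(s_out[j][1], q_out[i][1], parents)
               for i, j in enumerate(image)):
            return {s_out[j][0]: q_out[i][0] for i, j in enumerate(image)}
    return None

def matches(query, service, parents):
    """Check categories, inputs and outputs only (no preconditions or effects)."""
    if query["categs"] and not (query["categs"] & service["categs"]):
        return False
    return (match_inputs(query["inputs"], service["inputs"], parents) is not None
            and match_outputs(query["outputs"], service["outputs"], parents) is not None)

parents = {"TwegaNDProof": "NDProof"}  # hypothetical subclass relation
query = {"categs": set(),
         "inputs": [("puz_problem", "TptpFOFProblem")],
         "outputs": [("my_proof", "NDProof")]}
service = {"categs": {"ProofTransformation"},
           "inputs": [("fof_problem", "TptpFOFProblem")],
           "outputs": [("nd_proof", "TwegaNDProof")]}
ok = matches(query, service, parents)
```

A matchmaker restricted to this kind of test corresponds to the input/output-only matching mentioned above; the expensive part, the entailment checks on preconditions and effects, is exactly what the sketch leaves out.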

5.2.4 The Service Composer

On receiving an OWL-S query profile, the service composer first tries to find suitable sequences of services which can potentially answer the query. Service composition in MathServe combines classical AI planning with decision-theoretic reasoning in the situation calculus. Due to its complexity, the composition technique is presented separately in Chapter 6.

proc match(q, registry, ontology)
  S = getServices(registry);
  kb = new KnowledgeBase(ontology ∪ {q});
  pellet = new PelletReasoner();
  answerSet = ∅
  for all s ∈ S do
    if categories(q) = ∅ ∨ categories(s) ∩ categories(q) ≠ ∅ then
      σin = matchInputs(inputs(q), inputs(s), pellet);
      if σin ≠ ∅ then
        σout = matchOutputs(outputs(q), outputs(s), pellet);
        if σout ≠ ∅ then
          if pellet.entails(kb, σin(preconds(s))) then
            kb′ = kb.tell(lits(σin(σout(effects(s)))));
            if pellet.entails(kb′, effects(q)) then
              answerSet = answerSet ∪ {s};
            fi
          ...
    fi
  end
  return answerSet;
end

Figure 5.4: Matching algorithm used by the service matchmaker

5.2.5 The Pellet Ontology Reasoner

Several components of the MathServe broker use the Pellet DL reasoner which works on OWL-DL knowledge bases. Pellet [Sirin et al., 2006, Sirin et al., 2003] is an OWL-DL reasoner based on tableaux algorithms developed for expressive Description Logics. It supports the full expressivity of OWL-DL including reasoning about nominals (enumerated classes). Currently, Pellet is the first and only sound and complete DL reasoner that can handle this expressivity. Pellet ensures soundness and completeness by incorporating the recently developed decision procedure for SHOIQ [Horrocks and Sattler, 2005].⁴ Pellet is used in three different ways by the MathServe broker:

1. The service matchmaker uses Pellet to decide the subsumption relation between the OWL classes of output (input) parameters of OWL-S service profiles and query profiles.

2. MathServe’s service composer uses Pellet to compute the complete directed acyclic graph of simple named classes in MathServe’s domain ontology and their subsumption relationship. This graph is translated into a type hierarchy for service composition with AI planning (see Chapter 6).

3. During the execution of a composite reasoning service, the Golog interpreter uses Pellet to compute property values of individuals. This is needed to check the preconditions of the atomic services before they are invoked. The individuals’ properties are also used to check the truth values of situation calculus formulae in conditional statements.

4 SHOIQ extends SHOIN (D) with qualified cardinality restrictions. The term “qualified” means that restrictions are not made on the overall number of values of a property, but only on the number of values of a certain type.

5.2.6 The Golog Interpreter

Interpreters for Golog are typically implemented in Prolog (see Section 2.4.2). Unfortunately, even modern Prolog interpreters do not provide the means to develop Internet applications. Among other things, they lack facilities for handling XML documents as well as basic libraries for Internet interoperability. In MathServe, however, Golog programs represent composite reasoning services and during the evaluation of a program, atomic reasoning services have to be invoked. Therefore, we developed our own basic on-line Golog interpreter in Java. Our approach has the advantage that the interpreter can access all components of MathServe including OWL-DL knowledge bases and the DL reasoner Pellet.

Golog and Query Profiles. In MathServe, primitive situation calculus actions represent atomic reasoning services and Golog procedures represent composite services. Situation calculus actions can only be evaluated if those action parameters which correspond to input parameters of OWL-S service profiles are bound to appropriate values. The same holds for the input parameters of a Golog procedure. Currently, the values bound to input parameters are provided by a valid query profile associated with a Golog procedure. The input and output parameters of such a profile must have the same names as the corresponding parameters of the Golog procedure. All input parameters of the profile have to be bound to OWL individuals of an appropriate OWL-DL class. We illustrate this correspondence with the composite service DecisionAttempt (see Section 5.1.2) represented by the Golog procedure with the head

proc DecisionAttempt (fof problem, result ).

A valid query profile associated with this service is the following:

query mw#ResultQuery1:
  inputs:   fof problem :: mw#TptpFOFProblem = ⋆tptp#PUZ0001+1
  outputs:  result :: mw#Result
  preconds:
  effects:  status(result, stat#Theorem) ∨̇ status(result, stat#Unsatisfiable)
  categs:
  timeout:  10 secs
  mw = http://www.mathweb.org/owl/mathserve.owl
  stat = http://www.mathweb.org/owl/status.owl
  tptp = http://www.tptp.org/Problems

The query profile provides an instantiation of the input parameter fof problem. Furthermore, it declares the parameter result as an output, and provides a time limit for the evaluation of the composite service.

It is worth mentioning that all OWL-S atomic service profiles are valid query profiles for the primitive situation calculus actions they are represented by. Thus, query profiles have to be provided only for composite services (Golog procedures). Furthermore, a query profile sent to MathServe’s service composer is automatically valid for all Golog procedures generated by the service composer to answer that query.

Evaluating Golog Procedures. MathServe’s Golog interpreter supports Golog programs consisting solely of action sequences, conditional statements, and nondeterministic choice of actions. Loops and nondeterministic choice of arguments are not yet supported (cf. Section 2.4.2). Given a Golog procedure and a valid query profile, the evaluation process starts with a new instance of the Pellet reasoner working on an initial OWL-DL knowledge base containing all OWL-S service profiles and the MathServe domain ontology. The input parameters (OWL individuals) and preconditions (SWRL atoms) of the valid query profile are asserted in that knowledge base. The interpreter maintains a binding for all free variables in a Golog procedure. In the initial state only the input parameters of the procedure are bound.

The statements of a sequence δ1,...,δn are evaluated in order. In the case of a nondeterministic choice δ1 | . . . | δn, the interpreter uses a pseudo-random number generator to pick an i ∈ {1, . . . , n} and evaluates δi. The truth value of conditions in conditional statements is determined with the help of the Pellet reasoner, which checks whether the condition is entailed by the current knowledge base.

For every primitive situation calculus action the interpreter checks whether the preconditions of the corresponding OWL-S profile are entailed by the current knowledge base. If one of the preconditions is not entailed, the interpreter returns a failure message (see Table 5.1). Otherwise, the underlying Web Service is invoked with its input parameters bound to the current values of the corresponding parameters of the primitive situation calculus action. A successful Web Service invocation returns an RDF document containing one or more OWL individuals as values of the service’s output parameters. The situation calculus action variables representing service outputs are bound to the OWL individuals. Finally, the instantiated effects of the OWL-S service are asserted in the knowledge base. If a service invocation fails, the failure message is returned as the overall result of the interpretation.
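The evaluation rules for sequences, nondeterministic choice, conditionals and primitive actions fit into a short recursive function. The sketch below is a toy and differs from MathServe’s Java interpreter in several assumed ways: the knowledge base is a set of ground facts written as strings, entailment is set membership instead of a Pellet query, variable bindings and time limits are omitted, and `bind` is modelled as an ordinary action.

```python
import random

class PreConditionFailure(Exception):
    """Raised when an action's preconditions are not in the knowledge base."""

def evaluate(prog, kb, actions, rng=random):
    """Evaluate a program tree against `kb` (a set of facts, updated in place)."""
    kind = prog[0]
    if kind == "act":
        preconds, invoke = actions[prog[1]]
        if not preconds <= kb:          # stand-in for the entailment check
            raise PreConditionFailure(prog[1])
        kb.update(invoke(kb))           # assert the effects of the service
    elif kind == "seq":
        for step in prog[1]:            # evaluate statements in order
            evaluate(step, kb, actions, rng)
    elif kind == "choice":
        evaluate(rng.choice(prog[1]), kb, actions, rng)
    elif kind == "if":
        _, cond, then_branch, else_branch = prog
        evaluate(then_branch if cond in kb else else_branch, kb, actions, rng)
    else:
        raise ValueError(f"unsupported construct: {kind}")
    return kb

# DecisionAttempt as a program tree; each action is (preconditions, effect function).
proved = lambda kb: {"status(atp_result, Theorem)"}
actions = {
    "EpATP": (set(), proved),
    "SpassATP": (set(), proved),
    "MaceMG": (set(), lambda kb: {"status(result, Satisfiable)"}),
    "bind": ({"status(atp_result, Theorem)"}, lambda kb: {"bound(result, atp_result)"}),
}
decision_attempt = ("seq", [
    ("choice", [("act", "EpATP"), ("act", "SpassATP")]),
    ("if", "status(atp_result, Theorem)", ("act", "bind"), ("act", "MaceMG")),
])
final_kb = evaluate(decision_attempt, set(), actions)
```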

Time Resource Assignment. MathServe’s Golog interpreter explicitly assigns time resources to atomic reasoning services if possible. Given a time limit for the execution of a Golog program (as provided in query profiles), it keeps track of the time already used by previous invocations of reasoning services and of the time remaining. When invoking an atomic OWL-S process with an unbound input parameter of type TimeResource, that parameter is assigned a time resource following a simple heuristic: if the service is part of a sequence of n services still to be invoked and the time remaining for the whole sequence is t seconds, then the input parameter is assigned a wall-clock time of ⌊t/n⌋ seconds. If the OWL-S process has no input parameter of type TimeResource, the service is simply invoked without time limitations. The time needed by a Web Service invocation is measured and subtracted from the time still available.
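As a small worked illustration of this heuristic, the following Python sketch simulates the assignment for a sequence of services. The function names are hypothetical, and the sketch assumes that a service never consumes more than the limit it was given.

```python
import math


def assign_time(remaining_seconds: float, services_left: int) -> int:
    """Wall-clock limit for the next service: floor(t / n)."""
    if services_left < 1:
        raise ValueError("at least one service must be left to invoke")
    return math.floor(remaining_seconds / services_left)


def simulate(durations: list, time_limit: float) -> list:
    """Return the limit assigned to each service of a sequence.

    `durations` holds the measured time of each invocation (simulated here).
    """
    remaining = time_limit
    assigned = []
    for position, used in enumerate(durations):
        services_left = len(durations) - position
        limit = assign_time(remaining, services_left)
        assigned.append(limit)
        # assumption: an invocation is cut off once its limit is reached
        remaining -= min(used, limit)
    return assigned
```

For a limit of 90 seconds and three services that need 10, 5 and 100 seconds, the assigned limits grow as earlier services finish early: 30, then 40, then 75 seconds.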

Failure Handling. The evaluation of a composite service can fail for several reasons. A failure can occur either on the Golog interpreter’s side or during the remote invocation of a Web Service. In the case of a failure, the Golog interpreter returns an instance of the OWL-DL class Failure or one of its subclasses. These classes are defined in the MathServe ontology as follows:

{ mw#Failure ⊑ owl#Thing
  mw#Timeout ⊑ mw#Failure
  mw#PreConditionFailure ⊑ mw#Failure
  mw#NetworkFailure ⊑ mw#Failure
  mw#InputFailure ⊑ mw#Failure
  mw#InternalFailure ⊑ mw#Failure
  ≥ 1 mw#message ⊑ mw#Failure,
  ⊤ ⊑ ∀ mw#message. xsd#string }

The message property contains a natural language description of the cause of the failure. Table 5.1 describes the different OWL failure classes. If a network failure occurs during the invocation of an atomic OWL-S service, the service registry is asked to add the service to its blacklist.

| OWL Class | Failure Description | Message Content |
|---|---|---|
| Timeout | The time resource given to the Golog interpreter was used up before the procedure could be evaluated completely. | The name of the last atomic service invoked. |
| PreConditionFailure | One or more of the preconditions of an atomic service were not fulfilled. | The name of the precondition that failed and the objects bound to the input parameters of the service. |
| NetworkFailure | A Web Service was temporarily not available due to network or server failure. | Failure message provided by the web service invocation engine. |
| InputFailure | A Web Service could not process one of its input parameters. | The name of the service and the input parameter causing the failure. |
| InternalFailure | An atomic reasoning service or the MathServe broker encountered some internal failure. | A description of the failure if available. |

Table 5.1: Description of the different types of failures reported by MathServe’s Golog interpreter
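The failure hierarchy and the blacklisting rule can be mirrored in a few lines of Python. This is only an illustration of the class structure and of the handling described above, not the MathServe code; `invoke_with_blacklist` and its parameters are hypothetical names, and the service registry is reduced to a plain set.

```python
class Failure(Exception):
    """Counterpart of mw#Failure; `message` mirrors the mw#message property."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


class Timeout(Failure):
    pass


class PreConditionFailure(Failure):
    pass


class NetworkFailure(Failure):
    pass


class InputFailure(Failure):
    pass


class InternalFailure(Failure):
    pass


def invoke_with_blacklist(service_name: str, call, blacklist: set):
    """Invoke a service; a failure is returned as the result, not re-raised."""
    try:
        return call()
    except NetworkFailure as failure:
        blacklist.add(service_name)   # ask the "registry" to blacklist the service
        return failure
    except Failure as failure:
        return failure
```

Only network failures change the blacklist; all other failures are simply handed back to the caller as the result of the evaluation.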

5.2.7 The Automated Theorem Proving Interface

To ease the access to the first-order theorem proving services in MathServe, the broker offers a convenient interface which accepts theorem proving problems in the standard formats TPTP and OMDoc. The interface uses the Golog interpreter to evaluate predefined Golog procedures on the given theorem proving problems. The ATP interface offers the following WSDL operations⁵:

atpChoice evaluates the Golog procedure resultQueryProc (presented in Section 6.5.2) on a theorem proving problem. An arbitrary first-order ATP service is chosen to tackle the problem.

atpOpt evaluates the optimal policy procedure resultQueryProcOpt (presented in Section 6.6) to solve an ATP problem. The procedure chooses the most promising ATP service according to the SPC of the problem.

atpSplitAndJoin invokes all available ATP services in parallel to try to solve an ATP problem. It returns the results of all invocations once they have terminated.

The above WSDL operations accept theorem proving problems in the formats TPTP and OMDoc. They can be used by simple SOAP clients to try to solve first-order ATP problems. So far, clients have been developed in the programming languages Java, LISP and Haskell.
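The behaviour of atpSplitAndJoin can be sketched as follows. The sketch makes no claim about the actual SOAP interface: the ATP services are stand-in Python callables, and `split_and_join` is a hypothetical name. It only shows the "invoke all in parallel, return once all have terminated" pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def split_and_join(problem: str, services: dict) -> dict:
    """Run every service on the same problem in parallel and collect all results.

    `services` maps a service name to a callable taking the problem.
    """
    if not services:
        return {}
    results = {}
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        futures = {name: pool.submit(prover, problem)
                   for name, prover in services.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()      # blocks until this service is done
            except Exception as error:               # a failing service does not stop the others
                results[name] = f"failure: {error}"
    return results
```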

5.3 System Implementation

The MathServe framework has been implemented in the Sun Java development environment. The development of Web applications in general and Web Service applications in particular is well-supported in Java. Web Services in MathServe are based on a Tomcat Web Server [Brittain and Darwin, 2003] and the Apache Axis package [Irani and Bashna, 2002]. The latter is an implementation of SOAP, the standard protocol for Web Service communication (see Section 3.2.1.2). To ease the integration of reasoning systems, MathServe offers generic classes for the integration of first-order ATP systems, model generators, and decision procedures. By instantiating these classes, new systems can be easily integrated.

Next to the Tomcat Web server and the mechanised reasoning systems integrated, MathServe uses various other third-party software packages. OWL ontologies (knowledge bases) are managed with the help of the Jena library [McBride, 2001], a Java API for reading, creating and modifying RDF documents and OWL ontologies. Jena only provides limited reasoning support for RDF and OWL. Therefore, the Pellet reasoner is used for all Description Logic reasoning tasks. MathServe employs the Mindswap OWL-S-API⁶ for managing OWL-S service descriptions. XML documents other than OWL/RDF are parsed and created with the JDOM package [McLaughlin, 2001]. Furthermore, the TPTP utilities and the off-line DTGolog interpreter used by MathServe are based on the Prolog implementations SWI-Prolog [Wielemaker, 2003] and Yap⁷. The TRAMP system and the PRODIGY planner (see Section 6.4.2) are implemented in Allegro Common LISP⁸.

5 See Section 3.2.1.1 for a description of WSDL operations.
6 Available at http://www.mindswap.org/2004/owl-s/api/.
7 See http://www.ncc.up.pt/~vsc/Yap/index.html.
8 See http://www.franz.com/.

5.4 Availability & Usability

The MathServe system can be easily installed on a Unix/Linux platform with one of the available binary packages. The system sources are available via anonymous CVS [Fogel and Bar, 2003]. Further information and installation instructions can be found at the system’s Web page (see http://www.ags.uni-sb.de/~jzimmer/mathserve.html).

The MathServe broker’s ATP interface is currently used by the Ωmega proof assistant, the VeriFun system [Walther and Schweitzer, 2003], and the Hets toolset [Mossakowski et al., 2006] to solve first-order theorem proving problems. In all three systems, formal proofs delivered by MathServe are integrated into ongoing proof developments. The usability and stability of MathServe have also been demonstrated at two CADE ATP System Competitions (see Chapter 7).

5.5 Summary

The MathServe framework offers reasoning services as Semantic Web Services. The MathServe broker provides service matchmaking and composition facilities to help service requesters find suitable reasoning services. Queries are sent to the broker as extended OWL-S profiles. Composite services are represented in a human-readable way as Golog procedures. A specialised Golog interpreter is used to evaluate these procedures and invoke atomic reasoning services. A specialised ATP interface provides easy access to the first-order ATP services available in MathServe. The interface is used on a daily basis by the Ωmega system and two other projects. The robustness and usability of MathServe have been demonstrated at two CADE ATP System Competitions (see Chapter 7).

Chapter 6

Automated Composition of Semantic Reasoning Services

In Chapters 4 and 5 we showed how automated reasoning systems can be described as Semantic Web Services in OWL-S and how these descriptions can be used for service matchmaking performed by the MathServe broker. However, reasoning problems cannot always be solved by a single reasoning service. Frequently, such problems require transformations or translations between different formalisms before they can be given to a specific reasoning service. Similarly, the results of reasoning services, such as formal proofs, may have to be translated into different logical formalisms. In this case, several services have to be combined to solve a reasoning problem.

In this chapter we show how the service composer of the MathServe broker automatically combines different reasoning services, if necessary. The broker’s composition procedure combines classical planning with decision-theoretic reasoning in the situation calculus. In Section 6.1 we present the requirements on the service composition process. We analyse existing approaches for automated Web Service composition (cf. Chapter 3) with respect to our requirements in Section 6.2. The composition approach used by MathServe is described in Section 6.3. More detailed descriptions of the planning part and the decision-theoretic reasoning part of the approach are presented in Sections 6.4 and 6.5, respectively.

6.1 Requirements on Web Service Composition

During the development of the MathServe framework, we identified several requirements that have to be met by a composition technique for reasoning services. Some of these requirements are inherent to the composition of Semantic Web Services; others stem from peculiarities of our application domain and from the way reasoning services are encoded in MathServe. In what follows, we ignore other requirements generally mentioned in the context of Web Service composition, such as scalability and quality of service (QoS) issues. We identified five main requirements:

Description Logic Classes and Subsumption. The input and output parameters of OWL-S service profiles are annotated with OWL-DL classes which provide rich type information. From a service composition point of view, type information can help to reduce the search space of the composition process. But, more importantly, type information leads to well-typed composite services with a higher likelihood of successful execution. This is why the composition procedure used by the MathServe broker should respect the type annotation of input and output parameters of OWL-S services and the subsumption relationship between those types. In general, a service’s output with type C1 can be used as another service’s input (of type C2) if C2 is a superclass of C1.

Conditional Probabilistic Effects. Some of the reasoning services described in Chapter 4 achieve particular effects only with a certain probability, i.e. they are agent actions with probabilistic effects. Sometimes the probabilities of these effects depend on the situation a service is invoked in, i.e. they are conditional probabilistic effects. In MathServe, conditional probabilistic effects are used to provide information about the performance of reasoning systems on different types of problems. Therefore, a composition process making use of this information has to be able to reason about services (agent actions) with conditional probabilistic effects.

Sensing Actions. An optimal choice of reasoning services may depend on properties of objects that are only known when a (composite) service is executed. Therefore, our composition procedure should be able to introduce sensing actions to dynamically determine unknown properties of objects. In classical planning, sensing actions (as opposed to physical actions) differ from other actions insofar as they do not alter the state of the external world but only extend the knowledge of the plan execution agent. Strictly speaking, no reasoning service in MathServe changes an external world. Thus, all services in MathServe are sensing actions in the classical sense. We therefore change the definition of the term sensing action for our work: In MathServe, all reasoning services which do not introduce new objects but which merely provide new information about existing objects are regarded as sensing actions. In Section 4.5 we saw an example of such a sensing action: the TPTP problem analysing service TptpAnalyser, which determines the Specialist Problem Class (SPC) of a first-order theorem proving problem. A procedure for composing reasoning services should be able to automatically introduce sensing actions whenever it is necessary and possible.

Alternatives and Enumeration of Composite Services. Many reasoning services available in MathServe are problem solving services which are not guaranteed to successfully solve every given problem. This implies that the execution of composite services that incorporate problem solving services might not deliver the desired result¹. To ensure a high likelihood of success the composition procedure should incorporate alternatives. These alternatives could either be represented as different composite services or as one composite service with nondeterministic choice points. From a planning point of view, alternatives are created by the enumeration of different plans that achieve a given goal.

Dynamic Creation of new Objects. The creation of new objects is inherent in the domain of Semantic Web Service composition and has been discussed in [McDermott, 2002]. Semantic Web Services are processes that generate new objects as their output, and it is difficult to model such processes as classical planning operators².

1 Of course, Web Service invocation can always fail because of hardware or network failures (cf. Section 5.2.6).
2 Classical planning operators change the state of the world but they do not introduce new objects.

For example, the execution of the processes underlying the first-order theorem proving services introduced in Chapter 4 leads to the creation of an individual of the OWL class FoAtpResult. This individual is not known before the invocation of the service. A service composition procedure should be compatible with the dynamic creation of new objects.

6.2 Analysis of Existing Approaches

In Section 3.3 we presented the most prominent approaches towards automated Web Service composition. In this section we will evaluate these approaches with respect to the requirements described above.

6.2.1 Planning for Web Service Composition

We have already mentioned in Section 3.3.1 that the classical planning paradigm is, in principle, suitable for automated Web Service composition. In fact, some of the OWL-S service profiles presented in Chapter 4 look almost like classical planning operators. However, only a few planning systems support object types, and, typically, the type systems are flat and type hierarchies are not supported. Thus, a planning system for Web Service composition has to be chosen with care.

A few approaches for extending classical planning with stochastic actions have been presented (see Section 2.4.4.1). However, existing implementations only allow purely propositional domain descriptions with parameter-less action descriptions.

A fundamental assumption made by classical AI planners is that there is no uncertainty in the world: the planner has full knowledge of the conditions under which the plan will be executed and the outcome of every action is fully predictable. However, McDermott pointed out that the use of planning for Web Service composition requires branching plans that take the possible outcomes (or failures) of a Web Service invocation into account.

Conditional planning and conformant planning take all foreseeable contingencies into account. However, most contingency planners work only with actions with two possible outcomes (e.g., Warplan-C [Peot and Smith, 1992]) or they cannot enumerate several plans (e.g., the Cassandra planner [Pryor and Collins, 1996]). Furthermore, no existing contingency planner supports decision-theoretic reasoning.

Neither classical STRIPS planning, nor its extensions to nondeterministic and stochastic actions, support the introduction of new objects by planning operators. This is essentially due to the representation of world states with ground first-order literals.

6.2.2 Golog and the Situation Calculus

The Golog language has been used for the representation and execution of atomic and composite OWL-S services [McIlraith and Son, 2002]. However, no work on automatically composing atomic services in the situation calculus and Golog has been presented yet. In [Finzi et al., 2000] Finzi, Pirri and Reiter presented preliminary work on forward planning in the situation calculus. However, their approach has not been pursued any further. Generally, situation calculus planners cannot compete with classical planning systems because of the need for computationally expensive regression algorithms for determining the truth values of fluents.

Both the situation calculus and Golog are untyped languages and neither supports typed objects or type subsumption. The situation calculus only distinguishes between different sorts of objects, such as actions, situations and fluents. Type information could be expressed in the action precondition and successor state axioms of an action theory. However, this would require a high number of type-subsumption axioms which would lower the performance of situation calculus planners even more.

Actions with probabilistic effects are not supported in the standard Golog language but by the decision-theoretic extension DTGolog (see Section 2.4.5).

In the situation calculus and Golog, sensing actions and physical actions are treated equally in successor state axioms. Sensing actions have been studied intensively in the context of online Golog interpreters dealing with exogenous actions, i.e. actions performed by other agents [Soutchanski, 2001]. In DTGolog policies, every stochastic agent action is associated with a corresponding sensing action via senseEffect statements (see Section 2.4.5.2). These statements detect which outcome has actually occurred by executing the stochastic action.

Available Prolog implementations of Golog support the introduction of new objects since every Prolog variable can be bound to newly created atoms not present in the initial situation S0.

6.2.3 Classical Markov Decision Processes

MDPs are designed for decision-theoretic planning with stochastic agent actions. However, the use of classical MDPs for Web Service composition is problematic due to the limitations mentioned in Section 2.4.4.2: Standard MDPs require the explicit enumeration of discrete states and actions. This representation is not suitable for highly structured application domains such as Web Service composition. In factored MDPs, states are described with the help of Boolean variables. However, translating complex operator (Web Service) descriptions into propositional logic leads to very large state spaces with hundreds or thousands of propositional variables. Unlike planning, reasoning in MDPs uses a discrete action model, i.e. MDPs do not allow the use of parameterised actions. Expressing OWL-S processes as actions in MDPs is therefore not possible without some kind of propositional abstraction as it has been applied in [Doshi et al., 2005]. Last but not least, MDPs neither support state objects (or dynamically introduced objects) nor object types.

A solution to an MDP is an optimal policy which maximises the expected accumulated reward an agent receives. Theoretically, there can be several optimal policies with the same accumulated expected reward. However, typical algorithms for MDPs compute only one such policy. Furthermore, policies computed by classical MDP solvers are typically not human readable or editable.

6.2.4 Program Synthesis

Rao investigated the use of program synthesis in propositional Linear Logic for Web Service composition (see Section 3.3.4). A similar approach was presented in [Dixon et al., 2006]. Linear Logic (LL) is particularly suitable for a logic representation of Web Services because the creation of new objects by Web Services can simply be modelled as the introduction of new resources in LL. Rao uses a set of subtyping rules to represent the subsumption relationship between ontology classes defined in DAML-S. The LL theorem prover respects these subtyping rules and produces well-typed workflows. The approach described in [Dixon et al., 2006] does not support object types.

Rao mentions that the use of propositional LL limits the use of his framework. An extension to first-order LL might lead to a non-terminating or less efficient program synthesis. The work of Dixon and colleagues has the drawback that, in the synthesised workflows, atomic services occur as many times as their output is needed by subsequent services. This means that a service is invoked k times on the same inputs if its output is needed k times by other services. Neither the work of Rao nor the work of Dixon et al. supports Semantic Web Services with nondeterministic or probabilistic effects.

6.2.5 Summary

Table 6.1 summarises our discussion of existing Web Service composition approaches with respect to the particular requirements of our application domain. In the table, '++' denotes that a requirement is well-supported by a particular approach. '+' is used if the requirement is only met by a few implemented systems. '±' is used for requirements that have only been dealt with in theory or that are not available in any existing system. The fact that an approach does not support a requirement is expressed by '−'.

| Requirement | AI Planning | Golog | MDPs | Program Synthesis |
|---|---|---|---|---|
| Parameter types & subsumption | + | − | − | + |
| Probabilistic effects | ± | − | ++ | − |
| Sensing actions | ± | ++ | − | ± |
| Solution enumeration | + | − | − | − |
| Dynamic objects | − | ± | − | ++ |

Table 6.1: Existing service composition approaches with respect to the requirements of Semantic Web Reasoning Services

Table 6.1 shows that no single existing approach to Web Service composition can fulfil all requirements listed in Section 6.1. AI planning offers little support for nondeterministic and probabilistic effects. Classical MDPs are based on propositional logic representations and require the explicit enumeration of states and actions. Planning in the situation calculus has not been studied intensively due to efficiency problems. This is why Web Service composition in MathServe is based on a combination of planning and decision-theoretic reasoning in the situation calculus. This combination meets almost all requirements. Only the problem of the dynamic creation of new objects is not addressed by our approach. We overcome this problem with a workaround described in Section 6.4.2.2.

6.3 Automated Service Composition in MathServe

Service composition in MathServe combines classical AI planning with decision-theoretic reasoning in the situation calculus. Figure 6.1 shows the two stages of the composition procedure for a given query profile and a set of service profiles obtained from the service registry of the MathServe broker. In the first stage, the OWL-S profiles of the reasoning services and the query profile are translated into a planning domain and a planning problem for the classical planning system PRODIGY. The planner is asked to find suitable sequences of services that can potentially answer the query. We describe the translation and the planning process in Section 6.4.

[Figure 6.1: diagram. Inputs: OWL-S Query Profile, OWL-S Service Profiles. Stage 1 (Classical Planning): Planning Domain Generator, Planning Domain & Problem, PRODIGY Planner, Prodigy Plans. Stage 2 (Decision-Theoretic Reasoning): Golog Procedure Generator, Golog Procedure, SitCalc Domain Generator, SitCalc Action Domain, decision "Prob. Effects?" (Yes: Offline DTGolog Interpreter, Optimal Policy; No: Golog Procedure).]

Figure 6.1: The two stages of service composition in MathServe and the data flow between the different components

In the second stage, a Golog Procedure Generator represents the plans found by PRODIGY, together with the query provided, as nondeterministic choices in a single Golog procedure. In case the procedure uses reasoning services with probabilistic effects, an offline DTGolog interpreter evaluates the procedure with respect to a DTGolog action domain generated automatically from the OWL-S service profiles. The interpreter computes an optimal policy for the Golog procedure. Both the original procedure and the optimal policy can be executed in MathServe’s Golog interpreter (see Section 5.2.6).

In the remainder of this chapter we describe the two stages of MathServe’s service composition with the help of examples. The planning part is presented in Section 6.4 and the decision-theoretic reasoning part in Section 6.5. The computation of an optimal policy is described in Section 6.6. Our approach requires a translation of OWL-S service profiles to planning domains and stochastic situation calculus action domains, and the translation of OWL-S query profiles to planning problems and Golog procedures. For the sake of readability, all translation functions described in the following sections are formally defined in Appendix E.
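The control flow of the second stage can be outlined in Python. This is a deliberately crude stand-in and not the DTGolog policy computation: alternatives are scored by the product of assumed per-service success probabilities, whereas a real policy also conditions on sensed outcomes. The function names, the tuple-based procedure representation, and all service names other than TptpAnalyser are hypothetical.

```python
from math import prod


def build_procedure(plans: list) -> tuple:
    """Represent alternative plans as one nondeterministic choice of sequences."""
    return ("choice", [("seq", list(plan)) for plan in plans])


def has_probabilistic_effects(procedure: tuple, success_prob: dict) -> bool:
    """True if any service in the procedure has an entry in `success_prob`."""
    _, alternatives = procedure
    return any(service in success_prob
               for _, steps in alternatives
               for service in steps)


def best_alternative(procedure: tuple, success_prob: dict) -> tuple:
    """Pick the sequence with the highest estimated success probability."""
    _, alternatives = procedure

    def score(alternative):
        # services without an entry are treated as deterministic (probability 1)
        return prod(success_prob.get(service, 1.0) for service in alternative[1])

    return max(alternatives, key=score)


def compose(plans: list, success_prob: dict) -> tuple:
    """Stage 2 in outline: build the procedure, refine it only if needed."""
    procedure = build_procedure(plans)
    if has_probabilistic_effects(procedure, success_prob):
        return best_alternative(procedure, success_prob)
    return procedure
```

If no service has probabilistic effects, the procedure with its choice points is returned unchanged, which corresponds to the "No" branch in Figure 6.1.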

6.4 Planning with Deterministic Agent Actions

In this section we describe the use of the classical planning system PRODIGY for automated composition of reasoning services. We show how our domain ontology, atomic processes and queries can be translated (fully automatically) into type hierarchies, planning domains, and planning problems, respectively. Only static preconditions and classical (deterministic) effects of services are represented in the planning domain. Dynamic preconditions of reasoning services are ignored because their truth cannot be determined during the planning process. Disjunctive and probabilistic effects are also ignored because they are not supported by PRODIGY. The most important reasons for choosing the PRODIGY system were that:

1. The system works with a hierarchical type system which allows us to incorporate the classes and subsumption relationship of MathServe’s domain ontology on the object level,

2. it can enumerate several, structurally different, plans for a given problem,

3. it returns the shortest plans first, and

4. it allows the user to control the search process with declarative control knowledge.

These features distinguished PRODIGY from two other planning systems we evaluated: a proprietary partial-order planner and the SGP (Sensory Graphplan) system [Weld et al., 1998]. These systems were hard to use, had no type system, and could not enumerate plans.

6.4.1 The PRODIGY System

PRODIGY [Veloso et al., 1995] is a general-purpose problem solving architecture that has been used for research on planning as well as machine learning. We employ version 4.0 of PRODIGY, which we use as a linear planner incorporating a means-ends analysis. A world state in PRODIGY is represented as a set of first-order ground atoms. Constants (objects) and variables have a type assigned to them. Types are declared in planning domains in a hierarchical structure with single inheritance. PRODIGY’s planning operators describe the preconditions and effects of agent actions. Preconditions are either single literals or conjunctions, disjunctions, or universal or existential quantifications of formulae. Effects are STRIPS-style add and delete lists. PRODIGY also allows for conditional effects, i.e. effects with secondary preconditions [Pednault, 1991]. Although PRODIGY has not been maintained since 1998, the planner turned out to be rather stable and easy to use and extend.

Figure 6.2 shows the PRODIGY description of the PICK-UP operator from the Blocksworld domain [Nilsson, 1980] together with the declaration of the type OBJECT³. PICK-UP has one parameter () of type OBJECT and three preconditions on this parameter. The effects of PICK-UP contain the literals that are deleted from the world state (del) and the literals that are added (add) when the operator is applied.

(ptype-of OBJECT :top-type)

(OPERATOR PICK-UP
  (params )
  (preconds
    (( OBJECT))
    (and (clear )
         (on-table )
         (arm-empty)))
  (effects
    () ; no vars generated in effects list
    ((del (on-table ))
     (del (clear ))
     (del (arm-empty))
     (add (holding )))))

Figure 6.2: The PRODIGY definition of the type OBJECT and the operator PICK-UP in the Blocksworld domain
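To make the add/delete-list semantics of the operator in Figure 6.2 concrete, the following Python sketch applies PICK-UP to a world state given as a set of ground atoms. It is not PRODIGY code; atoms are encoded as tuples and `pick_up` is a hypothetical helper.

```python
def pick_up(state: frozenset, ob: str) -> frozenset:
    """Apply PICK-UP to `ob`: check the preconditions, then delete and add literals."""
    preconditions = {("clear", ob), ("on-table", ob), ("arm-empty",)}
    if not preconditions <= state:
        unmet = sorted(preconditions - state)
        raise ValueError(f"PICK-UP not applicable to {ob}: unmet {unmet}")
    delete_list = preconditions            # PICK-UP deletes exactly its three preconditions
    add_list = {("holding", ob)}
    return (state - delete_list) | add_list
```

Applied to a state in which block a is clear and on the table and the arm is empty, the only atom about a that remains is (holding a); atoms about other blocks are left untouched.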

Planning problems in PRODIGY define a set of typed objects, the initial state, and the goal to be achieved. The initial state and the goal state are conjunctions of literals. The following is a simple problem description in the Blocksworld domain:

(create-problem
  (name example-blocks)
  (objects (a b c OBJECT))
  (state (and (on-table a) (on c a) (on b c) (clear b) (arm-empty)))
  (goal (and (on c b) (on b a))))

PRODIGY provides different flags to control the search for plans. Among others, the user can tell the planner to:

-- use depth-first or breadth-first search,

-- search for a single solution or multiple solutions,

-- perform linear search (no goal interactions) or not, and

-- limit the maximum depth of the search tree.

3 The type :top-type is the root of PRODIGY’s type hierarchy.

To make plan search more efficient, decision points in PRODIGY can be influenced by the user with declarative control rules. The user can define control rules to prefer, select, or reject the current pending goal, an applicable operator, or certain bindings of an applicable operator. The following rule from the Blocksworld domain, for instance, selects PICK-UP in case the current goal is to hold an object and the object is on the table:

(Control-Rule SELECT-OP-PICKUP-FOR-HOLDING
  (if (and (current-goal (holding ))
           (true-in-state (on-table ))))
  (then select operator PICK-UP))

Furthermore, PRODIGY allows the user to define new meta-predicates (essentially LISP functions) that can be used in the condition part of control rules. For more detailed information about the PRODIGY system we refer to [Veloso et al., 1995] and the system manual [Blythe et al., 1992].

6.4.2 PRODIGY for Web Service Composition

In this section we show how our domain ontology, OWL-S profiles of atomic services and queries are translated into PRODIGY types, planning operators, and planning problems, respectively. Formal translation functions are defined in Appendix E. We will depict the translation process with the help of examples and refer to the corresponding translation functions when necessary.

6.4.2.1 Ontology Classes and Prodigy Types

We translate the simple, named classes of our domain ontology to PRODIGY type declarations. Types in the original PRODIGY system are organised in a tree structure with single inheritance. To be able to model OWL-DL classes with multiple simple named superclasses, we extended PRODIGY’s type system to allow multiple inheritance. With this extension, types form a directed acyclic graph. Our translation replaces the pre-defined PRODIGY supertype (:top-type) by OWL-Thing, which corresponds to the superclass of all OWL classes. Figure 6.3 shows a small fragment of the class assertions of the MathServe ontology and the corresponding PRODIGY type declarations.

As we can see from Figure 6.3, abstract classes resulting from property restrictions are ignored by our translation. OWL-S service descriptions can define new local OWL classes which can be used in the description of processes and profiles. Local classes are treated as if they had been defined in the MathServe ontology and are also translated into PRODIGY types. Formally, this translation is performed by the translation function τcl defined in Definition E 5.1 (see Appendix E).
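The class-to-type translation and the resulting subsumption test can be sketched in Python. This is not the function τcl of Appendix E: the input is assumed to be a dictionary from a named class to its named superclasses (property restrictions already dropped), the naming scheme is read off Figure 6.3, and emitting one ptype-of declaration per superclass is an assumption about how multiple inheritance might be written down.

```python
def prodigy_name(owl_name: str) -> str:
    """Map e.g. 'mw#Problem' to 'MW-Problem', following the names in Figure 6.3."""
    prefix, local = owl_name.split("#", 1)
    return f"{prefix.upper()}-{local}"


def type_declarations(superclasses: dict) -> list:
    """Emit one ptype-of declaration per (class, named superclass) pair."""
    lines = ["(ptype-of OWL-Thing :top-type)"]
    for cls, parents in superclasses.items():
        for parent in parents:
            lines.append(f"(ptype-of {prodigy_name(cls)} {prodigy_name(parent)})")
    return lines


def is_subtype(superclasses: dict, sub: str, sup: str) -> bool:
    """Reflexive, transitive subsumption test on the acyclic type graph."""
    if sub == sup:
        return True
    return any(is_subtype(superclasses, parent, sup)
               for parent in superclasses.get(sub, []))
```

With such a test, an output of type C1 may feed an input of type C2 exactly when `is_subtype(hierarchy, C1, C2)` holds, which is the compatibility rule stated in Section 6.1.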

(a)

mw#Problem ⊑ owl#Thing
mw#Proof ⊑ owl#Thing
mw#FormalProof ⊑ mw#Proof
mw#NDProof ⊑ mw#FormalProof ⊓ ∀ mw#calculus. mw#ND-Calculus
mw#ProvingProblem = mw#Problem
mw#FOProvingProblem = mw#ProvingProblem ⊓ ∀ mw#logic. mw#FOLogic
mw#TptpProblem = mw#FOProvingProblem ⊓ mw#language : mw#TPTP

(b)

(ptype-of OWL-Thing :top-type)
(ptype-of MW-Problem OWL-Thing)
(ptype-of MW-Proof OWL-Thing)
(ptype-of MW-FormalProof MW-Proof)
(ptype-of MW-NDProof MW-FormalProof)
(ptype-of MW-ProvingProblem MW-Problem)
(ptype-of MW-FOProvingProblem MW-ProvingProblem)
(ptype-of MW-TptpProblem MW-FOProvingProblem)
(ptype-of MW-TptpFOFProblem MW-TptpProblem)

Figure 6.3: (a) A small fragment of the MathServe ontology and (b) its translation to PRODIGY type declarations

6.4.2.2 Translating Atomic Processes to Planning Operators

Using the types shown above we can translate OWL-S service profiles of atomic processes to PRODIGY planning operators. For this, the input and output parameters of a profile become the parameters of the planning operator. All deterministic (classical) preconditions and effects of the profile become the preconditions and effects of the planning operator. Since reasoning services in MathServe do not have negative effects (cf. Section 3.4.2), all planning operators created will have empty delete lists. Dynamic preconditions and disjunctive (or probabilistic) effects are ignored by the translation.

Example 6.4.1 We illustrate the translation with an example. Figure 6.4 shows the service profile TrampNDforFOF introduced in Section 4.8.2 (page 103), and the planning operator that is automatically generated from it. The inputs and outputs of the process TrampNDforFOF become the parameters of the planning operator. The static preconditions and deterministic conjunctive effects of the profile are likewise translated to preconditions and effects in the PRODIGY operator (see Section 3.4.3.3). Dynamic preconditions are statements about properties of concrete values of input parameters. In general, they cannot be evaluated at planning time. Therefore, they are not part of the operator description. For example, the precondition status(atp result, mw#Unsatisfiable) of the service TrampNDforFOF is dynamic because the value of the property status is not known at planning time. Consequently, this precondition does not occur in the corresponding planning operator.

In Section 3.3.1 we noted that classical planners, such as PRODIGY, work only on a static set of objects while Web Services produce new objects. Therefore, we have to introduce a number of objects in the (initial) planning domain: For every output parameter of an atomic OWL-S profile we create a fixed number k of new objects of the appropriate type. Introducing a fixed set of objects for a certain type implies that the number of possible applications of an operator in a single plan is restricted.4 In PRODIGY new domain objects are created with the pinstance-of construct. For the output nd proof of the process TrampNDforFOF, for instance, we introduce the following statements after the operator declaration:

(pinstance-of nd_proof_1 MW-TwegaNDProof)
(pinstance-of nd_proof_2 MW-TwegaNDProof)
...
(pinstance-of nd_proof_k MW-TwegaNDProof)
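As a rough illustration of this step, the following Python sketch emits the k object declarations for one output parameter. The helper name output_objects and the naming scheme with a numeric suffix are assumptions chosen to match the statements above; how MathServe actually picks k is not specified here.

```python
# Sketch: create a fixed pool of k planning objects for one output parameter.
# The helper name and its signature are hypothetical.

def output_objects(param_name: str, prodigy_type: str, k: int):
    """Return k pinstance-of statements named param_1 ... param_k."""
    return [
        f"(pinstance-of {param_name}_{i} {prodigy_type})"
        for i in range(1, k + 1)
    ]

pool = output_objects("nd_proof", "MW-TwegaNDProof", 3)
```

Because the pool is fixed before planning starts, an operator whose output has this type can only be applied as often as unbound objects of the type remain.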

We also introduce a special predicate bound which indicates that an object is bound to an output parameter of an applied operator. The planning operator for TrampNDforFOF has additional preconditions requiring all the objects representing inputs of the process to be bound and the object representing the output of the profile to be unbound5. The effects of the planning operator contain the static effects of the atomic profile as well as additional bound statements for all outputs of the profile. Free variables in the preconditions and effects of a process are simply declared with type OWL-Thing, i.e. they can be bound to objects of any type.

4 However, more than one process might trigger the introduction of objects of a certain type C and, therefore, an operator could be applied more than k times.
5 Negation in PRODIGY is written as ’∼’ and represents the absence of a fact in the current world description (the closed-world assumption).

(a)
profile TrampNDforFOF:
  inputs:   fof problem :: mw#TptpFOFProblem
            atp result :: mw#FOBrFPResult
            delta relation :: mw#DeltaRelation
  outputs:  nd proof :: mw#TwegaNDProof
  preconds: resultFor(atp result, cnf problem) ∧
            status(atp result, mw#Unsatisfiable) ∧
            cnfFor(cnf problem, fof problem) ∧
            relatesCNF(delta relation, cnf problem) ∧
            toFOF(delta relation, fof problem)
  effects:  proofOf(nd proof, fof problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserve.owl

(b)
(OPERATOR TrampNDforFOF
  (params )
  (preconds
    (( MW-TptpFOFProblem) ( MW-FOBrFPResult) ( MW-DeltaRelation) ( MW-TwegaNDProof) ( OWL-Thing))
    (and (bound ) (bound ) (bound ) (~ (bound ))
         (resultFor ) (cnfFor ) (relatesCNF ) (toFOF )))
  (effects ()
    ((add (bound )) (add (proofOf )))))

Figure 6.4: (a) The service profile of TrampNDforFOF and (b) the corresponding PRODIGY planning operator

PRODIGY treats all parameters of planning operators equally and binds them to objects of the specified parameter type or of a subtype thereof. The class of an OWL-S output parameter, however, expresses that the corresponding service produces individuals of exactly that class. To reflect this in our translation we have to ensure that PRODIGY parameters corresponding to OWL-S output parameters are not bound to objects of a proper subtype of the parameter’s type. We do this by rejecting certain bindings of the corresponding planning operator by means of a control rule. The control rule generated for the operator TrampNDforFOF, for instance, looks as follows:

(Control-Rule reject-wrong-binding-for-TrampNDforFOF
  (if (and (applicable-operator (TrampNDforFOF ))
           (~ (exact-type-of-object MW-TwegaNDProof))))
  (then sub-goal))

The newly defined binary meta-predicate exact-type-of-object evaluates to true if and only if the type of the first argument is exactly the type given as the second argument. The rule fires if the operator TrampNDforFOF is applicable but the object bound to the output parameter is not of the exact type MW-TwegaNDProof. The expression sub-goal in the then-part tells the planner not to apply the applicable operator but to try matching other operators.6 Similar control rules are generated automatically for every planning operator and are added to the planning domain. The formal translation function τsp is defined in Definition 5.3 (see Section E.1).
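The difference between PRODIGY's ordinary subtype-based binding and the exact-type check can be sketched as follows. The type graph is taken from Figure 6.3; the Python function names and the dictionary encoding are assumptions made for illustration, and the real meta-predicate is a LISP function.

```python
# Sketch: ordinary (subtype-tolerant) binding versus the exact-type check.
# PARENTS encodes part of the type graph from Figure 6.3; a type may list
# several parents because the extended type system is a DAG.
PARENTS = {
    "MW-Problem": ["OWL-Thing"],
    "MW-ProvingProblem": ["MW-Problem"],
    "MW-FOProvingProblem": ["MW-ProvingProblem"],
    "MW-TptpProblem": ["MW-FOProvingProblem"],
    "MW-TptpFOFProblem": ["MW-TptpProblem"],
}

def is_subtype(object_type: str, required_type: str) -> bool:
    """True if object_type equals required_type or inherits from it."""
    if object_type == required_type:
        return True
    return any(is_subtype(p, required_type) for p in PARENTS.get(object_type, []))

def exact_type_of_object(object_type: str, required_type: str) -> bool:
    """True only if the object's type is exactly the required type."""
    return object_type == required_type

# An object of a proper subtype is an acceptable binding for an input ...
input_ok = is_subtype("MW-TptpFOFProblem", "MW-TptpProblem")
# ... but would be rejected for an output parameter of that type.
output_ok = exact_type_of_object("MW-TptpFOFProblem", "MW-TptpProblem")
```

In this sketch a binding for an output parameter is kept only when exact_type_of_object holds, which mirrors the effect of the generated control rule.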

6.4.2.3 Translating Queries to Planning Problems

The translation of query profiles into PRODIGY planning problems is very similar to the translation of atomic OWL-S profiles described in the previous section. However, in the case of queries the input and output parameters of a query profile are declared as new objects in the PRODIGY problem. The preconditions of the profile are represented as a conjunction of PRODIGY atoms in the initial state. The goal state is a conjunction consisting of a list of atoms corresponding to the static effects of the query profile and bound statements for the outputs of the query. Values of input parameters are OWL individuals of the corresponding types. The timeout provided by a query cannot be expressed in the PRODIGY problem itself, but it is used to compute the time resources available for the planning attempt.

Example 6.4.2 Figure 6.5 shows the query ndQuery1 introduced in Section 5.1.1 and the corresponding planning problem. The value of the input parameter puz problem is ignored during planning.

We refer the interested reader to the translation functions τqp and τcompl (see Definitions 5.4 and 5.5 in Appendix E.1) which produce planning problems and complete PRODIGY planning domains, respectively. Appendix F contains a complete PRODIGY domain.

6 See the PRODIGY manual [Blythe et al., 1992] for details about PRODIGY’s search process.

(a)
query ndQuery1:
  inputs:   puz problem :: mw#TptpFOFProblem = ⋆tptp#PUZ0001+1
  outputs:  my proof :: mw#NDProof
  preconds:
  effects:  proofOf(my proof, puz problem)
  categs:
  timeout:  30secs
  mw = http://www.mathweb.org/owl/mathserve.owl
  tptp = http://www.tptp.org/Problems

(b)
(create-problem
  (name ndQuery1)
  (objects (puz_problem MW-TptpFOFProblem)
           (my_proof MW-NDProof))
  (state)
  (goal (and (bound my_proof)
             (proofOf my_proof puz_problem))))

Figure 6.5: (a) The query ndQuery1 and (b) the corresponding PRODIGY problem
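The query translation can be approximated by a small generator that assembles the create-problem form from the parts of a query profile. This is a sketch under stated assumptions, not the translation function τqp: the function create_problem, its argument layout, and the representation of atoms as tuples are hypothetical.

```python
# Sketch: assemble a PRODIGY problem from the parts of a query profile.
# Atoms are tuples such as ("proofOf", "my_proof", "puz_problem").
# The function and its signature are illustrative assumptions.

def create_problem(name, objects, initial_atoms, static_effects, outputs):
    object_decls = " ".join(f"({obj} {typ})" for obj, typ in objects)
    state = " ".join(f"({' '.join(atom)})" for atom in initial_atoms)
    # The goal consists of bound statements for all outputs followed by
    # the static effects of the query profile.
    goal_atoms = [f"(bound {out})" for out in outputs]
    goal_atoms += [f"({' '.join(atom)})" for atom in static_effects]
    return "\n".join([
        "(create-problem",
        f"  (name {name})",
        f"  (objects {object_decls})",
        f"  (state{' ' + state if state else ''})",
        f"  (goal (and {' '.join(goal_atoms)})))",
    ])

problem = create_problem(
    name="ndQuery1",
    objects=[("puz_problem", "MW-TptpFOFProblem"), ("my_proof", "MW-NDProof")],
    initial_atoms=[],  # ndQuery1 has no preconditions
    static_effects=[("proofOf", "my_proof", "puz_problem")],
    outputs=["my_proof"],
)
```

For ndQuery1 the result has the same content as part (b) of Figure 6.5, up to line breaks.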

6.4.2.4 Search for Plans

Given a set of OWL-S services and a query, the broker automatically generates a planning domain and a planning problem, and runs PRODIGY in depth-first search mode. In this mode the search tree will not grow deeper than a limit set by the MathServe broker. PRODIGY’s search space contains five choice points for every operator application7. Therefore, running the planner with a depth bound of 30, for instance, will result in plans no longer than six steps. Determining a good heuristic for the depth limit is a difficult task. Currently, MathServe’s service composer follows a simple heuristic. It assumes that it is unlikely that operators of exactly the same service category are applied more than twice in a sequence of consecutive operators. Therefore, the depth limit is set to ten times the number of different categories of services encoded in the planning domain.

To achieve a higher efficiency in search we start PRODIGY in linear search mode, in which it works as a non-interleaving planner. Interleaving goals cannot occur in our domain because our planning operators do not delete any facts from the world description. Thus, no planning operator can pose a threat to the preconditions of previously applied operators.

The broker tells PRODIGY to return all solutions it can find within the given time and depth limits. Furthermore, we ask for different solutions. This ensures that all the plans returned by PRODIGY are structurally different, i.e. two plans will never have the same sequence of instantiated operators.

7 A planning cycle in PRODIGY consists of five steps: choosing a node in the search tree, choosing a pending goal, choosing an operator that can achieve that goal, choosing a suitable binding, and either applying one or more operators or continuing to produce new subgoals.
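The arithmetic behind the depth limit can be made explicit with a short sketch. Reading the factor ten as five choice points times at most two applications per service category is an interpretation of the heuristic described above; the constant and function names are hypothetical.

```python
# Sketch: depth-limit heuristic of the service composer.
# Assumption: the factor 10 is read as 5 choice points per operator
# application times at most 2 applications per service category.
CHOICE_POINTS_PER_STEP = 5
MAX_APPLICATIONS_PER_CATEGORY = 2

def depth_limit(num_categories: int) -> int:
    """Depth bound handed to the planner for a given number of categories."""
    return CHOICE_POINTS_PER_STEP * MAX_APPLICATIONS_PER_CATEGORY * num_categories

def max_plan_length(depth_bound: int) -> int:
    """Longest plan (in operator applications) that fits into the bound."""
    return depth_bound // CHOICE_POINTS_PER_STEP

limit_for_three_categories = depth_limit(3)
longest_plan = max_plan_length(30)
```

With three service categories the bound is 30, which admits plans of at most six operator applications, matching the figures given in the text.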

Example 6.4.3 We assume that the broker’s service registry contains only advertisements of the ATP services EpATP, OtterATP, SpassATP, VampireATP, and of the transformation services TrampCNF, TrampNDforFOF, and Otterfier described in Chapter 4. Given the query ndQuery1 shown in Figure 6.5 (a), the broker generates a planning domain and a planning problem. Figure 6.6 contains the four plans found by PRODIGY with a depth limit of 30 and a time limit of 10 seconds. All plans use TrampCNF to create the clause normal form of the input problem but they differ in the choice of the ATP service used. In the case of EpATP, SpassATP, and VampireATP the Otterfier service has to be used to transform resolution proofs into the calculus understood by TrampNDforFOF. If OtterATP is used this transformation is not necessary.

TrampCNF(puz_problem cnf_problem_2 delta_relation_3) ;
EpATP(cnf_problem_2 time_res_8 atp_result_7) ;
Otterfier(atp_result_7 atp_result_5) ;
TrampNDforFOF(puz_problem atp_result_5 delta_relation_3 my_proof)

TrampCNF(puz_problem cnf_problem_2 delta_relation_3) ;
SpassATP(cnf_problem_2 time_res_8 atp_result_7) ;
Otterfier(atp_result_7 atp_result_5) ;
TrampNDforFOF(puz_problem atp_result_5 delta_relation_3 my_proof)

TrampCNF(puz_problem cnf_problem_2 delta_relation_3) ;
VampireATP(cnf_problem_2 time_res_8 atp_result_7) ;
Otterfier(atp_result_7 atp_result_5) ;
TrampNDforFOF(puz_problem atp_result_5 delta_relation_3 my_proof)

TrampCNF(puz_problem cnf_problem_2 delta_relation_3) ;
OtterATP(cnf_problem_2 time_res_8 atp_result_9) ;
TrampNDforFOF(puz_problem atp_result_9 delta_relation_3 my_proof)

Figure 6.6: Four plans found by PRODIGY for query ndQuery1

6.5 Service Profiles and Plans as DTGolog Domains

In this and the following section we describe the second stage of MathServe’s service composition process in which the probabilistic effects of reasoning services are taken into account. From the OWL-S service profiles and the classical plans found by PRODIGY the MathServe broker generates a stochastic situation calculus action domain and a Golog procedure representing all plans. An offline DTGolog interpreter computes an optimal policy for the procedure.

In the following sections, the translation of OWL-S profiles and their parts into situation calculus formulae is illustrated with examples. Formal translation functions are defined in Appendix E.2. We will refer to the definitions of these functions when appropriate.

6.5.1 Generating DTGolog Action Domains

The DTGolog domain generator (see Figure 6.1) computes a DTGolog action domain from OWL-S service profiles. For every reasoning service, action precondition and successor state axioms are generated.

The generation of action precondition axioms is straightforward. SWRL formulae can be translated directly into situation calculus formulae by adding a situation argument. However, one special requirement must be met: variables in preconditions that do not occur in the input and output parameters of a service have to be existentially quantified. For example, the action precondition axiom for the service TrampNDforFOF contains an existentially quantified variable u for the OWL-S variable cnf problem:

Poss(trampNDforFOF(x, y, z), s) ≡ ∃u. resultFor(y, u, s) ∧ status(y, stat#Unsatisfiable, s) ∧ cnfFor(u, x, s) ∧ relatesCNF(z, u, s) ∧ toFOF(z, x, s).
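Which variables need an existential quantifier can be determined mechanically. The following sketch collects the variables of a precondition that are not service parameters. The tuple encoding of atoms and the convention that constants contain a '#' are assumptions made for this example; it is not the translation function σpre.

```python
# Sketch: find precondition variables that must be existentially quantified.
# Atoms are tuples (predicate, arg1, arg2, ...). Treating every argument
# that contains '#' as a constant is an assumption of this sketch.

def existential_variables(precondition_atoms, parameters):
    """Return variables occurring in the atoms but not among the parameters."""
    found = []
    for _predicate, *args in precondition_atoms:
        for arg in args:
            is_constant = "#" in arg
            if not is_constant and arg not in parameters and arg not in found:
                found.append(arg)
    return found

tramp_nd_preconds = [
    ("resultFor", "atp_result", "cnf_problem"),
    ("status", "atp_result", "mw#Unsatisfiable"),
    ("cnfFor", "cnf_problem", "fof_problem"),
    ("relatesCNF", "delta_relation", "cnf_problem"),
    ("toFOF", "delta_relation", "fof_problem"),
]
tramp_nd_params = {"fof_problem", "atp_result", "delta_relation", "nd_proof"}
quantified = existential_variables(tramp_nd_preconds, tramp_nd_params)
```

For TrampNDforFOF the only such variable is cnf_problem, which corresponds to the quantified variable u in the axiom above.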

Action precondition axioms are generated from OWL-S service profiles with the help of the translation function σpre (see Definition 5.9, p 240).

Successor state axioms are created following Reiter’s solution for the frame problem (see Section 2.4.1). First, the set $\mathcal{F}$ of situation calculus fluents of the domain is determined from the classical and probabilistic effects of all available service profiles (see Definition 5.10, p 240). For every fluent $F \in \mathcal{F}$ the domain generator computes the positive normal form effect axiom $\gamma_F^{+}(\vec{x}, a, s)$ (see Definition 5.13). Since we do not allow negative effects in OWL-S service profiles, the negative normal form effect axiom $\gamma_F^{-}(\vec{x}, a, s)$ is always false and can be ignored. Following the causal completeness assumption we obtain the general successor state axiom

$F(\vec{x}, do(a, s)) \equiv \gamma_F^{+}(\vec{x}, a, s) \lor F(\vec{x}, s).$

Figure 6.7 contains the OWL-S profile of the service TrampCNF which has been presented in Section 4.3.1. The effects of the service define the three fluents cnfFor, relatesCNF and toFOF. The fluent cnfFor is also in the effect list of the service FlotterCNF (see Section 4.3) while relatesCNF and toFOF do not occur in the effects of any other service. Thus, we obtain the successor state axioms

cnfFor(x2, x1, do(a, s)) ≡ (a = trampCNF(x1, x2, x3) ∨ a = flotterCNF(x1, x2)) ∨ cnfFor(x2, x1, s),

relatesCNF(x3, x2, do(a, s)) ≡ (a = trampCNF(x1, x2, x3)) ∨ relatesCNF(x3, x2, s), and

toFOF(x3, x1, do(a, s)) ≡ (a = trampCNF(x1, x2, x3)) ∨ toFOF(x3, x1, s).
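Because no service deletes facts, a fluent holds after an action exactly when the action adds it or it already held before. The following sketch evaluates ground fluents over a situation represented as a tuple of ground actions. This representation, the function name holds, and the object names p1, c1, d1, r1 are assumptions of the example and not part of the DTGolog domain.

```python
# Sketch: evaluate an add-only successor state axiom for ground fluents.
# A situation is a tuple of ground actions executed from S0; add_effects
# maps each ground action to the set of ground fluents it makes true.

def holds(fluent, situation, initial_facts, add_effects):
    """F(do(a, s)) holds iff a adds F or F already holds in s."""
    if not situation:
        return fluent in initial_facts
    *earlier, last_action = situation
    if fluent in add_effects.get(last_action, set()):
        return True
    return holds(fluent, tuple(earlier), initial_facts, add_effects)

add_effects = {
    "trampCNF(p1, c1, d1)": {"cnfFor(c1, p1)", "relatesCNF(d1, c1)", "toFOF(d1, p1)"},
}
s1 = ("trampCNF(p1, c1, d1)",)
s2 = s1 + ("epATP(c1, r1)",)  # a later action does not remove earlier facts

holds_after_tramp = holds("cnfFor(c1, p1)", s1, set(), add_effects)
still_holds_later = holds("cnfFor(c1, p1)", s2, set(), add_effects)
holds_initially = holds("cnfFor(c1, p1)", (), set(), add_effects)
```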

Similar successor state axioms are created for all fluents in the domain.

profile TrampCNF:
  inputs:   fof problem :: mw#TptpFOFProblem
  outputs:  cnf problem :: mw#TptpCNFProblem
            delta relation :: mw#DeltaRelation
  preconds:
  effects:  cnfFor(cnf problem, fof problem) ∧
            relatesCNF(delta relation, cnf problem) ∧
            toFOF(delta relation, fof problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserve.owl

Figure 6.7: Service profile of TrampCNF

profile EpATP:
  inputs:   tptp problem :: mw#TptpProblem
  outputs:  atp result :: mw#FoAtpResult
  preconds:
  effects:  resultFor(atp result, tptp problem).
            (status(atp result, stat#Theorem) ∨
             status(atp result, stat#Unsatisfiable) ∨
             status(atp result, stat#Satisfiable) ∨
             status(atp result, stat#CounterSatisfiable) ∨
             status(atp result, stat#Unknown)).
  categs:
  params:
            problemClass(tptp problem, mw#FOF NKC EPR) → status(atp result, stat#Theorem) (0.80) (12160ms)
            problemClass(tptp problem, mw#CNF NKS RFO PEQ UEQ) → status(atp result, stat#Unsatisfiable) (0.85) (3412ms)
            . . .
  stat = http://www.mathweb.org/owl/status.owl
  mw = http://www.mathweb.org/owl/mathserve.owl

Figure 6.8: The service EpATP with some of its probabilistic effects. The probabilistic effects are derived from the performance data in Table C.1. No probabilistic effects are specified for the literal status(atp result, stat#Unknown)

For services with disjunctive effects the domain generator introduces additional axioms that specify the possible outcomes (nature’s choices as introduced in Section 2.4.5.1) of the service (also see Definition 5.11, p 241). For every literal in the disjunctive effects of a service the domain generator creates a choice for nature, i.e. a new deterministic action representing the outcome given by the literal. We assume that the choices provided by a nondeterministic action are independent of the situation in which the action is executed. In the decision-theoretic situation calculus this is represented using the defined predicate choice. The different choices are distinguished with the help of the predicate senseCond. For example, the service EpATP, shown in Figure 6.8, has a disjunctive effect which implies five choices for nature:

choice(epATP(x, y), epATP1(x, y)) ∧

choice(epATP(x, y), epATP2(x, y)) ∧ . . . ∧

choice(epATP(x, y), epATP5(x, y))

The choices are associated with the corresponding literals by the following axioms:

senseCond (epATP1(x, y), status(y, stat#Theorem )).

senseCond (epATP2(x, y), status(y, stat#Unsatisfiable )).

senseCond (epATP3(x, y), status(y, stat#Satisfiable )).

senseCond (epATP4(x, y), status(y, stat#CounterSatisfiable )).

senseCond (epATP5(x, y), status(y, stat#Unknown )).

Technically, axioms for nature’s choices are generated by the functions σch and σsc (see Definition 5.16, p 243).

By default, all choices occur with the same probability. If a service provides probabilistic effects that correspond to some of the choices (contain the same literal as a choice) then the probabilities in these effects are chosen for the corresponding choices. In the case of conditional probabilistic effects, the conclusion has to match a choice literal. The service EpATP provides probabilistic effects for choices epATP1 to epATP4. The probability axioms generated for the first two choices are

prob(epATP1(x, y), s) = pr ≡ (problemClass(x, stat#FOF NKC EPR, s) ∧ pr = 0.80) ∨ . . . ∨ (problemClass(x, stat#CNF NKS RFO PEQ UEQ, s) ∧ pr = 0).

prob(epATP2(x, y), s) = pr ≡ (problemClass(x, stat#FOF NKC EPR, s) ∧ pr = 0) ∨ . . . ∨ (problemClass(x, stat#CNF NKS RFO PEQ UEQ, s) ∧ pr = 0.85).

Similar statements are created for choices epATP3 and epATP4. The unspecified probability for choice epATP5 is computed from the probabilities provided:

prob(epATP5(x, y), s) = pr ≡ prob(epATP1(x, y), s) = pr1 ∧ prob(epATP2(x, y), s) = pr2 ∧ prob(epATP3(x, y), s) = pr3 ∧ prob(epATP4(x, y), s) = pr4 ∧ pr = (1 − pr1 − pr2 − pr3 − pr4).
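The way unspecified probabilities are filled in can be sketched as follows. The function choice_probabilities is hypothetical. The text only fixes two cases, namely no probabilities given (uniform distribution) and a single unspecified choice (the remaining mass); splitting the remaining mass evenly among several unspecified choices is an assumption of this sketch. The zero probabilities used for epATP3 and epATP4 in the example are likewise assumed values.

```python
# Sketch: complete the probabilities of nature's choices for one condition.
# Assumption: remaining probability mass is split evenly among all choices
# without an explicit probability.

def choice_probabilities(choices, specified, eps=1e-9):
    total = sum(specified.values())
    if total > 1.0 + eps:
        raise ValueError("specified probabilities exceed 1.0")
    unspecified = [c for c in choices if c not in specified]
    if not unspecified:
        if abs(total - 1.0) > eps:
            raise ValueError("probabilities of all choices must add up to 1.0")
        return dict(specified)
    share = max(0.0, 1.0 - total) / len(unspecified)
    return {c: specified.get(c, share) for c in choices}

choices = ["epATP1", "epATP2", "epATP3", "epATP4", "epATP5"]

# Problem class CNF NKS RFO PEQ UEQ: 0.85 for epATP2 is taken from Figure 6.8,
# 0 for epATP1 from the axiom above; the values for epATP3 and epATP4 are assumed.
probs = choice_probabilities(
    choices, {"epATP1": 0.0, "epATP2": 0.85, "epATP3": 0.0, "epATP4": 0.0}
)
uniform = choice_probabilities(choices, {})
```

Under these assumed values the choice epATP5 (status Unknown) receives the remaining probability of about 0.15.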

The author of a service profile with probabilistic effects has to ensure that, for a particular condition (e.g., problemClass(x, C1)), the probabilities for all choices add up to 1.0. If the probabilities for one or more choices are not explicitly defined then the sum of all defined probabilities has to be less than or equal to 1.0. This is a requirement of the underlying Markov Decision Process.

In DTGolog, the distinguished predicate reward(n, s) is used to define the agent’s reward in situation s. In our framework the reward of the initial situation S0 is zero. The reward of a situation s = do(a, s′) is the negative cost of the action (service) a whose execution led to s. We introduce the axioms

reward(0, S0) and reward(r, do(a, s)) ≡ cost(a, s) = c ∧ r = 0 − c.

Probabilistic effects in service profiles define the average cost of a service invocation for the different choices of nature. If no costs are specified in a service profile, the corresponding situation calculus action is assigned a default cost. The default cost is the average of all cost statements occurring in all profiles of services of the same category. For the service EpATP the following cost axiom is generated:

cost(epATP(x, y), s) = c ≡ (problemClass(x, stat#FOF NKC EPR, s) ∧ c = 12160) ∨ . . . ∨ (problemClass(x, stat#CNF NKS RFO PEQ UEQ, s) ∧ c = 3412).
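The reward and default-cost rules amount to simple arithmetic, sketched below. The function names are hypothetical; the two cost values are the EpATP runtimes from Figure 6.8 and serve only as sample input for the averaging step.

```python
# Sketch: reward as negative cost, and the default cost of a service
# without cost statements. Function names are illustrative assumptions.

def reward(cost_ms: float) -> float:
    """Reward of do(a, s): the negative cost of action a."""
    return 0 - cost_ms

def accumulated_reward(costs_ms) -> float:
    """Reward summed over the actions of a run (S0 itself contributes 0)."""
    return sum(reward(c) for c in costs_ms)

def default_cost(category_costs_ms) -> float:
    """Average of all cost statements of services in the same category."""
    return sum(category_costs_ms) / len(category_costs_ms)

r = reward(3412)
fallback = default_cost([12160, 3412])
total = accumulated_reward([12160, 3412])
```

Since rewards are negative runtimes, maximising the expected reward favours services that are expected to answer quickly.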

Probability and cost axioms are generated by the functions σprob and σcost (see Definition 5.18, p 244).

The set F of fluents contains all predicates used in classical and probabilistic effects of services. The successor state axioms for fluents in F must also include nature’s choices. For example, the fluent status(x, y) occurs in the disjunctive effects of all ATP services. The corresponding successor state axiom (including nature’s choices) looks as follows:

status(x, z, do(a, s)) ≡ (a = epATP1(x, y) ∧ z = stat#Theorem) ∨
                         (a = epATP2(x, y) ∧ z = stat#Unsatisfiable) ∨ . . . ∨
                         (a = spassATP1(x, y) ∧ z = stat#Theorem) ∨ . . . ∨
                         (a = vampireATP5(x, y) ∧ z = stat#Unknown) ∨
                         status(x, z, s).

It is worth mentioning that in the Prolog implementation of DTGolog, definitions of successor state axioms can be split into different Prolog clauses, which simplifies the translation process. In particular, the Prolog clauses for each service can be generated independently. For instance, the successor state axiom for status(x, y) is represented by the following Prolog clauses:

status(X,Z, do(A,S)) :- A = epATP1(X,Y), Z = ’stat#Theorem’.
...
status(X,Z, do(A,S)) :- A = vampireATP5(X,Y), Z = ’stat#Unknown’.
status(X,Z, do(A,S)) :- status(X,Z,S).

6.5.2 Generating a Golog Procedure

Based on the DTGolog action domain generated from OWL-S service profiles, a query generator transforms a given query and the corresponding PRODIGY plans into a DTGolog procedure that can be evaluated by the decision-theoretic reasoner. All plans found by PRODIGY are represented in a single DTGolog procedure as nondeterministic choices. The query generator computes the maximal common prefix of all plans for a more compact representation of plans (cf. Section E.2).

Example 6.5.1 For the query ndQuery1 (shown in Figure 6.5 (a)) PRODIGY found four plans (Figure 6.6). All plans start with an invocation of the service TrampCNF which is also the maximal common prefix. The corresponding DTGolog procedure is

proc ndQuery1Proc(puz problem, my proof)
  trampCNF(puz problem, cnf problem 2, delta relation 3) ;
  ((epATP(cnf problem 2, time res 8, atp result 7) ;
    otterfier(atp result 7, atp result 5) ;
    trampNDforFOF(puz problem, atp result 5, delta relation 3, my proof)) |
   (spassATP(cnf problem 2, time res 8, atp result 7) ;
    otterfier(atp result 7, atp result 5) ;
    trampNDforFOF(puz problem, atp result 5, delta relation 3, my proof)) |
   (vampireATP(cnf problem 2, time res 8, atp result 7) ;
    otterfier(atp result 7, atp result 5) ;
    trampNDforFOF(puz problem, atp result 5, delta relation 3, my proof)) |
   (otterATP(cnf problem 2, time res 8, atp result 9) ;
    trampNDforFOF(puz problem, atp result 9, delta relation 3, my proof)))
endProc
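The construction of such a procedure body from a set of plans can be sketched as follows. Plans are represented as lists of step strings, abbreviated here to service names. The functions common_prefix and procedure_body are hypothetical, and the sketch assumes that at least two plans are given and that no plan is a prefix of another.

```python
# Sketch: factor out the maximal common prefix of several plans and join the
# remaining suffixes as nondeterministic choices ('|').
# Assumptions: at least two plans, and no plan is a prefix of another.

def common_prefix(plans):
    """Longest sequence of steps shared by the beginnings of all plans."""
    prefix = []
    for steps in zip(*plans):
        if all(step == steps[0] for step in steps):
            prefix.append(steps[0])
        else:
            break
    return prefix

def procedure_body(plans):
    prefix = common_prefix(plans)
    branches = [" ; ".join(plan[len(prefix):]) for plan in plans]
    choice = "(" + " | ".join(f"({b})" for b in branches) + ")"
    return " ; ".join(prefix + [choice])

plans = [
    ["trampCNF", "epATP", "otterfier", "trampNDforFOF"],
    ["trampCNF", "spassATP", "otterfier", "trampNDforFOF"],
    ["trampCNF", "vampireATP", "otterfier", "trampNDforFOF"],
    ["trampCNF", "otterATP", "trampNDforFOF"],
]
prefix = common_prefix(plans)
body = procedure_body(plans)
```

Only a common prefix is factored out here. The shared final step trampNDforFOF stays inside each branch, as in the procedure shown above.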

The effects of a query profile represent additional conditions that have to be fulfilled after the (composite) process has been executed completely. Static effects are dealt with during the planning stage of service composition (Section 6.4). Dynamic effects, i.e. effects whose truth-value cannot be determined at planning time, are added as test statements at the end of the DTGolog procedure. An effect literal is only expressed as a test statement if its predicate p is a situation calculus fluent (p ∈ F).

Example 6.5.2 The query resultQuery (shown in Figure 6.9) asks for a result of a first-order ATP service for the conjecture provided. The profile contains the dynamic effect status(my result, stat#Theorem). This effect expresses that the sender of the query is interested in proving the conjecture at hand. The predicate status is a situation calculus fluent that occurs, for instance, in the effects of ATP services. Therefore, the effect is added as a test statement at the end of the Golog procedure generated from the six (single-step) plans found by PRODIGY:

query resultQuery:
  inputs:   my problem :: mw#TptpFOFProblem
  outputs:  my result :: mw#FoATPResult
  preconds:
  effects:  resultFor(my result, fof problem)
            status(my result, stat#Theorem)
  categs:
  timeout:  10secs
  mw = http://www.mathweb.org/owl/mathserve.owl
  tptp = http://www.tptp.org/Problems
  stat = http://www.mathweb.org/owl/status.owl

Figure 6.9: The query profile resultQuery

proc resultQueryProc(my problem, my result)
  (epATP(my problem, time res 1, my result) |
   spassATP(my problem, time res 1, my result) |
   vampireATP(my problem, time res 1, my result) |
   otterATP(my problem, time res 1, my result) |
   dctpATP(my problem, time res 1, my result) |
   waldmeisterATP(my problem, time res 1, my result)) ;
  status(my result, stat#Theorem)?
endProc

Test statements in Golog procedures are important for a correct interpretation of the results delivered by the offline DTGolog interpreter. In addition to an optimal policy, the interpreter computes the overall probability of successful termination of that policy as well as the expected accumulated reward gained by an execution. For the query resultQuery in Example 6.5.2, for instance, the probability of successful execution for the optimal policy computed is 0.175 (with respect to the performance data shown in Appendix C). This means that, without any prior knowledge about a theorem proving problem, the policy will be able to prove the theorem (the status of the result is Theorem) with 17.5% probability. Without the test statement at the end of the initial procedure, the computed optimal policy is executed successfully with 100% probability. This reflects the fact that the ATP services in the policy will always deliver a result, whether the conjecture is proved (status(my result, stat#Theorem)) or not.

In our framework, the Golog procedure corresponding to a set of PRODIGY plans and a query profile is generated by the formal translation function σproc (Definition 5.21, p 245).
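The effect of a final test statement on the reported success probability can be illustrated with a toy calculation. The outcome distribution below is hypothetical apart from the value 0.80 for status Theorem, which is the EpATP figure for one problem class in Figure 6.8; the function success_probability is likewise an assumption and much simpler than the interpreter's computation over a whole policy.

```python
# Sketch: a test statement restricts "success" to outcomes that satisfy it.
# The distribution is a hypothetical example for a single service call.

def success_probability(outcome_probs, test=None):
    """Sum the probabilities of all outcomes that pass the optional test."""
    if test is None:
        return sum(outcome_probs.values())
    return sum(p for outcome, p in outcome_probs.items() if test(outcome))

outcomes = {"Theorem": 0.80, "Unknown": 0.20}  # 0.20 is an assumed remainder

without_test = success_probability(outcomes)
with_test = success_probability(outcomes, test=lambda status: status == "Theorem")
```

Without the test every outcome counts as successful termination, so the probability is 1. With the test only the outcome Theorem counts.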

6.6 Computing an Optimal Policy

In the previous section we described how OWL-S profiles of Semantic Web Reasoning Services, PRODIGY plans, and query profiles are translated into DTGolog action domains and Golog procedures, respectively. The Golog procedures nondeterministically choose one of the available plans. However, in some situations some plans are more promising than others. For example, the different ATP services in the procedure resultQueryProc (see Example 6.5.2) perform differently on problems from different problem classes (see Section 4.4). In this section we show how decision-theoretic reasoning in DTGolog is used to compute Golog policies from Golog procedures. The policies produced are optimised with respect to the probability of successful termination and the average execution time.

MathServe employs an offline DTGolog interpreter developed by Mikhail Soutchanski [Soutchanski, 2003] to compute an optimal policy. As a consequence, the policy computed by the MathServe broker might not perform optimally if exogenous actions and events permanently changed the environment. Such an environment would require the use of an online DTGolog interpreter as described in [Soutchanski, 2001] and [Ferrein et al., 2004]. However, the environment the MathServe broker acts in is not dynamic in the above sense. In fact, the world state is only altered by actions (services) being executed (invoked) by the MathServe broker.

We modified Soutchanski’s offline interpreter such that it introduces sensing actions if needed. If the computation of the optimal policy for a procedure proc p($\vec{x}$) δ endProc fails because the value of a fluent F is not known, the interpreter looks for an action a that senses F (F occurs positively in the action’s effects). If such an action exists, the interpreter tries to compute a policy for a new procedure p′ which executes the sensing action a first, i.e.:

proc p′($\vec{x}$) a ; δ endProc

Example 6.6.1 The Golog procedure resultQueryProc (see Example 6.5.2) is an example of a procedure which requires the introduction of a sensing action. Computing an optimal policy for resultQueryProc fails because the interpreter needs to know the SPC of the problem to compute the probabilities of nature’s choices for the ATP services in resultQueryProc. However, in the initial situation S0, the fluent problemClass(my problem, c) is false for all SPCs c. The SPC of the problem my problem can only be determined by the sensing action TptpAnalyser. Consequently, the DTGolog interpreter tries to compute an optimal policy for the procedure

proc resultQueryProc’(my problem, my result)
  tptpAnalyser(my problem, prob class) ;
  (epATP(my problem, time res 1, my result) |
   spassATP(my problem, time res 1, my result) |
   vampireATP(my problem, time res 1, my result) |
   otterATP(my problem, time res 1, my result) |
   dctpATP(my problem, time res 1, my result) |
   waldmeisterATP(my problem, time res 1, my result)) ;
  status(my result, stat#Theorem)?
endProc

and succeeds. In what follows, we show a small fragment of the optimal policy procedure computed from resultQueryProc’ using the action domain generated from all OWL-S profiles presented in Chapter 4. The policy procedure consists of the sensing action TptpAnalyser followed by 21 conditional statements, one for each SPC:

proc resultQueryProcOpt(my problem, my result)
  TptpAnalyser(my problem, prob class) ;
  if problemClass(my problem, mw#CNF NKS RFO SEQ HRN) then
    EpATP(my problem, time res 1, my result) ;
    senseEffect(EpATP(my problem, time res 1, my result))
  else if problemClass(my problem, mw#FOF NKS NUN RFO NEQ) then
    DctpATP(my problem, time res 1, my result) ;
    senseEffect(DctpATP(my problem, time res 1, my result))
  else if problemClass(my problem, mw#CNF NKS RFO SEQ NHN) then
    . . .
  endIf endIf endIf
endProc
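The shape of such a policy, one service per problem class selected after a sensing step, can be imitated with a small sketch. It is a deliberate simplification and not the DTGolog interpreter's computation: services are ranked by success probability first and by lower runtime second, which is an assumption of this example. The EpATP figures come from Figure 6.8, while OtherATP, its figures, and the sensing function sense_class are hypothetical.

```python
# Sketch: a per-class policy table preceded by a sensing step.
# Simplification (assumed): prefer higher success probability, then lower
# mean runtime. Only the EpATP figures come from Figure 6.8; "OtherATP"
# and its numbers are made up for illustration.
PERFORMANCE = {
    "CNF_NKS_RFO_PEQ_UEQ": {"EpATP": (0.85, 3412), "OtherATP": (0.60, 2000)},
    "FOF_NKC_EPR": {"EpATP": (0.80, 12160), "OtherATP": (0.90, 15000)},
}

def best_service(stats):
    """Pick the service with the best (probability, -runtime) pair."""
    return max(stats, key=lambda name: (stats[name][0], -stats[name][1]))

def build_policy(performance):
    """One conditional branch per problem class."""
    return {spc: best_service(stats) for spc, stats in performance.items()}

def execute(policy, sense_class, problem):
    """Run the sensing action first, then the service chosen for its result."""
    spc = sense_class(problem)
    return ["TptpAnalyser", policy[spc]]

policy = build_policy(PERFORMANCE)
trace = execute(policy, lambda problem: "FOF_NKC_EPR", "my_problem")
```

The dictionary plays the role of the nested if-then-else cascade in the policy procedure above.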

6.7 Summary

In this chapter we showed how the MathServe broker can automatically create composite reasoning services to answer queries that cannot be answered by any single reasoning service. We presented six requirements on Web Service composition that are of particular importance in our application domain. An analysis of Web Service composition approaches proposed in the literature showed that none of the approaches fulfils all our requirements. In MathServe, we use a combination of classical AI planning and decision-theoretic reasoning. The PRODIGY planner finds promising, well-typed sequences of reasoning services that can potentially answer a given query. Several PRODIGY plans are represented compactly as one Golog procedure. Using the probabilistic effects of some reasoning services an offline DTGolog interpreter computes a policy which is optimised with respect to the probability of success and the runtime. Both the initial Golog procedure and the policy computed for it can be executed by the same execution engine.

Part III

Applications and Evaluation

Chapter 7

Brokering Theorem Proving Services – MathServe at CADE ATP System Competitions

In the previous chapters we have shown how semantic descriptions of reasoning web services can be used for automated service matchmaking and composition. We also showed how composite services can be represented in the high-level programming language Golog. The MathServe broker can use a decision-theoretic reasoner to compute an optimal policy for choosing the most suitable service when faced with a reasoning problem.

In this chapter we evaluate the performance of the MathServe Broker with respect to the brokering of first-order ATP services, i.e. services that attempt to automatically solve theorem proving problems in classical first-order logic with equality. The most important and influential evaluation method for ATP systems is the system competition associated with the annual Conference on Automated Deduction (CADE) [Nieuwenhuis, 2005]. In the CADE ATP System Competition (CASC), fully automatic ATP systems for first-order logic are evaluated on a set of problems composed of old problems randomly chosen from the TPTP Library, and a number of new problems that the ATP systems have not seen before. The systems are rated with respect to the number of problems solved, and the average runtime for successful runs.

MathServe participated in the demonstration division of CASC-20 [Sutcliffe, 2005] and CASC-J3 [Sutcliffe, 2006a]. MathServe is a general framework for distributed automated reasoning services, and it is neither particularly designed nor optimised for CASC. Nevertheless, the overall aim of our evaluation was to show that MathServe can provide a better service than any standalone ATP system. More specifically, the aim was to find answers to the following questions:

• Is the MathServe framework stable enough for the demands of an established system competition?
• Do Specialist Problem Classes (SPCs)1 constitute a suitable classification of first-order theorem proving problems?
• Does the “optimal policy” computed by the DTGolog interpreter perform better than the best ATP systems in the competition division?

In what follows, we give answers to these questions. We describe the CASC competition setup and the preparation of MathServe for CASC-20 in Sections 7.1 and 7.2, respectively. The performance of MathServe at CASC-20 and additional experiments performed after the competition are discussed in Section 7.3. The improvements made to MathServe after CASC-20 are described in Section 7.4. Finally, the performance of MathServe at CASC-J3 is presented in Section 7.5.

1 We remind the reader that Specialist Problem Classes (SPCs) are classes of problems determined by syntactical features (cf. Section 4.4.4).

7.1 The CADE System Competition

The CADE ATP System Competition [Pelletier et al., 2002] is divided into divisions according to problem and system characteristics. There are five competition divisions in which systems are explicitly ranked:

-- The MIX division contains mixed CNF, non-propositional theorems (with unsatisfiable clause sets).

-- The FOF division consists of mixed non-propositional first-order form theorems.

-- The SAT division is composed of CNF problems that are non-propositional non-theorems (with satisfiable clause sets).

-- In the EPR division, provers are tested on effectively propositional theorems (and non-theorems) in CNF format.

-- The UEQ division contains unit equality problems in CNF that are non-propositional theorems.

There is also a demonstration division in which systems demonstrate their abilities without being formally ranked. The set of problems in the demonstration division is the union of the problem sets of all competition divisions.

In all divisions, CPU and wall-clock time limits are imposed on each solution attempt. The CPU time limit is chosen as a reasonable value within the range allowed, and is announced on the day of the competition. The wall-clock time limit is imposed in addition to the CPU time limit, to avoid very high memory usage, and is double the CPU time limit. In the demonstration division, each entrant can choose to use either a CPU or a wall-clock time limit.

ATP systems taking part in any of the competition divisions have to be installed on the system machines before the system installation deadline, which is typically 3 weeks before the competition. Systems in the demonstration division can also be installed later or can be run on hardware provided by the entrant.

Problem Selection. The competition problems are chosen from the latest version of the TPTP Library extended with the new problems. This version is not released until after the competition, so that new problems have not been seen by the entrants. The problems have to meet certain criteria to be eligible for selection. Performance data from submitted systems is used to compute the problem ratings for the extended TPTP Library. The rating of a problem is measured relative to state-of-the-art ATP systems. For each problem, the rating r is the ratio of the number of state-of-the-art ATP systems that cannot solve the problem to the total number of state-of-the-art ATP systems, i.e.

\[
r = \frac{\#\{\text{failing state-of-the-art ATP systems}\}}{\#\{\text{state-of-the-art ATP systems}\}}
\]

According to its rating each problem is classified as:

-- Easy: The problem is solvable by all state-of-the-art ATP systems (rating 0.0), or

-- Difficult: The problem is solvable by some state-of-the-art ATP systems (rating between 0.0 and 1.0), or

-- Unsolved: The problem has not yet been solved by any ATP system, or

-- Open: The theoremhood is unknown.

Difficult problems with a rating in the range 0.21 to 0.99 are eligible for CASC. The competition problems are randomly selected from all eligible problems at the start of the competition, based on a seed supplied by the competition panel. The selection is constrained so that no division contains an excessive number of very similar problems. The selection mechanism is biased to select problems that are new in the TPTP Library, until 50% of the problems in each category have been selected. After that, random selection (from old and new problems) continues. The actual percentage of new problems used depends on how many new problems are eligible and the limitation on very similar problems.
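The rating and the eligibility rule above can be restated as a short Python sketch. It is an illustration only and not part of the TPTP or CASC tooling; the function names are hypothetical. The class Open is not covered because it depends on unknown theoremhood rather than on the rating, and mapping a rating of 1.0 to Unsolved is a simplifying assumption.

```python
from fractions import Fraction


def problem_rating(failing: int, total: int) -> Fraction:
    """Fraction of state-of-the-art systems that fail on the problem."""
    if total <= 0 or not 0 <= failing <= total:
        raise ValueError("need 0 <= failing <= total and total > 0")
    return Fraction(failing, total)


def difficulty(rating: Fraction) -> str:
    """Classify a problem by its rating (the class Open is not modelled)."""
    if rating == 0:
        return "Easy"
    if rating == 1:
        return "Unsolved"  # assumption: no state-of-the-art system succeeds
    return "Difficult"


def eligible_for_casc(rating: Fraction) -> bool:
    """Eligibility range 0.21 to 0.99 as stated in the text."""
    return Fraction(21, 100) <= rating <= Fraction(99, 100)
```

For example, a problem on which one of four state-of-the-art systems fails has rating 1/4; it is Difficult and eligible, whereas a problem with rating 1/5 is Difficult but not eligible.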

7.2 Training the MathServe Broker

To prepare the MathServe framework for the system competition we had to collect performance data of the available ATP services. We were faced with the task of finding suitable training sets of theorem proving problems. Since MathServe was going to be evaluated with respect to a new version of the TPTP Library, it was natural to train MathServe on TPTP Library problems. However, we also looked for a second, independent set of problems to avoid over-fitting of the learnt policy. The result of our investigation was that the TPTP Library is the only existing problem set of sufficient size that is also heterogeneous enough with respect to a partitioning into Specialist Problem Classes. The MPTP Library [Urban, 2003, Urban, 2004] consists of over 30,000 problems automatically extracted from the Mizar Library. However, this collection of problems is too homogeneous: all problems in the library fall into one of three SPCs (FOF_NKC_RFO_EQU, FOF_NKC_EPR, FOF_NKC_RFO_NEQ)², which is due to the fact that the problems are automatically generated. Therefore, we trained the MathServe Broker on the complete TPTP Library (Version 3.0.1 for CASC-20 and Version 3.1.0 for CASC-J3), which is also the training method used for standalone ATP systems.

² Some problems of the CASC FOF division are also in some of these SPCs.

As a preparation for CASC-20 we measured the performance of the ATP systems E 0.82, Otter 3.3, SPASS 2.1 and Vampire 7.0. For every SPC of the TPTP Library we counted the problems solved by the ATP systems with a CPU time limit of 300 seconds per problem. Both the percentage of problems solved and the CPU time for solved problems were measured. Table 7.1 compares the performance of the ATP systems. For the sake of readability it contains only the percentage of problems solved by the systems, and not the CPU time used by the systems³. Numbers in bold font indicate the strongest system in each SPC, which is the system that solved most of the problems. In case two or more systems solved the same number of problems, the best system was determined by the average CPU time used for successful problem solving attempts. For instance, E, SPASS and Vampire solved all problems in the SPC of first-order problems that are real first-order and contain no equality (FOF_NKC_RFO_NEQ); on average, however, SPASS used the least time and was therefore marked as the most successful system. Table 7.1 contains the provers’ performance for all SPCs. However, only those SPCs containing problems that are either not known to be satisfiable (NKS) or not known to be countersatisfiable (NKC) are relevant for CASC⁴.

It is worth mentioning that developers of ATP systems do not publish the latest versions of the systems before the CADE System Competition. Therefore, these versions of the provers were not available to MathServe before the competition.
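The rule used to mark the strongest system per SPC (most problems solved, ties broken by the lower average CPU time) can be sketched in Python as follows. This is an illustration, not MathServe code. The solved counts for FOF_NKC_RFO_NEQ are rounded from the percentages in Table 7.1, while the CPU times are invented placeholders, since the measured times appear only in Appendix C.

```python
from typing import Dict, Tuple

# (problems solved, average CPU seconds on solved problems)
Performance = Tuple[int, float]


def strongest_system(perf: Dict[str, Performance]) -> str:
    """Most problems solved wins; ties go to the lower average CPU time."""
    return min(perf, key=lambda name: (-perf[name][0], perf[name][1]))


# FOF_NKC_RFO_NEQ has 28 problems. The CPU times below are hypothetical.
fof_nkc_rfo_neq = {
    "E": (28, 1.9),
    "Otter": (27, 0.5),
    "SPASS": (28, 0.7),
    "Vampire": (28, 3.2),
}
```

With these placeholder times, `strongest_system(fof_nkc_rfo_neq)` selects SPASS: Otter is faster but solves fewer problems, and among the three systems that solve all 28 problems SPASS has the lowest average time.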
For CASC-20, the success rates in Table 7.1 and the average CPU times were modelled as conditional probabilistic effects of the corresponding ATP services as shown in Section 4.4.4. An offline DTGolog interpreter was used to compute an optimal policy for the following Golog procedure (cf. Section 6.6):

proc Casc20Proc(problem, result)
    ( EpATP(problem, time_res, result)
    | SpassATP(problem, time_res, result)
    | VampireATP(problem, time_res, result)
    | OtterATP(problem, time_res, result) ) ;
    ( status(result, stat#Theorem) ∨ status(result, stat#Unsatisfiable) )?
endProc

The procedure Casc20Proc consists of a nondeterministic choice of one of the four available ATP services and is an early version of the procedure resultQueryProc introduced in Section 6.5.2. Ideally, the status of the result delivered by the ATP services should be either Theorem or Unsatisfiable. For every SPC, the optimal policy computed for Casc20Proc chose the strongest ATP service. For CASC-20, the strongest ATP service corresponds to the best ATP system in Table 7.1. The optimal policy consisted of 21 conditional statements (one for each SPC):

³ The full tables including the CPU times can be found in Appendix C.
⁴ The TPTP problem analyser will never classify a new problem as satisfiable or counter-satisfiable.

Specialist             No.    Percentage of problems solved
Problem Class          Probs  E 0.82  Otter 3.3  SPASS 2.1  Vampire 7.0
FOF_CSA_EPR              319      91          0         86           68
FOF_CSA_RFO               18      67          0         67           50
FOF_SAT_EPR                3      34          0         34           34
FOF_SAT_RFO               17      41          0         29           24
FOF_NKC_EPR              395      81         55         98           80
FOF_NKC_RFO_EQU          915      64         34         53           61
FOF_NKC_RFO_NEQ           28     100         96        100          100
FOF_NKS_NUN_EPR           50      14          2        100            8
FOF_NKS_NUN_RFO_NEQ        0       0          0          0            0
FOF_NKS_NUN_RFO_EQU        0       0          0          0            0
CNF_NKS_EPR              476      98         77         99           99
CNF_NKS_RFO_NEQ_NHN      540      70         49         54           67
CNF_SAT_EPR              220      64          0         61           52
CNF_SAT_RFO_NEQ          275      52          0         45           51
CNF_SAT_RFO_EQU_NUE      224      58          0         55           50
CNF_SAT_RFO_PEQ_UEQ       54      13          0         11           11
CNF_NKS_RFO_NEQ_HRN      461      89         72         66           88
CNF_NKS_RFO_SEQ_HRN      390      87         56         49           75
CNF_NKS_RFO_SEQ_NHN     1816      48         25         34           42
CNF_NKS_RFO_PEQ_NUE      337      88         20         75           85
CNF_NKS_RFO_PEQ_UEQ      722      85         73         71           83

Table 7.1: Percentage of problems solved by the ATP systems E, Otter, SPASS and Vampire in the 21 SPCs of the TPTP Library 3.0.1. Systems were run with a 300sec wall-clock time limit for each problem

proc Casc20ProcOpt(problem, result)
    TptpAnalyser(problem, prob_class) ;
    if problemClass(problem, mw#CNF_NKS_RFO_SEQ_HRN) then
        EpATP(problem, time_res, result) ;
        senseEffect(EpATP(problem, time_res, result))
    else if problemClass(problem, mw#FOF_CSA_EPR) then
        EpATP(problem, time_res, result) ;
        senseEffect(EpATP(problem, time_res, result))
    else if problemClass(problem, mw#FOF_NKS_NUN_RFO_NEQ) then
        SpassATP(problem, time_res, result)
    . . .
    endIf endIf endIf
endProc

For CASC-20 the procedure Casc20ProcOpt was used by the atpOpt operation of the MathServe broker’s ATP interface described in Section 5.2.7. As mentioned before, the time resource parameters of the atomic ATP services in Casc20ProcOpt are bound dynamically by MathServe’s Golog interpreter (cf. Section 5.2.6). However, the Golog interpreter does not provide sophisticated time resource management. For the optimal policy Casc20ProcOpt this means that the time used by the problem analyser (TptpAnalyser) is subtracted from the time available to the theorem proving services. This can be seen as a disadvantage for MathServe in a competition environment. The ATP service chosen by the policy will have less time to tackle a problem than the standalone ATP systems in the competition divisions. Furthermore, the ATP services have to translate theorem proving problems into the ATP systems’ input syntax. For the ATP systems in the competition divisions, this translation is performed prior to the competition.
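The behaviour described above (classify the problem first, then hand the remaining time to the service selected for its SPC) can be mimicked by the following Python sketch. It is not MathServe's Golog interpreter. The function run_policy and the callable interfaces are hypothetical, and the dictionary lists only the seven SPCs that occurred at CASC-20 together with the service chosen for each, as reported in Table 7.2.

```python
import time
from typing import Callable, Dict

# SPC -> ATP service chosen by the CASC-20 policy (values from Table 7.2).
CASC20_POLICY: Dict[str, str] = {
    "CNF_NKS_EPR": "SPASS",
    "CNF_NKS_RFO_PEQ_NUE": "E",
    "CNF_NKS_RFO_NEQ_NHN": "E",
    "CNF_NKS_RFO_PEQ_UEQ": "E",
    "CNF_NKS_RFO_SEQ_NHN": "E",
    "FOF_NKC_RFO_NEQ": "SPASS",
    "FOF_NKC_RFO_EQU": "E",
}

# Hypothetical interfaces: an analyser maps a problem to its SPC name,
# a service takes the problem and a time budget and returns a status.
Analyser = Callable[[str], str]
Service = Callable[[str, float], str]


def run_policy(problem: str, time_limit: float, analyser: Analyser,
               services: Dict[str, Service],
               clock: Callable[[], float] = time.monotonic) -> str:
    """Classify the problem, then call the chosen service with what is left."""
    start = clock()
    spc = analyser(problem)
    # The analyser's running time is charged against the overall limit.
    remaining = max(0.0, time_limit - (clock() - start))
    return services[CASC20_POLICY[spc]](problem, remaining)
```

If the analyser needs 12.5 seconds of a 600 second limit, the selected prover is started with a budget of 587.5 seconds, which is the disadvantage relative to standalone systems noted in the text.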

7.3 MathServe at CASC-20

The 20th CADE System Competition (CASC-20) [Sutcliffe, 2005] took place on the 26th July 2005 in Tallinn, Estonia. CASC-20 counted 19 entrants, most of which participated in some of the five competition divisions. The ATP system E was the only entrant running on the problems of all competition divisions. The problem set for CASC-20 was composed of 660 randomly chosen (eligible) problems from the TPTP Library. 147 of these problems had not been seen by the provers before. Table 7.2 shows the distribution of the 660 problems over seven SPCs, and the prover chosen by MathServe’s optimal policy for these SPCs. The CPU time limit of 600 seconds for each problem was announced at the competition.

Specialist Problem Class  Problems  ATP chosen
CNF_NKS_EPR                    115  SPASS
CNF_NKS_RFO_PEQ_NUE             36  E
CNF_NKS_RFO_NEQ_NHN            100  E
CNF_NKS_RFO_PEQ_UEQ            160  E
CNF_NKS_RFO_SEQ_NHN             99  E
FOF_NKC_RFO_NEQ                 35  SPASS
FOF_NKC_RFO_EQU                115  E
Complete                       660

Table 7.2: Distribution of CASC-20 problems over seven SPCs and the ATP service chosen by MathServe

Since MathServe does not constitute a new ATP system, but employs other ATP systems, it participated in the demonstration division. A specialised MathServe client was run sequentially on all 660 problems with a 600 sec CPU and wall-clock time limit. The client was executed on a competition machine provided by the CASC organisers. The client tried to solve the competition problems via the ATP interface of a MathServe broker running on a machine at Saarbrücken University, Germany. The server ran on a Linux machine with four Intel Xeon 2.80GHz CPUs and 2GB total memory.

At the competition, MathServe used the optimal strategy Casc20ProcOpt via the atpOpt operation of the ATP interface. With this strategy it could solve 392 problems, which corresponds to a success rate of 59.4%. The average CPU time used for solved problems was approximately 28 seconds. The average wall-clock time used for all queries was 224 seconds, which also includes the time for unsuccessful calls that account for 600 seconds each.⁵

The performance of MathServe at CASC-20 was unexpectedly poor, taking into account that the system could choose from several ATP services. MathServe performed sub-optimally on effectively propositional (EPR) and unit equality (UEQ) problems. MathServe could only solve 86 of the 120 EPR problems and 64 of the 120 UEQ problems. This was due to the fact that MathServe had no specialised reasoning services for these problem classes available. More importantly, MathServe did not outperform the most powerful ATP systems E and Vampire. Table 7.3 shows the performance of MathServe compared to the leading ATP systems E and Vampire.

System          Problems  Problems  Percentage  Percentage
                   given    solved    of given    complete
Vampire 8.0          540       430       79.6%       65.2%
E 0.9pre3            660       409       62.0%       62.0%
MathServe 0.62       660       392       59.4%       59.4%

Table 7.3: Comparison of MathServe with leading ATP systems E and Vampire
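The two percentage columns of Table 7.3 differ only for Vampire 8.0, which was given 540 of the 660 problems. As a check, the entries can be recomputed from the problem counts in the table:

```latex
\[
\frac{430}{540} \approx 79.6\%, \qquad
\frac{430}{660} \approx 65.2\%, \qquad
\frac{409}{660} \approx 62.0\%, \qquad
\frac{392}{660} \approx 59.4\%.
\]
```

Only the "percentage complete" column, which is relative to all 660 problems, allows a direct comparison between systems that were run on different subsets.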

7.3.1 Comparison with the E System. After CASC-20 we analysed in detail why E and Vampire could solve more problems than MathServe. First, we compared MathServe with the system E, which took part in all competition divisions. For 38 of the 268 problems not solved by MathServe, E (0.9pre3) found a solution. In further experiments we found out why this (newer) version of E could solve these problems but MathServe could not. The main reasons were:

Improvements to E: To a great extent, the better performance of E was due to the improvements made in version 0.9pre3 of the E system. In fact, 28 of the 38 problems (i.e. 73.7%) could only be solved by the newer version of E and not by the older version used by MathServe.

Large problems: Six of the remaining ten problems could not be solved by MathServe due to their size. This shortcoming was due to the inefficient memory management of MathServe. Queries sent to MathServe are XML documents that have to be parsed to determine the type of the query and to extract the TPTP problem description. We realised that XML parsing and string copying routines lead to a memory usage approximately 20 times the size of the XML document. This caused a memory overflow for all problems larger than 2.5 MB (megabytes). Table 7.4 shows the names and the sizes of the problems that were too big to be solved by MathServe.

Service selection: For 13 of the 38 unsolved problems, the optimal strategy of MathServe suggested using the ATP service SPASS (version 2.1) instead of E. The six problems in Table 7.4 were among these problems and could not be parsed by MathServe. We ran E on the remaining seven problems to see whether MathServe could have solved these problems if it had chosen E instead. Only four of the seven problems could have been solved by choosing E instead of SPASS.

⁵ In further experiments, we showed that, with a random choice of one ATP service for each problem, MathServe could solve 297 problems. With a parallel invocation of all available ATP services, 405 problems could be solved by MathServe.

Problem   Size (MB)
SYN831-1        8.6
SYN853-1       11.0
SYN830-1        9.2
SYN816-1        2.8
SYN864-1        3.4
SYN865-1        3.4

Table 7.4: CASC Problems not solved by MathServe due to their size

From the above experiments we concluded that, even if MathServe had managed to deal with theorem proving problems bigger than 2.5MB, and if it had chosen E instead of SPASS for some problems, it would have been able to solve only ten problems more than with the original strategy. This is mainly due to the significant improvements on version 0.9pre3 (compared to version 0.82) of E. Even with improvements on the MathServe system itself and the optimal policy computed, MathServe would not have been able to outperform the system E.

7.3.2 Comparison with the Vampire System. The Vampire system did not participate in all competition divisions and was only run on 540 CASC-20 problems. But Vampire 7.0 and Vampire 8.0 were the two leading systems in the prestigious MIX and FOF divisions of CASC-20. This is why we also performed a detailed comparison of MathServe with Vampire 8.0. For 71 of the 268 problems not solved by MathServe, Vampire 8.0 could find a solution. Again, the six problems in Table 7.4 were among these problems. MathServe could have used an older version of Vampire (version 7.0) but did not do so because of the performance data in Table 7.1 and the resulting optimal policy Casc20ProcOpt shown above.

Vampire 7.0 could have solved 45 of the 71 problems (i.e. 63.4%). Thus, the performance difference between two versions of Vampire is smaller than in the case of E⁶. Table 7.5 shows the distribution of the 45 problems over five SPCs. The table also shows that MathServe chose E for all of these problems except one.

⁶ The older version of E could only solve 26.3% of the 38 problems not solved by MathServe.

Specialist           Solved by    ATP chosen
Problem Class        Vampire 8.0
CNF_NKS_RFO_PEQ_NUE            1  E
CNF_NKS_RFO_NEQ_NHN            5  E
CNF_NKS_RFO_SEQ_NHN           15  E
FOF_NKC_RFO_NEQ                1  SPASS
FOF_NKC_RFO_EQU               23  E

Table 7.5: Distribution of 45 problems not solved by MathServe but solved by Vampire 7.0, and the ATP system chosen by MathServe

We created a new policy in which Vampire 7.0 was chosen for problems in the SPCs of Table 7.5. We ran MathServe with the new policy on the CASC-20 problems to see how MathServe would have performed if it had chosen Vampire 7.0 for these SPCs. With the new policy, MathServe could solve 383 of all CASC-20 problems, i.e. nine problems fewer than with the original policy Casc20ProcOpt. The average CPU time for solved problems was 34 seconds (6 seconds more than at CASC-20), which was due to the fact that Vampire typically used more CPU resources than the ATP systems SPASS and E. On average, the client had to wait 242 seconds for the result of successfully solved queries, which was 18 seconds more than for the original policy Casc20ProcOpt.

In a second experiment we created a policy in which Vampire 7.0 was chosen only for the two SPCs (CNF_NKS_RFO_SEQ_NHN and FOF_NKC_RFO_EQU) that contain 84% of the 45 problems. We ran MathServe with this policy on the CASC-20 problems. With this policy, MathServe could solve 385 of the 660 CASC-20 problems, which is still seven problems fewer than with the original policy Casc20ProcOpt. However, the average CPU time for solved problems, 29 seconds, was just one second longer than with Casc20ProcOpt. Also, on average, the client had to wait only 228 seconds for an answer to a query.

From our experiments, we concluded that, even if it had used Vampire 7.0 for some SPCs, MathServe could not have outperformed Vampire 8.0 at CASC-20.

7.4 Improving MathServe

In the previous section we saw that MathServe solved fewer CASC-20 problems than the ATP systems E and Vampire. We identified the following reasons for this outcome:
• Significant improvements had been made to the ATP systems participating in CASC-20. These improved systems were not available to MathServe.
• Due to inefficient memory handling, MathServe could not process problems bigger than 2.5MB.
• MathServe did not use specialised reasoning services for unit equality problems and effectively propositional problems.

In order to improve MathServe we integrated the CASC-20 versions of the systems E and Vampire into MathServe. With these new versions of the ATP systems integrated, MathServe could already solve 440 CASC-20 problems and outperform E 0.9pre3 and Vampire 8.0 (cf. Table 7.3).

Furthermore, we added the systems DCTP [Letz and Stenz, 2001a] and Waldmeister [Hillenbrand et al., 1999], which are specialised for problems that are effectively propositional and for unit equality problems, respectively. The ATP system Paradox, which is specialised for satisfiable CNF problems and effectively propositional problems, was also integrated into MathServe. With the new ATP services, MathServe could solve 451 CASC-20 problems (cf. next section).

Last but not least, we also improved MathServe’s memory management and the handling of large theorem proving problems. When receiving a large problem, the improved MathServe broker immediately stores the problem in a file on the hard disc and all further processing is performed on the problem file. As a consequence MathServe can also process the problems shown in Table 7.4.
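The idea of writing an incoming problem straight to disc instead of holding and copying it in memory can be illustrated with the following Python sketch. It only demonstrates the general technique and is not MathServe's implementation; the function name, the file suffix and the chunk size are assumptions made for this example.

```python
import os
import shutil
import tempfile
from typing import BinaryIO


def spool_problem(stream: BinaryIO, directory: str,
                  chunk_size: int = 64 * 1024) -> str:
    """Copy a problem from an input stream to a new file, chunk by chunk.

    At most chunk_size bytes are held in memory at a time, so memory use
    does not grow with the size of the problem. Returns the file path.
    """
    fd, path = tempfile.mkstemp(suffix=".tptp", dir=directory)
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)
    return path
```

All later processing steps would then receive the returned path rather than the problem text, so that no step needs to hold the whole problem as a string.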

7.5 MathServe at CASC-J3

The 21st CADE System Competition (CASC-J3) [Sutcliffe, 2006a] was held on the 18th August 2006 in Seattle, USA. As a preparation for CASC-J3, we trained the improved version of MathServe on the full TPTP Library v3.1.0 with a time limit of 600 seconds for each problem. The percentage of problems solved by the ATP systems is shown in Table 7.6, which is analogous to Table 7.1. A new optimal policy, CascJ3ProcOpt, was computed using the data in Table 7.6. For every SPC, the ATP system whose service is chosen by this policy is marked in bold font in Table 7.6.

It is worth mentioning that for some SPCs the ATP system with the second highest percentage of problems solved is chosen by the policy. For example, for the SPC CNF_NKS_RFO_SEQ_NHN, the service of E (50% solved) is chosen instead of Vampire (52% solved). This is because the average CPU time for solved problems is taken into account as the cost of actions when computing the optimal policy. On average, the system E took half the time of Vampire to prove problems in the class CNF_NKS_RFO_SEQ_NHN. In the context of CASC this is a sub-optimal decision because the CPU time used by the ATP systems is of minor importance. However, MathServe is also used in other contexts where this decision is justified.⁷

As an additional preparation for CASC-J3, we tested the new optimal policy on the CASC-20 problems. MathServe could solve 451 CASC-20 problems, i.e. 21 problems more than Vampire 8.0 and 42 problems more than E 0.9pre3 alone. Due to the more efficient memory management, MathServe could solve all large problems listed in Table 7.4.

The improved version 0.80 of MathServe took part in the demonstration division of CASC-J3. All entrants were given an overall time limit of 400 seconds for each problem. The entrants were challenged with 600 problems which belonged to the same SPCs as the CASC-20 problems. The distribution of the problems over seven SPCs is shown in Table 7.7.
⁷ In the future, the decision-theoretic reasoner used by MathServe could be triggered to either compute a policy with respect to the success rates and time, or with respect to the success rates only.

Specialist             Percentage of problems solved
Problem Class             DCTP        E  Paradox    SPASS  Vampire   Waldm.
                        10.21p  0.9pre3      1.3      2.1      8.0      704
FOF_CSA_EPR                 88       91       91       86        0        0
FOF_CSA_RFO                 28       26       36       26        0        0
FOF_SAT_EPR                 90       77      100       92        0        0
FOF_SAT_RFO                  0       41       47       29        0        0
FOF_NKC_EPR                 38       96       48       98       93        0
FOF_NKC_RFO_EQU              0       69        8       59       71        0
FOF_NKC_RFO_NEQ              0      100        0       97      100        0
FOF_NKS_NUN_EPR              0       80        0      100       61        0
FOF_NKS_NUN_RFO_NEQ         50       50       50       50       50        0
FOF_NKS_NUN_RFO_EQU          0        0        0        0        0        0
CNF_NKS_EPR                 99       98       96       99       98        0
CNF_NKS_RFO_NEQ_NHN          0       71        3       55       75        0
CNF_SAT_EPR                 92       64       88       67       48        0
CNF_SAT_RFO_NEQ              0       52       99       45        0        0
CNF_SAT_RFO_EQU_NUE          0       58       75       55        0        0
CNF_SAT_RFO_PEQ_UEQ          0        7       89        6        0        0
CNF_NKS_RFO_NEQ_HRN          0       90        2       68       94        5
CNF_NKS_RFO_SEQ_HRN         59       88        0       50       93        0
CNF_NKS_RFO_SEQ_NHN          0       50        2       37       52        0
CNF_NKS_RFO_PEQ_NUE          0       80        1       62       78        0
CNF_NKS_RFO_PEQ_UEQ          0       80        0       68       81       88

Table 7.6: Percentage of problems solved by the ATP systems DCTP, E, Paradox, SPASS, Vampire and Waldmeister in the 21 SPCs of the TPTP Library 3.1.0. Systems were run with a 600sec time limit for each problem

Specialist Problem Class  Problems  ATP chosen
CNF_NKS_EPR                    100  DCTP
CNF_NKS_RFO_PEQ_NUE             26  E
CNF_NKS_RFO_NEQ_NHN             90  Vampire
CNF_NKS_RFO_PEQ_UEQ            100  Waldmeister
CNF_NKS_RFO_SEQ_NHN            112  E
FOF_NKC_RFO_NEQ                 35  E
FOF_NKC_RFO_EQU                115  E
Complete                       600

Table 7.7: Distribution of CASC-J3 problems over seven SPCs as classified by the TPTPAnalyser service

Table 7.7 also shows that, with the new policy CascJ3ProcOpt, the MathServe broker chose DCTP for effectively propositional problems and Waldmeister for unit equality problems.

Like the policy Casc20ProcOpt, the policy CascJ3ProcOpt was optimised to determine the status Theorem for FOF problems and the status Unsatisfiable for CNF problems. With the new optimal policy, computed from the performance data in Table 7.6, MathServe could solve 414 of 600 problems. With the help of the specialised ATP systems DCTP and Waldmeister, MathServe could solve 98 of 100 EPR problems and 93 of 100 UEQ problems.⁸

⁸ Due to a bug in the TPTP Library’s classifier for SPCs (and consequently in the TPTPAnalyser service) some non-unit-equality problems were falsely classified as CNF_NKS_RFO_PEQ_UEQ. Thus, Waldmeister was actually run on 122 problems.

System          Problems  Problems  Percentage
                   given    solved    complete
MathServe 0.80       600       414       69.0%
Vampire 8.1          500       412       68.7%
E 0.99               600       402       67.0%

Table 7.8: Comparison of MathServe with the ATP systems E and Vampire on CASC-J3 problems

Table 7.8 shows a comparison of MathServe with the leading standalone ATP systems E 0.99 and Vampire 8.1. MathServe could solve more problems than any standalone ATP system. However, Vampire 8.1 did not participate in the SAT competition division and was only run on 500 problems. Theoretically, Vampire 8.1 could have solved more problems than MathServe if run on all 600 problems. However, Table 7.6 shows that Vampire 8.0 did not solve any SAT problem in the TPTP Library. Since version 8.1 of Vampire was not significantly different from version 8.0, it did not perform better on SAT problems.
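The trade-off reported in this section for CNF_NKS_RFO_SEQ_NHN, where E (50% solved) is preferred over Vampire (52% solved) because E needs about half the time, can be illustrated with a deliberately simplified expected-utility calculation in Python. This is not the DTGolog computation. The utility function, the reward of 100 and the cost of 0.2 per second are assumptions chosen for illustration, and the average times are placeholders that only respect the stated ratio of one half.

```python
from typing import Dict, Tuple


def expected_utility(success_rate: float, avg_time: float,
                     reward: float, time_cost: float) -> float:
    """Expected reward for solving the problem minus a cost for CPU time."""
    return success_rate * reward - time_cost * avg_time


def choose(candidates: Dict[str, Tuple[float, float]],
           reward: float, time_cost: float) -> str:
    """Pick the service with the highest expected utility."""
    return max(candidates,
               key=lambda name: expected_utility(*candidates[name],
                                                 reward, time_cost))


# Success rates from Table 7.6; the times (in seconds) are hypothetical.
cnf_nks_rfo_seq_nhn = {"E": (0.50, 20.0), "Vampire": (0.52, 40.0)}
```

With a time cost of 0.2 per second the sketch prefers E (46 against 44), whereas with a time cost of zero it prefers Vampire. The second setting corresponds to a policy computed with respect to success rates only, which would be the better fit for CASC.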

7.6 MathServe on SAT Problems

At both CADE competitions MathServe performed poorly on problems of the SAT division, i.e. on CNF problems with satisfiable clause sets. At CASC-J3, MathServe could solve only five percent of the SAT problems. The reason is that, due to the design of the competition, MathServe was not told what to achieve for the given problems. The optimal policy used by MathServe was based on the assumption that the status Unsatisfiable should be achieved for CNF problems and the status Theorem for FOF problems. Consequently, MathServe chose the ATP systems E and Vampire for SAT problems. These systems are not designed to detect satisfiable clause sets. MathServe would have performed better if it had chosen either DCTP or Paradox for SAT problems. Indeed, with a new policy optimised to determine satisfiability for CNF problems, MathServe could solve 74 of the 100 CASC-J3 problems in the SAT division. If MathServe had used this policy for SAT problems only, it could have solved 473 CASC-J3 problems.

7.7 Summary

We discussed the application of MathServe to theorem proving in classical first-order logic. MathServe participated in the demonstration division of the CADE ATP System Competitions CASC-20 and CASC-J3, and proved to be stable enough to participate in an established system competition. However, in CASC-20 MathServe did not outperform the leading ATP systems E and Vampire. In further experiments we discovered that this was mainly due to the significant improvements made to the most recent versions of these ATP systems. These improved ATP systems were not available to MathServe at the time of the competition. Furthermore, some problems were too large to be handled by MathServe. Last but not least, MathServe did not have specialised services for unit equality problems and effectively propositional problems.

As a consequence we improved MathServe in three ways: 1) We integrated the latest versions of the ATP systems E and Vampire. 2) We added the services offered by the specialised ATP systems DCTP, Paradox and Waldmeister. 3) We changed the XML processing unit of MathServe such that it could handle large problems. With these improvements MathServe did outperform both E 0.9pre3 and Vampire 8.0 on CASC-20 problems. In CASC-J3, MathServe performed better than E 0.99 and could also solve two problems more than Vampire 8.1.

From the results obtained through our experiments we conclude that MathServe can offer a better theorem proving service than any standalone ATP system. More specifically, SPCs do constitute a suitable classification of first-order theorem proving problems in the context of the TPTP Library⁹. Also, our data provides enough evidence to support the thesis that the policy computed from the performance of ATP systems on SPCs performs better than standalone ATP systems.

With respect to CASC, MathServe could be improved in several ways. First of all, MathServe could choose a better strategy for problems of the SAT division, if it were told what to achieve for a problem. Furthermore, a parallel invocation of ATP services could increase the performance of MathServe.
So far, parallel execution of ATP services is only possible via the atpSplitAndJoin operation of the broker’s ATP interface (see Section 5.2.7). The execution of atpSplitAndJoin is barrier synchronised, i.e. it only returns a result when all service invocations have terminated. An improved parallel execution should terminate all service invocations as soon as the problem at hand has been solved by one service (cf. Section 10.1.2). A comparison of sets of ATP services (instead of single ATP services) could be used to perform an optimal choice of sets of services for parallel service execution. When executing a set of ATP services in parallel, time resources could be assigned to the individual services according to performance data as presented in Sections 7.3 and 7.5. This would minimise the amount of computational resources used while maximising the probability of successfully solving a problem.
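The difference between a barrier-synchronised invocation and the improved parallel execution proposed above can be sketched in Python with the standard concurrent.futures module. The sketch is not related to the atpSplitAndJoin implementation; the function first_success and the service interface are hypothetical. Each service is assumed to be a callable that receives the problem and a stop flag, returns a status string on success or None on failure, and checks the flag so that it can give up early.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Optional, Tuple

# Hypothetical service interface: returns a status string or None.
Service = Callable[[str, threading.Event], Optional[str]]


def first_success(problem: str,
                  services: Dict[str, Service]) -> Optional[Tuple[str, str]]:
    """Run all services in parallel and return the first successful result.

    As soon as one service succeeds, the stop flag is set so that the
    remaining services can terminate. Cancellation is cooperative: a
    service that ignores the flag still runs to completion.
    """
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {pool.submit(service, problem, stop): name
                   for name, service in services.items()}
        for future in as_completed(futures):
            result = future.result()
            if result is not None:
                stop.set()  # ask the other services to give up
                return futures[future], result
    return None  # every service failed
```

A barrier-synchronised variant would instead collect the results of all futures before returning, so its running time is determined by the slowest service even when another service has already solved the problem.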

⁹ For other problem sets, such as the MPTP Library, other criteria would have to be found for partitioning.

Chapter 8

MathServe on Higher-Order Problems

In the previous chapter we evaluated the MathServe framework with respect to its capability to choose suitable reasoning services depending on certain problem features. In this chapter, we provide evidence that composite reasoning services in MathServe can (in some cases) solve problems faster and more reliably than atomic services designed for the same task. We show how composite reasoning services can be used to efficiently solve higher-order theorem proving problems about sets, relations and functions. The reasoning services combine a higher-order ATP system, a definition expansion service, a higher-order to first-order translation, and the first-order ATP strategy presented in Chapters 6 and 7.

An evaluation of MathServe on higher-order problems proved to be difficult because there is only a small number of automated reasoning systems for higher-order logic. For example, there are only two automated theorem proving systems for higher-order logic, namely the ATP systems Leo [Benzmüller and Kohlhase, 1998] and TPS [Andrews et al., 1996]. However, only Leo offers a degree of automation sufficient for an integration into MathServe. The TPS system does provide an automated search procedure but still requires a large amount of user interaction to actually invoke that procedure on a conjecture. Furthermore, large libraries of problems formalised in a standard format, such as the higher-order TPTP format [Gelder and Sutcliffe, 2006], are not yet available. For some higher-order proof assistant systems, such as HOL, Isabelle and Coq, large problem libraries have been developed. However, these problems are formalised in the systems’ own formal language and syntax. Typically, a translation into the input formats of other systems is difficult. Therefore, we evaluated MathServe on a (relatively small) set of 45 higher-order conjectures about sets, relations and functions.
This problem set has been introduced in [Benzm¨uller et al., 2005] to evaluate a special version of Leo which closely cooperated with the first-order ATP system Bliksem. The problem set is presented in Section 8.1. In Section 8.2, we describe the approach of [Benzm¨uller et al., 2005] to tackle these problems. The services provided by Leo in MathServe are shown in Section 8.3. A specialised reasoning service for the expansion of definitions and set extensionality is introduced in Section 8.4. The composite services used in our evaluation are presented in Section 8.5. Results are discussed in Section 8.6. 160 Chapter 8. MathServe on Higher-Order Problems

8.1 A Set of Higher-Order Problems

In [Benzmüller et al., 2005], higher-order formalisations of 45 first-order TPTP Library problems about sets, relations and functions were presented. The selection of these problems was motivated by the work of Ganzinger and Stuber on using a specialised superposition calculus on the first-order counterparts of the problems [Ganzinger and Stuber, 2003]. Several of the first-order formalisations of the problems are considered challenging for first-order ATP systems. Five problems (e.g., SET066+1) have the TPTP Library rating 1.0, i.e. they are known theorems that have not yet been proven by any first-order system. The work in [Benzmüller et al., 2005] was inspired by the idea that some of these problems might be easier to solve when they are first formulated concisely in higher-order logic.

Table 8.1 shows the formalisations of the problems in a variant of Church's simply typed λ-calculus with prefix polymorphism¹. Compared to their first-order counterparts, the problem formalisations in Table 8.1 are very concise. This is due to the availability of λ-abstraction and the defined concepts shown in Table 8.2 (page 163). The problems' readability is further improved by using infix notation for the standard set relations.

The performance of the higher-order ATP system Leo on the problems in Table 8.1 has been presented in [Benzmüller et al., 2005]. With its standard strategy, Leo can solve 21 of the 45 problems with a time limit of 100 seconds. When using the standard strategy (STD), all occurrences of primitive equality are immediately replaced by Leibniz equality², and the use of extensionality reasoning is limited. In [Benzmüller et al., 2005], three other strategies of Leo are discussed which offer moderate extensionality reasoning (EXT), delayed expansion of primitive equality (EI), and delayed equality expansion combined with advanced recursive extensionality reasoning (EIR). Leo can solve 22 problems with the strategy EXT and 37 with EIR, but it cannot solve all 45 problems with any single strategy.

To provide a comparison, we ran the TPS system (manually) on the problems in Table 8.1. TPS used the standard mating search strategy (BASIC-MS04-2) and was also given a time limit of 100 seconds. The system could solve 39 problems, most of them in less than 200 milliseconds.

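Problem    Problem Formalisation in Simply Typed Lambda Calculus

SET014+4   $\forall X_{o\alpha}, Y_{o\alpha}, A_{o\alpha}\, [[X \subseteq A \wedge Y \subseteq A] \Rightarrow (X \cup Y) \subseteq A]$
SET017+1   $\forall x_\alpha, y_\alpha, z_\alpha\, [\mathit{UnOrderedPair}(x,y) = \mathit{UnOrderedPair}(x,z) \Rightarrow y = z]$
SET066+1   $\forall x_\alpha, y_\alpha\, [\mathit{UnOrderedPair}(x,y) = \mathit{UnOrderedPair}(y,x)]$
SET067+1   $\forall x_\alpha, y_\alpha\, [\mathit{UnOrderedPair}(x,x) \subseteq \mathit{UnOrderedPair}(x,y)]$
SET076+1   $\forall x_\alpha, y_\alpha\, \forall Z_{o\alpha}\; x \in Z \wedge y \in Z \Rightarrow \mathit{UnOrderedPair}(x,y) \subseteq Z$
SET086+1   $\forall x_\alpha\, \exists y_\alpha\, [y \in \mathit{Singleton}(x)]$
SET096+1   $\forall X_{o\alpha}, y_\alpha\, [X \subseteq \mathit{Singleton}(y) \Rightarrow [X = \emptyset \vee X = \mathit{Singleton}(y)]]$
SET143+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [(X \cap Y) \cap Z = X \cap (Y \cap Z)]$
SET171+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [X \cup (Y \cap Z) = (X \cup Y) \cap (X \cup Z)]$
SET580+3   $\forall X_{o\alpha}, Y_{o\alpha}, u_\alpha\, [u \in \mathit{ExclUnion}(X,Y) \Leftrightarrow [u \in X \Leftrightarrow u \notin Y]]$
SET601+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [(X \cap Y) \cup ((Y \cap Z) \cup (Z \cap X)) = (X \cup Y) \cap ((Y \cup Z) \cap (Z \cup X))]$
SET606+3   $\forall X_{o\alpha}, Y_{o\alpha}\, [X \setminus (X \cap Y) = X \setminus Y]$
SET607+3   $\forall X_{o\alpha}, Y_{o\alpha}\, [X \cup (Y \setminus X) = X \cup Y]$
SET609+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [X \setminus (Y \setminus Z) = (X \setminus Y) \cup (X \cap Z)]$
SET611+3   $\forall X_{o\alpha}, Y_{o\alpha}\, [X \cap Y = \emptyset \Leftrightarrow X \setminus Y = X]$
SET612+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [X \setminus (Y \cup Z) = (X \setminus Y) \cap (X \setminus Z)]$
SET614+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [(X \setminus Y) \setminus Z = X \setminus (Y \cup Z)]$
SET615+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [(X \cup Y) \setminus Z = (X \setminus Z) \cup (Y \setminus Z)]$
SET623+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [\mathit{ExclUnion}(\mathit{ExclUnion}(X,Y),Z) = \mathit{ExclUnion}(X,\mathit{ExclUnion}(Y,Z))]$
SET624+3   $\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [\mathit{Meets}(X,(Y \cup Z)) \Leftrightarrow [\mathit{Meets}(X,Y) \vee \mathit{Meets}(X,Z)]]$
SET630+3   $\forall X_{o\alpha}, Y_{o\alpha}\, [\mathit{Misses}(X \cap Y, \mathit{ExclUnion}(X,Y))]$
SET640+3   $\forall R_{o\beta\alpha}, Q_{o\beta\alpha}\, [\mathit{Subrel}(R,Q) \Rightarrow \mathit{Subrel}(R, (\lambda u_\alpha\, \top) \times (\lambda v_\beta\, \top))]$
SET646+3   $\forall x_\alpha, y_\beta\, [\mathit{Subrel}(\mathit{Pair}(x,y), (\lambda u_\alpha\, \top) \times (\lambda v_\beta\, \top))]$
SET647+3   $\forall R_{o\beta\alpha}, X_{o\alpha}\, [(\mathit{RDom}(R) \subseteq X) \Rightarrow \mathit{Subrel}(R, X \times \mathit{RCodom}(R))]$
SET648+3   $\forall R_{o\beta\alpha}, Y_{o\beta}\, [(\mathit{RCodom}(R) \subseteq Y) \Rightarrow \mathit{Subrel}(R, \mathit{RDom}(R) \times Y)]$
SET649+3   $\forall R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [[\mathit{RDom}(R) \subseteq X \wedge \mathit{RCodom}(R) \subseteq Y] \Rightarrow \mathit{Subrel}(R, X \times Y)]$
SET651+3   $\forall R_{o\beta\alpha}\, [\mathit{RDom}(R) \subseteq A_{o\alpha} \Rightarrow \mathit{Subrel}(R, A \times (\lambda u_\beta\, \top))]$
SET657+3   $\forall R_{o\beta\alpha}\, [\mathit{Field}(R) \subseteq ((\lambda u_\alpha\, \top) \cup (\lambda v_\beta\, \top))]$
SET669+3   $\forall R_{o\alpha\alpha}\, [\mathit{Subrel}(\mathit{Id}(\lambda u_\alpha\, \top), R) \Rightarrow [(\lambda u_\alpha\, \top) \subseteq \mathit{RDom}(R) \wedge (\lambda u_\alpha\, \top) = \mathit{RCodom}(R)]]$
SET670+3   $\forall Z_{o\alpha}, R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [\mathit{IsRelOn}(R,X,Y) \Rightarrow \mathit{IsRelOn}(\mathit{RestrictRDom}(R,Z),Z,Y)]$
SET671+3   $\forall Z_{o\alpha}, R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [[\mathit{IsRelOn}(R,X,Y) \wedge X \subseteq Z] \Rightarrow \mathit{RestrictRDom}(R,Z) = R]$
SET672+3   $\forall Z_{o\beta}, R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [\mathit{IsRelOn}(R,X,Y) \Rightarrow \mathit{IsRelOn}(\mathit{RestrictRCodom}(R,Z),X,Z)]$
SET673+3   $\forall Z_{o\beta}, R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [[\mathit{IsRelOn}(R,X,Y) \wedge Y \subseteq Z] \Rightarrow \mathit{RestrictRCodom}(R,Z) = R]$
SET680+3   $\forall R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [\mathit{IsRelOn}(R,X,Y) \Rightarrow [\forall u_\alpha\; u \in X \Rightarrow [u \in \mathit{RDom}(R) \Leftrightarrow \exists v_\beta\; v \in Y \wedge R(u,v)]]]$
SET683+3   $\forall R_{o\beta\alpha}, X_{o\alpha}, Y_{o\beta}\, [\mathit{IsRelOn}(R,X,Y) \Rightarrow [\forall v_\beta\; v \in Y \Rightarrow [v \in \mathit{RCodom}(R) \Rightarrow \exists u_\alpha\; u \in X \wedge u \in \mathit{RDom}(R)]]]$
SET684+3   $\forall P_{o\beta\alpha}, R_{o\gamma\beta}, x_\alpha, z_\gamma\, [\mathit{RelComp}(P,R)\,x\,z \Leftrightarrow \exists y_\beta\; P x y \wedge R y z]$
SET686+3   $\forall Z_{o\alpha}, R_{o\gamma\beta}, x_\alpha\, [x \in \mathit{InverseImageR}(R,Z) \Leftrightarrow \exists y_\alpha\; R x y \wedge x \in Z]$
SET716+4   $\forall F_{\beta\alpha}, G_{\gamma\beta}\, [[\mathit{Inj}(F) \wedge \mathit{Inj}(G)] \Rightarrow \mathit{Inj}(G \circ F)]$
SET724+4   $\forall F_{\beta\alpha}, G_{\gamma\beta}, H_{\gamma\beta}\, [[F \circ G = F \circ H \wedge \mathit{Surj}(F)] \Rightarrow G = H]$
SET741+4   $\forall F_{\beta\alpha}, G_{\gamma\beta}, H_{\alpha\gamma}\, [[\mathit{Inj}((F \circ G) \circ H) \wedge \mathit{Surj}((G \circ H) \circ F) \wedge \mathit{Surj}((H \circ F) \circ G)] \Rightarrow \mathit{Bij}(H)]$
SET747+4   $\forall F_{\beta\alpha}, G_{\gamma\beta}, \rhd^1_{o\alpha\alpha}, \rhd^2_{o\beta\beta}, \rhd^3_{o\gamma\gamma}\, [[\mathit{IncreasingF}(F,\rhd^1,\rhd^2) \wedge \mathit{DecreasingF}(G,\rhd^2,\rhd^3)] \Rightarrow \mathit{DecreasingF}(F \circ G,\rhd^1,\rhd^3)]$
SET752+4   $\forall X_{o\alpha}, Y_{o\alpha}, F_{\beta\alpha}\, [\mathit{ImageF}(F, X \cup Y) = \mathit{ImageF}(F,X) \cup \mathit{ImageF}(F,Y)]$
SET753+4   $\forall X_{o\alpha}, Y_{o\alpha}, F_{\beta\alpha}\, [\mathit{ImageF}(F, X \cap Y) \subseteq \mathit{ImageF}(F,X) \cap \mathit{ImageF}(F,Y)]$
SET764+4   $\forall F_{\beta\alpha}\, [\mathit{InverseImageF}(F, \emptyset) = \emptyset]$
SET770+4   $\forall R_{o\beta\alpha}, Q_{o\beta\alpha}\, [[\mathit{EquivRel}(R) \wedge \mathit{EquivRel}(Q)] \Rightarrow [\mathit{EquivClasses}(R) = \mathit{EquivClasses}(Q) \vee \mathit{Disjoint}(\mathit{EquivClasses}(R), \mathit{EquivClasses}(Q))]]$

Table 8.1: Higher-order formalisations of 45 TPTP Library problems on sets, relations and functions (Source [Benzmüller et al., 2005])

Extensionality Reasoning. Extensionality reasoning plays a key role in higher-order theorem proving in general. This is why the strategies of the Leo system differ significantly in the way extensionality reasoning is supported. There are separate axioms for Boolean³, functional⁴, and set extensionality. Relevant for the problems in Table 8.1 is good support for set extensionality, such as that provided by Leo's EIR strategy. The set extensionality axiom states that two sets are equal if they have the same elements. Using the definition of set membership provided in Table 8.2, the axiom can be written as

$\forall A_{o\alpha}, B_{o\alpha}\, [A = B \Leftrightarrow [\forall x_\alpha\; x \in A \Leftrightarrow x \in B]]$. (8.1)

¹ This language, also named POST, forms the logical basis of the Ωmega proof assistant and the TRAMP system. See also Section 4.3.2.
² Leibniz equality states that two objects are equal iff they have the same properties. For some type α it is defined as $=_{o\alpha\alpha} := \lambda x_\alpha \lambda y_\alpha\, [\forall P_{o\alpha}\; P x \Rightarrow P y]$.
³ Two Boolean values are equal if they are equivalent.
⁴ Two functions are equal if they deliver the same value on all arguments.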
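As a cross-check of axiom (8.1), the statement can be written down in Lean 4 when sets over a type α are modelled as predicates α → Prop, which mirrors the definition of ∈ in Table 8.2. This is only a sketch under that modelling assumption: the names set_ext and inter are chosen here for illustration and do not come from Leo or Ωmega, and Lean's built-in funext and propext play the roles of functional and Boolean extensionality. The example at the end is problem SET143+3 from Table 8.1.

```lean
-- Sets over α are modelled as predicates α → Prop (an assumption of this sketch).
-- Membership x ∈ A is then simply the application A x.

-- Set extensionality (8.1): two sets are equal iff they have the same elements.
theorem set_ext {α : Type} (A B : α → Prop) :
    A = B ↔ ∀ x : α, A x ↔ B x :=
  ⟨fun h x => by subst h; exact Iff.rfl,
   fun h => funext fun x => propext (h x)⟩

-- Intersection, following the definition of ∩ in Table 8.2.
-- `abbrev` keeps the definition transparent so that it unfolds automatically.
abbrev inter {α : Type} (A B : α → Prop) : α → Prop :=
  fun x => A x ∧ B x

-- SET143+3: intersection is associative. After applying set extensionality,
-- only a propositional re-bracketing of conjunctions remains.
example {α : Type} (X Y Z : α → Prop) :
    inter (inter X Y) Z = inter X (inter Y Z) :=
  (set_ext _ _).mpr fun _ =>
    ⟨fun ⟨⟨hx, hy⟩, hz⟩ => ⟨hx, hy, hz⟩,
     fun ⟨hx, hy, hz⟩ => ⟨⟨hx, hy⟩, hz⟩⟩
```

The only genuinely higher-order step in the example is the use of set_ext; the remaining obligation is propositional, which is the kind of residue a first-order prover handles well.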

8.2 Leo and First-Order ATP Systems.

In [Benzmüller et al., 2005] it is shown that a close cooperation between Leo and the first-order ATP system Bliksem can solve 43 of the 45 problems in Table 8.1 using only one strategy and less time than Leo on its own. The cooperation between Leo and Bliksem is realised with the help of the Ω-Ants system (see Section 2.3.2) which is part of the Ωmega system.

The idea of combining Leo with the first-order system Bliksem stems from the observation that, often, higher-order problems require only a few but essential higher-order reasoning steps while the major part of the reasoning consists of first-order or even propositional steps. In the approach of [Benzmüller et al., 2005], Leo repeatedly sends all first-order-like clauses to the Bliksem system which is then asked to look for a refutation proof. If Bliksem succeeds, the theorem is proved. In this approach, first-order-like clauses do not contain any real higher-order terms, such as λ-abstraction or embedded equations, and, after a simple translation, they can be treated by first-order ATP systems (cf. Section 4.3.2)⁵.

Currently, a rational reconstruction of the approach described in [Benzmüller et al., 2005] is not possible in MathServe. This is mainly due to the fact that MathServe does not support stateful Web Services which are needed to model an iterative communication between Leo and a first-order ATP system⁶. In what follows, we pursue a different and much simpler approach in which the service provided by Leo is combined sequentially with a definition expansion service, a higher-order to first-order translation service, and first-order ATP services. In the following sections we will describe the components of this composite service.

8.3 The LEO Services

We integrated the Leo system in MathServe and described two higher-order ATP services, one for Leo using the standard strategy (STD) and one for Leo using the strategy EIR. Currently, the reasoning services accept theorem proving problems and the corresponding formal theories in POST syntax. Thus, next to the class PostProvingProblem introduced in Section 4.3.2, we also defined the class of theories formalised in POST:

{mw#PostTheory = mw#HoTheory ⊓ ∀ mw#inLogic. mw#SimpTypLamCalc ⊓ ∀ mw#language. mw#POST }

The OWL-S service profile of the Leo service with standard strategy looks as follows:

⁵ Note that a higher-order to first-order translation using λ-combinators could also translate arbitrary λ-terms (see Section 8.7).
⁶ We will discuss possible ways to overcome this limitation in Section 10.1.3.

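Defined Concepts for Typed Sets
$\in := \lambda x_\alpha, A_{o\alpha}\, [A\,x]$
$\emptyset := [\lambda x_\alpha\, \bot]$
$\subseteq := \lambda A_{o\alpha}, B_{o\alpha}\, [\forall x_\alpha\; x \in A \Rightarrow x \in B]$
$\cup := \lambda A_{o\alpha}, B_{o\alpha}\, [\lambda x_\alpha\; x \in A \vee x \in B]$
$\cap := \lambda A_{o\alpha}, B_{o\alpha}\, [\lambda x_\alpha\; x \in A \wedge x \in B]$
$\overline{\,\cdot\,} := \lambda A_{o\alpha}\, [\lambda x_\alpha\; x \notin A]$
$\setminus := \lambda A_{o\alpha}, B_{o\alpha}\, [\lambda x_\alpha\; x \in A \wedge x \notin B]$
$\mathit{ExclUnion}(\_,\_) := \lambda A_{o\alpha}, B_{o\alpha}\, [(A \setminus B) \cup (B \setminus A)]$
$\mathit{Disjoint}(\_,\_) := \lambda A_{o\alpha}, B_{o\alpha}\, [A \cap B = \emptyset]$
$\mathit{Meets}(\_,\_) := \lambda A_{o\alpha}, B_{o\alpha}\, [\exists x_\alpha\; x \in A \wedge x \in B]$
$\mathit{Misses}(\_,\_) := \lambda A_{o\alpha}, B_{o\alpha}\, [\neg\exists x_\alpha\; x \in A \wedge x \in B]$

Defined Concepts for Relations
$\mathit{UnOrderedPair}(\_,\_) := \lambda x_\alpha, y_\alpha\, [\lambda u_\alpha\; u = x \vee u = y]$
$\mathit{Singleton}(\_) := \lambda x_\alpha\, [\lambda u_\alpha\; u = x]$
$\mathit{Pair}(\_,\_) := \lambda x_\alpha, y_\beta\, [\lambda u_\alpha, v_\beta\; u = x \wedge v = y]$
$\times := \lambda A_{o\alpha}, B_{o\beta}\, [\lambda u_\alpha, v_\beta\; u \in A \wedge v \in B]$
$\mathit{RDom}(\_) := \lambda R_{o\beta\alpha}\, [\lambda x_\alpha\, \exists y_\beta\; R x y]$
$\mathit{RCodom}(\_) := \lambda R_{o\beta\alpha}\, [\lambda y_\beta\, \exists x_\alpha\; R x y]$
$\mathit{Subrel}(\_,\_) := \lambda R_{o\beta\alpha}, Q_{o\beta\alpha}\, [\forall x_\alpha\, \forall y_\alpha\; R x y \Rightarrow Q x y]$
$\mathit{Id}(\_) := \lambda A_{o\alpha}\, [\lambda x_\alpha, y_\alpha\; x \in A \wedge x = y]$
$\mathit{Field}(\_) := \lambda R_{o\beta\alpha}\, [\mathit{RDom}(R) \cup \mathit{RCodom}(R)]$
$\mathit{IsRelOn}(\_,\_,\_) := \lambda R_{o\beta\alpha}, A_{o\alpha}\, \lambda B_{o\beta}\, [\forall x_\alpha, y_\beta\; R x y \Rightarrow (x \in A \wedge y \in B)]$
$\mathit{RestrictRCodom}(\_,\_) := \lambda R_{o\beta\alpha}, A_{o\alpha}\, [\lambda x_\alpha, y_\beta\; x \in A \wedge R x y]$
$\mathit{RelComp}(\_,\_) := \lambda R_{o\beta\alpha}, Q_{o\gamma\beta}\, [\lambda x_\alpha, z_\gamma\, \exists y_\beta\; R x y \wedge Q y z]$
$\mathit{InverseImageR}(\_,\_) := \lambda R_{o\beta\alpha}, B_{o\beta}\, [\lambda x_\alpha\, \exists y_\beta\; y \in B \wedge R x y]$
$\mathit{Reflexive}(\_) := \lambda R_{o\beta\alpha}\, [\forall x_\alpha\; R x x]$
$\mathit{Symmetric}(\_) := \lambda R_{o\beta\alpha}\, [\forall x_\alpha\, \forall y_\alpha\; R x y \Rightarrow R y x]$
$\mathit{Transitive}(\_) := \lambda R_{o\beta\alpha}\, [\forall x_\alpha\, \forall y_\alpha\, \forall z_\alpha\; R x y \wedge R y z \Rightarrow R x z]$
$\mathit{EquivRel}(\_) := \lambda R_{o\beta\alpha}\, [\mathit{Reflexive}(R) \wedge \mathit{Symmetric}(R) \wedge \mathit{Transitive}(R)]$
$\mathit{EquivClasses}(\_) := \lambda R_{o\alpha\alpha}\, [\lambda A_{o\alpha}\, \exists u_\alpha\; u \in A \wedge \forall v_\alpha\; v \in A \Leftrightarrow R u v]$

Defined Concepts for Functions
$\mathit{Inj}(\_) := \lambda F_{\beta\alpha}\, [\forall x_\alpha, y_\beta\; F(x) = F(y) \Rightarrow x = y]$
$\mathit{Surj}(\_) := \lambda F_{\beta\alpha}\, [\forall y_\beta\, \exists x_\alpha\; y = F(x)]$
$\mathit{Bij}(\_) := \lambda F_{\beta\alpha}\; \mathit{Surj}(F) \wedge \mathit{Inj}(F)$
$\mathit{ImageF}(\_,\_) := \lambda F_{\beta\alpha}, A_{o\alpha}\, [\lambda y_\beta\, \exists x_\alpha\; x \in A \wedge y = F(x)]$
$\mathit{InverseImageF}(\_,\_) := \lambda F_{\beta\alpha}, B_{o\beta}\, [\lambda x_\alpha\, \exists y_\beta\; y \in B \wedge y = F(x)]$
$\circ := \lambda F_{\beta\alpha}, G_{\gamma\beta}\, [\lambda x_\alpha\; G(F(x))]$
$\mathit{IncreasingF}(\_,\_,\_) := \lambda F_{\beta\alpha}, \rhd^1_{o\alpha\alpha}, \rhd^2_{o\beta\beta}\, [\forall x_\alpha, y_\alpha\; x \rhd^1 y \Rightarrow F(x) \rhd^2 F(y)]$
$\mathit{DecreasingF}(\_,\_,\_) := \lambda F_{\beta\alpha}, \rhd^1_{o\alpha\alpha}, \rhd^2_{o\beta\beta}\, [\forall x_\alpha, y_\alpha\; x \rhd^1 y \Rightarrow F(y) \rhd^2 F(x)]$

Table 8.2: Definitions of the concepts used in the problem formalisations in Table 8.1 (Source [Benzmüller et al., 2005])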

profile mw#LeoStdATP:
    inputs:    hof_problem :: mw#PostProvingProblem
               hof_theory :: mw#PostTheory
    outputs:   atp_result :: mw#HoAtpResult
    preconds:
    effects:   resultFor(atp_result, hof_problem) ∧
               (status(atp_result, stat#Theorem) ∨
                status(atp_result, stat#Unknown) ∨
                status(atp_result, stat#Timeout))
    categs:
    params:
    mw = http://www.mathweb.org/owl/mathserve.owl

Similarly, the service LeoEirATP uses Leo with the strategy EIR. The profile of the service LeoEirATP differs from LeoStdATP only by its name. The service profiles do not contain any performance data such as that available for first-order ATP services. So far, we have not identified criteria which could be used to specify the expertise of the two services. On the one hand, syntactic features which suggest advanced extensionality reasoning have not been studied yet. On the other hand, large libraries of higher-order logic problems, comparable to the TPTP Library for first-order logic, do not yet exist.

It is worth mentioning that, compared to a standalone Leo system, the integration of Leo as a stateless reasoning service produces a runtime overhead of approximately 15 seconds. The additional time is needed for starting a Lisp process, initialising the Leo system, loading a problem description, and starting the proof search. A similar overhead would occur if the TPS system were integrated as a stateless reasoning service in MathServe.

8.4 A Definition and Extensionality Expansion Service

Most of the reasoning services in Chapter 4 were general-purpose services which typically accept any reasoning problem described in a certain logic. In this section, we introduce a special-purpose service for the expansion of definitions and set extensionality. The service is based on the observation that for many problems in Table 8.1 a complete expansion of all definitions results in a single quantified equation involving two sets. By applying the set extensionality axiom (8.1), this equation can be transformed into a first-order-like problem. We illustrate the necessary reasoning steps in the Natural Deduction calculus [Gentzen, 1935]⁷. As an example, we show the reasoning steps for problem SET143+3 which states that set intersection is associative:

$\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [(X \cap Y) \cap Z = X \cap (Y \cap Z)]$

By replacing the symbol for set intersection by its definition and by performing β-reduction we obtain a universally quantified set equation:

$\forall X_{o\alpha}, Y_{o\alpha}, Z_{o\alpha}\, [\lambda a_\alpha\, [[X(a) \wedge Y(a)] \wedge Z(a)] = \lambda b_\alpha\, [X(b) \wedge [Y(b) \wedge Z(b)]]]$.

An elimination of the leading universal quantifiers leads to

$\lambda a_\alpha\, [[X(a) \wedge Y(a)] \wedge Z(a)] = \lambda b_\alpha\, [X(b) \wedge [Y(b) \wedge Z(b)]]$.

Now the set extensionality axiom (8.1) can be applied to obtain the equivalence

$\forall x_\alpha\, [x \in \lambda a_\alpha\, [[X(a) \wedge Y(a)] \wedge Z(a)] \Leftrightarrow x \in \lambda b_\alpha\, [X(b) \wedge [Y(b) \wedge Z(b)]]]$.

This formula still contains the symbol ∈ for set membership. However, an expansion of the definition of ∈ with subsequent β-reduction finally leads to the first-order-like formula

$\forall x_\alpha\, [[X(x) \wedge Y(x)] \wedge Z(x)] \Leftrightarrow [X(x) \wedge [Y(x) \wedge Z(x)]]$.

After a translation into first-order logic (see Section 4.3.2) this formula can be easily proved by one of the first-order ATP services available in MathServe.

Within the proof assistant Ωmega, we combined the four inference steps illustrated above in the tactic DefAndSetExtExpand shown in Figure 8.1. Given a formula ϕ and the underlying theory T, the tactic first tries to expand all defined symbols in ϕ with definitions in T (tactic Defs-I). Then leading universal quantifications are removed (Forall-I*). If the resulting formula is an equation on sets, i.e. a formula of the form $A_{o\alpha} = B_{o\alpha}$, then the set extensionality axiom is applied (SetExtContract). Otherwise the tactic fails. Finally, newly introduced symbols (e.g., ∈ for set membership) are replaced by their definitions and the resulting formula is returned. The soundness of the tactic DefAndSetExtExpand is guaranteed by the soundness of its sub-tactics (e.g., Defs-I and Forall-I*) and of the Ωmega proof assistant.

    tactic DefAndSetExtExpand(ϕ, T)
        ϕ1 = Defs-I(ϕ, T) ;
        ϕ2 = Forall-I*(ϕ1) ;
        if isSetEquation(ϕ2) then
            ϕ3 = SetExtContract(ϕ2) ;
            return Defs-I(ϕ3, T) ;
        else
            fail ;
        fi
    end

Figure 8.1: An Ωmega tactic expanding definitions and applying the set extensionality axiom

⁷ The Natural Deduction calculus is the base calculus of Ωmega and TRAMP.

We made the functionality of the tactic available in MathServe as a homonymous special-purpose reasoning service. The OWL-S profile of this service looks as follows:

profile mw#DefAndSetExtExpand:
    inputs:    post_problem :: mw#PostProvingProblem
               post_theory :: mw#PostTheory
    outputs:   exp_problem :: mw#PostProvingProblem
    preconds:  inTheory(post_problem, post_theory)
    effects:   equivalent(exp_problem, post_problem)
    categs:
    params:
    mw = http://www.mathweb.org/owl/mathserve.owl
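The effect of this service on a problem such as SET143+3 can be imitated in a few lines of Python. The sketch below is not the Ωmega tactic: it assumes a toy encoding in which a set is a Python function from an element term to a formula (nested tuples), so that applying the function performs definition expansion and β-reduction in one step. All function names are hypothetical, and the brute-force truth-table check at the end merely stands in for a first-order ATP service.

```python
from itertools import product

# Formulas are nested tuples; sets are Python functions from an element
# term to a formula (a higher-order abstract syntax encoding, our choice).

def atom_set(name):
    """A free set variable such as X, viewed as the predicate x -> X(x)."""
    return lambda x: (name, x)

def inter(a, b):
    """Intersection as defined in Table 8.2: λx. x ∈ A ∧ x ∈ B."""
    return lambda x: ("and", a(x), b(x))

def union(a, b):
    """Union as defined in Table 8.2: λx. x ∈ A ∨ x ∈ B."""
    return lambda x: ("or", a(x), b(x))

def expand_set_equation(lhs, rhs, var="x"):
    """Apply set extensionality to lhs = rhs and expand membership.

    Calling lhs(var) expands the definitions and β-reduces at once, so the
    result is a first-order-like formula.
    """
    return ("forall", var, ("iff", lhs(var), rhs(var)))

def atoms(formula):
    """Collect the atoms of a quantifier-free formula."""
    if formula[0] in ("and", "or", "iff"):
        return atoms(formula[1]) | atoms(formula[2])
    return {formula}

def holds(formula, valuation):
    """Evaluate a quantifier-free formula under a truth assignment to atoms."""
    op = formula[0]
    if op == "and":
        return holds(formula[1], valuation) and holds(formula[2], valuation)
    if op == "or":
        return holds(formula[1], valuation) or holds(formula[2], valuation)
    if op == "iff":
        return holds(formula[1], valuation) == holds(formula[2], valuation)
    return valuation[formula]  # an atom such as ("X", "x")

def is_valid(expanded):
    """Truth-table check of the body of ∀x. body (stand-in for an ATP call)."""
    _, _, body = expanded
    names = sorted(atoms(body))
    return all(
        holds(body, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )

X, Y, Z = atom_set("X"), atom_set("Y"), atom_set("Z")
set143 = expand_set_equation(inter(inter(X, Y), Z), inter(X, inter(Y, Z)))
```

Here set143 is the tuple form of the first-order-like formula derived in Section 8.4, and is_valid(set143) confirms that it is a propositional tautology in the atoms X(x), Y(x) and Z(x). The truth-table check is only adequate because every atom mentions the same single variable; general first-order-like problems need a real prover.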

8.5 Two Composite Services

The theorem proving problem returned by the service DefAndSetExtExpand is still formulated in the higher-order language POST. However, if the resulting problem is first-order-like, it can be translated to an equivalent problem in first-order predicate logic using the service TrampHo2Fo introduced in Section 4.3.2. The resulting first-order problem can be given to any first-order ATP service or the optimal theorem proving policy ResultQueryProcOpt presented in Section 6.6.

Motivated by the work presented in [Benzmüller et al., 2005] and [Benzmüller et al., 2001], we defined two composite services, each combining one of the Leo services with the services DefAndSetExtExpand and TrampHo2Fo. The Golog procedure representing the composite reasoning service for LeoStdATP is the following:

    proc LeoStdEnhanced (post_problem, post_theory, result)
        LeoStdATP (post_problem, post_theory, time_res, leo_result) ;
        if status (leo_result, stat#Theorem) then
            bind (result, leo_result)
        else
            DefAndSetExtExpand (post_problem, post_theory, new_problem) ;
            TrampHo2Fo (new_problem, cnf_problem) ;
            ResultQueryProcOpt (cnf_problem, result)
        endIf
    endProc

The composite service first calls the atomic service LeoStdATP. If Leo finds a proof, the result of LeoStdATP is returned as the overall result. Otherwise, the input problem is transformed using DefAndSetExtExpand and TrampHo2Fo. If both service invocations succeed⁸, the resulting problem is given to the optimal policy for first-order problems. It is worth mentioning that this composite reasoning service cannot be generated automatically by the service composition procedure shown in Chapter 6. This is due to the fact that the planner used by MathServe's composition procedure cannot generate conditional plans.

The Golog procedure LeoStdEnhanced could be given to MathServe's decision-theoretic reasoner to compute an optimal policy. However, the atomic services used in the procedure are all deterministic and the procedure ResultQueryProcOpt has already been optimised with respect to the performance of first-order ATP systems. Therefore, re-optimising the procedure does not increase its performance. But one could imagine replacing the call of ResultQueryProcOpt by a non-deterministic choice of several ATP systems that perform particularly well on the problems created by DefAndSetExtExpand and TrampHo2Fo. The resulting procedure could then be optimised to obtain a new optimal policy.

The second composite service LeoEirEnhanced differs from LeoStdEnhanced only in that it calls LeoEirATP instead of LeoStdATP.
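The branching of the Golog procedure can be mirrored by an ordinary Python function, which may help readers unfamiliar with Golog. This is a sketch only: the four services are passed in as plain callables, which are hypothetical stand-ins and not the MathServe API, and the choice to return Leo's own status when a transformation step fails is an assumption of the sketch rather than documented behaviour.

```python
from typing import Callable, Optional

THEOREM = "Theorem"  # stands for the status stat#Theorem

def leo_enhanced(
    problem: str,
    theory: str,
    leo_atp: Callable[[str, str], str],
    expand: Callable[[str, str], Optional[str]],
    ho2fo: Callable[[str], Optional[str]],
    fo_policy: Callable[[str], str],
) -> str:
    """Mirror the control flow of LeoStdEnhanced / LeoEirEnhanced.

    leo_atp   plays the role of LeoStdATP or LeoEirATP,
    expand    the role of DefAndSetExtExpand (None signals failure),
    ho2fo     the role of TrampHo2Fo (None signals failure),
    fo_policy the role of ResultQueryProcOpt.
    """
    status = leo_atp(problem, theory)
    if status == THEOREM:
        return status                    # Leo alone found a proof
    expanded = expand(problem, theory)   # expand definitions, set extensionality
    if expanded is None:
        return status                    # assumption: fall back to Leo's status
    fo_problem = ho2fo(expanded)         # only works for first-order-like input
    if fo_problem is None:
        return status                    # assumption: fall back to Leo's status
    return fo_policy(fo_problem)         # hand over to the first-order policy
```

Passing a different callable for leo_atp yields the second composite service, which matches the remark that the two procedures differ only in the Leo service they call.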

8.6 Evaluation of Composite Services

In our experiments, the services LeoStdATP and LeoEirATP were compared with the composite services LeoStdEnhanced and LeoEirEnhanced, respectively. The number of problems solved by the services was counted, and the CPU time needed to deliver a result was measured. The atomic Leo services were directly invoked with the POST representation of the conjectures, the corresponding background theory, and with a time limit of 100 seconds. The composite services were given to the MathServe broker together with a query containing the conjecture (PostProvingProblem) at hand (cf. Section 5.2.2). The query for the problem SET066+1, for instance, is shown in Figure 8.4.

⁸ The service TrampHo2Fo fails if the input problem is not first-order-like.

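[Figure 8.2, two panels. (a) Bar chart of the number of problems solved: LeoStdATP 21, LeoStdEnhanced 36. (b) Scatter plot of CPU times per problem, with LeoStdATP on the horizontal axis and LeoStdEnhanced on the vertical axis, both in units of 10^4 ms; ∞ marks failed attempts.]

Figure 8.2: (a) Number of problems solved and (b) times used by LeoStdATP and LeoStdEnhanced for the problems in Table 8.1. Failed proving attempts are indicated with infinite time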

The results of our evaluation are summarised in Figure 8.2 and Figure 8.3⁹. The detailed results are shown in Table 8.3 (page 170).

Figure 8.2 (a) shows that the composite service LeoStdEnhanced could solve 36 problems, i.e. 15 problems more than LeoStdATP. The CPU times used by the services are compared in the scatter graph in Figure 8.2 (b). Failed proof attempts are indicated with infinite time. The CPU times used by LeoStdATP and LeoStdEnhanced for problems solved by both services are roughly the same.

The results for the services LeoEirATP and LeoEirEnhanced are presented in Figure 8.3. The composite service LeoEirEnhanced solved 41 problems, i.e. four problems more than the atomic service LeoEirATP (cf. Figure 8.3 (a)). Of particular interest is the problem SET624+3 (cf. Table 8.3) for which the service LeoEirATP took 88.7 seconds to find a proof. When evaluating the composite service LeoEirEnhanced, MathServe's Golog interpreter first invokes the service LeoEirATP with a time limit of 50 seconds within which the service cannot find a proof. However, the problem can be translated to first-order logic and solved by the service VampireATP within 18.3 seconds. Thus, with an overall CPU time of 68.3 seconds, the problem can be solved faster than with the standalone Leo service.

[Figure 8.3, two panels. (a) Bar chart of the number of problems solved: LeoEirATP 37, LeoEirEnhanced 41. (b) Scatter plot of CPU times per problem, with LeoEirATP on the horizontal axis and LeoEirEnhanced on the vertical axis, both in units of 10^4 ms; ∞ marks failed attempts.]

Figure 8.3: (a) Number of problems solved and (b) times used by LeoEirATP and LeoEirEnhanced for the problems in Table 8.1. Failed proving attempts are indicated with infinite time

In the case of the problem SET076+1 both LeoStdATP and LeoEirATP report a failed proving attempt already after approximately 16 seconds of CPU time. But the services DefAndSetExtExpand, TrampHo2Fo, and ResultQueryProcOpt can prove the conjecture within one second.

The first-order ATP problems created by the service TrampHo2Fo belonged to the SPCs CNF_NKS_EPR, CNF_NKS_RFO_PEQ_NUE and CNF_NKS_RFO_SEQ_NHN, for which the services of the ATP systems Vampire 8.0 and E 0.91 were chosen.

The times used by the Leo services shown in Table 8.3 are not directly comparable to the times presented in [Benzmüller et al., 2005], for two reasons. First, the times provided in Table 8.3 include the time needed to start a Lisp process, load a problem description, translate the problem to higher-order CNF, and start the prover (approximately 15 seconds), while in [Benzmüller et al., 2005] only the CPU time of Leo's central proving loop is presented. Second, our experiments were conducted on hardware with a lower performance than the hardware used in [Benzmüller et al., 2005].

⁹ The MathServe broker and all services ran on an Intel Pentium IV Linux machine with 1GB RAM.

8.7 Summary

In Chapters 4 to 7, we only presented reasoning services for classical first-order logic with equality or less expressive languages. This was a natural starting point because most automated reasoning systems are built for propositional logic, first-order logic, or decidable sublanguages of the latter. However, MathServe is not restricted to the services provided by these reasoning systems. In this chapter, we showed that reasoning services for higher-order logic can also be modelled in MathServe.

query mw#ProverResultQuery:
    inputs:    post_problem :: mw#PostProvingProblem = ⋆omega#SET066+1
               post_theory :: mw#PostTheory = ⋆omega#relation
    outputs:   result :: mw#ProverResult
    preconds:
    effects:   resultFor(result, post_problem)
    categs:
    timeout:   100 secs
    mw = http://www.mathweb.org/owl/mathserve.owl
    omega = http://www.mathweb.org/omega .

Figure 8.4: A query profile containing a higher-order theorem proving problem

We evaluated MathServe on a set of 45 higher-order conjectures on sets, relations and functions. We modelled two instances of the higher-order ATP system Leo as services in MathServe. We described two composite services which extend Leo's services with a service for the expansion of definitions and set extensionality, a higher-order to first-order translation service, and the composite first-order theorem proving service presented in Chapters 6 and 7. We showed that these composite services can prove more problems than the atomic services they are composed of. In particular, the service LeoEirEnhanced could solve 41 of the 45 problems, four problems more than the atomic Leo service. Moreover, the composite services typically used less than one second more CPU time than the atomic services. In one case, the composite service could prove a conjecture significantly faster (20 seconds less) than the corresponding Leo ATP service.

The higher-order ATP system TPS can solve 39 problems in very little time. However, TPS still requires user interaction to prove theorems and an integration of TPS into MathServe would require major modifications to the system. If a reasoning service for TPS had been available and if that service had been enhanced with the method presented in Section 8.5, 43 problems could have been solved by the enhanced service.

The evaluation of MathServe on higher-order conjectures was hampered by two issues. Firstly, only a few automated reasoning systems for higher-order logic are available for comparison. Secondly, large libraries of higher-order conjectures formalised in a standardised formal language are not yet available. It is our hope that the ongoing development of a higher-order variant of the TPTP Library will help to overcome the latter problem.
The higher-order to first-order translation of MathServe is simple and only works for first-order-like formulae. Other translation mechanisms have been discussed in the literature. Of particular interest is the translation with λ-combinators presented in [Hurd, 2002] and [Meng and Paulson, 2004] which can translate arbitrary λ-terms. We are planning to offer this translation as a MathServe service. With such a service available, a comparison between the different translations could be performed.

Problem     LeoStdATP       LeoStdEnhanced  LeoEirATP       LeoEirEnhanced
            Solved  Time    Solved  Time    Solved  Time    Solved  Time
SET014+4    √       18279   √       21087   √       21129   √       22416
SET017+1    -       -       -       -       -       -       -       -
SET066+1    -       -       √       17364   -       -       √       70802
SET067+1    √       15316   √       15601   √       15571   √       15753
SET076+1    -       -       √       17063   -       -       √       17598
SET086+1    √       15090   √       15233   √       15214   √       15384
SET096+1    -       -       -       -       -       -       -       -
SET143+3    -       -       √       17619   √       16207   √       16312
SET171+3    -       -       √       17810   √       16088   √       16292
SET580+3    √       16066   √       16282   √       16183   √       17387
SET601+3    -       -       √       19448   √       18110   √       18388
SET606+3    -       -       √       17634   √       16065   √       16420
SET607+3    -       -       √       17323   √       15898   √       16046
SET609+3    -       -       √       18558   √       16793   √       17850
SET611+3    -       -       -       -       √       27927   √       31253
SET612+3    -       -       √       18649   √       16707   √       16992
SET614+3    -       -       √       18506   √       16473   √       16713
SET615+3    -       -       √       18680   √       16670   √       16677
SET623+3    -       -       √       25378   √       24775   √       24991
SET624+3    √       49289   √       49867   √       88719   √       68298
SET630+3    √       16099   √       16126   √       16135   √       17216
SET640+3    √       15667   √       15699   √       15746   √       15816
SET646+3    √       15990   √       16002   √       16049   √       16133
SET647+3    √       15769   √       15879   √       15893   √       16110
SET648+3    √       15832   √       15885   √       15888   √       16046
SET649+3    √       16273   √       16296   √       16325   √       16445
SET651+3    √       15534   √       15655   √       15646   √       15772
SET657+3    √       15505   √       15646   √       15597   √       15720
SET669+3    -       -       -       -       √       16286   √       17111
SET670+3    √       15982   √       16011   √       16013   √       16146
SET671+3    -       -       -       -       √       16399   √       16523
SET672+3    √       15961   √       16085   √       16155   √       17134
SET673+3    -       -       -       -       √       16551   √       16573
SET680+3    √       16836   √       17018   √       17192   √       17250
SET683+3    √       16284   √       16477   √       16391   √       17295
SET684+3    √       17144   √       17226   √       18454   √       17912
SET686+3    √       17005   √       17016   √       17666   √       17761
SET716+4    √       16229   √       16323   √       16616   √       16633
SET724+4    -       -       -       -       -       -       -       -
SET741+4    -       -       √       20884   √       19502   √       19267
SET747+4    √       16904   √       17101   √       17351   √       17238
SET752+4    -       -       √       19160   -       -       √       68963
SET753+4    -       -       √       18099   -       -       √       78558
SET764+4    -       -       -       -       √       15467   √       15598
SET770+4    -       -       -       -       -       -       -       -

Table 8.3: Experimental data for the problems shown in Table 8.1. All times denote CPU time in milliseconds

Part IV

Conclusion

Chapter 9

Conclusions and Related Work

In this thesis we presented the MathServe framework which provides the means to describe automated reasoning systems and related systems as Semantic Web Services. MathServe currently offers the services of several first-order ATP systems, one higher- order ATP system, finite model generators, and decision procedures. Furthermore, MathServe integrates systems for problem and proof transformation. Reasoning services in MathServe are accessible as Web Services. The semantics of these services is described in the OWL-S upper ontology. MathServe provides an OWL- DL domain ontology which defines the classes and properties necessary to describe reasoning services. Data about the performance of reasoning services is provided in OWL-S service profiles as conditional probabilistic effects. This data can be used to select suitable services for reasoning problems. The MathServe broker is a specialised middle agent which provides service match- making and composition facilities to service requesters. Service matchmaking is per- formed with the help of the Description Logic reasoner Pellet. Service composition combines classical planning in PRODIGY with decision-theoretic reasoning in DT- Golog. An offline DTGolog interpreter takes the conditional probabilistic effects of OWL-S services into account and computes an optimal policy. The policy chooses rea- soning services optimally according to problem features determined by other services. Composite reasoning services are represented as Golog procedures and are human read- able. It is easy to write new composite services in Golog. MathServe participated in the demonstration division of the CADE ATP System Competitions CASC-20 and CASC-J3. This evaluation demonstrated that the system performs well in a competition environment. At CASC-J3, MathServe could solve more problems than the best standalone ATP system. 
The framework could have solved even more problems if it had been given some information about what to achieve for the competition problems. For example, MathServe could have chosen the specialised ATP systems DCTP or Paradox for problems of the SAT division, if it had been told to prove satisfiability. However, due to the design of the competition this information is not provided to the participants. We also evaluated the performance of MathServe on 45 higher-order problems on sets, relations and functions formalised in a variant of Church’s simply typed λ-calculus. We compared two instances of the higher-order ATP system Leo with two composite services which extend Leo’s services with a service for the expansion of definitions and 174 Chapter 9. Conclusions and Related Work set extensionality, a higher-order to first-order translation service, and the composite first-order theorem proving service used in the CASC evaluation. We showed that the composite services could prove more problems than the atomic services they are composed of. Moreover, the composite services typically used only a fraction of a second more CPU time than the atomic services. In one case, the composite service could prove a conjecture significantly faster (20 seconds less) than the corresponding Leo ATP service. This evaluation showed that the composition of reasoning services can significantly increase the success rate with respect to number of solved problems. It also showed that the time consumption of composite services is not necessarily much higher than the time used by atomic reasoning services because of the distribution of time resources performed by the MathServe broker. An easy to install binary distribution of MathServe is available for download. So far, the system has been used by the projects Hets, Ωmega and VeriFun to prove conjectures in first-order logic, or to transform resolution proofs into ND calculus. MathServe is inspired by the vision of a Semantic Mathematical Web. 
It complements the work of other projects which offer services for the semantic retrieval of mathematical content (MBase and HELM), or develop languages for a semantic markup of mathematical content (OMDoc and OpenMath). Some of the ideas behind MathServe have been investigated by other projects. In the following sections we describe the differences and similarities between MathServe and those projects.

9.1 Semantic Computation Services.

MathServe is most closely related to the projects MathBroker [Schreiner and Caprotti, 2001] and MONET [MONET, 2002] which describe symbolic and numeric computation services in the proprietary Mathematical Service Description Language (MSDL). The two projects shared most of their research and development efforts. However, while MathBroker focused on symbolic computation services, MONET (with the main developers located at the Numerical Algorithms Group (NAG) in Cambridge) was mostly concerned with numerical computation services.

MathServe complements the projects MathBroker and MONET by offering logic-based automated reasoning systems as Semantic Web Services. However, MathServe differs from MathBroker and MONET in several ways. Instead of using the proprietary language MSDL, MathServe uses standardised Semantic Web languages, such as WSDL, OWL-DL and OWL-S, recommended by the World Wide Web Consortium. This allows the use of emerging tools for registering, editing and matching Semantic Web Services.

The preconditions and effects of services in MSDL are described in OpenMath. OpenMath is an expressive language but it lacks a formal semantics. Also, the concepts (symbols) defined in OpenMath content dictionaries are only given an intuitive meaning. The definition of relationships between OpenMath symbols is not supported.

MathServe, on the other hand, employs the Description Logic OWL-DL to define the concepts and properties needed in service descriptions. OWL-DL has a well-defined formal semantics and is supported by many applications and reasoning tools developed in the context of the Semantic Web. Preconditions and effects of OWL-S service profiles are described in the SWRL language. SWRL is less expressive than OpenMath but has a formal semantics based on the semantics of OWL. Furthermore, the use of SWRL leads to a decidable entailment test between preconditions and effects of OWL-S service and query profiles.
Despite its name, the MathBroker project did not develop brokering facilities, such as the ones offered by the MathServe broker.

9.2 Online Access to ATP Systems.

The online utility System on TPTP 1 (SOTPTP) offers a system recommendation service for first-order ATP problems. The utility’s recommendations are based on the performance of different ATP systems on the TPTP Library. Therefore, the service is similar to the composite service resultQueryProcOpt (see Section 6.6) used by the ATP interface of the MathServe broker to choose appropriate ATP services. The SOTPTP utility is designed for human users and does not provide access to systems other than first-order ATP systems. As opposed to SOTPTP, MathServe services are designed for use by other software applications. Furthermore, MathServe is more general than SOTPTP because it offers many different reasoning and translation services. Additionally, the MathServe broker offers automated service composition, a feature which is not provided by SOTPTP.

9.3 Optimal Choice of Reasoning Systems.

A portfolio of 12 algorithms for the propositional SAT problem has been implemented in the SATzilla system [Nudelman et al., 2004]. The performance of these algorithms was measured on a set of approximately 5000 problems. The problems were classified according to 56 problem features, most of which are numerical. Some problem features capture the number of clauses and variables in a problem and their ratio. Other features involve counting the number of unary, binary and ternary clauses, and the number of Horn clauses in a problem. SATzilla employs a ridge regression learning algorithm to compute an optimal policy for the choice of SAT solvers. The computational complexity of this algorithm is O(n³) where n is the number of problems. Most features in SATzilla are fast to compute, with a few features taking O(m³) time, where m is the problem size.

The SATzilla approach is similar to the optimal policy used by the MathServe broker to choose first-order ATP systems. However, the problem features used by the MathServe broker are purely syntactic and can be computed in linear time. Currently, the numerical features of SATzilla could not be used by the MathServe broker because they require complex machine learning algorithms to compute an optimal policy.

While SATzilla is a specialised system for solving SAT problems, MathServe is a general decentralised framework for solving different types of reasoning problems: New services can be added to the framework by independent service providers. Data about the performance of reasoning services in MathServe are part of OWL-S descriptions and are available to all service requesters.

1 Available at http://www.cs.miami.edu/~tptp/cgi-bin/SystemOnTPTPFormMaker.

Chapter 10

Limitations and Future Work

Despite the positive evaluation of the MathServe framework at CASC-J3 (Chapter 7) and on higher-order conjectures (Chapter 8), the system has a number of limitations. Some of these limitations are due to the foundations of MathServe, while others stem from shortcomings of the reasoning systems integrated in MathServe. We discuss the shortcomings of MathServe in Section 10.1. In Section 10.2 we present possible directions for future work, starting with approaches that might help to overcome these shortcomings.

10.1 Limitations of MathServe

The MathServe framework is designed as a general, open framework in which reasoning services are offered as Semantic Web Services. The high-level programming language Golog is used to describe composite reasoning services in a human-readable way.

On the one hand, the use of Web Services, Semantic Web technology, and Golog eases the access to reasoning services and supports advanced mediation services. For example, semantic descriptions of Web Services allow the MathServe broker to automatically find and combine suitable reasoning services. On the other hand, following the paradigm of Semantic Web Services and using Golog to represent composite reasoning services limit the applicability of MathServe. The most critical limitations are the following:

• Time resources cannot be assigned explicitly to the atomic reasoning services in a composite service. Instead, time resources are allocated dynamically by MathServe’s Golog interpreter.
• An invocation of a reasoning Web Service (and the underlying reasoning system) cannot be terminated by the client application. This causes problems with large-scale parallel invocations of reasoning services.
• Atomic reasoning services in MathServe are stateless. Therefore, iterative reasoning processes which deliver multiple results over a long period of time cannot be modelled in MathServe.

In what follows, we discuss these issues in greater detail.

10.1.1 Management of Time Resources

Currently, the management of time resources in the MathServe framework is rudimentary (cf. Section 5.2.6). When evaluating a composite service, MathServe’s Golog interpreter uses simple heuristics to allocate time resources to atomic services. These heuristics depend on three factors: the overall time resource available to the composite service, the time used so far, and the number of (atomic) services being invoked sequentially. In some applications, however, it is desirable to explicitly specify concrete time (and space) resources for the atomic services of a composite service.

Example 10.1.1 An example of a composite service involving time resource specifications is the following (described in natural language):

For a theorem proving problem p containing a conjecture in first-order logic, first call the ATP service OtterATP on p for at most 60 seconds. If the ATP status of the result of OtterATP is not Theorem then call the model generator service MaceMG on p for at most 20 seconds to try to find a counter-model.

Currently, this composite service cannot be described in the MathServe framework. In particular, the Golog language does not allow time resources to be specified for primitive agent actions. In fact, the situation calculus axioms capture a purely qualitative notion of time. Sequential action occurrence is the only temporal concept captured by the axioms; an action is executed before or after another action. Since Golog programs macro-expand to situation calculus formulae, the notion of time in Golog is the same as the situation calculus notion.
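As a rough illustration of what Example 10.1.1 asks for, the following Python sketch passes an explicit time limit to each atomic service. It is not MathServe code: the `Service` signature, the `Result` type and the two stub services are assumptions introduced only for this sketch, and the stubs merely stand in for the real OtterATP and MaceMG Web Services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    status: str   # e.g. "Theorem", "CounterSatisfiable", "Timeout"
    system: str   # name of the service that produced the result


# Hypothetical signature of an atomic reasoning service:
# (problem, time limit in seconds) -> Result
Service = Callable[[str, float], Result]


def composite(problem: str, otter_atp: Service, mace_mg: Service) -> Result:
    """Example 10.1.1 with explicit per-service time limits."""
    r1 = otter_atp(problem, 60.0)          # at most 60 seconds for the prover
    if r1.status == "Theorem":
        return r1
    return mace_mg(problem, 20.0)          # at most 20 seconds for the model generator


# Stub services (assumptions) standing in for the real Web Services.
def stub_otter(problem: str, limit: float) -> Result:
    status = "Theorem" if "provable" in problem else "Timeout"
    return Result(status, "OtterATP")


def stub_mace(problem: str, limit: float) -> Result:
    return Result("CounterSatisfiable", "MaceMG")
```

The point of the sketch is only that the limits 60 and 20 are arguments of the individual calls, which is exactly what a primitive Golog action cannot express.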

10.1.2 Concurrent Service Invocations

The Golog language does not support parallel execution of atomic or composite services. This feature is desirable in domains in which many different systems (e.g., ATP systems) can potentially solve a given problem. It is particularly desirable if an answer to a problem (e.g., a proof of a conjecture) has to be delivered within certain time limits. For example, at the CASC competitions (see Chapter 7), MathServe could have performed better if it had been able to invoke several ATP systems in parallel.

Concurrent service invocation is related to the problem of reasoning services with states as described in the next section: If several services are invoked in parallel on the same problem, all invocations should be terminated as soon as one service delivers the desired result. But an early termination of service invocations by a client application is not possible with standard Web Services. A service invocation can only be terminated by the service provider.

In the CASC scenario, new problems were sent to the MathServe broker as soon as a result for the previous problem was received. Thus, with a parallel invocation of several ATP services without terminating previous service calls, the available computational resources would have been used up quickly. The atpSplitAndJoin operation of the MathServe broker’s ATP interface only returns a result when all parallel service invocations have returned a result (see Section 5.2.7).
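The desired behaviour (return the first useful result and terminate the remaining invocations) can be sketched in Python with asyncio. This is an illustration under assumptions, not a description of the MathServe broker: the services are local coroutines that can be cancelled by the caller, which is precisely what standard Web Services do not allow. All function names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence

# Hypothetical service type: takes a problem, returns a result or None on failure.
AsyncService = Callable[[str], Awaitable[Optional[str]]]


async def first_result(problem: str,
                       services: Sequence[AsyncService],
                       timeout: float) -> Optional[str]:
    """Run all services concurrently; return the first non-None answer, cancel the rest."""
    tasks = [asyncio.ensure_future(service(problem)) for service in services]
    answer = None
    try:
        for finished in asyncio.as_completed(tasks, timeout=timeout):
            try:
                result = await finished
            except asyncio.TimeoutError:
                break                      # overall time limit reached
            except Exception:
                continue                   # one service failed; wait for the others
            if result is not None:
                answer = result
                break
    finally:
        for task in tasks:
            task.cancel()                  # early termination by the client
        await asyncio.gather(*tasks, return_exceptions=True)
    return answer


# Stub services (assumptions) with different run times.
async def slow_prover(problem: str) -> Optional[str]:
    await asyncio.sleep(5.0)
    return "Theorem (slow)"


async def fast_prover(problem: str) -> Optional[str]:
    await asyncio.sleep(0.01)
    return "Theorem (fast)"


async def failing_prover(problem: str) -> Optional[str]:
    return None                            # gave up without a result
```

In contrast to atpSplitAndJoin as described above, this sketch does not wait for all invocations to finish; that is only possible here because the caller controls the life-cycle of the tasks.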

10.1.3 Reasoning Services with States

Web Services are stateless by definition and require synchronous communication via SOAP messages. This eases the communication with a Web Service: A client application can simply send a SOAP message to a Web Service and receives one message as an answer. Semantic descriptions of Web Services in OWL-S rely on the fact that the underlying Web Services are stateless. Atomic OWL-S processes are directly linked to one stateless Web Service. Composite OWL-S processes describe how a client application can interact with several atomic services, and describe how the state of the client changes by invoking Web Services.

If services are stateful, it becomes difficult to describe their semantics. Instead of describing the inputs, outputs, preconditions and effects of a single process, one has to describe a complete interaction protocol and the change to the service’s state according to this protocol. Thus, when using stateful Web Services, key applications of Semantic Web Services, such as automated service matchmaking and composition, become infeasible or at least far more complex.

While the use of stateless services allows MathServe to use standard languages to describe reasoning services and their semantics, it also limits the usability of MathServe for certain applications. The haRVey system [Ranise and Deharbe, 2003] is an example of a reasoning system whose reconstruction in MathServe requires stateful reasoning services. haRVey is a decision procedure for the quantifier-free fragment of first-order logic with equality. It automatically decides the validity of arbitrary boolean combinations of ground literals modulo an equational theory. The background theory is assumed to be axiomatised by a finite set of equational clauses.

Figure 10.1: Modules of the haRVey decision procedure (flowchart not reproduced; modules: propositional abstraction α, propositional SAT solver (SAT/BDD), refinement α⁻¹(Λ), decision procedure with theory T (CNF); outcomes: "φ valid" and "∼φ satisfiable")

haRVey is a monolithic system that combines a propositional abstraction module with a boolean satisfiability (SAT) solver and a decision procedure in a loop. Figure 10.1 shows the different modules of the haRVey loop and their interaction. The propositional abstraction module computes a propositional abstraction function α for the negated input formula ∼Φ, and the abstracted formula α(∼Φ). A SAT solver enumerates satisfying assignments Λ for α(∼Φ). An empty Λ indicates that there are no (further) satisfying assignments, i.e. Φ is valid. A non-empty Λ is translated back to ground literals by applying α⁻¹. The new formula is given to a decision procedure (e.g., the ATP system E with superposition calculus) together with the background theory T. If the decision procedure proves the input formula to be inconsistent with the background theory, the SAT solver is asked to enumerate another propositional assignment. Otherwise ∼Φ is satisfiable modulo T, i.e. Φ is not valid in T.

Together with Silvio Ranise of the Lorraine Laboratory of IT Research and its Applications (LORIA, Nancy, France) we have been working on a rational reconstruction of haRVey using interchangeable services. For this, the different modules of haRVey (the boxes in Figure 10.1) have to be modelled as reasoning services. However, in the current version of the MathServe framework we cannot describe all the services needed for a distributed version of haRVey. For instance, we cannot describe a stateful SAT solving service which enumerates satisfying assignments for a propositional formula.
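The loop described above can be illustrated with a small Python sketch. It is a toy reconstruction under strong assumptions, not haRVey itself: atoms are equalities between constants, the background theory is plain equality with no further axioms, the SAT solver is replaced by brute-force enumeration, and the decision procedure is a union-find check. The generator `enumerate_models` plays the role of the stateful enumeration service, since it remembers which assignments it has already produced.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple

Atom = Tuple[str, str]   # ground equality "a = b" between two constants
Clause = List[int]       # DIMACS style: +i is atom i, -i its negation (1-based)


def enumerate_models(cnf: List[Clause], n_atoms: int) -> Iterator[Tuple[bool, ...]]:
    """Brute-force stand-in for the SAT solver: yields satisfying assignments one by one."""
    for assignment in product([True, False], repeat=n_atoms):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in cnf):
            yield assignment


def consistent(atoms: List[Atom], assignment: Tuple[bool, ...]) -> bool:
    """Stand-in decision procedure: check the literals in pure equality via union-find."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for (a, b), value in zip(atoms, assignment):
        if value:                           # asserted equality: merge the classes
            parent[find(a)] = find(b)
    # every asserted disequality must separate two different classes
    return all(find(a) != find(b)
               for (a, b), value in zip(atoms, assignment) if not value)


def is_valid(atoms: List[Atom], negated_cnf: List[Clause]) -> bool:
    """negated_cnf is the abstraction of the negated formula; atoms map back to literals."""
    for assignment in enumerate_models(negated_cnf, len(atoms)):
        if consistent(atoms, assignment):
            return False    # the negated formula is satisfiable modulo the theory
    return True             # no (further) assignments: the formula is valid
```

For instance, for Φ = (a = b ∧ b = c) → a = c, the negation abstracts to the clauses [[1], [2], [-3]] over the atoms a = b, b = c and a = c; the only propositional assignment is refuted by the equality check, so Φ is reported valid.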

10.1.4 Limitations of First-order Theorem Provers

The use of MathServe by the VeriFun system1 and the Hets toolset (see Section 5.4) is limited by shortcomings of the ATP systems and representation languages underlying the first-order ATP services presented in Section 4.4. In the context of VeriFun it is desirable to encode sort information in first-order theorem proving problems. However, neither the TPTP utilities nor ATP systems (except SPASS) support sorted languages. For the Hets toolset it is important to determine the list of axioms used in a proof. So far, only the ATP system SPASS delivers this information. If MathServe chooses an ATP system other than SPASS the list of first-order axioms used in a proof cannot be determined.

10.2 Future Work

During the development of MathServe we identified several issues that could be addressed by future research. Some of these issues aim at overcoming the limitations described in the previous section. Others improve the framework in general or widen its applicability.

10.2.1 Management of Time Resources

In the following paragraphs we discuss some languages that might help to overcome the restriction of the classical situation calculus with respect to the management of time resources.

1 See also http://www.inferenzsysteme.informatik.tu-darmstadt.de/verifun/.

10.2.1.1 The Temporal Situation Calculus

The most prevalent works on modelling time in the situation calculus are those by Pinto and Reiter [Pinto, 1994, Reiter, 1996, Reiter, 2001], and Kakas [Kakas and Miller, 1997, Kakas et al., 2000]. Reiter and Pinto suggest an extension of the situation calculus in which time is handled explicitly by introducing a totally ordered time line and by associating a start time with actions and with situations. In Reiter’s approach, actions have an additional time argument as in

pickup(Coffee, t1)

which denotes that an agent picks up some coffee at the time point t1. Instead of a time argument, Pinto introduces a predicate occurs(a, s) which denotes the fact that an action a occurs in a situation s. In both approaches, the start time of a situation do(a, s) is the start time of the action a. The foundational axioms of the situation calculus have to be extended to ensure that the start times of actions and situations are compatible (see [Reiter, 1996, Reiter, 2001] for details).

Continuous actions are modelled with instantaneous start and end actions with new action precondition and effect axioms. For instance, the continuous action walk(a, b) of walking from point a to point b is represented as startWalk(a, b, t1) and endWalk(a, b, t2) with the constraint t1 < t2. The action precondition and successor state axioms are

Poss(startWalk(x, y, t), s) ≡ ¬∃u, v. walking(u, v, s) ∧ location(s) = x
Poss(endWalk(x, y, t), s) ≡ walking(x, y, s)
walking(x, y, do(a, s)) ≡ a = startWalk(x, y, t) ∨ walking(x, y, s) ∧ a ≠ endWalk(x, y, t).

For our purposes, the Golog language could be extended with time in a similar manner. After the necessary modifications, the composite service in Example 10.1.1 could be expressed as the following Golog procedure:

proc composite(p, r, t1, t2)
  startOtterATP(p, r1, t1) ;
  endOtterATP(p, r1, t1 + 60) ;
  if status(r1, mw#Theorem)
    then bind(r, r1)
    else startMaceMG(p, r, t2) ;
         endMaceMG(p, r, t2 + 20)
  endIf
endProc

For this program to deliver the desired results, the Golog interpreter has to be adjusted such that it executes the instantaneous end actions at the appropriate time points.

One of the main problems with explicit end actions for continuous actions is the fact that the explicit termination of a Web Service execution (such as OtterATP) is not possible with standard Web Service technology. However, in the context of the computational Grid, stateful services have been developed that allow a client application to control the life-cycle of a service instance. We will discuss this issue further in Section 10.2.3.
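The successor state axiom for the walking fluent given earlier in this section can be read operationally: a situation is a history of timed actions, and the fluent is evaluated by recursion on that history. The Python sketch below illustrates this reading. The list representation of situations and the assumption that nobody is walking in the initial situation are choices made for the sketch; the location fluent needed for Poss(startWalk) is omitted because its axiom is not given in the text.

```python
from typing import List, NamedTuple


class Action(NamedTuple):
    name: str    # "startWalk" or "endWalk"
    x: str       # start point
    y: str       # end point
    t: float     # time point at which the instantaneous action occurs


def walking(x: str, y: str, situation: List[Action]) -> bool:
    """walking(x, y, do(a, s)) holds iff a starts the walk, or it held in s and a does not end it."""
    if not situation:
        return False                   # assumption: no walking in the initial situation
    *s, a = situation                  # situation = do(a, s)
    if a.name == "startWalk" and (a.x, a.y) == (x, y):
        return True
    ends_walk = a.name == "endWalk" and (a.x, a.y) == (x, y)
    return walking(x, y, s) and not ends_walk


def poss_end_walk(x: str, y: str, situation: List[Action]) -> bool:
    """Poss(endWalk(x, y, t), s) holds iff walking(x, y, s)."""
    return walking(x, y, situation)
```

A history such as `[Action("startWalk", "a", "b", 1.0), Action("endWalk", "a", "b", 2.0)]` makes the fluent true after the first action and false again after the second.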

10.2.1.2 Description Logics and Temporal Logics

Next to the first-order languages, such as the situation and the event calculus, Description Logics can be used to reason about actions and change. In fact, the OWL-S service descriptions presented in Chapter 4 are OWL-DL individuals.

The use of Description Logics for reasoning about the executability and projection for Web Services is discussed in [Baader et al., 2005]. In their work, a service description consists of three sets of ABox assertions in the Description Logic ALCQIO (see footnote 2), namely the preconditions, occlusions and postconditions of the service. This is similar to the OWL-S descriptions used in our work. However, the authors of [Baader et al., 2005] consider only fully instantiated service descriptions (that contain no variables). Furthermore, only sequential composite services are investigated.

It has been shown that Description Logics can be combined with temporal logics while preserving decidability of the satisfiability problem [Schild, 1993, Wolter and Zakharyaschev, 1999]. A terminating, sound and complete satisfiability checking algorithm (based on a tableau calculus) was presented in [Lutz et al., 2001]. However, it remains an open research question whether such a formalism could be used to represent and reason about composite services as they occur in MathServe.

10.2.2 Concurrency

In the situation calculus, the notion of concurrency is closely bound to the notion of time discussed in the previous section. In [Reiter, 2001] a concurrent, non-temporal variant of the situation calculus is described. The calculus is obtained by replacing single actions in the foundational axioms by sets of actions. This leads, for instance, to situation terms such as

do({walk(a, b), sing}, s0)

which denotes the situation reached when the agent simultaneously starts to walk and to sing in the situation s0. However, actions in this approach are still instantaneous. Together with time arguments for actions and start times of situations (as shown above) this allows for real concurrency. For instance, the situation

do({startWalk(a, b, 3), startSing(3)}, s0)

denotes the situation reached when the agent simultaneously starts to walk and to sing at the discrete time point 3. This approach has the advantage that only minor changes to the foundational axioms of the situation calculus are necessary.

The Golog variant ConGolog [Lesperance et al., 1999] incorporates a rich account of concurrency. It handles concurrent processes, high-level interrupts, and arbitrary exogenous actions (see footnote 3). Concurrent actions are modelled with instantaneous start and stop actions as shown above. Compared to standard Golog, ConGolog additionally offers the following programming constructs:

2 According to [Baader et al., 2005], the DL ALCQIO can be easily translated into OWL-DL.
3 Exogenous actions are actions that change the environment but are not executed by the agent executing a Golog program.

(δ1 || δ2)    (concurrent execution)
(δ1 ⟩⟩ δ2)    (concurrency with priorities)
δ||           (concurrent iteration)
⟨ϕ → δ⟩       (interrupt)

The first two constructs are self-explanatory. The construct δ|| behaves like nondeterministic iteration, but the instances of δ are executed concurrently, rather than in sequence. For an interrupt ϕ → δ with trigger condition ϕ and body δ, the body δ will be executed if the interrupt gets control from higher-priority processes when ϕ is true.

Unfortunately, a combination of ConGolog and decision-theoretic reasoning, as in DTGolog, has not been developed yet. Such a combination is non-trivial from a theoretical point of view and requires further research (see footnote 4).

10.2.2.1 Concurrent Markov Decision Processes

Propositional concurrent MDPs are typically solved by reasoning on sets of states and sets of actions [Mausam and Weld, 2004]. Only sets of non-interacting actions can be executed concurrently. Two actions interact if in any state they have inconsistent preconditions or conflicting effects. In principle, algorithms for classical MDPs, such as value iteration (see Section 2.4.4.2) or labelled real-time dynamic programming (RTDP) [Bonet and Geffner, 2003], can be used to solve concurrent MDPs. However, for a set A of actions, the set of action combinations is exponential in |A|. Thus, more efficient techniques are needed to solve concurrent MDPs. An application of these techniques to first-order MDPs has not yet been studied.
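The exponential growth of the set of action combinations is easy to see in a small Python sketch. The representation of preconditions and effects as dictionaries from proposition names to truth values is an assumption made for illustration, and the interaction test follows the informal definition above (inconsistent preconditions or conflicting effects) rather than any particular solver.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Tuple


class Action(NamedTuple):
    name: str
    pre: Dict[str, bool]   # truth values the action requires
    eff: Dict[str, bool]   # truth values the action sets


def clash(p: Dict[str, bool], q: Dict[str, bool]) -> bool:
    """True if p and q assign different truth values to some proposition."""
    return any(var in q and q[var] != val for var, val in p.items())


def interact(a: Action, b: Action) -> bool:
    """Two actions interact if their preconditions or their effects clash."""
    return clash(a.pre, b.pre) or clash(a.eff, b.eff)


def action_combinations(actions: List[Action]) -> List[Tuple[Action, ...]]:
    """All non-empty sets of pairwise non-interacting actions."""
    result = []
    for k in range(1, len(actions) + 1):
        for combo in combinations(actions, k):
            if not any(interact(a, b) for a, b in combinations(combo, 2)):
                result.append(combo)
    return result
```

For n mutually independent actions the function returns 2^n − 1 combinations, which is why a value iteration that enumerates all of them in every state does not scale.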

4 Personal communication with Mikhail Soutchanski, Cognitive Robotics Group, University of Toronto.

10.2.3 Stateful Reasoning Services

Descriptions of interaction protocols for stateful service providers have been studied intensively in the context of multi-agent systems. The idea of modelling reasoning services as Belief-Desire-Intention (BDI) agents was presented in [Franke et al., 1999]. In our early work we tried to realise the ideas of [Franke et al., 1999] in the Java Agent DEvelopment Framework (JADE) [Telecom Italia Lab, 2003], a reference implementation of the FIPA specifications [Fipa, 2003]. However, we encountered serious problems with implementing the BDI semantics of FIPA-ACL. Particular problems arose from the fact that, in JADE, both service providing and service requesting agents have to implement the (sometimes complex) interaction protocols and the FIPA-ACL performatives. This requires a profound knowledge of the JADE architecture and FIPA-ACL. Therefore, we considered the BDI agent model unsuitable for building an open framework of easily accessible reasoning services such as MathServe.

Declarative descriptions of agent protocols and general execution engines for those protocols have not yet been developed. Research on applying ideas from agent protocols to Web Services is still very young, but preliminary results have, for instance, been presented in [Walton, 2005].

Stateful Web Services have been studied intensively in the Grid computing community [Foster and Kesselman, 2004]. In recent years, the Global Grid Forum decided to re-engineer Grid computing standards, merging them with the most current Web Services standards. The result was the Web Services Resource Framework (WSRF). WSRF defines a generic and open framework for modelling and accessing stateful resources using Web Services. This includes mechanisms to describe views on the state of a service, to support management of the state through properties associated with the Web Service, and to describe how these mechanisms are extensible to groups of Web Services.

In principle, WSRF could be used to create stateful variants of the reasoning services described in this thesis. However, these stateful services could only be accessed using specialised SOAP messages. Furthermore, semantic descriptions of WSRF services are not possible yet. Consequently, the semantic service matchmaking and composition facilities available in MathServe could not be used for WSRF services.

The adaptation of Semantic Web technology for Grid services is summarised by the vision of a Semantic Grid [Corcho et al., 2006]. Similar to the Semantic Web, the Semantic Grid is an infrastructure where all resources, including services, are adequately described with machine-processable semantic annotations. However, how this is supposed to be achieved is still a matter of ongoing (early) research.

10.2.4 Extending the Range of Reasoning Services

A wider range of reasoning services would broaden the applicability of MathServe. Two types of services would find immediate application in the context of proof assistant systems: Services for symbolic computation could be used as oracles in complex proof developments. Services for automated theorem proving in higher-order logic could increase the degree of automation in proof assistants.

Services for Symbolic Computation. In the Ωmega system, symbolic computation services have been successfully used to perform complex computations or to find witness terms for existentially quantified variables. In general, (possibly unsound) symbolic computations can be very helpful in proof search but have to be verified to obtain a formally correct proof.

Symbolic computation services, such as defined by the projects MathBroker and MONET, can in principle be described in OWL-S. However, such services require the use of more expressive languages in the preconditions and effects of semantic service descriptions. Due to the modular structure of OWL-S, new languages for conditions can be introduced easily. Similar to the Mathematical Service Description Language (MSDL), OpenMath could be used to describe the preconditions and effects of computation services. However, if symbols of public OpenMath Content Dictionaries (CDs) are used in these descriptions, the entailment between conditions cannot be determined by formal reasoning methods. Therefore, the set of symbols used in OpenMath formulae would have to be restricted to symbols from a well-defined formal theory.

In general, more expressive preconditions and effects also require stronger reasoning support. In the worst case, the entailment test between conditions might become undecidable.

Reasoning Services for Description Logics. With a growing interest in the Semantic Web and semantic technologies, the need for reasoning tools for formal ontologies has also grown. As a consequence, efficient and robust reasoning tools have been developed. In Chapter 5 we presented the Pellet reasoner which is used by the MathServe broker for DL reasoning. Next to Pellet, the system FaCT [Horrocks, 1998] and the commercial system Racer [Haarslev and Möller, 2003] are the most prominent and efficient reasoning systems for the description logic OWL-DL. Offering the capabilities of these systems as reasoning services could be a useful extension of MathServe.

Different DL reasoners specialise in certain reasoning tasks. For example, some reasoners can deal with large A-boxes while others have more efficient T-box reasoning. The MathServe broker could use this information to choose the most suitable reasoning service for a DL reasoning task at hand. However, many DL reasoners are based on the DIG interface [Bechhofer et al., 2003] and require a stateful reasoning engine. Thus, their services could only be offered with the help of stateful Web Services.

10.2.5 Networks of MathServe Brokers

In the context of the MathWeb-SB, we gained some experience with dynamic networks of yellow-page servers (see Section 2.3.3). This approach could be easily applied to the MathServe framework. As soon as a new broker becomes available, it registers itself with a list of known brokers on the Internet. Information about new brokers is passed on recursively to other known brokers until all brokers know each other. The service registries of all brokers in the network could be synchronised as was done with the yellow-page servers of the MathWeb-SB. Furthermore, service blacklists could also be shared between the brokers. In the case of a high load on one broker, service matchmaking and composition tasks could be delegated to other brokers in the network.
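The registration scheme can be sketched in a few lines of Python. This is a minimal in-memory model under assumptions, not the MathWeb-SB protocol: brokers are plain objects rather than networked servers, and the names `Broker`, `learn` and `register` are hypothetical.

```python
from typing import Dict


class Broker:
    def __init__(self, name: str) -> None:
        self.name = name
        self.known: Dict[str, "Broker"] = {}   # brokers this broker knows about

    def learn(self, other: "Broker") -> None:
        """Record `other` and pass the information on recursively to known brokers."""
        if other is self or other.name in self.known:
            return                              # nothing new: the recursion stops here
        peers = list(self.known.values())
        self.known[other.name] = other
        other.learn(self)                       # make the acquaintance mutual
        for peer in peers:
            peer.learn(other)                   # tell old acquaintances about the newcomer
            other.learn(peer)                   # and the newcomer about them


def register(new_broker: Broker, entry_point: Broker) -> None:
    """A new broker announces itself to one broker that is already in the network."""
    entry_point.learn(new_broker)
```

Every call to `learn` either adds a new entry or returns immediately, so the propagation terminates once all brokers know each other. Synchronising service registries and blacklists could then reuse the same `known` table.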

10.2.6 Online Update of System Performance

In Chapter 4, it was shown how data about the performance of reasoning systems on different problem classes is made available in the OWL-S profiles of the corresponding reasoning services. The MathServe broker uses the performance data to compute an optimal policy for the choice of reasoning services. The data and the optimal policy computed from it are static in the sense that they are not updated with data obtained by the broker during the invocation of reasoning services. However, a periodic update of the performance data and the derived policy could improve the performance of the broker. For example, with a dynamic update of the policy used at the CADE ATP System Competitions, the MathServe broker might have performed better on the problems of the SAT division (cf. Section 7.6).

A dynamic update of the performance data of reasoning services and the broker’s optimal policy could be easily realised in the broker. Since the search space for an optimal policy in DTGolog is heavily constrained, a new optimal policy can be computed from scratch in a fraction of a second. Thus, a new policy can be computed during the normal operation of the broker.
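A possible shape for such an update is sketched below in Python. It is only an illustration: the broker's real policy is computed by DTGolog from conditional probabilistic effects, whereas this sketch keeps simple success counts per problem class and service and picks the service with the highest observed success rate. The class and method names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


class PerformanceTable:
    """Success counts per (problem class, service), updated after each invocation."""

    def __init__(self) -> None:
        self.attempts: Dict[Tuple[str, str], int] = defaultdict(int)
        self.successes: Dict[Tuple[str, str], int] = defaultdict(int)

    def record(self, problem_class: str, service: str, solved: bool) -> None:
        """Called by the broker whenever a service invocation returns."""
        self.attempts[(problem_class, service)] += 1
        if solved:
            self.successes[(problem_class, service)] += 1

    def success_rate(self, problem_class: str, service: str) -> float:
        n = self.attempts.get((problem_class, service), 0)
        return self.successes.get((problem_class, service), 0) / n if n else 0.0

    def policy(self) -> Dict[str, str]:
        """Recompute the policy from scratch: best observed service per problem class."""
        best: Dict[str, str] = {}
        for (cls, svc) in self.attempts:
            if cls not in best or self.success_rate(cls, svc) > self.success_rate(cls, best[cls]):
                best[cls] = svc
        return best
```

Because the table is small, calling `policy()` after every batch of invocations is cheap, which mirrors the observation that a new policy can be recomputed during normal operation.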

10.3 Summary

We presented the MathServe framework which is a major contribution to a Mathematical Semantic Web as outlined in Chapter 1. MathServe offers the functionality of automated reasoning tools as Semantic Web Services whose descriptions contain information about the input and output behaviours of the services as well as their area of expertise. This information is used by the MathServe broker to perform automated service matchmaking and composition tasks.

The suitability of the Semantic Web approach and the robustness of MathServe have been demonstrated at two CADE ATP System Competitions and in an additional evaluation involving higher-order theorem proving problems about sets and relations.

Despite the fact that MathServe performed well in both evaluations, it also has its limitations which we have discussed in this chapter. One of the most serious limitations is the fact that Web Services, by definition, are stateless entities. However, stateful services are needed for some reasoning applications as described above. Another serious issue is the assignment of explicit time resources to the atomic services constituting a composite service. Last but not least, a concurrent invocation of reasoning services is also desirable, in particular in the CASC scenario described in Chapter 7.

Future work should concentrate on trying to improve the time resource management of the MathServe broker. As a first, easy step, every composite service could be combined with a time distribution strategy which tells the broker what percentage of the overall time resource should be assigned to the different services. This would, for instance, allow the (manual) design of more sophisticated composite services in which several reasoning services could be called sequentially to try and solve a given problem.
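The time distribution strategy suggested above amounts to a simple computation, sketched here in Python. The function name and the representation of a strategy as a dictionary of percentages are assumptions made for illustration.

```python
from typing import Dict


def distribute_time(total_seconds: float, shares: Dict[str, float]) -> Dict[str, float]:
    """Split an overall time budget according to percentage shares per service."""
    if abs(sum(shares.values()) - 100.0) > 1e-9:
        raise ValueError("shares must sum to 100 percent")
    return {service: total_seconds * pct / 100.0 for service, pct in shares.items()}
```

With an overall budget of 80 seconds and the strategy `{"OtterATP": 75, "MaceMG": 25}`, the two services receive the 60 and 20 seconds asked for in Example 10.1.1.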
In the context of the CADE ATP System Competition, a parallel invocation of several ATP systems could seriously increase the success rate and decrease the average answer time of the MathServe broker. With an optimal policy including parallel service invocations, several (sub-)sets of ATP systems could be compared with respect to their success rate and, for every problem, an optimal set of ATP systems could be selected. However, as mentioned above, the problem of computing an optimal strategy for a first-order MDP with parallel agent actions has not yet been solved. Last but not least, one could try to model stateful reasoning services and integrate them into MathServe, probably at the cost of less automation in MathServe’s service matchmaking and brokering modules.

Bibliography

[Alexoudi et al., 2004] Marianthi Alexoudi, Claus Zinn, and Alan Bundy. English Summaries of Mathematical Proofs. In Proceedings of the Workshop on Computer-supported mathematical theory development at the 2nd International Joint Conference on Automated Reasoning (IJCAR), pages 49–60, Cork, Ireland, 2004.

[Alur and Peled, 2004] Rajeev Alur and Doron Peled, editors. Proceedings of the 16th International Conference on Computer Aided Verification (CAV’04), volume 3114 of LNCS, Omni Parker House Hotel, Boston, Massachusetts, July 13–17 2004. Springer Verlag.

[Ambros-Ingerson and Steel, 1988] J. Ambros-Ingerson and S. Steel. Integrating planning execution and monitoring. In Proceedings of the 7th National Conference on Artificial Intelligence (AAAI’88), pages 83–88, Saint Paul, MN, 1988.

[Andréka et al., 1998] H. Andréka, J. van Benthem, and I. Németi. Modal Logics and Bounded Fragments of Predicate Logic. Journal of Philosophical Logic, 27:217–274, 1998.

[Andrews et al.] T. Andrews, F. Curbera, H. Dholakia, Y. Goland, J. Klein, F. Leymann, K. Liu, D. Roller, D. Smith, S. Thatte, I. Trickovic, and S. Weerawarana. Business Process Execution Language for Web Services – Version 1.1.

[Andrews et al., 1996] Peter B. Andrews, Matthew Bishop, Sunil Issar, Dan Nesmith, Frank Pfenning, and Hongwei Xi. TPS: A Theorem Proving System for Classical Type Theory. Journal of Automated Reasoning, 16(3):321–353, 1996.

[Andrews, 1986] Peter B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof. Academic Press, 1986.

[Armando and Zini, 2000] A. Armando and D. Zini. Towards Interoperable Mechanized Reasoning Systems: The Logic Broker Architecture. In A. Poggi, editor, Proceedings of the AI*IA-TABOO Joint Workshop ‘From Objects to Agents: Evolutionary Trends of Software Systems’, Parma, Italy, May 29–30 2000.

[Armando et al., 2000] A. Armando, M. Kohlhase, and S. Ranise. Communication Protocols for Mathematical Services based on KQML and OMRS. In M. Kerber and M. Kohlhase, editors, Proceedings of the Calculemus Symposium 2000, 2000.

[Asperti et al., 2001] A. Asperti, L. Padovani, C. Sacerdoti Coen, and I. Schena. HELM and the Semantic Math-Web. In Richard J. Boulton and Paul B. Jackson, editors, Proceedings of the 14th International Conference on Theorem Proving in Higher Order Logics, volume 2152 of LNCS, Edinburgh, Scotland, UK, September 3–6 2001. Springer Verlag.

[Auld et al., 2002] Chris Auld, Paul Spencer, Jeff Rafter, Jon James, Dave Addey, Oli Gauti Gudmundsson, Allan Kent, Alex Schiell, and Inigo Surguy. Practical XML for the Web. Glasshaus, 2002.

[Avenhaus and Denzinger, 1993] J. Avenhaus and J. Denzinger. Distributing Equational Theorem Proving. In Claude Kirchner, editor, Proceedings of the 5th International Conference on Rewriting Techniques and Applications (RTA’93), volume 690 of LNCS, pages 62–76, Montreal, Canada, June 16–18 1993. Springer.

[Baader et al., 1990] F. Baader, H.-J. Bürckert, B. Hollunder, W. Nutt, and J. Siekmann. Concept logic. DFKI Research Report RR-90-10, Deutsches Forschungszentrum für Künstliche Intelligenz, Kaiserslautern, 1990.

[Baader et al., 2002] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider. The Description Logic Handbook. Cambridge University Press, 2002.

[Baader et al., 2005] F. Baader, C. Lutz, M. Milicic, U. Sattler, and F. Wolter. Integrating Description Logics and Action Formalisms: First Results. In Manuela M. Veloso and Subbarao Kambhampati, editors, Proceedings of the 20th National Conference on Artificial Intelligence (AAAI’05). AAAI Press / MIT Press, July 9–13 2005.

[Baader, 1990] F. Baader. A formal definition for expressive power of knowledge representation languages. In Proceedings of the 9th European Conference on Artificial Intelligence, ECAI-90, pages 53–58, Stockholm, Sweden, 1990.

[Baader, 2003] Franz Baader, editor. Proceedings of the 19th International Conference on Automated Deduction (CADE–19), volume 2741 of LNCS, Miami, FL, USA, July 28–August 2 2003. Springer Verlag.

[Bacchus et al., 1995] Fahiem Bacchus, Joseph Y. Halpern, and Hector J. Levesque. Reasoning about Noisy Sensors in the Situation Calculus. In Chris Mellish, editor, Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 1933–1940, San Francisco, 1995. Morgan Kaufmann.

[Bachmair and Ganzinger, 1994] L. Bachmair and H. Ganzinger. Rewrite-Based Equational Theorem Proving with Selection and Simplification. Journal of Logic and Computation, 4(3):217–247, 1994.

[Baraka and Schreiner, 2006] Rebhi Baraka and Wolfgang Schreiner. Semantic Querying of Mathematical Web Service Descriptions. In M. Bravetti, M. Nuňes, and Gianluigi Zavattaro, editors, Proceedings of the 3rd International Workshop on Web Services and Formal Methods (WS-FM’06), volume 4184 of LNCS, pages 73–87, Vienna, Austria, September 8–9 2006. Springer-Verlag, Berlin.

[Barrett and Berezin, 2004] Clark Barrett and Sergey Berezin. CVC Lite: A new implementation of the cooperating validity checker. In Alur and Peled [2004].

[Barrett et al., 2006] C. Barrett, L. de Moura, and A. Stump. SMT-COMP: Satisfiability Modulo Theories Competition. In K. Etessami and S. Rajamani, editors, Proceedings of the 18th International Conference on Computer Aided Verification (CAV’06), pages 20–23. Springer Verlag, 2006.

[Bechhofer et al., 2003] S. Bechhofer, R. Möller, and P. Crowther. The DIG Description Logic Interface: DIG/1.1. In Proceedings of the 16th International Workshop on Description Logics (DL’03), Rome, Italy, September 5–7 2003.

[Bechhofer et al., 2004] Sean Bechhofer, Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider, and Lynn Andrea Stein. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/, February 2004.

[Beckett, 2004] Dave Beckett. RDF/XML Syntax Specification (Revised). http://www.w3.org/TR/rdf-syntax-grammar/, February 2004.

[Bellman, 1957] R. Bellman. Dynamic Programming. Princeton Univ. Press, 1957.

[Benzmüller and Kohlhase, 1998] Christoph Benzmüller and Michael Kohlhase. LEO – a Higher Order Theorem Prover. In Kirchner and Kirchner [1998], pages 139–144.

[Benzmüller and Sorge, 2000] Christoph Benzmüller and Volker Sorge. Oants – An open Approach at Combining Interactive and Automated Theorem Proving. In M. Kerber and M. Kohlhase, editors, Proceedings of the Calculemus Symposium 2000, St. Andrews, UK, August 6–7 2000. AK Peters, New York, NY, USA.

[Benzmüller et al., 2001] Christoph Benzmüller, Mateja Jamnik, Manfred Kerber, and Volker Sorge. Experiments with an Agent-Oriented Reasoning System. In Franz Baader, Gerhard Brewka, and Thomas Eiter, editors, Proceedings of Advances in Artificial Intelligence, Joint German/Austrian Conference on AI (KI’01), number 2174 in LNAI, pages 409–424, Vienna, Austria, September 19–21 2001. Springer.

[Benzmüller et al., 2005] C. Benzmüller, V. Sorge, M. Jamnik, and M. Kerber. Can a higher-order and a first-order theorem prover cooperate? In F. Baader and A. Voronkov, editors, Proceedings of the 11th International Conference on Logic for Programming Artificial Intelligence and Reasoning (LPAR’05), number 3452 in LNAI, pages 415–431. Springer, 2005.

[Berardi, 2005] Daniela Berardi. Automatic Service Composition – Models, Techniques and Tools. PhD thesis, Università degli Studi di Roma, 2005.

[Berners-Lee et al., 1998] T. Berners-Lee, R. Fielding, and L. Masinter. RFC 2396: Uniform Resource Identifiers (URI): Generic syntax. Available from http://www.faqs.org/rfcs/rfc2396.html#, August 1998.

[Berners-Lee et al., 2001] T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific American, 284(5):34–43, 2001.

[Bertoli et al., 2003] Piergiorgio Bertoli, Alessandro Cimatti, Ugo Dal Lago, and Marco Pistore. Extending PDDL to Nondeterminism, limited Sensing and iterative conditionals Plans. In Derek Long, Drew McDermott, and Sylvie Thiébaux, editors, Proceedings of the Workshop in PDDL in conjunction with ICAPS’03, pages 15–24, Trento, Italy, June 10 2003.

[Bertsekas and Shreve, 1978] Dimitri P. Bertsekas and Steven Shreve. Stochastic Optimal Control: The Discrete-Time Case. Academic Press, Orlando, 1978.

[Bieberstein et al., 2005] Norbert Bieberstein, Sanjay Bose, Marc Fiammante, Keith Jones, and Rawn Shah. Service-Oriented Architecture (SOA) Compass. Pearson Education, 2005.

[Blackburn et al., 1999] Patrick Blackburn, Johan Bos, Michael Kohlhase, and Hans de Nivelle. Inference and Computational Semantics. In Proceedings of the 3rd International Workshop on Computational Semantics (IWCS-3), Tilburg, The Netherlands, 1999.

[Blythe et al., 1992] Jim Blythe, Oren Etzioni, Yolanda Gil, and Robert Joseph et al. PRODIGY 4.0: The Manual and Tutorial. School of Computer Science, Carnegie Mellon University, Pittsburgh PA, USA, June 1992.

[Bobrow and Winograd, 1977] D.G. Bobrow and T. Winograd. An Overview of KRL, a Knowledge Representation Language. Cognitive Science, 1(1), 1977.

[Boley et al., 2001] Harold Boley, Said Tabet, and Gerd Wagner. Design Rationale of RuleML: A Markup Language for Semantic Web Rules. In Proceedings of the International Semantic Web Working Symposium (SWWS’01), Stanford, July/August 2001.

[Bonacina and Hsiang, 1995] Maria Paola Bonacina and Jieh Hsiang. The clause-diffusion methodology for distributed deduction. Fundamenta Informaticae, 24:177–207, 1995.

[Bonacina, 2000] Maria Paola Bonacina. A taxonomy of parallel strategies for deduction. Annals of Mathematics and Artificial Intelligence, 29(1-4):223–257, 2000.

[Bonet and Geffner, 2003] Blai Bonet and Héctor Geffner. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming. In Enrico Giunchiglia, Nicola Muscettola, and Dana S. Nau, editors, Proceedings of the 13th International Conference on Automated Planning and Scheduling (ICAPS’03), Trento, Italy, June 9–13 2003. AAAI.

[Boole, 1854] George Boole. An Investigation of The Laws of Thought. Macmillan, Barclay, & Macmillan, Cambridge, United Kingdom, 1854.

[Boutilier and Goldszmidt, 1996] Craig Boutilier and Moisés Goldszmidt. The Frame Problem and Bayesian Network Action Representations. In Proceedings of the 9th Canadian Conference on Artificial Intelligence (CCAI’96), 1996.

[Boutilier et al., 1999] Craig Boutilier, Thomas Dean, and Steve Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1–94, 1999.

[Boutilier et al., 2000a] Craig Boutilier, Richard Dearden, and Moisés Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49–107, 2000.

[Boutilier et al., 2000b] Craig Boutilier, Ray Reiter, Mikhail Soutchanski, and Sebastian Thrun. Decision-theoretic, High-level Agent Programming in the Situation Calculus. In Henry Kautz and Bruce Porter, editors, Proceedings of the 17th National Conference on Artificial Intelligence (AAAI’00), Austin, Texas, July 30–Aug 2 2000. AAAI Press.

[Boutilier et al., 2001] Craig Boutilier, Ray Reiter, and Bob Price. Symbolic Dynamic Programming for First-order MDPs. In Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI’01), pages 690–697, Seattle, 2001.

[Boy de la Tour, 1992] Thierry Boy de la Tour. An optimality result for clause form translation. Journal of Symbolic Computation, 14(4):283–301, 1992.

[Boyer and Moore, 1979] Robert S. Boyer and J Strother Moore. A Computational Logic. Academic Press, New York, USA, 1979.

[Boyer and Moore, 1988] Robert S. Boyer and J Strother Moore. Integrating Decision Procedures into Heuristic Theorem Provers: A Case Study with Linear Arithmetic, volume 11 of Machine Intelligence. Oxford University Press, Oxford, United Kingdom, 1988.

[Bozzano et al., 2005] Marco Bozzano, Roberto Bruttomesso, Alessandro Cimatti, Tommi Junttila, Peter van Rossum, Stephan Schulz, and Roberto Sebastiani. The mathsat 3 system. In Nieuwenhuis [2005].

[Brachman and Levesque, 1984] R. J. Brachman and H.J. Levesque. The tractability of subsumption in frame-based description languages. In Proceedings of the 4th National Conference on Artificial Intelligence (AAAI’84), pages 34–37, Austin, Texas, 1984.

[Brachman and Schmolze, 1985] R. J. Brachman and J. G. Schmolze. An Overview of the KL-ONE Knowledge Representation Systems. Cognitive Science, 9(2):171–216, 1985.

[Bray et al., 1999] Tim Bray, Dave Hollander, and Andrew Layman. Namespaces in XML. http://www.w3.org/TR/REC-xml-names/, January 1999.

[Bray et al., 2004] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, and François Yergeau. The Extensible Markup Language (XML). http://www.w3.org/TR/REC-xml/, February 2004.

[Brittain and Darwin, 2003] Jason Brittain and Ian F. Darwin. Tomcat: The Definitive Guide. O’Reilly & Associates, 2003.

[Bundy et al., 1990] Alan Bundy, Frank Van Harmelen, Christian Horn, and Alan Smaill. The oyster-clam system. In M.E. Stickel, editor, Proceedings of the 10th International Conference on Automated Deduction, volume 449 of LNAI, pages 647–648. Springer Verlag, 1990.

[Burstein, 2004] Mark H. Burstein. Dynamic Invocation of Semantic Web Services That Use Unfamiliar Ontologies. IEEE Intelligent Systems, 19(4):67–73, 2004.

[Buswell et al., 2003] Stephen Buswell, Olga Caprotti, Mike Dewar, Stilo Ltd, Technical University of Eindhoven, and NAG Ltd. The Mathematical Service Description Language: Final version. http://monet.nag.co.uk/cocoon/monet/publicdocs/monet-msdl-final.pdf, March 2003.

[Calvanese et al., 2001] D. Calvanese, G. De Giacomo, M. Lenzerini, and D. Nardi. Reasoning in Expressive Description Logics. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 2, pages 1581–1634. Elsevier Science Publishers, North-Holland, Amsterdam, 2001.

[Caprotti and Cohen, 1998] Olga Caprotti and Arjeh M. Cohen. Draft of the Open Math standard. The Open Math Society, http://www.nag.co.uk/projects/OpenMath/omstd/, 1998.

[Carlisle et al., 2003] David Carlisle, Patrick Ion, Robert Miner, and Nico Poppelier. Mathematical Markup Language (MathML) version 2.0. http://www.w3.org/TR/2001/REC-MathML2-20010221/, October 21 2003.

[Char et al., 1992] Bruce W. Char, Keith O. Geddes, Gaston H. Gonnet, Benton L. Leong, Michael B. Monagan, and Stephen M. Watt. First Leaves: A Tutorial Introduction to Maple V. Springer Verlag, Berlin, 1992.

[Charles F. Goldfarb, 1991] Charles F. Goldfarb. The SGML Handbook. Oxford University Press, 1991.

[Christensen et al., 2001] Erik Christensen, Francisco Curbera, Greg Meredith, and Sanjiva Weerawarana. Web Services Description Language. http://www.w3.org/TR/wsdl, March 2001.

[Church, 1936] Alonzo Church. An unsolvable problem of elementary number theory. American Journal of Mathematics, 1936.

[Church, 1940] Alonzo Church. A Formulation of the Simple Theory of Types. Journal of Symbolic Logic, 5(2):56–68, 1940.

[Claessen and Sörensson, 2003] Koen Claessen and Niklas Sörensson. New Techniques that Improve MACE-style Finite Model Finding. In Baader [2003].

[Clark et al., 2003] Mike Clark, Peter Fletcher, J. Jeffrey Hanson, Romin Irani, Mark Waterhouse, and Jorgen Thelin. Web Services Business Strategies and Architectures. Apress, Berkeley, USA, 2003.

[Clark, 1978] K. L. Clark. Negation as failure. In H. Gallaire and J. Minker, editors, Logic and Data Bases, pages 292–322. Plenum Press, New York, 1978.

[Constable et al., 1986] Robert L. Constable, Stuart F. Allen, H. Mark Bromley, W. Rance Cleaveland, James F. Cremer, Robert W. Harper, Douglas J. Howe, Todd B. Knoblock, Nax P. Mendler, Prakash Panangaden, James T. Sasaki, and Scott F. Smith. Implementing Mathematics with the Nuprl Proof Development System. Prentice Hall, Englewood Cliffs, NJ, USA, 1986.

[Cooper, 1972] D. C. Cooper. Theorem Proving in Arithmetic without Multiplication. In B. Meltzer and D. Michie, editors, Machine Intelligence, pages 91–100. Edinburgh University Press, 1972.

[Coq, 2002] Project Coq. The Coq Proof Assistant Reference Manual, 2002.

[Corcho et al., 2006] Oscar Corcho, Pinar Alper, Ioannis Kotsiopoulos, Paolo Missier, Sean Bechhofer, and Carole Goble. An overview of S-OGSA: A Reference Semantic Grid Architecture. Web Semantics: Science, Services and Agents on the World Wide Web, 4(2):102–115, June 2006.

[Davis et al., 1962] Martin Davis, George Logemann, and Donald Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7):394–397, 1962.

[Dean and Kanazawa, 1989] Thomas Dean and Keiji Kanazawa. A model for reasoning about persistence and causation. Computational Intelligence, 5(3):142–150, 1989.

[Dennis et al., 2000] Louise A. Dennis, Graham Collins, Michael Norrish, Richard Boulton, Konrad Slind, Graham Robinson, Mike Gordon, and Tom Melham. The prosper toolkit. In Proceedings of the 6th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’00), LNCS. Springer Verlag, 2000.

[Denzinger and Dahn, 1998] Jörg Denzinger and Ingo Dahn. Cooperating theorem provers. In and Peter Schmitt, editors, Automated Deduction – A Basis for Applications, volume 2, pages 483–416. Kluwer, 1998.

[Denzinger, 1993] Jörg Denzinger. Teamwork: A method to design distributed knowledge based theorem provers. PhD thesis, Universität Kaiserslautern, 1993.

[Dixon et al., 2006] Lucas Dixon, Alan Smaill, and Alan Bundy. Planning as Deductive Synthesis in Intuitionistic Linear Logic. Technical Report EDI-INF-RR-0786, School of Informatics, University of Edinburgh, 2006.

[Doshi et al., 2005] Prashant Doshi, Richard Goodwin, Rama Akkiraju, and Kunal Verma. Dynamic Workflow Composition using Markov Decision Processes. International Journal of Web Services Research, 2(1):1–17, 2005.

[Drechsler and Becker, 1998] R. Drechsler and B. Becker. Binary Decision Diagrams: Theory and Implementation. Kluwer Academic Press, 1998.

[Dutertre and de Moura, 2006] Bruno Dutertre and Leonardo de Moura. A fast linear-arithmetic solver for DPLL(T). In Proceedings of 18th International Conference on Computer Aided Verification (CAV’06), volume 4144 of LNCS, pages 263–277. Springer Verlag, 2006.

[Eén and Sörensson, 2004] Niklas Eén and Niklas Sörensson. An Extensible SAT-solver. In Enrico Giunchiglia and Armando Tacchella, editors, Proceedings of the 6th International Conference on Theory and Applications of Satisfiability Testing (SAT’03), volume 2919 of LNCS. Springer, 2004.

[Erol et al., 1994] K. Erol, D. Nau, and J. Hendler. HTN Planning: Complexity and Expressivity. In Proceedings of the 12th National Conference on Artificial Intelligence (AAAI’94), Seattle, WA, USA, July 31–August 4 1994. AAAI Press.

[Etzioni et al., 1992] Oren Etzioni, Steve Hanks, Daniel Weld, Denise Draper, Neal Lesh, and Mike Williamson. An Approach to Planning with Incomplete Information. In Bernhard Nebel, Charles Rich, and William Swartout, editors, Proceedings of the 3rd International Conference on Principles of Knowledge Representation and Reasoning, pages 115–125, Cambridge, MA, USA, 1992. Morgan Kaufmann Publishers Inc., San Mateo, CA, USA.

[Fallside and Walmsley, 2004] David C. Fallside and Priscilla Walmsley. XML schema part 1. http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/structures.html, October 2004.

[Fensel and Bussler, 2002] D. Fensel and C. Bussler. The web service modeling framework (WSMF). Electronic Commerce Research and Applications, 1(2):113–137, 2002.

[Fermüller et al., 2001] Christian G. Fermüller, Alexander Leitsch, Ullrich Hustadt, and Tanel Tammet. Resolution decision procedures. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, pages 1791–1849. Elsevier Science Publishers, Amsterdam, The Netherlands, 2001.

[Ferrein et al., 2004] A. Ferrein, C. Fritz, and G. Lakemeyer. On-line decision-theoretic Golog for unpredictable domains. In S. Biundo, T. Frühwirth, and G. Palm, editors, Proceedings of 27th German Conference on AI, 2004.

[Fiedler, 2001a] Armin Fiedler. P.rex: An interactive proof explainer. In Goré et al. [2001], pages 416–420.

[Fiedler, 2001b] Armin Fiedler. User-Adaptive Proof Explanation. PhD thesis, Naturwissenschaftlich-Technische Fakultät I, Universität des Saarlandes, Saarbrücken, Germany, 2001.

[Fikes et al., 1971] Richard E. Fikes, Peter E. Hart, and Nils J. Nilsson. STRIPS: A New Approach to the Application of Theorem Proving. Artificial Intelligence, 2:189–208, 1971.

[Fikes et al., 1972] Richard E. Fikes, Peter E. Hart, and Nils J. Nilsson. Learning and executing generalized robot plans. Artificial Intelligence, 3(4):251–288, 1972.

[Finzi et al., 2000] Alberto Finzi, Fiora Pirri, and Ray Reiter. Open world planning in the situation calculus. In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI’00), pages 754–760. AAAI Press, 2000.

[Fipa, 2003] Fipa. The Foundation for Intelligent Physical Agents Specifications. http://www.fipa.org/, 2003.

[Fitting, 1990] Melvin Fitting. First-Order Logic and Automated Theorem Proving. Springer Verlag, 1990.

[Fogel and Bar, 2003] Karl Fogel and Moshe Bar. Open Source Development with CVS, 3rd Edition. Paraglyph Press, 2003.

[Foster and Kesselman, 2004] Ian Foster and Carl Kesselman, editors. The Grid 2: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers, 2004.

[Franke and Kohlhase, 1999] Andreas Franke and Michael Kohlhase. System description: MathWeb, an agent-based communication layer for distributed automated theorem proving. In Harald Ganzinger, editor, Proceedings of the 16th International Conference on Automated Deduction (CADE–16), volume 1632 of LNAI, pages 217–221, Trento, Italy, July 7–10, 1999. Springer Verlag.

[Franke et al., 1999] Andreas Franke, Stephan M. Hess, Christoph G. Jung, Michael Kohlhase, and Volker Sorge. Agent-oriented Integration of Distributed Mathematical Services. Journal of Universal Computer Science, 5:156–187, 1999.

[Frege, 1879] Gottlob Frege. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle, Germany, 1879.

[Furbach and Shankar, 2006] U. Furbach and N. Shankar, editors. Proceedings of the 3rd International Joint Conference on Automated Reasoning, volume 4130 of LNAI, Seattle, USA, August 2006. Springer Verlag.

[Ganzinger and Stuber, 2003] H. Ganzinger and J. Stuber. Superposition with Equivalence Reasoning and Delayed Clause Normal Form Transformation. In Baader [2003], pages 335–349.

[Ganzinger et al., 2004] Harald Ganzinger, George Hagen, Robert Nieuwenhuis, Albert Oliveras, and Cesare Tinelli. DPLL(T): Fast Decision Procedures. In Alur and Peled [2004].

[GAP, 1998] The GAP Group, Aachen, St Andrews. GAP – Groups, Algorithms, and Programming, Version 4, 1998. http://www-gap.dcs.st-and.ac.uk/~gap.

[Gardiol and Kaelbling, 2004] Natalia H. Gardiol and Leslie Pack Kaelbling. Envelope-based Planning in Relational MDPs. In Proceedings of the 16th Conference of Advances in Neural Information Processing Systems (NIPS’03), Vancouver, 2004.

[Gelder and Sutcliffe, 2006] Allen Van Gelder and Geoff Sutcliffe. System description: Extending the TPTP language to higher-order logic with automated parser generation. In Furbach and Shankar [2006], pages 156–161.

[Genesereth and Fikes, 1990] M.R. Genesereth and Richard E. Fikes. Knowledge Interchange Format – version 3.0. Technical report, Stanford University, 1990. Available at http://www-ksl.stanford.edu/knowledge-sharing/papers/kif.ps.

[Gennari et al., 2002] J. Gennari, M. A. Musen, R. W. Fergerson, W. E. Grosso, M. Crubezy, H. Eriksson, N. F. Noy, and S. W. Tu. The Evolution of Protégé: An Environment for Knowledge-based Systems Development. Technical Report SMI-2002-0943, Stanford School of Medicine – Stanford Medical Informatics, 2002. Available at http://smi-web.stanford.edu/auslese/smi-web/reports/SMI-2002-0943.pdf.

[Gentzen, 1935] G. Gentzen. Untersuchungen über das Logische Schließen I und II. Mathematische Zeitschrift, 39:176–210, 405–431, 1935.

[Gerevini and Long, 2005] Alfonso Gerevini and Derek Long. Plan Constraints and Preferences in PDDL3 – The Language of the 5th International Planning Competition. Technical report, Department of Electronics for Automation, Brescia, Italy, August 2005.

[Ginsberg, 1991] Matthew L. Ginsberg. Knowledge Interchange Format: the KIF of Death. AI Magazine, 12(3):57–63, 1991.

[Girard, 1987] Jean-Yves Girard. Linear Logic. Journal of Theoretical Computer Science, 50(1):1–102, 1987.

[Gödel, 1930] Kurt Gödel. Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatshefte für Mathematik und Physik, 37:349–360, 1930. English Version in [van Heijenoort, 1967].

[Gödel, 1964] Kurt Gödel. Russell’s Mathematical Logic. In P. Benacerraf and H. Putnam, editors, Philosophy of Mathematics: Selected Readings, 2nd Edition, pages 211–232. Englewood Cliffs, 1964.

[Gordon and Melham, 1993] Mike J. C. Gordon and Tom F. Melham. Introduction to HOL: A theorem proving environment for higher-order logic. Cambridge University Press, Cambridge, United Kingdom, 1993.

[Goré et al., 2001] Rajeev Goré, Alexander Leitsch, and Tobias Nipkow, editors. Proceedings of the 1st International Joint Conference on Automated Reasoning, volume 2083 of LNAI. Springer Verlag, June 18–22 2001.

[Gottlob and Leitsch, 1985] G. Gottlob and A. Leitsch. On the Efficiency of Subsumption Algorithms. Journal of the ACM, 32(2):280–295, 1985.

[Haarslev and Möller, 2003] V. Haarslev and R. Möller. Racer: A Core Inference Engine for the Semantic Web. In Proceedings of the 2nd International Workshop on Evaluation of Ontology-based Tools (EON’03), pages 27–36, Sanibel Island, Florida, USA, October 20 2003.

[Haarslev et al., 2004] V. Haarslev, R. Möller, and M. Wessel. Querying the Semantic Web with Racer and nRQL. In Proceedings of the KI-2004 International Workshop on Applications of Description Logics (ADL’04), Ulm, Germany, September 24 2004.

[Hähnle et al., 1996] Reiner Hähnle, Manfred Kerber, and Christoph Weidenbach. Common Syntax of DFG-Schwerpunktprogramm “Deduktion”. Interner Bericht 10/96, Universität Karlsruhe, Fakultät für Informatik, 1996.

[Hähnle, 2001] Reiner Hähnle. Tableaux and Related Methods. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 1, pages 100–178. Elsevier Science Publishers, North-Holland, Amsterdam, 2001.

[Hallett, 1984] M. Hallett. Cantorian set theory and limitation of size. Oxford University Press, 1984.

[Hanks and McDermott, 1994] Steve Hanks and Drew McDermott. Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning about Change. Artificial Intelligence, 65(2), 1994.

[Harper et al., 1987] Robert Harper, Furio Honsell, and Gordon Plotkin. A Framework for Defining Logics. In Proceedings 2nd Annual IEEE Symposium on Logic in Computer Science (LICS’87), pages 194–204, New York, June 22–25 1987. IEEE Computer Society Press.

[Hendrix, 1979] Garry Hendrix. Encoding knowledge in partitioned networks. In Nicolas Findler, editor, Associative Networks. Academic Press, New York, 1979.

[Herbrand, 1930] Jacques Herbrand. Recherches sur la théorie de la démonstration. PhD thesis, Université de Paris, 1930. English translation in [van Heijenoort, 1967].

[Hernández-Lerma and Lasserre, 1990] O. Hernández-Lerma and J. B. Lasserre. Error bounds for rolling horizon policies in discrete-time Markov control processes. IEEE Transactions on Automatic Control, 35(10):1118–1124, 1990.

[Heule et al., 2005] Marijn Heule, Joris van Zwieten, Mark Dufour, and Hans van Maaren. March eq: Implementing Additional Reasoning into an Efficient Lookahead Sat Solver. In Proceedings of the 7th International Conference on Theory and Applications of Satisfiability Testing (SAT’04), volume 3542 of LNCS, pages 345–359. Springer Verlag, 2005.

[Hillenbrand et al., 1999] Th. Hillenbrand, A. Jaeger, and B. Löchner. Waldmeister – Improvements in Performance and Ease of Use. In Proceedings of the 16th International Conference on Automated Deduction, volume 1632 of LNAI, pages 232–236. Springer Verlag, 1999.

[Hodes, 1972] Louis Hodes. Solving problems by formula manipulation in logic and linear inequalities. Artificial Intelligence, 3(1–3):165–174, 1972.

[Hodges, 1983] Wilfred Hodges. Elementary Predicate Logic. In and Franz Guenthner, editors, Handbook of Philosophical Logic. Vol. 1: Elements of Classical Logic, volume 164 of Synthese Library. Studies in Logic, Methodology and Philosophical Science, chapter I.1, pages 1–133. D. Reidel Publishing Company, Boston, MA, USA, 1983.

[Hoey et al., 1999] Jesse Hoey, Robert St. Aubin, Alan Hu, and Craig Boutilier. SPUDD: Stochastic Planning using Decision Diagrams. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (UAI’99), Stockholm, Sweden, 1999.

[Hoffmann, 2000] J. Hoffmann. A heuristic for domain independent planning and its use in an enforced hill-climbing algorithm. In Proceedings of the 12th International Symposium on Methodologies for Intelligent Systems. Springer Verlag, 2000.

[Hofreiter et al., 2002] B. Hofreiter, C. Huemer, and W. Klas. ebXML: Status, Research Issues and Obstacles. In Proceedings of the 12th Int. Workshop on Research Issues on Data Engineering (RIDE’02), San Jose, California, February 2002.

[Horrocks and Sattler, 2001] I. Horrocks and U. Sattler. Ontology reasoning in the SHOQ(D) description logic. In B. Nebel, editor, Proceedings of the 17th Int. Joint Conf. on Artificial Intelligence (IJCAI’01), pages 199–204. Morgan Kaufmann, 2001.

[Horrocks and Sattler, 2005] Ian Horrocks and Ulrike Sattler. A tableaux decision procedure for SHOIQ. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI’05), pages 448–453, 2005.

[Horrocks et al., 1999] I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for expressive description logics. In H. Ganzinger, D. McAllester, and A. Voronkov, editors, Proceedings of the 6th International Conference on Logic for Programming and Automated Reasoning (LPAR’99), volume 1705 of LNAI, pages 161–180. Springer Verlag, 1999.

[Horrocks et al., 2001] Ian Horrocks, Frank van Harmelen, and Peter Patel-Schneider. DAML+OIL Language Specification. http://www.daml.org/2001/03/daml+oil-index.html, March 2001.

[Horrocks et al., 2003] Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and RDF to OWL: The Making of a Web Ontology Language. Journal of Web Semantics, 1(1):7–26, 2003.

[Horrocks et al., 2004] Ian Horrocks, Peter F. Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, and Mike Dean. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. Available from http://www.w3.org/Submission/SWRL/, Member submission, W3C, May 2004.

[Horrocks et al., 2005] Ian Horrocks, Bijan Parsia, Peter Patel-Schneider, and James Hendler. Semantic Web Architecture: Stack or Two Towers? In Francois Fages and Sylvain Soliman, editors, Principles and Practice of Semantic Web Reasoning (PPSWR’05), volume 3703 of LNCS, pages 37–41. Springer Verlag, 2005.

[Horrocks, 1998] Ian Horrocks. Using an expressive description logic: FaCT or fiction? In Proceedings of the 6th International Conference on Principles of Knowledge Representation and Reasoning (KR98), pages 636–647, Trento, Italy, June 2–5 1998.

[Howard, 1960] Ronald Howard. Dynamic Programming and Markov Processes. The MIT Press, 1960.

[Huang and Fiedler, 1996] Xiaorong Huang and Armin Fiedler. Presenting Machine-Found Proofs. In Michael A. McRobbie and John K. Slaney, editors, Proceedings of the 13th International Conference on Automated Deduction (CADE–13), volume 1104 of LNAI, pages 221–225, New Brunswick, NJ, USA, July 30–August 3 1996. Springer Verlag.

[Huang and Fiedler, 1997] Xiaorong Huang and Armin Fiedler. Proof Verbalization in PROVERB. In Jörg Siekmann, Frank Pfenning, and Xiaorong Huang, editors, Proceedings of the 1st International Workshop on Proof Transformation and Presentation, pages 35–36, Schloss Dagstuhl, Germany, 1997.

[Huang, 1994] Xiaorong Huang. Reconstructing Proofs at the Assertion Level. In Alan Bundy, editor, Proceedings of the 12th Conference on Automated Deduction, volume 814 of LNAI, pages 738–752, Nancy, France, 1994. Springer Verlag.

[Hurd, 2002] Joe Hurd. An LCF-Style Interface between HOL and First-Order Logic. In Voronkov [2002], pages 134–138.

[Irani and Bashna, 2002] Romin Irani and S. Jeelani Bashna. AXIS: Next Generation Java SOAP. Wrox Press, 2002.

[Jensen et al., 2001] R. Jensen, M. Veloso, and M. Bowling. OBDD-Based Optimistic and Strong Cyclic Adversarial Planning. In Proceedings of the 6th European Conference on Planning (ECP’01), Toledo, Spain, September 12–14 2001.

[Joyner, 1976] W.H. Joyner. Resolution Strategies as Decision Procedures. Journal of the ACM, 23(3):398–417, 1976.

[Kakas and Miller, 1997] Antonis Kakas and Rob Miller. A simple declarative language for describing narratives with actions. The Journal of Logic Programming (Special Issue on Reasoning about Action and Change), 31(1–3):157–200, 1997.

[Kakas et al., 2000] Antonis Kakas, Rob Miller, and Francesca Toni. E-res - a system for reasoning about actions, events and observations. In Chitta Baral and Miroslaw Truszczynski, editors, Proceedings of the 8th International Workshop on Non-Monotonic Reasoning (NMR’00), Breckenridge, Colorado, April 9–11 2000.

[Kaufmann and Moore, 1996] Matt Kaufmann and J Strother Moore. ACL2: An Industrial Strength Version of Nqthm. In Eleventh Annual Conference on Computer Assurance (COMPASS ’96), pages 23–34, Gaithersburg, Maryland, USA, June 17–21 1996. IEEE Computer Society Press.

[Kerber, 1992] Manfred Kerber. On the Representation of Mathematical Concepts and their Translation into First Order Logic. PhD thesis, Fachbereich Informatik, Universität Kaiserslautern, Kaiserslautern, Germany, 1992. To prove difficult theorems in a mathematical field requires substantial knowledge of that field. In this thesis a frame-based knowledge representation formalism including higher-order sorted logic is presented, which supports a conceptual representation and to a large extent guarantees the consistency of the built-up knowledge bases. In order to operationalize this knowledge, for instance, in an automated theorem proving system, a class of sound morphisms from higher-order into first-order logic is given, in addition a sound and complete translation is presented. The translations are bijective and hence compatible with a later proof presentation. In order to prove certain theorems the comprehension axioms are necessary, (but difficult to handle in an automated system); such theorems are called truly higher-order. Many apparently higher-order theorems (i.e. theorems that are stated in higher-order syntax) however are essentially first-order in the sense that they can be proved without the comprehension axioms: for proving these theorems the translation technique as presented in this thesis is well-suited.

[Kifer et al., 2005] Michael Kifer, Jos de Bruijn, Harold Boley, and Dieter Fensel. A Realistic Architecture for the Semantic Web. In Proceedings of the International Conference on Rules and Rule Markup Languages for the Semantic Web (RuleML’05), Galway, Ireland, 2005.

[Kirchner and Kirchner, 1998] Claude Kirchner and Hélène Kirchner, editors. Proceedings of the 15th International Conference on Automated Deduction (CADE–15), volume 1421 of LNAI, Lindau, Germany, July 5–10 1998. Springer Verlag.

[Klusch et al., 2005] M. Klusch, A. Gerber, and M. Schmidt. Semantic Web Service Composition Planning with OWLS-Xplan. In Proceedings of the 1st International AAAI Fall Symposium on Agents and the Semantic Web, Arlington VA, USA, 2005.

[Klusch et al., 2006] M. Klusch, B. Fries, and K. Sycara. Automated Semantic Web Service Discovery with OWLS-MX. In Proceedings of the 5th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS’06), Hakodate, Japan, 2006. ACM Press.

[Knuth and Bendix, 1970] Donald E. Knuth and Peter B. Bendix. Simple word problems in universal algebras. In J. Leech, editor, Computational Problems in Abstract Algebra, pages 263–297. Pergamon Press, 1970.

[Kohlhase and Franke, 2001] Michael Kohlhase and Andreas Franke. Mbase: Representing Knowledge and Context for the Integration of Mathematical Software Systems. Journal of Symbolic Computation, 23(4):365–402, 2001.

[Kohlhase, 2006] Michael Kohlhase. OMDoc: An Open Markup Format for Mathematical Documents [version 1.2], volume 4180 of LNAI. Springer Verlag, 2006.

[Kushmerick et al., 1995] Nicholas Kushmerick, Steve Hanks, and Daniel S. Weld. An Algorithm for Probabilistic Planning. Artificial Intelligence, 76(1-2):239–286, 1995.

[Leibniz, 1986] G. W. Leibniz. Projet et Essais pour arriver à quelque certitude pour finir une bonne partie des disputes et pour avancer l’art d’inventer. In Karel Berka and Lothar Kreiser, editors, Logik-Texte: Kommentierte Auswahl zur Geschichte der Modernen Logik (vierte Auflage), chapter I.2, pages 16–18. Akademie-Verlag, Berlin, 1986.

[Lesperance et al., 1999] Yves Lesperance, Todd G. Kelley, John Mylopoulos, and Eric S. K. Yu. Modeling Dynamic Domains with ConGolog. In M. Jarke and A. Oberweis, editors, Proceedings of the 11th Conference on Advanced Information Systems Engineering (CAiSE’99), volume 1626 of LNCS, 1999.

[Letz and Stenz, 2001a] Reinhold Letz and Gernot Stenz. DCTP: A Disconnection Calculus Theorem Prover. In Goré et al. [2001], pages 381–385.

[Letz and Stenz, 2001b] Reinhold Letz and Gernot Stenz. Model Elimination and Connection Tableaux Procedures. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 2. Elsevier Science Publishers, North-Holland, Amsterdam, 2001.

[Levesque et al., 1997] H. Levesque, R. Reiter, Y. Lesperance, F. Lin, and R. Scherl. Golog: A logic programming language for dynamic domains. Journal of Logic Programming, 31:59–84, 1997.

[Lin and Reiter, 1997] Fangzhen Lin and Ray Reiter. How to Progress a Database. Artificial Intelligence, 92(1-2):131–167, 1997.

[Löchner and Hillenbrand, 2002] B. Löchner and T. Hillenbrand. The Next Waldmeister Loop. In Voronkov [2002], pages 486–500.

[Lutz et al., 2001] C. Lutz, H. Sturm, F. Wolter, and M. Zakharyaschev. Tableaux for Temporal Description Logic with Constant Domain. In Rajeev Goré, Alexander Leitsch, and Tobias Nipkow, editors, Proceedings of the 1st International Joint Conference on Automated Reasoning (IJCAR’01), volume 2083 of LNAI, pages 121–136, Siena, Italy, 2001. Springer Verlag.

[Manola et al., 2004] Frank Manola, Eric Miller, and Brian McBride. RDF Primer. http://www.w3.org/TR/rdf-primer/, February 2004.

[Martin et al., 2004] David Martin, Mark Burstein, Jerry Hobbs, Ora Lassila, Drew McDermott, Sheila McIlraith, Srini Narayanan, Massimo Paolucci, Bijan Parsia, Terry Payne, Evren Sirin, Naveen Srinivasan, and Katia Sycara. OWL-S: Semantic Markup for Web Services. http://www.daml.org/services/owl-s/1.1/overview, October 2004.

[Martinez and Lesprance, 2004] E. Martinez and Y. Lesprance. Web Service Composition as a Planning Task: Experiments using Knowledge-Based Planning. In Proceedings of the ICAPS-2004 Workshop on Planning and Scheduling for Web and Grid Services, pages 62–69, British Columbia, Canada, June 2004. Whistler.

[Mausam and Weld, 2004] Mausam and Daniel S. Weld. Solving Concurrent Markov Decision Processes. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI’04), San Jose, CA, USA, July 25–29 2004. American Association for Artificial Intelligence, AAAI Press.

[McBride, 2001] Brian McBride. Jena: Implementing the RDF Model and Syntax Specification. In Stefan Decker, Dieter Fensel, Amit Sheth, and Steffen Staab, editors, Proceedings of 2nd International Workshop on the Semantic Web (SemWeb’01), Hong Kong, China, May 2001. Available at http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-40/.

[McCarthy, 1968] J. McCarthy. Situations, actions and causal laws. In M. Minsky, editor, Semantic Information Processing, pages 410–417. MIT Press, Cambridge, Mass., 1968.

[McCluskey et al., 2003] T. L. McCluskey, D. Liu, and R. Simpson. Gipo ii: HTN planning in a Tool-supported Knowledge Engineering Environment. In Enrico Giunchiglia, Nicola Muscettola, and Dana S. Nau, editors, Proceedings of the 13th International Conference on Automated Planning and Scheduling (ICAPS 2003), Trento, Italy, June 9–13, 2003. AAAI.

[McCune, 1994a] W. McCune. A Davis-Putnam program and its application to finite first-order model search. Technical Report ANL/MCS-TM-194, Argonne National Laboratory, 1994.

[McCune, 1994b] W. McCune. Otter 3.0 Reference Manual and Guide. Technical Report ANL-94/6, Argonne National Laboratory, Argonne, USA, 1994.

[McCune, 1997] W. McCune. Solution of the Robbins Problem. Journal of Automated Reasoning, 19(3):263–276, 1997.

[McCune, 2003] W. McCune. Mace4 Reference Manual and Guide, 2003. http://www.citebase.org/cgi-bin/citations?id=oai:arXiv.org:cs/0310055.

[McDermott, 2002] D. McDermott. Estimated-regression planning for interactions with Web Services. In Proceedings of the 6th International Conference on AI Planning and Scheduling (AIPS’02), Toulouse, France, April 23–27 2002. AAAI Press.

[McIlraith and Son, 2002] S. McIlraith and T. Son. Adapting Golog for Composition of Semantic Web Services. In Proceedings of 8th International Conference on Principles of Knowledge Representation and Reasoning, Toulouse, France, 2002.

[McLaughlin, 2001] Brett McLaughlin. Java and XML. O’Reilly Associates, June 2001.

[McNeill et al., 2004] F. McNeill, A. Bundy, and C. Walton. Diagnosing and Repairing Ontological Mismatches. In Proceedings of the 2nd Starting AI Researchers’ Symposium, Valencia, Spain, August 2004.

[Meier, 2000] Andreas Meier. TRAMP: Transformation of Machine-Found Proofs into Natural Deduction Proofs at the Assertion Level. In D. McAllester, editor, Proceedings of the 17th Conference on Automated Deduction (CADE–17), volume 1831 of LNAI, pages 460–464, Pittsburgh, USA, 2000. Springer Verlag.

[Melis and Siekmann, 2004] E. Melis and J. Siekmann. Activemath: An intelligent tutoring system for mathematics. In L. Rutkowski, J. Siekmann, R. Tadeusiewicz, and L.A. Zadeh, editors, Proceedings of the 7th International Conference on Artificial Intelligence and Soft Computing (ICAISC’04), volume 3070 of LNAI, pages 91–101. Springer Verlag, 2004.

[Meng and Paulson, 2004] J. Meng and L. Paulson. Experiments On Supporting Interactive Proof Using Resolution. In David Basin and Michael Rusinowitch, editors, Automated Reasoning — 2nd International Joint Conference, IJCAR 2004, volume 3097 of LNAI, Cork, Ireland, July 4–8 2004. Springer Verlag.

[Millo et al., 1979] Richard A. De Millo, Richard J. Lipton, and Alan J. Perlis. Social processes and proofs of theorems and programs. Commun. ACM, 22(5):271–280, 1979.

[Milner, 1999] Robin Milner. Communicating and Mobile Systems: The Pi-Calculus. Cambridge University Press, 1999.

[Minton et al., 1994] Steven Minton, John Bresina, and Mark Drummond. Total-order and Partial-Order Planning: A Comparative Analysis. Journal of Artificial Intelligence Research, 2:227–262, 1994.

[Moler, 2006] Cleve Moler. Numerical Computing with MATLAB. Society for Industrial & Applied Mathematics, 2006.

[MONET, 2002] MONET. The MONET Project. http://monet.nag.co.uk/cocoon/monet/index.html, April 2002.

[Mortimer, 1975] M. Mortimer. On Languages with two Variables. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 21:135–140, 1975.

[Moskewicz et al., 2001] M.W. Moskewicz, C.F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff: Engineering an Efficient SAT Solver. In Proceedings of the 38th Design Automation Conference (DAC’01), Las Vegas Convention Center, June 2001. Association for Computing Machinery.

[Mossakowski et al., 2006] T. Mossakowski, C. Maeder, and K. Lüttich. The Heterogeneous Tool Set. Available at www.tzi.de/cofi/hets, University of Bremen, 2006.

[Narayanan and McIlraith, 2002] Srini Narayanan and Sheila A. McIlraith. Simulation, Verification and Automated Composition of Web Services. In Proceedings of the 11th International Conference on the World Wide Web (WWW’02), pages 77–88, New York, NY, USA, 2002. ACM Press.

[Nau et al., 2003] D. S. Nau, T. C. Au, O. Ilghami, U. Kuter, J. W. Murdock, D. Wu, and F. Yaman. Shop2: An HTN Planning System. Journal of Artificial Intelligence Research, 20:379–404, 2003.

[Nelson and Oppen, 1979] G. Nelson and D. C. Oppen. Simplification by cooperating decision procedures. ACM Transactions on Programming Languages and Systems (TOPLAS), 1(2):245–257, 1979.

[Nieuwenhuis and Oliveras, 2005] Robert Nieuwenhuis and Albert Oliveras. Decision Procedures for SAT, SAT Modulo Theories and Beyond. The Barcelogic Tools. In Geoff Sutcliffe and Andrei Voronkov, editors, Proceedings of the 12th International Conference on Logic for Programming, Artificial Intelligence and Reasoning (LPAR’05), LNAI 3835, pages 23–46, 2005.

[Nieuwenhuis, 2005] Robert Nieuwenhuis, editor. Proceedings of the 20th International Conference on Automated Deduction, volume 3632 of LNCS, Tallinn, Estonia, 2005. Springer Verlag.

[Nilsson, 1980] N. J. Nilsson. Principles of Artificial Intelligence. Tioga Publishing Co., Palo Alto, California, 1980.

[Nipkow et al., 2002] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL – A Proof Assistant for Higher-Order Logic, volume 2283 of LNCS. Springer Verlag, 2002.

[Nonnengart et al., 1998] Andreas Nonnengart, Georg Rock, and Christoph Weidenbach. On Generating Small Clause Normal Forms. In Kirchner and Kirchner [1998], pages 397–411.

[Nudelman et al., 2004] Eugene Nudelman, Kevin Leyton-Brown, Holger Hoos, Alex Devkar, and Yoav Shoham. Understanding Random SAT: Beyond the Clauses-to-Variables Ratio. In Proceedings of the 10th International Conference on the Principles and Practice of Constraint Programming (CP’04), volume 3258 of LNCS, pages 438–452, Toronto, Canada, September 27–October 1 2004. Springer Verlag.

[Owre et al., 1996] Sam Owre, Sreeranga Rajan, John M. Rushby, Natarajan Shankar, and Mandayam K. Srivas. PVS: Combining Specification, Proof Checking, and Model Checking. In Rajeev Alur and Thomas A. Henzinger, editors, Proceedings of the 8th International Conference Computer Aided Verification (CAV-96), volume 1102 of LNCS, pages 411–414, New Brunswick, NJ, USA, July 31–August 3 1996. Springer Verlag.

[Pasula et al., 2007] Hanna M. Pasula, Luke S. Zettlemoyer, and Leslie Pack Kaelbling. Learning symbolic models of stochastic domains. Journal of Artificial Intelligence Research, 29:309–352, 2007.

[Patel-Schneider et al., 2003] Peter F. Patel-Schneider, Patrick Hayes, and Ian Horrocks. OWL Web Ontology Language Semantics and Abstract Syntax. http://www.w3.org/TR/owl-semantics/, August 2003.

[Paulson, 1994] Lawrence C. Paulson. Isabelle: a Generic Theorem Prover, volume 828 of LNCS. Springer Verlag, Berlin, Germany, 1994.

[Pearl, 1988] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, 1988.

[Pednault, 1991] E. P. D. Pednault. Generalizing nonlinear planning to handle complex goals and actions with context-dependent effects. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, pages 249–254, 1991.

[Pelletier et al., 2002] F.J. Pelletier, G. Sutcliffe, and C.B. Suttner. The Development of CASC. AI Communications, 15(2-3):79–90, 2002.

[Peot and Smith, 1992] M. Peot and D. Smith. Conditional Nonlinear Planning. In James Hendler, editor, Proceedings of the 1st International Conference on AI Planning Systems, pages 189–197, College Park, Maryland, June 15–17 1992. Morgan Kaufmann.

[Pfalzgraf, 2006] Alexander Pfalzgraf. Ein robustes System zur automatischen Komposition semantischer Web Services in SmartWeb. Master’s thesis, Universität des Saarlandes, Saarbrücken, Germany, 2006.

[Pinto, 1994] Javier Andrés Pinto. Temporal Reasoning in the Situation Calculus. PhD thesis, University of Toronto, 1994.

[Pirri and Reiter, 1999] Fiora Pirri and Ray Reiter. Some contributions to the metatheory of the situation calculus. Journal of the ACM, 46(3):325–361, 1999.

[Pistore et al., 2004] M. Pistore, F. Barbon, P. Bertoli, D. Shaparau, and P. Traverso. Planning and monitoring web service composition. In The 11th International Conference on Artificial Intelligence, Methodologies, Systems, and Applications (AIMSA’04), Whistler, British Columbia, Canada, June 3–7 2004.

[Pistore et al., 2005] M. Pistore, P. Traverso, and P. Bertoli. Automated composition of web services by planning in asynchronous domains. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS’05), Monterey, California, USA, June 5–10 2005.

[Plaisted and Greenbaum, 1986] D. A. Plaisted and S. Greenbaum. A structure-preserving clause form translation. Journal of Symbolic Computation, 2(3):293–304, 1986.

[Poole, 1996] D. Poole. A Framework for Decision-Theoretic Planning I: Combining the Situation Calculus, Conditional Plans, Probability and Utility. In Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence (UAI’96), pages 436–445, Portland, Oregon, 1996.

[Prawitz, 1965] D. Prawitz. Natural Deduction – A Proof-Theoretical Study. Acta Universitatis Stockholmiensis 3. Almqvist & Wiksell, Stockholm, Sweden, 1965.

[Prud’hommeaux and Seaborne, 2006] Eric Prud’hommeaux and Andy Seaborne. The SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/, October 2006.

[Pryor and Collins, 1996] Louise Pryor and Gregg Collins. Planning for Contingencies: A Decision-based Approach. Journal of Artificial Intelligence Research, 4:287–339, 1996.

[Puterman, 1994] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York, 1994.

[Ranise and Deharbe, 2003] S. Ranise and D. Deharbe. Light-weight Theorem Proving for Debugging and Verifying Units of Code. In Proceedings of the 1st International Conference on Software Engineering and Formal Methods (SEFM’03), pages 220–228, Canberra, Australia, September 2003. IEEE Computer Society Press.

[Ranise and Tinelli, 2006] Silvio Ranise and Cesare Tinelli. The SMT-LIB Standard: Version 1.2. Available at http://combination.cs.uiowa.edu/smtlib/papers/format-v1.2-r06.08.30.pdf, August 2006.

[Rao et al., 2004] Jinghai Rao, Peep Küngas, and Mikhail Matskin. Logic-based Web Service Composition: from Service Description to Process Model. In Proceedings of the 2004 IEEE International Conference on Web Services, ICWS’04, pages 446–453, San Diego, California, USA, July 6–9 2004. IEEE Computer Society Press.

[Rao, 2004] Jinghai Rao. Semantic Web Service Composition via Logic-based Program Synthesis. PhD thesis, Department of Computer and Information Science, Norwegian University of Science and Technology, December 2004.

[Reiter, 1993] Ray Reiter. Proving Properties of States in the Situation Calculus. Artificial Intelligence, 64(2):337–351, 1993.

[Reiter, 1996] Ray Reiter. Natural actions, concurrency and continuous time in the situation calculus. In Luigia Carlucci Aiello, Jon Doyle, and Stuart Shapiro, editors, Proceedings of the 5th International Conference on Principles of Knowledge Representation and Reasoning (KR’96), Cambridge, MA, November 5–8 1996. Morgan Kaufmann, Los Altos.

[Reiter, 2001] Ray Reiter. Knowledge in Action: Logical Foundations for Describing and Implementing Dynamical Systems. MIT Press, 2001.

[Riazanov and Voronkov, 2001] A. Riazanov and A. Voronkov. Splitting without Backtracking. In B. Nebel, editor, Proceedings of the 17th International Joint Conference on Artificial Intelligence, pages 611–617. Morgan Kaufmann, 2001.

[Riazanov and Voronkov, 2002] A. Riazanov and A. Voronkov. The Design and Implementation of Vampire. AI Communications, 15(2-3):91–110, 2002.

[Robinson, 1965] John Alan Robinson. A Machine Oriented Logic Based on the Resolution Principle. Journal of the ACM, 12:23–41, 1965.

[Roman et al., 2005] Dumitru Roman, Uwe Keller, Holger Lausen, Jos de Bruijn, Rubén Lara, Michael Stollberg, Axel Polleres, Cristina Feier, Christoph Bussler, and Dieter Fensel. Web Service Modeling Ontology. Applied Ontology, 1(1):77–106, 2005.

[Rueß and Shankar, 2001] Harald Rueß and Natarajan Shankar. Deconstructing Shostak. In Joseph Halpern, editor, Proceedings of the 16th Annual IEEE Symposium on Logic in Computer Science (LICS’01), pages 19–28, Washington, DC, USA, 2001. IEEE Computer Society Press.

[Russell and Norvig, 1995] S. Russell and P. Norvig. Artificial Intelligence — A Modern Approach. Prentice–Hall, Englewood Cliffs, 1995.

[Schild, 1993] Klaus D. Schild. Combining terminological logics with tense logic. In Miguel Filgueiras and Luís Damas, editors, Proceedings of the 6th Portuguese Conference on Artificial Intelligence (EPIA’93), volume 727 of LNAI, pages 105–120, Porto, Portugal, October 1993. Springer Verlag.

[Schmidt-Schauß and Smolka, 1991] Manfred Schmidt-Schauß and Gert Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1–26, 1991.

[Schreiner and Caprotti, 2001] W. Schreiner and O. Caprotti. The MathBroker Project. http://www.risc.uni-linz.ac.at/projects/basic/mathbroker/, October 2001.

[Schulz, 2001] S. Schulz. System abstract: E 0.61. In Goré et al. [2001], pages 370–375.

[Seaborne, 2004] Andy Seaborne. RDQL - A Query Language for RDF. http://www.w3.org/Submission/RDQL/, January 2004.

[Shadbolt et al., 2006] Nigel Shadbolt, Tim Berners-Lee, and Wendy Hall. The Semantic Web Revisited. IEEE Intelligent Systems, 21(3):96–101, May/June 2006.

[Shostak, 1977] R. E. Shostak. On the SUP-INF Method for Proving Presburger Formulas. Journal of the ACM, 24(4):529–543, 1977.

[Shostak, 1984] R. E. Shostak. Deciding combinations of theories. Journal of the ACM, 31(1):1–12, 1984.

[Siegel, 1996] Jon Siegel. CORBA: Fundamentals and Programming. John Wiley & Sons, Inc., 1996.

[Siekmann et al., 2002] Jörg Siekmann, Christoph Benzmüller, Vladimir Brezhnev, Lassaad Cheikhrouhou, Armin Fiedler, Andreas Franke, Helmut Horacek, Michael Kohlhase, Andreas Meier, Erica Melis, Markus Moschner, Immanuel Normann, Martin Pollet, Volker Sorge, Carsten Ullrich, Claus-Peter Wirth, and Jürgen Zimmer. Proof development with Omega. In Proceedings of the 18th International Conference on Automated Deduction (CADE-18), volume 2392 of LNAI, pages 143–148, København, 2002. Springer.

[Siekmann et al., 2006] Jörg Siekmann, Christoph Benzmüller, and Serge Autexier. Computer Supported Mathematics with OMEGA. Journal of Applied Logic – Special Issue on Mathematics Assistance Systems, 2006.

[Sintek and Decker, 2002] Michael Sintek and Stefan Decker. TRIPLE – A Query, Inference, and Transformation Language for the Semantic Web. In Proceedings of the 1st International Semantic Web Conference (ISWC), Sardinia, Italy, June 2002.

[Sirin et al., 2003] Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pellet: A Practical OWL-DL Reasoner. http://www.mindswap.org/2003/pellet/, 2003.

[Sirin et al., 2006] Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pellet: A Practical OWL-DL Reasoner. Journal of Web Semantics, 2006. Submitted.

[Slaney, 1995] John Slaney. FINDER: Finite Domain Enumerator, 1995. available at ftp://arp.anu.edu.au/pub/papers/slaney/finder/finder.ps.gz.

[Smolka, 1995] Gert Smolka. The Oz Programming Model. In Jan van Leeuwen, editor, Computer Science Today, volume 1000 of LNCS, pages 324–343. Springer Verlag, 1995.

[Sonntag et al., 2007] Daniel Sonntag, Ralf Engel, Gerd Herzog, Alexander Pfalzgraf, Norbert Pfleger, Massimo Romanelli, and Norbert Reithinger. Smartweb Handheld – Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services. In Proceedings of the International Workshop on AI for Human Computing (AI4HC) in conjunction with (IJCAI) 2007, Hyderabad, India, 2007.

[Soutchanski, 2001] Mikhail Soutchanski. An On-line Decision-Theoretic Golog Interpreter. In Bernhard Nebel, editor, Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI), pages 19–26, Seattle, WA, USA, August 4–10 2001. Morgan Kaufmann, San Mateo, CA, USA.

[Soutchanski, 2003] Mikhail Soutchanski. High-Level Robot Programming in Dynamic and Incompletely Known Environments. PhD thesis, Department of Computer Science, University of Toronto, 2003.

[Stickel, 1994] M.E. Stickel. Upside-Down Meta-Interpretation of the Model Elimination Theorem-Proving Procedure for Deduction and Abduction. Journal of Automated Reasoning, 13(2):189–210, 1994.

[Sutcliffe and Suttner, 1998] Geoff Sutcliffe and Christian B. Suttner. The TPTP Problem Library. Journal of Automated Reasoning, 21(2):177–203, October 1998.

[Sutcliffe and Suttner, 2001] Geoff Sutcliffe and Christian B. Suttner. Evaluating General Purpose Automated Theorem Proving Systems. Artificial Intelligence, 131(1-2):39–54, 2001.

[Sutcliffe et al., 2003] Geoff Sutcliffe, Jürgen Zimmer, and Stephan Schulz. Communication Formalisms for Automated Theorem Proving Tools. In V. Sorge, S. Colton, M. Fisher, and J. Gow, editors, Proceedings of the Workshop on Agents and Automated Reasoning, 18th International Joint Conference on Artificial Intelligence, 2003.

[Sutcliffe et al., 2004] Geoff Sutcliffe, Jürgen Zimmer, and Stephan Schulz. TSTP Data-Exchange Formats for Automated Theorem Proving Tools. In Weixiong Zhang and Volker Sorge, editors, Distributed Constraint Problem Solving and Reasoning in Multi-Agent Systems, pages 201–215, 2004.

[Sutcliffe, 2005] Geoff Sutcliffe. Proceedings of the CADE-20 ATP System Competition. Tallinn, Estonia, 2005.

[Sutcliffe, 2006a] Geoff Sutcliffe. The 3rd IJCAR Automated Theorem Proving Competition. AI Communications, 2006. To appear.

[Sutcliffe, 2006b] Geoff Sutcliffe. The CADE-20 Automated Theorem Proving Competition. AI Communications, 19(2):173–181, 2006.

[Tate et al., 1996] Austin Tate, Brian Drabble, and Jeff Dalton. O-plan: a knowledge-based planner and its application to logistics. In Austin Tate, editor, Advanced Planning Technology. AAAI Press, May 1996.

[Telecom Italia Lab, 2003] Telecom Italia Lab. Java Agent DEvelopment Framework (JADE). http://sharon.cselt.it/projects/jade/, 2003.

[Traverso and Pistore, 2004] P. Traverso and M. Pistore. Automated composition of semantic web services into executable processes. In Proceedings of the 3rd International Semantic Web Conference (ISWC’04), Hiroshima, Japan, November 9–11 2004.

[Turing, 1937] Alan Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 42:230–265, 1937. 43:544–546.

[UDDI, 2000] UDDI. The UDDI Technical White Paper. Technical report, http://www.uddi.org, 2000.

[Urban, 2003] Josef Urban. Translating Mizar for First Order Theorem Provers. In Paul Cairns and Piotr Rudnicki, editors, Proceedings of the 2nd International Conference on Mathematical Knowledge Management, volume 2594 of LNCS, pages 203–215. Springer Verlag, 2003.

[Urban, 2004] Josef Urban. MPTP - motivation, implementation, first experiments. Journal of Automated Reasoning, 33(3-4):319–339, 2004.

[van Heijenoort, 1967] Jean van Heijenoort. From Frege to Gödel: A Source Book in Mathematical Logic 1879-1931. Source Books in the History of the Sciences. Harvard University Press, Cambridge, Massachusetts, 1967.

[Vardi, 1997] M.Y. Vardi. What makes modal logic so robustly decidable? In N. Immerman and Ph.G. Kolaitis, editors, Descriptive Complexity and Finite Models, pages 149–183. American Mathematical Society, 1997.

[Veloso et al., 1995] M. Veloso, J. Carbonell, A. Pérez, D. Borrajo, E. Fink, and J. Blythe. Integrating Planning and Learning: The PRODIGY Architecture. Journal of Experimental and Theoretical Artificial Intelligence, 7(1), 1995.

[Voronkov, 2002] Andrei Voronkov, editor. Proceedings of the 18th International Conference on Automated Deduction, volume 2392 of LNAI, Copenhagen, Denmark, July 27–30 2002. Springer Verlag.

[Walsh, 2002] Aaron E. Walsh. UDDI, SOAP, and WSDL: The Web Services Specification Reference Book. Pearson Education, 2002.

[Walther and Schweitzer, 2003] Christoph Walther and Stephan Schweitzer. About verifun. In Baader [2003], pages 322–327.

[Walton, 2005] C. Walton. Protocols for web service invocation. In Proceedings of the 1st AAAI Fall Symposium on Agents and the Semantic Web (ASW’05), Technical Report FS-05-0, Arlington, Virginia, USA, November 2005. AAAI Press.

[Warren, 1976] D. H. D. Warren. Generating Conditional Plans and Programs. In Proceedings of the Summer Conference on AI and Simulating Behavior, Edinburgh, 1976.

[Weidenbach et al., 1999] C. Weidenbach, B. Afshordel, U. Brahm, C. Cohrs, T. Engel, E. Keen, C. Theobalt, and D. Topić. System Description: SPASS Version 1.0.0. In H. Ganzinger, editor, Proceedings of the 16th International Conference on Automated Deduction, volume 1632 of LNAI, pages 378–382. Springer Verlag, 1999.

[Weld et al., 1998] Daniel S. Weld, Corin R. Anderson, and David E. Smith. Extending Graphplan to Handle Uncertainty and Sensing Actions. In Proceedings of AAAI-98, pages 897–904. The AAAI Press, USA, 1998.

[Whitehead and Russell, 1910] Alfred North Whitehead and Bertrand Russell. Principia Mathematica, volume I. Cambridge University Press, Cambridge, Great Britain, 1910.

[Wielemaker, 2003] Jan Wielemaker. An Overview of the SWI-Prolog Programming Environment. In Fred Mesnard and Alexander Serebenik, editors, Proceedings of the 13th International Workshop on Logic Programming Environments, pages 1–16, Heverlee, Belgium, December 2003. Katholieke Universiteit Leuven. CW 371.

[Wolfram, 1999] Stephen Wolfram. The Mathematica Book (5th Edition). Cambridge University Press, 1999.

[Wolter and Zakharyaschev, 1999] F. Wolter and M. Zakharyaschev. Temporalizing description logic. In D. Gabbay and M. de Rijke, editors, Frontiers of Combining Systems, pages 379–402. Studies Press/Wiley, 1999.

[Wu et al., 2003] Dan Wu, Evren Sirin, Bijan Parsia, James Hendler, and Dana Nau. Automatic web services composition using SHOP2. In Proceedings of Planning for Web Services Workshop in ICAPS 2003, Trento, Italy, June 2003.

[Zhang and Zhang, 1995] J. Zhang and H. Zhang. Sem: A System for Enumerating Models. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI’95), Montreal, Québec, Canada, August 20–25 1995. Morgan Kaufmann.

[Zimmer and Autexier, 2006] Jürgen Zimmer and Serge Autexier. The MathServe System for Semantic Web Reasoning Services. In Furbach and Shankar [2006], pages 140–144.

[Zimmer and Kohlhase, 2002] Jürgen Zimmer and Michael Kohlhase. System Description: The Mathweb Software Bus for Distributed Mathematical Reasoning. In Voronkov [2002], pages 139–143.

[Zimmer et al., 2004] Jürgen Zimmer, Andreas Meier, Geoff Sutcliffe, and Yuan Zhang. Integrated Proof Transformation Services. In Christoph Benzmüller and Wolfgang Windsteiger, editors, Proceedings of the IJCAR 2004 Workshop on Computer-Supported Mathematical Theory Development, Cork, Ireland, 2004. RISC Technical Report Series, Linz, Austria.

Part V

Appendices

Appendix A

The MathServe Domain Ontology

Unfortunately, the standard way to present OWL domain ontologies is still the XML/RDF syntax of OWL, and XML documents are not designed for humans to read. Tools for generating human-readable representations of OWL ontologies are either not yet available or do not produce the desired result. However, the Protégé tool can present the OWL-DL classes defined in MathServe’s domain ontology as a graph. The classes and their subsumption relationships are shown in Figure A.1 on page 217. The properties of the domain ontology cannot yet be presented as a graph. Therefore, they are presented in the DL syntax introduced in Chapter 3. In what follows, we present the properties and (if specified in the ontology) their domains and ranges:

⊤ ⊑ > 1 code ⊑ FOProblemFeature
⊤ ⊑ > 1 relation ⊑ DeltaRelation
⊤ ⊑ > 1 saturation ⊑ ProverResult
⊤ ⊑ > 1 output ⊑ ProverResult
⊤ ⊑ > 1 wallClockTime ⊑ TimeResource
⊤ ⊑ > 1 formalProof ⊑ Proof
⊤ ⊑ > 1 twelfDocument ⊑ Logic
⊤ ⊑ > 1 name ⊑ Logic ⊔ Syntax ⊔ Calculus ⊔ FOProblemFeature
⊤ ⊑ > 1 inTheory ⊑ ProvingProblem ⊔ DecisionProblem
⊤ ⊑ > 1 toFOF ⊑ DeltaRelation
⊤ ⊑ > 1 timeRes ⊑ ProvingProblem ⊔ ProverResult ⊔ Failure ⊔ DecisionProblem
⊤ ⊑ > 1 proofOf ⊑ Proof
⊤ ⊑ > 1 resultFor ⊑ ProverResult
⊤ ⊑ > 1 proof ⊑ ProverResult
⊤ ⊑ > 1 time ⊑ Result ⊔ Proof
⊤ ⊑ > 1 relatesCNF ⊑ DeltaRelation
⊤ ⊑ > 1 proofLogic ⊑ Proof
⊤ ⊑ > 1 alternativeProof ⊑ Proof
⊤ ⊑ > 1 language ⊑ ProvingProblem ⊔ FormalProof ⊔ DecisionProblem
⊤ ⊑ > 1 problemClass ⊑ Problem
⊤ ⊑ > 1 status ⊑ ProverResult ⊔ DecProcResult
⊤ ⊑ > 1 features ⊑ ProblemClass
⊤ ⊑ > 1 inLogic ⊑ ProvingProblem ⊔ Theory
⊤ ⊑ > 1 cnfFor ⊑ ProvingProblem
⊤ ⊑ > 1 format ⊑ TstpProblem ⊔ FOTheory
⊤ ⊑ > 1 calculus ⊑ Proof
⊤ ⊑ > 1 probName ⊑ ProvingProblem
⊤ ⊑ > 1 cpuTime ⊑ TimeResource
⊤ ⊑ > 1 axioms ⊑ ProverResult
⊤ ⊑ > 1 formalDescription ⊑ ProvingProblem ⊔ FOTheory ⊔ DecisionProblem
⊤ ⊑ > 1 message ⊑ Failure
⊤ ⊑ > 1 system ⊑ Result

⊤ ⊑ ∀ toFOF.TstpFOFProblem
⊤ ⊑ ∀ timeRes.TimeResource
⊤ ⊑ ∀ proofOf.ProvingProblem
⊤ ⊑ ∀ resultFor.ProvingProblem
⊤ ⊑ ∀ proof.FormalProof
⊤ ⊑ ∀ time.TimeResource
⊤ ⊑ ∀ relatesCNF.TstpCNFProblem
⊤ ⊑ ∀ proofLogic.Logic
⊤ ⊑ ∀ alternativeProof.Proof
⊤ ⊑ ∀ language.Syntax
⊤ ⊑ ∀ status.SystemStatus
⊤ ⊑ ∀ inLogic.Logic
⊤ ⊑ ∀ cnfFor.ProvingProblem
⊤ ⊑ ∀ format.FOProblemFormat
⊤ ⊑ ∀ calculus.Calculus
⊤ ⊑ ∀ relation.xsd:string
⊤ ⊑ ∀ saturation.xsd:string
⊤ ⊑ ∀ output.xsd:string
⊤ ⊑ ∀ wallClockTime.xsd:integer
⊤ ⊑ ∀ formalProof.xsd:string
⊤ ⊑ ∀ twelfDocument.xsd:string
⊤ ⊑ ∀ name.xsd:string
⊤ ⊑ ∀ probName.xsd:string
⊤ ⊑ ∀ cpuTime.xsd:integer
⊤ ⊑ ∀ axioms.xsd:string
⊤ ⊑ ∀ description.xsd:string
⊤ ⊑ ∀ formalDescription.xsd:string
⊤ ⊑ ∀ message.xsd:string
⊤ ⊑ ∀ system.xsd:string

[Figure: class graph rooted at owl:Thing, with is-a edges between subclasses and superclasses]

Figure A.1: The classes of the MathServe domain ontology and their subsumption relationship

Appendix B

Statuses of ATP Problems

In this appendix we describe the complete hierarchy of ATP statuses which we introduced in Section 4.4.2. The hierarchy is shown in Figure B.1.

[Figure: status hierarchy diagram. Node labels: Result; Satisfiability preserving; Vacuous theorem; Counter satisfiability preserving; Satisfiability partial mapping; Satisfiable; Counter satisfiable; Counter satisfiability partial mapping; Satisfiability mapping; Theorem; Neither; Counter theorem; Counter satisfiability mapping; Satisfiability bijection; Tautologies; Unsatisfiable; Counter satisfiability bijection; Error; Input error; Gave up; Timeout; Resource out; Unknown]

Figure B.1: Status hierarchy for the output of ATP systems

The hierarchy assumes that the input F to the ATP system is of the form Ax ⇒ C, where Ax is a set of formulae, C is a single formula, Ax and C have no free variables, and ⇒ is the standard first-order implication. An empty Ax, i.e., F is a monolithic formula (a particular example is a set of clauses), is the same as Ax being {true}. An empty C, e.g., when testing the satisfiability of a set of axioms, is the same as C being true. By showing that F is valid, an ATP system shows that the conjecture C is a theorem (a logical consequence) of the axioms Ax, i.e., Ax |= C, where |= is the standard first-order entailment indicating that every model of Ax is a model of C. If F is not valid there are several other possible relationships between Ax and C, as shown in the hierarchy and enumerated below. Associated with each possible status are possible outputs from an ATP system.
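For propositional inputs, the statuses in this appendix that depend only on how the models of Ax relate to the models of C (Tautologies, Theorem, ContradictoryAxioms, NoConsequence, CounterTheorem and Unsatisfiable) can be checked by enumerating all interpretations. The following Python sketch illustrates this reading of the definitions; the function name `classify` and the encoding of Ax and C as Python predicates over truth assignments are assumptions made for illustration and are not part of MathServe.

```python
from itertools import product

def classify(atoms, ax, c):
    """Return a status for F = Ax => C over the given propositional atoms.

    `ax` and `c` are predicates on an interpretation (dict atom -> bool).
    Illustrative encoding only, not MathServe code.
    """
    interps = [dict(zip(atoms, vals))
               for vals in product([False, True], repeat=len(atoms))]
    ax_models = [i for i in interps if ax(i)]
    if not ax_models:
        # There are no models of Ax.
        return "ContradictoryAxioms"
    every_interp_models_ax = len(ax_models) == len(interps)
    if all(c(i) for i in ax_models):
        # Every model of Ax is a model of C.
        return "Tautologies" if every_interp_models_ax else "Theorem"
    if not any(c(i) for i in ax_models):
        # Every model of Ax is a model of ~C.
        return "Unsatisfiable" if every_interp_models_ax else "CounterTheorem"
    # Some models of Ax satisfy C and some satisfy ~C.
    return "NoConsequence"

# Ax = {p | q}, C = p & r, as in the example for SatisfiabilityPartialMapping
status = classify(["p", "q", "r"],
                  lambda i: i["p"] or i["q"],
                  lambda i: i["p"] and i["r"])
print(status)
```

The weaker statuses (Satisfiable, CounterSatisfiable and the mapping statuses) overlap with these cases, so the sketch only reports one of the six listed statuses.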

1. Tautologies
   Every interpretation is a model of Ax and a model of C
   -- Shows: F is valid; ∼F is unsatisfiable; C is a tautology
   -- Outputs: Assurance; Proof of F; Refutation of ∼F

2. Theorem
   Every model of Ax (and there are some) is a model of C, but not case Tautologies
   -- Shows: F is valid; C is a theorem of Ax
   -- Outputs: Assurance; Proof of C from Ax; Refutation of Ax ∪ {∼C}; Refutation of the clause normal form of Ax ∪ {∼C}

3. Satisfiable
   Some models of Ax (and there are some) are models of C
   -- Shows: F is satisfiable; ∼F is not valid; C is not a theorem of Ax
   -- Outputs: Assurance; Model; Saturation

4. SatisfiabilityBijection
   There is a bijection between the models of Ax (and there are some) and models of C
   -- Examples: Skolemisation; Pseudo-splitting [Riazanov and Voronkov, 2001]
   -- Shows: Nothing about F
   -- Outputs: Assurance

5. SatisfiabilityMapping
   There is a mapping from the models of Ax (and there are some) to models of C
   -- Shows: Nothing about F
   -- Outputs: Assurance

6. SatisfiabilityPartialMapping
   There is a partial mapping from the models of Ax (and there are some) to models of C
   -- Example: Ax = {p | q}, C = p & r
   -- Shows: Nothing about F
   -- Outputs: Assurance; Pairs of models; Pairs of saturations

7. SatisfiabilityPreserving
   If there exists a model of Ax then there exists a model of C
   -- Shows: Nothing about F
   -- Outputs: Assurance

8. ContradictoryAxioms
   There are no models of Ax
   -- Shows: F is valid; Anything is a theorem of Ax
   -- Outputs: Assurance; Refutation of Ax; Refutation of the clause normal form of Ax

9. NoConsequence
   Some models of Ax (and there are some) are models of C, and some are models of ∼C.
   -- Shows: F is not valid; F is satisfiable; ∼F is not valid; ∼F is satisfiable; C is not a theorem of Ax
   -- Outputs: Assurance; Pair of models; Pair of saturations

10. CounterSatisfiabilityPreserving
   If there exists a model of Ax then there exists a model of ∼C
   -- Shows: Nothing about F
   -- Outputs: Assurance

11. CounterSatisfiabilityPartialMapping
   There is a partial mapping from the models of Ax (and there are some) to models of ∼C
   -- Shows: Nothing about F
   -- Outputs: Assurance; Pairs of models

12. CounterSatisfiabilityMapping
   There is a mapping from the models of Ax (and there are some) to models of ∼C
   -- Shows: Nothing about F
   -- Outputs: Assurance

13. CounterSatisfiabilityBijection
   There is a bijection between the models of Ax (and there are some) and models of ∼C
   -- Shows: Nothing about F
   -- Outputs: Assurance

14. CounterSatisfiable
   Some models of Ax (and there are some) are models of ∼C
   -- Shows: F is not valid; ∼F is satisfiable; C is not a theorem of Ax
   -- Outputs: Assurance; Model; Saturation

15. CounterTheorem
   Every model of Ax (and there are some) is a model of ∼C, but not Unsatisfiable
   -- Shows: F is not valid; ∼F is valid; ∼C is a theorem of Ax; C cannot be made into a theorem by extending Ax
   -- Outputs: Assurance; Proof of ∼C from Ax; Refutation of Ax ∪ {C}; Refutation of the clause normal form of Ax ∪ {C}

16. Unsatisfiable
   Every interpretation is a model of Ax and a model of ∼C
   -- Shows: F is unsatisfiable; ∼F is valid; ∼C is a tautology
   -- Outputs: Assurance; Refutation of F; Proof of ∼F

Appendix C

First-Order ATP Systems and their Performance

In what follows, we provide short descriptions of the first-order ATP systems integrated in MathServe. We will also present the performance data stored in the corresponding OWL-S service profiles.

C.1 ATP Systems in MathServe

The DCTP System

DCTP [Letz and Stenz, 2001a] is an automated theorem prover for first-order clause logic. It is an implementation of the disconnection calculus described in [Letz and Stenz, 2001b]. The disconnection calculus is a proof confluent and inherently cut-free tableaux calculus with a weak connectedness condition. The inherently depth-first proof search is guided by a literal selection based on how much a literal is instantiated or on literal complexity and a heavily parameterised link selection. The pruning mechanisms mostly rely on different forms of variant deletion and unit based strategies. Additionally the calculus has been augmented by full tableaux pruning.

The E System

E [Schulz, 2001] is a purely equational theorem prover. The calculus used by E combines superposition (with selection of negative literals) and rewriting. No special rules for non-equational literals have been implemented, i.e. resolution is simulated via paramodulation and equality resolution. E 0.62 includes AC redundancy elimination and AC simplification for dynamically recognised associative and commutative equational theories, as well as simulated clause splitting. E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e. a strict separation of active and passive facts. Proof search in E is primarily controlled by a literal selection strategy, a clause evaluation heuristic, and a simplification ordering. Supported term orderings are several parameterised instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering. The most distinctive feature of E is the maximally shared term representation. This includes parallel rewriting for all instances of a particular subterm. A second important feature is the use of perfect discrimination trees with age and size constraints for rewriting and unit-subsumption.

The Otter System

Otter [McCune, 1994b] is a fourth-generation Argonne National Laboratory deduction system whose ancestors (dating from the early 1960s) include the TP series, NIUTP, AURA, and ITP. It is designed to prove theorems stated in first-order logic with equality. Otter’s inference rules are based on resolution and paramodulation, and it includes facilities for term rewriting, term orderings, Knuth-Bendix completion, weighting, and strategies for directing and restricting searches for proofs. Otter can also be used as a symbolic calculator and has an embedded equational programming system. Although Otter proved interesting theorems in the past, like the Robbins conjecture [McCune, 1997], it is no longer one of the strongest provers available. However, one advantage of the Otter system is that its default calculus is restricted to a set of standard inference rules (binary resolution, factoring, paramodulation and demodulation), which makes it easy to process Otter refutation proofs with other tools (cf. Section 4.8.2).

The Paradox System

Paradox [Claessen and Sörensson, 2003] is a finite-domain model generator. It is based on a MACE-style flattening and instantiating of the first-order clauses into propositional clauses, and then the use of a SAT solver to solve the resulting problem (see Section 2.2.4). Paradox incorporates the following features: polynomial-time clause splitting heuristics, the use of incremental SAT, static symmetry reduction techniques, and the use of sort inference.

The SPASS System

SPASS [Weidenbach et al., 1999] is an automated theorem prover for classical first-order logic with equality. It is a saturation based prover employing superposition, sorts and splitting. In contrast to many approaches to order-sorted clausal reasoning, the calculus enables sort predicates and equations to occur arbitrarily within clauses. Therefore, the sort theory is not separated from the problem clauses, but automatically and dynamically extracted. SPASS also offers a variety of further inference and reduction rules including hyper resolution, unit resulting resolution, various variants of paramodulation and a terminator. SPASS relies on an internal library supporting specific data structures and algorithms like, for example, indexing or orderings (KBO).

The Vampire System

Vampire [Riazanov and Voronkov, 2002] implements the calculi of ordered binary resolution and superposition for handling equality. The splitting rule and negative equality splitting are simulated by the introduction of new predicate definitions and dynamic folding of such definitions. A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion (optionally modulo commutativity), subsumption resolution, rewriting by ordered unit equalities, and irreducibility of substitution terms. The reduction orderings used are the standard Knuth-Bendix ordering and a special non-recursive version of the Knuth-Bendix ordering. A number of efficient indexing techniques are used to implement all major operations on sets of terms and clauses. Run-time algorithm specialisation is used to accelerate some costly operations, e.g., checks of ordering constraints. Although the kernel of the system works only with clausal normal forms, the preprocessor component accepts a problem in the full first-order logic syntax, transforms it into CNF and performs a number of useful transformations before passing the result to the kernel. When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

The Waldmeister System

Waldmeister is a theorem prover for first-order unit equational logic. It is based on unfailing Knuth-Bendix completion. The system is one of the strongest provers for unit equality problems because it is very efficient in terms of time and space consumption. Waldmeister works in theories formulated as a set E of implicitly universally quantified equations over a many-sorted signature. It tries to prove whether a given equation s = t is valid in E, i.e. whether it holds in all models of E. Parametrised with a reduction ordering, unfailing Knuth-Bendix completion transforms E into a ground convergent set of rewrite rules. For theoretical reasons, this set is not necessarily finite, but if so, the word problem of E is solved by testing for syntactical identity after normalisation. In both cases, however, if s = t holds, then a proof is always found in finite time. This justifies the use of unfailing completion as a semi-complete proof procedure for equational logic. Accordingly, when searching for a proof, Waldmeister saturates the given axiomatisation until the goals can be shown by narrowing or rewriting.

C.2 Performance of the ATP Systems

Table C.1 shows the percentage of TPTP Library problems (v3.1.0) solved by the ATP systems currently available in MathServe. The first column of the table shows the name of the SPC as introduced in Section 4.4.4. SPCs populated by CASC-J3 problems are written in bold letters. The second column contains the number of problems in each SPC. The rest of Table C.1 shows the percentage of problems solved by the ATP systems in the corresponding SPC. If an ATP system (service) is chosen by MathServe’s optimal policy, the percentage is written in bold digits.

Specialist Problem Class  No. Probs  DCTP 10.21p  E 0.91  Paradox 1.3  SPASS 2.2  Vampire 8.0  Waldmeister 704
FOF_CSA_EPR                     321         88.0    91.0          0.0       86.0          0.0              0.0
FOF_CSA_RFO                      46         28.0    26.0         36.0       26.0          0.0              0.0
FOF_SAT_EPR                      48         90.0    77.0        100.0       92.0          0.0              0.0
FOF_SAT_RFO                      17          0.0    41.0         47.0       29.0          0.0              0.0
FOF_NKC_EPR                     504         38.0    96.0         48.0       98.0         93.0              0.0
FOF_NKC_RFO_EQU                1231          0.0    69.0          8.0       59.0         71.0              0.0
FOF_NKC_RFO_NEQ                  39          0.0   100.0          0.0       97.0        100.0              0.0
FOF_NKS_NUN_EPR                 116          0.0    80.0          0.0      100.0         61.0              0.0
FOF_NKS_NUN_RFO_NEQ               1          0.0     0.0          0.0        0.0          0.0              0.0
FOF_NKS_NUN_RFO_EQU               0          0.0     0.0          0.0        0.0          0.0              0.0
CNF_NKS_EPR                     480         99.0    98.0         96.0       99.0         98.0              0.0
CNF_NKS_RFO_NEQ_NHN             543          0.0    71.0          3.0       55.0         75.0              0.0
CNF_SAT_EPR                     221         92.0    64.0         88.0       67.0         48.0              0.0
CNF_SAT_RFO_NEQ                 276          0.0    52.0          0.0       45.0          0.0              0.0
CNF_SAT_RFO_EQU_NUE             224          0.0    58.0         75.0       55.0          0.0              0.0
CNF_SAT_RFO_PEQ_UEQ              94          0.0     7.0          0.0        6.0          0.0              5.0
CNF_NKS_RFO_NEQ_HRN             462          0.0    90.0          2.0       68.0         94.0              0.0
CNF_NKS_RFO_SEQ_HRN             391         59.0    88.0          0.0       50.0         93.0              0.0
CNF_NKS_RFO_SEQ_NHN            1816          0.0    50.0          2.0       37.0         52.0              0.0
CNF_NKS_RFO_PEQ_NUE             417          0.0    80.0          0.0       62.0         78.0              0.0
CNF_NKS_RFO_PEQ_UEQ             765          0.0    80.0          0.0       68.0         81.0             88.0

Table C.1: Percentages of problems solved by the ATP systems DCTP, E, Paradox, SPASS, Vampire and Waldmeister on the SPCs of the TPTP Library v3.1.0 with a 600sec time limit. The ATP systems chosen by MathServe are marked by bold letters. SPCs populated by CASC-J3 problems are also written in bold font.

Appendix D

An OWL-S Service Description

In Section 3.4 we introduced an abstract frame-like representation for OWL-S service profiles. In what follows, we present the RDF/XML syntax of a complete OWL-S service description to illustrate the correspondence to the service profiles used in this thesis. The proof transformation service TrampNDforFOF has been introduced in Section 4.8.2 and has the following profile:

profile mw#TrampNDforFOF:
  inputs:   fof_problem    :: mw#TptpFOFProblem
            atp_result     :: mw#FOBrFPResult
            delta_relation :: mw#DeltaRelation
  outputs:  nd_proof       :: mw#TwegaNDProof
  preconds: resultFor(atp_result, cnf_problem) ∧
            status(atp_result, mw#Unsatisfiable) ∧
            cnfFor(cnf_problem, fof_problem) ∧
            relatesCNF(delta_relation, cnf_problem) ∧
            toFOF(delta_relation, fof_problem)
  effects:  proofOf(nd_proof, fof_problem)
  categs:
  params:
  mw = http://www.mathweb.org/owl/mathserv.owl

On the following pages we present the complete OWL-S service description for TrampNDforFOF including the service profile and the mapping to the WSDL grounding of an instance of the service¹. Similar OWL-S descriptions have been created for all other services introduced in Chapter 4.

¹ The service instance is assumed to be available at http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService.

]>

This is the OWL-S service description for the TrampND web service. The service takes a first-order logic conjecture in TPTP FOF format and the result of a first-order automated theorem prover for that problem. Furthermore, it requires the delta relation computed when the CNF of the FOF problem was created. If the given ATP result contains a proof in TPTP format, the service generates a Natural Deduction (ND) proof for the conjecture.


&mw;#TptpFOFProblem

&mw;#FOBrFPResult

&mw;#DeltaRelation

&mw;#TwegaNDProof

This is a local variable that represents the unknown problem in TPTP CNF format. &mw;#TptpCNFProblem


TrampNDforFOF

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#ndForFOFRequest

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#in0

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#in1

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#in2

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#ndForFOFReturn

TrampNDService ndForFOF

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl#ndForFOFResponse

http://hades.ags.uni-sb.de:8080/axis/services/TrampNDService?wsdl

Always True

Appendix E

Translation Functions for Web Service Composition

In this Appendix we present formal translation functions needed for automated Web Service composition in MathServe as described in Chapter 6. In Section E.1 we define functions to translate OWL-S service profiles and query profiles to PRODIGY planning domains and planning problems, respectively. In Section E.2 we define functions to translate (query) profiles into stochastic situation calculus action domains and Golog procedures.

E.1 Translation to Planning Domains and Problems

PRODIGY planning domains are expressed as LISP S-expressions. We define the set SExpr of all S-expressions as the language produced by the following BNF-style grammar starting from the non-terminal s_expression.

s_expression  ::= atomic_symbol
                | "(" s_expression "." s_expression ")"
                | s_expression s_expression
                | list
list          ::= "(" s_expression {s_expression} ")"
atomic_symbol ::= letter atom_part
atom_part     ::= empty | letter atom_part | number atom_part
letter        ::= "a" | "b" | ... | "z" | "A" | "B" | ... | "Z"
                | "<" | ">" | "/" | "-" | "_" | "∼" | ":"
number        ::= "1" | "2" | ... | "9"
empty         ::= " "
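As a small illustration of the atomic_symbol rule, the following Python sketch checks whether a string is an atomic symbol in the sense of the grammar. It rests on two assumptions: the letter alphabet contains the underscore (the corresponding character is not legible in the grammar as extracted), and empty is treated as the empty string. The function name `is_atomic_symbol` is chosen for this example only.

```python
import re

# letter: a-z, A-Z and the special characters < > / - _ ∼ :
# (the underscore is an assumption; ∼ is U+223C as used in this appendix)
# number: 1-9 only, as in the grammar (the digit 0 is not listed)
_ATOMIC_SYMBOL = re.compile(r"[A-Za-z<>/_:\u223c-][A-Za-z<>/_:\u223c1-9-]*")

def is_atomic_symbol(text: str) -> bool:
    """True if `text` is a letter followed by any number of letters or digits 1-9."""
    return _ATOMIC_SYMBOL.fullmatch(text) is not None

print(is_atomic_symbol("<in1>"))      # variable-style symbol
print(is_atomic_symbol("OWL-Thing"))  # type name produced by the translation
print(is_atomic_symbol("1abc"))       # starts with a digit
```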

Ontology Classes as Prodigy Types

We define the functions τcln and τcld for ontology classes as follows.

Definition 5.1 (Translation Function – Ontology)
Let O = ⟨A, E⟩ be an OWL-DL ontology and C_O the set of simple named classes in O (cf. Definition 3.5, page 64). The function

τcln : C_O ∪ {owl:Thing} → SExpr

translates names of simple named OWL-DL classes to PRODIGY type names:

τcln(owl:Thing) := OWL-Thing
τcln(ns:C) := NS-C, ∀ (ns:C) ∈ C_O

The partial function

τcld : OWL-DL → SExpr

translates OWL-DL class definition axioms into PRODIGY type declarations. For every class C ∈ C_O with a defining axiom ′C = C1 ⊓ ... ⊓ Cn′ ∈ A or ′C ⊑ C1 ⊓ ... ⊓ Cn′ ∈ A, we define

τcld(C = C1 ⊓ ... ⊓ Cn ∈ A) := ((ptype-of τcln(C) τcln(Ci1)) ... (ptype-of τcln(C) τcln(Cim))), and

τcld(C ⊑ C1 ⊓ ... ⊓ Cn ∈ A) := ((ptype-of τcln(C) τcln(Ci1)) ... (ptype-of τcln(C) τcln(Cim))),

where {Ci1,...,Cim} = {C1,...,Cn} ∩ C_O is the set of all simple named classes occurring in the definition of C. ♦
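The two class translations can be sketched in a few lines of Python. This is only an illustration of Definition 5.1 under stated assumptions: class names are given as strings of the form ns:C, the conjuncts of a defining axiom are passed as a list, and the functions `tau_cln` and `tau_cld` as well as the example classes are hypothetical names for this example.

```python
def tau_cln(qname: str) -> str:
    """Map a qualified class name 'ns:C' to the PRODIGY type name 'NS-C'."""
    namespace, _, local = qname.partition(":")
    return f"{namespace.upper()}-{local}"

def tau_cld(cls: str, conjuncts: list, named_classes: set) -> str:
    """Emit one ptype-of declaration per simple named class among the conjuncts.

    Conjuncts that are not simple named classes (e.g. restrictions) are skipped,
    mirroring the intersection with the set of simple named classes.
    """
    parents = [c for c in conjuncts if c in named_classes]
    decls = " ".join(f"(ptype-of {tau_cln(cls)} {tau_cln(p)})" for p in parents)
    return f"({decls})"

named = {"mw:Proof", "mw:FormalProof"}
print(tau_cln("owl:Thing"))
print(tau_cld("mw:FormalProof", ["mw:Proof", "some-restriction"], named))
```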

SWRL Atoms and Formulae

The function τswrl translates SWRL literals and formulae into LISP expressions:

Definition 5.2 (Translation Function – SWRL Formulae)
Let O be an OWL-DL ontology and let SWRL_O^∧∨ be the set of all SWRL conjunctions and disjunctions (see Definition 3.9, page 66). The function

τswrl : SWRL_O^∧∨ → SExpr

translates atoms and formulae into preconditions and effects of PRODIGY operators. It ignores data-valued property atoms and is defined as follows:

τswrl(⟨i1, i2⟩ ∈ R) := (R <i1> <i2>)

τswrl(¬⟨i1, i2⟩ ∈ R) := ∼(R <i1> <i2>)

τswrl(L1 ∧ ... ∧ Ln) := (and τswrl(L1) ... τswrl(Ln))

τswrl(A1 ∨ ... ∨ An) := (or τswrl(A1) ... τswrl(An)) ♦
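A recursive reading of this definition can be sketched in Python as follows. The tuple encoding of formulae, the function name `tau_swrl`, and the rendering of individuals as PRODIGY variables in angle brackets are assumptions for illustration; the ASCII tilde stands in for the negation sign ∼.

```python
def tau_swrl(formula) -> str:
    """Translate a SWRL formula, encoded as nested tuples, into an S-expression string.

    Encoding (hypothetical): ("atom", R, i1, i2), ("not", atom),
    ("and", [literals]), ("or", [atoms]).
    """
    kind = formula[0]
    if kind == "atom":
        _, relation, i1, i2 = formula
        return f"({relation} <{i1}> <{i2}>)"
    if kind == "not":
        return "~" + tau_swrl(formula[1])
    if kind in ("and", "or"):
        parts = " ".join(tau_swrl(sub) for sub in formula[1])
        return f"({kind} {parts})"
    raise ValueError(f"unsupported formula kind: {kind!r}")

precondition = ("and", [
    ("atom", "resultFor", "atp_result", "cnf_problem"),
    ("not", ("atom", "proofOf", "nd_proof", "fof_problem")),
])
print(tau_swrl(precondition))
```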

Service Profiles as Planning Operators

The function τsp generates PRODIGY planning operator descriptions from OWL-S service profiles. It ignores probabilistic effects:

Definition 5.3 (Translation Function – Atomic Processes)
Let O be an OWL-DL ontology and OWLSP_O the set of all OWL-S profiles based on O (see Definition 3.23). The function

τsp : OWLSP_O → SExpr

translates OWL-S service profiles into PRODIGY planning operators. For a service profile p = (u,I,O,P,E,C,PE) with I = {in1 :: C1,..., inm :: Cm}, O = {out1 :: D1,..., outn :: Dn}, P = {ϕ1,...,ϕk}, E = {ψ1,...,ψl}, it is defined as follows:

τsp(p) := (OPERATOR u
            (params <in1> ... <inm> <out1> ... <outn>)
            (preconds
              ((<in1> τcln(C1)) ... (<inm> τcln(Cm)))
              ((<out1> τcln(D1)) ... (<outn> τcln(Dn)))
              (and (bound <in1>) ... (bound <inm>)
                   (∼ (bound <out1>)) ... (∼ (bound <outn>))
                   τswrl(ϕ1) ... τswrl(ϕk)))
            (effects () (
              (add (bound <out1>)) ... (add (bound <outn>))
              (add τswrl(ψ1)) ... (add τswrl(ψl)) )))

          (pinstance-of out1_1 τcln(D1)) ... (pinstance-of out1_5 τcln(D1))
          ...
          (pinstance-of outn_1 τcln(Dn)) ... (pinstance-of outn_5 τcln(Dn))

          (Control-Rule reject-wrong-binding-for-U
            (if (and
                  (applicable-operator (U <in1> ... <inm> <out1> ... <outn>))
                  (∼ (exact-type-of-object <out1> τcln(D1))) ...
                  (∼ (exact-type-of-object <outn> τcln(Dn)))))
            (then sub-goal))

Query Profiles as Planning Problems

OWL-S query profiles are translated into PRODIGY planning problems.

Definition 5.4 (Translation Function – Query Profiles)
Let O be an OWL-DL ontology and OWLSP_O the set of all OWL-S profiles based on O. The function

τqp : OWLSP_O → SExpr

translates OWL-S query profiles into PRODIGY planning problems. For a query profile q = (u,I,O,P,E,C,∅) with I = {in1 :: C1,...,inm :: Cm}, O = {out1 :: D1,...,outn :: Dn}, P = {ϕ1,...,ϕk}, E = {ψ1,...,ψl} it generates the planning problem:

τqp(q) := (create-problem (name U)
            (objects (in1 τcln(C1)) ... (inm τcln(Cm)))
            (state (and τswrl(ϕ1) ... τswrl(ϕk)))
            (goal (and
                    (bound out1) ... (bound outn)
                    τswrl(ψ1) ... τswrl(ψl))))

The set Obj_τqp(q) := {in1,...,inm} is the set of all data objects in τqp(q). ♦
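The shape of the generated planning problem can be illustrated with a short Python sketch. It assumes that the type names and the translated precondition and effect formulae have already been produced (by τcln and τswrl) and are passed in as strings; the function name `tau_qp` and all argument names and example values are hypothetical.

```python
def tau_qp(name, inputs, outputs, state, goal):
    """Assemble a PRODIGY create-problem form from already translated parts.

    inputs:  list of (object name, PRODIGY type name) pairs
    outputs: list of output object names
    state:   translated precondition formulae (strings)
    goal:    translated effect formulae (strings)
    """
    objects = " ".join(f"({obj} {typ})" for obj, typ in inputs)
    bound = " ".join(f"(bound {out})" for out in outputs)
    return (f"(create-problem (name {name}) "
            f"(objects {objects}) "
            f"(state (and {' '.join(state)})) "
            f"(goal (and {bound} {' '.join(goal)})))")

def data_objects(inputs):
    """The data objects of the problem are exactly the input names."""
    return {obj for obj, _ in inputs}

problem = tau_qp("Q", [("in1", "MW-TptpFOFProblem")], ["out1"],
                 ["(R in1 in1)"], ["(proofOf out1 in1)"])
print(problem)
```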

Generating Complete Planning Problems

Combining the above translation functions we can generate complete PRODIGY planning domains and problems including type definitions and control rules.

Definition 5.5 (Translation Function for PRODIGY Domains)
Let O be an OWL-DL ontology and {C1,...,Cm} the set of simple named classes in O. Let also S = {p1,...,pn} be a set of OWL-S service profiles based on O and q be a query profile based on O. The function

τcompl : OWLDL × P(OWLSP_O) × OWLQP_O → SExpr

creates a complete PRODIGY domain and a planning problem corresponding to the service and the query profiles. It is defined as

τcompl(O,S,q) := τcld(C1) ... τcld(Cm)
                 τsp(p1) ... τsp(pn)
                 τqp(q)

The set Obj_τcompl(O,S,q) of data objects in the domain is defined as

Obj_τcompl(O,S,q) := Obj_τqp(q) ∪ {o | (pinstance-of o C) occurs in τsp(p1)} ♦

PRODIGY Plans

We define PRODIGY plans to be sequences of instantiated planning operators.

Definition 5.6 (PRODIGY Plans)
Let O be an OWL-DL ontology, S ⊂ OWLSP_O be a set of OWL-S profiles based on O with names U := {name(p) | p ∈ S}, and q be a query profile based on O. Let further τcompl(O,S,q) be the PRODIGY planning domain created for S. The set LP_S of all legal plans for τcompl(O,S,q) is defined as

LP_S := { ⟨u1(x_1^(1),...,x_o1^(1)), ..., un(x_1^(n),...,x_on^(n))⟩ | n ∈ ℕ ∧ 1 ≤ i ≤ n ∧ ∃p ∈ S. ui = name(p) ∧ oi = |inputs(p) ∪ outputs(p)| ∧ x_l^(i) ∈ Obj_τcompl(O,S,q) },

i.e. the set of all tuples of operators instantiated with data objects defined in the planning domain τcompl(O,S,q). ♦

E.2 Translation to the Situation Calculus

For decision-theoretic reasoning, OWL-S service profiles are translated into a DTGolog action domain, i.e. a set of situation calculus formulae. We start with action precondition axioms.

Actions and Precondition Axioms

First we need transformation functions that add a situation argument to atoms in SWRL formulae. In what follows, L_sitcalc denotes the language of the situation calculus as defined in [Reiter, 2001].

Definition 5.7 (Situated Formulae)
Let Vsc be the countably infinite set of variables of the situation calculus. The function

σsit : (SWRL_O^∧¬ ∪ SWRL_O^∧∨) × Vsc → L_sitcalc

directly translates SWRL atoms and formulae into situation calculus atoms and formulae by adding situation arguments as follows:

σsit(⟨i1, i2⟩ ∈ R, s) := R(i1, i2, s)

σsit(¬⟨i1, i2⟩ ∈ R, s) := ¬R(i1, i2, s)

σsit(A1 ∧ ... ∧ An, s) := σsit(A1, s) ∧ ... ∧ σsit(An, s)

σsit(A1 ∨ ... ∨ An, s) := σsit(A1, s) ∨ ... ∨ σsit(An, s).
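The effect of σsit is simply to append the situation term to every atom. A minimal Python sketch, assuming the same hypothetical tuple encoding of SWRL formulae as plain nested tuples and a function name `sigma_sit` chosen for this example, could look like this:

```python
def sigma_sit(formula, s: str = "s") -> str:
    """Render a SWRL formula as a situation calculus formula with situation term `s`.

    Encoding (hypothetical): ("atom", R, i1, i2), ("not", atom),
    ("and", [subformulae]), ("or", [subformulae]).
    """
    kind = formula[0]
    if kind == "atom":
        _, relation, i1, i2 = formula
        return f"{relation}({i1}, {i2}, {s})"
    if kind == "not":
        return "¬" + sigma_sit(formula[1], s)
    if kind == "and":
        return " ∧ ".join(sigma_sit(sub, s) for sub in formula[1])
    if kind == "or":
        return " ∨ ".join(sigma_sit(sub, s) for sub in formula[1])
    raise ValueError(f"unsupported formula kind: {kind!r}")

situated = sigma_sit(("and", [
    ("atom", "resultFor", "atp_result", "cnf_problem"),
    ("not", ("atom", "proofOf", "nd_proof", "fof_problem")),
]))
print(situated)
```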

For a simplified presentation we define the parameter names for an OWL-S service profile.

Definition 5.8 (Parameter Names of a Profile)
Let O be an OWL-DL ontology and p = (u,I,O,P,E,C,PE) ∈ OWLSP_O be an OWL-S service profile with I = {in1 :: C1,..., inm :: Cm} and O = {out1 :: D1,..., outn :: Dn}. If ⟨x1,...,xm⟩ is the lexicographic ordering¹ of {in1,...,inm} and ⟨y1,...,yn⟩ is the lexicographic ordering of {out1,...,outn}, then we define the parameter names of p as the tuple ⟨z1,...,zo⟩ := ⟨x1,...,xm,y1,...,yn⟩ with o = n + m. ♦
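As a worked example of this definition, the following Python sketch computes the parameter tuple by sorting input and output names separately and concatenating the results. The function name `parameter_names` is hypothetical, and the parameter names with underscores are taken from the TrampNDforFOF profile of Appendix D as an assumed spelling.

```python
def parameter_names(inputs, outputs):
    """Lexicographically ordered inputs followed by lexicographically ordered outputs."""
    return tuple(sorted(inputs)) + tuple(sorted(outputs))

params = parameter_names(
    {"fof_problem", "atp_result", "delta_relation"},
    {"nd_proof"},
)
print(params)
```

Python's `sorted` compares strings by code point, which gives a unique order for distinct names.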

Action precondition axioms can be generated directly from the preconditions of service profiles.

Definition 5.9 (Service Preconditions as Action Preconditions)
Let O be an OWL-DL ontology and OWLSP_O the set of all OWL-S profiles based on O. The function

σpre : OWLSP_O → L_sitcalc

generates action precondition axioms from profile preconditions and is defined as follows:

σpre(p) := Poss(u(z1,...,zo), s) ≡ ∃ x1,...,xl. σsit(ϕ1, s) ∧ ... ∧ σsit(ϕk, s)

where s ∈ Vsc is an arbitrary situation calculus variable, u is the name of p, ⟨z1,...,zo⟩ are the parameter names of p, {ϕ1,...,ϕk} = preconds(p), and {x1,...,xl} = vars(ϕ1 ∧ ... ∧ ϕk) ∩ {z1,...,zo} (see Definition 3.11, page 67). ♦

Successor State Axioms

The generation of successor state axioms follows the general description in Section 2.4.1.2 with two differences: In our domain, all fluent symbols have arity 2². Furthermore, the deterministic situation calculus actions introduced for nature’s choices have to be taken into account by the successor state axioms.

Footnote 1: The lexicographic ordering of distinct URI references is uniquely defined.

Footnote 2: This is due to the fact that all SWRL atoms are binary.

Definition 5.10 (Fluents and Atoms of Service Profiles) Let $p$ be a service profile and $S = \{p_1, \ldots, p_n\}$ be a set of service profiles. The situation calculus fluents associated with $p$ and $S$ are defined as

$$\mathcal{F}_p := \{F \mid (\langle i_1, i_2 \rangle \in F) \in lits(effects(p)) \text{ or } (\neg \langle i_1, i_2 \rangle \in F) \in lits(effects(p))\}$$

and

$$\mathcal{F}_S := \bigcup_{p \in S} \mathcal{F}_p,$$

respectively. The sets of (positive) fluent atoms associated with $p$ and $S$ are defined as

$$\mathcal{F}_p^+ := \{F(i_1, i_2, s) \mid (\langle i_1, i_2 \rangle \in F) \in lits(effects(p))\} \quad \text{and} \quad \mathcal{F}_S^+ := \bigcup_{p \in S} \mathcal{F}_p^+,$$

respectively. Similarly, the sets of negative fluent atoms associated with $p$ and $S$ are defined as

$$\mathcal{F}_p^- := \{F(i_1, i_2, s) \mid (\neg \langle i_1, i_2 \rangle \in F) \in lits(effects(p))\} \quad \text{and} \quad \mathcal{F}_S^- := \bigcup_{p \in S} \mathcal{F}_p^-.$$

Disjunctive effects contain only positive SWRL atoms whose literals are defined as

$$\mathcal{F}_p^\vee := \{F(i_1, i_2, s) \mid (\langle i_1, i_2 \rangle \in F) \in atoms(disjEffs(p))\} \quad \text{and} \quad \mathcal{F}_S^\vee := \bigcup_{p \in S} \mathcal{F}_p^\vee.$$ ♦
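The sets of Definition 5.10 can be illustrated with a minimal Python sketch. It assumes that each effect literal is represented as a tuple `(positive, property, i1, i2)`; this representation, the function names, and the example effects (including the property `pending`) are hypothetical and chosen only for illustration.

```python
def fluents(effects):
    """F_p: property symbols occurring in positive or negative effect literals."""
    return {prop for (_positive, prop, _i1, _i2) in effects}


def fluent_atoms(effects, positive=True, situation="s"):
    """F_p^+ (positive=True) or F_p^- (positive=False), with a situation argument added."""
    return {(prop, i1, i2, situation)
            for (is_pos, prop, i1, i2) in effects if is_pos == positive}


def fluents_of_set(profiles_effects):
    """F_S: union of F_p over all profiles in S."""
    result = set()
    for effects in profiles_effects:
        result |= fluents(effects)
    return result


# Hypothetical effect literals of two profiles.
analyser = [(True, "problemClass", "problem", "prob_class")]
prover = [(True, "resultFor", "atp_result", "problem"),
          (False, "pending", "problem", "queue")]
all_fluents = fluents_of_set([analyser, prover])
```

Every atom carries exactly two individuals plus the situation term, which reflects the fact that all fluents in this domain are binary.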

Nature’s choices are new situation calculus action names generated from the name of a service profile. We need a new action name for each choice in the disjunctive effect of an OWL-S service. We also define a one-to-one mapping between nature’s choices and literals in disjunctive effects.

Definition 5.11 (Nature’s Choices for Disjunctive Effects) Let $p$ be a service profile with name $u = name(p)$. We define the set $CH_p$ of nature’s choices for the disjunctive effects of $p$ as a set of new (distinct) names derived from $u$ as follows:

$$CH_p = \{\text{"}u\_1\text{"}, \ldots, \text{"}u\_n\text{"} \mid n = |\mathcal{F}_p^\vee|\},$$

i.e. $CH_p$ contains $|\mathcal{F}_p^\vee|$ new situation calculus action names. We will omit the double quotes in what follows and simply write $CH_p = \{u\_1, \ldots, u\_n\}$. We also define $lit_p(\cdot)$ to be a bijection between nature’s choices and the atoms in $\mathcal{F}_p^\vee$. If $\langle A_1, \ldots, A_{|\mathcal{F}_p^\vee|} \rangle$ is the lexicographic ordering of the atoms in $\mathcal{F}_p^\vee$ then we define

$$lit_p : CH_p \to \mathcal{F}_p^\vee \text{ bijective}$$

$$lit_p(u\_i) := A_i \quad \text{for } CH_p = \{u\_1, \ldots, u\_n\} \text{ and } 1 \le i \le n$$ ♦
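The naming scheme and the bijection of Definition 5.11 might be sketched as follows. The function name `natures_choices`, the tuple representation of atoms, and the example service name are assumptions; Python's tuple ordering is used in place of the lexicographic ordering of atoms.

```python
def natures_choices(name, disjunctive_atoms):
    """Return lit_p as a dict mapping the choice names u_1..u_n to the atoms A_1..A_n."""
    # Assumption: sorting the tuples mirrors the lexicographic ordering of atoms.
    ordered = sorted(set(disjunctive_atoms))
    return {f"{name}_{i}": atom for i, atom in enumerate(ordered, start=1)}


# Hypothetical disjunctive effect of a prover service.
lit = natures_choices("Prover", [("status", "result", "Theorem"),
                                 ("status", "result", "Satisfiable")])
# Because lit_p is a bijection, its inverse is obtained by swapping keys and values.
lit_inverse = {atom: choice for choice, atom in lit.items()}
```

Sorting before numbering makes the assignment of choice names deterministic, so repeated translations of the same profile yield the same action names.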

For every literal in a service’s effects we also have to define which situation calculus action it is linked to.

Definition 5.12 (Literal Action Mapping) Let $p$ be a service profile. The mapping $act_p$ associates a situation calculus action name with all atoms in the effects of $p$. It is defined as

$$act_p : \mathcal{F}_p^+ \cup \mathcal{F}_p^\vee \to CH_p \cup \{name(p)\}$$

$$act_p(F(x, y, s)) := \begin{cases} name(p) & \text{if } F(x, y, s) \in \mathcal{F}_p^+ - \mathcal{F}_p^\vee \\ lit_p^{-1}(F(x, y, s)) & \text{if } F(x, y, s) \in \mathcal{F}_p^\vee \end{cases}$$ ♦

Now we can define the positive and negative normal form effect axioms that are needed for successor state axioms.

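Definition 5.13 (Normal Form Positive Effect Axioms) Let $p \in OWLSP_{\mathcal{O}}$ be a service profile with name $u = name(p)$ and parameters $\langle z_1, \ldots, z_o \rangle$. Let further $F \in \mathcal{F}_p$ be a fluent occurring in the effects of $p$. For every fluent literal $\zeta_F = F(t_1, t_2, s) \in \mathcal{F}_p^+ \cup \mathcal{F}_p^\vee$, the normal form positive effect axiom for $F$ and $p$ and its antecedent $\Psi_p^{\zeta_F}$ are defined as

$$\underbrace{\exists y_1, \ldots, y_l.\ x_1 = t_1 \wedge x_2 = t_2 \wedge a = act_p(\zeta_F)(z_1, \ldots, z_o)}_{\Psi_p^{\zeta_F}} \supset F(x_1, x_2, do(a, s)),$$

where $\{y_1, \ldots, y_l\} = \{z_1, \ldots, z_o\} \setminus \{x_1, x_2\}$. Furthermore, given a set $S = \{p_1, \ldots, p_n\}$ of service profiles, the positive normal form effect axiom for a fluent $F \in \mathcal{F}_S$ is defined as

$$\gamma_F^+(x_1, x_2, a, s) := \Big( \bigvee_{p \in S}\ \bigvee_{\zeta_F \in \mathcal{F}_p^+ \cup \mathcal{F}_p^\vee} \Psi_p^{\zeta_F} \Big) \supset F(x_1, x_2, do(a, s)).$$ ♦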

Definition 5.14 (Normal Form Negative Effect Axioms) Let $p \in OWLSP_{\mathcal{O}}$ be a service profile with name $u = name(p)$ and parameters $\langle z_1, \ldots, z_o \rangle$. Let further $F \in \mathcal{F}_p$ be a fluent occurring in the negative effects of $p$. For every fluent literal $\eta_F = F(t_1, t_2, s) \in \mathcal{F}_p^-$, the normal form negative effect axiom for $F$ and $p$ and its antecedent $\Theta_p^{\eta_F}$ are defined as

$$\underbrace{\exists y_1, \ldots, y_l.\ x_1 = t_1 \wedge x_2 = t_2 \wedge a = u(z_1, \ldots, z_o)}_{\Theta_p^{\eta_F}} \supset \neg F(x_1, x_2, do(a, s)),$$

where $\{y_1, \ldots, y_l\} = \{z_1, \ldots, z_o\} \setminus \{x_1, x_2\}$. Given a set $S = \{p_1, \ldots, p_n\}$ of service profiles, the negative normal form effect axiom for a fluent $F \in \mathcal{F}_S$ is defined as

$$\gamma_F^-(x_1, x_2, a, s) := \Big( \bigvee_{p \in S}\ \bigvee_{\eta_F \in \mathcal{F}_p^-} \Theta_p^{\eta_F} \Big) \supset \neg F(x_1, x_2, do(a, s)).$$ ♦

As a consequence of the causal completeness assumption (see Section 2.4.1.2) we can now define the successor state axioms in the following way.

Definition 5.15 (Successor State Axioms) Let $\mathcal{O}$ be an OWL-DL ontology and $OWLSP_{\mathcal{O}}$ the set of all OWL-S profiles based on $\mathcal{O}$. The function

$$\sigma_{ssa} : \mathcal{P}(OWLSP_{\mathcal{O}}) - \{\emptyset\} \to \mathcal{L}_{sitcalc}$$

computes successor state axioms for a non-empty finite set $S$ of OWL-S service profiles as follows:

$$\sigma_{ssa}(S) := \bigcup_{F \in \mathcal{F}_S} \{F(x_1, x_2, do(a, s)) \equiv \gamma_F^+(x_1, x_2, a, s) \vee F(x_1, x_2, s) \wedge \neg \gamma_F^-(x_1, x_2, a, s)\}.$$ ♦
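The right-hand side of a successor state axiom can be read as a simple Boolean update rule. The following Python sketch is an illustration only: the function `holds_after` and the example action sequence are assumptions, and the two effect flags stand for the truth values of the positive and negative effect conditions for a given action and situation.

```python
def holds_after(held_before, positive_effect, negative_effect):
    """A fluent holds after an action iff the action makes it true,
    or it held before and the action does not make it false."""
    return positive_effect or (held_before and not negative_effect)


# Hypothetical sequence of actions, each given as (positive_effect, negative_effect).
history = []
state = False
for pos, neg in [(True, False), (False, False), (False, True)]:
    state = holds_after(state, pos, neg)
    history.append(state)
print(history)
```

The second action in the example has no effect on the fluent, so its value persists, which is the behaviour the causal completeness assumption is meant to capture.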

Conditions, Probabilities and Costs of Nature’s Choices

Stochastic action domains require the specification of nature’s choices for stochastic actions, the probabilities they occur with, and the costs associated with them. We first define functions which generate axioms for defining nature’s choices.

Definition 5.16 (Nature’s Choices and Conditions) Let $\mathcal{O}$ be an OWL-DL ontology and $p \in OWLSP_{\mathcal{O}}$ an OWL-S profile based on $\mathcal{O}$. We define two functions

$$\sigma_{ch}, \sigma_{sc} : OWLSP_{\mathcal{O}} \to \mathcal{L}_{sitcalc}.$$

For a service profile $p$ with parameters $\langle z_1, \ldots, z_o \rangle$, $\sigma_{ch}$ generates axioms for nature’s choices for $p$ as

$$\sigma_{ch}(p) := \{choice(u(z_1, \ldots, z_o), u\_i(z_1, \ldots, z_o)) \mid u\_i \in CH_p\}.$$

The function $\sigma_{sc}$ generates axioms that define the conditions sensed by nature’s choices as follows:

$$\sigma_{sc}(p) := \{senseCond(u\_i(z_1, \ldots, z_o), lit_p(u\_i)) \mid u\_i \in CH_p\}.$$ ♦
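To make the two translation functions concrete, the following Python sketch renders the axioms of Definition 5.16 as plain strings. The helper names `sigma_ch` and `sigma_sc`, the string representation of atoms, and the example service are assumptions for illustration; the output follows the predicate names `choice` and `senseCond` of the definition rather than any concrete Prolog encoding.

```python
def sigma_ch(name, params, choices):
    """One choice(...) axiom per nature's choice of the stochastic action `name`."""
    args = ", ".join(params)
    return [f"choice({name}({args}), {c}({args}))" for c in choices]


def sigma_sc(params, lit):
    """One senseCond(...) axiom per nature's choice, using lit_p for the sensed atom."""
    args = ", ".join(params)
    return [f"senseCond({c}({args}), {atom})" for c, atom in lit.items()]


# Hypothetical profile with two disjunctive outcomes.
params = ["problem", "result"]
lit = {"Prover_1": "status(result, Satisfiable)",
       "Prover_2": "status(result, Theorem)"}
choice_axioms = sigma_ch("Prover", params, lit.keys())
sense_axioms = sigma_sc(params, lit)
```

Both functions reuse the parameter tuple of the original service, so every choice of nature takes the same arguments as the stochastic action it belongs to.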

By default, every choice of nature occurs with the same probability. However, if a service profile contains explicit probabilities for certain choices, then those probabilities should be specified in the situation calculus action domain. First, we identify the set $CH_p^{ex}$ of choices with explicit probabilities:

Definition 5.17 (Nature’s Choices with Explicit Probabilities) Let $p \in OWLSP_{\mathcal{O}}$ be an OWL-S profile. For nature’s choice $u \in CH_p$, the set $S_p^u$ of explicit probability statements for $u$ in $p$ is defined as

$$S_p^u := \{(\varphi \Rightarrow \psi, x, c) \in cpEffects(p) \mid \psi = lit_p(u)\}.$$

The set $CH_p^{ex}$ of choices with at least one explicit probability statement is defined as

$$CH_p^{ex} := \{u \in CH_p \mid S_p^u \neq \emptyset\}.$$ ♦

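With the help of $CH_p^{ex}$ we can generate axioms defining the probabilities of nature’s choices.

Definition 5.18 (Probabilities and Costs of Nature’s Choices) Let $p \in OWLSP_{\mathcal{O}}$ be an OWL-S profile with parameters $\langle z_1, \ldots, z_o \rangle$ and $CH_p^{ex} = \{u_1, \ldots, u_n\}$. The functions

$$\sigma_{prob}, \sigma_{cost} : OWLSP_{\mathcal{O}} \to \mathcal{L}_{sitcalc}$$

generate probability and cost statements for $p$ as follows:

$$\sigma_{prob}(p) := \Big\{ prob(u(z_1, \ldots, z_o), s) = pr \equiv \Big( \bigvee_{(\varphi \Rightarrow \psi, x, c) \in S_p^u} \sigma_{sit}(\varphi, s) \wedge pr = x \Big) \ \Big|\ u \in CH_p^{ex} \Big\}$$
$$\cup\ \Big\{ prob(u(z_1, \ldots, z_o), s) = pr \equiv \Big( \bigwedge_{1 \le i \le n} prob(u_i(z_1, \ldots, z_o), s) = pr_i \Big) \wedge pr = (1 - pr_1 - \ldots - pr_n) / (|CH_p| - |CH_p^{ex}|) \ \Big|\ u \in CH_p - CH_p^{ex} \Big\}$$

and

$$\sigma_{cost}(p) := \Big\{ cost(u(z_1, \ldots, z_o), s) = k \equiv \Big( \bigvee_{(\varphi \Rightarrow \psi, x, c) \in S_p^u} \sigma_{sit}(\varphi, s) \wedge k = c \Big) \ \Big|\ u \in CH_p^{ex} \Big\} \cup \{ cost(u(z_1, \ldots, z_o), s) = k_0 \mid u \in CH_p - CH_p^{ex} \}$$ ♦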
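The default probability assigned to choices without an explicit probability statement can be checked with a short calculation. The Python sketch below is an illustration under stated assumptions: the function name `default_probability` is hypothetical, and the explicit probabilities are passed in as plain numbers instead of being derived from conditional effects.

```python
def default_probability(explicit_probs, num_choices):
    """Probability shared equally by all choices without an explicit statement:
    (1 - pr_1 - ... - pr_n) / (|CH_p| - |CH_p^ex|)."""
    remaining = num_choices - len(explicit_probs)
    if remaining <= 0:
        raise ValueError("every choice already has an explicit probability")
    return (1.0 - sum(explicit_probs)) / remaining


# 21 choices and no explicit statements: each choice gets 1/21.
uniform = default_probability([], 21)
# 4 choices, two of them with explicit probabilities 0.5 and 0.25.
shared = default_probability([0.5, 0.25], 4)
```

With 21 choices and no explicit statements the sketch yields 1/21, roughly 0.0476, which is the uniform case described in the text.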

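Using all the above translation functions we can generate complete stochastic situation calculus action domains.

Definition 5.19 (Situation Calculus Action Domain) Let $S \subset OWLSP_{\mathcal{O}}$ be a set of OWL-S profiles based on $\mathcal{O}$, and $q \in OWLSP_{\mathcal{O}}$ be a query profile with preconditions $\{\varphi_1, \ldots, \varphi_n\}$. The (stochastic) situation calculus action domain for $S$ and $q$ is defined as

$$\mathcal{D}_S^q := \Sigma \cup \mathcal{D}_{ap} \cup \mathcal{D}_{ssa} \cup \mathcal{D}_{dt} \cup \mathcal{D}_{s_0}^q \cup \mathcal{D}_{una},$$

with $\Sigma$ as described in Section 2.4.1.1 and

-- $\mathcal{D}_{ap} = \{\sigma_{pre}(p) \mid p \in S\}$,

-- $\mathcal{D}_{ssa} = \{\sigma_{ssa}(p) \mid p \in S\}$,

-- $\mathcal{D}_{dt} = \{\sigma_{prob}(p) \mid p \in S\} \cup \{\sigma_{cost}(p) \mid p \in S\}$,

-- $\mathcal{D}_{s_0}^q = \{\sigma_{sit}^{s_0}(\varphi_1), \ldots, \sigma_{sit}^{s_0}(\varphi_n)\}$, and

-- $\mathcal{D}_{una} = \{u_1(z_1, \ldots, z_o) \neq u_2(y_1, \ldots, y_u) \mid p_1 \neq p_2 \in S \text{ with } u_1 = name(p_1), u_2 = name(p_2)\} \cup \{u(z_1, \ldots, z_o) = u(y_1, \ldots, y_o) \supset z_1 = y_1 \wedge \ldots \wedge z_o = y_o \mid u \in S\}$ ♦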

Queries as Golog Procedures

For decision-theoretic reasoning in the situation calculus, sequences of services (PRODIGY plans) have to be described as Golog procedures. For a more compact representation we have to define the longest common prefix of all plans.

Definition 5.20 (Longest Common Prefix of Plans) Let $S \subset OWLSP_{\mathcal{O}}$ be a set of OWL-S profiles and $q \in OWLSP_{\mathcal{O}}$ be a query profile. Let further

$$P = \{\langle s_1^{(k)}, \ldots, s_{n_k}^{(k)} \rangle \mid 1 \le k \le n_0 \wedge n_k \in \mathbb{N}\} \subset \mathcal{L}_{PS}$$

be a set of $n_0 \in \mathbb{N}$ plans for the planning domain $\tau_{compl}(\mathcal{O}, S, q)$ (see Definition 5.6). We define the longest common prefix $Pre_{max}^P$ of $P$ to be

$$Pre_{max}^P := \langle s_1^{(1)}, \ldots, s_{\nu_{max}}^{(1)} \rangle$$

where

$$\nu_{max} = \max\{\nu \in \mathbb{N} \cup \{0\} \mid s_i^{(\kappa_1)} = s_i^{(\kappa_2)} \text{ for all } i \in \mathbb{N}_\nu,\ \kappa_1, \kappa_2 \in \mathbb{N}_{n_0} \text{ with } \kappa_1 \neq \kappa_2\},$$

i.e. $Pre_{max}^P$ is the longest sequence of instantiated operators that occurs at the beginning of all plans in $P$. ♦
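The longest common prefix of Definition 5.20 can be computed by comparing the plans position by position. The following Python sketch is a minimal illustration: plans are assumed to be lists of step names, and the function name and the example plans are hypothetical.

```python
def longest_common_prefix(plans):
    """Return the longest sequence of steps shared at the beginning of all plans."""
    if not plans:
        return []
    prefix = []
    # zip stops at the shortest plan, so the prefix never exceeds any plan's length.
    for steps in zip(*plans):
        if all(step == steps[0] for step in steps):
            prefix.append(steps[0])
        else:
            break
    return prefix


# Hypothetical plans: each starts with an analysis step, then calls a different prover.
plans = [["analyse", "otter"],
         ["analyse", "vampire"],
         ["analyse", "spass", "transform"]]
common = longest_common_prefix(plans)
```

For a single plan the sketch returns the whole plan, and for plans that differ in their first step it returns the empty list, so no prefix is factored out in that case.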

To achieve a compact representation, the longest common prefix occurs only once at the beginning of the Golog procedure. It is followed by a nondeterministic choice of the remaining plan steps of the different plans.

Definition 5.21 (Plans and Queries as Golog Procedures) Let $S \subset OWLSP_{\mathcal{O}}$ be a set of OWL-S profiles and $q \in OWLSQP_{\mathcal{O}}$ be a query profile with name $v$ and parameter names $\langle z_1, \ldots, z_o \rangle$. Let further $E_{\mathcal{F}_S} := \{\psi_1, \ldots, \psi_m\}$ be the set of all effects with property symbols in $\mathcal{F}_S$, i.e. $E_{\mathcal{F}_S} = \{\psi \in effects(q) \mid props(\psi) \subset \mathcal{F}_S\}$. Let also

$$P = \{\langle s_1^{(k)}, \ldots, s_{n_k}^{(k)} \rangle \mid 1 \le k \le n_0\} \subset \mathcal{L}_{PS}$$

be a set of $n_0 \in \mathbb{N}$ plans with plan steps corresponding to services in $S$ and with longest common prefix $Pre_{max}^P := \langle s_1, \ldots, s_t \rangle$. The function

$$\sigma_{proc} : OWLSQP_{\mathcal{O}} \times 2^{OWLSP_{\mathcal{O}}} \to GOLOG$$

generates a Golog procedure from $q$ and $P$ as follows:

$$\begin{array}{l}
\sigma_{proc}(q, P) := \mathbf{proc}\ v\_Proc(z_1, \ldots, z_o) \\
\quad s_1 ; \ldots ; s_t ; \\
\quad ((s_{t+1}^{(1)} ; \ldots ; s_{n_1}^{(1)}) \\
\quad \mid (s_{t+1}^{(2)} ; \ldots ; s_{n_2}^{(2)}) \\
\quad \ \ \vdots \\
\quad \mid (s_{t+1}^{(n_0)} ; \ldots ; s_{n_{n_0}}^{(n_0)})) \\
\quad (\psi_1 \wedge \ldots \wedge \psi_m)? \\
\mathbf{endProc}
\end{array}$$

♦

Appendix F

Planning and Situation Calculus Domains

The MathServe broker performs automated Semantic Web Service composition with a combination of classical planning and decision-theoretic reasoning in the situation calculus. The composition approach has been presented in Chapter 6. In this appendix we present a PRODIGY planning domain (Section F.1) and a stochastic situation calculus action domain (Section F.2) as created by MathServe’s service composer for the query ResultQuery with the following profile:

query ResultQuery:
  inputs:   fof_problem :: mw#TptpFOFProblem
  outputs:  atp_result :: mw#FoATPResult
  preconds:
  effects:  resultFor(atp_result, fof_problem)
            status(atp_result, stat#Theorem) .
  categs:
  timeout:  10secs

mw = http://www.mathweb.org/owl/mathserv.owl
tptp = http://www.tptp.org/Problems
stat = http://www.mathweb.org/owl/status.owl

F.1 The PRODIGY Planning Domain for ResultQuery

(create-problem-space 'mathserve :current t)

(ptype-of OWL-Thing :top-type)
(ptype-of OWL-ProblemClass OWL-Thing)
(ptype-of OWL-Calculus OWL-Thing)
(ptype-of OWL-Syntax OWL-Thing)
(ptype-of OWL-ProverState OWL-Thing)
(ptype-of OWL-Specialisation OWL-Thing)
(ptype-of OWL-Logic OWL-Thing)
(ptype-of OWL-Resource OWL-Thing)
(ptype-of OWL-Output OWL-Thing)
(ptype-of OWL-ProblemFeature OWL-Thing)
(ptype-of OWL-Problem OWL-Thing)
(ptype-of OWL-Proof OWL-Thing)
(ptype-of OWL-FOProblemFormat OWL-Thing)
(ptype-of OWL-Result OWL-Thing)
(ptype-of OWL-ProvingProblemClass OWL-ProblemClass)
(ptype-of OWL-NDCalculus OWL-Calculus)
(ptype-of OWL-ResolutionCalculus OWL-Calculus)
(ptype-of OWL-SequentCalculus OWL-Calculus)
(ptype-of OWL-FOATPState OWL-ProverState)
(ptype-of OWL-FOSpecialisation OWL-Specialisation)
(ptype-of OWL-SetTheory OWL-Logic)
(ptype-of OWL-HOLogic OWL-Logic)
(ptype-of OWL-FOLogic OWL-Logic)
(ptype-of OWL-TimeResource OWL-Resource)
(ptype-of OWL-FORefutation OWL-Output)
(ptype-of OWL-Assurance OWL-Output)
(ptype-of OWL-ProvingProblemFeature OWL-ProblemFeature)
(ptype-of OWL-DecisionProblem OWL-Problem)
(ptype-of OWL-SolvingProblem OWL-Problem)
(ptype-of OWL-ComputationProblem OWL-Problem)
(ptype-of OWL-ProvingProblem OWL-Problem)
(ptype-of OWL-InformalProof OWL-Proof)
(ptype-of OWL-FormalProof OWL-Proof)
(ptype-of OWL-ModelGeneratorResult OWL-Result)
(ptype-of OWL-ProverResult OWL-Result)
(ptype-of OWL-Failure OWL-Result)
(ptype-of OWL-DecisionProcResult OWL-Result)
(ptype-of OWL-FOProblemClass OWL-ProvingProblemClass)
(ptype-of OWL-ConstructiveNDCalculus OWL-NDCalculus)
(ptype-of OWL-ClassicalNDCalculus OWL-NDCalculus)
(ptype-of OWL-BfFPResolutionCalculus OWL-ResolutionCalculus)
(ptype-of OWL-FOProblemFeature OWL-ProvingProblemFeature)
(ptype-of OWL-OmdocProvingProblem OWL-ProvingProblem)
(ptype-of OWL-FOProvingProblem OWL-ProvingProblem)
(ptype-of OWL-OmdocFOProvingProblem OWL-FOProvingProblem)
(ptype-of OWL-ResolutionProof OWL-FormalProof)
(ptype-of OWL-NDProof OWL-FormalProof)
(ptype-of OWL-OmdocNDProof OWL-NDProof)
(ptype-of OWL-NDProverResult OWL-ProverResult)
(ptype-of OWL-FoAtpResult OWL-ProverResult)
(ptype-of OWL-FoXmlAtpResult OWL-ProverResult)
(ptype-of OWL-FoScriptResult OWL-ProverResult)
(ptype-of OWL-FOOmdocProvingProblem OWL-OmdocProvingProblem)
(ptype-of OWL-TptpProblem OWL-FOProvingProblem)
(ptype-of OWL-TptpCNFRefutation OWL-ResolutionProof)
(ptype-of OWL-XTptpCNFRefutation OWL-ResolutionProof)
(ptype-of OWL-TwegaNDProof OWL-NDProof)
(ptype-of OWL-FOBrFPResult OWL-ProverResult)
(ptype-of OWL-FOXmlBrFPResult OWL-FoXmlAtpResult)
(ptype-of OWL-TptpCNFProblem OWL-TptpProblem)
(ptype-of OWL-TptpFOFProblem OWL-TptpProblem)
(ptype-of OWL-TptpCnfLcRefutation OWL-TptpCNFRefutation)
(ptype-of OWL-TptpCnfBrFPRefutation OWL-TptpCnfLcRefutation)
(ptype-of OWL-Annotation OWL-Thing)
(ptype-of OWL-DeltaRelation OWL-Annotation)

(OPERATOR www.mathweb.org/owl/analyse/TPTPAnalyserService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-ProvingProblemClass) ) (and (bound ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (problemClass )) )))

(pinstance-of object_prob_class_1 OWL-ProvingProblemClass)

(Control-Rule reject-wrong-binding-for-TPTPAnalyser (if (and (applicable-operator (www.mathweb.org/owl/analyse/TPTPAnalyserService.owl )) (or (~ (exact-type-of-object OWL-ProvingProblemClass)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/OtterService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FOBrFPResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_2 OWL-FOBrFPResult)

(Control-Rule reject-wrong-binding-for-OtterATP (if (and (applicable-operator (www.mathweb.org/owl/atp/OtterService.owl )) (or (~ (exact-type-of-object OWL-FOBrFPResult)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/ParadoxService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_3 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-ParadoxATP

(if (and (applicable-operator (www.mathweb.org/owl/atp/ParadoxService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/SpassService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_4 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-SpassATP (if (and (applicable-operator (www.mathweb.org/owl/atp/SpassService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/VampireService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_5 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-VampireATP (if (and (applicable-operator (www.mathweb.org/owl/atp/VampireService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult))

))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/EService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_6 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-EpATP (if (and (applicable-operator (www.mathweb.org/owl/atp/EService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/WaldmeisterService.owl (params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_7 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-WaldmeisterATP (if (and (applicable-operator (www.mathweb.org/owl/atp/WaldmeisterService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult)) ))) (then sub-goal))

(OPERATOR www.mathweb.org/owl/atp/DctpService.owl

(params ) (preconds ( ( OWL-TptpProblem) ( OWL-TimeResource) ( OWL-Thing) ( OWL-FoAtpResult) ) (and (bound ) (bound ) (problemClass ) (~ (bound )) )) (effects ( ) ( (add (bound )) (add (resultFor )) )))

(pinstance-of object_atp_result_8 OWL-FoAtpResult)

(Control-Rule reject-wrong-binding-for-DctpATP (if (and (applicable-operator (www.mathweb.org/owl/atp/DctpService.owl )) (or (~ (exact-type-of-object OWL-FoAtpResult)) ))) (then sub-goal))

(pinstance-of dummy_time_resource0 OWL-TimeResource) (pinstance-of dummy_time_resource1 OWL-TimeResource) (pinstance-of dummy_time_resource2 OWL-TimeResource) (pinstance-of dummy_time_resource3 OWL-TimeResource) (pinstance-of dummy_time_resource4 OWL-TimeResource)

(setf (current-problem) (create-problem (name www.foo.org/owl/queries/ResultQuery.owl) (objects (www.foo.org/owl/queries/ResultQuery.owl#fof_problem OWL-TptpFOFProblem) (www.foo.org/owl/queries/ResultQuery.owl#atp_result OWL-FoAtpResult) ) (state (and (bound www.foo.org/owl/queries/ResultQuery.owl#fof_problem) (bound dummy_time_resource0) (bound dummy_time_resource1) (bound dummy_time_resource2) (bound dummy_time_resource3) (bound dummy_time_resource4) )) (goal (and (bound www.foo.org/owl/queries/ResultQuery.owl#atp_result)

(resultFor www.foo.org/owl/queries/ResultQuery.owl#atp_result www.foo.org/owl/queries/ResultQuery.owl#fof_problem)))))

F.2 A Stochastic Situation Calculus Domain

To compute an optimal policy for a Golog procedure, a stochastic situation calculus action domain is generated from the service profiles of all primitive situation calculus actions (services) occurring in that procedure. The action domain is generated as described in Section 6.5 and Appendix E.2. In what follows, we show some of the Prolog clauses describing the action domain created for the query ResultQuery. We show only fragments of the domain because the complete domain fills approximately 20 pages. The successor state axioms for the fluents status, resultFor and problemClass are encoded as several Prolog clauses. For the sake of readability, we replaced XML namespaces in URI references by the abbreviations used in previous chapters.

agentAction(’TPTPAnalyserService.owl’(TPTP_PROBLEM, PROB_CLASS)).
poss(’TPTPAnalyserService.owl’(TPTP_PROBLEM, PROB_CLASS), S) :- true.
nondetActions(’TPTPAnalyserService.owl’(TPTP_PROBLEM, PROB_CLASS), S,
  [’TPTPAnalyserService.owl0’(TPTP_PROBLEM, ’mw#CNF_SAT_RFO_EQU_NUE’),
  ’TPTPAnalyserService.owl1’(TPTP_PROBLEM, ’mw#FOF_NKC_EPR’),
  ’TPTPAnalyserService.owl2’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_NEQ_NHN’),
  ’TPTPAnalyserService.owl3’(TPTP_PROBLEM, ’mw#FOF_SAT_RFO’),
  ’TPTPAnalyserService.owl4’(TPTP_PROBLEM, ’mw#FOF_NKC_RFO_EQU’),
  ’TPTPAnalyserService.owl5’(TPTP_PROBLEM, ’mw#FOF_SAT_EPR’),
  ’TPTPAnalyserService.owl6’(TPTP_PROBLEM, ’mw#FOF_NKS_NUN_RFO_EQU’),
  ’TPTPAnalyserService.owl7’(TPTP_PROBLEM, ’mw#FOF_NKS_NUN_EPR’),
  ’TPTPAnalyserService.owl8’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_PEQ_UEQ’),
  ’TPTPAnalyserService.owl9’(TPTP_PROBLEM, ’mw#CNF_SAT_RFO_PEQ_UEQ’),
  ’TPTPAnalyserService.owl10’(TPTP_PROBLEM, ’mw#CNF_SAT_EPR’),
  ’TPTPAnalyserService.owl11’(TPTP_PROBLEM, ’mw#FOF_CSA_RFO’),
  ’TPTPAnalyserService.owl12’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_SEQ_HRN’),
  ’TPTPAnalyserService.owl13’(TPTP_PROBLEM, ’mw#CNF_NKS_EPR’),
  ’TPTPAnalyserService.owl14’(TPTP_PROBLEM, ’mw#FOF_CSA_EPR’),
’TPTPAnalyserService.owl15’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_NEQ_HRN’), ’TPTPAnalyserService.owl16’(TPTP_PROBLEM, ’mw#FOF_NKC_RFO_NEQ’), ’TPTPAnalyserService.owl17’(TPTP_PROBLEM, ’mw#FOF_NKS_NUN_RFO_NEQ’), ’TPTPAnalyserService.owl18’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_SEQ_NHN’), ’TPTPAnalyserService.owl19’(TPTP_PROBLEM, ’mw#CNF_SAT_RFO_NEQ’), ’TPTPAnalyserService.owl20’(TPTP_PROBLEM, ’mw#CNF_NKS_RFO_PEQ_NUE’) ]). senseCondition(’TPTPAnalyserService.owl’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl0’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl1’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl2’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl3’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl4’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl5’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl6’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl7’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl8’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl9’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl10’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). 
senseCondition(’TPTPAnalyserService.owl11’(TPTP_PROBLEM, PROB_CLASS), Phi) :-
  Phi = problemClass(TPTP_PROBLEM, PROB_CLASS).
senseCondition(’TPTPAnalyserService.owl12’(TPTP_PROBLEM, PROB_CLASS), Phi) :-
  Phi = problemClass(TPTP_PROBLEM, PROB_CLASS).
senseCondition(’TPTPAnalyserService.owl13’(TPTP_PROBLEM, PROB_CLASS), Phi) :-
  Phi = problemClass(TPTP_PROBLEM, PROB_CLASS).
senseCondition(’TPTPAnalyserService.owl14’(TPTP_PROBLEM, PROB_CLASS), Phi) :-
  Phi = problemClass(TPTP_PROBLEM, PROB_CLASS).
senseCondition(’TPTPAnalyserService.owl15’(TPTP_PROBLEM, PROB_CLASS), Phi) :-
  Phi = problemClass(TPTP_PROBLEM, PROB_CLASS).

senseCondition(’TPTPAnalyserService.owl16’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl17’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl18’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl19’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). senseCondition(’TPTPAnalyserService.owl20’(TPTP_PROBLEM, PROB_CLASS), Phi) :- Phi = problemClass(TPTP_PROBLEM, PROB_CLASS). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl0’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_SAT_RFO_EQU_NUE’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl1’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKC_EPR’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl2’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_NEQ_NHN’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl3’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_SAT_RFO’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl4’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKC_RFO_EQU’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl5’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_SAT_EPR’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl6’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKS_NUN_RFO_EQU’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl7’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKS_NUN_EPR’). problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl8’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_PEQ_UEQ’). 
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl9’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_SAT_RFO_PEQ_UEQ’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl10’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_SAT_EPR’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl11’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_CSA_RFO’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl12’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_SEQ_HRN’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl13’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_EPR’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl14’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_CSA_EPR’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl15’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_NEQ_HRN’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl16’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKC_RFO_NEQ’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl17’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#FOF_NKS_NUN_RFO_NEQ’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl18’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_SEQ_NHN’).
problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
  A = ’TPTPAnalyserService.owl19’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_SAT_RFO_NEQ’).

problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :- A = ’TPTPAnalyserService.owl20’(TPTP_PROBLEM, PROB_CLASS), (PROB_CLASS = ’mw#CNF_NKS_RFO_PEQ_NUE’). prob(’TPTPAnalyserService.owl0’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl1’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl2’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl3’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl4’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl5’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl6’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl7’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl8’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl9’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl10’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl11’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl12’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl13’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl14’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl15’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl16’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). prob(’TPTPAnalyserService.owl17’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ). 
prob(’TPTPAnalyserService.owl18’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ).
prob(’TPTPAnalyserService.owl19’(TPTP_PROBLEM, PROB_CLASS), Pr, S) :- ( Pr is 0.047619047619047616 ).
prob(’TPTPAnalyserService.owl20’(TPTP_PROBLEM, PROB_CLASS), RestPr, S) :-
  prob(’TPTPAnalyserService.owl0’(TPTP_PROBLEM, PROB_CLASS), Pr0, S),
  prob(’TPTPAnalyserService.owl1’(TPTP_PROBLEM, PROB_CLASS), Pr1, S),
  prob(’TPTPAnalyserService.owl2’(TPTP_PROBLEM, PROB_CLASS), Pr2, S),
  prob(’TPTPAnalyserService.owl3’(TPTP_PROBLEM, PROB_CLASS), Pr3, S),

prob(’TPTPAnalyserService.owl4’(TPTP_PROBLEM, PROB_CLASS), Pr4, S), prob(’TPTPAnalyserService.owl5’(TPTP_PROBLEM, PROB_CLASS), Pr5, S), prob(’TPTPAnalyserService.owl6’(TPTP_PROBLEM, PROB_CLASS), Pr6, S), prob(’TPTPAnalyserService.owl7’(TPTP_PROBLEM, PROB_CLASS), Pr7, S), prob(’TPTPAnalyserService.owl8’(TPTP_PROBLEM, PROB_CLASS), Pr8, S), prob(’TPTPAnalyserService.owl9’(TPTP_PROBLEM, PROB_CLASS), Pr9, S), prob(’TPTPAnalyserService.owl10’(TPTP_PROBLEM, PROB_CLASS), Pr10, S), prob(’TPTPAnalyserService.owl11’(TPTP_PROBLEM, PROB_CLASS), Pr11, S), prob(’TPTPAnalyserService.owl12’(TPTP_PROBLEM, PROB_CLASS), Pr12, S), prob(’TPTPAnalyserService.owl13’(TPTP_PROBLEM, PROB_CLASS), Pr13, S), prob(’TPTPAnalyserService.owl14’(TPTP_PROBLEM, PROB_CLASS), Pr14, S), prob(’TPTPAnalyserService.owl15’(TPTP_PROBLEM, PROB_CLASS), Pr15, S), prob(’TPTPAnalyserService.owl16’(TPTP_PROBLEM, PROB_CLASS), Pr16, S), prob(’TPTPAnalyserService.owl17’(TPTP_PROBLEM, PROB_CLASS), Pr17, S), prob(’TPTPAnalyserService.owl18’(TPTP_PROBLEM, PROB_CLASS), Pr18, S), prob(’TPTPAnalyserService.owl19’(TPTP_PROBLEM, PROB_CLASS), Pr19, S), (RestPr is 1.0 - (Pr0 + Pr1 + Pr2 + Pr3 + Pr4 + Pr5 + Pr6 + Pr7 + Pr8 + Pr9 + Pr10 + Pr11 + Pr12 + Pr13 + Pr14 + Pr15 + Pr16 + Pr17 + Pr18 + Pr19 + 0.0)). cost(’TPTPAnalyserService.owl’(TPTP_PROBLEM, PROB_CLASS), Cost, S) :- (Cost is 3000). myReward(’TPTPAnalyserService.owl0’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl1’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl2’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl3’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl4’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl5’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl6’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl7’(TPTP_PROBLEM, PROB_CLASS), 0). myReward(’TPTPAnalyserService.owl8’(TPTP_PROBLEM, PROB_CLASS), 0). 
myReward('TPTPAnalyserService.owl9'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl10'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl11'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl12'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl13'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl14'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl15'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl16'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl17'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl18'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl19'(TPTP_PROBLEM, PROB_CLASS), 0).
myReward('TPTPAnalyserService.owl20'(TPTP_PROBLEM, PROB_CLASS), 0).

agentAction('VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT)).

poss('VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), S) :- true.

nondetActions('VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), S,
    ['VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT),
     'VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT),
     'VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT),
     'VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT),
     'VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT)]).

senseCondition('VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = resultFor(ATP_RESULT, TPTP_PROBLEM).
senseCondition('VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = status(ATP_RESULT, 'stat#Unsatisfiable').
senseCondition('VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = status(ATP_RESULT, 'stat#Theorem').
senseCondition('VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = status(ATP_RESULT, 'stat#CounterSatisfiable').
senseCondition('VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = status(ATP_RESULT, 'stat#Satisfiable').
senseCondition('VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Phi) :-
    Phi = status(ATP_RESULT, 'stat#Unknown').

resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).

prob('VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr, S) :-
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_EPR'), S), (Pr is 0.9791667) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_NHN'), S), (Pr is 0.74585634) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_HRN'), S), (Pr is 0.93722945) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_HRN'), S), (Pr is 0.9335038) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_NHN'), S), (Pr is 0.5187225) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_NUE'), S), (Pr is 0.7793765) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_UEQ'), S), (Pr is 0.80522877) ;
    (Pr is 0.0).

prob('VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr, S) :-
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_EPR'), S), (Pr is 0.9265873) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_RFO_EQU'), S), (Pr is 0.7091795) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_RFO_NEQ'), S), (Pr is 1.0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_EPR'), S), (Pr is 0.61206895) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_RFO_NEQ'), S), (Pr is 0.5) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_RFO_EQU'), S), (Pr is 0.0010) ;
    (Pr is 0.0).
prob('VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr, S) :-
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_CSA_EPR'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_CSA_RFO'), S), (Pr is 0.0010) ;
    (Pr is 0.0).

prob('VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr, S) :-
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_SAT_EPR'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_SAT_RFO'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_EPR'), S), (Pr is 0.479638) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_EPR'), S), (Pr is 0.479638) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_NEQ'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_NHN'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_EQU_NUE'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_EQU_NUE'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_NHN'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_NUE'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_PEQ_UEQ'), S), (Pr is 0.0010) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_UEQ'), S), (Pr is 0.0010) ;
    (Pr is 0.0).

prob('VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), RestPr, S) :-
    prob('VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr0, S),
    prob('VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr1, S),
    prob('VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr2, S),
    prob('VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Pr3, S),
    (RestPr is 1.0 - (Pr0 + Pr1 + Pr2 + Pr3 + 0.0)).
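As a worked illustration of the residual-probability clause for 'VampireService.owl4' above, the following sketch computes RestPr for one problem class using only the values in the listing. It assumes that the problem has been classified as mw#CNF_NKS_RFO_SEQ_NHN and that each prob/3 goal is resolved with its first solution, that is, the first disjunct whose holds/2 test succeeds. The labels in parentheses are the statuses sensed by the corresponding nature's choices.

```latex
\begin{align*}
Pr_0 &= 0.5187225 && \text{(owl0, Unsatisfiable)}\\
Pr_1 &= 0.0       && \text{(owl1, Theorem: no matching class, default branch)}\\
Pr_2 &= 0.0       && \text{(owl2, CounterSatisfiable: no matching class, default branch)}\\
Pr_3 &= 0.0010    && \text{(owl3, Satisfiable)}\\
RestPr &= 1.0 - (Pr_0 + Pr_1 + Pr_2 + Pr_3 + 0.0)\\
       &= 1.0 - 0.5197225 = 0.4802775 && \text{(owl4, Unknown)}
\end{align*}
```

Since ',' binds more tightly than ';' in Prolog, the closing disjunct (Pr is 0.0) of each prob/3 clause remains reachable on backtracking even when an earlier disjunct has matched, so the figures above hold for the first solution only.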
cost('VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), Cost, S) :-
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_EPR'), S), (Cost is 2316) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_NHN'), S), (Cost is 10454) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_HRN'), S), (Cost is 24174) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_HRN'), S), (Cost is 12522) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_NHN'), S), (Cost is 20984) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_NUE'), S), (Cost is 17850) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_UEQ'), S), (Cost is 14016) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_EPR'), S), (Cost is 4755) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_RFO_EQU'), S), (Cost is 30794) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKC_RFO_NEQ'), S), (Cost is 666) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_EPR'), S), (Cost is 4073) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_RFO_NEQ'), S), (Cost is 300000) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_NKS_NUN_RFO_EQU'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_CSA_EPR'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_CSA_RFO'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_SAT_EPR'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#FOF_SAT_RFO'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_EPR'), S), (Cost is 2035) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_EPR'), S), (Cost is 2035) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_NEQ'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_NEQ_NHN'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_EQU_NUE'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_EQU_NUE'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_SEQ_NHN'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_NUE'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_SAT_RFO_PEQ_UEQ'), S), (Cost is 0) ;
    holds(problemClass(TPTP_PROBLEM, 'mw#CNF_NKS_RFO_PEQ_UEQ'), S), (Cost is 0) ;
    (Cost is 0).

myReward('VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 0).

status(ATP_RESULT, 'stat#Unsatisfiable', do(A,S)) :-
    A = 'VampireService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Unsatisfiable', S).
status(ATP_RESULT, 'stat#Theorem', do(A,S)) :-
    A = 'VampireService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Theorem', S).
status(ATP_RESULT, 'stat#CounterSatisfiable', do(A,S)) :-
    A = 'VampireService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#CounterSatisfiable', S).
status(ATP_RESULT, 'stat#Satisfiable', do(A,S)) :-
    A = 'VampireService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Satisfiable', S).
status(ATP_RESULT, 'stat#Unknown', do(A,S)) :-
    A = 'VampireService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Unknown', S).

problemClass(TPTP_PROBLEM, PROB_CLASS, do(A, S)) :-
    A = 'TPTPAnalyserService.owl'(TPTP_PROBLEM, PROB_CLASS).

resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'OtterService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'ParadoxService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'SpassService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'VampireService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'EService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'WaldmeisterService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).
resultFor(ATP_RESULT, TPTP_PROBLEM, do(A, S)) :-
    A = 'DctpService.owl'(TPTP_PROBLEM, TIME_RES, ATP_RESULT).

restoreSitArg(resultFor(X, Y), S, resultFor(X, Y, S)).

restoreSitArg(status(X, Y), S, status(X, Y, S)).
restoreSitArg(problemClass(X, Y), S, problemClass(X, Y, S)).

myReward('DctpService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('DctpService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('DctpService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('DctpService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 300000).
myReward('DctpService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT), 0).

status(ATP_RESULT, 'stat#Unsatisfiable', do(A,S)) :-
    A = 'DctpService.owl0'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Unsatisfiable', S).
status(ATP_RESULT, 'stat#Theorem', do(A,S)) :-
    A = 'DctpService.owl1'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Theorem', S).
status(ATP_RESULT, 'stat#CounterSatisfiable', do(A,S)) :-
    A = 'DctpService.owl2'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#CounterSatisfiable', S).
status(ATP_RESULT, 'stat#Satisfiable', do(A,S)) :-
    A = 'DctpService.owl3'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Satisfiable', S).
status(ATP_RESULT, 'stat#Unknown', do(A,S)) :-
    A = 'DctpService.owl4'(TPTP_PROBLEM, TIME_RES, ATP_RESULT) ;
    status(ATP_RESULT, 'stat#Unknown', S).

resultFor(X, Y, do(A, S)) :- resultFor(X, Y, S).
status(X, Y, do(A, S)) :- status(X, Y, S).
problemClass(X, Y, do(A, S)) :- problemClass(X, Y, S).

myReward(A, 0).

List of Figures

1.1 The Semantic Web stack (adapted from [Kifer et al., 2005]) ...... 4
1.2 A Mathematical Semantic Web ...... 6

2.1 The MathWeb Software Bus ...... 22
2.2 An abstract decision tree ...... 41

3.1 The Semantic Web stack (adapted from [Kifer et al., 2005]) ...... 44
3.2 Example RDF graph ...... 46
3.3 Parts of an OWL-S service description ...... 57
3.4 Subsumption hierarchy of the animals ontology ...... 66
3.5 The atomic process of the CurrencyConverter service ...... 71

4.1 Top level classes of the MathServe domain ontology ...... 80
4.2 Protégé with MathServe’s domain ontology ...... 81
4.3 Different categories of MathServe services ...... 82
4.4 Hierarchy of deductive statuses for first-order ATP problems ...... 88
4.5 Hierarchy of unsolved statuses for first-order ATP systems ...... 89
4.6 Scatter graph comparing the success of E and SPASS ...... 93
4.7 Scatter graph comparing the time used by E and SPASS ...... 94

5.1 An example query profile ...... 106
5.2 A simple composite service in Golog ...... 108
5.3 The MathServe Broker ...... 109
5.4 Matching algorithm used by the service matchmaker ...... 112

6.1 Service composition in MathServe ...... 124
6.2 Example PRODIGY operators ...... 126
6.3 Ontology classes as PRODIGY types ...... 128
6.4 The profile TrampNDforFOF and its PRODIGY operator ...... 130
6.5 A query profile and its PRODIGY problem ...... 132
6.6 Four plans found by PRODIGY for query ndQuery1 ...... 133
6.7 Service profile of TrampCNF ...... 135
6.8 A service profile with probabilistic effects ...... 135
6.9 The query profile resultQuery ...... 139

8.1 An Ωmega tactic ...... 165
8.2 (a) Number of problems solved and (b) times used by LeoStdATP and LeoStdEnhanced for the problems in Table 8.1. Failed proving attempts are indicated with infinite time ...... 167
8.3 (a) Number of problems solved and (b) times used by LeoEirATP and LeoEirEnhanced for the problems in Table 8.1. Failed proving attempts are indicated with infinite time ...... 168
8.4 A query profile containing a higher-order theorem proving problem ...... 169

10.1 Modules of the haRVey decision procedure ...... 179

A.1 Top classes of the MathServe domain ontology ...... 217

B.1 Status hierarchy for the output of ATP systems ...... 219

List of Tables

3.1 OWL-DL descriptions ...... 51
3.2 OWL-DL axioms and facts ...... 52

4.1 Performance of E and SPASS ...... 92
4.2 Performance of MathSAT and Yices ...... 100
4.3 MathSAT and Yices on unsatisfiable problems ...... 100

5.1 Failures of Web Service invocation ...... 115

6.1 Comparison of service composition approaches ...... 123

7.1 TPTP Library problems solved by ATP systems ...... 149
7.2 Distribution of CASC-20 problems over seven SPCs ...... 150
7.3 Comparison of MathServe with ATP systems ...... 151
7.4 CASC problems not solved by MathServe due to their size ...... 152
7.5 SPCs of CASC-20 problems ...... 153
7.6 ATP systems on TPTP Library 3.1.0 ...... 155
7.7 Distribution of CASC-J3 problems over seven SPCs ...... 156
7.8 Comparison of MathServe with leading ATP systems ...... 156

8.1 Higher-order set examples ...... 161
8.2 Definitions of higher-order concepts ...... 163
8.3 Experimental data for the problems shown in Table 8.1 ...... 170

C.1 Performance of first-order ATPs ...... 226

List of Acronyms and Symbols

API: Application Programming Interface.
ATP: Automated Theorem Proving.
BPEL4WS: Business Process Execution Language for Web Services.
CASC: The CADE ATP System Competition.
CNF: Clause Normal Form.
DL: Description Logic. A formal language for knowledge representation.
DPLL: The Davis-Putnam-Logemann-Loveland procedure. A procedure for solving the propositional satisfiability problem.
ebXML: Electronic Business XML. An XML language to describe business processes.
FOF: First-Order Formulae. A format of problems in the TPTP Library. Another format is CNF (Clause Normal Form).
Golog: AlGol in logic. A high-level programming language based on the situation calculus.
HTN: Hierarchical Task Networks.
HTTP: The Hypertext Transfer Protocol. The central protocol of the World Wide Web.
MDP: Markov Decision Process. A model for decision-theoretic planning.
OWL-DL: The Description Logic fragment of the Web Ontology Language OWL.
OWL-S: An OWL ontology for describing the semantics of Web Services.
OWL: The Web Ontology Language.
POMDP: Partially Observable Markov Decision Process.
RDF: The Resource Description Framework.
RPC: Remote Procedure Call. The invocation of a procedure via a network.
SAT: The problem of determining the satisfiability of a formula in propositional logic.
SMT: Satisfiability Modulo Theories. The problem of deciding the satisfiability of first-order formulae with respect to some decidable background theory.
SOAP: The Simple Object Access Protocol. An Internet protocol for performing remote procedure calls.
SWRL: The Semantic Web Rule Language. An extension of OWL for describing rules.
TPTP: Thousands of Problems for Theorem Provers. The name of a library of first-order theorem proving problems and the name of the syntax of the problems in that library.
UDDI: Universal Description, Discovery and Integration. A specification of Web Service registries based on WSDL documents.
URI: A Uniform Resource Identifier. A unique ID based on an XML namespace.
W3C: The World Wide Web Consortium. A consortium which produces recommendations for Web languages and protocols, such as OWL, WSDL and SOAP.
WSDL: The Web Services Description Language.
WSMO: The Web Service Modelling Ontology.
WWW: The World Wide Web.
XML: The Extensible Markup Language.

$A^{SWRL}_{O}$: The set of SWRL atoms over the ontology O.
$CH_p$: The set of nature’s choices associated with the disjunctive effects of the service profile p.
$CH_p^{ex}$: The set of nature’s choices with at least one explicit probability statement in p.
$C_O$: The simple, named classes of the ontology O.
$D_S^q$: The situation calculus action domain representing the service profiles in S and the query profile q.
$F_p^+$: Positive situation calculus atoms associated with the positive effect literals of the service profile p. $F_S^+$ for sets of profiles.
$F_p^-$: Negative situation calculus atoms associated with the negative effect literals of the service profile p. $F_S^-$ for sets of profiles.
$F_p^\vee$: The situation calculus atoms associated with the literals in disjunctive effects of the service profile p. $F_S^\vee$ for sets of profiles.
$F_p$: Situation calculus fluents associated with the OWL-S service profile p. $F_S$ for sets of profiles.
$LP_S$: Set of all valid PRODIGY plans for the set S of OWL-S service profiles.
$R_O$: The set of properties defined in the ontology O.
$V_D$: The set of all data variables.
$V_I$: The set of all OWL individual variables.
$\Psi_p^{\zeta_F}$: The normal form positive effect axiom for the literal $\zeta_F$ and the service profile p.
$\sigma_{ch}$: Function generating situation calculus axioms specifying nature’s choices.
$\sigma_{cost}$: Function generating situation calculus axioms specifying the costs of nature’s choices.
$\sigma_{pre}$: Function for translating the preconditions of an OWL-S service profile into situation calculus precondition axioms.
$\sigma_{prob}$: Function generating situation calculus axioms specifying the probabilities of nature’s choices.
$\sigma_{proc}$: Function generating Golog procedures from query profiles and sets of PRODIGY plans.
$\sigma_{sc}$: Function generating situation calculus axioms specifying the conditions sensed by nature’s choices.
$\sigma_{sit}$: Function for translating SWRL formulae into situation calculus formulae by adding a situation argument.
$\sigma_{ssa}$: Function generating successor state axioms from a set of OWL-S service profiles.
$\tau_{cld}$: Function for translating OWL class definitions into PRODIGY type declarations.
$\tau_{compl}$: Function for translating OWL-S service profiles and a query profile into a PRODIGY planning domain and a planning problem, respectively.
$\tau_{qp}$: Function for translating OWL-S query profiles into PRODIGY planning problems.
$\tau_{sp}$: Function for translating OWL-S service profiles into PRODIGY operator descriptions.
$\tau_{swrl}$: Function for translating SWRL formulae into PRODIGY conditions.
$\Theta_p^{\eta_F}$: The normal form negative effect axiom for the literal $\eta_F$ and the service profile p.
$act_p$: Function mapping atoms in $F_p^+$ and $F_p^\vee$ to deterministic agent terms.
$I_O$: The set of individuals of the ontology O.
$lit_p$: Function mapping nature’s choices to atoms in $F_p^\vee$.
$Pre_{max}^{P}$: The longest common prefix of all PRODIGY plans in P.
$S_p^u$: The set of explicit probability statements in the profile p for nature’s choice u.
$OWLSP_O$: The set of all OWL-S profiles based on the ontology O, as defined in Definition 3.23.
$OWLSQP_O$: The set of all OWL-S query profiles based on the ontology O, as defined in Definition 5.1.
$OWLDL$: The set of all OWL-DL ontologies.
$CPE_O$: The set of all conditional probabilistic effects based on the ontology O.
$RL_O$: The set of SWRL rules over O.
$SWRL_O^{\wedge\neg}$: The set of conjunctions of (possibly negated) SWRL atoms based on the ontology O.
$SWRL_O^{\wedge\vee}$: The set of conjunctions and disjunctions of positive SWRL atoms based on the ontology O.

Index

action
  effect axiom, 25
  precondition axiom, 26
  theory
    basic, 25
    stochastic, 38
assumption
  causal completeness, 28
  closed world, 32
ATP
  services, 86
  status hierarchy, 87, 219
  system, 15
    DCTP, 223
    E, 223
    Otter, 224
    Paradox, 224
    result, 89
    SPASS, 224
    Vampire, 224
    Waldmeister, 225
axiom
  action effect, 25
  action precondition, 26
  frame, 24
  successor state, 25, 27

basic action theory, 26

CASC, 146
classes
  simple named, 64
clause normal form, 15
  compact, 83
  generators, 83
closed world assumption, 32

DCTP, 223
decision procedure, 17
distributed automated reasoning, 20
domain ontology, 79
DTGolog
  action domain, 134
  optimal policy, 41, 139
  policy, 40
  policy procedure, 40

E, 223
ebXML, 56
effect
  conditional probabilistic, 72
  disjunctive, 73
  probabilistic, 72
  static, 107
entailment of SWRL formulae, 67
extensionality, 160

finite model generator, 18
  Mace, 18
  SEM, 18
FLOTTER, 83
foundational axioms of the situation calculus, 24
frame
  axiom, 24
  problem, 26

Golog
  language, 29
  off-line interpreter, 31
  on-line interpreter, 31
  procedure, 30, 138

higher-order
  ATP
    LEO, 159
    TPS, 159
  set examples, 160

individuals of an ontology, 65

lambda calculus, 160
LEO
  services, 162
Logic Broker Architecture, 21

Markov Decision Process
  fully observable, 35
  partially observable, 37
MathBroker, 23, 174
Mathematical Semantic Web, 5
MathServe
  ATP interface, 115
  Broker, 108, 147
  composite services, 105, 107
  Golog interpreter, 113
  ontology reasoner, 112
  queries, 105
  sensing action, 120
  service composer, 124
  service matchmaker, 110
MathWeb Software Bus, 22
MONET, 23, 174

normal form
  clause, 15
  negative effect axioms, 28, 242
  positive effect axioms, 28, 242

OMEGA-Ants, 21
ontology, 64
  individuals, 65
optimal policy
  DTGolog, 41
  MDP, 36
Otter, 224
OWL, 44
OWL-DL, 49
  individuals, 65
  ontology, 64
  properties, 66
OWL-Full, 50
OWL-Lite, 49
OWL-S
  atomic process, 70
  composite process, 58
  process parameters, 70
  profile, 73
  query profile, 105
  service description, 57
  service grounding, 59
  service model, 57
  service profile, 59
owl#Thing, 65

Paradox, 224
policy, 36
  DTGolog, 40
  Markov Decision Process, 36
  optimal, 36
policy procedure, 40
precondition
  dynamic, 74
  static, 74
problem
  analysis, 97
  frame, 26
  qualification, 28
  ramification, 29
problem transformation services, 82
process
  atomic, 70
  parameters, 70
Prodigy system, 125
proof transformation
  service
    Tramp, 103
  system, 19
    Otterfier, 102
    Tramp, 84
Prosper project, 20

qualification problem, 28

ramification problem, 29

SAT solver, 17
Semantic Web, 43
  service, 53
    composition, 61
sensing action
  in classical planning, 33
  in MathServe, 120
service
  category, 73
  effect, 70
  parameter, 73
  precondition, 70
simple named classes, 64
situation calculus, 24
  basic action theory, 25
  stochastic action theory, 38
SOAP, 55
SPASS, 224
successor state axiom, 25, 27
SWRL
  atoms, 66
  formulae, 66
  rules, 69

taxonomy of reasoning systems, 81
tptp2X, 83
Tramp, 84

UDDI, 55

Vampire, 224

Waldmeister, 225
WSDL, 54
WSMO, 60

XML
  entities, 45
  namespace, 45