Efficient Epistemic Updates in Rank-Based Belief Networks
A dissertation submitted for the academic degree of Doctor of Philosophy (Dr. phil.)
at the
Faculty of Humanities, Department of Philosophy
submitted by Stefan Alexander Hohenadel
Date of oral examination: 4 September 2012. First referee: Prof. Dr. Wolfgang Spohn. Second referee: PD Dr. Sven Kosub.
Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-250406
Efficient Epistemic Updates in Rank-Based Belief Networks
Stefan Alexander Hohenadel
November 9, 2013
A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy.
Department of Philosophy
Faculty of Humanities
University of Konstanz

This document was compiled on November 9, 2013 from revision 341 without changes.

Abstract – This thesis presents an approach to an efficient update algorithm for rank-based belief networks. The update is computed from two inputs: the current doxastic state, represented by the network, and a piece of doxastic evidence, represented as a change on a subset of the variables in the network. From these inputs, an update strategy in the style of Lauritzen and Spiegelhalter can compute the updated posterior doxastic state of the network, which reflects the combination of the evidence with the prior state. This strategy is well known for Bayesian networks; the thesis transfers it to networks whose semantics is specified by epistemic ranking functions instead of probability measures. As a foundation, the construction of rank-based belief networks, i.e. graphical models for ranking functions, is discussed. It is shown that the global, local, and pairwise Markov properties are equivalent in rank-based belief networks and, furthermore, that the Hammersley-Clifford theorem holds for such ranking networks: given the equivalence of the Markov properties, a potential representation of the underlying ranking function can be derived from the network structure. It is then shown how this property allows the update strategy of the Lauritzen-Spiegelhalter algorithm to be transferred to ranking networks. To this end, solutions to the two main problems are demonstrated: first, the triangulation of the moralized input network and the decomposition of this triangulation into a clique tree; second, message passing on this clique tree to incorporate the evidence. The entire approach is in effect a technical description of belief revision.
Contents
Preface ...... 9
I Belief, Belief States, and Belief Change ...... 15
1.1 Introduction · 15 – 1.1.1 Remark on Sources and Citation · 16 – 1.2 A Normative Perspective on Epistemology · 16 – 1.2.1 Descriptive and Normative Perspective · 16 – 1.2.2 The Philosopher and the Engineer · 17 – 1.2.3 Belief Revision and the Problem of Induction · 23 – 1.3 Elements of a Belief Theory · 26 – 1.4 Propositions as Epistemic Units · 28 – 1.4.1 The Concept of Proposition · 28 – 1.4.2 Propositions Form Algebras · 29 – 1.4.3 Atoms and Atomic Algebras · 32 – 1.4.4 Beliefs, Contents, and Concepts · 33 – 1.5 Epistemic States and Rationality · 37 – 1.5.1 Rationality Postulates · 37 – 1.5.2 Rational Belief Sets · 40 – 1.5.3 Belief Cores · 41 – 1.6 Transitions Between Epistemic States · 42 – 1.6.1 Description of Epistemic Updates · 42 – 1.6.2 Transition by Consistent Evidence · 43 – 1.6.3 The Inconsistent Case · 44 – 1.6.4 The Transition Function · 45
II Ranking Functions and Rank-based Conditional Independence ...... 47
2.1 Introduction · 47 – 2.2 Ranking Functions · 48 – 2.2.1 Ranking Functions on Possibilities · 48 – 2.2.2 Negative Ranking Functions · 49 – 2.2.3 Minimitivity, Completeness, and Naturalness · 52 – 2.2.4 Two-sided Ranking Functions · 56 – 2.2.5 Conditional Negative Ranks · 57 – 2.2.6 Conditional Two-sided Ranks · 62 – 2.2.7 A Digression on Positive Ranking Functions · 65 – 2.3 Conditionalization and Revision of Ranking Functions · 68 – 2.3.1 Plain Conditionalization · 68 – 2.3.2 Spohn-conditionalization · 70 – 2.3.3 Shenoy-conditionalization · 73 – 2.4 Rank-based Conditional Independence · 74
III Graphical Models for Ranking Functions ...... 81
3.1 Introduction · 81 – 3.1.1 Content of this Chapter · 81 – 3.1.2 Historical Remarks · 82 – 3.1.3 Measurability and Variables · 87 – 3.1.4 Algebras Over Variable Sets · 90 – 3.1.5 Graph-theoretic Preliminaries · 91 – 3.2 Graphoids and Conditional Independence Among Variables · 100 – 3.2.1 Conditional Independence Among Variables · 100 – 3.2.2 RCI is a Graphoid · 101 – 3.2.3 Agenda · 103 – 3.3 Ranking Functions and Their Markov Graphs · 104 – 3.3.1 D-Maps, I-Maps, and Markov Properties · 104 – 3.3.2 Markov Blankets and Markov Boundaries · 106 – 3.4 Ranking Functions and Given Markov Graphs · 109 – 3.4.1 Potential Representation of Negative Ranking Functions · 109 – 3.4.2 Representations of Negative Ranking Functions by Markov Graphs · 111 – 3.5 RCI and Undirected Graphs · 118 – 3.6 Ranking Networks · 119 – 3.6.1 DAGs as Graphical Models · 119 – 3.6.2 Strict Linear Orderings on Variables · 121 – 3.6.3 Separation in Directed Graphs · 122 – 3.6.4 Directed Markov Properties · 123 – 3.6.5 Factorization in Directed Graphs · 128 – 3.6.6 Potential Representation of Ranking Networks · 130 – 3.7 Perfect Maps of Ranking Functions · 131 – 3.7.1 Ranking Functions and DAGs · 131 – 3.7.2 Characterization of CLs that have Perfect Maps · 133 – 3.7.3 Outlook: Can CLs Be Made DAG-isomorphic? · 136
IV Belief Propagation in Ranking Networks ...... 137
4.1 Introduction · 137 – 4.2 Hunter's Algorithm for Polytrees · 140 – 4.3 The LS-Strategy on Ranking Networks: An Outline · 142 – 4.4 Phase 1 – Triangulation and Decomposition of the Network · 146 – 4.4.1 Methods for Obtaining the Clique Tree from the Initial Network · 146 – 4.4.2 Triangulating Graphs: A Brief Survey · 149 – 4.4.3 Desired Criteria for Triangulations of Graphs · 153 – 4.4.4 Generating the Elimination Ordering · 155 – 4.4.5 The MCS-M Algorithm · 158 – 4.4.6 Determining the Set of Cliques of the Fill-In-Graph · 159 – 4.4.7 Inline Recognition of Cliques · 161 – 4.4.8 Inline Tree Construction · 166 – 4.4.9 An Algorithm for Decomposing a Moralized Ranking Network · 171 – 4.4.10 A Digression on Triangulatedness and the Epistemic Domain · 174 – 4.5 Phase 2 – Message Passing on the Clique Tree · 176 – 4.5.1 Local Computation on Cliques · 176 – 4.5.2 Locally Available Information · 182 – 4.5.3 Pre-Initializing the Permanent Belief Base · 183 – 4.5.4 Bottom-Up Propagation: Conditional Ranks of the Cliques · 184 – 4.5.5 Top-Down Propagation: Joint Ranks of the Separators · 186 – 4.5.6 Processing Update Information · 188 – 4.5.7 Queries on the Clique Tree · 188 – 4.6 Conclusion · 189 – 4.6.1 Achievements · 189 – 4.6.2 Remarks on Aspects Not Discussed · 191 – 4.7 Outlook: Learning Ranking Networks and Induction to the Unknown · 191
A A Computed Example for Decomposition ...... 193
B A Computed Example for Updating ...... 199
2.1 The Ranking Network · 199 – 2.2 Initialization Phase · 200 – 2.2.1 Going Bottom-Up: Computing the Conditional Ranks · 203 – 2.2.2 Going Top-Down: Joint Ranks of the Cliques · 205 – 2.3 Update By New Evidence · 206
Acknowledgements ...... 209
Index of Definitions ...... 211
Index of Symbols ...... 215
Index of Algorithms ...... 219
Definitions and Theorems from “The Laws of Belief”...... 221
Literature ...... 225
Index ...... 251
Preface
This thesis introduces an efficient algorithm for iterated belief change. It is based on the concept of belief as modelled by ranking theory, developed by Wolfgang Spohn since 1983 in a series of papers and recently presented comprehensively in his (2012). The algorithm gives a concrete description of how a prior epistemic state is changed into a posterior epistemic state by available evidence, where epistemic states are implemented by Spohnian ranking functions over a set of variables. Since ranking functions are the foundation of the formal modeling of epistemic states, we will speak of "rank-based" belief states and of an algorithm for "epistemic updates". The research for this thesis was inspired by the philosophical discussion of belief revision. However, the specific argumentation of this thesis is characterized by a high level of technical concreteness; it uses ranking theory not as a philosophical position but as a mathematical framework for constructing data structures. The thesis is interdisciplinary in the sense that it approaches epistemological problems with devices common in computer science. Since the author does not wish to assign his inquiry specifically to either philosophy or computer science, nor recognizes any requirement to do so, he will simply present his arguments without much theorizing on meta-levels. The main goal of this thesis is to develop a comprehensive algorithmic treatment of rank-based epistemic updates, illustrated with a concrete proposal. The first step is to show how belief states and evidence are represented by ranking functions. The second step is to construct a graph-based data structure that adequately represents the belief state as a rank-based belief network; in a third step, the algorithm for updating the belief network is introduced.
The algorithm takes a prior belief state and new evidence as input and generates as output a posterior belief state reflecting the evidence. While the first step is mostly reproductive, in that it presents much argumentative material previously brought in by other authors, the second and third steps predominantly introduce arguments developed originally by the author. An important inspiration for the argumentation of this thesis is the well-known concept of a Bayesian network. Bayesian networks are considered as a tool for representing belief states and, furthermore, as a foundation for the mechanism of transition from a prior to a posterior belief state, a transition triggered by the arrival of new evidence. Ongoing research over the last 25 years has led to a differentiated understanding of the strengths and weaknesses of Bayesian networks with respect to the representation of belief states and the capabilities of epistemic updates in different fields of application.
We will represent epistemic states by ranking functions instead of probability measures, but we will nevertheless utilize many known facts about Bayesian networks, which serve as an important blueprint for modeling the update mechanism. During the work on this thesis, the author was frequently confronted with the need to re-introduce rank-based versions of concepts already common in probability theory. This always carries the danger of drifting into "re-inventing the wheel", yet it proved very insightful to start formal theorizing about ranks from the very foundations. The author decided to provide not only his arguments but also a general introduction to the topic. We will therefore explicitly discuss the minimal set of indispensable algebraic basics and, later on, also address some specific graph-theoretic problems, such as finding a perfect vertex ordering or the complete recognition of cliques in a triangulated graph. These topics belong neither to ranking theory in particular nor to epistemology in general, but they play an important role in the argumentation and therefore had to be presented in an adequate manner. It is thus characteristic of this thesis that it utilizes formal tools from measure theory, graph theory, and algorithmics for a contribution to formal epistemology. Instead of merely compiling relevant research results about Bayesian networks for use on an epistemological topic, the author transfers already available knowledge to the case of ranking theory, with the aim of making progress in ranking theory towards a general mechanism of efficient updating. What this thesis does not do is introduce a philosophical position of the author. Instead, the author concentrates on showing that ranking theory, taken as a philosophical position in its own right, is well suited to provide a foundation for concrete applications in computer science.
This of course has implications for epistemology, since it shows on a very concrete formal level how iterated belief change works. The contribution is therefore the transfer of research results from probability theory to the context of ranking theory, as well as the extension of ranking theory with new concepts and arguments. To the best knowledge of the author, no other publication has so far described an update algorithm for multiply connected rank-based networks. Nonetheless, as we will see later, highly relevant considerations of updates for singly connected networks have already been discussed insightfully in the works of Daniel Hunter. Chapter I of this thesis presents a sketch of the philosophical field of questions to which this thesis intends to contribute and introduces the relevant concepts formally, though only at a very high level of abstraction. Additionally, chapter I briefly sketches the basic problems of belief revision, with only a few pointers to the connected philosophical discussions1. It also introduces the algebraic foundation of belief theory in the form of propositional algebras. It is argued why propositions are taken to be the objects of belief in this inquiry, and how propositions are formally defined. The chapter further defines epistemic states and epistemic updates at the most coarse-grained level of abstraction.
1 Those discussions will be mostly familiar to readers with a philosophical background. Readers more at home with the parts of the work related to computer science may feel that the historical and theoretical background of the philosophical problems is not highly relevant for the concrete arguments.
The concepts and many of the arguments in chapter I are not original inventions of the author. They are mostly reproduced from chapters 1, 2, and 4 of (Spohn, 2012), but with a strong focus on the formal and technical aspects, leaving out most of the references to the philosophical discussion. The author added a brief reflection on the use of "engineering-like" formal tools in the philosophical domain (section 1.2.2) and a short digression about belief content and the related discussion originating from the arguments of Kripke and Putnam (section 1.4.4). All other argumentative substance in the chapter is owed to Wolfgang Spohn. Chapter II presents the most important aspects of the current state of research in ranking theory. It formally introduces ranking functions on propositions and discusses the most basic formal extensions and variations that have been introduced into the discussion so far. Among these are the two-sided ranking functions proposed in (Spohn, 2012, p. 76)2 as well as the varying concepts of conditional ranks. Additionally, two different methods of conditionalization (Spohn-conditionalization and Shenoy-conditionalization) are discussed, which directly correspond to Jeffrey- and Field-conditionalization in probability theory. Other important contents of this chapter are, for instance, a rank-based version of Bayes' theorem and a "chain rule" for ranks. For the formal concepts that are introduced, some notable properties are additionally shown. The chapter compiles relevant material that has been widely published before, also discussing variations in terminology and properties of the formal concepts. Taken on its own, chapter II should serve as a general introduction to the mathematical foundation of ranking theory. Like chapter I, it is mostly reproductive and contributes a compiled introduction of formal material.
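To make the rank-theoretic notions just listed concrete, here is a minimal illustrative sketch following the standard definitions (negative ranks as minima over worlds, conditional ranks by subtraction, two-sided ranks as differences). The worlds, rank values, and function names are invented for illustration and are not the thesis's own notation:

```python
INF = float("inf")

# A negative ranking function kappa on four possible worlds; the worlds
# and their rank values are illustrative assumptions.
kappa = {"w1": 0, "w2": 1, "w3": 2, "w4": 0}
WORLDS = set(kappa)

def rank(a):
    """Negative rank of a proposition A (a set of worlds):
    kappa(A) = min over w in A of kappa(w), with kappa(empty set) = infinity."""
    return min((kappa[w] for w in a), default=INF)

def conditional_rank(b, a):
    """Conditional negative rank: kappa(B | A) = kappa(A & B) - kappa(A)."""
    return rank(a & b) - rank(a)

def two_sided(a):
    """Two-sided rank: tau(A) = kappa(complement of A) - kappa(A).
    tau(A) > 0: A is believed; tau(A) < 0: A is disbelieved; 0: suspension."""
    return rank(WORLDS - a) - rank(a)

A = {"w1", "w2"}
B = {"w2", "w3"}
assert min(rank(A), rank(WORLDS - A)) == 0   # the law of negation holds
print(rank(A))                 # 0
print(conditional_rank(B, A))  # kappa(A & B) - kappa(A) = 1 - 0 = 1
print(two_sided(A))            # 0: neither believed nor disbelieved
```

The subtraction defining conditional ranks is the rank-theoretic analogue of division in conditional probability, which is the pattern that recurs throughout the thesis.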
The arguments and concepts presented have already been described by other authors, mainly Wolfgang Spohn. It seemed very reasonable to the author to present a compiled version of this material, for several reasons:
1. A clean and detailed introduction of all formal concepts, adjusted to a uniform terminology and formal language, is very important for a concise presentation of the arguments of this thesis. Since the formal concepts stem from different discussions and multiple disciplines, the author considered it necessary to introduce all relevant concepts himself to ensure uniformity. For this reason, chapter II introduces ranking theory, while chapter III introduces a considerable set of concepts from graph theory.
2. Before the advent of (Spohn, 2012), the foundations of ranking theory were scattered across a variety of texts, some of them separated from each other by many years during which the concepts underwent variation in naming conventions, notation, and also mathematical properties. (For instance, Wolfgang Spohn at one point decided to use natural numbers instead of ordinal numbers as the codomain of ranking functions. Additionally, a variety of concepts representing conditional ranks were proposed, mostly with different properties.) Presenting the author's own introductions of the formal concepts ensures maximal clarity about which concepts are to be discussed. Of course, this is cum grano salis, since the reproduction obviously suffers from a lack of
2 Confer definition 5.12.
philosophical depth, which is a consequence of considering ranking functions as mere material for algorithmic work. Whenever Wolfgang Spohn added insightful comments on why he introduces things in a particular manner and not in the style of some argument x, the author of this thesis mostly simply adopted his view, whenever doing so did not entail difficulties in the algorithmic part. One could also say that the thesis is written from an engineering point of view; this fact is philosophically reflected at the beginning of chapter I.
3. The reader not familiar with ranking theory may use chapter II both as a general introduction to the topic and as a reference. Therefore, chapter II and parts of chapter III are written in the style of a textbook and should be easy to read for readers used to formal concepts. In general, the proof level of chapter II is very elementary. Some proofs will be almost trivial for readers with more than the most basic mathematical knowledge, and indeed, in some passages of this chapter, the argumentation progresses quite slowly. Although this may demand some patience from readers with the appropriate background, it is perfectly congruent with the author's intention, since he wants to show from which basic mathematical structures the rank-based concepts originate without leaving behind readers less experienced in formal topics. Since all proofs are clearly marked, it is easy to skip a particular proof if the reader is not interested in it. In chapters III and IV, the proof level returns to a more conventional level.
4. Only a complete introduction of all concepts gives a "vertical" insight into the matter. To clarify: the most coarse-grained vertical perspective states that the author considers propositions as an algebraic structure on which ranking functions can be defined; ranking functions can form graphical data structures that can be shown to satisfy the Markov properties and can hence be updated by a Lauritzen-Spiegelhalter-style update algorithm. It was the aim of the author to make this vertical view as complete as possible.
The main contribution of chapter II is therefore to present the current state of research in ranking theory and to introduce the topic to readers not already familiar with it. Chapter III uses the formal material introduced in the previous chapters to develop ranking networks as graphical models of ranking functions. A ranking network is a data structure that represents the ranking function on a set of variables and is therefore the implementation of an epistemic state at a certain discrete time. In formal respects, ranking networks are directed acyclic graphs whose vertices are conditional matrices of variables and whose edges are subjectively valid relationships of causal influence between the variables. The network has an assigned negative ranking function that defines the conditional information for each vertex. The concept of a measurable variable is introduced along with many basic concepts from graph theory (which may not be familiar to all readers). Since it has already been shown that rank-based conditional independence is a graphoid3, the focus lies on what is consequently
3 Confer (Hunter, 1991a) and recently (Spohn, 2012, p. 132) with theorem 7.10.
the subsequent step: showing that the Markov properties hold for ranking networks. This is discussed separately for undirected and for directed graphs, and for three different Markov properties. These Markov properties are the formal precondition for making the update algorithm work, since they enable the represented ranking function to be expressed by a potential representation. Some of the relevant proofs are either very similar to their counterparts in probability theory or are solvable purely at the graph-theoretic level, not requiring any arguments from ranking theory. Others are more sophisticated and quite non-trivial. For the same reasons as stated for chapter II, this material is introduced in uniform terminology. The presumably most important contribution of chapter III is a rank-based version of the Hammersley-Clifford theorem, introduced as theorem 3.67 on page 114. To the best knowledge of the author, the validity of this theorem in ranking theory has never been shown before. Chapter IV shows how a transition from a prior to a posterior belief state can be formally represented and efficiently executed. Both the prior and the posterior belief state are represented by ranking networks as defined in chapter III. The update process is divided into two phases. In the first phase, a "compilation" of the ranking network is required, in which the permanent belief base is created (or in fact re-created) from the prior ranking network. The permanent belief base is a data structure that represents the prior belief state in a technically feasible manner. It is guaranteed to be a tree (in the graph-theoretic sense of the word), which ensures that well-known techniques for tree updates can be applied to incorporate the evidence. The compilation phase is only required if the evidence has modified the structure of the network by the insertion or deletion of edges or vertices.
If the evidential input does not change the network structure, recompiling can be omitted, although a compilation definitely has to be performed at least once, when the iterated update process starts. Technically, the compilation phase decomposes the ranking network into a clique tree and computes a potential representation of the ranking function that characterizes the prior belief state. This chapter presents adaptations of graph algorithms such as clique recognition and the establishment of a perfect elimination ordering to derive a triangulation of the input network. The second phase is the "update phase": it incorporates the evidential input into the permanent belief base. After the update phase is completed, the belief base reflects the evidential input. In technical terms, this can be performed by Pearl's message passing algorithm, but the probabilistic arithmetic has to be replaced by operations adequate to the ranking semantics of the vertices. The update algorithm works on ranking networks in general, that is, on any kind of directed acyclic graph without loops and multi-edges. In particular, the network is not required to be singly connected. As the reader may already have noticed in the preceding paragraphs, the algorithm is heavily inspired by the technique of Lauritzen and Spiegelhalter for updating multiply connected Bayesian networks, and the author drew many benefits from the earlier works of Judea Pearl and Richard Neapolitan while developing it. Nonetheless, a number of more recent research results are also considered in this thesis. Improving on Neapolitan's approach, the author shows that the compilation phase can be completed by the run of
one single procedure. Chapter IV ends with an outlook on further tasks that could be reformulated in specifically rank-based terms, such as structure learning of ranking networks and the implementation of further types of inference. The argumentation of the thesis leads from basic considerations of propositional belief formation towards a directly implementable system of permanent epistemic activity that keeps a belief base up to date with the available evidence by passively incorporating the new evidence.
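To foreshadow the arithmetic involved, here is a hedged miniature of the rank-based analogue of probabilistic local computation: where probabilistic potentials are multiplied and marginalized by summation, rank potentials are added and marginalized by minimization. The two binary variables, potential tables, and function names below are invented for illustration and do not reproduce the thesis's clique-tree data structures:

```python
# Two binary variables X and Y with an additive potential representation:
# the joint negative rank is the SUM of clique potentials, where a joint
# probability would be the PRODUCT of its factors.
psi_x = {0: 0, 1: 2}                      # potential on clique {X}
psi_xy = {(0, 0): 0, (0, 1): 1,
          (1, 0): 1, (1, 1): 0}           # potential on clique {X, Y}

def joint_rank(x, y):
    """kappa(x, y) = psi_X(x) + psi_XY(x, y): combination is addition."""
    return psi_x[x] + psi_xy[(x, y)]

def marginal_rank_y(y):
    """Marginalization is minimization: kappa(y) = min over x of kappa(x, y),
    where a probabilistic marginal would sum over x."""
    return min(joint_rank(x, y) for x in (0, 1))

def conditionalize_on_y(y_obs):
    """Plain conditionalization on the evidence Y = y_obs: subtract
    kappa(y_obs) so the best world consistent with the evidence gets rank 0."""
    shift = marginal_rank_y(y_obs)
    return {x: joint_rank(x, y_obs) - shift for x in (0, 1)}

print([marginal_rank_y(y) for y in (0, 1)])  # [0, 1]
print(conditionalize_on_y(1))                # {0: 0, 1: 1}
```

Replacing (sum, product) by (min, sum) in exactly this way is what the thesis means by substituting operations adequate to the ranking semantics into the message passing scheme.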
I Belief, Belief States, and Belief Change
1.1 Introduction
This chapter elaborates on the relationship between normative epistemology and engineering. It further describes the contribution this thesis makes to this field of research, and additionally introduces the basic notions of belief theory as they will be understood throughout the text. As already stated on page 11, the argumentative material in this chapter is not original to the author; the chapter rather tries to "distill" the most relevant parts of formal belief representation into a brief but sufficient introduction. As a blueprint, the author used mainly chapters 1, 2, and 4 of (Spohn, 2012). Section 1.2 discusses the perspective of normative epistemology from which the thesis is written. Some of the commonalities with engineering in general are emphasized. The section is not intended to discuss the conditions of normative epistemology comprehensively; rather, it seeks to underline why normative epistemology is attractive for working on philosophical topics with the devices provided by mathematics and computer science. Since this thought perhaps seems unintuitive at first, an introductory remark about this aspect seems reasonable. Section 1.3 describes the belief revision system the thesis will develop from an intuitive point of view, emphasizing the mechanistic perspective that will successively become more dominant in the later chapters and eventually supersede the philosophical considerations. Section 1.4, as already stated, introduces a framework for representing propositional belief that is suitable for technical implementation. It reproduces main ideas introduced by Wolfgang Spohn in chapters 2 and 4 of his (2012). Sections 1.5 and 1.6 reproduce a subset of the argumentative material Spohn gives about the rationality of belief sets and the transition between rational epistemic states. We will concentrate on the more technical parts and leave out most of the references to the history of analytical philosophy.
These sections may be seen as a "foreshadowing" of the implementation described in chapters III and IV.
1.1.1 Remark on Sources and Citation
In this chapter, we will introduce an algebraic framework to represent belief states and belief change. This framework was developed by Wolfgang Spohn in chapter 2 of his (2012). Although Spohn also makes use of ideas that are quite common in the wider discussion about belief revision and the adequate representation of belief, the author of this thesis used Spohn's work as the main orientation for his own writing in this chapter. It is a foremost goal of this thesis to maintain compatibility with Spohn's concepts. In the remainder of this chapter, we will therefore mainly introduce those of Spohn's formal concepts that are important for our aims, but without reproducing each of his arguments concerning the epistemological aspects. Instead, the author discusses the concepts in his own words. Section 1.2 presents a genuine argumentation of the author of this thesis, with the exception of subsection 1.2.3, whose arguments are mainly owed to section 1.1 of (Spohn, 2012). In particular, the essential "two questions" on page 25 were heavily inspired by equivalent questions found in (Spohn, 2012, p. 6). Section 1.3 introduces nearly the same basic elements as Spohn does and should therefore be read as merely stating the starting point from already well-known concepts. The idea to use Hintikka's (minimal) requirements of rationality was taken from (Spohn, 2012, p. 48). Subsection 1.4.4 presents a genuine argumentation of the author of this thesis. The reader should explicitly note that the moves made in section 1.4 are completely based on the concepts introduced in (Spohn, 2012, chapter 2) and that sections 1.5 and 1.6 consist entirely of material compiled from sections 4.1 and 4.2 of (Spohn, 2012, chapter 4), but presented completely in the author's own words. As already stated above, the author of this thesis tried to distill the purely formal parts of Spohn's arguments whenever possible while commenting on them in his own words.
Therefore, explicit citations do not occur frequently throughout sections 1.4 – 1.6. To ensure lucidity about the sources, all theorems and definitions stemming from (Spohn, 2012) are listed in an index starting on page 221, where the source of each item used is made explicit. The quite generous reproduction of Spohn's formal concepts seems sensible to the author because in the later chapters he will connect them to concepts from different research fields, and he wishes to maintain maximal formal consistency throughout the entire thesis.
1.2 A Normative Perspective on Epistemology
1.2.1 Descriptive and Normative Perspective
Before we take a concrete, detailed, and well-structured look at the questions that form the agenda of this thesis, it has to be pointed out what implications our perspective on the topic will have. A normative perspective focusses on what rational beings should believe, given certain circumstances, and what they should not. This is different from the perspectives of most special sciences, such as sociology and, for the most part, psychology, which merely describe the factual belief states of a subject. We will therefore call the perspective of these disciplines the descriptive perspective.
While a descriptive perspective is interested in understanding and explaining how beliefs are actually formed and modified in terms of a particular special science, the normative perspective tries to point out how these processes should function in order to be acceptable as rational. A descriptive perspective considers the scientific facts, and its interest concentrates on the empirical conditions related to beliefs and their dynamics with respect to the concepts of some special science. The many fascinating questions concerning the empirical aspects of beliefs are scattered widely across the special sciences, such as the neurosciences, cognitive psychology, linguistics, and sociology, to name only the most obvious ones. The perspective that focuses on concrete empirical questions is by convention not a genuinely philosophical perspective, although its observations influence philosophical questions. The normative perspective tries to find out what beliefs we should acquire under certain circumstances and consequently presupposes the existence of rules that enable us to sketch idealizations about the dynamics of beliefs. The prescriptive aspect of the word “should” implies the presence of such an idealization, because otherwise no prescription would be justified. The normative perspective is not interested in practical modifications of cognitive mechanisms but just in gaining a lucid conception of rational belief change. It is a notable fact that engineering disciplines like computer science, with its topics of machine learning and data mining, to name only two, bear a strong family resemblance to normative epistemology concerning the normative perspective. Idealization about the dynamics of belief enables us to draw a distinction between belief states that are sufficiently justified, correct, coherent, and consistent, and which are thus reasonable or at least rational, and other belief states.
All these adjectives are loaded with non-neutral theoretical content. In other words, a normative perspective on belief theory claims to be able to make a structured distinction between belief revision processes that would need correction to some extent to fit ideal conditions and belief revision processes that would not need any such correction at all. This sounds rather vague, because nothing has been said so far about the criteria for this distinction. The terms “correction” and “ideal conditions” appear to be mere placeholders. Furthermore, it has been argued very lucidly by different authors, for instance Jaegwon Kim and Wilfrid Sellars, what normativity could mean and that the term “normative” is not atomic but describes many different ways in which a statement can be normative. We will not engage in these discussions at this point. Regardless of any reasonable differentiation one could establish, it is sufficient for understanding the arguments presented in this chapter that the characteristic aspect of the normative perspective sums up to the question of what we should believe under which circumstances. The normative inquiry into belief tries to sketch a lucid picture of the rules or laws that hold for the unbiased acquisition of beliefs, and in the course of this investigation, the vague character of the above-stated sentence will disappear.
1.2.2 The Philosopher and the Engineer
At first glance, it may seem that figuring out the details of the normative perspective falls naturally and primarily within the responsibility of the philosopher. But this conjecture previously
has been (and currently remains) the subject of dispute. The arguments most interesting for our particular topic were raised in the context of a discussion about a naturalized version of normative epistemology. Naturalists usually argue that epistemology is a technical discipline. A common objection against this conjecture is that epistemology has to contain normative parts and that a technical discipline would not be capable of being normative in the sense epistemology needs to be. One of the responses of the naturalists was that epistemology could be normative in the sense in which engineering disciplines are normative. The locus classicus is Quine’s short (1986), where he argues that “normative epistemology” is “a branch of engineering”, a mere “technology of truth-seeking”:
“Naturalization of epistemology does not jettison the normative and settle for the indiscriminate description of ongoing procedures. For me normative epistemology is a branch of engineering. It is the technology of truth-seeking, or, in a more cautiously epistemological term, prediction. Like any technology, it makes free use of whatever scientific findings may suit its purpose. It draws upon mathematics in computing standard deviation and probable error and in scouting the gambler’s fallacy. It draws upon experimental psychology in exposing perceptual illusions, and upon cognitive psychology in scouting wishful thinking. It draws upon neurology and physics, in a general way, in discounting testimony from occult or parapsychological sources. There is no question here of ultimate value, as in morals; it is a matter of efficacy for an ulterior end, truth or prediction. The normative here, as elsewhere in engineering, becomes descriptive when the terminal parameter is expressed. We could say the same of morality if we could view it as aimed at reward in heaven.” (Quine, 1986, p. 664f)
Instead of reading Quine as if he agreed to factually exclude normative epistemology from the core interests of philosophy, the discussion took up the proposal that normative epistemology could be “naturalized” by emulating engineering disciplines. For example, an engineer who builds a bridge knows by which properties a “good” bridge is characterized and what he has to do to build a “good” bridge. In the same way, an epistemologist knows what a “good” revision is and how a “good” revision can be made even “better”. Quine’s “engineering reply” to the objection against naturalized epistemology was itself a subject of discussion; see for instance (Wrenn, 2006), where a lucid catalogue of the different types of normativity in question is also provided. We will not join this discussion, since it is not directly relevant to what we are about to do. On the other hand, this is in brief exactly what we will do in the remainder of this inquiry: doing epistemology on an engineering level, using the formal devices provided by engineering-like disciplines, namely computer science and mathematics. It is therefore nonetheless reasonable to comment on the conditions of such an approach. Quine is undoubtedly right in stating that many “branches of engineering”, especially but not only in computer science, focus on what could justifiably be called “truth-seeking” on a level that could be called “technological”. Examples of such applications are neither rare nor special. Think of everyday requirements like a database system that has to preserve data consistency during update procedures,
the Bayesian filters in most email clients that can learn to make good decisions about which of the messages you receive are undesired, or the fuzzy logic in a digital camera that tries to generate exceptionally good-looking pictures of what seems to be reality. The internal processing logic of the camera implements a calculus to decide which visual effects are undesired and which are acceptable. Any knowledge representation system, regardless of its purpose, has to implement basic requirements of rationality, at least consistency. It is not very surprising that most sciences rely on their own techniques of truth-seeking if we remember that the occidental conception of science is directly and inseparably connected to truth. While the normative part of “truth-oriented” philosophy analyzes conceptions of truth, most sciences develop truth conceptions implicitly. They are to be implemented in algorithms, heuristics, and strategies within the conceptual scope of the particular discipline or subdiscipline. One may see this as a “technological” view of truth. But the analysis of the truth conception of a specific discipline is usually not part of the discipline itself, with philosophy being the remarkable exception. We will return to this thought a little later. The differences in perspective and aim of analysis between philosophy and engineering seem to induce a quite lucid distinction between the different spheres of competence: the philosopher’s task seems to be analyzing what truth “is”, and the technologist’s task seems to be developing techniques to “generate” true assertions from knowledge. This seems clear and fair at first sight. But one should not conclude from this observation that the philosopher is supposed to cede a part of the authority concerning the investigation of truth to the protagonists of other sciences.
Unless one rejects the idea that normative reasoning about truth is a philosophical task, the philosopher is also addressed when it comes to the question which beliefs can – read: should – be legitimately generated from beliefs that are already accepted as part of a subjective knowledge base. Our contemporary experience in this field shows that mathematics, statistics, and especially computer science provide very useful devices for working out the details of this question. When philosophy develops a formal conception of truth, it is a legitimate question to the philosophers whether this conception is applicable in a practical sense. This, in particular, includes the question whether it can be implemented technologically. One may conclude that the implementation is not a specifically philosophical task, but once the legitimacy of the question of implementability is accepted, it becomes impossible to draw a sharp demarcation line between the philosopher’s part and the engineer’s part in the subject of a normative theory of reasoning. This, of course, sounds like a feuilleton argument, but it can be stated more precisely by considering the relationship to rationality that is maintained in philosophy on the one hand and in engineering on the other. The best example of engineering in this context is computer science because, among other topics, computer science tries to implement conceptions of intelligence, to understand the formal processes of decision making and inductive inference, and to provide systems with the ability to apply these techniques to unknown situations. This seems to directly imply an understanding of at least some currently unexplained capabilities of the human mind. It may seem that either this claim, formerly the legal territory of philosophy, was usurped by another discipline or – in an even more pessimistic way – that this shows that, at last,
philosophy turned out to be of no more importance for finding out how thinking works. Computer science is as closely connected to logic and linguistics as philosophy is. One may refuse to consider computer science a mere engineering discipline, thinking of subjects like complexity theory and formal languages, which seem to reside in the competence sphere of computer science without being very “engineering-like”. However, it is undebatable that computer science is strongly influenced by engineering paradigms. But in contrast to philosophy, computer science does not model rationality from a reflexive point of view; it just uses – as every engineering discipline does – an implicit normative perspective motivated by the search for techniques to gain precise and good results within the scope of its domain. What “good” results are is in most cases beyond discussion: it is simply provided as a definition. For the philosopher, a “good” solution to an issue may be one that does not contain inconsistencies, fits acceptably well with already existing attractive explanations of related issues, does not raise heavy contradictions with existing attractive models of explanation, and is, moreover, acceptably simple. One may find other criteria and may also argue that they are not uniformly accepted in the entire community, but at least there is some minimal consensus about the properties listed above. This is also true for computer science, but here the rules are much stricter: whoever enters the community claiming to have a “good” solution to a problem has to face three types of questions.
1. Is the proposed solution less resource-consuming than any existing competing solution? (That is: is it asymptotically faster – or at least empirically faster for typical inputs? Does it need asymptotically less memory or bandwidth?)
2. Does the proposed solution improve the quality of results over any existing competing solution? (This is of special importance for solutions based on heuristics or for domains with different modeling conventions.)
3. Is the proposed solution simpler than any existing competing solution? (That is: is it easier to understand or to implement?)
Each of these questions targets an aspect of what computer science considers relevant to scientific progress within its scope. If at least one of these questions cannot be answered positively, ideally supported by appropriate theoretical proofs or strong arguments and empirical tests, the proposal will typically not be accepted as a valid improvement. The reason for this assessment obviously is that in the case of three times “no” the proposal typically promises a contribution neither to scientific progress nor to practical use. Hence it seems that computer science does not try to reflect on rationality but merely tries to teach machines to make rational decisions, supposing that rationality is already defined. But this is far too brief: in fact, there are more sciences of “rationality” than just philosophy. No engineer can implement conceptions, whether in software or in hardware, that are not fully understood and formally clear. This does not require an engineer to understand some mysteries of the mind, but it does require her to understand the particular conception of rationality
she uses to solve her concrete problem. For her purposes, it must be completely defined “what it is like to be in a rational state”. This requirement is in a way both weaker and stronger than the requirement of the philosopher. The philosopher is by tradition more interested in the questions than in exact answers. Therefore, the requirement of the engineer is stronger, because her conception of rationality criteria must be sufficiently concrete to implement. The engineer has to understand the inherent structure of the capability she intends to implement, regardless of whether she uses fuzzy logic, probability theory, neural networks, machine learning algorithms, or other tools. The philosopher is interested in understanding what rationality, well, “means”. This addresses any concept of rationality. In this regard, the claim of the philosopher is stronger, since her discipline always contains the critical reflection of the notions it uses. But it is obvious that the engineer’s claim comes down to exactly the same question as the philosopher’s: what is the inherent structure of the concept of rationality under consideration? This equivalence may consequently be interpreted as a kind of rivalry, and it may therefore seem that computer science and the cognitive sciences contribute much more substance than contemporary philosophy does. It appears to be a side effect of scientific progress that ever more of the questions formerly seeming to be inherently philosophical turned out to be answerable using the conceptions of special sciences. Of course, this is a statement from the point of view of the special sciences. Philosophy could answer that most of the questions currently discussed in the neurosciences were first developed in philosophy. Nonetheless, the rise of the neurosciences and computer science does show that not all philosophical questions are eternal.
Some aspects indeed turned out to be answerable by concrete approaches of the special sciences, approaches that became feasible through sufficient scientific progress. And currently it is an ongoing task for philosophy to incorporate the important scientific knowledge brought in by the special sciences. Acknowledgement, analysis, and incorporation of the progress made by the special sciences is, and indeed must be, a permanent stimulus for contemporary philosophy. So far, it is clear that philosophy and engineering share some questions, and insofar it is correct to state that “truth-seeking” has many technological aspects. But how about the converse question: is any approach to rationality that uses formal devices an “engineering-like” approach? Does epistemology turn into an engineering discipline by using “engineering-like” devices? Of course, one may be tempted to join the naturalist view, accept normative epistemology as a mere technical discipline, and feel committed to the view that reflecting on rationality and motivating epistemology is no longer part of epistemology itself. Nonetheless, there are not only commonalities but also extremely important differences between epistemology and computer science. The mere fact that there exist engineering tasks that yield concrete technical implementations of truth-seeking strategies does not show that the theoretical conception, the rules of acquiring true beliefs, is as a whole a part of engineering. The philosopher’s attitude becomes suspicious when thoughts can be made concrete to a level at which an engineer can implement them. At first glance, this might seem to be an argument that they lack the abstraction level philosophy prefers and requires in order to reflect on its questions. But in fact each science has its own devices at hand to reach progress in knowledge from its special perspective. The rules sufficient in engineering are in many respects not sufficient from a philosophical perspective. The learning algorithms of the computer scientist and her solution-seeking techniques share a property that disqualifies them as sufficient answers to the questions of the philosopher: they always serve a certain purpose and are only applicable under specialized, strict, and in some respects uncommon conditions. The criteria of truth in engineering are criteria of optimality relative to certain purposes. They are specialized and context-dependent. They – and this is the strongest difference to philosophy, as already stated above – contain neither their own reflection nor their own motivation, the latter being purely instrumental throughout the entire discipline. Building bridges is part of engineering, but analyzing the reasons why bridges should be built is not, nor why bridge building itself is “good”. This is completely different for epistemology, or philosophy in general. Again, this does not show that truth-seeking belongs more to engineering than to philosophy, but it shows that the questions philosophy tries to answer are highly relevant to other subjects, that the devices philosophy uses are related to the devices of other sciences, and that those devices can be inspiring for philosophical methods and vice versa. The argument that computer science uses conceptions of truth that are in some way specialized is nonetheless debatable, because it depends on the abstraction level on which the conception is analyzed.
The different conceptions of course share a common minimal core of consistency and deductive closure, regardless of whether a relational database, a Bayesian filter, or an engine for automatic planning is considered. It is therefore not a strong argument that each task adds its own extensions to these conceptions. It is not the task of the philosopher to develop solutions aimed at optimality. It is her task to sketch new, sensible pictures of old problems and to discuss new thoughts about known subjects, bringing to light aspects that were hidden before; but this is a purpose of its own, and it does not take place to serve some special purpose or to meet special requirements. Therefore, the philosopher enjoys more liberty than the engineer (who has to answer the three questions). It seems simply impossible to define criteria of optimality for truth-seeking at the level on which a philosopher analyzes truth-seeking. This surely seems to be a strong indication for the engineering sciences that their perspective on truth-seeking cannot catch up with that of the philosopher, and that reasoning about truth-seeking algorithms may not be a philosophical task. But on the other hand, from the mere fact that the philosopher is not the only researcher interested in a certain perspective, it does not follow that she is not entitled to share this perspective or has to give up her task. Another valid argument against the conjecture that epistemology is “pure” engineering is directly connected to this aspect: the task of the engineer is always the solution of a distinct and special problem, and the concepts of truth and consistency are always instruments for her and not objects of her scientific reflection on what she does. Reflection on what she does is, as said above, not part of her domain. This is surely different in the case of the philosopher. An implementation of some heuristic learning algorithm may try to use techniques for making good decisions that are similar to those used by human beings, but the decisions made by the algorithm will always be oriented toward some special and narrow principles. It is not made for reflection but for bare use. Nevertheless, use is a fundamentally different destination than reflection. This is the most important distinction between the normative perspective of the philosopher and the normative perspective of the engineer. The normative perspective of the philosopher is free to produce categorical “goodness”. This is not possible for the engineer, whose categories of goodness are never independent from the underlying unreflected purpose, which is itself not part of engineering. In short, a third aspect is that normative epistemology is the contemporary approach to the problem of induction. This problem definitely lies within the responsibility of the philosopher. We will see in the next section that the two central questions of this thesis amount to a formal approach to the problem of induction. This problem cannot be analyzed without introducing idealization into epistemology, and therefore philosophy has to consider this topic. However, the fact that different disciplines are interested in the same conception for different goals does not in any way show that inquiries into the conception are the definitive task of just one of them. Where some authors may see a strong demarcation line, there is just an interdisciplinary discussion about the shared conceptions of truth and rationality. Philosophy in its analysis may utilize formal devices from mathematics, statistics, and even computer science without becoming engineering. Philosophy is not only allowed to do so; there is, furthermore, a strong indication to use whatever powerful devices are available. On the other hand, philosophy, without being able to access results from the special sciences, would be cut off from scientific progress.
These two aspects form the structure of the interaction between philosophy and engineering disciplines. Considering these facts, the question whether epistemology should be engineering or not seems ill-posed and of no relevance for practical scientific work.
1.2.3 Belief Revision and the Problem of Induction
The goal of any theory of belief is to describe how a set of beliefs is built up and maintained. The first is described by belief formation, the second by belief revision. In the course of the inquiry, the author will sometimes use the shortened term “belief theory”, which includes belief formation as well as belief revision. Both aspects provide a question for the beginning of the inquiry. To make clear what the starting point of our reflection is, consider the following analogy. We want to imagine the computational aspect of belief theory as a kind of update algorithm on a given set of beliefs. Proceeding from this thought, the following picture emerges. There is a cognitive system – perhaps the human mind, but it could also be another, lower-level information processing system – that consists of two functional components. First, there is a set of beliefs that are “held” in the system. This set represents the doxastic state of the system or, one could say, its “knowledge”. The second component is a kind of update “kinetics”, a mechanism that operates on this set of beliefs. This component represents the concrete strategy for integrating new information into the knowledge base of the system. This integration process implies the transition from its prior doxastic state to a new state.
With these two components, the system is capable of performing a kind of administration job on its belief base: whenever it encounters a new piece of information, it has to check whether it can accept the information. If this is the case, it adds the information to its belief base, such that the system has an updated belief base at each discrete point in time. On this basis, we can call it a knowledge management system. This sounds nearly trivial, but it clearly is not. In the beginning, the system is in some initial state, characterized by a set of initial or basic beliefs, from which the algorithm is started into a continuous loop of distinct, consecutive modification operations that are triggered by new information becoming available to the system. Whenever a new piece of information is presented to the system, the algorithm processes the information by adjusting the already present set of beliefs such that the posterior state of the system reflects the new evidence. Immediately we can identify two different cases: either the new information raises a conflict with the prior state or it does not. In the second case, where the presence of the new belief does not affect the other beliefs, it can directly be added to the belief base of the system. Because the system is clever, it also tries to gain knowledge by drawing inferences from this new evidence. Hence, the integration of the new belief results in a kind of rule-based “interaction” with the belief base. It may also be the case that the new information represents very strong evidence that causes the system to erase one or more beliefs that are in conflict with the new information. The system then has to decide whether the new evidence should be rejected because the sum of conflict-free beliefs it already holds has more weight than the single new piece of evidence. If it decides to add the new belief to the belief base, it has to find a way to resolve the conflicts the new evidence introduces.
In the course of this resolution process, some manipulation may be necessary to complete the transition to a conflict-free posterior doxastic state. In short, the algorithm performs an integration of the new information into the set of beliefs held in the system, in some way that will be the subject of further investigation. The result of such an update operation will be a posterior doxastic state, reflecting the new evidence on the basis of the prior doxastic state. (We also consider the case where the new evidence is rejected as a case of integration.) The crucial point is this: knowing this algorithm would provide a solution to the problem of induction, since it would provide an objective formal technique for generating new beliefs from current beliefs. This process always involves inductive reasoning. This picture is very coarse-grained. But it performs two adequate functions in a certain respect: first, it shows us the elements of the theory that will be part of further investigation. The second and more beneficial effect is that it draws our attention to the positions that are explanatorily empty. Obviously there are unexplained aspects at two crucial points in the picture.4 Although the process of continuous epistemic update seems intuitively very comprehensible, we lack any intuition about the state in which the system starts its update activity. How is the initial state of the belief base characterized before the first empirical evidence is considered? Which beliefs do we have initially when our mind “starts” to acquire beliefs? In other words:
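The update cycle just described can be summarized in a short sketch. The following Python fragment is purely illustrative: the representation of beliefs as string literals, the `negation` helper, and the naive conflict handling by retraction are assumptions made for this example only, not part of the formal apparatus developed later in the thesis (which uses ranking functions).

```python
# Illustrative sketch of the belief-base update loop described above.
# Beliefs are modeled naively as string literals; "not X" is the
# negation of "X". All names here are hypothetical.

def negation(belief):
    """Return the negation of a belief literal."""
    return belief[4:] if belief.startswith("not ") else "not " + belief

def update(belief_base, evidence):
    """Integrate a new piece of evidence into the belief base.

    Case 1: no conflict -> simply add the evidence.
    Case 2: conflict -> retract the conflicting belief and add the
            evidence; rejecting the evidence instead would be a third
            option, omitted here for brevity.
    """
    base = set(belief_base)
    conflict = negation(evidence)
    if conflict in base:          # the evidence contradicts a prior belief
        base.discard(conflict)    # resolve the conflict by retraction
    base.add(evidence)            # the posterior state reflects the evidence
    return base

state = {"it rains", "the street is wet"}
state = update(state, "not it rains")  # revision: conflicts with "it rains"
```

In this toy model, every conflict is resolved in favor of the new evidence; deciding *when* the prior beliefs should instead outweigh the evidence is exactly the question the thesis's ranking-theoretic machinery is meant to answer.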
4The author follows quite directly the argumentation in section 1.1 of (Spohn, 2012). Spohn also narrows down his exposition of the topic to the two questions introduced in this section (cf. (Spohn, 2012, p. 6)), but he chooses another route.
How is the initial doxastic state to be characterized?
This question engaged continental rationalism and Anglo-Saxon empiricism in a discussion about whether the mind is a tabula rasa at the beginning of life or whether a human has basic ideas in her mind when she enters the world. The question of the initial state implies the question of a priori beliefs, which immediately involves complex considerations of apriority. We cannot engage in further investigations on this point; however, the precise nature of the initial belief base will be the subject of further inquiry. We have identified one crucial question, but there is a second important point. Considering the initial belief state as given, the question arises in which way new evidence affects the modification of the belief base. When the presence of new evidence triggers an epistemic update, how is the transition from a prior doxastic state into a posterior state structured? Which beliefs can we derive from already acquired beliefs in the light of new evidence? Which beliefs can we legitimately derive from beliefs we already have? Stated more briefly: what should we inferentially believe? Hence, put in a more formal way, the second question is:
Which rules hold for the transition of one doxastic state to another?
It can easily be seen that these two questions form a condensed version of the problem of induction. The first question is about the belief base and the second about inferential beliefs; thus if we find an approach that answers these questions by giving reasonable theories, a complete inductive account is implemented. The concepts of derivation and inference used in this chapter and also in the following chapters should not be understood as purely deductive, because that would be a reduced construal not adequate to the problem. One reason for this is that deduction does not have the capability to “lead us from perceptions to beliefs about laws, the future or the unobserved”, as (Spohn, 2012, p. 3) says. But we know that we have the ability to draw such inferences. Hence, deduction is not the only device of inference that we practically use. The strength of deductive logic is possibly not sufficient to understand the inference processes involved in the transition from one doxastic state to another. Another demonstration of this comes from the fact that we draw concrete consequences from attitudes that are vague or uncertain. For example, we are acquainted with some vague attitudes, drawn from strong intuitions5, without having evidence and without fulfilling the rules of deduction in our inference process. However, we would not say that one does not act rationally by following her intuitions to some extent. Those intuitions influence our beliefs as well as our actions. As a result, we have beliefs that are uncertain. To be precise, there are strong indications that none of our beliefs is absolutely certain. When we speak of inference or derivation, these notions have to be wide
5The notion “intuition” always means intuitions in the colloquial sense, not Kantian intuitions.
enough to cover uncertainty. Thus, what at first sight seems easy and mechanistic enough to be implemented immediately in some algorithm turns out to be rather complex and intricate. (The engineer may respond that mere complexity in the details of computation does not introduce a difference in principle, and the philosopher will answer that she is not interested in coping with the complexity of implementation but in understanding how the transition works in principle.) The normative perspective of belief revision does not, of course, describe the psychological aspects of belief, but tries to find out how a perfectly rational mind would acquire beliefs on the basis of uncertain information. That is, it formulates the rules that a transition from one doxastic state to another must fulfill to be a valid epistemic update given the new evidence. The idealization lies in the assumption that, given new evidence, some updates on the belief base lead to a more “preferable” doxastic state than others. What it precisely means to be more “preferable” will be investigated in section 1.5.1. We clearly see that a formal normative theory of doxastic states and their rules of change will yield an approach to the problem of induction, because it would provide a method for inferring new beliefs that are not yet contained in what our beliefs already represent.
1.3 Elements of a Belief Theory
When we introduced the picture of the mechanistic knowledge management system above, we remarked that one of its benefits would be to clarify the elements that are important in theorizing about belief theory. To conceptually introduce the elements of the theory, we substitute the purely mechanistic concept of the knowledge management system with the concept of a subject. This does not make the former example invalid, because the knowledge management system can also be a subject and has to act like a subject. Following this thought, the picture contains the following elements. There are first the objects of belief, which take the role of epistemic units. The epistemic units are those objects to which a subject is related by the belief relation. Usually, they are called “propositions”, but there is great variety in the philosophical discussion about how propositions should be defined. Traditional AGM belief revision theory identifies propositions with sentences, and that is precisely why AGM theory is considered primarily a logical theory and not an epistemic theory. As a consequence, it is not open to different kinds of interpretations or applications. Another widely investigated definition takes a proposition to describe a set of possible worlds. The thesis will not follow these restrictions but will rather follow the definition of propositions Spohn gives in (Spohn, 2012, p. 17). This definition is more open to interpretation. We will return to propositions in detail in section 1.4. A subject can take different epistemic attitudes towards epistemic units. Gärdenfors speaks of three epistemic attitudes in (Gärdenfors, 1988): a subject can accept a proposition, reject it or be indeterminate about it. There is a separate discussion on what it means to accept a belief, or take it to be true. The questions related to this discussion are not in the focus of this thesis.
The epistemic attitude of acceptance of a certain belief can be imagined as keeping this belief
as a part of the current epistemic state the subject is in. Rejecting a belief can be defined as accepting its negation. Being indeterminate about a belief means that neither the belief nor its negation is accepted in the current epistemic state. To make this clearer, epistemic states have to be explained. Epistemic states, or, as they will also be called, doxastic states, are the central concept of belief theory. A doxastic state is imagined as the set of all beliefs the subject accepts. Formally, an epistemic state is represented by an aggregation of epistemic units. Consequently, an epistemic state is a set of propositions. But this is already a theoretical predetermination, because epistemic states are of different structure in different theories. They can also be expressed as a probability measure in Bayesian models or as a set of possible worlds, to mention only two of the most prominent conceptions. So far, these are the static aspects of belief. The dynamics of belief come into account by the following argument. An epistemic state can be altered by the confrontation with new evidence, or, as Gärdenfors calls it, “epistemic inputs”. This name suggests that some new entity is introduced, but this is in fact not the case. A minimal example of an epistemic input is a proposition becoming in some way “present” to the subject, combined with a particular posterior certainty degree6. If the subject accepts the new epistemic input, she alters her epistemic state by integrating the new proposition into it. This means that she performs a transition from a prior epistemic state to a posterior epistemic state. This kind of alteration is called an epistemic update. It means that the subject changes her prior epistemic state by integrating the new evidence into it. This requires changes of epistemic attitudes towards some – possibly many – epistemic units.
The result of performing the epistemic update is a posterior epistemic state. Theorizing about this transition from a prior to a posterior state implies a concept of the requirements of rationality, because otherwise no assertion can be made about what is necessary to ensure that the transition mechanism leads to a rational state, regardless of which kind of evidence it processes. The mechanism has to reflect – or, to be more provocative: define – rationality on epistemic states. Thus, on the meta-level, a theory of belief revision makes assumptions about rationality. The most prominent rationality postulates were brought into the discussion in (Hintikka, 1962): consistency and deductive closure. The axioms for modeling the update function have to ensure that the function meets these postulates. A later section will explain them in more detail. Having completed these considerations, the basic picture is sketched. The remainder of this chapter will propose formal definitions for the elements described above: we will formally introduce propositions as epistemic units and show that they can be interpreted to form algebras, and we will analyze which rules should hold for the transition from a prior to a posterior state. The other epistemic entities – attitudes, states, inputs and updates – will be defined in the terms of ranking theory in chapter II.
6Note that even in this most simplified presentation, the evidence is not just the proposition but the proposition together with its posterior certainty degree. Leaving out the certainty degree in effect means assigning some default value. This may be interpreted as evidence that comes with maximal certainty. We will discuss the details later.
Chapters III and IV will then show how updates can concretely be implemented for those formal concepts.
1.4 Propositions as Epistemic Units
1.4.1 The Concept of Proposition
Talking about beliefs presupposes a conception of epistemic units, the objects to which our epistemic attitudes relate us. Developing such a conception is a philosophically very complex and intricate piece of work, and many aspects of the conception that we will introduce and use here are not beyond doubt or discussion. The strategy of this thesis is not to discuss or evaluate the strengths and weaknesses of Spohn’s ranking theory but simply to use it to develop a belief revision algorithm. This, basically, saves us from the requirement of defending and discussing the conceptions, but we nonetheless add some remarks that help to understand why we do what we are about to do. It is Spohn’s declared strategy in (Spohn, 2012) to use conceptions that have found a sufficient amount of acceptance within the philosophical discourse, so that any new theory built on them inherits their persuasiveness. Therefore, we stay with Spohn’s strategy: whenever possible, we will try to use formally precise conceptions that are on the one hand as neutral to interpretation as possible and on the other hand as close to the epistemological “mainstream” as possible. We will start with the assumption that all objects of belief have the same nature or are of the same type. This assumption is not accepted by all authors. Furthermore, the precise description of the type of these objects is the center of a vivid discussion in which we will nonetheless not engage here. This assumption is also the first concession we have to make, because presupposing a uniform conception of all objects of belief seems to hurt the requirement of total neutrality. Many positions in the discussion are built on arguments trying to show that there are different kinds of objects of belief.
We will not engage in the discussion of those arguments now, but it should be remarked that this approach can also be extended to different kinds of objects of belief, although no explicit attempt is made to do so in this thesis. Since Frege’s Der Gedanke, the mainstream of the discussion has agreed to consider propositions as the objects of beliefs and therefore to consider beliefs as propositional attitudes. Of course, there are dissenters. Quine is a prominent protagonist who rejects the propositional nature of beliefs; he argues that they are sentential rather than propositional. However, the author will use well-known and well-established concepts as the basis for his approach, and therefore he follows the mainstream and accepts the view of beliefs as propositional attitudes. The typical feature of propositions is their capability to have a truth value assigned. Carnap describes them as also having an “intension” and an “extension”, but this will not matter for the moment because it leads to numerous separate discussions. Computationally, propositions are Boolean variables. But this merely “digital” interpretation obviously does not fit the requirements of our discussion, because we are not interested in
interpreting our relations to beliefs as relations to truth values. Nonetheless, truth values play an important role for belief theory, because truth values are assigned to propositions in consideration of truth conditions. One could say that truth conditions are the pure and uninterpreted content of a belief, while the objects of beliefs are propositions. This seems rather artificial, and the mainstream position is to simply identify propositions with truth conditions. We will stay with the latter. The probability theorist knows an entity equivalent to propositions, which she calls an “event”. This word is strongly tied to the perspective of a special science and is philosophically not adequate; therefore we will use the notion “proposition” instead. Many formal concepts of probability theory are very useful for belief theory, and therefore we will continue to add short comments on the parallel concepts in probability theory and mathematics while introducing the formal concepts. The formal material presented in the remainder of this chapter is, as already pointed out, mostly compiled from chapters 2 and 4 of (Spohn, 2012). This is a reasonable strategy since, as will become clear later on, this conception directly fits the requirements of graphical models and the update algorithm. Since Spohn himself makes use of concepts that are quite common in the discussion, the author does not present this approach as specifically Spohnian, but the formal treatment will in fact be very close to Spohn’s position.
1.4.2 Propositions Form Algebras
Propositions are represented as sets of possibilities that are subsets of a given underlying set of possibilities W, namely the subset of all possibilities in which the proposition is true. In short, this will be formally fixed in definition 1.3. We will not give a precise explanation of the entity that is denoted by “possibility”. The reason is that there is de facto no way to describe it without hurting neutrality. The reader may see possibilities as a placeholder that can be interpreted within an application of the theory that is to be unfolded here. For instance, possibilities may denote possible worlds, if this is convenient in the context of the particular application. Possibilities are denoted by italic small letters from the end of the alphabet like u, v, w. Since W represents the set of all accessible possibilities, it will be called the “space of possibilities” to stress that no possibilities are considered that are not contained in W. We denote W optionally with an index for differentiation. Note that the space of possibilities W directly corresponds to what a probability theorist calls a “sample space” or “event space”. (It differs from what is called a “probability space”.) Propositions are denoted by slanted capital letters from the beginning of the alphabet like
A, B, C, . . ., optionally written with an index like Aᵢ. The empty proposition is denoted by ∅. Note that W is itself a proposition. It may not be immediately obvious why propositions are represented by sets of possibilities.
Example 1.1 Consider a lottery as an example. An arbitrary sequence of 6 numbers is selected out of a finite set of natural numbers greater than 0 and less than 50 (without putting a number back that has already been selected). In this example, W consists of all possible sequences of 6 numbers, containing only numbers greater than 0 and less than 50.
Let A be the proposition denoted by the sentence “50% of the lottery numbers of this week are even numbers”. This sentence describes a situation that is true in a large number of cases, precisely in all those cases where 3 of the 6 numbers in the selected sequence are even. Let proposition A denote the set of all those cases. Let now the actual sequence of 6 numbers for this week be, for example, ⟨2, 32, 48, 21, 17, 41⟩. This concrete vector represents a possibility that is clearly an element of A, and A is therefore true in this case. Of course, the occurrence of a single concrete fact can make other propositions become true or false, even though they are not connected to A. Consider proposition B and let it be represented by the sentence “50% of the lottery numbers of this week are prime numbers”. Given that the above numbers are selected, B is also true.
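To make the example concrete, the following Python sketch (an illustration added here, not part of the thesis) models the two propositions A and B as predicates on draws; extensionally, each predicate corresponds to the set of all possibilities (draws) satisfying it, and both come out true for the concrete draw above.

```python
def is_prime(n: int) -> bool:
    """Naive primality test, sufficient for lottery numbers below 50."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def proposition_A(draw) -> bool:
    """'50% of the lottery numbers of this week are even numbers.'"""
    return sum(1 for n in draw if n % 2 == 0) == len(draw) // 2

def proposition_B(draw) -> bool:
    """'50% of the lottery numbers of this week are prime numbers.'"""
    return sum(1 for n in draw if is_prime(n)) == len(draw) // 2

draw = (2, 32, 48, 21, 17, 41)
print(proposition_A(draw))  # True: 2, 32, 48 are even
print(proposition_B(draw))  # True: 2, 17, 41 are prime
```

Enumerating W itself (all sequences of 6 distinct numbers from 1 to 49) is infeasible here; the predicate view sidesteps this while preserving the extensional reading of propositions.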
For a set of sets S the symbol ⋃S denotes the union of all elements of S. The union may be infinite as well as uncountable if nothing additional is said and it is not made explicit by the context. An infinite union over explicitly countable sets is expressed by ⋃_{i=1}^{∞} Sᵢ. An explicitly finite union is expressed by ⋃_{i=1}^{n} Sᵢ for some n ∈ N \ {0}. If nothing additional is said about the index set I, it is assumed that I = N and n ∈ N, where the symbol “N” is used7 to denote the set {0, 1, 2, . . .}, not including ∞. The same convention is always used for the intersection ⋂S. The equivalence of sets (and a fortiori of propositions) is expressed by the operator “=”.
The cardinality of a set S is denoted by |S|. To ensure that only propositions are the objects of consideration, we define the algebra of propositions A over W and consider the elements of A. The statement that A is a proposition is hence equivalent to the statement A ∈ A.
Definition 1.2 (Propositional Algebra) Let 2^W be the power set of a space of possibilities W. A set A ⊆ 2^W such that8:
(1) W ∈ A,
(2) if A ∈ A, then Ā ∈ A,
(3) if A, B ∈ A, then A ∪ B ∈ A, is then called a propositional algebra over W.
Instead of speaking of a “propositional algebra over W” we will mostly use the short term “algebra” if the context is implicitly understandable.
Definition 1.3 (Proposition) For a given propositional algebra A over a set of possibilities W, a set A ∈ A is called a proposition.
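For finite families of sets, the axioms of definition 1.2 can be checked mechanically. The following Python sketch (an illustration, not part of the thesis; the helper name is chosen for this example) verifies the three conditions for a small algebra over W = {1, 2, 3, 4}.

```python
from itertools import combinations

def is_propositional_algebra(W: frozenset, family: set) -> bool:
    """Check definition 1.2: W is a member, and the family is closed
    under complement and under pairwise union."""
    if W not in family:
        return False
    if any(W - A not in family for A in family):   # closure under complement
        return False
    return all(A | B in family                     # closure under union
               for A, B in combinations(family, 2))

W = frozenset({1, 2, 3, 4})
A = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), W}
print(is_propositional_algebra(W, A))  # True

# Dropping {3, 4} breaks complement closure for {1, 2}:
print(is_propositional_algebra(W, {frozenset(), frozenset({1, 2}), W}))  # False
```

Note that ∅ need not be required separately: as the text below observes, it is forced by W ∈ A together with closure under complement.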
Propositions are closed under the logical operations of negation (“¬”), conjunction (“∧”) and disjunction (“∨”). Since propositions are represented as sets, the logical operations are implemented by the corresponding set operations of complement (written “Ā” for the complement of A), intersection (“∩”) and
7Symbol “N” denotes the set as defined above, in accordance with DIN 5473. This is nice to know, but we just define it this way since it seems practical. 8The symbol “⊆” denotes “subset of”; “⊂” means “proper subset of”.
union (“∪”). Furthermore, we use set difference (“\”) for ease of notation. Note that definition 1.2 preserves closure under all set operations9. Propositional algebras are closed under the operations of complement and union, meaning that any application of complement or union operations to propositions will always yield propositions as a result and never any entity that is not a proposition.

The empty proposition ∅ represents the contradiction ⊥. Since ∅ contains no actual possibilities, it represents a proposition that is never true: no change in the state of the world could ever make ∅ come true. This can easily be recognized since the logical contradiction has the form A ∧ ¬A = ⊥ and A ∩ Ā = ∅. The complement of the empty proposition over W is the proposition W, which accordingly represents the tautology ⊤. Consider A ∨ ¬A = ⊤ and A ∪ Ā = W. Note that it always holds that ∅ ∈ A, because definition 1.2 requires W ∈ A: the complement of W is ∅, and A is closed under complement by definition, so W and ∅ are elements of any propositional algebra.

Note that W may be infinite or uncountable. If W is infinite, 2^W is uncountable10. Since A is defined as a subset of 2^W, A may also be uncountable and, hence, uncountability may play a role in any application of our theory. Therefore, the formal foundation cannot exclude uncountability, neither on the purely formal level nor on the semantic level, because this would hurt neutrality. It follows that propositions are also not ensured to be finite or countable. Consider for example the belief that the weather forecast will express the possibility that it rains tomorrow as a real value between 0 and 1. If any value in the interval [0 ; 1] is possible and we take W as containing each real number in the interval as a concrete possibility, W and also A over W are infinite and uncountable.
Consequently, the proposition that the value will be in the interval [0 ; 1] is an uncountably infinite subset of W, because there are uncountably many possibilities w that make the proposition true when they become actual. It is therefore required to consider cases where A is infinite or uncountably infinite. We have to distinguish different kinds of algebras and thus have to distinguish between closure under finite, infinite and uncountable operations.
Definition 1.4 (Sigma-Algebra) A propositional algebra A over W such that for each countable subset S ⊆ A it holds that ⋃S ∈ A is called a σ-algebra.
Definition 1.5 (Complete Algebra) A propositional algebra A over W such that for each subset S ⊆ A it holds that ⋃S ∈ A is called a complete algebra.
Corollary 1.6 (Completeness of Finite Propositional Algebras) Each finite algebra is complete.
Proof: It is to be shown that ⋃S ∈ A for each subset S ⊆ A. From definition 1.2 it follows that for each pair of propositions A₁, A₂ ∈ A it holds that A₁ ∪ A₂ = ⋃_{i=1}^{2} Aᵢ ∈ A. Obviously, if for some
9Note that although definition 1.2 demands closure only under disjunction (union) and negation (complement), it can easily be seen that this entails closure under conjunction and also under difference. By application of de Morgan’s law, each conjunction of propositions can be expressed as the complement of the union of their complements. The difference A \ B = A ∩ B̄ can be expressed using complement and intersection. 10If a set S is infinite, the power set 2^S is uncountable. This can be shown with a generalization of Cantor’s second diagonal argument in (Cantor, 1890/91) that he used to prove the uncountability of the real numbers.
n ∈ N and Aᵢ ∈ A with 1 ≤ i ≤ n it holds that ⋃_{i=1}^{n−1} Aᵢ ∈ A, then it also holds by definition 1.2 that (⋃_{i=1}^{n−1} Aᵢ) ∪ Aₙ = ⋃_{i=1}^{n} Aᵢ ∈ A. Hence for every finite subset S := {A₁, A₂, . . . , Aₘ} ⊆ A and m ∈ N, it holds that ⋃_{i=1}^{m} Aᵢ = ⋃S ∈ A. Since A is finite, each subset S ⊆ A is finite. Hence, ⋃S ∈ A for every S ⊆ A.
This implies that the properties of being complete and being a σ-algebra are equivalent and in fact indistinguishable for finite algebras. Each finite algebra is hence a σ-algebra. An infinite algebra can be a σ-algebra without being complete. We will study an example of this case on page 54. The definition of a σ-algebra extends the definition of an algebra by ensuring closure under all countable operations, even infinite ones11. In order to cover uncountable operations as well, we introduced the concept of a complete algebra.
1.4.3 Atoms and Atomic Algebras
Note that it is not ensured that {w} ∈ A for each w ∈ W. This means it is not ensured that any single possibility w forms a corresponding proposition {w}.
Definition 1.7 (Atom) Let A be a propositional algebra over W. Let A ∈ A be a proposition such that A ≠ ∅. If there is no proposition B ∈ A such that ∅ ⊂ B ⊂ A, then A is called an atom of A.
Note that atoms in our sketch correspond to what the probability theorist calls “elementary events”. Remember the example of the lottery in the preceding section. An example of an atom is the proposition: “This week, the lottery numbers will be ⟨2, 32, 48, 21, 17, 41⟩.” An algebra that only contains propositions that can be expressed as unions of atoms is called “atomic”.
Definition 1.8 (Atomic Algebra) Let A be a propositional algebra. If for each proposition A ∈ A it holds that A is either an atom of A or equivalent to a union of some atoms of A, then A is atomic.
It is understood that singletons of possibilities are atoms if they are members of an algebra A. However, note that the converse need not hold: not all atoms in A need to be singletons. It is also possible that an algebra is entirely atomless. As an example, consider the set A of all finite unions of half-open intervals over R of the form [a, b) = {x : a ≤ x < b} with a, b, x ∈ R. Note that A is an algebra, but it is atomless, since for each nonempty element E₀ ∈ A there is some element E₁ ∈ A with ∅ ⊂ E₁ ⊂ E₀.
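For finite families, definitions 1.7 and 1.8 can likewise be checked mechanically. The following Python sketch (an illustration, not part of the thesis; function names are chosen for this example) computes the atoms of a family of sets and tests whether every element is a union of atoms; in the example the atoms are deliberately not singletons.

```python
def atoms(family):
    """Atoms per definition 1.7: nonempty elements A with no B in the
    family such that ∅ ⊂ B ⊂ A."""
    return [A for A in family if A
            and not any(B and B < A for B in family)]

def is_atomic(family):
    """Definition 1.8: every nonempty element must equal a union of atoms."""
    ats = atoms(family)
    for A in family:
        covered = frozenset().union(*[a for a in ats if a <= A])
        if A and covered != A:
            return False
    return True

W = frozenset({1, 2, 3, 4})
A = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), W}
print(sorted(map(sorted, atoms(A))))  # [[1, 2], [3, 4]] -- atoms, yet not singletons
print(is_atomic(A))                   # True
```

Here {1} and {2} are not members of A, so {1, 2} has no proper nonempty subset in A and is itself an atom, illustrating the remark above that atoms need not be singletons.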
Corollary 1.9 (Complete Algebras are Atomic) Each complete propositional algebra is atomic.
Proof: Let A be a complete propositional algebra over a space of possibilities W. For each w ∈ W there obviously exists a corresponding proposition I_w := ⋂{A ∈ A : w ∈ A} ∈ A. Clearly, I_w is an atom. Hence, the set atoms(A) := {I_w : w ∈ W} is a set of atoms of A. Each finite or infinite combination of propositions I_w, I_v ∈ atoms(A) by union or
11A note on the background: the concept of a σ-algebra is of importance in measure theory, especially in proba- bility theory. The problem addressed by the requirement of closure under all countable operations is that this requirement allows the investigation of infinite countable sets.
complement will clearly be a member of A. This is obvious from the definition of an algebra and the set operations. Additionally, each member of A can be represented as a combination of some elements of atoms(A). If there were any proposition B ∈ A that could not be represented as a combination of members of atoms(A), this would imply that B contains some possibility x for which I_x is empty. This is a contradiction.
Note that corollary 1.9 implies that each finite algebra is atomic.
1.4.4 Beliefs, Contents, and Concepts
Let us make a little digression and briefly discuss the concessions we made with the decision to consider propositions as the objects of belief. This decision is not free of problems, but the seemingly obvious alternatives are much more problematic. Considering beliefs as propositions, and therefore as sets of possibilities, has the following advantage: for each application, W has to be interpreted in order to bestow semantics on the application. On our conception of propositions, this necessary interpretation step is not restricted in any respect. Furthermore, the interpretation itself can contain a restriction of W if this is reasonable for a given application. The concrete nature of possibilities is also not determined. One may say that this is a disadvantage rather than a feature, because it introduces indeterminacy and vagueness into the model. But this rests on a misunderstanding of the level of abstraction of this consideration. The entities denoted by “possibility” are intentionally open to interpretation. Possibilities can be philosophically interpreted as any kind of possible worlds or as interpretations of a formal language, to give only two examples. It is not easy to find a notion that preserves this kind of neutrality. Spohn is more detailed in this respect, introducing a “doxastic possibility” as a formal concept in (Spohn, 2012, p. 29). This has obvious advantages for belief theory, but it is not necessary for our purposes here, so we will not pursue the discussion. Spohn argues in (Spohn, 2012, p. 23f) that the perspective constructed so far establishes at least two strong regulations that violate neutrality. The first is, as already stated in section 1.4, that we presuppose a uniform nature of all objects of belief, as they are all sets of possibilities. If someone does not agree with the idea of a uniform nature of the objects of belief, she may nonetheless assent to the argument that there may be different types (or perhaps “kinds”) of possibilities.
A possible consequence would be that the approach given here is restricted to describing only certain types of beliefs, generated by a certain interpretation of what a possibility is. The interaction of these different types of beliefs would then be, as Spohn points out, beyond the scope of the approach. The second irrevocable predefinition Spohn stresses is that beliefs are always propositional. He sketches three types of counterarguments. The first one may arise if propositions are considered as sets of possible worlds. This perspective identifies possibilities with possible worlds. In short, there is no argument why the framework should not allow for this. It just means renouncing some of the neutrality of the notion of possibility. This should be perfectly
acceptable when identifying possible worlds as an implementation of possibilities. Thus, there is no point in this argument, according to Spohn. The second type of argument against the propositional nature of belief is that propositions are of conceptual nature while belief may be non-conceptual. Perceptual belief seems to be an obvious counterexample, because perceptual belief does not seem to be conceptually structured. This argument has to be met by showing how perceptual beliefs can be understood in terms of sets of possibilities. Spohn answers that since possibilities are introduced as unstructured, it is not clear why they should not be appropriate for modeling perceptual belief in particular or non-conceptual belief in general. Spohn’s answer reverses the burden of proof. The third type of argument just reverses the second: it insists that conceptual belief seems to be at least a kind of standard or normal case of belief. Furthermore, one may argue that the “conceptual nature” of belief is not reflected in considering beliefs as sets of possibilities. Spohn explains the argument as follows:
“If we conceive of objects of belief as sets of possibilities, then we really conceive of them as pure contents. A pure content is nothing but a truth condition: a set of possibilities is true if and only if the one and only actual possibility is a member of it. The problem now is that such contents are not given directly to us. Having a belief is somehow having a mental representation in the belief mode (. . . ), which will usually be a conceptual representation or, if that is too unclear, a linguistic representation; this is, finally, something quite indeterminate. The belief is then bestowed with content only because this representation is somehow related to the content or because the sentence representing the belief has a truth condition. But then it seems that it is rather the representation that should be the object of belief and not the content; different representations are different beliefs, even if they have the same content. As Quine has (. . . ) insisted (. . . ), belief is, (. . . ) not a propositional, but rather a sentential attitude.” (Spohn, 2012, p. 25)
“(. . . ) The problem is that contents seem accessible only through representations, that in principle infinitely many representations represent the same content, and that the issue of whether two representations are equivalent (i.e., have the same content) can be computationally arbitrarily complex and is indeed undecidable above a certain level of complexity. Nobody knows or can be expected to know about all these equivalences. Restricting the discussion to logical equivalences, this is called the problem of logical omniscience.” (Spohn, 2012, p. 25)
While we consider propositions equivalent to contents or truth conditions, the third argument seems to require us to take sentences as the objects of belief – but sentences are concrete representations of propositions. Spohn provides only the short explanation cited above. Although we will not engage more deeply in the discussion about the philosophy of semantics, it may be reasonable to complement his considerations by elaborating briefly on this problem. To understand how the “conceptual nature” of beliefs should be reflected, one should develop an understanding of what concepts are.
The arguments of (Kripke, 1972) and (Putnam, 1975) showed that concepts are not to be identified with the meanings of predicates. Putnam demonstrated that the “mental states” of knowing the meaning of an expression in the heads of different speakers do not reflect unknown differences in the meaning of these expressions. This argument directly addresses the core of meaning internalism. The counterexample is the well-known twin-earth example, in which some individuals are in the same mental state about the meaning of the word “water” although the extension of this word differs in their idiolects. The defender of internalism seemed unable to say which difference on the mental level of the two individuals reflects the extensional difference of their concepts of “water”. One prominent line of defense argues that it is not the mental states of the two individuals that have to reflect those differences; instead, the content of the two individuals’ beliefs must differ. A well-known source for this argument is (Fodor, 1987). In section 1.4.1 we already stated that the content of a belief is a truth condition. Truth conditions are the logical counterpart of entities called “concepts”, introduced by Frege, who called them “Begriffe”, and it is often argued that the contents of propositions are of a conceptual nature and somehow composed from concepts. But we also know that contents are not given to us directly. They are mediated to us in a way that is not fully clear, and it seems that having a belief means having established a mental representation of a content that is kept in the “belief mode”, as cited from Spohn above (and not, for example, in the wish mode). The way the representation is related to the content is interpreted in different ways by different philosophical positions; however, it seems to follow immediately that if this sketch is accepted, the representation should be considered the object of the belief, not the content itself.
The belief only has content because the representation is somehow related to the content, and therefore contents are only accessible through their representations. But it also follows that the same content is present through different representations in different subjects, and it is plausible to proceed from the assumption that each subject instantiates her own representation of a given content. Therefore, there are as many representations as there are subjects, and the number of representations of a given content can in principle be infinite. What, then, does it mean that two subjects have the same belief? One has to find out whether representations of the same content are present in both. How should this be done? It can be, as Spohn says, “arbitrarily complex” to verify whether two representations are representations of the same content. Above a certain level of complexity this is practically beyond any possibility of solution, and it follows that it may be computationally impossible to check whether two subjects have equivalent beliefs. This reveals the silent assumption we have made so far: we assume that different representations of the same content, i.e., of the same truth condition, are in fact recognized, or at least can in principle be recognized, by the subject as equivalent. Otherwise, it would be possible for a subject to believe a certain truth condition when it is given by one representation while she rejects the same truth condition when it is given by another representation. This assumption is crucial for justifying propositions as objects of belief. But it is not beyond dispute: it is easy to imagine situations where subjects are confronted with different representations of the same truth condition and do not recognize them as equivalent. One
may think of two sentences of equivalent meaning in different languages. Examples describing such scenarios can be found in (Loar, 1988), where it is argued that the content of a belief is not fixed by propositions at all. Loar argues that a person, Pierre, a native speaker of French, may have the belief “Londres est jolie” without ever having been in the city he knows as “Londres”. It happens that he visits London, but without recognizing that this is the particular city he thinks of as “Londres”. He may find that London is loud, very crowded and full of traffic and may think “London isn’t beautiful”. We can then argue that the truth condition, that London is beautiful, is given to this subject in two different representations, and that the subject accepts it as true in one representation and rejects it in another. An even more intuitive case is that of a subject who thinks that the capital of Germany is a beautiful city, but finds that Berlin is a really ugly place, dirty, awfully loud and overcrowded with weird people. This is a subjectively consistent scenario, assuming that this subject does not know that Berlin is the capital of Germany. This is an instance of a class of well-known examples discussed in early analytical philosophy of language. These cases were first discussed by Frege when he analyzed the conditions of informative identity statements.12 Nevertheless, there is one problem in this argument: we can argue that no rational subject would intentionally believe a truth condition in one representation and reject it in another. Once Pierre knows that “Londres” and “London” refer to the same city, he will adjust his beliefs such that they are consistent.
Pierre intentionally used the names “Londres” and “London” to refer to different cities, but this was only possible because of a specific deviation of his particular idiolect that misses the consensus of his linguistic community. Once he recognizes that he has used the words in a different way than the linguistic community he lives in, he will adjust his use of language. Furthermore, Pierre is a rationally thinking individual, and once he recognizes that his state of belief is logically inconsistent, he will immediately correct it. Examples like the ones of Frege and Loar can be constructed systematically by ignoring intentionality and adding some misunderstanding or ignorance on the part of the subject about the meaning of some language expression.13 The example illustrates that it is completely reasonable to presuppose that the acceptance of some belief does not depend on its representation, under conditions where encyclopedias and thesauri are in principle available to the subject. Of course, reduced knowledge or reduced competence in the idiolect of the given linguistic community may lead to inconsistencies, but this means neither that the subjects are not rational nor that beliefs are better considered to be mental representations than truth conditions. However, we do not intend to engage in a deeper discussion about contents, concepts and meaning. The short digression of this section should only show that we are well advised to concentrate on propositions as objects of belief, and that this decision was not made arbitrarily
12Remember the prominent Hesperus/Phosphorus example: not knowing that Hesperus is Phosphorus, I can aggregate totally different beliefs about each of them. Frege took this as a hint that a proper name has a semantic dimension beyond its mere object of reference, which he called its “Sinn”. 13Similar well-known examples were constructed by Russell as well as by Tyler Burge and Stephen Stich, to name only some. These examples gave rise to broad discussion threads, but not all of them had real argumentative impact.
and has important and vital consequences for our theorizing. Spohn avoids the discussion referenced above and answers the third counterargument with three quite pragmatic arguments for considering the truth conditions modeled by sets of possibilities as the objects of belief:
“One reason is that we are in good company; in that respect, I think, the majority of epistemologists are still with us. Of course, the argument from authority is dubious, but it is reassuring not to feel like an outsider. My main reason is that our stance is by far the more fruitful one. (...) In fact, most theories apparently operating on a sentential level assume the substitutability of logical equivalences and differ only superficially from our procedure. Without this substitutability we are left with hardly any theory at all. (...) In my view this [epistemological theorizing, remark by author] entails that the objects of belief should be conceived as contents or truth conditions or sets of possibilities and not as sentences or representations of contents.” (Spohn, 2012, p. 26)
Spohn’s second argument simply says: if we want to theorize, that is, to provide a structural analysis of the topic, it is not reasonable to conceive of the topic as having a nature that makes structural analysis nearly impossible. It is therefore rational to choose the perspective that suits the needs of the inquiry. (Any engineer would commit to this point of view perfectly and without hesitation.) His third argument in effect states that normative epistemology is not possible on a radically subjective level. This points us to a separate discussion we will not dive into; however, it should be clear that this argument is firmly grounded in a genuine philosophical perspective. We will close the discussion about the objects of belief and state that, all discussions aside, it is at least from a pragmatic point of view a helpful move to consider propositions, in their form as sets of possibilities, as the objects of belief.
1.5 Epistemic States and Rationality
1.5.1 Rationality Postulates
We proceed with some arguments that are strongly oriented towards Spohn’s argumentation in (Spohn, 2012, chapter 4, section 4.1). Our aim is to introduce the same rationality postulates he uses. In former sections we used the notion of an epistemic or doxastic state (both notions are used as synonyms throughout this thesis). Having developed an account of the objects of belief, it is straightforward to define epistemic states as characterized by the entirety of the subject’s beliefs. We therefore consider a set of beliefs B ⊆ A and demand that, if the subject accepts all and exactly those beliefs in B, then B characterizes the epistemic state the subject is in (at a particular time t). This definition does not explain any aspects of the belief relation itself, which is treated simply as a black box. But it permits us to speak about doxastic states in terms of belief sets,
which will be sufficient for the moment. We will later see that this definition is only sufficient for a static description of belief states, and we will also develop a more precise notion of a belief set. At this point, the normative aspect comes into consideration: normative theorizing about belief is only possible when criteria are defined that make it possible to verify which belief is “acceptable” and which is not. This criterion, precisely, is rationality. Are all belief sets B equally rational, or can additional criteria for rational belief sets be stated? In this inquiry, we will use the two rules14 for rational belief that Hintikka brought up in his (1962).
Postulate 1.10 (Consistency) Belief sets characterizing rational doxastic states are consistent.
Informally this means a rational subject is required to know that an actual possibility cannot be contained both in a proposition and in its complement. Stated another way, the subject is required to consistently apply the belief that sentences of the form A ∧ ¬A are always false. In particular, this entails that the empty proposition denotes contradiction, because it cannot contain any actual possibility. The reason for requiring consistency is easy to understand: ex contradictione sequitur quodlibet. Arbitrary inferences may be drawn from a logically (not only contingently) false proposition as well as from two contradictory propositions. Giving up consistency would be equivalent to giving up inference, and this is maximally unattractive for theorizing.
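The set-theoretic reading of the consistency postulate can be replayed in a few lines of Python. This is an illustration of our own, not code from the thesis; the names W and A are made up.

```python
# Propositions are modeled as sets of possibilities over a space W.
# No possibility w can lie both in a proposition A and in its complement.
W = frozenset({"w1", "w2", "w3", "w4"})
A = frozenset({"w1", "w2"})
complement_A = W - A                        # set-theoretic negation of A

# A ∧ ¬A corresponds to the intersection of A with its complement,
# which is the empty (contradictory) proposition:
assert A & complement_A == frozenset()
assert all(not (w in A and w in complement_A) for w in W)
```

The empty intersection is exactly why the empty set is the canonical representative of contradiction in this framework.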
Postulate 1.11 (Deductive Closure) Belief sets characterizing rational doxastic states are deduc- tively closed.
According to postulate 1.11, a rational subject has to know two fundamental aspects. First, if the actual possibility is in each of two propositions, it is also contained in their conjunction. This is obviously a direct consequence of the intersection operation, which implements conjunction for sets. The second aspect is that if the actual possibility is contained in some proposition, it will also be contained in any superset of this proposition. Consider the throw of a die. I believe that the next throw in a sequence of throws will surely result in “3”. Let proposition A be the set consisting of only the actual possibility that “3” is the result. Let proposition B be the set consisting of all possibilities in which an odd number is the result. Clearly, it holds that A ⊆ B. If I believe A, there is no way to reject B at the same time. Put in a more general way: while accepting a highly specific belief, the subject cannot reject a more general version of this belief while maintaining rationality. Note that the converse does not hold: I can rationally expect that the next throw yields an odd number without expecting that it will be 3. Therefore, I may accept a general belief without actively believing some particular case included in it. Together, both aspects of deductive closure result in the requirement that the subject must know that A1, ..., An are all true if and only if A1 ∩ ... ∩ An is true. This is formally explained in section 1.5.2 and will also be proven there.
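The die example can be sketched in Python (illustrative only; the variable names and the third proposition C are our additions):

```python
# Possibilities for one throw of a die, and propositions as sets of them.
W = frozenset(range(1, 7))          # sample space {1, ..., 6}
A = frozenset({3})                  # "the result is 3"
B = frozenset({1, 3, 5})            # "the result is odd"

# A is the more specific proposition: every possibility in A is in B,
# so believing A rationally commits the subject to B as well.
assert A <= B

# First aspect of closure: believing two propositions commits the
# subject to their conjunction, i.e. the intersection of the sets.
C = frozenset({2, 3, 5})            # "the result is prime" (illustration)
assert (B & C) == frozenset({3, 5})

# Second aspect: every superset of a believed proposition is believed;
# the converse does not hold (believing B does not force belief in A).
supersets_of_A = [S for S in (A, B, W) if A <= S]
```

The subset test `A <= B` is precisely the set-theoretic rendering of “A entails B” used throughout this section.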
14This idea stems from Spohn, confer (Spohn, 2012, p. 48).
This leads to the intuitive understanding of deductive closure: all that can be deduced from a set of beliefs is already logically contained in it. Or, to express it the other way around, from a rational belief set nothing can be deduced that has to be rejected as false while keeping the original belief set. Obviously this seems to require the subject to hold an infinity of beliefs. This does not lead to substantial problems because it is a question of definition. A subject can, for instance, have an infinity of dispositional beliefs, and not all beliefs must be represented in the mind of the subject. Postulate 1.11 seems to require the subject to perform unending inference processes in some cases – precisely when the underlying algebra A is infinite. But this is a spurious argument, because rationality does not require that a subject really produces each inferred belief by an explicit act of inference and then accepts this belief in an intentional and conscious act. Deductive closure only requires that the belief base have a structure such that inference produces informative results that are not arbitrary. Both assumptions introduce a more serious problem: the problem of undecidability. Logical consistency as well as logical consequence is computationally undecidable (unless the underlying logic is restricted to a decidable system, say, monadic predicate logic, but this would be a rather artificial constraint). Belief relates a subject to a proposition, not to a sentence. As Spohn emphasizes in (Spohn, 2012, p. 48f), taking undecidability as an argument against the above rationality assumptions applies computational aspects of sentences to propositions, presupposing that those rules will also hold for propositions. But this is in no way granted, because we took only deduction for granted when we decided to consider propositions as objects of belief.
This is a strong consequence, because the nature of propositions completely depends on the nature of possibilities, and we do not want to restrict their interpretation to only those notions that avoid undecidability. This would make the entire theory quite trivial. One may answer that these assumptions clearly express what we called an idealization in the beginning of this chapter – and that the occurrence of such deviations from intuition is one of the consequences of normative theory construction. But there is no point in inquiring into idealized settings when one cannot say exactly where the idealization lies. Nevertheless, besides the problem of undecidability, it is obvious that no existing human mind fits these requirements. Our mind does not function like a database system, to whose engine all stored information is equally “present” so that it can recognize contradictions or avoid constraint violations easily and immediately. Nor does our mind work like a learning algorithm, processing all information in the same way, with the same correctness and the same attention. Our beliefs are always biased by different states of attention, by having different beliefs “present” from which to draw inferences, and by emotions like wishes and fears that form attitudes towards beliefs, attitudes which usually influence the inference process. The point is that the idealization does not come into play only when undecidability is introduced; it is present even without it. There is consensus in the discussion that consistency and deductive closure at most cover a minimal basis for rationality. They have provoked relevant counterarguments; on the other hand, they are the only basis from which theorizing can really be developed. Many
philosophers felt that some strengthening is necessary to progress from a minimal definition of rationality to a more complete basis for further theorizing. But what this strengthening should look like is not clear, and it is not our subject here. Since sufficient criteria have not yet been established in the discussion, we will consider these two well-defined requirements as necessary conditions for rationality in the remainder of this inquiry.
1.5.2 Rational Belief Sets
Both rationality postulates can be developed into a formal definition of rational belief sets or, in short, belief sets. A belief set, then, is a set of propositions B ⊆ A that meets the requirements of rationality. Consistency means that a rational system must not believe a contradictory proposition. This implies disbelief in the empty set to at least some degree. To preserve consistency, a rational belief set must never contain the empty set. Deductive closure of beliefs means that each proposition inferred from positively believed propositions has to be believed itself. The reverse is also part of the requirement: if some belief can be inferred from a belief set, all its premises have to be believed as well.
Definition 1.12 (Belief Set) For any propositional algebra A, a set B ⊆ A such that for all propositions A, B ∈ A:
1) ∅ ∉ B
2) if A, B ∈ B, then A ∩ B ∈ B
3) if A ∈ B and A ⊆ B, then B ∈ B is called a rational belief set (in A) or belief set for short.
Requirement 1 implements consistency. The subject believes all propositions in B, and thus obviously also believes their conjunction, i.e., the intersection over B; if the empty set were contained in B, this intersection would be empty and hence contradictory, so ∅ cannot be contained in B. Requirements 2 and 3 implement deductive closure as explained informally above. Requirement 2 states that if a proposition A is in B and a proposition B is in B, then A ∩ B must also be in B. Requirement 3 states that if a proposition A is in B, then any superset of A is also contained in B. In sum, 2 and 3 result in:
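On a finite algebra, the three requirements of definition 1.12 can be checked mechanically. The following is a sketch of our own (the helper names `is_belief_set` and `powerset` are hypothetical, not from the thesis); it confirms that the family of all supersets of a fixed non-empty proposition satisfies all three conditions:

```python
from itertools import combinations

def powerset(w):
    """All subsets of the finite possibility space w, as frozensets."""
    s = list(w)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def is_belief_set(beliefs, algebra):
    """Check the three conditions of Definition 1.12 on a finite algebra."""
    beliefs = set(beliefs)
    if frozenset() in beliefs:                   # 1) consistency: no empty set
        return False
    for a in beliefs:
        for b in beliefs:
            if (a & b) not in beliefs:           # 2) closed under intersection
                return False
    for a in beliefs:
        for s in algebra:
            if a <= s and s not in beliefs:      # 3) closed under supersets
                return False
    return True

W = frozenset({1, 2, 3, 4})
algebra = powerset(W)                            # the full power-set algebra
core = frozenset({1, 2})
B = [S for S in algebra if core <= S]            # all supersets of the core
assert is_belief_set(B, algebra)
```

This family of sets is, in filter terminology, the principal filter generated by `core`, which anticipates the remark below that belief sets correspond to the mathematical concept of a filter.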
Corollary 1.13 (Deductive Closure of Belief Sets) For any algebra of propositions A, any rational belief set B ⊆ A, and propositions A, B ∈ A it holds that
A ∩ B ∈ B ⇐⇒ A ∈ B ∧ B ∈ B. (1.1)
Proof: The first direction, A ∈ B ∧ B ∈ B =⇒ A ∩ B ∈ B, is already expressed by requirement 2 of definition 1.12.
It remains to prove the other direction: A ∩ B ∈ B =⇒ A ∈ B ∧ B ∈ B. Let therefore A ∩ B ∈ B. It is obvious that A ∩ B ⊆ A, hence by requirement 3 it holds that A ∈ B. Analogously, with A ∩ B ⊆ B, it holds that B ∈ B. Together this yields A ∩ B ∈ B =⇒ A ∈ B ∧ B ∈ B.
Note that the definition of belief sets corresponds to the mathematical concept of a “filter”. The equivalence stated in corollary 1.13 also explains the informal remarks on deductive closure in section 1.5.1 on page 38. In section 1.4.2 it was argued that the case of infinite algebras must also always be considered. Therefore, we extend the implementation of deductive closure to the infinite case. This is done by adding the definition of a complete belief set corresponding to complete algebras.
Definition 1.14 (Complete Belief Set) If A is a complete algebra, then a rational belief set B ⊆ A such that for any B′ ⊆ B it holds that ⋂B′ ∈ B is called a complete rational belief set or complete belief set for short.
Note that if A is finite, any belief set in A is complete. This seems to introduce a new case of infinity, since we postulate deductive closure of belief sets under infinitely many consequences. But remember the argument that the rejection of infinity mainly addresses sentences and need not be applicable to propositions.
1.5.3 Belief Cores
Note that ⋂B as used in definition 1.14 is a single proposition. It represents the intersection of all propositions believed in the doxastic state denoted by B. It is therefore called the “core” of B and is denoted by core B.
Definition 1.15 (Core of a Belief Set) Let B be a complete belief set. Then the proposition ⋂B, denoted by core B, is called the core of B.
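For a finite belief set, core B can be computed directly as the intersection of all believed propositions. The following sketch uses made-up data (only a few members of the belief set are enumerated for brevity, which suffices here because the core is fixed by the smallest member):

```python
from functools import reduce

# A few propositions from a hypothetical belief set over W = {1, ..., 6}:
# all listed propositions contain the generating proposition {3, 5}.
W = frozenset(range(1, 7))
B = [frozenset(s) for s in ([3, 5], [1, 3, 5], [2, 3, 5], [3, 4, 5])] + [W]

# core B = intersection of all believed propositions
core = reduce(frozenset.intersection, B)
assert core == frozenset({3, 5})

# The core is non-empty, hence non-contradictory (cf. Corollary 1.16).
assert core
```

Since the core is itself a believed proposition in a complete belief set, and no belief set contains ∅, the non-emptiness check mirrors the consistency claim of the surrounding text.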
Note that the core of a complete belief set is always non-empty and thus non-contradictory.
Corollary 1.16 (Consistency of Core) For any complete belief set B it holds that