arXiv:1701.02658v1 [cs.IT] 10 Jan 2017

Algebras of Information
A New and Extended Axiomatic Foundation

Prof. Dr. Jürg Kohlas
Dept. of Informatics DIUF
University of Fribourg
CH – 1700 Fribourg (Switzerland)
E-mail: [email protected]
http://diuf.unifr.ch/tcs

Version: July 9, 2018

Contents

1 Introduction

I Labeled Algebras

2 Conditional Independence
   2.1 Quasi-Separoids
   2.2 Arithmetic of Partitions
   2.3 Families of Compatible Frames

3 Labeled Information Algebras
   3.1 Axiomatics
   3.2 Valuation Algebras
   3.3 Information Algebras

4 Local Computation
   4.1 Markov Trees
   4.2 Computing in Markov Trees
   4.3 Computation in Hypertrees

II Domain-Free Algebras

5 Domain-Free Information Algebras
   5.1 Unlabeling of Information
   5.2 Domain-Free Axiomatics
   5.3 Duality

6 Order of Information
   6.1 The Idempotent Case
   6.2 Regular Algebras
   6.3 Separative Algebras


7 Proper Information
   7.1 Ideal Completion
   7.2 Compact Algebras
   7.3 Duality for Compact Algebras
   7.4 Continuous Algebras
   7.5 Atomic Algebras

III Constructing New Algebras

8 Information Maps
   8.1 Continuous Maps
   8.2 Cartesian Closed Categories

9 Random Maps
   9.1 Simple Random Variables
   9.2 Random Mappings
   9.3 Random Variables

10 Allocations of Probability
   10.1 Algebra of Allocations of Probability
   10.2 Random Mappings and Allocations

11 Support Functions
   11.1 Characterisation
   11.2 Generating Support Functions
   11.3 Canonical Random Mappings
   11.4 Minimal Extensions
   11.5 The Boolean Case

References

Chapter 1

Introduction

The basic idea behind information algebras (Kohlas, 2003a; Kohlas & Schmid, 2014) is that information comes in pieces, each referring to a certain question, that these pieces can be combined or aggregated, and that the part relating to a given question can be extracted. This algebraic structure can be given different forms. Questions are often represented by a lattice of domains, and a popular model is based on the subset lattice of a set of variables. Pieces of information are then represented by valuations associated with these domains. This leads to an algebraic structure called valuation algebras (Kohlas, 2003a). The axiomatics of this structure was in essence proposed by (Shenoy & Shafer, 1990a). Valuation algebras already have many important applications related to constraint systems, relational databases, different uncertainty formalisms like probability, belief functions, fuzzy sets and possibility measures, and many more; we refer to (Pouly & Kohlas, 2011). An important particular case of valuation algebras, from both practical and theoretical points of view, are idempotent valuation algebras, also called proper information algebras: the combination of a piece of information with itself or a part of itself gives nothing new. This makes it possible to introduce an order between pieces of information reflecting information content. It also relates proper information algebras to the structures studied in (Kohlas, 2003a; Kohlas & Schmid, 2014). The basic view of information as pieces which can be combined, which relate to questions and from which the part relating to given questions can be extracted, leads to two different but essentially equivalent algebraic structures, labeled and domain-free valuation algebras (Kohlas, 2003a; Kohlas & Schmid, 2014). The original proposal of an axiomatics in (Shenoy & Shafer, 1990a) was in labeled form; later (Shafer, 1991) proposed the domain-free form. However, for valuation algebras, the two forms are not fully equivalent: there are labeled forms which have no domain-free form and vice versa. An important contribution of this paper is to give a new axiomatic system in which there exists a full duality between these two forms.


The representation of questions by a lattice of domains or even subsets of variables is unnecessarily restrictive and excludes important applications in Computer Science. Already early work on belief functions (Shafer, 1976) considered a reference structure for belief functions called a family of compatible frames. This is not covered by valuation algebras. In this paper a much more general abstract framework for representing questions is proposed and, based on it, a new system of axioms for information algebras, covering the previous forms of valuation algebras and proper information algebras as special cases. Originally, the theory of valuation algebras in (Shenoy & Shafer, 1990a) was motivated by the desire to generalise the local computation scheme proposed for probabilities in (Lauritzen & Spiegelhalter, 1988) to other formalisms of uncertainty, especially belief functions. This goal is also maintained for the new algebraic structures presented here. We claim, however, that these algebraic structures moreover represent essential features of information in general, beyond particular uncertainty calculi. In probability theory, conditional independence structures between variables are essential for efficient local computation. It has long been known that structures of conditional independence can be generalised beyond probability (Studeny, 1993; Shenoy, 1994b; Studeny, 1995). In fact, we claim that conditional independence is a basic issue for information and information algebras in general. In (Dawid, 2001) a fundamental mathematical structure called a separoid is abstracted, underlying all the concepts of conditional independence and their applications. It is shown in this paper that an even weaker concept (called here a quasi-separoid) is sufficient to allow for local computation schemes in the context of information algebras in appropriate conditional independence structures.

The basis of the theory of information algebras as developed here is the relation of conditional independence among questions, or among the domains representing them. In Chap. 2 it is argued that questions should be partially ordered according to their granularity, that is, the acuteness or coarseness of their possible answers. In fact, this partial order is required in the present theory to form a join-semilattice. The join of two questions is the coarsest among all questions finer than both original questions; the join thus represents the combined question of the two original ones. In addition, a three-place relation among questions is required which describes the conditional independence of two questions, given a third one. This relation is required to satisfy four conditions, which are natural requirements for a concept of conditional independence. In fact a separoid, the usual concept for modelling conditional independence and irrelevance, satisfies (among others) these conditions. Therefore, a join-semilattice together with a three-place relation satisfying these conditions is called a quasi-separoid (or q-separoid). An important source of q-separoids are join-semilattices of partitions of some universe, and they form useful models of systems of questions. Somewhat more general than partitions are families of compatible frames (f.c.f). This notion has been introduced in (Shafer, 1976). Here a slightly modified version of this concept is proposed and it is shown how q-separoids arise from f.c.f.
Both q-separoids of partitions and of f.c.f generalise the most often used multivariate model, where questions are represented by families of variables and their domains, as in Bayesian networks, belief functions, etc. In this last case, q-separoids become separoids, and this links our general theory to the more classical approach to valuation and information algebras.

Q-separoids model questions. In Chap. 3, pieces of information are added, each piece referring to an element of the q-separoid, that is, to a determined question. But information can be transported or extracted relative to other questions, and also pieces of information can be combined or aggregated. The corresponding operations are introduced and the required properties of them are stated as axioms. In particular, the operations of transport and combination are related to conditional independence. This then determines a labeled information algebra. For certain particular q-separoids, the axioms can be transformed into those of classical valuation algebras (Shenoy & Shafer, 1990a; Kohlas, 2003a). The latter appear in this way as particular cases of the general information algebras treated in this text. In relation to partition and f.c.f q-separoids, pieces of information may be represented by subsets of the universe or of frames. These set information algebras are important models of information algebras.

A general problem of information processing can be formulated in the framework of information algebras as combining a number of pieces of information and then extracting from the combination the part corresponding to one or several given questions. Formulated in this way, this may well be computationally infeasible. For probabilistic networks, (Lauritzen & Spiegelhalter, 1988) proposed a computational scheme which avoids this problem by organising the computations in such a way that combination and extraction can always be done on the small domains of the pieces of information involved in the combination. This is called local computation. In (Shenoy & Shafer, 1990a) it was shown that local computation can be applied much more generally to valuation algebras. Here we demonstrate that it can be used in the even more general framework of information algebras. The underlying structures of domains used in local computation are called in the literature join or junction trees, hypertrees or Markov trees. These are concepts which are defined relative to multivariate models. They determine certain structures of conditional independence. These structures exist also with respect to the much more general concept of q-separoids. That implies that local computation can be executed in Markov trees as defined relative to q-separoids. However, whereas the concepts of join trees, hypertrees and Markov trees are equivalent in the particular case of multivariate models, in the sense that one notion may be transformed into another one and vice versa, this is no longer true in the case of general q-separoids. It turns out that the basic conditional independence structure for local computation in information algebras is the one of Markov trees. As labeled algebras, these are based on q-separoids.

Labeled information algebras are convenient for computations. But there is an alternative form of information algebras, namely domain-free information algebras. They are introduced in Part II. These algebras are better adapted to theoretical, algebraic considerations.
Domain-free information algebras are derived from labeled ones by unlabeling (Chap. 5). As in a labeled algebra, there is the operation of combination or aggregation of information. The operation of transport of a piece of information from its domain to another domain becomes an operation of information extraction, by which the part of a piece of information relevant to a given domain is extracted. Conversely, labeled information algebras may be obtained from domain-free ones. This establishes a full duality between these two forms; they are the two sides of the same coin.

Some pieces of information may be more informative than others. This is reflected by an order between the elements of an information algebra: a piece of information is more informative than a second one if it is obtained from the latter by combination with a third one. In the important case of idempotent information algebras, this establishes a natural partial order in the information algebra which respects combination and extraction: the combined information of two or more pieces of information is more informative than each of its parts. Also, by extraction, information can only be lost. This partial order of information in idempotent information algebras is very important and is studied and exploited in later parts of the text. However, even in the general, non-idempotent case, the relation is of interest and importance. It then determines no longer a partial order, but only a preorder in the information algebra. This preorder, however, is in general no longer natural; in particular, extraction does not respect the order in all cases. Such preorders are also studied in semigroup theory, where regular semigroups are of particular interest. This notion can be extended to valuation algebras (Kohlas, 2003a) and even to information algebras as understood here. And it turns out that in regular information algebras the preorder becomes natural. Even more general are separative information algebras, and in this framework the preorder is still natural. These questions of information order are discussed in Chap. 6.

For the rest of Part II and also Part III only idempotent information algebras are considered. As mentioned above, they can be considered as “proper” information algebras: the combination of a piece of information with a part of it does not give new information. Mathematically speaking, the partial information order mentioned above adds an essential new element, which is exploited in Chap. 7. In the partial information order of proper information algebras, combination generates the supremum of the pieces of information combined. In this way the pieces of information determine a join-semilattice. Consistent collections of pieces of information then form an ideal in this semilattice. These ideals form themselves an information algebra, extending and completing the original one to a complete lattice. Furthermore, in computation only finite pieces of information can be processed, whereas general pieces of information may be approximated by finite ones. This idea can be expressed by the well-known order-theoretic notion of finite or compact elements, and this idea is also the motivation of domain theory as a theory of computation (Gierz, 2003). In this respect, proper information algebras become particular instances of domains. As in that theory, algebraic and continuous information algebras can be defined and considered.
The essential points which distinguish the theory of proper information algebras from domain theory are the presence of the extraction operators and the fact that approximation of information takes place not only globally in the algebra but locally on each domain of the underlying q-separoid. The discussion refers to domain-free information algebras, but the extension of duality covering labeled algebras is also addressed.

Most constructions of universal algebra to obtain new algebras from existing ones apply to information algebras. But in Part III we restrict ourselves to constructions which make sense from the point of view of information. Again, in this part we limit ourselves to idempotent information algebras. So, in Chap. 8 we consider information maps, that is, maps which take elements from a first information algebra as input and map them to elements of a second information algebra as output. Such maps should be monotone: more information as input should result in more information as output. Such maps represent themselves information and, in fact, they form an information algebra. If the input comes from a continuous algebra, the information maps should be continuous, that is, respect convergence in some sense. It turns out that continuous maps form a continuous information algebra. Both general idempotent information algebras together with monotone maps, and algebraic or continuous information algebras together with continuous maps, determine Cartesian closed categories.

Information in practice is often uncertain. Whether a piece of information is valid may depend on whether some assumptions are satisfied or not. This can be modelled by mappings from a space of assumptions to a proper information algebra. Some assumptions may be more likely to hold, more probable. So, it makes sense to assume the space of assumptions to be a probability space (Chap. 9). In this view, Chap. 9 becomes an abstract theory of probabilistic argumentation, generalising (Haenni et al., 2000; Kohlas, 2003b). It is also a generalisation of the theory of hints (Kohlas & Monney, 1995) and its application to statistical inference (Kohlas & Monney, 2007). Such random mappings again form information algebras, and according to the various restrictions imposed on these maps, different algebras may be obtained.

Random maps may be used to evaluate hypotheses according to their likelihood. For this purpose, the probability of the assumptions supporting a given hypothesis may be considered. At first, this poses some problems of measurability. However, following (Shafer, 1973; Shafer, 1979), from the probability space a probability algebra may be obtained and a so-called allocation of probability may be derived. This makes it possible to associate a numerical degree of support with any hypothesis which can be formulated within the information algebra or its ideal completion. Our interpretation of this notion by probabilistic argumentation differs from Shafer's epistemological interpretation as partial belief; the mathematics, however, is the same. What is new is that idempotent information algebras are shown to provide the natural, general framework for such a theory, much more general than the classical set-based theory. In fact, allocations of probability may be defined independently of random mappings. And since they again represent information, they can be given the structure of idempotent information algebras (Chap. 10).
If the allocations are derived from random mappings, the algebra of allocations is homomorphic to the one of random mappings, in some cases even isomorphic. However, in the most general case, a random mapping carries more information than its associated allocation of probability.

The question arises whether any allocation of probability in an information algebra may be derived from a random mapping. This question is addressed in Chap. 11 in the context of support functions. This is an extension of the studies in (Shafer, 1973; Kohlas & Monney, 1994). Again it is shown that the natural mathematical context for this study is that of information algebras, and the question is answered in the affirmative. These last three chapters are in fact an extension of what is usually called the Dempster-Shafer theory of evidence, which is usually discussed with respect to fields of sets instead of information algebras.

It must be emphasised that there are many subjects and issues regarding information algebras not addressed here. We hint at a few important subjects and issues.

Local computation is based on Markov trees associated with combinations of pieces of information. How can appropriate Markov trees be found for a given combination? In the multivariate setting, join trees (which in this case are also Markov trees) can be found by selecting a sequence of variable eliminations. An extensive literature presents heuristics for “good” elimination sequences. All this does not carry over to q-separoids in general, not even to partition- or f.c.f-separoids, and nothing is known so far about finding Markov trees in these general cases. So, this is a big open question, which is of course basic for the practical application of local computation beyond multivariate models.

Also, there are several architectures for local computation for probabilistic networks which make use of some partial division (or information elimination). These architectures apply also to general valuation algebras in the multivariate setting, if some appropriate concept of information elimination is assumed (Shenoy, 1994a). Mathematically speaking, what is required for these computational schemes are regular or separative (labeled) valuation algebras (Kohlas, 2003a). Now, there exist in fact regular and separative (dual) labeled versions of the corresponding domain-free information algebras discussed in Chap. 6. Instances of such labeled valuation algebras are semiring-valued algebras, where the semiring is regular or separative (Kohlas & Wilson, 2006). This, again, is something that applies also to generalised information algebras. Further, we remark that regular and separative valuation algebras provide the natural framework for a rigorous mathematical theory of compositional modelling (Jirousek, 1997; Jirousek, 2011; Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015).

Set algebras are natural examples of generalised information algebras; the elements of sets represent precise answers to the questions represented by set domains, as in partitions or families of compatible frames. How are abstract information algebras related to set algebras? In (Kohlas, 2003a) it was shown that any labeled valuation algebra can be embedded into a relational set algebra. This subject is considered in (Kohlas & Schmid, 2016) for domain-free valuation algebras, including in particular distributive lattice and Boolean information algebras.
It is shown that the classical representation theory of Boolean algebras and distributive lattices, based on the topological Stone and Priestley spaces, can be extended to the corresponding information algebras. It is conceivable that this representation theory extends also to generalised information algebras. It is also known that topology plays an important role in domain theory, and here too an extension of these topological approaches to idempotent information algebras, especially algebraic and continuous ones, could be envisaged.

Finally, there are many examples and instances of valuation algebras known in the multivariate setting, see for instance (Pouly & Kohlas, 2011). Far fewer examples have been studied in the present, more general context. Of course, set algebras, belief functions and probabilistic argumentation and a few others are presented in this text. These are examples generalised from the multivariate setting. The question arises whether there are other examples from this setting which make sense in the more general one, and whether there are other genuine examples in the non-multivariate framework of generalised information algebras.

Part I

Labeled Algebras


Chapter 2

Conditional Independence

2.1 Quasi-Separoids

Information informs about possible answers to questions. Therefore, the first task in developing an algebraic theory of information consists in modelling appropriate systems of questions. At this place we do not try to describe the internal structure of questions; we rather think of questions as represented by certain abstract domains describing or representing somehow the possible answers to questions, for instance, in the simplest case, by listing the possible answers. This idea will be pursued in Sections 2.2 and 2.3. But here, at this point, we want to be more general. Let D be a set whose elements are thought to represent domains. Its generic elements will be denoted by lowercase letters like x, y, z, . . .. We assume that domains or questions can be compared with respect to their granularity or fineness. Therefore we require (D; ≤) to be a partial order, where x ≤ y means that y is finer than x, that is, answers to y will be more informative than answers to x. Moreover, if x and y are two elements of D, we want to be able to consider the combined question represented by x and y. This is surely a question finer than both x and y. So, the combined question should be the coarsest question finer than x and y, that is the supremum sup{x, y} or join x ∨ y of x and y. Therefore we assume that D contains with any pair x, y also its join x ∨ y. Hence we require that (D; ≤) is a join-semilattice. In most discussions of valuation algebras so far, (D; ≤) has been assumed to be a lattice, even a distributive one (for instance the lattice of subsets of variables), see (Shenoy & Shafer, 1990a; Shafer & Shenoy, 1990; Kohlas, 2003a). Only recently, in (Kohlas & Schmid, 2014; Kohlas & Schmid, 2016), has this requirement been relaxed.

Besides order, representing granularity of questions, a further relation between questions is needed which describes conditional independence of two questions, given a third one. Therefore, in D a relation x⊥y|z is assumed which is thought to express the idea that an information relative to x

restricts the possible answers to y only through its part relative to z, and vice versa. Or, in other words, only the part relative to z of an information relative to x is relevant as an information relative to y, and vice versa. Rather than giving some explicit definition of this relation in D, we only require it to satisfy the following four conditions:

C1 x⊥y|y for all x, y ∈ D,

C2 x⊥y|z implies y⊥x|z,

C3 x⊥y|z and w ≤ y imply x⊥w|z,

C4 x⊥y|z implies x⊥y ∨ z|z.

A join-semilattice D together with a relation x⊥y|z satisfying conditions C1 to C4, written (D; ≤, ⊥), will be called a quasi-separoid (or also q-separoid). In Sections 2.2 and 2.3 explicit forms of such relations will be given. The meaning of the relation will further be made operational in terms of operations on pieces of information in Chapter 3. In Chapter 4 the role of conditional independence in local computation schemes will be clarified. So far, in valuation algebras, the concept of conditional independence was implicit in the axiomatic system rather than explicit (Kohlas, 2003a). However, for example in (Kohlas & Monney, 1995) it became important in an explicit form for local computation. The system presented here is abstracted from there. In the literature two additional conditions are assumed for a relation of conditional independence (Dawid, 2001):

C5 x⊥y|z and w ≤ y imply x⊥y|z ∨ w,

C6 x⊥y|z and x⊥w|y ∨ z imply x⊥y ∨ w|z.

Then D is called a separoid. If (D; ≤) is a lattice, then yet another condition can be added:

C7 If z ≤ y and w ≤ y, then x⊥y|z and x⊥y|w imply x⊥y|z ∧ w.

With this additional condition D is called a strong separoid. For a detailed discussion of separoids we refer to (Dawid, 2001). For example, it can be shown that C1 to C3 together with C5 and C6 imply C4. For our purposes, that is in particular for the study of local computation (Section 4), C1 to C4 are sufficient. This is thought to be one of the main results of this text.

Partitions and families of compatible frames, as studied in Sections 2.2 and 2.3, provide important examples of quasi-separoids, where D is in general only a join-semilattice. Here, we discuss briefly the case where D is a lattice, as for example the lattice of subsets of variables. Define x⊥Ly|z to hold if and only if

(x ∨ z) ∧ (y ∨ z) = z. (2.1)

Theorem 2.1 If (D; ≤) is a lattice, then the relation x⊥Ly|z defines a quasi-separoid.

Proof. We have (x ∨ y) ∧ (y ∨ y) = y, hence C1 is satisfied. By the symmetry of the definition, C2 holds too. If w ≤ y, then z ≤ (x ∨ z) ∧ (w ∨ z) ≤ (x ∨ z) ∧ (y ∨ z) = z, so C3 follows. Finally, from (2.1) we see that C4 is valid. ⊓⊔

If x ≤ y, then from x⊥y|y (C1) it follows that x⊥x|y by C3. Now, in some cases x⊥x|y implies x ≤ y. A separoid with this property is called basic, see (Dawid, 2001). We adapt this to call a quasi-separoid basic if x⊥x|y implies x ≤ y. The following theorem was proved in (Dawid, 2001) for a basic separoid, but it is valid for basic quasi-separoids too.

Theorem 2.2 Suppose (D; ≤) is a lattice. Then a q-separoid (D; ≤, ⊥) is basic if and only if

x⊥y|z ⇒ (x ∨ z) ∧ (y ∨ z) = z. (2.2)

Proof. If (2.2) holds, then x⊥x|y implies x ∨ y = y, hence x ≤ y. Suppose now that x⊥y|z. Then (x ∨ z)⊥(y ∨ z)|z by C4 and C2. Define w = (x ∨ z) ∧ (y ∨ z) such that w ≤ x ∨ z and w ≤ y ∨ z. Using C3 and C2 we then deduce that w⊥w|z. So, if the quasi-separoid is basic, we obtain that w ≤ z. Since we always have w ≥ z, it follows that w = z. ⊓⊔

If we meet both sides of (2.1) with x we obtain x ∧ (y ∨ z) = x ∧ z, which is equivalent to

x ∧ (y ∨ z) ≤ z. (2.3)

This condition is equivalent to (2.1) if the lattice D is modular. So, in this case we have x⊥Ly|z if and only if (2.3) holds.

Theorem 2.3 If (D; ≤) is a lattice, the relation x⊥Ly|z defines a separoid if and only if (D; ≤) is modular.

Proof. Assume D modular. We are going to show that C5 and C6 are satisfied. If D is modular, then x ∧ (y ∨ z) = x ∧ z if and only if x⊥Ly|z. So, if w ≤ y, it follows that x ∧ (y ∨ z ∨ w) = x ∧ (y ∨ z). Therefore, x ∧ (z ∨ w) ≤ x ∧ (y ∨ z ∨ w) = x ∧ (y ∨ z) = x ∧ z ≤ x ∧ (z ∨ w), hence x ∧ (y ∨ (z ∨ w)) = x ∧ (z ∨ w). This shows that x⊥Ly|z ∨ w, that is C5. Further, x⊥Ly|z and x⊥Lw|y ∨ z imply x ∧ (y ∨ z) = x ∧ z and x ∧ (w ∨ y ∨ z) = x ∧ (y ∨ z). Together, this leads to x ∧ (w ∨ y ∨ z) = x ∧ z, hence x⊥L(y ∨ w)|z. So C6 holds. On the other hand, assume x⊥Ly|z to be a separoid. By (2.1) we have x⊥Ly|x ∧ y. Thus, if z ≤ x, by C5 it follows that x⊥Ly|(x ∧ y) ∨ z. This means that x ∧ (y ∨ z) = (x ∧ y) ∨ z, which is modularity. ⊓⊔

Further, (2.1) implies that

x ∧ y ≤ z. (2.4)

If the lattice D is distributive, then (x ∨ z) ∧ (y ∨ z) = (x ∧ y) ∨ z. In this case (2.1) is equivalent to (2.4).
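As an aside (our addition, not part of the original text), Theorems 2.1 and 2.3 can be checked numerically on the non-modular pentagon lattice N5 (0 < a < c < 1, 0 < b < 1, with b incomparable to a and c): the relation x⊥Ly|z still satisfies C1–C4 there, while C5 fails, so ⊥L is a quasi-separoid but not a separoid. The Python encoding below is a minimal sketch; all names are ours.

elems = '0abc1'
leq = {('0', x) for x in elems} | {(x, '1') for x in elems}
leq |= {(x, x) for x in elems} | {('a', 'c')}

def join(x, y):
    ubs = [u for u in elems if (x, u) in leq and (y, u) in leq]
    return next(u for u in ubs if all((u, v) in leq for v in ubs))

def meet(x, y):
    lbs = [l for l in elems if (l, x) in leq and (l, y) in leq]
    return next(l for l in lbs if all((v, l) in leq for v in lbs))

def ci(x, y, z):                        # x ⊥L y | z: (x∨z) ∧ (y∨z) = z
    return meet(join(x, z), join(y, z)) == z

trips = [(x, y, z) for x in elems for y in elems for z in elems]
assert all(ci(x, y, y) for x in elems for y in elems)                # C1
assert all(ci(y, x, z) for x, y, z in trips if ci(x, y, z))          # C2
assert all(ci(x, w, z) for x, y, z in trips if ci(x, y, z)
           for w in elems if (w, y) in leq)                          # C3
assert all(ci(x, join(y, z), z) for x, y, z in trips if ci(x, y, z)) # C4
print(ci('b', 'c', '0'), ci('b', 'c', 'a'))  # True False: C5 fails on N5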

Theorem 2.4 If (D; ≤) is a distributive lattice, the relation x⊥Ly|z defines a strong separoid.

Proof. A distributive lattice is modular, so C5 and C6 hold. It remains to prove C7. Assume D distributive, so that x⊥Ly|z if and only if (2.4) holds. Now x⊥Ly|z and x⊥Ly|w imply x ∧ y ≤ z and x ∧ y ≤ w, hence x ∧ y ≤ z ∧ w, which shows that x⊥Ly|z ∧ w. Therefore C7 is satisfied. ⊓⊔

We may also consider the relation x⊥dy|z which holds if and only if x ∧ y ≤ z. The following theorem is due to (Dawid, 2001):

Theorem 2.5 The relation x⊥dy|z is a separoid if and only if (D; ≤) is a distributive lattice.

In a distributive lattice x⊥Ly|z if and only if x⊥dy|z, by the discussion above. Therefore, if ⊥d defines a separoid, it is a strong separoid by Theorem 2.4. An important instance of a distributive lattice is the lattice of the subsets of a set I. If s, t, r denote subsets of I, then s⊥Lt|r if and only if s ∩ t ⊆ r. This is then a strong separoid by the theorems above. This is the classical case considered in the large majority of studies on conditional independence. We shall come back to this case and its background in the next section and later.
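Complementing the previous sketch (again our addition, with invented names), a brute-force check that on the subset lattice of a three-element set I the relation s ∩ t ⊆ r also satisfies C5, C6 and C7, making it a strong separoid as the theorems above assert:

from itertools import chain, combinations

I = (0, 1, 2)
subs = [frozenset(c) for c in
        chain.from_iterable(combinations(I, k) for k in range(4))]

def ci(s, t, r):                       # s ⊥L t | r  iff  s ∩ t ⊆ r
    return s & t <= r

trips = [(s, t, r) for s in subs for t in subs for r in subs]
# C5: s ⊥ t | r and w ⊆ t imply s ⊥ t | r ∪ w
assert all(ci(s, t, r | w) for s, t, r in trips if ci(s, t, r)
           for w in subs if w <= t)
# C6: s ⊥ t | r and s ⊥ w | t ∪ r imply s ⊥ (t ∪ w) | r
assert all(ci(s, t | w, r) for s, t, r in trips if ci(s, t, r)
           for w in subs if ci(s, w, t | r))
# C7: for r, w ⊆ t, s ⊥ t | r and s ⊥ t | w imply s ⊥ t | r ∩ w
assert all(ci(s, t, r & w) for s, t, r in trips if r <= t and ci(s, t, r)
           for w in subs if w <= t and ci(s, t, w))
print("C5-C7 hold: a strong separoid on subsets of", set(I))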

2.2 Arithmetic of Partitions

An important concept for representing questions in the sense of the previous section is given by partitions of some universe U representing a set of possible worlds. Questions x can be modelled by equivalence relations ≡x on U, the idea being that we have u ≡x u′ if question x has the same answer in the worlds u and u′ respectively. Equivalence relations ≡x correspond bijectively to partitions Px of U. The members of such partitions will be called blocks; the blocks B of the partition Px are the equivalence classes B = {u ∈ U : u ≡x v} for an arbitrary v ∈ B. A question x will be considered finer than a question y if u ≡x u′ implies u ≡y u′, or equivalently, if every block of Px is contained in a (unique) block of Py. This implies that every block of Py is the union of blocks of Px. We carry this over to partitions and write in this case Py ≤ Px. This is a partial order among the partitions of U; in fact it is the opposite of the order on Part(U) usually considered in the literature. Under this (opposite) order, the join of two partitions P1 and P2 of U, written P1 ∨ P2, is given by the partition whose blocks are the nonempty intersections B1 ∩ B2 of blocks B1 of P1 and B2 of P2.
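A minimal Python sketch (our addition; the helper names are ours) of this arithmetic, with partitions represented as sets of disjoint frozenset blocks:

def coarser(py, px):
    """Py ≤ Px: every block of Px is contained in some block of Py."""
    return all(any(bx <= by for by in py) for bx in px)

def join(p1, p2):
    """P1 ∨ P2: the nonempty pairwise intersections of blocks."""
    return {b1 & b2 for b1 in p1 for b2 in p2 if b1 & b2}

P1 = {frozenset({0, 1, 2}), frozenset({3, 4, 5})}
P2 = {frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})}

P12 = join(P1, P2)
print(sorted(map(sorted, P12)))               # [[0, 1], [2], [3], [4, 5]]
assert coarser(P1, P12) and coarser(P2, P12)  # the join is finer than both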

(Part(U); ≤) is in fact a lattice (Grätzer, 1978), so there exists also a meet between finite sets of partitions. We come back to this later. We are going to define a relation of conditional independence between partitions. For a finite set of partitions P1, . . . , Pn, n ≥ 2, we define

R(P1, . . . , Pn) = {(B1, . . . , Bn) : Bi ∈ Pi, B1 ∩ · · · ∩ Bn ≠ ∅}.

So, R contains the tuples of mutually compatible blocks, representing compatible answers to the questions represented by P1, . . . , Pn. We call the partitions P1, . . . , Pn independent if R(P1, . . . , Pn) is the Cartesian product of P1, . . . , Pn,

R(P1, . . . , Pn) = P1 × · · · × Pn.

If P is another partition and B a block of P, then we define (for n ≥ 1)

RB(P1, . . . , Pn) = {(B1, . . . , Bn) : Bi ∈ Pi, B1 ∩ · · · ∩ Bn ∩ B ≠ ∅}.

This represents the tuples of all blocks of P1, . . . , Pn compatible with B ∈ P. We call P1, . . . , Pn conditionally independent given P if, for all blocks B of P,

RB(P1, . . . , Pn) = RB(P1) × · · · × RB(Pn).

In this case we write ⊥{P1, . . . , Pn}|P or (for n = 2) P1⊥P2|P. Note that this relation holds if, whenever Bi ∩ B ≠ ∅ for i = 1, . . . , n, it follows that B1 ∩ . . . ∩ Bn ∩ B ≠ ∅. Such relations have been studied in (Shafer, 1976; Kohlas & Monney, 1995). We are going to relate this relation to q-separoids. Let then D be a subset of Part(U) such that (D; ≤) (in the order defined above) is a join-subsemilattice of the lattice (Part(U); ≤). This gives us a q-separoid.
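Before turning to the theorem, here is a small sketch (our addition; the block-set representation and names are ours) that checks this definition of conditional independence directly:

from itertools import product

def r_b(partitions, b):
    """R_B: tuples of blocks, one per partition, meeting each other and b."""
    return {blocks for blocks in product(*partitions)
            if frozenset.intersection(*blocks) & b}

def cond_indep(p1, p2, p):
    """P1 ⊥ P2 | P: R_B factors as a Cartesian product for every B in P."""
    return all(r_b((p1, p2), b) ==
               {(b1, b2) for (b1,) in r_b((p1,), b) for (b2,) in r_b((p2,), b)}
               for b in p)

P1 = {frozenset({0, 1}), frozenset({2, 3})}    # partitions of U = {0,1,2,3}
P2 = {frozenset({0, 2}), frozenset({1, 3})}
top = {frozenset(range(4))}                    # the coarsest partition
bottom = {frozenset({i}) for i in range(4)}    # the finest partition

print(cond_indep(P1, P2, top))     # True: P1 and P2 are independent
print(cond_indep(P1, P2, bottom))  # True: given the finest partition
print(cond_indep(P1, P1, top))     # False: P1 ⊥ P1 | P requires P1 ≤ P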

Theorem 2.6 If (D; ≤) is a join-subsemilattice of (Part(U); ≤), then (D; ≤, ⊥) is a q-separoid.

Proof. C1) and C2) are obvious. To prove C3), assume P1⊥P2|P and P3 ≤ P2. Assume then B1 ∩ B ≠ ∅ and B3 ∩ B ≠ ∅, where B1, B3 and B are blocks of P1, P3 and P respectively. Then there is a block B2 of P2 such that B2 ⊆ B3 and B2 ∩ B ≠ ∅. But then B1 ∩ B3 ∩ B ⊇ B1 ∩ B2 ∩ B ≠ ∅, hence P1⊥P3|P. Finally, for C4), assume P1⊥P2|P and consider blocks B of P, B1 of P1 and C of P2 ∨ P such that B1 ∩ B ≠ ∅ and C ∩ B ≠ ∅. Then C = B2 ∩ B for some block B2 of P2, and then B2 ∩ B ≠ ∅. It follows B1 ∩ C ∩ B = B1 ∩ B2 ∩ B ≠ ∅, hence P1⊥P2 ∨ P|P. ⊓⊔

So, we have here a first model of a q-separoid, and in particular (Part(U); ≤, ⊥) is also a q-separoid. It turns out that these q-separoids are basic.

Theorem 2.7 If (D; ≤) is a join-subsemilattice of (Part(U); ≤), then (D; ≤, ⊥) is a basic q-separoid.

Proof. Assume P1⊥P1|P for two partitions P1 and P. Then B1 ∩ B ≠ ∅ and B1′ ∩ B ≠ ∅ imply B1 ∩ B1′ ∩ B ≠ ∅, where B1 and B1′ are blocks of P1 and B a block of P. But B1 and B1′ are either equal or disjoint, so B must be a subset of some block B1 of P1, so that P1 ≤ P. ⊓⊔

We consider now the case that (D; ≤) is a sublattice of (Part(U); ≤). Then from Theorems 2.2 and 2.7 it follows that P1⊥P2|P implies (P1 ∨ P) ∧ (P2 ∨ P) = P. Is it possible that P1⊥P2|P and (P1 ∨ P) ∧ (P2 ∨ P) = P are equivalent, and if yes, under what conditions does this equivalence hold? In order to address this question, we introduce saturation operators σP, associated with a partition P, as mappings P(U) → P(U), where P(U) denotes the power set of U, defined by

σP(X) = ∪{B ∈ P : B ∩ X ≠ ∅}

for subsets X of U. So, σP(X) is the smallest union of blocks of P covering X. Note that P may be recovered from σP as the set of blocks {σP({x}) : x ∈ U}. This concept will also be used later. Remark that σP(σP(X)) = σP(X) and σP(σP(X) ∩ σP(Y)) = σP(X) ∩ σP(Y). The following properties of saturation operators will be crucial for our purposes:

Lemma 2.1 Let σP, P ∈ Part(U), be a saturation operator on U. Then for all X, Y ⊆ U

1. σP(∅) = ∅,

2. X ⊆ σP(X),

3. X ⊆ Y implies σP(X) ⊆ σP(Y),

4. σP(σP(X) ∩ Y) = σP(X) ∩ σP(Y).

Proof. Items 1. to 3. are obvious. For 4. observe that σP(X) ∩ Y ⊆ σP(X) ∩ σP(Y), so σP(σP(X) ∩ Y) ⊆ σP(X) ∩ σP(Y) by 3. Now σP(X) ∩ σP(Y) is the union of all B ∈ P satisfying B ∩ X ≠ ∅ ≠ B ∩ Y. Obviously, for each such B we have B ∩ σP(X) = B, so B ∩ σP(X) ∩ Y ≠ ∅ and B participates in the union of all B′ ∈ P forming σP(σP(X) ∩ Y). ⊓⊔

Given two partitions P1 and P2 with associated saturation operators σ1 and σ2, define σ : P(U) → P(U) by

σ(X) = ∪k∈ω (σ1 ◦ σ2 ◦ σ1 ◦ σ2 ◦ · · · ◦ σk)(X)

for all X ⊆ U, where σk = σ1 if k is odd and σk = σ2 if k is even. Clearly, σ(X) is the smallest set containing X which is a union of P1-blocks as well as a union of P2-blocks. It follows that σ = σP1∧P2.

If σ1 ◦ σ2 = σ2 ◦ σ1, then, since saturation operators are idempotent, σ = σ1 ◦ σ2. In this case we say that the partitions P1 and P2 commute. This turns out to be exactly the necessary and sufficient condition for P1⊥P2|P ⇔ (P1 ∨ P) ∧ (P2 ∨ P) = P. Note that P1 and P2 commute iff for any block C of P1 ∧ P2 and any blocks B1 of P1 and B2 of P2 with B1, B2 ⊆ C, it follows that B1 ∩ B2 ≠ ∅.
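A small sketch (our addition; names are ours) of the saturation operator and of the commuting test in the block-set representation used above; the two partitions below commute, which the assertion confirms on every subset of U:

from itertools import chain, combinations

def saturate(p, x):
    """σP(X): union of all blocks of P meeting X."""
    return frozenset().union(*(b for b in p if b & x))

U = range(4)
P1 = {frozenset({0, 1}), frozenset({2, 3})}
P2 = {frozenset({0, 2}), frozenset({1, 3})}

powerset = [frozenset(c) for c in
            chain.from_iterable(combinations(U, k) for k in range(5))]

# σ1 ∘ σ2 = σ2 ∘ σ1 on every subset: P1 and P2 commute.
assert all(saturate(P1, saturate(P2, x)) == saturate(P2, saturate(P1, x))
           for x in powerset)
print("P1 and P2 commute")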

Theorem 2.8 Let P1⊥P2|P be the conditional independence relation defined above in a sublattice (D; ≤) of (Part(U); ≤). Then P1⊥P2|P ⇔ (P1 ∨ P) ∧ (P2 ∨ P) = P if and only if the partitions in D pairwise commute.

Proof. Recall that P1⊥P2|P implies (P1 ∨ P) ∧ (P2 ∨ P) = P. Assume first that (P1 ∨ P) ∧ (P2 ∨ P) = P implies P1⊥P2|P, so that, in particular, P1⊥P2|P1 ∧ P2. Hence, if C, B1 and B2 are blocks of the partitions P1 ∧ P2, P1 and P2 respectively, then B1 ∩ C ≠ ∅ and B2 ∩ C ≠ ∅ imply B1 ∩ B2 ∩ C ≠ ∅. But, since P1 ∧ P2 ≤ P1, P2 we have B1, B2 ⊆ C, hence B1 ∩ B2 ≠ ∅. But this implies that the partitions P1 and P2 commute.

Conversely, assume that the partitions P1 and P2 commute and assume that (P1 ∨ P) ∧ (P2 ∨ P) = P. Consider blocks B1, B2 and B of P1, P2 and P respectively, and B1′ = B1 ∩ B ≠ ∅, B2′ = B2 ∩ B ≠ ∅, such that B1′ and B2′ are blocks of P1 ∨ P and P2 ∨ P, both subsets of B. Then, if the partitions commute, B1′ ∩ B2′ ≠ ∅, hence B1 ∩ B2 ∩ B ≠ ∅, and so indeed P1⊥P2|P. ⊓⊔

As in the previous section, we write P1⊥LP2|P iff (P1 ∨ P) ∧ (P2 ∨ P) = P. So, if the partitions of the sublattice (D; ≤) of (Part(U); ≤) pairwise commute, we have P1⊥P2|P if and only if P1⊥LP2|P. Recall that ⊥L defines a separoid if and only if the lattice (D; ≤) is modular; if the lattice is distributive, it is a strong separoid, and then P1⊥LP2|P if and only if P1⊥dP2|P, that is, P1 ∧ P2 ≤ P.

We now give an important example of a distributive sublattice of (Part(U); ≤). In practical applications, often a set of variables is considered, and the information one is interested in concerns the values of certain groups of variables. So consider a countable family of variables X = {Xi : i ∈ N}, and let Vi be the set of possible values of the variable Xi. Let

Vω = ∏i∈N Vi.

Let s be any subset of N. Define for any sequence t ∈ Vω its restriction to s, denoted by t|s, as follows: if s = {i1, . . . , in}, then t|s = (ti1, . . . , tin). Also, define an equivalence relation ≡s on Vω by

t ≡s t′ iff t|s = t′|s. (2.5)

Each equivalence relation ≡s defines a partition Ps of Vω and Ps ≤ Pr iff s ⊆ r. The associated saturation operators σs produce the cylindrical sets

σs(X) over s in Vω, and they commute. Let D = {Ps : s ⊆ N} and (D; ≤) the corresponding lattice, order-isomorphic to the subset lattice of N. This lattice is distributive, so we have here a distributive sublattice (D; ≤) of (Part(Vω); ≤) of pairwise commuting partitions. This is called a multivariate model. We usually work with the subset lattice of N rather than with the associated lattice of partitions D. Then we write s⊥t|r instead of Ps⊥Pt|Pr, which holds if (s ∪ r) ∩ (t ∪ r) = r or, equivalently, s ∩ t ⊆ r. Note that in this case, we may consider any subset u of N, and then the subsets of u still define a distributive lattice of pairwise commuting partitions, this time of partitions of Vu. And s⊥t|r still gives a q-separoid (even a separoid) in the corresponding lattice. The idea that the universe may not really be fixed, but may vary, will be pursued in the following section for an alternative explicit model of a system of questions.
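To make the multivariate model concrete, here is a minimal sketch (our addition; the toy variables and names are invented) of restrictions and of the cylindrical sets produced by the saturation operators σs:

from itertools import product

V = {0: ('a', 'b'), 1: (0, 1), 2: (0, 1)}           # frames of X0, X1, X2
tuples = list(product(*(V[i] for i in sorted(V))))   # the universe (finite here)

def restrict(t, s):
    """t|s: restriction of the tuple t to the index set s."""
    return tuple(t[i] for i in sorted(s))

def cylinder(x, s):
    """σs(X): all tuples agreeing on s with some member of X."""
    return {t for t in tuples
            if any(restrict(t, s) == restrict(u, s) for u in x)}

x = {('a', 0, 1)}
print(sorted(cylinder(x, {0, 1})))  # fixes X0 = 'a', X1 = 0 and frees X2
# Ps ≤ Pr iff s ⊆ r, and s ⊥ t | r iff s ∩ t ⊆ r, e.g.:
s, t, r = {0, 1}, {1, 2}, {1}
print(s & t <= r)                   # True: {0,1} ⊥ {1,2} | {1}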

2.3 Families of Compatible Frames

Here, another view of a system of domains modelling questions is presented. In a partition, any block can be thought of as representing a possible answer to the question represented by the partition. So, to emphasise this point of view, we may consider the set of blocks of a partition P, say ΘP = {θB : B ∈ P}, as the set of possible answers representing the question. If partition P1 is coarser than partition P2, that is, if P1 ≤ P2, we have a mapping τ of ΘP1 into P(ΘP2) defined by

τ(θB) = {θC : C ⊆ B}.

Clearly, the τ(θB), for B ∈ P1, form a partition of ΘP2. By the map τ any possible answer θB in ΘP1 is split up into a set τ(θB) of finer answers in the set ΘP2. Therefore, the map τ is called a refining and ΘP2 a refinement of ΘP1; conversely, ΘP1 is called a coarsening of ΘP2. The following properties of refinings between frames associated with partitions are easily verified: (1) If P1 ≤ P2 ≤ P3 and τ1 is the refining of P1 to P2, τ2 the refining of P2 to P3, then τ2 ◦ τ1 is the refining of P1 to P3. (2) Since P ≤ P, the identity map id is a refining of P to P. (3) Whenever P1 and P2 are two partitions, then there exists at most one refining between them: if P1 ≤ P2 the refining of P1, or if P2 ≤ P1 the refining of P2. (4) If P1, P1′ ≤ P2 and τ, τ′ are the corresponding refinings, then if for all B ∈ P1 there exists a B′ ∈ P1′ such that τ(θB) = τ′(θB′) and for all B′ ∈ P1′ there exists a B ∈ P1 such that τ′(θB′) = τ(θB), then P1 = P1′. (5) If τ1 and τ2 are the refinings of ΘP1 and ΘP2 respectively to ΘP1∨P2, then for every θB ∈ ΘP1∨P2 there exist a θB1 ∈ ΘP1 and a θB2 ∈ ΘP2 such that τ1(θB1) ∩ τ2(θB2) = {θB}.

These considerations motivate the definition of families of compatible frames (f.c.f) as follows: A frame Θ is simply a non-empty set, whose elements are thought to represent possible answers to the question represented by the frame. Another frame Λ may be obtained from Θ by splitting some or all elements θ of Θ. Mathematically this is represented by specifying for each θ ∈ Θ the subset τ(θ) of Λ consisting of the possibilities into which θ has been split. So, τ : Θ → P(Λ) is a mapping from Θ into the power set of Λ. It must satisfy the following conditions:

1. τ(θ) ≠ ∅ for all θ ∈ Θ,

2. τ(θ′) ∩ τ(θ′′) = ∅ if θ′ ≠ θ′′,

3. ∪θ∈Θ τ(θ) = Λ.

The map τ is called a refining of Θ, the set Λ a refinement of Θ, and Θ a coarsening of Λ. Note that a refining τ determines a partition of Λ with blocks τ(θ) for θ ∈ Θ. Further, a refining τ may be extended to a map of subsets of Θ into the power set of Λ:

τ(S) = ∪θ∈S τ(θ)

for any subset S of Θ. A refining τ satisfies the following conditions (see (Shafer, 1976); an illustrative sketch follows the list):

1. τ is one-to-one,

2. τ(S) = ∅ if and only if S = ∅,

3. τ(Θ) = Λ,

4. τ(∪{S : S ∈ S}) = ∪{τ(S) : S ∈ S},

5. τ(∩{S : S ∈ S}) = ∩{τ(S) : S ∈ S},

6. τ(S^c) = τ(S)^c,

7. τ(S) ⊆ τ(R) iff S ⊆ R.
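As a small illustration (our addition; the frames are invented), a refining and its extension to subsets, with spot checks of conditions 3, 4, 6 and 7 from the list above:

theta = {'dry', 'wet'}                                     # coarse frame Θ
lam = {'sun', 'cloud', 'rain', 'snow'}                     # refinement Λ
tau0 = {'dry': {'sun', 'cloud'}, 'wet': {'rain', 'snow'}}  # τ on elements

def tau(s):
    """Extension of the refining τ to subsets of Θ."""
    return set().union(*(tau0[t] for t in s))

assert tau(theta) == lam                                      # condition 3
assert tau({'dry'} | {'wet'}) == tau({'dry'}) | tau({'wet'})  # condition 4
assert tau(theta - {'dry'}) == lam - tau({'dry'})             # condition 6
assert tau({'dry'}) <= tau(theta)                             # condition 7
print("refining properties hold on this example")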

Further, if τ1 is a refining of Θ1 into Θ2 and τ2 a refining of Θ2 into Θ3, then the composition τ1 ◦ τ2 is a refining of Θ1 into Θ3. If Λ is a refinement of Θ and τ the corresponding refining, then we define a map v : P(Λ) → P(Θ) by

v(S) = {θ ∈ Θ : τ(θ) ∩ S ≠ ∅}. (2.6)

The map v is called a saturation operator, due to the similarity to saturation operators for partitions. Let F be a nonempty collection of sets, called frames, where no pair of frames has an element in common. Further, let R be a nonempty set of refinings between pairs of frames of F. Then we formally define:

Definition 2.1 Family of Compatible Frames: A pair (F, R) of frames and refinings is called a family of compatible frames (f.c.f) provided the following conditions are satisfied:

1. Composition of Refinings: If τ1 : P(Θ1) → P(Θ2) and τ2 : P(Θ2) → P(Θ3) belong to R, then τ1 ◦ τ2 ∈ R.

2. Identity: If Θ ∈ F, then the identity map id : P(Θ) → P(Θ) belongs to R.

3. Identity of Refinings: If τ1 : P(Θ) → P(Λ) and τ2 : P(Θ) → P(Λ) are elements of R, then τ1 = τ2.

4. Identity of Coarsenings: If τ1 : P(Θ1) → P(Λ) and τ2 : P(Θ2) → P(Λ) belong to R and if for each θ2 ∈ Θ2 there exists a θ1 ∈ Θ1 and for each θ1 ∈ Θ1 there exists a θ2 ∈ Θ2 such that τ1(θ1) = τ2(θ2), then Θ1 = Θ2.

5. Existence of Minimal Common Refinement: For any finite family Θ1, . . . , Θn of frames in F, there exists a common refinement Λ ∈ F such that if Λ′ ∈ F is another common refinement of Θ1, . . . , Θn, then Λ′ is also a refinement of Λ, and, if τi are the refinings of Θi to Λ, then for every λ ∈ Λ there exist elements θi ∈ Θi such that

τ1(θ1) ∩ . . . ∩ τn(θn) = {λ}. (2.7)

The concept of a f.c.f has been introduced in (Shafer, 1976) in a similar way. In (Shafer, 1976) additional conditions are required, in particular that any frame has refinings in the family, excluding an ultimate refinement. This excludes frames related to lattices of partitions. By the discussion above, it is evident that the frames ΘP associated with the partitions P of a join-subsemilattice of the partition lattice (Part(U); ≤) of some universe, together with the corresponding refinings, constitute a f.c.f. On the other hand, a f.c.f need not include an ultimate refinement; it is thus slightly more general than the f.c.f obtained from those join-subsemilattices. In a f.c.f (F, R), a relation Θ ≤ Λ can be defined to hold if Λ is a refinement of Θ, hence the latter a coarsening of the former. In fact, the system (F; ≤) is a join-semilattice.¹

Theorem 2.9 If (F, R) is a family of compatible frames, then (F; ≤) is a join-semilattice.

¹ If Identity of Coarsenings does not hold, the order is only a preorder. According to (Dawid, 2001) a preorder is sufficient for the theory of separoids. It may be conjectured that this could also be true for the present theory of information algebras. For the sake of simplicity we renounce developing this generalisation.

Proof. Reflexivity of ≤ follows from Identity (2). Assume Θ ≤ Λ and Λ ≤ Θ. Let τ1 be the refining of Θ to Λ and τ2 the refining of Λ to Θ. Then Θ is a coarsening of Λ, and τ1 ◦ τ2 = idΘ and also τ2 ◦ τ1 = idΛ by the Identity of Refinings. Further, Λ is a coarsening of itself and for every θ ∈ Θ we have τ1(θ) = {λ} = idΛ(λ). By the Identity of Coarsenings (4), it follows that Θ = Λ, hence ≤ is antisymmetric. Transitivity of the relation ≤ follows from Composition of Refinings (1). So the relation ≤ is a partial order in F. Clearly, the minimal common refinement Λ of two frames Θ1 and Θ2, which exists by the requirement of the Existence of a Minimal Common Refinement, is an upper bound of Θ1 and Θ2. Suppose Λ′ is another common refinement of Θ1 and Θ2; then Λ′ is a refinement of Λ, hence Λ ≤ Λ′, and Λ is the least upper bound. So (F; ≤) is a join-semilattice. ⊓⊔

In the sequel we write Λ = Θ1 ∨ . . . ∨ Θn for the minimal common refinement of Θ1, . . . , Θn. We are now going to introduce a relation of conditional independence into a family of compatible frames, following the model of partitions, and show that it defines a quasi-separoid. The compatibility relation between the elements of different frames Θ1, . . . , Θn of a family of compatible frames (F, R) is defined first. Let Λ = Θ1 ∨ . . . ∨ Θn and τi : P(Θi) → P(Λ) the corresponding refinings of Θi for i = 1, . . . , n. Define

R(Θ1, . . . , Θn) = {(θ1, . . . , θn) : θi ∈ Θi, τ1(θ1) ∩ · · · ∩ τn(θn) ≠ ∅}. (2.8)

Thus R contains the tuples of mutually compatible elements θi. The frames Θ1,..., Θn are called independent if

R(Θ1, . . . , Θn) = Θ1 × · · · × Θn.

This relation has been studied in (Shafer, 1976) and (Cuzzolin, 2005). Consider an element λ of a frame Λ in F. We now look for tuples of elements θi ∈ Θi which are compatible among themselves and with λ,

Rλ(Θ1, . . . , Θn) = {(θ1, . . . , θn) : (θ1, . . . , θn, λ) ∈ R(Θ1, . . . , Θn, Λ)}.

We stress that Λ is in general not necessarily different from every Θi. The collection of frames Θ1,..., Θn is called conditionally independent given Λ, if for all λ ∈ Λ,

Rλ(Θ1, . . . , Θn) = Rλ(Θ1) × · · · × Rλ(Θn). (2.9)

Then we write

⊥{Θ1, . . . , Θn}|Λ, or Θ1⊥Θ2|Λ in the case n = 2. This means that once an answer λ in Λ is given, knowing an answer θi ∈ Θi compatible with λ does not restrict the possible answers θj ∈ Θj for i ≠ j. This relation has been studied in (Kohlas & Monney, 1995), although in a slightly different system of families of compatible frames. It has been shown there that conditions C3 and C4 of a q-separoid are fulfilled (see (Kohlas & Monney, 1995), Theorems 7.14 and 7.17). We generalise this result to our present case.

Theorem 2.10 Let (F, R) be a family of compatible frames. Then (F; ≤, ⊥), where the relation of conditional independence is defined as above, is a q-separoid.

Proof. In order to verify condition C1 of a q-separoid, consider θ ∈ Θ2. Then Rθ(Θ2) = {θ} and Rθ(Θ1) = {θ1 : τ1(θ1) ∩ τ2(θ) ≠ ∅}, where τ1 and τ2 are the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2 respectively. Finally, we have

Rθ(Θ1, Θ2) = {(θ1, θ) : τ1(θ1) ∩ τ2(θ) ≠ ∅}.

Thus Rθ(Θ1, Θ2) = Rθ(Θ1) × Rθ(Θ2) and therefore Θ1⊥Θ2|Θ2. Condition C2 follows from the symmetry of the definition of conditional independence. In order to prove C3, assume Θ1⊥Θ2|Λ and Θ2′ ≤ Θ2. We must prove that Θ1⊥Θ2′|Λ, that is,

Rλ(Θ1, Θ2′) = Rλ(Θ1) × Rλ(Θ2′)

for any λ ∈ Λ. Note that the relation on the left is always contained in the Cartesian product on the right. So we need only show that any pair (θ1, θ2′) with θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2′) belongs to Rλ(Θ1, Θ2′). Let now τ, τ1 and τ2 be the refinings of Λ, Θ1 and Θ2 to Θ1 ∨ Θ2 ∨ Λ, and ω the refining of Θ2′ to Θ2. Then if θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2′),

τ1(θ1) ∩ τ(λ) ≠ ∅, τ2(ω(θ2′)) ∩ τ(λ) ≠ ∅.

We must prove that this implies

τ1(θ1) ∩ τ2(ω(θ2′)) ∩ τ(λ) ≠ ∅, (2.10)

because this means that (θ1, θ2′) ∈ Rλ(Θ1, Θ2′). There is an element η in τ2(ω(θ2′)) ∩ τ(λ). But then there must be an element θ2 ∈ ω(θ2′) such that η ∈ τ2(θ2) ∩ τ(λ), hence we conclude that τ2(θ2) ∩ τ(λ) ≠ ∅. Then Θ1⊥Θ2|Λ implies

τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅.

Since τ2(θ2) ⊆ τ2(ω(θ2′)), we conclude that (2.10) holds, hence Θ1⊥Θ2′|Λ and C3 is valid.

In order to prove C4, assume Θ1⊥Θ2|Λ and consider the refinings τ1 of Θ1 to Θ1 ∨ Θ2 ∨ Λ, τ2 of Θ2 to Θ2 ∨ Λ, τ of Λ to Θ2 ∨ Λ, and finally τ′ of Θ2 ∨ Λ to Θ1 ∨ Θ2 ∨ Λ. In order to prove that Θ1⊥Θ2 ∨ Λ|Λ, we must show that for any pair of elements θ1 ∈ Rλ(Θ1), θ2′ ∈ Rλ(Θ2 ∨ Λ) and λ ∈ Λ,

τ1(θ1) ∩ τ′(θ2′) ∩ τ′(τ(λ)) ≠ ∅, (2.11)

since this means that (θ1, θ2′) belongs to Rλ(Θ1, Θ2 ∨ Λ) and therefore

Rλ(Θ1, Θ2 ∨ Λ) = Rλ(Θ1) × Rλ(Θ2 ∨ Λ).

By the Existence of a Minimal Common Refinement there are elements θ2 in Θ2 and λ′ ∈ Λ such that

τ2(θ2) ∩ τ(λ′) = {θ2′}.

The assumptions that θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2 ∨ Λ) imply that

τ1(θ1) ∩ τ′(τ(λ)) ≠ ∅, τ′(θ2′) ∩ τ′(τ(λ)) ≠ ∅.

Then we see that

τ′(θ2′) = τ′(τ2(θ2) ∩ τ(λ′)) = τ′(τ2(θ2)) ∩ τ′(τ(λ′)) ≠ ∅

and further

∅ ≠ τ′(θ2′) ∩ τ′(τ(λ)) = τ′(τ2(θ2)) ∩ τ′(τ(λ′)) ∩ τ′(τ(λ)).

This implies that λ′ = λ, hence θ2 ∈ Rλ(Θ2). From Θ1⊥Θ2|Λ, θ1 ∈ Rλ(Θ1) and θ2 ∈ Rλ(Θ2), we obtain that

τ1(θ1) ∩ τ′(τ2(θ2)) ∩ τ′(τ(λ)) ≠ ∅.

But then we have also

τ1(θ1) ∩ τ′(τ2(θ2) ∩ τ(λ)) ∩ τ′(τ(λ)) ≠ ∅.

If we replace here τ2(θ2) ∩ τ(λ) by θ2′, we have (2.11). ⊓⊔

We now show that the quasi-separoid of a family of compatible frames is basic, just as the q-separoid of partitions.

Theorem 2.11 Let (F, R) be a family of compatible frames. Then the q-separoid Θ1⊥Θ2|Λ is basic.

Proof. Assume Θ⊥Θ|Λ and consider an element λ of Λ, and

Rλ(Θ, Θ) = {(θ1, θ2) : θ1, θ2 ∈ Θ, (θ1, θ2, λ) ∈ R(Θ, Θ, Λ)}.

Consider the frame Θ ∨ Λ and denote by τ, ω the refinings of Θ and Λ to their minimal common refinement Θ ∨ Λ respectively. Then (θ1, θ2, λ) ∈ R(Θ, Θ, Λ) if τ(θ1) ∩ τ(θ2) ∩ ω(λ) ≠ ∅. This implies θ1 = θ2 and we obtain

Rλ(Θ, Θ) = {(θ, θ) : θ ∈ Θ, τ(θ) ∩ ω(λ) ≠ ∅}.

Further,

Rλ(Θ) = {θ : θ ∈ Θ, (θ, λ) ∈ R(Θ, Λ)}. (2.12)

Here, (θ, λ) ∈ R(Θ, Λ) iff τ(θ) ∩ ω(λ) ≠ ∅. By the assumption Θ⊥Θ|Λ we have Rλ(Θ, Θ) = Rλ(Θ) × Rλ(Θ). This is only possible if Rλ(Θ, Θ) and Rλ(Θ) each contain only a single element, (θ, θ) and θ respectively. So, to any element λ from Λ there is only one compatible element θ in Θ. Therefore, from τ(θ) ∩ ω(λ) ≠ ∅ it follows that τ(θ) ⊇ ω(λ). Further, due to (2.7), τ(θ) ∩ ω(λ) ≠ ∅ implies τ(θ) ∩ ω(λ) = {χ} for some element χ of the minimal common refinement Θ ∨ Λ. It follows that ω(λ) = {χ}. Identity of Coarsenings then implies Θ ∨ Λ = Λ. Thus we conclude that Θ ≤ Λ. ⊓⊔

Again, a multivariate model, since it can be seen as a system of partitions, can also be seen as a particular case of a f.c.f. This gives us important q-separoids for applications.

Chapter 3

Labeled Information Algebras

3.1 Axiomatics

Consider a quasi-separoid (D; ≤, ⊥). We think of the elements of D as domains, representing questions and their answers, and now add pieces of information relating to these questions. So, let Φ be a set whose generic elements will be denoted by lower case Greek letters like φ, ψ, . . .. These elements are thought to represent pieces of information, each relating to a specific question, that is, to an element of D; pieces of information which can be combined or aggregated and from which information can be extracted relative to different questions or domains in D. These ideas will be captured by the operations of labeling, which informs about the question a piece of information relates to, combination, which represents aggregation of two or more pieces of information, and transport, which describes the extraction of the information relating to a given question from a piece of information. Formally, we consider the following operations:

1. Labeling: d : Φ → D, φ ↦ d(φ).

2. Combination: · : Φ × Φ → Φ, (φ, ψ) ↦ φ · ψ.

3. Transport: t : Φ × D → Φ, (φ, x) ↦ tx(φ).

These operations are required to satisfy the following axioms.

A0 Quasi-Separoid: (D; ≤, ⊥) is a quasi-separoid.

A1 Semigroup: (Φ; ·) is a commutative semigroup.

A2 Labeling: d(φ · ψ)= d(φ) ∨ d(ψ), d(tx(φ)) = x.


A3 Unit and Null: For all x ∈ D there are elements 0x (null) and 1x (unit) with d(0x)= d(1x)= x and such that

1. φ · 0x = 0x and φ · 1x = φ if d(φ)= x,

2. ty(φ) = 0y if and only if φ = 0d(φ),

3. φ · 1x = td(φ)∨x(φ),

4. 1x · 1y = 1x∨y.

A4 Transport: If x⊥y|z and d(φ) = x, then

ty(φ)= ty(tz(φ)). (3.1)

A5 Combination: If x⊥y|z and d(φ)= x, d(ψ)= y, then

tz(φ · ψ)= tz(φ) · tz(ψ). (3.2)

A6 Identity: If d(φ)= x, then tx(φ)= φ.
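As an illustration (our addition, not part of the original text), the following Python sketch implements the three operations for the multivariate subset algebra suggested by Section 2.2: a piece of information is a set of tuples over a finite set of variables, combination is natural join, and transport is projection followed by cylindrical extension. All helper names and the toy frames are ours; on this model the axioms A0–A6 (and the idempotency axiom A7 below) can be checked directly.

from itertools import product

# A piece of information is a pair (S, s): s a set of variable indices,
# S a set of "tuples", each a frozenset of (variable, value) pairs over s.
VALS = {0: (0, 1), 1: (0, 1), 2: (0, 1)}     # toy frames of three variables

def label(phi):                              # d(φ): the domain of φ
    return phi[1]

def combine(phi, psi):                       # φ · ψ, labeled d(φ) ∨ d(ψ)
    (S, s), (R, r) = phi, psi
    joined = {a | b for a in S for b in R
              if all(v == dict(b)[i] for i, v in a if i in r)}
    return (joined, s | r)

def transport(phi, x):                       # t_x(φ): project to x ∩ s,
    S, s = phi                               # then extend cylindrically to x
    proj = {frozenset((i, v) for i, v in a if i in x) for a in S}
    ext = {p | frozenset(zip(sorted(x - s), vals))
           for p in proj
           for vals in product(*(VALS[i] for i in sorted(x - s)))}
    return (ext, x)

# The unit 1_x is the set of all tuples over x, the null 0_x the empty set.
phi = ({frozenset({(0, 0), (1, 1)})}, {0, 1})   # "X0 = 0 and X1 = 1"
psi = ({frozenset({(1, 1), (2, 0)})}, {1, 2})   # "X1 = 1 and X2 = 0"
print(label(combine(phi, psi)))                 # {0, 1, 2}, as axiom A2 requires
print(transport(phi, {1, 2})[0])                # X1 = 1, X2 unconstrained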

Axiom A1 implies in particular associativity of combination, such that pieces of information may be combined in any order to obtain the same result. The unit 1x represents vacuous information relative to the domain x. Combining it with any other piece of information on the same domain does not change the information. Further, vacuous information remains vacuous when transported to any other domain (see Lemma 3.1 below). The null elements 0x, on the other hand, destroy any information (see below, Lemma 3.1). They represent contradiction: if φ · ψ = 0x, then φ and ψ must be considered as contradictory pieces of information. Also, transport can neither eliminate contradiction nor introduce it. Axioms A4 and A5 are important for local computation, see Section 4. They substantiate conditional independence: if x⊥y|z, then to transport a piece of information from x to y, only the part relating to z is relevant. Or, to extract the combination of pieces of information on x and y to z, one may combine their parts extracted to z. Further illustrations of the meaning of conditional independence and irrelevance are given below (Lemma 3.2). What is characteristic and basic for this axiomatic system of information is, beside combination, the assumption that information can be transported from one domain to another, that is, the extraction of the part of the information relevant to a specific question. This seems to be an essential property of information. There are variants of this axiomatic system, relevant for local computation, which do not allow this operation, in particular systems related to probability or Bayesian networks (see Section 3.2 below). Another important property of information, not yet present in the axiomatic system above, is that the combination of a piece of information with itself or with a part of it gives no new information. This is sometimes added as an additional axiom.

A7 Idempotency: If d(φ)= x and y ≤ x, then φ · ty(φ)= φ.

Although this seems essential for a concept of information, we do not require it in general. In fact, there are many instances qualifying as information which do not satisfy idempotency. For instance, a repetition of the same statement by different witnesses reinforces the credibility of the fact expressed by the statement and is therefore new information. We call a system (Φ, D; ≤, ⊥, d, ·, t) satisfying the axioms A0 to A6 a (generalised) information algebra. The term information algebra has formerly been used in a slightly less general framework, which also assumes idempotency (Kohlas, 2003a; Kohlas & Schneuwly, 2009; Kohlas & Schmid, 2014). We shall see that these axiomatic systems are particular cases of an information algebra in the present sense. This is also the case for another axiomatic system proposed by (Shenoy & Shafer, 1990a), later called valuation algebras (Kohlas & Shenoy, 2000; Kohlas, 2003a). So, the present axiomatic system generalises former systems of axioms and thus enlarges the field of application of local computation considerably (see Section 4). Here we list a few elementary properties of an information algebra concerning transport.

Lemma 3.1

1. y ≤ z implies ty(φ)= ty(tz(φ)),

2. d(φ)= x ≤ y ≤ z implies tz(φ)= tz(ty(φ)).

3. x ≤ z and d(φ)= x imply tx(tz(φ)) = φ,

4. d(φ)= x and d(ψ)= y imply tx(φ · ψ)= φ · tx(ψ),

5. d(φ) = x and d(ψ) = y imply φ · ψ = tx∨y(φ) · tx∨y(ψ),

6. x ≤ z and d(φ)= x imply tz(φ)= φ · 1z,

7. d(φ)= x implies φ · 0y = 0x∨y,

8. ty(1x) = 1y.

Proof. 1.) Assume d(φ) = x. Then x⊥z|z (C1) and y ≤ z imply x⊥y|z (C3), and by A4, ty(φ) = ty(tz(φ)). 2.) From z⊥y|y (C1) it follows that x⊥z|y (C2 and C3). Then, by Axiom A4, tz(φ) = tz(ty(φ)). 3.) x⊥z|z (C1) and x ≤ z imply x⊥x|z (C3), and by A4 and A6, φ = tx(φ) = tx(tz(φ)). 4.) x⊥y|x (C1 and C2) implies by A5 and A6 that tx(φ · ψ) = tx(φ) · tx(ψ) = φ · tx(ψ).

5.) x⊥x∨y|x∨y (C1) and y ≤ x∨y imply x⊥y|x∨y (C3), and by A2, A5 and A6, φ · ψ = tx∨y(φ · ψ) = tx∨y(φ) · tx∨y(ψ). 6.) follows from A3 and 3.). 7.) We have x⊥x∨y|x∨y (C1), hence by C3 also x⊥y|x∨y. Using Axioms A2, A3, A5 and A6 we obtain φ · 0y = tx∨y(φ · 0y) = tx∨y(φ) · tx∨y(0y) = tx∨y(φ) · 0x∨y = 0x∨y. 8.) Assume first x ≤ y. Then by A3 and 6.) above, 1y = 1x · 1y = ty(1x). Next assume y ≤ x. Then, by 3.) above and the case just proved, 1y = ty(tx(1y)) = ty(1x). In the general case x⊥y|x∨y (C1, C3), hence by A4 and the two cases just proved, ty(1x) = ty(tx∨y(1x)) = ty(1x∨y) = 1y. ⊓⊔

We use these results in the sequel without further reference to the lemma. Here follow a few further results, illustrating the meaning of irrelevance and conditional independence.

Lemma 3.2 Assume x⊥y|z.

If d(ψ1)= x and d(ψ2)= y, then

tx(ψ1 · ψ2) = ψ1 · tx(tz(ψ2)),

ty(ψ1 · ψ2) = ψ2 · ty(tz(ψ1)).

If d(ψ1)= x, d(ψ2)= y and d(ψ3)= z, then

tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(tz(ψ2) · ψ3),

ty(ψ1 · ψ2 · ψ3) = ψ2 · ty(tz(ψ1) · ψ3).

Proof. 1.) We have tx(ψ1 · ψ2) = ψ1 · tx(ψ2) (Lemma 3.1). But from x⊥y|z, by Axiom A4, tx(ψ2) = tx(tz(ψ2)), hence tx(ψ1 · ψ2) = ψ1 · tx(tz(ψ2)). The second part follows by symmetry. 2.) Here we have tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(ψ2 · ψ3) (Lemma 3.1). From x⊥y|z it follows that x⊥y∨z|z and then, by A4, tx(ψ2 · ψ3) = tx(tz(ψ2 · ψ3)). Further, by Lemma 3.1, tz(ψ2 · ψ3) = tz(ψ2) · ψ3. So, tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(tz(ψ2) · ψ3). ⊓⊔

This lemma shows that if x is conditionally independent of y given z, then only the part of ψ2 belonging to z is needed to combine it with the information ψ1 on x. For instance, if ψ2 with domain y contains no information relative to z, that is tz(ψ2) = 1z, then tx(ψ1 · ψ2) = ψ1 and ψ2 has no effect on domain x. Such results will be exploited for local computation in Chapter 4. Here follows a lemma on idempotent information algebras, to be used later.

Lemma 3.3 Assume d(φ)= x and y ∈ D. If Axiom A7 holds, then

φ · ty(φ) = tx∨y(φ). (3.3)

Proof. By Lemma 3.1, items 5.) and 1.), and Axiom A7,

φ · ty(φ)

= tx∨y(φ) · tx∨y(ty(φ)) = tx∨y(φ) · ty(φ)

= tx∨y(φ) · ty(tx∨y(φ)) = tx∨y(φ).

⊓⊔ Next, we give an important example of a generalised information algebra.

Example 3.1 Subset Algebra on a Family of Compatible Frames: As an example of a generalised information algebra, consider a family of compatible frames (F, R) with the associated conditional independence relation Θ1⊥Θ2|Λ (Section 2.3). Subsets S of a frame Θ ∈ F can be considered as pieces of information in the sense that they restrict the unknown answer in Θ to S. Let

ΦΘ = {(S, Θ) : S ∈ P(Θ)} and

Φ = ⋃_{Θ∈F} ΦΘ.

We consider the system (Φ, F; ⊥, d, ·, t), where (F; ≤, ⊥) is the q-separoid introduced in Section 2.3. The operations d, · and t are defined as follows:

1. Labeling: d : Φ → F, d(S, Θ) = Θ.

2. Combination: · :Φ × Φ → Φ, defined for (S, Θ) and (R, Λ) by

(S, Θ) · (R, Λ) = (τ1(S) ∩ τ2(R), Θ ∨ Λ),

where τ1 and τ2 are the refinings of Θ and Λ to the minimal common refinement Θ ∨ Λ respectively.

3. Transport: t :Φ ×F → Φ, defined for Λ ∈ F and (S, Θ) by

tΛ(S, Θ)=(v(τ(S)), Λ),

where τ is the refining of Θ to the common refinement Θ ∨ Λ and v the saturation operator from Θ ∨ Λ to Λ (see (2.6)).

The system (Φ, F; ⊥, d, ·, t) satisfies the axioms of a generalised information algebra, provided (F; ≤, ⊥) is a q-separoid. This was already shown in (Kohlas & Monney, 1995). Although families of compatible frames were defined there slightly differently than here, the proofs carry over. The semigroup condition (associativity of combination) is stated in Theorem 8.4, A4 is stated in Theorem 8.6 and A5 in Theorem 8.5 of (Kohlas & Monney, 1995).

The other conditions are more or less evident. In addition, this system also satisfies the idempotency axiom. In fact, as mentioned in Section 2.3, the lattice part(U) of partitions of a universe U provides an example of a family of compatible frames. Subsets of blocks of partitions in part(U) then also form an information algebra. That (Φ, F; ⊥, d, ·, t) is a generalised information algebra will also follow later from the fact that it is an instance of a semiring-valued information algebra (see Section 3.3). ⊖
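To make the subset algebra concrete, the following sketch implements it in the multivariate special case, where frames are sets of variables with finite value ranges and the atoms of a frame are the tuples over its variables; transport then reduces to compatibility on the shared variables. All names (VALS, tuples, and so on) are illustrative and not from the text.

```python
from itertools import product

# Multivariate special case of Example 3.1 (an illustrative sketch, not from
# the text): a frame is a set of variable names, each ranging over VALS, and
# a piece of information (S, frame) is a set of atoms (dicts) over the frame.
VALS = {'X': [0, 1], 'Y': [0, 1], 'Z': [0, 1]}

def tuples(frame):
    """All atoms of a frame: assignments of a value to each of its variables."""
    vs = sorted(frame)
    return [dict(zip(vs, combo)) for combo in product(*(VALS[v] for v in vs))]

def combine(S1, f1, S2, f2):
    """(S1,f1)·(S2,f2): natural join on the common refinement f1 ∨ f2."""
    f = f1 | f2
    return [t for t in tuples(f)
            if any(all(t[v] == s[v] for v in f1) for s in S1)
            and any(all(t[v] == s[v] for v in f2) for s in S2)], f

def transport(S, f, target):
    """t_target(S, f): the atoms of 'target' compatible with some atom of S."""
    return [t for t in tuples(target)
            if any(all(t[v] == s[v] for v in f & target) for s in S)], target
```

For instance, combining the constraint X = Y on frame {X, Y} with Y = Z on {Y, Z} and transporting the result to {X, Z} yields exactly the pairs with X = Z, as expected from the relational reading of this example.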

Example 3.2 Set Potentials and Belief Functions: The previous example may be generalised by assigning to the subsets S of a frame Θ some nonnegative numbers m(S). Such an assignment m : P(Θ) → R+ ∪ {0} is called a set potential on the frame Θ. To a set potential on frame Θ we attach the label d(m) = Θ. If m1 and m2 are two set potentials on frames Θ and Λ, then a set potential m on frame Θ ∨ Λ can be defined as follows: For a subset S of frame Θ ∨ Λ, let

m(S) = Σ {m1(S1) m2(S2) : S1 ⊆ Θ, S2 ⊆ Λ, tΘ∨Λ(S1) ∩ tΘ∨Λ(S2) = S}.

Then m is called the combination of m1 and m2 and we write m = m1 · m2. Further, if m is a set potential on frame Θ and Λ is any other frame, then we define a set potential tΛ(m) on frame Λ by

tΛ(m)(S) = Σ {m(T) : T ⊆ Θ, tΛ(T) = S},

for any subset S of Λ. Let ΦΘ denote the set of all set potentials on frame Θ and

Φ = ⋃_{Θ∈F} ΦΘ.

The system (Φ, F; ≤, ⊥, d, ·, t) then forms a generalised information algebra. We refer to (Kohlas & Monney, 1995) and (Kohlas, 2003a) for a verification of the axioms. Set potentials may be transformed into two other set functions,

b(S) = Σ_{T⊆S} m(T), q(S) = Σ_{T⊇S} m(T).

There is a one-to-one relation between the set functions m, b and q. In fact, we have

m(S) = Σ_{T⊆S} (−1)^{|S−T|} b(T) = Σ_{T⊇S} (−1)^{|T−S|} q(T).

For a proof see (Shafer, 1976).

If m(∅) = 0 and

Σ_{S⊆Θ} m(S) = 1,

then the set potential m is called a basic probability assignment (Shafer, 1976), the set function b is the associated belief function, and the set function q the commonality function. This is the formalism of Dempster-Shafer theory, which is described in detail in (Shafer, 1976). ⊖
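As an illustration, here is a minimal sketch of set potentials on a single fixed frame, the special case Θ = Λ in which the transports in the combination formula are identities and combination reduces to intersecting focal sets (the unnormalised Dempster rule); the belief and commonality transforms follow the formulas above. All names are illustrative.

```python
from itertools import combinations

# Set potentials on one fixed frame (illustrative sketch; the special case
# Θ = Λ of the combination formula above).
FRAME = frozenset({'a', 'b', 'c'})

def subsets(frame):
    return [frozenset(c) for r in range(len(frame) + 1)
            for c in combinations(sorted(frame), r)]

def combine(m1, m2):
    """m(S) = Σ { m1(S1) m2(S2) : S1 ∩ S2 = S }, unnormalised Dempster rule."""
    m = {}
    for S1, v1 in m1.items():
        for S2, v2 in m2.items():
            S = S1 & S2
            m[S] = m.get(S, 0.0) + v1 * v2
    return m

def belief(m):
    """b(S) = Σ_{T ⊆ S} m(T)."""
    return {S: sum(v for T, v in m.items() if T <= S) for S in subsets(FRAME)}

def commonality(m):
    """q(S) = Σ_{T ⊇ S} m(T)."""
    return {S: sum(v for T, v in m.items() if T >= S) for S in subsets(FRAME)}
```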

In the next section an important axiomatic system will be presented which is less general than a generalised information algebra, but which induces one under certain circumstances, and which has many models or instances.

3.2 Valuation Algebras

Consider the quasi-separoid (D; ≤, ⊥L), where D is a lattice. If (Φ, D; ≤, ⊥L, d, ·, t) is a (generalised) information algebra, then an alternative axiomatic system can be derived. The operations of labeling and combination remain the same, whereas the transport operation is replaced by the partial operation

πy(φ)= ty(φ), defined for y ≤ d(φ).

This operation is called projection, sometimes also marginalisation. The Semigroup and Labeling Axioms remain. Axiom A3 is reformulated in the following way:

1. If d(φ)= x, then φ · 0x = 0x, φ · 1x = φ,

2. If y ≤ x = d(φ), then πy(φ) = 0y if and only if φ = 0x,

3. If y ≤ x, then πy(1x) = 1y,

4. 1x · 1y = 1x∨y.

For x ≤ y ≤ z = d(φ), we obtain,

πx(φ)= tx(φ)= tx(ty(φ)) = πx(πy(φ)).

This shows that projection can be executed stepwise. Further, if d(φ) = x, d(ψ) = y, then (Lemma 3.1),

πx(φ · ψ) = tx(φ · ψ) = φ · tx(ψ).

But we also have x⊥Ly|x∧y. Then A4 and A3 imply tx(ψ) = tx(tx∧y(ψ)) = tx∧y(ψ) · 1x. Substituting this above yields

πx(φ · ψ)= φ · πx∧y(ψ) · 1x = φ · πx∧y(ψ).

In summary, we have a system (Φ, D; ≤, d, ·,π) with the three operations of labeling, combination and projection, satisfying the following conditions:

S0 Lattice: (D, ≤) is a lattice.

S1 Semigroup: (Φ, ·) is a commutative semigroup.

S2 Labeling: d(φ · ψ)= d(φ) ∨ d(ψ) and d(πx(φ)) = x.

S3 Unit and Null: For all x ∈ D there are elements 0x and 1x with d(0x)= d(1x)= x and such that

1. If d(φ)= x, then φ · 0x = 0x and φ · 1x = φ,

2. If y ≤ x = d(φ), then πy(φ) = 0y if and only if φ = 0x,

3. If y ≤ x, then πy(1x) = 1y,

4. 1x · 1y = 1x∨y.

S4 Projection: If x ≤ y ≤ d(φ), then

πx(φ)= πx(πy(φ)). (3.4)

S5 Combination: If d(φ)= x and d(ψ)= y, then

πx(φ · ψ)= φ · πx∧y(ψ). (3.5)
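The axioms just listed suggest a minimal programming interface for valuation algebras. The sketch below (illustrative only; none of these names come from the text) fixes the multivariate special case, where domains are frozensets of variables, and shows how the Combination Axiom S5 lets a projection of a combination be computed without ever building the larger domain.

```python
from abc import ABC, abstractmethod

class Valuation(ABC):
    """Minimal interface suggested by Axioms S1-S5 (an illustrative sketch).

    Domains are frozensets of variables, so the lattice operations of
    Axiom S0 specialise to x ∨ y = x | y and x ∧ y = x & y.
    """

    @abstractmethod
    def label(self) -> frozenset:
        """d(φ), Axiom S2."""

    @abstractmethod
    def combine(self, other: "Valuation") -> "Valuation":
        """φ · ψ, Axioms S1 and S2."""

    @abstractmethod
    def project(self, x: frozenset) -> "Valuation":
        """π_x(φ), defined for x ≤ d(φ); Axioms S2 and S4."""

def project_combination(phi: Valuation, psi: Valuation) -> Valuation:
    """π_x(φ·ψ) for x = d(φ), computed locally as φ · π_{x∧y}(ψ) (Axiom S5).

    The point of S5: the combination never has to be carried out on the
    large domain x ∨ y, since ψ is first projected to x ∧ y.
    """
    x, y = phi.label(), psi.label()
    return phi.combine(psi.project(x & y))
```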

This corresponds essentially to the axioms introduced in (Shenoy & Shafer, 1990a) for multivariate models, except that Axiom S3 is missing there. A system (Φ, D; ≤, d, ·, π) like this has also been called a (labeled) valuation algebra (Kohlas & Shenoy, 2000; Kohlas, 2003a). A general information algebra with respect to the q-separoid (D; ≤, ⊥L) therefore induces a valuation algebra. The converse is also the case, as we shall see below. If the Idempotency Axiom

S6 Idempotency: If x ≤ d(φ), then

φ · πx(φ) = φ (3.6)

is added, then a valuation algebra is also called a (labeled) information algebra in (Kohlas, 2003a). This is then a special case of an idempotent generalised information algebra in the present sense. There are many examples or instances of valuation algebras known. Initial examples include belief and possibility functions and relational algebra (relational databases). Probability potentials, related to Bayesian networks, satisfy the axioms without S3 (Shenoy & Shafer, 1990a). We refer to (Kohlas, 2003a; Pouly & Kohlas, 2011) for many more models of valuation algebras. There are various alternative axiomatic systems for valuation algebras. In particular, there are valuation algebras where Axiom S3 is removed, that is, no unit and null elements are assumed. Or we consider also valuation algebras where a weaker axiom about unit elements is assumed:

S3’ For all x ∈ D there are elements 1x with d(1x)= x and such that

1. If d(φ)= x, then φ · 1x = φ.

2. 1x · 1y = 1x∨y.

Further, there are valuation algebras, where the Combination Axiom S5 is satisfied in a stronger version:

S5’ If d(φ)= x, d(ψ)= y, and x ≤ z ≤ x ∨ y, then

πz(φ · ψ)= φ · πy∧z(ψ).

Item 3 of Axiom S3 is called stability and a valuation algebra satisfying S3 is called stable. It is essential to recover a (generalised) information algebra from a valuation algebra, as will be seen below. If S3 is removed, or replaced by S3’, stability is lost, and there is no generalised information algebra associated with the valuation algebra. Note also that if the lattice (D; ≤) is distributive, then S5’ follows from S5. In fact, we have, if d(φ)= x, d(ψ)= y, and x ≤ z ≤ x ∨ y,

φ · ψ = φ · ψ · 1x∨y = φ · ψ · 1z · 1x∨y = φ · ψ · 1z.

Therefore, we obtain from the Combination Axiom S5,

πz(φ · ψ) = πz((φ · 1z) · ψ) = (φ · 1z) · πy∧z(ψ).

But distributivity of the lattice (D; ≤) implies x ∨ (y∧z) = (x∨y) ∧ (x∨z) = z, so that φ · πy∧z(ψ) has label z and the factor 1z is absorbed by Axiom S3; hence indeed πz(φ · ψ) = φ · πy∧z(ψ). Note finally that, if d(φ) = x, d(ψ) = y and x ≤ z ≤ y, then S5' implies that πz(φ · ψ) = φ · πz(ψ). The same result holds also for any valuation algebra with unit elements, as stated in the following lemma:

Lemma 3.4 Let (Φ, D; ≤, d, ·,π) be a valuation algebra, either satisfying Axioms S0 to S5 or S3’ instead of S3, or else without unit elements, but satisfying S5’ instead of S5. In all these cases, if d(φ) = x, d(ψ) = y and x ≤ z ≤ y, then

πz(φ · ψ) = φ · πz(ψ).

Proof. In the case of existing units satisfying S3 or S3', we have πz(φ · ψ) = πz((φ · 1z) · ψ) and then, by the Combination Axiom S5, πz((φ · 1z) · ψ) = φ · 1z · πz(ψ) = φ · πz(ψ). In the case without units but with S5', note that z ≤ y implies y ∧ z = z, so S5' gives πz(φ · ψ) = φ · πy∧z(ψ) = φ · πz(ψ) directly. ⊓⊔

We show now that any stable valuation algebra induces a general information algebra. So, let (Φ, D; ≤, d, ·, π) be such a valuation algebra. We consider the q-separoid (D; ≤, ⊥L) where D is a lattice, and where x⊥Ly|z iff (x∨z) ∧ (y∨z) = z, see Section 2.1. We are going to extend the projection operation π, which is a partial transport operation defined only for domains x ≤ d(φ), to a full transport operation in two steps. First, for y ≥ d(φ) we define

ey(φ)= φ · 1y.

Then ey(φ) has label y. It is thus an extension of φ to a larger domain. By the Combination Axiom S5 and by Axiom S3, πx(ey(φ)) = πx(φ · 1y)= φ · πx(1y) = φ · 1x = φ (stability!). So we see that we may recover φ from its extension. The extension ey(φ) is therefore called the vacuous extension (Shafer, 1991; Kohlas, 2003a). Then, the transport operation is defined for any y ∈ D by first vacuously extending φ from its domain x to x ∨ y and then projecting this extension back to y. So, we define

ty(φ)= πy(ex∨y(φ)), where d(φ)= x. (3.7)

Note that for y ≤ d(φ) the transport operation is projection, ty(φ)= πy(φ) and for y ≥ d(φ) it is vacuous extension, ty(φ)= ey(φ). We remark that the transport operation can equivalently also be defined by

ty(φ)= ey(πx∧y(φ)). (3.8)

For a proof of this alternative way to compute the transport operation, we refer to (Kohlas, 2003a). A useful property of the transport operation as defined above, is that if y ≤ z, then

ty(φ)= ty(tz(φ)). (3.9)

Again we refer to (Kohlas, 2003a) for the proof. We are now going to show that this transport operation in a valuation algebra satisfies conditions A3, A4, A5 and A6 of a generalised information algebra. Axioms A0, A1, A2 are inherited directly from the valuation algebra.
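Definitions (3.7) and (3.8) translate directly into code. The following sketch builds on the hypothetical Valuation interface above and additionally assumes a method extend(z) implementing the vacuous extension e_z(φ) = φ · 1_z for z ⊇ d(φ); this method is an assumption of the sketch, not part of the text.

```python
def transport(phi, y: frozenset):
    """t_y(φ) = π_y(e_{x∨y}(φ)) with x = d(φ), following definition (3.7).

    Assumes a hypothetical method phi.extend(z) returning the vacuous
    extension e_z(φ) = φ · 1_z for z ⊇ d(φ); stability guarantees that
    φ can be recovered from its extension.
    """
    x = phi.label()
    return phi.extend(x | y).project(y)

def transport_alt(phi, y: frozenset):
    """The equivalent form (3.8): t_y(φ) = e_y(π_{x∧y}(φ))."""
    x = phi.label()
    return phi.project(x & y).extend(y)
```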

Theorem 3.1 Let (Φ, D; ≤, d, ·,π) be a stable valuation algebra. Then

1. ty(φ) = 0y if and only if φ = 0d(φ),

2. d(φ) = y implies φ · 1x = tx∨y(φ).

Proof. 1.) Assume first that φ = 0x. Then, by the Combination Axiom S5 and Axiom S3, tx(tx∨y(0x)) = πx(tx∨y(0x)) = πx(0x · 1x∨y) = 0x · πx(1x∨y) = 0x · 1x = 0x, hence by Axiom S3, tx∨y(0x) = 0x∨y. Further, ty(0x∨y) = πy(0x∨y), and using Axioms S3 and S5 it follows that πy(0x∨y) = πy(0y · 1x∨y) = 0y · πy(1x∨y) = 0y · 1y = 0y. Conversely, assume d(φ) = x and ty(φ) = 0y. Then ty(φ) = πy(tx∨y(φ)) = 0y. From Axiom S3 we then obtain tx∨y(φ) = 0x∨y. Again by S3, φ = πx(tx∨y(φ)) = πx(0x∨y) = 0x. 2.) We have by Axioms S2 and S3, φ · 1x = φ · 1x · 1x∨y = φ · 1x∨y = tx∨y(φ). ⊓⊔

This shows that Axiom A3 is valid. Axioms A4, A5 and A6 are asserted in the following theorem.

Theorem 3.2 Let (Φ, D; ≤, d, ·,π) be a stable valuation algebra. Then

1. x⊥Ly|z and d(φ)= x imply ty(φ)= ty(tz(φ)).

2. x⊥Ly|z and d(φ)= x, d(ψ)= y imply tz(φ · ψ)= tz(φ) · tz(ψ).

3. d(φ)= x implies tx(φ)= φ.

Proof. 1.) Assume d(φ) = x. Define ψ = tx∨z(φ), so that d(ψ) = x ∨ z. By the alternative definition of the transport operation (3.8), and x⊥Ly|z,

ty∨z(ψ) = ey∨z(π(x∨z)∧(y∨z)(ψ)) = ey∨z(πz(ψ)) = ty∨z(tz(ψ)). (3.10)

Now, by the definition of the transport operation and reversibility of vacuous extension, we have

ty(φ) = πy(ex∨y(φ)) = πy(πx∨y(ex∨y∨z(ex∨y(φ)))).

Like projection, vacuous extension can also be executed stepwise (see Axiom S4 and (3.9)), such that ty(φ) = πy(ex∨y∨z(φ)) = πy(πy∨z(ex∨y∨z(ex∨z(φ)))) = πy(ty∨z(tx∨z(φ))). Using (3.10), we obtain

ty(φ)= ty(ty∨z(tz(tx∨z(φ)))). Since z ≤ y ∨ z,x ∨ z, we use (3.9) twice to conclude finally

ty(φ)= ty(tz(tx∨z(φ))) = ty(tz(φ)). This proves the first part of the theorem. 2.) Assume d(φ) = x and d(ψ) = y. Using Axiom S3, it follows that φ · ψ · 1z = φ · ψ · 1x∨y · 1z = φ · ψ · 1x∨y∨z = ex∨y∨z(φ · ψ). So, by definition of the transport operation,

tz(φ · ψ) = πz(φ · ψ · 1z) = πz((φ · 1z) · (ψ · 1z)).

Now by Axiom S4, since z ≤ x ∨ z ≤ x ∨ y ∨ z,

tz(φ · ψ)= πz(πx∨z((φ · 1z) · (ψ · 1z))).

As d(φ · 1z )= x ∨ z and d(ψ · 1z)= y ∨ z, we may apply S5 and obtain, using x⊥Ly|z,

tz(φ · ψ)= πz((φ · 1z) · πz(ψ · 1z)). Again, by S5, and then the definition of the transport operation we conclude

tz(φ · ψ) = πz(φ · 1z) · πz(ψ · 1z) = tz(φ) · tz(ψ).

This proves the second item of the theorem. 3.) If d(φ) = x, then d(tx(φ)) = d(πx(ex(φ))) = d(ex(φ)) = x, by Axiom S2 and the definition of vacuous extension. This verifies the third item of the theorem. ⊓⊔

In summary, we have shown that a stable valuation algebra induces a generalised information algebra.

Theorem 3.3 If (Φ, D; ≤, d, ·, π) is a stable valuation algebra, then (Φ, D; ≤, ⊥L, d, ·, t), where the operation t is defined by (3.7), is a generalised information algebra.

We give here and in the next section important examples of valuation algebras, of different sorts. More examples may be found in (Pouly & Kohlas, 2011).

Example 3.3 Densities: This example is based on the multivariate model of domains (see Section 2.2). Let (D; ⊆) be the lattice of finite subsets of ω = {1, 2, . . .}. We consider here the linear vector spaces Rs of real-valued tuples x : s → R, where s is a finite subset of ω. On a space Rs we consider nonnegative functions f : Rs → R+ ∪ {0} whose integrals

∫_{−∞}^{+∞} f(x) dx (3.11)

exist and are finite. To simplify, we consider continuous functions and Riemann integrals; it would also be possible to consider measurable functions and Lebesgue integrals (Kohlas, 2003a). Such functions are called densities on Rs. We define the operations of a valuation algebra for densities as follows:

1. Labeling: d(f) = s if f is a density on Rs.

2. Combination: For densities f and g with d(f) = s and d(g) = t and x ∈ Rs∪t,

(f · g)(x)= f(xs) × g(xt),

where xs and xt denote the tuples of components of x in s and t respectively.

3. Projection: For a density f with d(f) = s, t ⊆ s, and xt ∈ Rt,

(πt(f))(xt) = ∫_{−∞}^{+∞} f(x) dx_{s−t}.

It is straightforward to verify the axioms of a valuation algebra for this system, except Axioms S3 or S3'. There are no unit elements, since the function h(x) = 1 for all x ∈ Rs is not finitely integrable, hence no density. However, the strong Combination Axiom S5' is satisfied for densities. This valuation algebra is related to probabilistic nets of conditional density functions (see (Kohlas, 2003a)). ⊖

Here follows another example of a valuation algebra.

Example 3.4 Gaussian Potentials: Gaussian densities are of particular interest in applications. A multivariate Gaussian density over a set s of variables is defined by

f(x) = (2π)^{−n/2} (det Σ)^{−1/2} e^{−(1/2)(x−µ)′Σ^{−1}(x−µ)},

where n = |s|. Here µ is a vector in Rs (see the previous example), and Σ is a symmetric positive definite s × s matrix. The vector µ is the expected value vector of the density and Σ the variance-covariance matrix. The matrix K = Σ^{−1} is called the concentration matrix of the density. It is also symmetric and positive definite. A Gaussian density may be represented or determined by the pair (µ, K). Each such pair with µ an s-vector and K a symmetric positive definite s × s matrix determines a Gaussian density. Gaussian densities belong to the valuation algebra of densities defined in the previous example. Labeling, combination and projection can however now be expressed in terms of the pairs (µ, K). For this purpose, if t ⊇ s and µ is an s-vector, K an s × s matrix, let µ^{↑t} and K^{↑t} be the vector or matrix obtained by adding to µ and K zero entries for all indices in t − s. Further, if t ⊆ s, then let µt and Kt,t be the subvector or submatrix of µ and K respectively with components in t. We then define the following operations on pairs (µ, K):

1. Labeling: d(µ, K) = s if µ is an s-vector and K an s × s matrix.

2. Combination: For pairs (µ1, K1) and (µ2, K2) with d(µ1, K1)= s and d(µ2, K2)= t,

(µ1, K1) · (µ2, K2) = (µ, K) with

K = K1^{↑s∪t} + K2^{↑s∪t}

and

µ = K^{−1} (K1^{↑s∪t} µ1^{↑s∪t} + K2^{↑s∪t} µ2^{↑s∪t}).

3. Projection: For a pair (µ, K) with d(µ, K)= s, t ⊆ s,

πt(µ, K) = (µt, ((K^{−1})_{t,t})^{−1}).

This is justified by the fact, that the combination of two Gaussian densities results again in a Gaussian density (after normalisation), and so does pro- jection of Gaussian density. We refer to (Kohlas, 2003a) for more details. Again, the algebra of Gaussian potentials has no unit elements, but satisfies the strong Combination Axiom S5’ As for densities, this valuation algebra is useful for local computation with nets of conditional Gaussian density func- tions. For another application of this valuation algebra to linear systems with Gaussian disturbances we refer to (Pouly & Kohlas, 2011). ⊖
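A small numerical sketch of Gaussian potentials using numpy, with variables kept in sorted order so that the lifted vectors and matrices align; this illustrates the formulas above and is not a tuned implementation, and all function names are ours.

```python
import numpy as np

# Gaussian potentials (µ, K) over sorted variable lists (illustrative sketch;
# µ and K are numpy arrays, 'dom' lists the variables in vector order).

def lift(mu, K, dom, target):
    """Zero-pad µ and K from dom to the larger domain target (µ↑t, K↑t)."""
    idx = [target.index(v) for v in dom]
    mu_t = np.zeros(len(target))
    K_t = np.zeros((len(target), len(target)))
    mu_t[idx] = mu
    K_t[np.ix_(idx, idx)] = K
    return mu_t, K_t

def combine(mu1, K1, dom1, mu2, K2, dom2):
    """Combination: K = K1↑ + K2↑ and µ = K⁻¹(K1↑µ1↑ + K2↑µ2↑)."""
    dom = sorted(set(dom1) | set(dom2))
    m1, A1 = lift(mu1, K1, dom1, dom)
    m2, A2 = lift(mu2, K2, dom2, dom)
    K = A1 + A2
    # K is positive definite: the kernels of the two lifted matrices intersect
    # only in 0, since every variable of dom occurs in dom1 or dom2.
    mu = np.linalg.solve(K, A1 @ m1 + A2 @ m2)
    return mu, K, dom

def project(mu, K, dom, target):
    """Projection: π_t(µ, K) = (µ_t, ((K⁻¹)_{t,t})⁻¹) for t ⊆ dom."""
    idx = [dom.index(v) for v in target]
    Sigma = np.linalg.inv(K)
    return mu[idx], np.linalg.inv(Sigma[np.ix_(idx, idx)]), list(target)
```

Positive definiteness of K1 and K2 on their own domains guarantees that the lifted sum K is again positive definite, so the linear solve in combine is always well defined.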

Most of the studies about local computation have been made with respect to valuation algebras in the multivariate framework, as discussed for instance in (Lauritzen & Spiegelhalter, 1988; Shenoy & Shafer, 1990a; Kohlas, 2003a). We shall show in Section 4 how local computation can be defined in the present more general framework of generalised information algebras. More examples, in fact a whole generic class of examples of valuation and information algebras, and not only in the multivariate framework, are given in the next section.

3.3 Semiring Information Algebras

In this section we introduce a large class of information and valuation algebras based on semirings. This is an extension of the class of algebras described in (Kohlas & Wilson, 2006), based on multivariate models. The elements of the algebra, the pieces of information, will be mappings from a frame of a family of compatible frames to a semiring. Many well-known instances of valuation or information algebras are of this kind. To start, we recall the definition of semirings.

Definition 3.1 Semirings: Let A be any set with two binary operations + : A × A → A and × : A × A → A. Then the signature (A; +, ×) is a commutative semiring, if

1. (A; +) and (A; ×) are commutative semigroups,

2. × distributes over +, that is a × (b + c) = (a × b) + (a × c) for all a, b, c ∈ A.

In general, for a semiring (A; +, ×), multiplication is not necessarily assumed to be commutative; but we shall here only consider commutative semirings. So, in the sequel, when we speak of semirings we always assume them to be commutative. The concept of a semiring is a weakening of the concept of a ring. Rings and fields are all also semirings. Further examples will be mentioned below. If there exists an element 0 ∈ A such that a + 0 = a and a × 0 = 0, then 0 is called a null element and the semiring (A; +, ×, 0) a semiring with null element. A null element is always unique, and if A has no null element it can always be adjoined. If there exists an element 1 ∈ A such that 1 × a = a × 1 = a, then 1 is called a unit element and the semiring (A; +, ×, 1) a semiring with unit element. If (A; +, ×, 0) is a semiring with null element and a + b = 0 always implies a = b = 0, then A is called positive. If addition is idempotent in a semiring (A; +, ×), that is a + a = a for all a ∈ A, then a unit element can be adjoined (see (Kohlas & Wilson, 2006)). If (A; +, ×, 1) has a unit element such that 1 + 1 = 1, then a + a = a × (1 + 1) = a and addition is idempotent. For instance, if (A; +, ×, 0, 1) is a semiring with null and unit elements such that a + 1 = 1, then addition is idempotent. In this case A is called a c-semiring (constraint semiring). Finally, if A is a c-semiring and multiplication is also idempotent, then (A; +, ×, 0, 1) is a bounded distributive lattice with addition as join and multiplication as meet. Here follow a few examples of important semirings.

Example 3.5 Arithmetic Semirings: Integers, rational numbers and real numbers form semirings (the latter two even fields) under ordinary addition and multiplication. Nonnegative or positive integers, rational or real numbers are still semirings; these semirings are positive and they all have unit elements and null elements (to be adjoined for positive numbers). ⊖

Example 3.6 Boolean Semiring: Consider A = {0, 1} and define the operation + as a + b = max{a, b} and × as a × b = min{a, b}. This is a semiring with 0 as null and 1 as unit element. In addition, 0 + 1 = 1 + 1 = 1, so it is a c-semiring. It is used to describe the valuation algebra of constraint systems and relational algebra (Kohlas, 2003a; Kohlas & Wilson, 2006). ⊖
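The semiring operations can be packaged as plain data; the following sketch (illustrative names) encodes the Boolean semiring of Example 3.6 and the arithmetic semiring of Example 3.5, together with a sample-based check of idempotency of addition.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (A; +, ×, 0, 1) packaged as data."""
    add: Callable
    mul: Callable
    zero: object
    one: object

# The Boolean semiring of this example: + is max, × is min on {0, 1}.
BOOLEAN = Semiring(add=max, mul=min, zero=0, one=1)

# The arithmetic semiring of nonnegative reals (Example 3.5).
ARITH = Semiring(add=lambda a, b: a + b, mul=lambda a, b: a * b,
                 zero=0.0, one=1.0)

def idempotent_addition(sr, samples):
    """Check a + a = a on sample elements; c-semirings have this property."""
    return all(sr.add(a, a) == a for a in samples)

assert idempotent_addition(BOOLEAN, [0, 1])
assert not idempotent_addition(ARITH, [1.0, 2.0])
```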

Further examples, including bottleneck algebras, (max/min, +)-semirings, t-norms and distributive lattices, are to be found in (Kohlas & Wilson, 2006; Pouly & Kohlas, 2011), and of course in any book on semirings. Now we introduce semiring valuations. As a preparation, consider finite, disjoint index sets Ij for j = 1, . . . , n and let I = I1 ∪ . . . ∪ In. Then, due to commutativity and associativity of addition,

Σ_{j=1}^{n} Σ_{i∈Ij} ai = Σ_{i∈I} ai.

This will be used in the sequel without further reference.

Let now (F, R) be a family of compatible frames, whose frames Θ ∈ F are all finite. Let further the relation Θ1⊥Θ2|Λ be the associated conditional independence relation forming a q-separoid. We call the elements θ of a frame Θ its atoms. If Λ is a coarsening of Θ and τ : Λ → Θ the corresponding refining, then there is exactly one atom λ in Λ, compatible with θ, namely the atom λ such that θ ∈ τ(λ). In other words, we have v({θ}) = {λ}. We define now

tΛ(θ) = v({θ}). (3.12)

It is called the projection of the atom θ to the coarser frame Λ. Suppose next that Λ is a refinement of Θ with the refining τ : Θ → Λ. Now, the atoms in τ(θ) are all compatible with θ and we define

tΛ(θ) = τ(θ). (3.13)

In the general case of two arbitrary frames Θ and Λ, the atoms λ in Λ are compatible with an atom θ in the frame Θ if λ ∈ Rθ(Λ) = {λ : τ(θ) ∩ µ(λ) ≠ ∅}, where τ and µ are the refinings of Θ and Λ to Θ ∨ Λ respectively (see Section 2.3). Therefore, we define in the general case

tΛ(θ) = v(τ(θ)) = Rθ(Λ), (3.14)

where v is the saturation operator associated with the refining µ. This is the restriction of the general transport operation of subsets of frames to one-element sets (see Section 3.1). Since the subset algebra of a f.c.f. is a generalised information algebra, it satisfies the Transport Axiom A4 (Section 3.1). We restate this axiom as an important result for the transport of atoms:

Theorem 3.4 Assume θ ∈ Θ and Θ⊥Λ1|Λ2. Then

tΛ1(θ) = tΛ1(tΛ2(θ)). (3.15)

These transport operations become important below. Since a frame can be considered as representing possible answers to some question, atoms being precise answers, these transport operations of atoms determine answers in frames Λ compatible with a precise answer θ in a frame Θ. So transport of atoms has an important information-theoretic meaning. For a semiring (A; +, ×) we now define A-valuations on a frame Θ to be mappings φ : Θ → A from a frame Θ of F into the semiring A. Let ΦΘ be the set of all A-valuations on the frame Θ and define

Φ = ⋃_{Θ∈F} ΦΘ,

the set of all A-valuations in the f.c.f. (F, R). Such A-valuations have been considered in (Kohlas & Wilson, 2006) for the special case of multivariate models. The results obtained there can be extended to the present more general case of A-valuations in a f.c.f. First, we define the following operations for A-valuations, using the transport operations of atoms defined above:

1. Labeling: d(φ) = Θ if φ is an A-valuation on Θ.

2. Combination: If φ, ψ are A-valuations with d(φ) = Θ and d(ψ) = Λ, then for ζ ∈ Θ ∨ Λ, an A-valuation φ · ψ on frame Θ ∨ Λ is defined by

φ · ψ(ζ)= φ(tΘ(ζ)) × ψ(tΛ(ζ)). (3.16)

3. Transport: If φ is an A-valuation with d(φ) = Θ and λ ∈ Λ, then an A-valuation tΛ(φ) on frame Λ is defined by

tΛ(φ)(λ) = Σ_{θ∈tΘ(λ)} φ(θ). (3.17)

The idea behind combination is that in the combined A-valuation φ · ψ, the value of ζ in the supremum or combined frame Θ ∨ Λ equals the product of the values of the compatible atoms in the frames Θ and Λ respectively. For the transport of an A-valuation to a frame Λ, the value of an atom in Λ is the sum of the values of the compatible atoms in the original frame Θ. We shall see in examples below that this makes sense in all applications. How far are the properties of an information algebra satisfied by these operations on A-valuations? Clearly, the Labeling Axiom A2 of an information algebra is satisfied by the definition of combination and transport,

d(φ · ψ)= d(φ) ∨ d(ψ), d(tΛ(φ)) = Λ. Further, by assumption (F; ≤, ⊥) is a q-separoid, hence Axiom A0 is valid. The Identity Axiom A6 is also valid since if d(φ) = Θ, then tΘ(θ) = θ for all θ ∈ Θ. Next, we show that combination is both commutative and associative.

Theorem 3.5 With the definition of combination above, (Φ; ·) is a commu- tative semigroup.

Proof. Commutativity of combination follows, by the definition of combination, from commutativity of semiring multiplication. To show associativity, consider A-valuations φ1, φ2 and φ3 on frames Θ1, Θ2 and Θ3 respectively. Let θ be an atom in Θ1 ∨ Θ2 ∨ Θ3. Then we have

(φ1 · φ2) · φ3(θ)

= φ1 · φ2(tΘ1∨Θ2 (θ)) × φ3(tΘ3 (θ))

= (φ1(tΘ1(tΘ1∨Θ2(θ))) × φ2(tΘ2(tΘ1∨Θ2(θ)))) × φ3(tΘ3(θ)).

But from Θ1⊥Θ2|Θ1 ∨ Θ2 it follows (Theorem 3.4) that

tΘ1 (tΘ1∨Θ2 (θ)) = tΘ1 (θ), tΘ2 (tΘ1∨Θ2 (θ)) = tΘ2 (θ). So, we obtain

(φ1 · φ2) · φ3(θ) = (φ1(tΘ1 (θ)) × φ2(tΘ2 (θ))) × φ3(tΘ3 (θ)). We obtain in the same way

φ1 · (φ2 · φ3)(θ)= φ1(tΘ1 (θ)) × (φ2(tΘ2 (θ)) × φ3(tΘ3 (θ))). (3.18)

Since multiplication in A is associative, it follows that (φ1 · φ2) · φ3 = φ1 · (φ2 · φ3). ⊓⊔

So, the Semigroup Axiom A1 is satisfied. It is evident that each (ΦΘ; ·) is a subsemigroup of (Φ; ·). If the semiring A has a null element 0, then the A-valuations 0Θ defined by

0Θ(θ) = 0 for all θ ∈ Θ

are the null elements in the semigroups (ΦΘ; ·). Similarly, if the semiring A has a unit element, then the A-valuations 1Θ defined as

1Θ(θ) = 1 for all θ ∈ Θ

are unit elements in the semigroups (ΦΘ; ·). These particular A-valuations have the following properties:

Theorem 3.6 If the semiring A has a null element and a unit element, then

1. φ · 1Λ = tΘ∨Λ(φ) if d(φ)=Θ.

2. 1Θ · 1Λ = 1Θ∨Λ.

3. φ · 0Λ = 0Θ∨Λ if d(φ) = Θ.

Proof. 1.) Let ζ be an atom in Θ ∨ Λ. Then we have

φ · 1Λ(ζ) = φ(tΘ(ζ)) × 1Λ(tΛ(ζ)) = φ(tΘ(ζ)) × 1 = φ(tΘ(ζ)) = tΘ∨Λ(φ)(ζ).

2.) and 3.) Again, for ζ ∈ Θ ∨ Λ, we have

1Θ · 1Λ(ζ) = 1Θ(tΘ(ζ)) × 1Λ(tΛ(ζ)) = 1 × 1 = 1. and

φ · 0Λ(ζ) = φ(tΘ(ζ)) × 0Λ(tΛ(ζ)) = φ(tΘ(ζ)) × 0 = 0.

This concludes the proof. ⊓⊔

The following result is more profound and states that the Combination Axiom A5 is valid.

Theorem 3.7 Consider A-valuations φ and ψ with d(φ)=Θ1 and d(ψ)= Θ2. Assume Θ1⊥Θ2|Λ. Then,

tΛ(φ · ψ)= tΛ(φ) · tΛ(ψ).

Proof. Consider an atom λ ∈ Λ. Then we have by definition of transport and combination

tΛ(φ · ψ)(λ) = Σ_{ζ∈tΘ1∨Θ2(λ)} φ · ψ(ζ) = Σ_{ζ∈tΘ1∨Θ2(λ)} φ(tΘ1(ζ)) × ψ(tΘ2(ζ)),

tΛ(φ) · tΛ(ψ)(λ) = tΛ(φ)(λ) × tΛ(ψ)(λ).

Now, tΘ1(λ) = Rλ(Θ1) and tΘ2(λ) = Rλ(Θ2). Further, we claim that

Rλ(Θ1, Θ2) = {(θ1, θ2) ∈ Θ1 × Θ2 : θ1 = tΘ1(ζ), θ2 = tΘ2(ζ) for some ζ ∈ tΘ1∨Θ2(λ)}. (3.19)

But Θ1⊥Θ2|Λ implies Rλ(Θ1, Θ2) = Rλ(Θ1) × Rλ(Θ2). So, once the claim above is proved, we conclude that

tΛ(φ · ψ)(λ) = Σ_{θ1∈Rλ(Θ1), θ2∈Rλ(Θ2)} φ(θ1) × ψ(θ2)
= (Σ_{θ1∈Rλ(Θ1)} φ(θ1)) × (Σ_{θ2∈Rλ(Θ2)} ψ(θ2))
= tΛ(φ)(λ) × tΛ(ψ)(λ)
= tΛ(φ) · tΛ(ψ)(λ).

This shows that tΛ(φ · ψ) = tΛ(φ) · tΛ(ψ). It remains to verify the claim above. For this purpose let τ1′ and τ2′ be the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2 respectively, τ1 and τ2 the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2 ∨ Λ, τ the refining of Λ to Θ1 ∨ Θ2 ∨ Λ, µ the refining of Θ1 ∨ Θ2 to Θ1 ∨ Θ2 ∨ Λ, and finally v the outer reduction associated with µ. Then by definition

Rλ(Θ1, Θ2) = {(θ1, θ2) : τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅}.

Now, τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅ is equivalent to

µ(τ1′(θ1)) ∩ µ(τ2′(θ2)) ∩ τ(λ) = µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) ≠ ∅.

Consider an atom ζ ∈ tΘ1∨Θ2(λ) and atoms θ1 = tΘ1(ζ), θ2 = tΘ2(ζ), such that (θ1, θ2) belongs to the set on the right-hand side of (3.19). This means that ζ ∈ τ1′(θ1) ∩ τ2′(θ2) = τ1′(tΘ1(ζ)) ∩ τ2′(tΘ2(ζ)) and µ(ζ) ∩ τ(λ) ≠ ∅. From this it follows that

µ(τ1′(tΘ1(ζ)) ∩ τ2′(tΘ2(ζ))) ∩ τ(λ) ⊇ µ(ζ) ∩ τ(λ) ≠ ∅.

So, (θ1, θ2) ∈ Rλ(Θ1, Θ2) and the right-hand side of (3.19) is contained in Rλ(Θ1, Θ2). Conversely, consider a pair (θ1, θ2) ∈ Rλ(Θ1, Θ2), so that µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) ≠ ∅. This implies τ1′(θ1) ∩ τ2′(θ2) ≠ ∅. So we may select an atom ω ∈ µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) and then an atom ζ ∈ v(ω) ⊆ τ1′(θ1) ∩ τ2′(θ2). But this means that θ1 = tΘ1(ζ) and θ2 = tΘ2(ζ) and ω ∈ µ(ζ) ∩ τ(λ) ≠ ∅, which implies ζ ∈ tΘ1∨Θ2(λ). So Rλ(Θ1, Θ2) is contained in the right-hand side of (3.19), and this proves the claim (3.19). ⊓⊔

So far we have verified that A-valuations satisfy some axioms of a generalised information algebra. However, the Transport Axiom A4 needs an additional property of the semiring. And further, the property that tΛ(1Θ) = 1Λ (see Lemma 3.1) is not necessarily satisfied for an arbitrary semiring. A sufficient condition for this is that addition in the semiring (A; +, ×, 0, 1) is idempotent. Then we have, from the definition of transport,

tΛ(1Θ)(λ) = Σ_{θ∈tΘ(λ)} 1 = 1.

And this condition is also sufficient for the validity of the Transport Axiom A4.

Theorem 3.8 If addition in the semiring (A; +, ×, 0, 1) is idempotent, then for all A-valuations φ with d(φ) = Θ, Θ⊥Λ1|Λ2 implies

tΛ1 (φ)= tΛ1 (tΛ2 (φ)).

Proof. For an atom λ ∈ Λ1 we have

tΛ1(tΛ2(φ))(λ) = Σ_{λ′∈tΛ2(λ)} tΛ2(φ)(λ′) = Σ_{λ′∈tΛ2(λ)} Σ_{θ∈tΘ(λ′)} φ(θ).

Now, Θ⊥Λ1|Λ2 implies tΘ(λ) = tΘ(tΛ2(λ)) (Theorem 3.4). Let Iλ′ = tΘ(λ′) and

I = ⋃_{λ′∈tΛ2(λ)} Iλ′.

We claim that I = tΘ(λ). Assume first θ ∈ I, so that θ ∈ Iλ′ for some λ′ ∈ tΛ2(λ). But then, since tΘ(λ′) ⊆ tΘ(λ), it follows that θ ∈ tΘ(λ). Conversely, consider an atom θ ∈ tΘ(λ) = tΘ(tΛ2(λ)). This implies that there is a λ′ ∈ tΛ2(λ) such that θ ∈ tΘ(λ′), hence θ ∈ I. So, indeed, I = tΘ(λ). It follows therefore, using idempotency of addition in the semiring A, that

Σ_{λ′∈tΛ2(λ)} Σ_{θ∈tΘ(λ′)} φ(θ) = Σ_{θ∈tΘ(λ)} φ(θ) = tΛ1(φ)(λ).

This shows that tΛ1(φ) = tΛ1(tΛ2(φ)). ⊓⊔

In particular, if Θ ≤ Λ, then Θ⊥Θ|Λ and therefore, by this theorem, tΘ(tΛ(φ)) = tΘ(φ) = φ if d(φ) = Θ. This means that tΛ(φ) may be considered as a vacuous extension of φ, since φ can be retrieved from its extension (see Section 3.2). Now we have nearly all Axioms A1 to A6 of a generalised information algebra. The last item missing is Axiom A3 2.), which states that tΛ(φ) = 0Λ implies φ = 0Θ, if d(φ) = Θ. For this it is sufficient that the semiring A is positive. This, together with the results above, proves the following theorem:

Theorem 3.9 Let (A; +, ×, 0, 1) be a positive semiring with idempotent addition. Then (Φ, F; ≤, ⊥, d, ·, t), where Φ is the set of A-valuations, with labeling, combination and transport as defined above, is a generalised information algebra.

We note that, even if the semiring A is not positive, all axioms except Axiom A3 2.) are satisfied, as long as addition is idempotent. If we scan the examples of semirings mentioned above, then we find the following examples, which induce generalised information algebras of A-valuations.

Example 3.7 Set Algebra: Consider the Boolean semiring A = {0, 1}. It is positive and addition (max) is idempotent. Here A-valuations φ are 0-1-functions (indicator functions) on the associated frame d(φ) = Θ. Such an indicator function defines a subset {θ : φ(θ) = 1} of the frame. Combination and transport of these indicator functions correspond exactly to combination and transport of the associated subsets as defined in Section 3.1. So, we retrieve with this Boolean semiring valuation algebra the subset algebra on a f.c.f. This covers, in the multivariate case, constraint systems, and gives a subset of relational algebra (Kohlas, 2003a), which is useful in query processing and for constraint solving. Constraints may also be formulated by formulae of propositional or predicate logic. In this sense it is also related to inference in Boolean logic, see (Kohlas, 2003a). ⊖
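A sketch of A-valuations in the multivariate special case, where a frame is a tuple of variables and its atoms are the value assignments; combination follows (3.16) and transport follows (3.17). The parameter sr stands for any semiring object with add, mul and zero fields, such as the Semiring sketch given earlier; with BOOLEAN it reproduces the subset algebra of this example. All names are illustrative.

```python
from itertools import product

# A-valuations in the multivariate special case (illustrative sketch): a frame
# is a tuple of variable names, its atoms are value assignments, and a
# valuation maps atom keys (value tuples) into the semiring A.
VALS = {'X': (0, 1), 'Y': (0, 1)}

def atoms(frame):
    return [dict(zip(frame, c)) for c in product(*(VALS[v] for v in frame))]

def key(atom, frame):
    return tuple(atom[v] for v in frame)

def combine(sr, phi, f1, psi, f2):
    """(φ·ψ)(ζ) = φ(t_Θ(ζ)) × ψ(t_Λ(ζ)) on the join frame, as in (3.16)."""
    f = tuple(sorted(set(f1) | set(f2)))
    return {key(z, f): sr.mul(phi[key(z, f1)], psi[key(z, f2)])
            for z in atoms(f)}, f

def transport(sr, phi, f1, f2):
    """t_Λ(φ)(λ) = Σ over compatible atoms θ of φ(θ), as in (3.17)."""
    common = set(f1) & set(f2)
    out = {}
    for lam in atoms(f2):
        total = sr.zero
        for theta in atoms(f1):
            if all(theta[v] == lam[v] for v in common):
                total = sr.add(total, phi[key(theta, f1)])
        out[key(lam, f2)] = total
    return out, f2
```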

Example 3.8 Distributive Lattices: A bounded distributive lattice A is a semiring with idempotent addition (join) and is positive. Since multiplication (meet) is also idempotent, the A-valuations form in this case a proper information algebra. Such algebras are discussed in Chapter 7. Of course, the previous example is an instance of such a valuation algebra, since a Boolean algebra is a distributive lattice. We may generalise the previous example and consider A-valuations related to any Boolean algebra A. This is related to assumption-based reasoning (Kohlas & Wilson, 2006). ⊖

Example 3.9 Fuzzy Sets, Possibilistic Constraints: If we take a t-norm (Kohlas, 2003a; Pouly & Kohlas, 2011) for multiplication and max for addition, then addition is idempotent and the semiring is positive. In this case an A-valuation on Θ is also called a possibilistic distribution, a possibilistic constraint or a fuzzy set. Intersection of possibilistic constraints or fuzzy sets is defined by the t-norm, and addition is used to compute transport. These A-valuations form a generalised information algebra. ⊖

Example 3.10 Optimisation: Consider the (max/min, +) semiring of reals. Again max and min are idempotent. So the A-valuations form a generalised information algebra, although A is not positive, satisfying all axioms except A3 2.). We may always adjoin a least element ⊥ to a semilattice (F; ≤) and define

t⊥(φ) = Σ_{θ∈Θ} φ(θ),

if φ is an A-valuation on Θ. In the present particular case, we have then

t⊥(φ1 · . . . · φn) = max_{θ∈Θ} (φ1(tΘ1(θ)) + . . . + φn(tΘn(θ))),

where d(φi) = Θi and Θ = Θ1 ∨ . . . ∨ Θn. So this information algebra and its associated local computation scheme serve for optimisation. Note that local computation yields the maximum value of the combination. But the scheme may be adapted to compute also minimal solutions (Shenoy, 1996; Pouly & Kohlas, 2011). Local computation is then a version of dynamic programming. ⊖

Now we change the focus on A-valuations to some extent, with the goal of obtaining valuation algebras instead of generalised information algebras. We consider f.c.f. (F, R) for which we assume that

1. (F; ≤) is a lattice,

2. the relation Θ⊥Λ|Θ ∧ Λ holds for all pairs of frames Θ, Λ ∈ F.

It is interesting and important to clarify what these restricting requirements mean. Let µ1 and µ2 be the refinings of Θ ∧ Λ to Θ and Λ respectively, and τ1 and τ2 the refinings of Θ and Λ to Θ ∨ Λ. Consider atoms θ, λ and ζ in Θ, Λ and Θ ∧ Λ respectively. Note that τ1(θ) ∩ τ1(µ1(ζ)) ≠ ∅ if and only if θ ∈ µ1(ζ), and similarly τ2(λ) ∩ τ2(µ2(ζ)) ≠ ∅ if and only if λ ∈ µ2(ζ), for any ζ ∈ Θ ∧ Λ. From the conditional independence condition Θ⊥Λ|Θ ∧ Λ we have

τ1(θ) ∩ τ2(λ) ∩ τ1(µ1(ζ)) = τ1(θ) ∩ τ2(λ) ∩ τ2(µ2(ζ)) ≠ ∅,

if θ ∈ µ1(ζ) and λ ∈ µ2(ζ). Therefore τ1(θ) ∩ τ2(λ) ≠ ∅ if θ ∈ µ1(ζ) and λ ∈ µ2(ζ). This condition can be expressed in two ways:

a) If tΘ∧Λ(θ) = tΘ∧Λ(λ), then there exists an atom η ∈ Θ ∨ Λ such that tΘ(η) = θ and tΛ(η) = λ.

b) If Θ, Λ and Θ ∧ Λ are considered as partitions of Θ ∨ Λ, then if θ and λ are in the same block of Θ ∧ Λ, there is an atom ζ ∈ Θ ∨ Λ such that θ and ζ are in the same block of Θ, and λ and ζ are in the same block of Λ.

Note that the last condition is nothing else than the statement that the partitions in Θ ∨ Λ associated with the coarsenings Θ and Λ commute, so that the associated saturation operators commute too (see Section 2.2). Sublattices of the partition lattice part(U) of some universe satisfying condition b) are also called partition lattices of type I (Grätzer, 1978) (recall that our order is the inverse of the order usually considered between partitions). In particular, multivariate models satisfy this condition. We now consider A-valuations on such particular f.c.f. But we no longer assume that addition is idempotent, nor do we necessarily assume null or unit elements in the semiring (A; +, ×). Instead of a general transport operation t for A-valuations, we consider only projection, that is, partial transport operations πΛ = tΛ : ΦΘ → ΦΛ, defined only for frames Λ ≤ Θ. We define labeling and combination as before, such that we have

1. Labeling: d(φ) = Θ if φ is an A-valuation on Θ.

2. Combination: If d(φ) = Θ and d(ψ) = Λ, then for ζ ∈ Θ ∨ Λ, an A-valuation φ · ψ is defined by

φ · ψ(ζ)= φ(tΘ(ζ)) × ψ(tΛ(ζ)). (3.20)

3. Projection: If d(φ)=Θ and λ ∈ Λ ≤ Θ, then an A-valuation πΛ(φ) is defined by

πΛ(φ)(λ) = Σ_{θ∈tΘ(λ)} φ(θ). (3.21)

We show that in this way we get a valuation algebra of A-valuations. Axiom S0 is valid by the assumption that (F; ≤) is a lattice. The Semigroup Axiom S1 holds as before, since the definition of combination has not changed; and so does the Labeling Axiom S2. Axiom S4 can easily be verified. The next theorem gives us the Combination Axiom S5 for A-valuations.

Theorem 3.10 Let (F; ≤) be a lattice such that Θ⊥Λ|Θ ∧ Λ for all frames Θ, Λ ∈ F. Then, if d(φ)=Θ and d(ψ) = Λ,

πΘ(φ · ψ)= φ · πΘ∧Λ(ψ). 52 CHAPTER 3. LABELED INFORMATION ALGEBRAS

Proof. The condition Θ⊥Λ|Θ ∧ Λ implies tΛ(θ) = tΛ(tΘ∧Λ(θ)) for all θ ∈ Θ (Theorem 3.4). Since we also have Θ⊥Λ|Θ ∨ Λ, we find tΛ(θ) = tΛ(tΘ∨Λ(θ)), hence

tΛ(tΘ∨Λ(θ)) = tΛ(tΘ∧Λ(θ)).

To simplify writing, it is convenient to define for d(φ)=Θ and S ⊆ Θ,

φ(S) = Σ_{θ∈S} φ(θ).

Then projection can be written as πΛ(φ)(λ)= φ(tΘ(λ)). With this notation, from the definition of projection and combination, we obtain for θ ∈ Θ,

πΘ(φ · ψ)(θ) = φ · ψ(tΘ∨Λ(θ)) = φ(θ) × ψ(tΛ(tΘ∨Λ(θ)))
= φ(θ) × ψ(tΛ(tΘ∧Λ(θ))) = φ(θ) × πΘ∧Λ(ψ)(tΘ∧Λ(θ))
= φ · πΘ∧Λ(ψ)(θ).

This shows that πΘ(φ · ψ) = φ · πΘ∧Λ(ψ). ⊓⊔

So, in this case, all axioms of a valuation algebra are satisfied, with the exception of Axiom S3 concerning unit and null valuations. So we have the following theorem:

Theorem 3.11 Let (F; ≤) be a lattice such that Θ⊥Λ|Θ ∧ Λ holds for all frames Θ, Λ ∈ F. Then (Φ, F; ≤, d, ·,π) is a valuation algebra (possibly without Axiom S3).

If the semiring A has null and unit elements, then under some additional conditions Axiom S3 of a valuation algebra in the old sense, or parts of it, may be satisfied. If the semiring A has a null and a unit element, then Theorem 3.6 still holds for projections. If addition is idempotent, then πΛ(1Θ) = 1Λ, that is, the valuation algebra is stable. As shown in Section 3.2, a generalised information algebra can then be derived from the valuation algebra. If the semiring A is also positive, then πΛ(φ) = 0Λ implies φ = 0Θ, if d(φ) = Θ. In these cases Axiom S3 of the original information algebra is valid too. Here follow a few examples of valuation algebras induced by a semiring.

Example 3.11 Probability Potentials: The arithmetic semiring (R+ ∪ {0}; +, ×) gives rise to the semiring valuation algebra of mappings of frames Θ to nonnegative real numbers. Note that addition is not idempotent, so there is no generalised information algebra relative to this semiring. This semiring valuation algebra is usually considered in the framework of Bayesian networks (Lauritzen & Spiegelhalter, 1988; Shenoy & Shafer, 1990a; Shafer, 1996). That is why the A-valuations are called probability potentials, especially since a nonnegative function may also be transformed into a probability distribution by normalisation. As with densities, combination of probability potentials in the valuation algebra does not directly correspond to an operation of classical probability; for an interpretation of this operation we refer to probabilistic argumentation systems. ⊖
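Continuing the hypothetical sketches above, probability potentials are just A-valuations over the arithmetic semiring; combining a marginal with a conditional table and transporting to Y performs the familiar sum-product computation (the numbers below are made-up illustration data, and ARITH, combine and transport refer to the earlier sketches):

```python
# Probability potentials as A-valuations over the arithmetic semiring.
# P(X) and P(Y|X) as potentials on ('X',) and ('X', 'Y'):
pX = {(0,): 0.6, (1,): 0.4}
pY_given_X = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}

joint, fj = combine(ARITH, pX, ('X',), pY_given_X, ('X', 'Y'))
pY, _ = transport(ARITH, joint, fj, ('Y',))
# pY == {(0,): 0.6*0.9 + 0.4*0.2, (1,): 0.6*0.1 + 0.4*0.8}
#    == {(0,): 0.62, (1,): 0.38}
```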

Example 3.12 Subsets, Fuzzy Sets or Possibilistic Constraints: The examples of set algebras, fuzzy sets and possibilistic constraints based on t-norms exist also as valuation algebras. ⊖
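Similarly, possibilistic constraints arise by plugging a t-norm semiring into the same sketches; here min serves as the t-norm and max as addition, the fuzzy-set reading of Example 3.9 (all values are illustrative, and Semiring, combine and transport refer to the earlier sketches):

```python
# Possibilistic constraints: min as t-norm for ×, max for +.
FUZZY = Semiring(add=max, mul=min, zero=0.0, one=1.0)

poss_X = {(0,): 1.0, (1,): 0.3}
poss_XY = {(0, 0): 0.7, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 0.2}

both, fb = combine(FUZZY, poss_X, ('X',), poss_XY, ('X', 'Y'))
poss_Y, _ = transport(FUZZY, both, fb, ('Y',))
# poss_Y[(0,)] == max(min(1.0, 0.7), min(0.3, 0.5)) == 0.7
```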

To conclude, we see that many important information or valuation algebras are induced by a semiring. We remark further that the concept of semiring valuation algebras may be extended to set-based semiring valuation algebras (Pouly & Kohlas, 2011), which cover examples of valuation algebras that are not semiring valuation algebras, like belief functions and generalisations thereof.

Chapter 4

Local Computation

4.1 Markov Trees

In this section we introduce a number of conditional independence structures, like Markov trees, hypertrees and join trees, which play an important role in local computation. These structures are well known and frequently used in multivariate models, where they are all equivalent. Here, however, we study them in the context of quasi-separoids, and in this context they are all different. In Section 4.2 we discuss how different algorithms of local computation can be defined on these different structures. To start, we extend the notion of conditional independence to a family of domains in a join-semilattice (D; ≤).

Definition 4.1 Let (D; ≤, ⊥) be a quasi-separoid, and x1, . . . , xn and z elements of D, n ≥ 2. The family {x1, . . . , xn} is called conditionally independent given z, if for all disjoint subsets J and K of the index set {1, . . . , n}

∨j∈J xj⊥∨k∈K xk|z. (4.1)

Then we write ⊥{x1,...,xn}|z.

By convention, for all x ∈ D, we define ⊥{x}|z and ⊥∅|z. Note that due to condition C3 of a q-separoid, we may assume that J ∪ K = {1,...,n} in this definition.

Theorem 4.1 Assume ⊥{x1,...,xn}|z. Then,

1. if σ is a permutation of 1,...,n, then ⊥{xσ(1),...,xσ(n)}|z,

2. if J ⊆ {1,...,n}, then ⊥{xj : j ∈ J}|z,

3. if y ≤ x1, then ⊥{y,x2,...,xn}|z,

4. ⊥{x1 ∨ x2,x3,...,xn}|z,


5. ⊥{x1 ∨ z,x2,...,xn}|z.

Proof. Items 1.), 2.) and 4.) are immediate consequences of the definition. Item 3.) follows from C3 and 5.) from C4. ⊓⊔

In case (D; ≤) is a lattice, ⊥L{x1, . . . , xn}|z implies x1⊥Lx2|z, x2⊥Lx3|z, etc., which means that (x1 ∨ z) ∧ (x2 ∨ z) = z, (x2 ∨ z) ∧ (x3 ∨ z) = z, etc., and this implies

(x1 ∨ z) ∧ (x2 ∨ z) ∧···∧ (xn ∨ z)= z.

If the lattice (D; ≤) is distributive, then

(∨_{j∈J} xj ∨ z) ∧ (∨_{k∈K} xk ∨ z) = ∨_{j∈J,k∈K} (xj ∧ xk) ∨ z = z,

hence xj ∧ xk ≤ z for all j ≠ k. Therefore, in this case ⊥L{x1, . . . , xn}|z holds if and only if xj⊥Lxk|z for all pairs of distinct j and k, j, k = 1, . . . , n. Axiom A5 can be extended to a family of conditionally independent domains.

Theorem 4.2 Assume ⊥{x1,...,xn}|z and φ = φ1 · . . . · φn with d(φi)= xi for i = 1,...,n. Then

tz(φ)= tz(φ1) · . . . · tz(φn). (4.2)

Proof. The proof is by induction. The theorem is valid for n = 2 by A5. Assume it holds for n − 1. From ⊥{x1, . . . , xn}|z it follows that ∨_{i=1}^{n−1} xi ⊥ xn|z. From A5 we then obtain

tz(φ) = tz(φ1 · . . . · φn−1) · tz(φn).

Using the induction hypothesis tz(φ1 · . . . · φn−1) = tz(φ1) · . . . · tz(φn−1), the theorem follows. ⊓⊔

Let now (D; ≤, ⊥) be any q-separoid. Consider a tree T = (V, E), with nodes V (a finite set) and edges E ⊆ V2, where V2 is the family of two-element subsets of V. Let λ : V → D be a labeling of the nodes of T with domains. The pair (T, λ) is called a labeled tree. By ne(v) we denote the set of neighbours of a node v, that is, the set {w ∈ V : {v, w} ∈ E}. For any subset U of nodes, let

λ(U)= ∨v∈U λ(v).

When a node v is eliminated together with all edges {v, w} incident to v, a family of subtrees {Tv,w = (Vv,w, Ev,w) : w ∈ ne(v)} of T is created, where Tv,w is the subtree containing node w ∈ ne(v). This allows us to define the concept of a Markov tree.

Definition 4.2 Markov Tree: Let (D; ≤, ⊥) be a quasi-separoid. A labeled tree (T, λ) with T = (V, E), λ : V → D, is called a Markov tree, if for all v ∈ V ,

⊥{λ(Vv,w) : w ∈ ne(v)}|λ(v). (4.3)

Markov trees were identified early as important independence structures for efficient computation with belief functions, using Dempster's rule (Shafer et al., 1987; Shenoy & Shafer, 1990b; Kohlas & Monney, 1995). In the first of these references, conditional independence and Markov trees are studied for partition lattices, whereas in the second the multivariate model and in the third families of compatible frames are used. The concept is generalised and adapted from the probabilistic framework of Markov fields. We prove two fundamental theorems, whose proofs are adapted from (Kohlas & Monney, 1995). The first theorem states that there is conditional independence between the domains of a node v and a subtree Vv,w, given the domain of the neighbour w.

Theorem 4.3 If (T, λ) is a Markov tree, then, for any node v and all nodes w ∈ ne(v),

λ(v)⊥λ(Vv,w)|λ(w). (4.4)

Proof. For a node w ∈ ne(v), the Markov property (4.3) reads

⊥{λ(Vw,u) : u ∈ ne(w)}|λ(w).

Then,

λ(Vw,v) ⊥ ∨_{u∈ne(w)−{v}} λ(Vw,u) | λ(w). (4.5)

Note that

λ(Vv,w) = ∨_{u∈ne(w)−{v}} λ(Vw,u) ∨ λ(w).

Hence from C4 we obtain

λ(Vw,v)⊥λ(Vv,w)|λ(w).

Finally, since λ(v) ≤ λ(Vw,v), we conclude (4.4) using C3. ⊓⊔

This theorem, as well as the next one, which states that any subtree of a Markov tree is still a Markov tree, is important for local computation schemes (see Section 4.2).

Theorem 4.4 If (T, λ) is a Markov tree, then every subtree is also a Markov tree.

Proof. Assume T′ = (V′, E′) to be a subtree of T = (V, E) and λ′ the restriction of λ to V′. Consider a node v ∈ V′ and let ne′(v) be the set of its neighbours in T′. Also consider the subtrees T′v,w = (V′v,w, E′v,w) in the subtree T′ obtained after removing the node v and the edges incident to v. Note that ne′(v) ⊆ ne(v) and V′v,w ⊆ Vv,w, so that λ′(V′v,w) ≤ λ(Vv,w) for all w ∈ ne′(v). Therefore, from items 2 and 3 of Theorem 4.1 we conclude that

⊥{λ′(V′v,w) : w ∈ ne′(v)}|λ′(v)

for all v ∈ V′. This shows that (T′, λ′) is a Markov tree. ⊓⊔

From Markov trees two important derived structures may be obtained. In a tree T, we may always select any node v and then number the nodes i : V → {1, . . . , n}, if |V| = n, such that the number i of node w is smaller than the number of any node u on the path from w to v. Assume then that in a Markov tree (T, λ) the nodes are numbered in such a way, and let xi = λ(vi). To simplify, we denote the nodes in the sequel by their number. Then, for all i = 1, . . . , n − 1, the set of nodes {i + 1, . . . , n}, together with all the edges from E between these nodes, determines a subtree of T. In fact, a path in T from j > i to n cannot pass through any node h ≤ i. So the subgraph determined by the nodes {i + 1, . . . , n} is connected, hence a tree. There is exactly one node j ∈ ne(i) such that i < j. Denote this node by b(i). By Theorem 4.3 we have

xi ⊥ ∨_{j=i+1}^{n} xj | xb(i). (4.6)

This relation defines a hypertree according to the following definition.

Definition 4.3 Hypertree: Let (D; ≤, ⊥) be a quasi-separoid. An n-element subset S of D is called a hypertree, if there is a numbering of its elements S = {x1, . . . , xn} such that for all i = 1, . . . , n − 1 there is an element b(i) > i in the numbering so that

xi ⊥ ∨_{j=i+1}^{n} xj | xb(i). (4.7)

In the literature, a hypergraph is usually defined as a set of subsets; in other words, a set of elements of the lattice of subsets of a set. In a generalisation of this view, we take a hypergraph to be a set of elements of any join-semilattice D. The concept of a hypertree as given in Definition 4.3 is then the corresponding transcription of the usual definition of a hypertree in the context of subset lattices. Hypertrees in the usual sense were especially studied in relational algebra, where they were called acyclic hypergraphs and shown to have some desirable properties (Beeri et al., 1983; Maier, 1983). In particular, hypertrees are interesting with respect to computational complexity (Gottlob et al., 1999a; Gottlob et al., 1999b; Gottlob et al., 2001). These papers treat hypertrees in the multivariate framework, whereas this issue will be taken up in Sections 4.2 and 4.3 in the more general case of quasi-separoids and semilattices or lattices of domains. So, any Markov tree determines a hypertree; in fact many hypertrees, according to the numbering selected. Following (Shenoy & Shafer, 1990b), the sequence x1, . . . , xn is called a (hypertree) construction sequence. A hypertree construction sequence x1, . . . , xn defines a tree T = (V, E) with nodes V = {1, . . . , n} and edges E = {{i, b(i)} : i = 1, . . . , n − 1}. In fact, T is connected: if i and j are two nodes, then the node sequences i, b(i), b(b(i)), . . . and j, b(j), b(b(j)), . . . both determine paths from i and j to n. So there is a path between the nodes i and j. Since the number of edges is one less than the number of nodes, T is a tree. However, the labeling i ↦ xi does not in general yield a Markov tree. To see this, consider a construction sequence {x1, x2, x3, x4} such that x1⊥x2 ∨ x3 ∨ x4|x4 and x2⊥x3 ∨ x4|x4. Then S = {x1, x2, x3, x4} is a hypertree, and the construction sequence defines the tree T = ({1, 2, 3, 4}, {{1, 4}, {2, 4}, {3, 4}}). In order that the tree T with the labeling xi be a Markov tree, we must have ⊥{x1, x2, x3}|x4, and for this to be valid, for instance x1 ∨ x2⊥x3|x4 must hold. But this is not necessarily guaranteed by the construction sequence. However, we shall see that if (D; ≤) is a distributive lattice, then in a q-separoid (D; ≤, ⊥L) any hypertree defines in the way described a Markov tree. Let (T, λ) with T = (V, E) again be a Markov tree and consider two nodes v and u. Let w be any node on the path between v and u, different from v and u, and v′ and u′ the neighbours of w on the path from v to w and u to w respectively. Then, from the Markov property (4.3) it follows that

λ(Vw,v′) ⊥ λ(Vw,u′) | λ(w),

and therefore, by C3, λ(v)⊥λ(u)|λ(w). And this holds for any node w on the path between v and u. This is a defining property of another concept.

Definition 4.4 Join Tree: Let (D; ≤, ⊥) be a quasi-separoid and (T, λ) with T = (V, E) a tree. If for any pair of nodes v, u ∈ V and for any node w on the path from v to u

λ(v)⊥λ(u)|λ(w), (4.8) then (T, λ) is called a join tree.

Join trees have been considered in relational database theory (Beeri et al., 1983; Maier, 1983) and, under varying names, also in local computation theory for uncertainty calculi, in particular Bayesian networks, see for instance (Lauritzen & Spiegelhalter, 1988; Cowell et al., 1999; Shenoy & Shafer, 1990a), but exclusively in the multivariate framework. In this case, we have λ(v)⊥Lλ(u)|λ(w) if and only if λ(v) ∧ λ(u) ≤ λ(w). This is called the running intersection property. The present definition has been adapted from the multivariate framework. As seen above, any Markov tree is also a join tree. However, as with hypertrees, a join tree is not a Markov tree in general. Consider the tree T = ({1, 2, 3, 4}, {{1, 4}, {2, 4}, {3, 4}}) already considered above and assume that xi is a labeling of this tree such that x1⊥x2|x4, x1⊥x3|x4 and x2⊥x3|x4. The tree T with the labeling xi is then a join tree. These pairwise conditional independence relations are however not sufficient to imply the Markov property ⊥{x1, x2, x3}|x4 for the labeled tree, except if in the q-separoid (D; ≤, ⊥L) the lattice D is distributive. In fact, if D is distributive, then the three concepts are equivalent in the q-separoid (D; ≤, ⊥L), a fact which has been known for long in the framework of multivariate models. Before we prove this result, we show that a hypertree in the quasi-separoid (D; ≤, ⊥L) always induces a join tree. It is open whether this is true for any q-separoid (D; ≤, ⊥).

Theorem 4.5 Let (D; ≤) be a lattice and (D; ≤, ⊥L) a quasi-separoid, and S a hypertree with construction sequence x1, . . . , xn. Then the labeled tree (T, λ) with T = (V, E), V = {1, . . . , n}, E = {{i, b(i)} : i = 1, . . . , n − 1} and λ(i) = xi is a join tree.

Proof. Consider two nodes i and j and assume i ≤ j. Then (4.7) implies

xi ∧ xj ≤ xi ∧ (∨_{k=i+1}^{n} xk) = xi ∧ xb(i) ≤ xb(i). (4.9)

By iterating this argument with xi ∧ xj ≤ xb(i) ∧ xj ≤ xb(b(i)), we see that xi ∧ xj ≤ xh for any node h on the path from i to n as long as h < j. Let i1 be the first node on this path such that j ≤ i1. Then we still have xi ∧ xj ≤ xi1 . Further, in the same way we conclude that

xi ∧ xj ≤ xi1 ∧ xj ≤ xj ∧ (∨_{k=j+1}^{n} xk) = xj ∧ xb(j) ≤ xb(j). (4.10)

By iterating this, we obtain xi ∧ xj ≤ xh for all h such that i1 ≤ h < j1, where j1 is the first index on the path from j to n which is greater than i1. Then we alternate this reasoning between the path from i to n, starting with i1, and the path from j to n, starting with j1, until we reach the common node on both paths. Thus we conclude that xi ∧ xj ≤ xh for all nodes h on the path from i to j. Therefore (T, λ) is a join tree. ⊓⊔

Now we show the equivalence of the concepts of a Markov tree, a hypertree and a join tree in the context of a q-separoid (D; ≤, ⊥L), if D is a distributive lattice.

Theorem 4.6 Let (D; ≤, ⊥L) be a quasi-separoid, D a distributive lattice and the labeled tree (T, λ) with T = (V, E) a join tree. Then 1. The set λ(V ) is a hypertree. 2. The labeled tree (T, λ) is a Markov tree. Proof. (1) We have to find a hypertree construction sequences. For this purpose select any node v ∈ V and let |V | = n. Then there is a numbering i : V → {1,...,n}, such that i(v) = n and i(u) < i(w) if node w is on the path from node u to v. Define xi(u) = λ(u). We claim that x1,...,xn is a hypertree construction sequence and hence λ(V ) a hypertree. In order to show this, we identify the nodes by their number and define b(i)= j, if i < j and {i, j} ∈ E for i = 1,...,n − 1. Note that b(i) is uniquely determined, since there is only one path from i to n. Now, by distributivity n n xi ∧ (∨j=i+1xj)= ∨j=i+1(xi ∧ xj).

If i < j, the path from i to j passes through node b(i), hence x_i ∧ x_j ≤ x_{b(i)} for all j = i + 1, ..., n. Therefore,

x_i ∧ (∨_{j=i+1}^n x_j) ≤ x_{b(i)}.

But i + 1 ≤ b(i) ≤ n. So, on the other hand,

x_i ∧ (∨_{j=i+1}^n x_j) ≥ x_i ∧ x_{b(i)},

and this implies then

x_i ∧ (∨_{j=i+1}^n x_j) = x_i ∧ x_{b(i)}.

In a distributive lattice, this is equivalent to x_i ⊥L ∨_{j=i+1}^n x_j | x_{b(i)}. This shows that x_1, ..., x_n is a hypertree construction sequence.
(2) Since D is distributive, the Markov property (4.3) holds if and only if λ(V_{v,w})⊥Lλ(V_{v,u})|λ(v) for all pairs of distinct neighbours w and u of v. We claim that these pairwise conditional independence relations hold in a join tree. In fact, by distributivity

λ(v) ≤ (λ(V_{v,w}) ∨ λ(v)) ∧ (λ(V_{v,u}) ∨ λ(v))    (4.11)

= ((∨_{w′∈V_{v,w}} λ(w′)) ∨ λ(v)) ∧ ((∨_{u′∈V_{v,u}} λ(u′)) ∨ λ(v))

= (∨_{w′∈V_{v,w}, u′∈V_{v,u}} (λ(w′) ∧ λ(u′))) ∨ (∨_{w′∈V_{v,w}} (λ(w′) ∧ λ(v))) ∨ (∨_{u′∈V_{v,u}} (λ(u′) ∧ λ(v))) ∨ λ(v)

≤ λ(v),

since v is on all the paths from nodes w′ in V_{v,w} to nodes u′ in V_{v,u}, so that λ(w′) ∧ λ(u′) ≤ λ(v) by the join tree property. This shows that λ(V_{v,w})⊥Lλ(V_{v,u})|λ(v), hence (T, λ) is a Markov tree. ⊓⊔
In summary, any Markov tree is a join tree and induces a hypertree, but not vice versa. Moreover, in a quasi-separoid (D; ≤, ⊥L) a hypertree induces a join tree, but again, not vice versa. If (D; ≤) in the q-separoid (D; ≤, ⊥L) however is a distributive lattice, the concepts of Markov tree, hypertree and join tree become equivalent: a join tree is then a Markov tree and induces a hypertree, and vice versa. In this case (D; ≤, ⊥L) is also a strong separoid (Theorem 2.4). This applies in particular to multivariate models.
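In the multivariate framework the join tree condition is easy to check mechanically. The following Python sketch is our own illustration, not code from the text or the cited literature: domains are represented as sets of variable names, so that the meet is set intersection and ≤ is set inclusion, and the function names are hypothetical.

from itertools import combinations

def path(tree, u, v):
    # Depth-first search; in a tree there is exactly one path from u to v.
    stack = [(u, [u])]
    while stack:
        node, p = stack.pop()
        if node == v:
            return p
        stack.extend((n, p + [n]) for n in tree[node] if n not in p)

def is_join_tree(tree, label):
    # Running intersection property: label(u) & label(v) <= label(w)
    # for every node w on the path between any two nodes u and v.
    return all(label[u] & label[v] <= label[w]
               for u, v in combinations(tree, 2)
               for w in path(tree, u, v))

# The star-shaped tree T considered above, with a multivariate labeling
tree = {1: [4], 2: [4], 3: [4], 4: [1, 2, 3]}
label = {1: {'a', 'd'}, 2: {'b', 'd'}, 3: {'c', 'd'}, 4: {'d'}}
print(is_join_tree(tree, label))  # True

Since the subset lattice is distributive, such a labeled tree is, by Theorem 4.6, at the same time a Markov tree and induces a hypertree.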

4.2 Computing in Markov Trees

Generalising the local computation scheme for inference in Bayesian networks (Lauritzen & Spiegelhalter, 1988), Shenoy and Shafer proposed an axiomatic scheme for general local computation, in addition to probability propagation, especially also for belief functions (Shenoy & Shafer, 1990a). This laid the basis for valuation algebras (Kohlas & Shenoy, 2000; Kohlas, 2003a) as an axiomatic foundation of local computation. The general information algebras proposed here extend and generalise valuation algebras as structures for local computation schemes. We note that in (Shafer et al., 1987) local computation for belief functions on partition lattices was already described, a setting more general than valuation algebras, and in (Kohlas & Monney, 1995) local computation on families of compatible frames is discussed. The present discussion generalises both of these approaches. Local computation schemes in the framework of information algebras propose computational solutions to the so-called projection problem. It is assumed that information is given in pieces, which must be combined, and then the part relative to some given question is to be extracted. So, let φ1,...,φn be n pieces of information and φ = φ1 · ... · φn the aggregated information. Assume that the question to be examined is represented by some element x of D. Then the problem is to compute

tx(φ)= tx(φ1 · . . . · φn). (4.12)

This is called the projection problem. Let xi denote the domain of φi. Then the domain of φ equals x1 ∨ ... ∨ xn according to the labeling axiom. We may presume that the basic operations of an information algebra, combination and transport, have a degree of complexity which grows with the size of the domain and, in many cases, combination and transport may become computationally infeasible on large domains. Therefore, the naive solution of the projection problem, which sequentially combines the factors φ1, φ2 up to φn and then extracts the part relating to x by applying the transport operator to φ, may be infeasible or at least inefficient. In the case of Bayesian networks, (Lauritzen & Spiegelhalter, 1988) have shown that the projection problem may be solved in some circumstances by arranging the operations in such a way that they take place on the domains xi of the factors of the combination φ, and (Shenoy & Shafer, 1990a) have shown that this is possible more generally in the case of abstract valuation algebras. This is called local computation. We show here that local computation is also possible with generalised information algebras. If local computation can be used to solve the projection problem, then the complexity is determined by the operations of combination and transport on the domains xi. Assume some measure of complexity c(xi) depending on the domain and let c = max{c(x1),...,c(xn)}. Then the complexity of local computation is n · c, hence linear in the problem size as measured by the number of factors to be combined. The complexity measure c(xi) depends much on the instance of the information or valuation algebra. It may be polynomial or exponential in some parameter measuring the size of the domains xi, depending on the instance, see (Pouly & Kohlas, 2011). But the linearity in the problem size, once c is given, makes local computation in many cases feasible, because the structure of practical problems often guarantees small domains xi, and therefore a reasonably small value for c. This represents a situation studied more generally in parameterised complexity theory. In the case of Bayesian networks, local computation is shown to be closely related to conditional independence structures, see for example (Cowell et al., 1999). The same is the case in relational algebra (Beeri et al., 1983; Maier, 1983). In fact, viewed from the point of view of generalised information algebras, these structures can be identified as Markov trees and hypertrees. In this section we assume throughout a generalised information algebra (Φ, D; ≤, ⊥, ·,t). Assume that the domains xi of the factors of a factorisation (4.12) of φ form a Markov tree. More precisely, let (T, λ) with T = (V, E) be a Markov tree such that |V| = n and each node v corresponds to exactly one domain xi, such that xi = λ(v). Under this assignment, denote φi by φv, such that

φ = ∏_{v∈V} φ_v,   d(φ_v) = λ(v).    (4.13)

We call such a factorisation a Markov tree factorisation. As usual, T_{v,w} = (V_{v,w}, E_{v,w}) denotes the subtree obtained by removing node v and containing the neighbour w of v. As we know, this is still a Markov tree (Theorem 4.4). Assume further that x = λ(v) for some node v of the Markov tree, such that t_{λ(v)}(φ) is to be computed. Such a projection problem has a local computation solution, as the following theorem shows.

Theorem 4.7 Let (T, λ) be a Markov tree with T = (V, E) and φ given by a Markov tree factorisation (4.13) according to this Markov tree. Then, for any node v ∈ V,

t_{λ(v)}(φ) = φ_v · ∏_{w∈ne(v)} t_{λ(v)}(t_{λ(w)}(φ_{v,w})),    (4.14)

where

φ_{v,w} = ∏_{u∈V_{v,w}} φ_u.    (4.15)

Proof. Note that d(φ_{v,w}) = λ(V_{v,w}). By Theorem 4.3, λ(v)⊥λ(V_{v,w})|λ(w). Using axiom A4 we obtain

t_{λ(v)}(φ_{v,w}) = t_{λ(v)}(t_{λ(w)}(φ_{v,w})).

Now,

φ = φ_v · ∏_{w∈ne(v)} φ_{v,w}.

By C1 and C2, λ(v) ⊥ ∨_{w∈ne(v)} λ(V_{v,w}) | λ(v). So by axiom A5,

t_{λ(v)}(φ) = t_{λ(v)}(φ_v) · t_{λ(v)}(∏_{w∈ne(v)} φ_{v,w}).

Further, the Markov property ⊥{λ(V_{v,w}) : w ∈ ne(v)}|λ(v) implies, using Theorem 4.2,

t_{λ(v)}(∏_{w∈ne(v)} φ_{v,w}) = ∏_{w∈ne(v)} t_{λ(v)}(φ_{v,w}).

Finally, by axiom A6 we have t_{λ(v)}(φ_v) = φ_v, hence

t_{λ(v)}(φ) = φ_v · ∏_{w∈ne(v)} t_{λ(v)}(t_{λ(w)}(φ_{v,w})),

which concludes the proof. ⊓⊔
Formulas (4.14) and (4.15) define a recursive scheme of local computation for the solution of the projection problem. Once the subproblems t_{λ(w)}(φ_{v,w}) in the Markov subtree T_{v,w} are solved for all neighbours w of v, only transports from node w to node v and combinations on node v have to be executed. These are local operations on the domain λ(v). The anchors for the recursion are the trivial one-node Markov trees {u}, where the projection problem t_{λ(u)}(φ_u) = φ_u is trivial. So this is indeed a local computation scheme.

It is possible to represent this computation scheme as a message passing method on a tree, as already shown for valuation algebras by (Shenoy & Shafer, 1990a). For two neighbouring nodes v and w define a message from w to v by

µ_{w→v} = t_{λ(v)}(t_{λ(w)}(φ_{v,w})).    (4.16)

Select arbitrarily a node v of the tree as a root node and direct all arcs of the tree towards this node. Note that then all nodes, except the root node, have exactly one outgoing arc. Further, there is at least one leaf node without incoming arc. Each such leaf node u has a unique neighbour w to which it can send the message

µ_{u→w} = t_{λ(w)}(φ_u).

Once a node u has received messages through all its incoming arcs, it can compute the message for the unique outgoing arc and send it to its neighbour w by

µ_{u→w} = t_{λ(w)}(φ_u · ∏_{n∈ne(u)−{w}} µ_{n→u}).

Once the root node v has received all its messages through this procedure, it can compute the solution of the projection problem according to (4.14) by

t_{λ(v)}(φ) = φ_v · ∏_{w∈ne(v)} µ_{w→v}.

This message passing scheme is also called a collect algorithm. Moreover, the root node can now send messages back to all its neighbours. If in the collect phase the messages have been cached, all neighbours w of v can then compute the projection t_{λ(w)}(φ),

t_{λ(w)}(φ) = φ_w · ∏_{u∈ne(w)} µ_{u→w}.

These nodes are then in a position to send messages back to all their neighbours through their incoming arcs, and so on. Finally, by sending messages in this way backwards towards the leaves, at the end t_{λ(w)}(φ) has been computed for all nodes w of the Markov tree. This second phase of computation is also called a distribute algorithm. The formulae of this local computation scheme simplify somewhat if one computes in a valuation algebra. Then, using the formula for transport in a valuation algebra (3.8), we obtain for the messages

µ_{u→w} = π_{λ(u)∧λ(w)}(φ_u · ∏_{n∈ne(u)−{w}} µ_{n→u}),

and

π_{λ(v)}(φ) = φ_v · ∏_{w∈ne(v)} µ_{w→v}.

This is the version of the collect algorithm usually defined in the multivariate setting (Shafer & Shenoy, 1990; Kohlas & Shenoy, 2000; Pouly & Kohlas, 2011). Many combinations of messages are computed several times. There are ways to reduce these redundant computations by creating binary trees or by using division, which is possible in some cases, see (Kohlas, 2003a). In the multivariate framework other variants of local computation are possible, like fusion or bucket elimination schemes, based on successive variable eliminations (Shenoy, 1992; Dechter, 1999), methods which are not available in our more general setting.
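The collect phase lends itself to a direct recursive implementation. The following Python sketch is our own illustration, not code from the text or the cited literature: combine and transport stand for the two primitives of the algebra, and as a toy idempotent instance we take partial assignments of values to variables, with combination as union (assumed consistent) and transport as restriction to the variables of the target domain.

def combine(phi, psi):
    out = dict(phi)
    out.update(psi)  # the sketch assumes the assignments are consistent
    return out

def transport(phi, domain):
    return {x: v for x, v in phi.items() if x in domain}

def collect(tree, label, factors, root):
    # Computes t_{label[root]}(phi) for a Markov tree factorisation,
    # following (4.14); message() realises the messages of (4.16).
    def message(w, to):
        content = factors[w]
        for n in tree[w]:
            if n != to:
                content = combine(content, message(n, w))
        return transport(content, label[to])  # mu_{w -> to}
    result = factors[root]
    for w in tree[root]:
        result = combine(result, message(w, root))
    return result

tree = {1: [4], 2: [4], 3: [4], 4: [1, 2, 3]}
label = {1: {'a', 'd'}, 2: {'b', 'd'}, 3: {'c', 'd'}, 4: {'a', 'b', 'c', 'd'}}
factors = {1: {'a': 1, 'd': 0}, 2: {'b': 2}, 3: {'c': 3, 'd': 0}, 4: {}}
print(collect(tree, label, factors, root=4))  # {'a': 1, 'd': 0, 'b': 2, 'c': 3}

Caching the messages and running the same procedure outwards from the root yields the distribute phase.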

4.3 Computation in Hypertrees

Local computation can also be defined on a hypertree. Suppose that the domains x1,...,xn in a projection problem

φ = φ_1 · ... · φ_n,   d(φ_i) = x_i,

define a hypertree construction sequence, and that x = x_n in the projection problem (4.12). Then a local computation solution can be obtained as follows: Define

y_i = x_{i+1} ∨ ... ∨ x_n,   i = 1, ..., n − 1.

First, we eliminate the first domain x_1 in the sequence by computing t_{y_1}(φ), using the hypertree condition (Definition 4.3). From x_1⊥y_1|y_1 and axioms A2, A5 and A6 we obtain,

t_{y_1}(φ) = t_{y_1}(φ_1) · t_{y_1}(φ_2 · ... · φ_n) = t_{y_1}(φ_1) · φ_2 · ... · φ_n.

The hypertree condition x_1⊥y_1|x_{b(1)} (see (4.7)) implies t_{y_1}(φ_1) = t_{y_1}(t_{x_{b(1)}}(φ_1)) and therefore,

t_{y_1}(φ) = t_{y_1}(t_{x_{b(1)}}(φ_1)) · φ_2 · ... · φ_n.

Since x_{b(1)} ≤ y_1 = d(φ_2 · ... · φ_n), we conclude (see Lemma 3.1, 5.) that

t_{y_1}(φ) = t_{y_1}(t_{x_{b(1)}}(φ_1)) · t_{y_1}(φ_2 · ... · φ_n) = t_{x_{b(1)}}(φ_1) · φ_2 · ... · φ_n.

Define ψ_i^1 = φ_i, and then ψ_{b(1)}^2 = ψ_{b(1)}^1 · t_{x_{b(1)}}(ψ_1^1) and ψ_j^2 = ψ_j^1 for j = 2, ..., n and j ≠ b(1). Note that d(ψ_j^2) = x_j for j = 2, ..., n. Then, after elimination of domain x_1, we obtain a new factorisation

t_{y_1}(φ) = ψ_2^2 · ... · ψ_n^2.

We may now proceed in the same way, eliminating domains x_2, x_3, .... By induction, let us assume

t_{y_{i−1}}(φ) = ψ_i^i · ... · ψ_n^i,   d(ψ_j^i) = x_j,  j = i, ..., n.    (4.17)

Since y_i ≤ y_{i−1}, we have t_{y_i}(t_{y_{i−1}}(φ)) = t_{y_i}(φ). Now we eliminate domain x_i in y_{i−1} in (4.17) in the same way as we did above for x_1 and obtain

t_{y_i}(φ) = t_{y_i}(ψ_i^i · ... · ψ_n^i)
= t_{y_i}(t_{x_{b(i)}}(ψ_i^i)) · ψ_{i+1}^i · ... · ψ_n^i
= t_{x_{b(i)}}(ψ_i^i) · ψ_{i+1}^i · ... · ψ_n^i.

Define

ψ_{b(i)}^{i+1} = ψ_{b(i)}^i · t_{x_{b(i)}}(ψ_i^i)    (4.18)

and ψ_j^{i+1} = ψ_j^i for j = i + 1, ..., n and j ≠ b(i). We still have d(ψ_j^{i+1}) = x_j for j = i + 1, ..., n. Thus we obtain the new factorisation

t_{y_i}(φ) = ψ_{i+1}^{i+1} · ... · ψ_n^{i+1},   d(ψ_j^{i+1}) = x_j,  j = i + 1, ..., n.

This concludes the induction. At the end, for i = n, we obtain in this way

t_{x_n}(φ) = ψ_n^n.    (4.19)
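The elimination scheme (4.18)–(4.19) is equally simple to sketch in code. Again this is our own illustration, using the same toy algebra of partial assignments as in the sketch of the collect algorithm in Section 4.2; indices are 0-based and the branching function b is given as a dictionary.

def combine(phi, psi):
    out = dict(phi)
    out.update(psi)  # assumed consistent
    return out

def transport(phi, domain):
    return {x: v for x, v in phi.items() if x in domain}

def eliminate(domains, b, factors):
    # psi[j] is the current factor on domain x_{j+1}; step i executes
    # (4.18): psi_{b(i)} := psi_{b(i)} * t_{x_{b(i)}}(psi_i)
    psi = list(factors)
    for i in range(len(domains) - 1):
        psi[b[i]] = combine(psi[b[i]], transport(psi[i], domains[b[i]]))
    return psi[-1]  # t_{x_n}(phi), formula (4.19)

domains = [{'a', 'd'}, {'b', 'd'}, {'c', 'd'}, {'d'}]  # construction sequence
b = {0: 3, 1: 3, 2: 3}                                 # b(1) = b(2) = b(3) = 4
factors = [{'a': 1, 'd': 0}, {'b': 2, 'd': 0}, {'c': 3}, {}]
print(eliminate(domains, b, factors))  # {'d': 0}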

Formula (4.19) solves the projection problem on the hypertree {x_1, ..., x_n} and does it by local computation: in every step i = 1, ..., n − 1, a transport operation t_{x_{b(i)}}(ψ_i^i) and a combination of the result with ψ_{b(i)}^i have to be executed, hence n − 1 times in total. These are all local operations on the domains x_{b(i)}. So, here we have a second local computation scheme, this time on a hypertree. Note that a Markov tree induces a hypertree, even many hypertrees. In this case, it can be seen that the hypertree computation scheme just described essentially corresponds to the collect algorithm in the Markov tree. However, since a hypertree does not necessarily induce a Markov tree, the backwards distribute algorithm is not available in general in a hypertree. If the generalised information algebra however is idempotent, then the distribute algorithm gives the correct result for hypertrees too. This is formulated in the following theorem.

Theorem 4.8 Assume {x_1, ..., x_n} to be a hypertree with a hypertree construction sequence x_1, ..., x_n, φ = φ_1 · ... · φ_n with d(φ_i) = x_i, and ψ_i^i the intermediate results computed in the collect algorithm in this sequence. Define

µ_{b(i)→i} = t_{x_i}(t_{x_{b(i)}}(φ)).

Then, for i = n − 1,..., 1, if axiom A7 (Idempotency) holds,

t_{x_i}(φ) = µ_{b(i)→i} · ψ_i^i.    (4.20)

Proof. Define as before y_i = x_{i+1} ∨ ... ∨ x_n. Then, according to the collect algorithm above,

t_{y_i}(φ) = ψ_{i+1}^{i+1} · ... · ψ_n^{i+1},   d(ψ_j^{i+1}) = x_j,  j = i + 1, ..., n.    (4.21)

Since x_{b(i)} ≤ y_i, we obtain

ψ_i^i · µ_{b(i)→i} = ψ_i^i · t_{x_i}(t_{x_{b(i)}}(φ)) = ψ_i^i · t_{x_i}(t_{x_{b(i)}}(t_{y_i}(φ))).

Further, as x_1, ..., x_n is a hypertree construction sequence, we have x_i⊥y_i|x_{b(i)}, or y_i⊥x_i|x_{b(i)} (C1). Apply axiom A4 and (4.21) to obtain

ψ_i^i · µ_{b(i)→i} = ψ_i^i · t_{x_i}(t_{y_i}(φ)) = ψ_i^i · t_{x_i}(ψ_{i+1}^{i+1} · ... · ψ_n^{i+1}).

Using (4.18) and axiom A5 with x_i⊥y_i|x_i, we obtain further

ψ_i^i · µ_{b(i)→i} = t_{x_i}(ψ_i^i · ψ_{i+1}^i · ... · ψ_n^i · t_{x_{b(i)}}(ψ_i^i)).

Now we use idempotency to show that

ψ_i^i · ψ_{i+1}^i · ... · ψ_n^i · t_{x_{b(i)}}(ψ_i^i) = ψ_i^i · ψ_{i+1}^i · ... · ψ_n^i.

In fact, since we assume idempotency, Lemma 3.3 applies, and so do Lemma 3.1, 5.) and 2.),

ψ_i^i · ψ_{i+1}^i · ... · ψ_n^i · t_{x_{b(i)}}(ψ_i^i)
= t_{x_i∨x_{b(i)}}(ψ_i^i) · ∏_{j=i+1}^n ψ_j^i
= t_{y_{i−1}}(t_{x_i∨x_{b(i)}}(ψ_i^i)) · ∏_{j=i+1}^n t_{y_{i−1}}(ψ_j^i)
= t_{y_{i−1}}(ψ_i^i) · ∏_{j=i+1}^n t_{y_{i−1}}(ψ_j^i)
= ∏_{j=i}^n t_{y_{i−1}}(ψ_j^i) = ∏_{j=i}^n ψ_j^i.

Then we obtain finally

ψ_i^i · µ_{b(i)→i} = t_{x_i}(ψ_i^i · ... · ψ_n^i) = t_{x_i}(t_{y_{i−1}}(φ)) = t_{x_i}(φ),

since x_i ≤ y_{i−1}. This concludes the proof. ⊓⊔

According to this theorem, once t_{x_n}(φ) has been computed in the n-th step of the collect algorithm, the other projection problems t_{x_i}(φ) can be computed in the inverse order i = n − 1, ..., 1 of the construction sequence.

At step i, t_{x_j}(φ) is known for all j ≥ i, and then (4.20) allows to compute t_{x_{i−1}}(φ), since b(i − 1) ≥ i. The assumption that the domains of the factors of a projection problem form just a Markov tree or a hypertree is of course very strong. But as usual, the existence of unit elements allows the use of covering Markov trees or hypertrees. This is in generalised information algebras just as in valuation algebras (Kohlas, 2003a; Pouly & Kohlas, 2011). In the multivariate case, covering Markov trees may be found by sequences of variable eliminations. Such procedures are not available in the more general case of general q-separoids (D; ≤, ⊥). So, finding good covering Markov trees is an open problem. Also, as in valuation algebras, the semigroup (Φ, ·) may allow for division as a partially inverse operation of combination (Lauritzen & Jensen, 1997; Kohlas, 2003a). This will not be discussed here; we refer to (Kohlas, 2003a; Kohlas & Wilson, 2006) in the case of valuation algebras.

Part II

Domain-Free Algebras


Chapter 5

Domain-Free Information Algebras

5.1 Unlabeling of Information

In a labeled information algebra (Φ, D; ≤, ⊥, d, ·,t) different pieces of information φ and ψ with different labels d(φ) = x and d(ψ) = y may describe the same information, namely, if

tx∨y(φ)= tx∨y(ψ). (5.1)

In fact, then we see that

φ = tx(tx∨y(φ)) = tx(tx∨y(ψ)) = tx(ψ), and, similarly,

ψ = ty(φ).

So, φ and ψ really describe the same information, only once with respect to domain x and once with respect to domain y. This motivates defining the relation

φ ≡σ ψ, if d(φ)= x, d(ψ)= y and tx∨y(φ)= tx∨y(ψ). (5.2)

This relation can be characterised alternatively, according to the following lemma.

Lemma 5.1 The relation φ ≡σ ψ holds if and only if tz(φ)= tz(ψ) for any z ∈ D.

Proof. If tz(φ) = tz(ψ) and d(φ) = x, d(ψ) = y, then, for z = x ∨ y, tx∨y(φ)= tx∨y(ψ), hence φ ≡σ ψ.


Conversely, assume φ ≡σ ψ, that is, tx∨y(φ) = tx∨y(ψ). Then we also have tx∨y∨z(φ) = tx∨y∨z(ψ). Further, tz(tx∨y∨z(φ)) = tz(φ), and the same for ψ, hence indeed tz(φ) = tz(ψ). ⊓⊔
The relation ≡σ is clearly an equivalence relation in Φ. It is even a congruence relative to the operations of combination and transport, as the following theorem shows.

Theorem 5.1 The relation ≡σ is a congruence relative to combination and transport in the labeled information algebra (Φ, D; ≤, ⊥, d, ·,t).

Proof. Assume φ1 ≡σ φ2 and d(φ1)= x, d(φ2)= y. Consider an element ψ ∈ Φ with d(ψ)= z. We show that φ1 ·ψ ≡σ φ2 ·ψ. From z⊥x∨y∨z|x∨y∨z (C1) it follows that x⊥z|x ∨ y ∨ z (C2 and C3). Therefore, by axiom A5

tx∨y∨z(φ1 · ψ)= tx∨y∨z(φ1) · tx∨y∨z(ψ).

In the same way, y⊥z|x ∨ y ∨ z, and

tx∨y∨z(φ2 · ψ)= tx∨y∨z(φ2) · tx∨y∨z(ψ).

Then φ1 ≡σ φ2 implies, using Lemma 5.1,

tx∨y∨z(φ1 · ψ)= tx∨y∨z(φ2 · ψ).

But this means that φ1 · ψ ≡σ φ2 · ψ. Also, φ ≡σ ψ implies tz(φ) = tz(ψ), hence tz(φ) ≡σ tz(ψ). This proves congruence relative to transport. ⊓⊔
This congruence allows us to consider the quotient structure Φ/σ, consisting of the equivalence classes [φ]σ of the equivalence relation ≡σ. Combination and transport are well-defined in this structure by

[φ]σ · [ψ]σ = [φ · ψ]σ,

ǫx([φ]σ) = [tx(φ)]σ.

It is evident that combination in Φ/σ is associative and commutative, since it is so in Φ. Further, the classes [1x]σ and [0x]σ are the unit and null elements of combination in Φ/σ. So, (Φ/σ, ·) is a commutative semigroup with unit and null element. In addition, if d(φ) = x, then ǫx([φ]σ) = [φ]σ, hence, in particular, ǫx([1x]σ) = [1x]σ and ǫx([φ]σ) = [0x]σ if and only if [φ]σ = [0x]σ. The following theorem shows how the axioms A4, A5 and A7 of the labeled information algebra (Φ, D; ≤, ⊥, d, ·,t) translate into this new algebraic structure.

Theorem 5.2 Let (Φ, D; ≤, ⊥, d, ·,t) be a labeled generalised information algebra. Then in Φ/σ the following holds:

1. ǫx([φ]σ) = [φ]σ and x⊥y|z imply ǫy([φ]σ)= ǫy(ǫz([φ]σ)).

2. ǫx([φ]σ) = [φ]σ, ǫy([ψ]σ) = [ψ]σ and x⊥y|z imply ǫz([φ]σ · [ψ]σ) = ǫz([φ]σ) · ǫz([ψ]σ).

3. If, in addition, Idempotency A7 holds, then ǫx([φ]σ) · [φ]σ = [φ]σ.

Proof. 1.) We have by definition and assumption ǫx([φ]σ) = [tx(φ)]σ = [φ]σ. We may therefore select a representative of the class [φ]σ such that tx(φ) = ψ, hence d(ψ) = x. Then by axiom A4, x⊥y|z implies ty(ψ) = ty(tz(ψ)). Therefore, we obtain

ǫy([ψ]σ) = [ty(ψ)]σ = [ty(tz(ψ))]σ = ǫy(ǫz([ψ]σ )).

Since [ψ]σ = [φ]σ, we have ǫy([φ]σ) = ǫy(ǫz([φ]σ)), as claimed.
2.) As above, we have ǫx([φ]σ) = [tx(φ)]σ = [φ]σ and ǫy([ψ]σ) = [ty(ψ)]σ = [ψ]σ. Then x⊥y|z implies, according to axiom A5,

tz(tx(φ) · ty(ψ)) = tz(tx(φ)) · tz(ty(ψ)).

Thus, we have

ǫz([φ]σ · [ψ]σ)

= ǫz([tx(φ)]σ · [ty(ψ)]σ)= ǫz([tx(φ) · ty(ψ)]σ )

= [tz(tx(φ) · ty(ψ))]σ = [tz(tx(φ)) · tz(ty(ψ))]σ

= [tz(tx(φ))]σ · [tz(ty(ψ))]σ = ǫz([(tx(φ))]σ ) · ǫz([(ty(ψ))]σ )

= ǫz([φ]σ) · ǫz([ψ]σ).

This verifies item 2 of the theorem.
3.) We have ǫx([φ]σ) · [φ]σ = [tx(φ) · φ]σ. Assume that d(φ) = y. By Lemma 3.3, φ · tx(φ) = tx∨y(φ). But φ ≡σ tx∨y(φ), hence indeed ǫx([φ]σ) · [φ]σ = [φ]σ. ⊓⊔
The quotient structure considered here amounts in fact to unlabeling the pieces of information φ and obtaining a domain-free representation [φ]σ of pieces of information. This will be exploited in the next section, where a new algebraic structure of information is proposed, which has the algebraic structure derived above.
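As a small computational illustration of this unlabeling (our own sketch, in a toy labeled algebra where a piece of information is a pair of a partial variable assignment and a domain of variables, and transport restricts the assignment to the target domain): two labeled pieces are σ-equivalent in the sense of (5.2) exactly when they agree after transport to the join of their domains.

def transport(piece, y):
    f, x = piece
    return ({k: v for k, v in f.items() if k in y}, set(y))

def sigma_equivalent(phi, psi):
    z = phi[1] | psi[1]  # the join x v y of the two domains
    return transport(phi, z) == transport(psi, z)

phi = ({'a': 1}, {'a', 'b'})
psi = ({'a': 1}, {'a', 'c'})
print(sigma_equivalent(phi, psi))  # True: same information, different labels

The classes [φ]σ of this relation are then the domain-free pieces of information.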

5.2 Domain-Free Axiomatics

Motivated by the previous section, we introduce here a new algebraic structure, modelling information from a different point of view. Again consider a q-separoid (D; ≤, ⊥), whose elements represent domains or questions. Let now Ψ be a set of elements representing pieces of information. This time, however, elements φ, ψ, ... from Ψ are not attached to a specified domain, they are domain-free. Nevertheless, it is assumed to be possible to extract information relative to a domain x in D from every piece of information ψ ∈ Ψ. This will be accomplished by extraction operators ǫx : Ψ → Ψ attached to each domain x in D. We assume throughout that x ≠ y implies ǫx ≠ ǫy. More formally, we assume two operations in Ψ:

1. Combination: · :Ψ × Ψ → Ψ, (φ, ψ) 7→ φ · ψ,

2. Extraction: ǫ :Ψ × D → Ψ, (ψ,x) 7→ ǫx(ψ).

As before, combination represents aggregation of information and, newly, extraction describes filtering out the information relative to a domain. Thus, ǫx(ψ) is thought to be the part of ψ referring to the domain or question x ∈ D. We consider a signature (Ψ, D; ≤, ⊥, ·,ǫ) and impose the following axioms:

A0 Quasi-Separoid: (D; ≤, ⊥) is a quasi-separoid.

A1 Semigroup: (Ψ; ·) is a commutative semigroup with unit 1 and null element 0.

A2 Support: For all ψ ∈ Ψ, there is a domain x ∈ D such that ǫx(ψ)= ψ, and whenever ǫx(ψ)= ψ, x ≤ y, then ǫy(ψ)= ψ.

A3 Unit and Null: For all x ∈ D, ǫx(1) = 1 and ǫx(φ) = 0 if and only if φ = 0.

A4 Extraction: If x⊥y|z and ǫx(ψ) = ψ, then ǫy(ψ) = ǫy(ǫz(ψ)).

A5 Combination: If x⊥y|z and ǫx(φ)= φ and ǫy(ψ)= ψ, then ǫz(φ · ψ)= ǫz(φ) · ǫz(ψ).

A system (Ψ, D; ≤, ⊥, ·,ǫ) satisfying these axioms is called a domain-free generalised information algebra. According to the previous section, if (Φ, D; ≤, ⊥, d, ·,t) is a labeled information algebra, then (Φ/σ, D; ≤, ⊥, ·,ǫ) is a domain-free information algebra. So, to any labeled information algebra, a domain-free algebra is associated. Below, in Section 5.3, we shall see that any domain-free information algebra induces a labeled one. This leads to a remarkable duality between the two kinds of algebras. Sometimes, later, we consider also domain-free information algebras without the Support Axiom A2. There are also domain-free information algebras which satisfy the following additional axiom:

A6 Idempotency: For all φ ∈ Ψ and x ∈ D, ǫx(φ) · φ = φ.

Note that idempotency implies, due to the support axiom A2, that φ · φ = φ. If this property holds, then the information algebra is called idempotent or proper. In fact, in a proper context of information, a repetition of a piece of information, or of part of it, should give nothing new. If for a domain x from D, ǫx(φ) = φ, then x is called a support of φ. Here follow a few properties of support and extraction:

Lemma 5.2 If (Ψ, D; ≤, ⊥, ·,ǫ) is a domain-free generalised information algebra, then the following holds:

1. x is a support of ǫx(φ), ǫx(ǫx(φ)) = ǫx(φ).

2. If x is a support of both φ and ψ, then it is also a support of φ · ψ, ǫx(φ · ψ) = φ · ψ.

3. If x is a support of φ and y of ψ, then x ∨ y is a support of φ · ψ, ǫx∨y(φ · ψ) = φ · ψ.

4. For all x ∈ D, ǫx(ǫx(φ) · ψ)= ǫx(φ) · ǫx(ψ).

Proof. 1.) By axiom A2, φ has a support y. Since y⊥x|x (C1), axiom A4 implies ǫx(ǫx(φ)) = ǫx(φ). 2.) and 3.) From x⊥x ∨ y|x ∨ y (C1) it follows that x⊥y|x ∨ y (C3). If x is a support for φ and y for ψ, then axiom A5 implies ǫx∨y(φ · ψ) = ǫx∨y(φ) · ǫx∨y(ψ). Further by the support axiom A2, x ∨ y is a support of both φ and ψ. Thus it follows that ǫx∨y(φ · ψ)= φ · ψ, which proves item 3 of the lemma. With x = y, item 2 follows also. 4.) By axiom A2 any element ψ has a support. So suppose ǫy(ψ) = ψ. Then x⊥y|x (C1,C2) and ǫx(ǫx(φ)) = ǫx(φ) (item 1 proved above) imply, using axiom A5,

ǫx(ǫx(φ) · ψ)= ǫx(ǫx(φ)) · ǫx(ψ)= ǫx(φ) · ǫx(ψ).

This concludes the proof. ⊓⊔
The following is an important and basic example of a domain-free information algebra.

Example 5.1 Set Algebras: Let U be any set (of possible worlds) and (D; ≤) a join-subsemilattice of (Part(U); ≤). By Theorem 2.6, (D; ≤, ⊥), where the conditional independence relation is defined as in Section 2.2, is a q-separoid. If the top partition in D is the universe U, then let Ψ be the power set of U; otherwise, let Ψ be the family of subsets S of U which are saturated with respect to a partition P in D, that is, σP(S) = S. As combination of elements of Ψ take intersection. Clearly, the intersection of a set saturated with respect to partition P1 and a set saturated with respect to P2 is a set saturated with respect to P1 ∨ P2. So, Ψ is closed under combination. For extraction take the saturation operators σP for P ∈ D. Since σP(S) is saturated with respect to P, the family Ψ is also closed under extraction. Axiom A0 is satisfied according to Theorem 2.6, A1 is obvious for intersection, and A2 is satisfied by definition of Ψ and since a set saturated for a partition P is also saturated for a finer partition Q ≥ P. The unit element is the set U and the null element the empty set; both obviously satisfy Axiom A3. Axioms A4 and A5 are proved in (Kohlas & Monney, 1995). This set algebra is clearly the domain-free version of the labeled set algebra in Section 2.3 relative to a f.c.f induced by a join-semilattice of partitions of U. This provides an alternative argument for the validity of Axioms A4 and A5. The domain-free version obtained from the labeled set algebras relative to a general family of compatible frames, obtained by unlabeling as in the previous section, is not a set algebra in the narrow sense given here; its elements are not subsets of some universe U. These domain-free information algebras are idempotent. Belief functions provide further examples of domain-free information algebras, albeit no longer idempotent ones. ⊖
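This set algebra is easy to emulate computationally. The following Python sketch is our own toy illustration with an explicitly listed universe: a partition is given by its blocks, extraction is the saturation operator σP, and combination is intersection.

def saturate(S, partition):
    # sigma_P(S): the union of all blocks of P that meet S
    return set().union(*(B for B in partition if B & S))

def combine(S, T):
    return S & T  # combination of pieces of information is intersection

U = {1, 2, 3, 4, 5, 6}
P = [{1, 2}, {3, 4}, {5, 6}]  # a finer partition of U
Q = [{1, 2, 3, 4}, {5, 6}]    # a coarser one: Q <= P in the order of D

S = {1, 2, 5, 6}
print(saturate(S, P) == S)              # True: P is a support of S
print(saturate({1}, P))                 # {1, 2}: extraction coarsens to blocks
print(combine(saturate(S, Q), S) == S)  # True: idempotency, Axiom A6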

Just as with labeled information algebras (Section 3.2), an important and interesting special case arises for a domain-free information algebra (Ψ, D; ≤, ⊥L, ·,ǫ), where D is a lattice. In this case the following holds.

Lemma 5.3 In the domain-free information algebra (Ψ, D; ≤, ⊥L, ·,ǫ), where D is a lattice, for all x,y ∈ D,

ǫy(ǫx(φ)) = ǫx∧y(φ).

Proof. We have x⊥Ly|x ∧ y. From this and ǫx(ǫx(φ)) = ǫx(φ) it follows, using axiom A4,

ǫy(ǫx(φ)) = ǫy(ǫx∧y(ǫx(φ))).

Assume that z is a support of φ. Then z⊥Lx|x (C1) implies z⊥Lx∧y|x (C3), and thus again by axiom A4 ǫx∧y(ǫx(φ)) = ǫx∧y(φ), so x ∧ y is a support of ǫx∧y(φ), and x ∧ y ≤ y, hence indeed by axiom A2 ǫy(ǫx(φ)) = ǫx∧y(φ). ⊓⊔ This result implies that the extraction operators ǫx for x ∈ D in the algebra (Ψ, D; ≤, ⊥L, ·,ǫ) form a commutative semigroup under composition,

ǫy(ǫx(φ)) = ǫx(ǫy(φ)), which is not the case in general. Moreover, due to the idempotency of extraction, these extraction operators form an idempotent commutative semigroup, hence a meet-semilattice, where we define ǫx ≤ ǫy if ǫy ◦ ǫx = ǫx. Under this partial order it follows indeed that ǫx ◦ ǫy = ǫx ∧ ǫy = ǫx∧y, in concordance with Lemma 5.3, and also ǫx ≤ ǫy if and only if x ≤ y. This structure is an instance of an alternative algebraic structure which is defined as follows. Consider the signature (Ψ, E; ·, ◦), where the following operations are defined:

1. Combination: · :Ψ × Ψ → Ψ; (φ, ψ) 7→ φ · ψ.

2. Extraction: Any ǫ ∈ E is an operator ǫ :Ψ → Ψ.

We impose the following axioms on this structure:

D0 Semigroup of Extraction: (E, ◦) is an idempotent, commutative semi- group.

D1 Semigroup of Information: (Ψ, ·) is a commutative semigroup with unit 1 and null 0.

D2 Support: For all ψ ∈ Ψ there is an operator ǫ ∈ E such that ǫ(ψ)= ψ.

D3 Unit and Null: For all ǫ ∈ E, ǫ(1) = 1 and ǫ(φ) = 0 if and only if φ = 0.

D4 Combination: For all ǫ ∈ E and φ, ψ ∈ Ψ, ǫ(ǫ(φ) · ψ)= ǫ(φ) · ǫ(ψ).

This is related to the domain-free variant of a labeled valuation algebra (see Sections 3.2 and 5.3), and we call it therefore a domain-free valuation algebra. Similar, more or less equivalent structures have been introduced by (Shafer, 1991) and are discussed in (Kohlas, 2003a). We may add an additional axiom concerning idempotency:

D5 Idempotency: For all ψ ∈ Ψ and ǫ ∈ E, ǫ(ψ) · ψ = ψ.

Such idempotent structures were studied extensively in (Kohlas, 2003a; Kohlas & Schmid, 2014; Kohlas & Schmid, 2016). There are also many examples and generic construction methods for these algebras to be found there. Such idempotent valuation algebras were called information algebras in those references. As above, axiom D0 implies that E is a meet-semilattice where ǫ ≤ η if η ◦ ǫ = ǫ. In this order ǫ ◦ η is the infimum of ǫ and η, ǫ ◦ η = ǫ ∧ η. For later reference we add a few properties derived from the axioms of a domain-free valuation algebra.

Lemma 5.4 Let (Ψ, E; ·, ◦) be a domain-free valuation algebra. Then

1. ǫ(ψ)= ψ and ǫ ≤ η imply η(ψ)= ψ.

2. ǫ ≤ η implies ǫ(ψ)= ǫ(η(ψ)).

3. ǫ(ψ)= η(ψ)= ψ implies (ǫ ∧ η)(ψ)= ψ

4. ǫ(ψ)= ψ implies (ǫ ∧ η)(ψ)= η(ψ).

5. ǫ(φ)= φ and ǫ(ψ)= ψ imply ǫ(φ · ψ)= φ · ψ.

Proof. 1.) We have η(ψ)= η(ǫ(ψ)) = (ǫ∧η)(ψ). But ǫ∧η = ǫ. Therefore, η(ψ)= ǫ(ψ)= ψ. 2.) Here we use again that ǫ ∧ η = ǫ and conclude that ǫ(η(ψ)) = (ǫ ∧ η)(ψ)= ǫ(ψ). 3.) Since ǫ ◦ η = ǫ ∧ η we have (ǫ ∧ η)(ψ)= ǫ(η(ψ)) = ψ. 4.) Again, from ǫ ◦ η = ǫ ∧ η we obtain (ǫ ∧ η)(ψ)= η(ǫ(ψ)) = η(ψ). 5.) Here we use the combination axiom D4 and see that

ǫ(φ · ψ)= ǫ(ǫ(φ) · ψ)= ǫ(φ) · ǫ(ψ)= φ · ψ.

This concludes the proof. ⊓⊔
Note that in the framework of a domain-free valuation algebra the property stated in item 1 may be derived, whereas in the case of a generalised domain-free information algebra it must be imposed as an axiom. We mention also that a set algebra becomes a valuation algebra if the partitions commute in the sub-join-semilattice (D; ≤) of a partition lattice (Part(U), ≤) (Kohlas & Schmid, 2016). In many cases E is not only a semilattice, but a lattice. In particular, this is the case for the domain-free algebra (Ψ, D; ≤, ⊥L, ·,ǫ) introduced above. Then some more results about extraction can be obtained.

Lemma 5.5 Let (Ψ, E; ·, ◦) be a domain-free valuation algebra, where (E; ≤ ) is a lattice under the order induced by the idempotent, commutative semi- group (E, ◦) and ǫ⊥Lη|χ is the corresponding q-separoid in this lattice. Then

1. ǫ(ψ)= ψ and ǫ⊥Lη|χ imply η(ψ)= η(χ(ψ)).

2. ǫ(φ)= φ, η(ψ)= ψ and ǫ⊥Lη|χ imply χ(φ · ψ)= χ(φ) · χ(ψ).

3. ǫ(φ)= φ and η(ψ)= ψ imply (ǫ ∨ η)(φ · ψ)= φ · ψ.

Proof. 1.) From ǫ ≤ ǫ ∨ χ and η ≤ η ∨ χ, it follows, using items 1 and 2 of Lemma 5.4, and that composition is infimum in the lattice (E; ≤),

η(ψ) = η((ǫ ∨ χ)(ψ)) = η((η ∨ χ)((ǫ ∨ χ)(ψ))) = η(((ǫ ∨ χ) ∧ (η ∨ χ))(ψ)).

But ǫ⊥Lη|χ implies (ǫ ∨ χ) ∧ (η ∨ χ) = χ and therefore, indeed, η(ψ) = η(χ(ψ)). 2.) and 3.) Again, by Lemma 5.4, we have φ = (ǫ ∨ χ)(φ) and ψ = (η ∨ χ)(ψ). Therefore χ(φ · ψ)= χ((ǫ ∨ χ)(φ) · (η ∨ χ)(ψ)). From χ ≤ ǫ ∨ χ and Axiom D4 we obtain

χ(φ · ψ) = χ((ǫ ∨ χ)((ǫ ∨ χ)(φ) · (η ∨ χ)(ψ))) = χ((ǫ ∨ χ)(φ) · ((ǫ ∨ χ) ∧ (η ∨ χ))(ψ)).

Applying ǫ⊥Lη|χ, Axiom D4 and φ = (ǫ ∨ χ)(φ) gives then χ(φ · ψ) = χ(φ · χ(ψ)) = χ(φ) · χ(ψ).

This proves item 2. Item 3 follows from item 2 since ǫ⊥Lη|ǫ ∨ η and φ = (ǫ ∨ η)(φ), ψ = (ǫ ∨ η)(ψ). ⊓⊔
These lemmas show that a domain-free valuation algebra (Ψ, E; ·, ◦) induces a domain-free information algebra (Ψ, E; ≤, ⊥L, ·,ǫ) if E is a lattice. The situation in the domain-free view is therefore rather the same as in the labeled view. This aspect will be clarified in the next section. Note however that if E is only a semilattice in the domain-free valuation algebra (Ψ, E; ◦, ·), then it is not a generalised information algebra. So, it should be kept in mind that generalised information algebras and valuation algebras are different concepts in general, although identical in some particular cases.

5.3 Duality

We have seen above that a domain-free information algebra can be derived from a labeled one. It turns out that, conversely, a labeled information algebra may also be obtained from a domain-free one. This establishes then a duality between labeled and domain-free information algebras. This duality applies also to the special case of valuation algebras, although with the reservation that the extraction operators form a lattice, not only a semilattice. Assume (Ψ, D; ≤, ⊥, ·,ǫ) to be a domain-free generalised information algebra. Define

Φx = {(φ, x) : ǫx(φ) = φ},   Φ = ∪_{x∈D} Φx.

We define the following operations relative to the signature (Φ, D; ≤, ⊥, d, ·,t):

1. Labeling: d : Φ → D; (φ, x) 7→ d(φ, x) = x.

2. Combination: · :Φ × Φ → Φ; ((φ, x), (ψ,y)) 7→ (φ, x) · (ψ,y) = (φ · ψ,x ∨ y).

3. Transport: t : Φ × D → Φ; ((φ, x), y) 7→ ty(φ, x) = (ǫy(φ), y).

Note that we use the same symbol · for combination in Ψ and in Φ; it will always be clear from the context which operation is meant. We remark that, due to the results of the previous section, all these operations are well defined. We claim that (Φ, D; ≤, ⊥, ·,t) is a labeled generalised information algebra. Here is the verification of the axioms: The structure (D; ≤, ⊥) is a q-separoid by definition (A0). Combination in (Φ, ·) is clearly associative and commutative, so (Φ, ·) is a commutative semigroup (A1). By definition of Combination, Labeling and Transport in Φ we have d((φ, x) · (ψ, y)) = d(φ · ψ, x ∨ y) = x ∨ y = d(φ, x) ∨ d(ψ, y) and d(ty(φ, x)) = d(ǫy(φ), y) = y. So the Labeling Axiom (A2) is satisfied. The null and unit elements associated with x ∈ D are (0, x) and (1, x). By the definition of Combination and Transport in Φ, we obtain tx(φ, y) = (ǫx(φ), x) = (0, x) if and only if φ = 0, (φ, y) · (1, x) = (φ, x ∨ y) = (ǫx∨y(φ), x ∨ y) = tx∨y(φ, y) and (1, x) · (1, y) = (1, x ∨ y). This confirms the Unit and Null Axiom (A3). For the Transport Axiom (A4) assume x⊥y|z and consider an element (φ, x) such that d(φ, x) = x and ǫx(φ) = φ. Then by the Extraction Axiom (A4) of the domain-free algebra

ty(φ, x) = (ǫy(φ),y) = (ǫy(ǫz(φ)),y)= ty(tz(φ, x)).

This is the Transport Axiom (A4) in the labeled version. To verify the Combination Axiom (A5), assume x⊥y|z and consider elements (φ, x) and (ψ, y) such that d(φ, x) = x and d(ψ, y) = y. Then, by the definitions of Combination and Transport in Φ, and invoking the Combination Axiom (A5) for the domain-free algebra,

tz((φ, x) · (ψ,y)) = (ǫz(φ · ψ), z) = (ǫz(φ) · ǫz(ψ), z)

= (ǫz(φ), z) · (ǫz(ψ), z)= tz(φ, x) · tz(ψ,y).

This is the Combination Axiom (A5) in the labeled version. The Identity Axiom (A6) follows from tx(φ, x) = (ǫx(φ), x) = (φ, x), since by definition x is a support of φ. So, the algebra (Φ, D; ≤, ⊥, d, ·,t) is indeed an instance of a labeled information algebra. If (Ψ, D; ≤, ⊥, ·,ǫ) is idempotent, then so is (Φ, D; ≤, ⊥, d, ·,t). Consider (φ, x) and y ≤ x; then (φ, x) · ty(φ, x) = (φ, x) · (ǫy(φ), y) = (φ · ǫy(φ), x ∨ y) = (φ, x). This is the Idempotency Axiom A7 of the labeled information algebra. In the example of a domain-free set algebra associated with a universe U and a q-separoid (D; ≤, ⊥) of partitions, the elements of the corresponding labeled algebra are the pairs (S, P) where S is a subset of U, saturated relative to the partition P. We might as well replace S by the set S′ of the blocks of P it is composed of and replace the partition P by the frame ΘP. So, this gives us the labeled algebra of pairs (S′, ΘP), where S′ is a subset of the frame ΘP. This is essentially the labeled set algebra relative to the f.c.f of the frames ΘP for P ∈ D. We may now start with a labeled information algebra L, say L = (Φ, D; ≤, ⊥, d, ·,t), and derive its domain-free version DL = (Φ/σ, D; ≤, ⊥, ·,ǫ) in the way described in Section 5.1. In a further step we may construct from DL its labeled version LDL = (Φ′, D; ≤, ⊥, d′, ·′,t′) as shown above. In the same way we may start with a domain-free information algebra D = (Ψ, D; ≤, ⊥, ·,ǫ) and get the associated labeled algebra LD = (Φ, D; ≤, ⊥, d, ·,t) and then obtain from this labeled algebra its domain-free version DLD = (Φ/σ, D; ≤, ⊥, ·′,ǫ′). It may be suspected that L and LDL are essentially the same algebra, and so are D and DLD. In order to make this statement precise we need to define the notion of isomorphisms between labeled information algebras on the one hand and between domain-free information algebras on the other hand.

Consider two labeled information algebras L = (Φ, D; ≤, ⊥, d, ·,t) and L′ = (Φ′, D; ≤, ⊥, d′, ·′,t′). To simplify, we assume that both algebras are based on the same q-separoid; this corresponds to the situation we are interested in. Let T = {tx : x ∈ D} and T′ = {t′x : x ∈ D} be the sets of the transport operators in the two algebras. We consider a pair of maps

f : Φ → Φ′,   g : T → T′ such that g(tx) = t′x for all x ∈ D.

If the following conditions are satisfied, the pair of maps (f, g) is called a labeled information algebra homomorphism:

1. f(φ · ψ)= f(φ) ·′ f(ψ), for all φ, ψ ∈ Φ,

2. f(0x) = 0′x and f(1x) = 1′x, where 0′x and 1′x are the null elements and units in (Φ′, ·′),

3. f(tx(φ)) = g(tx)(f(φ)).

Note that the map g is by definition one-to-one and onto. If the map f is also onto and one-to-one, the pair (f, g) is called a labeled information algebra isomorphism and the two algebras are called isomorphic, written as L ≅ L′. Similarly for domain-free information algebras: Let D = (Ψ, D; ≤, ⊥, ·,ǫ) and D′ = (Ψ′, D; ≤, ⊥, ·′,ǫ′). Let E = {ǫx : x ∈ D} and E′ = {ǫ′x : x ∈ D} be the sets of the extraction operators in the two algebras. Again, a pair (f, g) of maps

f : Ψ → Ψ′,   g : E → E′ such that g(ǫx) = ǫ′x for all x ∈ D, satisfying

1. f(φ · ψ)= f(φ) ·′ f(ψ), for all φ, ψ ∈ Ψ,

2. f(0) = 0′ and f(1) = 1′, where 0′ and 1′ are the null and unity in (Ψ′, ·′),

3. f(ǫx(φ)) = g(ǫx)(f(φ))

is called a domain-free information algebra homomorphism. The map g is still one-to-one and onto. If the map f is onto and one-to-one, then (f, g) is called a domain-free information algebra isomorphism; D and D′ are called isomorphic, written D ≅ D′. We are now going to show that L and LDL are isomorphic and so are D and DLD. In the first case, consider the maps (f, g) from L into LDL, defined by

f(φ) = ([φ]σ, x), if d(φ) = x,
g(tx) = t′x, where t′x([φ]σ, y) = (ǫx([φ]σ), x).    (5.3)

Similarly, we define the pair (f, g) of maps from D into DLD,

f(ψ) = [(ψ, x)]σ, if ǫx(ψ) = ψ,
g(ǫx) = ǫ′x, where ǫ′x([(ψ, y)]σ) = [tx(ψ, y)]σ.    (5.4)

We claim that these pairs of maps are respectively labeled information algebra and domain-free information algebra isomorphisms.

Theorem 5.3 We have L ≅ LDL and D ≅ DLD, and the respective pairs of maps (f, g) (5.3) and (5.4) are the isomorphisms.

Proof. First, consider the labeled case, the pair of maps defined in (5.3). Here we have first, assuming d(φ)= x and d(ψ)= y,

f(φ · ψ) = ([φ · ψ]σ, x ∨ y) = ([φ]σ · [ψ]σ, x ∨ y) = ([φ]σ, x) ·′ ([ψ]σ, y) = f(φ) ·′ f(ψ).

Since f(0x) = ([0x]σ,x) and f(1x) = ([1x]σ,x), null and unit elements are preserved. Further assume d(φ)= y. Then

f(tx(φ)) = ([tx(φ)]σ, x) = (ǫx([φ]σ), x) = t′x([φ]σ, y) = g(tx)(f(φ)).

This proves that the pair of maps (f, g) is a labeled information algebra homomorphism. Consider any element ([φ]σ, x) in LDL. Then this is the image of the element φ from L, so the map f is onto. Further, ([φ]σ, x) = ([ψ]σ, y) implies x = y and [φ]σ = [ψ]σ. By definition of the map f, d(φ) = x and d(ψ) = y = x. But this, together with [φ]σ = [ψ]σ, implies φ = ψ. The map f is therefore one-to-one. Thus, the pair (f, g) is a labeled information algebra isomorphism and therefore L ≅ LDL. Second, consider the domain-free case, the pair of maps defined in (5.4). Let φ and ψ be elements from D with supports x and y respectively. Then

f(φ · ψ) = [(φ · ψ, x ∨ y)]σ = [(φ, x) · (ψ, y)]σ = [(φ, x)]σ ·′ [(ψ, y)]σ = f(φ) ·′ f(ψ).    (5.5)

Further, f(0) = [(0,x)]σ and f(1) = [(1,x)]σ are clearly the null and unit elements in DLD. Next, assume that y is a support of the element ψ in D. Then

f(ǫx(ψ)) = [(ǫx(ψ), x)]σ = [tx(ψ, y)]σ = ǫ′x([(ψ, y)]σ) = g(ǫx)(f(ψ)).

So the pair (f, g) is a domain-free information algebra homomorphism. If [(ψ, x)]σ is an element from DLD, then x is a support of ψ and f maps ψ to [(ψ, x)]σ. So, the map f is onto. Assume that [(φ, x)]σ = [(ψ, y)]σ. Then x and y are supports of φ and ψ respectively; and (φ, x) ≡σ (ψ, y) means that (φ, x ∨ y) = (ψ, x ∨ y). This shows that φ = ψ; the map f is one-to-one.

Therefore, the pair (f, g) is a domain-free information algebra isomorphism, and D ≅ DLD. ⊓⊔
According to this theorem, labeled and domain-free algebras are dual in the technical sense of the theorem above. We may freely pass from labeled to domain-free algebras and back. These two kinds of algebras are the two sides of the same coin. Labeled information algebras are better suited for computational purposes, whereas domain-free information algebras usually are preferred for theoretical studies. For labeled valuation algebras and domain-free valuation algebras a similar duality, the special case of the duality introduced above, has been shown in (Kohlas, 2003a), provided that the semigroup of extraction operators E in the domain-free case is a lattice and provided the labeled valuation algebras are stable. So, for example, probability potentials, densities and Gaussian densities have no domain-free version.

Chapter 6

Order of Information

6.1 The Idempotent Case

Information may be, in informal terms, more or less precise, more or less informative. This should be reflected by some order between pieces of information. Such orders are the subject of the present section. Information order can be studied both in labeled and in domain-free information algebras. We propose to base our discussion on domain-free information algebras. Let then (Ψ, D; ≤, ⊥, ·,ǫ) be a domain-free generalised information algebra. The basic idea is that a piece of information is more informative than another one if one needs to add a further piece of information to the second one to get the first one. So, we define, for φ, ψ ∈ Ψ,

φ ≤ ψ, iff there exists χ ∈ Ψ such that ψ = φ · χ. (6.1)

This relation satisfies

1. Reflexivity: ψ ≤ ψ, since ψ = ψ · 1,

2. Transitivity: φ ≤ ψ and ψ ≤ η imply φ ≤ η, since ψ = φ·χ1, η = ψ ·χ2 imply η = φ · χ1 · χ2.

Antisymmetry however does not hold in general. Therefore, the relation ≤ defined in (6.1) is a preorder in Ψ. If the information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is idempotent, then φ ≤ ψ if and only if φ · ψ = ψ. In fact, ψ = φ · χ gives, by idempotency, if both sides are combined with ψ, ψ = (φ · χ) · ψ = φ · (φ · χ) · ψ = φ · ψ · ψ = φ · ψ. In idempotent information algebras, the relation ≤ is a partial order, since φ ≤ ψ and ψ ≤ φ imply φ = ψ · φ = ψ. Here φ ≤ ψ means that nothing is gained if the piece of information φ is added to ψ; the information in φ is already covered by ψ. Note that in this idempotent case

1. 1 ≤ ψ ≤ 0 for all ψ ∈ Ψ,


2. φ, ψ ≤ φ · ψ,

3. φ ≤ ψ implies φ · η ≤ ψ · η for all η ∈ Ψ,

4. ǫx(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ,

5. φ ≤ ψ implies ǫx(φ) ≤ ǫx(ψ) for all x ∈ D,

6. x ≤ y implies ǫx(ψ) ≤ ǫy(ψ) for all ψ ∈ Ψ.

These are clearly properties one would expect from an information order in general: Vacuous information is least informative; contradiction (which properly speaking is not an information) is the greatest element in the information order; combined information is more informative than each of its parts; the order is compatible with combination; and extraction of information does not increase information. Note that the preorder defined in (6.1) satisfies the first three of these requirements. The other ones are not guaranteed in general and need special consideration. This will be discussed in the following sections. The partial order in idempotent information algebras really is a neat order of information. Therefore we also call idempotent information algebras proper information algebras. This order has been discussed and studied in detail for the case of idempotent valuation algebras in (Kohlas, 2003a; Kohlas & Schmid, 2014; Kohlas & Schmid, 2016). Many of the results for this case carry over to generalised information algebras. This will be discussed in Section 7. In the following two sections, we present two interesting cases where the preorder satisfies also the other requirements stated above. This will also show the relation of the preorder to the partial order of idempotent information and illuminate the limits of the preorder.
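In the idempotent case the order is easy to evaluate in concrete instances. As a small Python sketch (our own illustration) in the set algebra, where combination is intersection, φ ≤ ψ holds iff φ · ψ = ψ, that is, iff ψ ⊆ φ: more informative pieces of information correspond to smaller sets of possible worlds.

def leq(phi, psi):
    # phi <= psi iff combining phi with psi adds nothing to psi
    return phi & psi == psi

U = {1, 2, 3, 4}
vacuous, contradiction = U, set()  # the unit 1 and the null 0
phi, psi = {1, 2, 3}, {1, 2}
print(leq(vacuous, psi), leq(psi, contradiction))  # True True: 1 <= psi <= 0
print(leq(phi, psi), leq(psi, phi))                # True False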

6.2 Regular Algebras

Order in semigroup theory has been studied in several papers; we cite only two of them, (Nambooripad, 1980; Mitsch, 1986). These papers study natural order, that is, an order which can be defined in terms of the operations of the semigroup. This is exactly the case of our definition above. Of particular interest in these theories are regular semigroups. In the context of valuation algebras, such regular semigroups, or rather the generalisation of them to valuation algebras, turned out to be of interest in two respects: First, they allow the introduction of partial division into the algebra, which allows one to adapt local computation architectures known for Bayesian networks to valuation algebras (Lauritzen & Jensen, 1997; Kohlas, 2003a). Secondly, this division also permits a generalisation of conditioning, as known in probability, to valuation algebras (Kohlas, 2003a). Now, as we shall see in this and the subsequent section, this is relevant for information order too.

We summarise here the theory of regular semigroups and adapt it to regular generalised information algebras, generalising the theory of regular valuation algebras (Kohlas, 2003a), and discuss how this applies to information order as defined in (6.1). We start with the definition of regularity in information algebras.

Definition 6.1 Regular Information Algebras: Let (Ψ, D; ≤, ⊥, ·,ǫ) be a generalised domain-free information algebra. An element ψ ∈ Ψ is called regular, if for all x ∈ D there is an element χ ∈ Ψ with support x such that

ψ = ǫx(ψ) · χ · ψ. (6.2)

The information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is called regular, if all its elements are regular.

Of course, the element χ above in the definition of regularity depends both on x and ψ, although we do not express this dependence explicitly. If y is a support of ψ, then regularity implies also

ψ = ψ · χ · ψ. (6.3)

This is the definition of regularity in a semigroup (Ψ; ·) and establishes the link to semigroup theory, see for example (Clifford & Preston, 1967) and the work cited above. Note that in these references semigroups are not assumed to be commutative, as is the case here. In this section we assume that (Ψ, D; ≤, ⊥, ·,ǫ) is regular. Two elements φ and ψ from Ψ are called inverses, if

φ = φ · ψ · φ and ψ = ψ · φ · ψ (6.4)

We keep with the notation in the literature, although in our commutative case we could also have written φ = φ · φ · ψ, etc. Note that φ ≤ ψ and ψ ≤ φ if φ and ψ are inverses. The following results are well known from semigroup theory (see for instance (Kohlas, 2003a)): If φ = φ · ψ · φ, then φ and ψ · φ · ψ are inverses. Each element of a regular semigroup thus has an inverse, and this inverse is unique. If φ and ψ are inverses, then f = φ · ψ is an idempotent element, f · f = f. If S is a subset of Ψ, define ψ · S to be the set {ψ · φ : φ ∈ S}. The set ψ · Ψ is called the principal filter generated by ψ, since ψ · Ψ = {ψ · χ : χ ∈ Ψ} = {φ : ψ ≤ φ}. There exists for any ψ ∈ Ψ a unique idempotent fψ such that ψ · Ψ = fψ · Ψ. The Green relation is defined as

φ ≡γ ψ if φ · Ψ= ψ · Ψ. (6.5)

It is an equivalence relation in Ψ. Its equivalence classes [ψ]γ are groups for all ψ ∈ Ψ (Kohlas, 2003a). So Ψ is a union of disjoint groups. The

unit element of the group [φ]γ is the idempotent fφ and for any ψ ∈ [φ]γ its inverse in the semigroup is the inverse in [φ]γ. Note that if φ and ψ are Green-equivalent, then φ ≤ ψ and ψ ≤ φ. Consider now the idempotents F = {fψ : ψ ∈ Ψ}. They form an idempotent sub-semigroup of (Ψ; ·). According to Section 6.1 they are partially ordered^1 by fφ ≤ fψ if fφ · fψ = fψ. The unit 1 and the null element 0 are idempotents, 0, 1 ∈ F. So, the idempotents F form a bounded semilattice where fφ · fψ = fφ ∨ fψ. Further, we have also

fφ · fψ = fφ·ψ. (6.6)

Since the idempotents fφ uniquely represent their class [φ]γ , we may also define a partial order among classes by [φ]γ ≤ [ψ]γ if fφ ≤ fψ. Then we obtain

[φ · ψ]γ = [φ]γ ∨ [ψ]γ.    (6.7)

We summarise now some results about the preorder in Ψ and the partial order among idempotents in F and among the classes [φ]γ.

Lemma 6.1 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a regular generalised information algebra. Then

1. φ ≤ ψ iff [φ]γ ≤ [ψ]γ,

2. φ ≤ ψ iff ψ · Ψ = φ · ψ · Ψ,

3. φ ≤ ψ iff ψ · Ψ ⊆ φ · Ψ,

4. φ ≤ ψ and ψ ≤ φ iff φ ≡γ ψ.

Proof. 1.) Assume φ ≤ ψ, that is, φ · χ = ψ. Then [φ · χ]γ = [φ]γ ∨ [χ]γ = [ψ]γ. This shows that [φ]γ ≤ [ψ]γ. Conversely, assume [φ]γ ≤ [ψ]γ, such that [φ · ψ]γ = [φ]γ ∨ [ψ]γ = [ψ]γ. This means that ψ · Ψ = φ · ψ · Ψ, hence ψ ∈ φ · ψ · Ψ, therefore ψ = φ · ψ · χ for some χ. But this means that φ ≤ ψ.
2.) We have just proved that ψ · Ψ = φ · ψ · Ψ implies φ ≤ ψ. Assume then that φ ≤ ψ. By item 1 we have also fφ ≤ fψ or fφ · fψ = fφ·ψ = fψ. But then ψ · Ψ = fψ · Ψ = fφ·ψ · Ψ = φ · ψ · Ψ.
3.) If φ ≤ ψ, then ψ = φ · χ. Consider η ∈ ψ · Ψ, then η = ψ · χ′ = φ · χ · χ′. So η ∈ φ · Ψ. Conversely, if ψ · Ψ ⊆ φ · Ψ, then ψ ∈ φ · Ψ, hence there is a χ such that ψ = φ · χ, and thus φ ≤ ψ.
4.) We have by item 2 φ ≤ ψ iff ψ · Ψ = φ · ψ · Ψ and ψ ≤ φ iff φ · Ψ = φ · ψ · Ψ. Therefore, φ · Ψ = ψ · Ψ, hence φ ≡γ ψ. ⊓⊔
So far, this is essentially semigroup theory. We now consider extraction and thus extend this order theory to information algebras. Here is a first important result:

^1 This order is the opposite of the one usually considered in the literature, but it corresponds better to our purposes of information order, as we shall see.

Theorem 6.1 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a regular generalised information algebra. The Green relation ≡γ is a congruence relative to combination and extraction in the algebra (Ψ, D; ≤, ⊥, ·,ǫ).

Proof. The relation ≡γ is an equivalence relation. If φ ≡γ ψ, then [φ]γ = [ψ]γ. Consider any element η of Ψ. Then [φ]γ ∨ [η]γ = [ψ]γ ∨ [η]γ, hence [φ · η]γ = [ψ · η]γ and thus φ · η ≡γ ψ · η. Assume again φ ≡γ ψ, such that φ · Ψ = ψ · Ψ, and consider the operator ǫx. From φ ∈ φ · Ψ = ψ · Ψ we conclude that φ = ψ · χ for some χ ∈ Ψ and therefore ǫx(φ) = ǫx(ψ · χ). By regularity we have ψ = ǫx(ψ) · χ′ · ψ and thus ǫx(φ) = ǫx(ǫx(ψ) · χ′ · χ · ψ) = ǫx(ψ) · ǫx(χ′ · χ · ψ). This shows that ǫx(ψ) ≤ ǫx(φ). By symmetry we have also ǫx(φ) ≤ ǫx(ψ); therefore, by Lemma 6.1 item 4, ǫx(φ) ≡γ ǫx(ψ). This proves that ≡γ is a congruence. ⊓⊔
Here follow a few results on order and extraction, which show some desirable properties, in particular the validity of the expected properties 4.) to 6.) of an information order formulated above (Section 6.1).

Theorem 6.2 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a regular generalised information al- gebra. Then

1. ǫx(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ.

2. φ ≤ ψ implies ǫx(φ) ≤ ǫx(ψ) for all x ∈ D.

3. x ≤ y implies ǫx(ψ) ≤ ǫy(ψ) for all ψ ∈ Ψ.

Proof. 1.) By regularity, ψ = ψ · χ · ǫx(ψ) where ǫx(χ) = χ. Applying the extraction operator on both sides gives ǫx(ψ) = ǫx(ψ) · ǫx(ψ) · χ, hence ǫx(ψ) ≥ χ and therefore [ǫx(ψ)]γ ≥ [χ]γ (Lemma 6.1). From the regularity formula we obtain also [ψ]γ = [ψ]γ ∨ [χ]γ ∨ [ǫx(ψ)]γ = [ψ]γ ∨ [ǫx(ψ)]γ, hence [ǫx(ψ)]γ ≤ [ψ]γ. This implies ǫx(ψ) ≤ ψ (Lemma 6.1).
2.) If φ ≤ ψ, then ψ · Ψ = φ · ψ · Ψ (Lemma 6.1). This implies ψ = ψ · φ · χ for some χ ∈ Ψ. By regularity we have φ = φ · ǫx(φ) · µ and ψ = ψ · ǫx(ψ) · µ′, where x is a support of both µ and µ′. From this we deduce

ǫx(ψ) = ǫx(ψ · φ · χ)
= ǫx(ǫx(ψ) · ǫx(φ) · µ · µ′ · ψ · φ · χ)
= ǫx(ψ) · ǫx(φ) · ǫx(µ · µ′ · ψ · φ · χ).    (6.8)

This proves that ǫx(φ) ≤ ǫx(ψ).
3.) Assume that z is a support of ψ. By C1, z⊥y|y, and by C3, z⊥x|y, since x ≤ y. It follows from Axiom A4 that ǫx(ψ) = ǫx(ǫy(ψ)). Then item 1 above shows that ǫx(ψ) ≤ ǫy(ψ). ⊓⊔
Based on Theorem 6.1, we may consider the quotient algebra (Ψ/γ, D; ≤, ⊥, ·,ǫ), which by general results of universal algebra must still be a generalised information algebra. In fact, we define the following operations between classes

1. Combination: [φ]γ · [ψ]γ = [φ · ψ]γ ,

2. Extraction: ǫx([ψ]γ) = [ǫx(ψ)]γ.

We denote the operations of combination and extraction in Ψ/γ by the same symbols as in Ψ; there is no risk of confusion. The projection pair of maps (f, g), where f(ψ) = [ψ]γ and g(ǫx) = ǫx (meaning, on the right-hand side, the operator in Ψ/γ), is clearly a homomorphism. A homomorphism maintains order. In addition, it turns out that the information algebra (Ψ/γ, D; ≤, ⊥, ·,ǫ) is idempotent.

Theorem 6.3 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a regular generalised information al- gebra and ≡γ the Green relation. Then the quotient algebra (Ψ/γ, D; ≤ , ⊥, ·,ǫ) is an idempotent generalised information algebra, homomorphic to (Ψ, D; ≤, ⊥, ·,ǫ).

Proof. That (Ψ/γ, D; ≤, ⊥, ·,ǫ) is a generalised information algebra follows since the pair of maps defined above forms a homomorphism. Idempotency follows from ǫx([ψ]γ) = [ǫx(ψ)]γ ≤ [ψ]γ (Theorem 6.2), hence [ψ]γ · ǫx([ψ]γ) = [ψ]γ ∨ ǫx([ψ]γ) = [ψ]γ. ⊓⊔
Instead of the quotient algebra (Ψ/γ, D; ≤, ⊥, ·,ǫ) we can also consider the idempotents in the equivalence classes, because there is a one-to-one association between idempotents and their classes. In the signature (F, D; ≤, ⊥, ·, ǫ¯), where F = {fψ : ψ ∈ Ψ}, again the two operations of combination and extraction are defined:

1. Combination: fφ · fψ = fφ·ψ,

2. Extraction: ǫ¯x(fψ) = fǫx(ψ).

This algebra is still an idempotent generalised information algebra, homomorphic to (Ψ, D; ≤, ⊥, ·,ǫ). Because of the idempotency, it can be considered as the deterministic part of (Ψ, D; ≤, ⊥, ·,ǫ) (although it is not a subalgebra of (Ψ, D; ≤, ⊥, ·,ǫ), and ǫ¯ and ǫ are different). By the pair of maps [ψ]γ 7→ fψ and ǫ 7→ ǫ¯, the algebras (Ψ/γ, D; ≤, ⊥, ·,ǫ) and (F, D; ≤, ⊥, ·, ǫ¯) are isomorphic. We refer to the example of probability potentials below for an illustration. To conclude this section, we remark that the relation ψ = φ · χ if φ ≤ ψ, in the context of labeled algebras, can be linked to conditionals in regular algebras, providing a generalisation of conditional probability distributions to general valuation or even generalised information algebras. For this aspect of regular valuation algebras we refer to (Kohlas, 2003a). Those results could be extended to generalised information algebras, but we shall not pursue this line of inquiry.

Further, we remark that the relation φ ≤_2 ψ, if there is an idempotent f such that ψ = f · φ, is a partial order. Of course φ ≤_2 ψ implies φ ≤ ψ. This is the partial order studied in semigroup theory (Nambooripad, 1980; Mitsch, 1986), the goal there being to study the structure of semigroups. The condition ψ = f · φ means in our context that ψ is obtained by combination of φ with a deterministic information f. So ψ results from a kind of conditioning of φ on f. We refer to (Kohlas, 2003a) for an illustration in the context of probability potentials. So, ψ is, according to this order, more informative than φ, if it is obtained by conditioning of φ. Although this makes sense, this order does not seem very interesting from the point of view of information algebra. For example, it does not follow that ǫx(ψ) ≤ ψ. In labeled algebras, the concept of regularity is similarly defined (Kohlas, 2003a). As an example of a regular labeled valuation algebra, we consider the case of the labeled valuation algebra of probability potentials.

Example 6.1 In Section 3.3 probability potentials were introduced as a semiring valuation algebra with values in the semiring A = (R+ ∪ {0}; +, ×). Probability potentials are mappings p : Θ → R+ ∪ {0} from the frames of a f.c.f to nonnegative real numbers. This labeled valuation algebra is regular, in the sense that for any probability potential p on a frame Θ and Λ ≤ Θ there is a potential q on Λ such that p = p · π_Λ(p) · q. In fact, the potential q is determined as follows:

q(λ) = 1/πΛ(p)(λ) if πΛ(p)(λ) ≠ 0, and q(λ) = 0 otherwise.

The idempotent of the group [p]γ of a potential p is the potential fp with fp(θ) = 1 for all θ for which p(θ) > 0 and fp(θ) = 0 for θ with p(θ) = 0. So, the idempotents determine the support sets {θ : p(θ) > 0} of the probability potentials. Note that the projection of an idempotent is not itself an idempotent. The idempotent valuation algebra (F, D; d, ·, π̄), defined similarly as in the domain-free case, corresponds to the labeled set algebra of subsets of the frames Θ. ⊖
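To make the regularity of probability potentials concrete, here is a minimal computational sketch; Python, the finite frame and the particular weights are our illustrative assumptions, not part of the text. It computes the marginal πΛ(p), the potential q of Example 6.1, checks the identity p = p · πΛ(p) · q pointwise, and exhibits the idempotent fp that marks the support set.

from itertools import product

def project(p, frame_vars, keep):
    # Marginalise a potential p (dict: configuration tuple -> weight)
    # onto the variables in `keep` by summing out the remaining ones.
    idx = [frame_vars.index(v) for v in keep]
    out = {}
    for config, w in p.items():
        key = tuple(config[i] for i in idx)
        out[key] = out.get(key, 0.0) + w
    return out

# A potential p on the frame of the variables (x, y), each with values {0, 1}.
frame_vars = ('x', 'y')
p = dict(zip(product([0, 1], repeat=2), [0.2, 0.0, 0.3, 0.5]))

keep = ('x',)
marg = project(p, frame_vars, keep)                  # pi_Lambda(p)

# q as in Example 6.1: 1/pi_Lambda(p)(lam) where the marginal is nonzero.
q = {lam: (1.0 / w if w != 0 else 0.0) for lam, w in marg.items()}

# Regularity: p = p . pi_Lambda(p) . q, checked pointwise on the joint frame.
for config, w in p.items():
    lam = tuple(config[frame_vars.index(v)] for v in keep)
    assert abs(w - w * marg[lam] * q[lam]) < 1e-12

# The idempotent f_p of the group [p]_gamma: the indicator of the support set.
f_p = {c: (1.0 if w > 0 else 0.0) for c, w in p.items()}
assert all(f_p[c] * p[c] == p[c] for c in p)          # f_p . p = p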

6.3 Separative Algebras

Here we go one step beyond regular algebras. Consider again a domain-free generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ). Instead of assuming it to be regular and then using the Green relation to study order, we start with a congruence similar to the Green relation and base the study of order on this relation. Thus, assume that there is a congruence ≡γ relative to combination and extraction in Ψ such that

ǫx(ψ) · ψ ≡γ ψ (6.9)

for all ψ ∈ Ψ and x ∈ D. Since any element ψ has a support, we also have

ψ · ψ ≡γ ψ

The equivalence classes [ψ]γ are semigroups. Indeed, if φ, χ ∈ [ψ]γ, then φ ≡γ ψ and χ ≡γ ψ, hence φ · χ ≡γ ψ · ψ since ≡γ is a congruence. But ψ · ψ ≡γ ψ, thus φ · χ ≡γ ψ, hence φ · χ ∈ [ψ]γ. As in the previous section, the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ǫ) is an idempotent information algebra, homomorphic to (Ψ, D; ≤, ⊥, ·, ǫ), and the operations are defined as

1. Combination: [φ]γ · [ψ]γ = [φ · ψ]γ .

2. Extraction: ǫx([ψ]γ) = [ǫx(ψ)]γ.

Idempotency of (Ψ/γ, D; ≤, ⊥, ·, ǫ) follows from condition (6.9). Since the classes form an idempotent algebra, they are partially ordered by [φ]γ ≤ [ψ]γ if [φ]γ · [ψ]γ = [ψ]γ. Under this order we have

[φ]γ · [ψ]γ = [φ]γ ∨ [ψ]γ .

As in the previous section, we would like this partial order of classes to represent the preorder defined in (6.1), in the sense that φ ≤ ψ iff [φ]γ ≤ [ψ]γ. But for this to hold we need a further condition, since the classes [ψ]γ are, in contrast to regular algebras, in general not groups. In semigroup theory, embeddings of semigroups into a disjoint union of groups are studied, see (Clifford & Preston, 1967). A sufficient condition for this to be possible is cancellativity, that is,

φ · ψ = φ · ψ′ (6.10)

implies ψ = ψ′. We therefore assume that all semigroups [φ]γ are cancellative. This leads to the following definition.

Definition 6.2 Separative Information Algebras: Let (Ψ, D; ≤, ⊥, ·,ǫ) be a generalised domain-free information algebra. It is called separative, if there exists a congruence ≡γ relative to combination and extraction in Ψ such that

1. ǫx(ψ) · ψ ≡γ ψ for all ψ ∈ Ψ and for all x ∈ D.

2. The semigroups [ψ]γ are cancellative for all ψ ∈ Ψ.

We remark that separative valuation algebras have been studied in (Kohlas, 2003a) with respect to local computation with division and to the generalisation of conditionals from probability to general valuations or information. As in the case of regular algebras, the theory of conditionals is closely related to natural order. Here we focus on information order in generalised information algebras. For examples of separative valuation algebras, we refer to

(Kohlas, 2003a; Pouly & Kohlas, 2011). We also mention that, as far as local computation with division and conditioning is concerned, it is sufficient that ≡γ is a congruence with respect to combination only. But for the present theory of order, congruence with respect to extraction is also desirable, and many separative instances satisfy this condition. A cancellative semigroup such as [ψ]γ can be embedded into a group. The classical procedure is analogous to the embedding of the integers into the rational numbers: Consider ordered pairs (φ, ψ) for φ, ψ ∈ [ψ]γ and define a relation among pairs by

(φ, ψ) ≡ (φ′, ψ′) iff φ · ψ′ = φ′ · ψ.

This is an equivalence relation thanks to cancellativity. Let [φ, ψ] denote the equivalence classes of this relation and let γ(ψ) denote the set of these classes formed from pairs of elements of [ψ]γ. Then we define the operation

[φ, ψ] · [φ′, ψ′] = [φ · φ′, ψ · ψ′] in γ(ψ). This is well defined, since the equivalence is a congruence relative to the operation (φ, ψ) · (φ′, ψ′) = (φ · φ′, ψ · ψ′) between pairs. With this operation γ(ψ) becomes a group. Its unit is [ψ, ψ] and the inverse of [φ, ψ] is [ψ, φ]. The class [ψ]γ is embedded into γ(ψ) as a semigroup by the map

ψ ↦ [ψ · ψ, ψ].

Define

Ψ∗ = ⋃{γ(ψ) : ψ ∈ Ψ}.

In order to distinguish elements of Ψ∗ from those of Ψ, we denote elements of Ψ∗ by lower case letters like a, b, . . .. The union of groups Ψ∗ becomes a semigroup, if we define for a = [φa, ψa] and b = [φb, ψb],

a · b = [φa · φb, ψa · ψb].

This operation is well-defined, associative and commutative. Thus (Ψ∗; ·) is a commutative semigroup and (Ψ; ·) is embedded into it as a semigroup by the map ψ 7→ [ψ · ψ, ψ] as can easily be verified. In the sequel, in order to simplify notation, we denote the elements [ψ · ψ, ψ] of the image of (Ψ; ·) under this map simply by ψ. We may carry over the order between the classes [ψ]γ to the groups γ(ψ), since there is a one-to-one relation between classes and groups. Hence γ(φ) ≤ γ(ψ) iff [φ]γ ≤ [ψ]γ . Then we deduce that

γ(φ · ψ) = γ(φ) ∨ γ(ψ).
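The following minimal sketch carries out this group-of-fractions construction; Python and the choice of the positive integers under multiplication as the cancellative semigroup standing in for a class [ψ]γ are our illustrative assumptions. Pairs are kept in a reduced form obtained by cancelling the gcd, so two pairs are equivalent exactly when their reduced forms coincide.

from math import gcd

def frac(phi, psi):
    # The class [phi, psi], i.e. the 'fraction' phi/psi, in reduced form.
    g = gcd(phi, psi)
    return (phi // g, psi // g)

def mul(a, b):
    # [phi, psi] . [phi', psi'] = [phi . phi', psi . psi'].
    return frac(a[0] * b[0], a[1] * b[1])

def inv(a):
    # The inverse of [phi, psi] is [psi, phi].
    return frac(a[1], a[0])

def embed(psi):
    # The embedding psi |-> [psi . psi, psi] of the semigroup into the group.
    return frac(psi * psi, psi)

unit = frac(1, 1)                            # the unit [psi, psi]
a = embed(6)                                 # the semigroup element 6
b = frac(2, 3)                               # a genuine fraction, not in the image
assert mul(a, inv(a)) == unit                # a . a^{-1} = unit
assert mul(embed(2), embed(3)) == embed(6)   # the embedding is a homomorphism
assert mul(a, b) == frac(4, 1)               # 6 . (2/3) = 4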

We now extend the natural order (6.1) to the semigroup (Ψ∗; ·),

a ≤ b, iff there exists a c ∈ Ψ∗ such that b = a · c. (6.11)

Note that for elements of Ψ, this preorder φ ≤ ψ admits that in ψ = φ · c, the factor c which completes φ to ψ need no longer be an element of Ψ, but only of Ψ∗.

Lemma 6.2 In Ψ∗ we have a ≤ b iff γ(a) ≤ γ(b).

Proof. Assume first a ≤ b, hence a · c = b for some c ∈ Ψ∗. Then γ(b) = γ(a · c) = γ(a) ∨ γ(c), hence γ(a) ≤ γ(b). Conversely, assume γ(a) ≤ γ(b). Then γ(b) = γ(a) ∨ γ(b) = γ(a · b). Therefore we see that a · b and b both belong to the group γ(b) and therefore b = a · b · (a · b)⁻¹ · b, thus a ≤ b. ⊓⊔

We remark that for any element a of Ψ∗ we have a = a · a⁻¹ · a. This means that the semigroup (Ψ∗; ·) is regular. And a ≡γ b implies a · Ψ∗ = b · Ψ∗. In fact, if d ∈ a · Ψ∗, then d = a · c for some c ∈ Ψ∗. It then follows that d = b · b⁻¹ · a · c, hence d ∈ b · Ψ∗. In the same way it follows that d ∈ b · Ψ∗ implies d ∈ a · Ψ∗, hence a · Ψ∗ = b · Ψ∗. Conversely, if a · Ψ∗ = b · Ψ∗, then a = b · c and b = a · c′ for some c, c′ ∈ Ψ∗. This means that a ≤ b and b ≤ a, hence γ(a) = γ(b), or a ≡γ b. This shows that the congruence ≡γ is the Green relation in the regular semigroup (Ψ∗; ·). As a consequence of this remark and of Lemma 6.2 we have, as in the previous section (Lemma 6.1), the following result:

Lemma 6.3 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a separative information algebra. Then

1. a ≤ b and b ≤ a iff γ(a)= γ(b),

2. φ ≤ ψ iff [φ]γ ≤ [ψ]γ ,

3. φ ≤ ψ and ψ ≤ φ iff [φ]γ = [ψ]γ .

As in the case of regular information algebras, we have for separative information algebras the same results regarding order and extraction (see Theorem 6.2).

Theorem 6.4 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a separative generalised information algebra. Then

1. ǫx(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ.

2. φ ≤ ψ implies ǫx(φ) ≤ ǫx(ψ) for all x ∈ D.

3. x ≤ y implies ǫx(ψ) ≤ ǫy(ψ) for all ψ ∈ Ψ.

Proof. 1.) From (6.9) we obtain γ(ǫx(ψ) · ψ) = γ(ǫx(ψ)) ∨ γ(ψ) = γ(ψ). This shows that γ(ǫx(ψ)) ≤ γ(ψ), which implies ǫx(ψ) ≤ ψ (Lemma 6.2). 2.) From φ ≤ ψ we obtain γ(φ) ≤ γ(ψ), and from item 1 just proved, γ(ǫx(φ)) ≤ γ(φ). Thus we have γ(ǫx(φ) · ψ) = γ(ǫx(φ)) ∨ γ(ψ) = γ(ψ). Further, we have ǫx(ǫx(φ) · ψ) = ǫx(φ) · ǫx(ψ). Therefore, since ≡γ is a congruence, we conclude that γ(ǫx(φ) · ǫx(ψ)) = γ(ǫx(ψ)), and this shows that ǫx(φ) ≤ ǫx(ψ). 3.) This is proved exactly as item 3 of Theorem 6.2. ⊓⊔

If (Ψ, D; ≤, ⊥, ·, ǫ) is a separative generalised information algebra, then the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ǫ) is an idempotent information algebra, homomorphic to (Ψ, D; ≤, ⊥, ·, ǫ), as noted above. Any group γ(ψ) has a unique unit, which is its only idempotent element, denoted by fψ. The idempotent information algebra of the idempotents, or units, of the groups γ(ψ), (F, D; ≤, ⊥, ·, ǭ), with the operations defined as follows

1. Combination: fφ · fψ = fφ·ψ,

2. Extraction: ǭx(fψ) = fǫx(ψ),

is isomorphic to the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ǫ). Note, however, that the elements of F do not, in general, belong to Ψ. Nevertheless, we may still consider the elements of F as the deterministic parts of Ψ∗. As in the regular case, we may define an order φ ≤₂ ψ if there is an idempotent f such that ψ = f · φ, and again φ ≤₂ ψ implies φ ≤ ψ. This is, as before, a partial order, since φ ≤ ψ and ψ ≤ φ imply γ(φ) = γ(ψ) and ψ = fψ · φ. But fψ = fφ, hence ψ = fφ · φ = φ. The expression fψ · φ is again a kind of conditioning, namely the combination of a deterministic element fψ with an information element φ. We refer to (Kohlas, 2003a) for a discussion of the separative valuation algebra of densities, which illustrates these statements. Again, it makes sense that an information ψ obtained from another one by conditioning, ψ = fψ · φ where fφ ≤ fψ, is considered to be more informative. At least in probability theory this seems evident.

Chapter 7

Proper Information

7.1 Ideal Completion

In this section we consider domain-free idempotent generalised information algebras (Ψ, D; ≤, ⊥, ·, ǫ), that is, proper information algebras. Their associated idempotent valuation algebras were called information algebras in (Kohlas, 2003a). There it was shown that the partial order introduced by idempotency plays an important role in the theory of idempotent information algebras, see also (Kohlas & Schmid, 2014; Kohlas & Schmid, 2016). Many of these results carry over to the more general case of idempotent generalised information algebras. Some of them will be presented here. In an idempotent information algebra, we have defined φ ≤ ψ if φ · ψ = ψ, see Section 6.1. So, ψ is more informative than φ if adding φ to ψ gives nothing new. Another way to express this is to say that the piece of information φ is contained in ψ, or also that the piece of information φ is implied by ψ. As a consequence, combination corresponds to the supremum in this order, φ · ψ = φ ∨ ψ. So the idempotent semigroup (Ψ; ·) determines a join-semilattice (Ψ; ≤). If we want to stress the point of view of order we write φ ∨ ψ instead of φ · ψ. Instead of looking at a particular piece of information ψ, we may look at families of pieces of information I. Such a family is consistent and complete if for any ψ ∈ I, all elements less informative, implied by ψ, belong also to I, and whenever φ and ψ belong to I, then the combined information φ · ψ belongs to I. This means that I is an ideal in the semilattice (Ψ; ≤). More formally, I is an ideal if

1. φ ≤ ψ and ψ ∈ I imply φ ∈ I,

2. φ, ψ ∈ I imply φ ∨ ψ ∈ I.

The set ↓ψ of all elements less informative than or implied by ψ forms an ideal, a principal ideal. Note that the unit is in all ideals, and if ψ belongs to an ideal, then all extractions ǫx(ψ) belong to the ideal also. The null element 0 belongs only to the ideal Ψ. All ideals different from Ψ are called proper ideals. An ideal can also be seen as information. In fact, we may extend the operations of combination and extraction from the algebra (Ψ, D; ≤, ⊥, ·, ǫ) to the set IΨ of its ideals:

1. Combination:

I1 · I2 = {ψ ∈ Ψ : ∃ψ1 ∈ I1, ψ2 ∈ I2 such that ψ ≤ ψ1 · ψ2}. (7.1)

2. Extraction:

ǫx(I)= {ψ ∈ Ψ : ∃φ ∈ I such that ψ ≤ ǫx(φ)}. (7.2)

These operations are well-defined, since they yield ideals in both cases. It turns out that the set of ideals IΨ of Ψ with these operations in fact becomes an information algebra. In order to show this, we need some preparations. First, the intersection of an arbitrary family of ideals is still an ideal. Therefore, the ideal generated by a subset X of Ψ can be defined as the smallest ideal containing X,

I(X) = ∩{I : I ∈ IΨ, X ⊆ I}.

Alternatively, we also have

I(X)= {ψ ∈ Ψ : ∃ ψ1,...,ψn ∈ X such that ψ ≤ ψ1 · . . . · ψn}. (7.3)

In particular, we have I1 · I2 = I(I1 ∪ I2). If X is a finite subset of Ψ, then

I(X) = ↓∨X,

that is, the ideal generated by X is the principal ideal of the element ∨X ∈ Ψ. These are well-known results (Kohlas, 2003a). From lattice theory we know that a system closed under arbitrary intersections, a so-called ∩-system, forms a complete lattice under the partial order of inclusion, see (Davey & Priestley, 2002). Infimum is intersection and the supremum is given by

∨Y = ∩{I ∈ IΨ : ∪{J : J ∈ Y} ⊆ I},

where Y is any family of ideals. In particular, we have I1 · I2 = I1 ∨ I2. For IΨ to become an information algebra in the sense of Chapt. 5, any ideal must have a support x ∈ D, that is, ǫx(I) = I. But if the semilattice D does not have a greatest element, this may not hold. If D has a greatest element ⊤, then by the support axiom for the information algebra (Ψ, D; ≤, ⊥, ·, ǫ) this element is necessarily a support for every element of Ψ. In this case any ideal of Ψ has at least this element as support. If D has no greatest element, we may adjoin one in the following way: Consider (D ∪ {⊤}; ≤, ⊥) where x ≤ ⊤ and x ∨ ⊤ = ⊤ for all elements x of D ∪ {⊤}. Extend the conditional independence relation with x⊥y|⊤ for all x and y in D ∪ {⊤}. Then (D ∪ {⊤}; ≤, ⊥) is still a q-separoid. Further, define ǫ⊤ as the identity map of Ψ. Then, clearly, (Ψ, D ∪ {⊤}; ≤, ⊥, ·, ǫ) is still an idempotent generalised information algebra; in particular the axioms of extraction, combination and idempotency are still valid in this extended structure. Therefore, we assume in the sequel, without loss of generality, that D has a top element. Then (IΨ, D; ≤, ⊥, ·, ǫ) becomes an idempotent generalised information algebra.

Theorem 7.1 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an idempotent generalised domain-free information algebra such that (D; ≤) has a greatest element. Then (IΨ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised domain-free information algebra and (Ψ, D; ≤, ⊥, ·, ǫ) is embedded into it.

Proof. Axiom A0 holds, since (D; ≤, ⊥) is the same q-separoid as in (Ψ, D; ≤, ⊥, ·, ǫ). Axiom A1 (Semigroup) holds, since IΨ is under the order induced by combination a complete lattice. Axiom A2 (Support) follows since the greatest element of D is a support of any ideal I. Further, we must show that if ǫx(I) = I and y ≥ x, then ǫy(I) = I. Remark that if ψ ∈ ǫx(I) = I, then ψ ≤ ǫx(φ) = η for some φ ∈ I. Note that η has support x and belongs to I. So ψ ≤ η = ǫx(η), η ∈ I. Therefore, if ǫx(I) = I, then any element ψ of I is dominated by an element with support x from I. Now, clearly ǫy(I) ⊆ I. Consider then an element ψ from I. By the preceding remark there is an element η in I with support x, such that ψ ≤ η. By the support axiom for the algebra (Ψ, D; ≤, ⊥, ·, ǫ), y ≥ x is then also a support for η. So, we have ψ ≤ ǫy(η). This shows that ψ ∈ ǫy(I), hence ǫy(I) = I. This verifies Axiom A2 for the algebra (IΨ, D; ≤, ⊥, ·, ǫ). The unit in (IΨ; ·) is the principal ideal {1} and the null element is Ψ. It is obvious that the Unit and Null Axiom A3 holds. To verify the Extraction Axiom A4, we must show that x⊥y|z and ǫx(I) = I implies ǫy(I) = ǫy(ǫz(I)). Now, if ψ ∈ ǫy(ǫz(I)), then ψ ≤ ǫy(φ) for some element φ such that φ ≤ ǫz(η) for an element η in I. Then ψ ≤ ǫy(ǫz(η)) ≤ ǫy(η). This shows that ǫy(ǫz(I)) ⊆ ǫy(I). Consider then an element ψ ∈ ǫy(I). There is an element η′ ∈ I such that ψ ≤ ǫy(η′). But η′ ∈ I = ǫx(I) implies that there is also an element η ∈ I with support x such that η′ ≤ η. Therefore ψ ≤ ǫy(η) and η = ǫx(η). By the Extraction Axiom A4 for the algebra (Ψ, D; ≤, ⊥, ·, ǫ), we have ǫy(η) = ǫy(ǫz(η)). Since ǫz(η) belongs to ǫz(I), this implies that ψ ∈ ǫy(ǫz(I)), therefore ǫy(I) = ǫy(ǫz(I)) and Axiom A4 holds in the algebra (IΨ, D; ≤, ⊥, ·, ǫ) too. For the Combination Axiom A5, assume x⊥y|z and ǫx(I1) = I1, ǫy(I2) = I2. Then we must show that ǫz(I1 · I2) = ǫz(I1) · ǫz(I2). Consider an element ψ from ǫz(I1 · I2), such that ψ ≤ ǫz(φ) for a φ ∈ I1 · I2. Then there are elements φ1 = ǫx(φ1) ∈ I1 and φ2 = ǫy(φ2) ∈ I2 such that φ ≤ φ1 · φ2. This implies ψ ≤ ǫz(φ1 · φ2). By Axiom A5 for the algebra (Ψ, D; ≤, ⊥, ·, ǫ) we then obtain ψ ≤ ǫz(φ1 · φ2) = ǫz(φ1) · ǫz(φ2), which shows that ψ ∈ ǫz(I1) · ǫz(I2). Conversely, consider ψ ∈ ǫz(I1) · ǫz(I2). Then ψ ≤ ψ1 · ψ2, where ψ1 ≤ ǫz(φ1) and φ1 = ǫx(φ1), φ1 ∈ I1, and, similarly, ψ2 ≤ ǫz(φ2), with φ2 = ǫy(φ2) and φ2 ∈ I2. So we have ψ ≤ ǫz(φ1) · ǫz(φ2) and by Axiom A5 for the algebra (Ψ, D; ≤, ⊥, ·, ǫ) the right hand side equals ǫz(φ1 · φ2). So ψ ≤ ǫz(φ1 · φ2) with φ1 ∈ I1 and φ2 ∈ I2. This shows that ψ ∈ ǫz(I1 · I2). Thus we obtain that ǫz(I1 · I2) = ǫz(I1) · ǫz(I2). Idempotency, Axiom A6, follows from ǫx(I) ⊆ I, hence ǫx(I) · I = ǫx(I) ∨ I = I. This proves that (IΨ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra. Consider now the pair of maps (f, g) defined by f(ψ) = ↓ψ and g(ǫx) = ǫx (on the left ǫx as extraction in Ψ, on the right as extraction in IΨ). Then

f(φ · ψ) = ↓(φ · ψ)= ↓φ · ↓ψ,

f(ǫx(ψ)) = ↓ǫx(ψ) = ǫx(↓ψ). (7.4)

These identities follow directly from the definition of combination and extraction in IΨ. Further, we have f(1) = {1} and f(0) = Ψ. So, the pair of maps (f, g) is a homomorphism. It is also one-to-one, since ↓φ = ↓ψ implies φ = ψ. So (f, g) is an embedding of (Ψ, D; ≤, ⊥, ·, ǫ) into (IΨ, D; ≤, ⊥, ·, ǫ). This completes the proof. ⊓⊔

Ideal completion of an idempotent valuation algebra gives an idempotent valuation algebra. This case has already been discussed in (Kohlas, 2003a). From a lattice-theoretic point of view, ideal completion completes the join-semilattice (Ψ; ≤) to a complete lattice. In terms of combination or aggregation of information, completion means that any set X of pieces of information in Ψ can be aggregated, namely to the ideal I(X) generated by X. A different approach to completion in idempotent valuation algebras has been proposed in (Guan & Li, 2010).
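As an illustration of the ideal operations (7.1)-(7.3), here is a small sketch; Python and the carrier, the join-semilattice of subsets of a three-element set with union as combination and an arbitrary illustrative extraction operator, are our assumptions, not part of the text.

from itertools import combinations

ELEMS = ['a', 'b', 'c']
PSI = [frozenset(c) for r in range(len(ELEMS) + 1)
       for c in combinations(ELEMS, r)]       # all subsets; the empty set is 1

V_x = frozenset({'a', 'b'})
def eps_x(phi):                               # an illustrative extraction
    return phi & V_x

def ideal(X):
    # I(X) as in (7.3): all psi below a finite combination of elements of X.
    bound = frozenset().union(*X) if X else frozenset()
    return {psi for psi in PSI if psi <= bound}

def comb_ideals(I1, I2):
    # Combination (7.1) of two ideals.
    return {psi for psi in PSI
            if any(psi <= (p1 | p2) for p1 in I1 for p2 in I2)}

def extract_ideal(I):
    # Extraction (7.2) applied to an ideal.
    return {psi for psi in PSI if any(psi <= eps_x(phi) for phi in I)}

I1 = ideal([frozenset({'a'})])                # the principal ideal of {a}
I2 = ideal([frozenset({'b'}), frozenset({'c'})])
assert comb_ideals(I1, I2) == ideal([frozenset({'a', 'b', 'c'})])
assert extract_ideal(I2) == ideal([frozenset({'b'})])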

7.2 Compact Algebras

In information processing, only “finite” information can be handled. “Infinite” information can however often be approximated by “finite” elements. This aspect of finiteness is discussed in this section. It must be stressed that not every aspect of finiteness is captured. For example, no questions of computability and related issues will be treated. On the other hand, many aspects discussed in this section are also considered in domain theory. In fact, much of this section is motivated by domain theory. However, the one crucial feature not addressed in domain theory is information extraction.

Also, domain theory places more emphasis on order, approximation and convergence of information and less on combination. So, although the subject is similar to domain theory, it is treated here with a different emphasis and goal. It will be seen that the subject is also closely related to algebraic and, more generally, continuous lattices. Consider an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ). In the set of pieces of information Ψ we single out below a subset Ψf of elements which are considered to be finite. In fact, we adopt the concept of finiteness of order theory. For this purpose, we need the concept of a directed set in Ψ: A subset X of Ψ is called directed, if it is not empty and with any two elements φ1 and φ2 belonging to it, there is an element φ ∈ X such that φ1, φ2 ≤ φ. Then finite elements are defined as follows (Davey & Priestley, 1990):

Definition 7.1 Finite Elements: An element ψ in (Ψ; ≤) is called finite or also compact, if for any directed set X in Ψ whose supremum exists in Ψ, from ψ ≤ ∨X it follows that there is an element φ ∈ X such that ψ ≤ φ.

Let Ψf denote the set of finite elements of (Ψ; ≤). This set of finite elements is closed under combination. Indeed, if φ and ψ are finite and φ · ψ ≤ ∨X for some directed subset X of Ψ, then from φ, ψ ≤ φ · ψ it follows that there are elements φ′ and ψ′ in X such that φ ≤ φ′ and ψ ≤ ψ′. Further, since X is directed, there is an element χ in X such that φ′, ψ′ ≤ χ, hence φ · ψ ≤ φ′ · ψ′ ≤ χ ∈ X. This shows that φ · ψ is finite. Further, the unit 1 is clearly finite. One might expect that extraction of finite information always results in finite information. Although this is the case in many instances, it is not true in general. We now want to consider information algebras where every element can be approximated by the finite elements it dominates, that is, the finite elements which are less informative. In fact, we want even more: An element with support x must be approximated by the finite elements supported by the same domain x. This is captured by the following definition.

Definition 7.2 Compact Information Algebra: An idempotent, domain-free generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is called compact, if for all φ ∈ Ψ and x ∈ D with φ = ǫx(φ),

φ = ǫx(φ) = ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ}. (7.5)

Identity (7.5) means that the finite elements of domain x are dense in this domain: finite elements approximate all elements of this domain. This is called strong density. It implies also that the whole set of finite elements is dense in Ψ, in the sense that they approximate any element φ of Ψ. In fact, such an element has some support x (by the Support Axiom), hence by strong density,

φ = ǫx(φ) = ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ} ≤ ∨{ψ ∈ Ψf : ψ ≤ φ} ≤ φ.

Thus, we see that

φ = ∨{ψ ∈ Ψf : ψ ≤ φ}. (7.6)

This is also called weak density. Note that weak density does not imply strong density. A counterexample for this is given in (Kohlas, 2003a). Here is a first result, which expresses a continuity property of the extraction operators ǫx.

Theorem 7.2 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a compact information algebra. If X is a directed subset of Ψ such that ∨X exists in Ψ, and x ∈ D, then

∨{ǫx(φ) : φ ∈ X} exists in Ψ and

ǫx(∨X) = ∨{ǫx(φ) : φ ∈ X}. (7.7)

Proof. If φ ∈ X, then φ ≤ ∨X, hence ǫx(φ) ≤ ǫx(∨X), and therefore ǫx(∨X) is an upper bound of the ǫx(φ) for φ ∈ X. By density,

ǫx(∨X) = ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ ∨X}.

But ψ ≤ ∨X implies that there is a φ ∈ X such that ψ ≤ φ. Then ψ = ǫx(ψ) ≤ ǫx(φ), hence ǫx(∨X) is the least upper bound of the ǫx(φ) for φ ∈ X, therefore ǫx(∨X) = ∨{ǫx(φ) : φ ∈ X}. ⊓⊔

In general, not every directed set in Ψ has a supremum in Ψ. But if this is the case, i.e. if (Ψ; ≤) is a directed complete partial order (dcpo), then, since Ψ is closed under finite joins, it follows by standard results from lattice theory that (Ψ; ≤) is a complete lattice. If (Ψ, D; ≤, ⊥, ·, ǫ) is compact, then weak density (7.6) holds in Ψ. A complete lattice (Ψ; ≤) with the additional condition (7.6) is called an algebraic lattice (Gierz, 2003). Therefore, we call a compact information algebra (Ψ, D; ≤, ⊥, ·, ǫ), in which (Ψ; ≤) is a dcpo, an algebraic information algebra.

Definition 7.3 A compact information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is called al- gebraic, if (Ψ; ≤) is a dcpo (directed complete partial order).

In summary, an algebraic information algebra is an algebraic lattice in which strong density holds; and conversely, an information algebra which is an algebraic lattice in which strong density holds is an algebraic information algebra. In (Kohlas, 2003a), an alternative, but equivalent definition of an algebraic information algebra has been given, according to the following theorem.

Theorem 7.3 An idempotent, domain-free generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is an algebraic information algebra, if and only if there exists a subset Ψ′ of Ψ, closed under combination and containing the unit 1, satisfying the following conditions:

1. Convergence: For any directed subset X of Ψ′, the supremum ∨X exists in Ψ.

2. Density: For any φ ∈ Ψ and x ∈ D such that φ = ǫx(φ),

φ = ǫx(φ) = ∨{ψ ∈ Ψ′ : ψ = ǫx(ψ) ≤ φ}. (7.8)

3. Compactness: For any directed subset X of Ψ′ and ψ ∈ Ψ′ such that ψ ≤ ∨X, there is an element φ ∈ X such that ψ ≤ φ.

Then Ψ′ equals the set of finite elements Ψf of (Ψ; ≤).

The full proof of this theorem can be found in (Kohlas, 2003a). Although it is given there for idempotent valuation algebras, the proof carries over to the more general case of idempotent information algebras. Item 1 in the conditions above is sufficient to make (Ψ; ≤) a complete lattice; item 3 is then equivalent to the finiteness condition in (Ψ; ≤), such that Ψ′ = Ψf, and item 2 becomes strong density. The algebra is therefore compact, hence algebraic. An algebraic information algebra may be obtained from any information algebra by ideal completion.

Theorem 7.4 If (Ψ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra such that (D; ≤) has a greatest element, then its ideal completion (IΨ, D; ≤, ⊥, ·, ǫ) is an algebraic information algebra with the set {↓φ : φ ∈ Ψ} of principal ideals as finite elements.

This has already been shown in (Kohlas, 2003a) for idempotent valuation algebras. Since it has been shown in Theorem 7.1 in the previous section that (IΨ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra, the rest of the proof does not change for idempotent generalised information algebras, since only convergence, density and compactness in Theorem 7.3 need to be verified; therefore we refer to (Kohlas, 2003a) for a proof. The algebraic information algebra (IΨ, D; ≤, ⊥, ·, ǫ) is fully determined by its finite elements, that is, the elements of Ψ. This holds in general for any compact or algebraic information algebra, as the following theorems show. This is similar to a well-known result in domain theory (Stoltenberg-Hansen et al., 1994). To show that in a compact algebra the elements are fully determined by their finite elements, there are at least two possible approaches. One is ideal completion of the partial order (Ψf; ≤) of finite elements; the other adds suprema of directed sets of finite elements which have no suprema in Ψ (Guan & Li, 2010). It seems that for the second approach the finite elements must be closed under extraction, whereas for ideal completion this is not necessary. Otherwise the two approaches yield equivalent results. We prove first the ideal representation theorem for algebraic information algebras, which shows that an algebraic information algebra is determined by its finite elements through ideal completion.

Theorem 7.5 Assume (Ψ, D; ≤, ⊥, ·, ǫ) to be a domain-free algebraic information algebra, where (D; ≤) has a greatest element ⊤. If Ψf are the finite elements of the algebraic information algebra, then the ideal completion (IΨf, D; ≤, ⊥, ·, ǫ) may be extended to an algebraic information algebra, isomorphic to (Ψ, D; ≤, ⊥, ·, ǫ).

Proof. If Ψf is closed under extraction, then Ψf ∪ {0} forms a subalgebra of (Ψ, D; ≤, ⊥, ·, ǫ), and it follows in this case from Theorem 7.1 that

(IΨf, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra. But we do not assume that the finite elements are closed under extraction, so we must first show that from IΨf and D we may nevertheless construct an idempotent information algebra. We define first combination between ideals I1 and I2 of Ψf as usual,

I1 · I2 = {ψ ∈ Ψf : ∃ψ1 ∈ I1, ψ2 ∈ I2 such that ψ ≤ ψ1 · ψ2}.

As before, the ideals in Ψf form a ∩-system, hence a complete lattice, with combination as join. So, (IΨf ; ·) is a commutative semigroup with unit {1} and null element Ψf . Hence the Semigroup Axiom A1 of a domain-free information algebra is satisfied. Next, for any x ∈ D we define an extraction operator

ǫx(I)= {ψ ∈ Ψf : ∃φ ∈ I such that ψ ≤ ǫx(φ)}.

Clearly, ǫx(I) is still an ideal in Ψf and ǫx therefore maps IΨf into itself. The greatest element ⊤ in D is a support for all elements of Ψf, hence we have ǫ⊤(I) = I for every ideal in Ψf. Thus, every element I of IΨf has a support. Assume further that x is a support for the ideal I of Ψf, that is ǫx(I) = I, and x ≤ y. Note that ǫy(I) ⊆ I and assume ψ ∈ I. Then ψ ≤ ǫx(φ) for some φ ∈ I. But ǫx(φ) ≤ ǫy(φ). This implies ψ ∈ ǫy(I) and therefore

ǫy(I) = I. This confirms the Support Axiom A2 in (IΨf , D; ≤, ⊥, ·,ǫ). The 7.2. COMPACT ALGEBRAS 107

Unit and Null Axiom A3 is obvious for ideals in Ψf. The Extraction Axiom A4 and the Combination Axiom A5 are proved just as in Theorem 7.1. The Idempotency Axiom A6 follows since ǫx(I) ⊆ I. Axiom A0 is inherited from the algebra (Ψ, D; ≤, ⊥, ·, ǫ). This shows that (IΨf, D; ≤, ⊥, ·, ǫ) is a domain-free idempotent information algebra. Consider now the map φ ↦ Aφ = {ψ ∈ Ψf : ψ ≤ φ}. Since Aφ is an ideal in Ψf, this maps Ψ into IΨf. Consider any ideal I in Ψf. Then the supremum of I exists in Ψ, since the information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is algebraic. Let φ = ∨I and consider an element ψ ∈ Ψf such that ψ ≤ φ. Then, by compactness, there is an element χ ∈ I dominating ψ. This implies ψ ∈ I and this shows that I = Aφ. So, the map is onto IΨf. It is one-to-one, since Aφ = Aψ implies φ = ψ. It remains to show that the map is a homomorphism. Clearly A1 = {1}, which is the unit element, and A0 = Ψf, which is the null element of the ideal completion. Consider two elements φ and ψ from Ψ. Then, Aφ·ψ contains both Aφ and Aψ and Aφ · Aψ = I(Aφ ∪ Aψ) ⊆ Aφ·ψ. If I is an ideal in Ψf which contains both Aφ and Aψ, then there is an element χ ∈ Ψ such that I = Aχ and φ, ψ ≤ χ, hence Aφ·ψ ⊆ I. Thus we conclude that Aφ·ψ = Aφ · Aψ. Further, for any x in D, we have by definition

ǫx(Aφ)= {ψ ∈ Ψf : ∃χ ∈ Aφ such that ψ ≤ ǫx(χ)}.

So, since ǫx(χ) ≤ ǫx(φ), it follows that ǫx(Aφ) ⊆ Aǫx(φ). Consider then an element ψ in Aǫx(φ). From φ = ∨Aφ and from Theorem 7.2 we conclude that

ǫx(φ) = ∨Aǫx(φ) = ∨{ǫx(χ) : χ ∈ Aφ}.

The set X = {ǫx(χ) : χ ∈ Aφ} is directed. By compactness, there is then an element η ∈ Aφ such that ψ ≤ ǫx(η). But this means that ψ ∈ ǫx(Aφ).

Therefore, we see that Aǫx(φ) = ǫx(Aφ), which shows that the map φ ↦ Aφ is an information algebra homomorphism between (Ψ, D; ≤, ⊥, ·, ǫ) and

(IΨf, D; ≤, ⊥, ·, ǫ). This concludes the proof. ⊓⊔

This is a representation theorem for algebraic information algebras. What can be said about compact algebras? We first show that a compact algebra can be extended to an algebraic one by adding the missing suprema of directed sets. Let then (Ψ, D; ≤, ⊥, ·, ǫ) be a compact information algebra with finite elements Ψf. We assume now that Ψf is closed under extraction. The approach used here follows (Guan & Li, 2010). Denote by Dif the family of directed sets of Ψf ∪ {0}. For such directed sets X and Y in Dif, we define X · Y = {φ · ψ : φ ∈ X, ψ ∈ Y},

ǫx(X) = {ǫx(φ) : φ ∈ X}. (7.9)

These operations yield again directed sets of finite elements:

Lemma 7.1 If X,Y ∈ Dif , then X · Y ∈ Dif and ǫx(X) ∈ Dif for all x ∈ D.

Proof. Consider two elements η1 and η2 in X · Y , such that η1 = φ1 · ψ1 and η2 = φ2 · ψ2 with φ1, φ2 ∈ X and ψ1, ψ2 ∈ Y . Since X and Y are directed, there are elements φ ∈ X and ψ ∈ Y such that φ1, φ2 ≤ φ and ψ1, ψ2 ≤ ψ. But then η1, η2 ≤ φ · ψ ∈ X · Y , which shows that X · Y is directed. Similarly, consider φ1, φ2 ∈ ǫx(X), that is φ1 = ǫx(ψ1) and φ2 = ǫx(ψ2), where ψ1 and ψ2 belong to X. As X is directed, there is an element ψ ∈ X which dominates ψ1 and ψ2. But then it follows that φ1, φ2 ≤ ǫx(ψ) ∈ ǫx(X). This proves that ǫx(X) is directed and belongs to Dif , since we have assumed that the finite elements Ψf are closed under extraction. ⊓⊔ Next, we define for directed sets X and Y in Dif the relation X ≡θ Y which holds if

a) for all φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ,

b) for all ψ ∈ Y there is a φ ∈ X such that ψ ≤ φ.

This is an equivalence relation in Dif. Note that if X has a supremum in Ψ, then X ≡θ Y if and only if ∨X = ∨Y. Further, for any ψ ∈ Ψf, if X ≡θ {ψ}, then necessarily ψ ∈ X and ∨X = ψ. Now, in Dif/θ we define two operations between equivalence classes:

1. Combination: [X]θ · [Y ]θ = [X · Y ]θ,

2. Extraction: ǫx([X]θ) = [ǫx(X)]θ.

These operations are well defined because ≡θ is a congruence relation for combination and extraction. Indeed, assume X1 ≡θ X2 and consider η1 ∈ X1 · Y. Then η1 = φ1 · ψ with φ1 ∈ X1 and ψ ∈ Y. There is a φ2 ∈ X2 such that φ1 ≤ φ2, hence η1 ≤ φ2 · ψ ∈ X2 · Y. In the same way, we find that for η2 ∈ X2 · Y there is an element in X1 · Y which dominates η2. Therefore, we see that X1 · Y ≡θ X2 · Y. Further, assume X ≡θ Y and consider an element φ from X. Then ǫx(φ) ∈ ǫx(X) and there is a ψ ∈ Y such that φ ≤ ψ. But then ǫx(φ) ≤ ǫx(ψ) ∈ ǫx(Y). In the same way, if ǫx(ψ) ∈ ǫx(Y), there is an element ǫx(φ) in ǫx(X) such that ǫx(ψ) ≤ ǫx(φ). Thus, ǫx(X) ≡θ ǫx(Y). With these operations, it turns out that Dif/θ becomes an idempotent algebraic information algebra, into which the original compact algebra (Ψ, D; ≤, ⊥, ·, ǫ) is embedded.
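The relation ≡θ is easy to state computationally. The following sketch, under our illustrative assumptions (Python; elements modelled as subsets ordered by inclusion, with combination as union), implements conditions a) and b) as mutual domination of directed sets and checks that combination of classes via representatives is compatible with ≡θ.

def dominated(X, Y):
    # Condition a): every phi in X lies below some psi in Y.
    return all(any(phi <= psi for psi in Y) for phi in X)

def theta_equiv(X, Y):
    # X =_theta Y: conditions a) and b) together.
    return dominated(X, Y) and dominated(Y, X)

# Two directed sets of 'finite' elements with the same supremum {1, 2}:
X = [frozenset(), frozenset({1}), frozenset({1, 2})]
Y = [frozenset({2}), frozenset({1, 2})]
assert theta_equiv(X, Y)

# Combination of classes via representatives: [X] . [Y] = [X . Y].
XY = [a | b for a in X for b in Y]
assert theta_equiv(XY, [frozenset({1, 2})])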

Theorem 7.6 Assume (Ψ, D; ≤, ⊥, ·, ǫ) to be a domain-free compact information algebra, where (D; ≤) has a greatest element ⊤ and the set Ψf of

finite elements is closed under extraction. Then (Dif /θ, D; ≤, ⊥, ·,ǫ) is an algebraic information algebra and (Ψ, D; ≤, ⊥, ·,ǫ) is embedded into it by the map

φ ↦ [{ψ ∈ Ψf : ψ ≤ φ}]θ.

Proof. We show first that (Dif/θ, D; ≤, ⊥, ·, ǫ) is a generalised, idempotent information algebra. Axiom A0, q-separoid, is valid, since (D; ≤, ⊥) is the same as in (Ψ, D; ≤, ⊥, ·, ǫ). Commutativity and associativity of combination follow from commutativity and associativity of the operation X · Y. The class [{1}]θ is the unit and [Ψf]θ is the null element of combination. So, the Semigroup Axiom A1 is valid. Since D has greatest element ⊤, which is a support of all elements of Ψf, it follows that ǫ⊤(X) = X, hence ǫ⊤([X]θ) = [ǫ⊤(X)]θ = [X]θ, so every element of Dif/θ has ⊤ as a support. Further, suppose that x is a support of [X]θ, such that ǫx(X) ≡θ X. Consider an element y ∈ D such that x ≤ y. We claim that then ǫy(X) ≡θ X. Indeed, if ψ ∈ ǫy(X), then ψ = ǫy(φ) ≤ φ for some φ ∈ X. On the other hand, assume ψ ∈ X. By assumption, there is a φ in X such that ψ ≤ ǫx(φ). But x ≤ y implies ǫx(φ) ≤ ǫy(φ). So ψ ≤ ǫy(φ) for φ ∈ X, and ǫy(φ) ∈ ǫy(X). This means that ǫy(X) ≡θ X. So y is also a support of [X]θ and the Support Axiom A2 holds. Note that {1} ≡θ ǫx({1}) and Ψf ≡θ ǫx(Ψf). Below we show that the algebra is idempotent. In this case, from [ǫx(X)]θ = [Ψf]θ it follows that [X]θ = [Ψf]θ, since [ǫx(X)]θ ≤ [X]θ ≤ [Ψf]θ, so that the Unit and Null Axiom A3 is satisfied. Next, assume x⊥y|z and that x is a support of [X]θ, that is ǫx(X) ≡θ X. We want to show that this implies ǫy(X) ≡θ ǫy(ǫz(X)). In fact, consider an element ψ ∈ ǫy(ǫz(X)), that is ψ = ǫy(ǫz(φ)) for some element φ from X. Then ψ ≤ ǫy(φ) ∈ ǫy(X), since ǫz(φ) ≤ φ. On the other hand, consider ψ ∈ ǫy(X). Then ψ = ǫy(φ) for some φ ∈ X, and from ǫx(X) ≡θ X it follows that there is a χ ∈ X such that φ ≤ ǫx(χ). Define η = ǫx(χ), so that ǫx(η) = η. Then we see that ψ = ǫy(φ) ≤ ǫy(η) = ǫy(ǫz(η)) by Axiom A4 in the information algebra (Ψ, D; ≤, ⊥, ·, ǫ). From this we obtain ψ ≤ ǫy(ǫz(ǫx(χ))) ≤ ǫy(ǫz(χ)). Since χ ∈ X, we have ǫy(ǫz(χ)) ∈ ǫy(ǫz(X)). This shows that indeed ǫy(X) ≡θ ǫy(ǫz(X)), or also ǫy([X]θ) = ǫy(ǫz([X]θ)). So the Extraction Axiom A4 holds in the algebra (Dif/θ, D; ≤, ⊥, ·, ǫ). Further, assume again x⊥y|z and consider two elements [X]θ and [Y]θ from Dif/θ having supports x and y respectively, that is ǫx(X) ≡θ X and ǫy(Y) ≡θ Y. We claim that then ǫz(X · Y) ≡θ ǫz(X) · ǫz(Y). Assume first that ψ belongs to ǫz(X) · ǫz(Y), which means that ψ = ǫz(φ1) · ǫz(φ2) for some elements φ1 ∈ X and φ2 ∈ Y. But ψ = ǫz(φ1) · ǫz(φ2) ≤ ǫz(φ1 · φ2) and the element ǫz(φ1 · φ2) belongs to ǫz(X · Y). On the other hand, consider ψ ∈ ǫz(X · Y), such that ψ = ǫz(φ1 · φ2) for some elements φ1 ∈ X and φ2 ∈ Y. From the assumptions that ǫx(X) ≡θ X and ǫy(Y) ≡θ Y it follows that there are elements χ1 ∈ X and χ2 ∈ Y such that φ1 ≤ ǫx(χ1) and φ2 ≤ ǫy(χ2). Define η1 = ǫx(χ1), η2 = ǫy(χ2). These two elements have support x and y respectively. Then we obtain ψ = ǫz(φ1 · φ2) ≤ ǫz(η1 · η2) = ǫz(η1) · ǫz(η2) by Axiom A5 for the information algebra (Ψ, D; ≤, ⊥, ·, ǫ). From this it follows that ψ ≤ ǫz(ǫx(χ1)) · ǫz(ǫy(χ2)) ≤ ǫz(χ1) · ǫz(χ2). The last element belongs to ǫz(X) · ǫz(Y). This proves that indeed ǫz(X · Y) ≡θ ǫz(X) · ǫz(Y), or ǫz([X]θ · [Y]θ) = ǫz([X]θ) · ǫz([Y]θ). So, the Combination Axiom A5 holds too. Finally, to verify Idempotency, A6, we show that ǫx(X) · X ≡θ X. If ψ ∈ X, then by the idempotency in the original information algebra ψ = ǫx(ψ) · ψ, and this element belongs to ǫx(X) · X. Conversely, if ψ ∈ ǫx(X) · X, then ψ = ǫx(φ1) · φ2, where φ1 and φ2 belong to X.
Since X is directed, there is a φ ∈ X which dominates φ1 and φ2, such that ψ ≤ φ ∈ X. This shows that ǫx(X) · X ≡θ X, hence ǫx([X]θ) · [X]θ = [X]θ. Therefore, the algebra (Dif/θ, D; ≤, ⊥, ·, ǫ) is idempotent. So far we have shown that (Dif/θ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra. It remains to show that this algebra is algebraic. The proof will be based on Theorem 7.3. As a preparation we prove the following lemma.

Lemma 7.2 The relation [X]θ ≤ [Y ]θ holds if and only if for all φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ.

Proof. The relation [X]θ ≤ [Y]θ means that [X]θ · [Y]θ = [X · Y]θ = [Y]θ, or X · Y ≡θ Y. So, assume X · Y is equivalent to Y. Consider an element φ ∈ X. Then for any element φ · χ in X · Y, where χ ∈ Y, there is an element ψ ∈ Y so that φ · χ ≤ ψ. But then φ ≤ ψ. If, on the contrary, for any φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ, then φ · ψ = ψ and φ · ψ ∈ X · Y. And if ψ ∈ Y, then ψ ≤ ψ · φ for any φ ∈ X. So, indeed X · Y ≡θ Y. ⊓⊔

Now, we resume the proof of the theorem. We take the set of classes [{ψ}]θ for ψ ∈ Ψf to be the set Ψ′ of Theorem 7.3. This set will turn out to be the set of finite elements of the algebra (Dif/θ, D; ≤, ⊥, ·, ǫ). Let X be a directed subset of Ψ′ and define X′ = {ψ : [{ψ}]θ ∈ X}. To simplify notation, we write subsequently [ψ]θ instead of [{ψ}]θ. The set X′ is directed in Ψf. By Lemma 7.2, [ψ]θ ≤ [X′]θ if ψ ∈ X′. So, [X′]θ is an upper bound of X. Assume that [Y]θ is another upper bound of X. Then, for all [ψ]θ ∈ X, there is a χ ∈ Y such that ψ ≤ χ. Therefore (Lemma 7.2) [X′]θ ≤ [Y]θ and [X′]θ is the supremum of X, that is [X′]θ = ∨X. So item 1 of Theorem 7.3 holds. Next, consider an element [X]θ with support x in Dif/θ, that is [X]θ = ǫx([X]θ) = [ǫx(X)]θ. We claim that

ǫx(X) ≡θ {ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ for some φ ∈ X}. (7.10)

In fact, assume ψ ∈ ǫx(X), such that ψ = ǫx(φ) for some φ ∈ X. Then ψ ∈ Ψf (remember that we assume that Ψf is closed under extraction), and ψ = ǫx(ψ) ≤ φ ∈ X. On the other hand, ψ = ǫx(ψ) ≤ φ ∈ X implies ψ ≤ ǫx(φ) and ǫx(φ) ∈ ǫx(X). This proves that (7.10) holds. So, we conclude, using the same argument as above, that

ǫx([X]θ) = [{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ for some φ ∈ X}]θ = ∨{[ψ]θ ∈ Ψ′ : [ψ]θ = ǫx([ψ]θ) ≤ [X]θ}. (7.11)

This verifies item 2, density, of Theorem 7.3. Finally, assume [ψ]θ ≤ ∨X, ψ ∈ Ψf, for some directed subset X of Ψ′. Define X′ = {φ : [φ]θ ∈ X}. This set is also directed. So, as above, [ψ]θ ≤ ∨X = [X′]θ. Then, by Lemma 7.2 there is an element φ ∈ X′ such that ψ ≤ φ, hence [ψ]θ ≤ [φ]θ ∈ X. This proves item 3 of Theorem 7.3. So, the information algebra (Dif/θ, D; ≤, ⊥, ·, ǫ) is, according to Theorem 7.3, algebraic, and the elements [ψ]θ for ψ ∈ Ψf are its finite elements. It remains to show that the map φ ↦ [{ψ ∈ Ψf : ψ ≤ φ}]θ is an embedding. Define Aφ = {ψ ∈ Ψf : ψ ≤ φ}. First we note that the map φ ↦ [Aφ]θ is one-to-one. In fact, if [Aφ]θ = [Aψ]θ, or Aφ ≡θ Aψ, then it follows from density in the compact information algebra (Ψ, D; ≤, ⊥, ·, ǫ) that φ = ∨Aφ = ∨Aψ = ψ. Next we verify that the map is a homomorphism. In order to show that φ · ψ ↦ [Aφ·ψ]θ = [Aφ · Aψ]θ = [Aφ]θ · [Aψ]θ, it is sufficient to prove that ∨Aφ·ψ = ∨(Aφ · Aψ), since the first supremum exists in a compact information algebra; that is, to prove that

∨{φ′ · ψ′ : φ′, ψ′ ∈ Ψf, φ′ ≤ φ, ψ′ ≤ ψ} = ∨{ψ′ ∈ Ψf : ψ′ ≤ φ · ψ} = φ · ψ.

Note that by density the second supremum exists in Ψ and it is an upper bound of the set on the left hand side. If η is another upper bound of this set, then φ′, ψ′ ≤ η. Since by density both φ and ψ are the suprema of the finite φ′, ψ′ they dominate, we conclude that φ, ψ ≤ η, hence φ · ψ ≤ η, and φ · ψ is indeed the supremum of the set on the left hand side. So, we have proved that φ · ψ ↦ [Aφ]θ · [Aψ]θ. In addition, 1 ↦ [1]θ and 0 ↦ [Ψf]θ, the unit and null elements in (Dif/θ, D; ≤, ⊥, ·, ǫ).

Finally, we must show that ǫx(φ) ↦ [Aǫx(φ)]θ = ǫx([Aφ]θ) = [ǫx(Aφ)]θ. By density in the algebra (Ψ, D; ≤, ⊥, ·, ǫ),

ǫx(φ) = ∨{ψ ∈ Ψf : ψ ≤ ǫx(φ)}
= ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ}
= ∨{ǫx(ψ) : ψ ∈ Ψf, ψ ≤ φ}
= ∨ǫx(Aφ). (7.12)

So, we have ∨Aǫx(φ) = ∨ǫx(Aφ), which implies [Aǫx(φ)]θ = ǫx([Aφ]θ). Thus, the map φ ↦ [Aφ]θ is an embedding. This concludes the proof. ⊓⊔

The embedding φ ↦ [Aφ]θ is not only an ordinary information algebra homomorphism. It is in fact a continuous map, in the order-theoretic sense.

Theorem 7.7 Assume (Ψ, D; ≤, ⊥, ·, ǫ) to be a domain-free compact information algebra. Then, if X is a directed subset of Ψ with a supremum in Ψ,

[A∨X]θ = ∨{[Aφ]θ : φ ∈ X}. (7.13)

Proof. Since φ ∈ X implies φ ≤ ∨X, we have [Aφ]θ ≤ [A∨X]θ. So, [A∨X]θ is an upper bound for the [Aφ]θ with φ ∈ X. If [Y]θ is another upper bound of this set, then consider an element ψ ∈ A∨X, such that ψ ∈ Ψf and ψ ≤ ∨X. Then, by compactness in the algebra (Ψ, D; ≤, ⊥, ·, ǫ), there is a φ ∈ X such that ψ ≤ φ, hence ψ ∈ Aφ. From [Aφ]θ ≤ [Y]θ it follows that there is a χ ∈ Y such that ψ ≤ χ. But this shows that [A∨X]θ ≤ [Y]θ (Lemma 7.2). So, [A∨X]θ is the supremum of the [Aφ]θ for φ ∈ X. ⊓⊔

We could also have considered the ideal completion of the finite elements of the compact information algebra (Ψ, D; ≤, ⊥, ·, ǫ). By Theorem 7.4 we would obtain an algebraic information algebra. In fact, this algebra is isomorphic to the algebra (Dif/θ, D; ≤, ⊥, ·, ǫ), if the finite elements are closed under extraction, and the isomorphism is a continuous map.

Theorem 7.8 Assume (Ψ, D; ≤, ⊥, ·, ǫ) to be a domain-free compact information algebra, where (D; ≤) has a greatest element ⊤ and the set of finite elements is closed under extraction. Then the two algebraic information algebras (Dif/θ, D; ≤, ⊥, ·, ǫ) and (IΨf, D; ≤, ⊥, ·, ǫ) are isomorphic under a continuous map.

Proof. We show first that any directed set X in Ψf is equivalent to the ideal I(X) it generates in Ψf, that is X ≡θ I(X). If φ ∈ X, then φ ≤ φ and φ ∈ I(X). Conversely, if ψ ∈ I(X), then there is a finite set of elements ψ1, . . . , ψn in X such that ψ ≤ ψ1 ∨ · · · ∨ ψn (see (7.3)). Since X is directed, there is an element φ ∈ X which dominates all ψi, i = 1, . . . , n, hence ψ ≤ φ. So, indeed X ≡θ I(X). This means that any equivalence class [X]θ in Dif/θ can be represented by the ideal I(X), that is [X]θ = [I(X)]θ.

Consider now the map [X]θ ↦ I(X) from Dif/θ into IΨf. By the last remark, this map is well defined. It is onto IΨf, since any ideal I in Ψf is directed and [I]θ ↦ I(I) = I. It is also one-to-one: Suppose that I(X) = I(Y); then X ≡θ I(X) and Y ≡θ I(Y), therefore X ≡θ Y, or [X]θ = [Y]θ.

Next, we verify that the map is a homomorphism. First we show that [X]θ · [Y]θ maps to I(X) · I(Y), that is, I(X · Y) = I(X) · I(Y). Consider an element φ ∈ I(X · Y). Then there is a finite set of elements ψ1, . . . , ψn in X · Y such that φ ≤ ψ1 · . . . · ψn, and each element ψi equals ψi,1 · ψi,2 for some ψi,1 ∈ X and ψi,2 ∈ Y. Since both X and Y are directed, there are elements ψ1 ∈ X and ψ2 ∈ Y which dominate ψ1,1, . . . , ψn,1 and ψ1,2, . . . , ψn,2 respectively. Then it follows that φ ≤ ψ1 · ψ2, which shows that φ ∈ I(X) · I(Y). Conversely, assume φ ∈ I(X) · I(Y), which means that φ ≤ ψ1 · ψ2 for some elements ψ1 ∈ I(X) and ψ2 ∈ I(Y). Then ψ1 ≤ ψ1,1 · . . . · ψn,1 for some elements ψ1,1, . . . , ψn,1 of X, and similarly ψ2 ≤ ψ1,2 · . . . · ψm,2 for some elements ψ1,2, . . . , ψm,2 of Y. The sets X and Y are directed, therefore there are elements φ1 ∈ X and φ2 ∈ Y which dominate ψ1,1, . . . , ψn,1 and ψ1,2, . . . , ψm,2 respectively, hence φ ≤ φ1 · φ2 ∈ X · Y. This shows that φ ∈ I(X · Y) and therefore I(X · Y) = I(X) · I(Y). Further, [{1}]θ maps to I({1}) = {1}, and [Ψf]θ maps to Ψf. To complete the verification that the map is a homomorphism, we show that ǫx([X]θ) = [ǫx(X)]θ maps to ǫx(I(X)) by proving that I(ǫx(X)) = ǫx(I(X)). Let first φ ∈ I(ǫx(X)), such that φ ≤ ψ1 · . . . · ψn for some elements ψ1, . . . , ψn of ǫx(X). This means that ψi = ǫx(φi) for some φi ∈ X. As X is directed, there is a χ ∈ X such that φ1, . . . , φn ≤ χ. It follows then that φ ≤ ǫx(φ1) · . . . · ǫx(φn) ≤ ǫx(φ1 · . . . · φn) ≤ ǫx(χ). This shows that φ ∈ ǫx(I(X)). If, on the other hand, φ ∈ ǫx(I(X)), then there is an element ψ ∈ I(X) such that φ ≤ ǫx(ψ), and further ψ ≤ ψ1 · . . . · ψn for elements ψi ∈ X. Then there is an element χ ∈ X such that ψ1, . . . , ψn ≤ χ. This implies φ ≤ ǫx(ψ) ≤ ǫx(ψ1 · . . . · ψn) ≤ ǫx(χ) ∈ ǫx(X). Therefore we have φ ∈ I(ǫx(X)), hence I(ǫx(X)) = ǫx(I(X)). So the map [X]θ ↦ I(X) is an information algebra isomorphism. Finally, we show that the map is continuous. A map f from an algebraic lattice (Φ; ≤) to another algebraic lattice is continuous, if (Davey & Priestley, 2002)

f(φ) = ∨{f(ψ) : ψ ∈ Φf, ψ ≤ φ}.

I(X) = ∨{I({ψ}) : ψ ∈ Ψf, [ψ]θ ≤ [X]θ}. (7.14)

Now, [ψ]θ ≤ [X]θ if and only if there is a φ ∈ X such that ψ ≤ φ. Therefore, we need to verify that

I(X) = ∨{I({ψ}) : ψ ∈ Ψf, ψ ≤ φ for some φ ∈ X}.

I(X) = ∨{ψ : ψ ∈ Ψf, ψ ≤ φ for some φ ∈ X} ≤ ∨X.

But I(X) = ∨X in the ideal completion. This proves (7.14). ⊓⊔

In view of these results, we have two equivalent ways to complete a compact information algebra to an algebraic one. One method is by ideal completion, the other one by adjoining the missing suprema. We remark that compact and algebraic generalised information algebras may be obtained from algebraic semirings; this has been shown for idempotent algebraic valuation algebras in (Guang & Kohlas, 2015). Here is a simple example of an algebraic information algebra:

Example 7.1 Information Algebra of Strings: Consider a finite alphabet Σ, the set Σ∗ of finite strings over Σ, including the empty string ǫ, and the set Σω of infinite strings over Σ. Let Σ∗∗ = Σ∗ ∪ Σω ∪ {0}, where 0 is a symbol not contained in Σ. For two strings r, s ∈ Σ∗∗, define r ≤ s, if r is a prefix of s or if s = 0. The empty string is a prefix of any string. Define a combination operation in Σ∗∗ as follows:

r · s = s if r ≤ s; r if s ≤ r; 0 otherwise.

Clearly, (Σ∗∗; ·) is a commutative idempotent semigroup. The empty string ǫ is the unit element, and the adjoined element 0 is the null element of combination. For extraction, we define operators ǫn for any n ∈ N and also for n = ∞. Let ǫn(s) be the prefix of length n of string s, if the length of s is at least n, and let ǫn(s) = s otherwise. In particular, define ǫ∞(s) = s for any string s and ǫn(0) = 0 for any n. It is easy to verify that the axioms of an idempotent valuation algebra are satisfied. This is the so-called string algebra (Σ∗∗, N ∪ {∞}; ·, ǫ). Obviously, the finite strings are the finite elements of this algebra and any infinite string is the supremum of its finite prefixes. For finite n the strings with ǫn(s) = s are finite of length at most n, hence all finite, so that density holds trivially. For n = ∞ density holds since for all finite strings we have ǫ∞(s) = s. So, the information algebra of strings is compact. In fact, Σ∗∗ is clearly a dcpo, so the algebra is even algebraic. ⊖
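A runnable sketch of this example, restricted to finite strings; Python is our assumption, and None stands for the adjoined null element 0.

def is_prefix(r, s):
    return r is not None and s is not None and s.startswith(r)

def comb(r, s):
    # r . s = s if r <= s, r if s <= r, and 0 (None) otherwise.
    if r is None or s is None:
        return None
    if is_prefix(r, s):
        return s
    if is_prefix(s, r):
        return r
    return None

def eps(n, s):
    # eps_n(s): the prefix of length n; n = None plays the role of infinity.
    if s is None:
        return None
    return s if n is None or len(s) <= n else s[:n]

assert comb('ab', 'abba') == 'abba'              # 'ab' is a prefix of 'abba'
assert comb('ab', 'ba') is None                  # incomparable strings give 0
assert eps(2, 'abba') == 'ab'                    # extraction to domain 2
assert comb(eps(2, 'abba'), 'abba') == 'abba'    # idempotency: eps_n(s) . s = s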

7.3 Duality For Compact Algebras

In this section we turn back to duality between domain-free and labeled generalised information algebras. What is a compact or algebraic labeled information algebra? This question will be examined by looking at the labeled version of a compact or algebraic domain-free information algebra, in order to see how compactness transforms into the labeled version. Then, based on this analysis, we study how duality extends to compact and algebraic information algebras.

Φ= Φx. x[∈D Note that idempotency allows, as in the domain-free case, to define a partial order in Φ. In fact, define (φ, x) ≤ (ψ,y) if and only if (φ, x) · (ψ,y) = (φ · ψ,x ∨ y) = (ψ,y). This implies φ · ψ = ψ or φ ≤ ψ in (Ψ, ≤) and x ≤ y in (D; ≤). As a preparation, we prove two useful results about the labeled algebra (Φ, D; ≤, ⊥, ·,t). Lemma 7.3 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent domain-free generalised information algebra and (Φ, D; ≤, ⊥, ·,t) its dual labeled version. If the supremum of a subset X of Φ exists in Φ, then X = ( φ, x). (7.15) _ (φ,x_)∈X (φ,x_)∈X Proof. Assume X = (χ,y). Then (φ, x) ≤ (χ,y) for all (φ, x) ∈ X, hence φ ≤ χ and x ≤ y. Consider other upper bounds χ′ and y′ for the W elements φ and x, (φ, x) ∈ X. Then (φ, x) ≤ (φ′,y′), hence (χ,y) ≤ (χ′,y′). ′ ′ But this implies χ ≤ χ and y ≤ y and so indeed χ = (φ,x)∈X φ and y = x. This is (7.15). ⊓⊔ (φ,x)∈X W LemmaW 7.4 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent domain-free generalised information algebra and (Φ, D; ≤, ⊥, ·,t) its dual labeled version. Let X be a subset of Ψ such that ǫx(X)= X, that is, all elements of X have support X. If the supremum of X exists in Ψ, then ( X,x) ∈ Φ and (ψ,x) = ( X,xW). ψ_∈X _ Proof. We need only to show that X has support x. Define φ = X. Then, for all ψ ∈ X we have ψ = ǫ (ψ) ≤ φ, hence ψ = ǫ (ψ) ≤ ǫ (φ). So, x W x x W ǫx(φ) is an upper bound of X, therefore φ ≤ ǫx(φ), hence φ = ǫx(φ). ⊓⊔ The next theorem shows how finite elements in (Φx; ≤) relate to finite elements in (Ψ; ≤). 116 CHAPTER 7. PROPER INFORMATION

Theorem 7.9 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a domain-free compact information algebra with finite elements Ψf and (Φ, D; ≤, ⊥, ·,t) its dual labeled version. Then (ψ,x) ∈ Φ is finite in (Φx; ≤) if and only if ψ is finite in (Ψ; ≤), that is, ψ ∈ Ψf .

Proof. Consider an element (ψ, x) of Φ with ψ ∈ Ψf. Let X be a directed subset of Φx whose supremum exists in (Φx; ≤) and such that (ψ, x) ≤ ∨X. Define X′ = {φ ∈ Ψ : (φ, x) ∈ X}. Clearly, X′ is directed too, and since ∨X = (∨X′, x) (Lemma 7.3) the supremum of X′ exists in Ψ and ψ ≤ ∨X′. Since ψ is finite in (Ψ; ≤) there is a φ ∈ X′ such that ψ ≤ φ, hence (ψ, x) ≤ (φ, x) ∈ X. This shows that (ψ, x) is finite in (Φx; ≤). Conversely, assume that (ψ, x) is finite in (Φx; ≤). Let X be a directed subset of Ψ whose supremum exists in Ψ and such that ψ ≤ ∨X. Then we have ψ = ǫx(ψ) ≤ ǫx(∨X) = ∨ǫx(X) (Theorem 7.2). Define X′ = {(ǫx(φ), x) : φ ∈ X}. It is a directed set in (Φx; ≤) and we have (ψ, x) ≤ (∨ǫx(X), x) = ∨X′ (Lemma 7.4). Since (ψ, x) is assumed to be finite in (Φx; ≤), there is an element (ǫx(φ), x) ∈ X′ such that (ψ, x) ≤ (ǫx(φ), x). This implies ψ ≤ φ for an element φ ∈ X. This shows that ψ is finite in (Ψ; ≤). ⊓⊔

According to this theorem, finite elements in (Ψ; ≤) correspond to finite elements in (Φx; ≤) for domains x which are supports of the finite elements in (Ψ; ≤). Note that finite elements in (Φx; ≤) are not necessarily finite in (Φ; ≤) and that the finite elements in (Ψ; ≤) do not induce finite elements in (Φ; ≤), as one might have expected. So, if we denote the finite elements in (Φx; ≤) by Φx,f, and

Φf = ⋃{Φx,f : x ∈ D},

then Φf does not represent the finite elements of (Φ; ≤) but the union of the locally finite ones. Note that if (Ψ, D; ≤, ⊥, ·, ǫ) is a compact information algebra, then Φf is closed under combination. In fact, if (φ, x) ∈ Φx,f and (ψ, y) ∈ Φy,f, then by Theorem 7.9 φ and ψ are finite elements in (Ψ; ≤) and so is their combination φ · ψ. This combination has x ∨ y as a support, and again by the same theorem, (φ, x) · (ψ, y) = (φ · ψ, x ∨ y) therefore belongs to Φx∨y,f. However, transport does not necessarily keep finite elements finite, except if the finite elements of (Ψ; ≤) are closed under extraction. Nevertheless, for x ≤ y, the element ty(ψ, x) = (ψ, x) · (1, y) remains finite, if (ψ, x) is finite. This is true because (1, y) is a finite element. Next we show that strong density of the compact algebra (Ψ, D; ≤, ⊥, ·, ǫ) induces local density within the domains Φx of the dual labeled algebra. That is, the finite elements in (Φx; ≤) are dense in Φx and thus approximate the elements of Φx.

Theorem 7.10 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a domain-free compact information algebra and (Φ, D; ≤, ⊥, ·,t) its dual labeled version. Then, for all (φ, x) ∈ Φ,

(φ, x) = ∨{(ψ, x) ∈ Φx,f : (ψ, x) ≤ (φ, x)}. (7.16)

Proof. By strong density in the algebra (Ψ, D; ≤, ⊥, ·, ǫ) we have

(φ, x) = (∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ}, x) = ∨{(ψ, x) ∈ Φx,f : (ψ, x) ≤ (φ, x)}.

The last equality holds by Lemma 7.4. ⊓⊔

So, the dual, labeled version of a compact information algebra is a labeled algebra where local density according to (7.16) holds. We take this as the model to define labeled compact information algebras. Note that order in a labeled information algebra (Φ, D; ≤, ⊥, ·, t) is defined again by φ ≤ ψ if φ · ψ = ψ. This also induces a partial order in (Φx; ≤) between the elements Φx = {φ ∈ Φ : d(φ) = x} in domain x. The following lemma states a few elementary properties of this labeled order.

Lemma 7.5 Let (Φ, D; ≤, ⊥, ·,t) be an idempotent labeled generalised in- formation algebra. Then

1. x ≤ d(φ) implies tx(φ) ≤ φ,

2. x ≥ d(φ) implies tx(φ) ≥ φ,

3. φ ≤ ψ implies tx(φ) ≤ tx(ψ) for any x ∈ D,

4. φ, ψ ≤ φ · ψ,

5. φ ≤ ψ implies φ · χ ≤ ψ · χ for any χ ∈ Φ.

Proof. 1.) follows from the Idempotency Axiom A7 of a labeled generalised information algebra, tx(φ) · φ = φ. 2.) follows from tx(φ) = φ · 1x, hence by idempotency, tx(φ) · φ = φ · 1x · φ = φ · 1x = tx(φ). 3.) Let d(φ) = y and d(ψ) = z and assume first x ≥ y, z. Then y⊥x|x, hence y⊥z|x. Using the Combination Axiom A5, it follows that tx(ψ) = tx(φ · ψ) = tx(φ) · tx(ψ), which shows that tx(φ) ≤ tx(ψ) in this case. Assume next that d(φ) = d(ψ) = y and x ≤ y. By item 1 proved above, we have tx(φ) · φ = φ, hence, if φ ≤ ψ, we obtain tx(φ) · ψ = ψ. From y⊥x|x it follows with the Combination Axiom that tx(ψ) = tx(tx(φ) · ψ) = tx(φ) · tx(ψ), which shows that in this case too tx(φ) ≤ tx(ψ). In the general case with d(φ) = y and d(ψ) = z, the assumption φ ≤ ψ implies first tx∨y∨z(φ) ≤ tx∨y∨z(ψ) and then tx(φ) = tx(tx∨y∨z(φ)) ≤ tx(tx∨y∨z(ψ)) = tx(ψ), using the two special cases proved before.

4.) follows from idempotency, φ · (φ · ψ) = φ · ψ and ψ · (φ · ψ) = φ · ψ. 5.) If φ ≤ ψ, we have by idempotency (φ · χ) · (ψ · χ) = (φ · ψ) · χ = ψ · χ. ⊓⊔

The lemma shows in particular that the combination and the transport operations preserve order. We remark that we may always adjoin a greatest element ⊤ to (D; ≤) in a generalised domain-free information algebra, as was explained in Section 7.1. This holds also for compact domain-free information algebras, since (weak) density guarantees local density in the domain ⊤. So, we may assume without loss of generality that the derived labeled information algebra also has a greatest domain. After this preparation, we are in a position to define the concept of a labeled compact information algebra.

Definition 7.4 An idempotent labeled generalised information algebra (Φ, D; ≤, ⊥, ·, t) is called compact, if (D; ≤) has a greatest element ⊤, and

1. for all domains x ∈ D and elements φ with d(φ) = x,

φ = ∨{ψ ∈ Φx,f : ψ ≤ φ}, (7.17)

where Φx,f denotes the set of the finite elements of (Φx; ≤).

2. If φ ∈ Φx,f and y ≥ x, then ty(φ) ∈ Φy,f . Let

Φf = ⋃{Φx,f : x ∈ D}

be the set of all locally finite elements. Again, we emphasise that this is not the set of the finite elements of (Φ; ≤). The justification of this definition of compact labeled information algebras will be that the associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ǫ) is again compact. Before we show this, we give some useful results.

Lemma 7.6 Let (Φ, D; ≤, ⊥, ·, t) be a labeled compact information algebra, X a directed subset of Φy such that its supremum exists in Φy, and x ≤ y. Then

tx(∨X) = ∨tx(X). (7.18)

Proof. First, for any φ ∈ X we have φ ≤ ∨X, hence tx(φ) ≤ tx(∨X). So, tx(∨X) is an upper bound of the elements tx(φ) for φ ∈ X. On the other hand, by density in the compact labeled algebra,

tx(∨X) = ∨{ψ ∈ Φx,f : ψ ≤ tx(∨X)}

_ = _{ψ ∈ Φx,f : ty(ψ) ≤_ X}. (7.19) _ _ 7.3. DUALITY FOR COMPACT ALGEBRAS 119

Since ty(ψ) is finite in domain y if ψ is finite in domain x ≤ y, there is an element φ ∈ X such that ty(ψ) ≤ φ whenever ty(ψ) ≤ ∨X. But then it follows that ψ ≤ tx(φ) ∈ tx(X), and therefore tx(∨X) is the least upper bound of tx(X). So, indeed, tx(∨X) = ∨tx(X). ⊓⊔

This lemma implies that Φf is closed under combination. In fact, consider φ ∈ Φx,f and ψ ∈ Φy,f, and a directed set X in Φx∨y such that φ · ψ ≤ ∨X. Then φ ≤ tx(∨X) = ∨tx(X) by Lemma 7.6, and similarly ψ ≤ ty(∨X) = ∨ty(X). Both sets tx(X) and ty(X) are directed, and therefore there are elements tx(φ′) ∈ tx(X) such that φ ≤ tx(φ′) and ty(ψ′) ∈ ty(X) such that ψ ≤ ty(ψ′). Both φ′ and ψ′ belong to X, and so there is also an element χ in X such that φ′, ψ′ ≤ χ. Hence, finally, we conclude that φ · ψ ≤ φ′ · ψ′ ≤ χ ∈ X. This proves that φ · ψ ∈ Φx∨y,f, hence φ · ψ belongs to Φf. But Φf is not necessarily closed under transport.

As a preparation for the examination of the dual domain-free algebra associated with a labeled compact information algebra (Φ, D; ≤, ⊥, ·,t), we prove the following lemma. We recall that the congruence ≡σ is defined in Section 5.1.

Lemma 7.7 Let (Φ, D; ≤, ⊥, ·,t) be an idempotent labeled information algebra, X a directed subset of Φ such that its supremum exists in Φ. Then in (Φ/σ, D; ≤, ⊥, ·,ǫ),

[∨X]σ = ∨[X]σ, (7.20)

where [X]σ = {[φ]σ : φ ∈ X}.

Proof. Define ψ = ∨X, so that [ψ]σ = [∨X]σ, and assume that d(ψ) = x. Then, for all φ ∈ X we have φ ≤ ψ and d(φ) ≤ x. Therefore, for all φ ∈ X we have [φ]σ ≤ [ψ]σ, and so [ψ]σ is an upper bound of [X]σ. Assume [χ]σ to be another upper bound of [X]σ and d(χ) = y. For any φ in X we have [χ]σ = [φ]σ · [χ]σ = [φ · χ]σ = [tx∨y(φ) · tx∨y(χ)]σ. This implies tx∨y(φ) ≤ tx∨y(χ). Since for φ ∈ X we have d(φ) ≤ x, it follows that φ ≤ tx(φ) = tx(tx∨y(φ)) ≤ tx(tx∨y(χ)). But then ψ = ∨X ≤ tx(tx∨y(χ)). It follows that tx∨y(ψ) ≤ tx∨y(tx(tx∨y(χ))) ≤ tx∨y(tx∨y(χ)) = tx∨y(χ). From this we conclude that [ψ]σ ≤ [χ]σ, so that [ψ]σ is the supremum of [X]σ. ⊓⊔

Now we show that the domain-free information algebra (Φ/σ, D; ≤, ⊥, ·,ǫ) associated with a labeled compact information algebra (Φ, D; ≤, ⊥, ·,t) is indeed again compact. This justifies the definition of a labeled compact information algebra above.

Theorem 7.11 Let (Φ, D; ≤, ⊥, ·,t) be a labeled compact information algebra. Then (Φ/σ, D; ≤, ⊥, ·,ǫ) is a domain-free compact information algebra and its finite elements are the elements [ψ]σ for ψ ∈ Φf.

Proof. We know already that (Φ/σ, D; ≤, ⊥, ·,ǫ) is an idempotent domain-free information algebra (see Section 5.1, in particular Theorem 5.2). We show first that the elements [ψ]σ for ψ ∈ Φf are exactly the finite elements in (Φ/σ; ≤). So, assume first that [ψ]σ is finite in (Φ/σ; ≤). By the Support Axiom, [ψ]σ has a support x, hence we may select a representant ψ of the class [ψ]σ with label d(ψ) = x. Consider then a directed set X in Φx such that its supremum exists in Φx and ψ ≤ ∨X. Using Lemma 7.7, we conclude that [ψ]σ ≤ [∨X]σ = ∨[X]σ. Further, the set [X]σ is directed in (Φ/σ; ≤). Since [ψ]σ is finite in (Φ/σ; ≤), there is an element [φ]σ in [X]σ such that [ψ]σ ≤ [φ]σ. But then we may select φ ∈ X with ψ ≤ φ. This shows that ψ is finite in (Φx; ≤).

Conversely, assume that ψ is finite in (Φy; ≤). Consider a directed set X in (Φ/σ; ≤) such that [ψ]σ ≤ ∨X. Since in compact labeled information algebras (D; ≤) has a greatest element ⊤, the supremum ∨X has support ⊤. Let [η]σ = ∨X. Note that any class [φ]σ has a representant in ⊤. Define X′ = {φ ∈ Φ⊤ : [φ]σ ∈ X}. The set X′ is directed in (Φ⊤; ≤) and ∨X′ exists in Φ⊤. Take further a representant η of the class [η]σ in Φ⊤. Then we have φ ≤ η for all φ ∈ X′. We claim that η = ∨X′. In fact, let η′ be an upper bound of X′. Then for all φ ∈ X′, φ ≤ η′, hence [φ]σ ≤ [η′]σ and therefore [η]σ ≤ [η′]σ, so that η ≤ η′; hence η is indeed the supremum of X′. Further, we have t⊤(ψ) ≤ η, and t⊤(ψ) is finite in (Φ⊤; ≤) by condition 2 of Definition 7.4. This implies that there is a φ ∈ X′ such that ψ ≤ t⊤(ψ) ≤ φ. It follows that [ψ]σ ≤ [φ]σ ∈ X, which shows that [ψ]σ is finite in (Φ/σ; ≤).

It remains to show strong density. For this purpose consider an element [φ]σ = ǫx([φ]σ) in Φ/σ. We take a representant of [φ]σ with label d(φ) = x. By local density in the labeled algebra (Φ, D; ≤, ⊥, ·,t) we have

[φ]σ = [∨{ψ ∈ Φx,f : ψ ≤ φ}]σ.

From Lemma 7.7 and the first part of this theorem it then follows that

[φ]σ = ∨{[ψ]σ : [ψ]σ finite in (Φ/σ; ≤), [ψ]σ = ǫx([ψ]σ) ≤ [φ]σ}.

This is strong density in the domain-free information algebra (Φ/σ, D; ≤, ⊥, ·,ǫ), and this concludes the proof that this algebra is compact. ⊓⊔

In summary, a domain-free compact information algebra D transforms into an associated dual labeled compact information algebra DL. Conversely, a labeled compact information algebra L has an associated dual domain-free compact information algebra LD. Then the labeled compact algebra DL transforms back into the domain-free compact algebra DLD. Similarly, the domain-free compact algebra LD transforms back into the labeled compact algebra LDL. All this holds under the assumption that (D; ≤) has a greatest element ⊤, which we assume by definition. We have seen in Section 5.3 that D and DLD are isomorphic under the map ψ ↦ [(ψ, x)]σ. Similarly, the labeled algebra L is isomorphic to the algebra LDL under the map φ ↦ ([φ]σ, x). We show now that in the case of compact algebras these maps are continuous.

Theorem 7.12 Let (Ψ, D; ≤, ⊥, ·,ǫ) and (Φ, D; ≤, ⊥, ·,t) be compact domain-free and compact labeled generalised information algebras respectively. Then, if X is a directed subset of Ψ whose supremum exists in Ψ and which has support x,

[(∨X, x)]σ = ∨{[(φ, x)]σ : φ ∈ X}. (7.21)

Further, if X is a directed subset of Φ whose supremum exists in Φ and has label x, then

([∨X]σ, x) = ∨{([ψ]σ, x) : ψ ∈ X}. (7.22)

Proof. We start with (7.21). By Theorem 7.2 we have ∨X = ǫx(∨X) = ∨ǫx(X). So, using Lemma 7.4,

[(∨X, x)]σ = [(∨ǫx(X), x)]σ = [∨{(ǫx(φ), x) : φ ∈ X}]σ.

[(∨X, x)]σ = ∨{[(ǫx(φ), x)]σ : φ ∈ X} = ∨{ǫx([(φ, x)]σ) : φ ∈ X}.

But all elements [(φ, x)]σ have support x, therefore we conclude

[(∨X, x)]σ = ∨{[(φ, x)]σ : φ ∈ X}.

This is (7.21). In order to prove (7.22), we note that for ψ ∈ X we have ψ ≤ ∨X and d(ψ) ≤ x. This implies tx(ψ) ≡σ ψ, hence ǫx([ψ]σ) = [tx(ψ)]σ = [ψ]σ. So, x is a support for all [ψ]σ with ψ ∈ X. Define X′ = {tx(ψ) : ψ ∈ X}. Then, by Lemma 7.6, ∨X′ = ∨X = ∨{tx(ψ) : ψ ∈ X}. Therefore, we see that (Lemma 7.7)

[∨X]σ = [∨X′]σ = [∨{tx(ψ) : ψ ∈ X}]σ = ∨{[tx(ψ)]σ : ψ ∈ X} = ∨{[ψ]σ : ψ ∈ X}. (7.23)

So, from Lemma 7.4 we obtain

([∨X]σ, x) = (∨{[ψ]σ : ψ ∈ X}, x) = ∨{([ψ]σ, x) : ψ ∈ X}.

This is (7.22). ⊓⊔

As remarked above, this theorem shows that D ≅ DLD and L ≅ LDL under continuous isomorphisms, if D and L are compact domain-free or labeled information algebras respectively.

We now turn to the algebraic case. What is an algebraic labeled information algebra, and what can be said about the duality between algebraic domain-free and labeled algebras? We start with an algebraic domain-free generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ) and examine its dual labeled version (Φ, D; ≤, ⊥, ·,t), where Φ is, as always, the set of all pairs (ψ, x) with ψ ∈ Ψ and ǫx(ψ) = ψ. Consider a subset X of Φ. If its supremum ∨X exists in Φ, then ∨X = (∨X′, ∨X′′), where X′ = {ψ : (ψ, x) ∈ X} and X′′ = {x : (ψ, x) ∈ X} (Lemma 7.3). Since (Ψ, D; ≤, ⊥, ·,ǫ) is algebraic, ∨X′ always exists. However, there is no guarantee that ∨X′′ exists. So, even if (Ψ, D; ≤, ⊥, ·,ǫ) is algebraic, this does not imply that every subset X of Φ has a supremum; (Φ; ≤) is not necessarily complete. However, for any x ∈ D, the local orders (Φx; ≤) are complete. If X is a subset of Φx, then ∨X = (∨X′, x). So, (Φ, D; ≤, ⊥, ·,t) is a compact labeled information algebra which is locally complete in this sense. Alternatively, (Φx; ≤) is a dcpo, which together with compactness implies that it is a complete lattice. This leads to the following definition:

Definition 7.5 A labeled generalised information algebra (Φ, D; ≤, ⊥, ·,t) is called algebraic if

1. it is compact,

2. for all x ∈ D, (Φx; ≤) is a dcpo.

Recall that in a compact labeled information algebra, (D; ≤) is assumed to have a greatest element ⊤. Note further that in an algebraic domain-free information algebra (Ψ, D; ≤, ⊥, ·,ǫ) we may always, as in a compact information algebra, adjoin a top domain if (D; ≤) does not already have a greatest element. Therefore, in its labeled version we may likewise assume without loss of generality that D has a greatest element ⊤. Then Φ⊤ contains the pairs (ψ, ⊤) for all ψ ∈ Ψ, and (Φ⊤; ≤) is essentially the same as (Ψ; ≤), as we shall see below (Theorem 7.14). The domain-free version of a labeled algebraic information algebra is then also algebraic, as the following theorem shows.

Theorem 7.13 Let (Φ, D; ≤, ⊥, ·,t) be an algebraic labeled information algebra. Then (Φ/σ, D; ≤, ⊥, ·,ǫ) is an algebraic domain-free information algebra.

Proof. From Theorem 7.11 we know that (Φ/σ, D; ≤, ⊥, ·,ǫ) is a compact domain-free information algebra. It remains to show that (Φ/σ; ≤) is a dcpo.

Consider a directed subset X of Φ/σ. Note that for all φ ∈ Φ, φ ≡σ t⊤(φ). Therefore, we may always take the representant of the class [φ]σ in the top domain. Define

X′ = {φ ∈ Φ⊤ : [φ]σ ∈ X}.

The supremum ∨X′ exists in Φ⊤ since (Φ, D; ≤, ⊥, ·,t) is algebraic. By Lemma 7.7 we then obtain [∨X′]σ = ∨{[φ]σ : φ ∈ X′} = ∨X. So the supremum of X exists in Φ/σ and (Φ/σ; ≤) is a dcpo. ⊓⊔

The proof shows that, in a sense, the whole domain-free information algebra is already incorporated in the top-level domain of the labeled algebra. This can be made more precise. Define in Φ⊤ the following operations:

1. Combination: φ, ψ ∈ Φ⊤ ↦ φ · ψ, where · denotes the combination in Φ,

2. Extraction: φ ∈ Φ⊤, x ∈ D ↦ ǫx(φ) = t⊤(tx(φ)).

We claim that with these operations, (Φ⊤, D; ≤, ⊥, ·,ǫ) becomes a domain-free information algebra.

Theorem 7.14 Let (Φ, D; ≤, ⊥, ·,t) be a labeled information algebra and assume that D has a greatest element ⊤. Then (Φ⊤, D; ≤, ⊥, ·,ǫ) with the operations of combination and extraction as defined above is a domain-free information algebra, isomorphic to (Φ/σ, D; ≤, ⊥, ·,ǫ). If (Φ, D; ≤, ⊥, ·,t) is algebraic, then (Φ⊤, D; ≤, ⊥, ·,ǫ) is so too.

Proof. The q-separoid and semigroup axioms A0 and A1 are evident. Clearly, ⊤ is a support for all φ ∈ Φ⊤. If x is a support for φ ∈ Φ⊤, that is, φ = t⊤(tx(φ)), and y ≥ x, then, using elementary properties of transport (Lemma 3.1), we obtain

ǫy(φ) = t⊤(ty(φ)) = t⊤(ty(t⊤(tx(φ)))) = t⊤(ty(t⊤(ty(tx(φ)))))

= t⊤(ty(tx(φ))) = t⊤(tx(φ)) = φ.

This shows that the Support Axiom A2 is valid. Further, 1⊤ is the unity of combination in Φ⊤ and 0⊤ is the null element. Obviously ǫx(1⊤) = t⊤(tx(1⊤)) = 1⊤ and similarly ǫx(0⊤) = t⊤(tx(0⊤)) = 0⊤. Further, if ǫx(φ)= t⊤(tx(φ)) = 0⊤, then φ = 0⊤. This is axiom A3. Assume now that x⊥y|z and ǫx(φ)= φ. Then

ǫy(ǫz(φ)) = t⊤(ty(t⊤(tz(φ)))) = t⊤(ty(tz(φ))),

since a transport from z to y can always pass by the larger domain ⊤. Further, t⊤(ty(tz(φ))) = t⊤(ty(tz(t⊤(tx(φ))))) = t⊤(ty(tz(tx(φ)))) = t⊤(ty(tx(φ))) by the Transport Axiom in the labeled algebra. We conclude that

t⊤(ty(tz(φ))) = t⊤(ty(t⊤(tx(φ)))) = t⊤(ty(φ)).

This shows that ǫy(ǫz(φ)) = ǫy(φ), thus the Extraction Axiom A4 holds. If x⊥y|z and ǫx(φ)= φ, ǫy(ψ)= ψ, then

ǫz(φ · ψ) = t⊤(tz(φ · ψ)) = t⊤(tz(t⊤(tx(φ)) · t⊤(ty(ψ))))

= t⊤(tz(t⊤(tx(φ) · ty(ψ))))

= t⊤(tz(tx(φ) · ty(ψ))) = t⊤(tz(tx(φ)) · tz(ty(ψ))),

since the transport of a combination to a larger domain equals the combination of the transports to the larger domain, and by the Combination Axiom of the labeled algebra. This implies further that

ǫz(φ · ψ) = t⊤(tz(t⊤(tx(φ))) · tz(t⊤(ty(ψ))))

= t⊤(tz(φ) · tz(ψ)) = t⊤(tz(φ)) · t⊤(tz(ψ))

= ǫz(φ) · ǫz(ψ).

This verifies the validity of the Combination Axiom A5. It remains to verify Idempotency. Since φ = t⊤(φ), we have indeed φ · ǫx(φ) = φ · t⊤(tx(φ)) = t⊤(φ) · t⊤(tx(φ)) = t⊤(φ · tx(φ)) = t⊤(φ) = φ by idempotency in the labeled algebra. So (Φ⊤, D; ≤, ⊥, ·,ǫ) is a domain-free idempotent generalised information algebra. The map φ ∈ Φ⊤ ↦ [φ]σ is clearly one-to-one and onto Φ/σ. It is also a homomorphism, since [φ · ψ]σ = [φ]σ · [ψ]σ and [t⊤(tx(φ))]σ = [tx(φ)]σ = ǫx([φ]σ). If (Φ, D; ≤, ⊥, ·,t) is algebraic, then (Φ⊤; ≤) is a complete lattice. But this means that (Φ⊤, D; ≤, ⊥, ·,ǫ) is algebraic. The isomorphism is in this case continuous (see Lemma 7.7). ⊓⊔

In developing the duality of compact and algebraic information algebras above, the existence of a greatest element in (D; ≤) was crucial. It is not clear whether duality can also be obtained without this somewhat artificial assumption.

7.4 Continuous Algebras

The notion of approximation can be somewhat weakened. This leads to a generalisation of the concept of compact information algebras. The present section is partially based on (Guan & Li, 2012) but applies to generalised information algebras. The basic notion in this section is the way-below relation in an ordered set.

Definition 7.6 Way-Below. Let (Ψ; ≤) be a partially ordered set. For φ, ψ ∈ Ψ we write ψ ≪ φ and say ψ is way-below φ, if for every directed set X ⊆ Ψ for which the supremum exists, φ ≤ ∨X implies that there is an element χ ∈ X such that ψ ≤ χ.
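The definition can be instantiated mechanically on small examples. The following minimal Python sketch (our own illustration, not part of the original development) checks Definition 7.6 by brute force on the powerset lattice of {1, 2, 3}; note that in a finite poset every directed set contains its supremum, so there ψ ≪ φ collapses to ψ ≤ φ and every element is finite. In infinite algebras, of course, ≪ is genuinely stronger than ≤, as Example 7.2 below illustrates.

    from itertools import chain, combinations

    # Powerset of {1, 2, 3} ordered by inclusion; join = union.
    ELEMENTS = [frozenset(s) for s in chain.from_iterable(
        combinations((1, 2, 3), r) for r in range(4))]

    def directed(xs):
        # every pair in xs has an upper bound within xs
        return all(any(a | b <= c for c in xs) for a in xs for b in xs)

    def join(xs):
        out = frozenset()
        for x in xs:
            out |= x
        return out

    def way_below(psi, phi):
        # Definition 7.6, checked literally over all directed subsets
        for r in range(1, len(ELEMENTS) + 1):
            for X in combinations(ELEMENTS, r):
                if directed(X) and phi <= join(X) \
                        and not any(psi <= chi for chi in X):
                    return False
        return True

    # In this finite lattice every element is finite: psi << phi iff psi <= phi.
    assert all(way_below(a, b) == (a <= b) for a in ELEMENTS for b in ELEMENTS)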

Note that φ is a finite element if and only if φ ≪ φ. The following lemma lists some well-known elementary results on the way-below relation, see for instance (Gierz, 2003).

Lemma 7.8 Let (Ψ; ≤) be a partially ordered set. Then the following holds for φ, ψ ∈ Ψ

1. ψ ≪ φ implies ψ ≤ φ,

2. ψ ≪ φ and φ ≤ χ imply ψ ≪ χ,

3. χ ≤ ψ and ψ ≪ φ imply χ ≪ φ.

4. χ ≪ ψ and ψ ≪ φ imply χ ≪ φ.

We are of course interested in the way-below relation in the case that (Ψ, D; ≤, ⊥, ·,ǫ) is a domain-free idempotent information algebra, that is, Ψ is a semilattice. Then the way-below relation has some additional properties.

Lemma 7.9 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a domain-free generalised information algebra. Then

1. 1 ≪ φ for all φ ∈ Ψ.

2. ψ1, ψ2 ≪ φ implies ψ1 ∨ ψ2 = ψ1 · ψ2 ≪ φ for all ψ1, ψ2 ∈ Ψ.

3. The set {ψ ∈ Ψ : ψ ≪ φ} is an ideal for all φ ∈ Ψ.

4. ψ ≪ φ if and only if for all X ⊆ Ψ such that ∨X exists and φ ≤ ∨X, there is a finite subset F of X such that ψ ≤ ∨F.

Proof. (1) Let X ⊆ Ψ be a directed set with φ ≤ ∨X. Since X is non-empty, there is a ψ ∈ X and 1 ≤ ψ, hence 1 ≪ φ.

(2) Assume ψ1, ψ2 ≪ φ. Consider any directed set X ⊆ Ψ such that φ ≤ ∨X. Then there exist elements χ1, χ2 ∈ X so that ψ1 ≤ χ1 and ψ2 ≤ χ2. Since X is directed, there is also an element χ ∈ X so that χ1, χ2 ≤ χ. But then ψ1 ∨ ψ2 ≤ χ1 ∨ χ2 ≤ χ. This shows that ψ1 ∨ ψ2 ≪ φ.

(3) Assume ψ ≪ φ and χ ≤ ψ. Then by Lemma 7.8 (3), χ ≪ φ. Further, let ψ1 ≪ φ and ψ2 ≪ φ. By (2) just proved, ψ1 ∨ ψ2 ≪ φ. Hence {ψ ∈ Ψ : ψ ≪ φ} is an ideal.

(4) Suppose first that ψ ≪ φ. Let X be a subset of Ψ such that ∨X exists and φ ≤ ∨X. Let Y be the set of all joins of finite subsets of X. Then X ⊆ Y and ∨X is an upper bound for Y. Let χ be another upper bound of Y. Then χ is an upper bound of X, hence ∨X ≤ χ. So ∨X is the supremum of Y, ∨X = ∨Y. Furthermore, Y is a directed set. So there is an element η ∈ Y such that ψ ≤ η, and η = ∨F for some finite subset F of X.

Conversely, consider elements ψ, φ ∈ Ψ such that condition 4 of the lemma holds. Let X be a directed subset of Ψ such that ∨X exists and φ ≤ ∨X. There is a finite subset F of X such that ψ ≤ ∨F. Since X is directed, there is a χ ∈ X such that ∨F ≤ χ, hence ψ ≤ χ. So ψ ≪ φ. ⊓⊔

With the aid of the way-below relation, algebraic information algebras can be alternatively characterised.

Theorem 7.15 If (Ψ, D; ≤, ⊥, ·,ǫ) is an idempotent domain-free information algebra, then the following conditions are equivalent:

1. (Ψ, D; ≤, ⊥, ·,ǫ) is a domain-free algebraic information algebra with finite elements Ψf.

2. (Ψ; ≤) is an algebraic lattice with finite elements Ψf and ∀x ∈ D, ∀φ ∈ Ψ

ǫx(φ) = ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≪ φ}. (7.24)

Proof. (1) ⇒ (2): By definition (Ψ; ≤) is an algebraic lattice, that is, a complete lattice with finite elements Ψf. Then condition (7.24) follows from strong density and Lemma 7.8 in the following way:

ǫx(φ) = ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≤ φ}

= ∨{ψ : ψ ≪ ψ = ǫx(ψ) ≤ φ}

= ∨{ψ : ψ ≪ ψ = ǫx(ψ) ≪ φ}

= ∨{ψ ∈ Ψf : ψ = ǫx(ψ) ≪ φ}.

(2) ⇒ (1): We use Theorem 7.3 and take Ψf for Ψ′. Convergence holds since (Ψ; ≤) is a complete lattice, density follows from (7.24) since ψ ≪ φ implies ψ ≤ φ, and compactness follows from the lattice-theoretic finiteness. ⊓⊔

Another important property of finite elements in a compact information algebra is given by the following theorem:

Theorem 7.16 If (Ψ, D; ≤, ⊥, ·,ǫ) is a compact domain-free information algebra, then ψ ≪ φ implies that there is an element χ ∈ Ψf so that ψ ≤ χ ≤ φ.

Proof. The set Aφ = {χ ∈ Ψf : χ ≤ φ} is directed and φ = ∨Aφ, hence φ ≤ ∨Aφ. Then ψ ≪ φ implies the existence of an element χ ∈ Aφ so that ψ ≤ χ. But χ ≤ φ. So ψ ≤ χ ≤ φ and χ ∈ Ψf. ⊓⊔

A set S of elements having the property that ψ ≪ φ implies the existence of a χ ∈ S such that ψ ≤ χ ≤ φ is called separating. So the set of finite elements in a compact information algebra is separating.

We now introduce continuous information algebras and show that they are a generalisation of algebraic ones.

Definition 7.7 Continuous Information Algebras. A generalised domain-free information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is called continuous with basis B ⊆ Ψ if B is closed under join (combination), contains the unit 1, and satisfies the following conditions:

1. Convergence: If X ⊆ B is directed, then ∨X exists in Ψ.

2. B-Density: For all φ ∈ Ψ and for all x ∈ D,

ǫx(φ) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ ǫx(φ)}.

Note that in a compact information algebra (Ψ, D; ≤, ⊥, ·,ǫ) the finite elements Ψf form a basis. So, an algebraic information algebra is also continuous with basis Ψf. We shall present below an example of a continuous information algebra which is not compact. So continuous information algebras present a genuine generalisation of compact information algebras. The approximation by finite elements is replaced by an approximation by more general elements of a basis B. Strong B-density implies weak B-density: In fact, let φ ∈ Ψ; then there is an x ∈ D so that φ = ǫx(φ). Then, by strong B-density,

φ = ǫx(φ) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ φ} ≤ ∨{ψ ∈ B : ψ ≪ φ} ≤ φ.

This is weak B-density. This allows us again to adjoin the top domain ⊤ to (D; ≤) in a continuous information algebra, since weak B-density is also local B-density on the domain ⊤. For later purposes we remark that both the sets {ψ ∈ B : ψ ≪ φ} and {ψ ∈ B : ψ = ǫx(ψ) ≪ φ} are directed. Note also that ψ ≪ φ does not imply ǫx(ψ) ≪ ǫx(φ).

Just as the partial order (Ψ; ≤) of an algebraic information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is an algebraic lattice, in a continuous information algebra (Ψ, D; ≤, ⊥, ·,ǫ) the partial order (Ψ; ≤) is a continuous lattice, namely a complete lattice such that for all φ ∈ Ψ,

φ = ∨{ψ ∈ Ψ : ψ ≪ φ}. (7.25)

The following theorem states the situation more precisely.

Theorem 7.17 If (Ψ, D; ≤, ⊥, ·,ǫ) is an idempotent domain-free information algebra, then the following are equivalent:

1. (Ψ, D; ≤, ⊥, ·,ǫ) is a continuous information algebra.

2. (Ψ; ≤) is a continuous lattice, and ∀x ∈ D, ∀φ ∈ Ψ,

ǫx(φ) = ∨{ψ ∈ Ψ : ψ = ǫx(ψ) ≪ ǫx(φ)}. (7.26)

Proof. Assume first (Ψ, D; ≤, ⊥, ·,ǫ) to be a continuous information algebra with basis B. We show first that (Ψ; ≤) is a complete lattice. Consider a non-empty subset X of Ψ. Define Y to be the set of all elements of B which are way-below all elements of X,

Y = {ψ ∈ B : ψ ≪ φ for all φ ∈ X}.

Since 1 ∈ Y, the set is non-empty, and with ψ1, ψ2 ∈ Y also ψ1 ∨ ψ2 ∈ Y (Lemma 7.9). So the set Y is directed. Therefore ∨Y exists and is a lower bound of X. Assume ψ to be another lower bound of X. Then Aψ = {η ∈ B : η ≪ ψ} ⊆ Y, since η ≪ ψ ≤ φ implies η ≪ φ. From this we conclude that ψ = ∨Aψ ≤ ∨Y, hence ∨Y is the infimum of X. Further, since Ψ has a top element 0, it follows from standard results of lattice theory that (Ψ; ≤) is a complete lattice. Further, using B-density, we obtain for all φ ∈ Ψ,

φ = ∨{ψ ∈ B : ψ ≪ φ} ≤ ∨{ψ ∈ Ψ : ψ ≪ φ} ≤ φ.

So (Ψ; ≤) is indeed a continuous lattice. Further, again by density,

ǫx(φ) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ ǫx(φ)}

≤ ∨{ψ ∈ Ψ : ψ = ǫx(ψ) ≪ ǫx(φ)} ≤ ǫx(φ),

so (7.26) holds.

If, on the other hand, (Ψ; ≤) is a complete lattice, then convergence holds with Ψ as a basis. And (7.26) is exactly B-density with respect to the basis Ψ. Hence (Ψ, D; ≤, ⊥, ·,ǫ) is a continuous information algebra. ⊓⊔

Here follows an example of a continuous information algebra.

Example 7.2 Continuous Valuation Algebra: This example is from (Guan & Li, 2012). Let Ψ = [0, 1] be the real interval between 0 and 1 and D = {0, 1}. Join is defined as maximum, the number 0 is the unit and the number 1 the null element of the algebra. Information extraction is defined as follows:

ǫ1(φ) = φ,

ǫ0(φ) = φ if φ ∈ [0, 1/2],
ǫ0(φ) = 1/2 if φ ∈ (1/2, 1].

We leave it to the reader to verify the axioms of an idempotent valuation algebra. Any non-empty subset X of [0, 1] is in this example directed and sup X always exists. The relation ψ ≪ φ holds if and only if either ψ < φ or ψ = φ = 0. As a basis we take B = Ψ. Then it can be verified that ǫx(φ) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ φ} holds both for x = 0 and for x = 1. So it is a continuous information algebra. But it is not compact: the only element satisfying φ ≪ φ is φ = 0. ⊖
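The example is easy to play with directly. Here is a small Python sketch of it (our own rendering; floats stand in for the reals, so suprema are only approximated on a grid):

    def eps(x, phi):                 # the extraction operators eps_1 and eps_0
        if x == 1:
            return phi
        return phi if phi <= 0.5 else 0.5

    def way_below(psi, phi):         # psi << phi iff psi < phi, or psi = phi = 0
        return psi < phi or psi == phi == 0.0

    # Only phi = 0 satisfies phi << phi, so the algebra is not compact:
    assert [p for p in (0.0, 0.25, 0.5, 1.0) if way_below(p, p)] == [0.0]

    # B-density with B = Psi: eps_x(phi) is the supremum of
    # {psi : psi = eps_x(psi) << eps_x(phi)}, checked numerically on a grid.
    grid = [i / 1000 for i in range(1001)]
    for x in (0, 1):
        for phi in (0.0, 0.3, 0.7, 1.0):
            target = eps(x, phi)
            approx = max((p for p in grid
                          if eps(x, p) == p and way_below(p, target)),
                         default=0.0)
            assert abs(approx - target) < 1e-2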

We have seen above that a compact information algebra is continuous. But the converse does not hold as the example above shows. Here follows a necessary and sufficient condition for a continuous information algebra to be algebraic.

Theorem 7.18 A continuous domain-free information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is algebraic if and only if the set {φ ∈ Ψ : φ ≪ φ} is a basis for (Ψ, D; ≤, ⊥, ·,ǫ).

Proof. We know already that if (Ψ, D; ≤, ⊥, ·,ǫ) is algebraic, then it is continuous with basis B = Ψf = {φ ∈ Ψ : φ ≪ φ}. So, assume that (Ψ, D; ≤, ⊥, ·,ǫ) is continuous with basis B = {φ ∈ Ψ : φ ≪ φ}. The lattice (Ψ; ≤) is complete, hence it is a dcpo. Strong density is derived as follows:

ǫx(φ) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ ǫx(φ)}

= ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ ψ ≤ ǫx(φ)}

= ∨{ψ ∈ B : ψ = ǫx(ψ) ≤ ǫx(φ)}. (7.27)

So the algebra is compact, hence algebraic, with the set {φ ∈ Ψ : φ ≪ φ} as finite elements. ⊓⊔

The following theorem gives another necessary and sufficient condition for an information algebra to be continuous.

Theorem 7.19 An idempotent domain-free generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is continuous if and only if,

1. (Ψ; ≤) is a continuous lattice,

2. for all x ∈ D and any directed set X ⊂ Ψ,

ǫx(∨X) = ∨{ǫx(φ) : φ ∈ X}. (7.28)

Proof. Assume (Ψ; ≤) to be a continuous lattice, so that weak density (7.25) holds, and that (7.28) holds too. Then (Ψ; ≤) is a complete lattice. Consider a φ ∈ Ψ and let φ′ = ǫx(φ). Then by weak density φ′ = ∨{ψ ∈ Ψ : ψ ≪ φ′}, and {ψ ∈ Ψ : ψ ≪ φ′} is a directed set. From this we deduce, using (7.28),

ǫx(φ) = ǫx(ǫx(φ)) = ǫx(∨{ψ ∈ Ψ : ψ ≪ ǫx(φ)})

= ∨{ǫx(ψ) : ψ ≪ ǫx(φ)}.

Let η = ǫx(ψ), so that η = ǫx(η) ≤ ψ ≪ ǫx(φ). From this it follows that η ≪ ǫx(φ) and therefore

ǫx(φ) = ∨{η : η = ǫx(η) = ǫx(ψ), ψ ≪ ǫx(φ)}

≤ ∨{η : η = ǫx(η) ≪ ǫx(φ)} ≤ ǫx(φ).

Hence we have ǫx(φ) = ∨{η : η = ǫx(η) ≪ ǫx(φ)}, and by Theorem 7.17 (Ψ, D; ≤, ⊥, ·,ǫ) is a continuous information algebra.

Conversely, assume (Ψ, D; ≤, ⊥, ·,ǫ) to be a continuous information algebra with basis B. Then (Ψ; ≤) is a continuous, hence complete, lattice (Theorem 7.17). Consider a directed set X ⊆ Ψ and x ∈ D. For φ ∈ X we have φ ≤ ∨X, hence ǫx(φ) ≤ ǫx(∨X) and therefore ∨{ǫx(φ) : φ ∈ X} ≤ ǫx(∨X). By strong B-density,

ǫx(∨X) = ∨{ψ ∈ B : ψ = ǫx(ψ) ≪ ǫx(∨X)}.

Now, ψ = ǫx(ψ) ≪ ǫx(∨X) ≤ ∨X implies that there is a φ ∈ X so that ψ ≤ φ, and thus also ψ = ǫx(ψ) ≤ ǫx(φ). From this we conclude that ǫx(∨X) ≤ ∨{ǫx(φ) : φ ∈ X}, and thus ǫx(∨X) = ∨{ǫx(φ) : φ ∈ X}. Hence (7.28) is valid. ⊓⊔

What is the labeled version of a continuous domain-free information algebra? To examine this question, we consider the labeled version (Φ, D; ≤, ⊥, ·,t) of a continuous information algebra (Ψ, D; ≤, ⊥, ·,ǫ). We remind the reader that Φ consists of all pairs (ψ, x), where ψ ∈ Ψ and ψ = ǫx(ψ). Assume that B is a basis of the continuous information algebra (Ψ, D; ≤, ⊥, ·,ǫ). Define Bx = {(ψ, x) : ψ ∈ B, ǫx(ψ) = ψ}. We claim that this is a basis in Φx. In fact, if (φ, x), (ψ, x) ∈ Bx, then (φ, x) · (ψ, x) = (φ · ψ, x) ∈ Bx, since B is closed under combination or join. So Bx is closed under combination. Further, (1, x) also belongs to Bx. Consider any directed subset

X of Bx. By Lemma 7.4 we have ∨X = (∨{ψ : (ψ, x) ∈ X}, x) ∈ Φx. This is the convergence property in Φx. Define

B̄ = ∪x∈D Bx.

Then B̄ is still closed under combination. In fact, let (φ, x) ∈ Bx and (ψ, y) ∈ By; then φ, ψ ∈ B and x is a support of φ, y a support of ψ. But then x ∨ y is a support of φ · ψ. So, since (φ, x) · (ψ, y) = (φ · ψ, x ∨ y) and φ · ψ ∈ B, we see that (φ, x) · (ψ, y) ∈ Bx∨y. We claim also that a density property holds in Φx. Denote the way-below relation in (Φx; ≤) by ≪x. We prove first the following lemma.

Lemma 7.10 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a continuous domain-free information algebra and let φ, ψ ∈ Ψ and ǫx(φ) = φ, ǫx(ψ) = ψ. Then ψ ≪ φ, if and only if (ψ,x) ≪x (φ, x).

Proof. Assume ψ ≪ φ and ǫx(φ) = φ, ǫx(ψ) = ψ. Consider a directed set X ⊆ Φx. Then X′ = {φ′ : (φ′, x) ∈ X} is directed too. Now, (φ, x) ≤ ∨X implies φ ≤ ∨X′. Then there is a χ ∈ X′ such that ψ ≤ χ. Note that ǫx(χ) = χ. Hence we see that (ψ, x) ≤ (χ, x). So indeed (ψ, x) ≪x (φ, x).

Conversely, assume (ψ, x) ≪x (φ, x). Consider a directed set X ⊆ Ψ such that φ ≤ ∨X. In a continuous information algebra we have ǫx(∨X) = ∨{ǫx(χ) : χ ∈ X} (Theorem 7.19). Then φ = ǫx(φ) ≤ ǫx(∨X) = ∨{ǫx(χ) : χ ∈ X}. Therefore (φ, x) ≤ (∨{ǫx(χ) : χ ∈ X}, x) = ∨{(ǫx(χ), x) : χ ∈ X} (Lemma 7.4). Since the set {(ǫx(χ), x) : χ ∈ X} is directed, there must then be a χ ∈ X such that (ψ, x) ≤ (ǫx(χ), x). Then ψ = ǫx(ψ) ≤ ǫx(χ) ≤ χ ∈ X. This proves that ψ ≪ φ. ⊓⊔

This allows us to derive density, using Lemma 7.4 and Lemma 7.10 in (Φx; ≤):

∨{(ψ, x) ∈ Bx : (ψ, x) ≪x (φ, x)}

= (∨{ψ ∈ B : ψ = ǫx(ψ) ≪ φ = ǫx(φ)}, x) = (φ, x).

This is the density property claimed above. Finally, assume (ψ, x) ≪x (φ, x). By Lemma 7.10 we have ψ ≪ φ, and x is a support of both ψ and φ. If x ≤ y, then y is also a support of both elements. Therefore, again by Lemma 7.10, we have that ty(ψ, x) = (ψ, y) ≪y (φ, y) = ty(φ, x). Conversely, assume that x is a support of ψ and φ and x ≤ y. Then, if (ψ, y) ≪y (φ, y), Lemma 7.10 implies that ψ ≪ φ, hence (ψ, x) ≪x (φ, x). This is an important compatibility relation between the way-below relations in different domains Φx and Φy. We summarise these results in the following theorem.

Theorem 7.20 Let (Ψ, D; ≤, ⊥, ·,ǫ) be a continuous domain-free information algebra with basis B and (Φ, D; ≤, ⊥, ·,t) the associated dual labeled information algebra. Then the following properties hold:

1. Bx is a basis in (Φx; ≤), that is, Bx is closed under combination and contains (1, x). Any directed subset of Bx has a supremum in Φx.

2. (φ, x) = ∨{(ψ, x) ∈ Bx : (ψ, x) ≪x (φ, x)} for all (φ, x) ∈ Φx.

3. If x ≤ y, then (ψ, x) ≪x (φ, x) if and only if ty(ψ, x) ≪y ty(φ, x).

This theorem serves as the basis to define the concept of a labeled continuous information algebra.

Definition 7.8 Labeled Continuous Information Algebra: A labeled idempotent generalised information algebra (Φ, D; ≤, ⊥, ·,t) is called continuous, if there is for every x ∈ D a set Bx ⊆ Φx (the basis in x), closed under combination and containing 1x, satisfying the following conditions for all x ∈ D:

1. Convergence: If X ⊆ Bx is directed, then ∨X ∈ Φx.

2. Density: For all φ ∈ Φx, φ = ∨{ψ ∈ Bx : ψ ≪x φ}.

3. Compatibility: If d(φ) = d(ψ) = x ≤ y, then ψ ≪x φ if and only if ty(ψ) ≪y ty(φ).

According to this definition and Theorem 7.20, the dual labeled information algebra (Φ, D; ≤, ⊥, ·,t) associated with a continuous domain-free information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is itself continuous. We remark that, as in Theorem 7.17, it follows that (Φx; ≤) is a continuous lattice for every x ∈ D. To establish duality for continuous information algebras, let's start with a labeled continuous information algebra (Φ, D; ≤, ⊥, ·,t) and consider its associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·,ǫ). Is this algebra continuous too? A conditionally affirmative answer is given by Theorem 7.21 below. In order to prove this theorem we need two auxiliary results, which have some interest in themselves.

Lemma 7.11 Let (Φ, D; ≤, ⊥, ·,t) be an idempotent labeled generalised information algebra. Then ǫx([ψ]σ) = [ψ]σ ≪ [φ]σ = ǫx([φ]σ) in Φ/σ implies ψ ≪x φ for the representants ψ and φ of [ψ]σ and [φ]σ with d(ψ) = d(φ) = x. Further, if D has a top element ⊤ and (Φ, D; ≤, ⊥, ·,t) is labeled continuous, then, if d(ψ) = d(φ) = x, ψ ≪x φ implies [ψ]σ ≪ [φ]σ.

Proof. Assume X ⊆ Φx directed, φ, ψ ∈ Φx representants of the classes [φ]σ and [ψ]σ respectively, and φ ≤ ∨X. Then [φ]σ ≤ ∨[X]σ with [X]σ = {[χ]σ : χ ∈ X} (Lemma 7.7). The set [X]σ is directed, therefore [ψ]σ ≪ [φ]σ implies that there is an η ∈ X such that [ψ]σ ≤ [η]σ, hence ψ ≤ η. This proves that ψ ≪x φ.

For the second part, assume first ψ ≪⊤ φ and consider a directed set X in Φ/σ such that [φ]σ ≤ ∨X. We may take as representants of the classes [η]σ in the set X their representants in Φ⊤. Let then X′ = {η ∈ Φ⊤ : [η]σ ∈ X}.

X′ is still directed. Now, if [φ]σ ≤ ∨X and φ is again a representant of [φ]σ in Φ⊤, then also φ ≤ ∨X′. Since ψ ≪⊤ φ, there is an element η ∈ X′ such that ψ ≤ η. But then [η]σ ∈ X and [ψ]σ ≤ [η]σ. This shows that [ψ]σ ≪ [φ]σ. Now, if d(ψ) = d(φ) = x and ψ ≪x φ, then by the compatibility property t⊤(ψ) ≪⊤ t⊤(φ), and [ψ]σ = [t⊤(ψ)]σ and [φ]σ = [t⊤(φ)]σ, hence [ψ]σ ≪ [φ]σ, as just proved. ⊓⊔

The next lemma is similar to Lemma 7.6 for labeled compact algebras.

Lemma 7.12 Let (Φ, D; ≤, ⊥, ·,t) be a labeled continuous information algebra. If X ⊆ Φy is directed, then for all x ≤ y ∈ D,

tx(∨X) = ∨tx(X), (7.29)

where tx(X) = {tx(ψ) : ψ ∈ X}.

Proof. Note that ∨X exists in Φy, since (Φy; ≤) is a complete lattice. Consider a ψ ∈ X, so that ψ ≤ ∨X, hence tx(ψ) ≤ tx(∨X), and thus ∨tx(X) ≤ tx(∨X). Conversely, by density in Φx we have

tx(∨X) = ∨{ψ ∈ Φx : ψ ≪x tx(∨X)}.

By the compatibility condition, ψ ≪x tx(∨X) implies ty(ψ) ≪y ty(tx(∨X)) ≤ ∨X. By the definition of the way-below relation ≪y this means that there is a χ ∈ X such that ty(ψ) ≤ χ. But then it follows that ψ = tx(ty(ψ)) ≤ tx(χ) ∈ tx(X), hence tx(∨X) ≤ ∨tx(X), and therefore tx(∨X) = ∨tx(X). ⊓⊔

Now we are in a position to prove the following theorem.

Theorem 7.21 Let (Φ, D; ≤, ⊥, ·,t) be a labeled continuous information algebra, and assume that D has a top element ⊤. Then the associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·,ǫ) is continuous.

Proof. We first show that (Φ/σ; ≤) is a complete lattice. To this end consider any non-empty subset X ⊆ Φ/σ. For any element [ψ]σ of X we may take the representant ψ in the top domain Φ⊤, d(ψ) = ⊤. Let then X′ = {ψ ∈ Φ⊤ : [ψ]σ ∈ X}. But (Φ⊤; ≤) is a complete lattice, hence ∨X′ exists in Φ⊤. By Lemma 7.7, we have [∨X′]σ = ∨X, and so X has a supremum in Φ/σ. Since (Φ/σ; ≤) has a smallest element [1⊤]σ, by standard results of lattice theory (Φ/σ; ≤) is a complete lattice.

Next consider any class [φ]σ ∈ Φ/σ. The set {[ψ]σ : [ψ]σ ≪ [φ]σ} is directed. Consider the representants of the classes of this set in Φ⊤, {ψ ∈ Φ⊤ : [ψ]σ ≪ [φ]σ}, and also the representant φ ∈ Φ⊤. Then, by Lemma 7.7, Lemma 7.11 and density in the labeled algebra,

∨{[ψ]σ : [ψ]σ ≪ [φ]σ} = [∨{ψ ∈ Φ⊤ : [ψ]σ ≪ [φ]σ}]σ

= [∨{ψ ∈ Φ⊤ : ψ ≪⊤ φ}]σ = [φ]σ.

This shows that (weak) density holds. Therefore, (Φ/σ; ≤) is a continuous lattice. By Theorem 7.19 it is now sufficient to prove (7.28). So, consider a directed set X ⊆ Φ/σ. For any [ψ]σ ∈ X we may select the representant ψ in Φ⊤. Define X′ = {ψ ∈ Φ⊤ : [ψ]σ ∈ X}. This set is still directed in Φ⊤. Now, using Lemma 7.7 and Lemma 7.12 repeatedly,

ǫx(∨X) = ǫx(∨{[φ]σ : φ ∈ X′}) = ǫx([∨X′]σ) = [tx(∨X′)]σ = [∨tx(X′)]σ = ∨{[tx(φ)]σ : [φ]σ ∈ X}

= ∨{ǫx([φ]σ) : [φ]σ ∈ X} = ∨ǫx(X).

This proves that (Φ/σ, D; ≤, ⊥, ·,ǫ) is a domain-free continuous information algebra. ⊓⊔

Note that the existence of a top element in (D; ≤) is required in the proof above for (Φ/σ; ≤) to be continuous. We refer to the discussion at the end of Section ?? concerning the adjunction of a top domain to a continuous domain-free information algebra, hence the existence of this domain in the associated labeled algebra. This gives us the full duality between labeled and domain-free continuous information algebras. However, the definition of a continuous labeled information algebra also makes sense without a top domain. It remains so far an open question whether a labeled continuous information algebra (Φ, D) can be extended to a labeled continuous information algebra with a top domain. The problem is the extension of the compatibility condition to the new top domain.

As for compact information algebras, a way to obtain continuous information algebras is from continuous lattices as semiring information algebras (Guang & Kohlas, 2015).

7.5 Atomic Algebras

In many idempotent information algebras there are maximal elements, called atoms. In important cases these maximal elements determine the algebra fully. For example, in the set algebra relative to a f.c.f (F, R), the elements, or rather the one-element subsets, of any frame represent maximal information relative to the frame. We want to study this situation in the general framework of generalised idempotent information algebras. Labeled algebras are somewhat better suited for this subject than domain-free information algebras. However, we refer to (Kohlas & Schmid, 2016) for a discussion of the same subject in the context of idempotent domain-free valuation algebras. In (Kohlas, 2003a) atoms in idempotent labeled valuation algebras were studied. This section presents a generalisation thereof. Consider a labeled idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,t). Then we define the concept of an atom in this algebra as follows:

Definition 7.9 Atom: An element α ∈ Ψx in a domain x ∈ D of a labeled idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,t) is called an atom in x, if

1. α ≠ 0x,

2. for all ψ ∈ Ψx, α ≤ ψ implies either α = ψ or ψ = 0x.

So atoms are maximal elements in a domain, smaller than the null element. (In order theory, atoms are usually defined as minimal elements; but for our purpose, this definition better fits the idea of atoms as building blocks of a piece of information, see below.) Since the null element does not represent proper information, atoms are indeed maximal pieces of information. We shall now first present a few elementary properties of atoms. Then we shall show that atoms are closely related to families of compatible frames (f.c.f). Finally, we present particular information algebras which are essentially set algebras on some f.c.f. Denote the set of atoms in domain x by Atx. Here are a few general properties of atoms:

Lemma 7.13 Let (Ψ, D; ≤, ⊥, ·,t) be an idempotent labeled generalised information algebra. Then the following holds:

1. If y ≤ x, then α ∈ Atx and ψ ∈ Ψy imply either ψ ≤ α or else α · ψ = 0x,

2. α, β ∈ Atx imply either α = β or else α · β = 0x,

3. α ∈ Atx and y ≤ x imply ty(α) ∈ Aty,

4. α ∈ Atx and β ∈ Aty imply either α · β = 0x∨y or tx(α · β) = α and ty(α · β)= β,

5. α ∈ Atx and ψ ∈ Ψy imply either α · ψ = 0x∨y or tx(α · ψ)= α.

Proof. 1.) From α ≤ α · ψ and the definition of an atom it follows that either α · ψ = 0x or α · ψ = α, hence ψ ≤ α.

2.) As before, α, β ≤ α · β implies either α · β = 0x or α · β = α and α · β = β, hence α = β.

3.) Since α is an atom, α ≠ 0x and therefore ty(α) ≠ 0y. Assume ty(α) ≤ ψ, where d(ψ) = y. From x⊥y|y we obtain ty(α · ψ) = ty(α) · ψ = ψ. Now, α · ψ = α or α · ψ = 0x by item 1 just proved. In the first case it follows that ty(α) = ψ, in the second case that ψ = 0y. So, ty(α) is indeed an atom.

4.) Assume α · β ≠ 0x∨y. We have x⊥y|x, which implies tx(α · β) = α · tx(β) ≠ 0x, since otherwise α · β = 0x∨y. Then from α ≤ α · tx(β) it follows α = α · tx(β). The second identity ty(α · β) = β follows in the same way.


5.) is proved in the same way as item 4. ⊓⊔

If for an element ψ ∈ Ψx and an atom α we have ψ ≤ α, then α implies ψ. Therefore we define At(ψ) = {α ∈ Atx : ψ ≤ α} and call At(ψ) the set of atoms implying ψ; we also say that At(ψ) is the set of atoms contained in ψ. Now, we fix our attention on information algebras where every element contains an atom.

Definition 7.10 Atomic Information Algebra: A labeled idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,t) is called atomic, if for all x ∈ D and all ψ ∈ Ψx, ψ ≠ 0x implies At(ψ) ≠ ∅.
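A concrete instance may be helpful here. The following Python sketch (our own illustration; the subset model is only one instance of these definitions) verifies Definitions 7.9 and 7.10 by brute force for the relations over a single two-variable domain, where combination on a fixed domain is intersection and the information order is reversed inclusion:

    from itertools import product, combinations

    TUPLES = list(product((0, 1), repeat=2))        # the frame of a domain x
    RELATIONS = [frozenset(c) for r in range(len(TUPLES) + 1)
                 for c in combinations(TUPLES, r)]  # Psi_x
    NULL = frozenset()                              # 0_x, the null element

    def leq(phi, psi):   # phi <= psi iff phi * psi = psi; here: psi subset of phi
        return psi <= phi

    def is_atom(alpha):  # Definition 7.9
        return alpha != NULL and all(
            psi in (alpha, NULL) for psi in RELATIONS if leq(alpha, psi))

    def At(psi):         # the set of atoms implying psi
        return [a for a in RELATIONS if is_atom(a) and leq(psi, a)]

    # The atoms are exactly the one-tuple relations:
    assert ({a for a in RELATIONS if is_atom(a)}
            == {frozenset([t]) for t in TUPLES})
    # Atomic (Definition 7.10): every psi != 0_x contains an atom; in fact
    # psi is the infimum (here: union) of its atoms, the 'atomistic'
    # property introduced below.
    assert all(At(psi) and frozenset().union(*At(psi)) == psi
               for psi in RELATIONS if psi != NULL)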

In atomic information algebras, atoms have a few additional properties.

Lemma 7.14 Let (Ψ, D; ≤, ⊥, ·,t) be an atomic information algebra. Then the following holds:

1. If x ≤ y, then for all α ∈ Atx there is an atom β ∈ Aty such that α = tx(β).

2. For all α ∈ Atx, β ∈ Aty such that α · β 6= 0x∨y, there is an atom γ ∈ Atx∨y such that tx(γ)= α and ty(γ)= β.

Proof. 1.) Since the algebra is atomic and ty(α) ≠ 0y, there exists an atom β ∈ At(ty(α)). Then ty(α) ≤ β, which implies α = tx(ty(α)) ≤ tx(β). But tx(β) ≠ 0x since β ≠ 0y, hence α = tx(β).

2.) Again there is an atom γ ∈ At(α · β), such that α · β ≤ γ. But then, by Lemma 7.13, item 4, α = tx(α · β) ≤ tx(γ) ≠ 0x. So α = tx(γ). Similarly, we derive β = ty(γ). ⊓⊔

Now, we consider the sets Atx and show that they are part of a family of compatible frames (f.c.f), if the underlying algebra is atomic. We start by showing how the transport operation induces refinings among the sets of atoms Atx. Define for x ≤ y and α ∈ Atx the map

τx,y(α) = At(ty(α)) (7.30)

from Atx into the power set of Aty.
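In the relational sketch used earlier, this refining has a very concrete reading: it sends an assignment over x to the set of all its extensions to the finer domain y. A minimal Python sketch (again our own illustration, with atoms as dicts):

    from itertools import product

    VALUES = (0, 1)

    def refine(alpha, x, y):     # tau_{x,y}(alpha) for x <= y
        extra = sorted(set(y) - set(x))
        return [dict(alpha, **dict(zip(extra, vals)))
                for vals in product(VALUES, repeat=len(extra))]

    print(refine({'a': 0}, x=('a',), y=('a', 'b')))
    # -> [{'a': 0, 'b': 0}, {'a': 0, 'b': 1}]

    # The images of distinct atoms are disjoint and jointly cover At_y,
    # which is what the next theorem asserts in general.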

Theorem 7.22 Let (Ψ, D; ≤, ⊥, ·,t) be an atomic information algebra. Then τx,y is a refining of Atx.

Proof. First, τx,y(α) ≠ ∅, since the algebra is atomic. Secondly, consider atoms α ≠ β. Assume that τx,y(α) ∩ τx,y(β) ≠ ∅. Then select an atom γ in τx,y(α) ∩ τx,y(β). But this means γ ∈ At(ty(α)), hence ty(α) ≤ γ. It follows that α = tx(ty(α)) ≤ tx(γ). But this implies tx(γ) = α, since tx(γ) is an atom in x (Lemma 7.13). In the same way we deduce tx(γ) = β, hence α = β, contrary to the assumption. So τx,y(α) ∩ τx,y(β) = ∅. Finally, consider any atom γ in Aty. Since α = tx(γ) is an atom in x, we conclude γ ∈ τx,y(α). This shows that ∪α∈Atx τx,y(α) = Aty, and this concludes the proof that τx,y is a refining of Atx. ⊓⊔

According to this theorem, in an atomic information algebra, if x ≤ y, then Aty is a refinement of Atx and the latter a coarsening of Aty. We may extend the maps τx,y in the usual way to sets of atoms in x. Let now F = {Atx : x ∈ D} and R = {τx,y : x, y ∈ D, x ≤ y}. We claim that (F, R) is an f.c.f, provided that some additional conditions are satisfied. This issue will be addressed below. Consider now first the family of all sets of atoms At(ψ) for ψ ∈ Ψ. Let's denote this family by SAt(Ψ). We define the following operations with respect to these sets of atoms:

1. Labeling: d(At(ψ)) = Atx if d(ψ) = x.

2. Combination: If d(φ) = x and d(ψ) = y,

At(φ) ⊲⊳ At(ψ)= {γ ∈ Atx∨y : tx(γ) ∈ At(φ),ty(γ) ∈ At(ψ)}. (7.31)

3. Transport: If d(ψ)= x and y ∈ D,

ty(At(ψ)) = vy,x∨y(τx,x∨y(At(ψ))). (7.32)

Here vy,x∨y(A) is the saturation operator (see Section 2.3), defined by vy,x∨y(A) = {α ∈ Aty : τy,x∨y(α) ∩ A ≠ ∅} for A ⊆ Atx∨y. Note also the similarity of combination with the relational join in relational algebra, as the sketch below illustrates.
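In the assignment model of the earlier sketches, (7.31) is computed exactly like a natural join: pairs of atoms agreeing on the shared variables merge into one atom on the union domain (a hedged sketch with restriction standing in for transport; the helper names are our own):

    from itertools import product

    def join_atoms(at_phi, at_psi, common):
        # At(phi) |><| At(psi) over the shared variables 'common'
        return [dict(a, **b) for a, b in product(at_phi, at_psi)
                if all(a[v] == b[v] for v in common)]

    at_phi = [{'a': 0, 'b': 0}, {'a': 1, 'b': 1}]   # At(phi), d(phi) = {a, b}
    at_psi = [{'b': 1, 'c': 0}]                     # At(psi), d(psi) = {b, c}
    print(join_atoms(at_phi, at_psi, common=('b',)))
    # -> [{'a': 1, 'b': 1, 'c': 0}]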

We show that At(φ) ⊲⊳ At(ψ) and ty(At(ψ)) are elements of SAt(Ψ). This follows from the following theorem:

Theorem 7.23 Let (Ψ, D; ≤, ⊥, ·,t) be an atomic information algebra. Then for all φ, ψ ∈ Ψ and y ∈ D,

At(φ) ⊲⊳ At(ψ) = At(φ · ψ)

ty(At(ψ)) = At(ty(ψ)).

Proof. Consider first α ∈ At(φ · ψ). Then φ · ψ ≤ α. Assume d(φ) = x and d(ψ) = y. From x⊥y|x it follows that φ ≤ φ · tx(ψ) = tx(φ · ψ) ≤ tx(α) ∈ Atx. This shows that tx(α) ∈ At(φ). Similarly it follows that ty(α) ∈ At(ψ). So we conclude that α ∈ At(φ) ⊲⊳ At(ψ) and At(φ · ψ) ⊆ At(φ) ⊲⊳ At(ψ). Conversely, if α ∈ At(φ) ⊲⊳ At(ψ), then α ∈ Atx∨y and tx(α) ∈ At(φ), ty(α) ∈ At(ψ). It follows that φ · ψ ≤ tx(α) · ty(α) ≤ α and thus α ∈ At(φ · ψ). This proves that At(φ) ⊲⊳ At(ψ) = At(φ · ψ).

Next consider an atom α ∈ At(ty(ψ)) and assume d(ψ) = x. In order to show that α ∈ ty(At(ψ)) we need to verify that

τy,x∨y(α) ∩ τx,x∨y(At(ψ)) ≠ ∅. (7.33)

Note that ty(α · ψ) = α · ty(ψ) = α ≠ 0y, hence α · ψ ≠ 0x∨y. Therefore there exists an atom β in At(α · ψ), such that α · ψ ≤ β. Then tx(β) ≥ tx(α · ψ) = tx(α) · ψ ≥ ψ, which shows that tx(β) ∈ At(ψ), hence β ∈ τx,x∨y(At(ψ)). But we have also ty(β) ≥ ty(α · ψ) = α, thus ty(β) = α, which shows that β ∈ τy,x∨y(α). Thus (7.33) holds and α ∈ ty(At(ψ)).

Conversely, consider α ∈ ty(At(ψ)). Then α is an atom in y such that (7.33) holds. Select then an atom β in this intersection. We have ty(β) = α and tx(β) ∈ At(ψ), or tx(β) ≥ ψ. This implies β ≥ tx∨y(tx(β)) ≥ tx∨y(ψ). Then α = ty(β) ≥ ty(tx∨y(ψ)) = ty(ψ). So we conclude that α ∈ At(ty(ψ)), and thus ty(At(ψ)) = At(ty(ψ)). ⊓⊔

According to this theorem SAt(Ψ) is closed under the operations of combination and transport defined above. As in Section 3.1, where we introduced a set algebra as a labeled idempotent information algebra relative to a f.c.f (F, R), we consider pairs (At(ψ), Atx) for d(ψ) = x and define

ΨAtx = {(At(ψ), Atx) : d(ψ) = x} and ΨAt(Ψ) = ∪x∈D ΨAtx. The operations above in SAt(Ψ) can then be extended to ΨAt(Ψ) in the following way:

1. Labeling: d(At(ψ), Atx)= Atx.

2. Combination: (At(φ), Atx) · (At(ψ), Aty ) = (At(φ) ⊲⊳ At(ψ), Atx ∨ Aty).

3. Transport: tAty(At(ψ), Atx) = (ty(At(ψ)), Aty).

Note that (At(1x), Atx) = (Atx, Atx) and (At(0x), Atx) = (∅, Atx) are the unit and null elements of combination in domain Atx. The map ψ ↦ (At(ψ), Atx) for d(ψ) = x satisfies, according to Theorem 7.23,

φ · ψ ↦ (At(φ), Atx) · (At(ψ), Aty),

1x ↦ (Atx, Atx),

0x ↦ (∅, Atx),

ty(ψ) ↦ tAty((At(ψ), Atx)). (7.34)

All this indicates that the algebra of subsets of atoms in SAt(Ψ) might be a generalised information algebra, somehow connected to the original atomic algebra (Ψ, D; ≤, ⊥, ·,t). To pursue this line of inquiry we strengthen the concept of an atomic algebra:

Definition 7.11 Atomistic Information Algebras: A labeled idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,t) is called atomistic, if it is atomic and if for all ψ ∈ Ψx, ψ ≠ 0x,

ψ = ∧At(ψ). (7.35)

The family of sets Atx for x ∈ D is nearly a family of compatible frames. Only the Identity of Coarsenings condition does not hold in general. (This means that the frames Atx form only a preorder under the order induced by refinings. This may be an indication that a preorder of domains or frames could be sufficient to develop the theory of generalised information algebras, see (Dawid, 2001).) We introduce an additional condition, which proves to be necessary and sufficient to guarantee the Identity of Coarsenings between the frames Atx. We call two domains x and y from D infomorph if

for all α ∈ Atx there exists β ∈ Aty

and for all β ∈ Aty there exists α ∈ Atx

such that α ≡σ β. (7.36)

This means essentially that atoms in domains x and y convey the same information. In fact, if (7.36) holds, then tx∨y(α) = tx∨y(β), hence ty(α) = ty(tx∨y(α)) = ty(tx∨y(β)) = β, and similarly tx(β) = α. So, we have tx(ty(α)) = α and also ty(tx(β)) = β for all atoms in Atx or Aty respectively, if the domains x and y are infomorph. Define now F = {Atx : x ∈ D} and R = {τx,y : x ≤ y}. Then (F, R) is an f.c.f, provided that no two different domains x and y in D are infomorph.

Theorem 7.24 Let (Ψ, D; ≤, ⊥, d, ·,t) be an atomistic information algebra such that no two different domains are infomorph. Then (F, R) is a family of compatible frames (f.c.f).

Proof. By definition we have for x ≤ y ≤ z,

τy,z ◦ τx,y(α) = At(tz(At(ty(α))))

= {γ : γ ∈ At(tz(β)) for some β ∈ At(ty(α))}.

This means γ ≥ tz(β) and β ≥ ty(α), hence γ ≥ tz(ty(α)). Since y⊥z|y and x ≤ y, we have also x⊥z|y and therefore tz(ty(α)) = tz(α). Therefore, γ ≥ tz(α), that is, γ ∈ At(tz(α)). So we conclude that τy,z ◦ τx,y(α) = At(tz(α)) = τx,z(α). Therefore, τy,z ◦ τx,y = τx,z ∈ R. This shows that composition of refinings holds.

For all α ∈ Atx we have tx(α) = α. So τx,x is the identity map, τx,x(α) = {α}. This is the identity condition. The identity of refining is obvious. Clearly Atx∨y is the minimal common refinement of Atx and Aty, and τx,x∨y and τy,x∨y are the corresponding refinings; for all atoms α ∈ Atx∨y we have τx,x∨y(β) ∩ τy,x∨y(γ) = {α} if β = tx(α) and γ = ty(α).

So far, the assumption that no two different domains are infomorph has not been used. This assumption means that if (7.36) holds for a pair of domains x, y ∈ D, then x = y. Consider then domains x, y, z ∈ D with x, y ≤ z, such that for each atom α in x there is an atom β in domain y such that τx,z(α) = τy,z(β).


This means that At(tz(α)) = At(tz(β)), or that tz(α) = tz(β), since the algebra is assumed to be atomistic. But then tx∨y(α) = tx∨y(tz(α)) = tx∨y(tz(β)) = tx∨y(β). So, we have α ≡σ β. In the same way we find for each atom β in domain y an atom α in domain x such that α ≡σ β. But then the assumption implies x = y, hence Atx = Aty. So Identity of Coarsenings holds in (F, R). ⊓⊔

In this f.c.f (F, R) we may of course define the relation of conditional independence between frames as in Section 2.3. Provided the conditional independence relation defines a q-separoid, we may, as in Section 3.1, define the generalised information algebra (Φ, F; ≤, ⊥, d, ·,t) of subsets of atoms if the original algebra (Ψ, D; ≤, ⊥, d, ·,t) is atomistic. We shall see below that the original algebra is in fact embedded into this algebra of subsets of atoms. But before we turn to this issue, we remark that the conditional independence relation in the f.c.f (F, R) is closely related to the relation x⊥y|z in the q-separoid (D; ≤, ⊥). First of all, we have that x ≤ y in D implies Atx ≤ Aty in (F; ≤, ⊥) (we make, as usual, no distinction in the notation of relations in the two structures (D; ≤, ⊥) and (F; ≤, ⊥)). Further, we have Atx ∨ Aty = Atx∨y. So, as partial orders, (D; ≤) is homomorphic to (F; ≤). In fact, there is more, as the following theorem shows.

Theorem 7.25 Let (Ψ, D; ≤, ⊥, ·,t) be an atomic information algebra. Then x⊥y|z implies Atx⊥Aty|Atz.

Proof. We start by recalling the definition of the relation Atx⊥Aty|Atz in (F, R). For α ∈ Atx, β ∈ Aty and γ ∈ Atz we have (Section 2.3),

Rγ(Atx, Aty)= {(α, β) : (α, β, γ) ∈ R(Atx, Aty, Atz)}.

Further, (α, β, γ) ∈ R(Atx, Aty, Atz) means that

τx,x∨y∨z(α) ∩ τy,x∨y∨z(β) ∩ τz,x∨y∨z(γ) ≠ ∅. (7.37)

Similarly,

Rγ (Atx) = {α : (α, γ) ∈ R(Atx, Atz)},

Rγ(Aty) = {β : (β, γ) ∈ R(Aty, Atz)}.

Here we have (α, γ) ∈ R(Atx, Atz) and (β, γ) ∈ R(Aty, Atz) if

τx,x∨z(α) ∩ τz,x∨z(γ) ≠ ∅ and τy,y∨z(β) ∩ τz,y∨z(γ) ≠ ∅. (7.38)

Finally, Atx⊥Aty|Atz means that

Rγ(Atx, Aty) = Rγ(Atx) × Rγ(Aty) (7.39)

for all γ ∈ Atz.

Remember that we always have Rγ(Atx, Aty) ⊆ Rγ(Atx) × Rγ(Aty). Assume then (7.38). Recall that x⊥y|z implies x∨z⊥y∨z|z. Since we assume τx,x∨z(α) ∩ τz,x∨z(γ) ≠ ∅, there is an atom α′ in the intersection At(tx∨z(α)) ∩ At(tx∨z(γ)). Then tx∨z(α), tx∨z(γ) ≤ α′, so that α = tx(tx∨z(α)) ≤ tx(α′). Since tx(α′) is an atom in x, we conclude that α = tx(α′). In the same way we obtain also γ = tz(α′). Similarly, select an atom β′ in At(ty∨z(β)) ∩ At(ty∨z(γ)) and conclude in the same way that β = ty(β′) and γ = tz(β′).

Since x∨z⊥y∨z|z we obtain tz(α′ · β′) = tz(α′) · tz(β′) = γ. Since γ ≠ 0z we infer that α′ · β′ ≠ 0x∨y∨z. Therefore there is an atom γ′ ∈ At(α′ · β′), that is, γ′ ≥ α′ · β′. We have

α′ · β′ = tx∨y∨z(α′) · tx∨y∨z(β′) ≥ tx∨y∨z(α′) ≥ tx∨y∨z(tx∨z(α)) = tx∨y∨z(α).

So, γ′ is an atom in At(tx∨y∨z(α)). In the same way we obtain that γ′ ≥ tx∨y∨z(β), so that γ′ is also an element of At(tx∨y∨z(β)). Finally, γ′ ≥ α′ · β′ implies γ′ ≥ tz(α′ · β′) = tz(α′) · tz(β′) = γ. From this we conclude also that γ′ ≥ tx∨y∨z(γ), hence γ′ belongs also to At(tx∨y∨z(γ)). But this shows that (7.37) holds. And this concludes the proof. ⊓⊔

Remark that in the proof of this theorem, no use is made of the assumption that (F, R) is an f.c.f, that is, that no two different domains are infomorph. But we now make this assumption again, so that (F, R) becomes an f.c.f and the algebra of subsets of atoms (Φ, F; ≤, ⊥, d, ·,t) a generalised idempotent information algebra, and turn to the question how the original atomistic algebra (Ψ, D; ≤, ⊥, d, ·,t) is related to this algebra. We have introduced above ΨAt(Ψ) as the set of pairs (At(ψ), Atx) with d(ψ) = x. They belong to the set Φ. So, the map ψ ↦ (At(ψ), Atx) maps Ψ into Φ. In summary, we have the following maps between (Ψ, D; ≤, ⊥, d, ·,t) and (Φ, F; ≤, ⊥, d, ·,t):

ψ ∈ Ψ ↦ (At(ψ), Atx) ∈ Φ if d(ψ) = x,

x ∈ D ↦ Atx ∈ F,

tx : Ψ → Ψ ↦ tAtx : Φ → Φ.

If (Ψ, D; ≤, ⊥, d, ·,t) is atomistic and no two different domains are infomorph, then these maps satisfy the homomorphism conditions (7.34); further, the map x ↦ Atx maintains order, and x⊥y|z implies Atx⊥Aty|Atz (Theorem 7.25), so it is a q-separoid homomorphism (Dawid, 2001). Further, At(φ) = At(ψ) implies φ = ∧At(φ) = ∧At(ψ) = ψ, and Atx = Aty implies x = y. So these maps are one-to-one. Therefore, we may speak of an embedding of the atomistic information algebra (Ψ, D; ≤, ⊥, d, ·,t) into the generalised information algebra (Φ, F; ≤, ⊥, d, ·,t) of sets of atoms. In other words, an atomistic information algebra may be represented by the information algebra of the sets of its atoms. Each piece of information ψ is represented by the set of its atoms At(ψ). We may consider the atoms in At(ψ) the set of possible answers compatible with the piece of information ψ. In this way any piece of information is represented by its set of possible answers. A more complete study of the representation of idempotent valuation algebras by set algebras is given in (Kohlas & Schmid, 2016).

Part III

Constructing New Algebras


Chapter 8

Information Maps

8.1 Continuous Maps

There are many ways to construct new information or valuation algebras from old ones. For instance, maps from any set into a generalised information algebra or a valuation algebra form again an information or valuation algebra under point-wise combination and extraction. In this section, however, we consider order-preserving maps between idempotent domain-free valuation algebras and show that these structures form Cartesian closed categories. It turns out that only order-preserving maps between idempotent valuation algebras, rather than idempotent generalised information algebras, lead to a satisfactory theory of Cartesian closed categories. Consider two idempotent, domain-free information algebras (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦), each satisfying axioms D0 to D5 of Section 5.2. A map f : Ψ1 → Ψ2 is order-preserving if φ1 ≤ ψ1 in Ψ1 implies f(φ1) ≤ f(ψ1) in Ψ2; a more informative piece of information is mapped to a more informative piece of information. For the maps to be considered here, we require more: the null element in Ψ1, and only the null element, should map to the null element in Ψ2; the map f can neither eliminate nor create contradiction. This leads us to the following definition:

Definition 8.1 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two domain-free valuation algebras satisfying axioms D0 to D5 (Section 5.2), then an order-preserving map f : Ψ1 → Ψ2 such that f(ψ) = 0 if and only if ψ = 0 is called an information map. If furthermore f(1) = 1, the information map is called strict.

In this definition, as well as in the sequel, it should be clear that the symbols 0 and 1 denote the null and unit elements both in Ψ1 and Ψ2, according to the context; we do not differentiate between them by notation. The same holds for operation and relation symbols: it will always be clear from the context whether the operation or relation is in Ψ1 or Ψ2.


Denote the set of all information maps between Ψ1 and Ψ2 by [Ψ1 → Ψ2]. We define the following operations for information maps f, g ∈ [Ψ1 → Ψ2] and extraction operators ǫ1 ∈ E1 and ǫ2 ∈ E2:

1. Combination: f · g defined by (f · g)(ψ) = f(ψ) · g(ψ) for all ψ ∈ Ψ1,

2. Extraction: (ǫ1, ǫ2)(f) defined by (ǫ1, ǫ2)(f)(ψ) = ǫ2(f(ǫ1(ψ))) for all ψ ∈ Ψ1.
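As a toy illustration of these two operations (our own sketch: elements are modelled as frozensets of variable–value pairs with combination = union; null elements are ignored for brevity):

    def combine_maps(f, g):
        # (f . g)(psi) = f(psi) . g(psi): pointwise combination
        return lambda psi: f(psi) | g(psi)

    def extract_map(eps1, eps2, f):
        # ((eps1, eps2)(f))(psi) = eps2(f(eps1(psi)))
        return lambda psi: eps2(f(eps1(psi)))

    def eps_on(xs):
        # extraction keeping only the pairs over the variables in xs
        return lambda psi: frozenset(a for a in psi if a[0] in xs)

    f = lambda psi: psi | {('u', 1)}     # order-preserving toy maps
    g = lambda psi: psi | {('v', 2)}
    h = extract_map(eps_on({'u'}), eps_on({'u', 'v'}), combine_maps(f, g))
    print(sorted(h(frozenset({('u', 0), ('w', 3)}))))
    # -> [('u', 0), ('u', 1), ('v', 2)]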

It is obvious that f · g, (ǫ1, ǫ2)(f) ∈ [Ψ1 → Ψ2], so [Ψ1 → Ψ2] is closed under both combination and extraction. The Cartesian product E1 × E2 is a semigroup under coordinate-wise composition, (ǫ1, ǫ2) ◦ (η1, η2) = (ǫ1 ◦ η1, ǫ2 ◦ η2). In fact, we show that these operations define an idempotent domain-free valuation algebra of information maps.

Theorem 8.1 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two domain-free valuation algebras satisfying axioms D0, D1 and D3, D4, D5 (Section 5.2), then ([Ψ1 → Ψ2], E1 × E2; ·, ◦) satisfies the same axioms.

Proof. To verify axiom D0, consider the composition of (ǫ1,ǫ2) and (η1, η2),

((ǫ1,ǫ2) ◦ (η1, η2))(f)(ψ)

= (ǫ1,ǫ2)((η1, η2)(f))(ψ)= ǫ2(η2(f(η1(ǫ1(ψ))))).

Commutativity and idempotency of composition in E1 × E2 then follow from commutativity and idempotency of composition in E1 and E2. Therefore, axiom D0 holds. It is clear that information maps form an idempotent, commutative semigroup under combination. The unit and null elements in [Ψ1 → Ψ2] are the maps 1 and 0 defined by 1(ψ) = 1 and 0(ψ) = 0 for all ψ ∈ Ψ1. So axiom D1 is satisfied. Further, we have (ǫ1, ǫ2)(1)(ψ) = ǫ2(1(ǫ1(ψ))) = 1 and similarly (ǫ1, ǫ2)(0)(ψ) = ǫ2(0(ǫ1(ψ))) = 0. So, (ǫ1, ǫ2)(1) = 1 and (ǫ1, ǫ2)(0) = 0. Further, if (ǫ1, ǫ2)(f)(ψ) = ǫ2(f(ǫ1(ψ))) = 0, then by axiom D3 in Ψ1 and Ψ2 we have f(ǫ1(ψ)) = 0, hence ǫ1(ψ) = 0, therefore ψ = 0. So, (ǫ1, ǫ2)(f) = 0 implies f = 0. This shows that axiom D3 is satisfied for information maps. Then, using axiom D4 in Ψ2, we have for any ψ ∈ Ψ1,

(ǫ1,ǫ2)((ǫ1,ǫ2)(f) · g)(ψ)

= ǫ2(((ǫ1, ǫ2)(f) · g)(ǫ1(ψ))) = ǫ2((ǫ1, ǫ2)(f)(ǫ1(ψ)) · g(ǫ1(ψ)))

= ǫ2(ǫ2(f(ǫ1(ǫ1(ψ)))) · g(ǫ1(ψ))) = ǫ2(f(ǫ1(ψ))) · ǫ2(g(ǫ1(ψ)))

= ((ǫ1,ǫ2)(f) · (ǫ1,ǫ2)(g))(ψ).

So, we see that (ǫ1,ǫ2)((ǫ1,ǫ2)(f) · g) = (ǫ1,ǫ2)(f) · (ǫ1,ǫ2)(g) and this is axiom D4. Finally, axiom D5 follows from

(f · (ǫ1, ǫ2)(f))(ψ) = f(ψ) · ǫ2(f(ǫ1(ψ))) = f(ψ),

since ǫ1(ψ) ≤ ψ, hence ǫ2(f(ǫ1(ψ))) ≤ f(ǫ1(ψ)) ≤ f(ψ). ⊓⊔

The Support Axiom D2 is, in general, not satisfied by information maps, even if it is satisfied in Ψ1 and Ψ2. We may trivially always adjoin the identity map id both to E1 and E2; then axiom D2 is trivially satisfied in Ψ1 and Ψ2, even if it was not before, and it is then also satisfied in [Ψ1 → Ψ2]. Note that this axiom is only important for the duality between domain-free and labeled algebras. For valuation algebras the duality only comes into play when the extraction operators form a lattice, so duality is of less interest in this framework, and the Support Axiom is not needed.

Next, we consider continuous valuation algebras. In Section 7.4 we have introduced and discussed continuous generalised information algebras. Continuous valuation algebras are defined analogously, and the same results hold for valuation algebras as for generalised information algebras, see (Kohlas, 2003a; Kohlas & Schmid, 2014). In this context, we consider continuous maps:

Definition 8.2 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two continuous domain-free valuation algebras with bases B1 and B2 respectively, then a map f : Ψ1 → Ψ2 is called continuous, if for all ψ ∈ Ψ1,

f(ψ) = ∨{f(φ) : φ ∈ B1, φ ≪ ψ},

and f(ψ) = 0 if and only if ψ = 0.

Let [Ψ1 → Ψ2]c denote the set of continuous maps between Ψ1 and Ψ2. A continuous map is obviously order-preserving, hence an information map. Continuity of maps is a purely order-theoretic concept and there are several equivalent definitions (Davey & Priestley, 2002):

Lemma 8.1 The following are equivalent:

1. f(ψ) = ∨{f(φ) : φ ∈ B1, φ ≪ ψ} for all ψ ∈ Ψ1,

2. {φ ∈ B2 : φ ≪ f(ψ)} ⊆ {φ ∈ Ψ2 : φ ≤ f(χ), χ ≪ ψ for some χ ∈ B1} for all ψ ∈ Ψ1,

3. if X ⊆ Ψ1 is directed, then

f(∨X) = ∨{f(ψ) : ψ ∈ X}.

Proof. (1) ⇒ (2): Consider an element φ ∈ B2 such that φ ≪ f(ψ). Then we have by (1)

φ ≪ f(ψ) = ∨{f(χ) : χ ∈ B1, χ ≪ ψ}.

The set {f(χ) : χ ∈ B1, χ ≪ ψ} is directed. Therefore, there is an element χ ∈ B1 with χ ≪ ψ and such that φ ≤ f(χ). So (2) holds.
(2) ⇒ (3): Consider a directed subset X of Ψ1 and define ψ = ∨X. If φ ∈ B2 such that φ ≪ f(ψ), then there exists by (2) an element χ ∈ B1 such that χ ≪ ψ and φ ≤ f(χ). There is then further an element η ∈ X such that χ ≤ η. Hence we conclude that φ ≤ f(χ) ≤ f(η) ≤ ∨f(X). So, by continuity in Ψ2, we have

f(∨X) = ∨{φ ∈ B2 : φ ≪ f(∨X)} ≤ ∨f(X).

Obviously, f(∨X) ≥ ∨f(X), so that f(∨X) = ∨f(X), hence (3) holds.
(3) ⇒ (1): By continuity in Ψ1, we have ψ = ∨{φ ∈ B1 : φ ≪ ψ}; the set {φ ∈ B1 : φ ≪ ψ} is directed, and therefore (1) follows from (3). ⊓⊔

As a corollary, it follows from Theorem 7.19 that the extraction operators ǫ of a continuous idempotent valuation algebra (Ψ, E; ·, ◦) are continuous maps, hence belong to [Ψ → Ψ]c. We proceed to show that combination and extraction of continuous maps produce continuous maps. This implies that ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is again an idempotent continuous valuation algebra, in fact, a subalgebra of ([Ψ1 → Ψ2], E1 × E2; ·, ◦).

Theorem 8.2 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two continuous domain-free valuation algebras, f, g ∈ [Ψ1 → Ψ2]c and (ǫ1,ǫ2) ∈ E1 × E2, then f · g, (ǫ1,ǫ2)(f) ∈ [Ψ1 → Ψ2]c.

Proof. The proof is straightforward using item 3 of Lemma 8.1 and continuity of the extraction operators in Ψ1 and Ψ2. So, let X be a directed subset of Ψ1; then

(f · g)(∨X) = f(∨X) ∨ g(∨X) = (∨{f(ψ) : ψ ∈ X}) ∨ (∨{g(ψ) : ψ ∈ X})
= ∨{f(ψ) ∨ g(ψ) : ψ ∈ X} = ∨{(f · g)(ψ) : ψ ∈ X}.

This shows that f · g is continuous. In a similar way, since both ǫ1(X) and f(ǫ1(X)) are directed sets,

(ǫ1,ǫ2)(f)(∨X) = ǫ2(f(ǫ1(∨X))) = ǫ2(f(∨{ǫ1(ψ) : ψ ∈ X}))
= ∨{ǫ2(f(ǫ1(ψ))) : ψ ∈ X} = ∨{(ǫ1,ǫ2)(f)(ψ) : ψ ∈ X}.

This shows that (ǫ1,ǫ2)(f) is a continuous map. ⊓⊔

It is well known from order theory that ([Ψ1 → Ψ2]c; ≤) is a continuous lattice. Then we can use Theorem 7.19 to show that ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is a continuous valuation algebra. Before we do that (Theorem 8.4 below), we want to be more explicit and introduce a natural basis in [Ψ1 → Ψ2]c. Let Y be a finite subset of the basis B1 of Ψ1. We call a function s : Y → B2 a simple function. Let S be the set of simple functions. For any simple function s let Y(s) denote the domain of s and define for all ψ ∈ Ψ1

ŝ(ψ) = ∨{s(χ) : χ ∈ Y(s), χ ≪ ψ},

where ŝ(ψ) = 1 if there is no χ ∈ Y(s) such that χ ≪ ψ. Then ŝ maps Ψ1 into B2, and the range of ŝ is finite. We claim that B = {ŝ : s ∈ S} is a basis in ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦). We show first that any map ŝ is continuous.

Lemma 8.2 For every s ∈ S, the map ŝ is continuous.

Proof. Obviously, ŝ is order-preserving. Consider a directed set X in Ψ1 and an element ψ ∈ X. Then ŝ(ψ) ≤ ŝ(∨X), hence ∨{ŝ(ψ) : ψ ∈ X} ≤ ŝ(∨X). Conversely, consider an element ψ ∈ Y(s) such that ψ ≪ ∨X. Then there exists a φ ∈ X such that ψ ≤ φ, and s(ψ) ≤ ŝ(ψ) ≤ ŝ(φ). Thus, we see that

ŝ(∨X) = ∨{s(ψ) : ψ ∈ Y(s), ψ ≪ ∨X} ≤ ∨{ŝ(φ) : φ ∈ X}.

So we conclude that ŝ(∨X) = ∨{ŝ(φ) : φ ∈ X}, which shows that ŝ is continuous. ⊓⊔

Next, we verify that B is closed under combination. In fact, we have ŝ1 · ŝ2 = ŝ, where

s(ψ) =
  s1(ψ) if ψ ∈ Y(s1) − Y(s2),
  s2(ψ) if ψ ∈ Y(s2) − Y(s1),
  s1(ψ) ∨ s2(ψ) if ψ ∈ Y(s1) ∩ Y(s2).

Finally we show that the maps ŝ ≪ f approximate the continuous map f. For this purpose we need the following lemma:

Lemma 8.3 If ŝ(ψ) ≪ f(ψ) for all ψ ∈ Ψ1, then ŝ ≪ f.

Proof. Consider a directed set G of continuous functions in [Ψ1 → Ψ2]c such that f ≤ ∨G. Then we have

f(ψ) ≤ ∨{g(ψ) : g ∈ G}

for all ψ ∈ Ψ1. Therefore, if ŝ(ψ) ≪ f(ψ), there is a map gψ ∈ G such that ŝ(ψ) ≤ gψ(ψ). But ŝ takes on only finitely many different values χi. So, for each value χi = ŝ(ψ) we may select a map gi = gψ such that χi ≤ gi(ψ). Since G is directed, there is a map g ∈ G such that gi ≤ g for all i, hence ŝ(ψ) ≤ g(ψ) for all ψ ∈ Ψ1. But this means that ŝ ≤ g, and this proves that ŝ ≪ f. ⊓⊔

Now, we are ready to show that ([Ψ1 → Ψ2]c; ≤) is a continuous lattice.

Theorem 8.3 For every f ∈ [Ψ1 → Ψ2]c we have

f = ∨{ŝ : ŝ ≪ f}. (8.1)

Proof. We show f(ψ) = ∨{ŝ(ψ) : ŝ ≪ f} for all ψ ∈ Ψ1; this then proves (8.1). So fix a ψ ∈ Ψ1. By continuity in Ψ2 we have

f(ψ) = ∨{χ ∈ B2 : χ ≪ f(ψ)} = ∨{ŝ(ψ) : ŝ(ψ) ≪ f(ψ)},

since for all χ ∈ B2 there is a function ŝ such that ŝ(ψ) = χ. In fact, by continuity in Ψ1 we have ψ = ∨{η ∈ B1 : η ≪ ψ}. Select an η ≪ ψ and define for Y = {η} the simple function s(η) = χ. Then clearly ŝ(ψ) = χ. Now, using Lemma 8.3 we conclude that

{ŝ(ψ) : ŝ such that ŝ(ψ) ≪ f(ψ)} = {ŝ(ψ) : ŝ(χ) ≪ f(χ) for all χ ∈ Ψ1} ⊆ {ŝ(ψ) : ŝ ≪ f}.

So, we have

f(ψ) ≤ ∨{ŝ(ψ) : ŝ ≪ f} ≤ f(ψ),

since ŝ ≪ f implies ŝ ≤ f, hence ŝ(ψ) ≤ f(ψ). Therefore we obtain f(ψ) = ∨{ŝ(ψ) : ŝ ≪ f} for all ψ ∈ Ψ1, hence (8.1). ⊓⊔

Since in ([Ψ1 → Ψ2]c; ≤) the order is point-wise, it follows that it is a complete lattice, hence a continuous lattice. This permits us to conclude that continuous maps form a continuous valuation algebra.

Theorem 8.4 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two continuous domain-free valuation algebras, then ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is a continuous domain-free valuation algebra.

Proof. We know already that ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is an idempotent valuation algebra and ([Ψ1 → Ψ2]c; ≤) is a continuous lattice. Therefore, by Theorem 7.19 (which applies to idempotent valuation algebras as well as to generalised idempotent information algebras) we need only prove that

(ǫ1,ǫ2)(∨G) = ∨{(ǫ1,ǫ2)(g) : g ∈ G} (8.2)

for any directed set G in [Ψ1 → Ψ2]c. This is straightforward using Theorem 7.19 in Ψ2,

(ǫ1,ǫ2)(∨G)(ψ) = ǫ2((∨G)(ǫ1(ψ))) = ǫ2(∨{g(ǫ1(ψ)) : g ∈ G})
= ∨{ǫ2(g(ǫ1(ψ))) : g ∈ G} = ∨{(ǫ1,ǫ2)(g)(ψ) : g ∈ G}

for all ψ ∈ Ψ1. This proves (8.2), hence the theorem. ⊓⊔

The case where (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are algebraic has been discussed in (Kohlas, 2003a), although only in the multivariate setting. It has been shown there that in this case ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is also an algebraic valuation algebra with the maps ŝ as finite elements.

8.2 Cartesian Closed Categories

We consider the category of idempotent, domain-free valuation algebras IA, and the categories of algebraic and of continuous valuation algebras ALGIA and CONTIA, and we are going to show that these categories are all Cartesian closed. We assume throughout in the sequel that the valuation algebras satisfy axioms D0, D1 and D3, D4, D5, whereas the Support Axiom D2 is not required.

1. The category IA has as objects idempotent, domain-free valuation algebras and as morphisms information maps Φ → Ψ.

2. The category of continuous valuation algebras CONTIA has as objects continuous valuation algebras and as morphisms continuous maps Φ → Ψ.

3. The category of algebraic valuation algebras ALGIA has as objects algebraic valuation algebras and as morphisms continuous maps Φ → Ψ.

The category ALGIA is a subcategory of CONTIA, which itself is a subcategory of IA. We are going to show that all these categories are Cartesian closed. As a reminder: A category C is Cartesian closed, if it satisfies the following three conditions:

1. The category C has a terminal object: There is an object T ∈ C such that there is exactly one morphism from any object to T.

2. The category C has finite products: For any pair of objects A, B ∈ C, there is an object A × B and morphisms pA : A × B → A and pB : A × B → B, such that for any object C and any pair of morphisms f1 : C → A and f2 : C → B there is a morphism f : C → A × B so that pA ◦ f = f1 and pB ◦ f = f2.

3. The category C has exponentials: For any pair of objects B, C ∈ C, there is an object C^B and a morphism eval : C^B × B → C such that for every morphism f : A × B → C there is a unique morphism λf : A → C^B so that eval ◦ (λf, idB) = f.

We are going to show that these elements exist for our three categories IA, CONTIA and ALGIA. The terminal object in all three cases is simply the valuation algebra ({1}, {id}; ·, ◦). The finite product is the Cartesian product of valuation algebras.

Theorem 8.5 The Cartesian product (Ψ1 × Ψ2, E1 × E2; ·, ◦) of two idempotent (continuous, algebraic) valuation algebras (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦), under component-wise join and meet respectively and also component-wise information extraction, is the categorical direct product of the two valuation algebras in IA (CONTIA, ALGIA, respectively).

Proof. We show first that (Ψ1 × Ψ2, E1 × E2; ·, ◦) is an idempotent valuation algebra. Combination in Ψ1 × Ψ2 is defined component-wise and it is obvious that (Ψ1 × Ψ2; ·) is then an idempotent commutative semigroup with null element (0, 0) and unit (1, 1), hence a bounded join-semilattice under the induced information order. In E1 × E2 meet is defined component-wise and in this way (E1 × E2, ≤) becomes a meet-semilattice. For any pair (ǫ1,ǫ2) in E1 × E2, an operator

(ǫ1,ǫ2)(ψ1, ψ2) = (ǫ1(ψ1), ǫ2(ψ2))

is defined. It is obvious that axioms D2 to D5 of an idempotent valuation algebra carry over to the Cartesian product (Ψ1 × Ψ2, E1 × E2; ·, ◦), which is therefore an idempotent information algebra. We define the projections pi by pi(ψ1, ψ2) = ψi for i = 1, 2. These projections are clearly information maps. Consider then an idempotent information algebra (Ψ, E; ·, ◦) and two information maps fi : Ψ → Ψi for i = 1, 2. Define f : Ψ → Ψ1 × Ψ2 by f(ψ) = (f1(ψ), f2(ψ)). Again, f is an information map. Then fi = pi ◦ f for i = 1, 2. Thus, the product algebra (Ψ1 × Ψ2, E1 × E2; ·, ◦) is the direct product of (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) in IA.

Next, we show that the Cartesian product of two continuous valuation algebras is continuous. Let then B1 and B2 be bases in Ψ1 and Ψ2 respectively. Obviously B1 × B2 is closed under join and contains the bottom element (1_1, 1_2), where 1_1 and 1_2 are the bottom elements of Ψ1 and Ψ2 respectively. We claim that B1 × B2 is a basis of Ψ1 × Ψ2. Let X ⊆ B1 × B2 be a directed set and define X1 = {ψ1 ∈ B1 : ∃ψ2 ∈ B2 so that (ψ1, ψ2) ∈ X}; X2 is defined similarly as the set of elements in B2 obtained from X. Both X1 and X2 are clearly directed. Then (∨X1, ∨X2) is an upper bound of X, and it is obviously its supremum. So ∨X = (∨X1, ∨X2) exists in Ψ1 × Ψ2. This is the convergence property.

We have (ψ1′, ψ2′) ≪ (ψ1, ψ2) if and only if ψ1′ ≪ ψ1 and ψ2′ ≪ ψ2, the ≪-relation taken in Ψ1 × Ψ2, Ψ1 and Ψ2 respectively. Consider (ψ1, ψ2) ∈ Ψ1 × Ψ2. Then

∨{(ψ1′, ψ2′) ∈ B1 × B2 : (ψ1′, ψ2′) ≪ (ψ1, ψ2)}
= (∨{ψ1′ ∈ B1 : ψ1′ ≪ ψ1}, ∨{ψ2′ ∈ B2 : ψ2′ ≪ ψ2})
= (ψ1, ψ2).

This shows that Ψ1 × Ψ2 is a continuous lattice. If (ǫ1,ǫ2) ∈ E1 × E2, then we obtain in the same way

∨{(ψ1′, ψ2′) ∈ B1 × B2 : (ψ1′, ψ2′) = (ǫ1,ǫ2)(ψ1′, ψ2′) ≪ (ǫ1,ǫ2)(ψ1, ψ2)}
= (∨{ψ1′ ∈ B1 : ψ1′ = ǫ1(ψ1′) ≪ ǫ1(ψ1)}, ∨{ψ2′ ∈ B2 : ψ2′ = ǫ2(ψ2′) ≪ ǫ2(ψ2)})
= (ǫ1(ψ1), ǫ2(ψ2)) = (ǫ1,ǫ2)(ψ1, ψ2).

So strong density holds too. This proves that (Ψ1 × Ψ2, E1 × E2; ·, ◦) is a continuous information algebra. The projections p1 and p2 are evidently continuous maps. Let then finally (Ψ, E; ·, ◦) be a continuous information algebra and f1 : Ψ → Ψ1 and f2 : Ψ → Ψ2 be continuous maps. Then we define f = (f1, f2) as a map from Ψ to Ψ1 × Ψ2. It is continuous, since its components f1 and f2 are so. Then clearly p1 ◦ f = f1 and p2 ◦ f = f2. It follows that (Ψ1 × Ψ2, E1 × E2; ·, ◦) is the direct product in CONTIA.

If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are algebraic valuation algebras, then (Ψ1 × Ψ2, E1 × E2; ·, ◦) is algebraic, and its finite elements are given by the Cartesian product of the finite elements of each factor, since (ψ1′, ψ2′) ≪ (ψ1, ψ2) exactly if ψ1′ ≪ ψ1 and ψ2′ ≪ ψ2. So (Ψ1 × Ψ2, E1 × E2; ·, ◦) is the direct product in ALGIA. This completes the proof. ⊓⊔

Next we show that the valuation algebras of monotone or continuous maps are the exponentials of the respective categories of idempotent, continuous or algebraic valuation algebras.

Theorem 8.6 If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two objects of the category IA, then the idempotent valuation algebra ([Ψ1 → Ψ2], E1 × E2; ·, ◦) is an exponential of IA. If (Ψ1, E1; ·, ◦) and (Ψ2, E2; ·, ◦) are two objects of the categories CONTIA or ALGIA, then ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is an exponential of the respective categories.

Proof. We only treat the case of continuous valuation algebras; the other cases follow in the same way. We know from Theorem 8.4 that ([Ψ1 → Ψ2]c, E1 × E2; ·, ◦) is a continuous information algebra. We define the morphism eval : [Ψ1 → Ψ2]c × Ψ1 → Ψ2 for f ∈ [Ψ1 → Ψ2]c and ψ ∈ Ψ1 by

eval(f, ψ)= f(ψ).

The map eval is continuous. Consider another continuous valuation algebra (Ψ, E; ·, ◦) and let f : Ψ × Ψ1 → Ψ2 be a continuous map. Then we define a map λf : Ψ → [Ψ1 → Ψ2]c for χ ∈ Ψ and ψ ∈ Ψ1 by

λf(χ)(ψ) = f(χ, ψ).
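In programming terms, eval is function application and λf is currying; the following throwaway Python lines (purely illustrative, not from the original text) mirror the two defining equations:

```python
def ev(f, psi):
    # eval(f, psi) = f(psi): the evaluation morphism.
    return f(psi)

def lam(f):
    # (lambda f)(chi)(psi) = f(chi, psi): currying the two-argument map f.
    return lambda chi: (lambda psi: f(chi, psi))

# eval composed with (lambda f, id) reproduces f, checked on a sample map:
f = lambda chi, psi: (chi, psi)
assert ev(lam(f)(1), 2) == f(1, 2)
```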

The map λf is continuous if f is so. In fact, let X be a directed set in Ψ. Then we have for ψ ∈ Ψ1,

λf(∨X)(ψ) = f(∨X, ψ) = f(∨{(χ, ψ) : χ ∈ X}) = ∨{f(χ, ψ) : χ ∈ X} = ∨{λf(χ)(ψ) : χ ∈ X}.

Thus we see that λf(∨X) = ∨{λf(χ) : χ ∈ X}. Now finally, for (χ, ψ) ∈ Ψ × Ψ1, we obtain that eval ◦ (λf, idΨ1)(χ, ψ) = eval(λf(χ), ψ) = λf(χ)(ψ) = f(χ, ψ). So indeed eval ◦ (λf, idΨ1) = f. The cases of idempotent and of algebraic valuation algebras are treated in exactly the same way. ⊓⊔

This shows that the categories IA, ALGIA and CONTIA are all Cartesian closed.

Chapter 9

Random Maps

9.1 Simple Random Variables

In practice it cannot be excluded that contradictory information is asserted. Then at least one of these assertions must be wrong. This immediately leads to the idea that information may be uncertain, at least in the sense that its assertion may be wrong. For instance, if the source of an information is a witness, an expert or a sensor, there is always the possibility that the witness lies, the expert errs or that the sensor is faulty. More generally, the truth of a piece of information may depend on certain assumptions whose validity is uncertain. Turned the other way round: Assuming certain assumptions out of a set of possible assumptions, certain pieces of information may be asserted. The uncertainty of the information stems from the uncertainty about which assumption is valid. Also, different assumptions may have different likelihoods or probabilities of being valid. Viewed from this angle, uncertain information is represented by a map from a probability space into an (idempotent) generalised information algebra.

Given such a map, for any piece of information in the information algebra, or more generally each consistent system of information in its ideal completion, the assumptions supporting the information considered can be determined: These are all the assumptions whose validity entails the information. The probability of the assumptions supporting a piece of information measures the degree of support of it. Here enters the question of the measurability of the support. To overcome the restrictions imposed by measurability considerations, allocations of probability in the probability algebra associated with the probability space of assumptions can be considered (Kappos, 1969; Shafer, 1973).

Maps representing uncertain information inherit the structure of an information algebra from their range. Uncertain information thus still is information. In many cases, finite uncertain information can be defined in a natural way, which turns these algebras of uncertain information into compact

or algebraic information algebras.

The present concept of uncertain information has its roots in the theory of hints (Kohlas & Monney, 1995), which in turn is based on Dempster's multivalued mappings (Dempster, 1967a). However, whereas Dempster derives probability bounds from these multivalued mappings, the semantics of the theory of hints is in the spirit of assumption-based reasoning as sketched above. Seen from the point of view of information algebra, hints are mappings into a subset-algebra. The theory can also be given a logical flavour. It may for instance be combined with propositional logic (Haenni et al., 2000; Kohlas, 2003a). Since this approach combines logic for the deduction of arguments with probability to evaluate the likelihood of arguments, we speak also of probabilistic argumentation systems. A more abstract presentation of this point of view is given in (Kohlas, 2003b).

Dempster's approach to multivalued mappings was given a more epistemological flavour by Shafer (Shafer, 1976). The primary object in this view is the belief function, which corresponds formally to our degree of support and leads to an allocation of probability as hinted above (Shafer, 1973). Therefore, in the spirit of Shafer, we study allocations of belief and show that they too lead to information algebras (Section 10). In particular, we study how these allocations of probability relate to the mappings representing uncertain information.

We start with simple random variables. Consider a generalised, idempotent, domain-free information algebra (Ψ, D; ≤, ⊥, ·, ǫ). Similarly as in the previous section, we require throughout this section only axioms A0, A1 and A3, A4, A5, A6 for a generalised information algebra and drop the Support Axiom A2. Let Ω be a set whose elements represent different possible assumptions. In applications, Ω often will be a finite set, but we drop this requirement for the sake of generality. In order to introduce probability, we assume (Ω, A, P) to be a probability space with A a σ-algebra of subsets of Ω and P a probability measure on A. Uncertain information will be represented by a map ∆ from Ω to Ψ. The idea is that ∆(ω) ∈ Ψ represents the correct information, provided assumption ω ∈ Ω is valid. In order to simplify, and for considerations of measurability, which will be dropped later, we restrict however in a first step the maps to be considered. Let B = {B1, ..., Bn} be any finite partition of Ω, whose blocks Bi all belong to A. A mapping ∆ : Ω → Ψ, such that ∆(ω) is constant for all ω of a block Bi,

∆(ω) = ψi for all ω ∈ Bi,

is called a simple random variable in Ψ. Denote the family of all simple random variables by Rs. These maps inherit the operations of the information algebra:

1. Combination: Let ∆1 and ∆2 be simple random variables in (Ψ, D; ≤, ⊥, ·, ǫ). Then ∆1 · ∆2 is defined pointwise by

(∆1 · ∆2)(ω) = ∆1(ω) · ∆2(ω),

where the combination on the right is in Ψ.

2. Extraction: Let ∆ be a simple random variable in (Ψ, D; ≤, ⊥, ·, ǫ). Then define ǫx(∆) for x ∈ D by

(ǫx(∆))(ω) = ǫx(∆(ω)),

where the extraction on the right takes place in Ψ.

We have to verify that the maps so defined are still simple random variables. Let B1 and B2 be the finite partitions of Ω associated with ∆1 and ∆2 respectively. Then B = B1 ∧ B2 is defined as the partition of Ω whose blocks are the pairwise intersections of blocks from B1 and B2. Clearly, the map ∆1 · ∆2 is constant on each block of B, hence a simple random variable; a small computational sketch of these operations follows below. If further ∆ is defined relative to a partition B of Ω, then ǫx(∆) is also constant on the blocks of B, hence also a simple random variable. Obviously, (Rs, D; ≤, ⊥, ·, ǫ) becomes a generalised idempotent, domain-free information algebra with these operations. The null element is the simple random variable N defined by N(ω) = 0, the unit element the simple random variable U defined by U(ω) = 1 for all ω ∈ Ω. Furthermore, for every ψ ∈ Ψ the map Dψ(ω) = ψ, for all ω ∈ Ω, is a simple random variable. By the mapping ψ 7→ Dψ the information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is embedded in the information algebra (Rs, D; ≤, ⊥, ·, ǫ). Note that the partial order in Rs is also defined point-wise, such that ∆1 ≤ ∆2 in Rs if and only if ∆1(ω) ≤ ∆2(ω) for all ω ∈ Ω.

There are two important special classes of simple random variables: If for a random variable ∆ defined relative to a partition B = {B1, ..., Bn} it holds that ψi ≠ ψj for i ≠ j, the variable is called canonical. It is a simple matter to transform any random variable ∆ into an associated canonical one: Take the union of all blocks Bi ∈ B with identical values ψi. This yields a new partition B′ of Ω. Define ∆′(ω) = ∆(ω). Then ∆′ is the canonical version of ∆ and we write ∆′ = ∆→. We may consider the set of canonical random variables, Rs,c, and define between elements of this set combination and extraction as follows:

∆1 ·c ∆2 = (∆1 · ∆2)→,
ǫx,c(∆) = (ǫx(∆))→.

Then (Rs,c, D; ≤, ⊥, ·c, ǫc) is still an information algebra under these modified operations. We remark also that (∆1 · ∆2)→ = (∆1→ · ∆2→)→ and (ǫx(∆))→ = (ǫx(∆→))→. In fact, (Rs,c, D; ≤, ⊥, ·c, ǫc) is the quotient algebra of (Rs, D; ≤, ⊥, ·, ǫ) relative to the congruence ∆1 ≡ ∆2 if ∆1→ = ∆2→.
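As a small computational illustration (an assumed example, not from the original text), simple random variables over a finite assumption space can be coded as dictionaries from assumptions to values in a subset algebra; pointwise combination is then automatically constant on the blocks of the meet of the two partitions, and grouping assumptions by value yields the canonical version:

```python
OMEGA = ['w1', 'w2', 'w3']   # finite set of assumptions

# Two simple random variables with values in a subset algebra
# (combination of values = set intersection).
d1 = {'w1': frozenset('ab'), 'w2': frozenset('ab'), 'w3': frozenset('bc')}
d2 = {'w1': frozenset('a'),  'w2': frozenset('bc'), 'w3': frozenset('bc')}

def combine(d1, d2):
    # Pointwise combination; constant on the blocks of the partition meet.
    return {w: d1[w] & d2[w] for w in OMEGA}

def canonical(d):
    # Canonical version: merge all assumptions carrying the same value.
    blocks = {}
    for w, v in d.items():
        blocks.setdefault(v, set()).add(w)
    return blocks   # value -> block of assumptions

d = combine(d1, d2)
print(d)             # pointwise intersections: {'a'}, {'b'}, {'b','c'}
print(canonical(d))  # three distinct values, hence three singleton blocks
```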

Secondly, if ∆(ω) = 0 with probability zero, then ∆ is called normalised. We can associate a normalised simple random variable ∆↓ with any simple random variable ∆, provided ∆(ω) ≠ 0 occurs with positive probability. In fact, let Ω↓ = {ω ∈ Ω : ∆(ω) ≠ 0}. This is a measurable set with probability P(Ω↓) = 1 − P{ω ∈ Ω : ∆(ω) = 0} > 0. We consider then the new probability space (Ω, A, P′), where P′ is the conditional probability measure on A defined by

P′(A) = P(A ∩ Ω↓) / P(Ω↓), (9.1)

if A ∩ Ω↓ ≠ ∅ and P′(A) = 0 otherwise. On this new probability space define ∆↓(ω) = ∆(ω). Clearly, it holds that (∆→)↓ = (∆↓)→.

The idea behind normalisation becomes clear when we consider combination of random variables: Each of two (normalised) random variables ∆1 and ∆2 represents some (uncertain) information with the following interpretation: One of the ω ∈ Ω must be the correct, but unknown, assumption. However, if ω happens to be the correct assumption, then under the first random variable information ∆1(ω) can be asserted, and under the second variable information ∆2(ω). Thus, together, still under the assumption ω, information ∆1(ω) · ∆2(ω) can be asserted. However, it is possible that ∆1(ω) · ∆2(ω) = 0, even if both ∆1 and ∆2 are normalised. But the element 0 represents a contradiction. Thus, in view of the information given by the variables ∆1 and ∆2, the assumption ω cannot hold, since it leads to a contradiction; it can (and must) be excluded. This amounts to normalising the random variable ∆1 · ∆2 by excluding all ω ∈ Ω for which the combination results in a contradiction, and then conditioning (i.e. normalising) the probability on the non-contradictory assumptions. We refer to (Kohlas & Monney, 1995; Haenni et al., 2000) for a discussion and further justification of these issues.

Two partitions B1 and B2 of Ω are called independent, if B1,i ∩ B2,j ≠ ∅ for all blocks B1,i ∈ B1 and B2,j ∈ B2. If furthermore P(B1,i ∩ B2,j) = P(B1,i) · P(B2,j) for all these pairs of blocks, then the two partitions B1 and B2 are called stochastically independent. In addition, if ∆1 and ∆2 are two simple random variables defined on these two partitions respectively, then these random variables are called stochastically independent too. Note that if ∆1 and ∆2 are stochastically independent, then their canonical versions ∆1→ and ∆2→ are stochastically independent too.

We now turn to the study of the probability distribution of simple random variables. The starting point is the following question: Given a simple random variable ∆ in an idempotent, generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ), and an element ψ ∈ Ψ, under what assumptions can the information ψ be asserted to hold? And how likely is it that these assumptions are valid?

If ω ∈ Ω is an assumption such that ∆(ω) ≥ ψ, then ∆(ω) implies ψ. In this case we may say that ω is an assumption supporting ψ, in view of the information conveyed by ∆. Therefore we define for every ψ ∈ Ψ the set

qs∆(ψ) = {ω ∈ Ω : ψ ≤ ∆(ω)}

of assumptions supporting ψ. However, if ∆(ω) = 0, then ω supports every ψ ∈ Ψ, since ψ ≤ 0. The null element 0 represents the contradiction, which implies everything. In a consistent theory, contradictions must be excluded. Thus, we conclude that assumptions such that ∆(ω) = 0 are not really possible assumptions and must be excluded. Let

qs∆(0) = {ω ∈ Ω:∆(ω) = 0}.

We assume that qs∆(0) is not equal to Ω; otherwise ∆ represents fully contradictory "information". In other words, we assume that proper information is never fully contradictory. If we eliminate the contradictory assumptions from qs∆(ψ), we obtain the support set

s∆(ψ) = {ω ∈ Ω : ψ ≤ ∆(ω) ≠ 0} = qs∆(ψ) − qs∆(0)

of ψ, which is the set of assumptions properly supporting ψ, and the mapping s∆ : Ψ → P(Ω) is called the allocation of support induced by ∆. The set qs∆(ψ) is called the quasi-support set to underline that it contains contradictory assumptions. This set has little interest from a semantic point of view, but it is useful for technical and especially for computational purposes. These concepts capture the essence of probabilistic assumption-based reasoning in information algebras as discussed in more detail in (Kohlas & Monney, 1995; Haenni et al., 2000; Kohlas, 2003a) in a less general setting. Here are the basic properties of allocations of support:

Theorem 9.1 If ∆ is a simple random variable on an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ), then the following holds for the associated allocations of support qs∆, s∆:

1. qs∆(1) = Ω, s∆(0) = ∅.

2. If ∆ is normalised, then qs∆ = s∆ and qs∆(0) = ∅.

3. For any pair φ, ψ ∈ Ψ,

qs∆(φ · ψ) = qs∆(φ) ∩ qs∆(ψ),
s∆(φ · ψ) = s∆(φ) ∩ s∆(ψ).

Proof. (1) and (2) follow immediately from the definition of the allocation of support. (3) follows since φ · ψ ≤ ∆(ω) if and only if φ ≤ ∆(ω) and ψ ≤ ∆(ω). ⊓⊔

Knowing assumptions supporting a hypothesis ψ is already interesting and important. It is the part logic can provide. On top of this, it is important to know how likely it is that a supporting assumption is valid. This is the part added by probability. If we know or may assume that the information is consistent, then we should condition the original probability measure P on Ω on the event qs∆(0)^c. This leads to the probability space (qs∆(0)^c, A ∩ qs∆(0)^c, P′), where P′(A) = P(A)/P(qs∆(0)^c). The likelihood of supporting assumptions for ψ ∈ Ψ can then be measured by

sp∆(ψ) = P′(s∆(ψ)).

The value sp∆(ψ) is called the degree of support of ψ associated with the random variable ∆. The function sp∆ : Ψ → [0, 1] is called the support function of ∆. It corresponds to the concept of a distribution function of ordinary random variables. It is for technical reasons convenient to define the degree of quasi-support

qsp∆(ψ)= P (qs∆(ψ)).

Then, the degree of support can also be expressed in terms of degrees of quasi-support

sp∆(ψ) = (qsp∆(ψ) − qsp∆(0)) / (1 − qsp∆(0)).

This is the form which is usually used in applications (Haenni et al., 2000).

In another consideration, we can also ask for assumptions ω ∈ Ω under which ∆ shows ψ to be possible, that is, not excluded, although not necessarily supported. If ∆(ω) is such that combined with ψ it leads to a contradiction, i.e. if ∆(ω) · ψ = 0, then under ω the information ψ is excluded by a consistency consideration as above. So we define the set

p∆(ψ) = {ω ∈ Ω : ∆(ω) · ψ ≠ 0}.

This is the set of assumptions under which ψ is not excluded, hence can be considered as possible. Therefore we call it the possibility set of ψ. Note that p∆(ψ) ⊆ qs∆(0)^c. We can then define the degree of possibility, also sometimes called degree of plausibility (e.g. in (Shafer, 1976)), by

pl∆(ψ) = P′(p∆(ψ)).

If ω ∈ qs∆(0)^c − p∆(ψ), then, under this assumption, ψ is impossible, that is, contradictory with ∆(ω). So the set qs∆(0)^c − p∆(ψ) contains arguments against ψ and

do∆(ψ) = P′(qs∆(0)^c − p∆(ψ)) = 1 − pl∆(ψ)

can be called the degree of doubt in ψ. Note that s∆(ψ) ⊆ p∆(ψ), since ψ ≤ ∆(ω) ≠ 0 implies ψ · ∆(ω) = ∆(ω) ≠ 0. Hence, we see that for all ψ ∈ Ψ we have sp∆(ψ) ≤ pl∆(ψ). These considerations put simple random variables in the realm of the so-called Dempster-Shafer theory (Dempster, 1967b; Shafer, 1976). To underline this further, consider for a simple random variable ∆ with possible values ψ1, ..., ψn the probabilities

m(ψi) = Σ_{j : ψj = ψi} P(Bj).

Note that m(ψi) = P(Bi) if the random variable ∆ is canonical. Remark also that

Σ_{i=1}^n m(ψi) = 1.

Such a finite collection of probabilities m(ψi) summing up to one for i = 1, ..., n is called a basic probability assignment (bpa) in Ψ. Since qs∆(ψ) = ∪_{ψ≤ψi} Bi and p∆(ψ) = ∪_{ψ·ψi≠0} Bi, we see that

qsp∆(ψ) = Σ_{ψ≤ψi} m(ψi), pl∆(ψ) = Σ_{ψ·ψi≠0} m(ψi).

So, the bpa of a simple random variable determines its degrees of support and plausibility. In (Shafer, 1976), support functions are called belief functions. Furthermore, if ∆1 and ∆2 are two stochastically independent simple random variables with possible values ψ1,1, ..., ψ1,n and ψ2,1, ..., ψ2,m, then the possible values of the combined random variable ∆ = ∆1 · ∆2 are ψk, where each ψk is equal to a combination ψ1,i · ψ2,j. Therefore, the bpa of the combined variable ∆ is

m(ψk) = Σ_{ψ1,i·ψ2,j = ψk} m1(ψ1,i) · m2(ψ2,j).

If only normalised random variables are considered, then the combined variable ∆ is to be normalised. Then, if

m(0) = Σ_{ψ1,i·ψ2,j = 0} m1(ψ1,i) · m2(ψ2,j) < 1,

we obtain the normalised bpa of ∆↓ as

m↓(ψk) = (Σ_{ψ1,i·ψ2,j = ψk} m1(ψ1,i) · m2(ψ2,j)) / (1 − m(0)). (9.2)

So, the bpa are also sufficient to compute the bpa of the combination of stochastically independent pieces of uncertain information. This has been proposed in a setting of set algebras in (Dempster, 1967a) and the formula

(9.2) is therefore also called Dempster's rule. (Shafer, 1976) took up Dempster's theory and proposed "A Mathematical Theory of Evidence", where bpa and Dempster's rule play an important role. In both theories the concept of a bpa is central. Although Dempster's and Shafer's interpretations of the theory are not quite the same, one speaks often of the Dempster-Shafer theory. At least the underlying mathematics in both views is identical. We shall argue in this chapter that our present theory is a natural generalisation of Dempster-Shafer theory, which was confined essentially to finite subset algebras and simple random variables (in our terminology). However, a bpa can no longer play the same basic role relative to generalised information algebras as in classical Dempster-Shafer theory, since a bpa works only for simple random variables, but not for more general uncertain information. Also, the full flavour of the duality relation between support and plausibility is deployed only in the case of Boolean information algebras (Section 11.5).
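For a finite subset algebra, Dempster's rule (9.2) and the bpa expressions for the degrees of support and plausibility are directly computable. The following Python sketch is an assumed illustration, not part of the original text (focal sets as frozensets, combination as intersection, the empty set as the null element 0, so that ψ ≤ φ in the information order means φ ⊆ ψ):

```python
from collections import defaultdict

def dempster(m1, m2):
    # Combination of two bpas by (9.2): multiply masses of independent focal
    # sets, intersect the sets, strip the mass m(0) on the contradiction and
    # renormalise by 1 - m(0).
    m = defaultdict(float)
    for a, p in m1.items():
        for b, q in m2.items():
            m[a & b] += p * q
    conflict = m.pop(frozenset(), 0.0)
    assert conflict < 1.0, "fully contradictory information"
    return {a: p / (1.0 - conflict) for a, p in m.items()}

def support(m, psi):
    # Degree of support: total mass of focal sets implying psi (subsets of psi).
    return sum(p for a, p in m.items() if a <= psi)

def plausibility(m, psi):
    # Degree of plausibility: total mass of focal sets compatible with psi.
    return sum(p for a, p in m.items() if a & psi)

m1 = {frozenset('a'): 0.6, frozenset('ab'): 0.4}
m2 = {frozenset('b'): 0.5, frozenset('ab'): 0.5}
m = dempster(m1, m2)                      # conflict 0.3, renormalised by 0.7
print(support(m, frozenset('a')))         # 3/7
print(plausibility(m, frozenset('a')))    # 5/7
```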

9.2 Random Mappings

When we want to go beyond simple random variables, there are several ways to do this. The most radical one is to consider any mapping Γ : Ω → Ψ from a probability space (Ω, A, P) into an idempotent, generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ), or maybe even its ideal completion. Let's call such maps random mappings. As before, in the case of simple random variables, we may define the operations of combination and extraction between random mappings point-wise in (Ψ, D; ≤, ⊥, ·, ǫ):

1. Combination: Let Γ1 and Γ2 be two random mappings into (Ψ, D; ≤, ⊥, ·, ǫ), then Γ1 · Γ2 is the random mapping into (Ψ, D; ≤, ⊥, ·, ǫ) defined by

(Γ1 · Γ2)(ω)=Γ1(ω) · Γ2(ω). (9.3)

2. Extraction: Let Γ be a random mapping into (Ψ, D; ≤, ⊥, ·,ǫ) and x ∈ D, then ǫx(Γ) is the random mapping into (Ψ, D; ≤, ⊥, ·,ǫ) defined by

ǫx(Γ)(ω)= ǫx(Γ(ω)). (9.4)

For a fixed probability space (Ω, A, P), let RΨ denote the set of all random mappings into (Ψ, D; ≤, ⊥, ·, ǫ). With the two operations defined above, (RΨ, D; ≤, ⊥, ·, ǫ) becomes an idempotent generalised information algebra (excluding the Support Axiom). The mapping U(ω) = 1 for all ω ∈ Ω is the neutral element of combination; the map N(ω) = 0 is the top element. It is obvious that Γ′ ≤ Γ if and only if Γ′(ω) ≤ Γ(ω) for all ω ∈ Ω.

Consider the ideal completion (I_RΨ, D; ≤, ⊥, ·, ǫ) of the information algebra (RΨ, D; ≤, ⊥, ·, ǫ) of random mappings. Any element Γ ∈ I_RΨ may be represented as

Γ = ∨{∆ : ∆ ∈ RΨ, ∆ ≤ Γ}.

Obviously, we also have in the ideal completion IΨ of Ψ,

Γ(ω) = ∨{∆(ω) : ∆ ∈ RΨ, ∆ ≤ Γ} = ∨{ψ : ψ ∈ Ψ, ψ ≤ Γ(ω)}

for all ω ∈ Ω. Therefore, Γ ∈ I_RΨ is also a random mapping into the ideal completion IΨ of the information algebra Ψ. And any random mapping into IΨ belongs to I_RΨ. This shows that (R_IΨ, D; ≤, ⊥, ·, ǫ) is identical to the ideal completion (I_RΨ, D; ≤, ⊥, ·, ǫ) of the algebra (RΨ, D; ≤, ⊥, ·, ǫ). As in the case of simple random variables we may define the allocation of support sΓ of a random mapping by

sΓ(ψ)= {ω ∈ Ω : ψ ≤ Γ(ω)}. (9.5)

We do not distinguish here any more between the semantic categories of support and quasi-support, as above for simple random variables, and speak simply of support, even though (9.5) is strictly speaking a quasi-support. This support, as defined in (9.5), has the same properties as the support of simple random variables; in particular, as in Theorem 9.1, sΓ(1) = Ω and sΓ(φ · ψ) = sΓ(φ) ∩ sΓ(ψ). Again, as before with simple random variables, we may try to define the degree of support induced by a random mapping Γ of a piece of information ψ by

spΓ(ψ)= P (sΓ(ψ)). (9.6)

This probability is however only defined if sΓ(ψ) ∈ A. There is no guarantee that this holds in general. The only element which we know for sure to be measurable is sΓ(1) = Ω. A simple way out of this problem would be to restrict random mappings to mappings Γ for which sΓ(ψ) ∈ A for all ψ ∈ Ψ, or even for all elements of the ideal completion IΨ. However, there is a priori no reason why we should restrict ourselves exactly to those mappings. Therefore we prefer other, more rational approaches to overcome the difficulty of an only partially defined degree of support. Here we propose a first solution. Later we present some alternatives.

(Shafer, 1979) advocates the use of probability algebras instead of probability spaces as a natural framework for studying belief functions. Since degrees of support are like belief functions, we adapt this idea here. First, we introduce the probability algebra associated with a probability space (Kappos, 1969). Let J be the σ-ideal of P-null sets in the σ-algebra A of the probability space. Two sets A′, A′′ ∈ A are equivalent modulo J if

A′ − A′′ ∈ J and A′′ − A′ ∈ J. This means that the two sets have the same probability measure P(A′) = P(A′′). This equivalence is a congruence in the Boolean algebra A. Hence the quotient algebra B = A/J is a Boolean σ-algebra too. If [A] denotes the equivalence class of A, then, for any countable family of sets Ai, i ∈ I,

[A]^c = [A^c],
∨_{i∈I} [Ai] = [∪_{i∈I} Ai],
∧_{i∈I} [Ai] = [∩_{i∈I} Ai]. (9.7)

So A 7→ [A] defines a Boolean homomorphism from A onto B, called projection. We denote [Ω] by ⊤ and [∅] by ⊥. These are of course the top and bottom elements of B. Now, as is well known, B has some further important properties (see (Halmos, 1963)): It satisfies the countable chain condition, which means that any family of disjoint elements of B is countable. Further, any Boolean algebra B satisfying the countable chain condition is complete. That is, any subset E ⊆ B has a supremum ∨E and an infimum ∧E in B. Furthermore, the countable chain condition also implies that there is always a countable subset D of E with the same supremum and infimum, i.e. ∨D = ∨E and ∧D = ∧E. We refer to (Halmos, 1963) for these results. Finally, by µ([A]) = P(A) a normalised, positive measure µ is defined on B. Positive means here that µ(b) = 0 implies b = ⊥. A pair (B, µ) of a Boolean σ-algebra B satisfying the countable chain condition and a normalised, positive measure µ on it is called a probability algebra.

We now use this construction of a probability algebra from a probability space to extend the definition of the degrees of support sΓ beyond elements ψ for which sΓ(ψ) is measurable. Even if sΓ(ψ) is not measurable, any A ∈ A such that A ⊆ sΓ(ψ) represents an argument for ψ, that is, a set of assumptions which supports ψ. To exploit this remark, define for every set H ∈ P(Ω)

ρ0(H) = ∨{[A] : A ⊆ H, A ∈ A}. (9.8)

This mapping has interesting properties, as the following theorem shows.

Theorem 9.2 The application ρ0 : P(Ω) → A/J as defined in (9.8) has the following properties:

ρ0(Ω) = ⊤,
ρ0(∅) = ⊥,
ρ0(∩_{i∈I} Hi) = ∧_{i∈I} ρ0(Hi), (9.9)

if {Hi, i ∈ I} is a countable family of subsets of Ω.

Proof. Clearly, ρ0(Ω) = [Ω] = ⊤ ∈ A/J. Similarly, ρ0(∅) = [∅] = ⊥ ∈ A/J. In order to prove the remaining identity, let Hi, i ∈ I, be a countable family of subsets of Ω. For every index i, there is a countable family of sets H′j ∈ A such that H′j ⊆ Hi and ρ0(Hi) = ∨[H′j] = [∪H′j], since B satisfies the countable chain condition. Take Ai = ∪_j H′j. Then Ai ⊆ Hi, Ai ∈ A and P(Ai) = µ(ρ0(Hi)). Define A = ∩_{i∈I} Ai ∈ A. It follows that A ⊆ ∩_{i∈I} Hi and, because the projection is a σ-homomorphism, we obtain [A] = ∧_{i∈I} [Ai] = ∧_{i∈I} ρ0(Hi).

We are going to show now that [A] = ρ0(∩_{i∈I} Hi), which then proves the theorem. For this, it is sufficient to show that P(A) = µ(ρ0(∩_{i∈I} Hi)). This is so since P(A) = µ([A]) and A ⊆ ∩Hi, hence [A] ≤ ρ0(∩Hi). Therefore, if µ([A]) = µ(ρ0(∩Hi)), we must indeed have [A] = ρ0(∩Hi), since µ is positive.

Now, clearly P(A) ≤ µ(ρ0(∩Hi)). As above, we conclude that there is an A′ ∈ A, A′ ⊆ ∩Hi, such that P(A′) = µ(ρ0(∩Hi)). Further, A′ ∪ (A − A′) ⊆ ∩Hi implies that P(A′ ∪ (A − A′)) = P(A′), hence P(A − A′) = 0. Define A′i = Ai ∪ (A′ − A) ⊆ Hi. Then, since A′i ⊆ Hi is measurable,

µ(ρ0(Hi)) = P(Ai) ≤ P(A′i) = P(Ai) + P(A′i − Ai) ≤ µ(ρ0(Hi)). (9.10)

This implies that P(A′i − Ai) = 0, therefore we have [A′i] = [Ai]. Further,

∩_i A′i = ∩_i (Ai ∪ (A′ − A)) = (A′ − A) ∪ (∩_i Ai) = (A′ − A) ∪ A = A ∪ A′ = A′ ∪ (A − A′).

But ∩_i A′i and ∩_i Ai are equivalent, since [∩_i A′i] = ∧_i [A′i] = ∧_i [Ai] = [∩_i Ai]. This implies finally that P(A) = P(∩_i Ai) = P(∩_i A′i) = P(A′) + P(A − A′) = P(A′) = µ(ρ0(∩_i Hi)). This is what was to be proved. ⊓⊔

Take now B = A/J and consider the probability algebra (B, µ). Then we compose the allocation of support s from Ψ into the power set P(Ω) with the mapping ρ0 from P(Ω) into B to obtain a mapping ρ = ρ0 ◦ s : Ψ → B. Now we see that

ρ(1) = ρ0(s(1)) = ρ0(Ω) = ⊤,
ρ(φ · ψ) = ρ0(s(φ · ψ)) = ρ0(s(φ) ∩ s(ψ)) = ρ0(s(φ)) ∧ ρ0(s(ψ)) = ρ(φ) ∧ ρ(ψ). (9.11)

A mapping ρ satisfying these two properties is called an allocation of probability (a.o.p.) on the information algebra Ψ. In fact, it allocates an element of the probability algebra B to any element of the algebra Ψ. In this way, a random mapping Γ always leads to an allocation of probability ρΓ = ρ0 ◦ sΓ, once a probability measure on the assumptions is introduced. In particular, we may now define the degree of support for any ψ ∈ Ψ by

spΓ(ψ) = µ(ρΓ(ψ)). (9.12)

This extends the support function (9.6) to all elements ψ of Ψ. In this way, the degree of support spΓ(ψ) is, according to (9.8), equal to the probability of the supremum of all [A], where A is measurable and supports ψ. This can also be expressed in another way. In order to see this, we note an important property of probability algebras: Clearly µ(∧ bi) ≤ inf µ(bi) and µ(∨ bi) ≥ sup µ(bi) hold for any family of elements {bi}. But there are important cases where equality holds (Halmos, 1963). A subset D of B is called downward (upward) directed, if for every pair b′, b′′ ∈ D there is an element b ∈ D such that b ≤ b′ ∧ b′′ (b ≥ b′ ∨ b′′).

Lemma 9.1 If D is a downward (upward) directed subset of B, then

µ(∧_{b∈D} b) = inf_{b∈D} µ(b), µ(∨_{b∈D} b) = sup_{b∈D} µ(b). (9.13)

Proof. There is a countable subfamily of elements ci ∈ D, i = 1, 2, ..., which have the same meet as D. Define c′1 = c1 and select elements c′i in the downward directed set D such that c′2 ≤ c′1 ∧ c2, c′3 ≤ c′2 ∧ c3, .... Then c′1 ≥ c′2 ≥ c′3 ≥ ... and this sequence still has the same infimum. However, by the continuity of probability we have

µ(∧_{b∈D} b) = µ(∧_i c′i) = lim_{i→∞} µ(c′i) ≥ inf_{b∈D} µ(b). (9.14)

But this implies µ(∧_{b∈D} b) = inf_{b∈D} µ(b). The case of upward directed sets is proved in the same way. ⊓⊔

Note now that {[A] : A ⊆ H, A ∈ A} is an upward directed family in B. Therefore, according to Lemma 9.1 we have

spΓ(ψ) = µ(ρΓ(ψ)) = µ(∨{[A] : A ∈ A, A ⊆ sΓ(ψ)})
= sup{µ([A]) : A ∈ A, A ⊆ sΓ(ψ)}
= sup{P(A) : A ∈ A, A ⊆ sΓ(ψ)}
= P∗(sΓ(ψ)), (9.15)

where P∗ is the inner probability measure associated with P. This shows that the degree of support of a piece of information ψ defined by (9.12) is the inner probability of the support sΓ(ψ). Note that definitions (9.12) and (9.6) coincide if sΓ(ψ) ∈ A. Support functions and inner probability measures are thus closely related. This result is very appealing: any measurable set A which is contained in sΓ(ψ) supports ψ. So we expect P(A) ≤ spΓ(ψ). In the absence of further information, it is reasonable to take spΓ(ψ) to be the least upper bound of the probabilities of the sets A supporting ψ.

A similar consideration can be made with respect to the possibility sets associated with elements of Ψ with respect to a random mapping Γ. As before we define the possibility set of ψ as

pΓ(ψ) = {ω ∈ Ω : Γ(ω) · ψ ≠ 0}.

This set contains all assumptions ω which do not lead to a contradiction with ψ under the mapping Γ. Thus, the probability of this set, if it is defined, measures the degree of possibility or the degree of plausibility of ψ,

plΓ(ψ)= P (pΓ(ψ)). (9.16)

As in the case of the degree of support, there is no guarantee that pΓ(ψ) is A-measurable. But we can solve this problem in a way similar to the case of the degree of support. A measurable set A ⊆ pΓ(ψ)^c can be seen as an argument against the hypothesis ψ, in particular if Γ is normalised. But A ⊆ pΓ(ψ)^c is equivalent to A^c ⊇ pΓ(ψ). So a measurable set A ⊇ pΓ(ψ) can be considered as an argument that the hypothesis ψ cannot be excluded. Therefore we define for every set H ∈ P(Ω)

ξ0(H) = ∧{[A] : A ⊇ H, A ∈ A}. (9.17)

Note that A ⊇ H if and only if A^c ⊆ H^c. This implies that ξ0(H) = (ρ0(H^c))^c. From this in turn we conclude that the following corollary to Theorem 9.2 holds:

Corollary 9.1 The application ξ0 : P(Ω) → A/J as defined in (9.17) has the following properties:

ξ0(Ω) = ⊤,
ξ0(∅) = ⊥,
ξ0(∪_{i∈I} Hi) = ∨_{i∈I} ξ0(Hi), (9.18)

if {Hi, i ∈ I} is a countable family of subsets of Ω.

As before, we can now compose pΓ with ξ0 to obtain a mapping ξΓ = ξ0 ◦ pΓ : Ψ → B = A/J. We may then define for any ψ ∈ Ψ a degree of plausibility by

plΓ(ψ) = µ(ξΓ(ψ)). (9.19)

Using Lemma 9.1 we obtain also

plΓ(ψ) = inf{P(A) : A ∈ A, A ⊇ pΓ(ψ)} = P^*(pΓ(ψ)).

Here P^* is the outer probability measure of the set pΓ(ψ). Thus, if pΓ(ψ) is measurable, then P^*(pΓ(ψ)) = P(pΓ(ψ)), which shows that (9.19) defines in fact an extension of the plausibility defined by (9.16). In the general case considered here, no properties comparable to those of support (for instance Theorem 9.1) exist for possibility sets and degrees of possibility. This notion gets its full power only in the case of Boolean information algebras, where it becomes a dual concept to support.
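On a finite assumption space whose σ-algebra is generated by a partition, the inner and outer probability measures appearing in (9.15) and above can be computed directly. The following sketch is an assumed illustration, not part of the original text:

```python
OMEGA = frozenset({1, 2, 3, 4})
# Measurable blocks generating a coarse sigma-algebra, with their probabilities.
P_BLOCK = {frozenset({1, 2}): 0.5, frozenset({3}): 0.3, frozenset({4}): 0.2}

def inner(h):
    # P_*(H) = sup{P(A) : A measurable, A subset of H}: mass of blocks inside H.
    return sum(p for b, p in P_BLOCK.items() if b <= h)

def outer(h):
    # P^*(H) = inf{P(A) : A measurable, A superset of H}
    #        = 1 - P_*(complement of H).
    return 1.0 - inner(OMEGA - h)

H = frozenset({1, 3})        # not measurable: it cuts the block {1, 2}
print(inner(H), outer(H))    # 0.3 0.8
```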

9.3 Random Variables

We now propose alternative approaches to define random variables in an information algebra. We start with the idempotent generalised information algebra (Rs, D; ≤, ⊥, ·, ǫ) of simple random variables with values in an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ), defined on a sample space (Ω, A, P), and consider the ideal completion of this algebra, rather than the algebra (RΨ, D; ≤, ⊥, ·, ǫ) considered before. Let R denote the ideal completion of Rs. Then (R, D; ≤, ⊥, ·, ǫ) is an algebraic information algebra with the simple random variables Rs as finite elements, see Section 7.2. We call the elements of R generalised random variables. A generalised random variable is thus an ideal of simple random variables. As usual, we henceforth identify Rs with its image in R, that is, we identify the simple random variables ∆ ∈ Rs with their principal ideals ↓∆ in R. We also write ∆ ≤ Γ for ∆ ∈ Γ, referring to the order in R. So, for any Γ ∈ R we may within the algebra (R, D; ≤, ⊥, ·, ǫ) write Γ = ∨{∆ ∈ Rs : ∆ ≤ Γ}. Using the associativity of join in the complete lattice R, we obtain

Γ1 ∨ Γ2 = Γ1 · Γ2 = (∨{∆1 ∈ Rs : ∆1 ≤ Γ1}) ∨ (∨{∆2 ∈ Rs : ∆2 ≤ Γ2})
= ∨{∆1 · ∆2 : ∆1, ∆2 ∈ Rs, ∆1 ≤ Γ1, ∆2 ≤ Γ2}. (9.20)

Note that this also corresponds to the combination of two ideals, see Section 7.1. In a similar way, by Theorem 7.2, we find that

ǫx(Γ) = ǫx(∨{∆ ∈ Rs : ∆ ≤ Γ}) = ∨{ǫx(∆) : ∆ ∈ Rs, ∆ ≤ Γ}. (9.21)

Again, this corresponds to the definition of extraction in the ideal completion, Section 7.1. To any Γ ∈ R we may associate a random mapping Γ : Ω → IΨ from the underlying sample space into the ideal completion of Ψ by defining

Γ(ω) = ∨{∆(ω) : ∆ ∈ Rs, ∆ ≤ Γ}. (9.22)

This random mapping is defined by a sort of point-wise limit within IΨ. We denote the random mapping Γ deliberately with the same symbol as the generalised random variable Γ. The reason is that the two concepts can essentially be identified, as the following lemmata show. In the following lemma, combination and extraction in R are defined as in (9.20) and (9.21). Note that we denote combination (join) and information extraction for x ∈ D with the same symbols in R and in IΨ.

Lemma 9.2

1. If Γ1, Γ2 ∈ R, then (Γ1 · Γ2)(ω) = Γ1(ω) · Γ2(ω) for all ω ∈ Ω.

2. If Γ ∈ R and x ∈ D, then (ǫx(Γ))(ω) = ǫx(Γ(ω)) for all ω ∈ Ω.

Proof. (1) By definition of the random mapping associated with Γ1 · Γ2 we have

(Γ1 · Γ2)(ω) = ∨{∆(ω) : ∆ ≤ Γ1 · Γ2},

where the ∆ denote, as always, simple random variables. Consider now a ψ ∈ (Γ1 · Γ2)(ω). In the algebraic information algebra (IΨ, D) this means that ψ ≤ ∨{∆(ω) : ∆ ≤ Γ1 · Γ2}. The supremum on the right hand side is over a directed set in IΨ. By compactness, there is therefore a ∆ ≤ Γ1 · Γ2 such that ψ ≤ ∆(ω). Now, ∆ ≤ Γ1 · Γ2 means by the definition of combination in the ideal completion R that there is a ∆1 ≤ Γ1, ∆1 ∈ Rs, and a ∆2 ≤ Γ2, ∆2 ∈ Rs, such that ∆ ≤ ∆1 · ∆2. This implies that ψ ≤ (∆1 · ∆2)(ω) = ∆1(ω) · ∆2(ω), where ∆1(ω) ∈ Γ1(ω) and ∆2(ω) ∈ Γ2(ω). But this shows that ψ ∈ Γ1(ω) · Γ2(ω).

Conversely, consider an element ψ ∈ Γ1(ω) · Γ2(ω). By the definition of the join in IΨ this means that there are elements ψ1, ψ2 ∈ Ψ such that ψ ≤ ψ1 · ψ2, where ψ1 ≤ Γ1(ω) and ψ2 ≤ Γ2(ω). Now, ψ1 ≤ Γ1(ω) means that ψ1 ≤ ∨{∆(ω) : ∆ ≤ Γ1}. As above, by compactness, there is a ∆1 ≤ Γ1 such that ψ1 ≤ ∆1(ω). Similarly, there is a ∆2 ≤ Γ2 such that ψ2 ≤ ∆2(ω). Thus, ψ ≤ ∆1(ω) · ∆2(ω) = (∆1 · ∆2)(ω). Further, ∆1 · ∆2 ≤ Γ1 · Γ2. This implies ψ ∈ (Γ1 · Γ2)(ω), hence finally (Γ1 · Γ2)(ω) = Γ1(ω) · Γ2(ω).

(2) Assume next that ψ ∈ (ǫx(Γ))(ω). As above, using the definition of the random mapping associated with ǫx(Γ), this implies that there is a ∆ ≤ ǫx(Γ) such that ψ ≤ ∆(ω). By the definition of ǫx(Γ) and compactness there is a ∆′ ≤ Γ such that ∆ ≤ ǫx(∆′). This implies ψ ≤ (ǫx(∆′))(ω) = ǫx(∆′(ω)), which, together with ∆′(ω) ≤ Γ(ω), shows that ψ ∈ ǫx(Γ(ω)).

Conversely, assume ψ ∈ ǫx(Γ(ω)). Then ψ ≤ ǫx(φ) for some φ ∈ Γ(ω). Again, as above, there is a ∆ ≤ Γ such that φ ≤ ∆(ω). Therefore, we conclude that ψ ≤ ǫx(∆(ω)) = (ǫx(∆))(ω) and ǫx(∆) ≤ ǫx(Γ). This implies that ψ ∈ (ǫx(Γ))(ω), hence (ǫx(Γ))(ω) = ǫx(Γ(ω)). ⊓⊔

According to this lemma we have a homomorphism between the algebra of generalised random variables and random mappings. In fact, it is an embedding, since Γ1(ω) = Γ2(ω) for all ω ∈ Ω implies Γ1 = Γ2. The next lemma strengthens Lemma 9.2.

Lemma 9.3 If X ⊆ R is a directed set, then

(∨_{Γ∈X} Γ)(ω) = ∨_{Γ∈X} Γ(ω).

Proof. If Γ′ ∈ X, then Γ′ ≤ ∨_{Γ∈X} Γ, hence Γ′(ω) ≤ (∨_{Γ∈X} Γ)(ω) and therefore

∨_{Γ∈X} Γ(ω) ≤ (∨_{Γ∈X} Γ)(ω).

Conversely, consider ψ ∈ Ψ such that ψ ≤ (∨_{Γ∈X} Γ)(ω). Since, according to (9.22),

(∨_{Γ∈X} Γ)(ω) = ∨{∆(ω) : ∆ ≤ ∨_{Γ∈X} Γ},

we have ψ ≤ ∆(ω) for some simple random variable ∆ ≤ ∨_{Γ∈X} Γ. Now, since X is a directed set, by compactness there is a Γ ∈ X such that ∆ ≤ Γ, hence ∆(ω) ≤ Γ(ω). It follows then that ψ ≤ ∨_{Γ∈X} Γ(ω), which in turn implies

(∨_{Γ∈X} Γ)(ω) ≤ ∨_{Γ∈X} Γ(ω).

This concludes the proof of the lemma. ⊓⊔

The theory of generalised random variables developed above may be presented in a similar way in the framework of algebraic information algebras. Here is a sketch of the approach:

Example 9.1 Generalised Random Variables in Algebraic Algebras: Let (Ψ, D; ≤, ⊥, ·, ǫ) be an algebraic information algebra with finite elements Ψf. We assume that (Ψf, D; ≤, ⊥, ·, ǫ) is a subalgebra of (Ψ, D; ≤, ⊥, ·, ǫ). Define then simple random variables ∆ with finite elements from Ψf as values. They form an idempotent generalised information algebra (Rs, D; ≤, ⊥, ·, ǫ) with combination and extraction defined point-wise.

Since the ideal completion (I_Ψf, D; ≤, ⊥, ·, ǫ) of the information algebra (Ψf, D; ≤, ⊥, ·, ǫ) is isomorphic to the algebraic algebra (Ψ, D; ≤, ⊥, ·, ǫ) (see Section 7.2), the theory above applies to the present case. Generalised random variables in an algebraic information algebra can thus be considered as random mappings with values in Ψ, defined as point-wise limits of simple random variables with finite elements as values. ⊖

As before with random mappings, there is no guarantee that the support sΓ(ψ) of a generalised random variable Γ is measurable for every ψ ∈ Ψ. But of course we can extend the support function to all of Ψ by the allocation of probability as proposed above. However, we shall show later that the degree of support spΓ(ψ) of a generalised random variable Γ is in fact determined by the degrees of support of its approximating simple random variables, see Chapter 11 (Corollary 11.1).

Information algebras are closed under finite combinations. But there are information algebras which are closed under countable combinations. In this section we consider such algebras and uncertain information relative to such algebras. Here follows the definition which will be used in the sequel:

Definition 9.1 σ-Information Algebra. An idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is called a σ-information algebra, if

1. Countable Combination: Ψ is closed under countable combinations (joins).

2. Continuity of Extraction: For every monotone sequence ψ1 ≤ ψ2 ≤ ... ∈ Ψ, and for any x ∈ D, it holds that

ǫx(∨_{i=1}^∞ ψi) = ∨_{i=1}^∞ ǫx(ψi).

The second condition is a weaker version of the continuity of extraction in compact information algebras, compare Theorem 7.2.

There are many examples of σ-information algebras. First of all, any continuous or algebraic information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is a σ-information algebra: Since in these cases Ψ is a complete lattice, it is surely closed under countable joins. The continuity of extraction follows from Theorems 7.2 and 7.19, since a monotone sequence is a directed set.

Further important examples of σ-information algebras are minimal extensions of information algebras (Ψ, D; ≤, ⊥, ·, ǫ) which are closed under countable combination. Such extensions can be obtained using ideal completion. In order to do this, we need to introduce a new concept. Let (Ψ, D; ≤, ⊥, ·, ǫ) be an idempotent generalised information algebra and (IΨ, D; ≤, ⊥, ·, ǫ) its ideal completion. A subset S of IΨ is called σ-closed, if it is closed under countable combinations or joins. The intersection of any family of σ-closed sets is also σ-closed. Further, the set IΨ itself is σ-closed. Therefore, for any subset X ⊆ IΨ we may define the σ-closure σ(X) as the intersection of all σ-closed sets containing X. We are particularly interested in σ(Ψ), the σ-closure of Ψ in IΨ. Note that here, as in the sequel, we identify as usual Ψ with its embedding in IΨ under the mapping ψ 7→ ↓ψ, for simplicity of notation. Also we shall write ψ even if we operate within IΨ. The σ-closure of Ψ can be characterised as follows:

Theorem 9.3 If (Ψ, D; ≤, ⊥, ·, ǫ) is an idempotent generalised information algebra, then

σ(Ψ) = {I ∈ IΨ : I = ∨_{i=1}^∞ ψi, ψi ∈ Ψ}. (9.23)

Proof. Clearly, the set on the right hand side of equation (9.23) contains Ψ and is contained in σ(Ψ). We claim that this set is itself σ-closed. In fact, consider a countable set Ij of elements of this set, such that

Ij = ∨_{i=1}^∞ ψj,i

with ψj,i ∈ Ψ. Define the set J = {(j, i) : j = 1, 2, ...; i = 1, 2, ...} and the sets Jj = {(j, i) : i = 1, 2, ...} for j = 1, 2, ..., and Ki = {(h, j) : 1 ≤ h, j ≤ i} for i = 1, 2, .... Then we have

J = ∪_{j=1}^∞ Jj = ∪_{i=1}^∞ Ki.

By the laws of associativity in the complete lattice IΨ we obtain then

∨_{j=1}^∞ Ij = ∨_{j=1}^∞ (∨_{(j,i)∈Jj} ψj,i) = ∨_{(j,i)∈J} ψj,i = ∨_{i=1}^∞ (∨_{(h,j)∈Ki} ψh,j).

But ∨_{(h,j)∈Ki} ψh,j ∈ Ψ for i = 1, 2, .... Hence ∨_{j=1}^∞ Ij itself belongs to the set on the right hand side of (9.23). This means that this set is indeed σ-closed. Since the set contains Ψ, it contains also σ(Ψ), hence it equals σ(Ψ). ⊓⊔

Consider now a monotone sequence ψ1 ≤ ψ2 ≤ ... of elements of Ψ. Its supremum exists in IΨ and belongs in fact to σ(Ψ). The sequence is furthermore a directed set. Therefore, by Theorem 7.2, join commutes with information extraction; this is expressed in the following theorem. It shows that continuity of extraction holds:

Theorem 9.4 For a monotone sequence ψ1 ≤ ψ2 ≤ ... of elements of Ψ, and for any x ∈ D, we have in σ(Ψ) that

ǫx(∨_{i=1}^∞ ψi) = ∨_{i=1}^∞ ǫx(ψi). (9.24)

Theorem 9.4 shows in particular that σ(Ψ) is closed under focussing. In fact, if φi is any sequence of elements of Ψ and I = ∨_{i=1}^∞ φi, then we may define ψi = ∨_{k=1}^i φk ∈ Ψ, such that ψk, k = 1, 2, ..., is a monotone sequence and I = ∨_{i=1}^∞ φi = ∨_{i=1}^∞ ψi. So, for I ∈ σ(Ψ) and any x ∈ D, by Theorem 9.4,

ǫx(I) = ∨_{i=1}^∞ ǫx(ψi), (9.25)

where ǫx(ψi) ∈ Ψ and hence ǫx(I) ∈ σ(Ψ) by Theorem 9.3. As a σ-closed set, σ(Ψ) is closed under combination. Therefore (σ(Ψ), D; ≤, ⊥, ·, ǫ) is itself an information algebra, a subalgebra of (IΨ, D; ≤, ⊥, ·, ǫ). Since it is closed under combination (i.e. join) of countable sets, contains 0 and 1, and satisfies condition (9.24), it is a σ-information algebra, the σ-algebra induced by (Ψ, D; ≤, ⊥, ·, ǫ).

A particular and important case of such a construction is (σ(Ψf), D; ≤, ⊥, ·, ǫ) in an algebraic information algebra. Due to the Representation Theorem 7.5, this can be reduced to the situation of ideal completion described above. Consider simple random variables as defined in Section 9.1. We may define a random mapping Γ : Ω → IΨ from a countable family of simple random variables ∆i by

Γ(ω) = ∨_{i=1}^∞ ∆i(ω).

We call such a random mapping Γ a random variable in the algebra (Ψ, D; ≤, ⊥, ·, ǫ). Note that its values are in the ideal completion IΨ of Ψ. In the case of an algebraic information algebra (Ψ, D; ≤, ⊥, ·, ǫ), the values of the simple random variables are considered to be finite, that is, to be in Ψf. Let now Rσ be the family of random variables in the algebra (Ψ, D; ≤, ⊥, ·, ǫ).
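The passage from an arbitrary countable family to a monotone sequence with the same pointwise supremum, used in the next lemma, is just the formation of cumulative joins. A minimal Python sketch (illustrative assumptions: a finite prefix of the sequence, values in a subset algebra where join is intersection):

```python
from itertools import accumulate

def cumulative(deltas, join):
    # Replace Delta'_1, Delta'_2, ... by the monotone sequence
    # Delta_i = Delta'_1 v ... v Delta'_i (joins taken pointwise).
    def join_maps(d1, d2):
        return {w: join(d1[w], d2[w]) for w in d1}
    return list(accumulate(deltas, join_maps))

deltas = [{'w': frozenset({1, 2, 3})},
          {'w': frozenset({2, 3})},
          {'w': frozenset({1, 2})}]
for d in cumulative(deltas, lambda u, v: u & v):
    print(d)   # values shrink: {1,2,3}, {2,3}, {2} -- increasing information
```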

Lemma 9.4 A random variable Γ is always the supremum of a monotone increasing sequence ∆1 ≤ ∆2 ≤ . . . of simple random variables, such that for all ω ∈ Ω,

Γ(ω) = ∨_{i=1}^∞ ∆i(ω).

Proof. If Γ is a random variable, then Γ(ω) = ∨_{i=1}^∞ ∆i′(ω) for some sequence ∆i′ of simple random variables. Define

∆i = ∨_{j=1}^i ∆j′.

Then each ∆i is a simple random variable, i = 1, 2, ..., and ∆1 ≤ ∆2 ≤ .... From ∆i′ ≤ ∆i, we conclude that Γ(ω) = ∨_{i=1}^∞ ∆i′(ω) ≤ ∨_{i=1}^∞ ∆i(ω). On the other hand, ∆i(ω) ≤ Γ(ω), hence ∨_{i=1}^∞ ∆i(ω) ≤ Γ(ω), such that finally Γ(ω) = ∨_{i=1}^∞ ∆i(ω). ⊓⊔

Random variables are random mappings and as such can be combined and extracted point-wise in the ideal completion (IΨ, D):

1. Combination: (Γ1 · Γ2)(ω)=Γ1(ω) · Γ2(ω),

2. Extraction: ǫx(Γ)(ω)= ǫx(Γ(ω)).

We have to verify that the resulting random mappings still belong to Rσ, that is, are random variables. So, let

Γ1 = ∨_{i=1}^∞ ∆1,i, Γ2 = ∨_{i=1}^∞ ∆2,i.

Then we obtain, using associativity of the supremum,

(Γ1 · Γ2)(ω) = (Γ1 ∨ Γ2)(ω) = Γ1(ω) ∨ Γ2(ω) = (∨_{i=1}^∞ ∆1,i(ω)) ∨ (∨_{i=1}^∞ ∆2,i(ω))
= ∨_{i=1}^∞ (∆1,i(ω) ∨ ∆2,i(ω)) = ∨_{i=1}^∞ (∆1,i ∨ ∆2,i)(ω).

Since ∆1,i ∨ ∆2,i ∈ Rs, this proves that Γ1 ∨ Γ2 ∈ Rσ. Note then that, as usual, Γ1 ≤ Γ2 if and only if Γ1(ω) ≤ Γ2(ω) for all ω ∈ Ω, since random variables are random mappings. Further, let

∞ Γ(ω)= ∆i(ω), i_=1 where ∆i is an increasing sequence of simple random variables (see Lemma 9.4). Then, by the continuity of extraction,

∞ ∞ ∞ ǫx(Γ)(ω)= ǫx(Γ(ω)) = ǫx( ∆i(ω)) = ǫx(∆i(ω)) = ǫx(∆i)(ω). i_=1 i_=1 i_=1

Again, if ∆i are simple random variables, then so are the ǫx(∆i), therefore ǫx(Γ) is indeed a random variable. We expect (Rσ, D; ≤, ⊥, ·,ǫ) to form an information algebra, even a σ- algebra. This is indeed true. We use the following lemma to prove this statement: 9.3. RANDOM VARIABLES 175

Lemma 9.5 Assume Γi ∈ Rσ for i = 1, 2,... to be random variables. Then n i=1 Γi exists in the information algebra of random mappings into IΨ, and for all ω ∈ Ω, W ∞ ∞ Γi (ω)= Γi(ω) ! i_=1 i_=1 ∞ Proof. Consider the random mapping η defined by η(ω) = i=1 Γi(ω). Since Γ (ω) ≤ ∞ Γ (ω), it follows that Γ ≤ η, hence η is an upper bound i i=1 i i W of the random mappings Γi. If χ is another upper bound, then Γi(ω) ≤ χ(ω), W∞ hence η(ω)= i=1 Γi(ω) ≤ χ(ω), therefore η ≤ χ. Thus, η is the supremum of the random mappings Γ . ⊓⊔ W i

Theorem 9.5 The system (Rσ, D; ≤, ⊥, ·,ǫ) of random variables in the idem- potent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ), with combination and extraction defined point-wise as above forms a σ-information algebra.

Proof. As we have seen above, Rσ is closed under combination (join) and extraction. The bottom element, the mapping U(ω) = 1 as well as the top element N(ω) = 0 belong also to Rσ. So(Rσ, D; ≤, ⊥, ·,ǫ) is a subalgebra of the algebra of random mappings (RIΨ , D; ≤, ⊥, ·,ǫ), hence an information algebra. We show that (Rσ, D; ≤, ⊥, ·,ǫ) is σ-closed, that is, if Γi ∈ Rσ for i = ∞ 1, 2,..., then i=1 Γi ∈ Rσ. Let W ∞ Γj(ω)= ∆j,i(ω), for j = 1, 2,..., i_=1 where ∆j,i are simple random variables, and define the random mapping Γ, using Lemma 9.5, by

n ∞ ∞ ∞ Γ(ω)= Γi (ω)= Γj(ω)= ∆j,i(ω) . ! ! i_=1 j_=1 j_=1 i_=1

As in the proof of Theorem 9.3 define the sets Jj = {(j, i) : i = 1, 2 . . .} and Ki = {(h, j) : 1 ≤ h, j ≤ i}. Then, as there, we obtain

Γ(ω)= ∨(h,j)∈Ki ∆h,j(ω) . i=1 _ 

Since ∨(h,j)∈Ki ∆h,j(ω) defines simple random variables, the random mapping Γ is indeed a random variable and Rσ is closed under countable combination. It remains to verify the continuity of extraction. Assume Γ1 ≤ Γ2 ≤ . . . be a monotone sequence of random variables in Rσ and x ∈ D. Then, the 176 CHAPTER 9. RANDOM MAPS continuity of extraction in (Rσ, D; ≤, ⊥, ·,ǫ) follows from this property in (σ(Ψ), D; ≤, ⊥, ·,ǫ), using Lemma 9.5 and the continuity of extraction in Rσ, as follows: ∞ ǫx( Γi)(ω) i=1 _ ∞ ∞ ∞ = ǫx(( Γi)(ω)) = ǫx( Γi(ω)) = ǫx(Γi(ω)) i=1 i=1 i=1 ∞ _ ∞ _ _ = ǫx(Γi)(ω) = ( ǫx(Γi))(ω). i_=1 i_=1 ∞ ∞ So, we see that ǫx( i=1 Γi)= i=1 ǫx(Γi). This concludes the proof. ⊓⊔ Certainly, (R , D; ≤, ⊥, ·,ǫ) is a subalgebra of (R , D; ≤, ⊥, ·,ǫ). Within s W W σ the algebra (Rσ, D; ≤, ⊥, ·,ǫ), each element of Rσ is the supremum of the simple random variables it dominates as the following lemma shows.

Lemma 9.6 Let Γ ∈ Rσ, defined by ∞ Γ(ω)= ∆i(ω). i_=1 Then, in the information algebra (Rσ, D; ≤, ⊥, ·,ǫ) ∞ Γ= ∆i = {∆:∆ ∈ Rs, ∆ ≤ Γ}. (9.26) i_=1 _ Proof. The first equality in (9.26) follows directly from the definition of Γ. Trivially, Γ is an upper bound of the set {∆:∆ ≤ Γ}. IfΓ′ is another upper bound of this set, then it is also an upper bound of the ∆i, hence Γ ≤ Γ′. Therefore, Γ is the least upper bound of the set {∆:∆ ≤ Γ}. ⊓⊔ This lemma shows that a random variable is also generalised random variable. We now take the σ-closure of Rs in the algebraic information algebra (R, D; ≤, ⊥, ·,ǫ) of generalised random variables. According to Theorem 9.3, elements of σ(Rs) are defined as ∞ Γ= ∆i, with ∆i ∈ Rs, ∀i = 1, 2,.... i_=1 Then (σ(Rs), D; ≤, ⊥, ·,ǫ) is a σ-information algebra, containing Rs, i.e. the simple random variables. To Γ we associate a random mapping, just as with generalised random variables, defined by ∞ Γ(ω)= ∆i(ω), with ∆i ∈ Rs, ∀i = 1, 2,.... i_=1 9.3. RANDOM VARIABLES 177

Note that Γ(ω) ∈ σ(Ψ) by Theorem 9.3. Therefore, the elements of σ(Rs) are random variables with values in the information algebra (σ(Ψ), D; ≤ , ⊥, ·,ǫ). This shows the equivalence of taking the σ-closure of Rs and the definition of random variables as suprema of sequences of simple random variables. 178 CHAPTER 9. RANDOM MAPS Chapter 10

Allocations of Probability

10.1 Algebra of Allocations of Probability

In Section 9.2 we have introduced the concept of an allocation of probability (a.o.p) as a means to extend the degrees of support of a random mapping beyond the measurable elements φ, that is, the elements for which sΓ(φ) ∈A. These allocations of probability play an important role in the theory of uncertain information. Therefore, we start here with a study of this concept, first independently of its relation to random mappings and random variables. In the subsequent Section 10.2 we examine the relation between random mappings and their associated allocations of probability. Random mappings, and in particular generalised random variables and random variables, provide means to model explicitly the mechanisms which generate uncertain information. We refer to (Kohlas & Monney, 1995; Haenni et al., 2000; Kohlas, 2003a; Kohlas & Monney, 2007; Pouly & Kohlas, 2011) for more spe- cific applications of this idea. Alternatively, allocations of probability may serve to directly assign beliefs to pieces of information. This is more in the spirit of a subjective, epistemological description of belief, advocated espe- cially by G. Shafer (Shafer, 1973; Shafer, 1976; Shafer, 1979). In this view, allocations of probability are taken as the primitive elements, rather than random variables or hints. This is the point of view developed in this section (see also (Kohlas, 1997; Kohlas, 2003b)). We introduce the concept of an allocation of probability:

Definition 10.1 Allocation of Probability. If (Ψ; ≤) is a bounded join- semilattice and (µ, B) a probability algebra, then an allocation of probability (a.o.p) is a mapping ρ :Ψ →B such that

(A1) ρ(1) = ⊤,

(A2) ρ(φ ∨ ψ)= ρ(φ) ∧ ρ(ψ).

If furthermore ρ(0) = ⊥ holds, then the allocation is called normalised .

179 180 CHAPTER 10. ALLOCATIONS OF PROBABILITY

We shall apply this definition especially to idempotent generalised infor- mation algebras (Ψ, D; ≤, ⊥, ·,ǫ), where in the semilattice (Ψ, ≤) join corre- sponds to combination. (A1) says then that the full belief is allocated to the trivial vacuous information. More important is (A2). It says that the belief allocated to a combined information φ · ψ equals the common part of belief ρ(φ) ∧ ρ(ψ) allocated to both of the two pieces of information φ and ψ indi- vidually. We remind that the a.o.p derived from a random mapping satisfies these two properties (see (9.11)). Note, that if φ ≤ ψ, that is, φ ∨ ψ = ψ, then ρ(φ ∨ ψ)= ρ(φ) ∧ ρ(ψ)= ρ(ψ), hence ρ(ψ) ≤ ρ(φ). A particular a.o.p is defined by ν(φ) = ⊥, unless φ = 1, in which case ν(1) = ⊤. This is called the vacuous allocation; no belief is allocated to a non-trivial piece of information. It is associated with the vacuous information represented by the random mapping Γ(ω) = 1 for all ω ∈ Ω. We may think of an allocation of probability as the description of a body of belief relative to pieces of information in an information algebra ((Ψ, D; ≤, ⊥, ·,ǫ) obtained from a source of information. Two (or more) distinct sources of information will lead to the definition of two (or more) corresponding allocations of probability. Thus, in a general setting, let AΨ be the set of all allocations of probability in Ψ in (B,µ). Select two alloca- tions ρi, i = 1, 2, from AΨ. How can they be combined in order to synthesise the two bodies of information they represent into a single, aggregated body? The basic idea is as follows: Consider a piece of information ψ in Ψ. If now ψ1 and ψ2 are two other pieces of information in Ψ, such that ψ ≤ ψ1·ψ2, then the common belief ρ1(ψ1) ∧ ρ2(ψ2) allocated to ψ1 and to ψ2 by the two allocations ρ1 and ρ2 respectively, is a belief allocated to ψ by the two allocations simultaneously. That is, the total belief ρ(ψ) to be allocated to ψ by the two allocations ρ1 and ρ2 together must equal at least the common belief allocated to ψ1 and ψ2 individually by each of the two allocations respectively, that is

ρ(ψ) ≥ ρ1(ψ1) ∧ ρ2(ψ2). (10.1) In the absence of other information, it seems then reasonable to define the combined belief in ψ, as obtained from the two sources of information, as the least upper bound of all these implied beliefs,

ρ(ψ) = {ρ1(ψ1) ∧ ρ2(ψ2) : ψ ≤ ψ1 · ψ2}. (10.2) This defines indeed a new_ allocation of probability:

Theorem 10.1 Let ρ1, ρ2 ∈ AΨ be two allocations of probability. The map ρ :Ψ →B as defined by (10.2) is then an allocation of probability. Proof. First, we have

ρ(1) = {ρ1(ψ1) ∧ ρ2(ψ2) : 1 ≤ ψ1 · ψ2}

= _ρ1(1) ∧ ρ2(1) = ⊤. 10.1. ALGEBRA OF ALLOCATIONS OF PROBABILITY 181

So (A1) is satisfied. Next, let ψ1, ψ2 ∈ Ψ. By definition we have

ρ(ψ1 ∨ ψ2) = {ρ1(φ1) ∧ ρ2(φ2) : ψ1 · ψ2 ≤ φ1 · φ2}.

Now, ψ1 ≤ ψ1 ∨ ψ2 implies_ that

{ρ1(φ1) ∧ ρ2(φ2) : ψ1 · ψ2 ≤ φ1 · φ2}

_≤ {ρ1(φ1) ∧ ρ2(φ2) : ψ1 ≤ φ1 · φ2} = ρ1(ψ1) and similarly for ψ_2. Thus, we have ρ(ψ1 ∨ ψ2) ≤ ρ(ψ1), ρ(ψ2), that is ρ(ψ1 ∨ ψ2) ≤ ρ(ψ1) ∧ ρ(ψ2). On the other hand,

{(φ1, φ2) : ψ1 ∨ ψ2 ≤ φ1 · φ2} ′ ′′ ′ ′′ ′ ′ ′′ ′′ ⊇ {(φ1, φ2) : φ1 = φ1 · φ1, φ2 = φ2 · φ2, ψ1 ≤ φ1 · φ2, ψ2 ≤ φ1 · φ2}. By the distributive law for complete Boolean algebras we obtain then

ρ(ψ1 ∨ ψ2) ′ ′′ ′ ′′ ′ ′ ′′ ′′ ≥ {ρ1(φ1 · φ1) ∧ ρ2(φ2 · φ2) : ψ1 ≤ φ1 · φ2, ψ2 ≤ φ1 · φ2} _ ′ ′′ ′ ′′ ′ ′ ′′ ′′ = {(ρ1(φ1) ∧ ρ1(φ1)) ∧ (ρ2(φ2) ∧ ρ2(φ2)) : ψ1 ≤ φ1 · φ2, ψ2 ≤ φ1 · φ2} _ ′ ′ ′ ′ = {ρ1(φ1) ∧ ρ2(φ2) : ψ1 ≤ φ1 · φ2} ∧ _ ′′ ′′ ′′ ′′  {ρ1(φ1) ∧ ρ2(φ2) : ψ2 ≤ φ1 · φ2}

= ρ(_ψ1) ∧ ρ(ψ2).  (10.3)

This implies finally that ρ(ψ1 ∨ ψ2) = ρ(ψ1) ∧ ρ(ψ2). Thus (A2) holds too and ρ is indeed an allocation of probability. ⊓⊔ In this way, in the set of allocations of probability AΨ a binary com- bination operation is defined. We denote this operation by ·. Thus, ρ as defined by (10.2) is written as ρ = ρ1 · ρ2. The following theorem gives us the elementary properties of this operation. Theorem 10.2 The combination operation, as defined by (10.2), is com- mutative, associative, idempotent and the vacuous allocation is the neutral element of this operation. Proof. The commutativity of (10.2) is evident. For the associativity note that for a ψ ∈ Ψ we have, due to the associativity and distributivity of meet and join in complete Boolean algebras,

((ρ1 · ρ2) · ρ3)(ψ)

= {(ρ1 · ρ2)(φ1,2) ∧ ρ3(φ3) : ψ ≤ φ1,2 · φ3}

= _{ {ρ1(φ1) ∧ ρ2(φ2) : φ1,2 ≤ φ1 · φ2}∧ ρ3(φ3) : ψ ≤ φ1,2 · φ3}

= _{ρ_1(φ1) ∧ ρ2(φ2) ∧ ρ3(φ3) : ψ ≤ φ1 · φ2 · φ3}. _ 182 CHAPTER 10. ALLOCATIONS OF PROBABILITY

For (ρ1 · (ρ2 · ρ3))(φ) we obtain exactly the same result in the same way. This proves associativity. To show idempotency consider

(ρ · ρ)(ψ)= {ρ(φ1) ∧ ρ(φ2) : ψ ≤ φ1 · φ2}

= {ρ(_φ1 · φ2) : ψ ≤ φ1 · φ2} = ρ(ψ) since the last supremum_ is attained for φ1 = φ2 = ψ. Finally let ν denote the vacuous allocation. Then, for any allocation ρ and any ψ ∈ Ψ we have, noting that ν(φ)= ⊥, unless φ = 1, in which case ν(1) = ⊤,

(ρ · ν)(ψ)= {ρ(φ1) ∧ ν(φ2) : ψ ≤ φ1 · φ2} = ρ(ψ). This shows that ν is the_ neutral element for combination. ⊓⊔ This theorem shows that AΨ is a semilattice. Indeed, a partial order between allocations can be introduced as usual by defining ρ1 ≤ ρ2 if ρ1·ρ2 = ρ2. This means that for all ψ ∈ Ψ,

ρ1 · ρ2(ψ)= {ρ1(ψ1) ∧ ρ2(ψ2) : ψ ≤ ψ1 · ψ2} = ρ2(ψ). _ We have therefore always ρ1(ψ1) ∧ ρ2(ψ2) ≤ ρ2(ψ) if ψ ≤ ψ1 · ψ2. Take now ψ1 = ψ and ψ2 = 1, such that ψ ≤ ψ · 1 = ψ, to obtain ρ1(ψ) ∧ ρ2(1) = ρ1(ψ) ≤ ρ2(ψ). Thus we have ρ1 ≤ ρ2 if and only if ρ1(ψ) ≤ ρ2(ψ) for all ψ ∈ Ψ. Clearly, the combination ρ1 · ρ2 is the supremum of the two a.o.p in this order. Therefore we shall henceforth write ρ1 ∨ ρ2 for this combination if we want to emphasise the order-theoretic aspects. The vacuous a.o.p is the bottom element of this semilattice. And the a.o.p defined by ζ(ψ)= ⊤ for all information elements is the top element to the semilattice AΨ, so that ρ ∨ ζ = ζ. So the semilattice of a.o.ps AΨ is a bounded semilattice. Next we turn to the operation of extracting a part of an allocation of probability according to an operator ǫx. Let ρ be an allocation of probability on an information algebra (Ψ, D; ≤, ⊥, ·,ǫ) and x ∈ D. Just as it is possible to extract a part of a piece of information ψ from Ψ with the aid of the operator ǫx, it should also be possible to focus the belief represented by the a.o.p ρ to the information supported by the domain x. This means to extract the information related to x from ρ. Thus, for a ψ ∈ Ψ consider the beliefs allocated to pieces of information φ which are supported by x and which entail ψ, i.e. ψ ≤ φ = ǫx(φ). The part of the belief allocated to ψ and relating to the domain x, ǫx(ρ)(ψ) must then be at least ρ(φ),

ǫx(ρ)(ψ) ≥ ρ(φ) for any φ = ǫx(φ) ≥ ψ. (10.4) In the absence of other information, it seems again, as above, reasonable to define ǫx(ρ)(ψ) to be the least upper bound of all these implied supports,

ǫx(ρ)(ψ)= {ρ(φ) : ψ ≤ φ = ǫx(φ)}. (10.5) _ 10.1. ALGEBRA OF ALLOCATIONS OF PROBABILITY 183

This defines indeed an allocation of probability:

Theorem 10.3 Let ρ ∈ AΨ be an allocation of probability. The map ǫx(ρ) : Ψ →B as defined by (10.5) is an allocation of probability.

Proof. We have by definition

ǫx(ρ)(1) = {ρ(φ) : 1 ≤ φ = ǫx(φ)} = ρ(1) = ⊤. _ Thus (A1) is verified. Again by definition,

ǫx(ρ)(φ1 · φ2)= {ρ(φ) : φ1 · φ2 ≤ φ = ǫx(φ)}. _ From φ1, φ2 ≤ φ1 · φ2 it follows that ǫx(ρ)(φ1 ∨ φ2) ≤ ǫx(ρ)(φ1),ǫx(ρ)(φ2) and thus ǫx(ρ)(φ1 · φ2) ≤ ǫx(ρ)(φ1) ∧ ǫx(ρ)(φ2). On the other hand, we have

{ψ : φ1 · φ2 ≤ ψ = ǫx(ψ)}

⊇ {ψ = ψ1 · ψ2 : φ1 ≤ ψ1 = ǫx(ψ1), φ2 ≤ ψ2 = ǫx(ψ2)}.

From this we obtain, using the distributive law for complete Boolean alge- bras,

ǫx(ρ)(φ1 · φ2)

≥ {ρ(ψ1 · ψ2) : φ1 ≤ ψ1 = ǫx(ψ1), φ2 ≤ ψ2 = ǫx(ψ2)}

= _{ρ(ψ1) ∧ ρ(ψ2) : φ1 ≤ ψ1 = ǫx(ψ1), φ2 ≤ ψ2 = ǫx(ψ2)} _ = {ρ(ψ1) : φ1 ≤ ψ1 = ǫx(ψ1)} ∧ {ρ(ψ2) : φ2 ≤ ψ2 = ǫx(ψ2)}

= ρ(_φ1) ∧ ρ(φ2).  _ 

This proves property (A2) for an allocation of support. ⊓⊔ We are now going to show that the a.o.p in AΨ in fact define an idem- potent, generalised information algebra (AΨ, D; ≤, ⊥, ·,ǫ), without the Sup- port Axiom A2 (unless (D, ≤) has a largest element). The Quasi-Separoid Axiom A0 is inherited form the underlying algebra (Ψ, D; ≤, ⊥, ·,ǫ). The Semigroup Axiom A1 is proved in Theorem 10.2. Concerning the Unit and Null Axiom A3 we have already noted above that the vacuous allocation ν is the unit element of combination and the the a.o.p ζ is the null element of combination. It remains to verify that ǫx(ρ)= ζ implies ρ = ζ. This holds, since ǫx(ρ) = ζ means in particular ǫx(ρ)(1) = ⊤, which implies ρ(1) = ⊤, hence indeed ρ = ζ. The Extraction and Combination Axioms A4 and A5 are proved in the following two theorems. 184 CHAPTER 10. ALLOCATIONS OF PROBABILITY

Theorem 10.4 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent generalised informa- tion algebra and ρ ∈ AΨ an allocation of probability with support x, i.e. ǫx(ρ)= ρ. Then x⊥y|z implies ǫy(ρ)= ǫy(ǫz(ρ)).

Proof. Consider an element ψ ∈ Ψ. Then

ǫy(ρ)(ψ)= {ρ(φ) : ψ ≤ φ = ǫy(φ)}.

Similarly, we obtain, using associativity_ of supremum,

ǫy(ǫz(ρ))(ψ) ′ ′ ′ = {ǫz(ρ)(ψ ) : ψ ≤ ψ = ǫy(ψ )} ′ ′ ′ = _{ {ρ(φ) : ψ ≤ φ = ǫz(φ)} : ψ ≤ ψ = ǫyz(ψ )} ′ ′ = _{ρ_(φ) : ψ ≤ ψ = ǫy(ψ ) ≤ φ = ǫz(φ)}. _ Using the assumption that ρ = ǫx(ρ), we obtain further

ǫy(ρ)(ψ) ′ ′ ′ = { {ρ(φ ) : φ ≤ φ = ǫx(φ )} : ψ ≤ φ = ǫy(φ)} ′ ′ ′ = _{ρ_(φ ) : ψ ≤ φ = ǫy(φ) ≤ φ = ǫx(φ )}. and _

ǫy(ǫz(ρ))(ψ) ′ ′ ′ ′ ′ = { {ρ(φ ) : φ ≤ φ = ǫx(φ )} : ψ ≤ ψ = ǫy(ψ ) ≤ φ = ǫz(φ)} ′ ′ ′ ′ ′ = _{_ρ(φ ) : ψ ≤ ψ = ǫy(ψ ) ≤ φ = ǫz(φ) ≤ φ = ǫx(φ )}.

Define _

′ ′ ′ A = {φ : ∃φ such that ψ ≤ φ = ǫy(φ) ≤ φ = ǫx(φ )} ′ ′ ′ ′ ′ ′ B = {φ : ∃φ, ψ such that ψ ≤ ψ = ǫy(ψ ) ≤ φ = ǫz(φ) ≤ φ = ǫx(φ )}. Note that both A and B depend on ψ. Then we have

′ ′ ǫy(ρ)(ψ)= ρ(φ ), ǫy(ǫz(ρ))(ψ)= ρ(φ ). ′ ′ φ_∈A φ_∈B We claim that A = B. In fact, consider φ′ ∈ A and let φ be such that ′ ′ ′ ψ ≤ φ = ǫy(φ) ≤ φ . Note that x⊥y|z implies that ǫy(φ ) = ǫy(ǫz(φ )). ′ ′′ ′ ′ ′ ′′ Then take ψ = φ and φ = ǫz(φ ). It follows that ψ ≤ ψ = ǫy(ψ ) ≤ φ = ′′ ′ ′ ′ ′ ǫz(φ ) ≤ φ = ǫx(φ ) But this means that φ ∈ B. Conversely, if φ ∈ B, let ′ ′ ′ ′ ′′ ′ ψ be such that ψ ≤ ψ = ǫy(ψ ) ≤ φ = ǫz(φ) ≤ φ . then take φ = ψ . It ′′ ′′ ′ ′ ′ follows that ψ ≤ φ = ǫy(φ ) ≤ φ = ǫx(φ ) and hence φ ∈ A. So A = B and ǫy(ρ)= ǫy(ǫz))(ρ). ⊓⊔ Combination Axiom A5 holds also for a.o.p 10.1. ALGEBRA OF ALLOCATIONS OF PROBABILITY 185

Theorem 10.5 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent generalised informa- tion algebra and ρ1, ρ2 ∈ AΨ two allocations of probability with supports x and y respectively. Then x⊥y|z implies ǫz(ρ1 · ρ2)= ǫz(ρ1) · ǫz(ρ2).

Proof. The proof goes as the proof of Theorem 10.4: We first expand for an element ψ ∈ Ψ the terms ǫz(ρ1 · ρ2)(ψ) and (ǫz(ρ1) · ǫz(ρ2))(ψ) according to their definitions, using the associative and distributive laws in a complete Boolean algebra:

ǫz(ρ1 · ρ2)(ψ)

= {(ρ1 · ρ2)(φ) : ψ ≤ φ = ǫz(φ)}

= _{ {ρ1(φ1) ∧ ρ2(φ2) : φ ≤ φ1 · φ2} : ψ ≤ φ = ǫz(φ)}

= _{ρ_1(φ1) ∧ ρ2(φ2) : ψ ≤ φ = ǫz(φ) ≤ φ1 · φ2}, _ (ǫz(ρ1) · ǫz(ρ2))(ψ)

= {ǫz(ρ1)(φ1) ∧ ǫz(ρ2)(φ2) : ψ ≤ φ1 · φ2} _ ′ ′ ′ = { {ρ1(ψ1) : φ1 ≤ ψ1 = ǫz(ψ1)} _ _ ′ ′ ′  ∧ { {ρ2(ψ2) : φ2 ≤ ψ2 = ǫz(ψ2)} : ψ ≤ φ1 · φ2} _ _ ′ ′  = {ρ1(ψ1) ∧ ρ2(ψ2) : ψ ≤ φ1 · φ2, ′ ′ ′ ′ φ1_≤ ψ1 = ǫz(ψ1), φ2 ≤ ψ2 = ǫz(ψ2)}.

Now, introduce in both expansions the assumptions ρ1 = ǫx(ρ1) and ρ2 = ǫy(ρ2), which yields

ǫz(ρ1 · ρ2)(ψ) ′ ′ ′ = { {ρ1(φ1) : φ1 ≤ φ1 = ǫx(φ1)}} _ _ ′ ′ ′  ∧ {ρ2(φ2) : φ2 ≤ φ2 = ǫy(φ2)}} : ψ ≤ φ = ǫz(φ) ≤ φ1 · φ2} _ ′ ′ ′  ′ ′ ′ = { {ρ1(φ1) ∧ ρ2(φ2) : φ1 ≤ φ1 = ǫx(φ1), φ2 ≤ φ2 = ǫy(φ2)} :

ψ ≤_φ_= ǫz(φ) ≤ φ1 · φ2} ′ ′ ′ ′ = {ρ1(φ1) ∧ ρ2(φ2) : ψ ≤ φ = ǫz(φ) ≤ φ1 · φ2, φ1 ≤ φ1 = ǫx(φ1), ′ ′ φ2_≤ φ2 = ǫy(φ2)},

(ǫz(ρ1) · ǫz(ρ2))(ψ) ′ ′ ′ ′ ′ ′ ′ ′ = { {ρ1(φ1) : ψ1 ≤ φ1 = ǫx(φ1} ∧ {ρ2(φ2) : ψ2 ≤ φ2 = ǫy(φ2} :  ′ ′ ′  ′  ψ ≤_φ1 _· φ2, φ1 ≤ ψ1 = ǫz(ψ1), φ2 ≤ ψ2 = ǫz_(ψ2)} ′ ′ = {ρ1(φ1) ∧ ρ2(φ2) : ψ ≤ φ1 · φ2, _ 186 CHAPTER 10. ALLOCATIONS OF PROBABILITY

′ ′ ′ ′ ′ ′ ′ ′ φ1 ≤ ψ1 = ǫz(ψ1) ≤ φ1 = ǫx(φ1), φ2 ≤ ψ2 = ǫz(ψ2) ≤ φ2 = ǫy(φ2)}. According to these expansion, we define the sets ′ ′ A = {(φ1, φ2) : ∃φ, φ1, φ2 such that ψ ≤ φ = ǫz(φ) ≤ φ1 · φ2, ′ ′ ′ ′ φ1 ≤ φ1 = ǫx(φ1), φ2 ≤ φ2 = ǫy(φ2)}, ′ ′ ′ ′ B = {(φ1, φ2) : ∃φ1, φ2, ψ1, ψ2 such that ψ ≤ φ1 · φ2, ′ ′ ′ ′ ′ ′ ′ ′ φ1 ≤ ψ1 = ǫz(ψ1) ≤ φ1 = ǫx(φ1), φ2 ≤ ψ2 = ǫz(ψ2) ≤ φ2 = ǫy(φ2)}. We remark again, that both A and B depend on ψ and ′ ′ ǫz(ρ1 · ρ2)(ψ)= ρ1(φ1) ∧ ρ2(φ2), (φ′ ,φ′ )∈A 1 _2 ′ ′ ǫz(ρ1) · ǫz(ρ2)(ψ)= ρ1(φ1) ∧ ρ2(φ2). (φ′ ,φ′ )∈B 1 _2 ′ ′ As before, we claim that A = B. Suppose (φ1, φ2) ∈ A so that there are ′ ′ ′ elements φ, φ1, φ2 such that ψ ≤ φ = ǫz(φ) ≤ φ1 · φ2 and φ1 ≤ φ1 = ′ ′ ′ ′′ ′ ′ ′′ ′ ǫx(φ1), φ2 ≤ φ2 = ǫy(φ2). Define then φ1 = ψ1 = ǫz(φ1) and φ2 = ψ2 = ′ ′ ′ ǫz(φ2). Then we have ǫz(φ) ≤ ǫz(φ1 · φ2). But the assumption x⊥y|z ′ ′ ′ ′ implies ǫz(φ1 · φ2) = ǫz(φ1) · ǫz(φ2), hence we obtain from ψ ≤ ǫz(φ) that ′′ ′′ ′′ ′ ′ ′ ′ ψ ≤ φ1 · φ2. We have further φ1 ≤ ψ1 = ǫz(ψ1) ≤ φ1 = ǫx(φ1) and ′′ ′ ′ ′ ′ ′ ′ φ2 ≤ ψ2 = ǫz(ψ2) ≤ φ2 = ǫy(φ2). This shows that (φ1, φ2) ∈ B. ′ ′ ′ ′ Conversely, consider (φ1, φ2) ∈ B such that there are elements ψ1, ψ2 ′ ′ ′ ′ and φ1, φ2 satisfying ψ ≤ φ1 · φ2 and φ1 ≤ ψ1 = ǫz(ψ1) ≤ φ1 = ǫx(φ1) and ′ ′ ′ ′ ′ ′ ′ φ2 ≤ ψ2 = ǫz(ψ2) ≤ φ2 = ǫy(φ2). From ψ1 = ǫz(ψ1) ≤ φ1 it follows that ′ ′ ′ ′ ′ ′ ǫz(ψ1) ≤ ǫz(φ1), and, similarly, ǫz(ψ2) ≤ ǫz(φ2). Define φ = ǫz(ψ1) · ǫz(ψ2). ′ ′ ′ ′ Then, we have that ψ ≤ φ = ǫz(φ) = ǫz(ψ1) · ǫz(ψ2) ≤ ǫz(φ1) · ǫz(φ2). ′′ ′ ′′ ′ Define now further φ1 = ǫz(φ1) and φ2 = ǫz(φ2). Then, we obtain ψ ≤ φ = ′′ ′′ ′′ ′ ′ ′′ ′ ′ ǫ(φ) ≤ φ1 · φ2 and φ1 ≤ φ1 = ǫx(φ1), φ2 ≤ φ2 = ǫy(φ2). This shows that ′ ′ (φ1, φ2) ∈ A, hence A = B and therefore ǫz(ρ1 · ρ2)= ǫz(ρ1) · ǫz(ρ2). ⊓⊔ We have already noted above, that combination is idempotent, that is, ρ · ρ = ρ. However we need more, namely ρ · ǫx(ρ) = ρ, which is the Idempotency Axiom A5. In fact, we have obviously ǫx(ρ)(ψ) ≤ ρ(ψ) for all ψ ∈ Ψ, hence ǫx(ρ) ≤ ρ and therefore ρ · ǫx(ρ)= ρ. All this together proves that the a.o.p form an idempotent generalised information algebra. Theorem 10.6 If (Ψ, D; ≤, ⊥, ·,ǫ) is an idempotent generalised informa- tion algebra, then (AΨ, D; ≤, ⊥, ·,ǫ) is also an idempotent generalised infor- mation algebra.

We show now that the algebra (AΨ, D; ≤, ⊥, ·,ǫ) is in fact an extension of the information algebra (Ψ, D; ≤, ⊥, ·,ǫ). Consider for any φ ∈ Ψ the the following map of Ψ into B: ⊤ if ψ ≤ φ, ρ (ψ) = (10.6) φ ⊥ otherwise,  10.2. RANDOM MAPPINGS AND ALLOCATIONS 187

It allocates total belief to all elements of information implied by φ, that is to all elements of the principal ideal ↓φ, and no belief to all other elements. This map is clearly an allocation of probability; it is called a deterministic allocation. It is a degenerate allocation in so far as there is no uncertainty in the information it expresses. It states simply that the piece of information φ holds. Obviously the bottom a.o.p ν = ρ1 is a deterministic allocations, and so is top a.o.p ζ = ρ0. Now, for φ1, φ2 ∈ Ψ we have

ρφ1 · ρφ2 (ψ) = {ρφ1 (ψ1) ∧ ρφ2 (ψ2) : ψ ≤ ψ1 · ψ2} _ ⊤ if ψ ≤ φ · φ , = 1 2 = ρ (ψ). (10.7) ⊥ otherwise φ1·φ2   So, the combination of deterministic allocations of φ1 and φ2 produces the deterministic a.o.p of φ1 · φ2. Further, for any ψ ∈ Ψ,

′ ′ ′ ǫx(ρφ)(ψ) = {ρφ(ψ ) : ψ ≤ ψ = ǫx(ψ )}.

′ _ ′ ′ This equals ⊤, if there is a ψ = ǫx(ψ ) ≥ ψ such that ψ ≤ φ, and ⊥ other- ′ ′ ′ ′ wise. But, we have ψ = ǫx(ψ ) ≤ φ if and only if ψ = ǫx(ψ ) ≤ ǫx(φ). This shows that ǫx(ρφ)(ψ) = ρǫx(φ)(ψ), hence ǫx(ρφ) = ρǫx(φ). The extraction of a deterministic a.o.p associated with φ by x yields the deterministic a.o.p associated with ǫx(φ). The mapping φ 7→ ρφ is thus an embedding of (Ψ, D; ≤, ⊥, ·,ǫ) in (AΨ, D; ≤ , ⊥, ·ǫ). In this sense, (AΨ, D; ≤, ⊥, ·,ǫ) extends the information algebra (Ψ, D; ≤, ⊥, ·,ǫ). By the way, we remark that if (Ψ, E; ·, ◦) is an idempotent valuation algebra (Section 5.2), then the corresponding algebra of a.o.p is obviously also an idempotent valuation algebra. In the next section we pursue the subject by examining the question how random mappings and allocations of probability, and especially their respective information algebras, are related.

10.2 Random Mappings and Allocations

In Section 9.2 it has been shown that a random mapping generates an al- location of probability, which specifies how much belief, according to the information represented by the random mapping, is to be assigned to an element of Φ. In this section the relations between random mappings and allocations of probability will be examined in more detail. In particular, we address the question, whether the operations between random mappings, combination and extraction, are reflected in the corresponding operations of the associated a.o.p, in other words, whether the mapping Γ 7→ ρΓ is a homomorphism between random mappings and associated allocations of probability. 188 CHAPTER 10. ALLOCATIONS OF PROBABILITY

We start with simple random variables. Fix an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ) and a probability space (Ω, A, P ). For any simple random variable ∆ ∈ Rs defined on this probability space, we have seen that all elements of Ψ and even of IΨ have measurable allocations of support s∆(ψ) ∈ A and their degree of support is well defined. If we pass in this case from the probability space (Ω, A, P ) to its associated prob- ability algebra (B,µ) (see Section 9.2), then we can define the allocation of probability (a.o.p) associated with the random variable ∆,

ρ∆(ψ) = [s∆(ψ)] for all elements ψ ∈ Ψ and even for all elements in IΨ. Thus, we obtain for the degree of support induced by the random variable ∆,

sp∆(ψ)= P (s∆(ψ)) = µ(ρ∆(ψ)).

Again this holds for all elements of Ψ and even of its ideal completion IΨ. The mapping ρ∆ :Ψ → B clearly satisfies the properties (A1) and A(2) of an allocation of probability introduced in the previous Section 10.1 (see Theorem 9.1 and (9.7). A simple random variable ∆ is defined by a partition {B1,...,Bn} of Ω consisting of measurable blocks Bi and a mapping defined by ∆(ω)= ψi for all ω ∈ Bi and i = 1,...,n. We write ∆(ω) = ∆(Bi), if ω ∈ Bi. To the partition {B1,...,Bn} of Ω corresponds a partition {[B1],..., [Bn]} of the probability algebra B. That is, we have [Bi] ∧ [Bj] = ⊥ if i 6= j, and n ∨i=1[Bi] = ⊤. The simple random variable ∆ can also be defined by a mapping ∆([Bi]) = ψi from the partition of B into Ψ. Its allocation of probability can then also be determined as

ρ∆(ψ)= ∨{[Bi] : ψ ≤ ∆([Bi])}. (10.8)

We note that ρ∆ = ρ∆→ . So, as far as allocation of probability (and support) is concerned we might as well restrict ourselves to considering canonical simple random variables and their information algebra (Rs,c, D; ≤, ⊥, ·,ǫ) (see Section 9.1). We now consider the mapping ρ : ∆ 7→ ρ∆ which maps simple random variables into a.o.p.s. This mapping is a homomorphism:

Theorem 10.7 Let ∆1, ∆2, ∆ ∈ Rs be simple random variables, defined on partitions in a probability algebra (B,µ) with values in an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ). Then, for all ψ ∈ Ψ and x ∈ D,

ρ∆1·∆2 (ψ) = (ρ∆1 · ρ∆2 )(ψ) (10.9)

ρǫx(∆)(ψ) = ǫx(ρ∆)(ψ). (10.10) 10.2. RANDOM MAPPINGS AND ALLOCATIONS 189

It is understood that in this theorem the combination on the left is the one in the algebra of simple random variables, whereas on the right it is the one in the algebra of a.o.ps. Similarly, the extraction operator ǫx on the left is the one in the information algebra (Rs, D; ≤, ⊥, ·,ǫ) of simple random variables, the one on the right is the one in the information algebra (AΨ, D; ≤, ⊥, ·,ǫ) of a.o.p Proof. (1) Assume that ∆1 is defined on the partition {B1,1,...,B1,n} and ∆2 on the partition {B2,1,...,B2,m} of B. From the definition of an al- location of probability, and the distributive and associative laws for Boolean algebras, we obtain

(ρ∆1 · ρ∆2 )(ψ)

= ∨{ρ∆1 (ψ1) ∧ ρ∆2 (ψ2) : ψ ≤ ψ1 · ψ2} = ∨{(∨{B1,i : ψ1 ≤ ∆1(B1,i}))

∧ (∨{B2,j : ψ2 ≤ ∆2(B2,j})) : ψ ≤ ψ1 · ψ2}

= ∨{∨{B1,i ∧ B2,j 6= ⊥ : ψ1 ≤ ∆1(B1,i), ψ2 ≤ ∆2(B2,j)} : ψ ≤ ψ1 · ψ2}

= ∨{B1,i ∧ B2,j 6= ⊥ : ψ1 ≤ ∆1(B1,i), ψ2 ≤ ∆2(B2,j), ψ ≤ ψ1 · ψ2}.

But ψ ≤ ψ1 · ψ2, ψ1 ≤ ∆1(B1,i) and ψ2 ≤ ∆2(B2,j) if and only if ψ ≤ ∆1(B1,i) · ∆2(B2,j). So we conclude that

(ρ∆1 · ρ∆2 )(ψ)

= ∨{B1,i ∧ B2,j 6= ⊥ : ψ ≤ ∆1(B1,i) · ∆2(B2,j)}

= ∨{B1,i ∧ B2,j 6= ⊥ : ψ ≤ (∆1 · ∆2)(B1,i ∧ B2,j)} (10.11)

= ρ∆1·∆2 (ψ).

(2) Assume that ∆ is defined on the partition B1,...,Bn of B. Then ǫx(∆) is also defined on B1,...,Bn. The associative law of complete Boolean algebra gives us then,

ǫx(ρ∆)(ψ)

= ∨{ρ∆(φ) : ψ ≤ φ = ǫx(φ)}

= ∨ {∨{Bi : φ ≤ ∆(Bi)} : ψ ≤ φ = ǫx(φ)}

= ∨{Bi : ψ ≤ φ = ǫx(φ) ≤ ∆(Bi)}.

But, ψ ≤ φ = ǫx(φ) ≤ ∆(Bi) holds if and only if ψ ≤ ǫx(∆(Bi)) = ǫx(∆)(Bi). Hence we see that

ǫx(ρ∆)(ψ)= ∨{Bi : ψ ≤ ǫx(∆)(Bi)} = ρǫx(∆)(ψ).

This completes the proof. ⊓⊔ As far as allocations of probability induced by simple random variables are concerned, this theorem shows that the combination and focusing of 190 CHAPTER 10. ALLOCATIONS OF PROBABILITY allocations reflects correctly the corresponding operations of the underlying random variables. Let As be the image of Rs, under the mapping ρ. That is As is the set of all allocations of probability which are induced by simple random variables in (B,µ). The mapping satisfies

ρ∆1·∆2 = ρ∆1 · ρ∆2 ,

ρǫx(∆) = ǫx(ρ∆). (10.12)

Also the vacuous random variable U maps to the vacuous allocation ν and the null random variable N to ζ. Thus we conclude that the map ∆ 7→ ρ∆ is a homomorphism between (Rs, D; ≤, ⊥, ·,ǫ) and (AΨ, D; ≤, ⊥, ·,ǫ) and that (As, D; ≤, ⊥, ·,ǫ) is a subalgebra of the information algebra (AΨ, D; ≤ , ⊥, ·,ǫ). We remark that if we restrict the mapping ρ to canonical random → variables, then the mapping ∆ 7→ ρ∆ becomes an embedding. Now we turn to generalised random variables Γ. We remind that they can be identified with certain random mappings into the ideal completion (IΨ, D; ≤, ⊥, ·,ǫ) of the information algebra (Ψ, D; ≤, ⊥, ·,ǫ) (see Section 9.3) and as such their allocation of probability is defined by ρΓ(ψ) = ρ0(sΓ(ψ)) or ρΓ = ρ0 ◦ sΓ (see Section 9.2). We remind that this covers also the important case of algebraic information algebras (Ψ, D; ≤, ⊥, ·,ǫ), where the simple random variables have finite values in Ψf , if (Ψf , D; ≤, ⊥, ·,ǫ) is a subalgebra of (Ψ, D; ≤, ⊥, ·,ǫ). Now we show that the a.o.p of a generalised random variable can also be obtained as the limit of the a.o.p of the simple random variables it dominates.

Theorem 10.8 For all generalised random variables Γ,

ρΓ = {ρ∆ :∆ ≤ Γ}. (10.13) _ Proof. Fix an element ψ ∈ Ψ and consider a measurable subset A ⊆ sΓ(ψ). We define a simple random variable

ψ if ω ∈ A, ∆(ω)= 1 otherwise.  Then certainly ∆(ω) ≤ Γ(ω) for all ω ∈ Ω, hence ∆ ≤ Γ. Furthermore we have ρ∆(ψ) = [A]. This implies that

{ρ∆(ψ):∆ ≤ Γ}≥ {[A] : A ⊆ sΓ(ψ), A ∈ A} = ρ0(sΓ(ψ)). _ _ Conversely, for all ∆ ≤ Γ it holds that s∆(ψ) ⊆ sΓ(ψ) and that s∆(ψ) ∈A. Therefore, we conclude that

{ρ∆(ψ):∆ ≤ Γ}≤ {[A] : A ⊆ sΓ(ψ), A ∈ A} = ρ0(sΓ(ψ)). _ _ 10.2. RANDOM MAPPINGS AND ALLOCATIONS 191

This proves that ρΓ(ψ) = {ρ∆(ψ):∆ ≤ Γ} for all ψ ∈ Ψ, hence (10.13) holds. ⊓⊔ W Theorem 10.8 shows that the a.o.p of a generalised random variable is in the ideal completion of the information algebra (As, D; ≤, ⊥, ·,ǫ) of simple a.o.p. This ideal completion contains allocations of probability ρΓ : B → IΨ of the random mappings associated with generalised random variables. The ideal completion of (As, D; ≤, ⊥, ·,ǫ) is an algebraic information algebra and (10.13) shows that the mapping Γ 7→ ρΓ is continuous. It is in fact a homomorphism between the algebra of generalised random variables and their a.o.p as the following theorem shows:

Theorem 10.9 Let Γ, Γ1, Γ2 be generalised random variables and x ∈ D. Then

ρΓ1·Γ2 = ρΓ1 · ρΓ2 ,

ρǫx(Γ) = ǫx(ρΓ). The operations on the left hand side of these identities belong to the algebra of generalised random variables, whereas those on the right hand side to the algebra of a.o.p. Proof. We have to show that

ρΓ1·Γ2 (ψ) = (ρΓ1 · ρΓ2 )(ψ),

ρǫx(Γ)(ψ) = ǫx(ρΓ)(ψ), for all ψ ∈ Φ. (1) We noted above that the mapping Γ 7→ ρΓ is continuous. Therefore, using (9.20) and continuity, ∆ denoting always simple random variables, we have

ρΓ1·Γ2 = ρW{∆1·∆2:∆1≤Γ1,∆2≤Γ2} = {ρ∆1·∆2 :∆1 ≤ Γ1, ∆2 ≤ Γ2}. On the other hand, for every ψ ∈ Ψ, we_ obtain, using Theorems 10.8 and 10.8, and the associative and distributive laws of Boolean algebras,

(ρΓ1 · ρΓ2 )(ψ)

= {ρΓ1 (ψ1) ∧ ρΓ2 (ψ1) : ψ ≤ ψ1 · ψ2} _ = {( {ρ∆1 (ψ1):∆1 ≤ Γ1}) _ _ ∧( {ρ∆2 (ψ2):∆2 ≤ Γ2}) : ψ ≤ ψ1 · ψ2} _ = {ρ∆1 (ψ1) ∧ ρ∆2 (ψ2):∆1 ≤ Γ1, ∆2 ≤ Γ2, ψ ≤ ψ1 · ψ2} _ = { {ρ∆1 (ψ1) ∧ ρ∆2 (ψ2) : ψ ≤ ψ1 · ψ2} :∆1 ≤ Γ1, ∆2 ≤ Γ2} _ _ = {(ρ∆1 · ρ∆2 )(ψ):∆1 ≤ Γ1, ∆2 ≤ Γ2} _ = {ρ∆1·∆2 (ψ):∆1 ≤ Γ1, ∆2 ≤ Γ2}. _ 192 CHAPTER 10. ALLOCATIONS OF PROBABILITY

This proves that ρΓ1·Γ2 = ρΓ1 · ρΓ2 . (2) Again by continuity, we obtain from (9.21)

ρǫx(Γ) = ρW{ǫx(∆):∆≤Γ} = {ρǫx(∆) :∆ ≤ Γ}.

But, we have also, by Theorem 10.7 (10.10)_ and Theorem 10.8 ,

ǫx(ρΓ)(φ) = {ρΓ(ψ) : φ ≤ ψ = ǫx(ψ)}

= _{ {ρ∆(ψ):∆ ≤ Γ} : φ ≤ ψ = ǫx(ψ)}

= _{_{ρ∆(ψ) : φ ≤ ψ = ǫx(ψ)} :∆ ≤ Γ} _ _ = {ρǫx(∆)(φ):∆ ≤ Γ}. _ This proves that ρx(Γ) = ǫx(ρΓ). ⊓⊔ The following is a remarkable property of generalised random variables, which we formulate in the framework of algebraic information algebras. The interest of this theorem will become clear later especially in relation to sup- port functions, see Chapter 11.

Theorem 10.10 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra with finite elements Ψf and (Ψf , D; ≤, ⊥, ·,ǫ) a subalgebra of (Ψ, D; ≤, ⊥, ·,ǫ). Let Γ be a generalised random variable in (Ψ, D; ≤, ⊥, ·,ǫ). Then, for any directed set X ⊆ Ψ,

ρΓ( X)= ρΓ(ψ). (10.14) _ ψ^∈X Proof. We prove first the identity

ρ∆(φ)= {ρ∆(ψ) : ψ ∈ Ψf , ψ ≤ φ}. (10.15) for simple random variables^ ∆. Using the convention introduced at the beginning of the section, we write ∆([Bi]) = ψi, where the [Bi] form a partition of B for i = 1,...,n. Then its a.o.p is given by ρ∆(ψ) = ∨{[Bi] : ψ ≤ ψi} (see (10.8). Using this,we obtain

{ρ∆(ψ) : ψ ∈ Ψf , ψ ≤ φ} = {∨ψ≤ψi [Bi] : ψ ∈ Ψf , ψ ≤ φ} ^ ^ Since the partition [Bi] of B is finite, the join on the right hand side ex- tends for every ψ only over a finite number of elements [Bi]. Further, as ψ increases, the number of these elements can only decrease. But in ρ∆(φ) = ∨{[Bi] : φ ≤ ψi} also only a finite number of elements [Bi] ap- pear and this number must be less or equal to the number for any ψ ≤ φ. So, as ψ increases towards φ, a minimal number of elements must be at- tained for some ψ0 ≤ φ. Say this number is m and assume that the ele- ments are numbered as [B1],..., [Bm]. Then we conclude that the infimum 10.2. RANDOM MAPPINGS AND ALLOCATIONS 193

m {ρΓ(ψ) : ψ ∈ Ψf , ψ ≤ φ} equals ∨i=1[Bi]. Now, for all ψ such that ψ0 ≤ ψ ≤ φ we have ψ ≤ ψ1,...,ψm. Since φ = ψ, we conclude V ψ0≤ψ≤φ that φ ≤ ψ ,...,ψ . But this means that ρ (φ)= ∨m [B ] and this proves 1 m ∆ W i=1 i (10.15). Next, we extend (10.15) to any generalised random variables Γ= {∆ : ∆ ∈ R , ∆ ≤ Γ}. For this purpose we use the distributive law in the s W complete Boolean algebra B:

ρΓ(φ)

= {ρ∆(φ):∆ ∈ Rs, ∆ ≤ Γ}

= _{ {ρ∆(ψ) : ψ ∈ Ψf , ψ ≤ φ} :∆ ∈ Rs, ∆ ≤ Γ}

= _{^{ρ∆(ψ):∆ ∈ Rs, ∆ ≤ Γ} : ψ ∈ Ψf , ψ ≤ φ}

= ^{ρ_Γ(ψ) : ψ ∈ Ψf , ψ ≤ φ} (10.16) ^ To conclude, let X ⊆ Ψ be directed. Consider ψ ∈ X. Then ψ ≤

X, hence ρΓ(ψ) ≥ ρΓ( X), and it follows that ψ∈X ρΓ(ψ) ≥ ρΓ( X). Further, if η is a finite element and η ≤ X, then there is a ψ ∈ X such W W V W that η ≤ X. This implies that ρ (η) ≥ ρ (ψ). From this we conclude, using Γ Γ W (10.16)

ρΓ( X)

=_ {ρΓ(η) : η ∈ Ψf , η ≤ X}

≥ ^ ρΓ(ψ). _ ψ^∈X This proves finally (10.14). ⊓⊔ Following (Shafer, 1979) we call an allocation of probability satisfying (10.14) condensable. Thus, the a.o.ps associated with generalised random variables are condensable. Next we examine the case of random variables and their allocations of probability. According to Section 9.3, random variables Γ are ideals in ∞ IRs and as random mappings Γ(ω) = i=1 ∆i(ω), where ∆i are simple random variables, they map into I , or more precisely into σ(Ψ) ⊆ I . Ψ W Ψ This is equivalent to looking at an algebraic information algebra (Ψ, D; ≤ , ⊥, ·,ǫ) and considering random variables on the finite elements Ψf . By the Representation Theorem 7.5 the information algebra (Ψ, D; ≤, ⊥, ·,ǫ) is isomorphic to the ideal completion (IΨf , D; ≤, ⊥, ·,ǫ) of the subalgebra of the finite elements (Ψf , D; ≤, ⊥, ·,ǫ). In the sequel, we consider this case. A random variable Γ is then the join (or the limit) of a monotone non- decreasing sequence of simple random variables ∆i with ∆1 ≤ ∆2 ≤ . . ., ∞ Γ = i=1 ∆i. The simple random variables take values in Ψf , and the W 194 CHAPTER 10. ALLOCATIONS OF PROBABILITY random variable Γ in Ψ. By Lemma 9.6 a random variable Γ is a also a gen- eralised random variable. Therefore Theorem 10.9 applies also to random variables. So, the mapping Γ 7→ ρΓ is a homomorphism of the information algebra (Rσ, D; ≤, ⊥, ·,ǫ) of random variables into the information algebra (AΨ, D; ≤, ⊥, ·,ǫ) of a.o.ps. We are going to show more, namely that the map Γ 7→ ρΓ is a σ- homomorphism from the σ-information algebra (Rσ, D; ≤, ⊥, ·,ǫ) into the information algebra (AΨ, D; ≤, ⊥, ·,ǫ).

Theorem 10.11 Suppose (Ψ, D; ≤, ⊥, ·,ǫ) to be an idempotent generalised information algebra, and Γi ∈ Rσ for i = 1, 2,.... Then

∞ ρ ∞ = ρ . (10.17) Wi=1 Γi Γi i_=1

Proof. Since the mapping Γ 7→ ρΓ is a homomorphism, it maintains ∞ order. As a random variable, Γ equals i=1 ∆i, where ∆i form a monotone sequence of simple random variables. Since Γ is also a generalised random W variable, we have by (10.13) ρΓ = {ρ∆ : ρ∆ ∈ Rs, ∆ ≤ Γ}. The monotone sequence ∆ is directed in R. By compactness there is for every ∆ ≤ Γ an i W index j so that ∆ ≤ ∆j. This implies ρ∆ ≤ ρ∆j from which it follows that ∞ ρΓ ≤ i=1 ρ∆i . The converse inequality is evident. So we conclude that W ∞

ρΓ = ρ∆i (10.18) i_=1 ∞ if Γ= i=1 ∆i. Consider now the generalised random variables Γi for i = 1, 2,... and W∞ ∞ Γ = i=1 Γi. Let Γi = j=1 ∆i,j, where for every i = 1, 2,... the sequence ∆ , ∆ ,... is a monotone sequence of simple random variables. Then i,1 W i,2 W ∞ ∞ Γ= ∆i,j. i_=1 j_=1 i h In the standard way, we define ∆i = ∨h=1 ∨j=1 ∆h,j. The ∆i form a mono- ∞ tone sequence of simple random variables and Γ = i=1 ∆i. By (10.18), and the homomorphism between simple random variables and their a.o.ps W we obtain ∞ ∞ i h ρΓ = ρ∆i = ∨h=1 ∨j=1 ρ∆i,j i_=1 i_=1   ∞ ∞ ∞ = ρ = ρ .  ∆i,j  Γi i_=1 j_=1 i_=1   10.2. RANDOM MAPPINGS AND ALLOCATIONS 195

This proves (10.17). ⊓⊔ As a preparation for an interpretation of this result, we remark that for a σ-information algebra the following general result holds:

Lemma 10.1 Suppose (Ψ, D; ≤, ⊥, ·,ǫ) to be a σ-information algebra and Γ a random mapping. Then

∞ ∞ sΓ( ψi)= sΓ(ψi). (10.19) i_=1 i\=1 Proof. We have

∞ ∞ sΓ( ψi)= {ω ∈ Ω : ψi ≤ Γ(ω)}. i_=1 i_=1 ∞ Let ψ = i=1 ψi. Since ψi ≤ ψ we conclude that sΓ(ψ) ⊆ sΓ(ψi), hence ∞ ∞ sΓ(ψ) ⊆ i=1 sΓ(ψi). On the other hand, consider ω ∈ i=1 sΓ(ψi), that is W ∞ ψi ≤ Γ(ω) for all i = 1, 2,.... Then we have i=1 ψi = ψ ≤ Γ(ω), hence T ∞ T ω ∈ s (ψ). This shows that s (ψ) ⊇ s (ψ ) and this proves (10.19). Γ Γ i=1 Γ Wi ⊓⊔ T Since for any random variable Γ and every ψ ∈ Ψ, we have ρΓ(ψ) = ρ0(sΓ(ψ)) and the mapping ρ0 is a σ-homomorphism from the power set of Ω onto B (see Theorem 9.2) it follows also from (10.19)

∞ ∞ ρΓ( ψi)= ρΓ(ψi). i_=1 i^=1 An allocation of probability, which satisfies this identity is called a σ-allocation of probabiilty. Thus, a random variables induces a σ-a.o.p Let Aσ denote the image of Rσ under the mapping Γ 7→ ρΓ in AΨ. Next we show that continuity of extraction is also satisfied in the algebra (Aσ, D; ≤, ⊥, ·,ǫ):

Theorem 10.12 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra, and Γi ∈ Rσ for i = 1, 2,... a monotone sequence of random variables, Γ1 ≤ Γ2 ≤ . . .. Then for very x ∈ D,

∞ ∞

ǫx( ρΓi )= ǫx(ρΓi ). (10.20) i_=1 i_=1 Proof. The proof is based on the continuity of extraction in the σ- information algebra (Rσ, D; ≤, ⊥, ·,ǫ) of random variables, see Theorem 9.5,

∞ ∞ ǫx( Γi)= ǫx(Γi). i_=1 i_=1 196 CHAPTER 10. ALLOCATIONS OF PROBABILITY

Take the a.o.p of both sides. Using the fact that the mapping is a ho- momorphism of generalised random variables, Theorem 10.9, and Theorem 10.11, this leads on the left hand to

∞ ρ ∞ = ǫ (ρ ∞ )= ǫ ( ρ ). ǫx(Wi=1 Γi) x Wi=1 Γi x Γi i_=1 On the right hand side we obtain by the same argument

∞ ∞ ρ ∞ = ρ = ǫ (ρ ) Wi=1 ǫx(Γi) ǫx(Γi) x Γi i_=1 i_=1 This proves the identity (10.20). ⊓⊔ What can be said about the mapping Γ 7→ ρΓ for random mappings Γ in general? Let (Ψ, D; ≤, ⊥, ·,ǫ) be an information algebra, (Ω, A, P ) a probability space and Γ : Ω → Ψ a random mapping. The mapping Γ 7→ ρΓ is obviously order-preserving: Γ1 ≤ Γ2 means that Γ1(ω) ≤ Γ2(ω) for all

ω ∈ Ω. This implies that sΓ1 (ψ) ⊆ sΓ2 (ψ) for all ψ ∈ Ψ, and from this it follows that ρΓ1 (ψ) = ρ0(sΓ1 (ψ)) ≤ ρ0(sΓ2 (ψ)) = ρΓ1 (ψ) for all ψ ∈ Ψ, hence ρΓ1 ≤ ρΓ2 . But the mapping is no more a homomorphism. In fact, let Γ1 and Γ2 be two random mappings. Then the support of the combination of these random mappings is

sΓ1·Γ2 (ψ) = {ω ∈ Ω : ψ ≤ Γ1(ω) · Γ2(ω)}

= {ω : ψ1 ≤ Γ1(ω), ψ2 ≤ Γ2(ω), ψ ≤ ψ1 · ψ2} [ = {sΓ1 (ψ1) ∩ sΓ2 (ψ2) : ψ ≤ ψ1 · ψ2}. [ Note that for any index set I, Hi ⊆ i∈I Hi, hence ρ0(Hi) ≤ ρ0( i∈I Hi) and therefore ρ (H ) ≤ ρ ( H ). This implies then for all ψ ∈ Ψ i∈I 0 i 0 i∈I Si S W S ρΓ1·Γ2 (ψ) = ρ0( {sΓ1 (ψ1) ∩ sΓ2 (ψ2) : ψ ≤ ψ1 · ψ2}) [ ≥ {ρ0(sΓ1 (ψ1) ∩ sΓ2 (ψ2)) : ψ ≤ ψ1 · ψ2} _ = {ρ0(sΓ1 (ψ1)) ∧ ρ0(sΓ2 (ψ2)) : ψ ≤ ψ1 · ψ2} _ = {ρΓ1 (ψ1)) ∧ ρΓ2 (ψ2)) : ψ ≤ ψ1 · ψ2}

=_ (ρΓ1 · ρΓ2 )(ψ). (10.21)

So, we have ρΓ1·Γ2 ≥ ρΓ1 · ρΓ2 . Equality holds only in particular cases, like for instance for generalised or σ-random variables. Since ρΓ1·Γ2 allocates more probability to a hypothesis ψ ∈ Ψ than ρΓ1 · ρΓ2 does, it seems that by the map to the allocation of probability some information is lost in general. 10.2. RANDOM MAPPINGS AND ALLOCATIONS 197

Consider also extraction, that is a random mapping Γ and x ∈ D. Then, since (ǫx(Γ))(ω)= ǫx(Γ(ω)),

sǫx(Γ)(ψ)= {ω ∈ Ω : ψ ≤ ǫx(Γ(ω))}

= {sΓ(φ) : φ = ǫx(φ), ψ ≤ φ}. [ (10.22)

Thus, we obtain for the a.o.p of ǫx(Γ),

ρǫx(Γ)(ψ)= ρ0( {sΓ(φ) : ψ ≤ φ = ǫx(φ)})

≥ {ρ0(sΓ[(φ)) : ψ ≤ φ = ǫx(φ)}

= _{ρΓ(φ) : ψ ≤ φ = ǫx(ψ)}

= (_ǫx(ρΓ))(ψ).

So, here we find that ρǫx(Γ) ≥ ǫx(ρΓ) and again equality holds only in partic- ular cases. This is a second indication that the random mapping Γ contains more information than its a.o.p ρΓ. 198 CHAPTER 10. ALLOCATIONS OF PROBABILITY Chapter 11

Support Functions

11.1 Characterisation

As we have noted in Chapter 9, we may consider a random mapping Γ as information, that is, Γ(ω) is a“piece of information”, which can be asserted, provided ω is the sample element chosen by a chance process, or the “cor- rect” assumption in a set of possible assumptions Ω. Here, information Γ(ω) may either be an element of the set Ψ of an idempotent generalised infor- mation algebra (Ψ, D; ≤, ⊥, ·,ǫ) or else an ideal of Ψ, hence an element of the ideal completion (IΨ, D; ≤, ⊥, ·,ǫ) of (Ψ, D; ≤, ⊥, ·,ǫ). We have defined the allocation of support sΓ(ψ) of a random mapping as the set of elements ω ∈ Ω, which imply ψ, i.e. such that ψ belongs to the ideal Γ(ω), ψ ∈ Γ(ω) or ψ ≤ Γ(ω), see Sections 9.1 and 9.2. Any ω ∈ sΓ(ψ) is an assumption, i.e. an argument, which permits to infer the piece of information ψ in the light of the random mapping Γ. So, the larger the set sΓ(ψ), the more arguments are available to support ψ. Or, more to the point, the more probable, the more likely it is that the correct, but unknown assumption ω belongs to sΓ(ψ), the stronger the hypothesis ψ is supported. This probability was denoted by spΓ(ψ) and called the degree of support of a hypothesis allocated by a ran- dom mapping Γ. We refer to Chapter 9 for this point of view. The degrees of support can be seen as a numerical map or function spΓ :Ψ → [0, 1] of Ψ into the unit interval. The goal of this chapter is to study this function. We do not exclude in this chapter that Γ(ω) = 0 for some ω. This represents improper information, which can be interpreted as contradictory information. Under semantic aspects such improper information could and should be excluded. We refer to Section 9.1 for a discussion of this issue in the context of simple random functions. But for the present discussion this is not essential. If Γ(ω) 6= 0 for all ω, the random mapping is called normalised. Consider then a random mapping Γ : Ω → Ψ from a probability space (Ω, A, P ) into an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ).

199 200 CHAPTER 11. SUPPORT FUNCTIONS

The corresponding support is defined for any ψ ∈ Ψ as

sΓ(ψ)= {ω ∈ Ω : ψ ≤ Γ(ω)}.

The set sΓ thus contains all assumptions ω for which Γ(ω) implies ψ. The following theorem collects a few elementary properties of the mapping sΓ : Ψ → P(Ω) (see also Theorem 9.1):

Theorem 11.1 If Γ:Ω → Ψ, then

1. sΓ(1) = Ω,

2. If φ ≤ ψ, then sΓ(ψ) ⊆ sΓ(ψ),

3. sΓ(φ · ψ)= sΓ(φ) ∩ sΓ(ψ) for all ψ, ψ ∈ Ψ,

4. if Γ is normalised, then sΓ(0) = ∅.

Proof. (1) follows since 1 is the least element in Ψ, hence 1 ≤ Γ(ω) for all ω ∈ Ω. (2) is obvious. (3) follows, since φ, ψ ≤ Γ(ω) if and only if φ · ψ ≤ Γ(ω) and (4) follows from the definition of a normalised random mapping. ⊓⊔ Sometimes Ψ may be a σ-semilattice or even a complete lattice, for in- stance, if (Ψ, D; ≤, ⊥, ·,ǫ) is an algebraic or continuous information algebra. Then something more can be said about the support of a random mapping.

Theorem 11.2 Let Γ:Ω → Ψ be a random mapping.

1. If Ψ is a σ-semilattice, ψ1, ψ2,... ∈ Ψ, then

∞ ∞ sΓ( ψi)= sΓ(ψi). (11.1) i_=1 i\=1 2. If Ψ is a complete lattice, X ⊆ Ψ, then

sΓ( X)= sΓ(ψ). (11.2) _ ψ\∈X ∞ Proof. (1) We have ψ1, ψ2,... ≤ Γ(ω) if and only if i=1 ψi ≤ Γ(ω). This implies (11.1). W (2) Similarly, we have ψ ≤ Γ(ω) for all ψ ∈ X if and only if X ≤ Γ(ω) and this implies (11.2). ⊓⊔ W We want to make use of the probability space (Ω, A, P ) to judge the likelihood that a random mapping Γ supports a hypothesis ψ ∈ Ψ. The degree of support spΓ(ψ) of an element ψ ∈ Ψ is measured by the probability of its support sΓ(ψ), provided this probability is defined. But this is the case only if sΓ(ψ) ∈A. Therefore, we define: 11.1. CHARACTERISATION 201

Definition 11.1 If Γ:Ω → Ψ is a random mapping from a probability space (Ω, A, P ) into an information algebra (Ψ, D; ≤, ⊥, ·,ǫ), then ψ ∈ Ψ is called Γ-measurable, if sΓ(ψ) ∈A.

The set of all Γ-measurable elements ψ ∈ Ψ will be denoted by EΓ.

Theorem 11.3 For any random mapping Γ, EΓ is a subsemilattice of Ψ, containing 1; if Γ is normalised, then 0 belongs to EΓ too. Further, if Ψ is a σ-semilattice, then EΓ is a σ-semilattice.

Proof. The first part of the theorem follows from the definition of EΓ and Theorem 11.1. The second part follows from Theorem 11.2. ⊓⊔ On the semilattice EΓ we define spΓ(ψ) = P (sΓ(ψ)). Thus, spΓ is a function with values in [0, 1], defined on EΓ. This function is called the support function of the random mapping Γ. The next theorem collects the basic properties of this function.

Theorem 11.4 Let Γ be a random mapping from the probability space (Ω, A, P ) into the idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ), and spΓ the associated support function, defined on EΓ. Then spΓ has the fol- lowing properties:

1. spΓ(1) = 1.

2. If ψ1,...,ψm ≥ ψ, ψ1,...,ψm, ψ ∈EΓ,

|I|+1 spΓ(ψ) ≥ (−1) spΓ(∨i∈I ψi). (11.3) ∅6=I⊆{X1,...,m}

3. If EΓ is a σ-semilattice, and if ψ1 ≤ ψ2 ≤ . . . ∈EΓ, then ∞ spΓ( ψi) = lim spΓ(ψi). (11.4) i→∞ i_=1

4. If Γ is normalised, then spΓ(0) = 0.

Proof. (1) and (4) follow from Theorem 11.1 (1) and (4). (2) Note that by Theorem 11.1 (2) we have spΓ(∨i∈I ψi)= P (sΓ(∨i∈I ψi)) = P (∩i∈I sΓ(ψi)) for a finite index set I. On the right hand side of (11.3) we have then by the inclusion-exclusion formula of probability theory,

|I|+1 m (−1) P (∩i∈I sΓ(ψi)) = P (∪i=1sΓ(ψi)). ∅6=I⊆{X1,...,m}

But ψ ≤ ψ1,...,ψm implies sΓ(ψ) ⊇ sΓ(ψi), hence

m sΓ(ψ) ⊇∪i=1sΓ(ψi) 202 CHAPTER 11. SUPPORT FUNCTIONS

This implies (11.3) ∞ ∞ (3) In this case i=1 ψi ∈EΓ. Further, by Theorem 11.2, spΓ( i=1 ψi)= P (s ( ∞ ψ )) = P ( ∞ s (ψ )). Now, ψ ≤ ψ ≤ . . . implies s (ψ ) ⊇ Γ i=1 i W i=1 Γ i 1 2 W Γ 1 sΓ(ψ2) ⊇ . . . (Theorem 11.1 (2)). By the continuity of probability it follows W ∞ T that P ( i=1 sΓ(ψi)) = limi→∞ P (sΓ(ψi)). This proves (11.4). ⊓⊔ As a consequence we deduce from (2) of the theorem above that for φ ≤ ψ T we have spΓ(ψ) ≤ spΓ(φ). Thus the function spΓ is monotone. In fact a function satisfying property (2) of the theorem above is called monotone of order ∞ (Choquet, 1953–1954; Choquet, 1969). In Section 9.2 we proposed to extend the support function of a random mapping Γ beyond the measurable elements by spΓ(ψ) = µ(ρΓ(ψ)), where ρΓ(ψ) = ρ0(sΓ(ψ)) is the allocation of probability associated with the ran- dom mapping Γ and (µ, B) is the probability algebra associated with the probability space (Ω, A, P ). Now, any allocation of probability ρ : B → Ψ generates a function sp = µ ◦ ρ which satisfies properties (1) and (2) of Theorem 11.4 as stated in Theorem 11.5 below. Therefore, in particular the function spΓ = µ ◦ ρΓ, which is defined on Ψ, and even IΨ has the properties stated in Theorem 11.4.

Theorem 11.5 Let (µ, B) be a probability algebra, ρ :Ψ →B an allocation of probability, and sp = µ ◦ ρ.

1. sp satisfies properties (1) and (2) of Theorem 11.4

2. If Ψ is a σ-semilattice and and if for all ψ1, ψ2,...

∞ ∞ ρ( ψi)= ρ(ψi), i_=1 i^=1 then (3) of Theorem 11.4 holds.

3. If Ψ is a complete lattice and if for any directed set X ⊆ Ψ

ρ( X)= ρ(ψ), _ ψ^∈X then

sp( X)= inf sp(ψ). (11.5) ψ∈X _ Proof. (1) and (2) are proved as in the proof of Theorem 11.4. (3) The set {ρ(ψ) : ψ ∈ X} is downwards directed if X is directed. Therefore, by Lemma 9.1

µ(ρ( X)) = µ( ρ(ψ)) = inf µ(ρ(ψ)). ψ∈X _ ψ^∈X 11.1. CHARACTERISATION 203

This proves (11.5). ⊓⊔ Next, we consider algebraic information algebras (Ψ, D; ≤, ⊥, ·,ǫ), with finite elements Ψf . We always suppose that (Ψf , D; ≤, ⊥, ·,ǫ) is a subalge- bra of (Ψ, D; ≤, ⊥, ·,ǫ) so that Ψ can be considered as the ideal completion

IΨf of Ψf (see Theorem 7.5). In other words, the results to be derived below apply also to the ideal completion IΨ of any idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ). In this context we remind that a gen- eralised random variable Γ is the supremum of the simple random variables it dominates, Γ = {∆:∆ ∈ Rs, ∆ ≤ Γ}. Simple random variables are here and in the sequel always assumed to take finite elements as values, that W is ∆(ω) ∈ Ψf for all ω. In such a case, the support function of a generalised random variable can be approximated by its values for finite elements.

Theorem 11.6 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra, with Ψf as finite elements and Γ a generalised random variable with values in (Ψ, D; ≤, ⊥, ·,ǫ). Further let spΓ = µ◦ρΓ, where ρΓ = ρ0 ◦sΓ (see (9.12)). Then for all ψ ∈ Ψ,

spΓ(ψ) = inf{spΓ(φ) : φ ∈ Ψf , φ ≤ ψ}. (11.6)

Furthermore, if X ⊆ Ψ is directed, then

spΓ( X)= inf spΓ(ψ). (11.7) ψ∈X _ Proof. Note that (11.6) is a particular case of (11.7). By Theorem 10.10 we have ρΓ( X) = ψ∈X ρΓ(ψ). Then (11.7) follows from Theorem 11.5 (3). ⊓⊔ W V ∞ In the same framework, if Γ = i=1 ∆i is a random variable defined by a sequence family of simple random variables ∆ , ∆ ,..., then the degree of W 1 2 support of any element in σ(Ψf ) may be obtained as a limit of the degrees ∞ of support of finite elements. In fact, if ψ ∈ σ(Ψf ), then ψ = i=1 ψi, where ψ ∈ Ψ (Theorem 9.3). We may always assume that the sequence ψ is i f W i monotone, ψ1 ≤ ψ2 ≤ . . .. Then this sequence is a directed set in Ψ and Theorem 11.6 applies. But due to the monotonicity of the sequence, we ∞ have inf{spΓ(ψi) : i = 1, 2,...} = limi→∞ spΓ(ψi). So, if ψ = i=1 ψi and ψ ≤ ψ ≤ . . . ∈ Ψ , then 1 2 f W

spΓ(ψ) = lim spΓ(ψi). (11.8) i→∞ The degree of support of a random variable can in some cases also be approximated by the degrees of support of the simple random variables which approximate the random variable.

Theorem 11.7 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent generalised informa- ∞ tion algebra and (σ(Ψ), D; ≤, ⊥, ·,ǫ) its σ-extension in IΨ. If Γ= i=1 ∆i, W 204 CHAPTER 11. SUPPORT FUNCTIONS where ∆i are simple random variables with values in Ψ, is a random vari- able defined on the probability space (Ω, A, P ) with values in σ(Ψ), then all elements ψ ∈ Ψ are Γ-measurable, EΓ = Ψ. Furthermore, if the ∆i form a monotone increasing sequence of simple random variables, then for all ψ ∈ Ψ,

spΓ(ψ) = lim sp∆ (ψ). (11.9) i→∞ i

∞ Proof. If Γ is a random variable defined by Γ = i=1 ∆i, we may always assume that the ∆ form a monotone sequence of simple random variables. i W Consider any ψ ∈ Ψ and its support sΓ(ψ) relative to the random variable ∞ Γ. Then ∆i ≤ Γ implies s∆i (ψ) ⊆ sΓ(ψ), hence i=1 s∆i (ψ) ⊆ sΓ(ψ). On the other hand we have S ∞ sΓ(ψ)= {ω ∈ Ω : ψ ≤ ∆i(ω)}. i_=1 Consider an ω ∈ sΓ(ψ). As a monotone sequence, the ∆i(ω) form a di- rected set. Its supremum Γ(ω) belongs to the algebraic information algebra (IΨ, D; ≤, ⊥, ·,ǫ), whose finite elements are given by Ψ. Therefore, by com- pactness, there must be an index i such that ψ ≤ ∆i(ω), hence ω ∈ s∆i (ψ). ∞ But this shows that sΓ(ψ) ⊆ i=1 s∆i (ψ), hence S ∞

sΓ(ψ)= s∆i (ψ). (11.10) i[=1

Now, s∆i (ψ) is measurable for all i, hence sΓ(ψ) is so too. This proves the first part of the theorem.

If the sequence of the ∆i is monotone increasing, then so is s∆i (ψ) for any ψ ∈ Ψ. Then (11.9) follows from (11.10) and the continuity of probability. ⊓⊔ Another approximation of degrees of support by the degrees of support of simple random variables can be stated for generalised random variables.

Corollary 11.1 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an idempotent generalised informa- tion algebra and Γ a generalised random variable in Ψ. Then, for all ψ ∈ Ψ,

spΓ(ψ) = sup{sp∆(ψ):∆ ∈ Rs, ∆ ≤ Γ}. (11.11)

Proof. We have by Theorem 10.8 that

ρΓ(ψ)= {ρ∆(ψ):∆ ≤ Γ}. _ Here, as in the sequel, ∆ always denote simple random variables. Let (µ, B) be the probability algebra associated with the probability space on which Γ is 11.1. CHARACTERISATION 205 defined. Then spΓ = µ◦ρΓ. The set {ρ∆(ψ):∆ ≤ Γ} is downwards directed in B. Therefore, by Lemma 9.1, we conclude that spΓ(ψ) = µ(ρΓ(ψ)) = sup{µ(ρ∆(ψ)) :∆ ≤ Γ} = sup{sp∆(ψ):∆ ≤ Γ}. ⊓⊔ We are in this chapter going to study functions monotone of order ∞, satisfying properties (1) and (2) from Theorem 11.4 above. As we have seen, such functions do arise from random mappings in different ways and also from allocations of probability. Therefore, we define a corresponding class of functions.

Definition 11.2 Let E be a join-semilattice with a least element 1. Then a function sp : E →[0,1] satisfying (1) and (2) below is called a support function on E:

1. sp(1) = 1.

2. If ψ1,...,ψm ≥ ψ, ψ1,...,ψm, ψ ∈E,

|I|+1 sp(ψ) ≥ (−1) sp(∨i∈Iψi). (11.12) ∅6=I⊆{X1,...,m}

3. If in addition E is closed under countable joins, and for any montone sequence ψ1 ≤ ψ2 ≤··· the condition

∞ sp( ψi) = lim sp(ψi) (11.13) i→∞ i_=1 holds, then sp is called a continuous support function of E.

4. If further E is a complete semilattice and for any directed set X ⊆E,

sp( X)= inf sp(ψ) (11.14) ψ∈X _ holds, then sp is called a condensable support function on E.

So, for any random mapping Γ, the function spΓ is a support function on EΓ. Random variables Γ have continuous support functions spΓ and the support functions spΓ = µ ◦ ρΓ of generalised random variables Γ are condensable on Ψ, if (Ψ, D; ≤, ⊥, ·,ǫ) is an algebraic information algebra. We are going to study such support functions. The first question we are going to examine, is whether any support function can be obtained as the support function of a random mapping. This question will be addressed in the next section. Further, if a support function is defined on some sub-semilattice E of an information algebra (Ψ, D; ≤, ⊥, ·,ǫ), how can this function be extended to all of Ψ? This question will be studied in Section 11.3. 206 CHAPTER 11. SUPPORT FUNCTIONS

11.2 Generating Support Functions

Any random mapping Γ from some probability space (Ω, A, P ) into an in- formation algebra (Ψ, D; ≤, ⊥, ·,ǫ) generates a support function spΓ on the join-semilattice EΓ ⊆ Ψ of its Γ-measurable elements. We remind that EΓ contains at least the element 1 of Ψ. Now, suppose that E is a join-semilattice containing a least element 1 and that sp : E → R is a support function ac- cording to Definition 11.2 in the previous section. In fact, we shall always consider E as a sub-semilattice of some idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·,ǫ). Is there a random mapping Γ into Ψ such that its support function spΓ coincides with sp on E? We show in this section that the answer is affirmative, with the small amendment, that the mapping is into the ideal completion IΨ of Ψ rather than into Ψ itself. This result is based on the Theorem of Krein-Milman which states that in a locally convex which is Hausdorff, any compact convex set S is the closure of the convex hull of its extreme points (Phelps, 2001). The set S consists in our case of the support functions in the space of real- valued functions on E. We shall use a result of Choquet on the extreme points of monotone functions of order ∞ (Choquet, 1953–1954). In fact, the theory presented here can be seen as part of Choquet’s theory of capacities, and illustrates in particular the connection of capacities to probability. Let E be a join-semilattice, containing the least element 1. Consider the vector space V of functions f : E → R with pointwise addition and scalar multiplication. It becomes a topological space with pointwise convergence. Since R is Hausdorff, so is V (Kelley, 1955). Define pψ(f)= |f(ψ)| for f ∈ V and ψ ∈E. Then pψ is a semi-norm, that is

1. it is positive semidefinite: pψ(f) ≥ 0 for all f ∈ V ,

2. it is positive homogneeous: pψ(λ · f)= λ · pψ(f), for all λ ≥ 0,

3. and it satisfies the triangle inequality: pψ(f + g) ≤ pψ(f)+ pψ(g). Therefore, V is a locally convex topological Hausdorff space. Now, let S denote the set of all support functions on E, which is a subset of V . The set S is obviously convex and closed in V . Furthermore, S is contained in the product space RE = {R : ψ ∈ E}. Define S[ψ] = {f(ψ) : f ∈ S}. These sets are bounded for all ψ ∈E and their closures S¯[ψ] Q are therefore compact. By Tychonov’s theorem (Kelley, 1955) the product {S¯[ψ] : ψ ∈ E} is compact and since S ⊆ {S¯[ψ] : ψ ∈ E}, S is compact too. Q Q Next we are going to apply the Krein-Milman theorem to the convex, compact set S. Here is the theorem: Theorem 11.8 Theorem of Krein-Milman: A convex, compact subset S of a locally convex Hausdorff space is the closed convex hull of its extreme points. 11.2. GENERATING SUPPORT FUNCTIONS 207

Before applying this theorem to our problem of finding a random mapping inducing a given support function, we transform the theorem into an integral representation, following (Phelps, 2001). As a preparation we need a further notion. Let P be a probability measure on S, that is, a nonnegative regular measure on the σ-algebra of Borel sets in S such that P(S) = 1. A point f ∈ V is said to be represented by P if for every continuous linear function h : V → R,

\[
h(f) = \int_S h(v)\, dP(v).
\]
We cite the following lemma from (Phelps, 2001):

Lemma 11.1 Let C be a compact subset of a locally convex topological space V. A point f ∈ V belongs to the closed convex hull H of C if and only if there is a probability measure P on C which represents f.

Now, with the aid of this lemma, we reformulate the Krein-Milman Theorem 11.8.

Theorem 11.9 Every point f of a convex, compact subset S of a locally convex Hausdorff space V is represented by a probability measure P on S which is supported by the closure of the set of extreme points ext(S) of S, i.e. $P(\overline{\mathrm{ext}(S)}) = 1$.

Proof. By the Krein-Milman Theorem 11.8, f ∈ S means that f belongs to the closure of the convex hull of the extreme points ext(S) of S. Clearly, the set of extreme points of S is bounded, and its closure is therefore compact. Hence, by Lemma 11.1, f is represented by a probability on the closure of the extreme points of S. ⊓⊔

What are the extreme points of the set S of support functions? This question is answered by Theorem 43.4 in (Choquet, 1953–1954). In this theorem Choquet considers functions alternating of order ∞. This means that in (11.12) of Definition 11.2 the reverse inequality holds. Now, if f is monotone of order ∞, then g(ψ) = f(1) − f(ψ) is alternating of order ∞. So there is a close relation between the two notions. Choquet further considers alternating functions on an ordered commutative semigroup with a zero-element with all elements greater than zero. This applies to our join-semigroup E, which, in addition, is an idempotent semigroup. If C is a convex cone in V and H is an affine subspace of V, not containing the zero function, and which meets every ray of C, then C ∩ H is a convex set, and f ∈ C ∩ H is an extreme point of this convex set if and only if f is an extremal point of the convex cone C. As a consequence of Theorem 43.4, Choquet states in Section 46 of (Choquet, 1953–1954) that the extremal points of the convex cone M of functions monotone of order ∞ are the exponentials on E, that is, functions e : E → R such that 0 ≤ e(ψ) ≤ 1 for all ψ ∈ E, and e(φ · ψ) = e(φ) × e(ψ) for all φ, ψ ∈ E.

Note now that item 1 of Definition 11.2 requires for a support function that f(1) = 1. This defines an affine hyperplane H in V, and M ∩ H is exactly the set of support functions on E. So its extreme points are the exponentials e on E with e(1) = 1. Since E is idempotent, we have for any exponential e(ψ) = e(ψ · ψ) = e(ψ) × e(ψ). Hence e(ψ) takes only the values 0 or 1. Let e_i for i = 1, 2, ... be a convergent sequence of exponentials on E, such that

e(ψ) = lim_{i→∞} e_i(ψ). Then e is a support function, since S is closed, and it is also an exponential on E. So the set of exponentials is both bounded and closed, hence compact. Define for an exponential e

Ie = {ψ ∈ E : e(ψ) = 1}. This is obviously an ideal in E, and any ideal I in E defines an exponential by e(ψ) = 1 if ψ ∈ I and e(ψ) = 0 otherwise. So there is a one-to-one relation between exponentials on E and ideals of E. We may identify the set of exponentials on E with the set IE of ideals in E. Fix a ψ ∈ E. Define, for f ∈ V, hψ(f) = f(ψ). This defines a continuous linear function hψ : V → R. Consider now any support function sp ∈ S. By the reformulated version of the Krein-Milman Theorem 11.9, sp is represented by a probability measure on the closed set of its extreme points, that is, the set of exponentials on E. Hence, we have

\[
sp(\psi) = h_\psi(sp) = \int_{\mathrm{ext}(S)} h_\psi(e)\, dP(e) = \int_{\mathrm{ext}(S)} e(\psi)\, dP(e),
\]
for some probability measure P supported by ext(S) and all ψ ∈ E. But, because e is a 0-1-function, this gives sp(ψ) = P{e : e(ψ) = 1}. Now we are nearly done. We consider the probability space (ext(S), B, P), where B denotes the Borel σ-algebra of subsets of ext(S) and P the probability introduced above. We now construct a mapping from ext(S) into (IΨ, D; ≤, ⊥, ·, ǫ), the ideal completion of the information algebra (Ψ, D; ≤, ⊥, ·, ǫ). Since E is supposed to be a sub-semilattice of Ψ, the ideal Ie associated with the exponential e can be extended to an ideal in Ψ, generally in many ways, for example by

Je = {ψ ∈ Ψ : ψ ≤ φ for some φ ∈ Ie}.

Then we define the random mapping Γ(e) = Je from the probability space (ext(S), B, P) into the information algebra (IΨ, D; ≤, ⊥, ·, ǫ). As usual, we consider Ψ as a subset of IΨ by the embedding ψ ↦ ↓ψ. Let ψ ∈ E. Then for the support of ψ by Γ we obtain

sΓ(ψ) = {e ∈ ext(S) : ψ ∈ Je} = {e ∈ ext(S) : ψ ∈ Ie} = {e ∈ ext(S) : e(ψ) = 1}.

As we have seen, the last set is measurable, that is, it belongs to B. Hence we see that all elements of E are Γ-measurable, E ⊆ EΓ. Further,

spΓ(ψ)= P (sΓ(ψ)) = P {e ∈ ext(S) : e(ψ) = 1} = sp(ψ).

So spΓ and sp coincide on E. In this sense sp is induced by the random mapping Γ; hence Γ generates sp. We should stress that the Γ defined above is not the unique random mapping generating sp. This issue will be addressed in Section 11.3.

Next we turn to continuous support functions. This time let E be a σ-join-semilattice, a semilattice closed under countable joins. Again, we assume E to be a sub-semilattice of some σ-information algebra (Ψ, D; ≤, ⊥, ·, ǫ). Let Sc denote the set of continuous support functions on E. As above, we argue that Sc is still a convex, compact subset of the function space V. Therefore, the revised Krein-Milman Theorem 11.9 still applies. Because the elements of Sc are still monotone of order ∞, Choquet's Theorem 43.4 (Choquet, 1953–1954) is also still applicable. The extreme elements of Sc are therefore again exponentials on E. But since they belong to Sc, they must be continuous exponentials. That is, if ψ1 ≤ ψ2 ≤ ... is a monotone sequence in E, then

\[
e\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \lim_{i\to\infty} e(\psi_i).
\]
Since e is a monotone 0-1 function, it follows that
\[
e\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \prod_{i=1}^{\infty} e(\psi_i).
\]
The set of extreme points ext(Sc) is again bounded and closed, hence compact. As above, define Ie = {ψ ∈ E : e(ψ) = 1}. This time Ie becomes a σ-ideal in E. Consider a continuous support function sp ∈ Sc. Define, as above, hψ(f) = f(ψ), a linear function from V into R. By Theorem 11.9 there exists a probability measure P on ext(Sc) such that

\[
sp(\psi) = h_\psi(sp) = \int_{\mathrm{ext}(S_c)} h_\psi(e)\, dP(e) = \int_{\mathrm{ext}(S_c)} e(\psi)\, dP(e).
\]

As above this gives

sp(ψ) = P{e ∈ ext(Sc) : e(ψ) = 1}.

So, again as above, we may define a random mapping from the probability space (ext(Sc), Bc, P) into the ideal completion (IΨ, D; ≤, ⊥, ·, ǫ) of the information algebra (Ψ, D; ≤, ⊥, ·, ǫ) by Γ(e) = Je. Here Bc is the σ-field of Borel sets in ext(Sc). Note that in this case Je is a σ-ideal in Ψ. As above we verify that

spΓ(ψ) = P(sΓ(ψ)) = P{e ∈ ext(Sc) : e(ψ) = 1} = sp(ψ) for all ψ ∈ E. So Γ is a random mapping generating the continuous support function sp on E. To conclude this part, we formulate the main result of this section in the following theorem.

Theorem 11.10 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an idempotent generalised information algebra and E ⊆ Ψ a join-sub-semilattice of Ψ containing 1. If sp is a support function on E, then there exists a probability space (Ω, A, P) and a random mapping Γ from this space into the ideal completion (IΨ, D; ≤, ⊥, ·, ǫ) of (Ψ, D; ≤, ⊥, ·, ǫ), such that E ⊆ EΓ and its support function coincides with sp on E, that is spΓ(ψ) = sp(ψ) for all ψ ∈ E. If (Ψ, D; ≤, ⊥, ·, ǫ) is a σ-information algebra, E ⊆ Ψ a σ-semilattice and sp continuous, then there is a random mapping Γ generating sp, as in the first part of the theorem, which maps to σ-ideals of Ψ.

We remark that for continuous support functions there is an alternative approach to generating them from a random mapping, due to (Norberg, 1989).
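To make the correspondence between support functions, exponentials and ideals concrete, here is a minimal computational sketch in Python. The join-semilattice, the uniform distribution over its ideals, and all names are illustrative assumptions, not part of the theory above; the sketch computes sp(ψ) = P{e : e(ψ) = 1} = P{I : ψ ∈ I} and checks the monotonicity-of-order-∞ inequality (11.12) of Definition 11.2 by brute force.

```python
from itertools import chain, combinations
from fractions import Fraction

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# Toy join-semilattice E (an illustrative assumption): the subsets of {1, 2}
# ordered by inclusion, with join = union; the empty set plays the role of
# the least element 1 (vacuous information).
E = [frozenset(s) for s in powerset({1, 2})]

def is_ideal(I):
    """Ideals of E: contain the least element, downward- and join-closed."""
    S = set(I)
    return (frozenset() in S
            and all(x in S for y in S for x in E if x <= y)
            and all((x | y) in S for x in S for y in S))

ideals = [frozenset(I) for I in powerset(E) if is_ideal(I)]

# An arbitrary (assumed) probability distribution over the ideals; each
# ideal I corresponds to an exponential e with e(psi) = 1 iff psi in I.
P = {I: Fraction(1, len(ideals)) for I in ideals}

def sp(psi):
    """sp(psi) = P{ e : e(psi) = 1 } = P{ I : psi in I }."""
    return sum(p for I, p in P.items() if psi in I)

assert sp(frozenset()) == 1              # sp(1) = 1
# Monotonicity of order oo (11.12): for psi below every psi_i, sp(psi)
# dominates the inclusion-exclusion sum over joins of the psi_i.
for fam in map(list, powerset([e for e in E if e])):
    if not fam:
        continue
    bound = sum((-1) ** (len(sub) + 1) * sp(frozenset().union(*sub))
                for sub in map(list, powerset(fam)) if sub)
    for psi in E:
        if all(psi <= f for f in fam):
            assert sp(psi) >= bound
print({tuple(sorted(e)): sp(e) for e in E})
```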

11.3 Canonical Random Mappings

According to the previous Section 11.2, any support function can be generated by some random mapping. In this section we are going to examine the random mappings generating a given support function in more detail. In particular, we shall compare these random mappings and single out a particular one, which we shall call the canonical mapping.

Let (Ψ, D; ≤, ⊥, ·, ǫ) be an information algebra and E ⊆ Ψ a join-sub-semilattice of Ψ, containing 1. Consider a support function sp on E. According to the discussion in Section 11.2 there is a probability space (ext(S), A, P) on the set of exponentials ext(S) on E and a random mapping into the ideal completion of (Ψ, D; ≤, ⊥, ·, ǫ) defined by

ν(e) = Je = {ψ ∈ Ψ : ψ ≤ φ for some φ ∈ Ie},

where Ie is the ideal {ψ ∈ E : e(ψ) = 1} in E associated with the exponential e. Then we obtain for ψ ∈ E

sp(ψ) = P{e ∈ ext(S) : e(ψ) = 1},

which shows that the random mapping ν from ext(S) into the ideal completion IΨ of Ψ indeed generates the support function sp on E.

We noted in Section 11.2 that there is a one-to-one relation between exponentials e ∈ ext(S) on E and the ideals IE in E. To each exponential e corresponds the ideal Ie in E, and conversely, any ideal I of E defines an exponential eI by eI(ψ) = 1 if ψ ∈ I and eI(ψ) = 0 otherwise. We may therefore replace the probability space (ext(S), A, P) on ext(S) by an equivalent probability space (IE, A, P) on IE. By abuse of notation we denote the σ-fields and the probability measures in both spaces by the same symbols. The random mapping ν is then changed in the obvious way to ν(I) = {ψ ∈ Ψ : ψ ≤ φ for some φ ∈ I}.

We remarked in Section 11.2 that this random mapping ν is not the only one inducing the support function on E. Let's examine this in more detail. The restriction of an ideal I of Ψ to E is clearly an ideal of E. We define the mapping p : IΨ → IE by p(I) = I|E = I ∩ E; to each ideal in Ψ we associate its restriction to E. Then the inverse mapping p^{-1}(I) = {J ∈ IΨ : p(J) = I} induces a partition of IΨ. Consider any ideal J ∈ p^{-1}(I). Obviously we have ν(I) ⊆ J. Thus, ν(I) is the least ideal in p^{-1}(I). Consider any random mapping Γ from IE into the ideal completion IΨ of Ψ such that Γ(I) ∈ p^{-1}(I). Its allocation of support is, for ψ ∈ E,

sΓ(ψ) = {I ∈ IE : ψ ∈ Γ(I)} = {I ∈ IE : ψ ∈ I}. It follows that the random mapping Γ also induces the support function sp on E,

spΓ(ψ)= P (sΓ(ψ)) = P {I ∈ IE : ψ ∈ I} = sp(ψ).

Hence, ν is the minimal random mapping on IE generating sp. Let's pursue this observation. Consider the probability algebra (B, µ) associated with the probability space (IE, A, P) (see Section 9.2). We recall that the mapping ρν = ρ0 ◦ sν from Ψ into B is an allocation of probability (a.o.p) (see Section 9.2). This a.o.p, as every a.o.p on Ψ, induces a support function spν = µ ◦ ρ0 ◦ sν on Ψ (see Theorem 11.5), and its restriction to E equals sp. So spν is an extension of sp to Ψ. Now, for any random mapping Γ from IE into the ideal completion IΨ of Ψ such that Γ(I) ∈ p^{-1}(I), we have ν(I) ⊆ Γ(I). This implies for the allocations of support that sν(ψ) ⊆ sΓ(ψ), hence ρν(ψ) = ρ0(sν(ψ)) ≤ ρ0(sΓ(ψ)) = ρΓ(ψ), and for ψ ∈ E we have ρν(ψ) = ρΓ(ψ). It follows that

spν(ψ) = µ(ρ0(sν(ψ))) ≤ µ(ρ0(sΓ(ψ))) = spΓ(ψ).

We shall see later (Section 11.4) that the random mapping ν indeed generates the least extension of the support function sp on E to Ψ among all extensions. But before we turn to this question, we return to the random mappings generating sp on E.

Consider the family of sets {I ∈ IE : ψ ∈ I} for ψ ∈ E. All these sets belong to the σ-field A in the probability space (IE, A, P) used to define the random mapping ν generating the support function sp on E, and sp(ψ) = P{I ∈ IE : ψ ∈ I}. Let AE ⊆ A be the σ-field of subsets generated by the family of these subsets. Note that this σ-field depends only on the semilattice E, but not on sp itself. Denote the restriction of the probability measure P to AE by Psp. This probability depends on the support function sp, and thereby indirectly of course also on E. Consider the probability space (IE, AE, Psp). We remark that the random mapping ν, as well as the related mappings Γ considered above, still generate sp on E.

In order to facilitate comparisons between random mappings generating the support function sp on E, we transport probability from the set of ideals IE in E to the set IΨ of ideals in Ψ. The family of sets p^{-1}(A) for A ∈ AE forms a σ-field of subsets of IΨ, and by P(p^{-1}(A)) = Psp(A) a probability measure is defined on this σ-field. By abuse of notation, we denote the new probability space by (IΨ, AE, Psp). The random mapping ν from IE into the ideal completion of Ψ is redefined as ν(p(I)) for I ∈ IΨ. Again, we call this new mapping ν, that is,

\[
\nu(I) = \{\psi \in \Psi : \psi \le \varphi \text{ for some } \varphi \in p(I)\}. \tag{11.15}
\]

We call this random mapping ν, together with the associated probability space (IΨ, AE, Psp), the canonical random mapping generating the support function sp on the semilattice E. Any other random mapping Γ defined above on IE may similarly be redefined as Γ(p(I)).

We can now compare different extensions of support functions from E. Consider semilattices E1 and E2 such that E1 ⊆ E2 ⊆ Ψ and support functions sp1 and sp2 on E1 and E2 respectively, such that sp2 is an extension of sp1. Then these support functions have their canonical random mappings ν1 and ν2, defined on the probability spaces (IΨ, AE1, Psp1) and (IΨ, AE2, Psp2) respectively. The next theorem shows how these canonical random mappings are related.

Theorem 11.11 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an idempotent generalised information algebra and let ν1 and ν2, defined on the probability spaces (IΨ, AE1, Psp1) and (IΨ, AE2, Psp2), be the canonical random mappings associated with the support functions sp1 and sp2 on the semilattices E1 ⊆ E2 ⊆ Ψ. If sp2 is an extension of sp1, that is sp1 = sp2|E1, then

1. ν1 ≤ ν2, in the order of the information algebra of random mappings into (IΨ, D; ≤, ⊥, ·, ǫ),

2. AE1 ⊆AE2 ,

3. Psp1 = Psp2|AE1; on AE1 the two probability measures coincide.

4. spν1 (ψ) ≤ spν2 (ψ) for all ψ ∈ Ψ.

Proof. (1) By definition we have p1(I) = I|E1 and p2(I) = I|E2, hence p1(I) ⊆ p2(I). Therefore, from (11.15), we conclude that ν1(I) ⊆ ν2(I) for all I ∈ IΨ, hence ν1 ≤ ν2.

(2) Consider an element ψ ∈ E1 ⊆ E2. Then the allocations of support relative to ν1 and ν2, respectively, are

sν1 (ψ) = {I ∈ IΨ : ψ ∈ ν1(I)} = {I ∈ IΨ : ψ ∈ I|E1},

sν2 (ψ) = {I ∈ IΨ : ψ ∈ ν2(I)} = {I ∈ IΨ : ψ ∈ I|E2}.

But ψ ∈ I|E1 implies ψ ∈ I|E2. On the other hand, ψ ∈ E1 and ψ ∈ I|E2 imply ψ ∈ I|E2 ∩ E1 = I|E1. So we conclude that sν1(ψ) = sν2(ψ) for every ψ ∈ E1. Since AE1 is the σ-field generated by the allocations sν1(ψ) for ψ ∈ E1, and AE2 the one generated by sν2(ψ) for ψ ∈ E2 ⊇ E1, this shows that AE1 ⊆ AE2.

(3) To prove this claim, we use Dynkin's theorem (Billingsley, 1995). Dynkin calls a family of sets closed under finite intersections a π-system.

The family P of sets sν1(ψ) for ψ ∈ E1 is a π-system (see Theorem 11.1). The family L of sets A ∈ AE1 for which Psp1(A) = Psp2(A) is closed under complementation, and contains ∪iAi if the Ai form a countable family of disjoint sets in L. This is called a λ-system by Dynkin. From the considerations above, we conclude that P ⊆ L. The theorem of Dynkin states that if P is a π-system and L a λ-system, then P ⊆ L implies that the σ-closure of P is contained in L, that is σ(P) ⊆ L. In our case the σ-closure of P is AE1, hence we have AE1 ⊆ L, where L contains all sets of AE1 on which the two probabilities coincide. So, indeed, for all A ∈ AE1 we have Psp1(A) = Psp2(A).

(4) For any ψ ∈ Ψ we have (see (9.15)) spν1(ψ) = Psp1∗(sν1(ψ)) ≤ Psp2∗(sν2(ψ)) = spν2(ψ), because sν1(ψ) ⊆ sν2(ψ) and Psp1∗(A) ≤ Psp2∗(A) for any set A ⊆ IΨ. Therefore, spν1(ψ) ≤ spν2(ψ). ⊓⊔

This theorem shows in particular that the canonical random mapping associated with a support function sp on a semilattice E ⊆ Ψ is unique. It also permits the conclusion that spν is indeed the least extension of the support function sp from E to Ψ. Indeed, suppose that sp′ is any extension of sp to Ψ. Then sp′ is generated by a canonical random mapping ν′. According to Theorem 11.11 (4) we then have spν(ψ) ≤ spν′(ψ) = sp′(ψ). The last equality holds because sp′ is defined on Ψ. So we have

Corollary 11.2 If sp is a support function defined on a semilattice E ⊆ Ψ, then spν is the least extension of sp to Ψ, that is, spν ≤ sp′ for any support function sp′ on Ψ such that sp = sp′|E.
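The construction behind the canonical random mapping can also be checked mechanically on a small example. The following sketch (the algebra Ψ, the sub-semilattice E and all names are illustrative assumptions) enumerates the ideals of a toy Ψ and of E, forms the restriction map p(I) = I ∩ E, and verifies that ν(I), the downward closure of I in Ψ, is the least ideal in each cell p^{-1}(I) of the induced partition, as stated above.

```python
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# Toy instance (illustrative assumption): Psi = subsets of {1,2,3} under
# union; E = {0, {1}, {2,3}, {1,2,3}} is a sub-join-semilattice of Psi.
Psi = [frozenset(s) for s in powerset({1, 2, 3})]
E = [frozenset(), frozenset({1}), frozenset({2, 3}), frozenset({1, 2, 3})]

def ideals(base):
    """All ideals of base: contain bottom, downward- and join-closed."""
    out = []
    for cand in powerset(base):
        S = set(cand)
        if frozenset() in S \
           and all(x in S for y in S for x in base if x <= y) \
           and all((x | y) in S for x in S for y in S):
            out.append(frozenset(S))
    return out

p = lambda J: frozenset(x for x in J if x in E)    # restriction J|E
nu = lambda I: frozenset(x for x in Psi            # least ideal extending I
                         if any(x <= phi for phi in I))

for I in ideals(E):
    block = [J for J in ideals(Psi) if p(J) == I]  # the cell p^{-1}(I)
    assert nu(I) in block                          # nu(I) restricts to I
    assert all(nu(I) <= J for J in block)          # and is least in the cell
print("nu(I) is the least ideal in every cell p^{-1}(I)")
```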

We remark that a similar analysis can be done for σ-semilattices or complete lattices E and continuous or condensable support functions sp. However, more interesting is the case of algebraic information algebras (Ψ, D; ≤, ⊥, ·, ǫ). As usual, we assume that the finite elements (Ψf, D; ≤, ⊥, ·, ǫ) form a subalgebra of (Ψ, D; ≤, ⊥, ·, ǫ). We consider a support function sp defined on Ψf, hence E = Ψf. Since in this case the ideal completion

(IΨf, D; ≤, ⊥, ·, ǫ) is isomorphic to (Ψ, D; ≤, ⊥, ·, ǫ), we identify ideals I of Ψf with their suprema ⋁I ∈ Ψ. For the support function sp, we consider its canonical probability space (IΨf, AΨf, Psp). Instead of the canonical random mapping,

ν(I) = {ψ ∈ Ψ : ψ ≤ φ for some φ ∈ I}, we consider also the random mappings

\[
\sigma(I) = \Big\{\psi \in \Psi : \psi \le \bigvee_{i=1}^{\infty} \psi_i \text{ for some } \psi_i \in I\Big\}, \tag{11.16}
\]
\[
\gamma(I) = {\downarrow}\bigvee I. \tag{11.17}
\]
Both map IΨf into IΨ. We are going to examine the support functions on Ψ induced by these random mappings. We start with the random mapping σ. Here are its basic properties:

Lemma 11.2 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an algebraic information algebra, its finite elements (Ψf, D; ≤, ⊥, ·, ǫ) a subalgebra and σ the random map defined by (11.16). Then for an ideal I ∈ IΨf,

1. the ideal σ(I) is a σ-ideal in Ψ,

2. its restriction to Ψf equals I, σ(I)|Ψf = I,

3. the σ-ideal σ(I) is minimal among all σ-ideals in Ψ extending I.

Proof. (1) Assume ψ1, ψ2, ... ∈ σ(I). Then we have ψi ≤ ⋁_{j=1}^∞ ψ_{i,j} with ψ_{i,j} ∈ I for all i = 1, 2, ... and j = 1, 2, .... But then we obtain
\[
\bigvee_{i=1}^{\infty} \psi_i \le \bigvee_{i=1}^{\infty} \bigvee_{j=1}^{\infty} \psi_{i,j} = \bigvee_{h=1}^{\infty} \psi'_h,
\]
where ψ′_h = ⋁_{i=1}^{h} ⋁_{j=1}^{h} ψ_{i,j} ∈ I. This shows that ⋁_{i=1}^∞ ψi ∈ σ(I), hence σ(I) is indeed a σ-ideal in Ψ.

(2) Assume that ψ ∈ σ(I) and ψ ∈ Ψf. Then ψ ≤ ⋁_{i=1}^∞ ψi with ψi ∈ I for i = 1, 2, .... By the usual transformation, we may always assume that ψ1 ≤ ψ2 ≤ .... This monotone sequence is a directed set in Ψ. By compactness there exists a ψi such that ψ ≤ ψi. This shows that ψ ∈ I. But clearly I ⊆ σ(I); therefore we see that indeed the restriction of σ(I) to Ψf equals I.

(3) Consider a σ-ideal J whose restriction to Ψf equals I. Assume ψ ∈ σ(I). Then ψ ≤ ⋁_{i=1}^∞ ψi with ψi in I, hence in J. But then ⋁_{i=1}^∞ ψi ∈ J since J is a σ-ideal, therefore ψ ∈ J. This shows that σ(I) ⊆ J. Hence σ(I) is indeed minimal among the σ-ideals extending I. ⊓⊔

The random map σ generates a support function spσ = µ ◦ ρσ on Ψ, where as usual (µ, B) is the probability algebra associated with the probability space (IΨf, AΨf, Psp), and ρσ = ρ0 ◦ sσ. We are going to show that spσ is a continuous extension of sp. The key is the following lemma:

Lemma 11.3 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an algebraic information algebra, its finite elements (Ψf, D; ≤, ⊥, ·, ǫ) a subalgebra, σ the random map defined by (11.16), and sσ the allocation of support for the random map σ. Then, if ψi ∈ Ψ for i = 1, 2, ...,
\[
s_\sigma\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \bigcap_{i=1}^{\infty} s_\sigma(\psi_i).
\]

Proof. Since Ψ is a complete lattice, ⋁_{i=1}^∞ ψi ∈ Ψ, and
\[
s_\sigma\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \Big\{I \in I_{\Psi_f} : \bigvee_{i=1}^{\infty} \psi_i \le \bigvee_{i=1}^{\infty} \varphi_i,\ \varphi_i \in I\Big\}.
\]
If I ∈ sσ(⋁_{i=1}^∞ ψi), then clearly I ∈ sσ(ψi) for all i = 1, 2, .... Conversely, assume I ∈ sσ(ψi) for all i = 1, 2, .... Then we have ψi ≤ ⋁_{j=1}^∞ ψ_{i,j} with ψ_{i,j} ∈ I. This implies in the same way as in the proof of Lemma 11.2 that ⋁_{i=1}^∞ ψi ∈ σ(I), hence I ∈ sσ(⋁_{i=1}^∞ ψi), and this proves the lemma. ⊓⊔

As a consequence of this lemma, we find that
\[
\rho_\sigma\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \rho_0\Big(s_\sigma\Big(\bigvee_{i=1}^{\infty} \psi_i\Big)\Big) = \rho_0\Big(\bigcap_{i=1}^{\infty} s_\sigma(\psi_i)\Big) = \bigwedge_{i=1}^{\infty} \rho_0(s_\sigma(\psi_i)) = \bigwedge_{i=1}^{\infty} \rho_\sigma(\psi_i). \tag{11.18}
\]
The allocation of probability ρσ is thus a σ-a.o.p. By Theorem 11.5, spσ is a continuous support function extending sp on Ψf to Ψ. Since σ(I) is the least σ-ideal among all σ-ideals extending the ideal I of Ψf to Ψ, we conclude that spσ is also the minimal continuous support function among all continuous support functions extending sp from Ψf to Ψ,

\[
sp_\sigma(\psi) \le \widetilde{sp}(\psi), \quad \text{if } \widetilde{sp} \text{ is continuous and } \widetilde{sp}|\Psi_f = sp,
\]
for all ψ ∈ Ψ. Let's fix this result in the following theorem:

Theorem 11.12 Let (Ψ, D; ≤, ⊥, ·, ǫ) be an algebraic information algebra, whose finite elements (Ψf, D; ≤, ⊥, ·, ǫ) form a subalgebra, sp a support function defined on Ψf and σ the random map defined by (11.16). If (µ, B) is the probability algebra associated with the canonical probability space (IΨf, AΨf, Psp) and ρσ = ρ0 ◦ sσ, then spσ = µ ◦ ρσ is the minimal continuous extension of sp to Ψ among all continuous extensions.

We turn to the random mapping γ, defined in (11.17). This mapping is characterised as follows:

Lemma 11.4 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra, its finite elements (Ψf , D; ≤, ⊥, ·,ǫ) a subalgebra and γ the random mapping defined by (11.17). Then the ideal γ(I) is the minimal complete ideal in Ψ whose restriction to Ψf equals I, γ(I)|Ψf = I.

Proof. We have I ⊆ ↓⋁I ∩ Ψf. Consider then an element ψ ∈ ↓⋁I ∩ Ψf. From ψ ≤ ⋁I it follows by compactness, since I is a directed set, that there is a χ ∈ I such that ψ ≤ χ. But then ψ ∈ Ψf implies ψ ∈ I. This proves that γ(I)|Ψf = I.

As a principal ideal in a complete lattice, γ(I) is a complete ideal. Consider any other complete ideal J whose restriction to Ψf equals I. But then ⋁I ≤ ⋁J and J = ↓⋁J, hence γ(I) ⊆ J. This proves the minimality of γ(I). ⊓⊔

Consider now simple random variables ∆ on the canonical probability space (IΨf, AΨf, Psp). Any such random variable is defined by a measurable partition Bi ∈ AΨf, i = 1, ..., m, of IΨf and ∆(I) = ψi ∈ Ψf if I ∈ Bi. Note that ∆ ≤ γ if and only if ψi ≤ ⋁I for I ∈ Bi and i = 1, ..., m. This leads to the following result:

Lemma 11.5 The random mapping γ defined by (11.17) is a generalised random variable,
\[
\gamma = \bigvee\{\Delta : \Delta \text{ a simple random variable},\ \Delta \le \gamma\}.
\]

Proof. We claim that for all I ∈ IΨf we have γ(I) = ⋁{∆(I) : ∆ ≤ γ}, where it is understood that ∆ denotes a simple random variable. Clearly γ(I) ≥ ⋁{∆(I) : ∆ ≤ γ}. To prove the converse inequality, consider I ∈ IΨf. Then we have by density γ(I) = ↓⋁{ψ ∈ Ψf : ψ ≤ ⋁I}. But ψ ≤ ⋁I implies that there is a φ ∈ I such that ψ ≤ φ. But then ψ ∈ Ψf implies that ψ ∈ I. So ψ ∈ I if and only if ψ ≤ ⋁I and ψ ∈ Ψf. Define, for a ψ ∈ I,

\[
\Delta_\psi(I) =
\begin{cases}
\psi & \text{if } \psi \in I, \\
1 & \text{otherwise.}
\end{cases}
\]

The set {I : ψ ∈ I} is measurable (belongs to AΨf), hence ∆ψ is a simple random variable, and ∆ψ(I) ≤ ⋁I, hence ∆ψ ≤ γ. Thus we conclude
\[
\gamma(I) = {\downarrow}\bigvee\{\Delta_\psi(I) : \psi \in I\} \le {\downarrow}\bigvee\{\Delta(I) : \Delta \le \gamma\}.
\]

This proves the identity γ(I) = ↓⋁{∆(I) : ∆ ≤ γ}, hence the lemma. ⊓⊔

From this lemma it follows, according to Theorem 10.10, that for a directed subset X of Ψ

\[
\rho_\gamma\Big(\bigvee X\Big) = \bigwedge_{\psi \in X} \rho_\gamma(\psi).
\]

\[
sp_\gamma\Big(\bigvee X\Big) = \inf_{\psi \in X} sp_\gamma(\psi).
\]

\[
sp_\gamma(\psi) = \inf\{sp(\varphi) : \varphi \in \Psi_f,\ \varphi \le \psi\}. \tag{11.19}
\]

This means that spγ is the unique condensable extension of sp from Ψf to Ψ. We note also that, according to Theorem 11.11, since ν ≤ σ ≤ γ, we have spν(ψ) ≤ spσ(ψ) ≤ spγ(ψ). These results (Theorem 11.12 and (11.19)) partly answer an open question posed in (Shafer, 1979). In that work it was shown that continuous and condensable extensions always exist if E is a subset lattice. Here it is shown that they always exist if E corresponds to the finite elements of a compact information algebra, independently of whether Ψf is a lattice or not. We summarise these results in the following theorem.

Theorem 11.13 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra, whose finite elements Ψf form a subalgebra, sp a support function defined on Ψf and γ the random map defined by (11.17). If (µ, B) is the probability algebra associated with the canonical probability space (IΨf , AΨf , Psp) and if ργ = ρ0 ◦ sγ, then spγ = µ ◦ ργ is the unique condensable extension of sp to Ψ.
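As a concrete illustration of formula (11.19), consider the algebraic lattice of all subsets of the natural numbers, whose finite (compact) elements are the finite subsets. The random mapping, its masses and all names in the following sketch are illustrative assumptions; it evaluates the condensable extension spγ(ψ) as the infimum of sp over finite approximations φ ≤ ψ, which stabilises along the truncations computed here because the assumed mapping has only finitely many focal elements.

```python
from fractions import Fraction

# Toy model (illustrative assumptions): Psi = all subsets of N ordered by
# inclusion with join = union; an assumed random mapping on three
# assumptions w with masses P sends w to the principal ideal of the set
# described by the membership predicate targets[w].
P = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
targets = {
    0: lambda n: True,          # all of N
    1: lambda n: n % 2 == 0,    # the even numbers
    2: lambda n: n < 10,        # an initial segment
}

def sp(phi):
    """sp on a finite element phi: P{ w : phi <= targets[w] }."""
    return sum(P[w] for w in P if all(targets[w](n) for n in phi))

def sp_gamma(psi_pred, cutoffs=range(0, 200, 20)):
    """(11.19): sp_gamma(psi) = inf{ sp(phi) : phi finite, phi <= psi },
    approximated along the truncations phi_k = psi intersected with [0, k)."""
    vals = [sp({n for n in range(k) if psi_pred(n)}) for k in cutoffs]
    return min(vals), vals

# psi = the set of all even numbers, an element of Psi that is not finite.
inf_val, vals = sp_gamma(lambda n: n % 2 == 0)
print(vals)                        # non-increasing, stabilises at 5/6
assert inf_val == Fraction(5, 6)   # = P(0) + P(1)
```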

We conclude by proving the converse of Theorem 11.5 and thus characterising continuous and condensable support functions by their associated allocations of support.

Theorem 11.14 1. If Ψ is a σ-semilattice, then sp = µ ◦ ρ is continuous on Ψ if and only if ρ is a σ-allocation of probability, that is, for ψi ∈ Ψ, i = 1, 2, ...,
\[
\rho\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = \bigwedge_{i=1}^{\infty} \rho(\psi_i). \tag{11.20}
\]

2. If Ψ is a complete lattice, then sp = µ ◦ ρ is condensable on Ψ if and only if for any directed set X ⊆ Ψ,

\[
\rho\Big(\bigvee X\Big) = \bigwedge_{\psi \in X} \rho(\psi). \tag{11.21}
\]

Proof. The if-part of both statements is already proved in Theorem 11.5; it thus remains only to prove the only-if-part.

(1) Consider a countable set of elements ψ1, ψ2, ... ∈ Ψ. We may always replace this sequence by a monotone sequence ψ′1 ≤ ψ′2 ≤ ... having the same supremum, ⋁_{i=1}^∞ ψi = ⋁_{i=1}^∞ ψ′i, by defining ψ′_i = ⋁_{j=1}^{i} ψj. Then ρ(ψ′1) ≥ ρ(ψ′2) ≥ ... is downwards directed. Therefore, by the continuity of sp and Lemma 9.1,

\[
sp\Big(\bigvee_{i=1}^{\infty} \psi_i\Big) = sp\Big(\bigvee_{i=1}^{\infty} \psi'_i\Big) = \lim_{i\to\infty} sp(\psi'_i) = \lim_{i\to\infty} \mu(\rho(\psi'_i)) = \mu\Big(\bigwedge_{i=1}^{\infty} \rho(\psi'_i)\Big) = \mu\Big(\bigwedge_{i=1}^{\infty} \rho(\psi_i)\Big).
\]
From sp(⋁_{i=1}^∞ ψi) = µ(ρ(⋁_{i=1}^∞ ψi)) it follows that µ(ρ(⋁_{i=1}^∞ ψi)) = µ(⋀_{i=1}^∞ ρ(ψi)). Since ⋀_{i=1}^∞ ρ(ψi) ≥ ρ(⋁_{i=1}^∞ ψi) and µ is a positive measure, it follows indeed that ⋀_{i=1}^∞ ρ(ψi) = ρ(⋁_{i=1}^∞ ψi).

(2) Let X ⊆ Ψ be directed. By the condensability of sp we obtain
\[
\mu\Big(\rho\Big(\bigvee X\Big)\Big) = sp\Big(\bigvee X\Big) = \inf_{\psi\in X} sp(\psi) = \inf_{\psi\in X} \mu(\rho(\psi)).
\]
Since the set {ρ(ψ) : ψ ∈ X} is downwards directed, we get by Lemma 9.1 inf_{ψ∈X} µ(ρ(ψ)) = µ(⋀_{ψ∈X} ρ(ψ)), hence µ(ρ(⋁X)) = µ(⋀_{ψ∈X} ρ(ψ)). Since ⋀_{ψ∈X} ρ(ψ) ≥ ρ(⋁X), we conclude that ⋀_{ψ∈X} ρ(ψ) = ρ(⋁X). ⊓⊔

If (Ψ, D; ≤, ⊥, ·, ǫ) is an algebraic information algebra and sp = µ ◦ ρ is condensable on Ψ, then (11.21) implies also that for all ψ ∈ Ψ

\[
\rho(\psi) = \bigwedge\{\rho(\varphi) : \varphi \in \Psi_f,\ \varphi \le \psi\}.
\]
We are going to study these different extensions of a support function from a part of Ψ to the whole of Ψ in the next section, from a different angle.

To conclude this section, consider an a.o.p ρ defined on Ψ. It is generated by some random mapping Γ into the ideal completion IΨ of Ψ. However, this map is not unique, as we have seen. This confirms a former remark that a random map Γ contains more information than its associated a.o.p ρΓ. This explains why the map Γ ↦ ρΓ is, in general, not a homomorphism (see the end of Section 10.2).

11.4 Minimal Extensions

In the previous section we found an extension spν of any support function sp on some join-sub-semilattice E of an information algebra (Ψ, D; ≤, ⊥, ·, ǫ) to the whole of the algebra. This extension is defined in terms of the canonical random mapping associated with sp. In this section we shall show how the extension spν and other extensions can be defined explicitly in terms of the support function sp on E. The following theorem is an extension to information algebras of a result due to (Shafer, 1973) for set algebras.

Theorem 11.15 If sp is a support function defined on a join-semilattice E ⊆ Ψ, where (Ψ, D; ≤, ⊥, ·,ǫ) is an idempotent, generalised information algebra, then

\[
sp_\nu(\varphi) = \sup \sum_{\emptyset \ne I \subseteq \{1,\dots,n\}} (-1)^{|I|+1}\, sp\Big(\bigvee_{i \in I} \psi_i\Big), \tag{11.22}
\]
where the supremum is taken over all finite sets of elements ψ1, ..., ψn ≥ φ, n = 1, 2, ..., with ψ1, ..., ψn ∈ E.

Proof. Let f denote the function on the right-hand side of (11.22). We remark that f is equal to sp on E. Note also that f is at most equal to spν, since the latter, as a support function on Ψ, is monotone of order ∞. Therefore, it is sufficient to show that f is a support function on Ψ, because then, according to Corollary 11.2, it must be greater or equal to spν, so that spν = f as claimed. In order to prove f to be a support function, we use, following (Shafer, 1973), allocations of probability. Let ρ be the allocation of probability associated with the canonical random mapping generating sp, such that for ψ ∈ E,

sp(ψ) = µ(ρ(ψ)), where µ is the probability measure of the probability algebra (B, µ) associated with the canonical probability space (IΨ, AE, Psp) of the support function sp on E. Further, ρ = ρ0 ◦ sν (see Section 9.2). Define for φ ∈ Ψ,

\[
\bar\rho(\varphi) = \bigvee\{\rho(\psi) : \psi \in E,\ \varphi \le \psi\}. \tag{11.23}
\]
We are going to show that ρ̄ is an a.o.p on Ψ. Obviously, for ψ ∈ E, we have ρ̄(ψ) = ρ(ψ), hence in particular ρ̄(1) = ⊤. Consider φ1, φ2 ∈ Ψ. Then φ1, φ2 ≤ φ1 · φ2, hence ρ̄(φ1), ρ̄(φ2) ≥ ρ̄(φ1 · φ2), or ρ̄(φ1) ∧ ρ̄(φ2) ≥ ρ̄(φ1 · φ2). On the other hand, let φ1 ≤ ψ1 ∈ E and φ2 ≤ ψ2 ∈ E. Then ψ1 · ψ2 ∈ E and φ1 · φ2 ≤ ψ1 · ψ2, such that ρ(ψ1) ∧ ρ(ψ2) = ρ(ψ1 · ψ2) ≤ ρ̄(φ1 · φ2). It follows that

\[
\begin{aligned}
\bar\rho(\varphi_1 \cdot \varphi_2) &\ge \bigvee\{\rho(\psi_1) \wedge \rho(\psi_2) : \varphi_1 \le \psi_1,\ \varphi_2 \le \psi_2,\ \psi_1, \psi_2 \in E\} \\
&= \Big(\bigvee\{\rho(\psi_1) : \varphi_1 \le \psi_1 \in E\}\Big) \wedge \Big(\bigvee\{\rho(\psi_2) : \varphi_2 \le \psi_2 \in E\}\Big) \\
&= \bar\rho(\varphi_1) \wedge \bar\rho(\varphi_2).
\end{aligned}
\]

So we conclude that ρ̄(φ1 · φ2) = ρ̄(φ1) ∧ ρ̄(φ2) and that, therefore, ρ̄ is an a.o.p. In the formula (11.22) for f, we may replace sp by µ ◦ ρ,

\[
\begin{aligned}
f(\varphi) &= \sup \sum_{\emptyset\ne I\subseteq\{1,\dots,n\}} (-1)^{|I|+1}\,\mu\Big(\rho\Big(\bigvee_{i\in I}\psi_i\Big)\Big) \\
&= \sup \sum_{\emptyset\ne I\subseteq\{1,\dots,n\}} (-1)^{|I|+1}\,\mu\Big(\bigwedge_{i\in I}\rho(\psi_i)\Big) \\
&= \sup \Big\{\mu\Big(\bigvee_{i=1}^{n}\rho(\psi_i)\Big)\Big\} \tag{11.24}
\end{aligned}
\]
by the inclusion-exclusion formula of probability theory. The supremum carries over the same range as in (11.22). The family of elements ⋁_{i=1}^n ρ(ψi) in this supremum forms an upwards directed set in B. By Lemma 9.1 we therefore obtain

\[
f(\varphi) = \mu\Big(\bigvee\Big\{\bigvee_{i=1}^{n}\rho(\psi_i) : \psi_i \in E,\ \psi_i \ge \varphi,\ i=1,\dots,n;\ n=1,2,\dots\Big\}\Big) = \mu\Big(\bigvee\{\rho(\psi) : \psi\in E,\ \psi\ge\varphi\}\Big) = \mu(\bar\rho(\varphi)).
\]

Here the associative law for joins in a complete lattice is used. Since ρ̄ is an a.o.p, f = µ ◦ ρ̄ is a support function on Ψ (see Theorem 11.5). This concludes the proof. ⊓⊔

In the proof above we used the a.o.p ρ associated with the support function sp on E. We recall that spν = µ ◦ ρ = µ ◦ ρ0 ◦ sν. On the other hand, the a.o.p ρ̄ generates f, that is f = µ ◦ ρ̄. From spν = f, as stated in the theorem, we deduce as a complement that ρ = ρ0 ◦ sν = ρ̄. In fact, we have seen that for ψ ∈ E we have ρ(ψ) = ρ̄(ψ), and for any φ ∈ Ψ, φ ≤ ψ ∈ E implies ρ(ψ) ≤ ρ(φ), hence ρ̄(φ) ≤ ρ(φ). Then we have ρ(φ) = ρ̄(φ) ∨ (ρ(φ) − ρ̄(φ)). It follows that

\[
sp_\nu(\varphi) = \mu(\rho(\varphi)) = \mu(\bar\rho(\varphi)) + \mu(\rho(\varphi) - \bar\rho(\varphi)).
\]

But from spν(φ) = f(φ) = µ(ρ̄(φ)) we deduce that µ(ρ(φ) − ρ̄(φ)) = 0, hence ρ(φ) − ρ̄(φ) = ⊥. Since ρ̄(φ) ≤ ρ(φ), this means that indeed ρ̄(φ) = ρ(φ). We may rephrase this result in the following corollary.

Corollary 11.3 If ρ = ρ0 ◦sν is the allocation of probability associated with the support function spν = µ ◦ ρ, which is the least extension of the support function sp on E, then

\[
\rho(\varphi) = \bigvee\{\rho(\psi) : \psi \in E,\ \varphi \le \psi\}.
\]
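Formula (11.22) can be evaluated by brute force on the finite toy example of the earlier sketches (all concrete data remain illustrative assumptions). Since the sub-semilattice E chosen there happens to be a lattice, Theorem 11.16 below predicts that the supremum of the inclusion-exclusion sums collapses to a plain supremum of sp over {ψ ∈ E : ψ ≥ φ}; the sketch checks this agreement.

```python
from itertools import chain, combinations
from fractions import Fraction

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# Toy instance (illustrative assumption): Psi = subsets of {1,2,3} under
# union, E = {0, {1}, {2,3}, {1,2,3}}, and sp on E induced by an assumed
# uniform probability over the ideals of E.
Psi = [frozenset(s) for s in powerset({1, 2, 3})]
E = [frozenset(), frozenset({1}), frozenset({2, 3}), frozenset({1, 2, 3})]
ideals_E = [frozenset(I) for I in powerset(E)
            if frozenset() in I
            and all(x in I for y in I for x in E if x <= y)
            and all((x | y) in I for x in I for y in I)]
P = {I: Fraction(1, len(ideals_E)) for I in ideals_E}
sp = lambda psi: sum(p for I, p in P.items() if psi in I)

def sp_nu(phi):
    """Formula (11.22): sup of inclusion-exclusion sums over families
    psi_1..psi_n in E with psi_i >= phi (distinct families suffice)."""
    upper = [psi for psi in E if phi <= psi]
    best = Fraction(0)
    for fam in map(list, powerset(upper)):
        if not fam:
            continue
        s = sum((-1) ** (len(sub) + 1) * sp(frozenset().union(*sub))
                for sub in map(list, powerset(fam)) if sub)
        best = max(best, s)
    return best

# E is a lattice here, so Theorem 11.16 predicts the simpler formula
# sp_nu(phi) = max{ sp(psi) : psi in E, psi >= phi }.
for phi in Psi:
    assert sp_nu(phi) == max(sp(psi) for psi in E if phi <= psi)
print({tuple(sorted(phi)): sp_nu(phi) for phi in Psi})
```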

If the support function sp is defined on a lattice E, then Theorem 11.15 may be sharpened (Shafer, 1973).

Theorem 11.16 If sp is a support function defined on a lattice E ⊆ Ψ, then

\[
sp_\nu(\varphi) = \sup\{sp(\psi) : \psi \in E,\ \varphi \le \psi\}. \tag{11.25}
\]

Proof. Since spν is monotone, the right-hand side of (11.25) is less than or equal to spν(φ). It remains to show the converse inequality. Again, let ρ = ρ0 ◦ sν be the a.o.p associated with the support function sp and µ the probability in the corresponding probability algebra (B, µ). Consider ψ1, ..., ψn ∈ E. Since E is a lattice, ⋀_{i=1}^n ψi belongs to E too. Note that

\[
s_\nu\Big(\bigwedge_{i=1}^{n}\psi_i\Big) = \Big\{I \in I_\Psi : \bigwedge_{i=1}^{n}\psi_i \in \nu(I)\Big\} \supseteq \bigcup_{i=1}^{n}\{I \in I_\Psi : \psi_i \in \nu(I)\} = \bigcup_{i=1}^{n} s_\nu(\psi_i).
\]

Therefore,

\[
\rho\Big(\bigwedge_{i=1}^{n}\psi_i\Big) = \Big[s_\nu\Big(\bigwedge_{i=1}^{n}\psi_i\Big)\Big] \ge \Big[\bigcup_{i=1}^{n} s_\nu(\psi_i)\Big] = \bigvee_{i=1}^{n}[s_\nu(\psi_i)] = \bigvee_{i=1}^{n}\rho(\psi_i).
\]

Here [A] denotes, as usual, the projection of A ∈ AE to the associated Boolean algebra B in the probability algebra (B, µ); see Section 9.2. Using (11.24) from the proof of Theorem 11.15 and spν(φ) = f(φ), we now obtain

\[
sp_\nu(\varphi) = \sup\Big\{\mu\Big(\bigvee_{i=1}^{n}\rho(\psi_i)\Big)\Big\} \le \sup\Big\{\mu\Big(\rho\Big(\bigwedge_{i=1}^{n}\psi_i\Big)\Big)\Big\}, \tag{11.26}
\]
where the supremum ranges over ψi ∈ E, φ ≤ ψi, i = 1, ..., n and n = 1, 2, .... From this it follows that

\[
sp_\nu(\varphi) = \sup\{\mu(\rho(\psi)) : \psi \in E,\ \varphi \le \psi\} = \sup\{sp(\psi) : \psi \in E,\ \varphi \le \psi\}.
\]

This concludes the proof. ⊓⊔

There are in particular several examples of compact information algebras where the finite elements form a lattice, hence where Theorem 11.16 applies with E = Ψf.

We have seen in Section 11.3 that support functions sp defined on the finite elements Ψf of an algebraic information algebra (Ψ, D; ≤, ⊥, ·, ǫ) may be extended either to a continuous support function spσ or to a condensable one spγ. Further, by definition of condensability, spγ is determined by the values of sp on Ψf. This is like spν, which according to Theorem 11.15 is also determined by the values of sp on Ψf if E = Ψf. Does a similar result also hold for the continuous extension spσ? Yes, but so far as we know only for a very special case, namely if E is a distributive lattice; see (Shafer, 1979), Theorem 4. The following theorem is a particular case of Shafer's result, a case of special interest for us, where we assume that the finite elements form a distributive lattice, like the cofinite elements in a subset algebra.

Theorem 11.17 Let (Ψ, D; ≤, ⊥, ·,ǫ) be an algebraic information algebra, (Ψf , D; ≤, ⊥, ·,ǫ) a subalgebra of (Ψ, D; ≤, ⊥, ·,ǫ) and (Ψf , ≤) a distributive lattice. If sp is a support function defined on Ψf , then for all φ ∈ Ψ,

\[
sp_\sigma(\varphi) = \sup\Big\{\lim_{i\to\infty} sp(\psi_i) : \psi_1 \le \psi_2 \le \dots \in \Psi_f,\ \bigvee_{i=1}^{\infty}\psi_i \ge \varphi\Big\}. \tag{11.27}
\]

Proof. We denote the right-hand side of (11.27) by f. Note that lim_{i→∞} sp(ψi) = spσ(⋁_{i=1}^∞ ψi) ≤ spσ(φ) if ⋁_{i=1}^∞ ψi ≥ φ; see Theorem 11.4. This shows that spσ ≥ f. We are going to show that f is a continuous support function extending sp. Since spσ is the minimal continuous support function extending sp (Theorem 11.12), this then proves that spσ = f.

Let (µ, B) be the probability algebra associated with the canonical probability space of the support function sp and ρ the corresponding allocation of probability, so that sp = µ ◦ ρ. For each φ ∈ Ψ define D(φ) ⊆ B by

\[
D(\varphi) = \Big\{\bigwedge_{i=1}^{\infty}\rho(\psi_i) : \psi_i \in \Psi_f,\ i = 1, 2, \dots,\ \bigvee_{i=1}^{\infty}\psi_i \ge \varphi\Big\}
\]
(here we follow the proof of Theorem 4 in (Shafer, 1979)). The sets D(φ) are upwards directed: in fact, consider two countable sets ψ_{1,i}, ψ_{2,j} ∈ Ψf such that ⋁_{i=1}^∞ ψ_{1,i}, ⋁_{j=1}^∞ ψ_{2,j} ≥ φ. Then, since Ψf is a lattice, the set of elements ψ_{1,i} ∧ ψ_{2,j} is still a countable subset of Ψf. And, since the lattice Ψf is distributive,

\[
\bigvee_{i,j=1}^{\infty} (\psi_{1,i} \wedge \psi_{2,j}) = \Big(\bigvee_{i=1}^{\infty}\psi_{1,i}\Big) \wedge \Big(\bigvee_{j=1}^{\infty}\psi_{2,j}\Big) \ge \varphi.
\]

Finally, ψ_{1,i} ∧ ψ_{2,j} ≤ ψ_{1,i}, ψ_{2,j} implies ρ(ψ_{1,i} ∧ ψ_{2,j}) ≥ ρ(ψ_{1,i}), ρ(ψ_{2,j}). So indeed, D(φ) is upwards directed. Define now ρ̃(φ) = ⋁D(φ). We claim that ρ̃ is a σ-a.o.p and that f = µ ◦ ρ̃. This shows then that f is a continuous support function. Since obviously f|Ψf = sp, this proves the theorem.

It is evident that ρ̃(1) = ⊤. So it only remains to show that ρ̃(⋁_{i=1}^∞ ψi) = ⋀_{i=1}^∞ ρ̃(ψi), or ⋁D(⋁_{i=1}^∞ ψi) = ⋀_{i=1}^∞ ⋁D(ψi). Fix a sequence ψ1, ψ2, .... To

simplify notation, let D = D(⋁_{i=1}^∞ ψi), Di = D(ψi) and M = ⋀_i ⋁Di. The task is then to show that
\[
\bigvee D = M.
\]
Now D ⊆ Di for all i, hence ⋁D ≤ ⋀_i ⋁Di = M. Further, since the Di are upwards directed sets, by Lemma 9.1 we have

\[
\mu\Big(\bigvee D_i\Big) = \sup_{\psi \in D_i} \mu(\psi).
\]
Choose an ǫ > 0. Then for all i there is an Mi ∈ Di such that
\[
\mu\Big(\bigvee D_i - M_i\Big) \le \frac{\epsilon}{2^i}.
\]
Since M ≤ ⋁Di, we obtain also
\[
\mu(M - M_i) = \mu(M \wedge M_i^c) \le \mu\Big(\bigvee D_i \wedge M_i^c\Big) = \mu\Big(\bigvee D_i - M_i\Big) \le \frac{\epsilon}{2^i}.
\]
Let Bi denote a set of elements φ1, φ2, ... ∈ Ψf such that ⋁Bi ≥ ψi and Mi = ⋀_{φ∈Bi} ρ(φ). Let Bǫ = ∪_{i=1}^∞ Bi and Mǫ = ⋀_{i=1}^∞ Mi. Then ⋁Bǫ ≥ ⋁_{i=1}^∞ ψi and
\[
M_\epsilon = \bigwedge_{i=1}^{\infty}\bigwedge_{\varphi\in B_i}\rho(\varphi) = \bigwedge_{\varphi\in B_\epsilon}\rho(\varphi).
\]

Thus Mǫ belongs to D, hence Mǫ ≤ ⋁D. We have
\[
M - M_\epsilon = M \wedge M_\epsilon^c = M \wedge \Big(\bigwedge_{i=1}^{\infty} M_i\Big)^c = M \wedge \Big(\bigvee_{i=1}^{\infty} M_i^c\Big) = \bigvee_{i=1}^{\infty}(M \wedge M_i^c) = \bigvee_{i=1}^{\infty}(M - M_i).
\]
Thus we obtain
\[
\mu(M - M_\epsilon) = \mu\Big(\bigvee_{i=1}^{\infty}(M - M_i)\Big) \le \epsilon.
\]
Now Mǫ ≤ ⋁D implies Mǫ^c ≥ (⋁D)^c and therefore M − ⋁D = M ∧ (⋁D)^c ≤ M ∧ Mǫ^c = M − Mǫ. This shows that
\[
\mu\Big(M - \bigvee D\Big) \le \mu(M - M_\epsilon) \le \epsilon.
\]

Since ǫ is arbitrarily small, we conclude that µ(M − ⋁D) = 0, and from this it follows that ⋁D = M, because ⋁D ≤ M. This proves that ρ̃ is a σ-a.o.p.

Next, we are going to show that f = µ ◦ ρ̃, hence that f is a continuous support function. Note that spσ = µ ◦ ρ. Then, since spσ|Ψf = sp and since ρ is a σ-a.o.p, because spσ is continuous, we have

\[
f(\varphi) = \sup\Big\{\mu\Big(\rho\Big(\bigvee_{i=1}^{\infty}\psi_i\Big)\Big) : \psi_i \in \Psi_f,\ \bigvee_{i=1}^{\infty}\psi_i \ge \varphi\Big\} = \sup\{\mu(\rho(\chi)) : \chi \in D(\varphi)\}.
\]

Since D(φ) is upwards directed, we obtain (Lemma 9.1)

\[
f(\varphi) = \mu\Big(\bigvee D(\varphi)\Big) = \mu(\tilde\rho(\varphi)).
\]
This concludes the proof. ⊓⊔

Under the assumptions of Theorem 11.17 we may, according to the considerations in the proof, also write

\[
sp_\sigma(\varphi) = \sup\Big\{sp_\sigma\Big(\bigvee_{i=1}^{\infty}\psi_i\Big) : \psi_i \in \Psi_f,\ \bigvee_{i=1}^{\infty}\psi_i \ge \varphi\Big\},
\]
or, equivalently,

\[
sp_\sigma(\varphi) = \sup\{sp_\sigma(\chi) : \chi \in \sigma(\Psi_f),\ \chi \ge \varphi\}.
\]

If Ψf is in addition countable, then σ(Ψf )=Ψ and

\[
sp_\sigma(\psi) = \lim_{i\to\infty} sp(\psi_i)
\]

if ψ1 ≤ ψ2 ≤ ... ∈ Ψf and ⋁_{i=1}^∞ ψi = ψ. We remark that this result holds in general if the set of finite elements is countable, without the additional assumption that (Ψf, ≤) is a distributive lattice. This follows from the alternative approach to generating continuous support functions, based on results of (Norberg, 1989), mentioned at the end of Section 11.2. Just as Corollary 11.3, we may also derive the following result:

Corollary 11.4 Under the conditions of Theorem 11.17, if ρσ is the a.o.p associated with the support function spσ, then ρσ =ρ ˜, where the latter a.o.p is defined in the proof of Theorem 11.17.

We have shown that a support function defined on some join-semilattice E ⊆ Ψ of an information algebra Ψ can have different kinds of extension, defined in terms of its values on E. Similar and further results of this kind can be found in (Shafer, 1979), in a more restricted context.
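When the set of finite elements is countable, the continuous extension reduces to a plain monotone limit, as remarked above. Here is a short sketch, reusing the subsets-of-N toy model with its assumed masses (illustrative assumptions throughout):

```python
from fractions import Fraction

# Reuse the toy model of the earlier sketch (illustrative assumptions).
P = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
targets = {0: lambda n: True, 1: lambda n: n % 2 == 0, 2: lambda n: n < 10}
sp = lambda phi: sum(P[w] for w in P if all(targets[w](n) for n in phi))

# A monotone chain psi_1 <= psi_2 <= ... of finite sets whose union is the
# set of even numbers. sp along the chain is non-increasing and its limit
# is the value of the continuous extension at the union; it is reached
# after finitely many steps because the assumed mapping has only finitely
# many focal elements.
seq = [{2 * n for n in range(k)} for k in range(1, 12)]
values = [sp(phi) for phi in seq]
print(values)                           # settles at 5/6
assert values == sorted(values, reverse=True)
assert values[-1] == Fraction(5, 6)
```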

11.5 The Boolean Case

In this section the information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is assumed to be Boolean, that is, the semilattice Ψ is in fact a Boolean algebra. Everything said so far about random mappings, allocations of probability and support functions remains valid. However, the Boolean nature of Ψ allows us to present an equivalent dual view of allocations of probability and support functions. This dual view comes from considering possibility sets and associated degrees of plausibility, as introduced in Chapter 9. In a general information algebra these concepts are of no particular interest; they are far less interesting and important than allocations of support and support functions. In the Boolean case, however, their status changes to one of equal importance and interest.

Consider a random mapping Γ from a probability space (Ω, A, P) to a Boolean information algebra (Ψ, D; ≤, ⊥, ·, ǫ). Generalising the discussion in Section 9.1 with respect to simple random variables, we define the set of assumptions ω under which a hypothesis ψ ∈ Ψ is possible, that is, not excluded, by

pΓ(ψ) = {ω ∈ Ω : ψ · Γ(ω) ≠ 0}.

Given that the top element 0 of the Boolean algebra (Ψ, ≤) is considered to represent the contradiction, an assumption ω such that ψ · Γ(ω) = 0 must be considered as impossible, as excluded by the information contained in the random mapping Γ. Therefore, pΓ(ψ) is called the possibility set of ψ relative to the random mapping Γ. In a Boolean algebra we have ψ · Γ(ω) = ψ ∨ Γ(ω) = 0 if and only if ψ^c ≤ Γ(ω), where ψ^c denotes the complement of ψ in Ψ. Therefore, we see that

\[
p_\Gamma(\psi) = \{\omega \in \Omega : \psi^c \le \Gamma(\omega)\}^c = (s_\Gamma(\psi^c))^c, \tag{11.28}
\]
where sΓ is the allocation of support associated with the random mapping Γ (see (9.5)). This is the first of the basic duality relations between support and plausibility or possibility considered in this section. It allows us to translate results on allocations of support immediately to possibility sets.
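In the familiar set-based instance of a Boolean information algebra, this duality is the classical relation between belief and plausibility of Dempster-Shafer theory. The following sketch (frame, masses and focal sets are illustrative assumptions) verifies pl(A) = 1 − bel(A^c) exhaustively on a small frame; here spΓ plays the role of bel and plΓ of pl.

```python
from fractions import Fraction
from itertools import chain, combinations

# Set-based Boolean instance (illustrative): frame Theta, assumptions Omega
# with masses P, focal sets Gamma(w) <= Theta. In this algebra "psi <= phi"
# means phi is a subset of psi (smaller set = more informative), the empty
# set is the contradiction 0, and Theta is the vacuous information 1.
Theta = frozenset({"a", "b", "c"})
P = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
Gamma = {0: frozenset({"a"}), 1: frozenset({"a", "b"}),
         2: frozenset({"b", "c"}), 3: Theta}

def bel(A):   # support: assumptions whose focal set entails A
    return sum(P[w] for w in P if Gamma[w] <= A)

def pl(A):    # possibility: assumptions whose focal set is consistent with A
    return sum(P[w] for w in P if Gamma[w] & A)

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# The duality relation (11.29): pl(A) = 1 - bel(A^c) for every A.
for A in map(frozenset, powerset(Theta)):
    assert pl(A) == 1 - bel(Theta - A)
print(bel(frozenset({"a", "b"})), pl(frozenset({"a", "b"})))
```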

Theorem 11.18 If Γ:Ω → Ψ, where (Ψ; ≤) is a Boolean algebra, then

1. pΓ(0) = ∅,

2. If φ ≤ ψ, then pΓ(ψ) ⊆ pΓ(φ).

3. pΓ(φ ∧ ψ)= pΓ(φ) ∪ pΓ(ψ).

4. If Γ is normalised, then pΓ(1) = Ω. 226 CHAPTER 11. SUPPORT FUNCTIONS

5. If Ψ is a Boolean σ-algebra, then

\[
p_\Gamma\Big(\bigwedge_{i=1}^{\infty}\psi_i\Big) = \bigcup_{i=1}^{\infty} p_\Gamma(\psi_i).
\]
6. If Ψ is a complete Boolean algebra, then for any subset X of Ψ,

\[
p_\Gamma\Big(\bigwedge X\Big) = \bigcup_{\psi\in X} p_\Gamma(\psi).
\]
Proof. Items (1) to (4) follow immediately from Theorem 11.1 and the duality relation (11.28). Items (5) and (6) follow similarly from Theorem 11.2, (11.28) and the de Morgan laws. ⊓⊔

If pΓ(ψ) is measurable, the probability that ψ is not excluded by Γ, plΓ(ψ) = P(pΓ(ψ)), is defined. This is called the degree of possibility or plausibility of ψ under the random mapping Γ. Let ZΓ = {ψ ∈ Ψ : pΓ(ψ) ∈ A} be the set of ψ for which pΓ(ψ) is measurable. Recall that EΓ is the set of elements of Ψ for which sΓ(ψ) is measurable. Clearly, ψ ∈ ZΓ implies ψ^c ∈ EΓ. According to Theorem 11.3, EΓ is a join-semilattice containing 1. Thus ZΓ is a meet-semilattice containing 0. Let's fix this result in the following theorem.

Theorem 11.19 If (Ψ; ≤) is a Boolean algebra, Γ a random mapping into Ψ, then ZΓ is a meet-subsemilattice of Ψ containing 0. If Γ is normalised, then 1 belongs to ZΓ too. If (Ψ; ≤) is a Boolean σ-algebra, then ZΓ is a σ-semilattice.

Note that

\[
pl_\Gamma(\psi) = P(p_\Gamma(\psi)) = P\big((s_\Gamma(\psi^c))^c\big) = 1 - sp_\Gamma(\psi^c). \tag{11.29}
\]
This is a second duality relation between support and plausibility in a Boolean algebra.

The function plΓ : ZΓ → [0, 1] is called the plausibility function associated with the random mapping Γ. Just as the support function spΓ can be extended from EΓ to Ψ by defining spΓ = µ ◦ ρΓ, where (µ, B) is the probability algebra associated to the probability space (Ω, A, P) and ρΓ = ρ0 ◦ sΓ the allocation of probability associated with Γ, we may extend plΓ in a similar way; see Section 9.2. This is done with the help of ξ0, defined by

\[
\xi_0(H) = (\rho_0(H^c))^c = \bigwedge\{[A] : A \supseteq H,\ A \in \mathcal{A}\},
\]
and ξΓ = ξ0 ◦ pΓ and plΓ = µ ◦ ξΓ (see Section 9.2). Then we obtain

c c plΓ(ψ)= µ(ξΓ(ψ)) = µ(ξ0(pΓ(ψ))) = µ((ρ0((pΓ(ψ)) )) ) c c c c c = µ((ρ0(sΓ(ψ ))) )= µ((ρΓ(ψ )) ) = 1 − spΓ(ψ ). 11.5. THE BOOLEAN CASE 227

So the extension plΓ = µ ◦ ξΓ of the plausibility function to Ψ maintains the duality relation to the support function. Further, we have seen in Section 9.2 that plΓ(ψ) = P^∗(pΓ(ψ)), where P^∗ is the outer probability measure of P. In the present case of a Boolean algebra Ψ, we note that

\[
\xi_\Gamma(\psi) = \xi_0(p_\Gamma(\psi)) = \xi_0\big((s_\Gamma(\psi^c))^c\big) = (\rho_0(s_\Gamma(\psi^c)))^c = (\rho_\Gamma(\psi^c))^c.
\]

Here we have a third duality relation, which implies immediately that ξ(0) = ⊥ and ξ(ψ ∧ φ) = ξ(ψ) ∨ ξ(φ). A function from Ψ to B with these two properties is called an allowment of probability (Shafer, 1979).

Definition 11.3 Allowment of probability. If (Ψ; ≤) is a Boolean algebra and (µ, B) a probability algebra, then an allowment of probability is a mapping ξ : Ψ → B such that

1. ξ(0) = ⊥,

2. ξ(ψ ∧ φ) = ξ(ψ) ∨ ξ(φ).

If, furthermore, ξ(1) = ⊤ holds, then the allowment is called normalised.
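A minimal sketch of Definition 11.3 and the duality (11.30) below, in the same set-based instance (all data are illustrative assumptions): modelling the probability algebra B simply by subsets of Ω, the allowment derived from the allocation by complementation satisfies the two defining properties. Note that the information order on sets is reversed inclusion, so the lattice meet of two subsets corresponds to their union.

```python
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

# Same illustrative set-based instance: frame Theta, assumptions Omega,
# focal sets Gamma(w). The probability algebra B is modelled by subsets of
# Omega, so that the allocation is simply rho(psi) = s(psi).
Theta = frozenset({"a", "b", "c"})
Omega = frozenset({0, 1, 2, 3})
Gamma = {0: frozenset({"a"}), 1: frozenset({"a", "b"}),
         2: frozenset({"b", "c"}), 3: Theta}

# Allocation: assumptions whose focal set entails psi.
rho = lambda psi: frozenset(w for w in Omega if Gamma[w] <= psi)
# Dual allowment via (11.30): xi(psi) = (rho(psi^c))^c.
xi = lambda psi: Omega - rho(Theta - psi)

# Properties of Definition 11.3; the information meet "psi ^ phi" of two
# sets is their union, since the information order is reversed inclusion.
assert xi(frozenset()) == frozenset()              # xi(0) = bottom
for psi in powerset(Theta):
    for phi in powerset(Theta):
        assert xi(psi | phi) == xi(psi) | xi(phi)  # xi(psi ^ phi) = xi(psi) v xi(phi)
assert xi(Theta) == Omega                          # normalised: xi(1) = top
```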

To any allocation of probability ρ : Ψ → B we associate an allowment of probability ξ : Ψ → B defined by
\[
\xi(\psi) = (\rho(\psi^c))^c, \tag{11.30}
\]
and vice versa, to any allowment of probability ξ an allocation of probability ρ, defined by ρ(ψ) = (ξ(ψ^c))^c, is associated.

In order to exploit this duality we consider the dual Boolean algebra (Ψ^op; ≤op) of (Ψ, ≤), with inverse order ≤op and the corresponding dual meet ∧op and join ∨op, so that

\[
\begin{aligned}
\psi \le_{op} \varphi &\ \text{ if and only if }\ \varphi \le \psi,\\
\varphi \vee_{op} \psi &= \varphi \wedge \psi = (\varphi^c \vee \psi^c)^c,\\
\varphi \wedge_{op} \psi &= \varphi \vee \psi = (\varphi^c \wedge \psi^c)^c,
\end{aligned}
\]

0op = 1, 1op = 0.

To any extraction operator ǫx for x ∈ D we associate a mapping ǫx^op : Ψ^op → Ψ^op defined by

\[
\epsilon_x^{op}(\psi) = (\epsilon_x(\psi^c))^c.
\]

If we interpret the dual join ∨op as (dual) combination ·op and the maps ǫx^op as (dual) extraction, then it turns out that (Ψ^op, D; ≤op, ⊥, ·op, ǫ^op) is in fact still a Boolean information algebra. To verify this claim, we note first that ǫx(ψ) = ψ implies ǫx^op(ψ) = ψ, that is, if x is a support of ψ with

respect to ǫ, it is also a support of ψ with respect to ǫ^op: ǫx(ψ) = ψ implies 0 = ǫx(0) = ǫx(ψ · ψ^c) = ψ · ǫx(ψ^c), hence ψ^c ≤ ǫx(ψ^c) and therefore ψ^c = ǫx(ψ^c). But then ǫx^op(ψ) = (ψ^c)^c = ψ. Assume further x⊥y|z and that x is a support of ψ. Then, by Axiom A4 in the original Boolean information algebra, we have

\[
\epsilon_y^{op}(\psi) = (\epsilon_y(\psi^c))^c = (\epsilon_y(\epsilon_z(\psi^c)))^c = (\epsilon_y(\epsilon_z^{op}(\psi)^c))^c = \epsilon_y^{op}(\epsilon_z^{op}(\psi)).
\]

So Axiom A4 holds also with respect to ǫx^op. Finally, suppose that x is a support of φ and y a support of ψ, and still x⊥y|z. Then by Axiom A5 we have

\[
\epsilon_z^{op}(\varphi \cdot_{op} \psi) = (\epsilon_z((\varphi \cdot_{op} \psi)^c))^c = (\epsilon_z(\varphi^c \cdot \psi^c))^c = (\epsilon_z(\varphi^c) \cdot \epsilon_z(\psi^c))^c = (\epsilon_z(\varphi^c))^c \cdot_{op} (\epsilon_z(\psi^c))^c = \epsilon_z^{op}(\varphi) \cdot_{op} \epsilon_z^{op}(\psi).
\]

Axiom A5 is also valid in the dual setting. This shows that (Ψ^op, D; ≤op, ⊥, ·op, ǫ^op) is indeed a Boolean information algebra.

For later reference, let's also consider the dual of an algebraic Boolean information algebra with finite elements Ψf. Then (Ψ, ≤) is a complete lattice and, therefore, (Ψ, ≤op) is a complete lattice too. Define the set

\[
\Psi_{cf} = \{\psi^c : \psi \in \Psi_f\}, \tag{11.31}
\]
whose elements are called cofinite. Density in Ψ leads by the de Morgan laws to

\[
\varphi = \bigvee_{op}\{\psi \in \Psi_{cf} : \psi \le_{op} \varphi\}.
\]
Similarly, strong density implies

\[
\epsilon_x^{op}(\varphi) = \bigvee_{op}\{\psi \in \Psi_{cf} : \psi = \epsilon_x^{op}(\psi) \le_{op} \varphi\}.
\]
Thus the dual information algebra (Ψ^op, D; ≤op, ⊥, ·op, ǫ^op) is also algebraic, and the cofinite elements of (Ψ, ≤) are its finite elements. As an example consider multivariate algebras.

Example 11.1 Dual Multivariate Algebras: Let (Ψ, D; ≤, ⊥, ·, ǫ) be a multivariate set algebra (see Section 5.2) in a set ΩI, where I is an index set,
\[
\Omega_I = \prod_{i\in I}\Omega_i,
\]
and the Ωi are sets of possible values for variables Xi, i ∈ I. Elements of (Ψ, D; ≤, ⊥, ·, ǫ) are subsets of ΩI. This is a Boolean information algebra where join is intersection and meet is union. The (finite) subsets s of I form the lattice D, and extraction relative to s ∈ D is defined as s-saturation, that is, as saturation relative to the partition of ΩI induced by the subset s of the index set I. In the dual information algebra join is union and meet intersection. Dual extraction is defined according to (11.31) by σs^op(S) = (σs(S^c))^c, for any subset S of ΩI.

The algebra (Ψ, D; ≤, ⊥, ·, ǫ) is algebraic; its finite elements are the cofinite sets of ΩI, that is, the complements of finite subsets of ΩI. The cofinite elements of Ψ, that is the finite elements of Ψ^op, are the finite subsets of ΩI. ⊖

Now we have the means to exploit the duality between allocations and allowments of probability (11.30) and between degrees of support and plausibility (11.29). Let ρ : Ψ → B be an allocation of probability on a Boolean information algebra (Ψ, D; ≤, ⊥, ·, ǫ) relative to a probability algebra (µ, B). The corresponding allowment of probability ξ, defined by (11.30), can be seen as a mapping ξ : Ψ^op → B^op between the dual Boolean algebras of Ψ and B. Then, in this view, ξ is an allocation of probability, that is

1. ξ(0op) = ⊤op,

2. ξ(φ ∨op ψ) = ξ(φ) ∧op ξ(ψ).

As a consequence, as allocations of probability, the ξ form an idempotent generalised information algebra (AΨ^op, D; ≤, ⊥, ·, ǫ^op) (see Theorem 10.6). Let's denote combination by ∨op, such that, according to (10.2),

\[
(\xi_1 \vee_{op} \xi_2)(\psi) = \bigvee_{op}\{\xi_1(\psi_1) \wedge_{op} \xi_2(\psi_2) : \psi \le_{op} \psi_1 \vee_{op} \psi_2\} = \bigwedge\{\xi_1(\psi_1) \vee \xi_2(\psi_2) : \psi \ge \psi_1 \wedge \psi_2\}.
\]
Similarly, for extraction we obtain, using (10.5),

\[
\epsilon_x^{op}(\xi)(\varphi) = \bigvee_{op}\{\xi(\psi) : \psi = \epsilon_x^{op}(\psi) \ge_{op} \varphi\} = \bigwedge\{\xi(\psi) : \psi = \epsilon_x^{op}(\psi) \le \varphi\}.
\]

Clearly, the map ρ ↦ ξ, defined by ξ(ψ) = (ρ(ψ^c))^c, is an isomorphism between these information algebras. We write ξ1 ≤op ξ2 if ξ1 ∨op ξ2 = ξ2. Then ξ1 ≤op ξ2 if and only if ξ1(ψ) ≤op ξ2(ψ) for all ψ ∈ Ψ^op. If we look at this relative to the original algebra (Ψ, D; ≤, ⊥, ·, ǫ), then it is convenient to write ξ1 ∧ ξ2 = ξ1 ∨op ξ2 and hence ξ1 ≥ ξ2 if ξ1 ≤op ξ2. Finally, we write simply ǫx(ξ) instead of ǫx^op(ξ) for x ∈ D. In the following we shall use this convention.

Next, we use the duality relation (11.30) to translate results relating random mappings to allocations of probability, obtained in Section 10.2, to allowments of probability. Here is a list of such results, which can easily be obtained by (11.30) and the de Morgan laws:

1. If ∆1, ∆2 and ∆ are simple random variables, then by (10.12)

ξ∆1·∆2 = ξ∆1 ∧ ξ∆2 ,

ξǫx(∆) = ǫx(ξ∆).

2. If Γ is a generalised random variable, then (Theorem 10.8)

\[
\xi_\Gamma = \bigwedge\{\xi_\Delta : \Delta \le \Gamma\}.
\]
Here ∆ denotes, as usual, simple random variables.

3. If Γ1, Γ2 and Γ are generalised random variables, then (Theorem 10.9)

ξΓ1·Γ2 = ξΓ1 ∧ ξΓ2 , (11.32)

ξǫx(Γ) = ǫx(ξΓ).

4. If Γ is a generalised random variable, (Ψ, D; ≤, ⊥, ·,ǫ) an algebraic Boolean information algebra, X ⊆ Ψ a downwards directed set, then (Theorem 10.10)

\[
\xi_\Gamma\Big(\bigwedge X\Big) = \bigvee_{\psi\in X}\xi_\Gamma(\psi).
\]

5. Suppose (Ψ, D; ≤, ⊥, ·, ǫ) is an algebraic information algebra and Γi ∈ Rσ for i = 1, 2, ...; then (Theorem 10.11)
\[
\xi_{\bigvee_{i=1}^{\infty}\Gamma_i} = \bigwedge_{i=1}^{\infty}\xi_{\Gamma_i}.
\]

6. If the Γi form a monotone sequence of random variables Γ1 ≤ Γ2 ≤ ... in an algebraic Boolean information algebra (Ψ, D; ≤, ⊥, ·, ǫ), then (Theorem 10.12)

\[
\epsilon_x\Big(\bigwedge_{i=1}^{\infty}\xi_{\Gamma_i}\Big) = \bigwedge_{i=1}^{\infty}\epsilon_x(\xi_{\Gamma_i}).
\]
Now we turn to plausibility and exploit the duality relation (11.29) to derive results on degrees of plausibility from support functions. If Γ is a random map, mapping a probability space into an idempotent generalised information algebra (Ψ, D; ≤, ⊥, ·, ǫ) (or its ideal completion), then recall that its support function is defined by spΓ = µ ◦ ρΓ, where ρΓ = ρ0 ◦ sΓ and (B, µ) is the probability algebra associated with the probability space (see Section 9.2). Similarly, the associated degree of plausibility plΓ, related to spΓ by the duality relation (11.29), is given by plΓ = µ ◦ ξΓ, where ξΓ = ξ0 ◦ pΓ. And ρΓ and ξΓ are related by the duality relation (11.30). Here follows a list of results on plausibility, derived from corresponding results on support functions via the duality relation (11.29):

1. Let Γ be a random mapping, then (Theorem 11.4)

(a) plΓ(0) = 0.

(b) If ψ1, ..., ψm ≤ ψ, ψ1, ..., ψm, ψ ∈ ZΓ,

\[
pl_\Gamma(\psi) \le \sum_{\emptyset\ne I\subseteq\{1,\dots,m\}} (-1)^{|I|+1}\, pl_\Gamma\Big(\bigwedge_{i\in I}\psi_i\Big). \tag{11.33}
\]

(c) If ZΓ is a σ-meet-semilattice and if ψ1 ≥ ψ2 ≥ ... ∈ ZΓ, then
\[
pl_\Gamma\Big(\bigwedge_{i=1}^{\infty}\psi_i\Big) = \lim_{i\to\infty} pl_\Gamma(\psi_i). \tag{11.34}
\]
(d) If Γ is normalised, then plΓ(1) = 1.

2. If (B, µ) is a probability algebra, ξ : Ψ → B an allowment of probability and pl = µ ◦ ξ, then (Theorem 11.5)

(a) pl satisfies properties (a) and (b) of item 1 above.

(b) If Ψ is a σ-meet-semilattice and if for all ψ1, ψ2, ... we have
\[
\xi\Big(\bigwedge_{i=1}^{\infty}\psi_i\Big) = \bigvee_{i=1}^{\infty}\xi(\psi_i),
\]
then (c) of item 1 above holds.

(c) If Ψ is a complete lattice and if for any downwards directed set X ⊆ Ψ
\[
\xi\Big(\bigwedge X\Big) = \bigvee_{\psi\in X}\xi(\psi)
\]
holds, then

\[
pl\Big(\bigwedge X\Big) = \sup_{\psi\in X} pl(\psi). \tag{11.35}
\]

3. If Γ is a generalised random variable, (Ψ, D; ≤, ⊥, ·, ǫ) an algebraic Boolean information algebra, plΓ = µ ◦ ξΓ and ξΓ = ξ0 ◦ pΓ, then (Theorem 11.6)

\[
pl_\Gamma(\psi) = \sup\{pl_\Gamma(\varphi) : \varphi \in \Psi_{cf},\ \varphi \ge \psi\}.
\]
Furthermore, if X ⊆ Ψ is downwards directed, then

\[
pl_\Gamma\Big(\bigwedge X\Big) = \sup_{\psi\in X} pl_\Gamma(\psi).
\]

4. Let (σ(Ψ), D; ≤, ⊥, ·, ǫ) be the σ-extension of the Boolean information algebra (Ψ, D; ≤, ⊥, ·, ǫ) and Γ a random variable, that is, Γ = ⋁_{i=1}^∞ ∆i, where the ∆i form a monotone increasing sequence of simple random variables with values in Ψ; then for all ψ ∈ Ψ (Theorem 11.7)

\[
pl_\Gamma(\psi) = \lim_{i\to\infty} pl_{\Delta_i}(\psi).
\]

5. If Γ is a generalised random variable, then for all ψ ∈ Ψ (Corollary 11.1)

\[
pl_\Gamma(\psi) = \inf\{pl_\Delta(\psi) : \Delta \le \Gamma\},
\]

where ∆ as usual are simple random variables.
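As a quick numerical check of the alternating-of-order-∞ inequality (11.33), the following sketch evaluates it exhaustively for the plausibility function of the set-based toy example used above (illustrative assumptions throughout); recall that in that instance ψ ≤ φ means φ ⊆ ψ as sets, so the lattice meet of the ψi corresponds to the union of the underlying sets.

```python
from fractions import Fraction
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

# Same illustrative Dempster-Shafer style instance as above.
Theta = frozenset({"a", "b", "c"})
P = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
Gamma = {0: frozenset({"a"}), 1: frozenset({"a", "b"}),
         2: frozenset({"b", "c"}), 3: Theta}
pl = lambda A: sum(P[w] for w in P if Gamma[w] & A)

# Check (11.33): whenever psi_1, ..., psi_m <= psi in the information order
# (i.e. the set psi is contained in every set psi_i), pl(psi) is bounded by
# the inclusion-exclusion sum over the meets (set unions) of the psi_i.
sets = [A for A in powerset(Theta) if A]
for fam in chain.from_iterable(combinations(sets, m) for m in (1, 2, 3)):
    bound = sum((-1) ** (len(sub) + 1) * pl(frozenset().union(*sub))
                for sub in powerset(fam) if sub)
    for psi in powerset(Theta):
        if all(psi <= A for A in fam):
            assert pl(psi) <= bound
print("inequality (11.33) holds on the toy example")
```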

These results allow us to give a dual version of Definition 11.2, now regarding plausibility functions:

Definition 11.4 Let Z be a meet-semilattice with a top element 0. Then a function pl : Z → [0, 1] satisfying (1) and (2) below is called a plausibility function on Z:

1. pl(0) = 0.

2. If ψ1,...,ψm ≤ ψ, ψ1,...,ψm, ψ ∈Z,

\[
pl(\psi) \le \sum_{\emptyset\ne I\subseteq\{1,\dots,m\}} (-1)^{|I|+1}\, pl\Big(\bigwedge_{i\in I}\psi_i\Big). \tag{11.36}
\]

3. If in addition Z is closed under countable meets, and for any monotone sequence ψ1 ≥ ψ2 ≥ ... the condition

\[
pl\Big(\bigwedge_{i=1}^{\infty}\psi_i\Big) = \lim_{i\to\infty} pl(\psi_i) \tag{11.37}
\]
holds, then pl is called a continuous plausibility function on Z.

4. If further Z is a complete meet-semilattice and for any downwards directed set X ⊆Z,

\[
pl\Big(\bigwedge X\Big) = \sup_{\psi\in X} pl(\psi) \tag{11.38}
\]
holds, then pl is called a condensable plausibility function on Z.

A function satisfying (2) above is also called alternating of order ∞ (Choquet, 1953–1954). Thus, the degrees of plausibility of any random mapping Γ form a plausibility function. If Γ is a generalised random variable in an algebraic Boolean information algebra, then plΓ is condensable, and if Γ is a random variable, then plΓ is continuous.

Given a plausibility function pl on a meet-semilattice Z ⊆ Ψ, where (Ψ, D; ≤, ⊥, ·, ǫ) is a Boolean information algebra, the function sp(ψ) = 1 − pl(ψ^c) is a support function on a join-semilattice E ⊆ Ψ. Based on this remark we conclude that there is a random mapping generating sp, hence pl. In fact, the canonical random mapping ν (see Section 11.3) generates the plausibility function plν(ψ) = 1 − spν(ψ^c) on Ψ, which is the maximal extension of pl from Z to Ψ. If the Boolean information algebra (Ψ, D; ≤, ⊥, ·, ǫ) is algebraic, the random mapping σ (11.16) generates the maximal continuous extension plσ(ψ) = 1 − spσ(ψ^c) (see Theorem 11.12). And the random mapping γ (11.17) generates, according to (11.19), a condensable plausibility function (Theorem 11.14). This concludes the discussion of the duality between support and plausibility in Boolean information algebras.

References

Beeri, C., Fagin, R., Maier, D., & Yannakakis, M. 1983. On the Desirability of Acyclic Database Schemes. Journal of the ACM, 30(3), 479–513.

Billingsley, P. 1995. Probability and Measure. John Wiley, New York.

Choquet, G. 1953–1954. Theory of Capacities. Annales de l'Institut Fourier, 5, 131–295.

Choquet, G. 1969. Lectures on Analysis. Benjamin, New York.

Clifford, A. H., & Preston, G. B. 1967. Algebraic Theory of Semigroups. Providence, Rhode Island: American Mathematical Society.

Cowell, R. G., Dawid, A. P., Lauritzen, S. L., & Spiegelhalter, D. J. 1999. Probabilistic Networks and Expert Systems. Information Sci. and Stats. Springer, New York.

Cuzzolin, F. 2005. Algebraic Structure of the Families of Compatible Frames of Discernment. Ann. of Mathematics and Artificial Intelligence, 45, 241–274.

Davey, B.A., & Priestley, H.A. 1990. Introduction to Lattices and Order. Cambridge University Press.

Davey, B.A., & Priestley, H.A. 2002. Introduction to Lattices and Order. Cambridge University Press.

Dawid, A. P. 2001. Separoids: A Mathematical Framework for Conditional Independence and Irrelevance. Ann. Math. Artif. Intell., 32(1–4), 335–372.

Dechter, R. 1999. Bucket Elimination: A Unifying Framework for Reasoning. Artificial Intelligence, 113, 41–85.

Dempster, A.P. 1967a. Upper and Lower Probabilities Induced by a Multivalued Mapping. Annals of Math. Stat., 38, 325–339.

Dempster, A.P. 1967b. Upper and Lower Probability Inferences Based on a Sample from a Finite Univariate Population. Biometrika, 54, 515–528.


Gierz, G., et al. 2003. Continuous Lattices and Domains. Cambridge University Press.

Gottlob, G., Leone, N., & Scarcello, F. 1999a. A Comparison of Structural CSP Decomposition Methods. Pages 394–399 of: Proceedings of the 16th International Joint Conference on Artificial Intelligence IJCAI. Morgan Kaufmann.

Gottlob, G., Leone, N., & Scarcello, F. 1999b. Hypertree decompositions and tractable queries. Pages 21–32 of: PODS ’99: Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. New York, NY, USA: ACM Press.

Gottlob, G., Leone, N., & Scarcello, F. 2001. The complexity of acyclic conjunctive queries. J. ACM, 48(3), 431–498.

Grätzer, G. 1978. General Lattice Theory. Academic Press.

Guan, Xuechong, & Li, Yongming. 2010. On Compact Information Algebra. Unpublished paper, 1–16.

Guan, Xuechong, & Li, Yongming. 2012. On Two Types of Continuous Information Algebras. Int. J. Unc. Fuzzy Knowl. Based Syst., 20, 655– 672.

Guang, X., Li, Y., & Kohlas, J. 2015. On Conditions for Semirings to Induce Compact Information Algebras. Mathematical Structures in Computer Science, 1–10.

Haenni, R., Kohlas, J., & Lehmann, N. 2000. Probabilistic Argumentation Systems. Pages 221–287 of: Kohlas, J., & Moral, S. (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Volume 5: Algorithms for Uncertainty and Defeasible Reasoning. Kluwer, Dordrecht.

Halmos, Paul R. 1963. Lectures on Boolean Algebras. Van Nostrand-Reinhold.

Jirousek, R. 1997. Composition of Probability Measures on Finite Spaces. Pages 274–281 of: Geiger, D., & Shenoy, P. (eds), Uncertainty in Artificial Intelligence. UAI. Morgan Kaufmann.

Jirousek, R. 2011. Foundations of Compositional Model Theory. Int. J. of General Systems, 40, 623–678.

Jirousek, R., & Shenoy, P. 2014. Compositional Models in Valuation Based Systems. Int. J. of Approximate Reasoning, 55, 277–293.

Jirousek, R., & Shenoy, P. 2015. Causal Compositional Models in Valuation Based Systems with Examples in Specific Theories. Int. J. of Approximate Reasoning.

Kappos, D. A. 1969. Probability Algebras and Stochastic Spaces. New York: Academic Press.

Kelley, J.L. 1955. General Topology. D. Van Nostrand Company, Princeton, New Jersey.

Kohlas, J. 1997. Allocation of Arguments and Evidence Theory. Theoretical Computer Science, 171, 221–246.

Kohlas, J. 2003a. Information Algebras: Generic Structures for Inference. Springer-Verlag.

Kohlas, J. 2003b. Probabilistic Argumentation Systems. A New Way to Combine Logic with Probability. J. of Applied Logic, 1, 225–253.

Kohlas, J., & Monney, P.-A. 2007. An algebraic theory for statistical information based on the theory of hints. Int. J. Approx. Reason., doi:10.1016/j.ijar.2007.05.003.

Kohlas, J., & Monney, P.A. 1994. Representation of Evidence by Hints. Pages 473–492 of: R.R. Yager, J. Kacprzyk, & Fedrizzi, M. (eds), Advances in the Dempster-Shafer Theory of Evidence. Wiley.

Kohlas, J., & Monney, P.A. 1995. A Mathematical Theory of Hints. An Approach to the Dempster-Shafer Theory of Evidence. Lecture Notes in Economics and Mathematical Systems, vol. 425. Springer.

Kohlas, J., & Schmid, J. 2014. An Algebraic Theory of Information: An Introduction and Survey. Information, 5, 219–254.

Kohlas, J., & Schmid, J. 2016. Commutative Information Algebras and Their Representation Theory. To be published, Department of Informatics, University of Fribourg.

Kohlas, J., & Shenoy, P.P. 2000. Computation in Valuation Algebras. Pages 5–39 of: Kohlas, J., & Moral, S. (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Volume 5: Algorithms for Uncertainty and Defeasible Reasoning. Kluwer, Dordrecht.

Kohlas, J., & Wilson, N. 2006. Exact and Approximate Local Computation in Semiring Induced Valuation Algebras. Artificial Intelligence, 172, 1360–1399.

Kohlas, Jürg, & Schneuwly, Cesar. 2009. Information Algebra. Pages 95–127 of: Sommaruga, Giovanni (ed), Formal Theories of Information. Lecture Notes in Computer Science, vol. 5363. Springer.

Lauritzen, S. L., & Jensen, F. V. 1997. Local Computation with Valuations from a Commutative Semigroup. Ann. Math. Artif. Intell., 21(1), 51–69.

Lauritzen, S. L., & Spiegelhalter, D. J. 1988. Local computations with probabilities on graphical structures and their application to expert systems. J. Royal Statis. Soc. B, 50, 157–224.

Maier, D. 1983. The Theory of Relational Databases. London: Pitman.

Mitsch, H. 1986. A Natural Partial Order for Semigroups. Proc. Amer. Math. Soc., 97, 384–388.

Nambooripad, K.S.S. 1980. The Natural Partial Order of a Regular Semigroup. Proc. Edinburgh Math. Soc., 23, 249–260.

Norberg, T. 1989. Existence theorems for measures on continuous posets, with applications to random set theory. Math. Scand., 64, 15–51.

Phelps, R.R. 2001. Lectures on Choquet's Theorem. Springer, Lecture Notes in Mathematics.

Pouly, M., & Kohlas, J. 2011. Generic Inference. A Unified Theory for Automated Reasoning. Wiley, Hoboken, New Jersey.

Shafer, G. 1973. Allocation of Probability: A Theory of Partial Belief. Ph.D. thesis, Princeton University.

Shafer, G. 1976. A Mathematical Theory of Evidence. Princeton University Press.

Shafer, G. 1979. Allocations of Probability. Ann. of Prob., 7, 827–839.

Shafer, G. 1991. An Axiomatic Study of Computation in Hypertrees. Working Paper 232. School of Business, University of Kansas.

Shafer, G. 1996. Probabilistic Expert Systems. CBMS-NSF Regional Conference Series in Applied Mathematics, no. 67. Philadelphia, PA: SIAM.

Shafer, G., & Shenoy, P. 1990. Axioms for Probability and Belief Function Propagation. In: Shafer, G., & Pearl, J. (eds), Readings in Uncertain Reasoning. Morgan Kaufmann Publishers Inc., San Mateo, California.

Shafer, G., Shenoy, P.P., & Mellouli, K. 1987. Propagating Belief Functions in Qualitative Markov Trees. Int. J. of Approximate Reasoning, 1(4), 349–400.

Shenoy, P. P., & Shafer, G. 1990a. Axioms for probability and belief-function propagation. Pages 169–198 of: Shachter, Ross D., Levitt, Tod S., Kanal, Laveen N., & Lemmer, John F. (eds), Uncertainty in Artificial Intelligence 4. Machine intelligence and pattern recognition, vol. 9. Amsterdam: Elsevier.

Shenoy, P.P. 1992. Valuation-Based Systems: A Framework for Managing Uncertainty in Expert Systems. Pages 83–104 of: Zadeh, L.A., & Kacprzyk, J. (eds), Fuzzy Logic for the Management of Uncertainty. John Wiley & Sons.

Shenoy, P.P. 1994a. Conditional Independence in Valuation-based Systems. International Journal of Approximate Reasoning, 10, 203–234.

Shenoy, P.P. 1994b. Using Dempster-Shafer's Belief Function Theory in Expert Systems. Pages 395–414 of: R.R. Yager, J. Kacprzyk, & Fedrizzi, M. (eds), Advances in the Dempster-Shafer Theory of Evidence. John Wiley & Sons.

Shenoy, P.P. 1996. Axioms for Dynamic Programming. Pages 259–275 of: Gammerman, A. (ed), Computational Learning and Probabilistic Reasoning. Wiley, Chichester, UK.

Shenoy, P.P., & Shafer, G. 1990b. Axioms for Probability and Belief Function Propagation. Pages 169–198 of: R.D. Shachter, T.S. Levitt, J.F. Lemmer, & Kanal, L.N. (eds), Uncertainty in Artif. Intell. 4. North Holland.

Stoltenberg-Hansen, V., Lindström, I., & Griffor, E. 1994. Mathematical Theory of Domains. Cambridge: Cambridge University Press.

Studeny, M. 1993. Formal Properties of Conditional Independence in Different Calculi of AI. Pages 341–348 of: Clarke, Michael, Kruse, Rudolf, & Moral, Serafín (eds), Symbolic and Quantitative Approaches to Reasoning and Uncertainty. Lecture Notes in Computer Science, vol. 747. Springer, Berlin.

Studeny, M. 1995. Conditional Independence and Natural Conditional Functions. Int. J. of Approximate Reasoning, 12(1), 43–68.