Algebraic Structure of Information
Jürg Kohlas ∗
Department of Informatics
University of Fribourg
CH-1700 Fribourg (Switzerland)
E-mail: [email protected]
http://diuf.unifr.ch/drupal/tns/juerg kohlas
March 23, 2016

Abstract

Information comes in pieces which can be aggregated or combined, and from each piece of information the part relating to a given question can be extracted. This consideration leads to an algebraic structure of information, called an information algebra. It is a generalisation of valuation algebras, formerly introduced for the purpose of generic local computation in certain inference problems similar to Bayesian networks, but for more general uncertainty formalisms or representations of information. The new issue is that the algebra is based on a mathematical framework for describing conditional independence and irrelevance. The older valuation algebras are special cases of the new generalised information algebras. It is shown that these algebras allow for generic local computation in Markov trees. The algebraic theory of generalised information algebras is elaborated to some extent. The duality theory between labeled and domain-free versions is presented. A new issue is the development of information order, not only for idempotent algebras, as has been done formerly, but more generally also for non-idempotent algebras. Further, for the case of idempotent information algebras, issues relating to finiteness of information and approximation are discussed, generalizing results known for the more special case of idempotent valuation algebras.

∗ Research supported by grant No. 2100-042927.95 of the Swiss National Foundation for Research.

Contents

1 Introduction
2 Conditional Independence
    2.1 Quasi-Separoids
    2.2 Family of Compatible Frames
    2.3 Markov Trees, Hypertrees and Join Trees
3 Labeled Algebras of Information
    3.1 Axioms
    3.2 Valuation Algebras
    3.3 Semiring Valuations
4 Local Computation
    4.1 Computing in Markov Trees
    4.2 Computation in Hypertrees
5 Division and Inverses
    5.1 Separative Semigroups
    5.2 Regular Valuation Algebras
    5.3 Separative Valuation Algebras
    5.4 Computing with Division
    5.5 Separative Semiring Valuations
6 Conditionals
    6.1 Conditionals and factorisations
    6.2 Causal Models
    6.3 Probabilistic Argumentation
    6.4 Compositional Models
7 Domain-Free Algebras of Information
    7.1 Unlabeling of Information
    7.2 Domain-Free Algebras
    7.3 Duality
    7.4 Separativity
8 Information Order
    8.1 The Idempotent Case
    8.2 Regular Algebras
    8.3 Separative Algebras
9 Proper or Idempotent Information
    9.1 Ideal Completion
    9.2 Compact Algebras
    9.3 Duality For Compact Algebras
    9.4 Continuous Algebras
    9.5 Atomic Algebras
10 Conclusion
References

1 Introduction

Information refers to questions; that is, it provides answers to questions, although possibly only partial ones. It should be possible to combine or aggregate pieces of information. Also, from a piece of information, it should be possible to extract the part referring to a given question. This simple idea seems not to be very widespread in information theory. However, behind this idea is hidden an algebraic structure, which has been proposed and used for computational purposes for some time already, albeit without recourse to the interpretation as information. Here we reconsider these algebraic structures, but on a more general basis than before. Although we also relate these structures to computational problems and schemes, we look at them from a semantic point of view of questions and information. So, the two new issues developed here are

1. a new, more general algebraic, axiomatic structure,
2. a systematic semantic interpretation of the structure relating to information.
This is a mathematical text, but it is applied insofar as it is motivated by a particular view of what information is; it models certain aspects of information. So, what we propose here is an algebraic theory of information.

Some time ago, in (Shenoy & Shafer, 1990a), a simple axiomatic system was presented which allows the local computation scheme proposed in (Lauritzen & Spiegelhalter, 1988) to be applied in a generic setting beyond the special case of probability theory, in particular also in the case of belief functions. This algebraic structure was taken up in (Kohlas, 2003a), where it was for the first time related to information. The elements of the algebraic structure were seen as pieces of information, the combination operation as aggregation of information and the projection operation as extraction of information. This picture is especially pertinent if the algebra is idempotent, that is, if combination of identical pieces of information gives nothing new. In this idempotent case the algebraic structure was called an information algebra in (Kohlas, 2003a), and otherwise a valuation algebra.

This algebraic structure is sufficient to allow local computation in the sense of (Lauritzen & Spiegelhalter, 1988). The pieces of information (in our terms) are usually given by valuations of sets of variables, such as probability potentials, possibility measures, logical valuations, etc. In local computation, structures termed join or junction trees, hypertrees and also Markov trees are used. For the case of valuations of sets of variables they are essentially all the same. However, in (Shafer, 1991) it was noted that the axiomatic scheme applies under more general circumstances. In particular, the structure of domains or questions (in our terms) could be allowed to be any lattice and not only a distributive lattice of subsets as in the case of multivariate models. This is one line of generalisation we take up here.
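
The reading of combination as aggregation and of projection as extraction can be illustrated with a small sketch. It assumes one familiar idempotent instance, relations over sets of variables, with natural join as combination and relational projection as extraction. The names `relation`, `combine` and `project` are hypothetical helpers chosen for this illustration, not notation from the text.

```python
def relation(domain, rows):
    """Build a piece of information: a set of rows (dicts) over `domain`."""
    return frozenset(domain), frozenset(frozenset(r.items()) for r in rows)


def combine(r, s):
    """Aggregation, here assumed to be the natural join of two relations."""
    (d1, rows1), (d2, rows2) = r, s
    shared = d1 & d2
    joined = set()
    for a in rows1:
        da = dict(a)
        for b in rows2:
            db = dict(b)
            # Keep only pairs of rows that agree on the shared variables.
            if all(da[v] == db[v] for v in shared):
                joined.add(frozenset({**da, **db}.items()))
    return d1 | d2, frozenset(joined)


def project(r, domain):
    """Extraction: restrict every row to the variables in `domain`."""
    d, rows = r
    domain = frozenset(domain)
    if not domain <= d:
        raise ValueError("can only project to a subset of the domain")
    return domain, frozenset(
        frozenset((v, x) for v, x in row if v in domain) for row in rows
    )


r = relation({"a", "b"}, [{"a": 0, "b": 0}, {"a": 1, "b": 1}])
s = relation({"b", "c"}, [{"b": 1, "c": 5}])

# Idempotency: combining a piece of information with itself gives nothing new.
assert combine(r, r) == r
# Extracting the part of the combined information that refers to {"a"}.
assert project(combine(r, s), {"a"}) == relation({"a"}, [{"a": 1}])
```

Probability potentials, by contrast, are not idempotent under multiplication, which is why they form a valuation algebra rather than an information algebra in the terminology above.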
Local computation and the underlying structures like join and junction trees, Markov trees, etc. are closely related to conditional independence relations; relations which came up originally in probability theory and also in relational database theory (Beeri et al., 1983; Maier, 1983), but which can be extended to more general valuation algebras (Shenoy, 1994; Kohlas, 2003a). On the other hand, (Dawid, 2001) proposed separoids as a mathematical framework for conditional independence and irrelevance. This is taken up here as a second line of generalisation of valuation algebras. It is shown here that a weakening of the concept of a separoid is already sufficient for local computation. This leads to a new algebraic structure of information, which covers the old one as a particular case.

The outline of this work is as follows: In Section 2 the system of questions or domains underlying our structure is modeled as a join-semilattice. The order between questions reflects their "granularity", and the join of two questions represents the combined question as the coarsest question finer than both original questions. The essential additional element is a ternary relation of conditional independence of two questions given a third one. This relation is subject to four basic requirements and defines a structure which we call a quasi-separoid or q-separoid. It is shown how the classical conditional independence structures of join trees, hypertrees and Markov trees, defined for sets of variables, can quite naturally be extended to q-separoids. It turns out, however, that although any Markov tree is a hypertree and any hypertree is a join tree, join trees and hypertrees are not Markov trees in general. Equivalence between these three concepts holds only in the special case of a particular q-separoid in a distributive lattice. This is the case for the widespread multivariate model of sets of variables.
Further, a particular instance of a q-separoid is introduced, based on families of compatible frames, generalizing partitions. This is an example where the equivalence mentioned above no longer holds.

Next, in Section 3 the axioms of a generalised information algebra in its labeled form are presented. A few elementary results on these algebras, clarifying the meaning of conditional independence and irrelevance (of pieces of information), are given. It is shown that for a particular class of q-separoids based on lattices (instead of merely join-semilattices) the generalised information algebras reduce to valuation algebras. Conversely, valuation algebras permit, under certain conditions, the reconstruction of generalised information algebras; they are special cases of generalised information algebras. We should mention that the valuation algebras obtained from generalised information algebras satisfy the so-called stability condition (Shafer, 1991; Kohlas, 2003a), which is not really needed for local computation. For instance, probability potentials, underlying Bayesian networks, do not satisfy this condition. Finally, it is shown how a large class of information and valuation algebras can be obtained from commutative semirings.

For these generalised information algebras, local computation in Markov trees is developed in Section 4. It is shown that the classical collect and distribute algorithms work for generalised information algebras. On hypertrees, however, in general only the collect algorithm is available, and on join trees neither of them.

In some cases it is possible to remove information. Mathematically this corresponds to an operation inverse to combination. The theory of separative semigroups (Section 5.1) provides the basis for this. This theory is extended in Sections 5.2 and 5.3 to a theory of separative and regular valuation algebras.
It is shown that in these cases the valuation algebra can be embedded in a semigroup which is a union of disjoint groups. In each of these groups local inverses allow division. In some sense these inverses are compatible with the operation of information extraction or projection. In particular, division permits the use of certain efficient architectures of local computation, known from probability networks, in the more general setting of valuation algebras. Finally, separativity of semiring valuation algebras may be inherited from a corresponding notion in semiring theory (Section 5.5).
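
The way valuations arise from a commutative semiring can be sketched as follows. This is only an illustration under stated assumptions, not the construction of Section 3.3 itself: all variables are assumed binary, a valuation is stored as a pair of a variable tuple and a dictionary of weights, and the names `FRAME`, `combine` and `project` are hypothetical. Combination uses the semiring multiplication pointwise, and projection eliminates variables with the semiring addition.

```python
import operator
from itertools import product

FRAME = (0, 1)  # assumption: every variable takes the values 0 and 1


def combine(p, q, mul):
    """Pointwise semiring product on the union of the two domains."""
    (vp, tp), (vq, tq) = p, q
    variables = tuple(sorted(set(vp) | set(vq)))
    table = {}
    for values in product(FRAME, repeat=len(variables)):
        x = dict(zip(variables, values))
        table[values] = mul(tp[tuple(x[v] for v in vp)],
                            tq[tuple(x[v] for v in vq)])
    return variables, table


def project(p, keep, add):
    """Eliminate all variables outside `keep` using semiring addition."""
    vp, tp = p
    variables = tuple(v for v in vp if v in keep)
    table = {}
    for values, weight in tp.items():
        key = tuple(x for v, x in zip(vp, values) if v in keep)
        table[key] = add(table[key], weight) if key in table else weight
    return variables, table


# Integer weights keep the arithmetic exact in this illustration.
p = (("a",), {(0,): 2, (1,): 3})
q = (("a", "b"), {(0, 0): 1, (0, 1): 4, (1, 0): 5, (1, 1): 0})

joint = combine(p, q, operator.mul)

# Arithmetic semiring (+, *): unnormalised marginal on b.
assert project(joint, {"b"}, operator.add) == (("b",), {(0,): 17, (1,): 8})
# Max-product semiring (max, *) on non-negative numbers: best weight per b.
assert project(joint, {"b"}, max) == (("b",), {(0,): 15, (1,): 8})
```

Only the pair of semiring operations changes between the two calls; the combination and projection code is shared, which is the sense in which a single generic scheme covers several formalisms.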