Algebraic Structure of Information

Jürg Kohlas∗
Department of Informatics, University of Fribourg
CH–1700 Fribourg (Switzerland)
E-mail: [email protected]
http://diuf.unifr.ch/drupal/tns/juerg kohlas

March 23, 2016

Abstract

Information comes in pieces which can be aggregated or combined, and from each piece of information the part relating to a given question can be extracted. This consideration leads to an algebraic structure of information, called an information algebra. It is a generalisation of valuation algebras, formerly introduced for the purpose of generic local computation in certain inference problems similar to Bayesian networks, but for more general uncertainty formalisms or representations of information. The new issue is that the algebra is based on a mathematical framework for describing conditional independence and irrelevance. The older valuation algebras are special cases of the new generalised information algebras. It is shown that these algebras allow for generic local computation in Markov trees. The algebraic theory of generalised information algebras is elaborated to some extent. The duality theory between labeled and domain-free versions is presented. A new issue is the development of information order, not only for idempotent algebras, as has been done formerly, but more generally also for non-idempotent algebras. Further, for the case of idempotent information algebras, issues relating to finiteness of information and approximation are discussed, generalizing results known for the more special case of idempotent valuation algebras.

∗Research supported by grant No. 2100–042927.95 of the Swiss National Foundation for Research.


Contents

1 Introduction

2 Conditional Independence
2.1 Quasi-Separoids
2.2 Family of Compatible Frames
2.3 Markov Trees, Hypertrees and Join Trees

3 Labeled Algebras of Information
3.1 Axioms
3.2 Valuation Algebras
3.3 Semiring Valuations

4 Local Computation
4.1 Computing in Markov Trees
4.2 Computation in Hypertrees

5 Division and Inverses
5.1 Separative Semigroups
5.2 Regular Valuation Algebras
5.3 Separative Valuation Algebras
5.4 Computing with Division
5.5 Separative Semiring Valuations

6 Conditionals
6.1 Conditionals and Factorisations
6.2 Causal Models
6.3 Probabilistic Argumentation
6.4 Compositional Models

7 Domain-Free Algebras of Information
7.1 Unlabeling of Information
7.2 Domain-Free Algebras
7.3 Duality
7.4 Separativity

8 Information Order
8.1 The Idempotent Case
8.2 Regular Algebras
8.3 Separative Algebras

9 Proper or Idempotent Information
9.1 Ideal Completion
9.2 Compact Algebras
9.3 Duality for Compact Algebras
9.4 Continuous Algebras
9.5 Atomic Algebras

10 Conclusion

References

1 Introduction

Information refers to questions, that is, it provides answers to questions, although possibly only partial ones. It should be possible to combine or aggregate pieces of information. Also, from a piece of information, it should be possible to extract the part referring to a given question. This simple idea seems not to be very widespread in information theory. However, behind this idea is hidden an algebraic structure, which has been proposed and used for computational purposes already for some time, albeit without recourse to the interpretation as information. Here we reconsider these algebraic structures, but on a more general basis than before. Although we also relate these structures to computational problems and schemes, we look at them from a semantic view of questions and information. So, the two new issues developed here are

1. a new, more general algebraic axiomatic structure,

2. a systematic semantic interpretation of the structure relating to information.

This is a mathematical text; but it is applied insofar as it is motivated by a particular view about what information is; it models certain aspects of information. So, what we propose here is an algebraic theory of information. Some time ago, in (Shenoy & Shafer, 1990a) a simple axiomatic system was presented, which allows one to apply a local computation scheme proposed in (Lauritzen & Spiegelhalter, 1988) in a generic setting beyond the special case of probability theory, especially also in the case of belief functions. This algebraic structure was taken up in (Kohlas, 2003a), where it was for the first time related to information. The elements of the algebraic structure were seen as pieces of information, the combination operation as aggregation of information and the projection operation as extraction of information. This picture is especially pertinent if the algebra is idempotent, that is, if combination of identical pieces of information gives nothing new. Then, in (Kohlas, 2003a) this algebraic structure was called an information algebra, otherwise a valuation algebra. This algebraic structure is sufficient to allow local computation in the sense of (Lauritzen & Spiegelhalter, 1988). The pieces of information (in our terms) are usually given by valuations on sets of variables, such as probability potentials, possibility measures, logical valuations, etc. In local computation, structures termed join or junction trees, hypertrees and also Markov trees are used. For the case of valuations on sets of variables they are essentially all the same. However, in (Shafer, 1991) it was noted that the axiomatic scheme applies under more general circumstances. In particular, the structure of domains or questions (in our terms) could be allowed to be any lattice and not only a distributive lattice of subsets as in the case of multivariate models. This is one line of generalisation we take up here.
Local computation and the underlying structures like join and junction trees, Markov trees, etc. are closely related to conditional independence relations; relations which came up originally in probability theory and also relational database theory (Beeri et al., 1983; Maier, 1983), but which can be extended to more general valuation algebras (Shenoy, 1994; Kohlas, 2003a). On the other hand, (Dawid, 2001) proposed separoids as a mathematical framework for conditional independence and irrelevance. This is taken up here as a second line of generalisation of valuation algebras. It is shown here that a weakening of the concept of a separoid is already sufficient for local computation. This leads then to a new algebraic structure of information, which covers the old one as a particular case. The outline of this work is as follows: In Section 2 the system of questions or domains underlying our structure is modeled as a join-semilattice. The order between questions reflects their "granularity" and the join of two questions represents the combined question, the coarsest question finer than both original questions. The essential additional element is then a ternary relation of conditional independence of two questions given a third one. This relation is subject to four basic requirements and then defines a structure which we call a quasi-separoid or q-separoid. It is shown how the classical conditional independence structures of join trees, hypertrees and Markov trees defined for sets of variables can quite naturally be extended to q-separoids. It turns out, however, that although any Markov tree is a hypertree and any hypertree is a join tree, join trees and hypertrees are not Markov trees in general. Equivalence between these three concepts holds only in the special case of a particular q-separoid in a distributive lattice. This is the case for the widespread multivariate model of sets of variables.
Further, a particular instance of a q-separoid is introduced, based on families of compatible frames, generalizing partitions. This is then an example where the equivalence mentioned above no longer holds. Next, in Section 3 the axioms of a generalised information algebra in its labeled form are presented. A few elementary results on these algebras, clarifying the meaning of conditional independence and irrelevance (of pieces of information), are given. It is shown that for a particular class of q-separoids based on lattices (instead of merely join-semilattices) the generalised information algebras reduce to valuation algebras. Conversely, valuation algebras permit, under certain conditions, to reconstruct generalised information algebras; they are special cases of generalised information algebras. We should mention that the valuation algebras obtained from generalised information algebras satisfy the so-called stability condition (Shafer, 1991; Kohlas, 2003a), which is not really needed for local computation. For instance probability potentials, underlying Bayesian networks, do not satisfy this condition. Finally, it is shown how a large class of information and valuation algebras can be obtained from commutative semirings. For these generalised information algebras, local computation in Markov trees is developed in Section 4. It is shown that the classical collect and distribute algorithms work for generalised information algebras. On hypertrees, however, in general only the collect algorithm is available, and for join trees neither of them. In some cases it is possible to remove information. Mathematically this corresponds to an operation inverse to combination. The theory of separative semigroups (Section 5.1) provides the basis for this. This theory is in Section 5 extended to a theory of separative and regular valuation algebras.
It is shown that in these cases the valuation algebra can be embedded in a semigroup which is a union of disjoint groups. In each of these groups local inverses allow division. And in some sense these inverses are compatible with the operation of information extraction or projection. In particular, division permits the use of certain efficient architectures of local computation, known from probability networks, in the more general setting of valuation algebras. Finally, separativity of semiring valuation algebras may be inherited from a corresponding notion in semiring theory (Section 5.5). Conditional probability distributions play an important role in probabilistic modeling. It turns out that a large part of the theory of conditioning in (discrete) probability theory can be carried over to valuation algebras. This is the subject of Section 6. In particular, the concepts of causal and compositional models can be generalised from probability to separative valuation algebras. An important instance of a separative valuation algebra is the algebra of so-called probability potentials. It provides the basis of local computation in probabilistic networks, in particular probabilistic causal models. In Section 6.3 an alternative interpretation of probability potentials is presented, based on probabilistic argumentation. The stability condition imposed on generalised information algebras permits the derivation of a second equivalent algebraic system, the domain-free information algebras. Conversely, from a domain-free algebra a labeled information algebra may be retrieved. In fact, there is a full duality theory between these two forms of information algebras. Thanks to this duality, one may freely select one or the other structure, according to convenience. The labeled algebra is better suited for computational purposes, the domain-free for theoretical, algebraic studies. This is discussed in detail in Section 7.
A natural concept is the comparison of pieces of information regarding their information content. Some pieces may be more informative than other ones. In Section 8 this question is addressed, starting from the idea that a piece of information is more informative than another one if it is obtained from the latter by combination with a third one. It is shown that this leads in fact to a preorder representing information content. If the information algebra is idempotent, then the preorder becomes a partial order. This is a particularly important case and we speak then of a proper information algebra. Idempotent valuation algebras were called information algebras in (Kohlas, 2003a) and studied in detail in (Kohlas, 2003a; Kohlas & Schmid, 2014a; Kohlas & Schmid, 2014b). For the case of proper generalised information algebras, the subject is reconsidered here in Section 9. What is new here is the study of the preorder for non-idempotent information algebras. Orders in semigroups have already been treated in the literature on semigroups, see for instance (Nambooripad, n.d.; Mitsch, n.d.). They become interesting especially for regular and separative semigroups. So, it turns out that the concepts of inverses and division in information algebras, as discussed in Section 5 in relation to labeled valuation algebras and local computation, become relevant for the very different subject of information order. In Section 8, the theory of separative and regular semigroups is adapted to domain-free information algebras. It is shown that in these cases the preorder is natural in the sense that it is maintained by combination and extraction of information. In the final Section 9 we reconsider idempotent valuation algebras and extend some of the results about idempotent valuation algebras from (Kohlas, 2003a) to generalised idempotent information algebras.
In particular we consider the concepts of finiteness (finite information) and approximation (of information). This leads to compact, algebraic and continuous information algebras. It is shown how these concepts relate to the similar concepts of lattice or domain theory. In particular the duality theory of these categories of information algebras is presented. Last, atomic and atomistic information algebras are considered and linked to algebras based on families of compatible frames. This is a small beginning of a representation theory for generalised information algebras, similar to the more developed theory for idempotent valuation algebras (Kohlas & Schmid, 2014b).

2 Conditional Independence

2.1 Quasi-Separoids

Information informs about possible answers to questions. Therefore the first task in developing an algebraic theory of information consists in modeling appropriate systems of questions. We may think of questions as represented by domains describing somehow the possible answers to questions; for instance by simply listing the possible answers. This simple idea will be pursued in Section 2.2. Here, at this point, we want to be more general. Let D be a set whose elements are thought to represent domains. Its generic elements will be denoted by lowercase letters like x, y, z, . . .. We assume that domains or questions can be compared with respect to their granularity or fineness. Therefore we assume (D; ≤) to be a partial order, where x ≤ y means that y is finer than x, that is, answers to y will be more informative than answers to x. Moreover, if x and y are two elements of D, we want to be able to consider the combined question represented by x and y. This is surely a question finer than both x and y. So the combined question would be the coarsest question finer than x and y, that is, the supremum or join of x and y. Therefore we assume that D contains with any pair x, y also its join x ∨ y. Hence we shall assume throughout that D is a join-semilattice. For more about questions and answers we refer to (Groenendijk, 2003; Groenendijk & Stokhof, 1997; Groenendijk & Stokhof, 1984). Now, between domains in D we want to define a relation x⊥y|z which describes conditional independence of x and y given z. This relation is thought to express the idea that an information relative to x restricts the possible answers to y only through its part relative to z, and vice versa. Or, in other words, only the part relative to z of an information relative to x is relevant as an information relative to y, and vice versa. In still other words: Given information relative to domain z, information relative to x gives no additional information relative to y, and vice versa.
Rather than to give some explicit definition of this relation in D, we only require it to satisfy the following four conditions:

C1 x⊥y|y for all x, y ∈ D,

C2 x⊥y|z implies y⊥x|z,

C3 x⊥y|z and w ≤ y imply x⊥w|z,

C4 x⊥y|z implies x⊥y ∨ z|z.

A join-semilattice D together with a relation x⊥y|z, written (D; ≤, ⊥), satisfying conditions C1 to C4 will be called a quasi-separoid (or also q-separoid). In the literature two additional conditions are usually assumed for a relation of conditional independence (Dawid, 2001):

C5 x⊥y|z and w ≤ y imply x⊥y|z ∨ w,

C6 x⊥y|z and x⊥w|y ∨ z imply x⊥y ∨ w|z.

Then D is called a separoid. If D is a lattice, then yet another condition can be added:

C7 If z ≤ y and w ≤ y, then x⊥y|z and x⊥y|w imply x⊥y|z ∧ w.

With this additional condition D is called a strong separoid. For a detailed discussion of separoids we refer to (Dawid, 2001). For example it can be shown that C1 to C3 together with C5 and C6 imply C4. For our purposes, that is in particular for the study of local computation (Section 4), C1 to C4 are sufficient. This is thought to be one of the main results of this text. Families of compatible frames, as studied in the next section, provide an important example of a quasi-separoid, where D is in general only a join-semilattice. Here, we present important examples where D is a lattice. Define x⊥Ly|z to hold if and only if

(x ∨ z) ∧ (y ∨ z) = z. (2.1)

Theorem 1 If D is a lattice, the relation x⊥Ly|z defines a quasi-separoid.

Proof. We have (x ∨ y) ∧ (y ∨ y) = y, hence C1 is satisfied. By the symmetry of the definition C2 holds too. If w ≤ y, then z ≤ (x ∨ z) ∧ (w ∨ z) ≤ (x ∨ z) ∧ (y ∨ z) = z, so C3 follows. Finally from (2.1) we see that C4 is valid. ⊓⊔

If x ≤ y, then from x⊥y|y (C1) it follows that x⊥x|y by C3. Now, in some cases x⊥x|y implies x ≤ y. A separoid with this property is called basic, see (Dawid, 2001). We adapt this to call a quasi-separoid basic if x⊥x|y implies x ≤ y. The following theorem was proved in (Dawid, 2001) for a basic separoid, but it is valid for basic quasi-separoids too.
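The relation (2.1) and conditions C1 to C4 can also be verified mechanically on a small finite lattice. The following Python sketch (an illustration only; the encoding and the helper name `ci` are ours) checks C1 to C4 for ⊥L on the powerset lattice of a three-element set, where join is union and meet is intersection.

```python
from itertools import chain, combinations

# Powerset lattice of a small universe: join = union, meet = intersection.
U = frozenset({1, 2, 3})
D = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(U), k) for k in range(len(U) + 1))]

def ci(x, y, z):
    """x perp_L y | z  iff  (x v z) ^ (y v z) = z, as in (2.1)."""
    return (x | z) & (y | z) == z

# C1: x _|_ y | y for all x, y.
assert all(ci(x, y, y) for x in D for y in D)
# C2: x _|_ y | z implies y _|_ x | z.
assert all(ci(y, x, z) for x in D for y in D for z in D if ci(x, y, z))
# C3: x _|_ y | z and w <= y imply x _|_ w | z.
assert all(ci(x, w, z)
           for x in D for y in D for z in D if ci(x, y, z)
           for w in D if w <= y)
# C4: x _|_ y | z implies x _|_ (y v z) | z.
assert all(ci(x, y | z, z) for x in D for y in D for z in D if ci(x, y, z))
```

The same brute-force pattern works for any finite lattice once join and meet are encoded.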

Theorem 2 Suppose (D; ≤) is a lattice. Then a q-separoid x⊥y|z is basic if and only if

x⊥y|z ⇒ (x ∨ z) ∧ (y ∨ z) = z. (2.2)

Proof. If (2.2) holds, then x⊥x|y implies x ∨ y = y, hence x ≤ y. Suppose now that x⊥y|z. Then (x ∨ z)⊥(y ∨ z)|z by C4 and C2. Define w = (x ∨ z) ∧ (y ∨ z), such that w ≤ x ∨ z and w ≤ y ∨ z. Using C3 and C2 we then deduce that w⊥w|z. So, if the quasi-separoid is basic, we obtain that w ≤ z. Since we always have w ≥ z, it follows that w = z. ⊓⊔

If we meet both sides of (2.1) with x we obtain x ∧ (y ∨ z) = x ∧ z, which is equivalent to

x ∧ (y ∨ z) ≤ z. (2.3)

This condition is equivalent to (2.1) if the lattice D is modular. So, in this case we have x⊥Ly|z if and only if (2.3) holds.

Theorem 3 If (D; ≤) is a lattice, the relation x⊥Ly|z defines a separoid if and only if (D; ≤) is modular.

Proof. Assume D is modular. We are going to show that C5 and C6 are satisfied. If D is modular, then x ∧ (y ∨ z) = x ∧ z if and only if x⊥Ly|z. So, if w ≤ y, it follows that x ∧ (y ∨ z ∨ w) = x ∧ (y ∨ z). Therefore, x ∧ (z ∨ w) ≤ x ∧ (y ∨ z ∨ w) = x ∧ (y ∨ z) = x ∧ z ≤ x ∧ (z ∨ w), hence x ∧ (y ∨ (z ∨ w)) = x ∧ (z ∨ w). This shows that x⊥Ly|z ∨ w, that is C5. Further, x⊥Ly|z and x⊥Lw|y ∨ z imply x ∧ (y ∨ z) = x ∧ z and x ∧ (w ∨ y ∨ z) = x ∧ (y ∨ z). Together, this leads to x ∧ (w ∨ y ∨ z) = x ∧ z, hence x⊥L(y ∨ w)|z. So C6 holds.

On the other hand, assume x⊥Ly|z to be a separoid. By (2.1) we have x⊥Ly|x ∧ y. Thus, if z ≤ x, by C5 it follows that x⊥Ly|(x ∧ y) ∨ z. This means that x ∧ (y ∨ z) = (x ∧ y) ∨ z, which is modularity. ⊓⊔

Further, (2.1) implies that

x ∧ y ≤ z. (2.4)

If the lattice D is distributive, then (x ∨ z) ∧ (y ∨ z) = (x ∧ y) ∨ z. In this case (2.1) is equivalent to (2.4).

Theorem 4 If (D; ≤) is a distributive lattice the relation x⊥Ly|z defines a strong separoid.

Proof. A distributive lattice is modular, so C5 and C6 hold. It remains to prove C7. Assume D distributive, so that x⊥Ly|z if and only if (2.4) holds. Now x⊥Ly|z and x⊥Ly|w imply x ∧ y ≤ z and x ∧ y ≤ w, hence x ∧ y ≤ z ∧ w, which shows that x⊥Ly|z ∧ w. Therefore C7 is satisfied. ⊓⊔
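Theorem 3 can be probed by brute force on the two classical five-element lattices: the pentagon N5, which is not modular, and the diamond M3, which is modular. The following sketch (our own encoding; the helper names are hypothetical) searches all instances of C5 and C6 for the relation ⊥L and finds a violation in N5 but none in M3.

```python
from itertools import product

def refl_trans(elems, covers):
    """Reflexive-transitive closure of a covering relation."""
    pairs = set(covers) | {(x, x) for x in elems}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(pairs), repeat=2):
            if b == c and (a, d) not in pairs:
                pairs.add((a, d)); changed = True
    return pairs

def make_lattice(elems, leq_pairs):
    """Join and meet computed from an explicit finite order."""
    leq = lambda x, y: (x, y) in leq_pairs
    def join(x, y):
        ubs = [u for u in elems if leq(x, u) and leq(y, u)]
        return next(u for u in ubs if all(leq(u, v) for v in ubs))
    def meet(x, y):
        lbs = [l for l in elems if leq(l, x) and leq(l, y)]
        return next(l for l in lbs if all(leq(m, l) for m in lbs))
    return leq, join, meet

def separoid_extras_hold(elems, covers):
    """Do C5 and C6 hold for the L-relation on this lattice?"""
    leq, join, meet = make_lattice(elems, refl_trans(elems, covers))
    ci = lambda x, y, z: meet(join(x, z), join(y, z)) == z
    c5 = all(ci(x, y, join(z, w))
             for x, y, z in product(elems, repeat=3) if ci(x, y, z)
             for w in elems if leq(w, y))
    c6 = all(ci(x, join(y, w), z)
             for x, y, z in product(elems, repeat=3) if ci(x, y, z)
             for w in elems if ci(x, w, join(y, z)))
    return c5 and c6

# Pentagon N5: 0 < a < c < 1 and 0 < b < 1, with b incomparable to a, c.
N5 = ('0abc1', [('0','a'),('a','c'),('c','1'),('0','b'),('b','1')])
# Diamond M3: 0 < a, b, c < 1, with a, b, c pairwise incomparable.
M3 = ('0abc1', [('0','a'),('0','b'),('0','c'),('a','1'),('b','1'),('c','1')])

assert not separoid_extras_hold(*N5)  # not modular: some C5/C6 instance fails
assert separoid_extras_hold(*M3)      # modular: C5 and C6 hold
```

In N5, for instance, b⊥Lc|0 holds and a ≤ c, but b⊥Lc|a fails, so C5 is violated, in agreement with Theorem 3.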

We may also consider the relation x⊥dy|z which holds if and only if x∧y ≤ z. The following theorem is due to (Dawid, 2001):

Theorem 5 The relation x⊥dy|z is a separoid if and only if (D; ≤) is a distributive lattice.

In a distributive lattice x⊥Ly|z if and only if x⊥dy|z, by the discussion above. Therefore if x⊥dy|z is a separoid, it is a strong separoid by Theorem 4. An important instance of a distributive lattice is the lattice of subsets of a set I. If s, t, r denote subsets of I, then s⊥Lt|r if and only if s ∩ t ⊆ r. This is the classical case considered in the large majority of studies on conditional independence. We shall come back to this case and its background in the next section and later.
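In this classical multivariate case the relations can be checked exhaustively. The sketch below (our own encoding; the names `ci_L` and `ci_d` are ours) confirms on the powerset of a three-element index set that ⊥L coincides with the relation s ∩ t ⊆ r, and that condition C7 holds there, as Theorem 4 predicts.

```python
from itertools import chain, combinations

I = frozenset({'X', 'Y', 'Z'})          # index set of variables
subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(I), k) for k in range(len(I) + 1))]

ci_L = lambda s, t, r: (s | r) & (t | r) == r   # relation (2.1)
ci_d = lambda s, t, r: s & t <= r               # s intersect t contained in r

# In the (distributive) subset lattice the two relations coincide.
assert all(ci_L(s, t, r) == ci_d(s, t, r)
           for s in subsets for t in subsets for r in subsets)

# C7: r <= t, w <= t, s_|_t|r and s_|_t|w imply s_|_t|(r ^ w).
assert all(ci_d(s, t, r & w)
           for s in subsets for t in subsets
           for r in subsets if r <= t and ci_d(s, t, r)
           for w in subsets if w <= t and ci_d(s, t, w))
```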

2.2 Family of Compatible Frames

In this section we shall give some semantical background to the abstract definition of conditional independence in the previous section and define an important and very general instance of a q-separoid modeling a system of questions. A question can be represented by the set of its possible answers. By specifying this set, a certain granularity of answers is assumed. It is possible that after reflection, a finer granularity is needed. That is, the possible answers must each be split into several different distinctions. Thus we obtain more possible answers or a more precise or finer question. This is called a refinement of the original question. On the other hand answers to a question may also be grouped together to obtain a coarser granularity or a coarser question. In this way a whole family of compatible questions and domains representing them by specifying the possible answers may be obtained. This idea has been formalized mathematically in (Shafer, 1976) by the concept of a family of compatible frames. This will be the basis of this section, although we shall propose a somewhat more general setting.

A frame (called a frame of discernment in (Shafer, 1976)) Θ is simply a non-empty set whose elements are thought to represent possible answers to the question represented by the frame. Another frame Λ may be obtained from Θ by splitting some or all elements θ of Θ. Mathematically this is represented by specifying for each θ ∈ Θ the subset τ(θ) of Λ consisting of the possibilities into which θ has been split. So, τ : Θ → P(Λ) is a mapping from Θ into the power set of Λ. It must satisfy the following conditions:

1. τ(θ) ≠ ∅ for all θ ∈ Θ,

2. τ(θ′) ∩ τ(θ″) = ∅ if θ′ ≠ θ″,

3. ∪θ∈Θ τ(θ) = Λ.

The map τ is called a refining of Θ, the set Λ a refinement of Θ, and Θ a coarsening of Λ. Note that a refining τ determines a partition of Λ with blocks τ(θ) for θ ∈ Θ. Further, a refining τ may be extended to a map from subsets of Θ into the power set of Λ:

τ(S) = ∪θ∈S τ(θ),

for any subset S of Θ. A refining τ satisfies the following conditions, see (Shafer, 1976):

1. τ is one-to-one,

2. τ(S) = ∅ if and only if S = ∅,

3. τ(Θ) = Λ,

4. τ(∪{S : S ∈ 𝒮}) = ∪{τ(S) : S ∈ 𝒮} for any collection 𝒮 of subsets of Θ,

5. τ(∩{S : S ∈ 𝒮}) = ∩{τ(S) : S ∈ 𝒮},

6. τ(Sc) = τ(S)c,

7. τ(S) ⊆ τ(R) iff S ⊆ R.

Further, if τ1 is a refining of Θ1 into Θ2 and τ2 a refining of Θ2 into Θ3, then the composition τ1 ◦ τ2 is a refining of Θ1 into Θ3.
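The three defining conditions of a refining, its extension to subsets and the composition of refinings can be expressed directly for finite frames. In the following Python sketch (the frames and the helper names are our own toy example, not notation from the text) a refining is a map sending each element of Θ to the block of Λ it is split into.

```python
def is_refining(tau, theta, lam):
    """Check the three refining conditions: nonempty, pairwise disjoint
    images that together cover the refinement lam."""
    images = [tau[t] for t in theta]
    return (all(img for img in images)
            and all(a.isdisjoint(b) for i, a in enumerate(images)
                    for b in images[i + 1:])
            and frozenset().union(*images) == lam)

def extend(tau):
    """Extend a refining to subsets: tau(S) = union of tau(theta), theta in S."""
    return lambda S: frozenset().union(*[tau[t] for t in S]) if S else frozenset()

def compose(tau1, tau2):
    """If tau1 refines Theta1 into Theta2 and tau2 refines Theta2 into
    Theta3, their composition refines Theta1 into Theta3."""
    t2 = extend(tau2)
    return {t: t2(tau1[t]) for t in tau1}

# Toy example: a coarse yes/no frame refined into four answers, then six.
theta1 = {'yes', 'no'}
theta2 = {'y1', 'y2', 'n1', 'n2'}
theta3 = {'y1a', 'y1b', 'y2a', 'n1a', 'n2a', 'n2b'}
tau1 = {'yes': frozenset({'y1', 'y2'}), 'no': frozenset({'n1', 'n2'})}
tau2 = {'y1': frozenset({'y1a', 'y1b'}), 'y2': frozenset({'y2a'}),
        'n1': frozenset({'n1a'}), 'n2': frozenset({'n2a', 'n2b'})}

assert is_refining(tau1, theta1, frozenset(theta2))
assert is_refining(tau2, theta2, frozenset(theta3))
assert is_refining(compose(tau1, tau2), theta1, frozenset(theta3))
assert extend(tau1)({'yes', 'no'}) == frozenset(theta2)   # tau(Theta) = Lambda
```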

If Λ is a refinement of Θ and τ the corresponding refining, then we define a map v : P(Λ) → P(Θ) by

v(S) = {θ ∈ Θ : τ(θ) ∩ S ≠ ∅}. (2.5)

This is called the outer reduction of S. Following (Shafer, 1976) we define the concept of a family of compatible frames.
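For finite frames the outer reduction (2.5) is immediate to compute; the sketch below (toy frames and naming of our own) collects the coarse answers whose blocks meet S.

```python
def outer_reduction(tau, theta, S):
    """v(S) = set of theta in Theta with tau(theta) meeting S, as in (2.5)."""
    return frozenset(t for t in theta if tau[t] & S)

# A yes/no frame refined into four finer answers.
tau = {'yes': frozenset({'y1', 'y2'}), 'no': frozenset({'n1', 'n2'})}
theta = {'yes', 'no'}

assert outer_reduction(tau, theta, frozenset({'y1'})) == frozenset({'yes'})
assert outer_reduction(tau, theta, frozenset({'y2', 'n1'})) == frozenset({'yes', 'no'})
```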

Definition 1 Family of Compatible Frames: Suppose F is a non-empty collection of frames, where no pair of frames has any elements in common. Further suppose R is a non-empty collection of refinings, each element τ ∈ R being a refining of a frame Θ, and if Λ is the corresponding refinement, then both frames Θ and Λ belong to F. The pair (F, R) is called a family of compatible frames (f.c.f) provided the following requirements are satisfied:

1. Composition of Refinings: If τ1 : P(Θ1) → P(Θ2) and τ2 : P(Θ2) → P(Θ3) belong to R, then τ1 ◦ τ2 ∈ R.

2. Identity: If Θ ∈ F, then the identity map id : P(Θ) → P(Θ) belongs to R.

3. Identity of Refinings: If τ1 : P(Θ) → P(Λ) and τ2 : P(Θ) → P(Λ) are elements of R, then τ1 = τ2.

4. Identity of Coarsenings: If τ1 : P(Θ1) → P(Λ) and τ2 : P(Θ2) → P(Λ) belong to R and if for each θ2 ∈ Θ2 there exists a θ1 ∈ Θ1 and for each θ1 ∈ Θ1 there exists a θ2 ∈ Θ2 such that τ1(θ1) = τ2(θ2), then Θ1 = Θ2.

5. Existence of Minimal Common Refinement: For any finite family Θ1, ..., Θn of frames in F, there exists a common refinement Λ ∈ F such that if Λ′ ∈ F is another common refinement of Θ1, ..., Θn, then Λ′ is also a refinement of Λ.

In the system of f.c.f proposed by (Shafer, 1976) additional conditions are required. One condition is that any partition of a frame of F induces a coarsening of the frame. Another one postulates the existence of indefinite refinings. These requirements serve to form a kind of completion of an f.c.f. Further, in (Shafer, 1976) only finite frames are allowed. None of these conditions is required for our purpose. So our system defines a more general concept of an f.c.f than the one defined in (Shafer, 1976). In an f.c.f (F, R), a relation Θ ≤ Λ can be defined to hold if Λ is a refinement of Θ, hence Θ a coarsening of Λ. In fact, the system (F; ≤) is a partial order, and even a join-semilattice.¹

Theorem 6 If (F, R) is a family of compatible frames, then (F; ≤) is a join-semilattice.

Proof. Reflexivity follows from Identity (2). Assume Θ ≤ Λ and Λ ≤ Θ. Let τ1 be the refining of Θ to Λ and τ2 the refining of Λ to Θ. Then Θ is a coarsening of Λ, and τ1 ◦ τ2 = idΘ and also τ2 ◦ τ1 = idΛ by the Identity of Refinings (3). Further, Λ is a coarsening of itself and for every θ ∈ Θ we have τ1(θ) = {λ} = idΛ(λ) for some λ ∈ Λ. By the Identity of Coarsenings (4), it follows that Θ = Λ, hence ≤ is antisymmetric. Transitivity of the relation ≤ follows from Composition of Refinings (1). So the relation ≤ is a partial order on F.

Clearly, the minimal common refinement Λ of two frames Θ1 and Θ2, which exists by the requirement of the Existence of a Minimal Common Refinement (5), is an upper bound of Θ1 and Θ2. Suppose Λ′ is another common refinement of Θ1 and Θ2; then Λ′ is a refinement of Λ, hence Λ ≤ Λ′, and Λ is the least upper bound. So (F; ≤) is a join-semilattice. ⊓⊔
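The partition example discussed next gives a concrete finite illustration of this join: with partitions ordered so that P ≤ Q when every block of Q lies in a block of P, the join of two partitions is the common refinement formed by the nonempty intersections of their blocks. A sketch (our own encoding of partitions as sets of blocks):

```python
from itertools import product

def leq(P, Q):
    """P <= Q (Q finer) iff every block of Q lies inside a block of P."""
    return all(any(q <= p for p in P) for q in Q)

def join(P, Q):
    """Minimal common refinement: all nonempty intersections of blocks."""
    return frozenset(p & q for p, q in product(P, Q) if p & q)

blocks = lambda *bs: frozenset(frozenset(b) for b in bs)

# Two partitions of U = {1, 2, 3, 4}.
P = blocks({1, 2}, {3, 4})
Q = blocks({1, 3}, {2, 4})
R = join(P, Q)

assert R == blocks({1}, {2}, {3}, {4})
assert leq(P, R) and leq(Q, R)   # R refines both P and Q
assert leq(P, P)                 # the order is reflexive
```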

In the sequel we write Λ = Θ1 ∨ ... ∨ Θn for the minimal common refinement of Θ1, ..., Θn. An important instance of a family of compatible frames is related to the lattice part(U) of the partitions of a universe U. To any partition P of part(U) we associate the frame ΘP consisting of all blocks of P. We use the opposite order between partitions than usual (see for instance (Grätzer, 1978)); that is, we define P ≤ Q if any block of Q is contained in a block of P. Then ΘQ is a refinement of ΘP; the refining is defined for any block P of P by τ(P) = {Q : Q a block of Q, Q ⊆ P}. Under this order, the minimal common refinement of the frames associated with a finite family of partitions is the frame associated with the join of the partitions (in our order). Note that the join of these partitions has all nonempty intersections of the blocks of the partitions as blocks. This is an important property, which does not hold in an f.c.f in general; a corresponding property for general f.c.f will be introduced and discussed below. Of course part(U) is a lattice and not only a semilattice. But in this way, any sub-join-semilattice of part(U) also induces an f.c.f. This shows that f.c.f have been modeled according to the example of partitions, however without necessarily assuming a universe, that is, an ultimate refinement, although not excluding it. On the other hand, the lattice part(U) is not a model of a family of compatible frames in the sense of (Shafer, 1976), because the existence of an ultimate refinement is excluded in the philosophy of evidential reasoning in the sense of (Shafer, 1976). It has been shown in (Cuzzolin, 2005) that families of compatible frames in the sense of (Shafer, 1976) form a lattice.

¹If Identity of Coarsenings does not hold, the order is only a preorder. According to (Dawid, 2001) a preorder is sufficient for the theory of separoids. It may be conjectured that this could also be true for the present theory of information algebras. For simplicity's sake we refrain from developing this generalisation.

We are now going to introduce a relation of conditional independence into a family of compatible frames, following (Kohlas & Monney, 1995), and show that it is a quasi-separoid. First, we define a compatibility relation between the elements of different frames Θ1, ..., Θn of a family of compatible frames (F, R). Let Λ = Θ1 ∨ ... ∨ Θn and τi : P(Θi) → P(Λ) the corresponding refinings of Θi for i = 1, ..., n. Define

R(Θ1, ..., Θn) = {(θ1, ..., θn) : θi ∈ Θi, τ1(θ1) ∩ ... ∩ τn(θn) ≠ ∅}. (2.6)

Thus R contains the tuples of mutually compatible elements θi. The frames Θ1,..., Θn are called independent if

R(Θ1,..., Θn) = Θ1 × · · · × Θn.

This relation has been studied in (Shafer, 1976) and (Cuzzolin, 2005). Consider now an element λ of a frame Λ in F. We stress that Λ is in general not necessarily different from every Θi. We now look for tuples of elements θi ∈ Θi which are compatible among themselves and with λ,

Rλ(Θ1,..., Θn) = {(θ1, . . . , θn):(θ1, . . . , θn, λ) ∈ R(Θ1,..., Θn, Λ)}.

The collection of frames Θ1,..., Θn is called conditionally independent given Λ, if for all λ ∈ Λ,

Rλ(Θ1,..., Θn) = Rλ(Θ1) × · · · × Rλ(Θn). (2.7)

Then we write

⊥{Θ1, ..., Θn}|Λ

or Θ1⊥Θ2|Λ in the case n = 2. This means that once an answer λ in Λ is given, knowing an answer θi ∈ Θi compatible with λ does not restrict the possible answers θj ∈ Θj for i ≠ j. This relation has been studied in (Kohlas & Monney, 1995), although in a slightly different system of families of compatible frames. It has been shown there that conditions C3 and C4 of a q-separoid are fulfilled (see (Kohlas & Monney, 1995), Theorems 7.14 and 7.17). We generalize this result to our present case.
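For finite frames, the relations (2.6) and (2.7) can be computed directly. The following sketch (our own encoding; we hand the frames, their refinings into a given common refinement, and the conditioning frame directly to the check) verifies that for three binary variables the frame of X is conditionally independent of the frame of Y given the frame of Z in the full configuration frame, but not when the common refinement is cut down by the constraint X = Y.

```python
from itertools import product

def compat(frames, taus):
    """Relation (2.6): tuples whose images in the common refinement
    have a nonempty intersection."""
    return {combo for combo in product(*frames)
            if frozenset.intersection(*[taus[i][t] for i, t in enumerate(combo)])}

def cond_indep(f1, tau1, f2, tau2, fL, tauL):
    """Theta1 _|_ Theta2 | Lambda, as in (2.7): for every lam,
    R_lam(Theta1, Theta2) factors as R_lam(Theta1) x R_lam(Theta2)."""
    for lam in fL:
        r12 = {(a, b) for a, b, l in compat((f1, f2, fL), (tau1, tau2, tauL))
               if l == lam}
        r1 = {a for a, l in compat((f1, fL), (tau1, tauL)) if l == lam}
        r2 = {b for b, l in compat((f2, fL), (tau2, tauL)) if l == lam}
        if r12 != set(product(r1, r2)):
            return False
    return True

# Common refinement: all configurations (x, y, z) of three binary variables.
X = Y = Z = (0, 1)
full = frozenset(product(X, Y, Z))
tX = {x: frozenset(w for w in full if w[0] == x) for x in X}
tY = {y: frozenset(w for w in full if w[1] == y) for y in Y}
tZ = {z: frozenset(w for w in full if w[2] == z) for z in Z}
assert cond_indep(X, tX, Y, tY, Z, tZ)   # full product frame: CI holds

# Cut the common refinement down by the constraint X = Y.
link = frozenset(w for w in full if w[0] == w[1])
sX = {x: frozenset(w for w in link if w[0] == x) for x in X}
sY = {y: frozenset(w for w in link if w[1] == y) for y in Y}
sZ = {z: frozenset(w for w in link if w[2] == z) for z in Z}
assert not cond_indep(X, sX, Y, sY, Z, sZ)  # the constraint couples the frames
```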

Theorem 7 Let (F, R) be a family of compatible frames. Then (F, ≤, ⊥), where the relation of conditional independence is defined as above, satisfies properties C1 to C3 of a q-separoid.

Proof. In order to verify condition C1 of a q-separoid consider θ ∈ Θ2. Then Rθ(Θ2) = {θ} and Rθ(Θ1) = {θ1 : τ1(θ1) ∩ τ2(θ) ≠ ∅}, where τ1 and τ2 are the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2, respectively. Finally, we have

Rθ(Θ1, Θ2) = {(θ1, θ) : τ1(θ1) ∩ τ2(θ) ≠ ∅}.

Then Rθ(Θ1, Θ2) = Rθ(Θ1) × Rθ(Θ2) and therefore Θ1⊥Θ2|Θ2. Condition C2 follows from the symmetry of the definition of conditional independence. In order to prove C3 assume Θ1⊥Θ2|Λ and Θ2′ ≤ Θ2. We must prove that Θ1⊥Θ2′|Λ, that is,

Rλ(Θ1, Θ2′) = Rλ(Θ1) × Rλ(Θ2′)

for any λ ∈ Λ. Note that the relation on the left is always contained in the Cartesian product on the right. So we need only show that any pair (θ1, θ2′) with θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2′) belongs to Rλ(Θ1, Θ2′). Let now τ, τ1 and τ2 be the refinings of Λ, Θ1 and Θ2 to Θ1 ∨ Θ2 ∨ Λ, and ω the refining of Θ2′ to Θ2. Then if θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2′),

τ1(θ1) ∩ τ(λ) ≠ ∅,  τ2(ω(θ2′)) ∩ τ(λ) ≠ ∅.

We must prove that this implies

τ1(θ1) ∩ τ2(ω(θ2′)) ∩ τ(λ) ≠ ∅, (2.8)

because this means that (θ1, θ2′) ∈ Rλ(Θ1, Θ2′). There is an element η in τ2(ω(θ2′)) ∩ τ(λ). But then there must be an element θ2 ∈ ω(θ2′) such that

η ∈ τ2(θ2) ∩ τ(λ), hence we conclude that τ2(θ2) ∩ τ(λ) ≠ ∅. Then, Θ1⊥Θ2|Λ implies

τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅.

Since τ2(θ2) ⊆ τ2(ω(θ2′)) we conclude that (2.8) holds, hence Θ1⊥Θ2′|Λ and C3 is valid. □

Condition C4 seems not to be necessarily satisfied in a f.c.f. The following theorem gives a sufficient condition which guarantees C4, such that (F; ≤, ⊥) becomes a q-separoid.

Theorem 8 Assume (F, R) to be a f.c.f and assume that the following condition is satisfied: For any finite collection Θ1,..., Θn of frames in F with the minimal common refinement Λ = Θ1 ∨ ... ∨ Θn, if τi, i = 1, . . . , n, are the refinings of Θi to Λ, then for every λ ∈ Λ there exist θi ∈ Θi such that

τ1(θ1) ∩ ... ∩ τn(θn) = {λ}. (2.9)

Then the relation Θ1⊥Θ2|Λ satisfies condition C4 of a q-separoid.

Proof. In order to prove C4 assume Θ1⊥Θ2|Λ and consider the refinings τ1 of Θ1 to Θ1 ∨ Θ2 ∨ Λ, τ2 of Θ2 to Θ2 ∨ Λ, τ of Λ to Θ2 ∨ Λ and finally τ′ of Θ2 ∨ Λ to Θ1 ∨ Θ2 ∨ Λ. In order to prove that Θ1⊥Θ2 ∨ Λ|Λ we must show that for any pair of elements θ1 ∈ Rλ(Θ1), θ2′ ∈ Rλ(Θ2 ∨ Λ) and λ ∈ Λ,

τ1(θ1) ∩ τ′(θ2′) ∩ τ′(τ(λ)) ≠ ∅, (2.10)

since this means that (θ1, θ2′) belongs to Rλ(Θ1, Θ2 ∨ Λ) and thus

Rλ(Θ1, Θ2 ∨ Λ) = Rλ(Θ1) × Rλ(Θ2 ∨ Λ).

By the assumption of the theorem there are elements θ2 in Θ2 and λ′ ∈ Λ such that

τ2(θ2) ∩ τ(λ′) = {θ2′}.

The assumptions that θ1 ∈ Rλ(Θ1) and θ2′ ∈ Rλ(Θ2 ∨ Λ) imply that

τ1(θ1) ∩ τ′(τ(λ)) ≠ ∅, τ′(θ2′) ∩ τ′(τ(λ)) ≠ ∅.

Then we see that

τ′(θ2′) = τ′(τ2(θ2) ∩ τ(λ′)) = τ′(τ2(θ2)) ∩ τ′(τ(λ′)) ≠ ∅

and further

∅ ≠ τ′(θ2′) ∩ τ′(τ(λ)) = τ′(τ2(θ2)) ∩ τ′(τ(λ′)) ∩ τ′(τ(λ)).

This implies that λ′ = λ, hence θ2 ∈ Rλ(Θ2). From Θ1⊥Θ2|Λ, θ1 ∈ Rλ(Θ1) and θ2 ∈ Rλ(Θ2), we obtain that

τ1(θ1) ∩ τ′(τ2(θ2)) ∩ τ′(τ(λ)) ≠ ∅.

But then we have also

τ1(θ1) ∩ τ′(τ2(θ2) ∩ τ(λ)) ∩ τ′(τ(λ)) ≠ ∅.

If we replace here τ2(θ2) ∩ τ(λ) by θ2′, we have (2.10). □

So, with the additional condition of Theorem 8, (F; ≤, ⊥) becomes a quasi-separoid. Since part(U) is associated with the family of compatible frames constituted by the sets of the blocks of the partitions, a q-separoid is also present there, because in this case the condition of Theorem 8 is fulfilled. In this case the conditional independence relation P1⊥P2|P holds if, for blocks P1, P2 and P from partitions P1, P2 and P respectively, from Pi ∩ P ≠ ∅ for i = 1, 2 it follows that P1 ∩ P2 ∩ P ≠ ∅. This is thus a further example of a quasi-separoid. However, as the following example shows, the condition of Theorem 8 is not necessary.
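For partition-style frames the condition of Theorem 8 can be verified mechanically: each element of the minimal common refinement is a non-empty intersection of blocks, one from each partition, and these blocks single it out. A minimal Python sketch under this illustrative partition model (all names are assumptions):

```python
from itertools import product

# Partition-style frames over a finite universe: the minimal common
# refinement has the non-empty intersections of blocks as its elements, and
# the refining tau_i maps a block to the refinement blocks it contains.
# Illustrative sketch of condition (2.9), not a general f.c.f. implementation.

def common_refinement(frames):
    return [b for b in (frozenset.intersection(*combo)
                        for combo in product(*frames)) if b]

def satisfies_condition_2_9(frames):
    Lam = common_refinement(frames)
    for mu in Lam:
        # theta_i = the unique block of frame i containing mu
        thetas = [next(th for th in f if mu <= th) for f in frames]
        # tau_i(theta_i) = set of refinement blocks contained in theta_i
        images = [{m for m in Lam if m <= th} for th in thetas]
        if set.intersection(*images) != {mu}:
            return False
    return True

Theta1 = [frozenset({0, 1}), frozenset({2, 3})]
Theta2 = [frozenset({0, 2}), frozenset({1, 3})]
print(satisfies_condition_2_9([Theta1, Theta2]))  # → True
```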

Example : A Simple Example: Consider a set U, and two partitions P1 and P2 of U. Let Q = P1 ∨ P2, where here the join is the one in the lattice of partitions; the blocks of Q are the non-empty intersections of the blocks of P1 and P2. Let finally Q′ be a partition finer than Q, that is, Q ≤ Q′. Let ΘP1, ΘP2, ΘQ and ΘQ′ be the sets of blocks of the corresponding partitions and let F denote the family of these sets. If R denotes the set of the corresponding refinings, then (F, R) is a f.c.f which in addition satisfies the condition of Theorem 8. Thus, in this example, (F; ≤, ⊥) is a q-separoid. Eliminate now ΘQ from F and define F′ = {ΘP1, ΘP2, ΘQ′} and let R′ denote the corresponding set of refinings. Then (F′, R′) is still a f.c.f, but the condition of Theorem 8 is no longer satisfied. Nevertheless, (F′; ≤, ⊥) is still a q-separoid. This shows that the condition of Theorem 8 is not necessary. We now show that the quasi-separoid of a family of compatible frames is basic, provided the condition of Theorem 8 is satisfied.

Theorem 9 Let (F, R) be a family of compatible frames satisfying the condition of Theorem 8 and Θ1⊥Θ2|Λ the relation of conditional independence in it. Then if Θ and Λ are elements of F, Θ⊥Θ|Λ implies Θ ≤ Λ.

Proof. Assume Θ⊥Θ|Λ and consider an element λ of Λ. Then

Rλ(Θ, Θ) = {(θ1, θ2) : θ1, θ2 ∈ Θ, (θ1, θ2, λ) ∈ R(Θ, Θ, Λ)}.

Consider the frame Θ ∨ Λ and denote by τ, ω the refinings of Θ and Λ to their minimal common refinement Θ ∨ Λ respectively. Then (θ1, θ2, λ) ∈ R(Θ, Θ, Λ) if τ(θ1) ∩ τ(θ2) ∩ ω(λ) ≠ ∅. This implies θ1 = θ2 and we obtain

Rλ(Θ, Θ) = {(θ, θ) : θ ∈ Θ, τ(θ) ∩ ω(λ) ≠ ∅}.

Further,

Rλ(Θ) = {θ : θ ∈ Θ, (θ, λ) ∈ R(Θ, Λ)}. (2.11)

Here, (θ, λ) ∈ R(Θ, Λ) iff τ(θ) ∩ ω(λ) ≠ ∅. By the assumption Θ⊥Θ|Λ we have Rλ(Θ, Θ) = Rλ(Θ) × Rλ(Θ). This is only possible if both Rλ(Θ, Θ) and Rλ(Θ) each contain only a single element, (θ, θ) and θ respectively. So, to any element λ from Λ there is only one compatible element θ in Θ. Therefore, from τ(θ) ∩ ω(λ) ≠ ∅ it follows that τ(θ) ⊇ ω(λ). Further, due to (2.9), τ(θ) ∩ ω(λ) ≠ ∅ implies τ(θ) ∩ ω(λ) = {χ} for some element χ of the minimal common refinement Θ ∨ Λ. It follows that ω(λ) = {χ}. Identity of Coarsenings implies then Θ ∨ Λ = Λ. Thus we conclude that Θ ≤ Λ. □

As we have seen above, (part(U); ≤, ⊥) is a f.c.f satisfying the condition of Theorem 8 and as such it is basic. By Theorem 2 the relation P1⊥P2|P implies then

(P1 ∨ P) ∧ (P2 ∨ P) = P.

This is the defining identity of the relation P1⊥LP2|P discussed above in Section 2.1. However P1⊥P2|P1 ∧ P2 does not hold in general in (part(U); ≤, ⊥). So, the conditional independence relations ⊥ and ⊥L are not equivalent; the former implies the latter, but not vice versa. A particular case both of a f.c.f and a partition lattice is the multivariate model. Consider a countable family of variables Xi, i = 1, 2, ... and assume a variable Xi takes values in a domain Ωi. For any subset s of indices i = 1, 2, ... define

Ωs = ∏i∈s Ωi.

The elements of Ωs are tuples x : s → ∪i∈s Ωi such that x(i) ∈ Ωi. These Cartesian products Ωs may be considered as frames. If s ⊆ t, a refining of Ωs to Ωt is defined by τs,t(x) = {y ∈ Ωt : y(i) = x(i) for i ∈ s}. This is clearly a refining and the family F of frames Ωs together with refinings τs,t for s ⊆ t is a f.c.f. Thus we have Ωs ≤ Ωt iff s ⊆ t. Then (F; ≤) is a lattice isomorphic to the subset lattice of indices i = 1, 2, .... Further, it is evident that Ωs⊥Ωt|Ωr if and only if s ∩ t ⊆ r. So, in this case of a distributive lattice, the f.c.f conditional independence relation ⊥ is equivalent to the lattice independence relation ⊥L.
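The multivariate f.c.f. is easy to realise concretely. The sketch below builds the frames Ωs as sets of tuples and the refinings τs,t for s ⊆ t; the variable domains and all names are illustrative assumptions.

```python
from itertools import product

# Multivariate frames: Omega_s is the set of tuples over the index set s,
# a tuple being represented as a sorted tuple of (index, value) pairs.
# The refining tau_{s,t} for s <= t maps a tuple to all its extensions.
# Domains below are illustrative assumptions.

DOMAINS = {1: (0, 1), 2: (0, 1), 3: (0, 1, 2)}

def frame(s):
    idx = sorted(s)
    return [tuple(zip(idx, vals))
            for vals in product(*(DOMAINS[i] for i in idx))]

def refining(x, t):
    """tau_{s,t}(x) = { y in Omega_t : y agrees with x on s }."""
    xd = dict(x)
    return [y for y in frame(t) if all(dict(y)[i] == v for i, v in xd.items())]

print(len(frame({1, 2})))                # → 4
print(len(refining(((1, 0),), {1, 3})))  # extensions of X1 = 0 to {1, 3} → 3
```

Checking Ωs⊥Ωt|Ωr in this model then amounts to the purely combinatorial test s ∩ t ⊆ r, as stated above.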

2.3 Markov Trees, Hypertrees and Join Trees

In this section we introduce a number of conditional independence structures like Markov trees, hypertrees and join trees which play an important role in local computation. These structures are well known and frequently used in multivariate models, and there they are all equivalent. Here however, we study them in the context of quasi-separoids. And in this context they are all different. In Section 4 we discuss how different algorithms of local computation can be defined on these different structures. To start, we extend the notion of conditional independence to a family of domains in a join-semilattice (D, ≤).

Definition 2 Let (D; ≤, ⊥) be a quasi-separoid, and x1, . . . , xn and z ele- ments of D, n ≥ 2. The family {x1, . . . , xn} is called conditionally indepen- dent given z, if for all disjoint subsets J and K of the index set {1, . . . , n}

∨j∈J xj⊥ ∨k∈K xk|z. (2.12)

Then we write ⊥{x1, . . . , xn}|z.

By convention, for all x ∈ D we have ⊥{x}|z and ⊥∅|z. Note that due to condition C3 of a q-separoid, we may assume that J ∪ K = {1, . . . , n}.

Theorem 10 Assume ⊥{x1, . . . , xn}|z. Then,

1. if σ is a permutation of 1, . . . , n, then ⊥{xσ(1), . . . , xσ(n)}|z,

2. if J ⊆ {1, . . . , n}, then ⊥{xj : j ∈ J}|z,

3. if y ≤ x1, then ⊥{y, x2, . . . , xn}|z,

4. ⊥{x1 ∨ x2, x3, . . . , xn}|z,

5. ⊥{x1 ∨ z, x2, . . . , xn}|z.

Proof. Items 1.), 2.) and 4.) are immediate consequences of the definition. Item 3.) follows from C3 and 5.) from C4. □

In case (D; ≤) is a lattice, ⊥L{x1, . . . , xn}|z implies x1⊥Lx2|z, x2⊥Lx3|z, etc. which means that (x1 ∨ z) ∧ (x2 ∨ z) = z,(x2 ∨ z) ∧ (x3 ∨ z) = z, etc. and this implies

(x1 ∨ z) ∧ (x2 ∨ z) ∧ · · · ∧ (xn ∨ z) = z.

If the lattice (D; ≤) is distributive, then

(∨j∈J xj ∨ z) ∧ (∨k∈K xk ∨ z) = ∨j∈J,k∈K (xj ∧ xk) ∨ z = z,

hence xj ∧ xk ≤ z for all j ≠ k. Therefore, in this case ⊥L{x1, . . . , xn}|z holds if and only if xj⊥Lxk|z for all pairs of distinct j and k; j, k = 1, . . . , n.

Let (D; ≤, ⊥) be any q-separoid. Consider a tree T = (V, E), with nodes V and edges E ⊆ V^(2), where V^(2) is the family of two-element subsets of V. Let λ : V → D be a labeling of the nodes of T with domains. The pair (T, λ) is called a labeled tree. By ne(v) we denote the set of neighbors of a node v, that is, the set {w ∈ V : {v, w} ∈ E}. For any subset U of nodes, let

λ(U) = ∨v∈U λ(v).

When a node v is eliminated together with all edges {v, w} incident to v, a family of subtrees {Tv,w = (Vv,w, Ev,w) : w ∈ ne(v)} of T is created, where Tv,w is the subtree containing node w ∈ ne(v). This allows us to define the concept of a Markov tree.
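Computationally, the subtrees Tv,w are obtained by a traversal that avoids v. A minimal sketch with trees as dicts of adjacency sets (names and the example tree are illustrative):

```python
# Eliminating a node v from a tree: for each neighbor w of v, the subtree
# T_{v,w} consists of the nodes reachable from w without passing through v.

def subtrees_after_elimination(adj, v):
    out = {}
    for w in adj[v]:
        seen, stack = {v, w}, [w]
        while stack:
            u = stack.pop()
            for n in adj[u]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        out[w] = seen - {v}   # node set V_{v,w} of the subtree T_{v,w}
    return out

# Path 1 - 2 - 3 - 4: eliminating node 2 yields subtrees {1} and {3, 4}.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(subtrees_after_elimination(adj, 2))  # → {1: {1}, 3: {3, 4}}
```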

Definition 3 Markov Tree: Let (D; ≤, ⊥) be a quasi-separoid. A labeled tree (T, λ) with T = (V,E), λ : V → D, is called a Markov tree, if for all v ∈ V ,

⊥{λ(Vv,w) : w ∈ ne(v)}|λ(v). (2.13)

Markov trees were identified early on as important independence structures for efficient computation with belief functions, using Dempster's rule (Shafer et al., 1987a; Shenoy & Shafer, 1990b; Kohlas & Monney, 1995). In the first of these references, conditional independence and Markov trees are studied for partition lattices, whereas in the second the multivariate model and in the third families of compatible frames are used. The concept is generalised and adapted from the probabilistic framework of Markov fields. We prove two fundamental theorems, whose proofs are adapted from (Kohlas & Monney, 1995). The first theorem states that there is conditional independence between the domain of a node v and that of a subtree Vv,w given the domain of the neighbor w.
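In the multivariate case, where the domain lattice is distributive, the Markov property (2.13) reduces to a pairwise condition on the variables occurring in the branches around each node. A small Python sketch of this check; node labels are sets of variable names and all data are illustrative:

```python
# Markov property check in the multivariate case: for each node v and each
# pair of distinct neighbors w, u, the variables occurring in the branch
# T_{v,w} and in the branch T_{v,u} may only overlap inside label(v).

def branch_vars(adj, label, v, w):
    """Union of the labels of all nodes in the subtree T_{v,w}."""
    seen, stack = {v, w}, [w]
    while stack:
        u = stack.pop()
        for n in adj[u]:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return set().union(*(label[x] for x in seen - {v}))

def is_markov_tree(adj, label):
    return all(
        branch_vars(adj, label, v, w) & branch_vars(adj, label, v, u) <= label[v]
        for v in adj for w in adj[v] for u in adj[v] if w != u)

adj = {1: {2}, 2: {1, 3}, 3: {2}}
print(is_markov_tree(adj, {1: {'a', 'b'}, 2: {'b', 'c'}, 3: {'c', 'd'}}))  # → True
print(is_markov_tree(adj, {1: {'a', 'b'}, 2: {'c'}, 3: {'b', 'd'}}))       # → False
```

The second labeling fails because the branches around node 2 share the variable 'b', which does not occur in the separating label {'c'}.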

Theorem 11 If (T, λ) is a Markov tree, then, for any node v and all nodes w ∈ ne(v),

λ(v)⊥λ(Vv,w)|λ(w). (2.14)

Proof. For a node w ∈ ne(v), the Markov property (2.13) reads

⊥{λ(Vw,u) : u ∈ ne(w)}|λ(w).

Then,

λ(Vw,v)⊥ ∨u∈ne(w)−{v} λ(Vw,u)|λ(w). (2.15)

Note that

λ(Vv,w) = ∨u∈ne(w)−{v} λ(Vw,u) ∨ λ(w).

Hence from C4 we obtain

λ(Vw,v)⊥λ(Vv,w)|λ(w).

Finally, since λ(v) ≤ λ(Vw,v), we conclude (2.14) using C3. □

This theorem, as well as the next one, which states that any subtree of a Markov tree is still a Markov tree, is important for local computation schemes (see Section 4).

Theorem 12 If (T, λ) is a Markov tree, then every subtree is also a Markov tree.

Proof. Assume T′ = (V′, E′) to be a subtree of T = (V, E) and λ′ the restriction of λ to V′. Consider a node v ∈ V′ and let ne′(v) be the set of its neighbors in T′. Also consider the subtrees T′v,w = (V′v,w, E′v,w) in the subtree T′ obtained after removing the node v and the edges incident to v. Note that ne′(v) ⊆ ne(v) and V′v,w ⊆ Vv,w, so that λ′(V′v,w) ≤ λ(Vv,w) for all w ∈ ne′(v). Therefore, from items 2 and 3 of Theorem 10 we conclude that

⊥{λ′(V′v,w) : w ∈ ne′(v)}|λ′(v)

for all v ∈ V′. This shows that (T′, λ′) is a Markov tree. □

From Markov trees two important derived structures may be obtained. In a tree T, we may always select any node v and then number the nodes i : V → {1, . . . , n} if |V| = n, such that the number i of node w is smaller than the number of any node u on the path from w to v. Assume then that in a Markov tree (T, λ), the nodes are numbered in such a way and let xi = λ(vi). To simplify, we denote the nodes in the sequel by their number. Then, for all i = 1, . . . , n − 1 the set of nodes {i + 1, . . . , n}, together with all the edges from E between these nodes, determines a subtree of T. In fact, a path in T from j > i to n cannot pass through any node h ≤ i. So, the subgraph determined by the nodes {i + 1, . . . , n} is connected, hence a tree. There is exactly one node j ∈ ne(i) so that i < j. Denote this node by b(i). By Theorem 11 we have

∨nj=i+1 xj⊥xi|xb(i). (2.16)

This relation defines a hypertree according to the following definition.

Definition 4 Hypertree: Let (D; ≤, ⊥) be a quasi-separoid. An n-element subset S of D is called a hypertree, if there is a numbering of its elements S = {x1, . . . , xn} such that for all i = 1, . . . , n − 1 there is an element b(i) > i in the numbering so that

xi⊥ ∨nj=i+1 xj|xb(i). (2.17)
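A construction sequence with its function b can be explored computationally. The sketch below builds the edge set {{i, b(i)}} and checks that following i, b(i), b(b(i)), . . . always reaches n, so the resulting graph is connected and, having n − 1 edges, is a tree. The map b is given as a dict; the example is illustrative.

```python
# Build the tree determined by a hypertree construction sequence: nodes
# 1..n, and for every i < n an edge {i, b(i)} with i < b(i). Following
# i, b(i), b(b(i)), ... always reaches n, so the graph is connected; with
# n - 1 edges it is therefore a tree.

def tree_from_sequence(n, b):
    edges = {frozenset({i, b[i]}) for i in range(1, n)}
    for i in range(1, n):                 # connectivity: walk towards n
        j, hops = i, 0
        while j != n:
            j, hops = b[j], hops + 1
            assert hops <= n, "b does not lead to n: not a construction sequence"
    return edges

b = {1: 4, 2: 4, 3: 4}                    # star-shaped construction sequence
print(sorted(sorted(e) for e in tree_from_sequence(4, b)))
# → [[1, 4], [2, 4], [3, 4]]
```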

A hypergraph is usually defined as a set of subsets; in other words, a set of elements of the lattice of subsets of a set. In a generalisation of this view, we take a hypergraph to be a set of elements of any join-semilattice D. The concept of a hypertree as given in Definition 4 is then the corresponding transcription of the usual definition of a hypertree in the context of subset lattices. The condition (2.17) corresponds to the running intersection property in the usual subset lattice framework. Hypertrees in the usual sense were especially studied in relational algebra, where they were called acyclic hypergraphs and shown to have some desirable properties (Beeri et al., 1983; Maier, 1983). In particular, hypertrees are interesting with respect to computational complexity (Gottlob et al., 1999a; Gottlob et al., 1999b; Gottlob et al., 2001). These papers treat all hypertrees in the multivariate framework, whereas this issue will be taken up in Section 4 in the more general case of quasi-separoids and semilattices or lattices of domains. So, any Markov tree determines a hypertree; in fact many hypertrees, according to the numbering selected. Following (Shenoy & Shafer, 1990b) the sequence x1, . . . , xn is called a (hypertree) construction sequence. A hypertree construction sequence x1, . . . , xn defines a tree T = (V, E) with nodes V = {1, . . . , n} and edges E = {{i, b(i)} : i = 1, . . . , n − 1}. In fact T is connected: if i and j are two nodes, then the node sequences i, b(i), b(b(i)), . . . and j, b(j), b(b(j)), . . . determine paths from i and from j to n. So there is a path between the nodes i and j. Since the number of edges is one less than the number of nodes, T is a tree. However, the labeling i ↦ xi does not in general yield a Markov tree. To see this consider a construction sequence x1, x2, x3, x4 such that x1⊥x2 ∨ x3 ∨ x4|x4 and x2⊥x3 ∨ x4|x4. Then S = {x1, x2, x3, x4} is a hypertree.
And the construction sequence defines the tree T = ({1, 2, 3, 4}, {{1, 4}, {2, 4}, {3, 4}}). In order that the tree T with the labeling xi be a Markov tree, we must have ⊥{x1, x2, x3}|x4 and for this to be valid, for instance x1 ∨ x2⊥x3|x4 must hold. But this is not necessarily guaranteed by the construction sequence. However, we shall see that if (D; ≤) is a distributive lattice, then in a q-separoid (D; ≤, ⊥L) any hypertree defines in the way described a Markov tree. Let (T, λ) with T = (V, E) still be a Markov tree and consider two nodes v and u. Let w be any node on the path between v and u and v′ and u′ the neighbors of w on the path from v to w and u to w respectively. Then, from the Markov property (2.13) it follows that

λ(Vw,v′)⊥λ(Vw,u′)|λ(w) and therefore, by C3, λ(v)⊥λ(u)|λ(w). And this holds for any node w on the path between v and u. This is a defining property of another concept.

Definition 5 Join Tree: Let (D; ≤, ⊥) be a quasi-separoid and (T, λ) with

T = (V,E) a tree. If for any pair of nodes v, u ∈ V and for any node w on the path from v to u

λ(v)⊥λ(u)|λ(w), (2.18)

then (T, λ) is called a join tree.

Join trees have been considered in relational database theory (Beeri et al., 1983; Maier, 1983) and, under varying names, also in local computation theory for uncertainty calculi, in particular Bayesian networks, see for instance (Lauritzen & Spiegelhalter, 1988; Cowell et al., 1999; Shenoy & Shafer, 1990a), but exclusively in the multivariate framework. In this case, we have λ(v)⊥λ(u)|λ(w) if and only if λ(v) ∧ λ(u) ≤ λ(w). This is called the running intersection property. The present definition has been adapted from the multivariate framework. As seen above, any Markov tree is also a join tree. However, as with hypertrees, a join tree is not a Markov tree in general. Consider the tree T = ({1, 2, 3, 4}, {{1, 4}, {2, 4}, {3, 4}}) already considered above and assume that xi is a labeling of this tree, such that x1⊥x2|x4, x1⊥x3|x4 and x2⊥x3|x4. The tree T with the labeling xi is then a join tree. These pairwise conditional independence relations are however not sufficient to imply the Markov property ⊥{x1, x2, x3}|x4 for the labeled tree, except if in the q-separoid (D; ≤, ⊥L) the lattice D is distributive. In fact, if D is distributive, then the three concepts are equivalent in the q-separoid (D; ≤, ⊥L), a fact which has been known for long in the framework of multivariate models. Before we prove this result, we show that a hypertree in the quasi-separoid (D; ≤, ⊥L) is always a join tree. It is open whether this is true for any q-separoid (D; ≤, ⊥).
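In the multivariate framework the join-tree condition (2.18) is precisely the running intersection property λ(v) ∧ λ(u) ≤ λ(w), which can be tested directly. A sketch with adjacency sets and variable-set labels (all data illustrative):

```python
# Running intersection check: (T, label) is a join tree in the subset
# lattice iff for every pair of nodes v, u and every node w on the unique
# path between them, label(v) & label(u) <= label(w).

def path(adj, v, u):
    parent, stack = {v: None}, [v]
    while stack:                      # build a spanning tree rooted at v
        x = stack.pop()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                stack.append(y)
    p, x = [], u                      # backtrack from u to v
    while x is not None:
        p.append(x)
        x = parent[x]
    return p

def is_join_tree(adj, label):
    nodes = sorted(adj)
    return all(label[v] & label[u] <= label[w]
               for i, v in enumerate(nodes) for u in nodes[i + 1:]
               for w in path(adj, v, u))

adj = {1: {4}, 2: {4}, 3: {4}, 4: {1, 2, 3}}
print(is_join_tree(adj, {1: {'a', 'e'}, 2: {'b', 'e'}, 3: {'c'}, 4: {'e'}}))  # → True
print(is_join_tree(adj, {1: {'a', 'e'}, 2: {'b', 'e'}, 3: {'c'}, 4: {'d'}}))  # → False
```

The second labeling fails because nodes 1 and 2 share 'e', which is missing from the label of the node 4 separating them.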

Theorem 13 Let D be a lattice and (D; ≤, ⊥L) a quasi-separoid, and S a hypertree with construction sequence x1, . . . , xn. Then the labeled tree (T, λ) with T = (V, E), V = {1, . . . , n}, E = {{i, b(i)} : i = 1, . . . , n − 1} and λ(i) = xi is a join tree.

Proof. Consider two nodes i and j and assume i ≤ j. Then (2.17) implies

xi ∧ xj ≤ xi ∧ (∨nk=i+1 xk) = xi ∧ xb(i) ≤ xb(i). (2.19)

By iterating this argument with xi ∧ xj ≤ xb(i) ∧ xj ≤ xb(b(i)), we see that xi ∧ xj ≤ xh for any node h on the path from i to n as long as h < j.

Let i1 be the first node on this path such that j ≤ i1. Then we still have xi ∧ xj ≤ xi1. Further, in the same way we conclude that

xi ∧ xj ≤ xi1 ∧ xj ≤ xj ∧ (∨nk=j+1 xk) = xj ∧ xb(j) ≤ xb(j). (2.20)

By iterating this, we obtain xi ∧ xj ≤ xh for all h such that i1 ≤ h < j1, where j1 is the first index on the path from j to n which is greater than i1. Then we alternate this reasoning between the paths from i to n, starting with i1, and from j to n, starting with j1, until we reach the common node on both paths. Thus we conclude that xi ∧ xj ≤ xh for all nodes on the path from i to j. Therefore (T, λ) is a join tree. □

Now, we show the equivalence of the concepts of a Markov tree, a hypertree and a join tree in the context of a q-separoid (D; ≤, ⊥L), if D is a distributive lattice.

Theorem 14 Let (D; ≤, ⊥L) be a quasi-separoid, D a distributive lattice and the labeled tree (T, λ) with T = (V,E) a join tree. Then

1. The set λ(V ) is a hypertree.

2. The labeled tree (T, λ) is a Markov tree.

Proof. (1) We have to find a hypertree construction sequence. For this purpose select any node v ∈ V and let |V| = n. Then there is a numbering i : V → {1, . . . , n}, such that i(v) = n and i(u) < i(w) if node w is on the path from node u to v. Define xi(u) = λ(u). We claim that x1, . . . , xn is a hypertree construction sequence and hence λ(V) a hypertree. In order to show this, we identify the nodes by their number and define b(i) = j, if i < j and {i, j} ∈ E for i = 1, . . . , n − 1. Note that b(i) is uniquely determined, since there is only one path from i to n. Now, by distributivity

xi ∧ (∨nj=i+1 xj) = ∨nj=i+1 (xi ∧ xj).

If i < j, the path from i to j passes through node b(i), hence xi ∧ xj ≤ xb(i) for all j = i + 1, . . . , n. Therefore,

xi ∧ (∨nj=i+1 xj) ≤ xb(i).

But i + 1 ≤ b(i) ≤ n. So on the other hand,

xi ∧ (∨nj=i+1 xj) ≥ xi ∧ xb(i)

and this implies then

xi ∧ (∨nj=i+1 xj) = xi ∧ xb(i).

In a distributive, hence a modular lattice, this is equivalent to xi⊥L ∨nj=i+1 xj|xb(i). This shows that x1, . . . , xn is a hypertree construction sequence.

(2) Since D is distributive, the Markov property (2.13) holds if and only if λ(Vv,w)⊥Lλ(Vv,u)|λ(v) for all pairs of distinct neighbors w and u of v. We claim that these pairwise conditional independence relations hold in a join tree. In fact, by distributivity

(λ(Vv,w) ∨ λ(v)) ∧ (λ(Vv,u) ∨ λ(v))
= (∨w′∈Vv,w λ(w′) ∨ λ(v)) ∧ (∨u′∈Vv,u λ(u′) ∨ λ(v))
= (∨w′∈Vv,w,u′∈Vv,u (λ(w′) ∧ λ(u′))) ∨ (∨w′∈Vv,w (λ(w′) ∧ λ(v))) ∨ (∨u′∈Vv,u (λ(u′) ∧ λ(v))) ∨ λ(v)
= λ(v),

since v is on all the paths from nodes w′ in Vv,w to nodes u′ in Vv,u, so that λ(w′) ∧ λ(u′) ≤ λ(v) by the join tree property. This shows that λ(Vv,w)⊥Lλ(Vv,u)|λ(v), hence (T, λ) is a Markov tree. □

In summary, any Markov tree is a join tree and induces a hypertree, but not vice versa. Moreover, in a quasi-separoid (D; ≤, ⊥L) a hypertree induces a join tree, but again, not vice versa. If D in the q-separoid (D; ≤, ⊥L) however is a distributive lattice, the concepts of Markov, hyper- and join trees become equivalent. A join tree is then a Markov tree and induces a hypertree and vice versa. In this case (D; ≤, ⊥L) is also a strong separoid (Theorem 4). This applies in particular to multivariate models.

3 Labeled Algebras of Information

3.1 Axioms

Consider a quasi-separoid (D; ≤, ⊥). We think of the elements of D as domains, representing questions and their answers, and we now add pieces of information relating to these questions. So, let Φ be a set whose generic elements will be denoted by lower case Greek letters like φ, ψ, . . .. These elements are thought to represent pieces of information, each relating to a specific question, that is, to an element of D; pieces of information which can be combined or aggregated and from which information can be extracted relative to different questions or domains in D. These ideas will be captured by the operations of labeling, which informs about the question a piece of information relates to, combination, which represents aggregation of two or more pieces of information, and transport, which describes the extraction of the information relating to a given question from a piece of information. Formally, we consider the following operations:

1. Labeling: d : Φ → D, φ ↦ d(φ).

2. Combination: · : Φ × Φ → Φ, (φ, ψ) ↦ φ · ψ.

3. Transport: t : Φ × D → Φ, (φ, x) ↦ tx(φ).

These operations are required to satisfy the following axioms.

A0 Quasi-Separoid: (D; ≤, ⊥) is a quasi-separoid.

A1 Semigroup: (Φ; ·) is a commutative semigroup.

A2 Labeling: d(φ · ψ) = d(φ) ∨ d(ψ), d(tx(φ)) = x.

A3 Unit and Null: For all x ∈ D there are elements 0x (null) and 1x (unit) with d(0x) = d(1x) = x and such that

1. φ · 0x = 0x and φ · 1x = φ if d(φ) = x,

2. ty(φ) = 0y if and only if φ = 0d(φ),

3. φ · 1x = td(φ)∨x(φ),

4. 1x · 1y = 1x∨y.

A4 Transport: If x⊥y|z and d(φ) = x, then

ty(φ) = ty(tz(φ)). (3.1)

A5 Combination: If x⊥y|z and d(φ) = x, d(ψ) = y, then

tz(φ · ψ) = tz(φ) · tz(ψ). (3.2)

A6 Identity: If d(φ) = x, then tx(φ) = φ.

Axiom A1 implies in particular associativity of combination, such that pieces of information may be combined in any order to obtain the same result. The unit 1x represents vacuous information relative to the domain x. Combining it with any other piece of information on the same domain does not change the information. Further, vacuous information remains vacuous when transported to any other domain. The null elements 0x on the other hand destroy any information, see below, Lemma 1. They represent contradiction: if φ · ψ = 0x, then φ and ψ must be considered as contradictory pieces of information. Also, transport can neither eliminate contradiction nor introduce it. Axioms A4 and A5 are important for local computation, see Section 4. They substantiate conditional independence: if x⊥y|z, then to transport a piece of information from x to y, only the part relating to z is relevant. Or to transport the combined pieces of information on x and y to z, only the extracted information on z is relevant. Further illustrations of the meaning of conditional independence and irrelevance are given below (Lemma 2). What is characteristic and basic for this system of information is the assumption that information can be transported from one domain to another; that is, extraction of the part of the information relevant to a specific question. This seems to be an essential property of information. There are variants of this axiomatic system, relevant for local computation, which do not allow this operation, in particular systems related to probability or Bayesian networks (see Section 3.2). Another important property of information, not yet present in the axiomatic system above, is that the combination of a piece of information with itself or with a part of it gives no new information. This is sometimes added as an additional axiom.

A7 Idempotency: If d(φ) = x and y ≤ x, then φ · ty(φ) = φ.

We call a system (Φ, D; ≤, ⊥, d, ·, t) satisfying the axioms A0 to A6 a (generalised) information algebra. The term information algebra has been used formerly in a slightly less general framework, and assumes also idempotency (Kohlas, 2003a; Kohlas & Schneuwly, 2009; Kohlas & Schmid, 2014a). We shall see that these axiomatic systems are particular cases of an information algebra in the present sense. This is also the case of another axiomatic system proposed by (Shenoy & Shafer, 1990a), later called valuation algebras (Kohlas & Shenoy, 2000; Kohlas, 2003a).
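To make the operations d, · and t concrete, here is a small sketch of the multivariate subset instance: a piece of information is a pair (S, s) of a tuple set and its domain, combination is natural join, and transport extends tuples to the union domain before projecting. The asserts illustrate the Labeling axiom A2 and the Identity axiom A6 on a toy example; variable names and domains are assumptions for illustration only.

```python
from itertools import product

# A piece of information is (S, s): s a frozenset of variables, S a set of
# tuples over s, each tuple a frozenset of (variable, value) pairs.
# Combination is natural join; transport t_x extends tuples to s | x using
# the variable domains and then projects onto x. Illustrative sketch only.

DOMAINS = {'X': (0, 1), 'Y': (0, 1), 'Z': (0, 1)}

def consistent(a, b):
    da, db = dict(a), dict(b)
    return all(da[k] == db[k] for k in da.keys() & db.keys())

def combine(phi, psi):
    S, s = phi
    R, r = psi
    return ({a | b for a in S for b in R if consistent(a, b)}, s | r)

def transport(phi, x):
    S, s = phi
    new = sorted(x - s)
    ext = {a | frozenset(zip(new, vals))
           for a in S for vals in product(*(DOMAINS[v] for v in new))}
    return ({frozenset((k, v) for k, v in e if k in x) for e in ext}, frozenset(x))

phi = ({frozenset({('X', 0)})}, frozenset({'X'}))
psi = ({frozenset({('Y', 1)})}, frozenset({'Y'}))
chi = combine(phi, psi)
assert chi[1] == frozenset({'X', 'Y'})   # Labeling axiom A2: d(phi.psi) = x v y
assert transport(phi, {'X'}) == phi      # Identity axiom A6: t_x(phi) = phi
assert transport(chi, {'X'}) == phi      # projecting the combination back to X
```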

Here we list a few elementary properties of an information algebra concerning transport.

Lemma 1

1. y ≤ z implies ty(φ) = ty(tz(φ)),

2. d(φ) = x ≤ y ≤ z implies tz(φ) = tz(ty(φ)).

3. x ≤ z and d(φ) = x imply tx(tz(φ)) = φ,

4. d(φ) = x and d(ψ) = y imply tx(φ · ψ) = φ · tx(ψ),

5. d(φ) = x and d(ψ) = y imply φ · ψ = tx∨y(φ) · tx∨y(ψ),

6. x ≤ z and d(φ) = x imply tz(φ) = φ · 1z,

7. d(φ) = x implies φ · 0y = 0x∨y,

8. ty(1x) = 1y.

Proof. 1.) Assume d(φ) = x. Then x⊥z|z (C1) and y ≤ z imply x⊥y|z (C3) and by A4 ty(φ) = ty(tz(φ)). 2.) From z⊥y|y (C1) it follows that x⊥z|y (C2 and C3). Then, by axiom A4, tz(φ) = tz(ty(φ)).

3.) x⊥z|z (C1) and x ≤ z imply x⊥x|z (C3) and by A4,A6 φ = tx(φ) = tx(tz(φ)).

4.) x⊥y|x (C1 and C2) implies by A5,A6 that tx(φ · ψ) = tx(φ) · tx(ψ) = φ · tx(ψ). 5.) x⊥x ∨ y|x ∨ y (C1) and y ≤ x ∨ y imply x⊥y|x ∨ y (C3) and by A2 and A5,A6, φ · ψ = tx∨y(φ · ψ) = tx∨y(φ) · tx∨y(ψ).

6.) x⊥z|z (C1) and x ≤ z, d(φ) = x imply by A2,A3,A5 and A6 φ · 1z = tz(φ · 1z) = tz(φ) · 1z = tz(φ). 7.) We have x⊥x ∨ y|x ∨ y (C1), hence by C3 also x⊥y|x ∨ y. Using axioms A2,A3,A5 and A6 we obtain φ · 0y = tx∨y(φ · 0y) = tx∨y(φ) · tx∨y(0y) = tx∨y(φ) · 0x∨y = 0x∨y.

8.) Assume first x ≤ y. Then by A3 and 6.) above, 1y = 1x · 1y = ty(1x). Next assume y ≤ x. Then, by 3.) above and the case just proved, 1y =

ty(tx(1y)) = ty(1x). In the general case x⊥y|x ∨ y (C1, C3), hence by A4 and the two cases just proved, ty(1x) = ty(tx∨y(1x)) = ty(1x∨y) = 1y. □

We use these results in the sequel without further reference to the lemma. Here follow a few further results, illustrating the meaning of irrelevance and conditional independence.

Lemma 2 Assume x⊥y|z. Then

1. If d(ψ1) = x and d(ψ2) = y, then

tx(ψ1 · ψ2) = ψ1 · tx(tz(ψ2)),

ty(ψ1 · ψ2) = ψ2 · ty(tz(ψ1)).

2. If d(ψ1) = x, d(ψ2) = y and d(ψ3) = z, then

tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(tz(ψ2) · ψ3),

ty(ψ1 · ψ2 · ψ3) = ψ2 · ty(tz(ψ1) · ψ3).

Proof. 1.) We have tx(ψ1 · ψ2) = ψ1 · tx(ψ2) (Lemma 1). But from x⊥y|z by axiom A4, tx(ψ2) = tx(tz(ψ2)), hence tx(ψ1 · ψ2) = ψ1 · tx(tz(ψ2)). The second part follows by symmetry.

2.) Here we have tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(ψ2 · ψ3) (Lemma 1). From x⊥y|z it follows that x⊥y ∨ z|z and then by A4, tx(ψ2 · ψ3) = tx(tz(ψ2 · ψ3)). Further, by Lemma 1, tz(ψ2 · ψ3) = tz(ψ2) · ψ3. So, tx(ψ1 · ψ2 · ψ3) = ψ1 · tx(tz(ψ2) · ψ3). □

This lemma shows that if x is conditionally independent from y given z, then only the part of ψ2 belonging to z is needed to combine it with the information ψ1 on x. For instance, if ψ2 with domain y contains no information relative to z, that is tz(ψ2) = 1z, then tx(ψ1 · ψ2) = ψ1 and ψ2 has no effect on domain x. Axiom A5 can be extended to a family of conditionally independent domains.

Theorem 15 Assume ⊥{x1, . . . , xn}|z and φ = φ1 · ... · φn with d(φi) = xi for i = 1, . . . , n. Then

tz(φ) = tz(φ1) · ... · tz(φn). (3.3)

Proof. The proof is by induction. The theorem is valid for n = 2 by A5. Assume it holds for n − 1. From ⊥{x1, . . . , xn}|z it follows that ∨n−1i=1 xi⊥xn|z. From A5 we obtain then

tz(φ) = tz(φ1 · ... · φn−1) · tz(φn).

Using the induction hypothesis tz(φ1 · ... · φn−1) = tz(φ1) · ... · tz(φn−1), the theorem follows. □

Here is a lemma on idempotent information algebras, to be used later.

Lemma 3 Assume d(φ) = x and y ∈ D. If axiom A7 holds, then

φ · ty(φ) = tx∨y(φ). (3.4)

Proof. By Lemma 1, items 5.) and 1.), and axiom A7,

φ · ty(φ)

= tx∨y(φ) · tx∨y(ty(φ)) = tx∨y(φ) · ty(φ)

= tx∨y(φ) · ty(tx∨y(φ)) = tx∨y(φ).

□

Here follows an important example of a generalised information algebra.

Example : Subset Algebra on a Family of Compatible Frames: As an example for a generalised information algebra consider a family of compatible frames (F, R) with the associated conditional independence relation Θ1⊥Θ2|Λ (Section 2.2). Subsets S of a frame Θ ∈ F can be considered as pieces of information in the sense that they restrict the unknown answer in Θ to S. Let

ΦΘ = {(S, Θ) : S ∈ P(Θ)}

and

Φ = ∪Θ∈F ΦΘ.

We consider the system (Φ, F; ⊥, d, ·, t), where (F; ≤, ⊥) is the q-separoid introduced in Section 2.2. The operations d, · and t are defined as follows:

1. Labeling: d : Φ → F, d(S, Θ) = Θ.

2. Combination: · :Φ × Φ → Φ, defined for (S, Θ) and (R, Λ) by

(S, Θ) · (R, Λ) = (τ1(S) ∩ τ2(R), Θ ∨ Λ),

where τ1 and τ2 are the refinings of Θ and Λ to the minimal common refinement Θ ∨ Λ respectively.

3. Transport: t : Φ × F → Φ, defined for Λ ∈ F and (S, Θ) by

tΛ(S, Θ) = (v(τ(S)), Λ),

where τ is the refining of Θ to the common refinement Θ ∨ Λ and v the outer reduction of Θ ∨ Λ to Λ (see (2.5)).

The system (Φ, F; ⊥, d, ·, t) satisfies the axioms of a generalised information algebra, provided (F; ≤, ⊥) is a q-separoid. This was already shown in (Kohlas & Monney, 1995). Although families of compatible frames were defined there slightly differently than here, the proofs carry over. The semigroup condition (associativity of combination) is stated in Theorem 8.4, A4 is stated in Theorem 8.6 and A5 in Theorem 8.5 of (Kohlas & Monney, 1995). The other conditions are more or less evident. In addition, this system satisfies also the idempotency axiom. In fact, as mentioned in Section 2.2, the lattice part(U) of partitions of a universe U provides an example of a family of compatible frames. Subsets of blocks of partitions in part(U) form then also an information algebra. That (Φ, F; ⊥, d, ·, t) is a generalised information algebra will also follow later from the fact that it is an instance of a semiring-valued information algebra (see Section 3.3).

Example : Set Potentials and Belief Functions: The previous example may be generalised by assigning to the subsets S of a frame Θ some nonnegative numbers m(S). Such an assignment m : P(Θ) → R+ ∪ {0} is called a set potential on frame Θ. To a set potential on frame Θ we attach the label d(m) = Θ. If m1 and m2 are two set potentials on frames Θ and Λ, then a set potential m on frame Θ ∨ Λ can be defined as follows: For a subset S of frame Θ ∨ Λ, let

m(S) = Σ {m1(S1)m2(S2) : S1 ⊆ Θ, S2 ⊆ Λ, tΘ∨Λ(S1) ∩ tΘ∨Λ(S2) = S}.

Then m is called the combination of m1 and m2 and we write m = m1 · m2. Further, if m is a set potential on frame Θ and Λ any other frame, then we define a set potential tΛ(m) on frame Λ by

tΛ(m)(S) = Σ {m(T) : T ⊆ Θ, tΛ(T) = S},

for any subset S of Λ. Let ΦΘ denote the set of all set potentials on frame Θ and

Φ = ∪_{Θ∈F} ΦΘ.

The system (Φ, F; ≤, ⊥, d, ·, t) then forms a generalised information algebra. We refer to (Kohlas & Monney, 1995) and (Kohlas, 2003a) for a verification of the axioms. Set potentials may be transformed into two other set functions,

b(S) = Σ_{T⊆S} m(T), q(S) = Σ_{T⊇S} m(T).

There is a one-to-one relation between the set functions m, b and q. In fact, we have

m(S) = Σ_{T⊆S} (−1)^{|S−T|} b(T) = Σ_{T⊇S} (−1)^{|T−S|} q(T).

For a proof see (Shafer, 1976). If m(∅) = 0 and

Σ_{S⊆Θ} m(S) = 1,

then the set potential m is called a basic probability assignment (Shafer, 1976), the set function b is the associated belief function, and the set function q the commonality function. This is the formalism of Dempster-Shafer Theory, which is described in detail in (Shafer, 1976). In the next section an important axiomatic system will be presented which is less general than an information algebra, but which induces one under certain circumstances, and which has many models or instances.
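As a small computational sketch of the relation between m and b (the frame, the masses and the function names below are illustrative choices, not taken from the text), one may compute b from a basic probability assignment m and recover m by the Möbius inversion just stated:

```python
from itertools import combinations

def powerset(frame):
    """All subsets of a finite frame, as frozensets."""
    s = sorted(frame)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def belief(m, frame):
    """b(S) = sum of m(T) over all T subset of S."""
    return {S: sum(v for T, v in m.items() if T <= S) for S in powerset(frame)}

def moebius(b, frame):
    """Recover m(S) = sum over T subset of S of (-1)^{|S-T|} b(T)."""
    return {S: sum((-1) ** len(S - T) * b[T] for T in powerset(frame) if T <= S)
            for S in powerset(frame)}

frame = {'a', 'b', 'c'}
m = {S: 0.0 for S in powerset(frame)}   # a basic probability assignment:
m[frozenset({'a'})] = 0.4               # m(empty set) = 0, masses sum to 1
m[frozenset({'a', 'b'})] = 0.3
m[frozenset(frame)] = 0.3

b = belief(m, frame)
m_back = moebius(b, frame)
```

Since m is a basic probability assignment, b(Θ) = 1, and applying the inversion to b returns the original m, illustrating the one-to-one relation between the two set functions.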

3.2 Valuation Algebras

Consider the quasi-separoid (D; ≤, ⊥L), where D is a lattice. If (Φ, D; ≤, ⊥L, d, ·, t) is a generalised information algebra, then an alternative axiomatic system can be derived. The operations of labeling and combination remain the same, whereas the transport operation is replaced by the partial operation

πy(φ) = ty(φ), defined for y ≤ d(φ).

This operation is called projection. The Semigroup and Labeling axioms remain. Axiom A3 is reformulated in the following way:

1. If d(φ) = x, then φ · 0x = 0x, φ · 1x = φ,

2. If y ≤ x = d(φ), then πy(φ) = 0y if and only if φ = 0x,

3. If y ≤ x, then πy(1x) = 1y,

4. 1x · 1y = 1x∨y.

For x ≤ y ≤ z = d(φ), we obtain,

πx(φ) = tx(φ) = tx(ty(φ)) = πx(πy(φ)).

This shows that projection can be executed stepwise. Further, if d(φ) = x, d(ψ) = y, then (Lemma 1),

πx(φ · ψ) = tx(φ · ψ) = φ · tx(ψ).

But we have also x⊥Ly|x∧y. Then A4 and A3 imply tx(ψ) = tx(tx∧y(ψ)) = tx∧y(ψ) · 1x. Substituting this above yields

πx(φ · ψ) = φ · πx∧y(ψ) · 1x = φ · πx∧y(ψ).

In summary, we have a system (Φ,D; ≤, d, ·, π) with the three operations of labeling, combination and projection, satisfying the following conditions:

S0 Lattice: (D, ≤) is a lattice.

S1 Semigroup: (Φ, ·) is a commutative semigroup.

S2 Labeling: d(φ · ψ) = d(φ) ∨ d(ψ) and d(πx(φ)) = x.

S3 Unit and Null: For all x ∈ D there are elements 0x and 1x with d(0x) = d(1x) = x and such that

1. If d(φ) = x, then φ · 0x = 0x and φ · 1x = φ,

2. If y ≤ x = d(φ), then πy(φ) = 0y if and only if φ = 0x,

3. If y ≤ x, then πy(1x) = 1y,

4. 1x · 1y = 1x∨y.

S4 Projection: If x ≤ y ≤ d(φ), then

πx(φ) = πx(πy(φ)). (3.5)

S5 Combination: If d(φ) = x and d(ψ) = y, then

πx(φ · ψ) = φ · πx∧y(ψ). (3.6)
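As a computational sketch, the axioms S4 and S5 can be checked for probability potentials in the multivariate setting, where valuations are tables over finite variable domains, combination is pointwise multiplication and projection sums variables out. All names and domains below are illustrative assumptions:

```python
from itertools import product

DOMAINS = {'X': [0, 1], 'Y': [0, 1, 2], 'Z': [0, 1]}  # illustrative finite domains

def configs(vars_):
    vs = tuple(sorted(vars_))
    return [dict(zip(vs, vals)) for vals in product(*(DOMAINS[v] for v in vs))]

class Potential:
    """A valuation phi with label d(phi) = a set of variables."""
    def __init__(self, vars_, table):
        self.vars = frozenset(vars_)
        self.table = table  # frozenset of (variable, value) pairs -> float

    @classmethod
    def from_fn(cls, vars_, fn):
        return cls(vars_, {frozenset(c.items()): fn(c) for c in configs(vars_)})

    def __mul__(self, other):
        # Combination: label is the join of the labels (axiom S2),
        # values are multiplied pointwise on the joint configurations.
        def fn(c):
            a = frozenset((v, c[v]) for v in self.vars)
            b = frozenset((v, c[v]) for v in other.vars)
            return self.table[a] * other.table[b]
        return Potential.from_fn(self.vars | other.vars, fn)

    def project(self, vars_):
        # pi_x: defined only for x <= d(phi); sums out the other variables.
        assert set(vars_) <= self.vars
        out = {}
        for key, val in self.table.items():
            sub = frozenset((v, x) for v, x in key if v in vars_)
            out[sub] = out.get(sub, 0.0) + val
        return Potential(vars_, out)

phi = Potential.from_fn({'X', 'Y'}, lambda c: 0.1 + c['X'] + 2 * c['Y'])
psi = Potential.from_fn({'Y', 'Z'}, lambda c: 0.5 + c['Y'] * c['Z'])
```

With d(φ) = {X, Y} and d(ψ) = {Y, Z}, axiom S5 asserts π{X,Y}(φ · ψ) = φ · π{Y}(ψ), since {X, Y} ∧ {Y, Z} = {Y}; axiom S4 asserts that projecting to {X} directly or via {X, Y} gives the same result.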

This corresponds essentially to the axioms introduced in (Shenoy & Shafer, 1990a) for multivariate models, except that axiom S3 is missing there. There are instances of this reduced axiomatic system without neutral or null elements. But, as we shall see, this axiom, and in particular the property πy(1x) = 1y, called stability, is essential to transport information beyond projection (see Section 7) and the axiom also has some importance for local computation, see Section 4. A system (Φ, D; ≤, d, ·, π) like this has also been called a (labeled) valuation algebra (Kohlas & Shenoy, 2000; Kohlas, 2003a). A generalised information algebra with respect to the q-separoid (D; ≤, ⊥L) therefore induces a valuation algebra. The converse is also the case, as we shall see below. If the idempotency axiom

S6 Idempotency: If x ≤ d(φ), then

φ · πx(φ) = φ (3.7)

is added, then a valuation algebra is also called a (labeled) information algebra in (Kohlas, 2003a). This is then a special case of an idempotent generalised information algebra in the present sense. There are many examples or instances of valuation algebras known. Initial examples include belief and possibility functions, relational algebra (relational databases). Probability potentials, related to Bayesian networks, satisfy the axioms without S3 (Shenoy & Shafer, 1990a). We refer to (Kohlas, 2003a; Pouly et al., 2013) for many more models of valuation algebras. We show now that any valuation algebra induces a generalised information algebra. So, let (Φ, D; ≤, d, ·, π) be a valuation algebra. We consider the q-separoid (D; ≤, ⊥L) where D is a lattice, and where x⊥Ly|z iff (x ∨ z) ∧

(y ∨ z) = z, see Section 2.1. We are going to extend the projection operation π, which is a partial transport operation defined only for domains x ≤ d(φ), to a full transport operation in two steps. First, for y ≥ d(φ) we define

ey(φ) = φ · 1y.

Then ey(φ) has label y. It is thus an extension of φ to a larger domain. By the Combination Axiom S5 and by Axiom S3, πx(ey(φ)) = πx(φ · 1y) = φ · πx(1y) = φ · 1x = φ. So we see that we may recover φ from its extension. The extension ey(φ) is therefore called the vacuous extension (Shafer, 1991; Kohlas, 2003a). Then, the transport operation is defined for any y ∈ D by first vacuously extending φ from its domain x to x ∨ y and then projecting this extension back to y. So, we define

ty(φ) = πy(ex∨y(φ)), where d(φ) = x. (3.8)

Note that for y ≤ d(φ) the transport operation is projection, ty(φ) = πy(φ) and for y ≥ d(φ) it is vacuous extension, ty(φ) = ey(φ). We remark that the transport operation can equivalently also be defined by

ty(φ) = ey(πx∧y(φ)). (3.9)

For a proof of this alternative way to compute the transport operation, we refer to (Kohlas, 2003a). A useful property of the transport operation as defined above, is that if y ≤ z, then

ty(φ) = ty(tz(φ)). (3.10)

Again we refer to (Kohlas, 2003a) for the proof. We are now going to show that this transport operation in a valuation algebra satisfies conditions A3, A4, A5 and A6 of a generalised information algebra. Axioms A0, A1, A2 are inherited directly from the valuation algebra.
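In the multivariate special case the equivalence of (3.8) and (3.9) can be verified directly on tables; the following sketch (domains and helper names are illustrative assumptions) computes the transport operation both as projection after vacuous extension and as vacuous extension after projection:

```python
from itertools import product

DOM = {'X': [0, 1], 'Y': [0, 1, 2], 'Z': [0, 1]}  # illustrative domains

def tuples(vs):
    svs = tuple(sorted(vs))
    return [dict(zip(svs, t)) for t in product(*(DOM[v] for v in svs))]

def k(a, vs):
    return tuple(a[v] for v in sorted(vs))

def extend(phi, vs, target):
    """Vacuous extension e_y(phi) = phi * 1_y: values are replicated."""
    return {k(a, target): phi[k(a, vs)] for a in tuples(target)}

def project(phi, vs, target):
    """pi_y(phi): sum out the variables in vs - target."""
    out = {}
    for a in tuples(vs):
        out[k(a, target)] = out.get(k(a, target), 0.0) + phi[k(a, vs)]
    return out

def transport_38(phi, x, y):   # (3.8): t_y = pi_y after e_{x v y}
    return project(extend(phi, x, x | y), x | y, y)

def transport_39(phi, x, y):   # (3.9): t_y = e_y after pi_{x ^ y}
    return extend(project(phi, x, x & y), x & y, y)

x, y = {'X', 'Y'}, {'Y', 'Z'}
phi = {k(a, x): 1.0 + a['X'] + 10.0 * a['Y'] for a in tuples(x)}
```

Both routes sum out X and replicate the result over Z, so the two tables agree.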

Theorem 16 Let (Φ,D; ≤, d, ·, π) be a valuation algebra. Then

1. ty(φ) = 0y if and only if φ = 0d(φ),

2. d(φ) = y implies φ · 1x = tx∨y(φ).

Proof. 1.) Assume first that φ = 0x. Then, by the Combination Axiom S5 and Axiom S3, tx(tx∨y(0x)) = πx(tx∨y(0x)) = πx(0x · 1x∨y) = 0x · 1x = 0x,

hence by axiom S3, tx∨y(0x) = 0x∨y. But, again by axiom S3, ty(0x∨y) = πy(0x∨y) = 0y and using axiom S5, it follows that πy(0x∨y) = πy(0y ·1x∨y) = 0y · πy(1x∨y) = 0y · 1y = 0y.

Conversely, assume d(φ) = x and ty(φ) = 0y. Then ty(φ) = πy(tx∨y(φ)) = 0y. From axiom S3 we obtain then tx∨y(φ) = 0x∨y. Again by S3, φ = πx(tx∨y(φ)) = πx(0x∨y) = tx(0x∨y) = 0x.

2.) We have by axioms S2 and S3, φ · 1x = φ · 1x · 1x∨y = φ · 1x∨y = tx∨y(φ). ut

This shows that axiom A3 is valid. Axioms A4, A5 and A6 are asserted in the following theorem.

Theorem 17 Let (Φ,D; ≤, d, ·, π) be a valuation algebra. Then

1. x⊥Ly|z and d(φ) = x imply ty(φ) = ty(tz(φ)).

2. x⊥Ly|z and d(φ) = x, d(ψ) = y imply tz(φ · ψ) = tz(φ) · tz(ψ).

3. d(φ) = x implies tx(φ) = φ.

Proof. 1.) Assume d(φ) = x. Define ψ = tx∨z(φ), so that d(ψ) = x ∨ z. By the alternative definition of the transport operation (3.9), and x⊥Ly|z,

ty∨z(ψ) = ey∨z(π(x∨z)∧(y∨z)(ψ)) = ey∨z(πz(ψ)) = ty∨z(tz(ψ)). (3.11)

Now, by the definition of the transport operation and reversibility of vacuous extension, we have

ty(φ) = πy(ex∨y(φ)) = πy(πx∨y(ex∨y∨z(ex∨y(φ)))).

As projection, vacuous extension can also be executed stepwise, see axiom S4 and (3.10), such that ty(φ) = πy(ex∨y∨z(φ)) = πy(πy∨z(ex∨y∨z(ex∨z(φ)))) = πy(ty∨z(tx∨z(φ))).

Using (3.11), we obtain

ty(φ) = ty(ty∨z(tz(tx∨z(φ)))).

Since z ≤ y ∨ z, x ∨ z, we use (3.10) twice to conclude finally

ty(φ) = ty(tz(tx∨z(φ))) = ty(tz(φ)).

This proves the first part of the theorem. 2.) Assume d(φ) = x and d(ψ) = y. Using axiom S3, it follows that φ · ψ · 1z = φ · ψ · 1x∨y · 1z = φ · ψ · 1x∨y∨z = ex∨y∨z(φ · ψ). So, by definition of the transport operation,

tz(φ · ψ) = πz(φ · ψ · 1z) = πz((φ · 1z) · (ψ · 1z)).

Now by axiom S4, since z ≤ x ∨ z ≤ x ∨ y ∨ z,

tz(φ · ψ) = πz(πx∨z((φ · 1z) · (ψ · 1z))).

As d(φ · 1z) = x ∨ z and d(ψ · 1z) = y ∨ z, we may apply S5 and obtain, using x⊥Ly|z,

tz(φ · ψ) = πz((φ · 1z) · πz(ψ · 1z)).

Again, by S5, and then the definition of the transport operation we conclude

tz(φ · ψ) = πz(φ · 1z) · πz(ψ · 1z) = tz(φ) · tz(ψ).

This proves the second item of the theorem.

3.) If d(φ) = x, then d(tx(φ)) = d(πx(ex(φ))) = d(ex(φ)) = x, by axiom S2 and the definition of vacuous extension. This verifies the third item of the theorem. ut In summary, we have shown that a valuation algebra induces a (generalised) information algebra.

Theorem 18 If (Φ, D; ≤, d, ·, π) is a valuation algebra, then (Φ, D; ≤, ⊥L, d, ·, t), where the operation t is defined by (3.8), is a generalised information algebra.

There are various alternative axiomatic systems for valuation algebras. In particular, we shall consider valuation algebras, where Axiom S3 is removed, that is no unit and null elements are assumed. Or we consider also valuation algebras where a weaker axiom about unit elements is assumed:

S3’ For all x ∈ D there are elements 1x with d(1x) = x and such that

1. If d(φ) = x, then φ · 1x = φ.

2. 1x · 1y = 1x∨y.

Further, there are valuation algebras, where the Combination Axiom S5 is satisfied in a stronger version:

S5’ If d(φ) = x, d(ψ) = y, and x ≤ z ≤ x ∨ y, then

πz(φ · ψ) = φ · πy∧z(ψ).

Especially in Sections 5.3 and 6 such valuation algebras will be considered. Here we give an example of such a valuation algebra. Note that if the lattice (D, ≤) is distributive, then S5’ follows from S5. In fact, we have, if d(φ) = x and d(ψ) = y, and x ≤ z ≤ x ∨ y,

φ · ψ = φ · ψ · 1x∨y = φ · ψ · 1z · 1x∨y = φ · ψ · 1z.

Therefore, we obtain from the Combination Axiom S5,

πz(φ · ψ) = πz((φ · 1z) · ψ) = (φ · 1z) · πy∧z(ψ).

But distributivity of the lattice (D; ≤) implies x∨(y∧z) = (x∨y)∧(x∨z) = z, so that indeed πz(φ · ψ) = φ · πy∧z(ψ). Note that, if d(φ) = x, d(ψ) = y and x ≤ z ≤ y, then S5’ implies that πz(φ · ψ) = φ · πz(ψ). This same result holds also for any valuation algebra with unit elements, as stated in the following lemma:

Lemma 4 Let (Φ,D; ≤, d, ·, π) be a valuation algebra, either satisfying ax- ioms S0 to S5 or S3’ instead of S3, or else without unit elements, but satis- fying S5’ instead of S5. Then, in all these cases, if d(φ) = x, d(ψ) = y and x ≤ z ≤ y, then

πz(φ · ψ) = φ · πy∧z(ψ).

This result becomes important later in Sections 5.3 and 6.

Example : Densities: This example is based on the multivariate model of domains (see Section 2.2). Let (D; ⊆) be the lattice of finite subsets of ω = {1, 2, . . .}. We consider here the linear vector spaces R^s of real valued tuples x : s → R, where s is a finite subset of ω. On a space R^s we consider nonnegative functions f : R^s → R+ ∪ {0}, whose integrals

∫_{−∞}^{+∞} f(x) dx (3.12)

exist and are finite. To simplify, we consider continuous functions and Riemann integrals; it would also be possible to consider measurable functions and Lebesgue integrals (Kohlas, 2003a). Such functions are called densities on R^s. We define the operations of a valuation algebra for densities as follows:

1. Labeling: d(f) = s if f is a density on R^s.

2. Combination: For densities f and g with d(f) = s and d(g) = t and x ∈ R^{s∪t},

(f · g)(x) = f(xs) × g(xt),

where xs and xt denote the tuple of components of x in s and t respectively.

3. Projection: For a density f with d(f) = s, t ⊆ s, and xt ∈ R^t,

(πt(f))(xt) = ∫_{−∞}^{+∞} f(x) dx_{s−t}.
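As a numerical sketch of these operations (the exponential densities, the truncation of the integration range and the Riemann-sum approximation are all illustrative choices), combining a density f with a density g on a disjoint domain and projecting back onto the domain of f approximately returns f, since g integrates to one:

```python
import math

STEP, UPPER = 0.01, 10.0                      # ad hoc truncation and step size
GRID = [i * STEP for i in range(int(UPPER / STEP))]

def f(x):
    return math.exp(-x)                       # Exp(1) density on variable x

def g(y):
    return 2.0 * math.exp(-2.0 * y)           # Exp(2) density on variable y

def combine(f1, f2):
    """(f . g)(x, y) = f(x) * g(y), the labels {x} and {y} being disjoint."""
    return lambda x, y: f1(x) * f2(y)

def project_to_x(h):
    """pi_{x}(h)(x): integrate out y, here by a left Riemann sum."""
    return lambda x: sum(h(x, y) * STEP for y in GRID)

marg = project_to_x(combine(f, g))
```

Up to the discretisation error of the Riemann sum, marg coincides with f.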

It is straightforward to verify the axioms of a valuation algebra for this system, except axioms S3 or S3'. There are no unit elements, since the function h(x) = 1 for all x ∈ R^s is not finitely integrable, hence no density. However, the strong Combination Axiom S5' is satisfied for densities. Note that the combination operation of densities seems to have no obvious sense in terms of classical probability theory. We come back to this point in Section 6.3, where a meaning will be given to this operation and thus the interest of this valuation algebra for applications clarified. Another application of this valuation algebra is presented in Section 5.3. Here follows another example of a valuation algebra.

Example : Gaussian Potentials: Gaussian densities are of particular interest in applications. A multivariate Gaussian density over a set s of variables is defined by

f(x) = (2π)−n/2 (det Σ)−1/2 e^{−(1/2)(x−µ)′ Σ−1 (x−µ)}.

Here µ is a vector in R^s (see previous example), and Σ is a symmetric positive definite matrix. The vector µ is the expected value vector of the density and Σ the variance-covariance matrix. The matrix K = Σ−1 is called the concentration matrix of the density. It is also symmetric and positive definite. A Gaussian density may be represented or determined by the pair (µ, K). Each such pair with µ an s-vector and K a symmetric positive definite s × s matrix determines a Gaussian density. Gaussian densities belong to the valuation algebra of densities defined in the previous example. In fact, they form a subalgebra of the algebra of densities. Labeling, combination and projection can however now be expressed in terms of the pairs (µ, K). For this purpose, if t ⊇ s and µ is an s-vector, K an s × s matrix, let µ↑t and K↑t be the vector or matrix obtained by adding to µ and K 0-entries for all indices in t − s. Further, if t ⊆ s, then let µt and Kt,t be the subvector or submatrix of µ and K respectively with components in t. We then define the following operations on pairs (µ, K):

1. Labeling: d(µ, K) = s if µ is an s-vector and K an s × s matrix.

2. Combination: For pairs (µ1, K1) and (µ2, K2) with d(µ1, K1) = s and d(µ2, K2) = t,

(µ1, K1) · (µ2, K2) = (µ, K)

with

K = K1↑s∪t + K2↑s∪t

and

µ = K−1 (K1↑s∪t µ1↑s∪t + K2↑s∪t µ2↑s∪t).

3. Projection: For a pair (µ, K) with d(µ, K) = s, t ⊆ s,

πt(µ, K) = (µt, ((K−1)t,t)−1).

This is justified by the fact that the combination of two Gaussian densities results again in a Gaussian density, and so does the projection of a Gaussian density. We refer to (Kohlas, 2003a) for more details. Again, the algebra of Gaussian potentials has no unit elements, but satisfies the strong Combination Axiom S5'. As for densities, a probabilistic interpretation of these operations, especially combination, will be given in Section 6.3. For an application of this valuation algebra to linear systems with Gaussian disturbances we refer to (Pouly & Kohlas, 2011).
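The combination and projection formulas for pairs (µ, K) can be exercised on small instances. The following sketch uses hand-rolled matrix helpers (all names and test values are illustrative assumptions); combining two Gaussian potentials on the same one-dimensional domain yields the familiar precision-weighted mean:

```python
def inv(M):
    """Matrix inverse by Gauss-Jordan elimination (adequate for a sketch)."""
    n = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def embed(mu, K, vars_, all_vars):
    """The up-arrow operation: zero-pad mu and K from vars_ to all_vars."""
    idx = {v: i for i, v in enumerate(all_vars)}
    n = len(all_vars)
    mu_up = [0.0] * n
    K_up = [[0.0] * n for _ in range(n)]
    for i, v in enumerate(vars_):
        mu_up[idx[v]] = mu[i]
        for j, w in enumerate(vars_):
            K_up[idx[v]][idx[w]] = K[i][j]
    return mu_up, K_up

def combine(p1, p2):
    """K = K1^ + K2^, mu = K^{-1}(K1^ mu1^ + K2^ mu2^)."""
    (v1, mu1, K1), (v2, mu2, K2) = p1, p2
    vs = tuple(sorted(set(v1) | set(v2)))
    mu1u, K1u = embed(mu1, K1, v1, vs)
    mu2u, K2u = embed(mu2, K2, v2, vs)
    K = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(K1u, K2u)]
    rhs = [a + b for a, b in zip(matvec(K1u, mu1u), matvec(K2u, mu2u))]
    return vs, matvec(inv(K), rhs), K

def project(p, t):
    """pi_t(mu, K) = (mu_t, ((K^{-1})_{t,t})^{-1})."""
    vs, mu, K = p
    keep = [i for i, v in enumerate(vs) if v in t]
    S = inv(K)                       # the covariance matrix
    S_tt = [[S[i][j] for j in keep] for i in keep]
    return tuple(vs[i] for i in keep), [mu[i] for i in keep], inv(S_tt)
```

For instance, combining (µ1, K1) = (1, 2) and (µ2, K2) = (3, 4) on the same variable gives K = 6 and µ = (2 · 1 + 4 · 3)/6 = 7/3.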

Most of the studies about local computation have been made with respect to valuation algebras in the multivariate framework, as discussed for instance in (Lauritzen & Spiegelhalter, 1988; Shenoy & Shafer, 1990a; Kohlas, 2003a). We shall show in Section 4 how local computation can be defined in the present more general framework of generalised information algebras.

3.3 Semiring Valuations

In this section, we introduce a large class of information and valuation algebras based on semirings. This is an extension of the class of algebras described in (Kohlas & Wilson, 2008) in the frame of multivariate models. The elements of the algebra, the pieces of information, will be mappings from a frame of a family of compatible frames to a semiring. Many well-known instances of valuation or information algebras are of this kind. To start we define semirings.

Definition 6 Semirings: Let A be any set with two binary operations + : A × A → A and × : A × A → A. Then the signature (A; +, ×) is a commutative semiring, if

1. (A; +) and (A; ×) are commutative semigroups,

2. × distributes over +, that is a × (b + c) = (a × b) + (a × c) for all a, b, c ∈ A.

In general for a semiring (A; +, ×) multiplication (×) is not necessarily assumed to be commutative; but we shall here only consider commutative semirings. So, in the sequel when we speak of semirings, we always assume them to be commutative. The concept of a semiring is a weakening of the concept of a ring. So rings and fields are all also semirings. Further examples will be given below. If there exists an element 0 ∈ A such that a + 0 = a and a × 0 = 0, then 0 is called a null element and the semiring (A; +, ×, 0) a semiring with null element. A null element is always unique and if A has no null element it can always be adjoined. If there exists an element 1 ∈ A such that 1 × a = a × 1 = a, then 1 is called a unit element and the semiring (A; +, ×, 1) a semiring with unit element. If (A; +, 0) is a group, then (A; +, ×, 0, 1) is a ring and if further (A; ×, 1) is a group too, then (A; +, ×, 0, 1) is a field. If (A; +, ×, 0) is a semiring with null element and a + b = 0 always implies a = b = 0, then A is called positive. If addition is idempotent in a semiring (A; +, ×), that is a + a = a for all a ∈ A, then a unit element can be adjoined (see (Kohlas & Wilson, 2008)). If (A; +, ×, 1) has a unit element such that 1 + 1 = 1, then a + a = a × (1 + 1) = a and addition is idempotent. In particular, if (A; +, ×, 0, 1) is a semiring with null and unit elements such that a + 1 = 1 for all a, then addition is idempotent. In this case A is called a c-semiring (constraint semiring). Finally, if A is a c-semiring and multiplication is also idempotent, then (A; +, ×, 0, 1) is a bounded distributive lattice with addition as join and multiplication as meet. Here follow a few examples of important semirings.

Example : Arithmetic Semirings: Take for A the set of nonnegative real numbers R+ ∪ {0}, with + and × designating the usual numerical addition and multiplication. Then (R+ ∪ {0}; +, ×, 0, 1) is a semiring with the numbers zero and one as null and unit elements. The semiring is positive too. This semiring serves to define the valuation algebra of probability potentials (see below). If we take only the positive numbers R+, we still have a semiring, and this is also the case if we take all real numbers R. In the latter case the semiring is a field and the last two semirings are no longer positive. Instead of real numbers we may also take rational numbers or integers to obtain the respective semirings with ordinary addition and multiplication. For example the nonnegative integers N ∪ {0} yield a positive semiring.

Example : Boolean Semiring: Here consider A = {0, 1} and define the operation + as a + b = max{a, b} and × as a × b = min{a, b}. This is a semiring with 0 as null and 1 as unit element. In addition 0 + 1 = 1 + 1 = 1, so it is a c-semiring. It is used to describe the valuation algebra of constraint systems and relational algebra (Kohlas, 2003a; Kohlas & Wilson, 2008).

Example : Bottleneck Algebra: In this instance A is given by the real numbers augmented by +∞ and −∞. Addition is defined by the max-operator, multiplication by min. Then −∞ is the null element and +∞ the unit element. This is a c-semiring and in fact a distributive lattice. It is called the bottleneck algebra (Cechlarova & Plavka, n.d.).

Example : (max/min,+)-Semirings: In this example A is the set of all nonnegative integers including +∞, N ∪ {0, ∞}. Addition is defined as the min-operation and semiring multiplication is arithmetical addition with the convention that a + ∞ = ∞. Both operations are commutative and associative. Multiplication distributes over addition, a + min{b, c} = min{a + b, a + c}. The min-operation is idempotent, ∞ is the null element, the integer 0 is the unit element and we have min{a, 0} = 0. This shows that we have here again a c-semiring. It is called the tropical semiring. This semiring has numerous applications: It has been used to define a dynamic theory of graded belief states based on ordinal numbers (Spohn, 1988). It arises in the context of dynamic programming applied to minimizing a sum of functions (Shafer & Shenoy, 1988; Kohlas, 2003a; Pouly & Kohlas, 2011). Further it applies to weighted and partially satisfied constraints (Bistarelli & U. Montanari, 1999). Instead of min we may also take max as addition and instead of integers we may take the reals or nonnegative reals with +∞ or −∞ adjoined. There are many applications of these (min, +) or (max, +) semirings in networks, graph theory, queueing systems and discrete event systems (Kolokoltsov & Maslov, 1997). Finally, these semirings can also be used for computing the most probable assignment in a Bayesian network (Pearl, 1988).

Example : t-Norms: Triangular norms (t-norms) were introduced for probabilistic metric spaces (Menger, 1942) and they are in particular important in fuzzy set theory and possibility theory. These norms are binary operations T on the unit interval [0, 1] which are commutative and associative, have the number 0 as null element and 1 as unit element, and are nondecreasing in both arguments, that is

1. For all a, b, c ∈ [0, 1] we have T(a, b) = T(b, a) and T(a, T(b, c)) = T(T(a, b), c).

2. a ≤ a′ and b ≤ b′ imply that T(a, b) ≤ T(a′, b′).

3. For all a ∈ [0, 1] we have T(a, 0) = T(0, a) = 0 and T(a, 1) = T(1, a) = a.

If we define multiplication × on the unit interval by a t-norm and addition + as max, then both operations are commutative and associative. Distributivity of multiplication over addition is verified as follows: Since max{b, c} equals b or c, T(a, max{b, c}) equals T(a, b) or T(a, c), hence T(a, max{b, c}) ≤ max{T(a, b), T(a, c)}. But monotonicity of t-norms implies also that T(a, max{b, c}) ≥ T(a, b), T(a, c), hence T(a, max{b, c}) ≥ max{T(a, b), T(a, c)}, and therefore

T(a, max{b, c}) = max{T(a, b), T(a, c)}.

Therefore, we see that a × (b + c) = (a × b) + (a × c). So, ([0, 1]; max, T, 0, 1) is a semiring with null element 0 and unit element 1. Addition is idempotent and we have a + 1 = 1 for all a ∈ [0, 1], so A is a c-semiring. The following is a list of typical t-norms:

1. Minimum or Gödel's t-norm: T(a, b) = min{a, b}.

2. Product t-norm: T(a, b) = a · b.

3. Łukasiewicz t-norm: T(a, b) = max{a + b − 1, 0}.

4. Drastic product: T (a, 1) = T (1, a) = a, and T (a, b) = 0 in all other cases.

The first t-norm is idempotent. So the c-semiring induced by this t-norm is a distributive lattice. This is not the case for the other t-norms. Later we shall see that the t-norms also differ in other important aspects (Section 5.5). We remark that distributivity depends only on monotonicity, but not on 1 being the unit element. It is therefore possible to require that any other element of the unit interval is the unit element and we still have a semiring, albeit no longer necessarily a c-semiring. Also, instead of max we may take any other commutative, associative and nondecreasing binary operation on the unit interval. Then we obtain a uninorm (Yager & Rybalov, 1996). If the unit element is the number 0, then the uninorm is called a t-conorm. Then the semiring has the number 1 as its null element. We refer to (Baets, 1996; Klement & Pap, 2000) for more information on uninorms and t-norms.
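The semiring laws claimed in the last two examples can be spot-checked numerically. The sketch below (sample grids and the small weighted graph are illustrative choices) verifies distributivity for the tropical semiring and for the four t-norms, and uses the tropical matrix product, a classical application in networks, to compute shortest path lengths:

```python
import math

# Tropical semiring: addition = min, multiplication = +, null element = inf.
INF = math.inf

def tropical_matmul(A, B):
    """Matrix 'product' over (min, +); powers of an adjacency matrix
    yield shortest path lengths."""
    n, p = len(A), len(B[0])
    return [[min(A[i][k] + B[k][j] for k in range(len(B)))
             for j in range(p)] for i in range(n)]

# The four typical t-norms as semiring multiplications, with max as addition.
def t_min(a, b): return min(a, b)
def t_prod(a, b): return a * b
def t_luk(a, b): return max(a + b - 1.0, 0.0)      # Lukasiewicz
def t_drastic(a, b): return b if a == 1.0 else (a if b == 1.0 else 0.0)

def check_distributivity(T, grid):
    """T(a, max{b, c}) = max{T(a, b), T(a, c)} on all sampled triples."""
    return all(T(a, max(b, c)) == max(T(a, b), T(a, c))
               for a in grid for b in grid for c in grid)

grid = [i / 10.0 for i in range(11)]
ok = all(check_distributivity(T, grid)
         for T in (t_min, t_prod, t_luk, t_drastic))

# Tropical distributivity a + min{b, c} = min{a + b, a + c} on sample integers.
trop_ok = all(a + min(b, c) == min(a + b, a + c)
              for a in range(5) for b in range(5) for c in range(5))
```

For the weighted graph with arcs 0→1 (cost 1), 1→2 (cost 2) and 0→2 (cost 10), squaring the adjacency matrix over (min, +) yields the shortest path cost 3 from node 0 to node 2.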

Example : Distributive Lattices: A c-semiring with idempotent multiplication is a distributive lattice, as we have seen. Conversely, every distributive lattice is a semiring with joins for addition and meets for multiplication. Both operations are idempotent. If the lattice has a bottom element ⊥, then it is the null element of the semiring. If it has a top element ⊤, then this is the unit element. In this case the semiring is a c-semiring. This generalizes the Boolean semiring above. Another example of a distributive lattice is the Bottleneck Algebra. Distributive lattices can be far more general and represent for instance qualitative degrees of membership of elements to fuzzy sets. In particular, Boolean algebras are distributive lattices. Elements of a Boolean lattice can describe assumptions to be satisfied for membership to certain sets, see (Kohlas & Wilson, 2008).

Example : Multidimensional Semirings: Let (A; +, ×) be a semiring. Then we can define in A^n the operations of addition and multiplication componentwise as follows:

(a1, . . . , an) + (b1, . . . , bn) = (a1 + b1, . . . , an + bn)

(a1, . . . , an) × (b1, . . . , bn) = (a1 × b1, . . . , an × bn)

Clearly, commutativity, associativity and distributivity are inherited from A and therefore A^n is also a semiring. If addition is idempotent in A, then so it is in A^n, and the same is true for multiplication. If A has a null element 0, then (0, . . . , 0) is the null element in A^n and if 1 is the unit element in A, then (1, . . . , 1) is the unit element in A^n. Thus, if A is a c-semiring, then so is A^n.

Now we introduce semiring valuations. As a preparation consider finite, disjoint index sets Ij for j = 1, . . . , n and let I = I1 ∪ . . . ∪ In. Then, due to commutativity and associativity of addition,

Σ_{j=1}^{n} Σ_{i∈Ij} ai = Σ_{i∈I} ai.

This will be used in the sequel without further reference. Let now (F, R) be a family of compatible frames (f.c.f), whose frames Θ ∈ F are all finite. Let further the relation Θ1⊥Θ2|Λ be the associated conditional independence relation and assume that it defines a q-separoid (see Sections 2.1 and 2.2). We call the elements θ of a frame Θ its atoms. If Λ is a coarsening of Θ and τ : Λ → Θ the corresponding refining, then there is exactly one atom λ in Λ compatible with θ, namely the atom λ such that θ ∈ τ(λ). In other words, we have v({θ}) = {λ}. We define now

tΛ(θ) = v({θ}). (3.13)

This is nothing else than the restriction of the transport operation of the subset algebra on the f.c.f (F, R) to one-element subsets of the frames (see Section 3.1). It is called the projection of the atom θ to the coarser frame Λ. Suppose next, that Λ is a refinement of Θ with the refining τ : Θ → Λ. Now, the atoms in τ(θ) are all compatible with θ and we define

tΛ(θ) = τ(θ). (3.14)

In the general case of two arbitrary frames Θ and Λ, the atoms λ in Λ are compatible with an atom θ in the frame Θ, if λ ∈ Rθ(Λ) = {λ : τ(θ) ∩ µ(λ) ≠ ∅}, where τ and µ are the refinings of Θ and Λ to Θ ∨ Λ respectively (see Section 2.2). Therefore, we define in the general case

tΛ(θ) = v(τ(θ)) = Rθ(Λ), (3.15)

where v is the outer reduction associated with the refining µ. Again, this is the restriction of the general transport operation of subsets of frames to one-element sets. Since the subset algebra of a f.c.f is a generalised information algebra, it satisfies the transport axiom A4 (Section 3.1). We restate this axiom as an important result for the transport of atoms:

Theorem 19 Assume θ ∈ Θ and Θ⊥Λ1|Λ2. Then

tΛ1 (θ) = tΛ1 (tΛ2 (θ)). (3.16)

These transport operations become important below. Since a frame can be considered as representing possible answers to some question, atoms being precise answers, these transport operations of atoms determine answers in frames Λ compatible with a precise answer θ in a frame Θ. So transport of atoms has an important information-theoretic meaning. For a semiring (A; +, ×) we now define A-valuations on a frame Θ to be mappings

φ : Θ → A

from a frame Θ of F into the semiring A. Let ΦΘ be the set of all A-valuations on the frame Θ and define

Φ = ∪_{Θ∈F} ΦΘ,

the set of all A-valuations in the f.c.f (F, R). Such A-valuations have been considered in (Kohlas & Wilson, 2008) for the special case of multivariate models. The results obtained there can be extended to the present more general case of A-valuations in a f.c.f. First, we define the following operations for A-valuations, using the transport operations of atoms defined above:

1. Labeling: d(φ) = Θ if φ is an A-valuation on Θ.

2. Combination: If d(φ) = Θ and d(ψ) = Λ, then for ζ ∈ Θ ∨ Λ, an A-valuation φ · ψ on frame Θ ∨ Λ is defined by

φ · ψ(ζ) = φ(tΘ(ζ)) × ψ(tΛ(ζ)). (3.17)

3. Transport: If d(φ) = Θ and λ ∈ Λ, then an A-valuation tΛ(φ) on frame Λ is defined by

tΛ(φ)(λ) = Σ_{θ∈tΘ(λ)} φ(θ). (3.18)

The idea behind combination is that in the combined A-valuation φ · ψ the value of an atom ζ in the combined frame Θ ∨ Λ equals the product of the values of the compatible atoms in the frames Θ and Λ respectively. For transport of an A-valuation to a frame Λ, the value of an atom in Λ is the sum of the values of the compatible atoms in the original frame Θ. We shall see in examples below that this makes sense in all applications. How far are the properties of an information algebra satisfied by these operations of A-valuations? Clearly, the Labeling Axiom A2 of an information algebra is satisfied by the definition of combination and transport,

d(φ · ψ) = d(φ) ∨ d(ψ), d(tΛ(φ)) = Λ.
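In the multivariate special case, where frames are Cartesian products of finite variable domains and atoms are tuples, combination (3.17) and transport (3.18) take a simple form. The sketch below is parametrized by the semiring operations (all names and domains are illustrative assumptions); with the arithmetic semiring it reproduces probability potentials:

```python
from itertools import product

DOMAINS = {'X': [0, 1], 'Y': [0, 1, 2]}      # illustrative finite domains

def atoms(vars_):
    vs = tuple(sorted(vars_))
    return [dict(zip(vs, vals)) for vals in product(*(DOMAINS[v] for v in vs))]

def key(atom, vars_):
    return tuple(atom[v] for v in sorted(vars_))

def combine(phi, psi, mul):
    """(3.17): the value at an atom of the combined frame is the semiring
    product of the values of its compatible atoms in the two frames."""
    (vp, tp), (vq, tq) = phi, psi
    vs = vp | vq
    return vs, {key(a, vs): mul(tp[key(a, vp)], tq[key(a, vq)])
                for a in atoms(vs)}

def transport(phi, vl, add, zero):
    """(3.18): the value at an atom lambda is the semiring sum over the
    compatible atoms of the original frame."""
    vp, tp = phi
    common = vp & vl
    out = {}
    for lam in atoms(vl):
        acc = zero
        for theta in atoms(vp):
            if all(theta[v] == lam[v] for v in common):
                acc = add(acc, tp[key(theta, vp)])
        out[key(lam, vl)] = acc
    return vl, out

# Arithmetic semiring: probability potentials.
phi = ({'X'}, {(0,): 0.2, (1,): 0.8})
one_Y = ({'Y'}, {(0,): 1.0, (1,): 1.0, (2,): 1.0})
add = lambda a, b: a + b
mul = lambda a, b: a * b
```

Here combine(φ, 1Λ, ×) and transport(φ, Θ ∨ Λ, +, 0) produce the same table, the property φ · 1Λ = tΘ∨Λ(φ) established below for semirings with unit and null elements.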

Further, by assumption (F; ≤, ⊥) is a q-separoid, hence Axiom A0 is valid. The Identity Axiom A6 is also valid since if d(φ) = Θ, then tΘ(θ) = θ for all θ ∈ Θ. Next, we show that combination is both commutative and associative.

Theorem 20 With the definition of combination above, (Φ; ·) is a commu- tative semigroup.

Proof. Commutativity of combination follows by the definition of combination from commutativity of semiring multiplication.

To show associativity consider A-valuations φ1, φ2 and φ3 on frames Θ1,Θ2 and Θ3 respectively. Let θ be an atom in Θ1 ∨ Θ2 ∨ Θ3. Then we have

(φ1 · φ2) · φ3(θ)

= φ1 · φ2(tΘ1∨Θ2 (θ)) × φ3(tΘ3 (θ))

= (φ1(tΘ1(tΘ1∨Θ2(θ))) × φ2(tΘ2(tΘ1∨Θ2(θ)))) × φ3(tΘ3(θ)).

But from Θ1⊥Θ2|Θ1 ∨ Θ2 it follows that

tΘ1 (tΘ1∨Θ2 (θ)) = tΘ1 (θ), tΘ2 (tΘ1∨Θ2 (θ)) = tΘ2 (θ). So, we obtain

(φ1 · φ2) · φ3(θ) = (φ1(tΘ1 (θ)) × φ2(tΘ2 (θ))) × φ3(tΘ3 (θ)). We obtain in the same way

φ1 · (φ2 · φ3)(θ) = φ1(tΘ1 (θ)) × (φ2(tΘ2 (θ)) × φ3(tΘ3 (θ))). (3.19)

Since multiplication in A is associative, it follows that (φ1 · φ2) · φ3 = φ1 · (φ2 · φ3). ut

So, the Semigroup Axiom A1 is satisfied. It is evident that each (ΦΘ; ·) is a subsemigroup of (Φ; ·).

If the semiring A has a null element 0, then the A-valuations 0Θ defined by

0Θ(θ) = 0 for all θ ∈ Θ are the null elements in the semigroups (ΦΘ; ·). Similarly, if the semiring A has a unit element, then the A-valuations 1Θ defined as

1Θ(θ) = 1 for all θ ∈ Θ are unit elements in the semigroups (ΦΘ; ·). These particular elements have the following properties:

Theorem 21 If the semiring A has a null element and a unit element, then

1. φ · 1Λ = tΘ∨Λ(φ) if d(φ) = Θ.

2. 1Θ · 1Λ = 1Θ∨Λ.

3. φ · 0Λ = 0Θ∨Λ if d(φ) = Θ.

Proof. 1.) Let ζ be an atom in Θ ∨ Λ. Then we have

φ · 1Λ(ζ) = φ(tΘ(ζ)) × 1Λ(tΛ(ζ)) = φ(tΘ(ζ)) × 1 = φ(tΘ(ζ)) = tΘ∨Λ(φ)(ζ).

2.) and 3.) Again, for ζ ∈ Θ ∨ Λ, we have

1Θ · 1Λ(ζ) = 1Θ(tΘ(ζ)) × 1Λ(tΛ(ζ)) = 1 × 1 = 1,

and

φ · 0Λ(ζ) = φ(tΘ(ζ)) × 0Λ(tΛ(ζ)) = φ(tΘ(ζ)) × 0 = 0.

This concludes the proof. ut

The following result is more profound and states that the Combination Axiom A5 is valid.

Theorem 22 Consider A-valuations φ and ψ with d(φ) = Θ1 and d(ψ) = Θ2. Assume Θ1⊥Θ2|Λ. Then,

tΛ(φ · ψ) = tΛ(φ) · tΛ(ψ).

Proof. Consider an atom λ ∈ Λ. Then we have by definition of transport and combination

tΛ(φ · ψ)(λ) = Σ_{ζ∈tΘ1∨Θ2(λ)} φ · ψ(ζ) = Σ_{ζ∈tΘ1∨Θ2(λ)} φ(tΘ1(ζ)) × ψ(tΘ2(ζ)),

tΛ(φ) · tΛ(ψ)(λ) = tΛ(φ)(λ) × tΛ(ψ)(λ) = (Σ_{θ1∈tΘ1(λ)} φ(θ1)) × (Σ_{θ2∈tΘ2(λ)} ψ(θ2)).

Now, tΘ1(λ) = Rλ(Θ1) and tΘ2(λ) = Rλ(Θ2). Further, we claim that

Rλ(Θ1, Θ2) = {(θ1, θ2) ∈ Θ1 × Θ2 : θ1 = tΘ1(ζ), θ2 = tΘ2(ζ) for some ζ ∈ tΘ1∨Θ2(λ)}. (3.20)

But Θ1⊥Θ2|Λ implies also Rλ(Θ1, Θ2) = Rλ(Θ1) × Rλ(Θ2). So, once the claim above is proved, we conclude that

tΛ(φ · ψ)(λ) = Σ_{(θ1,θ2)∈Rλ(Θ1)×Rλ(Θ2)} φ(θ1) × ψ(θ2)

= (Σ_{θ1∈Rλ(Θ1)} φ(θ1)) × (Σ_{θ2∈Rλ(Θ2)} ψ(θ2))

= tΛ(φ)(λ) × tΛ(ψ)(λ)

= tΛ(φ) · tΛ(ψ)(λ).

This shows that tΛ(φ · ψ) = tΛ(φ) · tΛ(ψ). It remains to verify the claim above. For this purpose let τ′1 and τ′2 be the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2 respectively, τ1 and τ2 the refinings of Θ1 and Θ2 to Θ1 ∨ Θ2 ∨ Λ, τ the refining of Λ to Θ1 ∨ Θ2 ∨ Λ, µ the refining of Θ1 ∨ Θ2 to Θ1 ∨ Θ2 ∨ Λ, and finally v the outer reduction associated with µ. Then by definition of Rλ(Θ1, Θ2),

Rλ(Θ1, Θ2) = {(θ1, θ2) : τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅}.

Now, τ1(θ1) ∩ τ2(θ2) ∩ τ(λ) ≠ ∅ is equivalent to

µ(τ1′(θ1)) ∩ µ(τ2′(θ2)) ∩ τ(λ) = µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) ≠ ∅.

Consider an atom ζ ∈ tΘ1∨Θ2(λ) and atoms θ1 = tΘ1(ζ), θ2 = tΘ2(ζ) such that (θ1, θ2) belongs to the set on the right hand side of (3.20). This means that ζ ∈ τ1′(θ1) ∩ τ2′(θ2) = τ1′(tΘ1(ζ)) ∩ τ2′(tΘ2(ζ)) and µ(ζ) ∩ τ(λ) ≠ ∅. From this it follows that

µ(τ1′(tΘ1(ζ)) ∩ τ2′(tΘ2(ζ))) ∩ τ(λ) ⊇ µ(ζ) ∩ τ(λ) ≠ ∅.

So, (θ1, θ2) ∈ Rλ(Θ1, Θ2) and the right hand side of (3.20) is contained in Rλ(Θ1, Θ2).

Conversely, consider a pair (θ1, θ2) ∈ Rλ(Θ1, Θ2), so that µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) ≠ ∅. This implies τ1′(θ1) ∩ τ2′(θ2) ≠ ∅. So, we may select an atom ω ∈ µ(τ1′(θ1) ∩ τ2′(θ2)) ∩ τ(λ) and then an atom ζ ∈ v(ω) ⊆ τ1′(θ1) ∩ τ2′(θ2).

But this means that θ1 = tΘ1(ζ) and θ2 = tΘ2(ζ) and ω ∈ µ(ζ) ∩ τ(λ) ≠ ∅, which implies ζ ∈ tΘ1∨Θ2(λ). So Rλ(Θ1, Θ2) is contained in the right hand side of (3.20), and this proves the claim (3.20). □

So far, we have verified that A-valuations satisfy some axioms of a generalised information algebra. However, the Transport Axiom A4 needs an additional property of the semiring. In particular, the property that tΛ(1Θ) = 1Λ (see Lemma 1) is not necessarily satisfied for an arbitrary semiring. A sufficient condition for this is that addition in the semiring (A; +, ×, 0, 1) is idempotent. Then we have from the definition of transport

tΛ(1Θ)(λ) = Σ_{θ∈tΘ(λ)} 1 = 1.

And this condition is also sufficient for the validity of the Transport Axiom A4.

Theorem 23 If addition in the semiring (A; +, ×, 0, 1) is idempotent, then for all φ with d(φ) = Θ, Θ⊥Λ1|Λ2 implies

tΛ1 (φ) = tΛ1 (tΛ2 (φ)).

Proof. For an atom λ ∈ Λ1 we have

tΛ1(tΛ2(φ))(λ) = Σ_{λ′∈tΛ2(λ)} tΛ2(φ)(λ′) = Σ_{λ′∈tΛ2(λ)} Σ_{θ∈tΘ(λ′)} φ(θ).

Now, Θ⊥Λ1|Λ2 implies tΘ(λ) = tΘ(tΛ2(λ)) (Theorem 19). Let Iλ′ = tΘ(λ′) and

I = ∪_{λ′∈tΛ2(λ)} Iλ′.

We claim that I = tΘ(λ). Assume first θ ∈ I, so that θ ∈ Iλ′ for some λ′ ∈ tΛ2(λ). But then, since tΘ(λ′) ⊆ tΘ(λ), it follows that θ ∈ tΘ(λ).

Conversely, consider an atom θ ∈ tΘ(λ) = tΘ(tΛ2(λ)). This implies that there is a λ′ ∈ tΛ2(λ) such that θ ∈ tΘ(λ′), hence θ ∈ I. So, indeed I = tΘ(λ). It follows therefore, using idempotency of addition in the semiring A, that

Σ_{λ′∈tΛ2(λ)} Σ_{θ∈tΘ(λ′)} φ(θ) = Σ_{θ∈tΘ(λ)} φ(θ) = tΛ1(φ)(λ).

This shows that tΛ1(φ) = tΛ1(tΛ2(φ)). □

In particular, if Θ ≤ Λ, then Θ⊥Θ|Λ and therefore, by this theorem, tΘ(tΛ(φ)) = tΘ(φ) = φ if d(φ) = Θ. This means that tΛ(φ) may be considered as a vacuous extension of φ, since φ can be retrieved from its extension (see Section 3.2). Now, we have nearly all axioms A1 to A6 of a generalised information algebra. The last item missing is Axiom A3 2.), which states that tΛ(φ) = 0Λ implies φ = 0Θ, if d(φ) = Θ. For this it is sufficient that the semiring A is positive. This, together with the results above, proves the following theorem:

Theorem 24 Let (A; +, ×, 0, 1) be a positive semiring with idempotent addition. Then (Φ, F; ≤, ⊥, d, ·, t), where Φ is the set of A-valuations, with Labeling, Combination and Transport as defined above, is a generalised information algebra.

We note that, even if the semiring A is not positive, all axioms except Axiom A3 2.) are satisfied, as long as addition is idempotent. If we scan the examples of semirings above, we find the following examples, which induce generalised information algebras of A-valuations.

Example : Set Algebra: Consider the Boolean semiring A = {0, 1}. It is positive and addition (max) is idempotent. Here A-valuations φ are indicator functions on the associated frame d(φ) = Θ. Such an indicator function defines a subset {θ : φ(θ) = 1} of the frame. Combination and transport of these indicator functions correspond exactly to combination and transport of the associated subsets as defined in Section 3.1. So, we retrieve with this Boolean semiring valuation algebra the subset algebra on a f.c.f. This covers, in the multivariate case, constraint systems and gives a subset of relational algebra (Kohlas, 2003a), which is useful in query processing and for constraint solving. Constraints may also be formulated by formulae of propositional or predicate logic. In this sense it is also related to inference in Boolean logic, see (Kohlas, 2003a).
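In the multivariate case, this Boolean semiring valuation algebra can be sketched directly in code. The following is an illustrative sketch, not part of the text: the frames, variable names and helper functions are assumptions. Indicators are stored as dictionaries over product frames; combination is semiring multiplication (AND) and transport is semiring addition (OR) over compatible atoms:

```python
from itertools import product

# Toy frames for two variables (assumed for illustration).
frames = {"x": [1, 2, 3], "y": ["a", "b"]}

def atoms(dom):
    """All atoms of the product frame of the variables in dom, as dicts."""
    dom = sorted(dom)
    return [dict(zip(dom, vals)) for vals in product(*(frames[v] for v in dom))]

def key(atom, dom):
    """Canonical dictionary key for the restriction of an atom to dom."""
    return tuple((v, atom[v]) for v in sorted(dom))

def indicator(dom, subset):
    """Boolean A-valuation: 1 on the atoms of the given subset, 0 elsewhere."""
    return {key(a, dom): int(tuple(a[v] for v in sorted(dom)) in subset)
            for a in atoms(dom)}

def combine(phi, d1, psi, d2):
    """Combination: semiring multiplication (AND) on the join of the domains."""
    d = set(d1) | set(d2)
    return {key(a, d): phi[key(a, d1)] & psi[key(a, d2)] for a in atoms(d)}, d

def transport(phi, d1, d2):
    """Transport to d2: semiring addition (OR) over all compatible atoms."""
    out = {key(a, d2): 0 for a in atoms(d2)}
    for a in atoms(set(d1) | set(d2)):
        out[key(a, d2)] |= phi[key(a, d1)]
    return out

# The indicator of {1, 2} for x, combined with a relation on (x, y) and
# transported to y, is the relational join followed by projection.
phi = indicator({"x"}, {(1,), (2,)})
psi = indicator({"x", "y"}, {(1, "a"), (3, "b")})
chi, dom = combine(phi, {"x"}, psi, {"x", "y"})
projected = transport(chi, dom, {"y"})
```

Here only y = "a" survives, since the tuple with x = 3 is excluded by φ; this is exactly the subset operation the text describes, expressed on indicator functions.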

Example : Distributive Lattices: A bounded distributive lattice A is a semiring with idempotent addition (join) and is positive. Since multiplication (meet) is also idempotent, the A-valuations form in this case a proper information algebra. Such algebras are discussed in Section 9. Of course, the previous example is an instance of such a valuation algebra, since a Boolean algebra is a distributive lattice. We may generalise the previous example and consider A-valuations related to any Boolean algebra A. This is related to assumption-based reasoning (Kohlas & Wilson, 2008).

Example : Fuzzy Sets, Possibilistic Constraints: If we take a t-norm for multiplication and max for addition, then addition is idempotent and the semiring is positive. In this case an A-valuation on Θ is also called a possibilistic distribution, a possibilistic constraint or a fuzzy set. Intersection of possibilistic constraints or fuzzy sets is defined by the t-norm, and addition is used to compute transport. These A-valuations form a generalised information algebra.
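A minimal sketch of this example, with min as the t-norm and toy membership degrees that are purely illustrative:

```python
# Possibilistic constraints (fuzzy sets) on a common frame; degrees assumed.
theta = ["low", "mid", "high"]
phi = {"low": 0.2, "mid": 1.0, "high": 0.5}
psi = {"low": 0.9, "mid": 0.4, "high": 0.5}

t_norm = min  # semiring multiplication; any other t-norm could be substituted

# Combination on a common frame: pointwise t-norm (fuzzy intersection).
combined = {t: t_norm(phi[t], psi[t]) for t in theta}

# Transport to the trivial coarsest frame: semiring addition (max),
# yielding the overall degree of consistency of the two constraints.
consistency = max(combined.values())
```

Transport to a genuinely coarser frame works the same way: the degree of a coarse atom is the max over the fine atoms it contains.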

Example : Optimization: Consider the (max / min, +) semiring of reals. Again, max and min are idempotent. So, the A-valuations form a generalised information algebra, although A is not positive, satisfying all axioms except A3 2.). We may always adjoin a least element ⊥ to a semilattice (F; ≤) and define

t⊥(φ) = Σ_{θ∈Θ} φ(θ)

if φ is an A-valuation on Θ. In the present particular case, we have then

t⊥(φ1 · ... · φn) = max_{θ∈Θ} (φ1(tΘ1(θ)) + ... + φn(tΘn(θ))),

where Θ = Θ1 ∨ ... ∨ Θn and Θi = d(φi). So this information algebra and its associated local computation scheme serves for optimization. Note that local computation yields the maximum value of the combination. But the scheme may be adapted to compute also optimal solutions (Shenoy, 1996; Pouly & Kohlas, 2011). This is a version of dynamic programming.

Now we change the focus on A-valuations to some extent, with the goal to obtain valuation algebras instead of generalised information algebras. We consider f.c.f.s (F, R) for which we assume that

1. (F; ≤) is a lattice,

2. the relation Θ⊥Λ|Θ ∧ Λ holds for all pairs of frames Θ, Λ ∈ F.

It is interesting and important to clarify what these restricting requirements mean. Let µ1 and µ2 be the refinings of Θ ∧ Λ to Θ and Λ respectively, and τ1 and τ2 the refinings of Θ and Λ to Θ ∨ Λ. Note that τ1(θ) ∩ τ1(µ1(ζ)) ≠ ∅ if and only if θ ∈ µ1(ζ), and similarly τ2(λ) ∩ τ2(µ2(ζ)) ≠ ∅ if and only if λ ∈ µ2(ζ), for any ζ ∈ Θ ∧ Λ. From the conditional independence condition Θ⊥Λ|Θ ∧ Λ we conclude that these conditions imply that

τ1(θ) ∩ τ2(λ) ∩ τ1(µ1(ζ)) ≠ ∅, or that τ1(θ) ∩ τ2(λ) ≠ ∅, if θ ∈ µ1(ζ) and λ ∈ µ2(ζ). This condition can be expressed in two ways: a) If tΘ∧Λ(θ) = tΘ∧Λ(λ), then there exists an atom ζ ∈ Θ ∨ Λ such that tΘ(ζ) = θ and tΛ(ζ) = λ. b) If Θ, Λ and Θ ∧ Λ are considered as partitions of Θ ∨ Λ, then if θ and λ are in the same block of Θ ∧ Λ, there is an atom ζ ∈ Θ ∨ Λ such that θ and ζ are in the same block of Θ, and λ and ζ are in the same block of Λ.

Sublattices of the partition lattice part(U) of some universe satisfying condition b) are called partition lattices of type I (Grätzer, 1978) (recall that our order is the inverse of the order usually considered between partitions). In particular, multivariate models satisfy this condition. We consider now A-valuations on such f.c.f.s. But we no longer assume that addition is idempotent, nor do we necessarily assume null or unit elements in the semiring (A; +, ×). Instead of a general transport operation t for A-valuations, we consider only projection, that is, partial transport operations πΛ = tΛ : ΦΘ → ΦΛ, defined only for frames Λ ≤ Θ. We define Labeling and Combination as before, such that we have

1. Labeling: d(φ) = Θ if φ is an A-valuation on Θ.

2. Combination: If d(φ) = Θ and d(ψ) = Λ, then an A-valuation φ · ψ is defined for ζ ∈ Θ ∨ Λ by

φ · ψ(ζ) = φ(tΘ(ζ)) × ψ(tΛ(ζ)). (3.21)

3. Projection: If d(φ) = Θ and Λ ≤ Θ, then an A-valuation πΛ(φ) is defined for λ ∈ Λ by

πΛ(φ)(λ) = Σ_{θ∈tΘ(λ)} φ(θ). (3.22)
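The three operations can be sketched in the multivariate setting, parameterized by an arbitrary semiring. The class below is an illustrative sketch under assumed names and toy frames, not part of the text; it also lets us check the Combination Axiom P4 numerically on the arithmetic semiring:

```python
from itertools import product

class Valuation:
    """A-valuation in the multivariate setting: a table over the product
    frame of its domain (a set of variable names), with semiring ops."""

    def __init__(self, dom, table, plus, times, frames):
        self.dom, self.table = frozenset(dom), table
        self.plus, self.times, self.frames = plus, times, frames

    def _atoms(self, dom):
        dom = sorted(dom)
        return [tuple(zip(dom, vals))
                for vals in product(*(self.frames[v] for v in dom))]

    @staticmethod
    def _restrict(atom, dom):
        return tuple((v, x) for v, x in atom if v in dom)

    def combine(self, other):
        """Labeling: d(phi * psi) is the join (union) of the two domains."""
        dom = self.dom | other.dom
        table = {a: self.times(self.table[self._restrict(a, self.dom)],
                               other.table[self._restrict(a, other.dom)])
                 for a in self._atoms(dom)}
        return Valuation(dom, table, self.plus, self.times, self.frames)

    def project(self, dom):
        """Projection (3.22): semiring addition over the atoms mapping down."""
        assert frozenset(dom) <= self.dom
        out = {}
        for a in self._atoms(self.dom):
            k = self._restrict(a, dom)
            out[k] = self.plus(out[k], self.table[a]) if k in out else self.table[a]
        return Valuation(dom, out, self.plus, self.times, self.frames)

# Check Axiom P4 on the arithmetic semiring with toy numbers:
frames = {"x": [0, 1], "y": [0, 1]}
plus, times = (lambda a, b: a + b), (lambda a, b: a * b)
phi = Valuation({"x"}, {(("x", 0),): 0.3, (("x", 1),): 0.7}, plus, times, frames)
psi = Valuation({"x", "y"}, {(("x", 0), ("y", 0)): 2.0, (("x", 0), ("y", 1)): 1.0,
                             (("x", 1), ("y", 0)): 0.5, (("x", 1), ("y", 1)): 4.0},
                plus, times, frames)
lhs = phi.combine(psi).project({"x"})   # pi_Theta(phi * psi)
rhs = phi.combine(psi.project({"x"}))   # phi * pi_{Theta ^ Lambda}(psi)
```

Both sides agree up to floating-point rounding, which is exactly what Theorem 25 asserts for the multivariate (lattice) case.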

We show that in this way we get a valuation algebra of A-valuations. Axiom S0 is valid by the assumption that (F; ≤) is a lattice. The Semigroup Axiom S1 holds as before, since the definition of combination has not changed; and so does the Labeling Axiom S2. The next theorem gives us the Combination Axiom S4 for A-valuations.

Theorem 25 Let (F; ≤) be a lattice such that Θ⊥Λ|Θ ∧ Λ for all frames Θ, Λ ∈ F. Then, if d(φ) = Θ and d(ψ) = Λ,

πΘ(φ · ψ) = φ · πΘ∧Λ(ψ).

Proof. The condition Θ⊥Λ|Θ ∧ Λ implies tΛ(θ) = tΛ(tΘ∧Λ(θ)) for all θ ∈ Θ (Theorem 19). Since we have also Θ⊥Λ|Θ ∨ Λ, we find tΛ(θ) = tΛ(tΘ∨Λ(θ)), hence

tΛ(tΘ∨Λ(θ)) = tΛ(tΘ∧Λ(θ)).

To simplify writing, it is convenient to define, for d(φ) = Θ and S ⊆ Θ,

φ(S) = Σ_{θ∈S} φ(θ).

Then projection can be written as πΛ(φ)(λ) = φ(tΘ(λ)). With this notation, from the definition of projection and combination, we obtain for θ ∈ Θ,

πΘ(φ · ψ)(θ)

= φ · ψ(tΘ∨Λ(θ)) = φ(θ) × ψ(tΛ(tΘ∨Λ(θ)))

= φ(θ) × ψ(tΛ(tΘ∧Λ(θ))) = φ(θ) × πΘ∧Λ(ψ)(tΘ∧Λ(θ))

= φ · πΘ∧Λ(ψ)(θ).

This shows that πΘ(φ · ψ) = φ · πΘ∧Λ(ψ). □

So, in this case all axioms of a valuation algebra are satisfied, with the exception of Axiom S3 concerning Unit and Null valuations. In summary, (Φ, F; ≤, d, ·, π) satisfies the following axioms, provided that (F; ≤) is a lattice such that Θ⊥Λ|Θ ∧ Λ for all frames Θ, Λ ∈ F:

P1 Semigroup: (Φ; ·) is a commutative semigroup.

P2 Labeling: d(φ · ψ) = d(φ) ∨ d(ψ) and d(πΛ(φ)) = Λ.

P3 Projection: If d(φ) = Θ and Λ1 ≤ Λ2 ≤ Θ, then

πΛ1 (φ) = πΛ1 (πΛ2 (φ)).

P4 Combination: If d(φ) = Θ and d(ψ) = Λ, then

πΘ(φ · ψ) = φ · πΘ∧Λ(ψ).

We may extend the concept of a valuation algebra to a system (Φ, F; ≤, d, ·, π) satisfying these axioms. That is, we do not necessarily require the existence of null or unit elements. So, we have the following theorem:

Theorem 26 Let (F; ≤) be a lattice such that Θ⊥Λ|Θ ∧ Λ holds for all frames Θ, Λ ∈ F. Then (Φ, F; ≤, d, ·, π) is a valuation algebra, satisfying axioms P1 to P4.

If the semiring A has null and unit elements, then under some additional conditions, Axiom S3 of a valuation algebra in the old sense, or parts of it, may be satisfied. If the semiring A has a null and a unit element, then Theorem 21 still holds for projections. If addition is idempotent, then πΛ(1Θ) = 1Λ; the valuation algebra is then called stable. As shown in Section 3.2, a generalised information algebra can then be derived from the valuation algebra. If the semiring A is also positive, then πΛ(φ) = 0Λ implies φ = 0Θ, if d(φ) = Θ. In these cases Axiom S3 of the original valuation algebra is valid too. Here follow a few examples of valuation algebras induced by a semiring.

Example : Probability Potentials: The arithmetic semiring (R+ ∪ {0}; +, ×) gives rise to the semiring of mappings of frames Θ to nonnegative real numbers. Note that addition is not idempotent, so there is no generalised information algebra relative to this semiring. This semiring valuation algebra is usually considered in the framework of Bayesian networks (Lauritzen & Spiegelhalter, 1988; Shenoy & Shafer, 1990a; Shafer, 1996). That is why the A-valuations are called probability potentials, especially since a nonnegative function may also be transformed into a probability distribution by normalisation. We refer to Section 6.3. Note however, that combination of probability potentials in the valuation algebra corresponds to no operation of classical probability. In Section 6.3 we present a different source and interpretation of probability potentials and their valuation algebra.

Example : Subsets, Fuzzy Sets or Possibilistic Constraints: The examples above of set algebras, fuzzy sets and possibilistic constraints based on t-norms exist also as valuation algebras.

To conclude, we see that many important information or valuation algebras are induced by a semiring. We come back to this subject in Section 5.5. We remark further that the concept of semiring valuation algebras may be extended to set-based semiring valuation algebras (Pouly & Kohlas, 2011), which cover examples of valuation algebras which are not semiring valuation algebras, like belief functions and generalisations thereof.
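A minimal numerical sketch of probability potentials, using toy numbers for a two-variable chain (the factors and values below are illustrative assumptions, not taken from the text):

```python
# Two probability potentials on a chain x -> y (toy numbers assumed).
p_x = {0: 0.6, 1: 0.4}                     # potential on the frame of x
p_y_given_x = {(0, 0): 0.9, (0, 1): 0.1,   # potential on the joint frame,
               (1, 0): 0.2, (1, 1): 0.8}   # keyed by (x, y)

# Combination: pointwise product on the joint frame (x, y).
joint = {(x, y): p_x[x] * p_y_given_x[(x, y)]
         for x in p_x for y in (0, 1)}

# Projection to y: sum out x (addition of the arithmetic semiring).
p_y = {y: sum(joint[(x, y)] for x in p_x) for y in (0, 1)}

# Normalisation turns the projected potential into a distribution.
z = sum(p_y.values())
p_y = {y: p / z for y, p in p_y.items()}
```

With the toy numbers chosen here the factors happen to be normalized already, so normalisation is a no-op; in general the potentials need not sum to one, which is the point of the remark about normalisation above.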

4 Local Computation

4.1 Computing in Markov Trees

Generalising the local computation scheme for inference in Bayesian networks (Lauritzen & Spiegelhalter, 1988), Shenoy and Shafer proposed an axiomatic scheme for general local computation, in addition to probability propagation, especially also for belief functions (Shenoy & Shafer, 1990a). This laid the basis for valuation algebras (Kohlas & Shenoy, 2000; Kohlas, 2003a) as an axiomatic foundation of local computation. The generalised information algebras proposed here extend and generalise valuation algebras as structures for local computation schemes. We note that already in (Shafer et al., 1987b) local computation of belief functions on partition lattices was described, an instance more general than valuation algebras, and in (Kohlas & Monney, 1995) local computation on families of compatible frames is discussed. The present discussion generalizes both of these approaches.

Local computation schemes in the framework of information algebras propose computational solutions to the so-called projection problem. It is assumed that information is given in pieces, which must be combined, and then the part relative to some given question is to be extracted. So, let φ1, . . . , φn be n pieces of information and φ = φ1 · ... · φn the aggregated information. Assume that the question to be examined is represented by some element x of D. Then the problem is to compute

tx(φ) = tx(φ1 · ... · φn). (4.1)

This is called the projection problem. Let xi denote the domain of φi. Then the domain of φ equals x1 ∨ ... ∨ xn according to the Labeling Axiom. We may presume that the basic operations of an information algebra, combination and transport, have a degree of complexity which grows with the size of the domain and, in many cases, combination and transport may become computationally infeasible on large domains. Therefore, the naive solution of the projection problem, which consists of sequentially combining the factors φ1, φ2 up to φn and then extracting the part relating to x by the transport operator applied to φ, may be infeasible or at least inefficient. In the case of Bayesian networks, (Lauritzen & Spiegelhalter, 1988) have shown that the projection problem may be solved under some circumstances by arranging the operations in such a way that they take place on the domains xi of the factors of the combination φ, and (Shenoy & Shafer, 1990a) have shown that this is possible more generally in the case of abstract valuation algebras. This is called local computation. We show here that local computation is also possible with generalised information algebras.

If local computation can be used to solve the projection problem, then the complexity is determined by the operations of combination or transport on the domains xi. Assume some measure of complexity c(xi) depending on the domain and let c = max{c(x1), . . . , c(xn)}. Then the complexity of local computation is n · c, hence linear in the problem size as measured by the number of factors to be combined. The complexity measure c(xi) depends much on the instance of the information or valuation algebra. It may be polynomial or exponential in some parameter measuring the size of the domains xi, depending on the instance, see (Pouly & Kohlas, 2011). But the linearity in the problem size, once c is given, makes local computation in many cases feasible, because the structure of practical problems often guarantees small domains xi, and therefore a reasonably small value for c. This represents a situation studied more generally in parameterized complexity theory (Flum & Grohe, 2006).

In the case of Bayesian networks, local computation is shown to be closely related to conditional independence structures, see for example (Cowell et al., 1999). The same is the case in relational algebra (Beeri et al., 1983; Maier, 1983). In fact, in the case of generalised information algebras, these structures can be identified as Markov trees and hypertrees. In this section we assume throughout a generalised information algebra (Φ, D; ≤, ⊥, ·, t).

Assume that the domains xi of the factors of a factorisation (4.1) of φ form a Markov tree. More precisely, let (T, λ) with T = (V, E) be a Markov tree such that |V| = n and each node v corresponds to exactly one domain xi, such that xi = λ(v). Under this assignment denote then φi by φv, such that

φ = ∏_{v∈V} φv, d(φv) = λ(v). (4.2)

We call such a factorisation a Markov tree factorisation. As usual Tv,w = (Vv,w,Ev,w) denotes the subtree obtained by removing node v and containing the neighbor w of v. As we know, this is still a Markov tree (Theorem 12). Assume further that x = λ(v) for some node v of the Markov tree, such that tλ(v)(φ) is to be computed. Such a projection problem has a local computation solution, as the following theorem shows.

Theorem 27 Let (T, λ) be a Markov tree with T = (V, E) and φ given by a Markov tree factorisation (4.2) according to this Markov tree. Then, for any node v ∈ V,

tλ(v)(φ) = φv · ∏_{w∈ne(v)} tλ(v)(tλ(w)(φv,w)), (4.3)

where

φv,w = ∏_{u∈Vv,w} φu. (4.4)

Proof. Note that d(φv,w) = λ(Vv,w). By Theorem 11, λ(v)⊥λ(Vv,w)|λ(w). Using Axiom A4, we obtain

tλ(v)(φv,w) = tλ(v)(tλ(w)(φv,w)).

Now,

φ = φv · ∏_{w∈ne(v)} φv,w.

By C1 and C2, λ(v) ⊥ ∨_{w∈ne(v)} λ(Vv,w) | λ(v). So, by Axiom A5,

tλ(v)(φ) = tλ(v)(φv) · tλ(v)(∏_{w∈ne(v)} φv,w).

Further, the Markov property ⊥{λ(Vv,w) : w ∈ ne(v)}|λ(v) implies, using Theorem 15,

tλ(v)(∏_{w∈ne(v)} φv,w) = ∏_{w∈ne(v)} tλ(v)(φv,w).

Finally, by Axiom A6 we have tλ(v)(φv) = φv, hence

tλ(v)(φ) = φv · ∏_{w∈ne(v)} tλ(v)(tλ(w)(φv,w)),

which concludes the proof. □

Formulas (4.3) and (4.4) define a recursive scheme of local computation for the solution of the projection problem. Once the subproblems tλ(w)(φv,w) in the Markov subtree Tv,w are solved for all neighbors w of v, only transports from node w to node v and combinations on node v have to be executed. These are local operations on the domain λ(v). The anchors for the recursion are the trivial one-node Markov trees {u}, where the projection problem tλ(u)(φu) = φu is trivial. So this is indeed a local computation scheme.

It is possible to represent this computation scheme as a message passing method on a tree, as already shown for valuation algebras by (Shenoy & Shafer, 1990a). For two neighboring nodes v and w define a message from w to v by

µw→v = tλ(v)(tλ(w)(φv,w)). (4.5)

Select arbitrarily a node v of the tree as a root node and direct all arcs of the tree towards this node. Note that then all nodes, except the root node, have exactly one outgoing arc. Further, there is at least one leaf node without incoming arc. Each such leaf node u has a unique neighbor w, to which it can send the message

µu→w = tλ(w)(φu).

Once a node u has received messages through all its incoming arcs, it can compute the message for the unique outgoing arc and send it to its neighbor w by

µu→w = tλ(w)(φu · ∏_{n∈ne(u)−{w}} µn→u).

Once the root node v has received all its messages through this procedure, it can compute the solution of the projection problem according to (4.3) by

tλ(v)(φ) = φv · ∏_{w∈ne(v)} µw→v.

This message passing scheme is also called a collect algorithm.

What is more, the root node can now send messages back to all its neighbors. If in the collect phase the messages have been cached, all neighbors w of v can then compute the projection tλ(w)(φ),

tλ(w)(φ) = φw · ∏_{u∈ne(w)} µu→w.

These nodes are then in a position to send themselves messages back to all their neighbors through their incoming arcs, and so on. Finally, by sending messages in this way backwards towards the leafs, at the end tλ(w)(φ) has been computed for all nodes w of the Markov tree. This second phase of computation is also called a distribute algorithm.

The formulae of this local computation scheme simplify somewhat if one computes in a valuation algebra. Then, using the formula for transport in a valuation algebra (3.9), we obtain for the messages

µu→w = πλ(u)∧λ(w)(φu · ∏_{n∈ne(u)−{w}} µn→u)

and

πλ(v)(φ) = φv · ∏_{w∈ne(v)} µw→v.

This is the version of the collect algorithm usually defined in the multivariate setting (Shafer & Shenoy, 1990; Kohlas & Shenoy, 2000; Pouly & Kohlas, 2011). In the multivariate framework other variants of local computation are possible, like fusion or bucket elimination schemes, based on successive variable eliminations (Shenoy, 1992; Dechter, 1999), methods which are not available in our more general setting.
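To make the collect step concrete, here is a minimal sketch in the multivariate setting with probability potentials on a two-node Markov tree; the variable names, frames and numbers are illustrative assumptions. The message sent by node {b, c} to the root {a, b} is the projection of its potential to the intersection domain, as in the valuation-algebra form of the messages above:

```python
from itertools import product

frames = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}

def combine(f, df, g, dg):
    """Combination of two potentials on the join of their domains."""
    d = sorted(set(df) | set(dg))
    table = {}
    for vals in product(*(frames[v] for v in d)):
        atom = dict(zip(d, vals))
        table[vals] = (f[tuple(atom[v] for v in sorted(df))]
                       * g[tuple(atom[v] for v in sorted(dg))])
    return table, d

def project(f, df, target):
    """Projection: sum out the variables of df that are not in target."""
    target = sorted(target)
    out = {}
    for vals, p in f.items():
        atom = dict(zip(sorted(df), vals))
        k = tuple(atom[v] for v in target)
        out[k] = out.get(k, 0.0) + p
    return out, target

# Markov tree: root node with domain {a, b}, leaf node with domain {b, c}.
phi1 = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.2}
phi2 = {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.9, (1, 1): 0.1}

# Collect: the leaf sends mu = pi_{{b,c} ^ {a,b}}(phi2) = pi_{b}(phi2),
# and the root combines it in, giving the projection of phi to {a, b}.
mu, mu_dom = project(phi2, ["b", "c"], ["b"])
marginal, md = combine(phi1, ["a", "b"], mu, mu_dom)
```

The point of the scheme is that `marginal` equals the projection of the full combination, but was computed with tables no larger than the node domains.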

4.2 Computation in Hypertrees

Local computation can also be defined on a hypertree. Suppose that the domains x1, . . . , xn in a projection problem

φ = φ1 · ... · φn, d(φi) = xi, define a hypertree construction sequence and that x = xn in the projection problem (4.1). Then a local computation solution can be obtained as follows: Define

yi = xi+1 ∨ ... ∨ xn, i = 1, . . . , n − 1.

First, we eliminate the first domain x1 in the sequence by computing ty1 (φ), using the hypertree condition (Definition 4). From x1⊥y1|y1 and axioms A2,A5 and A6 we obtain,

ty1 (φ) = ty1 (φ1) · ty1 (φ2 · ... · φn) = ty1 (φ1) · φ2 · ... · φn.

The hypertree condition x1⊥y1|xb(1) implies ty1 (φ1) = ty1 (txb(1) (φ1)) and therefore,

ty1 (φ) = ty1 (txb(1) (φ1)) · φ2 · ... · φn.

Since xb(1) ≤ y1 = d(φ2 · ... · φn) we conclude (see Lemma 1, 5.) that

ty1 (φ) = ty1 (txb(1) (φ1)) · ty1 (φ2 · ... · φn) = txb(1) (φ1) · φ2 · ... · φn.

Define ψ^1_i = φi, and then ψ^2_{b(1)} = ψ^1_{b(1)} · txb(1)(ψ^1_1) and ψ^2_j = ψ^1_j for j = 2, . . . , n and j ≠ b(1). Note that d(ψ^2_j) = xj for j = 2, . . . , n. Then, after elimination of domain x1, we obtain a new factorisation

ty1(φ) = ψ^2_2 · ... · ψ^2_n.

We may now proceed in the same way, eliminating domains x2, x3, . . .. By induction, let us assume that

tyi−1(φ) = ψ^i_i · ... · ψ^i_n, d(ψ^i_j) = xj, j = i, . . . , n. (4.6)

Since yi ≤ yi−1 we have tyi (tyi−1 (φ)) = tyi (φ). Now we eliminate domain xi in yi−1 in (4.6) in the same way as we did above for x1 and obtain

tyi(φ) = tyi(ψ^i_i · ... · ψ^i_n)
= tyi(txb(i)(ψ^i_i)) · ψ^i_{i+1} · ... · ψ^i_n
= txb(i)(ψ^i_i) · ψ^i_{i+1} · ... · ψ^i_n.

Define

ψ^{i+1}_{b(i)} = ψ^i_{b(i)} · txb(i)(ψ^i_i) (4.7)

and ψ^{i+1}_j = ψ^i_j for j = i + 1, . . . , n and j ≠ b(i). We still have d(ψ^{i+1}_j) = xj for j = i + 1, . . . , n. Thus we obtain the new factorisation

tyi(φ) = ψ^{i+1}_{i+1} · ... · ψ^{i+1}_n, d(ψ^{i+1}_j) = xj, j = i + 1, . . . , n.

This concludes the induction. At the end, for i = n, we obtain in this way

txn(φ) = ψ^n_n. (4.8)

This solves the projection problem on the hypertree {x1, . . . , xn} and does it by local computation: in every step i = 1, . . . , n − 1, a transport operation txb(i)(ψ^i_i) and a combination of the result with ψ^i_{b(i)} have to be executed, and this n − 1 times. These are all local operations on the domains xb(i). So, here we have a second local computation scheme, this time on a hypertree.

Note that a Markov tree induces a hypertree, even many hypertrees. In this case, it can be seen that the hypertree computation scheme just described essentially corresponds to the collect algorithm in the Markov tree. However, since a hypertree does not necessarily induce a Markov tree, the backwards distribute algorithm is not available in general in a hypertree. If the generalised information algebra is idempotent, however, then the distribute algorithm gives the correct result for hypertrees too. This is formulated in the following theorem.

Theorem 28 Assume {x1, . . . , xn} to be a hypertree with a hypertree construction sequence x1, . . . , xn, φ = φ1 · ... · φn with d(φi) = xi, and ψ^i_i the intermediate results computed in the collect algorithm in this sequence. Define

µb(i)→i = txi(txb(i)(φ)).

Then, for i = n − 1, . . . , 1, if Axiom A7 (Idempotency) holds,

txi(φ) = µb(i)→i · ψ^i_i. (4.9)

Proof. Define as before yi = xi+1 ∨ ... ∨ xn. Then, according to the collect algorithm above,

tyi(φ) = ψ^{i+1}_{i+1} · ... · ψ^{i+1}_n, d(ψ^{i+1}_j) = xj, j = i + 1, . . . , n. (4.10)

Since xb(i) ≤ yi, we obtain

ψ^i_i · µb(i)→i = ψ^i_i · txi(txb(i)(φ)) = ψ^i_i · txi(txb(i)(tyi(φ))).

Further, as x1, . . . , xn is a hypertree construction sequence, we have xi⊥yi|xb(i), or yi⊥xi|xb(i) (C1). Apply Axiom A4 and (4.10) to obtain

ψ^i_i · µb(i)→i = ψ^i_i · txi(tyi(φ)) = ψ^i_i · txi(ψ^{i+1}_{i+1} · ... · ψ^{i+1}_n).

Using (4.7) and Axiom A5 with xi⊥yi|xi, we obtain further

ψ^i_i · µb(i)→i = txi(ψ^i_i · ψ^i_{i+1} · ... · ψ^i_n · txb(i)(ψ^i_i)).

Now we use idempotency to show that

ψ^i_i · ψ^i_{i+1} · ... · ψ^i_n · txb(i)(ψ^i_i) = ψ^i_i · ψ^i_{i+1} · ... · ψ^i_n.

In fact, since we assume idempotency, Lemma 3 applies, and so do Lemma 1, 5.) and 2.),

ψ^i_i · ψ^i_{i+1} · ... · ψ^i_n · txb(i)(ψ^i_i)
= txi∨xb(i)(ψ^i_i) · ∏_{j=i+1}^{n} ψ^i_j = tyi−1(txi∨xb(i)(ψ^i_i)) · ∏_{j=i+1}^{n} tyi−1(ψ^i_j)
= tyi−1(ψ^i_i) · ∏_{j=i+1}^{n} tyi−1(ψ^i_j) = ∏_{j=i}^{n} tyi−1(ψ^i_j)
= ∏_{j=i}^{n} ψ^i_j.

Then we obtain finally

ψ^i_i · µb(i)→i = txi(ψ^i_i · ... · ψ^i_n) = txi(tyi−1(φ)) = txi(φ),

since xi ≤ yi−1. This concludes the proof. □

According to this theorem, once txn(φ) has been computed in the n-th step of the collect algorithm, the other projection problems txi(φ) can be computed in the inverse order i = n − 1, . . . , 1 of the construction sequence. At step i, txj(φ) is known for all j ≥ i, and then (4.9) allows to compute txi−1(φ), since b(i − 1) ≥ i.

The assumption that the domains of the factors of a projection problem form just a Markov tree or a hypertree is of course very strong. But as usual, the existence of unit elements allows the use of covering Markov trees or hypertrees. This is in generalised information algebras just as in valuation algebras (Kohlas, 2003a; Pouly et al., 2013). In the multivariate case, covering Markov trees may be found by sequences of variable eliminations. Such procedures are not available in the more general case of general q-separoids (D; ≤, ⊥). So, finding good covering Markov trees is an open problem.

Also, like in valuation algebras, the semigroup (Φ; ·) may allow for division as a partially inverse operation of combination (Lauritzen & Jensen, 1997; Kohlas, 2003a). This will be discussed in Section 5.4. Division has also an impact on information order, and this will be studied in Section 8.
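For the multivariate case just mentioned, the construction of a covering structure by variable elimination can be sketched as follows; the function name and the toy domains are assumptions for illustration. Eliminating the variables in some order produces a sequence of clusters, each covering one or more factor domains, and linking every cluster to a later cluster containing its remaining variables yields a covering join tree:

```python
def elimination_clusters(domains, order):
    """Clusters produced by eliminating variables in the given order.
    Each cluster is the union of all current domains containing the variable."""
    domains = [set(d) for d in domains]
    clusters = []
    for var in order:
        touching = [d for d in domains if var in d]
        cluster = set().union(*touching) if touching else {var}
        clusters.append(frozenset(cluster))
        # Replace the touched domains by the cluster minus the eliminated variable.
        domains = [d for d in domains if var not in d] + [cluster - {var}]
    return clusters

# A chain of factor domains; eliminating a, b, c, d keeps every cluster small.
clusters = elimination_clusters([{"a", "b"}, {"b", "c"}, {"c", "d"}],
                                ["a", "b", "c", "d"])
```

The quality of the covering tree depends heavily on the elimination order, which is exactly why finding good covering Markov trees is hard in general.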

5 Division and Inverses

5.1 Separative Semigroups

So far, in the present framework, pieces of information can be combined; information can be added. In certain situations it could also be interesting and important to remove information. In algebraic terms, this is an operation inverse to combination, that is, some kind of division. It cannot be expected that this is possible in general. However, there are important cases where division is possible. It is the subject of this section to introduce and discuss information algebras with division. We start by presenting the well-known theory of semigroups with inverses. Based on this theory, information, and in particular valuation algebras with division will be studied. In probability theory, probabilistic information is represented by distributions. From them conditional distributions can be derived; and they play an important role in probability theory. Conditional distributions are obtained by dividing a distribution by a marginal of it; so division is basic in this context. It will be shown that in this class of valuation algebras, this can be mimicked to a large extent by conditional information, which we call conditionals (Section 6.1). Conditionals are shown to share most properties with conditional probability distributions. In particular, causal modelling, usually expressed in probabilistic terms, can be extended to valuation algebras with division. More generally, the recent approach of compositional modelling (Jirousek, 1997; Jirousek, 2011; Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015) can be developed in valuation algebras with division. Finally, it will be shown how division can be exploited in local computation (Section 5.4). In the simpler context of valuation algebras over sets of variables, a large part of this theory has already been developed in (Kohlas, 2003a).

In studying division in a semigroup we follow (Hewitt & Zuckerman, 1956). Consider a commutative semigroup (A; ×). Such a semigroup is called separative if, for all elements a, b ∈ A,

a² = b² = a × b implies a = b. (5.1)

Define in A: a ≡γ b if there is a positive integer n and two elements u, v ∈ A such that

a^n = u × b and b^n = v × a.

This defines a congruence relation in the semigroup A. More precisely, the following holds:

Theorem 29 If (A; ×) is a separative semigroup, then ≡γ is an equivalence relation in A such that

1. a² ≡γ a,

2. a ≡γ b and c ≡γ d imply a × c ≡γ b × d.

Proof. Reflexivity of ≡γ follows from a² = a × a, and symmetry follows directly from the definition. If a ≡γ b and b ≡γ c, then a^n = u′ × b, b^n = v′ × a and b^m = u″ × c, c^m = v″ × b. It follows that a^{nm} = u′^m × b^m = u′^m × u″ × c and c^{mn} = v″^n × b^n = v″^n × v′ × a. This shows that transitivity holds, a ≡γ c. So, ≡γ is an equivalence relation in A.

Further, with n = 4, a⁴ = a² × a², and (a²)⁴ = a⁷ × a, which means that a² ≡γ a. Finally, a ≡γ b and c ≡γ d imply that a^n = u′ × b, b^n = v′ × a and c^m = u″ × d, d^m = v″ × c. From this it follows that

(a × c)^n = a^n × c^n = u′ × c^{n−1} × (b × c), (b × c)^n = b^n × c^n = v′ × c^{n−1} × (a × c).

This shows that a × c ≡γ b × c. In the same way it follows that b × c ≡γ b × d, and therefore, by transitivity, a × c ≡γ b × d. □
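On a finite commutative semigroup, both separativity (5.1) and the relation ≡γ can be checked by brute force. The following is an illustrative sketch under assumed names; the exponent bound in the ≡γ search is an assumption that suffices for small examples:

```python
from itertools import product

def is_separative(elems, mul):
    """Check condition (5.1): a*a = b*b = a*b implies a = b, for all pairs."""
    return all(not (mul(a, a) == mul(b, b) == mul(a, b) and a != b)
               for a, b in product(elems, repeat=2))

def gamma_equiv(a, b, elems, mul, max_n=8):
    """a equiv_gamma b: some n and u, v with a^n = u*b and b^n = v*a.
    The exponent search is bounded by max_n (assumed large enough here)."""
    def power(x, n):
        y = x
        for _ in range(n - 1):
            y = mul(y, x)
        return y
    return any(power(a, n) == mul(u, b) and power(b, n) == mul(v, a)
               for n in range(1, max_n + 1)
               for u in elems for v in elems)

# The Boolean semigroup under AND is separative, while multiplication
# modulo 4 is not: 0*0 = 2*2 = 0*2 = 0, yet 0 != 2.
boolean_ok = is_separative([0, 1], lambda a, b: a & b)
mod4_ok = is_separative(range(4), lambda a, b: a * b % 4)
```

In the Boolean case the classes of ≡γ are {0} and {1}: the zero absorbs every product, so 0 and 1 cannot be equivalent, matching the decomposition into cancellative subsemigroups proved below.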

As ≡γ is an equivalence relation in A, the set A decomposes into a union of disjoint equivalence classes [a]γ. Consider then the quotient semigroup A/≡γ, where [a]γ × [b]γ = [a × b]γ. This semigroup is idempotent, since [a]γ × [a]γ = [a²]γ = [a]γ. Therefore, we may define a partial order: [a]γ ≤ [b]γ if [a]γ × [b]γ = [b]γ. Under this partial order the semigroup A/≡γ becomes a join semilattice and [a]γ × [b]γ = [a]γ ∨ [b]γ. Subsequently, we need the following lemma, valid for separative semigroups.

Lemma 5 (Hewitt & Zuckerman, 1956) In a separative semigroup,

1. a^n × b = a^n × c implies a × b = a × c,

2. if a ≡γ b, then there is a positive integer n and elements u′, v′ ∈ [a]γ such that a^n = u′ × b and b^n = v′ × a.

Proof. 1.) Suppose first that n is odd. In this case, let m = (n + 1)/2, so that (a^m × b)² = a^{n+1} × b² = a × a^n × b × b = a × a^n × c × b = a^{n+1} × c². Next, suppose that n is even. In this case let m = n/2. We then have (a^m × b)² = a^n × b² = a^n × b × c = a^n × c². In both cases, (a^m × b)² = (a^m × b) × (a^m × c) = (a^m × c)², hence by separativity a^m × b = a^m × c. Since m is a positive integer such that m < n if 1 < n, this argument can be repeated until a × b = a × c.

2.) Under the assumptions of the lemma, we have a^n = u × b and b^n = v × a for some u, v ∈ A, hence a^{n+1} = (u × a) × b and b^{n+1} = (v × b) × a. Now, [u × a]γ = [u]γ × [a]γ. And we have further [u]γ × [a]γ = [u]γ × [a²]γ = [u × a]γ × [b]γ = [u × a × b]γ = [a^{n+1}]γ = [a]γ. So, we see that u × a ∈ [a]γ. In the same way we find that v × b ∈ [b]γ = [a]γ. This proves the lemma with u′ = u × a and v′ = v × b. □

The following theorem is the key for introducing division or inverses into a separative semigroup.

Theorem 30 If (A; ×) is a separative semigroup, then the following holds:

1. The mapping a ↦ [a]γ is a semigroup homomorphism.

2. For all a ∈ A, the equivalence class [a]γ is a subsemigroup of A.

3. If a, b, c ∈ [a]γ and a × b = a × c, then b = c.

Proof. 1.) We have a × b ↦ [a × b]γ = [a]γ × [b]γ. This shows that the mapping is a semigroup homomorphism.

2.) Consider elements b, c ∈ [a]γ, that is, a ≡γ b and a ≡γ c. Then, since ≡γ is a congruence, a ≡γ a^2 ≡γ b × c. Therefore b × c ∈ [a]γ and the class [a]γ is a subsemigroup of A.

3.) Assume a × b = a × c and a, b, c ∈ [a]γ. Then we have b^n = u × a and c^m = v × a for some positive integers n, m and some u, v ∈ A. It follows that b^{n+1} = u × a × b = u × a × c = b^n × c and also c^{m+1} = v × a × c = v × a × b = c^m × b. From b^n × b = b^n × c and c^m × c = c^m × b we deduce that b^2 = b × c and c^2 = b × c (Lemma 5). But then separativity implies b = c. □

A semigroup (A; ×) where a × b = a × c implies b = c for all elements is called cancellative. So, Theorem 30, item 3, states that the subsemigroups [a]γ of A are cancellative. The separative semigroup A is therefore the union of disjoint cancellative subsemigroups [a]γ. Below, we shall see that separativity is not only sufficient, but also necessary, for a semigroup to be the union of disjoint cancellative semigroups. We now first consider two important special cases of separative semigroups. In some cases the classes [a]γ are already groups.

Theorem 31 In a separative semigroup, if the equivalence class [a]γ contains an idempotent element, then it is a group.

Proof. Denote the idempotent element of the class [a]γ by f. Consider any element b ∈ [a]γ. Then f^2 × b = f × b, that is, f × (f × b) = f × b. By cancellativity of the semigroup [a]γ it follows that f × b = b. So f is the unit element of [a]γ. Also, we have f^n = v × b for some v ∈ [a]γ (Lemma 5, item 2). But f^n = f, so f = v × b, that is, v and b are inverses of each other. This shows that [a]γ is a group. □

The idempotent in a group [a]γ is unique, and we denote it subsequently by fa to underline its membership in the group [a]γ. A separative semigroup (A; ×) where every equivalence class [a]γ has an idempotent element is called regular. In this case a ≡γ b if and only if a = u × b and b = v × a, with u = a × b^{-1} and v = b × a^{-1}. This is the Green relation in a regular semigroup. This means also that for all a ∈ A there is a b ∈ A such that

a = a^2 × b. (5.2)

This is an alternative condition to define regularity of a semigroup. A regular semigroup A is thus the union of disjoint groups, that is, every element has a (local) inverse in its equivalence class [a]γ. In (Croisot, 1953) it has been shown that, conversely, a semigroup satisfying (5.2) decomposes into a union of disjoint groups; see also (Kohlas, 2003a). This shows that a semigroup is regular if and only if it decomposes into a union of disjoint groups. A regular semigroup is also necessarily separative. But, of course, not every semigroup is regular or separative. Here follow a few examples of semigroups.
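As a quick sanity check of condition (5.2), one can verify in a small finite semigroup that every element a admits some b with a = a^2 × b, and list the idempotents, one per group of the decomposition. The choice of example semigroup (integers modulo 6 under multiplication) is an assumption for illustration only:

```python
# assumed toy example: ({0,...,5}, × mod 6)
A = range(6)
mul = lambda a, b: (a * b) % 6

# regularity condition (5.2): for every a there is b with a = a^2 × b
assert all(any(mul(mul(a, a), b) == a for b in A) for a in A)

# the idempotents f = f^2, one for each disjoint group of the decomposition
idempotents = [f for f in A if mul(f, f) == f]
print(idempotents)  # [0, 1, 3, 4]
```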

Example : Arithmetic Semigroups: Real or rational numbers are semigroups under addition and multiplication, and they are regular under both operations. Under addition, the real numbers form a group; under multiplication they decompose into the trivial group {0} and the multiplicative group of nonzero numbers. Nonnegative real or rational numbers are still regular semigroups under multiplication, but no longer under addition. Again, the nonnegative real or rational numbers are the union of the group {0} and the multiplicative group of positive real or rational numbers. The integers form a regular semigroup under addition, but not under multiplication. But all these semigroups are separative. The multiplicative semigroups of (strictly) positive real numbers, rational numbers or integers are cancellative.

Example : Idempotent Semigroups: If in a semigroup (A; ×) the operation is idempotent, then the semigroup is trivially regular, since [a]γ = {a}. Examples are the Boolean semigroups ({0, 1}; max) and ({0, 1}; min), or, more generally, join- or meet-semilattices.

Example : t-Norms: If T is a t-norm (see Section 3.3), then ([0, 1]; T) is a commutative semigroup. Typical t-norms are given in the example on t-norms in Section 3.3. Clearly, the minimum t-norm and the product t-norm are regular. The Łukasiewicz and the drastic t-norm are not even separative.
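The failure of separativity for the Łukasiewicz t-norm can be seen on a concrete pair of arguments; the particular numbers below are an arbitrary illustrative choice:

```python
# Łukasiewicz t-norm T(a, b) = max(0, a + b - 1)
T = lambda a, b: max(0.0, a + b - 1.0)

a, b = 0.2, 0.3
# separativity would demand: a^2 = a×b = b^2 implies a = b,
# but here all three products collapse to 0 while a != b
assert T(a, a) == T(a, b) == T(b, b) == 0.0
assert a != b
print("Lukasiewicz t-norm: not separative")
```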

Further examples of semigroups will be discussed later in the context of valuation algebras and semirings.

Before we return to the general case of separative semigroups, assume that the semigroup (A; ×) is cancellative. Then it is surely separative. We define the relation (a, b) ≡δ (c, d) between ordered pairs of elements of A to hold if a × d = b × c. Further, we define

(a, b) × (c, d) = (a × c, b × d).

The relation ≡δ is a congruence relative to this operation as the next theorem shows.

Theorem 32 If (A; ×) is a cancellative semigroup, then ≡δ is an equivalence relation in A × A, and (a, b) ≡δ (a′, b′) and (c, d) ≡δ (c′, d′) imply

(a, b) × (c, d) ≡δ (a′, b′) × (c′, d′).

Proof. Reflexivity and symmetry of the relation ≡δ are evident from the definition. Assume (a, b) ≡δ (c, d) and (c, d) ≡δ (e, f). Then d × (a × f) = (a × d) × f = (b × c) × f = b × (c × f) = b × (d × e) = d × (b × e). Cancellativity then implies a × f = b × e, hence (a, b) ≡δ (e, f). So transitivity holds, and ≡δ is an equivalence relation.

Further, from (a, b) ≡δ (a′, b′) and (c, d) ≡δ (c′, d′) we obtain a × b′ = a′ × b and c × d′ = c′ × d. Further, we have (a, b) × (c, d) = (a × c, b × d) and (a′, b′) × (c′, d′) = (a′ × c′, b′ × d′). Then it follows that (a × c) × (b′ × d′) = (a × b′) × (c × d′) = (a′ × b) × (c′ × d) = (b × d) × (a′ × c′). This shows that (a, b) × (c, d) ≡δ (a′, b′) × (c′, d′). □

Denote the equivalence classes of the relation ≡δ by [a, b], and let A′ denote the set of all these equivalence classes. In A′ we define

[a, b] × [c, d] = [a × c, b × d].

This operation is well defined by Theorem 32. It turns out that A′ is a commutative group.

Theorem 33 If (A; ×) is a cancellative semigroup, then (A′; ×) is a group and (A; ×) is embedded in (A′; ×) by the mapping a ↦ [a^2, a].

Proof. Since the operation × is clearly commutative and associative, (A′; ×) is a commutative semigroup. The class [a, a] is the unit element of ×, i.e.,

[a, a] × [b, c] = [a × b, a × c] = [b, c].

Finally, the class [b, a] is the inverse of [a, b], since [a, b] × [b, a] = [a × b, a × b]. So, (A′; ×) is a commutative group. The mapping a ↦ [a^2, a] is a homomorphism, since a × b ↦ [a^2 × b^2, a × b] = [a^2, a] × [b^2, b]. Assume that [a^2, a] = [b^2, b], that is, a^2 × b = a × b^2, or (a × b) × a = (a × b) × b. Then, by cancellativity, a = b, which shows that the mapping is injective, hence an embedding. □

Cancellativity is not only sufficient, but also necessary, for a semigroup A to be embedded in a group.

Theorem 34 A semigroup (A; ×) is cancellative if and only if it is embedded in a group.

Proof. We have already shown that cancellativity is sufficient. It remains to show that it is necessary. So, assume f : A → B to be an embedding of (A; ×) in a group (B; ×). Assume a × b = a × c in A. Then it follows that f(a) × f(b) = f(a) × f(c) in B, and since B is a group this implies f(b) = f(c). Since f is injective, it follows also that b = c, which shows that the semigroup A is cancellative. □

Example : Arithmetic Semigroups: We have seen above that the multiplicative semigroup of positive integers is cancellative; and in fact it is embedded in the group of positive rational numbers. The multiplicative semigroups of positive real or rational numbers are cancellative too, but are already groups themselves. The additive semigroups of nonnegative real numbers or integers are also cancellative, and they are embedded in the additive groups of real numbers or integers, respectively.

We now turn to the general case of a separative semigroup (A; ×). We have seen above that in this case A is the union of the disjoint cancellative semigroups [a]γ. By Theorem 33 each of these semigroups is embedded in a group γ(a), the group of classes of pairs [a, b] for a, b ∈ [a]γ. Let

A′ = ∪_{a∈A} γ(a). (5.3)

In A′ we define

[a, b] × [c, d] = [a × c, b × d],

and (A′; ×) becomes a commutative semigroup under this operation. The mapping a ↦ [a^2, a] is still an embedding of the semigroup A into the semigroup A′, which is the union of disjoint groups. So, separativity of a semigroup is sufficient to embed the semigroup into a semigroup which is a union of disjoint groups. It is also necessary:

Theorem 35 A semigroup (A; ×) is separative if and only if it is embedded in a semigroup which is a union of disjoint groups.

Proof. As we have shown above, if (A; ×) is separative it is embedded in a semigroup which is a union of disjoint groups. Conversely, assume that

S = ∪_i S_i,

where the (S_i; ×) are disjoint groups and (S; ×) is a semigroup. Assume that f : A → S is an embedding of A in S. Consider elements a, b ∈ A such that a^2 = b^2 = a × b. Then we have f(a)^2 = f(b)^2 = f(a) × f(b). Then f(a) and f(b) belong to the same group S_i and therefore f(a) = f(b). It follows that a = b, since f is injective, hence (A; ×) is separative. □

Let now F be the set of idempotents in a separative semigroup (A; ×) which is embedded in the semigroup A′. Denote the group into which an element a ∈ A is embedded by γ(a) and its unit element by fa. These idempotents form themselves an idempotent semigroup (F; ×), where fa × fb = fa×b. As between the subsemigroups [a]γ, we may define a partial order between idempotents by fa ≤ fb if fa × fb = fb. This means that a × b ≡γ b, or [a]γ ≤ [b]γ. So, the orders between idempotents of A′ and between equivalence classes are equivalent. We may also order the groups γ(a) in the same way: γ(a) ≤ γ(b) if [a]γ ≤ [b]γ. In fact, this partial order defines a join-semilattice, as we have seen above,

γ(a × b) = γ(a) ∨ γ(b) or fa × fb = fa ∨ fb.

The following simple lemma will be important later for separative valuation algebras.

Lemma 6 Let (A; ×) be a separative semigroup, embedded in the semigroup (A′; ×). Then γ(a) ≤ γ(b) implies fa × b = b.

Proof. Assume γ(a) ≤ γ(b). Then fa × b = fa × fb × b = fb × b = b. □

Subsequently, we shall often identify the image of A under the embedding map with A itself and thus consider A as a subset of A′. This will simplify notation and should not cause confusion.

5.2 Regular Valuation Algebras

Information algebras are semigroups under combination. So we can apply the theory developed in the previous section to introduce inverses, hence division, in information algebras using separativity. It turns out, however, that this alone is not sufficient for interesting and useful properties of division in information algebras. Division must be linked to transport and especially to projection. In fact, in relation to inverses and division, projection is more relevant than transport in general, especially in view of so-called conditionals, see Section 6.1. Therefore, we limit ourselves in this section to valuation algebras instead of considering generalised information algebras. However, we generalize the situation with respect to Section 3.2 in that we drop Axiom S3. That is, we assume in general no null and unit elements in the valuation algebras. In some cases, when we consider valuation algebras with neutral elements, we replace Axiom S3 by the weaker axiom (S3') (see Section 3.2). So, assume (Ψ,D; ≤, d, ·, π) to be a valuation algebra without Axiom S3. If we require it to be a valuation algebra with unit elements, we assume Axiom S3' instead of Axiom S3.

To start with, we assume the semigroup (Ψ; ·) to be regular. This case is somewhat simpler, since the inverses belong to Ψ itself. But the regularity of the semigroup is not sufficient for our purposes, especially for local computation, where we divide out or remove some projection πx(ψ) of an element ψ, but want to add it again later. Then we must assure that ψ · (πx(ψ))^{-1} · πx(ψ) = ψ. This is not automatically guaranteed if (Ψ; ·) is simply a regular semigroup. What we need is the following definition:

Definition 7 Regular Valuation Algebra: A valuation algebra (Ψ,D; ≤, d, ·, π) is called regular if for all elements φ ∈ Ψ and all x ∈ D, x ≤ d(φ), there is an element χx with domain d(χx) = x such that

φ = φ · πx(φ) · χx. (5.4)

Of course, χx depends not only on x but also on φ. Since for x = d(φ) equation (5.4) becomes φ = φ^2 · χx, we see that in a regular valuation algebra the semigroup (Ψ; ·) is regular too. This implies, according to the previous Section 5.1, that Ψ is a union of disjoint groups γ(φ), which are the equivalence classes of the Green relation φ ≡γ ψ, which holds if there are elements χ1 and χ2 in Ψ such that

φ = χ1 · ψ and ψ = χ2 · φ, (5.5)

see the previous Section 5.1. So each valuation φ has an inverse in its group γ(φ). The idempotent or unit element of this group is, as before, denoted by fφ. The important point here is that the inverse φ^{-1} of an element φ belongs to Ψ, as do the idempotent elements fφ. In particular, projection is defined for all these elements. The main result for regular valuation algebras is that the Green relation is a congruence not only for combination, but also for labeling and projection.

Theorem 36 (Kohlas, 2003a) If (Ψ,D; ≤, d, ·, π) is a regular valuation algebra, then ≡γ is a congruence for the valuation algebra.

Proof. If φ ≡γ ψ, then φ = χ1 · ψ and ψ = χ2 · φ, and therefore d(φ) ≤ d(ψ) and d(ψ) ≤ d(φ), hence d(φ) = d(ψ). We know from the theory of separative semigroups (Section 5.1, Theorem 29) that ≡γ is a congruence relative to combination. It remains to show that it is a congruence relative to projection too.

Assume φ ≡γ ψ and x ≤ d(φ) = d(ψ). Consider an element η in the group γ(πx(φ)). Then we have η = πx(φ) · ε = πx(φ · ε) for some ε ∈ Ψx. Further, from φ ≡γ ψ it follows that φ = χ1 · ψ, hence φ · ε = ψ · ε′ with ε′ = χ1 · ε. So we obtain η = πx(ψ · ε′). By the regularity of the valuation algebra, ψ = ψ · πx(ψ) · χx, hence ψ = χ · πx(ψ) with χ = ψ · χx. This gives

η = πx(πx(ψ) · χ · ε′) = πx(ψ) · πx(χ · ε′).

This implies γ(η) = γ(πx(φ)) ≥ γ(πx(ψ)). In the same way we conclude also that γ(πx(ψ)) ≥ γ(πx(φ)), hence indeed γ(πx(φ)) = γ(πx(ψ)), or πx(φ) ≡γ πx(ψ). □

As a consequence, note that the groups γ(φ) for φ ∈ Ψx are always subsemigroups of Ψx, that is,

Ψx = ∪_{φ∈Ψx} γ(φ).

Here is another important result on projections in a regular valuation algebra.

Theorem 37 (Kohlas, 2003a) If (Ψ,D; ≤, d, ·, π) is a regular valuation algebra, then the following holds:

1. Let d(φ) = d(ψ) and x ≤ d(φ). Then γ(ψ) ≤ γ(φ) implies γ(πx(ψ)) ≤ γ(πx(φ)).

2. γ(πx(φ)) ≤ γ(φ) for all φ ∈ Ψ and for all x ≤ d(φ).

Proof. 1.) We show that γ(πx(φ)) = γ(πx(φ) · πx(ψ)). Consider first an element η ∈ γ(πx(φ)). That means that d(η) = x and there is an element ε ∈ Ψx such that η = πx(φ) · ε = πx(φ · ε). Now, γ(ψ) ≤ γ(φ) implies γ(φ) = γ(φ · ψ). Hence there is an element ε′ such that φ = φ · ψ · ε′, or φ · ε = φ · ψ · ε · ε′. Further, by the regularity of the valuation algebra, φ = πx(φ) · χ · φ and ψ = πx(ψ) · χ′ · ψ for some valuations χ, χ′ ∈ Ψx. Hence, we obtain

η = πx(φ · ψ · ε · ε′) = πx((πx(φ) · πx(ψ)) · (χ · χ′ · φ · ψ · ε · ε′)).

This shows that γ(η) = γ(πx(φ)) ≥ γ(πx(φ) · πx(ψ)). But, on the other hand, γ(πx(φ)) ≤ γ(πx(φ) · πx(ψ)), so that indeed γ(πx(φ)) = γ(πx(φ) · πx(ψ)) = γ(πx(φ)) ∨ γ(πx(ψ)), hence γ(πx(ψ)) ≤ γ(πx(φ)).

2.) We have γ(πx(φ)) = γ((πx(φ))^{-1}). By regularity,

φ = φ · χ · πx(φ), (5.6)

with d(χ) = x. From this it follows that πx(φ) = πx(φ) · χ · πx(φ). But we have also πx(φ) = πx(φ) · (πx(φ))^{-1} · πx(φ), hence πx(φ) · χ · πx(φ) = πx(φ) · (πx(φ))^{-1} · πx(φ). It follows that (πx(φ))^{-1} = f_{πx(φ)} · χ. This implies γ(πx(φ)) ≥ γ(χ). Therefore, we obtain from (5.6)

γ(φ) = γ(φ) ∨ γ(χ) ∨ γ(πx(φ)) = γ(φ) ∨ γ(πx(φ)).

But this shows that γ(πx(φ)) ≤ γ(φ). □

If the valuation algebra has neutral elements 1x for all domains x, then γ(1x) is the smallest element among all groups γ(φ) for valuations φ ∈ Ψx, since γ(φ · 1x) = γ(φ). Further, the set

Ψp = ∪_{x∈D} γ(1x)

is closed under combination, hence is a semigroup: if φ ∈ γ(1x) and ψ ∈ γ(1y), then γ(φ · ψ) = γ(1x · 1y) = γ(1x∨y), hence φ · ψ ∈ γ(1x∨y). The elements of Ψp are called positive. Further, from γ(φ) = γ(1y) it follows that γ(πx(φ)) = γ(πx(1y)). If the valuation algebra is stable, that is, πx(1y) = 1x, then the positive elements are closed under projection too. This is the case, even if the algebra is not stable, as soon as πx(1y) ≡γ 1x. We discuss briefly two important examples of regular valuation algebras.

Example : Probabilistic Potentials: Probability potentials (see Section 3.3) are a typical example of a regular valuation algebra. For a probability potential φ on a domain Θ of a family of compatible frames (f.c.f.) F, equation (5.4) becomes

φ(θ) = φ(θ) × πΛ(φ)(tΛ(θ)) × χΛ(tΛ(θ)).

If φ(θ) > 0, we must have πΛ(φ)(tΛ(θ)) × χΛ(tΛ(θ)) = 1, and thus

χΛ(λ) = 1 / πΛ(φ)(λ) = 1 / Σ_{θ′ ∈ tΘ(λ)} φ(θ′),

where πΛ(φ)(λ) > 0 for λ = tΛ(θ) if φ(θ) > 0, since θ ∈ tΘ(tΛ(θ)). If πΛ(φ)(tΛ(θ)) = 0, then φ(θ) = 0, since probability potentials take only nonnegative values. Therefore, in this case χΛ(tΛ(θ)) may take an arbitrary nonnegative value. This shows that the valuation algebra of probability potentials is regular.

Since the algebra is regular, every probability potential φ has an inverse potential φ^{-1}, defined by φ^{-1}(θ) = 1/φ(θ) if φ(θ) > 0, and φ^{-1}(θ) = 0 otherwise. Let supp(φ) be the set of elements θ for which φ is positive. Then the idempotent element fφ of the group γ(φ) is given by fφ(θ) = 1 for all θ ∈ supp(φ) and fφ(θ) = 0 otherwise. Clearly, for two idempotents f1, f2 on the same domain Θ, we have f1 ≤ f2 if and only if supp(f1) ⊇ supp(f2). We remark that fφ · fψ = fφ·ψ, which corresponds to the support of φ · ψ in the domain d(φ) ∨ d(ψ). This is nothing else than (supp(φ), d(φ)) · (supp(ψ), d(ψ)) in the subset algebra on the f.c.f. F (see Section 3.1). Note that the projection πΛ(f) of an idempotent is no longer an idempotent element, but supp(πΛ(f)) = tΛ(supp(f), d(f)). So, the semilattice of groups γ(φ) and of idempotent elements is closely related to the subset algebra of the f.c.f. on which the valuation algebra of probability potentials is based.
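The regularity equation (5.4) for probability potentials can be verified numerically. The following sketch uses a made-up potential on two binary variables, with projection realised as summing out a variable:

```python
# assumed toy potential φ on binary variables (X, Y)
phi = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.0, (1, 1): 0.6}

def project_to_X(p):
    # projection = marginalisation: sum out Y
    m = {}
    for (x, y), v in p.items():
        m[x] = m.get(x, 0.0) + v
    return m

marg = project_to_X(phi)                    # approx {0: 0.4, 1: 0.6}
chi = {x: (1.0 / v if v > 0 else 0.0) for x, v in marg.items()}

# equation (5.4): φ = φ × π_X(φ) × χ, pointwise
for (x, y), v in phi.items():
    assert abs(v * marg[x] * chi[x] - v) < 1e-12

# the idempotent f_φ is the indicator of supp(φ)
f = {k: 1.0 if v > 0 else 0.0 for k, v in phi.items()}
assert all(f[k] * f[k] == f[k] for k in f)
```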

Example : Densities: In Section 3.2 the valuation algebra of densities has been introduced. We want to show that this algebra is not regular. Equation (5.4) reads, for a density f on the space R^s and vectors x ∈ R^s,

f(x) = f(x) × (πt(f))(xt) × χt(xt).

If f(x) is positive, then (πt(f))(xt) × χt(xt) = 1, which implies χt(xt) = 1/(πt(f))(xt). Now (πt(f))(xt) is a density, but the inverse of a density is not necessarily finitely integrable; take the density e^{-x} for x > 0 as an example. So the valuation algebra of densities is not regular. But still, it seems there are inverses, although not in the algebra of densities itself. We come back to this example in the next Section 5.3.

5.3 Separative Valuation Algebras

There are many examples of valuation algebras which are not regular; see the example of densities at the end of the previous section. One might think that a valuation algebra (Ψ,D; ≤, d, ·, π) where the semigroup of combination (Ψ; ·) is separative could do the job. This is however not sufficient, for the same reason that pure regularity of the semigroup was not enough in the previous section: something more, like (5.4), which links projection to regularity, is needed.

Before discussing the general case, it is instructive to look first at the cancellative case. Note that the semigroup (Ψ; ·) cannot be cancellative, since φ · ψ = φ · ψ · 1x if d(ψ) < d(φ) = x, although ψ and ψ · 1x are not equal. Cancellativity only makes sense within a domain x ∈ D. So, we define cancellative valuation algebras as follows.

Definition 8 Cancellative Valuation Algebra: A valuation algebra (Ψ,D; ≤, d, ·, π) is called cancellative if the semigroups (Ψx; ·) are cancellative for all x ∈ D.

According to Section 5.1, each cancellative semigroup (Ψx; ·) is embedded in a group (Ψ′x; ·) whose elements are equivalence classes of pairs [φ, ψ], where φ, ψ ∈ Ψx. Define

Ψ′ = ∪_{x∈D} Ψ′x.

In Ψ′ we define, for [φx, ψx] ∈ Ψ′x and [φy, ψy] ∈ Ψ′y,

[φx, ψx] · [φy, ψy] = [φx · φy, ψx · ψy] ∈ Ψ′x∨y.

This operation is well defined according to Theorem 32. Thereby (Ψ′; ·) becomes a commutative semigroup. The semigroup (Ψ; ·) is embedded in the semigroup (Ψ′; ·) by the mapping φ ↦ [φ^2, φ]. In fact, the mapping is clearly a homomorphism, and it is injective, since [φ^2, φ] = [ψ^2, ψ] implies d(φ) = d(ψ) and (φ · ψ) · φ = (φ · ψ) · ψ, hence φ = ψ thanks to cancellativity.

Let's denote the idempotents of the groups (Ψ′x; ·) by fx. If (Ψ; ·) has unit elements 1x, then fx = 1x; otherwise the idempotents do not belong to Ψ. We have also fx · fy = fx∨y, and for x ≤ d(φ) = y we have φ · fx = φ · fy · fx = φ · fy = φ. This further implies, if we identify the elements of Ψ with their images under the embedding, for x ≤ d(φ),

φ · πx(φ) · (πx(φ))^{-1} = φ · fx = φ.

This condition is essential for local computation with division (see Section 5.4). Let’s illustrate these considerations by an example.

Example : Gaussian Potentials: In Section 3.2 we have introduced Gaussian potentials (µ, K), described by their mean value µ and concentration matrix K. If (µ, K), (µ1, K1) and (µ2, K2) are Gaussian potentials on the same domain s, then part of the cancellativity equation is K + K1 = K + K2, and this implies K1 = K2. And then,

(K + K1)^{-1} · (K · µ + K1 · µ1) = (K + K2)^{-1} · (K · µ + K2 · µ2) (5.7)

implies µ1 = µ2. This shows that the valuation algebra of Gaussian potentials is cancellative. The quotient g(x)/πs(g)(xs) represents a family of conditional Gaussian densities, which is not itself a Gaussian density. This shows that the embedding of the valuation algebra into something larger is not only an abstract construct, but may have a real meaning. For such conditionals in the Gaussian case and in general, see Section 6.
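The cancellation argument can be replayed numerically: combining with a fixed Gaussian potential (µ, K) is invertible, since K1 and then µ1 can be recovered from the combination. The potentials below are arbitrary illustrative values:

```python
import numpy as np

def combine(p1, p2):
    # combination of Gaussian potentials on the same domain:
    # concentration matrices add, means are precision-weighted
    (m1, K1), (m2, K2) = p1, p2
    K = K1 + K2
    m = np.linalg.solve(K, K1 @ m1 + K2 @ m2)
    return m, K

g  = (np.array([0.0, 1.0]), np.eye(2))          # (µ, K)
g1 = (np.array([2.0, -1.0]), 2.0 * np.eye(2))   # (µ1, K1)
m, K = combine(g, g1)

# "division": recover (µ1, K1) from the combination and (µ, K)
K1 = K - g[1]
m1 = np.linalg.solve(K1, K @ m - g[1] @ g[0])
assert np.allclose(K1, g1[1]) and np.allclose(m1, g1[0])
```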

Example : Positive Densities: Positive densities f, where f(x) > 0 for all x ∈ R^s, also form a cancellative subalgebra of the valuation algebra of densities (see Section 3.2). In fact, if f, g and h are densities on R^s, then f(x) × g(x) = f(x) × h(x) implies g(x) = h(x) for all x ∈ R^s, if f is positive. For instance, the Gaussian potentials of the previous example are positive densities.

As a preparation for the general case of separative valuation algebras, define φ ≡δ ψ in a cancellative valuation algebra Ψ if d(φ) = d(ψ), so that the equivalence classes [φ]δ equal the Ψx, which are cancellative sub-semigroups of

Ψ. The relation ≡δ is a congruence relative to the valuation algebra Ψ and furthermore

πx(φ) · φ ≡δ φ for all φ ∈ Ψ and x ≤ d(φ). (5.8)

This is exactly what is needed in the general case. Therefore, we define a separative valuation algebra as follows:

Definition 9 Separative Valuation Algebra: A valuation algebra (Ψ,D; ≤, d, ·, π) is called separative if there is a combination congruence ≡δ in Ψ such that

1. πx(φ) · φ ≡δ φ for all φ ∈ Ψ and x ≤ d(φ),

2. the equivalence classes [φ]δ are all cancellative.

By Theorem 36, this concept of separative valuation algebras includes regular valuation algebras, with the Green relation as a valuation algebra congruence. It covers also cancellative valuation algebras with ≡δ defined as above. In some cases, the relation ≡δ is also a congruence relative to projection. Then slightly stronger results hold, but this is not required for our purposes of local computation with division.

Note that from the first condition in this definition it follows that φ ≡δ φ^2, so that the classes [φ]δ are subsemigroups of Ψ. Let δ(φ) denote the group into which the cancellative class [φ]δ is embedded and define

Ψ′ = ∪_{φ∈Ψ} δ(φ).

The groups δ(φ) have the equivalence classes [φ, ψ] with φ, ψ ∈ [φ]δ as elements, and between elements of Ψ′ combination is defined, as usual, by

[φ, ψ] · [φ′, ψ′] = [φ · φ′, ψ · ψ′].

Again, since ≡δ is a combination congruence, this operation is well defined. Thus, (Ψ′; ·) becomes a semigroup which is the union of disjoint groups, and the semigroup (Ψ; ·) is embedded in it by the mapping φ ↦ [φ^2, φ]. By Theorem 35 it follows then that the semigroup (Ψ; ·) is separative.

Let fφ be the units or idempotents in the groups δ(φ). As usual they are ordered by fφ ≤ fψ if fφ · fψ = fψ. We have also fφ · fψ = fφ·ψ = fφ ∨ fψ.

There is an isomorphism between the semilattice of idempotents F = {fφ : φ ∈ Ψ} and the semilattice of groups δ(φ), so that δ(φ) ≤ δ(ψ) if and only if fφ ≤ fψ. Also, if δ(φ) ≤ δ(ψ), then

ψ · fφ = ψ · fψ · fφ = ψ · fψ = ψ.

Under these conditions, Theorem 37 carries over to separative valuation algebras.

Theorem 38 If (Ψ,D; ≤, d, ·, π) is a separative valuation algebra, then the following holds:

1. δ(πx(φ)) ≤ δ(φ) for all φ ∈ Ψ and for all x ≤ d(φ),

2. Assume that ≡δ is also a congruence relative to projection. Let d(φ) = d(ψ) and x ≤ d(φ). Then δ(ψ) ≤ δ(φ) implies δ(πx(ψ)) ≤ δ(πx(φ)).

Proof. The first item follows since πx(φ) · φ ≡δ φ holds in a separative valuation algebra. In order to prove the second item, assume δ(ψ) ≤ δ(φ) and x ≤ d(φ) = d(ψ). Then δ(πx(ψ)) ≤ δ(ψ) implies δ(πx(ψ) · φ) = δ(φ). Since ≡δ is a congruence relative to projection, it follows that δ(πx(φ)) = δ(πx(πx(ψ) · φ)) = δ(πx(ψ) · πx(φ)). But this means that indeed δ(πx(ψ)) ≤ δ(πx(φ)). □

The first item of this theorem is what we need to extend local computation architectures to exploit division, see Section 5.4. In contrast to regular valuation algebras, however, idempotent elements and especially inverses no longer belong to the initial algebra Ψ, but only to Ψ′. Although combination has been extended to Ψ′, this is not the case for projection. We show now how projection may at least be partially extended to Ψ′. This is essential, for example, to generalise concepts like conditional probability distributions and causal reasoning to more general information structures, namely exactly the separative valuation algebras.

For this part we need Lemma 4 of Section 3.2. Therefore, we assume in the following that the valuation algebra either has unit elements which satisfy S3 or at least S3', or else that the strong Combination Axiom S5' is satisfied. These assumptions are sufficient for Lemma 4 to hold. As we shall see, the following theory then applies, among others, to densities, to Gaussian potentials

(which satisfy Axiom S5') and to belief functions (which have unit elements, satisfying S3).

First we extend labeling to Ψ′. If η ∈ δ(φ) for some φ ∈ Ψ, then we define d(η) = d(φ). Labeling is well defined since [φ]δ ⊆ Ψd(φ). Clearly this is an extension of the labeling operation from Ψ to Ψ′. Consider η ∈ δ(φ) and η′ ∈ δ(φ′). Then η · η′ ∈ δ(φ · φ′) and d(φ · φ′) = d(φ) ∨ d(φ′). Hence it follows that d(η · η′) = d(η) ∨ d(η′). The Labeling Axiom thus extends to all of Ψ′.

Next we turn to projection. Recall that an element η ∈ Ψ′ is an equivalence class [φ, ψ], where φ, ψ ∈ [φ]δ. Note that

[φ, ψ] = [φ · φ, φ] · [ψ, ψ · ψ] = φ · ψ^{-1}.

Now, in many cases, ψ is of the form ψ′ · f, where d(ψ′) < d(ψ) = d(φ) and f is an idempotent with d(f) = d(φ) but δ(f) ≤ δ(φ). Then we have

η = φ · f · (ψ′)^{-1} = φ · (ψ′)^{-1}.

Note that we have necessarily δ(ψ′) ≤ δ(φ). In this case, we may define projection for d(ψ′) ≤ x ≤ d(φ) by

πx(η) = πx(φ) · (ψ′)^{-1}, (5.9)

since the projection of φ ∈ Ψ is defined. The representation of an element η as φ · ψ^{-1}, where φ, ψ ∈ Ψ and d(ψ) ≤ d(φ), is however not unique. So, we must show that definition (5.9) is unambiguous. Therefore assume that η = φ · ψ^{-1} = φ′ · (ψ′)^{-1}, where d(ψ), d(ψ′) ≤ d(φ) = d(φ′), δ(η) = δ(φ) = δ(φ′) and δ(ψ), δ(ψ′) ≤ δ(η). We then obtain φ · ψ′ = φ′ · ψ. It follows, for x such that d(ψ), d(ψ′) ≤ x ≤ d(η) (Lemma 4), that

πx(φ) · ψ′ = πx(φ · ψ′) = πx(φ′ · ψ) = πx(φ′) · ψ,

since all elements involved belong to Ψ. Further, δ(ψ) ≤ δ(φ) implies δ(ψ) ≤ δ(πx(φ)): since δ(φ) = δ(φ · ψ), we have δ(πx(φ)) = δ(πx(φ · ψ)) = δ(πx(φ) · ψ) = δ(πx(φ)) ∨ δ(ψ). In the same way we conclude that δ(ψ′) ≤ δ(πx(φ′)). From this we obtain

πx(φ) · ψ^{-1} = πx(φ′) · (ψ′)^{-1},

which shows that πx(η) is well defined for all x ∈ D for which there exist φ, ψ ∈ Ψ such that η = φ · ψ^{-1} with d(ψ) ≤ x ≤ d(φ) and δ(ψ) ≤ δ(φ). Of course, there may be elements η in Ψ′ where projection is only trivially

defined for x = d(η). Also, we have always πx(η) = η if d(η) = x. Finally, assume η ∈ Ψ; then for all x ≤ d(η) we have η = (η · πx(η)) · (πx(η))^{-1}. Then, by the new definition of the projection of η to x, πx(η · πx(η)) · (πx(η))^{-1} = πx(η), where on the left the projection is the one defined in Ψ. This shows that the new definition of projection is indeed an extension of the projection in Ψ. It turns out that this extension of the projection operator still satisfies the Transitivity and Combination Axioms in Ψ′.

Theorem 39 If (Ψ,D; ≤, d, ·, π) is a separative valuation algebra, satisfying either Axiom S3 or S3' or else Axiom S5', and π : Ψ′ × D → Ψ′ is the partially defined extension of projection, then the following holds:

1. If πx(η) exists for x ≤ d(η), η ∈ Ψ′, and x ≤ y ≤ d(η), then

πx(πy(η)) = πx(η). (5.10)

2. If η1, η2 ∈ Ψ′ with d(η1) = x, d(η2) = y, and if πx∧y(η2) exists, then πx(η1 · η2) exists and

πx(η1 · η2) = η1 · πx∧y(η2). (5.11)

Proof. 1.) Assume that η = φ · ψ^{-1} with d(ψ) ≤ x ≤ y ≤ d(η) = d(φ) and δ(ψ) ≤ δ(φ). Then it follows as above that δ(ψ) ≤ δ(πy(φ)). So, πy(η) exists too. In Ψ we have πx(φ) = πx(πy(φ)). Therefore πx(πy(η)) = πx(πy(φ) · ψ^{-1}) = πx(πy(φ)) · ψ^{-1} = πx(φ) · ψ^{-1} = πx(η).

2.) Assume η1 = φ1 · ψ1^{-1} and η2 = φ2 · ψ2^{-1}, where d(ψ1) ≤ d(φ1) = x, δ(ψ1) ≤ δ(φ1) and d(ψ2) ≤ x ∧ y, δ(ψ2) ≤ δ(φ2). Then

η1 · η2 = (φ1 · φ2) · (ψ1 · ψ2)^{-1},

where d(ψ1 · ψ2) = d(ψ1) ∨ d(ψ2) ≤ x ≤ d(φ1) ∨ d(φ2) and δ(ψ1 · ψ2) = δ(ψ1) ∨ δ(ψ2) ≤ δ(φ1) ∨ δ(φ2) = δ(φ1 · φ2). Then we have, by the Combination Axiom in Ψ,

πx(η1 · η2) = πx(φ1 · φ2) · (ψ1 · ψ2)^{-1} = φ1 · πx∧y(φ2) · (ψ1 · ψ2)^{-1} = (φ1 · ψ1^{-1}) · (πx∧y(φ2) · ψ2^{-1}) = η1 · πx∧y(η2).

So, πx(η1 · η2) exists and the Combination Axiom holds under these circumstances. □

This theory of partial projection is essential for generalizing the formalism of conditional probability distributions to separative valuation algebras, see Section 6. As an additional issue, assume that the semilattices of groups γ(φ) in Ψ′x each have a minimal element, denoted by γx. Let fx be the unit element in the group γx. Then φ · fx = φ for all elements φ ∈ Ψx. That is, fx is the neutral element of the semigroup Ψ′x. Usually, it does not belong to Ψx. Consider any group γ(φ) in Ψ′x∨y. Then γx ≤ γ(πx(φ)) ≤ γ(φ) and similarly γy ≤ γ(φ). It follows that γx ∨ γy ≤ γ(φ), hence γx ∨ γy = γx∨y, which implies also

fx · fy = fx∨y. (5.12)

The Neutrality Axiom then holds also in Ψ′. This shows that the formalism of a valuation algebra may, at least partially, be extended to the semigroup Ψ′. See (Kohlas, 2003a) for more details about valuation algebras with partial projection. Let's illustrate these issues for densities.

Example : Densities as Separative Valuation Algebras: Here we consider the valuation algebra of densities, see Section 3.2. Recall that in this algebra Axiom S5' is valid. The situation here is similar to that of probability potentials, Section 5.2. Define, as in the case of probability potentials, for a density f on the space R^s, the support set supp(f) to be the subset of R^s where f(x) > 0. For two densities f and g on the same domain R^s define f ≡δ g if supp(f) = supp(g). This is clearly a valuation algebra congruence. Further, f(x) × πt(f)(xt) = 0 holds if and only if f(x) = 0, so that f · πt(f) has the same support set as f; hence,

f · πt(f) ≡δ f.

Finally, the class [f]δ of densities with the same support set is cancellative, since f(x)g(x) = f(x)h(x) for all x ∈ supp(f) = supp(g) = supp(h) implies g(x) = h(x) for all x ∈ supp(f), and g(x) = h(x) = 0 otherwise. So g = h. This shows that the valuation algebra Ψ of densities is separative. Note that the lattice of groups δ(f) corresponds exactly to the lattice of support sets supp(f) of densities in R^s.

The inverse of a density f is evidently defined as

f⁻¹(x) = 1/f(x) if x ∈ supp(f), and f⁻¹(x) = 0 otherwise.

Now, if t ⊆ d(f) = s, then

f · (π_t(f))⁻¹(x) = f(x) / ∫_{−∞}^{+∞} f(x_t, x_{s−t}) dx_{s−t}

represents a conditional density for the variables x_{s−t} given x_t. This is no longer a density in Ψ, but an element of Ψ⁰. Clearly, the marginal of a conditional density is defined for r such that t ≤ r ≤ s by

π_r(f · (π_t(f))⁻¹)(y) = ∫_{−∞}^{+∞} f(y, x_{s−r}) dx_{s−r} / ∫_{−∞}^{+∞} f(x_t, x_{s−t}) dx_{s−t} = π_r(f) · (π_t(f))⁻¹(y)

for y ∈ R^r. For r ⊂ t the projection is no longer defined. This is an important illustration of partial projection (or marginalization, in terms of probability theory). We shall come back to this in a more general setting in Section 6. Note that the function e(x) = 1 for x ∈ supp(f), e(x) = 0 otherwise, is the unit of the group δ(f). If supp(f) has finite measure, then this unit belongs to Ψ, otherwise only to Ψ⁰. The minimal element of the lattice of groups δ(f) in domain R^s is given by the densities whose support equals R^s. The functions e_s(x) = 1 for all x ∈ R^s are the neutral elements in Ψ⁰, but they are not densities and do not belong to Ψ. This should illustrate how separative valuation algebras work in a concrete case. Another example of a separative valuation algebra is given by multivariate belief functions (Kohlas, 2003a).
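The division of a density by its marginal can be checked numerically. The following sketch is an illustration only; the grid, the particular Gaussian density, and all identifiers are assumptions of this example and not part of the theory. It discretises a two-dimensional density, divides by the marginal of the first variable, and verifies that each slice of the resulting conditional integrates to one.

```python
import numpy as np

# discretise the correlated Gaussian density f(u, v) ~ exp(-(u^2 - u v + v^2)/2)
grid = np.linspace(-8.0, 8.0, 801)
dx = grid[1] - grid[0]
u, v = np.meshgrid(grid, grid, indexing="ij")
f = np.exp(-0.5 * (u**2 - u * v + v**2))

# marginal pi_t(f): integrate the second variable out (Riemann sum)
marginal = f.sum(axis=1) * dx

# f · (pi_t(f))^{-1}: divide by the marginal where it is positive
conditional = np.where(marginal[:, None] > 0, f / marginal[:, None], 0.0)

# each slice of the conditional integrates to 1: a conditional density
row_mass = conditional.sum(axis=1) * dx
# the numeric marginal also matches the analytic one, sqrt(2*pi) e^{-3u^2/8}
analytic = np.sqrt(2.0 * np.pi) * np.exp(-3.0 * grid**2 / 8.0)
print(np.allclose(row_mass, 1.0), np.allclose(marginal, analytic, atol=1e-6))
```

The support of this density is all of R², so the `np.where` guard is vacuous here; it matters for densities with restricted support, where the inverse is 0 outside supp(f).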

5.4 Computing with Division

In Section 4 it was shown how the projection problem can be solved by local computation, a technique which avoids treating operations on pieces of information on big domains, which would often be computationally intractable. The method is based on Markov trees: it is assumed that the terms ψ_v of the factorisation φ = ∏_{v∈V} ψ_v are assigned to the nodes v of the Markov tree T = (V,E) and that their domains correspond to the labeling λ of the Markov tree (T, λ), such that d(ψ_v) = λ(v). This approach applies to generalised information algebras. In the case of a valuation algebra (Ψ,D; ≤, d, ·, π) local computation simplifies somewhat (see Section 4.1). In particular, in each step of the algorithm only projections and combinations within the domains λ(v) of the Markov tree are used. This is the locality of the computational scheme. The collect algorithm was designed to compute a single projection π_{λ(v)}(φ). In case several or all projections π_{λ(v)}(φ) for all v ∈ V have to be computed, instead of repeating the collect algorithm for every node, a more efficient method based on message passing, which avoids redundant computations, has been proposed. We call this way of organizing the computation the Shenoy-Shafer architecture or SS-architecture, named after its inventors (Shenoy & Shafer, 1990a). In the SS-architecture, all messages between neighboring nodes must be cached in order to compute the required projections at the end. If the valuation algebra allows for division, then this caching can be avoided in at least two different ways. This has been realised in the case of probability potentials, which are regular algebras, in (Jensen et al., 1990). That this can be generalised for valuation algebras in a multivariate setting has been noted in (Lauritzen & Jensen, 1997), and it has been worked out more formally in the framework of regular and separative valuation algebras in (Kohlas, 2003a). It is easily possible to generalize the corresponding computational architectures to the more general case of valuation algebras (Ψ,D; ≤, d, ·, π), where the domains form arbitrary lattices (D; ≤). This will be shown in this section.

We consider a separative valuation algebra (Ψ,D; ≤, d, ·, π). The problem is to compute the projections π_v(φ) for all nodes v of a Markov tree (T, λ), with T = (V,E), and where

φ = ∏_{v∈V} ψ_v,   ψ_v ∈ Ψ,   d(ψ_v) = λ(v) for all v ∈ V.

First, we summarize the main results relating to the collect algorithm and the SS-architecture from Section 4 for further reference. For the collect algorithm, we assume that a node v is selected as root and that the nodes u in V are numbered in such a way that the number of a node u on the path from a node w to the root v is larger than the one of w. If node u is the i-th node in this numbering, define x_i = λ(u) and ψ_i = ψ_u. Then x_1, …, x_n form a construction sequence (see Section 4.2). For simplicity's sake, we shall identify the nodes of the Markov tree with their domains x_i. The index of the neighbor of node x_i on the path towards the root x_n is denoted by b(i), and we have i < b(i) for i = 1, …, n − 1. Recall that such a construction sequence determines a hypertree (see Section 4.2). In terms of such a construction sequence, the collect algorithm is defined as follows: define ψ^1_j = ψ_j for j = 1, …, n and for i = 1, …, n − 1 compute

ψ^{i+1}_{b(i)} = ψ^i_{b(i)} · π_{x_i ∧ x_{b(i)}}(ψ^i_i),   ψ^{i+1}_j = ψ^i_j for j = i+1, …, n, j ≠ b(i).

At the end, we have ψ^n_n = π_{x_n}(φ). Let further T_i denote the subtree of T consisting of all nodes x_j such that j ≤ i and whose paths to x_n lead through x_i. This is still a Markov tree (see Section 2.3). Therefore, it follows that

ψ^i_i = π_{x_i}(∏_{x_j ∈ T_i} ψ_j).   (5.13)
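The collect recursion can be traced on a toy instance with probability potentials, where combination is pointwise multiplication and projection is summation. The two-node tree, the binary variables and all identifiers below are assumptions of this illustration, not part of the formal development.

```python
import numpy as np

# Markov tree with nodes x1 = {A, B} and x2 = {B, C}, root x2, so b(1) = 2
# and the separator is x1 ∧ x2 = {B}; all variables are binary.
rng = np.random.default_rng(0)
psi1 = rng.random((2, 2))   # psi1(A, B), content of node x1
psi2 = rng.random((2, 2))   # psi2(B, C), content of node x2

# collect step i = 1: project psi1 to the separator {B} and combine the
# message into the root's content, as in the recursion above
message = psi1.sum(axis=0)            # pi_B(psi1), a potential on {B}
psi2_after = psi2 * message[:, None]  # psi2 · pi_B(psi1)

# brute-force check: phi(A,B,C) = psi1(A,B) psi2(B,C), projected to {B,C}
phi = psi1[:, :, None] * psi2[None, :, :]
print(np.allclose(psi2_after, phi.sum(axis=0)))  # root holds pi_{x2}(phi)
```

The point of the algorithm is of course that the three-dimensional array `phi` is never built in practice; it appears here only to verify the result of the local steps.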

The messages in the SS-architecture are defined as follows for two neighboring nodes i and j:

μ_{i→j} = π_{x_i ∧ x_j}(ψ_i · ∏_{x_k ∈ ne(x_i), k ≠ j} μ_{k→i})   (5.14)

where ne(x_i) is the set of all neighbors of node x_i in the tree T. We then have finally

π_{x_i}(φ) = ψ_i · ∏_{x_j ∈ ne(x_i)} μ_{j→i}.   (5.15)
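The messages (5.14) and the projections (5.15) can be computed explicitly on a small chain of probability potentials. The three-node chain and all identifiers below are assumptions of this illustration.

```python
import numpy as np

# Shenoy-Shafer message passing on the chain x1={A,B} - x2={B,C} - x3={C,D},
# with binary variables and random potentials.
rng = np.random.default_rng(1)
p1 = rng.random((2, 2))   # p1(A, B)
p2 = rng.random((2, 2))   # p2(B, C)
p3 = rng.random((2, 2))   # p3(C, D)

# messages (5.14) over the separators {B} and {C}
mu_1to2 = p1.sum(axis=0)                       # pi_B(p1)
mu_3to2 = p3.sum(axis=1)                       # pi_C(p3)
mu_2to1 = (p2 * mu_3to2[None, :]).sum(axis=1)  # pi_B(p2 · mu_3to2)
mu_2to3 = (p2 * mu_1to2[:, None]).sum(axis=0)  # pi_C(p2 · mu_1to2)

# (5.15): each node content combined with all incoming messages
proj1 = p1 * mu_2to1[None, :]
proj2 = p2 * mu_1to2[:, None] * mu_3to2[None, :]
proj3 = p3 * mu_2to3[:, None]

# brute-force check against phi(A,B,C,D) = p1 p2 p3
phi = p1[:, :, None, None] * p2[None, :, :, None] * p3[None, None, :, :]
print(np.allclose(proj1, phi.sum(axis=(2, 3))),
      np.allclose(proj2, phi.sum(axis=(0, 3))),
      np.allclose(proj3, phi.sum(axis=(0, 1))))
```

Note that all four messages must be kept, which is precisely the caching the division-based schemes below avoid.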

Note that with these messages, we have in the collect algorithm

ψ^i_i = ψ_i · ∏_{x_j ∈ pa(x_i)} μ_{j→i},   (5.16)

where pa(x_i) is the set of all neighbors x_j of x_i with j < i, hence before x_i on the path to the root x_n. With these messages, a two-phase scheme, with first a collect phase in the order of the construction sequence and then a distribute phase in the inverse order of the construction sequence, yields the projections of φ to all nodes x_i of the Markov tree. As mentioned, this presupposes that the messages are cached as they are computed. This summarizes what we need in the following. There are two ways to change this collect/distribute scheme. In the first scheme, collect is executed as usual, with the exception that at step i in

node x_i not ψ^i_i is stored, but the message μ_{i→b(i)} is divided out, so that

η_i = ψ^i_i · (μ_{i→b(i)})⁻¹ = ψ_i · ∏_{x_j ∈ pa(x_i)} μ_{j→i} · (μ_{i→b(i)})⁻¹

remains in the store of node x_i. As before, at the end of collect, we have in node x_n the projection χ_n = π_{x_n}(φ). In the distribute phase, again in the inverse order of the construction sequence, each node x_i receives a message from its neighbor x_{b(i)}, namely π_{x_i∧x_{b(i)}}(χ_{b(i)}), if node x_{b(i)} stores the valuation χ_{b(i)}. This message is combined in node x_i with its stored value η_i, to get

χ_i = η_i · π_{x_i∧x_{b(i)}}(χ_{b(i)}).   (5.17)

This goes on for i = n − 1,..., 1. At the end of distribute each node xi contains the projection of φ to xi, as the next theorem asserts.

Theorem 40 With the recursive definitions (5.17) above,

χi = πxi (φ) for i = 1, . . . , n.

Proof. The claim of the theorem holds for i = n. We proceed by induction and assume that the claim holds for all j > i. Now, b(i) > i, so that χ_{b(i)} = π_{x_{b(i)}}(φ). Using (5.15), we have

χ_{b(i)} = ψ_{b(i)} · ∏_{x_j ∈ ne(x_{b(i)})} μ_{j→b(i)}.

So, the message sent to x_i in step i is, using the combination axiom and (5.14),

π_{x_i∧x_{b(i)}}(χ_{b(i)})
= π_{x_i∧x_{b(i)}}(ψ_{b(i)} · ∏_{x_j ∈ ne(x_{b(i)})} μ_{j→b(i)})
= π_{x_i∧x_{b(i)}}(ψ_{b(i)} · ∏_{x_j ∈ ne(x_{b(i)}), j ≠ i} μ_{j→b(i)}) · μ_{i→b(i)}
= μ_{b(i)→i} · μ_{i→b(i)}.

So, we obtain

χ_i = η_i · π_{x_i∧x_{b(i)}}(χ_{b(i)})
= ψ_i · ∏_{x_j ∈ pa(x_i)} μ_{j→i} · (μ_{i→b(i)})⁻¹ · μ_{b(i)→i} · μ_{i→b(i)}
= ψ_i · ∏_{x_j ∈ ne(x_i)} μ_{j→i} · f_{μ_{i→b(i)}}
= π_{x_i}(φ) · f_{μ_{i→b(i)}}.

Note that (Theorem 38, item 1)

μ_{i→b(i)} = π_{x_i∧x_{b(i)}}(∏_{x_j ∈ T_i} ψ_j),

which implies

δ(μ_{i→b(i)}) = δ(π_{x_i∧x_{b(i)}}(∏_{x_j ∈ T_i} ψ_j)) ≤ δ(π_{x_i}(φ)).

This proves that χ_i = π_{x_i}(φ), which concludes the induction. □

In a second adaptation of the collect/distribute scheme, a new node is introduced for every edge in the Markov tree. So, if there is an edge linking node x_i with node x_j, a new node is introduced between these two nodes. It is assigned the domain x_i ∧ x_j, and we use this label as a designation of the node. These nodes are called separators. During the collect phase, at step i, the message μ_{i→b(i)} is stored in the separator x_i ∧ x_{b(i)} before it is sent to node x_{b(i)}. As before, at the end of collect, the valuations ψ^i_i are stored in the nodes x_i, and ψ^n_n = π_{x_n}(φ) in node x_n. The distribute phase proceeds again in the inverse order of the construction sequence. At step i = n, …, 2, the nodes x_j for j > i have stored valuations

χj, and in particular χn = πxn (φ). Node xb(i) sends as before the message

π_{x_i∧x_{b(i)}}(χ_{b(i)}), but not to node x_i, rather to the separator x_i ∧ x_{b(i)}. There, this message is divided by the stored value μ_{i→b(i)},

π_{x_i∧x_{b(i)}}(χ_{b(i)}) · (μ_{i→b(i)})⁻¹,   (5.18)

and sent to the node x_i, where it is combined with the actual content ψ^i_i:

χ_i = ψ^i_i · π_{x_i∧x_{b(i)}}(χ_{b(i)}) · (μ_{i→b(i)})⁻¹.   (5.19)

The separator stores χ_{x_i∧x_{b(i)}} = π_{x_i∧x_{b(i)}}(χ_{b(i)}). As in the first case, at the end each node contains the projection of φ to its domain, and this holds even for the separators.

Theorem 41 With the recursive definitions (5.18) and (5.19) above,

χi = πxi (φ) for i = 1, . . . , n.

Proof. The proof proceeds again by induction over the steps of the distribute phase. For i = n, we have χn = πxn (φ). Assume that χj = πxj (φ) for j > i and consider step i where node xb(i) sends its message to the node xi passing by the separator xi ∧ xb(i). The new value in the separator becomes

χ_{x_i∧x_{b(i)}} = π_{x_i∧x_{b(i)}}(φ). The message passed to node x_i is then π_{x_i∧x_{b(i)}}(φ) · (μ_{i→b(i)})⁻¹. At node x_i, from the collect phase we have the content ψ^i_i, which is given by (5.16), and so

χ_i = ψ_i · ∏_{x_j ∈ pa(x_i)} μ_{j→i} · π_{x_i∧x_{b(i)}}(φ) · (μ_{i→b(i)})⁻¹.

In the proof of Theorem 40 we have seen that

π_{x_i∧x_{b(i)}}(φ) = μ_{b(i)→i} · μ_{i→b(i)}.

Therefore, we obtain finally

χ_i = ψ_i · ∏_{x_j ∈ ne(x_i)} μ_{j→i} · f_{μ_{i→b(i)}}.

As in the proof of Theorem 40 we conclude from this that χ_i = π_{x_i}(φ), and this completes the induction. □

As a complement, we remark that at the beginning of the first scheme, we have the valuations ψ_v in the nodes v of the Markov tree, and φ is the combination of the node contents. At each step i of collect, the content of node x_i is divided by the message μ_{i→b(i)}, whereas this message is combined into the content of node x_{b(i)}. So the net effect on the combination of the node contents is μ_{i→b(i)} · (μ_{i→b(i)})⁻¹ = f_{μ_{i→b(i)}}. But this means that the combination of the node contents remains equal to φ. During distribute, at each step a message π_{x_i∧x_{b(i)}}(φ) is combined into node x_i. So at the end the combination of all these elements is added to φ, whereas the node contents are π_{x_i}(φ). Therefore, at the end of distribute, we have

φ · ∏_{i=1}^{n−1} π_{x_i∧x_{b(i)}}(φ) = ∏_{i=1}^{n} π_{x_i}(φ).   (5.20)

The same result also holds at the end of the distribute phase in the second computational scheme (Kohlas, 2003a). This result will find another interpretation in Section 6.1. For an extension of these results in the multivariate setting, we refer to (Kohlas, 2003a); the results presented there generalise easily to the more general setting here.
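The first division-based scheme, and the identity (5.20), can both be checked on the two-node example with probability potentials. The tree, the positivity assumption (which makes inverses plain reciprocals) and all identifiers are assumptions of this sketch.

```python
import numpy as np

# First division scheme on the tree x1={A,B} - x2={B,C}, root x2.
rng = np.random.default_rng(2)
p1 = rng.random((2, 2)) + 0.1   # psi1(A,B), kept strictly positive
p2 = rng.random((2, 2)) + 0.1   # psi2(B,C)

# collect: send mu_{1->2} = pi_B(psi1) and divide it out of node x1's store
mu = p1.sum(axis=0)
store1 = p1 / mu[None, :]       # eta_1 = psi1 · (mu_{1->2})^{-1}
chi2 = p2 * mu[:, None]         # root now holds pi_{x2}(phi)

# distribute: node x1 combines the projection of chi2 to the separator {B}
chi1 = store1 * chi2.sum(axis=1)[None, :]

phi = p1[:, :, None] * p2[None, :, :]
ok_nodes = (np.allclose(chi1, phi.sum(axis=2)) and
            np.allclose(chi2, phi.sum(axis=0)))

# identity (5.20): phi · pi_B(phi) = pi_{x1}(phi) · pi_{x2}(phi)
sep = phi.sum(axis=(0, 2))      # pi_B(phi), the separator projection
lhs = phi * sep[None, :, None]
rhs = phi.sum(axis=2)[:, :, None] * phi.sum(axis=0)[None, :, :]
print(ok_nodes and np.allclose(lhs, rhs))
```

No message cache is needed here: the division during collect pre-pays for the distribute message, which is the computational point of the scheme.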

5.5 Separative Semiring Valuations

Separative valuation algebras may be derived from semirings with division (Kohlas & Wilson, 2008), at least in the multivariate setting. However, this extends also to the more general frame of families of compatible frames. We consider a semiring (A; +, ×), where the multiplicative semigroup (A; ×) is separative. As in Section 3.3 we consider a family of compatible frames (F, R) with the conditional independence relation Θ_1⊥Θ_2|Λ. Let Φ_Θ denote the set of A-valuations φ : Θ → A on frame Θ, and let

Φ = ⋃_{Θ∈F} Φ_Θ

be the set of all valuations. Then (Φ, F; ≤, d, ·, π) is a valuation algebra if (F; ≤) is a lattice and Θ⊥Λ|Θ ∧ Λ holds for all pairs of frames (see Section 3.3). Both (Φ; ·) and all the (Φ_Θ; ·) are semigroups. We claim that if A is separative, then so are these semigroups. More precisely, the following theorem holds.

Theorem 42 If in the semiring (A; +, ×) the multiplicative semigroup (A; ×) is separative, regular or cancellative respectively, then so are the semigroups (Φ_Θ; ·) for all Θ ∈ F. In the first two cases the semigroup (Φ; ·) is also separative or regular respectively.

Proof. Assume first that (A; ×) is separative and consider for A-valuations φ, ψ on a frame Θ ∈ F the equation φ · φ = ψ · ψ = φ · ψ. Then for all θ ∈ Θ we have

φ(θ) × φ(θ) = ψ(θ) × ψ(θ) = φ(θ) × ψ(θ).

Separativity of (A; ×) implies then φ(θ) = ψ(θ) for all θ ∈ Θ, hence φ = ψ. But this means that the semigroup (Φ_Θ; ·) is separative.

Similarly, if (A; ×) is regular and φ an A-valuation on the frame Θ, then for all θ ∈ Θ there is an element χ of A such that

φ(θ) = φ(θ) × φ(θ) × χ.

Call this element χ(θ). Then χ is an A-valuation on Θ and φ = φ · φ · χ. So the semigroup (Φ_Θ; ·) is regular. If (A; ×) is cancellative, and φ, ψ and χ are three A-valuations on a frame Θ such that φ · χ = ψ · χ, then for all θ ∈ Θ we have φ(θ) × χ(θ) = ψ(θ) × χ(θ). From cancellativity of the semigroup A it follows that φ(θ) = ψ(θ), and this for all θ ∈ Θ, therefore φ = ψ. This means that the semigroup (Φ_Θ; ·) is cancellative. Now if φ · φ = ψ · ψ = φ · ψ holds for A-valuations φ, ψ, then φ and ψ must be A-valuations on the same frame, and separativity of (Φ; ·) follows from separativity of the subsemigroup of the A-valuations on that frame. Similarly, regularity of all subsemigroups (Φ_Θ; ·) implies regularity of (Φ; ·). For cancellativity however this does not work, since for example φ · 1_Θ = φ · 1_Λ as long as Θ, Λ ≤ d(φ), but 1_Θ ≠ 1_Λ. □

According to this theorem, we know from Section 5.1 that each semigroup (Φ_Θ; ·) is embedded into a semigroup (Φ⁰_Θ; ·), which is a union of groups. We come back to the details of this embedding below. However, as we have seen in local computation with valuation algebras, we have to divide out a projection of some valuation φ to some domain smaller than d(φ), say Λ ≤ d(φ). That is, we need an inverse of π_Λ(φ) such that

φ · (π_Λ(φ))⁻¹ · π_Λ(φ) = φ.

For this it is not sufficient that the semiring A be separative; an additional condition is needed, which is given in the following definition.

Definition 10 Separative Semiring: A semiring (A; +, ×) is called separative, if

1. The semigroup (A; ×) is separative,

2. for all a, b ∈ A, [a]γ ≤ [a+b]γ, where [a]γ denotes the equivalence class of the semigroup congruence induced by separativity (see Section 5.1).

The second condition is a strong version of positivity of the semiring. In fact, if A has a null element, then this condition implies [0]_γ ≤ [a]_γ. We should also mention that in semiring theory, separativity usually refers to the additive semigroup (A; +) and not to the multiplicative one. The reason is that in this way a step from a semiring towards a ring is made, see for instance (Golan, 1999), but our purpose here is different.

We now define an equivalence relation in Φ by φ ≡_γ ψ if

1. d(φ) = d(ψ),

2. for all θ ∈ d(φ), φ(θ) ≡_γ ψ(θ),

where in the second condition ≡_γ denotes the semigroup congruence induced in (A; ×) by separativity. This is clearly an equivalence relation in Φ. Assume φ ≡_γ ψ and consider any A-valuation χ ∈ Φ. Then d(φ · χ) = d(ψ · χ). Further, let d(φ) = Θ and d(χ) = Λ. Then, since ≡_γ is a semigroup congruence in (A; ×), we have for all θ ∈ Θ ∨ Λ,

φ · χ(θ) = φ(t_Θ(θ)) × χ(t_Λ(θ)) ≡_γ ψ(t_Θ(θ)) × χ(t_Λ(θ)) = ψ · χ(θ).

Therefore, ≡_γ is a combination congruence in Φ. Further, for an A-valuation φ with d(φ) = Θ and all θ ∈ Θ, we have

π_Λ(φ) · φ(θ) = (∑_{θ' ∈ t_Θ(t_Λ(θ))} φ(θ')) × φ(θ).

Since θ ∈ t_Θ(t_Λ(θ)), the second condition in the definition of a separative semiring implies that π_Λ(φ) · φ(θ) ≡_γ φ(θ), and therefore π_Λ(φ) · φ ≡_γ φ. But this means that the valuation algebra (Φ, F; ≤, d, ·, π) is separative. This is summarized in the following theorem.

Theorem 43 If (A; +, ×) is a separative semiring, then the associated semiring valuation algebra (Φ, F; ≤, d, ·, π) is separative.

We examine now the structure of this separative semiring valuation algebra. Consider an A-valuation φ on frame Θ and assign to all θ ∈ Θ the group in A⁰ to which φ(θ) belongs,

sp_φ(θ) = γ(φ(θ)).

This defines a map sp_φ from the domain d(φ) = Θ into the lattice of the groups which decompose A⁰. Further, we define

γ(φ) = {g : Θ → A⁰ : ∀θ ∈ Θ, g(θ) ∈ γ(φ(θ))},

the set of maps from Θ into A⁰ such that g(θ) belongs to the group of the value φ(θ) in A⁰. The set γ(φ) is essentially identical to the Cartesian product γ(φ(θ_1)) × … × γ(φ(θ_n)) if we assume that Θ = {θ_1, …, θ_n}. We have clearly γ(φ) = γ(ψ) if φ ≡_γ ψ. It follows now that γ(φ) is a group: the unit element f_φ of this group is given by f_φ(θ) = f_{φ(θ)}, that is, for all θ ∈ Θ the unit element of the group γ(φ(θ)) in A⁰. The inverse of g ∈ γ(φ) is given by g⁻¹(θ) = (g(θ))⁻¹. In particular, the inverse of the A-valuation φ is defined by φ⁻¹(θ) = (φ(θ))⁻¹. The equivalence classes [φ]_γ form a subsemigroup of Φ_Θ, since ≡_γ is a combination congruence. This semigroup is embedded in γ(φ) using the embeddings of [φ(θ)]_γ in γ(φ(θ)). Define

Φ⁰_Θ = ⋃_{φ∈Φ_Θ} γ(φ).

The partial order between the groups γ(a) in A⁰ induces a partial order between the groups γ(φ) in Φ⁰_Θ by γ(φ) ≤ γ(ψ) if γ(φ(θ)) ≤ γ(ψ(θ)) for all θ ∈ Θ. In fact, we have γ(φ) ≤ γ(ψ) if γ(φ · ψ) = γ(ψ). If d(ψ) = Θ, this means that d(φ · ψ) = Θ, since the group γ(ψ) is contained in Φ⁰_Θ. So, γ(φ) ≤ γ(ψ) if γ((φ · ψ)(θ)) = γ(ψ(θ)), or γ(φ(t_Λ(θ))) ≤ γ(ψ(θ)), for all θ ∈ Θ, where Λ = d(φ). This order is also reflected by the partial order of the idempotents f_φ of the groups γ(φ): f_φ ≤ f_ψ if f_φ · f_ψ = f_ψ. And this holds if and only if γ(φ) ≤ γ(ψ). In fact, this partial order is a join-semilattice, where γ(φ · ψ) = γ(φ) ∨ γ(ψ) or f_{φ·ψ} = f_φ · f_ψ = f_φ ∨ f_ψ. Note that γ(φ) ≤ γ(ψ) implies also f_φ · ψ = ψ. The union of the groups γ(φ),

Φ⁰ = ⋃_{φ∈Φ} γ(φ),

is a semigroup. In fact, if g_1 ∈ γ(φ) and g_2 ∈ γ(ψ), then g_1 · g_2 is defined for θ ∈ Θ ∨ Λ, with d(φ) = Θ and d(ψ) = Λ, by

g_1 · g_2(θ) = g_1(t_Θ(θ)) × g_2(t_Λ(θ)).

The order above between the groups γ(φ) extends to all of Φ⁰: γ(φ) ≤ γ(ψ) if γ(φ · ψ) = γ(ψ). In the same way, we also have f_φ ≤ f_ψ if f_φ · f_ψ = f_ψ. As we have seen above, π_Λ(φ) · φ ≡_γ φ. This implies

γ(π_Λ(φ)) ≤ γ(φ).

And this implies φ · (π_Λ(φ))⁻¹ · π_Λ(φ) = φ · f_{π_Λ(φ)} = φ. This is sufficient for local computation with division (Section 5.4). Here follow a few examples of separative semiring valuation algebras.

Example: Probability Potentials: The arithmetic semiring of nonnegative real numbers is separative, even regular. In fact, it decomposes into the trivial group {0} and the multiplicative group R⁺ of positive numbers. The order between these two groups is {0} ≤ R⁺. We refer to Section 5.2 for the associated regular valuation algebra of probability potentials.

Example: Nonnegative Semirings: In many cases a semiring (A; +, ×) with null element decomposes into the two semigroups {0} and A − {0}. If the latter is cancellative, then the semiring is separative. It is then embedded into the semiring which is the union of the group {0} and the group G into which A − {0} is embedded. In fact, {0} ∪ G is a semigroup, since we may define 0 × g = 0 for all g ∈ G. The partial order between the groups is {0} ≤ G. Further, since 0 + b = b for all b ∈ A, the second condition in the definition of a separative semiring is satisfied. The arithmetic semiring inducing probability potentials belongs to this class of separative semirings.

Example: t-Norms, Spohn Potentials: The product and the minimum t-norms are regular and therefore induce regular valuation algebras. The Łukasiewicz and the drastic t-norms are not separative, and so their valuation algebras do not allow for division. The (N; min, +) semiring is separative, even cancellative. Division in this case is essentially subtraction. Its A-valuations are called Spohn potentials (Spohn, 1988; Kohlas, 2003a) and they form a separative valuation algebra.
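In the (N; min, +) semiring, semiring addition is min (used for projection) and semiring multiplication is + (used for combination), so "division" is ordinary subtraction. The following sketch illustrates this for Spohn potentials; the small random tables and all identifiers are assumptions of the illustration.

```python
import numpy as np

# Spohn potentials: valuations into the semiring (N; min, +).
# Combination is pointwise +, projection is min, division is subtraction.
rng = np.random.default_rng(3)
s = rng.integers(0, 5, size=(2, 2))   # s(A, B), degrees of disbelief
t = rng.integers(0, 5, size=(2, 2))   # t(B, C)

# combination s · t on {A, B, C} and projection (min) to {A, B}
st = s[:, :, None] + t[None, :, :]
proj = st.min(axis=2)

# division: phi · (pi(phi))^{-1} becomes subtraction of the projection
cond = st - proj[:, :, None]          # a "conditional" Spohn potential

# combining back recovers phi, and the conditional is normalized: its
# projection (min over C) is the zero potential
print(np.array_equal(cond + proj[:, :, None], st), cond.min(axis=2).max() == 0)
```

Because the semiring is cancellative, this subtraction needs no support-set caveats, unlike the density example of Section 5.3.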

6 Conditionals

6.1 Conditionals and factorisations

In probability theory, conditioning and conditional distributions play an important role, as do independence and conditional independence. This applies equally to modeling and to computational purposes. We claim that these concepts are not limited to probability, but concern information more generally. Therefore, we examine generalisations in this section in the realm of valuation algebras, more particularly separative ones, because conditioning presupposes a concept of division.

We assume throughout this section that (Ψ,D; ≤, d, ·, π) is a separative valuation algebra, as defined in Section 5.3 above. This includes the more special case of regular or cancellative algebras. In addition, we assume that the valuation algebra has a unit element satisfying axioms S3 or S*', or else that the extended combination axiom S5' is satisfied (Section 3.2). This guarantees that partial projection in Ψ⁰ is well defined. We first define the concept of a conditional, following the pattern of probability distributions. The results presented in this section were already exposed in (Kohlas, 2003a), however only in the multivariate setting. The results extend easily to the more general case of lattices (D; ≤), in particular to distributive ones.

Definition 11 Conditional: Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra. For an element φ ∈ Ψ, and y ≤ x ≤ d(φ),

φ_{x|y} = π_x(φ) · (π_y(φ))⁻¹   (6.1)

is called a conditional of φ for x given y.

Note that a conditional φ_{x|y} does, in general, not belong to Ψ, but only to Ψ⁰, except if the valuation algebra is regular. In that case a conditional φ_{x|y} can be projected to all domains z ≤ x, whereas in general projections exist only for y ≤ z ≤ x; see Section 5.3 for the extension of projection from Ψ to Ψ⁰. Further, it follows from the definition that

π_x(φ) = φ_{x|y} · π_y(φ)   (6.2)

since δ(π_y(φ)) = δ(π_y(π_x(φ))) ≤ δ(π_x(φ)) (Theorem 38, item 1). For this reason conditionals φ_{x|y} were also called continuers of φ from y to x in (Shafer, 1996), or we say that φ_{x|y} continues φ from y to x. We have also δ(φ_{x|y}) = δ(π_x(φ) · (π_y(φ))⁻¹) = δ(π_x(φ)) ∨ δ(π_y(φ)) = δ(π_x(φ)), so that

δ(π_x(φ)), δ(π_y(φ)) ≤ δ(φ_{x|y}).   (6.3)

This is important for later developments. Further, we remark that a conditional φ_{x|y} has projections to all z such that y ≤ z ≤ x, since d(φ_{x|y}) = x and δ(π_y(φ)) ≤ δ(π_x(φ)). When we consider a conditional φ_{x|y}, we always implicitly assume that y ≤ x ≤ d(φ). Here follow a few elementary results about conditionals.

Lemma 7 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra. Then the following holds:

1. π_y(φ_{x|y}) = f_{π_y(φ)}.

2. If z ≤ y ≤ x, then φ_{x|z} = φ_{x|y} · φ_{y|z}.

3. If z ≤ y ≤ x, then π_y(φ_{x|z}) = φ_{y|z}.

4. If d(ψ) = y ≤ x, then (π_x(φ) · ψ)_{x|y} = φ_{x|y} · f_ψ.

5. If z ≤ y ≤ x and z ≤ w ≤ x, then π_w(φ_{x|y} · φ_{y|z}) = φ_{w|z}.

Proof. 1.) By definition π_y(φ_{x|y}) = π_y(π_x(φ) · (π_y(φ))⁻¹) = π_y(φ) · (π_y(φ))⁻¹ = f_{π_y(φ)}, by transitivity of projection.

2.) Again by definition φ_{x|y} · φ_{y|z} = π_x(φ) · (π_y(φ))⁻¹ · π_y(φ) · (π_z(φ))⁻¹ = π_x(φ) · f_{π_y(φ)} · (π_z(φ))⁻¹ = φ_{x|z}, since δ(π_y(φ)) ≤ δ(π_x(φ)).

3.) Here we have (using Lemma 4) π_y(φ_{x|z}) = π_y(π_x(φ) · (π_z(φ))⁻¹) = π_y(π_x(φ)) · (π_z(φ))⁻¹ = π_y(φ) · (π_z(φ))⁻¹ = φ_{y|z}.

4.) On the one hand, we have πx(φ) · ψ = φx|y · πy(φ) · ψ since φx|y continues φ from y to x. On the other hand we have also πx(φ) · ψ = (πx(φ) · ψ)x|y · πy(πx(φ)·ψ) = (πx(φ)·ψ)x|y ·πy(φ)·ψ, again using the continuation property of a conditional. This leads to the equation

φ_{x|y} · (π_y(φ) · ψ) = (π_x(φ) · ψ)_{x|y} · (π_y(φ) · ψ). Multiplying both sides with the appropriate inverse, we get

φ_{x|y} · f_{π_y(φ)·ψ} = (π_x(φ) · ψ)_{x|y} · f_{π_y(φ)·ψ}.

By (6.3) we have δ(πy(φ) · ψ) ≤ δ((πx(φ) · ψ)x|y). Then it follows that

(π_x(φ) · ψ)_{x|y} = φ_{x|y} · f_{π_y(φ)·ψ} = φ_{x|y} · f_{π_y(φ)} · f_ψ = φ_{x|y} · f_ψ, where the last equality follows from (6.3).

5.) By item 2 proved above, φ_{x|y} · φ_{y|z} = φ_{x|z}, and since z ≤ w ≤ x it follows from item 3 above that π_w(φ_{x|z}) = φ_{w|z}. □

We now study factorisations of elements of Ψ over conditionally independent domains. We do not restrict the factors to be in Ψ; they can be in Ψ⁰, for example conditionals. More precisely, we define conditional independence with respect to an element of Ψ as follows.

Definition 12 For an element φ ∈ Ψ and x, y, z ∈ D we say that x and y are conditionally independent given z relative to φ, and we write x⊥φy|z, if

1. x⊥Ly|z,

2. there exist elements ψ_1, ψ_2 ∈ Ψ⁰ with domains d(ψ_1) = x ∨ z, d(ψ_2) = y ∨ z such that

πx∨y∨z(φ) = ψ1 · ψ2. (6.4)

The following lemma gives a few elementary results about this concept, which helps to better understand its meaning.

Lemma 8 Assume x⊥φy|z, assume π_{x∨y∨z}(φ) = ψ_1 · ψ_2, and that projection to z exists for ψ_1 and ψ_2. Then

πx∨z(φ) = ψ1 · πz(ψ2),

πy∨z(φ) = πz(ψ1) · ψ2,

πz(φ) = πz(ψ1) · πz(ψ2).

Proof. The first two results follow from the (extended) combination axiom and x⊥Ly|z. Applying the combination axiom once more to π_z(φ) = π_z(π_{x∨z}(φ)) (transitivity) gives the third result. □

So x⊥φy|z means that the only part of ψ_2 relevant for the extraction of the information in φ relative to domain x ∨ z is the part relative to z, and the only parts of ψ_1 and ψ_2 relevant for the extraction of the part of φ relative to z are the parts relating to domain z. Note also that in regular valuation algebras x⊥φy|z always means that φ factors into two factors with domains x ∨ z and y ∨ z belonging to Ψ, so that no caveat concerning projection is needed. For our later needs, we extend the definition of conditionals to

φ_{x|y} = π_{x∨y}(φ) · (π_y(φ))⁻¹   (6.5)

for x, y ≤ d(φ), so that φ_{x|y} = φ_{x∨y|y}, or more generally φ_{x|y} = φ_{x'|y} as long as x ∨ y = x' ∨ y. In particular, Lemma 7 remains valid with this extended definition.

There are several equivalent statements for the relation x⊥φy|z. In the following theorem, the extended definition of conditionals (6.5) is used.

Theorem 44 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra. As- sume x⊥Ly|z and φ ∈ Ψ such that d(φ) ≥ x ∨ y ∨ z. Then the following statements are all equivalent:

1. π_{x∨y∨z}(φ) = ψ_1 · ψ_2 with ψ_1, ψ_2 ∈ Ψ⁰ and d(ψ_1) = x ∨ z, d(ψ_2) = y ∨ z.

2. π_{x∨y∨z}(φ) = φ_{x|z} · φ_{y|z} · π_z(φ).

3. φ_{x∨y|z} = φ_{x|z} · φ_{y|z}.

4. φ_{x∨y|z} = χ_1 · χ_2 with χ_1, χ_2 ∈ Ψ⁰ such that d(χ_1) = x ∨ z, d(χ_2) = y ∨ z and for χ_1, χ_2 the projection to z exists.

5. π_{x∨y∨z}(φ) · π_z(φ) = π_{x∨z}(φ) · π_{y∨z}(φ).

6. π_{x∨y∨z}(φ) = φ_{x|z} · π_{y∨z}(φ).

7. φ_{x|y∨z} = φ_{x|z} · f_{π_{y∨z}(φ)}.

8. φ_{x|y∨z} = χ · f_{π_{y∨z}(φ)} with χ ∈ Ψ⁰ such that d(χ) = x ∨ z, and for χ the projection to z exists.

Proof. (1) ⇒ (2): We have by definition and using assumption (1), φ_{x|z} = π_{x∨z}(φ) · (π_z(φ))⁻¹ = ψ_1 · π_z(ψ_2) · (π_z(φ))⁻¹, see Lemma 8. Similarly, we obtain φ_{y|z} = ψ_2 · π_z(ψ_1) · (π_z(φ))⁻¹. From this, it follows that

φ_{x|z} · φ_{y|z} · π_z(φ) = ψ_1 · ψ_2 · π_z(ψ_1) · π_z(ψ_2) · π_z(φ) · (π_z(φ))⁻¹ · (π_z(φ))⁻¹.

By Lemma 8, π_z(ψ_1) · π_z(ψ_2) = π_z(φ). So,

φ_{x|z} · φ_{y|z} · π_z(φ) = ψ_1 · ψ_2 · f_{π_z(φ)} = π_{x∨y∨z}(φ) · f_{π_z(φ)}.

But δ(π_z(φ)) ≤ δ(π_{x∨y∨z}(φ)), so indeed φ_{x|z} · φ_{y|z} · π_z(φ) = π_{x∨y∨z}(φ).

(2) ⇒ (3): Since φx∨y|z is a continuer of φ from z to x∨y ∨z, we have, using (2), the equation φx∨y|z · πz(φ) = φx|z · φy|z · πz(φ). If we multiply both sides by the inverse of πz(φ), we obtain, using (6.3), φx∨y|z = φx|z · φy|z.

(3) ⇒ (4): Take χ1 = φx|z and χ2 = φy|z.

(4) ⇒ (5): Using (4) and the continuer φx∨y|z of φ from z to x ∨ y ∨ z we obtain πx∨y∨z(φ) · πz(φ) = φx∨y|z · πz(φ) · πz(φ) = (χ1 · πz(φ)) · (χ2 · πz(φ)). Further, again using (4),

πx∨z(φ) = πx∨z(πx∨y∨z(φ)) = πx∨z(φx∨y|z · πz(φ))

= πx∨z(χ1 · χ2 · πz(φ)) = χ1 · πz(χ2 · πz(φ))

= χ_1 · π_z(χ_2) · π_z(φ).   (6.6)

In the same way we get also πy∨z(φ) = χ2 ·πz(χ1)·πz(φ). Next, from Lemma

7 we have πz(φx∨y|z) = fπz(φ) and finally also πz(φx∨y|z) = πz(χ1 · χ2) = πz(πx∨z(χ1 · χ2)) = πz(χ1 · πz(χ2)) = πz(χ1) · πz(χ2) (Theorem 39), since the projections πz(χ1) and πz(χ2) do exist. All this together leads to

πx∨z(φ) · πy∨z(φ)

= (χ1 · πz(φ)) · (χ2 · πz(φ)) · (πz(χ1)) · πz(χ2))

= (χ1 · πz(φ)) · (χ2 · πz(φ)) · πz(φx∨y|z)

= (χ1 · πz(φ)) · (χ2 · πz(φ)) · fπz(φ) = (χ1 · πz(φ)) · (χ2 · πz(φ))

= π_{x∨y∨z}(φ) · π_z(φ), since δ(π_z(φ)) ≤ δ(φ_{x∨y|z}) = δ(χ_1 · χ_2). This is then (5).

(5) ⇒ (6): Starting with (5), we have

π_{x∨y∨z}(φ) · π_z(φ) = π_{x∨z}(φ) · π_{y∨z}(φ) = φ_{x|z} · π_z(φ) · π_{y∨z}(φ), using φ_{x|z} as a continuer of φ from z to x ∨ z. From this we conclude π_{x∨y∨z}(φ) = φ_{x|z} · π_{y∨z}(φ), by multiplying with the inverse of π_z(φ) and using δ(π_z(φ)) ≤ δ(π_{y∨z}(φ)) ≤ δ(π_{x∨y∨z}(φ)).

(6) ⇒ (7): Taking φx|y∨z as a continuer of φ from y ∨ z to x ∨ y ∨ z, we have πx∨y∨z(φ) = φx|y∨z · πy∨z(φ). This, together with (6), leads to the equation φx|y∨z · πy∨z(φ) = φx|z · πy∨z(φ). Multiplying both sides by the inverse of

πy∨z(φ) yields φx|y∨z = φx|z ·fπy∨z(φ), since δ(πy∨z(φ)) ≤ δ(φx|y∨z), see (6.3).

(7) ⇒ (8): Take χ = φx|z.

(8) ⇒ (1): By (8) we have, using φx|y∨z as a continuer of φ from y ∨ z to x ∨ y ∨ z,

π_{x∨y∨z}(φ) = φ_{x|y∨z} · π_{y∨z}(φ) = χ · f_{π_{y∨z}(φ)} · π_{y∨z}(φ) = χ · π_{y∨z}(φ).

Take ψ_1 = χ and ψ_2 = π_{y∨z}(φ). Then ψ_1, ψ_2 ∈ Ψ⁰ and d(ψ_1) = x ∨ z, d(ψ_2) = y ∨ z. □

As an illustration, we show that the statements of Theorem 44 are well-known results for probability distributions.

Example Probability Potentials and Densities: Consider probability poten- tials and densities in the familiar multivariate setting. Let s, t and r be 6 CONDITIONALS 101 disjoint sets of variables; this implies in the subset lattice of variables that s⊥Lt|r. Consider a probability potential p(x, y, z) and a density function s t f(x, y, z) on the domain s ∪ t ∪ r, and in the case of densities x ∈ R , y ∈ R , r z ∈ R . To fix ideas, we assume both p and f scaled, i.e. a proper discrete probability distribution and a proper density function. To better highlight the results we change notation for this example slightly, we denote projection of p and f to a subset s of variables by p↓s and f ↓s, that is for instance

p↓s(x) = ∑_{y,z} p(x, y, z),   f↓s(x) = ∫_{−∞}^{+∞} f(x, y, z) dy dz.

Now, by item (5) of Theorem 44, we have

p(x, y, z) = p↓s∪r(x, z) p↓t∪r(y, z) / p↓r(z).   (6.7)

By definition of discrete conditional probability distributions, it follows that

p(x, y, z) = (p↓s∪r)_{x|z}(x, z) p↓t∪r(y, z) = (p↓s∪r)_{x|z}(x, z) (p↓t∪r)_{y|z}(y, z) p↓r(z).

This illustrates items (6) and (2) of the theorem. Further, from the above

p_{s∪t|r}(x, y, z) = p(x, y, z) / p↓r(z) = (p↓s∪r)_{x|z}(x, z) (p↓t∪r)_{y|z}(y, z).

This is item (3) of the theorem and also (4), since for instance (p↓s∪r)_{x|z} belongs, in the regular algebra of probability potentials, to Ψ, although it is not a (normalized) probability table. Finally, we have, using (6.7),

p_{s|t∪r}(x, y, z) = p(x, y, z) / p↓t∪r(y, z) = (p↓s∪r)_{x|z}(x, z) e_{supp(p↓t∪r)}(y, z),

where

e_{supp(p↓t∪r)}(y, z) = 1 for (y, z) ∈ supp(p↓t∪r), and 0 otherwise,   (6.8)

is the idempotent on the support of p↓t∪r. This is (7), and also (8). Note that, although conditionals like p_{s|t∪r} and the idempotent e_{supp(p↓t∪r)} are no longer normalized probability distributions, they still belong to the valuation algebra Ψ of probability potentials, since its elements are not assumed to be normalized and the algebra is regular. The corresponding results for densities are similar, but there the conditionals no longer belong to Ψ but only to Ψ⁰. All in all this shows that, as far as conditionals are concerned, the difference between regular and separative algebras is small.
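The factorisation (6.7) can be verified numerically on a potential built to satisfy the conditional independence s⊥φt|r. The construction below (all arrays and names are assumptions of the illustration) builds p as a product of a factor on s ∪ r and a factor on t ∪ r, then checks that p is recovered from its three marginals.

```python
import numpy as np

# build p(x, y, z) = a(x, z) b(y, z), so s and t are independent given r
rng = np.random.default_rng(4)
a = rng.random((2, 3))        # a(x, z), x over s, z over r
b = rng.random((2, 3))        # b(y, z), y over t
p = a[:, None, :] * b[None, :, :]
p /= p.sum()                  # scale to a proper probability distribution

p_sr = p.sum(axis=1)          # p↓{s∪r}(x, z)
p_tr = p.sum(axis=0)          # p↓{t∪r}(y, z)
p_r = p.sum(axis=(0, 1))      # p↓r(z)

# (6.7): p(x, y, z) = p↓{s∪r}(x, z) p↓{t∪r}(y, z) / p↓r(z)
recon = p_sr[:, None, :] * p_tr[None, :, :] / p_r[None, None, :]
print(np.allclose(recon, p))
```

For a potential without this factorisation the reconstruction would in general fail, which is exactly the content of Theorem 44, item (5): the identity characterises the conditional independence.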

It is known that the relation x⊥φy|z between sets of variables for multivariate valuation algebras forms a semigraphoid (Kohlas, 2003a), a concept related to separoids (Dawid, 2001). In our more general setting, if (D; ≤) is a modular lattice, then this relation is a separoid. In order to prove this, we need the following lemma (compare also Lemma 4 in Section 3.2).

Lemma 9 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra, where (D; ≤ ) is a modular lattice. Then, if φ ∈ Ψ, ψ ∈ Ψ0 with d(φ) = x, d(ψ) = y, such that projection to y ∧ z exists, and x ≤ z ≤ x ∨ y,

πz(φ · ψ) = φ · πy∧z(ψ). (6.9)

Proof. Note that δ(ψ) ≥ δ(πy∧z(ψ)), hence δ(φ · ψ) ≥ δ(φ · πy∧z(ψ)). Further, we have z = z ∧ (x ∨ y) = (z ∧ y) ∨ x, by modularity of the lattice (D; ≤). Then, by the extended combination axiom, we obtain

πz(φ · ψ) = πz((φ · fφ·πy∧z(ψ)) · ψ)

= (φ · fφ·πy∧z(ψ)) · πy∧z(ψ) = φ · πy∧z(ψ).

ut

Theorem 45 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra, where (D; ≤) is a modular lattice. Then the relation x⊥φy|z is a separoid.

Proof. We tacitly assume that the domains considered in this proof are less than or equal to d(φ). By Theorem 3 in Section 2.1, the relation x⊥Ly|z is a separoid if and only if the lattice (D; ≤) is modular.

C1) x⊥φy|y: Since φx|y is a continuation of φ from y to x ∨ y, we have

πx∨y(φ) = φx|y · πy(φ). Then, πx∨y(φ) = φx|y · fπy(φ) · πy(φ). Now, fπy(φ) = πy(φ) · (πy(φ))⁻¹ = φy|y, hence we have πx∨y(φ) = φx|y · φy|y · πy(φ). According to Theorem 44, item 2, it follows that x⊥φy|y.

C2) follows directly from the definition.

C3) From x⊥φy|z it follows by Theorem 44 that πx∨y∨z(φ)·πz(φ) = πx∨z(φ)· πy∨z(φ). Assume w ≤ y. Then, x ∨ z ≤ x ∨ w ∨ z ≤ x ∨ y ∨ z, and therefore, by Lemma 9

πx∨w∨z(πx∨z(φ) · πy∨z(φ)) = πx∨z(φ) · π(y∨z)∧(x∨w∨z)(φ). But (y∨z)∧(x∨w∨z) = (y∨z)∧((x∨z)∨(w∨z)) = ((y∨z)∧(x∨z))∨(w∨z) = z ∨ (w ∨ z) = w ∨ z, by modularity and the definition of x⊥Ly|z. It follows therefore that

πx∨w∨z(πx∨z(φ) · πy∨z(φ)) = πx∨z(φ) · πw∨z(φ). We also have (Lemma 9), since z ≤ x ∨ w ∨ z ≤ x ∨ y ∨ z,

πx∨w∨z(πx∨y∨z(φ) · πz(φ)) = πx∨w∨z(φ) · πz(φ).

So, we obtain finally πx∨w∨z(φ) · πz(φ) = πx∨z(φ) · πw∨z(φ), and this means that x⊥φw|z (Theorem 44). So condition C3 of a separoid holds. We recall that condition C4 follows from conditions C3, C5 and C6, see Section 2.1, so we are going to verify these two conditions.

C5) Since conditionals are continuers, we have πx∨y∨(z∨w)(φ) = φx|z∨w · πy∨(z∨w)(φ). Further, by C5, x⊥Ly|z and w ≤ y imply x⊥Ly|z ∨ w. From Theorem 44, item 6, it follows then that x⊥φy|z ∨ w and this shows that C5 is valid.

C6) The assumption x⊥φw|y ∨ z means by Theorem 44 that

πx∨w∨y∨z(φ) = φx|y∨z · φw|y∨z · πy∨z(φ). Using conditionals as continuers, we obtain

φw|y∨z · πy∨z(φ) = πw∨y∨z(φ) = φw∨y|z · πz(φ).

Further, x⊥φy|z implies, according to Theorem 44, φx|y∨z = φx|z · fπy∨z(φ). Introducing this above we find

πx∨y∨z∨w(φ) = φx|z · fπy∨z(φ) · φw|y∨z · πy∨z(φ) = φx|z · φw|y∨z · πy∨z(φ)

= φx|z · φw∨y|z · πz(φ). (6.10)

This identity means by Theorem 44 that x⊥φw ∨ y|z, and thus C6 is valid too. ut

In a multivariate setting, C7 holds too if φ is a positive valuation, see (Kohlas, 2003a). It does not seem that this is valid in the more general context of a general distributive lattice (D; ≤). Also, in the case of a general lattice, the relation x⊥φy|z seems not to be a quasi-separoid, the problem being condition C4, whose proof depends on Lemma 9. On the other hand, with the exception of condition C3 above, all other conditions are valid for a general lattice (D; ≤); their proofs do not depend on modularity.

To conclude this section, we are going to consider factorisations over several conditionally independent domains ⊥L{x1, . . . , xn}|z, in continuation of the discussion in Section 2.3. We first extend the concept of conditional independence relative to a valuation φ as given in Definition 12. We simplify, however, to the extent that we consider only factorisations of valuations φ with factors in Ψ.

Definition 13 For an element φ ∈ Ψ, and a finite set of domains x1, . . . , xn, z ∈ D, we say that the domains x1, . . . , xn are conditionally independent given z relative to φ, and we write ⊥φ{x1, . . . , xn}|z, if

1. ⊥L{x1, . . . , xn}|z,

2. there exist elements ψ1, . . . , ψn ∈ Ψ with domains d(ψi) = xi ∨ z for i = 1, . . . , n such that

πx1∨...∨xn∨z(φ) = ψ1 · ... · ψn. (6.11)

Theorem 10 in Section 2.3 generalizes partially.

Theorem 46 Let (Ψ,D; ≤, d, ·, π) be a valuation algebra (not necessarily separative) and ⊥φ{x1, . . . , xn}|z. Then:

1. if σ is a permutation of 1, . . . , n, then ⊥φ{xσ(1), . . . , xσ(n)}|z,

2. if J ⊆ {1, . . . , n}, then ⊥φ{xj : j ∈ J}|z,

3. ⊥φ{x1 ∨ x2, x3, . . . , xn}|z,

4. ⊥φ{x1 ∨ z, x2, . . . , xn}|z.

Proof. 1.) This is evident from the definition.

2.) By Theorem 10 we have ⊥L{xj : j ∈ J}|z. Assume that πx1∨...∨xn∨z(φ) = ψ1 · ... · ψn with d(ψi) = xi ∨ z. Assume ∅ ≠ J ⊂ {1, . . . , n}, otherwise the proposition becomes trivial. Let then y = ∨j∈J (xj ∨ z). Then we have

πy(φ) = πy(∏_{j∈J} ψj · ∏_{k∉J} ψk) = ∏_{j∈J} ψj · πz(∏_{k∉J} ψk),

since ⊥L{x1, . . . , xn}|z implies y ∧ (∨_{k∉J} (xk ∨ z)) = z. So, select an index h ∈ J; then

πy(φ) = (ψh · πz(∏_{k∉J} ψk)) · ∏_{j∈J, j≠h} ψj.

The first factor in this factorisation has domain xh ∨z, the other ones xj ∨z. So we see that indeed ⊥φ{xj : j ∈ J}|z holds.

3.) and 4.) are immediate consequences of the definition of ⊥φ{x1, . . . , xn}|z. ut

Lemma 8 generalizes as follows to this more general situation:

Lemma 10 Assume ⊥L{x1, . . . , xn}|z and φ = ψ1 · ... · ψn with ψi ∈ Ψ′, d(ψi) = xi ∨ z, and such that projection to z exists for all ψi and φ. Then,

πx1∨z(φ) = ψ1 · πz(ψ2) · ... · πz(ψn), πz(φ) = πz(ψ1) · ... · πz(ψn). (6.12)

Proof. Using the extended combination axiom, we have

πx1∨z(φ) = πx1∨z(ψ1 · ... · ψn) = ψ1 · πz(ψ2 · ... · ψn),

since from ⊥L{x1, . . . , xn}|z it follows that (x1 ∨ z) ∧ (x2 ∨ ... ∨ xn ∨ z) = z. It follows further that

πz(φ) = πz(ψ1) · πz(ψ2 · ... · ψn).

But from the last result, by induction, we obtain πz(ψ2 · ... · ψn) = πz(ψ2) · ... · πz(ψn). This proves then both formulae in (6.12). ut

Note that πxi∨z(φ) has for all i = 2, . . . , n a corresponding similar representation like the first formula in (6.12). Now we extend some results from Theorem 44 above.

Theorem 47 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra. Assume ⊥L{x1, . . . , xn}|z and φ ∈ Ψ such that d(φ) ≥ x1 ∨ ... ∨ xn ∨ z. Then the following statements are all equivalent:

1. πx1∨...∨xn∨z(φ) = ψ1 · ... · ψn with ψi ∈ Ψ′ and d(ψi) = xi ∨ z for i = 1, . . . , n.

2. πx1∨...∨xn∨z(φ) = φx1|z · ... · φxn|z · πz(φ).

3. φx1∨...∨xn|z = φx1|z · ... · φxn|z.

4. πx1∨...∨xn∨z(φ) · (πz(φ))ⁿ⁻¹ = πx1∨z(φ) · ... · πxn∨z(φ).

Proof. (1) ⇒ (2): The proof goes essentially as in Theorem 44. Using the definition of conditionals, φxi|z = πxi∨z(φ) · (πz(φ))⁻¹, we have

φx1|z · ... · φxn|z · πz(φ) = ∏_{i=1}^{n} πxi∨z(φ) · (πz(φ))⁻ⁿ · πz(φ).

Using Lemma 10, we get

φx1|z · ... · φxn|z · πz(φ) = ∏_{i=1}^{n} ψi · ∏_{i=1}^{n} (πz(ψi))ⁿ⁻¹ · (πz(φ))⁻ⁿ · πz(φ) = ∏_{i=1}^{n} ψi.

So, indeed πx1∨...∨xn∨z(φ) = φx1|z · ... · φxn|z · πz(φ).

(2) ⇒ (3): This follows since φx1∨...∨xn|z = πx1∨...∨xn∨z(φ) · (πz(φ))⁻¹. Introduce here (2) for the first term to get (3).

(3) ⇒ (4): We use the fact that conditionals are continuers to get

πx1∨...∨xn∨z(φ) = φx1∨...∨xn|z · πz(φ). Introduce here (3) for φx1∨...∨xn|z and then the definition of the conditionals φxi|z to obtain

πx1∨...∨xn∨z(φ) · (πz(φ))ⁿ⁻¹ = ∏_{i=1}^{n} πxi∨z(φ) · (πz(φ))⁻ⁿ · (πz(φ))ⁿ = ∏_{i=1}^{n} πxi∨z(φ),

since δ(πz(φ)) ≤ δ(πxi∨z(φ)).

(4) ⇒ (1): From (4), we have

πx1∨...∨xn∨z(φ) = ∏_{i=1}^{n} πxi∨z(φ) · (πz(φ))⁻ⁿ⁺¹.

So, take ψi = πxi∨z(φ) · (πz(φ))⁻¹ for i = 1, . . . , n − 1 and ψn = πxn∨z(φ). Then πx1∨...∨xn∨z(φ) = ψ1 · ... · ψn and ψi ∈ Ψ′ with d(ψi) = xi ∨ z. ut
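Item 4 of the theorem can be illustrated numerically for probability potentials. A minimal sketch, assuming three hypothetical binary variables that are conditionally independent given a binary z:

```python
import itertools

# Hypothetical potential on binary variables (x1, x2, x3, z), built so that
# x1, x2, x3 are conditionally independent given z.
pz = {0: 0.3, 1: 0.7}
cond = [
    {0: {0: 0.2, 1: 0.8}, 1: {0: 0.6, 1: 0.4}},    # p(x1 | z)
    {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}},    # p(x2 | z)
    {0: {0: 0.7, 1: 0.3}, 1: {0: 0.25, 1: 0.75}},  # p(x3 | z)
]
phi = {(x1, x2, x3, z): cond[0][z][x1] * cond[1][z][x2] * cond[2][z][x3] * pz[z]
       for x1, x2, x3, z in itertools.product((0, 1), repeat=4)}

def project(pot, keep):
    """Sum out all coordinates not listed in keep (the projection π)."""
    out = {}
    for conf, v in pot.items():
        key = tuple(conf[i] for i in keep)
        out[key] = out.get(key, 0.0) + v
    return out

proj_z = project(phi, (3,))
proj_xi_z = [project(phi, (i, 3)) for i in range(3)]

# Item 4 of Theorem 47 with n = 3:
# π_{x1∨x2∨x3∨z}(φ) · (πz(φ))² = π_{x1∨z}(φ) · π_{x2∨z}(φ) · π_{x3∨z}(φ).
for (x1, x2, x3, z), v in phi.items():
    lhs = v * proj_z[(z,)] ** 2
    rhs = proj_xi_z[0][(x1, z)] * proj_xi_z[1][(x2, z)] * proj_xi_z[2][(x3, z)]
    assert abs(lhs - rhs) < 1e-12
print("Theorem 47, item 4 verified for n = 3")
```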

As an application consider a factorisation over domains x1, . . . , xn and z such that ⊥L{x1, . . . , xn}|z,

φ = ψ1 · ... · ψn · ψ, d(ψi) = xi, d(ψ) = z.

Then we have

φ = (ψ1 · fπz(φ)) · ... · (ψn−1 · fπz(φ)) · (ψn · ψ),

Note that the factors here have domains xi ∨z. So we have ⊥φ{x1, . . . , xn}|z and we may apply Lemma 10 to obtain

πz(φ) = πz(ψ1 · fπz(φ)) · ... · πz(ψn−1 · fπz(φ)) · πz(ψn · ψ)

= πx1∧z(ψ1) · ... · πxn∧z(ψn) · ψ.

Similar results may also be obtained for factorisations over Markov trees. Let (T, λ) be a Markov tree with vertices V. Consider a factorisation

φ = ∏_{v∈V} ψv, ψv ∈ Ψ, d(ψv) = λ(v).

We refer to (5.20) in Section 5.4, for a result similar to item 4 of Theorem 47. If we consider as in Section 5.4 any construction sequence x1, . . . , xn of the Markov tree, then (5.20) may be rewritten

φ = ∏_{i=1}^{n−1} (πxi(φ) · (πxi∧xb(i)(φ))⁻¹) · πxn(φ) = ∏_{i=1}^{n−1} φxi|xi∧xb(i) · πxn(φ).

This concludes our discussion of conditionals and factorisation. The subject will be taken up from another angle in the next two sections.

6.2 Causal Models

This section is motivated by probability theory, where probability distributions, especially in applications, are often represented as products of conditional probability distributions. This is the case for instance with models defined on the basis of probability networks such as Bayesian networks, see for example (Cowell et al., 1999; Shafer, 1996). Usually, only discrete probability potentials are considered in these frameworks. But such models may be considered in much more general valuation algebras. We show that separative valuation algebras allow one to develop a theory much in parallel to discrete probability theory.

So, consider a separative valuation algebra (Ψ,D; ≤, d, ·, π) whose semigroup (Ψ; ·) is embedded, according to the discussions in Section 5.3, into a semigroup (Ψ′; ·) which is a union of disjoint groups δ(φ), for φ ∈ Ψ. In addition we assume the sufficient conditions for the existence of partial projections, i.e. either Axiom S3 or S3′, or alternatively Axiom S5 (see Section 5.3). In particular, there exist conditionals φx|y for any valuation φ ∈ Ψ with d(φ) ≥ x ∨ y, according to the previous Section 6.1. We recall that φx|y = φx′|y as long as x ∨ y = x′ ∨ y. The essential properties of conditionals as elements of Ψ′ are:

1. d(φx|y) = x ∨ y,

2. for y ≤ z ≤ x ∨ y, the projection πz(φx|y) = φz|y exists, and πy(φx|y) = fπy(φ).

An element of Ψ′ with these properties has been called a kernel in (Kohlas, 2003a). In contrast to kernels, the elements φ of Ψ are characterized by the fact that the projections πz(φ) exist for all z ≤ d(φ). The main result of this section is given by the following theorem:

Theorem 48 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra, where (D; ≤) is a modular lattice. If x⊥Ly|z, φ ∈ Ψ with d(φ) = x ∨ y, ψy|z a conditional such that δ(πz(ψ)) ≤ δ(πz(φ)), then φ · ψy|z belongs to Ψ.

Proof. We show that the projection πw(φ · ψy|z) exists for all w ≤ x ∨ y ∨ z. Assume first that x ∨ z ≤ w ≤ x ∨ y ∨ z. Then we have by Lemma 9, since the lattice (D; ≤) is modular,

πw(φ · ψy|z) = φ · πw∧(y∨z)(ψy|z) 6 CONDITIONALS 109

and the projection πw∧(y∨z)(ψy|z) exists, since w∧(y∨z) ≥ z. So, πw(φ·ψy|z) exists in this case. If w ≤ x ∨ z, then

πw(φ · ψy|z) = πw(πx∨z(φ · ψy|z)) = πw(φ · πz(ψy|z)) = πw(φ · fπz(ψ)) = πw(φ),

since x⊥Ly|z implies (x ∨ z) ∧ (y ∨ z) = z, which shows that πw(φ · ψy|z) exists also for all w ≤ x ∨ z. ut

This basic result presupposes that the lattice (D; ≤) is modular, and we retain this assumption throughout this and the next two sections. Theorem 48 generalizes as follows to combinations of several conditionals. Consider a valuation ψ^1 ∈ Ψ and a sequence of conditionals ψ^2_{h2|t2}, . . . , ψ^n_{hn|tn}. Define d(ψ^1) = x1 and xi = hi ∨ ti = d(ψ^i_{hi|ti}), and

φi = ψ^1 · ψ^2_{h2|t2} · ... · ψ^i_{hi|ti}.

If

xi ⊥L ∨_{j=1}^{i−1} xj | ti,   ti ≤ ∨_{j=1}^{i−1} xj,

and

δ(πti(ψ^i_{hi|ti})) ≤ δ(πti(φi−1))

for all i = 2, . . . , n, then ψ^1, ψ^2_{h2|t2}, . . . , ψ^n_{hn|tn} is called a construction sequence (Shafer, 1996).

Theorem 49 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra, where (D; ≤) is a modular lattice. Let ψ^1, ψ^2_{h2|t2}, . . . , ψ^n_{hn|tn} be a construction sequence. Then

φi = ψ^1 · ψ^2_{h2|t2} · ... · ψ^i_{hi|ti}

belongs to Ψ for all i = 1, . . . , n.

Proof. The proof goes by induction. The proposition holds trivially for i = 1. Suppose it holds for i − 1. Then it follows from Theorem 48 that it holds for i. ut

This shows that construction sequences provide a way to represent valuations of a separative valuation algebra as a combination of conditionals. Construction sequences based on conditionals have been introduced and studied in

(Shafer, 1996) in the multivariate setting and for discrete probability distributions. The results for this particular case generalise to the present more general case of separative valuation algebras with a modular lattice (D; ≤) of domains. So, the projection of a causal model to an initial segment of a construction sequence is obtained simply by ignoring the later factors of the sequence.

Theorem 50 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra, where (D; ≤) is a modular lattice. Let ψ^1, ψ^2_{h2|t2}, . . . , ψ^n_{hn|tn} be a construction sequence, and

φ = ψ^1 · ψ^2_{h2|t2} · ... · ψ^n_{hn|tn}.

Then, for i = 1, . . . , n,

πx1∨...∨xi(φ) = ψ^1 · ψ^2_{h2|t2} · ... · ψ^i_{hi|ti}. (6.13)

Proof. We prove this first for n = 2, where φ = ψ^1 · ψ^2_{h2|t2}, with t2 ≤ x1 = h1 ∨ t1, x2 = h2 ∨ t2, x1⊥Lx2|t2, and δ(πt2(ψ^2)) ≤ δ(πt2(ψ^1)). By Axiom S5 of a valuation algebra,

πx1(φ) = ψ^1 · πx1∧x2(ψ^2_{h2|t2}).

But from x1⊥Lx2|t2 we obtain x1 ∧ x2 = t2 and, then by Lemma 7

πx1(φ) = ψ^1 · πt2(ψ^2_{h2|t2}) = ψ^1 · fπt2(ψ^2).

Finally, δ(πt2(ψ^2)) ≤ δ(πt2(ψ^1)) ≤ δ(ψ^1) gives πx1(φ) = ψ^1. So, the claim holds for n = 2.

For n > 2, the proof proceeds by induction. Define φi = ψ^1 · ψ^2_{h2|t2} · ... · ψ^i_{hi|ti}. Then φ = φn−1 · ψ^n_{hn|tn}. By the properties of a construction sequence, the result above applies here, such that πx1∨...∨xn−1(φ) = φn−1. So, the claim holds for i = n − 1. Assume it holds for i. Then

πx1∨...∨xi−1(φ) = πx1∨...∨xi−1(πx1∨...∨xi(φ)) = πx1∨...∨xi−1(φi).

But φi = φi−1 · ψ^i_{hi|ti} and the result above for n = 2 applies again, such that indeed πx1∨...∨xi−1(φ) = φi−1. This completes the proof. ut

The computation of projections to other domains is in general much more difficult for causal models, especially if the domain of φ becomes large. If the domains xi of the construction sequence form a Markov tree and the valuation algebra is regular, then local computation (Chapter 4), and even local computation with division (Section 5.4), may be applied, since all projections needed exist and are in Ψ. This is typically the case for probability potentials. In this case we may even add additional information (valuations) on the nodes of the Markov tree and compute "posterior" valuations for the causal model. If the valuation algebra is only separative, then attention must be paid to the existence of the required projections. This is discussed in (Kohlas, 2003a) in the framework of multivariate models, but the results extend to the present more general case. The problem has a simple solution in case the tail ti of any factor ψ^i_{hi|ti} of the construction sequence is contained in the domain xs(i) of some previous factor, s(i) < i. Then, if the lattice (D; ≤) is modular, the conditional independence relation x⊥Ly|z is a separoid. Since xs(i) ≤ x1 ∨ ... ∨ xi−1, it follows from item C5 of a separoid that

xi ⊥L ∨_{j=1}^{i−1} xj | xs(i).

This induces a tree T with vertices V = {1, . . . , n} and edges E = {(s(i), i) : i = 2, . . . , n}. We note that in this tree the path from node 1 to i forms the initial segment of a construction sequence for φ. In this new numeration node i becomes number j and s(i) = j − 1. Then Theorem 50 implies (in the new numeration)

πxj(φ) = πxj(φj) = πxj∧(x1∨...∨xj−1)(φj−1) · ψ^j_{hj|tj} = πxj∧xj−1(πxj−1(φj−1)) · ψ^j_{hj|tj}.

This is a local computation scheme which traverses the tree from the root node 1 and computes the projections of φ to any node, using only valuations on the domains xi of the tree. A particular case of this scheme arises if s(i) = i − 1 for all i; in probability theory this corresponds to a Markov chain. There would be much more to say about causal models, but for the time being we content ourselves with these few results.
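For the Markov chain case, the scheme can be sketched numerically. The three-variable chain below is hypothetical; the marginal of the last variable is computed once globally, from the full joint, and once locally, by forward projection along the chain, without ever building a table over more than two variables:

```python
import itertools

# A hypothetical Markov chain v1 -> v2 -> v3 over binary variables;
# the chain is a construction sequence with s(i) = i - 1.
p_v1 = {0: 0.6, 1: 0.4}
p_v2_v1 = {0: {0: 0.3, 1: 0.7}, 1: {0: 0.9, 1: 0.1}}  # p(v2 | v1)
p_v3_v2 = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}}  # p(v3 | v2)

# Global computation: build the full joint, then marginalize to v3.
joint = {(a, b, c): p_v1[a] * p_v2_v1[a][b] * p_v3_v2[b][c]
         for a, b, c in itertools.product((0, 1), repeat=3)}
global_v3 = {c: sum(v for (a, b, cc), v in joint.items() if cc == c)
             for c in (0, 1)}

# Local computation along the chain: project forward node by node.
m_v1 = p_v1
m_v2 = {b: sum(m_v1[a] * p_v2_v1[a][b] for a in (0, 1)) for b in (0, 1)}
m_v3 = {c: sum(m_v2[b] * p_v3_v2[b][c] for b in (0, 1)) for c in (0, 1)}

for c in (0, 1):
    assert abs(m_v3[c] - global_v3[c]) < 1e-12
print("local and global marginals of v3 agree:", m_v3)
```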

6.3 Probabilistic Argumentation

In a Bayesian network model, one usually defines a joint probability distribution of a set of variables by a product of conditional distributions, as a causal model. Each conditional distribution is described by a probability potential, since the valuation algebra of probability potentials is regular, and the joint probability distribution is thus given by a combination of probability potentials. Since the probability potentials form a valuation algebra, local computation of marginals of the joint probability distribution in a join tree is possible, as originally shown in (Lauritzen & Spiegelhalter, 1988), see also (Shafer, 1996). This represents the classical application of the valuation algebra of probability potentials to computational problems in discrete probability theory, especially Bayesian networks. Not every combination of probability potentials, however, represents a joint probability distribution as a product of conditional distributions. Therefore we present in this section another interpretation of probability potentials and their operations of combination and projection. This is probabilistic argumentation.

As always for probability potentials, we consider a f.c.f. where (F, ≤) is a lattice and Θ⊥Λ|Θ ∧ Λ holds for all pairs of frames Θ and Λ from F. So, the following applies in particular also to the case of multivariate models, which is usually considered for probability potentials, but holds in our more general framework. Consider a frame Θ and a finite set of elements Ω together with a probability distribution q on Ω, such that

0 ≤ q(ω) for all ω ∈ Ω,   Σ_{ω∈Ω} q(ω) = 1.

Further let X : Ω → Θ be a mapping from Ω into Θ. This can be interpreted as a probabilistic argumentation structure for Θ: The elements of Ω are considered as possible assumptions, and one of them must be the valid one. Which one, however, is unknown; only their probabilities q(ω) of being valid are known.
If an assumption ω holds, then X(ω) ∈ Θ would be the correct answer to the question represented by Θ. So, we formally define:

Definition 14 Probabilistic Argumentation Structure: Let (Ω, q) be a discrete probability space, where Ω is a finite set and q(ω) a discrete probability distribution on Ω; let further X : Ω → Θ be a mapping from Ω into Θ ∈ F. Then the quadruple (Ω, q, X, Θ) is called a probabilistic argumentation structure for Θ.

Clearly, any probabilistic argumentation structure (Ω, q, X, Θ) for Θ induces a probability distribution p on Θ, because X is simply a random variable with values in Θ, and

p(θ) = Σ_{ω: X(ω)=θ} q(ω).
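A minimal numerical sketch of the induced distribution, with a hypothetical assumption space and mapping:

```python
# A hypothetical probabilistic argumentation structure (Ω, q, X, Θ):
# assumptions Ω with probabilities q, and a map X from assumptions to
# answers in the frame Θ.
Omega = ["a", "b", "c", "d"]
q = {"a": 0.25, "b": 0.25, "c": 0.3, "d": 0.2}
Theta = ["rain", "sun"]
X = {"a": "rain", "b": "rain", "c": "sun", "d": "sun"}

# Induced probability distribution p(θ) = Σ_{ω : X(ω) = θ} q(ω).
p = {theta: sum(q[w] for w in Omega if X[w] == theta) for theta in Theta}
print(p)
```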

On the other hand, a probability distribution p on Θ can be produced by many different probabilistic argumentation structures. The simplest one is (Θ, p, id, Θ), where Θ itself is the assumption space and X is the identity map. This is called the canonical probabilistic argumentation structure for a probability distribution on Θ.

We are now going to define two operations on probabilistic argumentation structures, which will be reflected by the corresponding operations of combination and projection for probability potentials. For this purpose we recall that a probability potential is simply a nonnegative function φ : Θ → R⁺ ∪ {0}, assigning each element θ ∈ Θ a nonnegative real value. It is not a probability distribution, but it is proportional to one:

p(θ) = φ(θ) / Σ_{θ′∈Θ} φ(θ′). (6.14)

This holds for all probability potentials, except the null-potential 0Θ(θ) = 0 for all θ. We introduce between probability potentials on the same frame Θ the relation φ ≡ ψ if there is a constant k > 0 such that

φ(θ) = k · ψ(θ), ∀θ ∈ Θ.

This is clearly an equivalence relation in Φ. Each equivalence class [φ], except the class [0], contains exactly one probability distribution, which represents the class. The relation is furthermore a congruence: All elements of a class have the same label. If φ ≡ ψ with d(φ) = d(ψ) = Θ and χ is a potential with d(χ) = Λ, then φ(θ) = k · ψ(θ) for some k > 0, hence for all ζ ∈ Θ ∨ Λ,

φ · χ(ζ) = φ(tΘ(ζ)) · χ(tΛ(ζ)) = k · ψ(tΘ(ζ)) · χ(tΛ(ζ)) = k · ψ · χ(ζ),

hence φ · χ ≡ ψ · χ. Similarly, if φ ≡ ψ and Λ ≤ d(φ) = Θ, then for all λ ∈ Λ,

πΛ(φ)(λ) = Σ_{θ∈tΘ(λ)} φ(θ) = k · Σ_{θ∈tΘ(λ)} ψ(θ) = k · πΛ(ψ)(λ).

So, it follows πΛ(φ) ≡ πΛ(ψ).

Therefore we may consider the quotient algebra Φ/≡ of classes [φ] with operations defined as follows:

1. Labeling: d([φ]) = d(φ), 6 CONDITIONALS 114

2. Combination: [φ] · [ψ] = [φ · ψ],

3. Projection: πΛ([φ]) = [πΛ(φ)].

Then the algebra (Φ/≡, F; ≤, d, ·, π) is still a valuation algebra; it is in fact a homomorphic image of Φ under the mapping φ 7→ [φ].

Now we return to probabilistic argumentation structures. Let (Ω1, q1, X1, Θ) and (Ω2, q2, X2, Λ) be two probabilistic argumentation structures for Θ and Λ respectively. In order to combine these two structures, we consider combined assumptions (ω1, ω2) with ω1 ∈ Ω1 and ω2 ∈ Ω2. Assuming the assumptions to be stochastically independent, such a combined assumption pair has probability q(ω1, ω2) = q1(ω1)q2(ω2). Now, assumption ω1 implies element X1(ω1) in Θ and assumption ω2 implies X2(ω2) in Λ. But these two implications may be non-compatible. Let τ1 and τ2 be the refinings of Θ and Λ to Θ ∨ Λ. Then τ1(X1(ω1)) ∩ τ2(X2(ω2)) is either empty or equal to a one-element set {ζ}. In the first case the two implications are non-compatible. Such a pair (ω1, ω2) of assumptions cannot be valid, since the assumptions ω1 and ω2 are contradictory. Define then

Ω = {(ω1, ω2) : τ1(X1(ω1)) ∩ τ2(X2(ω2)) ≠ ∅}

to be the set of compatible pairs of assumptions. Since non-compatible assumptions can be excluded, we normalise the probabilities of combined assumptions:

q(ω1, ω2) = q1(ω1)q2(ω2) / Σ_{(ω1,ω2)∈Ω} q1(ω1)q2(ω2). (6.15)

In other words, we condition the prior probabilities of combined assumptions (ω1, ω2) on the event Ω that they are compatible. Further, we define the mapping X : Ω → Θ ∨ Λ by

X(ω1, ω2) = ζ, if ζ ∈ τ1(X1(ω1)) ∩ τ2(X2(ω2)) = {ζ}. (6.16)

This works if Ω is not empty, which is not guaranteed. If Ω is empty, the two probabilistic argumentation structures are contradictory. This case has to be treated separately, see below.

This then leads to a new probabilistic argumentation structure (Ω, q, X, Θ ∨ Λ) for Θ ∨ Λ. We call this the combined structure of (Ω1, q1, X1, Θ) and (Ω2, q2, X2, Λ). Of course, this combination operation of probabilistic argumentation structures may be extended to three or more structures.
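The combination of two structures, and the fact (proved below) that it corresponds to combination of the induced potentials, can be sketched numerically. For simplicity the two hypothetical structures below refer to the same frame, so the refinings are identities and compatibility reduces to equality of the implied answers:

```python
# Combining two hypothetical probabilistic argumentation structures on the
# same frame Θ (so a pair of assumptions is compatible iff both point to
# the same element of Θ).
Theta = ["rain", "sun"]
Omega1, q1 = ["a", "b"], {"a": 0.8, "b": 0.2}
X1 = {"a": "rain", "b": "sun"}
Omega2, q2 = ["u", "v", "w"], {"u": 0.5, "v": 0.3, "w": 0.2}
X2 = {"u": "rain", "v": "rain", "w": "sun"}

# Compatible pairs and the normalisation (6.15).
Omega = [(w1, w2) for w1 in Omega1 for w2 in Omega2 if X1[w1] == X2[w2]]
norm = sum(q1[w1] * q2[w2] for w1, w2 in Omega)
q = {(w1, w2): q1[w1] * q2[w2] / norm for w1, w2 in Omega}
X = {(w1, w2): X1[w1] for w1, w2 in Omega}  # the mapping (6.16)

# Induced distribution on the combined structure ...
p = {t: sum(q[pair] for pair in Omega if X[pair] == t) for t in Theta}

# ... equals the normalised pointwise product of the two induced
# potentials, i.e. [p] = [p1] · [p2].
p1 = {t: sum(q1[w] for w in Omega1 if X1[w] == t) for t in Theta}
p2 = {t: sum(q2[w] for w in Omega2 if X2[w] == t) for t in Theta}
prod = {t: p1[t] * p2[t] for t in Theta}
z = sum(prod.values())
for t in Theta:
    assert abs(p[t] - prod[t] / z) < 1e-12
print("combination of structures matches combination of potentials")
```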

Let p1 be the probability distribution induced by (Ω1, q1,X1, Θ) on Θ, p2 the one induced by (Ω2, q2,X2, Λ) on Λ and p the probability distribution induced by (Ω, q, X, Θ ∨ Λ) on Θ ∨ Λ. We claim that

[p] = [p1] · [p2], (6.17)

that is, combination of probabilistic argumentation structures corresponds to the combination of the associated probability potentials. In fact, for ζ ∈ Θ ∨ Λ, we have

p(ζ) = Σ_{(ω1,ω2): X(ω1,ω2)=ζ} q(ω1, ω2).

But we have X(ω1, ω2) = ζ exactly if X1(ω1) ∈ tΘ(ζ) and X2(ω2) ∈ tΛ(ζ). So, with k denoting the normalisation constant from (6.15), we obtain

p(ζ) = k · Σ_{(ω1,ω2): X1(ω1)∈tΘ(ζ), X2(ω2)∈tΛ(ζ)} q1(ω1)q2(ω2) = k · Σ_{ω1: X1(ω1)∈tΘ(ζ)} q1(ω1) · Σ_{ω2: X2(ω2)∈tΛ(ζ)} q2(ω2).

Further, we have

Σ_{ω1: X1(ω1)∈tΘ(ζ)} q1(ω1) = Σ_{θ∈tΘ(ζ)} Σ_{ω1: X1(ω1)=θ} q1(ω1) = Σ_{θ∈tΘ(ζ)} p1(θ).

Similarly, we obtain

Σ_{ω2: X2(ω2)∈tΛ(ζ)} q2(ω2) = Σ_{λ∈tΛ(ζ)} p2(λ).

So, we have finally

p(ζ) = k · Σ_{θ∈tΘ(ζ)} p1(θ) · Σ_{λ∈tΛ(ζ)} p2(λ) = k · p1 · p2(ζ).

This proves the claim above. The situation is particularly simple if we take for each of p1 and p2 its canonical probabilistic argumentation structure; this gives directly the combination of probability potentials.

Secondly, if (Ω, q, X, Θ) is a probabilistic argumentation structure for Θ, and Λ ≤ Θ a coarsening of the frame Θ, then, if we define the mapping Y : Ω → Λ by

Y(ω) = tΛ(X(ω)),

the quadruple (Ω, q, Y, Λ) is a probabilistic argumentation structure for Λ. It is called the coarsening of the structure (Ω, q, X, Θ) to Λ. Let p be the probability distribution on Θ induced by (Ω, q, X, Θ). Then we claim that the probability distribution p′ induced by (Ω, q, Y, Λ) on Λ is given by

p′ = πΛ(p).

Note that a projection, or, more appropriately in the context of probability, a marginalisation, of a probability distribution is still a probability distribution. In fact, for an atom λ ∈ Λ, we have

p′(λ) = Σ_{ω: Y(ω)=λ} q(ω) = Σ_{θ: tΛ(θ)=λ} Σ_{ω: X(ω)=θ} q(ω) = Σ_{θ: tΛ(θ)=λ} p(θ) = πΛ(p)(λ).
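The correspondence between coarsening and marginalisation can be sketched numerically with a hypothetical four-element frame coarsened to two elements:

```python
# Coarsening a hypothetical probabilistic argumentation structure: the
# frame Θ distinguishes four weather states, the coarser frame Λ only
# two, via the map t_Λ below.
Omega = [1, 2, 3, 4]
q = {1: 0.1, 2: 0.4, 3: 0.3, 4: 0.2}
X = {1: "drizzle", 2: "storm", 3: "sun", 4: "clouds"}
t_Lambda = {"drizzle": "wet", "storm": "wet", "sun": "dry", "clouds": "dry"}

# Coarsened structure (Ω, q, Y, Λ) with Y(ω) = t_Λ(X(ω)) and its induced p′.
Y = {w: t_Lambda[X[w]] for w in Omega}
p_prime = {l: sum(q[w] for w in Omega if Y[w] == l) for l in ("wet", "dry")}

# The same distribution, obtained by marginalising the induced p on Θ.
p = {t: sum(q[w] for w in Omega if X[w] == t) for t in set(X.values())}
marg = {l: sum(v for t, v in p.items() if t_Lambda[t] == l)
        for l in ("wet", "dry")}

for l in ("wet", "dry"):
    assert abs(p_prime[l] - marg[l]) < 1e-12
print("coarsening corresponds to marginalisation:", p_prime)
```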

This means also that [p′] = πΛ([p]). Therefore, coarsening of a probabilistic argumentation structure corresponds to projection or marginalization of the corresponding induced probability distribution. Again this is very direct for canonical probabilistic argumentation structures of probability distributions.

So, this gives a certain semantical interpretation of probability potentials and their valuation algebra, different from the usual one as a representation of probabilistic or Bayesian networks, causal models, etc. Moreover, it is not necessary to restrict the implication of an assumption to a precise answer, an atom in a frame Θ. One may as well assume that an assumption ω implies only a partial answer, that is, a subset X(ω) of Θ. Such a more general probabilistic argumentation structure has been called a hint (Kohlas & Monney, 1995). It is strongly related to Dempster-Shafer theory and belief functions (Dempster, 1967; Shafer, 1976), as has been shown in (Kohlas & Monney, 1995). Associated with this are generalised information algebras, not only valuation algebras (Kohlas & Monney, 1995). In a logical setting this is an instance of probabilistic argumentation, see (Laskey & Lehner, 1989; Anrig et al., 1997; Kohlas, 1997; Haenni et al., 2000a; Haenni et al., 2000b; Kohlas et al., 2002; Kohlas, 2003b). The valuation algebra of densities, including Gaussian densities, may also be interpreted in a similar way as arising from probabilistic argumentation structures. For the Gaussian case we refer to (Pouly & Kohlas, 2011). Another generalisation is random variables with values in (proper) information algebras (Kohlas, 2007).

6.4 Compositional Models

Compositional models have been introduced as an alternative to Bayesian networks in (Jirousek, 1997; Jirousek, 2011). Later these models were extended to possibility theory (Vejnarova, 1998) and to Dempster-Shafer theory (Jirousek & Daniel, 2007). Finally it was noted that compositional models can be formed and used, under some conditions, in valuation algebras (Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015). In the last two references, essentially regular valuation algebras are assumed, which excludes for instance compositional models of belief functions. We show here that in fact compositional models are possible in separative valuation algebras. This allows one, among other instances of valuation algebras like Gaussian densities, also to include belief functions in the framework of compositional models.

Let then (Ψ,D; ≤, d, ·, π) throughout this section be a separative valuation algebra satisfying in addition Axioms S3 or S3′ or S5′, so that partial projection in the algebra is well defined. We first introduce the composition operator and then prove two central theorems (Theorems 51 and 52) underlying the theory of compositional models. This shows that the results obtained in (Jirousek, 1997; Jirousek, 2011) and (Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015) apply to separative valuation algebras. This seems to be the most general frame for a theory of compositional models.

Definition 15 If φ, ψ ∈ Ψ with d(φ) = x, d(ψ) = y, define

−1 φ B ψ = φ · ψy|x∧y = φ · ψ · (πx∧y(ψ)) . (6.18) This is called a composition of φ and ψ and B is called the composition operator.

Note first that if δ(πx∧y(ψ)) ≤ δ(πx∧y(φ)), then φ B ψ belongs to Ψ according to Theorem 48. This is important if chains of compositions like (φ B ψ) B σ or φ B (ψ B σ) are considered, since the terms of the composition are assumed to be in Ψ. So, it is then a partial operator B : Ψ × Ψ → Ψ. It is evident that the composition operator is neither commutative nor, in general, associative. If the algebra is regular, then all compositions belong of course automatically to Ψ. The following theorems give the main properties of composition. They are a rigorous generalisation of the main theorem in (Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015) to separative valuation algebras.

Theorem 51 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra and (D; ≤ ) a modular lattice. Assume φ, ψ ∈ Ψ with d(φ) = x, d(ψ) = y, and δ(πx∧y(ψ)) ≤ δ(πx∧y(φ)). Then,

1. d(φ B ψ) = x ∨ y.

2. πx(φ B ψ) = φ.

3. If y ≤ x, then φ B ψ = φ.

4. πx∧y(φ) = πx∧y(ψ) implies φ B ψ = ψ B φ.

5. If x ∧ y ≤ z ≤ y then (φ · πz(ψ)) B ψ = φ · ψ.

6. If x ∧ y ≤ z ≤ y, then (φ B πz(ψ)) B ψ = φ B ψ.

Proof. The proof depends on the generalised valuation algebra axioms for separative valuation algebras, especially the Combination Axiom S5 (see Section 5.3, in particular Theorem 39). We recall that we also assume Axioms S3 or S3′ or S5′, so that partial projection in the separative valuation algebra is well defined.

1.) is a simple consequence of the (generalised) Labeling Axiom.

2.) By the Combination Axiom, we have

πx(φ · ψ · (πx∧y(ψ))⁻¹) = φ · πx∧y(ψ · (πx∧y(ψ))⁻¹) = φ · πx∧y(ψ) · (πx∧y(ψ))⁻¹

= φ · fπx∧y(ψ) = φ, since δ(πx∧y(ψ)) ≤ δ(πx∧y(φ)) ≤ δ(φ).

3.) If y ≤ x, then x ∧ y = y and therefore φ B ψ = φ · fπy(ψ) = φ as above.

4.) If πx∧y(φ) = πx∧y(ψ), then φ B ψ = φ · ψ · (πx∧y(ψ))⁻¹ = ψ · φ · (πx∧y(φ))⁻¹ = ψ B φ, since obviously δ(πx∧y(ψ)) = δ(πx∧y(φ)).

5.) If z ≤ y, then by the modular law in the modular lattice (D; ≤) we obtain that (x ∨ z) ∧ y = (x ∧ y) ∨ z, and from x ∧ y ≤ z it follows that (x ∨ z) ∧ y = z. Note that δ(πz(φ · πz(ψ))) ≥ δ(πz(ψ)), so that (φ · πz(ψ)) B ψ is well defined.

We have

(φ · πz(ψ)) B ψ = φ · πz(ψ) · ψ · (π(x∨z)∧y(ψ))⁻¹ = φ · πz(ψ) · ψ · (πz(ψ))⁻¹

= φ · ψ · fπz(ψ).

But δ(πz(ψ)) ≤ δ(ψ) implies then that (φ · πz(ψ)) B ψ = φ · ψ.

6.) As before, we have (x ∨ z) ∧ y = z. Further, x ∧ y ≤ z ≤ y implies x ∧ y ≤ x ∧ z ≤ x ∧ y, hence x ∧ z = x ∧ y. This implies that φ B πz(ψ) belongs to Ψ and all compositions are well defined. This time we have

(φ B πz(ψ)) B ψ = (φ · πz(ψ) · (πx∧z(ψ))⁻¹) · ψ · (π(x∨z)∧y(ψ))⁻¹ = (φ · πz(ψ) · (πx∧z(ψ))⁻¹) · ψ · (πz(ψ))⁻¹ = φ · ψ · (πx∧y(ψ))⁻¹ · fπz(ψ) = φ B ψ.

This concludes the proof of the theorem. ut
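The composition operator, and item 2 of the theorem, can be illustrated for discrete probability potentials on sets of variables (a regular algebra, so all compositions exist). The variable names and numbers below are hypothetical:

```python
import itertools

# A sketch of the composition operator B of Definition 15 for discrete
# probability potentials over binary variables. Potentials are dicts keyed
# by configurations of their variable tuple.
def project(pot, vars_, keep):
    """Projection π: sum out the variables of vars_ not in keep."""
    out = {}
    for conf, v in pot.items():
        key = tuple(c for var, c in zip(vars_, conf) if var in keep)
        out[key] = out.get(key, 0.0) + v
    return out

def compose(phi, x, psi, y):
    """phi B psi = phi · psi · (π_{x∧y}(psi))^{-1}, on the domain x ∨ y."""
    meet = tuple(v for v in x if v in y)
    join = tuple(sorted(set(x) | set(y)))
    psi_meet = project(psi, y, meet)
    out = {}
    for conf in itertools.product((0, 1), repeat=len(join)):
        c = dict(zip(join, conf))
        denom = psi_meet[tuple(c[v] for v in meet)]
        if denom > 0:  # division only on the support, as in a regular algebra
            out[conf] = (phi[tuple(c[v] for v in x)]
                         * psi[tuple(c[v] for v in y)] / denom)
    return out, join

# Two hypothetical potentials on overlapping domains x = (u, v), y = (v, w).
phi = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
psi = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.3, (1, 1): 0.2}
comp, dom = compose(phi, ("u", "v"), psi, ("v", "w"))

# Theorem 51, item 2: π_x(φ B ψ) = φ.
back = project(comp, dom, ("u", "v"))
for k, v in phi.items():
    assert abs(back[k] - v) < 1e-12
print("π_x(φ B ψ) = φ verified")
```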

As a complement, note that if φ B ψ = ψ B φ,

φ · ψ · πx∧y(φ) = φ · ψ · πx∧y(ψ).

So, in this case we have πx∧y(φ) = πx∧y(ψ) if the valuation algebra is cancellative; in this case commutativity of composition implies consistency of the valuations involved. For distributive lattices (D; ≤), stronger results are possible:

Theorem 52 Let (Ψ,D; ≤, d, ·, π) be a separative valuation algebra and (D; ≤ ) a distributive lattice. Assume φ, ψ ∈ Ψ with d(φ) = x, d(ψ) = y, and δ(πx∧y(ψ)) ≤ δ(πx∧y(φ)). Then,

1. If x ≥ y ∧z and d(τ) = z, δ(πx∧z(τ)) ≤ δ(πx∧z(φ)), then (φBψ)Bτ = (φ B τ) B ψ.

2. If x ∧ y ≤ z ≤ x ∨ y then πz(φ B ψ) = πx∧z(φ) B πy∧z(ψ).

3. If x ≥ y ∧ z and d(τ) = z, δ(πy∧z(τ)) ≤ δ(πy∧z(ψ)) then (φ B ψ) B τ = φ B (ψ B τ). 6 CONDITIONALS 120

4. If y ≥ x ∧ z and d(τ) = z, δ(πx∧z(τ)) ≤ δ(πx∧z(φ)), then (φ B ψ) B τ = φ B (ψ B τ).

Proof. 1.) The assumptions guarantee that φ B ψ as well as φ B τ are in Ψ and all compositions are well defined. If x ≥ y ∧ z, then by the distributivity of the lattice (D; ≤) it follows that (x ∨ y) ∧ z = (x ∧ z) ∨ (y ∧ z) = x ∧ z and similarly (x ∨ z) ∧ y = x ∧ y. Thus, we have

(φ B ψ) B τ = (φ · ψ · (πx∧y(ψ))⁻¹) · τ · (π(x∨y)∧z(τ))⁻¹ = (φ · τ · (πx∧z(τ))⁻¹) · ψ · (π(x∨z)∧y(ψ))⁻¹ = (φ B τ) B ψ.

2.) If x ∧ y ≤ z ≤ x ∨ y then we have also x ∧ y ≤ y ∧ z ≤ y and we can use item 6 of Theorem 51, to obtain

πx∨z(φ B ψ) = πx∨z((φ B πy∧z(ψ) B ψ). Since x∧y ≤ z ≤ x∨y implies further that x∨(y∧z) = (x∨y)∧(x∨z) = x∨z (distributivity) it follows from item 2 of Theorem 51 that

πx∨z(φ B ψ) = φ B πy∧z(ψ).

Note that πx∧y(φ) B φ = πx∧y(φ) · φ · (πx∧y(φ))⁻¹ = φ. Therefore, using item 1 just proved above, we have

πx∨z(φ B ψ) = (πx∧y(φ) B φ) B πy∧z(ψ) = (πx∧y(φ) B πy∧z(ψ)) B φ.

Next we compute πz(φ B ψ) in the same way, again using properties 2 and 6 of Theorem 51 and property 1 of the present theorem:

πz(φ B ψ) = πz((πx∧y(φ) B πy∧z(ψ)) B φ) = πz(((πx∧y(φ) B πy∧z(ψ)) B πx∧z(φ)) B φ) = (πx∧y(φ) B πy∧z(ψ)) B πx∧z(φ) = (πx∧y(φ) B πx∧z(φ)) B πy∧z(ψ)) = πx∧z(φ) B πy∧z(ψ). We leave it to the reader to verify the conditions for the application of the the properties of Theorem 51 and of item 1 of the present theorem. 6 CONDITIONALS 121

3.) From the definition of composition, we have

φ B (ψ B τ) = φ · (ψ B τ) · (πx∧(y∨z)(ψ B τ))^{-1}. (6.19)

Here we may apply item 2 of the present theorem to the last term, so that

φ B (ψ B τ) = φ · (ψ B τ) · (πx∧y(ψ) B πx∧z(τ))^{-1}.

Note that under the assumption x ≥ y ∧ z we have (x ∧ y) ∧ (x ∧ z) = y ∧ z. Therefore, we have πx∧y(ψ) B πx∧z(τ) = πx∧y(ψ) · πx∧z(τ) · (πy∧z(τ))^{-1}. This allows us to deduce

φ B (ψ B τ) = φ · ψ · τ · (πy∧z(τ))^{-1} · (πx∧y(ψ) B πx∧z(τ))^{-1}
= φ · ψ · τ · (πy∧z(τ))^{-1} · (πx∧y(ψ))^{-1} · (πx∧z(τ))^{-1} · πy∧z(τ)
= (φ B ψ) · τ · (π(x∨y)∧z(τ))^{-1}
= (φ B ψ) B τ.

4.) Note that from y ≥ x ∧ z it follows that (x ∨ y) ∧ z = (x ∧ z) ∨ (y ∧ z) = y ∧ z and x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) = x ∧ y. So, using the definition of the composition operator, we obtain

(φ B ψ) B τ = (φ · ψ · (πx∧y(ψ))^{-1}) · τ · (πy∧z(τ))^{-1}.

On the other hand, we have

φ B (ψ B τ) = φ · (ψ B τ) · (πx∧y(ψ B τ))^{-1}.

From item 2 of Theorem 51 we get, using πx∧y(ψ B τ) = πx∧y(πy(ψ B τ)),

φ B (ψ B τ) = φ · (ψ B τ) · (πx∧y(ψ))^{-1}
= φ · ψ · τ · (πy∧z(τ))^{-1} · (πx∧y(ψ))^{-1}.

The right-hand side here is equal to the one above for (φ B ψ) B τ, so indeed (φ B ψ) B τ = φ B (ψ B τ). □

So, at least for distributive lattices (D; ≤), a separative valuation algebra allows for compositional models, and the results in (Jirousek & Shenoy, 2014; Jirousek & Shenoy, 2015) carry over to this more general case. Only this extension makes compositional modelling available for belief functions, Gaussian densities and densities in general, since the corresponding valuation algebras are not regular, but only separative or cancellative, respectively.
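To make the composition operator concrete, here is a minimal Python sketch of B for discrete probability tables, which is the regular case treated in (Jirousek & Shenoy, 2014). The representation (dicts keyed by configuration tuples) and all function names are illustrative, not part of the formalism; ordinary division plays the role of the inverse (πx∧y(ψ))^{-1}, and configurations with zero marginal are dropped.

```python
from itertools import product

def marginal(f, dom, sub):
    """Sum the potential f (on variable list dom) down to the variables sub."""
    idx = [dom.index(v) for v in sub]
    out = {}
    for cfg, p in f.items():
        key = tuple(cfg[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def compose(f, domf, g, domg, values):
    """Composition f B g on the union domain: f * g / (g marginalised to the
    intersection of the domains); zero-marginal configurations are dropped."""
    inter = [v for v in domf if v in domg]
    union = list(domf) + [v for v in domg if v not in domf]
    gm = marginal(g, domg, inter)
    out = {}
    for cfg in product(*(values[v] for v in union)):
        pick = lambda dom: tuple(cfg[union.index(v)] for v in dom)
        denom = gm.get(pick(inter), 0.0)
        if denom > 0:
            out[cfg] = f.get(pick(domf), 0.0) * g.get(pick(domg), 0.0) / denom
    return out, union
```

Composing a marginal for X with a joint for (X, Y) in this way reproduces the familiar factorisation f(x) · g(y | x).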

7 Domain-Free Algebras of Information

7.1 Unlabeling of Information

In a labeled information algebra (Φ, D; ≤, ⊥, d, ·, t) different pieces of information φ and ψ with different labels d(φ) = x and d(ψ) = y may describe the same information, namely, if

tx∨y(φ) = tx∨y(ψ). (7.1)

In fact, then we see that

φ = tx(tx∨y(φ)) = tx(tx∨y(ψ)) = tx(ψ), and, similarly,

ψ = ty(φ).

So, φ and ψ really describe the same information, once with respect to domain x and once with respect to domain y. This motivates defining the relation

φ ≡σ ψ, if d(φ) = x, d(ψ) = y and tx∨y(φ) = tx∨y(ψ). (7.2)

This relation can alternatively be defined according to the following lemma.

Lemma 11 The relation φ ≡σ ψ holds if and only if tz(φ) = tz(ψ) for all z ∈ D.

Proof. If tz(φ) = tz(ψ) for all z ∈ D and d(φ) = x, d(ψ) = y, then, for z = x ∨ y, tx∨y(φ) = tx∨y(ψ), hence φ ≡σ ψ.

Conversely, assume φ ≡σ ψ, that is, tx∨y(φ) = tx∨y(ψ). Then we have also tx∨y∨z(φ) = tx∨y∨z(ψ). Further, tz(tx∨y∨z(φ)) = tz(φ) and the same for ψ, hence indeed tz(φ) = tz(ψ). □

The relation ≡σ is clearly an equivalence relation in Φ. It is even a congruence relative to the operations of combination and transport, as the following theorem shows.

Theorem 53 The relation ≡σ is a congruence relative to combination and transport in the labeled information algebra (Φ, D; ≤, ⊥, d, ·, t).

Proof. Assume φ1 ≡σ φ2 and d(φ1) = x, d(φ2) = y. Consider an element ψ ∈ Φ with d(ψ) = z. We show that φ1 ·ψ ≡σ φ2 ·ψ. From z⊥x∨y∨z|x∨y∨z (C1) it follows that x⊥z|x ∨ y ∨ z (C2 and C3). Therefore, by axiom A5

tx∨y∨z(φ1 · ψ) = tx∨y∨z(φ1) · tx∨y∨z(ψ).

In the same way, y⊥z|x ∨ y ∨ z, and

tx∨y∨z(φ2 · ψ) = tx∨y∨z(φ2) · tx∨y∨z(ψ).

Then φ1 ≡σ φ2 implies, using Lemma 11,

tx∨y∨z(φ1 · ψ) = tx∨y∨z(φ2 · ψ).

But this means that φ1 · ψ ≡σ φ2 · ψ.

Also, φ ≡σ ψ implies tz(φ) = tz(ψ), hence tz(φ) ≡σ tz(ψ). This proves congruence relative to transport. □

This congruence allows us to consider the quotient structure Φ/σ, consisting of the equivalence classes [φ]σ of the equivalence relation ≡σ. Combination and transport are well-defined in this structure by

[φ]σ · [ψ]σ = [φ · ψ]σ,

εx([φ]σ) = [tx(φ)]σ.

It is evident that combination in Φ/σ is associative and commutative, since it is so in Φ. Further, the classes [1x]σ and [0x]σ are the unit and null elements of combination in Φ/σ. So, (Φ/σ, ·) is a commutative semigroup with unit and null element. In addition, if d(φ) = x, then εx([φ]σ) = [φ]σ, hence, in particular, εx([1x]σ) = [1x]σ and εx([φ]σ) = [0x]σ if and only if [φ]σ = [0x]σ. The following theorem shows how the axioms A4, A5 and A7 of the labeled information algebra (Φ, D; ≤, ⊥, d, ·, t) translate into this new algebraic structure.

Theorem 54 Let (Φ,D; ≤, ⊥, d, ·, t) be a labeled information algebra. Then in Φ/σ the following holds:

1. εx([φ]σ) = [φ]σ and x⊥y|z imply εy([φ]σ) = εy(εz([φ]σ)).

2. εx([φ]σ) = [φ]σ, εy([ψ]σ) = [ψ]σ and x⊥y|z imply εz([φ]σ · [ψ]σ) = εz([φ]σ) · εz([ψ]σ).

3. If, in addition, Idempotency A7 holds, then εx([φ]σ) · [φ]σ = [φ]σ.

Proof. 1.) We have by definition and assumption εx([φ]σ) = [tx(φ)]σ = [φ]σ. We may therefore select a representative of the class [φ]σ such that tx(φ) = φ, hence d(φ) = x. Then by axiom A4, x⊥y|z implies ty(φ) = ty(tz(φ)). Therefore, we obtain

εy([φ]σ) = [ty(φ)]σ = [ty(tz(φ))]σ = εy(εz([φ]σ)),

as claimed.

2.) As above, we have εx([φ]σ) = [tx(φ)]σ = [φ]σ and εy([ψ]σ) = [ty(ψ)]σ = [ψ]σ. Then x⊥y|z implies, according to axiom A5,

tz(tx(φ) · ty(ψ)) = tz(tx(φ)) · tz(ty(ψ)).

Thus, we have

εz([φ]σ · [ψ]σ) = εz([tx(φ)]σ · [ty(ψ)]σ) = εz([tx(φ) · ty(ψ)]σ)
= [tz(tx(φ) · ty(ψ))]σ = [tz(tx(φ)) · tz(ty(ψ))]σ
= [tz(tx(φ))]σ · [tz(ty(ψ))]σ = εz([tx(φ)]σ) · εz([ty(ψ)]σ)
= εz([φ]σ) · εz([ψ]σ).

This verifies item 2 of the theorem.

3.) We have εx([φ]σ) · [φ]σ = [tx(φ) · φ]σ. Assume that d(φ) = y. By Lemma 3, φ · tx(φ) = tx∨y(φ). But φ ≡σ tx∨y(φ), hence indeed εx([φ]σ) · [φ]σ = [φ]σ. □

The quotient structure considered here amounts in fact to unlabeling the pieces of information φ, to obtain a domain-free representation of pieces of information [φ]σ. This will be exploited in the next section, where a new algebraic structure of information is proposed, which has the algebraic structure derived here as an instance.
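For intuition, in the idempotent subset instance (pieces of information are relations over finite variable domains, combination is natural join), two labeled relations are σ-equivalent exactly when their cylindric extensions to a common domain coincide, and the class [φ]σ can be represented by the full cylinder. A toy Python sketch, where the variable register and value sets are illustrative:

```python
from itertools import product

VARS = ('X', 'Y')                      # toy global register of variables
VALS = {'X': (0, 1), 'Y': (0, 1)}

def extend(phi, dom):
    """Cylindric extension of a relation phi on dom to all of VARS."""
    free = [v for v in VARS if v not in dom]
    ext = set()
    for row in phi:
        for extra in product(*(VALS[v] for v in free)):
            full = {**dict(zip(dom, row)), **dict(zip(free, extra))}
            ext.add(tuple(full[v] for v in VARS))
    return ext

def equivalent(phi, dom_phi, psi, dom_psi):
    """phi ≡σ psi: same information once transported to a common domain."""
    return extend(phi, dom_phi) == extend(psi, dom_psi)
```

Here the relation "X = 0" on domain {X} and the relation {(0, 0), (0, 1)} on domain {X, Y} are σ-equivalent: they carry the same information on different labels.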

7.2 Domain-Free Algebras

Motivated by the previous section, we introduce here a new algebraic structure, modeling information from a different point of view. Again we base ourselves on a q-separoid (D; ≤, ⊥), whose elements represent domains or questions. Let now Ψ be a set of elements representing pieces of information. This time, however, an element φ, ψ, . . . from Ψ is not attached to a specified domain; it is domain-free. Nevertheless, it is assumed to be possible to extract the information relative to a domain x in D from every piece of information ψ ∈ Ψ. This will be accomplished by extraction operators εx : Ψ → Ψ attached to each domain x in D. More formally, we assume two operations in Ψ:

1. Combination: · : Ψ × Ψ → Ψ, (φ, ψ) ↦ φ · ψ,

2. Extraction: ε : Ψ × D → Ψ, (ψ, x) ↦ εx(ψ).

As before, combination represents aggregation of information and, as a new element, extraction describes filtering out the information relative to a domain. Thus, εx(ψ) is thought of as the part of ψ referring to the domain or question x ∈ D. We consider a signature (Ψ, D; ≤, ⊥, ·, ε) and impose the following axioms:

A0 Quasi-Separoid: (D; ≤, ⊥) is a quasi-separoid.

A1 Semigroup: (Ψ, ·) is a commutative semigroup with unit 1 and null element 0.

A2 Support: For all ψ ∈ Ψ, there is a domain x ∈ D such that εx(ψ) = ψ, and whenever εx(ψ) = ψ and x ≤ y, then εy(ψ) = ψ.

A3 Unit and Null: For all x ∈ D, εx(1) = 1 and εx(φ) = 0 if and only if φ = 0.

A4 Extraction: If x⊥y|z and εx(ψ) = ψ, then εy(ψ) = εy(εz(ψ)).

A5 Combination: If x⊥y|z, εx(φ) = φ and εy(ψ) = ψ, then εz(φ · ψ) = εz(φ) · εz(ψ).

A system (Ψ, D; ≤, ⊥, ·, ε) satisfying these axioms is called a domain-free information algebra. According to the previous section, if (Φ, D; ≤, ⊥, d, ·, t) is a labeled information algebra, then (Φ/σ, D; ≤, ⊥, ·, ε) is a domain-free information algebra. So, to any labeled information algebra a domain-free algebra is associated. Below, we shall see that any domain-free information algebra induces a labeled one. This leads to a remarkable duality between the two kinds of algebras. There are also domain-free information algebras which satisfy the following additional axiom:

A6 Idempotency: For all φ ∈ Ψ and x ∈ D, εx(φ) · φ = φ.

Note that idempotency implies, due to the support axiom A2, that φ · φ = φ. If this property holds, then the information algebra is called idempotent or proper. In fact, in a proper context of information, a repetition of a piece of information, or of a part of it, should give nothing new.
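A standard concrete instance of these axioms, useful as a sanity check, is the subset algebra: Ψ consists of subsets of a finite product frame, combination is intersection, and εx saturates a set over the coordinates outside x. The following sketch (frame and all names are illustrative) checks the support, unit and idempotency properties:

```python
from itertools import product

VARS = ('X', 'Y', 'Z')
FULL = set(product((0, 1), repeat=3))       # the vacuous information 1

def extract(phi, dom):
    """ε_dom: keep every tuple agreeing on dom with some element of phi."""
    idx = [VARS.index(v) for v in dom]
    seen = {tuple(t[i] for i in idx) for t in phi}
    return {t for t in FULL if tuple(t[i] for i in idx) in seen}

combine = set.intersection                  # aggregation of information

phi = {t for t in FULL if t[0] == t[1]}     # the information "X = Y"
```

In this instance extract(phi, ('X', 'Y')) = phi, so {X, Y} is a support of φ, while extraction to {X} alone yields the vacuous information.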

If, for a domain x from D, εx(φ) = φ, then x is called a support of φ. Here follow a few properties of support and extraction:

Lemma 12 If (Ψ, D; ≤, ⊥, ·, ε) is a domain-free information algebra, then the following holds:

1. x is a support of εx(φ): εx(εx(φ)) = εx(φ).

2. If x is a support of both φ and ψ, then it is also a support of φ · ψ: εx(φ · ψ) = φ · ψ.

3. If x is a support of φ and y of ψ, then x ∨ y is a support of φ · ψ: εx∨y(φ · ψ) = φ · ψ.

4. For all x ∈ D, εx(εx(φ) · ψ) = εx(φ) · εx(ψ).

Proof. 1.) By axiom A2, φ has a support y. Since y⊥x|x (C1), axiom A4 implies εx(εx(φ)) = εx(φ).

2.) and 3.) From x⊥x ∨ y|x ∨ y (C1) it follows that x⊥y|x ∨ y (C3). If x is a support for φ and y for ψ, then axiom A5 implies εx∨y(φ · ψ) = εx∨y(φ) · εx∨y(ψ). Further, by the support axiom A2, x ∨ y is a support of both φ and ψ. Thus it follows that εx∨y(φ · ψ) = φ · ψ, which proves item 3 of the lemma. With x = y, item 2 follows also.

4.) By axiom A2 any element ψ has a support. So suppose εy(ψ) = ψ. Then x⊥y|x (C1, C2) and εx(εx(φ)) = εx(φ) (item 1 proved above) imply, using axiom A5,

εx(εx(φ) · ψ) = εx(εx(φ)) · εx(ψ) = εx(φ) · εx(ψ).

This concludes the proof. □

Just as with labeled information algebras (Section 3.2), an important and interesting special case arises for a domain-free information algebra (Ψ, D; ≤, ⊥L, ·, ε), where D is a lattice. In this case the following holds.

Lemma 13 In the domain-free information algebra (Ψ, D; ≤, ⊥L, ·, ε), where D is a lattice, for all x, y ∈ D,

εy(εx(φ)) = εx∧y(φ).

Proof. We have x⊥Ly|x ∧ y. From this and εx(εx(φ)) = εx(φ) it follows, using axiom A4,

εy(εx(φ)) = εy(εx∧y(εx(φ))).

Assume that z is a support of φ. Then z⊥Lx|x (C1) implies z⊥Lx ∧ y|x (C3), and thus, again by axiom A4, εx∧y(εx(φ)) = εx∧y(φ). Now x ∧ y is a support of εx∧y(φ) and x ∧ y ≤ y, hence indeed, by axiom A2, εy(εx(φ)) = εx∧y(φ). □

This result implies that the extraction operators εx for x ∈ D in the algebra (Ψ, D; ≤, ⊥L, ·, ε) form a commutative semigroup under composition,

εy(εx(φ)) = εx(εy(φ)),

which is not the case in general. Moreover, due to the idempotency of extraction, these extraction operators form an idempotent, commutative semigroup, hence a meet-semilattice, where we define εx ≤ εy if εy ◦ εx = εx. Under this partial order it follows indeed that εx ◦ εy = εx ∧ εy = εx∧y, in concordance with Lemma 13, and also εx ≤ εy if and only if x ≤ y. This structure is an instance of an alternative algebraic structure which is defined as follows. Consider the signature (Ψ, E; ·, ◦), where the following operations are defined:

1. Combination: · : Ψ × Ψ → Ψ; (φ, ψ) ↦ φ · ψ.

2. Extraction: Any ε ∈ E is an operator ε : Ψ → Ψ.

3. Composition: ◦ : E × E → E, such that (ε ◦ η)(ψ) = ε(η(ψ)).

We impose the following axioms on this structure:

D0 Semigroup of Extraction: (E, ◦) is an idempotent, commutative semigroup.

D1 Semigroup of Information: (Ψ, ·) is a commutative semigroup with unit 1 and null element 0.

D2 Support: For all ψ ∈ Ψ there is an element ε ∈ E such that ε(ψ) = ψ.

D3 Unit and Null: For all ε ∈ E, ε(1) = 1 and ε(φ) = 0 if and only if φ = 0.

D4 Combination: For all ε ∈ E and φ, ψ ∈ Ψ, ε(ε(φ) · ψ) = ε(φ) · ε(ψ).

This is the domain-free variant of a labeled valuation algebra (see Section 3.2), and we therefore call it a domain-free valuation algebra. Similar, more or less equivalent structures have been introduced by (Shafer, 1991) and are discussed in (Kohlas, 2003a). We may add an additional axiom concerning idempotency:

D5 Idempotency: For all ψ ∈ Ψ and ε ∈ E, ε(ψ) · ψ = ψ.

These idempotent structures were studied extensively in (Kohlas, 2003a; Kohlas & Schmid, 2014a; Kohlas & Schmid, 2014b). There are also many examples and generic construction methods for these algebras to be found there. Such idempotent valuation algebras were called information algebras in those papers. Note that the term information algebra in this paper refers to a more general, not necessarily idempotent algebraic structure. As mentioned above, axiom D0 implies that E is a meet-semilattice where ε ≤ η if η ◦ ε = ε. In this order, ε ◦ η is the infimum of ε and η: ε ◦ η = ε ∧ η. For later reference we add a few properties derived from the axioms of a domain-free valuation algebra.
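In the subset instance above, the extraction operators themselves can be materialised as the elements of E and composed as functions; the sketch below (all names illustrative) checks the D0-style idempotency and commutativity of composition, the meet property of Lemma 13, and the combination axiom D4:

```python
from itertools import product

VARS = ('X', 'Y', 'Z')
FULL = set(product((0, 1), repeat=3))

def eps(dom):
    """Build the extraction operator ε for a set of variables dom."""
    idx = [VARS.index(v) for v in dom]
    def op(phi):
        seen = {tuple(t[i] for i in idx) for t in phi}
        return {t for t in FULL if tuple(t[i] for i in idx) in seen}
    return op

e_xy, e_yz, e_y = eps(('X', 'Y')), eps(('Y', 'Z')), eps(('Y',))
phi = {(0, 0, 0), (1, 0, 1)}
psi = {t for t in FULL if t[2] == 0}
```

Composition of e_xy and e_yz, in either order, extracts exactly the information bearing on the common variable Y, i.e. equals e_y, illustrating that ◦ is the infimum in (E; ≤).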

Lemma 14 Let (Ψ,E; ·, ◦) be a domain-free valuation algebra. Then

1. ε(ψ) = ψ and ε ≤ η imply η(ψ) = ψ.

2. ε ≤ η implies ε(ψ) = ε(η(ψ)).

3. ε(ψ) = η(ψ) = ψ implies (ε ∧ η)(ψ) = ψ.

4. ε(ψ) = ψ implies (ε ∧ η)(ψ) = η(ψ).

5. ε(φ) = φ and ε(ψ) = ψ imply ε(φ · ψ) = φ · ψ.

Proof. 1.) We have η(ψ) = η(ε(ψ)) = (ε ∧ η)(ψ). But ε ∧ η = ε. Therefore, η(ψ) = ε(ψ) = ψ.

2.) Here we use again that ε ∧ η = ε and conclude that ε(η(ψ)) = (ε ∧ η)(ψ) = ε(ψ).

3.) Since ε ◦ η = ε ∧ η, we have (ε ∧ η)(ψ) = ε(η(ψ)) = ψ.

4.) Again, from ε ◦ η = ε ∧ η we obtain (ε ∧ η)(ψ) = η(ε(ψ)) = η(ψ).

5.) Here we use the combination axiom D4 and see that

ε(φ · ψ) = ε(ε(φ) · ψ) = ε(φ) · ε(ψ) = φ · ψ.

This concludes the proof. □

Note that in the frame of a domain-free valuation algebra the property stated in item 1 may be derived, whereas in the case of a general domain-free information algebra it must be imposed as an axiom. In many cases E is not only a semilattice, but a lattice. In particular, this is the case for the domain-free algebra (Ψ, D; ≤, ⊥L, ·, ε) introduced above and if the domain-free valuation algebra is derived from a labeled one; see Section 7.3 below. Then some more results about extraction can be obtained.

Lemma 15 Let (Ψ, E; ·, ◦) be a domain-free valuation algebra, where (E; ≤) is a lattice under the order induced by the idempotent, commutative semigroup (E, ◦), and let ε⊥Lη|χ denote the corresponding q-separoid relation in this lattice. Then

1. ε(ψ) = ψ and ε⊥Lη|χ imply η(ψ) = η(χ(ψ)).

2. ε(φ) = φ, η(ψ) = ψ and ε⊥Lη|χ imply χ(φ · ψ) = χ(φ) · χ(ψ).

3. ε(φ) = φ and η(ψ) = ψ imply (ε ∨ η)(φ · ψ) = φ · ψ.

Proof. 1.) From ε ≤ ε ∨ χ and η ≤ η ∨ χ it follows, using items 1 and 2 of Lemma 14,

η(ψ) = η((ε ∨ χ)(ψ)) = η((η ∨ χ)((ε ∨ χ)(ψ))) = η(((ε ∨ χ) ∧ (η ∨ χ))(ψ)).

But ε⊥Lη|χ implies (ε ∨ χ) ∧ (η ∨ χ) = χ and therefore, indeed, η(ψ) = η(χ(ψ)).

2.) and 3.) Again, by Lemma 14, we have φ = (ε ∨ χ)(φ) and ψ = (η ∨ χ)(ψ). Therefore, χ(φ · ψ) = χ((ε ∨ χ)(φ) · (η ∨ χ)(ψ)). From χ ≤ ε ∨ χ, axiom D4 and the fact that composition is infimum in (E; ≤), we obtain

χ(φ · ψ) = χ((ε ∨ χ)((ε ∨ χ)(φ) · (η ∨ χ)(ψ)))
= χ((ε ∨ χ)(φ) · ((ε ∨ χ) ∧ (η ∨ χ))(ψ)).

Applying ε⊥Lη|χ, axiom D4 and φ = (ε ∨ χ)(φ) then gives

χ(φ · ψ) = χ(φ · χ(ψ)) = χ(φ) · χ(ψ).

This proves item 2. Item 3 follows from item 2, since ε⊥Lη|ε ∨ η and φ = (ε ∨ η)(φ), ψ = (ε ∨ η)(ψ). □

These lemmas show that a domain-free valuation algebra (Ψ, E; ·, ◦) induces a domain-free information algebra (Ψ, E; ≤, ⊥L, ·, ε) if E is a lattice. The situation in the domain-free view is therefore much the same as in the labeled view. This aspect will be clarified in the next section. Note, however, that if E is only a semilattice in the domain-free valuation algebra (Ψ, E; ◦, ·), then it is not a generalised information algebra.

7.3 Duality

We have seen above that a domain-free information algebra can be derived from a labeled one. It turns out that, conversely, a labeled information algebra may also be obtained from a domain-free one. This establishes a duality between labeled and domain-free information algebras. This duality applies also to the special case of valuation algebras, although with the reservation that the extraction operators must form a lattice, not only a semilattice. Assume (Ψ, D; ≤, ⊥, ·, ε) to be a domain-free information algebra. Define

Φx = {(φ, x) : εx(φ) = φ}, Φ = ∪x∈D Φx.

We define the following operations relative to the signature (Φ,D; ≤, ⊥, d, ·, t):

1. Labeling: d : Φ → D; (φ, x) ↦ d(φ, x) = x.

2. Combination: · : Φ × Φ → Φ; ((φ, x), (ψ, y)) ↦ (φ, x) · (ψ, y) = (φ · ψ, x ∨ y).

3. Transport: t : Φ × D → Φ; ((φ, x), y) ↦ ty(φ, x) = (εy(φ), y).

Note that we use the same symbol · for combination in Ψ and in Φ; it will always be clear from the context which operation is meant. We remark that due to the results of the previous section all these operations are well defined.
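This construction can be sketched for the subset instance on a toy two-variable frame (all names illustrative): pieces of the labeled algebra are pairs (φ, x) with εx(φ) = φ, combination joins the domains, and transport re-extracts.

```python
from itertools import product

VARS = ('X', 'Y')
FULL = set(product((0, 1), repeat=2))

def eps(phi, dom):
    """Extraction in the subset instance."""
    idx = [VARS.index(v) for v in dom]
    seen = {tuple(t[i] for i in idx) for t in phi}
    return {t for t in FULL if tuple(t[i] for i in idx) in seen}

def d(p):                                # labeling: d(phi, x) = x
    return p[1]

def comb(p, q):                          # (phi,x)·(psi,y) = (phi·psi, x∨y)
    return (p[0] & q[0], tuple(sorted(set(p[1]) | set(q[1]))))

def transport(p, y):                     # t_y(phi, x) = (ε_y(phi), y)
    return (eps(p[0], y), y)
```

Combining "X = 0" on domain {X} with "Y = 1" on domain {Y} yields a piece on the joined domain {X, Y}, and transporting back to {X} recovers the first factor.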

We claim that (Φ, D; ≤, ⊥, d, ·, t) is a labeled information algebra. Here is the verification of the axioms: The structure (D; ≤, ⊥) is a q-separoid by definition (A0). Combination in (Φ, ·) is clearly associative and commutative, so (Φ, ·) is a commutative semigroup (A1). By the definition of Combination, Labeling and Transport in Φ we have d((φ, x) · (ψ, y)) = d(φ · ψ, x ∨ y) = x ∨ y = d(φ, x) ∨ d(ψ, y) and d(ty(φ, x)) = d(εy(φ), y) = y. So the Labeling Axiom (A2) is satisfied. The null and unit elements associated with x ∈ D are (0, x) and (1, x). By the definition of Combination and Transport in Φ, we obtain tx(φ, y) = (εx(φ), x) = (0, x) if and only if φ = 0, (φ, y) · (1, x) = (φ, x ∨ y) = (εx∨y(φ), x ∨ y) = tx∨y(φ, y) and (1, x) · (1, y) = (1, x ∨ y). This confirms the Unit and Null Axiom (A3). For the Transport Axiom (A4) assume x⊥y|z and consider an element (φ, x) such that d(φ, x) = x and εx(φ) = φ. Then by the Extraction Axiom (A4) of the domain-free algebra

ty(φ, x) = (εy(φ), y) = (εy(εz(φ)), y) = ty(tz(φ, x)).

This is the Transport Axiom (A4) in the labeled version. To verify the Combination Axiom (A5) assume x⊥y|z and consider elements (φ, x) and (ψ, y) such that d(φ, x) = x and d(ψ, y) = y. Then, by the definitions of Combination and Transport in Φ, and invoking Combination Axiom (A5) for the domain-free algebra,

tz((φ, x) · (ψ, y)) = (εz(φ · ψ), z) = (εz(φ) · εz(ψ), z) = (εz(φ), z) · (εz(ψ), z) = tz(φ, x) · tz(ψ, y).

This is the Combination Axiom (A5) in the labeled version.

The Identity Axiom (A6) follows from tx(φ, x) = (εx(φ), x) = (φ, x), since by definition x is a support of φ. So, the algebra (Φ, D; ≤, ⊥, d, ·, t) is indeed an instance of a labeled information algebra. If (Ψ, D; ≤, ⊥, ·, ε) is idempotent, then so is (Φ, D; ≤, ⊥, d, ·, t): consider (φ, x) and y ≤ x; then (φ, x) · ty(φ, x) = (φ, x) · (εy(φ), y) = (φ · εy(φ), x ∨ y) = (φ, x). This is the Idempotency Axiom (A7) of the labeled information algebra. We may now start with a labeled information algebra L, say L = (Φ, D; ≤, ⊥, d, ·, t), and derive its domain-free version DL = (Φ/σ, D; ≤, ⊥, ·, ε) in the way described in Section 7.1. In a further step we may construct from DL its labeled version LDL = (Φ′, D; ≤, ⊥, d′, ·′, t′) as shown above. In the same way we may start with a domain-free information algebra D = (Ψ, D; ≤, ⊥, ·, ε) and get the associated labeled algebra LD = (Φ, D; ≤

, ⊥, d, ·, t) and then obtain from this labeled algebra its domain-free version DLD = (Ψ/σ, D; ≤, ⊥, ·′, ε′). It may be suspected that L and LDL are essentially the same algebra, and so are D and DLD. In order to make this statement precise we need to define the notion of an isomorphism between labeled information algebras on the one hand and between domain-free information algebras on the other hand. Consider two labeled information algebras L = (Φ, D; ≤, ⊥, d, ·, t) and L′ = (Φ′, D; ≤, ⊥, d′, ·′, t′). To simplify, we assume that both algebras are based on the same q-separoid; this corresponds to the situation we are interested in. Let T = {tx : x ∈ D} and T′ = {t′x : x ∈ D} be the sets of the transport operators in the two algebras. We consider a pair of maps

f : Φ → Φ′, g : T → T′, such that g(tx) = t′x for all x ∈ D.

If the following conditions are satisfied, the pair of maps (f, g) is called a labeled information algebra homomorphism:

1. f(φ · ψ) = f(φ) ·′ f(ψ), for all φ, ψ ∈ Φ,

2. f(0x) = 0′x and f(1x) = 1′x, where 0′x and 1′x are the null elements and units in (Φ′, ·′),

3. f(tx(φ)) = g(tx)(f(φ)).

Note that the map g is by definition one-to-one and onto. If the map f is also onto and one-to-one, the pair (f, g) is called a labeled information algebra isomorphism and the two algebras are called isomorphic, written as L ≅ L′. Similarly for domain-free information algebras: Let D = (Ψ, D; ≤, ⊥, ·, ε) and D′ = (Ψ′, D; ≤, ⊥, ·′, ε′). Let E = {εx : x ∈ D} and E′ = {ε′x : x ∈ D} be the sets of the extraction operators in the two algebras. Again, a pair (f, g) of maps

f : Ψ → Ψ′, g : E → E′, such that g(εx) = ε′x for all x ∈ D,

satisfying

1. f(φ · ψ) = f(φ) ·′ f(ψ), for all φ, ψ ∈ Ψ,

2. f(0) = 0′ and f(1) = 1′, where 0′ and 1′ are the null and unit in (Ψ′, ·′),

3. f(εx(φ)) = g(εx)(f(φ)),

is called a domain-free information algebra homomorphism. The map g is still one-to-one and onto. If the map f is onto and one-to-one, then (f, g) is called a domain-free information algebra isomorphism; D and D′ are called isomorphic, written D ≅ D′.

We are now going to show that L and LDL are isomorphic, and so are D and DLD. In the first case, consider the maps (f, g) from L into LDL, defined by

f(φ) = ([φ]σ, x), if d(φ) = x,
g(tx) = t′x, where t′x([φ]σ, y) = (εx([φ]σ), x). (7.3)

Similarly, we define the pair (f, g) of maps from D into DLD,

f(ψ) = [(ψ, x)]σ, if εx(ψ) = ψ,
g(εx) = ε′x, where ε′x([(ψ, y)]σ) = [tx(ψ, y)]σ. (7.4)

We claim that these pairs of maps are, respectively, labeled information algebra and domain-free information algebra isomorphisms.

Theorem 55 We have L =∼ LDL and D =∼ DLD and the respective pairs of maps (f, g) (7.3) and (7.4) are the isomorphisms.

Proof. First, consider the labeled case, the pair of maps defined in (7.3). Here we have first, assuming d(φ) = x and d(ψ) = y,

f(φ · ψ) = ([φ · ψ]σ, x ∨ y) = ([φ]σ · [ψ]σ, x ∨ y) = ([φ]σ, x) ·′ ([ψ]σ, y) = f(φ) ·′ f(ψ).

Since f(0x) = ([0x]σ, x) and f(1x) = ([1x]σ, x), null and unit elements are preserved. Further assume d(φ) = y. Then

f(tx(φ)) = ([tx(φ)]σ, x) = (εx([φ]σ), x) = t′x([φ]σ, y) = g(tx)(f(φ)).

This proves that the pair of maps (f, g) is a labeled information algebra homomorphism. Consider any element ([φ]σ, x) in LDL. Then this is the image of the element φ from L, so the map f is onto. Further, ([φ]σ, x) = ([ψ]σ, y) implies x = y and [φ]σ = [ψ]σ. By definition of the map f, d(φ) = x

and d(ψ) = y = x. But this, together with [φ]σ = [ψ]σ, implies φ = ψ. The map f is therefore one-to-one. Thus, the pair (f, g) is a labeled information algebra isomorphism and therefore L =∼ LDL. Second, consider the domain-free case, the pair of maps defined in (7.4). Consider elements φ and ψ from D with supports x and y respectively. Then

f(φ · ψ) = [(φ · ψ, x ∨ y)]σ = [(φ, x)]σ ·′ [(ψ, y)]σ = f(φ) ·′ f(ψ).

Further, f(0) = [(0, x)]σ and f(1) = [(1, x)]σ are clearly the null and unit elements in DLD. Next, assume that y is a support of the element ψ in D. Then

f(εx(ψ)) = [(εx(ψ), x)]σ = [tx(ψ, y)]σ = ε′x([(ψ, y)]σ) = g(εx)(f(ψ)).

So the pair (f, g) is a domain-free information algebra homomorphism. If [(ψ, x)]σ is an element of DLD, then x is a support of ψ and f maps ψ to [(ψ, x)]σ. So the map f is onto. Assume that [(φ, x)]σ = [(ψ, y)]σ. Then x and y are supports of φ and ψ respectively, and (φ, x) ≡σ (ψ, y) means that (φ, x ∨ y) = (ψ, x ∨ y). This shows that φ = ψ; the map f is one-to-one. Therefore, the pair (f, g) is a domain-free information algebra isomorphism, and D ≅ DLD. □

According to this theorem, labeled and domain-free algebras are dual in the technical sense of the theorem. We may freely pass from labeled to domain-free algebras and back. These two kinds of algebras are two sides of the same coin. Labeled information algebras are better suited for computational purposes, whereas domain-free information algebras are usually preferred for theoretical studies. For labeled and domain-free valuation algebras a similar duality, a special case of the duality introduced above, has been shown in (Kohlas, 2003a), provided that the semigroup of extraction operators E in the domain-free case is a lattice.

7.4 Separativity

8 Information Order

8.1 The Idempotent Case

Information may be, in informal terms, more or less precise, more or less informative. This should be reflected by some order between pieces of information. Such orders are the subject of the present section. Information order can be studied both in labeled and in domain-free information algebras. We propose to base our discussion on domain-free information algebras. Let then (Ψ, D; ≤, ⊥, ·, ε) be a domain-free generalised information algebra. The basic idea is that a piece of information is more informative than another one if one needs to add a further piece of information to the second one to get the first one. So, we define, for φ, ψ ∈ Ψ,

φ ≤ ψ, iff there exists χ ∈ Ψ such that ψ = φ · χ. (8.1)

This relation satisfies

1. Reflexivity: ψ ≤ ψ, since ψ = ψ · 1,

2. Transitivity: φ ≤ ψ and ψ ≤ η imply φ ≤ η, since ψ = φ · χ1 and η = ψ · χ2 imply η = φ · χ1 · χ2.

Antisymmetry, however, does not hold in general. Therefore, the relation ≤ defined in (8.1) is a preorder on Ψ. If the information algebra (Ψ, D; ≤, ⊥, ·, ε) is idempotent, then φ ≤ ψ if and only if φ · ψ = ψ. In fact, ψ = φ · χ gives, by idempotency, if both sides are combined with ψ, ψ = (φ · χ) · ψ = φ · (φ · χ) · ψ = φ · ψ · ψ = φ · ψ. In idempotent information algebras the relation ≤ is a partial order, since φ ≤ ψ and ψ ≤ φ imply φ = ψ · φ = ψ. Here φ ≤ ψ means that nothing is gained if the piece of information φ is added to ψ; the information in φ is already covered by ψ. Note that in this idempotent case

1. 1 ≤ ψ ≤ 0 for all ψ ∈ Ψ,

2. φ, ψ ≤ φ · ψ,

3. φ ≤ ψ implies φ · η ≤ ψ · η for all η ∈ Ψ,

4. εx(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ,

5. φ ≤ ψ implies εx(φ) ≤ εx(ψ) for all x ∈ D,

6. x ≤ y implies εx(ψ) ≤ εy(ψ) for all ψ ∈ Ψ.

These are clearly properties one would expect from an information order in general: Vacuous information is least informative; contradiction (which, properly speaking, is not information) is the greatest element in the information order; combined information is more informative than each of its parts; the order is compatible with combination; and extraction does not increase information. Note also that the preorder defined in (8.1) satisfies the first three of these requirements. The remaining ones are not guaranteed in general and need special consideration. This will be discussed in the following sections. The partial order in idempotent information algebras really is a neat order of information. Therefore we also call idempotent information algebras proper information algebras. This order has been discussed and studied in detail for the case of idempotent valuation algebras in (Kohlas, 2003a; Kohlas & Schmid, 2014a; Kohlas & Schmid, 2014b). Most of the results for this case carry over to generalised information algebras. This will be discussed in Section 9. In the following two sections, we present two interesting cases where the preorder satisfies also the other requirements stated above. This will also show the relation of the preorder to the partial order of idempotent information and illuminate the limits of the preorder.
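In the idempotent subset instance, where combination is intersection, the information order φ ≤ ψ iff φ · ψ = ψ becomes reverse set inclusion (a smaller set is more precise). A minimal sketch with illustrative data:

```python
# Information order in the idempotent subset instance (combination = ∩):
# φ ≤ ψ iff φ·ψ = ψ, i.e. ψ ⊆ φ.
FULL = {(a, b) for a in (0, 1) for b in (0, 1)}   # vacuous information 1
phi = {(0, 0), (0, 1)}                            # "X = 0"
psi = {(0, 1)}                                    # "X = 0 and Y = 1"

def leq(f, g):
    """φ ≤ ψ: adding φ to ψ changes nothing."""
    return f & g == g
```

This exhibits the chain 1 ≤ φ ≤ ψ ≤ 0 (with 0 the empty, contradictory set) and the failure of ψ ≤ φ, so ≤ is a genuine partial order here.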

8.2 Regular Algebras

Order in semigroup theory has been studied in several papers; we cite only two of them, (Nambooripad, n.d.; Mitsch, n.d.). These papers study natural orders, that is, orders which can be defined in terms of the operations of the semigroup. This is exactly the case for our definition above. Of particular interest in these theories are regular semigroups. In the context of valuation algebras, such regular semigroups, or rather their generalisation to valuation algebras, turned out to be of interest in two respects: They allow partial division to be introduced into the algebra, which makes it possible to adapt local computation architectures known for Bayesian networks to valuation algebras (Lauritzen & Jensen, 1997; Kohlas, 2003a). Secondly, this division also permits conditioning, as known in probability, to be generalized to valuation algebras (Kohlas, 2003a). Now, as we shall see in this and the subsequent section, this is relevant for information order too. We summarize here the theory of regular semigroups, adapt it to regular generalised information algebras, generalizing the theory of regular valuation algebras (Kohlas, 2003a), and discuss how this applies to information order as defined in (8.1). We start with the definition of regularity in information algebras.

Definition 16 Regular Information Algebras: Let (Ψ, D; ≤, ⊥, ·, ε) be a generalised domain-free information algebra. An element ψ ∈ Ψ is called regular if for all x ∈ D there is an element χ ∈ Ψ with support x such that

ψ = εx(ψ) · χ · ψ. (8.2)

The information algebra (Ψ, D; ≤, ⊥, ·, ε) is called regular if all its elements are regular.

Of course, the element χ above in the definition of regularity depends both on x and ψ, although we do not express this dependence explicitly. If y is a support of ψ, then regularity implies also

ψ = ψ · χ · ψ. (8.3)

This is the definition of regularity in a semigroup (Ψ; ·) and establishes the link to semigroup theory; see for example (Clifford & Preston, 1967) and the work cited above. Note that in these references semigroups are not assumed to be commutative, as is the case here. In this section we assume that (Ψ, D; ≤, ⊥, ·, ε) is regular. Two elements φ and ψ from Ψ are called inverses if

φ = φ · ψ · φ and ψ = ψ · φ · ψ (8.4)

We keep with the notation in the literature, although in our commutative case we could also have written φ = φ · φ · ψ, etc. Note that φ ≤ ψ and ψ ≤ φ if φ and ψ are inverses. The following results are well known from semigroup theory (see for instance (Kohlas, 2003a)): If φ = φ · ψ · φ, then φ and ψ · φ · ψ are inverses. Each element thus has an inverse, and this inverse is unique. If φ and ψ are inverses, then f = φ · ψ is an idempotent element, f · f = f. If S is a subset of Ψ, define ψ · S to be the set {ψ · φ : φ ∈ S}. The set ψ · Ψ is called the principal filter generated by ψ, since ψ · Ψ = {ψ · χ : χ ∈ Ψ} = {φ : ψ ≤ φ}. There exists for any ψ ∈ Ψ a unique idempotent fψ such that ψ · Ψ = fψ · Ψ. The Green relation is defined as

φ ≡γ ψ if φ · Ψ = ψ · Ψ. (8.5)

It is an equivalence relation in Ψ. Its equivalence classes [ψ]γ are groups for all ψ ∈ Ψ. So Ψ is a union of disjoint groups. The unit element of the group [φ]γ is the idempotent fφ. Note that if φ and ψ are Green-equivalent, then φ ≤ ψ and ψ ≤ φ.
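A toy computation may help: in the commutative multiplicative semigroup ({0, . . . , 4}, · mod 5) every element is regular (the nonzero elements form a group), so the Green classes and their idempotent units can be computed directly. This is only a plain semigroup illustration, not a full information algebra, and all names are illustrative:

```python
# Green classes in the commutative multiplicative semigroup ({0,...,4}, * mod 5).
S = range(5)

def mul(a, b):
    return (a * b) % 5

def principal(a):
    """The principal filter a·Ψ = {a·s : s ∈ Ψ}."""
    return frozenset(mul(a, s) for s in S)

classes = {}                     # group elements by the Green relation
for a in S:
    classes.setdefault(principal(a), set()).add(a)
```

The computation yields two Green classes, {0} and {1, 2, 3, 4}, each a group with unit given by its idempotent (0 and 1 respectively).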

Consider now the idempotents F = {fψ : ψ ∈ Ψ}. They form an idempotent sub-semigroup of (Ψ; ·). According to Section 8.1, they are partially ordered² by fφ ≤ fψ if fφ · fψ = fψ. The unit 1 and the null element 0 are idempotents, 0, 1 ∈ F. So, the idempotents F form a bounded semilattice where fφ · fψ = fφ ∨ fψ. Further, we also have

fφ · fψ = fφ·ψ. (8.6)

Since the idempotents fφ uniquely represent their class [φ]γ, we may also define a partial order among classes by [φ]γ ≤ [ψ]γ if fφ ≤ fψ. Then we obtain

[φ · ψ]γ = [φ]γ ∨ [ψ]γ. (8.7)

We now summarize some results about the preorder in Ψ and the partial order among the idempotents in F and among the classes [φ]γ.

Lemma 16 Let (Ψ,D; ≤, ⊥, ·, ) be a regular generalised information algebra. Then

1. φ ≤ ψ iff [φ]γ ≤ [ψ]γ,

2. φ ≤ ψ iff ψ · Ψ = φ · ψ · Ψ,

3. φ ≤ ψ iff ψ · Ψ ⊆ φ · Ψ,

4. φ ≤ ψ and ψ ≤ φ iff φ ≡γ ψ.

(The partial order among idempotents used here is the opposite of the one usually considered in the literature, but it corresponds better to our purposes of information order, as we shall see.)

Proof. 1.) Assume φ ≤ ψ, that is, ψ = φ · χ for some χ. Then [φ · χ]γ = [φ]γ ∨ [χ]γ = [ψ]γ. This shows that [φ]γ ≤ [ψ]γ.

Conversely, assume [φ]γ ≤ [ψ]γ, so that [φ · ψ]γ = [φ]γ ∨ [ψ]γ = [ψ]γ. This means that ψ · Ψ = φ · ψ · Ψ, hence ψ ∈ φ · ψ · Ψ, therefore ψ = φ · ψ · χ for some χ. But this means that φ ≤ ψ.

2.) We have just proved that ψ · Ψ = φ · ψ · Ψ implies φ ≤ ψ. Assume then that φ ≤ ψ. By item 1 we have also fφ ≤ fψ, or fφ · fψ = fφ·ψ = fψ. But then ψ · Ψ = fψ · Ψ = fφ·ψ · Ψ = φ · ψ · Ψ.

3.) If φ ≤ ψ, then ψ = φ · χ. Consider η ∈ ψ · Ψ; then η = ψ · χ′ = φ · χ · χ′. So η ∈ φ · Ψ. Conversely, if ψ · Ψ ⊆ φ · Ψ, then ψ ∈ φ · Ψ, hence there is a χ such that ψ = φ · χ, and thus φ ≤ ψ.

4.) By item 2 we have φ ≤ ψ iff ψ · Ψ = φ · ψ · Ψ and ψ ≤ φ iff φ · Ψ = φ · ψ · Ψ. Therefore φ · Ψ = ψ · Ψ, hence φ ≡γ ψ. □

So far, this is essentially semigroup theory. We now consider extraction and thus extend this order theory to information algebras. Here is a first important result:

Theorem 56 Let (Ψ,D; ≤, ⊥, ·, ) be a regular generalised information algebra. The Green relation ≡γ is a congruence relative to combination and extraction in the algebra (Ψ,D; ≤, ⊥, ·, ).

Proof. The relation ≡γ is an equivalence relation. If φ ≡γ ψ, then [φ]γ = [ψ]γ. Consider any element η of Ψ. Then [φ]γ ∨ [η]γ = [ψ]γ ∨ [η]γ, hence [φ · η]γ = [ψ · η]γ and thus φ · η ≡γ ψ · η.

Assume again φ ≡γ ψ, so that φ · Ψ = ψ · Ψ, and consider the operator x. From φ ∈ ψ · Ψ we conclude that φ = ψ · χ for some χ ∈ Ψ, and therefore x(φ) = x(ψ · χ). By regularity we have ψ = x(ψ) · χ′ · ψ, and thus x(φ) = x(x(ψ) · χ′ · χ · ψ) = x(ψ) · x(χ′ · χ · ψ) (Lemma 12). This shows that x(ψ) ≤ x(φ). By symmetry we have also x(φ) ≤ x(ψ); therefore, by Lemma 16 item 4, x(φ) ≡γ x(ψ). This proves that ≡γ is a congruence. □

Here follow a few results on order and extraction, which establish in particular the validity of the expected properties 4.) to 6.) of an information order formulated above (Section 8.1).

Theorem 57 Let (Ψ,D; ≤, ⊥, ·, ) be a regular generalised information algebra. Then

1. x(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ.

2. φ ≤ ψ implies x(φ) ≤ x(ψ) for all x ∈ D.

3. x ≤ y implies x(ψ) ≤ y(ψ) for all ψ ∈ Ψ.

Proof. 1.) By regularity ψ = ψ · χ · x(ψ), where x(χ) = χ. Applying the extraction operator on both sides gives x(ψ) = x(ψ) · x(ψ) · χ, hence x(ψ) ≥ χ and therefore [x(ψ)]γ ≥ [χ]γ (Lemma 16). From the regularity formula we obtain also [ψ]γ = [ψ]γ ∨ [χ]γ ∨ [x(ψ)]γ = [ψ]γ ∨ [x(ψ)]γ, hence [x(ψ)]γ ≤ [ψ]γ. This implies x(ψ) ≤ ψ (Lemma 16).

2.) If φ ≤ ψ, then ψ · Ψ = φ · ψ · Ψ (Lemma 16). This implies ψ = ψ · φ · χ for some χ ∈ Ψ. By regularity we have φ = φ · x(φ) · µ and ψ = ψ · x(ψ) · µ′, where x is a support of both µ and µ′. From this we deduce

x(ψ) = x(ψ · φ · χ)
= x(x(ψ) · x(φ) · µ · µ′ · ψ · φ · χ)
= x(ψ) · x(φ) · x(µ · µ′ · ψ · φ · χ). (8.8)

This proves that x(φ) ≤ x(ψ).

3.) Assume that z is a support of ψ. By C1, z⊥y|y, and by C3, z⊥x|y, since x ≤ y. It follows from Axiom A4 that x(ψ) = x(y(ψ)). Then items 1 and 2 above show that x(ψ) ≤ y(ψ). □

Based on Theorem 56, we may consider the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ), which by general results of universal algebra is still a generalised information algebra. In fact, we define the following operations between classes

1. Combination: [φ]γ · [ψ]γ = [φ · ψ]γ,

2. Extraction: x([ψ]γ) = [x(ψ)]γ.

We denote the operations of combination and extraction in Ψ/γ by the same symbols as in Ψ; there is no risk of confusion. The projection pair of maps (f, g), where f(ψ) = [ψ]γ and g(x) = x (on the right-hand side, x denotes the operator in Ψ/γ), is clearly a homomorphism. A homomorphism maintains order. In addition, it turns out that the information algebra (Ψ/γ, D; ≤, ⊥, ·, ) is idempotent.

Theorem 58 Let (Ψ,D; ≤, ⊥, ·, ) be a regular generalised information algebra and ≡γ the Green relation. Then the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ) is an idempotent generalised information algebra, homomorphic to (Ψ,D; ≤, ⊥, ·, ).

Proof. That (Ψ/γ, D; ≤, ⊥, ·, ) is a generalised information algebra follows since the pair of maps defined above forms a homomorphism. Idempotency follows from x([ψ]γ) = [x(ψ)]γ ≤ [ψ]γ (Theorem 57), hence [ψ]γ · x([ψ]γ) = [ψ]γ ∨ x([ψ]γ) = [ψ]γ. □

Instead of the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ) we can also consider the idempotents in the classes, because there is a one-to-one association between idempotents and their classes. In the signature (F,D; ≤, ⊥, ·, ), where F = {fψ : ψ ∈ Ψ}, again the two operations of combination and extraction are defined:

1. Combination: fφ · fψ = fφ·ψ,

2. Extraction: x(fψ) = fx(ψ).

This algebra is an idempotent subalgebra of (Ψ,D; ≤, ⊥, ·, ), hence an idempotent generalised information algebra. Because of the idempotency, it can be considered as the deterministic part of (Ψ,D; ≤, ⊥, ·, ). We refer to (Kohlas, 2003a), where, in the context of labeled valuation algebras, probability potentials as they occur in Bayesian networks are considered. They form a regular valuation algebra. The idempotents fψ correspond to the carrier sets of the probability potentials ψ, that is, the sets of tuples x where the probability potential ψ = p(x) is positive. In this context, the idempotents can be considered as events, and constitute in this sense the deterministic part of the valuation algebra of probability potentials. By the map [ψ]γ 7→ fψ (together with the identity on D), the algebras (Ψ/γ, D; ≤, ⊥, ·, ) and (F,D; ≤, ⊥, ·, ) are isomorphic.

To conclude this section, we remark that the relation ψ = φ · χ, which holds if φ ≤ ψ, can be linked to conditionals in regular algebras, providing a generalisation of conditional probability distributions to general valuation or even generalised information algebras. For this aspect of regular valuation algebras we refer to (Kohlas, 2003a). Those results could be extended to generalised information algebras, but we shall not pursue this line of inquiry. Further, we remark that the preorder (8.1) can be refined to a partial order in a regular generalised

information algebra, if the condition ψ = fψ · φ is added,

φ ≤ ψ, iff [φ]γ ≤ [ψ]γ and ψ = fψ · φ. (8.9)

This is the partial order studied in semigroup theory (Nambooripad, n.d.; Mitsch, n.d.), the goal there being to study the structure of semigroups. The condition ψ = fψ · φ means in our context that ψ is obtained by combination of φ with a deterministic information fψ. So ψ results from a kind of conditioning of φ on fψ. We refer to (Kohlas, 2003a) for an illustration in the context of probability potentials. So, ψ is, according to this order, more informative than φ, if it is obtained by conditioning of φ. Although this makes sense, this order does not seem very interesting from the point of view of information algebras. For example, it does not follow that x(ψ) ≤ ψ.
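Conditioning on a deterministic idempotent can be made concrete in the probability-potential toy model (a hedged sketch, assuming potentials as tuples under pointwise multiplication): the idempotent fψ is the indicator of an event, and ψ = fψ · φ restricts φ to that event.

```python
def combine(phi, psi):
    # combination of potentials: pointwise multiplication
    return tuple(a * b for a, b in zip(phi, psi))

def f(phi):
    # idempotent f_phi: the indicator of the carrier of phi
    return tuple(1.0 if p > 0 else 0.0 for p in phi)

phi = (0.2, 0.3, 0.5)
event = (1.0, 1.0, 0.0)        # a deterministic piece of information
psi = combine(event, phi)      # psi = f_psi · phi: conditioning phi on the event
assert psi == (0.2, 0.3, 0.0)
assert f(psi) == event         # the event is exactly the idempotent of psi
```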

8.3 Separative Algebras

Here we go one step beyond regular algebras. Consider again a domain-free information algebra (Ψ,D; ≤, ⊥, ·, ). Instead of assuming it to be regular and then using the Green relation to study order, we start with a congruence similar to the Green relation and base the study of order on this relation. Thus, assume that there is a congruence ≡γ relative to combination and extraction in Ψ such that

x(ψ) · ψ ≡γ ψ (8.10)

for all ψ ∈ Ψ and x ∈ D. Since any element ψ has a support, we have also

ψ · ψ ≡γ ψ.

The equivalence classes [ψ]γ are semigroups. Indeed, if φ, ψ ∈ [ψ]γ, then φ · ψ ≡γ ψ · ψ since ≡γ is a congruence. But ψ · ψ ≡γ ψ, thus φ · ψ ≡γ ψ, hence φ · ψ ∈ [ψ]γ. As in the previous section, the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ) is an idempotent information algebra, homomorphic to (Ψ,D; ≤, ⊥, ·, ), and the operations are defined as

1. Combination: [φ]γ · [ψ]γ = [φ · ψ]γ.

2. Extraction: x([ψ]γ) = [x(ψ)]γ.

Idempotency of (Ψ/γ, D; ≤, ⊥, ·, ) follows from condition (8.10). Since the classes form an idempotent algebra, they are partially ordered by [φ]γ ≤ [ψ]γ if [φ]γ · [ψ]γ = [ψ]γ. Under this order we have

[φ]γ · [ψ]γ = [φ]γ ∨ [ψ]γ.

As in the previous section, we would like this partial order of classes to represent the preorder defined in (8.1) in the sense that φ ≤ ψ iff [φ]γ ≤ [ψ]γ. But for this to hold, we need a further condition, since the classes [ψ]γ are in general, in contrast to regular algebras, not groups. In semigroup theory, embeddings of semigroups into a disjoint union of groups are studied; see (Clifford & Preston, 1967). A sufficient condition for this to be possible is cancellativity, that is

φ · ψ = φ · ψ′ (8.11)

implies ψ = ψ′. We assume therefore that all semigroups [φ]γ are cancellative. This leads to the following definition.

Definition 17 Separative Information Algebras: Let (Ψ,D; ≤, ⊥, ·, ) be a generalised domain-free information algebra. It is called separative, if there exists a congruence ≡γ relative to combination and extraction in Ψ such that

1. x(ψ) · ψ ≡γ ψ for all ψ ∈ Ψ and for all x ∈ D.

2. The semigroups [ψ]γ are cancellative for all ψ ∈ Ψ.
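Why cancellativity is required per class rather than globally can be seen in the potential toy model (a hedged illustration, assuming pointwise multiplication of tuples): the full semigroup is not cancellative, since a zero component absorbs information, but within a class of fixed carrier the positive entries can be divided out.

```python
def combine(phi, psi):
    return tuple(a * b for a, b in zip(phi, psi))

phi = (0.5, 0.0)
psi1, psi2 = (1.0, 2.0), (1.0, 3.0)
# phi · psi1 = phi · psi2 although psi1 != psi2: no cancellation globally
assert combine(phi, psi1) == combine(phi, psi2) and psi1 != psi2

# with a strictly positive a (full carrier), the factor b is recoverable
a, b = (2.0, 4.0), (0.5, 0.25)
prod = combine(a, b)
assert tuple(p / q for p, q in zip(prod, a)) == b
```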

We remark that separative valuation algebras have been studied in (Kohlas, 2003a) with respect to local computation with division and to generalisation of conditionals from probability to general valuations or information. As in the case of regular algebras, the theory of conditionals is closely related to natural order. Here we focus on information order in generalised information algebras. For examples of separative valuation algebras, we refer to (Kohlas, 2003a; Pouly & Kohlas, 2011).

A cancellative semigroup such as [ψ]γ can be embedded into a group. The classical procedure is as follows: Consider ordered pairs (φ, ψ) for φ, ψ ∈ [ψ]γ and define equality among pairs by

(φ, ψ) = (φ′, ψ′) iff φ · ψ′ = φ′ · ψ.

Let γ(ψ) denote the set of these pairs from [ψ]γ. Then we define the operation

(φ, ψ) · (φ′, ψ′) = (φ · φ′, ψ · ψ′).

With this operation γ(ψ) becomes a group thanks to cancellativity. Its unit is (ψ, ψ) and the inverse of (φ, ψ) is (ψ, φ). The class [ψ]γ is embedded into γ(ψ) as a semigroup by the map

ψ 7→ (ψ · ψ, ψ).

Define

Ψ∗ = ⋃ψ∈Ψ γ(ψ).

In order to distinguish elements of Ψ∗ from those of Ψ, we denote elements of Ψ∗ by lower case letters like a, b, . . .. The union of groups Ψ∗ becomes a semigroup, if we define for a = (φa, ψa) and b = (φb, ψb),

a · b = (φa · φb, ψa · ψb).

This operation is well-defined, associative and commutative. Thus (Ψ∗; ·) is a commutative semigroup and (Ψ; ·) is embedded into it as a semigroup by the map ψ 7→ (ψ · ψ, ψ) as can easily be verified. In the sequel, in order to simplify notation, we denote the elements (ψ · ψ, ψ) of the image of (Ψ; ·) under this map simply by ψ.
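The classical pair construction can be sketched for a familiar cancellative commutative semigroup, the positive integers under multiplication (an illustrative assumption, not one of the paper's instances); pairs (a, b) behave like fractions a/b, and ψ ↦ (ψ · ψ, ψ) is the embedding used in the text.

```python
from math import gcd

class Pair:
    """A pair (a, b) over the positive integers, in reduced normal form,
    so that the cross-multiplication equality phi·psi' = phi'·psi is ==."""
    def __init__(self, a, b):
        g = gcd(a, b)
        self.a, self.b = a // g, b // g
    def __mul__(self, other):
        return Pair(self.a * other.a, self.b * other.b)
    def inverse(self):
        return Pair(self.b, self.a)
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

def embed(n):
    # the embedding psi -> (psi · psi, psi) from the text
    return Pair(n * n, n)

# the embedding is a semigroup homomorphism into the union of groups
assert embed(6) * embed(10) == embed(60)
# each element now has a group inverse, with unit (psi, psi)
assert embed(6) * embed(6).inverse() == Pair(1, 1)
```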

We may carry over the order between the classes [ψ]γ to the groups γ(ψ), since there is a one-to-one relation between classes and groups. Hence γ(φ) ≤ γ(ψ) iff [φ]γ ≤ [ψ]γ. Then we deduce that

γ(φ · ψ) = γ(φ) ∨ γ(ψ).

We extend now the natural order (8.1) to the semigroup (Ψ∗; ·),

a ≤ b, iff there exists a c ∈ Ψ∗ such that b = a · c. (8.12)

Note that for elements of Ψ, this preorder φ ≤ ψ admits that in ψ = φ · c the factor c which completes φ to ψ need no longer be an element of Ψ, but only of Ψ∗.

Lemma 17 In Ψ∗ we have a ≤ b iff γ(a) ≤ γ(b).

Proof. Assume first a ≤ b, hence a · c = b for some c ∈ Ψ∗. Then γ(b) = γ(a · c) = γ(a) ∨ γ(c), hence γ(a) ≤ γ(b). Conversely, assume γ(a) ≤ γ(b). Then γ(b) = γ(a) ∨ γ(b) = γ(a · b). Therefore a · b and b both belong to the group γ(b), and therefore b = a · b · (a · b)−1 · b, thus a ≤ b. □

We remark that for any element a of Ψ∗ we have a = a · a−1 · a. This means that the semigroup (Ψ∗; ·) is regular. And a ≡γ b implies a · Ψ∗ = b · Ψ∗. In fact, if d ∈ a · Ψ∗, then d = a · c for some c ∈ Ψ∗. It follows then that d = b · b−1 · a · c, hence d ∈ b · Ψ∗. In the same way it follows that d ∈ b · Ψ∗ implies d ∈ a · Ψ∗, hence a · Ψ∗ = b · Ψ∗. Conversely, if a · Ψ∗ = b · Ψ∗, then a = b · c and b = a · c′ for some c, c′ ∈ Ψ∗. This means that a ≤ b and b ≤ a, hence γ(a) = γ(b), or a ≡γ b. This shows that the congruence ≡γ is the Green relation in the regular semigroup (Ψ∗; ·). As a consequence of this remark and of Lemma 17 we have, as in the previous section (Lemma 16), the following result:

Lemma 18 Let (Ψ,D; ≤, ⊥, ·, ) be a separative information algebra. Then

1. a ≤ b and b ≤ a iff γ(a) = γ(b),

2. φ ≤ ψ iff [φ]γ ≤ [ψ]γ,

3. φ ≤ ψ and ψ ≤ φ iff [φ]γ = [ψ]γ.

As in the case of regular information algebras, we have for separative information algebras the same results regarding order and extraction (see Theorem 57).

Theorem 59 Let (Ψ,D; ≤, ⊥, ·, ) be a separative generalised information algebra. Then

1. x(ψ) ≤ ψ for all x ∈ D and ψ ∈ Ψ.

2. φ ≤ ψ implies x(φ) ≤ x(ψ) for all x ∈ D.

3. x ≤ y implies x(ψ) ≤ y(ψ) for all ψ ∈ Ψ.

Proof. 1.) From (8.10) we obtain γ(x(ψ) · ψ) = γ(x(ψ)) ∨ γ(ψ) = γ(ψ). This shows that γ(x(ψ)) ≤ γ(ψ), which implies x(ψ) ≤ ψ (Lemma 17).

2.) From φ ≤ ψ we obtain γ(φ) ≤ γ(ψ), and from item 1 just proved, γ(x(φ)) ≤ γ(φ). Thus we have γ(x(φ) · ψ) = γ(x(φ)) ∨ γ(ψ) = γ(ψ). Further, we have x(x(φ) · ψ) = x(φ) · x(ψ). Therefore, from the congruence property of ≡γ, we conclude that γ(x(φ) · x(ψ)) = γ(x(ψ)), and this shows that x(φ) ≤ x(ψ).

3.) Is proved exactly as item 3 of Theorem 57. □

If (Ψ,D; ≤, ⊥, ·, ) is a separative generalised information algebra, then the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ) is an idempotent information algebra, homomorphic to (Ψ,D; ≤, ⊥, ·, ), as noted above. Any group γ(ψ) has a unique unit and idempotent element, denoted by fψ. The idempotent information algebra (F,D; ≤, ⊥, ·, ) of the units of the groups γ(ψ), with the operations defined as follows

1. Combination: fφ · fψ = fφ·ψ,

2. Extraction: x(fψ) = fx(ψ),

is isomorphic to the quotient algebra (Ψ/γ, D; ≤, ⊥, ·, ). Note however that the elements of F do not, in general, belong to Ψ. Nevertheless, we may still consider the elements of F as the deterministic parts of Ψ∗. As in the regular case, we may define an order

φ ≤ ψ, iff [φ]γ ≤ [ψ]γ and ψ = fψ · φ. (8.13)

This is again a partial order, since φ ≤ ψ and ψ ≤ φ imply γ(φ) = γ(ψ) and ψ = fψ · φ. But fψ = fφ, hence ψ = fφ · φ = φ. The expression fψ · φ is again a kind of conditioning, namely the combination of a deterministic element fψ with an information element φ. We refer to (Kohlas, 2003a) for a discussion of the separative valuation algebra of densities, which illustrates these statements. Again, it makes sense that an information ψ obtained from another one by conditioning, ψ = fψ · φ, where fφ ≤ fψ, is considered to be more informative. At least in probability theory this seems evident.

9 Proper or Idempotent Information

9.1 Ideal Completion

In this section we consider domain-free idempotent generalised information algebras (Ψ,D; ≤, ⊥, ·, ), that is, proper information algebras. Their associated idempotent valuation algebras were called information algebras in (Kohlas,

2003a). There it was shown that the partial order introduced by idempotency plays an important role in the theory of idempotent information algebras; see also (Kohlas & Schmid, 2014a; Kohlas & Schmid, 2014b). Most of these results carry over to the more general case of idempotent generalised information algebras. Some of them will be presented here.

In an idempotent information algebra, we have defined φ ≤ ψ if φ · ψ = ψ; see Section 8.1. So ψ is more informative than φ if adding φ to ψ gives nothing new. Another way to express this is to say that the piece of information φ is contained in ψ, or that the piece of information φ is implied by ψ. As a consequence, combination corresponds to the supremum in this order, φ · ψ = φ ∨ ψ. So the idempotent semigroup (Ψ; ·) determines a join-semilattice (Ψ; ≤). If we want to stress the point of view of order, we write φ ∨ ψ instead of φ · ψ.

Instead of looking at a particular piece of information ψ, we may look at families of pieces of information I. Such a family is consistent and complete if for any ψ ∈ I all elements less informative than ψ, that is, implied by ψ, belong also to I, and if φ and ψ belong to I, then the combined information φ · ψ belongs to I. This means that I is an ideal in the semilattice (Ψ; ≤). More formally, I is an ideal if

1. φ ≤ ψ and ψ ∈ I imply φ ∈ I,

2. φ, ψ ∈ I imply φ ∨ ψ ∈ I.

The set ↓ψ of all elements less informative than or implied by ψ forms an ideal, a principal ideal. Note that the unit is in all ideals, and if ψ belongs to an ideal, then all extractions x(ψ) belong to the ideal also. The null element 0 belongs only to the ideal Ψ. All ideals different from Ψ are called proper ideals. An ideal represents a piece of information too. In fact, we may extend the operations of combination and extraction from the algebra (Ψ,D; ≤, ⊥, ·, ) to the set IΨ of its ideals:

1. Combination:

I1 · I2 = {ψ ∈ Ψ: ∃ψ1 ∈ I1, ψ2 ∈ I2 such that ψ ≤ ψ1 · ψ2}. (9.1)

2. Extraction:

x(I) = {ψ ∈ Ψ: ∃φ ∈ I such that ψ ≤ x(φ)}. (9.2)

These operations are well-defined, since they yield ideals in both cases.

It turns out that the set of ideals IΨ of Ψ with these operations in fact becomes an information algebra. In order to show that, we need some preparations. First, the intersection of an arbitrary family of ideals is still an ideal. Therefore, the ideal generated by a subset X of Ψ can be defined as the smallest ideal containing X,

I(X) = ⋂{I : I ideal in Ψ, X ⊆ I}.

Alternatively, we have also

I(X) = {ψ ∈ Ψ: ∃ψ1, . . . , ψn ∈ X such that ψ ≤ ψ1 · ... · ψn}. (9.3)

In particular, we have I1 · I2 = I(I1 ∪ I2). If X is a finite subset of Ψ, then I(X) = ↓(∨X); that is, the ideal generated by a finite X is the principal ideal of the element ∨X ∈ Ψ. These are well-known results (Kohlas, 2003a).

From lattice theory we know that a system closed under intersections, a so-called ∩-system, forms a complete lattice under the partial order of inclusion; see (Davey & Priestley, 2002). Infimum is intersection and supremum is given by

∨Y = ⋂{I : I ideal in Ψ, ⋃J∈Y J ⊆ I},

where Y is any family of ideals. In particular, we have I1 · I2 = I1 ∨ I2.
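The ideal constructions can be sketched in a toy idempotent algebra (an illustrative assumption: pieces of information are subsets of a small frame, combination is union, φ ≤ ψ iff φ ⊆ ψ). For a finite generating set X, the generated ideal is the principal ideal of the supremum ∨X.

```python
from itertools import chain, combinations

FRAME = frozenset('abc')
# all pieces of information: the subsets of the frame
PSI = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(FRAME), r) for r in range(len(FRAME) + 1))]

def down(phi):
    # principal ideal: everything implied by phi
    return {psi for psi in PSI if psi <= phi}

def ideal(X):
    # ideal generated by a finite X, cf. (9.3): everything below
    # some finite combination (here: the union) of elements of X
    top = frozenset().union(*X) if X else frozenset()
    return {psi for psi in PSI if psi <= top}

X = [frozenset('a'), frozenset('b')]
# for finite X, I(X) is the principal ideal of the supremum ∨X
assert ideal(X) == down(frozenset('ab'))
```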

For IΨ to become an information algebra in the sense of this paper, any ideal must have a support x ∈ D, that is, x(I) = I. But if the semilattice D does not have a greatest element, this may not hold. If D has a greatest element >, then by the support axiom for the information algebra (Ψ,D; ≤, ⊥, ·, ) this element is necessarily a support for every element of Ψ. In this case any ideal of Ψ has at least this element as support. If D has no greatest element, we may adjoin one in the following way: Consider (D ∪ {>}; ≤, ⊥), where x ≤ > and x ∨ > = > for all elements x of D ∪ {>}. Extend the conditional independence relation with x⊥y|> and x⊥>|>, >⊥x|> for all x and y in D ∪ {>}. Then (D ∪ {>}; ≤, ⊥) is still a q-separoid. Further, define > as the identity map of Ψ. Then, clearly, (Ψ,D ∪ {>}; ≤, ⊥, ·, ) is still an idempotent information algebra; in particular the axioms of extraction, combination and idempotency are still valid in this extended structure. Therefore, we assume in the sequel, without loss of generality, that D has a top element. Then (IΨ,D; ≤, ⊥, ·, ) becomes an idempotent generalised information algebra.

Theorem 60 Let (Ψ,D; ≤, ⊥, ·, ) be an idempotent generalised domain-free information algebra such that (D; ≤) has a greatest element. Then (IΨ,D; ≤, ⊥, ·, ) is an idempotent generalised domain-free information algebra and (Ψ,D; ≤, ⊥, ·, ) is embedded into it.

Proof. Axiom A0 holds, since (D; ≤, ⊥) is the same q-separoid as in (Ψ,D; ≤, ⊥, ·, ). Axiom A1 (Semigroup) holds since IΨ is, under the order induced by combination, a complete lattice. Axiom A2 (Support) follows since the greatest element of D is a support of any ideal I. Further we must show that if x(I) = I and y ≥ x, then y(I) = I. We remark that if ψ ∈ x(I) = I, then ψ ≤ x(φ) = η for some φ ∈ I. Note that η has support x and belongs to I. So ψ ≤ η = x(η), η ∈ I. Therefore, if x(I) = I, then any element ψ of I is dominated by an element with support x from I. Now, clearly y(I) ⊆ I. Consider then an element ψ from I. By the preceding remark there is an element η in I with support x such that ψ ≤ η. By the support axiom for the algebra (Ψ,D; ≤, ⊥, ·, ), y ≥ x is then also a support for η. So we have ψ ≤ y(η). This shows that ψ ∈ y(I), hence y(I) = I. This verifies Axiom A2 for the algebra (IΨ,D; ≤, ⊥, ·, ). The unit in (IΨ; ·) is the principal ideal {1} and the null element is Ψ. It is evident that the Unit and Null Axiom A3 holds.

To verify the Extraction Axiom A4, we must show that x⊥y|z and x(I) = I implies y(I) = y(z(I)). Now, if ψ ∈ y(z(I)), then ψ ≤ y(φ) for some element φ such that φ ≤ z(η) for an element η in I. Then ψ ≤ y(z(η)) ≤ y(η). This shows that y(z(I)) ⊆ y(I). Consider then an element ψ ∈ y(I). There is then an element η with support x in I such that ψ ≤ y(η). By the Extraction Axiom A4 for the algebra (Ψ,D; ≤, ⊥, ·, ), we have y(η) = y(z(η)). Since z(η) belongs to z(I), this implies that ψ ∈ y(z(I)); therefore y(I) = y(z(I)) and Axiom A4 holds in the algebra (IΨ,D; ≤, ⊥, ·, ) too.

For the Combination Axiom A5, assume x⊥y|z and x(I1) = I1, y(I2) = I2. Then we must show z(I1 · I2) = z(I1) · z(I2). Consider an element ψ from z(I1 · I2), such that ψ ≤ z(φ) for a φ ∈ I1 · I2. Then there are elements φ1 = x(φ1) ∈ I1 and φ2 = y(φ2) ∈ I2 such that φ ≤ φ1 · φ2. This implies ψ ≤ z(φ1 · φ2). By the Combination Axiom A5 for the algebra (Ψ,D; ≤, ⊥, ·, ) we obtain then ψ ≤ z(φ1 · φ2) = z(φ1) · z(φ2), which shows that ψ ∈ z(I1) · z(I2). Conversely, consider ψ ∈ z(I1) · z(I2). Then ψ ≤ ψ1 · ψ2, where ψ1 ≤ z(φ1) and φ1 = x(φ1), φ1 ∈ I1, and, similarly, ψ2 ≤ z(φ2), with φ2 = y(φ2) and φ2 ∈ I2. So we have ψ ≤ z(φ1) · z(φ2), and by Axiom A5 for the algebra (Ψ,D; ≤, ⊥, ·, ), z(φ1) · z(φ2) = z(φ1 · φ2). So ψ ≤ z(φ1 · φ2) with

φ1 ∈ I1 and φ2 ∈ I2. This shows that ψ ∈ z(I1 · I2). Thus we obtain that z(I1 · I2) = z(I1) · z(I2).

Idempotency, Axiom A6, follows from x(I) ⊆ I, hence x(I)·I = x(I)∨I = I.

This proves that (IΨ,D; ≤, ⊥, ·, ) is an idempotent generalised information algebra. Consider now the pair of maps (f, g) defined by f(ψ) = ↓ψ and g(x) = x (on the left, x as extraction in Ψ; on the right, as extraction in IΨ). Then

f(φ · ψ) = ↓(φ · ψ) = ↓φ · ↓ψ,

f(x(ψ)) = ↓x(ψ) = x(↓ψ). (9.4)

These identities follow directly from the definition of combination and extraction in IΨ. Further, we have f(1) = {1} and f(0) = Ψ. So the pair of maps (f, g) is a homomorphism. It is also one-to-one, since ↓φ = ↓ψ implies φ = ψ. So (f, g) is an embedding of (Ψ,D; ≤, ⊥, ·, ) into (IΨ,D; ≤, ⊥, ·, ). This completes the proof. □

Ideal completion of an idempotent valuation algebra gives an idempotent valuation algebra. This case has already been presented in (Kohlas, 2003a). From a lattice-theoretic point of view, ideal completion completes the join-semilattice (Ψ; ≤) to a complete lattice. In terms of combination or aggregation of information, completion means that any set X of pieces of information in Ψ can be aggregated, namely to the ideal I(X) generated by X. A different approach to completion in idempotent valuation algebras has been proposed in (Guan, 2014).

9.2 Compact Algebras

In information processing, only “finite” information can be handled. “Infinite” information can however often be approximated by “finite” elements. This aspect of finiteness is discussed in this section. It must be stressed that not every aspect of finiteness is captured. For example, no questions of computability and related issues will be treated. On the other hand, many aspects discussed in this section are also considered in domain theory. In fact, much of this section is motivated by domain theory. However, the one crucial feature not addressed in domain theory is information extraction. Also, domain theory places more emphasis on order, approximation and convergence of information and less on combination. So, although the subject is similar to domain theory, it is treated here with a different emphasis and goal. It will be seen that the subject is also closely related to algebraic and, more generally, continuous lattices.

Consider an idempotent generalised information algebra (Ψ,D; ≤, ⊥, ·, ). In the set of pieces of information Ψ we single out a subset Ψf of elements which are considered to be finite. We shall explain in a moment what this means. First, combination of finite information pieces should again yield finite information. So Ψf is assumed to be closed under combination. The unit 1 is considered to be finite and thus belongs to Ψf. One might expect that extraction of finite information always results in finite information. Although this is the case for many instances, it is not true in general. We adopt the concept of finiteness from order theory. For this purpose, we need the concept of a directed set in Ψ: A subset X of Ψ is called directed if it is not empty and with any two elements φ1 and φ2 belonging to it, there is an element φ ∈ X such that φ1, φ2 ≤ φ. Then finite elements are defined as follows (Davey & Priestley, 1990):

Definition 18 Finite Elements: An element ψ in (Ψ; ≤) is called finite or also compact, if for any directed set X in Ψ whose supremum exists in Ψ, from ψ ≤ ∨X it follows that there is an element φ ∈ X such that ψ ≤ φ.
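Definition 18 can be illustrated in the toy subset model (an illustrative assumption: pieces of information are subsets of a frame, ordered by inclusion, with union as supremum): a finite subset is compact, since whenever it lies below the supremum of a directed family, it is already below one member of that family.

```python
# A directed chain of finite subsets of the natural numbers 0..48.
X = [frozenset(range(n)) for n in range(1, 50)]
sup_X = frozenset().union(*X)      # the supremum ∨X of the directed set

psi = frozenset({3, 7, 11})        # a finite (compact) element
assert psi <= sup_X                # psi ≤ ∨X ...
assert any(psi <= phi for phi in X)  # ... so psi ≤ phi for some phi in X
```

An infinite subset of an infinite frame would fail this test against the chain of its finite approximations, which is exactly why such elements are not compact.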

Let Ψf denote the set of finite elements of (Ψ; ≤). This set of finite elements is closed under combination. Indeed, if φ and ψ are finite and φ · ψ ≤ ∨X for some directed subset X of Ψ, then from φ, ψ ≤ φ · ψ it follows that there are elements φ′ and ψ′ in X such that φ ≤ φ′ and ψ ≤ ψ′. Further, since X is directed, there is an element χ in X such that φ′, ψ′ ≤ χ, hence φ · ψ ≤ φ′ · ψ′ ≤ χ ∈ X. This shows that φ · ψ is finite. Further, the unit 1 is clearly finite.

We now want to consider information algebras where every element can be approximated by the finite elements it dominates, that is, the finite elements which are less informative. In fact, we want even more: An element with support x must be approximated by the finite elements supported by the same domain x. This is captured by the following definition.

Definition 19 Compact Information Algebra: An idempotent, domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is called compact if for all φ ∈ Ψ and x ∈ D with φ = x(φ),

φ = x(φ) = ∨{ψ ∈ Ψf , ψ = x(ψ) ≤ φ}. (9.5)

Identity (9.5) means that the finite elements of domain x are dense in this domain: they approximate all elements of this domain. This is called strong density. It implies also that the whole set of finite elements is dense in Ψ, in the sense that they approximate any element φ of Ψ. In fact, such an element has some support x (by the Support Axiom), hence by strong density,

φ = x(φ) = ∨{ψ ∈ Ψf , ψ = x(ψ) ≤ φ}

≤ ∨{ψ ∈ Ψf , ψ ≤ φ} ≤ φ.

Thus, we see that

φ = ∨{ψ ∈ Ψf , ψ ≤ φ}. (9.6)

This is also called weak density. Note that weak density does not imply strong density; a counterexample is given in (Kohlas, 2003a). Here is a first result, which expresses a continuity property of the extraction operators x.

Theorem 61 Let (Ψ,D; ≤, ⊥, ·, ) be a compact information algebra. If X is a directed subset of Ψ such that ∨X exists in Ψ, and x ∈ D, then ∨φ∈X x(φ) exists in Ψ and

x(∨X) = ∨φ∈X x(φ). (9.7)

Proof. If φ ∈ X, then φ ≤ ∨X, hence x(φ) ≤ x(∨X), and therefore x(∨X) is an upper bound of the x(φ) for φ ∈ X. By density,

x(∨X) = ∨{ψ ∈ Ψf : ψ = x(ψ) ≤ ∨X}.

But ψ ≤ ∨X implies that there is a φ ∈ X such that ψ ≤ φ. Then ψ = x(ψ) ≤ x(φ) for some φ ∈ X; hence x(∨X) is the least upper bound of the x(φ) for φ ∈ X, therefore x(∨X) = ∨φ∈X x(φ). □

In general, not every directed set in Ψ has a supremum in Ψ. But if this is the case, i.e. if (Ψ; ≤) is a dcpo (directed complete partial order), then, since Ψ is closed under finite joins, it follows by standard results from lattice theory that Ψ is a complete lattice. If (Ψ,D; ≤, ⊥, ·, ) is compact, then weak density (9.6) holds in Ψ. A complete lattice (Ψ; ≤) with the additional condition (9.6) is called an algebraic lattice (Gierz, 2003). Therefore, we call a compact information algebra (Ψ,D; ≤, ⊥, ·, ), in which (Ψ; ≤) is a dcpo, an algebraic information algebra.
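The continuity property (9.7) can be checked in the toy subset model (an illustrative assumption: combination is union, and extraction by x is modeled as intersection with a sub-frame S_x): extraction then commutes with directed suprema.

```python
S_x = frozenset('ab')              # the sub-frame modeling the domain x

def extract(psi):
    # toy extraction operator: keep only the part relating to S_x
    return psi & S_x

# a directed (here: totally ordered) set of pieces of information
X = [frozenset('a'), frozenset('ac'), frozenset('abc')]
sup_X = frozenset().union(*X)      # ∨X

# x(∨X) equals the supremum of the x(phi), cf. (9.7)
assert extract(sup_X) == frozenset().union(*(extract(p) for p in X))
```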

Definition 20 A compact information algebra (Ψ,D; ≤, ⊥, ·, ) is called algebraic, if (Ψ; ≤) is a dcpo (directed complete partial order).

In summary, an algebraic information algebra is an algebraic lattice in which strong density holds; and conversely, an information algebra which is an algebraic lattice in which strong density holds, is an algebraic information algebra. In (Kohlas, 2003a), an alternative, but equivalent definition of an algebraic information algebra has been given according to the following theorem.

Theorem 62 An idempotent, domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is an algebraic information algebra if and only if there exists a subset Ψ′ of Ψ, closed under combination and containing the unit 1, satisfying the following conditions:

1. Convergence: For any directed subset X of Ψ′, the supremum ∨X exists in Ψ.

2. Density: For any φ ∈ Ψ and x ∈ D such that φ = x(φ),

φ = x(φ) = ∨{ψ ∈ Ψ′, ψ = x(ψ) ≤ φ}. (9.8)

3. Compactness: For any directed subset X of Ψ′ and ψ ∈ Ψ′ such that ψ ≤ ∨X, there is an element φ ∈ X such that ψ ≤ φ.

Then Ψ′ equals the set of finite elements Ψf of (Ψ; ≤).

The full proof of this theorem can be found in (Kohlas, 2003a). Although it is given there for idempotent valuation algebras, the proof carries over to the more general case of idempotent information algebras. Item 1 in the conditions above is sufficient to make (Ψ; ≤) a complete lattice; item 3 is then equivalent to the finiteness condition in (Ψ; ≤), so that Ψ′ = Ψf, and item 2 becomes strong density. The algebra is therefore compact, hence algebraic. An algebraic information algebra may be obtained from any information algebra by ideal completion.

Theorem 63 If (Ψ,D; ≤, ⊥, ·, ) is an idempotent information algebra such that (D; ≤) has a greatest element, then its ideal completion (IΨ,D; ≤, ⊥, ·, ) is an algebraic information algebra with the set {↓φ : φ ∈ Ψ} of principal ideals as finite elements.

This has already been shown in (Kohlas, 2003a) for idempotent valuation algebras. Since it has been shown in Theorem 60 in the previous section that (IΨ,D; ≤, ⊥, ·, ) is an idempotent generalised information algebra, the rest of the proof does not change for idempotent generalised information algebras, since only convergence, density and compactness in Theorem 62 need to be verified; therefore we refer to (Kohlas, 2003a) for a proof.

The algebraic information algebra (IΨ,D; ≤, ⊥, ·, ) is fully determined by its finite elements, that is, the elements of Ψ. This holds in general for any compact information algebra, as the following theorem shows. This is similar to a well-known result in domain theory (Stoltenberg-Hansen et al., 1994). To show that in a compact algebra the elements are fully determined by their finite elements, there are at least two possible approaches. One is ideal completion of the partial order (Ψf ; ≤) of finite elements; the other adds suprema of directed sets of finite elements which have no suprema in Ψ (Guan, 2014). It seems that for the second approach the finite elements must be closed under extraction, whereas for ideal completion this is not necessary. Otherwise the two approaches yield equivalent results. We prove first the ideal representation theorem for algebraic information algebras, which shows that an algebraic information algebra is determined by its finite elements through ideal completion.

Theorem 64 Assume (Ψ,D; ≤, ⊥, ·, ε) to be a domain-free algebraic information algebra, where (D; ≤) has a greatest element >. If Ψf are the finite elements of the algebraic information algebra, then the ideal completion (IΨf ,D; ≤, ⊥, ·, ε) may be extended to a generalised domain-free algebraic information algebra, isomorphic to (Ψ,D; ≤, ⊥, ·, ε).

Proof. If Ψf is closed under extraction, then Ψf ∪ {0} forms a subalgebra of (Ψ,D; ≤, ⊥, ·, ε), and it follows in this case from Theorem 60 that (IΨf ,D; ≤, ⊥, ·, ε) is an idempotent information algebra. But we do not assume that the finite elements are closed under extraction, so we must first show that from IΨf and D we may nevertheless construct an idempotent information algebra.

We first define combination between ideals I1 and I2 of Ψf as usual,

I1 · I2 = {ψ ∈ Ψf : ∃ψ1 ∈ I1, ψ2 ∈ I2 such that ψ ≤ ψ1 · ψ2}.

As before, the ideals in Ψf form a ∩-system, hence a complete lattice, with combination as join. So, (IΨf ; ·) is a commutative semigroup with unit {1} and null element Ψf ∪ {0}. Hence the Semigroup Axiom A1 of a domain-free information algebra is satisfied. Next, for any x ∈ D we define an extraction operator

εx(I) = {ψ ∈ Ψf : ∃φ ∈ I such that ψ ≤ εx(φ)}.

Clearly, εx(I) is still an ideal in Ψf , and εx therefore maps IΨf into itself.
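To make these two definitions concrete, here is a small executable sketch (our own illustration, not part of the text): it realizes an idempotent information algebra as relations over two binary variables, with combination as intersection and extraction εx as cylindrification, and applies ideal combination and extraction to principal ideals. All names (`comb_ideal`, `ext_ideal`, `down`) are ours.

```python
from itertools import product, combinations

# Toy idempotent information algebra (illustration only): relations over two
# binary variables; combination = intersection, extraction = cylindrification.
TUPLES = tuple(product((0, 1), repeat=2))
PSI = [frozenset(c) for n in range(len(TUPLES) + 1)
       for c in combinations(TUPLES, n)]          # all 16 relations

def comb(a, b):                                   # combination (join of information)
    return a & b

def leq(a, b):                                    # a <= b  iff  comb(a, b) == b
    return b <= a

def ext(x, a):                                    # extraction eps_x(a)
    return frozenset(t for t in TUPLES
                     if any(all(t[i] == s[i] for i in x) for s in a))

def down(a):                                      # principal ideal: all c <= a
    return frozenset(c for c in PSI if leq(c, a))

def is_ideal(I):                                  # down-closed, closed under combination
    return all(c in I for a in I for b in I for c in PSI if leq(c, comb(a, b)))

def comb_ideal(I, J):                             # I · J as defined above
    return frozenset(c for c in PSI
                     if any(leq(c, comb(a, b)) for a in I for b in J))

def ext_ideal(x, I):                              # eps_x(I) as defined above
    return frozenset(c for c in PSI if any(leq(c, ext(x, a)) for a in I))

phi1 = frozenset({(0, 0), (0, 1)})                # "first variable equals 0"
phi2 = frozenset({(0, 0), (1, 0)})                # "second variable equals 0"
K = comb_ideal(down(phi1), down(phi2))
E = ext_ideal((0,), K)
```

On principal ideals the two operations reduce to the underlying ones: K equals ↓(φ1 · φ2) and E equals ↓ε{0}(φ1 · φ2), which is the homomorphism property exploited later in the proof.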

The greatest element > in D is a support for all elements of Ψf , hence we have ε>(I) = I for every ideal I in Ψf . Thus, every element I of IΨf has a support. Assume further that x is a support for the ideal I of Ψf , that is εx(I) = I, and let x ≤ y. Note that εy(I) ⊆ I and assume ψ ∈ I. Then ψ ≤ εx(φ) for some φ ∈ I. But εx(φ) ≤ εy(φ). This implies ψ ∈ εy(I) and therefore εy(I) = I. This confirms the Support Axiom A2 in (IΨf ,D; ≤, ⊥, ·, ε). The Unit and Null Axiom A3 is evident for ideals in Ψf . The Extraction Axiom A4 and the Combination Axiom A5 are proved just as in Theorem 60. The Idempotency Axiom A6 follows since εx(I) ⊆ I. Axiom A0 is inherited from the algebra (Ψ,D; ≤, ⊥, ·, ε). This shows that (IΨf ,D; ≤, ⊥, ·, ε) is a domain-free idempotent information algebra.

Consider now the map φ 7→ Aφ = {ψ ∈ Ψf : ψ ≤ φ}. Since Aφ is an ideal in Ψf , this maps Ψ into IΨf . Consider any ideal I in Ψf . Then the supremum of I exists in Ψ, since the information algebra (Ψ,D; ≤, ⊥, ·, ε) is algebraic. Let φ = ∨I and consider an element ψ ∈ Ψf such that ψ ≤ φ. Then, by compactness, there is an element χ ∈ I dominating ψ. This implies ψ ∈ I, and this shows that I = Aφ. So, the map is onto IΨf . It is one-to-one, since Aφ = Aψ implies φ = ψ.

It remains to show that the map is a homomorphism. Clearly A1 = {1}, which is the unit element, and A0 = Ψf ∪ {0}, which is the null element of the ideal completion. Consider two elements φ and ψ from Ψ. Then Aφ·ψ contains both Aφ and Aψ, and Aφ · Aψ = I(Aφ ∪ Aψ) ⊆ Aφ·ψ. If I is an ideal in Ψf which contains both Aφ and Aψ, then there is an element χ ∈ Ψ such that I = Aχ and φ, ψ ≤ χ, hence Aφ·ψ ⊆ I. Thus we conclude that Aφ·ψ = Aφ · Aψ.

Further, for any x in D, we have by definition

εx(Aφ) = {ψ ∈ Ψf : ∃χ ∈ Aφ such that ψ ≤ εx(χ)}.

So, since εx(χ) ≤ εx(φ), it follows that εx(Aφ) ⊆ Aεx(φ). Consider then an element ψ in Aεx(φ). By density, φ = ∨Aφ, and from Theorem 61 we conclude that

εx(φ) = ∨Aεx(φ) = ∨{εx(χ) : χ ∈ Aφ}.

The set X = {εx(χ) : χ ∈ Aφ} is directed. By compactness, there is then an element η ∈ Aφ such that ψ ≤ εx(η). But this means that ψ ∈ εx(Aφ). Therefore, we see that Aεx(φ) = εx(Aφ), which shows that the map φ 7→ Aφ is an information algebra homomorphism between (Ψ,D; ≤, ⊥, ·, ε) and

(IΨf ,D; ≤, ⊥, ·, ε). This concludes the proof. □

This is a representation theorem for algebraic information algebras. What can be said about compact algebras? We first show that a compact algebra can be extended to an algebraic one by adding the missing suprema of directed sets. Let then (Ψ,D; ≤, ⊥, ·, ε) be a compact information algebra with finite elements Ψf . As before, we add the null element to Ψf . We assume now that Ψf is closed under extraction. The approach used here follows (Guan, 2014). Denote by Dif the family of directed subsets of Ψf . For such directed sets X and Y in Dif , we define

X · Y = {φ · ψ : φ ∈ X, ψ ∈ Y },

εx(X) = {εx(φ) : φ ∈ X}. (9.9)

These operations yield again directed sets of finite elements:

Lemma 19 If X,Y ∈ Dif , then X · Y ∈ Dif and εx(X) ∈ Dif for all x ∈ D.

Proof. Consider two elements η1 and η2 in X · Y , such that η1 = φ1 · ψ1 and η2 = φ2 · ψ2 with φ1, φ2 ∈ X and ψ1, ψ2 ∈ Y . Since X and Y are directed, there are elements φ ∈ X and ψ ∈ Y such that φ1, φ2 ≤ φ and ψ1, ψ2 ≤ ψ. But then η1, η2 ≤ φ · ψ ∈ X · Y , which shows that X · Y is directed.

Similarly, consider φ1, φ2 ∈ εx(X), that is φ1 = εx(ψ1) and φ2 = εx(ψ2), where ψ1 and ψ2 belong to X. As X is directed, there is an element ψ ∈ X which dominates ψ1 and ψ2. But then it follows that φ1, φ2 ≤ εx(ψ) ∈ εx(X). This proves that εx(X) is directed and belongs to Dif , since we have assumed that the finite elements Ψf are closed under extraction. □

Next, we define for directed sets X and Y in Dif the relation X ≡θ Y which holds if

a) for all φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ,

b) for all ψ ∈ Y there is a φ ∈ X such that ψ ≤ φ.

This is an equivalence relation in Dif . Note that if X has a supremum in Ψ, then X ≡θ Y if and only if ∨X = ∨Y . Further, for any ψ ∈ Ψf , if X ≡θ {ψ}, then necessarily ψ ∈ X and ∨X = ψ. Now, in Dif /θ we define two operations between equivalence classes

1. Combination: [X]θ · [Y ]θ = [X · Y ]θ,

2. Extraction: εx([X]θ) = [εx(X)]θ.

These operations are well defined. Indeed, assume X1 ≡θ X2 and consider η1 ∈ X1 · Y . Then η1 = φ1 · ψ with φ1 ∈ X1 and ψ ∈ Y . There is a φ2 ∈ X2 such that φ1 ≤ φ2, hence η1 ≤ φ2 · ψ ∈ X2 · Y . In the same way, we find that for η2 ∈ X2 · Y there is an element in X1 · Y which dominates η2. Therefore, we see that X1 · Y ≡θ X2 · Y . Further, assume X ≡θ Y and consider an element φ from X. Then εx(φ) ∈ εx(X) and there is a ψ ∈ Y such that φ ≤ ψ. But then εx(φ) ≤ εx(ψ) ∈ εx(Y ). In the same way, if εx(ψ) ∈ εx(Y ), there is an element εx(φ) in εx(X) such that εx(ψ) ≤ εx(φ). Thus, εx(X) ≡θ εx(Y ).
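As a sanity check (our own toy computation, not part of the proof), the relation ≡θ and the well-definedness of the two operations can be tested in a small relational model of an idempotent information algebra, with relations over two binary variables, combination as intersection and extraction as cylindrification; all function names are ours.

```python
from itertools import product

TUPLES = tuple(product((0, 1), repeat=2))

def comb(a, b):                                   # combination = intersection
    return a & b

def leq(a, b):                                    # a <= b  iff  comb(a, b) == b
    return b <= a

def ext(x, a):                                    # extraction eps_x
    return frozenset(t for t in TUPLES
                     if any(all(t[i] == s[i] for i in x) for s in a))

def is_directed(X):                               # every pair has an upper bound in X
    return all(any(leq(a, c) and leq(b, c) for c in X) for a in X for b in X)

def dominated(X, Y):                              # condition a): X dominated by Y
    return all(any(leq(a, b) for b in Y) for a in X)

def theta(X, Y):                                  # X =theta= Y: mutual domination
    return dominated(X, Y) and dominated(Y, X)

def comb_sets(X, Y):                              # X . Y of (9.9)
    return frozenset(comb(a, b) for a in X for b in Y)

def ext_sets(x, X):                               # eps_x(X) of (9.9)
    return frozenset(ext(x, a) for a in X)

one = frozenset(TUPLES)                           # unit element
phi1 = frozenset({(0, 0), (0, 1)})
phi2 = frozenset({(0, 0), (1, 0)})
X1 = frozenset({phi1})                            # two theta-equivalent directed sets
X2 = frozenset({one, phi1})
Y = frozenset({phi2})
```

Here X1 ≡θ X2, and combining or extracting both with the same data yields θ-equivalent results, as the well-definedness argument above requires.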

With these operations, it turns out that Dif /θ becomes an idempotent, algebraic information algebra, into which the original compact algebra (Ψ,D; ≤, ⊥, ·, ε) is embedded.

Theorem 65 Assume (Ψ,D; ≤, ⊥, ·, ε) to be a domain-free compact information algebra, where (D; ≤) has a greatest element > and the set Ψf of finite elements is closed under extraction. Then (Dif /θ, D; ≤, ⊥, ·, ε) is an algebraic information algebra and (Ψ,D; ≤, ⊥, ·, ε) is embedded into it by the map

φ 7→ [{ψ ∈ Ψf : ψ ≤ φ}]θ.

Proof. We show first that (Dif /θ, D; ≤, ⊥, ·, ε) is a generalised, idempotent information algebra. Axiom A0, q-separoid, is valid, since (D; ≤, ⊥) is the same as in (Ψ,D; ≤, ⊥, ·, ε). Commutativity and associativity of combination follow from commutativity and associativity of the operation X · Y . The class [{1}]θ is the unit and [Ψf ]θ is the null element of combination. So, the Semigroup Axiom A1 is valid.

Since D has a greatest element >, which is a support of all elements of Ψf , it follows that ε>(X) = X, hence ε>([X]θ) = [ε>(X)]θ = [X]θ, so every element of Dif /θ has > as a support. Further, suppose that x is a support of [X]θ, that is εx(X) ≡θ X. Consider an element y ∈ D such that x ≤ y. We claim that then εy(X) ≡θ X. Indeed, if ψ ∈ εy(X), then ψ = εy(φ) ≤ φ for some φ ∈ X. On the other hand assume ψ ∈ X. By assumption, there is a φ in X such that ψ ≤ εx(φ). But x ≤ y implies εx(φ) ≤ εy(φ). So ψ ≤ εy(φ) for φ ∈ X, and εy(φ) ∈ εy(X). This means that εy(X) ≡θ X. So y is also a support of [X]θ and the Support Axiom A2 holds.

Note that {1} ≡θ εx({1}) and Ψf ≡θ εx(Ψf ). Below we show that the algebra is idempotent. In this case, from [εx(X)]θ = [Ψf ]θ it follows that [X]θ = [Ψf ]θ, since [εx(X)]θ ≤ [X]θ ≤ [Ψf ]θ, so that the Unit and Null Axiom A3 is satisfied.

Next, assume x⊥y|z and that x is a support of [X]θ, that is εx(X) ≡θ X. We want to show that this implies εy(X) ≡θ εy(εz(X)). In fact, consider an element ψ ∈ εy(εz(X)), that is ψ = εy(εz(φ)) for some element φ from X. Then ψ ≤ εy(φ) ∈ εy(X), since εz(φ) ≤ φ. On the other hand consider ψ ∈ εy(X). Then ψ = εy(φ) for some φ ∈ X, and from εx(X) ≡θ X it follows that there is a χ ∈ X such that φ ≤ εx(χ). Define η = εx(χ), so that εx(η) = η. Then we see that ψ = εy(φ) ≤ εy(η) = εy(εz(η)) by Axiom A4 in the information algebra (Ψ,D; ≤, ⊥, ·, ε). From this we obtain ψ ≤ εy(εz(εx(χ))) ≤ εy(εz(χ)). Since χ ∈ X, we have εy(εz(χ)) ∈ εy(εz(X)). This shows that indeed εy(X) ≡θ εy(εz(X)), or also εy([X]θ) = εy(εz([X]θ)). So the Extraction Axiom A4 holds in the algebra (Dif /θ, D; ≤, ⊥, ·, ε).

Further assume again x⊥y|z and consider two elements [X]θ and [Y ]θ from Dif /θ having supports x and y respectively, that is εx(X) ≡θ X and εy(Y ) ≡θ Y . We claim then that εz(X · Y ) ≡θ εz(X) · εz(Y ). Assume first that ψ belongs to εz(X) · εz(Y ), which means that ψ = εz(φ1) · εz(φ2) for some elements φ1 ∈ X and φ2 ∈ Y . But ψ = εz(φ1) · εz(φ2) ≤ εz(φ1 · φ2), and the element εz(φ1 · φ2) belongs to εz(X · Y ). On the other hand consider ψ ∈ εz(X · Y ), such that ψ = εz(φ1 · φ2) for some elements φ1 ∈ X and φ2 ∈ Y . From the assumptions that εx(X) ≡θ X and εy(Y ) ≡θ Y it follows that there are elements χ1 ∈ X and χ2 ∈ Y such that φ1 ≤ εx(χ1) and φ2 ≤ εy(χ2).

Define η1 = εx(χ1) and η2 = εy(χ2). These two elements have supports x and y respectively. Then we obtain ψ = εz(φ1 · φ2) ≤ εz(η1 · η2) = εz(η1) · εz(η2) by Axiom A5 for the information algebra (Ψ,D; ≤, ⊥, ·, ε). From this it follows that ψ ≤ εz(εx(χ1)) · εz(εy(χ2)) ≤ εz(χ1) · εz(χ2). The last element belongs to εz(X) · εz(Y ). This proves that indeed εz(X · Y ) ≡θ εz(X) · εz(Y ), or εz([X]θ · [Y ]θ) = εz([X]θ) · εz([Y ]θ). So, the Combination Axiom A5 holds too.

Finally, to verify the Idempotency Axiom A6, we show that εx(X) · X ≡θ X. If ψ ∈ X, then by idempotency in the original information algebra ψ = εx(ψ) · ψ, and this element belongs to εx(X) · X. Conversely, if ψ ∈ εx(X) · X, then ψ = εx(φ1) · φ2, where φ1 and φ2 belong to X. Since X is directed, there is a φ ∈ X which dominates φ1 and φ2, such that ψ ≤ φ ∈ X. This shows that εx(X) · X ≡θ X, hence εx([X]θ) · [X]θ = [X]θ. Therefore, the algebra (Dif /θ, D; ≤, ⊥, ·, ε) is idempotent.

So far we have shown that (Dif /θ, D; ≤, ⊥, ·, ε) is an idempotent generalised information algebra. It remains to show that this algebra is algebraic. The proof will be based on Theorem 62. As a preparation we prove the following lemma.

Lemma 20 The relation [X]θ ≤ [Y ]θ holds if and only if for all φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ.

Proof. The relation [X]θ ≤ [Y ]θ means that [X]θ · [Y ]θ = [X · Y ]θ = [Y ]θ, or X · Y ≡θ Y . So, assume first that X · Y is equivalent to Y and consider an element φ ∈ X. Then for any element φ · χ in X · Y , where χ ∈ Y , there is an element ψ ∈ Y so that φ · χ ≤ ψ. But then φ ≤ ψ. Conversely, assume that for every φ ∈ X there is a ψ ∈ Y such that φ ≤ ψ. If ψ ∈ Y , then ψ ≤ ψ · φ ∈ X · Y for any φ ∈ X. On the other hand, consider an element φ · χ ∈ X · Y with φ ∈ X and χ ∈ Y , and choose ψ ∈ Y with φ ≤ ψ. Since Y is directed, there is an η ∈ Y dominating ψ and χ, and then φ · χ ≤ ψ · χ ≤ η · η = η ∈ Y . So, indeed, X · Y ≡θ Y . □
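Lemma 20 can be checked by brute force in a toy relational model (our own illustration, not part of the text): the order [X]θ ≤ [Y]θ, defined through combination, coincides with the cofinality condition of the lemma for all directed subsets of a small four-element sub-poset.

```python
from itertools import product, combinations

TUPLES = tuple(product((0, 1), repeat=2))

def comb(a, b):                                   # combination = intersection
    return a & b

def leq(a, b):                                    # a <= b iff comb(a, b) == b
    return b <= a

def is_directed(X):
    return all(any(leq(a, c) and leq(b, c) for c in X) for a in X for b in X)

def dominated(X, Y):                              # cofinality condition of Lemma 20
    return all(any(leq(a, b) for b in Y) for a in X)

def theta(X, Y):                                  # the relation X =theta= Y
    return dominated(X, Y) and dominated(Y, X)

def comb_sets(X, Y):
    return frozenset(comb(a, b) for a in X for b in Y)

def leq_theta(X, Y):                              # [X]theta <= [Y]theta iff X.Y =theta= Y
    return theta(comb_sets(X, Y), Y)

one = frozenset(TUPLES)
phi1 = frozenset({(0, 0), (0, 1)})
phi2 = frozenset({(0, 0), (1, 0)})
ELEMS = [one, phi1, phi2, comb(phi1, phi2)]
DIRECTED = [frozenset(c) for n in range(1, 5)
            for c in combinations(ELEMS, n) if is_directed(frozenset(c))]
lemma20_holds = all(leq_theta(X, Y) == dominated(X, Y)
                    for X in DIRECTED for Y in DIRECTED)
```

Of the 15 nonempty subsets of the four chosen elements, 13 are directed, and the two characterizations of the order agree on all pairs of them.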

Now we resume the proof of the theorem. We take the set of classes [{ψ}]θ for ψ ∈ Ψf to be the set Ψ′ of Theorem 62. This set will turn out to be the set of finite elements of the algebra (Dif /θ, D; ≤, ⊥, ·, ε). Let X be a directed subset of Ψ′ and define X′ = {ψ : [{ψ}]θ ∈ X}. To simplify notation, we write subsequently [ψ]θ instead of [{ψ}]θ. The set X′ is directed in Ψf . By Lemma 20, [ψ]θ ≤ [X′]θ if ψ ∈ X′. So, [X′]θ is an upper bound of X. Assume that [Y ]θ is another upper bound of X. Then, for all [ψ]θ ∈ X, there is a χ ∈ Y such that ψ ≤ χ. Therefore (Lemma 20) [X′]θ ≤ [Y ]θ, and [X′]θ is the supremum of X, that is, [X′]θ = ∨X. So item 1 of Theorem 62 holds.

Next, consider an element [X]θ with support x in Dif /θ, that is, [X]θ = εx([X]θ) = [εx(X)]θ. We claim that

εx(X) ≡θ {ψ ∈ Ψf : ψ = εx(ψ) ≤ φ for some φ ∈ X}. (9.10)

In fact, assume ψ ∈ εx(X), such that ψ = εx(φ) for some φ ∈ X. Then ψ ∈ Ψf (remember that we assume that Ψf is closed under extraction), and ψ = εx(ψ) ≤ φ ∈ X. On the other hand, ψ = εx(ψ) ≤ φ ∈ X implies ψ ≤ εx(φ) and εx(φ) ∈ εx(X). This proves that (9.10) holds. So we see, using the same argument as above, that

εx([X]θ) = [{ψ ∈ Ψf : ψ = εx(ψ) ≤ φ for some φ ∈ X}]θ = ∨{[ψ]θ ∈ Ψ′ : [ψ]θ = εx([ψ]θ) ≤ [X]θ}. (9.11)

This verifies item 2, density, of Theorem 62. Finally, assume [ψ]θ ≤ ∨X, ψ ∈ Ψf , for some directed subset X of Ψ′. Define X′ = {φ : [φ]θ ∈ X}. This set is also directed. So, as above, [ψ]θ ≤ ∨X = [X′]θ. Then, by Lemma 20, there is an element φ ∈ X′ such that ψ ≤ φ, hence [ψ]θ ≤ [φ]θ ∈ X. This proves item 3 of Theorem 62. So the information algebra (Dif /θ, D; ≤, ⊥, ·, ε) is, according to Theorem 62, algebraic, and the elements [ψ]θ for ψ ∈ Ψf are its finite elements.

It remains to show that the map φ 7→ [{ψ ∈ Ψf : ψ ≤ φ}]θ is an embedding. Define Aφ = {ψ ∈ Ψf : ψ ≤ φ}. First we note that the map φ 7→ [Aφ]θ is one-to-one. In fact, if [Aφ]θ = [Aψ]θ, or Aφ ≡θ Aψ, then it follows from density in the compact information algebra (Ψ,D; ≤, ⊥, ·, ε) that φ = ∨Aφ = ∨Aψ = ψ. Next we verify that the map is a homomorphism. In order to show that φ · ψ 7→ [Aφ·ψ]θ = [Aφ · Aψ]θ = [Aφ]θ · [Aψ]θ, it is sufficient to prove that ∨Aφ·ψ = ∨(Aφ · Aψ), since the first supremum exists in a compact information algebra, that is, to prove that

∨{φ′ · ψ′ : φ′, ψ′ ∈ Ψf , φ′ ≤ φ, ψ′ ≤ ψ} = ∨{χ ∈ Ψf : χ ≤ φ · ψ} = φ · ψ.

Note that by density the second supremum exists in Ψ and it is an upper bound of the set on the left-hand side. If η is another upper bound of this set, then φ′, ψ′ ≤ η for all such φ′ and ψ′. Since by density both φ and ψ are the suprema of the finite elements φ′ and ψ′ they dominate, we conclude that φ, ψ ≤ η, hence φ · ψ ≤ η, and φ · ψ is indeed the supremum of the set on the left-hand side. So we have proved that φ · ψ 7→ [Aφ]θ · [Aψ]θ. In addition, 1 7→ [{1}]θ and 0 7→ [Ψf ]θ, the unit and null elements in (Dif /θ, D; ≤, ⊥, ·, ε).

Further, we must show that εx(φ) 7→ [Aεx(φ)]θ = εx([Aφ]θ) = [εx(Aφ)]θ. By density in the algebra (Ψ,D; ≤, ⊥, ·, ε),

εx(φ) = ∨{ψ ∈ Ψf : ψ ≤ εx(φ)}
= ∨{ψ ∈ Ψf : ψ = εx(ψ) ≤ φ}
= ∨{εx(ψ) : ψ ∈ Ψf , ψ ≤ φ}
= ∨εx(Aφ). (9.12)

So we have ∨Aεx(φ) = ∨εx(Aφ), which implies [Aεx(φ)]θ = εx([Aφ]θ).

Thus, the map φ 7→ [Aφ]θ is an embedding. This concludes the proof. □

The embedding φ 7→ [Aφ]θ is not only an ordinary information algebra homomorphism. It is in fact a continuous map, in the order-theoretic sense.

Theorem 66 Assume (Ψ,D; ≤, ⊥, ·, ε) to be a domain-free compact information algebra. Then, if X is a directed subset of Ψ with a supremum in Ψ,

[A∨X ]θ = ∨φ∈X [Aφ]θ. (9.13)

Proof. Since φ ∈ X implies φ ≤ ∨X, we have [Aφ]θ ≤ [A∨X ]θ. So, [A∨X ]θ is an upper bound for the [Aφ]θ with φ ∈ X. If [Y ]θ is another upper bound of this set, then consider an element ψ ∈ A∨X , hence ψ ∈ Ψf and ψ ≤ ∨X. Then, by compactness in the algebra (Ψ,D; ≤, ⊥, ·, ε), there is a φ ∈ X such that ψ ≤ φ, hence ψ ∈ Aφ. From [Aφ]θ ≤ [Y ]θ it follows that there is a χ ∈ Y such that ψ ≤ χ. But this shows that [A∨X ]θ ≤ [Y ]θ (Lemma 20). So, [A∨X ]θ is the supremum of the [Aφ]θ for φ ∈ X. □

We could also have considered the ideal completion of the finite elements of the compact information algebra (Ψ,D; ≤, ⊥, ·, ε). By Theorem 64 we would obtain an algebraic information algebra. In fact, this algebra is isomorphic to the algebra (Dif /θ, D; ≤, ⊥, ·, ε), if the finite elements are closed under extraction, and the isomorphism is a continuous map.

Theorem 67 Assume (Ψ,D; ≤, ⊥, ·, ε) to be a domain-free compact information algebra, where (D; ≤) has a greatest element > and the set of finite elements is closed under extraction. Then the two algebraic information algebras (Dif /θ, D; ≤, ⊥, ·, ε) and (IΨf ,D; ≤, ⊥, ·, ε) are isomorphic under a continuous map.

Proof. We show first that any directed set X in Ψf is equivalent to the ideal I(X) it generates in Ψf , that is, X ≡θ I(X). If φ ∈ X, then φ ≤ φ and φ ∈ I(X). Conversely, if ψ ∈ I(X), then there is a finite set of elements ψ1, . . . , ψn in X such that ψ ≤ ψ1 ∨ · · · ∨ ψn (see (9.3)). Since X is directed, there is an element φ ∈ X which dominates all ψi, i = 1, . . . , n, hence ψ ≤ φ. So, indeed, X ≡θ I(X). This means that any equivalence class [X]θ in Dif /θ can be represented by the ideal I(X), that is, [X]θ = [I(X)]θ.

Consider now the map [X]θ 7→ I(X) from Dif /θ into IΨf . By the last remark, this map is well defined. It is onto IΨf , since any ideal I in Ψf is directed and [I]θ 7→ I(I) = I. It is also one-to-one: suppose that I(X) = I(Y ); then X ≡θ I(X) and Y ≡θ I(Y ), therefore X ≡θ Y or [X]θ = [Y ]θ. Next, we verify that the map is a homomorphism. First we show that [X]θ · [Y ]θ maps to I(X) · I(Y ), that is, I(X · Y ) = I(X) · I(Y ). Consider an element φ ∈ I(X · Y ). Then there is a finite set of elements ψ1, . . . , ψn in X · Y such that φ ≤ ψ1 · . . . · ψn, and each element ψi equals ψi,1 · ψi,2 for some ψi,1 ∈ X and ψi,2 ∈ Y . Since both X and Y are directed, there are elements χ1 ∈ X and χ2 ∈ Y which dominate ψ1,1, . . . , ψn,1 and ψ1,2, . . . , ψn,2 respectively. Then it follows that φ ≤ χ1 · χ2, which shows that φ ∈ I(X) · I(Y ).

Conversely, assume φ ∈ I(X) · I(Y ), which means that φ ≤ ψ1 · ψ2 for some elements ψ1 ∈ I(X) and ψ2 ∈ I(Y ). Then ψ1 ≤ ψ1,1 · ... · ψn,1 for some elements ψ1,1, . . . , ψn,1 of X and similarly ψ2 ≤ ψ1,2 · ... · ψm,2 for some elements ψ1,2, . . . , ψm,2 of Y . The sets X and Y are directed, therefore there are elements φ1 ∈ X and φ2 ∈ Y , which dominate ψ1,1, . . . , ψn,1 and ψ1,2, . . . , ψm,2 respectively, hence φ ≤ φ1 · φ2 ∈ X · Y . This shows that φ ∈ I(X · Y ) and therefore I(X · Y ) = I(X) · I(Y ).

Further, [{1}]θ maps to I({1}) = {1}, and [Ψf ]θ maps to Ψf . To complete the verification that the map is a homomorphism, we now show that εx([X]θ) = [εx(X)]θ maps to εx(I(X)) by proving that I(εx(X)) = εx(I(X)). Let first φ ∈ I(εx(X)), such that φ ≤ ψ1 · . . . · ψn for some elements ψ1, . . . , ψn of εx(X). This means that ψi = εx(φi) for some φi ∈ X. As X is directed, there is a χ ∈ X such that φ1, . . . , φn ≤ χ. It follows then that φ ≤ εx(φ1) · . . . · εx(φn) ≤ εx(φ1 · . . . · φn) ≤ εx(χ). This shows that φ ∈ εx(I(X)).

If, on the other hand, φ ∈ εx(I(X)), then there is an element ψ ∈ I(X) such that φ ≤ εx(ψ), and further ψ ≤ ψ1 · . . . · ψn for elements ψi ∈ X. Then there is an element χ ∈ X such that ψ1, . . . , ψn ≤ χ. This implies φ ≤ εx(ψ) ≤ εx(ψ1 · . . . · ψn) ≤ εx(χ) ∈ εx(X). Therefore we have φ ∈ I(εx(X)), hence I(εx(X)) = εx(I(X)).
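The identity just proved, that the generated ideal commutes with extraction on directed sets, can be checked computationally in the toy relational model used for illustration (our sketch, not part of the text; `gen_ideal` forms the generated ideal by closing under combination and then taking the down-set).

```python
from itertools import product, combinations

TUPLES = tuple(product((0, 1), repeat=2))
PSI = [frozenset(c) for n in range(len(TUPLES) + 1)
       for c in combinations(TUPLES, n)]          # all 16 relations

def comb(a, b):                                   # combination = intersection
    return a & b

def leq(a, b):                                    # a <= b iff comb(a, b) == b
    return b <= a

def ext(x, a):                                    # extraction eps_x
    return frozenset(t for t in TUPLES
                     if any(all(t[i] == s[i] for i in x) for s in a))

def gen_ideal(X):                 # I(X): close under combination, then down-close
    closed = set(X)
    while True:
        new = {comb(a, b) for a in closed for b in closed} - closed
        if not new:
            break
        closed |= new
    return frozenset(c for c in PSI if any(leq(c, a) for a in closed))

def ext_set(x, X):                # eps_x applied elementwise to a set
    return frozenset(ext(x, a) for a in X)

def ext_ideal(x, I):              # eps_x applied to an ideal
    return frozenset(c for c in PSI if any(leq(c, ext(x, a)) for a in I))

one = frozenset(TUPLES)
phi1 = frozenset({(0, 0), (0, 1)})
phi2 = frozenset({(0, 0), (1, 0)})
X = frozenset({one, phi1, comb(phi1, phi2)})      # a directed set
```

For this directed set X, the ideal generated by the extracted set equals the extraction of the generated ideal, for either variable.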

So the map [X]θ 7→ I(X) is an information algebra isomorphism. Finally, we show that the map is continuous. A map f from an algebraic lattice (Φ; ≤) to another algebraic lattice is continuous, if (Davey & Priestley, 2002)

f(φ) = ∨{f(ψ): ψ ∈ Φf , ψ ≤ φ}.

In our case f([X]θ) = I(X) and the finite elements are the classes [ψ]θ for ψ ∈ Ψf . So we must verify that

I(X) = ∨{I({ψ}) : ψ ∈ Ψf , [ψ]θ ≤ [X]θ}. (9.14)

Now, [ψ]θ ≤ [X]θ if and only if there is a φ ∈ X such that ψ ≤ φ. Therefore, we need to verify that

I(X) = ∨{I({ψ}): ψ ∈ Ψf , ψ ≤ φ for some φ ∈ X}.

We may identify the principal ideal I({ψ}) with ψ by the embedding of the finite elements of Ψ into their ideal completion (see Section 9.1). Then the last equality becomes

I(X) = ∨{ψ : ψ ∈ Ψf , ψ ≤ φ for some φ ∈ X} ≤ ∨X.

But I(X) = ∨X in the ideal completion. This proves (9.14). □

In view of these results, we have two equivalent ways to complete a compact information algebra to an algebraic one: one by ideal completion, the other by adjoining the missing suprema.

9.3 Duality For Compact Algebras

In this section we turn back to duality between domain-free and labeled generalised information algebras. What is a compact or algebraic labeled information algebra? This question will be examined by looking at the labeled version of a compact or algebraic domain-free information algebra, in order to see how compactness transforms into the labeled version. Then, based on this analysis, we study how duality extends to compact and algebraic information algebras. Consider a compact domain-free generalised information algebra (Ψ,D; ≤, ⊥, ·, ε). According to the previous discussions we assume that there is a greatest domain > in D. As we have seen, we can always adjoin such a domain, if necessary, so there is no loss of generality. We form the dual labeled algebra (Φ,D; ≤, ⊥, ·, t), where Φ is the set of pairs (φ, x) with φ ∈ Ψ and εx(φ) = φ, see Section 7.3. In particular, let Φx be the set of all pairs (φ, x) for a fixed x. Then

Φ = ∪x∈D Φx.

Note that idempotency allows, as in the domain-free case, to define a partial order in Φ. In fact, (φ, x) ≤ (ψ, y) if and only if (φ, x) · (ψ, y) = (φ · ψ, x ∨ y) = (ψ, y). This implies φ · ψ = ψ, or φ ≤ ψ in (Ψ; ≤), and x ≤ y in (D; ≤). As a preparation, we prove two useful results about the labeled algebra (Φ,D; ≤, ⊥, ·, t).

Lemma 21 Let (Ψ,D; ≤, ⊥, ·, ε) be a domain-free generalised information algebra and (Φ,D; ≤, ⊥, ·, t) its dual labeled version. If the supremum of a subset X of Φ exists in Φ, then

∨X = (∨{φ : (φ, x) ∈ X}, ∨{x : (φ, x) ∈ X}). (9.15)

Proof. Assume ∨X = (χ, y). Then (φ, x) ≤ (χ, y) for all (φ, x) ∈ X, hence φ ≤ χ and x ≤ y. Consider other upper bounds χ′ and y′ for the elements φ and x, (φ, x) ∈ X. Then (φ, x) ≤ (χ′, y′), hence (χ, y) ≤ (χ′, y′). But this implies χ ≤ χ′ and y ≤ y′, and so indeed χ = ∨{φ : (φ, x) ∈ X} and y = ∨{x : (φ, x) ∈ X}. This is (9.15). □

Lemma 22 Let (Ψ,D; ≤, ⊥, ·, ε) be an idempotent domain-free generalised information algebra and (Φ,D; ≤, ⊥, ·, t) its dual labeled version. Let X be a subset of Ψ such that εx(X) = X. If the supremum of X exists in Ψ, then (∨X, x) ∈ Φ and

∨ψ∈X (ψ, x) = (∨X, x).

Proof. We need only to show that ∨X has support x. Define φ = ∨X. Then, for all ψ ∈ X we have ψ = εx(ψ) ≤ φ, hence ψ = εx(ψ) ≤ εx(φ). So, εx(φ) is an upper bound of X, therefore φ ≤ εx(φ), hence φ = εx(φ). □

The next theorem shows how finite elements in (Φx; ≤) relate to finite elements in (Ψ; ≤).

Theorem 68 Let (Ψ,D; ≤, ⊥, ·, ε) be a domain-free compact information algebra with finite elements Ψf and (Φ,D; ≤, ⊥, ·, t) its dual labeled version. Then (ψ, x) ∈ Φ is finite in (Φx; ≤) if and only if ψ is finite in (Ψ; ≤), that is, ψ ∈ Ψf .

Proof. Consider an element (ψ, x) of Φ with ψ ∈ Ψf . Let X be a directed subset of Φx whose supremum exists in (Φx; ≤) and such that (ψ, x) ≤ ∨X. Define X′ = {φ ∈ Ψ : (φ, x) ∈ X}. Clearly, X′ is directed too, and since ∨X = (∨X′, x) (Lemma 21) the supremum of X′ exists in Ψ and ψ ≤ ∨X′. Since ψ is finite in (Ψ; ≤) there is a φ ∈ X′ such that ψ ≤ φ, hence (ψ, x) ≤ (φ, x) ∈ X. This shows that (ψ, x) is finite in (Φx; ≤).

Conversely, assume that (ψ, x) is finite in (Φx; ≤). Let X be a directed subset of Ψ whose supremum exists in Ψ and such that ψ ≤ ∨X. Then we have ψ = εx(ψ) ≤ εx(∨X) = ∨εx(X) (Theorem 61). Define X′ = {(εx(φ), x) : φ ∈ X}. It is a directed set in (Φx; ≤) and we have (ψ, x) ≤ (∨εx(X), x) = ∨X′ (Lemma 22). Since (ψ, x) is assumed to be finite in (Φx; ≤), there is an element (εx(φ), x) ∈ X′ such that (ψ, x) ≤ (εx(φ), x). This implies ψ ≤ φ for an element φ ∈ X. This shows that ψ is finite in (Ψ; ≤). □

According to this theorem, finite elements in (Ψ; ≤) correspond to finite elements in (Φx; ≤) for domains x which are supports of the finite elements in (Ψ; ≤). Note that finite elements in (Φx; ≤) are not necessarily finite in (Φ; ≤) and that the finite elements in (Ψ; ≤) do not induce finite elements in (Φ; ≤), as one might have expected. So, if we denote the finite elements in (Φx; ≤) by Φx,f , and

Φf = ∪x∈D Φx,f ,

then Φf does not represent the finite elements of (Φ; ≤), but the union of the locally finite ones. Note that Φf is closed under combination. In fact, if (φ, x) ∈ Φx,f and (ψ, y) ∈ Φy,f , then by Theorem 68 φ and ψ are finite elements in (Ψ; ≤) and so is their combination φ · ψ. This combination has x ∨ y as a support and, again by the same theorem, (φ, x) · (ψ, y) = (φ · ψ, x ∨ y) belongs to Φx∨y,f . However, transport does not necessarily preserve finiteness, except if the finite elements of (Ψ; ≤) are closed under extraction. Nevertheless, for x ≤ y, the element ty(ψ, x) = (ψ, x) · (1, y) remains finite if (ψ, x) is finite. This is true because (1, y) is a finite element. Next we show that strong density of the compact algebra (Ψ,D; ≤, ⊥, ·, ε) induces a local density within the domains Φx of the dual labeled algebra.

That is, the finite elements in (Φx; ≤) are dense in Φx and thus approximate the elements of Φx.

Theorem 69 Let (Ψ,D; ≤, ⊥, ·, ε) be a domain-free compact information algebra and (Φ,D; ≤, ⊥, ·, t) its dual labeled version. Then, for all (φ, x) ∈ Φ,

(φ, x) = ∨{(ψ, x) ∈ Φx,f : (ψ, x) ≤ (φ, x)}. (9.16)

Proof. By strong density in the algebra (Ψ,D; ≤, ⊥, ·, ε) we have

(φ, x) = (∨{ψ ∈ Ψf : ψ = εx(ψ) ≤ φ}, x)
= ∨{(ψ, x) ∈ Φx,f : (ψ, x) ≤ (φ, x)}.

This equality holds by Lemma 22. □

So, the dual, labeled version of a compact information algebra is a labeled algebra where local density according to (9.16) holds. We take this as the model to define labeled compact information algebras. Note that the order in a labeled information algebra (Φ,D; ≤, ⊥, ·, t) is again defined by φ ≤ ψ if φ · ψ = ψ. This also induces a partial order (Φx; ≤) on the set Φx = {φ ∈ Φ : d(φ) = x} of elements with domain x. The following lemma states a few elementary properties of this labeled order.

Lemma 23 Let (Φ,D; ≤, ⊥, ·, t) be an idempotent labeled information algebra. Then

1. x ≤ d(φ) implies tx(φ) ≤ φ,

2. x ≥ d(φ) implies tx(φ) ≥ φ,

3. φ ≤ ψ implies tx(φ) ≤ tx(ψ) for any x ∈ D,

4. φ, ψ ≤ φ · ψ,

5. φ ≤ ψ implies φ · χ ≤ ψ · χ.

Proof. 1.) follows from the Idempotency Axiom A7 of a labeled generalised information algebra, tx(φ) · φ = φ.

2.) follows from tx(φ) = φ · 1x: by idempotency, tx(φ) · φ = φ · 1x · φ = φ · 1x = tx(φ).

3.) Let d(φ) = y and d(ψ) = z and assume first x ≥ y, z. Then y⊥x|x, hence y⊥z|x. Using the Combination Axiom A5, it follows that tx(ψ) = tx(φ · ψ) = tx(φ) · tx(ψ), which shows that tx(φ) ≤ tx(ψ) in this case. Assume next that d(φ) = d(ψ) = y and x ≤ y. By item 1 proved above, we have tx(φ) · φ = φ; hence, if φ ≤ ψ, we obtain tx(φ) · ψ = ψ. From y⊥x|x it follows with the Combination Axiom that tx(ψ) = tx(tx(φ) · ψ) = tx(φ) · tx(ψ), which shows that in this case too tx(φ) ≤ tx(ψ). In the general case with d(φ) = y and d(ψ) = z, the assumption φ ≤ ψ implies first tx∨y∨z(φ) ≤ tx∨y∨z(ψ) and then tx(φ) = tx(tx∨y∨z(φ)) ≤ tx(tx∨y∨z(ψ)) = tx(ψ), using the two special cases proved before.

4.) follows from idempotency, φ · (φ · ψ) = φ · ψ and ψ · (φ · ψ) = φ · ψ.

5.) If φ ≤ ψ, we have by idempotency (φ · χ) · (ψ · χ) = (φ · ψ) · χ = ψ · χ. □

The lemma shows in particular that the combination and transport operations preserve order. After this preparation, we are in a position to define the concept of a labeled compact information algebra.
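The five properties of Lemma 23 can also be verified exhaustively in a toy labeled algebra (our own illustration, not part of the text): labeled elements are pairs (r, d) of a relation over two binary variables together with a domain d on which r is saturated; combination intersects relations and joins domains, and transport tx is cylindrification over the variables in x.

```python
from itertools import product, combinations

TUPLES = tuple(product((0, 1), repeat=2))
DOMAINS = [(), (0,), (1,), (0, 1)]

def powerset(xs):
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def ext(x, a):                                     # cylindrification over variables x
    return frozenset(t for t in TUPLES
                     if any(all(t[i] == s[i] for i in x) for s in a))

# labeled elements: (relation, domain) with the relation saturated on its domain
PHI = [(r, d) for d in DOMAINS for r in powerset(TUPLES) if ext(d, r) == r]

def join(x, y):                                    # join of domains
    return tuple(sorted(set(x) | set(y)))

def comb(p, q):                                    # labeled combination
    return (p[0] & q[0], join(p[1], q[1]))

def transport(x, p):                               # t_x: project/extend to domain x
    return (ext(x, p[0]), x)

def leq(p, q):                                     # labeled order: p <= q iff p·q = q
    return comb(p, q) == q

ok1 = all(leq(transport(x, p), p)                  # item 1: x <= d(p)
          for p in PHI for x in DOMAINS if set(x) <= set(p[1]))
ok2 = all(leq(p, transport(x, p))                  # item 2: x >= d(p)
          for p in PHI for x in DOMAINS if set(x) >= set(p[1]))
ok3 = all(leq(transport(x, p), transport(x, q))    # item 3: transport is monotone
          for p in PHI for q in PHI if leq(p, q) for x in DOMAINS)
ok4 = all(leq(p, comb(p, q)) and leq(q, comb(p, q))  # item 4
          for p in PHI for q in PHI)
ok5 = all(leq(comb(p, r), comb(q, r))              # item 5: combination is monotone
          for p in PHI for q in PHI if leq(p, q) for r in PHI)
```

All five items hold over the 26 labeled elements of this model.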

Definition 21 An idempotent labeled generalised information algebra (Φ,D; ≤, ⊥, ·, t) is called compact, if

1. for all domains x ∈ D and elements φ with d(φ) = x,

φ = ∨{ψ ∈ Φx,f : ψ ≤ φ}, (9.17)

where Φx,f denotes the set of the finite elements of (Φx; ≤).

2. If ψ ∈ Φx,f and y ≥ x, then ty(ψ) ∈ Φy,f .

Let

Φf = ∪x∈D Φx,f

be the set of all locally finite elements. Again, we emphasize that this is not the set of the finite elements of (Φ; ≤). The justification of this definition will be that the associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ε) is again compact. Before we show this, we give some useful results.

Lemma 24 Let (Φ,D; ≤, ⊥, ·, t) be a labeled compact information algebra, X a directed subset of Φy such that its supremum exists in Φy, and x ≤ y. Then

tx(∨X) = ∨tx(X). (9.18)

Proof. For φ ∈ X we have φ ≤ ∨X, hence tx(φ) ≤ tx(∨X). So, tx(∨X) is an upper bound of the elements tx(φ) for φ ∈ X. On the other hand, by density in the compact labeled algebra,

tx(∨X) = ∨{ψ ∈ Φx,f : ψ ≤ tx(∨X)}

= ∨{ψ ∈ Φx,f : ty(ψ) ≤ ∨X}. (9.19)

Since ty(ψ) is finite in domain y if ψ is so in domain x ≤ y, there is an element φ ∈ X such that ty(ψ) ≤ φ whenever ty(ψ) ≤ ∨X. But then it follows that ψ ≤ tx(φ) ∈ tx(X), and therefore tx(∨X) is the least upper bound of tx(X). So, indeed, tx(∨X) = ∨tx(X). □

This lemma implies that Φf is closed under combination. In fact, consider φ ∈ Φx,f and ψ ∈ Φy,f , and a directed set X in Φx∨y such that φ · ψ ≤ ∨X. Then φ ≤ tx(∨X) = ∨tx(X) by Lemma 24, and similarly ψ ≤ ty(∨X) = ∨ty(X). Both sets tx(X) and ty(X) are directed, and therefore there are elements tx(φ′) ∈ tx(X) such that φ ≤ tx(φ′) and ty(ψ′) ∈ ty(X) such that ψ ≤ ty(ψ′). Both φ′ and ψ′ belong to X, and so there is also an element χ in X such that φ′, ψ′ ≤ χ. Hence, finally, we conclude that φ · ψ ≤ φ′ · ψ′ ≤ χ ∈ X. This proves that φ · ψ ∈ Φx∨y,f , hence φ · ψ belongs to Φf . But Φf is not necessarily closed under transport. As a preparation for the examination of the dual domain-free algebra associated with a labeled compact information algebra (Φ,D; ≤, ⊥, ·, t), we prove the following lemma.

Lemma 25 Let (Φ,D; ≤, ⊥, ·, t) be an idempotent labeled information algebra, X a directed subset of Φ such that its supremum exists in Φ. Then

[∨X]σ = ∨[X]σ, (9.20)

where [X]σ = {[φ]σ : φ ∈ X}.

Proof. Define ψ = ∨X, such that [ψ]σ = [∨X]σ, and assume that d(ψ) = x. Then, for all φ ∈ X we have φ ≤ ψ and d(φ) ≤ x. Therefore, for all φ ∈ X we have [φ]σ ≤ [ψ]σ, and so [ψ]σ is an upper bound of [X]σ.

Assume [χ]σ to be another upper bound of [X]σ and d(χ) = y. For any φ in X we have [χ]σ = [φ]σ · [χ]σ = [φ · χ]σ = [tx∨y(φ) · tx∨y(χ)]σ. This implies tx∨y(φ) ≤ tx∨y(χ). Since for φ ∈ X we have d(φ) ≤ x, it follows that φ ≤ tx(φ) = tx(tx∨y(φ)) ≤ tx(tx∨y(χ)). But then ψ = ∨X ≤ tx(tx∨y(χ)). It follows that tx∨y(ψ) ≤ tx∨y(tx(tx∨y(χ))) ≤ tx∨y(tx∨y(χ)) = tx∨y(χ). From this we conclude that [ψ]σ ≤ [χ]σ, such that [ψ]σ is the supremum of [X]σ. □

Now we show that the domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ε) associated with a labeled compact information algebra (Φ,D; ≤, ⊥, ·, t) is indeed again compact. This justifies the definition of a labeled compact information algebra above.

Theorem 70 Let (Φ,D; ≤, ⊥, ·, t) be a labeled compact information algebra. Then (Φ/σ, D; ≤, ⊥, ·, ε) is a domain-free compact information algebra and its finite elements are the elements [ψ]σ for ψ ∈ Φf .

Proof. We know already that (Φ/σ, D; ≤, ⊥, ·, ε) is an idempotent domain-free information algebra (see Section 7.1, in particular Theorem 54). We show first that the elements [ψ]σ for ψ ∈ Φf are exactly the finite elements in (Φ/σ; ≤). So, assume first that [ψ]σ is finite in (Φ/σ; ≤). By the Support Axiom, [ψ]σ has a support x, hence we may select a representative ψ of the class [ψ]σ with label d(ψ) = x. Consider then a directed set X in Φx such that its supremum exists in Φx and ψ ≤ ∨X. Using Lemma 25, we conclude that [ψ]σ ≤ [∨X]σ = ∨[X]σ. Further, the set [X]σ is directed in (Φ/σ; ≤). Since [ψ]σ is finite in (Φ/σ; ≤), there is an element [φ]σ in [X]σ such that [ψ]σ ≤ [φ]σ. But then we may select φ ∈ X with ψ ≤ φ. This shows that ψ is finite in (Φx; ≤).

Conversely, assume that ψ is finite in (Φx; ≤). Consider a directed subset X of Φ/σ whose supremum exists in Φ/σ and such that [ψ]σ ≤ ∨X. We want to show that there is an element [φ]σ in X such that [ψ]σ ≤ [φ]σ; this then shows that [ψ]σ is finite in (Φ/σ; ≤). The supremum ∨X has a support x. By Theorem 61 we have

∨X = εx(∨X) = ∨εx(X),

where εx(X) = {εx([φ]σ) : [φ]σ ∈ X}. The element [ψ]σ of the information algebra Φ/σ has a support y. Then x ∨ y is also a support of both [ψ]σ and the elements of εx(X). We may therefore select a representative χ of each equivalence class εx([φ]σ) with label d(χ) = x ∨ y, such that [χ]σ = [tx∨y(χ)]σ = εx([χ]σ). Define then the set X′ = {χ ∈ Φx∨y : [χ]σ ∈ εx(X)}. This set is directed, as is the set εx(X), since X is directed. We claim that the supremum of X′ exists in Φx∨y. In fact, let [η]σ = ∨X and consider a representative η of [η]σ with label d(η) = x ∨ y. Then η is an upper bound of X′. Assume η′ to be another upper bound of X′ with label x ∨ y. Then we have [χ]σ ≤ [η′]σ for all χ ∈ X′, hence [η]σ ≤ [η′]σ and therefore η ≤ η′, which shows that η = ∨X′.

We have [ψ]σ ≤ x(∨X) = ∨X. It follows that

[ψ]σ ≤ x∨y(∨X) = [η]σ and d(η) = x ∨ y. We conclude then that tx∨y(ψ) ≤ η. Since ψ is finite, tx∨y(ψ) = ψ · 1x∨y is finite too. Therefore, by local compactness, there is a χ ∈ X′ such that tx∨y(ψ) ≤ χ, hence [ψ]σ = [tx∨y(ψ)]σ ≤ [χ]σ = x([φ]σ) ≤ [φ]σ for some [φ]σ ∈ X. This shows that [ψ]σ is finite in (Φ/σ; ≤). It remains to show strong density. For this purpose consider an element [φ]σ = x([φ]σ) in Φ/σ. We take a representant φ of [φ]σ with label d(φ) = x. By the local density in the labeled algebra (Φ,D; ≤, ⊥, ·, t) we have

[φ]σ = [∨{ψ ∈ Φx,f : ψ ≤ φ}]σ.

From Lemma 25 and the first part of this theorem it then follows that

[φ]σ = ∨{[ψ]σ : [ψ]σ finite in (Φ/σ; ≤), [ψ]σ = x([ψ]σ) ≤ [φ]σ}.

This is strong density in the domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ) and this concludes the proof that this algebra is compact. ⊓⊔

In summary, a domain-free compact information algebra D transforms into an associated dual labeled compact information algebra DL. Conversely, a labeled compact information algebra L has an associated dual domain-free compact information algebra LD. Then the labeled compact algebra DL transforms back into the domain-free compact algebra DLD. Similarly, the domain-free compact algebra LD transforms back into the labeled compact algebra LDL. We have seen in Section 7.3 that D and DLD are isomorphic under the map ψ 7→ [(ψ, x)]σ. Similarly, the labeled algebra L is isomorphic to the algebra LDL under the map φ 7→ ([φ]σ, x). We show now that in the case of compact algebras these maps are continuous.

Theorem 71 Let (Ψ,D; ≤, ⊥, ·, ) and (Φ,D; ≤, ⊥, ·, t) be compact domain-free and compact labeled generalised information algebras respectively. Then, if X is a directed subset of Ψ whose supremum exists in Ψ and which has support x,

[(∨X, x)]σ = ∨φ∈X [(φ, x)]σ. (9.21)

Further, if X is a directed subset of Φ whose supremum exists in Φ and has label x, then

([∨X]σ, x) = ∨ψ∈X ([ψ]σ, x). (9.22)

Proof. We start with (9.21). By Theorem 61 we have ∨X = x(∨X) = ∨x(X). So, using Lemma 22

[(∨X, x)]σ = [(∨x(X), x)]σ = [∨φ∈X (x(φ), x)]σ.

From this it follows, using Lemma 25,

[(∨X, x)]σ = ∨φ∈X [(x(φ), x)]σ = ∨φ∈X x([(φ, x)]σ).

But all elements [(φ, x)]σ have support x, therefore we conclude

[(∨X, x)]σ = ∨φ∈X [(φ, x)]σ.

This is (9.21). In order to prove (9.22) we note that for ψ ∈ X, we have ψ ≤ ∨X and d(ψ) ≤ x. This implies tx(ψ) ≡σ ψ, hence x([ψ]σ) = [tx(ψ)]σ = [ψ]σ. So, x is a support for all [ψ]σ such that ψ ∈ X. Define X′ = {tx(ψ) : ψ ∈ X}. Then, by Lemma 24, ∨X′ = ∨X = ∨ψ∈X tx(ψ). Therefore, we see that (Lemma 25)

[∨X]σ = [∨X′]σ = [∨ψ∈X tx(ψ)]σ = ∨ψ∈X [tx(ψ)]σ = ∨ψ∈X [ψ]σ (9.23)

So, from Lemma 22 we obtain

([∨X]σ, x) = (∨ψ∈X [ψ]σ, x) = ∨ψ∈X ([ψ]σ, x).

This is (9.22). ⊓⊔

As remarked above, this theorem shows that D ≅ DLD and L ≅ LDL under continuous isomorphisms, if D and L are compact domain-free or labeled information algebras respectively. We now turn to the algebraic case. What is an algebraic labeled information algebra, and what can be said about the duality between algebraic domain-free and labeled algebras? We start with an algebraic domain-free generalised information algebra (Ψ,D; ≤, ⊥, ·, ) and examine its dual labeled version (Φ,D; ≤, ⊥, ·, t), where Φ is, as always, the set of all pairs (ψ, x) with ψ ∈ Ψ and x(ψ) = ψ. Consider a subset X of Φ. If its supremum ∨X exists in Φ, then ∨X = (∨X′, ∨X″), where X′ = {ψ : (ψ, x) ∈ X} and X″ = {x : (ψ, x) ∈ X} (Lemma 21). Since (Ψ,D; ≤, ⊥, ·, ) is algebraic, ∨X′ always exists. However, there is no guarantee that ∨X″ exists. So, even if (Ψ,D; ≤, ⊥, ·, ) is algebraic, this does not imply that every subset X of Φ has a supremum; (Φ; ≤) is not necessarily complete. However, for any x ∈ D, the local orders (Φx; ≤) are complete. If X is a subset of Φx, then ∨X = (∨X′, x). So, (Φ,D; ≤, ⊥, ·, t) is a compact labeled information algebra, which is locally complete in this sense. Alternatively, (Φx; ≤) is a dcpo, which together with compactness implies that it is a complete lattice. This leads to the following definition:

Definition 22 A labeled generalised information algebra (Φ,D; ≤, ⊥, ·, t) is called algebraic if

1. it is compact,

2. for all x ∈ D, (Φx; ≤) is a dcpo.

Note that in an algebraic domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) we may always adjoin a top domain, if D does not already have a greatest element. Therefore, in its labeled version we may likewise assume that D has a greatest element >. Then Φ> contains the pairs (ψ, >) for all ψ ∈ Ψ and (Φ>; ≤) is essentially the same as (Ψ; ≤), as we shall see below (Theorem 73). Therefore, we consider in a first step algebraic labeled information algebras (Φ,D; ≤, ⊥, ·, t), where D has a greatest element >. Then its domain-free version is also algebraic, as the following theorem shows.

Theorem 72 Let (Φ,D; ≤, ⊥, ·, t) be an algebraic labeled information alge- bra and assume that D has a greatest element >. Then (Φ/σ, D; ≤, ⊥, ·, ) is an algebraic domain-free information algebra.

Proof. From Theorem 70 we know that (Φ/σ, D; ≤, ⊥, ·, ) is a compact domain-free information algebra. It remains to show that (Φ/σ; ≤) is a dcpo. Consider a directed subset X of Φ/σ. Note that for all φ ∈ Φ,

φ ≡σ t>(φ). Therefore, we may always take the representant of the class [φ]σ in the top domain. Define

X′ = {φ ∈ Φ> : [φ]σ ∈ X}.

The supremum ∨X′ exists in Φ> since (Φ,D; ≤, ⊥, ·, t) is algebraic. By Lemma 25 we then obtain [∨X′]σ = ∨φ∈X′ [φ]σ = ∨X. So the supremum of X exists in Φ/σ and (Φ/σ; ≤) is a dcpo. ⊓⊔

The proof shows that somehow the whole domain-free information algebra is already incorporated in the top-level domain of the labeled algebra. This can be made more precise: Define in Φ> the following operations:

1. Combination: φ, ψ ∈ Φ> 7→ φ · ψ, where · denotes the combination in Φ,

2. Extraction: φ ∈ Φ>, x ∈ D 7→ x(φ) = t>(tx(φ)).

We claim that with these operations, (Φ>,D; ≤, ⊥, ·, ) becomes a domain- free information algebra.

Theorem 73 Let (Φ,D; ≤, ⊥, ·, t) be a labeled information algebra and assume that D has a greatest element >. Then (Φ>,D; ≤, ⊥, ·, ) with the operations of combination and extraction as defined above is a domain-free information algebra, isomorphic to (Φ/σ, D; ≤, ⊥, ·, ). If (Φ,D; ≤, ⊥, ·, t) is algebraic, then (Φ>,D; ≤, ⊥, ·, ) is so too.

Proof. The q-separoid and semigroup axioms A0 and A1 are evident. Clearly, > is a support for all φ ∈ Φ>. If x is a support for φ ∈ Φ>, that is, φ = t>(tx(φ)), and y ≥ x, then, using elementary properties of transport (Lemma 1), we obtain

y(φ) = t>(ty(φ)) = t>(ty(t>(tx(φ)))) = t>(ty(t>(ty(tx(φ)))))

= t>(ty(tx(φ))) = t>(tx(φ)) = φ.

This shows that the Support Axiom A2 is valid. Further, 1> is the unity of combination in Φ> and 0> is the null element. Evidently x(1>) = t>(tx(1>)) = 1> and similarly x(0>) = t>(tx(0>)) = 0>. Further, if x(φ) = t>(tx(φ)) = 0>, then φ = 0>. This is axiom A3.

Assume now that x⊥y|z and x(φ) = φ. Then

y(z(φ)) = t>(ty(t>(tz(φ)))) = t>(ty(tz(φ))), since a transport from z to y can always pass by the larger domain >. Then, t>(ty(tz(φ))) = t>(ty(tz(t>(tx(φ))))) = t>(ty(tz(tx(φ)))) = t>(ty(tx(φ))) by the Transport Axiom in the labeled algebra. We conclude

t>(ty(tx(φ))) = t>(ty(t>(tx(φ)))) = t>(ty(φ)).

This shows that y(z(φ)) = y(φ), thus the Extraction Axiom A4 holds.

If x⊥y|z and x(φ) = φ, y(ψ) = ψ, then

z(φ · ψ) = t>(tz(φ · ψ)) = t>(tz(t>(tx(φ)) · t>(ty(ψ))))

= t>(tz(t>(tx(φ) · ty(ψ))))

= t>(tz(tx(φ) · ty(ψ))) = t>(tz(tx(φ)) · tz(ty(ψ))), since the transport of a combination to a larger domain equals the combination of the transports to the larger domain, and by the Combination Axiom of the labeled algebra. This implies further that

z(φ · ψ) = t>(tz(t>(tx(φ))) · tz(t>(ty(ψ))))

= t>(tz(φ) · tz(ψ)) = t>(tz(φ)) · t>(tz(ψ))

= z(φ) · z(ψ).

This verifies the validity of the Combination Axiom A5.

It remains to verify Idempotency. Since φ = t>(φ), we have indeed φ·x(φ) = φ · t>(tx(φ)) = t>(φ) · t>(tx(φ)) = t>(φ · tx(φ)) = t>(φ) = φ by idempotency in the labeled algebra. So (Φ>,D; ≤, ⊥, ·, ) is a domain-free idempotent generalised information algebra.

The map φ ∈ Φ> 7→ [φ]σ is clearly one-to-one and onto Φ/σ. It is also a homomorphism since [φ · ψ]σ = [φ]σ · [ψ]σ and [t>(tx(φ))]σ = [tx(φ)]σ = x([φ]σ).

If (Φ,D; ≤, ⊥, ·, t) is algebraic, then (Φ>; ≤) is a complete lattice. But this means that (Φ>,D; ≤, ⊥, ·, ) is algebraic. The isomorphism is in this case continuous (see Lemma 25). ⊓⊔

All this shows that the top domain plays an important role with respect to algebraicity of information algebras. If in the algebraic labeled information algebra (Φ,D; ≤, ⊥, ·, t) there is no greatest element in D, then the derived domain-free algebra (Φ/σ, D; ≤, ⊥, ·, ) is compact, but not algebraic. Since the domain-free algebra of a compact labeled algebra is compact, and this domain-free algebra can be completed to an algebraic one, we conjecture that in an algebraic labeled information algebra we may also adjoin a top domain, if there is none, and then complete this domain and thus get an algebraic information algebra with a top domain. This line of inquiry will however not be pursued here.

9.4 Continuous Algebras

The notion of approximation can be somewhat weakened. This leads to a generalisation of the concept of compact information algebras. The present section is partially based on (Guan & Li, 2012) but applies to generalised information algebras. The basic notion in this section is the way-below relation in an ordered set.

Definition 23 Way-Below. Let (Ψ; ≤) be a partially ordered set. For φ, ψ ∈ Ψ we write ψ  φ and say ψ is way-below φ, if for every directed set X ⊆ Ψ, for which the supremum exists, φ ≤ ∨X implies that there is an element χ ∈ X such that ψ ≤ χ.
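As a sanity check of this definition, the way-below relation can be computed by brute force in a finite lattice, where all directed subsets can be enumerated. The following sketch (our illustration, not part of the paper) uses the powerset of {1, 2} ordered by inclusion, with supremum given by union. In any finite lattice way-below collapses to the order itself, so every element is finite in the sense defined below; in infinite algebras the relation is genuinely stronger, as the example given later in this section shows.

```python
from itertools import chain, combinations

# Brute-force check of the way-below relation in a small finite lattice:
# the powerset of {1, 2} ordered by inclusion, supremum = union.

def subsets_of(items):
    return [frozenset(s) for s in chain.from_iterable(
        combinations(sorted(items), r) for r in range(len(items) + 1))]

ELEMENTS = subsets_of({1, 2})

def directed(S):
    # S is directed: non-empty and every pair has an upper bound in S.
    return bool(S) and all(
        any(a | b <= c for c in S) for a in S for b in S)

def way_below(psi, phi):
    # psi << phi per Definition 23: every directed S with
    # phi <= sup S (here: union of S) contains some chi with psi <= chi.
    for idx in subsets_of(range(len(ELEMENTS))):
        S = [ELEMENTS[i] for i in idx]
        if directed(S) and phi <= frozenset().union(*S):
            if not any(psi <= chi for chi in S):
                return False
    return True

# In a finite lattice, way-below coincides with <=; in particular every
# element phi satisfies phi << phi, i.e. every element is finite:
assert all(way_below(p, q) == (p <= q) for p in ELEMENTS for q in ELEMENTS)
```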

Note that φ is a finite element if and only if φ  φ. The following lemma contains some well-known elementary results on the way-below relation, see for instance (Gierz, 2003).

Lemma 26 Let (Ψ; ≤) be a partially ordered set. Then the following holds for φ, ψ, χ ∈ Ψ:

1. ψ  φ implies ψ ≤ φ,

2. ψ  φ and φ ≤ χ imply ψ  χ,

3. χ ≤ ψ and ψ  φ imply χ  φ.

4. χ  ψ and ψ  φ imply χ  φ.

We are of course interested in the way-below relation in case that (Ψ,D; ≤ , ⊥, ·, ) is a domain-free idempotent information algebra, that is, Ψ is a semilattice. Then the way-below relation has some additional properties.

Lemma 27 Let (Ψ,D; ≤, ⊥, ·, ) be a domain-free information algebra. Then

1. 1  φ for all φ ∈ Ψ.

2. ψ1, ψ2  φ implies ψ1 ∨ ψ2  φ for all ψ1, ψ2 ∈ Ψ.

3. The set {ψ ∈ Ψ : ψ  φ} is an ideal for all φ ∈ Ψ.

4. ψ  φ if and only if for all X ⊆ Ψ such that ∨X exists and φ ≤ ∨X, there is a finite subset F of X such that ψ ≤ ∨F .

Proof. (1) Let X ⊆ Ψ be a directed set, and φ ≤ ∨X. Since X is non-empty, there is a ψ ∈ X and 1 ≤ ψ, hence 1  φ.

(2) Assume ψ1, ψ2  φ. Consider any directed set X ⊆ Ψ such that φ ≤ ∨X. Then there exist elements χ1, χ2 ∈ X so that ψ1 ≤ χ1 and ψ2 ≤ χ2. Since X is directed, there is also an element χ ∈ X so that χ1, χ2 ≤ χ. But then, ψ1 ∨ ψ2 ≤ χ1 ∨ χ2 ≤ χ. This shows that ψ1 ∨ ψ2  φ. (3) Assume ψ  φ and χ ≤ ψ. Then by Lemma 26 (3) χ  φ. Further let ψ1  φ and ψ2  φ. By (2) just proved, ψ1 ∨ ψ2  φ. Hence {ψ ∈ Ψ: ψ  φ} is an ideal. (4) Suppose first that ψ  φ. Let X be a subset of Ψ such that ∨X exists and φ ≤ ∨X. Let Y be the set of all joins of finite subsets of X. Then X ⊆ Y and ∨X is an upper bound for Y . Let χ be another upper bound of Y . Then χ is an upper bound of X, hence ∨X ≤ χ. So ∨X is the supremum of Y , ∨X = ∨Y . Furthermore Y is a directed set. So there is an element η ∈ Y such that ψ ≤ η and η = ∨F for some finite subset F of X. Conversely consider elements ψ, φ ∈ Ψ such that condition 4 of the lemma holds. Let X be a directed subset of Ψ such that ∨X exists and φ ≤ ∨X. There is a finite subset F of X such that ψ ≤ ∨F . Since X is directed, there is a χ ∈ X such that ∨F ≤ χ, hence ψ ≤ χ. So ψ  φ. ut With the aid of the way-below relation, algebraic information algebras can be alternatively characterized.

Theorem 74 If (Ψ,D; ≤, ⊥, ·, ) is an idempotent domain-free information algebra, then the following conditions are equivalent:

1. (Ψ,D; ≤, ⊥, ·, ) is an algebraic information algebra with finite elements Ψf .

2. (Ψ; ≤) is an algebraic lattice with finite elements Ψf and ∀x ∈ D, ∀φ ∈ Ψ

x(φ) = ∨{ψ ∈ Ψf : ψ = x(ψ)  φ}. (9.24)

Proof. (1) ⇒ (2): By definition (Ψ; ≤) is an algebraic lattice, that is a complete lattice with finite elements Ψf . Then condition (9.24) follows from strong density and Lemma 26 in the following way,

x(φ) = ∨{ψ ∈ Ψf : ψ = x(ψ) ≤ φ}

= ∨{ψ : ψ  ψ = x(ψ) ≤ φ}

= ∨{ψ : ψ  ψ = x(ψ)  φ}

= ∨{ψ ∈ Ψf : ψ = x(ψ)  φ}.

(2) ⇒ (1): We use Theorem 62. Convergence holds, since (Ψ; ≤) is a complete lattice, density follows from (9.24) since ψ  φ implies ψ ≤ φ, and compactness follows from the lattice-theoretic finiteness. ⊓⊔

Another important property of finite elements in a compact information algebra is given by the following theorem:

Theorem 75 If (Ψ,D; ≤, ⊥, ·, ) is a compact domain-free information algebra, then ψ  φ implies that there is an element χ ∈ Ψf so that ψ ≤ χ ≤ φ.

Proof. The set Aφ = {χ ∈ Ψf : χ ≤ φ} is directed and φ = ∨Aφ, hence φ ≤ ∨Aφ. Then ψ  φ implies the existence of an element χ ∈ Aφ so that ψ ≤ χ. But χ ≤ φ. So ψ ≤ χ ≤ φ and χ ∈ Ψf . ⊓⊔

A set S of elements having the property that ψ  φ implies the existence of a χ ∈ S such that ψ ≤ χ ≤ φ is called separating. So the set of finite elements in a compact information algebra is separating. We now introduce continuous information algebras and show that they are a generalisation of algebraic ones.

Definition 24 Continuous Information Algebras. A generalised domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is called continuous with basis B ⊆ Ψ if B is closed under join, contains the unit 1, and B satisfies the following conditions:

1. Convergence: If X ⊆ B is directed, then ∨X exists in Ψ.

2. B-Density: For all φ ∈ Ψ and for all x ∈ D,

x(φ) = ∨{ψ ∈ B : ψ = x(ψ)  x(φ)}.

Note that in a compact information algebra (Ψ,D; ≤, ⊥, ·, ) the finite elements Ψf form a basis. So, an algebraic information algebra is also continuous with basis Ψf . We shall present below an example of a continuous information algebra which is not compact. So continuous information algebras present a genuine generalisation of compact information algebras. The approximation by finite elements is replaced by an approximation by more general elements of a basis B. Strong B-density implies weak B-density: In fact let φ ∈ Ψ, then there is an x ∈ D so that φ = x(φ). Then by the strong B-density:

φ = x(φ) = ∨{ψ ∈ B : ψ = x(ψ)  φ} ≤ ∨{ψ ∈ B : ψ  φ} ≤ φ.

This is weak B-density. For later purposes we remark that both the sets {ψ ∈ B : ψ  φ} and {ψ ∈ B : ψ = x(ψ)  φ} are directed. Note also that ψ  φ does not imply x(ψ)  x(φ). Just as (Ψ; ≤) is an algebraic lattice in an algebraic information algebra (Ψ,D; ≤, ⊥, ·, ), in a continuous information algebra (Ψ,D; ≤, ⊥, ·, ) the partial order (Ψ; ≤) is a continuous lattice, namely a complete lattice such that for all φ ∈ Ψ

φ = ∨{ψ ∈ Ψ: ψ  φ}. (9.25)

Theorem 76 If (Ψ,D; ≤, ⊥, ·, ) is an idempotent domain-free information algebra, then the following are equivalent:

1. (Ψ,D; ≤, ⊥, ·, ) is a continuous information algebra.

2. (Ψ; ≤) is a continuous lattice, and ∀x ∈ D, ∀φ ∈ Ψ,

x(φ) = ∨{ψ ∈ Ψ : ψ = x(ψ)  x(φ)}. (9.26)

Proof. Assume first (Ψ,D; ≤, ⊥, ·, ) to be a continuous information algebra with basis B. We show first that (Ψ; ≤) is a complete lattice. Consider a non-empty subset X of Ψ. Define Y to be the set of all elements in B, which are way-below all elements in X,

Y = {ψ ∈ B : ψ  φ for all φ ∈ X}.

Since 1 ∈ Y , the set is non-empty, and with ψ1, ψ2 ∈ Y also ψ1 ∨ ψ2 ∈ Y (Lemma 27). So the set Y is directed. Therefore ∨Y exists and is a lower bound of X. Assume ψ to be another lower bound of X. Then Aψ = {η ∈ B : η  ψ} ⊆ Y , since η  ψ ≤ φ implies η  φ. From this we conclude that ψ = ∨Aψ ≤ ∨Y , hence ∨Y is the infimum of X. Since Ψ has further a top element 0 it follows from standard results of lattice theory, that (Ψ; ≤) is a complete lattice. Further, using B-density, we obtain for all φ ∈ Ψ,

φ = ∨{ψ ∈ B : ψ  φ} ≤ ∨{ψ ∈ Ψ: ψ  φ} ≤ φ.

So (Ψ; ≤) is indeed a continuous lattice. Further, again by density,

x(φ) = ∨{ψ ∈ B : ψ = x(ψ)  x(φ)}

≤ ∨{ψ ∈ Ψ : ψ = x(ψ)  x(φ)} ≤ x(φ), so (9.26) holds. If (Ψ; ≤), on the other hand, is a complete lattice, then convergence holds with Ψ as a basis. And (9.26) is exactly B-density with respect to the basis Ψ. Hence (Ψ,D; ≤, ⊥, ·, ) is a continuous information algebra. ⊓⊔

Here follows an example of a continuous information algebra.

Example: Continuous Information Algebra: This example is from (Guan & Li, 2012). Let Ψ = [0, 1] be the real interval between 0 and 1 and D = {0, 1}. Join is defined as maximum, the number 0 is the unit and the number 1 the null element of the algebra. Information extraction is defined as follows:

1(φ) = φ,

0(φ) = φ if φ ∈ [0, 1/2], and 0(φ) = 1/2 if φ ∈ (1/2, 1].

We leave it to the reader to verify the axioms of an idempotent valuation algebra. Any non-empty subset X of [0, 1] is in this example directed and sup X always exists. The relation ψ  φ holds if either ψ < φ or if ψ = φ = 0. As a basis we take B = Ψ. Then it can be verified that x(φ) = ∨{ψ ∈ B : ψ = x(ψ)  φ} holds both for x = 0 and x = 1. So it is a continuous information algebra. But it is not compact: the only element satisfying φ  φ is φ = 0.

We have seen above that a compact information algebra is continuous. But the converse does not hold, as the example above shows. Here follows a necessary and sufficient condition for a continuous information algebra to be compact.
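This example is simple enough to probe numerically. The following sketch (our code, not from the paper; the way-below test hard-codes the characterization ψ  φ iff ψ < φ or ψ = φ = 0 stated above) checks idempotency and monotonicity of extraction on a sample grid and exhibits the failure of compactness:

```python
# Numerical sketch of the example above: Psi = [0,1], join = max,
# unit 0, null 1, D = {0, 1}.

def extract(x, phi):
    # Extraction of the example: 1(phi) = phi, 0(phi) caps phi at 1/2.
    return phi if x == 1 else min(phi, 0.5)

def way_below(psi, phi):
    # Characterization stated in the text: psi is way-below phi
    # iff psi < phi, or psi = phi = 0.
    return psi < phi or psi == phi == 0

grid = [i / 100 for i in range(101)]  # finite sample of [0, 1]

for x in (0, 1):
    # Extraction is idempotent and removes information (goes down):
    assert all(extract(x, extract(x, p)) == extract(x, p) for p in grid)
    assert all(extract(x, p) <= p for p in grid)

# The algebra is not compact: the only element with p << p is p = 0.
assert [p for p in grid if way_below(p, p)] == [0.0]

# 0(0.8) = 1/2 is approximated (but never attained) by the elements
# way-below it that are fixed by extraction to domain 0:
approx = max(p for p in grid if extract(0, p) == p and way_below(p, 0.5))
assert abs(approx - 0.5) < 0.02
```

On an ever finer grid the approximating elements come arbitrarily close to 1/2 without reaching it, which is exactly the approximation by a basis rather than by finite elements.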

Theorem 77 A continuous domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is algebraic if and only if the set {φ ∈ Ψ : φ  φ} is a basis for (Ψ,D; ≤, ⊥, ·, ).

Proof. We know already that if (Ψ,D; ≤, ⊥, ·, ) is algebraic, then it is continuous, with basis B = Ψf = {φ ∈ Ψ : φ  φ}. So, assume that (Ψ,D; ≤, ⊥, ·, ) is continuous with basis B = {φ ∈ Ψ : φ  φ}. The lattice (Ψ; ≤) is complete, hence it is a dcpo. Strong density is derived as follows:

x(φ) = ∨{ψ ∈ B : ψ = x(ψ)  x(φ)}

= ∨{ψ ∈ B : ψ = x(ψ)  ψ ≤ x(φ)}

= ∨{ψ ∈ B : ψ = x(ψ) ≤ x(φ)} (9.27)

So, the algebra is compact, hence algebraic, with the set {φ ∈ Ψ : φ  φ} as finite elements. ⊓⊔

The following theorem gives another necessary and sufficient condition for an information algebra to be continuous.

Theorem 78 An idempotent domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is continuous if and only if,

1. (Ψ; ≤) is a continuous lattice,

2. for all x ∈ D and any directed set X ⊆ Ψ,

x(∨X) = ∨φ∈X x(φ). (9.28)

Proof. Assume (Ψ; ≤) to be a continuous lattice, so that weak density (9.25) holds, and that (9.28) holds too. Then (Ψ; ≤) is a complete lattice. Consider a φ ∈ Ψ and let φ′ = x(φ). Then by weak density φ′ = ∨{ψ ∈ Ψ : ψ  φ′}, and {ψ ∈ Ψ : ψ  φ′} is a directed set. From this we deduce, using (9.28)

x(φ) = x(x(φ)) = x(∨{ψ ∈ Ψ: ψ  x(φ)})

= ∨{x(ψ): ψ  x(φ)}.

Let η = x(ψ) so that η = x(η) ≤ ψ  x(φ). From this it follows that η  x(φ) and therefore,

x(φ) = ∨{η : η = x(η) = x(ψ), ψ  x(φ)}

≤ ∨{η : η = x(η)  x(φ)} ≤ x(φ).

Hence we have x(φ) = ∨{η : η = x(η)  x(φ)} and by Theorem 76 (Ψ,D; ≤, ⊥, ·, ) is a continuous information algebra.

Conversely, assume (Ψ,D; ≤, ⊥, ·, ) to be a continuous information algebra with basis B. Then (Ψ; ≤) is a complete lattice (Theorem 76). Consider a directed set X ⊆ Ψ and x ∈ D. For φ ∈ X we have φ ≤ ∨X, hence x(φ) ≤ x(∨X) and therefore ∨φ∈X x(φ) ≤ x(∨X). By strong B-density,

x(∨X) = ∨{ψ ∈ B : ψ = x(ψ)  x(∨X)}.

Now, ψ = x(ψ)  x(∨X) ≤ ∨X implies that there is a φ ∈ X so that ψ ≤ φ and thus also ψ = x(ψ) ≤ x(φ). From this we conclude that x(∨X) ≤ ∨φ∈X x(φ) and thus x(∨X) = ∨φ∈X x(φ). Hence (9.28) holds and (Ψ; ≤) is a continuous lattice (Theorem 76). ⊓⊔

According to Theorem 61 the identity (9.28) holds for algebraic information algebras. Theorem 78 can therefore be adapted to algebraic information algebras.

Theorem 79 An idempotent domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is algebraic if and only if,

1. (Ψ; ≤) is an algebraic lattice,

2. for all x ∈ D and any directed set X ⊆ Ψ,

x(∨X) = ∨φ∈X x(φ). (9.29)

Proof. If (Ψ,D; ≤, ⊥, ·, ) is algebraic, we know already that (Ψ; ≤) is an algebraic lattice, and (9.29) holds (Theorem 61). Assume then that (Ψ; ≤) is an algebraic lattice and (9.29) holds. We use Theorem 62. Convergence holds, since (Ψ; ≤) is a complete lattice and compactness holds by the definition of finite elements. Strong density is derived from weak density and (9.29) as follows:

φ = x(φ) = x(∨{ψ ∈ Ψf : ψ ≤ φ = x(φ)})

= ∨{x(ψ) : ψ ∈ Ψf , ψ ≤ φ = x(φ)}

= ∨{ψ ∈ Ψf : ψ = x(ψ) ≤ φ}.

This is strong density. So (Ψ,D; ≤, ⊥, ·, ) is an algebraic information algebra. ⊓⊔

What is the labeled version of a continuous information algebra? To examine this question, we consider the labeled version (Φ,D; ≤, ⊥, ·, t) of a continuous information algebra (Ψ,D; ≤, ⊥, ·, ). Recall that Φ consists of all pairs (ψ, x), where ψ ∈ Ψ and ψ = x(ψ). Assume that B is a basis of the continuous information algebra (Ψ,D; ≤, ⊥, ·, ). Define Bx = {(ψ, x) : ψ ∈ B, x(ψ) = ψ}. We claim that this is a basis in Φx. In fact, if (φ, x), (ψ, x) ∈ Bx, then (φ, x) · (ψ, x) = (φ · ψ, x) ∈ Bx since B is closed under combination or join. So Bx is closed under combination. Further, (1, x) also belongs to Bx. Consider any directed subset X of Bx. By Lemma 22 we have ∨X = (∨ψ∈X ψ, x) ∈ Φx. This is the convergence property in Φx. Define

B̄ = ∪x∈D Bx.

Then, B̄ is still closed under combination. In fact, let (φ, x) ∈ Bx and (ψ, y) ∈ By; then φ, ψ ∈ B and x is a support of φ, y a support of ψ. But then x ∨ y is a support of φ · ψ. So, since (φ, x) · (ψ, y) = (φ · ψ, x ∨ y) and φ · ψ ∈ B, we see that (φ, x) · (ψ, y) ∈ Bx∨y.

We claim also that a density property holds in Φx. Denote the way-below relation in (Φx; ≤) by x. We prove first the following lemma.

Lemma 28 Let φ, ψ ∈ Ψ and x(φ) = φ, x(ψ) = ψ. Then ψ  φ, if and only if (ψ, x) x (φ, x).

Proof. Assume ψ  φ and x(φ) = φ, x(ψ) = ψ. Consider a directed set X ⊆ Φx such that (φ, x) ≤ ∨X. Then X′ = {φ′ : (φ′, x) ∈ X} is directed too, and (φ, x) ≤ ∨X implies φ ≤ ∨X′. Then, there is a χ ∈ X′ such that ψ ≤ χ. Note that x(χ) = χ. Hence we see that (ψ, x) ≤ (χ, x). So indeed (ψ, x) x (φ, x).

Conversely, assume (ψ, x) x (φ, x). Consider a directed set X ⊆ Ψ such that φ ≤ ∨X. In a continuous information algebra we have x(∨X) = ∨φ∈X x(φ) (Theorem 78). Then φ = x(φ) ≤ x(∨X) = ∨χ∈X x(χ). Therefore (φ, x) ≤ (∨χ∈X x(χ), x) = ∨χ∈X (x(χ), x) (Lemma 22). Since the set {(x(χ), x) : χ ∈ X} is directed, there must then be a χ ∈ X such that (ψ, x) ≤ (x(χ), x). Then ψ = x(ψ) ≤ x(χ) ≤ χ ∈ X. This proves that ψ  φ. ⊓⊔

This allows us to derive density, using Lemma 22 and Lemma 28 in (Φ,D),

∨{(ψ, x) ∈ Bx :(ψ, x) x (φ, x)}

= (∨{ψ : ψ ∈ B, ψ = x(ψ)  φ = x(φ)}, x) = (φ, x).

This is the density property claimed above.

Finally, assume (ψ, x) x (φ, x). By Lemma 28 we have ψ  φ and x is a support of both ψ and φ. If x ≤ y, then y is also a support of both elements. Therefore, again by Lemma 28, we have that ty(ψ, x) = (ψ, y) y (φ, y) = ty(φ, x). Conversely, assume that x is a support of ψ and φ and x ≤ y. Then, if (ψ, y) y (φ, y), Lemma 28 implies that ψ  φ, hence (ψ, x) x (φ, x). This is an important compatibility relation between the way-below relations in the different domains Φx and Φy. We summarize these results in the following theorem.

Theorem 80 Let (Ψ,D; ≤, ⊥, ·, ) be a continuous domain-free information algebra with basis B and (Φ,D; ≤, ⊥, ·, t) the associated dual labeled infor- mation algebra. Then the following properties hold:

1. Bx is a basis in (Φx; ≤), that is Bx is closed under combination and contains (1, x). Any directed subset of Bx has a supremum in Φx.

2. (φ, x) = ∨{(ψ, x) ∈ Bx : (ψ, x) x (φ, x)}, for all (φ, x) ∈ Φx.

3. If x ≤ y, then (ψ, x) x (φ, x) if and only if ty(ψ, x) y ty(φ, x).

This theorem serves as a base to define the concept of a labeled continuous information algebra.

Definition 25 Labeled Continuous Information Algebra: A labeled idempotent information algebra (Φ,D; ≤, ⊥, ·, t) is called continuous, if for all x ∈ D there is a set Bx ⊆ Φx (the basis in x), closed under combination and containing 1x, satisfying the following conditions for all x ∈ D:

1. Convergence: If X ⊆ Bx is directed, then ∨X ∈ Φx.

2. Density: For all φ ∈ Φx, φ = ∨{ψ ∈ Bx : ψ x φ}.

3. Compatibility: If d(φ) = d(ψ) = x ≤ y, then ψ x φ if and only if ty(ψ) y ty(φ).

According to this definition and Theorem 80, the dual labeled information algebra (Φ,D; ≤, ⊥, ·, t) associated with a continuous domain-free information algebra (Ψ,D; ≤, ⊥, ·, ) is itself continuous. We remark that, as in Theorem 76, it follows that (Φx; ≤) is a continuous lattice for every x ∈ D. To establish duality for continuous information algebras, let us start with a labeled continuous information algebra (Φ,D; ≤, ⊥, ·, t) and consider its associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ). Is this algebra continuous too? A conditionally affirmative answer is given by Theorem 81 below. In order to prove this theorem we need two auxiliary results, which are of some interest in themselves.

Lemma 29 Let (Φ,D; ≤, ⊥, ·, t) be an idempotent labeled information alge- bra. Then x([ψ]σ) = [ψ]σ  [φ]σ = x([φ]σ) in Φ/σ implies ψ x φ for the representants ψ and φ of [ψ]σ and [φ]σ with d(ψ) = d(φ) = x. Further, if D has a top element >, and (Φ,D; ≤, ⊥, ·, t) is labeled continuous, then, if d(ψ) = d(φ) = x, ψ x φ implies [ψ]σ  [φ]σ.

Proof. Assume X ⊆ Φx directed, φ, ψ ∈ Φx representants of the classes [φ]σ and [ψ]σ respectively and φ ≤ ∨X. Then [φ]σ ≤ ∨[X]σ with [X]σ = {[χ]σ : χ ∈ X} (Lemma 25). The set [X]σ is directed, therefore [ψ]σ  [φ]σ implies that there is an η ∈ X such that [ψ]σ ≤ [η]σ, hence ψ ≤ η. This proves that ψ x φ.

For the second part, assume first ψ > φ and consider a directed set X in Φ/σ such that [φ]σ ≤ ∨X. We may take as representants of the classes [η]σ

in the set X their representants in Φ>. Let then X′ = {η ∈ Φ> : [η]σ ∈ X}. X′ is still directed. Now, if [φ]σ ≤ ∨X and φ is again a representant of [φ]σ in Φ>, then also φ ≤ ∨X′. Since ψ > φ, there is an element η ∈ X′ such that ψ ≤ η. But then [η]σ ∈ X and [ψ]σ ≤ [η]σ. This shows that [ψ]σ  [φ]σ. Now, if d(ψ) = d(φ) = x and ψ x φ, then by the compatibility property t>(ψ) > t>(φ), and [ψ]σ = [t>(ψ)]σ and [φ]σ = [t>(φ)]σ, hence [ψ]σ  [φ]σ as just proved. ⊓⊔

The next lemma is similar to Lemma 24 for labeled compact algebras.

Lemma 30 Let (Φ,D; ≤, ⊥, ·, t) be a labeled continuous information algebra. If X ⊆ Φy is directed, then for all x ≤ y ∈ D,

tx(∨X) = ∨tx(X), (9.30) where tx(X) = {tx(ψ): ψ ∈ X}.

Proof. Note that ∨X exists in Φy, since (Φy; ≤) is a complete lattice. Consider a ψ ∈ X so that ψ ≤ ∨X, hence tx(ψ) ≤ tx(∨X), thus ∨tx(X) ≤ tx(∨X).

Conversely by density in Φx we have

tx(∨X) = ∨{ψ ∈ Φx : ψ x tx(∨X)}.

By the compatibility condition, ψ x tx(∨X) implies ty(ψ) y ty(tx(∨X)) ≤ ∨X. By the definition of the way-below relation y this means that there is a χ ∈ X such that ty(ψ) ≤ χ. But then it follows that ψ = tx(ty(ψ)) ≤ tx(χ) ∈ tx(X), hence tx(∨X) ≤ ∨tx(X) and therefore tx(∨X) = ∨tx(X). ⊓⊔

Now we are in a position to prove the following theorem.

Theorem 81 Let (Φ,D; ≤, ⊥, ·, t) be a labeled continuous information algebra and assume that D has a top element >. Then the associated dual domain-free information algebra (Φ/σ, D; ≤, ⊥, ·, ) is continuous.

Proof. We first show that (Φ/σ; ≤) is a complete lattice. To this end consider any non-empty subset X ⊆ Φ/σ. For any element [ψ]σ of X we may take the representant ψ in the top domain Φ>, d(ψ) = >. Let then X′ = {ψ ∈ Φ> : [ψ]σ ∈ X}. But (Φ>; ≤) is a complete lattice, hence ∨X′ exists in Φ>.

By Lemma 25, we have [∨X′]σ = ∨X, and so X has a supremum in Φ/σ. Since (Φ/σ; ≤) has a smallest element [1>]σ, by standard results of lattice theory (Φ/σ; ≤) is a complete lattice.

Next consider any class [φ]σ ∈ Φ/σ. The set {[ψ]σ :[ψ]σ  [φ]σ} is directed. Consider the representants of the classes of this set in Φ>: {ψ ∈ Φ> :[ψ]σ  [φ]σ} and also φ ∈ Φ>. Then, by Lemma 25, Lemma 29 and density in the labeled algebra,

∨{[ψ]σ : [ψ]σ  [φ]σ} = [∨{ψ ∈ Φ> : [ψ]σ  [φ]σ}]σ

= [∨{ψ ∈ Φ> : ψ > φ}]σ = [φ]σ.

This shows that (weak) density holds. Therefore, (Φ/σ; ≤) is a continuous lattice. By Theorem 78 it is now sufficient to prove (9.28). So, consider a directed set X ⊆ Φ/σ. For any [ψ]σ ∈ X we may select the representant ψ in Φ>. Define X′ = {ψ ∈ Φ> : [ψ]σ ∈ X}. This set is still directed in Φ>. Now, using repeatedly Lemma 25 and Lemma 30,

x(∨X) = x(∨{[φ]σ : φ ∈ X′}) = x([∨X′]σ) = [tx(∨X′)]σ = [∨tx(X′)]σ = ∨{[tx(φ)]σ : [φ]σ ∈ X}

= ∨{x([φ]σ):[φ]σ ∈ X} = ∨x(X).

This proves that (Φ/σ, D) is a domain-free continuous information algebra. ⊓⊔

Note that the existence of a top element in D is required in the proof above for (Φ/σ; ≤) to be continuous. It remains an open question whether a labeled continuous information algebra (Φ,D) can be extended to a labeled continuous information algebra with a top domain. The problem is the extension of the compatibility condition to the new top domain.

9.5 Atomic Algebras

In many idempotent information algebras there are maximal elements, called atoms. In important cases these maximal elements determine the algebra fully. For example, in the set algebra relative to an f.c.f (F, R), the elements, or rather the one-element subsets of any frame, represent maximal information relative to the frame. We want to study this situation in the general framework of generalised idempotent information algebras. Labeled algebras are somewhat better suited for this subject than domain-free information algebras. However we refer to (Kohlas & Schmid, 2014b) for a discussion of the same theme in the context of idempotent domain-free valuation algebras. In (Kohlas, 2003a) atoms in idempotent labeled valuation algebras were studied. This section presents a generalisation thereof. Consider a labeled idempotent generalised information algebra (Ψ,D; ≤, ⊥, ·, t). Then we define the concept of an atom in this algebra as follows:

Definition 26 Atom: An element α ∈ Ψx in a domain x ∈ D of a labeled idempotent information algebra (Ψ,D; ≤, ⊥, ·, t) is called an atom in x, if

1. α ≠ 0x,

2. for all ψ ∈ Ψx, α ≤ ψ implies either α = ψ or ψ = 0x.

So atoms are maximal elements in a domain below the null element.3 Since the null element does not represent proper information, atoms are indeed maximal pieces of information. We shall now first present a few elementary properties of atoms. Then we shall show that atoms are closely related to families of compatible frames (f.c.f). Finally, we present particular information algebras, which are essentially set algebras on some f.c.f.

Denote the set of atoms in domain x by Atx. Here are a few general prop- erties of atoms:

Lemma 31 Let (Ψ,D; ≤, ⊥, ·, t) be an idempotent labeled information alge- bra. Then the following holds:

1. If y ≤ x, then α ∈ Atx and ψ ∈ Ψy imply either ψ ≤ α or else α · ψ = 0x,

2. α, β ∈ Atx imply either α = β or else α · β = 0x,

3. α ∈ Atx and y ≤ x imply ty(α) ∈ Aty,

4. α ∈ Atx and β ∈ Aty imply either α · β = 0x∨y or tx(α · β) = α and ty(α · β) = β,

³ In order theory, atoms are usually defined as minimal elements. But for our purpose our definition better fits the idea of atoms as building blocks of a piece of information. See below.

5. α ∈ Atx and ψ ∈ Ψy imply either α · ψ = 0x∨y or tx(α · ψ) = α.

Proof. 1.) From α ≤ α·ψ and the definition of an atom it follows that either α · ψ = 0x or α · ψ = α, hence ψ ≤ α.

2.) As before, α, β ≤ α · β implies either α · β = 0x or α · β = α, α · β = β, hence α = β.

3.) Since α is an atom, α ≠ 0x and therefore ty(α) ≠ 0y. Assume ty(α) ≤ ψ, where d(ψ) = y. From x⊥y|y we obtain ty(α · ψ) = ty(α) · ψ = ψ. Now, α · ψ = α or α · ψ = 0x by item 1 just proved. In the first case it follows that ty(α) = ψ, in the second case that ψ = 0y. So, ty(α) is indeed an atom.

4.) Assume α · β ≠ 0x∨y. We have x⊥y|x, which implies tx(α · β) = α · tx(β) ≠ 0x, since otherwise α · β = 0x∨y. Then from α ≤ α · tx(β) it follows that α = α · tx(β) = tx(α · β). The second identity ty(α · β) = β follows in the same way.

5.) This is proved in the same way as item 4. □
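Two of these properties are easy to check exhaustively in the toy set-algebra model over binary variables sketched earlier (again an illustrative assumption: combination on a common domain is intersection, and transport down to y ⊆ x is tuple projection):

```python
from itertools import product

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [frozenset(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def project(psi, y):
    """Transport down to y ⊆ x: project every tuple to the variables in y."""
    return frozenset(frozenset((v, a) for v, a in t if v in y) for t in psi)

x, y = {'A', 'B'}, {'A'}
atoms_x = [frozenset({t}) for t in frame(x)]
atoms_y = {frozenset({t}) for t in frame(y)}

# Lemma 31, item 2: two atoms of a domain are equal or combine to 0_x (= empty set)
assert all(a == b or not (a & b) for a in atoms_x for b in atoms_x)
# Lemma 31, item 3: the transport t_y(alpha) of an atom in x is an atom in y
assert all(project(a, y) in atoms_y for a in atoms_x)
```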

If for an element ψ ∈ Ψx and an atom α we have ψ ≤ α, then α implies ψ. Therefore we define At(ψ) = {α ∈ Atx : ψ ≤ α} and call At(ψ) the set of atoms contained in ψ. We now fix our attention on information algebras where every element contains an atom.

Definition 27 Atomic Information Algebra: A labeled idempotent information algebra (Ψ, D; ≤, ⊥, ·, t) is called atomic, if for all x ∈ D and all ψ ∈ Ψx, ψ ≠ 0x implies At(ψ) ≠ ∅.

In atomic information algebras, atoms have a few additional properties.

Lemma 32 Let (Ψ,D; ≤, ⊥, ·, t) be an atomic information algebra. Then the following holds:

1. If x ≤ y, then for all α ∈ Atx there is an atom β ∈ Aty such that α = tx(β).

2. For all α ∈ Atx, β ∈ Aty such that α · β ≠ 0x∨y, there is an atom γ ∈ Atx∨y such that tx(γ) = α and ty(γ) = β.

Proof. 1.) Since α ≠ 0x, we have ty(α) ≠ 0y, and since the algebra is atomic there exists an atom β ∈ At(ty(α)). Then ty(α) ≤ β, which implies α = tx(ty(α)) ≤ tx(β). But tx(β) ≠ 0x since β ≠ 0y, hence α = tx(β).

2.) Again, since α · β ≠ 0x∨y, there is an atom γ ∈ At(α · β), such that α · β ≤ γ. But then by Lemma 31, item 4, α = tx(α · β) ≤ tx(γ) ≠ 0x. So α = tx(γ). Similarly, we derive β = ty(γ). □

Now, we consider the sets Atx and show that they are part of a family of compatible frames if the underlying algebra is atomic. We start by showing how the transport operation induces refinings among the sets of atoms Atx. In fact, define for x ≤ y the map

τx,y(α) = At(ty(α)) (9.31)

from Atx into the power set of Aty.
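In the toy multivariate model used for illustration above (an assumption: atoms are identified with tuples, and τx,y maps an atom of x to the tuples of the finer frame projecting onto it), this map can be computed directly, and its images can be seen to partition the finer frame, which is the content of Theorem 82 below:

```python
from itertools import product

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [dict(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def tau(alpha, x, y):
    """tau_{x,y}(alpha): the atoms of y whose projection to x is alpha (x <= y)."""
    return [t for t in frame(y) if {v: t[v] for v in x} == alpha]

x, y = {'A'}, {'A', 'B'}
images = [tau(a, x, y) for a in frame(x)]

# Refining property: nonempty images, pairwise disjoint, covering frame(y)
assert all(images)
flat = [tuple(sorted(t.items())) for img in images for t in img]
assert sorted(flat) == sorted(tuple(sorted(t.items())) for t in frame(y))
```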

Theorem 82 Let (Ψ,D; ≤, ⊥, ·, t) be an atomic information algebra. Then τx,y is a refining of Atx.

Proof. First, τx,y(α) ≠ ∅, since the algebra is atomic. Secondly, consider atoms α ≠ β and assume that τx,y(α) ∩ τx,y(β) ≠ ∅. Then select an atom γ in τx,y(α) ∩ τx,y(β). But this means γ ∈ At(ty(α)), hence ty(α) ≤ γ. It follows that α = tx(ty(α)) ≤ tx(γ). But this implies tx(γ) = α, since tx(γ) is an atom in x (Lemma 31). In the same way we deduce tx(γ) = β, hence α = β, contrary to the assumption. So τx,y(α) ∩ τx,y(β) = ∅. Finally, consider any atom γ in Aty. Since ty(tx(γ)) ≤ γ we have γ ∈ τx,y(tx(γ)). Since tx(γ) is an atom in x we conclude that γ ∈ ∪α∈Atx τx,y(α). But this shows that

∪α∈Atx τx,y(α) = Aty, and this concludes the proof that τx,y is a refining of Atx. □

According to this theorem, in an atomic information algebra, if x ≤ y, then Aty is a refinement of Atx and the latter a coarsening of Aty. We may extend the maps τx,y in the usual way to sets of atoms in x. Let now F = {Atx : x ∈ D} and R = {τx,y : x, y ∈ D, x ≤ y}. We claim that (F, R) is a f.c.f, provided that some additional conditions are satisfied. This issue will be addressed below. Consider now first all the sets of atoms At(ψ) for ψ ∈ Ψ and denote this collection by SAt(Ψ). We define the following operations with respect to these sets of atoms:

1. Labeling: d(At(ψ)) = Atx if d(ψ) = x.

2. Combination: If d(φ) = x and d(ψ) = y,

At(φ) ⋈ At(ψ) = {γ ∈ Atx∨y : tx(γ) ∈ At(φ), ty(γ) ∈ At(ψ)}. (9.32)

3. Transport: If d(ψ) = x and y ∈ D,

ty(At(ψ)) = vy,x∨y(τx,x∨y(At(ψ))). (9.33)

Here vy,x∨y(A) is the outer reduction (Section 2.2), defined by vy,x∨y(A) = {α ∈ Aty : τy,x∨y(α) ∩ A ≠ ∅}. Note also the similarity of combination with the relational join in relational algebra. We show that At(φ) ⋈ At(ψ) and ty(At(ψ)) are elements of SAt(Ψ). This follows from the following theorem:

Theorem 83 Let (Ψ,D; ≤, ⊥, ·, t) be an atomic information algebra. Then for all φ, ψ ∈ Ψ and x ∈ D,

At(φ) ⋈ At(ψ) = At(φ · ψ),

ty(At(ψ)) = At(ty(ψ)).

Proof. Consider first α ∈ At(φ · ψ). Then φ · ψ ≤ α. Assume d(φ) = x and d(ψ) = y. From x⊥y|x it follows that φ ≤ φ · tx(ψ) = tx(φ · ψ) ≤ tx(α) ∈ Atx. This shows that tx(α) ∈ At(φ). Similarly it follows that ty(α) ∈ At(ψ). So we conclude that α ∈ At(φ) ⋈ At(ψ) and At(φ · ψ) ⊆ At(φ) ⋈ At(ψ). Conversely, if α ∈ At(φ) ⋈ At(ψ), then α ∈ Atx∨y and tx(α) ∈ At(φ), ty(α) ∈ At(ψ). It follows that φ · ψ ≤ tx(α) · ty(α) ≤ α and thus α ∈ At(φ · ψ). This proves that At(φ) ⋈ At(ψ) = At(φ · ψ).

Next consider an atom α ∈ At(ty(ψ)) and assume d(ψ) = x. In order to show that α ∈ ty(At(ψ)) we need to verify that

τy,x∨y(α) ∩ τx,x∨y(At(ψ)) ≠ ∅. (9.34)

Note that ty(α · ψ) = α · ty(ψ) = α ≠ 0y, hence α · ψ ≠ 0x∨y. Therefore there exists an atom β ∈ At(α · ψ), such that α · ψ ≤ β. Then tx(β) ≥ tx(α · ψ) = tx(α) · ψ ≥ ψ, which shows that tx(β) ∈ At(ψ), hence β ∈ τx,x∨y(At(ψ)). But we have also ty(β) ≥ ty(α · ψ) = α, thus ty(β) = α, which shows that β ∈ τy,x∨y(α). Thus (9.34) holds and α ∈ ty(At(ψ)). Conversely, consider α ∈ ty(At(ψ)). Then α is an atom in y such that (9.34) holds. Select then an atom β in this intersection. We have ty(β) = α and tx(β) ∈ At(ψ), that is, tx(β) ≥ ψ. This implies β ≥ tx∨y(tx(β)) ≥ tx∨y(ψ). Then α = ty(β) ≥ ty(tx∨y(ψ)) = ty(ψ). So we conclude that α ∈ At(ty(ψ)) and thus ty(At(ψ)) = At(ty(ψ)). □
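The similarity of the combination (9.32) with the relational join noted above can be made concrete in the toy multivariate model (an illustrative assumption: atoms are identified with tuples over the variables of a domain), where the combination of two sets of atoms is exactly the relational natural join:

```python
from itertools import product

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1], 'C': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [dict(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def join(at_phi, at_psi, x, y):
    """(9.32): atoms of x ∨ y whose transports lie in At(phi) and At(psi)."""
    return [t for t in frame(x | y)
            if {v: t[v] for v in x} in at_phi and {v: t[v] for v in y} in at_psi]

at_phi = [{'A': 0, 'B': 0}, {'A': 1, 'B': 1}]   # At(phi), phi on {A, B}
at_psi = [{'B': 1, 'C': 0}, {'B': 1, 'C': 1}]   # At(psi), psi on {B, C}
res = join(at_phi, at_psi, {'A', 'B'}, {'B', 'C'})
# exactly the tuples agreeing with phi on {A, B} and with psi on {B, C}:
# the relational natural join of the two sets of tuples
assert res == [{'A': 1, 'B': 1, 'C': 0}, {'A': 1, 'B': 1, 'C': 1}]
```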

According to this theorem, SAt(Ψ) is closed under the operations of combination and transport defined above. As in Section 3.1, where we introduced a set algebra as a labeled idempotent information algebra relative to a f.c.f (F, R), we consider pairs (At(ψ), Atx) for d(ψ) = x and define ΨAtx = {(At(ψ), Atx) : d(ψ) = x} and ΨAt(Ψ) = ∪x∈D ΨAtx. The operations above in SAt(Ψ) can then be extended to ΨAt(Ψ) in the following way:

1. Labeling: d(At(ψ), Atx) = Atx.

2. Combination: (At(φ), Atx) · (At(ψ), Aty) = (At(φ) ⋈ At(ψ), Atx ∨ Aty).

3. Transport: tAty(At(ψ), Atx) = (ty(At(ψ)), Aty).

Note that (At(1x), Atx) = (Atx, Atx) and (At(0x), Atx) = (∅, Atx) are the unit and null elements of combination in domain Atx. The map ψ ↦ (At(ψ), Atx) for d(ψ) = x satisfies

φ · ψ ↦ (At(φ), Atx) · (At(ψ), Aty),

1x ↦ (Atx, Atx),

0x ↦ (∅, Atx),

ty(ψ) ↦ tAty(At(ψ), Atx). (9.35)

All this indicates that the algebra of subsets of atoms in SAt(Ψ) might be a generalised information algebra, somehow connected to the original atomic algebra (Ψ, D; ≤, ⊥, ·, t). To pursue this line of inquiry we strengthen the concept of an atomic algebra.

Definition 28 Atomistic Information Algebra: A labeled idempotent information algebra (Ψ, D; ≤, ⊥, ·, t) is called atomistic, if it is atomic and if for all ψ ∈ Ψx, ψ ≠ 0x,

ψ = ∧At(ψ). (9.36)
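Condition (9.36) holds, for instance, in the toy set-algebra model used for illustration throughout (an assumption, not the general framework): there At(ψ) consists of the one-tuple subsets of ψ, and the meet of a family of sets of tuples is their union, since less information means more remaining possibilities.

```python
from itertools import product, combinations

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [frozenset(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def At(psi):
    """The atoms contained in psi: its one-tuple subsets."""
    return [frozenset({t}) for t in psi]

def meet(family):
    """Meet = union: the coarsest information below every member."""
    out = frozenset()
    for s in family:
        out |= s
    return out

x = {'A', 'B'}
for r in range(1, len(frame(x)) + 1):
    for c in combinations(frame(x), r):
        psi = frozenset(c)
        assert meet(At(psi)) == psi        # (9.36): psi = ∧ At(psi)
```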

The family of sets Atx for x ∈ D is nearly a family of compatible frames. Only the Identity of Coarsenings condition does not hold in general.⁴ We introduce an additional condition, which proves to be necessary and sufficient to guarantee the Identity of Coarsenings between the frames Atx. We call two domains x and y from D infomorph if

for all α ∈ Atx there exists β ∈ Aty, and for all β ∈ Aty there exists α ∈ Atx, such that α ≡σ β. (9.37)

⁴ This means that the frames Atx form only a preorder under the order induced by refinings. This is another indication that a preorder of domains or frames could be sufficient to develop the theory of generalised information algebras; see the earlier footnote.

This means essentially that atoms in domains x and y convey the same information. In fact, if (9.37) holds, then tx∨y(α) = tx∨y(β), hence ty(α) = ty(tx∨y(α)) = ty(tx∨y(β)) = β and similarly tx(β) = α. So, we obtain that tx(ty(α)) = α and also ty(tx(β)) = β for all atoms in Atx or Aty respectively.

Define now F = {Atx : x ∈ D} and R = {τx,y : x ≤ y}. Then (F, R) is a f.c.f provided that no two different domains x and y in D are infomorph.

Theorem 84 Let (Ψ,D; ≤, ⊥, d, ·, t) be an atomistic information algebra such that no two different domains are infomorph. Then (F, R) is a family of compatible frames (f.c.f).

Proof. By definition we have for x ≤ y ≤ z,

τy,z ◦ τx,y(α) = At(tz(At(ty(α))))

= {γ : γ ∈ At(tz(β)) for some β ∈ At(ty(α))}.

This means γ ≥ tz(β) and β ≥ ty(α), hence γ ≥ tz(ty(α)). Since y⊥z|y and x ≤ y, we also have x⊥z|y and therefore tz(ty(α)) = tz(α). Therefore γ ≥ tz(α), that is, γ ∈ At(tz(α)). So we conclude that τy,z ◦ τx,y(α) = At(tz(α)) = τx,z(α). Therefore τy,z ◦ τx,y = τx,z ∈ R. This establishes the Composition of Refinings condition.

For all α ∈ Atx we have tx(α) = α. So τx,x is the identity map, τx,x(α) = {α}. This is the identity condition; the Identity of Refinings is obvious. Clearly Atx∨y is the minimal common refinement of Atx and Aty, and τx,x∨y and τy,x∨y are the corresponding refinings. So far, the assumption that no two different domains are infomorph has not been used. This condition means that if (9.37) holds for some domains x, y ∈ D, then x = y. Consider then domains x, y, z ∈ D with x, y ≤ z, such that for each atom α in x there is an atom β in domain y such that τx,z(α) = τy,z(β). This means that At(tz(α)) = At(tz(β)), or that tz(α) = tz(β). But then tx∨y(α) = tx∨y(tz(α)) = tx∨y(tz(β)) = tx∨y(β). So we have α ≡σ β. In the same way we find for each atom β in domain y an atom α in domain x such

that α ≡σ β. But then (9.37) implies x = y, hence Atx = Aty. So the Identity of Coarsenings holds in (F, R). □

In this f.c.f (F, R) we may of course define the relation of conditional independence between frames as in Section 2.2. Provided the conditional independence relation defines a q-separoid, we may, as in Section 3.1, define the generalised information algebra (Φ, F; ≤, ⊥, d, ·, t) of subsets of atoms if the original algebra (Ψ, D; ≤, ⊥, d, ·, t) is atomistic. We shall see below that the original algebra is in fact embedded into this algebra of subsets of atoms. But before we turn to this issue, we remark that the f.c.f (F, R) is closely related to the relation x⊥y|z in the q-separoid (D; ≤, ⊥). First of all, we have that x ≤ y in D implies Atx ≤ Aty in (F; ≤, ⊥) (we make as usual no distinction in the notation of relations in both structures (D; ≤, ⊥) and (F; ≤, ⊥)). Further, we have Atx ∨ Aty = Atx∨y. So, as a partial order, (D; ≤, ⊥) is homomorphic to (F; ≤, ⊥). In fact, there is more, as the following theorem shows.

Theorem 85 Let (Ψ,D; ≤, ⊥, ·, t) be an atomic information algebra. Then x⊥y|z implies Atx⊥Aty|Atz.
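Before turning to the proof, the claim can be checked numerically in the toy multivariate model (an illustrative assumption: domains are sets of binary variables, atoms are identified with tuples, and x⊥y|z holds in the multivariate q-separoid when x ∩ y ⊆ z). The factorization (9.40) of the compatibility relation then holds for every atom γ of z:

```python
from itertools import product

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1], 'C': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [dict(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def compat(*ts):
    """Tuples are jointly compatible iff they agree on every shared variable."""
    merged = {}
    for t in ts:
        for v, a in t.items():
            if merged.get(v, a) != a:
                return False
            merged[v] = a
    return True

x, y, z = {'A', 'B'}, {'B', 'C'}, {'B'}            # x ∩ y ⊆ z, hence x ⊥ y | z
ok = True
for g in frame(z):
    pairs = {(i, j) for i, a in enumerate(frame(x))
             for j, b in enumerate(frame(y)) if compat(a, b, g)}
    lefts = {i for i, a in enumerate(frame(x)) if compat(a, g)}
    rights = {j for j, b in enumerate(frame(y)) if compat(b, g)}
    ok = ok and pairs == {(i, j) for i in lefts for j in rights}
assert ok                                          # (9.40) holds for every gamma
```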

Proof. We start by recalling the definition of the relation Atx⊥Aty|Atz in (F, R). For α ∈ Atx, β ∈ Aty and γ ∈ Atz we have (Section 2.2),

Rγ(Atx, Aty) = {(α, β):(α, β, γ) ∈ R(Atx, Aty, Atz)}.

Further, (α, β, γ) ∈ R(Atx, Aty, Atz) means that

τx,x∨y∨z(α) ∩ τy,x∨y∨z(β) ∩ τz,x∨y∨z(γ) ≠ ∅. (9.38)

Similarly,

Rγ(Atx) = {α :(α, γ) ∈ R(Atx, Atz)},

Rγ(Aty) = {β :(β, γ) ∈ R(Aty, Atz)}.

Here we have (α, γ) ∈ R(Atx, Atz) and (β, γ) ∈ R(Aty, Atz) if

τx,x∨z(α) ∩ τz,x∨z(γ) ≠ ∅ and τy,y∨z(β) ∩ τz,y∨z(γ) ≠ ∅. (9.39)

Finally, Atx⊥Aty|Atz means that

Rγ(Atx, Aty) = Rγ(Atx) × Rγ(Aty) (9.40)

for all γ ∈ Atz. We are going to show that (9.38) is equivalent to (9.39), if x⊥y|z. This implies then (9.40).

Remember that we always have Rγ(Atx, Aty) ⊆ Rγ(Atx) × Rγ(Aty). Assume then (9.39). Recall that x⊥y|z implies x∨z⊥y∨z|z. Since we assume τx,x∨z(α) ∩ τz,x∨z(γ) ≠ ∅, there is an atom α′ in the intersection At(tx∨z(α)) ∩ At(tx∨z(γ)). Then tx∨z(α), tx∨z(γ) ≤ α′, so that α = tx(tx∨z(α)) ≤ tx(α′). Since tx(α′) is an atom in x, we conclude that α = tx(α′). In the same way we obtain also γ = tz(α′). Similarly, select an atom β′ in At(ty∨z(β)) ∩ At(ty∨z(γ)) and conclude in the same way that β = ty(β′) and γ = tz(β′). Since x∨z⊥y∨z|z we obtain tz(α′ · β′) = tz(α′) · tz(β′) = γ. Since γ ≠ 0z we infer that α′ · β′ ≠ 0x∨y∨z. Therefore there is an atom γ′ ∈ At(α′ · β′), that is, γ′ ≥ α′ · β′. We have

α′ · β′ = tx∨y∨z(α′) · tx∨y∨z(β′) ≥ tx∨y∨z(α′) ≥ tx∨y∨z(tx∨z(α)) = tx∨y∨z(α).

So, γ′ is an atom in At(tx∨y∨z(α)). In the same way we obtain that γ′ ≥ tx∨y∨z(β), so that γ′ is also an element of At(tx∨y∨z(β)). Finally, γ′ ≥ α′ · β′ implies tz(γ′) ≥ tz(α′ · β′) = tz(α′) · tz(β′) = γ. From this we conclude also that γ′ ≥ tx∨y∨z(γ), hence γ′ belongs also to At(tx∨y∨z(γ)). But this shows that (9.38) holds, and this concludes the proof. □

Remark that in the proof of this theorem no use is made of the assumption that (F, R) is a f.c.f, that is, that no two different domains are infomorph. We now make this assumption again, so that (F, R) becomes a f.c.f and the algebra of subsets of atoms (Φ, F; ≤, ⊥, d, ·, t) a generalised, idempotent information algebra, and turn to the question how the original atomistic algebra (Ψ, D; ≤, ⊥, d, ·, t) is related to this algebra. We have introduced above ΨAt(Ψ) as the set of pairs (At(ψ), Atx), if d(ψ) = x. They belong to the set Φ. So the map ψ ↦ (At(ψ), Atx) maps Ψ into Φ. In summary, we have the following maps between (Ψ, D; ≤, ⊥, d, ·, t) and (Φ, F; ≤, ⊥, d, ·, t):

ψ ∈ Ψ ↦ (At(ψ), Atx) ∈ Φ if d(ψ) = x,

x ∈ D ↦ Atx ∈ F,

tx : Ψ → Ψ ↦ tAtx : Φ → Φ.

If (Ψ, D; ≤, ⊥, d, ·, t) is atomistic and no two different domains are infomorph, then these maps satisfy the homomorphism conditions (9.35), and further the map

x ↦ Atx maintains order, and x⊥y|z implies Atx⊥Aty|Atz (Theorem 85); it is therefore a q-separoid homomorphism (Dawid, 2001). Further, At(φ) = At(ψ) implies φ = ∧At(φ) = ∧At(ψ) = ψ, and Atx = Aty implies x = y. So these maps are one-to-one. Therefore, we may speak of an embedding of the generalised information algebra (Ψ, D; ≤, ⊥, d, ·, t) into the generalised information algebra (Φ, F; ≤, ⊥, d, ·, t) of sets of atoms. In other words, an atomistic information algebra may be represented by the information algebra of the sets of its atoms. Each piece of information ψ is represented by the set of its atoms At(ψ). We may consider the atoms in At(ψ) as the set of possible answers compatible with the piece of information ψ. In this way any piece of information is represented by its set of possible answers. A more complete study of the representation of idempotent valuation algebras by set algebras is given in (Kohlas & Schmid, 2014b).
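The two properties of the embedding ψ ↦ At(ψ), homomorphy and injectivity, can both be checked exhaustively in the toy set-algebra model used for illustration (an assumption: on a common domain, combination of sets of tuples is intersection, and so is the join of the corresponding sets of atoms):

```python
from itertools import product, combinations

# Illustrative toy model (an assumption, not the paper's general framework).
VALS = {'A': [0, 1], 'B': [0, 1]}

def frame(dom):
    dom = tuple(sorted(dom))
    return [frozenset(zip(dom, vals)) for vals in product(*(VALS[v] for v in dom))]

def At(psi):
    """The embedding: psi is mapped to the set of its atoms (one-tuple subsets)."""
    return frozenset(frozenset({t}) for t in psi)

x = {'A', 'B'}
elems = [frozenset(c) for r in range(len(frame(x)) + 1)
         for c in combinations(frame(x), r)]
for phi in elems:
    for psi in elems:
        # homomorphy: At(phi · psi) = At(phi) ⋈ At(psi); on a common domain
        # both combinations reduce to intersection
        assert At(phi & psi) == At(phi) & At(psi)
        # injectivity: equal atom sets force equal elements
        if At(phi) == At(psi):
            assert phi == psi
```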

10 Conclusion

We have extended the axiomatic framework behind local computation, as proposed originally by (Shenoy & Shafer, 1990a) and worked out in (Kohlas, 2003a), to a more general axiomatic structure based on q-separoids. This concept is derived from the notion of a separoid, a mathematical framework for conditional independence and irrelevance. The framework of separoids was largely motivated by probability theory, but extends far beyond it. It was noted that these concepts of conditional independence play an important role for schemes of local computation as originally proposed by (Lauritzen & Spiegelhalter, 1988), generalised and abstracted by (Shenoy & Shafer, 1990a), and then later worked out by many further contributions. This line of inquiry limited itself to what is called in this paper the multivariate frame, that is, valuations related to sets of variables. There were few studies taking a larger view of local computation. A notable exception is (Shafer et al., 1987b), where families of partitions are considered for local computation with belief functions. This is a special case of the more general notion of families of compatible frames, introduced in (Shafer, 1976), although not yet in view of local computation. This concept was used in view of local computation with hints or belief functions in (Kohlas & Monney, 1995). The present paper takes up this approach and generalizes it to an abstract algebraic structure allowing for local computation. The point is that in the multivariate model the associated lattice of domains is a lattice of subsets, that is, distributive. This is no longer the case in the more general setting considered in this paper; the lattice of domains, for instance a lattice of partitions, may be arbitrary, not even modular. Or the domains may not even form a lattice, but only a join-semilattice.
The basic structures considered for local computation, like join or junction trees, hypertrees and Markov trees, which are all equivalent notions in multivariate models, are no longer the same in the general case. It is shown that the weaker notion of a q-separoid is sufficient to define Markov trees and to allow local computation in Markov trees within the algebraic structure of generalised information algebras. It is further shown that if a special q-separoid on a lattice of domains is used, then the old axiomatic structure, called valuation algebra, is recovered. So generalised information algebras are a generalisation of valuation algebras. Generalised information algebras are proposed here first in a labeled form, which is natural for local computation. As proposed by (Shafer, 1991) for certain valuation algebras, a second, domain-free form can also be derived for generalised information algebras. This leads to a nice duality theory between two equivalent forms of information algebras. In (Kohlas, 2003a) idempotent valuation algebras (called there information algebras) were studied in particular with respect to the partial order induced by idempotency. Here, order between pieces of information is generalised for non-idempotent information algebras, especially for regular and separative algebras. These concepts were introduced in (Kohlas, 2003a) for local computation with division, based on a proposition by (Lauritzen & Jensen, 1997). It turns out that these notions from semigroup theory are also relevant for information order, although they do not seem to lead as far as idempotency does. Many results from idempotent valuation algebras can be extended to generalised information algebras, in particular the concepts of compact, algebraic or continuous algebras.
It seems that the general approach to information algebras presented here provides the natural mathematical frame for algebraic structures allowing local computation, covering all known forms of local computation, including local computation in multivariate models, in systems of partitions and in families of compatible frames.

References

Anrig, B., Haenni, R., Kohlas, J., & Lehmann, N. 1997. Assumption-based Modeling using ABEL. In: Gabbay, D., Kruse, R., Nonnengart, A., & Ohlbach, H.J. (eds), First International Joint Conference on Qualitative and Quantitative Practical Reasoning; ECSQARU–FAPR'97. Springer.

Baets, B. De. 1996. Idempotent Uninorms. European J. Op. Res., 118, 631–642.

Beeri, C., Fagin, R., Maier, D., & Yannakakis, M. 1983. On the Desirability of Acyclic Database Schemes. Journal of the ACM, 30(3), 479–513.

Bistarelli, S., Montanari, U., Rossi, F., Schiex, T., Verfaillie, G., & Fargier, H. 1999. Semiring-Based CSPs and Valued CSPs: Framework, Properties and Comparison. CONSTRAINTS: An Int. Journal, 4.

Cechlarova, K., & Plavka, J. Linear Independence in Bottleneck Algebras. Fuzzy Sets Syst., 77, 337–348.

Clifford, A. H., & Preston, G. B. 1967. Algebraic Theory of Semigroups. Providence, Rhode Island: American Mathematical Society.

Cowell, R. G., Dawid, A. P., Lauritzen, S. L., & Spiegelhalter, D. J. 1999. Probabilistic Networks and Expert Systems. Information Sci. and Stats. Springer, New York.

Croisot, R. 1953. Demi-groupes inversifs et demi-groupes réunions de demi-groupes simples. Ann. Sci. École Norm. Sup., 79(3), 361–379.

Cuzzolin, F. 2005. Algebraic Structure of the Families of Compatible Frames of Discernment. Ann. of Mathematics and Artificial Intelligence, 45, 241–274.

Davey, B.A., & Priestley, H.A. 1990. Introduction to Lattices and Order. Cambridge University Press.

Davey, B.A., & Priestley, H.A. 2002. Introduction to Lattices and Order. Cambridge University Press.

Dawid, A. P. 2001. Separoids: A Mathematical Framework for Conditional Independence and Irrelevance. Ann. Math. Artif. Intell, 32(1–4), 335– 372.

Dechter, R. 1999. Bucket Elimination: A Unifying Framework for Reasoning. Artificial Intelligence, 113, 41–85.

Dempster, A.P. 1967. Upper and Lower Probabilities Induced by a Multivalued Mapping. Annals of Math. Stat., 38, 325–339.

Flum, J., & Grohe, M. 2006. Parameterized Complexity Theory. Springer.

Gierz, G., et al. 2003. Continuous Lattices and Domains. Cambridge University Press.

Golan, J. 1999. Semirings and Their Applications. Kluwer Academic Publ., Dordrecht.

Gottlob, G., Leone, N., & Scarcello, F. 1999a. A Comparison of Structural CSP Decomposition Methods. Pages 394–399 of: Proceedings of the 16th International Joint Conference on Artificial Intelligence IJCAI. Morgan Kaufmann.

Gottlob, G., Leone, N., & Scarcello, F. 1999b. Hypertree decompositions and tractable queries. Pages 21–32 of: PODS '99: Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. New York, NY, USA: ACM Press.

Gottlob, G., Leone, N., & Scarcello, F. 2001. The complexity of acyclic conjunctive queries. J. ACM, 48(3), 431–498.

Grätzer, G. 1978. General Lattice Theory. Academic Press.

Groenendijk, Jeroen. 2003. Questions and answers: Semantics and logic. Pages 12–23 of: Bernardi, R., & Moortgat, M. (eds), Proceedings of the 2nd CologNET-ElsET Symposium. Questions and Answers: Theoretical and Applied Perspectives. OTS.

Groenendijk, Jeroen, & Stokhof, Martin. 1984. Studies on the Semantics of Questions and the Pragmatics of Answers. Ph.D. thesis, Universiteit van Amsterdam.

Groenendijk, Jeroen, & Stokhof, Martin. 1997. Questions. Chap. 19, pages 1055–1124 of: ter Meulen, Alice, & van Benthem, Johan (eds), Handbook of Logic and Language. Elsevier Science Publishers.

Guan, Xuechong. 2014. A New Method for Compact Extension of Information Algebras. J. of Math. (PRC), 34, 610–616.

Guan, Xuechong, & Li, Yongming. 2012. On Two Types of Continuous Information Algebras. Int. J. of Uncertainty, Fuzziness and Knowledge-Based Systems, 20, 655–671.

Haenni, R., Kohlas, J., & Lehmann, N. 2000a. Probabilistic Argumentation Systems. Pages 221–287 of: Kohlas, J., & Moral, S. (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Vol- ume 5: Algorithms for Uncertainty and Defeasible Reasoning. Kluwer, Dordrecht.

Haenni, R., Kohlas, J., & Lehmann, N. 2000b. Probabilistic Argumentation Systems. Pages 221–287 of: Kohlas, J., & Moral, S. (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Vol- ume 5: Algorithms for Uncertainty and Defeasible Reasoning. Kluwer, Dordrecht.

Hewitt, E., & Zuckerman, H.S. 1956. The l1 algebra of a commutative semigroup. Amer. Math. Soc., 83, 70–97.

Jensen, F.V., Lauritzen, S.L., & Olesen, K.G. 1990. Bayesian Updating in Causal Probabilistic Networks by Local Computation. Computational Statistics Quarterly, 4, 269–282.

Jirousek, R., Vejnarova, J., & Daniel, M. 2007. Compositional Models of Belief Functions. Pages 243–252 of: De Cooman, G., Vejnarova, J., & Zaffalon, M. (eds), Proc. of the 5th Int. Symp. on Imprecise Probabilities and Their Applications (ISIPTA-07). Charles University Press.

Jirousek, R. 1997. Composition of Probability Measures on Finite Spaces. Pages 274–281 of: Geiger, D., & Shenoy, P. (eds), Uncertainty in Arti- ficial Intelligence. UAI. Morgan Kaufmann.

Jirousek, R. 2011. Foundations of Compositional Model Theory. Int. J. of General Systems, 40, 623–678.

Jirousek, R., & Shenoy, P. 2014. Compositional Models in Valuation Based Systems. Int. J. of Approximate Reasoning, 55, 277–293.

Jirousek, R., & Shenoy, P. 2015. Causal Compositional Models in Valu- ation Based Systems with Examples in Specific Theories. Int. J. of Approximate Reasoning.

Klement, E. Mesiar, R., & Pap, E. 2000. Triangular Norms, Trends in Logic. Kluwer Academic Publ. Dordrecht.

Kohlas, J. 1997. Allocation of Arguments and Evidence Theory. Theoretical Computer Science, 171, 221–246.

Kohlas, J. 2003a. Information Algebras: Generic Structures for Inference. Springer-Verlag.

Kohlas, J. 2003b. Probabilistic Argumentation Systems. A New Way to Combine Logic with Probability. J. of Applied Logic, 1, 225–253.

Kohlas, J. 2007. Uncertain Information: Random Variables in Graded Semilattices. Int. J. Approx. Reason., doi:10.1016/j.ijar.2006.12.005.

Kohlas, J., & Monney, P.A. 1995. A Mathematical Theory of Hints. An Approach to the Dempster-Shafer Theory of Evidence. Lecture Notes in Economics and Mathematical Systems, vol. 425. Springer.

Kohlas, J., & Schmid, J. 2014a. An Algebraic Theory of Information: An Introduction and Survey. Information, 5, 219–254.

Kohlas, J., & Schmid, J. 2014b. Representation Theory of Information Algebras. Draft, 1–61.

Kohlas, J., & Shenoy, P.P. 2000. Computation in Valuation Algebras. Pages 5–39 of: Kohlas, J., & Moral, S. (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Volume 5: Algorithms for Uncertainty and Defeasible Reasoning. Kluwer, Dordrecht.

Kohlas, J., & Wilson, N. 2008. Semiring Induced Valuation Algebras: Exact and Approximate Local Computation Algorithms. Artif. Intell., 172(11), 1360–1399.

Kohlas, J., Berzati, D., & Haenni, R. 2002. Probabilistic Argumentation Systems and Abduction. Annals of Mathematics and Artificial Intelligence, Special Issue (AMAI), 34, 177–195.

Kohlas, Jürg, & Schneuwly, Cesar. 2009. Information Algebra. Pages 95–127 of: Sommaruga, Giovanni (ed), Formal Theories of Information. Lecture Notes in Computer Science, vol. 5363. Springer.

Kolokoltsov, V., & Maslov, V. 1997. Idempotent Analysis and its Applications. Kluwer Academic Publ., Dordrecht.

Laskey, K.B., & Lehner, P.E. 1989. Assumptions, Beliefs and Probabilities. Artif. Intell., 41, 65–77.

Lauritzen, S. L., & Jensen, F. V. 1997. Local Computation with Valuations from a Commutative Semigroup. Ann. Math. Artif. Intell., 21(1), 51–69.

Lauritzen, S. L., & Spiegelhalter, D. J. 1988. Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems. J. Royal Statis. Soc. B, 50, 157–224.

Maier, D. 1983. The Theory of Relational Databases. London: Pitman.

Menger, K. 1942. Statistical Metrics. Proc. Nat. Acad. Sci., 28, 535–537.

Mitsch, H. A Natural Partial Order for Semigroups. Proc. of the American Math. Soc., 97, 384–388.

Nambooripad, K.S.S. The Natural Partial Order on a Regular Semigroup. Proc. of the Edinburgh Math. Soc., 23, 249–260.

Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc.

Pouly, M., & Kohlas, J. 2011. Generic Inference. A Unified Theory for Automated Reasoning. Wiley, Hoboken, New Jersey.

Pouly, M., Kohlas, J., & Ryan, P.Y.A. 2013. Generalized Information Theory for Hints. Int. J. of Approximate Reasoning, 54, 17–34.

Shafer, G. 1976. A Mathematical Theory of Evidence. Princeton University Press.

Shafer, G. 1991. An Axiomatic Study of Computation in Hypertrees. Working Paper 232. School of Business, University of Kansas.

Shafer, G. 1996. Probabilistic Expert Systems. CBMS-NSF Regional Conference Series in Applied Mathematics, no. 67. Philadelphia, PA: SIAM.

Shafer, G., & Shenoy, P. 1988. Local Computation in Hypertrees. Tech. rept. 201. School of Business, University of Kansas.

Shafer, G., & Shenoy, P. P. 1990. Axioms for Probability and Belief Function Propagation. In: Shafer, G., & Pearl, J. (eds), Readings in Uncertain Reasoning. Morgan Kaufmann Publishers Inc., San Mateo, California.

Shafer, G., Shenoy, P.P., & Mellouli, K. 1987a. Propagating Belief Functions in Qualitative Markov Trees. Int. J. of Approximate Reasoning, 1(4), 349–400.

Shafer, G., Shenoy, P.P., & Mellouli, K. 1987b. Propagating Belief Functions in Qualitative Markov Trees. Int. J. of Approximate Reasoning, 1(4), 349–400.

Shenoy, P. P., & Shafer, G. 1990a. Axioms for probability and belief-function propagation. Pages 169–198 of: Shachter, Ross D., Levitt, Tod S., Kanal, Laveen N., & Lemmer, John F. (eds), Uncertainty in Artificial Intelligence 4. Machine intelligence and pattern recognition, vol. 9. Amsterdam: Elsevier.

Shenoy, P.P. 1992. Valuation-Based Systems: A Framework for Manag- ing Uncertainty in Expert Systems. Pages 83–104 of: Zadeh, L.A., & Kacprzyk, J. (eds), Fuzzy Logic for the Management of Uncertainty. John Wiley & Sons.

Shenoy, P.P. 1994. Conditional Independence in Valuation-based Systems. International Journal of Approximate Reasoning, 10, 203–234.

Shenoy, P.P. 1996. Axioms for Dynamic Programming. Pages 259–275 of: Gammerman, A. (ed), Computational Learning and Probabilistic Reasoning. Wiley, Chichester, UK.

Shenoy, P.P., & Shafer, G. 1990b. Axioms for Probability and Belief Func- tion Propagation. Pages 169–198 of: R.D. Shachter, T.S. Levitt, J.F. Lemmer, & Kanal, L.N. (eds), Uncertainty in Artif. Intell. 4. North Holland.

Spohn, W. 1988. Ordinal conditional functions: A dynamic theory of epis- temic states. Pages 105–134 of: Harper, W.L., & Skyrms, B. (eds), Causation in Decision, Belief Change, and Statistics, vol. 2. Dordrecht, Netherlands.

Stoltenberg-Hansen, V., Lindström, I., & Griffor, E. 1994. Mathematical Theory of Domains. Cambridge: Cambridge University Press.

Vejnarova, J. 1998. Composition of Possibility Measures on Finite Spaces: Preliminary Results. Pages 25–30 of: Bouchon-Meunier, B., & Yager, R.R. (eds), Proc. of the 7th Int. Conf. on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU-98).

Yager, R., & Rybalov, A. 1996. Uninorm Aggregation Operators. Fuzzy Set Syst., 80, 111–120.