<<

Statistical Relational AI: Papers from the AAAI-14 Workshop

Reasoning in the Description BEL Using Bayesian Networks

Ismail˙ Ilkan˙ Ceylan⇤ Rafael Penaloza˜ † Theoretical Theoretical Computer Science, TU Dresden, Germany TU Dresden, Germany Center for Advancing Electronics Dresden [email protected] [email protected]

Abstract 2 Proof Structures in EL is a light-weight DL that allows for polynomial-time rea- We study the problem of reasoning in the probabilistic De- EL scription Logic . Using a novel structure, we show that soning. It is based on and roles, corresponding to probabilistic reasoningBEL in this logic can be reduced in poly- unary and binary predicates from first-order logic, respec- nomial time to standard over a Bayesian network. tively. Formally, let NC and NR be disjoint sets of This reduction provides tight complexity bounds for proba- names and role names, respectively. concepts are defined bilistic reasoning in . through the syntactic rule C ::= A EL C C r.C, where BEL | > | u | 9 A NC and r NR. The2 semantics2 of is given in terms of an interpre- 1 Introduction EL tation =(I , I ) where I is a non-empty domain Description (DLs) (Baader et al. 2007) are a fam- I · and I is an interpretation function that maps every con- ily of knowledge representation formalisms tailored towards · cept name A to a AI I and every role name the representation of terminological knowledge in a formal ✓ r to a set of binary relations rI I I . The manner. In their classical form, DLs are unable to handle ✓ ⇥ interpretation function I is extended to concepts by the inherent uncertainty of many application domains. To · EL defining I := I , (C D)I := CI DI , and overcome this issue, several probabilistic extensions of DLs > u \ ( r.C)I := d I e I :(d, e) rI e CI . have been proposed. The choice of a specific probabilistic 9The knowledge{ 2 of a| 9 domain2 is represented2 ^ through2 a} set DL over others depends on the intended application; these of restricting the interpretation of the concepts. logics differ in their logical expressivity, their , and their independence assumptions. Definition 1 (TBox). A general concept inclusion (GCI) is an expression of the form C D, where C, D are con- Recently, the DL (Ceylan and Penaloza˜ 2014) was v introduced as a meansBEL of describing certain knowledge that cepts. A TBox is a finite set of GCIs. The signature of (sig( )) is the setT of concept and role names appearing in T. depends on an uncertain context, expressed by a Bayesian T T network (BN). An interesting property of this logic is that An interpretation satisfies the GCI C D iff CI DI ; is a model of theI TBox iff it satisfiesv all the GCIs✓ in . reasoning can be decoupled between the logical part and I T T the BN inferences. However, despite the logical component The main reasoning service in is subsumption check- of this logic being decidable in polynomial time, the best ing, i.e., deciding the sub-conceptEL relations between given known for probabilistic reasoning in runs in BEL concepts based on their semantic definitions. A concept exponential time. C is subsumed by D w.r.t. the TBox ( = C D) T T| v In this paper we use a novel structure, called the proof iff CI DI for all models of . It has been structure, to reduce probabilistic reasoning for a ✓ I T BEL shown that subsumption can be decided in in poly- to probabilistic inferences in a BN. In a nut- nomial time by a completion algorithm (Baader,EL Brandt, shell, a proof structure describes the class of contexts that and Lutz 2005). This algorithm requires the TBox to entail the wanted consequence. A BN can be constructed to be in normal form; i.e., where all GCIs are one of the compute the of these contexts, which yields the forms A B A B C A r.B r.B A. It probability of the entailment. Since this reduction can be is well knownv that| everyu v TBox| canv9 be transformed| 9 v into an done in polynomial time, it provides tight upper bounds for equivalent one in normal form of linear size (Brandt 2004; the complexity of reasoning in . BEL Baader, Brandt, and Lutz 2005); for the rest of this paper, we assume that is a TBox in normal form and GCI denotes a ⇤Supported by DFG in the Research Training Group “RoSI” T (GRK 1907). normalized subsumption relation. In this paper, we are interested in deriving the GCIs in †Partially supported by DFG within the Cluster of Excellence normal form that follow from ; i.e. the normalised logical ‘cfAED’. T Copyright c 2014, Association for the Advancement of Artificial closure of . We introduce the deduction rules shown in Ta- Intelligence (www.aaai.org). All rights reserved. ble 1 to produceT the normalised logical closure of a TBox.

15 Table 1: Deduction rules for . EL Algorithm 1 Construction of the pruned proof structure Input: TBox Premises (S) Result (↵) Output: H =(T W, F ) 7! A B B C A C 1 , 1: V , E , i 0 2 hA v r.Bi h , vB ihC A v r.Ci 0 0 2: do T ; hA v9r.Bi hC v AihC v9r.Bi 3 , 3: i i +1 4 h r.Av9 Bi, hB v Cihr.Av9 C i h9 v i h v ih9v i 4: Vi := Vi 1 ↵ S ↵,S Vi 1 5 r.A B , C A r.C B [ { | 7! ✓ } h9 v i h v ih9v i 5: Ei = (S, ↵) S ↵,S Vi 1 6 r.A B , B r.C A C { | 7! ✓ } h9 v i h v9 ihv i 6: while Vi = Vi 1 or Ei = Ei 1 A r.B r.B C A C 7 , 7: W := (↵6 ,k) ↵ V ,6 0 k i hA v9B Ci h9C vX ihA vB i X k 8 , 8: E := {(S, (↵,k| )) 2(S, ↵)  E, 0} k i 9 hA u B v Ci, hX v AihX u B v Ci { | 2 k   } [ h u v i h v ihu v i 9: ( (↵,k 1) , (↵,k)) ↵ Vk 1, 0 k i 10 A B C , X B A X C 10: return{({W, E) } | 2   } 11 hX u Xv Ci h v ihX u Cv i h u v ihv i

Each rule maps a set of premises to a GCI that is implicitly der to avoid cycles. In this case, nodes are pairs of GCIs and labels, where the latter indicates to which level the nodes be- encoded in the premises. It is easy to see that the sets of i premises cover all pairwise combinations of GCIs in normal long in the hypergraph. We write S = (↵,i) ↵ S to denote the i-labeled set of GCIs in S. Let{ n :=| sig2( }) 3, form and that the deduction rules produce the normalised | T | logical closure of a TBox. Moreover, the given deduction we start with the set W0 := (↵, 0) ↵ and define the levels 0 i

16 ... A D ... variable x V is conditionally independent of its non- v descendants2 given its parents. Thus, every BN defines a unique joint probability distribution (JPD) over VB given by

A BA CB CA DB DC D P (V )= P (x ⇡(x)). B B | v v v v v v x V Y2 As with classical DLs, the main building blocks in BEL A BB CB DC D are concepts, which are syntactically built as concepts. v v v v The domain knowledge is encoded by a generalizationEL of Figure 1: The first levels of an unfolded proof structure and TBoxes, where GCIs are annotated with a context, defined the paths to A D by a set of literals belonging to a BN. h v i Definition 7 (KB). Let V be a finite set of Boolean vari- ables. A V -literal is an expression of the form x or x, ¬ Lemma 5. Algorithm 1 terminates in time polynomial on where x V ;aV -context is a consistent set of V -literals. the size of . A V -restricted2 general concept inclusion (V -GCI) is of T the form C D :  where C and D are concepts We briefly illustrate the execution of Algorithm 1 on a and  is ahV -context.v i A V -TBox is a finite setBEL of V -GCIs. simple TBox. A knowledge base (KB) over V is a pair =( , ) Example 6. Consider the following TBox whereBEL is a BN over V and is a V -TBox. K B T EL B T = A B,B C,B D, C D . The first lev- The semantics of extends the semantics of by Tels of{ thev unfoldedv proofv structurev of} are shown in BEL EL 1 T additionally evaluating the random variables from the BN. Figure 1. The first level V0 of this hypergraph contains a Given a finite set of Boolean variables V ,aV -interpretation representative for each GCI in . To construct the second T is a tuple =(I , I , I ) where I is a non-empty set level, we first copy all the GCIs in V0 to V1, and add a I · V called the domain, I : V 0, 1 is a valuation of the hyperedge joining the equivalent ones (represented by a V ! { } variables in V , and I is an interpretation function that maps dashed line in Figure 1). Afterwards, we apply all possible · every concept name A to a set AI I and every role deduction rules to the elements of V0, and add a hyperedge ✓ 3 name r to a binary relation rI I I . from the premises at level V0 to the conclusion at level V1 ✓ ⇥ The interpretation function I is extended to arbitrary (continuous lines). The same procedure is repeated at each · concepts as in and the valuation I is extended to subsequent level. Notice that the set of GCIs at each level is BEL EL V contexts by defining, for every x V , I ( x)=1 I (x), monotonically increasing. Additionally, for each GCI, the 2 V ¬ V and for every context , I ()=min`  I (`), where in-degree of each representative monotonically increases V 2 V I ( ):=1. Intuitively, a context  can be thought as a throughout the levels. conjunctionV ; of literals, which is evaluated to 1 iff each lit- In the next section, we briefly recall , a probabilistic eral in the context is evaluated to 1. extension of based on Bayesian networksBEL (Ceylan and The V -interpretation is a model of the V -GCI I Penaloza˜ 2014),EL and use the construction of the (unfolded) C D :  , denoted as = C D :  , iff h v i I| h v i proof structure to reduce reasoning in this logic, to standard (i) I ()=0, or (ii) CI DI . It is a model of the V ✓ Bayesian network inferences. V -TBox iff it is a model of all the V -GCIs in . The idea is thatT the restriction C D is only requiredT to hold v 3 The Bayesian Description Logic whenever the context  is satisfied. Thus, any interpretation BEL that violates the context trivially satisfies the V -GCI. The probabilistic Description Logic extends by as- BEL EL Example 8. Let V0 = x, y, z , and consider the V0-TBox sociating every GCI in a TBox with a probability. To handle { } the joint probability distribution of the GCIs, these probabil- 0 := A C : x, y , A B : x , T { h v { }i h v {¬ }i ities are encoded in a Bayesian network (Darwiche 2009). B C : x . Formally, a Bayesian network (BN) is a pair =(G, ), h v {¬ }i} B 0 where G =(V,E) is a finite directed acyclic graph (DAG) The interpretation 0 =(d , I , 0) given by I 0 { } · V0 0 whose nodes represent Boolean random variables,2 and 0( x, y, z )=1, AI = d , and BI = CI = V { ¬ } { } ; contains, for every node x V , a conditional probability is a model of 0, but is not a model of the V -GCI 2 T 0 0 distribution P (x ⇡(x)) of x given its parents ⇡(x). If V A B : x , since 0( x )=1but AI BI . B | h v { }i V { } 6✓ is the set of nodes in G, we say that is a BN over V . A V -TBox is in normal form if for each V -GCI B BNs encode a series of conditional independence assump- ↵ :  , ↵Tis an GCI in normal form. A KB tions between the random variables; more precisely, every h =(i2, T) is in normalEL form if is in normal form.BEL As Kfor ,T everyB KB can be transformedT into an equivalent 1For the illustrations we drop the second component of the oneEL in normalBEL form in polynomial time (Ceylan 2013). Thus, nodes, but visually make the level information explicit. we consider only KBs in normal form in the following. 2In their general form, BNs allow for arbitrary discrete random BEL variables. We restrict w.l.o.g. to Boolean variables for ease of pre- 3When there is no danger of ambiguity, we will usually drop the sentation. prefix V and speak simply of e.g. a TBox,aKB or an interpretation.

17 The DL is a special case of in which all V -GCIs x EL BEL x are of the form C D : . Notice that every valuation 0.7 y satisfies the emptyh contextv ;i; thus, a V -interpretation sat- ; I y x 1 isfies the V -GCI C D : iff CI DI . We say that z entails C Dh : v, denoted;i by ✓= C D, if every x 0.5 T h v ;i T| v xy 0.3 ¬ model of is also a model of C D : . For a valuation z of theT variables in V , weh canv define;i a TBox contain- x y 0.1 W xy¬ 0 ing all axioms that must be satisfied in any V -interpretation ¬x y 0.9 =(I , I , I ) with I = . ¬ ¬ I · V V W Definition 9 (restriction). Let =( , ) be a KB. The Figure 2: A simple BN restriction of to a valuation Kof theB variablesT in V is T W := C D : C D :  , ()=1 . TW {h v ;i | h v i2T W } We say that C is p-subsumed by D in , for p (0, 1] if To handle the probabilistic knowledge provided by the 2 BN, we extend the semantics of through multiple- P ( C D :  ) p. BEL h vK i world interpretations. Intuitively, a V -interpretation de- The following proposition was shown in (Ceylan and scribes a possible world; by assigning a probabilistic dis- Penaloza˜ 2014). tribution over these interpretations, we describe the required Proposition 12. Let =( , ) be a KB. Then , which should be consistent with the BN pro- K B T vided in the knowledge base. P ( C D :  )=1 P ()+ P ( ). h vK i B B W Definition 10 (probabilistic model). A probabilistic in- =C D terpretation is a pair =(I,PI), where I is a set of TW X|()=1v P W V -interpretations and PI is a probability distribution over I such that PI( ) > 0 only for finitely many interpretations Consider the KB =( , ), where I Example 13. 0 0 0 0 I. This probabilistic interpretation is a model of the is the BN from Figure 2 and theK TBoxB fromT ExampleB 8. I 2 0 TBox if every I is a model of . is consistent with It follows that P ( A C T: x, y )=1from the first T I 2 T P 0 the BN if for every possible valuation of the variables GCI in and P ( hA vK C : { x })=1i from the others B W 0 in V it holds that since anyT model ofh vneedsK to{¬ satisfy}i the GCIs asserted in K0 PI( )=P ( ). by definition. Notice that A C does not hold in con- I B W Ttext x, y , but P ( A C :v x, y )=1. Since this I, I = 0 I2 XV W describes{ ¬ all} contexts,h wev concludeK { P¬( A}i C : )=1. It is a model of the KB ( , ) iff it is a (probabilistic) model h vK0 ;i of and consistent withB T. T B 3.2 Deciding p-subsumption One simple consequence of this semantics is that proba- We now show that deciding p-subsumption can be reduced bilistic models preserve the probability distribution of for B to exact in Bayesian networks. This latter problem contexts; the probability of a context  is the sum of the is known to be PP-complete (Roth 1996). Let =( , ) probabilities of all valuations that extend . be an arbitrary but fixed KB. From the V -TBoxK T, weB Just as in classical DLs, we want to extract the informa- BEL T construct the classical TBox 0 := ↵ ↵ :  ; tion that is implicitly encoded in a KB. In particular, EL T { | h i2T} BEL that is, 0 contains the same axioms as , but ignores the we are interested in solving different reasoning tasks for this contextualT information encoded in theirT labels. Let now logic. One of the fundamental reasoning problems in is u EL H be the (pruned) unraveled proof structure for this TBox subsumption: is a concept C always interpreted as a sub- T u 0. By construction, H is a directed acyclic hypergraph. concept of D? In the case of , we are also interested in T T BEL Our goal is to transform this hypergraph into a DAG. Us- finding the probability with which such a subsumption rela- ing this DAG, we will construct a BN, from which all the tion holds. For the rest of this section, we formally define p-subsumption relations can be read through standard BN this reasoning task, and provide a method for solving it, by inferences. We explain this construction in two steps. reducing it to a decision problem in Bayesian networks. From Hypergraph to DAG Hypergraphs generalize 3.1 Probabilistic Subsumption graphs by allowing several vertices to be connected by a Subsumption is one of the most basic decision problems in single edge. Intuitively, the hyperedges in a hypergraph en- . In , we generalize this problem to consider also the code a formula in disjunctive normal form. Indeed, an edge contextsEL BEL and probabilities provided by the BN. (S, v) expresses that if all the elements in S can be reached, Definition 11 (p-subsumption). Let C, D be two then v is also reachable; this can be seen as an implication: concepts,  a context, and a KB. ForBEL a w S w v. Suppose that there exist several edges shar- K BEL 2 ) (S ,v), (S ,v),...,(S ,v) probabilistic interpretation =(I,PI), we define ing the same head 1 2 k in the hyper- P graph.V This situation can be described through the implica- P ( C D :  ):= I, = C D: PI( ). The prob- h vP i I2 I| h v i I tion k ( w) v. We can thus rewrite any directed ability of C D :  w.r.t. is defined as i=1 w Si h v iP K acyclic hypergraph2 into) a DAG by introducing auxiliary con- P ( C D :  ):= inf P ( C D :  ). W V h vK i = h vP i junctive and disjunctive nodes (see the upper part of Fig- P| K

18 Algorithm 2 Construction of a DAG from a hypergraph have at most two parent nodes. This restriction can be easily Input: H =(V,E) directed acyclic hypergraph removed by introducing conjunction nodes as before. For a Output: G =(V 0,E0) directed acyclic graph context , let var() denote the set of all variables appearing 1: V 0 V , i, j 0 in . We construct a new BN as follows. Let G =(V,E) and G =(BKV ,E ). The DAG G is 2: for each v V do T T T K 3: S 2S (S, v) E , j i given by G =(V ,E ), where V := V V and K K K K [ T 4: for each{ S| S do2 } 2 E := E E (x, (↵, 0)) ↵ :  ,x var() . 5: V 0 V 0 i , E0 E0 (u, i) u S K [ T [ { | h i2T 2 } [ {^ } [ { ^ | 2 } 6: if i>jthen Clearly, G is a DAG. We now need only to define the con- K 7: V 0 V 0 i , E0 E0 ( i, i) ditional probability tables for the nodes in V given their [ {_ } [ { ^ _ } T 8: i i +1 parents in G ; notice that the structure of the graph G re- K 9: i = j +1 mains unchanged for the construction of G . For every if then K 10: E E ( ,v) node (↵, 0) V , there is a  such that ↵ :  ; the 0 0 j 2 T h i2T 11: else [ { ^ } parents of (↵, 0) in G are then var() V . The con- K ✓ 12: E0 E0 ( , ) j0 has exactly one parent bility tables in a BN is exponential on the number of parents node v; (↵,i) is true with probability 1 iff v is true. that a node has, we ensure that these auxiliary nodes, as well Example 14. Consider the KB =( , 0), where as the elements in W have at most two parent nodes. BEL K T B T Algorithm 2 describes the construction of such a DAG = A B : x , B C : x, y , T { h v { }i h v {¬ }i from a directed hypergraph. Essentially, the algorithm adds C D : z , B D : y . a new node i for each hyperedge (S, v) in the input hy- h v { }i h v { }i} pergraph H,^ and connects it with all the nodes in S. Addi- The BN obtained from this KB is depicted in Figure 3. The tionally, if there are k hyperedges that lead to a single node upper part of the figure represents the DAG obtained from v, it creates k 1 nodes i. These are used to represent the the unraveled proof structure of , while the lower part _ T binary disjunctions among all the hyperedges leading to v. shows the original BN 0. The gray arrows depict the con- Clearly, the algorithm runs in polynomial time on the size of nection between these twoB DAGs, which is given by the la- H, and if H is acyclic, then the resulting graph G is acyclic bels in the V -GCIs in . The gray boxes denote the condi- too. Moreover, all the nodes v V that existed in the in- tional probability of theT different nodes given their parents. 2 put hypergraph will have at most one parent node after the Suppose that we are interested in P ( A D : ). h vK ;i translation; every i node has exactly two parents, and the From the unraveled proof structure, we can see that A D _ v number of parents of a node i is given by the set S from the can be deduced either using the GCIs A B, B C, ^ hyperedge (S, v) E that generated it. In particular, if the C D, or through the two GCIs A Bv, B D.v The 2 input hypergraph is the unraveled proof structure for a TBox probabilityv of any of these combinationsv of GCIsv to appear , then the size of the generated graph G is polynomial on is given by and the contextual connection to the axioms T 0 the size of , and each node has at most two parent nodes. at the lowerB level of the proof structure. Thus, to deduce T P ( A D : ) we need only to compute the probability From DAG to Bayesian Network The next step is to build of theh nodevK (A;i D, n), where n is the last level. a BN that preserves the probabilistic entailments of a v KB. Let =( , ) be such a KB, with =(G, ),BEL and From the properties of proof structures and Theorem 4 we let G beK the DAGT B obtained from the unraveledB proof struc- have that ture ofT using Algorithm 2. Recall that the nodes of G T T P ((↵,n) )= P ((↵,n) ())P ( ()) are either (i) pairs of the form (↵,i), where ↵ is a GCI in BK | BK |V BK V ()=1 normal form built from the signature of , or (ii) an aux- V X T iliary disjunction ( i) or conjunction ( i) node introduced = P ( ), _ ^ BK W by Algorithm 2. Moreover, (↵, 0) is a node of G iff there =↵ T W TX(|)=1 is a context  with ↵ :  . We assume w.l.o.g. that W for node (↵, 0) thereh is exactlyi2T one such context. If there were more than one, then we could extend the BN with an which yields the following result. additional variable which describes the disjunctionsB of these Theorem 15. Let =( , ) be a KB, where is contexts, similarly to the construction of Algorithm 2. Sim- over V , and n = Ksig( ) T3.B For a V -GCIBEL ↵ :  , it holdsB ilarly, we assume that  2, to ensure that 0-level nodes that P ( ↵ :  )=1| TP |()+P ((↵,n)h ). i | |  h i B BK |

19 A D _j0 lows for classical consequences whereas DISPONTE does v not. Closest to our approach is perhaps the Bayesian ex- _j0 tension of DL-Lite called BDL-Lite (d’Amato, Fanizzi, and Lukasiewicz 2008). Abstracting from the different logical j ( j ) ( j ) _ ^ _ ^ 00 component, BDL-Lite looks almost identical to ours. There is, however, a subtle but important difference. In our ap- ... j j j ... ^ ^ 0 ^ 00 proach, an interpretation satisfies a V -GCI C D :  I h v i if I ()=1implies CI DI . In (d’Amato, Fanizzi, A BA CB CA DB DC D V ✓ v v v v v v and Lukasiewicz 2008), the authors employ a closed-world ... assumption over the contexts, where this implication is sub- i i0 stituted for an equivalence; i.e., I ()=0also implies _ _ V CI DI . The use of such semantics can easily produce (A B) (B D) i i (B C) (C D) i v ^ v ^ ^ 0 v ^ v ^ 00 inconsistent6✓ KBs, which is impossible in . BEL Other probabilistic extensions of are (Lutz and x A BBx y CCz DBy D EL v ¬ ^ v v v Schroder¨ 2010) and (Niepert, Noessner, and Stuckenschmidt 2011). The former introduces probabilities as a concept x constructor, while in the latter the probabilities of axioms, x 0.7 which are always assumed to be independent, are implic- y itly encoded through a weighting function, which is inter- y x 1 preted with a log-linear model. Thus, both formalisms differ z x 0.5 ¬ greatly from our approach. xy 0.3 x y 0.1 z xy¬ 0 5 Conclusions ¬x y 0.9 ¬ ¬ We have described the probabilistic DL , which extends the light-weight DL with uncertainBEL contexts. We have Figure 3: A portion of the constructed BN shown that it is possibleEL to construct, from a given KB , a BN that encodes all the probabilistic andBEL logical KknowledgeB ofK w.r.t. to the signature of the KB. Moreover, This theorem states that we can reduce the problem of the size of Kis polynomial on the size of . We obtain that p-subsumption w.r.t. the KB to a probabilistic infer- probabilisticBK reasoning over is at most asK hard as deciding BEL K ence in the BN . Notice that the size of is polyno- inferences in , which yieldsK a tight complexity bound for BK BK mial on the size of . This means that p-subsumption is deciding p-subsumptionBK in this logic. K at most as hard as deciding the probability of query vari- While the construction is polynomial on the input KB, ables, given an evidence, which is known to be in PP (Roth the obtained DAG might not preserve all the desired prop- 1996). Since p-subsumption is already PP-hard (Ceylan and erties of the original BN. For instance, it is known that Penaloza˜ 2014), we obtain the following result. the efficiency of the BN inference engines depends on the Corollary 16. Deciding p-subsumption w.r.t. a KB treewidth of the underlying DAG (Pan, McMichael, and is PP-complete on the size of . BEL K Lendjel 1998); however, the proof structure used by our con- K struction may increase the treewidth of the graph. One direc- 4 Related Work tion of future research will be to try to optimize the reduc- An early attempt for combining BNs and DLs was tion by bounding the treewidth and reducing the ammount P-CLASSIC (Koller, Levy, and Pfeffer 1997), which extends of nodes added to the graph. CLASSIC through probability distributions over the interpre- Clearly, once we have constructed the associated BN from a given KB , this can be used for additionalB in-K tation domain. In the same line, in PR-OWL (da Costa, BEL K Laskey, and Laskey 2008) the probabilistic component is in- ferences, beyond deciding subsumption. We think that rea- terpreted by providing individuals with a probability distri- soning tasks such as contextual subsumption and finding the bution. As many others in the literature (see (Lukasiewicz most likely context, defined in (Ceylan and Penaloza˜ 2014) and Straccia 2008) for a thorough survey on probabilistic can be solved analogously. Studying this and other reason- DLs) these approaches differ from our multiple-world se- ing problems is also a task of future work. mantics, in which we consider a probability distribution over Finally, our construction does not depend on the chosen a set of classical DL interpretations. DL , but rather on the fact that a simple polynomial-time DISPONTE (Riguzzi et al. 2012) is one representative consequence-basedEL method can be used to reason with it. of the approaches that consider a multiple-world seman- It should thus be a simple task to generalize the approach to tics. The main difference with our approach is that in other consequence-based methods, e.g. (Simancik, Kazakov, DISPONTE, the authors assume that all probabilities are in- and Horrocks 2011). It would also be interesting to gener- dependent, while we provide a joint probability distribution alize the probabilistic component to consider other kinds of through the BN. Another minor difference is that al- probabilistic graphical models (Koller and Friedman 2009). BEL

20 References Baader, F.; Calvanese, D.; McGuinness, D. L.; Nardi, D.; and Patel-Schneider, P. F., eds. 2007. The Description Logic Handbook: Theory, Implementation, and Applica- tions. Cambridge University Press, 2nd edition. Baader, F.; Brandt, S.; and Lutz, C. 2005. Pushing the envelope. In Proc. IJCAI-05. Morgan-Kaufmann. EL Brandt, S. 2004. Polynomial time reasoning in a description logic with existential restrictions, GCI axioms, and—what else? In Proc. ECAI-2004, 298–302. IOS Press. Ceylan, I.˙ I.,˙ and Penaloza,˜ R. 2014. The Bayesian Descrip- tion Logic . In Proc. IJCAR 2014. To appear. BEL Ceylan, I.˙ I.˙ 2013. Context-sensitive bayesian description logics. Master’s thesis, Dresden University of Technology, Germany. da Costa, P. C. G.; Laskey, K. B.; and Laskey, K. J. 2008. Pr- owl: A bayesian language for the . In Uncertainty Reasoning for the Semantic Web I, URSW 2005- 2007, volume 5327 of LNCS, 88–107. Springer. d’Amato, C.; Fanizzi, N.; and Lukasiewicz, T. 2008. Tractable reasoning with bayesian description logics. In Proc. Second International Conference on Scalable Uncer- tainty Management (SUM 2008), volume 5291 of LNCS, 146–159. Springer. Darwiche, A. 2009. Modeling and Reasoning with Bayesian Networks. Cambridge University Press. Koller, D., and Friedman, N. 2009. Probabilistic Graphical Models - Principles and Techniques. MIT Press. Koller, D.; Levy, A. Y.; and Pfeffer, A. 1997. P-classic: A tractable probablistic description logic. In Proc. 14th Na- tional Conference on Artificial Intelligence (AAAI-97), 390– 397. AAAI Press. Lukasiewicz, T., and Straccia, U. 2008. Managing uncer- tainty and vagueness in description logics for the semantic web. J. of Web Semantics 6(4):291–308. Lutz, C., and Schroder,¨ L. 2010. Probabilistic description logics for subjective uncertainty. In Lin, F.; Sattler, U.; and Truszczynski, M., eds., KR. AAAI Press. Niepert, M.; Noessner, J.; and Stuckenschmidt, H. 2011. Log-linear description logics. In Walsh, T., ed., IJCAI, 2153–2158. IJCAI/AAAI. Pan, H.; McMichael, D.; and Lendjel, M. 1998. Inference in bayesian networks and the probanet system. Digital Signal Processing 8(4):231 – 243. Riguzzi, F.; Bellodi, E.; Lamma, E.; and Zese, R. 2012. Epistemic and statistical probabilistic . In Proc. 8th Int. Workshop on Uncertainty Reasoning for the Seman- tic Web (URSW-12), volume 900, 3–14. CEUR-WS. Roth, D. 1996. On the hardness of approximate reasoning. Artif. Intel. 82(1-2):273–302. Simancik, F.; Kazakov, Y.; and Horrocks, I. 2011. Consequence-based reasoning beyond horn ontologies. In Proc. IJCAI-11, 1093–1098. IJCAI/AAAI.

21