Grouping Axioms for More Coherent Ontology Descriptions

Grouping axioms for more coherent ontology descriptions Sandra Williams Richard Power The Open University The Open University Milton Keynes, United Kingdom Milton Keynes, United Kingdom [email protected] [email protected] Abstract Web languages, see Smart (2008). Typically they generate one sentence per axiom: for example, Ontologies and datasets for the Semantic from the axiom2 Cat v Animal the OWL ACE Web are encoded in OWL formalisms that verbaliser (Kaljurand and Fuchs, 2007) generates are not easily comprehended by people. ‘Every cat is an animal’. The result is not a co- To make ontologies accessible to human herent text, however, but a disorganised list, often domain experts, several research groups including inefficient repetitions such as: have developed ontology verbalisers using Every cat is an animal. Natural Language Generation. In practice Every dog is an animal. ontologies are usually composed of simple Every horse is an animal. axioms, so that realising them separately Every rabbit is an animal. is relatively easy; there remains however An obvious first step towards improved efficiency the problem of producing texts that are co- and coherence would be to replace such lists with herent and efficient. We describe in this a single aggregated sentence: paper some methods for producing sen- The following are kinds of animals: cats, dogs, tences that aggregate over sets of axioms horses and rabbits. that share the same logical structure. Be- In this paper, we show how all axiom patterns cause these methods are based on logical in EL++, a DL commonly used in the Semantic structure rather than domain-specific con- Web, can be aggregated without further domain cepts or language-specific syntax, they are knowledge, and describe a prototype system that generic both as regards domain and lan- performs such aggregations. Our method aggre- guage. gates axioms while they are still in logical form, 1 Introduction i.e., as part of sentence planning but before con- verting to a linguistic representation and realising When the Semantic Web becomes established, as English sentences. This approach is somewhat people will want to build their own knowledge different from that proposed by other researchers bases (i.e., ontologies, or TBox axioms, and data, who convert ontology axioms to linguistic struc- 1 or ABox axioms ). Building these requires a high tures before aggregating (Hielkema, 2009; Galanis level of expertise and is time-consuming, even et al., 2009; Dongilli, 2008). We present results with the help of graphical interface tools such as from testing our algorithm on over fifty ontologies Proteg´ e´ (Knublauch et al., 2004). Fortunately, nat- from the Tones repository3. ural language engineers have provided a solution to at least part of the problem: verbalisers, e.g., 2 Analysis of axiom groupings the OWL ACE verbaliser (Kaljurand and Fuchs, In this section we analyse which kinds of axioms 2007). might be grouped together. Power (2010) anal- Ontology verbalisers are NLG systems that generate controlled natural language from Semantic 2For brevity we use logic notation rather than e.g., OWL Functional Syntax: subClassOf(class(ns:cat) 1Description Logic (DL) underlies the Web Ontology class(ns:animal)) where ns is any valid namespace. Language OWL. DL distinguishes statements about classes The operator v denotes the subclass relation, u denotes class (TBox) from those about individuals (ABox). OWL cov- intersection, and 9P:C the class of individuals bearing the ers both kinds of statements, which in OWL terminology are relation P to one or more members of class C. called ‘axioms’. 3http://owl.cs.manchester.ac.uk/ No. Logic OWL % 1. 2. 3. 4. 1 A v B subClassOf(A B) 51 1. L,R,C ×,R?,× ×,×,× L?,×,C 2 A v 9P:B subClassOf(A 2. L,R,× ×,×,× ×,×,C someValuesFrom(P B)) 33 3. L,R,× ×,R?,× 3 [a; b] 2 P propertyAssertion(P a b) 8 4. L,R,× 4 a 2 A classAssertion(A a) 4 Table 2: Aggregating common axioms: 1. A v B, Table 1: The four most common axiom patterns. 2. A v 9P:B, 3. [a; b] 2 P , 4. a 2 A ysed axiom patterns present in the same fifty on- Table 2 summarises our conclusions on whether tologies. In spite of the richness of OWL, the sur- each pair of the four common patterns can be prising result was that only four relatively simple merged or chained. Each cell contains three en- patterns dominated, accounting for 96% of all pat- tries, indicating the possibility of left-hand-side terns found in more than 35,000 axioms. Overall merge (L), right-hand-side merge (R), and chain- there were few unique patterns, typically only 10 ing (C). As can be seen, some merges or chains to 20, and up to 34 in an unusually complex ontol- are possible across different patterns, but the safest ogy. Table 1 lists the common patterns in logic no- aggregations are those grouping axioms with the tation and OWL Functional Syntax, and also gives same pattern (down the diagonal), and it is these the frequencies across the fifty knowledge bases. on which we focus here. Examples of English paraphrases for them are: 1. Every Siamese is a cat. 3 Merging similar patterns 2. Every cat has as body part a tail. 3. Mary owns Kitty. 4. Kitty is a Siamese. Function Merge Patterns When two or more axioms conform to a pattern: f1(A) f1([A1;A2;A3;::: ]) f2(A; B) f2([A1;A2;A3;::: ];B) A v B f2(A; [B1;B2;B3;::: ]) A v C f3(A; B; C) f3([A1;A2;A3;::: ]; B; C) B v C f3(A; [B1;B2;B3;::: ];C) C v D f3(A; B; [C1;C2;C3;::: ]) there are two techniques with which to aggregate Table 3: Generic merging rules. them: merging and chaining. If the right-hand sides are identical we can merge the left-hand If we represent ABox and TBox axioms as sides, and vice versa:4 Prolog terms (or equivalently in OWL Func- [A; B] v C tional Syntax), they take the form of functions A v [B; C] with a number of arguments — for example subClassOf(A,B), where subClassOf is the func- Alternatively, where the right-hand side of an ax- tor, A is the first argument and B is the second argu- iom is identical to the left-hand side of another ax- ment. We can then formulate generic aggregation iom, we can ‘chain’ them: rules for merging one-, two- and three-argument A v B v C v D axioms, as shown in table 3. Merging compresses the information into a more In general, we combine axioms for which the efficient text, as shown in the introduction, while functor is the same and only one argument differs. chaining orders the information to facilitate infer- We do not aggregate axiom functions with more ence — for example, ‘Every A is a B and every than three arguments. The merged constituents B is a C’ makes it easier for readers to draw the must be different expressions with the same log- inference that every A is a C. ical form. 4We regard expressions like A v [B; C] and A v B v C as shorthand forms allowing us to compress several axioms 4 Implemention into one formula. For merges one could also refactor the set of axioms into a new axiom: thus for example A v [B; C] This section describes a Prolog application which could be expressed as A v (B u C), or [A; B] v C as performs a simple verbalisation including aggre- (A t B) v C. This formulation would have the advantage of staying within the normal notation and semantics of DL; gation. It combines a generic grammar for real- however, it is applicable only to merges, not to chains. ising logical forms with a domain-specific lexicon derived from identifiers and labels within the input from the OWL Controlled Natural Language task ontology. force (Schwitter et al., 2008), so obtaining rea- Input to the application is an OWL/XML file.5 sonably natural sentences for common axiom pat- Axioms that conform to EL++ DL are selected terns, even though some less common axioms such and converted into Prolog format. A draft lex- as those describing attributes of properties (e.g., icon is then built automatically from the iden- domain, range, functionality, reflexivity, transitiv- tifier names and labels, on the assumption that ity) are hard to express without falling back on classes are lexicalised by noun groups, properties technical concepts from the logic of relations; for by verb groups with valency two, and individuals these we have (for now) allowed short technical by proper nouns. formulations (e.g., ‘The property “has as part” Our aggregation rules are applied to axioms is transitive’). With these limitations, the gram- with the same logical form. The first step picks mar currently realises any single axiom conform- out all the logical patterns present in the input ing to EL++, or any aggregation of EL++ axioms ontology by abstracting from atomic terms. The through the merge rules described above. Table 4 next step searches for all axioms matching each lists example aggregated axiom patterns and En- of the patterns present. Then within each pattern- glish realisations generated with our grammar. set, the algorithm searches for axioms that differ by only one argument, grouping axioms together 5 Testing the ‘merging’ algorithm in the ways suggested in table 3. It exhaustively lists every possible grouping and builds a new, ag- Unit Original Aggregated Reduction gregated axiom placing the values for the merged Sentences 35,542 11,948 66% argument in a list, e.g., consider the axioms: Words 320,603 264,461 18% subClassOf(class(cat), class(feline)).

Load more