From: AAAI Technical Report WS-99-14. Compilation copyright © 1999, AAAI (www.aaai.org). All rights reserved.

First-Order Context and Formal Analysis

Laurent Chaudron and Nicolas Maille
ONERA-CERT
2 avenue Edouard Belin, BP 4025 - 31055 Toulouse Cedex 04 - FRANCE
Phone: +33 5 62 25 26 55, Fax: +33 5 62 25 25 64
{chaudron,maille}@cert.fr

Key words: Formal Concept Analysis, Context Based Knowledge Exploration, First-Order Lattices, Discovery, Context Based Rules, Logic Programming

Abstract

Formal Concept Analysis (also called "Galois Lattices") is an algebraic model based on propositional calculus that is used for symbolic knowledge exploration from a formal context. The aim of this paper is to design the theoretical models required for the extension of Formal Concept Analysis to first-order logic so as to improve both its expressive power as a knowledge mining tool upon first-order contexts, and the relevance of its results. Our contribution consists in: i) a synthesis of the basic notions of FCA, ii) the design of the Cube Model dedicated to the conjunctions of literals, iii) the design of a complete first-order logic formal concept analysis of first-order contexts. The approach is described from the theoretical point of view; implementations in logic programming and applications are also briefly presented.

Introduction

As far as knowledge information is concerned, the induction of concepts from a context described by data and information is a pivotal topic. Generally, if numerical valuations (belief measures, preferences...) can be defined on the considered data, numerical or mixed methods can be directly used: rough sets (Pawlak 1991; Skowron & Polkowski 1998), Cartesian space model (Ichino & Yaguchi 1998; Ichino & Ono 1998)... In some cases, requirements or accessibility constraints make it necessary to rely only on symbolic attributes. Thus, more fundamental models and techniques are required; Formal Concept Analysis (Ganter & Wille 1996) is a suitable candidate for such a purpose. Unfortunately, FCA theory relies only on propositional calculus and an extension is required so as to characterize the context with more expressive attributes. Indeed, our applications (cooperative system design, activity modeling, symbolic fusion...) require means to represent both symbolic information as predicates and numerical parameters. But in our approach, we do not use numerical features to capture symbolic notions, in order to keep the semantics on each piece of the model. This methodological constraint leads us to define cautiously the elements of the model dedicated to the representation of the basic elements of the context and their exploitations. This is the purpose of our work.

The relations between FCA and first-order logic have been studied, especially in (Zickwolf 1991); our work is more focused on the formal prerequisites and the programming conditions of the definition of a consistent model of first-order Logic FCA (say: 1LFCA). This leads to a search for first-order operators corresponding to the fundamental ones in FCA: set union, set intersection, and the Galois connections (one must notice that nothing else than these three operators is needed to generate the FCA theory). This is the purpose of the Cube model, thanks to which relevant definitions of 1LFCA can be formulated and implemented.

As has been the case for FCA, the study of 1LFCA has produced some results which may be of theoretical interest in themselves, even if one may also expect 1LFCA to offer more powerful exploration tools and in particular more suitable "numerical/symbolic" translations for the contexts of the considered applications. Thus, the motivations of this work are both pragmatic and theoretical.

Foreword: In the next section, the foundations of classical FCA are recalled in a revised new form which is adapted to the extension to first-order logic. In the sequel, a classical term is underlined, while a term defined by the authors is quoted within a box. In the context of the paper the proofs are omitted (they can be found in (Chaudron & Maille 1999)). Propositions, lemmas and theorems are labeled within the same sequence.

Formal Concept Analysis

The basic notions of FCA

Formal Concept Analysis (say: FCA) is a set-theoretical model for concepts that reflects the philosophical understanding of a concept as a context-based unit of thought consisting in two parts: the

extent, which contains all the entities (the objects, the examples...) belonging to the concept, and the intent, which is the collection of all the attributes (the characteristics, the properties...) shared by the entities (Arnaud & Nicole 1662). For example, the pair ({B747, A3XX, DC9}, {wings, engines}) may naturally induce a simple and common concept...

Based upon Galois connections, FCA was first described in (Barbut & Monjardet 1970) and in the 80's R. Wille designed a dedicated theory and program at the University of Darmstadt. An introduction can be found in (Davey & Priestley 1990) and FCA theory and applications are now described in the reference book (Ganter & Wille 1996). FCA is frequently used as a preprocessing tool for classification (Carpineto & Romano 1996), but in our approach, we stay closer to the original purpose of FCA. In FCA, the basic notion that models the knowledge about a specific domain is the formal context (described as a binary relation between two finite sets) from which concepts and conceptual double hierarchies can be formally derived so as to form the mathematical structure of a lattice¹ with respect to a subconcept-superconcept relation. FCA is used for self-emergent classification of objects, detection of hidden implications between objects, construction of concept sequences, object recognition, aggregation of data and information, knowledge representation and analysis.

¹ Given two internal operators $\sqcap$ (infimum) and $\sqcup$ (supremum) on a set $E$, $(E, \sqcap, \sqcup)$ is a lattice iff$_{def}$ $\sqcap$ and $\sqcup$ are idempotent, associative, commutative and they verify the absorption law $x \sqcap (x \sqcup y) = x$ and $x \sqcup (x \sqcap y) = x$. A lattice is always an ordered set: the relation $\leq$ defined on $E$ as $(x \leq y) \Leftrightarrow_{def} (x \sqcap y) = x$ is an order relation for which $\sqcap$ and $\sqcup$ represent the greatest lower bound and the least upper bound (Birkhoff 1940).

Definitions and properties of FCA

Definition 1 A formal context $C$ is defined as a triple $(O, P, \zeta)$ where $O$ (objects or entities) and $P$ (properties or attributes) are finite sets and $\zeta$ is a mapping from $O$ into $\mathcal{P}(P)$. For $o$ in $O$, $\zeta(o)$ indicates the subset of attributes possessed by $o$ ($\zeta$ is frequently summarized in a table).

Example 1: Let $C_1 = (O_1, P_1, \zeta)$, $O_1 = \{Obj_i\}_{i \in [1..3]}$, $P_1 = \{Pro_j\}_{j \in [1..5]}$,
$\zeta(Obj1) = \{Pro1, Pro2, Pro5\}$
$\zeta(Obj2) = \{Pro2, Pro3, Pro4, Pro5\}$
$\zeta(Obj3) = \{Pro1, Pro3, Pro5\}$

In the sequel we assume that $(O, P, \zeta)$ denotes the working context.

Definition 2 The dual operators (denoted as $\_'$) between $O$ and $P$ are defined as follows: $(\forall A \subseteq O)(\forall B \subseteq P)$,
$A' =_{def} \bigcap_{o \in A} \zeta(o)$,
$B' =_{def} \{o \in O \mid B \subseteq \zeta(o)\}$.

$A'$ is the set of attributes common to all the objects in $A$, and $B'$ is the set of the objects possessing all the attributes in $B$.

Example 2: in $C_1$, if $A = \{Obj1\}$, then $A' = \{Pro1, Pro2, Pro5\}$, and $(A'' = A)$. If $B = \{Pro1, Pro2\}$, then $B' = \{Obj1\}$, and $(B'' \neq B)$. One can also notice the case of the empty set: as a subset of $O$: $\emptyset' = P$, and conversely, as a subset of $P$: $\emptyset' = O$.

Remarks:
• Depending on the properties of sets $O$ and $P$, the dual operators are called polarities in (Birkhoff 1940), whereas when defined by their basic properties (see Proposition 1) they are traditionally known as Galois connections.
• Notation $'$ is the same for subsets of $O$ and $P$ as both play a symmetric role in the theory. This symmetry disappears when $P$ deals with predicates instead of propositions.
• Usually the context is supposed to verify $O \cap P = \emptyset$ but all cases can be considered.
• The attributes are atomic positive formulas. In FCA the negation operator is not considered explicitly.
• In this section, the definitions, the propositions and the proofs are based on the usual set operations. More precisely the definitions and arguments rely only on the set of properties induced by the fact that $(\mathcal{P}(O), \subseteq, \cup, \cap)$ and $(\mathcal{P}(P), \subseteq, \cup, \cap)$ are lattices. This will allow the extension of FCA to first-order logic to be "smoothly" achieved.

Proposition 1 The dual operators verify the following properties: for all $A_i$ subset of $O$ (the symmetric properties stand for any subset of $P$):
(i) $A_1 \subseteq A_2 \Rightarrow A_2' \subseteq A_1'$,
(ii) $A \subseteq A''$,
(iii) $A' = A'''$,
(iv) $(A_1 \cap A_2)' \supseteq A_1' \cup A_2'$,
(v) $(A_1 \cup A_2)' = A_1' \cap A_2'$.

Definition 3 Given $A \subseteq O$ and $B \subseteq P$, the pair $(A, B)$ is a concept iff$_{def}$ $A' = B$ and $B' = A$. $A$ is the extent of the concept and $B$ is the intent. The set of all concepts defined on the context $(O, P, \zeta)$ is denoted as $L$.

Example 3: in $C_1$, $(\{Obj2, Obj1\}, \{Pro5, Pro2\})$ and $(\{Obj3, Obj1\}, \{Pro5, Pro1\})$ are concepts.

Two questions arise: what is the structure of $L$ and how to determine it? This is the role of the next theoretical results.

Definitions 4 The supremum $\vee$ and infimum $\wedge$ operators are defined on $L$ as follows: for all concepts $(A_1, B_1)$ and $(A_2, B_2)$ in $L$:
$(A_1, B_1) \vee (A_2, B_2) =_{def} ((A_1 \cup A_2)'', B_1 \cap B_2)$
$(A_1, B_1) \wedge (A_2, B_2) =_{def} (A_1 \cap A_2, (B_1 \cup B_2)'')$
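The dual operators of Definition 2, the concept test of Definition 3 and the supremum and infimum of Definitions 4 can be sketched on the context C1 of Example 1. This is only a minimal illustration in Python (the paper's own implementation is in Prolog); the names `ZETA`, `up`, `down`, `is_concept`, `supremum` and `infimum` are hypothetical and do not come from the paper.

```python
# Context C1 of Example 1: zeta maps each object to its set of attributes.
ZETA = {
    "Obj1": frozenset({"Pro1", "Pro2", "Pro5"}),
    "Obj2": frozenset({"Pro2", "Pro3", "Pro4", "Pro5"}),
    "Obj3": frozenset({"Pro1", "Pro3", "Pro5"}),
}
OBJECTS = frozenset(ZETA)
PROPERTIES = frozenset().union(*ZETA.values())


def up(objs):
    """A' : attributes common to all objects of A (the whole of P if A is empty)."""
    result = set(PROPERTIES)
    for o in objs:
        result &= ZETA[o]
    return frozenset(result)


def down(props):
    """B' : objects whose attribute set contains every attribute of B."""
    props = frozenset(props)
    return frozenset(o for o in OBJECTS if props <= ZETA[o])


def is_concept(objs, props):
    """Definition 3: (A, B) is a concept iff A' = B and B' = A."""
    return up(objs) == frozenset(props) and down(props) == frozenset(objs)


def supremum(c1, c2):
    """Definitions 4: (A1, B1) v (A2, B2) = ((A1 u A2)'', B1 n B2)."""
    (a1, b1), (a2, b2) = c1, c2
    return down(up(a1 | a2)), b1 & b2


def infimum(c1, c2):
    """Definitions 4: (A1, B1) ^ (A2, B2) = (A1 n A2, (B1 u B2)'')."""
    (a1, b1), (a2, b2) = c1, c2
    return a1 & a2, up(down(b1 | b2))


if __name__ == "__main__":
    # Example 2: A = {Obj1} and B = {Pro1, Pro2}.
    print(sorted(up({"Obj1"})))                 # ['Pro1', 'Pro2', 'Pro5']
    print(sorted(down({"Pro1", "Pro2"})))       # ['Obj1']
    # Example 3: ({Obj1, Obj2}, {Pro2, Pro5}) is a concept.
    print(is_concept({"Obj1", "Obj2"}, {"Pro2", "Pro5"}))  # True
```

Concepts are represented here as pairs of `frozenset`s so that they can later be stored in Python sets; this is a representation choice of the sketch, not something prescribed by the definitions.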

Theorem 2 $(L, \leq, \vee, \wedge)$ is a complete lattice.

The formal definitions match the intuitive notion of a concept as a pair made of a collection of examples and their characteristics: the more objects there are, the fewer characteristics they share. The relation between comparable concepts works like communicating vessels:

Figure 1: The order relation between concepts

Example 4: the concept lattice $L_1$ generated by context $C_1$ is constituted of 8 concepts:
8: ({}, {Pro1, Pro2, Pro5, Pro3, Pro4})
7: ({Obj3}, {Pro1, Pro3, Pro5})
6: ({Obj2}, {Pro2, Pro3, Pro4, Pro5})
5: ({Obj1}, {Pro1, Pro2, Pro5})
4: ({Obj3, Obj2}, {Pro5, Pro3})
3: ({Obj2, Obj1}, {Pro5, Pro2})
2: ({Obj3, Obj1}, {Pro5, Pro1})
1: ({Obj1, Obj2, Obj3}, {Pro5})
$L_1$ can be represented by its Hasse diagram (figure 2).

Figure 2: The Concept Lattice L1

In such a diagram, an element labeled by an object Obji represents the concept with the smallest set containing Obji, and an element labeled by a Proj represents the concept with the smallest set containing it. Hence, a given concept c inherits all the properties which are linked above it in the diagram and c is constituted of all the objects which are linked below it.

Application: System Analysis

This real application is related to the design of distributed information systems. A first analysis was made with classical FCA in (Chaudron & Barès 1997); in the present section, we recall the description of the problem and we analyze the extensions of the context to first-order logic. This example may appear very simple at first glance: a few systems characterized by a few features. But it clearly appears that an analysis by hand would be practically impossible. FCA has helped for such a process (the specific details of the application are not detailed here).

Given a set of information systems characterized by functionalities (e.g. medical database access, language translation, transmission capability), they are supposed to operate together in order to achieve peace maintenance or rescue missions. The problem is how to design the subgroups of systems sharing the same capabilities so as to better manage the whole set of systems. FCA appears to be an appropriate tool for such a system analysis.

The following context $C_2$ (with fictitious identifiers) describes the features of five real categories of systems: $O_2$ = {VEH, TRANS, XFLR6, SYSF99, SYS33}. VEH is an on-board information system of a vehicle, TRANS is a transmission system, XFLR6 is a mobile surgery cell, SYS33 is a local information system and SYSF99 is a global rescue management system. The context $C_2$ is defined as follows²:

Systems (columns): VEH, TRA, XFL, SY9, SY3
Attributes (rows) and their marks:
omouv(meca) × × ×
opo(ue) ×
opo(odb) × × ×
omouv(ue) × ×
cr(tir) × ×
dem(tir) × × ×
opo(gu) × × ×
opo(SYS33) × ×
opo(rens) × ×
opo(odb) × × ×
cr(sit) × × × × ×
cr(sit2)
prev(tirnucami) ×

² C2 summarizes a four-page specification file.

The properties of the systems define a set of 12 literals. As they contain no variable they can be considered as propositional calculus literals. For instance: omouv(meca) (resp. omouv(ue)) means that the system can manage (access, store, transmit) instructions related to the moving of a specified vehicle (resp. a small group of vehicles, such as a surgery cell); an attribute of the kind opo(x), with x ∈ {ue, gu, odb, rens}, denotes the capability of managing the structural organization of the different teams; cr(sit) means that the system can access the state of the current situation: staff, wounded people, material... The Concept Lattice $L_2$ derived from $C_2$ is constituted of 11 concepts defining interoperable groups (figure 3) in which knowledge explorations are processed thanks to programming capabilities.

Implementation and Exploitation

A Prolog program was designed so as to implement the Galois connections and the function which associates to each context $C$ its Concept Lattice $L$: $C \rightarrow L$. The determination algorithm relies on Theorem 3: the concept lattice $L$ is determined by the finite sequence $T_n$ of sets of concepts:

$T_0 = \{(\emptyset, P)\}$, $T_1 = \{(o'', o') \mid o \in O\}$,
$T_{n+1} = \{c_i \vee c_j \mid c_i, c_j \in T_n\}$,
$T_{n_0} = \{(O, \emptyset)\}$, $L = \bigcup_{n \in [0, n_0]} T_n$

Figure 3: The System Analysis Concept Lattice L2

The context and the concept lattice can be considered as a global Prolog knowledge base $C \cup L$ on which knowledge exploration experiments are performed³.

³ It must be noticed that the computing time is not a critical problem in our projects as the situations we are analyzing are characterized by a high complexity and a very low evolution time. The lattice must be completely developed and is completely developed.

The knowledge base $C \cup L$ is used so as to look for contextual dependencies between either objects or attributes. The induction of the context-based rules is given by the generic frame of (Guigues & Duquenne 1986) widely developed in ConImp by Burmeister (Burmeister 1987). The techniques are based on the fundamental Lemma:

Lemma 3 $(\forall A \subseteq P)$, $\models (A \rightarrow (A'' - A))$

Based on this Lemma, a context rule generator was implemented so as to compute the set of rules deducible from the context. The logical links between the features are captured by first-order rules which are translated into production rules (these algorithms are not detailed here). The core of these symbolic inductions from the context is based on the Galois connections, which allow various knowledge explorations to be processed, such as: "from the present context, what are the properties that are necessarily deduced when systems have to operate by sharing capabilities on rescue operations planning (opo(x)) and medical staff movements (omouv(y))?". The solution is given by the Galois closure of {opo(x), omouv(y)} in $C_2$ (fortunately, the answer is in a deductive form); the following interdependency is proved:
{opo(odb), omouv(ue)} $\rightarrow$ {dem(tir), cr(sit2), opo(ue)}
The improvement of this FCA-based knowledge mining tool is currently under study.

In this example, it is easy to notice that, from the propositional calculus point of view, opo(odb) and opo(ue) are different literals whereas it appears that they should share a property like opo(x) (with x variable) if the context could be considered within a first-order frame. This is the purpose of the next section.

The cube model

The cube model is dedicated to formalizing conjunctions of properties; in this study, it will allow first-order contexts to be defined.
Let us denote $C_0$ the set of all the finite subsets of positive literals of the propositional calculus. For each FCA context $(O, P, \zeta)$, we have $\zeta(O) \subseteq C_0$ and the logical properties deduced by FCA rely implicitly on the fact that $(C_0, \subseteq, \cap, \cup)$ is a lattice in which the partial order $\subseteq$ is consistent with the logical implication $\leftarrow$. The aim of the cube model is to build, in a first-order logic frame, the same kind of algebraic structure. The cube model is based on a classical first-order language (Const, Var, Funct, Pred) whose set of terms Term is the functional closure of Const $\cup$ Var by Funct. Such an approach has been used for knowledge representation in intelligence systems and cooperative systems (Chaudron et al. 1997). The elementary properties are represented by literals, and finite sets of literals are called logical cubes⁴; they are interpreted as the conjunction of the literals. Cubes play a dual role besides the classical clauses, and by default their variables are existentially quantified.

⁴ "Cube" is the name that was used for the first time by A. Thayse (Thayse & col. 1989). The denotator "product" was previously used, but not defined, in 1975 by Vere (Vere 1975).

In expert or deductive systems, knowledge squares with general rules that are captured by clauses: $c = \{trans(x), \neg resc(x)\}$ represents the information "rescue systems have transmission capabilities"; the associated logical formula is $\forall x\, (trans(x) \vee \neg resc(x))$. When we plan to describe as a context the state of an observed situation, cubes are more adequate and $c = \{trans(x), \neg resc(x)\}$ means: "there is an object with transmission capabilities that is not a rescue system". The logical interpretation is $\exists x\, (trans(x) \wedge \neg resc(x))$.

As far as knowledge representation of a context is concerned, we would like the order relation induced on $C$ to capture the intuitive notion of "information enrichment". But such an "enrichment" can be obtained via different means: quantity of information, precision of the terms, logical dependency. Example: {cr(sit), opo(gu)} is more informative than {opo(gu)} because the number of literals is higher; and {cr(sit)} is more informative than {cr(x)} (with x variable) because it is more precise. Unfortunately, the combination of both intuitive criteria may lead to a contradiction: {cr(x), opo(gu)} cannot be consistently compared to {cr(sit)}.

The previous cases highlight the need for sound definitions of the intuitive concepts of union and intersection of two finite information sets in accordance with the following requirements: the infimum has to capture the common features (while giving more information than the empty set frequently generated by the unification rule); the supremum has to cope with the contradictory criteria: quantity/precision of the information (while giving a more synthetic result than the set union). Such purposes are achieved thanks to an algebraic approach.

[...] commutative.

Example 6: $p(x, g(y, b))$ is the anti-unified literal of $p(a, g(a, b))$ and $p(1, g(b, b))$.

In fact, anti-unification allows the infimum to generalize the terms so as to properly enrich the set intersection on the cubes. The result of the anti-unification of two cubes $c_1$ and $c_2$ is defined as the union of the anti-unification of every couple $(l_1, l_2)$ based on the same predicate name and such that $l_1$ belongs to $c_1$ and $l_2$ belongs to $c_2$.
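The anti-unification of two literals (Example 6) and of two cubes can be sketched as below. This Python illustration rests on several assumptions that are not in the paper: constants are strings, compound terms and literals are tuples `(name, arg1, ...)`, generated variables are `Var` objects named V1, V2, ..., and one substitution table is shared by all the literal pairs of a cube so that the same pair of subterms is always replaced by the same variable (in the style of Plotkin's generalization). Any reduction step that the paper's infimum may apply afterwards is not modeled.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    """A variable introduced by anti-unification (hypothetical representation)."""
    name: str


def anti_unify(t1, t2, table, fresh):
    """Least general term covering t1 and t2; `table` maps term pairs to variables."""
    if t1 == t2:
        return t1
    same_shape = (
        isinstance(t1, tuple) and isinstance(t2, tuple)
        and t1[0] == t2[0] and len(t1) == len(t2)
    )
    if same_shape:
        args = tuple(anti_unify(a, b, table, fresh) for a, b in zip(t1[1:], t2[1:]))
        return (t1[0],) + args
    # Different subterms: reuse the variable already chosen for this pair, if any.
    if (t1, t2) not in table:
        table[(t1, t2)] = Var(f"V{next(fresh)}")
    return table[(t1, t2)]


def anti_unify_literals(l1, l2):
    """Anti-unified literal, or None if predicate names or arities differ."""
    if l1[0] != l2[0] or len(l1) != len(l2):
        return None
    return anti_unify(l1, l2, {}, count(1))


def anti_unify_cubes(c1, c2):
    """Union of the anti-unifications of every couple with the same predicate."""
    table, fresh = {}, count(1)
    result = set()
    for l1 in c1:
        for l2 in c2:
            if l1[0] == l2[0] and len(l1) == len(l2):
                result.add(anti_unify(l1, l2, table, fresh))
    return frozenset(result)


if __name__ == "__main__":
    # Example 6: p(a, g(a, b)) and p(1, g(b, b)) generalize to p(V1, g(V2, b)).
    print(anti_unify_literals(("p", "a", ("g", "a", "b")), ("p", "1", ("g", "b", "b"))))
```

For instance, `anti_unify_cubes({("opo", "ue"), ("cr", "sit")}, {("opo", "odb"), ("cr", "sit")})` keeps `cr(sit)` and generalizes the two `opo` literals into `opo(V1)`, which is the kind of shared property discussed for opo(odb) and opo(ue).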

Definition 5 $(\forall c_i \in C)$ $c_1$ subsumes $c_2$, and we write $c_1$ [...]

Definition 8 Let $c_1$ and $c_2$ belong to $C_r$. The infimum [...]

Definitions 9 The dual operators $\_^{\square}$ and $\_^{\circ}$ between $O$ and $\zeta(O)$ are defined by:
$A^{\square} =_{def} \underset{o_i \in A}{{\bigcap}_c}\, \zeta(o_i)$
$B^{\circ} =_{def} \{o_i \in O \mid B \leq_c \zeta(o_i)\}$.

The dual operator from properties to objects, denoted as $^{\circ}$, and the changing of $\subseteq$ into $\leq_c$ are the only differences between the classical and the first-order definition of a context. Thanks to Corollary 7, it is easy to see that if $B \in C_0$ then $B^{\circ} = B'$ in the sense of classical FCA. Furthermore, the lattice $(\mathcal{P}(P), \subseteq, \cup, \cap)$ can be replaced by ($C_r$, [...]

Proposition 8 The dual operators verify conditions (i) to (v) of Proposition 1.

Definitions 10 A first-order concept is a pair $(A, B)$, $A \subseteq O$, $B \in C_r$, such that:
$A^{\square} = B$ (up to variable renaming)
$B^{\circ} = A$.
The set of all first-order concepts defined by the context $(O, \zeta)$ is denoted as $\mathcal{L}_1$.

Proposition 8 For first-order concepts $(A_1, B_1)$ and $(A_2, B_2)$ the relation defined by: $(A_1, B_1) \sqsubseteq (A_2, B_2) \Leftrightarrow A_1 \subseteq A_2\ (\Leftrightarrow B_2 \leq_c B_1)$ is an order relation on $\mathcal{L}_1$.

Definitions 11 The supremum $\sqcup$ and infimum $\sqcap$ are respectively defined on $\mathcal{L}_1$ as follows:
$(A_1, B_1) \sqcup (A_2, B_2) =_{def} ((A_1 \cup A_2)^{\square\circ}, B_1 \cap_c B_2)$
$(A_1, B_1) \sqcap (A_2, B_2) =_{def} (A_1 \cap A_2, (B_1 \cup_c B_2)^{\circ\square})$

Theorem 9 $(\mathcal{L}_1, \sqsubseteq, \sqcup, \sqcap)$ is a lattice.

Back to the application of system analysis, with the same context $C_2$, one can determine two new 1LFCA concepts (see figure 4). They represent the generalized attributes opo(Var1), cr(Var2) and omouv(Var3) (with Vari variables) which were searched for.

Figure 4: The First-Order Concept Lattice $\mathcal{L}_1$

Corollary 10 1LFCA is an extension of FCA.

Moreover it is easy to notice that if a first-order context $(O, \zeta)$ contains no variable, it can be considered a classical context and its classical concept lattice, say $L^0$, can be computed. Due to the presence of functional terms and thanks to the anti-unification, the computation of its first-order concept lattice $\mathcal{L}_1$ may induce the presence of variables, hence $L^0 \subseteq \mathcal{L}_1$, as is verified in the system analysis.

1LFCA Implementation

The FCA algorithms were adapted to the 1LFCA definitions with respect to the Cube lattice program; this gave the first first-order logic concept lattice generator (it allowed the system analysis of figure 4 to be computed). This extension is a recent result and it has to be experimented with more intensively. In particular, the powerful cube tool will allow inner logical constraints to be taken into account thanks to variables: in the cube {foo(Var1, Var2), bar(Var2, trick)}, foo and bar are linked through variable Var2. Furthermore, a comparison between the 1LFCA capabilities and those of proximate models such as logical scaling (Prediger 1997) has been launched.
Indeed, the favorite domain of FCA and 1LFCA is symbolic knowledge, but thanks to the CLP experiments, many improvements are expected for the analysis of large numerical databases.

Conclusion

The integration of the Cube model (a lattice structure on conjunctions of first-order literals) in FCA constitutes the core of the design of a complete first-order FCA based on the definition of first-order formal contexts.
The next step will be the extension of the Cube model to a Constraint-Cube model (a lattice of constrained cubes) so as to offer powerful numerical and symbolic data analysis and knowledge discovery on constrained first-order contexts.
The current study focuses on first-order context-based rule generation, which is related to the classical approaches of the ILP⁵ community.
From the theoretical point of view, the relations between the changes of the context and the correlated evolutions of the concept lattice are a main objective.
The new applications are the pilot activity modeling (in which context-dependent incidents are searched for), and the correlations between the context and the interactions between a person and a machine.

⁵ Inductive Logic Programming

References

Arnaud, A., and Nicole, P. 1662. La logique ou l'art de penser. Paris: Gallimard, 1992. In French.
Barbut, M., and Monjardet, B. 1970. Ordre et classification : Algèbre et combinatoire. Collection Hachette Université. Paris: Hachette. In French.
Birkhoff, G. 1940. Lattice Theory. ACM.
Burmeister, P. 1987. Programm zur Formalen Begriffsanalyse einwertiger Kontexte. TH Darmstadt. English Version.
Carpineto, C., and Romano, G. 1996. A lattice conceptual clustering system and its application to browsing retrieval. 24:95-122.
Chaudron, L., and Barès, M. 1997. Interoperability of systems: from distributed information to cooperation. In 15th IJCAI'97 Workshop: "IA in Distributed Information Networking".
Chaudron, L., and Maille, N. 1999. Le modèle des cubes : représentation algébrique des conjonctions de propriétés. Technical Report 1/7601.43 DCSD-T, Onera-Cert. In French. See: ftp.cert.fr: /pub/chaudron/rapport cube s99. ps. gz.
Chaudron, L.; Cossart, C.; Maille, N.; and Tessier, C. 1997. A purely symbolic model for dynamic scene interpretation. International Journal on Artificial Intelligence Tools 6(4):635-664.
Davey, B., and Priestley, H. 1990. Introduction to Lattices and Order. Cambridge University Press.
Ganter, B., and Wille, R. 1996. Formale Begriffsanalyse: Mathematische Grundlagen. Berlin-Heidelberg: Springer-Verlag. In German.
Guigues, J.-L., and Duquenne, V. 1986. Familles minimales d'implications informatives résultant d'un tableau de données binaires. Math. Sci. Hum. 95:5-18. In French.
Huet, G. 1976. Résolution d'équations dans des langages d'ordre 1,2,...,ω. Ph.D. Dissertation, Université de Paris VII. Thèse d'État (in French).
Ichino, M., and Ono, Y. 1998. A new feature selection method to extract functional structures from multidimensional symbolic data. IEICE Transactions on Information and Systems E81-D(6):556-564.
Ichino, M., and Yaguchi, H. 1998. Data Science, Classification and Related Methods. Springer-Verlag. Chapter Symbolic Pattern Classifiers Based on the Cartesian System Model, 358-369.
Lassez, J.-L.; Maher, M.; and Marriott, K. 1987. Foundations of Deductive Databases and Logic Programming. J. Minker. Chapter Unification Revisited.
Pawlak, Z. 1991. Rough Sets. Dordrecht: Kluwer Academic.
Plotkin, G. D. 1970. Machine Intelligence, volume 5. Edinburgh: Edinburgh University Press. Chapter 8. A note on inductive generalization, 153-163. B. Meltzer and D. Michie ed.
Prediger, S. 1997. Logical scaling in formal concept analysis. In Lukose, D.; Delugach, H.; Keeler, M.; Searle, L.; and Sowa, J. F., eds., Conceptual Structures: Fulfilling Peirce's Dream, number 1257 in Lecture Notes in Artificial Intelligence. Berlin-Heidelberg-New York: Springer-Verlag.
Skowron, A., and Polkowski, L. 1998. Rough Sets in Knowledge Discovery. Heidelberg: Physica-Verlag.
Thayse, A., and col. 1989. Approche logique de l'Intelligence artificielle. Paris: Dunod. In French.
Vere, S. 1975. Induction of concepts in the predicate calculus. In IJCAI'75 Proceedings of the 4th Intl. Conf. on Artificial Intelligence, 281-287.
Zickwolf, M. 1991. Rule exploration: First Order Logic in Formal Concept Analysis. Ph.D. Dissertation, THD Darmstadt University. Short English version, nr1580, June 1993.
