Membership Constraints in Formal Concept Analysis

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015)

Sebastian Rudolph† and Christian Săcărea‡ and Diana Troancă‡
†Technische Universität Dresden, Germany
‡Universitatea Babeş-Bolyai, Romania
[email protected] fcsacarea,[email protected]

Abstract

Formal Concept Analysis (FCA) is a prominent field of applied mathematics using object-attribute relationships to define formal concepts (groups of objects with common attributes), which can be ordered into conceptual hierarchies, so-called concept lattices. We consider the problem of satisfiability of membership constraints, i.e., to determine if a formal concept exists whose object and attribute set include certain elements and exclude others. We analyze the computational complexity of this problem in general and for restricted forms of membership constraints. We perform the same analysis for generalizations of FCA to incidence structures of arity three (objects, attributes and conditions) and higher. We present a generic answer set programming (ASP) encoding of the membership constraint satisfaction problem, which allows for deploying available highly optimized ASP tools for its solution. Finally, we discuss the importance of membership constraints in the context of navigational approaches to data analysis.

1 Introduction

Conceptual Knowledge Processing and Representation is a particular approach to knowledge management, acknowledging the constitutive role of thinking, arguing and communicating human beings in dealing with knowledge and its processing. The term processing also underlines the fact that obtaining or approximating knowledge is a process which should always be conceptual in the above sense. The methods of Conceptual Knowledge Processing have been introduced and discussed by Rudolf Wille in [Wille, 2006], based on the pragmatic philosophy of Charles Sanders Peirce, continued by Karl-Otto Apel and Jürgen Habermas.

Wille defines Conceptual Knowledge Processing as an applied discipline dealing with knowledge which is constituted by conscious reflexion, discursive argumentation and human communication on the basis of cultural background, social conventions and personal experiences. Its main aim is to develop and maintain formal methods and instruments for processing information and knowledge which support rational thought, judgment and action of human beings and therewith promote critical discourse (see also [Wille, 1994; 1997; 2000]).

The mathematical theory underlying Conceptual Knowledge Processing is Formal Concept Analysis, providing a powerful and elegant mathematical tool for understanding and investigating knowledge, based on a set-theoretical semantics, comprising methods for representation, acquisition, and retrieval of knowledge, as well as for further theory building in several other domains of science.

Formal Concept Analysis (FCA) appeared at the end of the 1980s in order to restructure classical lattice theory into a form that is suitable for applications in data analysis. The fundamental data structure FCA uses is a formal context, which exploits the fact that data is quite often represented by incidence structures relating objects and attributes. FCA also provides a mathematization of the traditional, philosophical understanding of a concept as a unit of thought consisting of an extent (the set of objects falling under the concept) and an intent (the set of attributes characterizing the concept). Using mathematical operations, concepts are computed from the object-attribute data table. They can be naturally ordered, resulting in a conceptual hierarchy, called concept lattice. The entire information stored in a formal context is preserved by this operation and the concept lattice is the basis for further data analysis. It can be represented graphically in order to allow navigation among concepts, as well as to support communication. Different algebraic methods can be used in order to study its structure and to compute data dependencies. FCA also provides elegant methods to significantly reduce the effort of mining association rules.

Classical FCA was extended by Wille and Lehmann to the triadic case, featuring a ternary (objects vs. attributes vs. conditions) instead of a binary (objects vs. attributes) incidence relation [Lehmann and Wille, 1995], leading to the notions of tricontext and triconcept. This extension has been successfully used in inherently triadic scenarios such as collaborative tagging [Jäschke et al., 2008].

Nevertheless, if the number of concepts is very large, a holistic graphical representation may become inefficient and unwieldy. Note that the number of concepts may be exponential in the size of the underlying (tri)context.

Hence, a way of narrowing down the set of “interesting” concepts by specifying criteria appears as a crucial feature of conceptual knowledge management applications, in order to focus exactly on the data subset one is interested in exploring or starting exploration from. As a straightforward form of such criteria, we introduce membership constraints which specify that a formal concept's extent or intent must include certain elements and exclude others. The question of satisfiability of such membership constraints, i.e., to determine whether a formal concept satisfying them exists at all, is the starting point of our current research.

In this paper, we analyze the computational complexity of this problem, both for the classical dyadic case and for higher arity generalizations of FCA, first for triadic data sets and then for the n-adic case. Moreover, we also discuss a generic answer set programming (ASP) encoding of membership constraint problems, which allows for deploying available highly optimized ASP tools for its solution. Finally, we turn our attention to the question from which the entire problem setting started, namely we discuss the importance of membership constraints in the context of navigational approaches to data analysis and provide some conclusions of our work.

2 Preliminaries

2.1 Formal Concept Analysis

In the following, we briefly sketch some basic notions about FCA. For more, please refer to [Ganter and Wille, 1999].

Definition 1. A formal context is a triple $\mathbb{K} = (G, M, I)$ with $G$ and $M$ being sets called objects and attributes, respectively, and $I \subseteq G \times M$ the binary incidence relation where $gIm$ means that object $g$ has attribute $m$.

Finite formal contexts can be represented as cross-tables, whose rows represent objects and whose columns represent attributes, while the incidence relation is represented by crosses in that table.

[Cross-table with objects g1 to g6 as rows and attributes m1 to m6 as columns]
Figure 1: Formal context as a cross-table

Definition 2. For a set $A \subseteq G$ of objects we define the derivation operator $A^I := \{m \mid gIm \text{ for all } g \in A\}$ and for a set $B \subseteq M$ of attributes, we analogously define $B^I := \{g \mid gIm \text{ for all } m \in B\}$. A formal concept of a context $\mathbb{K}$ is a pair $(A, B)$ with extent $A \subseteq G$ and intent $B \subseteq M$ satisfying $A^I = B$ and $B^I = A$. We denote the set of formal concepts of the context $\mathbb{K}$ by $\mathfrak{B}(\mathbb{K})$.

An alternative, useful way of characterizing formal concepts is that $A \times B \subseteq I$ and $A$, $B$ are maximal w.r.t. this property, i.e., for every $C \supseteq A$ and $D \supseteq B$ with $C \times D \subseteq I$ it must hold that $C = A$ and $D = B$.

Definition 3. If $(A, B), (C, D) \in \mathfrak{B}(\mathbb{K})$, we say that $(A, B)$ is a subconcept of $(C, D)$ (or equivalently, $(C, D)$ is a superconcept of $(A, B)$) if $A \subseteq C$ (equivalently, $B \supseteq D$).

The set $\mathfrak{B}(\mathbb{K})$ of formal concepts, ordered by the subconcept-superconcept relationship, is a complete lattice and can be graphically represented as an order diagram.

Figure 2: Concept lattice of the context in Figure 1

F. Lehmann and R. Wille extended in [Lehmann and Wille, 1995] the theory of FCA to deal with three-dimensional data. This has been called Triadic FCA (3FCA), where objects are related to attributes and conditions.

Definition 4. A tricontext is a quadruple $\mathbb{K} = (G, M, B, Y)$ with $G$, $M$, and $B$ being sets called objects, attributes, and conditions, respectively, and $Y \subseteq G \times M \times B$ the ternary incidence relation where $(g, m, b) \in Y$ means that object $g$ has attribute $m$ under condition $b$.

Finite tricontexts can be represented as three-dimensional cross-tables, which are typically displayed in “slices”, e.g.:

[Three cross-table slices, one per condition b1, b2, b3, each with objects g1 to g4 as rows and attributes m1 to m4 as columns]

Definition 5. A triconcept of a tricontext $\mathbb{K}$ is a triple $(A_1, A_2, A_3)$ with extent $A_1 \subseteq G$, intent $A_2 \subseteq M$, and modus $A_3 \subseteq B$ satisfying $A_1 \times A_2 \times A_3 \subseteq Y$ and such that for every $C_1 \supseteq A_1$, $C_2 \supseteq A_2$, $C_3 \supseteq A_3$ that satisfy $C_1 \times C_2 \times C_3 \subseteq Y$ it holds that $C_1 = A_1$, $C_2 = A_2$, and $C_3 = A_3$. We denote by $\mathfrak{T}(\mathbb{K})$ the set of all triconcepts of $\mathbb{K}$.

With the rise of folksonomies as data structure of social resource sharing systems, triadic FCA was directly applied in the study of folksonomies [Jäschke et al., 2008]. Efficient algorithms to determine all (or all frequent) triconcepts of a tricontext have been developed. However, a visualization that would be as intuitive as concept lattices for classical FCA has remained elusive for the triadic case. Initial investigations into interactive ways of browsing the space of triconcepts have been made [Rudolph et al., 2015].

2.2 Complexity Theory

We assume the reader to be familiar with complexity theory [Papadimitriou, 1994] and, in particular, the complexity classes $\mathrm{AC}^0$ and NP.

We briefly recap that $\mathrm{AC}^0$ (problems solvable by Boolean circuits of polynomial size and constant depth) coincides with expressibility by first-order formulae [Immerman, 1999].
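
To make the notions of Section 2.1 concrete, the following minimal Python sketch implements the two derivation operators of Definition 2, checks the formal concept condition, and tests a membership constraint by brute force. The toy context and all names (CONTEXT, derive_objects, derive_attributes, all_concepts, satisfiable) are hypothetical and chosen for illustration only; the context is not the one from Figure 1, and the exhaustive search is not the ASP encoding discussed in this paper.

```python
from itertools import combinations

# Hypothetical toy context (NOT the one from Figure 1): object -> its attributes.
CONTEXT = {
    "g1": {"m1", "m2"},
    "g2": {"m2", "m3"},
    "g3": {"m1", "m2", "m3"},
}
OBJECTS = frozenset(CONTEXT)
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def derive_objects(objs):
    """A^I: attributes shared by all objects in objs (all attributes if objs is empty)."""
    result = set(ATTRIBUTES)
    for g in objs:
        result &= CONTEXT[g]
    return frozenset(result)


def derive_attributes(attrs):
    """B^I: objects that have every attribute in attrs."""
    return frozenset(g for g in OBJECTS if set(attrs) <= CONTEXT[g])


def is_formal_concept(extent, intent):
    """Definition 2: (A, B) is a formal concept iff A^I = B and B^I = A."""
    return (derive_objects(extent) == frozenset(intent)
            and derive_attributes(intent) == frozenset(extent))


def all_concepts():
    """Enumerate all concepts by closing every attribute subset.

    Exponential in the number of attributes; intended for toy inputs only.
    """
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for k in range(len(attrs) + 1):
        for subset in combinations(attrs, k):
            extent = derive_attributes(subset)
            concepts.add((extent, derive_objects(extent)))
    return concepts


def satisfiable(req_objs=(), forb_objs=(), req_attrs=(), forb_attrs=()):
    """Naive membership-constraint check over all concepts (assumed interface)."""
    for extent, intent in all_concepts():
        if (set(req_objs) <= extent and not set(forb_objs) & extent
                and set(req_attrs) <= intent and not set(forb_attrs) & intent):
            return True
    return False
```

In this toy context every extent contains g3, so a constraint that excludes g3 from the extent cannot be satisfied, whereas requiring g1 in the extent while excluding m3 from the intent is satisfied by the concept with extent {g1, g3} and intent {m1, m2}. Since the number of concepts may be exponential in the size of the context, such exhaustive enumeration is only a didactic device.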
