An Introduction to Formal Concept Analysis (FCA)

An Introduction to Formal Concept Analysis (FCA) Amedeo Napoli Orpailleur Team Université de Lorraine, CNRS, Inria, LORIA, 54000 Nancy, France [email protected] Tutorial at ICCS 2020 An Introduction to Formal Concept Analysis (FCA) Bolzano Summer of Knowledge, September 2020 Summary of the presentation 1 Introduction 2 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice The structure of the concept lattice 3 Algorithms for discovering the concepts The Chein-Malgrange algorithm The Ganter algorithm 4 Pattern Structures 5 Functional Dependencies 6 Complements and References Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 2 / 118 Knowledge Discovery in Databases (KDD) data Knowledge Discovery in Databases (KDD) consists in selection processing large volumes of data preparation in order to discover “patterns” prepared data that are significant, useful, and reusable. data Three main steps: data mining preparation, data mining, and interpretation. discovered patterns KDD is iterative and interactive, interpretation i.e. it can be replayed and it is evaluation guided by an analyst. interpreted patterns From Data to Knowledge Units data Data have a context and KDD is selection knowledge oriented, depending on preparation domain knowledge, e.g. prepared data constraints, preferences. data At each step, domain knowledge mining can be embedded to guide KDD, e.g. interestingness measures, discovered patterns feature selection. interpretation The KDD loop can be extended evaluation with an additional step related to interpreted patterns the production of actionable knowledge(knowledge construction pattern step). representation knowledge units Knowledge Discovery and Knowledge Representation A parallel can be drawn with the “Knowledge Level” (Newell): the data level, the information level, the knowledge level, and the production of actionable knowledge for being reused by software agents. Knowledge Discovery and Knowledge Engineering are complementary. Introduction Knowledge Discovery and Knowledge Representation Most of the time, KD is depending on domain knowledge: selecting patterns w.r.t. interest measures, similarity, thresholds, preferences. feature selection. Knowledge discovery, as a learning process, can be used for knowledge engineering, for example by returning implications, i.e. =), and concept definitions, i.e. =) and (=. Actually, implications within the concept lattice provide a support for “partial” and “complete definitions” of concepts. A main idea underlying declarative knowledge representation and reasoning can be reused in KDD, i.e. Describe the problem and the solver will take care of the solution. On which formalism could rely such a “solver”? Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 6 / 118 Introduction Exploratory Knowledge Discovery based on FCA Formal Concept Analysis (FCA) is a mathematical formalism based on lattice theory, classification and concept discovery. Moreover, FCA follows a human centered approach and supports exploration operations through the concept lattice. Discovery of concepts, i.e. classes of individuals with a description. Organization of concepts into a poset based on a subsumption relation. The poset supports exploration, e.g. information retrieval, visualization... How FCA can be a “Discovery Engine for Exploratory KDD”. Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 7 / 118 Introduction The Contexts of Planets The initial context of planets: Planet Size Distance to Sun Moon(s) Jupiter large far yes Mars small near yes Mercury small near no Neptune medium far yes Pluto small far yes Saturn large far yes Earth small near yes Uranus medium far yes Venus small near no The context of planets after plain scaling: Planets Size Distance to Sun Moon(s) small medium large near far yes no Jupiter x x x Mars x x x Mercury x x x Neptune x x x Pluto x x x Saturn x x x Earth x x x Uranus x x x Venus x x x Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 8 / 118 The Concept Lattice of Planets Exploration and Visualization (LatViz) Navigation and Information Retrieval Interpretation of concepts and rules Rules: “far −! medium” with confidence 2=5, “small −! near” with confidence 4=5. Implications: “no =) near” and “near =) small” with confidence 1. Introduction FCA: References, Books, Tools, Events, Applications... The Formal Concept Analysis Homepage: https://upriss.github.io/fca/fca.html Books and Tutorials: Ganter and Wille, Belohlavek, Carpineto and Romano, Ganter and Obiedkov. Tools: Conexp and variations, Toscana, Latviz, Galicia, FCA Tools Bundle. Events: International Conference on Formal Concept Analysis (ICFCA), International Conference on Lattices and Applications (CLA), Workshop “What can FCA do for Artificial Intelligence?” (FCA4AI), International Conference on Conceptual Structures (ICCS). Applications: many many. Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 11 / 118 1 Introduction 2 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice The structure of the concept lattice 3 Algorithms for discovering the concepts The Chein-Malgrange algorithm The Ganter algorithm 4 Pattern Structures 5 Functional Dependencies 6 Complements and References A Smooth Introduction to Formal Concept Analysis FCA, Formal Concepts and Concept Lattices Marc Barbut and Bernard Monjardet, Ordre et classification, Hachette, 1970. Bernhard Ganter and Rudolf Wille, Formal Concept Analysis, Springer, 1999. Claudio Carpineto and Giovanni Romano, Concept Data Analysis: Theory and Applications, John Wiley & Sons, 2004. Bernhard Ganter and Sergei Obiedkov, Conceptual Exploration, Springer, 2016. Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 13 / 118 A Smooth Introduction to Formal Concept Analysis The FCA process The basic procedure of Formal Concept Analysis (FCA) is based on a simple representation of data, i.e. a binary table called a formal context. Each formal context is transformed into a mathematical structure called concept lattice. The information contained in the formal context is preserved. The concept lattice is the basis for data analysis. It is represented graphically to support analysis, mining, visualization, interpretation. Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 14 / 118 Pizzas/Toppings tomato garlic oregano mozza funghi delfia x x x regina x x x roma x x x x marta x x napoli x x x x castelli x x x A Smooth Introduction to Formal Concept Analysis The notion of a formal context G/M m1 m2 m3 m4 m5 g1 x x x g2 x x x g3 x x x x g4 x x g5 x x x x g6 x x x (G; M; I) is called a formal context where G (Gegenstände) and M (Merkmale) are sets, and I ⊆ G × M is a binary relation between G and M. G = fdelfia; regina; roma; marta; napoli; castellig M = ftomato; garlic; oregano; mozza; funghig The elements of G are the objects, while the elements of M are the attributes, I is the incidence relation of the context (G; M; I). Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 16 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice Two derivation operators For A ⊆ G and for B ⊆ M: 0 : }(G) −! }(M) 0 : A −! A0 A0 = fm 2 M=(g; m) 2 I for all g 2 Ag 0 : }(M) −! }(G) with 0 : B −! B0 B0 = fg 2 G=(g; m) 2 I for all m 2 Bg Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 17 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice Computing the images of sets of objects and attributes fg2g0 = fm1; m3; m4g: Objects / Attributes m1 m2 m3 m4 m5 g1 x x x g2 x x x g3 x x x x g4 x x g5 x x x x g6 x x x A0 = fm 2 M=(g; m) 2 I for all g 2 Ag B0 = fg 2 G=(g; m) 2 I for all m 2 Bg Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 18 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice Computing the images of sets of objects and attributes fm3g0 = fg1; g2; g3; g5; g6g: Objects / Attributes m1 m2 m3 m4 m5 g1 x x x g2 x x x g3 x x x x g4 x x g5 x x x x g6 x x x A0 = fm 2 M=(g; m) 2 I for all g 2 Ag B0 = fg 2 G=(g; m) 2 I for all m 2 Bg Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 19 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice Computing the images of sets of objects and attributes fg3; g5g0 = fm1; m2; m3; m4g: Objects / Attributes m1 m2 m3 m4 m5 g1 x x x g2 x x x g3 x x x x g4 x x g5 x x x x g6 x x x Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 20 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice Computing the images of sets of objects and attributes fm3; m4g0 = fg2; g3; g5; g6g: Objects / Attributes m1 m2 m3 m4 m5 g1 x x x g2 x x x g3 x x x x g4 x x g5 x x x x g6 x x x Amedeo Napoli Tutorial on FCA ICCS 2020 – Sep. 18-20 2020 21 / 118 A Smooth Introduction to Formal Concept Analysis Derivation operators, formal concepts and concept lattice The derivation operators and the Galois connection 0 : }(G) −! }(M) with A −! A0 0 : }(M) −! }(G) with B −! B0 These two applications induce a Galois connection between }(G) and }(M) when sets are ordered by set inclusion. A Galois connection is defined as follows: Let (P; ≤) and (Q; ≤) be two partially ordered sets.

An Introduction to Formal Concept Analysis (FCA)

Formal Concept Analysis of Rodent Carriers of Zoonotic Disease

Classification Methods Based on Formal Concept Analysis

Formal Contexts, Formal Concept Analysis, and Galois Connections

Conceptual Alignment with Formal Concept Analysis

Contributions to Numerical Formal Concept Analysis, Bayesian Predictive Inference and Sample Size Determination

Formal Concept Analysis

Completing Terminological Axioms with Formal Concept Analysis

Formal Concept Analysis: Themes and Variations for Knowledge Processing

An Application of Formal Concept Analysis to Neural Decoding

Formal Concept Analysis for Concept Collecting and Their Analysis*

Representing Numeric Values in Concept Lattices

Formal Concept Analysis Based Association Rules Extraction