The theory and practice of coupling formal concept analysis to relational The theory and practice of coupling formal databases concept analysis to relational databases

Background Theory

Database † Scaling Jens K¨otters Peter W. Eklund Conjunctive Queries †School of Information Technology, Deakin University, Geelong, Australia SQL Interpretation

Navigation FCA4AI 2018

Summary Stockholm, 13 July 2018

References The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Background Theory Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References FCA + Database Theory

The theory and practice of coupling formal concept analysis to relational Previous work [Koe13] relates FCA with database theory. A databases table of analogies:

Background Theory Standard FCA FCA + Database Theory Database Formal context Relational Structure [Koe13], Scaling Power context family [Koe16] Conjunctive Queries Set of Objects Table SQL Interpretation Set of Attributes Conjunctive query

Navigation Concept lattice Conjunctive-query lattice

Summary

References Lattices of n-ary concepts

The theory and practice of coupling formal concept analysis to relational databases The conjunctive-query lattice can be decomposed into sublattices L[{x1,..., xn}] of n-ary concepts described by Background variables x1,..., xn. All sublattices of n-ary concepts are Theory isomorphic (irrespective of variable names), so we can speak of Database Scaling the lattice of n-ary concepts. The extents are n-ary relations. Conjunctive Queries The lattice C[{x1,..., xn}] contains the concepts of SQL Interpretation L[{x1,..., xn}] where intents correspond to connected graphs.

Navigation

Summary

References The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Database Scaling Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Example: Literature Database

The theory and practice Book of coupling formal title author publication date concept Alice in Wonderland 1 1865-11-26 analysis to To the Lighthouse 2 1927-05-05 relational databases The Hitchhiker’s Guide to the Galaxy 3 1979-10-12 4 2015-02-03 Harry Potter and the Deathly Hallows 5 2007-07-21 The Casual Vacancy 5 2012-09-27 Background Theory The Shining 6 1977-01-28 6 2013-09-24 Database The Da Vinci Code 7 2003-03-18 Scaling Inferno 7 2013-03-14 Conjunctive Queries Author SQL id first name last name nationality date of birth Interpretation 1 British 1832-01-27 Navigation 2 Virginia Woolf British 1882-01-25 3 British 1952-03-11 Summary 4 Neil Gaiman British 1960-11-10 References 5 J. K. Rowling British 1965-07-31 6 American 1947-09-21 7 Dan Brown American 1964-06-22 Conceptual scaling of a many-valued context (1)

The theory and practice of coupling Author table Centuries scale formal concept id first name last name nationality date of birth Centuries analysis to 1 Lewis Carroll British 1832-01-27 19C 20C 21C 1832-01-27 × relational 2 Virginia Woolf British 1882-01-25 databases 1865-11-26 × 3 Douglas Adams British 1952-03-11 1882-01-25 × 4 Neil Gaiman British 1960-11-10 1927-05-05 × 5 J. K. Rowling British 1965-07-31 1947-09-21 × Background 6 Stephen King American 1947-09-21 Theory 1952-03-11 × 7 Dan Brown American 1964-06-22 1960-11-10 × Database Scaling 1964-06-22 × DOB context 1965-07-31 × Conjunctive 1977-01-28 × Queries DOB 1979-10-12 × SQL 19C 20C 21C 2003-03-18 × Interpretation Lewis Carroll × 2007-07-21 × Virginia Woolf × 2012-09-27 × Navigation Douglas Adams × 2013-03-14 × Summary Neil Gaiman × 2013-09-24 × J. K. Rowling × 2015-02-03 × References Stephen King × Dan Brown × Conceptual scaling of a many-valued context (2)

The theory and practice Author table Nationalities scale of coupling formal id first name last name nationality date of birth concept 1 Lewis Carroll British 1832-01-27 analysis to 2 Virginia Woolf British 1882-01-25 Nationalities relational 3 Douglas Adams British 1952-03-11 British American French Russian databases 4 Neil Gaiman British 1960-11-10 British × 5 J. K. Rowling British 1965-07-31 American × 6 Stephen King American 1947-09-21 French × Background 7 Dan Brown American 1964-06-22 Theory Russian ×

Database Scaling Nationality context

Conjunctive We say that the Na- Queries nat tionalities scale is bound SQL Interpretation British American French Russian to the nationality col- Lewis Carroll × Navigation Virginia Woolf × umn (and the Centuries Summary Douglas Adams × Neil Gaiman × scale was bound to the References J. K. Rowling × date of birth column). Stephen King × Dan Brown × Conceptual scaling of a many-valued context (3)

The theory and practice Author table of coupling id first name last name nationality date of birth formal concept 1 Lewis Carroll British 1832-01-27 analysis to 2 Virginia Woolf British 1882-01-25 relational 3 Douglas Adams British 1952-03-11 databases 4 Neil Gaiman British 1960-11-10 5 J. K. Rowling British 1965-07-31 6 Stephen King American 1947-09-21 Background 7 Dan Brown American 1964-06-22 Theory We consider the Database Derived context subcontexts ob- Scaling tained from the Conjunctive scales as facets Queries Authors SQL

Interpretation DOB:19C DOB:20C DOB:21C nat:British nat:American nat:French nat:Russian Lewis Carroll × × Navigation Virginia Woolf × × Summary Douglas Adams × × Neil Gaiman × × References J. K. Rowling × × Stephen King × × Dan Brown × × Higher-arity scales: Foreign keys

The theory Book and practice of coupling title author publication date formal Author table Alice in Wonderland 1 1865-11-26 concept To the Lighthouse 2 1927-05-05 id first name last name nationality date of birth analysis to The Hitchhiker’s Guide to the Galaxy 3 1979-10-12 1 Lewis Carroll British 1832-01-27 relational Trigger Warning 4 2015-02-03 2 Virginia Woolf British 1882-01-25 databases Harry Potter and the Deathly Hallows 5 2007-07-21 3 Douglas Adams British 1952-03-11 The Casual Vacancy 5 2012-09-27 4 Neil Gaiman British 1960-11-10 The Shining 6 1977-01-28 5 J. K. Rowling British 1965-07-31 Doctor Sleep 6 2013-09-24 Background 6 Stephen King American 1947-09-21 The Da Vinci Code 7 2003-03-18 Theory 7 Dan Brown American 1964-06-22 Inferno 7 2013-03-14

Database Scaling Binary context ”wrote” Conjunctive Queries Equality scale wrote The first parame- = SQL Equality (1,1) × id=author ter of the Equality Interpretation (Lewis Carroll,Alice in Wonderland) × (1,2) (Virginia Woolf,To the Lighthouse) × Navigation (1,3) scale is bound to (Douglas Adams,Hitchhiker’s Guide) × (2,1) (Neil Gaiman,Trigger Warning) × Summary (2,2) × Author.id, the second (J.K. Rowling,Harry Potter 7) × (2,3) (J.K. Rowling,The Casual Vacancy) × References (3,1) parameter is bound (Stephen King,The Shining) × (3,2) (Stephen King,Doctor Sleep) × to Book.author. (3,3) × (Dan Brown,The Da Vinci Code) × ... (Dan Brown,Inferno) × Higher-arity scales: Measuring distance

The theory and practice Distance scales can be used to measure spatial distance of coupling formal between objects or the time span between events. concept analysis to To measure at what age an author wrote a particular book, we relational databases instead use the foreign key condition as a domain expression (which defines the object set of a derived context) and use a Background distance scale (not the one below !!) on top of this. Theory Database Binary context ”wrote” Scaling Distance scale 50 40 Conjunctive =0 ≤1 ≤2 30 ≤ ≤ Queries Distance wrote ≤

(1,1) × × × wrote age age age SQL (1,2) × × (Lewis Carroll, Alice in Wonderland) × × × Interpretation (1,3) × (Virginia Woolf, To the Lighthouse) × × (Douglas Adams, Hitchhiker’s Guide) × × × × Navigation (2,1) × × (Neil Gaiman, Trigger Warning) × (2,2) × × × Summary (J. K. Rowling, Harry Potter 7) × × (2,3) × × (J. K. Rowling, The Casual Vacancy) × × References (3,1) × (Stephen King, The Shining) × × × × (3,2) × × (Stephen King, Doctor Sleep) × (3,3) × × × (Dan Brown, The Da Vinci Code) × × × (Dan Brown, Inferno) × × Power Context Family

The theory and practice of coupling formal The contexts for each facet can be assembled in a power concept analysis to context family. relational databases

0 1 30 50 40

Background sort: Author sort: Book ≤ ≤ ≤ Theory Lewis Carroll × nationality: GB nationality: USA DOB: 19C DOB: 20C DOB: 21C pubdate: 19C pubdate: 20C pubdate: 21C 2 Virginia Woolf × Lewis Carroll × × Database Douglas Adams × Virginia Woolf × ×

Douglas Adams × × wrote: wrote wrote: age wrote: age wrote: age Scaling Neil Gaiman × J. K. Rowling × Neil Gaiman × × (Lewis Carroll, Alice in Wonderland) × × × Stephen King × J. K. Rowling × × (Virginia Woolf, To the Lighthouse) × × Conjunctive Dan Brown × Stephen King × × (Douglas Adams, Hitchhiker’s Guide) × × × × Queries Alice in Wonderland × Dan Brown × × (Neil Gaiman, Trigger Warning) × To the Lighthouse × Alice in Wonderland × (J. K. Rowling, Harry Potter 7) × × SQL Hitchhiker’s Guide × To the Lighthouse × (J. K. Rowling, The Casual Vacancy) × × Hitchhiker’s Guide × Interpretation Harry Potter 7 × (Stephen King, The Shining) × × × × The Casual Vacancy × Harry Potter 7 × (Stephen King, Doctor Sleep) × Trigger Warning × The Casual Vacancy × (Dan Brown, The Da Vinci Code) × × × Navigation The Shining × Trigger Warning × (Dan Brown, Inferno) × × Doctor Sleep × The Shining × Summary The Da Vinci Code × Doctor Sleep × Inferno × The Da Vinci Code × References Inferno × The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Conjunctive Queries Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Formalizations of Conjunctive Queries

The theory and practice of coupling formal concept In popular use: analysis to relational Tableaux databases Logical Formulas Background Datalog Rules Theory

Database Scaling Other formalizations in selected literature: Conjunctive Queries Relational Structures [CM77] SQL Windowed Relational Structures [Koe13] Interpretation

Navigation Windowed Power Context Families [Koe16] Summary Windowed Intension Graphs [Koe16] References Windowed Intension Graph

The theory and practice of coupling formal concept analysis to relational ”20th-century-born British authors who published in the 21st databases century”

Background Theory

Database Scaling

Conjunctive Queries

SQL Interpretation Terminology: object node, relation node, subject node, label,

Navigation marker, window

Summary

References Intension Graph

The theory and practice of coupling formal concept analysis to relational databases The underlying intension graph.

Background Theory

Database Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Solution

The theory and practice of coupling formal concept analysis to relational databases There are two more solutions in the below power context family. Background Theory 0 1 Database 30 50 40 sort: Author sort: Book ≤ ≤ Scaling ≤ Lewis Carroll × nationality: GB nationality: USA DOB: 19C DOB: 20C DOB: 21C pubdate: 19C pubdate: 20C pubdate: 21C 2 Conjunctive Virginia Woolf × Lewis Carroll × × Douglas Adams × Virginia Woolf × ×

Queries Neil Gaiman × Douglas Adams × × wrote: wrote wrote: age wrote: age wrote: age J. K. Rowling × Neil Gaiman × × (Lewis Carroll, Alice in Wonderland) × × × SQL Stephen King × J. K. Rowling × × (Virginia Woolf, To the Lighthouse) × × Interpretation Dan Brown × Stephen King × × (Douglas Adams, Hitchhiker’s Guide) × × × × Alice in Wonderland × Dan Brown × × (Neil Gaiman, Trigger Warning) × Navigation To the Lighthouse × Alice in Wonderland × (J. K. Rowling, Harry Potter 7) × × Hitchhiker’s Guide × To the Lighthouse × (J. K. Rowling, The Casual Vacancy) × × Harry Potter 7 × Hitchhiker’s Guide × (Stephen King, The Shining) × × × × Summary The Casual Vacancy × Harry Potter 7 × (Stephen King, Doctor Sleep) × Trigger Warning × The Casual Vacancy × (Dan Brown, The Da Vinci Code) × × × References The Shining × Trigger Warning × (Dan Brown, Inferno) × × Doctor Sleep × The Shining × The Da Vinci Code × Doctor Sleep × Inferno × The Da Vinci Code × Inferno × Result Table (Concept Extension)

The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Scaling

Conjunctive Queries Result Table

SQL x1 Interpretation Neil Gaiman J. K. Rowling Navigation

Summary

References The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database SQL Interpretation Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Syntactic Interpretation

The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Scaling

Conjunctive Queries

SQL Interpretation

Navigation RI (D)(Q) = RD (I (Q)) Summary

References Database Scales

The theory Centuries

and practice 19C 20C 21C of coupling 1832-01-27 × formal 1865-11-26 × concept 1882-01-25 × analysis to 1927-05-05 × relational 1947-09-21 × A database scale assigns an databases 1952-03-11 × SQL definition to each at- 1960-11-10 × 1964-06-22 × tribute. The corresponding Background 1965-07-31 × Theory 1977-01-28 × scale context (left side) can be 1979-10-12 × Database 2003-03-18 × derived if so desired. Scaling 2007-07-21 × 2012-09-27 × Conjunctive 2013-03-14 × Queries 2013-09-24 × 2015-02-03 × SQL Interpretation

Navigation σCenturies(19C) ≡ z1 BETWEEN ”1800-01-01” AND ”1899-12-31”

Summary σCenturies(20C) ≡ z1 BETWEEN ”1900-01-01” AND ”1999-12-31”

References σCenturies(21C) ≡ z1 BETWEEN ”2000-01-01” AND ”2099-12-31” Database Facets

The theory and practice of coupling formal concept DOB Similarly, a facet provides SQL def- analysis to 19C 20C 21C relational Lewis Carroll × initions of its attributes. It is ob- databases Virginia Woolf × tained by a variable substitution Douglas Adams × Neil Gaiman × in the underlying scale’s SQL def- Background J. K. Rowling × Theory Stephen King × inition, according to the binding. Database Dan Brown × (here: z1 → t1.date of birth) Scaling

Conjunctive Queries ΦDOB(19C) ≡ t1.date of birth BETWEEN ”1800-01-01” AND ”1899-12-31”

SQL ΦDOB(20C) ≡ t1.date of birth BETWEEN ”1900-01-01” AND ”1999-12-31” Interpretation ΦDOB(21C) ≡ t1.date of birth BETWEEN ”2000-01-01” AND ”2099-12-31” Navigation

Summary

References Database Facets

The theory and practice of coupling formal pubdate 19C 20C 21C concept analysis to Alice in Wonderland × Thereby, a relation between val- relational To the Lighthouse × databases Hitchhiker’s Guide × ues is translated into a relation be- Harry Potter 7 × The Casual Vacancy × tween objects. The scales encode Background Trigger Warning × the actual logic; they should be Theory The Shining × Database Doctor Sleep × generic and reusable. Scaling The Da Vinci Code × Inferno × Conjunctive Queries

SQL Φpubdate(19C) ≡ t1.publication date BETWEEN ”1800-01-01” AND ”1899-12-31” Interpretation Φpubdate(20C) ≡ t1.publication date BETWEEN ”1900-01-01” AND ”1999-12-31” Navigation Φpubdate(21C) ≡ t1.publication date BETWEEN ”2000-01-01” AND ”2099-12-31” Summary

References SQL Translation

The theory and practice of coupling formal concept analysis to relational databases

SELECT DISTINCT Ωsort(u1)(u1) AS x1 , ...,

Background Ω (um) AS xm Theory sort(um) Database FROM sort(v1) AS v1 , ..., Scaling

Conjunctive sort(vn) AS vn Queries WHERE Φc1 ()(v11, ..., v1n1 ) AND ... SQL Interpretation AND Φck (ak )(vk1, ..., vknk ) Navigation

Summary

References Example: SQL Translation

The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database SELECT DISTINCT CONCAT(v1.first name, ””, v1.last name) AS x1 Scaling FROM Author AS v1 , Conjunctive Queries Book AS v2

SQL WHERE v1.nationality=”GB” Interpretation AND v1.date of birth BETWEEN ”1900-01-01” AND ”1999-12-31” Navigation AND v1.id = v2.author Summary AND v1.publication date BETWEEN ”2000-01-01” AND ”2099-12-31” References The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Navigation Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Projectional Concept Graph

The theory and practice of coupling formal concept In a projectional concept graph, each node is considered as a analysis to unary concept in a system of interrelated concepts. The node relational databases extent is a unary concept extent in the conjunctive-query lattice. However, we do not compute the graph closure (i.e. Background Theory the intent in the conjunctive-query lattice).

Database Scaling

Conjunctive Queries

SQL Interpretation Node Extent Node Extent Navigation Neil Gaiman Trigger Warning Summary J. K. Rowling Harry Potter 7 The Casual Vacancy References Projectional Concept Graph

The theory and practice of coupling formal concept analysis to relational Definition databases A projectional concept graph is a 5-tuple (V , E, ν, κ, ext ) K~ Background comprised of an intension graph G := (V , E, κ, ν) and its Theory extension map Database Scaling ~ ext~ (v) := { ϕ(v) | ϕ ∈ S(G, ) } Conjunctive K K Queries SQL for a given power context family ~ with S(G, ~ ) 6= ∅. We call Interpretation K K ext (v) the node extent of v. Navigation K~ Summary

References Projectional Concept Graph

The theory and practice of coupling formal We envision relation nodes as controls in a user interface to concept analysis to show/hide associated value columns. relational databases

Background Theory

Database Scaling

Conjunctive Queries

SQL Interpretation

Navigation x1 x1.date of birth Summary Neil Gaiman 1960-11-10 J. K. Rowling 1965-07-31 References Projectional Concept Graph

The theory and practice of coupling formal Showing only projections eliminates combinatorial explosion in concept analysis to result tables. But windows of size ≥ 2 are still supported, if the relational databases actual combinations are of interest.

Background Theory

Database Scaling

Conjunctive Queries

SQL Interpretation

Navigation x1 x1.date of birth x2 Summary Neil Gaiman 1960-11-10 Trigger Warning J. K. Rowling 1965-07-31 Harry Potter 7 References J. K. Rowling 1965-07-31 The Casual Vacancy Refinement Triple

The theory and practice of coupling formal concept With each projectional concept graph, there is an associated analysis to + + + relational refinement triple (E , κ , θ ). databases E +: Associates with each object node v a list E +(v) of facets. Each facet corresponds to a new relation node that Background Theory can be connected to v. Database + Scaling κ : Provides for each object or relation node u a list + Conjunctive κ (u) of scale intents (or equivalently, scale concepts), Queries which can replace the current label κ(u). SQL Interpretation θ+ : A list of pairs of object nodes that can be merged. Navigation Each refinement option leads to another projectional concept Summary graph that at least one solution. References Example: Refinement Triple

The theory and practice of coupling formal concept analysis to relational Node Extent Node Extent databases Neil Gaiman Trigger Warning J. K. Rowling Harry Potter 7 The Casual Vacancy

Background Theory

Database Scaling

Conjunctive Queries The label refinements in κ+(u) SQL Interpretation correspond, for each facet, to con- Navigation cepts in the facet’s concept lattice. Summary

References The theory and practice of coupling formal concept analysis to relational databases

Background Theory

Database Summary Scaling

Conjunctive Queries

SQL Interpretation

Navigation

Summary

References Summary

The theory and practice of coupling formal concept analysis to relational databases Revisited: Relational scaling Background Application: Building query vocabulary around a relational Theory

Database database Scaling Application: Faceted navigation in power context families Conjunctive Queries Proposal: A new class of concept graphs SQL Interpretation

Navigation

Summary

References Selected ReferencesI

The theory and practice [CM77] Ashok K. Chandra and Philip M. Merlin of coupling Optimal implementation of conjunctive queries in relational databases. formal In: Proceedings of the 9th annual ACM symposium on theory of computing, pp. 77–90 (1977) concept analysis to [DE09] Jon R. Ducrou and Peter W. Eklund relational databases Faceted document navigation using conceptual structures. In: Conceptual Structures in Practice. Chapman & Hall/CRC (2009)

[GE99] Bernd Groh and Peter W. Eklund Background Algorithms for creating relational power context families from conceptual graphs. Theory In: Proceedings of ICCS 1999. LNAI, vol. 1640, pp. 389–400. Springer (1999)

Database [Her02] Joachim Hereth Scaling Relational scaling and databases. In: Proceedings of ICCS 2002. LNAI, vol. 2393, pp. 62–76. Springer (2002) Conjunctive Queries [HRV02] Marianne Huchard, Cyril Roume and Petko Valtchev SQL When concepts point at other concepts: the case of UML diagram reconstruction. Interpretation In: Proceedings of FCAKDD 2002, pp. 32–43.

Navigation [Koe11] Jens K¨otters Object configuration browsing in relational databases. Summary In: Proceedings of ICFCA 2011. LNCS, vol. 6628, pp. 151–166. Springer (2011)

References [Koe13] Jens K¨otters Concept lattices of a relational structure. In: Proceedings of ICCS 2013. LNCS, vol. 7735, pp. 301–310. Springer (2013) Selected ReferencesII

The theory and practice of coupling formal concept analysis to relational [Koe16] Jens K¨otters databases Intension graphs as patterns over power context families. In: Proceedings of CLA 2016. CEUR Workshop Proceedings, vol. 1624 (2016)

[PW99] Susanne Prediger and Rudolf Wille Background The lattice of concept graphs of a relationally scaled context. Theory In: Proceedings of ICCS 1999. LNAI, vol. 1640, pp. 401–414. Springer (1999) Database [Pri00] Uta Priss Scaling Lattice-based information retrieval. Conjunctive In: Knowledge Organization, vol. 27(3), pp. 132–142 (2000) Queries [Wil97] Rudolf Wille SQL Conceptual Graphs and Formal Concept Analysis. Interpretation In: Proceedings of ICCS 1997. LNCS, vol. 1257, pp. 290–303. Springer (1997) Navigation

Summary

References