KAPITEL 3 BERUHT AUF EINEM AUSZUG AUS DEM FOLGENDEN VORTRAG:

A tutorial on Database Theory and a talk on database query answering under updates

Nicole Schweikardt

Humboldt-Universität zu Berlin

24th Workshop on Logic, Language, Information and Computation (WoLLIC 2017) , July 19 & 20, 2017 database schema : relational signature σ := {M, D} a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Return all titles of movies y in which stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 Brazil Jonathan Pryce Casablanca Alien 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 database schema : relational signature σ := {M, D} a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 Brazil Jonathan Pryce Casablanca Alien 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca Resident Evil 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Sandra Bullock Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminator Arnold Schwarzenegger Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver")

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 database schema : relational signature σ := {M, D} a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Conjunctive queries!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 Brazil Jonathan Pryce Casablanca Alien 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca Resident Evil 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Sandra Bullock Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminator Arnold Schwarzenegger Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver")

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 database schema : relational signature σ := {M, D} a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 Brazil Jonathan Pryce Casablanca Alien 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca Resident Evil 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Sandra Bullock Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminator Arnold Schwarzenegger Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 database schema : relational signature σ := {M, D} a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 Brazil Jonathan Pryce Casablanca Alien 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca Resident Evil 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Sandra Bullock Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminator Arnold Schwarzenegger Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 a db : D = (MD , PD ), where MD : a finite subset of dom2 PD : a finite subset of dom3 dom : a fixed, infinite domain of potential db entries adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil Kim Greist Casablanca Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca Resident Evil 20:30 Casablanca Ingrid Bergmann Kino International Casablanca 18:00 Gravity Sandra Bullock Kino International Brazil 20:00 Gravity George Clooney Kino International Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminator Arnold Schwarzenegger Moviemento Gravity 19:30 Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 adom(D) : the set of all d ∈ dom that occur in MD or PD View D as a finite σ-structure with universe adom(D)!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil a db Kim: GreistD = (MD , PDCasablanca), where Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 2 Resident Evil 20:30 CasablancaMD Ingrid: Bergmann a finite subsetKino of Internationaldom Casablanca 18:00 Gravity D Sandra Bullock Kino International3 Brazil 20:00 Gravity P George: Clooneya finite subsetKino of Internationaldom Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminatordom Arnold:Schwarzenegger a fixed, infiniteMoviemento domain of potentialGravity db19:30 entries Terminator Linda Hamilton Moviemento Alien 22:00 Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 View D as a finite σ-structure with universe adom(D)!

Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil a db Kim: GreistD = (MD , PDCasablanca), where Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 2 Resident Evil 20:30 CasablancaMD Ingrid: Bergmann a finite subsetKino of Internationaldom Casablanca 18:00 Gravity D Sandra Bullock Kino International3 Brazil 20:00 Gravity P George: Clooneya finite subsetKino of Internationaldom Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminatordom Arnold:Schwarzenegger a fixed, infiniteMoviemento domain of potentialGravity db19:30 entries adomTerminator(D) Linda: Hamilton the set of all dMoviemento∈ dom thatAlien occur in22:00MD or PD Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . Urania Resident Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil a db Kim: GreistD = (MD , PDCasablanca), where Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 2 Resident Evil 20:30 CasablancaMD Ingrid: Bergmann a finite subsetKino of Internationaldom Casablanca 18:00 Gravity D Sandra Bullock Kino International3 Brazil 20:00 Gravity P George: Clooneya finite subsetKino of Internationaldom Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminatordom Arnold:Schwarzenegger a fixed, infiniteMoviemento domain of potentialGravity db19:30 entries adomTerminator(D) Linda: Hamilton the set of all dMoviemento∈ dom thatAlien occur in22:00MD or PD Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . View D as a finite σ-structure with universeUrania adomResident(D)! Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := Movie(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil a db Kim: GreistD = (MD , PDCasablanca), where Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 2 Resident Evil 20:30 CasablancaMD Ingrid: Bergmann a finite subsetKino of Internationaldom Casablanca 18:00 Gravity D Sandra Bullock Kino International3 Brazil 20:00 Gravity P George: Clooneya finite subsetKino of Internationaldom Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminatordom Arnold:Schwarzenegger a fixed, infiniteMoviemento domain of potentialGravity db19:30 entries adomTerminator(D) Linda: Hamilton the set of all dMoviemento∈ dom thatAlien occur in22:00MD or PD Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . View D as a finite σ-structure with universeUrania adomResident(D)! Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := M(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z Programme(x, y, z) ∧ Movie(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 Example database and two queries A logician’s point of view: Movie Programme NameMovie Actor: a 2-ary relationCinema symbol M Movietitle Time Alien Sigourney Weaver Babylon Casablanca 17:30 ProgrammeBlade Runner Harrison: a Ford 3-ary relationBabylon symbol P Gravity 20:15 Blade Runner Sean Young Casablanca Blade Runner 15:30 databaseBrazil schema Jonathan: relational Pryce signatureCasablancaσ := {MAlien, D} 18:15 Brazil a db Kim: GreistD = (MD , PDCasablanca), where Blade Runner 20:30 Casablanca Humphrey Bogart Casablanca 2 Resident Evil 20:30 CasablancaMD Ingrid: Bergmann a finite subsetKino of Internationaldom Casablanca 18:00 Gravity D Sandra Bullock Kino International3 Brazil 20:00 Gravity P George: Clooneya finite subsetKino of Internationaldom Brazil 22:00 Resident Evil Milla Jovovich Moviemento Gravity 17:00 Terminatordom Arnold:Schwarzenegger a fixed, infiniteMoviemento domain of potentialGravity db19:30 entries adomTerminator(D) Linda: Hamilton the set of all dMoviemento∈ dom thatAlien occur in22:00MD or PD Terminator Michael Biehn Urania Resident Evil 20:00 . . Urania Resident Evil 21:30 . . View D as a finite σ-structure with universeUrania adomResident(D)! Evil 23:00

Return all titles of movies y in which Sigourney Weaver stars:

ϕ1(y) := M(y, "Sigourney Weaver") Return all tuples (x, y) of cinemas x and movie titles y such that x plays movie y in which Sigourney Weaver stars:  ϕ2(x, y) := ∃ z P(x, y, z) ∧ M(y, "Sigourney Weaver") Conjunctive queries! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 2/ 43 , i.e., compute the set

ϕ(D) := ϕ(x1,..., xk ) (D) := J K h i  k a1 ak (a1,..., ak ) ∈ adom(D) :(adom(D), D) |= ϕ ··· x1 xk ···

Special case k = 0: Boolean queries: Evaluate ϕ() on D means Decide if (adom(D), D) |= ϕ

Query evaluation

Consider a query language L (e.g., SQL, conjunctive queries CQ, first-order logic FO).

Let ϕ(x1,..., xk ) be a query of signature σ, formulated in L. Let D be a database of signature σ.

Task:

Evaluate ϕ(x1,..., xk ) on D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 3/ 43 Special case k = 0: Boolean queries: Evaluate ϕ() on D means Decide if (adom(D), D) |= ϕ

Query evaluation

Consider a query language L (e.g., SQL, conjunctive queries CQ, first-order logic FO).

Let ϕ(x1,..., xk ) be a query of signature σ, formulated in L. Let D be a database of signature σ.

Task:

Evaluate ϕ(x1,..., xk ) on D, i.e., compute the set

ϕ(D) := ϕ(x1,..., xk ) (D) := J K h i  k a1 ak (a1,..., ak ) ∈ adom(D) :(adom(D), D) |= ϕ ··· x1 xk ···

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 3/ 43 Query evaluation

Consider a query language L (e.g., SQL, conjunctive queries CQ, first-order logic FO).

Let ϕ(x1,..., xk ) be a query of signature σ, formulated in L. Let D be a database of signature σ.

Task:

Evaluate ϕ(x1,..., xk ) on D, i.e., compute the set

ϕ(D) := ϕ(x1,..., xk ) (D) := J K h i  k a1 ak (a1,..., ak ) ∈ adom(D) :(adom(D), D) |= ϕ ··· x1 xk ···

Special case k = 0: Boolean queries: Evaluate ϕ() on D means Decide if (adom(D), D) |= ϕ

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 3/ 43 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82].

: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77]

CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity

and data complexity

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82].

: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77]

CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82].

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77]

CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82]. CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77]

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82]. CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77] 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 CAVEAT: These notions & results cannot handle updates of the db!

Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77] 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82].

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 Complexity of query evaluation In his STOC’82 paper, Moshe Vardi introduced the notions combined complexity: Measure the complexity of evaluating ϕ on D in terms of the sizes of ϕ and D.

and data complexity: Assume the query ϕ to be fixed. Measure the complexity of evaluating ϕ on D only in terms of the size of D.

Typical results obtained in database theory: 0 I Boolean Conjunctive Queries: data complexity is in AC , combined complexity is NP-complete [Chandra & Merlin ’77] 0 I Boolean First-Order Queries: data complexity is in AC , combined complexity is PSPACE-complete [Stockmeyer ’74, Vardi ’82]

I Boolean Least-Fixed Point Queries: data complexity is PTIME-complete, combined complexity is EXPTIME-complete [Immerman ’82, Vardi ’82]. CAVEAT: These notions & results cannot handle updates of the db! Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 4/ 43 For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ

I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk )

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

I Output: For Boolean queries: I Decide if D |= ϕ

I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries:

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 update data structure in

of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

I Dynamic setting: Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 of degree 6 d f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

A typical scenario for DB-systems

I Input: I Database D I query ϕ(x1,..., xk ) I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 5/ 43 Overview

Introduction

Conjunctive Queries on Arbitrary Databases

First-Order Queries on Bounded Degree Databases

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 6/ 43 Overview

Introduction

Conjunctive Queries on Arbitrary Databases

First-Order Queries on Bounded Degree Databases

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 7/ 43 Complexity of query evaluation:

Obvious: static time 6 update time · kDk Thus:

I constant update time =⇒ static setting has linear data complexity O(1) O(1) I n update time ⇐⇒ n static time For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood! Interesting: Sub-linear update time

Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 Thus:

I constant update time =⇒ static setting has linear data complexity O(1) O(1) I n update time ⇐⇒ n static time For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood! Interesting: Sub-linear update time

Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Complexity of query evaluation:

Obvious: static time 6 update time · kDk

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 O(1) O(1) I n update time ⇐⇒ n static time For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood! Interesting: Sub-linear update time

Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Complexity of query evaluation:

Obvious: static time 6 update time · kDk Thus:

I constant update time =⇒ static setting has linear data complexity

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood! Interesting: Sub-linear update time

Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Complexity of query evaluation:

Obvious: static time 6 update time · kDk Thus:

I constant update time =⇒ static setting has linear data complexity O(1) O(1) I n update time ⇐⇒ n static time

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 Interesting: Sub-linear update time

Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Complexity of query evaluation:

Obvious: static time 6 update time · kDk Thus:

I constant update time =⇒ static setting has linear data complexity O(1) O(1) I n update time ⇐⇒ n static time For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 Conjunctive queries (CQs)

Conjunctive queries:  ϕ(x1,..., x`) := ∃x`+1 · · · ∃xm R1(x) ∧ · · · ∧ Rs (x)

Complexity of query evaluation:

Obvious: static time 6 update time · kDk Thus:

I constant update time =⇒ static setting has linear data complexity O(1) O(1) I n update time ⇐⇒ n static time For the static setting: tight characterisation of the tractable CQs: Boolean: [Grohe, Schwentick, Segoufin 2001], [Grohe 2007], [Marx 2010], [Marx 2013] counting: [Dalmau, Jonsson 2004], [Chen, Mengel 2015], [Greco, Scarcello 2015] enumeration: [Bulatov et al. 2012], [Bagan, Durand, Grandjean 2007] I.e.: Update time nO(1) is well-understood! Interesting: Sub-linear update time

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 8/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) CQ I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||).

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 9/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) CQ I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 9/ 43 q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ).

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y

q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 q-hierarchical CQs Dalvi & Suciu (PODS’07) introduced the hierarchical CQs to characterise the Boolean CQs that can be answered in PTIME on probabilistic dbs. q-hierarchical CQs are hierarchical CQs where, additionally, the quantifiers respect the query’s hierarchical form.

Definition: A CQ ϕ(z1,..., zk ) is q-hierarchical if for all variables x, y of ϕ the following is satisfied: (i) atoms(x) ⊆ atoms(y) or atoms(y) ⊆ atoms(x) or atoms(x) ∩ atoms(y) = ∅, and (ii) if atoms(x) ( atoms(y) and x ∈ free(ϕ), then y ∈ free(ϕ). Queries that are not q-hierarchical:  ψS-E-T () := ∃x ∃y S(x) ∧ E(x, y) ∧ T (y) x y

ϕS-E-T (x, y) := S(x) ∧ E(x, y) ∧ T (y)  ϕE-T (x) := ∃y E(x, y) ∧ T (y) x ∃y A q-hierarchical query: (y) := ∃x E(x y) ∧ T (y)  θE-T , ∃x y Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 10/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) CQ I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 11/ 43 The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43 Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − )

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43 Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43 Intractability result for enumerating CQs that are not q-hierarchical . . . is subject to suitable algorithmic conjecture The OMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream v1,..., vn of n-dimensional Boolean vectors

Task: output Mv` before accessing v`+1 OMv-Conjecture: For every  > 0, there is no algorithm that solves 3  the OMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Enumeration): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a self-join free CQ that is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that enumerates ϕ(D) with 1  td = O(n − ) delay, unless the OMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 12/ 43 Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − . For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − . 3 ε This solves OMv in total time O(n − ) E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43 Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − . For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − . 3 ε This solves OMv in total time O(n − ) E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43 For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − . 3 ε This solves OMv in total time O(n − ) E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − .

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43 1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − . 3 ε This solves OMv in total time O(n − ) E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − . For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43 3 ε This solves OMv in total time O(n − ) E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − . For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − .

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for enumerating via OMv

Input: Boolean n × n matrix M and stream v1,..., vn of n-dimensional Boolean vectors.

Task: output Mv` before accessing v`+1

Given n × n matrix M, let

D 2 D I E 0 := { (i, j) ∈ [n] : M(i, j) = 1 }, T 0 := ∅

2 1  Create data structure for D0 in time n · n − .

Given n-dim vector v`, update

D` I T := { i ∈ [n]: v`(i) = 1 }.

1 ε in time n · n − . For u` := Mv` we have:

I ϕE-T (D`) = { i ∈ [n]: u`(i) = 1 }

1 ε and can output u` after enumerating ϕE-T (D`) in time n · n − . 3 ε This solves OMv in total time O(n − ) E Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 13/ 43 Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Boolean): [Berkholz, Keppeler, S., PODS’17] Fix an  > 0 and let ϕ be a Boolean CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that answers ϕ(D) in time 2  ta = O(n − ), unless the OuMv-conjecture fails.  Proof idea for ψS-E-T := ∃x∃y S(x) ∧ E(x, y) ∧ T (y)

Intractability result for Boolean CQs that are not q-hierarchical The OuMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream u1,v1,..., un,vn of n-dimensional Boolean vectors | Task: output (u`) Mv` before accessing u`+1, v`+1 OuMv-Conjecture: For every  > 0, there is no algorithm that 3  solves the OuMv-problem in total time O(n − )

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 14/ 43 Theorem (Boolean): [Berkholz, Keppeler, S., PODS’17] Fix an  > 0 and let ϕ be a Boolean CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that answers ϕ(D) in time 2  ta = O(n − ), unless the OuMv-conjecture fails.  Proof idea for ψS-E-T := ∃x∃y S(x) ∧ E(x, y) ∧ T (y)

Intractability result for Boolean CQs that are not q-hierarchical The OuMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream u1,v1,..., un,vn of n-dimensional Boolean vectors | Task: output (u`) Mv` before accessing u`+1, v`+1 OuMv-Conjecture: For every  > 0, there is no algorithm that 3  solves the OuMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 14/ 43  Proof idea for ψS-E-T := ∃x∃y S(x) ∧ E(x, y) ∧ T (y)

Intractability result for Boolean CQs that are not q-hierarchical The OuMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream u1,v1,..., un,vn of n-dimensional Boolean vectors | Task: output (u`) Mv` before accessing u`+1, v`+1 OuMv-Conjecture: For every  > 0, there is no algorithm that 3  solves the OuMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Boolean): [Berkholz, Keppeler, S., PODS’17] Fix an  > 0 and let ϕ be a Boolean CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that answers ϕ(D) in time 2  ta = O(n − ), unless the OuMv-conjecture fails.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 14/ 43 Intractability result for Boolean CQs that are not q-hierarchical The OuMv-problem: [Henzinger et al., STOC’15] Input: a Boolean n × n-matrix M and a stream u1,v1,..., un,vn of n-dimensional Boolean vectors | Task: output (u`) Mv` before accessing u`+1, v`+1 OuMv-Conjecture: For every  > 0, there is no algorithm that 3  solves the OuMv-problem in total time O(n − ) Fails for offline algorithms if we receive all vectors at once: fast matrix multiplication!

Theorem (Boolean): [Berkholz, Keppeler, S., PODS’17] Fix an  > 0 and let ϕ be a Boolean CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that answers ϕ(D) in time 2  ta = O(n − ), unless the OuMv-conjecture fails.  Proof idea for ψS-E-T := ∃x∃y S(x) ∧ E(x, y) ∧ T (y)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 14/ 43 Theorem (Counting): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that computes |ϕ(D)| in time 1  tc = O(n − ), unless the OV-conjecture or the OuMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for counting CQs that are not q-hierarchical The OV-problem: [cf. R. Williams, 2005] Input: two sets U and V of n Boolean vectors of dimension d := dlog2 ne Task: decide if there exist u ∈ U and v ∈ V with u|v = 0 OV-Conjecture: For every  > 0, there is no algorithm that solves 2  the OV-problem in time O(n − )

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 15/ 43  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Intractability result for counting CQs that are not q-hierarchical The OV-problem: [cf. R. Williams, 2005] Input: two sets U and V of n Boolean vectors of dimension d := dlog2 ne Task: decide if there exist u ∈ U and v ∈ V with u|v = 0 OV-Conjecture: For every  > 0, there is no algorithm that solves 2  the OV-problem in time O(n − )

Theorem (Counting): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that computes |ϕ(D)| in time 1  tc = O(n − ), unless the OV-conjecture or the OuMv-conjecture fails.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 15/ 43 Intractability result for counting CQs that are not q-hierarchical The OV-problem: [cf. R. Williams, 2005] Input: two sets U and V of n Boolean vectors of dimension d := dlog2 ne Task: decide if there exist u ∈ U and v ∈ V with u|v = 0 OV-Conjecture: For every  > 0, there is no algorithm that solves 2  the OV-problem in time O(n − )

Theorem (Counting): [Berkholz, Keppeler, S., PODS’17] Let  > 0 and let ϕ(x) be a CQ whose homomorphic core is not q-hierarchical. Then, there is no algorithm with arbitrary preprocessing time and 1  tu = O(n − ) update time that computes |ϕ(D)| in time 1  tc = O(n − ), unless the OV-conjecture or the OuMv-conjecture fails.  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 15/ 43 | ϕE-T (D`) = 4 = { ui : ui v` 6= 0 }

D` 1  2 1 ε I for each v` ∈ V : update T in time d · n − = dlog nen − | I there is ui ∈ U with ui v` = 0 ⇐⇒ |ϕE-T (D`)| < n. 2 1 ε 2 ε0 I finished for all v` ∈ V within time n · dlog nen − = n − E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for counting via OV Left: n vertices for the n vectors u ∈ U Right: d := dlog2 ne vertices for vector-coordinates

| u1 = (1, 0, 0)

| = (1, 1, 0)

| u3 = (1, 0, 1)

| u4 = (0, 0, 1)

| u5 = (0, 1, 1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 16/ 43 | ϕE-T (D`) = 4 = { ui : ui v` 6= 0 }

| I there is ui ∈ U with ui v` = 0 ⇐⇒ |ϕE-T (D`)| < n. 2 1 ε 2 ε0 I finished for all v` ∈ V within time n · dlog nen − = n − E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for counting via OV Left: n vertices for the n vectors u ∈ U Right: d := dlog2 ne vertices for vector-coordinates

| u1 = (1, 0, 0)

| u2 = (1, 1, 0) 1 D` T v` = 1 | u3 = (1, 0, 1) 0

| u4 = (0, 0, 1)

| u5 = (0, 1, 1)

D` 1  2 1 ε I for each v` ∈ V : update T in time d · n − = dlog nen −

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 16/ 43 2 1 ε 2 ε0 I finished for all v` ∈ V within time n · dlog nen − = n − E

 Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for counting via OV Left: n vertices for the n vectors u ∈ U Right: d := dlog2 ne vertices for vector-coordinates

| u1 = (1, 0, 0)

| u2 = (1, 1, 0) 1 D` T v` = 1 | u3 = (1, 0, 1) 0

| | u4 = (0, 0, 1) ϕE-T (D`) = 4 = { ui : ui v` 6= 0 }

| u5 = (0, 1, 1)

D` 1  2 1 ε I for each v` ∈ V : update T in time d · n − = dlog nen − | I there is ui ∈ U with ui v` = 0 ⇐⇒ |ϕE-T (D`)| < n.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 16/ 43  Proof idea for ϕE-T (x) := ∃y E(x, y) ∧ T (y) A lower bound for counting via OV Left: n vertices for the n vectors u ∈ U Right: d := dlog2 ne vertices for vector-coordinates

| u1 = (1, 0, 0)

| u2 = (1, 1, 0) 1 D` T v` = 1 | u3 = (1, 0, 1) 0

| | u4 = (0, 0, 1) ϕE-T (D`) = 4 = { ui : ui v` 6= 0 }

| u5 = (0, 1, 1)

D` 1  2 1 ε I for each v` ∈ V : update T in time d · n − = dlog nen − | I there is ui ∈ U with ui v` = 0 ⇐⇒ |ϕE-T (D`)| < n. 2 1 ε 2 ε0 I finished for all v` ∈ V within time n · dlog nen − = n − E Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 16/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) CQ I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

update data structure in

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 poly(ϕ) := ||ϕ||O(1)

Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: data complexity I Database D arbitrary I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in constant time Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in constant time Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in time poly(ϕ) I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 Scenario [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in time poly(ϕ) I Enumerate the tuples in ϕ(D) with delay poly(ϕ) I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 17/ 43 I answer a Boolean CQ,

I count the number of result tuples,

I enumerate the result relation with constant delay.

Efficient evaluation of a fragment of CQs

Theorem (Upper bound): For every CQ that is q-hierarchical, there is a dynamic data structure that has constant update time and allows to

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 18/ 43 Efficient evaluation of a fragment of CQs

Theorem (Upper bound): For every CQ that is q-hierarchical, there is a dynamic data structure that has constant update time and allows to

I answer a Boolean CQ,

I count the number of result tuples,

I enumerate the result relation with constant delay.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 18/ 43 ENUM:

+ + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x

y z

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

+ + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

+ + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

+ + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

1. for every R(y1,..., yr ) in ϕ: {y1,..., yr } forms a path in T that starts at the root

2. the free variables {x1,..., x`} form a connected subtree that contains the root Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Definition (q-tree):

A q-tree T for a CQ ϕ(x1,..., x`) is a rooted tree with V (T ) = vars(ϕ) and

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

2. the free variables {x1,..., x`} form a connected subtree that contains the root Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Definition (q-tree):

A q-tree T for a CQ ϕ(x1,..., x`) is a rooted tree with V (T ) = vars(ϕ) and

1. for every R(y1,..., yr ) in ϕ: {y1,..., yr } forms a path in T that starts at the root

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Definition (q-tree):

A q-tree T for a CQ ϕ(x1,..., x`) is a rooted tree with V (T ) = vars(ϕ) and

1. for every R(y1,..., yr ) in ϕ: {y1,..., yr } forms a path in T that starts at the root

2. the free variables {x1,..., x`} form a connected subtree that contains the root

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 ENUM:

q-hierarchical queries

ϕ(x, y, z) := R(x) ∧ E(x, y) ∧ F (x, z) x P + + |ϕ(D)| = v RD |NE (v)| · |NF (v)| ∈ y z + + P + + I COUNT: store |NE (v)|, |NF (v)|, v RD |NE (v)| · |NF (v)| ∈ + + I ENUM: store NE (v), NF (v) as lists with constant access, D + + for v ∈ R report {v} × NE (v) × NF (v) Definition (q-tree):

A q-tree T for a CQ ϕ(x1,..., x`) is a rooted tree with V (T ) = vars(ϕ) and

1. for every R(y1,..., yr ) in ϕ: {y1,..., yr } forms a path in T that starts at the root

2. the free variables {x1,..., x`} form a connected subtree that contains the root Lemma: A CQ ϕ(x) is q-hierarchical ⇐⇒ every connected component of ϕ(x) has a q-tree.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 19/ 43 I S(b, p, a), R(b, p, a), R(b, p, b), R(b, p, c) ∈ D, E(b, p) ∈/ D

I INSERT E(b, p)

Data structure for q-hierarchical queries  ϕ(x, y, z, y 0, z0) = Rxyz ∧ Rxyz0 ∧ Exy ∧ Exy 0 ∧ Sxyz x

y y0

z z0

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 20/ 43 I INSERT E(b, p)

Data structure for q-hierarchical queries  ϕ(x, y, z, y 0, z0) = Rxyz ∧ Rxyz0 ∧ Exy ∧ Exy 0 ∧ Sxyz x

y y0

z z0

start x a b Cstart = 23 14 9 y y y0 y0 b [y, x , p] e f e f g d g h p 6 1 1 1 3 1 1 1 0 z z0 z z0 z z0 z z0

a b a b c c c b a b c a a b c 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I S(b, p, a), R(b, p, a), R(b, p, b), R(b, p, c) ∈ D, E(b, p) ∈/ D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 20/ 43 Data structure for q-hierarchical queries  ϕ(x, y, z, y 0, z0) = Rxyz ∧ Rxyz0 ∧ Exy ∧ Exy 0 ∧ Sxyz x

y y0

z z0

start x a b Cstart = 38 14 24 y y0 y y0

e f e f g p d g h p 6 1 1 1 3 3 1 1 1 1 z z0 z z0 z z0 z z0

a b a b c c c b a b c a a b c 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

I S(b, p, a), R(b, p, a), R(b, p, b), R(b, p, c) ∈ D, E(b, p) ∈/ D

I INSERT E(b, p)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 20/ 43 Summary [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in time poly(ϕ) I Enumerate the tuples in ϕ(D) with delay poly(ϕ) I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 21/ 43 Summary [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in time poly(ϕ) I Enumerate the tuples in ϕ(D) with delay poly(ϕ) I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D After every update we want to update the data structure and report the new query result quickly: in time constant or polylog(||D||). Main result: This is possible ⇐⇒ ϕ is q-hierarchical. Ongoing work: Similar results for UCQs & FDs. Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 21/ 43 Summary [Berkholz, Keppeler, S., PODS’17]

I Input: combined complexity O(1) I Database D arbitrary poly(ϕ) := ||ϕ|| I query ϕ(x1,..., xk ) q-hierarchical CQ I Preprocessing: in time poly(ϕ)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in time poly(ϕ) I Enumerate the tuples in ϕ(D) with delay poly(ϕ) I Dynamic setting: update data structure in time poly(ϕ) Tuples may be inserted into or deleted from D

Related work: [Idris, Ugarte, Vansummeren, SIGMOD’17]: q-hierarchical queries are also efficient in practice!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 21/ 43 Overview

Introduction

Conjunctive Queries on Arbitrary Databases

First-Order Queries on Bounded Degree Databases

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 22/ 43 In FOC(P):  Peven #(y).Movie (y, "Sigourney Weaver")

FO+MOD = extension of first-order logic with modulo-counting quantifiers ∃i mod m y ψ(y, z)

Let P be a collection of numerical predicates. E.g., P may contain the 2 predicates Peven = {i ∈ Z : i is even} and P = {(i, j) ∈ Z : i 6 j}. J K J 6K FOC(P) = extension of first-order logic with formulas of the form P(t1,..., tr ) for P ∈ P of arity r, and where each ti is a counting term built using integers, +, ·, and basic counting terms t(x) of the form #y.ψ(x, y)

FO+MOD queries and FOC(P) queries Movie Name Actor Is the number of movies with Alien Sigourney Weaver Sigourney Weaver even? Blade Runner Harrison Ford Blade Runner Sean Young Brazil Jonathan Pryce In FO+MOD: Brazil Kim Greist Casablanca Humphrey Bogart 0 mod 2 Casablanca Ingrid Bergmann ∃ y Movie (y, "Sigourney Weaver") Gravity Sandra Bullock Gravity George Clooney Resident Evil Milla Jovovich Terminator Arnold Schwarzenegger Terminator Linda Hamilton Terminator Michael Biehn . . . .

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 23/ 43 In FOC(P):  Peven #(y).Movie (y, "Sigourney Weaver")

Let P be a collection of numerical predicates. E.g., P may contain the 2 predicates Peven = {i ∈ Z : i is even} and P = {(i, j) ∈ Z : i 6 j}. J K J 6K FOC(P) = extension of first-order logic with formulas of the form P(t1,..., tr ) for P ∈ P of arity r, and where each ti is a counting term built using integers, +, ·, and basic counting terms t(x) of the form #y.ψ(x, y)

FO+MOD queries and FOC(P) queries Movie Name Actor Is the number of movies with Alien Sigourney Weaver Sigourney Weaver even? Blade Runner Harrison Ford Blade Runner Sean Young Brazil Jonathan Pryce In FO+MOD: Brazil Kim Greist Casablanca Humphrey Bogart 0 mod 2 Casablanca Ingrid Bergmann ∃ y Movie (y, "Sigourney Weaver") Gravity Sandra Bullock Gravity George Clooney Resident Evil Milla Jovovich Terminator Arnold Schwarzenegger Terminator Linda Hamilton Terminator Michael Biehn . . . .

FO+MOD = extension of first-order logic with modulo-counting quantifiers ∃i mod m y ψ(y, z)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 23/ 43 Let P be a collection of numerical predicates. E.g., P may contain the 2 predicates Peven = {i ∈ Z : i is even} and P = {(i, j) ∈ Z : i 6 j}. J K J 6K FOC(P) = extension of first-order logic with formulas of the form P(t1,..., tr ) for P ∈ P of arity r, and where each ti is a counting term built using integers, +, ·, and basic counting terms t(x) of the form #y.ψ(x, y)

FO+MOD queries and FOC(P) queries Movie Name Actor Is the number of movies with Alien Sigourney Weaver Sigourney Weaver even? Blade Runner Harrison Ford Blade Runner Sean Young Brazil Jonathan Pryce In FO+MOD: Brazil Kim Greist Casablanca Humphrey Bogart 0 mod 2 Casablanca Ingrid Bergmann ∃ y Movie (y, "Sigourney Weaver") Gravity Sandra Bullock Gravity George Clooney Resident Evil Milla Jovovich In FOC( ): Terminator Arnold Schwarzenegger P Terminator Linda Hamilton  Terminator Michael Biehn Peven #(y).Movie (y, "Sigourney Weaver") . . . .

FO+MOD = extension of first-order logic with modulo-counting quantifiers ∃i mod m y ψ(y, z)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 23/ 43 FOC(P) = extension of first-order logic with formulas of the form P(t1,..., tr ) for P ∈ P of arity r, and where each ti is a counting term built using integers, +, ·, and basic counting terms t(x) of the form #y.ψ(x, y)

FO+MOD queries and FOC(P) queries Movie Name Actor Is the number of movies with Alien Sigourney Weaver Sigourney Weaver even? Blade Runner Harrison Ford Blade Runner Sean Young Brazil Jonathan Pryce In FO+MOD: Brazil Kim Greist Casablanca Humphrey Bogart 0 mod 2 Casablanca Ingrid Bergmann ∃ y Movie (y, "Sigourney Weaver") Gravity Sandra Bullock Gravity George Clooney Resident Evil Milla Jovovich In FOC( ): Terminator Arnold Schwarzenegger P Terminator Linda Hamilton  Terminator Michael Biehn Peven #(y).Movie (y, "Sigourney Weaver") . . . .

FO+MOD = extension of first-order logic with modulo-counting quantifiers ∃i mod m y ψ(y, z)

Let P be a collection of numerical predicates. E.g., P may contain the 2 predicates Peven = {i ∈ Z : i is even} and P = {(i, j) ∈ Z : i 6 j}. J K J 6K

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 23/ 43 FO+MOD queries and FOC(P) queries Movie Name Actor Is the number of movies with Alien Sigourney Weaver Sigourney Weaver even? Blade Runner Harrison Ford Blade Runner Sean Young Brazil Jonathan Pryce In FO+MOD: Brazil Kim Greist Casablanca Humphrey Bogart 0 mod 2 Casablanca Ingrid Bergmann ∃ y Movie (y, "Sigourney Weaver") Gravity Sandra Bullock Gravity George Clooney Resident Evil Milla Jovovich In FOC( ): Terminator Arnold Schwarzenegger P Terminator Linda Hamilton  Terminator Michael Biehn Peven #(y).Movie (y, "Sigourney Weaver") . . . .

FO+MOD = extension of first-order logic with modulo-counting quantifiers ∃i mod m y ψ(y, z)

Let P be a collection of numerical predicates. E.g., P may contain the 2 predicates Peven = {i ∈ Z : i is even} and P = {(i, j) ∈ Z : i 6 j}. J K J 6K FOC(P) = extension of first-order logic with formulas of the form P(t1,..., tr ) for P ∈ P of arity r, and where each ti is a counting term built using integers, +, ·, and basic counting terms t(x) of the form #y.ψ(x, y) Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 23/ 43 Bounded degree databases

Graph G = (V , E): degree of a node v : the number of neighbours of v in G degree of G : max {degree(v): v ∈ V }

Database D:

degree of D : degree of the Gaifman graph of D

Gaifman graph of D: the graph G = (V , E) with V = adom(D) and an edge between two distinct nodes a, b ∈ V iff some tuple in some relation of D contains a and b

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 24/ 43 I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗]. Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011)

Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011)

Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗].

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011)

Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗]. Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗]. Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗]. Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011) Similar results for other classes of databases

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 Known results for the static setting (i.e., without updates) FO query evaluation on dbs of degree 6 d Boolean queries:

I evaluation in linear time (Seese 1996)

I evaluation in time f (ϕ, d)||D||, for (Frick, Grohe 2004)

2O(||ϕ||) f (ϕ, d) = 2d = 3-exp(||ϕ|| + lg lg d) and the 3-fold exponential blow-up is unavoidable assuming FPT 6= AW[∗]. Non-Boolean queries:

I enumeration with constant delay and linear-time preprocessing (Durand, Grandjean 2007)

I delay f (ϕ, d) and preprocessing f (ϕ, d)||D||, where f (ϕ, d) = 3-exp(||ϕ|| + lg lg d) (Kazana, Segoufin 2011) New: Generalisation to the dynamic setting and FO+MOD [Berkholz, Keppeler, S., ICDT’17] and FOC(P) [Kuske, S., LICS’17] Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 25/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

update data structure in

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17], [Kuske, S., LICS’17]

I Input: data complexity I Database D of degree 6 d I query ϕ(x1,..., xk ) in FOC(P)[σ] I Preprocessing: in time O(||D||) Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in constant time Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in constant time Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in constant time For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in constant time I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) I Test for a given tuple a whether a ∈ ϕ(D) in constant time I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) I Enumerate the tuples in ϕ(D) with constant delay I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Scenario [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D Proof method: use Hanf normal form for FO+MOD

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 26/ 43 D I Nr (b) is the induced substructure of D on D D D Nr (b)= Nr (b1) ∪ · · · ∪ Nr (bk ) where D D Nr (bi )= {a ∈ adom(D): dist (bi , a) 6 r}

I Sphere-formula sphτ (x):  D  ∼ D, a |= sphτ (x) ⇐⇒ Nr (a), a = τ

Hanf normal form for FO+MOD

I A type τ with k centres and radius r:

Example type with k = 4 centres and radius r = 1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 27/ 43 I Sphere-formula sphτ (x):  D  ∼ D, a |= sphτ (x) ⇐⇒ Nr (a), a = τ

Hanf normal form for FO+MOD

I A type τ with k centres and radius r:

Example type with k = 4 centres and radius r = 1

D I Nr (b) is the induced substructure of D on D D D Nr (b)= Nr (b1) ∪ · · · ∪ Nr (bk ) where D D Nr (bi )= {a ∈ adom(D): dist (bi , a) 6 r}

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 27/ 43 Hanf normal form for FO+MOD

I A type τ with k centres and radius r:

Example type with k = 4 centres and radius r = 1

D I Nr (b) is the induced substructure of D on D D D Nr (b)= Nr (b1) ∪ · · · ∪ Nr (bk ) where D D Nr (bi )= {a ∈ adom(D): dist (bi , a) 6 r}

I Sphere-formula sphτ (x):  D  ∼ D, a |= sphτ (x) ⇐⇒ Nr (a), a = τ

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 27/ 43 Two queries ϕ(x) and ψ(x) are d-equivalent iff

(D, a) |= ϕ ⇐⇒ (D, a) |= ψ

for all dbs D of degree 6 d.

Theorem (Heimberg, Kuske, S., LICS’16) There is an algorithm which receives as input a degree bound d > 2 and a FO+MOD[σ]-formula ϕ(x), and constructs a d-equivalent formula ψ(x) in Hanf normal form. The algorithm’s runtime is f (ϕ, d) = 3-exp(||ϕ|| + lg lg d).

Hanf normal form for FO+MOD A Hanf normal form ψ(x) is a Boolean combination of

I sphere-formulas sphρ(x) and >m i mod m I Hanf-sentences ∃ y sphτ (y) and ∃ y sphτ (y) where τ is a type with 1 centre and radius r.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 28/ 43 Theorem (Heimberg, Kuske, S., LICS’16) There is an algorithm which receives as input a degree bound d > 2 and a FO+MOD[σ]-formula ϕ(x), and constructs a d-equivalent formula ψ(x) in Hanf normal form. The algorithm’s runtime is f (ϕ, d) = 3-exp(||ϕ|| + lg lg d).

Hanf normal form for FO+MOD A Hanf normal form ψ(x) is a Boolean combination of

I sphere-formulas sphρ(x) and >m i mod m I Hanf-sentences ∃ y sphτ (y) and ∃ y sphτ (y) where τ is a type with 1 centre and radius r.

Two queries ϕ(x) and ψ(x) are d-equivalent iff

(D, a) |= ϕ ⇐⇒ (D, a) |= ψ

for all dbs D of degree 6 d.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 28/ 43 Hanf normal form for FO+MOD A Hanf normal form ψ(x) is a Boolean combination of

I sphere-formulas sphρ(x) and >m i mod m I Hanf-sentences ∃ y sphτ (y) and ∃ y sphτ (y) where τ is a type with 1 centre and radius r.

Two queries ϕ(x) and ψ(x) are d-equivalent iff

(D, a) |= ϕ ⇐⇒ (D, a) |= ψ

for all dbs D of degree 6 d.

Theorem (Heimberg, Kuske, S., LICS’16) There is an algorithm which receives as input a degree bound d > 2 and a FO+MOD[σ]-formula ϕ(x), and constructs a d-equivalent formula ψ(x) in Hanf normal form. The algorithm’s runtime is f (ϕ, d) = 3-exp(||ϕ|| + lg lg d).

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 28/ 43 Proof Idea: Step 1: Transform ϕ into Hanf normal form ψ.

Main result for Boolean queries

Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a Boolean FO+MOD[σ]-query ϕ, and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to return the query result ϕ(D) with answer time O(1).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 29/ 43 Main result for Boolean queries

Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a Boolean FO+MOD[σ]-query ϕ, and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to return the query result ϕ(D) with answer time O(1).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Proof Idea: Step 1: Transform ϕ into Hanf normal form ψ.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 29/ 43 Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 0 Database:

a1 a2 a3 a4 a5 a6

a7

Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Data structure: A[τ] = 0 , A[ρ] = 0 Database:

a1 a2 a3 a4 a5 a6

a7

Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Database:

a1 a2 a3 a4 a5 a6

a7

Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 0

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 0 Database:

a1 a2 a3 a4 a5 a6

a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 1 , A[ρ] = 0 Database:

a1 a2 a3 a4 a5 a6

a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 1 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 1 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 1 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 1 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 1 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof idea (by example)

0 mod 2 0 mod 2 ψ = ∃ y sphτ (y) ∧ ∃ y sphρ(y) Let τ be the type with 1 center and radius 2:

Let ρ be the type with 1 center and radius 2:

Data structure: A[τ] = 0 , A[ρ] = 2 Database:

a1 a2 a3 a4 a5 a6

a8 a7

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 30/ 43 Proof Idea: Step 1: Transform ϕ into Hanf normal form ψ.

Main result for Boolean queries

Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a Boolean FO+MOD[σ]-query ϕ, and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to return the query result ϕ(D) with answer time O(1).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 31/ 43 Main result for enumeration

Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a k-ary FO+MOD[σ]-query ϕ(x), and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to enumerate ϕ(D) with delay O(k3).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 32/ 43 Main result for enumeration Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a k-ary FO+MOD[σ]-query ϕ(x), and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to enumerate ϕ(D) with delay O(k3).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Proof Idea:

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 32/ 43 Proof idea: Reduction to coloured graphs

Input: Database D FO+MOD-query ϕ(x1, ..., xk )

Same approach as in [Durand, S., Segoufin, PODS’14], but now we have to take care of updates!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 33/ 43 Proof idea: Reduction to coloured graphs

Input: Database D FO+MOD-query ϕ(x1, ..., xk )

σk := {E, C1, ..., Ck } v ∈ ψk (G) σk -structure G k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

Same approach as in [Durand, S., Segoufin, PODS’14], but now we have to take care of updates!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 33/ 43 Proof idea: Reduction to coloured graphs

Input: Enumerate: Database D FO+MOD-query ϕ(x1, ..., xk ) a ∈ ϕ(D)

σk := {E, C1, ..., Ck } v ∈ ψk (G) σk -structure G k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

Same approach as in [Durand, S., Segoufin, PODS’14], but now we have to take care of updates!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 33/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

^ ^ Cj (zj ) ∧ ¬E(zj , zj0 ) ϕc (z1,..., zc ) := j 1,...,c j=j0 ∈{ } 6

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

^ ^ Cj (zj ) ∧ ¬E(zj , zj0 ) ϕc (z1,..., zc ) := j 1,...,c j=j0 ∈{ } 6 D xj D ∼ CjG := {va : a ∈ adom(D)| |, (Nr (a), a) = τj }

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

^ ^ Cj (zj ) ∧ ¬E(zj , zj0 ) ϕc (z1,..., zc ) := j 1,...,c j=j0 ∈{ } 6 D xj D ∼ CjG := {va : a ∈ adom(D)| |, (Nr (a), a) = τj }

S D V := j 1,...,c CjG ∈{ }

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

^ ^ Cj (zj ) ∧ ¬E(zj , zj0 ) ϕc (z1,..., zc ) := j 1,...,c j=j0 ∈{ } 6 D xj D ∼ CjG := {va : a ∈ adom(D)| |, (Nr (a), a) = τj }

S D D 2 D V := j 1,...,c CjG E G := {(va, vb) ∈ V : dist (a, b) 6 2r + 1} ∈{ }

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 Representing Databases by Coloured Graphs _ sph (x1,..., xk )& sentences ϕ(x1,..., xk ) ≡d τi i ∈I ^ ^ sphτ (xj ) ∧ ¬dist 2r+1(xj , xj0 ) sphτ (x1,..., xc ) ≡d j 6 j 1,...,c j=j0 ∈{ } 6

^ ^ Cj (zj ) ∧ ¬E(zj , zj0 ) ϕc (z1,..., zc ) := j 1,...,c j=j0 ∈{ } 6 D xj D ∼ CjG := {va : a ∈ adom(D)| |, (Nr (a), a) = τj }

S D D 2 D V := j 1,...,c CjG E G := {(va, vb) ∈ V : dist (a, b) 6 2r + 1} ∈{ }

(a1,..., ac ) ∈ sphτ (D) ⇐⇒ (va1 ,..., vac ) ∈ ϕc (GD )

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 34/ 43 I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj }

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 Updating the graph (1) Claim

If Dnew is obtained from Dold by one update step, then GDnew can (k2r+k σ ) be obtained from GDold by dO || || update steps and additional ( σ k2d2r+2) computing time 2O || || .

I Assume, an update command update R(a) is received I Let r 0 := r + (k − 1)(2r + 1) I Let D0 ∈ {Dold, Dnew} be the database where a occurs in R. D0 I Let U := Nr 0 (a) D xj D ∼ I Cj := {vb : b ∈ adom(D)| | with (Nr (b), b) = τj } D G old I Updating the colours: Before: Cj = Cj 1: for j = 1 to c do Sk ` 2: for every tuple b ∈ `=1 U do Dnew  ∼ 3: if Nr (b), b = τj then Cj ← Cj ∪ {vb} 4: else Cj ← Cj \{vb}

Dnew Afterwards: Cj = CjG

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 35/ 43 D I Updating the edges: Before: E = E G old Sk ` 1: for every tuple b ∈ `=1 U do k 0 S ` 2: for every tuple b ∈ `=1 U do 3: if condition (1), (2) and (3) holds then

4: E ← E ∪ {(vb, vb0 )} 5: else

6: E ← E \{(vb, vb0 )}

D Afterwards: E = E G new I Conditions:

(1) There is a j ∈ {1, ..., c} such that b ∈ CjG 0 (2) There is a j 0 ∈ {1, ..., c} such that b ∈ CjG0 D (3) dist new (b, b0) 6 2r + 1

Updating the graph (2)

D 2 D I E G := {(va, vb) ∈ V : dist (a, b) 6 2r + 1}

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 36/ 43 I Conditions:

(1) There is a j ∈ {1, ..., c} such that b ∈ CjG 0 (2) There is a j 0 ∈ {1, ..., c} such that b ∈ CjG0 D (3) dist new (b, b0) 6 2r + 1

Updating the graph (2)

D 2 D I E G := {(va, vb) ∈ V : dist (a, b) 6 2r + 1} D I Updating the edges: Before: E = E G old Sk ` 1: for every tuple b ∈ `=1 U do k 0 S ` 2: for every tuple b ∈ `=1 U do 3: if condition (1), (2) and (3) holds then

4: E ← E ∪ {(vb, vb0 )} 5: else

6: E ← E \{(vb, vb0 )}

D Afterwards: E = E G new

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 36/ 43 Updating the graph (2)

D 2 D I E G := {(va, vb) ∈ V : dist (a, b) 6 2r + 1} D I Updating the edges: Before: E = E G old Sk ` 1: for every tuple b ∈ `=1 U do k 0 S ` 2: for every tuple b ∈ `=1 U do 3: if condition (1), (2) and (3) holds then

4: E ← E ∪ {(vb, vb0 )} 5: else

6: E ← E \{(vb, vb0 )}

D Afterwards: E = E G new I Conditions:

(1) There is a j ∈ {1, ..., c} such that b ∈ CjG 0 (2) There is a j 0 ∈ {1, ..., c} such that b ∈ CjG0 D (3) dist new (b, b0) 6 2r + 1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 36/ 43 Enumeration with delay O(k3d)

k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

for all u1 ∈ C1G do Enum(u1). Output EOE.

function Enum(u1,..., ui ) if i = k then Output (u1,..., ui ) else

for all ui+1 ∈ CiG+1 do Si if ui+1 ∈/ j=1 NG(uj ) then Enum(u1,..., ui , ui+1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 37/ 43 Enumeration with delay O(k3d)

k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

for all u1 ∈ C1G do Enum(u1). Output EOE.

function Enum(u1,..., ui ) if i = k then Output (u1,..., ui ) else

for all ui+1 ∈ CiG+1 do Si if ui+1 ∈/ j=1 NG(uj ) then Enum(u1,..., ui , ui+1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 37/ 43 Enumeration with delay O(k3d)

k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

for all u1 ∈ C1G do Enum(u1). Output EOE.

function Enum(u1,..., ui ) if i = k then Output (u1,..., ui ) else

for all ui+1 ∈ CiG+1 do Si if ui+1 ∈/ j=1 NG(uj ) then Enum(u1,..., ui , ui+1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 37/ 43 Enumeration with delay O(k3d)

k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6

for all u1 ∈ C1G do Enum(u1). Output EOE.

function Enum(u1,..., ui ) if i = k then Output (u1,..., ui ) else

for all ui+1 ∈ CiG+1 do Si Problem: Too few blue nodes if ui+1 ∈/ j=1 NG(uj ) then Enum(u1,..., ui , ui+1)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 37/ 43 W.l.o.g. let I = {1,..., s} be the set of small colours (with s 6 k).

  (uj , uj0 ) ∈/ E G, S := (u1,..., us ) ∈ C1G × · · · × CsG : for all j 6= j0

The set S can be computed in time O((dk)k ).

s ∈ S ⇐⇒ ex. a such that (s, a) ∈ ϕ(D)

Handling small colours

A colour ` ∈ {1,..., k} is small :⇐⇒ C`G 6 dk

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 38/ 43   (uj , uj0 ) ∈/ E G, S := (u1,..., us ) ∈ C1G × · · · × CsG : for all j 6= j0

The set S can be computed in time O((dk)k ).

s ∈ S ⇐⇒ ex. a such that (s, a) ∈ ϕ(D)

Handling small colours

A colour ` ∈ {1,..., k} is small :⇐⇒ C`G 6 dk

W.l.o.g. let I = {1,..., s} be the set of small colours (with s 6 k).

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 38/ 43 s ∈ S ⇐⇒ ex. a such that (s, a) ∈ ϕ(D)

Handling small colours

A colour ` ∈ {1,..., k} is small :⇐⇒ C`G 6 dk

W.l.o.g. let I = {1,..., s} be the set of small colours (with s 6 k).

  (uj , uj0 ) ∈/ E G, S := (u1,..., us ) ∈ C1G × · · · × CsG : for all j 6= j0

The set S can be computed in time O((dk)k ).

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 38/ 43 Handling small colours

A colour ` ∈ {1,..., k} is small :⇐⇒ C`G 6 dk

W.l.o.g. let I = {1,..., s} be the set of small colours (with s 6 k).

  (uj , uj0 ) ∈/ E G, S := (u1,..., us ) ∈ C1G × · · · × CsG : for all j 6= j0

The set S can be computed in time O((dk)k ).

s ∈ S ⇐⇒ ex. a such that (s, a) ∈ ϕ(D)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 38/ 43 The enumeration procedure

1: for all (u1,..., us ) ∈ S do 2: Enum(u1,..., us ). 3: Output the end-of-enumeration message EOE. 4: 5: function Enum(u1,..., ui ) 6: if i = k then 7: output the tuple (u1,..., ui ) 8: else

9: for all ui+1 ∈ CiG+1 do Si 10: if ui+1 ∈/ j=1 NG(uj ) then 11: Enum(u1,..., ui , ui+1)

where NG(uj ) := {v ∈ V G :(uj , v) ∈ E G}.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 39/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 40/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 40/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 40/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 40/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 40/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

update step: Insert a node into a colour C` with |C`G| = dk

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 41/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

update step: Insert a node into a colour C` with |C`G| = dk

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 41/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

update step: Delete a node from a colour C` with |C`G| = dk + 1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 41/ 43 k ^ ^ ψk (x1,..., xk ) := Ci (xi ) ∧ ¬E(xi , xj ) i=1 i=j 6 S large colours

......

update step: Delete a node from a colour C` with |C`G| = dk + 1

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 41/ 43 Proof Idea:

Main result for enumeration Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a k-ary FO+MOD[σ]-query ϕ(x), and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to enumerate ϕ(D) with delay O(k3).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 42/ 43 Main result for enumeration Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a k-ary FO+MOD[σ]-query ϕ(x), and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to enumerate ϕ(D) with delay O(k3) f (ϕ, d).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

For enumeration with delay O(k3): Use the skip-pointers that were introduced by [Durand, S., Segoufin, PODS’14] for the static setting and lift the approach to the dynamic setting.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 42/ 43 Main result for enumeration Theorem There is a dynamic algorithm that receives as input I a degree bound d > 2, I a k-ary FO+MOD[σ]-query ϕ(x), and I a db D of degree 6 d, and computes

I within f (ϕ, d)||D|| preprocessing time a data structure

I that can be updated in time f (ϕ, d) and allows to enumerate ϕ(D) with delay O(k3) /////////f (ϕ, d) O(k3).

f (ϕ, d) = 3-exp(||ϕ|| + lg lg d)

For enumeration with delay O(k3): Use the skip-pointers that were introduced by [Durand, S., Segoufin, PODS’14] for the static setting and lift the approach to the dynamic setting.

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 42/ 43 Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Summary [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 43/ 43 Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! –

Summary [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17].

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 43/ 43 – Thank you! –

Summary [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting!

Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 43/ 43 Summary [Berkholz, Keppeler, S., ICDT’17]

I Input: combined complexity I Database D of degree 6 d f (ϕ, d) = I query ϕ(x1,..., xk ) in FO+MOD[σ] 3-exp(||ϕ|| + lg lg d) I Preprocessing: in time f (ϕ, d)||D|| Build a suitable data structure that represents D and ϕ(D) I Output: For Boolean queries: I Decide if D |= ϕ in time O(1) For k-ary queries: I Compute the number of tuples in ϕ(D) in time O(1) 2 I Test for a given tuple a whether a ∈ ϕ(D) in time O(k ) 3 I Enumerate the tuples in ϕ(D) with delay O(k ) I Dynamic setting: update data structure in time f (ϕ, d) Tuples may be inserted into or deleted from D Similar results for FO with counting FOC(P) [Kuske, S., LICS’17]. Future task: Revisit other results on FO model checking in the dynamic setting! – Thank you! – Nicole Schweikardt (HU Berlin) Database Theory and Query Answering under Updates 43/ 43