Mechanically Verified Graph Query Processing

Incremental Maintenance of Graph Views
Stefania Dumbrava
March 9, 2021

Outline: Context; Regular Datalog; Regular Datalog Formalization; Conclusions

This Talk

Our Work: a machine-verified incremental graph query engine

  • ...relying on the Coq proof assistant
  • ...for regular queries [1] (closure operators as primitive)
      • expressive & tractable [2] ⇒ amenable to graph querying
      • logic-based language ⇒ amenable to formal verification

[1] Moshe Y. Vardi: A Theory of Regular Queries. PODS 2016: 1-9
[2] Juan L. Reutter, et al.: Regular Queries on Graph Databases. Th. Comput. Syst. 61(1): 31-83 (2017)

Datalog

  • logic programming without function symbols
  • consists of facts and rules
  • finite model semantics
  • sound, complete and terminating evaluation
      → bottom-up iteration of a fixpoint operator

Datalog Example

Assume we have the following simple graph database instance:

[Figure: graph database instance]

...and that we want to compute all the pairs of potential friends.
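To make "bottom-up iteration of a fixpoint operator" concrete, here is a minimal sketch of naive Datalog evaluation. It is not the engine from the talk: the two-rule `reach` program, the `EDGE` facts, and all function names are hypothetical and chosen only for illustration.

```python
# Naive bottom-up evaluation of a two-rule Datalog program (illustrative only):
#   reach(X, Y) <- edge(X, Y).
#   reach(X, Y) <- reach(X, Z), edge(Z, Y).


def immediate_consequence(edge, reach):
    """One application of the fixpoint operator: all facts derivable in one step."""
    derived = set(edge)  # first rule
    # second rule: join reach(X, Z) with edge(Z, Y) on Z
    derived |= {(x, y) for (x, z) in reach for (z2, y) in edge if z == z2}
    return derived


def bottom_up(edge):
    """Iterate from the empty interpretation until no new fact is derived."""
    reach = set()
    while True:
        nxt = immediate_consequence(edge, reach)
        if nxt == reach:  # fixpoint reached, so evaluation terminates
            return reach
        reach = nxt


EDGE = {("a", "b"), ("b", "c"), ("c", "d")}  # hypothetical facts
REACH = bottom_up(EDGE)
```

Because there are no function symbols, only finitely many facts can ever be derived over the given constants, which is why the loop is guaranteed to stop.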
Regular Datalog

  • safe: head variables in each clause appear in its body
  • recursion is restricted to transitive closure and internalized into special literals
  • distinguished top clause, whose head we call the view
  • all symbols (except the view one) are binary
  • stratified: no clauses with body symbols not previously defined

Datalog vs. Regular Datalog

  • Regular Datalog corresponds to UC2RPQ under transitive closure
      → expressive: allows for complex graph patterns
      → tractable: parallelizability & decidable query containment

Building Blocks

Graph Databases: finite sets of constants $V$ (nodes) and symbols $\Sigma$ (edge labels).

Graph Instance $G$ over $\Sigma$: a set of directed labeled edges $E$, where $E \subseteq V \times \Sigma \times V$.

Graph Database $D(G)$ over $G$: $G$ can be seen as a database $D(G) = \{\, s(n_1, n_2) \mid (n_1, s, n_2) \in E \,\}$.

Path $\rho$ of length $k$ in $G$: a sequence $n_1 \xrightarrow{s_1} n_2 \dots n_{k-1} \xrightarrow{s_k} n_k$.

Path Label: $\lambda(\rho) = s_1 \dots s_k \in \Sigma^{*}$.

Syntax

Regular Datalog (RD) Expressions

  Terms (Node IDs)    $t ::= x \mid n$, where $x \in \mathit{Var}$, $n \in V$
  Atoms               $A ::= s(t_1, t_2)$, where $s \in \Sigma$
  Literals            $L ::= A \mid A^{+}$
  Conjunctive Body    $B ::= L_1 \land \dots \land L_n$, where $n \in \mathbb{N}$
  Disjunctive Body    $D ::= B_1 \lor \dots \lor B_n$, where $n \in \mathbb{N}$
  Clauses             $C ::= (t_1, t_2) \leftarrow D$
  Programs            $\Pi : \Sigma \to \{C_1, \dots, C_n\}$, where $n \in \mathbb{N}$

Regular Queries (RQ) over $G$

  • RD-program $\Pi$
  • distinguished query clause $\Omega$ whose head is the top-level view

Semantics (I/II)

Interpretations ($G$): modeled as indexed relations $(\Sigma \times \{\emptyset, +\}) \to \mathcal{P}(V \times V)$.

Interpretation Well-Formedness (wfG): $G(s, +)$ has to correspond to the transitive closure of $G(s, \emptyset)$:

  $\mathrm{wfG}(G) \iff \forall s,\ \mathrm{is\_closure}(G(s, \emptyset), G(s, +))$
  $\mathrm{is\_closure}(g_s, g_p) \iff \forall (n_1, n_2) \in g_p,\ \exists \rho \in V^{+},\ \mathrm{path}(g_s, n_1, \rho) \land \mathrm{last}(\rho) = n_2$
  $\mathrm{path}(g, n_1, \rho) \iff \forall i \in \{1 \dots |\rho|\},\ (n_i, n_{i+1}) \in g$

Semantics (II/II)

Literal Satisfaction: for $L \equiv s^{l}(n_1, n_2)$, $G \models L \iff (n_1, n_2) \in G(s, l)$.

Clause Satisfaction: for $C \equiv (t_1, t_2) \leftarrow (L_{1,1} \land \dots \land L_{1,n}) \lor \dots \lor (L_{m,1} \land \dots \land L_{m,n})$,

  $G \models_s C \iff \forall \eta,\ \bigvee_{i=1..m} \Big( \bigwedge_{j=1..n} G \models \eta(L_{i,j}) \Big) \Rightarrow G \models \eta(s(t_1, t_2))$.

Program Satisfaction: for $\Pi \equiv \Sigma \to \{C_1, \dots, C_n\}$, $G \models_{\Sigma} \Pi \iff \forall s \in \Sigma,\ G \models_s \Pi(s)$.

View Computation: Clausal Evaluation

For each clause $C \equiv (t_1, t_2) \leftarrow \bigvee_{i=1..n} B_i$ corresponding to $\Pi(s)$:

  • match each conjunctive body $B_i$ against the interpretation:
      extend the substitutions $M^{A}_{G}(a)$ matching a given atom $a$
      ...uniformly, to the entire body, as $M^{B}_{G}(B_i)$
  • collect the substitutions matching the entire disjunctive body

Clausal Consequence Operator:

  $T^{\Pi,s}(G) \equiv \bigcup_{i=1..n} \{\, \sigma(t_1, t_2) \mid \sigma \in M^{B}_{G}(B_i) \,\}$

View Computation: Stratified Program Evaluation

Stratification Condition: a program $\Pi$ is stratified if there exists a mapping $\sigma : \Sigma \to [1; n]$ such that, for all $s$ in $\Sigma$, the $\Pi(s)$ clause $(t_1, t_2) \leftarrow B$ satisfies:

  $\max_{r \in \mathrm{sym}(B)} \sigma(r) < \sigma(s)$

Example:

  $\Pi_1 = \mathit{connected}(X, Y) \leftarrow \mathit{follows}^{+}(X, Y)$
  $\Pi_2 = \mathit{pfriends}(X, Y) \leftarrow ((\mathit{knows} \land (\mathit{connected} \cdot \mathit{connected}^{-})) \lor \mathit{contacted})(X, Y)$
  $\Pi_3 = \mathit{pconnected}(X, Y) \leftarrow \mathit{pfriends}^{+}(X, Y)$

  $\Delta_1 = T^{\Pi_1,\mathit{connected}}(\emptyset) = \mathit{connected} \to \{(\mathit{Alice}, \mathit{Bill}), (\mathit{Bob}, \mathit{Bill})\}$
  $\Delta_2 = T^{\Pi_2,\mathit{pfriends}}(\Delta_1) = \mathit{pfriends} \to \{(\mathit{Alice}, \mathit{Bob}), (\mathit{Bob}, \mathit{Eve})\}$
  $\Delta_3 = T^{\Pi_3,\mathit{pconnected}}(\Delta_1 \cup \Delta_2) = \mathit{pconnected} \to \{(\mathit{Alice}, \mathit{Bob}), (\mathit{Bob}, \mathit{Eve}), (\mathit{Alice}, \mathit{Eve})\}$

Modeling Updates

Updates: an update $\Delta \equiv (\Delta^{+}, \Delta^{-})$ is a pair of disjoint graphs $\Delta^{+}, \Delta^{-}$.

  • $\Delta^{+} \equiv$ bulk insertions
  • $\Delta^{-} \equiv$ bulk deletions

Update Application:

  $G \mathbin{:+:} \Delta \equiv G \setminus \Delta^{-} \cup \Delta^{+}$
  $\Delta\{s \to (g^{+}, g^{-})\} \equiv (\Delta^{+}\{s \to g^{+}\},\ \Delta^{-}\{s \to g^{-} \setminus g^{+}\})$

View Maintenance Computation

With $\Pi_1$, $\Pi_2$, $\Pi_3$ as in the stratified evaluation example:

  $\Delta'_1 = T^{\Pi_1,\mathit{connected}}_{\mathit{supp}}(\Delta_1) = \Delta_1\{\mathit{connected} \to (\{(\mathit{Bill}, \mathit{Bob})\}, \emptyset)\}$
  ...

where $\mathit{supp} = \{\mathit{knows}, \mathit{contacted}, \mathit{follows}^{+}, \mathit{connected}, \mathit{pconnected}\}$.

View Computation vs. Maintenance

For each clause $C \equiv (t_1, t_2) \leftarrow \bigvee_{i=1..n} B_i$ corresponding to $\Pi(s)$:

Incremental Clausal Consequence Operator:

  $T^{\Pi,s}_{G,\mathit{supp}}(\Delta) = \begin{cases} T^{\Pi,s}(G \mathbin{:+:} \Delta), & (s \notin \mathit{supp}) \lor (\Delta^{-} \cup D) \neq \emptyset \\ \bigcup_{i=1..n} M^{B}_{G,\Delta}(B_i), & \text{otherwise} \end{cases}$

Full Engine Overview

  • stratified, single-pass, bottom-up heuristic
  • non-recursive (recursion internalized in closure computation)
  • supports both view computation and incremental maintenance
  • core component: clause evaluation
      forward-chain clausal consequence operator (fwd_or_clause)
      based on a matching algorithm
      corresponds to computing a nested-loop join

    Fixpoint fwd_program Π G supp ∆ ΣB ΣC : edelta :=
      match ΣC with
      | [::] => ∆
      | [:: s & ss] =>
        let (arg, body) := Π s in
        let ∆' := fwd_or_clause G supp ∆ s arg body in
        let ∆' := compute_closures G ∆' s in
        fwd_program Π G supp ∆' (s ∪ s+ ∪ ΣB) ss
      end.

Regular Datalog: Engine Characterization

Theorem (Soundness)

  • $\Pi$: a safe, stratifiable, Regular Datalog program
  • $\Sigma$: its set of symbols
  • $G$: a graph instance
  • $\Delta$: an update

The IVM-engine cumulatively processes the symbols in $\Sigma$, such that if:

  • the already processed symbols, $\Sigma_B$, are a well-formed $\Pi$-slice
  • $\Delta$ only modifies $\Sigma_B$, i.e., $\mathrm{sym}(\Delta) \subseteq \Sigma_B$
  • $G \mathbin{:+:} \Delta \models_{\Sigma_B} \Pi$

then it outputs $\Delta_O$ such that $G \mathbin{:+:} \Delta_O \models_{\Sigma} \Pi$.
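The stratified evaluation of Π1, Π2, Π3 can be replayed with a small sketch. This is not the verified Coq engine: the base graph `G` below is an assumption (the slide's instance figure is not available), picked so that the computed views agree with ∆1, ∆2 and ∆3 as listed in the example, and the helper names are hypothetical. Closure is computed inline here rather than through a separate closure pass.

```python
def compose(r, s):
    """Relational composition r · s, written as a nested-loop join."""
    return {(x, y) for (x, z) in r for (z2, y) in s if z == z2}


def inverse(r):
    """Inverse relation r^-."""
    return {(y, x) for (x, y) in r}


def closure(r):
    """Transitive closure r^+, by iterating composition to a fixpoint."""
    result = set(r)
    while True:
        step = result | compose(result, r)
        if step == result:
            return result
        result = step


# ASSUMPTION: hypothetical base graph, chosen to reproduce the slide's results.
G = {
    "follows": {("Alice", "Bill"), ("Bob", "Bill")},
    "knows": {("Alice", "Bob")},
    "contacted": {("Bob", "Eve")},
}

# One clause per symbol, listed in stratification order: every body only
# mentions base symbols or symbols defined by an earlier clause.
PROGRAM = [
    ("connected", lambda i: closure(i["follows"])),
    ("pfriends", lambda i: (i["knows"]
                            & compose(i["connected"], inverse(i["connected"])))
                           | i["contacted"]),
    ("pconnected", lambda i: closure(i["pfriends"])),
]


def evaluate(graph, program):
    """Single bottom-up pass: evaluate each clause once, in stratum order."""
    interp = dict(graph)
    for symbol, body in program:
        interp[symbol] = body(interp)
    return interp


VIEWS = evaluate(G, PROGRAM)
```

Because recursion only appears as closure, one pass over the strata suffices; no global fixpoint loop over the whole program is needed.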
Incremental View Maintenance (IVM)

[Diagram: the update $\Delta$ takes $G$ to $G \mathbin{:+:} \Delta$; recomputation evaluates $\Pi$ on $G \mathbin{:+:} \Delta$ to obtain $V[G \mathbin{:+:} \Delta]$, while maintenance applies $V^{\Delta}$ to $V[G]$.]

$G \equiv$ base graph; $\Pi \equiv$ RD program; $V \equiv$ top-view; $\Delta \equiv$ update.

Soundness: if $V[G] \models \Pi_{G}$, the engine outputs an incremental update $V^{\Delta}$ s.t. $V[G] \mathbin{:+:} V^{\Delta} \models \Pi_{G :+: \Delta}$.

Incremental ∆-Matching

Compute $V^{\Delta}$ s.t. $V[G \mathbin{:+:} \Delta] = V[G] \mathbin{:+:} V^{\Delta}$ with a delta program [1].

Idea: distributing deltas over joins and factoring.

Delta Program: for $V \leftarrow L_1, \dots, L_n$, a delta program is a set of clauses of the form

  $V^{\Delta} \leftarrow L_1^{\nu}, \dots, L_{i-1}^{\nu}, L_i^{\Delta}, L_{i+1}, \dots, L_n$

[1] Ashish Gupta, et al.: Maintaining Views Incrementally. SIGMOD 1993: 157-166
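As an illustration of the delta-program idea, the sketch below maintains a two-literal join view under insertions only and checks that maintenance agrees with recomputation. Everything here is an assumption made for the example: the view `V(X, Y) ← r(X, Z), s(Z, Y)`, the symbols `r` and `s`, the data, and the function names. Deletions are deliberately left out of the delta rules, since handling them needs extra machinery that the slides do not spell out.

```python
def join(r, s):
    """Join r(X, Z) with s(Z, Y) on Z, as a nested loop."""
    return {(x, y) for (x, z) in r for (z2, y) in s if z == z2}


def apply_update(graph, delta_plus, delta_minus):
    """G :+: Δ per symbol: remove the deletions, then add the insertions."""
    symbols = set(graph) | set(delta_plus) | set(delta_minus)
    return {
        sym: (graph.get(sym, set()) - delta_minus.get(sym, set()))
             | delta_plus.get(sym, set())
        for sym in symbols
    }


def view(graph):
    """Hypothetical view V(X, Y) <- r(X, Z), s(Z, Y)."""
    return join(graph["r"], graph["s"])


def delta_view(graph, delta_plus):
    """Insert-only delta program for the view, one rule per body position:
         V^Δ <- r^Δ, s        (i = 1)
         V^Δ <- r^ν, s^Δ      (i = 2), where r^ν is r after the update.
    """
    r_old, s_old = graph["r"], graph["s"]
    r_delta = delta_plus.get("r", set())
    s_delta = delta_plus.get("s", set())
    r_new = r_old | r_delta
    return join(r_delta, s_old) | join(r_new, s_delta)


G = {"r": {(1, 2)}, "s": {(2, 3)}}             # hypothetical base graph
DELTA_PLUS = {"r": {(4, 2)}, "s": {(2, 5)}}    # hypothetical bulk insertions

MAINTAINED = view(G) | delta_view(G, DELTA_PLUS)     # V[G] :+: V^Δ
RECOMPUTED = view(apply_update(G, DELTA_PLUS, {}))   # V[G :+: Δ]
```

Each delta rule joins one changed literal against the rest of the body, so the work is proportional to the size of the update rather than to the size of the whole graph.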
