Semiring Provenance Over Graph Databases Tapp - King’S College London July 12, 2018
Total Page:16
File Type:pdf, Size:1020Kb
Preliminaries Provenance of an RPQ Experiments Conclusion Semiring Provenance over Graph Databases TaPP - King’s College London July 12, 2018 Joint work with Yann Ramusat Silviu Maniu, Pierre Senellart École normale supérieure Inria LRI 1/30 Preliminaries Provenance of an RPQ Experiments Conclusion Contents 1 Preliminaries Formal Definitions Known Algorithms 2 Provenance of an RPQ Graph Transformation Algorithms 3 Experiments STIF Network Results 4 Conclusion Conclusion 2/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t What if these integers represent time to move between two vertices and we want the minimum travel time between s and t? 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t What if these integers represent time to move between two vertices and we want the minimum travel time between s and t?2 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t What if these integers represent time to move between two vertices and we want the minimum travel time between s and t?2 And now if we only want to consider paths going through a b edge? 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t What if these integers represent time to move between two vertices and we want the minimum travel time between s and t?2 And now if we only want to consider paths going through a b edge? 8 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t And if these integers represent security restrictions? What is the minimum security clearance needed to go from s to t? 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t And if these integers represent security restrictions? What is the minimum security clearance needed to go from s to t?2 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t And if these integers represent security restrictions? What is the minimum security clearance needed to go from s to t?2 Considering only paths avoiding b edges? 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Working example r a,5 a,0 s q b,3 a,2 t And if these integers represent security restrictions? What is the minimum security clearance needed to go from s to t?2 Considering only paths avoiding b edges?2 3/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Overview We propose several algorithms able to solve: 1 routing information when probabilistic information is present (e.g., road closures, uncertain travel time); 2 top-k relevant paths; or 3 accessibility under security restrictions; by computing semiring-based provenance of graph queries. Each of these algorithms yields a tradeoff between generality and time complexity. 4/30 Preliminaries Provenance of an RPQ Experiments Conclusion Graph Database with Provenance Indication Definition (Graph Database) A graph database G over Σ is a pair (V ; E), where V is a finite set of node ids and E ⊆ V × Σ × V . Annotations are given by a weight function, w : E ! S for (S; ⊕; ⊗; ¯0; ¯1) a semiring. r a,5 a,0 q s b,3 a,2 t 5/30 Preliminaries Provenance of an RPQ Experiments Conclusion Graph Database with Provenance Indication Definition (Graph Database) A graph database G over Σ is a pair (V ; E), where V is a finite set of node ids and E ⊆ V × Σ × V . Annotations are given by a weight function, w : E ! S for (S; ⊕; ⊗; ¯0; ¯1) a semiring. r a,5 a,0 q s b,3 a,2 t 5/30 Preliminaries Provenance of an RPQ Experiments Conclusion Graph Database with Provenance Indication Definition (Graph Database) A graph database G over Σ is a pair (V ; E), where V is a finite set of node ids and E ⊆ V × Σ × V . Annotations are given by a weight function, w : E ! S for (S; ⊕; ⊗; ¯0; ¯1) a semiring. r a,5 a,0 q s b,3 a,2 t 5/30 Preliminaries Provenance of an RPQ Experiments Conclusion Basic Semiring Theory An element a 2 K is idempotent if a ⊕ a = a. K is said to be idempotent when all elements are. Definition (k-closed Semiring) Let k ≥ 0 be an integer. A semiring (K; ⊕; ⊗; ¯0; ¯1) is k-closed if k+1 k 8a 2 K; L an = L an: n=0 n=0 Definition (Star semiring) A star semiring is a semiring (K; ⊕; ⊗; ¯0; ¯1) with an additional unary operator ∗ verifying: 8a 2 K; a∗ = 1 ⊕ (a ⊗ a∗); a∗ = 1 ⊕ (a∗ ⊗ a): 6/30 Preliminaries Provenance of an RPQ Experiments Conclusion Basic Semiring Theory An element a 2 K is idempotent if a ⊕ a = a. K is said to be idempotent when all elements are. Definition (k-closed Semiring) Let k ≥ 0 be an integer. A semiring (K; ⊕; ⊗; ¯0; ¯1) is k-closed if k+1 k 8a 2 K; L an = L an: n=0 n=0 Definition (Star semiring) A star semiring is a semiring (K; ⊕; ⊗; ¯0; ¯1) with an additional unary operator ∗ verifying: 8a 2 K; a∗ = 1 ⊕ (a ⊗ a∗); a∗ = 1 ⊕ (a∗ ⊗ a): 6/30 Preliminaries Provenance of an RPQ Experiments Conclusion Basic Semiring Theory An element a 2 K is idempotent if a ⊕ a = a. K is said to be idempotent when all elements are. Definition (k-closed Semiring) Let k ≥ 0 be an integer. A semiring (K; ⊕; ⊗; ¯0; ¯1) is k-closed if k+1 k 8a 2 K; L an = L an: n=0 n=0 Definition (Star semiring) A star semiring is a semiring (K; ⊕; ⊗; ¯0; ¯1) with an additional unary operator ∗ verifying: 8a 2 K; a∗ = 1 ⊕ (a ⊗ a∗); a∗ = 1 ⊕ (a∗ ⊗ a): 6/30 Preliminaries Provenance of an RPQ Experiments Conclusion Examples of Semirings Tropical: T = (R+ [ f1g; min; +; 1; 0) to compute shortest-distance. k kTropical: Tk is the set of k-tuples in (R+ [ f1g) ordered for the natural order of R [ f1g): k Tk = f(a1;:::; ak ) 2 (R+ [ f1g) : 0 ≤ a1 ≤ · · · ≤ ak g: a ⊕k b = mink ((ai )i [ (bj )j ) and a ⊗k b = mink ((ai + bj )i;j ) used to compute top-k shortest-distance. Security: Sn is then (f0;:::; n + 1g; min; max; n + 1; 0). Integers correspond to levels such as Top Secret, Secret, Restricted, etc... to compute under security restrictions. Counting:( N [ f1g) together with a star operation: 0∗ = 1 and 8a 2 N; a∗ = 1 to compute number of paths between two nodes. 7/30 Preliminaries Provenance of an RPQ Experiments Conclusion Examples of Semirings Tropical: T = (R+ [ f1g; min; +; 1; 0) to compute shortest-distance. k kTropical: Tk is the set of k-tuples in (R+ [ f1g) ordered for the natural order of R [ f1g): k Tk = f(a1;:::; ak ) 2 (R+ [ f1g) : 0 ≤ a1 ≤ · · · ≤ ak g: a ⊕k b = mink ((ai )i [ (bj )j ) and a ⊗k b = mink ((ai + bj )i;j ) used to compute top-k shortest-distance. Security: Sn is then (f0;:::; n + 1g; min; max; n + 1; 0). Integers correspond to levels such as Top Secret, Secret, Restricted, etc... to compute under security restrictions. Counting:( N [ f1g) together with a star operation: 0∗ = 1 and 8a 2 N; a∗ = 1 to compute number of paths between two nodes.