GRAIL: Scalable Reachability Index for Large Graphs
Problem Definition & Motivation | Background | Our Approach : GRAIL | Experiments | Conclusion & Future Work

Hilmi Yıldırım (1), Vineet Chaoji (2), Mohammed J. Zaki (1)
(1) Rensselaer Polytechnic Institute, Troy, NY
(2) Yahoo! Labs, Bangalore, India
14 September, VLDB 2010

Outline
• Problem Definition & Motivation
• Background
  – Related Work
  – Interval Labeling
• Our Approach : GRAIL
  – Index Construction
  – Querying
• Experiments
  – Experimental Setup & Datasets
  – Results and Comparison with Other Methods
  – Sensitivity to Different Graph Types and Parameters
• Conclusion & Future Work

Problem Definition
Reachability Query: Given two vertices u and v in a directed acyclic graph G, is there a path from u to v?
• Simple in undirected graphs
• Any directed graph can be transformed into a DAG
[Figure: example DAG with vertices A through J]
• Query(B,I): Reachable
• Query(D,B): Not Reachable

Motivation
Traditional Applications
• Class hierarchies, GIS, dependency graphs
Trending Applications
• Semantic Web
• Biological networks
• Citation graphs
Motivation
• Existing methods do not scale for large and dense graphs

Related Work

Method                                Construction Time   Query Time        Index Size
Opt. Tree Cover (Agrawal et al. 89)   O(nm)               O(n)              O(n^2)
GRIPP (Trissl et al. 07)              O(m + n)            O(m - n)          O(m + n)
Dual Labeling (Wang et al. 06)        O(n + m + t^3)      O(1)              O(n + t^2)
PathTree (Jin et al. 08)              O(mk)               O(mk)/O(mn)       O(nk)
2HOP (Cohen et al. 03)                O(n^4)              O(√m)             O(n√m)
HOPI (Schenkel et al. 05)             O(n^3)              O(√m)             O(n√m)
GRAIL (this paper)                    O(d(n + m))         O(d)/O(n + m)     O(dn)

                    Full Transitive Closure   DFS/BFS
Construction Time   O(nm)                     O(1)
Query Time          O(1)                      O(n + m)
Index Size          O(n^2)                    O(1)

Interval Labeling on a Tree
Post-Order Labeling
• Interval of u is [s(u), e(u)]
• e(u) is the post-order value of node u
• s(u) is the min of e(v) where u reaches v
[Figure: tree on nodes 0 through 9, labeled one node at a time in post-order. Final labels: 0 [1,10]; 1 [1,6]; 2 [7,9]; 3 [1,4]; 4 [5,5]; 5 [7,8]; 6 [7,7]; 7 [1,3]; 8 [1,1]; 9 [2,2]]

Interval Labeling on a DAG
[Figure: DAG on nodes 0 through 9. Labels: 0 [1,10]; 1 [1,6]; 2 [1,9]; 3 [1,4]; 4 [1,5]; 5 [1,8]; 6 [1,7]; 7 [1,3]; 8 [1,1]; 9 [2,2]]
• False positives on DAGs, such as 6 → 9
• Variants of Interval Labeling
  – Tree Cover
  – Optimal Tree Cover
  – GRIPP
  – PathTree
[Figure: the same DAG with the tree intervals restored (2 [7,9]; 4 [5,5]; 5 [7,8]; 6 [7,7]) and additional intervals [1,1], [1,1] and [1,4] added over successive builds]

GRAIL : Graph Reachability Indexing via RAndomized Interval Labeling
Key Observations
• No false negatives.
• Interval labeling is repeatable with different traversals.
GRAIL Index Construction
• For each dimension of the index
  – Generate a randomized post-order labeling
• Each label corresponds to a dimension of the hyperrectangle that node represents.
• Each new dimension reduces the number of exceptions.

GRAIL in action
• Many exceptions after the first traversal.
[Figure: DAG labels after the first traversal, beginning 0 [1,10]; 1 [1,6]; 2 [1,9]]
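
The labeling and index construction described on the slides can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (post_order_labels, build_index, reachable) are hypothetical, the edge set of the example DAG is not given in the text and was chosen so that a fixed left-to-right traversal reproduces the DAG labels shown above, and resolving possible false positives with a label-pruned DFS is an assumption suggested by the O(d)/O(n + m) query time in the Related Work table.

```python
import random

# Assumed edge set for the 10-node example DAG (not stated in the slides).
DAG = {0: [1, 2], 1: [3, 4], 2: [3, 5], 3: [7], 4: [8],
       5: [6], 6: [8], 7: [8, 9], 8: [], 9: []}


def post_order_labels(graph, roots, rng=None):
    """Return {u: (s, e)}: e is the post-order rank of u and s is the
    smallest rank among all nodes reachable from u (u included)."""
    labels = {}
    rank = 0

    def visit(u):
        nonlocal rank
        children = list(graph.get(u, ()))
        if rng is not None:
            rng.shuffle(children)          # randomized traversal order
        low = None
        for c in children:
            if c not in labels:
                visit(c)
            low = labels[c][0] if low is None else min(low, labels[c][0])
        rank += 1
        labels[u] = (rank if low is None else low, rank)

    order = list(roots)
    if rng is not None:
        rng.shuffle(order)
    for r in order:
        if r not in labels:
            visit(r)
    return labels


def build_index(graph, roots, d, seed=0):
    """One randomized labeling per dimension, d dimensions in total."""
    rng = random.Random(seed)
    return [post_order_labels(graph, roots, rng) for _ in range(d)]


def contains(outer, inner):
    """True if interval inner lies inside interval outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def reachable(graph, index, u, v):
    """Answer 'does u reach v?' using the labels, then a pruned DFS."""
    if u == v:
        return True
    # Non-containment in any dimension proves unreachability,
    # because the labeling has no false negatives.
    if any(not contains(lab[u], lab[v]) for lab in index):
        return False
    # Containment everywhere may still be a false positive: confirm by DFS,
    # skipping children whose labels already rule out v (assumed fallback).
    stack, seen = [u], {u}
    while stack:
        x = stack.pop()
        for c in graph.get(x, ()):
            if c == v:
                return True
            if c not in seen and all(contains(lab[c], lab[v]) for lab in index):
                seen.add(c)
                stack.append(c)
    return False


fixed = post_order_labels(DAG, [0])        # single deterministic traversal
index = build_index(DAG, [0], d=3)         # three randomized traversals
```

With the single fixed traversal, the interval of node 6 is (1, 7) and that of node 9 is (2, 2), so containment alone would wrongly report 6 → 9; reachable(DAG, index, 6, 9) returns False once the extra dimensions or the DFS check rule it out.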