Streaming Graph Algorithms

Streaming Graph Algorithms

Streaming Graph Algorithms CSC 581 Christopher Siu 1 / 22 Finding the Majority Element Consider the following problem: Given an unsorted list of arbitrary elements, find the element that makes up more than half the elements in the list. Example Given: [−1; 8; λ, 8; 8; a; ?; 8; ?; 94; 8; 8; 8; 8; ;] . return 8. 2 / 22 Naïve Solution MajorityElement(A [a1; a2; : : : ; an]) Input: A finite list of n elements, A Output: The majority element in the list 1: for i 1 to n do 2: let count 0 3: for j 1 to n do 4: if aj = ai then 5: count count + 1 n 6: if count > b =2c then 7: return ai 8: return “None” This algorithm runs in On2 time and O(1) space. 3 / 22 Divide and Conquer Solution MajorityElement(A [a1; a2; : : : ; an]) Input: A finite list of n elements, A Output: The majority element and the number of times it occurs 1: if n = 1 then 2: return (a1; 1) 3: else n 4: let mid b =2c 5: let (al; cl) MajorityElement([a1; : : : ; amid]) 6: let (ar; cr) MajorityElement([amid+1; : : : ; an]) 7: if al = “None” and cr = “None” then 8: return (“None”; −1) 9: else if al = ar then 10: return (al; cl + cr) . 4 / 22 Divide and Conquer Solution MajorityElement(A [a1; a2; : : : ; an]) . 11: else 12: cl cl + Count(al; [amid+1; : : : ; an]) 13: cr cr + Count(ar; [a1; : : : ; amid]) n 14: if al 6= “None” and cl > b =2c then 15: return (al; cl) n 16: else if ar 6= “None” and cr > b =2c then 17: return (ar; cr) 18: else 19: return (“None”; −1) This algorithm runs in O(n log n) time and O(n) space. 5 / 22 Hashing Solution MajorityElement(A [a1; a2; : : : ; an]) Input: A finite list of n elements, A Output: The majority element in the list 1: let T [] 2: for i 1 to n do 3: if ai 2= T then 4: T [ai] 1 5: else 6: T [ai] T [ai] + 1 n 7: if T [ai] > b =2c then 8: return ai 9: return “None” This algorithm runs in O(n) time and O(n) space. 6 / 22 Boyer-Moore Majority Vote Algorithm MajorityVote(A [a1; a2; : : : ; an]) Input: A finite list of n elements, A Output: The majority element in the list, assuming one exists 1: let x a1 2: let votes 1 3: for i 2 to n do 4: if votes = 0 then 5: x ai 6: votes 1 7: else if ai = x then 8: votes count + 1 9: else 10: votes count − 1 11: return x 7 / 22 Boyer-Moore Majority Vote Algorithm Example Let A = [−1; 8; λ, 8; 8; a; ?; 8; ?; 94; 8; 8; 8; 8; ;]. Element Candidate Votes (continued) −1 −1 1 ?? 1 8 −1 0 94 ? 0 λ λ 1 8 8 1 8 λ 0 8 8 2 8 8 1 8 8 3 a 8 0 8 8 4 ?? 1 ; 8 3 8 ? 0 return 8 This algorithm runs in O(n) time and O(1) space. 8 / 22 Boyer-Moore Majority Vote Algorithm In a sense, this algorithm is “complete”, but not “sound”. If there is a majority element, this algorithm will find it. If this algorithm finds an element, that element is not necessarily the majority element. MajorityElement(A [a1; a2; : : : ; an]) Input: A finite list of n elements, A Output: The majority element in the list 1: let x MajorityVote(A) n 2: if Count(x; A) > b =2c then 3: return x 4: else 5: return “None” This algorithm, too, runs in O(n) time and O(1) space. 9 / 22 Boyer-Moore Majority Vote Algorithm The Majority Vote algorithm loses context. The algorithm doesn’t keep complete occurrence information. The algorithm is not capable of backtracking. The algorithm discards data it has already processed. The algorithm is capable of operating in constant space. Karp, Shenker, and Papadimitriou generalized the Majority Vote 1 algorithm to find all elements with freqeuncy > θ in O =θ space, although it requres two passes. 10 / 22 Streaming Algorithms A streaming algorithm processes an indefinite stream of elements, S = hs1; s2; : : : ; sn;:::i. Since the goal is to avoid storing the dataset in memory, the algorithm must operate under severe memory constraints. Since the algorithm cannot know the future, it must process data as it arrives, in the order it arrives. Streaming algorithms are ideal for processing massive datasets or responding to real-time events. 11 / 22 The Seven Bridges of Königsberg A graph G = (V; E) consists of a set of vertices, V , and a set of edges, E. By convention, jV j = n and jEj = m. Each edge e = (u; v) connects two vertices, u and v. Graphs are used to represent relationships between entities. 12 / 22 Graphs Graphs can be used to represent: Bridges and landmasses Diseased populations Cities and roads States and transitions Computers on the Internet Friends on social networks Flights between airports ... Given a graph, we might be interested in: Eulerian paths Densest subgraphs Hamiltonian paths Shortest paths Spanning trees Largest cliques Network flow ... 13 / 22 Finding A Minimum Spanning Tree Consider the following problem: Given a connected, undirected, weighted graph, find the tree of minimum total weight that connects all vertices. Example Given: v 3 6 8 2 u x 7 2 7 w . return f(u; v; 3) ; (u; x; 2) ; (w; x; 2)g. 14 / 22 Kruskal’s Algorithm MinimumSpanningTree(G (V; E)) Input: A connected, undirected, weighted graph G Output: A minimum spanning tree of G 1: let T be an empty tree 2: while E 6= ; do 3: let e MinWeightEdge(E) 4: E E n feg 5: if T [ feg does not contain a cycle then 6: T T [ feg 7: return T This algorithm runs in O (m log m) time and O (n + m) space. Selecting the minimum weight edge requires that the algorithm access the edge set multiple times. 15 / 22 Graph Streaming Algorithms Often, a semi-streaming model must be used. An algorithm is allowed O(n polylog n) space. The time constraint is similarly relaxed. McGregor distinguishes three types of graph streams: Insert-Only An unordered stream of edges, S = he1; e2; : : : ; emi Graph Sketch A stream of edge additions and removals, S = ha1; a2; : : : ; aki, where ai = (ei; δi 2 f1; −1g). Sliding Window An indefinite stream where at time t, only the last w edges are considered, S = h: : : ; et−w+1; : : : ; et−1; et;:::i. 16 / 22 Finding a Minimum Spanning Tree The Cycle Property For any cycle C in a graph G, if e is the strictly maximum weight edge in C, then e is not part of any MST of G. MinimumSpanningTree(G he1; e2; : : : ; emi) Input: A stream of edges defining a connected, undirected graph G Output: A minimum spanning tree of G 1: let T be an empty tree 2: for e 2 G do 3: T T [ feg 4: if T contains a cycle, C then 5: T T n fMaxWeightEdge(C)g 6: return T This algorithm only requires one pass through the edge set. 17 / 22 Selected Graph Streaming Algorithms Algorithm Problem Insert-Only Graph Sketch Sliding Window Connectivity X X X Two-coloring X X X Min. (1 + )-approx. or (1 + )-approx. Spanning Tree X O (log n) passes Max. 2-approx. Multiple passes (3 + )-approx. Matching McGregor, Feigenbaum et al. 18 / 22 Future Work Most research has been done with undirected graphs, yet many naturally occurring graphs are directed. Constructing a graph generally does not produce edges in a truly random order; can we take advantage of that fact? 19 / 22 Questions? 20 / 22 References 1 R. M. Karp, S. Shenker, and C. H. Papadimitriou. “A Simple Algorithm for Finding Frequent Elements in Streams and Bags”. ACM Transactions on Database Systems, Vol. 28, No. 1, 2003, pp. 51-55. [Online]. Available: https://dl.acm.org/citation.cfm?id=762473 2 R. S. Boyer and J. S. Moore. “MJRTY — A Fast Majority Vote Algorithm”, in Automated Reasoning: Essays in Honor of Woody Bledsoe R. S. Boyer, Ed. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1991, pp. 105-117. [Online]. Available: http://www.cs.utexas.edu/~moore/best-ideas/mjrty/index.html 3 N. Alon, Y. Matias, and M. Szegedy. “The Space Complexity of Approximating the Frequency Moments”, in Proc. of the 28th Annual ACM Symposium on Theory of Computing, 1996, pp. 20-29. [Online]. Available: https://dl.acm.org/citation.cfm?id=237823 4 A. McGregor. “Graph Stream Algorithms: A Survey”, in ACM SIGMOD Record, Vol. 43, No. 1, 2014, pp. 9-20. [Online]. Available: https://dl.acm.org/citation.cfm?id=2627694 21 / 22 References 5 R. E. Tarjan. “Minimum Spanning Trees”, in Data Structures and Network Algorithms, Philadelphia, USA: Society for Industrial and Applied Mathematics, 1983, pp. 71-83. 6 J. Feigenbaum, S. Kannan, A. McGregor, S. Suri, and J. Zhang. “On Graph Problems in a Semi-Streaming Model”, in Theoretical Computer Science, Vol. 348, No. 2, 2005, pp. 207-216. 7 D. Ediger and J. P. Fairbanks. “Deriving Streaming Graph Algorithms from Static Definitions”, in IEEE Intl. Parallel and Distributed Processing Symposium Workshops, 2017, pp. 637-642. [Online]. Available: https://ieeexplore.ieee.org/document/7965103/ 8 J. Riedy and D. A. Bader. “Multithreaded Community Monitoring for Massive Streaming Graph Data”, in IEEE 27th Intl. Symposium on Parallel and Distributed Processing Workshops and PhD Forum, 2013, pp. 1646-1655. [Online]. Available: https://ieeexplore.ieee.org/document/6651061/ 22 / 22.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    22 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us