Parallel Ordered Sets Using Join

PARALLEL SEARCH TREES Yihan Sun CMU 15-853 Why trees? Very important data structure in almost all areas in computer science Maintain ordering on the keys – search tree Build index for databases or search engines Support ordered sets and maps, priority queues, …, useful as a subroutine in many algorithms Range searching: useful in database systems and computational geometry algorithms …… Balance Search Trees (BSTs) Many algorithms on trees have cost proportional to tree height Need balancing schemes to keep tree nearly balanced Balancing scheme: a set of invariants on trees AVL trees, weight-balanced trees, splay trees, treaps, B trees, …… In this lecture we use balanced binary search trees Not limited to any specific balancing schemes. The algorithms/bounds hold at least for: AVL trees, red-black trees, weight-balance trees and treaps What we want to do with trees Searching The first or last entry, the entry of certain rank, … Insertion and deletion Cost is order of the tree height, 푂(log 푛) for balance trees Construction Filter, map-reduce, … Bulk insertion and deletion Merging two trees (or getting the common elements) Can be highly-parallelized Goal: work-efficient (optimal) and polylogarithmic depth In this lecture Parallel algorithms on trees Applications that can be solved using assorted tree structures Next: Preliminaries Preliminaries 푇 = join(푇퐿, 푒, 푇푅): connects two trees 푇퐿 and 푇푅 with 푒, but get the result balanced 푇 = 푒 in-order(푇)=[in-order(푇퐿), 푒, in-order(푇푅)] 푇푅 푇퐿 Preliminaries 푇 = join(푇퐿, 푒, 푇푅): connects two trees 푇퐿 and 푇푅 with 푒, but get the result balanced 푇 = join2(푇퐿, 푇푅): similar as join but without the middle entry Both can finish in O(log 푛), where 푛 is the larger tree size (for join a tighter bound is 푂( log 푇퐿 − log 푇푅 )) ⟨푇퐿, 푏, 푇푅⟩= split(푇, 푘) 푇퐿 contains all keys in 푇 smaller than 푘 푇푅 contains all keys in 푇 greater than 푘 A bit 푏 indicating whether 푘 ∈ 푇 costs O(log 푛), where 푛 is the size of 푇 Next: simple parallel algorithms on trees Parallel Search trees It is very easy to design parallel algorithms on trees using divide-and-conquer scheme Recursively deal with two subtrees in parallel Combine results of recursive calls and the root Usually gives polylogarithmic bounds on depth func(T, …) { if (T is empty) return base_case; M = do_something(T.root); in parallel: L=func(T.left, …); R=func(T.right, …); return combine_results(L, R, M, …) } Map and reduce Maps each entry on the tree to a certain value using function map, then reduce all the mapped values using reduce (with identity identity). Assume map and reduce both have constant cost. map_reduce(Tree T, function map, function reduce, value_type identity) { if (T is empty) return identity; M=map(t.root); in parallel: L=map_reduce(T.left, map, reduce, identity); R=map_reduce(T.right, map, reduce, identity); return reduce(reduce(L, M, R)); 푂(푛) work and 푂(log 푛) depth Filter Select all entries in the tree that satisfy function 푓 Return a tree of all these entries filter(Tree T, function f) { if (T is empty) return an empty tree; in parallel: L=filter(T.left, f); R=filter(T.right, f); if (f(T.root)) return join(L, T.root, R); else return join2(L, R); 푂(푛) work and 푂(log2 푛) depth Construction T=build(Array A, int size) { 퐴′=parallel_sort(A, size); return build_sorted(퐴′, 푠); } T=build_sorted(Arrary A, int start, int end) { if (start == end) return an empty tree; if (start == end-1) return singleton(A[start]); mid = (start+end)/2; in parallel: 푂(푛) work and L = build_sorted(A, start, mid); 푂(log 푛) depth R = build_sorted(A, mid+1, end); return join(L, A[mid], R); 푂(푛 log 푛) work and 푂(log 푛) depth, bounded by the sorting algorithm Output to array Output the entries in a tree 푇 to an array in its in- order Assume each tree node stores its subtree size (an empty tree has size 0) 풆 to_array(Tree T, array A, int offset) { if (T is empty) return; 푇. 푙푒푓푡 푇. 푟푔ℎ푡 A[offset+T.left.size] = get_entry(T.root); in parallel: to_array(T.left, A, offset); 풆 to_array(T.right, A, offset+T.left.size()+1); The size of the left subtree 푂(푛) work and 푂(log 푛) depth Brief Summary Parallel algorithms on trees Polylogarithmic depth Takeaway: design parallel divide-and-conquer algorithms on trees Next: tree-tree operations: union, intersection, difference Combine two indexes in database Subroutine in some applications, e.g., the range tree …… In this lecture: lower bound, a divide-and-conquer algorithm, the cost analysis Merging Two Trees of Size n and m (n≥m) Solution 1: flatten trees into arrays, merge with moving pointers: 0 4 7 8 Cost: 푂 푚 + 푛 1 2 3 5 6 9 What if 푛 ≫ 푚? O 푛 ? 0 1 2 3 4 5 6 7 8 9 Solution 2: insert the entries in the smaller tree into the larger tree 5 Cost: 푂 푚 log 푛 7 2 69 4 8 What if 푛 = 푚? 1 3 76 9 0 0 4 7 8 푂 푛 log 푛 ? 15 What is the minimum cost? The Lower Bound of Merging Two Ordered Sets Choose 푛 slots for the elements in the first set among all 푚 + 푛 available slots in the final result. Set 1 (size 푚): Set 2 (size 푛 ≥ 푚): 0 4 7 8 1 2 3 5 6 9 풎 + 풏 풎 + 풏 = Result: 0 1 2 3 4 5 6 7 8 9 풎 풏 cases 풎+풏 풏 Lower bound: 퐥퐨퐠 = 횯 풎 퐥퐨퐠 + ퟏ ퟐ 풎 풎 The Lower Bound of Merging Two Ordered Sets The lower bound 풏 푶 풎 퐥퐨퐠 + ퟏ 풎 When 푚 = 푛, it is 푂(푛) When n ≫ 푚, it is about 푂 푚 log 푛 (e.g., when 푚 = 1, it is 푂 log 푛 ) Can we give an algorithm achieving this bound? The Union Function Base case 7 5 4 8 2 9 0 1 3 6 The Union Function 5 4 7 2 9 0 8 1 3 6 푳ퟏ 푹ퟏ 푳ퟐ 푹ퟐ The Union Function 5 4 2 7 9 0 1 3 8 6 Union Union The Union Function 5 2 7 1 3 6 8 0 4 9 푻푳 푻푹 The Union Function 5 2 7 1 3 6 8 0 4 9 Similarly we can implement intersection and difference. The Cost Theorem 1. For AVL trees, red-black trees, weight-balance trees and treaps, the above algorithm of merging two balanced 푛 BSTs of sizes 푚 and 푛 (푚 ≤ 푛) have 푂 푚 log + 1 work and 푚 푂(log 푚 log 푛) depth (in expectation for treaps). The Cost Lemma 1. The Join work can be asymptotically bounded by its corresponding Split. Lemma 2. Split a tree 푻 of size 풏 costs time 푶(퐥퐨퐠 풏). The depth is 푶 퐥퐨퐠 풎 퐥퐨퐠 풏 ℎ1 steps to reach the base case, O(ℎ2) cost for each split The Split Work Union split 8 4 9 0 푻ퟏ Union 5 Union split split 4 8 2 7 0 푻 9 ퟐ ^ 1 3 6 8 0 4 ^ 9 Union Union Union Union Concavity: 푘 ∑푎 The Split Work ෍ 푎 ≤ 푘 log 푘 =1 split 8 4 9 log |푇1| = log 푛 0 푻ퟏ (|푇|: the size of tree 푇) 푇21 5 푇22 split split 4 8 log |푇21| + log |푇22| 2 7 푛 0 9 ≤ 2 log 2 …… 푻 log |푇 | + log |푇 | ퟐ 31 32 1 3 6 ^ + log 푇 + log |푇 | 8 log |푇 |33+ ⋯ + log 34|푇 ℎ| ℎ1 푛 ℎ2 ≤ 4 log 푛 푇31 0 푇32 4 푇33 ^ 푇34 9 ≤ 2ℎ log 42ℎ 푛 푛 ℎ 푛 c 푛 log 푛 + 2 log + 4 log … + 2 log ℎ = 푂 푚 log + 1 The height of T2 2 4 2 푚 ℎ = 푂(log 푚) c log2 푚 terms (If 푇2 is perfectly balanced) Concavity: 푘 ∑푎 The Split Work ෍ 푎 ≤ 푘 log 푘 =1 split 8 푁 Lemma 3. In a tree of size 푁, there are at most ⌈/2⌉ nodes in 4 9 푐 level −. log |푇1| = log 푛 0 푻ퟏ (|푇|: the size of tree 푇) 푇21 5 (Details푇22 omitted) split split 4 8 log |푇21| + log |푇22| 2 푛 푚 푛 푚 푛7 푛 푛 0 푂 푚 log + log + log + ⋯ 9= 푂 푚≤log2 log+ 1 푚 2 푚/2 4 푚/4 푚 2 …… 푻 log |푇31| + log |푇32| 1 3 ퟐ6 ^ + log 푇33 + log |푇34| 8 log |푇 | + ⋯ +푛log |푇 ℎ| ℎ1≤ 4 log ℎ2 4 푛 푇 0 푇32 4 푇33 ^ 푇 9 ℎ 31 34 ≤ 2 log ℎ 푛 푛 푛 푛 2 ℎ c The height of T log 푛 + 2 log + 4 log … + 2 log ℎ = 푂 푚 log + 1 2 2 4 2 푚 ℎ = 푂(log 푚) c log2 푚 terms (If 푇2 is perfectly balanced) Brief Summary Takeaways: The lower bound of merging two ordered structures A work-optimal and polylog-depth parallel algorithms to merge two trees A useful trick in analysis: layering bottom-up Next: augmentations on trees and range queries Range query (1D) Report all entries in key range [푘퐿, 푘푅]. Get a tree of them: 푂(log 푛) work and depth Flatten them in an array: 푂(푘 + log 푛) work, 푂(log 푛) depth 푂(log 푛) related nodes / subtrees 푘퐿 푘푅 Augmentations In practice, most problems can be solved with a “textbook” data structure with some augmentations Augmented trees: in each of the tree node, store some additional information about the whole subtree rooted at it.

Load more