Massively Parallel Algorithms
Computer Science, ETH Zurich
Mohsen Ghaffari
Lecture Notes by Davin Choo
Version: 19th June, 2019
Contents
Notation and useful inequalities
Administrative matters
1 The MPC Model ...... 1
  1.1 Computation and Communication Model ...... 1
  1.2 Initial data distribution ...... 2
  1.3 Commonly used subroutines ...... 3
2 Matching ...... 7
  2.1 Matching using strongly superlinear memory ...... 7
  2.2 Matching using near linear memory ...... 10
  2.3 Matching using strongly sublinear memory ...... 19
  2.4 Approximation improvement via augmenting paths ...... 26
3 Connected Components & MST ...... 41
  3.1 MST using near linear memory ...... 43
  3.2 Connectivity using near linear memory ...... 44
  3.3 Log diameter time connectivity using sublinear memory ...... 51
  3.4 Geometric MST using sublinear memory ...... 56
4 Lower bounds & conditional hardness ...... 63
  4.1 Lower bounds ...... 63
  4.2 Conditional hardness ...... 69
5 Dynamic Programming ...... 73
  5.1 Weighted Interval Selection ...... 73
6 Submodular Maximization ...... 81
  6.1 A greedy sequential algorithm ...... 82
  6.2 Constant approximation in 2 MPC rounds ...... 83
  6.3 Optimal approximation via Sample-and-Prune ...... 89
  6.4 Optimal approximation in constant time ...... 93
7 Data clustering ...... 101
  7.1 k-means ...... 101
  7.2 k-means++: Initializing with guarantees ...... 102
  7.3 k-means‖: Parallelizing the initialization ...... 103
8 Exact minimum cut in near linear memory ...... 113
9 Vertex coloring ...... 119
  9.1 Warm up ...... 120
  9.2 A structural decomposition ...... 122
  9.3 A three phase analysis ...... 125
Notation and useful inequalities
Commonly used notation
• WLOG: without loss of generality
• ind.: independent / independently
• w.p.: with probability
• w.h.p.: with high probability
  We say event X holds with high probability (w.h.p.) if
    Pr[X] ≥ 1 − 1/poly(n),
  say, Pr[X] ≥ 1 − 1/n^c for some constant c ≥ 2.
• Integer range [n] = {1, ..., n}
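The choice of constant c ≥ 2 is convenient because it lets a union bound absorb polynomially many bad events. A small numeric illustration (not from the notes): if each of n events fails with probability at most 1/n², then all n events hold simultaneously w.h.p.

```python
# Illustration: union bound over n events, each failing w.p. <= 1/n^2.
# Total failure probability is at most n * (1/n^2) = 1/n, which still
# vanishes as n grows -- so the conjunction of all n events holds w.h.p.
n = 10**6
per_event_failure = 1 / n**2
union_bound_failure = n * per_event_failure
assert union_bound_failure == 1 / n  # = 10^-6 here
```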
Useful inequalities
• For any x, 1 − x ≤ e^(−x)
• For x ∈ (0, 1/2), 1 − x ≥ e^(−x − x²)
• For x ∈ [0, 1/2], 4^(−x) ≤ 1 − x ≤ e^(−x)
• (n/k)^k ≤ (n choose k) ≤ (en/k)^k
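These bounds can be sanity-checked numerically. The following Python snippet (an illustration, not a proof) spot-checks each inequality at a few sample points; small epsilons guard against floating-point rounding at the equality cases.

```python
import math
from math import comb  # binomial coefficient, Python 3.8+

EPS = 1e-12  # tolerance for floating-point comparisons at equality

# 1 - x <= e^{-x} holds for all real x
for x in [-2.0, -0.5, 0.0, 0.3, 1.0, 5.0]:
    assert 1 - x <= math.exp(-x) + EPS

# 1 - x >= e^{-x - x^2} for x in (0, 1/2)
for x in [0.01, 0.1, 0.25, 0.49]:
    assert 1 - x >= math.exp(-x - x * x)

# 4^{-x} <= 1 - x <= e^{-x} for x in [0, 1/2]
# (both sides are tight at x = 0; the left side is also tight at x = 1/2)
for x in [0.0, 0.1, 0.3, 0.5]:
    assert 4 ** (-x) <= (1 - x) + EPS
    assert 1 - x <= math.exp(-x) + EPS

# (n/k)^k <= C(n, k) <= (en/k)^k
for n, k in [(10, 3), (100, 7), (50, 25)]:
    assert (n / k) ** k <= comb(n, k) <= (math.e * n / k) ** k
```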