6.854 Lecture Notes

Thanadol Chomphoochan

Fall 2020

This file contains all the raw notes I took when I was taking 6.854 Advanced Algorithms in Fall 2020, taught by Karger. I barely edited anything before compiling it as this file, so there may be factual or typographical errors. Please use this at your own risk. Please email [email protected] if you wish to provide feedback. (Last updated: December 6th, 2020)

Contents

1 Fibonacci Heaps
  1.1 Inventing Fibonacci Heap
  1.2 Potential function
  1.3 Analyzing the tree ("Saving Private Ryan")
    1.3.1 Step 1: Analyze mark bits
    1.3.2 Step 2: Understanding tree size
  1.4 Summary of Fibonacci Heap
  1.5 Fast Minimum Spanning Tree
  1.6 Persistent Data Structures

2 Persistent Data Structures
  2.1 Fat node and the copy method
  2.2 Path copying
  2.3 Lazier path copying by Sarnak & Tarjan
  2.4 Application: Planar point location

3 Splay Trees
  3.1 Analysis

4 Multi-level bucketing and van Emde Boas trees
  4.1 Dial's algorithm for SSSP
  4.2 Tries
    4.2.1 Denardo & Fox
    4.2.2 Cherkassky, Goldberg, Silverstein
  4.3 van Emde Boas Tree (1977)
    4.3.1 Insert(x, Q)
    4.3.2 Delete-min(x, Q)
    4.3.3 Analyzing the space complexity

Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes

5 Hashing
  5.1 2-universal hash family by Carter and Wegman
  5.2 Guaranteeing O(1) lookups
  5.3 Fredman, Komlos, Szemeredi

6 Maximum flow problem
  6.1 Feasible Flow
  6.2 Maximum flow upper bound
  6.3 Ford-Fulkerson Algorithm
    6.3.1 Finding better augmenting paths
    6.3.2 Scaling algorithm (Gabow '85, Dinitz '73)
  6.4 Strongly Polynomial Max Flow Algorithms
    6.4.1 Edmonds-Karp Algorithm
    6.4.2 Blocking Flows

7 Minimum-cost Flow Problem
  7.1 Min cost max flow and variants
  7.2 Showing optimality
    7.2.1 Cycle canceling algorithm (Klein)
  7.3 Prices and reduced costs
    7.3.1 Scaling algorithms using prices
  7.4 Complementary slackness

8 Linear Programming
  8.1 Forms of linear programs
  8.2 Solving linear equalities
    8.2.1 Writing down solutions
    8.2.2 Finding a solution
    8.2.3 Generalizing to non-square matrices
    8.2.4 Proving there is no solution
  8.3 Inequalities and Geometry
    8.3.1 Basic solution
    8.3.2 Naive algorithm for solving LP
  8.4 Duality: Verifying optimality in polynomial time
  8.5 Strong Duality
  8.6 Rules for Duals
  8.7 Applications of duality
    8.7.1 Shortest path
    8.7.2 Maximum flow
  8.8 Complementary slackness
    8.8.1 Maximum flow (repeated)
    8.8.2 Min cost circulation

9 Algorithms for linear programming
  9.1 Simplex algorithm (1940s)
    9.1.1 Notation juggling


    9.1.2 The algorithm
    9.1.3 Is this polynomial?
    9.1.4 Simplex and Duality
  9.2 Ellipsoid Method
    9.2.1 The algorithm
    9.2.2 Runtime analysis
    9.2.3 Grotschel, Lovasz, Schrijver
  9.3 Interior point algorithm

10 Approximate algorithms
  10.1 Quality of an approximation algorithm
  10.2 Greedy Algorithms
  10.3 Scheduling theory
  10.4 Approximation schemes
  10.5 Dealing with strongly NP-hard problems
  10.6 Smarter enumeration technique
  10.7 MAX-SNP problems
  10.8 Relaxation
    10.8.1 Christofides's heuristic
  10.9 Scheduling problem with release dates
  10.10 LP Relaxation Technique
    10.10.1 Facility location problem
    10.10.2 Randomized rounding

11 Parameterized complexity
  11.1 Kernelization
  11.2 Treewidth

12 Computational Geometry
  12.1 Sweep Algorithms
  12.2 Voronoi Diagrams

13 Online Algorithms
  13.1 Ski-rental problem
  13.2 Selling stocks
    13.2.1 No splitting
    13.2.2 With splitting
    13.2.3 Using randomization
  13.3 Scheduling problem
  13.4 Paging problem
    13.4.1 Least recently used (LRU)
  13.5 Preventing predictability via randomization
  13.6 Finding lower bound for randomized online algorithms
    13.6.1 Lower bound for paging algorithms
  13.7 k-server problem
    13.7.1 1-D case: Double-coverage algorithm


14 External memory algorithms
  14.1 Basic algorithms
  14.2 Linked lists
  14.3 Search trees
    14.3.1 B-trees
  14.4 Sorting

15 Parallel Algorithms
  15.1 Circuits
    15.1.1 Addition
  15.2 Parallel Random Access Machine (PRAM)
    15.2.1 Finding OR
    15.2.2 Max of n values
  15.3 Work of an algorithm
    15.3.1 Max problem
    15.3.2 Parallel prefix sum
    15.3.3 List ranking
  15.4 Parallel sorting

1 Fibonacci Heaps

Fibonacci heaps were developed by Michael L. Fredman and Robert E. Tarjan in 1984 to speed up Shortest Path (SP) and Minimum Spanning Tree (MST) algorithms.

Recall, to solve MST, we can use Prim’s algorithm which keeps adding the “closest” neighbor to the tree. This is similar to Dijkstra’s algorithm.

To optimize the runtime of Prim's algorithm, we use a priority queue (heap) which supports three operations:

• Insert items: used n times

• Find/delete minimum: used n times

• Decrease keys: used m times

Using a standard binary heap, each operation takes O(log n), so we get a total runtime of O(m log n). If we can optimize the decrease-key operation to O(1) per operation, then we will get a better runtime of O(m + n log n).

Notice that the runtime of a heap is completely determined by the height of the tree. So, let us instead use a d-heap, where each node has d children. The height would be O(log_d n). This gives faster insert and decrease-key runtimes. However, delete-min now takes O(d log_d n). Prim's algorithm will take a total of O(m log_d n + nd log_d n) runtime.

The best theoretical runtime is achieved when the two terms m log_d n and nd log_d n are equal, i.e., d = m/n. Then, we get a runtime of O(m log_(m/n) n). If m = n^2, the runtime is simply O(m).
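The d-heap trade-off can be made concrete with a small array-based sketch. This is my own illustrative code, not from the lecture (`DHeap` and its method names are invented): children of node i sit at positions d·i+1 … d·i+d, so sift-up costs one comparison per level while sift-down scans all d children.

```python
class DHeap:
    """Array-based d-ary min-heap. Children of node i live at d*i+1 .. d*i+d."""

    def __init__(self, d):
        self.d = d
        self.a = []

    def insert(self, key):               # O(log_d n)
        self.a.append(key)
        self._sift_up(len(self.a) - 1)

    def decrease_key(self, i, new_key):  # O(log_d n): one comparison per level
        assert new_key <= self.a[i]
        self.a[i] = new_key
        self._sift_up(i)

    def delete_min(self):                # O(d log_d n): scans d children per level
        a = self.a
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last
            self._sift_down(0)
        return top

    def _sift_up(self, i):
        a, d = self.a, self.d
        while i > 0:
            p = (i - 1) // d
            if a[p] <= a[i]:
                break
            a[p], a[i] = a[i], a[p]
            i = p

    def _sift_down(self, i):
        a, d, n = self.a, self.d, len(self.a)
        while True:
            first = d * i + 1
            if first >= n:
                break
            m = min(range(first, min(first + d, n)), key=a.__getitem__)
            if a[m] >= a[i]:
                break
            a[i], a[m] = a[m], a[i]
            i = m
```

With d = m/n this realizes the O(m log_(m/n) n) bound for Prim's algorithm sketched above.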

§1.1 Inventing Fibonacci Heap

We try to be lazy, meaning every time we work, it should count as much as possible. Imagine there is an adversary who tries to force you to work hard. We can force the adversary to spend a lot of effort to make us work, and amortize our work vs. theirs.

Let’s see how far we can go with laziness.

• With insertion, we might use a linked list and simply insert stuff to the head of our linked list in O(1). This might sound like the best we can do, but in fact we can just say “okay” and do nothing!


• With delete-min, we need to scan for minimum in O(n), which averages out to only O(1) per inserted item. The problem occurs when we have to do multiple delete-min operations.

The rule of thumb is: whenever we work, we simplify. In this case, notice that each comparison reveals smaller/larger elements. So, we can just ignore larger elements when we seek the minimum. Represent this as trees where the loser is made the child of the winner, and we get a heap-order tree. In fact, we will get a “caterpillar tree.”

Definition 1.1.1 (Caterpillar tree). A caterpillar tree is a tree with one main chain, where every other node is attached directly to a node on that chain.

Delete-min breaks the tree into subtrees. Only the roots of the subtrees are viable candidates for the next minimum. The problem: if the first element is the minimum, then we get a star graph. (That one node plays often and wins every time.) The runtime for delete-min is the maximum number of children. If we can figure out a way to keep the maximum degree small, then we will have a fast heap data structure.

One way to do this is to run a competition where no node gets to play too many times. Basically, run a bracket tournament: put nodes into pairs, compete, have winners compete, and repeat until one remains. We will get a binomial tree. We call the result a binomial heap because the number of nodes in each level of the tree maps out the binomial coefficients. Its properties are:

• A height-h binomial tree has a total of 2^h nodes.

• The maximum degree is h.

• For a heap with n items, the max degree is lg n.

We must generalize this data structure to interleave insertions and deletions. What we can do is

• Record each node degree (like a tournament record).

• Only compare nodes which have the same degree (same tournament round). This is the union by rank heuristic.

Consolidation:

• Bucket the nodes by degree.

• From degree 0 and up, link pairs of trees in the same bucket and promote the result to the next bucket, until there is at most one tree in each bucket.

Doing this results in at most O(log n) trees being created.

We are going to combine this with lazy insert, which simply adds a heap-order tree of degree 0 to the collection. Keep a pointer to the current minimum value. On delete-min,

remove the minimum, add children to the collection, then do the consolidation to find the next minimum.

The worst-case runtime of delete-min is O(n), which happens when we insert n times and then do a delete-min. However, amortized over those insertions this is O(1) per operation, and the laziness simplifies the data structure. In general, delete-min's cost is the number of trees in the collection plus the number of children of the root.
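The lazy insert and consolidation scheme above can be sketched as follows. This is an illustrative fragment with my own names (decrease-key and mark bits are left out); delete-min scans the root list for the minimum, matching the cost accounting above.

```python
class Node:
    __slots__ = ("key", "children")

    def __init__(self, key):
        self.key = key
        self.children = []


class LazyHeap:
    """Lazy insert + consolidate-by-degree (no decrease-key in this sketch)."""

    def __init__(self):
        self.roots = []                  # collection of heap-order trees

    def insert(self, key):               # O(1): just add a degree-0 tree
        self.roots.append(Node(key))

    def delete_min(self):                # cost = #(trees) + #(children of min)
        m = min(self.roots, key=lambda t: t.key)
        self.roots.remove(m)
        self.roots.extend(m.children)    # children join the collection
        self._consolidate()
        return m.key

    def _consolidate(self):
        bucket = {}                      # degree -> tree of that degree
        for t in self.roots:
            while len(t.children) in bucket:
                other = bucket.pop(len(t.children))
                if other.key < t.key:    # loser becomes child of winner
                    t, other = other, t
                t.children.append(other)
            bucket[len(t.children)] = t
        self.roots = list(bucket.values())   # at most one tree per degree
```

After consolidation, at most one tree per degree remains, so at most O(log n) roots survive.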

§1.2 Potential function

Definition 1.2.1 (Potential function). The potential function φ indicates the level of complexity in our data structure. Whenever the adversary works, φ increases. When we work, φ reduces. Let ai be the amortized cost of operation i and ri be the real cost of operation i. We have ai = ri + φi − φi−1.

The total cost is Σ ai = Σ ri + φfinal − φ0. If φfinal ≥ φ0, then Σ ai ≥ Σ ri. It is sufficient to set φ0 = 0 and check that φfinal ≥ 0.

For heap, we can set φ(Di) to be the number of heap-order trees. We have ri = #(HoTs) + #(children). We want to calculate:

ai = #(HoTs) + #(children) + φf − φi.

Notice that φi = #(HoTs) which cancels the real cost. Remember that after a deletion, we consolidate our trees. We will have at most O(log n) roots. So, φf = O(log n). We get amortized cost of O(log n) delete-min.

For decrease-key, just cut that subtree and put it in our collection of HoTs. The big problem is that we lose our binomial tree structure and cannot rely on our degree bounds.

§1.3 Analyzing the tree (“Saving Private Ryan”)

If a node loses more than one child, we will cut that node too and move it to the collection of HoTs. We will implement this using a "mark bit": set it on the first cut of a child, and when a second child is cut, cut the node itself and clear the bit. This might cause cascading cuts up the ancestor line until we find a cleared mark bit.

To analyze this situation, we must:

1. Show that cascading cuts are free (amortized).

2. Show that tree sizes remain exponential in root degree. The base is unimportant: if degree d implies size ≥ c^d, then a tree of size n has degree O(log_c n).

7 Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes

§1.3.1 Step 1: Analyze mark bits

Define φ as #(mark bits). What is the cost of decrease-key?

• ri = #(cut nodes) + 1 where 1 is the housekeeping required to change key.

• ∆φ = −(#(cut nodes) − 1) + 1 (+1 at the end for setting the top mark bit)

• ai = ri + ∆φ = O(1)

This analysis is cheating: we changed the φ!

Let’s try again. Let φ = #(HoTs) + 2 · #(mark bits).

§1.3.2 Step 2: Understanding tree size

Let's say we have a node x. Consider its current children indexed from 1, ordered by when they were added: y1, . . . , yd (where d is the current degree of x). Claim: yi has degree ≥ i − 2.

Proof. Consider yi. This node was added after y1, . . . , yi−1, so at that moment x had degree ≥ i − 1. Since consolidation only links trees of equal degree, yi also had degree ≥ i − 1 when it was linked. It is still in the tree, so at most one child has been cut from yi (a second cut would have cut yi itself). That means its degree is ≥ i − 2.

Let Sk be the minimum number of descendants (including itself) of a node of degree k. Then: S0 = 1, S1 = 2, and Sk ≥ Σ_{i=2..k} S_{i−2}, meaning Sk ≥ Σ_{i=0..k−2} S_i. Solve at equality:

Sk = Σ_{i=0..k−2} S_i

Sk − S_{k−1} = S_{k−2}

Sk = S_{k−1} + S_{k−2}

Sk ≈ φ^k (where φ is the golden ratio).

We have proven that tree sizes are exponential in the degree of the root, so the maximum degree is O(log n).
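The recurrence above is easy to check numerically. This small script (my own, not from the notes) confirms that Sk satisfies the Fibonacci recurrence and grows like φ^k:

```python
# Check: S_0 = 1, S_1 = 2, S_k = S_{k-1} + S_{k-2}, so S_k is a shifted
# Fibonacci number and grows like phi^k with phi = (1 + sqrt(5)) / 2.
phi = (1 + 5 ** 0.5) / 2

S = [1, 2]
for k in range(2, 30):
    S.append(S[-1] + S[-2])

# S_k / phi^k converges to a constant, so a node of degree d has at
# least ~phi^d descendants, and degree is O(log n) as claimed.
ratios = [S[k] / phi ** k for k in (20, 25, 29)]
print(ratios)
```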

§1.4 Summary of Fibonacci Heap

O(1) insert and decrease-key; O(log n) delete-min, which is optimal: sorting, which requires Ω(n log n) comparisons, reduces to n inserts followed by n delete-mins on a priority queue. We can also merge in O(1). (Will be covered later in the course.)


Is this practical? The constants aren't that bad, but we will have cache-related problems because a Fibonacci heap uses pointers. The binary heap, considered an implicit heap, avoids this by storing everything in an array.

Remember that the original motivation for coming up with the Fibonacci heap was Prim/Dijkstra. With a regular heap, we have O(m log n) runtime. With a Fibonacci heap, we have O(m + n log n), because decrease-key operations now cost constant time.

§1.5 Fast Minimum Spanning Tree

Consider why finding the minimum spanning tree is so slow: we were doing n delete-min operations on a size-n heap. Tarjan thus came up with an idea to keep the heap small. Here is how: choose a parameter k. From a node, start running Prim. Insert neighbors and do delete-mins as usual. Keep doing this until the heap size hits k. (Recall that the heap's contents are the neighbors of the current tree.) Then, start Prim's algorithm again at a new node not in any tree. Grow this new tree until it has k neighbors or it joins an existing tree. Eventually, all the nodes will be in a tree. Contract each tree into a node, and restart the MST algorithm with a new k.

What is the runtime for a phase with value k? As usual, we have at most O(m) decrease-key operations at O(1) each. Time complexity is O(m + t log k), where t is the number of nodes in the phase. If t log k = m, then we have a runtime of O(m). So, set k = 2^(m/t). This means each phase will take O(m).

What is the number of phases? At the end of each phase, each contracted node has at least k incident edges. That means #(vertices) t′ ≤ m/k, implying the next phase uses k′ = 2^(m/t′) ≥ 2^k. Initially, t = n, so take k = 2^(m/n) (or at least 2). Each time we run a phase, k exponentiates. The number of phases is the number of times we must iterate the logarithm of n before it drops below m/n: β(m, n) = min{ i : log^(i) n ≤ m/n } = O(log* n). Total runtime is O(m β(m, n)).

Theoreticians are never satisfied. They managed to improve the runtime to O(m log β(m, n)) using edge bundles. Chazelle gets O(m α(n) log α(n)), where α is the inverse Ackermann function. If you allow randomization, you get O(m) expected time.

BUT! The weirdest thing of all about MST is that we actually have a provably optimal MST algorithm whose runtime we can't even determine.

§1.6 Persistent Data Structures

Sarnak & Tarjan introduced the idea of persistent data structures. We want to be able to answer questions like “What was the minimum value in the previous version?”, basically querying into the past. This is called partial persistence. If you can insert a new key

into the older past and ask the minimum value in the more recent past, this is called full persistence. (Basically time traveling!)

You can do this with any pointer-based data structure where each node has a fixed size and fixed fields: values and pointers to other nodes. (No arrays!) Wrap the data structure in primitive operations that change a value or a pointer in a node, or query one. That means we don't have to think about the whole data structure; we just have to make a single node persistent.

Naive solution: change each field into an array of time-value pairs. This is called the "fat node" method. Maintain an array per field and append the new value on each modification in O(1). Each query will take O(log t), because we binary search over the time t. This is a multiplicative slowdown that hits every single operation on the data structure. Space grows by O(1) per update.

2 Persistent Data Structures

§2.1 Fat node and the copy method

One way we can create persistent data structures is to use fat nodes: replace each field by an array. Suppose we have done t updates so far on a particular node. Each lookup will take O(log t) time, as we need to binary search the time array (a multiplicative slowdown). Each modification takes only O(1), as we append to the dynamic array. We must also think about space complexity: lookup costs no space, and modification takes O(1) space per update, which is optimal.
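A single fat-node field might look like the following sketch (a hypothetical minimal version; the class and method names are mine, and `bisect` supplies the O(log t) lookup):

```python
import bisect


class FatField:
    """One field of a 'fat node': parallel arrays of timestamps and values.
    Timestamps must arrive in increasing order."""

    def __init__(self):
        self.times = []    # sorted update times
        self.values = []   # value written at each time

    def set(self, time, value):
        # O(1) time and O(1) extra space per modification (append-only).
        self.times.append(time)
        self.values.append(value)

    def get(self, time):
        # O(log t): binary search for the latest update at or before `time`.
        i = bisect.bisect_right(self.times, time) - 1
        return self.values[i] if i >= 0 else None
```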

Another way is to copy the whole structure on every update. Each lookup then takes an additional O(log t) time to find the right version (additive, not per node). Modification, however, takes up to O(n) time and space. This is huge.

§2.2 Path copying

Only a few nodes are ever changed per update, though. One way we can optimize is to reconstruct only the nodes that need to change and, for the unchanged parts, point to the old version. Lookup cost is still O(log t) additive. We cannot quite bound the time and space cost of a modification in general, though (it is the length of the copied path).
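Path copying on a BST can be sketched like this (my own minimal illustration: each insert returns the root of a new version and shares every unchanged subtree with the old one):

```python
class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def insert(root, key):
    """Return the root of a NEW version. Only the nodes on the search
    path are copied; unchanged subtrees are shared with the old version."""
    if root is None:
        return Node(key)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)
    return Node(root.key, root.left, insert(root.right, key))


def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False
```

Keeping a list of version roots gives queries into any past version at O(log t) extra cost.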

§2.3 Lazier path copying by Sarnak & Tarjan

Fat nodes are bad because they are too plump. The idea is to use some sort of hybrid: use plump nodes, but if they get too fat, resort to path copying. Sarnak & Tarjan came up with a way to create partial persistence on tree data structures.

In each node, add one extra timestamped field. Use it for the first modification of the node. On the second modification, make a copy of the node with the modification applied, and update the parent's timestamped field to point to the new copy. (Basically, note which pointer has to be overridden after what time.) If the copying propagates all the way up because parents' extra fields are also full, we end up copying the root.

When we want to look up a field starting from a node at time t: first, check whether the field was modified at or before time t; if so, use the extra field. This part takes O(1) per node (a constant-factor slowdown), but you still need an array of roots sorted by time, which adds an O(log t) additive slowdown.


We will do amortized analysis to quantify the space complexity. Notice that the best case happens when all extra fields are empty. The worst case happens when we try to modify a node that already has its extra field filled, and that only happens in the near present, not the old past. We define live nodes as all nodes reachable from the current root, and the potential function Φ = #(full live nodes). Let's consider two cases:

• Modifying when the extra field is empty: amortized cost is 1.

• Modifying a full node: make a copy, point the parent to the new copy, and the old node becomes dead (−1 to the potential), so the amortized cost is 0.

This means the amortized space cost of a full update (with path copying) is O(1).

§2.4 Application: Planar point location

Given a planar subdivision and a query point q, find which polygon q is in. (n in these analyses is the number of segments.)

Consider the 1-D version: we are given a line and segment division. We can simply binary search to find on which segment q lies. Takes O(log n) time.

One of the tricks used in computational geometry is dimensionality reduction. In this problem, take all the vertices of the planar subdivision and project them vertically, cutting the plane into slabs. We first figure out which slab contains q in O(log n), then figure out where q is within that slab by binary searching the slab's line segments, also in O(log n). The problem is to somehow preprocess the order of the line segments for each slab.

Let's consider the complexity of the building process. Naively, space is the number of slabs times the number of segments per slab, so O(n^2).

Use a persistent data structure instead. Sweep a vertical line from left to right and treat the horizontal coordinate as "time." At each vertex, line segments appear or disappear. We can use a persistent balanced BST (e.g., a red-black tree) to accommodate this. The cost of one modification is the number of primitive node changes, which come from the regular insertion plus walking up the path doing rebalancing, recoloring, and rotations. That is O(log n) time and space per update. The number of modifications is O(n), one per segment endpoint, for a total of O(n log n) cost.

We can refine this further. Notice that a red-black tree update modifies O(log n) color bits but does only O(1) rotations. We never query old red-black bits, so we only need to persist the rotations. Space goes down to O(n).

Time to construct is O(n log n), and query time is O(log n). This is optimal, even in 1-D, because they map to sorting/binary searching.

3 Splay Trees

Splay trees were developed by Sleator & Tarjan. They are binary search trees with O(log n) search, insert, and delete. Some other examples are red-black, AVL, scapegoat, and 2-3 trees, but those are annoying because they require extra bookkeeping information to maintain balance. Splay trees are self-adjusting BSTs: there is no balancing info, so splay trees can actually end up unbalanced! Nevertheless, they outperform regular BSTs in formalizable ways. They also allow merge, split, etc. in O(log n).

The analysis is sophisticated. Sleator & Tarjan came up with a potential function Φ that measures imbalance. They prove that it requires many inserts to cause imbalance, and that searches decrease Φ, fixing the imbalance.

Let’s start with the intuition. Notice that search is slow when the tree has a long path. Our goal would be to shorten it. We want to quantify this by having high potential for long paths. Each insert lengthens a path and increases the potential. Be aware that shortening our path would necessarily lengthen others.

Recall: rebalancing via rotations relies on keeping the two children at roughly the same subtree size. Let's think about this differently: during a search, descending into the smaller child is good, since the subtree size at least halves, which can happen only O(log n) times. If the tree is balanced, the larger child also has < 2/3 of the nodes, so the same logarithmic time bound applies. Conversely, an unbalanced tree has some fat children, meaning we make almost no progress on each descent. Each fat child should therefore have large potential.

Usually, we could use rotations to thin out a fat child, but a single rotation might pull up just one node and leave a whole long path behind. In a splay tree, we solve this problem by using double rotations (zig-zig or zig-zag, picked depending on where x is relative to its grandparent).

Primitive operations are

Splay Double rotate x up until x is root (may require a single rotation at the end when x is root’s child).

Search First, find x the usual way, then splay x up.
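For concreteness, here is a sketch of the splay operation in its top-down formulation (a standard variant with the same amortized bounds; the notes describe the bottom-up version, and all names here are my own):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def bst_insert(t, key):
    """Plain BST insert (no balancing), used here only to build trees."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = bst_insert(t.left, key)
    else:
        t.right = bst_insert(t.right, key)
    return t


def splay(t, key):
    """Top-down splay: bring the node closest to `key` to the root.
    The 'header' collects pieces split off the search path; the zig-zig
    cases do the extra rotation that makes the amortized analysis work."""
    if t is None:
        return None
    header = Node(None)
    l = r = header
    while True:
        if key < t.key:
            if t.left is None:
                break
            if key < t.left.key:            # zig-zig: rotate right first
                x, t.left, x.right = t.left, t.left.right, t
                t = x
                if t.left is None:
                    break
            r.left, r, t = t, t, t.left     # link t into the right tree
        elif key > t.key:
            if t.right is None:
                break
            if key > t.right.key:           # zig-zig: rotate left first
                x, t.right, x.left = t.right, t.right.left, t
                t = x
                if t.right is None:
                    break
            l.right, l, t = t, t, t.right   # link t into the left tree
        else:
            break
    l.right, r.left = t.left, t.right       # reassemble the three pieces
    t.left, t.right = header.right, header.left
    return t


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Searching is then "walk down, splay the found node up": repeated accesses to a long path progressively halve its depth.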

§3.1 Analysis

For simplicity, let's assume each item x has a weight wx (used only in the analysis). For now, assume wx = 1. Define size(x) as the total weight of x's subtree, and rank(x) = lg size(x); the rank indicates the fastest possible search we could hope for in that subtree. Let's define the

potential Φ = Σu rank(u). Notice that the potential is high for bad trees. Each rotation that fixes the tree will decrease Φ.

Lemma 3.1.1 (Access Lemma)
Amortized time to splay(x) in a tree with root t is 3(r(t) − r(x)) + 1 = O(log(s(t)/s(x))). At the end x is the root, so its size is the total tree weight W; therefore the ending rank r′(x) = r(t), and the amortized cost is 3(r′(x) − r(x)) + 1 = O(change in x's rank).

Consequently, if we have wx = 1 and W = n, then r(x) = lg s(x) ≥ lg 1 = 0 and r′(x) = lg W = lg n. So, splay time is O(lg n). If we can prove this splay time bound, then the find time is also logarithmic, because the splay does at least as much work as the find that precedes it.

Proof. Show that the amortized cost of a single double rotation is 3(r′(x) − r(x)), where r′ is the rank just after that rotation. The sum then telescopes: 3(r′(x) − r(x)) + 3(r″(x) − r′(x)) + · · · = 3(r(t) − r(x)).

We will consider only the zig-zig case (which is harder). (The diagram is in my iPad notes.) The real cost is 2 (two rotations). ∆Φ = r′(x) + r′(y) + r′(z) − r(x) − r(y) − r(z). Note that r′(x) ≥ r(x) and r′(z) ≤ r(z).

Assume r′(y) ≈ r(y). Suppose r′(x) ≫ r(x). Then 3(r′(x) − r(x)) ≫ 2, so the amortized budget easily absorbs the real cost.

Trouble happens when r′(x) ≈ r(x). This means A is fat: we're raising a fat subtree. But A is no longer a child of y or z, so the potentials of y and z decrease a lot, and this pays for the rotations instead.

Either way, change in potential absorbs the real cost.

Amortized cost is 2 + ∆Φ = 2 + r′(x) + r′(y) + r′(z) − r(x) − r(y) − r(z). Notice that r′(x) and −r(z) cancel, because z was the root of the subtree before and x is the root after, so r′(x) = r(z). Now we have 2 + r′(y) + r′(z) − r(x) − r(y). Let's upper bound this quantity, as we don't know much about y. We only know that r′(y) ≤ r′(x), as y is lower than x in the final tree; originally, r(y) ≥ r(x) too. So, 2 + ∆Φ ≤ 2 + r′(x) + r′(z) − r(x) − r(x) = 2 + (r′(z) − r(x)) + (r′(x) − r(x)). It is sufficient to show that 2 + (r′(z) − r(x)) ≤ 2(r′(x) − r(x)), i.e.,

r′(z) + r(x) − 2r′(x) ≤ −2
(r′(z) − r′(x)) + (r(x) − r′(x)) ≤ −2
lg(s′(z)/s′(x)) + lg(s(x)/s′(x)) ≤ −2.

Notice that the old x subtree (rooted at x, containing A and B) and the new z subtree (rooted at z, containing C and D) are completely disjoint pieces of x's final subtree. So, the last line can be written as lg α + lg(1 − α) ≤ −2,


where the LHS is maximized at α = 1/2, giving exactly −2. Now we know the cost of a splay is ≤ 3(r′(x) − r(x)). However, this is an amortized cost, so we need to look at the potential change too: m operations have real cost O(m log n) − (Φm − Φ0). How large can Φ0 − Φm be? Claim: Φ0 ≤ n lg W and Φm ≥ Σx lg wx. This means Φ0 − Φm ≤ Σx lg(W/wx). Notice that this is basically the cost to splay each node once.

Corollary 3.1.2 (Balance theorem)

If wx = 1 for each x, then amortized cost is O(log n).

Level k of a tree has 2^k spots, so any item with px ≥ 2^−k can go in level k; item x then has depth about lg(1/px). Overall search cost will be Σx m·px·lg(1/px) = m·Σx −px lg px.

Theorem 3.1.3 (Static optimality theorem)

Suppose we have a series of m operations where item x is accessed px · m times. The splay tree will achieve the best average search cost of any static tree, without knowing the px.

Proof. The Access Lemma gives amortized cost O(log(W/wx)). Let's set wx = px, so W = 1. Total cost would be m·Σx −px log px.

Static finger Imagine setting a finger at a spot and searching from that finger. Splay trees match the best static finger.

Dynamic finger Splay trees also match that!

Conjecture 3.1.4 (Dynamic Optimality Conjecture). Splay trees match the performance of the optimum dynamic binary strategy.

4 Multi-level bucketing and van Emde Boas trees

§4.1 Dial’s algorithm for SSSP

Consider the shortest path problem. Usually, we have time complexity O(m + n lg n) if we use Fibonacci heaps, or O(m lg n) for other heaps. If our edge weights are integers in {1, 2, . . . , C} with small C, we can imagine each edge of weight w as a chain of w unit edges and run BFS in O(mC).

Look back at our priority queue solution. Notice that the values are integers and the queried minimum values are always non-decreasing. Dial came up with an algorithm to solve this problem quickly, which we will explore.

First, create an array of "buckets," where bucket d holds the nodes with distance estimate d. The shortest path algorithm is then obvious: scan the buckets from left to right, remove each minimum when found, and insert its neighbors into buckets to the right; keep going. Over the course of the algorithm, neighbor updates take O(m) and the scan takes O(D), where D is the maximum distance. The overall runtime is O(m + D) = O(m + nC). Auxiliary space complexity is O(nC).
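Dial's algorithm as described might be sketched like this (my own minimal version, with invented names; stale bucket entries are skipped lazily instead of being removed):

```python
def dial_sssp(n, adj, source, C):
    """Dial's algorithm: buckets indexed by tentative distance.
    adj[u] is a list of (v, w) pairs with integer weights w in 1..C."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    maxdist = C * (n - 1) if n > 1 else 0
    buckets = [[] for _ in range(maxdist + 1)]   # O(nC) auxiliary space
    buckets[0].append(source)
    for d in range(maxdist + 1):                 # left-to-right scan: O(nC)
        for u in buckets[d]:
            if dist[u] != d:                     # stale entry, skip it
                continue
            for v, w in adj[u]:                  # O(m) relaxations in total
                if d + w < dist[v]:
                    dist[v] = d + w
                    buckets[d + w].append(v)     # always lands to the right
    return dist
```

Appended entries always land in later buckets (w ≥ 1), which is exactly the non-decreasing minimum property the notes rely on.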

§4.2 Tries

To improve this, observe that at any time, the nonempty buckets range from the current distance d to d + C. This means we can store our array in a circular array of about C entries. Space complexity is thus O(C).

Suppose we create a "summary" array that divides the original array into blocks of size b and tracks which blocks are nonempty. On each scan, traverse blocks until we find a nonempty block, then traverse that block. Scanning will take O(nC/b + nb) in total; set b = √C to balance. We can do even better with three layers instead of two: each lower bucket takes care of C^(1/3) values, and each higher bucket takes care of C^(2/3). However, notice that if we try to use too many layers, updating the summaries takes too much time. To be exact, with k layers the time complexity of Dial's algorithm would be O(mk + nkC^(1/k)).

We define a trie: a trie is a depth-k tree of arrays of size ∆ (degree-∆ trees). Consider

a range of C possible values (generally stored in a circular array).

C = ∆^k, so ∆ = (C + 1)^(1/k), ideally a power of 2.

Insert takes O(k). Delete-min needs to find a non-empty child of the root in each layer, so it takes O(k∆) time. Back in Dial's algorithm, the overall runtime would be O(mk + knC^(1/k)). We can balance this by setting m = nC^(1/k), i.e., k = log_(m/n) C. The overall runtime ends up being O(m log_(m/n) C).

§4.2.1 Denardo & Fox

Right now, our insert/delete takes O(k) time, and delete-min takes O(∆k) time. Denardo & Fox ('79) realized we don't have to waste time expanding all k levels of buckets until we actually reach the point where we need to distinguish the order of two items in the same bucket. Basically, we will have only one "active" trie node per level, on the path to the current minimum. Other items stay in the buckets of the lowest active node they belong in. Keep counts of the items in each layer (not counting descendants). Only expand a bucket when it becomes the minimum.

To insert, start at the top and walk down until we land in a non-minimum bucket at some level, incrementing that level's item count.

To decrease-key, we might have to remove the item from its current bucket. The new bucket is either behind the current one on the same level or below it, so we descend from the current bucket to the new level/bucket. (It can't be in front, because of the non-decreasing property of shortest path distances.)

To delete-min, remove the current minimum and go up to the first non-empty layer. Scan for a non-empty bucket (up or right), then expand down to create a new bottom layer and a new minimum. The last part's cost is already accounted for by the O(k) charged per item for moving down.

We have scan in O(∆), insert in O(k), and decrease-key in O(1). Our shortest path algorithm would take O(m + n(k + ∆)). If balanced, we get O(m + n·(log C)/(log log C)).

§4.2.2 Cherkassky, Goldberg, Silverstein

Use the Denardo & Fox queue, but add a standard heap to find the minimum entry at the top level. Make the top level much bigger than the other levels so that using a heap is worth it. This allows Dijkstra's algorithm to run in O(m + n(log C)^(1/3)). This is called a "HoT queue" (heap on top of queue).

David Karger suggests we look at Goldberg’s papers so we have an idea on how to research Experimental Algorithms.

§4.3 van Emde Boas Tree (1977)

vEB recognized that finding a non-empty bucket is itself a priority queue problem, so he went all in on recursion. For now, instead of talking about integers, we will talk about b-bit words, meaning C = 2^b.

The structure of vEB for b bits, which we will call Q, stores:

• Q.min: the minimum value in the queue.

• An array Q.low of size √C (indexed by b/2 bits), where each index stores a vEB queue on the √C values in its range.

• Another vEB queue, Q.high, of the nonempty blocks. Notice that it stores up to √C numbers, each with b/2 bits.

§4.3.1 Insert(x, Q)

If x < Q.min, swap them. Then, put x into the recursive structure. Let's think of x as (xh, xl), where xh is the high b/2 bits and xl is the low b/2 bits. We need to check whether Q.low[xh] is non-empty. If yes, then just insert xl into Q.low[xh], which is simple. Otherwise, create a vEB queue for Q.low[xh] and insert xl into it; you also have to insert xh into Q.high to denote that that block is now non-empty.

T(b) = O(1) + T(b/2) = O(log b) = O(log log C)

§4.3.2 Delete-min(x, Q)

First, find the min (it is Q.min) and remove it. Then, replace Q.min with the min of the recursive structure, which can be found by looking at Q.high.min (this gives xh). If Q.high.min is null, the recursive structure is empty, so we should just set Q.min to null. Otherwise, look in Q.low[xh] and delete its minimum, xl. Then, Q.min := (xh, xl). Beware: if deleting xl empties the block, also delete xh from Q.high. (This only happens if the block had a single item, in which case the first recursive delete was cheap.)

Runtime is O(log log C).
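A minimal Python sketch of the two operations above (my own code, not from lecture): Q.low is a dictionary rather than a fixed-size array, anticipating the space discussion below, and the 1-bit base case stores its second possible item in a max field.

```python
class VEB:
    """van Emde Boas queue on b-bit keys: insert and delete-min in O(log b)."""
    def __init__(self, b):
        self.b = b
        self.min = None            # Q.min lives outside the recursion
        self.max = None            # used only at the 1-bit base case
        self.low = {}              # block index x_h -> VEB on the low b/2 bits
        self.high = VEB(b - b // 2) if b > 1 else None  # queue of nonempty blocks

    def insert(self, x):
        if self.min is None:
            self.min = x           # inserting into an empty queue is O(1)
            return
        if x < self.min:
            x, self.min = self.min, x    # swap so the min stays at the top
        if self.b == 1:
            self.max = x           # base case: the only other possible key
            return
        half = self.b // 2
        xh, xl = x >> half, x & ((1 << half) - 1)
        if xh not in self.low:     # empty block: O(1) insert below, recurse on high
            self.low[xh] = VEB(half)
            self.high.insert(xh)
        self.low[xh].insert(xl)    # only one recursive call does real work

    def delete_min(self):
        res = self.min
        if self.b == 1:
            self.min, self.max = self.max, None
            return res
        if self.high.min is None:  # recursive structure is empty
            self.min = None
            return res
        xh = self.high.min
        xl = self.low[xh].delete_min()
        if self.low[xh].min is None:     # block emptied (that delete was O(1))
            del self.low[xh]
            self.high.delete_min()
        self.min = (xh << (self.b // 2)) | xl
        return res
```

For example, with b = 4 (so C = 16), inserting 5, 1, 9, 3 and deleting the minimum four times yields 1, 3, 5, 9.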


§4.3.3 Analyzing the space complexity

Suppose we want to create a 64-bit queue. Even if we insert only one item, we need 2^32 slots at the top level, and up to 2^32 more per item inserted. Recall we used arrays for quick O(1) lookup. Why not use hash tables instead?

5 Hashing

We want to create a dictionary data structure that supports Insert, Delete, and Lookup on keys.

Model: our keys are integers in the universe M = {1, . . . , m}. If we want to store only n keys, can we use n space instead of m and still get O(1) time? Phrased more weakly: can we use s space for some s ≥ n? The idea is to use a hash function h: M → S = {1, . . . , s} and store key k in an array at position h(k).

There is a problem of collisions. Correctness isn't a big problem, as we can handle collisions easily via chaining. Operation time does, however, grow linearly if everything hashes to the same index.

For all hash functions, there exists a large set of keys that collide badly (created by an adversary). The solution is to create a hash family, which is a set of hash functions such that we can pick a good one for any set of items. Picking can be done via randomization. We analyze the expected number of collisions with each item. For a pair of items i, j, define C_ij = 1 if items i, j collide and C_ij = 0 otherwise. The expected time to find item i is E[1 + Σ_j C_ij] = 1 + Σ_j E[C_ij] = 1 + Σ_j Pr[i, j collide] = 1 + n/s = 1 + α. The problem is remembering the random h we chose. Naively, it takes m log s space, as a hash function can be written as a table of how each value in M maps to indices in S.

§5.1 2-universal hash family by Carter and Wegman

For our collision analysis, we considered colliding pairs; we did not consider the entire sequence of operations. This means it is sufficient for h to be pairwise independent: it looks random to each pair of values, even though it might not be truly random (mutually independent). Why is this useful? Suppose we deal with 3 coin flips x, y, z, but we only flip x and y and set z = x ⊕ y. We can say Pr[x = y], Pr[x = z], Pr[y = z] are all 1/2 despite x, y, z not being mutually independent.

To generalize, suppose s is a prime number p. Carter-Wegman maps x to h_ab(x) = (ax + b) mod p. If a, b are chosen at random from {0, . . . , p − 1}, then h_ab(x) and h_ab(y) are uniform over {0, . . . , p − 1} and pairwise independent.


To analyze, we want to compute Pr[h_ab(x) = s ∧ h_ab(y) = t] for x ≠ y. For mutually independent values this would be 1/p². For our family, consider Pr[ax + b = s ∧ ay + b = t]. We can write this using matrix notation:

[x 1] [a]   [s]
[y 1] [b] = [t]

which has a unique solution (a, b) whenever x ≠ y, since the matrix is then invertible (its determinant is x − y ≠ 0). So exactly one of the p² choices of (a, b) works, and the probability is 1/p².

If s is not a prime, we choose a prime p ≫ s and set h_ab(x) = ((ax + b) mod p) mod s. There is some off-by-one imbalance, though, because p is prime: the probability of falling into each bucket differs by a relative factor of about s/p. If p = n·s, then the probabilities differ by about 1/n, which only changes the expected number of collisions by O(1).

We can also prove that the maximum load is O(√n) with probability 1 − 1/n if we map n items into s = n buckets. Some items will have larger load, but they are unlikely to be the ones sought. For fun, theoreticians have also proven that there exist pairwise independent hash families where a √n max load is very likely.
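Drawing one function from this family is a few lines of Python (a sketch; the Mersenne prime 2^61 − 1 and the restriction a ≠ 0 are my own illustrative choices):

```python
import random

P = 2**61 - 1  # a Mersenne prime, far larger than any table size s we use

def carter_wegman(s, rng=random):
    """Draw one h(x) = ((a*x + b) mod P) mod s from the 2-universal family."""
    a = rng.randrange(1, P)   # a != 0, so distinct keys stay distinct mod P
    b = rng.randrange(P)
    return lambda x: ((a * x + b) % P) % s
```

Any drawn h maps integers into {0, …, s − 1}; for a fixed pair x ≠ y, the probability (over the draw of a, b) that h(x) = h(y) is about 1/s.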

§5.2 Guaranteeing O(1) lookups

Suppose we are given the keys in advance. In fact, let's be ambitious and aim for no collisions at all. Obviously, s ≥ n.

What is the chance of having no collisions if we pick a random h?

E[total # of collisions] = E[Σ_{i<j} C_ij] = Σ_{i<j} Pr[i, j collide] = (n choose 2)·(1/s) ≈ n²/(2s)

If s ≥ n², then we get ≤ 1/2 expected collisions.

Theorem 5.2.1 (Markov's Inequality) If X ≥ 0, then Pr[X > t] ≤ E[X]/t.


In our case, Pr[#(collisions) ≥ 1] ≤ 1/2. Algorithm: try a random hash function. If it is perfect, we are done. If it fails, just try again. This finds a perfect hash function in an expected two trials, each trial taking O(n) time.
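The retry loop is short enough to write out (a sketch using the Carter-Wegman family from before; the prime and the sample keys are arbitrary):

```python
import random

P = 2**61 - 1  # prime defining the Carter-Wegman family

def perfect_hash(keys):
    """Retry random (a, b) until x -> ((a*x + b) mod P) mod s has no collisions
    on `keys`, with s = n^2 slots; each trial succeeds with probability >= 1/2
    by the Markov argument above."""
    s = max(1, len(keys) ** 2)
    while True:
        a, b = random.randrange(1, P), random.randrange(P)
        h = lambda x, a=a, b=b: ((a * x + b) % P) % s
        if len({h(k) for k in keys}) == len(keys):  # all slots distinct: perfect
            return h, s
```

On 7 keys this allocates 49 slots, and the loop typically exits after one or two draws.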

This is still bad because we need O(n²) space for n items. We intuit that this is because the n²/(2s) bound is not tight.

§5.3 Fredman, Komlós, Szemerédi

They introduced the idea of 2-level hashing. First, hash the n items into O(n) space. This will cause some collisions, but not too many. Then build a perfect hash table on each bucket (if a bucket has b items, use b² space). Intuitively, this gives O(1) lookup time.

What is the total space? Let b_k be the size of bucket k.

S = n + Σ_k (b_k)²
  = n + Σ_k (Σ_i Ind[i ∈ b_k])²
  = n + Σ_k Σ_{i,j} Ind[i ∈ b_k]·Ind[j ∈ b_k]
  = n + Σ_{i,j} C_ij.

We know:

E[Σ_{i,j} C_ij] = E[Σ_i C_ii] + 2E[Σ_{i<j} C_ij] = n + 2·(n choose 2)/s.

If s = O(n), then E[Σ_k (b_k)²] = O(n). Finally, to guarantee that this happens, do Markov again: if the expected space is ≤ cn, generate tables until we get space < 2cn. (If we go through our lectures, we have c of about 13. If we do a bit more work, we can get c down to 6. In the paper, they can do 1 + o(1) using some techniques.)

6 Maximum flow problem

Combinatorial optimization problems are problems where you have to compute optimal output satisfying the constraints of “feasible” solutions. Each solution has a cost or a value, and you want to find the feasible solution of optimal cost/value. Steps to solve:

1. Understand feasibility

2. Construct an algorithm to determine feasibility

3. Find any feasible solution

4. Verify optimality of feasible solution

5. Compute optimal solution

The maximum flow problem is as follows. We are given a directed graph G with m edges and n vertices, a designated "source" vertex s and "sink" vertex t, and a capacity u_e along each edge e. We want to find a flow f_e for each edge. A solution is feasible under these conditions:

• Capacity: 0 ≤ f_e ≤ u_e.
• Conservation: Σ_w f(v, w) − f(w, v) = 0 for every v except v ∈ {s, t}.

The value of the flow is |F| = Σ_v f(s, v) − f(v, s); we want to maximize |F|. Call f the "gross flow". The net flow on e = (v → w) is g(v, w) = f(v, w) − f(w, v). This allows the conditions to be written a bit differently:

• Skew symmetry: g(v, w) = −g(w, v).
• Conservation: Σ_w g(v, w) = 0 except at s, t.
• Capacity: g(e) ≤ u(e).

The value is Σ_v g(s, v).

§6.1 Feasible Flow

Suppose we have a graph with vertices s and t. These are some of the obvious feasible flows:

• 0 flow


• A path from s to t where flow ≤ min capacity

Lemma 6.1.1 (Path Decomposition) Any s-t flow can be decomposed as a sum of s − t paths and cycles (with amounts).

Proof. Proof by induction on the number of edges carrying nonzero flow. Given a nonzero flow, we will find a path or cycle, then remove it. If there is a nonzero flow, there is an edge leaving s carrying flow to some v. By conservation, there is also an edge leaving v carrying flow. We can keep going until v = t or we reach a previously visited vertex (closing a cycle). Find the min-flow edge on that path/cycle, then decrease the flow on each of its edges by that much. What remains is still a flow, because we subtracted the same amount entering and leaving each vertex. Each step zeroes the flow on at least one edge, so this can only happen m times and the induction terminates. (This means a flow can be decomposed into ≤ m paths and cycles.)
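The inductive proof translates directly into a peeling procedure. A sketch (my own code; the flow is a dict from edge (u, v) to a positive amount, and conservation is assumed to hold except at s and t):

```python
def decompose_flow(flow, s, t):
    """Peel an s-t flow into at most m weighted paths and cycles."""
    flow = {e: f for e, f in flow.items() if f > 0}
    pieces = []
    while flow:
        # Follow flow out of s while s still has any; afterwards what remains
        # is a circulation, so start from any flow-carrying vertex.
        start = s if any(u == s for (u, _) in flow) else next(iter(flow))[0]
        path, seen = [start], {start}
        while not (path[-1] == t and start == s):
            v = path[-1]
            w = next(w for (u, w) in flow if u == v)  # exists by conservation
            path.append(w)
            if w in seen:                    # walked into a cycle
                path = path[path.index(w):]  # keep just the cycle part
                break
            seen.add(w)
        edges = list(zip(path, path[1:]))
        amt = min(flow[e] for e in edges)    # bottleneck on the path/cycle
        for e in edges:
            flow[e] -= amt
            if flow[e] == 0:
                del flow[e]                  # at least one edge zeroes out
        pieces.append((path, amt))
    return pieces
```

For example, the flow {(s,a): 3, (a,t): 2, (a,b): 1, (b,t): 1} decomposes into s-t paths of total value 3, and a disjoint 2-cycle would come out as a piece whose first and last vertices coincide.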

Is there a nonzero feasible flow in G? Yes, when t is reachable from s (by positive capacity edges). Let us prove this statement in both directions:

• If t is reachable from s, there exists an s-t path along which we can push flow.

• If there is a nonzero feasible flow, we can decompose it into paths and cycles. If the decomposition were all cycles, the net flow would be 0, contradicting the assumption, so there is at least one s-t path.

Suppose there is no feasible flow, how can we show a certificate of this no-instance? Consider the set S of vertices reachable from s. T = V − S. By construction, there is no edge from S to T . We call (S, T ) a cut with no crossing edges.

§6.2 Maximum flow upper bound

How do we verify that a given flow is maximum? We can try to find an upper bound and show that it is tight.

Consider a cut (S, T). Claim: the max flow is at most the total capacity of the cut edges. Decompose the flow into s-t paths; each path has to cross the (S, T) cut and thus uses up some capacity on a cut edge. Since the max flow is at most the capacity of any cut, it is at most the minimum possible (S, T) cut capacity. (Hint hint: max-flow min-cut!)

Suppose a flow is currently not maximum. We can make it better by adding flow on some of the available paths. However, there might be zero such paths. To solve this, we define a

residual graph. Given flow f, the residual capacity is u_f(v, w) = u(v, w) − f(v, w), and u_f(w, v) = u(w, v) + f(v, w), so it is as if we can "undo" flow on an edge. If there exists an s-t path in the residual graph, we can "augment" the flow along it. Now we prove that if there is no augmenting path in the residual graph, we have a max flow.

Proof. There is no s-t path in G_f. This means there is an s-t cut (S, S̄) with no residual edge crossing it. Every (S, S̄) edge in G is "saturated", and every (S̄, S) edge in G must be empty, or else there would be a residual arc from S to S̄. Now, decompose the flow into paths. Each path crosses the cut and uses up some capacity exactly once. The sum over paths equals the (S, S̄) capacity. So we have shown that the value of the flow equals the value of the cut: this flow is maximum, and the cut is also minimum.

Theorem 6.2.1 (Max Flow Min Cut Theorem) These are equivalent:

• f is max flow.

• There is no augmenting path in Gf .

• Value of f is equal to the capacity of (S, S̄) for some S.

Proof. If there is an augmenting path, we can increase f, so the flow is not maximum. If there is none, then let S be the set of reachable vertices; all edges leaving S are full and all edges entering S are empty. So the value of the flow is the capacity of the cut.

Corollary 6.2.2 Net flow across any cut is equal to the flow value.

Problem 6.2.3. To build intuition for this, try constructing

1. A graph with one max flow and one min cut.

2. A graph with one max flow but multiple min cuts.

3. A graph with one min cut but multiple max flows.

4. A graph with multiple min cuts and multiple max flows.

§6.3 Ford-Fulkerson Algorithm

If flow is not maximum, increase it with an augmenting path in Gf . Keep doing this until it terminates.

In one iteration, run BFS in Gf, taking O(m) time. Assuming integer capacities, the number of iterations is at most the maximum flow value f, so the total runtime bound is O(mf). f is bounded by mU, where U is the maximum edge capacity.

This is a pretty bad runtime if the capacities grow big. It is even worse with rational capacities, because the product of all the denominators can blow up. With real-valued capacities, the algorithm might take an infinite amount of time and even converge to a non-maximum flow (very bad!).

Corollary 6.3.1 (Max flow integrality) If G has integer capacities, then there exists an integral max flow.
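A sketch of the generic algorithm with integer capacities, using BFS to find each augmenting path (my own code; capacities are given as a dict (u, v) → u_e):

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    """Ford-Fulkerson: repeatedly find an augmenting s-t path in the residual
    graph and push the bottleneck along it."""
    res = defaultdict(int)                 # residual capacities
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)                      # reverse arcs live in the residual graph
    value = 0
    while True:
        parent, q = {s: None}, deque([s])  # BFS for any residual s-t path
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return value                   # no augmenting path: flow is maximum
        path, v = [], t                    # walk back from t to recover the path
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[e] for e in path)  # bottleneck capacity
        for (u, v) in path:
            res[(u, v)] -= delta
            res[(v, u)] += delta           # "undo" capacity on the reverse arc
        value += delta
```

Since BFS finds augmenting paths here, this sketch is in fact already the Edmonds-Karp variant discussed later; any residual-path search would satisfy the generic algorithm.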

§6.3.1 Finding better augmenting paths

One idea is to find an augmenting path which increases the flow by as much as possible: maximize the bottleneck capacity. Finding the bottleneck can be done by sorting the edges and binary searching for the bottleneck value, testing each time whether an s-t path still exists. This takes O(m log n). We can do it faster with a Prim-style search (as in directed MST) in O(m + n log n) using a Fibonacci heap.
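The Prim/Dijkstra-style search for the maximum-bottleneck path can be sketched as follows (my own code, using a binary heap rather than a Fibonacci heap, so it is the O(m log n) variant; `res` maps u → {v: residual capacity}):

```python
import heapq

def widest_path(res, s, t):
    """Maximum-bottleneck s-t path: grow the reached set, always popping the
    vertex reachable with the largest bottleneck so far. Returns the best
    bottleneck to t, or 0 if t is unreachable."""
    width = {s: float('inf')}          # best bottleneck found so far
    heap = [(-width[s], s)]            # max-heap via negated widths
    done = set()
    while heap:
        w, u = heapq.heappop(heap)
        w = -w
        if u in done:
            continue                   # stale heap entry
        done.add(u)
        if u == t:
            return w
        for v, c in res.get(u, {}).items():
            b = min(w, c)              # bottleneck of reaching v through u
            if b > width.get(v, 0):
                width[v] = b
                heapq.heappush(heap, (-b, v))
    return 0
```

On the residual graph {s: {a: 4, b: 2}, a: {t: 3, b: 1}, b: {t: 5}}, the widest s-t path is s-a-t with bottleneck 3.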

Does this really help? There might be a case where any augmenting paths have very low (bad) bottlenecks. Let us try to figure out how many augmentations are required to reach maximum flow.

From path decomposition, we know that any flow uses at most m paths. This implies that at least one of these paths carries at least f/m flow, where f is the remaining flow in the residual graph. So the maximum bottleneck path carries at least f/m flow. One augmentation therefore takes the remaining flow from f to at most f − f/m = f(1 − 1/m). After k iterations, the remaining flow is at most f(1 − 1/m)^k ≤ f·e^{−k/m}, because 1 − x ≤ e^{−x}. So if k ≥ m ln f, the remaining flow is less than 1. For integral capacities, finding m log f maximum-bottleneck augmenting paths gives us a max flow. The runtime is O(m² log f) = O(m² log(mU)) = Õ(m² log U). This is polynomial even if we are using rational capacities!

§6.3.2 Scaling algorithm (Gabow ’85, Dinitz ’73)

Apply the idea from “unit case” of the problem to bigger numbers. Remember that our generic augmenting path algorithm takes O(m2) time on unit capacity graphs. Key

insight: numbers are just sequences of bits, and each bit is a unit case. So it's like we are rounding backwards: first solve the rounded problem optimally (considering only the high-order bits), then fix up the solution to "unround" it.

If we are given a max flow for the current shift level and we shift in another bit, all capacities are doubled and some are incremented by one. How do we find a new max flow? First, double the previous max flow. The min cut is still saturated after doubling, so at that point the residual graph has 0 S-T capacity; the increments then add at most one unit per cut edge, so the number of new augmenting paths is at most m. Each augmenting path takes O(m) time, so each level takes O(m²) and the total runtime is O(m² log U).
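The whole scaling loop can be sketched on top of BFS augmentation (my own code; doubling every residual arc doubles the flow and the remaining capacity consistently, and the inner loop is the at-most-m fix-up augmentations):

```python
from collections import deque, defaultdict

def scaling_max_flow(cap, s, t):
    """Capacity scaling (Gabow/Dinitz): process capacity bits from the most
    significant down; at each level double the current flow and residuals,
    shift in the next bit, then fix up with augmenting paths."""
    adj = defaultdict(set)
    for (u, v) in cap:
        adj[u].add(v)
        adj[v].add(u)
    res = defaultdict(int)
    value = 0
    for shift in range(max(cap.values()).bit_length() - 1, -1, -1):
        for e in list(res):                 # doubling scales flow + capacity together
            res[e] *= 2
        value *= 2
        for (u, v), c in cap.items():       # shift in bit `shift` of each capacity
            res[(u, v)] += (c >> shift) & 1
        while True:                          # re-augment to optimality at this level
            parent, q = {s: None}, deque([s])
            while q and t not in parent:
                u = q.popleft()
                for v in adj[u]:
                    if v not in parent and res[(u, v)] > 0:
                        parent[v] = u
                        q.append(v)
            if t not in parent:
                break
            path, v = [], t
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            delta = min(res[e] for e in path)
            for (u, v) in path:
                res[(u, v)] -= delta
                res[(v, u)] += delta
            value += delta
    return value
```

After processing bit i, `value` is exactly the max flow of the graph whose capacities are the originals shifted right by i.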

§6.4 Strongly Polynomial Max Flow Algorithms

Is it possible to have an algorithm that runs independently of the number sizes, assuming we can add and subtract the numbers in O(1) time? This is called “strongly polynomial.” (Previous scaling algorithms are called “weakly polynomial.”)

§6.4.1 Edmonds-Karp Algorithm

We need a different measure of progress that is not just the value of flow. We know that we have reached max flow when there is no path from s to t, so can we quantify the notion of “almost cutting path from s to t” somehow? Let’s say we have a cut when there is “infinite” distance from s to t. Large distance implies nearly max flow.

This suggests that we should try to pick the shortest path for augmentations (SAP: Shortest Augmenting Path). This can be found in O(m) via BFS. Claim: we need at most O(mn) SAPs, so max flow can be found in O(m²n) time. This is slower than scaling, however.

Lemma 6.4.1 Let d(v) = distance from s to v. If you augment on a SAP, then no d(v) decreases.

Proof. Proof by contradiction. Suppose that some distance decreases. Let d′(v) be the distance after augmentation, with d′(v) < d(v). Among such vertices, let v be one with minimum d′(v). (Obviously, v ≠ s.) Let P′ be the shortest path to v after augmentation, and let w be the vertex preceding v on P′. (It is possible that w = s.) Note that d′(v) = d′(w) + 1. However, since d′(v) is minimum among decreased vertices, we have d′(w) ≥ d(w) (otherwise, w would have been the smallest). We claim that (w, v) is "new", meaning it had 0 residual capacity before the SAP. Otherwise, in the prior graph, the path to w followed by w → v is a path to v of length d(w) + 1 ≤ d′(w) + 1 = d′(v), which would imply d(v) ≤ d′(v) and contradict the assumed decrease. The only way a new edge can be created is when the SAP sends flow on the reverse edge (v, w). This means (v, w) was on the SAP,

which implies d(w) = d(v) + 1, so d(v) = d(w) − 1 ≤ d′(w) − 1 = d′(v) − 2 < d′(v). Contradiction.

Now that we have proven that distances do not decrease, we have to show that they eventually increase.

Lemma 6.4.2 After mn/2 SAPs, we have a max flow.

Proof. Consider some SAP that saturates the edge (u, v). (u, v) is now gone from the graph, but it might return later when some later SAP uses (v, u). When we use (u, v), d(v) = d(u) + 1. When we later use (v, u), d′(u) = d′(v) + 1 ≥ d(v) + 1 = d(u) + 2. This means that by the next time we use (u, v), d(u) will have increased by 2. After n/2 saturations of (u, v), d(u) ≥ n, meaning u is unreachable and thus (u, v) cannot be used again. So each edge can be saturated at most n/2 times. Since each SAP saturates at least one edge, it takes at most mn/2 SAPs to give us a max flow. (Note that m here includes backward edges.)

Final runtime for our max flow algorithm is O(m2n). This is strongly polynomial.

§6.4.2 Blocking Flows

Note that when we run BFS and create a “layered graph”, the shortest paths in it stay shortest after we perform SAP (because shortest paths never decrease).

Definition 6.4.3 (Admissible edge). An admissible edge is an edge that goes forward one layer.

Definition 6.4.4 (Admissible path). An admissible path is a path where all edges are admissible.

Definition 6.4.5 (Admissible graph). An admissible graph is a graph where we consider only the admissible edges.

Claim 6.4.6 — An admissible path is a shortest path.

Let us consider what kinds of edges can be non-admissible. They might go backwards, stay in the same layer, or go forward by greater than one layer. (An admissible graph is basically a shortest path tree.)

The idea is to build an admissible graph, and use it to destroy all shortest paths, then repeat with a new admissible graph.


Definition 6.4.7 (Blocking flow). A blocking flow is a flow using only admissible edges that saturates an edge on every admissible path.

After finding a blocking flow, there will be no admissible paths left in the residual graph. This means any s-t path after this must use non-admissible edges, so the shortest path distance must strictly increase; after at most n rounds we will find a max flow.

Now, we consider the problem: Given an admissible graph, find a blocking flow.

Bounds for unit capacity graphs

One general insight when dealing with graph or flow problems is to consider the unit capacity case. Suppose we start at s and advance along some admissible edge to v. If v ≠ t, keep advancing one vertex at a time until there are no more admissible edges or we reach t. If we reach t, saturate the path (which removes all of its edges, since capacities are one) and start again. If we cannot go any further, retreat and mark v as useless. Keep going until we are blocked at s. The runtime is O(m): the number of retreats does not exceed n (each marks a vertex useless), each augmentation removes every edge of its path so their total length is at most m, and the number of advances is bounded by the number of retreats plus the total length of augmenting paths.
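The advance/retreat idea, wrapped in the outer layering loop, is the full blocking-flow (Dinic) algorithm. A sketch of my own, written for general integer capacities so it also covers the next subsection (the recursion performs the advances; setting a vertex's level to None is the retreat):

```python
from collections import deque, defaultdict

def dinic(cap, s, t):
    """Blocking flows: build the BFS layering, find a blocking flow in the
    admissible graph by advance/retreat, repeat until t is unreachable."""
    res = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)
    value = 0
    while True:
        level = {s: 0}                   # BFS layering of the residual graph
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in level and res[(u, v)] > 0:
                    level[v] = level[u] + 1
                    q.append(v)
        if t not in level:
            return value                 # no augmenting path at all: max flow

        def advance(u, limit):
            """Push up to `limit` units from u to t along admissible edges."""
            if u == t:
                return limit
            pushed = 0
            for v in adj[u]:
                if res[(u, v)] > 0 and level.get(v) == level[u] + 1:
                    d = advance(v, min(limit - pushed, res[(u, v)]))
                    res[(u, v)] -= d
                    res[(v, u)] += d
                    pushed += d
                    if pushed == limit:
                        return pushed
            level[u] = None              # retreat: u is useless this phase
            return pushed

        value += advance(s, float('inf'))
```

Each outer iteration strictly increases d(s, t), so there are at most n phases, matching the argument above.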

For unit capacity graphs, our total max flow runtime would be O(nm). This does not seem very exciting, but remember that we do not require the graph to be simple.

There is a better bound for unit capacity. Suppose we have found k blocking flows. That means d(s, t) ≥ k. By flow decomposition, the remaining flow is at most m/k, because each remaining augmenting path uses at least k edges and each edge has unit capacity. Since each blocking flow finds at least one unit of flow, only m/k additional blocking flows will finish the algorithm. The runtime is O(km + (m/k)m). Set k = √m, so the runtime is O(m^{3/2}).

We can also prove that the runtime is O(m·n^{2/3}) for simple graphs. The runtime is O(m√n) in bipartite graphs.

Generalizing to other graphs

Start at s and advance until we reach v = t (then saturate the path's bottleneck) or get blocked (then retreat). There are still at most m retreats. The number of advances does not matter, because each advance is eventually followed by an augment or a retreat. Now, consider how many augmentations may happen. We cannot carry over the proof from the previous section, because each augment only destroys one edge (the bottleneck), not all of them, so there might be up to m augments in a round.

We can improve the bound by using amortized analysis. Charge each augmentation to one edge. We might need n work per edge. This means each blocking flow takes mn

time. The max flow algorithm will then take O(mn²) time, which is somewhat disappointing but still better than the non-blocking-flow algorithm!

To refine: If the value of max flow is f, then augments only cost nf total (summing over many blocking flows). Other work is O(m) per blocking flow due to retreating. So, max flow runs in O(mn + nf).

Remember that we can use the scaling technique to make sure the f term is small. Recall that after shifting in a bit, at most m residual flow is present, because we had an empty cut and each crossing edge gains at most one unit of capacity. So each shift's fix-up runs in O(mn + nf) = O(mn), and the total runtime is thus O(mn log U).

7 Minimum-cost Flow Problem

§7.1 Min cost max flow and variants

This problem is similar to the standard max flow problem, but each edge e has a per-unit-flow cost c(e) associated with it. We want to find a max flow f that minimizes Σ_e c(e)f(e). Note that c(e) may be negative. (Now you know why we didn't use c for capacity up until now: c is reserved for "cost." Some people may ask why we don't use p for price instead, but it turns out price is also reserved for something else.)

Pushing flow through an edge (u, v) creates a residual edge (v, u). How should we define the cost on the reverse edge? The cost should be −c(u, v) because pushing flow back should result in a refund.

A variant of this problem is to find min cost to send v units of flow. This can be solved easily by capping the max flow via a bottleneck edge. This generalizes the max flow problem but also generalizes the shortest path problem! To send a flow of one unit from s to t we can just find the shortest path under c(e) metric.

Path decomposition still applies to min cost max flow: Any flow is a sum of paths and cycles. The cost of the flow is the sum of the per-unit cost of the paths/cycles times the capacity of the paths/cycles.

Another variation of this problem is called min cost circulation. We have a circulation which is a flow with balance everywhere and no source or sink. This problem becomes interesting when some costs can be negative because it might make more sense to send a circulation rather than none to gain some profit. For this problem, cleaner decomposition claim applies: we can decompose a circulation into at most m cycles.

Problem 7.1.1. Reduce min cost circulation (MCC) to min cost max flow (MCF). (We are given a black-box for solving min cost max flow. Find a way to solve min cost circulation.)

Solution. Just create a source s and a sink t disconnected from the graph. Then the max flow is 0, and the min cost max flow minimizes cost using only cycles — which is exactly the min cost circulation.

Problem 7.1.2. Reduce min cost max flow to min cost circulation.

Solution. Create an edge from t to s with infinite capacity and negative infinite cost. This encourages the circulation to go through (t, s), meaning the other edges form a


flow from s to t. Of course, we can't really write infinite capacity, so we write mU instead, which is an upper bound on the max flow (U = max capacity, C = max cost). For the cost, write −mC. In fact, −nC is sufficient, because each flow path goes through at most n nodes. (The total is actually nCU, but don't forget that this is a per-unit cost.)

Alternate solution. 1. Find a max flow f.

2. We want the min cost max flow f*. The difference f* − f is a circulation, but it may not be feasible in G because of possible negative flows. Yet, we claim it is feasible in the residual graph G_f. Check: if f*_e − f_e ≥ 0, it is a positive flow on e of amount ≤ u_e − f_e = u_f(e), because f* is feasible by definition; so it fits in G_f. If f*_e − f_e ≤ 0, it is a negative flow on e, which is the same thing as a positive flow of f_e − f*_e on the reverse of e, and this is feasible because G_f by definition has capacity f_e on the reverse of e.

3. The cost of this circulation is C(f*) − C(f). So the claim is f* = f + q, where q is the min cost circulation in G_f. (First find a max flow, then fix up its cost.)

§7.2 Showing optimality

Given f, is it a min cost flow, assuming we can already verify that it is a max flow?

Claim 7.2.1 — f is MCF if and only if Gf has MCC with cost = 0.

Proof. If there is an MCC with cost < 0, then adding it to f gives a better flow, so f is not an MCF. For the other direction, suppose the MCC in Gf has negative cost. Decomposing it into cycles, at least one cycle must be negative. (Conversely, if there is a negative cycle, there is a negative circulation: just take that cycle alone.) But if there is a negative cycle in Gf, then we can add it to f to improve the cost.

The mechanism for verifying MCF is to simply check if there is a negative cycle in Gf . This can be done via Bellman-Ford’s algorithm for shortest path.

Bellman-Ford detects when the shortest path is not well-defined because of a negative cycle, and we can extract that cycle easily. That provides us the certificate for MCF. Thus, verification for MCF can be done in O(mn).
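A sketch of extracting that certificate (my own code): run Bellman-Ford as if a virtual source had a 0-cost edge to every vertex, and if the n-th round still relaxes an edge, walk parent pointers back n times to land inside a negative cycle. The arc costs in the test are hypothetical residual costs.

```python
def find_negative_cycle(vertices, arcs):
    """Bellman-Ford negative-cycle detection. `arcs` maps (u, v) -> cost for
    arcs with positive residual capacity. Returns the vertex list of some
    negative cycle (in arc direction), or None if there is none."""
    dist = {v: 0 for v in vertices}     # as if a virtual source reaches all
    parent = {}
    x = None
    for _ in range(len(vertices)):
        x = None
        for (u, v), c in arcs.items():
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                parent[v] = u
                x = v
        if x is None:
            return None                 # a full round relaxed nothing: no cycle
    for _ in range(len(vertices)):      # walk back until we're inside the cycle
        x = parent[x]
    cycle, v = [x], parent[x]
    while v != x:
        cycle.append(v)
        v = parent[v]
    cycle.reverse()                     # list vertices in arc direction
    return cycle
```

If the n-th round still relaxed an edge, a negative cycle must exist, since without one all shortest-walk distances stabilize within n − 1 rounds.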

§7.2.1 Cycle canceling algorithm (Klein)

Given current f, find a negative cycle in Gf and send flow through it capped at the bottleneck. This saturates the cycle.

32 Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes

Each cancellation drops the cost of f by at least 1. Possible circulation costs are in the range [−mCU, mCU] (as argued earlier for MCC), so O(mCU) cancellations suffice. Each negative cycle is found by Bellman-Ford in O(mn), so we have a pseudo-polynomial O(m²nCU)-time algorithm.

To be fair, cycle canceling is not a bad idea. It just has to be applied well, like with augmenting paths. We’ll explore these ideas in the homework instead.

§7.3 Prices and reduced costs

Let’s think in term of markets.

Even though there’s only demand at source s and sink t, we can think of this problem as if we have “merchants” at each node. If we give away the flow for free from s, then merchants will get it and sell to neighboring cities by negotiating the prices p(v) associated with each node v to earn some money, paying for transport costs in the process.

What determines prices? When is a set of prices plausible/stable? Let's look locally at v and w. There's a profit opportunity only when p(v) + c(v, w) < p(w): the merchant buys things at v, ships them to w, and sells at a higher price. We define the reduced cost c_p(v, w) = c(v, w) + p(v) − p(w), so there is a profit opportunity exactly when c_p(v, w) < 0. This represents the "net cost" of moving things from v to w.

What will happen? Demand at v will increase (because people want to buy there to sell for profit) and decrease at w (because people don’t want to buy there). This change will happen until the reduced cost is positive or edge (v, w) is saturated. So, we define feasibility of a price function as when the market is stable.

Definition 7.3.1. Price function is feasible for (residual) graph if and only if no (residual) arc has negative reduced cost.

Important observation: prices don't change cycle costs! On any cycle, Σ c_p(v, w) = Σ c(v, w): write out the terms and they cancel/telescope. No matter what price function you use, the negative cycles are always the same ones.
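The telescoping is easy to check in code (the arc costs and prices here are arbitrary made-up values):

```python
def reduced_costs(c, p):
    """c_p(v, w) = c(v, w) + p(v) - p(w) for every arc."""
    return {(v, w): cost + p[v] - p[w] for (v, w), cost in c.items()}

def cycle_cost(c, cycle):
    """Cost of a cycle given as a vertex list [v0, v1, ..., v0]."""
    return sum(c[(cycle[i], cycle[i + 1])] for i in range(len(cycle) - 1))

c = {('a','b'): 2, ('b','c'): -1, ('c','a'): 4}
p = {'a': 10, 'b': 3, 'c': 7}
cyc = ['a', 'b', 'c', 'a']
# Around a cycle every p(v) is added once and subtracted once, so it cancels:
assert cycle_cost(reduced_costs(c, p), cyc) == cycle_cost(c, cyc) == 5
```

This holds for any price function p whatsoever, which is exactly why reduced costs never create or destroy negative cycles.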

Lemma 7.3.2 A circulation/flow is optimal if and only if there exists a feasible price function.

Proof. First, we prove that if we have a feasible price function p, this implies optimality, which means we have no negative residual cycle. Remember that the definition of feasible p is that we have no negative reduced cost edges. Then, there are no negative reduced cost residual cycles. By previous observation, there is no negative cost residual cycles (non-reduced). So, flow is optimal.


Now, suppose we have optimality and want to show that there exists a feasible price function p. We have no negative cost cycles, and thus no negative reduced cost cycles. Idea: let p be a shipping cost from a new source. We attach a vertex s′ with 0-cost edges to every vertex, compute shortest paths from s′ to all v, and set p(v) = shortest path cost. This works because optimality gives no negative cycles, so shortest paths are well-defined. And p is feasible because the triangle inequality for shortest paths, p(w) ≤ p(v) + c(v, w), is exactly the statement c_p(v, w) ≥ 0.

(We used s′ instead of s because we want to ensure that every vertex has a price, even ones unreachable from s. The 0-cost edges work because every shortest path cost from s′ is then at most 0.)

Now, observe that increasing p(v) makes cp(u, v) smaller and cp(v, w) larger. Ideally, we can do this kind of adjustment until all edges are positive.

§7.3.1 Scaling algorithms using prices

Before: we start with a max flow f and then find the MCC q by cycle canceling to get f* = f + q. Now: we start with an empty flow, which is already at min cost, and do augmenting paths to reach max flow while making sure that we have min cost at all times.

Which augmenting path should we pick? Try to pick the shortest augmenting path under cost metric.

Claim 7.3.3 — Suppose we have no negative cycles in residual graph Gf . Then, SAP will not create one.

Proof. Proof by contradiction. Suppose SAP creates a negative cycle. This means we’ve created an edge that constitutes the negative cycle. The cycle intersects our shortest path from s to t somewhere. So, consider new path: original path plus the negative cycle. The sum of costs would be cheaper because we’re adding a negative cycle. Therefore, SAP isn’t actually the shortest.

Alternate proof. Notice that a price function shifts all s-t path costs by the same amount, p(s) − p(t). (Same telescoping argument as for cycles.) During SAP, the residual graph never has a negative cost cycle; proof by induction. Suppose there are no current negative cycles. Then we can compute shortest paths from s and use them to define new prices p(v). For every edge (v, w) on a shortest path, p(v) + c(v, w) = p(w), because we're on a shortest path, so the reduced cost is 0 = c_p(v, w) = −c_p(w, v). This means our augmentation only introduces zero-reduced-cost edges, and the price function stays feasible.

The algorithm is then obvious: start with no negative costs, keep finding SAP, then once we get max flow we will also have MCF because price function is always feasible.


For unit capacities, each augmentation finds a unit of flow, so we reach max flow after f ≤ m iterations. Each round takes Õ(m), so the runtime is Õ(mf). Caveat: Dijkstra's algorithm requires non-negative edges, but that's okay, because we can keep updating our feasible price function and run Dijkstra on the (non-negative) reduced costs.
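A sketch of this successive-shortest-path scheme with Dijkstra on reduced costs (my own code; it assumes no negative arc costs initially, per the limitation noted just below, and the adjacency layout with reverse-arc indices is a standard implementation pattern, not from lecture):

```python
import heapq

def min_cost_flow(n, arcs, s, t, want):
    """Send up to `want` units from s to t on vertices 0..n-1.
    arcs: list of (u, v, capacity, cost) with nonnegative costs.
    Returns (units sent, total cost)."""
    g = [[] for _ in range(n)]          # arc = [to, cap, cost, index of reverse]
    for u, v, cap, cost in arcs:
        g[u].append([v, cap, cost, len(g[v])])
        g[v].append([u, 0, -cost, len(g[u]) - 1])   # residual reverse arc
    p = [0] * n                          # feasible prices (potentials)
    sent, total = 0, 0
    while sent < want:
        dist = [float('inf')] * n        # Dijkstra on reduced costs
        dist[s] = 0
        prev = [None] * n                # (vertex, arc index) into each vertex
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue                 # stale entry
            for i, (v, cap, cost, _) in enumerate(g[u]):
                rc = cost + p[u] - p[v]  # reduced cost, nonnegative by induction
                if cap > 0 and d + rc < dist[v]:
                    dist[v] = d + rc
                    prev[v] = (u, i)
                    heapq.heappush(heap, (dist[v], v))
        if dist[t] == float('inf'):
            break                        # no augmenting path remains
        for v in range(n):
            if dist[v] < float('inf'):
                p[v] += dist[v]          # keep prices feasible for next round
        delta, v = want - sent, t        # bottleneck along the shortest path
        while v != s:
            u, i = prev[v]
            delta = min(delta, g[u][i][1])
            v = u
        v = t
        while v != s:                    # push flow, paying true (unreduced) costs
            u, i = prev[v]
            g[u][i][1] -= delta
            g[v][g[u][i][3]][1] += delta
            total += delta * g[u][i][2]
            v = u
        sent += delta
    return sent, total
```

For example, with arcs (0,1,2,1), (0,2,2,3), (1,3,2,2), (2,3,2,1), sending 3 units from 0 to 3 costs 10: two units on the cheaper 0-1-3 route, one on 0-2-3.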

This algorithm has two limitations: we assumed at the beginning that we have no negative arcs and that we have unit capacities.

General capacities can be dealt with easily via scaling: shift in one bit of the capacities at a time. This may create residual flow, which we eliminate with more augmentations; we finish after O(log U) shifts. Caveat: can shifting in a bit create negative cost arcs? On a shift, we double the capacities, double the flows, and maybe add 1 unit of capacity on each edge. A 0-capacity edge may become a 1-capacity edge, and that can be a problem if its cost is negative.

Let’s find a way to deal with negative arcs, assuming capacity 1.

1. Send flow on all negative arcs (v, w). This will create a residual edge with positive cost but violates conservation, so declare deficit on v and excess on w.

2. Send the excesses to the deficits by the cheapest possible routes. This is a min cost flow problem but with no negative arcs, so we can compute a fast min cost flow using Dijkstra's algorithm and get feasible price functions for the MCF/circulation such that the residual graph has no negative arcs.

This is an algorithm that finds MCC.

Back to our scaling algorithm, we just have to saturate the negative arcs and use MCF to balance whenever there’s a problem. Runtime of this step is O˜(mf) where f = total excess from saturating the negative arcs. f ≤ m because each edge’s capacity has been incremented by at most 1.

If one scaling step takes O˜(m2) time, then total runtime for min-cost flow is O˜(m2 log U) which is weakly polynomial. Can we find a strongly polynomial algorithm? Yes, but it’s very complicated.

The algorithm up here is called “capacity scaling algorithm.” There’s also “cost scaling” algorithm which depends on log C instead of log U and we will solve this in our homework problem.

§7.4 Complementary slackness

Back to the merchants analogy with feasible prices and reduced costs: when is our network stable? It is when all reduced costs are non-negative. Why? If the reduced cost of an edge is positive, we would be shipping at a loss, so there should be no flow on it. If the reduced cost is negative, it is an opportunity for profit, and that edge will be saturated. If the reduced cost is 0, it does not matter either way. This condition is what is known as “complementary slackness.”

Complementary slackness implies optimality, i.e. if we have a graph G such that every positive-reduced-cost edge is empty and every negative-reduced-cost edge is saturated, then our flow is optimal.

This suggests a way to find a min cost flow. To find feasible prices, we saturate all negative-cost arcs, which creates excesses and deficits, and then tidy up by sending flow back. By our optimality condition, sending flow back should not use positive-reduced-cost arcs, and we have no negative arcs left. So everything usable has reduced cost 0, and tidying up is just a max flow problem with no costs.

In a sense, min cost flow = shortest path + max flow. The best running time we have found is about $O(m^2)$, strongly polynomial. To do better than that, we can use double scaling and get $\tilde O(mn \log\log U \log C)$.

8 Linear Programming

In min cost flow, we want to minimize $\sum_e c(e)f(e)$ subject to $0 \le f(e) \le u(e)$ and $\sum_{e \text{ into } v} f(e) = \sum_{e \text{ out of } v} f(e)$. We can view the $f(e)$ as variables and the $u(e)$ as constants and see that both our objective function and the constraints are linear.

A general problem of this kind is to find the min/max of a linear objective subject to given linear constraints. This was formulated in the 1940s as part of the war effort. George Dantzig developed the “simplex algorithm” to solve these linear programs and established the fields of “operations research” and “combinatorial optimization”, which encompass both the shortest path and max flow problems we have studied in this course.

The theory lagged behind the practice (the algorithm), however. There were many remaining questions, such as: Can it be solved? Can you write down a solution and check it? Can you find a solution or show that none exists? Dantzig showed that it is possible to write down the solutions, so linear programming is in NP. The answer can be found by brute force, which takes exponential time in the worst case. In 1979, Khachiyan came up with the ellipsoid algorithm, which works in polynomial time but is not practical. A better algorithm (Karmarkar's interior point method) was developed in 1984 and is practical.

Linear programming (LP), to David Karger, is the boundary between nice combinatorial problems and complicated general optimizations.

§8.1 Forms of linear programs

Definition 8.1.1 (General form of linear programming). A linear program consists of:

• Variables

• Objective (a linear function of the variables to minimize or maximize)

• Constraints (linear inequalities of the form ≥ or ≤)

A vector x is feasible if it satisfies all the constraints. The LP is feasible if there exists a feasible x. A feasible x is optimal if it gives the best objective value over all feasible points.

Not all LPs are feasible. For example, $x \ge 5$ and $x \le 0$ together have no solution. It is also possible for an LP to be unbounded, if there exist feasible x of arbitrarily good value. For example: maximize x subject to $x \ge 0$.

Lemma 8.1.2 Every LP is unbounded, infeasible, or has an optimal solution.


Proof. This follows from the topological compactness of $\mathbb{R}^n$. We can keep finding a sequence of improving points. Either there is no limit (so the LP is unbounded), or there is a limit that we converge to, which lies in the set.

Definition 8.1.3 (Canonical form). An LP can be written in canonical form as follows: maximize $c^\top x$ subject to the constraints $Ax \le b$ coordinate-wise. ($A$ is a matrix; $c$, $x$, and $b$ are vectors.) For each $i$, $a_i x \le b_i$. $c$ is called the objective.

Claim 8.1.4 — Every LP can be transformed into canonical form.

Proof. Suppose we have $\min d^\top x$; we can change it to $\max -d^\top x$. Suppose we have $\sum a_i x_i + b_i \le \sum q_i x_i + r_i$; rearrange it to $\sum (a_i - q_i) x_i \le r_i - b_i$. Suppose we have $a_i x \ge b_i$; rearrange it to $(-a_i) x \le -b_i$. Suppose we have $a_i x = b_i$; write two inequalities: $a_i x \le b_i$ and $a_i x \ge b_i$.
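The transformations in this proof are purely mechanical, so they can be written as a short routine. The function name, the constraint encoding, and the tiny examples are ours for illustration:

```python
def to_canonical(sense, c, constraints):
    """Convert an LP to canonical form: maximize c'.x subject to rows a.x <= b.

    sense: "min" or "max"; c: objective coefficients;
    constraints: list of (a, op, b) with op in {"<=", ">=", "="}.
    Returns (c', rows) where each row (a, b) means a.x <= b.
    """
    c = [-ci for ci in c] if sense == "min" else list(c)
    rows = []
    for a, op, b in constraints:
        if op == "<=":
            rows.append((list(a), b))
        elif op == ">=":                       # a.x >= b  ->  (-a).x <= -b
            rows.append(([-ai for ai in a], -b))
        else:                                  # a.x = b  ->  two inequalities
            rows.append((list(a), b))
            rows.append(([-ai for ai in a], -b))
    return c, rows
```

For example, “minimize x subject to x ≥ 2” becomes “maximize −x subject to −x ≤ −2”, and an equality constraint expands into its two inequality halves.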

Definition 8.1.5 (Standard form). The standard form of an LP asks us to minimize $c^\top x$ subject to $Ax = b$ and $x \ge 0$ coordinate-wise.

Claim 8.1.6 — Every LP can be transformed into an equivalent standard form.

Proof. We first convert the LP to canonical form. So we only have to figure out how to convert canonical form to standard form: change $Ax \le b$ into $Ax = b$ and $x \ge 0$.

Suppose we have $a_i x \le b_i$; add a “slack variable” $s_i$ and write $a_i x + s_i = b_i$ with $s_i \ge 0$.

If a variable x may be positive or negative, split it into $x^+ \ge 0$ and $x^- \ge 0$ where $x = x^+ - x^-$.
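The two steps above (slack variables, then variable splitting) combine into one mechanical conversion. A minimal sketch, with our own function name and layout of the new variable vector $w = (x^+, x^-, s)$:

```python
def canonical_to_standard(c, A, b):
    """Turn 'max c.x s.t. A x <= b' (x unrestricted) into standard form
    'min c_s.w s.t. A_s w = b, w >= 0' with w = (x+, x-, s)."""
    m, n = len(A), len(c)
    # max c.x  ->  min (-c).x+ + c.x-; the m slack variables cost 0
    c_s = [-ci for ci in c] + list(c) + [0] * m
    A_s = []
    for i, row in enumerate(A):
        slack = [1 if j == i else 0 for j in range(m)]  # identity block
        A_s.append(list(row) + [-aj for aj in row] + slack)
    return c_s, A_s, list(b)

# "max x s.t. x <= 1" becomes "min -x+ + x- s.t. x+ - x- + s = 1, all >= 0".
c_s, A_s, b_s = canonical_to_standard([1], [[1]], [1])
```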

Steps to solve linear programming:

1. Characterize solutions and write them down.

2. Check whether a solution exists.

3. Verify infeasibility.

4. Find a way to verify optimality.

5. Construct optimal solutions.

§8.2 Solving linear equalities

Suppose we are given a linear system in form of Ax = b. When does a solution exist?

For square matrices, we can verify that a solution exists by simply checking a claimed solution in $O(n^2)$ time... sort of. There is a problem: we might not be able to write x down, because some values may require too many bits to represent.

Lemma 8.2.1 These statements are equivalent:

1. A is invertible

2. A> is invertible

3. $\det(A) \ne 0$

4. A has linearly independent rows

5. Ax = b has a unique solution for all b

6. Ax = b has a unique solution for some b

§8.2.1 Writing down solutions

An integer $n$ uses $\log n$ bits to represent. A rational number $\frac{p}{q}$ requires $\log p + \log q$ bits to represent. For simplicity, we will call the number of bits required $\mathrm{size}(n)$ and $\mathrm{size}(\frac{p}{q})$ respectively. Notice that $\mathrm{size}(xy) = \mathrm{size}(x) + \mathrm{size}(y)$.

An $n$-vector has size $\sum \mathrm{size}(\text{entries}) + \log n$. (We need $\log n$ to record the length of the vector itself.) An $m \times n$ matrix requires $\approx mn \cdot \mathrm{size}(\text{an entry})$.

What is the size of a matrix product? $\mathrm{size}(AB) \le \mathrm{size}(A) + \mathrm{size}(B)$.

Our goal in solving linear programming problems is to represent the solutions with size polynomial in the input size.

Claim 8.2.2 — $\det(A)$ has size polynomial in $\mathrm{size}(A)$.

Proof. Let $A$ be an $n \times n$ matrix. We know $\det(A) = \sum_{\pi : n \to n} \mathrm{sign}(\pi) \cdot \prod_{i=1}^{n} A_{i,\pi(i)}$, where $\mathrm{sign}(\pi)$ is the parity of the number of swaps. Each product has size $n \cdot \mathrm{size}(\text{entry})$. So the entire sum of $n!$ such terms has size at most $\log(n!) + n \cdot \mathrm{size}(\text{entry}) = O(n \log n) + n \cdot \mathrm{size}(\text{entry})$, which is polynomial.


Corollary 8.2.3 $A^{-1} = \frac{1}{\det(A)} \cdot \mathrm{cofactor}(A)$. Each entry is of the form $\frac{\det(A_{ij})}{\det(A)}$, which is polynomial in $n$. So the solution to $Ax = b$ is $x = A^{-1}b$, which is polynomial in the input size.

§8.2.2 Finding a solution

We can use Gaussian elimination or matrix inversion, which requires a polynomial number of operations, and each operation involves numbers that are small integer multiples of determinants, so the total time is polynomial.

§8.2.3 Generalizing to non-square matrices

When we say $Ax = b$, we are simply asserting that the columns of $A$ span $b$. This holds if and only if a maximal linearly independent subset of the columns spans $b$; we can extend that subset to a basis by adding more independent columns, getting a square matrix $A'$. Then $A'x = b$ has a unique solution, giving us a unique $x$ that zeroes out the added basis columns. We can find and check the answer in polynomial time.

Therefore, $Ax = b$ has a simple witness for feasibility, namely a valid solution $x$.

§8.2.4 Proving there is no solution

We go back to the idea that $Ax = b$ means the columns of $A$ span $b$. Basically, we need to prove that $b \notin \{Ax \mid x \in \mathbb{R}^n\}$. (This set is the subspace spanned by the columns of $A$.) In the 2D case, the subspace $\{Ax\}$ is a line through the origin. To show that $b$ is not spanned by $A$, we can break $b$ into a component parallel to the line and a component perpendicular to it. Call the perpendicular portion $y$, where $y \ne 0$. We have $y \cdot b \ne 0$ while $yA = 0$. This is our proof: just give the vector $y$ and verify these two conditions.

To construct a proof $y$: note that if $y \cdot b \ne 0$, some scaling of $y$ has $yb = 1$. Our conditions become $yb = 1$ and $yA = 0$, which is a simple system of linear equalities we already know how to solve. This also tells us that the size of $y$ is polynomial in the sizes of $A$ and $b$.

Now, in the other direction: suppose $Ax = b$ has a solution. Then $y \cdot b = y(Ax) = (yA)x = 0$, so no such certificate $y$ can exist.

Theorem 8.2.4 $Ax = b$ has no solution if and only if there exists a $y$ such that $yA = 0$ and $y \cdot b \ne 0$.


There is some sort of duality going on here: If you can solve the original system, then you can’t solve this dual system. Otherwise, if you can solve this dual system, it shows that the original system is unsolvable.

To summarize, we can either find a solution or prove non-existence, and check either proof in polynomial time.
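Checking such an infeasibility certificate really is just a handful of multiplications. A made-up example: the columns of $A$ below span only the line $\{(t, 2t)\}$, and $b = (1, 3)$ is off that line, so $y = (-2, 1)$ works as the Theorem 8.2.4 witness:

```python
# A small system Ax = b with no solution (example is ours).
A = [[1, 2],
     [2, 4]]
b = [1, 3]

# Claimed certificate: a row vector y with yA = 0 and y.b != 0 (scaled to 1).
y = [-2, 1]

yA = [sum(y[i] * A[i][j] for i in range(2)) for j in range(2)]
yb = sum(yi * bi for yi, bi in zip(y, b))

# Verifying yA == 0 and yb == 1 is the polynomial-time proof of infeasibility.
```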

§8.3 Inequalities and Geometry

Each constraint corresponds to a half-space. If we consider many constraints, we get a polytope defined by the intersection of half-spaces. A polytope $P$ is convex: for $x, y \in P$ and $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda)y \in P$ (every line segment between points of the polytope stays in the polytope).

Claim 8.3.1 — An optimal solution can be found at a “corner” of the polytope.

Definition 8.3.2 (Vertex). A vertex is a point that is not a convex combination of any other points in the polytope.

Definition 8.3.3 (Extreme point). An extreme point is a point that gives a unique optimal solution for some objective function c (cost).

§8.3.1 Basic solution

We say a constraint $a_i x \le b_i$, $a_i x = b_i$, or $a_i x \ge b_i$ is tight when $a_i x = b_i$.

Definition 8.3.4 (Basic point). For n-dimensional LP, a point is basic if

1. All equality constraints are tight

2. n linearly independent constraints are tight

If a basic solution is feasible, we call it a basic feasible solution (BFS).

We will see that vertex = extreme point = basic feasible solution.

Lemma 8.3.5 Any standard form LP, Ax = b, x ≥ 0 with an optimal solution has one at a basic feasible solution.

41 Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes

Proof. Suppose there exists an optimal solution that is not at a basic feasible solution. This implies fewer than $n$ linearly independent constraints are tight, so there is some degree of freedom: we can move x within the subspace defined by the tight constraints while keeping them tight. Moving can either improve, worsen, or leave the objective unchanged. If we can improve, our assumption of optimality is wrong. If we can worsen, we can move in the opposite direction, which also improves the objective. So moving can only leave the objective unchanged, meaning the entire subspace is optimal.

In standard form, this subspace contains a line through the current optimum x, which we write as $x + \varepsilon d$ for some direction $d$. In one of the two directions, some coordinate $x_i$ is decreasing ($x_i \ge 0$ and that constraint is not tight). We keep moving in that direction until some $x_i$ hits 0, giving us a new tight constraint. Repeat until we have $n$ tight constraints, giving a basic feasible solution.

In canonical form, we might not have an optimum at a BFS. For example: maximize $x_1$ subject to $x_1 \le 1$ in two dimensions. There is no intersection of $n$ constraints, so no BFS to find.

Corollary 8.3.6 If there exists an x of any given value, then there exists a basic feasible solution that is at least as good as x.

Now, we show that BFS = vertex.

Proof. Suppose a BFS is not a vertex. Then the BFS is a convex combination of two points in the polytope, so there is a feasible line in the polytope through that point. Along this line no additional constraint is tight, so we do not have $n$ tight linearly independent constraints, and the point is not a BFS — a contradiction.

Conversely, if fewer than $n$ linearly independent constraints are tight (so the point is not a BFS), their intersection contains a feasible line through the point; the point is then a convex combination of points on either side along that line, and thus not a vertex.

Now, extreme point = BFS.

Proof. Suppose we have an extreme point, meaning it is a unique optimum in some direction. We claim it is a BFS: if not, we do not have $n$ tight constraints, so there is a feasible line through the point along which we can move without changing the objective, so the extreme point is not actually a unique optimum — a contradiction.

Conversely, suppose we have a BFS. Let $T$ be the set of indices of tight $x_i \ge 0$ constraints. Define the objective $\sum_{i \in T} x_i$. The minimum objective value is 0, achieved at our BFS; at every other feasible point the objective is $> 0$. So the BFS is the unique optimum for this objective, i.e. an extreme point.


§8.3.2 Naive algorithm for solving LP

We know the optimum can always be found at a BFS, so a simple algorithm is to try every BFS. How many BFS are there? Suppose we have $m$ constraints in $n$ dimensions. There are at most $\binom{m}{n}$ BFS (picking $n$ constraints to be tight out of all $m$). We try all of them: solve each square linear system by Gaussian elimination or by computing the inverse (so each candidate has polynomial size). Then, for the points we found, just check feasibility, and for the feasible ones compute and keep track of the best objective.

The runtime is $\binom{m}{n} \cdot n^3$ if we use Gaussian elimination, which is sadly exponential because $\binom{m}{n}$ can be as large as about $m^n$. At least it is finite!
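The naive algorithm can be sketched in a few lines for a standard-form LP with $m = 2$ rows, where each basis system is $2 \times 2$ and Cramer's rule suffices. The helper names and the tiny test LP (max $x_1 + x_2$ s.t. $x_1 \le 1$, $x_2 \le 1$, written with slacks) are ours:

```python
from itertools import combinations

def solve2(M, rhs):
    """Solve a 2x2 system by Cramer's rule; None if the determinant is 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        return None
    x0 = (rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det
    x1 = (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det
    return [x0, x1]

def brute_force_lp(c, A, b):
    """min c.x s.t. Ax = b, x >= 0 with m = 2 rows: try every basis."""
    m, n = len(A), len(c)
    best = (float("inf"), None)
    for basis in combinations(range(n), m):      # choose m basic columns
        M = [[A[i][j] for j in basis] for i in range(m)]
        xB = solve2(M, b)
        if xB is None or any(v < 0 for v in xB): # singular or infeasible
            continue
        x = [0.0] * n
        for j, v in zip(basis, xB):
            x[j] = v
        best = min(best, (sum(ci * xi for ci, xi in zip(c, x)), x))
    return best

obj, x = brute_force_lp([-1, -1, 0, 0], [[1, 0, 1, 0], [0, 1, 0, 1]], [1, 1])
```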

§8.4 Duality: Verifying optimality in polynomial time

This is a decision problem, so we can study the complexity of answering $\mathrm{OPT}(LP) \ge k$. We know this is in NP because, presented with an optimum, we can check feasibility and its objective value in polynomial time. How do we prove that $\mathrm{OPT}(LP) \not\ge k$, i.e. that the problem is in coNP?

Duality will help provide a succinct proof that we cannot do better than some point, which proves that the point is optimal. This is a very powerful result. We have already seen some special cases: max-flow min-cut, prices of min cost flow, potentials of shortest paths, Nash equilibria, etc.

Note 8.4.1. Starting from here, some transposes may be left off. Verify carefully.

Idea: suppose we have $v^* = \min c^\top x$ subject to $Ax = b$ and $x \ge 0$. Let us try to find a lower bound on $v^*$. One technique is to take linear combinations of constraints, just like in Gaussian elimination. Multiply each $a_i x = b_i$ by some $y_i$; adding them up gives $\sum y_i (a_i x) = \sum y_i b_i$, namely $yAx = yb$, for any $y$.

Suppose we could find some $y$ such that $y^\top A = c$. Then $y^\top b = y(Ax) = (yA)x = c^\top x$ for all feasible $x$. This would mean all feasible points have the same objective value, so in general such a $y$ does not exist.

Let us opt for a looser goal: find $y$ such that $y^\top A \le c$. Then, since $x \ge 0$, we get $yb = y(Ax) = (yA)x \le c^\top x$ for all feasible $x$. This means finding such a $y$ gives us a lower bound on the optimal objective: $v^* \ge yb$.

What is the best possible lower bound? Maximize $yb$ subject to $yA \le c$. This is called the dual LP; the original problem is called the primal LP. Define $w^* = \max yb$.

We have just proven the weak duality theorem: $w^* \le v^*$.


Sanity check: the dual of the dual LP is the primal. The dual is $\max yb$ where $yA \le c$. Change to a minimization and add slack variables: $\min -by$ subject to $A^\top y + Is = c$, which is $\min (-by^+ + by^-)$ subject to $A^\top y^+ - A^\top y^- + Is = c$, with $y^+, y^-, s \ge 0$. This can be seen as

$$\min \begin{pmatrix} -b & b & 0 \end{pmatrix} \begin{pmatrix} y^+ \\ y^- \\ s \end{pmatrix} \quad \text{subject to} \quad \begin{pmatrix} A^\top & -A^\top & I \end{pmatrix} \begin{pmatrix} y^+ \\ y^- \\ s \end{pmatrix} = c.$$

The dual of this standard-form LP is $\max cz$ such that $z \begin{pmatrix} A^\top & -A^\top & I \end{pmatrix} \le \begin{pmatrix} -b & b & 0 \end{pmatrix}$. This simplifies to $zA^\top = -b$ and $z \le 0$. Setting $x = -z$, we have $\min cx$, $Ax = b$, $x \ge 0$. So the dual of the dual is the primal.

Corollary 8.4.2 We have a primal LP P and its dual D. If P is feasible, then D has an upper bound, meaning D is either infeasible or bounded. If P is feasible and unbounded, then the dual is infeasible. Then, we can go the other way around. So, we have four cases:

1. Both bounded and feasible

2. Both infeasible

3. One infeasible, one unbounded

4. The other way around

Shorthand: if P is unbounded, we write $v^* = -\infty$, implying D is infeasible. If P is infeasible, we write $v^* = +\infty$, implying D is unbounded or infeasible.

§8.5 Strong Duality

Karger intuitively proved this with physics and I have absolutely no idea what is going on so I will just jump right into the formal proof.

Consider an optimal $y^*$ for the dual $\min \{yb \mid yA \ge c\}$. Let us choose a maximal linearly independent subset $S$ of the tight constraints (columns of $A$); $|S| \le m$, where $m$ is the dimension of $y$. Let $A_S$ and $c_S$ be the restrictions of $A$ and $c$ to $S$. We still have $y^*b = \min \{yb \mid yA_S \ge c_S\}$ unchanged. This means we can forget the original $A$, consider only $A_S$, and assume that all columns are linearly independent.

44 Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes

In other words: suppose we are able to prove strong duality with respect to $A_S$. Then we have proven that for some $x_S^*$ we have $A_S x_S^* = b$, $x_S^* \ge 0$, and $c_S x_S^* = y^*b$. We can recover the full $A$: let $x^*$ be $x_S^*$ extended with 0 in all other coordinates. Then $cx^* = c_S x_S^*$, $Ax^* = A_S x_S^* = b$, and $x^* \ge 0$.

Claim 8.5.1 — There exists an $x^*$ such that $Ax^* = b$.

Proof. Suppose there is no such $x^*$, for the sake of contradiction. The “duality” for linear equalities (as in the previous section) says that there exists some $z$ such that $zA = 0$ but $zb \ne 0$. Without loss of generality, let $zb < 0$.

Consider $y' = y^* + z$. Then $y'A = (y^* + z)A = y^*A = c$, so $y'$ is feasible. Its value is $y'b = (y^* + z)b = y^*b + zb < y^*b$, which means $y'$ is a better solution — a contradiction.

Claim 8.5.2 — $y^*b = cx^*$

Proof. We know $Ax^* = b$ and $y^*A = c$, so $y^*b = y^*(Ax^*) = (y^*A)x^* = cx^*$.

Claim 8.5.3 — $x^* \ge 0$

Proof. Suppose some $x_i^* < 0$. Let $c' = c + e_i$. Consider a solution $y'$ to $y'A = c'$. Note that $c' \ge c$, implying $y'A \ge c$, so $y'$ is feasible for the original LP. Its value is $y'b = y'(Ax^*) = (y'A)x^* = c'x^* = cx^* + e_i x^* < cx^*$ because $x_i^* < 0$. So $y^*$ was not optimal (pushing the wall up lowers the ball), a contradiction.

Intuitively, $x_i^*$ tells us how much the optimal solution will change if we modify the position $c_i$ of the corresponding constraint. Increasing $c_i$ corresponds to tightening the constraint.

Corollary 8.5.4 Optimization = Finding a feasible point.

Given an LP $\min cx$ s.t. $Ax = b$, $x \ge 0$ and its dual $\max yb$ s.t. $yA \le c$: if we put the two together, we are solving the system $Ax = b$, $x \ge 0$, $yA \le c$, $cx = by$. Any feasible solution of this system is feasible for both LPs and achieves matching objectives, so it is optimal for both LPs.
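The combined system above is easy to check for a claimed primal-dual pair. A made-up example (the LP and the claimed optima are ours): primal $\min x_1 + 2x_2$ s.t. $x_1 + x_2 = 1$, $x \ge 0$; dual $\max y$ s.t. $y \le 1$, $y \le 2$:

```python
c = [1, 2]
A = [[1, 1]]
b = [1]

x_star = [1, 0]   # claimed primal optimum
y_star = [1]      # claimed dual optimum

# Feasibility of x*: Ax = b and x >= 0.
primal_ok = all(sum(A[i][j] * x_star[j] for j in range(2)) == b[i]
                for i in range(1)) and all(xj >= 0 for xj in x_star)
# Feasibility of y*: yA <= c (one dual constraint per primal variable).
dual_ok = all(sum(y_star[i] * A[i][j] for i in range(1)) <= c[j]
              for j in range(2))
# Matching objectives cx = yb certify that both points are optimal.
cx = sum(cj * xj for cj, xj in zip(c, x_star))
yb = sum(yi * bi for yi, bi in zip(y_star, b))
```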

§8.6 Rules for Duals

For standard primal LP, we can find a canonical dual LP. Can we dualize any LP?

Primal LP: let $v^* = \min (c_1x_1 + c_2x_2 + c_3x_3)$ where $x_1 \ge 0$, $x_2 \le 0$, and $x_3$ is UIS (unrestricted in sign), subject to the constraints:

$A_{11}x_1 + A_{12}x_2 + A_{13}x_3 = b_1$

$A_{21}x_1 + A_{22}x_2 + A_{23}x_3 \ge b_2$

$A_{31}x_1 + A_{32}x_2 + A_{33}x_3 \le b_3$

The corresponding dual: $w^* = \max (y_1b_1 + y_2b_2 + y_3b_3)$ subject to

$y_1A_{11} + y_2A_{21} + y_3A_{31} \le c_1$

$y_1A_{12} + y_2A_{22} + y_3A_{32} \ge c_2$

$y_1A_{13} + y_2A_{23} + y_3A_{33} = c_3$

where $y_1$ is UIS, $y_2 \ge 0$, $y_3 \le 0$.

Table 8.1: Simple conversion between primal minimization LP and dual maximization LP

Primal min              | Dual max
constraint $\ge b_i$    | variable $y_i \ge 0$
constraint $\le b_i$    | variable $y_i \le 0$
constraint $= b_i$      | variable $y_i$ UIS
variable $x_j \ge 0$    | constraint $\le c_j$
variable $x_j \le 0$    | constraint $\ge c_j$
variable $x_j$ UIS      | constraint $= c_j$

Intuitively, adding a primal constraint makes the primal harder to satisfy, so the minimum objective should be higher. This is reflected in the dual: adding a variable allows the dual to achieve a higher value as it tries to maximize the lower bound for the primal minimization.

In another sense, tighter primal constraints lead to looser dual variables. Inequalities lead to sign restrictions. Equality constraints lead to no restrictions.

The natural (inequality) constraints correspond to non-negative variables. In the physics analogy, a lower bound on a minimization means non-negative forces; an upper bound leads to non-positive forces.
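Table 8.1 is a purely mechanical recipe, so dualization can be written as a lookup. The function and the type encodings are ours; as a sanity check, dualizing a standard-form LP (all "=" constraints, all variables ≥ 0) must give the canonical dual (y unrestricted, all dual constraints "≤"):

```python
# Transcription of Table 8.1: primal-min constraint type -> dual variable
# sign, and primal variable sign -> dual constraint type.
CON_TO_VAR = {">=": ">=0", "<=": "<=0", "=": "UIS"}
VAR_TO_CON = {">=0": "<=", "<=0": ">=", "UIS": "="}

def dualize(A, b, c, con_types, var_types):
    """Dual of: min c.x s.t. rows of A compare to b via con_types,
    x restricted by var_types.  Returns the data of: max y.b s.t.
    y A^T compares to c via the returned dual constraint types."""
    m, n = len(A), len(A[0])
    At = [[A[i][j] for i in range(m)] for j in range(n)]  # transpose
    dual_var_types = [CON_TO_VAR[t] for t in con_types]
    dual_con_types = [VAR_TO_CON[t] for t in var_types]
    return At, c, b, dual_con_types, dual_var_types

At, rhs, obj, cons, vars_ = dualize([[1, 1]], [1], [1, 2],
                                    ["="], [">=0", ">=0"])
```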

§8.7 Applications of duality

§8.7.1 Shortest path

Think of “lifting the graph up from the table” analogy. We can see that the shortest path is like trying to maximize the distance between t and the lifted s.

$w^* = \max (d_t - d_s)$ subject to $d_j - d_i \le c_{ij}$ for all edges $(i, j)$.

This is our dual. Let us convert to the primal: each edge constraint becomes a variable $y_{ij}$, and each vertex distance variable $d_i$ becomes a constraint. In the primal, the objective is $\min \sum c_{ij} y_{ij}$. The constraints are $\sum_j y_{ji} - \sum_j y_{ij} = -1$ if $i = s$, $+1$ if $i = t$, and $0$ otherwise, with $y_{ij} \ge 0$.

This is a flow problem! A non-negative flow, conserved at every vertex, except that one unit leaves $s$ and one unit enters $t$. The objective minimizes the cost under the edge costs $c_{ij}$.
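We can check this duality numerically on a made-up toy graph: Bellman-Ford distances are a feasible point of the dual ($d_j - d_i \le c_{ij}$ on every edge), and their objective $d_t - d_s$ equals the shortest path length, certifying optimality on both sides:

```python
# Toy graph (ours): edges (i, j, cost).  The shortest s->t path costs 2.
edges = [("s", "a", 1), ("a", "t", 1), ("s", "t", 3)]
nodes = ["s", "a", "t"]

# Bellman-Ford distances: a feasible point of the dual LP.
d = {v: float("inf") for v in nodes}
d["s"] = 0
for _ in range(len(nodes) - 1):
    for i, j, c in edges:
        d[j] = min(d[j], d[i] + c)

# Dual feasibility: every edge satisfies d_j - d_i <= c_ij ...
feasible = all(d[j] - d[i] <= c for i, j, c in edges)
# ... and the dual objective matches the primal shortest-path cost.
objective = d["t"] - d["s"]
```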

§8.7.2 Maximum flow

For simplicity, we turn this problem into a circulation by adding a (t, s) edge and solving for the circulation maximizing the flow on (t, s). This makes our lives a bit easier because we do not have to write special cases for the conservation conditions.

Let $x_{vw}$ be the flow on edge $(v, w)$. Our LP is:

$w^* = \max x_{ts}$

$\sum_w (x_{vw} - x_{wv}) = 0 \quad \forall v$ (balance)

$x_{vw} \le u_{vw} \quad \forall e = (v, w)$ (capacity)

$x_{vw} \ge 0$ (lower bound)

Then, its dual has a variable $y_{vw}$ per edge capacity constraint and a variable $z_v$ per balance constraint. To understand how the variables turn into constraints, we write out the matrix of constraint coefficients. The rows separate into edge capacity rows and vertex balance rows. In each edge capacity row, a single 1 appears, multiplying $x_{vw}$. In each balance row for $v$, a $+1$ appears for each edge leaving $v$ and a $-1$ for each edge entering $v$. Looking at the columns, the number of columns corresponds to the number of edges (variables $x_{vw}$). Each column has a 1 in the $y_{vw}$ row and, among the $z$ rows, contributes $+z_v - z_w$ for edge $(v, w)$. The right-hand sides of the new constraints are 0 for $(v, w) \ne (t, s)$ and 1 for $(v, w) = (t, s)$. The primal variables are natural, $x_{vw} \ge 0$, so the relationship in the dual constraints is $\ge$, because we are dealing with a minimization problem. As for the dual variables, $y_{vw} \ge 0$ and the $z_v$ are unrestricted in sign.

The dual LP is:

$v^* = \min \sum u_{vw} y_{vw}$

$z_v - z_w + y_{vw} \ge 0 \quad \text{for } (v, w) \ne (t, s)$

$z_t - z_s + y_{ts} \ge 1$

$y \ge 0, \quad z \in \mathbb{R}$

How can we interpret this? Notice that the constraints look like the triangle inequality. We can see $y_{vw}$ as lengths and $z_v$ as distances. Without loss of generality, assume $y_{ts} = 0$ (otherwise we could have set $u_{ts} = \infty$ for the max flow; since we are minimizing, the optimum will not pay for $y_{ts}$). So the second constraint is $z_t - z_s \ge 1$, meaning the distance between $s$ and $t$ must be at least 1. Without loss of generality, let $z_s = 0$ (the shortest path start). Thus the problem is like placing nodes along a number line from 0 to 1, where each edge length $y_{vw}$ must cover the distance between its endpoints (triangle inequality, $y_{vw} \ge z_w - z_v$). This is called an embedding problem. If we look at each edge as a pipe, $u_{vw}$ is the cross-section area and $y_{vw}$ is the length, so we are minimizing the total volume of the network. Suppose we send $f$ units of flow from $s$ to $t$. Each unit of flow is 1 unit long and has 1 unit of area, so its volume is 1. Therefore, the minimum volume of this network embedding equals the volume of the flow in the network. To find the optimal edge lengths, we introduce complementary slackness.

§8.8 Complementary slackness

Given feasible x and y for primal and dual LPs. Define cx − yb as the duality gap. Weak duality tells us that the gap is ≥ 0. Strong duality tells us that the gap is 0 at optimum.

Suppose the dual is $\max \{yb \mid yA \le c\}$. We can rewrite this as $\max \{yb \mid yA + s = c, s \ge 0\}$ with $s$ as the slack variables. Then $cx - yb = (yA + s)x - yb$. The primal is $\min \{cx \mid Ax = b, x \ge 0\}$, so we can substitute $Ax = b$ and the gap becomes $(yA + s)x - y(Ax) = sx$. Since $s \ge 0$ and $x \ge 0$, we have $sx \ge 0$. If $sx = 0$ then, since $sx = \sum s_i x_i$, each $s_i x_i = 0$: either the variable is 0 ($x_i = 0$) or the corresponding constraint is tight ($yA_i = c_i$).

§8.8.1 Maximum flow (repeated)

Consider optimal variable values $y^*$ and $z^*$. We will have $y_{vw} = z_w - z_v$ whenever this is positive, because we want to minimize $\sum u_{vw} y_{vw}$ but are restricted by the lower-bound triangle inequality condition.


Define the cut $S = \{v \mid z_v^* < 1\}$. We leverage complementary slackness. Suppose $(v, w)$ leaves (goes out of) the cut $S$. Then $y_{vw} \ge z_w - z_v > 0$ because $z_v < 1$ and $z_w \ge 1$, so the corresponding capacity constraint must be tight: $x_{vw} = u_{vw}$. Suppose $(v, w)$ enters the cut $S$. Then $z_v > z_w$, and $y_{vw} \ge 0$ implies $z_v + y_{vw} > z_w$, meaning the constraint is slack. Therefore the variable $x_{vw} = 0$.

We learn that every leaving edge is saturated and every entering edge carries 0. By flow conservation, the value of the cut is the value of the flow. (Flow cannot come back, because every entering edge carries 0.)

Any cut gives a dual solution: just set $y_{vw} = 1$ for each cut edge, with $z_v = 0$ on the $s$-side and $z_v = 1$ on the $t$-side.
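This cut-to-dual construction can be verified on a made-up toy network (the graph, capacities, and cut are ours): max flow value is 2, $S = \{s, a\}$ is a min cut, and the resulting $(y, z)$ is dual-feasible with objective 2:

```python
# Toy network: s->a (cap 2), a->t (cap 1), s->t (cap 1), plus the
# circulation edge t->s.  Max flow value is 2; S = {s, a} is a min cut.
cap = {("s", "a"): 2, ("a", "t"): 1, ("s", "t"): 1}
S = {"s", "a"}

# Dual solution from the cut: z_v = 0 inside S, 1 outside;
# y_vw = 1 on edges leaving the cut, 0 elsewhere; y_ts = 0.
z = {v: (0 if v in S else 1) for v in ["s", "a", "t"]}
y = {e: (1 if e[0] in S and e[1] not in S else 0) for e in cap}

# Dual feasibility: z_v - z_w + y_vw >= 0 for normal edges,
# and z_t - z_s + y_ts >= 1 for the circulation edge.
feasible = all(z[v] - z[w] + y[(v, w)] >= 0 for v, w in cap) \
           and z["t"] - z["s"] + 0 >= 1
# Dual objective = total capacity of cut edges = cut value = flow value.
objective = sum(cap[e] * y[e] for e in cap)
```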

Note that we cannot apply LP strong duality directly because the optimal solution to the minimization LP might not be a valid cut. Leveraging complementary slackness allows us to transform any fractional LP solution into a cut.

§8.8.2 Min cost circulation

Change the primal objective to $\min \sum c_{vw} x_{vw}$; the constraints stay the same. The dual objective is again a maximization; now the right-hand sides of the dual constraints become $\le c_{vw}$, and we have $y \le 0$. Rewrite $p_v = -z_v$. Then we have the dual LP: $\max \{\sum u_{vw} y_{vw} \mid y_{vw} \le c_{vw} + p_v - p_w, \; y \le 0\}$. Think of reduced costs! The objective is just the (capacity-weighted) sum of the negative reduced costs.

If $x_{vw} < u_{vw}$, then the capacity constraint is slack, so $y_{vw} = 0$. If the reduced cost is negative, then $y_{vw} \le c_{vw} + p_v - p_w < 0$ is nonzero, so $x_{vw} = u_{vw}$. If the reduced cost is positive, then $y_{vw} = 0 < c_{vw} + p_v - p_w$ (the constraint is slack), so $x_{vw} = 0$.

Just by applying duality, we get feasible price functions and complementary slackness characteristics we derived directly earlier in the class!

9 Algorithms for linear programming

§9.1 Simplex algorithm (1940s)

§9.1.1 Notation juggling

We already have an algorithm: try every basic feasible solution (BFS). There are at most $\binom{m}{n}$ BFS, so the runtime is exponential. The simplex algorithm tries to be smarter about the search. We look at the standard form $\min \{cx \mid Ax = b, x \ge 0\}$ and assume without loss of generality that $A$ has full row rank.

Recall that at a BFS, all $m$ equality constraints are tight, and at least $n - m$ of the inequality constraints $x_i \ge 0$ are tight. Together, these tight constraints form a basis of $\mathbb{R}^n$.

If we write out the constraint matrix, the top $m$ rows correspond to the equality constraints. The other $n$ rows form the identity matrix for the $x_i \ge 0$ constraints; $n - m$ of them are tight, and the remaining $m$ might not be. We can then separate the matrix into the top $m$ rows and the bottom $n$ rows, and into the left $n - m$ columns and the right $m$ columns.

At a BFS, we need the $n$ tight rows to be linearly independent. Notice that the first $n - m$ columns are clearly independent because of the identity-matrix part below; but the last $m$ columns have all zeroes there, so their independence relies completely on the top (equality) rows.

Definition 9.1.1 (Basic feasible solution (alternate)). $x$ is a basic feasible solution if and only if $Ax = b$, $x \ge 0$, and the columns of $A$ corresponding to the slack variables $x_j > 0$ are linearly independent.

Definition 9.1.2 (Basis). Such a set of $m$ linearly independent columns is called a basis. The corresponding variables $x_j$ are basic ($B$), and the others are non-basic ($N$).

Given $B$, we can compute $x$: let $A_B$ be the $B$ columns of matrix $A$. $A_B$ is $m \times m$ with full rank, so we can solve $A_B x_B = b$ and set $x_N = 0$. Note that we can have multiple bases $B$ for the same vertex $x$, because we might have more than $n - m$ tight inequality constraints.

Thus, $x$ is a vertex if there exists a basis $B$ such that $x_N = 0$, $A_B$ is an $m \times m$ non-singular matrix, and $x_B = A_B^{-1} b \ge 0$ (feasible).


§9.1.2 The algorithm

We start at a basic feasible solution then try to move to a better one by following an edge until we cannot. An edge is defined by n − 1 tight constraints.

Rewrite $Ax = b$ as

$$\begin{pmatrix} A_N & A_B \end{pmatrix} \begin{pmatrix} x_N \\ x_B \end{pmatrix} = b$$

$$A_N x_N + A_B x_B = b$$

This is true for any feasible x, not just a vertex. Rearranging, we have:

$$x_B = A_B^{-1}(b - A_N x_N)$$

The objective is

$$cx = c_B x_B + c_N x_N = c_B A_B^{-1}(b - A_N x_N) + c_N x_N = c_B A_B^{-1} b + \left(c_N - c_B A_B^{-1} A_N\right) x_N$$

Notice that the left term is a constant, and the right term depends only on $x_N$. The objective can only improve or worsen through the second term.

Definition 9.1.3 (Reduced cost). Given a basis $B$, the reduced cost is:

$$\tilde c_N = c_N - c_B A_B^{-1} A_N$$

This results in $cx = \text{(some constant)} + \tilde c_N x_N$. Notice that if $\tilde c_N \ge 0$, no feasible point has a smaller objective, because our vertex has all of $x_N = 0$. So we are at an optimum if all reduced costs are non-negative.

Suppose there is some non-basic $\tilde c_j < 0$. Then we can improve our solution by increasing $x_j$, which contributes negatively to the objective. Note that this modifies $x_N$ and thus also $x_B$, because $x_B = A_B^{-1}(b - A_N x_N)$. Thus, we can keep going until we hit $x_i = 0$ for some basic $x_i \in x_B$. Now $x_j$ is slack but $x_i$ becomes tight, so the basis has changed! This step is called a “pivot.”

The simplex algorithm is thus simple: just pivot until you find an optimum. Each pivot does linear algebra on an $m \times m$ matrix and tests $n$ constraints to find whichever one becomes tight first, so a pivot takes time polynomial in $n$ and $m$.

There is a potential problem of not being able to move: maybe increasing the non-basic $x_j$ immediately hits some basic $x_i$ that is already 0. We added $x_j$ to the basis and removed $x_i$, but the objective did not change at all. We might be stuck in a cycle of pivots that never improve the objective, so we have to design a pivot rule that is guaranteed not to cycle. Is there one? There is, called the “lexicographically first” rule. With it, we never visit a basis more than once, so the runtime depends on the number of bases: $\binom{m}{n}$. We also have to find a starting vertex. We could find a feasible point and round it to get a vertex, but finding a feasible point is as hard as optimizing... in general! For some specific polytopes the problem is easier, and we can find a feasible point by optimizing a different LP. We will see this in the homework.

§9.1.3 Is this polynomial?

It is not apparently polynomial. Theoreticians have been working on this for quite a while.

Klee-Minty came up with a hypercube, twisted so that all the vertices' objective values differ, and showed that many pivot rules visit every vertex of this cube. The running time is thus exponential in the worst case for these rules.

There might be a polynomial pivot rule, though.

Conjecture 9.1.4 (Hirsch conjecture). The max diameter of a polytope with m facets in n dimensions is only m − n simplex steps.

Unfortunately, Santos disproved this in 2010, exhibiting a polytope with diameter at least $(1 + \varepsilon)(m - n)$.

Kalai-Kleitman showed that the path length is at most $m^{\log n}$, which is a tremendous improvement over $m^n$. This leads to simplex variants with runtime about $2^{\sqrt n}$.

Spielman-Teng used smoothed analysis to show that any linear program with small random perturbations has a fast simplex algorithm.

§9.1.4 Simplex and Duality

Recall, the reduced cost is $\tilde c_N = c_N - c_B A_B^{-1} A_N$. When all $\tilde c_N \ge 0$, we have an optimum. Define $y = c_B A_B^{-1}$, so $c_B = y A_B$. Then $y A_N = c_B A_B^{-1} A_N = c_N - \tilde c_N$, which is $\le c_N$ at the optimum. Then

$$y \begin{pmatrix} A_N & A_B \end{pmatrix} \le \begin{pmatrix} c_N & c_B \end{pmatrix}, \quad \text{i.e.} \quad yA \le c.$$

Also, $yb = c_B A_B^{-1} b = c_B x_B = cx$, since $c_N x_N = 0$ because $x_N = 0$. So $y$ is a dual optimum! To compute $y$, we only need the basis: the simplex algorithm, which finds the basis for the optimal $x$, also finds the optimal $y$ at the same time! More generally, any basis gives some $y$. This idea gives us another proof of strong duality, using the fact that the simplex algorithm terminates.

§9.2 Ellipsoid Method

As good as the simplex algorithm is, we do not know any polynomial theoretical time bounds for it, except in special cases such as network simplex and in smoothed analysis. The ellipsoid method was thus developed by Khachiyan to achieve a good algorithm in the theoretical sense.

The ellipsoid algorithm is based on binary search for an integer between secret upper/lower bounds. (We are not given the full bounds, only a starting lower or upper bound.) The search could go on forever, so we need an algorithm that has an error tolerance expressed in the number of bits of the answer, and that generalizes to higher dimensions.

The ellipsoid algorithm is a feasibility algorithm, not an optimization algorithm. This is fine because we have learned how to reduce an optimization problem to a feasibility problem.

§9.2.1 The algorithm

Suppose there is a polytope (we do not know where it is). How can we find a point inside it? We are given a starting bound: an ellipsoid which contains the feasible polytope. We pick a hyperplane through the center, ask which side the polytope is on, then draw a smaller ellipsoid containing it. We keep shrinking the ellipsoid until its center hits the polytope and reveals a feasible point.

Specifically, we claim that when we have an ellipsoid containing the polytope, either the center point is in the polytope, or there exists a separating hyperplane through the center with the polytope lying entirely on one side; in the latter case we can draw a smaller ellipsoid on that side. Finding the separating hyperplane amounts to finding a violated constraint and translating it to the center.

Definition 9.2.1 (Ellipsoid).

E(D, z) = { x | (x − z)^T D^{−1} (x − z) ≤ 1 },

where D = BB^T for some invertible B. This can be viewed as the transformation taking coordinate x to Bx + z (from the unit sphere to the ellipsoid). It is a generalization of (x − x_0)² + (y − y_0)² ≤ 1.

Assume w.l.o.g. that we start with a unit sphere and vertical hyperplane. We can create a smaller ellipsoid on the correct side. We know that the center should be at (ε, 0, 0, 0,...) for some number ε (shifted from the original center).

d_1^{−1} (x_1 − ε)² + Σ_{i>1} d_i^{−1} x_i² ≤ 1

Note that the volume is proportional to √(∏ d_i). (The square root appears because the formula works with squared x_i coordinates.) Let us check that the other points of the half-sphere are contained in the new ellipsoid. For (1, 0, 0, . . .): d_1^{−1}(1 − ε)² ≤ 1, so d_1 = (1 − ε)². For (0, 1, 0, . . .): ε²/(1 − ε)² + d_2^{−1} = 1 by the Pythagorean theorem, so d_2^{−1} = 1 − ε²/(1 − ε)² ≈ 1 − ε², and the same holds for d_3, . . .. Thus, the volume is about (1 − ε)/(1 − ε²)^{n/2}.

Set ε ≈ 1/n. We get a volume of about (1 − 1/n)/(1 − 1/(2n)) ≈ 1 − 1/(2n).
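This approximation can be checked numerically; the following quick sketch (my own addition, not from lecture) confirms the shrink factor:

```python
# Numeric check: with eps = 1/n, the volume ratio
# (1 - eps) / (1 - eps^2)^(n/2) is about 1 - 1/(2n).
for n in (10, 100, 1000):
    eps = 1 / n
    ratio = (1 - eps) / (1 - eps**2) ** (n / 2)
    assert abs(ratio - (1 - 1 / (2 * n))) < 1 / n**2
```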

§9.2.2 Runtime analysis

Given the constraints, can we find an ellipsoid containing the feasible region? Yes! We proved that every vertex has coordinates of polynomial size in n. If we can bound the number of bits by d, then a sphere of radius 2^d suffices.

Let us consider the ending size of the ellipsoid. Suppose the polytope has full dimension (is not degenerate); then there are n + 1 affinely independent vertices, all with polynomial-size coordinates. The volume is a determinant of these coordinates, which also has polynomial size, so it is at least 2^{−poly(n)}.

Starting from size 2^{poly(n)} and shrinking to 2^{−poly(n)} at a ratio of 1 − 1/n per step (alternatively, a factor of 1/e every n steps) takes polynomial time. If the polytope does not have full dimension, we can use an error bound: for example, if the linear program is of the form Ax = b, x ≥ 0, we rewrite it as |Ax − b| ≤ ε with ε small enough that we do not pick up lattice points off Ax = b.

§9.2.3 Grötschel–Lovász–Schrijver

We already learned that feasibility can be used to solve optimization problems. The ellipsoid algorithm is even more interesting, as we have seen: the ability to find a separating hyperplane leads to finding feasible points, and thus to optimization. (Given any infeasible point, we find a violated constraint.) Basically, we want a separation oracle for LP problems with exponentially many constraints. Ellipsoid is remarkable because it can find a feasible point subject to exponentially many constraints after looking at only a polynomial number of them. Note that one can also use optimization to solve the separation problem.

§9.3 Interior point algorithm

The ellipsoid algorithm runs in about O(n^6) time, which is somewhat slow. In 1984, Karmarkar came up with the interior point algorithm, which makes use of a "potential function" analogous to

pushing magnets (not amortized analysis).

Suppose the LP is in the form of min {cx | Ax = b, x ≥ 0}. We want to minimize G(x).

Definition 9.3.1 (Logarithmic barrier function).

G(x) = cx − µ Σ_i ln x_i

For tiny xi, − ln xi is huge and makes G(x) close to +∞.

Starting from a large positive x, we head toward the minimum of G(x) using gradient descent, projecting each step onto the subspace Ax = b. Note that µ is decreased as time goes on.
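As a toy illustration, here is a minimal sketch of projected gradient descent on the barrier, for the hypothetical LP min{x_1 + 2x_2 | x_1 + x_2 = 1, x ≥ 0}, whose optimum is (1, 0). The step sizes, µ schedule, and names are my own illustrative choices; this is not the actual Karmarkar algorithm.

```python
import math

def G(x, c, mu):
    """Logarithmic barrier objective: c.x - mu * sum(ln x_i)."""
    return sum(ci * xi for ci, xi in zip(c, x)) - mu * sum(math.log(xi) for xi in x)

def barrier_step(x, c, mu, lr=0.25):
    g = [ci - mu / xi for ci, xi in zip(c, x)]   # gradient of G
    mean = sum(g) / len(g)
    g = [gi - mean for gi in g]                  # project onto null space of A = [1 1]
    while lr > 1e-12:                            # backtrack: stay interior, decrease G
        xn = [xi - lr * gi for xi, gi in zip(x, g)]
        if min(xn) > 0 and G(xn, c, mu) <= G(x, c, mu):
            return xn
        lr /= 2
    return x

def interior_point(c, x, mu=1.0, mu_min=1e-3, iters=300):
    while mu > mu_min:
        for _ in range(iters):
            x = barrier_step(x, c, mu)
        mu *= 0.5                                # decrease mu as time goes on
    return x

x = interior_point([1.0, 2.0], [0.5, 0.5])
```

The iterate drifts toward the vertex (1, 0) while staying strictly positive; a real path-following method would use Newton steps and an explicit projection matrix instead of this backtracking descent.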

10 Approximate algorithms

Until now, we have looked at tractable problems, where the focus is on how fast we can solve them. Now we turn to intractable problems, where we care more about how good our solutions can be, because solving them exactly is impossible or impractical.

These are some viable alternatives to exact solutions:

• Special cases

• Random inputs

• Slightly non-polynomial bounds (e.g. 2^n → 2^{n/3})

• Heuristics (fast in practice)

Here, we will study approximation algorithms.

We have an optimization problem which is defined by its instances I, solutions S(I) to the instances, and the value function f : S(I) → R we want to maximize, giving us OPT(I).

For example, in bin packing we are given items of various sizes to distribute into bins of size 1, and we want to minimize the number of bins used.

Some technical assumptions: we assume that our inputs and the range of f are integers or rational numbers. f(σ) is a polynomial size number (else we cannot write down the solutions). Runtime is polynomial in the bit sizes of numbers.

NP-hardness is about decision problems rather than optimization problems. However, any optimization problem can be turned into a decision problem via binary search. So, we call an optimization problem NP-hard if its corresponding decision problem is NP-complete.

§10.1 Quality of an approximation algorithm

We find a way to define the quality of a specific instance, then the quality of the whole algorithm is defined by the quality of the worst case.

Definition 10.1.1 (Absolute approximation). A is a k-absolute approximation algorithm if for each instance I, |A(I) − OPT(I)| ≤ k.


For example, consider vertex coloring in the case of planar graphs. Following the proof of the four-color theorem algorithmically gives a 3-absolute-approximation algorithm, because in the worst case the graph requires only one color while our construction uses four. Actually, it is a 1-absolute-approximation: a graph is 1-colorable only when it has no edges and 2-colorable only when it is bipartite, and we can modify the algorithm to check those cases directly.

There is also the problem of edge coloring. Vizing proved that every graph of maximum degree ∆ can be edge-colored with ∆ or ∆ + 1 colors, so we can solve this with a 1-absolute-approximation algorithm.

Absolute approximation is often impossible. For example, with knapsack problems, suppose we have a k-absolute approximation algorithm. We will prove that this can be turned into an exact algorithm, implying P = NP . First, multiply all profits by k + 1. Find k-absolute approximation. This gives us the optimal solution because the objective is a multiple of k + 1.

Another example is the maximum independent set problem, which does not even involve numbers. Suppose we have a k-absolute approximation algorithm. We can make k + 1 copies of the graph; by the pigeonhole principle, the approximate solution has at least one copy with ≥ OPT selected vertices.

Definition 10.1.2 (Relative approximation). An α-approximate solution for I has value between OPT(I)/α and α · OPT(I): at most α · OPT(I) for minimization problems, and at least OPT(I)/α for maximization problems. In most cases α > 1. (Some authors invert the ratio and use α < 1.)

How would you prove approximate ratio? You cannot compare to the optimum because you do not know the optimum. The trick is to compare to the lower or upper bound on the optimal solution instead.

§10.2 Greedy Algorithms

Consider the max cut problem: We want to find the cut with the most crossing edges.

One solution is to be greedy. Look at the graph one vertex at a time, placing each vertex on the side opposite to the majority of its previously placed neighbors.

The optimum is obviously bounded by m. In our greedy solution, when we place a vertex, some edges are "decided" as cut or not cut and cannot change later, and at least half of the edges decided at that step enter the cut. In aggregate, at least m/2 edges enter the cut, so our greedy algorithm is a 2-approximation.
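A minimal sketch of this greedy procedure (function and variable names are my own):

```python
def greedy_max_cut(n, edges):
    """Place each vertex opposite the majority of its already-placed
    neighbors; each placement cuts at least half of the edges decided
    at that step, so at least m/2 edges cross the final cut."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = {}
    for v in range(n):                       # any fixed order works
        placed = [side[u] for u in adj[v] if u in side]
        ones = sum(placed)
        # side 0 if at least half the placed neighbors are on side 1
        side[v] = 0 if 2 * ones >= len(placed) else 1
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut
```

On a triangle this cuts 2 of 3 edges, and on K4 it cuts 4 of 6, both within the m/2 guarantee.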

Consider another problem of set cover: Given a set of n items and m subsets of these items, find a collection of these subsets that cover all the items.


Our greedy algorithm is to iteratively choose the subset that has the most uncovered items.

Suppose that the optimal solution uses k sets. Then our first step covers at least n/k items, so at most (1 − 1/k)n items remain. After t steps, (1 − 1/k)^t n items remain. If this amount is less than 1, then the number of uncovered items is 0, because it is an integer. This happens once t > ln n / (− ln(1 − 1/k)) ≈ k ln n, since − ln(1 − 1/k) ≈ 1/k. So, we have an O(log n)-approximation algorithm.
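A sketch of the greedy rule (names are mine; the subsets are assumed to be Python sets):

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most uncovered items;
    if the optimum uses k sets, this uses O(k log n) sets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("items not coverable by the given subsets")
        chosen.append(best)
        uncovered -= best
    return chosen
```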

Consider the vertex cover problem: pick a subset of vertices to cover all edges. This is basically a set cover problem where each vertex is a set of its incident edges. So, this can be solved with O(log n)-approximation algorithm.

There is actually a better analysis for the vertex cover problem. Given an edge (v, w), we know v or w is in the optimum. We can put both vertices into the cover, charging these two vertices against the optimum's one, so we have a 2-approximation algorithm.
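This charging argument is only a few lines to implement; a hedged sketch:

```python
def vertex_cover_2approx(edges):
    """For each edge not yet covered, put BOTH endpoints in the cover.
    The optimum contains at least one of them, so the result is at
    most twice the optimum."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```

On a star graph with center 0, for instance, it returns a cover of size 2 against the optimum 1.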

Notice that the complement of a vertex cover is an independent set. Interestingly, however, there is no relative approximation for the maximum independent set problem. In general, complementary problems need not have good approximations: if we invert the problem and the optimum becomes much smaller, the approximation ratio blows up.

§10.3 Scheduling theory

We are given a collection of n jobs to run on m machines (m = 1, identical machines, related machines, or unrelated machines with processing times). We want to optimize something (e.g. total runtime, average completion time, the waiting time) with respect to a variety of constraints (e.g. preemption, release times, deadlines).

Here we consider a simple problem where we have parallel identical machines (P) and want to minimize the last completion time, denoted P || C_max. Let us assign one job at a time, each to the least loaded machine (Graham's Rule). Note that the average load of a machine is Σ p_i / m ≤ OPT, but this is not a good bound when n = 1 and m is large (the optimum is much greater than the average load). Another bound we can use is the longest job, but this is also bad if we have a lot of tiny jobs on a small number of machines.

Notice that these lower bounds get bad in opposite cases. We can try the maximum of both, or more simply their sum. Let L denote the average load and p_j the maximum job; we know OPT ≥ p_j from the previous bound. Consider the maximum-load machine at the moment we add its last job. By the greedy rule, its load before that job was at most every other machine's load, so it was ≤ L ≤ OPT (else it would not have been least loaded). Adding the job increases the load by at most p_j ≤ OPT, so the final load is ≤ 2 · OPT, and we have a 2-approximation.


Refine: when we add job p_j to the machine with load L′, the average load becomes at least L′ + p_j/m, which is ≤ OPT. Rearranging the final load: L′ + p_j = (L′ + p_j/m) + p_j(1 − 1/m) ≤ (2 − 1/m) · OPT. So we have a (2 − 1/m)-approximation algorithm. One important observation about this algorithm is that it is online. Furthermore, the bound is tight: we can construct the worst case by scheduling a bunch of tiny jobs and then one huge job. Our algorithm performs badly because it did not leave a machine free for the incoming large job.

Another idea is the longest processing time (LPT) strategy, which is a 4/3-approximation.
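Both Graham's rule and the LPT variant can be sketched with a heap of machine loads (this toy implementation is my own, not from lecture):

```python
import heapq

def list_schedule(jobs, m, lpt=False):
    """Graham's rule: assign each job to the least loaded machine.
    With lpt=True, jobs are sorted longest-first (the LPT heuristic)."""
    if lpt:
        jobs = sorted(jobs, reverse=True)
    loads = [0] * m              # min-heap of machine loads
    heapq.heapify(loads)
    for p in jobs:
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + p)
    return max(loads)            # C_max
```

For jobs (1, 1, 1, 1, 4) on two machines, the online rule gives C_max = 6 while LPT gives the optimal 4, illustrating the tight worst case above.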

§10.4 Approximation schemes

There is a notion of approximation hardness we can use to understand the difficulty of a problem. Some problems have a known approximation lower bound > 1, and we can use reductions to prove such hardness, just like proving NP-completeness.

An approximation scheme for a problem is a family of algorithms A_ε such that A_ε is a (1 + ε)-approximate algorithm running in polynomial time. We need to be careful here: for example, if our algorithm has a runtime of n^{1/ε²}, it is polynomial for each fixed ε but blows up as ε → 0. Thus, we usually ask for a fully polynomial approximation scheme (FPAS), where the runtime is polynomial in both n and 1/ε.

FPAS actually ties in with pseudo-polynomial algorithms. Consider the knapsack problem with integer profits and max bag size (weight) S. One way to solve this is dynamic programming, where B(j, p) = the smallest total size of a subset of items 1, . . . , j that achieves profit p. Then B(n, p) = the smallest size subset achieving profit p. If we can compute this for every p, we can binary search for the largest p with B(n, p) ≤ S. The recurrence is B(j + 1, p) = min{B(j, p), B(j, p − p_{j+1}) + s_{j+1}}. We can solve this iteratively in O(nU) time, where U is the total profit; this is pseudo-polynomial, so knapsack is weakly NP-hard.
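A sketch of this profit-indexed DP (iterative, items in the outer loop; names are mine):

```python
def knapsack_max_profit(sizes, profits, S):
    """B[p] = smallest total size achieving profit exactly p over the
    items seen so far; the answer is the largest p with B[p] <= S."""
    INF = float("inf")
    U = sum(profits)
    B = [0] + [INF] * U
    for s, p in zip(sizes, profits):
        for q in range(U, p - 1, -1):   # backwards: each item used once
            if B[q - p] + s < B[q]:
                B[q] = B[q - p] + s
    return max(p for p in range(U + 1) if B[p] <= S)
```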

There are other strongly NP-hard problems where there is not even a polynomial time algorithm if the numbers are small. Vertex cover is an example because it does not even have numbers. Even then, there are also other problems which involve values but are strongly NP-hard anyway.

For the knapsack problem, one way to approximate the solution is to scale all profits down to ⌊n p_j / (εP)⌋, where P = OPT. In the old optimal solution, the new profit is at least Σ (n p_j/(εP) − 1) = n/ε − n, and the new optimal solution is at least this value. Each scaled profit is at most n/ε, which is polynomial, so we can use the pseudo-polynomial algorithm, with runtime O(n²/ε). Scaling back to the original profits, we get value ≥ (1 − ε)P. So, this is a (1 − ε)-approximation in O(n²/ε).


Wait, how do we know P? We do not know the optimal solution, after all. In fact, we only need a lower bound on P, namely max p_i. A similar analysis then gives runtime O(n³/ε).
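A toy sketch of the scaling step using the lower bound P = max p_i (the subset tracking is my own framing, kept only to read off the original profit of the chosen set):

```python
def knapsack_fptas(sizes, profits, S, eps):
    """Scale profits down to floor(n * p / (eps * P)) with P = max p_i,
    then run the exact profit-indexed DP on the scaled values."""
    n, P = len(profits), max(profits)
    scaled = [int(n * p / (eps * P)) for p in profits]
    U = sum(scaled)
    INF = float("inf")
    B = [0.0] + [INF] * U        # min size achieving scaled profit exactly q
    take = [set()] + [None] * U  # which items achieve it
    for i, (s, q) in enumerate(zip(sizes, scaled)):
        for t in range(U, q - 1, -1):
            if B[t - q] + s < B[t]:
                B[t] = B[t - q] + s
                take[t] = take[t - q] | {i}
    best = max(t for t in range(U + 1) if B[t] <= S)
    return sum(profits[i] for i in take[best])
```

On a small instance the scaled optimum matches the true one; in general the returned profit is at least (1 − ε) times the optimum.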

There is a faster solution. We guess P and run the algorithm. If the guess satisfies P ≥ OPT/(1 − ε), the optimal solution scales to value < n/ε − n, so the pseudo-polynomial algorithm will not find a solution of value n/ε − n; this tells us the guess is too big. If we guess too small, there might be better solutions. So, we binary search for P, starting in the range [max p_i, n max p_i].

§10.5 Dealing with strongly NP-hard problems

We can use enumeration to solve these kind of problems. Enumeration is basically a controlled brute force.

Back to the P || C_max problem. We schedule the largest k jobs optimally by brute force (large jobs tend to be problematic), then use Graham's rule for the remainder. Claim: the Graham phase adds each job to a machine whose load is at most the average, which is ≤ OPT; after adding a job p_j, that machine finishes by time ≤ OPT + p_j.

Consider the machine with the worst completion time. If its last job is one of the k largest, we are optimal. If it is a smaller job, then C_max ≤ OPT + p_{k+1}, where p_{k+1} is that job. Note that OPT ≥ k p_{k+1}/m, so p_{k+1} ≤ (m/k) OPT. Hence our solution is ≤ (1 + m/k) · OPT, with runtime O(m^k · n). If m is a fixed constant, this is polynomial for any k, and we can get arbitrarily close to the optimum by increasing k, so we have a PAS (not an FPAS).

§10.6 Smarter enumeration technique

For the P || C_max problem, instead of minimizing directly, we can consider the decision version: is it possible to complete all jobs by time T (and construct a solution)? Then we optimize by binary searching over T. The lower bound is max{p_max, Σ p_j / m} and the upper bound is Σ p_j, so the ratio of these bounds is m. We can test out LB, LB(1 + ε), LB(1 + ε)², . . . , UB. There are log_{1+ε}(UB/LB) = log m / log(1 + ε) ≈ (1/ε) log m such values, so binary search over them takes O(log((1/ε) log m)) steps.


An approximate decision procedure suffices: given that a solution of time T exists, we may return a solution of time ≤ (1 + ε)T. So, let us find an algorithm for this decision problem, combining enumeration with a rounding technique.

The challenge with enumeration is that the brute-force solution space may be large. So, consider the special case where there are only k distinct job sizes. Define a "machine type" as a vector of the number of jobs of each size whose total size is ≤ T (instead of listing the jobs). There are at most n^k types of machine. The number of problem instances for a given number of machines m is also at most n^k.

Note the substructure: scheduling on m machines is equivalent to scheduling on m − 1 machines plus 1 machine. This suggests dynamic programming. For r := 1 to m, we enumerate all inputs that are feasible in time T on r machines. For r = 1, the answer is simply all machine types. If we know the answer for r, we can generate the answers for r + 1 by taking the Cartesian product of the r-feasible inputs with the machine types. The size is at most n^k · n^k, so there are at most n^{2k} pairings. Total runtime is O(m n^{2k}), which is polynomial for any constant k.

What do we do when faced with many job sizes? Round! We can round to powers of 1 + ε. This multiplies each job p_j by at most a 1 + ε factor, so the optimal value changes by at most a 1 + ε factor.

Problem: we might have a geometrically decreasing series of sizes. If we try to round then nothing really changes. But, remember: small jobs don’t matter. We just need to solve big jobs optimally.

Formalize: given target T, call a job large if p_j ≥ εT. Round large jobs to εT, ε(1 + ε)T, ε(1 + ε)²T, . . . , T. There are log_{1+ε}(1/ε) ≈ (1/ε) ln(1/ε) such values. For any constant ε, the number of job sizes k is constant, so we can solve the problem in n^{O((1/ε) ln(1/ε))}.

Consider the last job added to the worst machine. If it is a small job that determines C_max, its size is ≤ εT. Before it was added, that machine had minimum load, thus at most the average, thus at most the optimum T. So C_max ≤ T + εT. Combined with the rounding loss, at worst we have a (1 + ε)² ≈ 1 + 2ε approximation.

§10.7 MAX-SNP problems

All polynomial approximation schemes boil down to enumeration. (Someone proved this.)

Max-SNP problems have some constant beyond which they cannot be approximated, meaning there is no polynomial approximation scheme. For some problems there are amplifications: once you show they cannot be approximated to some constant, they cannot be approximated to any constant. For example, the clique problem cannot be approximated, not even to n^{1−ε} (a polynomial ratio!).

§10.8 Relaxation

Idea: if the problem is not solvable, we can change it to a solvable problem that is similar enough to the original problem. Then, we transform the solution back to the original problem. This is called relaxation because we usually remove or lessen certain constraints, making the solution space bigger. If we find a solution that is not feasible to the original problem, we have to move it back into the feasible space.

Problem 10.8.1 (Traveling Salesman Problem). Given a graph with edge lengths, find a minimum-length tour that visits each vertex exactly once.

Note that just finding the Hamiltonian cycle is already NP-hard. So, we will consider the Metric TSP instead: each vertex can be visited multiple times and they are separated by a metric that satisfies triangle inequality. On any graph, we can create a “metric completion” by finding APSP and defining edge lengths to those values instead. We will also consider only the undirected version too.

Also, note that if the path needs to visit a node more than once, by triangle inequality, you can shorten it so it visits each vertex once only.

To solve this problem, we relax the path requirement: find the minimum spanning tree instead. (Any feasible TSP tour is a spanning tree plus one edge.) Thus, the optimal MST ≤ the optimal TSP. From an MST, we can generate an Euler tour of cost 2 · MST, so we have a 2-approximation.
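A sketch of the MST-doubling algorithm (Prim plus a preorder walk; the shortcutting of repeated vertices is implicit in the preorder, and the instance in the usage note is a hypothetical 4-point metric):

```python
def tsp_2approx(dist):
    """MST (Prim) + preorder walk; shortcutting repeats is valid by the
    triangle inequality, so the tour costs at most 2 * MST <= 2 * OPT."""
    n = len(dist)
    INF = float("inf")
    in_tree = [False] * n
    best = [INF] * n
    parent = [0] * n
    best[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):                       # Prim's algorithm
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best[u])
        in_tree[v] = True
        if v != 0:
            children[parent[v]].append(v)
        for u in range(n):
            if not in_tree[u] and dist[v][u] < best[u]:
                best[u], parent[u] = dist[v][u], v
    tour, stack = [], [0]
    while stack:                             # preorder = shortcut Euler tour
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    tour.append(0)
    return tour, sum(dist[a][b] for a, b in zip(tour, tour[1:]))
```

On the 4-cycle metric [[0,1,2,1],[1,0,1,2],[2,1,0,1],[1,2,1,0]] this happens to return an optimal tour of cost 4.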

Note that metric TSP is MAX-SNP hard even if all edge lengths are 1 and 2 (and thus can’t violate triangle inequality).

§10.8.1 Christofides’s heuristic

Start with an MST. Note that we don’t really need to traverse all the edges. We only need all vertices. The tour is Eulerian. So, we can just take an MST and add edges to make it Eulerian.

Problem: the odd vertices. To make them even, just add edges. Note that each edge fixes two odd vertices. So, we actually just need a minimum cost matching on odd vertices. The cost of the matching is at most the TSP.

In fact, the cost of the matching is at most 1/2 the TSP: if we consider only the odd-degree vertices, the tour restricted to them is two matchings combined, so at least one matching costs at most half the TSP. Thus, we have a 3/2-approximation algorithm. Caveat: we cannot just use min-cost flow to solve the matching, because it is not a directed bipartite matching.


Held and Karp came up with a method that might give a 4/3-approximation; nobody has found a counterexample yet. Karlin, Klein, and Oveis Gharan found a (3/2 − ε)-approximation, where ε ≥ 10^{−36}.

§10.9 Scheduling problem with release dates

For 1 || Σ C_j, the shortest processing time (SPT) heuristic is optimal. Proof by exchange argument: suppose job j runs right before a job with p_j > p_{j+1}; swapping them does not change any other job's completion time and decreases the sum.

For 1 | r_j | Σ C_j, where r_j is the release date, the problem is much harder, because short jobs may not yet be released when we want them. One idea is to greedily schedule the shortest available job. However, this breaks when we start with one huge job while many smaller jobs become available later: we should not hastily schedule the huge job.

An alternative problem is 1 | r_j, pmtn | Σ C_j, where pmtn stands for preemption: a job can be paused at any time without losing progress. Here the shortest remaining processing time (SRPT) heuristic is optimal. This can be proved by an exchange argument on the pieces of jobs: if the remaining times satisfy p_j < p_k and both jobs are available, we swap pieces around so that all of j's pieces finish before k's.
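The SPT rule is a one-liner; a sketch for 1 || Σ C_j (names mine):

```python
def spt_total_completion(jobs):
    """Run jobs shortest-first and return the sum of completion
    times, which SPT minimizes for 1 || sum C_j."""
    t = total = 0
    for p in sorted(jobs):
        t += p          # completion time of this job
        total += t
    return total
```

For jobs (3, 1, 2), SPT gives completions 1, 3, 6 and total 10; any other order is at least as large.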

Note that this preemptive problem is a relaxation. So, first we find the optimal SRPT schedule, then we round: if job j finishes at time C_j in the preemptive schedule, we insert job j in full starting at time C_j, expanding the schedule (and leaving the old piece intervals blank). This gives a feasible schedule, because C_j is certainly after job j's release time. By the SRPT order, job j finishes by time 2 · C_j with no further slowdown.

§10.10 LP Relaxation Technique

Consider the vertex cover problem. We can write it as an integer LP:

min { Σ x_i | x_i + x_j ≥ 1 ∀(i, j) ∈ E, x_i ∈ {0, 1} }.

(This polytope is called the stable set polytope. It has been shown that at every optimum, each value is 0, 1/2, or 1; any vertex with value 0 is not in the VC, and any with value 1 is in the VC.) The ILP is NP-hard, but if we remove the integrality constraint, we have an easy LP problem. This is LP relaxation. Now we have to figure out how to round back to an integral solution.

We can round the usual way: to 0 if < 1/2 and to 1 if ≥ 1/2. The constraints are still satisfied, because x_i + x_j ≥ 1 implies x_i or x_j is ≥ 1/2. Each x_i is at most doubled, so we have a 2-approximation. Note that this also works if each vertex has a weight attached (min Σ w_i x_i).
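Given a feasible fractional solution, the rounding step is immediate; in this sketch the fractional values are hard-coded for a triangle (where the LP optimum is x_i = 1/2 everywhere) rather than produced by an LP solver:

```python
def round_vertex_cover(frac, edges):
    """Round an LP solution: take every vertex with x_i >= 1/2. Each
    constraint x_u + x_v >= 1 forces some endpoint >= 1/2, so the
    result is a cover, and each x_i at most doubles."""
    cover = {v for v, x in frac.items() if x >= 0.5}
    assert all(u in cover or v in cover for u, v in edges)
    return cover

# triangle: LP value 3/2, rounded cover size 3, within the factor-2 bound
cover = round_vertex_cover({0: 0.5, 1: 0.5, 2: 0.5}, [(0, 1), (1, 2), (0, 2)])
```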


§10.10.1 Facility location problem

We would like to open facilities to serve clients. Each facility has an opening cost f_i, and the distance from facility i to client j is c_ij. The overall cost is Σ_{opened} f_i + Σ_{clients} s_j, where s_j is the distance from client j to its nearest open facility.

ILP: define y_i = 1 if facility i is open and 0 otherwise, and x_ij = 1 if client j is assigned to facility i and 0 otherwise. Minimize Σ_i y_i f_i + Σ_{i,j} c_ij x_ij subject to Σ_i x_ij ≥ 1 for each client j, and x_ij ≤ y_i for all i, j.

We relax this by allowing 0 ≤ x_ij, y_i ≤ 1. Now we round. The first step, called filtering, removes small fractions that do not contribute anything. We will assign j to some facility with nonzero x_ij. Write C_j = Σ_i x_ij c_ij (the average assignment cost for j). Some x_ij > 0 might have huge c_ij, but not all: at most a 1/ρ fraction of the x_ij mass goes to facilities with assignment cost > ρ C_j. So, we zero out those large assignments. A total of at least 1 − 1/ρ of the assignment remains, so we fix it by scaling: x′_ij = x_ij/(1 − 1/ρ) and y′_i = y_i/(1 − 1/ρ). The objective increases by a factor of 1/(1 − 1/ρ) = ρ/(ρ − 1). Now every x_ij > 0 goes to a facility with c_ij ≤ ρ C_j.

Step 2 is facility opening. If y_i is small, the rounded solution pays f_i while the fractional solution pays only y_i f_i; this gap is too large to give a good approximation directly. Instead, we find a cluster of facilities whose y_i total at least 1. Suppose our clients are in a metric space. We open the cheapest facility in the cluster: f_min = f_min Σ y_i ≤ Σ f_i y_i. Formally, take the client with minimum C_j and consider all its facilities with x_ij > 0 (its cluster); open the cheapest one and close the others. Assign every client with any nonzero x_ij into the cluster to that open facility.

By the triangle inequality, the cost of assigning such a client j to the open facility i′ is ≤ c_ji + c_j′i + c_j′i′ ≤ 3ρ C_j, where j′ is the cluster's center client and i is a shared facility. Use ρ = 4/3 to get a 4-approximation.

§10.10.2 Randomized rounding

Problem 10.10.1 (Max-SAT). Given a set of boolean variables and a set of clauses where each clause is an OR of a subset of variables or negations. The AND of this set is called a CNF formula (conjunctive normal form). Find an assignment that satisfies as many clauses as possible.

Naive solution: assign randomly, with each x_i = T with probability 1/2. What is E[#(satisfied clauses)] = Σ_{clauses} Pr[clause is satisfied]? For each clause, the probability of being satisfied is 1 − 2^{−k}, where k is the number of variables in the clause. In the worst case k = 1, so the probability is 1/2. This gives us a 2-approximation (in expectation).

LP rounding: define the variables y_i = 1 if x_i = T and y_i = 0 otherwise,

64 Thanadol Chomphoochan (Fall 2020) 6.854 Lecture Notes and ( 1 if clause j is satisfied zj = . 0 otherwise The ILP is X max zj subject to X X zj ≤ yi + (1 − yi)

where C_j is the set of unnegated variables in clause j and C_j^{−1} is the set of negated variables.

After solving the relaxed LP, we round randomly by treating each y_i as the probability that x_i = T. Then the term Σ_{C_j} y_i + Σ_{C_j^{−1}} (1 − y_i) is the expected number of true literals in the clause.

We want to try to relate zj to the probability that clause j is true because if we can do this, then summing up the probability gives us the number of clauses satisfied (as shown earlier).

Lemma 10.10.2. Let β_k = 1 − (1 − 1/k)^k. (Note that β_k is decreasing in k with limit 1 − 1/e, so β_k ≥ 1 − 1/e for all k.)

If a clause has k literals, then Pr[clause j satisfied] ≥ β_k z_j.

Proof. We consider a single clause j, so assume w.l.o.g. that all variables in the clause are unnegated. Then

Pr[clause satisfied] = 1 − ∏_i (1 − y_i).

This is minimized when the product is maximized subject to Σ_{C_j} y_i = z_j. The product is largest when all the y_i are equal, so the lower bound is 1 − (1 − z_j/k)^k.

We claim this is ≥ β_k z_j. Note that the value of 1 − (1 − z/k)^k at z = 0 is 0 and at z = 1 is β_k. Its second derivative is < 0, so the function is concave and increasing in z; therefore it lies above the line β_k z on [0, 1] (draw β_k z against the curve to see why). This means our probability is ≥ β_k z_j. The expected number of satisfied clauses is then Σ β_{k_j} z_j ≥ (1 − 1/e) Σ z_j = (1 − 1/e) OPT.

Note that an adversary can push us to this tight bound by making the clauses really large, so our algorithm does badly there. However, recall that our simple algorithm from earlier

works well on large clauses. We should just run both algorithms and pick whichever gives the better answer.

The expected achieved value is ≥ (1/2)(E[#(clauses satisfied by naive)] + E[#(satisfied by LP)]) = (1/2) Σ_j ((1 − 2^{−k_j}) z_j + β_{k_j} z_j) ≥ (3/4) Σ_j z_j. This yields a 3/4-approximation algorithm.
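The per-clause-size arithmetic behind the 3/4 bound can be checked directly (a quick numeric sketch of my own):

```python
# Per clause of size k: the naive algorithm satisfies it with
# probability 1 - 2^-k, and LP rounding with probability >= beta_k * z_j,
# where beta_k = 1 - (1 - 1/k)^k. The average of the two coefficients
# is >= 3/4 for every k (tight at k = 1 and k = 2).
def beta(k):
    return 1 - (1 - 1 / k) ** k

for k in range(1, 30):
    assert ((1 - 2 ** -k) + beta(k)) / 2 >= 0.75 - 1e-12
```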

It is known that Max-SAT is Max-SNP-hard. For Max-3SAT, our naive algorithm achieves a 7/8-approximation, and achieving 7/8 + ε is NP-hard.

11 Parameterized complexity

The idea of NP-hardness is to say that the problem is hard over all instances. In practice, you are interested in some instances only. We can define the parameters that measure the hardness of that instance and give a runtime based on that parameter.

Consider the vertex cover problem. Suppose we know the optimal cover uses k nodes. Then we can simply try all k-subsets of vertices, giving runtime O(m n^k).

A better approach is the "bounded search tree" method: view the solution space as a search tree, then argue that the tree is shallow. Pick an edge; we know one endpoint is in the VC. Try each one and recurse to find the rest of the cover. This is a degree-2 search tree, and we only need to go k levels deep, so the runtime is O(2^k m). Such a runtime is called fixed-parameter tractable: it can be expressed as f(k) · poly(m, n).
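A sketch of the bounded search tree for vertex cover (names are mine):

```python
def has_vertex_cover(edges, k):
    """Is there a vertex cover of size <= k? Some endpoint of the first
    uncovered edge must be in the cover; branch on both choices.
    Depth <= k, branching factor 2, so O(2^k m) overall."""
    if not edges:
        return True
    if k == 0:
        return False
    u, v = edges[0]
    return any(has_vertex_cover([e for e in edges if w not in e], k - 1)
               for w in (u, v))
```

A triangle, for instance, has a cover of size 2 but not of size 1.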

Note 11.0.1. I notice that the “picking arbitrary edge” part corresponds to picking a constraint. That might help me come up with other fixed-parameter algorithms in the homework.

§11.1 Kernelization

In polynomial time, independent of k, you can find a hard "kernel" of the problem of size f(k), independent of n; then solve the kernel in time g(k).

Consider the vertex cover problem again. Any vertex of degree > k must be in OPT; otherwise the cover would have to use all of its > k neighbors. Mark all these nodes and remove their incident edges. The new graph has degree ≤ k and still has a vertex cover of size ≤ k, so it has at most k² edges (and O(k²) relevant vertices). We can then use the previous algorithm to solve the kernel in O(k² · 2^k), for an overall runtime of O(m + k² · 2^k).
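A sketch of this kernelization step (the ≤ k² edge check is the conservative bound from the text; the returned kernel would then be handed to the bounded-search-tree solver):

```python
def vc_kernel(edges, k):
    """Force every vertex of degree > k into the cover (any size-<= k
    cover must contain it). The leftover kernel must have <= k^2 edges,
    or no size-k cover exists. Returns (forced, kernel) or None."""
    forced = set()
    while True:
        deg = {}
        for u, v in edges:
            if u not in forced and v not in forced:
                deg[u] = deg.get(u, 0) + 1
                deg[v] = deg.get(v, 0) + 1
        high = {v for v, d in deg.items() if d > k}
        if not high:
            break
        forced |= high
    kernel = [(u, v) for u, v in edges if u not in forced and v not in forced]
    if len(forced) > k or len(kernel) > k * k:
        return None
    return forced, kernel
```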

§11.2 Treewidth

Idea: use recursion on vertices to solve graph problems. Eliminating a vertex generally creates hidden dependencies between its neighbors; represent those dependencies by adding edges between the neighbors. Repeatedly removing vertices and adding dependency edges yields an "elimination ordering" for the graph. You can answer the problem by considering how each vertex interacts with its neighbors. The treewidth is the maximum degree we may encounter this way. For a tree, the treewidth is one, because we can repeatedly eliminate leaves (degree 1) without introducing dependencies.


Graphs with treewidth 2 are called series-parallel graphs. Many problems are tractable for small treewidth, e.g. SAT.

For SAT, think of each variable as a vertex, with an edge whenever two variables share a clause in the formula. Solve by elimination in runtime 2^T, where T is the treewidth. Note that each clause has a set of satisfying assignments for its variables, and the problem is to pick an assignment consistent with the other clauses' assignments.

Take a variable x and consider all clauses containing it. Find the assignments to the other variables that can be extended to a value of x: define a combined clause from all clauses containing x and list its satisfying assignments. The size of this combined clause is the number of neighbors of x, which is ≤ the treewidth. Recurse. Runtime is 2^{max clause size}.

12 Computational Geometry

For a computational geometry problem, we consider points, lines, and planes to be primitives. Operations such as intersecting objects, comparing lengths, and finding angles are considered tractable in O(1) time. (We only worry about precision and similar issues when actually implementing the algorithm.)

The key idea is recursion on dimension: take a d-dimensional problem, do some work to reduce it to a (d − 1)-dimensional problem, and recurse.

Problem 12.0.1 (Orthogonal range queries). Given a set of points in d dimensions. Set up a data structure that allows querying “which points are in this given box.”

For the 1-D case, we can just create a BST, augmented with min/max values, and traverse the range in between (or sort and then binary search for the range endpoints). The runtime is O(log n + k) if we need to output k points. If we just want to count or check for emptiness, with augmentation on the BST we can solve it in O(log n).
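The sort-then-binary-search variant can be sketched in a few lines (a minimal Python illustration; the function names are mine):

```python
import bisect

def build(points):
    """Preprocess: sort once, O(n log n)."""
    return sorted(points)

def range_query(sorted_pts, lo, hi):
    """Report all points in [lo, hi]: O(log n + k)."""
    i = bisect.bisect_left(sorted_pts, lo)
    j = bisect.bisect_right(sorted_pts, hi)
    return sorted_pts[i:j]

def range_count(sorted_pts, lo, hi):
    """Count points in [lo, hi] without reporting them: O(log n)."""
    return bisect.bisect_right(sorted_pts, hi) - bisect.bisect_left(sorted_pts, lo)

pts = build([5, 1, 9, 3, 7])
print(range_query(pts, 2, 8), range_count(pts, 2, 8))  # [3, 5, 7] 3
```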

To generalize, we could solve each dimension separately. Problem: we then have to intersect a set of 1-D answers, and set intersection is expensive. E.g., if the query range is full in one dimension and very small in another, it takes linear time just to discover that the output is empty.

Let us think about 2-D case. Suppose we already know the x interval. If we have built a BST on y coordinates of these points, then we can query quickly. We can just precompute the BSTs for each interval that starts and ends on existing points. In d dimensions, the runtime would be O (d log n + k).

Two problems: building time is huge, and space is huge (n^(2d−1) in dimension d). Instead of enumerating the intervals, we will use the idea of augmentation. First, we build a BST on x coordinates as planned before. Each subtree defines a subinterval of the x axis. When we query an interval, we just need to union O(log n) of them (argument similar to the usual BST range decomposition). So, we build a y-BST for the points in each x subtree. To answer a query, we find at most O(log n) y-BSTs, run an O(log n + k) query on each, and then union the answers. The overall runtime is O(log^d n) for emptiness and counting queries.

Analyze the size. There are n tree nodes in the x dimension. Each point is in exactly one y search tree per level of the x tree. So, the size is O(n log n). Generally, in d dimensions the space is O(n log^(d−1) n) and the runtime is O(log^d n). There is an idea called fractional cascading that can reduce the runtime to O(d log n).

§12.1 Sweep Algorithms

We think of the x dimension as a time dimension instead. So, for 2-D problems, we solve them as if they were time-varying 1-D problems. For 3-D and above, we can recurse.

Problem 12.1.1 (Convex Hull). Given a set of points, find the smallest containing convex polygon. The hull can be described by listing the points in hull order (clockwise or counter-clockwise).

There are 70+ algorithms for 2-D convex hull, each of which demonstrates a different technique. Here, we are concerned with an algorithm that finds the upper hull using the sweep line technique. First, sort the points by x coordinate (time). Sweep a line from left to right, tracking what the upper hull looks like so far using a stack. The invariant is that we should always be turning clockwise. We can do the same for the lower hull, giving us a full hull in O(n log n) time: O(n log n) for sorting and O(n) for maintaining the stack.
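The sweep can be sketched as a monotone-chain-style pass (a minimal Python illustration; the sign convention in the cross product is an assumption):

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; negative = clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Sweep left to right, keeping the hull-so-far on a stack.
    Invariant: consecutive hull edges always turn clockwise."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()  # pop until the clockwise-turn invariant is restored
        hull.append(p)
    return hull

print(upper_hull([(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]))
# [(0, 0), (1, 2), (3, 3), (4, 0)]
```

Running the same pass on the points sorted in reverse (or negating the turn test) gives the lower hull.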

In a sense, the convex hull problem is equivalent to the sorting problem and you can reduce from one to another. For some inputs, there are many points but the final convex hull is very simple. Output-sensitive algorithms can solve the convex hull problem with time complexity dependent on the output size. For example, Chan ’96 solved it in O(n log k) where k is the number of points on the hull.

Problem 12.1.2 (Half-space intersection). Given a set of half-spaces, draw their intersection.

Note the duality: there is a mapping between points on a line and lines through a point. Point (a, b) maps to and from line L_{ab} = {(x, y) | ax + by = −1}. So, half-space intersection is equivalent to convex hull.

LP in low dimension has strongly polynomial algorithms based on computational geometry techniques. Specifically, it is solvable in time linear in the number of constraints and exponential in the number of dimensions. For example, in the 2-D case, you can simply do half-space intersection and try out each vertex.

Another interesting problem is the point location problem, which can be solved using persistent search trees.

§12.2 Voronoi Diagrams

Problem 12.2.1. Given a set of points pi in the plane. We want to answer a query: given point q, find the nearest pi.


Definition 12.2.2 (Voronoi regions). V(p_i) = the set of points in the plane closer to p_i than to all other points.

For one point, the Voronoi region is the entire plane. For two points, the Voronoi regions are the half-planes whose boundary is the perpendicular bisector of the segment between the two points. For three points, draw the three pairwise perpendicular bisectors to get six regions, and assign the two regions closest to each point as if each bisector were cut in half. The point where the three half-bisectors meet is called the "Voronoi point"; it is the circumcenter of the triangle and is equidistant from the three points. Note the trend as we add more points: bisectors are being cut, and we start creating closed polygons.

Each Voronoi region is a convex polygon containing one p_i, so there are exactly n regions. Assume non-degeneracy: no four points lie on a circle. (Then each Voronoi point is the intersection of exactly three bisectors.)

The nearest neighbor problem reduces to figuring out which Voronoi region a query point q is in, which we know how to solve. To analyze this, we have to answer: what is the size of the Voronoi diagram (VD) on n points? By size we mean the number of line segments.

Note that the VD is a planar graph because the edges do not cross. (We add a "point at infinity" to serve as the second endpoint of the infinite edges.) By Euler's formula, V − E + F = 2, which can be proved by induction.

Proof. Remove an edge where both ends have degree ≥ 3: E decreases by one, but F also decreases by one. For each degree-2 vertex, remove it and combine its two edges: V decreases by one, but so does E. An edge incident to a degree-1 vertex can be removed together with the vertex without changing the number of faces. Finally, we go down to a base case where V − E + F = 2, and the quantity never changes along the way.

For our VD, F = n, and V = the number of Voronoi points, which we have already argued have degree 3 each. Let us do double counting: 2E counts the two ends of each edge, so 2E equals the sum of degrees, which is ≥ 3(V + 1) (the +1 is the point at infinity, which might have degree > 3). From Euler's formula on the V + 1 vertices, E = V + n − 1, so 2(V + n − 1) ≥ 3(V + 1), giving V ≤ 2n − 5. The number of Voronoi points and the number of edges are linear in the number of points.

The only thing left is to actually construct the VD. We will use a sweep line algorithm: sweep from top to bottom and build the VD at the sweep line. Problem: the VD is not fully determined by what we see above the sweep line. For example, if we have three points but have only swept past two of them, when we meet the third point our bisector might need to shrink. However, the part of the VD that is closer to an input point above the sweep line than it is to the sweep line itself will not change. The boundary of that area is a parabola! (The sweep line is the directrix; the input points, also called the "sites," are the foci.) Fortune's algorithm tracks the boundary pieces of these parabolas, called

the "beach line." As the sweep line descends, we trace out the VD behind the beach line.

Let us look at one point (x_f, y_f) and the sweep line y = t. The parabola is (x − x_f)^2 + (y − y_f)^2 = (y − t)^2. Fix x and differentiate with respect to t: dy/dt = (y − t)/(y_f − t). If we have several parabolas at a particular fixed x, then the one with the lowest focus descends fastest.

When our sweep line hits a point (“site event”), we create a degenerate parabola, then as we move down that parabola widens. This might result in joining the beach lines.

(The dual of VD is Delaunay triangulation problem.)

I got lost here. Please use Wikipedia or something. :(

13 Online Algorithms

Until now, we have been dealing with problems where we can look at the entire input and then output the answer after. In this chapter, we will learn about problems which require immediate output after each input. Some example problems are the stock market and paging.

Let the input sequence be σ = σ_1, σ_2, σ_3, .... After each input, we produce an output or perform an action, and we want to optimize the cost of these outputs/actions. Let c_min(σ) denote the optimal cost; it is often easy to compute c_min(σ) given all of σ.

§13.1 Ski-rental problem

Problem 13.1.1 (Ski-rental problem). Renting skis costs 1 unit per day. Buying skis costs T units. On some day, you might suddenly decide that you hate skiing. When should you buy?

“The idea that the past predicts the future is very optimistic. There is a guy named Murphy around: The moment that you buy ski, you’ll fall and break your leg.” - David Karger

How can we assess the quality of an online algorithm? We compare it to the best possible given σ.

Definition 13.1.2 (k-competitiveness). An algorithm is k-competitive on σ if C_A(σ) ≤ k · C_min(σ).

An algorithm is asymptotically k-competitive on σ if C_A(σ) ≤ k · C_min(σ) + O(1). Here k is called the competitive ratio.

Algorithm is k-competitive overall if it is k-competitive for all σ.

Suppose we always rent. Then the worst-case ratio is ∞/T (skiing can go on forever), which is non-competitive. If we buy immediately, the worst case is T/1 (you quit after one day), which is T-competitive. If we rent for d days and then buy on day d + 1, let us find the worst σ. Think about the adversary. As long as the optimum rents and we rent, the ratio is 1 and does not improve. When the optimum buys and we keep renting, our ratio starts getting worse. Once we buy, our cost stays fixed but the optimum might keep paying more, so the ratio only gets better. This means the worst ratio occurs at the moment we buy. The ratio


is (d + T)/min{d + 1, T}. If d + 1 ≤ T, the ratio is (d + T)/(d + 1), which decreases as d grows; if d + 1 ≥ T, the ratio is (d + T)/T, which increases as d grows. So set d + 1 = T, i.e., d = T − 1, where the ratio is (2T − 1)/T ≤ 2. Thus, we achieve a competitive ratio of 2.
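The case analysis can be checked numerically. A small sketch (the function name and the search horizon are mine) that evaluates the worst case of "rent d days, then buy" over every adversary quit day:

```python
def worst_ratio(d, T, horizon=1000):
    """Worst case over quit days q of (our cost) / (optimal cost)."""
    worst = 0.0
    for q in range(1, horizon):
        ours = q if q <= d else d + T  # rent q days, or rent d days then buy
        opt = min(q, T)                # offline: rent throughout, or buy on day 1
        worst = max(worst, ours / opt)
    return worst

T = 50
best_d = min(range(2 * T), key=lambda d: worst_ratio(d, T))
print(best_d, worst_ratio(best_d, T))  # d = T - 1 gives ratio (2T - 1)/T < 2
```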

§13.2 Selling stocks

§13.2.1 No splitting

Suppose we bought a stock on day 0. On each day 1, 2, ..., we must decide whether to sell or keep it. Note that if nothing constrains the market, we can never be competitive. For example, if we sell on day 1, the price might go up to infinity on day 2; if we do not sell on day 1, the price might drop to zero on day 2. So, in our problem we assume we know the duration, the max price M, and the min price m. Define φ = M/m. Because of these constraints, at the very least we can sell at price m, giving a competitive ratio of φ (the optimal solution might sell near M). To do better, we define the "reserve price" √(M·m). This means we sell the first time the price exceeds the reserve price, or at m if we never reach it. There are two cases. 1) There is an offer ≥ √(M·m) and we take it, but the optimum is M; then the ratio of our earning to the optimum is at least √(M·m)/M = √(1/φ). 2) The best offer is < √(M·m) and we sell at m; the optimum is also < √(M·m), so the ratio is at least m/√(M·m) = √(1/φ). So, this is √φ-competitive.

§13.2.2 With splitting

Now, suppose we can sell our stock in fractional pieces. We can use multiple reserve prices, e.g. sell the first half at m^(2/3) M^(1/3) and the second half at m^(1/3) M^(2/3). In this example, we find by case analysis that at least half of our stock is sold within a factor φ^(1/3) of the optimum, giving us 2φ^(1/3)-competitiveness. To generalize, we set reserve prices at m, 2m, 4m, ..., M. There are log φ such prices, and we sell a 1/(log φ) fraction at each reserve price. One of the successful reserve prices is at least OPT/2. This means we have a 2 log φ competitive ratio.
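The multi-reserve-price guarantee can be checked numerically. A small sketch, assuming a payout model where stock whose reserve is never reached gets sold at m at the end (my reading of the notes, not something they spell out):

```python
import math

def guaranteed_ratio(m, M):
    """Reserve prices m, 2m, 4m, ..., M; a 1/L fraction is sold at each
    reserve the adversary's peak price reaches; the rest is sold at m.
    Returns (worst ratio over all peaks, the 2 log(phi) bound)."""
    L = int(math.log2(M / m)) + 1             # number of reserve prices
    reserves = [m * 2 ** i for i in range(L)]
    worst = 0.0
    for peak in range(m, M + 1):              # adversary chooses the peak price
        hit = [r for r in reserves if r <= peak]
        revenue = sum(r / L for r in hit) + (L - len(hit)) * m / L
        worst = max(worst, peak / revenue)
    return worst, 2 * L

print(guaranteed_ratio(1, 1024))  # worst ratio stays below 2 log(phi)
```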

§13.2.3 Using randomization

Suppose we cannot split our stock. We can set a random reserve price from m, 2m, 4m, ..., M. There is a 1/(log φ) chance that we choose a price r with OPT/2 ≤ r ≤ OPT. If so, we earn at least OPT/2. So, the expected earning is (1/log φ) · (OPT/2) = OPT/(2 log φ). We have a 2 log φ competitive ratio in expectation.

§13.3 Scheduling problem

Suppose jobs arrive over time and we have to assign to machines when they arrive.

For P || C_max, Graham's greedy rule from the 1960s, which assigns each arriving job to the currently least loaded machine, gives a competitive ratio of 2. We already proved this in the unit on approximation algorithms.

In 1995, Bartal et al. improved the ratio to 2 − 1/70. Recall that the worst case for Graham's rule is when we have filled up all machines with tiny jobs and then suddenly one huge job, for which we have not readied a spot, comes in. Bartal prevents this problem by splitting the machines into two groups: slightly underutilized and slightly overutilized. As a huge job comes in, we can change the group threshold. With careful balancing, Bartal got 2 − 1/70 ≈ 1.986. In 1996, Karger split the machines into different groups instead, constructing a linear program to optimize the group sizes; the ratio went down to 1.945. In 1997, it went down to 1.923. There is a known lower bound of 4/3.

§13.4 Paging problem

In a computer memory system, some storage is fast but small and some is slow but large. We have to decide where to store each piece of information and move things around as needed. Because slow memory tends to be extremely slow relative to fast memory, we divide memory into pages, and whenever we need something on a page, we fetch the whole page instead of just one byte of data. Basically, we are relying on "locality of reference": when we access some information, we will probably continue accessing information near it.

Each time the memory access is unavailable in the fast memory, we have what is called a “page fault.” We have to “evict” some other page to make room in the fast memory. One idea is to evict the least frequently used. Another is to evict the least recently used. Another is to evict FIFO or LIFO. If we can predict the future, the optimal solution is to evict the one that will be used furthest in the future (Belady’s algorithm).

It turns out that least frequently used and LIFO are not competitive. Least recently used and FIFO are k-competitive where k is the number of pages that can fit in the fast memory.

§13.4.1 Least recently used (LRU)

Break any sequence of page requests into "phases," in each of which our algorithm generates k + 1 faults. If a phase contained only k distinct pages, none of them would become old enough to be evicted a second time. So a phase must contain k + 1 distinct pages, which means OPT has at least one fault per phase. Therefore, we get a ratio of k + 1.


We can refine this to k. Divide requests into phases of k distinct page requests. In each phase, each distinct request causes at most one fault because none of them can become the (k + 1)-st oldest and be evicted again. Total is at most k faults per phase.

Now, we want to analyze the optimal solution and show that there is at least one fault per phase, to obtain a k-competitive ratio. The idea is to move the first item of each phase to the previous phase. (The first phase now has k + 1 requests; other phases have k requests.) At the start of phase 2, a_{k+1} is in memory. a_{k+1}, ..., a_{2k} include k distinct requests (by the previous definition). Suppose there is no fault after a_{k+1} is fetched until a_{2k}. At a_{2k} we know exactly which pages are in memory. By definition, a_{2k+1} is distinct from them and thus causes a fault. If instead there is a fault after a_{k+1} but before a_{2k+1}, that is also at least one fault. Therefore, in both cases, we have a k-competitive ratio.

This is a tight bound. We can give an adversarial case to demonstrate this. Suppose we have a set of k + 1 pages and run the online algorithm. At each step, we will request a missing page so we miss every time. We have to also show that the offline algorithm does well. The offline algorithm knows the sequence, so on a miss it knows which one is the best to evict. The next k − 1 requests will be hits. OPT misses only once per k requests, so it is doing k times better. Therefore, k is the exact bound we can achieve.

Note that this is proved by Sleator and Tarjan... again!
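The adversarial argument can be simulated. A sketch, assuming Python, with a brute-force furthest-in-future (Belady) baseline; on a cyclic sequence over k + 1 pages, LRU faults on every request while the offline optimum faults roughly once per k requests:

```python
from collections import OrderedDict

def lru_faults(requests, k):
    cache = OrderedDict()
    faults = 0
    for p in requests:
        if p in cache:
            cache.move_to_end(p)
        else:
            faults += 1
            if len(cache) == k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[p] = True
    return faults

def belady_faults(requests, k):
    cache, faults = set(), 0
    for i, p in enumerate(requests):
        if p in cache:
            continue
        faults += 1
        if len(cache) == k:
            def next_use(q):  # index of q's next request, or infinity
                for j in range(i + 1, len(requests)):
                    if requests[j] == q:
                        return j
                return float('inf')
            cache.remove(max(cache, key=next_use))  # evict furthest-in-future
        cache.add(p)
    return faults

k = 4
requests = [i % (k + 1) for i in range(100)]  # adversarial cyclic sequence
print(lru_faults(requests, k), belady_faults(requests, k))  # 100 vs far fewer
```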

In modern machines, k ≈ 10^3 or 10^4, so k-competitive is bad. An alternate way to analyze these processes is to use probabilistic methods; however, this is quite optimistic because real-life programs have complex but predictable behaviors. There is also a concept called "resource augmentation": if we have k pages but OPT has h ≤ k pages, then LRU is k/(k − h + 1)-competitive. Basically, we are comparing an online algorithm with additional resources against a handicapped offline algorithm. If h = k/2, then we are roughly 2-competitive, which is quite good.

§13.5 Preventing predictability via randomization

Note that so far we have simulated the online algorithm then create an adversary to design a bad sequence based on the result we get. So, one way to prevent these tight lower bounds is to use randomization.

Randomized online algorithms can be characterized in two ways: 1) as algorithms that flip coins while running, and 2) as a probability distribution over deterministic algorithms. Basically, we can flip the coins in advance and use the determined bitstring.

Possible adversarial models are: 1) oblivious, 2) fully adaptive (knows the coin flips), 3) adaptive (knows only the coin tosses until present but not the future).

We will consider the oblivious adversary for paging problem. Fiat et al. came up with an algorithm that combines random eviction with LRU which is called “marking

algorithm" M. Initially, all pages are marked. On each fault, if all pages are marked, we unmark all pages; we then evict a random unmarked page and fetch the requested page. Finally, whether or not there was a fault, we mark the requested page. Fiat et al. proved that this algorithm is O(log k)-competitive.
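A minimal sketch of the marking algorithm as just described (the fixed seed and the fault-counting interface are mine):

```python
import random

def marking_faults(requests, k, seed=0):
    rng = random.Random(seed)
    cache, marked = set(), set()
    faults = 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                unmarked = cache - marked
                if not unmarked:          # all pages marked: a new phase begins
                    marked = set()
                    unmarked = set(cache)
                cache.remove(rng.choice(sorted(unmarked)))  # random unmarked eviction
            cache.add(p)
        marked.add(p)                     # mark on every request, hit or fault
    return faults

# With only k distinct pages, the only faults are the k compulsory ones.
print(marking_faults([0, 1, 2, 3] * 10, 4))  # 4
```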

Our proof is based on phases again. A phase starts at the first fault and restarts at the (k + 1)-st distinct request, so there are k distinct requests per phase. The start of each phase is exactly when all pages become marked and are then unmarked. So, we might as well think of the cost of the algorithm as the sum of the costs of the phases. Each phase is determined by the input, not by our random choices, but the mark bits line up with the phases.

Observe that each item, once marked, stays in memory at least until the phase ends (it never gets evicted until it is unmarked at the end). So, each page causes at most one fault per phase. This means we might as well ignore all but the first request to each page in a phase. (This doesn't hurt our algorithm, but it might hurt the adversary.)

We will say that phase i starts with M (our algorithm) having a set of pages S_i in memory. Define: a request is clean in phase i if it is not in S_i. By definition, M faults on a clean request (the page is not in memory when we need it). We claim that pages not in S_i are old, so OPT likely does not have them either and also misses. Define: a request is stale in phase i if it is in S_i. M faults on a stale request only if it evicted that page from S_i before the page was requested. We will show that this is unlikely, thanks to randomization.

Suppose that in a phase, there have been s stale and c clean requests so far (counting distinct pages only, since we said earlier we need not care about repeated requests). They all got marked and are still in memory, so s + c pages are known to be in memory. Consider the next stale request. The probability of faulting on it equals the probability that we evicted this page earlier. Each of the c clean requests evicted an unmarked page from S_i. Stale requests (which were in S_i), if evicted, have since been brought back and are in memory now. The rest of S_i, which has k pages in total, are candidates for what might be missing. So, c of the k − s candidate pages from S_i are missing. By symmetry, each is missing with probability c/(k − s).

We can analyze the whole phase. Suppose there are c_i clean requests and k − c_i stale requests. We pay c_i for the clean requests. For the stale requests, the expected number of faults is ≤ c_i/k + c_i/(k − 1) + ... + c_i/(c_i + 1) = c_i (1/(c_i + 1) + ... + 1/k) = O(c_i log k) in the worst case. Now, let us analyze OPT. We can't really compare phase by phase, but we can use a potential function. Define Φ_i = the number of pages that differ between M's cache and OPT's cache at the start of phase i. Obviously 0 ≤ Φ_i ≤ k. Suppose we get c_i clean requests; they are not in S_i, so at least c_i − Φ_i of them are not in OPT's cache at the start of phase i.

At the end of phase i, M has the k most recent distinct requests in memory, while OPT is missing Φ_{i+1} of them; those must have been evicted by OPT during this phase, costing it that many faults. So OPT's cost in phase i is ≥ max{c_i − Φ_i, Φ_{i+1}} ≥ (c_i − Φ_i + Φ_{i+1})/2 (the max is at least the average). Summing over phases, the Φ terms telescope, leaving ≥ (Σ_i c_i)/2 + (Φ_final − Φ_0)/2. Since our cost is O(Σ_i c_i log k), we are O(log k)-competitive.

§13.6 Finding lower bounds for randomized online algorithms

We view the problem as a game between the algorithm designer (A) and the input adversary. The score is C_A(σ) − k·C_OPT(σ). If it is ≤ 0, we have achieved k-competitiveness (C_A(σ) ≤ k · C_OPT(σ)); otherwise we failed. The algorithm designer wants to minimize this score; the input adversary wants to maximize it. This is a zero-sum two-player game.

Recall the homework problem: if one player's mixed strategy is known, the other player can best-respond with a pure strategy. Our randomized algorithm is a probability distribution over deterministic algorithms A. We want min over random A of max over σ of E[C_A(σ) − k · C_OPT(σ)]. By the minimax equilibrium, we can switch the order: we can view this as the adversary choosing a distribution over inputs and our single deterministic algorithm responding to it.

Theorem 13.6.1 (Yao's Minimax Theorem) The best competitive ratio achievable by a randomized online algorithm (whose distribution, but not whose coin flips, is known to the adversary) is equal to the best competitive ratio achievable by a deterministic online algorithm against a known distribution over inputs σ.

Corollary 13.6.2 If there exists a distribution over σ such that no deterministic algorithm A does well against it, then no randomized algorithm does well against all inputs.

So, to lower-bound the competitive ratio of randomized online algorithms, just come up with a bad distribution over inputs and show that no deterministic algorithm has expected ratio better than r. Then, we can conclude that no randomized algorithm can beat ratio r.

§13.6.1 Lower bound for paging algorithms

Here is one bad input distribution we could use: uniform distribution over k + 1 pages.

For an online deterministic algorithm on this sequence, regardless of what is in memory, Pr[fault] = Pr[the one missing page is requested] = 1/(k + 1). So every deterministic algorithm has one expected page fault per k + 1 requests. When OPT faults, it evicts the page that is requested again furthest in the future, and it won't fault again until that request arrives — which happens only after every one of the other k + 1 pages has shown up. This is the coupon collector's problem! So, it takes Θ(k log k) requests in expectation before OPT faults again.


Our deterministic algorithm faults with probability 1/(k + 1), i.e., about once per k + 1 requests in expectation, while OPT faults about once per Θ(k log k) requests. So, any deterministic algorithm has Ω(log k) times as many faults as OPT on this distribution. Thus, any randomized online algorithm is at best Ω(log k)-competitive.
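The coupon-collector estimate can be checked empirically (a quick simulation; the trial count and seed are mine):

```python
import random

def coupon_collector_mean(n, trials=2000, seed=1):
    """Empirical mean number of uniform draws from n coupons until all are seen."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, steps = set(), 0
        while len(seen) < n:
            seen.add(rng.randrange(n))
            steps += 1
        total += steps
    return total / trials

n = 11  # k + 1 pages with k = 10
expected = n * sum(1.0 / i for i in range(1, n + 1))  # n * H_n, about 33.2
print(coupon_collector_mean(n), expected)
```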

§13.7 k-server problem

Manasse et al. (1988) came up with a problem called k-server problem which is the generalization of many other problems.

Problem 13.7.1 (k-server problem). Given a metric space and k servers that move between points. Once we receive a request which is a point in space, we must move some server to the point. The cost is the total distance traveled by all the servers combined.

This problem covers, say, assigning repairmen. It does not cover taxi services because taxi has to move to certain destination after each request. Interestingly enough this problem also covers paging. Consider the uniform metric. A server is a slot in the memory. Moving a server to a point is equivalent to evicting and receiving a page. This problem also covers weighted paging, multi-head disk drives, etc.

Consider the case with three points: A and B are near each other, and C is very far from both. We have servers on A and C. Let the requests be A, B, A, B, .... The greedy solution would have the first server alternate between A and B, resulting in unbounded cost in the limit. Meanwhile, the optimal strategy is to send the server from C to B once, after which no server ever has to move again.

Yet, we can still gain some insight from this. This is very similar to the ski-rental problem where after a certain point, we pay a large cost to just settle things down.

The harmonic algorithm moves a server with probability proportional to 1/(distance to the request). Fiat showed that this algorithm is O(k^k)-competitive.

The work function algorithm tracks what the offline algorithm would have done so far, then tries to make the online algorithm converge toward it. Define W_i(X) (the work function) to be the optimal cost of serving the first i requests and finishing with servers on the set of points X. At each step, the online algorithm chooses its configuration X_i to minimize W_i(X_i) + d(X_{i−1}, X_i). Computing the offline optimum might be NP-hard (not sure?). However, the work function algorithm is (2k − 1)-competitive. Very good!

§13.7.1 1-D case: Double-coverage algorithm

Suppose we have servers spread on a line. For each request, we should only consider the two adjacent servers and move one of them. We do not know which one should move to the request, so one idea is to just move both toward it until one reaches it! (This doesn't really help anything, but it makes the analysis simpler.) Locally, we are spending at most twice the optimum. We aren't sure about the global picture because OPT's servers might not be in the same positions as ours. We will prove this is k-competitive.

We use potential functions to analyze these kinds of online algorithms because we usually care about the amount of work done over a series of requests, not just a single request. Our potential function should measure how much DC (our algorithm) differs in positions from OPT. We look at the minimum cost to move DC's servers to OPT's positions, which is just a min-cost matching between DC and OPT server positions. Let M be the value of this matching. We also account for the distances between DC's own servers s_1, ..., s_k. Then, we define:

    Φ = kM + Σ_{i<j} d(s_i, s_j).

In our analysis, we will first move OPT and update Φ, then move DC and update Φ. We show that if OPT moves distance d, Φ increases by ≤ kd, and if DC moves d, Φ decreases by ≥ d. If we can show these two things, then, since Φ is nonnegative and starts at a bounded value, DC's total cost is at most k times OPT's total cost plus a constant: we have an asymptotically k-competitive algorithm.

Suppose OPT moves by d. The term kM is the only term that changes (the other term involves only DC's servers). At worst, OPT's server moves away from its match by d, so M increases by at most d, meaning the potential increases by at most kd.

Suppose DC moves by d (after OPT has already moved). (Note that this d is different from the one in the OPT case.) Case 1: the request is outside the range of DC's servers. We move the closest server to the request. We know one of OPT's servers is already there (because OPT has already moved). So, M decreases by d, meaning the kM term decreases by kd. Why? We can assume w.l.o.g. that the matching pairs the rightmost servers of DC and OPT with each other: if the moving DC server were matched to a different OPT server, we could swap the matching and its cost would have been no greater (proof by picture).

We also have to analyze the sum term, not just kM. Since we are moving the outermost server, its distance to each of the other servers increases by d. So, the sum increases by (k − 1)d. Therefore, the total change in Φ is −d.

Case 2: the request R is between two DC servers. For the M term: assume w.l.o.g. one of the moving servers is matched to the OPT server at R (the same non-crossing argument). The server matched to OPT moves d, decreasing M by d; the other server might move away, increasing M by at most d. So, ΔM ≤ 0. Now, for the sum term, note that the two servers move in opposite directions. Their center of gravity is unchanged, so the sum of their distances to the other servers is unchanged. Everything cancels except the d(s_i, s_j) term for the two moving servers themselves, which decreases by 2d. So, the total decrease in Φ is ≥ 2d. We have completed the proof.
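The double-coverage rule on a line can be sketched directly (a Python sketch treating each move as an instantaneous jump of the computed distance; names are mine):

```python
import bisect

def double_coverage(servers, requests):
    """For a request between two servers, move both adjacent servers
    toward it at equal speed until one arrives; for a request outside
    all servers, move only the nearest one. Returns total distance."""
    pos = sorted(float(s) for s in servers)
    cost = 0.0
    for r in requests:
        i = bisect.bisect_left(pos, r)
        if i == 0:                        # request at/left of all servers
            cost += pos[0] - r
            pos[0] = r
        elif i == len(pos):               # request right of all servers
            cost += r - pos[-1]
            pos[-1] = r
        else:                             # between pos[i-1] and pos[i]
            d = min(r - pos[i - 1], pos[i] - r)
            cost += 2 * d                 # both servers travel distance d
            pos[i - 1] += d
            pos[i] -= d
        pos.sort()
    return cost

# The A, B, A, B, ... example: greedy pays forever, but double coverage
# slowly drags the far server in, so its total cost stays bounded.
print(double_coverage([0, 100], [1, 0] * 200))
```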

It turns out this argument also generalizes to trees. We just move every server that is not "blocked" by another server toward the request (possibly more than two servers move at once). By the same analysis, we get k-competitiveness.


Why are trees so special? There is an area called "metric embedding" where you try to represent general metrics as combinations of special metrics. It was proven that any metric can be approximated by tree metrics with distortion O(log^2 n). This leads to a randomized k-server algorithm that is O(log^2 n · log^2 k)-competitive, where n is the number of points in the space and k is the number of servers.

Can we do better? From paging, we know the lower bound for deterministic algorithms is k and for randomized algorithms is Ω(log k).

14 External memory algorithms

In computers, RAM is fast but accessing disk is so slow that the algorithmic runtime is just determined by the number of disk accesses. In the domain of external memory algorithms, we consider the algorithms that minimize the number of disk accesses.

Parameters for such algorithms are:

• Block size B: the number of items that fit in a block.

• Problem size N: the number of items in input.

• Memory size M: the number of items that fit in fast memory.

§14.1 Basic algorithms

Problem 14.1.1. Given an array on disk, scan to find the desired item.

The N items fit in N/B blocks. Read each block in a single access. So, the runtime is O(N/B).

Problem 14.1.2. Reverse the array.

The runtime is also O(N/B): read each block, reverse it, and write it back.

Problem 14.1.3. Given two n × n matrices. Add them.

Now, this isn't as easy. How do we store matrices on disk? One way is "row-major order," which basically divides each row into blocks and puts the blocks from the top row first down to the bottom row. We can then just add the corresponding blocks. So, the runtime is O(n^2/B).

Problem 14.1.4. Given two n × n matrices. Multiply them.

If we do simple matrix multiplication, where we read a row from A and a column from B and multiply them to get one value (out of n^2 total values), we need n/B block reads for a row of A and n block reads for a column of B (because B is stored in row-major order). So, the runtime is O(n^3): there are n rows of output; each row requires n/B reads from A, and each cell requires n reads from B.


What we can do is store B in column-major order instead. (The problem then is how to transpose.) Reading a column of B now takes n/B block reads, so the runtime is O(n^3/B).

This is still wasteful! Many outputs depend on the same input row but different columns, and vice versa. We can use a different layout called the block layout. For example, we can break the matrix multiplication AB into blocks:

    [A1 A2 A3]   [B1 B2 B3]   [A1·B1 + A2·B4 + A3·B7   ·   ·]
    [A4 A5 A6] × [B4 B5 B6] = [           ·             ·   ·]
    [A7 A8 A9]   [B7 B8 B9]   [           ·             ·   ·]

We want each block to be as large as possible while still allowing fast computation. So, each submatrix should fit into memory: size √M × √M.

For one submatrix multiplication: we read an entire block of A and of B, multiply them in memory, and write back. This takes O(M/B) block operations. We have n/√M blocks per row/column contributing to one √M × √M output submatrix, and there are (n/√M)^2 = n^2/M output submatrices. This means we need (n^2/M) · (n/√M) = n^3/M^(3/2) submatrix products, each taking O(M/B) block operations. So, the total runtime is O(n^3/(B√M)). Note that this is based on the naive O(n^3) sequential algorithm. Strassen's algorithm runs in O(n^2.81); we can externalize it the same way to get a faster external memory matrix multiplication algorithm.
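The block layout corresponds to classic cache-blocked ("tiled") matrix multiplication. A minimal Python sketch where the tile size bs plays the role of √M (the names are mine; in true external memory, each tile read would cost M/B block accesses):

```python
def blocked_matmul(A, B, bs):
    """Multiply n x n matrices tile by tile: each bs x bs tile of A and B
    is assumed to fit in fast memory together with the output tile."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, bs):              # tile row of C
        for j0 in range(0, n, bs):          # tile column of C
            for k0 in range(0, n, bs):      # accumulate A-tile x B-tile into C-tile
                for i in range(i0, min(i0 + bs, n)):
                    for k in range(k0, min(k0 + bs, n)):
                        a = A[i][k]
                        for j in range(j0, min(j0 + bs, n)):
                            C[i][j] += a * B[k][j]
    return C

print(blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 1))
# [[19.0, 22.0], [43.0, 50.0]]
```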

§14.2 Linked lists

We want to store a linked list in external memory. Operations to support are

• Insert/delete at current position (pointer)

• Scan/find next

Naive solution: For insert/delete, we need to read the block at the pointer. Also read the prev/next blocks. Delete the item from memory and write the new values. This takes O(1) time. For scan, we need O(1) per step.

Note that scan is excessive. If the elements are stored contiguously, we could do O(1/B) amortized per item. However, this messes with insertion because the boundary items might be moved into other blocks. Let us think of a way to solve this problem.

For deletion, if we simply do not rearrange the blocks, leaving a hole, we might end up with one item per block, giving O(1) time per item for scan again. So, we only allow the number of items per block to go as low as B/2 items/block. When we drop below this, we combine


with the next block. Case 1: the next block has ≤ B/2 elements. We can simply combine. Case 2: the next block has > B/2 elements. Then together we have more than B elements, and what we can do is rebalance into two blocks of ≥ B/2 elements each. Therefore, deletion (still) takes O(1) to delete, but scan is now O(1/B) per item.

For insertion, if there is a hole, we use it. If there is no hole, we need to split a block into two blocks of ≥ B/2 items each. Therefore, insert also takes O(1).

§14.3 Search trees

We can move search trees into external memory. We need O(log N) reads to look up and O(log N) writes to insert/delete. Because we are dealing with external memory, however, comparison runtime doesn’t really factor into our analysis at all. What we are aiming for is minimizing the number of block reads/writes. So, we need to find a way to store the tree efficiently.

One idea is to store parts of the tree with B items each to get locality of reference. Each part would have height log B, so one read implies log B progress. Hence, we only need (log N)/(log B) reads.

§14.3.1 B-trees

Let us talk about B-trees: trees of degree B. The depth of the tree is log_B N = (log N)/(log B). We will maintain these invariants:

• All leaves are at the same depth (log_B N).

• Leaves are mostly full.

• Keys are at leaves.

• Copy some keys up from leaves to serve as splitters on internal nodes.

To scan the tree, simply scan the leaves. To search, we need to arrive via internal nodes. How do we insert/delete while staying balanced and maintaining invariants?

For insertion, we have a problem when we insert into a full leaf block. We can split it into two leaf blocks and copy the middle key up to the parent as a splitter. If the parent is full, then we also recurse. If we reach the root and need to split, we simply add another layer on top of our tree (a new root). This involves O(log_B N) work.
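The leaf-split step can be sketched in isolation. This shows only the one-level case, under the assumed leaf capacity `B`; in the full B-tree the parent handles the returned splitter and may recurse the same way.

```python
B = 4  # leaf capacity (assumed)

def insert_into_leaf(leaf, key):
    """Insert `key` into a sorted leaf. Return (leaves, splitter): either
    one leaf and None, or two half-full leaves plus the key to copy up."""
    leaf = sorted(leaf + [key])
    if len(leaf) <= B:
        return [leaf], None
    mid = len(leaf) // 2
    # Split; the first key of the right leaf is copied up as the splitter.
    return [leaf[:mid], leaf[mid:]], leaf[mid]
```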

For deletion, we might get almost-empty leaves. So, we will try to maintain ≥ B/2 items per node. This can be done by merging with a neighbor block and deleting the splitter from the parent recursively.


B-trees are interesting because they maintain balance by allowing degrees to vary between B/2 and B. Meanwhile, a binary search tree keeps the degrees fixed and maintains balance through rotations.

§14.4 Sorting

We can use quicksort and try to change it into an external variant. Naively, the runtime is O((N/B) log(N/B)).

Can we do better? If we sort using a B-tree, the runtime is O(N log_B N) = O(N log N / log B), which is much worse. What about merge sort? Merge sort has O(log N) merge phases, and each merge phase is simply a scan. Runtime is O((N/B) log(N/B)), same as quicksort.

Observe that M is missing from our runtime. We could make use of our fast memory. Notice that during merge sort, once we have ≤ M items, we can scan and sort them all in memory in M/B time. So, runtime is O((N/B) log(N/M)).

Idea: do an M/B-way merge instead of a 2-way merge. To merge M/B sorted lists into one list with one scan pass, we put the front block of each list in our memory and repeatedly emit the minimum item to the output block (using a priority queue). When the output block is full, we write it to the disk and empty it to keep collecting more items. When the head block of an input list is exhausted, we read the next one. Runtime is the number of blocks of input data.
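The multi-way merge pass can be sketched with a priority queue. Here Python’s `heapq` stands in for the in-memory heap, and the per-run block structure is elided: we keep one cursor per run instead of one front block per run.

```python
import heapq

def kway_merge(runs):
    """Merge sorted lists `runs` into one sorted list in a single pass,
    repeatedly emitting the minimum front item via a heap."""
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        val, i, j = heapq.heappop(heap)      # emit the minimum front item
        out.append(val)
        if j + 1 < len(runs[i]):             # refill from that run
            heapq.heappush(heap, (runs[i][j + 1], i, j + 1))
    return out
```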

There are only O(log_{M/B}(N/B)) merge phases, not O(log_2(N/B)). So, runtime improves to O((N/B) log_{M/B}(N/B)).

Can we do better? No! This is information theoretically optimal.

There is some sort of a mismatch here. Usually, sorting takes O(N log N) and searching takes O(log N), so sorting costs about one search per item. However, in external memory algorithms, sorting takes O((N/B) log_{M/B}(N/B)) time while searching takes O(log_B N), even though matching the sorting bound would mean searching in O((1/B) log_{M/B}(N/B)). This seems like nonsense, though, because a single search should cost at least one block read, i.e., more than O(1/B). However, people came up with “buffer trees,” which generalize B-trees to batch inserts at an amortized cost of O((1/B) log_{M/B}(N/B)) per insert, at the price of latency.

In external memory algorithms, we need to know B. There is also a hierarchy of memory with more than just two layers (memory and hard disk). Charles Leiserson’s group introduced the notion of “cache-oblivious algorithms,” which are designed for external memory and achieve optimal external memory bounds without knowing B or M. How could this possibly happen?

Consider the matrix multiplication problem. Charles said multiplying n × n matrices is equivalent to multiplying a lot of (n/2) × (n/2) matrices, so we can do this recursively. When we go deep enough, we reach a point where the matrices fit into the memory, so we do not have to access the external memory anymore. So, we only have to analyze how long it takes to get down there.

15 Parallel Algorithms

There are two models for analyzing parallel algorithms. One model is the circuits of logic gates. The other model is parallel processors over shared memory. These models turn out to be equivalent from the theoretical perspective.

For the second model, the metric we can use to measure parallel algorithms’ performance is the speed: the number of parallel steps to solve. Another metric is the number of processors used.

For the first model, the circuits of logic gates, we can measure the performance by the size (the number of the gates) or the depth (the layers of the circuit).

§15.1 Circuits

Consider the logic gates and/or/not (others are just combinations). The logic gates are “clocked,” meaning the output is available one cycle after the input, so the depth determines the runtime.

“Fan-in” is the number of inputs to one gate. We can talk about bounded fan-in, where each gate has a constant number of inputs, or unbounded fan-in, where each gate has an arbitrary number of inputs. For example, an OR gate can have a bounded fan-in of 2. An OR gate with k inputs can be reduced to k − 1 two-input OR gates linked together as a balanced tree, requiring lg k layers.

Definition 15.1.1. AC(k) is the set of all functions computable with a circuit of depth O(log^k n) using poly(n) gates and unbounded fan-in.

Definition 15.1.2. NC(k) is similar to AC(k) but with bounded fan-in. (NC = Nick’s class. SC = Steve’s class.)

Theorem 15.1.3

AC(k) ⊆ NC(k + 1) ⊆ AC(k + 1)

Proof. AC(k) ⊆ NC(k + 1): Replace each unbounded gate with a balanced tree of bounded fan-in gates, which multiplies the depth by at most O(log n). NC(k + 1) ⊆ AC(k + 1): A bounded fan-in circuit is already a valid unbounded fan-in circuit.

Definition 15.1.4. AC = NC = ⋃_k NC(k)


§15.1.1 Addition

Consider the problem of adding two n-bit numbers. The grade-school algorithm simply adds two bits at a time, from the lowest order to the highest, using an adder that accepts two input bits and a carry and outputs a result bit and a carry. This algorithm uses depth O(n) and size O(n). (This is called a “ripple adder.”)

Another idea called “carry lookahead” has depth O (log n). Note that the key problem is figuring out the carry bits. If we know the carry bits, we can add each bit in parallel in constant time. (We are going to ignore the result bits now.)

The algorithm will pre-plan for hypothetical values of carry bits. Given inputs a_i, b_i and unknown c_{i−1} (carry from the previous bit), what are the cases for the output c_i (this carry bit)? If a_i = b_i = 0, then c_i = 0 regardless of c_{i−1} (“kill”). If a_i = b_i = 1, then c_i = 1 (“generate”). If a_i ≠ b_i, then c_i = c_{i−1} (“propagate”). We will consider these three cases, with a_i, b_i as parameters, as gates. Given a carry bit, the final carry bit is the result of going through all these gates.

Define a “multiplication table” for these operations composed together. See notes. Note that the operation is associative.

Note that the first gate is k (and more importantly, not p). By induction, x_i · x_{i−1} · ... · x_0 is never p. So, this means we can get the value of c_i depending on whether the composition is k or g.

To summarize, given x_i ∈ {p, g, k}, for each c_i we need x_i · x_{i−1} · ... · x_0. This is like a prefix sum: y_0 = x_0, y_i = x_i · y_{i−1}. Computed this way, it is like a ripple adder.
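The k/g/p composition can be sketched sequentially as follows. The operator table follows the discussion above; the function names are ours, and we seed the prefix with k (a carry-in of 0), which is why the running composition is never p.

```python
def status(a, b):
    """Carry status of one bit position: kill, generate, or propagate."""
    return 'k' if (a, b) == (0, 0) else 'g' if (a, b) == (1, 1) else 'p'

def compose(hi, lo):
    """Compose x_hi . x_lo: if the later position propagates, the earlier
    status passes through; otherwise the later position decides."""
    return lo if hi == 'p' else hi

def carries(abits, bbits):
    """Carry out of each position (bits given LSB first), via the prefix
    product x_i . x_{i-1} . ... . x_0."""
    out, acc = [], 'k'                      # carry-in is 0, i.e. status k
    for a, b in zip(abits, bbits):
        acc = compose(status(a, b), acc)    # acc is always k or g, never p
        out.append(1 if acc == 'g' else 0)
    return out
```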

How could we compute the y_i in parallel? Even simpler: how do we compute y_n? Because of the associative property, we can simply place the parentheses so that we compute this as a balanced binary tree. We can naively compute all n prefix values by using n trees. Size would be O(n^2) and depth O(log n). However, note that there are a lot of overlaps: one subtree can be used for multiple outputs. Just like in a segment tree, we can assemble any prefix from log n subtrees of the big tree. However, we cannot simply use high fan-out. We can instead feed subtree values up and then back down, giving O(log n) depth and O(n) gates.
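The up-then-down structure can be sketched recursively for any associative operation (written here left-to-right, y_i = x_0 op ... op x_i). This is a sequential sketch of the circuit's wiring, not a parallel implementation; each recursion level corresponds to one up-sweep and one down-sweep layer.

```python
def prefix(xs, op):
    """All inclusive prefixes of xs under associative `op`, with O(n)
    combines and O(log n) recursion depth."""
    n = len(xs)
    if n == 1:
        return xs[:]
    # Up-sweep: combine adjacent pairs, recurse on the half-size list.
    pairs = [op(xs[i], xs[i + 1]) for i in range(0, n - 1, 2)]
    if n % 2:
        pairs.append(xs[-1])
    sub = prefix(pairs, op)
    # Down-sweep: odd positions reuse a subtree prefix directly; even
    # positions combine the previous subtree prefix with one element.
    out = [None] * n
    for i in range(n):
        if i % 2:
            out[i] = sub[i // 2]
        else:
            out[i] = xs[0] if i == 0 else op(sub[i // 2 - 1], xs[i])
    return out
```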

“As a theoretical computer scientist, I am opposed on principle to thinking about constant factors.” - Karger

§15.2 Parallel Random Access Machine (PRAM)

We have processors running synchronously. Each processor can perform local computation. They can also read or write a shared memory (RAM). We will have to worry about conflict resolution: What happens when two processors read/write the same location? If the conflict is reading, they all get the same value but slowly.


There is the “exclusive read” model which allows only one processor to read one memory location in one time step (similar to bounded fan-out). There is also “concurrent read” which allows many processors to read the same location (similar to unbounded fan-out).

Similar models exist for writing. We need to resolve issues with writing multiple values at once. Ideas include choosing arbitrary result, random result, max/min result, garbage, etc.

Acronyms include CRCW (concurrent read/write), CREW (concurrent read, exclusive write), EREW.

Definition 15.2.1. CRCW(k) = problems solvable in O(log^k n) time using poly(n) CRCW processors.

Like in the previous model, CRCW(k) ⊆ EREW(k + 1). NC = ⋃_k NC(k) = ⋃_k CRCW(k) = ⋃_k AC(k). Any circuit of depth d and size g can be simulated by a PRAM with g processors in time d. Any PRAM algorithm with n processors and time t can be built into a circuit with n · polylog(n) gates and depth t · polylog(n).

§15.2.1 Finding OR

Note that there are meaningful differences between EREW and CRCW. Suppose we want to compute the OR of n input bits. Runtime for EREW is O(log n) using an OR tree. For CRCW, under every write-conflict rule except the garbage rule, we can at least be sure that the written value is one of the values being written. So, we add a condition on the processors: a processor writes 1 if it sees a 1 and writes nothing if it sees a 0. Then the OR is computed in O(1).

§15.2.2 Max of n values

For EREW, runtime is Θ(log n), solved by doing tree comparison. For CRCW, runtime is O(1) using n^2 processors to compare every pair. The max is the value which wins all n − 1 of its comparisons. Each number checks if it is the max by computing the OR of “do I lose to some other number?”, which we can already solve in O(1). If so, it writes itself to the output.
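The CRCW max can be sketched as a sequential simulation: the two nested loops stand in for the n^2 virtual processors, and the `loses` array is the OR of "do I lose to some other number?" per item.

```python
def crcw_max(vals):
    """Sequential simulation of the O(1) CRCW max with n^2 comparisons."""
    n = len(vals)
    loses = [0] * n
    for i in range(n):                  # each (i, j) pair: one processor
        for j in range(n):
            if vals[j] > vals[i]:
                loses[i] = 1            # concurrent writes of the same value
    for i in range(n):
        if not loses[i]:
            return vals[i]              # a non-loser writes itself out
```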

§15.3 Work of an algorithm

Work of an algorithm is the number of processors times the time spent. In practice we only have a few processors, so one processor has to simulate multiple processors as demanded by our algorithm.

A parallel algorithm is called “efficient” if the amount of work is not more than the order of the best sequential runtime. (Parallelism often costs efficiency.)


§15.3.1 Max problem

For the max problem, we can solve it in O(log log n) time using n processors. Work is O(n log log n) rather than O(n^2). Suppose we have k candidate maxes remaining. Make k^2/n groups of n/k items each. Now we solve max per group in O(1) time using the quadratic-processor max algorithm: each group needs (n/k)^2 processors, so (k^2/n) · (n/k)^2 = n processors suffice to reduce k items to k^2/n items in O(1) time. Using n processors, the runtime satisfies:

T(k) = O(1) + T(k^2/n)

Write k = n/2^i. Then,

T(n/2^i) = O(1) + T((1/n)(n/2^i)^2) = O(1) + T(n/2^{2i})

The denominator is squared in each step. After log log n squarings, the denominator reaches n, so we are done.

§15.3.2 Parallel prefix sum

We can transform our circuit algorithm into PRAM algorithm, using O(log n) time on n processors. This is not work-optimal because the work is O(n log n) but the sequential algorithm runs in O(n) time.

Can we improve? Can we solve this using n/log n processors? Try making blocks of log n contiguous x_i. Assign one processor per block. Have processor j compute y_j = the product of all of its block’s items. Do a parallel prefix on the y_j using the old algorithm, then have each processor fill in the prefixes within its block, starting from the previous block’s prefix y_{j−1}.

Each processor spends log n time on its block. The parallel prefix takes log n time on n/log n items. Total work is O(n).

The appeal of a work-efficient algorithm is that we can run it on machines with any number of processors. If k < n/log n processors are available, then the time is n/k, giving us a perfect speedup.

§15.3.3 List ranking

We are given a linked list of n items. The list is initially stored in an array of list nodes but with out-of-order pointers. For each item, find its “position” in the list: the number of items following it when the linked list is transformed into an array. This basically allows us to “unroll” the linked list into an ordered array.

Being able to solve this allows us to solve plenty of problems. For example, we can solve the parallel prefix problem but on this jumbled up list.

There is a trick called “pointer jumping.” Let d(x) be the value in each node and n(x) its next pointer. Pointer jumping does d(x) += d(n(x)) and then n(x) = n(n(x)). Doing this O(log n) times gives us all prefix sums. If we initially set d(x) = 1 for all x, we get the list ranking.
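Pointer jumping can be sketched as a sequential simulation in which every round updates all nodes in lockstep. With d(x) = 1 initially, the final d(x) is x's distance to the end of the list, counting itself; the list is given as a successor array with None at the tail.

```python
def list_rank(nxt):
    """List ranking by pointer jumping: returns, for each node, the
    number of nodes from it to the end of the list (inclusive)."""
    n = len(nxt)
    d = [1] * n
    nxt = nxt[:]                           # don't clobber the input
    done = False
    while not done:
        done = True
        nd, nn = d[:], nxt[:]              # all processors step in sync
        for x in range(n):
            if nxt[x] is not None:
                nd[x] = d[x] + d[nxt[x]]   # d(x) += d(n(x))
                nn[x] = nxt[nxt[x]]        # n(x) = n(n(x))
                done = False
        d, nxt = nd, nn
    return d
```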

The work is O(n log n). We can make this work-efficient by using randomization. Then, we can derandomize it to O(log n log log n) time using n/log n processors.

§15.4 Parallel sorting

Can we do parallel binary search on CRCW? Yes. With k processors, the runtime is O(log_k n) = O(log n / log k). On each iteration, we divide the array up into k blocks. Each processor considers the first item of one block and answers whether the query is to the left or to the right of that item. Then, each processor compares with its neighbor; for the two processors that point toward each other, the query must lie between them. So, we have gone down from n items to n/k items. Repeat this until we reach one item.

For sorting, note that we can find the max and remove items in O(1) time. So, we have an O(n) sorting algorithm using n^2 processors, which is pretty bad.

Idea: for each element, compare it to all others and count the number of smaller items. That gives us its index; put each item at that index. We use n^2 processors for the comparisons. Counting smaller items uses n^2 processors and O(log n) time (using the aggregation tree/pointer jumping method). Putting each item at its index uses n processors and O(1) time. So, overall we can sort in O(log n) time using n^2 processors. The work is O(n^2 log n), though, which is not efficient.
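The count-of-smaller-items idea can be sketched sequentially; the inner comparisons stand in for the n^2 virtual processors, and we break ties by original index (our own convention, to keep ranks unique) so duplicates land in distinct slots.

```python
def rank_sort(vals):
    """Sort by computing, for each item, the number of items that must
    precede it, then writing each item at that index."""
    n = len(vals)
    out = [None] * n
    for i in range(n):
        rank = sum(1 for j in range(n)
                   if (vals[j], j) < (vals[i], i))   # count of smaller items
        out[rank] = vals[i]                          # write item i in place
    return out
```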

Cole figured out a way to parallelize merge sort. Merge sort has O(log n) phases of merging. If you can make merging fast using parallel computations, then we have a fast sorting algorithm.

Idea: each item x in list A finds its position in list B (and similarly for items in B). This yields the total number of items preceding x, so we can place each item at the right index. Each item uses one processor for binary searching. If we have two length-k lists, then we need 2k processors running in O(log k) time.

In merge sort, the total size of all lists is n in each phase, so we need n processors overall. Time per phase is log k ≤ log n. Each merge phase runs in O(log n) time using n processors. Overall, sorting is done in O(log^2 n) time using n processors. Work is O(n log^2 n).
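The merge step can be sketched sequentially, with one loop iteration per virtual processor. Using `bisect_left` for A's items against B and `bisect_right` for B's items against A (our tie-breaking convention) guarantees that equal values get distinct output slots.

```python
import bisect

def parallel_merge(A, Bs):
    """Merge two sorted lists: each item's output index is its own index
    plus its rank in the other list, found by binary search."""
    out = [None] * (len(A) + len(Bs))
    for i, x in enumerate(A):                 # one processor per item of A
        out[i + bisect.bisect_left(Bs, x)] = x
    for j, y in enumerate(Bs):                # one processor per item of B
        out[j + bisect.bisect_right(A, y)] = y
    return out
```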
