The Adaptive Complexity of Submodular Optimization
A dissertation presented by Eric Balkanski to The School of Engineering and Applied Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science.

Harvard University
Cambridge, Massachusetts
May 2019

© 2019 Eric Balkanski. All rights reserved.

Dissertation advisor: Yaron Singer

Abstract

In this thesis, we develop a new optimization technique that leads to exponentially faster algorithms for solving submodular optimization problems. For the canonical problem of maximizing a non-decreasing submodular function under a cardinality constraint, it is well known that the celebrated greedy algorithm, which iteratively adds the element whose marginal contribution is largest, achieves a 1 − 1/e approximation, which is optimal. The optimal approximation guarantee of the greedy algorithm comes at the price of high adaptivity. The adaptivity of an algorithm is the number of sequential rounds it makes when polynomially many function evaluations can be executed in parallel in each round. Since submodular optimization is regularly applied to very large datasets, adaptivity is crucial: algorithms with low adaptivity enable dramatic speedups in parallel computing time.

Submodular optimization has been studied for well over forty years, and, somewhat surprisingly, no constant-factor approximation algorithm was known for submodular maximization whose adaptivity is sublinear in the size of the ground set n. Our main contribution is a novel optimization technique called adaptive sampling which leads to constant-factor approximation algorithms for submodular maximization in only logarithmically many adaptive rounds. This is an exponential speedup in the parallel runtime for submodular maximization compared to previous constant-factor approximation algorithms. Furthermore, we show that no algorithm can achieve a constant-factor approximation in õ(log n) rounds. Thus, the adaptive complexity of submodular maximization, i.e., the minimum number of rounds r such that there exists an r-adaptive algorithm which achieves a constant-factor approximation, is logarithmic up to lower-order terms.

Contents

1 Introduction
  1.1 Adaptive Sampling: An Exponential Speedup
  1.2 Results Overview
    1.2.1 From Predictions to Decisions: Optimization from Samples
    1.2.2 Faster Parallel Algorithms Through Adaptive Sampling
  1.3 Preliminaries
  1.4 Discussion About Adaptivity
    1.4.1 Adaptivity in Other Areas
    1.4.2 Related Models of Parallel Computation
    1.4.3 Applications of Adaptivity

2 Non-Adaptive Optimization
  2.1 From Predictions to Decisions: Optimization from Samples
    2.1.1 The Optimization from Samples Model
    2.1.2 Optimization from Samples is Equivalent to Non-Adaptivity
    2.1.3 Overview of Results
  2.2 Optimization from Samples Algorithms
    2.2.1 Curvature
    2.2.2 Learning to Influence
    2.2.3 General Submodular Maximization
  2.3 The Limitations of Optimization from Samples
    2.3.1 A Framework for Hardness of Optimization from Samples
    2.3.2 Submodular Maximization
    2.3.3 Maximum Coverage
    2.3.4 Curvature
  2.4 References and Acknowledgments

3 Adaptive Optimization
  3.1 Adaptivity
    3.1.1 The Adaptive Complexity Model
    3.1.2 The Adaptivity Landscape for Submodular Optimization
    3.1.3 Main Result
    3.1.4 Adaptive Sampling: A Coupling of Learning and Optimization
    3.1.5 Overview of Results
  3.2 Adaptive Algorithms
    3.2.1 An Algorithm with Logarithmic Adaptivity
    3.2.2 Experiments
    3.2.3 The Optimal Approximation
    3.2.4 Non-monotone Functions
    3.2.5 Matroid Constraints
  3.3 Adaptivity Lower Bound
  3.4 References and Acknowledgments

Acknowledgments

First and foremost, I would like to thank my advisor Yaron Singer. Yaron has taught me everything about research, from long-term research vision to choosing the right font for a presentation. I have been extremely fortunate to have an advisor who has always believed in me, cares so much about my success, and tirelessly mentored and advised me through every stage of my PhD.

My family has been a vital source of support through the years. I am grateful to my parents, Cecile and Yves, for all the encouragement and the freedom to explore every project and idea I have had since a very young age. Thank you to my siblings Sophie and Jeff for all the good times laughing together and the endless singing in the car during vacations; I cannot wait for our next family vacation. Thanks also to my girlfriend Meghan for the emotional support and for dealing with me during paper deadlines.

I am very grateful for my two internships at Google Research NYC, which have played an important role in my PhD. During these two summers, I met and worked with wonderful people, broadened my research horizons, and explored new directions. Thank you in particular to my hosts and collaborators Umar Syed, Sergei Vassilvitskii, Balu Sivan, Renato Paes Leme, and Vahab Mirrokni.

I have also had the chance to work with incredible collaborators from whom I have learned a lot and who made significant contributions to this thesis: Aviad Rubinstein, Jason Hartline, Andreas Krause, Baharan Mirzasoleiman, Nicole Immorlica, Amir Globerson, Nir Rosenfeld, and Adam Breuer. I am also grateful for the numerous discussions about the content of this thesis with Thibaut Horel, whose breadth of knowledge has been very helpful.

I am fortunate to have spent my PhD working in a fun and warm environment. Thank you to my officemates through the years, Emma Heikensten, Siri Isaksson, Dimitris Kalimeris, Gal Kaplun, Sharon Qian, Greg Stoddard, Bo Waggoner, and Ming Yin; I will miss the great atmosphere in MD115. The EconCS group and Maxwell Dworkin have been amazingly friendly and supportive places to grow academically. Thanks in particular to David Parkes, Yiling Chen, Jean Pouget-Abadie, Hongyao Ma, Chara Podimata, Jarek Blasiok, Debmalya Mandal, Preetum Nakkiran, Lior Seeman, Goran Radanovic, and Yang Liu, and to my research committee: Yaron Singer, David Parkes, Sasha Rush, and Michael Mitzenmacher.
I am fortunate for the countless and diverse opportunities I had during my undergraduate studies at Carnegie Mellon, which helped me grow. These opportunities led me to valuable research, teaching, leadership, social, and athletic experiences. Thank you in particular to my undergraduate research advisor, Ariel Procaccia, who was the first person to tell me he believed I could become a professor, and to my academic advisor, John Mackey, for all the opportunities.

This thesis was supported in part by a Google PhD Fellowship and a Smith Family Graduate Science and Engineering Fellowship.

Chapter 1

Introduction

The field of optimization has recently expanded to new application domains, where complex decision-making tasks are challenging existing frameworks and techniques. A main difficulty is that the scale of these tasks is growing at a vertiginous rate. Innovative frameworks and techniques are needed to capture modern application domains as well as to address the challenges posed by large-scale computation. We begin by discussing three application domains where optimization has recently played an important role.

The first domain is genomics. Recent developments in computational biology that allow processing large amounts of genomic data have been a catalyst for progress towards understanding the human genome. One approach that helps to understand and process massive gene datasets is to cluster gene sequences. Given gene clusters, a small representative subset of genes, in other words a summary of the entire dataset, can be obtained by choosing one gene sequence from each cluster. The problem of clustering gene sequences is an example of a large-scale optimization problem.

A different domain is recommender systems. Twenty years ago, a Friday movie night would require browsing through multiple aisles at Blockbuster to find the right movie. Today, streaming services use recommender systems to suggest a personalized collection of movies to a user. Recommender systems are not limited to movies, but are also used for music and online retail. Given the preferences of different users, optimization techniques are used to find diverse and personalized collections of movies, songs, or products.

The third domain, ride-sharing services, did not even exist a decade ago. Ride-sharing services have revolutionized transportation by allowing passengers to request a ride to their desired destination in a few seconds using their smartphone. These companies face novel and complex optimization tasks. One example is driver dispatch, the problem of assigning drivers to different locations in order to match the demand from riders.

Gene sequences, user ratings for music and movies, and rider-driver locations are examples of large datasets that we wish to harness using optimization. Novel algorithmic techniques and frameworks that are adapted to these new domains are needed.

A standard approach to these optimization problems is an algorithmic technique called the greedy approach. Informally, a greedy algorithm takes small local steps towards building a global solution to a problem. For example, for the driver dispatch problem in New York City, a greedy algorithm might first dispatch a driver to bustling midtown Manhattan.
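To make this concrete, the sketch below implements the classical greedy algorithm described in the abstract: maximize a non-decreasing submodular function f under a cardinality constraint k by repeatedly adding the element with the largest marginal contribution. This is a minimal illustrative sketch, not code from the thesis; the function names and the toy coverage objective are assumptions introduced here.

    # A minimal sketch of the greedy algorithm for maximizing a
    # non-decreasing submodular function f under a cardinality
    # constraint k. It achieves the optimal 1 - 1/e approximation,
    # but it is highly adaptive: it makes k sequential rounds of queries.
    def greedy(f, ground_set, k):
        solution = set()
        for _ in range(k):
            # One adaptive round: these marginal-contribution queries
            # depend on the solution built in all previous rounds.
            best = max(ground_set - solution,
                       key=lambda x: f(solution | {x}) - f(solution))
            solution.add(best)
        return solution

    # Toy example: coverage functions are monotone submodular.
    sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}, 4: {"a", "d"}}

    def coverage(S):
        return len(set().union(*(sets[i] for i in S))) if S else 0

    print(greedy(coverage, set(sets), k=2))  # e.g. {1, 3}, covering all four items

The k sequential iterations in this sketch are exactly the high adaptivity discussed in the abstract: each iteration must wait for the previous round's function evaluations before issuing its own. Adaptive sampling, the technique developed in this thesis, instead uses only logarithmically many rounds, each issuing polynomially many function evaluations in parallel.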