Randomized Versus Deterministic Implementations of Concurrent Data Structures

Randomized versus Deterministic Implementations of Concurrent Data Structures THÈSE NO 5447 (2012) PRÉSENTÉE LE 6 AOÛT 2012 À LA FACULTÉ INFORMATIQUE ET COMMUNICATIONS LABORATOIRE DE PROGRAMMATION DISTRIBUÉE PROGRAMME DOCTORAL EN INFORMATIQUE, COMMUNICATIONS ET INFORMATION ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE POUR L'OBTENTION DU GRADE DE DOCTEUR ÈS SCIENCES PAR Dan Alistarh acceptée sur proposition du jury: Prof. A. Schiper, président du jury Prof. R. Guerraoui, directeur de thèse Dr P. Kuznetsov, rapporteur Prof. P. Spirakis, rapporteur Prof. R. Urbanke, rapporteur Suisse 2012 To Ludmila Abstract One of the key trends in computing over the past two decades has been increased distribution, both at the processor level, where multi-core architectures are now the norm, and at the system level, where many key services are currently distributed over multiple machines. Thus, understanding the power and limitations of computing in a concurrent, distributed setting is one of the major challenges in Computer Science. In this thesis, we analyze the complexity of implementing concurrent data structures in asynchronous shared memory systems. We focus on the complexity of a classic distributed coordination task called renaming, in which a set of processes need to pick distinct names from a small set of identifiers. We present the first tight bounds for the time complexity of this problem, both for deterministic and randomized implementations, solving a long-standing open problem in the field. For deterministic algorithms, we prove a tight linear lower bound; for randomized solutions, we provide logarithmic upper and lower bounds on time complexity. Together, these results show an exponential separation between deterministic and randomized renaming solutions. Importantly, the lower bounds extend to implementations of practical shared-memory data structures, such as queues, stacks, and counters. From a technical perspective, this thesis highlights new connections between the distributed renaming problem and other fundamental objects, such as sorting networks, mutual exclusion, and counters. In particular, we show that sorting networks can be used to obtain optimal randomized solutions to renaming, and that, in turn, the existence of these solutions implies a linear lower bound on the complexity of the problem. In sum, the results in this thesis suggest that deterministic implementations of shared-memory data structures do not scale well in terms of worst-case time complexity. On the positive side, we emphasize randomization as a natural alternative, which can circumvent the deterministic lower bounds with high probability. Thus, a promising direction for future work is to extend our randomized renaming techniques to obtain efficient implementations of concurrent data structures. Keywords: concurrent algorithms, shared memory, data structures, renaming, randomization, lower bounds v Résumé Les deux dernières décennies ont vu l’augmentation de la distribution des processus s’affirmer comme une tendance majeure de l’évolution des systèmes informatiques. Ceci aussi bien au niveau des microprocesseurs, qui sont maintenant pratiquement tous multi-coeur, que au niveau système, où les services principaux sont maintenant répartis sur plusieurs machines. La compréhension des possibilités et des limitations du calcul réparti et concurrent est donc d’un intérêt majeur en informatique. La présente thèse analyse la complexité algorithmique de l’implémentation de structures de données concurrentes dans les systèmes à mémoire partagée. Nous concentrons nos efforts sur la complexité d’une tâche de coordination classique dite du renommage. La tâche consiste à attribuer un nom distinct appartenant à un ensemble restreint de noms à chaque processus du système. Nous présentons les premières bornes optimales de la complexité temporelle de cette tâche pour les implémentations déterministes et probabilistes. Ces résultats résolvent un problème ouvert de longue date dans le domaine. Pour les algorithmes déterministes, nous établissons une borne inférieure linéaire. Quant aux algorithmes probabilistes, nous établis- sons des bornes inférieure et supérieures logarithmiques. Ces résultats montrent qu’il existe une différence exponentielle de complexité entre les solutions probabilistes et déterministes pour la tâche du renommage. La borne inférieure obtenue dans le cas déterministe s’applique aussi aux implémentations déterministes de structures de données telles que les piles, les files, et les compteurs. D’un point de vue technique, cette thèse met en lumière des connections nouvelles entre la tâche du renommage et d’autres tâches fondamentales telles que les réseaux de tri, l’exclusion mutuelle, et les compteurs. En particulier, nous montrons qu’il est possible d’obtenir une solution probabiliste optimale à la tâche du renommage en utilisant les réseaux de tri. L’existence de cette solution implique une borne inférieure linéaire à la complexité du problème dans le cas déterministe. En somme, les résultats obtenus lors de cette thèse suggèrent que les implémentations déter- ministes des structures de données en mémoire partagée ne s’adaptent pas bien à la montée en charge d’un système. De manière plus positive, nos résultats mettent en lumière l’utilisation des méthodes probabilistes, qui permettent de contourner ce problème avec grande probabil- ité. L’extension des techniques utilisées pour résoudre de manière probabiliste le problème du renommage aux structures de données concurrentes est donc une direction de recherche prometteuse. vii Mots-clés: algorithmes concurrents, mémoire partagée, structures de données, renommage, méthodes probabilistes, bornes inférieures viii Preface This thesis presents part of the Ph.D. work performed under the supervision of Prof. Rachid Guerraoui at the EPFL School of Computer and Communication Sciences. It focuses on the complexity of concurrent shared-memory data structures, in particular the complexity of the renaming problem. The main results of this thesis appeared originally in the following papers. 1. Dan Alistarh, James Aspnes, Seth Gilbert, and Rachid Guerraoui. The complexity of renaming. In Proceedings of the 52nd IEEE Annual Symposium on Foundations of Computer Science (FOCS 2011), pages 718-727, 2011. 2. Dan Alistarh, James Aspnes, Keren Censor-Hillel, Seth Gilbert, and Morteza Zadimoghad- dam. Optimal-time adaptive strong renaming, with applications to counting. In Pro- ceedings of the 30th Annual Symposium on Principles of Distributed Computing (PODC 2011), ACM, pages 239-248, 2011. 3. Dan Alistarh, Hagit Attiya, Seth Gilbert, Andrei Giurgiu, and Rachid Guerraoui. Fast randomized test-and-set and renaming. In Proceedings of the 24th International Conference on Distributed Computing (DISC 2010), pages 94-108, 2010. During my Ph.D., I was also involved in other aspects of distributed computing, such as shared- memory data structures, the complexity of speculative distributed algorithms, and efficient wireless communication in a hostile environment. This resulted in the following publications. 1. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. How to Solve Con- sensus in the Smallest Window of Synchrony. In Proceedings of the 22nd International Symposium on Distributed Computing (DISC 2008), pages 32-46, 2008. 2. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. Of Choices, Fail- ures and Asynchrony: The Many Faces of Set Agreement. In Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC 2009), pages 943-953, 2009. 3. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Morteza Zadimoghaddam. How Efficient Can Gossip Be? (On the Cost of Resilient Information Exchange). In Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP 2010), pages 115-126, 2010. ix Preface 4. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, Zarko Milosevic, and Calvin Newport. Securing Every Bit: Authenticated Broadcast in Radio Networks. In Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2010), pages 50-59, 2010. 5. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. Generating Fast Indulgent Algorithms. In Proceedings of the 12th International Conference on Distributed Computing and Networking (ICDCN 2011), pages 41-52, 2011. 6. Dan Alistarh, and James Aspnes. Sub-logarithmic Test-and-set Against a Weak Adversary. In Proceedings of the 25th International Symposium on Distributed Computing (DISC 2011), pages 97-109, 2011. 7. Dan Alistarh, Hagit Attiya, Rachid Guerraoui, and Corentin Travers. Early Deciding Syn- chronous Renaming in O(log f ) Rounds or Less. In Proceedings of the 19th International Colloquium on Structural Information and Communication Complexity (SIROCCO 2012), 2012. 8. Dan Alistarh, Rachid Guerraoui, Petr Kuznetsov, and Giuliano Losa. On the Cost of Composing Shared-Memory Algorithms. In Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2012), 2012. 9. Dan Alistarh, Michael Bender, Seth Gilbert, Rachid Guerraoui. How to Allocate Tasks Asynchronously. In Proceedings of the 53rd IEEE Annual Symposium on Foundations of Computer Science (FOCS 2012), to appear, 2012. 10. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. Generating Fast Indulgent Algorithms. In the Theory of Computer Systems Journal, to appear, 2012. 11. Dan Alistarh, Seth Gilbert, Rachid Guerraoui, and Corentin Travers. Of Choices, Failures

Load more