Université de Neuchâtel

Faculté des Sciences

Institut d’Informatique

Efficient Memory Management with Hardware Transactional Memory: A Focus on Java Garbage Collectors and C++ Smart Pointers

par

Maria Carpen-Amarie

Thèse

présentée à la Faculté des Sciences pour l’obtention du grade de Docteur ès Sciences

Acceptée sur proposition du jury:

Prof. Pascal Felber, Directeur de thèse Université de Neuchâtel, Suisse

Prof. Rachid Guerraoui École Polytechnique Fédérale de Lausanne, Suisse

Dr. Jean-Pierre Lozi Université Nice Sophia Antipolis, France

Prof. Nir Shavit Massachusetts Institute of Technology, États-Unis

Dr. Valerio Schiavoni Université de Neuchâtel, Suisse

Soutenue le 3 novembre 2017


IMPRIMATUR POUR THESE DE DOCTORAT

La Faculté des sciences de l'Université de Neuchâtel autorise l'impression de la présente thèse soutenue par

Madame Maria CARPEN-AMARIE

Titre:

“Efficient Memory Management with Hardware Transactional Memory: A Focus on Java Garbage Collectors and C++ Smart Pointers”

sur le rapport des membres du jury composé comme suit:

• Prof. Pascal Felber, directeur de thèse, Université de Neuchâtel, Suisse
• Dr Valerio Schiavoni, Université de Neuchâtel, Suisse
• Prof. Nir Shavit, MIT, USA
• Prof. Rachid Guerraoui, EPF Lausanne, Suisse
• Dr Jean-Pierre Lozi, Université Nice Sophia Antipolis, France

Neuchâtel, le 7 novembre 2017 Le Doyen, Prof. R. Bshary


ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisor, Prof. Pascal Felber, for supporting and advising me throughout my work on the PhD thesis. Knowing he is always there to exchange thoughts or give quick feedback every time I needed it made this experience so much more enjoyable and insightful. Likewise, I would like to thank the external members of my thesis jury, Prof. Nir Shavit, Prof. Rachid Guerraoui and Dr. Jean-Pierre Lozi for taking the time to review my work and for the exciting discussions on the topic. I’m equally grateful to Dr. Valerio Schiavoni, for being both a friend and a strict reviewer, helping me very much with a cocktail of detailed feedback and good advice. My thanks also go to Prof. Gaël Thomas and David Dice for their constant support and advice. I am deeply indebted to Dr. Patrick Marlier. Without him this project would have never been started and without his knowledge, patience and enthusiasm during the first year it would have never reached this stage. Thanks, Pat, for a priceless first year of PhD!

During the last four years I had a wonderful time together with all my colleagues and friends at UniNE. I have not forgotten the wonderful moments spent with Anita, Mascha, Emi, Yarco, Heverson, Raziel, before they left the university. Likewise, I’d like to thank the current gang of friends for all the fun coffee breaks, crossword puzzles, board game nights and so many other exciting events and debates: Dorian, Rafael, Roberta, Sébastien and everybody else. Of course, special thanks to my terrific officemate Mirco.

Last, but not least, I want to thank all my family. A big hug for my sister Alexandra, who has always been my best friend and a role model for me. Special thanks to my mother, Maria, for always giving me the best possible care. Finally, thank you, Călin, for creating so many moments of happiness and warmth in the last few years.

ABSTRACT

With multi-core systems becoming ubiquitous in the last few years, lightweight synchronization and efficient automatic memory management are increasingly demanded in the design of applications and libraries. Memory management techniques try to take advantage of the growing parallelism as much as possible, but the increasingly complex synchronization introduces considerable performance issues. The goal of this thesis is to leverage hardware transactional memory (HTM) for improving automatic memory management. We address both managed environments (Java) and unmanaged languages (C++). We investigate to what extent a novel synchronization mechanism such as HTM is able to make well-known memory management systems more efficient. We further provide insight into which scenarios benefit most from HTM-based synchronization.

Managed runtime environments generally integrate a memory reclamation and recycling mechanism, called a garbage collector (GC). Its goal is to manage memory automatically and transparently, without impacting the application execution. The Java HotSpot virtual machine features several types of GCs, all of which need to block all application threads, i.e., stop the world, during collection. We propose a novel algorithm for a fully-concurrent GC based on HTM. We implement and optimize transactional access barriers, the core part of the algorithm. Our evaluation suggests that a full implementation of the transactional GC would have performance comparable to Java’s low-pause concurrent GC, while also eliminating stop-the-world pauses.

Unmanaged languages use specialized forms of automatic memory management. In this context, we focus on smart pointers in C++. Their logic is based on reference counting. However, the extra synchronization required by smart pointers has a negative impact on the performance of applications. We propose an HTM-based implementation of smart pointers. We aim to improve their performance by replacing atomic operations on the reference counter with transactions. In case of conflict, the pointer falls back to being a classical smart pointer. We identify favorable scenarios for using transactional pointers and subsequently specialize them for concurrent data structure traversal.

After studying the downsides of automatic memory management in multiple contexts, we find that HTM is a good candidate for addressing their specific synchronization issues. We show that HTM can successfully support both a feasible pauseless Java GC implementation and more efficient smart pointers in C++.

Keywords: Transactional memory, garbage collection, Java virtual machine, C++, reference counting, data structures.

RÉSUMÉ

De nos jours les systèmes multi-cœurs sont omniprésents. La conception des applications demande de plus en plus un moyen de synchronisation plus léger, ainsi qu’une gestion automatique de la mémoire plus efficace. Les techniques de gestion de la mémoire tâchent de profiter au maximum du parallélisme qui augmente chaque jour. Néanmoins, la complexité de la synchronisation peut entraîner des problèmes de performance majeurs. Dans cette thèse, nous tirons parti du support matériel pour la mémoire transactionnelle afin d’améliorer la performance de la gestion automatique de la mémoire. Ce travail porte sur des environnements supervisés (Java), ainsi que non-supervisés (C++). Nous étudions la mesure dans laquelle ce nouveau moyen de synchronisation peut rendre des systèmes réputés de gestion de la mémoire plus efficaces. De plus, nous identifions les scénarios où l’on peut utiliser la mémoire transactionnelle matérielle pour maximiser ses bénéfices.

En général, les environnements supervisés ont un mécanisme intégré de récupération et de recyclage de la mémoire, appelé le ramasse-miettes. Le but d’un ramasse-miettes est de gérer automatiquement la mémoire sans affecter la performance de l’application. La machine virtuelle Java HotSpot comporte plusieurs types de ramasse-miettes, qui ont tous besoin d’arrêter (ou bloquer) l’application pendant le ramassage. Nous proposons un nouvel algorithme pour un ramasse-miettes entièrement concurrent, employant de la mémoire transactionnelle matérielle pour la synchronisation avec l’application. Nous développons et optimisons une partie fondamentale de notre algorithme, des barrières en écriture et en lecture transactionnelles. Notre évaluation indique qu’après la mise en œuvre complète de notre algorithme, le ramasse-miettes transactionnel aura une performance comparable aux ramasse-miettes existants, mais sans bloquer l’application.

Les langages non-supervisés utilisent parfois des moyens spécialisés de gestion automatique de la mémoire. Dans ce contexte, nous étudions les pointeurs intelligents de C++. La gestion de la mémoire repose sur un algorithme à comptage de références. La performance des applications est parfois dégradée en raison de la synchronisation requise par cet algorithme. Nous proposons une variante des pointeurs intelligents synchronisés avec de la mémoire transactionnelle matérielle. L’idée est de remplacer les opérations atomiques au niveau du compteur de références par des transactions. En cas de conflit, le pointeur transactionnel se transforme en pointeur intelligent classique. Les pointeurs transactionnels se révèlent particulièrement avantageux dans le contexte des structures de données.

Nous analysons les désavantages de la gestion automatique de la mémoire dans plusieurs contextes. Nous montrons que la mémoire transactionnelle matérielle représente une très bonne solution pour les problèmes spécifiques de synchronisation rencontrés par les systèmes de gestion de la mémoire, dans des environnements aussi bien supervisés que non-supervisés.

Mots-clés: Mémoire transactionnelle, ramasse-miettes, machine virtuelle Java, C++, comptage de références, structures de données.


Contents

1 Introduction 1
  1.1 Context and motivation ...... 1
  1.2 Contributions ...... 2
  1.3 Organization of the manuscript ...... 4

Part I – Background 7

2 Context: Multi-core Architectures 9
  2.1 Shared memory ...... 10
  2.2 Concurrent programming ...... 10

3 Transactional Memory 13
  3.1 Basics ...... 13
  3.2 Software transactional memory ...... 14
  3.3 Hardware transactional memory ...... 15
    3.3.1 HTM limitations ...... 15
    3.3.2 Case study: Intel TSX ...... 16
  3.4 Summary ...... 17

4 Automatic Memory Management 19
  4.1 Garbage collection in Java ...... 19
    4.1.1 Basics ...... 20
    4.1.2 ParallelOld GC ...... 22
    4.1.3 ConcurrentMarkSweep GC ...... 23
    4.1.4 GarbageFirst GC ...... 24
  4.2 C++ smart pointers ...... 25
    4.2.1 Types of smart pointers ...... 25
    4.2.2 Case study: std::shared_ptr ...... 26
  4.3 Summary ...... 28

Part II – Exploiting HTM for Concurrent Java GC 29

5 Garbage Collection: Related Work 31
  5.1 Real-time garbage collectors ...... 32
  5.2 TM-based garbage collectors ...... 33
  5.3 Summary ...... 33

6 Performance Study of Java GCs 35
  6.1 Experimental setup ...... 37
    6.1.1 Infrastructure details ...... 37
    6.1.2 DaCapo benchmark suite ...... 37
    6.1.3 Apache Cassandra database server ...... 38
    6.1.4 Workload generator: YCSB client ...... 39
  6.2 Experiments on benchmark applications ...... 39
    6.2.1 GC pause time ...... 39
    6.2.2 TLAB influence ...... 43
    6.2.3 GC ranking ...... 44
  6.3 Experiments on client-server applications ...... 45
    6.3.1 GC impact on the server side ...... 46
    6.3.2 GC impact on the client side ...... 47
  6.4 Summary ...... 48

7 Transactional GC Algorithm 51
  7.1 Introduction ...... 52
    7.1.1 Motivation ...... 52
    7.1.2 Design choices ...... 53
  7.2 Our approach ...... 53
    7.2.1 Brooks forwarding pointer ...... 53
    7.2.2 Access barriers ...... 54
    7.2.3 Algorithm design ...... 54
    7.2.4 Discussion ...... 56
  7.3 Summary ...... 57

8 Initial Implementation 59
  8.1 Implementation details ...... 60
    8.1.1 Java object’s header ...... 60
    8.1.2 Busy-bit implementation ...... 61
    8.1.3 Access barriers in the Java interpreter ...... 62
    8.1.4 Access barriers in the JIT compiler ...... 64

  8.2 Early results ...... 66
  8.3 Summary ...... 68

9 Optimized Implementation 69
  9.1 Volatile-only read barriers ...... 69
    9.1.1 Preliminary observations ...... 70
    9.1.2 Optimization details ...... 71
    9.1.3 Optimization validation ...... 71
  9.2 Selective transactional barriers ...... 72
    9.2.1 Algorithm ...... 72
    9.2.2 HotSpot JVM internals ...... 73
    9.2.3 Implementation details ...... 75
  9.3 Summary ...... 75

10 Evaluation 77
  10.1 Experimental setup ...... 77
  10.2 Experimental results ...... 78
    10.2.1 Transactional barriers overhead in the interpreter ...... 78
    10.2.2 Transactional barriers overhead in the JIT ...... 79
  10.3 Selective barriers evaluation ...... 81
    10.3.1 Transaction duration ...... 81
    10.3.2 Benchmark suite ...... 81
    10.3.3 Client-server scenario ...... 83
  10.4 Discussion: removing GC pauses ...... 85
  10.5 Summary ...... 87

Part III – Exploiting HTM for Improved C++ Smart Pointers 89

11 Transactional Smart Pointers 91
  11.1 Motivation ...... 91
  11.2 Related work ...... 92
  11.3 Algorithm ...... 93
  11.4 Implementation details ...... 95
  11.5 Use cases ...... 97
    11.5.1 Baseline: single-threaded micro-benchmark ...... 97
    11.5.2 Short-lived pointers micro-benchmark ...... 98
  11.6 Summary ...... 99

12 Evaluation on Micro-benchmarks 101
  12.1 Experimental setup ...... 101
  12.2 Results for single-threaded experiments ...... 102
  12.3 Results for short-lived pointers experiments ...... 103
  12.4 Summary ...... 105

13 Transactional Pointers for Concurrent Data Structures Traversal 107
  13.1 Motivation ...... 108
  13.2 Concurrent queue ...... 108
    13.2.1 Algorithm overview ...... 109
    13.2.2 Implementation with smart pointers ...... 109
    13.2.3 Implementation with transactional pointers ...... 110
  13.3 Concurrent linked list ...... 111
    13.3.1 Algorithm overview ...... 112
    13.3.2 Implementation details ...... 112
  13.4 Modifications to the initial tx_ptr implementation ...... 113
  13.5 Summary ...... 115

14 Evaluation on Concurrent Data Structures 117
  14.1 Experimental setup ...... 118
  14.2 Read-write workloads ...... 119
    14.2.1 Concurrent queue evaluation ...... 119
    14.2.2 Concurrent linked list evaluation ...... 120
  14.3 Read-only workload ...... 121
  14.4 Discussion: abort rate ...... 123
  14.5 Summary ...... 124

Part IV – Conclusions and Future Work 125

15 Conclusions 127

16 Future Work 131
  16.1 HTM-based pauseless GC ...... 131
  16.2 HTM-based reference-counting ...... 132

Bibliography 135

List of Figures

4.1 Java GC heap: structure and promotion across generations ...... 20
4.2 Brooks forwarding pointers ...... 23
4.3 C++ smart pointer components and reference counting mechanism ...... 27

6.1 GC pause times for all Java GCs and DaCapo benchmarks ...... 40
6.2 Execution time for all DaCapo benchmarks with various GCs enabled (total GC duration marked) ...... 41
6.3 Number of GC invocations per application run for all DaCapo benchmarks ...... 42
6.4 GC ranking according to the number of experiments in which they performed the best ...... 45
6.5 GC pauses for the three main GCs (ParallelOld, ConcurrentMarkSweep and G1) for the stress-test configuration on Cassandra server ...... 46
6.6 Correlation between operation latency on the client and GC pauses on the server with the three main GCs: ParallelOld, ConcurrentMarkSweep and G1 ...... 48

7.1 Pseudocode snippets illustrating relevant algorithm parts both on the mutator side (left) and on the GC side (right)...... 55

8.1 Bit format for an object header, big endian, most significant bit first ...... 60
8.2 Overhead of transactional read barriers and write barriers, enabled individually ...... 68

9.1 Transactional parameters when inserting read-barriers for all reads vs. for volatile reads only (Java interpreter)...... 72

10.1 Overhead of transactional barriers in the Java interpreter, unoptimized and optimized versions ...... 79
10.2 Overhead of transactional barriers in the JIT compiler, unoptimized (no TX) and optimized (volatile read-TX) versions ...... 80
10.3 DaCapo: Overhead upper-bound for selective barriers vs. initial transactional overhead (left) and estimated CPU time for a transactional GC with selective barriers (right) ...... 82
10.4 Cassandra: Overhead upper-bound for selective barriers vs. initial transactional overhead (left) and estimated CPU time for a transactional GC with selective barriers (right) ...... 84

10.5 Phases of a concurrent transactional GC with selective barriers (right) compared to the original implementation (left)...... 86

11.1 Class structure of std::shared_ptr, with the additional fields for tx_ptr in dashed lines...... 96

12.1 Execution time/iteration for single-threaded scenarios in Algorithm 4 (Section 11.5) with m = 2, 3, 4, 5 in Scenario3 ...... 102
12.2 Number of iterations for one or multiple short-lived tx_ptr pointer copies (TX) and smart pointer copies (Original) in a function ...... 104

14.1 Execution time per operation on Intel Haswell for a concurrent queue implemented with smart pointers and transactional pointers, read-write workloads (no batching) ...... 119
14.2 Execution time per operation on IBM POWER8 for a concurrent queue implemented with smart pointers and transactional pointers, read-write workloads (no batching) ...... 120
14.3 Execution time per operation on Intel Haswell for a concurrent linked list implemented with smart pointers and transactional pointers, read-write workloads (no batching) ...... 121
14.4 Execution time per operation for a concurrent queue implemented with smart pointers and transactional pointers, with 100 % lookup operations (batching enabled) ...... 122
14.5 Evolution of abort rate with respect to the number of concurrent threads, type of workload (left) and batch size for a read-only workload (right) ...... 123

List of Tables

6.1 Relative standard deviation for the final iteration and total execution time for a subset of DaCapo benchmarks ...... 39
6.2 Statistics for the h2 benchmark with varying heap and young generation sizes (CMS GC) ...... 43
6.3 TLAB influence over all GCs and the selected subset of benchmarks ...... 44
6.4 Advantages and disadvantages of the 3 main GCs, according to our experiments ...... 49

8.1 Independent overhead for transactional read and write barriers (interpreter). . . 67

10.1 Estimated time savings (DaCapo) ...... 82
10.2 Estimated time savings (Cassandra) ...... 83


Chapter 1 Introduction

Contents
  1.1 Context and motivation ...... 1
  1.2 Contributions ...... 2
  1.3 Organization of the manuscript ...... 4

This thesis focuses on ways of improving automatic memory management in Java and C++ with the aid of a novel synchronization technique, hardware transactional memory. In this chapter, we discuss the context of our work, we describe the main contributions and we present the structure of the manuscript.

1.1 Context and motivation

Constant technological advances in computer science in the last few years have led to an increased complexity in writing applications that are both correct and efficient. The steady growth in the number of cores per CPU in modern machines makes classical synchronization mechanisms cumbersome and error-prone, marking the need for a more lightweight synchronization option. An elegant solution is offered by the transactional memory (TM) paradigm. This synchronization mechanism allows the developer to write code without worrying about concurrent interleavings. The critical sections just need to be encapsulated in “atomic blocks” called transactions. These blocks can then execute concurrently in isolation, possibly aborting upon conflict with the speculative execution of another atomic block. Transactional memory can be implemented either in software or in hardware. While software TM (STM) mostly remained a research tool due to its considerable overhead, hardware transactional memory (HTM) was included in commodity hardware, starting with the Intel Haswell processor in mid-2013. HTM overcomes STM’s drawbacks by providing hardware support for the speculative execution of short transactions (their size being limited by the number of cache lines that can be monitored). It is not suitable for all types of applications, at least not without a software fallback for long transactions. However, HTM appears to be the perfect solution for tackling specialized concurrency problems that cannot be efficiently solved with locking-based synchronization (also called pessimistic [46]).

Another trend driven by increasingly complex code is automatic memory management. Manually managing memory allocation can sometimes become next to impossible in the context of large multi-threaded applications. The garbage collector (GC) is the mechanism responsible for automatic memory reclamation and recycling in managed runtime environments (MRE), such as the Java virtual machine (JVM). Its goal is to identify the chunks of memory that are not used anymore and make them available for further allocation, while disrupting as little as possible the execution of the application. Some unmanaged languages (e.g., C++) also try to facilitate memory management with dedicated data types, such as smart pointers. However, a particular downside of garbage collectors and other types of memory management mechanisms is the performance penalty. The overhead (in terms of latency or size) is introduced by the need for synchronization on multi-core systems. This situation presents an opportunity for experimenting with the novel transactional memory mechanism as a potential synchronization strategy.

1.2 Contributions

The goal of this thesis is to exploit HTM in the context of automatic memory management systems. We look at both managed and unmanaged systems, with the aim of improving real-life garbage collectors and other memory management structures. We show that HTM is able to provide reliable and efficient synchronization in this context and discuss the challenges that must be addressed in order to reach a suitable performance. The contributions of this thesis can be summarized as follows:

Extensive study on the performance of Java GCs. The aim of this study over all Java collectors was to identify their major drawbacks, observe their overheads, and decide which GC is most suitable for a transactional implementation. We experimented with all GCs provided by OpenJDK8 HotSpot on a set of benchmarks and a real-life application. The most important observation was that even the most optimized Java GCs, such as ConcurrentMarkSweep (CMS) or GarbageFirst (G1), can cause unacceptably long pauses when confronted with large applications that need significant amounts of memory. This represented our main motivation for devising a GC algorithm that removes the application pauses, while maintaining the same throughput.

HTM-based algorithm for a fully-concurrent Java GC. We designed a novel GC algorithm that exploits HTM, performing parallel collection and a single pass over the object graph. HTM is employed on the application side, protecting mutator load and store accesses from taking place at the same time as the GC moves the accessed object. If the GC and the mutators access the same object, the transaction is aborted and the mutator retries; otherwise, the GC and application threads can work concurrently. We expect aborts to be rare in practice. We further added two important optimizations to the algorithm. The first one restricted transactional read barriers to volatile reads. The second one introduced the concept of selective transactional barriers, which defines separate paths depending on the activity of the GC. More precisely, the application threads adopt a fast-path (without transactional barriers) when the GC is not running, and a concurrent slow-path otherwise. The goal of the optimizations was to reduce transactional accesses as much as possible, while still protecting the application from memory conflicts with the GC.

Feasibility study for a transactional GC. We partially implemented the transactional GC algorithm on top of a real-life Java collector. Specifically, we added the transactional barriers (with their respective optimizations) on top of the Java GC and evaluated their overhead. Transactional access barriers represent the core of our algorithm design. Their evaluation alone was sufficient to prove that our GC algorithm was feasible. Preliminary experiments showed that the aforementioned optimizations were necessary for a GC with a suitable throughput. We estimated the performance of a transactional concurrent GC with selective barriers and we found it to be on par with the performance of Java’s concurrent low-pause collector.

Transactional C++ smart pointers for concurrent data structures. In addition to our study on Java GCs, we also looked into other automatic memory management systems, more specifically smart pointers in C++. This project complements the work on MREs with insights on how to use HTM for optimizing an unmanaged programming framework, namely the C++ standard library. C++ smart pointers use a reference counting mechanism to automatically manage memory. We identified particular situations in which the reference counting strategy proves to be costly and unnecessary. Thus, a more lightweight synchronization mechanism could improve the performance of the application. We considered HTM a good candidate, employing transactions only to guarantee that the pointers are not illegally deallocated by concurrent threads. The transactions are limited in size, thus we expect aborts of any kind to be rare. We further studied which scenarios would benefit most from HTM and transactional pointers. They proved especially effective for faster concurrent data structure traversal.

This work was carried out in collaboration with Prof. Gaël Thomas (Telecom SudParis, France) and with David Dice (senior research scientist at Oracle Labs, USA). The work on the first two contributions was also coordinated by Dr. Patrick Marlier (Cisco, Switzerland), during his post-doctoral research at the University of Neuchâtel. Part of the feasibility study for the transactional Java GC was done in collaboration with Dr. Yaroslav Hayduk, during his PhD studies at the University of Neuchâtel, and with Prof. Christof Fetzer (TU Dresden, Germany). The contributions of this thesis have been published at the following venues.

Conferences and workshops:

• Maria Carpen-Amarie, Yaroslav Hayduk, Pascal Felber, Christof Fetzer, Gaël Thomas, and Dave Dice. “Towards an Efficient Pauseless Java GC with Selective HTM-Based Access Barriers.” In Proceedings of the 14th International Conference on Managed Languages and Runtimes (ManLang ’17), pp. 85-91. ACM, 2017.

• Maria Carpen-Amarie, Dave Dice, Gaël Thomas, and Pascal Felber. “Transactional Pointers: Experiences with HTM-Based Reference Counting in C++.” In Networked Systems. NETYS 2016. Lecture Notes in Computer Science, vol. 9944, pp. 102-116. Springer, 2016.

• Maria Carpen-Amarie, Dave Dice, Patrick Marlier, Gaël Thomas, and Pascal Felber. “Evaluating HTM for Pauseless Garbage Collectors in Java.” In 2015 IEEE Trustcom/ BigDataSE/ ISPA, vol. 3, pp. 1-8. IEEE, 2015.

• Maria Carpen-Amarie, Patrick Marlier, Pascal Felber, and Gaël Thomas. “A Performance Study of Java Garbage Collectors on Multicore Architectures.” In Proceedings of the 6th International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM ’15), pp. 20-29. ACM, 2015.

Technical reports:

• Maria Carpen-Amarie. “Performance Prediction for a Fully-Concurrent Transactional Java GC.” In PhD Workshop Extended Abstracts of the 2017 EBSIS Summer School on Distributed Event Based Systems and Related Topics. Timmendorfer Strand, Germany, pp. 31-32.

• Maria Carpen-Amarie. “Towards Efficient Concurrent Data Structures Traversal with Transactional Smart Pointers.” In PhD Workshop Extended Abstracts of the 2016 EBSIS Summer School on Distributed Event Based Systems and Related Topics. Sinaia, Romania, pp. 23-24.

• Maria Carpen-Amarie. “Performance Issues in Java GCs: Is HTM the Solution?” In Technical Report of the 2015 Doctoral Workshop on Distributed Systems. Le Louverain, Switzerland, pp. 17-20.

• Maria Carpen-Amarie. “TransGC: Concurrent Garbage Collection Using Hardware Trans- actional Memory.” Euro-TM Short Term Scientific Mission Report. France, 2014.

• Maria Carpen-Amarie. “TransGC: Concurrent Garbage Collection Using Hardware Trans- actional Memory.” In Technical Report of the 2014 Doctoral Workshop on Distributed Systems. Kandersteg, Switzerland, pp. 20-23.

1.3 Organization of the manuscript

This manuscript is organized in four parts.

The first part discusses the background of this thesis. We start by positioning the context of our work with respect to today’s systems, demands and existing solutions. We address the two most important concepts combined in this work in Chapter 3 and, respectively, Chapter 4. The former focuses on transactional memory. We study software transactional memory (STM) and the associated downsides that motivated the development of hardware transactional memory (HTM). In particular, we describe Intel TSX, the main HTM implementation used throughout this work. In the latter chapter we provide some background on automatic memory management. It is split into two parts: one addressing Java GCs, and the other introducing C++ smart pointers.

The second part represents our work on improving Java garbage collectors with HTM. In Chapter 5 we present the literature relevant to this part and argue the novelty and importance of our work in this context. Chapter 6 covers our extensive study on the performance of Java GCs. Then, we define a new algorithm for an HTM-based pauseless GC in Chapter 7. We include here the theoretical aspects of the algorithm, the components needed in the design of such a GC and an informal discussion over the correctness of the algorithm. Its practical implementation is separately addressed in Chapter 8. We describe the existing infrastructure in Java on top of which we (partially) develop our GC and the additions required by the algorithm. A preliminary evaluation indicates the need for a more optimized implementation, which is further provided in Chapter 9. Finally, we extensively evaluate our implementation in Chapter 10. We also conduct a feasibility study, comparing a potential implementation of our GC algorithm with a real-life Java collector.

The third part focuses on C++ smart pointers and our effort towards their improvement with HTM. We define a new type of smart pointer, named transactional pointer, and provide details on the algorithm and implementation in Chapter 11. We also address two basic use cases that justify the usefulness of our approach. We evaluate the implementation of transactional pointers on micro-benchmarks based on these use cases in Chapter 12. Chapter 13 explains how we can specialize the transactional pointers for concurrent data structure traversal and illustrates the required implementation changes. Finally, we evaluate the performance of specialized transactional pointers compared to classical smart pointers in Chapter 14 and discuss the results.

The fourth part summarizes the contributions of the thesis in Chapter 15 and discusses a series of unexplored research challenges revealed by our work in Chapter 16.


Part I

Background


Chapter 2 Context: Multi-core Architectures

Contents
  2.1 Shared memory ...... 10
  2.2 Concurrent programming ...... 10

Nowadays, multi-core systems are the norm for standalone servers or components of big server farms. They started to be heavily produced for public use in 2001, when scientists considered that the CPU clock rate had almost reached its limit and not much performance could be gained by increasing it any more [62, 39]. Multi-processor devices have evolved since then to the point where today any personal computing device has at least two cores (laptops, smart phones, tablets, etc.). While a machine with a single processor can be efficiently programmed without having in-depth knowledge of its internal architecture, things are different with multi-core machines. Memory architecture, communication between cores, software implementation – these are only a few of the factors that can have a significant impact on the performance of an application in a multi-core context.

A multi-core system is composed of multiple processors, which are physical (hardware) entities that execute a sequential set of instructions. The group of processors inside a multi-core machine can be either homogeneous (all cores are the same) or heterogeneous (cores are different, e.g., in terms of performance, instruction set, etc.). Processors usually run logical (software) entities called threads. The two components (physical and logical) do not directly overlap. Usually, there is a greater number of threads than cores. The execution of threads on a processor is interleaved: when one thread is set aside, another one is scheduled and resumes its own work. A thread can be set aside by a processor for various reasons, such as an expired time quantum, a long I/O operation, etc. When it is time for a thread to make progress again, it can be scheduled on a different processor.

Another important aspect of a multi-core architecture is the interconnect, i.e., the way in which the processors communicate between themselves and with the memory. There are two main types of interconnection networks: symmetric multiprocessing (SMP) and non-uniform memory access (NUMA) [83, Chapter 2]. The former consists of a bus that links all the elements. It is the most widely used nowadays, although it presents the risk of being easily overloaded in case of a large number of processors. The latter is composed of multiple nodes, each node connecting multiple processors and a memory. This model is more complex, but also more scalable. Finally, the processing units exchange data between them over shared memory. That means that all entities see the same chunk of memory, write to it and read data from it. The way and the frequency with which the shared memory is accessed has a considerable impact on the overall performance of the system.

2.1 Shared memory

Shared memory is implemented as an array of words, each being 32 or 64 bits in size depending on the platform. Typically, when a thread reads a value from the memory, it provides a one-word address and receives the data associated with that address. Similarly, a write to the memory provides an address and data, the latter being subsequently installed in the shared memory. However, a memory access is costly and having only a large area of shared memory between multiple cores does not scale well. Therefore, multi-core systems also implement a caching mechanism. The cache memory is a smaller memory placed closer to the processor and thus faster than the main memory. Most of the time, the cache system is implemented on multiple levels: L1 cache, separated for data and instructions, both on the same chip as the processor; L2 cache, unified for data and instructions, farther away from the processor; and sometimes L3 cache, also unified for data and instructions. The data in the cache levels has to be coherent, meaning that it has to be consistent across the caches of all cores.

The layout of the shared memory plays a critical role in various synchronization mechanisms, such as hardware transactional memory. Generally, the accesses recorded by the transaction are kept in the L1 cache (read accesses also in L3 or L2 for some architectures [48, 59]). Thus, in this particular case, the actual size of a transaction for an application is limited by the size of the hardware components. The application logic has to adapt to the given shared memory model of a multi-core system in order to leverage the advantages of a particular synchronization paradigm. Hardware transactional memory is further detailed in Section 3.3.
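To make the role of cache lines concrete, the short C++ sketch below (illustrative code, not part of this thesis; all names are assumptions) contrasts two counters that share a cache line with counters padded to separate lines. On common x86 processors with 64-byte cache lines, the packed layout causes the coherence protocol to bounce the line between cores on every update (false sharing), while the padded layout avoids this. The same line-granularity bookkeeping is what bounds the read-set and write-set of a hardware transaction, as detailed in Section 3.3.

#include <atomic>
#include <thread>

// Two counters packed together: they are likely to share a cache line,
// so two threads updating them force the line to ping-pong between cores.
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter aligned (and padded) to its own 64-byte cache line:
// concurrent updates no longer interfere with each other.
struct alignas(64) PaddedCounter {
    std::atomic<long> value{0};
};

int main() {
    PaddedCounter counters[2];

    auto work = [&](int i) {
        for (long n = 0; n < 10000000; ++n)
            counters[i].value.fetch_add(1, std::memory_order_relaxed);
    };

    std::thread t0(work, 0), t1(work, 1);
    t0.join();
    t1.join();
    return 0;
}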

2.2 Concurrent programming

Since the clock rate of a processing unit reached its performance limits, the next logical step was to increase the number of cores. Ideally, parallelizing a sequential application on n cores should improve performance n times. However, this is not the case in practice. On the one hand, most applications are not entirely concurrent by nature, their overall performance depending on the parts that cannot be parallelized. On the other hand, computing machines are asynchronous, i.e., cores can be interrupted or slowed down by various events that cannot be predicted. In order to have a correct execution, suitable synchronization must be put in place, which might further delay the execution of the program.

The main building block in concurrent application design is the concept of critical section. This refers to a sequence of instructions that can be executed only by one thread at a time, a property called mutual exclusion. Most often, mutual exclusion is implemented with lock objects that block all threads (except one) at the beginning of the critical section. Typically, a fine-grained locking mechanism implies more parallelism and thus better performance than a coarse-grained implementation. Two crucial issues that can appear with locking are: deadlock – when threads wait for each other’s resources to become available and no progress can be made by any of them, and starvation – a thread is never able to acquire the lock to the critical section and cannot make any progress. Starting from locks, many other synchronization primitives can be further constructed, e.g., semaphores, monitors.

In order to avoid potential hazards that keep the application from making progress, synchronization strategies turned towards nonblocking algorithms. Informally, the nonblocking property guarantees that the delay of a process cannot delay other processes. Wait-free and lock-free properties are examples of nonblocking progress conditions. The wait-free property guarantees that “any process can complete any operation in a finite number of steps, regardless of the execution speeds on the other processes” [50]. Lock-free algorithms have weaker semantics, guaranteeing that “infinitely often some method call [operation] finishes in a finite number of steps” [52]. All wait-free algorithms are also lock-free. Lock-free algorithms can suffer from starvation, but are often more efficient than wait-free algorithms. Generally, designing nonblocking algorithms is complex and error-prone, as all possible thread interleavings and their consequences must be considered.

In this context, while writing correct concurrent applications is essential, designing efficient concurrent applications becomes more and more difficult with the increasing complexity of multi-core systems. As such, tools that ease concurrent memory management, such as garbage collectors, and accessible lock-free synchronization tools, like transactional memory, represent a key component of today’s concurrent systems landscape.
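As a minimal illustration of these concepts (illustrative C++ code, not taken from the thesis), the sketch below contrasts a lock-based critical section, where a mutex enforces mutual exclusion, with a lock-free counter built on a compare-and-swap retry loop. The latter satisfies the lock-free progress condition described above: a failed CAS means some other thread succeeded, so the system as a whole keeps making progress, even though an individual thread may starve.

#include <atomic>
#include <mutex>

// Lock-based version: at most one thread executes the critical section.
class LockedCounter {
    std::mutex m_;
    long value_ = 0;
public:
    void increment() {
        std::lock_guard<std::mutex> guard(m_);  // mutual exclusion
        ++value_;                               // critical section
    }
};

// Lock-free version: no thread ever blocks; a failed compare-and-swap
// simply means another thread already advanced the counter.
class LockFreeCounter {
    std::atomic<long> value_{0};
public:
    void increment() {
        long current = value_.load(std::memory_order_relaxed);
        while (!value_.compare_exchange_weak(current, current + 1,
                                             std::memory_order_relaxed)) {
            // 'current' was reloaded with the latest value; retry.
        }
    }
};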


Chapter 3 Transactional Memory

Contents
  3.1 Basics ...... 13
  3.2 Software transactional memory ...... 14
  3.3 Hardware transactional memory ...... 15
    3.3.1 HTM limitations ...... 15
    3.3.2 Case study: Intel TSX ...... 16
  3.4 Summary ...... 17

Transactional memory is a lock-free synchronization mechanism, typically considered a lightweight alternative to locks. In this chapter we present the main types of transactional memory, both software and hardware. We briefly assess the advantages and disadvantages of using TM in practice. Finally, we describe the TM implementation that is used throughout this work.

3.1 Basics

Ever since multi-core systems started to be heavily used, synchronization methods have continued to develop, competing for correctness and efficiency. With the rising number of cores per CPU, classical synchronization (e.g., using locks) is lagging behind and scalable non-blocking synchronization mechanisms are becoming increasingly important. A notable example in this direction is represented by transactional memory (TM). First defined by Herlihy and Moss [51], TM became a hot research topic in the last few years. TM avoids the locking mechanism of typical synchronization methods by encapsulating groups of instructions in transactions. A transaction is defined as a finite group of instructions that are executed by a single thread. Transactions have to be:

• serializable: steps from two transactions cannot be interleaved, other threads are not able to see intermediate results at any time;

• atomic: either all changes are applied (if the transaction successfully commits), or none of them (if the transaction aborts).

Depending on the underlying implementation, transactional memory can respect either weak atomicity or strong atomicity, as defined by Blundell et al. [13]. The two models of atomicity indicate the relationship between transactional and non-transactional code. Strong atomicity implies atomic execution with respect to both transactional and non-transactional code, while weak atomicity permits transactions to be interleaved with the rest of the (non-transactional) code. Even though strong atomicity seems to be a more powerful concept than weak atomicity, in practice the former can potentially break code that would function correctly under the latter. It is recommended that a transactional system always specifies the atomicity model as part of its semantics [13].

Typically, a transaction is able to keep track of its changes thanks to two associated logs: a read-set and a write-set. Their purpose is to track all read and write operations, in order to apply them correctly when the transaction commits, or to discard them if the transaction aborts. A transaction aborts if it conflicts with another thread, i.e., if an outside entity tries to modify any of the objects recorded in the read-set or write-set. An abort can also be deliberately triggered by the application.

The concept of transactional memory covers two main implementations: software (STM) and hardware (HTM). TM was first implemented in software, with hardware implementations following closely. Both are detailed in the next sections.

3.2 Software transactional memory

Transactional memory was first implemented in software (STM). Even though the benefits of TM over classical synchronization methods are significant, notably in terms of ease of use, STM was the subject of long debates about whether it is merely a research toy [20, 37]. Performance-wise, STM could hardly compete with implementations using locks. This happens because the application is heavily instrumented in order to provide transactional semantics. What is more, most STM systems only guarantee weak atomicity. However, a significant advantage of STM is its portability, being able to run on almost any hardware configuration. In addition to this, STM does not make any assumptions regarding the size of the transaction or its contents. The flexibility of STM led researchers to keep looking for practical uses of software transactions and to study how their performance can be improved.

A notable aspect regarding the overhead reported by various STM implementations was highlighted by Baldassin et al. [10]. They claimed that the memory allocator could play a significant role in degrading STM performance, by increasing the abort rate through false sharing. The authors conducted an extensive study on multiple applications and observed that the performance of an application can vary a lot depending on the memory allocator. They concluded that the allocator used should always be specified in STM performance evaluations.

It was shown that a viable way of taking advantage of software transactions is to customize an STM system for a particular goal. For example, Meier et al. [64] implemented their own STM mechanism with the sole purpose of replacing the global interpreter lock (GIL) in dynamic languages, in order for them to allow real concurrency. Their TM system detected conflicts with object granularity and provided optimistic reads, pessimistic writes, complete isolation and serializability. The results of the evaluation on a benchmark suite showed an average speedup of 2.5x on 4 threads compared to the original version featuring the GIL. However, the authors also reported an overhead of up to 65.5 % on a single thread for STM and less reliable performance gains when the just-in-time compiler of the PyPy interpreter [15] is enabled.

3.3 Hardware transactional memory

Hardware transactional memory (HTM) first appeared in specialized processors, such as Sun’s Rock [34], Azul [23] or IBM Blue Gene/Q [84] chips. It was only made available to the public in mid-2013, when Intel released their “Haswell” microarchitecture with fully integrated HTM support. Intel Haswell is the first mainstream CPU that implements HTM, creating a greater opportunity for research in this area. The Intel Haswell processor benefits from an efficient cache coherence protocol to detect transactional collisions. The cache memory for Haswell is generally split into three levels: level-1 (L1) cache and level-2 (L2) cache, private to each core, and level-3 (L3) cache, shared between all cores. The write-set is tracked in the L1 cache, while the read-set is tracked in the L1, L2 and L3 caches [33]. If any entry of a set is evicted from its respective cache, the hardware transaction aborts. Thus, HTM is limited by the size of the transaction. In addition, cache associativity and hyper-threading also play an important role in transaction failure (e.g., hyper-threading further splits the L1 cache capacity in two).

We mentioned in Section 3.2 that memory allocators were shown to have an important impact on the performance of an STM system; Dice et al. [33] revealed that this finding holds true for HTM as well. It was shown that the allocator placement policies can seriously affect the abort rate due to index conflicts. Their general recommendations were: to use index-aware allocators [3] and to shift the stores of shared variables to the end of the transaction. The reason for the latter is given by the cache level where loads and stores are tracked: multiple loads before a store could evict its entry from the L1 cache, while the opposite is less likely to happen (since loads are tracked in the next-level caches as well).

In the last few years, HTM has become a hot research topic, especially due to its improved performance over STM. It provided low overhead and excellent scalability when exploited in the implementation of in-memory databases [60, 85]. It was also considered less complex to use than a classical locking scheme in this context. Another area of research that successfully experimented with HTM concerns the evolution of dynamic languages, such as Python, Perl and Ruby [72, 81, 68]. The main goal in this case was to eliminate the GIL and allow these languages to benefit from proper concurrency and scalability. Finally, some work focused on automatic memory management. This topic will be further addressed in Chapter 5.

3.3.1 HTM limitations

As suggested by the results in the literature, HTM overcomes the aforementioned problems of STM. However, it has its own disadvantages: first, the size of the transactions is limited. Careful planning of the contents of the transaction is needed in order to both avoid overflows and amortize the cost of starting and committing the transaction. Moreover, hardware transactions can be aborted at any time by interrupts, faults and other specific instructions, such as debug or I/O. Therefore, HTM requires a non-transactional fallback path in case of abort, to ensure progress for the transactions that cannot commit.

HTM is still a novel technology for which the practical implications in various scenarios are yet to be discovered. Even though HTM promotes the idea of an incredible ease of use in directly replacing locks, there are situations in which doing so is both wrong and potentially harmful for the application. A relevant example is represented by the manner in which critical sections are handled, as suggested by Blundell et al. [13]. When directly replacing the locking mechanism for a critical section with a transaction, the semantics of the application change: the transaction is atomic with respect to the global state of the execution, while a lock-based critical section would be atomic with respect to the critical sections guarded by the same lock. What is more, if the hardware transaction is not fitted with a fallback path, the direct replacement of locks can cause a deadlock in an otherwise correct program.

An interesting improvement for HTM, suggested by Herath et al. [49], consists in providing a transactional primitive that permits the transaction to abort and continue, instead of abort and retry. According to the authors, such a primitive could improve the performance of an HTM system, resulting in lower contention and fewer false positives.

HTM is therefore a very powerful tool in a multi-core environment, although not appropriate for all types of applications because of the aforementioned limitations. Nonetheless, it appears to be a suitable solution for tackling specialized concurrency problems, such as concurrent memory management.

3.3.2 Case study: Intel TSX

Intel recently added the transactional synchronization extensions (TSX) instruction set to its “Haswell” microprocessor architecture. TSX is a group of instructions that allows the atomic manipulation of memory. They provide two mechanisms of synchronization: hardware lock elision (HLE) and restricted transactional memory (RTM).

Hardware lock elision offers backward compatibility with conventional lock-based mechanisms. Applications using HLE can run on both legacy and new hardware. It provides two instruction prefixes: XACQUIRE and XRELEASE. They are interpreted as the boundaries of a transactional block on architectures that support TSX, and ignored on older systems, which use classical locks instead. If HTM is supported, the hardware will start a transaction at the XACQUIRE instruction instead of taking a lock. If the transaction aborts, then the system automatically returns to using locks. In a nutshell, the HLE interface is particularly convenient for heterogeneous clusters, where only a part of the machines might have HTM support.

Restricted transactional memory implements HTM with a new instruction set, not available on legacy computers. It allows more flexibility in defining transactional regions; however, it always needs a fallback path in case the transactional execution is not successful. In this work we rely on RTM for all transactional implementations. RTM defines the following instructions: XBEGIN, XEND, XABORT and XTEST. The first two indicate the start and the end of a transactional region. When XEND is called, the transaction commits all the changes to the memory. If the transaction is aborted, either by the interaction with another thread or by the explicit XABORT instruction, the execution jumps to a fallback handler specified when calling XBEGIN. The reason for the abort is saved in the EAX register. This allows developers to handle different types of aborts and to switch to another synchronization strategy if necessary. Finally, the XTEST instruction indicates whether the processor is executing a transactional region or not.

Both HLE and RTM instructions are fully supported in several compilers, notably gcc, LLVM/clang and Intel’s and Microsoft’s C++ compilers via built-in functions. The Linux kernel supports specific performance counters for TSX in order to easily evaluate and diagnose transaction behavior.
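The hedged C++ sketch below shows how these RTM instructions are typically reached from C++ through the <immintrin.h> intrinsics (_xbegin, _xend and _xabort correspond to XBEGIN, XEND and XABORT; the code must be compiled with RTM support, e.g., -mrtm). The shared counter, the spinlock and the retry policy are illustrative assumptions rather than code from this thesis; the key pattern is reading the fallback lock inside the transaction, so that the transactional path aborts instead of running concurrently with the non-transactional one.

#include <immintrin.h>   // _xbegin, _xend, _xabort, _XBEGIN_STARTED, _XABORT_RETRY
#include <atomic>

std::atomic<int> fallback_lock{0};   // simple spinlock used as the fallback path
long shared_counter = 0;

void lock_spin()   { while (fallback_lock.exchange(1, std::memory_order_acquire)) {} }
void unlock_spin() { fallback_lock.store(0, std::memory_order_release); }

void transactional_increment() {
    for (int attempt = 0; attempt < 3; ++attempt) {
        unsigned status = _xbegin();                 // XBEGIN
        if (status == _XBEGIN_STARTED) {
            // Subscribe to the fallback lock: abort if another thread holds
            // it, so we never run concurrently with the locked path.
            if (fallback_lock.load(std::memory_order_relaxed) != 0)
                _xabort(0xff);                       // explicit XABORT
            ++shared_counter;                        // speculative update
            _xend();                                 // XEND: commit
            return;
        }
        // 'status' encodes the abort cause (from EAX); retry only if the
        // hardware indicates the transaction may succeed next time.
        if (!(status & _XABORT_RETRY))
            break;
    }
    // Fallback path: classical locking when transactions cannot commit.
    lock_spin();
    ++shared_counter;
    unlock_spin();
}

In production code the fallback would typically wait for the lock to be released before retrying transactionally; the bare version above only illustrates the structure that RTM imposes (speculative path, abort handling, non-transactional fallback).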

3.4 Summary

In this chapter we introduced the concept of transactional memory, with its two major imple- mentations: software and hardware. We detailed the advantages and downsides for both STM and HTM. Finally, we focused on the HTM instruction set further used in this work, defined by the Intel Haswell microprocessor architecture.


Chapter 4 Automatic Memory Management

Contents
  4.1 Garbage collection in Java ...... 19
    4.1.1 Basics ...... 20
    4.1.2 ParallelOld GC ...... 22
    4.1.3 ConcurrentMarkSweep GC ...... 23
    4.1.4 GarbageFirst GC ...... 24
  4.2 C++ smart pointers ...... 25
    4.2.1 Types of smart pointers ...... 25
    4.2.2 Case study: std::shared_ptr ...... 26
  4.3 Summary ...... 28

As software tries to keep pace with the rapid increase in concurrency in today’s multi-core systems, manual memory handling becomes more and more complex. This chapter presents automatic memory management alternatives that are often employed in large-scale software systems. We describe garbage collection in Java, with its most representative implementations, and another form of memory management in C++, namely smart pointers.

4.1 Garbage collection in Java

The garbage collector (GC) is a fundamental component of any managed runtime environment (MRE), such as the Java Virtual Machine (JVM). The goal of the GC is to automatically manage the memory dynamically allocated by an application. Garbage collection should respect the following important principles [56]:

• it must be safe: the GC is not allowed to reclaim live data;

• it must be comprehensive: the GC is not allowed to leave uncollected garbage in the heap.

Figure 4.1 – Java GC heap: structure and promotion across generations.

The overhead of a garbage collector is defined as the percentage of the total execution time that the application spends collecting. Other factors that describe the performance of a GC are: the duration of the GC pauses, the memory overhead, and the impact on the throughput and responsiveness of an application. A GC pause (also called stop-the-world pause) is defined as the interval of time for which the application threads are suspended during the collection. This blocking phase may last for the whole collection or only for part of it. Stop-the-world pauses are implemented with a safepoint mechanism. Java uses safepoints for all VM operations that cannot be executed concurrently with the application threads (also called mutators). Some of these VM operations, and arguably the most frequent ones, are related to the GC. Other blocking VM operations handle less costly tasks such as code deoptimization, code cache flushing, debug operations, etc.

4.1.1 Basics

HotSpot JVM is one of the most widely used MREs. In the current version (8.0), it features seven different GC algorithms, which can be characterized along multiple axes:

Generational. The generational GC strategy dictates splitting the heap into two or more separate regions (called generations) according to the age of the objects. The generations are collected at different frequencies. The young generation is collected most often, as suggested by the generational hypothesis, which states that most objects die young. Older generations are collected less often (or not at all). Young generation collections are called minor collections, while old generation collections are also called major or full collections. The number and partitioning of the generations depend on the MRE.

Garbage collectors in Java HotSpot divide the heap into three main spaces: the young generation, the old generation and the permanent generation. New objects are generally allocated in the young generation. The young generation is further split into three parts: eden, from-space and to-space (the latter two also called survivor spaces). All objects are allocated in the eden space, while the survivor spaces start out empty. When eden fills up, a minor GC takes place and moves live objects to the first survivor space (from-space), while the other objects are discarded. At the next minor GC, the second survivor space is used for the live objects. The first survivor space is emptied and all objects are copied to the second one. For all subsequent young generation collections, the two survivor spaces just switch places, so that after each collection one of them and eden are cleared. The objects that survive a few minor collections are promoted to the old generation. This process continues until the old generation fills up as well. At that point, a major collection takes place and the same process repeats from the beginning. These operations are illustrated in Figure 4.1. Finally, the permanent generation consists of Java class metadata. It was removed starting with Java 8.
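To illustrate why allocation in eden is cheap, the following sketch shows a simplified bump-pointer allocator for an eden-like space (illustrative C++ code and names, not HotSpot’s actual implementation): allocation amounts to advancing a pointer, and a failed allocation is precisely the event that triggers a minor collection as described above.

#include <cstddef>
#include <cstdint>

// Illustrative bump-pointer allocator for an eden-like space.
class EdenSpace {
    std::uint8_t* start_;
    std::uint8_t* top_;     // next free byte
    std::uint8_t* end_;
public:
    EdenSpace(void* base, std::size_t size)
        : start_(static_cast<std::uint8_t*>(base)),
          top_(start_), end_(start_ + size) {}

    // Returns nullptr when eden is full, which in the JVM would
    // schedule a minor collection before retrying the allocation.
    void* allocate(std::size_t bytes) {
        bytes = (bytes + 7) & ~static_cast<std::size_t>(7);  // 8-byte alignment
        if (top_ + bytes > end_)
            return nullptr;                                   // eden exhausted
        void* obj = top_;
        top_ += bytes;                                        // bump the pointer
        return obj;
    }

    void reset() { top_ = start_; }   // after a collection, eden starts out empty
};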

Tracing. All HotSpot GCs are tracing collectors. They initiate a collection whenever the allocation of new objects fails in order to regenerate the live objects set and discard the garbage. They identify all reachable objects by starting from the roots and following pointers. Unreachable objects are considered garbage and made available for recycling. The roots are locations that hold references to heap data and that the application can manipulate directly: registers, stack variables, global variables. Moreover, most of the HotSpot garbage collectors are copying (also moving or compacting) collectors. They move the live objects during a collection in order to avoid memory fragmentation.
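The sketch below (illustrative C++ code, not HotSpot’s) captures the essence of such a tracing phase: a breadth-first traversal from the root set marks every reachable object, and whatever remains unmarked afterwards is garbage that can be recycled (or simply not copied by a moving collector).

#include <deque>
#include <vector>

// Minimal object model for the sketch: a mark bit plus outgoing references.
struct Obj {
    bool marked = false;
    std::vector<Obj*> references;
};

// Breadth-first traversal from the roots, marking every reachable object.
void mark_from_roots(const std::vector<Obj*>& roots) {
    std::deque<Obj*> worklist(roots.begin(), roots.end());
    while (!worklist.empty()) {
        Obj* obj = worklist.front();
        worklist.pop_front();
        if (obj == nullptr || obj->marked)
            continue;                        // null slot or already visited
        obj->marked = true;                  // reachable, hence live
        for (Obj* ref : obj->references)
            worklist.push_back(ref);         // follow pointers
    }
}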

Stop-the-world vs. concurrent. Most HotSpot GCs are parallel and spawn multiple threads to collect the memory. However, some of them run at the same time (concurrently) as the mutators, while others suspend the mutators for the entire duration of the collection (stop the world). All concurrent collectors also have short phases during which they need to suspend the mutators. As such, Java HotSpot does not feature any fully concurrent GC. Stop-the-world GCs provide good application throughput, but usually suffer from significantly long GC pauses. Concurrent GCs, on the contrary, aim at very short pauses at the expense of throughput and simplicity. HotSpot JVM defines the following GC algorithms:

• Serial: stop-the-world, copying collector that uses a single thread for collection;

• ParallelScavenge: stop-the-world copying collector that uses multiple threads for collection (i.e., parallel collector);

• ConcurrentMarkSweep (CMS): low-pause, parallel, mostly-concurrent collector. It is not a compacting collector;

• ParallelNew (ParNew): a version of ParallelScavenge, especially modified to work together with ConcurrentMarkSweep. It provides the necessary synchronization mechanisms to run safely during the concurrent phases of CMS;

• SerialOld: single-threaded, stop-the-world, mark-sweep-compact collector;

• ParallelOld: multi-threaded, stop-the-world, compacting collector;

• GarbageFirst (G1): low-pause, parallel, mostly-concurrent, compacting collector.

Some of these algorithms are mainly used for collecting the young generation, while others are especially tailored for the old generation, or cover both (in G1’s case). The seven GC algorithms can be combined in six ways to provide the following generational GC options that can be selected for HotSpot:

• SerialGC: uses Serial for the young generation and SerialOld for the old generation;

• ParNewGC: uses ParNew for the young generation and SerialOld for the old generation;

• ConcMarkSweepGC: uses ParNew for the young generation and CMS for the old generation. In case the full collection cannot proceed concurrently, the collector falls back to SerialOld instead of CMS;

• ParallelGC: uses ParallelScavenge for the young generation and SerialOld for the old generation;

• ParallelOldGC: uses ParallelScavenge for the young generation and ParallelOld for the old generation;

• G1GC: uses G1 for both generations.

In this work, we mostly focus on three out of the six GC combinations of HotSpot JVM, namely ConcMarkSweepGC, ParallelOldGC and G1GC. Given the growing prevalence of multi-core systems and applications, we considered the GCs that involve serial algorithms less relevant. Moreover, the three chosen GCs present interesting characteristics from different classes of algorithms, each with its specific way of execution and advantages.

4.1.2 ParallelOld GC

The combination of the ParallelScavenge and ParallelOld algorithms for the young and, respectively, the old generation constitutes the default garbage collector of OpenJDK8. It is a stop-the-world, throughput-oriented GC. In order to achieve good performance, it combines several important GC properties: it is a generational, copying and compacting collector. Since the GC stops the world for every collection, these properties are easily integrated in its mechanism.

ParallelOld GC works as follows: whenever allocation fails in the young generation, a VM operation is scheduled on the unique VM thread. The VM thread starts a safepoint and suspends all mutators, allowing the GC threads to collect without any interference. At the beginning of any young generation collection, the eden contains the objects allocated since the last collection, to-space is empty, while from-space contains objects that survived the previous collection. Starting from this configuration, the young generation collection copies all live objects from eden to to-space and promotes the objects in from-space to the old generation. The copying process starts with root scanning, then proceeds with a breadth-first search (BFS) on the reference graph, from the roots. The GC threads balance their work based on a work-stealing algorithm. After this, eden and from-space only contain garbage, and the two survivor spaces are swapped. When the GC threads finish the parallel collecting phase, a termination protocol is run before the mutators can resume work [42].

In order for the parallel GC threads to synchronize correctly when moving objects, they use a Brooks-style forwarding pointer [18]. When an object is copied to a new location, the copy at that location becomes the primary copy of the object. A forwarding pointer normally points to the primary copy of an object (see Figure 4.2). If the object is not forwarded, then the pointer is self-referential, i.e., it points to itself. The forwarding pointer leads other threads to the primary copy of an object when they still have a reference to the old copy. This approach allows for lazy updating of the references that point to the old version of a moved object.

If an allocation fails during a young generation collection, or the mutators cannot fit a large object in the old generation, a full collection is requested. The old generation collection runs a parallel two-phase mark-compact algorithm. As the name indicates, during the first phase the GC threads mark in parallel the live objects in the old generation starting from the roots; the second phase represents the parallel compaction, where live objects are placed in the spaces previously occupied by dead objects.

Figure 4.2 – Brooks forwarding pointers.

According to Gidra et al. [42], ParallelOld GC suffers from two major bottlenecks: lack of NUMA-awareness and a contended lock during the parallel work of the GC threads. The former refers to NUMA architectures, where objects may be preferentially allocated on a single node that becomes overloaded. Another significant downside of the ParallelOld collector (not addressed in the aforementioned work) concerns GC pauses, which can reach prohibitively long durations.
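To make the forwarding mechanism more concrete, the sketch below shows how an access can always be routed through a Brooks-style forwarding pointer. It is a minimal C++ illustration: the ObjectHeader layout, the field names and the memory orderings are our own assumptions and do not correspond to HotSpot's actual data structures.

    #include <atomic>

    // Hypothetical object header: every object starts with a forwarding pointer.
    // If the object has not been moved, the pointer refers to the object itself.
    struct ObjectHeader {
        std::atomic<ObjectHeader*> forwardee;
        // ... object payload follows
    };

    // Read barrier: always dereference through the forwarding pointer, so that a
    // thread holding a stale reference still reaches the primary copy.
    inline ObjectHeader* resolve(ObjectHeader* obj) {
        return obj->forwardee.load(std::memory_order_acquire);
    }

    // When the collector copies an object, it installs the new address in the old
    // copy's header, redirecting all later accesses to the primary copy.
    inline void forward(ObjectHeader* old_copy, ObjectHeader* new_copy) {
        new_copy->forwardee.store(new_copy, std::memory_order_release); // self-referential
        old_copy->forwardee.store(new_copy, std::memory_order_release);
    }

In the actual collector the installation of the forwarding pointer is combined with the copy itself and with the GC threads' synchronization protocol; the sketch only captures the indirection that allows references to the old copy to be updated lazily.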

4.1.3 ConcurrentMarkSweep GC

HotSpot's concurrent collector aims at low GC pauses by letting the application threads work at the same time as the GC during old generation collections. For young generation collections, it uses the same stop-the-world copying algorithm as ParallelOld GC (see Section 4.1.2). The old generation collector, CMS [31], executes four major distinct phases:

• Initial mark. This is a stop-the-world phase. The GC needs to suspend all application threads in order to reliably scan their stacks for object references. Basically, this is the part where the roots are found and scanned;

• Concurrent mark. The collector allows the application threads to resume their work while it concurrently scans the rest of the reference graph starting from the roots it found in the first phase. However, since the mutators are running at the same time, they can alter the memory while it is being collected. Thus, the live objects set discovered during this mark phase is not final. There might be undiscovered live objects at the end of this phase;

• Remark. This is a stop-the-world phase. The collector suspends all mutators in order to identify the last live objects that were left unscanned after the concurrent phase. This phase correctly concludes the marking of live objects;

• Concurrent sweep. After all the objects are adequately marked, the GC identifies dead objects and declares them as free memory (by adding them to a free-list). This is done concurrently with the application threads.

CMS evolved from another mostly-concurrent version proposed by Boehm et al. [14] for unmanaged languages, such as C/C++. Similar to the latter, the marking algorithm has to respect the tri-color invariant. As mentioned before, the marking starts from the roots of the reference graph and follows the links until all live objects have been marked. The objects in the heap can have one of the following three colors: white – nodes that have not been visited yet and, at the end of the marking phase, are considered garbage; black – nodes that have been processed and whose entire subgraph of references has been visited as well, so the GC will not visit them again; gray – nodes in the process of being visited by the collector, either because their immediate descendants have not been processed yet or because the rest of the graph has been altered by a concurrent mutator. The invariant states that there should be no direct link from a black object to a white one. At the end of a collection all reachable nodes are black, while all white nodes are garbage and can be reclaimed.

There are two main ways in which an object's mark can be implemented: it can either be part of the header of the object (i.e., of the forwarding pointer as presented in Section 4.1.2), or use a bitmap. CMS adopts the latter option, in order to prevent interaction with the mutators over the headers of objects when running concurrently. More precisely, CMS uses an array of mark bits that correspond to every four-byte word in the heap [31].

CMS defines a set of threads especially dedicated to collections. These particular threads are intentionally not labeled as GC threads, so that they get suspended together with the mutators during young generation collections. They are often called background threads. This design has several benefits: it increases the concurrency on some systems, it avoids slowing down the young generation collector and it minimizes the synchronization effort.

Finally, it is important to note that the concurrent mode of CMS may fail. This happens if the old generation fills up with live data faster than it is collected by the concurrent collector. Fragmentation is another factor that can cause this failure. In this case, CMS falls back to a mark-sweep-compact stop-the-world algorithm. Given the fact that the mutators are suspended during the collection and compaction of the whole heap, these failures are considerably expensive.
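As a rough illustration of the bitmap approach mentioned above, the following sketch maps a heap address to its mark bit, assuming one mark bit per four-byte heap word. The names, the packed byte representation and the absence of atomic bit updates are simplifications of ours, not CMS internals.

    #include <cstdint>
    #include <vector>

    // Simplified external mark bitmap: one bit per 4-byte heap word. The marks live
    // in a side table, so concurrent marking never touches the objects' headers.
    struct MarkBitmap {
        std::uintptr_t heap_start;        // lowest heap address covered
        std::vector<std::uint8_t> bits;   // must be sized to cover the whole heap

        std::size_t index_of(std::uintptr_t addr) const {
            return (addr - heap_start) / 4;          // word index in the heap
        }
        bool is_marked(std::uintptr_t addr) const {
            std::size_t i = index_of(addr);
            return bits[i / 8] & (1u << (i % 8));
        }
        void mark(std::uintptr_t addr) {
            std::size_t i = index_of(addr);
            bits[i / 8] |= (1u << (i % 8));
        }
    };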

4.1.4 GarbageFirst GC

G1 GC is the newest garbage collector included in HotSpot JVM. G1 is logically classified as generational, although in practice its algorithm has only a few characteristics in common with classical generational GCs. The main goal of G1 GC is to provide very low pause times, while maintaining high throughput, for large multi-core systems with large heaps and allocation rates [30]. In order to achieve this, it splits the whole heap into regions of equal size. At any point in time there exists a current allocation region from which memory is being allocated. When there is no more space in that region, allocation continues with the next region. Empty allocation regions are kept in a linked list, in order to provide constant-time allocation.

G1 GC is mostly concurrent. In order to identify garbage, G1 applies a concurrent marking algorithm very similar to CMS. The algorithm has a stop-the-world initial marking phase, a concurrent marking phase, and a stop-the-world final marking phase. The marking phases are followed by a concurrent live data counting phase and a stop-the-world cleanup phase. The former identifies the amount of live data that was previously marked. The counting process is finished in the cleanup phase, which also completes the marking, sorts the heap regions according to their collection efficiency and reclaims empty regions.

Additionally, G1 stops the world during so-called evacuation pauses, when it performs compaction. The evacuation pauses are planned ahead and applied only to a chosen subset of heap regions. Regions are selected for future evacuation collections based on the result of the aforementioned sorting, done according to specific heuristics regarding the time to collect and the free space left when collecting. The information on live data in each region is processed, and all live objects are moved (evacuated) to other regions. The collected regions are then considered empty.

Given the layout of the G1 heap, it cannot be considered a truly generational GC. However, it mimics the behavior of a generational GC by designating some of the regions as young; designating a region as young adds it to the next collection set. Further, there are two types of evacuation pauses with respect to the generational aspect: fully young or partially young. The former builds collection sets containing only regions marked as young, while the latter may add non-young regions if the requested pause time allows. As an optimization, the stop-the-world marking phases are often piggy-backed on a young evacuation pause.

4.2 C++ smart pointers

Automatic memory management is split into two representative approaches: reference counting and tracing. Tracing algorithms are most often used in high-performance settings (see Section 4.1), while the reference-counting strategy is usually avoided due to its major drawbacks. A critical downside is its considerable overhead over tracing, estimated at 30 % on average [76]. The concept of reference counting is simple: it keeps track of the number of references to each object, updating a counter when references are removed or new ones are added. The object is destroyed only when the count reaches zero. However, this means that each pointer mutation has to be tracked and intercepted, making a naive implementation of reference counting very expensive. Recently, reference counting techniques were reconsidered and optimized, becoming comparable to tracing in terms of performance [76, 12, 77]. A noteworthy memory management mechanism that relies on reference counting is represented by C++ smart pointers. A smart pointer is an abstract data type that encapsulates a pointer while providing additional features, such as automatic memory management or bounds checking. These features were added to classical pointers in order to reduce programming bugs (created by manually managing the memory), while keeping the same efficiency. Smart pointers can prevent most memory leaks and dangling pointers.

4.2.1 Types of smart pointers

There are three major types of smart pointers in C++, all of them dedicated to automatic memory management of dynamically allocated objects. Even if they differ slightly, all of them manage the lifetime of objects in a way that guarantees object destruction at an appropriate moment, so as to avoid memory hazards. According to how the ownership of the object is handled, C++ smart pointers are divided into the following categories [65]:

std::unique_ptr provides exclusive ownership semantics. It has the same size as a raw pointer and supports the same instructions. A std::unique_ptr always owns the object it points to. It defines two separate forms for individual objects and arrays, in order to avoid any ambiguity when accessing the entity it points to. Whenever a std::unique_ptr is moved, the destination pointer acquires full ownership of the object while the initial pointer becomes null. Because of the exclusive ownership property, a smart pointer of this kind cannot be copied. When the smart pointer is destroyed, it automatically destroys the resource it owns. The destructor usually uses delete to eliminate the object, but std::unique_ptr also supports custom deleters, i.e., user-defined functions that are called when the resource is about to be destroyed. The custom deleter is indicated in the constructor of the smart pointer. Either way, the std::unique_ptr ensures the safe removal of the object exactly once along every path through the program (including the case of exceptions). However, a downside of using custom deleters is the increase in size of the std::unique_ptr, which can thus become significantly large.

std::shared_ptr provides shared ownership semantics. That is, multiple owners have to synchronize to ensure a safe and suitable destruction at the right moment. std::shared_ptr handles that by keeping a reference count for the owned object. When the resource is no longer of use, the last smart pointer that stops pointing to it will also destroy the object. An important feature of C++ smart pointers is that a std::unique_ptr can be easily and efficiently converted into a std::shared_ptr, in case the logic of the application demands shared ownership for a specific resource. More details regarding the internals of the std::shared_ptr will be provided in Section 4.2.2.

std::weak_ptr extends std::shared_ptr with some extra features. More precisely, it is very similar to a std::shared_ptr, but it does not participate in any way in the shared ownership, nor does it modify the reference count of the shared resource. This type of smart pointer cannot be dereferenced or tested for a null value. However, it is allowed to dangle and this can be directly tested from the std::weak_ptr API. std::weak_ptrs are always created from std::shared_ptrs and share the same information on the resource. One of the most relevant applications of this kind of smart pointer is to prevent std::shared_ptr cycles. For example, in a cycle of two std::shared_ptrs they both have a reference count of at least 1, since they point to each other. That could easily result in a memory leak when there are no more external references towards this cycle of pointers. However, if a std::weak_ptr were used for one of the edges, then nothing would stop the std::shared_ptr from destroying the resource when it is not needed anymore and the std::weak_ptr from detecting that it is dangling.
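The following short example illustrates the ownership semantics of the three smart pointer types using the standard C++11 API; the Node type is only a toy structure used to show how a std::weak_ptr breaks an ownership cycle.

    #include <iostream>
    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;   // shared ownership of the next node
        std::weak_ptr<Node> prev;     // weak back-link: does not keep the node alive
    };

    int main() {
        // Exclusive ownership: moving transfers the resource, the source becomes null.
        std::unique_ptr<int> u1(new int(42));
        std::unique_ptr<int> u2 = std::move(u1);          // u1 is now null

        // Shared ownership: copying a std::shared_ptr increments the reference count.
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;   // b is now owned by both 'b' and 'a->next'
        b->prev = a;   // weak link: a's count stays 1, so the cycle does not leak

        // A std::weak_ptr must be promoted (locked) before use and may be dangling.
        if (std::shared_ptr<Node> p = b->prev.lock())
            std::cout << "prev alive, use_count = " << p.use_count() << std::endl;
        return 0;
    }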

4.2.2 Case study: std::shared_ptr

In this thesis we focus on std::shared_ptr in particular. The goal of smart pointers of this kind is to combine automatic memory management with efficient access and execution times. They implement a type of reference-counting garbage collection. More precisely, they keep track of the number of references to an object with the help of a shared counter. The shared reference counter is incremented every time a new std::shared_ptr to the object is created, and decremented when a reference is destroyed. When the counter reaches 0, the object is destroyed along with the last std::shared_ptr pointing to it. As mentioned before, no specific std::shared_ptr owns the object. All the smart pointers that point at it collaborate to safely maintain an accurate reference count.

A std::shared_ptr contains two pointers: the raw pointer to the resource it encapsulates and a raw pointer towards a control block (Figure 4.3). That makes the std::shared_ptr twice as big as a raw pointer or a std::unique_ptr. However, as opposed to the latter, its size is constant with respect to custom deleters, i.e., the size of a custom deleter does not influence in any way the size of the std::shared_ptr. All additional information needed to correctly manage memory (reference count, weak count, custom deleters and/or allocators, etc.) is stored in the control block. The control block is created at the same time as the first std::shared_ptr to a specific object. All subsequent std::shared_ptrs are linked to the same control block.

Figure 4.3 – C++ smart pointer components and reference counting mechanism.

As in the case of traditional garbage collectors, automatic memory management pays a price in terms of performance. In particular, std::shared_ptrs have the following downsides:

• Size. Compared to raw pointers std::shared_ptrs are twice as big. In addition to this, there is also the variable size of the control block (one per shared resource), which is allocated separately;

• Synchronization for reference counting. One of the most important features of smart pointers is that they correctly manage shared resources in multi-threaded environments. That means that multiple threads might want to modify the reference count of an object at the same time. Thus, their accesses need to be synchronized in order to obtain consistent results. In the GNU C Compiler 5 (gcc5) implementation of std::shared_ptr, there are two ways of implementing the synchronization for incrementing and decrementing the counter in a thread-safe manner. One of them employs a mutex and safely alters the counter in a critical section. The second one, which is also the default, uses atomic operations. Atomic operations are faster than locks, but still slower than non-atomic operations. Thus, reading and writing the reference count has a non-trivial cost;

• No cycles. Conceptually, std::shared_ptrs do not allow for a correct implementation of cyclic structures. If two std::shared_ptrs point to each other, then their reference counts will never decrease below 1. That means they will never be destructed and the memory they point to will be leaked (there would be no way to access it, but it would not be freed either);

• Shared resource cannot be an array. Unlike std::unique_ptr, std::shared_ptr’s API is only designed for pointers to single objects, offering no operator[].

While having some notable disadvantages, std::shared_ptrs still have a reasonable cost for the functionality they provide, which is necessary in a variety of scenarios. If default deleters and allocators are used, then the size of the control block is around three words. The atomic operations on the shared reference counter usually map to hardware instructions and have an acceptable overhead in general. Moreover, dereferencing a std::shared_ptr has the same cost as dereferencing a normal raw pointer.
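To make the synchronization cost discussed above more concrete, here is a heavily simplified reference-counting sketch in the spirit of a shared pointer, with the counter kept in a separately allocated control block and updated with atomic operations. The class and member names are ours; the real libstdc++ implementation additionally tracks a weak count, deleters and allocators, and offers far more functionality.

    #include <atomic>

    template <typename T>
    struct ControlBlock {
        std::atomic<long> ref_count{1};  // strong references; weak count, deleter, etc. omitted
        T* object;
        explicit ControlBlock(T* obj) : object(obj) {}
    };

    template <typename T>
    class SharedPtrSketch {
        T* ptr = nullptr;                 // raw pointer to the resource
        ControlBlock<T>* ctrl = nullptr;  // raw pointer to the shared control block
    public:
        explicit SharedPtrSketch(T* p) : ptr(p), ctrl(new ControlBlock<T>(p)) {}

        // Copying a pointer costs one atomic increment on the shared counter.
        SharedPtrSketch(const SharedPtrSketch& other) : ptr(other.ptr), ctrl(other.ctrl) {
            ctrl->ref_count.fetch_add(1, std::memory_order_relaxed);
        }
        SharedPtrSketch& operator=(const SharedPtrSketch&) = delete; // kept minimal

        // Destroying a pointer costs one atomic decrement; the last owner frees
        // both the object and the control block.
        ~SharedPtrSketch() {
            if (ctrl && ctrl->ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
                delete ptr;
                delete ctrl;
            }
        }
        T* operator->() const { return ptr; }  // dereferencing is as cheap as a raw pointer
    };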

4.3 Summary

This chapter presented some background on automatic memory management, from the perspective of both a managed runtime environment, the Java VM, and an unmanaged language, C++. On the one hand, we considered general properties of Java garbage collectors and briefly described the seven GCs currently implemented in HotSpot JVM. We then detailed the three GCs that are most used in this work (ParallelOld, ConcurrentMarkSweep and G1). On the other hand, we introduced C++ smart pointers as tools for automatic memory management in an otherwise unmanaged environment. We focused on the std::shared_ptr implementation in particular, which represents a type of memory management based on reference counting.

Part II

Exploiting HTM for Concurrent Java GC


Chapter 5 Garbage Collection: Related Work

Contents
5.1 Real-time garbage collectors
5.2 TM-based garbage collectors
5.3 Summary

Garbage collectors' performance has been an important topic ever since managed runtime environments started to be heavily used both in research and in production. The topic became hot once GCs had to be adapted for multi-core systems and the algorithms turned challenging. With the growth in the number of cores of today's computing machines, a simple serial collector is not suitable any more. However, because of the synchronization needed when moving objects, current GCs are not fully concurrent either. Thus, many studies and improvements have been proposed in this area in the last few years.

Most attempts to improve GC algorithms focus on optimizing their concurrency and reducing their pauses. In this category, we find: concurrent reference counting algorithms, such as the one defined by Brandt et al. [17] – which reports very low pauses, but an increased memory overhead; as well as concurrent tracing algorithms – best represented by ConcurrentMarkSweep GC [31]. In addition to CMS, the Sapphire GC [53] aims to reduce the GC pauses required by a concurrent copying algorithm, by avoiding stopping the world when adjusting thread stacks to account for copied objects. However, the algorithm is only a proof of concept and was never thoroughly tested. Further on, Azul Systems devised the C4 collector [82], another concurrent parallel compacting GC. The novelty here is that it performs concurrent collection in all generations. C4 is further improved with a non-blocking lock-free work-sharing algorithm [54]. Finally, a non-proprietary GC for Java, Shenandoah, describes a region-based low-pause parallel and concurrent compacting algorithm for large heaps [40]. One important drawback is that it requires more memory space than other algorithms. While it shows promising results during preliminary evaluation, the presented implementation is only incipient.

Another category of GC optimizations addresses only specific parts of collectors. For example, Kliot et al. [58] devise an algorithm for lock-free concurrent stack scanning, thus improving one of the last parts that required concurrent collectors to stop the world for a correct execution. Sagonas and Wilhelmsson [74] aim to improve the classical mark-sweep algorithm by entirely eliminating the sweep phase in their mark-split collector. Choi et al. [22] defined a biased allocator for generational GCs, which allocates long-lived objects directly in the old generation, thus decreasing GC pause time by 4.1 % on average. There is also work that attempts to improve GC algorithms with respect to a specific type of application. The Yak GC is developed especially for collecting Big Data applications [66]. Yak achieves high throughput and low latency by dividing the heap into separate spaces for control and data, and using different algorithms for the two. Another example is the data-structure-aware GC presented by Cohen et al. [24]. In a nutshell, they improve collection times for programs that rely on large long-lived data structures, such as database servers. They propose allocating the nodes of the data structure separately from the rest of the heap and considering for collection only those that are explicitly marked as deleted by the application. The authors claim that these changes require only a handful of modifications in the code, but result in significant improvement for the considered applications. Finally, on top of the aforementioned optimizations, real-time GCs and TM-based collectors are making efforts to reduce the GC pauses even more, or to eliminate them entirely. These two categories are presented in detail in the next sections.

5.1 Real-time garbage collectors

A domain that truly requires fast and uninterrupted garbage collection is represented by real-time systems. In this case, the implementation is less focused on improving the throughput and more on decreasing the GC pauses as much as possible. Thus, the work conducted in this area is relevant to our context when reasoning about ways to eliminate GC pauses. The classes of GCs presented in this section are specially implemented for specific devices, such as embedded systems. They do not always address the same requirements as general-purpose Java VM GCs. Integrating automatic memory management in real-time systems is notoriously difficult. Usually, the GC is at fault, due to its unpredictable stop-the-world pauses and the various barriers that are needed for synchronization. Most of the time, real-time collectors are restricted to uniprocessors [57, 7, 5, 6], or rely on special hardware [8]. The first concurrent compacting real-time GC for multi-core architectures, called Stopless, was proposed by Pizlo et al. [69]. It is implemented on top of C#. Further on, the authors also addressed Java GCs in the real-time context and provided Schism, a fragmentation-tolerant concurrent GC for multi-processor systems [71, 70]. Schism achieves hard real-time bounds together with good throughput by using a combination of a concurrent mark-region GC for fixed-size objects and a replication-copying GC to manage array metadata. Another interesting real-time GC for Java is implemented by Siebert [79]. It is a concurrent and parallel mark-and-sweep collector that only handles fixed-size blocks in memory. It is not a compacting GC. However, on small-sized heaps it is shown to have good scalability and little overhead over the single-threaded implementation. Even if not especially dedicated to real-time systems, G1 GC [30] achieves real-time goals with high probability on large heaps and large workloads. It partitions the heap into fixed-size regions and any set of regions can be chosen for collection in order to ensure the time requirements.

5.2 TM-based garbage collectors

An early effort in implementing garbage collectors with transactional memory was made by McGachey et al. [63]. They used STM to implement a new concurrent GC designed to enforce a pause under 1 ms for 90 % of collections, under 10 ms for 90 % of the rest and under 100 ms for what is left. They successfully reached their goals, thus making a first step towards combining TM with GC. Once HTM became available in some specialized processors and large systems (e.g., Azul, Sun), various proposals studied the potential benefits of HTM in concurrent scenarios [34, 38]. A particularly interesting work in this direction was proposed by Iyengar et al. [55], who implemented an HTM-based Java GC on a specialized multi-core CPU built by Azul Systems. Their Collie algorithm represents a first contribution in this field. Its evaluation shows that, thanks to the use of HTM, Collie has better responsiveness and throughput than C4 [82], their baseline copying and concurrent collector without HTM. However, the paper does not compare Collie with a throughput-oriented collector such as ParallelOld in HotSpot, and it is thus difficult to know whether using HTM in a concurrent collector can help achieve high throughput. In 2013, Intel made their HTM implementation available to the public in the Haswell CPU. This created a widely available platform for studying and applying HTM to concurrency problems. Alistarh et al. [4] leveraged HTM for adding automatic memory reclamation to typical C/C++ data structures, such as hash tables, linked lists, non-blocking queues, etc. According to the evaluation, the performance of their algorithm, called StackTrack, is superior to other memory reclamation mechanisms. However, it can also degrade the throughput of the data structure by up to 50 %. Ritson et al. [73] conducted a performance study on three main GC algorithms of the Jikes research virtual machine (RVM), implemented with HTM. They assess the benefits of using transactions for parallel semispace copying collection, concurrent replicating GC and parallel bitmap marking. They find that HTM significantly improves the first algorithm (by 48 % to 101 %), while the other two show neither benefits nor drawbacks. They explain that it is difficult to plan the work of the GC such that the cost of the transactions is fully amortized. We consider that our contribution complements this work with the study of the impact of HTM on real-life Java garbage collectors and the proposal of a novel HTM-based algorithm.

5.3 Summary

This chapter summarized the state of the art in the GC field, covering a few relevant examples of optimizations in the literature. We insisted on GCs implemented for real-time systems, since they have the strictest constraints regarding stop-the-world pauses. We then looked at GCs employing transactional memory, both software and hardware. We commented in detail on the approaches related to our work, marking their downsides, advantages and the parts that leave room for improvement.


Chapter 6 Performance Study of Java GCs

Contents
6.1 Experimental setup
  6.1.1 Infrastructure details
  6.1.2 DaCapo benchmark suite
  6.1.3 Apache Cassandra database server
  6.1.4 Workload generator: YCSB client
6.2 Experiments on benchmark applications
  6.2.1 GC pause time
  6.2.2 TLAB influence
  6.2.3 GC ranking
6.3 Experiments on client-server applications
  6.3.1 GC impact on the server side
  6.3.2 GC impact on the client side
6.4 Summary

This chapter surveys current Java GCs and their performance implications. We study to what extent the GC can affect application performance and give some pointers on which configurations can have an important impact on the application execution. We assess the performance in terms of both responsiveness (i.e., the delay between the generation of an event and its acknowledgment) and throughput (i.e., the amount of work performed in a given period of time). We also measure the duration of the GC pause times, i.e., the periods of time when none of the application threads is progressing.

We experiment with the OpenJDK8 Java Virtual Machine, one of the most extensively used language runtimes. All collectors in this JVM are generational; this means that the heap is split in two parts: the young and the old generation. As an optimization for multi-core systems, a thread-local allocation buffer (TLAB) is used in the young generation (i.e., chunks of the young generation are pre-allocated for each thread, which subsequently takes care of its local allocations). Thus, most of the time there is no need for synchronization when allocating new objects. We exhaustively test the efficiency of all current GCs both in an academic environment (with a set of benchmarks in the DaCapo suite), as well as in a real-life scenario (running the massively scalable database Apache Cassandra). The benchmark experiments are split into two categories: (1) when a system (full) GC is forced between the benchmark iterations and (2) when no system GC takes place and the memory is collected only when needed. Overall, our study tends to show that sometimes the GC might have an unexpected behavior (e.g., longer pauses for smaller young generation sizes, TLAB decreasing performance in some cases). We detail below the main issues we investigated.

• Total execution time: we measure the total execution time for the benchmarks in our subset with different GCs and heap sizes. G1 GC has the worst throughput among the GCs for test case (1), while for (2), the GCs perform in a similar manner;

• GC pause time: we study the length of the application pauses caused by the GC activity, for all benchmarks. We observe that the pause length varies significantly for all GCs. Overall, ParallelOld seems more stable than G1 and ConcurrentMarkSweep;

• GC statistics: we observe that sometimes a smaller young generation size results in a bigger average pause time for the same heap size (e.g., in the case of ConcurrentMarkSweep and ParNew GCs);

• TLAB influence: we check whether enabling the TLAB leads to a performance improvement for each GC and benchmark. We consider a variation of 2 % from the average total execution time. We find that most of the time the TLAB does not have any influence (either good or bad), but there are cases in which it even degrades the performance;

• GC ranking: we order the GCs according to the best execution times of the benchmarks in our experiments. Based on this simple ranking and the previous tests we find that the default GC, ParallelOld, is constantly performing well, while both G1 and ConcurrentMarkSweep tend to degrade the application performance.

For the client-server experiments we use Cassandra DB as the server and the YCSB benchmark on the client side. We discuss the impact of the three most often used GCs on multi-core machines (ParallelOld, ConcurrentMarkSweep and G1).

• On the server side: a detailed analysis of Cassandra’s logs indicates application pauses of up to 4 minutes for ParallelOld, and of 3–5 seconds for G1 and ConcurrentMarkSweep. Even though the latter is much smaller than the former, it is not negligible in systems where responsiveness is critical;

• On the client side: we study the client response time and observe that most of the peaks in the response time correspond to the moments when a GC took place. The result shows that the server is unable to progress during the pause time. This degrades the user experience and, possibly, the progress on other nodes (that are either waiting for the response or suspect that the collecting server is faulty).

Based on these observations, we draw the conclusion that there are workloads that are executed better with particular GCs and that sometimes the assumptions on the GC activity do not hold true for all scenarios. We show that, on the DaCapo benchmark suite, Java's default GC gives the best overall performance out of the three main GCs, while on a memory-intensive database server it introduces unacceptable pauses and affects the response time on the client. Moreover, even the concurrent low-pause GC, ConcurrentMarkSweep, and G1 GC, which aims to ensure bounded pause times, end up stopping the application threads for a few seconds. This is still a significant pause that can affect the execution of the application.

6.1 Experimental setup

The benchmark suite and client-server system further described in this section are used in experiments throughout Part II of the thesis. We introduce here all DaCapo benchmarks with their characteristics and some internal details of the Cassandra DB server. The following infrastructure configuration is only used for the performance study in the current chapter.

6.1.1 Infrastructure details

We mainly perform the experiments on a 48-core server with 64 GB RAM, running 64-bit Ubuntu Linux. The cores are distributed over 4 sockets: 2 NUMA nodes per socket, each having 6 CPUs. Each core benefits from a 1.5 MB Level-1 cache (separate for instructions and data) and a 6 MB Level-2 cache. A 12 MB Level-3 cache is available per NUMA node. Additionally, we use a 16-core machine with 8 GB RAM for the database client. We use OpenJDK8 for all the tests. We consider as baseline the default GC configuration used by Java: ParallelOld GC, with a maximum heap size of ∼16 GB, a maximum young generation size of ∼5.6 GB and the TLAB enabled. We set both the minimum and the maximum heap size to the same value, so that we have a fixed heap size.
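For reference, a JVM invocation corresponding to this baseline would look roughly like the command below. The flags are standard HotSpot options (-Xms/-Xmx fix the heap size, -Xmn sets the young generation size, -XX:+UseTLAB enables the thread-local buffers and the last flag selects the collector), but the sizes are rounded and the command is only illustrative, not the exact one used in our scripts.

    java -Xms16g -Xmx16g -Xmn5600m -XX:+UseTLAB -XX:+UseParallelOldGC <benchmark>

The other configurations would be obtained by swapping the collector flag (e.g., -XX:+UseConcMarkSweepGC or -XX:+UseG1GC) and by disabling the buffers with -XX:-UseTLAB.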

6.1.2 DaCapo benchmark suite

The DaCapo benchmark suite [11] is a memory-management benchmarking tool for Java. It consists of the following set of open source, real world applications:

• avrora: single external thread, but internally multi-threaded;

• batik: mostly single-threaded both externally and internally;

• eclipse: single external thread, internally multi-threaded;

• fop: single-threaded;

• h2: multi-threaded (one client thread per hardware thread);

• jython: single external thread, internally using one thread per hardware thread;

• luindex: single external thread; internally it uses some helper threads to a limited degree, displaying limited concurrency;

• lusearch: multi-threaded, one client thread per hardware thread;

• pmd: single client thread, internally multi-threaded using one worker thread per hardware thread;

• sunflow: multi-threaded, driven by a client thread per hardware thread;

• tomcat: multi-threaded, driven by a client thread per hardware thread;

• tradebeans: multi-threaded, driven by a client thread per hardware thread;

• tradesoap: same as tradebeans;

• xalan: multi-threaded, driven by a client thread per hardware thread.

When executing a DaCapo benchmark with the default options, it only runs once, returning the execution time at the end. However, a number of parameters can be fed to DaCapo. An important option is the number of iterations the benchmark is required to execute. By default, the application performs a system GC between every two iterations. This feature can also be disabled. When running multiple iterations, all except the last one represent warm-up rounds; the last iteration is the actual run of the benchmark. Besides this property, one can also indicate the number of threads for the current execution. This option overrides the default setting of having one client thread per hardware thread. There are 14 benchmarks in the most recent DaCapo suite, released in 2009. Out of these, 3 benchmarks crashed on every test: eclipse, tradebeans and tradesoap. For other benchmarks, e.g., avrora, the execution time varied significantly from one iteration to the next. Thus, in order to identify a subset of stable benchmarks, we ran each benchmark 20 times with the baseline Java configuration. A run consisted of 10 iterations. After each iteration a system GC takes place. We computed the variation as the relative standard deviation (i.e., the standard deviation divided by the mean, expressed as a percentage), which indicates how tightly the data is clustered around the mean. We take as metrics for stability the following two characteristics:

• The duration of the last iteration: we do not take into consideration the duration of the warm-up rounds, expecting the duration of the actual run to be stable;

• The total execution time: even though the duration for each iteration varies separately, we check if the total execution time, i.e., the sum of the durations of all iterations, remains constant.

Based on the information gathered we selected the benchmarks listed in Table 6.1. All other benchmarks in the DaCapo group showed variations of more than 5 % in both measured times: execution time and final round. We accepted in our experiments the benchmarks that are stable for at least one characteristic. The rest of this work only focuses on the selected subset of benchmarks.

6.1.3 Apache Cassandra database server

Cassandra [1] is a distributed on-disk NoSQL database. Its architecture is based on Google's BigTable [21] and Amazon's Dynamo [29] databases. It provides no single point of failure, and is meant to be scalable and highly available. Data is partitioned and replicated over the nodes. Durability in Cassandra is ensured by the use of a commit log where all the modifications are recorded. Exploring the whole commit log to answer a request is expensive; thus, Cassandra also has a cache of the state of the database. This cache is partially stored to disk and partially stored in memory. After a crash, a node has to rebuild this cache before answering client requests. For this purpose, it rebuilds the cache that was stored in memory by replaying the modifications from the commit log. Through a couple of configuration files, one can select the GC, the heap and young generation sizes, and modify the amount of memory used for the cache and for the commit logs.

Table 6.1 – Relative standard deviation for the final iteration and total execution time for a subset of DaCapo benchmarks.

Benchmark   Final iteration (%)   Total execution time (%)
h2          2.67                  1.71
tomcat      1.37                  1.08
xalan       2.41                  4.01
jython      5.79                  3.43
pmd         2.75                  0.78
luindex     8.6                   3.21
batik       10.75                 2.54

6.1.4 Workload generator: YCSB client

A good way to test the responsiveness of the database is to run a benchmark on the client side, such as the Yahoo Cloud Serving Benchmark (YCSB) [25]. The YCSB client is a workload generator with predefined core workloads that can be further extended according to the user's needs. The workloads define the number of read, insert and update operations executed on the database, the number of records that will be affected, the execution time, etc. It can be used in two phases: the loading phase and the running phase. The former only populates the database (loads the data), while the latter executes the specified workload against the database. At the end, it returns statistics about the execution.
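As an illustration, a minimal core workload definition for YCSB might look like the snippet below; the property names come from YCSB's core workload files, but the concrete values are just an example and not the workload used in our experiments.

    # example YCSB core workload: read-heavy running phase
    workload=com.yahoo.ycsb.workloads.CoreWorkload
    recordcount=1000000        # records inserted during the loading phase
    operationcount=10000000    # operations issued during the running phase
    readproportion=0.95
    updateproportion=0.05
    requestdistribution=zipfian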

6.2 Experiments on benchmark applications

Starting from the baseline configuration, we run benchmarks of the DaCapo suite with all the supported GCs. We vary the maximum heap size from the baseline to the maximum amount of memory supported by the machine, i.e., 64 GB. We also vary the young generation size from the baseline to the heap size. Finally, we test separately when the TLAB is enabled or disabled. Typically, we use the default DaCapo configuration for the benchmarks. This configuration exploits the maximum number of hardware threads available on the machine. We generally configure 10 iterations for each benchmark. We also switch between having a system GC after each iteration and no system GC inserted by the benchmark.

6.2.1 GC pause time

All currently implemented GCs in OpenJDK8 need to stop the application threads in at least one of their collection phases. We compare the impact caused by GC pauses for the selected benchmarks along multiple axes, as follows: Figure 6.1 shows the distribution of GC pause times across benchmarks; Figure 6.2 shows the total execution time of the benchmarks for all supported GCs; finally, Figure 6.3 indicates the total number of GCs during the execution of each benchmark. In all cases we illustrate the results both when a system GC (which forces a full collection) is activated between iterations and when this feature is deactivated. By default, the JVM uses a sequential GC for explicit system GCs. In order to fully study each respective GC, being especially interested in the performance of the concurrent ones, we forced the system GCs to be executed with the selected old generation GC. This is accomplished by adding the -XX:+ExplicitGCInvokesConcurrent runtime option to the JVM. In all charts the X axis represents the benchmarks that were executed, while the Y axis indicates the pause duration (in seconds), the total execution time (in seconds), and the number of GCs, respectively. Basically, these plots cluster the results for all benchmarks and all GCs for different characteristics. All tests are conducted with the baseline configuration for the heap, young generation and TLAB. We repeat each experiment 20 times. In Figures 6.2 and 6.3, the bars represent the median value of the respective feature over all runs. We use error bars to indicate the variation of the measured attributes, representing the minimum and the maximum values. The error bars help us draw meaningful conclusions regarding the GC impact. Moreover, the pattern-filled area in the execution time plots represents the fraction of time the GC takes from the total execution time of the benchmark.

Figure 6.1 – GC pause times for all Java GCs and DaCapo benchmarks.

We start by analyzing the GC pauses in Figure 6.1. The figure indicates the median pause time, the variation from the median (the box extends from the first quartile to the third quartile, and the whiskers extend by 1.5 · IQR (interquartile range) from the ends of the box), and the outliers. We observe that in general the GC pause length varies significantly. We consider first the runs that have a system GC between all iterations (Figure 6.1(a)). In this case, the distribution of the pause lengths corresponds to the following pattern: most of the GC pause times cluster around the median, with a few pauses considerably longer than the average (e.g., the outliers in the figure). Despite having the shortest median pause times in all cases, CMS and G1 compete closely with the other GCs for the longest maximum pause. For example, we see a few pauses of more than 3 seconds for CMS and the h2 benchmark (where the total execution time of the benchmark is close to 3 minutes, as shown in Figure 6.2(a)). Similarly, G1 generates pauses close to or over 1 second for three other benchmarks (among them xalan, which executes entirely in less than 20 seconds). ParallelOld GC is neither the best nor the worst in any situation. While its pause times vary as much as for the other GCs, it has fewer spikes, being more stable. On the other hand, when no system GCs are planned between iterations, no GCs at all are performed for two benchmarks (i.e., luindex and batik) with the aforementioned memory configuration. G1 also avoids collection in the case of the pmd benchmark. We observe that G1 is consistently the collector with the smallest number of pauses, but also with the most prominent spikes in pause length. The number of pauses can be seen in Figure 6.3(b). We observe pauses of over 3 seconds for the same benchmarks that were affected in the other scenario: tomcat and xalan. We notice that in this case CMS and ParallelOld have comparable performance, with CMS usually being slightly worse than ParallelOld (that is, having either a greater median or a greater maximum pause).

Figure 6.2 – Execution time for all DaCapo benchmarks with various GCs enabled (total GC duration marked).

Looking at the benchmark execution time (Figure 6.2(a)), we note that G1 GC has the worst throughput when system GC is activated and it is forced to perform multiple full collections. Generally, the execution time of the application is longer when using G1 than with any other GC: it is up to 16.7 % longer than for all the other GCs in the case of the jython benchmark. One of the best performing GCs is ParallelOld GC, which is also the default GC for Java. Another interesting remark is that the total execution time of the benchmarks remains roughly the same whether system GC is enabled or disabled, even though in the latter case fewer GCs take place and the total time spent collecting is smaller.

We believe this happens because in the former case there is a system GC before the first iteration, which prepares the benchmark to proceed with a clean view of the memory. In the other case, the benchmark starts the iterations immediately after setting up all the data structures that are necessary for its run. We observe that when system GC is disabled the first iteration takes more time to complete the same amount of work. This extra time amortizes to some extent the lack of system GCs and the benchmarks end up with execution times similar to the version with system GCs between iterations.

Figure 6.3 – Number of GC invocations per application run for all DaCapo benchmarks.

Figure 6.3 shows how many collections take place for each benchmark, with the two settings regarding the system GC between iterations. The values in the figure are the ones reported by each GC in its log file. Some of the GCs count a full collection as two (one for the young generation and another one for the entire heap). This is the case for the GCs that consistently report 20 or more invocations in the system GC scenario (ParallelGC and ParallelOld GC). In practice, there are few situations in which the GCs need to collect more than the 10 full collections imposed by the benchmark settings (i.e., at most 3 more times for most GCs). A special case is G1, which constantly reports 9–11 extra invocations. The explanation is that this GC counts the two separate blocking phases of its collection as 2 invocations (sometimes one of them is not necessary). Thus, basically, G1 does not collect more than during the system GCs between iterations either. When there is no forced full GC (Figure 6.3(b)), we observe that all benchmarks need strictly fewer collections than the number imposed between iterations. As mentioned before, G1, followed by CMS, has the smallest number of invocations in this case.

Next, we evaluated the correlation between the pauses caused by the GCs and the sizes of the heap and the young generation. We expected to find the relation described by Blackburn et al. [61, 77]: the total pause time increases as the young generation size decreases, in applications that spend most of their time in young generation collections. This happens because the total number of minor collections increases when the young generation is smaller. However, the average pause should decrease with a decreasing young generation size. Given the fact that some benchmarks did not show any GC with our heap configuration, we ran the experiments once more with the following settings:

Table 6.2 – Statistics for the h2 benchmark with varying heap and young generation sizes (CMS GC).

Heap – young gen. size   #pauses (full)   Avg. pause time (s)   Total pause time (s)   Total execution time (s)
64 GB – 6 GB             3 (0)            1.46                  4.6                    190.05
64 GB – 12 GB            2 (0)            0.51                  1.04                   186.72
64 GB – 24 GB            2 (0)            0.47                  0.94                   185.55
64 GB – 48 GB            2 (0)            0.37                  0.74                   187.9
1 GB – 200 MB            67 (1)           0.08                  5.26                   183.66
1 GB – 100 MB            135 (1)          0.07                  9.03                   189.8
500 MB – 200 MB          74 (7)           0.11                  8.31                   189.64
500 MB – 100 MB          136 (3)          0.07                  9.02                   188.64
250 MB – 200 MB          629 (342)        1.35                  848.4                  1103.52
250 MB – 100 MB          144 (89)         1.09                  160.33                 771.6

• heap sizes of 1 GB, 500 MB, 250 MB;

• young generation sizes of 200 MB, 100 MB.

Based on both sets of experiments, we observed that our expectation does not hold in all cases. We can take as an example the h2 benchmark and its results for the ConcurrentMarkSweep collector (Table 6.2). In the upper part of the table we keep the heap size constant at 64 GB and vary the young generation size from 6 GB to 48 GB. We observe that the average pause duration for the smallest young generation size is significantly longer than for a bigger young generation size. Likewise, the average pause duration for a young generation of 24 GB is longer than the one for 48 GB. In the second part of the table, we list the results for the small heap and young generation sizes. Having such a small amount of memory for this benchmark results in hundreds of garbage collections during the execution. It is interesting to point out that in the case of a small heap the total pause time can represent more than 50 % of the total execution time. Another important observation is that under the same conditions the ParallelOld collector behaves as expected in both situations.

6.2.2 TLAB influence

The Thread-Local Allocation Buffers (TLABs) represent chunks of the young generation, one buffer per thread, where new objects are first allocated. This allows for faster memory allocation, since the thread is able to allocate memory inside its buffer without taking a lock. We consider that the TLAB has a positive impact for a GC when the total execution time for that GC is smaller than for the same experiment without TLAB. To account for the variation in the execution times, we computed a 2 % deviation from the average execution time. Thus, if the difference between the total times with and without TLAB falls within the interval [-deviation, deviation], we say that enabling the TLAB brings neither improvement nor deterioration (=). If the total execution time without TLAB is greater than the execution time with TLAB (plus the deviation), it means that enabling the TLAB results in an improvement of the execution time (+). Otherwise, we mark it as a negative influence (–). Based on the results obtained for the baseline configuration of the heap and young generation size, we observe that in most cases the TLAB does not have a particular influence (Table 6.3).
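The reason a TLAB allocation needs no lock can be seen in the following simplified bump-pointer sketch; the structure and member names are hypothetical, and HotSpot's actual fast path is more involved (alignment, prefetching, TLAB refill policies, etc.).

    #include <cstddef>
    #include <cstdint>

    // Simplified thread-local allocation buffer: a private chunk of the young
    // generation in which a thread allocates by bumping a pointer, without locks.
    struct TLAB {
        std::uintptr_t top;   // next free address inside this thread's buffer
        std::uintptr_t end;   // first address past the end of the buffer

        void* allocate(std::size_t size) {
            if (top + size > end)
                return nullptr;                      // buffer exhausted: slow path requests a new TLAB
            void* obj = reinterpret_cast<void*>(top);
            top += size;                             // single-threaded bump, no atomic operation needed
            return obj;
        }
    };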

Table 6.3 – TLAB influence over all GCs and the selected subset of benchmarks.

Benchmark   ConcMarkSweep   G1   ParNew   Parallel   ParallelOld   Serial
h2          =               =    =        =          =             =
tomcat      =               =    =        =          =             =
xalan       +               =    =        +          +             –
jython      =               =    =        =          =             =
pmd         =               =    =        =          =             =
luindex     –               =    =        =          =             –
batik       =               =    =        =          =             =

However, there are cases when enabling the TLAB leads to decreased performance (e.g., for the CMS and Serial GCs). We observe that for the xalan benchmark, enabling the TLAB varies between improvement and deterioration depending on the GC used. On the other hand, for the luindex benchmark the TLAB is at best neutral, resulting in decreased performance if the CMS or Serial GCs are used. We correlate the deterioration of luindex with the fact that this benchmark is mostly single-threaded, so by default it should not gain much from enabling the TLAB. Also, we suspect that the lack of improvement when enabling the TLAB is related to the small size of the benchmarks, which do not stress the memory enough for the TLAB to make a difference. Even so, the result is surprising.

6.2.3 GC ranking

In order to get an overall idea of the GC influence on the DaCapo benchmarks, we finally rank the GCs according to the number of experiments in which they perform the best. An experiment is defined by a benchmark, a heap size and a young generation size. We make this ranking based on the following combinations (heap size, young generation size):

• (16 GB, 6 GB), the baseline configuration;

• (64 GB, 12 GB), tests run on a big heap and young generation;

• (500 MB, 200 MB), tests run on a small heap and young generation.

We choose these values in order to have both a configuration which stresses the GC (as seen in Table 6.2) and one at the other side of the spectrum. In total, there are 21 individual experiments, executed for all GCs over 20 runs. For each experiment we consider the run with the shortest execution time as the best. The Y axis in Figures 6.4(a) and 6.4(b) shows the percentage of experiments for which each GC was employed in the best run. The GCs (on the X axis) are sorted according to their ranking. When the system GC is enabled between all iterations (Figure 6.4(a)), there is no column for the G1 or CMS collectors. That means that G1 and CMS did not perform better than all other GCs in any of the experiments. The baseline collector is average, being best in almost 20 % of the experiments. Surprisingly, the Serial collector has the highest ranking, together with ParNew, both of which stop the world entirely for old generation collections. However, we believe that having a system GC between all iterations reduced the size of the collections; the concurrent GCs (i.e., G1 and CMS) ended up introducing more synchronization overhead than the time spent collecting when the world was stopped.

Figure 6.4 – GC ranking according to the number of experiments in which they performed the best: (a) with the system GC enabled between iterations, (b) without the system GC.

If we disable the system GC (Figure 6.4(b)), CMS improves, being the best in around 5 % of the experiments. However, CMS and G1 are still the worst according to this metric. G1 is ranked last, with 0 %. We also observe that, in this case as well, the default GC (ParallelOld) is better than the concurrent GCs. Based on these results, we conclude that ParallelOld is a fitting choice for a default GC, since it proves to be stable and has satisfactory results. Running either of the concurrent collectors, G1 or ConcurrentMarkSweep, results in both longer pause times and longer execution times. The next section examines the impact of these three most important GCs on a memory-intensive real-life application. We show that in that case, ParallelOld has significant drawbacks compared to the other two collectors.

6.3 Experiments on client-server applications

We started this study by analyzing the impact of garbage collector activities on a set of benchmarks. The next step is to check whether the previous results hold in a real-life environment, such as a client-server system. Our experiments follow two main directions:

• The duration of the GC-induced pauses on the server;

• The impact on the response time at the client side.

For both experiments we use Cassandra DB on a single node as the server, in combination with the YCSB benchmark as the client. On the server we set the heap size to 64 GB and the young generation size to 12 GB. This experiment also aims to evaluate the behavior of the GCs with a very big heap and young generation. In the previous section we were unable to perform this test, since for these values the DaCapo benchmarks were not filling up the memory and there were no collections. We choose this young generation size following the JVM recommendation that it be around one fourth of the total memory.


Figure 6.5 – GC pauses for the three main GCs (ParallelOld, ConcurrentMarkSweep and G1) for the stress-test configuration on Cassandra server.

6.3.1 GC impact on the server side

We first study the performance of the default GC, ParallelOld. We based our main experiments on two different Cassandra configurations. In both cases, the YCSB client is used in the loading phase, i.e., it continuously populates the database with records, for a specified amount of time.

• Default configuration. We run the experiment with 100 client threads in parallel, for one hour and two hours, respectively. The shorter test case ends with no full GC; nonetheless, the young generation collections reach a peak pause of around 17 seconds. The two-hour run stresses the memory even more by loading records for an extra hour. This results in a full GC that stops the application threads for more than 160 seconds. Moreover, the young generation collections take up to 25 seconds;

• Stress test configuration. We take advantage of Cassandra’s internal data structures and configure it to flush its records to disk as rarely as possible, stressing the memory until the server is saturated. That is, we set up both the commit log and the internal caching structure of Cassandra (called memtable) to have the same size as the heap, which means that everything is always kept in memory. Moreover, we load the database with records beforehand, so that when the server starts the memory is already partially occupied. Then, we run the same workload as before for two hours. This experiment results in a full GC lasting around 4 minutes.

Then, we experiment with CMS and G1 on the stress test configuration of Cassandra. Figure 6.5 compares the performance of all three GCs in heavy-workload conditions (100 % write). We study the evolution of the GCs in time in Figure 6.5(a) and we look at the distribution of GC pauses in Figure 6.5(b). Both figures are presented in logarithmic scale. The former shows the resulting pauses caused by the GC activity on the Y axis, while the X axis indicates the execution time. Even though the YCSB client itself ran for a fixed amount of time (two hours), the total execution time in the chart is longer. It also contains the loading step of Cassandra: because of our configuration, the server must first bring into memory all commit logs and replay already executed transactions. Only then do we start the actual benchmark. Figure 6.5(b) shows the GC pause times (on the Y axis) for each of the 3 GCs (on the X axis). The boxes and whiskers have the same configuration as for the DaCapo benchmarks (Section 6.2.1).

Strictly regarding the ConcurrentMarkSweep and G1 GCs, we observe that their performance is comparable in terms of stop-the-world pause duration. Both of them reach pauses of more than 2 seconds, going up to 3.5 seconds for G1. When compared to the default GC, the first observation is that ParallelOld’s pauses are up to two orders of magnitude longer than for the other two GCs (hence the logarithmic scale on the Y axis). Looking at the GC evolution during the entire run (Figure 6.5(a)), ParallelOld shows much more activity than the others, accounting for a considerable total pause time per application. Another interesting fact is that the smallest pauses are also provided by ParallelOld, while the pauses for G1 and CMS keep to the same duration, without varying too much. In the first 30–40 minutes of the run (corresponding to the replay of the commit logs), we observe a cluster of low-pause invocations for ParallelOld. In the same interval, the concurrent GCs only collect a few times, but the collections have higher latency. Next, all 3 GCs perform similarly for most of the run. However, towards the end of the experiment, when the memory gets filled, ParallelOld has several enormous pauses. More precisely, almost all pauses after the longest pause shown in the plot are over 10 seconds long. Both ConcurrentMarkSweep and G1 continue on the same trend as before until the end, with only a slight spike up to 3.5 seconds on the last G1 collection. In addition, Figure 6.5(b) summarizes statistical aspects of the GC pauses in this context. It confirms the above observations. ConcurrentMarkSweep and G1 are both more stable and very similar performance-wise. We find only one outlier for G1, corresponding to the aforementioned spike at the end of the experiment. ParallelOld has a much lower median pause time than the other two, but the variation is greater, as are the maximum pause times. Even if the pauses caused by ConcurrentMarkSweep and G1 are considerably smaller than those resulting from the use of ParallelOld, they can still be unacceptable in certain circumstances. Apache Cassandra is a distributed database, supposed to run on multiple nodes for an indefinite amount of time. In only two hours we managed to saturate the server and obtain garbage collections lasting for a few seconds with G1 and ConcurrentMarkSweep, respectively for a few minutes with ParallelOld. Moreover, in a distributed system, even a lag of a few seconds might result in the current node being considered down and the initiation of a cumbersome synchronization protocol.

6.3.2 GC impact on the client side

We use a custom workload for the client-side experiments: 50 % read and 50 % update operations. We run the experiment three times, with the three main garbage collectors supported by Cassandra: ParallelOld, ConcurrentMarkSweep and G1. The charts resulting from the above experiment are illustrated in Figure 6.6. In order to make the charts more readable and to reduce the size of the images, we only plot the highest 10000 points of every chart. The shared X axis represents the execution time elapsed since the beginning of the experiment, in minutes. We run each experiment for two hours. The Y axis indicates the latency of the aforementioned operations on the client during the experiment. On the same plot we mark in red the moments in time when GCs took place on the server and the length of the GC pause at that point (on the Y axis). This way, we aim to determine if there is any correlation between GC pauses on the server and delayed response times on the client. We make several important observations based on this figure. First, most of the points follow a well-defined low-latency line; moreover, for the update operations the line of points is constant for all three GCs, while for the read operations, the line has some increasing "steps".

Figure 6.6 – Correlation between operation latency on the client and GC pauses on the server with the three main GCs: ParallelOld, ConcurrentMarkSweep and G1.

Apart from the constant line of values, there are the spikes that we expected to see: a few tens of points with greater latency, for both kinds of operations. Second, by plotting the GC pauses on the same chart as the operation latencies, we observe that the highest latencies correspond to the moments when a collection took place. In most cases, the length of the GC pause matches the latency of the operation as well. Thus, we conclude that the latency peaks are strongly related to the garbage collection activity. Finally, we note again the visible difference between the number of pauses generated by each of the GCs: despite having a lighter workload than before, ParallelOld has around 4 times more pauses than the other two GCs. The GC pauses do not account for all the latency spikes, but most of the spikes can be correlated with GC pauses on the server.

6.4 Summary

This chapter presented an extended study of all Java GCs and their properties. We started by lining up some expectations about GC behavior in general and checked them one by one. We experimented with the DaCapo benchmark suite and the Apache Cassandra server.

Table 6.4 – Advantages and disadvantages of the 3 main GCs, according to our experiments.

GC            Experiment   Throughput    Pause Time
ParallelOld   DaCapo       good          short
              Cassandra    good          unacceptable
CMS           DaCapo       fairly good   acceptable
              Cassandra    fairly good   significant
G1            DaCapo       bad           unacceptable
              Cassandra    fairly good   significant

The two environments offer different testing conditions in terms of memory and load and, thus, different perspectives on how the GCs react. We found that the GCs do not always behave as expected. We generally focused more on the three most widely used GCs in Java: ParallelOld, ConcurrentMarkSweep and G1. We analyzed the results and roughly summarized the benefits and disadvantages of these three GCs in Table 6.4.


Chapter 7 Transactional GC Algorithm

Contents
7.1 Introduction
    7.1.1 Motivation
    7.1.2 Design choices
7.2 Our approach
    7.2.1 Brooks forwarding pointer
    7.2.2 Access barriers
    7.2.3 Algorithm design
    7.2.4 Discussion
7.3 Summary

While garbage collectors significantly simplify programmers’ tasks by transparently handling memory, they also introduce various overheads and sources of unpredictability. Most importantly, GCs typically block the application while reclaiming free memory, which makes them unfit for environments where responsiveness is crucial, such as real-time systems. There have been several approaches for developing concurrent GCs that can exploit the processing capabilities of multi-core architectures, but at the expense of a synchronization overhead between the application and the collector. In this chapter, we introduce a novel approach to implementing pauseless moving garbage collection using hardware transactional memory. We first elaborate on the various components that play an essential role in the algorithm. Then, we describe the design of a moving GC algorithm that can operate concurrently with the application threads. Finally, we give an informal intuition of the correctness of the concurrent system presented.

7.1 Introduction

Generally, managed runtime environments use garbage collectors to automatically reclaim memory. For today’s servers characterized by large memory requirements and running on multi-core architectures, a GC that suspends the application during an entire collection is not adequate anymore. With a heap size of tens of GB, the pause caused by a single collection cycle can sometimes reach minutes, dramatically degrading server responsiveness [41, 19]. Significant efforts have been put into improving concurrent GCs in the last few years, since concurrency usually comes at the expense of possible consistency issues. For example, with a faulty implementation, when an object is copied but not all its incoming references are updated to the new version, the application may be able to access both the old and the new version of the object. In order to avoid this issue, concurrent moving collectors use fine-grained synchronization mechanisms between the GC and the application. These mechanisms are applied to memory reads or writes performed by the application. While current moving concurrent collectors actually improve the responsiveness of the application, they tend to globally degrade its throughput. They are slowed down by the extra code executed at each memory read or write (also called barrier).

7.1.1 Motivation

The significant impact of concurrent moving GCs on application throughput and responsiveness, together with the possible solution offered by transactional memory, motivate this work. Exploratory research studying the impact of garbage collection on application performance is presented in Chapter 6. Briefly, we considered two scenarios: a benchmark suite and a client-server system, and two main metrics: the application throughput and the duration of GC pauses. We evaluated the GCs featured in OpenJDK8’s HotSpot, while varying the heap size, young generation size and other properties. In what follows, we only focus on the three main GCs considered in this work: ParallelOld, ConcurrentMarkSweep and G1. To summarize our experiments, we have found that, in terms of throughput, ConcurrentMarkSweep and G1 perform poorly on the benchmark suite, whereas ParallelOld runs consistently well. These results show that a non-moving concurrent algorithm, such as ConcurrentMarkSweep, already introduces an overhead. For a moving collector, we can reasonably suspect that this overhead will be significantly worse because of the synchronization required to avoid inconsistencies between the old and the new versions of the same object. Then, in terms of responsiveness, we have found that the use of ParallelOld results in unacceptable pauses of the application for a client (e.g., up to 4 minutes in our experiments); ConcurrentMarkSweep and G1 have significantly better results than ParallelOld, which shows that concurrency actually helps in improving responsiveness. Nonetheless, their pause lengths go up to 3.5 seconds because of the big heap/young generation size that was continuously kept saturated by the client. These pause times can still be unacceptable on certain systems. For instance, in a distributed environment (such as Cassandra when deployed on multiple nodes) the response time of a node might be critical to keep the system from considering it faulty or crashed. Moreover, Cassandra DB should be able to run without interruption for an indefinite period of time: we only ran it for two hours and all 3 GCs finished with pauses longer than 1 s. In this work, we study whether using HTM to synchronize the application and the GC could remove the application pauses while maintaining an acceptable throughput. To this end, we propose a new concurrent moving GC algorithm that uses transactions in read and write barriers to synchronize the application with the GC. When the GC needs to move an object, it marks the object as busy. If an application thread tries to access the object at the same time, the transactional barrier notices the busy state and aborts the transaction. Otherwise, the application threads can work concurrently with the GC threads.

7.1.2 Design choices

We consider as starting point a classical concurrent copying collector algorithm. We will call it the base collector of our approach. We extend it with transactions executed during application reads and writes. We use them to avoid inconsistencies when objects are moved. In practice, we looked at Java’s ConcurrentMarkSweep collector (CMS), a real-life implementation that fits the aforementioned theoretical aspects. We further elaborate on the implementation details in Chapter 8. There are two different strategies for implementing garbage collection with transactional memory: one could either instrument the GC (such as in the solution offered by the Collie collector [55]), or the application accesses (load and store). We apply the latter approach in our algorithm. We choose this design for the following reasons: first, we avoid implementing a state-of-the-art GC from scratch (i.e., the Collie collector), being able to work directly on the ConcurrentMarkSweep HotSpot collector. Moreover, we directly eliminate the need for the two traversals of the object graph, one of the main downsides of the Collie algorithm. Finally, when inserting HTM on the application side, we can easily estimate the transaction size (encapsulating only a single load or store of an object), a crucial aspect when using HTM.

7.2 Our approach

As mentioned before, stop-the-world GCs have high throughput, but they have considerable pauses with large workloads, such as a fully loaded Cassandra server. In order to decrease this pause time while maintaining comparable throughput, we present a novel GC algorithm that should benefit from HTM. We start by describing the most important components of the baseline GC that are used or modified in our design.

7.2.1 Brooks forwarding pointer

The algorithm that essentially stands at the base of Java’s ConcurrentMarkSweep young generation collector (briefly presented in Section 4.1.3) is a variation on a classical incremental copying algorithm defined by Henry Baker [9]. Brooks [18] added an important optimization over the original algorithm, namely the Brooks forwarding pointer. Initially, Baker’s algorithm featured a costly read-barrier to test if an object needs to be forwarded. In order to avoid the barrier, the objects in a Brooks-style implementation are not referenced directly, but through an indirection field in their header. If the object is not yet copied, this field indicates its original address, in from-space. Otherwise, it indicates the address of the copy, in to-space. The indirection field is also called forwarding pointer. In other words, a forwarding pointer generally points to the primary copy of an object (see also Section 4.1.2). If the object is not forwarded, we say that the pointer is self-referential, i.e., it points to itself. The goal of a forwarding pointer is to directly lead the mutators to the primary copy of an object when the mutator still has a reference to the old copy. This approach allows for lazy updating of the references that point to the old version of a moved object. The cost of this optimization is that Brooks objects require additional space for the forwarding pointer. The ConcurrentMarkSweep collector in HotSpot already features a forwarding pointer in the object’s header. Since the old generation collector is not a copying collector, the forwarding pointer is mostly used when the young generation collector (ParNew) copies objects between survivor spaces or promotes them to the old generation. While this happens during a stop-the-world pause in the original implementation, the presence of the forwarding pointer is extremely important for the correctness of our concurrent transactional algorithm.
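
As a small illustration (a sketch under the assumptions above, not HotSpot code; the type and member names are hypothetical), the indirection works as follows: every access first dereferences the forwarding pointer, so a mutator holding a stale reference is transparently redirected to the primary copy.

struct BrooksObject {
    BrooksObject* forward;   // forwarding pointer: points to the object itself until it is moved
    // ... remaining header fields (class pointer, lock state, age, hash) and payload ...
};

// Every read or write goes through the indirection.
inline BrooksObject* primary_copy(BrooksObject* obj) {
    return obj->forward;
}

// When the collector copies an object, it redirects the old copy:
//   new_copy->forward = new_copy;   // the copy is its own primary version
//   old_copy->forward = new_copy;   // stale references are lazily forwarded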

7.2.2 Access barriers

Garbage collectors evolved to be concurrent with the main purpose of reducing pause times. However, concurrency also implies synchronization with the mutator, since the application threads would be allowed to work and perform changes at the same time as the concurrent GC threads. This calls for the need to synchronize the access activity of the two groups of threads so as to avoid races. This is done by guarding all mutator accesses (both load and store) with access barriers. The barriers between mutators and the heap can either trap and process read accesses (read-barriers), or write accesses (write-barriers). Not all GC algorithms have both types of barriers, the choice depending on multiple factors [56], such as:

• the frequency of read and write accesses;
• the frequency of barrier invocations;
• the amount of work done inside the barrier.

Usually the cost of read-barriers is greater than that of write-barriers. The latter are mostly used in the context of mark-sweep collectors, indicating where the marking should be done. The former are typically used with copying collectors, where they trap read accesses in order to copy objects to to-space. In our practical case, the barriers are supposed to identify the access to an object in the process of being moved by the GC, wait for the copy to complete, and redirect the mutator to the new location of the object through the forwarding pointer.

7.2.3 Algorithm design

We propose a new concurrent GC approach that uses transactional memory to improve GC responsiveness and, overall, shorten stop-the-world pauses. It involves several major alterations to the base copying algorithm. These modifications involve the concepts discussed in previous sections, i.e., forwarding pointers and access barriers. First of all, the access barriers, both read and write, are redefined to allow safe concurrent access for mutators as well as for the GC threads. For this purpose, the typical instructions defined by the barriers (loads and stores, respectively) are encapsulated in hardware transactions. In this approach, the execution of the GC is prioritized over mutators’ accesses. In other words, mutators trying to access an object in the process of being changed by the GC will always abort and retry. They will succeed in accessing the object when the GC action is finished. This behavior is facilitated by the use of transactional memory.

Algorithm 1 Access barrier.
1: function access-barrier(oop)
2:   Tx-begin
3:   data ← oop.metadata
4:   if is-busy(data) then
5:     Tx-abort
6:   end if
7:   value ← Load-field(f.field1)
8:   Tx-commit
9: end function

Algorithm 2 Busy-bit handling.
1: function copy-object(oop, size)
2:   newoop ← allocate(size)
3:   set-busy-atomic(oop)
4:   copy(oop, newoop, size)
5:   oop.forward-to(newoop)
6:   unset-busy-atomic(oop)
7:   unset-busy-atomic(newoop)
8: end function

Figure 7.1 – Pseudocode snippets illustrating relevant algorithm parts on the mutator side (Algorithm 1) and on the GC side (Algorithm 2).

Whenever an object is moved, the GC alters its forwarding pointer to refer to the new location. Thus, the forwarding pointer is an important indication of conflict between the GC and a mutator that would potentially try to access the old copy of the object. As such, reading the forwarding pointer right after starting the transaction is an essential step of the algorithm: this way, its memory address will be automatically monitored in hardware during the entire memory access of the mutator. HTM would notice any conflict with the GC and force the mutator to retry its access. Further on, we add a flag to each object, indicating if the object is currently being modified by the GC. We call it busy-bit. Before copying an object, the GC sets the busy-bit, indicating that the mutator must wait before accessing the object. The GC then copies the object and updates the forwarding pointer to the new location of the object. The busy-bit is unset, communicating to the mutator that it is now safe to access the primary (new) copy of the object. On the mutator side, we check the state of the busy-bit in the transactional barrier, before accessing the object. If the current object "is busy", we explicitly abort the transaction, letting the mutator retry when the copy is finished. Finally, if the current object is neither marked as busy nor moved by the GC in the meantime, the mutator is free to access it, concurrently with a GC that works in other memory areas. These steps are illustrated in Algorithm 1. The oop in the pseudocode stands for ordinary object pointer, denoting a managed pointer to an object. For simplicity, let us assume that an object conceptually consists of metadata and the actual contents. Both the forwarding pointer and the busy-bit are located in the metadata section of the object. Line 3 represents the part where we load the metadata of the object of interest, right after starting a transaction. Thus, we have the forwarding pointer monitored in hardware and we can also check for the busy-bit as described (line 4). Field1 represents an object field that needs to be loaded in the particular case presented in Algorithm 1. The accessed field is loaded into a register called value. If the transaction aborts, either explicitly or due to a conflict, the mutator retries it, after having observed that the busy-bit is unset. We expect aborts to be rare in practice. Algorithm 2 shows the necessary changes on the GC side, as explained above. The copying activity of the GC is announced by the enabled busy-bit. Care must be taken when disabling the flag: since the object is typically copied byte-by-byte to the new location, the new instance will be marked as busy from the start. Thus, the busy-bit must be disabled on both objects at the end, in order to allow mutators to access them (line 7).
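
To give a concrete flavor of how Algorithms 1 and 2 could map onto Intel RTM, the following C++ sketch uses the _xbegin/_xend/_xabort intrinsics from <immintrin.h> (compiled with RTM support, e.g., -mrtm). It is a simplified model with a toy object layout, written only under the assumptions of this chapter; it is not the HotSpot code, whose actual form is discussed in Chapter 8.

#include <immintrin.h>
#include <atomic>

struct Object {
    std::atomic<int> busy{0};   // busy-bit: 1 while the GC is copying this object
    Object* forward;            // forwarding pointer
    long field1{0};             // example payload field
    Object() : forward(this) {} // not yet moved: the forwarding pointer is self-referential
};

// Mutator side (cf. Algorithm 1): the field access runs inside a hardware transaction.
long transactional_read(Object* obj) {
    for (;;) {
        if (_xbegin() == _XBEGIN_STARTED) {
            Object* primary = obj->forward;                    // read early: monitored by HTM
            if (obj->busy.load(std::memory_order_relaxed))
                _xabort(0xff);                                 // GC is copying: abort and retry
            long value = primary->field1;
            _xend();
            return value;
        }
        // Aborted (explicitly or by a conflict with the GC): simply retry.
        // A real implementation would bound the retries and provide a fallback path.
    }
}

// GC side (cf. Algorithm 2): the busy-bit brackets the copy and conflicts with open transactions.
Object* gc_copy_object(Object* from, Object* to) {
    from->busy.store(1);          // any transaction that already read this object aborts
    to->field1 = from->field1;    // copy the payload (field by field in this sketch)
    to->forward = to;             // the copy is its own primary version
    from->forward = to;           // redirect future accesses
    from->busy.store(0);          // clear the flag on both copies (cf. line 7 of Algorithm 2)
    to->busy.store(0);
    return to;
}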

This approach has two immediate consequences:

• The algorithm should allow the moving phase of the collection to take place concurrently with mutator activity, due to the non-blocking synchronization through hardware transactions;

• In a practical implementation, we expect transactional operations to add some overhead, which might result in a loss of throughput. The overhead comes from the latency of starting and committing a hardware transaction (specifically, executing the XBEGIN and XEND instructions). It is machine-specific, but generally comparable to an uncontended atomic cmpxchg (compare-and-swap) operation.
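
The order of magnitude of this cost can be checked with a trivial stand-alone micro-benchmark along the following lines (an illustrative sketch only; actual cycle counts depend on the processor, and RTM support must be available, e.g., compiling with -mrtm). It times an empty hardware transaction against an uncontended compare-and-swap on a private variable.

#include <immintrin.h>
#include <x86intrin.h>   // __rdtsc
#include <cstdio>

int main() {
    const long N = 1000000;
    volatile long shared = 0;

    unsigned long long t0 = __rdtsc();
    for (long i = 0; i < N; i++) {
        if (_xbegin() == _XBEGIN_STARTED)   // empty transaction: only begin + commit
            _xend();
    }
    unsigned long long t1 = __rdtsc();

    for (long i = 0; i < N; i++)
        __sync_val_compare_and_swap(&shared, shared, shared + 1);   // uncontended CAS
    unsigned long long t2 = __rdtsc();

    std::printf("xbegin/xend: %.1f cycles/op, cmpxchg: %.1f cycles/op\n",
                (double)(t1 - t0) / N, (double)(t2 - t1) / N);
    return 0;
}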

7.2.4 Discussion

We need to systematically analyze any possible interleavings or interactions that could result in inconsistent or faulty outcomes. The algorithm has to identify and avoid conflicts between GC activity when copying an object and a memory access to the same object performed by a mutator. In this case, the mutator could access the old version of the object while it is copied, leading either to a read of a stale value, or to a write to the old version of an already copied object. For this reason, the application thread encapsulates any memory access into a transaction. We identify the following cases:

• The mutator and the GC work at the same time, on the same memory area: GC starts to copy the object before a transaction is started. In other words, lines 3–4 in Algorithm 2 happen before line 2 in Algorithm 1. In this case, we further find the following sub-cases:

– the whole GC function ends before reading object metadata in the access barrier: then there is no conflict, the mutator will follow the newly installed forwarding pointer and finish correctly;
– once the metadata is read inside the transaction, the mutator will eventually abort either when the forwarding pointer is installed, or when the busy-bit is unset. Upon retry, the mutator will be correctly forwarded to the new copy of the object;
– the busy-bit check is done while the GC is still in the copy function (line 4): the mutator will explicitly abort and retry.

• The mutator and the GC work at the same time, on the same memory area: the mutator starts a transaction and reads object metadata, then the GC starts to copy the object. This is equivalent to lines 3–4 in Algorithm 2 happening between line 3 and line 4 in Algorithm 1. Since the part of the object that contains the busy-bit is already monitored in hardware when the busy-bit is set by the GC, the alteration of this flag will cause the transaction to abort. Upon retry, the mutator will find the GC either busy (and therefore explicitly abort) or finished (thus correctly continuing the memory access);

• The mutator and the GC work at the same time, on the same memory area: the mutator starts accessing the object, then the GC function starts. In this case, the application thread finds the object not busy during the check, so it proceeds with the access. We identify the following potential interleavings:

– the mutator finishes before the GC sets the busy-bit (line 3 in Algorithm 2), i.e., before the object is altered in any way. This scenario implies a correct, successful execution on both sides;
– the GC sets the busy-bit before the transaction in the mutator is committed. The modification of the busy-bit, which is protected by the transaction, aborts the mutator and all the changes done during the memory access are rolled back. As before, the GC will first finish copying the object, then the mutator will be free to try again on the correct data.

• The mutator and GC work at the same time, but access non-overlapping memory areas. In this case, their executions can run correctly in parallel, each of them being able to finish their work successfully.

After carefully considering all the ways of interaction between application threads and GC in the proposed concurrent algorithm, we conclude that it provides a correct GC functionality, leaving no room for inconsistencies and faults.

7.3 Summary

This chapter provided a detailed description of a novel concurrent GC algorithm based on HTM. We built our solution on top of a typical concurrent GC algorithm. We started by bringing some insight into which components of the chosen base GC play an important role in our algorithm. We elaborated on forwarding pointers and access barriers. The transactional GC algorithm uses the former to identify conflicting accesses to objects, by enhancing the latter with transactions. In short, we presented an algorithm that defines a functional pauseless concurrent GC, where notable stop-the-world pauses are eliminated with the help of hardware transactions.


Chapter 8 Initial Implementation

Contents
8.1 Implementation details
    8.1.1 Java object’s header
    8.1.2 Busy-bit implementation
    8.1.3 Access barriers in the Java interpreter
    8.1.4 Access barriers in the JIT compiler
8.2 Early results
8.3 Summary

Before proceeding with the full implementation of a pauseless GC based on the algorithm presented in Chapter 7, we start by incrementally adding the relevant parts of the algorithm to our target base GC (ConcurrentMarkSweep). The main goal of our work is to answer the question of whether using HTM in GCs is a viable option. HTM only recently became part of mainstream processors, so indications on what scenarios would benefit most from it are scarce. As mentioned before, HTM seems to be a perfect match for tackling specialized synchronization problems, such as garbage collection. However, there is no guarantee that this is the case. Starting from the chosen base GC, a full implementation of a pauseless collector consists of the following steps:

1. change parts of the base implementation, as required by the algorithm;

2. add necessary pieces on top of the base implementation;

3. remove previous (blocking) synchronization from the base implementation.

Figure 8.1 – Bit format for an object header, big endian, most significant bit first.

The rest of this work focuses only on evaluating and optimizing the practical implementation of the aforementioned transactional algorithm (steps 1 and 2), in order to lead it to a state that makes a full implementation of the GC worthwhile. If the simple addition of algorithm pieces results in a significant overhead in practice, either because of the incipient form of HTM or for other practical reasons, then a full implementation (step 3 above) would be useless. More precisely, we have implemented the transactional barriers, both at the interpreter level and at the JIT level, but we have not modified the CMS algorithm itself: the young generation collector still stops the world, and the old generation collector is still concurrent without copying the objects. Modifying CMS to fully implement our concurrent moving algorithm would require removing all the GC safepoints and redesigning the initial marking (root identification) and remarking phases. Hence, it would demand a time-consuming implementation effort, whereas potential negative results could already indicate that it may only deliver inferior performance.

8.1 Implementation details

The main additions to the original CMS algorithm concern the busy-bit logic and the access barriers. The theoretical formulation of the transactional barriers seems to have an equally simple and relatively straightforward practical implementation (simply encapsulating each operation in transactions). However, in practice we had to overcome a few difficulties, further described in the sections that follow. The most notable issue is that the transactional algorithm requires read barriers, while, by default, HotSpot only injects write barriers.

8.1.1 Java object’s header

While reasoning about the transactional algorithm presented in Chapter 7, we imagined the objects as being composed of a metadata part, where information about the object is stored, and the actual payload. CMS objects have a conceptually similar composition: a header, which contains all the details necessary to access and manage the object, and the instance data. The header is further split into the mark word and metadata. The latter contains a pointer to the class instance of the object. The former has the fixed size of a memory word and describes the format of the object. The forwarding pointer is encoded in the mark word. Other information consists of the locking state of the object, age and hash value. The internal structure of the mark word varies based on multiple factors. We only consider objects on 64 bits, with no compressed pointers [2] or biased locking [32, 35]. We made the latter two choices in order to keep the implementation as simple as possible in terms of pieces that can interact or behave unexpectedly. We consider the 64-bit setting acceptable, since this is the norm nowadays both for servers and desktop computers. Figure 8.1 illustrates different mark word layouts. The promoted object layout describes the object after promotion. It mostly consists of a pointer to the promoted object. The last layout, showing a CMS free block, is specific to the objects that are not yet allocated or that are recycled. CMS does not have a compacting old generation collector. It manages free space through lists of free memory chunks, ordered by size, called freelists. When dead objects are collected, they are added to a freelist, as CMS free blocks.

8.1.2 Busy-bit implementation

A first approach to implementing the busy-bit involved using one of the unused bits in the mark word. In this way, reading the forwarding pointer and the busy-bit together inside the access barriers would have been straightforward. However, given the various layouts of the mark word, even after eliminating compressed pointers and biased locking, there is no common bit that can be reserved for the busy flag. Finally, we placed the busy-bit as a separate field of the object header, aligned immediately after the object metadata. This entails the following deviations from the theoretical algorithm: on the one hand, the busy-bit and the forwarding pointer are no longer together at the same address, hence they should be read separately after starting a transaction in the access barriers. On the other hand, reading the busy-bit in order to check its value later in the transaction implies monitoring the flag independently. The value of the busy-bit changes both at the beginning and at the end of the GC task, ensuring on its own the correctness guarantees discussed in Section 7.2.4. Thus, in the practical implementation, we can safely remove the forwarding pointer read inside the transaction and only rely on the busy-bit. Several new methods were added to the original implementation of the object header. First, we need to handle the values of the busy-bit. Accepted values are 0 and 1, and they are set atomically. We use atomic operations defined in an architecture-specific library in the OpenJDK source code. Since whenever we have to enable or disable the busy-bit its value changes to 1, respectively 0, regardless of its previous value, we used an atomic exchange operation instead of an atomic compare-and-swap (CAS). Internally, this essentially translates to the xchgl (val),dest assembly instruction, where (val) represents the contents of the variable holding the new value and dest is the destination, in our case the busy-bit. This instruction implicitly locks the bus while the operation takes place. The new value is returned. The busy-bit has to be initialized to 0, i.e., no object is considered busy when it is first allocated. For this, we made sure to identify all types of allocated objects (which can be roughly split into normal objects and array objects) and initialize the bit as appropriate. Another important method that needed to be defined returns the offset of the busy-bit inside the header. It is based on an already existing function offset_of(class, field) that computes the offset of a field in a given class using pointer arithmetics. We will further explain the purpose of this method in the next sections.
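
To summarize these header helpers in code, the following simplified C++ sketch mirrors what is described above. The layout, the names and the use of GCC atomic builtins are only illustrative assumptions; the actual OpenJDK code uses its own Atomic and offset_of facilities and a richer header layout.

#include <cstddef>
#include <cstdint>

// Simplified object header with the extra busy field placed right after the metadata.
struct oopDesc_sketch {
    uintptr_t mark;          // mark word (forwarding pointer, lock state, age, hash)
    void*     klass;         // metadata: pointer to the class instance
    volatile int32_t busy;   // the new busy-bit, 0 (free) or 1 (being copied)

    // Enable/disable the flag with an atomic exchange (a lock-prefixed xchg on x86),
    // regardless of its previous value.
    void set_busy()   { __atomic_exchange_n(&busy, 1, __ATOMIC_SEQ_CST); }
    void unset_busy() { __atomic_exchange_n(&busy, 0, __ATOMIC_SEQ_CST); }

    // Offset of the busy-bit inside the header, used by the access barriers.
    static int busy_offset_in_bytes() {
        return (int) offsetof(oopDesc_sketch, busy);
    }
};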

Listing 8.1 – Interpreter: busy-bit handling.
 1 static void do_oop_store(Address dest, Register oop, /*other parameters*/){
   // ....
 2   txbegin(L_fallback);
 3   Address busy_val(oop, oopDesc::busy_offset_in_bytes());
 4   movl(rdx, busy_val);
 5   cmpl(rdx, 0x1);
 6   jcc(Assembler::equal, L_retry_label);
     // copy the oop to destination
 7   txend();
   // ....
 8   bind(L_retry_label);
 9   xabort();
   // ....
10 }

8.1.3 Access barriers in the Java interpreter

HotSpot JVM takes advantage of a template-based interpreter, which contains the translation between bytecode and assembly. Basically, it generates assembly instructions for each bytecode at runtime. Then, it executes any requested bytecode based on the generated template table. The template table is architecture-specific. While a typical interpreter implementation featuring a sort of switch statement would be significantly slower, the template-based implementation has considerable complexity, measured both in code size (separate implementations for different configurations being necessary) and in the code’s inner complexity (support for dynamic code generation makes debugging and improvement notably difficult). It is important to note that the code that populates the template table is still typical C++ code, while the instructions with which the table is populated are in assembly language. In order to make it easily readable, the translator provides C++ functions that transparently implement the associated assembly instructions. For example, the C++ function void Assembler::movl(Register dst, Address src); is implemented as

emit_int8((unsigned char)0x8B); emit_operand(dst, src);

These operations directly emit the assembly instruction associated with a simple mov between a register and an address. We worked directly on the code of the template table, the lowest level of interaction with the application threads. First, we modified the main store operation (called do_oop_store) to handle the necessary parts of our transactional algorithm. The method previously contained various barriers, depending on the GC employed. Since we worked at the assembly code level, the busy-bit could not simply be read and tested as in a high-level language. Listing 8.1 shows relevant code snippets. The lines were renumbered compared to the original code, for better visibility. Excluded lines are marked with comments and dots, mostly consisting of variable initialization, saving and retrieving registers on or from the stack, etc. Thus, we start our

Listing 8.2 – Interpreter: starting a transaction.
 1 void txbegin(Label& L_fallback){
   // push registers on the stack
 2   movl(r8, RTMRetryCount);
 3   bind(L_rtm_retry);
 4   if (RTMRetryCount > 0){
 5     decrementl(r8);
 6     testl(r8, r8);
 7     jcc(Assembler::notZero, L_tx);
       // restore registers
 8     jmp(L_fallback);
 9   }
10   bind(L_tx);
11   xbegin(L_rtm_retry);
   // restore registers
12 }

Listing 8.3 – Interpreter: committing a transaction.
 1 void txend(){
 2   xtest();
 3   jcc(Assembler::zero, L_no_tx);
 4   xend();
 5   bind(L_no_tx);
 6 }

example at the beginning of the hardware transaction (line 2). We find the exact address of the busy-bit by computing its offset with the aforementioned specialized static method. We save the word found at that address in a register (called rdx in this example) and test if it is equal to 1. The cmpl operation is equivalent to a subtraction whose result is discarded. If the two operands are equal, the zero flag is set to 1 and the jump condition in line 6 evaluates to true. If the busy-bit is found to be 1, the transaction should retry. Therefore, depending on the outcome of the cmpl instruction, we either continue with the store operation, or jump to L_retry_label (line 6). Lines 8–9 show that, if we found the busy-bit enabled, we explicitly abort the transaction, roll back the changes, and cause the mutator to restart the transaction. If the busy-bit is not enabled, we proceed with the store operation and commit the transaction (line 7). Functions txbegin() and txend() in Listing 8.1 are wrappers over the native functions defined by the assembler that handle transactions, xbegin() and xend(). The implementation of txbegin() is illustrated in Listing 8.2. The native implementation of xbegin() takes a label as parameter (see line 11). The label should point to a location in the code where the execution can continue in case of abort. This allows the developer to specify a fallback path, usually required for HTM. In our case, we bind the label before the call to xbegin(), so that the transaction restarts as needed. We then check if a constant specifying the number of retries, i.e., RTMRetryCount, was defined (line 4). If not, we retry for an unlimited number of times. Given that in our case there is virtually no reason for infinite aborts (the transaction cannot overflow and there are no forbidden instructions inside), this implementation would be sufficient. However, we also provide a mechanism that allows for a fallback path in case of excessive aborts. Thus, if a value is defined for our constant, we save it into a register and decrement the value in the register each time the transaction restarts (line 5). While the retry count is greater than 0, the program jumps to the xbegin() instruction and keeps trying to start a transaction. If the number of retries is exhausted, we jump to the fallback path, hence avoiding starting a transaction altogether. The txbegin() function receives as parameter a label that points to a potential fallback path. Since our implementation does not yet remove the original synchronization mechanisms of the base GC, and we expect aborts to be rare in practice, at this point our fallback consists of letting the program follow its normal execution, instead of the transactional one. However, if all classical synchronization of the GC is relaxed, more complex fallback path code may be required. Similarly, Listing 8.3 shows the implementation of the txend() function. It basically consists of a check whether a hardware transaction is running (line 2). If yes, it commits the current transaction by calling the native function xend(); otherwise, it means that the transaction was aborted and a potential fallback path was followed, so the function simply returns. A similar implementation was used for load operations. However, since in the original GC implementation there are no read-barriers injected in the code, we defined our own do_oop_load() function to handle the transactional barriers.
Then, we pointed all individual load wrapper functions at the newly defined function. More precisely, instead of calling the native implementation of the load function, they call do_oop_load(), which, in turn, calls the assembler load implementation encapsulated in transactional barriers. Otherwise, there are no major differences from the store operation implementation presented in Listing 8.1.

8.1.4 Access barriers in the JIT compiler

In addition to the interpreter, the HotSpot JVM also benefits from a just-in-time (JIT) compiler. The JIT compiler provides faster and more efficient machine code generation. It typically handles "hot", performance-critical methods by optimizing their execution. Basically, the methods meeting these criteria are compiled directly to machine code and are no longer interpreted at runtime. The JVM identifies which methods would benefit the most from being compiled by continually monitoring the code and looking at method entry and loop back-branch counts. The threshold over which a method or loop is considered performance-critical is set at run-time. The HotSpot JVM features two different compilers:

• C1 (client) compiler: recommended for applications with a graphical interface, it provides a quick startup and strong optimizations;

• C2 (server) compiler: typically used for long-running server-side applications, it provides even more aggressive optimization and better overall performance.

The two modes of operation can be individually enabled at run-time, or they can be used together in what is called tiered compilation. We implemented our transactional barriers on top of the C2 compiler. The reason for this decision is the end goal of testing on real-life large-scale server applications, such as Cassandra DB. While the client compiler relies on basic blocks for building a control flow graph, the server compiler uses a data structure as intermediate representation (IR) [75]. The resulting graph is more complex, having both data and control dependence edges from the definition of a value to its usage. The compiler starts by parsing the bytecode and creates a platform-independent representation, the Ideal graph. A number of optimizations are applied at this stage, e.g., global value numbering, dead code elimination, etc. The next phase splits the Ideal graph into subtrees and creates architecture-specific files containing instruction definitions and their costs. The costs are then used for ordering instructions inside basic blocks. The basic blocks that are thus created are used for building a control flow graph, called the MachNode graph. The next step is register allocation, which assigns physical registers through a graph-coloring algorithm. Finally, machine code is emitted by iterating over the newly-formed graph.

Listing 8.4 – Transactional barriers specification in the architecture-description file.
 1 instruct membar_txbegin() %{
 2   match(MemBarTxBegin); // link to the associated MemBarTxBeginNode class
 3   format %{ "xbegin L_rtm_retry\n\t"
 4             "L_rtm_retry: (membar)\n" %}
 5   ins_encode %{
       // define labels and save registers
 6     // the same fallback logic as in the Interpreter (see Listing 8.2)
 7     xbegin(L_rtm_retry);
       // restore registers
 8   %}
 9   ins_pipe(pipe_slow);
10 %}

12 instruct membar_txend() %{
13   match(MemBarTxEnd); // link to the associated MemBarTxEndNode class
14   format %{ "xend (membar)\n" %}
15   ins_encode %{
       // define labels and save registers
16     xtest();
17     jccb(Assembler::zero, L_no_tx);
18     txend();
19     bind(L_no_tx);
20   %}
21   ins_pipe(pipe_slow);
22 %}

To install our transactional barriers into the C2 compiler, we modified its parsing phase and the architecture-specific file that holds instruction definitions, and added new nodes to the Ideal graph. The implementation of the transactional barriers in the JIT is less elaborate than the one in the interpreter. Given the complexity of the C2 compiler, we first completed the barriers in an incipient form, without busy-bit handling or explicit aborts. This served the purpose of estimating the potential cost of these barriers even in their simplest form. We allocated two new graph nodes for our barriers. All nodes are implemented as instances of a subclass of the Node class. The edges between nodes are implemented simply as pointers. The inputs of each node are represented as an array of Node pointers. There is only one output, consisting of either a single value or a tuple of results. The compiler already had nodes of the type MemBarNode, employed for executing memory barriers. We extended this type and defined MemBarTxBeginNode and MemBarTxEndNode, simply by further calling the constructor of a typical MemBarNode (a sketch of these declarations is given after the list below). The barriers are inserted during the parsing phase of the compiler. More precisely, when the bytecodes for load and store operations are encountered and the Ideal graph is built, we add our custom Nodes. The instructions executed by the new Nodes are defined in an architecture description file, using a specific language. We modify the description file for the AMD64 architecture. The file generally contains definitions for registers, encoding classes, necessary C++ functions, operands, resources, the architecture’s pipeline behavior and all kinds of instructions (branch, move, add, etc.). The description of instructions has the following configuration blocks:

• match – states which machine-independent subtree may be replaced by this instruction;

• ins_cost – the estimated cost of this instruction is used by instruction selection to identify a minimum cost tree of machine instructions;

• format – a string describing the disassembly for this instruction;

• ins_encode – the code executed by the defined instruction, given as a list of encoded classes with parameters;

• ins_pipe – specifies the stages in which input and output are referenced by the hardware pipeline.
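
As a minimal sketch of the node declarations mentioned above (an assumption-laden illustration, not the HotSpot sources: the real MemBarNode derives from MultiNode and comes with registration macros omitted here), the two new Ideal-graph classes simply reuse the existing MemBarNode constructor:

class Compile;
class Node;

class MemBarNode /* : public MultiNode in HotSpot */ {
public:
    MemBarNode(Compile* C, int alias_idx, Node* precedent);
};

class MemBarTxBeginNode : public MemBarNode {
public:
    MemBarTxBeginNode(Compile* C, int alias_idx, Node* precedent)
        : MemBarNode(C, alias_idx, precedent) {}
    virtual int Opcode() const;   // matched by the membar_txbegin instruct in Listing 8.4
};

class MemBarTxEndNode : public MemBarNode {
public:
    MemBarTxEndNode(Compile* C, int alias_idx, Node* precedent)
        : MemBarNode(C, alias_idx, precedent) {}
    virtual int Opcode() const;   // matched by the membar_txend instruct in Listing 8.4
};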

Our implementation of the transactional barriers in the JIT is illustrated in Listing 8.4. We omit from this listing some of the parts already explained in previous sections. Blocks of instructions are defined between %{ – %} pairs. First, we link each instruction to its previously defined Node (line 2 and line 13, respectively). Then we define the contents of the instruction block (lines 5–8 and 15–20). Finally, we instruct the pipeline to force serialization with the pipe_slow class. Based on this file, our transactional barrier code is generated for every load and store operation compiled by the JIT.

8.2 Early results

Having a first implementation of our transactional barriers ready, we briefly test this version in order to learn what performance penalty the barriers introduce. For testing we use an Intel i7-5960X CPU @ 3.00 GHz (Haswell) with fully integrated HTM support. We experiment with the DaCapo benchmark suite. Specifically, we run the subset of benchmarks that we found to be stable (see Section 6.1.2). All selected benchmarks have small variations (less than 5 % in most cases) both for the total execution time and for the final round. One exception is the batik benchmark, which is only stable for the total execution time, varying by more than 10 % for the final iteration. Since we only focus on the execution time of the actual run of the benchmark (the final round), we remove this application from our subset. We start by measuring, in interpreter mode, the overhead that the read and write barriers introduce on the execution time of the last iteration of each benchmark with the new implementation. We gather statistics using the perf stat Linux tool for the following events: tx-start, tx-abort, cycles and cycles-t. The command outputs the counter values for how many transactions were started during the benchmark execution, how many of these were aborted, the total number of CPU cycles and the number of cycles spent inside transactional regions. Based on the first two counters we compute the percentage of aborts for each benchmark, while from the last two we obtain the percentage of transactional cycles (the exact formulas are given below). We execute each benchmark 20 times. We measure the overhead in the following situations:

• transactional barriers are disabled (the default access barriers are in place), vanilla Con- currentMarkSweep is running;

• only write transactional barriers are enabled (there is no read barrier, same as in the default implementation);

• only read transactional barriers are enabled;

• both types of transactional barriers are enabled.

Table 8.1 – Independent overhead for transactional read and write barriers (interpreter).

Benchmark            h2       tomcat   xalan   jython   pmd     luindex
Execution time (s)
  no-tx              127.67   8.44     5.05    76.97    8.04    11.82
  write-tx           128.79   8.48     5.1     79.19    8.41    11.97
  read-tx            172.87   9.56     6.11    94.33    9.49    15.13
  all-tx             174.77   9.8      6.18    96.72    9.86    15.33
Overhead (%)
  no-tx              –        –        –       –        –       –
  write-tx           0.88     0.49     0.99    2.88     4.6     1.26
  read-tx            35.4     13.33    21.01   22.55    18.04   28.06
  all-tx             36.89    16.16    22.51   25.65    22.65   29.74
Tx-cycles (%)
  no-tx              –        –        –       –        –       –
  write-tx           2.93     2.23     1.75    4.96     6.85    3.52
  read-tx            49.52    30.4     27.41   35.58    27.97   39.95
  all-tx             50.93    31.52    28.18   37.98    30.88   41.99
Tx-aborts (%)
  no-tx              –        –        –       –        –       –
  write-tx           0.12     0.38     0.67    0.03     0.25    0.05
  read-tx            0.04     0.75     0.74    0.02     0.34    0.06
  all-tx             0.02     0.7      0.69    0.04     0.32    0.01
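
Concretely, the derived percentages reported in Table 8.1 follow directly from the raw counters described above (a restatement of the text, not an additional measurement):

    aborted transactions (%) = 100 × tx-abort / tx-start
    transactional cycles (%) = 100 × cycles-t / cycles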

Table 8.1 illustrates the overhead of the read/write barriers for the tested benchmarks. The reported values are obtained as the mean over 20 runs. First of all, we observe that the write barrier has an almost negligible effect on the performance of the application in interpreter mode (less than 5 % in all cases). However, the read barrier raises the overhead to an unacceptable level. The simple addition of the read barriers increases the execution time by more than 10 %, going up to 35.4 % overhead. This result is closely connected to the percentage of CPU cycles executed in transactional regions. When the read barriers are enabled, more than 25 % of the total cycles are inside transactions. In the case of the h2 benchmark, around 50 % of the cycles are transactional, which leads to an execution time ∼35 % longer than with the original implementation. We also notice that the proportion of transactional aborts is close to 0 in all cases. This result confirms our assumption that aborts would be a rare event in our approach. However, it also indicates that the overhead we observe comes from the implementation of the algorithm itself, rather than from the amount of aborts. In fact, this outcome was to be expected, since we added extra barriers for all load accesses. We also have to take into account that the content of the transactions is relatively small, e.g., a simple load instruction. The consequence is twofold: on the one hand, there are no overflow aborts; on the other hand, the contents of the transaction may not entirely amortize the cost of starting and committing it. These observations, together with the additional barriers inserted in the code, lead to a prohibitively large overhead. Figure 8.2 illustrates more visibly the difference between the overhead introduced by the read barriers and the write barriers for all DaCapo benchmarks (on the X axis). The overhead is marked on the Y axis.

35 Read TX barriers Write TX barriers 30 25 20 15

Overhead (%) 10 5 0 h2 pmd xalan jython tomcat luindex Benchmarks Figure 8.2 – Overhead of transactional read barriers and write barriers, enabled individually. marked on the Y axis. Each of the stacked bars represents the value of the overhead for a type of barriers (they are not cumulative). We only show the overhead of read and write barriers when enabled separately. We observe that the overhead of write barriers is mostly negligible compared to that of read barriers.

8.3 Summary

There are two different approaches for implementing a GC with HTM support: one can instrument either the GC or the application accesses. In our algorithm we followed the latter approach and replaced the blocking access barriers of the mutators with transactions. We built our implementation on top of HotSpot's ConcurrentMarkSweep collector. We implemented the transactional barriers in the interpreter and in the JIT compiler. Before embarking on fully implementing a new GC, we focused on assessing the overhead introduced by our transactional implementation. The goal was to verify whether the approach is feasible. We performed preliminary testing on a widely-used benchmark suite, DaCapo, using the most stable benchmarks in terms of execution time. We found that by encapsulating all read and write accesses in transactional barriers, the overhead goes up to 37 % in interpreter mode. The overheads indicated by our early experiments would make a transactional GC inferior to any existing GC from the start. As such, we considered it critical to further optimize the performance of the transactional barriers before continuing with a complete and thorough evaluation.

Chapter 9 Optimized Implementation

Contents
9.1 Volatile-only read barriers ...... 69
  9.1.1 Preliminary observations ...... 70
  9.1.2 Optimization details ...... 71
  9.1.3 Optimization validation ...... 71
9.2 Selective transactional barriers ...... 72
  9.2.1 Algorithm ...... 72
  9.2.2 HotSpot JVM internals ...... 73
  9.2.3 Implementation details ...... 75
9.3 Summary ...... 75

Preliminary testing of the first version of transactional barriers indicates a significantly increased overhead over the original implementation of the base Java GC, ConcurrentMarkSweep. At this point, a fully-working GC implementation featuring transactional barriers would clearly have much lower performance than any state-of-the-art GC. Therefore, we further focus on improving the transactional barriers with the aim of reducing the overhead to an acceptable level. This chapter presents two optimizations that we add incrementally to the current implementation. The optimized barriers are evaluated in the next chapter.

9.1 Volatile-only read barriers

Previous observations indicated that the extra read barriers added to the original implementation are the main source of latency for the new GC (see Section 8.2). As such, we first attempt to decrease their overhead. We leave write barriers aside at this point, since their overhead is already negligible.

Listing 9.1 – Interpreter: volatile load check.
 1 movl(reg, flags);
 2 shrl(reg, is_volatile_shift);
 3 cmpl(reg, 0x1);
 4 jcc(Assembler::notEqual, L_notVolatile);
 5 do_oop_load(dst, src, 0x1);
 6 jmpb(L_afterVolatileSet);

 8 bind(L_notVolatile);
 9 do_oop_load(dst, src, 0x0);

11 bind(L_afterVolatileSet);

There are various characteristics that could influence the performance of transactional read barriers: their size, their contents, and their frequency. Out of these, the last one is the only feature that can still be improved. The other two have already been addressed when the barriers were first implemented: the transaction typically encapsulates the smallest unit possible, i.e., a simple load machine instruction. Consequently, we tried to optimize (i.e., to reduce) the frequency of the transactional read barriers.

9.1.1 Preliminary observations

The transactional algorithm presented in Chapter 7 assumes access barriers for all load and store operations. This appeared as necessary in a multi-threaded context, in order to avoid stale values. More precisely, when the GC works concurrently with the mutators, there is a possibility that the mutator accesses an old copy of an object that is just being moved by the GC at the same time. Our current algorithm design does not permit this kind of outcome. The Java memory model (JMM) [43, Chapter 17] represents a set of specifications describing threads' interaction with the memory. It provides strict rules concerning single-threaded executions, as well as multi-threaded scenarios. In the case of a single thread, it allows for any kind of compiler optimizations, instruction reordering, etc., as long as execution respects as-if-serial semantics. When multiple threads are involved, the rules are more complex, accounting for the fact that threads can have different views of the same data if not synchronized properly. Thus, the JMM defines concrete guidelines on what values are allowed to be returned when data is read. Given the practical implementation of our algorithm in the HotSpot JVM, its execution needs to comply with the JMM. At this point, with all the barriers in place as required by the algorithm, the current implementation enforces a stricter order than required. The JMM dictates that only volatile reads must preserve the order of the operations, i.e., they must always return the last value written to a variable. All other loads are also allowed to return an older value. In most applications, volatile reads represent only a fraction of the total read accesses. Thus, inserting transactional read barriers only for volatile reads considerably reduces the incidence of additional barriers compared to the original implementation, while still respecting the JMM.

9.1.2 Optimization details

We modify our first implementation to identify volatile accesses for all load operations. We start with the alteration of the interpreter. The new do_oop_load() function receives an additional argument that indicates whether the current read is volatile (argument equal to 1) or not (argument equal to 0). The volatile check illustrated in Listing 9.1 is added to the current implementation to call the load function with the correct argument value. The volatile state is encoded as a flag. In this additional code snippet we retrieve the flags associated with the object of interest and obtain the value of the volatile flag (by executing a bitwise shift with a predefined number of positions, called is_volatile_shift), as shown in line 2. Then, the state of the flag is checked. If it is not set, the execution jumps further in the code and calls the load function with the volatile parameter set to 0 (lines 8 – 9). If we are dealing with a volatile read, the execution continues without branching (line 5), loads the object while protecting it with transactional barriers, and jumps to the end of the snippet. Thus, with this mechanism in place for all object loads, we reduce the occurrence of the transactional read barriers to a fraction of what it was in the first implementation. Implementing the volatile optimization in the JIT compiler is even more straightforward than in the interpreter. During the parsing phase, volatile checks are already in place for loading and storing fields of objects. These checks internally verify the same flags that we manually fetch for the check in the interpreter. We employ the available methods to check if a particular load operation applies to a volatile field and only insert transactional barriers if this is the case.
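The resulting decision logic is simple. The following minimal, self-contained sketch illustrates it; this is not HotSpot's actual parser code, and FieldDesc together with the two emit_* helpers are hypothetical stand-ins for the compiler's field descriptor and code-emission routines.

#include <cstdio>

// Sketch of the volatile-only decision: emit the transactional barrier
// only for volatile field loads, leave all other loads untouched.
struct FieldDesc {
    const char* name;
    bool is_volatile;
};

static void emit_transactional_load(const FieldDesc& f) { std::printf("tx load    %s\n", f.name); }
static void emit_plain_load(const FieldDesc& f)         { std::printf("plain load %s\n", f.name); }

void compile_field_load(const FieldDesc& f) {
    if (f.is_volatile)
        emit_transactional_load(f);   // rare case: HTM-protected load
    else
        emit_plain_load(f);           // common case: no extra barrier
}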

9.1.3 Optimization validation

Our optimization is based on the idea that volatile reads are significantly fewer than non-volatile reads. In order to validate this assumption and the efficiency of the optimization, we compare the amount of transactional cycles with and without the optimization, as well as the number of started transactions. We measure these features in the same conditions as the preliminary experiments on the unoptimized version of the transactional barriers (see Section 8.2). More precisely, we test on the DaCapo benchmarks, with one iteration per benchmark. We run the benchmarks on an Intel Core i7-5960X (Haswell) CPU at 3.00 GHz and 32 GB RAM. We only showcase the interpreter implementation. Figure 9.1 shows the aforementioned transactional features for the first version of the transactional barriers (i.e., inserted for all read accesses) and in the case of the volatile optimization. More precisely, Figure 9.1(a) illustrates the number of transactions started in each case, while Figure 9.1(b) represents the percentage of CPU cycles that are executed inside transactions. We choose to represent the data on a logarithmic scale (on the Y axis in both figures) because of the considerable difference between the results in the two implementations. The volatile optimization reduces by 2–3 orders of magnitude both the number of transactions and the percentage of transactional cycles. Depending on the frequency of volatile operations for each application, some benchmarks exhibit a negligible (close to 0) percentage of cycles in transactional regions. Another interesting observation is that, while the benchmark results are comparable for transactional barriers for all reads, they become significantly different when only volatile reads are taken into account. This result is not surprising: since the benchmarks are of similar size, they execute read accesses in numbers of the same order of magnitude; however, the amount of volatile loads depends on the internal structure of each application. Hence the variation between benchmarks.


Figure 9.1 – Transactional parameters when inserting read-barriers for all reads vs. for volatile reads only (Java interpreter).

This quick evaluation confirms that the proposed optimization greatly reduces the incidence of the transactional read barriers in the execution of the application, while still respecting the JMM. Thus, we expect the overhead of our algorithm to decrease proportionally.

9.2 Selective transactional barriers

We further try to improve the performance of transactional barriers with another optimization. We propose selective access barriers to reduce the remaining overhead of the previous algorithm to an adequate value. We call them selective because they select a particular path depending on the state of the execution. Essentially, the path is chosen based on whether the GC is running or not. This optimization is orthogonal to the one presented in Section 9.1. Its main goal is to further reduce the frequency of transactional barriers to a minimum: they are entirely removed during typical stand-alone application execution, while being inserted during concurrent execution with the GC for volatile reads only. The expected result is close-to-zero overhead when the GC is not running and a manageable overhead otherwise, amortized by the lack of stop-the-world pauses when collecting.

9.2.1 Algorithm

The general idea is straightforward and easy to integrate with the previous version of the algorithm. We define a global flag called the in-collection flag. When the GC becomes active, we set the in-collection flag, wait for each thread to acknowledge the change, then proceed to the collection. On their side, application threads check the value of the flag before each load or store operation. Based on this value, they choose to follow either the slow path (featuring transactional barriers) or the fast path (normal execution). However, their acknowledgment when the flag is set is critical for the correctness of the algorithm. Let us consider the following scenarios:

• GC sets the in-collection flag and immediately proceeds to copy the object. It is possible that some of the mutators just passed the aforementioned in-collection check and found the flag unset.

In this case they will continue on the fast path. However, the fast path is not suitable for concurrent execution with the GC, since mutator accesses are not protected in any way from interacting with the collection. Thus, this scenario is error-prone and could lead to inconsistent results;

• GC sets the in-collection flag, waits for acknowledgment from mutators, then copies the object. In order for all threads to be aware of the modification, they are forced to traverse a short safepoint. The safepoint is launched by the GC before it starts operating on objects. The GC only starts copying the objects after the safepoint ends, when mutators are also allowed to continue their work concurrently. At the time when mutators are released, the in-collection flag is already set and visible for all threads. This measure guarantees that the application threads will take the slow path whenever they work concurrently with the GC. Then, when the GC finishes its intervention, we have the following sub-cases for unsetting the in-collection flag:

– GC unsets the in-collection flag and waits for acknowledgment from the mutators. This corresponds to introducing another short safepoint at the end of the GC activity. Similar to enabling the in-collection bit, the GC unsets it, stops the world, and at the end of the pause the new value of the flag is visible to all mutators. Thus, mutators can directly resume the usage of the fast path;

– GC unsets the in-collection flag and stops there, without waiting for acknowledgment. Mutators are not expected to notice the value change immediately. They are allowed to continue on the slow path for another few iterations before all of them go back to the fast path. This approach is correct, since it translates to extra synchronization, rather than a lack thereof (as is the case when enabling the flag without a safepoint).

After carefully analyzing the above options, we select the following combination: when setting the in-collection flag, we force all threads to traverse a short safepoint, so that they are all aware of the modification; when unsetting it, we skip the safepoint, since the correctness of the algorithm does not require this check point at the end of the collection and we consider the potential overhead of a few extra slow-path invocations to be smaller than that of a safepoint. After all of the application threads "checked in", they can continue running concurrently with the GC using the slow path featuring transactional barriers. Application threads need to check if the flag is set before every load and store access, so that they take the correct path. However, the flag typically stays in the cache of the cores, hence this check does not introduce a significant overhead.
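To make the fast-path/slow-path split concrete, the following stand-alone C++ sketch shows the general shape of a selective read barrier; it is not the HotSpot code. The in_collection variable plays the role of the global in-collection flag toggled by the GC at a safepoint, the RTM intrinsics (_xbegin()/_xend()) stand for the transactional barrier, and a mutex stands in for the original blocking barrier used as fallback after an abort.

#include <immintrin.h>   // RTM intrinsics: _xbegin(), _xend(), _XBEGIN_STARTED
#include <atomic>
#include <mutex>

std::atomic<long> in_collection{0};   // set/cleared by the GC at a safepoint
std::mutex fallback_barrier;          // stand-in for the original blocking barrier

template <typename T>
T selective_read(const T* addr) {
    // Fast path: the GC is not running, perform a plain load.
    if (in_collection.load(std::memory_order_relaxed) == 0)
        return *addr;

    // Slow path: the GC runs concurrently, protect the load with HTM.
    if (_xbegin() == _XBEGIN_STARTED) {
        T value = *addr;   // the read set is monitored in hardware
        _xend();
        return value;
    }

    // Abort: fall back to the original (blocking) access barrier.
    std::lock_guard<std::mutex> guard(fallback_barrier);
    return *addr;
}

On the fast path, the only extra cost is the load of the flag, which normally hits in the cache, matching the reasoning above.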

9.2.2 HotSpot JVM internals

The first step towards the implementation of the selective barriers was an in-depth study of the VM internals and GC functioning. This is necessary in order to accurately identify when the GC is considered to be working, how it is launched and stopped, and at what level it is best to insert our modifications. In a nutshell, all GC operations (and any other operation that needs to be run at a safepoint) are executed by the VM thread. There is only one VM thread running at any point during the execution of the application. Application threads can request the VM thread to execute a particular operation, blocking until it is done. The VM thread employs a synchronized queue to schedule the operations, inserted in the queue in no particular order (i.e., sequential requests from different threads can be interleaved).

Listing 9.2 – VM thread main loop.
 1 void VMThread::loop() {
 2   while (true) {
 3     op = VMOperationQueue.get_next();
 4     if (op.runs_at_safepoint()) {
 5       SafepointSynchronize::begin();  // start safepoint
 6       evaluate_operation(op);
 7       SafepointSynchronize::end();    // end safepoint
 8     }
 9     else
10       evaluate_operation(op);
11     VMOperationRequest_lock.notify();
12   }
13 }

Listing 9.3 – GC operations scheduling in the VM thread.
1 void VMThread::execute(VM_Operation op) {
2   op.doit_prologue();
3   VMOperationQueue.add(op);        // schedule operation
4   VMOperationRequest_lock.wait();  // wait to be released by the VM thread
5   op.doit_epilogue();
6 }

Then, in a loop lasting as long as the application is running, the operations are removed from the queue and executed according to their requirements: if a safepoint is needed, all mutators are subsequently suspended and the current operation is executed in isolation; otherwise, the operation runs concurrently with the other threads. The pseudocode in Listing 9.2 shows these steps executed by the VM thread. All GC-related operations that can be scheduled by the VM thread require a safepoint. Typically, scheduling an operation by a Java thread follows the sequence of steps in Listing 9.3. The execution of GC operations is surrounded by helper functions called the prologue and epilogue of the execution (lines 2 and 5). They are needed for specific settings before and after the collection. The mutator schedules its operation in the queue of operations and waits until the operation is evaluated. It is released by the VM thread after the evaluation (Listing 9.2, line 11). The GC operation corresponding to the collection that we want to make concurrent (the young generation moving objects between spaces or promoting them to the old generation) is called VM_GenCollectForAllocation. It is scheduled by an application thread when it is unable to allocate memory, i.e., when the allocation request is bigger than the available space in the young generation. It is associated with the GC cause called _allocation_failure and it usually triggers a young generation collection. Along the path from scheduling the operation to the actual GC collection, the execution passes through multiple phases: while mutators are suspended, the ParNew collector launches multiple tasks and worker threads; the GC threads first process the roots and then try to copy the objects to a survivor space. If a copy does not succeed, the object is promoted to the old generation.

Listing 9.4 – In-collection flag check.
1 movptr(rdx, ExternalAddress((address) &(TxGCGlobal::inCollection)));
2 cmp(rdx, 0);
3 jcc(Assembler::equal, L_continue_label);

9.2.3 Implementation details

We define a new class covering the necessary logic behind the in-collection flag. The flag is a field of the class and is handled through a few member methods that initialize, get and update it as required. Then, we consider the two levels where the in-collection flag logic has to be added: the GC side and the application side (both in the interpreter and in the JIT, as before). As previously mentioned, the execution of the VM_GenCollectForAllocation GC operation is delimited by a prologue and an epilogue method. They further call two functions named notify_gc_begin() and notify_gc_end(), which clearly mark the starting and final points of the collection. We piggyback our solution on these methods, by setting the in-collection flag in the former and unsetting it in the latter. This way, we make sure that the in-collection flag is enabled exactly during the GC execution, neither sooner nor later. The application threads check the in-collection flag for every load and store operation. The assembly language implementation is similar for the interpreter and the JIT. Listing 9.4 shows how the check is implemented. This fragment is inserted at the beginning of the load and store barriers (Listing 8.1), in order to jump over their execution when the GC is not active, i.e., to take the fast path. The value of in-collection is obtained in line 1. Threads can safely check the value concurrently, without the risk of finding an inconsistent value during the GC execution, since the flag is only modified by the VM thread at a safepoint. If the mutator finds the flag set to 0 (no GC is running), it avoids the barrier and advances directly to the memory access operation (line 3). This implementation guarantees the removal of any transactional overhead when the GC is not active. When the GC is running, the algorithm described in Chapter 7 allows the application threads to continue their execution and ensures correctness. It follows that the same number of operations are spread uniformly over a shorter execution time, which now excludes the blocking phases (during which no operations would be executed). The concurrent operations have an overhead, but we expect it to be amortized by the increased concurrency. These considerations are further analyzed in Chapter 10, followed by an in-depth discussion of the feasibility and utility of a full implementation of the transactional GC.
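For illustration, the flag holder referenced in Listing 9.4 could look like the following sketch; only the TxGCGlobal::inCollection field appears in the listing, while the member names (init/set/unset/active) are assumptions made for the example and the actual HotSpot class may differ.

#include <cstdint>

// Sketch of the in-collection flag holder: the field is written only by
// the VM thread at a safepoint and read by mutators before every access.
class TxGCGlobal {
 public:
  static volatile intptr_t inCollection;
  static void init()   { inCollection = 0; }
  static void set()    { inCollection = 1; }   // called from notify_gc_begin()
  static void unset()  { inCollection = 0; }   // called from notify_gc_end()
  static bool active() { return inCollection != 0; }
};

volatile intptr_t TxGCGlobal::inCollection = 0;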

9.3 Summary

This chapter presented two optimizations that aimed to improve the performance of the original HTM-based GC algorithm. Both optimizations focused on reducing transactional accesses as much as possible, especially for load operations. As such, we confined the use of transactions to the concurrent portions of the execution, where they are absolutely necessary for a correct and consistent outcome. As permitted by the JMM, we only protected volatile read operations with transactions. Since the optimized algorithm reduces the overhead of the transactions to a minimum while it increases concurrency, we expect performance to be greatly improved compared to our preliminary results (Section 8.2).


Chapter 10 Evaluation

Contents
10.1 Experimental setup ...... 77
10.2 Experimental results ...... 78
  10.2.1 Transactional barriers overhead in the interpreter ...... 78
  10.2.2 Transactional barriers overhead in the JIT ...... 79
10.3 Selective barriers evaluation ...... 81
  10.3.1 Transaction duration ...... 81
  10.3.2 Benchmark suite ...... 81
  10.3.3 Client-server scenario ...... 83
10.4 Discussion: removing GC pauses ...... 85
10.5 Summary ...... 87

After having optimized the transactional barriers, we re-evaluate their overhead. This chapter describes in detail the experiments and the results. It shows that the first optimization (volatile reads) alone is not enough to make the new GC algorithm feasible and, thus, the second one (selective barriers) is necessary. We estimate the performance of a transactional GC implementation with selective barriers and we find it would be comparable with that of the ConcurrentMarkSweep collector. Finally, we briefly consider several challenges in implementing the transactional GC.

10.1 Experimental setup

All experiments described in this chapter are run with the following configuration: 8-core (16 threads) Intel Core i7-5960X CPU @ 3.00 GHz (Haswell), 32 GB RAM, with fully integrated HTM support. For our tests we enable either the interpreter alone or the interpreter and the JIT compiler together. In all experiments we run the original and the modified versions of the CMS collector of Java HotSpot in OpenJDK8. We evaluate the potential improvement of our optimized GC over the initial transactional implementation with a series of experiments, in two scenarios: a benchmark suite and a client-server system. As benchmark suite we use the six DaCapo benchmarks that we discovered to be stable, as described in Sections 6.1.2 and 8.2. Each benchmark has ten iterations (nine warm-up rounds and the actual run). All reported results represent the mean over 20 consecutive runs. Each benchmark uses 8 concurrent threads for running its workload (equal to the number of cores on the machine). For the client-server scenario, we run the Apache Cassandra 3.9 database server with the YCSB client 0.12.0 benchmarks (described in Sections 6.1.3 and 6.1.4). The benchmarks have two phases: load and run. When load is selected, the benchmark simply inserts new records in the database (100 % write). For the run phase, the benchmarks define different combinations of read, update, insert operations as described by Cooper et al. [25]. We configure the benchmarks to execute a fixed number of 10^7 operations on 100 client threads. We consider the results for load in general and for the run phase of each benchmark separately. All results represent the mean over 5 runs. Between any two runs of the benchmarks, the server's page cache is cleared, the database files are erased and the database is populated again. This is necessary in order to start all benchmarks with the same state and have reproducible experiments.

10.2 Experimental results

First, we focus on evaluating the overhead of the barriers optimized to use transactions only for volatile reads. This is essential in order to understand to what extent the second optimization is necessary for an acceptable GC implementation. It also shows the methodology and the exact steps we followed during the development of transactional barriers. In what follows we use the term optimized barriers to denote the barriers that are applied only for volatile reads. After the second optimization (Section 9.2) the barriers are called selective barriers. The evaluation of selective barriers is addressed in the next section.

10.2.1 Transactional barriers overhead in the interpreter

In Section 8.2 we presented the overhead of the unoptimized transactional barriers in the Java interpreter. These preliminary results indicated a significant slowdown due to the barriers, which convinced us that optimizations are needed before further testing them with the JIT compiler. Thus, we start with the evaluation of the optimized transactional barriers in the interpreter, in order to be able to directly compare the results with our early experiments. The interpreter is enabled with the Java runtime option -Xint. Figure 10.1(a) illustrates the execution time of the DaCapo benchmarks for the original implementation, the unoptimized transactional barriers and their optimized version. The benchmarks are configured as indicated in Section 10.1. The Y axis shows the mean execution time as reported by each benchmark, in seconds. Longer execution time means a slower application, thus in this plot lower bars are better. We represent the minimum and maximum values for the execution time with error bars. The implementation featuring transactional barriers for all read and write accesses is called All read-TX in the figure, while the optimized implementation is called Volatile read-TX. All write accesses are encapsulated in transactional barriers in both cases.


Figure 10.1 – Overhead of transactional barriers in the Java interpreter, unoptimized and optimized versions.

The overhead of the transactional implementations is marked above the bars, making it easier to grasp the difference in performance and determine if the new overhead is acceptable. Thus, we clearly observe the positive effect of the volatile-only optimization: the overhead was reduced to under 6 % for all benchmarks. The difference is significant, since the initial overhead was ∼25 %, going up to 37 % for one benchmark. This result shows a pronounced improvement over the initial implementation of the algorithm. After finally having a suitable performance in the Java interpreter for the DaCapo benchmarks, we also introduced the Cassandra server in our experiments. Figure 10.1(b) shows the throughput of the application for the original GC implementation, the transactional version and the optimized barriers. We look at the impact of transactional barriers on the client-server communication. Greater values for the throughput are better, meaning that the fixed number of operations was executed in less time. We observe that transactional barriers for all reads (All read-TX in the figure) have a serious effect on the client's throughput, with the majority of the workloads incurring more than 10 % overhead. The overhead goes up to almost 25 % for the read-modify-write workload. As in the case of the DaCapo benchmarks, the volatile-only optimization (represented by Volatile read-TX in the figure) considerably reduces the overhead. After the optimization all the workloads have a throughput overhead of less than 5 % in the interpreter. In conclusion, the optimization proves to be effective in both testing scenarios, lowering the overhead of our HTM-based barriers to acceptable values in interpreted code.

10.2.2 Transactional barriers overhead in the JIT

We further test our volatile-only barriers in the JIT compiler. Basically, we execute our experiments without the Java option that forces the execution to be interpreted, thus using the default combination of interpreter and JIT. We enable transactional barriers for both the interpreted and compiled code. These measurements allow us to compare the relative overhead of our implementation in the interpreter with the results for the JIT implementation. Running interpreted code alone has a larger latency than when it is combined with compiled code. However, the latency of starting and committing transactions is constant; it does not depend on the way the code is executed.


Figure 10.2 – Overhead of transactional barriers in the JIT compiler, original (no TX) and optimized (volatile read-TX) versions.

Therefore, the transactional overhead appears smaller in the interpreted execution. Despite showing a reasonable overhead in the interpreter, we expect the actual overhead of volatile-only barriers to be significantly greater. In what follows we exclude the unoptimized version of the transactional barriers from our experiments, as having too high an overhead even for the interpreter-only tests. Section 10.2.1 proved that the volatile-only optimization is necessary and can be further considered as the only transactional implementation worth using for evaluation. Moreover, we consider that the latency in the JIT corresponds to the real value of the overhead, because this is the default way of executing real-life Java applications. Figure 10.2(a) confirms our intuition. First, it is clear from the values on the Y axis that there is a considerable difference in the execution times with the interpreter alone and with the JIT. In some cases, the execution in the JIT is two orders of magnitude faster. However, the fairly constant transactional overhead, together with a shorter execution time, results in prominent overheads. The transactional execution time is doubled for half of the benchmarks, and almost tripled in the case of luindex. Only one benchmark (pmd) has an overhead under 10 %. This figure proves that in real-life conditions, even optimized, our algorithm can still introduce prohibitively large delays. Further experimenting in a large-scale setting, with the Cassandra server, we observe that the overhead is not as pronounced as for the DaCapo benchmarks (Figure 10.2(b)). However, for 6 workloads out of 7, the throughput on the client is reduced by 20 % with the optimized barriers compared to the original CMS implementation. For a database server working in real-world conditions, this represents a significant impact on the system's overall performance. An interesting case is the short-ranges workload, which does not suffer from a considerable change in overhead when switching from the interpreter to the JIT. This is the only write-intensive workload where short ranges of records are queried, instead of individual records. However, the CPU utilization is greatly increased in the JIT executions. Basically, in this particular case, the structure of the workload favors more work to be done in parallel on the server, instead of impacting the communication with the client. This particularity of the short-ranges workload is more evident in the following section. After enabling the JIT compiler in our experiments we observe that the optimized version of the transactional barriers is insufficient for the default Java configuration for real-life applications.

This leads us to the conclusion that we need to remove all unneeded transactional accesses, thus devising selective barriers.

10.3 Selective barriers evaluation

The main goal of this group of experiments is to pinpoint the potential performance gain with the selective barriers. More precisely, we measure how many (costly) transactions we save in all scenarios. For simplicity, in what follows, we define the transactional overhead as the difference between the execution times of a GC implementation that uses transactional barriers and the original implementation of that GC. Typically, if a memory access (either load or store) is encapsulated in a transaction, it has a longer execution time. This delay contains the time to start and commit (or worse, abort) a hardware transaction, in addition to the overhead generally associated with operations executed inside transactions. If we sum up all overheads incurred by individual memory accesses, we obtain the transactional overhead associated with an application. Since the individual overheads vary based on many factors, we settle for the minimum common overhead for all accesses, that of starting and stopping a hardware transaction. Based on this, we aim to give an intuition of how much of the initial transactional overhead is eliminated by simply removing the measured number of transactions when enabling the selective barriers. Thus, we find an upper bound for the transactional overhead of the selective barriers. Further, we compute an estimation of the actual CPU time of a transactional GC implementation using selective barriers based on our experimental results.

10.3.1 Transaction duration

We define the cost of a hardware transaction as the time it takes to start and commit the transaction. The minimum run time of an empty transaction gives us a lower bound for the cost. This helps us estimate a lower bound for the overhead eliminated by the selective barriers; this means that the savings are even more significant in practice than in our experiments. Given the fine-grained measurement we need to do, we rely on the CPU's time-stamp counter, through the native RDTSC (read time-stamp counter) instruction, which has nanosecond accuracy. In order to have a stable and conclusive result, we repeatedly start and stop an empty transaction in a loop of 10^9 iterations and measure the total time of the loop. From this, we subtract the time it takes to process the loop itself, with no body. The transactional operations (begin and end) are inserted in assembly language, to reduce the overhead of function calls. We also pin the thread to one of the cores and we fix the frequency of the CPU to the maximum frequency admitted, 3 GHz. The results represent the mean over 50 runs. Thus, we consider that a hardware transaction takes 29.92 CPU cycles and 9.97 ns, on average, on our Intel Haswell machine. We use these values to estimate the savings in terms of execution time provided by the selective access barriers in various scenarios.
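The following self-contained C++ program sketches this measurement. Unlike the actual setup described above, it uses compiler intrinsics instead of inline assembly and omits the core pinning and frequency fixing, so the numbers it prints are only indicative; it must be built with RTM support (e.g., -mrtm on GCC) and run on an HTM-capable CPU.

#include <immintrin.h>   // _xbegin(), _xend(), _XBEGIN_STARTED
#include <x86intrin.h>   // __rdtsc()
#include <cstdint>
#include <cstdio>

int main() {
    const uint64_t iters = 1000000000ULL;   // 10^9 iterations

    // Time a loop of empty transactions (begin + immediate commit).
    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < iters; ++i) {
        if (_xbegin() == _XBEGIN_STARTED)
            _xend();
    }
    uint64_t tx_cycles = __rdtsc() - start;

    // Time the bare loop, to subtract the loop overhead itself.
    start = __rdtsc();
    for (volatile uint64_t i = 0; i < iters; ++i) { }
    uint64_t loop_cycles = __rdtsc() - start;

    std::printf("average cycles per empty transaction: %.2f\n",
                (double)(tx_cycles - loop_cycles) / iters);
    return 0;
}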

10.3.2 Benchmark suite

First, we test our new selective barriers on the DaCapo benchmarks. We measure the total number of transactions that are eliminated by the selective barriers per application execution. However, since all benchmarks execute their workload concurrently on multiple threads, a part of the transactions take place in parallel. In order for the results to be as accurate as possible, we consider using the CPU time as a metric, rather than the actual wall-clock execution time.


Figure 10.3 – DaCapo: Overhead upper-bound for selective barriers vs. initial transactional overhead (left) and estimated CPU time for a transactional GC with selective barriers (right).

Table 10.1 – Estimated time savings (DaCapo).

Benchmark    CPU time (s)          Total #tx     Min. saved time
             no tx      tx         (×10^7)       (s)      (%)
h2           93.28      130.89     143.57        14.31    10.94
tomcat       59.86      70.37      51.94         5.18     7.36
xalan        57.49      93.72      157.68        15.72    16.77
jython       61.18      104.55     173.28        17.28    16.52
pmd          65.1       84.41      67.66         6.75     7.99
luindex      10.09      26.55      66.02         6.58     24.79

The CPU time represents the amount of time (CPU cycles) for which a processing unit is used by the application, excluding all idle times, such as waiting for I/O operations. This measurement is cumulative over all threads, i.e., for a multi-threaded program, the CPU time is expected to be greater than the wall-clock execution time. As the CPU time correctly defines the total execution time over all threads, we can directly associate it with the total number of transactions executed by the application. We retrieve the CPU time for an application with the perf Linux tool, using the option -e cycles. Thus, we obtain the CPU time for the original implementation using CMS and for the initial transactional implementation, as well as the overhead of the latter over the former. Next, we multiply the measured number of transactions by the transaction cost indicated in Section 10.3.1, to estimate the time saved with the selective barriers. Table 10.1 shows the number of transactions and the estimated savings. We further employ these results to study the overhead of our selective barriers. A considerable barrier overhead was the initial cause that rendered the transactional implementation impractical. Therefore, the overhead represents our main evaluation concern. Figure 10.3(a) illustrates an estimated upper-bound of the selective barriers overhead, as compared to the transactional overhead of the volatile-only implementation. Given the minimum values used for the overhead computation, e.g., for the transaction cost, we consider this upper-bound generous, expecting close-to-zero overhead in practice.
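As a quick arithmetic check of the savings in Table 10.1, take h2 as an example and use the empty-transaction cost of 9.97 ns measured in Section 10.3.1:

143.57 × 10^7 tx × 9.97 ns/tx ≈ 14.31 s, and 14.31 s / 130.89 s ≈ 10.9 %,

which matches the minimum saved time and percentage reported for h2.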

Table 10.2 – Estimated time savings (Cassandra).

Workload             CPU time (min)        Total #tx     Min. saved time
                     no tx      tx         (×10^9)       (min)    (%)
100% write           54.29      100.94     121.08        20.12    19.93
50/50% r/w           55.16      68.21      66.87         11.11    16.29
95/5% r/w            60.11      74.35      69.54         11.56    15.54
100% read            74.18      88.31      79.24         13.17    14.91
read-latest          60.51      76.08      72.41         12.03    15.82
short-ranges         215.61     777.87     1006.09       167.18   21.49
read-modify-write    84.54      104.93     100.91        16.77    15.98

The purpose of the figure is to show to what extent the overhead is reduced only by introducing a fast path with no transactional barrier. Basically, even in this severely underestimated scenario, the initial overhead is decreased by up to 50 %. After studying the potential improvement obtained by removing part of the transactions, we look next at a rough estimation of the CPU time of a fully-concurrent GC with selective barriers. In order to compute this, we need the CPU time spent in GCs, in addition to the information already gathered. This information is readily available in the details printed by the GC during the execution. Using all the collected data, we estimate the new CPU time according to the following formula:

est_time = original_time - gc_time + est_gc_time - fast_access_time,

where original_time stands for the original CPU time, i.e., garbage-collected with CMS; gc_time represents the part of the GC that would be made concurrent in our setting, that is, the ParNew collection duration; est_gc_time stands for the sequence that would replace the stop-the-world execution, i.e., the concurrent slow path of the selective barriers, its duration being estimated based on how many accesses could be made during the GC and their cost; finally, fast_access_time is the time taken by the accesses that become concurrent, which we subtract so that we do not count them twice (on both the fast and slow paths). In a nutshell, this formula translates to: synthetically replace the GC stop-the-world time with the estimated time on the slow path and avoid counting the slow-path accesses on the fast path. We illustrate the result in Figure 10.3(b). The estimated CPU time of our potential transactional GC with selective barriers is comparable with that of the original implementation using CMS. The difference is indicated by the percentage labels above the TX GC bars. This result suggests that the performance of a transactional GC enhanced with selective barriers would be on par with that of state-of-the-art GCs, while also eliminating most of the stop-the-world pauses.

10.3.3 Client-server scenario

We also experiment with our selective barriers in a real-life scenario, using the Apache Cassandra 3.9 database server, together with the YCSB client 0.12.0 benchmarks. Table 10.2 shows the server CPU time with the original CMS implementation, as well as with the volatile-only transactional barriers, the total number of transactions that are eliminated due to the selective barriers, and a lower bound for the time saved by simply removing the indicated number of transactions.


Figure 10.4 – Cassandra: Overhead upper-bound for selective barriers vs. initial transactional overhead (left) and estimated CPU time for a transactional GC with selective barriers (right).

It is important to observe that for all the reported execution times, the number of operations completed on the client is constant. This means that the transactional overhead directly impacts the throughput on the client. Therefore, a shorter CPU time on the server also results in a performance improvement on the client side. Figure 10.4(a) indicates how much we decrease the transactional overhead only by removing the cost of starting and committing a transaction for all transactions eliminated by the selective barriers. The gains are not uniform over the benchmarks. We believe this is explained by the type of workload that is executed on the client. Let us take as an example the 100 %-read workload. Since a load operation does not increase the overhead of the transaction significantly, the cost of these accesses is close to our estimation for an empty transaction. Thus, the simple addition of a fast path in the initial algorithm results in at least 93 % less overhead for a read-only workload. At the opposite end, the 100 %-write and short-ranges workloads are both write-intensive, leading to enormous overheads. The cost of an empty transaction is less fitting for estimating these accesses, resulting in less apparent savings, with a lower bound of 30 – 40 %. The rest of the benchmarks, featuring various ratios of read/write operations, reduce the transactional overhead by at least 80 %. Except for the write-intensive benchmarks, the upper bound for all overheads with selective barriers is under 5 %. We also notice here the difference between the overheads with respect to the throughput versus the total CPU time for the short-ranges workload. This discrepancy comes from the fact that there is not much overhead added to the communication between the client and server, but rather to the computations on the server alone. This results in more parallelism and CPU utilization, without critically affecting the throughput. For the other workloads, we observe that the associated overheads for the throughput and for the CPU time, respectively, are comparable. Finally, we apply the same logic as for the DaCapo benchmarks (Section 10.3.2) to estimate the execution time of a concurrent GC with selective barriers. Figure 10.4(b) illustrates the comparison between our estimation and the actual CPU time of the Cassandra server configured to use the CMS GC. The outcome of this scenario confirms the previous results: a transactional GC with selective barriers can be expected to have a suitable throughput, comparable to real-life collectors, and be almost fully concurrent at the same time.

10.4 Discussion: removing GC pauses

Given the promising estimations for the performance of the selective access barriers, the next step is to change the GC safepoint logic to allow the mutators to run at the same time. We further devise a detailed plan of the modifications needed to relax the GC. We specifically aim to make the young generation collection and its moving phases concurrent: (1) copying objects inside the young generation; and (2) promotion to the old generation. Currently, the GC stops the world during the execution of these operations. Removing the entire safepoint is not straightforward. We highlight here the most important aspects that need to be addressed and adapted in order to relax the GC:

• Root scanning. In most GC algorithms, this particular operation of the garbage collector needs to be executed at a safepoint in order to ensure correctness. The CMS collector, used for the old generation, already has a specific blocking phase for scanning the roots. However, the collector for the young generation, ParNew, used to block the mutators during the whole collection. While there is previous work [58] that designs a lock-free concurrent stack-scanning algorithm, we consider that in our case the complexity of implementing this would surpass the potential benefits. Moreover, based on our experiments the latency of the root scanning operation is negligible. The total CPU time spent in the CMS initial mark phase never exceeds 0.5 seconds for the Cassandra workloads. In comparison, the CPU time spent in young generation collections is around 45 seconds on average for most workloads, going up to over 11 minutes for the short-ranges workload. Thus, in our design the root scanning represents the only part of the GC for which the mutators are blocked, in both ParNew and CMS;

• Tri-color invariant. Most tracing collectors implement Dijkstra's tri-color marking invariant [36]. In a nutshell, every object belongs to one of three sets: the white set (objects that will be collected), the black set (objects that have been scanned, are reachable from the roots, and will not be collected), and the gray set (objects in the process of having their references scanned). The invariant states that no black object can have a reference to a white object. In our setting, some scenarios may not preserve this invariant. For example, let us assume we have an object A that was already copied (in the black set), and an object B still in the white set, referenced by an object C in the process of being moved (gray set). If at this point B stops being referenced by C, but A creates a reference to B, object B will be collected even though it is alive. This happens because A's references will not be processed again, while C does not have a reference to B anymore. This scenario is possible because the application is allowed to modify objects concurrently with the moving GC. To ensure that this does not happen in our design, we consider the following solution: whenever a black object concurrently receives a reference to a white object, the white object is directly marked as gray (see the sketch after this list);

• Stale references. Copying an object means creating new copies for its references as well. However, when mutators run concurrently with the GC threads, other objects that share the same references may not know about the new copies and still see the old references. This potential issue is handled by the forwarding pointer that is already present in the header of the object when using CMS. The busy-bit prevents access to objects that are in the process of being copied; the forwarding pointer ensures that all subsequent accesses to a copied object are written in its primary copy;


Figure 10.5 – Phases of a concurrent transactional GC with selective barriers (right) compared to the original implementation (left).


• Allocation in to-space. This is an optimization that avoids useless copies of an object or extra synchronization when the application allocates new objects concurrently with the GC collection. The young generation is split into three spaces: eden, to-space, from-space. If we allocated the new object in eden, as is customary, we would have to copy it to to-space anyway during the collection; if we allocated in from-space, we would have to synchronize with the GC to ensure that the object is collected.
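The shading rule from the tri-color item above can be sketched as a simple write barrier. The following stand-alone C++ fragment is purely illustrative and not HotSpot code; Object, Color and write_reference are invented for the example.

// When a reference to a white object is stored into a black object while
// the collector runs concurrently, shade the target gray so it will still
// be scanned, preserving the tri-color invariant.
enum class Color { White, Gray, Black };

struct Object {
    Color color = Color::White;
    Object* field = nullptr;
};

void write_reference(Object* holder, Object* target) {
    if (holder->color == Color::Black &&
        target != nullptr && target->color == Color::White) {
        target->color = Color::Gray;   // re-shade instead of losing the object
    }
    holder->field = target;            // perform the actual store
}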

With these considerations in mind, we suggest modifying the current implementation with selective barriers as follows: split the safepoint that protects the entire young generation collection into multiple parts, configured to run at a safepoint or concurrently depending on their task (Figure 10.5). The root scanning is the first phase and it is run at a safepoint. We expect this pause to be short, as mentioned earlier. On this safepoint we can also piggyback the alteration of the in-collection flag that enables the selective barriers. Then we pass to the next mode of collecting, which will be concurrent. This covers the processing of the references (also moving objects between survivor spaces or promoting them to the old generation), correctness being ensured by the transactional barriers enabled in the previous step. Finally, since we plan to allocate new objects in to-space, a short safepoint may be necessary for swapping survivor spaces (from-space and to-space). These general guidelines were followed when computing the estimated CPU time for a transactional GC with selective barriers. A full implementation of a transactional concurrent GC requires two major steps: developing an efficient synchronization mechanism and relaxing the current synchronization. We consider that once the new synchronization mechanism has been proved to be correct and practical, and a detailed plan has been laid down for removing the previous mechanism, the rest is mostly an engineering effort. However, the first part is critical for making the latter worthwhile. Thus, we leave the remainder of the implementation as future work.

10.5 Summary

This chapter presented an extensive evaluation of our implementation of transactional barriers and subsequent optimizations. We experimented with a widely used benchmark suite, as well as with a real-life setting of a client-server database system. We started by evaluating the overhead of the optimized transactional barriers in the Java interpreter, where they proved to have acceptable performance. By enabling the JIT compiler, we further observed that the relative transactional overhead is much more pronounced in the new setting. This represented the motivation for implementing the selective-barriers optimization. We aimed to predict the performance of a concurrent transactional GC that used selective barriers. The results indicated that most of the overhead would be removed and the performance would be on par with state-of-the-art GCs. We further gave detailed guidelines on how to achieve almost-full concurrency in the case of our transactional GC algorithm.


Part III

Exploiting HTM for Improved C++ Smart Pointers


Chapter 11 Transactional Smart Pointers

Contents
11.1 Motivation ...... 91
11.2 Related work ...... 92
11.3 Algorithm ...... 93
11.4 Implementation details ...... 95
11.5 Use cases ...... 97
  11.5.1 Baseline: single-threaded micro-benchmark ...... 97
  11.5.2 Short-lived pointers micro-benchmark ...... 98
11.6 Summary ...... 99

The most popular programming languages, such as C++ or Java, have libraries and data structures designed to automatically address concurrency hazards in order to run on multiple threads. This trend has also been adopted in the memory management domain. However, automatic concurrent memory management also comes at a price, leading sometimes to noticeable overhead. In this thesis, we experiment with C++ smart pointers and their automatic memory-management technique based on reference counting. More precisely, we study how we can use hardware transactional memory (HTM) to avoid costly and sometimes unnecessary atomic operations. This chapter presents details about the algorithm and the implementation of transactional smart pointers. We then introduce specific use cases that could particularly benefit from transactional pointers.

11.1 Motivation

With the increasing degree of concurrency in today's hardware, lock-free implementations of applications and data structures have gained extensive attention in the last few years. In this context, using classical synchronization mechanisms based on locks (such as mutexes, barriers, etc.) tends to become more and more complex and error-prone. Transactional memory offers an elegant solution to implementing lock-free synchronization. Until recently, TM algorithms were mostly confined to the research environment. However, the emergence of hardware transactional memory in mainstream processors overcame the performance pitfall, while preserving the benefits in scalability and correctness. Automatic memory management mechanisms often suffer from performance drops due to their synchronization strategies. A notable example is represented by the smart pointer implementation in the C++ standard library. Smart pointers use reference counting to protect a raw pointer from being illegally deallocated and to avoid any other memory hazards. They are thread-safe and the operations on the shared reference counter are atomic. This provides adequate and safe memory management for multi-threaded programs. Nonetheless, the reference counting strategy is costly and sometimes unnecessary, e.g., when manipulating copies of a smart pointer with a reference count that never drops below 1 and hence never needs to release memory. In this work, we explore possible scenarios where HTM could improve the performance of applications that use C++ smart pointers. The aim is to avoid executing the unnecessary atomic operations required by the reference counting strategy. First, we expect HTM to improve the performance of smart pointers over the original implementation. Then, by adding this low abort-rate HTM fast path, we are also addressing concurrency problems related to smart pointer handling. Gottschlich et al. [44] show that template-based generic structures, such as C++ smart pointers, are deadlock-prone, among other synchronization issues. They also propose the use of TM in their implementation. These problems motivated our extensive study on the benefits of HTM for C++ smart pointers.
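The cost we target can be seen in ordinary code such as the following sketch: every copy of a std::shared_ptr atomically increments the shared reference count and every destruction atomically decrements it, even though inside the loop the count never drops below 1 and no memory ever has to be released.

#include <memory>
#include <vector>

// Each iteration pays an atomic increment (copy) and an atomic decrement
// (destruction) on the shared reference count, without ever freeing memory.
long sum_values(const std::vector<std::shared_ptr<long>>& values) {
    long sum = 0;
    for (const auto& sp : values) {
        std::shared_ptr<long> local = sp;   // atomic increment
        sum += *local;
    }                                       // atomic decrement at end of scope
    return sum;
}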

11.2 Related work

Considering the ever-increasing interest in transactional memory in the last few years, a reasonable amount of effort has been focused on integrating TM with mainstream programming languages, such as C++. Crowl et al. [26] present a general design that would permit the insertion of transactional constructs into C++. They identify the main issues that need to be addressed and propose a new syntax that could be incrementally adopted in the existing code base. These guidelines do not address a particular class of transactional memory (STM or HTM). The authors identify I/O operations and exceptions as being the trickiest to handle, which generally proved to be the case for HTM since it appeared in specialized processors. Ni et al. [67] go even further and implement a fully working STM system that adds language constructs for transactional programming in C++. The system includes new C++ language extensions, a compiler and an STM runtime library. They conduct an extensive evaluation on 20 parallel benchmarks ported to use their C++ language extensions. The results show that the STM system performs well on all workloads, especially in terms of scalability. In the same context, Shpeisman et al. address the integration of TM semantics in C++ [78]. The authors argue that TM semantics generally fit C++ logic well. However, specific features like nested transactions, user-level aborts or even legacy code on the C++ side need extensive reasoning. As such, they provide two new sets of language constructs for transactional memory in C++: one for the common cases that need atomicity, also allowing user aborts; and one that handles legacy code. Another effort in this direction is made by Dalessandro et al. [27]. They design a library-based software transactional memory API and build several applications with it. The API successfully handles all kinds of specific C++ features, such as inheritance, macros or generics. In order to cover all these features and provide intuitive semantics for C++, the authors implement the STM library on top of C++ smart pointers. Basically, they hide all metadata manipulation behind smart pointer operators (for construction, assignment, dereference). Despite the API's capabilities, the authors claim that TM cannot simplify concurrent programming if implemented as a library, language and compiler support being indispensable for reaching this goal. A more focused work is presented by Gottschlich and Boehm [44] regarding the need for transactional memory in generic programming. They give as example C++ shared pointers and similar constructs, and indicate that implementing them with transactions would avoid deadlocks and other synchronization issues. In this case, the authors do not explore the performance of a potential transactional implementation, but the correctness of such a strategy. They mention that any type of transactional memory would be an improvement, while hinting that HTM would probably have a good impact on performance as well. Dragojević et al. [38] explain why C++ needs a form of automatic memory management. Their reasoning is based on the lack of lock-free data structures in C++, as opposed to garbage-collected languages such as Java. They make a case for HTM as having a good potential for improving this situation. They present a few algorithms of dynamic-sized concurrent data structures synchronized with HTM and show that, in general, they perform better than non-HTM algorithms or require less space.
The authors conclude that it is considerably easier to design correct concurrent algorithms with HTM. However, opinion is divided on the performance benefits of HTM for synchronizing concurrent data structures. On the one hand, David et al. [28] report an increase in throughput of at most 5 % when using HTM for concurrent search data structures, considering the improvement as negligible. They partly blame the current HTM implementations for this shortcoming, envisioning stronger results when this technology matures. On the other hand, Bonnichsen et al. [16] present a concurrent ordered map implementation with HTM that performs up to 3.9 times faster than the state of the art. They also formulate a set of guidelines for applying HTM efficiently. The recommendations mostly address spatial locality, specific instructions (e.g., system calls) and transaction size. The authors design an ordered map data structure (called BT-tree) with these considerations in mind and find that the HTM-based data structure has a very good performance in terms of space, time and energy. In our work on C++ smart pointers, we apply the guidelines that recommend enhancing them with transactional support, thus avoiding specific concurrency issues, and evaluate the potential performance improvement when using HTM on concurrent data structures. We show that, depending on how the HTM logic is applied, it can significantly improve the classical implementation of C++ smart pointers.

11.3 Algorithm

The original algorithm for C++ smart pointers is straightforward: when the pointer is created, it contains a raw pointer and a control block for this reference. The reference count is initialized with 1. As previously explained in Chapter 4 (Figure 4.3), when a new copy of the same pointer is created, the two copies share the reference and the reference count field, which is updated by an atomic increment. Every time a copy of the pointer is destructed, the shared reference count is atomically decreased by one, and the rest of the object is destroyed. When the last reference is destroyed, the memory is automatically freed. This allows the application to function without any risk of incorrect accesses, dangling pointers or memory leaks. Our goal is to eliminate the atomic operations on the shared reference count, while keeping the reference protected from memory hazards. In order to do that, we define a constructor that initializes the reference count with 0 and tries to start a hardware transaction. Inside the transaction, we read and update a field of the transactional pointer called state, shared between all transactional and non-transactional copies of the pointer. The state field is added to the read-set of the transaction and automatically monitored in hardware. Any other thread that tries to modify this field causes an abort. If there is no abort, the application continues its execution, using the transactional pointer protected by the transaction. If a conflict or another event causes the currently running transaction to abort, the transactional pointer follows a fallback path corresponding to the original implementation of the smart pointers, i.e., the reference count is initialized with the number of references of the smart pointer that we are copying and atomically incremented, or with 1 if it is a new smart pointer. When the transactional pointer is destroyed by the application, we check if there is a transaction running: if yes, the transaction commits. Otherwise, the object is destroyed in the same way as a normal smart pointer. Further on, we modify the algorithm to support batching. More specifically, multiple transactional pointers can be added to an already started transaction, without having their reference count modified and without starting a transaction of their own. This is particularly convenient for applications using well-delimited groups of pointers, such as some operations on classical data structures. In order to allow batching, the constructor exploits an additional parameter indicating whether the transactional pointer in question is the first one in the batch (or a single pointer that needs to be protected) or whether it needs to be added to an already existing transaction. In the former case, the algorithm follows the steps described above. In the latter, we take advantage of the fact that all initializations happen inside the transaction. Thus, we do not need to specifically read the state field anymore. Finally, we add a supplementary check in the destructor of the transactional pointer: we only try to commit the transaction when the pointer that started it is destroyed. This design assumes a scenario in which:

• either all pointers die at once (e.g., at the end of a function in which they have been created), in which case they are destroyed in the reverse order of creation, thus making the first pointer to be destroyed last and keeping the transaction that protects all pointers running until it is safe to commit; or,

• the pointers added to the transaction are explicitly destroyed before the pointer that started the transaction.

The above steps are summarized in Algorithm 3. We call tx_ptr the data type that enhances C++ smart pointers with hardware transactions. The constructor creates a transactional pointer from an already existing smart pointer, which is passed as a parameter. We use this to designate the current transactional pointer being created. We cover in this pseudocode the extended algorithm suitable both for batching pointers as well as for single transactional pointers. Therefore, the constructor features a boolean parameter add_to_tx that indicates whether the current pointer has to be added to a running transaction or start a new one by itself. If it is the first pointer (line 9), it tries to start a transaction. All subsequent transactional pointers will be monitored by the same transaction and will not attempt to start a new one.

Algorithm 3 Transactional pointer implementation.

 1: function tx_ptr::Init(smart_ptr ptr, bool add_to_tx)
 2:     this.rawptr ← ptr.rawptr
 3:     this.refcount ← 0
 4:     this.state ← ptr.state
 5:     this.add_to_tx ← add_to_tx
 6:     if add_to_tx ∧ is_fallback then
 7:         Fallback(ptr)
 8:     end if
 9:     if ¬this.add_to_tx then
10:         if Tx_Start() then
11:             Update(this.state)
12:             is_fallback ← false
13:         else
14:             is_fallback ← true
15:             Fallback(ptr)
16:         end if
17:     end if
18: end function

19: function tx_ptr::Destroy()
20:     Write(this.state)
21:     if ¬this.add_to_tx ∧ Tx_Test() then
22:         Tx_End()
23:     end if
24: end function

If the transaction starts, we update the shared state field, as mentioned; otherwise, the algorithm takes the fallback path. The call to a generic function UPDATE() (line 11) stresses the idea of accessing the field inside the transaction, without entering into implementation details. The variable is_fallback is a thread-local variable, set when the first pointer takes the fallback path. When a transaction aborts, all changes are rolled back and the execution is restarted. This means that all the added pointers will run their constructor from the beginning, following the path in line 6. In other words, all transactional pointers will take a non-transactional path, similar to a classical C++ smart pointer. While the transaction is running correctly, is_fallback remains false. The destructor (starting at line 19) first modifies the state field, in order to notify other threads that are handling the same pointer to abort their transactions, since it may try to release memory (e.g., if the pointer is on the fallback path). Then, it commits the running transaction, in case there is one and it was started by the current pointer. A transactional pointer does not free the memory it points to in the destructor, leaving this task to the underlying smart pointer. Any transactional pointer is either backed up by at least one classical smart pointer that can correctly free the memory, or it transforms itself into a classical smart pointer (in case of abort). The presented algorithm implements a viable smart pointer structure, while aiming to reduce the overhead of atomic operations.
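To make the control flow concrete, the following C++ sketch illustrates the fast-path/fallback structure described above. It is only a simplified illustration under our own naming assumptions: the class layout, the HTM wrappers Tx_Start()/Tx_End()/Tx_Test() and the fallback() helper are hypothetical, and corner cases (batching bookkeeping, hand-off of ownership to the underlying smart pointer) are omitted.

#include <atomic>

// Placeholder HTM wrappers; a real build maps these onto architecture intrinsics
// (see Section 11.4). Returning false here simply exercises the fallback path.
inline bool Tx_Start() { return false; }
inline void Tx_End()   {}
inline bool Tx_Test()  { return false; }

// Shared control block: the reference count is untouched on the HTM fast path;
// the state word is read inside the transaction so that a concurrent writer
// (e.g., a destructor about to release memory) aborts the transaction.
struct control_block {
    std::atomic<long> refcount{1};
    std::atomic<int>  state{0};
};

thread_local bool is_fallback = false;

template <typename T>
struct tx_ptr {                               // illustrative sketch, not the libstdc++ code
    T*             raw;
    control_block* cb;
    bool           add_to_tx;

    tx_ptr(T* p, control_block* c, bool add) : raw(p), cb(c), add_to_tx(add) {
        if (add_to_tx && is_fallback) { fallback(); return; }
        if (!add_to_tx) {
            if (Tx_Start()) {
                (void)cb->state.load();       // put the state field in the transaction's read-set
                is_fallback = false;
            } else {
                is_fallback = true;           // abort: behave like a classical smart pointer
                fallback();
            }
        }
    }

    ~tx_ptr() {
        cb->state.store(1);                   // force concurrent transactions on this pointer to abort
        if (!add_to_tx && Tx_Test())
            Tx_End();                         // only the pointer that started the transaction commits
        else if (is_fallback)
            cb->refcount.fetch_sub(1);        // fallback path: classical reference counting
    }

    void fallback() { cb->refcount.fetch_add(1); }
};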

11.4 Implementation details

We build our transactional pointers on top of the std::shared_ptr structure in C++. In particular, we overload the constructor in the std::shared_ptr class in GCC 5.3.0 and modify several internal methods in order to accommodate the transactional logic. As such, a tx_ptr can simulate the normal behavior of a classical smart pointer, and tx_ptrs can be created from std::shared_ptrs.

Figure 11.1 – Class structure of std::shared_ptr, with the additional fields for tx_ptr in dashed lines.

The implementation follows closely the above algorithm. The std::shared_ptr class is implemented as a C++ template, with the raw pointer (of generic type _Tp) and a variable of type __shared_count as the main fields (Figure 11.1). The latter is the class that implements the shared reference count object. The reference count object contains a pointer to two counters: use_count and weak_count. The latter is related to std::weak_ptrs (see Section 4.2.1) and its role is not in the scope of this work. The former contains the number of references a pointer has throughout the execution. We add in Figure 11.1, with dashed lines, the necessary fields for implementing tx_ptr:

• A boolean variable in the main class, with the aim of indicating which pointers are to be added to the existing transactions or start a new one. This information is critical in the destructor when using batching, since the transaction must be committed only when all pointers in the group have been destroyed;

• The state field in the reference count class. This field is shared between classical smart pointers and their transactional copies. It is initialized in the constructor and accessed inside the transaction started by transactional pointers, in order to be monitored in hardware. It is further modified in the destructors of both classical and transactional smart pointers. Thus, if any other copy of the same pointer tries to destroy the pointer and deallocate the memory, writing the state field forces the transaction to abort and the tx_ptr to restart as a normal smart pointer. If the last underlying smart pointer is destroyed before its transactional copy, it does not free the memory, but passes the ownership to the aborted transactional pointer.
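For reference, the modified layout of Figure 11.1 can be summarized by the simplified declarations below. These are only an outline of where the two new fields live, not the actual libstdc++ templates (in libstdc++ the counters live behind a pointer inside __shared_count, and the field names other than use_count, weak_count and state are placeholders).

class __shared_count {                // simplified stand-in for the reference count object
    long use_count;                   // number of shared_ptr copies
    long weak_count;                  // number of weak_ptr copies (std::weak_ptr support)
    int  state;                       // NEW: field read/written inside hardware transactions
};

template <typename _Tp>
class shared_ptr {                    // simplified stand-in, not std::shared_ptr itself
    _Tp*           raw_pointer;       // managed object
    __shared_count ref_count;         // shared reference count object
    bool           add_to_tx;         // NEW: is this copy part of an already running transaction?
};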

We implement the transactional memory operations on two different architectures: Intel Haswell and IBM POWER8 [47]. While there are several subtle differences in the APIs and the underlying HTM implementations, most of our code is common to both architectures. The goal is to compare the behavior of our transactional smart pointers for different HTM implementations. We want to determine if the architecture makes a difference or the performance gains remain constant in all cases.
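As an illustration of how the architecture-specific part can be isolated, the sketch below wraps the vendors’ HTM primitives behind the Tx_Start/Tx_End/Tx_Test interface of Algorithm 3. The wrapper itself (names, single-attempt policy, fallback when HTM is unavailable) is our own assumption; the intrinsics are the standard Intel RTM intrinsics (built with -mrtm) and the GCC HTM built-ins for POWER8 (built with -mhtm).

// Minimal architecture abstraction for the HTM operations used by tx_ptr (sketch).
#if defined(__x86_64__) && defined(__RTM__)
  #include <immintrin.h>
  inline bool Tx_Start() { return _xbegin() == _XBEGIN_STARTED; }   // single attempt, no retry loop
  inline void Tx_End()   { _xend(); }
  inline bool Tx_Test()  { return _xtest() != 0; }
#elif defined(__powerpc64__) && defined(__HTM__)
  #include <htmintrin.h>
  inline bool Tx_Start() { return __builtin_tbegin(0) != 0; }       // non-zero when the transaction starts
  inline void Tx_End()   { __builtin_tend(0); }
  inline bool Tx_Test()  { return __builtin_ttest() != 0; }
#else
  // No HTM support detected: always take the reference-counted fallback path.
  inline bool Tx_Start() { return false; }
  inline void Tx_End()   {}
  inline bool Tx_Test()  { return false; }
#endif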

Algorithm 4 Scenarios for the single-threaded micro-benchmark.

SCENARIO 1
 1: for i ← 1, num_iter do
 2:     p ← new shared pointer
 3:     delete p
 4: end for

SCENARIO 2
 5: for i ← 1, num_iter do
 6:     begin-tx
 7:     p ← new tx shared pointer
 8:     delete p
 9:     commit-tx
10: end for

SCENARIO 3
11: for i ← 1, num_iter/m do
12:     begin-tx
13:     for j ← 1, m do
14:         p ← new tx shared pointer
15:         delete p
16:     end for
17:     commit-tx
18: end for

11.5 Use cases

In order to have a preliminary idea on the benefits of our transactional pointer implementation over the original C++ smart pointers, we devise two micro-benchmarks. This enables us to test both implementations in single-threaded and multi-threaded scenarios, with or without batching, on two different architectures.

11.5.1 Baseline: single-threaded micro-benchmark

We want to evaluate the possible gains of replacing atomic operations with hardware transactions. We develop a single-threaded micro-benchmark for studying how many transactional pointers have to be packed in a transaction in order to improve the performance over a pair of atomic operations. The micro-benchmark consists of the scenarios presented in Algorithm 4. By tx shared pointer we refer to a tx_ptr implementation. In the first scenario, starting at line 1, we measure the time it takes to repeatedly create and destroy a normal C++ shared pointer, for a fixed number of iterations. As previously mentioned, when the pointer is created, an atomic increment is performed on the shared reference count; likewise, an atomic decrement is performed when the pointer is destroyed. This strategy reflects the performance when using a pair of increment/decrement atomic operations for num_iter iterations. The second scenario, starting at line 5, replaces the pair of atomic operations in each iteration with a hardware transaction. The third scenario (line 11) groups multiple create/destroy operations (i.e., multiple iterations) in a transaction. It behaves identically to the second scenario when m = 1. Our intuition is that one hardware transaction should have at least the same performance as a pair of atomic operations. The goal of this micro-benchmark is to validate this assumption on the simplest possible scenario, without any conflicts or contention. If our intuition proves wrong in the most basic case, there is no point in pursuing this direction. Thus, we consider this scenario as the baseline for our evaluation.

Algorithm 5 Scenarios for the short-lived pointers micro-benchmark.

SCENARIO 1
 1: loop
 2:     pick shared_ptr p
 3:     some_computation(p)
 4: end loop

 5: function some_computation(p)
 6:     add_to_tx ← True
 7:     q ← new tx_ptr(¬add_to_tx, p)
 8:     constant ops
 9:     delete q
10: end function

SCENARIO 2
11: loop
12:     pick shared_ptr p[n]
13:     some_computation(p[n])
14: end loop

15: function some_computation(p[n])
16:     add_to_tx ← True
17:     q[0] ← new tx_ptr(¬add_to_tx, p[0])
18:     q[1 : n − 1] ← new tx_ptr(add_to_tx, p[1 : n − 1])
19:     constant ops
20:     delete q[n − 1, 0]
21: end function

11.5.2 Short-lived pointers micro-benchmark

We next look at a common real-life scenario where a pointer is copied to a local variable inside a function. The copy of that pointer has the lifespan of the function. We call the copy a short-lived pointer. If a smart pointer is used, when creating such a copy the reference counter is atomically incremented, while at the end of the function there is an atomic decrement. If the pointer is not accessed concurrently in the meantime, then the increment/decrement operations are redundant. We aim to replace this pair of atomic operations with one transaction spanning the entire function. In order to obtain this behavior, we use the tx_ptr pointer defined in Section 11.3. If two threads access the same pointer, the transaction aborts and falls back to a classical smart pointer; otherwise, the threads continue concurrently without affecting the reference counter in any way. We create a micro-benchmark that starts multiple threads which share an array of smart pointers. Each thread picks a random element from the shared array and calls a function. In the function, the thread creates a tx_ptr copy of the element. Then, it executes several constant-time operations. These operations are meant to simulate a computational workload that accesses the pointer value. If transactional pointers are used, these operations will be executed inside a transaction. Finally, the thread exits the function (which calls the destructor of the transactional pointer, thus committing the transaction). These steps are illustrated in Algorithm 5, lines 1 – 10. As an optimization, we enable batching in the micro-benchmark, i.e., the creation of multiple pointers in a single transaction. The idea is that, if our initial intuition holds and a pair of atomic operations has almost the same overhead as a transaction, then replacing multiple pairs of atomic increment/decrement with a single transaction should improve the performance. Lines 11 – 21 in Algorithm 5 reflect these changes. We emphasize in the pseudocode the most significant alterations to the initial scenario (lines 18 and 20). We modify the original micro-benchmark algorithm as follows: instead of a single random element, each thread now picks several random elements from the shared array (their number is defined at runtime). It creates a new array with these elements and calls the specific function having this array as a parameter. The function makes tx_ptr copies of all pointers, using the additional boolean parameter in the constructor in order to indicate which pointers will be added to the current batch. Basically, in line 17 we create the first transactional pointer, which can be stand-alone or the head of a batch. Then (line 18), all the other copies of the pointer receive the value True for the add_to_tx parameter, which indicates they are part of a batch and should be treated as such (they cannot be handled individually). A single transaction is started in line 17. All subsequent pointers in the batch are protected by this transaction. The constant computations take place inside the transaction. At the end of the function the copies of the pointer are destroyed in the reverse order of their creation (line 20). Essentially, the pointers that are added to an existing batch are simply released, while the head of the batch is also responsible for committing the transaction. We consider these scenarios as fairly common in application code.
We generally expect few conflicts, the performance only depending on the overhead of using hardware transactions. In this case, the transactional overhead has two sources: the latency of starting and committing a transaction and the overhead of executing the code between the creation of a smart pointer and its destruction inside a transaction. In the case of short-lived pointers, the computations inside the function are also limited by the maximum size of a hardware transaction.
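For illustration, a minimal C++ rendering of the baseline loop (Algorithm 5, lines 1 – 10) is sketched below with std::shared_ptr copies; the transactional variant simply replaces the local copy q with a tx_ptr created from p. The workload size, element type and worker interface are our own assumptions, not the exact benchmark code.

#include <atomic>
#include <cstddef>
#include <memory>
#include <random>
#include <vector>

// Each call copies one shared_ptr into a function-local variable, paying an
// atomic increment on entry and an atomic decrement on exit.
static void some_computation(const std::shared_ptr<long>& p, long& sink) {
    std::shared_ptr<long> q = p;     // short-lived copy: atomic reference-count increment
    for (int i = 0; i < 100; ++i)    // constant-time operations accessing the pointer value
        sink += *q + i;
}                                    // q destroyed here: atomic reference-count decrement

// Thread body: repeatedly pick a random element of the shared array and call the function.
void worker(const std::vector<std::shared_ptr<long>>& shared,
            std::atomic<bool>& stop, long& iterations) {
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<std::size_t> pick(0, shared.size() - 1);
    long sink = 0;
    while (!stop.load(std::memory_order_relaxed)) {
        some_computation(shared[pick(rng)], sink);
        ++iterations;
    }
    (void)sink;
}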

11.6 Summary

This chapter introduced the basics of transactional smart pointers, a convenient combination of two important concepts in today’s concurrent systems: transactional memory and C++ smart pointers. We presented the initial algorithm and an optimization, together with two micro-benchmarks that we considered suitable for their use. To sum up, the transactional algorithm defines a new type of C++ smart pointer with its typical features. In the original implementation, it was impossible for any of the threads sharing a smart pointer to deallocate memory that was still in use, due to the shared reference counter. In the transactional version, the counter stays at 0 for any number of shared copies; instead, any illegal memory access is detected inside the hardware transaction that encapsulates the pointer. Thus, multiple threads sharing the same pointer will be able to safely use it without any contention on a shared counter. The evaluation on the described micro-benchmarks and the associated conclusions are addressed in the next chapter.


Chapter 12 Evaluation on Micro-benchmarks

Contents

12.1 Experimental setup ...... 101
12.2 Results for single-threaded experiments ...... 102
12.3 Results for short-lived pointers experiments ...... 103
12.4 Summary ...... 105

Based on the micro-benchmarks described in the previous chapter, we set up a preliminary batch of experiments to check the performance of our transactional pointers implementation in the most basic cases. We also verify if the discovered behavior is consistent across different architectures. We first test our strategy on a single thread, then in a simple multi-threaded scenario involving short-lived pointers. We use these results to determine what situations benefit most from a transactional smart pointer implementation.

12.1 Experimental setup

We implement and test the micro-benchmarks on two different platforms: an 8-core (with 2 threads per core) Intel i7-5960X CPU @ 3.00 GHz (Haswell) machine with 32 GB RAM and a 10-core (8 threads per core) IBM POWER8 @ 3.42 GHz with 30 GB RAM, both with fully integrated HTM support. Settings that are specific to each micro-benchmark are further detailed.

Single-threaded micro-benchmark. For the third scenario of Algorithm 4 in Chapter 11 (line 11), we vary m from 2 to 5, since for values greater than 5 the performance is visibly better than the original implementation of shared pointers. We observe that the measured time varies during the first executions of the benchmark. Therefore, in order to have accurate results, we pin the thread to a core and we run 1000 warm-up iterations. Then we start measuring for the required number of iterations. Subsequently, we take the mean value per iteration over 50 runs for each version of the benchmark. We test for 10^3, 10^6 and 10^9 iterations.
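The pinning, warm-up and timing steps can be reproduced roughly as follows (Linux-specific sketch; the core index, the iteration count and the stand-in iteration body are arbitrary choices of ours, not the exact harness).

#include <chrono>
#include <cstdio>
#include <memory>
#include <pthread.h>
#include <sched.h>

static void pin_to_core(int core) {               // pin the measuring thread to one core
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &set);
}

static void one_iteration() {                     // Scenario 1 body: create and destroy a shared pointer
    std::shared_ptr<long> p = std::make_shared<long>(42);
}

int main() {
    pin_to_core(0);
    for (int i = 0; i < 1000; ++i)                // warm-up iterations, excluded from the measurement
        one_iteration();

    const long num_iter = 1000000;                // 10^6 iterations, one of the tested sizes
    auto start = std::chrono::high_resolution_clock::now();
    for (long i = 0; i < num_iter; ++i)
        one_iteration();
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::nano> total = end - start;
    std::printf("time/iteration: %.1f ns\n", total.count() / num_iter);
}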

Figure 12.1 – Execution time/iteration (ns) for the single-threaded scenarios of Algorithm 4 (Section 11.5) with m = 2, 3, 4, 5 in Scenario 3, for 10^3, 10^6 and 10^9 iterations: (a) Intel Haswell; (b) IBM POWER8.

The total execution time of all iterations is measured with the system’s high resolution clock, by calling std::chrono::high_resolution_clock::now() before and after the loop. The result is manipulated as a std::chrono::duration, i.e., with nanosecond accuracy. We obtain the time per iteration by dividing this total time by the predefined number of iterations.

Short-lived pointers micro-benchmark. In the case of Algorithm 5 in Chapter 11, we measure how many iterations of the short-lived function are done by each thread in a certain amount of time (customizable by the user). We test on up to 8 threads, pinned to the cores. We compare the total number of iterations (i.e., the sum of iterations over all threads) of our tx_ptr implementation with the original implementation of smart pointers. We configure the short-lived pointer experiments as follows: shared array of 1000 smart pointers, run time of 5 seconds, 100 constant-time operations. The experiments consist of running the micro-benchmark 50 times for an increasing number of threads on both platforms and taking the average over the total number of iterations in each case.

12.2 Results for single-threaded experiments

Figure 12.1 illustrates the performance in all previously described single-threaded scenarios for the mentioned values of m and number of iterations, both on the Intel Haswell machine and on IBM POWER8. Thus, for each number of iterations, the graph shows: one bar for the time it takes to create and destroy a classical C++ smart pointer (marked as Scenario1 in the legend); one bar for the latency of creating and destroying a transactional pointer (i.e., Scenario2); the remaining four bars for the time it takes to create and destroy transactional pointers that are grouped together in the same transaction in batches of 2, 3, 4 and, respectively, 5 pointers (i.e., Scenario3, having m = 2, 3, 4, 5). The number of iterations is represented on the X axis. The Y axis shows the time per iteration in nanoseconds. The error bars in the graphs represent the minimum and maximum values for the measured times. A first observation is that for a small number of iterations (e.g., 10^3), the measurements vary visibly, regardless of the machine. Such a short measurement is easily impacted by otherwise negligible delays, like the latency of the timing function itself. In consequence, the experiments on a greater number of iterations vary less. Another interesting remark is that the general shape of the graphs is maintained across all architectures and experiment sizes. This suggests that our results are consistent and we can extract some general conclusions about the behavior of transactional smart pointers on a single thread. We also note that the execution times measured on Haswell and POWER8 differ. The separate library implementations on each architecture (e.g., HTM implementation) account for this difference. Focusing further on the results, we observe the following: when running on one thread, using a single hardware transaction per iteration always results in better performance than a pair of atomic operations. In other words, the second scenario performs better than the first for any number of iterations on both platforms. Generally, the improvement is consistently over 20 %. The performance further improves when m increases, i.e., when grouping multiple pointers inside a transaction. As expected, on a single thread with no interference, the gains in the batching scenario are gradually higher. For groups of 5 pointers, the execution time per iteration is around 37 % to 45 % lower on the Haswell machine and 48 % to 69 % lower on IBM POWER8. However, this improvement strongly depends on the number of pointers in the group. After a certain threshold (when the group of instructions inside the transaction becomes too large) the transaction overflows. In this case, the execution falls back to the classical implementation, losing all the benefits of batching. Thus, care must be taken as to how many pointers we pack in a hardware transaction in order to maximize the gains. In conclusion, according to the presented single-threaded benchmark, a hardware transaction is able to replace a single pair of atomic operations without affecting the performance of the application. The application performance is significantly improved if multiple pairs of atomic operations are replaced by a single hardware transaction.

12.3 Results for short-lived pointers experiments

Figure 12.2 compares the performance of transactional smart pointers with the original C++ pointers in this scenario. It illustrates our results for the Intel Haswell architecture in Figures 12.2(a) and 12.2(b), as well as for POWER8 in Figures 12.2(c) and 12.2(d). For both systems we test for a 1-pointer transaction and for batching multiple pointers. On the X axis we show the number of threads, while on the Y axis we have the number of iterations performed divided by 10^6 (higher values are better). We observe that overall the two implementations have similar performance. However, there are a few interesting observations worth mentioning. First, we notice that the same benchmark implementation scales differently on the Haswell machine compared to POWER8. Then, despite having a comparable performance, the transactional implementation is most frequently worse than the original. The exception is represented by most of the runs on one thread and a couple of batching cases (on 8 threads on the Haswell server, and on 2 threads on POWER8), where the transactional version does slightly better. Considering each architecture alone, for Intel Haswell we make the following observations: generally, the difference in performance of the two implementations is small, becoming negligible on 4 or more threads. This applies to both presented cases: transactions encapsulating one pointer or a group of pointers. In the former case, the transactional version performs at most 8 % worse than the original, the difference being between 1 % and 3 % when running on more than 4 threads. Similarly for the latter: batching results in having only one configuration with an average overhead over 0.5 %. More precisely, we find that when executing the benchmark on two threads, the transactional version ends up having around 12 % fewer iterations than the original implementation.

Figure 12.2 – Number of iterations for one or multiple short-lived tx_ptr pointer copies (TX) and smart pointer copies (Original) in a function: (a) Haswell, 1 pointer; (b) Haswell, 5 pointers; (c) POWER8, 1 pointer; (d) POWER8, 5 pointers.

In conclusion, grouping multiple pointers in a transaction seems to slightly improve the performance of transactional smart pointers, but not enough to clearly outperform the current implementation of C++ smart pointers. On the POWER8 server the difference between the original and the transactional pointers increases with the increasing number of threads. Basically, while the benchmark scales well with the number of threads in both scenarios (copy of 1 random pointer in a function, and, respectively, copies of multiple pointers), the transactional version scales poorly in the former scenario (Figure 12.2(c)) and not at all in the latter (Figure 12.2(d)). This indicates that the transactional implementation on this architecture suffers more from contention than the atomic operations. Thus, by trying to optimize the first results with batching, we also increased the overhead of the transaction with extra operations. In this case we observe that the behavior of the transactional execution is more similar to the runs on the Haswell machine, where it failed to scale as well. This result led us to the conclusion that, in a multi-threaded environment where many operations are involved, the creation of a single tx_ptr does not bring an improvement over a pair of atomic operations. Given this negligible performance gain, we deduce that in this scenario using transactional pointers does not have a significant advantage over the original C++ smart pointers. We believe that the simulated computational workload in the function that creates and destroys the short-lived pointers also played an important role in the unsatisfactory performance. The overhead is partly due to the increased latency of executing any of these operations inside a hardware transaction. Thus, compared to the baseline micro-benchmark, this scenario does not simply weigh the overhead of a transaction against that of a pair of atomic operations; it rather weighs the transactional overhead plus the latencies of all computational operations executed inside the transaction against the overhead of a pair of atomic operations. Even so, the difference is negligible in most cases.

12.4 Summary

We evaluated transactional smart pointers on two micro-benchmarks in order to have a preliminary idea on their performance and potential uses. The results led us to the following two important conclusions:

1. Not all scenarios benefit from transactional smart pointers. The fact that a computational workload inside a transaction significantly decreased the performance of transactional pointers suggests that they cannot replace smart pointers in their entirety in any context. However, in well-chosen scenarios, in which the latency of transactional operations is amortized by the lack of contention and increased parallelism, they would show a clear advantage over the current smart-pointer implementation. We further look at transactional pointers specialized for data structure traversal as an example of such a scenario;

2. Transactional pointers are comparable to classical smart pointers in terms of performance. On the single-threaded baseline, transactional pointers showed up to 69 % improvement over classical pointers. Despite the performance drops in some of the cases presented in Section 12.3, we showed that most of the time the difference between the executions with the original smart pointers and with transactional pointers was negligible. We expect that specific operations that are better tailored for working with hardware transactions can be significantly improved with transactional pointers, while for other scenarios, less suited for HTM, they will not seriously impact the performance.


Chapter 13 Transactional Pointers for Concurrent Data Structures Traversal

Contents

13.1 Motivation ...... 108
13.2 Concurrent queue ...... 108
13.2.1 Algorithm overview ...... 109
13.2.2 Implementation with smart pointers ...... 109
13.2.3 Implementation with transactional pointers ...... 110
13.3 Concurrent linked list ...... 111
13.3.1 Algorithm overview ...... 112
13.3.2 Implementation details ...... 112
13.4 Modifications to the initial tx_ptr implementation ...... 113
13.5 Summary ...... 115

The next step after experimenting with transactional smart pointers in the context of micro-benchmarks is the evaluation of their performance on data structures. We motivate and develop this choice throughout this chapter. We look at concurrent queue and concurrent linked list implementations. We show how to change the data structure implementations to use transactional pointers. Finally, we describe the necessary additions to the initial algorithm of transactional pointers in order to accommodate the new specifications.

13.1 Motivation

Initial experiments with transactional smart pointers on micro-benchmarks returned conflicting results (see Chapter 12). The single-threaded baseline experiment indicated that a hardware transaction should be able to successfully replace a pair of atomic operations and batching multiple pointers in a transaction should greatly improve the performance compared to the original implementation of the smart pointers. At the opposite end, in the short-lived pointers scenario the transactional implementation fails to perform adequately. While the performance is comparable for std::shared_ptrs and tx_ptrs, the former always have slightly better results. We believe that this discrepancy is partly due to the computational workload that is executed inside the transaction in the latter case. This increases the latency of each iteration that uses transactional pointers. As such, we concluded that HTM-based smart pointers have a considerable potential to improve classical std::shared_ptrs, but the context in which they are used significantly influences the extent of the improvement. With these considerations in mind, we further focus on concurrent data structures. More specifically, we look at a singly linked list implementation. The main characteristics of interest in this context are as follows:

• Natural occurrence of pointers with a reference count ≥ 1. The list’s nodes will always be referenced by at least one other pointer until they are removed or the list is destroyed. Thus, traversing a linked list inherently matches our concept of transactional pointer: at every step a hardware transaction will replace a pair of atomic operations and, at the same time, protect the node from being deallocated prematurely. Moreover, batching also fits perfectly in this context, since most of the time more than a few nodes have to be processed before reaching the targeted one;

• Fixed, known transactional workload. Transactions will only contain a check of the node’s value, or a user-defined number of these checks in case of batching. The most appropriate number of pointers in a group can be determined experimentally. As in the single-threaded micro-benchmark, we expect the performance to improve with the increasing number of batched pointers up to a threshold. The threshold can be determined when the hardware transactions start overflowing due to an excessive workload. The number of pointers in a batch could also be restricted based on the latency in case of a conflict;

• Practical application in real-life scenarios. An efficient linked list implementation can be further incorporated in more complex data structures, such as hash tables. Since we are mostly looking at data structure traversal, we expect to improve lookup operations, which represent the majority of operations on this kind of data structure.

These aspects motivated our choice of scenario for experimenting with transactional pointers. This required reassessing the first implementation of tx_ptrs and adapting it to the particular demands of the new context.

13.2 Concurrent queue

For simplicity and reproducibility of our tests, we start experimenting with a concurrent queue. We design it as a linked list structure based only on shared pointers and compare-and-swap (CAS) operations.

Listing 13.1 – Concurrent queue using C++ smart pointers: list node.

struct Node {
    T data;                        // generic data
    std::shared_ptr<Node> next;    // smart pointer to the next node

    Node(T val) {                  // node constructor
        data = val;
        next = nullptr;            // the next pointer is initially set to NULL
    }
};

We only insert elements at the end of the linked list structure and remove the first element. We also provide a list traversal operation, which benefits most from transactional pointers. The goal is to first experiment with tx_ptrs in a simpler context, where the contention and most changes are only present at the extremities of the data structure.

13.2.1 Algorithm overview

For our concurrent queue implementation we used a classical algorithm, namely the unbounded lock-free queue algorithm proposed by Herlihy and Shavit [52, Chapter 10]. In a nutshell, the queue is implemented as a list of nodes, each containing a value and a pointer to the next element in the list. The queue has a sentinel node, called the head. The last element in the queue is called the tail. When the queue is empty, the head and the tail point to the same node. The concurrent queue typically has the following operations:

• enq(T value): a new element is added to the queue. For this, a new node is created, the tail is located and the new node is appended at the end of the queue. Two CAS instructions are employed for this operation: one for appending the node and another one for setting the tail (i.e., changing its prior position to the current last node);

• deq(): the first element is removed from the queue. If the queue is not empty, the head is changed to point to its successor, which thus becomes the new sentinel node. This is safely achieved with a CAS operation.

In addition to these two operations, we implement a simple lookup() operation, which iterates over the queue. It stops if the desired element is found or if the end of the queue is reached. We use this extra operation to simulate concurrent traversal of the data structure. We further present two concurrent queue implementations based on this algorithm: one using smart pointers, the other one using transactional pointers.

13.2.2 Implementation with smart pointers

Instead of having a simple pointer to the next element in the list, each node holds a smart pointer. The head and tail nodes are defined as std::shared_ptr, as well as the new nodes that are appended. When iterating over the list, the iterator is also a smart pointer. We implement the concurrent queue as a C++ template, allowing generic data types T for the elements of the queue.

Listing 13.2 – Concurrent queue using C++ smart pointers: lookup() operation.

std::shared_ptr<Node> lookup(T val) {
    std::shared_ptr<Node> p = atomic_load(&head);
    while (1) {
        if (p == nullptr)        // end of list
            return nullptr;
        if (p->data == val)      // element found
            return p;
        p = p->next;
    }
}

We use CAS operations specifically defined for shared pointers in the C++ standard library libstdc++-v3, included in the GCC 5 release.¹ The result of a CAS operation is repeatedly checked in a loop, until it confirms that the desired operation took place. The operations that insert and delete elements are easily implemented with shared pointers and CAS: insertion atomically swings the last element to the newly appended node, while deletion atomically swings the head to the next node in the queue. Listing 13.1 illustrates the implementation of the list nodes. The head and tail nodes are declared as follows:

std::shared_ptr<Node> head;
std::shared_ptr<Node> tail;

They are initialized to the same empty node in the constructor of the queue class (not shown in the code). The implementation of the main two operations, i.e., enq() and deq(), follows closely the algorithm of the lock-free concurrent queue. The interface of the C++ smart pointers provides all functionality needed for directly adapting the algorithm to use their logic. The atomic operations (e.g., atomic_load, atomic_compare_exchange_strong) enforce safety and ensure that no intermediate results of the operations are seen during the execution. The lookup() function (Listing 13.2) iterates over the list sequentially until it finds the requested value or reaches the end of the list. It starts with the current head. While this is not a typical operation for a queue, we add it to our implementation in order to further experiment with transactional pointers in this context. We consider that the list traversal could benefit the most from our implementation of tx_ptrs.
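As an illustration of how enq() maps onto the free-function atomic operations for std::shared_ptr, a possible rendering is sketched below. This is our own sketch following the Herlihy and Shavit algorithm (including the step that helps a lagging tail), not the exact code of our implementation.

// enq(): append a new node at the end of the queue using CAS on shared pointers.
void enq(T val) {
    auto node = std::make_shared<Node>(val);
    while (true) {
        std::shared_ptr<Node> last = std::atomic_load(&tail);
        std::shared_ptr<Node> next = std::atomic_load(&last->next);
        if (last == std::atomic_load(&tail)) {        // tail did not change under us
            if (next == nullptr) {
                // Link the new node after the current last node.
                if (std::atomic_compare_exchange_strong(&last->next, &next, node)) {
                    // Swing the tail to the new node; failure is harmless (someone helped).
                    std::atomic_compare_exchange_strong(&tail, &last, node);
                    return;
                }
            } else {
                // The tail is lagging behind: help advance it, then retry.
                std::atomic_compare_exchange_strong(&tail, &last, next);
            }
        }
    }
}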

13.2.3 Implementation with transactional pointers

The next step was to change the above implementation to use transactional pointers. As mentioned before, since the enq() and deq() operations only need to swing a pointer to add or remove an element, they do not fit our target scenarios that can be improved by transactional pointers. Hence, we focus on the lookup() operation that iterates over the list. This operation also offers a suitable medium for experimenting with our batching optimization.

¹ In C++11, the operation atomic_compare_exchange_strong(p, expected, desired) checks if p has the same value as expected: if so, the value desired is atomically assigned to p; otherwise, expected becomes equal to p.

Listing 13.3 – Concurrent queue using C++ transactional pointers: lookup() operation.

1   std::shared_ptr<Node> lookup(T val, int batch_size) {
2       bool add_to_tx = false;   // first pointer should start a transaction
3
4       std::shared_ptr<Node> p(add_to_tx, batch_size, head);   // start transaction
5       while (1) {
6           if (p == nullptr)
7               return nullptr;
8           if (p->data == val)
9               return p;
10          p = p->next;          // commit transaction and start a new one
11      }
12  }

Transactional pointers can be directly constructed from classical smart pointers. Therefore, instead of iterating with a pointer of type std::shared_ptr over the list of smart pointers, we use a pointer of type tx_ptr for this task. However, instead of incrementing and decrementing the reference count each time it passes over a smart pointer, the tx_ptr only starts a transaction and commits when stepping to the next node. The only modification needed in the code is replacing the constructor of the pointer that will iterate over the list with the customized constructor for transactional pointers. Listing 13.3 shows the implementation of the lookup() function that uses transactional pointers. The only line that needs to be changed is emphasized (i.e., line 4). The transactional constructor starts the first transaction, which commits when moving to the next node. Then, a new transaction is started automatically and committed at the next step. This continues until the loop ends. In order to take advantage of batching, we also add another parameter to the signature of the function, called batch_size. It serves to set the size of the group of pointers we want to be encapsulated in a single transaction. Typically, this parameter is set to -1 for no batching or any value greater than 1 otherwise. Later, this parameter is passed to the transactional constructor. The customized constructor only needs the following additional information: if it must start a new transaction or the pointer it creates is added to an already existing transaction (parameter add_to_tx), the size of the batch and the smart pointer to be referenced. In this case, we start iterating from the head, i.e., a new reference to the first node is created, then it advances to the next nodes.
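As a usage example (with an illustrative queue class name and arbitrary values), the traversal can then be invoked with or without batching as follows:

ConcurrentQueue<int> q;                         // hypothetical class name for our queue
q.enq(7);
q.enq(42);
auto hit  = q.lookup(42, /*batch_size=*/ 4);    // batch 4 nodes per hardware transaction
auto miss = q.lookup(99, /*batch_size=*/ -1);   // no batching: one transaction per visited node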

13.3 Concurrent linked list

We further extend our linked list data structure to execute insert and delete operations on any element, rather than just at the ends. We consider this necessary for evaluating transactional pointers in more contended scenarios as well. In this case, the list traversal can conflict with update operations anywhere in the list.

13.3.1 Algorithm overview

Similar to the queue scenario, we rely on a classical concurrent linked list algorithm, more precisely the lock-free list algorithm proposed by Herlihy and Shavit [52, Chapter 9]. The list structure stays the same as for the queue, having two sentinel nodes and each node containing a value and a pointer to the next element in the list. It accepts the typical operations add(T value), remove(T value) and lookup(T value). The list is represented as a sorted set of unique values, as dictated by the lock-free list algorithm. The two sentinel nodes contain the minimum and maximum values permitted by the data type used. The list operations have the following particularities:

• The lookup() operation serves the same purpose as in the concurrent queue algorithm and keeps the same logic;

• The remove() operation finds the node with the requested value and logically removes it. More precisely, the node in question is marked as removed, but left in the list. This is done in order to avoid inconsistencies in the execution. The node containing the desired element (given as parameter to the remove() function) is found with the help of a function that iterates over the list and returns the nodes on either side of the element. A marked node is physically removed from the list in subsequent calls to a find() function, called either when removing or when inserting elements;

• The add() operation starts by finding the place of the new value in the sorted list. If the same value is already in the list, it returns. Otherwise, the find() function returns the nodes that will be the potential predecessor and successor of the new node. Then, the node is atomically inserted and linked with these nodes. If the atomic operation does not succeed (i.e., the list was modified by another thread in the meantime), new potential neighbor nodes are found and the operation retries.

In order for the algorithm to be correct, the list pointers have to be of a type that contains both the reference and a mark, and to be able to update them atomically, both at the same time or individually. Thus, all CAS operations needed when removing (logically or physically) or inserting an element, also compare the mark and update it atomically. This is the main difference in the list structure between the concurrent queue algorithm and the linked list.

13.3.2 Implementation details

We implement the concurrent lock-free list algorithm in C++ with raw pointers (as baseline), classical smart pointers and transactional pointers. Some modifications were necessary compared to the queue structure, in order to accommodate atomic marked pointers. For the baseline implementation we rely on the std::atomic library in C++, which allows us to use atomic operations on raw pointers. However, instead of a simple raw pointer for the next field, we use an aligned structure containing the pointer and the mark, and a double-width CAS to update both of them atomically. The mark is an integer that accepts two values: 0 (the node is not marked for removal) and 1 (the node is marked for removal). We define a wrapper function over the typical std::atomic_compare_exchange_strong() that handles marked-pointer structures and subsequently calls the standard operation. Besides this approach to marking node pointers, the implementation follows closely the theoretical algorithm.
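A minimal sketch of such a marked-pointer structure and its CAS wrapper is shown below, assuming an x86-64 target where a 16-byte compare-and-swap is available (e.g., built with -mcx16 or linked against libatomic); the names are illustrative, not our exact code.

#include <atomic>
#include <cstdint>

struct Node;   // forward declaration of the list node

// Pointer and removal mark packed into one 16-byte, 16-byte-aligned value so
// that std::atomic can update both with a single double-width CAS.
struct alignas(16) MarkedPtr {
    Node*     ptr;
    uintptr_t mark;   // 0 = node is live, 1 = node is logically removed
};

// Wrapper mirroring std::atomic_compare_exchange_strong() for marked pointers.
inline bool cas_marked(std::atomic<MarkedPtr>& target,
                       MarkedPtr expected, MarkedPtr desired) {
    return target.compare_exchange_strong(expected, desired);
}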

For the implementation with std::shared_ptrs we adopt a different strategy. This is necessary because smart pointers are at least twice the size of raw pointers, so a double-width CAS would only be enough for a std::shared_ptr object alone, leaving no room for the mark. Instead, we add a mark field in the control block of smart pointers. Thus, we do not modify the size of a smart pointer and each smart pointer has a mark directly associated. In addition, we rewrite the standard CAS operation for std::shared_ptrs to alter the internal mark as needed. With this strategy, the nodes of the linked list are defined as std::shared_ptr objects, as before. The benchmark that we use handles the list as an integer set; as such, for simplicity, we initialize the sentinel nodes with the minimum integer (INT_MIN) and maximum integer (INT_MAX), so that they always represent the extremities of the linked list. After these alterations, the rest of the operations are implemented exactly as dictated by the algorithm. The lookup() function has the same implementation as for the queue. It is important to note that, even though the reference counting in smart pointers helps solve some concurrency issues such as the ABA problem [52, Chapter 10], smart pointers are not thread-safe if read and modified at the same time. For example, the assignment of a smart pointer is not atomic. Therefore, synchronization through the atomic library was needed for a correct and safe execution when reading the value of the next pointer, the head, etc. Finally, we provide an implementation featuring transactional pointers. Since we aim to directly compare the linked list with the queue, we kept the same logic: the transactional pointers are only used in the lookup() function with the goal of improving list traversal. This function has the same implementation for both data structures, thus there is no special modification to be done for adding transactional pointers other than what is described in Section 13.2.3.

13.4 Modifications to the initial tx_ptr implementation

The initial implementation of transactional pointers only started and committed transactions in the constructor and destructor of the transactional pointer. Moreover, batching was managed through the constructor as well, with the help of an extra parameter. This design was suitable for the scenarios presented in Section 11.5. However, our goal for concurrent data structure traversal is twofold: to encapsulate each iteration of the loop in the lookup() function in a hardware transaction; and to do so transparently for the user. In other words, we aim to make switching from classical smart pointers to transactional pointers as seamless as possible. In order for the transactional pointers to work transparently, we modify the overloaded ‘=’ (assignment) operator of C++ smart pointers to provide the required functionality. By default, this operator was responsible for assigning a raw pointer to the new std::shared_ptr and for sharing the reference count object. The reference count was altered to reflect the new state of the smart pointer, i.e., atomically incremented on both sides. In the case of transactional pointers, we must avoid touching the reference count, on the one hand, and we must handle transactions, on the other. In addition, we have to modify the assignment operator to handle batching. To this end, we implement an internal counter for tx_ptr and modify the ‘=’ operator to commit and start a new transaction when the counter indicates the end of a batch. Whenever the transaction aborts due to a conflict, all the changes made to the group of pointers are rolled back and all pointers are recreated as common C++ smart pointers with reference counting. These steps are illustrated in the pseudocode in Algorithm 6. On the left hand side we display the steps executed by the default implementation of C++ smart pointers, for comparison. On the right hand side, we show the necessary modifications for our strategy.

Algorithm 6 Assignment operator: initial implementation (left) and extended implementation for transactional pointers logic (right).

 1: function smart_ptr::op=(smart_ptr p)
 2:     this.rawptr ← p.rawptr
 3:     this.refcount ← p.refcount
 4:     return this
 5: end function

 6: function tx_ptr::op=(smart_ptr p)
 7:     if is_fallback then
 8:         Fallback(p)
 9:         return this
10:     end if
11:     if curr_batch > 1 then
12:         this.rawptr ← p.rawptr
13:         this.state ← p.state
14:         curr_batch ← curr_batch − 1
15:     else
16:         Tx_End()
17:         if curr_batch ≠ −1 then
18:             curr_batch ← batch_size
19:         end if
20:         if Tx_Start() then
21:             this.rawptr ← p.rawptr
22:             this.state ← p.state
23:         else
24:             Fallback(p)
25:         end if
26:     end if
27:     return this
28: end function

First, we check if the fallback path has to be followed (line 7). This is especially important when batching is used and multiple pointers have to roll back and execute the assignment method again: if they are not protected by transactions anymore, they will find the variable is_fallback enabled. Thus, they will be aware they have to follow the path that turns them back into normal smart pointers. Otherwise, if no conflict or abort is detected, the execution further depends on the current state of the batch. If we are inside a batch (that is, the group of pointers has not been entirely processed yet), we just assign the raw pointer to the iterator and decrement the number of pointers left to process in the current batch (lines 11 – 14). This is done with the help of the aforementioned local counter, called curr_batch. Otherwise, if we reached the end of a batch (line 15), the current transaction commits (line 16), the counter is reinitialized with the batch size (line 18) and we try to start a new transaction for the next group of pointers (line 20). An important note is that we only decrement the current batch size and reinitialize it with the total size if batching was requested (i.e., if batch_size ≠ −1). Otherwise, a transaction is committed and started at every call of the assignment method. Finally, if the transaction starts successfully, we follow similar steps as in the constructor: we copy the raw pointer and the state field. If the transaction aborts, the fallback path described in Section 11.3 is followed. Altering the assignment operator represents the most relevant change that was required in order to adapt our transactional pointers’ logic to the data structure traversal implementation. Other tx_ptr-specific methods were kept as they were. The transactional constructor did not need any modification either, except for the new argument (i.e., batch_size).
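Rendered in C++, the right-hand side of Algorithm 6 could look roughly as follows. The member names (rawptr, state, curr_batch, batch_size), the state_of() accessor and the Tx_Start/Tx_End wrappers are assumptions carried over from the earlier sketches, not the actual modified libstdc++ code.

// Sketch of the overloaded assignment operator used during list traversal.
tx_ptr& tx_ptr::operator=(const std::shared_ptr<Node>& p) {
    if (is_fallback) {            // a previous abort turned this group into normal smart pointers
        fallback(p);
        return *this;
    }
    if (curr_batch > 1) {         // still inside the current batch: stay in the same transaction
        rawptr = p.get();
        state  = state_of(p);     // access the shared state field (assumed accessor)
        --curr_batch;
    } else {
        Tx_End();                 // end of the batch: commit the covering transaction
        if (batch_size != -1)
            curr_batch = batch_size;
        if (Tx_Start()) {         // open a new transaction for the next group of nodes
            rawptr = p.get();
            state  = state_of(p);
        } else {
            fallback(p);          // abort: fall back to classical reference counting
        }
    }
    return *this;
}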

13.5 Summary

This chapter presented two new scenarios in which we experimented with transactional pointers. We focused on concurrent data structures. Based on the same linked list structure we implemented a concurrent queue and a concurrent ordered set, with classical smart pointers and with transactional pointers. We justified our choice and explained the differences between the implementations. Only a single line needs to be changed in the implementation of the data structure in order to pass from smart pointers to transactional pointers. We also discussed what changes were necessary inside the tx_ptr logic, to make this transformation transparent for the user.


Chapter 14 Evaluation on Concurrent Data Structures

Contents

14.1 Experimental setup ...... 118
14.2 Read-write workloads ...... 119
14.2.1 Concurrent queue evaluation ...... 119
14.2.2 Concurrent linked list evaluation ...... 120
14.3 Read-only workload ...... 121
14.4 Discussion: abort rate ...... 123
14.5 Summary ...... 124

The goal in creating transactional pointers is to improve the performance of classical smart pointers and to learn what scenarios are most convenient for their usage. We motivated our choice of concurrent data structures in Section 13.1. We are not directly comparing the implementation with smart pointers against the most efficient linked list implementations, since it is a well-known fact that smart pointers provide worse performance than raw pointers [65, Chapter 4]. However, smart pointers have many other advantages. The use of C++ smart pointers for the nodes enhances simplicity (manual memory management is not required) and safety (possible illegal accesses are handled or eliminated automatically). Recent proposals in C++ development [80] recommend always using smart pointers. In what concerns data structures, for example, the ABA problem can be easily eliminated by using smart pointers instead of raw pointers. The reason is that a reference inside a smart pointer is kept alive as long as some thread still points to it, making it impossible for another thread to allocate the same address for different data. Thus, we are not explicitly trying to create smart pointers that match the efficiency of raw pointers, but we are rather focusing on improving their performance to some degree, while still keeping the memory management advantage. However, in the evaluation that follows we also show the performance of an implementation with raw pointers as baseline, in order to place our results in a broader context.

14.1 Experimental setup

We experiment with the implementation based on the original C++ smart pointers and our transactional version, first without batching, then adding this feature as well. We test on two different architectures, as before: an 8-core (2 threads per core) Intel i7-5960X @ 3.00 GHz (codename Haswell) machine with 32 GB RAM and a 10-core (8 threads per core) IBM POWER8 @ 3.42 GHz with 30 GB RAM, both with fully integrated HTM support. We run the experiments on up to 8 threads pinned to the cores. We develop a benchmark for comparing the performance of the smart and transactional pointer implementations of the concurrent linked list structure (used as either a queue or an integer set). The benchmark works as follows. We initialize the list and populate it with elements. We start a number of threads that share the list. Each thread applies insert, remove and lookup operations on the shared list according to a given ratio. We measure the time it takes each thread to finish the associated operations. In order for all threads to have comparable workloads, we generate the workloads before starting the threads. Specifically, we generate a random succession of operations according to the proportions given for each type of operation. Then, we generate a list of elements that will be inserted in the shared data structure, and a list of elements that will be looked up, based on the elements that are inserted. Given the dynamic character of the benchmark (a large number of concurrent insert and remove operations), not all the elements in the lookup-list will be found in the shared linked list at the moment when the operation is performed. When the list is used as an ordered set, specific elements must be passed to the remove(T value) function as well, as opposed to the concurrent queue which always dequeues the first element. For this, we use the same list as for the function that inserts elements. This way, we make sure that part of these elements are already inserted in the list. Generally, around three quarters of the elements in the insert-list are found and removed. We experiment with four main configurations, three read-write workloads and one read-only workload. The latter is a 100 % lookup workload, on a list initialized with 10^4 elements. For the queue implementation, each thread has to execute 10^6 operations on any workload. For the linked list implementation, each thread has to execute 10^5 operations on the shared list. The read-write workloads consist of the following combinations:

• 20 % insert, 20 % remove and 60 % lookup operations (balanced);

• 40 % insert, 40 % remove and 20 % lookup operations (update-intensive);

• 5 % insert, 5 % remove and 90 % lookup operations (read-intensive).

These three workloads are run on a list initially populated with 10^3 elements. At least half of the elements the threads search for are found in the list. We measure the time to execute the workload with the system’s high resolution clock, std::chrono::high_resolution_clock. The result is handled as a std::chrono::duration with millisecond accuracy. For each run we take the maximum between the times reported by each thread, then compute the mean over the 20 runs.
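The measurement can be summarized by the sketch below (an assumed harness written for illustration, not the actual benchmark code): each thread times its pre-generated sequence of operations with std::chrono::high_resolution_clock, a run is scored by its slowest thread, and the reported value is the mean over the runs.

    #include <algorithm>
    #include <chrono>
    #include <thread>
    #include <vector>

    using ms = std::chrono::duration<double, std::milli>;

    // Executes 'work(thread_id)' on 'num_threads' threads and returns the
    // time (in ms) taken by the slowest thread.
    template <typename Workload>
    double run_once(int num_threads, Workload work) {
        std::vector<ms> per_thread(num_threads);
        std::vector<std::thread> threads;
        for (int t = 0; t < num_threads; ++t) {
            threads.emplace_back([&, t] {
                auto start = std::chrono::high_resolution_clock::now();
                work(t);  // pre-generated operations for this thread
                per_thread[t] = std::chrono::high_resolution_clock::now() - start;
            });
        }
        for (auto& th : threads) th.join();
        return std::max_element(per_thread.begin(), per_thread.end())->count();
    }

    // Mean over several runs of the per-run maximum.
    template <typename Workload>
    double run_benchmark(int num_threads, Workload work, int runs = 20) {
        double total = 0.0;
        for (int r = 0; r < runs; ++r) total += run_once(num_threads, work);
        return total / runs;
    }

Taking the per-run maximum reflects the time until all threads have finished their workload, which matches the per-run value described above.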

[Plot data omitted: panels (a) 40 % update, 60 % read and (b) 80 % update, 20 % read; legend: raw pointers (baseline), smart pointers, TX pointers; Y axis: time/operation (ms); X axis: #threads (1, 2, 4, 8).]

Figure 14.1 – Execution time per operation on Intel Haswell for a concurrent queue implemented with smart pointers and transactional pointers, read-write workloads (no batching).

14.2 Read-write workloads

As a starting point, we evaluate the concurrent queue and the concurrent linked list on the read-write workloads. Both run the balanced workload, for comparison. Then, as a queue structure is usually update-intensive, we also test it on the second workload. Likewise, linked lists are search data structures, with a majority of lookup operations, hence we check the list’s performance on the read-intensive workload. Generally, the measured times vary from one workload to the other and between architectures, depending on the contention level and the proportion of lower-latency operations (e.g., update operations on the concurrent queue).

14.2.1 Concurrent queue evaluation

We first evaluate the simple transactional implementation, without batching, on the first two read-write workloads. The results on the Intel Haswell machine are shown in Figure 14.1. The Y axis shows the time per operation in milliseconds. We also mark with a line the baseline performance, i.e., the implementation with raw pointers. The goal is to provide an intuition of how much closer to the best performance we get with our improvement over smart pointers. For the balanced workload (Figure 14.1(a)), we observe that the transactional version generally performs better than the implementation using classical C++ smart pointers. Most of the time the improvement is substantial (24 % to 36 %). The execution time per operation worsens with the increasing number of threads, a trend present in both smart pointer implementations (and in the baseline). The performance gain with transactional pointers increases with the number of threads, reaching 36.5 % on 8 threads. Next, we analyze the results of the update-intensive workload (Figure 14.1(b)) and make the following observations. First, the results are similar to the outcome of the previous experiment. Specifically, we identify the same trend in scaling with the number of threads for both implementations independently: the performance worsens with the number of threads. However, the improvement with respect to classical smart pointers no longer scales up to 8 threads, being most prominent on 4 threads (i.e., 34 %). This happens because the update-intensive workload involves increased contention as the number of threads grows. Thus, the performance gain starts dropping after 4 threads, reaching 20 % on 8 threads. The second important observation is that the transactional pointers seem to suffer more from contention than the original atomic operations: the gain is somewhat smaller than for the other workload.

[Plot data omitted: panels (a) 40 % update, 60 % read and (b) 80 % update, 20 % read; legend: smart pointers, TX pointers; Y axis: time/operation (ms); X axis: #threads (1, 2, 4, 8).]

Figure 14.2 – Execution time per operation on IBM POWER8 for a concurrent queue implemented with smart pointers and transactional pointers, read-write workloads (no batching).

We run the same experiments on the IBM POWER8 machine (Figure 14.2). We do not show the baseline in this case since we expect it to be proportionally similar to the above results. We observe that the execution time per operation is generally longer on the POWER8 architecture than for the same experiments on Intel Haswell. The improvement is less prominent as well (at most 24 % on any of the workloads). What is more, on this architecture the gain with transactional pointers constantly decreases with the increasing number of threads. On both workloads we find the smallest improvement on 4 threads: 18 % on the balanced workload and 16 % on the update-intensive workload. On 8 threads, transactional pointers perform marginally worse than the original smart pointers on both workloads, showing an overhead of 3 % on the balanced and 0.5 % on the update-intensive workload. We believe this happens because of the increasing contention. Overall, these results indicate that, even if we replace only a single pair of atomic operations with a hardware transaction, we already start gaining in performance.

14.2.2 Concurrent linked list evaluation

We further experiment with a linked list implementation in order to have a scenario in which the transactions can conflict with insert and remove operations over the entire length of the list, not only at its extremities. We aim to understand to what extent transactional pointers are impacted by contention. We already have an idea from the queue experiments on the update-intensive workload, but in the case of the linked list the potential conflicts are spread over all elements. Figure 14.3 shows the results for the balanced workload and for the read-intensive workload. As for the queue experiments, we also include the baseline to give an idea of how raw pointers would perform on this algorithm. The observations made on the queue structure mostly hold in this scenario as well: the time per operation does not scale with the number of threads. We find the smallest gain on 8 threads (5 % on the balanced workload and 18 % on the read-intensive one). When we directly compare the outcome for the list and for the queue on the balanced workload, it is obvious that, while the performance of transactional pointers is still slightly better, the gain is negligible (between 0.5 % and 7 % improvement).

[Plot data omitted: panels (a) 40 % update, 60 % read and (b) 10 % update, 90 % read; legend: raw pointers (baseline), smart pointers, TX pointers; Y axis: time/operation (ms); X axis: #threads (1, 2, 4, 8).]

Figure 14.3 – Execution time per operation on Intel Haswell for a concurrent linked list implemented with smart pointers and transactional pointers, read-write workloads (no batching).

The two types of pointers have comparable results, both far from the baseline. We conclude that both the type of workload and the extent of contention are important factors in the decision of using transactional pointers in a particular scenario. On the other hand, we consider the distribution of operations in the read-intensive workload more appropriate for real-life workloads on search data structures. Figure 14.3(b) shows the performance of our transactional pointers on a linked list running this workload. In this favorable case of a less-contended list traversal, we find that transactional pointers improve on smart pointers by up to 22 %. Generally, the improvement is more stable in this case, varying only between 18 and 22 %. These experiments confirm the findings in Chapter 12: if the workload is convenient for transactional pointers, then the improvement is non-trivial; otherwise, they have comparable performance with smart pointers.

14.3 Read-only workload

For analyzing the performance on the read-only workload we can use either the queue or the list, since the implementation of the lookup operation is the same for the two. Moreover, for either of the two structures, the only difference between its two implementations (i.e., with std::shared_ptr and with tx_ptr) is in the way in which the lookup function works. As such, we focus on stressing and comparing strictly this operation. Figure 14.4 shows the results in this scenario for the implementation with the original C++ smart pointers, as well as transactional pointers with one transaction per pointer, one transaction for a group of 3 pointers, and one transaction for a group of 5 pointers. First, looking at the results on the Intel Haswell architecture (Figure 14.4(a)), we observe that the average execution time per operation has the same order of magnitude as in the read-intensive workload on the linked list and is one order of magnitude higher than in the update-intensive workload on the concurrent queue. Normally the queue update operations have a smaller latency than the lookup, hence the shorter reported times for a combined workload. The results on the read-only workload are also more stable. Second, we note that the improvement for a single transactional pointer (i.e., without batching) stays at 26 % – 28 %, consistent with the first read-write experiment on the queue and the read-intensive one on the linked list.

[Plot data omitted: panels (a) Intel Haswell and (b) IBM POWER8; legend: raw pointers (baseline), smart pointers, TX pointers (single), TX pointers (batch 3), TX pointers (batch 5); Y axis: time/operation (ms); X axis: #threads (1, 2, 4, 8).]

Figure 14.4 – Execution time per operation for a concurrent queue implemented with smart pointers and transactional pointers, with 100 % lookup operations (batching enabled).

Then, as we group more pointers inside a hardware transaction, the transactional implementation becomes around 50 % faster with a batch of 3 pointers and around 55 % faster with a batch of 5 pointers than the original implementation. Thus, the performance increase is more pronounced when passing from no batching to a group of 3 pointers than from a batch of 3 to one of 5 pointers. As the batch size increases, the performance improvement reaches a plateau and starts degrading when the batch becomes too large to be handled properly by a hardware transaction. We also experimented with batches of 10 and 20 pointers inside a single transaction (not shown in the figure) and noticed that the gain was negligible compared to the one for 5-pointer groups. Regarding scaling with the number of threads, we mostly observe the same trend as for the balanced workload on the queue: the improvement increases slightly with the number of threads. When comparing the results on Haswell with the ones on POWER8, we make several interesting observations. The trends are similar for the two architectures: scaling with the batch size and with the number of threads, in most cases. A batch of 5 pointers results in up to 50 % improvement. But this is where the similarities stop. On POWER8, the scaling with the batch size is more notable from a batch of 3 pointers to a batch of 5 than from no batching to a batch of 3 pointers. Also, the performance increase with the number of threads is much more visible than on Haswell. However, the actual improvement over smart pointers decreases with the number of threads: from 28 % down to 3 % for a single pointer and a batch of 3 pointers, and from 50 % improvement down to 50 % overhead for a batch of 5 pointers. The case where we use groups of 5 pointers on 8 threads is the only one in which we encounter such an overhead. We believe that this is explained by the contention and the high cost of rolling back a bigger group of pointers in case of abort.

[Plot data omitted: panel (a) abort rate for the read-write, update-intensive read-write and read-only (no batching) workloads; panel (b) abort rate for the read-only workload with single pointers, batches of 3 and batches of 5; Y axis: TX aborts (%); X axis: #threads (1, 2, 4, 8).]

Figure 14.5 – Evolution of abort rate with respect to the number of concurrent threads, type of workload (left) and batch size for a read-only workload (right).

14.4 Discussion: abort rate

Finally, we analyze the impact of the abort rate on the performance of transactional pointers. For this study, we employ the concurrent queue implementation, with the various workloads described. We run the experiments on the Intel Haswell server and measure transactional statistics with the perf tool. We look at the statistics for the following events: tx-start, tx-abort and tx-commit. They indicate the number of transactions that were started and how many of these were aborted and committed, respectively. The reported results represent the mean over 20 runs. We focus on two main cases: how the abort rate evolves with respect to contention (Figure 14.5(a)), and how it changes with respect to the batch size on a read-only workload (Figure 14.5(b)). Both figures show the percentage of aborted transactions on the Y axis. The number of started transactions (for which we compute the abort rates) is on the order of 10^9 to 10^11, depending on the number of threads and batch size. For the former scenario, we compare the abort rate on the first two read-write workloads and on the read-only workload without batching. We emphasize the values that are under 10^-3 % with a label indicating the value, since the corresponding bars are not visible. We observe that the abort rate increases with the number of threads for the update-intensive workload, while doing the opposite for the balanced scenario. We believe this happens because in the update-intensive scenario fewer transactions are started (smaller proportion of lookup operations), but there are more aborts due to the increased contention with the number of threads. For the balanced workload we have many aborts as well, but the number of started transactions grows faster with the number of threads. As expected, the abort rate is also greater for the update-intensive workload than for the other two. Likewise, the read-only workload has the smallest rate: it never exceeds 10^-3 % aborted transactions. We can say that, generally, the abort rates are negligible in all workloads, with less than 0.5 % aborts in the most contended scenario. However, if we correlate these results with the execution time per operation observed in Section 14.2.1, we can speculate that the number of aborts, among other issues, plays a role in the weaker performance of the transactional pointers. Thus, we conclude that it is important to study the potential abort rate when deciding what scenario would benefit most from a transactional smart pointer implementation.

In the latter scenario we studied the evolution of the abort rate with the batch size. Again, the trend matches our expectations: the abort rate increases both with the growing batch size and with the increasing number of threads. However, the worst abort rate was under 10^-2 % of the total number of started transactions. As long as the batch is small enough to avoid overflowing the transaction, the impact of aborts on the execution time in the read-only scenario is minimal. However, it is important to note that, if the transaction protecting a large batch aborts, it is rolled back and all the transactional pointers are restarted as classical smart pointers. This may reduce the beneficial effect of batching. Therefore, for good performance it is important to find a batch size that balances the gains of batching against the cost of rolling back the whole group.

14.5 Summary

This chapter concludes the evaluation of HTM-based smart pointers. We experimented with two concurrent data structures on four different workloads. An important insight is that some scenarios proved to be more appropriate for using transactional pointers, in which case they improved the performance of classical smart pointers by up to 55 %. In the other scenarios, transactional pointers did not degrade performance compared to smart pointers. We also studied the impact of aborts on the transactional pointers’ performance for the various workloads. We found that the number of aborts in this context is negligible and has a minimal effect on the outcome.

Part IV

Conclusions and Future Work


Chapter 15 Conclusions

While the computing world moves towards an ever increasing number of processors in commodity hardware and writing correct software becomes overly complex, programming paradigms tend to turn in the direction of lock-free synchronization and automatic memory management. The aim of this work was to combine the two concepts and to integrate a novel synchronization method into memory management mechanisms. Even though hardware transactional memory (HTM) has been offered for public use in the last few years as a convenient alternative to classical locking systems, its major advantages in practice are yet to be discovered. We believe that HTM represents the perfect answer to the various performance problems that still discredit automatic memory management in general. Since transactional memory is able to run pieces of code atomically without blocking, it was an important breakthrough in multi-core synchronization. However, the first implementations (in software) were less practical than desired. Efforts have been made to integrate TM in hardware, and in 2013 the first implementations started to appear in commodity hardware. Since then, a lot of research has been done on finding scenarios and situations that benefit most from HTM-based synchronization. We complement this work with results and recommendations on how to improve the performance of memory management with HTM in two widely-used real-world frameworks, namely Java and C++. More precisely, we focused on the following aspects:

Extensive study on the performance of Java GCs. We evaluated the extent to which garbage collectors affect the execution of an application, with respect to throughput and responsiveness. We found that the results obtained from testing the GCs on a benchmark suite are contradicted to some degree by real-life usage in a memory-intensive server application. We believe that one of the reasons for this is the small memory footprint of the benchmarks, which were implemented in 2009 for memories smaller than those of today’s servers. However, by running the benchmarks we learned that the GCs do not always behave as expected; contrary to our assumptions, they showed that:

• enabling the TLAB is not always beneficial, but can also decrease the performance of the application;

• the average pause time of the GC can increase with the decreasing young generation size, for the same heap size;

• for the benchmark suite, the concurrent GCs (ConcurrentMarkSweep and G1) were ranked last in terms of performance, while the default GC, ParallelOld, proved to be the most stable, with good performance in terms of GC pauses and total execution time.

However, our experiments on the Apache Cassandra Server, using a large heap and young generation, indicated the opposite: in this situation, ParallelOld resulted in huge pause times for the application (hundreds of seconds, up to 4 minutes long), becoming unacceptable in practice. Even the G1 and ConcurrentMarkSweep collectors introduced pauses of a few seconds (up to 3.5 seconds for G1), which might affect the application in the case of a real-time or distributed system, where the nodes need to communicate without delay. We showed that the encountered garbage collections also harm the user experience: almost every peak in the client response time was associated with a collection on the server.

Algorithm for an HTM-based pauseless GC in Java. The most widely used collectors in OpenJDK8 – ParallelOld, ConcurrentMarkSweep, and G1 GC – are all parallel, but none of them is fully concurrent. In other words, all of them use dedicated threads for collecting but, at the same time, they all need to stop the application threads during different phases of their collection. Moreover, as previously mentioned, the pauses caused by the GCs are significant. These limitations led us to consider creating a concurrent GC algorithm that employs hardware transactional memory. Not only would this remove the main stop-the-world pauses, but it would also permit the GC to take full advantage of a multi-core system without draining its resources. There are two different approaches for implementing a GC with HTM support: one can instrument either the GC accesses or the application accesses. In our algorithm we followed the latter approach, replacing the blocking access barriers of the mutators with transactions. We then devised a few optimizations for improving the GC implementation.

Transactional barriers: implementation and evaluation. We built the implementation on top of HotSpot’s CMS collector. Before fully implementing a new GC, we first focused on assessing the overhead introduced by our transactional implementation, with the goal of verifying whether our approach is feasible. We tested on a widely-used benchmark suite, DaCapo, and on a real-life client-server system with Apache Cassandra. We implemented the transactional barriers in the interpreter and in the JIT compiler. We found that in interpreter mode, after a first optimization of the algorithm, we obtained a reasonable overhead of less than 5 % for all benchmarks. However, by checking the equivalent overhead values for transactional barriers in the JIT compiler, we found that they were not within reasonable limits, with 5 out of 6 DaCapo benchmarks and 6 out of 7 Cassandra workloads showing overheads greater than 20 %, going as high as 195 % for one DaCapo benchmark. We reassessed the algorithm and proposed a second optimization, selective access barriers. The barriers select the fastest way to proceed depending on the GC state. If a collection is in progress, they install transactional barriers and allow the application to continue concurrently with the GC, while all potentially dangerous accesses are monitored in hardware; otherwise, they do not cause any overhead. We found a lower bound for the possible savings and we showed that even in these conditions the previous overhead was reduced by 30 % to 40 % for write-intensive workloads and by up to 93 % for read-intensive workloads. We further estimated the CPU time of a potential transactional GC based on the current data and observed that it was comparable to CMS in all cases. This is a good indication that the selective barriers make a full implementation of the pauseless transactional GC worthwhile, as opposed to the former transactional barriers. Therefore, we also discussed the challenges encountered when implementing such a GC, and outlined the next steps towards its completion.

HTM-based smart pointers in C++. We designed a transactional pointer structure on top of the C++ std::shared_ptr, which uses reference counting for correctly managing the memory. Reference counting is a convenient form of memory management with interesting properties and synchronization features, where each object is protected from invalid accesses by keeping a shared reference counter. In some cases the atomic increment/decrement operations on the shared counter prove to be unnecessary and expensive. We considered this to be a promising opportunity for improvement with HTM. Our goal was to replace the atomic operations needed for the creation and destruction of the smart pointer with a hardware transaction. We experimented with micro-benchmarks, in single- and multi-threaded settings, on two different architectures and with the possibility of batching multiple pointers in a transaction. We also compared the performance of the original and transactional implementations on concurrent data structures: a queue and a linked list. We obtained up to 55 % improvement over smart pointers on read-intensive workloads, and no deterioration in contended scenarios. We believe that the results provide valuable insights into which scenarios would benefit most from using transactional pointers.

In conclusion, this work aimed to provide a few solutions for improving known real-world issues with a novel synchronization technique, hardware transactional memory. Throughout our experiments, we learned some of the downsides and advantages of HTM. Most importantly, we observed that HTM does not provide a straightforward, easy-to-use and efficient solution in all cases. Rather, it usually needs a fair amount of tweaking to work in the desired way. This is partly due to the implementation, inherently limited in hardware. However, despite all these issues, once optimized to fit a particular problem, HTM provides considerable improvement over classical synchronization methods. We applied these findings to automatic memory management. The real-life systems we chose to improve (i.e., Java’s GCs and C++ smart pointers) were already optimized and widely used. We showed there is still room for improvement and that HTM is a good candidate for tackling specific issues in this context. We believe our work is a stepping-stone towards implementing fully-concurrent Java GCs and lower-latency smart pointers for concurrent data structure traversal.


Chapter 16 Future Work

Contents 16.1 HTM-based pauseless GC ...... 131 16.2 HTM-based reference-counting ...... 132

The subject addressed in this thesis is broad and popular in the research community. Our experiments and conclusions lay the basis for larger, complete systems that can be practically used at a large scale. As such, this work can be extended in multiple directions. This chapter assembles a few pointers to potential future developments that will advance the work towards this goal. We address separately garbage collection in managed systems and other automatic memory management mechanisms. We start with closely-related topics that directly extend our current work, move on to entirely new research perspectives on the aforementioned domains, and comment on their respective future prospects.

16.1 HTM-based pauseless GC

In our work we designed an algorithm for a fully-concurrent GC that employs HTM for synchronizing the collector threads with the application threads. We only partly implemented the algorithm, but sufficiently to run a feasibility study. Our results suggested that the full implementation of the algorithm would improve current GCs, and we provided the next steps necessary for its completion. We considered the rest to be mostly an engineering effort. However, as future work, there is no denying that the full implementation of the HTM-based Java GC algorithm that we designed is an important and obvious direction. The main obstacle that needs to be overcome is removing the current blocking synchronization, which represents an intricate and complex part of the HotSpot JVM. As an example, there are over 10 locking constructs used only by the VM thread to handle VM operations in its main loop and associated functions. Certainly, only some of them are related to the GC activity, but their effects can be traced across many classes and source files. Java HotSpot has a complex way of ensuring runtime safety, based on internal assumptions regarding the stack, when the mutators must rendez-vous, etc. These assumptions are enforced with asserts and guarantees, which can sometimes be high-level and easy to relax in order to change the JVM logic; other times, the same asserts can be vital for identifying errors. We give these few disparate examples with the aim of providing an intuition of what lies behind the full implementation of the HTM-based GC. They show that an in-depth knowledge of all JVM internals to the smallest detail is required for this endeavor. However, once completed, we expect it to offer further insight into the benefits of HTM-based synchronization. We believe that such a GC would be a valuable addition to the current state-of-the-art, opening the way for a new generation of pauseless GCs.

Moreover, exactly the same transactional algorithm can be applied to the old generation collector, in order to make CMS both concurrent and compacting. The current implementation of the ConcurrentMarkSweep GC suffers from memory fragmentation, employing lists of memory chunks to keep track of free memory. Transactional barriers can be used to safely move objects in the old generation, similar to the copying process in the young generation. One critical modification to the current implementation of CMS would be the addition of a forwarding pointer in the header of the promoted objects. We believe that this optimization would further enhance the performance of a pauseless transactional GC.

There are two different ways in which a GC can be combined with HTM: on the application side (our approach) and on the GC side. Another research direction of interest would be to design a GC algorithm that uses HTM on the GC side. There is already literature exploiting this approach (the Collie collector [55]), but the solution has two important drawbacks: it only uses one core for collection (it is not parallel), and it needs two passes over the object graph. Both issues can lead to memory exhaustion during the collection. We believe that current HTM implementations could be successfully used to extend the logic of the Collie GC to support parallel collections on a reserved subset of the machine cores and to require only one pass over the object graph, by avoiding illegal accesses with the help of HTM.
Moving away from the platform on which this work was developed, we consider there are many opportunities for improving garbage collection on various managed systems. For example, Python and its PyPy compiler could benefit from a fully-concurrent GC. There have been some attempts at improving Python’s performance with HTM, but they mostly focused on its general synchronization mechanisms, not on the garbage collector. We find there is room for improvement and that the context is ideal for applying the insights gathered through our work, regarding both HTM and garbage collection in general. Therefore, another viable direction for future work is enhancing Python with a concurrent HTM-based GC. The same reasoning applies to other scripting languages, such as Ruby. There is not sufficient research in this direction yet, leaving many opportunities for experimenting with HTM in the context of dynamically-typed languages.

16.2 HTM-based reference-counting

During our work we experimented with HTM in the context of the reference-counting memory management strategy. We focused on C++ smart pointers, as an interesting widely-used data type that employs this mechanism. We designed transactional smart pointers specialized for concurrent data structure traversal. We showed their efficiency and capabilities on a linked list data structure. As a future project, we aim to implement more complex data structures with transactional pointers, in order to verify whether our findings hold for tree structures or hash tables. We expect the latter in particular to greatly benefit from HTM-based pointers, since the lookup operations showed such a great improvement on a linked list structure. Usually the hash table buckets are implemented as singly linked lists, and lookup operations are the majority on any hash table. As for the tree structures, they represent a good opportunity to further optimize and specialize transactional pointers in case they are not yet fit for improving lookup performance on trees. At the same time, we also plan to extend the linked list implementation to use transactional pointers in all functions that iterate over the list, not only in the lookup function. An example of such a function is the one that finds the neighbors of nodes that are removed or inserted.

As a side goal, we would like to design a stronger evaluation. Up to this point, we experimented with transactional pointers on a simple benchmark, especially tailored to stress the data structure as much as possible. We aim to extend the current evaluation to well-established benchmark suites for concurrent data structures, such as the Synchrobench benchmarks [45]. The results could give more confidence and better insight into the benefits of HTM-based smart pointers.

After experimenting with smart pointers, we drew the conclusion that HTM can improve the synchronization through reference counting in some scenarios. Garbage collectors based on this strategy are nowadays avoided because of the performance penalty. Another interesting research direction would be to enhance a reference-counting GC with HTM. Recently, many optimizations have been proposed for reference-counting GCs and they are trying to make a comeback. However, they are still lagging behind performance-wise, and their improvements do not offer enough incentive for industry to adopt them on a large scale. We think that HTM would help increase concurrency and reduce latency overhead. According to our results, a hardware transaction should be able to successfully replace a pair of atomic operations on a shared reference counter and to shorten the delay of their execution. An extensive evaluation is needed to establish how such collectors compare with real-life GCs. An extension to this idea would be to design a concurrent Java GC based on reference-counting and HTM and compare its performance with the full implementation of the HTM-based GC algorithm presented in this work.

Finally, there are other widely-used programming languages that rely on reference-counting. An exciting subject would be to improve the reference-counting mechanism in Objective-C, the main automatic memory management technique on both MacOS and iOS. Recently, a number of devices running the former already have hardware support for transactional memory.
An in-depth study on how reference-counting performs in this context and how it could be further optimized, potentially with HTM, would be a new and interesting step for our research. It may allow us to extend our work to mobile devices, a popular research platform in the last few years. We consider garbage collection on embedded systems to be generally compelling as an extension of our current research.


Bibliography

[1] The Apache Cassandra project. http://cassandra.apache.org.

[2] CompressedOops. https://wiki.openjdk.java.net/display/HotSpot/CompressedOops.

[3] Yehuda Afek, Dave Dice, and Adam Morrison. Cache index-aware memory allocation. In Proceedings of the International Symposium on Memory Management, ISMM ’11, pages 55–64, New York, NY, USA, 2011. ACM.

[4] Dan Alistarh, Patrick Eugster, Maged Michael, Alexander Matveev, and Nir Shavit. StackTrack: an automated transactional approach to concurrent memory reclamation. In Proceedings of the 9th European Conference on Computer Systems, EuroSys ’14, pages 25:1–25:14, New York, NY, USA, 2014. ACM.

[5] David F. Bacon, Perry Cheng, and V. T. Rajan. Controlling fragmentation and space consumption in the Metronome, a real-time garbage collector for Java. In Proceedings of the 2003 ACM SIGPLAN Conference on Language, Compiler, and Tool for Embedded Systems, LCTES ’03, pages 81–92, New York, NY, USA, 2003. ACM.

[6] David F. Bacon, Perry Cheng, and V. T. Rajan. The Metronome: a simpler approach to garbage collection in real-time systems. In On the Move to Meaningful Internet Systems 2003: OTM 2003 Workshops, pages 466–478. Springer, 2003.

[7] David F. Bacon, Perry Cheng, and V. T. Rajan. A real-time garbage collector with low overhead and consistent utilization. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’03, pages 285–298, New York, NY, USA, 2003. ACM.

[8] David F. Bacon, Perry Cheng, and Sunil Shukla. And then there were none: a stall-free real-time garbage collector for reconfigurable hardware. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, pages 23–34, New York, NY, USA, 2012. ACM.

[9] Henry G. Baker, Jr. List processing in real time on a serial computer. Communications of the ACM, 21(4):280–294, April 1978.

[10] Alexandro Baldassin, Edson Borin, and Guido Araujo. Performance implications of dynamic memory allocators on transactional memory systems. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’15, pages 87–96, New York, NY, USA, 2015. ACM.

[11] Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khang, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanović, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-oriented Programming Systems, Languages, and Applications, OOPSLA ’06, pages 169–190, New York, NY, USA, 2006. ACM.

[12] Stephen M. Blackburn and Kathryn S. McKinley. Ulterior reference counting: fast garbage collection without a long wait. In Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-oriented Programing, Systems, Languages, and Applications, OOPSLA ’03, pages 344–358, New York, NY, USA, 2003. ACM.

[13] Colin Blundell, E. Christopher Lewis, and Milo Martin. Deconstructing transactional semantics: the subtleties of atomicity. In Proceedings of the 4th Annual Workshop on Duplicating, Deconstructing, and Debunking, WDDD ’05, pages 48–55, 2005.

[14] Hans-J. Boehm, Alan J. Demers, and Scott Shenker. Mostly parallel garbage collection. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, PLDI ’91, pages 157–164, New York, NY, USA, 1991. ACM.

[15] Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski, and Armin Rigo. Tracing the meta-level: PyPy’s tracing JIT compiler. In Proceedings of the 4th Workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems, ICOOOLPS ’09, pages 18–25, New York, NY, USA, 2009. ACM.

[16] Lars Frydendal Bonnichsen, Christian Wilhelm Probst, and Sven Karlsson. Hardware transactional memory optimization guidelines, applied to ordered maps. In Trustcom/Big- DataSE/ISPA, 2015 IEEE, volume 3, pages 124–131. IEEE, 2015.

[17] Steven R. Brandt, Hari Krishnan, Gokarna Sharma, and Costas Busch. Concurrent, parallel garbage collection in linear time. In Proceedings of the 2014 International Symposium on Memory Management, ISMM ’14, pages 47–58, New York, NY, USA, 2014. ACM.

[18] Rodney A. Brooks. Trading data space for reduced time and code space in real-time garbage collection on stock hardware. In Proceedings of the 1984 ACM Symposium on LISP and Functional Programming, LFP ’84, pages 256–262, New York, NY, USA, 1984. ACM.

[19] Maria Carpen-Amarie, Patrick Marlier, Pascal Felber, and Gaël Thomas. A performance study of Java garbage collectors on multicore architectures. In Proceedings of the 6th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM ’15, pages 20–29, New York, NY, USA, 2015. ACM.

[20] Calin Cascaval, Colin Blundell, Maged Michael, Harold W. Cain, Peng Wu, Stefanie Chiras, and Siddhartha Chatterjee. Software transactional memory: why is it only a research toy? Queue, 6(5):40:46–40:58, September 2008.

[21] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: a distributed storage system for structured data. ACM Transactions on Computer Systems, 26(2):4:1–4:26, June 2008.

[22] Hyung-Kyu Choi, HyukWoo Park, and Soo-Mook Moon. Biased allocator for generational garbage collector. In International Workshop on Dynamic Compilation Everywhere, DCE ’14, Viena, Austria, 2014.

[23] Cliff Click. HTM will not save the world. Presentation at Transactional Memory Workshop (TMW10), 2010.

[24] Nachshon Cohen and Erez Petrank. Data structure aware garbage collector. In Proceedings of the 2015 International Symposium on Memory Management, ISMM ’15, pages 28–40, New York, NY, USA, 2015. ACM.

[25] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking Cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC ’10, pages 143–154, New York, NY, USA, 2010. ACM.

[26] Lawrence Crowl, Yossi Lev, Victor Luchangco, Mark Moir, and Dan Nussbaum. Integrating transactional memory into C++. In Workshop on Transactional Computing, TRANSACT ’07, 2007.

[27] Luke Dalessandro, Virendra J. Marathe, Michael F. Spear, and Michael L. Scott. Capabilities and limitations of library-based software transactional memory in C++. In Workshop on Transactional Computing, TRANSACT ’07, 2007.

[28] Tudor David, Rachid Guerraoui, and Vasileios Trigonakis. Asynchronized concurrency: the secret to scaling concurrent search data structures. In Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’15, pages 631–644, New York, NY, USA, 2015. ACM.

[29] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s highly available key-value store. In Proceedings of 21st ACM SIGOPS Symposium on Operating Systems Principles, SOSP ’07, pages 205–220, New York, NY, USA, 2007. ACM.

[30] David Detlefs, Christine Flood, Steve Heller, and Tony Printezis. Garbage-first garbage collection. In Proceedings of the 4th International Symposium on Memory Management, ISMM ’04, pages 37–48, New York, NY, USA, 2004. ACM.

[31] David Detlefs and Tony Printezis. A generational mostly-concurrent garbage collector. Technical report, Mountain View, CA, USA, 2000.

[32] Dave Dice. Biased locking in HotSpot. https://blogs.oracle.com/dave/biased-locking-in-hotspot.

[33] Dave Dice, Tim Harris, Alex Kogan, Yossi Lev, and Victor Luchangco. The influence of malloc placement on TSX hardware transactional memory. In Workshop on Transactional Computing, TRANSACT ’16, 2016.

[34] Dave Dice, Yossi Lev, Mark Moir, and Daniel Nussbaum. Early experience with a commercial hardware transactional memory implementation. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’09, pages 157–168, New York, NY, USA, 2009. ACM.

[35] David Dice, Mark S. Moir, and William N. Scherer III. Quickly reacquirable locks, 2010. US Patent 7,814,488.

[36] Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. On-the-fly garbage collection: an exercise in cooperation. Communications of the ACM, 21(11):966–975, November 1978.

[37] Aleksandar Dragojević, Pascal Felber, Vincent Gramoli, and Rachid Guerraoui. Why STM can be more than a research toy. Communications of the ACM, 54(4):70–77, April 2011.

[38] Aleksandar Dragojevic, Maurice Herlihy, Yossi Lev, and Mark Moir. On the power of hardware transactional memory to simplify memory management. In Proceedings of the 30th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC ’11, pages 99–108, New York, NY, USA, 2011. ACM.

[39] Hadi Esmaeilzadeh, Emily Blem, Renee St. Amant, Karthikeyan Sankaralingam, and Doug Burger. Dark silicon and the end of multicore scaling. In ACM SIGARCH Computer Architecture News, volume 39, pages 365–376. ACM, 2011.

[40] Christine H. Flood, Roman Kennke, Andrew Dinn, Andrew Haley, and Roland Westrelin. Shenandoah: an open-source concurrent compacting garbage collector for OpenJDK. In Proceedings of the 13th International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools, PPPJ ’16, pages 13:1–13:9, New York, NY, USA, 2016. ACM.

[41] Lokesh Gidra, Gaël Thomas, Julien Sopena, and Marc Shapiro. Assessing the scalability of garbage collectors on many cores. Best papers from PLOS’11, SIGOPS Operating System Review (OSR), 45(3):15–19, January 2011.

[42] Lokesh Gidra, Gaël Thomas, Julien Sopena, and Marc Shapiro. A study of the scalability of stop-the-world garbage collectors on multicores. In Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’13, pages 229–240, New York, NY, USA, 2013. ACM.

[43] James Gosling, Bill Joy, Guy L. Steele, Gilad Bracha, and Alex Buckley. The Java Language Specification. Oracle, USA, 2015.

[44] Justin E. Gottschlich and Hans-J. Boehm. Generic programming needs transactional memory. In Workshop on Transactional Computing, TRANSACT ’13, 2013.

[45] Vincent Gramoli. More than you ever wanted to know about synchronization: Synchrobench, measuring the impact of the synchronization on concurrent algorithms. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’15, pages 1–10, New York, NY, USA, 2015. ACM.

[46] Vincent Gramoli, Petr Kuznetsov, and Srivatsan Ravi. In the search for optimal concurrency. In Proceedings of the 23rd International Colloquium on Structural Information and Com- munication Complexity, SIROCCO ’16, pages 143–158, Cham, 2016. Springer International Publishing.

[47] Tom R. Halfhill. POWER8 hits the merchant market. Memory bandwidth helps IBM server processor ace big benchmarks. Microprocessor report, 2014.

[48] William Hasenplaugh, Andrew Nguyen, and Nir Shavit. Quantifying the capacity limitations of hardware transactional memory. In Workshop on the Theory of Transactional Memory, WTTM ’15, 2015.

[49] Isuru Herath, Demian Rosas-Ham, Daniel Goodman, Mikel Luján, and Ian Watson. A case for exiting a transaction in the context of hardware transactional memory. In Workshop on Transactional Computing, TRANSACT ’12, 2012.

[50] Maurice Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124–149, January 1991.

[51] Maurice Herlihy and J. Eliot B. Moss. Transactional memory: architectural support for lock-free data structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture, ISCA ’93, pages 289–300, New York, NY, USA, 1993. ACM.

[52] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., USA, 2008.

[53] Richard L. Hudson and J. Eliot B. Moss. Sapphire: copying GC without stopping the world. In Proceedings of the 2001 Joint ACM-ISCOPE Conference on Java Grande, JGI ’01, pages 48–57, New York, NY, USA, 2001. ACM.

[54] Balaji Iyengar, Edward Gehringer, Michael Wolf, and Karthikeyan Manivannan. Scalable concurrent and parallel mark. In Proceedings of the 2012 International Symposium on Memory Management, ISMM ’12, pages 61–72, New York, NY, USA, 2012. ACM.

[55] Balaji Iyengar, Gil Tene, Michael Wolf, and Edward Gehringer. The Collie: a wait-free compacting collector. In Proceedings of the 2012 International Symposium on Memory Management, ISMM ’12, pages 85–96, New York, NY, USA, 2012. ACM.

[56] Richard Jones and Rafael D. Lins. Garbage Collection: algorithms for Automatic Dynamic Memory Management. Wiley, 1996.

[57] Tomas Kalibera, Filip Pizlo, Antony L. Hosking, and Jan Vitek. Scheduling hard real-time garbage collection. In Proceedings of the 30th IEEE Real-Time Systems Symposium, pages 81–92. IEEE, 2009.

[58] Gabriel Kliot, Erez Petrank, and Bjarne Steensgaard. A lock-free, concurrent, and incremental stack scanning for garbage collectors. In Proceedings of the 2009 International Conference on Virtual Execution Environments, VEE ’09, pages 11–20, New York, NY, USA, 2009. ACM.

[59] Manaschai Kunaseth, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta, David F. Richards, and James N. Glosli. Performance characteristics of hardware transactional memory for molecular dynamics application on Blue Gene/Q: toward efficient multithreading strategies for large-scale scientific applications. In Proceedings of the 27th International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, IPDPSW ’13, pages 1326–1335. IEEE, 2013.

[60] Viktor Leis, Alfons Kemper, and Thomas Neumann. Exploiting hardware transactional memory in main-memory databases. In Proceedings of the 30th International Conference on Data Engineering, ICDE ’14, pages 580–591. IEEE, 2014.

[61] Stephen M. Blackburn, Kathryn S. McKinley, Robin Garner, Chris Hoffmann, Asjad M. Khan, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, Eliot Moss, Aashish Phansalkar, Darko Stefanovik, Thomas Vandrunen, Daniel von Dincklage, and Ben Wiedermann. Wake up and smell the coffee: evaluation methodology for the 21st century. Communications of the ACM, 51:83–89, August 2008.

[62] Christian Märtin. Multicore processors: challenges, opportunities, emerging trends. In Proceedings of Embedded World Conference, pages 25–27, 2014.

[63] Phil McGachey, Ali-Reza Adl-Tabatabai, Richard L. Hudson, Vijay Menon, Bratin Saha, and Tatiana Shpeisman. Concurrent GC leveraging transactional memory. In Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’08, pages 217–226, New York, NY, USA, 2008. ACM.

[64] Remigius Meier, Armin Rigo, and Thomas Gross. A transactional memory system for parallel Python. https://bitbucket.org/pypy/extradoc/raw/12bda771698925b8296808959e7b830aef8b78b8/talk/dls2014/paper/paper.pdf, 2014.

[65] Scott Meyers. Effective Modern C++: 42 Specific Ways to Improve Your Use of C++11 and C++14. O’Reilly Media, Inc., 1st edition, 2014.

[66] Khanh Nguyen, Lu Fang, Guoqing Xu, Brian Demsky, Shan Lu, Sanazsadat Alamian, and Onur Mutlu. Yak: a high-performance Big-Data-friendly garbage collector. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI ’16, pages 349–365. USENIX Association, 2016.

[67] Yang Ni, Adam Welc, Ali-Reza Adl-Tabatabai, Moshe Bach, Sion Berkowits, James Cownie, Robert Geva, Sergey Kozhukow, Ravi Narayanaswamy, Jeffrey Olivier, Serguei Preis, Bratin Saha, Ady Tal, and Xinmin Tian. Design and implementation of transactional constructs for C/C++. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-oriented Programming Systems Languages and Applications, OOPSLA ’08, pages 195–212, New York, NY, USA, 2008. ACM.

[68] Rei Odaira, Jose G. Castanos, and Hisanobu Tomari. Eliminating global interpreter locks in Ruby through hardware transactional memory. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’14, pages 131–142, New York, NY, USA, 2014. ACM.

[69] Filip Pizlo, Daniel Frampton, Erez Petrank, and Bjarne Steensgaard. Stopless: a real-time garbage collector for multiprocessors. In Proceedings of the 6th International Symposium on Memory Management, ISMM ’07, pages 159–172, New York, NY, USA, 2007. ACM.

[70] Filip Pizlo, Lukasz Ziarek, Ethan Blanton, Petr Maj, and Jan Vitek. High-level programming of embedded hard real-time devices. In Proceedings of the 5th European Conference on Computer Systems, EuroSys ’10, pages 69–82, New York, NY, USA, 2010. ACM.

[71] Filip Pizlo, Lukasz Ziarek, Petr Maj, Antony L. Hosking, Ethan Blanton, and Jan Vitek. Schism: fragmentation-tolerant real-time garbage collection. In Proceedings of the 2010 International Conference on Programming Language Design and Implementation, PLDI ’10, pages 146–159, New York, NY, USA, 2010. ACM.

[72] Nicholas Riley and Craig Zilles. Hardware tansactional memory support for lightweight dynamic language evolution. In Proceedings of the 21st ACM SIGPLAN Symposium on Object-oriented Programming Systems, Languages, and Applications, OOPSLA ’06, pages 998–1008, New York, NY, USA, 2006. ACM.

[73] Carl G. Ritson, Tomoharu Ugawa, and Richard E. Jones. Exploring garbage collection with Haswell hardware transactional memory. In Proceedings of the 2014 International Symposium on Memory Management, ISMM ’14, pages 105–115, New York, NY, USA, 2014. ACM.

[74] Konstantinos Sagonas and Jesper Wilhelmsson. Mark and split. In Proceedings of the 5th International Symposium on Memory Management, ISMM ’06, pages 29–39, New York, NY, USA, 2006. ACM.

[75] Arnold Schwaighofer. Tail call optimization in the Java HotSpot VM. Master thesis, Johannes Kepler Universität Linz, March 2009.

[76] Rifat Shahriyar, Stephen M. Blackburn, and Daniel Frampton. Down for the count? Getting reference counting back in the ring. In Proceedings of the 2012 International Symposium on Memory Management, pages 73–84, New York, NY, USA, 2012. ACM.

[77] Rifat Shahriyar, Stephen Michael Blackburn, Xi Yang, and Kathryn S. McKinley. Taking off the gloves with reference counting Immix. In Proceedings of the 2013 ACM SIGPLAN Inter- national Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’13, pages 93–110, New York, NY, USA, 2013. ACM.

[78] Tatiana Shpeisman, Ali-Reza Adl-Tabatabai, Robert Geva, Yang Ni, and Adam Welc. Towards transactional memory semantics for C++. In Proceedings of the 21st Annual Symposium on Parallelism in Algorithms and Architectures, SPAA ’09, pages 49–58, New York, NY, USA, 2009. ACM.

[79] Fridtjof Siebert. Concurrent, parallel, real-time garbage-collection. In Proceedings of the 2010 International Symposium on Memory Management, ISMM ’10, pages 11–20, New York, NY, USA, 2010. ACM.

[80] Herb Sutter. Atomic smart pointers. Technical Report N4058, 2014. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4058.pdf.

[81] Fuad Tabba. Adding concurrency in Python using a commercial processor’s hardware transactional memory support. ACM SIGARCH Computer Architecture News, 38(5):12–19, April 2010.

[82] Gil Tene, Balaji Iyengar, and Michael Wolf. C4: the continuously concurrent compacting collector. In Proceedings of the 2011 International Symposium on Memory Management, ISMM ’11, pages 79–88, New York, NY, USA, 2011. ACM.

[83] András Vajda. Programming Many-core Chips. Springer Science & Business Media, 2011.

[84] Amy Wang, Matthew Gaudet, Peng Wu, José Nelson Amaral, Martin Ohmacht, Christopher Barton, Raul Silvera, and Maged Michael. Evaluation of Blue Gene/Q hardware support for transactional memories. In Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques, PACT ’12, pages 127–136, New York, NY, USA, 2012. ACM.

[85] Zhaoguo Wang, Hao Qian, Jinyang Li, and Haibo Chen. Using restricted transactional memory to build a scalable in-memory database. In Proceedings of the 9th European Conference on Computer Systems, EuroSys ’14, pages 26:1–26:15, New York, NY, USA, 2014. ACM.