
Doctoral Thesis

Combining Lock-Free Programming with Cooperative Multitasking for a Portable Multiprocessor Runtime System

Author(s): Negele, Florian

Publication Date: 2014

Permanent Link: https://doi.org/10.3929/ethz-a-010335528

Rights / License: In Copyright - Non-Commercial Use Permitted


DISS. ETH NO. 22298

Combining Lock-Free Programming with Cooperative Multitasking for a Portable Multiprocessor Runtime System

A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich)

presented by

FLORIAN NEGELE
MSc ETH CS, ETH Zurich
born 15 August 1981
citizen of the Principality of Liechtenstein

accepted on the recommendation of

Prof. Dr. Jürg Gutknecht, examiner
Prof. Dr. Thomas Gross, co-examiner
Prof. Dr. Bernhard Egger, co-examiner

2014

Whatever can go wrong, will go wrong. (attributed to Edward A. Murphy)

Murphy was an optimist. (according to authors of lock-free programs)

Acknowledgements

A dissertation is never the achievement of a single person. Now is the time to thank all the people who enabled me to create and complete this thesis.

First and foremost I would like to express my special appreciation and sincere thanks to my good friend and advisor Dr. Felix Friedrich, who tirelessly encouraged me in this work and always provided me with outstanding support and reliable expertise. It has been a great pleasure working with you all of these years. I am also greatly indebted to my supervisor Prof. Jürg Gutknecht for giving me the opportunity and free rein to realise my ideas within this doctoral thesis. I am sincerely grateful for the comfortable working environment and his continuous guidance, help, and patience. Additional thanks go to Prof. Thomas Gross and Prof. Bernhard Egger for their valuable assistance and kind agreement to co-examine this work. I would also like to show my deep gratitude to Paul Reed, who was always generous in scrutinising drafts and providing me steadily with helpful comments and much appreciated suggestions.

Last but not least, I would like to thank all of my past co-workers and colleagues for the interesting discussions, helpful insights, and the overall pleasant working environment. These people include Luc Bläser, Matthias Büchler, Olivier Clerc, André Fischer, Ulrike Glavitsch, Thomas Kägi, Daniel Keller, Sven Knudsen, Ling Liu, Roman Mitin, Georg Ofenbeck, Sven Stauber, Lars Widmer, and many others. I owe special thanks to Ruth Hidalgo, Franziska Mäder, and Martina Wirth for their reliable administrative support.

Contents

1 Introduction
   1.1 Motivation
   1.2 Scope
   1.3 Related Work
   1.4 Contributions
   1.5 Overview

2 Lock-Free Programming
   2.1 Introduction
   2.2 Memory Models
   2.3 Programming Techniques

3 Case Study: Lock-Free Queues
   3.1 Introduction
   3.2 Linearisation Points
   3.3 Intermediate Nodes
   3.4 The ABA Problem
   3.5 Safe Memory Deallocation

4 Lock-Free Scheduling
   4.1 Introduction
   4.2 Foundations
   4.3 Synchronisation Primitives
   4.4 Multiprocessor Support
   4.5 Interrupt Handling

5 Lock-Free Memory Management
   5.1 Heap Management
   5.2 Stack Management
   5.3 Garbage Collection

6 Case Study: Software Portability
   6.1 Introduction
   6.2 Cross Compilation
   6.3 Generic Object Files
   6.4 Runtime System Structure

7 Evaluation
   7.1 Contention Management
   7.2 Performance Measurements
   7.3 Portability and Flexibility

8 Conclusion
   8.1 Summary
   8.2 Future Work

A Digital Material

B Module Reference

C Enhancements to Active Oberon

Bibliography

Abstract

The application of mutual exclusion in order to protect shared data from concurrent access is a recurring programming pattern found in contemporary application software. Although alternatives to blocking synchronisation like lock-free programming solve several critical problems and offer better progress guarantees, they have not become as popular. This especially applies to kernels that provide support for multiprocessing environments.

This thesis describes our approach to design and implement an operating system kernel which is based solely on non-blocking algorithms. Its goal is to provide a completely lock-free runtime system for object-oriented programming languages that support concurrency and automatic memory management. We also strove for an uncompromisingly high portability of its source code and replaced features provided by the hardware with machine-independent software solutions wherever possible. In particular, we abandoned the prevalent preemptive scheduler of modern systems in favour of compiler-guaranteed cooperative multitasking. The combination of lock-free programming with cooperative multitasking presents a novel approach and enables several practical and beneficial non-blocking programming techniques such as processor-local storage.

The result of this work is a concise and reliable operating system kernel providing flexible language support for multithreading and garbage collection. Its source code is easily portable to a variety of hardware architectures ranging from powerful multicore machines to small embedded devices and microcontrollers. In addition, its rigorous machine independence also allows the runtime system to be used as a library for applications running on top of existing operating systems. Despite its simplicity, the performance of our lock-free approach is often comparable to that of established and optimised operating systems and is clearly superior to solutions based on locking.

Kurzfassung

Contemporary software development frequently employs mutual exclusion to protect shared data from simultaneous access by concurrent processes. Promising alternatives to this blocking kind of process synchronisation, such as lock-free programming, solve many of the problems associated with it but are by no means as widely adopted. Operating system kernels for multiprocessor systems in particular are still frequently built on mutual access locks.

This thesis pursues the design and development of an operating system kernel which is instead based exclusively on non-blocking algorithms. The goal of the kernel is to offer a thoroughly lock-free runtime system for object-oriented programming languages with support for concurrency and automatic garbage collection. A further concern is the uncompromising portability of the source code, which attempts to replace hardware-specific functionality with machine-independent software solutions wherever possible. The most significant example of this approach is the use of cooperative multitasking as a replacement for the typically preemptive scheduling of modern systems. The combination of lock-free programming with cooperative multitasking has hardly been exploited so far and enables several practical programming techniques which are of great benefit to the development of non-blocking algorithms.

The result of this dissertation is a lean and reliable operating system kernel with flexible language support for multithreading and garbage collection. Its source code is effortlessly portable to a multitude of hardware architectures and runs on powerful multicore machines as well as on embedded systems and microcontrollers.

Its rigorously implemented machine independence moreover allows the runtime system to serve as a library for applications on existing operating systems. The evaluation of this work has shown that the performance of our uncomplicated, lock-free implementation is in several areas quite comparable to that of established and optimised operating systems. This holds above all in comparison with systems that rely on blocking process synchronisation.

1 Introduction

This chapter describes the background, purpose, and scope of this thesis and puts it into perspective with related work. In addition, it summarises our main contributions by consolidating the ideas and concepts behind them.

1.1 Motivation

For several years now, many operating systems have provided native support for multiple central processing units in order to utilise the full potential of modern hardware. The most prominent representatives include for example Microsoft Windows and many Unix-like operating systems. Although the internal design and implementation of these operating systems often varies considerably, all of them usually include a generic hardware abstraction layer for applications running on top of them. The basic idea is to allow all applications to access system-specific services and resources independently of the underlying hardware architecture.

The base at the lowest level of this abstraction mechanism consists of the so-called operating system kernel, which manages and provides access to the most primitive but fundamental hardware resources such as logical processors, the main memory, and facilities to access peripheral devices. The functionality of the kernel is usually required directly or indirectly by all applications as well as higher-level components of the hardware abstraction layer like device drivers. In the context of a multiprocessing operating system however, all of the resources managed by a kernel are potentially used and shared amongst several applications and device drivers running in parallel on different processors. In this case, the implementation of an operating system kernel has to take special care in order to conduct and properly synchronise concurrent access to these shared resources.


One of the simplest and most popular forms of synchronising concurrent access to shared resources is a technique called mutual exclusion. It basically guarantees that only one of the interested parties has access to a resource at the same time. Modern operating systems usually provide a variety of different constructs that provide some sort of mutual exclusion. Examples include simple but efficient spinlocks or more sophisticated mutexes, semaphores, or monitors [HS08]. As a consequence, the use of mutual exclusion is very popular and prevalent in contemporary application software.

Despite its conceptual simplicity however, mutual exclusion and its various implementations have many well documented and understood drawbacks. For instance, mutual exclusion limits the progress of all contenders to a single one, effectively preventing any parallelism amongst them during the shared access. Since all other contenders have to wait for a single one, this mechanism is commonly known as blocking synchronisation. In fact, blocking synchronisation reveals many more deficiencies of mutual exclusion, most of which are described in more detail in Chapter 2. This is why alternative synchronisation primitives have been researched extensively since the early 1990s and continue to be. One particular outcome of these studies is the notion of lock-free programming, which makes use of atomic operations while accessing shared resources concurrently. The basic goal of this approach is to allow several contenders to make progress at the same time.

Unfortunately, the results in this area have not yet been subject to widespread use, neither in application software nor in the implementation of the underlying systems. Operating system kernels in particular are usually still implemented using blocking synchronisation primitives. One example of a multiprocessor operating system employing this technique is called A2, a derivative of the Active Object System by Pieter Muller [Mul02] which was developed and maintained by the former Native Systems Group at ETH Zurich. Years of experience maintaining the A2 operating system have convinced us that adopting non-blocking synchronisation primitives holds great potential for simplifying and stabilising its kernel with respect to several recurring issues with its multiprocessor implementation.


Since its creation, the system and its kernel have been ported to several different hardware architectures and runtime environments. Each time, the source code had to be modified heavily in order to accommodate the peculiarities of the respective target hardware architecture [Egg01, Neg06]. Although the kernel was written in a high-level programming language, its source code was nowhere near as portable as desired, and it exposed many implicit hardware dependencies buried deep within its initial implementation. To date, many issues concerning the portability of the kernel have been identified and fixed, but we never truly achieved a streamlined unification of its many different versions in existence.

At the root of our thesis lie the shortcomings of the A2 operating system concerning its portability and multiprocessor implementation. Our main motivation for its improvement is based on the goal of creating a highly portable kernel on the one hand and the promising opportunities of non-blocking synchronisation in the context of operating systems on the other. The following sections draw an outline of our accomplishments.

1.2 Scope

Except for some minor assembly code sections, the Active Object System as well as its newest incarnation A2 are written in a high-level programming language called Active Oberon [Rea04]. This is a backward compatible extension of Niklaus Wirth’s Oberon programming language [Wir88] and one of the latest instalments in the family of Pascal based languages as depicted in Figure 1.1. The most important additions are the support for a modern style of object-oriented programming and the notion of active objects for concurrency [Gut97]. Active objects are objects that integrate a lightweight process known as a thread or activity. These three terms will be used interchangeably throughout this thesis.

The kernel of the A2 operating system provides a complete runtime system for the Active Oberon programming language. This runtime system includes the support for modules with dynamic loading, active objects in a multiprocessing environment, and automatic memory management using a garbage collector based on mark and sweep. In addition, the kernel also provides a low-level framework for accessing and controlling peripheral hardware [Mul02].


[Figure 1.1: a timeline from 1970 to 2000 showing the Pascal family: Pascal, extended with modules to Modula, with type extensions to Oberon, and with active objects to Active Oberon, representing structured, modular, object-oriented, and concurrent programming respectively.]

Figure 1.1: Descendants in the family of Pascal based programming languages and their extensions. Adapted from the Active Oberon language report by Patrik Reali [Rea04].

The scope of this thesis is partially defined by the core functionality of the original A2 operating system kernel. However, its design and implementation endeavour to achieve an uncompromisingly high portability of its source code using only lock-free programming techniques. The innovation comes from imagining how all of the basic system services can be implemented with these objectives in mind. We call the resulting implementation a runtime system rather than a kernel of an operating system as it is designed to fulfil two different purposes.

On the one hand, it still features a scheduler for executing activities on multiple processors and a memory manager including a conditionally supported garbage collector. The runtime system can therefore be used on a machine without an underlying operating system, in which case the system additionally supports the handling of interrupts and provides low-level access to peripheral hardware. As a result, the runtime system can be used as an in-place replacement for the original A2 kernel as it mimics all of its functionality.

On the other hand however, our work is also designed to be the basic runtime system for applications that are executed on top of existing operating systems like Microsoft Windows or systems based on the Linux kernel. If an Active Oberon program is compiled targeting one of these runtime environments, the runtime system is just an additional application library linked into the resulting binary executable file. In this case, the runtime system cannot access the hardware resources directly because they are usually not available to standard applications. Instead, the runtime system makes use of the system interface provided by the operating system itself. For example, the scheduler of the runtime system has no direct access to the processors of the system. It mimics their functionality by creating lightweight processes or threads that act as actual processors. The same technique is applied to the memory manager, where all memory allocations are simply forwarded to the corresponding application programming interface of the operating system.


All of these replacements are enabled by a very small and concise abstraction layer inside the runtime system itself, consisting only of a handful of functions. These functions are replaced automatically by the linker with their environment-specific substitutes. The underlying execution environment of the runtime system is therefore completely transparent to the application running on top of it. This kind of source code portability allows us to target different runtime environments as well as hardware architectures with little effort.

Although the runtime system is written in and provides support for the Active Oberon programming language, the concepts and ideas presented in this thesis are not limited to this language in any way. The runtime system provides a generic abstraction for memory management, lightweight processes, and interrupt handling. This is the fundamental basis for an operating system kernel and could be implemented just as well in any other system programming language. The most important qualification to this language independence, however, is the tight coupling of the runtime system with the compiler. In contrast to lower-level programming languages that can translate most of their constructs directly into machine code, Active Oberon features some sophisticated mechanisms which require the assistance of a runtime system. The Active Oberon compiler therefore generates metadata and implicit calls to functions provided by the runtime system. This implies that all code has to be compiled by the same compiler in order to conform to the programming interface provided by the runtime system. If another language is to be used for implementing the runtime system, the compiler used for generating the executables must be modified accordingly. It is therefore necessary for programmers to have full control over the implementation of the programming language.

The simplicity of Active Oberon and its implementation was one reason why we opted for this particular programming language. Another basis for this decision was the high level of abstraction provided by Active Oberon as well as its strong type system which guarantees many runtime checks. Other system programming languages like C or C++, on the other hand, are notorious for their broad range of unchecked runtime errors caused by misusing their features. In C and C++ parlance, this sort of error

is known as undefined behaviour. In many cases, the robustness of modern operating systems stems from astounding efforts that have been undertaken to cope with undefined behaviour. Especially where shared resources like the main memory are concerned, these systems have a strong tendency to make use of hardware facilities whenever possible in order to detect and handle faulty processes. An example is the implementation of virtual memory, which is often provided by dedicated memory management units. Virtual memory can be used to protect an operating system from erroneous and corrupting memory accesses but renders the implementation inherently non-portable. Active Oberon as implemented by the compiler and the runtime system provides all of these checks by itself and therefore does not require any virtual memory management. This approach effectively reduces the necessary hardware support and achieves a high portability across different architectures.

1.3 Related Work

Using non-blocking algorithms and data structures for implementing multiprocessor operating systems has been investigated for over twenty years now. Massalin and Pu were amongst the earliest to deliver a non-blocking implementation of an operating system kernel [MP91]. The kernel of their multiprocessor operating system called Synthesis included support for threads and virtual memory as well as a file system [Mas92]. They showed that operating system kernels using non-blocking synchronisation are practical and achieve at least the same performance as conventional systems. Similar conclusions have later been confirmed many times, for example by Greenwald and Cheriton [GC96, Gre99].

However, the implementations of the resulting non-blocking operating system kernels relied on an atomic double compare-and-swap operation called DCAS. This operation is an extended version of the more common single compare-and-swap operation known as CAS and atomically compares and exchanges the values of two unrelated memory locations instead of one. Based on their results, Greenwald and Cheriton argued that

this operation is sufficient for practical non-blocking operating systems. Unfortunately, the hardware support for this particular operation is very limited and most contemporary hardware architectures do not provide it at all. The single compare-and-swap operation, on the other hand, is widely supported on shared-memory architectures capable of multiprocessing. This may be due to the postulations of Herlihy, who showed early on that single atomic read-modify-write operations are the most general and can even substitute other atomic operations [Her91]. For portability reasons, this thesis relies only on the single compare-and-swap operation in order to achieve the broadest hardware support available.

There are several other implementations of non-blocking operating systems that followed the very same approach. Hohmuth and Härtig for example focused on non-blocking real-time systems by utilising only the single compare-and-swap operation in order to improve portability [HH01]. However, most of the non-blocking kernels in the literature are based on Unix-like environments and are therefore written in popular system programming languages like C or C++. In comparison to Active Oberon, these programming languages often depend on the underlying operating system in order to detect faulty processes. Our approach, on the other hand, relies completely on runtime checks provided by the implementation of the programming language. As a consequence, the corresponding runtime environment established in this thesis is many times simpler and smaller than the systems targeted in related work.

The lock-free runtime system presented in this dissertation makes use of several non-blocking data structures and algorithms. Most of them have already been investigated extensively in the standard literature. For instance, there exist many lock-free implementations of abstract data structures like queues or lists [Val94, MS96, HLM03]. Examples of non-blocking algorithms as well as complete software components include lock-free memory managers and garbage collectors [HM91, Mic04c, GGH07]. Notably, all of this research is designed to be applicable to programs running on top of modern operating systems in order to be most useful.


However, lock-free algorithms and data structures are always implicitly based on and restricted by the characteristics of the underlying runtime environment. Considering modern operating systems, the corresponding execution environment is most often implemented using preemptive multitasking. This means that any lock-free algorithm has to be prepared for a task switch to another task potentially happening at any time. To the best of our knowledge, lock-free programming has hardly been considered in contexts that use cooperative multitasking instead. In contrast to the standard literature, this thesis explores the advantages and consequences of lock-free programming in execution environments where task switches are under the control of the lock-free algorithm itself. All lock-free data structures and algorithms presented in the following chapters are considered under this novel point of view.

1.4 Contributions

The main result of this work is a portable multiprocessor operating system kernel and runtime system for the Active Oberon language. Its implementation does not rely on blocking synchronisation and takes advantage of the novel approach of combining lock-free programming with cooperative multitasking. The runtime system itself is based on the following contributions of this thesis:

1. We propose a set of fundamental programming techniques for authors of lock-free programs. The techniques are enabled by cooperative multitasking and are designed to simplify the implementation of non-blocking algorithms and to ease reasoning about their progress.

2. Applying these techniques, we provide a lock-free implementation of an unbounded concurrent queue that is especially well suited for being integrated into task schedulers. In contrast to other lock-free queues in the literature, our approach does not unconditionally allocate memory in enqueue operations, which renders task switches inexpensive.


3. We present a scalable scheduler for cooperative multitasking of lightweight processes on multiple processors with shared memory. It supports interrupt handling and provides a lock-free framework for implementing sophisticated synchronisation primitives on top of it.

4. We provide a lock-free implementation of a memory manager for memory heaps of arbitrary sizes. In addition, we complement the memory manager with a non-blocking garbage collector which is designed to be an ordinary and optional process instead of an integral part of the runtime system.

1.5 Overview

The contents of this dissertation are organised in the following chapters:

Chapter 2 focuses on the concepts of lock-free programming in general and introduces memory models and programming techniques which are required by the subsequent chapters.

Chapter 3 exemplifies solutions to recurring problems of lock-free programming by stepwise refining an implementation of a lock-free queue which is used extensively by the runtime system.

Chapter 4 presents the design and implementation of a lock-free scheduler which is used for implicit cooperative multitasking on multiple processors.

Chapter 5 describes the design and implementation of a lock-free memory manager including a concurrent and incremental garbage collector.

Chapter 6 discusses several case studies concerning the portability of the runtime system and summarises the necessary software abstractions.

Chapter 7 evaluates the performance of various system components and compares them with related work.

Chapter 8 concludes this thesis and identifies future directions.

2 Lock-Free Programming

This chapter introduces the concept of lock-free programming in general and the definitions associated with it. After the specification of a simple memory model, it presents some useful programming techniques for implementations of practical lock-free programs.

2.1 Introduction

The term lock-free programming is used to describe the application of non-blocking atomic operations instead of blocking synchronisation primitives provided by a scheduler. Herlihy and Shavit provide an excellent overview of multiprocessor programming in general that covers both synchronisation techniques [HS08]. The most common usage of blocking synchronisation primitives is to achieve mutual exclusion of processes or threads competing for a shared resource. Other applications include the notification of arbitrary events or conditions being met. Blocking synchronisation is in general characterised by the fact that there are potentially one or more processes waiting for another process to progress. This form of synchronisation is ubiquitous in concurrent programs, and modern multiprocessor operating systems support a variety of different blocking synchronisation primitives usually known as locks, mutexes, semaphores, or monitors. Similarly, almost all programming languages with built-in support for concurrency also provide some constructs for blocking synchronisation.

Despite its simplicity and availability however, blocking synchronisation traditionally suffers from several well-known problems. Generally speaking, all of them stem from the absence of a guarantee about how long a blocked process has to wait. The following list summarises the most common issues:


• Deadlocks occur whenever a group of two or more competing processes are mutually blocked because each process waits for another blocked process in the group to proceed; a minimal sketch of this situation follows the list.

• Livelocks happen when competing processes are able to detect a potential deadlock but make no observable progress while trying to resolve it.

• Starvation characterises the repeated but unsuccessful attempt of a recently unblocked process to continue its execution.
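To make the first of these issues concrete, the following sketch shows the classic lock-ordering deadlock expressed with Active Oberon's exclusive blocks; the Account object and its methods are invented here for illustration and are not part of the thesis's runtime system.

TYPE Account = OBJECT
VAR balance: LONGINT;

    PROCEDURE Deposit(amount: LONGINT);
    BEGIN {EXCLUSIVE}   (* acquires the monitor of this account *)
        INC(balance, amount)
    END Deposit;

    PROCEDURE TransferTo(target: Account; amount: LONGINT);
    BEGIN {EXCLUSIVE}   (* the monitor of the source account is held... *)
        DEC(balance, amount);
        target.Deposit(amount)   (* ...while the monitor of the target is requested *)
    END TransferTo;

END Account;

If one activity executes a.TransferTo(b, x) while another concurrently executes b.TransferTo(a, y), each holds the monitor of its own source account and waits forever for the monitor held by the other: a deadlock.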

In general, waiting processes depend heavily on the cooperativeness of competing processes. If, for whatever reason, a process fails to unblock waiting processes, they can remain blocked forever. There exist several solutions and recommended programming patterns that try to cope with all these concrete problems. However, there are also many principal challenges rooted in the fundamental design of blocking synchronisation. For instance, mutual exclusion prevents any parallelism among competing processes by definition. Even if there is no contention at all, the setup for mutual exclusion adds overhead to each access to a shared resource. This can be compensated to some extent by increasing the amount of shared data that is protected by the same synchronisation primitive. On the other hand, a more coarse-grained locking scheme prevents processes from accessing shared but potentially independent data at the same time. As a consequence, there is a trade-off that has to be considered. In addition, it is often complicated to actually prove that the usage of the primitives is correct and that the issues raised above cannot happen.

In the context of operating system kernels for example, blocking synchronisation is especially problematic when interrupts are used to handle external events. An interrupt causes a processor to temporarily suspend the execution of the currently running code in order to execute an interrupt handler instead. The problem arises when the handler needs to access shared resources that are currently in use by the interrupted code and have therefore already been protected using mutual exclusion. In this case, the mutually exclusive acquisition of the resource has to be made insensitive to the potential

reentrancy in order to ensure that the interrupt handler can still proceed successfully. Achieving reentrant blocking synchronisation is difficult but so crucial that failure to do so has negative impacts on the reliability and responsiveness of the complete operating system.

The term non-blocking synchronisation and its research have their roots in the dissatisfaction with mutual exclusion and its implications. Herlihy was one of the first to coin the term in the early 1990s [Her90]. In the beginning, the term lock-free was used synonymously with non-blocking and implied that a concurrent algorithm does not require any blocking synchronisation primitives like locks at all. Instead, a non-blocking algorithm relies solely on atomic operations provided by the hardware architecture that allow a memory location to be changed in a transactional manner. The most common operations are test-and-set, fetch-and-add, load-link and store-conditional, or the superior compare-and-swap operation. Herlihy showed that the latter is universal and can therefore actually implement all other atomic operations [Her91].

From the beginning, the goal of non-blocking synchronisation was to devise some basic guarantees about the progress of non-blocking algorithms. An algorithm is said to make progress if the number of steps it takes to complete its operation has an upper bound [Her91]. Since the early 2000s, the following three basic notions of non-blocking progress are distinguished [HLM03]:

• Obstruction-freedom guarantees that an algorithm makes progress even if its operation was obstructed previously by other algorithms. This implies that algorithms do make progress when run in isolation, but not necessarily when other algorithms are executed concurrently. This is the weakest non-blocking property.

• Lock-freedom guarantees that at least one algorithm makes progress even if other algorithms run concurrently. This non-blocking property ensures system-wide progress and therefore implies obstruction-freedom. However, it also implies that some algorithms may starve forever or require an undetermined number of steps to proceed.


• Wait-freedom guarantees that all algorithms eventually make progress. This is the strongest non-blocking property and combines lock-freedom with starvation-freedom. It typically relies on algorithms being aware of the progress of others and cooperatively helping them if necessary.

Hence, the term lock-free nowadays describes a property of non-blocking algorithms rather than the absence of blocking synchronisation primitives. Herlihy has shown that any algorithm can be implemented wait-free if enough resources like memory and processing power are available [Her88]. This is a rather theoretical statement since designing wait-free algorithms is a very complex task and only a few practical and efficient wait-free algorithms are known [KP12]. Lock-freedom on the other hand is much easier to achieve and is often considered to deliver wait-free progress in practice. It has been proven that a large class of lock-free algorithms are de facto wait-free and can be closely bounded in the average number of steps they take to complete [ACHS14]. All algorithms presented in this thesis are lock-free by design, and we will show how cooperative multitasking helps to improve the progress of contending processes.

Despite its progress guarantees, lock-free programming should not be considered a holy grail, as it has its own set of challenges. First of all, non-blocking algorithms are in general very hard to get right. As if the implementation of concurrent programs were not difficult enough, the lack of a proper process synchronisation renders non-blocking algorithms highly non-deterministic. Programmers must always keep in mind that in the worst case, the state of the complete program could have been changed after the execution of a single instruction. As a consequence, even marginal modifications of the source code can lead to a completely different behaviour at runtime. This severely impacts all sorts of debugging facilities and provokes so-called heisenbugs which change their behaviour as soon as they are observed.

Furthermore, there is the omnipresent ABA problem which describes a category of situations in which a process fails to recognise that the state of the program has been changed in the meantime. Since non-blocking

synchronisation uses atomic read-modify-write operations, the state of a program is always stored and encoded in a few small shared memory locations. Without further precaution, a process erroneously assumes that the program state is still the same if the values of these memory locations have not changed since the last readout. This can lead to disastrous consequences like corrupted data structures if another process has concurrently modified the corresponding memory locations in the meantime.

Nevertheless, we will show how to deal with all of these kinds of problems by providing useful programming techniques for practical lock-free algorithms in the last part of this chapter. But first we have to specify what actually constitutes an atomic operation.

2.2 Memory Models

Many recent programming languages with support for concurrency define a so-called memory model. A memory model describes the behaviour of concurrent programs that share data between their threads of execution. A careful design of a memory model enables many compiler optimisations while still providing important guarantees for the programmer. Java was the first programming language whose specification included an explicit and extensive description of its underlying memory model [GJSB05]. Its first release was generally considered flawed and had to be revised several times. Over time, other popular languages like C or C++ have also evolved into multithreaded programming languages. Their technical standards have been updated in order to incorporate the notion of a memory model as well [ISO11a, ISO11b].

The Active Oberon programming language used for the implementation of this thesis currently still lacks this kind of specification [Rea04]. Due to the missing guarantees for programmers, there exist many programs written in Active Oberon whose authors evidently acted on implicit assumptions about the interaction of active objects without proper synchronisation. As a consequence, the resulting programs are often non-portable because they depend heavily on the implementation chosen by the compiler.


One of our goals was to overcome this unsatisfying situation and to establish a solid foundation for the lock-free algorithms as presented in this thesis. In this section, we will dissect some of the most popular memory models of other programming languages in order to derive a concise specification of a simple yet powerful and portable memory model suitable for Active Oberon. In contrast to often overly complex memory model definitions, our approach aims at staying true to the nature and heritage of the Active Oberon programming language by striving for simplicity.

2.2.1 Properties of Common Memory Models

Today, almost all of the most widely used programming languages provide some notion of concurrency support. Typically, the concept of concurrent execution of code is accompanied by built-in facilities that enable synchronisation of concurrent executions. This kind of synchronisation is required when concurrent threads of execution share memory or other resources. Object-oriented programming languages like Java or C# for example provide special language constructs that try to solve this issue with mutual exclusion using locks and monitors [GJSB05, ECM06]. Additionally, the libraries provided by these programming languages typically feature advanced synchronisation constructs that enable more fine-grained control over critical sections.

Often however, programming languages allow programmers to access shared data without enforcing any proper synchronisation. In this case, reading and writing shared data across different threads has to be done very carefully because it may lead to unpredictable or at least unexpected behaviour. In order to still provide some basic guarantees for programmers, a precise memory model is required to describe the behaviour of shared data in this situation. In the absence of such a specification, programmers should ideally not rely on unsynchronised access to shared resources. Programs that ignore this implication inherently assume some sort of behaviour during their execution and are therefore likely to behave differently if compiled using a different compiler or if executed on another platform or execution environment.


Another advantage of specifying a memory model for programming languages is to enable implementations to optimise multithreaded programs. In this context, one of the most prominent optimisation techniques is out-of-order execution, which decreases the performance penalties of memory accesses. Reordering instructions is often applied by the compiler since it can deduce data dependencies in a program directly from its source code. In the case of Java and C# for example, which both run inside a virtual machine, the code can be rearranged once again just before it gets executed. Finally, some implementations of hardware architectures such as the 32-bit and 64-bit architectures by Intel and AMD are also able to reorder and cache memory accesses on a per-instruction basis [Int14, Adv13]. A memory model should not restrict these different stages of optimisation, which are able to improve the efficiency of multithreaded programs substantially.

For the following discussion of common properties we will focus on the specifications of the memory models of the Java and the C# programming language [GJSB05, ECM06]. Although the memory models share many equivalent characteristics, they differ significantly in the effort made to describe them. The current edition of the Java programming language for example describes the underlying memory model at great length. The detailed specification is obviously targeting an audience of compiler developers and may be confusing and overwhelming for ordinary users of the language. The specification of C# on the other hand does not mention the actual term memory model at all.

• Both programming languages use the concept of synchronisation barriers to describe the operation of their synchronisation constructs. Synchronisation barriers are sequence points in the execution of a program where all changes of one thread become accessible to another. All built-in synchronisation primitives like synchronised methods or lock statements act as synchronisation barriers. Once the mutual exclusion is released and transferred to a different thread, all changes become accessible instantaneously. In general, synchronisation barriers constitute the necessary guarantees that allow programmers to reason about the state of shared data at any given point in time.


• Java as well as C# define volatile fields, which are declarations of field variables of fundamental type that include a modifier called volatile. Both memory models specify the ordering of read and write accesses to volatile fields with respect to any other memory references in the instruction sequence of multithreaded programs. Reading a volatile field for example is guaranteed to happen before any subsequent memory reference in the instruction sequence of one thread, while writing a volatile field will happen only after any memory reference preceding it in the instruction sequence. The process of accessing a volatile field can therefore also be regarded as a synchronisation barrier.

• Apart from proper synchronisation constructs provided by the languages and their libraries, neither specification provides any other guarantees. In particular, memory references to non-volatile fields may happen in arbitrary order as long as their effect equals the observable behaviour of the instruction sequence as executed by a single thread. The overall purpose of this notion of a sequential execution order is to explicitly enable optimisations like reordering instructions at any stage of the execution environment. This intentionally includes the compiler, the runtime system, as well as the hardware architecture. At the same time however, the memory models still provide a hardware abstraction that allows programs to be portable across different platforms.

2.2.2 A Memory Model for Active Oberon

Based on the previous analysis we derived a memory model suitable for lock-free programming using the Active Oberon programming language. Although our proposal implements a memory model for this language, our approach is in general not bound to any particular programming language and is applicable to many execution environments. Deriving a specification for Active Oberon provided an excellent opportunity to define the mechanics of memory accesses beyond the synchronised constructs that many existing programs relied upon.


procedure CAS(reference to variable, old, new)
    value ← variable
    if value = old then
        variable ← new
    return value

Listing 2.1: The function of the atomic compare-and-swap operation.

The original Active Oberon language report only included the description of how active objects are synchronised and how shared objects can be protected from concurrent access [Rea04]. All object instances integrate a monitor that provides support for mutual exclusion and awaiting conditions. Similar to the corresponding facilities found in Java or C#, the operations on this monitor naturally act as synchronisation barriers. Unfortunately, the language report does not mention what happens if shared memory is accessed without having been protected properly beforehand. This kind of concurrent memory access is actually crucial for lock-free programming, where algorithms and access to data structures are intentionally not synchronised.

Non-blocking algorithms need support for atomic operations to access shared memory. The power of the atomic compare-and-swap operation combined with its wide availability on modern processors was the main reason why we wanted to include it in the Active Oberon programming language. We extended the original language specification of Active Oberon by introducing a predefined compare-and-swap operation called CAS, defined in Section C.1. The function of this built-in procedure is shown in Listing 2.1. It takes three arguments of the same basic or reference type, where the first argument indicates the shared variable to be changed. First, it compares the value of the shared variable with the value of the second argument. If the two values match, the variable is overwritten with the value of the third argument. The return value of the procedure is the original value of the variable. The whole operation is executed atomically and its result is visible instantaneously to other processes.


If the second argument matches the third one, the operation does not actually have to write the shared variable and is conceptually equivalent to an atomic read of the variable. This operation is necessary for consistency on platforms that perform ordinary read operations on single memory locations by executing two or more consecutive instructions. Another advantage of this atomic operation is the fact that we can require compilers to implement it as a synchronisation barrier in addition to the synchronisation constructs provided by Active Oberon. A sketch of these read and write idioms follows at the end of this section.

While defining a memory model for Active Oberon we kept the tradition of that programming language and its predecessors in mind. Our focus was to keep the memory model as simple as possible but still provide a mechanism for lock-free programming without sacrificing any potential optimisations. The resulting memory model implemented by our compilers for Active Oberon consists of the following two simple rules:

1. Data that is shared between two or more activities at the same time has to be protected using exclusive blocks unless the data is read or modified using the atomic compare-and-swap operation.

2. Changes to shared data become accessible to other activities after leaving an exclusive block or executing an atomic compare-and-swap operation. Implementations are free to reorder all other memory accesses as long as their effect equals a sequential execution within a single activity.
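To make the two rules concrete, the following sketch shows the first form of sharing that rule 1 permits; the Mailbox object is our own illustration and not part of the thesis. Its fields are only touched inside exclusive blocks, and rule 2 guarantees that the changes become visible to other activities once a block is left. The second form, sharing via the compare-and-swap operation alone, is sketched at the end of this section and demonstrated in Listing 2.2.

TYPE Mailbox = OBJECT
VAR item: LONGINT; full: BOOLEAN;   (* shared state, protected as per rule 1 *)

    PROCEDURE Put(value: LONGINT);
    BEGIN {EXCLUSIVE}   (* rule 1: access only inside an exclusive block *)
        AWAIT(~full);   (* block the calling activity until there is room *)
        item := value; full := TRUE
    END Put;   (* rule 2: the changes become visible on leaving the block *)

    PROCEDURE Get(): LONGINT;
    BEGIN {EXCLUSIVE}
        AWAIT(full);   (* block until an item is available *)
        full := FALSE;
        RETURN item
    END Get;

END Mailbox;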

This memory model is comparatively simple but not restrictive and explicitly allows compiler and hardware optimisations. It does not rely on the definition of volatile variables but still provides portable means for concurrent access. On the other hand, it provides important guarantees for programmers concerning the visibility and consistency of modified shared data. This is an important requirement for authors of lock-free algorithms in order to reason about the correctness of their programs. Therefore, this memory model is sufficient to provide a reasonable foundation for lock-free programming, which is used extensively in this thesis.
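As announced above, here is a minimal sketch of the atomic read and write idioms expressible under this memory model using nothing but the CAS builtin; the module and procedure names are our own and purely illustrative.

MODULE Atomics;   (* illustrative helpers, not part of the thesis's runtime system *)

    (* CAS with identical second and third arguments never modifies the
       variable but still returns its current value: an atomic read. *)
    PROCEDURE Read*(VAR variable: LONGINT): LONGINT;
    BEGIN
        RETURN CAS(variable, 0, 0)
    END Read;

    (* An unconditional atomic store built from CAS: retry until the value
       read beforehand is still the one being replaced by the swap. *)
    PROCEDURE Write*(VAR variable: LONGINT; new: LONGINT);
    VAR previous: LONGINT;
    BEGIN
        REPEAT
            previous := CAS(variable, 0, 0)
        UNTIL CAS(variable, previous, new) = previous
    END Write;

END Atomics.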


2.3 Programming Techniques

In this section, we will present a set of useful programming techniques for developing lock-free algorithms. Each of them is exemplified by a practical application written in the Active Oberon programming language [Rea04]. All techniques are based on the underlying cooperative scheduler that will be described in detail in Chapter 4.

In contrast to preemptive multitasking, which is used in almost all modern operating systems, cooperative multitasking requires all participating processes to cooperate with each other in order to guarantee system-wide progress. Cooperative processes are basically coroutines which take on the responsibility of passing the control of execution on to other processes [Con63]. In the context of non-blocking synchronisation however, cooperative multitasking not only requires but actually also allows a process to decide for itself when it will yield execution to another process. This fine-grained control offers several advantages over preemptive scheduling, where task switches occur asynchronously and are therefore completely transparent to the non-blocking algorithm. All algorithms and data structures presented in this thesis are based on combining lock-free programming with this advantage of cooperative multitasking. The fundamental idea is to completely omit and delay task switches during the execution of a non-blocking algorithm.

2.3.1 Disabling Task Switches

In our implementation of the Active Oberon language, all task switches required by cooperative multitasking are managed by the compiler instead of the application. The compiler inserts the necessary task switches at various places in the program. The exact mechanism and the actual insertion points are described in detail in Chapter 4. The advantages of this approach, which we call implicit cooperative multitasking, are twofold. First of all, the compiler is able to automatically insert all task switches that are necessary in order to ensure the cooperativeness of all participating processes, which basically guarantees system-wide progress.


Second, all task switches are transparent to the programmer and do not require any user intervention. As a consequence, the source code of a program does not need any modifications regardless of whether it is executed under preemptive or cooperative multitasking. However, we explicitly allow programmers to temporarily disable the automatic insertion of task switches by the compiler. The corresponding Active Oberon language feature we have introduced is called an uncooperative block. Uncooperative blocks are compound block statements with a special modifier called uncooperative, defined in Section C.12.

The semantics of an uncooperative block require the compiler to completely elide the generation of any implicit task switches while translating the statement sequence encompassed by the block statement. This rule only applies to the immediate code sequence of one particular block and is not transitive to any statement sequence executed indirectly, by calling a procedure for example. The reason is to make the cooperativeness of a statement block a compile-time property of the source code rather than an expensive policy that has to be checked at runtime. Each process executing the statement sequence of an uncooperative block does not implicitly yield execution to another process unless task switches are requested explicitly by the programmer. Although technically possible, we strongly recommend that programmers refrain from doing so, since it defies the whole purpose of uncooperative blocks. For the further discussion, we therefore assume that there are neither implicit nor explicit task switches in an uncooperative block.

We call the statement block uncooperative because each process executes the encompassed statement sequence without stopping as soon as it has entered the block and is never interrupted by any other process until it leaves the block. This has the nice implication that on a machine with a single processing unit, only one process at a time actually executes an uncooperative block. In this particular case, the contained statement sequence is effectively executed under mutual exclusion. As a consequence, uncooperative blocks can be regarded as synchronisation primitives which, interestingly enough, neither use any sort of blocking nor incur any runtime overhead whatsoever.


The most important property of uncooperative blocks however is the fact that the number of concurrently executing processes is also limited even on a machine with more than one processor. Because of the absence of any task switches, each processor has to execute an uncooperative block in its entirety. Thus, a process is neither interrupted by another process nor executed partially on a different processor. On hardware with several processing units, the maximal number of processes concurrently executing an uncooperative block is therefore always bounded by the actual number of physical processors. The fact that this number has an architectural limit and is always statically known is one of the main ideas of this thesis, and we try to exploit it wherever possible.

The advantages of having a statically known upper bound on the number of processes concurrently executing an uncooperative block are manifold. None of them are usually available under a preemptive scheduler, and they have therefore hardly been considered in practice:

1. The upper bound allows better estimates about progress even under high contention. When a process enters an uncooperative block and hits the architectural limit, it is guaranteed that all processors are busy executing either the same or another uncooperative block. Before any other process can enter the uncooperative block, one or more currently running processes must have left some other block beforehand.

In comparison, under a preemptive scheduler the number of processes executing arbitrary statement blocks is in general completely unbounded. In principle, a block could be entered by any number of processes without requiring a single one of them to actually proceed.

2. Since there are no task switches during the execution of an uncooperative block, the actual time spent in a block depends only on the actual statement sequence and the number of contending processors. For a preempted statement block, by contrast, this execution time also depends on the total number of processes in the system, which can cause arbitrarily long delays.


The statement sequence within an uncooperative block is typically quite short and is always executed by the same processor. Thus, in order for a process to starve repeatedly, it would have to be constantly overtaken by contending processors. For all practical purposes, starvation is therefore in general much less likely than under preemptive multitasking.

3. Lock-free algorithms sometimes need information about all currently contending processes. This applies especially to processes executing wait-free algorithms, which require the state of other processes in order to help them finish their operations in time. Since preemptive schedulers permit any number of contending processes, the required information has to be associated with each process, and gathering it can be difficult and expensive, especially since all temporarily suspended processes have to be considered as well.

Our approach effectively reduces the number of contending processes down to the number of processing units, which is usually a much smaller number. In addition, the maximal value of this number is always known at compile time, which allows the required information to be stored in static arrays. This makes the information efficiently associated with the algorithm itself and readily available to other processes, as the sketch following this list indicates. We will exemplify this programming technique in Section 2.3.3.
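The following sketch outlines this storage pattern under stated assumptions: the module, constant, and record names are our own inventions, and the actual per-processor data used by the thesis's algorithms is introduced in Section 2.3.3.

MODULE ContenderTable;   (* illustrative sketch only *)

CONST Processors = 8;   (* assumed architectural limit, known at compile time *)

TYPE Contender = RECORD
    active: BOOLEAN;   (* whether this processor currently executes the algorithm *)
    operation: LONGINT   (* encoded state that other contenders may inspect in order to help *)
END;

(* One statically allocated slot per processor suffices, because at most
   Processors processes can execute an uncooperative block concurrently;
   no dynamic registration of suspended processes is ever necessary. *)
VAR contenders: ARRAY Processors OF Contender;

END ContenderTable.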

Regarding preemptive multitasking, the same results could technically speaking also be achieved by temporarily disabling preemption for the duration of the uncooperative block. In the typical case where preemption is implemented using interrupts, this would imply that the corresponding interrupt handling mechanism must be disabled as well. Although turning interrupts off is an inexpensive operation, it can severely impact the stability, responsiveness, and reliability of the complete system. This is why the control over interrupt handling is in practice most often secured by the operating system and usually not accessible to ordinary applications running on top of it.


Using cooperative multitasking on the other hand, task switches do not rely on hardware facilities like interrupts and can therefore be implemented completely in software. There is no need to enable or disable interrupts during the execution of an uncooperative block, and it usually does not matter whether they are enabled or not. The only exception is that in the course of handling an interrupt, the processor might execute the very same uncooperative block that was interrupted beforehand. In the worst case, the uncooperative block could even be executed by several nested interrupt handlers at once. Of course, the uncooperative block cannot be considered to be executed under mutual exclusion on a single-processor machine in this scenario. Although interrupt handlers are executed on physical processors, they can conceptually also be regarded as being processed by an additional virtual processor that only runs every once in a while. Hardware architectures often explicitly allow and are designed to handle several nested interrupts at once. But the nesting level itself is always limited by the number of different interrupt handlers, since a handler should not interrupt itself. Therefore, the maximal number of virtual processors is equal to the architectural limit of nested interrupts. Thus, the sum of physical and virtual processors always has an upper bound which is defined by the hardware architecture and known for any given system. Consequently, the conclusion of always having a limited set of processes executing any uncooperative block still holds even with interrupt handling enabled. The ability to represent interrupt handlers as virtual processors also helps to reason about the problematic reentrancy of interrupt handlers. Since lock-free programming intentionally omits blocking synchronisation primitives, a non-blocking algorithm can potentially be executed by any number of processes instead of one at a time. As long as the algorithm does not rely on any specific information related to the currently running process or processor, it is completely irrelevant which processes or processors actually execute the algorithm concurrently. From the perspective of the algorithm, it could theoretically even be executed concurrently by the same process or processor at the same time. Therefore, non-blocking algorithms are usually considered to be reentrant by design.


procedure INCREMENT(reference to counter)
  uncooperative                                      ▷ (1)
    repeat
      previous ← CAS(counter, 0, 0)                  ▷ (2)
      value ← CAS(counter, previous, previous + 1)   ▷ (3)
    until value = previous                           ▷ (4)
    return previous

Listing 2.2: An example of a lock-free operation incrementing a shared counter variable and returning its previous value.

With the help of virtual processors we can model an interrupt handler as being executed by an additional process running on a distinct processing unit. As a result, the issue of reentrancy of uncooperative blocks vanishes completely. Our approach even works when non-blocking algorithms require information about the process or processor during their execution. In short, we solve this problem by assigning the same basic context information to a virtual processor that is also required by an ordinary process. The actual interrupt handling applied in this thesis is described in detail in Section 4.5.

2.3.2 Basic Lock-Free Operations

An example of one of the simplest but still useful and practical non-blocking data structures is an atomic counter. Its operations allow the integer value of the counter to be atomically incremented and decremented. On modern hardware architectures, the increment operation is sometimes provided natively by the processor itself. Unfortunately, atomic increment instructions usually do not allow the incremented value to be determined consistently. This is valuable, for example, when contending processes require unique integer values. Listing 2.2 shows a sample implementation of a portable increment operation on an atomic counter which returns its previous value in a consistent manner. It follows the basic structure of all lock-free algorithms:


1. The whole body of the procedure is marked as uncooperative, causing the compiler to omit task switches completely while compiling the procedure. This guarantees that there is an upper limit on the number of contending processes potentially executing this operation in parallel, which reduces the probability of starvation as discussed above.

2. The value of the shared variable is read atomically using a compare-and-swap operation and stored in a local copy. This particular invocation of the compare-and-swap operation does not modify the variable but is applied in order to ensure that the value of the whole variable is read atomically. This is important for accessing shared and concurrently modified variables consistently on architectures where ordinary reads and writes require more than one hardware instruction.

3. A subsequent compare-and-swap operation tries to atomically modify the shared variable based on its value read beforehand. Even if this operation is called by several processes, one compare-and-swap operation has to be the first according to the sequential ordering defined by the memory model. As a consequence, there is always one process that will succeed even under high contention, rendering the whole operation lock-free.

4. If the compare-and-swap operation returns a different value than expected, one or more other processes were faster and changed the variable already. In this case, the increment operation must be repeated by reading the variable and trying to modify it again, as illustrated by the C sketch below.
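For comparison, the same retry structure can be sketched in C11, whose stdatomic.h interface provides a compare-and-swap in the form of atomic_compare_exchange_strong. This is merely an illustrative sketch, not the implementation used in this thesis: C has no counterpart to the uncooperative modifier, so step (1) is absent, and the function name is our own.

    #include <stdatomic.h>

    /* Sketch of Listing 2.2 in C11. atomic_compare_exchange_strong plays
       the role of CAS: on failure it stores the observed value in its
       second argument, which directly yields the retry loop (3)+(4). */
    long increment(_Atomic long *counter)
    {
        long previous = atomic_load(counter);            /* (2) atomic read */
        while (!atomic_compare_exchange_strong(counter,  /* (3) try update  */
                                               &previous, /* updated on fail */
                                               previous + 1))
            ;                                            /* (4) retry       */
        return previous;                 /* value before the increment */
    }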

The increment operation is lock-free but not wait-free because a single process can theoretically starve when it never succeeds in the third step. This can only happen when one or more different processes repeatedly succeed in modifying the value of the shared counter in the meantime. However, each process that does succeed also exits the uncooperative block and returns from the procedure. Since the number of processes doing so is limited by the number of processors, there are no additional processes involved which could cause the initial process to starve.


Thanks to cooperative multitasking, starvation can only be provoked when the participating processors modify the same shared counter over and over in a tight loop. In practice however, it is completely unrealistic that a single processor is forever outperformed by contenders even in this pathological case [ACHS14]. For example, the number of instructions required for repeatedly executing the whole procedure is much larger than the relatively short instruction sequence for atomically reading and modifying the shared variable. Moreover, in contrast to preemptive multitasking, the instruction sequence is never interrupted by any other process. If the instruction sequence had been preempted, other processes could cause arbitrarily long delays and even successfully modify the shared counter in the meantime. Therefore, the ability to disable task switches in lock-free algorithms renders unsuccessful atomic compare-and-swap operations much more unlikely.

2.3.3 Using Processor-Local Storage

Modern programming languages and operating systems alike support the notion of thread-local storage, which provides access to global data that is unique to the currently running thread. This facility is typically implemented by instantiating a unique copy of a predefined set of global variables for each new thread and associating it accordingly. The main advantage of thread-local storage is that these variables are still globally accessible but behave like local variables and therefore do not require any access synchronisation. A completely different usage can be found in wait-free algorithms, which often make use of thread-local storage to store and publish their current state in order to allow other threads to help them make progress. However, thread-local memory associated with a thread is usually not designed to be shared across multiple threads as it is, for example, reclaimed as soon as the thread stops its execution. In addition, the usage of thread-local storage incurs a performance overhead when accessing thread-local variables or creating threads. Moreover, the corresponding memory may always be allocated upon the creation of the thread, regardless of whether the latter actually uses any of its thread-local storage or not.


Using cooperative multitasking, we were able to address these issues. The key idea is to gather all accesses to thread-local variables inside uncooperative blocks. Since there are no task switches while executing an uncooperative block, any thread-local variable is always accessed by the same processor during that period. The required set of distinct instantiations of global variables for concurrent access is therefore limited by the number of processors in the system. We call that set processor-local storage as it consists of one copy of predefined global variables per processor rather than per process. This technique offers several advantages:

1. The amount of memory allocated for processor-local storage does not depend on the number of processes in the system. Its size is only limited by the actual number of processors. Since the upper bound of this number is statically known, processor-local storage can be simply represented by global arrays where each element is associated with one individual processor.

2. Using distinct elements of an array, processor-local storage can be efficiently accessed using the unique index of the currently running processor. As before, accessing the array does not need any synchronisation since each processor only modifies its own element. Furthermore, the creation of a process does not incur any overhead as all processor-local data can be allocated statically at the beginning of a program.

3. Using an array, the set of data local to all processors can be defined together with the actual algorithm which requires that data. The algorithm can therefore easily query all processor-local data that is concurrently used by other processors by traversing the corresponding array, as sketched below.
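A minimal sketch of this technique in C is shown below. The constant MAX_PROCESSORS and the function current_processor are hypothetical stand-ins for facilities the runtime system would provide; they are not part of standard C, and the stub merely makes the sketch self-contained.

    #include <stdatomic.h>

    #define MAX_PROCESSORS 64      /* assumed architectural upper bound */

    /* one slot per processor, padded to avoid false sharing of cache lines */
    struct slot {
        _Atomic long value;
        char pad[64 - sizeof(_Atomic long)];
    };

    static struct slot local[MAX_PROCESSORS];   /* processor-local storage */

    /* illustrative stub; a real runtime would return the index of the
       processor executing the current uncooperative block */
    static int current_processor(void) { return 0; }

    /* only the owning processor writes its slot; the atomic type makes
       the value safely readable by peers traversing the array */
    void record(long datum)
    {
        atomic_store(&local[current_processor()].value, datum);
    }

    /* other processors may traverse the whole array to inspect the data
       published by their peers, as described in point 3 above */
    long sum_all(void)
    {
        long sum = 0;
        for (int i = 0; i < MAX_PROCESSORS; i++)
            sum += atomic_load(&local[i].value);
        return sum;
    }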

In general, thread-local storage might still be useful and cannot be replaced completely by our approach. For example, there is no guarantee that two consecutive uncooperative blocks are executed by the same processor because an intermediate task switch could cause one process to be resumed

by another processor. Therefore, it is impossible to cache data specific to the currently running processor across two consecutive uncooperative blocks. The index of the current processor should only be used to refer to data that is associated with that particular processor, since the actual value of the index typically varies from the viewpoint of a single process. However, our approach is sufficient for all of our purposes and we think that it provides a more elegant solution than its counterpart. Additionally, all of our algorithms can actually be implemented without requiring the runtime system or the programming language to be extended with the notion of thread-local storage. A practical application of processor-local storage is improving the performance of the compare-and-swap operation under high contention. Since atomic operations like compare-and-swap act as synchronisation barriers, they usually put a strain on the caches of multiprocessor memory devices. Their actual performance drops all the more under high contention because there are more atomic memory accesses that have to be synchronised. In the case of non-blocking algorithms, where failed operations are typically repeated until they succeed, memory devices tend to congest even more. As a result, atomic operations have to be considered several times more expensive than standard machine instructions. See Figure 7.1 on page 147 for the actual performance of the compare-and-swap operation under increasing contention on a concrete machine with 32 processors. In order to cope with the performance degradation of atomic compare-and-swap operations under high contention, Dice et al. have analysed several so-called contention management algorithms [DHM13]. The basic idea behind contention management is to delay the repetition of failed atomic operations in order to prevent memory devices from congesting. Dice et al. identified a technique called exponential backoff as one of the best performing contention management algorithms on contemporary hardware architectures. This algorithm keeps track of how many times the compare-and-swap operation failed and waits for an exponentially long time if that number hits a machine-dependent threshold. The actual waiting time depends on the number of failures as well as on some algorithm-specific parameters that have to be tuned carefully for each machine.


More important than the actual details of the waiting time in this context, however, is the fact that Dice et al. utilise thread-local storage in order to count the failures. This is convenient but conceptually flawed, since the congestion is not triggered by threads but rather by the underlying processors running the threads. Using thread-local storage, newly created processes always start out with the same initial value of the failure counter. This number may first need some time to stabilise at a certain level, which might distort the actually required waiting time during that period. Regarding a system with a preemptive scheduler, it is possible for a task switch to occur while a process is waiting because of a failed compare-and-swap operation. If the scheduler resumes a process that will execute an atomic operation as well, the corresponding processor did not wait enough time between the two atomic operations and may cause the memory device to congest. In addition, the original process may be resumed at a time when there is no congestion any more, causing it to continue its wait for too long. Listing 2.3 on the following page shows a sample implementation of an atomic compare-and-swap operation which makes use of processor-local storage in order to implement exponential backoff for N processors. Each processing unit uses its unique index, called processor in the listing, to access its own element of a global array counting the failures. In comparison to the implementation by Dice et al., the number of failures is therefore associated with each processor instead of each process. This effectively solves all issues raised above since the actual number of failures is always up to date regardless of which process has actually executed the atomic operation. In addition, the surrounding uncooperative block ensures that there are no task switches while waiting. As a consequence, the corresponding processor will always wait for the necessary time after a failed atomic operation, regardless of how many concurrent processes there are. Processor-local storage and all other programming techniques presented in this section are used extensively in the lock-free algorithms discussed in the remainder of this thesis. The following chapter, for example, discusses a lock-free implementation of a concurrent queue data structure which makes use of processor-local storage in order to implement safe memory deallocation.


variable failures : array N of integer

procedure COMPAREANDSWAP(reference to variable, old, new)
  uncooperative
    value ← CAS(variable, old, new)
    if value = old then
      if failures[processor] > 0 then
        failures[processor] ← failures[processor] − 1
    else
      failures[processor] ← failures[processor] + 1
      if failures[processor] > threshold then
        WAIT(2^min(failures[processor] · factor, maximum))
    return value

Listing 2.3: Compare-and-swap operation using an exponential backoff contention management algorithm for N processors. Adapted from Dice et al. [DHM13].
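The same algorithm can be sketched in C11 as follows. As before, MAX_PROCESSORS and current_processor are hypothetical runtime facilities, and THRESHOLD, FACTOR, and MAXIMUM stand for the machine-dependent tuning parameters mentioned above; the busy-wait loop is only a crude substitute for a calibrated WAIT.

    #include <stdatomic.h>

    #define MAX_PROCESSORS 64    /* assumed architectural upper bound   */
    #define THRESHOLD 4          /* machine-dependent tuning parameters */
    #define FACTOR    1
    #define MAXIMUM   16

    static int failures[MAX_PROCESSORS];             /* processor-local */

    static int current_processor(void) { return 0; } /* illustrative stub */

    static void wait_exponential(unsigned exponent)
    {
        for (unsigned long i = 0; i < (1UL << exponent); i++)
            atomic_signal_fence(memory_order_seq_cst);  /* spin, not optimised away */
    }

    long cas_with_backoff(_Atomic long *variable, long old, long new)
    {
        int p = current_processor();
        long value = old;
        if (atomic_compare_exchange_strong(variable, &value, new)) {
            if (failures[p] > 0)
                failures[p]--;              /* success: decay the counter  */
        } else {
            failures[p]++;                  /* failure: count and back off */
            if (failures[p] > THRESHOLD) {
                unsigned e = failures[p] * FACTOR;
                wait_exponential(e < MAXIMUM ? e : MAXIMUM);
            }
        }
        return value;   /* previous value of the variable, as in the pseudocode */
    }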

3 Case Study: Lock-Free Queues

This chapter acts as a case study of lock-free programming and describes the implementation of a concurrent and unbounded first-in first-out queue data structure. We use queues extensively within the scheduler of our runtime system in order to keep track of the set of all activities that are currently not running. This chapter illustrates several development stages by stepwise improving and revising a naive implementation into a practical, safe, and memory-efficient final version. Along the way, it also reveals several typical pitfalls of lock-free programming and proposes generic solutions to them.

3.1 Introduction

Queues are fundamental and popular data structures which are used in a variety of contexts, ranging from simple software applications to the implementation of the underlying operating systems. Their field of application also includes multiprocessing environments where data structures are potentially accessed concurrently by several processes. In this case, the corresponding operations of the data structures have to be properly synchronised in order to guarantee their correctness. Because of their popularity and broad field of application, the synchronisation of queues has been researched thoroughly in the standard literature. Besides many implementations based on mutual exclusion, several non-blocking versions of concurrent queues have also been proposed. Hwang and Briggs, for example, were among the earliest researchers to present a non-blocking queue based on the atomic compare-and-swap operation [HB84]. Further lock-free implementations include the studies by Mellor-Crummey [Mel87], Herlihy [Her90, HLM03], Massalin and Pu [MP91], and Valois [Val94].


However, the single most popular lock-free queue implementation is probably the one devised by Michael and Scott [MS96]. There are several reasons why their queue set a standard which is widely adopted in the literature as well as in practice. By drawing ideas from the work by Valois, their implementation is simple and one of the fastest to date [HS08, KP11]. In contrast to some of the algorithms proposed by other researchers, their lock-free queue implementation is highly practical because it explicitly allows empty queues as well as an arbitrary number of concurrent enqueue and dequeue operations. In addition, unlike some other methods, it does not require an atomic double compare-and-swap operation, which is hardly supported by contemporary hardware. All of the unbounded lock-free queues in the standard literature, including the one by Michael and Scott, are based on representing queue items using a singly linked list. The list consists of a linked chain of intermediate nodes which do not store the actual user data but only a reference to it. As a consequence, each operation that inserts a new element into the linked list also allocates memory for the intermediate node representing that element. There are also wait-free queue implementations, for example the one by Kogan and Petrank [KP12]. They guarantee starvation-freedom by publishing information about each atomic operation a process is about to perform on a queue in order to allow contending processes to help each other. Unfortunately, these algorithms therefore not only allocate memory per element but also for each of the atomic steps, which accumulates to a total of three memory allocations per enqueue operation. The allocation of memory seems to be ubiquitously accepted as a necessary means to implement non-blocking data structures. Nevertheless, the overhead of allocating memory may actually make up the largest part of a lock-free operation and we are surprised that this issue has never been addressed before. Regarding the task scheduler of our runtime system for instance, each task switch implies that the currently running activity gets suspended and therefore placed into a queue. We believe that this simple operation is critical and should ideally not require any memory allocation at all. Otherwise, a basic task switch could lead to a system failure if there is no more memory available, or even trigger a full garbage collection.


As a consequence, we refrained from applying an existing solution and tried to come up with an implementation of a lock-free queue that is more suitable for our purposes. We based our first approach on a basic and unsynchronised algorithm that satisfies our memory constraints. Listing 3.1 on the next page shows a standard implementation of an unsynchronised queue representing its elements by a singly linked list. It consists of a basic data structure called QUEUE which stores references to the first and last elements of the linked list. A single element is represented by a data structure called ITEM which just stores a reference to the next element in the list. This data structure is intended to be a base type which has to be extended by users of the queue in order to associate it with meaningful user data. In our scheduler, for example, we extend the representation of an activity from this base type and are therefore able to enqueue and dequeue activities into any queue. The technique of representing elements of a container using base types requires the programming language to support user-defined type extensions. It can therefore often be found in programs written in object-oriented programming languages. For example, it has been applied extensively in the implementation of Project Oberon [WG92]. An advantage of this technique is its generality, as it allows users of the queue data structure to store arbitrary data per element in the linked list. But most importantly, it chains the elements directly rather than indirectly using intermediate nodes which are typically used as placeholders for the actual user data. Therefore, once an element is allocated by the user, it can be enqueued and dequeued repeatedly without requiring additional memory allocations. This is the main reason why we chose this representation as the base for our lock-free queue implementation. An obvious restriction of this technique however is the fact that user data as represented by a single item may not be part of two distinct queues at the same time. If an element is currently part of a queue, it has to be dequeued first before it can be enqueued again into the same or another queue. This limitation poses no problem in our scheduler, since activities are either currently running, or enqueued exclusively in a single queue in order to be resumed.


structure ITEM
  next : pointer to ITEM

structure QUEUE
  first : pointer to ITEM
  last : pointer to ITEM

procedure ENQUEUE(item, queue)
  item.next ← null
  if queue.last ≠ null then
    queue.last.next ← item
  else
    queue.first ← item
  queue.last ← item

procedure DEQUEUE(queue)
  item ← queue.first
  if item ≠ null then
    queue.first ← item.next
    if queue.first = null then
      queue.last ← null
  return item

Listing 3.1: Data structures and operations of an unsynchronised implementation of an unbounded first-in first-out queue.
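The base-type technique translates naturally to C as a so-called intrusive list, where the ITEM header is embedded as the first member of the user's structure. The following sketch mirrors Listing 3.1 and, like it, is deliberately unsynchronised; the activity type is a made-up example of user data.

    #include <stddef.h>

    struct item  { struct item *next; };
    struct queue { struct item *first, *last; };

    struct activity {          /* hypothetical user data extending ITEM */
        struct item header;    /* must come first: an activity "is an" item */
        int id;
    };

    void enqueue(struct item *item, struct queue *queue)
    {
        item->next = NULL;
        if (queue->last != NULL)
            queue->last->next = item;   /* append behind the current tail */
        else
            queue->first = item;        /* queue was empty */
        queue->last = item;
    }

    struct item *dequeue(struct queue *queue)
    {
        struct item *item = queue->first;
        if (item != NULL) {
            queue->first = item->next;
            if (queue->first == NULL)   /* removed the only element */
                queue->last = NULL;
        }
        return item;
    }

An activity a is then enqueued without any allocation by passing &a.header, which is exactly the property exploited by the scheduler described above.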


The queue implementation in Listing 3.1 is not safe to be used by several activities in parallel, as neither of its operations is properly protected against concurrent access. If there are no additional operations accessing the underlying linked list, it would actually be sufficient to protect both procedures simply by applying mutual exclusion in the form of a blocking synchronisation primitive. This would designate the actual code of both operations as critical sections which may only be executed by one process at a time. In order to arrive at a non-blocking alternative to this approach however, the algorithm has to be rewritten significantly. But first, we have to identify the set of all algorithmic steps which appear to contending processes to take effect instantaneously.

3.2 Linearisation Points

Our memory model specifies that any change to a shared variable becomes accessible to other activities immediately after the execution of the atomic compare-and-swap operation. In general, if an operation appears to take effect instantaneously at some point in time between its invocation and response, it is called linearisable. The notion of linearisability was devised by Herlihy and applies to all atomic operations [HW90]. The atomic step when a linearisable operation takes effect is generally known as its linearisation point. According to our memory model, executing a modifying compare-and-swap operation as well as leaving an exclusive block both serve as linearisation points. In the absence of proper synchronisation of concurrent activities that execute non-blocking algorithms, any atomic change of a shared variable is immediately visible to other activities. By implication, the state of shared variables must always be considered volatile in between the execution of atomic operations. In the worst case, any possible state can and will eventually be observed by an activity, and it is the task of the programmer to reason about the correctness of the non-blocking algorithm in all of these states. Identifying the set of linearisation points of an algorithm makes it possible to isolate the state space that has to be considered for this purpose.


Listing 3.2 on page 39 shows our first approach to implement a lock-free queue. We naively tried to mimic the implementation of the unsynchronised version and therefore used the same data structures as before. Complying with the memory model as defined in Section 2.2.2, we replaced all accesses to potentially shared variables in both procedures of the unsynchronised version with atomic compare-and-swap operations. This pertains to the references to the first and last queue elements as well as to the successor of the last queue element. In addition, we reordered statements and added loops in order to take account of failed compare-and-swap operations according to the guidelines given in Section 2.3.2. Although this version might look correct at first glance, its implementation is severely flawed. The linearisation points in our first approach include all modifying compare-and-swap operations and are labelled accordingly in Listing 3.2. Both procedures have three distinct linearisation points which can be summarised as follows:

E1: The ENQUEUE procedure repeatedly tries to swing the reference to the last element of the queue to the item given as argument. The successor of this element is guaranteed to be invalid as it is reset at the beginning of the procedure.

E2: If there previously was a last element, the reference to its successor is set to the new item. This operation must succeed because any other activity enqueueing an element operates on a different reference to the last element.

E3: If there was no last element, the queue was empty and the reference to its first element is changed instead. This operation might fail if another element was enqueued in the meantime. However, this case can be ignored because the reference to the first element would then already point to the correct item in the linked list.

D1: The DEQUEUE procedure returns with a special value if there is no first element. This linearisation point does not modify shared data but still takes effect immediately from the point of view of the caller.


procedure ENQUEUE(item, queue)
  item.next ← null
  repeat
    last ← CAS(queue.last, null, null)
  until CAS(queue.last, last, item) = last      ▷ E1
  if last ≠ null then
    CAS(last.next, null, item)                  ▷ E2
  else
    CAS(queue.first, null, item)                ▷ E3

procedure DEQUEUE(queue)
  repeat
    first ← CAS(queue.first, null, null)
    if first = null then
      return null                               ▷ D1
    next ← CAS(first.next, null, null)
  until CAS(queue.first, first, next) = first   ▷ D2
  if next = null then
    CAS(queue.last, first, null)                ▷ D3
  return first

Listing 3.2: Operations of a naive approach to implement a lock-free queue.

D2: If the queue is not empty, the procedure tries to swing the reference to the first element of the queue to its immediate successor. If this operation was successful, the previously first element of the queue is now dequeued. Otherwise, another activity already modified the queue in which case the dequeue operation must be repeated all over.

D3: If the dequeued element has no successor, the queue has been emptied. In order to ensure the correctness of the next enqueue operation, the reference to the last element of the queue has to be invalidated if it matches the dequeued element.


Although the sequential execution of both procedures is obviously correct in isolation, there are several issues when they are executed concurrently. For the sake of the argument, we assume that there are only two activities P and Q concurrently accessing a single queue. Using the linearisation points identified above, we can easily detect execution sequences that lead to a corruption of the linked list. For this purpose, it is sufficient to look at a few execution sequences where the linearisation points of both procedures are interspersed with each other. First, we look at an execution sequence where activity P calls the ENQUEUE procedure and enqueues a new element called A. Right after P reaches linearisation point E1, denoted by P|E1, the other activity Q invokes the DEQUEUE procedure. Activity P continues after Q has reached its first linearisation point. Figure 3.1 shows the states of two different queues in this scenario. On the left hand side, the initial queue is empty and the corresponding sequence of reached linearisation points is P|E1, Q|D1, P|E3. Activity Q leaves the DEQUEUE procedure immediately after reaching linearisation point D1 since the reference to the first element has not yet been modified by P and is still invalid. Activity P can therefore complete its operation without interruption and leaves behind a correct data structure. The other queue depicted on the right hand side of Figure 3.1 initially contains a single element called B. Since the queue is not empty, different linearisation points are reached and the corresponding sequence is P|E1, Q|D2, P|E2. Here, activity Q resets the reference to the first element of the queue because B has no successor yet. Although the successor is just about to be set when P reaches E2, the queue data structure is now corrupted because the reference to the first element does not point to the actual first element A. Any subsequent dequeue operation returns prematurely after reaching D1 while enqueue operations append elements without ever correcting the reference to the first element. This problem does not occur if the initial queue contains more than one element. In this case, the first element already has a valid successor and Q is able to correctly modify its reference after reaching D2. When P reaches E2, it eventually sets the missing successor, yielding a valid linked list.


[Diagram: the left column shows an initially empty queue passing through the states reached at P|E1, Q|D1, and P|E3; the right column shows a queue initially containing element B passing through the states reached at P|E1, Q|D2, and P|E2.]

Figure 3.1: States of two different shared queues after reaching linearisation points while one activity P enqueues an element and another activity Q dequeues an element.


Another sample execution sequence describes a similar scenario where a dequeue operation is interspersed with a concurrent enqueue operation instead. The corresponding sequence of linearisation points on a shared queue containing a single element is Q|D2, P|E1, Q|D3, P|E2. In this case, activity Q resets the reference to the first element of the queue when reaching linearisation point D2 as before. This time however, activity P fails to properly readjust this reference because it reaches linearisation point E2 instead of E3 in the end. As a consequence, the resulting queue gets corrupted in the same way as described above. A third example illustrates the case where one activity gets suspended in between two subsequent linearisation points of an operation but is never resumed. Although this scenario is unlikely in practice, since it involves some sort of external event which must interrupt the activity within an extremely short time frame, it nevertheless has to be considered in all non-blocking algorithms. Regarding the ENQUEUE procedure, any activity that fails to reach linearisation points E2 or E3 does not connect the enqueued element to the linked list. Although subsequent enqueue operations can successfully append new elements, the linked list remains partitioned into two separate parts. The same applies to activities that are suspended in between linearisation points D2 and D3 and therefore fail to update the reference to the last element of a queue that became empty. In this case, a subsequent enqueue operation does not update the reference to the first element of the queue because the reference to its last element has not been invalidated. As a consequence, enqueue operations succeed while any dequeue operation aborts prematurely in D1. Besides proving that our algorithm is incorrect, the linearisation points also help to reason about its progress guarantees. Since both procedures contain a single loop, it is sufficient to show in which cases they may loop forever. Both operations fail to make progress when they never reach linearisation points E1 and D2 respectively. In both cases, the linearisation points coincide with modifying compare-and-swap operations. Therefore, these atomic operations must fail repeatedly in order to cause an activity to get stuck permanently in the corresponding loop. Although such starvation is very unlikely in practice, the algorithm cannot be wait-free by definition.


However, even if an activity actually starves, it has not yet successfully executed an atomic compare-and-swap operation. Hence, there has not been any change visible to other concurrently running activities. The starving activity can therefore never obstruct the operation of other activities. On the other hand, the starvation of one activity implies that there is at least one other activity that successfully executed an atomic compare-and-swap operation. As a result, the successful activity always makes progress by exiting the corresponding loop and therefore renders the algorithm lock-free. We will use the same sort of argumentation for all other lock-free algorithms in this thesis. We could try to rewrite our lock-free procedures in order to fix all the issues raised above. In fact, we tried out several variations and each time reordered the linearisation points inside the procedures. However, the inherent problem with our algorithm does not lie in its operations but in the choice of the underlying data structure, which forces us to treat empty queues differently from non-empty ones. The advantage of having two references, one to each end, is that both queue operations can ideally operate on their respective part without interfering with each other. But whenever an enqueue operation inserts an element into an empty queue, it also has to update the reference to the first element of the queue. Conversely, a dequeue operation removing the last remaining element of a queue also has to invalidate the reference to its last element. Since we are using compare-and-swap operations that modify single memory locations, only one of the two references can be updated at a time. As a consequence, if two references have to be modified, only one of them can be up to date in between the two modifications. The other reference must be deemed to be lagging behind and all procedures have to be prepared for this issue. Since we have only two references that have to be modified successively in two different procedures, there are at most four possible combinations of how to choose the most current reference in each case. In our naive implementation for example, we chose the reference to the last element to be the most current in the ENQUEUE procedure, whereas the first element is always up to date in the DEQUEUE procedure. For each possible

combination however, we have found sequences of linearisation points that eventually lead to a corruption of the linked list. Consequently, this data structure has to be abandoned in favour of a design that does not have to update two references in both procedures. An alternative approach consists of discarding the reference to the last element altogether. This solution does not suffer from the problems described above since there is only one reference to keep up to date. However, it requires enqueue operations to actually traverse the linked list in order to get to the currently last element of the queue. The obvious decrease in performance of querying the last element, going from constant time to linear complexity, is unfortunate to say the least. However, a real problem occurs when enqueue operations traversing the linked list are overtaken by concurrent dequeue operations. Without further precaution, any dequeued element that is immediately enqueued into another queue could cause a traversal to unknowingly proceed in a different linked list. We did not pursue this approach further since solving the intricate problem of concurrently traversing elements seems to be disproportionate to its actual value. The next section follows another idea and unifies the representation of empty and non-empty queues instead. It describes an improved version of our algorithm as well as the consequences of incorporating the notion of a sentinel element.

3.3 Intermediate Nodes

Our next version of the queue data structure introduces a distinguished dummy element of the linked list called the sentinel. The sentinel is preallocated and always the first element in the linked list regardless of how many actual items there are. Its sole purpose is to unify the representation of empty and non-empty queues. An empty queue consists only of the sentinel element, which is referenced by both pointers to the first and the last element of the queue. Both references therefore always point to an existing element and it is never necessary to invalidate them while enqueueing or dequeueing an item.


structure ITEM

structure NODE
  next : pointer to NODE    ▷ successor in linked list
  item : pointer to ITEM    ▷ associated element

structure QUEUE
  first : pointer to NODE   ▷ first node of linked list, sentinel
  last : pointer to NODE    ▷ last node of linked list, may lag behind

Listing 3.3: Data structures of an unbounded first-in first-out queue making use of intermediate nodes.

Regarding our initial design, we still want users of the queue to provide their own user data in the form of an extension of the basic ITEM data structure. In a non-empty queue, the sentinel element therefore points to its successor which stores the actual user data of the first item. If a dequeue operation advances the reference to the first element to its successor, the latter automatically becomes the new sentinel. Although the new sentinel holds the actual user data that has been dequeued, it is still part of the linked list and therefore cannot be returned to the user. Otherwise, the user could immediately enqueue this item in the same or another queue, breaking the linked list of the original queue. This problem can be solved by adding another level of indirection in the form of an intermediate data structure called NODE that references the actual enqueued item. The linked list therefore consists of a chain of nodes where the first is the sentinel node and all others have an actual item associated with them. Listing 3.3 shows an improved definition of the data structures which have been modified accordingly. Although this design solves the problem raised above, it unfortunately also subverts our goal of memory-efficiency since intermediate nodes require memory allocations. For the sake of the argument, we will ignore this issue for now and resolve it in a later section.


[Diagram: the linked list starts at the sentinel node S, followed by nodes A, B, and C; the items 1, 2, and 3 are referenced by A, B, and C respectively; first points to S and last points to C.]

Figure 3.2: Example of a queue containing three elements referenced by a linked list of four nodes.

Figure 3.2 depicts a sample instantiation of this queue data structure. It consists of a linked list of four nodes S, A, B, and C where the first one is the distinguished sentinel node. While S has no item associated with it, the actual queue items numbered from one to three are referenced by the nodes A, B, and C respectively. The invariant of this data structure is that the linked list of nodes is always up to date. The list always begins with the sentinel node and ends with the first node that has no valid successor. The reference to the last node always points to some node of the linked list and is intentionally allowed to lag behind. The idea for this design was drawn from Michael and Scott [MS96], who in turn based their work on Valois [Val94]. The advantage of this invariant is that the reference to the last node does not have to be up to date since the last node can always be identified by a missing successor. Furthermore, thanks to the introduction of a sentinel node, enqueue operations no longer need to modify the reference to the first node in order to indicate an empty queue. As a result, the enqueue operation only operates on the successor of the last node in order to keep the linked list up to date. The reference to the last element can be used to find the actually last node efficiently but does not need to be updated at the same time.


procedure ENQUEUE(item, queue)
  node ← ALLOCATE()
  node.next ← null
  node.item ← item
  repeat
    last ← CAS(queue.last, null, null)
    next ← CAS(last.next, null, node)   ▷ E1
    if next ≠ null then
      CAS(queue.last, last, next)       ▷ E2
  until next = null
  CAS(queue.last, last, node)           ▷ E3

Listing 3.4: Enqueue operation making use of intermediate nodes.

Listing 3.4 shows the lock-free enqueue operation for the improved queue data structure. Apart from some minor rearrangements, it corresponds to the algorithms presented by Valois [Val94] and Michael and Scott [MS96]. In this preliminary version, the enqueue operation always allocates a new intermediate node at the beginning. For the sake of convenience, we assume for now that all allocated nodes occupy different memory regions. As required by the invariant of the data structure, the ENQUEUE procedure updates the reference to the last node after modifying the linked list. The corresponding linearisation points are labelled in the listing and can be summarised as follows:

E1: The enqueue operation first tries to update the successor of the currently last node. This operation fails if another enqueue operation has overtaken it at this step, causing the corresponding reference to the last node to lag behind.

E2: If there is already a successor, the reference to the last node obviously lags behind. In this case the procedure tries to swing the reference to the last node to its successor in order to repeat the enqueue operation on a subsequent node of the linked list.


procedure DEQUEUE(queue)
  repeat
    first ← CAS(queue.first, null, null)
    next ← CAS(first.next, null, null)
    if next = null then
      return null                               ▷ D1
    CAS(queue.last, first, next)                ▷ D2
    item ← next.item
  until CAS(queue.first, first, next) = first   ▷ D3
  FREE(first)
  return item

Listing 3.5: Dequeue operation making use of intermediate nodes.

E3: If the modification of the successor was successful, the procedure finally tries to update the reference to the last node by swinging it to the currently appended node. This operation fails if this reference already lags behind again at this point.

The corresponding dequeue operation is shown in Listing 3.5 and has the following linearisation points:

D1: As before, the dequeue operation returns with a special value if the sentinel node has no successor indicating an empty queue.

D2: If the reference to the last node points to the sentinel, the procedure advances it to its immediate successor before removing the sentinel from the linked list. Otherwise, the reference to the last element points to a node which is not part of the linked list after the removal.

D3: The procedure tries to swing the reference to the first element of the queue to its immediate successor. This operation fails and has to be repeated if another dequeue operation has overtaken it in the meantime. Beforehand, the item associated with the node that is about to become the new sentinel is stored as return value. A combined C sketch of both operations follows below.
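For illustration, both operations can be sketched together in C11 under the same simplifying assumption as the pseudocode, namely that every allocated node occupies a distinct memory region and is never reused, so the ABA problem discussed in Section 3.4 does not arise here. malloc and free stand in for ALLOCATE and FREE, and error handling is omitted; the comments map each compare-and-swap to its linearisation point.

    #include <stdatomic.h>
    #include <stdlib.h>

    struct node {
        struct node *_Atomic next;
        void *item;
    };

    struct queue {
        struct node *_Atomic first;   /* sentinel node  */
        struct node *_Atomic last;    /* may lag behind */
    };

    void queue_init(struct queue *queue)
    {
        struct node *sentinel = malloc(sizeof *sentinel);
        atomic_store(&sentinel->next, NULL);
        sentinel->item = NULL;
        atomic_store(&queue->first, sentinel);
        atomic_store(&queue->last, sentinel);
    }

    void enqueue(void *item, struct queue *queue)
    {
        struct node *node = malloc(sizeof *node);
        atomic_store(&node->next, NULL);
        node->item = item;
        for (;;) {
            struct node *last = atomic_load(&queue->last);
            struct node *next = NULL;
            if (atomic_compare_exchange_strong(&last->next, &next, node)) { /* E1 */
                atomic_compare_exchange_strong(&queue->last, &last, node);  /* E3 */
                return;
            }
            /* E2: the tail reference lags behind; help it along and retry */
            atomic_compare_exchange_strong(&queue->last, &last, next);
        }
    }

    void *dequeue(struct queue *queue)
    {
        for (;;) {
            struct node *first = atomic_load(&queue->first);
            struct node *next = atomic_load(&first->next);
            if (next == NULL)
                return NULL;                                               /* D1 */
            struct node *expected = first;
            atomic_compare_exchange_strong(&queue->last, &expected, next); /* D2 */
            void *item = next->item;
            if (atomic_compare_exchange_strong(&queue->first, &first, next)) { /* D3 */
                free(first);    /* safe only under the no-reuse assumption */
                return item;
            }
        }
    }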


The correctness of this algorithm has been shown by Michael and Scott in terms of safety and liveness [MS96]. Here, we deduce its correctness by arguing that there is no sequence of linearisation points reached by concurrent activities that violates the invariant of the data structure. Without loss of generality, we will focus on all combinations of single linearisation points sequentially reached by two concurrent activities P and Q operating on a single shared queue. Sequences of linearisation points involving more than two activities can always be pairwise reduced to this set of combinations and must therefore be correct by induction. Both conditions of the invariant can be considered separately:

1. The invariant requires the linked list of nodes to be chained correctly at all times. The linked list consists of the sentinel node and all of its direct and indirect successors. The corresponding linearisation points potentially modifying the linked list are therefore only E1 and D3. If both activities reach the same linearisation point, only the corresponding compare-and-swap operation of the first activity in the sequence can succeed because both try to modify the same memory location. In the sequence P|E1, Q|E1 for example, only P will succeed, leaving behind a correctly linked list, while Q will have to repeat the operation because the successor has been updated in the meantime. The same applies to the sequence P|D3, Q|D3 where only P will succeed in advancing the sentinel node, while Q has to repeat the loop because the reference to the first node no longer matches its expected value. In both cases, only one activity modified the data structure and did not violate the invariant while doing so.

Regarding the sequences P|E1, Q|D3 and P|D3, Q|E1, both of the corresponding compare-and-swap operations are potentially able to succeed because they modify two distinct memory locations in each case. If for example the compare-and-swap operation in P|E1 was successful, then P has modified the reference to the next node of either the sentinel or one of its successors. The sentinel node will therefore always have a valid successor and Q cannot violate the invariant when advancing the reference to the first node in D3.


Regarding the second sequence P|D3, Q|E1 on the other hand, the only conceivable way to break the invariant is when Q modifies the successor of the previous sentinel node which was just removed from the linked list by P. But the corresponding compare-and-swap operation in E1 cannot succeed, since the previous sentinel node must have had a successor or else P could not have reached D3 in the first place.

2. The invariant additionally requires the reference to the last node always to point to a node that is part of the linked list. The linearisation points that potentially modify this reference are E2, E3 and D2. In E2 and D2, the reference to the last node is modified to point to its immediate successor. If the atomic compare-and-swap operation in E3 succeeds, the reference now points to the currently appended node. Therefore, the reference to the last node always points to one of its direct or indirect successors when modified. It remains to show that the reference to the last node does not lag too far behind. The only way this could happen is when this reference is overtaken by the reference to the first node while advancing the latter in D3. However, this linearisation point is always preceded by D2, which ensures that the references to the first node and the last node are never the same before removing the sentinel. Hence, the reference to the last node always points to one of the successors of a potentially removed sentinel.

As in the previous section, the linearisation points also make it possible to derive the progress guarantees of the algorithm. Both procedures contain a single loop which could hinder an activity from proceeding. In order for an activity to get stuck permanently in one of these loops, the compare-and-swap operations at E1 or D3 must fail repeatedly. Since both operations try to modify distinct memory locations, there must have been another activity that actually succeeded while executing the same compare-and-swap operation. As a consequence, there exists at least one activity that actually does proceed even if others starve executing the same queue operation.


During the execution of any of the two loops, each activity additionally updates the reference to the last node in case it falls behind. The corresponding linearisation points are E2 and D2. Both can help an ongoing enqueue operation to advance the reference to the appended node in E3. This technique of helping concurrent activities is very common in lock-free programming. The general idea is to satisfy preconditions oneself instead of waiting for another activity to establish them. The same technique even compensates for queue operations that never reach their last linearisation point because they are suspended and never resumed for whatever reason. The algorithm is therefore also correct when activities halt halfway in between two consecutive linearisation points. Concurrent activities can thus not obstruct each other, and the algorithm is therefore lock-free by definition. At this point, we have improved our first queue implementation and turned it into a correct lock-free algorithm. But unfortunately, we also sacrificed the desired memory-efficiency along the way. The next section will get rid of our oversimplifications and discuss the consequences of reusing nodes in order to save memory allocations wherever possible.

3.4 The ABA Problem

The next iteration of our lock-free queue algorithm reintroduces the same kind of memory-efficiency our very first approach implemented. The basic idea is to pair each item that is currently not enqueued with a node that can be reused for that purpose whenever the item is enqueued again. As discussed above, this kind of memory-efficiency can only be exploited when the same item is enqueued and dequeued several times. In our case however, lock-free queues are used in the task scheduler where enqueued items correspond to activities that are currently suspended. We assume that long-living activities are suspended and resumed frequently, especially when there are more running activities than processors. By reusing intermediate nodes, this approach saves all implicit memory allocations caused by suspending, and thereby enqueueing, an activity.


Listing 3.6 on the facing page shows the extended data structure and queue operations implementing the idea of reusing intermediate nodes. All changes with respect to the previous version are labelled accordingly. In particular, the unconditional allocation of a new node at the beginning of the ENQUEUE procedure has been replaced by reusing the node associated with an item if available. Additionally, the deallocation of the previous sentinel node at the end of the DEQUEUE procedure has been replaced by pairing that node with the returned item. The assumption here is that the user will eventually deallocate the node associated with a dequeued item as soon as the item is no longer used. The deallocation can also be performed automatically by a corresponding finaliser or destructor of the ITEM data structure in case that notion is supported by the programming language. In both procedures, the modifications of the associated node occur in regions where the item is completely owned by the caller of the operation. Except for the pairing of the items with their intermediate nodes, nothing else has been altered with respect to the algorithm shown in Listings 3.4 and 3.5. In particular, both algorithms have the same set of linearisation points. Therefore, they seem to share the same correctness and liveness guarantees. However, reusing nodes is extremely prone to the ABA problem. The ABA problem was first mentioned in the manual of the compare-and-swap operation of the IBM System/370 [IBM83] and occurs when one activity fails to recognise that a single memory location was modified temporarily by another activity and therefore erroneously assumes that the overall state has not been changed. In our case, the ABA problem is triggered when a dequeue operation does not recognise that the queue has been modified in the meantime. Since the nodes are reused whenever possible, it could very well happen that the queue ends up having exactly the same sentinel node before and after the execution of several queue operations. If a concurrent activity expects the same sentinel node while dequeueing, the corresponding atomic compare-and-swap operation at linearisation point D3 will succeed. The activity therefore falsely assumes that the contents of the queue have not been modified and effectively cancels all operations that happened in the meantime.


structure ITEM
  node : pointer to NODE                ▷ node associated with item

procedure ENQUEUE(item, queue)
  node ← item.node                      ▷ reuse node associated with item
  if node = null then
    node ← ALLOCATE()
  node.next ← null
  node.item ← item
  repeat
    last ← CAS(queue.last, null, null)
    next ← CAS(last.next, null, node)   ▷ E1
    if next ≠ null then
      CAS(queue.last, last, next)       ▷ E2
  until next = null
  CAS(queue.last, last, node)           ▷ E3

procedure DEQUEUE(queue)
  repeat
    first ← CAS(queue.first, null, null)
    next ← CAS(first.next, null, null)
    if next = null then
      return null                               ▷ D1
    CAS(queue.last, first, next)                ▷ D2
    item ← next.item                            ▷ store a local copy of the item
  until CAS(queue.first, first, next) = first   ▷ D3
  item.node ← first                             ▷ associate free node with item
  return item

Listing 3.6: Data structure and operations of a lock-free queue reusing intermediate nodes by pairing them with the actual items.


[Diagram: the linked list consists of the sentinel node A followed by node B, which references the single item 1; first points to A and last points to B.]

Figure 3.3: Example of a queue containing a single item.

The initial state of one of the simplest cases where the ABA problem occurs is depicted in Figure 3.3. It shows an example of a queue consisting of a linked list of two nodes A and B where A is the sentinel and B points to the single item of the queue. Figure 3.4 on the next page shows a sequence of linearisation points reached by two concurrent activities P and Q operating on this initial queue. The sequence reads as follows and triggers the ABA problem in the end:

1. Both activities P and Q try to dequeue the single item from the queue. Both reach D2 but do not modify the reference to the last node since it is already up to date. For the illustration of the problem, we assume that only P proceeds with D3 whereas Q does not execute it until step four below. However, Q still remembers node A as the first node and B as its successor.

2. Activity P now completely enqueues the same item again by reaching linearisation points E1 and E3. The enqueue operation reuses the node A as it was associated beforehand with the enqueued item.

3. Activity P immediately dequeues the item again by reaching P|D2 followed by P|D3 and leaves behind an empty queue consisting only of the sentinel node A. This time node B becomes associated with the dequeued item.


[Figure: four successive states of the queue, labelled by the linearisation points reached: P|D2, Q|D2, P|D3; then P|E1, P|E3; then P|D2, P|D3; and finally Q|D3. Each state shows the linked list of nodes A and B together with the first and last references.]

Figure 3.4: States of the same queue after two activities P and Q reach different linearisation points while performing several enqueue and dequeue operations, illustrating the ABA problem.


4. Only now activity Q continues and reaches D3. The corresponding compare-and-swap operation succeeds because the reference to the first node matches the expected node A. This is a false positive since the queue is currently empty and any dequeue operation should return prematurely instead. Hence, activity Q erroneously advances the reference to the first node to B although the current sentinel A has no successor. The result of this occurrence of the ABA problem is a linked list that consists of two sentinel nodes and no actual item.
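To make this failure mode concrete, the following C11 fragment sketches the same ABA window. For brevity it uses a Treiber-style stack instead of the queue of Listing 3.6, and all names are illustrative rather than part of our runtime system.

#include <stdatomic.h>
#include <stddef.h>

struct node { struct node *next; };
struct stack { _Atomic(struct node *) top; };

/* Vulnerable pop: between reading first->next and the compare-and-swap,
   other activities may pop the nodes A and B and then push A again. The
   CAS still succeeds because it only compares pointer values, and thus
   installs the stale reference to B as the new top of the stack. */
struct node *pop(struct stack *s) {
    struct node *first = atomic_load(&s->top);
    while (first != NULL) {
        struct node *next = first->next;    /* the ABA window opens here */
        if (atomic_compare_exchange_weak(&s->top, &first, next))
            return first;
    }
    return NULL;
}

The queue operations of Listing 3.6 fail in exactly the same way: the compare-and-swap at D3 only compares the value of the sentinel reference and cannot distinguish the original node A from its reused incarnation.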

The very same problem can also occur when we use the algorithm from the previous section but discard its simplistic assumption that all allocated nodes occupy different memory regions. In this case, it is only a matter of time before the memory manager reuses the memory of a deallocated node when allocating a new one. As a consequence, a newly allocated node may have the same reference value as a previously deallocated one. In this case the node is reused as well, but this time by the memory manager rather than by the algorithm itself. In the end however, the effect is the same and the ABA problem can occur just as easily.

In the sample sequence of linearisation points shown above we assumed that one activity performs several complete queue operations before the other one finishes a single one. This might seem unrealistic but is not entirely impossible, since the slower activity might have been suspended by a preemptive scheduler. Using cooperative multitasking we can lower the probability of this scenario by wrapping both procedures in uncooperative blocks which prevent voluntary task switches altogether. However, on an interrupt-driven system an interrupt may happen at any time and cause additional delay in the execution of the queue operation. In addition, activities running on distinct processors may be executed at different speeds. In practice, the ABA problem does still occur and we have been able to reproduce it at will.

In the standard literature, there exist several different approaches to cope with the ABA problem in general. All of them are more or less useful in practice but only few solve the problem conceptually. The four most common techniques are as follows:


1. One solution for most lock-free data structures is the application of the double compare-and-swap operation called DCAS. It allows two unrelated memory locations to be compared instead of one in order to reliably detect whether the overall state has changed. This technique has been used, for example, by Massalin and Pu [MP91, Mas92] and Greenwald and Cheriton [GC96, Gre99]. Unfortunately, this atomic operation is not as well supported on common modern hardware architectures as its single compare-and-swap variant. Regarding our lock-free queue implementation, this operation could help to check simultaneously whether the local copy of the sentinel as well as its immediate successor are still the same. Another possibility is to introduce an additional variable in the queue data structure which counts the number of all previous modifications in order to keep track of all temporary changes.

2. Being limited to the single compare-and-swap operation supported by almost all multiprocessor hardware architectures, several attempts have been made to mimic the behaviour of DCAS. This solution is generally referred to as pointer tagging or pointer stamping and encodes the number of modifications inside the shared reference instead of using a separate entity. This technique has been employed multiple times by Michael and Scott [MS96, Mic04a]. In practice, the memory allocated for an intermediate node is intentionally aligned on a certain word boundary such that some of the least significant bits of pointers to this data structure can be repurposed for counting modifications (a sketch of this encoding follows the list below). However, if the ABA problem happens often enough in sequence, any set of bits will eventually overflow and pointer tagging is therefore not bullet-proof.

3. Another solution is trying to ensure that intermediate nodes are never reused as long as there are other activities that still reference them. This rule can be easily enforced by relying on a runtime environment that implements an automatic memory management in the form of a garbage collector [GPST09].


Regarding our lock-free queue, any node still referenced by an activity executing a dequeue operation is not reclaimed by the garbage collector and cannot be reused for the node allocation in the enqueue operation. However, this would imply falling back on unconditionally allocating nodes as in the previous version in order to leave the task of reusing nodes to the automatic memory management instead.

4. The same strategy of ensuring that nodes are not referenced by concurrent activities was also pursued by Michael [Mic04b]. Although his approach also solves the ABA problem as above, his main concern was safe memory deallocation in runtime environments without automatic memory management. His research was motivated by the question of how to manually deallocate objects safely in lock-free algorithms when there are concurrent activities potentially accessing these objects at the same time.
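To illustrate the second technique, the following C fragment shows one possible encoding of a modification counter in the unused low-order bits of an aligned pointer. This is a hypothetical sketch assuming 16-byte node alignment; it is not taken from our implementation.

#include <stdint.h>

enum { TAG_BITS = 4, TAG_MASK = (1 << TAG_BITS) - 1 };

/* Pack a pointer and a modification counter into one word so that both
   can be updated with an ordinary single-word compare-and-swap. */
static inline uintptr_t make_tagged(void *pointer, unsigned count) {
    return (uintptr_t)pointer | (count & TAG_MASK);
}

static inline void *pointer_of(uintptr_t tagged) {
    return (void *)(tagged & ~(uintptr_t)TAG_MASK);
}

static inline unsigned count_of(uintptr_t tagged) {
    return (unsigned)(tagged & TAG_MASK);
}

Every successful update installs the same pointer with an incremented counter, so a reuse sequence A, B, A appears as A|0, B|1, A|2 and a stale compare-and-swap expecting A|0 fails. Since the counter wraps around after 2^TAG_BITS updates, the protection remains probabilistic, which is exactly why pointer tagging is not bullet-proof.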

The next section describes the final iteration of our lock-free queue implementation which solves the ABA problem by drawing from the ideas of Michael for safe memory deallocation in lock-free programs.

3.5 Safe Memory Deallocation

There exist numerous runtime environments which do not have automatic memory management facilities such as garbage collectors. In addition, automatic memory management usually incurs a runtime overhead which for example may not be practical because of real-time considerations. Its alternative of manually deallocating data structures requires special attention by programmers in order to prevent applications from leaking memory. Correct manual memory management is even more difficult if the dynamically allocated data structures are potentially shared across several concurrent activities. This especially applies to data structures used in lock-free algorithms since there is no proper synchronisation of concurrent accesses. The corresponding memory can only be deallocated safely if no other activity is accessing it any more.


The problem of safe memory deallocation has been addressed several times using techniques like epoch-based reclamation [Fra04] or reference counting [GPST09]. Michael proposed the notion of hazard pointers which identify all objects that are about to be used by concurrent activities [Mic04b]. The basic idea is to announce to contending activities that memory is potentially still in use and must therefore not be deallocated. If this is the case, the reclamation of the memory has to be deferred until there are no other activities referencing it any more. As a consequence, memory is deallocated and potentially reused if and only if it is referenced by a single activity, which nicely solves the ABA problem. If the memory is reclaimed too early on the other hand, any subsequent dereferencing of pointers to this memory region is unsafe and therefore called hazardous.

An interesting question regarding this approach is how to represent hazard pointers and make them available to all activities. Obviously, the property of whether an object is still referenced cannot be stored in that object itself since that would require an activity to access it in order to set or query that information. This implies a race condition because the corresponding memory could have already been deallocated in the meantime. Therefore, hazard pointers have to be stored and identified by their actual value. The challenge of making hazard pointers usable in lock-free algorithms is to implement their lookup efficiently and their data structures in a lock-free manner. Michael proposed a data structure called hazard pointer record which stores the values of all hazard pointers of a single activity. The set of all hazard pointers in the system is represented by a linked list of all hazard pointer records which grows for each additionally participating activity. This list is accessible to all activities and must be traversed in order to check whether a pointer is actually hazardous or not. The number of required hazard pointers per activity at any given point in time depends on the actual non-blocking algorithm but is usually not higher than two [Mic04b].

Using the notion of processor-local storage as described in Section 2.3.3 we were able to simplify and optimise the representation of hazard pointers in general. We show how this technique is used by applying it to a final iteration of our lock-free queue implementation. The result is an unbounded and memory-efficient queue that does not suffer from the ABA problem.


structure PROCESSOR
    hp1 : pointer to NODE
    hp2 : pointer to NODE
    pool1 : pointer to NODE
    pool2 : pointer to NODE

variable cpu : array N of PROCESSOR

Listing 3.7: Generic data structure for storing hazard pointers for a system with N processors.

Listing 3.7 shows a generic data structure for storing hazard pointers for a system with N processors. It consists of a global array of records which stores a pair of hazard pointers and a pair of so-called pooled nodes for each processor in the system. A pooled node is a currently free node whose reuse has been deferred because it might still be hazardous. The advantage of this representation is that the complete set of hazard pointers is stored in a single array which can be traversed efficiently for lookup. It can be extended easily for algorithms that need more than two hazard pointers and is therefore universally applicable in any lock-free data structure. Each activity can efficiently and concurrently modify its hazard pointers by using the unique index of the currently executing processing unit, called processor hereafter. The precondition is that the corresponding section of the code which maintains hazard pointers is always executed by the same processor. With the help of cooperative multitasking, we can embed the corresponding code in an uncooperative block which guarantees that there are no switches to other tasks or processors during its execution.

The management of potentially reused nodes including the lookup of hazard pointers is shown in Listing 3.8 on the facing page. The corresponding algorithm for safely recycling a node is shown in the ACQUIRE procedure. It either returns a node that is not hazardous and therefore safely reusable, or an invalid reference if no such node could be found.


procedure SWAP(reference to local, reference to shared)
    uncooperative
        node ← CAS(shared, null, null)
        if CAS(shared, node, local) = node then local ← node

procedure ACQUIRE(node)
    uncooperative
        repeat
            for all c ∈ cpu do
                if node = null then return null
                else if node = c.hp1 then SWAP(node, c.pool1)
                else if node = c.hp2 then SWAP(node, c.pool2)
        until no more swapping occurred
        return node

Listing 3.8: Wait-free node acquisition for safe reuse.

The ACQUIRE procedure takes a node as argument and checks whether it is hazardous by comparing its value against the complete set of hazard pointers. If there is a match, the node may not be reused safely since another activity has indicated that it is still potentially using it. In order to defer the reuse of the node, its reference is atomically exchanged with the pooled node of the corresponding hazard pointer using the SWAP procedure. The hazardous node becomes pooled and the algorithm continues with the previously pooled node. Because the resulting reference could still be hazardous, it has to be compared against all remaining hazard pointers. The algorithm continues checking as long as the node reference is valid and further exchanges were necessary. Hence, the returned node reference is either invalid or assuredly not hazardous.


Whenever the node is potentially hazardous, it is atomically exchanged with the pooled node that corresponds to one of the hazard pointers of a processor. Since lock-free algorithms are constrained to store distinct values in their hazard pointers, the set of pooled nodes always contains pairwise different entries per processor. Because the node initially passed as argument is different from all pool entries, there is always at least one more node available than nodes referenced by hazard pointers. Thus, no more than 2·N exchanges are required until a node is found that is not referenced by any hazard pointer. As a result, the loop always terminates, which renders the whole operation wait-free.

Listing 3.9 on the next page shows the ENQUEUE and DEQUEUE procedures of our final queue implementation. Both procedures are similar to the previous version and therefore use the same queue data structures as before. This time however, the application of processor-local storage to access the hazard pointers requires both operations to be executed in uncooperative blocks. All other changes are labelled accordingly and include the node acquisition at the beginning of the ENQUEUE procedure as well as code that sets and resets hazard pointers. None of these changes influences the correctness and liveness properties of the previous version. As before, the assumption of this algorithm is that the user or a respective finaliser reclaims the node associated with an item once the item is no longer used and is deallocated. In order to reclaim it safely, its reference can be passed to a call of the ACQUIRE procedure. The node returned by that procedure may be a different one but is guaranteed not to be used by any other activity. This allows either the node associated with the item or a previously pooled one to be deallocated safely without leaking memory.

The technique of acquiring and accessing nodes shown here is not bound to our particular lock-free queue implementation in any way. In fact, it is a very generic programming idiom and can be universally applied to all lock-free algorithms that employ intermediate nodes. The only difference may be a higher number of hazard pointers that are simultaneously required by an algorithm [Mic04b]. In this case, only the sets of hazard pointers and pooled nodes have to be enlarged in order to cover the differing setting, while the underlying node acquisition can be reused as is.


procedure ENQUEUE(item, queue)
    uncooperative
        node ← ACQUIRE(item.node)    . Safe node acquisition
        if node = null then node ← ALLOCATE()
        node.next ← null
        node.item ← item
        repeat
            last ← ACCESS1(queue.last)    . Access shared node
            next ← CAS(last.next, null, node)
            if next ≠ null then CAS(queue.last, last, next)
        until next = null
        cpu[processor].hp1 ← null    . Reset hazard pointer
        CAS(queue.last, last, node)

procedure DEQUEUE(queue)
    uncooperative
        repeat
            first ← ACCESS1(queue.first)    . Access shared nodes
            next ← ACCESS2(first.next)
            if next = null then
                cpu[processor].hp1,2 ← null    . Reset hazard pointers
                return null
            CAS(queue.last, first, next)
            item ← next.item
        until CAS(queue.first, first, next) = first
        cpu[processor].hp1,2 ← null    . Reset hazard pointers
        item.node ← first
        return item

Listing 3.9: Lock-free queue operations with safe node reuse.


procedure ACCESSindex(reference to shared)
    uncooperative
        repeat
            node ← CAS(shared, null, null)
            cpu[processor].hpindex ← node
        until CAS(shared, null, null) = node
        return node

Listing 3.10: Generic operation to safely access a shared reference by copying its value into a processor-local hazard pointer.

Listing 3.10 finally shows the generic ACCESS procedure used to update the value of a single hazard pointer. An algorithm must call this operation before it can safely dereference a shared node. The procedure first copies the value of the shared reference given as argument into the processor-local hazard pointer identified by the indices processor for the current processing unit and index for the respective instance. It then compares the copy again with the value of the shared reference, which might have changed in the meantime, and repeats the update of the hazard pointer if necessary. This check is required to ensure that contending activities have seen the new value of the hazard pointer before erroneously assuming that a node is not hazardous. The value returned by this procedure is the current value of the shared reference given as argument. The whole operation does not obstruct other activities since it only modifies processor-local data. It is therefore lock-free and safe from concurrent access.

If an activity fails to make progress in a lock-free algorithm for whatever reason, it is possible that one of its hazard pointers is never invalidated again. As a consequence, all contending activities naturally assume that the corresponding node is still in use. The hazardous node therefore gets pooled and can never be reused again, causing a new node to be allocated instead. However, the set of pooled nodes is limited by the number of hazard pointers and thus never exceeds 2·N at any given time. Therefore, the amount of surplus allocations is bounded by the same number.
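For comparison, the publish-and-recheck idiom of Listing 3.10 can be rendered with C11 atomics roughly as follows. This is a minimal sketch in which the parameter hp stands for the hazard pointer slot of the current processor; the names are illustrative only.

#include <stdatomic.h>

struct node;

struct node *access(_Atomic(struct node *) *shared,
                    _Atomic(struct node *) *hp) {
    struct node *node;
    do {
        node = atomic_load(shared);         /* copy the shared reference */
        atomic_store(hp, node);             /* announce the intended use */
    } while (atomic_load(shared) != node);  /* recheck the announcement */
    return node;                            /* now safe to dereference */
}

The loop terminates as soon as the announced value and the shared reference agree, at which point any contending activity that removes the node afterwards is guaranteed to find the announcement when scanning the hazard pointers.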


By combining lock-free programming with cooperative multitasking, we finally reached our goal of developing a practical, safe, and memory-efficient implementation of an unbounded and lock-free queue. This work forms the foundation of the cooperative scheduler described in the next chapter where queues are used extensively for lock-free multitasking.

4 Lock-Free Scheduling

This chapter describes all components of the cooperative task scheduler used in our runtime system. It first explains the fundamental lock-free building blocks which form a framework for implementing more sophisticated synchronisation primitives.

4.1 Introduction

The term multitasking describes a technique that allows several independent tasks to progress concurrently by sharing the same execution environment. This shared environment includes common processing resources like a processor or main memory. A task switch is the process of divesting the currently running task of these resources and reassigning them to a different task. Multitasking therefore does not imply parallelism per se but it can convey that impression if task switches happen frequently enough. In general, the following two different types of task switches are distinguished:

• Synchronous task switches are initiated by a currently running task whenever it has to wait for some event to occur. It voluntarily gives up the control of execution and explicitly leaves it to some other task that is ready to run. The kind of event it waits for can be the occurrence of an interrupt or a condition that has to be met by another task.

• Asynchronous task switches are initiated by external events like periodical interrupts and are in general not under the control of the currently running task. This kind of task switch is typically completely transparent to the currently running task since it is preempted at an arbitrary location during its execution.


The process of determining the next running task and the duration of its execution until the next task switch is generally called scheduling. A preemptive scheduler employs asynchronous as well as synchronous task switches in order to implement multitasking. Asynchronous task switches offer the advantage that a program does not have to be prepared for the execution in a multitasking environment. Since the preemption is completely transparent to the program, a preemptive scheduler allows arbitrary programs to be executed concurrently. This is why preemptive scheduling is a popular choice for modern operating systems running on machines that allow arbitrary programs to be loaded dynamically and interactively.

However, asynchronous task switches are generally considered expensive since they preempt programs at arbitrary locations during their execution. As a consequence, the subset of the common processing resources that has to be stored and restored upon consecutive task switches can usually not be known in advance. In order to be prepared for the worst case, the complete state of the processor, including the values of the program counter, the stack and frame pointers, and all potentially modified processor registers, has to be recorded when the processor is reassigned to a different task. Table 4.1 on the facing page for example shows the processor registers that potentially have to be restored on a task switch performed on the AMD64 architecture [Adv13]. The corresponding processor context takes up a total of 736 bytes which has to be stored once for the suspended task and restored once for the resumed task during a task switch. Most of the 42 registers cannot be accessed in bulk and require the execution of separate instructions to get and set their values individually. This is often considered computationally expensive and implies an overhead since most of the time only a small subset of the execution environment is actually in use.

In order to compensate for the runtime costs of asynchronous task switches, preemptive schedulers often try to delay them as long as possible. The corresponding time window in which tasks are allowed to execute without being preempted is usually called a quantum or a time slice. The schedulers of modern operating systems like Windows and Linux based systems commonly choose a quantum with a duration between ten and a few hundred milliseconds [RSI12, Lov10].


Application Registers                Count   Size (Bytes)
64-bit general-purpose registers        16            128
80-bit floating-point registers          8             80
256-bit vector computing registers      16            512
64-bit program counter                   1              8
64-bit program flags                     1              8
Total                                   42            736

Table 4.1: Contents and memory consumption of the processor context on the AMD64 architecture [Adv13].

Considering a fair scheduler where each task is resumed on a regular basis, the length of the quantum directly affects the overall idle time of a task as well as its observable responsiveness under high load. As a result, there is a trade-off between optimising the costs of asynchronous task switches and maintaining interactivity when there are a lot of task switches to be scheduled.

Regarding synchronous task switches on the other hand, tasks always pause themselves at predetermined locations. Since they give up the control of execution explicitly, the set of processing resources that has to be restored after the synchronous task switch is known by the task itself. Explicit task switches are most often realised by calling corresponding functions offered by the scheduler or runtime system. In this case, the processor state that has to be restored after the task switch is most often already covered by the underlying calling convention implemented by the compiler. In the simplest case, the compiler temporarily stores the required registers on the stack when calling a function and the remaining context information consists only of the program counter and the stack pointers. As a result, synchronous task switches are typically much more efficient than asynchronous task switches because their runtime cost is equivalent to ordinary function calls. There is hardly any trade-off that has to be considered and significantly smaller time slices can be used if necessary.


A further advantage of synchronous task switches is the fact that they can be implemented exclusively in software since they do not rely on external events like hardware interrupts. The implementation of synchronous task switches can therefore be written in a completely portable manner, whereas asynchronous task switches always require some sort of machine-dependent interrupt handling and processor state representation. In order to achieve a highly portable runtime system, we abandoned the idea of preemptive scheduling which requires asynchronous and therefore machine-dependent task switches. Instead, our scheduler implements only synchronous task switches which are completely portable. The result is a cooperative scheduler which relies on tasks relinquishing control voluntarily from time to time in order to cooperate with all other tasks in the system.

4.2 Foundations

In this section we present a lock-free scheduler for cooperative multitasking of lightweight activities on multiple processors. As explained above, since it is based only on synchronous task switches, it does not rely on hardware-specific features and can be implemented completely in software. Gidenstam and Papatriantafilou, authors of the LFTHREADS library, were amongst the earliest to give a proof of concept of this approach [GP07].

The basic foundation of the scheduler is a simple task descriptor called ACTIVITY shown in Listing 4.1 on the next page, which represents an activity associated with an active object. An activity is an extension of the ITEM data structure and can therefore be enqueued in and dequeued from instances of the lock-free queue presented in Section 3.5. The descriptor stores information about its stack, the associated active object, as well as specific properties like its current priority, the index of the processor it is running on and the remaining length of its quantum.

In contrast to the A2 operating system where processes can be in one of several different states [Mul02], activities in our runtime system have only two implicit states. They are either currently running and executed on a processor with the given index, or they are currently suspended, in which case their descriptor is or is about to be enqueued in a lock-free queue.


structure ACTIVITY extends ITEM
    object : object    . Associated active object
    quantum : integer    . Time quantum
    priority : integer    . Task priority
    index : integer    . Processor index
    stack : pointer to array size of byte    . Stack representation
    limit : address    . Lowest valid stack address
    frame : address    . Frame pointer
    finalizer : procedure    . Task switch finaliser
    argument : any    . Argument for finaliser
    previous : pointer to ACTIVITY    . Suspended activity

Listing 4.1: Representation of an activity associated with an active object.

This generic simplification still covers all process states of the original A2 system but at the same time allows the transition between the two remaining states to be expressed in a lock-free manner. In addition, the lack of an explicit process state allows the creation of arbitrary synchronisation primitives on top of the scheduler.

Each activity has its own stack memory represented by an array of bytes that embodies its complete call stack. Stacks correspond to ordinary memory blocks on the heap and can therefore have an arbitrary size. They are designed to grow dynamically in order to accommodate activities which require larger call stacks, see Section 5.2. The total number of activities is therefore only limited by the available memory in the system.

The stack also stores the context of an activity during a task switch. In general, this context consists of the stack represented by stack and frame pointers as well as the values of application-specific hardware registers. These registers are handled by the compiler when it has to follow a calling convention which requires them to be temporarily pushed on the stack during function calls.

Since all synchronous task switches are initiated by ordinary function calls, activities need not represent any processor state besides the stack, because all of the context which is required to be restored afterwards is already stored by the compiler. The same applies to the program counter of an activity which is also pushed on the stack because of the function call. Therefore, a synchronous task switch only has to exchange the stack and frame pointers just before returning from the corresponding function call.

The actual algorithm of task switching is shown in Listing 4.2 on the facing page. The SWITCHTO procedure first gets access to the descriptor of the currently running activity in order to store its current context. The procedure makes use of two functions called GETACTIVITY and GETFRAMEPOINTER which are provided by the compiler and allow the values of the registers storing the current activity and the stack frame pointer to be queried in a portable manner. Since the actual stack pointer and the program counter of the function caller are already pushed on the stack by the function prologue generated by the compiler, it suffices to store the address of the current stack activation frame.

In a second step, the procedure prepares the given activity for the task switch by resetting its quantum to a default value and also supplying the index of the currently executing processor. The actual task switch is performed in the last step, where the context of the new activity is restored using the procedures SETACTIVITY and SETFRAMEPOINTER provided by the compiler. The stack pointer and the program counter are finally restored by returning from the procedure, which pops the corresponding values from the stack automatically.

4.2.1 Task Switch Finalisers

By returning to a different activity, the task switch changes the implicit state of the previously running activity to suspended, while the new activity changes from suspended to currently running. However, there is a subtlety which prevents the suspended activity from being placed in a queue. The reason is that the SWITCHTO procedure is, technically speaking, executed in its entirety by the same activity.


procedure SWITCHTO(activity, finalizer, argument)
    uncooperative
        current ← GETACTIVITY()    . Store context
        current.frame ← GETFRAMEPOINTER()
        activity.quantum ← Default    . Prepare activity
        activity.index ← current.index
        activity.finalizer ← finalizer
        activity.previous ← current
        activity.argument ← argument
        SETACTIVITY(activity)    . Restore context
        SETFRAMEPOINTER(activity.frame)
        return

Listing 4.2: Basic context switching for activities.

Although the return operation pops the new value of the stack pointer from the stack frame of the resumed activity, the stack pointer used within the SWITCHTO procedure still refers to the stack of the calling activity. The actual context switch is therefore not completed until the return instruction has been executed. If the calling activity were enqueued in a lock-free queue before the task switch is completed, another processor could dequeue that activity immediately. This is a severe problem if that processor switches to the dequeued activity before the original processor has completed the first task switch. In this disastrous case, the same activity is actually running on two different processors which share the same stack pointers and therefore probably corrupt the stack. Consequently, the enqueue operation must always be executed after the SWITCHTO procedure has returned to the resumed activity. A basic solution to this problem would be the execution of the actual task switch including the enqueue operation by a third designated activity created solely for this purpose. A much more generic solution is provided by the notion of task switch finalisers which we introduced in order to execute arbitrary code on behalf of the suspended activity after a task switch has completed.


procedure FINALIZESWITCH()
    uncooperative
        current ← GETACTIVITY()
        current.finalizer(current.previous, current.argument)
        return

Listing 4.3: Finalising task switches on behalf of the suspended activity.

Task switch finalisers are basically function pointers given as an argument to the SWITCHTO procedure, which supplies them to the resumed activity. A call to this procedure must always be followed immediately by a call to the FINALIZESWITCH procedure which in turn executes the supplied task switch finaliser as shown in Listing 4.3. This convention allows an arbitrary procedure to be executed in the context of the resumed activity before it continues its own execution. The concept of executing code on behalf of other activities is indispensable for implementing synchronisation primitives in a lock-free manner. Unconditionally placing the suspended activity in a lock-free queue is just one of many applications.

4.2.2 Basic Scheduling

Besides providing support for basic task switching, the scheduler also keeps track of all activities that are currently suspended but otherwise ready to be resumed by any processor. Ready activities are placed in global lock-free queues called ready queues. The first-in first-out character of this data structure ensures that activities are scheduled in a round-robin fashion.

The scheduler currently supports a simplistic priority model and maintains a separate global ready queue for each of the four increasing priorities nominally called idle, normal, high, and real-time. The corresponding data structure and the procedure for selecting the next activity to be switched to are given in Listing 4.4 on the next page. The SELECT procedure tries to dequeue a single activity from the ready queue associated with the highest priority and continues with the next lower priority until an activity is found or the minimal priority given as argument has been reached.


variable readyQueue : array 4 of QUEUE

procedure SELECT(minimalPriority)
    uncooperative
        for all priority ∈ {Realtime, High, Normal, Idle} do
            if priority ≥ minimalPriority then
                activity ← DEQUEUE(readyQueue[priority])
                if activity ≠ null then return activity
        return null

Listing 4.4: Selecting activities from ready queues with different priorities.

variable working : integer

procedure RESUME(activity)
    uncooperative
        ENQUEUE(activity, readyQueue[activity.priority])
        if CAS(working, 0, 0) < Processors then RESUMEANYPROCESSOR()

Listing 4.5: Entering suspended activities into ready queues.

A suspended activity can be entered into a ready queue by calling the RESUME procedure shown in Listing 4.5. As soon as the activity is enqueued by this procedure, it is available for selection by the same or another processor and a task switch to this activity can be performed. The RESUME procedure additionally keeps track of an atomic counter called working which indicates the number of processors that are currently not idle. If there is an idle processor, it is resumed by a call to the RESUMEANYPROCESSOR procedure which is described in more detail in Section 4.4.


procedure SWITCH()
    uncooperative
        current ← GETACTIVITY()
        activity ← SELECT(current.priority)
        if activity ≠ null then
            SWITCHTO(activity, ENQUEUESWITCH, current.priority)
            FINALIZESWITCH()
        else
            current.quantum ← Default

procedure ENQUEUESWITCH(previous, priority)
    uncooperative
        ENQUEUE(previous, readyQueue[priority])
        if priority ≠ Idle then
            if CAS(working, 0, 0) < Processors then RESUMEANYPROCESSOR()

Listing 4.6: Performing cooperative task switches.

4.2.3 Cooperative Task Switches

Cooperative multitasking relies on each activity to relinquish the control of execution voluntarily to another activity on a regular basis. In order for activities to cooperate with each other, the scheduler provides a global procedure called SWITCH which is shown in Listing 4.6. It selects a suspended activity that has to be resumed and suspends the calling activity by enqueueing it into the corresponding ready queue. The actual enqueue operation is performed by a task switch finaliser called ENQUEUESWITCH which is executed by the resumed activity. This ensures that the suspended activity is not available for resuming by another processor until the task switch is actually completed. If there is no activity with the same or higher priority available, no task switch is necessary and the activity can continue with a default quantum.


[Figure: control flow of the activities previous and next. previous calls SwitchTo(next, EnqueueSwitch, ...); next executes FinalizeSwitch(), which runs EnqueueSwitch(previous, ...), and later calls SwitchTo(previous, ...); previous then resumes and executes FinalizeSwitch().]

Figure 4.1: Sample control flow of a single processor executing two consecutive task switches between two activities.

A concrete example of the control flow on a single processor executing two consecutive task switches is depicted in Figure 4.1. This sample scenario involves two activities called previous and next which pass the control of execution to each other. After its time quantum has expired, activity previous calls the SWITCH procedure which in turn executes the SWITCHTO procedure in order to initiate a task switch to next and passes ENQUEUESWITCH as task switch finaliser. Before activity next resumes its previously suspended execution, it first calls the FINALIZESWITCH procedure which indirectly calls the task switch finaliser of previous. After a while, next eventually relinquishes the control of execution and passes it back to previous by calling the SWITCHTO procedure correspondingly. By returning from its previous call of the SWITCHTO procedure, activity previous continues its execution where it left off but first finalises the task switch of next.


Each call to the SWITCH procedure in a program represents an additional entry point for suspending and resuming its execution. Any routine calling this procedure is therefore automatically turned into a coroutine according to Conway and Knuth [Con63]. However, if a program fails to call this procedure during its execution for whatever reason, there will be no task switches on the corresponding processor. This uncooperative behaviour can have a severe impact on the reliability, responsiveness, and correctness of the whole runtime system. The next section shows how to cope with this problem and ensure that programs are always cooperative.

4.2.4 Implicit Cooperative Multitasking

Usually, systems with cooperative multitasking rely heavily on the programmer to insert cooperative task switches at appropriate locations in the code. It is hard to prove that arbitrary programs are indeed cooperative in this respect even if their source code is available for inspection. This might have been one additional reason why preemptive multitasking is almost ubiquitously applied in modern multiprocessor operating systems, since preemption does not depend on this kind of correctness.

Another approach to guarantee the cooperativeness of arbitrary programs is to let the compiler automatically insert task switches into the translated machine code where necessary. We call this implicit cooperative multitasking since a program compiled with this development tool does not require the programmer to insert any explicit task switches in order to be correct with respect to cooperativeness. As a result, source code compiled with enabled implicit cooperative multitasking looks exactly the same as code written with a preemptive scheduler in mind. It is therefore irrelevant for programmers which kind of scheduler their program is targeted at, as long as it is recompiled accordingly.

The actual code inserted by the compiler checks whether the quantum of the current activity has expired and calls the SWITCH procedure if necessary. For this purpose, the compiler knows the layout of the ACTIVITY data structure and reserves a dedicated general-purpose register that stores the descriptor of the currently running activity.


sub [rcx + 88], 10    ; decrement quantum by 10
jge skip              ; check if it is negative
call Switch           ; perform task switch
skip:

Listing 4.7: Inserted instruction sequence for implicit task switches on the AMD64 architecture.

In order to stay portable for the implementation of the runtime system and keep the check as small as possible, the compiler does not measure the time between two consecutive checks but rather the number of generated instructions. The actual duration of hardware operations usually varies amongst different instructions and is obviously machine-dependent, but counting the number of instructions has the advantage that the result is always constant and statically known while translating the code. The quantum is therefore not related to actual execution time but rather stores the number of instructions an activity is allowed to execute until the next cooperative task switch must occur. However, task switches are always synchronous, which is why the quantum can be chosen to be rather small and does not need to be time-specific.

Listing 4.7 shows an example of an implicitly generated instruction sequence of a task switch check after ten instructions targeting the AMD64 hardware architecture [Adv13]. Here, the dedicated general-purpose register pointing to the descriptor of the current activity is called rcx while its quantum has an offset of 88 bytes into the ACTIVITY structure. The check consists of only three instructions requiring twelve bytes in total. The compiler inserts this sequence whenever a program executes a loop statement or calls a recursive procedure. From a low-level perspective, the compiler inserts the checks for loops whenever it emits backward branches in the code. This generic approach covers all possible looping control-flow constructs of high-level programming languages. Checks are also inserted after ordinary statements when the number of instructions that have been generated since the last check exceeds a predefined limit.
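In C-like terms, the inserted check behaves roughly like the following fragment. This rendering is hypothetical and only illustrates the semantics of Listing 4.7; the actual check is the three-instruction sequence shown above, operating directly on the dedicated activity register.

struct activity { int quantum; /* remaining instruction budget */ };

extern void Switch(void);   /* cooperative task switch, see Listing 4.6 */

static inline void check(struct activity *current) {
    current->quantum -= 10;     /* charge the preceding ten instructions */
    if (current->quantum < 0)
        Switch();               /* budget exhausted: switch cooperatively */
}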


This basic policy guarantees that the activity always behaves cooperatively even if it executes infinite loops or otherwise does not make observable progress. A similar technique of code instrumentation for software-based multitasking has also been proposed by Bläser [Blä07]. His dedicated general-purpose register directly represents the quantum of the currently executing activity and its value is decremented by periodical timer interrupts. This approach allows the quantum to be expressed in time-specific units but is obviously not as portable since it requires the runtime system to handle timer interrupts accordingly.

As specified in Section 2.3.1, uncooperative blocks prevent the compiler from inserting any checks inside a statement block. All procedures shown so far in this chapter are directly or indirectly called by an implicit task switch generated by the compiler. This is why all of them have been marked as uncooperative in order to prevent any potential infinite recursion triggered by an implicit task switch.

4.3 Synchronisation Primitives

The previous section established a solid foundation of basic scheduler operations. This section shows how this generic framework can be used to build sophisticated synchronisation primitives in a lock-free manner. It exemplifies three popular constructs which build upon each other, including a fully-fledged monitor for object-oriented programming languages such as Active Oberon.

4.3.1 Events

A simple primitive for synchronising activities is an event, also known as an event semaphore, which allows arbitrary activities to wait for an event to occur [HS08]. A simplistic implementation of an event is given in Listing 4.8 on the facing page. The corresponding data structure consists simply of a lock-free queue which stores all activities waiting for the event to be signalled.


structure EVENT
    waiting : QUEUE

procedure WAIT(event)
    uncooperative
        repeat
            activity ← SELECT(Idle)
        until activity ≠ null
        SWITCHTO(activity, ENQUEUEWAIT, event)
        FINALIZESWITCH()

procedure ENQUEUEWAIT(previous, event)
    uncooperative
        ENQUEUE(previous, event.waiting)

procedure SIGNAL(event)
    uncooperative
        repeat
            activity ← DEQUEUE(event.waiting)
            if activity ≠ null then RESUME(activity)
        until activity = null

Listing 4.8: Data structure and operations of a simplistic event synchronisation primitive.


The waiting queue is filled with activities calling the WAIT procedure. This procedure first repeatedly tries to select any activity that is ready to run and then performs a task switch to that activity. The corresponding task switch finaliser called ENQUEUEWAIT is responsible for placing the suspended activity into the waiting queue of the event. The selection of a ready activity can fail if all other processors are concurrently performing a cooperative task switch but their task switch finalisers have not yet placed the suspended activities into the ready queue. The chance that all ready queues are empty is low, since all processors are making progress towards the completion of their task switch finalisers. The failed selection is therefore harmless and can just be repeated, yielding a ready activity with high probability. This kind of race condition however is a direct consequence of relying only on non-blocking scheduler operations.

The SIGNAL procedure finally dequeues all activities from the waiting queue of an event and resumes all of them. The simplistic assumption of this naive implementation is that no activity begins to wait while another activity executes this procedure. Otherwise, a resumed activity that immediately waits again on the same event could be resumed twice by the same signal operation. The next synchronisation primitive shows how to deal with this kind of problem.

4.3.2 Mutexes

A popular synchronisation primitive known as mutex or binary semaphore facilitates mutual exclusion [HS08]. A mutex can be acquired by a single activity, in which case it is said to be owned by that activity. At the end of a critical section, the owner releases the mutex allowing it to be acquired again. A mutex is similar to an event, as it also must wake up activities that are blocked because of a failed acquisition and waiting for the mutex to be released. Listing 4.9 on the next page shows the data structure and the acquire operation of a sample mutex implementation. A mutex consists of a queue of blocked activities and a reference to the current owner. The following actions are performed during the execution of the acquire operation:


structure MUTEX
    blocked : QUEUE
    owner : pointer to ACTIVITY

procedure ACQUIRE(mutex)
    uncooperative
        current ← GETACTIVITY()
        while CAS(mutex.owner, null, current) ≠ null do    . (1)
            activity ← SELECT(Idle)    . (2)
            if activity ≠ null then
                SWITCHTO(activity, ENQUEUEACQUIRE, mutex)
                FINALIZESWITCH()

procedure ENQUEUEACQUIRE(previous, mutex)
    uncooperative
        ENQUEUE(previous, mutex.blocked)    . (3)
        if CAS(mutex.owner, null, null) = null then    . (4)
            activity ← DEQUEUE(mutex.blocked)
            if activity ≠ null then RESUME(activity)

Listing 4.9: Data structure and acquire operation of a mutex synchronisation primitive.

1. The ACQUIRE procedure first tries to set the ownership of the mutex to the currently running activity using an atomic compare-and-swap operation. If the result is an invalid reference, the owner was set successfully and the mutex is now acquired. Otherwise, the mutex already has an owner and the current activity must be blocked.

2. The procedure selects any ready activity and performs a task switch using the ENQUEUEACQUIRE procedure as task switch finaliser. If there is no ready activity, the whole operation is repeated.


3. The ENQUEUEACQUIRE procedure is executed by the selected activity and places the suspended activity into the blocked queue of the mutex. Since there is no proper synchronisation, a concurrent release of the mutex could occur before, during, or after the enqueue operation. The ownership must therefore be rechecked in order to prevent the enqueued activity from remaining suspended forever. This check must occur after the enqueue operation because the check as well as the enqueue operation can be seen as linearisation points, and one of them has to precede the other.

4. If the mutex is still acquired after the enqueue operation, its owner or a follower will eventually release the mutex and dequeue the suspended activity. The same applies if the mutex was already released and immediately acquired again in the meantime. If the mutex is not acquired, a release operation must have been executed concurrently. If the release operation has already finished, the enqueue operation must be undone in order to prevent a lost wake-up call. Since queue data structures do not provide a corresponding operation, a dequeue operation must be performed instead. This yields either the suspended or any other blocked activity. In either case, resuming the activity probably allows it to acquire the mutex, and a subsequent release operation will finally resume any blocked activity. Resuming too many activities is harmless because only one of them can acquire the mutex and all others will be blocked again. If the blocked queue was empty on the other hand, the suspended activity has already been dequeued and resumed.

The task switch finaliser used in the acquire operation is the first one that does not just place the suspended activity into some queue. It marks the first example where it is highly critical to recheck the condition which caused the previous activity to get suspended in the first place. This is necessary because there is no proper synchronisation between the evaluation of a condition and the corresponding enqueue operation.


procedure RELEASE(mutex)
    uncooperative
        current ← GETACTIVITY()
        CAS(mutex.owner, current, null)    . (1)
        activity ← DEQUEUE(mutex.blocked)    . (2)
        if activity ≠ null then RESUME(activity)

Listing 4.10: Release operation of the mutex synchronisation primitive.

In general, conditions for suspending an activity must always be evaluated again after the enqueue operation in order to cancel it if necessary. This interleaving of condition checking is a lock-free programming idiom that is used throughout the implementation of all synchronisation primitives. The remaining release operation is shown in Listing 4.10. It is much simpler and performs the following steps:

1. The ownership of the mutex is invalidated using an atomic compare-and-swap operation which can only succeed if the RELEASE procedure was actually called by the current owner of the mutex.

2. One single blocked activity is resumed if there is one in the corresponding queue. This step cannot precede the previous one because the resumed activity could suspend itself immediately again and as a result remain suspended forever.

From this discussion we informally conclude that this lock-free implementation of a mutex is correct and does not suffer from the problems of the event synchronisation primitive shown before. However, this particular mutex implementation does not allow the mutex to be acquired recursively by the same owner. Moreover, releasing the mutex is not entirely fair because resumed activities are only enqueued into the ready queue whereas activities acquiring the mutex for the first time receive preferential treatment. The synchronisation primitive discussed in the next section solves this problem by atomically transferring the ownership to the resumed activity.


4.3.3 Monitors

The final synchronisation primitive presented in this section is an example of a monitor for object-oriented programming languages. In the case of Active Oberon, a monitor is entered by executing a statement block marked as exclusive. This ensures mutual exclusion and allows several activities to synchronise with each other by awaiting conditions to be met. Monitors therefore combine the concepts of mutexes and events.

Listing 4.11 on the facing page shows the data structure and the enter operation of our monitor implementation. Since we wish to allow nested acquisitions of a monitor, the data structure not only keeps track of the current owner, but also of a counter called level which indicates the number of acquisitions by that owner. In addition, there are two distinct queues which contain activities that are either blocked because they failed to acquire the monitor or waiting for a condition to be satisfied.

The ENTER procedure is similar to the ACQUIRE procedure of the mutex and requires two parameters. The first one is the monitor to be acquired and the second one is the number of nested acquisitions to be performed by a single call of this procedure. The procedure performs the following actions:

1. Like the mutex acquisition in the previous example, the ownership of the monitor is set to the currently executing activity using a compare-and-swap operation. If this operation succeeds or the monitor is already owned by the calling activity, the procedure just increments the counter for nested acquisitions and returns.

2. As before, the currently running activity has to be suspended and enqueued into the blocked queue of the monitor if the acquisition fails. For this reason, the procedure selects any ready activity and performs a task switch using the ENQUEUEENTER procedure as task switch finaliser. If there is no activity ready to run, the procedure tries to acquire the monitor again without being suspended.

3. The ENQUEUEENTER procedure places the suspended activity into the blocked queue since that activity failed to acquire the monitor.


structure MONITOR
    level : integer
    blocked, waiting : QUEUE
    owner, sentinel : pointer to ACTIVITY

procedure ENTER(monitor, level)
    uncooperative
        current ← GETACTIVITY()
        loop
            previous ← CAS(monitor.owner, null, current)    . (1)
            if previous = null ∨ previous = current then
                monitor.level ← monitor.level + level
                return
            activity ← SELECT(Idle)    . (2)
            if activity ≠ null then
                SWITCHTO(activity, ENQUEUEENTER, monitor)
                FINALIZESWITCH()

procedure ENQUEUEENTER(previous, monitor)
    uncooperative
        ENQUEUE(previous, monitor.blocked)    . (3)
        if CAS(monitor.owner, null, null) = null then    . (4)
            activity ← DEQUEUE(monitor.blocked)
            if activity ≠ null then RESUME(activity)

Listing 4.11: Data structure and enter operation of a monitor synchronisation primitive.


4. As discussed above, the monitor could have been released at any point in time between selecting an activity, switching to it, and enqueueing the suspended activity in the task switch finaliser. In this case, the blocked activity should not remain in the queue, since it might otherwise never be resumed again.

Since the queue data structure does not support undoing the previous enqueue operation, the procedure performs a single dequeue operation instead. The resulting activity was the oldest one in the blocked queue and gets resumed. As before, this activity will in turn either successfully acquire the monitor or suspend itself again. In either case, the initially suspended activity will eventually be resumed.

The reverse operation of releasing the monitor is called EXIT and is shown in Listing 4.12 on the next page. It also takes two parameters indicating the monitor and the number of nested releases. The procedure performs the following actions:

1. The number of nested acquisitions is decremented. If this number does not reach zero, the monitor is still acquired and the procedure returns.

2. A single activity to be resumed is dequeued either from the waiting queue, or from the blocked queue if there was no waiting activity. This order ensures that activities waiting for a condition to be satisfied take priority over activities that are blocked while trying to acquire the monitor. This design, known as the eggshell model, guarantees that any waiting activity is resumed as soon as another activity has satisfied its condition and released the monitor [ISO95]. This signalling regime has been proven by Dahl to be superior in terms of sequencing control and efficiency in comparison to its alternatives provided by modern programming languages like Java and C# [Dah99].

3. The ownership of the monitor is transferred to the dequeued activity using an atomic compare-and-swap operation. If there is no activity to be resumed, the ownership is released instead.


procedure EXIT(monitor, level)
uncooperative
    monitor.level ← monitor.level − level    ▷ (1)
    if monitor.level > 0 then return
    current ← GETACTIVITY()
    activity ← DEQUEUE(monitor.waiting)    ▷ (2)
    if activity = null then
        activity ← DEQUEUE(monitor.blocked)
    CAS(monitor.owner, current, activity)    ▷ (3)
    if activity = null then    ▷ (4)
        activity ← DEQUEUE(monitor.blocked)
    if activity ≠ null then
        RESUME(activity)

Listing 4.12: Exit operation of the monitor synchronisation primitive.

4. If the ownership was transferred in the previous step, the new owner is resumed. However, if the ownership was released instead of transferred, any activity that suspended itself beforehand will eventually be placed in the blocked queue of the monitor. If the corresponding task switch finaliser is finished before the ownership is released, the blocked activity could remain suspended forever and must therefore be dequeued and resumed.

Finally, monitors also allow activities that have acquired the monitor to wait for specific conditions to be met by contending activities. While waiting for a condition, the ownership is transferred to any other waiting or blocked activity in order to give it the chance to satisfy the condition. Since there might be several activities waiting for differing conditions, all of them have to be checked in succession. In order to prevent an endless loop, the monitor data structure additionally contains a sentinel which indicates the first waiting activity in this sequence.


The corresponding AWAIT procedure is shown in Listing 4.13 on the facing page. This procedure is called by the Active Oberon compiler for each occurrence of an await statement. The compiler generates code that repeatedly checks the await condition and calls this procedure if it is not satisfied. The procedure performs the following actions:

1. Because the ownership of the monitor is released while waiting, the value of the corresponding nesting level is cached and set to zero.

2. If there is another waiting activity, the ownership is transferred to that activity by using the TRANSFERAWAIT task switch finaliser shown in Listing 4.14 on page 92. In order to prevent two or more waiting activities from transferring the ownership to each other in an infinite loop, the first activity waiting for a condition also sets the sentinel to refer to itself.

3. If the next waiting activity equals the sentinel, all waiting activities have checked their conditions but none could progress. The monitor must therefore be released in order to give blocked or newly arriving activities a chance to satisfy any of the unmet await conditions. The sentinel is reset and enqueued again in the waiting queue so that all waiting activities will recheck their conditions afterwards.

4. Since there is no waiting activity that can proceed at this point, any blocked activity is resumed instead. If there is a blocked activity, the ownership is atomically transferred to it by performing a task switch using the TRANSFERAWAIT task switch finaliser shown in Listing 4.14. This procedure enqueues the suspended activity to the waiting queue of the monitor and atomically swaps the ownership.

5. If there was no blocked activity, any other activity ready to run is selected and is given control of execution. The corresponding task switch finaliser called RELEASEAWAIT shown in Listing 4.14 enqueues the suspended activity into the waiting queue and invalidates the ownership. As before, any blocked activity must be resumed afterwards in order to prevent lost wake-up calls.


procedure AWAIT(monitor)
uncooperative
    level ← monitor.level    ▷ (1)
    monitor.level ← 0
    current ← GETACTIVITY()
    activity ← DEQUEUE(monitor.waiting)    ▷ (2)
    if activity ≠ null ∧ activity ≠ monitor.sentinel then
        if monitor.sentinel = null then
            monitor.sentinel ← current
        SWITCHTO(activity, TRANSFERAWAIT, monitor)
        FINALIZESWITCH()
    else
        monitor.sentinel ← null    ▷ (3)
        if activity ≠ null then
            ENQUEUE(activity, monitor.waiting)
        activity ← DEQUEUE(monitor.blocked)    ▷ (4)
        if activity ≠ null then
            SWITCHTO(activity, TRANSFERAWAIT, monitor)
            FINALIZESWITCH()
        else
            activity ← SELECT(Idle)    ▷ (5)
            if activity ≠ null then
                SWITCHTO(activity, RELEASEAWAIT, monitor)
                FINALIZESWITCH()
            else
                CAS(monitor.owner, current, null)    ▷ (6)
                SWITCH()
    ENTER(monitor, level)    ▷ (7)

Listing 4.13: Await operation of the monitor synchronisation primitive.


procedure TRANSFERAWAIT(previous, monitor)
uncooperative
    current ← GETACTIVITY()
    ENQUEUE(previous, monitor.waiting)
    CAS(monitor.owner, previous, current)

procedure RELEASEAWAIT(previous, monitor)
uncooperative
    ENQUEUE(previous, monitor.waiting)
    CAS(monitor.owner, previous, null)
    activity ← DEQUEUE(monitor.blocked)
    if activity ≠ null then
        RESUME(activity)

Listing 4.14: Task switch finalisers of the monitor wait operation.

6. If there is no other activity ready to run, the ownership is released and an ordinary cooperative task switch is performed. This can only happen when all other processors perform a task switch at the same time, in which case the ready queue is about to be refilled immediately.

7. As soon as a waiting activity is resumed, it tries to acquire the monitor using an ordinary call to the ENTER procedure providing the nesting level stored beforehand. If the ownership was transferred to this activity by the TRANSFERAWAIT task switch finaliser, the corresponding acquisition will always succeed. This ensures that the current activity always holds the ownership when returning to the compiler-generated code that rechecks the previously unsatisfied await condition.

Our monitor implementation improves upon the existing implementation in A2 in several ways. In A2, await conditions are always checked by the current owner of the monitor instead of the waiting activity [Mul02]. Our implementation in contrast performs task switches to waiting activities and therefore always evaluates await conditions in the correct context.


Furthermore, the original Active Oberon report does not allow activities to enter an exclusive region more than once [Rea04]. This restriction always seemed somewhat artificial and misled many developers into duplicating procedures, once with an exclusive region and once without. Our implementation lifts this restriction and allows reentrant exclusive blocks, which also simplifies the corresponding implementation when blocked and waiting activities transfer the ownership to each other. A proposal for the relaxation of the corresponding language rules concerning reentrancy in Active Oberon is described in Section C.14 on page 216.
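To illustrate the fast path of such a reentrant acquisition, the following sketch combines the compare-and-swap on the owner field with the nesting counter using C11 atomics. The names monitor_t and try_enter are illustrative; this is a sketch of the technique, not the actual runtime code, and it omits the blocking path handled by the task switch finalisers.

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef struct activity activity_t;   /* opaque activity descriptor */

    typedef struct {
        _Atomic(activity_t *) owner;      /* null while the monitor is free */
        int level;                        /* nesting depth, written only by the owner */
    } monitor_t;

    /* Fast path of ENTER: returns true if 'current' now owns the monitor. */
    bool try_enter(monitor_t *m, activity_t *current) {
        activity_t *expected = NULL;
        /* Claim a free monitor, or detect that we already own it. */
        if (atomic_compare_exchange_strong(&m->owner, &expected, current)
                || expected == current) {
            m->level += 1;                /* reentrant acquisition */
            return true;
        }
        return false;                     /* contended: caller must block or retry */
    }

Because the nesting level is only ever written by the activity that currently holds the monitor, no atomic operation is needed for the counter itself.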

4.4 Multiprocessor Support

The scheduler and synchronisation primitives described so far are completely portable and do not require any special architectural support. Regarding the implementation of multiprocessors however, different shared-memory architectures may impose distinct requirements on the runtime system. This concerns issues like properly setting up cache coherency protocols in order to comply with our memory model, or enabling multiprocessing using physical processors, multiple cores, or hardware multithreading. In order to assemble all inherently non-portable hardware dependencies regarding multiprocessing in a single place, we tried to minimise the number of functions that need to be implemented anew for each targeted architecture. This set of functions therefore acts as a hardware abstraction layer for the multiprocessor support of the runtime system. Its main purpose is to start and stop the scheduler on each of the logical processors of the machine; it consists of the following three parameterless procedures:

• STARTALLPROCESSORS This procedure is called once when the runtime system is initialised and starts the scheduler on each logical processor of the system. Its main purpose is to initialise processors uniformly and to count the total number of processors which is required for all lock-free algorithms that make use of processor-local storage.


Within the kernel for the AMD64 architecture for example, this procedure uses the advanced programmable interrupt controller (APIC) in order to send an interprocessor interrupt to all other logical processors in the system. This includes physical processors as well as all of their cores and hardware threads. Beforehand, it sets up simplified bootloaders which are executed by each logical processor when they receive the initial interprocessor interrupt. The task of the bootloader is to jump from real mode (16-bit) over protected mode (32-bit) to long mode (64-bit) and initialise the processor accordingly before starting the scheduler in the end. In the case where the runtime system is running as an application library on top of Windows or Linux based operating systems, this procedure starts ordinary threads instead. The basic idea is to emulate all logical processors of the underlying machine by starting one thread per processing unit. By assigning distinct thread affinities, we can ensure that each thread is executed only on its corresponding processor.

• SUSPENDCURRENTPROCESSOR This procedure is used to temporarily disable logical processors that have no work to do. The runtime system maintains a set of idle activities, one per logical processor, which all call this procedure when there is no other activity ready to run. In the native case, this procedure basically executes a hardware-dependent sequence of instructions which puts the processor in a low power state. In this state, the processor halts its execution and waits for the next interrupt to occur. Regarding the case where processors are represented by ordinary threads of the underlying operating system, this procedure makes use of the synchronisation primitives provided by the system. In Windows for example, a global semaphore counting the number of non-idle activities is used for this purpose.


• RESUMEANYPROCESSOR The task of this procedure is to wake up any logical processor that is currently suspended because of a call to the SUSPENDCURRENTPROCESSOR procedure. The scheduler calls this procedure whenever a new activity enters the system or a currently suspended one gets resumed, see for example Listing 4.5 on page 75. In the native kernel, this procedure sends an interprocessor interrupt to a suspended processor which subsequently resumes its execution. Where idling processors are implemented using a global semaphore provided by the underlying operating system, this procedure signals the semaphore instead. A sketch of this thread-based emulation of all three procedures follows below.
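The hosted variant of this hardware abstraction layer can be illustrated with POSIX threads. The following sketch is an approximation of the described behaviour under stated assumptions: it emulates each logical processor with one pinned thread and uses a counting semaphore for suspension and wake-up; run_scheduler and all other names are placeholders, not the actual kernel interface.

    #define _GNU_SOURCE                   /* for pthread_setaffinity_np */
    #include <pthread.h>
    #include <sched.h>
    #include <semaphore.h>
    #include <unistd.h>

    extern void run_scheduler(void);      /* assumed scheduler entry point */

    static sem_t work_available;          /* counts pending wake-up requests */

    static void *processor(void *argument) {
        long index = (long)argument;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET((int)index, &set);        /* pin this thread to one core */
        pthread_setaffinity_np(pthread_self(), sizeof set, &set);
        run_scheduler();
        return NULL;
    }

    void start_all_processors(void) {
        long count = sysconf(_SC_NPROCESSORS_ONLN);
        sem_init(&work_available, 0, 0);
        for (long index = 0; index < count; index++) {
            pthread_t thread;
            pthread_create(&thread, NULL, processor, (void *)index);
        }
    }

    void suspend_current_processor(void) { sem_wait(&work_available); }

    void resume_any_processor(void) { sem_post(&work_available); }

The semaphore mirrors the Windows example mentioned above: an idle processor blocks in sem_wait, and every wake-up request posts the semaphore exactly once.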

4.5 Interrupt Handling

The goal of the runtime system was to be independent of the hardware architecture, and it is therefore designed not to require any support for hardware interrupts for its own implementation. Where the runtime system is used as a kernel for an operating system however, it still provides the necessary means to write drivers for interrupt-driven devices.

In this section we describe a lock-free and portable interrupt handling model which allows device drivers to be viewed as cyclic processes that simply wait for the next interrupt to occur. This model is similar to the alternative one proposed by Muller for the Active Object System [Mul02], but uses a library approach instead of special statements built into the programming language. The reason for this design is the observation that the actual number of an interrupt request is highly platform specific and should not be bundled with a high-level language specification.

The interrupt handling mechanism is based on the idea that there are no designated processes responsible for handling interrupts. Any activity in the system should ideally be able to await the occurrence of an interrupt. Interrupts are therefore modelled like any other synchronisation primitive and can thus use the same framework provided by the scheduler.


Listing 4.15 on the facing page shows the definition and initialising operations of the data structure called INTERRUPT that can be used by implementers of device drivers to wait for an interrupt. It consists of the unique number of the actual interrupt request and a local time stamp value which indicates when the corresponding interrupt was handled for the last time. The number of the interrupt request can be used as index into the global arrays shown after the data structure definition. These arrays store a queue of activities waiting for the corresponding interrupt as well as the global time stamp value which is incremented whenever an interrupt occurs.

Before users can wait for an interrupt, the corresponding data structure must be initialised and the actual interrupt request has to be enabled. The corresponding operation to install an interrupt is called INSTALL. This procedure copies the interrupt index and the current value of the corresponding time stamp into the user-provided data structure. The time stamp must be queried using an atomic compare-and-swap operation in order to read consistent data in case the time stamp is modified concurrently. Afterwards, a low-level interrupt handler routine is installed using a machine-dependent procedure called INSTALLINTERRUPT. This procedure is the only operation of the interrupt handling model that is not portable as it is highly hardware specific. Its main purpose is to drive the interrupt controller or equivalent devices in order to unmask the corresponding interrupt. In addition, it modifies the interrupt vector table such that the provided interrupt handler gets called whenever the interrupt occurs. It also ensures that the context of the interrupted activity is restored properly after the interrupt handler returns.

The actual interrupt handler is called HANDLEINTERRUPT and is shown at the bottom of Listing 4.15. It increments the global time stamp associated with the interrupt and resumes all waiting activities. Any overflow while incrementing the time stamp can be ignored as the time stamps are only compared for equality. Before calling the uncooperative operations of the queue and the scheduler however, the interrupt handler temporarily swaps the currently executing activity. This is necessary in order to comply with the semantics of uncooperative blocks which require that only one activity per processor is executing code inside an uncooperative block.
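The listings use the pattern CAS(x, 0, 0) to read a shared value atomically: a compare-and-swap whose old and new values are equal never changes the location but still returns its current contents in one atomic step. A minimal C sketch of the idiom, using the GCC builtin __sync_val_compare_and_swap:

    #include <stdint.h>

    /* Atomically read a shared 64-bit time stamp, even on targets where a
     * plain load of this width would tear. If the value is 0 it is replaced
     * by 0, so the location is never visibly modified. */
    static uint64_t atomic_read(volatile uint64_t *timestamp) {
        return __sync_val_compare_and_swap(timestamp, 0, 0);
    }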


structure INTERRUPT
    index : integer
    timestamp : integer

variable
    timestamps : array Interrupts of integer
    awaitingQueue : array Interrupts of QUEUE
    virtualProcessor : array Interrupts of ACTIVITY

procedure INSTALL(interrupt, index)
    interrupt.index ← index
    interrupt.timestamp ← CAS(timestamps[index], 0, 0)
    INSTALLINTERRUPT(HANDLEINTERRUPT, index)

procedure HANDLEINTERRUPT(index)
uncooperative
    current ← GETACTIVITY()
    SETACTIVITY(virtualProcessor[index])
    timestamps[index] ← timestamps[index] + 1
    activity ← DEQUEUE(awaitingQueue[index])
    while activity ≠ null do
        RESUME(activity)
        activity ← DEQUEUE(awaitingQueue[index])
    SETACTIVITY(current)

Listing 4.15: Data structures for interrupt handling.


Since interrupts are never explicitly disabled, they can basically occur at any time and could therefore also happen during the execution of an uncooperative block. If the corresponding interrupt handler in turn executes the very same uncooperative block, there are theoretically speaking two activities executing the same block. This has disastrous consequences for lock-free operations using processor-local storage which highly depend on uncooperative blocks being executed entirely without being interrupted by another activity.

Our solution is to change the currently executing activity temporarily to a special-purpose activity associated with the interrupt. We call the corresponding activity a virtual processor because it stores a unique processor index which differs from all physical processors in the system. Using this trick, lock-free operations cannot distinguish a processor executing an interrupt handler from any other processor as explained in Section 2.3.1. Although this design allows nested interrupts in general, it implies that a handler for one particular interrupt may not interrupt itself. However, this behaviour is hardly required in practice and can be prevented completely when hardware interrupts have to be acknowledged first to the interrupt controller. Thanks to this convention, comparing time stamps is sufficient in order to ensure that interrupts are never missed.

Once an interrupt data structure is initialised and the corresponding hardware interrupt is enabled, device drivers can wait for an interrupt using the AWAITINTERRUPT procedure shown in Listing 4.16 on the next page. It first compares the local time stamp of the interrupt data structure against the value of the global time stamp. If the two values do not match, there was an interrupt and the procedure returns immediately after updating the local time stamp value. Otherwise it selects any activity ready to run and performs a task switch to that activity.

The corresponding task switch finaliser called ENQUEUEINTERRUPT places the suspended activity into the waiting queue associated with the corresponding interrupt. As always, the value of the time stamp must be rechecked in order to prevent activities from remaining suspended forever because the interrupt could have already occurred during the task switch or the enqueue operation.


procedure AWAITINTERRUPT(interrupt)
uncooperative
    index ← interrupt.index
    timestamp ← CAS(timestamps[index], 0, 0)
    while interrupt.timestamp = timestamp do
        repeat
            activity ← SELECT(Idle)
        until activity ≠ null
        SWITCHTO(activity, ENQUEUEINTERRUPT, interrupt)
        FINALIZESWITCH()
        timestamp ← CAS(timestamps[index], 0, 0)
    interrupt.timestamp ← timestamp

procedure ENQUEUEINTERRUPT(previous, interrupt)
uncooperative
    index ← interrupt.index
    timestamp ← interrupt.timestamp
    ENQUEUE(previous, awaitingQueue[index])
    if CAS(timestamps[index], 0, 0) ≠ timestamp then
        activity ← DEQUEUE(awaitingQueue[index])
        while activity ≠ null do
            RESUME(activity)
            activity ← DEQUEUE(awaitingQueue[index])

Listing 4.16: Procedure for awaiting the occurrence of an interrupt and the corresponding task switch finaliser.


Our interrupt handling mechanism is lock-free and uses only lock-free data structures. The main advantage of this approach is that we can make use of a very important property of non-blocking algorithms. Neither the lock-free operation executed by the interrupt handler nor the interrupted activity which is potentially operating on the same data structure wait for each other. If a lock-free operation gets interrupted by an invocation of the same operation, it simply looks as if another processor was faster in executing the corresponding atomic operations. Whether the interrupted code was overtaken by another physical or the same processor does not matter at all. This is why lock-free algorithms are generally considered reentrant by design. Lock-free data structures are therefore very well suited for being used in interrupt handlers. In particular, non-blocking synchronisation in interrupt handlers is what makes lock-freedom beneficial even for machines with only one processor.

5 Lock-Free Memory Management

Our lock-free queue used by the scheduler potentially allocates memory during enqueue operations. In order to call the scheduler lock-free, the corresponding memory allocation must therefore be lock-free as well. This chapter describes our approach towards a simple and portable lock-free memory management including a concurrent garbage collector that completes the required runtime support for Active Oberon.

5.1 Heap Management

Lock-free memory managers have not been investigated as thoroughly in the standard literature as other lock-free data structures such as queues. Two of the few known implementations have been presented concurrently and independently by Michael [Mic04c] as well as Gidenstam and Papatriantafilou [GPT05]. Both approaches are based on a well-known and practical concurrent memory allocator called Hoard [BMBW00, Ber02] and use atomic compare-and-swap operations in order to create lock-free versions of it. Dice and Garthwaite on the other hand provided a memory allocator which is lock-free for the most part [DG02]. As all of these memory allocators are derived from Hoard, they classify memory allocations according to the requested size and dedicate large chunks of memory called superblocks for this purpose. Interestingly, Hoard only manages memory blocks that fit into superblocks and passes the allocation of larger blocks and superblocks themselves on to the virtual memory management of the underlying operating system. However, as long as the virtual memory management system is not implemented in a lock-free manner as well, any memory management built on top of it strictly speaking does not qualify as a lock-free algorithm.


Since our runtime system does not rely on virtual memory management or separate address spaces, we have the advantage of representing the complete physical memory as one contiguous superblock referred to as a heap. In order to allow heaps to encompass an arbitrary memory area, we decided to represent allocated and free memory blocks using a lock-free binary buddy system. Buddy systems are well-known memory allocation schemes that are fast and expose little external fragmentation [Kno65, Knu68]. In the context of lock-free programming, they offer an additional distinct advantage over other dynamic allocation schemes. While other techniques have to look up and modify several addresses simultaneously while coalescing adjacent free memory blocks, buddy systems can compute the addresses instead and have to modify only single memory locations. This is important for portably merging free memory blocks into bigger ones on machines that only provide single compare-and-swap operations.

Our design allows several heaps to be managed, each represented as a contiguous memory range of arbitrary size. Each heap is partitioned into blocks where each block size is a power of two. For this purpose, the heap maintains for each block size 2^i a singly linked list of free blocks called free_i. These lists are accessed as stacks whereby an allocation of a memory block basically pops a block from the free stack, and a deallocation pushes it again onto the stack.

The data structures representing heaps and memory blocks are shown in Listing 5.1 on the facing page. A heap consists of a contiguous sequence of bytes addressed by [begin...end) within which all managed memory blocks reside. This assumes that heaps are non-overlapping and manage distinct memory ranges. As stated above, each heap also maintains an array of stacks which point to the first element of a linked list of free blocks with the same size. In addition, there is an array of stacks containing blocks which cannot be deallocated safely and are therefore pooled separately. The number of different block sizes depends on the size of the actual memory range but cannot exceed ⌊log2(end − begin)⌋. Regarding a heap encompassing the complete available address space, the maximal number of required block sizes therefore equals the actual bit width of the address size of the underlying machine.
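The buddy address computation mentioned above reduces to flipping a single bit of the block's offset within the heap. A small C sketch, mirroring the expression used later in the RELEASE operation:

    #include <stdint.h>

    /* Address of the buddy of 'block' (size 2^index) within a heap that
     * starts at 'begin': flip the bit that selects which half the block is. */
    static uintptr_t buddy_of(uintptr_t begin, uintptr_t block, unsigned index) {
        return begin + ((block - begin) ^ ((uintptr_t)1 << index));
    }

For example, with begin = 0 and index = 2 (block size 4), the buddy of the block at address 12 is 12 ⊕ 4 = 8.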


structure HEAP
    begin, end : address
    free : array address bit width of pointer to BLOCK
    pooled : array address bit width of pointer to BLOCK

structure BLOCK
    index : integer
    next : pointer to BLOCK

Listing 5.1: Data structures of lock-free heaps and their memory blocks.

Each allocated memory block stores its size as index of the corresponding stack. If the memory block is free and therefore part of a linked list of free blocks, it stores a reference to its successor in that list. This datum is completely ignored and reused for the actual payload in case the memory block becomes allocated. The minimal block size of allocated and free blocks is therefore two words. In practice, we use a multiple of this block size in order to compensate for the overhead of the meta data. Before a heap is available for allocation and deallocation of memory blocks, its data structure has to be initialised properly. At the beginning, an empty heap consists of the minimal amount of free blocks that is required to cover the complete memory range. The corresponding initial partitioning of the heap is performed by the INITIALIZE procedure shown in Listing 5.2 on the next page. Since we assume that this operation is performed once at the beginning by a dedicated process, there is no need to protect the code from concurrent access. The procedure consecutively appends the biggest block not exceeding the remaining free space to the corresponding stack. The biggest block is found by halving a block whenever its buddy block adjacent to it would completely lie outside the specified memory range. If the last free block is found, all stacks with a smaller block size are initialised to be empty. This algorithm assumes that the specified address range is aligned to the minimal block size and has space for at least one block.


procedure INITIALIZE(heap, begin, end)
    heap.begin ← begin
    heap.end ← end
    index ← ⌊log2(end − begin)⌋
    repeat
        next ← begin + 2^index
        while next > end do
            index ← index − 1
            heap.free[index] ← null
            heap.pooled[index] ← null
            next ← begin + 2^index
        block ← begin
        block.next ← heap.free[index]
        heap.free[index] ← block
        heap.pooled[index] ← null
        begin ← next
    until begin = end
    while index ≠ 0 do
        index ← index − 1
        heap.free[index] ← null
        heap.pooled[index] ← null

Listing 5.2: Initialisation of a memory heap partitioning it into the smallest possible amount of free blocks.


[Figure: heap layout of 32 address units with allocated blocks shaded grey; free stacks free_5 to free_3 empty, free_2: 4 → 12, free_1: 2, free_0: 10]

Figure 5.1: Sample memory allocation in a heap of size 32.

An example of a heap managing up to 32 blocks is depicted in Figure 5.1. For illustrative purposes, the unit of an address in this example is equal to the minimal block size of the system. The sample heap therefore manages blocks in the address range [0...32). It consists of four allocated blocks which are shaded grey in the heap layout depicted at the top of the figure. The addresses of the allocated blocks are 0, 8, 11, and 16. The remaining white blocks are free for allocation and are chained in a linked list that begins with the top of the stack of the corresponding size as shown on the right hand side. Regarding block size 4 for example, there are two free blocks at addresses 4 and 12 as indicated by the linked list referenced by the stack free_2.

An allocation in our buddy system first tries to find one of the smallest free blocks providing enough space for the requested size. A free block is found and allocated quickly by inspecting and popping the top of the corresponding free stack. In a second step, the resulting block is halved into two adjacent blocks as long as the requested memory still fits into one half. During this operation, the other half is deallocated and pushed back to the corresponding free stack.


[Figure: same heap layout after the two allocations; free stacks free_2: 12, free_1: 6, free_0: 10]

Figure 5.2: Same heap after two consecutive block allocations of size 2.

If for example a block of size 2 is requested in the example depicted in Figure 5.1, the block at address 2 would be popped from stack free_1. A subsequent allocation of the same size would try to pop a block from stack free_2 instead since free_1 is now empty. The right half of the resulting block with address 4 is deallocated again and pushed onto stack free_1. The resulting layout of the heap and its data structures is depicted in Figure 5.2.

Memory deallocations in the buddy system just reverse the operations of a memory allocation. Each deallocation of a block checks whether the corresponding buddy of the block is free or currently allocated. Two adjacent free buddy blocks are coalesced into one bigger block which can potentially be merged again with its own buddy recursively. Ideally, the deallocation completely reverts the partitioning from beforehand yielding exactly the same heap layout as before.

One issue of representing free blocks using lock-free stacks however, is the fact that naive implementations of the pop operation are prone to the ABA problem: If a stack ends up having the same element at the top after several push and pop operations, a delayed pop operation does not detect any change and may erroneously replace the top with an outdated successor.
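The following C sketch shows the kind of naive pop operation this paragraph warns about; the comment marks the window in which the ABA problem can strike. The code is illustrative and deliberately unsafe:

    #include <stdatomic.h>
    #include <stddef.h>

    struct block { struct block *next; };

    static _Atomic(struct block *) top;   /* top of a free stack */

    struct block *naive_pop(void) {
        struct block *first, *next;
        do {
            first = atomic_load(&top);
            if (first == NULL)
                return NULL;
            /* Window for the ABA problem: 'first' may be popped, reused and
             * pushed again here, so 'next' can be stale although the CAS
             * below still succeeds because the top looks unchanged. */
            next = first->next;
        } while (!atomic_compare_exchange_weak(&top, &first, next));
        return first;
    }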


variable hazard : array N of pointer to BLOCK

procedure ACCESS(reference to shared)
uncooperative
    block ← CAS(shared, null, null)
    repeat
        hazard[processor] ← block
        block ← CAS(shared, null, null)
    until block = hazard[processor]
    return block

procedure ISHAZARDOUS(block)
uncooperative
    index ← 0
    while index ≠ N ∧ block ≠ hazard[index] do
        index ← index + 1
    return index ≠ N

Listing 5.3: Data structure and operations for managing hazard pointers.

The ABA problem is also an issue in related work where it is generally solved using pointer tagging [Mic04c, GPT05]. As explained in Section 3.4 however, this technique may be sufficient to solve the problem in practice but is conceptually unsound because pointer tags could overflow and wrap around. Based on the sound solution to the ABA problem of our lock-free queue, our heap manager makes use of hazard pointers instead [Mic04b]. Similar to the lock-free queue implementation, the heap manager uses processor-local storage in order to represent hazard pointers. Listing 5.3 shows the corresponding global array for N processors. The subsequent ACCESS procedure returns the current value of a shared memory block reference which has been marked as hazardous. The ISHAZARDOUS procedure returns whether a memory block reference matches any hazard pointer.
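In C11 atomics, the ACCESS operation can be sketched as follows. The loop publishes the candidate block as the processor's hazard pointer and re-reads the shared reference until both agree; N and the processor parameter stand in for the runtime's processor-local storage and are assumptions of this sketch:

    #include <stdatomic.h>

    #define N 64                              /* assumed number of processors */

    struct block;
    static _Atomic(struct block *) hazard[N]; /* one hazard pointer per processor */

    /* Read a shared block reference and publish it as hazardous, retrying
     * until the published value is still the current one. */
    struct block *access(_Atomic(struct block *) *shared, int processor) {
        struct block *block = atomic_load(shared);
        for (;;) {
            atomic_store(&hazard[processor], block);
            struct block *current = atomic_load(shared);
            if (current == block)
                return block;                 /* announcement is consistent */
            block = current;                  /* raced with an update: retry */
        }
    }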


procedure POOL(block, index, heap)
uncooperative
    repeat
        pooled ← CAS(heap.pooled[index], null, null)
        block.next ← pooled
        value ← CAS(heap.pooled[index], pooled, block)
    until value = pooled

procedure FREE(index, heap)
uncooperative
    pooled ← CAS(heap.pooled[index], null, null)
    if CAS(heap.pooled[index], pooled, null) = pooled then
        while pooled ≠ null do
            next ← pooled.next
            RELEASE(pooled, index, heap)
            pooled ← next

Listing 5.4: Operations for managing pooled memory blocks.

A memory block is marked as hazardous whenever it is dereferenced in order to read its successor which is used to pop the block from the corresponding free stack. A hazardous memory block implies that it is still in use and may therefore not be pushed back onto the free stack. This prevents concurrent pop operations from seeing the same block twice at the top of the stack after it has already been temporarily allocated, which effectively solves the ABA problem. The deallocation of hazardous memory blocks may therefore not take place immediately and has to be deferred. Hazardous memory blocks can be pooled for deallocation by pushing them to a completely different stack called pooled. The corresponding lock-free operation to pool a single hazardous memory block is called POOL and is shown in Listing 5.4. It just pushes the memory block to the corresponding stack by repeatedly trying to atomically replace the top element with the pooled block.


procedure ACQUIRE(index, heap)
uncooperative
    FREE(index, heap)
    repeat
        free ← ACCESS(heap.free[index])
        if free = null then
            hazard[processor] ← null
            return null
        next ← free.next
        value ← CAS(heap.free[index], free, next)
        hazard[processor] ← null
    until value = free
    return free

Listing 5.5: Lock-free acquisition of blocks from the specified free stack.

Pooled memory blocks can be deallocated using the FREE procedure. Rather than popping a single element from the stack which is again not safe from the ABA problem, this procedure atomically exchanges the reference to the block at the top with an invalid reference. If this operation succeeds, the procedure gets exclusive access to the linked list of pooled memory blocks and all subsequent pool operations operate on an empty stack. The linked list is then traversed and each block gets deallocated in turn. The actual allocation and deallocation of memory blocks is performed using two helper functions called ACQUIRE and RELEASE. They are named like this because an activity allocating a memory block conceptually gets exclusive ownership over it. The algorithm for acquiring a heap block from a specified free stack is shown in Listing 5.5. It basically resembles a lock-free pop operation because it tries to remove the block at the top from the specified free stack. It additionally marks this block as hazardous using a call to the ACCESS procedure in order to read its successor safely. At the very beginning, it first frees all memory blocks that have been pooled to defer their deallocation.


procedure RELEASE(block, index, heap)
uncooperative
    repeat
        loop
            if ISHAZARDOUS(block) then
                POOL(block, index, heap)
                return
            buddy ← heap.begin + (block − heap.begin) ⊕ 2^index
            free ← ACCESS(heap.free[index])
            if free ≠ buddy then
                hazard[processor] ← null
                break
            next ← free.next
            value ← CAS(heap.free[index], free, next)
            hazard[processor] ← null
            if value ≠ free then
                free ← value
                break
            index ← index + 1
            if buddy < block then
                block ← buddy
        block.next ← free
        value ← CAS(heap.free[index], free, block)
    until value = free

Listing 5.6: Lock-free release of blocks to the specified free stack.

The corresponding lock-free release operation is shown in Listing 5.6. It first checks whether the heap block given as argument is hazardous in which case it gets pooled and the procedure returns immediately. Otherwise, it computes the address of the buddy of the block by swapping the address bit that corresponds to its size. The procedure then tries to allocate the buddy by performing a lock-free pop operation on the corresponding free stack.


procedure ALLOCATE(size, heap)
uncooperative
    index ← ⌈log2(size + displacement)⌉
    current ← index
    block ← ACQUIRE(current, heap)
    while block = null do
        current ← current + 1
        block ← ACQUIRE(current, heap)
    while current ≠ index do
        current ← current − 1
        RELEASE(block + 2^current, current, heap)
    block.index ← index
    return block + displacement

procedure DEALLOCATE(address, heap)
uncooperative
    block ← address − displacement
    RELEASE(block, block.index, heap)

Listing 5.7: Lock-free heap allocation and deallocation.

As before, the top of the stack is marked as hazardous in order to protect this operation from the ABA problem. If the buddy could be allocated, both memory blocks are merged and the complete operation is repeated with the coalesced block. If there are no more merges possible, the resulting memory block is released by pushing it onto the corresponding free stack. The push operation fails if another activity concurrently changed the stack in which case the whole procedure starts anew. Finally, the actual heap allocation and deallocation operations provided to users of the heap are called ALLOCATE and DEALLOCATE and are shown in Listing 5.7. A memory allocation takes an arbitrary size and returns the first address of a memory area which can be deallocated once afterwards.


The allocation consists of two consecutive loops. The first loop tries to acquire a block with the required size by incrementing the index until a free block is found. The second loop repeatedly releases the right half of the acquired block as long as the required memory fits in the other half. The returned value is the address of the block displaced by the size of the meta data used for storing the index of the block. The deallocation on the other hand subtracts this displacement from the address given as argument in order to retrieve a pointer to the corresponding memory block which then can just be released.

Our approach is lock-free as we only access lock-free stacks while maintaining the set of free blocks. Aside from starvation issues, faulty processes that allocate or deallocate memory therefore do not compromise the progress of other processes performing the same operations. However, a faulty process may cause acquired memory to be lost forever if that process has the ownership over an acquired block. One can ignore this issue, since the same also applies if the process fails to call the DEALLOCATE procedure in the first place.

One drawback of our approach is that only the topmost block of the free stack is considered while coalescing adjacent blocks. Therefore, the probability that a block can be merged with its buddy correlates negatively with the amount of free blocks in the stack for the corresponding size. A solution to this issue is to temporarily get ownership over the complete stack in order to search for the buddy in the linked list of free blocks. The ownership can be granted by atomically exchanging the reference to the first block in the linked list with an invalid reference. This makes it possible to traverse the list and remove the corresponding block without requiring any protection from concurrent access. During this period, allocations with the same block size will allocate bigger blocks and halve them as usual. In order to compensate for this overhead per deallocation, one can also create a concurrent activity which periodically gets the ownership of one stack at a time and coalesces all adjacent blocks in that list.

The heap management shown in this section is used by the runtime system when it is deployed as a kernel of an operating system. In this case, the managed heap consists of the complete main memory excluding the code and data of the statically linked kernel itself. However, the physical memory layout of the underlying machine often consists of so-called memory holes which describe areas in the physical address space that are not mapped to main memory. In order to accommodate these memory holes, the runtime system creates several individual heaps to cover all of the accessible main memory. Another solution could use a single heap instead but extend its implementation with a special-purpose operation that allows certain memory areas to be excluded from allocation.

5.2 Stack Management

In a system capable of multithreading, each thread of execution typically has its own contiguous stack memory it can operate on. In order to allow a large number of concurrent threads, operating systems tend to minimise the amount of memory set aside for each stack. A popular approach implemented by A2 and other systems based on virtual memory management is to reserve several consecutive pages of virtual memory for a stack but only associate them with a few pages of physical memory [Mul02, RSI12]. This technique enables relatively small initial stacks but allows them to grow dynamically if threads require more stack memory later on. A request to grow the stack dynamically is indicated when a memory access inside the reserved virtual memory leads to a page fault. A further advantage of this approach is the automatic check for stack overflows, which is indicated by a memory access outside the reserved virtual memory region.

Another solution consists of statically examining the call graph of a thread and noting the maximal amount of required stack space [Die92]. This approach makes it possible to allocate the exact amount of stack memory for each individual thread, which can be substantially smaller than the page size of the underlying machine. This technique has the advantage that its implementation is portable since it does not have to rely on virtual memory because stack overflows cannot occur. In general however, it is impossible to determine the actual call graph if the code makes use of recursion or indirect function calls, and corresponding runtime checks are still required.


Alternatively, dynamic stack management can also be achieved using compiler-instrumented procedures that allocate an additional block of stack in their prologue whenever required [Blä07]. All heap-allocated stack blocks form a linked list which is cleared when procedures return and no longer use the surplus stack memory. This allows processes to begin with a very small preallocated stack which grows as needed but still allows a large number of concurrent processes in the system.

Our approach aims at portability and is a mixture of all three ideas discussed above. Instead of examining the complete call graph, it only relies on the amount of required stack memory per procedure call. This number is statically known by the compiler and already required when it generates the code to create a new stack activation frame in a function prologue. Our compiler adds a stack check at the beginning of each procedure similar to the checks generated for implicit cooperative multitasking as explained in Section 4.2.4. We chose the stack check to be performed by the callee because functions are usually called more than once.

Thanks to our approach to cooperative multitasking, the descriptor of the currently running activity is handily available in a dedicated general-purpose register. The descriptor contains a pointer to the beginning of the stack as shown in Listing 4.1 on page 71. In addition, it also contains a pointer to the stack limit which specifies the memory range within which stack accesses are valid. A stack overflow is indicated when the creation of a stack activation frame exceeds this limit. In this case, the generated check calls a procedure provided by the runtime system to handle the overflow.

Listing 5.8 on the next page shows the corresponding instruction sequences of two sample stack frame creations as generated by the compiler targeting the AMD64 hardware architecture [Adv13]. In the default prologue, the compiler activates a stack frame by storing the former frame pointer on the stack and subtracting the size of the new stack frame from the stack pointer. The checked prologue uses a temporary register called rax instead and compares its value against the limit stored in the descriptor referenced by rcx. If a stack overflow is detected, the EXPANDSTACK procedure gets called which grows the stack dynamically and returns the new stack pointer in rax.


; default prologue
push rbp
mov rbp, rsp
sub rsp, 200

; checked prologue
push rbp
mov rbp, rsp
lea rax, [rsp - 200]
cmp rax, [rcx + 80]
jae skip
push rax
call ExpandStack
skip:
mov rsp, rax

Listing 5.8: Generated instruction sequences for the default and checked function prologues requesting 200 bytes as stack activation frame.

The stack is expanded by allocating new stack memory from the heap and copying the contents from the old stack to the new one. The size of the new stack is determined by doubling the size of the old stack until the requested stack pointer would be covered. This exponential growth compensates for the overhead of allocating and copying the stack. If the new size exceeds a predefined upper limit, the current activity aborts with a stack overflow error. Otherwise, a new memory block of this size is allocated and the complete contents of the old stack are copied over. This copy process requires adapting all references to local variables. Regarding Active Oberon, this only affects variable parameters for which the compiler provides the necessary meta data. The old stack is discarded after its complete contents are copied into the new, larger stack. Finally, the stack pointers of the current activity are adapted to point into the new stack.

The actual stack limit is slightly less than the allocated stack in order to reserve some guaranteed stack space that is always available. This allows a procedure to elide the stack check completely if the compiler can statically determine that a call of this procedure does not require more stack memory than reserved. This is an optimisation in general, but is actually crucial for all procedures involved in the actual expansion of the stack in order to prevent an infinite recursion.
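The size computation of this growth policy can be sketched in a few lines of C; the maximum stack size and the function name are illustrative assumptions:

    #include <stddef.h>

    #define MAX_STACK_SIZE ((size_t)1 << 24)  /* assumed upper limit: 16 MiB */

    /* Double the old stack size until the faulting stack pointer would be
     * covered; returns 0 to signal a fatal stack overflow. */
    size_t grown_stack_size(size_t old_size, size_t required_size) {
        size_t size = old_size;
        while (size < required_size) {
            size *= 2;                        /* exponential growth amortises copying */
            if (size > MAX_STACK_SIZE)
                return 0;
        }
        return size;
    }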


The whole process of checking and expanding is lock-free because each activity operates on its own local stack which does not affect the progress of any other activity in the system. New stacks are created using the standard lock-free memory allocation presented in the previous section.

5.3 Garbage Collection

Active Oberon and its ancestor Oberon both require automatic memory management in the form of a garbage collector [WG92, Rea04]. In A2 as well as in the Oberon System, the garbage collector is a dedicated process with first class citizen status [Mul02]. In both systems, processes that change the object graph, so-called mutators, are not allowed to run while the dedicated process, the collector, is retrieving unused memory. This behaviour is generally known as stop-the-world garbage collection and notorious for causing arbitrarily long and non-deterministic delays. This affects all processes running on the system regardless of whether they are currently allocating memory or not. In particular, the unpredictable delays caused by the runtime system prevent any program running on top of it from satisfying real-time constraints.

However, the most important issue of stop-the-world garbage collection with respect to the topic of this thesis is its blocking behaviour. Since mutators have to wait for collectors to complete their task, the exclusive access to the object graph by the garbage collector is tantamount to a global lock. As a result, this design inherits all the issues of blocking synchronisation as described in Section 2.1. Most importantly, there is no guaranteed progress of the system because a faulty collector could cause mutators to wait forever.

An obvious improvement over this kind of blocking is to design the garbage collector to perform its task incrementally. If the collector marks and sweeps objects step by step rather than in one go, it basically cannot cause arbitrarily long delays [BA84]. Regarding the Oberon system, Bianchi has created an incremental garbage collector running on uniprocessor systems [Bia98].


A more robust solution however, is to design the automatic memory management to be lock-free, which actually implies incremental operation and provides even stronger guarantees. Herlihy and Moss were amongst the earliest researchers to pursue this approach [HM91]. However, their collector relies heavily on copying for moving and updating objects which renders their approach inappropriate for practical purposes. Gao et al. presented one of the first lock-free garbage collectors based on mark and sweep [GGH07]. Other authors have focused their research on lock-free reference counting instead [Val94, HG01, HLMM02, DMSJ02]. Gidenstam gives an excellent overview of this topic [GPST09]. Reference counting seems to be appropriate and useful for managing the objects participating in lock-free data structures. On the other hand, this mechanism is known to be vulnerable to cyclic references which hinder objects from being reclaimed [MS95] and is therefore not suited for a general-purpose memory management. Unfortunately however, many authors based their work on atomic double compare-and-swap operations which are not universally available on contemporary hardware [DMSJ02, HLMM02].

The main goal of our implementation of the garbage collection was to guarantee the progress of mutators in a lock-free manner. Abolishing the global lock implies that collectors lose their first class citizen status and must be modelled like any other ordinary activity running concurrently in the system. In fact, our design goes one step further by allowing the runtime system to refrain from garbage collection altogether if desired, rendering automatic memory management completely optional and independent. For this reason, we introduced a new language feature for manual memory deallocation in case a program does not want to rely on garbage collection, see Section C.8 on page 210. The runtime system itself manages all of its resources on its own by making use of this feature and does therefore not require any automatic memory management at all. It is therefore very well suited for execution environments where garbage collection might impose an inappropriate overhead.

Enforcing that mutators never wait for collectors basically implies that they have to be able to execute concurrently. Concerning our runtime system, this requirement has two additional important implications:


• Regarding a single-processor system, a collector must yield the control of execution to a mutator in a regular fashion in order to maintain the illusion of parallelism. A garbage collection must therefore be able to be paused and to execute incrementally. In particular, a collector has to be prepared for mutators to modify the object graph concurrently.

• The set of objects that can be garbage collected consists of all objects that are not reachable directly or indirectly from the root set. In the case of Active Oberon, the root set consists of all objects referenced directly by global and local variables. The set of global variables is static and known in advance, whereas the set of local variables depends on the number of running and suspended activities in the system and their respective call stacks. Disallowing a collector to temporarily suspend an activity while traversing its call stack requires special attention when the activity runs concurrently and constantly modifies the stack. The same applies to suspended activities that resume their execution while the collector is examining their stack in parallel.

Both issues are related and basically require mutators to cooperate with the garbage collector. Our approach introduces so-called write barriers which cause the compiler to instrument assignments to reference variables with special code. In the case of our runtime support for Active Oberon, the actual task of a write barrier is threefold:

1. The referenced object that is assigned to the variable gets marked and is therefore not available for reclamation. A marked object is added to a linked list of objects that need to be traced for outgoing references. This is necessary if the garbage collector has already visited the variable while marking the object graph but the object only becomes reachable through this variable in the meantime. By traversing the linked list of marked objects in the end, the garbage collector still reaches objects which would otherwise have been collected.


2. A processor-local flag is set temporarily in order to indicate that the current processor is modifying a variable. The garbage collector has to traverse the linked list of marked objects as long as concurrent assignments are ongoing on other processors in order to ensure that all marked objects are eventually traced for outgoing references. Thanks to cooperative scheduling, which enables processor-local storage, this effectively synchronises collectors with mutators without requiring the latter to wait for the former.

3. If the modified variable is on the stack of an activity, the write barrier also updates a reference counter stored in the referenced object. This counter keeps track of the number of local variables referring to the object. A non-zero value of this counter indicates that an object is still referenced by a running or suspended activity and therefore belongs to the root set. Because only local variables are reference-counted, there cannot be any reference cycles and, what is more, collectors never have to inspect call stacks explicitly. A sketch of such a write barrier follows this list.
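A sketch of such a threefold write barrier in C11 atomics is shown below. The helper mark(), the modifying flag array, the reference-count field, and the decrement of the overwritten reference are placeholders for the mechanisms described above, not the code emitted by the actual compiler:

    #include <stdatomic.h>
    #include <stdbool.h>

    #define PROCESSORS 64                      /* assumed processor count */

    struct object {
        atomic_int local_refs;                 /* local variables referring here */
        /* ... cycle count, nextMarked, payload ... */
    };

    extern void mark(struct object *o);        /* assumed: adds o to the marked list */

    static _Atomic bool modifying[PROCESSORS]; /* processor-local assignment flags */

    void write_barrier(struct object **var, struct object *value,
                       int processor, bool var_is_on_stack) {
        atomic_store(&modifying[processor], true);    /* (2) announce the mutation */
        if (value != NULL) {
            mark(value);                              /* (1) keep the target traceable */
            if (var_is_on_stack)
                atomic_fetch_add(&value->local_refs, 1);   /* (3) new root reference */
        }
        if (var_is_on_stack && *var != NULL)
            atomic_fetch_sub(&(*var)->local_refs, 1);      /* (3) drop old root reference */
        *var = value;                                 /* the instrumented assignment */
        atomic_store(&modifying[processor], false);
    }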

Write barriers naturally incur some overhead in execution time. However, the implicit generation of write barriers can be disabled completely by using a corresponding compiler flag. This is useful when targeting execution environments that do not provide a garbage collector and require applications to manage memory on their own. In cases where applications depend on garbage collection on the other hand, the write barriers cannot be elided because they are essential for detecting the set of objects that are still reachable. Since assignments imply the marking of an object, write barriers require the garbage collector to be present and to provide the corresponding functionality. The garbage collector uses the following three lock-free data structures for this purpose:

• An atomic counter called cycle count is incremented at the beginning of each garbage collection. In addition to this global counter, each object stores its own copy of the cycle count which acts as the mark bit. It identifies the garbage collection cycle which was the last one that could reach the object while marking.


A collector marks objects by updating their cycle count if it is less than the current value of the global counter. In the end, it can safely collect objects that have a cycle count that is not greater than the value which the global counter had before it was incremented. By using signed arithmetic, the garbage collector subtracts cycle counts from each other while comparing them in order to deal with potential increment overflows causing cycle counts to wrap around.

• A linked list called watched list contains all objects that are managed by the garbage collector. Each object has a reference to its successor in this list and gets automatically inserted by the runtime system during its creation. The units managed by the garbage collector are therefore arbitrary objects in contrast to allocated memory blocks. In principle, the garbage collector could thus not only manage memory but also any other resource. The main purpose of this data structure however, is the complete detachment of the garbage collector from the underlying heap manager. The garbage collector is therefore also able to handle objects that are allocated by the heap manager of the underlying operating system if the runtime system is deployed as an application library.

• A linked list called marked list contains all objects that have been marked but not yet scanned for outgoing references. Each object has an additional reference to its successor in this particular list. A mark operation updates the cycle count of the object if necessary and puts it into the list of marked objects. A subsequent trace operation traverses the marked list and marks all outgoing references. This data structure makes it possible to break the implicit recursion while marking and tracing the object graph by splitting it into two separate operations. The main advantage is that mutators can cooperate with the garbage collector by marking objects during assignments without having to trace them afterwards. In addition, several collectors can help each other by concurrently tracing and marking the shared list of marked objects.


procedure MARK(object)
uncooperative
    repeat
        cycle ← CAS(object.cycle, 0, 0)
        current ← CAS(cycleCount, 0, 0)
        if cycle − current ≥ 0 then return
        value ← CAS(object.cycle, cycle, current)
    until value = cycle
    loop
        first ← CAS(firstMarked, null, null)
        if CAS(object.nextMarked, null, first) ≠ null then return
        if CAS(firstMarked, first, object) = first then return
        CAS(object.nextMarked, first, null)

Listing 5.9: Lock-free mark operation.

Furthermore, the marked list basically resembles a data structure called mark stack known from traditional garbage collectors. This representation of the mark stack however can grow arbitrarily without ever requiring any additional memory and is even able to contain all objects of the object graph in the worst case.

The actual lock-free MARK procedure is shown in Listing 5.9. It is called concurrently by collectors marking the root set as well as by write barriers of mutators modifying reference variables. The procedure first tries to atomically update the cycle count of the object to the value of the global counter. If the cycle count is greater or equal, the object has already been marked either by the same collector or a younger one. Otherwise, the procedure additionally tries to atomically insert its argument into the list of marked objects if it is not part of that list yet.
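The greater-or-equal test on cycle counts has to stay correct when the global counter wraps around. A common way to express this in C, and presumably what the signed arithmetic described earlier amounts to, is to interpret the unsigned difference as a signed number:

    #include <stdint.h>
    #include <stdbool.h>

    /* True if 'cycle' is at least as recent as 'current', even if the
     * global counter has wrapped around in the meantime. */
    static bool marked_in_current_cycle(uint32_t cycle, uint32_t current) {
        return (int32_t)(cycle - current) >= 0;
    }

With a 32-bit counter, a cycle of 1 read just after the global count wrapped from 0xFFFFFFFF is still classified as more recent, since the signed difference is small and positive. The comparison stays valid as long as concurrent cycle counts never drift more than half the value range apart.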


procedure TRACEMARKED()
    loop
        repeat
            first ← CAS(firstMarked, null, null)
            if first = markedSentinel then return
            current ← CAS(firstMarked, first, markedSentinel)
        until current = first
        repeat
            current.TRACE()
            next ← CAS(current.nextMarked, null, null)
            current ← CAS(current.nextMarked, next, null)
        until current = markedSentinel

Listing 5.10: Tracing outgoing references of marked objects.

The insertion of an object into the marked list resembles a lock-free push operation of a stack. This list begins with the object referenced by a global variable called firstMarked and ends with a sentinel object referenced by markedSentinel. The sentinel makes it possible to distinguish objects that are already part of the marked list from those that are not, in which case their respective successor called nextMarked is invalid. The corresponding procedure for tracing and marking outgoing references of marked objects is called TRACEMARKED and is shown in Listing 5.10. This procedure is only called by collectors and is not uncooperative in order to explicitly allow cooperative task switches to mutators. It first tries to get exclusive ownership over the linked list of marked objects by atomically exchanging the reference to the first object with the list sentinel. This allows it to traverse and trace the objects in the subsequent step without having to synchronise with concurrent collectors. Replacing the list with an empty one is semantically valid as it only contains objects that have been marked but not yet traversed. At the same time, it also solves the ABA problem as there is no need to remove single objects individually.
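To complement the pseudocode, here is a C11 sketch of this push-with-sentinel technique using <stdatomic.h>; the type and variable names are our own, and the memory ordering is left at the sequentially consistent default:

#include <stdatomic.h>
#include <stddef.h>

typedef struct object object_t;
struct object {
    _Atomic(object_t *) nextMarked;  /* null while the object is not listed */
    /* ... payload ... */
};

static object_t markedSentinel;      /* terminates the marked list */
static _Atomic(object_t *) firstMarked = &markedSentinel;

void pushMarked(object_t *object)
{
    for (;;) {
        object_t *first = atomic_load(&firstMarked);
        object_t *expected = NULL;
        /* claim list membership; fails if the object is already listed */
        if (!atomic_compare_exchange_strong(&object->nextMarked, &expected, first))
            return;
        object_t *head = first;
        if (atomic_compare_exchange_strong(&firstMarked, &head, object))
            return;                  /* successfully pushed */
        /* the head changed concurrently: revoke the claim and retry */
        object_t *claimed = first;
        atomic_compare_exchange_strong(&object->nextMarked, &claimed, NULL);
    }
}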


For tracing outgoing references of an object, we implemented a somewhat unusual approach. Instead of equipping objects with the required meta data indicating which fields are references and must be marked, the compiler implements an abstract method called TRACE for each object. The generated code just calls the MARK procedure for each reference field or array of references contained in the object. How references are identified exactly, however, is an implementation detail of the compiler and irrelevant for the garbage collector, as long as it is guaranteed that the MARK procedure is called for each outgoing reference. Our design using an abstract method does not pollute the implementation of the garbage collector with these low-level details and allows it to manage any other resource just as well. Finally, the actual garbage collection is shown in Listing 5.11 on the next page. It can be called by several activities concurrently and performs the following steps:

1. Each garbage collection first increments the global cycle count called cycleCount and remembers its previous value.

2. A global counter called oldestCycle stores the cycle count of the oldest still running collector. If a collector has a higher number than this global counter, there are other collectors which have not yet marked the complete object graph. In this case, all younger collectors help the oldest one to trace and mark the object graph completely before it can sweep unreachable objects. Since the mark phase always uses the most current value of the global cycle count, helping collectors probably do not have to do much marking by the time they become the oldest one.

3. At this step, the current collector is the oldest one and tries to get exclusive ownership of the watched list which contains all managed objects. This operation needs an atomic compare-and-swap operation as this list is accessed concurrently by mutators which potentially create and insert new objects in parallel. As before, the atomic emptying of the complete watched list also solves the ABA problem.


procedure COLLECT()
    cycle ← INCREMENT(cycleCount)                           ▷ (1)
    while CAS(oldestCycle, 0, 0) ≠ cycle do                 ▷ (2)
        TRACEMARKED()
    repeat                                                  ▷ (3)
        first ← CAS(firstWatched, null, null)
        current ← CAS(firstWatched, first, watchedSentinel)
    until current = first
    MARKGLOBALREFERENCES()                                  ▷ (4)
    MARKLOCALREFERENCES(first)
    TRACEMARKED()
    while there are mutators assigning do                   ▷ (5)
        TRACEMARKED()
    while there are collectors tracing do                   ▷ (6)
        TRACEMARKED()
    INCREMENT(oldestCycle)                                  ▷ (7)
    SWEEP(first, cycle)                                     ▷ (8)

Listing 5.11: The garbage collection cycle.

4. First, all global reference variables are marked using a call to the MARKGLOBALREFERENCES procedure. Regarding Active Oberon, the global root set consists of the variables associated with all currently loaded modules. The procedure just calls the MARK procedure for all instances of a global reference variable.

In order to mark local references, the MARKLOCALREFERENCES procedure traverses the exclusively owned watched list and calls MARK whenever the local reference counter of an object is not zero. Subsequently, a single call to the TRACEMARKED procedure marks and traces the complete reachable object graph.


5. At this point, the complete root set and all reachable objects have been marked but concurrent mutators might have already modified the object graph. If there are concurrent assignments, the corresponding write barriers make sure that the respective objects are marked using the MARK procedure. The loop in this step ensures that all marked objects are also traced for outgoing pointers as long as there are other processors executing assignments. As a result, all reference variables modified before or after this step are guaranteed to be marked.

6. This step helps younger activities to finish marking the object graph and effectively synchronises concurrent collectors with each other. This is important for the oldest collector which is about to sweep unreachable objects which otherwise might still be marked and traced by younger collectors helping the oldest one.

7. At this point, all required objects have been definitely marked. By incrementing the lowest cycle count, the next collector is able to proceed which in turn will increment the counter as well.

8. A call to the SWEEP procedure finally deallocates all unreachable objects in the private watched list. The corresponding procedure traverses the linked list and removes all objects with a cycle count not greater than the one remembered in the first step. The removed objects are deallocated and the remaining list of still reachable objects is pushed back to the global list of watched objects. As the collector operates on a private watched list, all objects allocated after the exclusive acquisition can only be deallocated by younger collectors.
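A sequential C sketch of the sweep in step 8 may help; all names are illustrative, the list fields are plain pointers because the list is privately owned at this point, and deallocate, pushWatched, and watchedSentinel are assumed helpers:

#include <stdint.h>

typedef struct watched watched_t;   /* illustrative, not the thesis' type */
struct watched {
    uint32_t cycle;                 /* per-object copy of the cycle count */
    watched_t *nextWatched;         /* successor on the watched list */
};

extern watched_t watchedSentinel;   /* assumed end-of-list marker */
void deallocate(watched_t *object); /* assumed helpers */
void pushWatched(watched_t *list);

void sweep(watched_t *first, uint32_t cycle)
{
    watched_t *survivors = NULL;
    for (watched_t *node = first; node != &watchedSentinel; ) {
        watched_t *next = node->nextWatched;
        if ((int32_t)(node->cycle - cycle) <= 0) {
            deallocate(node);              /* cycle count too old: unreachable */
        } else {
            node->nextWatched = survivors; /* still reachable: keep (order reversed) */
            survivors = node;
        }
        node = next;
    }
    pushWatched(survivors);  /* hand the survivors back to the global list */
}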

Strictly speaking, the algorithm described above is not lock-free among concurrent collectors because they have to wait for each other. If one of them fails to make progress, all younger ones will be blocked forever. A simple solution is to forbid concurrent collectors in which case steps 2 and 7 can be elided. If a collector aborts during the marking phase in this version, it does not affect any subsequent collectors except that the objects in its private watched list can no longer be reclaimed.


Although our approach is not lock-free with respect to collectors among themselves, it still makes it possible to mark and sweep the object graph in a lock-free manner in parallel with concurrent mutators. It relies only on lock-free data structures which do not have any intrusive impact on concurrent mutators except for negligible starvation issues. Apart from scheduling, any activity that neither marks objects nor triggers a garbage collection by itself is therefore guaranteed to never be affected by concurrent garbage collections. This class of activities typically includes device drivers, which profit from this important and practical real-time property. In summary, the garbage collector presented in this section improves upon existing solutions for the Active Oberon programming language in several areas. First of all, our mark and sweep garbage collector is not bundled with the underlying memory manager and is designed to be an optional and non-integral process of the system just like any other activity. Its mark phase is precise because the compiler provides all necessary meta data in the form of write barriers and abstract methods for tracing objects. Furthermore, the garbage collector applies only lock-free data structures, which completely detaches mutators from collectors. As a consequence, garbage collections can be performed concurrently and incrementally and therefore have a non-intrusive impact by design. In addition, lock-freedom also allows reentrancy and implies that garbage collectors can be executed in parallel, in which case they help each other to distribute the load of marking the object graph. In retrospect, all of these features have basically been enabled thanks to our novel approach of combining lock-free programming with cooperative multitasking.

6 Case Study: Software Portability

This chapter discusses all actions that had to be taken in order to achieve a high portability of the runtime system. This applies not only to the design of its source code but also includes the implementation of the underlying programming language as well as the abstractions required by the corresponding development tools.

6.1 Introduction

One of the main purposes of an operating system is to provide a generic runtime environment for application software. For this reason, an operating system often incorporates a powerful abstraction layer for interfacing the underlying hardware components in a portable manner. By limiting itself to this high-level application programming interface, carefully designed software requires only a recompilation in order to retarget a different hardware architecture. In an ideal world, this form of software portability should also be applicable to the operating system itself. Due to the low-level nature of some components of operating systems, however, this goal can seldom be accomplished to the full extent. Depending on the architecture, there are many hardware components that require special-purpose code for accessing their functionality. Examples include machine-dependent facilities provided by the central processing unit and other integrated controllers for handling interrupts, managing virtual memory, or interfacing peripheral devices. The corresponding source code for controlling these low-level facilities is typically written in assembly language because it usually requires special machine instructions. This is aggravated by the fact that hardware manufacturers often specify proprietary and non-uniform protocols for accessing the respective hardware.


One of the main goals of this thesis was to design a multiprocessor runtime system that minimises the need for machine-dependent code. We achieved this objective by following three fundamental design guidelines:

Prefer Alternatives: Machine-independent alternatives should be preferred to non-portable and hardware-specific solutions wherever possible. Interrupt-driven preemption for instance was abandoned in favour of cooperative multitasking which can be implemented completely in software. Other examples include the runtime checks for error conditions such as null pointer exceptions and stack overflows which are typically implemented using virtual memory.

Unify Runtime Environments: The requirements of runtime environments must be unified to the lowest common denominator of hardware features. The implementation of the runtime system may therefore only rely on hardware facilities which are uniformly provided by each and every hardware architecture. Examples of facilities which are not supported universally include the support for interrupt handling or virtual memory management. If a machine supports such features, it must be set up beforehand in such a way that it can mimic a machine that does not provide them.

Use Compiler Abstractions: Dependencies on the instruction set architecture shall be abstracted by the compiler wherever possible. The implementation of task switches in the runtime system for instance requires access to special-purpose registers like the stack pointer. It also requires a uniform calling convention across different hardware architectures. The corresponding facilities are provided by the compiler in the form of a generic and portable programming interface.

By design, the implementation of our runtime system requires only hardware resources which are also available to ordinary applications running on top of an operating system. As a consequence, we are able to bundle the runtime system as an additional library linked into the binary executable file of an application without requiring special hardware support.


Concerning the kernel, the task of standardising the runtime environment is inherently non-portable but can be completely arranged by an initial bootloader which is typically written in assembly language anyway. Regarding the AMD64 architecture for example, our BIOS bootloader not only loads the kernel image from disk to memory, but also switches the processor state from 16-bit mode to 64-bit mode and identity maps all of the main memory. Regarding the runtime environment established so far, additional hardware support for facilities like interrupt handling or multiple processors is completely optional. If its implementation is provided, it may differ from one hardware architecture to another or depend on the underlying operating system. The corresponding programming interface consists of a thin layer of abstractions provided by small and replaceable modules. The same applies to substitutable functionality provided by the runtime environment like allocating memory or accessing peripheral devices. The following sections describe the application of the guidelines in a variety of contexts concerning the development of the runtime system. The key to achieving high portability was to abstract the underlying execution environment in terms of hardware as well as software.

6.2 Cross Compilation

Source code is usually considered portable when it can be made executable on different hardware architectures or runtime environments without much effort. Ideally, the source code must only be recompiled for the new execution environment without having to modify it at all. This implies that the compiler satisfies two different preconditions:

1. The compiler must provide programmers with an abstraction layer for all hardware-dependent issues which may cause a program to be implicitly restricted to one specific platform. Examples include predefined machine peculiarities like the intrinsic bit-width of machine words, alignment constraints, endianness specifications, or calling conventions. Most of these issues can be abstracted by a proper definition of a generic type system.


2. The compiler must be able to translate the same source code for different hardware architectures. It must therefore provide a powerful abstraction of the target instruction set architecture which makes it possible to handle all machine-dependent compilation stages like register allocation and code generation.

In accordance with our goal of implementing a portable runtime system, we naturally wanted the source code of the compiler itself to be portable. By design, the resulting compiler is able to translate source code written in Active Oberon into machine code for any target it supports, regardless of the host machine the compiler is running on. Compilers with this ability are generally called cross compilers, but we find the mere necessity for this term unfortunate since all compilers should ideally be able to cross compile. The lack of this capability implies nothing but a compiler that is not portable. Concerning the original language definition of Active Oberon [Rea04], we had to complement its already powerful feature set with several abstractions in order to allow programmers to write more portable code. Appendix C lists all modifications and enhancements and describes them comprehensively. The most important extension was the introduction of a memory model for lock-free programming as described in Section 2.2.2. This includes the definition of a new built-in procedure implementing an atomic compare-and-swap operation on arbitrary basic or reference types. Other extensions include several new distinct integer types which allow us to properly represent addresses, lengths of arrays, and machine words on architectures with different bit-widths. All of these additions allowed us to write the runtime system in a completely portable manner, targeting a variety of hardware architectures ranging from 8-bit to 64-bit machines. The compiler is able to target a variety of hardware architectures by utilising a powerful intermediate code representation of the translated source code [Fri11]. This representation consists of a small instruction set and a well-defined programming model for an abstract machine which is able to accommodate all differences of the supported hardware architectures. The compiler first translates the source code into this representation and uses it in a second step to generate the actual machine code.


This design can be found in many modern compilers and offers several advantages [Muc97]. The deliberately low-level nature of the intermediate code allows a direct translation to machine code which is much simpler than the actual translation from source code to intermediate code. Furthermore, an intermediate code interpreter which emulates an execution environment based on the abstract machine makes it possible to verify the correctness of the translated code without having to rely on the generation and execution of the actual machine code. The instruction set of the intermediate code representation is listed in Table 6.1 on the following page. The underlying abstract machine operates on uniquely named sections, each storing a sequence of instructions representing executable code or global data. The corresponding programming model includes a processor that executes instructions stored in code sections and provides a set of typed registers and a stack for storing local variables and parameters. Our implementation improves on a similar intermediate code representation of the Active Oberon compiler by Reali by supplementing it with portable type definitions [Rea03]. The fact that all of the hardware architectures targeted by his compiler had the same bit-width led to a lot of implicit assumptions in his definition of the intermediate code language which had to be fixed accordingly [Neg06].

6.3 Generic Object Files

Having implemented a cross compiler that is able to generate machine code for a variety of architectures, we also wanted a common file format to store the generated code. Files storing the binary representation of machine code and global data are usually called object files. In addition, they most often contain meta data like symbolic links which refer to code and data stored in the same or other object files. Two of the most popular object file formats are the portable executable file format (PE) used by Windows systems [Mic10] and the executable and linking format (ELF) used by Unix-like systems [Too95].


Category         Instruction  Operation
Memory Layout    data         Datum Definition
                 reserve      Space Reservation
Special Purpose  nop          No Operation
                 asm          Inline Assembly
Data Management  mov          Datum Copy
                 conv         Datum Conversion
                 copy         Data Copy
                 fill         Data Initialization
                 cas          Atomic Compare-And-Swap
Arithmetic       neg          Negation
                 add          Addition
                 sub          Subtraction
                 mul          Multiplication
                 div          Division
                 mod          Modulo
Logic            not          Logical NOT
                 and          Logical AND
                 or           Logical OR
                 xor          Logical Exclusive OR
                 lsh          Left Shift
                 rsh          Right Shift
Procedure Call   pop          Pop from Stack
                 push         Push onto Stack
                 call         Procedure Call
                 return       Return from Procedure
Branching        br           Unconditional Branch
                 breq         Branch if Equal
                 brne         Branch if Not Equal
                 brlt         Branch if Less Than
                 brge         Branch if Greater Than or Equal

Table 6.1: Instruction set of the intermediate code representation.


Regarding Oberon and its predecessors, compiler writers implementing these programming languages always defined their very own object file format [WG92]. The original Active Oberon compiler by Reali for instance uses an object file format which is tailored to the specifics of this particular programming language [RG06]. Years of experience maintaining and extending this file format revealed that its design is based on many implicit and unnecessary machine dependencies. The main issues with this object file format are summarised below:

• A single object file stores the binary representation of one module and contains all symbolic links to imported procedures and variables. All references to the same procedure or variable are stored in a linked list embodied in the binary code itself. While statically linking or dynamically loading the module at runtime, the corresponding bit pattern storing the offset of the next occurrence of the symbol gets replaced by its actual address. While this clever design is space-saving, it also assumes that the bit pattern of each instruction referencing a symbol always has the same format. This precondition was sufficient for the x86 and ARM hardware architectures supported by the original compiler [Adv13, ARM05]. However, there exist several other instruction set architectures that use not one but several different encoding formats for instructions referencing an address. In some cases, the corresponding bit pattern is even spread discontiguously within the instruction format [Atm05]. Even if the issues about the bit pattern are ignored, the instruction format still dictates how the address has to be patched regarding the endianness and alignment constraints of the target machine. Additionally, there are instructions like branch operations which require the target address to be patched relative to the address of the instruction rather than absolutely. As a result, the linker and loader have to be aware of the instruction encoding and must therefore be customised anew for each hardware architecture.


• In addition to the binary representation of code and data, object files most often also contain meta data about their contents. In the simplest case, this data just consists of information about the interconnections of code and data stored in different object files as described above. In the case of Active Oberon for example, the original object file format contains additional meta data describing exported symbols, executable commands, type descriptors, exception tables, and the locations of outgoing references for precise garbage collection. Other object files such as the PE and ELF file formats contain similar information or a subset thereof [WG92, Too95, Mic10]. However, all of these object files have in common that the binary data as well as all of the contained meta data are structurally on the same level. This basically means that object file format specifications not only describe how the binary data is represented but also define the layout and format of meta data. Consequently, any modification of the set of information stored as meta data inevitably requires changing the specification of the object file format as well. The corresponding modifications naturally affect all development tools that generate or process object files. In the case of Active Oberon for instance, the set of affected development tools includes the compiler, the static linker, the dynamic loader, as well as the runtime system. Each time the object file format was modified in order to contain different or additional meta data, all of these components had to be changed correspondingly. The static linker posed an additional problem because it tries to emulate the operations of the dynamic loader. It combines object files into a binary image on the host machine that looks exactly like the memory layout of the target machine after dynamically loading object files. This emulation is achieved by executing the same source code as the loader but using a fake heap which directly maps to the binary image. This idea fails completely when cross compiling for architectures that have a differing address size or byte order.


Dissatisfied with the existing solution, we designed a new object file format called generic object file. The main goal of generic object files is to be completely machine-independent and to minimise the amount of meta data in order to resolve all of the issues raised above. As a result, our object file format is not bound to Active Oberon any more as it can contain binary code and meta data for any compiled program. In addition, generic object files contain all the information necessary to patch the addresses of symbolic links in a portable manner, which makes it possible to unify static and dynamic loading using the very same source code [FN11]. We achieved this result by adhering to the following design guidelines:

Minimise Format: Generic object files contain only that information which is strictly necessary for the loader and linker to arrange the binary contents in memory while resolving the symbolic links. Object files consist of binary sections which describe contiguous sequences of binary code or data. A section is the main entity of an object file and can represent a single global variable, constant string, procedure, or any other language construct that occupies contiguous memory. Each section therefore has a unique name and carries additional meta data that specifies how its binary content has to be arranged in memory. For example, the type of a section describes the purpose of its binary content, which is either executable code or global data. This distinction is necessary for supporting Harvard architectures where sections of different types are arranged in different address spaces. In addition, a section contains a list of so-called fix-ups which reference other sections by name and specify how and where the binary data has to be patched in order to store the address of the referenced section. For informational purposes, the complete definition of the generic object file format is shown in Figure 6.1 on the next page. It is given in extended Backus–Naur form since generic object files can also be represented textually, which demonstrates their simplicity, portability, and maintainability [FN11].


ObjectFile = { Section }.
Section = Type Name Unit Placement Fixups Bits { Fixup } { Octet }.
Type = code | initcode | bodycode | data | const.
Name = string.
Unit = integer.
Placement = aligned Alignment | fixed Address.
Alignment = Unit.
Address = Unit.
Fixups = integer.
Bits = integer.
Fixup = Name Patches { Patch }.
Patches = integer.
Patch = Mode Displacement Scale Patterns { Pattern } Offsets { Offset }.
Mode = abs | rel.
Displacement = Unit.
Scale = integer.
Patterns = integer.
Pattern = BitOffset Bits.
BitOffset = integer.
Offsets = integer.
Offset = Unit.
Octet = hexadecimal-digit hexadecimal-digit.

Figure 6.1: Generic object file format expressed in EBNF.
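To make the grammar tangible, the following hypothetical textual object file defines a four-byte counter and a constant section that stores the counter's absolute 64-bit address via a single fix-up. The concrete values and our reading of the field semantics (unit size in bits, alignment in units, fix-up and bit counts, patch patterns and offsets) are purely illustrative and not taken from the thesis:

data Example.Counter 8 aligned 4 0 32
00 00 00 00

const Example.Pointer 8 aligned 8 1 64
Example.Counter 1 abs 0 0 1 0 64 1 0
00 00 00 00 00 00 00 00

Read as our interpretation: the first section reserves 32 bits of zero-initialised data; the second holds 64 bits that are patched with the absolute address of Example.Counter, using one pattern covering bits 0 to 63 at unit offset 0.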


A similar approach was pursued by Fraser and Hanson in order to achieve a machine-independent linker [FH82]. But instead of providing the static binary contents together with information about how to patch resolved symbols, their linker is in fact an interpreter which evaluates expressions stored in textual object files in order to compute the value of each individual word while linking. However, besides the substantial performance degradation, this approach does not take machine-dependent issues like word size and endianness into consideration.

Embed Meta Data: Any other meta data emitted by the compiler describes the compiled program and is therefore completely irrelevant for linking and loading object files. As this kind of information is only required by the runtime system, it can be directly represented by the binary data embedded in additional sections. This approach makes it possible to avoid encoding meta data in a proprietary format while compiling, only to process this information and build corresponding data structures while linking or loading. Instead, the compiler directly generates the data structures as expected by the runtime system on the target machine and stores their contents in corresponding data sections. Cross-references within these global data structures are established using the usual fix-up mechanism provided by the object file format. In the case of Active Oberon, our new object file format turned out to be generic enough to represent all the necessary meta data like module descriptors, type descriptors, and command interfaces. This even includes exception tables and the locations of outgoing references for precise garbage collection, which assures us that this approach is feasible for other compiled languages as well. As a result, any modification of the set of meta data never required us to change the file format specification nor the static linker or dynamic loader. This increased the maintainability and portability of all development tools and the runtime system substantially.


Another kind of meta data stored in the original object file formats for Oberon and Active Oberon were fine-grained object fingerprints [Cre94]. Fingerprints encode information about the public interface of a module in order to detect interface inconsistencies while compiling and loading a module. As fingerprints are basically hash values of fixed size, it is not guaranteed that two distinct entities always have different fingerprints. The generic object format can represent fingerprints as variable sized data sections which encode the public interface unambiguously. The sole purpose of generating these otherwise unused sections is to allow the loader to detect mismatching fingerprints by comparing the binary contents of duplicated data sections. Another proof of the generality of the new object file format is the fact that it is also able to handle the meta data required to represent completely different object file formats targeting platforms like Windows and Linux. This is accomplished by introducing two or more additional sections that mimic the binary contents of headers and footers of other file formats. Regarding the actual binary file generated by the linker, the binary code and data generated by the compiler are physically encompassed by the contents of these format-specific sections. In order to be able to load the generated binary file on a different runtime environment, only these sections have to be replaced while linking because they completely capture all format-specific information. As a result, the linker processes these sections as usual, generating a plain binary file as before, and therefore does not have to know anything about the actual object file format of the target platform.

6.4 Runtime System Structure

The runtime system consists of several relatively small and generic Active Oberon modules. Table 6.2 on the facing page lists all modules responsible for lock-free scheduling and memory management. It also references the corresponding sections of this thesis that describe the abstract implementation of each module. The references into Appendix B on the other hand lead to the concrete module interfaces.


Module            Description          See Sections
Queues            Lock-Free Queues     3.5 (B.7)
Activities        Basic Scheduling     4.2 (B.1)
ExclusiveBlocks   Mutual Exclusion     4.3 (B.2)
Processors        Multiprocessing      4.4 (B.6)
Interrupts        Interrupt Handling   4.5 (B.5)
Heaps             Heap Management      5.1 (B.4)
GarbageCollector  Garbage Collection   5.3 (B.3)

Table 6.2: Modules with lock-free algorithms presented in this thesis.

The actual module structure of the runtime system is shown in Figure 6.2 on the next page. It depicts the generic layout and imports of the modules responsible for the complete runtime support for the Active Oberon language. Two of these modules are shaded grey in order to indicate that their implementation depends on the actual target architecture or runtime environment. The runtime support can be decomposed into three different layers where the bottom layer consists of the following four modules responsible for scheduling and multiprocessing as described in Chapter 4:

Processors: This module provides the runtime support for managing multiple processors. As described in Section 4.4, we provide several implementations of this module where all of them share the same interface. If the runtime system is used as an application library, there are two implementations targeted at Windows and Linux based systems which create threads for representing processors. In the native case targeting x86 and AMD64 machines, we provide an implementation which uses the advanced programmable interrupt controller (APIC) in order to handle multiple processors.

Queues: Based on processor-local storage partially enabled by the previous module, this module provides the implementation of lock-free queues as described in Section 3.5.


Figure 6.2: Generic module structure of the runtime system. (The diagram shows the layering from bottom to top: Processors for multiple processors, Queues for lock-free queues, Activities for activities and active objects, ExclusiveBlocks for exclusive statement blocks, Environment for the execution environment, Modules for modules and commands, Runtime for generic runtime support and compiler calls, and the optional GarbageCollector on top.)


Activities: This module uses lock-free queues in order to represent activities which are associated with active objects and provides a framework for the implementation of synchronisation primitives.

ExclusiveBlocks: This module makes use of the scheduler framework provided by the previous module in order to implement object monitors which are accessible in Active Oberon by the notion of exclusive statement blocks.

The next layer contains the remaining runtime support which is required for implementing the Active Oberon programming language:

Environment: This module contains a few wrapper procedures for the required functionality provided by the actual execution environment. This includes support for basic input and output as well as allocating and deallocating memory blocks.

Modules: This module is responsible for representing modules and their public interfaces as well as loading the modules in the correct order as predefined by their import hierarchy.

Runtime: Based on all previous modules, this module subsumes all remaining runtime support for various built-in procedures provided by the compiler.

The topmost layer consists of a single module which is not stringently required in order to implement Active Oberon and can therefore be excluded:

GarbageCollector: This module provides the complete functionality for automatic memory management in the form of garbage collection as described in Section 5.3.

If the runtime system is used as a kernel for standalone operating systems, there is basically no underlying runtime environment. Therefore, none of the functionality provided by the execution environment can be forwarded; it must instead be implemented from scratch for the corresponding machine.


Targeting an execution environment based on the Basic Input / Output System (BIOS) for example, the corresponding module has to be replaced by a machine-dependent implementation making use of drivers in order to access peripheral devices for input (PS2 keyboards) and output (serial ports and VGA displays). Figure 6.3 on the next page shows the structure of this particular module and its imports. Naturally, most of its imported modules are machine specific and inherently non-portable. However, the following two modules are still generic and completely portable and can therefore be reused in kernels targeting different machines:

Interrupts: This module provides a synchronisation primitive that makes it possible to await the occurrence of interrupts in a high-level fashion as described in Section 4.5. It makes use of a non-portable module responsible for abstracting the underlying central processing unit of the target machine in order to handle low-level interrupts.

Heaps: This module manages heaps for allocating memory blocks from contiguous memory regions as described in Section 5.1. While booting, the BIOS based kernel performs interrupt calls in order to detect unreserved chunks of memory and uses this module to manage allocations therein.


Figure 6.3: Kernel specific module structure of the runtime system targeting an execution environment based on BIOS. (The diagram shows the BIOS execution environment on top of the kernel runtime support: the Heaps module for lock-free heap management, drivers for VGA display output and PS2 keyboard input, generic keyboard and device management, the generic Interrupts module, and low-level CPU and interrupt handling at the bottom.)

7 Evaluation

This chapter evaluates the performance of several components of our lock-free runtime system and compares the results against related work. Although our system is designed to be portable across hardware architectures, we do not intend to contrast its performance on different machines. Modern processors and memory hardware seem to apply more and more sophisticated optimisation techniques in order to execute code more efficiently. Some of the most elaborate inventions in this area include for example caching, branch prediction, and out-of-order execution. We do not delve into the details of these features since all of them are typically implemented in hardware and are therefore completely transparent to the programmer [Fog14]. Even though they do speed up the execution of code in general, they often render the performance of processors non-deterministic at the same time. But even when considering only a single hardware architecture in isolation, the performance of different implementations thereof may vary considerably. In order to minimise all of these effects and to concentrate on the performance of the actual code, we conducted all of our experiments on the very same hardware. This approach makes it difficult to reason about the absolute execution time of our algorithms in general. Instead, our focus is on relating our work to existing solutions when executed under high contention. We used an AMD64 server machine with 16 GB main memory and two physical AMD Opteron 6272 G34 processors running in 64-bit mode and featuring 16 cores each. For timing our experiments we make use of a built-in high precision hardware timer which has an accuracy of at least 10 MHz. This setup provides a total of 32 logical processors and allows us to evaluate and compare the performance of our system under increasingly heavy contention.


7.1 Contention Management

Atomic read-modify-write operations like compare-and-swap typically operate on a single memory location. These operations are known to suffer from performance degradation when several contending processors access the same shared memory location concurrently. The compare-and-swap operation for example allows only one processor to succeed while all other contending accesses fail. Since non-blocking algorithms are usually designed to repeat the whole operation in case of failure, there is potentially high memory traffic surrounding the shared memory location. The resulting memory congestion degrades the efficiency of cache coherency protocols and may even have an impact on the performance of successful compare-and-swap operations.

Figure 7.1 shows the average number of successful compare-and-swap operations while increasing the number of contending logical processors on our server machine. Each participating processor atomically increments a shared counter variable using the algorithm shown in Listing 2.2 on page 26 in a tight loop for the duration of five seconds. The graph shows the value of the counter after all processors have finished, which equals the number of all successful compare-and-swap operations. On this particular hardware, the total number of successful operations is highest for four to nine processors but yields only a marginal speedup with respect to the performance using a single processor. Starting with twenty or more contending processors, the actual throughput of successful compare-and-swap operations decreases significantly. Regarding the full set of logical processors, the actual throughput of successful compare-and-swap operations even drops to about one half of the initial performance. As a result, atomic operations like compare-and-swap must be considered inefficient and expensive under high contention in comparison to standard instructions.

The actual variation of the throughput is shown by the grey box plots in the background of Figure 7.1. It is triggered by the varying subset of executing processor cores in each of the twenty measurements per data point. Our scheduler does not implement processor affinities and basically assigns tasks to those idle processor cores which are awoken first by the hardware upon an interprocessor interrupt. Therefore, the actual subset of running processor cores differs in each run and affects the physical load of the two processor dies. Contending cores running on the same die have different performance characteristics than cores distributed across the physical processors. Therefore, the variation of the throughput correlates with the actual number of all possible combinations to distribute running cores among the dies. For our purposes, this effect can be safely neglected as we are interested only in the overall trends when increasing the number of logical processors that contend over the same shared variable.

Figure 7.1: Throughput of contending compare-and-swap operations (successful CAS operations versus the number of contending processors, 1 to 32).

The concept of contention management tries to improve the behaviour of atomic memory operations under high contention. The usual approach is to back off by delaying a contending processor if the atomic operation failed in order to prevent memory devices from congesting. Although this method causes failing processors to potentially wait longer than required, it is capable of increasing the overall throughput of all contending processors.


The same ideas are drawn from other fields with high potential for contention like transactional memory [HM93], locking [And90], or networking [MB76]. Dice et al. provide an excellent overview of contention management algorithms in the context of atomic compare-and-swap operations [DHM13]. They have shown that two of the simplest algorithms, called constant backoff and exponential backoff, are also the most efficient, particularly on the x86 architecture. Both approaches wait for a certain amount of time before repeating a failed compare-and-swap operation. In the case of exponential backoff this delay is not fixed but depends on the number of times the operation has failed since the last successful attempt. A sample implementation of this algorithm is shown in Listing 2.3 on page 32. We utilise constant backoff instead because it is the simplest contention management algorithm but still promises high performance [DHM13]. It basically just performs a busy wait for some platform dependent constant time whenever a compare-and-swap operation fails. The key advantage of this very lightweight implementation is that it neither requires any per-thread context information nor depends on the actual non-blocking algorithm. Figure 7.2 shows the results of a similar experiment as before which additionally uses constant backoff with varying waiting times. It depicts the total number of successful compare-and-swap operations where failing operations are followed by an increasing number of repetitions of an empty loop. A tight loop consisting of a thousand iterations already helps to improve the vast performance degradation under high congestion considerably. Starting at 10000 iterations, the overall throughput has improved to such an extent that there is always some speedup with respect to the single processor case. Of course, this number is hardware-dependent but it seems to be a good indicator for other machines as well. Tuning the number of iterations to 10⁵ or 10⁶ renders the throughput for three or more processors close to constant and almost independent of the actual contention. In both cases there is also hardly any variation any more since the relative measurement error drops down to less than one percent. All of the following experiments on our server machine use a constant backoff along the lines of these results in order to minimise performance penalties under high contention.
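As a minimal illustration, the following C11 sketch retrofits constant backoff onto a CAS-based increment; the iteration count is the platform-dependent constant discussed above, and all names are illustrative:

#include <stdatomic.h>

static atomic_long counter;          /* the contended shared variable */

/* Busy wait for a fixed, platform-dependent number of iterations. */
static void backoff(long iterations)
{
    for (volatile long i = 0; i < iterations; i++) {
        /* spin without touching shared memory */
    }
}

/* Atomic increment with constant backoff: after a failed CAS, stay off
   the shared memory location for a constant delay before retrying. */
void increment(long delay)
{
    long value = atomic_load(&counter);
    while (!atomic_compare_exchange_weak(&counter, &value, value + 1))
        backoff(delay);  /* 'value' was refreshed by the failed CAS */
}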

Figure 7.2: Throughput of contending compare-and-swap operations with different constant backoff (empty loops of 10³, 10⁴, 10⁵, and 10⁶ iterations).

Especially on this particular machine, we also observed that a naive implementation of processor-local storage using arrays often caused false sharing. False sharing describes situations when two or more processors share the same cache line even though they intentionally read and write distinct memory locations therein [TLH94]. As a consequence, any modification of a processor-local variable invalidates the corresponding cache lines of other processors. This results in a cache miss and causes a significant performance degradation each time one of these processors reads its own processor-local variable. A simple but very effective solution to this problem is to align the elements of the global array to the actual size of the cache lines. This may introduce some spatial overhead but basically ensures that processor-local storage is never shared amongst the caches of two different processors. The same technique may also be beneficial for variables that are modified using atomic compare-and-swap operations in lock-free algorithms.
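A common C11 remedy, sketched below under the assumption of 64-byte cache lines, pads every array element to a full cache line:

#include <stdalign.h>

#define CACHE_LINE 64  /* assumed cache line size of the target machine */

/* One counter per logical processor, aligned and padded to a full cache
   line so that updates never invalidate a neighbouring processor's line. */
struct slot {
    alignas(CACHE_LINE) long value;
};

static struct slot processorLocal[32];  /* 32 logical processors assumed */

void incrementLocal(int processor)
{
    processorLocal[processor].value++;  /* touches only its own cache line */
}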


7.2 Performance Measurements

The main goals of our runtime system were to increase its correctness and reliability by abolishing any kind of lock-based synchronisation while striving at the same time for an uncompromising portability and simplicity of its source code. Nevertheless, any system designed to run on shared-memory processors should naturally also make the most out of the available parallel processing power. In this section, we want to demonstrate that our approach has indeed proven itself in practice and is sometimes even able to compete with the performance of highly optimised operating systems. In order to quantitatively compare the performance of various parts of our lock-free runtime system against the corresponding services provided by established operating systems like Windows and Linux based systems, we assembled a broad set of micro-benchmarks. In comparison to standard benchmarks, which are often complete applications that use several system-specific features simultaneously, micro-benchmarks are simple programs that make it possible to focus on a single feature in isolation in order to understand its performance in the general case. For several reasons however, the results presented in this section should be taken with a grain of salt: First of all, micro-benchmarks in general measure the performance in worst-case scenarios rather than the actual runtime behaviour of full-blown applications and are therefore not very realistic [BDF92]. In the absence of a real-life workload, memory caches in particular are likely to behave completely differently and may cause the performance of memory-bound micro-benchmarks to become distorted. In addition, our runtime system follows a strictly machine-independent design and has therefore a much smaller set of features than other systems. For example, it deliberately abandons virtual memory management and therefore supports neither memory swapping nor heavy-weight processes with separate virtual address spaces. As a result, the micro-benchmarks are only able to cover a small overlap of the functionality provided by the other systems, which may differ substantially in their implementation and sophistication.


Each micro-benchmark compares up to four different operating systems running on the same machine. The first one is always the standalone version of our lock-free runtime system, called Native hereafter. The second one is an Ubuntu Server operating system based on Linux version 3.13.0 with GNU libc version 2.19 installed. Where applicable, the benchmarks are also executed on Windows 7 as well as the latest version of the A2 operating system. Unfortunately, A2 runs in protected mode instead of 64-bit mode and can therefore only access about a quarter of the 16 GB main memory. In addition, it requires a different set of development tools, which is why the actually executed machine code is slightly different. In order to focus on the actual performance of these four runtime environments, we tried to minimise the differences of the machine code generated for each micro-benchmark. Therefore we developed a retargetable framework that allows us to use the exact same source code as well as the same 64-bit compiler in order to create similar executable files for the different systems. Except where otherwise noted, the generated binary code for the benchmarks differs only with respect to the runtime calls for managing threads and allocating memory. Targeting our own runtime system and A2, we make use of the built-in language facilities implemented by the respective compiler. In the case of Linux, we use the default C and PThread libraries and call functions like pthread_create and malloc instead. Executable files targeting Windows on the other hand make use of the corresponding functions of the Windows API. Each of the following micro-benchmarks is repeated 20 times in order to compute the arithmetic mean and relative measurement error. Depending on the actual benchmark, we measure either the wall-clock time using the contention-less high-precision timer or the throughput of programs running for a fixed amount of time. Where the behaviour under different levels of contention is of interest, we created an increasing number of contending threads on an otherwise completely idle machine. In order to unify the hardware-specific behaviour of the two physical processors under increasing workload, we also disabled the respective BIOS features such as performance boosting.


Environment  System Call            Time       Error
Native       Activities.GetCurrent  5.2 ns     0.0%
A2           Objects.ActiveObject   12.7 ns    0.0%
Linux        sys_gettid             1115.3 ns  0.1%
Windows      GetCurrentThreadId     9.8 ns     0.4%

Table 7.1: Duration of minimal system calls.

7.2.1 Minimal System Call

One of the simplest test cases for comparing the performance of operating systems is the evaluation of the actual overhead of calling a system service in the first place. One of the cheapest system calls of Linux is called sys_gettid which just returns the unique identification number of the calling thread. On Windows, the same functionality is provided by the GetCurrentThreadId function. The corresponding functions of our runtime system and A2 are called Activities.GetCurrent and Objects.ActiveObject respectively and return the unique descriptor of the currently running activity. Table 7.1 shows the average overhead of a system call when invoking these functions 10 million times in a row. In Windows and our runtime system the requested information is readily available using a dedicated register and takes the least time. A2 needs to query the descriptor at a location near the very beginning of the current call stack which must be computed first. Linux essentially performs the same operation but has a high overhead compared to the other systems because the system call first jumps from user space to kernel space by issuing a software interrupt. This jump is necessary to get access to internal data structures of the kernel which are otherwise protected by hardware mechanisms from malicious user code. In all other cases including Windows however, this particular system call equals an ordinary function invocation which does not offer this kind of protection. In A2 and our runtime system, all of the provided system calls are in fact statically bound.
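For Linux, the measurement loop can be sketched as follows; the repetition count matches the 10 million calls mentioned above:

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    enum { CALLS = 10 * 1000 * 1000 };
    struct timespec start, stop;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < CALLS; i++)
        syscall(SYS_gettid);  /* minimal system call: thread identifier */
    clock_gettime(CLOCK_MONOTONIC, &stop);
    double ns = (stop.tv_sec - start.tv_sec) * 1e9
              + (stop.tv_nsec - start.tv_nsec);
    printf("%.1f ns per call\n", ns / CALLS);
    return 0;
}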


7.2.2 Thread Creation Time

A further important performance characteristic of a multitasking operating system is the overhead in terms of the time it takes to create and terminate tasks. In A2 and our runtime system, tasks are called activities and are equivalent to lightweight processes or threads known from other systems. Since activities are associated with active objects, they are automatically created by instantiating an active object. In order to create threads under Windows and Linux on the other hand, we call the library and API functions pthread_create and CreateThread respectively. All threads just terminate their execution as soon as they are scheduled. Figure 7.3 shows for each system the total time it takes to create and wait for the termination of an increasing number of threads. The benchmark and all of the created tasks run on the same core in order to properly estimate the overall overhead per thread. The results clearly show that A2 and especially our runtime system outperform the other systems significantly. In all four cases, the overhead of creating a thread consists at least of reserving memory for the stack and registering the thread for scheduling. A2, Windows, and Linux additionally register the reserved stack memory with the virtual memory management. This technique makes it possible to detect and react to stack overflows, which trigger a page fault if the accessed memory location on the stack is not mapped. However, this approach also implies that the systems have to reserve several consecutive virtual memory pages per thread. The considerable overhead of Windows and Linux in comparison to A2 suggests that there is some additional work involved. Our runtime system relies on compiler-inserted checks rather than on virtual memory management and grows a stack if necessary by copying its contents into a larger memory block as described in Section 5.2. As a result, the stack of newly created threads can have a much smaller granularity in comparison to the other systems. This is the reason why our runtime system is the only one that can handle more than one million threads at the same time. A2 and Linux fail to create 100000 threads at once while Windows struggles with one million.
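On Linux, the core of this benchmark can be sketched as below; for brevity, the sketch omits pinning the benchmark and its threads to a single core:

#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Each thread terminates as soon as it is scheduled. */
static void *task(void *argument) { return argument; }

int main(void)
{
    enum { THREADS = 10000 };
    static pthread_t threads[THREADS];
    struct timespec start, stop;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < THREADS; i++)
        pthread_create(&threads[i], NULL, task, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(threads[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &stop);
    double us = (stop.tv_sec - start.tv_sec) * 1e6
              + (stop.tv_nsec - start.tv_nsec) / 1e3;
    printf("%.1f us per thread\n", us / THREADS);
    return 0;
}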


Figure 7.3: Thread creation time (microseconds per thread for Native, A2, Linux, and Windows when creating 10 to 1,000,000 threads; entries marked N/A denote failed runs).

7.2.3 Context Switch Time

The next micro-benchmark tries to estimate the overhead of a single context switch under high contention. The context switch time in this case describes the time it takes the scheduler to select a new task to run and to give it control of execution. The corresponding functions to perform an explicit context switch are called Activities.Switch in our runtime system, Objects.Yield in A2, and pthread_yield under Linux. Windows does not offer an API function that guarantees a context switch and is therefore not considered in this benchmark.
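The per-thread workload of this benchmark can be sketched as follows for Linux, where pthread_yield amounts to sched_yield; the driver that creates the threads and measures the elapsed time is omitted:

#include <sched.h>
#include <stddef.h>

/* Worker of the context switch benchmark: each thread requests 10000
   explicit context switches in a tight loop. */
static void *yielder(void *argument)
{
    (void)argument;
    for (int i = 0; i < 10000; i++)
        sched_yield();  /* explicit context switch request */
    return NULL;
}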


Figure 7.4: Context switch time (microseconds per context switch for Native, A2, and Linux with 1 to 128 threads).

Figure 7.4 shows for each system the average cost of a context switch under increasing contention. We measured the time it takes all threads to perform 10000 context switches in a tight loop and divided it by the total number of context switches. A2 and our runtime system perform essentially the same operations but A2 protects its ready queue using a global spinlock. For 1 to 32 threads, hardly any context switch is necessary in both cases because the ready queue is empty most of the time. The increasing time difference however stems from the spinlock which forces contending threads to wait for each other. From 32 threads onwards, the ready queue starts to fill and context switches become necessary. The overhead of our runtime system is mainly due to the increasing memory contention on the global ready queue but it still performs several times better than A2. Starting at 64 threads which are permanently task switching, there is at least one item per processor in the ready queue and the overhead per context switch becomes more or less constant in both cases.


The results for Linux on the other hand reveal radically different performance in comparison to A2 and our runtime system. According to the source code of Linux and libc, a call to the pthread_yield function eventually triggers the sched_yield system call, which uses a dedicated ready queue per processor. For performance reasons, these queues are only infrequently load balanced and therefore require hardly any synchronisation amongst each other, which effectively prevents memory contention. Contrary to its specification however, this optimisation implies that the pthread_yield function cannot guarantee that another thread which is ready to run actually gets executed. This is why the authors of the Linux kernel strictly forbid users of the sched_yield system call to misuse it to synchronise threads with each other. It is important to keep in mind that this micro-benchmark tests the performance of the system in the absolute worst case where threads perform nothing else but context switches. As soon as there is some additional workload, there is much less memory contention and the switches are bound to perform better in our runtime system. This is not the case for Linux, as can be seen by the context switch time for a single thread, which is higher than the worst-case overhead of our runtime system.

7.2.4 Matrix Multiplication

Whereas the previous benchmark measured the time of synchronous task switches, the following one tries to estimate the relative cost of asynchronous task switches. This benchmark measures the overall time required to multiply two 1024 by 1024 matrices with several threads as an example of a simple yet non-trivial concurrent program. It distributes the workload to an increasing number of threads where each thread computes distinct rows of the result matrix, as sketched below. Since this partitioning does not require any synchronisation of the threads, the actual speedup of the multiplication should be close to optimal. However, because each thread executes exactly the same code on each system, the actual difference in time must originate from the overhead of performing task switches as well as creating and terminating the threads.
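The row-wise partitioning can be sketched in Active Oberon as follows; the matrix representation and the way row ranges are assigned are assumptions of this sketch:

  CONST N = 1024;
  TYPE Matrix = ARRAY N, N OF REAL;
  VAR a, b, c: Matrix;  (* c := a * b *)

  TYPE Worker = OBJECT
  VAR first, last: SIZE;  (* range of result rows computed by this worker *)

    PROCEDURE &Init (lo, hi: SIZE);
    BEGIN first := lo; last := hi;
    END Init;

    PROCEDURE Compute;
    VAR row, col, k: SIZE; sum: REAL;
    BEGIN
      FOR row := first TO last DO
        FOR col := 0 TO N - 1 DO
          sum := 0;
          FOR k := 0 TO N - 1 DO sum := sum + a[row, k] * b[k, col] END;
          c[row, col] := sum;
        END;
      END;
    END Compute;

  BEGIN {ACTIVE}
    Compute;
  END Worker;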


Figure 7.5: Time for multiplying two 1024 × 1024 matrices concurrently. (Seconds for Native, A2, Linux, and Windows with 1 to 32 threads.)

Figure 7.5 depicts the overall time for multiplying both matrices using an increasing number of threads on each system. Except for A2, the benchmark yields approximately the same performance on each system. During the computation using a single thread, which takes about 19 seconds, Linux performs nearly 1800 asynchronous task switches. A2 interrupts the computation exactly once per millisecond and therefore requires about 24000 asynchronous task switches. Our runtime system on the other hand performs almost as many synchronous task switches instead. In addition to the actual task switches, the implicitly generated machine code at the end of each nested loop of the matrix multiplication executes a total number of about 1024³ quantum checks. Compared to a variant of this benchmark that uses an uncooperative block in order to completely prevent implicit cooperative multitasking, the impact on the performance is about 2%. The results shown above indicate however that the overall overhead of implicit cooperative multitasking is on a par with preemptive scheduling as implemented by Linux and Windows.
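Conceptually, each quantum check inserted by the compiler at the end of a loop body behaves like the following sketch; the counter name and its handling are hypothetical simplifications of the actual scheme described in Chapter 4:

  (* implicitly appended to the end of each loop body *)
  DEC (quantum);
  IF quantum <= 0 THEN
    Activities.Switch;  (* cooperative task switch once the quantum has expired *)
  END;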


Figure 7.6: Speedup of multiplying two 1024 × 1024 matrices. (Speedup of 1 to 32 for Native, A2, Linux, and Windows with 1 to 32 threads.)

The actual speedup of the concurrent matrix multiplication regarding each system in isolation is shown in Figure 7.6. Even under high load, our runtime system exhibits the highest speedup, which is close to linear. The other systems behave similarly, but their relative performance drops earlier as the utilisation of the 32 cores increases.

7.2.5 Locking

This micro-benchmark was designed to measure the average locking overhead for critical sections under no or heavy lock contention. It creates an increasing number of threads that repeatedly enter and exit either the same or their own critical section. The benchmark measures the time it takes all threads to execute 10000 pairs of lock operations in order to compute the average overhead per critical section. In the case of A2 and our runtime system, the benchmark enters critical sections using the built-in object monitor associated with active objects. Under Linux and Windows,


mutex objects like the pthread_mutex_t type from the PThread library and the CRITICAL_SECTION provided by the Windows API are used to enter a critical section. These are fast user-space objects since they only require a system call if the mutex is contended.

Figure 7.7: Average cost of locking operations with no lock contention. (Microseconds per critical section for Native, A2, Linux, and Windows with 1 to 64 threads.)

Figure 7.7 shows the average overhead of entering independent critical sections, resulting in no lock contention. As expected, the performance of all systems scales nicely since no contention is involved. A2 reveals a much higher overhead per critical section while all other systems perform approximately the same. Figure 7.8 on the next page shows the result of increasing the lock contention by always entering the same critical section. A2 reveals an exceptionally high cost under increasing contention, which might be caused by the spinlock used to protect the monitor itself from concurrent access while acquiring it. Our runtime system performs much better in this respect but, in comparison to Windows and Linux, still suffers from some overhead caused by the memory contention.


Figure 7.8: Average cost of locking operations with heavy lock contention. (Microseconds per critical section for Native, A2, Linux, and Windows with 1 to 64 threads.)

7.2.6 Thread Synchronisation

The following micro-benchmark was designed to evaluate the cost of synchronising several threads with each other under increasing contention. For this purpose, it simulates an increasing number of concurrent runs of independent games. Each game consists of ten players passing a token around in a round-robin fashion for a total of 10000 rounds. Each player is represented by a thread that waits until it receives a token from the previous player. It immediately passes the token on to the next player and waits for it to arrive again in the next round. In A2 and our runtime system, the wait for the token is implemented using the AWAIT statement provided by the Active Oberon programming language, as sketched below. In Linux and Windows on the other hand, the library types pthread_cond_t and CONDITION_VARIABLE are used instead. In either case, a thread receiving the token will be resumed by another thread which immediately suspends itself again.
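The waiting and passing of the token can be sketched with an exclusive region and an AWAIT statement roughly as follows; this is a simplification of the actual benchmark:

  TYPE Game = OBJECT
  VAR token: SIZE;  (* index of the player currently holding the token *)

    PROCEDURE Pass (player, players: SIZE);
    BEGIN {EXCLUSIVE}
      AWAIT (token = player);  (* wait until the token arrives *)
      token := (player + 1) MOD players;  (* pass it on to the next player *)
    END Pass;

  END Game;

Each player repeatedly calls Pass with its own index, in this benchmark 10000 times per game.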


Figure 7.9: Duration of several concurrent games featuring ten players passing a token 10000 times around. (Seconds for Native, A2, Linux, and Windows with 1 to 32 games.)

Figure 7.9 shows the total elapsed time to run up to 32 concurrent games on each of the four systems. Since in each game only one player at a time is running in principle, each game could in theory run on its own processor and the total time should therefore always be about the same. However, all systems reveal a more or less drastic deviation from the ideal speedup when simulating a lot of games, which we attribute to the increased memory contention. The contention is due to awoken players, which are enqueued into the ready queue and might continue their execution on a different processor. As explained above, Linux dedicates a ready queue to each processor and therefore has the smallest overhead. A2 as well as our runtime system use shared ready queues instead, which naturally causes more contention. The growing difference in the overall running time of the two systems however is due to the fact that our runtime system does not use a global lock to protect the ready queues from concurrent access.


7.2.7 Memory Allocation

The memory allocation micro-benchmark was designed to assess and compare the performance of the heap managers of all considered systems under high contention. The benchmark creates an increasing number of threads which allocate and deallocate memory blocks in the following intricate way in order to stress the memory managers as much as possible.

Each thread allocates a total number of 100000 memory blocks of random size by performing 10000 pairs of alternating allocation and deallocation operations ten times in a row. This interleaving ensures an evenly distributed load of the two operations independent of the number of concurrent threads. The sizes of the allocated memory blocks are chosen randomly in order to strain data structures like free lists equally. The resulting addresses of the 10000 allocations are stored in a buffer which is thereafter inserted into a global lock-free pool of buffers. Each thread retrieves one buffer from this pool in order to perform the 10000 memory deallocations in reverse order. The usage of these buffers guarantees that memory blocks are never allocated and deallocated by the same thread, which places an additional burden on the memory manager. A sketch of this per-thread workload is given below.

Benchmarking our native runtime system, we use the built-in NEW and DISPOSE functions provided by the programming language which map onto the operations of the lock-free heap manager presented in Section 5.1. Unfortunately, this micro-benchmark cannot be executed under A2 since this system relies heavily on the garbage collector and does not allow its users to deallocate memory explicitly. In order to allocate and deallocate memory in Linux, we call the libc functions malloc and free, which are based on ptmalloc2, Wolfram Gloger's derivative of Doug Lea's allocator [Glo06]. Under Windows we use the API functions HeapAlloc and HeapFree instead.

Figure 7.10 on the facing page shows the results of running the benchmarks on all three systems with three different upper limits for the random sizes of the memory blocks. In comparison to the other systems, the results of all measurements clearly indicate that the performance of our heap manager stays the same regardless of the actual block size. The overhead per thread grows more or less linearly in each case and is mainly due to the heavy memory contention surrounding the lock-free heap operations.
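The per-thread workload corresponds roughly to the following sketch; the buffer type, the random number generator, and the lock-free pool are assumed to be provided elsewhere:

  PROCEDURE Work (maximalSize: SIZE);
  VAR round, i: SIZE; buffer: Buffer;
  BEGIN
    FOR round := 1 TO 10 DO
      NEW (buffer);
      FOR i := 0 TO 9999 DO
        NEW (buffer.block[i], Random (maximalSize));  (* allocation of random size *)
      END;
      Pool.Insert (buffer);  (* publish the addresses in the global lock-free pool *)
      buffer := Pool.Retrieve ();  (* usually filled by a different thread *)
      FOR i := 9999 TO 0 BY -1 DO
        DISPOSE (buffer.block[i]);  (* deallocation in reverse order *)
      END;
    END;
  END Work;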


Figure 7.10: Allocating memory with varying maximal block sizes. (Three panels for maximal block sizes of 100, 1000, and 10000 bytes; each panel shows the total seconds and the seconds per thread for Native, Linux, and Windows with 1 to 32 threads.)

Linux and Windows on the other hand pursue different allocation strategies depending on the size of the memory allocation. For small memory blocks, both systems optimise the synchronisation overhead of concurrent access by allocating memory blocks from preallocated superblocks dedicated to each processor [Glo06, RSI12]. This is demonstrated by the respective measurement results for a maximal block size of 100 bytes, indicating no memory contention whatsoever. Increasing the maximal block size reduces the efficiency of using superblocks since they themselves have to be allocated using a different strategy. This is for example shown by the increased overhead of Windows when allocating memory blocks with a maximal size of 1000 bytes. The length of superblocks usually equals the page size of the machine, which is often exceeded when allocating random sized memory blocks of up to 10000 bytes. In this case, both systems presumably use locks to protect the heap from concurrent access, which is clearly indicated by their exponentially growing run time.

This benchmark shows that our heap manager scales well regarding the size of the requested memory and clearly outperforms implementations based on locking synchronisation. This certainly includes A2, which always allocates memory under mutual exclusion. However, the results also reveal that there is much room for improvement, for example by using processor-local memory arenas in order to prevent memory contention altogether.

7.2.8 Memory Management

Automatic memory management in the form of the garbage collector is optional in our runtime system and can be replaced by manual memory deallocation if desired. The following benchmark tries to compare the performance of programs that either rely on the garbage collector or free their allocated memory by hand. For this purpose, three different exemplary data structures are used to store a total of 2²² allocated objects called nodes. The data structures in question are a contiguous array storing references to the nodes, a singly linked list chaining the nodes, and a fully populated binary tree of nodes.


Figure 7.11: Comparison of manual and automatic memory management. (Seconds, from 0 to 2, for the array, linked list, and binary tree data structures, each deallocated manually and automatically.)

Figure 7.11 shows the total running time of allocating and deallocating the nodes and storing them in one of the three different data structures. For each data structure, the nodes are deallocated either manually or using the garbage collector by invalidating the reference to the root node or the array and performing a complete garbage collection cycle afterwards. The benchmark measures the time for allocating the nodes as well, because each allocation includes the insertion of the node into the watched list in the cases where the garbage collector is used. Regarding the manual deallocation of the array, which can be traversed most efficiently out of the three data structures, the overhead of using automatic memory management corresponds to about 26%. In order to deallocate the nodes stored in the linked list, the manual deallocation must traverse the successor of each node, as sketched below. The binary tree on the other hand is explicitly deallocated using a recursive function. In both cases, the actual traversal of the data structure therefore takes more time than iterating over an array, and the overhead introduced by the garbage collector reduces to approximately 17%.
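For illustration, the manual deallocation of the linked list corresponds to a traversal like the following:

  TYPE Node = POINTER TO RECORD next: Node END;

  PROCEDURE DisposeList (VAR first: Node);
  VAR next: Node;
  BEGIN
    WHILE first # NIL DO
      next := first.next;  (* remember the successor before disposing the node *)
      DISPOSE (first);
      first := next;
    END;
  END DisposeList;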


Because this benchmark does not require the garbage collector to mark any objects, the overhead consists only of adding newly created nodes into the linked list of watched objects and traversing them sequentially while sweeping. As described in Section 5.3, the runtime system itself does not rely on the garbage collector and deallocates all of its resources manually.

7.2.9 Garbage Collection Latency

An important characteristic of a runtime system with automatic memory management is the latency caused by the garbage collector. The following micro-benchmark tries to measure the impact of garbage collections on concurrently running threads. For this purpose, the benchmark creates 32 threads in order to keep all cores of the server machine busy for ten minutes. One of these threads repeatedly pauses for a short while and measures the actual wall-clock time it was delayed. All other threads allocate objects of 1000 bytes in a tight loop and store the references to the 5000 most recent allocations in a buffer. The idea is to continuously trigger the garbage collector with a realistic workload of live objects and measure its impact on the one thread that does not perform any memory allocations at all. Windows and Linux, being operating systems rather than runtime systems for programming languages, do not offer the functionality of automatic memory management. Therefore, we compared the performance of A2 and our runtime system with the Java virtual machine provided by OpenJDK version 7 running on top of Linux. As this Java platform allows its users to select one of several different garbage collectors, we executed the benchmark once for each available option. In order to achieve a runtime environment comparable to the native case, we additionally configured the initial and maximal heap size to encompass all of the 16 GB of main memory where applicable.

Figure 7.12 on the next page shows the percentile distribution of the latencies of all considered runtime systems and their garbage collectors. In 98% of all cases, none of the garbage collectors causes any considerable delay on the measuring thread. Regarding higher percentiles however, the various garbage collectors of the Java virtual machine one after another


begin to reveal latencies of up to 200 milliseconds and more. This even applies to the concurrent mark sweep collector (CMS) and its successor called G1, both of which are concurrent as well as incremental and designed to decrease garbage collection pauses.

Figure 7.12: Percentile distribution of garbage collector latencies. (Latency in milliseconds at the 90% to 99.999% percentiles for Native, A2, and the serial, parallel, CMS, and G1 collectors of the Java virtual machine.)

In contrast, the automatic memory management of A2 employs a monolithic stop-the-world garbage collector which causes a maximal latency of about 60 milliseconds. This relatively low duration stays almost constant because it consists only of the time required to mark the object graph, which always contains the same amount of objects in this benchmark. The actual sweep phase of the garbage collection does not cause any delay on the measuring thread as it is lazily deferred to the object allocation performed by the other threads. The garbage collector of our runtime system on the other hand has no intrusive impact whatsoever. This behaviour is to be expected as each thread which detects an out-of-memory situation executes a garbage collection cycle on its own. Since our garbage collector by design never interferes

with concurrent mutators or independent activities, there cannot be any observable latency apart from scheduling. This would even apply to more realistic settings where garbage collection cycles are usually also performed periodically by dedicated activities.

7.3 Portability and Flexibility

In this last section of the evaluation of our runtime system we try to qualitatively evaluate its portability and flexibility in comparison to other systems. First of all, the design goals of achieving an uncompromising portability and simplicity of the runtime system are directly reflected by the relative shortness of its source code. Table 7.2 on the facing page for example shows the complexity of the complete runtime system in terms of lines of code in comparison to the equivalent source code of A2. The numbers have been generated using cloc, a program that counts physical lines of source code by removing blank lines and all comments. Although both systems provide a similar set of features regarding the runtime support for Active Oberon, our complete runtime system amounts to about 27% of the source code of the A2 kernel. A notable component is the memory management, which requires only 357 physical lines of code for the implementation of the complete heap manager including the garbage collector.

Comparing A2 and our runtime system with other systems is in most cases either not feasible or unfair as they differ significantly in their design and implementation. The scheduler of Linux for example provides a myriad of additional features in direct comparison with our simplistic cooperative scheduler. The corresponding C source code files named core.c, fair.c, and wait.c implement basic scheduling primitives as well as the completely fair scheduler [Mol07] and require almost twenty times more physical lines of source code. The libc memory manager provided in the files malloc.c and arena.c on the other hand needs about ten times more code. In general however, comparisons based on lines of code highly depend on the underlying programming language as well as coding styles and must therefore always be taken with a grain of salt.


Component            Lines    Ratio
Interrupt Handling     299      75%
Memory Management      357      20%
Modules                 82      17%
Multiprocessing        215      37%
Runtime Support        174      12%
Scheduler              547      37%
Total                 1674      27%

Table 7.2: Size of various components of the runtime system in terms of physical lines of code and their ratio in comparison to A2.

The relative shortness of its source code also automatically increases the maintainability of the system. This asset is boosted by the fact that most of the source code is completely portable and thus a simple recompilation suffices to target a different machine. Only a total of six different functions have to be rewritten in order to support memory allocations, high-level interrupt handling, and multiple processors on a new target. For this reason, the runtime system executes the same source code most of the time and therefore behaves reliably and predictably on each machine it has been ported to. This includes a broad range of platforms like tiny 8-bit AVR microchips with only two kilobytes of SRAM, popular 32-bit ARM-based boards like the Raspberry Pi, as well as powerful AMD64 server machines with 64 cores and 128 GB of main memory.

Its low consumption of resources and its high platform independence even render our runtime system interesting for architectures without shared memory. In multiprocessing systems which use message passing instead, the runtime system can be applied to provide lightweight multitasking on a single processing unit. This is made possible for the most part by using cooperative multitasking instead of preemption, which allows the runtime system to be used even on architectures that do not support interrupts or hardware timers.

8 Conclusion

This final chapter concludes this thesis by summarising our work and giving various directions for possible improvements in the future. A comprehensive list of the main contributions of this work can be found in Section 1.4 on page 9.

8.1 Summary

In this dissertation we explored the synergy of lock-free programming and cooperative multitasking in order to create a lock-free and portable runtime system for modern object-oriented programming languages with support for multithreading. Although both techniques have been investigated intensively in isolation, their combination presents a novel and hardly researched approach to lock-free programming. The capability of preventing a non-blocking algorithm from involuntarily losing control of execution renders several beneficial programming techniques possible. Besides enabling practical progress guarantees for instance, this approach most notably allows non-blocking algorithms to replace thread-local storage with processor-local storage. We showed how this programming technique allows lock-free algorithms to depend only on the static number of processors rather than the dynamic set of threads. The maximal number of processors is always known in advance for any hardware architecture, which allows processor-local storage to be implemented efficiently and kept readily accessible for inspection by any other processor.

One particular lock-free data structure that profits from this programming technique are queues which employ the notion of hazard pointers in order to guard intermediate nodes from unsafe reclamation. We explained how our approach reduces the complexity of identifying hazardous nodes from

linear time in the number of threads to constant time. This enabled us to provide a memory-efficient implementation of a lock-free queue that reuses intermediate nodes if the same elements are enqueued several times. Queue operations therefore do not constantly allocate memory any more, an issue that seems to be taken for granted in the standard literature but has never been addressed before.

We used this improved lock-free queue extensively for the implementation of the cooperative task scheduler because tasks qualify as elements that are potentially enqueued and dequeued a lot during their lifetime. The queues are part of a framework provided by the scheduler to implement various synchronisation primitives in a lock-free manner. We explored simple constructs like mutexes and events but also showed the implementation of a full-featured object monitor. All of these synchronisation primitives make use of the notion of task switch finalisers, which are essential to recheck synchronisation conditions in a non-blocking context. In addition, we also provided a lock-free synchronisation primitive for handling interrupts by awaiting their occurrence in a high-level fashion.

The design goal of implementing all components of the runtime system in a lock-free manner has especially paid off in the context of implementing low-level interrupt handlers. Since lock-free algorithms are inherently reentrant, the complete set of data structures and operations of the runtime system is available for use in interrupt handlers without further precaution. Consequently, it is never necessary to disable interrupts, which renders the system as responsive as possible.

By replacing preemption with cooperative multitasking, the implementation of the scheduler and the runtime system has basically become machine-independent and its source code is therefore highly portable. In order to port the system to different architectures, only a small set of half a dozen functions has to be adapted. While the runtime system can be used as a kernel providing the essential service routines, its machine independence also allows it to be linked as a library for applications running on top of existing operating systems. The cooperativeness of the tasks created by these applications is enforced by the compiler, which supplements the generated machine code with implicit cooperative task switches.


In order to close the circle of the non-blocking runtime support for object-oriented programming languages we also designed and implemented a lock-free memory management. It consists of a lock-free memory allocator that is capable of managing the complete main memory using one or more distinct heaps represented as buddy systems. The memory associated with the call stacks of activities is also represented using ordinary heap blocks. This representation is enabled by runtime checks provided by the compiler, which render virtual memory management unnecessary for detecting stack overflows. Last but not least, we also devised a completely optional garbage collector based on a lock-free mark and sweep algorithm, which naturally implies concurrent and incremental garbage collection cycles.

Our lock-free runtime system is based on the kernel of the A2 operating system and designed to serve as an in-place replacement. The source code of the runtime system is much smaller and therefore more maintainable in comparison to the A2 kernel, which is based on blocking synchronisation. The evaluation also revealed that our runtime system outperforms A2 in various aspects because of its non-blocking nature. Regarding other systems like Windows and the Linux kernel, our work demonstrated comparable performance in several cases although it provides a much different feature set and abandons optimisations in favour of clarity and simplicity.

However, all of our measurements have shown that the naive perception that lock-free algorithms must by design be more efficient than their blocking counterparts is misguided at best. With an increasing level of contention, several hardware-dependent facilities transparent to the programmer like memory caches become a bottleneck for the throughput of atomic operations and the overall performance of the system. In practice, the introduction of some sort of contention management is indispensable but only allows the performance impact to be absorbed to some degree.

The real worth of lock-freedom therefore rather lies in its progress guarantees and its implicit solution for all problems usually associated with blocking synchronisation techniques. This includes issues like deadlocks, priority inversion, convoying, and the disastrous consequences of failing to release locks, which all completely vanish by design when using non-blocking algorithms instead.


8.2 Future Work

Our evaluation has identified several aspects where our runtime system could be improved in order to increase its performance. This basically includes all data structures that are prone to high memory contention like the ready queues of the scheduler or the free lists of the heap manager. Although contention management techniques like constant backoff and aligning data to the size of the cache line helped to reduce hardware inefficiencies, the results as well as the comparison with long-standing systems indicate that memory contention should be avoided whenever possible.

A simple solution is duplicating all global data structures using processor-local storage such that each processor can naturally access its dedicated portion of the workload without any contention. Examples of this technique include the Linux kernel, which uses ready queues dedicated to each processor, or the Hoard memory allocator, which allocates memory from superblocks associated with individual processors [BMBW00]. The challenge of this approach lies in the lock-free synchronisation of the processors amongst each other in order to guarantee an evenly distributed workload. Regarding an improved version of the heap manager applying this technique, it is important to ensure that memory always stays available for further allocation if it is freed by a different processor than the one that allocated it. A solution to this problem might also improve the currently suboptimal implementation of the coalescing of adjacent free memory blocks.

Another approach to minimise the overall contention programmatically is to extend the task selection model of the scheduler, which is currently practically sufficient but rather simplistic in comparison to the feature set provided by schedulers of other operating systems. The introduction of processor affinities for example would allow programmers to explicitly assign tasks with a high potential for memory contention to a small set of dedicated processors. Regarding the targeted simplicity of the runtime system however, it is questionable whether these two approaches discussed above are worth the effort since they presumably would require replacing the comparatively simple ready queues with much more sophisticated lock-free data structures.


Another area needing improvement concerns the implicit cooperative task switches generated by the non-optimising compiler. Although our measurements have shown that the overhead of the instrumented machine code is in general on par with alternatives based on preemption, we are convinced that an optimising compiler could completely elide the checks in several situations and reduce the overhead even further. The same also applies to other implicitly generated runtime checks like the stack overflow check presented in Section 5.2.

Since there is a relatively strong bond between the runtime system and the compiler, other system components could profit from an optimising compiler as well. For example, an analysing compiler could detect that a reference to an object allocated in one particular procedure never leaves the scope of that procedure. In this case, the allocated object does not need to be registered in the watched list of the garbage collector as it can be deallocated at the exit of the procedure. In fact, the corresponding local variable holding the reference could be treated as untraced and would not require changing the local reference counter of the object.

Conceptually, it would be worthwhile investigating whether our approach of combining cooperative multitasking with lock-free programming also improves the implementation of wait-free algorithms. By solving the problem of potential starvation, wait-freedom would finally allow us to supplement all operations of the runtime system with real-time guarantees.

On a more personal note, it would be interesting to evaluate the comparably high portability of the runtime system even more by bringing it to a plethora of different platforms and embedded systems. Several popular hardware architectures like MIPS and PowerPC based systems come to mind, but also architectures based on message passing like the XCore XS1, or even microcontrollers like the MicroBlaze which are available exclusively in field-programmable gate arrays.

Appendix

A Digital Material

This dissertation includes a compact disk medium which provides the following digital material:

• The binary code of the runtime system and the compiler targeting AMD64 machines including detailed instructions for installation.

• The complete source code of the runtime system and the compiler including an interactive demo program which offers the following command interface to showcase our runtime system:

  Demo.Line: Moves a random character across the screen.

  Demo.Lines count: Executes the above command several times at once by creating count threads as a stress test for the scheduler presented in Chapter 4.

  Demo.Allocate count size: Allocates and ignores count memory blocks containing size bytes as a stress test for the heap manager presented in Section 5.1. The resulting memory is deallocated automatically by the garbage collector which is executed at every full minute, if the free memory is exhausted, or by invoking the command GarbageCollector.Collect.

  Demo.Insert count: Adds count elements to a linked list in order to stress the garbage collector while marking the object graph.

  Demo.Clear: Invalidates the reference to the first element of the global list causing all elements to be collected at the next cycle.

  Demo.Recurse: Expands the stack several times as presented in Section 5.2 by performing an infinite recursion causing a trap.

• The source code of all micro-benchmarks evaluated in Chapter 7.

B Module Reference

This appendix describes the interface of all modules shown in Table 6.2 on page 139 and lists their contents in alphabetical order. All information given hereafter was automatically extracted and generated from the source files and is mainly intended for programmers maintaining the system.

B.1 Activities Module

The Activities module provides the runtime support for activities associ- ated with active objects. It implements a basic task scheduler that distributes the work of concurrent activities to logical processors. In addition, it also provides a framework for implementing synchronisation primitives.

Interface Summary

Compiler Call: Activities.Create, Activities.ExpandStack, Activities.Switch, Activities.Wait
Constant: Activities.DefaultPriority, Activities.HighPriority, Activities.IdlePriority, Activities.RealtimePriority
Garbage Collection: Activities.AssignmentsInProgress, Activities.IsLocalVariable
Interrupt Handling: Activities.CallVirtual, Activities.CreateVirtualProcessor, Activities.VirtualProcessor, Activities.awaiting
Procedure: Activities.Call, Activities.GetCurrentProcessorIndex, Activities.SetCurrentPriority, Activities.TerminateCurrentActivity
Runtime Call: Activities.Execute, Activities.Idle, Activities.Terminate
Scheduling: Activities.Activity, Activities.FinalizeSwitch, Activities.GetCurrentActivity, Activities.Priority, Activities.Resume, Activities.Select, Activities.SwitchFinalizer, Activities.SwitchTo

B.1.1 Activities.Activity Object

Represents the handler identifying activities that are currently either running or suspended.

TYPE Activity* = OBJECT (VirtualProcessor) END Activity;

B.1.2 Activities.AssignmentsInProgress Procedure

Returns whether any activity is currently executing an assignment statement.

PROCEDURE AssignmentsInProgress- (): BOOLEAN;

B.1.3 Activities.Call Procedure

Creates a new activity that calls the specified procedure.


PROCEDURE Call- ( procedure: PROCEDURE );

B.1.4 Activities.CallVirtual Procedure

Temporarily exchanges the currently running activity with a virtual processor in order to call the specified procedure in a different context.

PROCEDURE CallVirtual- ( procedure: PROCEDURE (value: ADDRESS); value: ADDRESS; processor: VirtualProcessor );

B.1.5 Activities.Create Procedure

This procedure is called by the compiler for each NEW statement that creates an active object. It associates an active object with a new activity that begins its execution with the specified body procedure.

PROCEDURE Create- ( body: PROCEDURE; priority: Priority; object: BaseTypes.Object );

B.1.6 Activities.CreateVirtualProcessor Procedure

Creates an activity that pretends to be executed on a distinct processor.

PROCEDURE CreateVirtualProcessor- (): VirtualProcessor;

B.1.7 Activities.DefaultPriority Constant

Indicates the default priority of new activities.

CONST DefaultPriority* = (* integer value *);


B.1.8 Activities.Execute Procedure

Starts the scheduler on the current processor by creating a new activity that calls the specified procedure. This procedure is called by the runtime system once during the initialization of each processor.

PROCEDURE Execute- ( procedure: PROCEDURE );

B.1.9 Activities.ExpandStack Procedure

Expands the stack memory of the current activity to include the specified stack address and returns the corresponding address on the expanded stack.

PROCEDURE ExpandStack- ( address: ADDRESS ): ADDRESS;

B.1.10 Activities.FinalizeSwitch Procedure

Finalizes a task switch by calling the switch finalizer of the previously suspended activity. This procedure must be called after each invocation of the Activities.SwitchTo procedure.

PROCEDURE FinalizeSwitch-;

B.1.11 Activities.GetCurrentActivity Procedure

Returns the handler of the current activity executing this procedure call.

PROCEDURE GetCurrentActivity- (): Activity;

B.1.12 Activities.GetCurrentProcessorIndex Procedure

Returns the unique index of the processor executing this procedure call.

PROCEDURE GetCurrentProcessorIndex- (): SIZE;


B.1.13 Activities.HighPriority Constant

Indicates a higher priority than the default.

CONST HighPriority* = (* integer value *);

B.1.14 Activities.Idle Procedure

This is the default procedure for initially idle processors starting the scheduler using the Activities.Execute procedure.

PROCEDURE Idle-;

B.1.15 Activities.IdlePriority Constant

Indicates the lowest priority used for idle processors.

CONST IdlePriority* = (* integer value *);

B.1.16 Activities.IsLocalVariable Procedure

Returns whether the specified address corresponds to a local variable that resides on the stack of the current activity calling this procedure.

PROCEDURE IsLocalVariable- ( address: ADDRESS ): BOOLEAN;

B.1.17 Activities.Priority Type

Represents one of four different priorities of an activity.

TYPE Priority* = SIZE;

B.1.18 Activities.RealtimePriority Constant

Indicates the highest of all priorities.

CONST RealtimePriority* = (* integer value *);


B.1.19 Activities.Resume Procedure

Resumes the execution of an activity that was suspended by a call to the Activities.SwitchTo procedure beforehand.

PROCEDURE Resume- ( activity: Activity );

B.1.20 Activities.Select Procedure

Returns whether there is an activity that is ready to run and has at least the specified priority.

PROCEDURE Select- ( VAR activity: Activity; minimum: Priority ): BOOLEAN;

B.1.21 Activities.SetCurrentPriority Procedure

Sets the priority of the current activity calling this procedure and returns the previous value.

PROCEDURE SetCurrentPriority- ( priority: Priority ): Priority;

B.1.22 Activities.Switch Procedure

Performs a cooperative task switch by suspending the execution of the current activity and resuming the execution of any other activity that is ready to continue. This procedure is called by the compiler whenever it detects that the time quantum of the current activity has expired.

PROCEDURE Switch-;


B.1.23 Activities.SwitchFinalizer Procedure Type

Represents a procedure that is called after the execution of an activity has been suspended by the Activities.SwitchTo procedure.

TYPE SwitchFinalizer* = PROCEDURE ( previous: Activity; value: ADDRESS );

B.1.24 Activities.SwitchTo Procedure

Performs a synchronous task switch. The resumed activity continues its execution by first calling the specified finalizer procedure with the given argument. Each invocation of this procedure must be directly followed by a call to the Activities.FinalizeSwitch procedure.

PROCEDURE SwitchTo- ( VAR activity: Activity; finalizer: SwitchFinalizer; argument: ADDRESS );
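A hypothetical usage of this pair of procedures inside a synchronisation primitive might look as follows; the finalizer and what it does with the suspended activity are illustrative only:

PROCEDURE Publish (previous: Activities.Activity; value: ADDRESS);
BEGIN
	(* e.g. enqueue the suspended activity 'previous' into a wait queue *)
END Publish;

PROCEDURE Block;
VAR next: Activities.Activity;
BEGIN
	IF Activities.Select (next, Activities.IdlePriority) THEN
		Activities.SwitchTo (next, Publish, NIL);
		Activities.FinalizeSwitch;	(* required after every call to SwitchTo *)
	END;
END Block;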

B.1.25 Activities.Terminate Procedure

Terminates the module and disposes all of its resources.

PROCEDURE Terminate-;

B.1.26 Activities.TerminateCurrentActivity Procedure

Terminates the execution of the current activity calling this procedure. This procedure is also invoked at the end of the body of an active object.

PROCEDURE {NORETURN} TerminateCurrentActivity-;


B.1.27 Activities.VirtualProcessor Object

Represents the handler of an activity that resembles a virtual processor.

TYPE VirtualProcessor* = OBJECT (Queues.Item) END VirtualProcessor;

B.1.28 Activities.Wait Procedure

This procedure is called by the compiler while executing a WAIT statement. It awaits the termination of all activities associated with an active object.

PROCEDURE Wait- ( object: BaseTypes.Object );

B.1.29 Activities.awaiting Variable

Stores an atomic counter indicating the number of activities that are awaiting interrupts to occur. The scheduler stops its execution if all processors are idle, unless there are activities waiting for interrupts.

VAR awaiting*: Counters.AlignedCounter;

B.2 ExclusiveBlocks Module

The ExclusiveBlocks module implements object monitors and provides runtime support for block statements marked as exclusive.

Interface Summary

Compiler Call: ExclusiveBlocks.Await, ExclusiveBlocks.Enter, ExclusiveBlocks.Exit, ExclusiveBlocks.FinalizeAwait


B.2.1 ExclusiveBlocks.Await Procedure

Temporarily releases an acquired monitor and waits for other activities to modify the object by entering and exiting an exclusive region. The compiler calls this procedure whenever the repeated evaluation of an AWAIT statement yields an unsatisfied condition.

PROCEDURE Await- ( object: BaseTypes.Object );

B.2.2 ExclusiveBlocks.Enter Procedure

Enters an exclusive region by acquiring the monitor of the corresponding object. If the monitor is currently acquired by another activity, this procedure waits until it gets exclusive access to it. The compiler calls this procedure at the beginning of each block statement marked as exclusive. The specified nesting level keeps track of how many times the same activity has acquired the monitor.

PROCEDURE Enter- ( object: BaseTypes.Object; nestingLevel: SIZE );

B.2.3 ExclusiveBlocks.Exit Procedure

Exits an exclusive region by releasing the monitor of the corresponding object if its nesting level reaches zero. The compiler calls this procedure at the end of each block statement marked as exclusive or during the execution of a RETURN statement inside such blocks. A monitor may not be released if it was not acquired by the same activity beforehand.

PROCEDURE Exit- ( object: BaseTypes.Object; nestingLevel: SIZE );


B.2.4 ExclusiveBlocks.FinalizeAwait Procedure

Guarantees that all other activities awaiting a condition will evaluate it again when the current activity leaves the exclusive region. The compiler calls this procedure after the evaluation of an AWAIT statement has yielded a satisfied condition.

PROCEDURE FinalizeAwait- ( object: BaseTypes.Object );
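For illustration, an exclusive block containing an AWAIT statement is translated by the compiler into a call sequence roughly equivalent to the following sketch; the exact code generation is described in Chapter 4, so this rendering is an approximation only:

(* BEGIN {EXCLUSIVE} AWAIT (condition); statements END *)
ExclusiveBlocks.Enter (object, 1);
WHILE ~condition DO
	ExclusiveBlocks.Await (object);	(* temporarily releases the monitor *)
END;
ExclusiveBlocks.FinalizeAwait (object);
(* statements of the exclusive region *)
ExclusiveBlocks.Exit (object, 1);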

B.3 GarbageCollector Module

The GarbageCollector module provides automatic memory management for pointers allocated using the NEW statement. It implements a concurrent and interruptible mark and sweep garbage collection.

Interface Summary

Compiler Call: GarbageCollector.Assign, GarbageCollector.AssignArray, GarbageCollector.CompareAndSwap, GarbageCollector.Mark, GarbageCollector.MarkArray, GarbageCollector.Reset, GarbageCollector.ResetArray, GarbageCollector.Watch
Procedure: GarbageCollector.Collect
Runtime Call: GarbageCollector.Initialize, GarbageCollector.Terminate


B.3.1 GarbageCollector.Assign Procedure

Performs the assignment of a pointer variable. This procedure is called by the compiler when assigning variables of pointer type.

PROCEDURE Assign- ( VAR pointer: BaseTypes.Pointer; value: BaseTypes.Pointer );

B.3.2 GarbageCollector.AssignArray Procedure

Performs the assignment of an array containing pointers. This procedure is called by the compiler when assigning arrays of pointer type.

PROCEDURE AssignArray- ( VAR target: ARRAY OF BaseTypes.Pointer; CONST source: ARRAY OF BaseTypes.Pointer );

B.3.3 GarbageCollector.Collect Procedure

Performs a complete garbage collection cycle by marking the object graph and disposing all unreachable objects. Garbage can be collected concurrently if necessary.

PROCEDURE Collect*;

B.3.4 GarbageCollector.CompareAndSwap Procedure

Executes an atomic compare-and-swap operation on a pointer variable. This procedure is called by the compiler when executing CAS expressions.

PROCEDURE CompareAndSwap- ( VAR pointer: BaseTypes.Pointer; previousValue: BaseTypes.Pointer; value: BaseTypes.Pointer ): BaseTypes.Pointer;


B.3.5 GarbageCollector.Initialize Procedure

Initializes the module and its resources.

PROCEDURE Initialize-;

B.3.6 GarbageCollector.Mark Procedure

Marks the specified pointer as reachable. This procedure is called by the compiler while tracing outgoing pointers of marked objects.

PROCEDURE Mark- ( pointer: BaseTypes.Pointer );

B.3.7 GarbageCollector.MarkArray Procedure

Marks an array of pointers. This procedure is called by the compiler while tracing arrays of pointer type.

PROCEDURE MarkArray- ( CONST pointers: ARRAY OF BaseTypes.Pointer );

B.3.8 GarbageCollector.Reset Procedure

Resets a pointer variable. This procedure is called by the compiler when assigning NIL to variables of pointer type.

PROCEDURE Reset- ( VAR pointer: BaseTypes.Pointer );

B.3.9 GarbageCollector.ResetArray Procedure

Resets an array of pointers. This procedure is called by the compiler for resetting all elements of array variables containing pointers.


PROCEDURE ResetArray- ( VAR pointers: ARRAY OF BaseTypes.Pointer );

B.3.10 GarbageCollector.Terminate Procedure

Terminates the module and disposes its resources and all remaining pointers that have been registered using the GarbageCollector.Watch procedure.

PROCEDURE Terminate-;

B.3.11 GarbageCollector.Watch Procedure

Registers the specified pointer for automatic memory management. This procedure is called by the compiler when a pointer or object type is allocated using the NEW statement.

PROCEDURE Watch- ( pointer: BaseTypes.Pointer );

B.4 Heaps Module

The Heaps module provides a lock-free data structure called Heap that handles memory management. A heap manages the allocation and deallocation of blocks of various sizes within a contiguous memory region.

Interface Summary

Procedure: Heaps.Allocate, Heaps.Deallocate, Heaps.GetSize, Heaps.Initialize, Heaps.IsValid
Record: Heaps.Heap


B.4.1 Heaps.Allocate Procedure

Allocates a block of memory with the requested size from the specified heap. The return value is the first address of the allocated memory, or NIL if the heap has no more free memory.

PROCEDURE Allocate- ( size: SIZE; VAR heap: Heap ): ADDRESS;

B.4.2 Heaps.Deallocate Procedure

Deallocates a memory block that was previously allocated using a call to the Heaps.Allocate procedure.

PROCEDURE Deallocate- ( address: ADDRESS; VAR heap: Heap );

B.4.3 Heaps.GetSize Procedure

Returns the size of an allocated block of memory.

PROCEDURE GetSize- ( address: ADDRESS; CONST heap: Heap ): SIZE;

B.4.4 Heaps.Heap Record

Represents a heap which manages a contiguous memory region. Heaps have to be initialised using the Heaps.Initialize procedure before they are available for memory allocations.

TYPE Heap* = RECORD END;


B.4.5 Heaps.Initialize Procedure

Initializes a heap that manages the memory region encompassed by the specified address range. The memory area must be owned by the caller and may not overlap with other heaps. This procedure must be called once before memory can be allocated from the corresponding heap.

PROCEDURE Initialize- ( VAR heap: Heap; begin: ADDRESS; end: ADDRESS );

B.4.6 Heaps.IsValid Procedure

Checks whether an address is a valid heap address.

PROCEDURE IsValid- ( address: ADDRESS; CONST heap: Heap ): BOOLEAN;
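A typical usage of this interface might look as follows; the address range of the managed region is an assumption of this sketch:

VAR heap: Heaps.Heap; block: ADDRESS;

Heaps.Initialize (heap, beginAddress, endAddress);	(* register the managed region once *)
block := Heaps.Allocate (256, heap);	(* request a block of 256 bytes *)
IF block # NIL THEN
	ASSERT (Heaps.IsValid (block, heap));
	Heaps.Deallocate (block, heap);	(* return the block to the heap *)
END;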

B.5 Interrupts Module

The Interrupts module provides a hardware-independent synchronisation primitive for awaiting the occurrence of interrupts. Activities waiting for an interrupt are suspended and resumed as soon as the interrupt occurred.

Interface Summary

Procedure: Interrupts.Await, Interrupts.Cancel, Interrupts.Install
Record: Interrupts.Interrupt
Runtime Call: Interrupts.Terminate


B.5.1 Interrupts.Await Procedure

Waits for the specified interrupt to occur. This procedure returns as soon as the interrupt has to be handled, or if the wait was cancelled by a call to the Interrupts.Cancel procedure.

PROCEDURE Await- ( VAR interrupt: Interrupt );

B.5.2 Interrupts.Cancel Procedure

Resumes all activities that are waiting for the specified interrupt.

PROCEDURE Cancel- ( VAR interrupt: Interrupt );

B.5.3 Interrupts.Install Procedure

Installs an interrupt to wait for. The actual meaning of the specified interrupt number identifying the interrupt depends on the hardware.

PROCEDURE Install- ( VAR interrupt: Interrupt; index: SIZE );

B.5.4 Interrupts.Interrupt Record

Represents an interrupt.

TYPE Interrupt* = RECORD END;

B.5.5 Interrupts.Terminate Procedure

Terminates the module and disposes all resources associated with interrupts.

PROCEDURE Terminate-;


B.6 Processors Module

The Processors module represents all logical processors of the system.

Interface Summary

Constant: Processors.Maximum
Procedure: Processors.GetCurrentIndex
Runtime Call: Processors.Initialize, Processors.Terminate
Scheduling: Processors.ResumeAnyProcessor, Processors.StartAll, Processors.SuspendCurrentProcessor
Variable: Processors.count

B.6.1 Processors.GetCurrentIndex Procedure

Returns the unique index of the processor executing this procedure call.

PROCEDURE GetCurrentIndex- (): SIZE;

B.6.2 Processors.Initialize Procedure

Initializes the module by enumerating all available processors.

PROCEDURE Initialize-;

B.6.3 Processors.Maximum Constant

Indicates the maximal number of logical processors that are supported by the system.

CONST Maximum* = (* integer value *);


B.6.4 Processors.ResumeAnyProcessor Procedure

Resumes the execution of a single suspended processor. This procedure may not be called if there is only one processor in the system.

PROCEDURE ResumeAnyProcessor-;

B.6.5 Processors.StartAll Procedure

Starts the execution of all available processors. This procedure may not be called if there is only one processor in the system.

PROCEDURE StartAll-;

B.6.6 Processors.SuspendCurrentProcessor Procedure

Suspends the execution of the current processor. A suspended processor must be resumed by a call to the Processors.ResumeAnyProcessor procedure. This procedure may not be called if there is only one processor in the system.

PROCEDURE SuspendCurrentProcessor-;

B.6.7 Processors.Terminate Procedure

Terminates the module and waits for all other processors to stop their execution.

PROCEDURE Terminate-;

B.6.8 Processors.count Variable

Holds the actual number of processors in the system.

VAR count-: SIZE;


B.7 Queues Module

The Queues module provides a lock-free data structure called Queue. A queue is a sequential container which allows elements to be appended at its back and removed from its front.

Interface Summary

Object: Queues.Item
Procedure: Queues.Dequeue, Queues.Dispose, Queues.Enqueue
Record: Queues.AlignedQueue, Queues.Queue
Runtime Call: Queues.Terminate

B.7.1 Queues.AlignedQueue Record

Represents a first-in first-out data structure which is aligned for optimal cache behavior.

TYPE AlignedQueue* = RECORD (Queue) END;

B.7.2 Queues.Dequeue Procedure

Removes the first element at the front of a queue and returns it in a variable parameter. If there are no elements in the queue, the procedure returns FALSE and sets the variable parameter to NIL.

PROCEDURE Dequeue- ( VAR item: Item; VAR queue: Queue ): BOOLEAN;


B.7.3 Queues.Dispose Procedure

Disposes the elements of a queue.

PROCEDURE Dispose- ( VAR queue: Queue );

B.7.4 Queues.Enqueue Procedure

Appends an element at the back of a queue.

PROCEDURE Enqueue- ( item: Item; VAR queue: Queue );
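A minimal usage sketch of the queue operations, assuming an element type derived from Queues.Item (described below):

TYPE Element = OBJECT (Queues.Item) END Element;

VAR queue: Queues.Queue; element: Element; item: Queues.Item;

NEW (element);
Queues.Enqueue (element, queue);
IF Queues.Dequeue (item, queue) THEN
	element := item (Element);	(* type guard recovers the concrete element *)
END;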

B.7.5 Queues.Item Object

Represents an abstract element of a queue.

TYPE Item* = OBJECT END Item;

Procedure: Item.Finalize

Queues.Item.Finalize Procedure

Finalizes the element by disposing any resources associated with it.

PROCEDURE Finalize-;

B.7.6 Queues.Queue Record

Represents a first-in first-out data structure.

TYPE Queue* = RECORD END;


B.7.7 Queues.Terminate Procedure

Terminates the module and disposes all of its resources.

PROCEDURE Terminate-;

C Enhancements to the Active Oberon Programming Language

Any implementation of an operating system that targets multiprocessing environments requires some sort of specialised hardware support. However, one of the most important objectives of this thesis was to maximise the portability of its source code across different hardware architectures. The use of a high-level programming language allows the necessary hardware features to be accessed in a portable way. By this means, the corresponding hardware abstraction layer can be provided by the implementation of the programming language and related software development tools.

The current implementation of the lock-free runtime system presented in this thesis was written using the Active Oberon programming language [Rea04]. Although this language provides a simple yet powerful base for the necessary hardware abstraction, some important features have still been missing. This appendix summarises all changes and additions to the Active Oberon programming language that were necessary in order to achieve full machine independence. It therefore assumes some basic familiarity with Active Oberon and its development tools.

Most of the modifications address portability issues that hindered the same source code from being compilable for different targets. Other enhancements concern inconvenient shortcomings or minor oversights in the initial design of the programming language. Each issue listed in the following sections is accompanied by a short rationale for the modification and provides informative examples for the enhanced syntax or semantics where possible. However, the information in this appendix is not intended to replace the original language specification. The issues raised hereafter are rather suggestions to take into consideration in a potential future rendition of the Active Oberon report.


C.1 Atomic Compare-And-Swap

All data structures and algorithms employed in this thesis are based on the atomic compare-and-swap operation. It is the only atomic operation that has to be provided by the hardware architecture.

Change: Added a new built-in CAS procedure for atomically comparing and exchanging the values of shared variables.

Syntax: PROCEDURE CAS (VAR variable: T; old, new: T): T;

Semantics: This procedure compares the value of the variable named in the first argument with the value of the second argument. If the two values of basic or reference type match, the variable is overwritten with the value of the third argument. The result is equal to the original value of the variable. The whole operation is executed atomically and is never interrupted by any other activity. If the second and third argument are the same, the whole operation effectively equals an atomic read of the shared variable.

Example: Other atomic operations like test-and-set can be implemented on top of the CAS procedure:

PROCEDURE TAS* (VAR value: BOOLEAN): BOOLEAN;
BEGIN
    RETURN CAS (value, FALSE, TRUE);
END TAS;
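As noted in the semantics above, passing the same value as second and third argument turns the operation into an atomic read. A minimal sketch (the Read procedure is ours):

PROCEDURE Read* (VAR value: LONGINT): LONGINT;
BEGIN
    (* if the variable holds 0, CAS rewrites the same 0; otherwise the comparison fails;
       either way, the original value is returned atomically *)
    RETURN CAS (value, 0, 0);
END Read;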

C.2 Memory Model

The original Active Oberon report does not specify how concurrent activities interact when operating on shared variables without proper synchronisation. A memory model defines how shared memory behaves in this case and is crucial for lock-free programming. In the absence of such guarantees for non-blocking algorithms, programmers often made implicit assumptions about the implementation chosen by their particular compiler.


Change: Added the definition of a simple memory model which provides basic guarantees required by portable non-blocking programs.

Semantics: The memory model requires that concurrent access to shared data must be either protected by exclusive blocks or performed by an atomic compare-and-swap operation. The detailed rules are specified in Section 2.2.2.
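As an illustration, the two compliant forms of incrementing a shared counter might look as follows (a sketch; the CAS-based loop mirrors the counter example of Listing 2.2):

VAR counter: LONGINT;

PROCEDURE IncrementExclusive*;
BEGIN {EXCLUSIVE} (* access protected by an exclusive block *)
    INC (counter);
END IncrementExclusive;

PROCEDURE IncrementAtomic*;
VAR previous: LONGINT;
BEGIN (* access performed by atomic compare-and-swap operations *)
    REPEAT
        previous := CAS (counter, 0, 0);
    UNTIL CAS (counter, previous, previous + 1) = previous;
END IncrementAtomic;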

C.3 Variable Initialisation

The original language specification assures a default value of NIL for all local variables of reference type. This prevents pointers from referencing arbitrary memory and thus allows effective NIL pointer checks. However, all other variables have no default value and store an arbitrary value if not initialised properly.

Change: Variables in modules, procedures, records, and objects can be automatically initialised with a constant expression.

Syntax: IdentDef = ident ["*"|"-"] [":=" ConstExpr].

Semantics: The optional constant initialiser of a variable works as if there were an assignment to that variable at the beginning of the body of the corresponding scope. The value therefore has to be compatible with the type of the variable. Fields of records can be initialised the same way. Variables of objects are initialised before the execution of their initialiser. The original rule for variables of reference type still holds because they have an implicit := NIL initialiser.

Example: Initialisers for variables of record types often spare an explicit call to a corresponding procedure that ensures proper initialisation, and they make the code shorter and safer:

TYPE Counter* = RECORD value := 0: INTEGER END;
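Initialisers apply equally to local variables. A small sketch (the Sum procedure is ours, using the SIZE type introduced in Section C.5):

PROCEDURE Sum* (CONST values: ARRAY OF INTEGER): INTEGER;
VAR sum := 0: INTEGER; index := 0: SIZE;
BEGIN
    WHILE index # LEN (values) DO
        INC (sum, values[index]); INC (index);
    END;
    RETURN sum;
END Sum;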


C.4 Accessing Externally Defined Symbols

Sometimes it is necessary to access data or procedures that are provided by different software development tools. This includes external application programming interfaces or low-level functionality written in assembly code or other programming languages.

Change: Global variables and procedures may be declared as EXTERN.

Syntax: IdentDef = ident ["*"|"-"] ["EXTERN" string]. An external procedure does not have a procedure body or a declaration sequence. Its syntactical definition is completed after the semicolon delimiting the procedure signature.

Semantics: The declaration of an external variable or procedure does not generate any actual data or code. Accessing the value of an external variable or calling an external procedure actually refers to the symbol named in the string literal following the EXTERN keyword.

Example: Low-level code often needs access to data structures which are initialised by assembly programs. An interrupt vector table provided by a low-level bootloader serves well as an illustrative example:

MODULE CPU;

TYPE Vector = ARRAY 32 OF PROCEDURE;

VAR interrupts* EXTERN "vector_table": Vector;

END CPU.
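External procedures are declared analogously. As a sketch, a function provided by a C runtime might be bound as follows (the symbol memset and its signature are illustrative only, using the portable types introduced in Section C.5):

PROCEDURE Memset EXTERN "memset" (destination: ADDRESS; value: WORD; size: SIZE): ADDRESS;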

C.5 Portable Integer Types

Active Oberon provides four integer types: SHORTINT, INTEGER, LONGINT, and HUGEINT. All of them are signed and represent integral values with preassigned bit widths of 8, 16, 32, and 64 respectively. However, there is no notion of an integral type that has the default integer bit width of the target hardware architecture.


Over time, the lack of such a distinguished type led to an overuse of the LONGINT type. Depending on context, this may be acceptable for code that runs on 32-bit machines. On machines with a default word size of 16 bits, however, LONGINT can be excessive, and on architectures with 64-bit words it may be misrepresentative.

Change: Complemented the integral types with two new integer types called WORD and LONGWORD which have a machine-dependent bit width.

Syntax: Type = ... | WORD | LONGWORD.

Semantics: The type WORD is merely a synonym for an existing integral type that fits best to the default word size of the target machine. On 8-bit and 16-bit machines, the type typically equals the INTEGER type. On 32-bit and 64-bit machines, the type is often equal to LONGINT. The type LONGWORD is a synonym for an existing integral type that fits best to the default address size of the target machine.

Example: The new types are very useful when accessing functionality that is provided by a different software environment, such as assembly code or application programming interfaces:

PROCEDURE {NORETURN} Exit EXTERN "exit" (status: WORD);

For the very same reason as above, the type LONGINT was also abused for representing addresses and memory ranges in low-level code. This is obviously only valid for hardware architectures with a 32-bit address space. Without a proper type for representing memory addresses with differing bit widths, the corresponding source code cannot be ported easily to other architectures. Additionally, the signed nature of the integral types renders comparison operations on memory addresses highly questionable. In cases where memory ranges are represented in a high-level fashion using arrays, the same issues also arise concerning the type of the array length and the indices used to access individual elements.


Change: Added the unsigned types ADDRESS and SIZE for representing addresses and array lengths with the proper bit width.

Syntax: Type = ... | ADDRESS | SIZE.

Semantics: Both types are distinct integral types that are not compatible with any other basic type. Signed integral types are compatible with both types if they have the same or lower bit width. The SIZE type allows all integral operations and is necessary for specifying array lengths and indices. The unary SIZE OF operator returns the size of any basic or reference type and is useful to determine the word or address size of the machine. The type ADDRESS only allows comparison operations as well as addition with a SIZE type yielding another address, while the difference of two addresses yields a SIZE type. All reference types are implicitly convertible to the ADDRESS type and yield the address of the referenced object. The address of a variable or procedure can be determined using the unary ADDRESS OF operator.

Example: The new SIZE type can represent all kinds of lengths:

PROCEDURE GetLength (CONST string: ARRAY OF CHAR): SIZE;
VAR length := 0: SIZE;
BEGIN
    WHILE string[length] # 0X DO INC (length) END;
    RETURN length;
END GetLength;
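The restricted arithmetic on the ADDRESS type still suffices for portable range checks. A sketch (the Contains procedure is ours):

PROCEDURE Contains* (base: ADDRESS; length: SIZE; address: ADDRESS): BOOLEAN;
BEGIN
    (* addresses may be compared, and their difference yields a SIZE *)
    RETURN (address >= base) & (address - base < length);
END Contains;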

C.6 Unified Debugging Facilities

The only built-in debugging facilities of Active Oberon are the ASSERT and HALT statements. They allow checking whether preconditions of procedures or other invariants are satisfied at runtime. However, in order to print values of expressions for debugging purposes, programmers often use various and inconsistent output methods which typically have cumbersome interfaces.


Change: Added a new built-in TRACE procedure for convenient debugging.

Syntax: PROCEDURE TRACE (expr1: T1 {; exprN: TN});

Semantics: The TRACE procedure prints arbitrary expressions, their values, as well as the corresponding location in the source code to the standard output. If an expression is of basic or reference type, the TRACE procedure prints a textual representation of the corresponding value. For structured types, the printed value is equal to the address of the first element. Concurrent calls to the TRACE procedure are sequentialised, yielding a consistent view of the data.

Example: The TRACE procedure is easy to use, unifies the format of all debugging outputs, and is highly recognisable in code reviews. A call to the Add procedure defined below generates the following diagnostic message:

module ‘Test’ procedure ‘Add’ line 6: ‘x’ = 4, ‘y’ = 8, ‘result’ = 12, ‘result = x + y’ = TRUE

MODULE Test;
PROCEDURE Add* (x, y: WORD): WORD;
VAR result: WORD;
BEGIN
    result := x + y;
    TRACE (x, y, result, result = x + y);
    RETURN result;
END Add;
END Test.

C.7 Unsafe Pointers

Typically, low-level Active Oberon code makes heavy use of the built-in procedures SYSTEM.PUT, SYSTEM.GET, and variations thereof. The goal is most often to manipulate data structures which cannot be accessed directly using standard language features. Examples include meta data generated by the compiler or data structures defined by the hardware.


Change: Pointer types can be annotated as UNSAFE in order to access arbitrary memory using default language features.

Syntax: Type = ... | "POINTER" ["{" "UNSAFE" "}"] TO Type.

Semantics: A variable of type ADDRESS can be assigned to a variable of pointer type annotated as UNSAFE. The referenced memory can therefore be accessed as if it was mapped as a record or an array. Unsafe pointers are not compatible with default pointer types.

Example: The stack frames of a call stack can be portably accessed by an unsafe pointer to record:

PROCEDURE TraceCallStack*;
TYPE StackFrame = RECORD previous, caller: ADDRESS END;
VAR frame: POINTER {UNSAFE} TO StackFrame;
BEGIN
    frame := ADDRESS OF frame + SIZE OF ADDRESS;
    REPEAT
        TRACE (frame.caller);
        frame := frame.previous;
    UNTIL frame = NIL;
END TraceCallStack;
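Hardware-defined data structures can be mapped the same way. A sketch of a memory-mapped device register block (the record layout and the base address parameter are hypothetical):

TYPE DeviceRegisters = RECORD data, status: WORD END;

PROCEDURE ReadStatus* (base: ADDRESS): WORD;
VAR device: POINTER {UNSAFE} TO DeviceRegisters;
BEGIN
    device := base; (* a value of type ADDRESS may be assigned to an unsafe pointer *)
    RETURN device.status;
END ReadStatus;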

C.8 Manual Memory Deallocation

Active Oberon as specified in its original report relies on a garbage collector that manages all resources allocated from free storage. However, garbage collection may be unsuitable for software like real-time applications because of its non-deterministic delays and execution overhead. In order to provide an execution environment for this kind of software, our runtime system was designed such that the automatic memory management can optionally be omitted completely. It is therefore necessary to extend the programming language to allow programmers to deallocate resources by hand. The runtime system itself always manages all of its resources manually and hence does not rely on garbage collection.


Change: Added a new built-in DISPOSE procedure that deallocates an object from free storage.

Syntax: PROCEDURE DISPOSE (VAR variable: PTR);

Semantics: The value passed to this procedure must be an object or a pointer to an array or record that was previously allocated using the NEW procedure. The variable is reset to the value NIL after the memory is released to free storage. The reference type has to be marked explicitly with the DISPOSABLE modifier in order to enable manual memory deallocation on instances of this type.

Example: Garbage collection spares the developer from manual memory deallocation but can still suffer from memory leaks if unused objects stay reachable long enough. If applied properly, the DISPOSE procedure does not leak memory and releases it immediately:

PROCEDURE Free (VAR tree: POINTER {DISPOSABLE} TO Node);
BEGIN
    IF tree.left # NIL THEN Free (tree.left) END;
    IF tree.right # NIL THEN Free (tree.right) END;
    DISPOSE (tree);
END Free;

C.9 Object Finalisers

The Active Oberon report defines object initialisers which are procedures that are automatically called whenever an object is created. Regarding the symmetry of object allocation and deallocation, the language lacks an equivalent mechanism for automatically calling a procedure whenever an object is destroyed.

Change: Methods tagged with a tilde are object finalisers.

Syntax: ProcHead = ["&" | "~"] IdentDef [FormalPars].


Semantics: Object finalisers are automatically called once when an object instance is deallocated using the built-in DISPOSE procedure. They follow the same set of rules defined for object initialisers except that they may not have any parameters in their signature.

Example: Object finalisers make it possible to deallocate resources managed by active objects whose lifetime is non-deterministic:

TYPE Driver* = OBJECT
VAR buffer: POINTER {DISPOSABLE} TO ARRAY OF CHAR;

    PROCEDURE &Initialize*;
    BEGIN
        NEW (buffer, 1024);
    END Initialize;

    PROCEDURE ~Finalize*;
    BEGIN
        DISPOSE (buffer);
    END Finalize;

BEGIN {ACTIVE}
    ...
END Driver;

C.10 Proper Synchronisation of Active Objects

Sometimes it is necessary to wait for an active object to complete its active body. This kind of synchronisation is achieved by using the monitor of the active object to await a condition that is only satisfied at the very end of the active body. Although this technique is often sufficient in practice, it does not, strictly speaking, guarantee that the activity has actually completed its execution. This deficiency becomes apparent when resources like the call stack are released because the active object is erroneously assumed to have completed its active body.

Change: Added a new built-in WAIT procedure for waiting on the termination of active objects.


Syntax: PROCEDURE WAIT (object: OBJECT);

Semantics: A call to this procedure awaits the termination of all activities associated with an active object. An activity terminates if it successfully completes its execution or aborts because of an exception. This procedure effectively guarantees that an object is passive. It may not be called within the initialiser or body of an active object because the active body would otherwise never run to completion.

Example: An active object can be safely deallocated after its termination:

PROCEDURE Work;
VAR object: OBJECT {DISPOSABLE}
    BEGIN {ACTIVE} ...
    END;
BEGIN
    NEW (object);
    ...
    WAIT (object);
    DISPOSE (object);
END Work;

C.11 Improved Handling of Runtime Errors

Recent implementations of Active Oberon added a simplistic form of exception handling in the form of the FINALLY statement [Sze05]. It allows catching and reacting properly to traps generated by the runtime system. Examples of such failures include unsatisfied assertions, referencing a NIL pointer, or using an array index that is out of range. However, all of these traps indicate programming errors and it is highly questionable whether they should be handled at all.

Change: Dropped the support for the FINALLY statement.

Semantics: Without loss of generality, any code protected by a FINALLY statement can be executed by a special-purpose active object. The original code including the FINALLY statement can then be replaced by code that creates that active object and waits for it to terminate.


Example: The following procedure calls a procedure variable and returns whether the call succeeded or resulted in a runtime failure:

PROCEDURE CallSafely* (procedure: PROCEDURE): BOOLEAN;
VAR caller: Caller; result: BOOLEAN;
BEGIN
    NEW (caller, procedure);
    WAIT (caller);
    result := caller.succeeded;
    DISPOSE (caller);
    RETURN result;
END CallSafely;

TYPE Caller = OBJECT {DISPOSABLE}
VAR procedure: PROCEDURE;
VAR succeeded := FALSE: BOOLEAN;

    PROCEDURE &Initialize (procedure: PROCEDURE);
    BEGIN
        SELF.procedure := procedure;
    END Initialize;

BEGIN {ACTIVE}
    procedure; (* this procedure may fail *)
    succeeded := TRUE;
END Caller;

C.12 Uncooperative Blocks

The implementation of the scheduler is based on implicit cooperative task scheduling. Task switches are emitted by the compiler at various places in the compiled code as described in Section 4.2.4. In order to implement the task switches themselves using the very same software development tools, it is necessary to be able to temporarily disable the emission of task switches.

Change: Introduced the UNCOOPERATIVE modifier for statement blocks.

Syntax: StatBlock = "BEGIN" ["{" "UNCOOPERATIVE" "}"] ...


Semantics: The compiler omits all implicit task switches in a statement block annotated with the UNCOOPERATIVE modifier. As a result, the executed code never voluntarily gives up control nor cooperatively passes it to another activity.

Example: All lock-free algorithms profit from uncooperative blocks because they limit the maximal number of activities that concurrently execute the code. This offers several advantages which are described in detail in Section 2.3.1. The following atomic increment operation, for instance, is based on the algorithm shown in Listing 2.2 and cannot be executed by more concurrent activities than there are processors in the system:

PROCEDURE Increment* (VAR counter: Counter): INTEGER;
VAR previous, value: INTEGER;
BEGIN {UNCOOPERATIVE}
    REPEAT
        previous := CAS (counter.value, 0, 0);
        value := CAS (counter.value, previous, previous + 1);
    UNTIL value = previous;
    RETURN previous;
END Increment;

C.13 Unchecked Blocks

The compiler generates several different implicit runtime checks that prevent programs from corrupting or accessing invalid memory. This most prominently includes null pointer and array bound checks as well as stack overflow checking as described in Section 5.2. In order to be able to implement these runtime checks using the same compiler without causing infinite recursion, it is necessary to temporarily disable their generation, similar to uncooperative blocks.

Change: Introduced the UNCHECKED modifier for statement blocks.

Syntax: StatBlock = "BEGIN" ["{" "UNCHECKED" "}"] ...


Semantics: The compiler does not generate any runtime checks inside an unchecked statement block and completely relies on the programmer to provide correct code. Explicit checking for erroneous conditions in an unchecked block remains possible using the ASSERT statement.

Example: Any procedure called while expanding an exceeded stack must be unchecked in order to prevent the compiler from expanding the stack recursively. One important operation thereof is copying the contents of the old stack into the new and enlarged stack memory as described in Section 5.2. The following type represents individual stack frames in the new stack memory and can be used to adapt all pointers into the old stack. The compiler provides extensions of this type for procedures with variable parameters and overrides the Move method accordingly:

TYPE StackFrame* = OBJECT {UNSAFE}

VAR descriptor-: ADDRESS;
VAR previous-: ADDRESS;
VAR caller-: PROCEDURE;

    PROCEDURE Move* (offset: SIZE);
    BEGIN {UNCOOPERATIVE, UNCHECKED}
        IF previous # NIL THEN INC (previous, offset) END;
    END Move;

END StackFrame;

C.14 Reentrant Exclusive Regions

The original Active Oberon report states that an activity cannot enter an exclusive region more than once. Experience with existing Active Oberon programs has shown, however, that this artificial restriction misled developers into duplicating a lot of protected procedures. The only difference in the duplicated code is the omission of the exclusive block modifier.


Change: Explicitly allow reentrancy into exclusive regions.

Semantics: The same activity may enter an exclusive region more than once. The corresponding protected object is locked until the activity leaves the outermost exclusive block. If the condition of an AWAIT statement is not satisfied, the lock on the protected object is released completely.

Example: Whether or not a supercall to an overridden method is actually valid according to the old semantics depends on whether the overriding as well as the overridden procedure enter an exclusive region simultaneously. With the relaxed rule, the developer of an overriding method can always enter an exclusive region if necessary, regardless of whether a subsequent supercall might do the same. It is therefore no longer necessary to know the actual implementation of the overridden procedure:

TYPE Base = OBJECT

    PROCEDURE Method*;
    BEGIN {EXCLUSIVE} ...
    END Method;

END Base;

TYPE Extension = OBJECT (Base)

    PROCEDURE Method*;
    BEGIN {EXCLUSIVE} ... Method^; ...
    END Method;

END Extension;
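The complete release of the lock at an AWAIT statement also matters for nested entries. The following sketch (the Buffer object is ours) combines reentrancy with an AWAIT statement:

TYPE Buffer* = OBJECT
VAR count := 0: INTEGER;

    PROCEDURE AwaitData;
    BEGIN {EXCLUSIVE}
        AWAIT (count > 0); (* releases the lock completely, even when entered twice *)
    END AwaitData;

    PROCEDURE Get*;
    BEGIN {EXCLUSIVE}
        AwaitData; (* re-enters the exclusive region held by the same activity *)
        DEC (count);
    END Get;

    PROCEDURE Put*;
    BEGIN {EXCLUSIVE}
        INC (count);
    END Put;

END Buffer;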

List of Listings

2.1 The function of the atomic compare-and-swap operation.
2.2 An example of a lock-free operation incrementing a shared counter variable and returning its previous value.
2.3 Compare-and-swap operation using an exponential backoff contention management algorithm for N processors. Adapted from Dice et al. [DHM13].

3.1 Data structures and operations of an unsynchronised implementation of an unbounded first-in first-out queue.
3.2 Operations of a naive approach to implement a lock-free queue.
3.3 Data structures of an unbounded first-in first-out queue making use of intermediate nodes.
3.4 Enqueue operation making use of intermediate nodes.
3.5 Dequeue operation making use of intermediate nodes.
3.6 Data structure and operations of a lock-free queue reusing intermediate nodes by pairing them with the actual items.
3.7 Generic data structure for storing hazard pointers for a system with N processors.
3.8 Wait-free node acquisition for safe reuse.
3.9 Lock-free queue operations with safe node reuse.
3.10 Generic operation to safely access a shared reference by copying its value into a processor-local hazard pointer.

4.1 Representation of an activity associated with an active object.
4.2 Basic context switching for activities.
4.3 Finalising task switches on behalf of the suspended activity.


4.4 Selecting activities from ready queues with different priorities.
4.5 Entering suspended activities into ready queues.
4.6 Performing cooperative task switches.
4.7 Inserted instruction sequence for implicit task switches on the AMD64 architecture.
4.8 Data structure and operations of a simplistic event synchronisation primitive.
4.9 Data structure and acquire operation of a mutex synchronisation primitive.
4.10 Release operation of the mutex synchronisation primitive.
4.11 Data structure and enter operation of a monitor synchronisation primitive.
4.12 Exit operation of the monitor synchronisation primitive.
4.13 Await operation of the monitor synchronisation primitive.
4.14 Task switch finalisers of the monitor wait operation.
4.15 Data structures for interrupt handling.
4.16 Procedure for awaiting the occurrence of an interrupt and the corresponding task switch finaliser.

5.1 Data structures of lock-free heaps and their memory blocks.
5.2 Initialisation of a memory heap partitioning it into the smallest possible amount of free blocks.
5.3 Data structure and operations for managing hazard pointers.
5.4 Operations for managing pooled memory blocks.
5.5 Lock-free acquisition of blocks from the specified free stack.
5.6 Lock-free release of blocks to the specified free stack.
5.7 Lock-free heap allocation and deallocation.
5.8 Generated instruction sequence for default and checked function prologues requesting 200 bytes as stack activation frame.
5.9 Lock-free mark operation.
5.10 Tracing outgoing references of marked objects.
5.11 The garbage collection cycle.

List of Figures

1.1 Descendants in the family of Pascal based programming languages and their extensions. Adapted from the Active Oberon language report by Patrik Reali [Rea04].

3.1 States of two different shared queues after reaching linearisation points while one activity P enqueues an element and another activity Q dequeues an element.
3.2 Example of a queue containing three elements referenced by a linked list of four nodes.
3.3 Example of a queue containing a single item.
3.4 States of the same queue after two activities P and Q reach different linearisation points while performing several enqueue and dequeue operations showing the ABA problem.

4.1 Sample control flow of a single processor executing two consecutive task switches between two activities.

5.1 Sample memory allocation in a heap of size 32.
5.2 Same heap after two consecutive block allocations of size 2.

6.1 Generic object file format expressed in EBNF.
6.2 Generic module structure of the runtime system.
6.3 Kernel specific module structure of the runtime system targeting an execution environment based on BIOS.

7.1 Throughput of contending compare-and-swap operations.
7.2 Throughput of contending compare-and-swap operations with different constant backoff.


7.3 Thread creation time.
7.4 Context switch time.
7.5 Time for multiplying two 1024 × 1024 matrices concurrently.
7.6 Speedup of multiplying two 1024 × 1024 matrices.
7.7 Average cost of locking operations with no lock contention.
7.8 Average cost of locking operations with heavy lock contention.
7.9 Duration of several concurrent games featuring ten players passing a token 10000 times around.
7.10 Allocating memory with varying maximal block sizes.
7.11 Comparison of manual and automatic memory management.
7.12 Percentile distribution of garbage collector latencies.

List of Tables

4.1 Contents and memory consumption of the processor context on the AMD64 architecture [Adv13].

6.1 Instruction set of the intermediate code representation.
6.2 Modules with lock-free algorithms presented in this thesis.

7.1 Duration of minimal system calls.
7.2 Size of various components of the runtime system in terms of physical lines of code and its ratio in comparison to A2.

Bibliography

[ACHS14] Dan Alistarh, Keren Censor-Hillel, and Nir Shavit. Are Lock-Free Concurrent Algorithms Practically Wait-Free? In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, STOC’14, 2014.

[Adv13] Advanced Micro Devices, Inc. AMD64 Architecture Programmer’s Manual Volume 2: System Programming, May 2013. Revision 3.23.

[And90] Thomas E. Anderson. The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, 1(1):6–16, 1990.

[ARM05] ARM Limited. ARM Architecture Reference Manual, July 2005. Issue I.

[Atm05] Atmel Corporation. 8-bit AVR Instruction Set, November 2005. Revision E.

[BA84] Mordechai Ben-Ari. Algorithms for on-the-fly Garbage Collection. ACM Transactions on Programming Languages and Systems, 6(3):333–344, 1984.

[BDF92] Brian Bershad, Richard P. Draves, and Alessandro Forin. Using Microbenchmarks to Evaluate System Performance. In Proceedings of the Third Workshop on Workstation Operating Systems, 1992.

[Ber02] Emery D. Berger. Memory Management for High-Performance Applications. PhD thesis, University of Texas at Austin, 2002.


[Bia98] Gabriele Bianchi. Incremental Garbage Collector for XOberon. Diploma thesis, ETH Zurich, 1998.

[Blä07] Luc Bläser. A Component Language for Pointer-Free Concurrent Programming and its Application to Simulation. PhD thesis, ETH Zurich, 2007.

[BMBW00] Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, and Paul R. Wilson. Hoard: A Scalable Memory Allocator for Multithreaded Applications. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS’00, 2000.

[Con63] Melvin E. Conway. Design of a Separable Transition-Diagram Compiler. Communications of the ACM, 6(7):396–408, 1963.

[Cre94] Régis B. J. Crelier. Separate Compilation and Module Extension. PhD thesis, ETH Zurich, 1994.

[Dah99] Ole-Johan Dahl. A Note on Monitor Versions. In Proceedings of the Symposium in the Honour of C. A. R. Hoare at his Resignation from the University of Oxford, 1999.

[DG02] Dave Dice and Alex Garthwaite. Mostly Lock-Free Malloc. In Proceedings of the 3rd International Symposium on Memory Management, ISMM’02, 2002.

[DHM13] David Dice, Danny Hendler, and Ilya Mirsky. Lightweight Contention Management for Efficient Compare-and-Swap Operations. In Proceedings of the 19th International Conference on Parallel Processing, Euro-Par’13, 2013.

[Die92] Reinhard A. Dietrich. Dynamische Stacks für Lightweightprozesse. Diploma thesis, ETH Zurich, 1992.


[DMSJ02] David L. Detlefs, Paul A. Martin, and Guy L. Steele Jr. Lock-Free Reference Counting. In Proceedings of the 20th Annual ACM Symposium on Principles of Distributed Computing, PODC’02, 2002.

[ECM06] ECMA. Standard ECMA-334 C# Language Specification. ECMA International, 2006.

[Egg01] Bernhard Egger. Development of an Aos Operating System for the DNARD Network Computer. Diploma thesis, ETH Zurich, 2001.

[FH82] Christopher W. Fraser and David R. Hanson. A Machine-Independent Linker. Software — Practice and Experience, 12(4):351–366, 1982.

[FN11] Felix Friedrich and Florian Negele. A Compiler-Supported Unification of Static and Dynamic Loading. In Tagungsband 16. Kolloquium Programmiersprachen und Grundlagen der Programmierung, KPS’11, 2011.

[Fog14] Agner Fog. The microarchitecture of Intel, AMD and VIA CPUs. Technical University of Denmark, February 2014.

[Fra04] Keir Fraser. Practical lock-freedom. Technical report, Computer Laboratory, University of Cambridge, 2004.

[Fri11] Felix Friedrich. Flexible Oberon Cross-Compiler (FOX). Implementation report, Institute of Computer Systems, ETH Zurich, 2011.

[GC96] Michael Greenwald and David Cheriton. The Synergy Between Non-Blocking Synchronization and Operating System Structure. In Second Symposium on Operating Systems Design and Implementation, OSDI’96, 1996.


[GGH07] Hui Gao, Jan Friso Groote, and Wim H. Hesselink. Lock-Free Parallel Garbage Collection by Mark&Sweep. Science of Computer Programming, 64(3):341–374, 2007.

[GJSB05] James Gosling, Bill Joy, Guy Steele, and Gilad Bracha. The Java Language Specification, Third Edition. Addison-Wesley Professional, 2005.

[Glo06] Wolfram Gloger. Wolfram Gloger’s malloc homepage. See http://www.malloc.de/, 2006.

[GP07] Anders Gidenstam and Marina Papatriantafilou. LFTHREADS: A Lock-Free Thread Library. In Proceedings of the 11th International Conference on Principles of Distributed Systems, OPODIS’07, 2007.

[GPST09] Anders Gidenstam, Marina Papatriantafilou, Håkan Sundell, and Philippas Tsigas. Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting. IEEE Transactions on Parallel and Distributed Systems, 20(8):1173–1187, 2009.

[GPT05] Anders Gidenstam, Marina Papatriantafilou, and Philippas Tsigas. Allocating Memory in a Lock-Free Manner. In Proceedings of the 13th Annual European Conference on Algorithms, ESA’05, 2005.

[Gre99] Michael Greenwald. Non-Blocking Synchronization and System Design. PhD thesis, Stanford University, 1999.

[Gut97] Jürg Gutknecht. Do the Fish Really Need Remote Control? A Proposal for Self-Active Objects in Oberon. In Modular Programming Languages, volume 1204 of Lecture Notes in Computer Science, pages 207–220. Springer Berlin Heidelberg, 1997.


[HB84] Kai Hwang and Fayé Alayé Briggs. Computer Architecture and Parallel Processing. McGraw-Hill, 1984.

[Her88] Maurice Herlihy. Impossibility and Universality Results for Wait-Free Synchronization. In Proceedings of the 7th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC’88, 1988.

[Her90] Maurice Herlihy. A Methodology for Implementing Highly Concurrent Data Structures. In Proceedings of the Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, PPOPP’90, 1990.

[Her91] Maurice Herlihy. Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems, 13(1):124–149, 1991.

[HG01] Wim H. Hesselink and Jan Friso Groote. Wait-Free Concurrent Memory Management by Create and Read Until Deletion. Distributed Computing, 14(1):31–39, 2001.

[HH01] Michael Hohmuth and Hermann Härtig. Pragmatic Nonblocking Synchronization for Realtime Systems. In Proceedings of the 2001 USENIX Annual Technical Conference, USENIX’01, 2001.

[HLM03] Maurice Herlihy, Victor Luchangco, and Mark Moir. Obstruction-Free Synchronization: Double-Ended Queues as an Example. In Proceedings of the 23rd International Conference on Distributed Computing Systems, ICDCS’03, 2003.

[HLMM02] Maurice Herlihy, Victor Luchangco, Paul Martin, and Mark Moir. Dynamic-Sized Lock-Free Data Structures. In Proceedings of the 22nd Annual ACM Symposium on Principles of Distributed Computing, PODC’02, 2002.


[HM91] Maurice Herlihy and J. Eliot B. Moss. Lock-Free Garbage Collection for Multiprocessors. In Proceedings of the Third Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA’91, 1991.

[HM93] Maurice Herlihy and J. Eliot B. Moss. Transactional Memory: Architectural Support for Lock-Free Data Structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture, ISCA’93, 1993.

[HS08] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Elsevier Science, 2008.

[HW90] Maurice Herlihy and Jeanette M. Wing. Linearizability: A Correctness Condition for Concurrent Objects. ACM Transactions on Programming Languages and Systems, 12(3):463–492, 1990.

[IBM83] IBM Corporation. IBM System/370 Extended Architecture Principles of Operation, 1983. Publication Number SA22-7085-0.

[Int14] Intel Corporation. Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture, February 2014. Order Number: 253665-049US.

[ISO95] ISO. ISO/IEC 8652:1995: Information technology — Programming languages — Ada. International Organization for Standardization, 1995.

[ISO11a] ISO. ISO/IEC 9899:2011: Information technology — Programming languages — C. International Organization for Standardization, 2011.

[ISO11b] ISO. ISO/IEC 14882:2011: Information technology — Programming languages — C++. International Organization for Standardization, 2011.


[Kno65] Kenneth C. Knowlton. A Fast Storage Allocator. Communications of the ACM, 8(10):623–624, 1965.

[Knu68] Donald E. Knuth. Dynamic Storage Allocation. In The Art of Computer Programming, volume 1, pages 435–455. Addison-Wesley, 1968.

[KP11] Alex Kogan and Erez Petrank. Wait-Free Queues With Multiple Enqueuers and Dequeuers. In Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP’11, 2011.

[KP12] Alex Kogan and Erez Petrank. A Methodology for Creating Fast Wait-Free Data Structures. In Proceedings of the 17th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP’12, 2012.

[Lov10] Robert Love. Linux Kernel Development, Third Edition. Addison-Wesley Professional, 2010.

[Mas92] Henry Massalin. Synthesis: An Efficient Implementation of Fundamental Operating System Services. PhD thesis, Columbia University, 1992.

[MB76] Robert M. Metcalfe and David R. Boggs. Ethernet: Distributed Packet Switching for Local Computer Networks. Communications of the ACM, 19(7):395–404, 1976.

[Mel87] John M. Mellor-Crummey. Concurrent Queues: Practical Fetch-and-Φ Algorithms. Technical Report 229, Computer Science Department, University of Rochester, 1987.

[Mic04a] Maged M. Michael. ABA Prevention Using Single-Word Instructions. Research report, IBM Research Division, 2004.

[Mic04b] Maged M. Michael. Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects. IEEE Transactions on Parallel and Distributed Systems, 15(6):491–504, 2004.


[Mic04c] Maged M. Michael. Scalable Lock-Free Dynamic Memory Allocation. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, PLDI’04, 2004.

[Mic10] Microsoft Corporation. Microsoft Portable Executable and Common Object File Format Specification, September 2010. Revision 8.2.

[Mol07] Ingo Molnar. Modular Scheduler Core and Completely Fair Scheduler [CFS]. See http://lwn.net/Articles/230501/, 2007.

[MP91] Henry Massalin and Calton Pu. A Lock-Free Multiprocessor OS Kernel. Technical report, Department of Computer Science, Columbia University, 1991.

[MS95] Maged M. Michael and Michael L. Scott. Correction of a Memory Management Method for Lock-Free Data Structures. Technical report, Computer Science Department, University of Rochester, 1995.

[MS96] Maged M. Michael and Michael L. Scott. Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing, PODC’96, 1996.

[Muc97] Steven S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers, 1997.

[Mul02] Pieter J. Muller. The Active Object System Design and Multiprocessor Implementation. PhD thesis, ETH Zurich, 2002.

[Neg06] Florian Negele. Porting the Active Oberon System to the AMD64 Architecture. Master’s thesis, ETH Zurich, 2006.


[Rea03] Patrik Reali. Using Oberon’s Active Objects for Language Interoperability and Compilation. PhD thesis, ETH Zurich, 2003.

[Rea04] Patrik Reali. Active Oberon Language Report. Technical report, Institute of Computer Systems, ETH Zurich, 2004.

[RG06] Patrik Reali and Ulrike Glavitsch. Aos/Bluebottle: Symbol and Object File Format. Technical report, Institute of Computer Systems, ETH Zurich, 2006.

[RSI12] Mark Russinovich, David A. Solomon, and Alex Ionescu. Windows Internals, Sixth Edition. Microsoft Press, 2012.

[Sze05] Michael Szediwy. Aos Thread Finalization & Termination. Diploma thesis, ETH Zurich, 2005.

[TLH94] Josep Torrellas, Monica S. Lam, and John L. Hennessy. False Sharing and Spatial Locality in Multiprocessor Caches. IEEE Transactions on Computers, 43(6):651–663, 1994.

[Too95] Tool Interface Standard (TIS). Executable and Linking Format (ELF) Specification, May 1995. Version 1.2.

[Val94] John D. Valois. Implementing Lock-Free Queues. In Proceedings of the 7th International Conference on Parallel and Distributed Computing Systems, PDCS’94, 1994.

[WG92] Niklaus Wirth and Jürg Gutknecht. Project Oberon: The Design of an Operating System and Compiler. Addison-Wesley, 1992.

[Wir88] Niklaus Wirth. The Programming Language Oberon. Software — Practice and Experience, 18(7):671–690, 1988.
