Effective Testing for Concurrency Bugs
Thesis for obtaining the title of Doctor of Engineering Science of the Faculties of Natural Science and Technology I of Saarland University

by Pedro Fonseca

Saarbrücken, 13 April 2015

Date of Colloquium: 24/06/2015
Dean of Faculty: Univ.-Prof. Dr. Markus Bläser
Chair of the Committee: Prof. Dr. Andreas Zeller
First Reviewer: Prof. Dr. Rodrigo Rodrigues
Second Reviewer: Prof. Dr. Peter Druschel
Third Reviewer: Prof. Dr. George Candea
Fourth Reviewer: Dr. Björn Brandenburg
Academic Assistant: Dr. Przemyslaw Grabowicz

© 2015 Pedro Fonseca
ALL RIGHTS RESERVED

Dedicated to my parents and brother

This work was supported by the Foundation for Science and Technology of Portugal (SFRH/BD/45309/2008) and the Max Planck Society.

Abstract

In the current multi-core era, concurrency bugs are a serious threat to software reliability. As hardware becomes more parallel, concurrent programming will become increasingly pervasive. However, correct concurrent programming is known to be extremely challenging for developers and can easily lead to the introduction of concurrency bugs. This dissertation addresses this challenge by proposing novel techniques to help developers expose and detect concurrency bugs.

We conducted a bug study to better understand the external and internal effects of real-world concurrency bugs. Our study revealed that a significant fraction of concurrency bugs qualify as semantic or latent bugs, which are two particularly challenging classes of concurrency bugs. Based on the insights from the study, we propose a concurrency bug detector, PIKE, that analyzes the behavior of program executions to infer whether concurrency bugs have been triggered during a concurrent execution. In addition, we present the design of a testing tool, SKI, that allows developers to test operating system kernels for concurrency bugs in a practical manner.
SKI bridges the gap between user-mode testing and kernel-mode testing by enabling the systematic exploration of the kernel thread interleaving space. Our evaluation shows that both PIKE and SKI are effective at finding concurrency bugs.

Kurzdarstellung

Im gegenwärtigen Multicore-Zeitalter sind Fehler aufgrund von Nebenläufigkeit eine ernsthafte Bedrohung der Zuverlässigkeit von Software. Mit der wachsenden Parallelisierung von Hardware wird nebenläufiges Programmieren nach und nach allgegenwärtig. Diese Art von Programmieren ist jedoch als äußerst schwierig bekannt und kann leicht zu Programmierfehlern führen. Die vorliegende Dissertation nimmt sich dieser Herausforderung an, indem sie neuartige Techniken vorschlägt, die Entwicklern beim Aufdecken von Nebenläufigkeitsfehlern helfen.

Wir führen eine Studie von Fehlern durch, um die externen und internen Effekte von in der Praxis vorkommenden Nebenläufigkeitsfehlern besser zu verstehen. Diese ergibt, dass ein bedeutender Anteil von solchen Fehlern als semantisch bzw. latent zu charakterisieren ist – zwei besonders herausfordernde Klassen von Nebenläufigkeitsfehlern. Basierend auf den Erkenntnissen der Studie entwickeln wir einen Detektor (PIKE), der Programmausführungen daraufhin analysiert, ob Nebenläufigkeitsfehler aufgetreten sind. Weiterhin präsentieren wir das Design eines Testtools (SKI), das es Entwicklern ermöglicht, Betriebssystemkerne praktikabel auf Nebenläufigkeitsfehler zu überprüfen. SKI füllt die Lücke zwischen Testen im Benutzermodus und Testen im Kernelmodus, indem es die systematische Erkundung der Kernel-Thread-Verschachtelungen erlaubt. Unsere Auswertung zeigt, dass sowohl PIKE als auch SKI effektiv Nebenläufigkeitsfehler finden.

Acknowledgements

I'm greatly indebted to my advisor, Rodrigo Rodrigues, for his continuous effort over the years to support and encourage me. His exceptional energy and constructive optimism were absolutely essential to the success of this work.
His altruism made me a better person.

This thesis would not exist without the contributions of my collaborators: Björn Brandenburg, Cheng Li and Vishal Singhal. Special thanks to Björn for the feedback and support provided during the last few years of my work.

I won't forget that, during my first years at MPI-SWS, I learned a lot from the perseverance and experience of Atul Singh and Andreas Haeberlen. The energy of Pramod Bhatotia and Cheng Li was contagious. Thank you all.

My friends and colleagues at MPI-SWS, in particular the SysNets group members, had a crucial role in producing an inspiring research environment and providing invaluable feedback that were critical to this research. Living in Saarbrücken while working on my dissertation was an enjoyable experience, in large part thanks to my friends.

I am grateful to my parents and my brother for their permanent support throughout my life.

Contents

List of Figures
List of Tables

1 Introduction
  1.1 Hardware parallelism: A blessing and a curse
  1.2 The threat of concurrency bugs
  1.3 Problem statement
  1.4 Contributions
  1.5 Outline

2 Related work
  2.1 Studies of bugs
  2.2 Testing software
    2.2.1 Detecting concurrency bugs
    2.2.2 Exposing concurrency bugs
  2.3 Deterministic replay
  2.4 Virtual machine introspection (VMI)

3 Background
  3.1 Systematic exploration of the interleaving space
    3.1.1 PCT scheduling algorithm
  3.2 Database management systems
    3.2.1 MySQL overview
    3.2.2 Internal structure
    3.2.3 Concurrent programming
    3.2.4 Request vs. transaction concurrency
    3.2.5 Storage engines
  3.3 Operating system kernels
    3.3.1 Concurrency and synchronization mechanisms
    3.3.2 Thread affinity

4 The effects of concurrency bugs
  4.1 Overview
  4.2 Study methodology
    4.2.1 Selection of concurrency bugs
    4.2.2 Manual analysis of bug reports
  4.3 Results
    4.3.1 Evolution of concurrency bugs
    4.3.2 External effects
    4.3.3 Latent bugs
    4.3.4 Characteristics of semantic bugs
    4.3.5 Internal effects of latent bugs
    4.3.6 Recovering from latent errors
    4.3.7 Severity and fixing complexity
  4.4 Limitations and discussion
  4.5 Summary

5 Detecting latent and semantic bugs
  5.1 Overview
  5.2 Semantic and latent concurrency bugs
    5.2.1 Linearizability: The spec from within
    5.2.2 Capturing application state
    5.2.3 Maintaining the summary functions
  5.3 PIKE: A concurrency bug finding tool
    5.3.1 Handling false positives
    5.3.2 Implementation
  5.4 Experience with MySQL
    5.4.1 State summary functions
    5.4.2 Input generation
  5.5 Results
    5.5.1 Development effort
    5.5.2 Bugs found
    5.5.3 False positives
  5.6 Limitations and discussion
    5.6.1 Reliance on modifications to tested software
    5.6.2 Extending traditional test suites
  5.7 Summary

6 Exposing concurrency bugs in kernels
  6.1 Overview
  6.2 Kernel API and kernel modifications
  6.3 SKI: Exploring kernel interleavings
    6.3.1 Overview
    6.3.2 Exercising control over threads
    6.3.3 Inferring liveness
    6.3.4 Scheduling algorithm
    6.3.5 Discussion
  6.4 Scaling to real-world kernels
  6.5 Implementation
    6.5.1 Overview
    6.5.2 Runnable contexts
    6.5.3 Helper testing framework
    6.5.4 Optimizations and parallelization
    6.5.5 Bug detectors
    6.5.6 Traces and bug diagnosis
  6.6 Results
    6.6.1 Configuration
    6.6.2 Reproducing concurrency bugs
    6.6.3 Exposing unknown concurrency bugs
  6.7 Limitations and discussion
    6.7.1 Reliance on thread pinning
    6.7.2 Reliance on VMM
  6.8 Summary

7 Testing software in the