The Code of Many Colors: Semi-Automated Reasoning About Multi-Thread Policy for Java
Dean F. Sutherland

CMU-ISR-08-112
May 2008

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Thesis Committee:
William L. Scherlis, Chair
Jonathan Aldrich
Stephen Brookes
Eric Nyberg
Guy L. Steele, Jr., Sun Microsystems Laboratories

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Copyright © 2008 Dean F. Sutherland

This material is based upon work supported by the following grants: NASA: NCC2-1298 and NNA05C530A; Lockheed Martin: RRMHS1798; ARO: DAAD190210389; IBM Eclipse: IC-5010. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the sponsor or the U.S. government.

Keywords: Java, Program evolution, Static analysis, Race Conditions, Multi-threaded programming, Design Intent, Concurrency, Program assurance

For Libby, who has spent far too many years providing front-line support for me and for this work. She’s in serious danger of being recognized as the world’s finest spouse.

Also for Ivan, who has spent too many years and too many dollars providing financial support for my graduate schooling.

I couldn’t have done it without either of you.

Abstract

Concurrent programming has proven to be difficult. One cause of this difficulty is that the relevant thread usage policy seldom appears either in documents or code comments. A second difficulty is that thread usage policy, even when it is known, imposes widespread consequences on the code to be written. Finally, finding and removing concurrency faults in existing code is hard. This thesis introduces thread coloring, a language of discourse useful for concise expression of and reasoning about intended thread usage policies in a wide variety of code.
Thread coloring addresses a range of concurrency issues (assuring single-thread access, identifying possibly-shared data regions, and localizing knowledge about roles for threads) that have not previously been comprehensively addressed. Using this language, programmers can model design intent about relationships among the roles of threads with respect to segments of executable code and also with respect to shared state. Programmers formally link the model with their code by expressing the model as annotations in that code.

This thesis describes a prototype analysis tool, integrated into an integrated development environment, and its use in case studies to demonstrate that thread coloring is a feasible and practicable approach to expressing and understanding thread usage policies, including complex ones. The tool analyzes consistency between the expressed model and the as-written code, and notifies programmers of discrepancies between them. The case studies use published code to demonstrate that developers can express useful models, identify concurrency faults and assure policy compliance. The thesis includes a demonstration of scaling to a medium-sized program of 140KSLOC and a demonstration of the potential to scale to much larger programs and support composition among analysis results for separately developed components. By limiting the problem scope to thread usage policy, the prototype implementation requires one hundred times fewer annotations than are needed for full functional correctness: 6.3 annotations per KSLOC, potentially reducible in the future by another order of magnitude.

This thesis provides five primary contributions to software engineering. First, it provides a language that developers can use to express thread usage policies. Second, it provides a systematic way to improve code quality by assuring that as-written code complies with expressed thread usage policy.
Third, it uses a new combination of preexisting techniques to reduce the effort required to express models to very low levels. Fourth, it demonstrates techniques that permit the analysis to operate on very large programs; millions of lines of code appear to be within reach. Finally, it demonstrates techniques that permit straightforward and reliable incremental recomputation of results after a program change.

Acknowledgments

Every member of the Fluid group provided essential support and encouragement throughout my research. Edwin Chan kept the software working and pulled together whatever infrastructure I needed. John Tang Boyland built the IR and much of the other infrastructure. Aaron Greenhouse showed me how to write analyses that work with the IR. John and Aaron wrote the effects analysis that provides my connection with data. Tim Halloran got us hooked into Eclipse, and provided the Drop–Sea TMS. Elissa Newman helped me understand how to visualize my analysis. Bill Scherlis pointed us all in the right direction, provided sage advice as needed, and kept the funding flowing. I certainly couldn’t have done it without you.

Jonathan Aldrich spent endless hours helping me get my head around small-step semantics. He also provided a valuable sounding board about many different technical issues, especially during the design of the module system. Thanks for your patience and help.

Several outside readers spent way too much time reading drafts of dissertation chapters and telling me what I’d done wrong. I owe many thanks to Aaron Greenhouse, Daniel V. Klein, Joseph M. Newcomer, and Ivan E. Sutherland.

I especially want to thank Elizabeth Sutherland for her tireless efforts reading, critiquing and editing my dissertation. Without her efforts I would still have four different names for the color environment, a modules chapter on the brink of incomprehensibility, and a guided tour that loses readers at every turn.
The members of my thesis committee provided critical feedback about my research and my dissertation. They also provided exceedingly prompt turn-around on a dissertation that really should have been out weeks before it actually was. Thank you all, for your valuable input.

Contents

1 Introduction
  1.1 Vision
  1.2 Thread usage policy
  1.3 The problem: policy-based concurrency is difficult to implement correctly
    1.3.1 Respecting thread usage policy
    1.3.2 Repairing thread usage policy faults
    1.3.3 Thread usage policy knowledge
  1.4 Our approach: Thread coloring
    1.4.1 Formal Language
    1.4.2 Analysis
  1.5 Assessing thread coloring
    1.5.1 Tool
    1.5.2 Criteria for practicability
  1.6 Value of thread coloring in development
  1.7 Thesis Statement
    1.7.1 Outline of the dissertation
    1.7.2 Modules

2 Informal presentation of the coloring model
  2.1 Introduction
  2.2 Why use thread coloring?
    2.2.1 The Fluid solution
  2.3 Guided tour of thread coloring
    2.3.1 Modules
    2.3.2 Color names
    2.3.3 Color constraints
    2.3.4 Consistency Checking
    2.3.5 Unconstrained code
    2.3.6 Running example
    2.3.7 Color inheritance
    2.3.8 Running example
    2.3.9 Granting colors
    2.3.10 Scoped promises
    2.3.11 Running Example
    2.3.12 Issues previously glossed-over
    2.3.13 Running example
    2.3.14 Coloring data
  2.4 Design intent at scale
    2.4.1 Scaling design intent for large programs
    2.4.2 Scaling design intent for many thread roles
    2.4.3 Fundamentally complex threading models

3 Modules
  3.1 Introduction
  3.2 Why use modules?
    3.2.1 Software developers and managers
    3.2.2 Analysis developers
  3.3 Designing the Fluid module system
    3.3.1 Why design yet another module system?
    3.3.2 Design criteria
  3.4 Design concept, rationale & realization
    3.4.1 Terminology
    3.4.2 Modules from thirty-thousand feet
    3.4.3 Local visibility rule
    3.4.4 Hierarchy
    3.4.5 In/Out visibility control
    3.4.6 Non-tree-shaped module hierarchy
    3.4.7 Support for existing code
  3.5 Case study and validation
    3.5.1 Analyses that take advantage of the Fluid module system
    3.5.2 Electric 8.01 case study
  3.6 Comparative analysis
    3.6.1 Other module systems
    3.6.2 Module system attributes
    3.6.3 What’s different about the Fluid module system?
  3.7 Future Work

4 Case studies
  4.1 Introduction
  4.2 Realism and practicability
  4.3 GraphLayout case study
    4.3.1 The Application
    4.3.2 Case study details
  4.4 Sky View Café case study
    4.4.1 The application
    4.4.2 Case study details