Aligning Intent and Behavior in Software Systems: How Programs Communicate & Their Distribution and Organization
© 2020 William B. Dietz

ALIGNING INTENT AND BEHAVIOR IN SOFTWARE SYSTEMS: HOW PROGRAMS COMMUNICATE & THEIR DISTRIBUTION AND ORGANIZATION

BY WILLIAM B. DIETZ

DISSERTATION

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2020

Urbana, Illinois

Doctoral Committee:
Professor Vikram Adve, Chair
Professor John Regehr, University of Utah
Professor Tao Xie
Assistant Professor Sasa Misailovic

ABSTRACT

Managing the overwhelming complexity of software is a fundamental challenge because complexity is the root cause of problems regarding software performance, size, and security. Complexity is what makes software hard to understand, and our ability to understand software in whole or in part is essential to being able to address these problems effectively. Attacking this overwhelming complexity is the fundamental challenge I seek to address by simplifying how we write, organize, and think about programs.

Within this dissertation I present a system of tools and a set of solutions for improving the nature of software by focusing on the programmer’s desired outcome, i.e., their intent. At the program level, the conventional focus, it is impossible to identify complexity that, at the system level, is unnecessary. This “accidental complexity” includes everything from unused features to independent implementations of common algorithmic tasks. Software techniques driving innovation simultaneously increase the distance between what is intended by humans – developers, designers, and especially the users – and what the executing code does in practice. By preserving the declarative intent of the programmer, which is lost in the traditional process of compiling, linking, and building software, it is easier to abstract away unnecessary details.
The Slipstream, ALLVM, and software multiplexing methods presented here automatically reduce the complexity of programs while retaining their intended function. This results in improved performance at both startup and run-time, as well as reduced disk and memory usage.

To Clare, Brian, Noelle, and Chex.

ACKNOWLEDGMENTS

This has been a long and difficult journey, and there are many people I would like to thank and words I wish I could find to thank them with. With humility and gratitude I apologize to anyone forgotten, and for inadequately expressing my thanks even to those mentioned; thank you to each of you wonderful people! Thank you for the many ways you helped me to learn and grow into the scholar and person that I needed to be to see this through.

I thank Andrew Lenharth for letting me work with him as an undergraduate student in the LLVM research group. Working with you inspired me to go down this crazy winding road.

I thank my advisor Professor Vikram Adve for your support and for allowing me the freedom to explore and find my way through what it means to be a researcher. Not many advisors would put up with the tumultuous path I took nor the emphasis on philosophy I placed on “doctor of philosophy” while pursuing it in computer science. Thank you.

I thank my coworkers, coauthors, colleagues, and peers Arushi Aggarwal, Sean Bartell, Joshua Cranmer, Assistant Professor John Criswell, Sandeep Dasgupta, Assistant Professor Nathan Dautenhahn, Theo Kasampalis, Maria Kotsifakou, John Mainzer, and Scott Moore for your collaboration, ideas, and insights on projects and for your camaraderie throughout this journey. Thank you for our great conversations about everything from computer science to life.

I thank the LLVM group undergraduate students, and now alumni, Tom Chen, Matthew Plant, Tarun Rajendran, and Matthew Wala for their hard work on the various projects that contributed to work included in the dissertation.
I thank the administrative staff at the Department of Computer Science in the Grainger College of Engineering, especially Viveka and Maggie for all their help navigating paperwork and deadlines. I also thank Vikram’s former faculty office support specialist Michelle for helping with travel, reimbursements, and more.

I thank my committee members Professor John Regehr, Professor Tao Xie, and Assistant Professor Sasa Misailovic.

From the bottom of my heart, I want to thank my best friends Brian and Noëlle James for always helping me, whether it was related to grad school or not. You two believed in me at times I could not believe in myself. Thank you for not giving up on me and for carrying me across the finish line.

I thank my Mom, brother Nat, and sister Angel for all their unconditional love and support during my grad school experience. Nat, thank you for our unforgettable ice fishing trips and weekly FIFA nights (“SHOOOOSH!”). Angel, thank you for the always-fun weekends at your place, and especially for my The Little Engine That Could book, which I still cherish. “I thought I could. I thought I could.”

I thank my girlfriend Clare for more than I can say. Thank you for your love, unwavering support, and for always believing in me. Thank you for putting up with my attempts to compare my research in computers with bridges and concrete. Hey, it kinda worked, way better than anyone would have expected it to.

Finally, I thank my cat Chex whose companionship and presence have been essential to my happiness and sanity while completing this dissertation. I do not thank you for standing on my laptop, but I admit it was an effective way to remind me of many things, such as ensuring I saved often and that it’s important to take breaks. Most especially you always brought me to the here and now, dragging me out of my nonsense and into partaking in your joy of the present.
Life is good and happiness comes easily if you let it, you would demonstrate, as long as there’s food, play, and snuggles. Thank you. To you and to ##uiuc-llvm I say: Meow.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
  1.1 Dissertation Contributions
CHAPTER 2 RELATED WORK
  2.1 Background
  2.2 Slipstream
  2.3 ALLVM
  2.4 Software Multiplexing
CHAPTER 3 SLIPSTREAM: AUTOMATIC OPTIMIZATION OF INTERPROCESS COMMUNICATION
  3.1 Context
  3.2 Introduction
  3.3 Slipstream Overview
  3.4 Design and Implementation
  3.5 Discussion
  3.6 Results
  3.7 Acknowledgements
  3.8 Conclusion
CHAPTER 4 ALLVM
  4.1 Introduction
  4.2 Forging an Allexe
  4.3 Specific Challenges Encountered
  4.4 Running an Allexe
  4.5 ALLVM Utility Suite
CHAPTER 5 SOFTWARE MULTIPLEXING
  5.1 Preface
  5.2 Introduction
  5.3 Extending ALLVM
  5.4 Generating Multicall Programs
  5.5 Statically Linking Shared Libraries
  5.6 Evaluation
  5.7 Discussion
  5.8 Conclusions
CHAPTER 6 CONCLUSIONS
  6.1 Future Directions
  6.2 Discussion
REFERENCES
APPENDIX A MULTIPLEXING: ADDITIONAL RESULTS
  A.1 Impact of Multiplexing on Compression
  A.2 Binary Sizes for Various Software
  A.3 One-Binary LEMP
APPENDIX B MULTIPLEXING: THEMED COLLECTIONS
  B.1 List of Packages in Each Collection
  B.2 Binary Sizes for “Logic” Collection

CHAPTER 1: INTRODUCTION

Simple things should be simple, complex things should be possible
– Alan Kay

Software is often slower than it should be, consumes more resources than it needs, and exhibits undesired behavior and bugs; in short, we find our software doesn’t do what we want it to do. Complexity is the root cause of problems regarding software performance, size, and security. Attacking this overwhelming complexity is the fundamental challenge I seek to address.

In the seminal paper “No Silver Bullet: Essence and Accidents of Software Engineering” [1], Brooks argues that, fundamentally, any approach to writing software captures both intentional and accidental complexity. Moseley and Marks [2], in contrast, suggest that declarative programming can capture only the essential complexity, without side effects. Certainly, to the extent possible, encapsulating only the intended complexity and avoiding all accidental complexity is ideal. However, this is not how software