Programming Technologies for Engineering Quality Multicore Software

by

Tim Kaler

B.S., Massachusetts Institute of Technology (2012)
M.Eng., Massachusetts Institute of Technology (2013)

Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Computer Science
at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

September 2020

© Massachusetts Institute of Technology 2020. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, August 28, 2020

Certified by: Charles E. Leiserson, Professor of Electrical Engineering and Computer Science, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Professor of Electrical Engineering and Computer Science, Chair, Department Committee on Graduate Students

Programming Technologies for Engineering Quality Multicore Software
by
Tim Kaler

Submitted to the Department of Electrical Engineering and Computer Science on August 28, 2020, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science

Abstract

The widespread availability of large multicore computers in the cloud has given engineers and scientists unprecedented access to large computing platforms. Traditionally, high-end computing solutions have been developed and used by only a small community, as these solutions rely on expensive and specialized computing environments. The emergence of large-scale cloud computing providers, however, has democratized access to large-scale (although not necessarily HPC-scale) computing power, which can now be rented on-demand with just a credit card.

The complexity of parallel programming, however, has made it more difficult for even expert programmers to develop high-quality multicore software systems. For average programmers, developing parallel programs that are debuggable, correct, and performant is a daunting challenge. This thesis is concerned with the development of programming technologies that reduce the complexity of parallel programming to make it easier for average programmers to exploit the capabilities of multicore hardware.

I contend that realizing the full potential of the multicore revolution requires the development of programming technologies that make it easier to write quality code — code that has a simple understandable structure and performs well in practice. These programming technologies broadly include parallel algorithms, data structures, optimization techniques, profiling tools, and system design principles. Toward these ends, this thesis presents seven intellectual artifacts from the domains of parallel algorithms, multicore-centric systems for scientific computing, and programming tools that make it easier to write quality code by simplifying the design, analysis, and performance engineering of multicore software:

• Chromatic: Parallel algorithms for scheduling data-graph computations deterministically.
• Color: Parallel algorithms and ordering heuristics for graph coloring that have the simple semantics of serial code.
• PARAD: An efficient and parallelism-preserving algorithm for performing automatic differentiation in parallel programs.
• Connectomics: An end-to-end image-segmentation pipeline for connectomics using a single large multicore.
• Alignment: An image-alignment pipeline for connectomics that uses memory-efficient algorithms and techniques for judiciously exploiting performance–accuracy tradeoffs.
• Reissue: Reissue policies for reducing tail latency in distributed services that are easy to analyze and effective in practice.
• Cilkmem: Efficient algorithms and tools for measuring the worst-case memory high-water mark of parallel programs.

Although the emphasis and domains of these artifacts vary, they each involve the discovery of a way to tame complexity in parallel software systems without compromising (and in fact usually enhancing) theoretical guarantees and real-world performance.

Thesis Supervisor: Charles E. Leiserson Title: Professor of Electrical Engineering and Computer Science

Acknowledgments

There are many individuals to whom I owe my thanks for their support and guidance during my graduate studies.

First and foremost, I thank my advisor Charles E. Leiserson. Charles has earned my enduring gratitude for his invaluable support, guidance, and generosity over the years. Charles is a professor who exudes integrity, curiosity, and a sense of fun. These attributes motivated me to work hard in his undergraduate algorithms class, take his performance engineering class, and ultimately join his research group. Since then I’ve learned more about writing and research from Charles than I could have ever anticipated. I can’t say that it has always been easy to implement all of his advice, but I can say that it’s always been worth the effort.

I thank Tao B. Schardl, Julian Shun, and I-Ting Angelina Lee for their service on my thesis committee and for helpful discussions over the years. Tao Schardl has a deep understanding of parallel linguistics, performance engineering, and compilers and has been a constant source of ideas, inspiration, and support. I also thank Tao Schardl for his contributions to the Chromatic, Color, Cilkmem, and PARAD papers. Julian has shared his substantial expertise in the design of work-efficient parallel algorithms and related topics. I also thank Julian for inviting me to give a lecture in his seminar class. Angelina has provided insights into the internals of the Cilk runtime system, and helpful discussions on research directions on many topics including reducer hyperobjects, scheduling, and program analysis tools.

deserves thanks for his support, encouragement, and contributions to the Connectomics and Alignment papers. The field of connectomics has fascinated me since I was in high school, and it was a joy to contribute to the project. I thank all the members of the connectomics project, especially Alex Matveev and Yaron Meirovitch, with whom I’ve had many fruitful and enriching discussions. I also thank my collaborators at Harvard including Daniel Berger, Adi (Suissa) Peleg, Thouis R. Jones, Hanspeter Pfister, and Jeff Lichtman.

Yuxiong He and Sameh Elnikety provided valuable mentorship during my time at Microsoft Research and contributed to the Reissue paper. Their good humor and sharp intellectual insights made my time at MSR an absolute joy. I thank my collaborators at IBM, especially Jie Chen, Georgios Kollias, and Mark Weber, for hosting me during multiple visits to IBM’s campus and for advice and helpful discussions related to machine learning on graphs and automatic differentiation.

I thank my other coauthors, not yet mentioned: William Kuszmaul, Daniele Vettorel, William Hasenplaugh, Brian Wheatman, Sarah Wooders, Erik Demaine, Quanquan Liu, Adam Yedidia, and Aaron Sidford. I thank William Kuszmaul for his unique theoretical insights into approximation algorithms in the Cilkmem paper. I thank Daniele Vettorel for his contributions to the Cilkmem paper, and for his advice and guidance in designing a prototype CSI tool for the PARAD paper. I thank Brian Wheatman and Sarah Wooders for their dedication and persistence as undergraduate researchers and M.Eng. students during our collaboration on the Alignment paper. I thank Erik Demaine, Quanquan Liu, Adam Yedidia, and Aaron Sidford for their theoretical insights and contributions to the Retroactive paper.
I thank William Hasenplaugh for his contributions to the Chromatic and Color papers as well as for his good humor and for sharing his deep theoretical and practical understanding of memory models and computer architecture.

Julian Shun and Adam Belay served on my RQE committee and I am grateful for their helpful comments and advice.

I thank all of the students, faculty, and staff on the seventh floor for providing helpful discussions and fun diversions throughout my graduate studies. I thank the members of the Supertech research group past and present. I especially thank Bradley Kuszmaul, Maryam Mehri Dehnavi, Shahin Kamali, Helen Xu, and Matthew Kilgore for their unique insights and helpful discussions over the years. I give special thanks to Cree Bruins and Marsha Davidson for their logistical support and for adding an extra dose of levity and warmth to the atmosphere of the group.

The members of the EECS graduate office deserve my thanks for helping my graduate studies run smoothly. I especially thank Janet Fischer for her assistance over the years. I thank Anne Hunter and the members of the undergraduate EECS office for all of the support provided during my undergraduate studies, and for their support of my UROP and M.Eng. students.

The members of TIG have been consistently helpful and flexible while I’ve been at CSAIL. The dedication of the members of TIG to maintaining transparency and openness in the network and computing infrastructure is something I really appreciate. Garrett Wollman, in particular, has been especially helpful for various computing projects related to courses and research.

I thank all of the faculty, staff, and students that have provided support during my time as an undergraduate at MIT. In particular, I thank Hari Balakrishnan, Sam Madden, Ben Waber, Lenin Ravindranath, and Arvind Thiagarajan for giving me early opportunities to engage in research during my undergraduate studies. I owe special thanks to Jeremy Orloff and Gabrielle Stoy for their excellent mentorship, their infectious appreciation for good mathematical arguments, and for giving me the opportunity to learn to teach during my first few years as an undergraduate at MIT.

I am tremendously grateful for the support of my research sponsors. This research was sponsored in part by: the United States Air Force Research Laboratory under Cooperative Agreement Number FA8750-19-2-1000; the National Science Foundation (NSF) under grants IIS-1447786, CNS-1017058, CCF-1162148, CCF-1314547, and CCF-1563880; the Intelligence Advanced Research Projects Activity (IARPA) under grant 138076-5093555; the MIT-IBM Watson AI Lab under grant 027397-00059; the Intel Corporation; and Foxconn Technology Group. The views and conclusions contained in this document are mine and should not be interpreted as representing the official policies, either expressed or implied, of the United States Air Force or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

Last, but not least, I thank all friends and family that have lent their support during my graduate studies. I especially thank Sophia for her support and encouragement during my graduate studies and during the final stretch of writing my thesis.

Contents

1 Introduction
  1.1 Shared-memory multicore programming
  1.2 Multicore-centric systems for scientific computing
  1.3 Beyond runtime: Tools for bounding memory usage and tail latency
  1.4 Other contributions
  1.5 Overview

2 Executing Dynamic Data-Graph Computations Deterministically Using Chromatic Scheduling
  2.1 Introduction
  2.2 Background
  2.3 The Prism algorithm
  2.4 The multibag data structure
  2.5 Analysis of Prism
  2.6 Empirical evaluation
  2.7 The Prism-R algorithm
  2.8 The multivector data structure
  2.9 Analysis and evaluation of Prism-R
  2.10 Conclusion
  2.11 Acknowledgments

3 Ordering Heuristics for Parallel Graph Coloring
  3.1 Introduction
  3.2 The Jones-Plassmann algorithm
  3.3 JP with random ordering
  3.4 The LF and SL heuristics
  3.5 Log ordering heuristics
  3.6 Empirical evaluation
  3.7 Implementation techniques
  3.8 The SD heuristic
  3.9 Related work
  3.10 Conclusion
  3.11 Appendix: Performance of serial ordering heuristics
  3.12 Acknowledgments

4 PARAD: A Work-Efficient Parallel Algorithm for Reverse-Mode Automatic Differentiation
  4.1 Introduction
  4.2 Preliminaries
  4.3 The SPTape Data Structure
  4.4 The PARAD algorithm
  4.5 Implementation of LibPARAD
  4.6 Performance Evaluation
  4.7 Related work
  4.8 Conclusion

5 A Multicore Path to Connectomics-on-Demand
  5.1 Introduction
    5.1.1 High-throughput connectomics
    5.1.2 Towards an automated terabyte-per-hour connectomics pipeline
    5.1.3 Our contributions
    5.1.4 Related work
  5.2 System overview
    5.2.1 Pipeline structure overview
  5.3 Segmentation with CNNs
    5.3.1 Our network architecture
    5.3.2 A fast CPU framework for CNNs
    5.3.3 A fast GPU framework for CNNs
  5.4 Watershed
  5.5 Agglomeration
    5.5.1 Regional adjacency graphs
  5.6 Merging
  5.7 Skeletonization
  5.8 Pipeline performance
    5.8.1 Reconstruction accuracy
  5.9 Lessons learned
  5.10 Conclusion
  5.11 Acknowledgements

6 High-Throughput Image Alignment for Connectomics using Frugal Snap Judgments
  6.1 Introduction
  6.2 Alignment algorithms used in connectomics
  6.3 Quilter algorithm
  6.4 The Stacker algorithm
  6.5 Frugal snap judgments
  6.6 System evaluation
  6.7 Computing platforms and datasets
  6.8 Empirical analysis of Stacker's memory usage
  6.9 Conclusion

7 Cilkmem: Algorithms for Analyzing the Memory High-Water Mark of Fork-Join Parallel Programs
  7.1 Introduction
    7.1.1 Memory consumption of fork-join programs
    7.1.2 Algorithms for memory high-water mark
    7.1.3 The Cilkmem tool
    7.1.4 Outline
  7.2 Problem formalization
  7.3 An exact algorithm with O(p) overhead
  7.4 An online (memory-efficient) algorithm
  7.5 Online approximation in linear time
    7.5.1 Stripped robust antichains
    7.5.2 Recursively computing H∞(G)
  7.6 Empirical evaluation
    7.6.1 Implementation
    7.6.2 Benchmarks
    7.6.3 Optimizations
    7.6.4 Case study: multicore image processing pipeline
  7.7 Related work
  7.8 Conclusion
  7.9 Appendix: Online exact computation of Hp(G)
  7.10 Appendix: An offline approximate-threshold algorithm
  7.11 Appendix: Recursing on multi-spawn components

8 Optimal Reissue Policies for Reducing Tail-Latency
  8.1 Introduction
  8.2 Deterministic versus random reissue
    8.2.1 Model and terminology
    8.2.2 The SingleD policies
    8.2.3 The SingleR policies
    8.2.4 Randomization is essential
  8.3 Single versus multiple reissue
    8.3.1 Multiple time policies
    8.3.2 Single is optimal
  8.4 SingleR for interactive services
    8.4.1 Parameter search
    8.4.2 Incorporating response-time correlations
    8.4.3 Iterative adaptation for queue delays
    8.4.4 Extended scenarios
  8.5 Simulations
    8.5.1 Simulated workload
    8.5.2 Benefits of randomization
    8.5.3 Impact of correlation and queueing
    8.5.4 Sensitivity study
  8.6 Experimental evaluation
    8.6.1 Experimental setup and workloads
    8.6.2 Redis set-intersection workload
    8.6.3 Lucene search workload
  8.7 Conclusion

9 Polylogarithmic Fully Retroactive Priority Queues via Hierarchical Checkpointing
  9.1 Introduction
  9.2 Hierarchical checkpointing
    9.2.1 Definitions
    9.2.2 The data structure
  9.3 Time-fusible partially retroactive priority queue
    9.3.1 Partially retroactive priority queues
    9.3.2 Fusion algorithm
  9.4 Fully retroactive priority queue
    9.4.1 Obtaining full retroactivity using hierarchical checkpointing
    9.4.2 Faster retroactive updates and Find-Deletion-Time queries
    9.4.3 Faster Find-Deletion-Time queries

10 Conclusion
  10.1 Summary
  10.2 Taming complexity in a post-Moore's-law world

Chapter 1

Introduction

The widespread availability of large multicore hardware in the cloud has given engineers and scientists unprecedented access to large computing platforms. Traditionally, high-end computing solutions have been developed and used by only a small community, as these solutions rely on expensive and specialized computing environments. The emergence of large-scale cloud computing providers, however, has democratized access to large-scale (although not necessarily HPC-scale) computing power, which can now be rented on-demand with just a credit card.

Modern multicore machines can store within their primary memory the datasets of many big research problems in engineering and science. For example, the hyperlink graph extracted from the Common Crawl 2012 web corpus, the largest publicly available graph of the World Wide Web [257, 256], has fewer than 3.6 billion vertices and 129 billion edges and can be stored in much less than 100GB in a variety of formats. The largest graph in Stanford’s SNAP Datasets [222] is com-Friendster, which has fewer than 66 million vertices and just over 1.8 billion edges and can be stored in less than 9GB. The largest sparse matrix currently in the University of Florida’s Sparse Matrix Collection [84] is sk-2005, which has just over 50 million rows and columns and fewer than 2 billion nonzeros and can be stored in less than 3GB. All of these datasets fit within the primary memory of a large cloud multicore.

A multicore-centric programming environment provides many advantages relative to distributed systems and GPUs. A large distributed system may have tremendous computational resources at its disposal, but its raw power comes with added complexity and overhead. Data must be moved over the network, and the system must support a degree of fault tolerance. These overheads tend to be small for problems that are embarrassingly parallel, but can come to dominate the execution time of computations that need to operate on shared data. Additionally, the multicore programming environment provides a degree of programming flexibility that can enable the rapid development of novel algorithms and software systems for emerging applications in industry and the sciences.

Despite multicores being readily available and providing a powerful general-purpose programming environment, the notorious difficulty of parallel programming has blunted their potential. For the scientific computing community, developing quality software systems that can take full advantage of the capabilities of large multicore hardware is a costly endeavour: requiring more time, more expert programmers, and more complex software systems. The complexity of writing performant multicore programs has generated considerable stress in industry and academia:

“When we start talking about parallelism and the use of truly parallel computers, we’re talking about a problem that’s as hard as any that computer science has faced.”
— John Hennessy, Turing Award Winner, in “A conversation with John Hennessy and David Patterson” [150]

The complexity of parallel programming has made it more difficult for even expert programmers to develop high-quality multicore software systems. For average programmers, developing parallel programs that are debuggable, correct, and performant is a daunting challenge. This thesis is concerned with the development of programming technologies that reduce the complexity of parallel programming to make it easier for average programmers to exploit the capabilities of multicore hardware.

Programming technology for quality multicore software

In this thesis, I seek to simplify the engineering of multicore software systems by making it easier for programmers to write quality code that has simple understandable structure and performs well in practice. In pursuit of this goal, I have sought to develop technologies that simplify the theory and practice of parallel programming. These programming technologies span a broad range that includes parallel algorithms, data structures, optimization techniques, profiling tools, and system design principles. I contend that these technologies, by simplifying the design and performance engineering of parallel code, can empower average programmers to productively develop quality multicore software. Toward these ends, this thesis provides evidence that appropriate programming technologies can simplify the design and performance engineering of parallel code. Specifically, the artifacts in this thesis are in support of the following statement.

Thesis statement: Parallel software systems designed using appropriate programming technologies can have a simple understandable structure and achieve good performance in practice.

In support of my thesis statement, I present seven principal intellectual artifacts that advance the state-of-the-art in the domains of parallel algorithms, multicore-centric systems for scientific computing, and programming tools for understanding and optimizing performance in parallel systems:

• Chromatic (Chapter 2): A parallel algorithm for scheduling data-graph computations deterministically. From “Executing dynamic data-graph computations deterministically using chromatic scheduling,” published in SPAA 2014 [184] and in TOPC 2016 [183], with coauthors: William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson.

• Color (Chapter 3): Parallel algorithms and ordering heuristics for graph coloring that have the simple semantics of serial code. From “Ordering heuristics for parallel graph coloring,” published in SPAA 2014 [146], with coauthors: William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson.

• PARAD (Chapter 4): An efficient and parallelism-preserving algorithm for performing automatic differentiation in parallel programs. From “PARAD: A work-efficient parallel algorithm for reverse-mode automatic differentiation,” in submission, with coauthors: Tao B. Schardl, Brian Xie, Jie Chen, Aldo Pareja, Georgios Kollias, and Charles E. Leiserson.


Figure 1-1: Graphical illustration of this thesis’s seven principal artifacts: Chromatic, Color, PARAD, Connectomics, Alignment, Cilkmem, and Reissue.

• Connectomics (Chapter 5): An end-to-end image segmentation and skeletonization pipeline for connectomics using a single large multicore. From “A multicore path to connectomics-on-demand” published in PPoPP 2017 [247] with coauthors: Alexander Matveev, Yaron Meirovitch, Hayk Saribekyan, Wiktor Jakubiuk, Gergely Odor, David Budden, Aleksandar Zlateski, and Nir Shavit.

• Alignment (Chapter 6): An image-alignment pipeline for connectomics that uses memory-efficient algorithms, and techniques for judiciously exploiting performance–accuracy tradeoffs. From “High-throughput image alignment for connectomics using frugal snap judgments” published in HPEC 2020 [188] with coauthors: Brian Wheatman and Sarah Wooders.

• Cilkmem (Chapter 7): Efficient algorithms and tools for measuring the worst-case memory high-water mark of parallel programs. From “Cilkmem: Algorithms for analyzing the memory high-water mark of fork-join parallel programs” published in APOCS 2020 [186] with coauthors: William Kuszmaul, Tao B. Schardl, and Daniele Vettorel.

• Reissue (Chapter 8): Reissue policies for reducing tail latency in distributed services that are easy to analyze and effective in practice. From “Optimal reissue policies for reducing tail latency” published in SPAA 2017 [185] with coauthors: Yuxiong He and Sameh Elnikety.

Organization

The seven principal artifacts of my thesis fall into roughly three categories based on each artifact’s approach to simplifying the development of quality code. Figure 1-1 illustrates the organization of the thesis’s principal artifacts. The contributions to parallel algorithms employ techniques relating to theoretical models of performance and multicore performance engineering to obtain easy-to-understand code that performs well in practice.

The contributions to scientific computing systems employ techniques relating to multicore performance engineering and parallel system design to obtain simple and high-performance software systems. The contributions to programming tools use theories of performance in conjunction with parallel system design to understand and optimize anomalous behavior relating to memory usage and latency in parallel systems. I summarize, in the following paragraphs, these three categories of artifacts and explain how they support the development of quality code.

Parallel algorithms. Chapters 2–4 present advancements in parallel algorithms for data-graph computations, graph coloring, and automatic differentiation. These artifacts support the development of quality code by showing how parallel algorithms can be designed to match the simple semantics of a sequential program while being efficient and scalable both in theory and in practice. Chapter 2 (the Chromatic artifact) shows how carefully designed parallel scheduling algorithms for data-graph computations can preclude the need for nondeterministic synchronization using locks. Chapter 3 (the Color artifact) introduces ordering heuristics for parallel graph coloring that are deterministic, provably scalable, and produce graph colorings of similar quality to known ordering heuristics used in serial code. Chapter 4 (the PARAD artifact) shows how serial algorithms for performing automatic differentiation can be generalized to parallel code in a work-efficient and parallelism-preserving manner. In Section 1.1, I provide background on shared-memory multicore programming, explain why determinism and serial semantics are critical in quality multicore code, and discuss this thesis’s contributions to parallel algorithms in greater depth.

Multicore-centric scientific computing. Chapters 5–6 present case studies on the design and performance engineering of software systems that solve computational problems in the field of connectomics. These artifacts support the development of quality code by illustrating how simply designed parallel systems can achieve state-of-the-art performance through the use of quality parallel algorithms and simple performance engineering techniques. Chapter 5 (the Connectomics artifact) discusses the design and implementation of an image reconstruction pipeline. In this system, I show how careful parallelization and performance engineering can allow a single shared-memory multicore to outperform more complex systems that employed distributed computing and GPUs. Chapter 6 (the Alignment artifact) discusses a multicore-centric image-alignment pipeline for connectomics. In this system, I show how carefully designed multicore software systems can scale vertically to larger machines, scale horizontally over many multicores in a cluster, and scale to the largest conceivable data sets on the horizon in connectomics using machines with less than 1 TB of memory. In Section 1.2, I provide background on the field of connectomics and summarize this thesis’s contributions in the area of multicore-centric scientific computing for image reconstruction and image alignment in connectomics.

Tools for understanding parallel systems. Chapters 7–8 present technologies for understanding and mitigating anomalous behavior in parallel systems. These artifacts support the development of quality software by providing principled methods for understanding the interaction between a system’s structure and its anomalous, or worst-case, behavior. Chapter 7 (the Cilkmem artifact) presents new algorithms for computing the exact and approximate worst-case memory usage of multicore programs. Cilkmem enables a programmer to understand the worst-case memory usage of their multicore code, which may be observed only rarely in practice, by running the program once under Cilkmem’s instrumentation.

Chapter 8 (the Reissue artifact) discusses strategies for mitigating the impact of “stragglers” in distributed request-response workflows. In such distributed systems, a key metric to optimize is the 99th-percentile tail latency of a request; this optimization is often accomplished by replicating the responding service and judiciously sending duplicate requests to different replicas. The Reissue artifact describes and theoretically compares families of policies for sending these duplicate requests and analyzes their ability to effectively reduce tail latency. Section 1.3 discusses this thesis’s contributions to programming tools for analyzing and optimizing worst-case memory usage and tail latency in parallel systems in greater depth.

In addition to the seven principal artifacts, Chapter 9 (the Retroactive artifact) presents a fully retroactive priority queue data structure with polylogarithmic overheads. The contributions in the Retroactive artifact are summarized in Section 1.4. Although this artifact does not directly relate to quality code, the data structure techniques employed to solve this open problem in retroactive data structures were inspired by techniques used to manage parallel data structures. Section 1.5 provides an overview of the scope and impact of the artifacts in this thesis.

1.1 Shared-memory multicore programming

This section discusses techniques for reducing the complexity of writing correct and efficient shared-memory multicore code. I discuss the observation, made by many other researchers in the field, that parallel programming is substantially less complex when it is deterministic and has the semantics of serial code. I summarize this thesis’s contributions in Chapters 2–4 to parallel algorithms and data structures for graph coloring, data-graph computations, and reverse-mode automatic differentiation.

Parallelism with serial semantics

Developing correct and performant parallel programs is substantially more complex than writing sequential code. The patterns and invariants of sequential code that programmers rely upon when writing software are often lost when parallelism is introduced. One cannot easily rationalize the correctness of most parallel programs by walking through their steps on a napkin. Nor can one know whether repeated runs of a parallel program on the same input will yield the same, or even a correct, result. A programmer cannot even do something as simple as push elements to a dynamic array in parallel without tackling the complexities of concurrent programming.

“Although threads seem to be a small step from sequential computations, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computations: understandability, predictability, and determinism.”
Edward A. Lee in “The Problem with Threads” [211]

The design of abstractions that allow programmers to productively write correct, performant, and maintainable multicore code is a key challenge in the community [12]. It has been convincingly argued that these abstractions are most useful when they encapsulate nondeterminism due to concurrency and maintain the semantics of serial code [250, 43, 211, 33]:

“The aggressive goal of the parallel revolution is to make it as easy to write programs that are efficient, portable, and correct [...] as it has been to write programs for sequential computers.”
Krste Asanovic et al. in “A view of the parallel computing landscape” [12]

Many researchers over multiple decades have advocated that the difficulty of parallel programming can be greatly reduced by using some form of “deterministic parallelism” [281, 143, 125, 326, 32, 106, 105, 92, 93, 160, 23, 24, 274, 353, 43]. With a deterministic parallel program, the programmer observes no logical concurrency, that is, no nondeterminacy in the behavior of the program due to the relative and nondeterministic timing of communicating processes, such as occurs when one process arrives at a lock before another. The semantics of a deterministic parallel program are therefore serial, and reasoning about such a program’s correctness, at least in theory, is no harder than reasoning about the correctness of a serial program. Testing, debugging, and formal verification are simplified, because there is no need to consider all possible relative timings (interleavings) of operations on shared mutable data.

Despite the apparent advantages of deterministic parallelism, however, most parallel programs deployed in practice exhibit nondeterministic behavior due to concurrency. Indeed, most parallel programs today are still written using Pthreads [168], which forces programmers to use concurrency mechanisms such as mutex locks and condition variables. As a consequence, only experts can program these parallel applications, especially those for shared-memory platforms, and these bug-prone codes can only be understood by experts. For example, all the codes in the PARSEC [27], Galois [283], and STAMP [60] benchmark suites use concurrency mechanisms.

Nevertheless, important steps toward determinism have been made. Perhaps the most significant practical progress has occurred in the realm of fork-join parallelism. Fork-join parallelism is usually implemented using work-stealing [57, 103, 108, 114, 143, 189, 199, 202, 270, 337, 41, 39, 116], where worker threads in the runtime system coordinate to load-balance the computation, as in the various Cilk dialects [38, 116, 82, 219, 212, 172], Fortress [3], Habanero [16], Habanero-Java [61], Hood [42], HotSLAW [258], Java Fork/Join Framework [208], OpenMP [275, 13], Task Parallel Library [218], Threading Building Blocks (TBB) [291], and X10 [68]. In this model, subroutines can be spawned in parallel, generating a series-parallel execution dag in which the synchronization of subtasks is managed “under the covers” by the runtime system. Constructs such as parallel_for can be implemented as syntactic sugar on top of the fork-join model. As long as the parallel program contains no determinacy races [106] (also called general races [266]), the program is deterministic. Moreover, efficient tools exist that can guarantee to detect determinacy races or validate their absence [106, 107].
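To make the fork-join model concrete, the following sketch shows a race-free parallel summation written in Cilk-style C++. It is an illustrative example rather than code from the thesis, and it assumes an OpenCilk-style compiler that provides the cilk_spawn and cilk_sync keywords.

```cpp
// A minimal sketch of fork-join parallelism with serial semantics, assuming
// an OpenCilk-style compiler. Eliding cilk_spawn and cilk_sync leaves an
// ordinary serial C++ function that computes the same result.
#include <cilk/cilk.h>
#include <cstddef>
#include <cstdint>

int64_t sum(const int64_t* a, std::size_t n) {
  if (n < 4096) {                     // coarsen small subproblems
    int64_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += a[i];
    return s;
  }
  int64_t left, right;
  left = cilk_spawn sum(a, n / 2);    // fork: left half may run in parallel
  right = sum(a + n / 2, n - n / 2);  // continue with the right half
  cilk_sync;                          // join: wait for the spawned child
  return left + right;                // no shared mutable state, hence no races
}
```

Because no two parallel strands write the same memory location, the program has no determinacy races, and every execution produces the result of the corresponding serial program.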

Chromatic (Chapter 2)

The Chromatic artifact discusses Prism and Prism-R, two deterministic schedulers for data-graph computations. Data-graph computations have been popularized in such programming systems as Galois [267, 268], Pregel [242], GraphLab [234, 233], PowerGraph [134], Ligra [318, 321], and GraphChi [203]. Prism employs a technique called chromatic scheduling [26, 1, 233] to resolve data races in a data-graph computation deterministically. Prism computes a vertex coloring of the graph that it uses to coordinate updates performed in a round, precluding the need for mutual exclusion locks or other nondeterministic synchronization.

Surprisingly, Prism outperforms existing lock-based schedulers, showing that the “price of determinism” in the case of scheduling data-graph computations can be negative.

A key data structure challenge solved in the Chromatic artifact is the efficient and deterministic management of dynamic collections in parallel code. During a data-graph computation, Prism maintains dynamic sets of vertices that are partitioned by color, called color sets. The Chromatic artifact introduces two data structures for representing these color sets: the “multibag” and “multivector.” The multibag provides an efficient way to manage color sets in parallel, but does not guarantee that the ordering of vertices within a color set will be deterministic. The multivector is similar to the multibag but additionally ensures that the vertices in each color set are ordered deterministically. In fact, the multivector maintains the same semantics as a set of dynamic arrays that support the insertion of vertices via an append operation — vertices are ordered based on the order they would have been inserted in a serial execution.

The Prism-R algorithm extends Prism to handle data-graph computations whose update functions perform global associative reductions. Prism-R uses a multivector to represent its color sets to ensure that vertices of the same color are deterministically ordered. The multivector enables Prism-R to guarantee that global associative reductions performed over vertices in the graph have a deterministic result. Prism-R and the multivector provide these strong serial semantics while matching the theoretical bounds on work, span, and parallelism achieved by Prism. In practice, Prism-R has only a 7% geometric-mean overhead relative to Prism on seven application benchmarks.
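To illustrate the idea of chromatic scheduling, the following simplified sketch processes one round of a data-graph computation color by color. It is not the Prism implementation: the Graph type, the update() hook, and the activation rule are placeholders, and the serial gathering step at the end stands in for the parallel, deterministic collection that the multibag data structure provides.

```cpp
// Simplified sketch of one round of chromatic scheduling (illustrative only).
#include <cilk/cilk.h>
#include <cstddef>
#include <vector>

struct Graph;                         // application-defined graph representation
int num_vertices(const Graph& G);     // application-defined
bool update(Graph& G, int v);         // recompute v's data from its neighbors;
                                      // returns true if v's data changed

void chromatic_round(Graph& G,
                     const std::vector<std::vector<int>>& color_sets,  // active vertices by color
                     std::vector<std::vector<int>>& next_sets,         // activations for the next round
                     const std::vector<int>& color_of) {
  std::vector<char> changed(num_vertices(G), 0);
  for (std::size_t c = 0; c < color_sets.size(); ++c) {
    // Vertices of one color form an independent set: no two adjacent vertices
    // are updated concurrently, so an update that writes only its own vertex
    // and reads its neighbors is race-free without any locks.
    cilk_for (std::size_t i = 0; i < color_sets[c].size(); ++i) {
      int v = color_sets[c][i];
      changed[v] = update(G, v);      // each iteration writes only changed[v]
    }
  }
  // Gather activated vertices for the next round (serial here for clarity; a
  // real scheduler would typically activate the neighbors of changed vertices
  // and would use a multibag/multivector to do this collection in parallel).
  for (int v = 0; v < num_vertices(G); ++v)
    if (changed[v]) next_sets[color_of[v]].push_back(v);
}
```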

Color (Chapter 3)

The Color artifact discusses the design of vertex-ordering heuristics for parallel graph-coloring algorithms. Ordering heuristics are vital to achieving quality vertex colorings when using a greedy algorithm. In fact, the first ordering heuristic, called largest-degree-first (LF), was described by Welsh and Powell [341] alongside their introduction of the first greedy graph-coloring algorithm in 1967. Numerous other orderings have been proposed since then, such as the smallest-degree-last (SL) [246, 4], incidence-degree (ID) [75], and saturation-degree (SD) [48] heuristics.

It may appear that an ordering heuristic for parallel graph coloring must necessarily be aware of concurrency, but it turns out that ordering heuristics used in parallel graph coloring need not be concerned directly with the details of the parallel execution. Rather, the intrinsic parallelism of a particular ordering heuristic (as used in a serial code) can be analyzed theoretically and then directly applied using a deterministic parallel graph-coloring code that matches the result of the serial code.

In Chapter 3, we introduce the largest-log-degree-first (LLF) and smallest-log-degree-last (SLL) ordering heuristics for parallel greedy graph-coloring algorithms, which are inspired by the largest-degree-first (LF) and smallest-degree-last (SL) serial heuristics, respectively. We show that although LF and SL, in practice, generate colorings with relatively small numbers of colors, they are vulnerable to adversarial inputs for which any parallelization yields a poor parallel speedup. In contrast, LLF and SLL allow for provably good speedups on arbitrary inputs while, in practice, producing colorings of competitive quality to their serial analogs.
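As a concrete illustration of how an ordering heuristic plugs into greedy coloring, the sketch below colors vertices serially in an order produced by a simple log-degree bucketing rule in the spirit of LLF. It is illustrative only and is not the implementation from Chapter 3; the bucketing and tie-breaking details are simplifying assumptions.

```cpp
// Serial greedy coloring under a vertex ordering, with an LLF-style order:
// bucket vertices by the (ceiling of the) log of their degree and break ties
// randomly. Illustrative sketch, not the thesis code.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using AdjList = std::vector<std::vector<int>>;

std::vector<int> llf_order(const AdjList& adj) {
  std::vector<int> order(adj.size());
  for (std::size_t v = 0; v < adj.size(); ++v) order[v] = (int)v;
  std::mt19937 rng(0);
  std::vector<double> tie(adj.size());
  for (auto& t : tie) t = std::uniform_real_distribution<double>(0.0, 1.0)(rng);
  std::sort(order.begin(), order.end(), [&](int u, int v) {
    int lu = (int)std::ceil(std::log2(adj[u].size() + 1));  // log-degree bucket
    int lv = (int)std::ceil(std::log2(adj[v].size() + 1));
    return lu != lv ? lu > lv : tie[u] < tie[v];             // larger buckets first
  });
  return order;
}

std::vector<int> greedy_color(const AdjList& adj, const std::vector<int>& order) {
  std::vector<int> color(adj.size(), -1);
  for (int v : order) {
    std::vector<char> used(adj[v].size() + 1, 0);
    for (int u : adj[v])                          // colors taken by colored neighbors
      if (color[u] >= 0 && color[u] <= (int)adj[v].size()) used[color[u]] = 1;
    int c = 0;
    while (used[c]) ++c;                          // smallest available color
    color[v] = c;
  }
  return color;
}
```

The point made in Chapter 3 is that a parallel coloring code can reproduce exactly the coloring that this serial loop would produce for a given ordering, so the heuristic itself never needs to reason about concurrency.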

PARAD (Chapter 4)

The PARAD artifact presents the first work-efficient parallelism-preserving algorithm for performing reverse-mode automatic differentiation. Automatic differentiation (AD) is a technique for computing the derivative of a function F : R^n → R^m defined by a computer program. Modern applications of AD, such as machine learning, typically use AD to facilitate gradient-based optimization of an objective function for which m ≪ n (often m = 1). As a result, these applications typically use reverse (or adjoint) mode AD to compute the gradient of F efficiently, in time Θ(m · T1(F)), where T1(F) is the work (serial running time) of F. Although the serial running time of reverse-mode AD has a well-known relationship to the total work of F, general-purpose reverse-mode AD has proven challenging to parallelize in a work-efficient and scalable fashion, as simple approaches tend to result in poor performance or scalability.

PARAD is a work-efficient parallel algorithm for reverse-mode AD of recursive fork-join programs. We analyze the performance of PARAD using work-span analysis. Given a program F with work T1(F) and span (critical-path length) T∞(F), PARAD performs reverse-mode AD of F in O(m · T1(F)) work and O(log m + log(T1(F)) · T∞(F)) span. To the best of our knowledge, PARAD is the first parallel algorithm for performing reverse-mode AD that is both provably work-efficient and has span within a polylogarithmic factor of the original program F.

We implemented PARAD as an extension of Adept, a C++ library for performing reverse-mode AD for serial programs, which is known for its efficiency. Our implementation supports the use of Cilk fork-join parallelism and requires no programmer annotations of parallel control flow. Instead, it uses compiler instrumentation to dynamically trace a program’s series-parallel structure, which is used to automatically parallelize the gradient computation via reverse-mode AD. On eight machine-learning benchmarks, our implementation of PARAD achieves 1.5× geometric-mean multiplicative work overhead relative to the serial Adept tool, and 8.9× geometric-mean self-relative speedup on 18 cores.
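For readers unfamiliar with reverse-mode AD, the following minimal serial, tape-based sketch shows the kind of computation that PARAD parallelizes: a forward pass records each operation and its local partial derivatives on a tape, and a reverse sweep over the tape accumulates adjoints. This is a textbook-style illustration, not the PARAD algorithm or the Adept library.

```cpp
// Minimal serial tape-based reverse-mode AD for scalar +/* expressions.
#include <cstdio>
#include <vector>

struct Tape {
  struct Op { int a, b; double da, db; };  // operand indices and local partials
  std::vector<Op> ops;                     // one entry per recorded value
  std::vector<double> val;

  int leaf(double x) { val.push_back(x); ops.push_back({-1, -1, 0, 0}); return (int)val.size() - 1; }
  int add(int a, int b) { val.push_back(val[a] + val[b]); ops.push_back({a, b, 1, 1}); return (int)val.size() - 1; }
  int mul(int a, int b) { val.push_back(val[a] * val[b]); ops.push_back({a, b, val[b], val[a]}); return (int)val.size() - 1; }

  // Reverse sweep: walk the tape backwards, accumulating adjoints.
  std::vector<double> gradient(int out) {
    std::vector<double> adj(val.size(), 0.0);
    adj[out] = 1.0;
    for (int i = (int)ops.size() - 1; i >= 0; --i)
      if (ops[i].a >= 0) { adj[ops[i].a] += ops[i].da * adj[i]; adj[ops[i].b] += ops[i].db * adj[i]; }
    return adj;
  }
};

int main() {
  Tape t;
  int x = t.leaf(3.0), y = t.leaf(4.0);
  int f = t.add(t.mul(x, x), t.mul(x, y));              // f = x*x + x*y
  std::vector<double> g = t.gradient(f);
  std::printf("df/dx = %g, df/dy = %g\n", g[x], g[y]);  // 2x + y = 10, x = 3
  return 0;
}
```

The reverse sweep walks the tape strictly backwards, which is what makes a naively parallel version either serialize on the tape or lose work-efficiency; PARAD's contribution is avoiding that trade-off for fork-join programs.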

1.2 Multicore-centric systems for scientific computing

This section summarizes this thesis’s contributions in Chapters 5–6 that simplify the engineering of high-performance systems for scientific computing. Specifically, this thesis presents work on two software systems that solve computational problems arising in the field of connectomics using a multicore-centric approach. Surprisingly, we find that a mixture of general and domain-specific performance optimization techniques can be used to obtain performance on a single multicore that rivals that obtained when using large clusters of CPUs and GPUs.

Background on connectomics

The field of connectomics uses cutting-edge machine learning and image processing to extract brain connectivity graphs from electron microscopy images. Advances in electron microscopy have enabled the acquisition of image data sets that capture both the small and large scale features present in neural tissue. The resultant data sets are quite large, with a relatively small 1 mm³ volume producing petabytes of data when imaged at 3 × 3 × 30 nm resolution. The scale of the acquired data necessitates the development of image processing systems that are both scalable and efficient.
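As a rough back-of-the-envelope check on the petabyte figure (assuming, for illustration, on the order of one byte per voxel):

```latex
\[
  \frac{10^{6}\,\mathrm{nm}}{3\,\mathrm{nm}} \times
  \frac{10^{6}\,\mathrm{nm}}{3\,\mathrm{nm}} \times
  \frac{10^{6}\,\mathrm{nm}}{30\,\mathrm{nm}}
  \;\approx\; 3.7 \times 10^{15} \ \text{voxels per } \mathrm{mm}^{3},
\]
```

which corresponds to a few petabytes per cubic millimeter at roughly one byte per voxel.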

It has long been assumed that the processing of connectomics data will require mass storage, farms of CPUs/GPUs, and months (if not years) of processing time. However, we present in Chapters 5 and 6 two multicore-centric systems for connectomics that can match the terabyte-per-hour pace of modern electron microscopes while using less than 100 cores.

Connectomics pipeline (Chapter 5)

The Connectomics artifact presents a case study of the design of a large-scale image segmentation pipeline for connectomics that is designed to run on a single multicore server. By eschewing the complexities involved in the design of a distributed system, this multicore-centric pipeline was able to avoid overheads related to communication, fault tolerance, and data serialization. Perhaps more importantly, the decision to target a single multicore allowed us to focus entirely on the design and optimization of shared-memory multicore algorithms. These performance optimizations turned out to be extremely impactful, and allowed our single 72-core multicore server to process a terabyte of data in 4 hours, which outperformed the previous fastest system that used a cluster of 512 cores to process a terabyte of data in 140 hours.

The multicore-centric segmentation pipeline presented in Chapter 5 provides compelling evidence that the benefits of quality algorithms, parallelization, and careful engineering can sometimes be so vast that it is possible to solve “cluster-scale” problems on a single commodity multicore machine. The results presented in the Connectomics artifact push back against the current design trends in large-scale machine learning that emphasize the ability to scale across large clusters of CPUs and GPUs.

Alignment pipeline (Chapter 6)

The Alignment artifact presents a high-throughput image-alignment pipeline for connectomics that employs the multicore algorithms Quilter and Stacker to perform 2D and 3D alignment, respectively. As part of the optimization of this pipeline, this chapter introduces a technique for data-driven performance optimization called “frugal snap judgments” that is used to obtain more advantageous performance–accuracy trade-offs in Quilter (a conceptual sketch of this fast-path/slow-path pattern appears at the end of this subsection).

Quilter and Stacker are designed to perform 2D and 3D alignment, respectively, on petabyte-scale data sets from connectomics. They are efficient, scalable, and simple to deploy on hardware ranging from a researcher’s laptop to a large-scale computing cluster. On a single 18-core cloud machine, each algorithm achieves throughputs of more than 1 TB/hr and, when combined, they produce an end-to-end alignment pipeline that processes data at a rate of 0.82 TB/hr — an over 10x improvement over previous systems. This efficiency comes from both traditional optimizations and from the use of frugal snap judgments to judiciously exploit performance–accuracy trade-offs.

A high-throughput image-alignment pipeline was implemented and evaluated using the Quilter and Stacker algorithms. The performance was evaluated on a range of platforms including a common 18-core machine (Intel E5), a large 112-core machine (Intel Xeon Platinum), and a supercomputing cluster with 1600 cores. The pipeline achieves a throughput of 0.6–0.8 TB/hr on the 18-core machine, 1.4–1.5 TB/hr on the large 112-core machine, and 21.4 TB/hr on the supercomputing cluster with 1600 cores.

The results in Chapter 6 illustrate how carefully designed multicore software components can yield software systems that perform well on individual machines and also support simple horizontal scaling over multiple machines in a computing cluster.

Furthermore, it illustrates how higher-level performance optimization techniques, such as frugal snap judgments, can be employed to dramatically improve performance without substantially increasing system complexity. The data-driven performance optimizations employed using frugal snap judgments are conceptually simple, yet yield significant performance improvements.
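The following conceptual sketch captures the fast-path/slow-path pattern behind frugal snap judgments: attempt a cheap, lower-accuracy alignment first, accept it when a quick quality check passes, and fall back to the expensive path otherwise. The function names, types, threshold, and stub bodies are hypothetical placeholders rather than the Quilter code.

```cpp
// Conceptual fast-path/slow-path sketch (placeholders, not the Quilter code).
struct Alignment { double score = 0.0; /* transform parameters ... */ };

// Stubs standing in for the real 2D alignment routines.
Alignment align_downsampled(int tile_id)     { return {0.9};  }  // cheap, lower accuracy
Alignment align_full_resolution(int tile_id) { return {0.99}; }  // expensive, high accuracy

// "Frugal snap judgment": accept the cheap result when a quick quality check
// passes, and pay for the expensive path only when it does not.
Alignment align_tile(int tile_id, double threshold) {
  Alignment fast = align_downsampled(tile_id);
  return (fast.score >= threshold) ? fast : align_full_resolution(tile_id);
}
```

In the thesis, the criterion that decides when the cheap result is trustworthy is learned from data, which is what makes the technique data-driven.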

1.3 Beyond runtime: Tools for bounding memory usage and tail latency

This section describes this thesis’s contributions in Chapters 7–8 that develop tools and techniques for analyzing and optimizing anomalous behavior related to memory and tail latency in parallel systems. Performance anomalies that are rarely seen in a small software system can become harmfully common when deploying the system at scale. In the Reissue and Cilkmem artifacts, we focus on two performance phenomena that require consideration when developing large-scale software systems: worst-case memory usage in multicore codes and response-time tail latency in distributed request-response workflows.

Cilkmem (Chapter 7)

The Cilkmem artifact presents a tool and algorithms for measuring a parallel program’s worst-case memory high-water mark. Understanding the worst-case memory high-water mark of a parallel program is of interest to software engineers designing software that has high memory requirements or executes in memory-constrained environments. Software engineers designing such programs must be cognizant of how their program’s memory requirements scale in a many-processor execution. Although tools exist for measuring memory usage during one particular execution of a parallel program, such tools cannot bound the worst-case memory usage over all possible parallel executions.

The Cilkmem tool analyzes the execution of a deterministic Cilk program to determine its p-processor memory high-water mark (MHWM), which is the worst-case memory usage of the program over all possible p-processor executions. Cilkmem employs two new algorithms for computing the p-processor MHWM. The first algorithm calculates the exact p-processor MHWM in O(T1 · p) time, where T1 is the total work of the program. The second algorithm solves, in O(T1) time, the approximate threshold problem, which asks, for a given memory threshold M, whether the p-processor MHWM exceeds M/2 or whether it is guaranteed to be less than M. Both algorithms are memory efficient, requiring O(p · D) and O(D) space, respectively, where D is the maximum call-stack depth of the program’s execution on a single thread.

Our empirical studies show that Cilkmem generally exhibits low overheads. Across ten application benchmarks from the Cilkbench suite, the exact algorithm incurs a geometric-mean multiplicative overhead of 1.54 for p = 128, whereas the approximation-threshold algorithm incurs an overhead of 1.36 independent of p. In addition, we use Cilkmem to reveal and diagnose a previously unknown issue in a large image-alignment program contributing to unexpectedly high memory usage under parallel executions.
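The following hypothetical example (not taken from the Cilkmem paper) illustrates why a single measured run cannot bound the p-processor MHWM: whether the two allocations below ever coexist depends entirely on how the scheduler happens to interleave the two subcomputations.

```cpp
// Hypothetical example: the 1-processor high-water mark is ~1 GiB, but the
// 2-processor MHWM is ~2 GiB, even if a particular parallel run happens to
// schedule the two phases one after the other.
#include <cilk/cilk.h>
#include <cstddef>
#include <vector>

void phase(std::size_t bytes) {
  std::vector<char> buf(bytes);  // allocated on entry, freed on return
  buf[0] = 1;
  // ... do work on buf ...
}

int main() {
  cilk_spawn phase(1u << 30);    // may run concurrently with the next call
  phase(1u << 30);
  cilk_sync;
  return 0;
}
```

Cilkmem answers this question analytically from one instrumented run, rather than hoping that the worst-case interleaving happens to be observed.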

Reissue (Chapter 8)

The Reissue artifact presents a principled strategy for reducing tail latency in distributed request-response workflows by judiciously sending duplicate copies of requests. Sending duplicate requests to replicated services is a simple and commonly used strategy in distributed systems for reducing latency.

Determining the optimal strategy for sending duplicate requests, however, is challenged by the difficulty of obtaining closed-form solutions to complex problems in queueing theory. Data-driven methods are often employed, in practice, to tune a reissue policy’s parameters and find good reissuing strategies. It is not well understood, however, to what degree the parametrization of a reissue policy impacts its performance. In Chapter 8, we make progress towards understanding the relative power of different parameterized families of reissue policies within a simplified analytical model, and corroborate these conclusions with a combination of simulation and real-world experiments.

Interactive distributed services send redundant requests to multiple different replicas to meet stringent tail-latency requirements. These additional (reissue) requests mitigate the impact of nondeterministic delays within the system and thus increase the probability of receiving an on-time response. There are two existing approaches to using reissue requests to reduce tail latency. (1) Reissue requests immediately to one or more replicas, which multiplies the load and runs the risk of overloading the system. (2) Reissue requests if not completed after a fixed delay. The delay helps to bound the number of extra reissue requests, but it also reduces the chance for those requests to respond before a tail-latency target.

We introduce a new family of reissue policies, Single-Time / Random (SingleR), that reissues requests after a delay d with probability q. SingleR employs randomness to bound the reissue rate, while allowing requests to be reissued early enough to have sufficient time to respond, exploiting the benefits of both the immediate and delayed reissue approaches of prior work. We formally prove, within a simplified analytical model, that SingleR is optimal even when compared to more complex policies that reissue multiple times.

In a set of simulations and experiments on real-world systems, we show that SingleR policies are effective in practice. For example, SingleR reduces the 99th-percentile latency of Redis by 30–70% by reissuing only 2% of requests, and the 99th-percentile latency of Lucene search by 15–25% by reissuing only 1% of requests.
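The following Monte Carlo sketch illustrates the mechanics of a SingleR-style policy under a simplified model that ignores the extra load that reissued requests place on the system. The latency distribution and the parameters d and q below are made up for illustration and are not taken from the experiments in Chapter 8.

```cpp
// Monte Carlo sketch of a SingleR-style policy: if a request is still
// outstanding after delay d, reissue it with probability q and take whichever
// copy responds first. Simplified model; illustrative parameters only.
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <random>
#include <vector>

int main() {
  std::mt19937 rng(1);
  std::exponential_distribution<double> body(1.0 / 5.0);  // ~5 ms typical response
  std::bernoulli_distribution straggler(0.012);           // occasional +200 ms delay
  std::bernoulli_distribution coin(0.5);                  // reissue probability q
  const double d = 20.0;                                  // reissue delay (ms)

  auto draw = [&] { return body(rng) + (straggler(rng) ? 200.0 : 0.0); };
  auto p99 = [](std::vector<double> v) {
    std::size_t k = (std::size_t)(0.99 * v.size());
    std::nth_element(v.begin(), v.begin() + k, v.end());
    return v[k];
  };

  std::vector<double> baseline, singler;
  long reissued = 0;
  const int N = 500000;
  for (int i = 0; i < N; ++i) {
    double t = draw();
    baseline.push_back(t);
    if (t > d && coin(rng)) {          // still pending at time d: maybe reissue
      ++reissued;
      t = std::min(t, d + draw());     // take the faster of the two copies
    }
    singler.push_back(t);
  }
  std::printf("p99 without reissue:           %6.1f ms\n", p99(baseline));
  std::printf("p99 with SingleR(d=20, q=0.5): %6.1f ms\n", p99(singler));
  std::printf("fraction of requests reissued: %.2f%%\n", 100.0 * reissued / N);
  return 0;
}
```

With these made-up parameters, only a small fraction of requests are ever reissued (the delay d filters out the common fast cases), yet the 99th-percentile latency drops sharply because most stragglers are rescued by their duplicate.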

1.4 Other contributions

In addition to the principal artifacts in this thesis, Chapter 9 (the Retroactive artifact) presents a result that resolves an open problem in retroactive data structures using techniques that were inspired by, but do not directly relate to, the design of deterministic parallel data structures. Specifically, the “hierarchical checkpointing” technique employed in the Retroactive paper was inspired by work relating to the design of reducer hyperobjects [115].

Retroactive (Chapter 9)

The Retroactive artifact describes a fully retroactive priority queue data structure that has polylogarithmic overheads. Although this artifact is not directly concerned with parallel programming, the techniques used to solve this previously open problem were inspired by those used to devise efficient parallel data structures.

Since the introduction of retroactive data structures at SODA 2004, a major open question has been the difference between partial retroactivity (where updates can be made in the past) and full retroactivity (where queries can also be made in the past). In particular, for priority queues, partial retroactivity is possible in O(log m) time per operation on an m-operation timeline, but the best previously known fully retroactive priority queue has cost Θ(√m · log m) time per operation.

In Chapter 9, I address this open problem by providing a general logarithmic-overhead transformation from partial to full retroactivity called hierarchical checkpointing, provided that the given data structure is time-fusible (multiple structures with disjoint timespans can be fused into a timeline supporting queries of the present). As an application, we construct a fully retroactive priority queue which can insert an element, delete the minimum element, and find the minimum element, at any point in time, in O(log² m) amortized time per update and O(log² m · log log m) time per query, using O(m log m) space.

1.5 Overview

The overall goal of this thesis is to develop programming technologies that facilitate the development of quality multicore code — code that has simple understandable structure and performs well in practice. These programming technologies are well suited to a broad set of programmers since they are simple to use, have theoretical guarantees, and perform well in practice. This section explains why I believe it is important to (a) improve the programmability of shared-memory multicores and (b) emphasize technologies that enable programmers to write quality code. The section ends with a concise outline of the remaining chapters of this thesis, which are adapted from my published articles.

Scope

The primary scope of this thesis lies in the realm of shared-memory multicore programming. Six of the seven principal artifacts in this thesis directly concern algorithms, data structures, and software systems for shared-memory multicore machines. The Reissue artifact is not directly concerned with multicore programming, but relates to problems arising in large-scale distributed systems.

The parallel algorithms described in this thesis are all designed using fork-join parallelism and analyzed in the dag model of multithreading [40, 41] using work-span analysis [77, Ch. 27]. The requisite background information relating to the parallel linguistics and runtime is provided within each chapter of the thesis.
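For reference, the standard facts from work-span analysis used throughout the thesis (see [77, Ch. 27]) are the work law, the span law, and the greedy-scheduler bound: for a computation with work T1 and span T∞ run on P processors,

```latex
\[
  T_P \;\ge\; \frac{T_1}{P}, \qquad
  T_P \;\ge\; T_\infty, \qquad
  T_P \;\le\; \frac{T_1}{P} + T_\infty,
\]
```

so the parallelism T1/T∞ upper-bounds the speedup T1/TP achievable on any number of processors.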

Distributed computing, GPUs, and specialized hardware

Parallel computing can be performed using platforms other than shared-memory multicores, including distributed systems, GPUs, FPGAs, and other specialized hardware. Distributed computing is especially helpful when tackling problems that do not fit in memory on a single multicore, but distributed systems encounter overheads relating to fault tolerance and the communication of intermediate results. GPUs, FPGAs, and other hardware accelerators (e.g., TPUs) tend to be more memory-constrained than multicores, are substantially more complex to program, and are generally suited to a more narrow range of problems.

Within this thesis, the benefits of multicore-centric computing relative to GPUs and distributed computing are illustrated in the Connectomics and Alignment artifacts (Chapters 5–6). Chapter 5 (the Connectomics artifact) provides an extreme case in which a single multicore with superiorly optimized software outperformed much larger systems that used distributed computing and GPUs. Chapters 5–6 provide multiple exhibits that illustrate the power of properly optimized multicore software, but let us examine just two illustrative examples.

Method     Type             8-channel MaxoutNet     32-channel MaxoutNet
                            throughput (MB/s)       throughput (MB/s)
cpuXNN     CPU (72-core)    111.1                   16.67
gpuZNN     GPU (Titan X)    67.61                   25.28
gpuNeon    GPU (Titan X)    37.06                   (exceeds memory)

Figure 1-2: A comparison of CNN throughput for the best-performing CPU and GPU-based implementations using the MaxoutNet architecture.

2D Alignment
Method                     Runtime        Throughput
FijiBento                  362 minutes    0.091 TB/hr
Quilter Full Resolution    180 minutes    0.18 TB/hr
Quilter FSJ(20,100)        32 minutes     1.03 TB/hr

Figure 1-3: Performance comparison of FijiBento and Quilter for 2D Alignment on the Common Multicore platform, an 18-core AWS C4 instance.

The Connectomics artifact (Chapter 5) includes a direct performance comparison of CPU and GPU implementations of a CNN architecture for membrane detection. The implementations that are compared include a 72-core multicore implementation, a custom GPU implementation, and a GPU implementation that uses existing software libraries for executing CNNs. Figure 1-2 compares GPU and multicore implementations of the MaxoutNet CNN architecture that is used in the Connectomics artifact. The gpuNeon implementation used existing GPU libraries for computing CNNs and underperformed our 72-core multicore implementation by 3x on the 8-channel MaxoutNet architecture. Worse still, gpuNeon failed to execute the 32-channel network due to memory constraints. The gpuZNN provides an optimized GPU implementation that outperforms the multicore implementation by 1.5x on the 32-channel architecture, but is 1.6x worse on the 8-channel architecture, which is what the software pipeline in the Connectomics artifact actually uses. This example demonstrates that shared-memory multicore code is not necessarily substantially slower than specialized hardware. In fact, properly optimized multicore code can outperform GPUs in certain cases, such as in the 8-channel architecture, where the low amount of compute performed per byte causes data transfers to the GPU to bottleneck performance.

The Alignment artifact (Chapter 6) includes a comparison between a distributed and a shared-memory approach for solving the 2D alignment problem in connectomics. The overheads of distributed computing are illustrated in Figure 1-3 from the Alignment artifact. The FijiBento software system employs distributed computing by partitioning the task of 2D-aligning an image into many fine-grained tasks, whereas Quilter parallelizes the computation in-memory. Even when FijiBento runs on a single 18-core machine, precluding the need to communicate intermediate results over the network, it performs 2x slower than Quilter due to overheads related to the serialization of intermediate results from its fine-grained tasks. Furthermore, the flexibility of the shared-memory programming model allows us to implement optimizations relating to performance–accuracy trade-offs in Quilter FSJ(20,100) that improve performance by an additional 5x. Such performance–accuracy trade-offs are not as valuable in the distributed setting, since it is more difficult to implement fine-grained dynamic control flow, and the overheads of data serialization reduce the relative value of optimizing the application-specific, as opposed to compute-framework, logic.

For the reasons outlined here, I believe that, at present, shared-memory multicores strike the best balance between general programmability and performance. This belief motivates this thesis's focus on developing programming technologies that reduce the complexity of parallel programming on shared-memory multicores.

The value of quality multicore code

Why is it important for code and programming technology to have a simple understandable structure? I believe that the reason goes beyond simple aesthetics. I contend that the simple understandable structure of quality code has tangible value: quality code can more readily incorporate refined programming patterns from the past and can be understood across disciplines.

Techniques for developing parallel code with simple understandable structure and good performance in practice broaden the set of algorithms that can be efficiently brought into the modern era. A few examples can be observed in the artifacts of this thesis. The Chromatic artifact was used to parallelize a provably good three-dimensional thinning algorithm developed in 1995 [25]. This parallel implementation was directly used in the Connectomics artifact to perform skeletonization of neural volumes while provably preserving their topology. The Color artifact designed ordering heuristics for parallel graph coloring that were directly inspired by serial heuristics developed between 1967 and 1995 [341, 75, 246, 4, 48]. The Color artifact evaluates the theoretical parallelism in these heuristics in a principled fashion and shows how many of them can be coarsened to be theoretically scalable while achieving good coloring quality in practice. The PARAD artifact provides a work-efficient algorithm for parallel automatic differentiation that extends the simple serial automatic-differentiation algorithms known since the 1960s [342, 230, 324].

Quality code with simple understandable structure can also be more readily understood and adapted by practitioners in other scientific or industrial fields. Performance–accuracy trade-offs are exploited in the Alignment artifact to achieve a 5x performance improvement using relatively simple machine-learning techniques that learn criteria for switching between “fast” and “slow” code paths. The Reissue artifact shows how to analyze reissue policies for distributed request–response workflows in a simplified analytical model. The Cilkmem artifact shows how to analyze the worst-case memory usage of a fork-join parallel program, which enables practitioners using shared computing clusters to correctly reason about the amount of memory they must reserve for their multicore task.

Summary

The goal of this thesis is to make it easier to develop parallel software systems that have a simple understandable structure and achieve good performance in practice. My approach in this thesis has mostly focused on the development of programming technologies that improve the simplicity and performance of parallel computing in the shared-memory multicore setting. Although there are other platforms in which parallelism can be expressed, such as distributed computing and GPUs, I believe there is compelling evidence that the flexible programming environment provided by general-purpose multicores makes them well-suited to many computational problems in science and industry. I have also discussed ways in which the simple understandable structure of quality code has tangible value by enabling it to better leverage and incorporate algorithmic insights developed in sequential programming models.

Chapter 2

Executing Dynamic Data-Graph Computations Deterministically Using Chromatic Scheduling

This chapter presents the Prism and Prism-R algorithms for executing dynamic data-graph computations deterministically using chromatic scheduling. It also presents the Multibag and Multivector data structures, which allow Prism and Prism-R to maintain active sets of vertices in a work-efficient manner while preserving serial semantics. This work was conducted in collaboration with William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson.

Abstract

A data-graph computation — popularized by such programming systems as Galois, Pregel, GraphLab, PowerGraph, and GraphChi — is an algorithm that performs local updates on the vertices of a graph. During each round of a data-graph computation, an update function atomically modifies the data associated with a vertex as a function of the vertex’s prior data and that of adjacent vertices. A dynamic data-graph computation updates only an active subset of the vertices during a round, and those updates determine the set of active vertices for the next round.

This chapter introduces Prism, a chromatic-scheduling algorithm for executing dynamic data-graph computations. Prism uses a vertex-coloring of the graph to coordinate updates performed in a round, precluding the need for mutual-exclusion locks or other nondeterministic data synchronization. A multibag data structure is used by Prism to maintain a dynamic set of active vertices as an unordered set partitioned by color.

We analyze Prism using work-span analysis. Let G = (V, E) be a degree-∆ graph colored with χ colors, and suppose that Q ⊆ V is the set of active vertices in a round. Define size(Q) = |Q| + Σ_{v∈Q} deg(v), which is proportional to the space required to store the vertices of Q using a sparse-graph layout. We show that a P-processor execution of Prism performs the updates in Q using O(χ(lg(Q/χ) + lg ∆) + lg P) span and Θ(size(Q) + P) work. These theoretical guarantees are matched by good empirical performance. In order to isolate the effect of the scheduling algorithm on performance, we modified GraphLab to incorporate Prism and studied seven application benchmarks on a 12-core multicore machine. Prism executes the benchmarks 1.2–2.1 times faster than GraphLab’s nondeterministic

lock-based scheduler while providing deterministic behavior.

This chapter also presents Prism-R, a variation of Prism that executes dynamic data-graph computations deterministically even when updates modify global variables with associative operations. Prism-R satisfies the same theoretical bounds as Prism, but its implementation is more involved, incorporating a multivector data structure to maintain a deterministically ordered set of vertices partitioned by color. Despite its additional complexity, Prism-R is only marginally slower than Prism. On the seven application benchmarks studied, Prism-R incurs a 7% geometric-mean overhead relative to Prism.

2.1 Introduction

Many systems from physics, artificial intelligence, and scientific computing can be represented naturally as a data graph — a graph with data associated with its vertices and edges. For example, some physical systems can be decomposed into a finite number of elements whose interactions induce a graph. Probabilistic graphical models in artificial intelligence can be used to represent the dependency structure of a set of random variables. Sparse matrices can be interpreted as graphs for scientific computing.

A data-graph computation is an algorithm that performs “local” updates on the vertices of a data graph, taking as input data associated with a vertex and its neighbors. Several software systems have been implemented to support parallel data-graph computations, including GraphLab [234, 233], Pregel [242], Galois [267, 268], PowerGraph [134], Ligra¹ [318, 321], and GraphChi [203]. These systems can support “complex” data-graph computations, in which data can be associated with edges as well as vertices and updating a vertex v can modify any data associated with v, v’s incident edges, and the vertices adjacent to v. For ease in discussing chromatic scheduling, however, we shall principally restrict ourselves to “simple” data-graph computations (which correspond to “edge-consistent” computations in GraphLab), although most of our results straightforwardly extend to more complex models. Indeed, six out of the seven GraphLab applications described in [234, 233] are simple data-graph computations.

Updates to vertices proceed in rounds, where each vertex can be updated at most once per round. In a static data-graph computation, the activation set Qr of vertices updated in a round r — the set of active vertices — is determined a priori. Often, a static data-graph computation updates every vertex in each round. Static data-graph computations include Gibbs sampling [124, 123], iterative graph coloring [81], and n-body problems such as the fluidanimate PARSEC benchmark [27]. We shall be interested in dynamic data-graph computations, where the activation set changes round by round. Dynamic data-graph computations include the PageRank algorithm [51], loopy belief propagation [263, 282], coordinate descent [90], co-EM [269], alternating least-squares [153], singular-value decomposition [133], and matrix factorization [333].

We formalize the computational model as follows. Let G = (V, E) be a data graph. Denote the neighbors, or adjacent vertices, of a vertex v ∈ V by Adj[v] = {u ∈ V : (u, v) ∈ E}. The degree of v is thus deg(v) = |Adj[v]|, and the degree of G is deg(G) = max{deg(v) : v ∈ V}.

¹ While Ligra does not technically execute data-graph computations, it is designed to implement similar algorithms by decoupling the scheduling and algorithm-specific code, as with the other data-graph computation frameworks.

A (simple) dynamic data-graph computation is a triple ⟨G, f, Q0⟩, where
• G = (V, E) is an undirected graph with data associated with each vertex v ∈ V;
• f : V → 2^V is an update function; and
• Q0 ⊆ V is the initial activation set.
The update S ← f(v) implicitly computes, as a side effect, a new value for the data associated with v as a function of the old data associated with v and v’s neighbors. The update returns a set S ⊆ Adj[v] of vertices that must be updated in the next round. For example, an update f(v) might activate a neighbor u only if the value of v changes significantly. During a round r of a dynamic data-graph computation, each vertex v ∈ Qr is updated at most once; that is, Qr is a set, not a multiset.

The advantage of dynamic over static data-graph computations is that they avoid performing many unnecessary updates. Studies in the literature [234, 233] show that dynamic execution can enhance the practical performance of many applications. We confirmed these findings by implementing static and dynamic versions of several data-graph computations. The results for a PageRank algorithm on a power-law graph of 1 million vertices and 10 million edges were typical: the static computation performed approximately 15 million updates, whereas the dynamic version performed less than half that number.
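To make this model concrete, the following C++ sketch (illustrative only, not code from this thesis) shows one plausible shape for an update function f in a PageRank-style computation. The CSR-style Graph type, the pagerank_update name, and the damping and tol parameters are assumptions made for the example.

#include <cstddef>
#include <vector>

// A data graph in CSR form with one double of data per vertex.
struct Graph {
    std::vector<std::size_t> offsets;    // |V| + 1 offsets into `neighbors`
    std::vector<std::size_t> neighbors;  // adjacency lists, concatenated
    std::vector<double> value;           // per-vertex data
};

// An update f(v): recompute v's value from its neighbors' data and return
// the set S ⊆ Adj[v] of neighbors to activate for the next round.
std::vector<std::size_t> pagerank_update(Graph& g, std::size_t v,
                                         double damping = 0.85,
                                         double tol = 1e-4) {
    double sum = 0.0;
    for (std::size_t i = g.offsets[v]; i < g.offsets[v + 1]; ++i) {
        std::size_t u = g.neighbors[i];
        std::size_t deg_u = g.offsets[u + 1] - g.offsets[u];
        sum += g.value[u] / static_cast<double>(deg_u);
    }
    double new_value = (1.0 - damping) + damping * sum;
    double change = new_value - g.value[v];
    g.value[v] = new_value;

    std::vector<std::size_t> activated;
    if (change > tol || change < -tol) {    // activate neighbors only when
        for (std::size_t i = g.offsets[v]; i < g.offsets[v + 1]; ++i)
            activated.push_back(g.neighbors[i]);   // v changed significantly
    }
    return activated;
}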

A serial reference implementation

Before we address the issues involved in scheduling and executing dynamic data-graph computations in parallel, let us first hone our intuition with a serial implementation. Figure 2-1 gives the pseudocode for Serial-DDGC. This algorithm schedules the updates of a data-graph computation by maintaining a FIFO queue Q of activated vertices that have yet to be updated. Sentinel values enqueued in Q on lines 4 and 9 demarcate the rounds of the computation, such that the set of vertices in Q after the rth sentinel has been enqueued is the activation set Qr for round r.

Given a data graph G = (V, E), an update function f, and an initial activation set Q0, Serial-DDGC executes the data-graph computation ⟨G, f, Q0⟩ as follows. Lines 1–2 initialize Q to contain all vertices in Q0. The while loop on lines 5–14 then repeatedly dequeues the next scheduled vertex v ∈ Q on line 6 and executes the update f(v) on line 11. Executing f(v) produces a set S of activated vertices, and lines 12–14 check each vertex in S for membership in Q, enqueuing all vertices in S that are not already in Q.

We can analyze the time Serial-DDGC takes to execute one round r of the data-graph computation ⟨G, f, Q0⟩. Define the size of an activation set Qr as

    size(Qr) = |Qr| + Σ_{v∈Qr} deg(v) .

The size of Qr is asymptotically the space needed to store all the vertices in Qr and their incident edges using a standard sparse-graph representation, such as the compressed-sparse-rows (CSR) format [328]. For example, if Q0 = V, we have size(Q0) = |V| + 2|E| by the handshaking lemma [77, p. 1172–3]. Let us make the reasonable assumption that the time to execute f(v) serially is proportional to deg(v). If we implement the queue as a dynamic (resizable) table [77, Section 17.4], then line 14 executes in Θ(1) amortized time. Of course, a linked list would also support appends in Θ(1) time, but it would not allow for convenient subsequent parallel iteration over its elements. All other operations in the for loop on lines 12–14 take Θ(1) time, and thus all vertices activated by executing f(v) are examined in Θ(deg(v)) time.

Serial-DDGC(G, f, Q0)

 1  for v ∈ Q0
 2      enqueue(Q, v)
 3  r ← 0
 4  enqueue(Q, nil)        // Sentinel nil denotes the end of a round.
 5  while Q ≠ {nil}
 6      v ← dequeue(Q)
 7      if v == nil
 8          r += 1
 9          enqueue(Q, nil)
10      else
11          S ← f(v)
12          for u ∈ S
13              if u ∉ Q
14                  enqueue(Q, u)

Figure 2-1: Pseudocode for a serial algorithm to execute a data-graph computation ⟨G, f, Q0⟩. Serial-DDGC takes as input a data graph G and an update function f. The computation maintains a FIFO queue Q of activated vertices that have yet to be updated and sentinel values nil, each of which demarcates the end of a round. An update S ← f(v) returns the set S ⊆ Adj[v] of vertices activated by that update. Each vertex u ∈ S is added to Q if it is not currently scheduled for a future update.

The total time spent updating the vertices in Qr is therefore Θ(|Qr| + Σ_{v∈Qr} deg(v)) = Θ(size(Qr)), which is linear time: time proportional to the storage requirements for the vertices in Qr and their incident edges.

Parallelizing dynamic data-graph computations

The salient challenge in parallelizing data-graph computations is to deal effectively with races between updates, that is, logically parallel updates that read and write common data. A determinacy race [105] (also called a general race [266]) occurs when two logically parallel instructions access the same memory location and at least one of them writes to that location. Two updates in a data-graph computation conflict if executing them in parallel produces a determinacy race. A parallel scheduler must manage or avoid conflicting updates to execute a data-graph computation correctly and deterministically.

The standard approach to preventing races associates a mutual-exclusion lock with each vertex of the data graph to ensure that an update on a vertex v does not proceed until all locks on v and v’s neighbors have been acquired. Although this locking strategy prevents races, it can incur substantial overhead from lock acquisition and contention, hurting application performance, especially when update functions are simple. Moreover, because runtime happenstance can determine the order in which two logically parallel updates acquire locks, the data-graph computation can act nondeterministically: different runs on the same inputs can produce different results. Without repeatability, parallel programming is arguably much harder [211, 43], and nondeterminism confounds debugging.

A known alternative to using locks is chromatic scheduling [26, 1, 233], which schedules a data-graph computation based on a coloring of the data-graph computation’s conflict graph — a graph with an edge between two vertices if updating them in parallel would produce a race. For a simple data-graph computation, the conflict graph is simply the data graph itself with undirected edges.

Benchmark   |V|         |E|         χ     RRLocks   Cilk+Locks   Prism   Prism-R
PR/G        916,428     5,105,040   43    15.5      14.3         9.7     12.6
PR/L        4,847,570   68,475,400  333   227.6     200.4        109.3   127.3
ID/2000     4,000,000   15,992,000  4     48.6      43.8         32.1    32.8
ID/4000     16,000,000  63,984,000  4     200.0     179.6        123.1   124.3
FBP/C1      87,831      265,204     2     8.7       8.9          6.9     7.0
FBP/C3      482,920     160,019     2     16.4      17.8         13.3    13.4
ALS/N       187,722     20,597,300  6     134.3     123.6        105.2   105.7

Figure 2-2: Comparison of dynamic data-graph schedulers on seven application benchmarks. All runtimes are in seconds and were calculated by taking the median 12-core execution time of 5 runs on an Intel Xeon X5650 with hyperthreading disabled. The runtimes of Prism and Prism-R include the time used to color the input graph. PR/G and PR/L run a PageRank algorithm on the web-Google [223] and soc-LiveJournal [14] graphs, respectively. ID/2000 and ID/4000 run an image denoise algorithm to remove Gaussian noise from 2D grayscale images of dimension 2000 by 2000 and 4000 by 4000. FBP/C1 and FBP/C3 perform belief propagation on a factor graph provided by the cora-1 and cora-3 datasets [322, 248]. ALS/N runs an alternating least squares algorithm on the NPIC-500 dataset [259].

The idea behind chromatic scheduling is fairly simple. Chromatic scheduling begins by computing a (vertex) coloring of the conflict graph — an assignment of colors to the vertices such that no two adjacent vertices share the same color. Since no edge in the conflict graph connects two vertices of the same color, updates on all vertices of a given color can execute in parallel without producing races. To execute a round of a data-graph computation, the set of activated vertices Q is partitioned into χ color sets — subsets of Q containing vertices of a single color. Updates are applied to the vertices in Q by serially stepping through the color sets and updating all vertices within a color set in parallel. Indeed, the special case in which the active set Q == V is the entire graph (i.e., a static data-graph computation) can already be executed with chromatic scheduling using Distributed GraphLab [233]. The result of a data-graph computation executed using chromatic scheduling is equivalent to that of a slightly modified version of Serial-DDGC that starts each round (immediately before line 9 of Figure 2-1) by sorting the vertices within its queue by color.
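The following C++ sketch (not the thesis implementation) illustrates one round of chromatic scheduling under the stated assumptions: partition the active vertices by color, then step through the colors serially while updating all vertices of one color in parallel. It requires a Cilk-enabled compiler (e.g., OpenCilk); the UpdateFn type is an assumption of the example, and gathering the next activation set — the job of the multibag in Section 2.4 — is omitted here.

#include <cilk/cilk.h>
#include <cstddef>
#include <functional>
#include <vector>

using Vertex = std::size_t;
using UpdateFn = std::function<void(Vertex)>;

void run_round(const std::vector<Vertex>& active,
               const std::vector<int>& color,   // color[v] in [0, num_colors)
               int num_colors,
               const UpdateFn& f) {
    // Partition the activation set into per-color sets.
    std::vector<std::vector<Vertex>> color_set(num_colors);
    for (Vertex v : active) color_set[color[v]].push_back(v);

    for (int k = 0; k < num_colors; ++k) {           // colors in series
        const std::vector<Vertex>& ck = color_set[k];
        cilk_for (std::size_t i = 0; i < ck.size(); ++i) {
            f(ck[i]);  // same-colored vertices are nonadjacent, so these
                       // updates cannot race on shared vertex data
        }
    }
}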

Chromatic scheduling avoids both of the pitfalls of the locking strategy. First, since only nonadjacent vertices in the conflict graph are updated in parallel, no races can occur, and the need for locks and their associated performance overheads is precluded. Second, by establishing a fixed order for processing the colors, any two adjacent vertices are always processed in the same order. The data-graph computation therefore executes deterministically, as long as a deterministic coloring algorithm is used to color the conflict graph. Although chromatic scheduling potentially loses parallelism because colors are processed serially, we shall see that this concern does not appear to be an issue in practice.

To date, chromatic scheduling has been applied to static data-graph computations [233], but not to dynamic data-graph computations. This chapter addresses the question of how to perform chromatic scheduling efficiently when the activation set changes on the fly, necessitating a data structure for maintaining dynamic sets of vertices in parallel.

Contributions

This chapter introduces Prism, a chromatic-scheduling algorithm that executes dynamic data-graph computations in parallel efficiently and deterministically. Prism employs a “multibag” data structure to manage an activation set as a list of color sets. The multibag achieves efficiency using “worker-local storage,” which is memory locally associated with each “worker” thread executing the computation. By using the multibag and a deterministic coloring algorithm, Prism guarantees that it executes the data-graph computation deterministically.

We analyze the performance of Prism using work-span analysis [77, Ch. 27]. The work of a computation is the total number of instructions executed, and the span corresponds to the longest path of dependencies in the parallel program. We shall make the reasonable assumption that a single update f(v) executes in Θ(deg(v)) work and Θ(lg(deg(v))) span.² Under this assumption, on a degree-∆ data graph G colored using χ colors, Prism executes the updates on the vertices in the activation set Qr of a round r on P processors in O(size(Qr) + P) work and O(χ(lg(Qr/χ) + lg ∆) + lg P) span.

The “price of determinism” incurred by using chromatic scheduling instead of the more common locking strategy appears to be negative for real-world applications. This discovery is perhaps surprising, since it would seem to be strictly harder to guarantee that the computation behaves deterministically than to allow for nondeterministic behaviors. Nevertheless, as Figure 2-2 indicates, on seven application benchmarks, Prism executes 1.2–2.1 times faster than GraphLab’s comparable, but nondeterministic, locking strategy, which we call RRLocks. This performance gap is not due solely to superior engineering or load balancing: a similar performance overhead is observed in a comparably engineered lock-based scheduling algorithm, Cilk+Locks. Prism outperforms Cilk+Locks on each of the 7 application benchmarks and is on average (geometric mean) 1.4 times faster.

Our contribution is not a full-featured data-graph computation framework like GraphLab, Pregel, Galois, PowerGraph, Ligra, or GraphChi. Each of these systems is the result of countless hours of performance engineering and feature support. Instead, we provide a scheduling technique that could be adopted by any such framework to enable the deterministic execution of work-efficient, dynamic data-graph computations, which no existing framework currently supports.³ We use a modified shared-memory version of GraphLab in order to isolate the effect of our scheduling algorithms. Thus, the empirical comparisons in this chapter are apples-to-apples comparisons of scheduling strategies, not competitive comparisons with other systems.

Prism behaves deterministically as long as every update is pure: it modifies no data except for that associated with its target vertex. This assumption precludes the update function from modifying global variables to aggregate or collect values. To support this common use pattern, we describe an extension to Prism, called Prism-R, which executes dynamic data-graph computations deterministically even when updates modify global variables using associative operations. Prism-R replaces each multibag Prism uses with a “multivector,” maintaining color sets whose contents are ordered deterministically. Prism-R executes in the same theoretical bounds as Prism, but its implementation is more involved.

² Other assumptions about the work and span of an update can easily be made at the potential expense of complicating the analysis.
³ Deterministic Galois [268] has added support for deterministic execution of dynamic data-graph computations by recursively removing and executing independent sets of vertices. However, their algorithm is not work-efficient and, as a result, is much slower than the nondeterministic version.

Empirically, Prism-R is on average (geometric mean) only 1.07 times slower than Prism and outperforms Cilk+Locks on each of the seven application benchmarks.

Outline

The remainder of this chapter is organized as follows. Section 2.2 reviews dynamic multithreading, the parallel programming model in which we describe and analyze our algorithms. Section 2.3 describes Prism, the chromatic-scheduling algorithm for dynamic data-graph computations. Section 2.4 describes the multibag data structure Prism uses to represent its color sets. Section 2.5 presents our theoretical analysis of Prism. Section 2.6 describes a Cilk Plus [170] implementation of Prism and presents empirical results measuring this implementation’s performance on seven application benchmarks. Section 2.7 describes Prism-R, which executes dynamic data-graph computations deterministically even when update functions modify global variables using associative operations. Section 2.8 describes and analyzes the multivector data structure Prism-R uses to represent deterministically ordered color sets. Section 2.9 analyzes Prism-R both theoretically, using work-span analysis, and empirically. Section 2.10 offers some concluding remarks.

2.2 Background

We implemented the Prism algorithm in Cilk Plus [170], a dynamic multithreading concurrency platform. This section provides background on the dag model of multithreading that embodies this and other similar concurrency platforms, including MIT Cilk [116], Cilk++ [219], Fortress [3], Habanero [16, 61], Hood [42], the Java Fork/Join Framework [208], the Task Parallel Library (TPL) [218], Threading Building Blocks (TBB) [291], and X10 [68]. We review the dag model of multithreading, the notions of work and span, and the basic properties of the work-stealing runtime systems underlying these concurrency platforms. We also briefly discuss worker-local storage, which Prism’s multibag data structure uses to achieve efficiency.

The dag model of multithreading

The dag model of multithreading [40, 41] is described in tutorial fashion in [77, Ch. 27]. The model views the executed computation resulting from running a parallel program as a computation dag in which each vertex denotes an instruction and edges denote parallel control dependencies between instructions. To analyze the theoretical performance of a multithreaded program, such as Prism, we assume that the program executes on an ideal parallel computer, where each instruction executes in unit time, the computer has ample bandwidth to shared memory, and concurrent reads and writes incur no overheads due to contention.

We shall assume that algorithms for the dag model are expressed using the keywords [77, Ch. 27] spawn, sync, and parallel for. The keyword spawn preceding a function call F allows F to execute in parallel with its continuation — the statement immediately after the spawn of F. The complement of spawn is the keyword sync, which acts as a local barrier and prevents statements after the sync from executing until all earlier spawned functions return. These keywords can be used to implement other convenient parallel control constructs, such as the parallel for loop, which allows all of its iterations to operate logically in parallel. The work of a parallel for loop with n iterations is the total number of instructions in all executed iterations.

The span is Θ(lg n) plus the maximum span of any loop iteration. The Θ(lg n) span term comes from the fact that the runtime system executes the loop iterations using parallel divide-and-conquer, and thus fans out the iterations as a balanced binary tree in the dag.

An important property of this model is the notion of the serial elision of a program. The serial elision is the serial program that results when the keywords spawn and sync are elided and each parallel for is replaced by an ordinary for loop. The model guarantees that the serial elision of a program always provides a correct implementation of the program. That is, the keywords indicate opportunities for parallelism, but they do not require parallel execution. In this sense, every program in this model has a serial semantics.
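For instance, the following small Cilk fragment (illustrative only, not code from this thesis) expresses the keywords above in Cilk Plus/OpenCilk syntax, where spawn, sync, and parallel for are spelled cilk_spawn, cilk_sync, and cilk_for. Eliding the keywords yields an ordinary serial program, which is exactly the serial elision discussed above.

#include <cilk/cilk.h>
#include <cstddef>
#include <cstdint>
#include <vector>

std::int64_t fib(std::int64_t n) {
    if (n < 2) return n;
    std::int64_t x = cilk_spawn fib(n - 1);  // may run in parallel with...
    std::int64_t y = fib(n - 2);             // ...its continuation
    cilk_sync;                               // wait for the spawned call
    return x + y;
}

void square_all(std::vector<double>& a) {
    cilk_for (std::size_t i = 0; i < a.size(); ++i)  // logically parallel
        a[i] = a[i] * a[i];                          // iterations
}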

Work-span analysis

Given a multithreaded program whose execution is modeled as a dag A, we can bound the P-processor running time TP(A) of the program using work-span analysis [77, Ch. 27]. Recall that the work T1(A) is the number of instructions in A, and that the span T∞(A) is the length of a longest path in A. Greedy schedulers [49, 99, 137] can execute a deterministic program with work T1 and span T∞ on P processors in time TP satisfying

    max{T1/P, T∞} ≤ TP ≤ T1/P + T∞ ,    (2.1)

and a similar bound can be achieved by more practical “work-stealing” schedulers [40, 41]. The speedup of an algorithm on P processors is T1/TP, which Inequality (2.1) shows to be at most P in theory. The parallelism T1/T∞ is the greatest theoretical speedup possible for any number of processors.
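As a hypothetical numerical illustration (the figures are invented for this example, not measurements from this thesis), consider a computation with work T1 = 10^9 instructions and span T∞ = 10^5 instructions run on P = 12 processors. Inequality (2.1) then gives

\[
  \max\{T_1/P,\; T_\infty\} \;\le\; T_{12} \;\le\; T_1/P + T_\infty
  \qquad\Longrightarrow\qquad
  8.3\times 10^{7} \;\le\; T_{12} \;\le\; 8.3\times 10^{7} + 10^{5} ,
\]

so the 12-processor running time is within about 0.12% of the ideal T1/P, and the parallelism T1/T∞ = 10^4 far exceeds the number of processors.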

Work-stealing runtime systems

Runtime systems underlying concurrency platforms that support the dag model of multithreading usually implement a work-stealing scheduler [57, 142, 41], which operates as follows. When the runtime system starts up, it allocates as many operating-system threads, called workers, as there are processors. Each worker keeps a ready queue of tasks that can operate in parallel with the task it is currently executing. Whenever the execution of code generates parallel work, the worker puts the excess work into its queue, and whenever it needs work, it fetches work from its queue. When a worker’s ready queue runs out of tasks, however, the worker becomes a thief and “steals” work from another, victim, worker’s queue. If an application exhibits sufficient parallelism compared to the actual number of workers/processors, one can prove mathematically that the computation executes with linear speedup.

Worker-local storage

We refer to memory that is private to a particular worker thread as worker-local storage. In a P-processor execution of a parallel program, a worker-local variable x can be implemented using a shared-memory array of length P. A worker accesses its local copy of x using a runtime-provided worker identifier to index the array of worker-local copies of x. The Cilk Plus runtime system, for example, provides the __cilkrts_get_worker_number() API call, which returns an integer identifying the current worker.

Our implementation of Prism assumes the existence of a runtime-provided Get-Worker-ID function that executes in Θ(1) time and returns an integer from 0 to P − 1. Other strategies for implementing worker-local storage exist that are comparable to the strategy outlined here.
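The following C++ sketch (an illustration under the stated assumptions, not the thesis implementation) realizes worker-local storage as an array of P padded slots indexed by the Cilk Plus worker identifier. The WorkerLocal name and the cache-line padding are choices made for this example; __cilkrts_get_worker_number() and __cilkrts_get_nworkers() are the Cilk Plus API calls mentioned above.

#include <cilk/cilk_api.h>
#include <vector>

template <typename T>
class WorkerLocal {
public:
    WorkerLocal() : slots_(__cilkrts_get_nworkers()) {}

    // Return the calling worker's private copy in O(1) time.
    T& local() { return slots_[__cilkrts_get_worker_number()].value; }

private:
    struct alignas(64) Padded { T value{}; };  // one cache line per worker,
    std::vector<Padded> slots_;                // avoiding false sharing
};

// Usage: each worker accumulates into its own counter without locking.
//   WorkerLocal<long> counter;  ...  counter.local() += 1;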

2.3 The Prism algorithm

This section presents Prism, a chromatic-scheduling algorithm for executing dynamic data-graph computations deterministically. We describe how Prism differs from the serial algorithm in Section 2.1, including how it maintains activation sets that are partitioned by color using the multibag data structure.

Figure 2-3 shows the pseudocode for Prism, which differs from the Serial-DDGC routine of Figure 2-1 in two main ways: the use of a multibag data structure to implement Q, and the call to Color-Graph on line 1 to color the data graph. A multibag Q represents a list ⟨C0, C1, ..., Cχ−1⟩ of χ bags (unordered multisets) and supports two operations:

• MB-Insert(Q, v, k) inserts an item v into bag Ck of Q. A multibag supports parallel MB-Insert operations.
• MB-Collect(Q) produces a collection C that represents a list of the nonempty bags in Q, emptying Q in the process.
Although the multibag data structure supports duplicate items in a single bag, our implementation of Prism actually ensures that no duplicate vertices are ever inserted into a bag.

Prism calls Color-Graph on line 1 to color the given data graph G = (V, E) and obtain the number χ of colors used. Although it is NP-complete to find an optimal coloring of a graph [119] — a coloring that uses the smallest possible number of colors — an optimal coloring is not necessary for Prism to perform well, as long as the data graph is colored deterministically, in parallel,⁴ and with sufficiently few colors in practice. Many parallel coloring algorithms exist that satisfy the needs of Prism (see, for example, [5, 229, 179, 131, 146, 132, 330, 201, 200, 15]); our implementation of Prism uses a multicore variant of the Jones and Plassmann algorithm [179] that produces a deterministic (∆ + 1)-coloring of a degree-∆ graph G = (V, E) in linear work and O(lg V + lg ∆ · min{√E, ∆ + lg ∆ lg V/lg lg V}) span [146].

Let us now see how Prism uses chromatic scheduling to execute a dynamic data-graph computation ⟨G, f, Q0⟩. After line 1 colors G, line 3 initializes the multibag Q with the initial activation set Q0, and then the while loop on lines 4–13 executes the rounds of the data-graph computation. At the start of each round, line 5 collects the nonempty bags C from Q, which correspond to the nonempty color sets for the round. Lines 6–12 iterate through the color sets C ∈ C sequentially, and the parallel for loop on lines 7–12 processes the vertices of each C in parallel. For each vertex v ∈ C, line 9 performs the update S ← f(v), which returns a set S of activated vertices, and lines 10–12 insert into Q the vertices in S that have been newly activated.

Although a vertex u can be activated by multiple neighbors, it must be updated at most once during a round.

⁴ If the data-graph computation performs sufficiently many updates, a serial Θ(V + E)-work greedy coloring algorithm, such as that introduced by Welsh and Powell [341], can suffice as well, since the time to color the graph would be sufficiently amortized against the work performed.

Prism(G, f, Q0)
 1  χ ← Color-Graph(G)
 2  r ← 0
 3  Q ← Q0
 4  while Q ≠ ∅
 5      C ← MB-Collect(Q)
 6      for C ∈ C
 7          parallel for v ∈ C
 8              active[v] ← false
 9              S ← f(v)
10              parallel for u ∈ S
11                  if CAS(active[u], false, true)
12                      MB-Insert(Q, u, color[u])
13      r ← r + 1

CAS(current, test, value)
14  begin atomic
15      if current == test
16          current ← value
17          return true
18      else
19          return false
20  end atomic

Figure 2-3: Pseudocode for Prism, including the compare-and-swap synchronization primitive CAS. The procedure Prism takes as input a data graph G, an update function f, and an initial activation set Q0. The procedure Color-Graph colors a given graph and returns the number of colors it used. The procedures MB-Collect and MB-Insert operate on the multibag Q to maintain activation sets for Prism. The variable r tracks the number of rounds executed.

Prism enforces this constraint⁵ by using the atomic compare-and-swap operator [151, p. 480], which is available as a synchronization primitive on most machines and whose definition is given in lines 14–20. Lines 10–12 use the CAS primitive to activate each vertex u ∈ S by atomically setting active[u] ← true and, if active[u] was previously false, calling MB-Insert. Thus, each vertex is inserted into Q at most once during a round.
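In C++, the same activate-at-most-once logic can be sketched with std::atomic, whose compare_exchange_strong operation plays the role of CAS. This is an illustrative fragment rather than the thesis code; the activate_all name and the insert callback, which stands in for MB-Insert, are assumptions of the example.

#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

using Vertex = std::size_t;

void activate_all(std::vector<std::atomic<bool>>& active,
                  const std::vector<int>& color,
                  const std::vector<Vertex>& S,   // set returned by f(v)
                  const std::function<void(Vertex, int)>& insert) {
    for (Vertex u : S) {
        bool expected = false;
        // compare_exchange_strong succeeds for exactly one activator of u
        // per round, so u is recorded at most once.
        if (active[u].compare_exchange_strong(expected, true))
            insert(u, color[u]);                  // stands in for MB-Insert
    }
}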

Design considerations for the implementation of multibags

The theoretical performance of Prism depends upon the properties of the multibag data structure. In particular, the multibag is carefully designed to ensure that Prism is work-efficient — that is, it performs the same asymptotic work as the serial algorithm Serial-DDGC in Figure 2-1. Before examining the design of the multibag in Section 2.4, let us first explore why maintaining active color sets in Prism in a work-efficient manner is tricky. Specifically, we shall consider two alternative strategies: bit vectors and an array of worker-local queues.

The bit-vector approach avoids the multibag altogether and simply manages activation sets using the bit vector active already used by Prism. Recall that if active[i] is true, then the vertex vi ∈ V indexed by i is active. Suppose that active were the only data structure. To iterate over all activated vertices of color k, a parallel for loop could scan through active, updating the vertex vi whenever active[i] is true and color[i] is k. This scheme requires Ω(Vχ) work per round of the computation, where χ is the number of colors returned by Color-Graph on line 1 of Figure 2-3, since the entire bit vector must be scanned χ times each round. At the cost of additional preprocessing, active could be organized such that vertices of the same color are assigned contiguous indexes.

⁵ This constraint may be enforced without the use of an atomic compare-and-swap operation by deduplicating the contents of Q at the start of each round. However, our empirical studies have shown that this limited use of atomics is beneficial in practice.

Even with this optimization, however, scanning active requires Ω(V) work each round, which is not work-efficient for dynamic computations that activate only a sparse subset of the vertices each round.

An alternative strategy that one might consider is to represent the active color sets using an array of worker-local queues. A straightforward implementation of this approach, however, is also not work-efficient. For a dynamic data-graph computation using χ colors and P processors, a total of Pχ worker-local queues would be needed to maintain the set of active vertices, and Ω(Pχ) work would be required to collect all nonempty queues. As we shall see in Section 2.4, however, by using a carefully designed data structure to manage worker-local queues, we can obtain a work-efficient data structure for maintaining color sets.

2.4 The multibag data structure

This section presents the multibag data structure employed by Prism. The multibag uses worker-local sparse accumulators [126] and an efficient parallel collection operation. We describe how the MB-Insert and MB-Collect operations are implemented, and we analyze them using work-span analysis [77, Ch. 27]. When used in a P-processor execution of a parallel program, a multibag Q of χ bags storing n elements supports MB-Insert in Θ(1) worst-case time and MB-Collect in O(n + χ + P) work and O(lg n + χ + lg P) span. Such a multibag uses O(Pχ + n) space.

A sparse accumulator (SPA) [126] implements an array that supports lazy initialization of its elements. A SPA T contains a sparsely populated array T.array of elements and a log T.log, which is a list of the indices of initialized elements in T.array. To implement multibags, we shall only need the ability to create a SPA, access an arbitrary SPA element, or delete all elements from a SPA. For simplicity, we shall assume that an uninitialized array element in a SPA has a value of nil. When an array element T.array[i] is modified for the first time, the index i is appended to T.log. An appropriately designed SPA T storing k = |T.log| elements admits the following performance properties:
• Creating T takes Θ(1) work.
• Each element of T can be accessed in Θ(1) work.
• Reading all k initialized elements of T takes Θ(k) work and Θ(lg k) span.
• Emptying T takes Θ(1) work.

A multibag Q is an array of P worker-local SPAs, where P is the number of workers executing the program. We shall use p interchangeably to denote either a worker or that worker’s unique identifier. Worker p’s local SPA in Q is thus denoted by Q[p]. For a multibag Q of χ bags, each SPA Q[p] contains an array Q[p].array of size χ and a log Q[p].log. Figure 2-4(a) illustrates a multibag with χ = 7 bags, 4 of which are nonempty. As Figure 2-4(a) shows, the worker-local SPAs in Q partition each bag Ck ∈ Q into subbags {Ck,0, Ck,1, ..., Ck,P−1}, where Q[p].array[k] stores subbag Ck,p.
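The following C++ sketch (illustrative only, not the thesis code) shows one worker's SPA of subbags under these conventions. For simplicity it allocates the array of χ entries eagerly, which costs Θ(χ) at creation rather than the Θ(1) assumed above; the lazy-initialization trick is omitted.

#include <cstddef>
#include <memory>
#include <vector>

using Vertex = std::size_t;

// One worker's SPA of subbags: array[k] is null until bag k is touched,
// and `log` records which bag indices have been initialized.
struct Spa {
    std::vector<std::unique_ptr<std::vector<Vertex>>> array;
    std::vector<std::size_t> log;

    explicit Spa(std::size_t num_bags) : array(num_bags) {}

    // The per-worker body of MB-Insert: append v to subbag k.
    void insert(Vertex v, std::size_t k) {
        if (!array[k]) {
            log.push_back(k);
            array[k] = std::make_unique<std::vector<Vertex>>();
        }
        array[k]->push_back(v);
    }

    // Empty the SPA in time proportional to the number of touched subbags.
    void clear() {
        for (std::size_t k : log) array[k].reset();
        log.clear();
    }
};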

Implementation of MB-Insert and MB-Collect

The worker-local SPAs enable a multibag Q to support parallel MB-Insert operations without creating races. Figure 2-5 shows the pseudocode for MB-Insert. When a worker p executes MB-Insert(Q, v, k), it inserts element v into the subbag Ck,p as follows. Line 1 calls Get-Worker-ID to get worker p’s identifier. Line 2 checks whether the subbag Ck,p stored in Q[p].array[k] is initialized and, if not, lines 3–4 initialize it. Line 5 inserts v into Q[p].array[k].


Figure 2-4: A multibag data structure. (a) A multibag containing 19 elements distributed across 4 distinct bags, {C0, C2, C3, C6}, representing vertices of colors 0, 2, 3, and 6, respectively. Each worker keeps track of its portion of a particular bag, its subbag, using a worker-local SPA, thus avoiding initialization of unused subbags by maintaining a compact log pointing to the set of populated subbags. For example, bag C6 is composed of three subbag contributions from the three active workers: {v33, v44, v28}, {v84}, and {v5, v79, v10}. (b) The output of MB-Collect when executed on the multibag in (a). Sets of subbags in collected-subbags are labeled with the bag Ck that their union represents.

MB-Insert(Q, v, k)
1   p ← Get-Worker-ID()
2   if Q[p].array[k] == nil
3       Append(Q[p].log, k)
4       Q[p].array[k] ← new subbag
5   Append(Q[p].array[k], v)

Figure 2-5: Pseudocode for the MB-Insert multibag operation. MB-Insert(Q, v, k) inserts the element v into the kth bag Ck of the multibag Q.

Conceptually, the MB-Collect operation extracts the bags in Q to produce a compact representation of those bags that can be read efficiently. Figure 2-4(b) illustrates the compact representation of the elements of the multibag from Figure 2-4(a) that MB-Collect returns. This representation consists of a pair ⟨bag-offsets, collected-subbags⟩ of arrays that together resemble the representation of a graph in CSR format. The collected-subbags array stores all of the subbags in Q sorted by their corresponding bag’s index. The bag-offsets array stores indices in collected-subbags that denote the sets of subbags comprised by each bag. In particular, in this representation, the contents of bag Ck are stored in the subbags in collected-subbags between indices bag-offsets[k] and bag-offsets[k + 1].

Figure 2-6 sketches how MB-Collect converts a multibag Q stored in worker-local SPAs into the representation illustrated in Figure 2-4(b). Steps 1 and 2 create an array collected-subbags of the nonempty subbags from the worker-local SPAs in Q. Each subbag Ck,p in collected-subbags is tagged with the integer index k of its corresponding bag Ck ∈ Q. Step 3 sorts collected-subbags by these index tags, and Step 4 creates the bag-offsets array.

Step 5 removes all elements from Q, thereby emptying the multibag.
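The sketch below (a simplification, not the thesis code) shows one convention for the collected representation and for iterating over a single bag. It assumes bag-offsets has one entry per bag plus a final sentinel, and it omits the parallel prefix-sum and sort steps of the real MB-Collect.

#include <cstddef>
#include <vector>

using Vertex = std::size_t;
using Subbag = std::vector<Vertex>;

// CSR-like result of MB-Collect: the subbags of bag k occupy
// collected_subbags[bag_offsets[k] .. bag_offsets[k + 1]).
struct Collected {
    std::vector<std::size_t> bag_offsets;   // number of bags + 1 entries
    std::vector<Subbag> collected_subbags;  // subbags sorted by bag index
};

// Apply `visit` to every vertex of bag k.
template <typename Visit>
void for_each_in_bag(const Collected& c, std::size_t k, Visit visit) {
    for (std::size_t i = c.bag_offsets[k]; i < c.bag_offsets[k + 1]; ++i)
        for (Vertex v : c.collected_subbags[i])
            visit(v);
}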

Analysis of multibags

We now analyze the work and span of the multibag’s MB-Insert and MB-Collect operations, starting with MB-Insert.

Lemma 1 Executing MB-Insert takes Θ(1) time in the worst case.

Proof. Consider each step of a call to MB-Insert(Q, v, k). The Get-Worker-ID procedure on line 1 obtains the executing worker’s identifier p from the runtime system in Θ(1) time, and line 2 checks whether the entry Q[p].array[k] is empty in Θ(1) time. Suppose that Q[p].log and each subbag in Q[p].array are implemented as dynamic arrays that use a deamortized table-doubling scheme [52]. Lines 3–5 then take Θ(1) time each to append k to Q[p].log, create a new subbag in Q[p].array[k], and append v to Q[p].array[k].

The next lemma analyzes the work and span of MB-Collect.

Lemma 2 In a P -processor parallel program execution, a call to MB-Collect(Q) on a multibag Q with χ bags whose contents are distributed across m distinct subbags executes in O(m + χ + P ) work and O(lg m + χ + lg P ) span.

Proof. We analyze each step of MB-Collect in turn. We shall use a helper procedure Prefix-Sum(A), which computes the all-prefix sums of an array A of n integers in Θ(n) work and Θ(lg n) span. (Blelloch [30] describes an appropriate implementation of Prefix-Sum.) Step 1 replaces each entry in Q[p].log in each worker-local SPA Q[p] with the appropriate index–subbag pair ⟨k, Ck,p⟩ in parallel, which requires Θ(m + P) work and Θ(lg m + lg P) span. Step 2 gathers all index–subbag pairs into a single array. Suppose that each worker-local SPA Q[p] is augmented with the size of Q[p].log, as Figure 2-4(a) illustrates. Executing Prefix-Sum on these sizes and then copying the entries of Q[p].log into collected-subbags in parallel therefore completes Step 2 in Θ(m + P) work and Θ(lg m + lg P) span. Step 3 can sort the collected-subbags array in Θ(m + χ) work and Θ(lg m + χ) span using a variant of a parallel radix sort [74, 35, 355], as follows:
1. Divide collected-subbags into m/χ groups of size χ, and create an (m/χ) × χ matrix A, where entry Aij stores the number of subbags with index j in group i. Constructing A can be done with Θ(m + χ) work and Θ(lg m + χ) span by evaluating the groups in parallel and the subbags in each group serially.
2. Evaluate Prefix-Sum on Aᵀ (or, more precisely, the array formed by concatenating the columns of A in order) to produce a matrix B such that Bij identifies which entries in the sorted version of collected-subbags will store the subbags with index j in group i. This Prefix-Sum call takes Θ(m + χ) work and Θ(lg m + lg χ) span.
3. Create a temporary array T of size m, and, in parallel over the groups of collected-subbags, serially move each subbag in a group to the appropriate index in T, as identified by B. Copying these subbags executes in Θ(m + χ) work and Θ(lg m + χ) span.
4. Rename the temporary array T as collected-subbags in Θ(1) work and span.
Finally, Step 4 can scan collected-subbags for adjacent pairs of entries with different bag indices to compute bag-offsets in Θ(m) work and Θ(lg m) span, and Step 5 can reset every SPA in Q in parallel using Θ(P) work and Θ(lg P) span. Totaling the work and span of each step completes the proof.

MB-Collect(Q)
1. For each SPA Q[p], map each bag index k in Q[p].log to the pair ⟨k, Q[p].array[k]⟩.
2. Concatenate the arrays Q[p].log for all workers p ∈ {0, 1, ..., P − 1} into a single array, collected-subbags.
3. Sort the entries of collected-subbags by their bag indices.
4. Create the array bag-offsets, where bag-offsets[k] stores the index of the first subbag in collected-subbags that contains elements of the kth bag.
5. For p = 0, 1, ..., P − 1, delete all elements from the SPA Q[p].
6. Return the pair ⟨bag-offsets, collected-subbags⟩.

Figure 2-6: Pseudocode for the MB-Collect multibag operation. Calling MB-Collect on a multibag Q produces a pair of arrays collected-subbags, which contains all nonempty subbags in Q sorted by their associated bag’s index, and bag-offsets, which associates sets of subbags in Q with their corresponding bag.

Remark 3 Let Q be a multibag in a P -processor execution with m distinct subbags that represents bags whose indices lie in the range [0, k]. Then Q may be treated as a multibag representing k bags so that MB-Collect(Q) executes in O(m + k + P ) work and O(lg m + k + lg P ) span.

Although different executions of a program can store the elements of Q in different numbers m of distinct subbags, notice that m is never more than the total number of elements in Q.

2.5 Analysis of Prism

This section analyzes the performance of Prism using work-span analysis [77, Ch. 27]. We derive bounds on the work and span of Prism for any simple data-graph computation ⟨G, f, Q0⟩. Recall that we make the reasonable assumptions that a single update f(v) executes in Θ(deg(v)) work and Θ(lg(deg(v))) span, and that the update only activates vertices in Adj[v]. These work and span bounds can be used to characterize the data-graph computations on which Prism achieves good parallel scalability. In particular, we show that, on a data graph of n vertices colored using χ colors, Prism achieves good parallel speedup whenever the average work per round is much greater than Pχ lg n.

Let us first analyze the work and span of Prism for one round of a data-graph computation.

Theorem 4 Suppose that Prism colors a degree-∆ data graph G = (V,E) using χ colors, and then executes the data-graph computation hG, f, Q0i. Then, on P processors, Prism executes updates on all vertices in the activation set Qr for a round r using O(size(Qr)+P ) work and O(χ(lg(Qr/χ) + lg ∆) + lg P ) span.

Proof. Let us first analyze the work and span of one iteration of lines 6–12 in Prism, which performs the updates on the vertices belonging to one color set C ∈ Qr. Consider a vertex v ∈ C. Lines 8 and 9 execute in Θ(deg(v)) work and Θ(lg(deg(v))) span. For each vertex u in the set S of vertices activated by the update f(v), Lemma 1 implies that lines 11–12 execute in Θ(1) total work. The parallel for loop on lines 10–12 therefore executes in Θ(S) work and Θ(lg S) span. Because |S| ≤ deg(v), the parallel for loop on lines 7–12 thus executes in Θ(size(C)) work and Θ(lg C + max_{v∈C} lg(deg(v))) = O(lg C + lg ∆) span.

By processing each of the χ color sets belonging to Qr, lines 6–12 therefore execute in Θ(size(Qr) + χ) work and O(χ(lg(Qr/χ) + lg ∆)) span. Lemma 2 implies that line 5 executes MB-Collect in O(Qr + χr + P) work and O(lg Qr + χr + lg P) span, where χr = max_{v∈Qr}{color[v]}.

Note that we take advantage here of the observation made in Remark 3. The theorem follows, since |Qr| + χr ≤ size(Qr) + 1.

Theoretical scalability of Prism

Dynamic data-graph computations typically run for multiple rounds until a convergence criterion is met. We now generalize Theorem 4 to prove work and span bounds for Prism when executing a sequence of rounds.

Theorem 5 Suppose that Prism colors a degree-∆ data graph G = (V, E) using χ colors, and then executes the data-graph computation ⟨G, f, Q0⟩ in r rounds, applying updates to the activation sets Q0, Q1, ..., Qr−1. Define the multiset U = ⊎_{i=0}^{r−1} Qi, so that |U| = Σ_{i=0}^{r−1} |Qi| and size(U) = Σ_{i=0}^{r−1} size(Qi), where the symbol ⊎ indicates a multiset sum.⁶ Then, on P processors, Prism executes the data-graph computation using O(size(U) + rP) work and O(rχ(lg((U/r)/χ) + lg ∆) + r lg P) span.

Proof. The work bound follows directly from Theorem 4 by summing the work performed in each of the r rounds of Prism. The total span of Prism is equal to the sum of each round’s span, which by Theorem 4 is bounded by Σ_{i=0}^{r−1} (χ(lg(Qi/χ) + lg ∆) + lg P). Observing that Σ_{i=0}^{r−1} χ lg(Qi/χ) ≤ rχ lg((U/r)/χ) completes the proof.

Given Theorem 5, we can compute the parallelism of Prism for a data-graph computation that applies a multiset U of updates over r rounds. The following corollary expresses the parallelism of Prism in terms of the average size of the activation sets in a sequence of rounds.

Corollary 6 Suppose Prism executes a data-graph computation in r rounds during which it applies a multiset U of updates. Define the average number of updates per round Uavg = |U|/r and the average work per round Wavg = size(U)/r. Then Prism has Ω(Wavg/(χ(lg(Uavg/χ)+ lg ∆))) parallelism.

Proof. Follows from Theorem 5 by computing the parallelism as the ratio of the work and span and then substituting.

Corollary 6 implies that Prism achieves nearly perfect linear speedup on P processors for a graph of n vertices when the average work performed in each round satisfies Wavg ≫ Pχ lg n.
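As a hypothetical numerical illustration of Corollary 6 (the numbers are invented, not measurements from this thesis), suppose each round updates Uavg = 10^5 vertices of average degree 20 in a graph with χ = 16 colors and degree ∆ = 64, so that Wavg ≈ 2.1 × 10^6. Then the parallelism is

\[
  \Omega\!\left(\frac{W_{\mathrm{avg}}}{\chi\left(\lg(U_{\mathrm{avg}}/\chi) + \lg\Delta\right)}\right)
  \;\approx\; \frac{2.1\times 10^{6}}{16\,(\lg 6250 + \lg 64)}
  \;\approx\; 7000 ,
\]

comfortably exceeding the 12 processors used in the experiments of Section 2.6.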

2.6 Empirical evaluation

This section explores the performance properties of Prism from an empirical perspective. We describe three experiments designed to investigate the synchronization costs, dynamic-scheduling overheads, and scalability properties of Prism. For the first experiment, on a suite of 12 benchmark graphs, Prism executed between 1.0 and 2.1 times faster than a nondeterministic locking protocol on PageRank [51], exhibiting a geometric-mean speedup of a factor of 1.5, a substantial advantage in synchronization costs.

⁶ A multiset sum M = ⊎_{i∈I} Mi has multiplicity of element m equal to M(m) = Σ_{i∈I} Mi(m) for all m ∈ M.

The second experiment shows that the slowdown that Prism incurs for dynamic scheduling using multibags, compared with static scheduling, is only about 1.16 when all vertices are activated in every round. This experiment shows that Prism can be effective even for relatively densely activated graphs. The third experiment shows that Prism scales well and is relatively insensitive to the number of colors needed to color the data graph, as long as there is sufficient parallelism.

Experimental setup

All of the benchmarks presented in this section were run on an Intel Xeon X5650 machine with 12 processor cores running at 2.67 GHz with hyperthreading disabled. Our test machine has 49 GB of DRAM, two 12-MB L3-caches, each shared among 6 cores, and private L2- and L1-caches of sizes 128 KB and 32 KB, respectively.

As a platform for our experiments, we implemented a new parallel execution engine within GraphLab [234] that uses Intel Cilk Plus⁷ [170] to expose parallelism. The new execution engine and all of our scheduling algorithms were designed to be compatible with the original GraphLab API in order to facilitate a fair evaluation of the relative merits of different scheduling methodologies. In particular, to better understand the performance properties of Prism, we developed four scheduling algorithms for comparison:
• Serial-DDGC is an implementation of the serial scheduling algorithm from Figure 2-1. Serial-DDGC provides a serial performance baseline for measuring the parallel speedup achieved by the other, more complex, scheduling algorithms for dynamic data-graph computations.
• Cilk+Locks is a lock-based scheduling algorithm for dynamic data-graph computations. During each round, Cilk+Locks updates only an active subset of the vertices in the graph. It uses a locking scheme to avoid executing conflicting updates in parallel. The locking scheme associates a shared-exclusive (i.e., reader-writer) lock [79] with each vertex in the graph. Prior to executing an update f(v), vertex v’s lock is acquired exclusively, and a shared lock is acquired for each u ∈ Adj[v]. A global ordering of locks is used to avoid deadlock.
• RRLocks is the lock-based dynamic scheduling algorithm implemented by the round-robin sweep scheduler in the original shared-memory version of GraphLab. A bit vector active is used to represent the active set of vertices. During each round, RRLocks scans each vertex in the active set in a round-robin fashion, conditionally updating a vertex vi if active[i] is true. To avoid races, a locking strategy is used to coordinate updates that conflict.
• RRColor is a coloring-based dynamic scheduling algorithm that uses a bit vector active to represent the active set of vertices. Instead of using locks to coordinate conflicting updates, however, RRColor uses a vertex-coloring of the graph. At the start of the computation, RRColor partitions the vertices by color and stores them in static arrays. For a graph colored using χ colors, each round of the computation is divided into χ color steps. During the kth color step, RRColor scans all color-k vertices and conditionally updates a color-k vertex vi if active[i] is true.

⁷ All code was compiled with Intel’s ICC version 13.1.1.

Graph            |V|         |E|         χ     Cilk+Locks   Prism   Prism-R   Coloring
cage15           5,154,860   94,044,700  17    36.9         35.5    35.6      12%
liveJournal      4,847,570   68,475,400  333   36.8         21.7    22.3      12%
randLocalDim25   1,000,000   49,992,400  36    26.7         14.4    14.6      18%
randLocalDim4    1,000,000   41,817,000  47    19.5         12.5    13.7      14%
rmat2Million     2,097,120   39,912,600  72    22.5         16.6    16.8      12%
powerGraph2M     2,000,000   29,108,100  15    12.1         9.8     10.1      13%
3dgrid5m         5,000,210   15,000,600  6     10.3         10.3    10.4      7%
2dgrid5m         4,999,700   9,999,390   4     17.7         8.9     9.0       4%
web-Google       916,428     5,105,040   43    3.9          2.4     2.4       8%
web-BerkStan     685,231     7,600,600   200   3.9          2.4     2.7       8%
web-Stanford     281,904     2,312,500   62    1.9          0.9     1.0       11%
web-NotreDame    325,729     1,469,680   154   1.1          0.8     0.8       12%

Figure 2-7: Performance of Prism versus Cilk+Locks when executing 10 · |V| updates of the PageRank [51] data-graph computation on a suite of six real-world graphs and six synthetic graphs. Column “Graph” identifies the input graph, and columns |V| and |E| specify the number of vertices and edges in the graph, respectively. Column χ gives the number of colors Prism used to color the graph. Columns “Cilk+Locks,” “Prism,” and “Prism-R” present 12-core running times in seconds for each respective scheduler. Each running time is the median of 5 runs. Column “Coloring” gives the percentage of Prism’s running time spent coloring the graph. Prism-R, discussed in Section 2.7, provides deterministic support for associative operations on global variables.

Overheads for locking and for chromatic scheduling

We compared the overheads associated with coordinating conflicting updates of a dynamic data-graph computation using locks versus using chromatic scheduling. We evaluated these overheads by comparing the 12-core execution times of Prism and Cilk+Locks on the PageRank [51] data-graph computation over a suite of graphs. We used PageRank for this study because of its comparatively cheap update function, which makes overheads due to scheduling more pronounced. PageRank updates a vertex v by first scanning v’s incoming edges to aggregate the data from its incoming neighbors, and then scanning v’s outgoing edges to activate its outgoing neighbors.

We executed the PageRank application on a suite of six synthetic and six real-world graphs. The six real-world graphs came from the Stanford Large Network Dataset Collection (SNAP) [220] and the University of Florida Sparse Matrix Collection [84]. The six synthetic graphs were generated using the “randLocal,” “powerLaw,” “gridGraph,” and “rMatGraph” generators included in the Problem Based Benchmark Suite [320]. We chose the graphs in this suite to be large enough to stress the memory system and thus make parallel speedup comparatively difficult. That is, given the random access inherent in data-graph computations, we expect most references to vertex data to come from DRAM, making DRAM bandwidth a scarce shared commodity. Since the span of Prism is superconstant, however, for a fixed number of workers, increasing the size of the graph only increases parallelism, making good parallel speedup comparatively easy. Thus, we have pessimistically chosen the graphs in the suite to be large enough to make DRAM bandwidth a shared bottleneck, but not unduly larger.

We observed that Prism often performs slightly fewer rounds of updates than Cilk+Locks when both are allowed to run until convergence. Wishing to isolate scheduling overheads, we controlled this variation by explicitly setting the total number of updates on a graph to 10 times the number of vertices.

Benchmark  χ    Updates     RRLocks  RRColor  Prism  Prism-R
PR/L       333  48,475,700  35.25    14.5     17.7   18.4
ID/2000    4    40,000,000  63.15    50.1     59.2   59.9
FBP/C3     2    16,001,900  11.9     8.8      8.8    8.9
ID/1000    4    10,000,000  15.7     12.6     14.9   15.0
PR/G       43   9,164,280   3.1      1.3      2.1    2.2
FBP/C1     2    8,783,100   5.9      4.7      4.8    4.8
ALS/N      6    1,877,220   65.7     52.4     52.8   53.5

Figure 2-8: Performance of four schedulers on the seven application benchmarks from Figure 2-2, modified so that all vertices are activated in every round. Column "Updates" specifies the number of updates performed in the data-graph computation. Columns "RRLocks," "RRColor," "Prism," and "Prism-R" list the 12-core running times in seconds for the respective schedulers to execute each benchmark. Each running time is the median of 5 runs. The Prism-R algorithm, which provides deterministic support for associative operations on global variables, will be discussed in Section 2.7.

10 times the number of vertices. Figure 2-7 presents the empirical results for this study.

Figure 2-7 shows that over the 12 benchmark graphs, Prism executes between 1.0 and 2.1 times faster than Cilk+Locks on PageRank, exhibiting a geometric-mean speedup of a factor of 1.5. Moreover, from Figure 2-7 we see that an average of 10.9% of Prism's total running time is spent coloring the data graph, which is approximately equal to the cost of executing |V| updates. Prism colors the data graph only once to execute the data-graph computation, however, meaning that its cost can be amortized over all of the updates in the data-graph computation. By contrast, the locking scheme implemented by Cilk+Locks incurs overhead for every update. Before updating a vertex v, Cilk+Locks acquires each lock associated with v and every vertex u ∈ Adj[v]. For simple data-graph computations whose update functions perform relatively little work, this step can account for a significant fraction of the time to execute an update.

Dynamic-scheduling overhead

To investigate the overhead of using multibags to maintain activation sets, we compared the 12-core running times of Prism, RRColor, and RRLocks on the seven benchmark applications from Figure 2-2. For this study, we modified the benchmarks slightly for each scheduler in order to provide a fair comparison. In particular, because Prism typically executes fewer updates than a static data-graph computation scheduler, we modified the update functions for each application so that every update on a vertex v always activates all vertices u ∈ Adj[v]. This modification guarantees that Prism executes the same set of updates each round as RRLocks and RRColor, while still incurring the overhead that Prism requires in order to maintain a dynamic set of active vertices. Thus, we compare the worst-case conditions for Prism with respect to scheduling overhead with the best-case conditions for RRLocks and RRColor.

Figure 2-8 presents the results of these tests, revealing the overhead Prism incurs to maintain its activation sets using a multibag. As can be seen from the figure, Prism is 1.0 to 1.6 times slower than RRColor on the benchmarks, with a geometric-mean relative slowdown of 1.16. That is, for static data-graph computations, Prism incurs only an aggregate 16% slowdown through the use of a multibag, as opposed to the simple array used by RRColor, which suffices for static scheduling. The Prism algorithm, which can also support dynamic activation sets efficiently, thus incurred minimal overhead for the multibag data structure.

Prism outperformed RRLocks on all benchmarks, achieving a geometric-mean speedup of 30% due to RRLocks's lock overhead. Thus, Prism incurs relatively little overhead by maintaining activation sets with multibags. The relative overhead of RRColor and Prism depends on the percentage of vertices active during a given round. As a typical example, RRColor is approximately 1.09 times faster than Prism on the image-denoise benchmark when 80% of the vertices are active each round, but is 1.11 times slower when 5% or less of the vertices are active each round. As part of an effort to incorporate the Prism scheduling paradigm into an existing data-graph computation framework (e.g., GraphLab, Pregel, etc.), one might consider using a heuristic to switch between the use of a bitvector and a multibag depending on the density of the activation set. A simple heuristic such as a fixed threshold on the relative density of the activation set8 (e.g., 10% of the vertices) would likely suffice to maintain activation sets with good performance: if fewer than 10% of vertices are active, use a multibag; otherwise, use a bitvector.
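As a rough illustration only, the following C++ sketch shows how such a fixed density threshold might select the representation for the next round's activation set. The function name, the Repr type, and the 10% default are hypothetical assumptions of this sketch, not code from Prism or from any existing framework.

#include <cstddef>

enum class Repr { Multibag, Bitvector };

// Pick a representation for the next round's activation set based on the
// fraction of vertices expected to be active.  The 10% default threshold is
// only the illustrative value mentioned above.
Repr choose_representation(std::size_t num_active, std::size_t num_vertices,
                           double threshold = 0.10) {
  double density = static_cast<double>(num_active) /
                   static_cast<double>(num_vertices);
  return density < threshold ? Repr::Multibag : Repr::Bitvector;
}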

Scalability of Prism

To measure the scalability of Prism and Cilk+Locks, we compared their 12-core runtimes to the serial reference implementation Serial-DDGC. Figure 2-9 shows the empirical 12-core speedups relative to Serial-DDGC of Prism and Cilk+Locks on seven application benchmarks. Data for Prism-R, whose evaluation is discussed in Section 2.9, is also included. In geometric mean, Cilk+Locks achieved 5.73 times speedup, Prism achieved 7.56 times speedup, and Prism-R achieved 7.42 times speedup.

In order to study the effect of the number χ of colors used to color the application's data graph on the parallel scalability of Prism, we measured the parallelism T1/T∞ and the 12-core speedup T1/T12 of Prism while executing the image-denoise application as we varied the number of colors used. The image-denoise application performs belief propagation to remove Gaussian noise added to a gray-scale image. The data graph for the image-denoise application is a two-dimensional grid in which each vertex represents a pixel, and there is an edge between any two adjacent pixels. The Color-Graph procedure invoked in line 1 of Figure 2-3 typically colors this data graph with just 4 colors. To perform this study, we artificially increased χ by repeatedly taking a random nonempty subset of the largest set of vertices with the same color and assigning a new color to those vertices. Using this technique, we ran the image-denoise application on a 500-by-500-pixel input image for values of χ between 4 and 250,000, the last data point corresponding to a coloring that assigns all pixels distinct colors. Figure 2-10 plots the results of these tests. Although the parallelism of Prism is inversely proportional to χ, Prism's speedup on 12 cores is relatively insensitive to χ, as long as the parallelism is greater than about 120. This result is consistent with the rule of thumb that a program with at least 10P parallelism should achieve nearly perfect linear speedup on P processors [77, p. 783].

8A similar heuristic was shown to be effective in the graph computation library Ligra [318] for adaptively switching between “dense” and “sparse” representations of vertex subsets.

[Figure 2-9 is a bar chart plotting the 12-core speedup (y-axis, 0–12) of Cilk+Locks, Prism, and Prism-R over Serial-DDGC for the benchmarks PR/L, PR/G, ALS/N, ID/2000, ID/1000, FBP/C3, and FBP/C1.]

Figure 2-9: Empirical speedup relative to Serial-DDGC on 12 processor cores. Shown are the empirical speedups Ts/T12 of Cilk+Locks, Prism, and Prism-R, where Ts is the runtime of the serial scheduling algorithm Serial-DDGC and T12 is the runtime of the particular algorithm on 12 cores. The Prism-R algorithm is discussed in Section 2.7.

[Figure 2-10 is a log-log plot of the parallelism and the empirical speedup (y-axis, 0.25 to 16384) versus the number of colors χ (x-axis, 2^4 through 2^18).]

Figure 2-10: Scalability of Prism on the image-denoise application as a function of χ, the number of colors used to color the data graph. The parallelism T1/T∞ is plotted together with the empirical speedup T1/T12 achieved on a 12-core execution. Parallelism values were measured using the Cilkview scalability analyzer [149].

2.7 The Prism-R algorithm

This section introduces Prism-R, a chromatic-scheduling algorithm that executes a dynamic data-graph computation deterministically even when updates modify global variables through an associative reduction mechanism such as a reducer hyperobject [115]. While the chromatic-scheduling technique employed by Prism ensures that there are no data races on the vertex data of the graph, the order in which updates modify a reducer variable among vertices of a common color can make the final value of that reducer variable nondeterministic. Prism-R uses the multivector data structure, which is a theoretical improvement to the multibag, to maintain activation sets that are partitioned by color and ordered deterministically. We describe an extension of the model of simple data-graph computations that permits an update function to perform associative operations on global variables using a parallel reduction mechanism. In this extended model, Prism-R executes dynamic data-graph computations deterministically while achieving the same work and span bounds as Prism.

Data-graph computations with global reductions

Several frameworks for executing data-graph computations allow updates to modify global variables in limited ways. Pregel aggregators [242] and GraphLab's sync mechanism [234], for example, both support data-graph computations in which an update can modify a global variable in a restricted manner. These mechanisms coordinate parallel modifications to a global variable using parallel reductions [175, 205, 31, 66, 197, 291, 169, 251], that is, they coordinate these modifications by applying them to local views (copies) of the variable and then reducing (combining) those copies together using a binary reduction operator.

A reducer (hyperobject) [115, 213] is a general parallel reduction mechanism provided by Cilk Plus and other dialects of Cilk. A reducer is defined on an arbitrary data type T, called a view type, by defining an Identity operator and a binary Reduce operator for views of type T. The Identity operator creates a new view of the reducer. The binary Reduce operator defines the reducer's reduction operator. A reducer is a particularly general reduction mechanism because it guarantees that, if its Reduce operator is associative, then the final result in the global variable is deterministic: every parallel execution of the program produces the same result. Other parallel reduction mechanisms, including Pregel aggregators and GraphLab's sync mechanism, provide this guarantee only if the reduction operator is also commutative.

Although Prism is implemented in Cilk Plus, Prism does not produce a deterministic result if updates modify global variables using a noncommutative reducer. The reason for this is, in part, that the order of vertices within a multibag depends on how the computation was scheduled among participating workers. As a result, the order in which lines 7–12 of Prism in Figure 2-3 evaluate the vertices in a color set C is nondeterministic. If two updates on vertices in C modify the same reducer, then the relative order of these modifications can differ between runs of Prism, even if a single worker happens to execute both updates.
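To make the Identity/Reduce interface concrete, here is a hedged C++ sketch of a reducer monoid whose view type is a vector combined by concatenation. It only illustrates the semantics described above; it is not the actual Cilk Plus reducer API, which additionally registers views with the work-stealing runtime.

#include <vector>

struct AppendViews {
  using View = std::vector<int>;

  // Identity: a fresh, empty view for a newly created parallel strand.
  static View identity() { return View{}; }

  // Reduce: combine a left view with the view immediately to its right.
  // Concatenation is associative but not commutative, so the final result is
  // deterministic only because the runtime always reduces left-to-right,
  // matching the serial order of the program.
  static void reduce(View& left, View& right) {
    left.insert(left.end(), right.begin(), right.end());
    right.clear();
  }
};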

Prism-R

Prism-R is an extension to Prism that executes dynamic data-graph computations deterministically even when update functions are allowed to perform associative operations on global variables. The semantics of Prism-R mimic those of Serial-DDGC when its

Prism-R(G, f, Q0)
 1  χ ← Color-Graph(G)
 2  r ← 0
 3  updates ← 0
 4  Q ← Q0
 5  while Q ≠ ∅
 6      𝒞 ← MV-Collect(Q)
 7      for C ∈ 𝒞
 8          parallel for i ← 1, 2, . . . , |C|
 9              ⟨v, p⟩ ← C[i]
10              if p == priority[v]
11                  rank[f(v)] = updates + i
12                  priority[v] ← ∞
13                  S ← f(v)
14                  parallel for u ∈ S
15                      if PriorityWrite(priority[u], rank[f(v)])
16                          MV-Insert(Q, ⟨u, rank[f(v)]⟩, color[u])
17          updates ← updates + |C|
18      r ← r + 1

PriorityWrite(current, value)
19  begin atomic
20      if current > value
21          current ← value
22          return true
23      else
24          return false
25  end atomic

Figure 2-11: Pseudocode for Prism-R. The algorithm takes as input a data graph G, an update function f, and an initial activation set Q0. Color-Graph colors a given graph and returns the number of colors it used. The procedures MV-Collect and MV-Insert operate on the multivector Q to maintain activation sets for Prism-R. Prism-R updates the value of updates after processing each color set and r after each round of the data-graph computation.

queue of active vertices is stable sorted by color at the start of each round. In this modified version of Serial-DDGC, updates to active vertices of the same color are applied in increasing order of their insertion into the queue. Prism-R guarantees that the results of associative reductions performed by update functions reflect this same order.

Figure 2-11 shows the pseudocode for Prism-R, which differs from Prism in its use of an alternate data structure to maintain partitioned activation sets and in its use of a priority-based deduplication strategy for avoiding multiple updates to the same vertex in a round. A multivector is used by Prism-R to represent a list of χ vectors (ordered multisets). It supports the operations MV-Insert and MV-Collect, which are analogous to the multibag operations MB-Insert and MB-Collect, respectively. Each vector maintained by a multivector has serial semantics, meaning that the order of elements within each vector is deterministic and equivalent to the insertion order in an execution of the serial elision of the parallel program. Section 2.8 describes and analyzes the implementation of the multivector data structure.

The serial semantics of the multivector alone are not sufficient to ensure that updates are ordered deterministically. Consider, for example, a round of Prism that updates the three vertices x, y, z in parallel. Suppose that y activates u and both x and z activate a common neighbor v. The atomic compare-and-swap operator used by Prism on line 11 of Figure 2-3 ensures that x and z will not both insert v into the activation set, but which of the two succeeds is nondeterministic. Inserting these two activated vertices into a multivector would produce either the order u, v or v, u depending on whether x or z activated v.

To eliminate this source of nondeterminism, Prism-R assigns each update f(v) a unique integer rank[f(v)] on line 11 of Figure 2-11 that orders updates applied during a round according to their execution order in an execution of the serial elision of Prism-R. Instead of maintaining a bit vector denoting whether or not a vertex is active, Prism-R maintains an integer array priority of priorities. For each active vertex v, the value priority[v] is equal to the smallest rank of any update f(u) that activated v in the previous round. The priority of a vertex v is reset on line 12 before applying f(v) by setting priority[v] = ∞. For each vertex u ∈ Adj[v] activated by update f(v), Prism-R uses an atomic priority-write operator [319] to set priority[u] = min{priority[u], rank[f(v)]} and inserts the vertex-priority pair ⟨u, rank[f(v)]⟩ into the multivector if the priority write is successful on line 15. The color sets returned by MV-Collect on line 6 can contain multiple vertex-priority pairs for each active vertex. On lines 8–16, Prism-R iterates over the vertex-priority pairs ⟨v, p⟩ in a color set and only applies the update f(v) if priority[v] == p. Since priority[v] is equal to the rank of the lowest-ranked update that activated v, Prism-R updates each active vertex exactly once during a round, in the same order as a serial execution.
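The following C++ fragment is a minimal sketch of the atomic priority-write operator invoked on line 15 of Figure 2-11, written as a compare-and-swap loop; the names and types are illustrative assumptions rather than Prism-R's actual implementation.

#include <atomic>
#include <cstdint>

// Atomically sets current = min(current, value).  Returns true if this call
// lowered the stored priority, i.e., if the caller's insertion into the
// multivector should be performed.
bool priority_write(std::atomic<int64_t>& current, int64_t value) {
  int64_t observed = current.load(std::memory_order_relaxed);
  while (observed > value) {
    if (current.compare_exchange_weak(observed, value,
                                      std::memory_order_acq_rel,
                                      std::memory_order_relaxed)) {
      return true;
    }
    // On failure, observed holds the latest stored value; retry until the
    // stored priority is already at most value.
  }
  return false;
}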

2.8 The multivector data structure

This section introduces the multivector data structure, which provides a theoretical improvement to the multibag. The multivector data structure maintains several vectors (dynamic arrays), each supporting a parallel append operation. Each vector has serial semantics, that is, the order of elements within any vector is equivalent to their insertion order in an execution of the serial elision of the Cilk parallel program. The multivector can be used in place of the multibag to provide a stronger encapsulation of nondeterminism in programs whose behavior depends on the ordering of elements in each set. This section assumes familiarity with the Cilk execution model [116], as well as its implementation of reducers [115].

A multivector represents a list of χ vectors (ordered multisets). It supports the operations MV-Insert and MV-Collect, which are analogous to the multibag operations MB-Insert and MB-Collect, respectively. Our implementation relies on properties of a work-stealing runtime system. Consider a parallel program modeled by a computation dag A in the Cilk model of multithreading. The serial execution order R(A) of the program lists the vertices of A according to the order in which they would be visited by an execution of the serial elision of the underlying Cilk program, which corresponds to a left-to-right depth-first execution of the dag. A work-stealing scheduler partitions R(A) into a sequence R(A) = ⟨t0, t1, . . . , tM−1⟩, where each trace ti ∈ R(A) is a contiguous subsequence of R(A) executed by exactly one worker.

A multivector represents each vector as a sequence of trace-local subvectors — subvectors that are modified within exactly one trace. The ordering properties of traces imply that concatenating a vector's trace-local subvectors in order produces a vector whose elements appear in the serial execution order. The multivector data structure assumes that a worker can query the runtime system to determine when it starts executing a new trace.

The log-tree reducer

A multivector stores its nonempty trace-local subvectors in a log tree, which represents an ordered multiset of elements and supports Θ(1)-work append operations. A log tree is a binary tree in which each node L stores a dynamic array L.sublog. The ordered multiset that

Flatten(L, A, i)
1  A[i] ← L
2  if L.left ≠ nil
3      spawn Flatten(L.left, A, i − L.right.size − 1)
4  if L.right ≠ nil
5      Flatten(L.right, A, i − 1)
6  sync

Figure 2-12: Pseudocode for the Flatten operation for a log tree. Flatten performs a post-order parallel traversal of a log tree to place its nodes into a contiguous array.
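For concreteness, the following C++ sketch mirrors the index arithmetic of Figure 2-12. The node type and the serial recursion are assumptions made for illustration; in a Cilk implementation the two recursive calls would be spawned in parallel, as in the pseudocode.

#include <vector>

struct LogTreeNode {
  std::vector<int> sublog;     // trace-local subvector (element type elided)
  int size = 1;                // number of nodes in the subtree rooted here
  LogTreeNode* left = nullptr;
  LogTreeNode* right = nullptr;
};

// Post-order placement: node L lands at A[i], its right subtree occupies the
// slots immediately before i, and its left subtree the slots before those.
void flatten(LogTreeNode* L, std::vector<LogTreeNode*>& A, int i) {
  A[i] = L;
  int right_size = L->right ? L->right->size : 0;
  if (L->left)  flatten(L->left, A, i - right_size - 1);  // spawn in Cilk
  if (L->right) flatten(L->right, A, i - 1);
}

// Usage sketch: A.resize(root->size); flatten(root, A, root->size - 1);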

Identity()
 7  L ← new log-tree node
 8  L.sublog ← new vector
 9  L.size ← 1
10  L.left ← nil
11  L.right ← nil
12  return L

Reduce(Ll, Lr)
13  L ← Identity()
14  L.size ← Ll.size + Lr.size + 1
15  L.left ← Ll
16  L.right ← Lr
17  return L

Figure 2-13: Pseudocode for the Identity and Reduce log-tree reducer operations. The Identity operation creates and returns a new log-tree node L. The Reduce(Ll, Lr) operation concatenates a left log-tree node Ll with a right log-tree node Lr.

a log tree represents corresponds to a concatenation of the tree's dynamic arrays in a post-order tree traversal. Each log-tree node L is augmented with the size of its subtree L.size, counting the number of log-tree nodes in the subtree rooted at L. Using this augmentation, the operation Flatten(L, A, L.size − 1) described in Figure 2-12 flattens a log tree rooted at L of n nodes and height h into a contiguous array A using Θ(n) work and Θ(h) span.

To handle parallel MV-Insert operations, a multivector employs a log-tree reducer, that is, a Cilk Plus reducer whose view type is a log tree. Figure 2-13 presents the pseudocode for the Identity and Reduce operations for the log-tree reducer. The Identity operation creates a new log-tree node with an empty sublog. The Reduce(Ll, Lr) operation creates a new root node L and assigns L.left = Ll and L.right = Lr. Updates are performed using a log-tree reducer R by first obtaining a local view L of the log-tree reducer using a runtime-provided function Get-Local-View(R) and appending elements to L.sublog. A log tree's Flatten operation uses a post-order traversal to order the log tree's nodes, which results in an ordering identical to that which would be obtained by using a linked-list reducer in place of the log-tree reducer. The log-tree reducer's Reduce operation is logically associative, that is, for any three log-tree reducer views a, b, and c, the views produced by Reduce(Reduce(a, b), c) and Reduce(a, Reduce(b, c)) represent the same ordered multiset.

Figure 2-14 illustrates the state of a log-tree reducer R following the execution of a fork-join parallel function A(R). Steals occur on line 2 of A and line 8 of B. The log-tree reducer partitions this execution of A(R) into 5 traces, each of which corresponds to one node in the tree. The first trace corresponds to the worker that begins the execution of A(R), and each steal creates two additional traces: one corresponding to the stolen continuation of the spawned function, and another corresponding to the portion of the program following the associated sync statement.

To maintain trace-local subvectors, a multivector Q consists of an array of P worker-

A(R)
 1  Log-Insert(R, e1)
 2  spawn B(R)
 3  Log-Insert(R, e7)
 4  sync
 5  Log-Insert(R, e8)

B(R)
 6  Log-Insert(R, e2)
 7  spawn Log-Insert(R, e3)
 8  Log-Insert(R, e4)
 9  Log-Insert(R, e5)
10  sync
11  Log-Insert(R, e6)

Log-Insert(R, e)
12  L ← Get-Local-View(R)
13  Append(L.sublog, e)

[Diagram: a five-node log tree, each node augmented with its size and left/right pointers. The root stores sublog (e8) with size 5; its left child stores sublog (e6) with size 3 and has leaf children storing sublogs (e1, e2, e3) and (e4, e5); its right child stores sublog (e7) with size 1. A post-order traversal yields the sublogs in the order (e1, e2, e3), (e4, e5), (e6), (e7), (e8).]

Figure 2-14: The state of a log-tree reducer R after a work-stealing execution of A(R). Steals occur on line 2 of A and line 8 of B, partitioning the execution into 5 traces. The ordered multiset (e1, e2, . . . , e8) is represented by 5 trace-local sublogs ordered according to a post-order traversal of the log tree.

local SPA's, where P is the number of processors executing the computation, and a log-tree reducer. The SPA Q[p] for worker p stores the trace-local subvectors that worker p has appended since the start of its current trace. The log-tree reducer Q.log-reducer stores all nonempty subvectors created.

Let us see how MV-Insert and MV-Collect are implemented. Figure 2-16 sketches the MV-Insert(Q, v, k) operation to insert element v into the vector Ck ∈ Q. MV-Insert differs from MB-Insert in two ways. First, when a new subvector is created and added to a SPA, lines 19–20 additionally append that subvector to Q.log-reducer, thereby maintaining the log-tree reducer. Second, lines 15–16 reset the contents of the SPA Q[p] after worker p begins executing a new trace, thereby ensuring that Q[p] stores only trace-local subvectors.

Figure 2-15 sketches the MV-Collect operation. The return value of MV-Collect is a pair ⟨vector-offsets, collected-subvectors⟩ analogous to the return value of MB-Collect. The procedure MV-Collect differs from MB-Collect primarily in that Step 1, which replaces Steps 1 and 2 in MB-Collect, flattens the log tree underlying Q.log-reducer to produce the unsorted array collected-subvectors. MV-Collect also requires that collected-subvectors be sorted using a stable sort in Step 2. The integer sort described in the proof of Lemma 2 for MB-Collect is a suitable stable sort for this purpose.

Analysis of multivector operations

We now analyze the work and span of the MV-Insert and MV-Collect operations, starting with MV-Insert.

MV-Collect(Q)

1. Flatten the log-reducer tree so that all subvectors in the log appear in a contiguous array collected-subvectors.
2. Sort the subvectors in collected-subvectors by their vector indices using a stable sort.
3. Create the array vector-offsets, where vector-offsets[k] stores the index of the first subvector in collected-subvectors that contains elements of the vector Ck ∈ Q.
4. Reset Q.log-reducer, and for p = 0, 1, . . . , P − 1, reset Q[p].
5. Return the pair ⟨vector-offsets, collected-subvectors⟩.

Figure 2-15: Pseudocode for the MV-Collect multivector operation. Calling MV-Collect on a multivector Q produces a pair ⟨vector-offsets, collected-subvectors⟩ of arrays, where collected-subvectors contains all nonempty subvectors in Q sorted by their associated vector's color, and vector-offsets associates sets of subvectors in Q with their corresponding vector.

MV-Insert(Q, v, k)
14  p ← Get-Worker-ID()
15  if worker p began a new trace since last insert
16      reset Q[p]
17  if Q[p].array[k] == nil
18      Q[p].array[k] ← new subvector
19      L ← Get-Local-View(Q.log-reducer)
20      Append(L.sublog, Q[p].array[k])
21  Append(Q[p].array[k], v)

Figure 2-16: Pseudocode for the MV-Insert multivector operation. MV-Insert(Q, v, k) inserts an element v into the kth vector Ck maintained by the multivector Q.

Lemma 7 Executing MV-Insert takes Θ(1) time in the worst case.

Proof. Resetting the SPA Q[p] on line 16 can be done in Θ(1) worst-case time with an appropriate SPA implementation, and appending a new subvector to a log tree takes Θ(1) time. The lemma thus follows from the analysis of MB-Insert in Lemma 1.

Lemma 8 bounds the work and span of MV-Collect.

Lemma 8 Consider a computation A with span T∞(A), and suppose that the contents of a multivector Q of χ vectors are distributed across m subvectors. Then a call to MV-Collect(Q) incurs Θ(m + χ) work and Θ(lg m + χ + T∞(A)) span.

Proof. Flattening the log-tree reducer in Step 1 is accomplished in two steps. First, the Flatten operation writes the nodes of the log tree to a contiguous array. Execution of Flatten has span proportional to the depth of the log tree, which is bounded by O(T∞(A)), since at most O(T∞(A)) reduction operations can occur along any path in A, and Reduce for log trees executes in Θ(1) work [115]. Second, using a parallel-prefix sum computation, the log entries associated with each node in the log tree can be packed into a contiguous array, incurring Θ(m) work and Θ(lg m) span. Step 1 thus incurs Θ(m) work and O(lg m + T∞(A)) span. The remaining steps of MV-Collect, which are analogous to those of MB-Collect and analyzed in Lemma 2, execute in Θ(χ + lg m) span.

2.9 Analysis and evaluation of Prism-R

This section presents a theoretical work-span analysis of Prism-R, demonstrating that its work and span are asymptotically equivalent to those of Prism. This section also discusses Prism-R's empirical performance relative to Prism, which was evaluated in Section 2.6. In particular, Prism-R is only 2–7% slower than Prism overall, while providing deterministic support for associative operations on global variables.

Work-span analysis of Prism-R

We begin by analyzing the work and span of Prism-R for simple data-graph computations that perform associative operations on global variables. In this extended model, Prism-R executes dynamic data-graph computations deterministically while achieving the same work and span bounds as Prism.

Theorem 9 Let G be a degree-∆ data graph. Suppose that Prism-R colors G using χ colors. Then Prism-R executes updates on all vertices in the activation set Qr for a round r of a simple data-graph computation ⟨G, f, Q0⟩ in O(size(Qr)) work and O(χ(lg(Qr/χ) + lg ∆)) span.

Proof. Prism-R can perform a priority write to its active array with Θ(1) work, and it can remove duplicates from the output of MV-Collect in O(size(Qr)) work and O(lg(size(Qr))) = O(lg Qr + lg ∆) span. The theorem follows by applying Lemmas 7 and 8 appropriately to the analysis of Prism in Theorem 4.

Theorem 10 Suppose that Prism-R colors a degree-∆ data graph G = (V, E) using χ colors, and then executes the data-graph computation ⟨G, f, Q0⟩ in r rounds applying updates to the activation sets Q0, Q1, . . . , Qr−1. Define the multiset U = Q0 ⊎ Q1 ⊎ · · · ⊎ Qr−1 so that |U| = Σ_{i=0}^{r−1} |Qi| and size(U) = Σ_{i=0}^{r−1} size(Qi). Then Prism-R executes the data-graph computation using O(size(U)) work and O(r · χ(lg((U/r)/χ) + lg ∆)) span.

Proof. By Theorem 9, Prism-R executes a round of a data-graph computation using the same asymptotic work and span as Prism. We mirror the arguments in Theorem 5 to bound the work and span of Prism-R for a sequence of rounds.

Given Theorem 10, we can compute the parallelism of Prism-R for a data-graph computation that applies a multiset U of updates over r rounds. The following corollary expresses the parallelism of Prism-R in terms of the average size of the activation sets in a sequence of rounds.

Corollary 11 Suppose Prism-R executes a data-graph computation in r rounds during which it applies a multiset U of updates. Define the average number of updates per round Uavg = |U|/r and the average work per round Wavg = size(U)/r. Then Prism-R has Ω(Wavg/(χ(lg(Uavg/χ) + lg ∆))) parallelism.

Proof. Follows from Theorem 10 by computing the parallelism as the ratio of the work and span and then performing substitution.
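As a purely hypothetical illustration of Corollary 11 (the numbers below are chosen only for concreteness and do not come from the experiments), consider a computation with χ = 20 colors, degree ∆ = 32, an average of Uavg = 10^6 updates per round, and Wavg = 10^7 work per round. The parallelism bound is then on the order of

Wavg/(χ(lg(Uavg/χ) + lg ∆)) = 10^7/(20 · (lg 50,000 + lg 32)) ≈ 10^7/(20 · 20.6) ≈ 24,000,

which comfortably exceeds the roughly 10P parallelism needed to expect near-linear speedup on a modest number P of processors.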

Empirical evaluation of Prism-R

Prism-R provides deterministic support for associative operations on global variables at the cost of additional complexity versus Prism, specifically in the maintenance of activation sets. Nonetheless, Prism-R guarantees the same asymptotic work and span as Prism. Empirically, we find that Prism-R suffers a geometric-mean slowdown of only 2–7% versus Prism across various scenarios. In particular, the 12-core performance for each dynamic data-graph computation application featured in Figure 2-2 demonstrates that for real-world applications Prism-R is 7% slower in geometric mean than Prism. In Figure 2-8, we see that Prism-R is only 1.8% slower than Prism for static versions of the applications featured in Figure 2-2 (i.e., all vertices are updated every round). Finally, in Figure 2-7 we present the 12-core performance of Prism-R on PageRank [51] for a suite of six synthetic and six real-world graphs. In this case, Prism-R is 3.5% slower in geometric mean than Prism.

2.10 Conclusion

Researchers over multiple decades have soberly advised the rest of the community that the difficulty of parallel programming can be greatly reduced by using some form of deterministic parallelism [281, 143, 125, 326, 32, 105, 92, 93, 160, 23, 24, 274, 354, 43]. With a deterministic parallel program, the programmer observes no logical concurrency, that is, no nondeterminacy in the behavior of the program due to the relative and nondeterministic timing of communicating processes (e.g., when two processes try to acquire a lock simultaneously). The semantics of a deterministic parallel program are therefore serial, and reasoning about such a program's correctness is theoretically no harder than reasoning about the correctness of a serial program, which is already sufficiently hard for most people. Testing, debugging, and formal verification are simplified by determinism, because there is no need to consider all possible relative timings (i.e., interleavings) of operations on shared mutable state.

The behavior of Prism corresponds to a variant of Serial-DDGC that sorts the activated vertices in its queue by color at the start of each round. Whether Prism executes a given data graph on 1 processor or many, it always behaves the same way. With Prism-R, this property holds even when the update function can perform reductions (e.g., associative operators on global variables). By contrast, lock-based schedulers provide no such guarantee of determinism. Instead, updates in a round executed by a lock-based scheduler appear to execute according to some linear order, the so-called sequential consistency model employed by GraphLab [234, 233] and others. This order is nondeterministic due to races on the acquisition of locks.

Blelloch, Fineman, Gibbons, and Shun [33] argue that deterministic programs can be fast compared with nondeterministic programs, and they document many examples where the overhead for converting a nondeterministic program into a deterministic one is small. They even document a few cases where this "price of determinism" is slightly negative. To their list, we add the execution of dynamic data-graph computations as having a price of determinism that is significantly negative.

2.11 Acknowledgments

Thanks to Guy Blelloch of Carnegie Mellon University for sharing utility functions from his Problem Based Benchmark Suite [320]. Thanks to Aydın Buluç of Lawrence Berkeley Laboratory for helping us in our search for collections of large sparse graphs. Thanks to of University of Maryland for helping us track down early work on parallel sorting. Thanks to Fredrik Kjolstad, Angelina Lee, and Justin Zhang of MIT CSAIL and Guy Blelloch, Julian Shun, and Harsha Vardhan Simhadri of Carnegie Mellon University for providing helpful discussions.

Chapter 3

Ordering Heuristics for Parallel Graph Coloring

This chapter discusses efficient parallel algorithms for coloring graphs that are deterministic and have the semantics of the commonly used serial graph-coloring algorithm from Welsh and Powell. The serial semantics of the parallel algorithm described in this chapter are leveraged to design principled ordering heuristics for parallel graph coloring that can be viewed and analyzed as coarsened variants of existing heuristics used in serial codes. This work was conducted in collaboration with William Hasenplaugh, Tao B. Schardl, and Charles E. Leiserson.

Abstract

This chapter introduces the largest-log-degree-first (LLF) and smallest-log-degree-last (SLL) ordering heuristics for parallel greedy graph-coloring algorithms, which are inspired by the largest-degree-first (LF) and smallest-degree-last (SL) serial heuristics, respectively. We show that although LF and SL, in practice, generate colorings with relatively small numbers of colors, they are vulnerable to adversarial inputs for which any parallelization yields a poor parallel speedup. In contrast, LLF and SLL allow for provably good speedups on arbitrary inputs while, in practice, producing colorings of competitive quality to their serial analogs.

We applied LLF and SLL to the parallel greedy coloring algorithm introduced by Jones and Plassmann, referred to here as JP. Jones and Plassmann analyze the variant of JP that processes the vertices of a graph in a random order, and show that on an O(1)-degree graph G = (V, E), this JP-R variant has an expected parallel running time of O(lg V/lg lg V) in a PRAM model. We improve this bound to show, using work-span analysis, that JP-R, augmented to handle arbitrary-degree graphs, colors a graph G = (V, E) with degree ∆ using Θ(V + E) work and O(lg V + lg ∆ · min{√E, ∆ + lg ∆ lg V/lg lg V}) expected span. We prove that JP-LLF and JP-SLL — JP using the LLF and SLL heuristics, respectively — execute with the same asymptotic work as JP-R and only logarithmically more span while producing higher-quality colorings than JP-R in practice.

We engineered an efficient implementation of JP for modern shared-memory multicore computers and evaluated its performance on a machine with 12 Intel Core-i7 (Nehalem) processor cores. Our implementation of JP-LLF achieves a geometric-mean speedup of 7.83 on eight real-world graphs and a geometric-mean speedup of 8.08 on ten synthetic graphs, while our implementation using SLL achieves a geometric-mean speedup of 5.36 on these real-world graphs and a geometric-mean speedup of 7.02 on these synthetic graphs.

Furthermore, on one processor, JP-LLF is slightly faster than a well-engineered serial greedy algorithm using LF, and likewise, JP-SLL is slightly faster than the greedy algorithm using SL.

3.1 Introduction

Graph coloring is a heavily studied problem with many real-world applications, including the scheduling of conflicting jobs [341, 244, 10, 109], register allocation [64, 63, 50], high-dimensional nearest-neighbor search [22], and sparse-matrix computation [180, 297, 75], to name just a few. Formally, a (vertex-)coloring of an undirected graph G = (V, E) is an assignment of a color color(v) to each vertex v ∈ V such that for every edge (u, v) ∈ E, we have color(u) ≠ color(v), that is, no two adjacent vertices have the same color. The graph-coloring problem is the problem of determining a coloring which uses as few colors as possible.

We were motivated to work on graph coloring in the context of "chromatic scheduling" [26, 1, 184] of parallel "data-graph computations." A data graph is a graph with data associated with its vertices and edges. A data-graph computation is an algorithm implemented as a sequence of "updates" on the vertices of a data graph G = (V, E), where updating a vertex v ∈ V involves computing a new value associated with v as a function of v's old value and the values associated with the neighbors of v: the set of vertices adjacent to v in G, denoted Adj[v] = {u ∈ V : (v, u) ∈ E}. To ensure atomicity of each update, rather than using mutual-exclusion locks or other nondeterministic means of data synchronization, chromatic scheduling first colors the vertices of G and then sequences through the colors, scheduling all vertices of the same color in parallel. The time to perform a data-graph computation thus depends both on how long it takes to color G and on the number of colors produced by the graph-coloring algorithm: more colors means less parallelism. Although the coloring can be performed offline for some data-graph computations, for other computations the coloring must be produced online, and one must accept a trade-off between coloring quality — number of colors — and the time to produce the coloring.

Although the problem of finding an optimal coloring of a graph — a coloring using the fewest colors possible — is NP-complete [120], heuristic "greedy" algorithms work reasonably well in practice. Welsh and Powell [341] introduced the original greedy coloring algorithm, which iterates over the vertices and assigns each vertex the smallest color not assigned to a neighbor. For a graph G = (V, E), define the degree of a vertex v ∈ V by deg(v) = |Adj[v]|, the number of neighbors of v, and let the degree of G be ∆ = max_{v∈V} {deg(v)}. Welsh and Powell show that the greedy algorithm colors a graph G with degree ∆ using at most ∆ + 1 colors.

Ordering heuristics

In practice, however, greedy coloring algorithms tend to produce much better colorings than the ∆ + 1 bound implies, and moreover, the order in which a greedy coloring algorithm colors the vertices affects the quality of the coloring.1 To reduce the number of colors a greedy coloring algorithm uses, practitioners therefore employ ordering heuristics to determine the order in which the algorithm colors the vertices [179, 4, 48, 246].

1In fact, for any graph G = (V,E), some ordering of V causes a greedy algorithm to color G optimally, although finding such an ordering is NP-hard [260].

The literature includes many studies of ordering heuristics and how they affect running time and coloring quality. Here are six of the more popular heuristics:

FF  The first-fit ordering heuristic [341, 232] colors vertices in the order they appear in the input graph representation.
R   The random ordering heuristic [179] colors vertices in a uniformly random order.
LF  The largest-degree-first ordering heuristic [341] colors vertices in order of decreasing degree.
ID  The incidence-degree ordering heuristic [75] iteratively colors an uncolored vertex with the largest number of colored neighbors.
SL  The smallest-degree-last ordering heuristic [246, 4] colors the vertices in the order induced by first removing all the lowest-degree vertices from the graph, then recursively coloring the resulting graph, and finally coloring the removed vertices.
SD  The saturation-degree ordering heuristic [48] iteratively colors an uncolored vertex whose colored neighbors use the largest number of distinct colors.

The experimental results overviewed in the Appendix (Section 3.11) indicate that we have listed these heuristics in rough order of coloring quality from worst to best, confirming the findings of Gebremedhin and Manne [122], who also rank the relative quality of R, LF, ID, and SD in this order.

Although an ordering heuristic can be viewed as producing a permutation of the vertices of a graph G = (V, E), we shall find it convenient to think of an ordering heuristic H as producing an injective (1-to-1) priority function ρ : V → R.2 We shall use the notation ρ ∈ H to mean that the ordering heuristic H produces a priority function ρ.

Figure 3-1 gives the pseudocode for Greedy, a greedy coloring algorithm. Greedy takes a vertex-weighted graph G = (V, E, ρ) as input, where ρ : V → R is a priority function produced by some ordering heuristic. Each step of Greedy simply selects the uncolored vertex with the highest priority according to ρ and colors it with the smallest available color. Generally, for a coloring algorithm A and ordering heuristic H, let A-H denote the coloring algorithm A that runs on vertex-weighted graphs whose priority functions are produced by H. In this way, we separate the behavior of the coloring algorithm from that of the ordering heuristic.

Greedy, using any of these six ordering heuristics, can be made to run in Θ(V + E) time theoretically. Although some of these ordering heuristics involve more bookkeeping than others, achieving these theoretical bounds for Greedy-FF, Greedy-R, Greedy-LF, Greedy-ID, and Greedy-SL is straightforward [129, 246]. Despite conjectures to the contrary [129, 75], Greedy-SD can also be made to run in Θ(V + E) time, as we shall show in Section 3.8. In practice, producing a better-quality coloring tends to cost more in running time. That is, the six heuristics, which are listed in increasing order of coloring quality, are also listed in increasing order of running time. The only exception is Greedy-ID, which is dominated

2If the rule for an ordering heuristic allows for ties in the priority function (the priority function is not injective), we shall assume that ties are broken randomly. Formally, suppose that an ordering heuristic H produces a priority function ρH which may contain ties. We extend ρH to a priority function ρ that maps each vertex v ∈ V to an ordered pair ⟨ρH(v), ρR(v)⟩, where the priority function ρR is produced by the random ordering heuristic R. To determine which of two vertices u, v ∈ V has higher priority, we compare the ordered pairs ρ(u) and ρ(v) lexicographically. Notwithstanding this subtlety, we shall still adopt the simplifying convenience of viewing the priority function as mapping vertices to real numbers. In fact, the range of the priority function can be any linearly ordered set.

Greedy(G)
1  let G = (V, E, ρ)
2  for v ∈ V in order of decreasing ρ(v)
3      C ← {1, 2, . . . , deg(v) + 1}
4      for u ∈ Adj[v] such that ρ(u) > ρ(v)
5          C ← C − {color(u)}
6      color(v) ← min C

Figure 3-1: Pseudocode for a serial greedy graph-coloring algorithm. Given a vertex- weighted graph G = (V, E, ρ), where the priority of a vertex v ∈ V is given by ρ(v), Greedy colors each vertex v ∈ V in decreasing order according to ρ(v).
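The following C++ sketch is one possible serial realization of the Greedy pseudocode in Figure 3-1, assuming an adjacency-list graph and a precomputed priority array; it is meant only as an illustration, not the tuned implementation evaluated later in this chapter.

#include <algorithm>
#include <numeric>
#include <vector>

// Returns color[v] in {1, ..., deg(v)+1} for every vertex v, coloring
// vertices in decreasing order of priority rho(v), as in Figure 3-1.
std::vector<int> greedy_color(const std::vector<std::vector<int>>& adj,
                              const std::vector<double>& rho) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> order(n);
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(),
            [&](int a, int b) { return rho[a] > rho[b]; });
  std::vector<int> color(n, -1);  // -1 means "not yet colored"
  for (int v : order) {
    // Only already-colored neighbors matter; because vertices are colored in
    // decreasing priority order, these are exactly the u with rho(u) > rho(v).
    std::vector<bool> used(adj[v].size() + 2, false);
    for (int u : adj[v]) {
      if (color[u] >= 1 && color[u] <= static_cast<int>(adj[v].size()) + 1)
        used[color[u]] = true;
    }
    int c = 1;
    while (used[c]) ++c;   // smallest color not used by a colored neighbor
    color[v] = c;
  }
  return color;
}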

JP(G)
 7  let G = (V, E, ρ)
 8  parallel for v ∈ V
 9      v.pred = {u ∈ V : (u, v) ∈ E and ρ(u) > ρ(v)}
10      v.succ = {u ∈ V : (u, v) ∈ E and ρ(u) < ρ(v)}
11      v.counter ← |v.pred|
12  parallel for v ∈ V
13      if v.pred == ∅
14          JP-Color(v)

JP-Color(v)
15  color(v) ← Get-Color(v)
16  parallel for u ∈ v.succ
17      if Join(u.counter) == 0
18          JP-Color(u)

Get-Color(v)
19  C ← {1, 2, . . . , |v.pred| + 1}
20  parallel for u ∈ v.pred
21      C = C − {color(u)}
22  return min C

Figure 3-2: The Jones-Plassmann parallel coloring algorithm. JP uses a recursive helper function JP-Color to process a vertex once all of its predecessors have been colored. JP-Color uses the helper routine Get-Color to find the smallest color available to color a vertex v.

by Greedy-SL in both coloring quality and runtime. The experiments discussed in the Appendix (Section 3.11) summarize our empirical findings for serial greedy coloring.

Parallel greedy coloring

There is a historical tension between coloring quality and the parallel scalability of greedy graph coloring. While the traditional ordering heuristics FF, LF, ID, and SL are efficient using Greedy, it can be shown that any parallelization of them requires worst-case span of Ω(V) for a general graph G = (V, E). Of the various attempts to parallelize greedy coloring [236, 96, 74], the algorithm first proposed by Jones and Plassmann [179] extends the greedy algorithm in a straightforward manner, uses work linear in the size of the graph, and is deterministic given a random seed. Jones and Plassmann's original paper demonstrates good parallel performance for O(1)-degree graphs using the random ordering heuristic R.

Unfortunately, in practice, R tends to produce colorings of relatively poor quality compared with the other traditional ordering heuristics. But the other traditional ordering heuristics are all vulnerable to adversarial graph inputs which cause JP to operate in Ω(V) time and thus exhibit poor parallel scalability. Consequently, there is a need for new ordering heuristics for JP that can achieve both good coloring quality and guaranteed fast parallel performance.

Figure 3-2 gives the pseudocode for JP, which colors a given graph G = (V, E, ρ) in the order specified by the priority function ρ. The algorithm begins in lines 9 and 10 by partitioning the neighbors of each vertex into predecessors — vertices with larger priorities — and successors — vertices with smaller priorities. JP uses the recursive JP-Color helper function to color a vertex v ∈ V once all vertices in v.pred have been colored. Initially, lines 12–14 in JP scan the vertices of V to find every vertex that has no predecessors and color each one using JP-Color. Within a call to JP-Color(v), line 15 calls Get-Color to assign a color to v, and the loop on lines 16–18 broadcasts in parallel to all of v's successors the fact that v is colored. For each successor u ∈ v.succ, line 17 tests whether all of u's predecessors have already been colored, and if so, line 18 recursively calls JP-Color on u.

Jones and Plassmann analyze the performance of JP-R for O(1)-degree graphs. Although they do not discuss using the naive FF ordering heuristic, it is apparent that there exist adversarial input orderings for which their algorithm would fail to scale. For example, if the graph G = (V, E) is simply a chain of vertices and the input order of V corresponds to their order in the chain, JP-FF exhibits no parallelism. Jones and Plassmann show that a random ordering produced by R, however, allows the algorithm to run in O(lg V/lg lg V) expected time on this chain graph — and on any O(1)-degree graph, for that matter. Section 3.3 of this chapter extends their analysis of JP-R to arbitrary-degree graphs.

Although JP-R scales well in theory, as well as in practice, when it comes to coloring quality, R is one of the weaker ordering heuristics, as we have noted. Of the other heuristics, JP-LF and JP-SL suffer from the same problem as FF, namely, it is possible to construct adversarial graphs that cause them to scale poorly, which we explore in Section 3.4. The ID heuristic tends to produce worse colorings than SL, and since Greedy-ID also runs more slowly than Greedy-SL, we have dropped ID from consideration. Moreover, because of our motivation to use the coloring algorithm for online chromatic scheduling, where the performance of the coloring algorithm cannot be sacrificed for marginal improvements in the quality of coloring, we also have dropped the SD heuristic. Since SD produces the best-quality colorings of the six ordering heuristics, however, we see parallelizing it as an interesting opportunity for future research.

Consequently, this chapter focuses on alternatives to the LF and SL ordering heuristics that provide comparable coloring quality while exhibiting the same resilience to adversarial graphs that R shows compared with FF. Specifically, we introduce two new randomized ordering heuristics — "largest log-degree first" (LLF) and "smallest log-degree last" (SLL) — which resemble LF and SL, respectively, but which scale provably well when used with JP.
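As a hedged sketch of the coarsening idea behind LLF, the following C++ fragment computes a priority for each vertex from its log-degree and breaks ties with a random value, as in the random heuristic R. The exact rounding and tie-breaking conventions used by the thesis's LLF are specified later in the chapter, so the details below are illustrative assumptions only.

#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Returns, for each vertex, a priority pair compared lexicographically;
// vertices with larger pairs are colored first by JP.
std::vector<std::pair<int, double>> llf_priorities(
    const std::vector<std::vector<int>>& adj, unsigned seed = 0) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<std::pair<int, double>> rho(adj.size());
  for (std::size_t v = 0; v < adj.size(); ++v) {
    // Primary key: the (rounded-down) log of the degree, so vertices whose
    // degrees differ by less than a factor of two share a key.
    int log_deg =
        static_cast<int>(std::floor(std::log2(adj[v].size() + 1.0)));
    rho[v] = {log_deg, unif(gen)};  // random secondary key breaks ties
  }
  return rho;
}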
We demonstrate that JP-LLF and JP-SLL provide good parallel scalability in theory and practice and are resilient to adversarial graphs. Table 3.1 summarizes our empirical findings. The data suggest that the LLF and SLL ordering heuristics produce colorings that are nearly as good as LF and SL, respectively. With respect to performance, our implementations of JP-LLF and JP-SLL actually operate slightly faster on 1 processor than our highly tuned implementations of Greedy-LF and Greedy-SL, respectively, and they scale comparably to JP-R.

H    H′     C_H′/C_H    Greedy-H/JP-H′_1    JP-H′_1/JP-H′_12
FF   R      1.011       0.417               7.039
LF   LLF    1.021       1.058               7.980
SL   SLL    1.037       1.092               6.082

Table 3.1: Summary of ordering-heuristic behavior on a suite of 8 real-world graphs and 10 synthetic graphs when run on a machine with 12 Intel Xeon X5650 processor cores. Column H lists three serial heuristics traditionally used for Greedy, and column H′ lists parallel heuristics for JP, of which LLF and SLL are introduced in this chapter. Column "C_H′/C_H" shows the geometric mean of the ratio of the number of colors the parallel heuristic uses compared to the serial heuristic. Column "Greedy-H/JP-H′_1" shows the geometric mean of the ratio of serial running times of Greedy with the serial heuristic versus JP with the analogous parallel heuristic when run on 1 processor. Column "JP-H′_1/JP-H′_12" shows the geometric mean of the speedup of each parallel heuristic going from 1 processor to 12.

Outline

The remainder of this chapter is organized as follows. Section 3.2 reviews the asynchronous parallel greedy coloring algorithm first proposed by Jones and Plassmann [179]. We show how JP can be extended to handle arbitrary-degree graphs and arbitrary priority functions. Using work-span analysis [77, Ch. 27], we show that JP colors a ∆-degree graph G = (V, E, ρ) in Θ(V + E) work and O(L lg ∆ + lg V) span, where L is the length of the longest path in G along which the priority function ρ decreases. Section 3.3 analyzes the performance of JP-R, showing that it operates using linear work and O(lg V + lg ∆ · min{√E, ∆ + lg ∆ lg V/lg lg V}) span. Section 3.4 shows that there exist "adversarial" graphs for which JP-LF and JP-SL exhibit limited parallel speedup. Section 3.5 analyzes the LLF and SLL ordering heuristics. We show that, given a ∆-degree graph G, JP-LLF colors G = (V, E, ρ) using Θ(V + E) work and O(lg V + lg ∆(min{∆, √E} + lg²∆ lg V/lg lg V)) expected span, while JP-SLL colors G = (V, E, ρ) using the same work and an additional Θ(lg ∆ lg V) span. Section 3.6 evaluates the performance of JP-LLF and JP-SLL on a suite of 8 real-world and 10 synthetic benchmark graphs. Section 3.7 discusses the software engineering techniques used in our implementation of JP-R, JP-LLF, and JP-SLL. Section 3.8 introduces an algorithm for computing the SD ordering heuristic using Θ(V + E) work. Section 3.9 discusses related work, and Section 3.10 offers some concluding remarks. The Appendix (Section 3.11) presents some experimental results for serial ordering heuristics.

3.2 The Jones-Plassmann algorithm

This section reviews JP, the parallel greedy coloring algorithm introduced by Jones and Plassmann [179], whose pseudocode is given in Figure 3-2. We first review the dag model of dynamic multithreading and work-span analysis [77, Ch. 27]. Then we describe how JP can be modified from Jones and Plassmann's original algorithm to handle arbitrary-degree graphs and arbitrary priority functions. We analyze JP with an arbitrary priority function ρ and show that on a ∆-degree graph G = (V, E, ρ), JP runs in Θ(V + E) work and O(L lg ∆ + lg V) span, where L is the length of the longest path in the "priority dag" of G induced by ρ.

The dag model of dynamic multithreading

We shall analyze the parallel performance of JP using the dag model of dynamic multithreading introduced by Blumofe and Leiserson [40, 41] and described in tutorial fashion in [77, Ch. 27]. The dag model views the executed computation resulting from running a parallel algorithm as a computation dag A, in which each vertex denotes an instruction, and edges denote parallel control dependencies between instructions. Although the model encompasses other parallel control constructs, for our purposes, we need only understand that the execution of a parallel for loop can be modeled as a balanced binary tree of vertices in the dag, where the leaves of the tree denote the initial instructions of the loop iterations.

To analyze the performance of a dynamic multithreading program theoretically, we assume that the program executes on an ideal parallel computer: each instruction executes in unit time, the computer has ample memory bandwidth, and the computer supports concurrent writes and read-modify-write instructions [151] without incurring overheads due to contention. Given a dynamic multithreading program whose execution is modeled as a dag A, we can bound the parallel running time TP(A) of the computation as follows. The work T1(A) is the number of strands in the computation dag A. The span T∞(A) is the length of the longest path in A. A deterministic algorithm with work T1 and span T∞ can always be executed on P processors in time TP satisfying max{T1/P, T∞} ≤ TP ≤ T1/P + T∞ [49, 99, 40, 41, 137]. The speedup of an algorithm on P processors is T1/TP, which is at most P in theory, since TP ≥ T∞. The parallelism T1/T∞ is the greatest theoretical speedup possible for any number P of processors.
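For a concrete, purely hypothetical illustration of these bounds, a computation with work T1 = 10^9 and span T∞ = 10^6 has parallelism T1/T∞ = 1000. On P = 12 processors, the inequality above pins the running time to max{10^9/12, 10^6} ≈ 8.3 × 10^7 ≤ T12 ≤ 10^9/12 + 10^6 ≈ 8.4 × 10^7, so the 12-core speedup is essentially the full factor of 12.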

Analysis of JP

To analyze the performance of JP, it is convenient to think of the algorithm as coloring the vertices in the partial order of a "priority dag," similar to the priority dag described by Blelloch et al. [34]. Specifically, on a vertex-weighted graph G = (V, E, ρ), the priority function ρ induces a priority dag Gρ = (V, Eρ), where Eρ = {(u, v) ∈ V × V : (u, v) ∈ E and ρ(u) > ρ(v)}. Notice that Gρ is a dag, because ρ is an injective function and thus induces a total order on the vertices V. We shall bound the span of JP running on a graph G in terms of the depth of Gρ, that is, the length of the longest path through Gρ.

We analyze JP in two steps. First, we bound the work and span of calls during the execution of JP to the helper routine Get-Color(v), which returns the minimum color not assigned to any vertex u ∈ v.pred.

Lemma 12 The helper routine Get-Color, shown in Figure 3-2, can be implemented so that during the execution of JP on a graph G = (V, E, ρ), a call to Get-Color(v) for a vertex v ∈ V costs Θ(k) work and Θ(lg k) span, where k = |v.pred|.

Proof. Implement the set C in Get-Color as an array whose ith entry initially stores the value i. The ith element from this array can be removed by setting the ith element to ∞. With this implementation, lines 20–21 execute in Θ(k) work and Θ(lg k) span. The min operation on line 22 can be implemented as a parallel minimum reduction in the same bounds.

Second, we show that JP colors a graph G = (V, E, ρ) using work Θ(V + E) and span linear in the depth of the priority dag Gρ.
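A serial C++ rendering of the array-based Get-Color implementation described in the proof of Lemma 12 appears below. In JP itself, both loops are parallel loops and the final minimum is a parallel reduction; the 0-indexed layout is an assumption of this sketch.

#include <algorithm>
#include <limits>
#include <vector>

// pred_colors holds the colors already assigned to v's predecessors (v.pred).
int get_color(const std::vector<int>& pred_colors) {
  const int k = static_cast<int>(pred_colors.size());
  // Slot i initially stores color i + 1, covering the colors 1 .. k + 1.
  std::vector<int> C(k + 1);
  for (int i = 0; i <= k; ++i) C[i] = i + 1;   // parallel for in JP
  // "Remove" a predecessor's color by overwriting its slot with infinity.
  for (int c : pred_colors) {                  // parallel for in JP
    if (c >= 1 && c <= k + 1) C[c - 1] = std::numeric_limits<int>::max();
  }
  // The smallest remaining color is a minimum reduction over C.
  return *std::min_element(C.begin(), C.end());
}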

Theorem 13 Given a ∆-degree graph G = (V, E, ρ) for some priority function ρ, let Gρ be the priority dag induced on G by ρ, and let L be the depth of Gρ. Then JP(G) runs in Θ(V + E) work and O(L lg ∆ + lg V) span.

Proof. Let us first bound the work and span of JP-Color excluding any recursive calls. For a single call to JP-Color on a vertex v ∈ V, Lemma 12 shows that line 15 takes Θ(deg(v)) work and Θ(lg(deg(v))) span. The Join operation on line 17 can be implemented as an atomic decrement-and-fetch operation [151] on the specified counter. Hence, excluding the recursive call, the loop on lines 16–18 performs Θ(deg(v)) work and Θ(lg(deg(v))) span to decrement the counters of all successors of v. Because JP-Color is called once per vertex, the total work that JP spends in calls to JP-Color is Θ(V + E). Furthermore, the span of JP-Color is the length of any path of vertices in Gρ, which is at most L, times Θ(lg ∆). Finally, the loop on lines 8–11 executes in Θ(V + E) work and Θ(lg V + lg ∆) span, and the parallel loop on lines 12–14, excluding the call to JP-Color, executes in Θ(V + E) work and Θ(lg V) span.
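For illustration, the Join operation described in the proof can be sketched in C++ as an atomic decrement-and-fetch on a vertex's predecessor counter; the function name and return convention below are assumptions of this sketch, not the thesis's code.

#include <atomic>

// Returns the counter's value after decrementing it.  A return value of 0
// signals that the caller decremented the last outstanding predecessor, so
// it may recursively call JP-Color on the successor.
int join(std::atomic<int>& counter) {
  return counter.fetch_sub(1, std::memory_order_acq_rel) - 1;
}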

3.3 JP with random ordering

This section bounds the depth of a priority dag Gρ induced on a ∆-degree graph G = (V, E, ρ) by a random priority function ρ in R. We show that the expected depth of Gρ is O(min{√E, ∆ + lg ∆ lg V/lg lg V}). Combined with Theorem 13, this bound implies that the expected span of JP-R is O(lg V + lg ∆ · min{√E, ∆ + lg ∆ lg V/lg lg V}). This bound extends Jones and Plassmann's O(lg V/lg lg V) bound for the depth of Gρ when ∆ = Θ(1) [179].

To bound the depth of a priority dag Gρ induced on a graph G by ρ ∈ R, let us start by bounding the number of length-k paths in Gρ. Each path in Gρ corresponds to a unique simple path in G, that is, a path in which each vertex in G appears at most once. The following lemma bounds the number of length-k simple paths in G.

Lemma 14 The number of length-k simple paths in any ∆-degree graph G = (V, E) is at most |V| · min{∆^{k−1}, (2|E|/(k − 1))^{k−1}}.

Proof. Consider selecting a length-k simple path p = ⟨v1, . . . , vk⟩ in G. There are |V| choices for v1, and for all i ∈ {1, . . . , k − 1}, given a choice of ⟨v1, . . . , vi⟩, there are at most deg(vi) choices for vi+1. Hence there are at most J = |V| · ∏_{i=1}^{k−1} deg(vi) simple paths in G of length k. Let V_{k−1} ⊆ V denote some set of k − 1 vertices in V, and let δ = max_{V_{k−1}} {∑_{v∈V_{k−1}} deg(v)/(k − 1)} be the maximum average degree of any such set. Then we have J ≤ |V| · δ^{k−1}. The proof follows from two upper bounds on δ. First, because deg(v) ≤ ∆ for all v ∈ V, we have δ ≤ ∆. Second, for all V_{k−1} ⊆ V, we have ∑_{v∈V_{k−1}} deg(v) ≤ ∑_{v∈V} deg(v) = 2|E| by the handshaking lemma [77, p. 1172–3], and thus δ ≤ 2|E|/(k − 1).

Intuitively, the bound on the expected depth of Gρ follows by arguing that although the number of simple length-k paths in a graph G might be exponential in k, for sufficiently large k, the probability is tiny that any such path is a path in Gρ. To formalize this argument, we make use of the following technical lemma.

Lemma 15 Define the function g(α, β) for α, β > 1 as

g(α, β) = e² (ln α/ln β) ln(e β ln α/(α ln β)).

Then for all β ≥ e², α ≥ 2, and β ≥ α, we have g(α, β) ≥ 1.

Proof. We consider the cases when α ≥ e² and when α < e² separately. When α ≥ e², the partial derivative of g(α, β) with respect to β is

∂g(α, β)/∂β = e² (ln α/(β ln²β)) ln(α ln β/(e² ln α)) ≥ 0,

since α ln β/(e² ln α) ≥ 1 when α ≥ e² and β ≥ α. Thus, g(α, β) is a nondecreasing function in its second argument when α ≥ e² and β ≥ α. Since we have

g(α, α) = e²(ln α/ln α) ln(e(α ln α)/(α ln α)) ≥ 1,

it follows that g(α, β) ≥ 1 for α ≥ e² and β ≥ α. When e² > α ≥ 2, we make use of the fact that 2β/(e ln β) > √β for all β > e²:

g(α, β) ≥ (e² ln 2/ln β) ln(2β/(e ln β))
        ≥ (e² ln 2/ln β) ln √β
        ≥ (e² ln 2 · ln β)/(2 ln β)
        ≥ 1.

The following theorem applies Lemmas 14 and 15 to establish the bound on the depth of Gρ.

Theorem 16 Let G = (V, E) be a ∆-degree graph, let n = |V| and m = |E|, and let Gρ be a priority dag induced on G by a random priority function ρ ∈ R. For any constant ε > 0 and sufficiently large n, with probability at most n^(−ε), there exists a directed path of length e² · min{∆, √m} + (1 + ε) min{e² ln ∆ ln n/ln ln n, ln n} in Gρ.

Proof. Let p = ⟨v1, . . . , vk⟩ be a length-k simple path in G. Because ρ is a random priority function, ρ induces each possible permutation among {v1, . . . , vk} with equal probability. If p is a directed path in Gρ, then we must have that ρ(v1) < ρ(v2) < ··· < ρ(vk). Hence, p is a length-k path in Gρ with probability at most 1/k!. If J is the number of length-k simple paths in G, then by the union bound, the probability that a length-k directed path exists in Gρ is at most J/k!, which is at most J(e/k)^k by Stirling's approximation [77, p. 57].

We consider the cases when ∆ < ln n and ∆ ≥ ln n separately. First, suppose that ∆ < ln n. By Lemma 14, the number of length-k simple paths in G is at most n∆^(k−1) ≤ n∆^k. By the union bound, the probability that a length-k path exists in Gρ is at most n(e∆/k)^k. We assume, without loss of generality, that ∆ > 2, since the theorem holds for O(1)-degree graphs as a result of [179].

For ∆ ≥ 2, observe that, by Lemma 15, the function g(α, β) = e²(ln α/ln β) ln(e β ln α/(α ln β)) is at least 1 for all α ≥ 2 and β ≥ e² with β ≥ α. Letting α = ∆, β = ln n, and k = e²(∆ + (1 + ε) ln ∆ ln n/ln ln n), we conclude that

n(e∆/k)^k = n · exp(−k ln(k/(e∆)))
          ≤ n · exp(−e²(1 + ε)(ln ∆/ln ln n)(ln n) ln(e ln n ln ∆/(∆ ln ln n)))
          = n · exp(−(1 + ε)(ln n) · g(∆, ln n))
          ≤ n e^(−(1+ε) ln n) = n^(−ε).

Next, given ∆ ≥ ln n, consider the cases when ∆ < √m and ∆ ≥ √m separately. When ∆ < √m, letting k = e²∆ + (1 + ε) ln n, the theorem follows from the facts that k ≥ (1 + ε) ln n and k ≥ e²∆. When ∆ ≥ √m, let k = e²√m + (1 + ε) ln n. By Lemma 14, the number of length-k simple paths is at most n(2m/(k − 1))^(k−1) ≤ n(4m/k)^k, and thus the probability that a length-k path exists in Gρ is at most n(4em/k²)^k. The theorem follows from the facts that k ≥ (1 + ε) ln n and k² ≥ e⁴m.

Corollary 17 Given a graph G = (V, E, ρ), where ρ ∈ R is a random priority function, the expected depth of the priority dag Gρ is O(min{√E, ∆ + lg ∆ lg V/lg lg V}), and thus JP-R colors all vertices of G with O(lg V + lg ∆ · min{√E, ∆ + lg ∆ lg V/lg lg V}) expected span.

Proof. Theorems 13 and 16 imply the corollary.

3.4 The LF and SL heuristics

This section shows that the largest-first (LF) and smallest-last (SL) ordering heuristics can inhibit parallel speedup when used by JP. We examine a "clique-chain" graph and show that JP-LF incurs Ω(∆²) span to color a ∆-degree clique-chain graph G = (V, E), whereas JP-R colors G incurring only O(∆ lg ∆ + lg²∆ lg V/lg lg V) expected span. We formally review the SL ordering heuristic and observe that this formulation of SL means that JP-SL requires Ω(V) span to color a path graph G = (V, E).

Tables 3.2 and 3.3 summarize the performance of FF, LF, and SL on a suite of 8 real-world and 10 synthetic benchmark graphs. The number of edges, the ratio of edges to vertices, and the maximum degree of each benchmark graph are given in Table 3.5.

The LF ordering heuristic

The LF ordering heuristic colors the vertices of a graph G = (V, E, ρ) for some ρ in LF in order of decreasing degree. Formally, ρ ∈ LF is defined for a vertex v ∈ V as ρ(v) = ⟨deg(v), ρR(v)⟩, where ρR is randomly chosen from R. Although LF has been used in parallel greedy graph-coloring algorithms in the past [4, 129], Figure 3-3 illustrates a ∆-degree "clique-chain" graph G = (V, E) for which JP-LF incurs Ω(∆²) span to color, but JP-R colors with only O(∆ lg ∆ + lg²∆ lg V/lg lg V) expected span.

Conceptually, the clique-chain graph comprises a set of cliques of increasing size that are connected in a "chain" such that JP-LF is forced to color these cliques sequentially from largest to smallest. Figure 3-3 illustrates a ∆-degree clique-chain graph G = (V, E), where 3 evenly divides ∆. This clique-chain graph contains a sequence of cliques K = {K1, K4, . . . , K∆−2} of increasing size, each pair of which is separated by two additional vertices forming a linear chain. Specifically, for r ∈ {1, 4, . . . , ∆ − 2}, each vertex u ∈ Kr is connected to each vertex v ∈ Kr+3 by a path ⟨u, xr+1, xr+2, v⟩ for distinct vertices xr+1, xr+2 ∈ V. Additional vertices, shown above the chain in Figure 3-3, ensure that the degree of each vertex in Kr is r + 2, and the degrees of the vertices xr+1 and xr+2 are r + 3 and r + 4, respectively. Clique-chain graphs of other degrees are structured similarly.

                            Greedy        JP
Graph           H    CH      TS       T1      T12   TS/T1  T1/T12
com-orkut       FF   175    2.23     4.16   0.817   0.54    5.09
                LF    87    3.54     6.43   1.067   0.55    6.02
                SL    83   10.59    12.94   8.264   0.82    1.57
liveJournal1    FF   352    0.89     1.69   0.275   0.52    6.15
                LF   323    2.34     2.89   0.365   0.81    7.91
                SL   322    4.69     4.76   2.799   0.98    1.70
europe-osm      FF     5    1.32      ∞       ∞      ∞       ∞
                LF     4   17.15     5.16   0.587   3.33    8.79
                SL     3   19.87      ∞       ∞      ∞       ∞
cit-Patents     FF    17    0.50     0.99   0.152   0.50    6.47
                LF    14    2.00     1.52   0.211   1.31    7.22
                SL    13    3.21     3.05   1.579   1.05    1.93
as-skitter      FF   103    0.24     0.55   0.109   0.45    5.00
                LF    71    2.43     0.69   0.133   3.51    5.21
                SL    70    2.79     1.19   0.733   2.35    1.62
wiki-Talk       FF   102    0.09     0.23   0.046   0.38    4.99
                LF    72    0.49     0.37   0.073   1.30    5.12
                SL    56    0.61     0.57   0.293   1.08    1.93
web-Google      FF    44    0.09     0.20   0.036   0.47    5.62
                LF    45    0.25     0.29   0.042   0.88    6.85
                SL    44    0.47     0.53   0.278   0.89    1.92
com-youtube     FF    57    0.06     0.16   0.027   0.39    6.07
                LF    32    0.25     0.24   0.040   1.03    6.12
                SL    28    0.35     0.36   0.181   0.98    1.99

Table 3.2: Performance measurements for a set of real-world graphs taken from Stanford's SNAP project [220]. The column heading H denotes that the priority function used for the experiment in a particular row was produced by the ordering heuristic listed in the column. The average number of colors used by the corresponding ordering heuristic and graph is CH. The time in seconds of Greedy, JP with 1 worker, and JP with 12 workers is given by TS, T1, and T12, respectively, where a value of ∞ indicates that the program crashed due to excessive stack usage. Details of the experimental setup and graph suite can be found in Section 3.6.

                            Greedy        JP
Graph           H    CH      TS       T1      T12   TS/T1  T1/T12
constant1M      FF    33    0.90     1.70   0.230   0.53    7.40
                LF    32    1.16     2.96   0.386   0.39    7.68
                SL    34    2.96     5.09   2.023   0.58    2.52
constant500K    FF    52    0.74     1.26   0.286   0.59    4.42
                LF    52    0.84     2.55   0.444   0.33    5.73
                SL    53    1.97     3.50   1.435   0.56    2.44
graph500-5M     FF   220    1.83     2.86   0.560   0.64    5.11
                LF   159    3.69     3.99   0.649   0.92    6.15
                SL   158    8.43     9.45   5.576   0.89    1.69
graph500-2M     FF   206    0.52     0.98   0.208   0.53    4.72
                LF   153    0.98     1.34   0.221   0.73    6.06
                SL   153    2.22     2.72   1.559   0.81    1.75
rMat-ER-2M      FF    12    0.47     1.11   0.169   0.42    6.60
                LF    11    1.07     1.72   0.204   0.62    8.45
                SL    11    2.22     3.07   1.362   0.72    2.25
rMat-G-2M       FF    27    0.48     0.88   0.130   0.55    6.74
                LF    15    1.18     1.42   0.200   0.83    7.09
                SL    15    2.59     3.09   1.712   0.84    1.81
rMat-B-2M       FF   105    0.50     0.84   0.151   0.60    5.53
                LF    67    1.00     1.28   0.191   0.79    6.68
                SL    67    2.41     2.84   1.691   0.85    1.68
big3dgrid       FF     4    0.41     1.68   0.173   0.24    9.69
                LF     7    4.07     1.53   0.198   2.66    7.72
                SL     7    4.77     2.60   1.074   1.83    2.42
cliqueChain400  FF   399    0.05     0.09   0.224   0.51    0.40
                LF   399    0.05      ∞       ∞      ∞       ∞
                SL   399    0.08     0.14   0.265   0.55    0.54
path-10M        FF     2    0.18      ∞       ∞      ∞       ∞
                LF     3    2.49     0.76   0.092   3.26    8.27
                SL     2    2.58      ∞       ∞      ∞       ∞

Table 3.3: Performance measurements for five classes of synthetically generated graphs: constant degree, rMat, 3D grid, clique chain, and path. The column headings are equivalent to those in Table 3.2.

Theorem 18 For any ∆ > 0, there exists a ∆-degree graph G = (V, E) such that JP-LF colors G in Ω(∆²) span and JP-R colors G in O(∆ lg ∆ + lg²∆ lg V/lg lg V) expected span.

Proof. Assume without loss of generality that 3 evenly divides ∆ and that G is a clique-chain graph. The span of JP-R follows from Corollary 17. Because JP-LF trivially requires Ω(1) span to process each vertex in G, the span of JP-LF on G can be bounded by showing that the length of the longest path p in the priority dag Gρ induced on G by any priority function ρ in LF is ∆²/6 + ∆/2 + 2. Because LF assigns higher priority to higher-degree vertices, p starts at some vertex in K∆−2, which has degree ∆, and passes through the ∆ − 2 vertices³ in K∆−2 followed by x∆−3 and x∆−4. The remainder of p is a longest path through the clique-chain graph G′ of degree ∆ − 3 in the remaining graph G − K∆−2 − {x∆−3, x∆−4}, which has a longest path p′ of length |p′| = (∆ − 3)²/6 + (∆ − 3)/2 + 2 by induction. The length of p is thus ∆ + |p′| = ∆²/6 + ∆/2 + 2.

³Notice that it does not matter how ties are broken in the priority function.

Figure 3-3: A ∆-degree clique-chain graph G, which Theorem 18 shows is adversarial for JP-LF. This graph contains Θ(∆²) vertices arranged as a chain of cliques. Each hexagon labeled Kr represents a clique of r vertices, and circles represent individual vertices. A thick edge between an individual vertex and a clique indicates that the vertex is connected to every vertex within the clique. A label below an individual vertex indicates the degree of the associated vertex, and a label below a clique indicates the degree of every vertex within that clique.

The SL ordering heuristic

We focus on the formulation of the SL ordering heuristic due to Allwright et al. [4], because our experiments indicate that it gives colorings using fewer colors than other formulations [246]. Given a graph G = (V, E), the SL ordering heuristic produces a priority function ρ via an iterative algorithm that assigns priorities to the vertices V in rounds to induce an ordering on V. For i ≥ 0, let Gi = (Vi, Ei) denote the subgraph of G remaining at the start of round i, and let δi denote an upper bound on the smallest degree of any vertex v ∈ Vi. Assume that δ0 = 1. At the start of round i, remove all vertices v ∈ Vi such that deg(v) ≤ max{δi−1, min_{v∈Vi}{deg(v)}}. For a vertex v removed in round i, a priority function ρ ∈ SL is defined as ρ(v) = ⟨i, ρR(v)⟩, where ρR ∈ R is a random priority function. The following theorem shows that there exist graphs for which JP-SL incurs a large span, whereas JP-R incurs only a small span.

Theorem 19 There exists a class of graphs such that for any G = (V, E, ρ) in the class and for any priority function ρ ∈ SL, JP-SL incurs Ω(V ) span and JP-R incurs O(lg V/lg lg V ) span.

Proof. Consider the algorithm to compute the priority function ρ for all vertices in a path graph G. By induction over the rounds, the graph Gi at the start of round i is a path with |V| − 2i + 2 vertices, and in round i the 2 vertices at the endpoints of Gi will be removed. Hence ⌈|V|/2⌉ rounds are required to assign priorities for all vertices in G. A similar argument shows that the resulting priority dag Gρ contains a path of length |V|/2 along which the priorities strictly decrease. JP-SL trivially incurs Ω(1) span through each vertex in the longest path in Gρ. Since there are Θ(V) total vertices along the path and by Corollary 17 with ∆ = Θ(1), the theorem follows.

We shall see in Section 3.5 that it is possible to achieve coloring quality comparable to LF and SL, but with guaranteed parallel performance comparable to JP-R.

3.5 Log ordering heuristics

This section describes the largest-log-degree-first (LLF) and smallest-log-degree-last (SLL) ordering heuristics. Given a ∆-degree graph G, we show that the expected depth of the priority dag Gρ induced on G by a priority function ρ ∈ LLF is O(min{∆, √E} + lg²∆ lg V/lg lg V). The same bound applies to the depth of a priority dag Gρ induced on a graph G by a priority function ρ ∈ SLL, though O(lg ∆ lg V) additional span is required to calculate ρ using the method given in Figure 3-4. Combined with Theorem 13, these bounds imply that the expected span of JP-LLF is O(lg V + lg ∆(min{∆, √E} + lg²∆ lg V/lg lg V)) and the expected span of JP-SLL is O(lg ∆ lg V + lg ∆(min{∆, √E} + lg²∆ lg V/lg lg V)).

The LLF ordering heuristic

The LLF ordering heuristic orders the vertices in decreasing order by the logarithm of their degree. More precisely, given a graph G = (V, E, ρ) for some ρ ∈ LLF, the priority of each v ∈ V is equal to ρ(v) = ⟨⌈lg(deg(v))⌉, ρR(v)⟩, where ρR ∈ R is a random priority function⁴ and lg x denotes log₂ x. For a given graph G, Theorem 20 below bounds the depth of the priority dag Gρ induced by ρ ∈ LLF.
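As a concrete illustration, an LLF priority can be computed with one integer-log operation per vertex; the pair compares lexicographically, so vertices whose degrees fall in the same power-of-two bucket are ordered by the random tie-breaker. The function below is a hypothetical sketch, not the implementation evaluated in Section 3.6.

    #include <cstdint>
    #include <utility>

    // ceil(lg d) for d >= 1, computed from the bit width of d - 1.
    // __builtin_clzll is a GCC/Clang intrinsic.
    static inline uint32_t ceil_lg(uint64_t d) {
        if (d <= 1) return 0;
        return 64 - __builtin_clzll(d - 1);
    }

    // Hypothetical LLF priority: the first component is ceil(lg(deg(v))),
    // the second is a per-vertex random value used to break ties, so that
    // lexicographically larger pairs denote higher priority.
    std::pair<uint32_t, uint64_t>
    llf_priority(uint64_t degree, uint64_t random_tiebreak) {
        return {ceil_lg(degree), random_tiebreak};
    }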

Theorem 20 Let G = (V, E) be a ∆-degree graph, and let Gρ be the priority dag induced on G by a priority function ρ ∈ LLF. The expected length of the longest directed path in Gρ is O(min{∆, √E} + lg²∆ lg V/lg lg V).

Proof. Consider a length-k path p = ⟨v1, . . . , vk⟩ in Gρ. Let G(ℓ) ⊆ Gρ be the subdag of Gρ induced by those vertices v ∈ V for which ⌈lg(deg(v))⌉ = ℓ. Suppose that vi ∈ G(ℓ) for some vi ∈ p. Since ⌈lg(deg(vi−1))⌉ ≥ ⌈lg(deg(vi))⌉ for all i > 1, we have vi−1 ∈ G(ℓ′) for some ℓ′ ≥ ℓ. We can therefore decompose p into a sequence of paths p = p⌈lg ∆⌉, . . . , p0 such that each subpath pℓ ∈ p is a path through G(ℓ). By definition of LLF, the subdag G(ℓ) is a dag induced on a graph with degree 2^ℓ by a random priority function. By Corollary 17, the expected length of pℓ is O(2^ℓ + ℓ lg V/lg lg V). Linearity of expectation therefore implies that

E[|p|] = Σ_{ℓ=0}^{⌈lg ∆⌉} O(2^ℓ + ℓ lg V/lg lg V) = O(∆ + lg²∆ lg V/lg lg V).

To establish the √E bound, observe that at most E/2^ℓ vertices have degree at least 2^ℓ. Consequently, for ℓ > lg √E, the depth of G(ℓ) can be at most E/2^ℓ. Hence we have

E[|p|] ≤ Σ_{ℓ=0}^{⌈lg √E⌉} O(2^ℓ) + Σ_{ℓ=⌈lg √E⌉}^{∞} O(E/2^ℓ) + Σ_{ℓ=0}^{⌈lg ∆⌉} O(ℓ lg V/lg lg V) = O(√E + lg²∆ lg V/lg lg V).

⁴The theoretical results in this section assume only that the base b of the logarithm is a constant. In practice, however, it is possible that the choice of b could have an impact on the coloring quality or runtime of JP-LLF. We studied this trade-off and found that there is only a minor dependence on b. In general, the coloring quality and runtime of JP-LLF smoothly transition from the behavior of JP-LF for small b to the behavior of JP-R for large b, sweeping out a Pareto-efficient frontier of reasonable choices. We chose b = 2 for our experiments, because log₂ x can be calculated conveniently by native instructions on modern architectures.

Corollary 21 Given a graph G = (V, E, ρ) for some ρ ∈ LLF, JP-LLF colors all vertices in G with expected span O(lg V + lg ∆(min{√E, ∆} + lg²∆ lg V/lg lg V)).

Proof. The corollary follows from Theorems 13 and 20.

The SLL ordering heuristic

To understand the SLL ordering heuristic, it is convenient to consider in isolation how to compute its priority function. The pseudocode in Figure 3-4 for SLL-Assign-Priorities describes algorithmically how to perform this computation on a given graph G = (V, E). As Figure 3-4 shows, a priority function ρ ∈ SLL can be computed by iteratively removing low-degree vertices from G in rounds. The priority of a vertex v ∈ V is the round number in which v is removed, with ties broken randomly. As with SL, SLL colors the vertices of G in the reverse order in which they are removed, but SLL-Assign-Priorities determines when to remove a vertex using a degree bound that grows exponentially. SLL-Assign-Priorities considers each degree bound for a maximum of r rounds. Effectively, a vertex is removed from G based on the logarithm of its degree in the remaining graph.

We can formalize the behavior of SLL as follows. Given a graph G, let Gi = (Vi, Ei) denote the subgraph of G remaining at the start of round i. As Figure 3-4 shows, for each d ∈ {0, 1, . . . , lg ∆}, SLL-Assign-Priorities executes r rounds in which it removes vertices v ∈ Vi such that deg(v) ≤ 2^d in Gi.⁵ For a given graph G, the following theorem bounds the depth of the priority dag Gρ induced by a priority function ρ ∈ SLL.

Theorem 22 Let G = (V, E) be a ∆-degree graph, and let Gρ be the priority dag induced on G by a random priority function ρ ∈ SLL. The expected length of the longest directed path in Gρ is O(min{∆, √E} + lg²∆ lg V/lg lg V).

⁵As with LLF, the degree cutoff 2^d on line 30 of Figure 3-4 could be b^d for an arbitrary constant base b with no harm to the theoretical results. We explored the choice of base empirically, but found that there was only a minor dependence on b. Generally, JP-SLL smoothly transitions from the behavior of JP-SL for small b to the behavior of JP-R for large b. We therefore chose b = 2 for our experiments because of its implementation simplicity.

SLL-Assign-Priorities(G, r)
23  let G = (V, E)
24  i ← 1
25  U ← V
26  let ∆ be the degree of G
27  let ρR ∈ R be a random priority function
28  for d ← 0 to lg ∆
29      for j ← 1 to r
30          Q ← {u ∈ U : |Adj[u] ∩ U| ≤ 2^d}
31          parallel for v ∈ Q
32              ρ(v) ← ⟨i, ρR(v)⟩
33          U ← U − Q
34          i ← i + 1
35  return ρ

Figure 3-4: Pseudocode for SLL-Assign-Priorities, which computes a priority function ρ ∈ SLL for the input graph. The input parameter r denotes the maximum number of times SLL-Assign-Priorities is permitted to remove vertices of at most a particular degree 2^d on lines 29–34.

Proof. We begin with an argument similar to the proof of Theorem 20. Let p = ⟨v1, . . . , vk⟩ be a length-k path in Gρ, and let G(ℓ) ⊆ Gρ be the subdag of Gρ induced by those vertices v ∈ V where ρ(v) = ℓ. Since lines 29–34 of SLL-Assign-Priorities remove vertices with degree at most 2^d exactly r times for each d ∈ {0, . . . , lg ∆}, we have that ⌊ρ(v)/r⌋ = d, and thus the degree of G(ℓ) is at most 2^⌊ℓ/r⌋. Suppose that vi ∈ G(ℓ) for some vi ∈ p. Since ρ(vi−1) ≥ ρ(vi) for all i > 1, we have vi−1 ∈ G(ℓ′) for some ℓ′ ≥ ℓ. We can therefore decompose p into a sequence of paths p = p⌈r lg ∆⌉, . . . , p0, where each pℓ ∈ p is a path in G(ℓ). By definition of SLL, the subdag G(ℓ) is a dag induced on a subgraph with degree at most 2^⌊ℓ/r⌋ by a random priority function.

By Corollary 17, the expected length of pℓ is O(2^⌊ℓ/r⌋ + ⌊ℓ/r⌋ lg V/lg lg V). Linearity of expectation therefore implies that

E[|p|] = Σ_{ℓ=0}^{⌈r lg ∆⌉} O(2^⌊ℓ/r⌋ + ⌊ℓ/r⌋ lg V/lg lg V) = O(∆ + lg²∆ lg V/lg lg V).

Next, because at most E/2^⌊ℓ/r⌋ vertices can have degree at least 2^⌊ℓ/r⌋, we have for ℓ > r lg √E that the longest path through the subdag G(ℓ) is no longer than E/2^⌊ℓ/r⌋. We thus conclude that

E[|p|] ≤ Σ_{ℓ=0}^{⌈r lg √E⌉} O(2^⌊ℓ/r⌋) + Σ_{ℓ=⌈r lg √E⌉}^{∞} O(E/2^⌊ℓ/r⌋) + Σ_{ℓ=0}^{⌈r lg ∆⌉} O(⌊ℓ/r⌋ lg V/lg lg V) = O(√E + lg²∆ lg V/lg lg V).

Corollary 23 Given a graph G = (V, E, ρ) for some ρ ∈ SLL, JP-SLL colors all vertices in G with expected span O(lg ∆ lg V + lg ∆(min{√E, ∆} + lg²∆ lg V/lg lg V)).

Proof. The procedure SLL-Assign-Priorities calls the parallel loop on line 31 O(lg ∆) times, each of which has expected span O(lg V ). The proof then follows from Theorems 13 and 22.

3.6 Empirical evaluation

This section evaluates the LLF and SLL ordering heuristics empirically using a suite of eight real-world and ten synthetic graphs. We describe the experimental setup used to evaluate JP-R, JP-LLF, and JP-SLL, and we compare their performance with Greedy-FF, Greedy-LF, and Greedy-SL. We compare the ordering heuristics in terms of the quality of the colorings they produce and their execution times. We conclude that LLF and SLL produce colorings with quality comparable to LF and SL, respectively, and that JP-LLF and JP-SLL scale well. We also show that the engineering quality of our implementations appears to be competitive with ColPack [121], a publicly available graph-coloring library. Our source code and data are available from http://supertech.csail.mit.edu.

Experimental setup

To evaluate the ordering heuristics, we implemented JP using Intel Cilk Plus [171] and engineered it to use the parallel ordering heuristics R, LLF, and SLL. To compare these parallel codes against their serial counterparts, we implemented Greedy in C to use the FF, LF, or SL ordering heuristics. In order to empirically evaluate the potential parallel performance of the serial ordering heuristics, we also engineered JP to use FF, LF, or SL.

We evaluated our implementations on a dual-socket Intel Xeon X5650 with a total of 12 processor cores operating at 2.67 GHz (hyperthreading disabled); 49 GB of DRAM; two 12-MB L3-caches, each shared between 6 cores; and private L2- and L1-caches of 128 KB and 32 KB, respectively. Each measurement was taken as the median of 7 independent trials, and the averages of those measurements reported in Tables 3.6 and 3.7 were taken across 5 independent random seeds.

These implementations were run on a suite of eight real-world graphs and ten synthetic graphs. The real-world graphs came from the Large Network Dataset Collection provided by Stanford's SNAP project [220]. The synthetic graphs consist of the adversarial graphs described in Section 3.4 and a set of graphs from three classes: constant degree, 3D grid, and "recursive matrix" (rMat) [65, 62]. The adversarial graphs — cliqueChain400 and path-10M — are described in Figure 3-3 with ∆ = 400 and Theorem 19 with |V| = 10,000,000, respectively. The constant-degree graphs — constant1M and constant500K — have 1M and 500K vertices and constant degrees of 100 and 200, respectively. These graphs were generated such that every pair of vertices is equally likely to be connected and every vertex has the same degree. The graph big3dgrid is a 3-dimensional grid on 10M vertices. The rMat graphs were generated using the parameters in Table 3.4.

Graph         |V|     a      b      c      d
graph500-5M   5M     0.57   0.19   0.19   0.05
graph500-2M   2M     0.57   0.19   0.19   0.05
rMat-ER-2M    2M     0.25   0.25   0.25   0.25
rMat-G-2M     2M     0.45   0.15   0.15   0.25
rMat-B-2M     2M     0.55   0.15   0.15   0.15

Table 3.4: Parameters for the generation of rMat graphs [65], where a + b + c + d = 1 and b = c when the desired graph is undirected. An rMat graph is built by adding |E| edges independently at random using the following rule: let k be the number of 1's in the binary representation of i. As each edge is added, the probability that the ith vertex vi is selected as an endpoint is (a + c)^k (b + d)^(lg n − k).

Graph              |E|    |E|/|V|        ∆
com-orkut        117.2M     38.1     33,313
liveJournal1      42.9M      8.8     20,333
europe-osm        36.0M      0.7          9
cit-Patents       16.5M      2.7        793
as-skitter        11.1M      1.0     35,455
wiki-Talk          4.7M      1.9    100,029
web-Google         4.3M      4.7      6,332
com-youtube        3.0M      2.6     28,754
-------------------------------------------
constant1M        50.0M     50.0        100
constant500K      50.0M     99.9        200
graph500-5M       49.1M      5.9    121,495
graph500-2M       19.2M      9.2     70,718
rMat-ER-2M        20.0M      9.5         44
rMat-G-2M         20.0M      9.5        938
rMat-B-2M         19.8M      9.4     14,868
big3dgrid         29.8M      3.0          6
cliqueChain400     3.6M    132.4        400
path-10M          10.0M      1.0          2

Table 3.5: Number of edges, ratio of edges to vertices, and maximum vertex degree for a collection of real-world and synthetic graphs, which lie above and below the center line, respectively.
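For concreteness, the recursive quadrant-selection rule of the published R-MAT generator [65], whose marginal endpoint probability is summarized in the caption of Table 3.4, can be sketched as follows. This is a generic sketch under the standard R-MAT formulation, not the exact code used to produce the graphs in our suite, and it omits details such as duplicate-edge and self-loop handling.

    #include <cstdint>
    #include <random>
    #include <utility>

    // One R-MAT edge in a graph with 2^scale vertices.  At each of the `scale`
    // levels, the adjacency matrix is divided into four quadrants, and one
    // quadrant is chosen with probabilities a, b, c, and d = 1 - a - b - c.
    std::pair<uint64_t, uint64_t>
    rmat_edge(int scale, double a, double b, double c, std::mt19937_64& rng) {
        std::uniform_real_distribution<double> unif(0.0, 1.0);
        uint64_t u = 0, v = 0;
        for (int level = 0; level < scale; ++level) {
            double r = unif(rng);
            int row_bit, col_bit;
            if      (r < a)         { row_bit = 0; col_bit = 0; }  // quadrant a
            else if (r < a + b)     { row_bit = 0; col_bit = 1; }  // quadrant b
            else if (r < a + b + c) { row_bit = 1; col_bit = 0; }  // quadrant c
            else                    { row_bit = 1; col_bit = 1; }  // quadrant d
            u = (u << 1) | row_bit;
            v = (v << 1) | col_bit;
        }
        return {u, v};
    }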

                         Greedy                    JP
Graph           H    CH      TS     H′   CH′      T1      T12   TS/T1  T1/T12
com-orkut       FF   175    2.23    R    132    4.44    0.817   0.50    5.43
                LF    87    3.54    LLF   98    5.74    0.846   0.62    6.79
                SL    83   10.59    SLL   84    9.90    1.865   1.07    5.31
liveJournal1    FF   352    0.89    R    330    2.08    0.231   0.43    8.98
                LF   323    2.34    LLF  326    2.23    0.286   1.05    7.80
                SL   322    4.69    SLL  327    4.03    0.704   1.16    5.73
europe-osm      FF     5    1.32    R      5    4.04    0.391   0.33   10.34
                LF     4   17.15    LLF    4    4.93    0.473   3.48   10.41
                SL     3   19.87    SLL    3    7.28    1.232   2.73    5.91
cit-Patents     FF    17    0.50    R     21    1.08    0.163   0.46    6.67
                LF    14    2.00    LLF   14    1.46    0.160   1.37    9.11
                SL    13    3.21    SLL   14    2.90    0.519   1.11    5.58
as-skitter      FF   103    0.24    R     81    0.58    0.114   0.42    5.07
                LF    71    2.43    LLF   72    0.63    0.106   3.84    5.99
                SL    70    2.79    SLL   71    1.04    0.269   2.67    3.88
wiki-Talk       FF   102    0.09    R     85    0.28    0.053   0.31    5.28
                LF    72    0.49    LLF   70    0.34    0.050   1.43    6.78
                SL    56    0.61    SLL   62    0.55    0.124   1.12    4.43
web-Google      FF    44    0.09    R     44    0.21    0.029   0.44    7.44
                LF    45    0.25    LLF   44    0.27    0.030   0.94    8.92
                SL    44    0.47    SLL   44    0.50    0.093   0.94    5.44
com-youtube     FF    57    0.06    R     46    0.18    0.026   0.36    6.86
                LF    32    0.25    LLF   33    0.22    0.028   1.11    7.97
                SL    28    0.35    SLL   28    0.35    0.073   1.01    4.75

Table 3.6: Performance measurements for a set of real-world graphs taken from Stanford’s SNAP project [220]. The column heading H denotes that the priority function used for the experiment in a particular row was produced by the ordering heuristic listed in the column. The average number of colors used by the corresponding ordering heuristic and graph is CH . The time in seconds of Greedy, JP with 1 worker and with 12 workers is given by TS, T1 and T12, respectively. Details of the experimental setup and graph suite can be found in Section 3.6.

Coloring quality of R, LLF, and SLL

Tables 3.6 and 3.7 present the coloring quality of the three parallel ordering heuristics R, LLF, and SLL alongside that of their serial counterparts FF, LF, and SL. The number of colors used by LLF was comparable to that used by LF on the vast majority of the 18 graphs. Indeed, LLF produced colorings that were within 2 colors of LF on all synthetic graphs and all but 2 real-world graphs: com-orkut and liveJournal1. Similarly, SLL produced colorings that were within 3 colors of SL for all synthetic graphs and all but 2 real-world graphs: liveJournal1 and wiki-Talk.

The liveJournal1 graph appears to benefit little from the ordering heuristics we considered. Every heuristic uses more than 300 colors, and the difference between the numbers of colors used by any two heuristics is less than 10.

The wiki-Talk and com-orkut graphs appear to benefit from ordering heuristics and illustrate what we believe is a coarse hierarchy of coloring quality in which FF < R < LLF < LF < SLL < SL. On com-orkut, LLF produced a coloring of size 98, which was better than the 175 and 132 colors used by FF and R, respectively, but not as good as the 87 colors used by LF. In contrast, SLL nearly matched the superior coloring quality of SL, producing a coloring of size 84. On wiki-Talk, SLL produced a coloring of size 62, which was better than LF, LLF, R, and FF by a margin of between 8 and 40 colors, but not as good as SL, which used only 56 colors. These trends appear to exist, in general, for most of the graphs in the suite.

                         Greedy                    JP
Graph           H    CH      TS     H′   CH′      T1      T12   TS/T1  T1/T12
constant1M      FF    33    0.90    R     32    1.93    0.255   0.47    7.55
                LF    32    1.16    LLF   32    2.70    0.323   0.43    8.35
                SL    34    2.96    SLL   32    4.63    0.610   0.64    7.59
constant500K    FF    52    0.74    R     52    1.50    0.190   0.49    7.89
                LF    52    0.84    LLF   52    2.01    0.273   0.42    7.34
                SL    53    1.97    SLL   52    3.33    0.498   0.59    6.69
graph500-5M     FF   220    1.83    R    220    2.99    0.558   0.61    5.35
                LF   159    3.69    LLF  160    3.74    0.542   0.99    6.89
                SL   158    8.43    SLL  162    7.63    1.056   1.10    7.23
graph500-2M     FF   206    0.52    R    208    1.01    0.212   0.51    4.77
                LF   153    0.98    LLF  154    1.24    0.151   0.79    8.19
                SL   153    2.22    SLL  156    2.25    0.324   0.99    6.94
rMat-ER-2M      FF    12    0.47    R     12    1.25    0.149   0.37    8.40
                LF    11    1.07    LLF   12    1.63    0.198   0.66    8.25
                SL    11    2.22    SLL   11    3.13    0.506   0.71    6.18
rMat-G-2M       FF    27    0.48    R     27    0.91    0.144   0.53    6.33
                LF    15    1.18    LLF   17    1.34    0.204   0.88    6.54
                SL    15    2.59    SLL   15    2.75    0.432   0.94    6.36
rMat-B-2M       FF   105    0.50    R    105    0.86    0.149   0.58    5.78
                LF    67    1.00    LLF   68    1.18    0.149   0.85    7.94
                SL    67    2.41    SLL   68    2.38    0.376   1.01    6.31
big3dgrid       FF     4    0.41    R      7    1.66    0.178   0.25    9.31
                LF     7    4.07    LLF    7    1.89    0.216   2.15    8.76
                SL     7    4.77    SLL    7    2.63    0.307   1.81    8.57
cliqueChain400  FF   399    0.05    R    399    0.09    0.012   0.50    7.77
                LF   399    0.05    LLF  399    0.12    0.015   0.41    7.70
                SL   399    0.08    SLL  399    0.16    0.024   0.47    6.70
path-10M        FF     2    0.18    R      3    0.85    0.074   0.21   11.54
                LF     3    2.49    LLF    3    0.98    0.083   2.54   11.87
                SL     2    2.58    SLL    3    1.36    0.169   1.90    8.04

Table 3.7: Performance measurements for five classes of synthetically generated graphs: constant degree, rMat, 3D grid, clique chain, and path. The column headings are equivalent to those in Table 3.6.

Scalability of JP-R, JP-LLF, and JP-SLL

The parallel performance of JP was measured by computing the speedup it achieved on 12 cores and by comparing the 1-core runtimes of JP to an optimized serial implementation of Greedy. These results are summarized in Tables 3.6 and 3.7. Overall, JP-LLF obtains a geometric-mean speedup — the ratio of the runtime on 1 core to the runtime on 12 cores — of 7.83 on the eight real-world graphs and 8.08 on the ten synthetic graphs. Similarly, JP-SLL obtains a geometric-mean speedup of 5.36 and 7.02 on the real-world and synthetic graphs, respectively.

Tables 3.6 and 3.7 also include scalability data for JP-FF, JP-LF, and JP-SL. Historically, JP-LF has been used with mixed success in practical parallel settings [129, 300, 179, 4]. Despite the fact that it offers little in terms of theoretical parallel performance guarantees, we have measured its parallel performance for our graph suite, and indeed JP-LF scales reasonably well: JP-LF1/JP-LF12 = 6.8 in geometric mean, as compared to JP-LLF1/JP-LLF12 = 8.0, not including cliqueChain400, which is omitted because JP-LF crashes due to excessive stack usage on that graph. The omission of cliqueChain400 highlights the dangers of using algorithms without good performance guarantees: it is difficult to know whether the algorithm will behave badly on any particular input. In this respect, JP-FF is particularly vulnerable to adversarial inputs, as we can see by the fact that it crashes on europe-osm, which is not even intentionally adversarial. We also see this vulnerability with JP-SL, as well as generally poor scalability on the entire suite.

To measure the overheads introduced by using a parallel algorithm, the runtime T1 of JP on 1 core was compared with the runtime TS of an optimized implementation of Greedy. This comparison was performed for each of the three parallel ordering heuristics we considered: R, LLF, and SLL. Greedy using FF is 2.5 times faster than JP-R on 1 core for the eight real-world graphs and 2.3 times faster on the ten synthetic graphs. We conjecture that Greedy gains its advantage from the spatial locality that results from processing the vertices in the linear order in which they appear in the graph representation. JP-LLF and JP-SLL on 1 core, however, are actually faster than Greedy with LF and SL by 43.3% and 19% on the eight real-world graphs and by 6% and 3% on the whole suite, respectively.

In order to validate that our implementation of Greedy is a credible baseline, we compared it with a publicly available graph-coloring library, ColPack [121], developed by Gebremedhin et al., and found that the two implementations appear to achieve similar performance. For example, using the SL ordering heuristic, Greedy is 19% faster than ColPack in geometric mean across the graph suite, though Greedy is slower on 5 of the 16 graphs and as much as 2.22 times slower for as-skitter.

3.7 Implementation techniques

This section describes the techniques we employed to implement JP and Greedy for the evaluation in Section 3.6. We describe three techniques — join trees [98], bit vectors, and software prefetching — that improve the practical performance of JP. Where applicable, these same techniques were used to optimize the implementation of Greedy. Overall, applying these techniques yielded a speedup of between 1.6 and 2.9 for JP and a speedup of between 1.2 and 1.6 for Greedy on the rMat-G-2M, rMat-B-2M, web-Google, and as-skitter graphs used in Section 3.6.

Join trees for reducing memory contention

Although the theoretical analysis of JP in Section 3.2 does not concern itself with contention, the implementation of JP works to mitigate overheads due to contention. The pseudocode for JP in Figure 3-2 shows that each vertex u in the graph has an associated counter u.counter. Line 17 of JP-Color executes a Join operation on u.counter. Although Section 3.2 describes how Join can treat u.counter as a join counter [76] and update u.counter using an atomic decrement and fetch operation, the cache-coherence protocol [277] on the machine serializes such atomic operations, giving rise to potential memory contention. In particular, memory contention may harm the practical performance of JP on graphs with large-degree vertices.

Our implementation of JP mitigates overheads due to contention by replacing each join counter u.counter with a join tree having Θ(|u.pred|) leaves. In particular, each join tree was sized such that an average of 64 predecessors of u map to each leaf through a hash function that maps predecessors to random leaves. We found that the join tree reduces T1 for JP by a factor of 1.15 and reduces T12 for JP by between 1.1 and 1.3.
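The sketch below illustrates the join-tree idea with a simplified two-level tree: each predecessor of u is hashed to a leaf counter, and a leaf that reaches zero performs a single decrement on the root, so the atomic traffic for a high-degree vertex is spread across several cache lines. The class and field names are hypothetical, and the fan-in, hashing, and sizing details of the tuned implementation described above are omitted.

    #include <atomic>
    #include <cstddef>
    #include <vector>

    // Hypothetical two-level join tree for one vertex u.
    class JoinTree {
    public:
        // pred_leaf[i] is the leaf index (e.g., a hash of the i-th predecessor)
        // that predecessor i will decrement when it completes.
        JoinTree(const std::vector<std::size_t>& pred_leaf, std::size_t num_leaves)
            : leaves_(num_leaves), root_(0) {
            std::vector<int> counts(num_leaves, 0);
            for (std::size_t leaf : pred_leaf) counts[leaf % num_leaves]++;
            int nonempty = 0;
            for (std::size_t i = 0; i < num_leaves; ++i) {
                leaves_[i].store(counts[i]);
                if (counts[i] > 0) nonempty++;
            }
            root_.store(nonempty);   // root waits for each nonempty leaf once
        }

        // Called once by each predecessor; returns true for exactly one caller,
        // the one whose arrival completes the whole tree.
        bool join(std::size_t leaf) {
            leaf %= leaves_.size();
            if (leaves_[leaf].fetch_sub(1, std::memory_order_acq_rel) == 1)
                return root_.fetch_sub(1, std::memory_order_acq_rel) == 1;
            return false;
        }

    private:
        std::vector<std::atomic<int>> leaves_;
        std::atomic<int> root_;
    };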

Bit vectors for assigning colors

To color vertices more efficiently, the implementation of JP uses vertex-local bit vectors to store information about the availability of low-numbered colors. Because JP assigns to each vertex the lowest-numbered available color, vertices tend to be colored with low-numbered colors. To take advantage of this observation, we store a 64-bit word per vertex u to track the colors in the range {1, 2, . . . , 64} that have already been assigned to a neighbor of u. The bit vector u.vec is computed as a "self-timed" OR reduction that occurs during updates on u's join tree. Effectively, as each predecessor v of u executes Join on u's join tree, if color(v) is in {1, 2, . . . , 64}, then v OR's the word 2^(color(v)−1) into u.vec. When Get-Color(u) subsequently executes, Get-Color first scans for the lowest unset bit in u.vec to find the minimum color in {1, 2, . . . , 64} not assigned to a neighbor of u. Only when no such color is available does Get-Color(u) scan its predecessors to assign a color to u.

We discovered that a large fraction of vertices in a graph can be colored efficiently using this practical optimization. We found that this optimization improved T12 for JP by a factor of 1.4 to 2.2, and a similar optimization sped up the implementation of Greedy by a factor of 1.2 to 1.6.
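A minimal sketch of the 64-color bit-vector optimization follows; the names are hypothetical. Each predecessor ORs in a bit for its color as it performs its join, and the coloring step first looks for a zero bit before falling back to the general scan of the predecessor list.

    #include <atomic>
    #include <cstdint>

    // Hypothetical per-vertex record of which of the colors 1..64 are already
    // taken by a colored neighbor; bit (c-1) is set when color c is taken.
    struct ColorBits {
        std::atomic<uint64_t> taken{0};

        // Called by a predecessor that has just been assigned color c.
        void mark(uint64_t c) {
            if (c >= 1 && c <= 64)
                taken.fetch_or(uint64_t(1) << (c - 1), std::memory_order_relaxed);
        }

        // Returns the smallest free color in 1..64, or 0 if all 64 are taken
        // and the caller must fall back to scanning its predecessors.
        uint64_t lowest_free_color() const {
            uint64_t bits = taken.load(std::memory_order_acquire);
            if (bits == ~uint64_t(0)) return 0;
            return uint64_t(__builtin_ctzll(~bits)) + 1;   // first zero bit
        }
    };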

Software prefetching

We used software prefetching to improve the latency of memory accesses in JP. In particular, JP uses software prefetching to mitigate the latency of the indirect memory access encountered when accessing the join trees of the successors of a vertex v on line 16 of JP-Color in Figure 3-2. This optimization improves T12 for JP by a factor of 1.2 to 1.5.

Interestingly, our implementation of Greedy did not appear to benefit from using software prefetching in a similar context, specifically, to access the predecessors of a vertex on line 4 of Greedy in Figure 3-1. We suspect that because Greedy only reads the predecessors of a vertex on this line and does not write them, the processor hardware is able to generate many such reads in parallel, thereby mitigating the latency penalty introduced by cache misses.

Greedy-SD(G)
36  let G = (V, E)
37  for v ∈ V
38      v.adjColors ← ∅
39      v.adjUncolored ← Adj[v]
40      PushOrAddKey(v, Q[0][|v.adjUncolored|])
41  s ← 0
42  while s ≥ 0
43      v ← PopOrDelKey(Q[s][max Keys(Q[s])])
44      v.color ← min({1, 2, . . . , |v.adjUncolored| + 1} − v.adjColors)
45      for u ∈ v.adjUncolored
46          RemoveOrDelKey(u, Q[|u.adjColors|][|u.adjUncolored|])
47          u.adjColors ← u.adjColors ∪ {v.color}
48          u.adjUncolored ← u.adjUncolored − {v}
49          PushOrAddKey(u, Q[|u.adjColors|][|u.adjUncolored|])
50          s ← max{s, |u.adjColors|}
51      while s ≥ 0 and Q[s] == ∅
52          s ← s − 1

Figure 3-5: The Greedy-SD algorithm computes a coloring for the input graph G = (V, E) using the SD heuristic. Each uncolored vertex v ∈ V maintains a set v.adjColors of colors used by its neighbors and a set v.adjUncolored of uncolored neighbors of v. The PushOrAddKey method adds a specified key, if necessary, and then adds an element to that key's associated set. The PopOrDelKey and RemoveOrDelKey methods remove an element from a specified key's associated set, deleting that key if the set becomes empty. The variable s maintains the maximum saturation degree of G.
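Returning to the software-prefetching optimization of Section 3.7, the sketch below shows how prefetches of successor state might be issued a few iterations ahead of the loop that updates it. The struct and the prefetch distance are hypothetical and illustrative only; __builtin_prefetch is a GCC/Clang intrinsic.

    #include <cstddef>
    #include <vector>

    // Hypothetical successor record: each successor holds per-vertex state
    // (standing in for its join-tree node) behind a pointer, so touching it
    // incurs an indirect, likely-cold memory access.
    struct Succ {
        int* state;
    };

    // Sketch of prefetching the successors' state ahead of the loop that
    // updates it (cf. line 16 of JP-Color).  The distance of 4 would be tuned.
    void update_successors(std::vector<Succ>& succ) {
        const std::size_t dist = 4;
        for (std::size_t i = 0; i < succ.size(); ++i) {
            if (i + dist < succ.size())
                __builtin_prefetch(succ[i + dist].state, /*rw=*/1, /*locality=*/1);
            --(*succ[i].state);   // placeholder for the Join on the successor
        }
    }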

3.8 The SD heuristic

Our experiments with serial heuristics detailed in the Appendix (Section 3.11) indicate that the SD heuristic tends to provide colorings with higher quality than the other heuristics we have considered, confirming similar findings by Gebremedhin and Manne [122]. Although we leave the problem of devising a good parallel algorithm for SD as an open question, we were able to devise a linear-time serial algorithm for the problem, despite conjectures in the literature [129, 75] that superlinear time is required. This section briefly describes our linear-time serial algorithm for SD.

Figure 3-5 gives pseudocode for the Greedy-SD algorithm, which implements the SD heuristic. Rather than trying to define a priority function for SD, the figure gives the coloring algorithm Greedy-SD itself, since the calculation of such a priority function would color the graph as a byproduct. At any moment during the execution of the algorithm, define the saturation degree of a vertex v as the number |v.adjColors| of distinct colors of v's neighbors, and the effective degree of v as |v.adjUncolored|, its degree in the as-yet-uncolored graph.

The main loop of Greedy-SD (lines 42–52) first removes a vertex v of maximum saturation degree from Q (line 43) and colors it (line 44). It then updates each uncolored neighbor u ∈ v.adjUncolored of v (lines 45–50) in three steps. First, it removes u from Q (line 46). Next, it updates the set u.adjUncolored of u's effective neighbors — u's uncolored neighbors in G — and the set u.adjColors of colors used by u's neighbors (lines 47–48). Finally, it enqueues u in Q based on u's updated information (lines 49–50).

The crux of Greedy-SD lies in the operation of the queue data structure Q, which is organized as an array of saturation tables, each of which supports the three methods PushOrAddKey, PopOrDelKey, and RemoveOrDelKey described in the caption of Figure 3-5. A saturation table can support these operations in Θ(1) time and allow its keys K to be read in Θ(K) time. At the start of each main-loop iteration, entry Q[i] stores in a saturation table the uncolored vertices in the graph with saturation degree i. The PushOrAddKey, PopOrDelKey, and RemoveOrDelKey methods maintain the invariant that, for each table Q[i], each key j ∈ Keys(Q[i]) is associated with a nonempty set of vertices, such that each vertex v ∈ Q[i][j] has saturation degree i and effective degree j.

CH (number of colors)
Graph            FF     R    LF    ID    SL    SD
com-orkut       175   132    87    86    83    76
liveJournal1    352   330   323   325   322   326
europe-osm        5     5     4     4     3     3
cit-Patents      17    21    14    14    13    12
as-skitter      103    81    71    72    70    70
wiki-Talk       102    85    72    57    56    51
web-Google       44    44    45    45    44    44
com-youtube      57    46    32    28    28    26
--------------------------------------------------
constant1M       33    32    32    34    34    26
constant500K     52    52    52    55    53    44
graph500-5M     220   220   159   157   158   147
graph500-2M     206   208   153   152   153   141
rMat-ER-2M       12    12    11    11    11     8
rMat-G-2M        27    27    15    15    15    11
rMat-B-2M       105   105    67    67    67    59
big3dgrid         4     7     7     4     7     5
cliqueChain400  399   399   399   399   399   399
path-10M          2     3     3     2     2     2

Table 3.8: Performance measurements for six serial ordering heuristics used by Greedy, where measurements for real-world graphs appear above the center line and those for synthetic graphs appear below. The "Spark" column contains bar graphs that pictorially represent the coloring quality for each of the ordering heuristics. The height of the bar for the coloring quality CH of ordering heuristic H is proportional to CH. Section 3.6 details the experimental setup and graph suite used.

TS (seconds)
Graph             FF      R     LF     ID     SL     SD
com-orkut        2.23   3.39   3.54  44.13  10.59  46.60
liveJournal1     0.89   2.05   2.34  17.93   4.69  19.75
europe-osm       1.32  13.36  17.15  48.59  19.87  52.73
cit-Patents      0.50   1.62   2.00   9.82   3.21  10.08
as-skitter       0.24   1.70   2.43   9.41   2.79   9.94
wiki-Talk        0.09   0.35   0.49   2.79   0.61   2.90
web-Google       0.09   0.22   0.25   1.68   0.47   1.77
com-youtube      0.06   0.19   0.25   1.50   0.35   1.55
--------------------------------------------------------
constant1M       0.90   1.13   1.16  16.07   2.96  17.23
constant500K     0.74   0.88   0.84  14.20   1.97  15.51
graph500-5M      1.83   3.14   3.69  25.19   8.43  35.29
graph500-2M      0.52   0.77   0.98   8.09   2.22  11.68
rMat-ER-2M       0.47   0.93   1.07  10.10   2.22   9.13
rMat-G-2M        0.48   0.92   1.18   9.17   2.59   9.07
rMat-B-2M        0.50   0.83   1.00   8.44   2.41   8.64
big3dgrid        0.41   3.34   4.07  13.61   4.77  15.30
cliqueChain400   0.05   0.05   0.05   0.81   0.08   2.06
path-10M         0.18   1.95   2.49   7.34   2.58   7.96

Table 3.9: Performance measurements for six serial ordering heuristics used by Greedy, where measurements for real-world graphs appear above the center line and those for syn- thetic graphs appear below. The “Spark” column contains bar graphs that pictorially represent the serial running time for each of the ordering heuristics. The height of the bar for the serial running time TS of ordering heuristic H is proportional to the log of TS. Section 3.6 details the experimental setup and graph suite used.

Theorem 24 Greedy-SD colors a graph G = (V,E) according to the SD ordering heuris- tic in Θ(V + E) time.

Proof. PushOrAddKey, PopOrDelKey, and RemoveOrDelKey operate in Θ(1) time, and a given saturation table’s key set K can be read in Θ(K) time. Line 43 can thus find a vertex v with maximum saturation degree s in Θ(|Keys(Q[s])|) time. Line 44 can color v in Θ(deg(v)) time, and lines 50–52 maintain s in Θ(s) time. Because s + |Keys(Q[s])| ≤ deg(v), lines 42–52 evaluate v in Θ(deg(v)) time. The handshaking lemma [77, p. 1172–3] implies the theorem, because each vertex in V is evaluated once.
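One way to realize the Θ(1)-time operations assumed in the proof is to back each saturation table with a hash map from effective degree to a dynamic array of vertex ids, recording each vertex's position so that it can be removed by swapping with the last element of its bucket. The sketch below shows this idea for a single saturation table; the class and method names are hypothetical, and it is a simplified illustration rather than the exact data structure used in our implementation.

    #include <cstddef>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    // Simplified sketch of one saturation table: a map from effective degree
    // (the key) to a set of vertex ids, supporting O(1) push, pop, and remove.
    class SaturationTable {
    public:
        void push(int key, int v) {
            std::vector<int>& vec = buckets_[key];
            pos_[v] = {key, vec.size()};
            vec.push_back(v);
        }
        // Removes and returns some vertex stored under key (which must be
        // nonempty), deleting the key if its set becomes empty.
        int pop(int key) {
            std::vector<int>& vec = buckets_[key];
            int v = vec.back();
            vec.pop_back();
            pos_.erase(v);
            if (vec.empty()) buckets_.erase(key);
            return v;
        }
        // Removes a specific vertex v (which must be present) by swapping it
        // with the last element of its bucket.
        void remove(int v) {
            std::pair<int, std::size_t> p = pos_[v];
            std::vector<int>& vec = buckets_[p.first];
            vec[p.second] = vec.back();
            pos_[vec[p.second]] = p;
            vec.pop_back();
            pos_.erase(v);
            if (vec.empty()) buckets_.erase(p.first);
        }
        bool empty() const { return buckets_.empty(); }

    private:
        std::unordered_map<int, std::vector<int>> buckets_;          // key -> vertices
        std::unordered_map<int, std::pair<int, std::size_t>> pos_;   // vertex -> (key, index)
    };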

3.9 Related work

Parallel coloring algorithms have been explored extensively in the distributed computing domain [5, 229, 179, 131, 132, 201, 200, 15]. These algorithms are evaluated in the message- passing model, where nodes are allowed unlimited local computation and exchange mes- sages through a sequence of synchronized rounds. Kuhn [200] and Barenboim and Elkin [15] independently developed O(∆ + lg∗ n)-round message passing algorithms to compute a deterministic greedy coloring.

Several greedy coloring algorithms have been described in synchronous PRAM models. Goldberg et al. [131] describe an algorithm for finding a greedy coloring of O(1)-degree graphs in O(lg n) time in the EREW PRAM model using a linear number of processors. They observe that their technique can be applied recursively to color ∆-degree graphs in O(∆ lg ∆ lg n) time. Their strategy incurs Ω(lg ∆(V + E)) (superlinear) work, however.

Catalyurek et al. [62] present the algorithm Iterative, which first speculatively colors a graph G and then fixes coloring conflicts, that is, corrects the coloring where two adjacent vertices are assigned the same color. The process of fixing conflicting colors can introduce new conflicts, though the authors observe empirically that comparatively few iterations suffice to find a valid coloring. We ran Iterative on our test system and found that, in geometric mean over all graphs in our test suite, JP-LLF uses 13% fewer colors and takes 19% less time than Iterative. Furthermore, we found that JP-SLL uses 17% fewer colors, but executes in twice the time of Iterative. We do not know the extent to which the optimizations enjoyed by our algorithms could be adopted by speculative-coloring algorithms, however, and so it is likely too soon to draw conclusions about comparisons between the strategies.

3.10 Conclusion

Because of the importance of graph coloring, considerable effort has been invested over the years to develop ordering heuristics for serial graph-coloring algorithms. For the traditional "serial" LF and SL ordering heuristics, we have developed "parallel" analogs — the LLF and SLL heuristics, respectively — which approximate the traditional orderings, generating colorings of comparable quality while offering provable guarantees on parallel scalability.

The correspondence between serial ordering heuristics and their parallel analogs is fairly direct for LF and LLF. LLF colors any two vertices whose degrees differ by more than a factor of 2 in the same order as LF. In this sense, LLF can be viewed as a simple coarsening of the vertex ordering used by LF. Although SLL is inspired by SL, and both heuristics tend to color vertices of smaller degree later, the correspondence between SL and SLL is not as straightforward. We relied on empirical results to determine the degree to which SLL captures the salient properties of SL.

We had hoped that the coarsening strategy LLF and SLL embody would generalize to the other serial ordering heuristics, and we are disappointed that we have not yet been able to devise parallel analogs for the other ordering heuristics, in particular, for SD. Because the SD heuristic appears to produce better colorings in practice than all of the other serial ordering heuristics, SD appears to capture an important phenomenon that the others miss. The problem with applying the coarsening strategy to SD stems from the way that SD is defined. Because SD determines the order in which to color vertices while serially coloring the graph itself, it seems difficult to parallelize, and it is not clear how SD might correspond to a possible parallel analog. Thus, it remains an intriguing open question whether a parallel ordering heuristic exists that captures the same "insights" as SD while offering provable guarantees on scalability.

3.11 Appendix: Performance of serial ordering heuristics

Tables 3.8 and 3.9 summarize our empirical evaluation of Greedy run on our suite of real-world and synthetic graphs using the six ordering heuristics from Section 3.1. The measurements were taken using the same machine and methodology as were used for Tables 3.6 and 3.7. As Tables 3.8 and 3.9 show, we found that, in this order, FF, R, LF, SL, and SD generally produce better colorings at the cost of greater running times. ID was outperformed in both time and quality by SL. The tables indicate that LF tends to produce better colorings than FF and R at some performance cost, and SL produces better colorings than LF at additional cost. We found that SD produces the best colorings overall, at the cost of a 4.5× geometric-mean slowdown versus SL.

3.12 Acknowledgments

Thanks to Guy Blelloch of Carnegie Mellon University for sharing utility functions from his Problem Based Benchmark Suite with us [320]. Thanks to Aydın Buluç of Lawrence Berkeley Laboratory for helping us in our search for collections of large sparse graphs. Thanks to Mahantesh Halappanavar of Pacific Northwest National Laboratory for providing us with the code for Iterative [62]. Thanks to Assefaw Gebremedhin for input regarding the publicly available graph-coloring library ColPack [121]. Thanks to Jack Dennis of MIT CSAIL for helping us track down early work on parallel sorting and join counters. Thanks to Jeremy Fineman for helpful discussions on the amortized analysis of SD. Thanks to Angelina Lee and Justin Zhang of MIT CSAIL and Julian Shun and Harsha Vardhan Simhadri of Carnegie Mellon University for several helpful discussions.

Chapter 4

PARAD: A Work-Efficient Parallel Algorithm for Reverse-Mode Automatic Differentiation

This chapter presents PARAD, the first work-efficient parallel algorithm for performing reverse-mode automatic differentiation. This work was conducted in collaboration with Tao B. Schardl, Brian Xie, Jie Chen, Aldo Pareja, Georgios Kollias, and Charles E. Leiserson.

Abstract

Automatic differentiation (AD) is a technique for computing the derivative of a function F : Rⁿ → Rᵐ defined by a computer program. Modern applications of AD, such as machine learning, typically use AD to facilitate gradient-based optimization of an objective function for which m ≪ n (often m = 1). As a result, these applications typically use reverse (or adjoint) mode AD to compute the gradient of F efficiently, in time Θ(m · T1(F)), where T1 is the work (serial running time) of F. Although the serial running time of reverse-mode AD has a well-known relationship to the total work of F, general-purpose reverse-mode AD has proven challenging to parallelize in a work-efficient and scalable fashion, as simple approaches tend to result in poor performance or scalability.

This chapter introduces PARAD, a work-efficient parallel algorithm for reverse-mode AD of determinacy-race-free recursive fork-join programs. We analyze the performance of PARAD using work/span analysis. Given a program F with work T1(F) and span (critical-path length) T∞(F), PARAD performs reverse-mode AD of F in O(m · T1(F)) work and O(log m + log(T1(F)) T∞(F)) span. To the best of our knowledge, PARAD is the first parallel algorithm for performing reverse-mode AD that is both provably work-efficient and has span within a polylogarithmic factor of the original program F.

We implemented PARAD as an extension of Adept, a C++ library for performing reverse-mode AD for serial programs that is known for its efficiency. Our implementation supports the use of Cilk fork-join parallelism and requires no programmer annotations of parallel control flow. Instead, it uses compiler instrumentation to dynamically trace a program's series-parallel structure, which is used to automatically parallelize the gradient computation via reverse-mode AD. On eight machine-learning benchmarks, our implementation of PARAD achieves 1.5× geometric-mean multiplicative work overhead relative to the serial Adept tool, and 8.9× geometric-mean self-relative speedup on 18 cores.

4.1 Introduction

Automatic differentiation¹ [139, 280, 286, 193], or AD for short, aims to numerically compute the derivative of a function F : Rⁿ → Rᵐ defined by a computer program. Although AD has a long history of exploration and development in applications such as computational fluid dynamics, molecular dynamics simulations, engineering design optimization, sensitivity analysis, and uncertainty quantification, AD is commonly used today as a fundamental computational step in training neural networks for machine learning. In that context, the program is a neural network that defines a function F mapping a set of weights W ∈ Rⁿ to a loss L ∈ Rᵐ, for which m ≪ n (often m = 1). AD is used when training the neural network to facilitate gradient-based optimization of L [44]. In contrast to symbolic [154] or numerical differentiation [56], AD provides an efficient way to compute partial derivatives for functions of many input variables, which makes AD appealing for training neural networks [18]. More broadly, efficient general-purpose AD sees a diversity of uses today, from general applications of AD in machine learning, such as in differentiable programming systems (e.g., [167, 46, 280]), to various applications in computational science.

For a function F computed serially in time T1(F), the gradient of F can be computed using either forward-mode AD [342], in time Θ(n · T1(F)), or reverse-mode (or adjoint-mode) AD [230, 324], in time Θ(m · T1(F)). Reverse-mode AD is thus well suited for machine learning and has therefore grown in popularity.

This paper addresses the problem of automatically parallelizing general-purpose reverse-mode AD for programs implemented using (recursive) fork-join parallelism, as supported by parallel programming languages including dialects of Cilk [116, 219, 172], Fortress [3], Kokkos [101], Habanero [16], Habanero-Java [61], Hood [42], HotSLAW [258], Java Fork/Join Framework [208], OpenMP [275, 13], Task Parallel Library [218], Threading Building Blocks (TBB) [291], and X10 [68]. Fork-join parallelism allows subroutines to be spawned recursively in parallel and iterations of parallel loops to execute concurrently. Fork-join programs expose fine-grained tasks that are allowed to execute in parallel, but are not required to. The execution and synchronization of fine-grained tasks is managed "under the covers" by a runtime system, which typically implements a randomized work-stealing scheduler [41, 116, 11, 38] to schedule and load-balance the computation among parallel worker threads. Constructs such as parallel for can be implemented as syntactic sugar on top of the fork-join model. As long as a fork-join program contains no determinacy races [105] (also called general races [266]) — no cases where two logically parallel operations access the same memory location, and at least one access writes to the location — then it is deterministic, meaning that every execution of the program on a given input performs the same operations, regardless of scheduling. Fork-join parallelism has emerged as a popular parallel-programming model that allows many programs to be implemented efficiently as deterministic parallel programs [33, 183, 317].

This paper explores the following problem: given a determinacy-race-free function F : Rⁿ → Rᵐ defined by a recursive fork-join parallel program, automatically parallelize the reverse-mode AD computation of F in a work-efficient and scalable manner that is efficient in practice.
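For readers unfamiliar with the programming model, the following Cilk-style fragment illustrates recursive fork-join parallelism: the spawned recursive call may run in parallel with the continuation, and the sync waits for it before the addition. This is a generic illustration, not code from PARAD.

    #include <cilk/cilk.h>

    // Classic fork-join example: the spawned call to fib(n - 1) may execute in
    // parallel with the call to fib(n - 2); cilk_sync waits for the spawned
    // child before the results are combined.
    long fib(long n) {
        if (n < 2) return n;
        long x = cilk_spawn fib(n - 1);
        long y = fib(n - 2);
        cilk_sync;
        return x + y;
    }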
We have observed that, in practice, many applications of reverse-mode AD, including machine-learning applications, implement functions that satisfy these constraints. General-purpose reverse-mode AD has long posed a challenge to parallelize efficiently [29], despite its substantial history of research and development (for a survey of previous

¹Also known as algorithmic differentiation.

work, see [18]). Reverse-mode AD can be performed in parallel for the m dimensions of the output of F, but this approach yields minimal parallelism when m is small. Specialized algorithms have been developed to perform parallel reverse-mode AD for specific computations [164, 163, 162, 161], but these specialized approaches do not apply to recursive fork-join computations in general.

The challenges of parallelizing reverse-mode AD To see the challenges in automatically parallelizing reverse-mode AD, let us first examine serial reverse-mode AD on a function F . Intuitively, AD views the computation of F as a sequence of primitive arithmetic op- erations and primitive functions — such as addition, multiplication, sine, and cosine — and performs a nonstandard interpretation of F to calculate derivatives. Reverse-mode AD computes the derivative of F by applying the chain rule from differential calculus, starting from the outermost function, which is the last operation or function in the sequence. More m precisely, let L ∈ R be the dependent variable computed by F , and let n be the length of the operation sequence to compute F . After at each step i, the reverse-mode AD algorithm has evaluated some suffix Sn−i of the computation of F , and it stores a set of gradients that encode the adjoint ∂L/∂Wn−i, where Wn−i is the set of inputs to Sn−i. Step i + 1 of the algorithm grows the evaluated suffix Sn−i−1 from Sn−i by updating the set of gradients to store ∂L/∂Wn−i−1 = (∂L/∂Wn−i)(∂Wn−i/∂Wn−i−1). Reverse-mode AD is most commonly accomplished through maintenance of an auxiliary tape data-structure.2 Conceptually, reverse-mode AD first performs an augmented execu- tion of the function F , known as the forward pass, to record data about F ’s computation onto the tape. After the forward pass completes, reverse-mode AD performs a reverse pass, in which it applies the chain rule to the operations on the tape in reverse order. Section 4.2 describes an efficient serial reverse-mode AD algorithm in detail. At a high level, the tape and the set of gradients pose two key challenges to parallelizing reverse-mode AD.

Parallelizing the tape Parallel reverse-mode AD must accommodate the parallelism in the computation of F . For a fork-join parallel program F , the spawning and synchro- nization of logically parallel tasks imposes a directed acyclic graph (DAG) of dependencies operations, rather than a sequence. Dependencies between primitive operations must be recorded efficiently in parallel during the forward-pass execution of F . In addition, the DAG structure of dependencies implies logical parallelism between operations in the reverse pass, which should be exploited to achieve performance and scalability.

Parallel maintenance of gradients As Section 4.2 describes, one well-known feature of reverse-mode AD is that a variable-read operation in the given function F corresponds to a write of a gradient value during the reverse pass of the reverse-mode AD computation of F , and vice versa [29, 301]. As a result, even if F is determinacy-race free, logically parallel read operations during the forward pass become logically parallel write operations during the reverse pass. Intuitive approaches to managing gradients can lead to poor performance.

²Also called a Wengert list [342].

For example, coordinating updates to gradients using locks can result in high contention that inhibits scalability. Alternatively, one might imagine maintaining gradients using P thread-local tables, where P is the number of processors executing the reverse pass. But a simple use of thread-local gradient tables is not work efficient, because reads of gradient values can occur asynchronously during a parallel execution of the reverse pass, and O(P) work is needed to read a gradient out of the thread-local tables. Synchronizing these reads can reduce the total work, but at the cost of parallel scalability.
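To make this work-inefficiency concrete, the following sketch (illustrative only; the Worker-Local implementation evaluated in Section 4.6 is more involved) shows why naive per-worker gradient tables cost Θ(P) work per asynchronous gradient read: the P per-worker copies must be summed.

#include <vector>
#include <cstddef>

// One gradient table per worker; gradient indices are dense integers.
struct WorkerLocalGradients {
    std::vector<std::vector<double>> tables;   // tables[p][gid]

    WorkerLocalGradients(std::size_t P, std::size_t num_gids)
        : tables(P, std::vector<double>(num_gids, 0.0)) {}

    // Writes are race-free: each worker updates only its own table.
    void accumulate(std::size_t worker, std::size_t gid, double v) {
        tables[worker][gid] += v;
    }

    // An asynchronous read must touch all P tables: Theta(P) work per read.
    double read(std::size_t gid) const {
        double sum = 0.0;
        for (const std::vector<double>& t : tables) sum += t[gid];
        return sum;
    }
};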

Previous approaches

Previous work [28] has explored parallel reverse-mode AD for OpenMP programs using ADOL-C [340], a library for reverse-mode AD in C++ programs. To accommodate OpenMP threads, a separate instance of ADOL-C is created for each thread, such that each thread operates on its own tape and set of gradients. Separate user code is invoked to combine gradient information from these parallel tapes at serial points in the computation. This thread-local AD approach can work efficiently for programs with simple parallel control flow, such as a sequence of parallel loops. But it is unclear how to generalize the approach to handle arbitrary recursive fork-join parallel programs while maintaining work efficiency and scalability. Nested parallel control flow in particular presents a significant challenge, because work efficiency can be precluded by the total work performed over the entire reverse pass to combine gradient information at nested synchronization points. Previous work has also explored similar node-local approaches to parallelize reverse-mode AD for MPI programs [301], by assigning each MPI node to operate on a separate tape and replacing MPI communications in the forward pass with appropriate reversed communications to communicate gradient information in the reverse pass. These approaches are not work efficient [158], due to the cost of communicating and combining parallel gradient information.

PARAD: Work-efficient and scalable reverse-mode AD

This paper introduces PARAD, a provably efficient parallel algorithm for reverse-mode AD for determinacy-race-free recursive fork-join programs. Given such a program F, PARAD computes the gradient of F using reverse-mode AD work-efficiently and with parallel scalability comparable to that of F itself. In particular, PARAD exploits the logical parallel control flow of the input function F to parallelize the reverse-mode AD computation of F.

Section 4.4 analyzes the parallel performance of PARAD using work/span analysis [77, Ch. 27]. The work of a computation is the total number of instructions executed, and the span is the length of a longest path of dependencies in the program. Section 4.4 shows that, given an input function F : Rⁿ → Rᵐ which takes work T1(F) and span T∞(F) to compute, PARAD computes reverse-mode AD of F in work Θ(m · T1(F)) and span O(log m + log(T1(F)) T∞(F)). To the best of our knowledge, PARAD is the first parallel algorithm for performing reverse-mode AD that both is provably work-efficient and has span within a polylogarithmic factor of T∞(F). To efficiently parallelize reverse-mode AD, PARAD implements an SP-Tape data structure, which records a tape for F efficiently in parallel, and a novel parallel algorithm for maintaining gradients.

Algorithm       Ts/T1   Ts/T18   T1/T18
PARAD           0.57    5.12     9.02
PARAD+S         0.66    5.88     8.89
Locks           0.45    2.77     6.14
Worker-Local    0.94    4.96     5.27

Figure 4-1: Performance comparison of PARAD, PARAD+S, Locks, and Worker-Local over the 8 application benchmarks discussed in Section 4.6. Average (geometric mean) work-efficiencies Ts/T1 and 18-core speedups Ts/T18 are provided relative to the serial runtime Ts of Adept.

A work-efficient, scalable, deterministic parallel tape Section 4.3 describes the SP-Tape data structure for recording a tape of the forward-pass execution of F, including all series-parallel relationships between primitive operations and functions in F. The SP-Tape data structure ensures that, if F is determinacy-race free, then the SP-Tape recorded for the forward-pass execution of F is the same, regardless of how the forward pass is scheduled at runtime. Section 4.3 justifies that the SP-Tape records the tape in a work-efficient and scalable manner with bounded contention. The maintenance of an SP-Tape resembles techniques in previous work for recording series-parallel dependencies in recursive fork-join programs [288], but with substantial extensions to implement reverse-mode AD.

Work-efficient parallel maintenance of gradients To achieve a work-efficient and scalable reverse pass, PARAD uses a novel algorithm to maintain gradients that overcomes the problems of approaches based on locks or thread-local gradient tables. In contrast to lock-based approaches, the algorithm avoids the overheads of locks and bounds the contention involved in updating gradients. In contrast to approaches that use thread-local gradient tables, the algorithm allows gradients to be read in an asynchronous manner, while maintaining both work efficiency and span comparable to the input function F. Section 4.4 describes this algorithm for maintaining sets of gradients in parallel.

The LibPARAD parallel reverse-mode AD library

We implemented PARAD in a library, called LibPARAD, to evaluate the empirical performance of PARAD. LibPARAD extends the Adept [156] library for reverse-mode AD computation of serial C++ programs, which is known to exhibit low overheads in practice. LibPARAD supports the use of the Cilk programming language [116, 219, 172] to encode recursive fork-join parallelism. LibPARAD captures the series-parallel relationships between operations using compiler-inserted program instrumentation, based on a version of the CSI framework [302] that instruments the Tapir compiler intermediate representation of recursive fork-join parallelism [305]. This approach has the extra benefit that programmers need not annotate the logical parallelism in the program. Instead, LibPARAD captures this logical parallelism based on the linguistic constructs in the original program. Section 4.5 describes the implementation of LibPARAD as well as several practical optimizations that LibPARAD implements on top of the PARAD algorithm. We evaluated LibPARAD's serial and parallel performance in practice on a variety of machine-learning benchmarks, and we compare the performance of LibPARAD to Adept applied to the serial projection [116, 305] of each benchmark, as well as to implementations that use fine-grained locks (Locks) and thread-local gradient tables (Worker-Local).

Figure 4-1 summarizes our empirical evaluation of LibPARAD, in which we compared PARAD and PARAD+S, which incorporates additional optimizations described in Section 4.5, with algorithms using locks and worker-local tables. On average, the PARAD and PARAD+S algorithms have better scalability than Locks and Worker-Local both in terms of self-relative speedup and relative to the serial Adept code. Section 4.6 dives into the empirical evaluation of LibPARAD and examines how the scalability of each algorithm varies across benchmarks.

Contributions This paper makes the following contributions.

1. We introduce PARAD, an algorithm that performs reverse-mode AD for determinacy-race-free recursive fork-join parallel programs with substantially fewer overheads than existing systems. Given a determinacy-race-free recursive fork-join program F : Rⁿ → Rᵐ, PARAD automatically parallelizes the reverse-mode AD computation of F.

2. Using work/span analysis [77, Ch. 27], we show that PARAD performs reverse-mode AD on a given function F in work O(m · T1(F)) and span O(log m + log(T1(F)) T∞(F)). To the best of our knowledge, PARAD is the first parallel algorithm for performing reverse-mode AD that is both provably work-efficient and has span within a polylogarithmic factor of T∞(F).

3. We introduce LibPARAD, an implementation of PARAD for performing automatically parallelized reverse-mode AD for determinacy-race-free recursive fork-join programs. LibPARAD extends the Adept [156] C++ library for serial reverse-mode AD, which is known to outperform other C++ AD libraries [325].

4. We study the empirical performance of LibPARAD on eight machine-learning benchmarks and compare the PARAD, PARAD+S, Locks, and Worker-Local algorithms for reverse-mode AD.

Organization The remainder of the paper is organized as follows. Section 4.2 provides background on AD and fork-join parallelism. Section 4.3 describes the SP-Tape data structure and analyzes it in terms of work and span. Section 4.4 presents and analyzes PARAD's work-efficient and scalable reverse-mode algorithm. Section 4.5 describes the implementation of LibPARAD, a C++ library implementation of PARAD, based on the Adept C++ library for serial reverse-mode AD. Section 4.6 compares the empirical performance of LibPARAD against Adept and a lock-based implementation of reverse-mode AD. Section 4.7 discusses related work in parallel AD. Section 4.8 provides concluding remarks.

4.2 Preliminaries

This section provides background information on the serial algorithm for reverse-mode AD, recursive fork-join parallelism, the dag model of multithreading, and work/span analysis. To simplify the description of reverse-mode AD, we shall consider input programs F : Rⁿ → Rᵐ for which m = 1. It is straightforward to extend this description to programs where m > 1.

A serial algorithm for reverse-mode AD In general, a serial reverse-mode AD computation, as implemented by tools including Adept [156], ADOL-C [340], and Tapenade [145], operates in two passes. Given a function

Forward pass program:

TwoByTwoMatVecSqLoss(a, b, c, d, e, f):
1  g = a · e + b · f
2  h = c · e + d · f
3  L = g² + h²
4  return L

Gradient-table state after each step of reverse-mode AD:

Step   ∂a       ∂b       ∂c       ∂d       ∂e                ∂f                ∂g    ∂h    ∂L
0      0        0        0        0        0                 0                 0     0     1
1      0        0        0        0        0                 0                 2g    2h    0
2      0        0        2h · e   2h · f   2h · c            2h · d            2g    0     0
3      2g · e   2g · f   2h · e   2h · f   2h · c + 2g · a   2h · d + 2g · b   0     0     0

Statement stack:
     index      endIndex
S0   index(g)   4
S1   index(h)   8
S2   index(L)   10

Operation stack:
     index      mul
O0   index(a)   ∂g/∂a = e
O1   index(e)   ∂g/∂e = a
O2   index(b)   ∂g/∂b = f
O3   index(f)   ∂g/∂f = b
O4   index(c)   ∂h/∂c = e
O5   index(e)   ∂h/∂e = c
O6   index(d)   ∂h/∂d = f
O7   index(f)   ∂h/∂f = d
O8   index(g)   ∂L/∂g = 2g
O9   index(h)   ∂L/∂h = 2h

Figure 4-2: Illustration of a serial reverse-mode AD computation on a simple program, TwoByTwoMatVecSqLoss. The tables show the statement and operation stacks, recorded during the execution of the forward pass of the program, and the state of the gradient table after processing each statement (in reverse order) during the reverse pass.

SerialReversePass(S, O, G):
1   i ← |O| − 1
2   for k = |S| − 1, |S| − 2, . . . , 0
3       α ← G[Sk.index]
4       G[Sk.index] ← 0
5       while i ≥ Sk−1.endIndex
6           G[Oi.index] += α · Oi.mul
7           i −= 1

Figure 4-3: Pseudocode for the serial algorithm for computing the reverse pass over the statement stack S and operation stack O with input gradients in the table G.

F : Rⁿ → Rᵐ, the forward pass first executes F and records the primitive operations of F onto a tape data structure. The reverse pass maintains a table of gradient values and processes the tape data structure step by step in reverse, updating the gradient values at each step.

To examine the algorithm more closely, let us first examine the tape data structure and the forward pass. The serial tape data structure consists of two stacks: a statement stack, corresponding to writes to variables in the program F, and an operation stack, which records the values read to compute statements. Each differentiable variable v in the program, corresponding to an executed statement in F, is assigned a unique integer identifier index(v), called the gradient index. A statement-stack entry contains the index associated with the left-hand-side variable v, and the length of the operation stack when the statement was inserted. Each operation-stack entry contains the index of the variable u being read, as well as the partial derivative of the statement's left-hand-side variable v with respect to u. Figure 4-2 illustrates the statement and operation stacks recorded for a simple program with 3 statements.

To see how the forward pass populates the statement and operation stacks, consider the forward-pass execution of TwoByTwoMatVecSqLoss in Figure 4-2 line by line. Line 1 computes the statement g = a · e + b · f. The expression on the right-hand side of the statement is evaluated first, and the operations O0, O1, O2, O3 are pushed onto the operation stack. After the right-hand expression is evaluated, an entry S0 is pushed onto the statement stack recording the gradient index index(g) of the left-hand-side variable g, and the current length 4 of the operation stack. As a result, after line 1 is executed, statement S0 on the statement stack identifies operations O0, O1, O2, and O3 as storing partial derivatives ∂g/∂a, ∂g/∂e, ∂g/∂b, and ∂g/∂f, respectively. Line 2 is handled similarly by pushing operations O4, O5, O6, O7 onto the operation stack and then pushing statement S1 onto the statement stack. Lastly, to handle line 3, operations O8, O9 are pushed onto the operation stack and statement S2 onto the statement stack.

After executing the function, the reverse pass creates a gradient table with one entry for each gradient index created during the execution of the forward pass. Initially, all entries of the gradient table are set to 0, except for (one or more) variables whose derivatives are already known. In the example in Figure 4-2, there is a single loss variable L whose gradient-table entry is initialized to 1. The gradient-table state after each step of the reverse-mode AD algorithm is presented in Figure 4-2. In the figure, each row presents the state of the gradient table after a step of the reverse pass. Each row therefore encodes the coefficients of a differential expression obtained via differentiation of an implicit function. Row 0, for example, encodes the initial (trivial) differential expression ∂L = 1 · ∂L. After processing statement S2, this expression is transformed to ∂L = (∂L/∂g)∂g + (∂L/∂h)∂h = 2g ∂g + 2h ∂h. The gradient table on Row 1 encodes this transformation, in which the coefficient of ∂L is 0, and the coefficients of ∂g and ∂h are 2g and 2h, respectively. To process statement S1, the reverse pass reads the relationship between ∂h and the operations used to compute h encoded on the operation stack: ∂h = c ∂e + e ∂c + d ∂f + f ∂d.
The reverse pass then uses this relationship to update the differential expression encoded in the gradient table. The new differential expression, encoded on Row 2, is ∂L = 2g ∂g + 2h(c ∂e + e ∂c + d ∂f + f ∂d). The last step processes statement S0 as we processed S1, and results in the final gradient table (Row 3), which contains the partial derivatives of L with respect to each variable in the program.

Figure 4-3 gives pseudocode for the SerialReversePass procedure that performs the reverse pass, specifically, the updates to the gradient table using a previously recorded

statement stack S, operation stack O, and input gradients G. The total work to maintain the two stacks and execute SerialReversePass is Θ(|O|) + Θ(|S|) = Θ(T1(F)) for an input program F with work T1(F).
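The following plain-C++ sketch mirrors the statement and operation stacks and the SerialReversePass procedure of Figure 4-3 (the data layout is assumed for illustration and is not Adept's actual representation; the end index of the nonexistent statement S−1 is taken to be 0, so that the first statement consumes the operations pushed before it).

#include <vector>
#include <cstddef>

struct Statement { std::size_t index; std::size_t endIndex; };  // gradient index; op-stack length at push
struct Operation { std::size_t index; double mul; };            // gradient index read; partial derivative

// Reverse pass over statement stack S and operation stack O, updating gradient table G.
void SerialReversePass(const std::vector<Statement>& S,
                       const std::vector<Operation>& O,
                       std::vector<double>& G) {
    std::ptrdiff_t i = static_cast<std::ptrdiff_t>(O.size()) - 1;
    for (std::ptrdiff_t k = static_cast<std::ptrdiff_t>(S.size()) - 1; k >= 0; --k) {
        double alpha = G[S[k].index];   // extract the statement's accumulated gradient
        G[S[k].index] = 0.0;
        std::ptrdiff_t start =
            (k > 0) ? static_cast<std::ptrdiff_t>(S[k - 1].endIndex) : 0;
        while (i >= start) {            // operations recorded for statement k
            G[O[i].index] += alpha * O[i].mul;
            --i;
        }
    }
}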

Observations Conceptually, PARAD makes use of two observations about this serial reverse-mode AD algorithm to parallelize it. First, during the forward execution of a parallel program, the serial tape data structure behaves like a list monoid with an append operator. Second, the read and write sets of the forward execution are swapped in the reverse pass over the tape: every read becomes a write, and every write becomes a read. PARAD leverages these observations in its design of the SPTape for recording the forward pass, and in its algorithm for automatically parallelizing the reverse pass to compute derivatives.

Recursive fork-join parallelism Recursive fork-join parallelism allows logical parallelism in a program to be exposed using the keywords [77, Ch. 27] spawn, sync, and parallel for. When preceding a function call F , the spawn keyword spawns F , allowing F to execute in parallel with its continuation — the statement immediately after the spawn of F . The sync keyword complements the spawn keyword and acts as a local barrier that joins together, or syncs, the parallelism specified by spawn. When a function F reaches a sync, control is not allowed to pass that sync until all functions spawned previously in F return. These keywords can be used to implement other parallel control constructs, such as the parallel for loop, which allows all of its iterations to operate logically in parallel. A recursive fork-join program has a serial projection [116, 305], which intuitively is the serial program derived by removing parallel keywords from the fork-join program. If the fork-join program contains no determinacy races, then every execution of the program matches that of its serial projection.
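As a minimal illustration (a generic sketch, not code from the thesis), the following Cilk-style C++ fragment uses these keywords; eliding cilk_spawn, cilk_sync, and cilk_for yields the program's serial projection.

#include <cilk/cilk.h>

// Recursive fork-join: the spawned call may execute in parallel with its continuation.
long fib(long n) {
    if (n < 2) return n;
    long x = cilk_spawn fib(n - 1);
    long y = fib(n - 2);      // continuation of the spawn
    cilk_sync;                // local barrier: wait for the spawned call to return
    return x + y;
}

// A parallel loop is syntactic sugar on top of spawn and sync.
void scale(double* a, long n, double c) {
    cilk_for (long i = 0; i < n; ++i)
        a[i] *= c;
}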

The computation dag model An execution of a fork-join program can be modeled as a computation dag G = (V, E). Each directed edge represents a strand, that is, a sequence of executed instructions with no spawn or sync statements. The execution of a spawn statement results in a spawn vertex, which has two successor strands. The execution of a sync statement results in a sync vertex, which has multiple incoming edges. The dag G is a series-parallel dag [105], which means that G has two distinguished vertices — a source vertex, from which one can reach every other vertex in G, and a sink vertex, which is reachable from every other vertex in G — and can be constructed by recursively combining pairs of series-parallel dags using series and parallel combinations. A series combination combines two dags G1 and G2 by identifying the sink vertex of G1 with the source vertex of G2. A parallel combination combines two dags G1 and G2 by identifying their source vertices with each other and their sink vertices with each other. The recursive construction of a series-parallel dag can be represented as a binary tree, called the SP tree [105], as follows. Each leaf in the SP tree represents a strand in the computation dag, and each internal node is either an S-node or a P-node. A subtree of the SP tree represents a series-parallel subdag of the computation dag. An S-node represents a series composition of the two subdags represented by its children. A P-node represents a parallel composition of the two subdags represented by its children.

Work/span analysis Given a fork-join program whose execution is modeled as a DAG A, we can bound the P-processor running time TP(A) of the program using work/span analysis [77, Ch. 27]. The work T1(A) is the number of instructions in A, and the span T∞(A) is the length of a longest path in A. Greedy schedulers [49, 99, 137] can execute a deterministic program with work T1 and span T∞ on P processors in time TP satisfying max{T1/P, T∞} ≤ TP ≤ T1/P + T∞. A similar bound can be achieved by more practical work-stealing schedulers [40, 41]. The speedup of an algorithm on P processors is T1/TP, which the inequality shows to be at most P in theory. The parallelism T1/T∞ is the greatest theoretical speedup possible for any number of processors.
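For example (numbers chosen purely for illustration), a computation with work T1 = 10^6 and span T∞ = 10^3 run on P = 16 processors satisfies max{T1/P, T∞} = 62,500 ≤ TP ≤ T1/P + T∞ = 63,500, and its parallelism is T1/T∞ = 1000, so no number of processors can yield speedup greater than 1000.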

4.3 The SPTape Data Structure

This section describes the SPTape data structure that PARAD uses to record, in parallel, the statement and operation stacks for reverse-mode AD, and the series-parallel dependen- cies in the program. After recording, the SPTape supports parallel traversals with work and span proportional to that of the original recorded program. For later use, we describe and analyze generic parallel traversal algorithms over the SPTape and provide a set of rules governing memory access during the traversal that ensure the absence of determinacy races. For didactic simplicity, we describe the SPTape data structure for binary recursive fork-join programs, in which each node in the computation dag has at most two incoming edges.

Basic structure of an SPTape The SPTape data structure for a recursive fork-join program stores an SP tree [105] of the program augmented with data nodes that store “subtapes.” A subtape contains a statement and operation stack that are used to record derivative dependencies within a strand. The subtapes in the SPTape represent an ordered partitioning of the statement and operation stack data structures employed by the serial AD tool discussed in Section 4.2.
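A minimal C++ sketch of this structure (type and field names are illustrative, not LibPARAD's actual declarations): data nodes carry a subtape consisting of a statement stack and an operation stack, while S- and P-nodes carry only an ordered list of children.

#include <vector>
#include <cstddef>

enum class NodeType { Series, Parallel, Data };

struct SubtapeStatement { std::size_t gid; std::size_t endIndex; };  // gradient index; op-stack length at push
struct SubtapeOperation { std::size_t gid; double mul; };            // gradient index read; partial derivative

struct SPTapeNode {
    NodeType type;
    std::vector<SPTapeNode*> children;    // ordered child subtrees (S- and P-nodes)
    std::vector<SubtapeStatement> S;      // subtape statement stack (data nodes only)
    std::vector<SubtapeOperation> O;      // subtape operation stack (data nodes only)
};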

Recording an SPTape serially An SPTape is constructed incrementally during the execution of the forward pass. We shall first see a serial algorithm for constructing the SP-Tape, and then we shall see how to parallelize this algorithm. To record an SPTape, a single processor maintains a shadow stack that is updated based on the parallel control flow of the forward-pass execution. Each entry on the shadow stack stores a local SPTape T , which is initialized with a single root node. When an entry is popped from the shadow stack, the local SPTape T for the popped entry is appended to the children of the SPTape node of the new top entry on the stack. Figure 4-4 presents pseudocode for creating an SPTape. At a high level, this pseudocode creates S and P nodes in the SPTape based on spawn and sync statements in the program, and it records derivative dependencies in the subtape stored in the data node at the top of the shadow stack. Figure 4-5 illustrates an example of how the operations on the shadow stack execute in a simple example fork-join program. As the example shows, a P-node is

PushShadow(K, type):
1   Push(K)
2   Top(K).T.type ← type

PopShadow(K):
1   τ ← Top(K).T
2   Pop(K)
3   Append(Top(K).T.children, τ)

On entering a function:
1   PushShadow(K, Series)

On executing a spawn:
1   PushShadow(K, Parallel)

On executing the continuation of a spawn:
1   PushShadow(K, Series)

Before executing arithmetic:
1   if Top(K).T.type ≠ Data
2       PushShadow(K, Data)

On returning from a function:
1   if Top(K).T.type == Data
2       PopShadow(K)
3   PopShadow(K)

Immediately before executing a sync:
1   if Top(K).T.type == Data
2       PopShadow(K)
3   PopShadow(K)

Immediately after executing a sync:
1   PopShadow(K)

Combine(K, τ):
1   Append(Top(K).T.children, τ)

Figure 4-4: Pseudocode for the maintenance of the SPTape data structure. The variable K denotes the shadow stack, each entry of which contains a local SP-Tape T. The field T.type identifies the type of the root node of T. The field T.children is a list of children of the root node of T. The Combine method is used to incorporate an SPTape τ recorded in parallel.

pushed onto the stack when a program executes a spawn, reflecting the fact that both the spawned function and its continuation can execute in parallel. S-nodes are pushed onto the stack when the program enters a function or a continuation of a spawn. Nodes are popped off the stack at the ends of functions and around sync operations. The following lemma shows that, when the serial execution of a series-parallel computation dag G completes, the local SPTape at the top of the shadow stack records the gradient information and series-parallel dependencies for G.

Lemma 25 When a serial execution completes a computation dag G, the top of the shadow stack stores a local SPTape that records the execution of G.

Proof. The proof follows by induction on the structure of G. In the base case, G is a single strand u, and the top of the shadow stack stores an SPTape with a single S-node containing a single child data node that records the derivative dependencies in u. Otherwise G is the result of a series or parallel composition of subdags G1 and G2. In either case, the pseudocode in Figure 4-4 ensures that an entry for G is pushed onto the shadow stack at the beginning of G, and separate pushes and pops occur at the beginning and end, respectively, of each subdag. When popping the shadow stack at the end of each subdag, PopShadow

TwoByTwoMatVecSqLoss(a, b, c, d, e, f, K):
1   PushShadow(K, Series)
2   PushShadow(K, Parallel)
3   spawn λ{
4       PushShadow(K, Series)
5       PushShadow(K, Data)      // (D1)
6       g = a · e + b · f
7       PopShadow(K)             // (D1)
8       PopShadow(K)             // Series
9   }
10  PushShadow(K, Series)
11  PushShadow(K, Data)          // (D2)
12  h = c · e + d · f
13  PopShadow(K)                 // (D2)
14  PopShadow(K)                 // Series
15  sync
16  PopShadow(K)                 // Parallel
17  PushShadow(K, Data)          // (D3)
18  L = g² + h²
19  PopShadow(K)                 // (D3)
20  PopShadow(K)                 // Series
21  return L

Figure 4-5: An example of the SPTape recorded for a parallel implementation of TwoByTwoMatVecSqLoss.

in Figure 4-4 shows that the new top of the shadow stack is either a P-node, if a spawn executed, or an S-node otherwise. In addition, Figure 4-4 shows that the local SPTape formerly at the top of the stack is appended to the list of children for the new top of the stack. Hence the top of the shadow stack stores a node of the correct type and correct child SPTape structures for the execution of G1 and G2.
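A rough C++ rendering of the shadow-stack operations in Figure 4-4 (building on the SPTapeNode sketch above; simplified, and not LibPARAD's code):

#include <vector>

struct ShadowEntry { SPTapeNode* T; };        // each entry owns a local SPTape
using ShadowStack = std::vector<ShadowEntry>;

void PushShadow(ShadowStack& K, NodeType type) {
    K.push_back(ShadowEntry{ new SPTapeNode{ type } });
}

void PopShadow(ShadowStack& K) {
    SPTapeNode* tau = K.back().T;             // local SPTape of the popped entry ...
    K.pop_back();
    K.back().T->children.push_back(tau);      // ... becomes a child of the new top entry
}

void Combine(ShadowStack& K, SPTapeNode* tau) {
    K.back().T->children.push_back(tau);      // splice in an SPTape recorded in parallel
}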

Parallelizing SPTape construction To construct an SPTape in parallel, the parallel execution of a computation dag G is augmented to maintain separate shadow stacks. Intuitively, consider two series-parallel subdags G1 and G2 of G that are composed in parallel and are scheduled to execute in parallel. The execution uses distinct shadow stacks to separately record the SPTape structures of G1 and G2 and then combines those SPTape structures when execution reaches the common sink vertex of G1 and G2. In modern dialects of Cilk, this behavior can be accomplished using reducer hyperobjects [115]. We describe this behavior generically for series-parallel computation dags. We model the scheduling of a computation dag G as a partitioning of the strands into scheduling components, such that logically parallel strands execute in parallel if the scheduler places the strands into distinct scheduling components. We assume that the scheduler maintains the following properties of scheduling components:

• A new scheduling component can only begin at a successor strand u of a spawn vertex in G.

WalkSPTape(node, G, VisitSubTape, dir):
1   if node.type = D:
2       VisitSubTape(node.S, node.O, G, dir)
3   C = node.children
4   if dir is right-first, reverse C
5   if node.type = S:
6       for c ∈ C:
7           WalkSPTape(c, G, VisitSubTape, dir)
8   if node.type = P:
9       parallel for c ∈ C:
10          WalkSPTape(c, G, VisitSubTape, dir)

Figure 4-6: Pseudocode for the parallel right-first traversal of SPTape.

• Consider a spawn vertex s, which is the source vertex of a series-parallel subdag G produced via parallel composition. A new component C that begins at a successor strand u of s terminates at a predecessor strand v of the sink vertex of G such that v is reachable from u.

These properties create a correspondence between scheduling components and series-parallel subdags of G. In particular, the same scheduling component contains both the first strand in a series-parallel subdag and the last strand in that subdag. We shall also assume that the execution of strands within the same component follows a depth-first traversal consistent with the execution of the serial projection of that computation. Practical work-stealing schedulers [40, 41] typically satisfy these assumptions. The parallel execution of the computation dag is augmented to maintain separate SPTape structures for distinct scheduling components. At the first strand s of each scheduling component C, the scheduler creates a new local SPTape for C. Subsequent operations within C operate on C's local SPTape as described in Figure 4-4. At the end of C, the local SPTape τ for C is combined into the shadow stack K′ of the scheduling component C′ that contains the predecessor w of s. In particular, τ is appended to the list of children of the root node of the SPTape at the top of K′, using the Combine method in Figure 4-4. The following lemma extends Lemma 25 to incorporate combining local SPTape structures.

Lemma 26 When execution completes a computation dag G, the top of the shadow stack stores a local SPTape that records the execution of G.

Proof. The proof follows by extending the induction in Lemma 25 to handle computation dags G resulting from a composition involving a subdag G′ whose strands belong to a different scheduling component. Suppose the lemma holds for such a subdag G′. Because the initial strand of G′ must be a successor of a spawn vertex, G must be the result of a parallel composition. Hence, the pseudocode in Figure 4-4 shows that, at the end of executing G, the top of the shadow stack contains an SPTape structure T rooted at a P-node. The Combine routine appends the SPTape τ for G′ to the children of this P-node, yielding an SPTape T that correctly represents G as a parallel composition involving G′.

Race-free traversals of SPTape The PARAD algorithm makes use of parallel traversals over the SPTape. Here we provide conditions under which these traversals are race-free³, and analyze their work and span. For didactic purposes, consider a table Gx mapping gradient indices to distinct memory locations. Consider a call to VisitSubTape(S, O, G, dir, Gx), and let IS, IO be the sets of gradient indices appearing in the statement stack S and operation stack O, respectively. We say a parallel traversal of an SP-Tape obeys the safe adjoint access property if VisitSubTape performs a read of Gx[gid] only if gid ∈ IS ∪ IO, and performs a write to Gx[gid] only if gid ∈ IS. As Lemma 27 shows below, a parallel traversal of an SP-Tape produced by a determinacy-race-free program is free of data races on entries of Gx if the traversal has the safe adjoint access property.

Lemma 27 Let T be an SPTape produced by a determinacy-race-free fork-join program P . Consider a parallel traversal of T that obeys the safe adjoint access property. Then, the parallel traversal of T has no data races on accesses to tables Gx indexed by gradient indices appearing in statements and operations on the tape.

Proof. Consider a parallel traversal of an SPTape T that has the safe adjoint access property. We will prove that if there exists a data race on entries of Gx during the traversal of T, then there exists a determinacy race in the program P that produced T. Consider subtapes U, V corresponding to two strands u, v in the program P. By Lemma 26, the subtapes U, V will be processed in parallel by WalkSPTape if and only if their corresponding strands u, v were logically in parallel in P. Let IS(U) and IS(V) be the sets of gradient indices appearing in statements on the subtapes U and V, respectively. Similarly define sets IO(U) and IO(V) for gradient indices appearing on the operation stacks. Suppose there exists a data race on Gx[gid] when evaluating VisitSubTape on U and V in parallel. For the data race to exist, at least one of these invocations must perform a write to Gx[gid]. Suppose the invocation processing U performs a write to Gx[gid]. Then gid ∈ IS(U), since VisitSubTape has the safe adjoint access property. The processing of V performs either a write or a read of Gx[gid], which implies (by the safe adjoint access property) that gid ∈ IS(V) ∪ IO(V). Since each statement in U.S and V.S corresponds to a write performed in the original program P, and each operation in U.O and V.O corresponds to a read in P, there exists a determinacy race between strands u and v in the original program P.

Work/span analysis of SPTape traversal Subsequent algorithms perform parallel traversals over the SPTape data structure. Here we prove bounds on the work and span of a traversal whose VisitSubTape evaluates a function f(·) on each operation-stack and statement-stack entry, where f performs O(σ) amortized work (amortized over the whole traversal) and has worst-case span O(υ).

Lemma 28 Consider a parallel traversal by WalkSPTape of an SPTape T produced by a parallel program P with work T1(P) and span T∞(P). If VisitSubTape processes each statement and operation with a function f(·) that performs O(σ) amortized work (amortized

³These conditions assume that the recorded program was, itself, free of determinacy races.

over the whole traversal) and has O(υ) worst-case span, then WalkSPTape performs T1 = O(σT1(P)) work and has T∞ = O(υT∞(P)) span.

Proof. The sum of the statement and operation stack sizes in the SPTape T is bounded by the total work of the program P. The total work of WalkSPTape is, therefore, O(σT1(P)). By Lemma 26, there is a one-to-one correspondence between strands in P and subtapes in T, and two subtapes in T are logically in parallel if and only if their associated strands are logically in parallel in P. Consider the sequence of strands (u1, u2, . . . , uk) that forms the critical path in P, and the corresponding sequence of subtapes (U1, U2, . . . , Uk) in T. Uniformly scaling the work of all instructions in P by υ does not alter the critical path and scales the span by υ. The span of WalkSPTape when applying a function f(·) with worst-case span υ is, therefore, at most T∞ = O(υT∞(P)).

4.4 The PARAD algorithm

This section presents PARAD, a work-efficient algorithm for performing parallel reverse- mode AD. For a determinacy-race-free fork-join program F with work T1(F ) and span T∞(F ), the program R that performs reverse-mode AD on F has work T1(R) = Θ(m·T1(F )) and span T∞(R) = Θ(log m + T∞(F ) log(T1(F ))).

Design of the PARAD algorithm The PARAD algorithm is designed to parallelize the serial reverse-mode AD algorithm based upon the series-parallel structure of the recorded program. The serial reverse-mode AD algorithm can be implemented by executing the SerialReversePass function on each subtape in the order of a right-first traversal of the SPTape. The problem with parallelizing this traversal, however, is the presence of write-write races on the gradient table, which occur when the original program reads the same memory location in parallel. The key challenge solved by PARAD is the resolution of these races in a work-efficient and scalable manner.

To resolve write-write races on the gradient table, PARAD employs the strategy of precomputing a unique memory location where each operation can safely deposit its gradient contribution. These memory locations are assigned from a deposit array that provides one slot for each recorded operation. The gradient contributions are aggregated when a statement extracts its gradient value. In the serial AD algorithm, a statement can extract its gradient value by reading from the global gradient table. In PARAD, however, a statement must potentially collect and sum multiple gradient contributions. To allow this aggregation to be performed efficiently, the deposit array is organized so that all the gradient contributions a statement must collect are contiguous in the deposit array.

Let us discuss the structure of the PARAD algorithm, whose pseudocode is provided in Algorithm 1. The organization of the deposit array in PARAD is accomplished as follows. Step 1 of PARAD computes a table OS that associates each operation with the statement that will consume its gradient contribution. Next, in Step 2, PARAD collects and semisorts all operations o by their associated statement OS[o].⁴ The semisorted array of operations O* is used to assign each operation a unique index in the deposit array D based upon its location in O*. The assignment of operations to deposit array locations is recorded in a table Osnd.

⁴For didactic simplicity, the pseudocode of PARAD additionally semisorts by each operation's gradient index to make it simpler to export them to a global gradient table at the end of the computation.

Algorithm 1  PARAD Algorithm
Inputs: Gradient table G, and SP-Tape T.
Outputs: Updated gradient table G.

1. Construct OS mapping operations to statements.
   • Perform a parallel left-first traversal of the SP-Tree. At statement s, set GS[s.gid] = s. At operation o, set OS[o] = GS[o.gid].

2. Construct Osnd and Srcv.
   • Traverse the SP-Tape T and pack all operations into a contiguous array O*.
   • Semisort O* using the key function k1(o) = o.gid.
   • For each subarray of O* with operations of equal k1(o), semisort using the key k2(o) = OS[o].
   • For each subarray O*[m..m + k] of operations o with equal k2(o), do the following. Set Srcv[k2(o)] = (m, m + k). For l = m, m + 1, . . . , m + k, set Osnd[O*[l]] = l.

3. Perform reverse-mode AD.
   • Allocate deposit array D of size |O*|.
   • Perform a right-first traversal.
   • At statement s, let (m, m + k) = Srcv[s] and compute α = G[s.gid] + sum(D[m..m + k]). Set G[s.gid] = 0, and D[m..m + k] = 0.
   • At operation o, compute β = α · o.mul and set D[Osnd[o]] = β.

4. Export gradients in D to gradient table G.
   • Find subarrays O*[n..n + r] in O* of operations o with equal gradient index (i.e., k1(o) = o.gid).
   • For each such subarray, set G[O*[n].gid] = sum(D[n..n + r]).

Since all operations associated with the same statement are contiguous in O*, the gradient contributions for a statement are in a contiguous range of the deposit array D. A table Srcv records, for each statement s, a reference to the subarray D[m..m + k] of k gradient contributions to s. The deposit array is now used by PARAD to avoid write-write races during its right-first traversal of the SPTape. Step 3 of PARAD performs reverse-mode AD via a right-first traversal of the SPTape. The final gradients are then exported in Step 4 to a global gradient table for application-specific use (e.g., for a gradient-descent step). Figure 4-7 provides an illustration of the deposit-array structure for the parallelized TwoByTwoMatVecSqLoss function. The tables OS, Srcv, and Osnd are shown in Figure 4-7 as columns of the tables illustrating the statement and operation stacks. The deposit-array state is shown after processing each of the subtapes D3, D2, and D1.
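The following serial C++ sketch illustrates this organization of the deposit array (a stable sort stands in for the work-efficient parallel semisort, and op_stmt plays the role of the OS table; all names and the sentinel convention are assumptions made for the sketch).

#include <vector>
#include <algorithm>
#include <utility>
#include <cstddef>

struct DepositIndex {
    std::vector<std::size_t> Osnd;                           // operation -> slot in deposit array D
    std::vector<std::pair<std::size_t, std::size_t>> Srcv;   // statement -> [begin, end) slots in D
};

// op_stmt[o] is the consuming statement of operation o, or npos if none consumes it.
DepositIndex BuildDepositIndex(const std::vector<std::size_t>& op_stmt,
                               std::size_t num_statements) {
    const std::size_t npos = static_cast<std::size_t>(-1);
    const std::size_t n = op_stmt.size();
    std::vector<std::size_t> order(n);
    for (std::size_t o = 0; o < n; ++o) order[o] = o;
    // Group operations by their consuming statement (the role of the semisort).
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return op_stmt[a] < op_stmt[b]; });

    DepositIndex idx;
    idx.Osnd.resize(n);
    idx.Srcv.assign(num_statements, std::pair<std::size_t, std::size_t>{0, 0});
    for (std::size_t slot = 0; slot < n; ++slot) {
        const std::size_t o = order[slot];
        idx.Osnd[o] = slot;                          // unique deposit slot for operation o
        const std::size_t s = op_stmt[o];
        if (s == npos) continue;                     // no statement extracts this contribution
        if (slot == 0 || op_stmt[order[slot - 1]] != s) idx.Srcv[s].first = slot;
        idx.Srcv[s].second = slot + 1;               // contributions to s stay contiguous
    }
    return idx;
}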

Theorem 29 PARAD and SerialReversePass compute the same gradients for a recorded program P that is free of determinacy races.

Proof. We first compare a serial execution of PARAD to the execution of SerialReversePass. In an execution of SerialReversePass, each operation accumulates its gradient contribution into the gradient table G. This accumulated value is extracted when


Figure 4-7: Illustration of PARAD performing reverse-mode AD on a parallel implementa- tion of TwoByTwoMatVecSqLoss. This example continues the running example used in Figure 4-2 from Section 4.2 and Figure 4-5 from Section 4.3. The statement and operation stacks are presented in the style of the serial AD example in Figure 4-2, and the subarrays in these stacks that correspond to the subtapes D1, D2, and D3 are indicated. Additional columns have been added to the statement and operation stacks to illustrate the Srcv, OS, and Osnd functions that map statements and operations to locations in the deposit array. For clarity, entries that would be undefined by these maps are marked as input or output. The deposit array state after processing each subtape during PARAD’s right-first traversal over the tape is provided. The final gradients ∂a, ∂b, . . . , ∂h are obtained by summing the indicated subarrays of the deposit array.

processing a statement with the same gradient index. A single operation's contribution to the gradient, therefore, is visible to at most one statement. We now examine the steps of PARAD. The left-first traversal of the SPTape in Step 1 of Algorithm 1 maps each operation O[i] to the last statement S[j] that was encountered during the traversal for which S[j].gid = O[i].gid. The statement recorded in OS[i] is the statement that will extract O[i]'s contribution to the gradient.

In Step 2, the array O* is first collected and semisorted to group together operations whose gradient contributions are extracted by the same statement. The step then processes subarrays O*[m..m + k] of operations in O* and maps each operation to its index Osnd in O*. At the same time, the step records in Srcv the range of indices that will contain all gradient contributions to the value that the statement associated with the group will extract.

Step 3 performs a right-first traversal of the SPTape. By Lemma 26, the serial right-first traversal of the SPTape processes statements and operations in the same order as SerialReversePass. When processing the statement S[j] with statement index j, we look up in Srcv[j] the subarray in D containing all gradient contributions to S[j].gid needed by statement S[j], compute their sum, and set the subarray in D to zero. When processing the operation O[i], the contribution of O[i] to the gradient is deposited at location D[Osnd[i]]. The value extracted by statement S[j] will exactly match the value extracted by the SerialReversePass algorithm when processing the same statement S[j].

After completing the right-first traversal over the tape, Step 4 accumulates the remaining gradient contributions deposited in D and places those contributions into the gradient table G. The two-phase semisort in Step 2 guarantees that all gradient contributions to the same gradient index are contiguous in D. By construction, the serial execution of PARAD has identical behavior to the serial SerialReversePass procedure for performing reverse-mode AD.

To prove that PARAD is correct when run in parallel, it suffices to prove that the left-first and right-first traversals are data-race free. By construction, PARAD has no races on the arrays O, S, OS, Osnd, Srcv. The remaining data accesses are to tables indexed by gradient indices. Since all accesses to these tables obey the safe adjoint access property, we can apply Lemma 27 to prove that the parallel traversals of the SPTape performed by PARAD are race-free. It follows that the gradient tables produced by PARAD and SerialReversePass are equivalent.

Analysis of PARAD We now analyze the work and span of Algorithm 1.

Theorem 30 Given a program F : Rⁿ → Rᵐ with total work T1(F) and span T∞(F), the program R performing reverse-mode AD on F performs work T1(R) = Θ(m · T1(F)) with span T∞(R) = Θ(log m + T∞(F) log T1(F)).

Proof. We analyze Algorithm 1 for m = 1. The bounds for larger m follow from spawning m instances of PARAD to process the m dimensions in parallel. Step 1 performs a left-first traversal of the SPTape where O(1) work is performed for each recorded statement and operation in the program F. We apply Lemma 28 with

σ = O(1), υ = O(1) to conclude that the total work and span of this step are O(T1(F)) and O(T∞(F)), respectively.

Step 2 performs a parallel compaction and a parallel scan over operations in F, both of which perform O(T1(F)) work and have O(log(T1(F))) span. The semisort performed in this step can be performed in O(T1(F)) work and O(log(T1(F))) span by using the logarithmic-depth semisort from [140], which semisorts N elements in Θ(N) work and Θ(log N) span.

Step 3 performs a right-first traversal with constant work per operation. The work performed for each statement may be Ω(1), but each unit of work is associated with a unique operation. The amortized work for each operation and statement is, therefore, O(1). Accumulating the subarray D[m..m + k] requires Θ(log k) span, and the worst-case span is O(log T1(F)). Therefore, we apply Lemma 28 with σ = O(1) and υ = O(log T1(F)) to conclude that this step performs O(T1(F)) work and has O(T∞(F) log T1(F)) span.

Step 4 performs a parallel scan and multiple in-parallel reductions with total work O(T1(F)) and span O(log T1(F)). Thus, the work and span are bounded by the cost of the right-first traversal of the SPTape, resulting in total work O(T1(F)) and span O(T∞(F) log T1(F)).

4.5 Implementation of LibPARAD

This section describes LibPARAD, an extension of the serial AD library Adept [156] that implements four different parallel algorithms for performing reverse-mode automatic differentiation, all of which employ the SPTape. Adept is a C++ library that implements reverse-mode AD via operator overloading. Other AD libraries using a similar approach include ADOL-C [340], Autograd [239], and PyTorch [280]. We chose Adept as the basis for our implementation because its clever use of C++ expression templates and particularly concise representation of its tape data structure allow it to outperform other C++ AD libraries [325].
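To illustrate the operator-overloading approach in general terms (a naive sketch, not Adept's implementation; Adept's expression templates avoid the temporaries and per-operator statements recorded here), an overloaded arithmetic operator can push its partial derivatives onto the operation stack and then a statement for its result:

#include <vector>
#include <cstddef>

struct StmtEntry { std::size_t gid; std::size_t endIndex; };
struct OpEntry   { std::size_t gid; double mul; };

static std::vector<StmtEntry> stmt_stack;   // global tape, for the sketch only
static std::vector<OpEntry>   op_stack;
static std::size_t next_gid = 0;

struct adouble {
    double value;
    std::size_t gid;
    adouble(double v = 0.0) : value(v), gid(next_gid++) {}
};

adouble operator*(const adouble& a, const adouble& b) {
    adouble r(a.value * b.value);
    op_stack.push_back({a.gid, b.value});            // dr/da = b
    op_stack.push_back({b.gid, a.value});            // dr/db = a
    stmt_stack.push_back({r.gid, op_stack.size()});  // record op-stack length at push
    return r;
}

adouble operator+(const adouble& a, const adouble& b) {
    adouble r(a.value + b.value);
    op_stack.push_back({a.gid, 1.0});                // dr/da = 1
    op_stack.push_back({b.gid, 1.0});                // dr/db = 1
    stmt_stack.push_back({r.gid, op_stack.size()});
    return r;
}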

Implementation of the SPTape LibPARAD modifies the parts of Adept that access its statement and operation stacks to instead use the SPTape data structure described in Section 4.3. We implemented the SPTape using a reducer hyperobject [115] and worker-local storage that is accessed using the worker identifier returned by Cilk's GetWorkerNumber runtime call. Storage in data nodes for subtapes is allocated out of worker-local storage to improve efficiency in practice. The operations described in Figure 4-4 for constructing an SPTape are inserted into the program automatically at compile time through the use of the CSI framework [302]. Specifically, LibPARAD uses a version of CSI that instruments the Tapir compiler representation of recursive fork-join parallelism [305] to insert these operations around spawn and sync statements, at the locations in the program code described in the figure. The LibPARAD library is designed to operate on any existing code using Adept, and does not require any additional annotations from the user aside from the normal use of Cilk keywords to express parallelism.

Our implementation deviates from the description in Figure 4-4 only slightly to handle nonbinary spawns. First, immediately after a sync, multiple PopShadow calls are performed to pop the S- and P-nodes pushed onto the shadow stack for each continuation of an

executed spawn statement that the sync statement syncs. Second, the SPTape reducer hyperobject optimizes the handling of chains of continuations, rather than simply enqueueing calls to Combine, in order to operate only on SPTape structures that capture the complete execution of a series-parallel subdag. Neither of these changes impacts the theoretical work/span of the computation or the structure of the SPTape.

Implementation of PARAD The implementation of PARAD includes optimizations related to the construction of the deposit array. In our discussion of Algorithm 1 in Section 4.4, we explained how to resolve write-write races by assigning operations unique memory locations from the deposit array in which to deposit their gradient contributions. Our implementation of PARAD avoids performing this work for operations that cannot possibly participate in a write-write race. We identify operations whose gradient index never appears in a statement, and for such operations we accumulate gradients using worker-local sparse arrays. Additionally, we identify operations in subtapes whose contribution to the gradient is extracted by a statement in the same subtape. For such operations, we set their deposit locations in D to be the corresponding entries of the global gradient table G. This does not introduce data races because it obeys the safe adjoint access property discussed in Section 4.3. The remaining operations are filtered and packed in parallel and are processed as described in Algorithm 1.
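As a small illustration of the first filter (assumed form, reusing the StmtEntry and OpEntry records from the sketch above): a gradient index that never appears on any statement can never be extracted by a statement during the reverse pass, so its operations can safely bypass the deposit array.

#include <vector>
#include <cstddef>

// Mark gradient indices that appear as the left-hand side of some statement.
std::vector<bool> GidAppearsInStatement(const std::vector<StmtEntry>& statements,
                                        std::size_t num_gids) {
    std::vector<bool> appears(num_gids, false);
    for (const StmtEntry& s : statements) appears[s.gid] = true;
    return appears;
}

// Operations on such gradient indices (e.g., inputs that are only ever read)
// accumulate into worker-local sparse arrays instead of the deposit array.
bool BypassesDepositArray(const OpEntry& o, const std::vector<bool>& appears) {
    return !appears[o.gid];
}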

Implementation of PARAD+S A commonly used strategy to resolve races in reverse-mode AD is to employ worker-local (or thread-local) gradient tables. The problem with this approach, however, is that it is not work-efficient: extracting the gradient value for a statement requires work proportional to the number of processors, which may be greater than the number of operations that provided gradient contributions. In PARAD+S, a sampling-based algorithm is used to identify "heavy" statements that accumulate a large number (greater than P) of operations. Operations contributing to "heavy" statements can use worker-local gradient tables instead of the deposit array without compromising the work-efficiency of the algorithm. As such, work can be avoided in Step 2 of Algorithm 1 by filtering these operations during the traversal of the SPTape that packs operations into O*. The additional computation performed by PARAD+S introduces a small constant overhead relative to PARAD on some benchmarks, but does not compromise the theoretical work-efficiency or scalability of PARAD.
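One way such a sampling pass could look (the sampling rate, threshold, and structure below are assumptions made for illustration, not values from the evaluation):

#include <vector>
#include <random>
#include <cstddef>

// Sample each operation's consuming statement with probability sample_rate and
// flag statements whose estimated number of contributing operations exceeds P.
std::vector<bool> FindHeavyStatements(const std::vector<std::size_t>& op_stmt,
                                      std::size_t num_statements,
                                      std::size_t P,
                                      double sample_rate = 0.01) {
    const std::size_t npos = static_cast<std::size_t>(-1);
    std::mt19937_64 rng(2020);
    std::bernoulli_distribution keep(sample_rate);
    std::vector<std::size_t> sampled(num_statements, 0);
    for (std::size_t s : op_stmt)
        if (s != npos && keep(rng)) ++sampled[s];

    std::vector<bool> heavy(num_statements, false);
    for (std::size_t s = 0; s < num_statements; ++s)
        if (static_cast<double>(sampled[s]) / sample_rate > static_cast<double>(P))
            heavy[s] = true;   // estimated operation count for s exceeds P
    return heavy;
}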

4.6 Performance Evaluation

This section evaluates the performance of the PARAD and PARAD+S algorithms as implemented in LibPARAD. All experiments were run on an 18-core (hyperthreading disabled) Intel Xeon CPU (E5-2666 v3, 2.9GHz) with 64GB RAM, available as a 4th-generation compute-optimized machine from Amazon Web Services. LibPARAD uses Cilk Plus to express parallelism and compiles the codes using the Tapir compiler [305] based on LLVM 6.0.

Locks and Worker-Local implementations. We implemented two additional reverse-mode AD algorithms, Locks and Worker-Local, that employ fine-grained locking and worker-

local storage, respectively, to resolve data races on the gradient table. Both of these implementations use the SPTape to record series-parallel dependencies and automatically parallelize the reverse pass over the tape, but they differ from LibPARAD in that they do not need to perform additional work to organize a deposit array. This, however, comes at the expense of a loss of scalability and work-efficiency.

Application benchmarks. We evaluated the performance of LibPARAD across 8 benchmarks with different performance characteristics relevant to reverse-mode AD. For each benchmark, we measured the time required to perform AD while training the weights of the network via gradient descent. Figure 4-8 presents our performance results, including runtimes and speedup achieved for each benchmark. Figure 4-9 provides a summary of the performance achieved by each implementation (in geometric mean) over all 8 benchmarks.

Note on serial performance of Adept. Although, in general, Adept is highly efficient, we have observed on certain benchmarks (mlp, gcn, lstm) that the serial T1 runtime of some of our algorithms outperforms the serial runtime Ts of Adept. This difference comes entirely from the forward pass of the computation, and we believe it relates to differences in how gradient identifiers are allocated in an SPTape and in Adept’s serial data structures. This phenomenon also affects the algorithms using the SPTape, but negatively, on the cnn benchmarks, where it causes added overheads in the forward pass.

Multilayer perceptron benchmarks The mlp benchmarks are feed-forward multilayer perceptron networks, where mlp1 has a single hidden layer of 800 nodes, and mlp2 has two hidden layers with 400 and 100 nodes, respectively. Both networks are trained on the MNIST data set [89]. Performance results are shown in Figure 4-8.

There is little performance variation among the different AD algorithms on the multilayer perceptron benchmarks. On these benchmarks, most of the work is performed by library calls (using OpenBLAS) to perform matrix-vector multiplication, and all implementations include an optimization of the Adept library for concisely recording the derivative dependencies resulting from a matrix-vector multiplication. Furthermore, the network is very simple and regular.

Both the Locks and the Worker-Local implementations achieve relatively good scalability on the multilayer perceptron benchmarks. On these benchmarks, the shared weight matrices are large and used only once per element of a training batch in the forward pass. The average number of operations per statement is large (> 100), which causes Worker-Local gradient tables to be relatively efficient on 18 cores. Since these operations' contributions are well distributed and are performed on large weight matrices, however, there is little lock contention, and Locks also achieves good scalability on 18 cores.

PARAD and PARAD+S likewise perform similarly to each other and achieve reasonably good scalability. The optimizations outlined in Section 4.5 for PARAD eliminate almost all the overheads related to the organization of the deposit array. As such, the 18-core runtime is only slightly slower than that of Worker-Local, and is still better than that of Locks.

Convolutional neural networks The cnn1 and cnn2 benchmarks are convolutional neural networks (CNNs) based on the lenet-5 architectures. The cnn1 benchmark implements a modernized version of lenet-5

Benchmark   Algorithm      Ts       T1       T18      Ts/T1   Ts/T18   T1/T18
mlp1        PARAD          149.50   156.94   14.44    0.95    10.35    10.87
mlp1        PARAD+S        149.50   165.18   15.17    0.91    9.85     10.89
mlp1        Locks          149.50   219.33   16.64    0.68    8.98     13.18
mlp1        Worker-Local   149.50   129.87   12.91    1.15    11.58    10.06
mlp2        PARAD          83.66    92.18    8.22     0.91    10.18    11.22
mlp2        PARAD+S        83.66    95.93    8.77     0.87    9.54     10.94
mlp2        Locks          83.66    95.75    8.31     0.87    10.07    11.52
mlp2        Worker-Local   83.66    71.05    7.23     1.18    11.58    9.83
gcn1        PARAD          94.30    113.00   13.30    0.83    7.09     8.50
gcn1        PARAD+S        94.30    111.00   13.00    0.85    7.25     8.54
gcn1        Locks          94.30    216.00   77.10    0.44    1.22     2.80
gcn1        Worker-Local   94.30    84.90    14.20    1.11    6.64     5.98
gcn2        PARAD          19.20    26.90    4.14     0.71    4.64     6.50
gcn2        PARAD+S        19.20    28.20    4.76     0.68    4.03     5.92
gcn2        Locks          19.20    47.40    9.38     0.41    2.05     5.05
gcn2        Worker-Local   19.20    17.60    5.63     1.09    3.41     3.13
cnn1        PARAD          126.00   307.00   27.00    0.41    4.67     11.37
cnn1        PARAD+S        126.00   311.00   28.60    0.41    4.41     10.87
cnn1        Locks          126.00   314.00   42.30    0.40    2.98     7.42
cnn1        Worker-Local   126.00   246.00   80.50    0.51    1.57     3.06
cnn2        PARAD          159.00   381.00   35.30    0.42    4.50     10.79
cnn2        PARAD+S        159.00   380.00   36.40    0.42    4.37     10.44
cnn2        Locks          159.00   566.00   197.00   0.28    0.81     2.87
cnn2        Worker-Local   159.00   284.00   94.10    0.56    1.69     3.02
lstm1       PARAD          168.00   249.00   46.60    0.67    3.61     5.34
lstm1       PARAD+S        168.00   248.00   44.80    0.68    3.75     5.54
lstm1       Locks          168.00   441.00   62.60    0.38    2.68     7.04
lstm1       Worker-Local   168.00   147.00   34.20    1.14    4.91     4.30
lstm2       PARAD          168.00   933.00   93.80    0.18    1.79     9.95
lstm2       PARAD+S        168.00   241.00   23.40    0.70    7.18     10.30
lstm2       Locks          168.00   442.00   71.00    0.38    2.37     6.23
lstm2       Worker-Local   168.00   147.00   18.00    1.14    9.33     8.17

Figure 4-8: Table of benchmark results for the 8 application benchmarks and 4 algorithm implementations. The best runtime/speedup on each benchmark is in bold font.

Algorithm       Ts/T1   Ts/T18   T1/T18
PARAD           0.57    5.12     9.02
PARAD+S         0.66    5.88     8.89
Locks           0.45    2.77     6.14
Worker-Local    0.94    4.96     5.27

Figure 4-9: Average (geometric mean) performance of algorithms over the 8 benchmarks.

using maxpooling layers and linear rectifiers as activation functions. The cnn2 benchmark implements the lenet-5 architecture as it was originally described in [210, 89], with average pooling and the tanh activation function. For both networks, we verified that we achieve the expected accuracy for these well-known networks after training for sufficiently many epochs. The performance results are shown in Figure 4-8.

The Worker-Local and Locks implementations scale poorly on the cnn benchmarks. The average number of operations per statement is approximately 2 on cnn1 and 8 on cnn2. As such, Worker-Local performs substantially more work when executing on 18 cores than it does when it executes serially, and it achieves just 3x self-relative speedup on the cnn benchmarks on 18 cores. Locks performs similarly poorly on the cnn2 benchmark, but performs better on cnn1, achieving 7.4x self-relative speedup. Both of the cnn benchmarks have many parallel updates to small shared weight matrices, which causes scalability issues for Locks. The cnn1 benchmark, however, has a lot of dynamic sparsity (many gradients are zero), which substantially reduces lock contention.

The PARAD and PARAD+S algorithms scale well on the cnn benchmarks, each achieving 10–11x self-relative speedup. Differences between the SPTape and Adept stack data structures result in a performance hit on these benchmarks that causes somewhat higher overheads. As a result, the speedup relative to Adept on the cnn benchmarks is about 4.5x on 18 cores for PARAD and PARAD+S.

Graph convolutional networks

The graph convolutional network (GCN) benchmark gcn1 performs community detection on the pubmed network [195]. The input embeddings for the vertices are sparse bag-of-words vectors, and the network uses a single graph convolutional layer that learns an embedding of dimension 32 for each vertex in the graph. We follow the training method described in [69] and match their accuracy when training for the same number of epochs. The gcn2 benchmark operates on the email-Eu-core network dataset [221, 350] with random feature vectors of dimension 1024 and learns an embedding of dimension 64.

The performance results are provided in Figure 4-8. The scalability of Locks and Worker-Local is mixed on the gcn benchmarks. As with the cnn benchmarks, there are many parallel updates to small weight matrices, but the degree of contention varies because it depends on the structure of the graph. As such, the self-relative speedup of Locks and Worker-Local is mixed. The speedup relative to the serial Adept implementation, however, is uniformly poor (1.2–2x) for Locks on 18 cores. Worker-Local achieves 6.6x speedup relative to Adept on gcn1 and 3.4x speedup on gcn2.

The scalability of PARAD and PARAD+S is better than that of Locks and Worker-Local on both gcn benchmarks. On gcn1 they achieve 8.5x self-relative speedup, and on gcn2 they achieve 6–6.5x self-relative speedup. Relative to the serial Adept code, they are about 7x faster on gcn1 and 4–4.6x faster on gcn2 on 18 cores. The PARAD+S algorithm is the best-performing algorithm on gcn1, by a small margin: the more complex parallel structure of the gcn benchmarks makes the benefit of PARAD+S's additional optimizations subtly visible.

Long short-term memory networks

The long short-term memory network (LSTM) [155] benchmarks lstm1 and lstm2 implement a recurrent neural network that generates text from example text. We used a subset of the Paul Graham dataset consisting of 500 100-character data points, with each character one-hot encoded.

The primary difference between lstm1 and lstm2 is that lstm1 has very little parallelism, whereas the lstm2 benchmark expresses additional fine-grained parallelism, which allows some algorithms to achieve improved performance.

On the lstm benchmarks, the Worker-Local implementation performs very well and Locks performs poorly. The locking overheads are significant on the lstm networks, so the speedup of Locks relative to Adept is a poor 2x, even though its self-relative speedup is fairly good. The Worker-Local implementation scales relatively well, achieving 4x and 8x self-relative speedup on lstm1 and lstm2, respectively. Worker-Local is also efficient relative to Adept on the lstm benchmarks, since there are, on average, hundreds of operations per statement. As such, Worker-Local achieves 5x and 9x speedup relative to Adept on lstm1 and lstm2.

The differences between PARAD and PARAD+S are very apparent on the lstm benchmarks. Although they perform similarly to one another on lstm1, the additional fine-grained parallelism expressed in lstm2 increases PARAD's constant overheads. The additional optimizations in PARAD+S, however, enable it to perform consistently across both lstm1 and lstm2. PARAD+S achieves 7x and 10.3x self-relative speedup on lstm1 and lstm2, respectively, and achieves 3.7x and 7x speedup relative to the serial Adept code. Although not as good as Worker-Local on these benchmarks, PARAD+S is not far behind in terms of performance and scalability.

4.7 Related work

This section discusses related work on parallel AD.

Some previous work has examined parallelization of forward-mode AD. Forward-mode differentiation can be accomplished using dual numbers, which associate with each parameter x an infinitesimal component ẋ and perform computations on x̃ = x + ẋε. Hovland et al. have explored approaches that augment MPI communications to transmit dual numbers between nodes in order to parallelize forward-mode AD for MPI programs [159, 158]. Forward-mode AD is less efficient than reverse-mode AD, however, for many applications, including machine-learning applications, in which the function F : R^n → R^m being differentiated has a low-dimensional output, that is, m ≪ n. Bücker et al. [55] examine parallel forward- and reverse-mode AD for OpenMP programs when min{m, n} is large. In contrast, PARAD's parallelization of reverse-mode AD is work-efficient and scalable, even when m = 1.

Prior work has developed specialized parallel reverse-mode AD algorithms for specific computations. Gremse et al. developed optimized reverse-mode AD computations on GPUs for four input functions [138]. Hückelheim et al. developed a parallel reverse-mode AD algorithm for compressible flow solvers for unstructured meshes [161]. Other parallel reverse-mode AD algorithms have been devised for stencil computations [164, 163] and convolutional neural networks [162]. The code transformations employed for these parallel reverse-mode AD algorithms do not generalize to handle arbitrary recursive fork-join programs. PARAD, meanwhile, handles arbitrary recursive fork-join parallel programs.

Prior work has also developed several systems that support parallel reverse-mode AD. In the context of a parallel plasma simulation code, Bischof et al. explored parallel reverse-mode AD for OpenMP programs, using OpenMP-thread-local instances of ADOL-C [28]. Substantial prior work has explored parallel reverse-mode AD in message-passing programs [158, 159]. Schanen et al. [301] have developed an adjoint MPI library, which provides appropriate MPI communications to parallelize reverse-mode AD, based on the communications in a given MPI program.

These systems are not guaranteed to be work-efficient, however, because of the overheads they incur to combine parallel gradients. In contrast, the PARAD parallel AD algorithm targets shared-memory multicore systems and is provably work-efficient and scalable.

4.8 Conclusion

This chapter presented PARAD, a work-efficient and scalable reverse-mode AD algorithm for determinacy-race-free recursive fork-join programs. PARAD performs reverse-mode AD on a given program with scalability similar to the input program and bounded contention. We have observed that PARAD works well in practice, achieving good work efficiency and self-relative speedups on eight machine-learning benchmarks.

Chapter 5

A Multicore Path to Connectomics-on-Demand

This chapter presents, as a proof-of-concept, a high-throughput connectomics-on-demand system that runs on a multicore machine with less than 100 cores and extracts connectomes at the terabyte per hour pace of modern electron microscopes. This work was conducted in collaboration with Alexander Matveev, Yaron Meirovitch, Hayk Saribekyan, Wiktor Jakubiuk, Gergely Odor, David Budden, Aleksandar Zlateski, and Nir Shavit.

Abstract

The current design trend in large-scale machine learning is to use distributed clusters of CPUs and GPUs with MapReduce-style programming. Some have been led to believe that this type of horizontal scaling can reduce or even eliminate the need for algorithm development, careful parallelization, and performance engineering. This chapter is a case study showing the contrary: that the benefits of algorithms, parallelization, and engineering can sometimes be so vast that it is possible to solve "cluster-scale" problems on a single commodity multicore machine.

Connectomics is an emerging area of neurobiology that uses cutting-edge machine learning and image processing to extract brain connectivity graphs from electron microscopy images. It has long been assumed that the processing of connectomics data will require mass storage and farms of CPUs/GPUs, and will take months (if not years) of processing time. We present a high-throughput connectomics-on-demand system that runs on a multicore machine with fewer than 100 cores and extracts connectomes at the terabyte-per-hour pace of modern electron microscopes.

Figure 5-1: The stages of a high-throughput connectomics pipeline.

5.1 Introduction

The conventional wisdom in machine learning is that large-scale "big-data" problems should be addressed using distributed clusters of CPUs, GPUs, or specialized tensor-processing hardware in the cloud [71, 80, 224, 329, 147, 135, 166]. This chapter presents a solution to one of the most demanding big-data machine-learning areas: connectomics. It provides a case study in how novel algorithms, combined with proper parallelization and performance engineering, can reduce the problem from a large cluster to a single commodity multicore server (one that can be placed in any neurobiology lab), eliminating the crucial bottleneck of transferring data to the cloud at terabyte-an-hour rates.

5.1.1 High-throughput connectomics

Perhaps neuroscience's greatest challenge is to provide a theory that accommodates the highly complicated structure and function of neural circuits. These circuits, found in flies, in mammals, and ultimately in humans, are conjectured to be the substrate that enables the complex behavior we call "thought." However, to this day, our ability to provide such a theory has been hampered by our limited ability to view even small fragments of these circuits in their entirety.

Neurobiologists have had sparse circuit maps of mammalian brains for over a century [289]. However, as surprising as this may seem, no one has seen the dense connectivity of even a single neuron in cortex, that is, all of its input and output connections (called synapses) to other neurons. No one has been able to map the full connectivity among even a small group of neighboring neurons, not to mention the tens of thousands of neurons that constitute a "cortical column." Without such maps, it would seem hard to understand how a brain consisting of billions of interconnected neurons actually works. Would you believe that someone understood how a modern microprocessor worked if they could tell you in great detail how individual transistors operated but were unable to describe how these transistors are interconnected?

Mapping brain networks at the level of synaptic connections, a field called connectomics, began in the 1970s with a study of the 302-neuron nervous system of a worm [343]. This study, which required capturing and combining hundreds of electron microscopy images, was done by hand and took 8 years. Manual mapping of minute volumes of neural tissue has recently produced breakthrough results in neuroscience [191, 217, 262]. However, extending the approach to the millions and billions of neurons in higher animals seems an unattainable goal without a fast and fully automated pipeline.

Recent advances in the design of multi-beam electron microscopes now allow researchers to collect the nanometer-resolution images necessary to view neurons and the synaptic connections between them at unprecedented rates. A cubic millimeter of brain tissue, enough to contain a mouse cortical column, is within reach. This is only a tiny sliver of brain, about the size of a grain of sand, but it will contain about 100 thousand neurons and a billion synapses. The cubic millimeter will constitute about two petabytes of imagery that will be collected in about 6 months using a 61-beam electron microscope that generates half a terabyte of imagery per hour [100, 226].

A modern connectomics pipeline [227], as depicted in Figure 5-1, consists of a physical part and a computational part. The physical part takes a piece of stained brain tissue embedded in a resin, slices it thousands of times using a special microtome device, and feeds these minute slices into an electron microscope that scans them and produces separate images of the slices [100, 309].


Figure 5-2: 2D visualization of connectomics pipeline stages: (a) Electron microscope (EM) image. (b) Membrane probabilities generated by the CNN. (c) Over-segmentation generated by watershed. (d) Neuron reconstruction generated by agglomeration. (e) Pipeline inter-block merging: for each two adjacent blocks, the pipeline slices a boundary sub-block and executes agglomeration.

The computational part of the pipeline, the focus of this chapter, then takes the thousands of separate 2-dimensional images, reconstructs the 3-dimensional neurons within them, and produces skeletonizations or graphs that capture their morphological and connectivity properties.

However, a viable solution to the computational problem of extracting the skeletonizations and connectivity maps from the image data still seems far away. Using existing algorithms and at present computing rates, the common assumption is that extracting the connectivity of the circuits within two petabytes of data may take years and require supercomputing clusters [226, 338]. It is even unclear how to efficiently move the vast amounts of data from the microscope to a large-scale storage and compute facility where it will be processed [227]. The prospect of connectomics-on-demand, where neurobiology labs around the world each run their own microscopes and extract connectomes "as needed," seems far off.

This chapter takes a first step towards proving the feasibility of designing a high-throughput connectomics-on-demand system that runs on a multicore machine with fewer than 100 cores and extracts connectomes at the terabyte-per-hour pace of modern microscopes. Such a system, once achieved, will eliminate the need to transfer and store petabytes of data in special warehouses. It could be readily deployed in labs across the world, allowing scientists to view connectomes as they come off the scope. Down the road, such efficient connectomics systems could make it feasible for neurobiologists to extract the complete connectome of a mouse cortex of about 12 million neurons and 120 billion synapses from about 100 petabytes of data, and eventually make it possible to analyze exabyte-scale parts of the connectomes of both healthy and diseased human brains (the human brain has roughly 100 billion neurons and a quadrillion synapses).1

5.1.2 Towards an automated terabyte-per-hour connectomics pipeline

The "proof of concept" connectomics system we present here processes a terabyte of data, from image stack to detailed skeletons, in less than 4 hours on a single 72-core Haswell-based multicore machine with 500 GB of memory.

1 These numbers may sound like science fiction, yet as Lichtman and Sanes note by analogy [228], sequencing the first human genome took multiple labs around the world a concerted effort over 15 years, while today a single lab can sequence a human genome within hours.

The upcoming generation of both GPU and CPU chips, with some further optimizations (see our performance section), easily places it within the target terabyte-an-hour performance envelope of today's fastest electron microscopes. One should contrast this with the recent connectomics system of Roncal et al. [294], which uses a cluster of 100 AMD Opteron cores, 1 terabyte of memory, and 27 GeForce GTX Titan cards to process a terabyte in 4.5 weeks, and the state-of-the-art distributed MapReduce-based system of Plaza and Berg [284], which uses 512 cores and 2.9 terabytes of memory to process a terabyte of data in 140 hours (not including skeletonization). Importantly, the speed of our pipeline does not come at the expense of accuracy, which is on par with or better than that of existing systems in the literature [294, 192, 284] (using the accepted variation of information (VI) measure [252]).

Our high-level pipeline design builds on prior work [192, 241, 271, 272, 278, 284]. It passes the data through several stages, as seen in Figure 5-2. The image data is split into blocks. The pipeline first runs a convolutional neural network (CNN) on the separate image blocks to detect the boundaries of neurons. The neuronal objects defined by the boundaries are then segmented by a watershed algorithm. At this point the image is over-segmented; that is, neuronal objects are fragmented. The next step in the reconstruction is to run an agglomeration phase that merges all the fragments in a block into complete objects. Then the blocks are merged, so that objects span the entire volume. Finally, as depicted in Figure 5-8, the objects are skeletonized.

5.1.3 Our contributions

• A new CPU execution engine for CNNs that scales well on multicore systems, processing 1024x1024 images 70x faster than the previous state-of-the-art [359] on a single 18-core Haswell CPU. Our code applies GCC-Cilk (a state-of-the-art work-stealing scheduler [37]), is optimized to leverage Intel AVX2 instructions [345], and maximizes memory reuse in the L1, L2, and L3 cache sub-systems. Analysis shows that this code achieves 80% utilization of the peak theoretical FLOPS of the system on a single thread and scales almost linearly, compared to previous approaches (the Caffe CNN framework with Intel MKL [177, 78]) that exhibit only 20% utilization and no scalability beyond 4 threads. The remaining performance gap is obtained by implementing minimal "dense" (fully-convolutional) CNN inference, which involves 284x fewer computations than naive implementations.

• A new GPU execution engine for CNNs that leverages custom fast Fourier transforms (FFTs) and the latest cuDNN primitives [70]. The system performs ∼2x faster on our CNN benchmarks than the previous state-of-the-art [264] on a Titan X GPU.

• A parallelized and performance-engineered version of NeuroProof, a system originally developed at Janelia Research Labs by Parag, Chakrobarty and Plaza [279]. This system implements an agglomeration procedure that reconstructs 3D neurons from the watershed oversegmentation. Specifically, we redesigned the Regional Adjacency Graph (RAG) algorithm as an augmented mergesort, efficiently deferred expensive operations that occur during construction and traversal of RAGs, and batched computations for lazy execution. Our implementation is 6.5x faster on a single thread and has an 85x scalability factor on our 72-core server (a super-linear factor due to HyperThreading).

• Two new algorithms for the connectomics domain: (1) a new inter-block merging algorithm that, unlike prior approaches, applies parallelized NeuroProof to optimize

object-pair merges, and (2) a parallel skeletonization algorithm that uses novel techniques and GCC-Cilk-based chromatic scheduling [183] to execute efficiently on multicores. We describe the algorithmic side in more detail in [254].

We believe that our work departs from existing systems by showing that a combination of new algorithms, proper parallelization of existing algorithms, modern scheduling using languages like Cilk, and performance engineering of code to fully utilize CPU resources can move the connectomics problem from the large-scale distributed-systems space to a commodity shared-memory multicore system. The same might be true for other big-data, machine-learning-based problems in medicine and the life sciences. Even for problems where big systems are necessary, our approach may provide insights into how to make individual system elements much faster and more scalable.

5.1.4 Related work

The image segmentation approach we use was developed by [8] and suggested for automatic electron microscopy segmentation by [271]. It is used by most of today's systems [271, 272, 278, 241, 192, 284]. Increasingly accurate membrane predictors based on convolutional neural networks have recently been proposed [73, 295]. Our system implements a solution that is inspired by these algorithms, is on par with their current accuracy levels, and yet is simpler and faster to execute. Other studies have focused on new approaches to segmentation of supervoxels (e.g., [241]). Further research, however, is required to benchmark their accuracy on even gigabyte-sized datasets, so at this time their scalability cannot be assessed. We are aware of three previous connectomics pipelines: the original RhoAna pipeline by Kaynig et al. [192], the system by Roncal et al. that extends RhoAna [294], and the state-of-the-art Spark-based system of Plaza and Berg [284], which uses NeuroProof.

5.2 System overview

This is the story of how one can replace the use of a large distributed system with a single multicore machine. Although a large distributed system may have tremendous computational resources at its disposal, its raw computing power comes at the cost of additional overheads and system complexity: for example, data must be moved over the network, and the system must support a degree of fault tolerance [284, 71, 356, 225, 357, 86]. These overheads tend to be small for problems that are embarrassingly parallel, but can come to dominate the execution time of more complex computations that need to operate on shared data. In addition, the opportunities to improve performance within a single multicore are abundant and can yield improvements that rival or even dwarf those obtained through horizontal scaling. As we will see, the connectomics problem requirements can be satisfied using a single multicore machine to execute a pipeline of carefully designed multicore algorithms that efficiently utilize a machine's available computing cycles, take advantage of the low shared-memory communication overheads, and parallelize across cores using efficient task-scheduling tools.

5.2.1 Pipeline structure overview

The input to our pipeline is a sequence of aligned 2D images, each of which represents a single slice of a 3D volume of brain tissue captured at high resolution (∼3–4 nm) by an electron microscope.

These images are broken down into smaller sub-images with a standardized size of 1024x1024 pixels, which are then grouped into blocks of 1024x1024x100 pixels. The pipeline executes a segmentation procedure that extracts the neuronal objects within each block, and then executes an inter-block merging procedure that "combines" the per-block segmentations to obtain a complete segmentation of the original input volume [254]. These stages of the connectomics pipeline are illustrated in 2D in Figure 5-2.

First, as seen in Parts 1 and 2 of Figure 5-2, a convolutional neural network (CNN) is executed to detect membranes (or cell borders). This network was trained using "ground truth" annotation by human experts. The training process is time-consuming, but since it only needs to be performed once on an annotated subset of the dataset, it does not impact the pipeline's throughput. The result of applying the CNN to a subvolume is a new 1024x1024x100 block in which each pixel indicates a membrane probability between 0 and 1. Since the CNN may run on blocks independently, this stage of the pipeline may be executed in a distributed manner. It turns out, however, that careful performance engineering enabled this stage of the pipeline to execute sufficiently fast on a single multicore machine. In fact, we found that our optimized code for executing the CNN was able to significantly outperform existing state-of-the-art CPU [177, 78, 359] and GPU [264] implementations. In Section 5.3 we describe the design of this CNN and the steps taken to ensure that our implementations made efficient use of the machine's compute cycles, L1/L2/L3 caches, and disk.

Next, as seen in Part 3 of Figure 5-2, a 3D watershed algorithm executes on the membrane probability map. The watershed algorithm performs a BFS-style flood from "seed" pixels that have 0 probability of being a membrane to pixels with higher probability. The result of the watershed execution is a new block in which segments (3D objects) have been formed around seeds, each with a segment identifier. Methods of parallelizing watershed have been described in the literature (e.g., [261]), but it turned out that our pipeline was able to achieve better performance by engineering an efficient sequential algorithm with low memory consumption. This strategy allowed us to obtain a watershed algorithm that is an order of magnitude faster than the code provided in the popular OpenCV library [174], and whose low memory requirements permit us to run many independent instances of the algorithm on a shared-memory machine. Section 5.4 describes the design of this serial watershed algorithm in greater detail.

The segments produced by the watershed algorithm represent an over-segmentation of the true neuronal objects; i.e., each neuron might be fragmented into many smaller parts, as shown in Part 3 of Figure 5-2. To obtain segments that represent whole neurons, an agglomeration algorithm is employed that merges segments lying within the same 3D object. The result of the agglomeration procedure is a collection of larger segments that each represent a whole neuronal object, as illustrated (in 2D) in Part 4 of Figure 5-2. The agglomeration procedure used by the pipeline is based upon the serial NeuroProof agglomeration algorithm of Parag et al. [279]. We reformulated the NeuroProof procedure to use parallel algorithms, and performed a variety of performance optimizations to reduce the total work and memory usage.
Together, these optimizations obtained a 6.5x speedup on 4 cores over the original sequential code. Moreover, we achieve near-linear scalability, which provides an additional 82x factor of speedup on our 72-core server with HyperThreading. Section 5.5 describes, in greater detail, the parallelization techniques and performance optimizations that were employed to optimize the agglomeration stage of the pipeline. After a segmentation has been obtained for each block, the pipeline executes an efficient inter-block merging procedure on the reconstructed blocks [254].

Here we utilize the shared-memory properties of the machine, in which I/O operations are automatically cached by the OS kernel. Thus, for every two adjacent blocks, our block-merging algorithm carves out a thin sub-block near the shared boundary and employs a variant of our parallel agglomeration procedure to identify and merge objects across this boundary. Part 5 of Figure 5-2 is a two-dimensional rendering of this merging process. The result of each merge is a small file, and because all the files are on the same machine, it is inexpensive to combine all of them into a full segmentation. This contrasts sharply with the approach of Plaza and Berg [284], where a sophisticated merge algorithm must be executed across many machines in the network. In our case, there is no need to perform expensive data transfers over the network or to support complex failure detection and recovery mechanisms, since the whole system executes on a single server. The design of the inter-block merging procedure is discussed further in Section 5.6.

The final step of the pipeline skeletonizes the volume segmentation (see Figure 5-8) on a per-block basis. A skeleton provides a space-efficient one-dimensional representation of a 3D volume that runs along the volume's medial axis, allowing for faster and easier analysis of the biological structures. It is also an intermediate step on the way to creating a graph representation of the neuronal objects. We experimented with various skeletonization algorithms and found that a subfield "thinning" algorithm [25] fits our purposes best. The algorithm starts with points on the object boundary and repeatedly removes those whose removal does not affect overall topological connectivity. We devised a simple and efficient parallel algorithm for extracting volume skeletons that uses chromatic scheduling [183] to efficiently schedule the order in which points are considered for deletion during the thinning process. The details of the skeletonization algorithm are discussed further in Section 5.7.

5.3 Segmentation with CNNs

Our connectomics pipeline leverages a CNN to map EM input images to membrane-probability output images. A CNN system is typically made up of two components: the network architecture that defines layers of perceptrons and their connectivity, and the computational framework that executes this architecture's forward propagation over a trained network. A network will typically be trained using backward propagation and ground-truth data, a compute-intensive phase that needs to be performed once. The trained network is then executed multiple times in the life of the connectomics pipeline, allowing us to focus on forward-propagation times.

In this section we describe our network architecture and new computational frameworks for both CPU and GPU. Specifically, these frameworks implement fully-convolutional neural networks by applying "dense computation" rather than a patch-based sliding-window approach [73]. This is accomplished through the usage of "max-pooling fragments," which apply the traditional maxpooling operation to different offsets of the input image [128, 245]. The resultant images are then recombined to produce the final dense result. In contrast to other methods of dense computation (e.g., dilated convolution [352], strided kernels [332], max filtering [215, 359], and filter rarefaction [231]), we adopt maxpool fragments because they generate independent matrices that are contiguous in memory. This feature is leveraged for improved parallelization of both our CPU and GPU-based implementations, as described in detail in Sections 5.3.2 and 5.3.3, respectively.
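To make the idea of max-pooling fragments concrete, the following self-contained 1D sketch (our illustration, not code from the pipeline) applies a window-2, stride-2 max-pool at offsets 0 and 1 and checks that interleaving the two fragments reproduces the dense, stride-1 pooling output:

// Illustrative 1D sketch of "max-pooling fragments" (window 2, stride 2).
// Offsets 0 and 1 produce two contiguous fragments; interleaving them
// reproduces the dense (stride-1) pooling output.
#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
  std::vector<float> x = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3};
  const int n = static_cast<int>(x.size());

  // Fragment for offset o: pool windows starting at o, o+2, o+4, ...
  auto fragment = [&](int o) {
    std::vector<float> f;
    for (int i = o; i + 1 < n; i += 2) f.push_back(std::max(x[i], x[i + 1]));
    return f;
  };
  std::vector<float> f0 = fragment(0), f1 = fragment(1);

  // Dense (stride-1) pooling, computed directly for comparison.
  std::vector<float> dense;
  for (int i = 0; i + 1 < n; ++i) dense.push_back(std::max(x[i], x[i + 1]));

  // Interleave the fragments: dense[2j] == f0[j] and dense[2j+1] == f1[j].
  for (std::size_t j = 0; j < dense.size(); ++j) {
    float v = (j % 2 == 0) ? f0[j / 2] : f1[j / 2];
    std::printf("dense[%zu] = %.0f, interleaved = %.0f\n", j, dense[j], v);
  }
  return 0;
}

Each fragment is a small contiguous array, which is what makes this formulation convenient to parallelize and cache-block; the same construction applies in 2D with four offsets per stride-2 pooling layer.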

5.3.1 Our network architecture

We recently proposed a CNN architecture, called MaxoutNet [254, 196, 191], which consists of 4 ConvPool layers of alternating convolution/maxpool pairs aggregated with a maxout function. Convolutional layers use 4 × 4 kernels with stride 1 and either 8 or 32 channels, which, combined with stride-2 max-pooling, yields a 105 × 105 field-of-view for each output pixel. In our pipeline execution, MaxoutNet leverages the fact that our input EM images are 3 nm high resolution (2048 × 2048): it performs sub-sampling of the input by executing standard pooling in the first ConvPool layer, while executing "dense" pooling in the subsequent layers. This produces a 2-fold subsampled 1024 × 1024 output image and accelerates the network by a factor of 4 without losing much accuracy. For our benchmarks, described in Sections 5.3.2 and 5.3.3, we simplify this network and execute only the last 3 ConvPool layers.

5.3.2 A fast CPU framework for CNNs

We introduce XNN, a CNN framework implemented in C and inline assembly for Haswell CPUs. These CPUs provide support for AVX2 instructions and include two FMA units, allowing each core to execute 32 floating-point operations per cycle [345]. As a result, a single 2.5 GHz 18-core Haswell chip can theoretically produce 1.44 TFLOPS (compared to a state-of-the-art GPU, like NVIDIA's Titan X, which theoretically produces 6 TFLOPS). In practice it is challenging to come close to this peak floating-point throughput; doing so requires careful engineering that takes the hardware specifics into account. The remainder of this section describes the techniques we applied to achieve state-of-the-art CNN throughput.
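For reference, the quoted peak follows directly from the clock rate, the per-core FLOP rate, and the core count: 2.5 GHz × 32 FLOP/cycle = 80 GFLOPS per core, and 80 GFLOPS/core × 18 cores = 1.44 TFLOPS.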

Haswell AVX2 SIMD parallelization

We initially adopted Intel's MKL convolution primitives [78]. However, our results showed that the CPU FLOPS utilization for MKL was only ∼20%, so we reimplemented this critical component with hand-crafted AVX2 assembly. One complication of AVX2 vector instructions is that they must have aligned memory addresses. As a result, it is impossible to slide a convolution window over a 2D matrix using AVX instructions, since some of the window locations are not aligned. Instead, we leveraged the fact that a CNN's internal matrices are 3D, where the first two dimensions are spatial and the third dimension is channels. The convolution window slides over the two spatial dimensions while the AVX vector applies to the third, channel dimension, where the number of channels is a multiple of the AVX vector width. This constraint is not problematic for CNNs that already use many channels, allowing us to achieve 70-80% utilization.

Our first implementation was a simple loop over the channel dimension, which did not provide high CPU floating-point utilization. To improve this, we computed 6 convolutions at the same time by using a different set of AVX registers for each (a total of 16 registers). We interleave the register load, FMA compute, and store operations of the 6 running convolutions in a way that maximizes the usage of the two FMA units of the Haswell CPUs. In addition, we made the loop bounds and counter increments constant at compile time, so that the GCC vector optimizations could unroll the loops and optimize the code as much as possible.
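The sketch below illustrates the channel-dimension vectorization using AVX2/FMA intrinsics rather than inline assembly; it is our simplification (a single accumulator instead of the 6-way interleaving) and assumes the channel count is a multiple of the 8-float vector width (compile with -mavx2 -mfma):

#include <immintrin.h>

// Accumulate one output value: a dot product over C channels of an input
// pixel with one kernel tap, where C is a multiple of 8 (the AVX2 float
// width). The convolution window slides over the two spatial dimensions
// elsewhere; here the vector lanes cover only the channel dimension, which
// is what makes aligned loads (_mm256_load_ps) possible in practice.
float channel_dot(const float* in, const float* w, int C) {
  __m256 acc = _mm256_setzero_ps();
  for (int c = 0; c < C; c += 8) {
    __m256 a = _mm256_loadu_ps(in + c);  // 8 consecutive channels of the input
    __m256 b = _mm256_loadu_ps(w + c);   // matching 8 kernel weights
    acc = _mm256_fmadd_ps(a, b, acc);    // acc += a * b, feeding the FMA units
  }
  // Horizontal sum of the 8 accumulator lanes.
  __m128 lo = _mm256_castps256_ps128(acc);
  __m128 hi = _mm256_extractf128_ps(acc, 1);
  __m128 s  = _mm_add_ps(lo, hi);
  s = _mm_hadd_ps(s, s);
  s = _mm_hadd_ps(s, s);
  return _mm_cvtss_f32(s);
}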


Figure 5-3: Throughput of the three CPU-based CNN implementations, (a) XNN, (b) Caffe, and (c) ZNN, evaluated using 8- and 32-channel MaxoutNet on an 18-core Haswell and a 4-core Skylake CPU.

Cilk-based concurrency and caching

We used Cilk [115, 219], a work-stealing scheduler that is supported by GCC 4.8, to dynamically generate fine-grained multicore tasks. The key advantage of Cilk is that it provides a "fork-join" primitive that scales well to many cores. It uses a sophisticated combination of per-thread double-ended queues and a clever work-stealing algorithm that avoids bottlenecks and memory contention on a shared-memory system.

Our implementation applies the Cilk "fork-join" primitive to all "for" loops that have no loop-iteration dependencies, simplifying our parallelization effort. However, we still require that the memory accessed by executing Cilk threads fits into the L3 cache. The L3 cache is much faster than main memory, and having threads working on data that resides in L3 is key to ensuring a high ratio of compute to memory access, which is important for scalability. For example, large matrix operations (e.g., convolution) are executed over sub-matrices that each fit into the L3 cache, with a set of threads operating over one sub-matrix before proceeding to the next. We observed that this approach improves scalability by a factor of 3-4x.
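A minimal sketch of this pattern, assuming the Cilk Plus extensions shipped with GCC (cilk_for); the tile size and the kernel body are placeholders chosen for illustration:

#include <cilk/cilk.h>
#include <cstddef>

// Process a large matrix as L3-sized tiles: all workers cooperate on one
// tile (so its data stays resident in the shared L3 cache) before moving on
// to the next tile. TILE_ROWS is chosen so that a tile's working set fits
// in L3; the real choice depends on the layer dimensions.
void process(float* data, std::size_t rows, std::size_t cols) {
  const std::size_t TILE_ROWS = 256;              // illustrative tile height
  for (std::size_t r0 = 0; r0 < rows; r0 += TILE_ROWS) {
    std::size_t r1 = (r0 + TILE_ROWS < rows) ? r0 + TILE_ROWS : rows;
    cilk_for (std::size_t r = r0; r < r1; ++r) {  // independent row iterations
      for (std::size_t c = 0; c < cols; ++c) {
        data[r * cols + c] *= 2.0f;               // stand-in for the real kernel
      }
    }
  }
}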

Memory usage

Another important aspect of the implementation is the amount of memory used to compute the forward-propagation pass of the CNN. Since we are only concerned with the performance of forward propagation, we optimized the memory usage significantly by allocating only two large matrix buffers, one for input and one for output, and then reusing (swapping) these buffers between layers. As a result, the memory usage is bounded by the largest input/output size of a single CNN layer, which reduces cache thrashing and OS paging effects. Note that memory is allocated only at the start; i.e., there are no dynamic memory allocations during the computation itself that could introduce contention and bottlenecks.
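A sketch of the two-buffer scheme; the Layer type and its forward function are stand-ins for the real layers, and both buffers are assumed to be preallocated to the largest layer's size:

#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a real convolution/pool layer: reads `in`, writes `out`.
struct Layer {
  float scale;
  void forward(const std::vector<float>& in, std::vector<float>& out) const {
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = scale * in[i];
  }
};

// Forward propagation with exactly two buffers: swapping makes layer i's
// output the input of layer i+1, so nothing is allocated during the
// computation itself.
void forward_all(const std::vector<Layer>& layers,
                 std::vector<float>& in, std::vector<float>& out) {
  for (const Layer& layer : layers) {
    layer.forward(in, out);
    std::swap(in, out);   // O(1): swaps the buffers' internal pointers
  }
  // The final result is in `in` because of the trailing swap.
}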

CPU benchmarking

Benchmarking was performed on the same shared-memory 4-socket server machine with four 18-core chips and 512 GB of RAM. We also benchmarked a proxy machine equipped with a single 4-core Intel Skylake CPU with 64 GB of RAM.

We benchmarked our XNN framework against the most efficient known multicore CNN frameworks: the Caffe framework of Yangqing et al. [177] and the ZNN framework of Zlateski et al. [359]. Throughput was evaluated using 1024 × 1024 (6 nm) images. We benchmarked all three using a lightweight version of MaxoutNet (field-of-view = 53) with the 3 fully convolutional ConvPool layers available in our XNN framework (the subsampling performed in the fourth layer is required only for the 3 nm data), for both 8 and 32 channels. To ensure the best performance, the frameworks were compiled for maximum speed (-O3) with the Intel C++ Compiler (ICC) version 16.0.0, linked against Intel Threading Building Blocks (TBB) libtbbmalloc and the Intel Math Kernel Library (MKL) version 11.3 with FFT caching enabled, and perform single-precision floating-point arithmetic.

The results of this benchmarking are presented in Figure 5-3. In this test we increase the number of threads on a single 18-core chip; by binding the execution to one chip, the input size is fixed and NUMA effects (expensive inter-chip communication) are disabled. It is evident that XNN exhibits a substantial improvement over Caffe and ZNN. First, there is a significant difference in single-threaded performance, the effect of our hand-crafted assembly that utilizes the two FMA AVX2 units of the Haswell CPU. Second, the scalability of XNN up to 18 cores is almost linear, which is achieved by constraining the active threads to operate on sets that fit into the L3 cache while using Cilk to manage all of the short-living jobs.

Figure 5-4 presents the results of XNN executed as a single instance (fixed input) on all 72 cores across the 4 sockets, running up to 144 threads. As can be seen, for the 32-channel execution the scalability is linear up to 36 threads and pseudo-linear from 36 to 72 threads, as the additional threads utilize additional cores. Beyond 72 threads, hyperthreading provides only a 10-20% boost, and the performance plateaus.

Figure 5-4: Throughput of XNN with up to 144 hardware threads on our 4-socket system: (a) 8 channels; (b) 32 channels.

Figure 5-5: Throughput of GPU-based CNNs, (a) gpuZNN, (b) Neon, and (c) Caffe, using 8- and 32-channel MaxoutNet architectures at image sizes 1024², 512², and 256².

Method   Type            8-channel MaxoutNet   32-channel MaxoutNet
                         throughput (MB/s)     throughput (MB/s)
XNN      CPU (72-core)   111.1                 16.67
gpuZNN   GPU (Titan X)   67.61                 25.28
Neon     GPU (Titan X)   37.06                 (exceeds memory)
XNN      CPU (4-core)    11.49                 1.74

Figure 5-6: A comparison of CNN throughput for the best-performing CPU and GPU-based implementations using the MaxoutNet architecture.

For 8 channels (red), the smaller network size generates low-latency Cilk tasks that put more pressure on the software implementation. There is still linear scalability up to 18 threads, but only a small improvement is evident beyond 36 cores. This is because the overhead of threading (NUMA side effects) is substantial compared to the performance benefit of adding more cores.

In our pipeline execution, we overcome these NUMA overheads by combining multiprocessing with multithreading. Specifically, we execute multiple instances of XNN (each bound to a specific chip) with an empirically optimized number of threads, resulting in a 2.8x speedup. Generating these instances increases the RAM requirements (not a problem for our 512 GB RAM system).

5.3.3 A fast GPU framework for CNNs

In this section we present gpuZNN, a GPU-based CNN framework that improves upon the previously published ZNN [359] to optimize forward-propagation throughput. The SIMT programming model and the limited amount of memory available on a GPU require a different implementation than our XNN CPU framework. To saturate the GPU, we process multiple input images simultaneously. We also leverage the fact that all sub-samplings of the "dense" computation have equal size, which is equivalent to processing multiple inputs of a layer at the same time.

Most of our convolutional primitives use cuDNN's low-level implementations [273, 70]. To address the large memory overhead of some primitives, we allow the computation to be split into multiple stages, computing a subset of results at a time. This is accomplished by computing a subset of input batches and/or a subset of output images (feature maps). This can reduce the parallelization potential, but reducing the memory overhead can allow a computationally cheaper primitive to be used. We also implement optimized FFT-based convolutional primitives with very low memory overhead. Batched 1D FFTs and memory reshuffling were used to efficiently utilize the GPU.

Sub-sampling layers (maxpooling and maxout) were implemented using cuDNN's max-pooling primitive. We minimized the number of calls to the primitives such that each primitive performs more computation and can thus saturate the GPU. To allow this, we implicitly gathered the inputs and scattered the results for each call. Implicit gather/scatter is performed by providing the shapes and strides for all inputs/outputs, from which the memory location of each element can be computed.

GPU CNN benchmarking

We benchmarked our gpuZNN implementation relative to the fastest available GPU frameworks [72] on an NVIDIA GeForce GTX Titan X (3072 cores, 1.0 GHz, 12 GB memory). Specifically, we benchmarked the MaxoutNet network with both 8 and 32 channels for three GPU frameworks: (a) gpuZNN, our modified version of ZNN [359]; (b) Caffe [177], compiled with cuDNN v4; and (c) Neon by Nervana [264], which outperformed Caffe and other alternatives in several recently published benchmarks [72]. Throughput was evaluated using 6 nm images presented at different sizes (1024², 512², 256²) in batches of size 32 (a constraint of Neon's GPU backend), while using max-pool layers with a stride of 1. This eliminates the "sub-sampling" effects of max-pooling and simulates the same number of floating-point computations as a "dense" inference (not supported by Caffe or Neon).

The results of this benchmarking are presented in Figure 5-5. It is evident that gpuZNN provides the best throughput on both 8 and 32 channels for all image sizes: 1.8x and 2.9x faster than Neon and Caffe, respectively, for 8 channels, and 1.8x faster than Caffe for 32 channels. The 32-channel MaxoutNet could not be benchmarked on Neon because the memory usage exceeded the 12 GB available on the Titan X. The main reason for gpuZNN's improvement is its low-memory-overhead FFTs, which allow it to aggregate large sub-computation units and minimize the number of calls to GPU primitives (each such call involves host-device memory transfers on the PCI-E bus). We note that the actual speed difference between 32 and 8 channels for gpuZNN is 2.6x (and not 32/8 = 4x), which is an effect of the overheads of PCI-E host-device memory transfers that become more significant as the network becomes smaller.

A comparison of CPUs and GPUs

Figure 5-6 shows the maximum throughput that we could achieve for XNN, gpuZNN, and Neon for 8- and 32-channel CNNs. For 72 cores, we found that the best combination of multithreading and multiprocessing is 8 instances (2 per chip), each using 18 threads. One can see that for a network with 32 channels, the GPU is faster than the CPU. However, this is not the case for 8 channels, where the CPU is twice as fast as the GPU. This is because an 8-channel CNN implies less compute for the GPU and makes the PCI-E bus memory transfers relatively more expensive.

5.4 Watershed

The next stage in the connectomics pipeline involves producing an over-segmentation of neuron candidates from the CNN membrane-probability output. Over-segmentation ensures that no supervoxel straddles more than one true segment, and it is later resolved by agglomeration. Specifically, we apply a custom 3D implementation of the popular linear-time watershed algorithm that yields an 11x speedup over previous implementations [174]. The key idea is to take into account that probability maps are allowed to take only 8-bit values, which allows us to implement the priority queue as a set of 2^8 = 256 FIFO queues. The FIFO queues are implemented using simple arrays with an amortized cost of O(1) for both push and pop operations. As a result, queue accesses are cache- and prefetcher-friendly. In addition, we reduce the memory overhead by an order of magnitude compared to OpenCV, which is an important factor for the watershed algorithm, which exhibits a low compute-to-memory ratio.
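The following sketch shows the idea of the 256-bucket FIFO priority queue; it is illustrative only and omits the watershed flood itself:

#include <cstddef>
#include <cstdint>
#include <vector>

// A bucket priority queue for 8-bit priorities: one FIFO per probability
// value 0..255, giving amortized O(1) push and pop. Sketch of the idea,
// not the pipeline's exact data structure.
class BucketQueue {
 public:
  void push(std::uint8_t priority, std::uint32_t voxel) {
    buckets_[priority].push_back(voxel);
    if (priority < cur_) cur_ = priority;  // a lower bucket became non-empty
    ++size_;
  }
  bool empty() const { return size_ == 0; }
  // Pops a voxel with the smallest priority currently in the queue
  // (the caller must check empty() first).
  std::uint32_t pop() {
    while (head_[cur_] == buckets_[cur_].size()) ++cur_;  // skip drained buckets
    --size_;
    return buckets_[cur_][head_[cur_]++];
  }
 private:
  std::vector<std::uint32_t> buckets_[256];  // simple arrays used as FIFOs
  std::size_t head_[256] = {};               // next index to pop in each bucket
  std::size_t size_ = 0;
  int cur_ = 0;                              // lowest possibly non-empty bucket
};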

5.5 Agglomeration

The agglomeration stage of our pipeline is based on NeuroProof [279], a state-of-the-art tool for graph-based image segmentation. Serial optimizations and shared-memory parallelization techniques were employed to enhance the performance of NeuroProof on our shared-memory multicore system, with the results summarized in Figure 5-7.

5.5.1 Regional adjacency graphs

Agglomeration is performed on a higher-level representation of the labeled volume called a regional adjacency graph (RAG). A RAG is a graph with a vertex for each distinctly labeled region and an edge connecting each pair of adjacent regions. Each vertex within the RAG maintains several regional features, such as histograms of membrane probabilities and multiple image moments. These features are typically associative with respect to the agglomeration's merge operation, and thus enable the agglomeration procedure to operate directly on the RAG instead of the much larger input volume. RAG construction is the most expensive stage: prior to our performance engineering efforts, it required 124 seconds on a 1024x1024x100 block and was responsible for 70% of the total time.

Stage               Baseline (s)   Fast 1-core (s)   Fast 4-core (s)
Input/Output            11.2            10.5               6.7
RAG Construction       123.2            24.2               6.8
RAG Agglomeration       42.5            40.2              13.4
Total                  176.9            75.0              27.0

Figure 5-7: Performance improvements to the agglomeration stage.

Serial optimizations

Feature computation was optimized by providing a batched version of the feature-computation function, so that it could process multiple pixels at once. Another significant optimization concerned the image-moment features. The ith image-moment feature computes the sum of p^i for all pixels p in a region. The standard-library pow function was used to compute these features for 4 moments and was to blame for over 50% of the runtime of RAG construction. We modified the function computing moment features to perform iterative multiplication, reducing the per-pixel cost to 3 floating-point multiplications (a brief sketch appears below). These modifications to feature computation resulted in a 2.5x improvement in the serial runtime of RAG construction, reducing its total runtime to 50 seconds.

Another class of changes involved eliminating unnecessary layers of indirection in matrix and hashtable access. The labeled input volume was loaded into an OpenCV Mat object and was accessed through a provided interface. We modified all accesses to operate directly on the matrices' underlying arrays to eliminate unnecessary indirection through the OpenCV library. This resulted in a further 1.6x improvement, reducing the runtime of RAG construction to 30 seconds.

The last major optimization was the design of a divide-and-conquer RAG-construction algorithm that takes advantage of the associativity of region features. Given a 3D volume, the largest dimension of the volume was divided in half and the RAG was computed for each half recursively. These RAGs were then merged, with their regional features combined according to their associative update rules. The base case of the recursion was coarsened so that the total volume would fit into an L2 cache of approximately 256 KB.
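A sketch of the moment-feature optimization described above, assuming the four moments are the sums of p, p^2, p^3, and p^4 over a region's pixels:

#include <cstddef>

// Accumulate the first four image moments of a region: moments[i] += p^(i+1)
// for each pixel value p. Iterative multiplication costs 3 multiplies per
// pixel instead of repeated calls to pow().
void accumulate_moments(const float* pixels, std::size_t n, double moments[4]) {
  for (std::size_t k = 0; k < n; ++k) {
    double p  = pixels[k];
    double p2 = p  * p;   // 1st multiply
    double p3 = p2 * p;   // 2nd multiply
    double p4 = p3 * p;   // 3rd multiply
    moments[0] += p;
    moments[1] += p2;
    moments[2] += p3;
    moments[3] += p4;
  }
}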

Parallelization

The divide-and-conquer implementation of RAG construction was especially amenable to efficient parallelization. In particular, the merging of RAGs can be performed efficiently by viewing the algorithm as a parallel merge sort with an augmented merge operation [77, Ch. 27.3]. In other words, the RAG-construction algorithm can be parallelized in the same fashion as merge sort by considering a RAG to be a sorted edge list. We represent regions with a self-edge and store each region's computed features as metadata. After merging two RAGs' edge lists, the combined edge list is scanned in parallel to identify and merge duplicated edges (a sketch appears at the end of this subsection). These steps can all be performed in parallel using logarithmic-depth algorithms. On 4 cores, the parallel RAG-construction algorithm achieves a speedup of 3.5x, reducing the time of RAG construction to 6.8 s.

After RAG construction, RAG agglomeration analyzes edges to determine which node pairs to merge. We observed that most of the work was performed by a random-forest classification library within OpenCV that computes edge weights.

To parallelize RAG agglomeration, we defer the computation of edge weights and mark affected nodes and edges as dirty [279]. A dequeued edge is ignored if it is dirty or incident to a dirty node. When the priority queue is empty, the edge weights are all computed in parallel as a batch and reinserted into the queue. This strategy resulted in a 3x speedup for agglomeration, reducing its runtime from 40.2 to 13.4 seconds.
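Returning to the RAG-construction parallelization above, the following is a serial sketch of the augmented merge; the Edge type and its single stand-in feature are illustrative, and the real implementation combines the full associative feature set and performs both the merge and the duplicate-combining scan in parallel:

#include <algorithm>
#include <cstdint>
#include <vector>

// A RAG represented as a sorted edge list; a region appears as a self-edge
// (u == v) carrying its associative features.
struct Edge {
  std::uint64_t u, v;
  double count;  // stand-in for the associative regional/edge features
};

static bool edge_less(const Edge& a, const Edge& b) {
  return a.u != b.u ? a.u < b.u : a.v < b.v;
}

// Merge two RAGs whose edge lists are sorted by edge_less.
std::vector<Edge> merge_rags(const std::vector<Edge>& a, const std::vector<Edge>& b) {
  std::vector<Edge> merged(a.size() + b.size());
  std::merge(a.begin(), a.end(), b.begin(), b.end(), merged.begin(), edge_less);

  // Combine duplicate edges by applying the features' associative update
  // (here simply summing the stand-in feature).
  std::vector<Edge> out;
  for (const Edge& e : merged) {
    if (!out.empty() && out.back().u == e.u && out.back().v == e.v)
      out.back().count += e.count;
    else
      out.push_back(e);
  }
  return out;
}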

5.6 Merging

The state-of-the-art Spark-based system of Plaza and Berg [284] generates overlaps between adjacent blocks and identifies "merge pairs" from these overlaps. These merge pairs are identified using a set of complicated, hand-crafted heuristics, and the authors report that this is made difficult by the substantial number of edge cases. Instead, our merging algorithm leverages NeuroProof's fast random-forest algorithm to identify merge pairs directly [254]. We further optimize this process by ordering and parallelizing computations in a way that maximizes utilization of the cache sub-systems, e.g., by processing small subsets that fit in the L3 cache of each chip.

Merge Pair Decisions. Merge pairs are determined for all adjacent blocks in the volume in two phases. First, the segments on the block boundary are agglomerated using our parallel NeuroProof. This phase is conservative, so only clear matches are merged. In the second phase, the algorithm generates a synthetic ground truth by executing watershed and agglomeration on the block boundary. It then executes an optimization procedure based on this synthetic ground truth: it merges a pair and computes the associated VI score relative to the ground truth. If accuracy decreases, it aborts the merge and tries a different pair. Note that since most of the pairs are merged in the first phase, the second phase has fewer pairs to process and is substantially faster.

Combining and Relabelling. The merging procedure gives each voxel a new (final) label in the whole volume. To combine labels, we implemented a disjoint-set data structure, which clusters segments from different blocks according to the merge-pair decisions. In a cluster-based system, this process is complicated by network transfers and failures, whereas in our case all intermediate data is immediately available in RAM or on disk. As an example, it takes just 4 seconds to combine the labels of a 473 GB dataset, a speed that would be unobtainable on a network of hundreds of machines. Finally, the algorithm executes a parallel pass over all segmented blocks and relabels them according to the clustering from the disjoint-set data structure. As this step is I/O bound, we utilize all of the drives to maximize I/O throughput.
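A sketch of the disjoint-set structure used to combine labels; it is illustrative, and the bookkeeping that maps per-block labels into a single global label space is omitted:

#include <cstddef>
#include <numeric>
#include <vector>

// Disjoint-set (union-find) over segment labels: each merge-pair decision
// unions two labels, and a final pass relabels every voxel with the
// representative of its cluster.
class DisjointSets {
 public:
  explicit DisjointSets(std::size_t n) : parent_(n) {
    std::iota(parent_.begin(), parent_.end(), 0);
  }
  std::size_t find(std::size_t x) {
    while (parent_[x] != x) {
      parent_[x] = parent_[parent_[x]];  // path halving
      x = parent_[x];
    }
    return x;
  }
  void unite(std::size_t a, std::size_t b) { parent_[find(a)] = find(b); }
 private:
  std::vector<std::size_t> parent_;
};

// Usage: after uniting all merge pairs, relabel a block in one pass:
//   for (auto& label : block_labels) label = sets.find(label);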

5.7 Skeletonization

A skeleton is a one-dimensional, space-efficient representation of a 3D volume that runs along the segment's medial axis. There is a vast number of algorithms for computing skeletons for a variety of purposes in computer graphics [299]. We found that, for connectomics, a thinning algorithm is the most fitting because it is the fastest approach that still preserves the connectivity of the objects. Thinning algorithms remove all so-called simple points: voxels whose removal does not change the topology. Checking whether a point is simple is non-trivial, especially when multiple voxels are deleted in parallel.

Figure 5-8: Skeletonization of 20 objects from the Kasthuri et al. dataset (as one large block) [191] (473 GB ≈ 100,000 cubic microns of cortex).

Our parallel thinning algorithm is implemented as a chromatically scheduled dynamic data-graph computation [183]. A grid graph represents the labeled 3D volume, and a statically computed distance-2 coloring is used to identify sets of voxels that may be processed in parallel. Initially, updates are scheduled on all surface points, and each update checks whether the given point satisfies the simple-point criterion based on its 26-neighborhood. An update that deletes a point dynamically schedules an update for all non-deleted neighbors. Since there are only 2^26 possible 26-neighborhoods, the simple-point criterion can be precomputed. Our simple-point criterion is based upon the 38 templates of [314] and ensures local connectedness. Since our algorithm uses chromatic scheduling, it is equivalent to a standard 8-subfield thinning algorithm and thus provably preserves the topology of the objects [25].

After skeletonization, we transform the 3D segments into tree graphs based on 26-connectivity. If cycles exist in the graph, they are broken arbitrarily; however, large cycles are detected and reported for further analysis and error detection. After a one-pass pruning step, the trees are output as .swc files and visualized using the neuTube software [104]. Although not scalable, the same algorithm can run on the entire volume instead of using the per-block approach (Figure 5-8).
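A sketch of one chromatic sweep of the thinning computation, assuming the Cilk Plus cilk_for extension; the simple-point test is stubbed, and the dynamic rescheduling of neighbors, which is handled by the chromatic-scheduling runtime, is omitted:

#include <cilk/cilk.h>
#include <cstddef>
#include <cstdint>
#include <vector>

// One chromatic thinning sweep. The static distance-2 coloring guarantees
// that two same-color voxels are neither neighbors nor in each other's
// 26-neighborhoods, so each color class can be examined and deleted in
// parallel without races.
struct Grid {
  std::vector<std::uint8_t> occupied;  // 1 if the voxel is still part of the object
  // Stub: in the real system this is a precomputed lookup keyed by the
  // 2^26 possible 26-neighborhood occupancy patterns.
  bool is_simple_point(std::size_t /*v*/) const { return false; }
};

void thinning_sweep(Grid& grid,
                    const std::vector<std::vector<std::size_t>>& by_color) {
  for (const std::vector<std::size_t>& color_set : by_color) {  // colors in sequence
    cilk_for (std::size_t i = 0; i < color_set.size(); ++i) {   // one color in parallel
      std::size_t v = color_set[i];
      if (grid.occupied[v] && grid.is_simple_point(v))
        grid.occupied[v] = 0;  // deletion cannot race with other same-color voxels
    }
  }
}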

5.8 Pipeline performance

Machine learning consumes a large fraction of the computation in existing systems [284, 294] and was the main target of our scalability effort, so we spent time understanding its performance on both CPUs and GPUs earlier in the chapter. Here we focus on the combined performance of our pipeline elements. The main workhorse for our system testing is the previously described 4-socket shared-memory machine equipped with four 18-core Intel Xeon CPUs and 512 GB of RAM, running Ubuntu 14.04. We tested our pipeline on a 473 GB (3x3x30 nm resolution) electron microscopy dataset of mouse somatosensory cortex [190], containing 1850 pre-aligned 16,384x16,384 (256 MB) 2D EM brain scans. Figure 5-9 shows our total execution times for all of the pipeline stages. The total time to process the entire 473 GB EM dataset was 1.7 hours, implying a rate of about 3.6 hours per terabyte.

Per-stage throughput and share of execution time: CNN 200 MB/s (38%); Agglomeration 330 MB/s (23%); Merging 430 MB/s (17%); Watershed 500 MB/s (15%); Skeletonization 1.1 GB/s (7%).

Figure 5-9: Proportion of the execution time spent on each stage.

Importantly, we have managed to improve the machine-learning-based segmentation component, which has previously been a bottleneck in the throughput of connectomics pipelines [284], so that it is on par with the other stages. To analyze the efficiency of using a 72-core system, we measured the time it takes to execute our pipeline on a single core. The result is 1.11 s/MB in total on a single core, compared to 0.013 s/MB on 72 cores. This means that our pipeline achieves an 85x speedup, which is more than 72x because of the hyperthreading that allows us to generate 144 hardware threads on a single machine. To achieve maximum throughput in the CNN phase, we execute 8 instances of XNN (2 per chip), where each instance uses 18 threads. This eliminates any NUMA side effects that could introduce contention and bottlenecks. In the watershed phase, we simply spawn 144 instances, since our implementation is fast sequential code with low memory consumption (each instance uses <1 GB, and total memory is 512 GB). In the agglomeration phase, we execute 36 instances of the parallelized NeuroProof code (9 per chip), where each instance uses 4 threads. A similar scheme is applied to merging and skeletonization.

5.8.1 Reconstruction accuracy

To evaluate the reconstruction accuracy of our pipeline, we followed the benchmark described by Kaynig et al. [192]. We partitioned a total of 150 images (a region of the data called AC3 in [192]) at 2048x2048 and 3 nm resolution into three sets: 10 for training, 65 for validation, and 75 for testing. Ground-truth neuron segmentation was provided by expert neurobiologist annotators.

Our measure of segmentation accuracy is variation of information (VI) [253], which captures the statistical difference between two segmentations. In a typical segmentation there are split errors (neurons split erroneously) and merge errors (neurons merged erroneously), and the VI is an indicator of the extent of such errors in the data set. We benchmarked our 8-channel MaxoutNet CNN architecture (described in Section 5.3.1) and the state-of-the-art reconstruction of RhoAna described by Kaynig et al. [192] using the same evaluation benchmark on the test dataset described above. Our reconstruction obtained an accuracy score of VI = 1.6483, a substantial improvement over the VI = 1.99 score reported for


5.9 Lessons learned

In this section we discuss some of the general lessons learned from our effort to reduce a high-throughput, big-data problem to the domain of a single shared-memory multicore system. Some of these lessons may be known to the reader, but others were counter-intuitive, at least to us, and are therefore worth explicit mention.

Software scalability reduces memory requirements. Writing multicore code is generally hard – it is far simpler to write a single-threaded program that avoids the associated concurrency "nightmares" and to scale horizontally by launching a separate instance per core. While this may work for simple tasks, we observed that this approach can be seriously flawed for complex software. In the case of both our NeuroProof and CNN implementations, the large memory footprint of each instance caused the OS to frequently swap between memory and disk, thus increasing I/O bandwidth requirements. Moreover, scalability degraded due to L3 cache pollution – multiple instances spawned on the same 18-core chip populated the shared L3 cache with disjoint memory accesses. To tackle this issue, we engineered our software so that each instance utilizes all 18 cores. For our 4-socket 72-core machine, 4 instances of this software were spawned (one per socket) to avoid handling complex NUMA overheads [58, 94, 95, 102, 255], although it is possible that further work could allow a single instance to effectively utilize the full machine.

Disk I/O on a single machine is not problematic. Once computation is sufficiently optimized, disk-to-memory I/O for a single machine can become a bottleneck. This is often part of the motivation to migrate computation to a cluster of machines, and it is a problem we encountered in the watershed and agglomeration pipeline phases (where the disk's limits of 100 MB/s for reads and 200 MB/s for writes were reached). Instead, we resolved this bottleneck by horizontally scaling our disk drives – data was sharded across a set of 5 drives, yielding 500 MB/s read and 1000 MB/s write bandwidth for the system. Adding more disks to improve I/O bandwidth in this manner is far cheaper and simpler than migrating to a distributed cluster.

Cilk simplifies cache-aware multicore parallelization. A potential criticism of a multicore implementation approach is that we are simply shifting from one set of programming complexities (networking, failure detection, and recovery schemes on a distributed system) to the unique challenges posed by multicore concurrency. Our experience is that effective multicore programming can be very straightforward. We make extensive use of the GCC Cilk work-stealing scheduler to parallelize loop iterations, which removes the need to manually manage jobs/threads, encapsulate their environment (variable stacks), or schedule across multiple cores. In this way, multicore scalability only requires us to ensure that loop iterations (forming Cilk jobs) are cache- and prefetcher-friendly.
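As an illustration of this style, the following sketch parallelizes a per-tile loop with Cilk Plus's cilk_for; the tile representation and per-tile work are illustrative stand-ins rather than the pipeline's actual code.

// A minimal sketch of the loop-parallelization pattern described above. The
// work-stealing runtime schedules iterations across cores; the programmer only
// needs to keep iterations independent and cache/prefetcher friendly.
#include <cilk/cilk.h>
#include <cstddef>
#include <vector>

// Illustrative per-tile work: sum the bytes of one image tile.
static long process_tile(const std::vector<unsigned char> &tile) {
    long sum = 0;
    for (unsigned char px : tile) sum += px;
    return sum;
}

void process_all_tiles(const std::vector<std::vector<unsigned char>> &tiles,
                       std::vector<long> &results) {
    results.resize(tiles.size());
    // Each iteration becomes a Cilk task; no manual thread pools, job queues, or
    // environment capture are required.
    cilk_for (std::size_t i = 0; i < tiles.size(); ++i) {
        results[i] = process_tile(tiles[i]);
    }
}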

A GPU is not 100X faster than a CPU. It is widely believed in the machine learning community that a single GPU can perform orders of magnitude faster than a single CPU. This belief is supported empirically by popular machine learning packages – if one compares Caffe (bound to the Intel MKL and Nvidia cuDNN libraries) execution times on a GPU versus a CPU, the speed-up is approximately 50 to 100-fold. Following our multicore performance engineering efforts, we observe substantially different results.

On a single 18-core 2.4 GHz Haswell chip, our XNN execution is only 2-3x slower than both gpuZNN and the previous fastest CNN framework (Neon [264]) executed on a Titan X GPU. Moreover, XNN execution on a commodity 4-core Skylake chip (a standard desktop/laptop processor) was only 4-6x slower than this top-end GPU. These observations are consistent with the findings reported in [216], but here we support the claim on a modern CPU.

Multicores enable new efficient algorithms that are expensive on clusters. Cluster-based algorithms exhibit high network latencies, which force them to avoid communication between machines. As a result, programmers are constrained to coding patterns that are "trivially parallelizable" in the context of clusters, since this is the only way to avoid high network costs and achieve scalability. This is not the case for multicores: inter-core communication is orders of magnitude less expensive than cluster network communication, which allows programmers to design more sophisticated approaches that involve complex communication patterns. For example, in our case, a multicore pipeline can use a sophisticated scheduler to split the work between cores: it can identify problematic data regions and schedule cores to process them without needing to worry about communicating the problematic data and coordinating the cores (a much more complex and expensive process on a cluster).

Programming for multicores is not harder than for a cluster. A common belief among Hadoop/Spark programmers is that coding for a distributed cluster is simpler than coding for a multicore, and that shifting software to multicores is therefore not worth the effort. However, our experience shows that this is not true. A key point that needs to be emphasized is that coding for clusters is simple only as long as the parallelization of the problem is "trivial." In other words, as long as one can break the problem into small parts that require no communication between them, it is easy to do map-reduce-style programming. However, if this is simple for the cluster, then it will also be simple for the multicore: one can use threads to execute the small parts, and since there is no communication between these parts, there is no need to detect and handle race conditions, which are the complex and hard part of multicore programming. As a result, if programming a cluster is simple, then programming a multicore must be simple as well.

5.10 Conclusion

We presented a "proof of concept" connectomics pipeline that can extract a full skeletonization from an EM image stack in less than 4 hours on a commodity multicore machine, with a VI accuracy on par with or better than any existing system. This has the potential to move the connectomics problem, a big-data research problem in the natural sciences, from the realm of distributed data and warehouse storage to that of on-demand processing in labs across the world. Given current trends in multicore CPU and GPU architectures, we venture to predict that a single-socket machine, perhaps with a single attached GPU card, will be able to provide a solution for many connectomics pipelines around the world. The domain-specific results we report on the Kasthuri dataset [191] constitute a case study on the role of multicore performance engineering in large-scale image processing pipelines. The lessons learned from our experiences designing this system are of great import to those in the multicore and cluster computing communities. Through careful performance engineering, we show that a single commodity multicore machine can not only compete with, but significantly outperform, existing CPU- and GPU-based clusters solving the same problem.

This, of course, is not intended to advocate that all "big data" problems are best solved on a single multicore, but rather to serve as a reminder of the importance and dramatic benefits that can be obtained through multicore performance engineering.

5.11 Acknowledgements

Support is gratefully acknowledged from the National Science Foundation (NSF) under grants IIS-1447786 and CCF-1563880, and the Intelligence Advanced Research Projects Activity (IARPA) under grant 138076-5093555. In addition, we would like to thank the MIT SuperTech group for helpful discussions; in particular, Quan Nguyen, William Hasenplaugh, Bradley Kuszmaul, and Charles Leiserson. Finally, we would like to thank the Harvard connectomics group for reviewing our work and providing useful feedback; specifically, Daniel Berger, Adi (Suissa) Peleg, Thouis R. Jones, Hanspeter Pfister, David Cox, and Jeff Lichtman.

Chapter 6

High-Throughput Image Alignment for Connectomics using Frugal Snap Judgments

This chapter presents a high-throughput image alignment pipeline for connectomics that employs the multicore algorithms Quilter and Stacker to perform 2D and 3D alignment respectively. As part of the optimization of this pipeline, this chapter introduces a technique for data-driven performance optimization called frugal snap judgments that is used to obtain more advantageous performance–accuracy trade-offs in Quilter. This work was conducted in collaboration with Brian Wheatman and Sarah Wooders.

Abstract

Accurate and computationally efficient image alignment is a vital step in the field of connectomics, which seeks to understand the structure of the brain through electron microscopy. We introduce the algorithms Quilter and Stacker, which are designed to perform 2D and 3D alignment, respectively, on petabyte-scale data sets from connectomics. Quilter and Stacker are efficient, scalable, and simple to deploy on hardware ranging from a researcher's laptop to a large-scale computing cluster. On a single 18-core cloud machine, each algorithm achieves a throughput of more than 1 TB/hr, and when combined they produce an end-to-end alignment pipeline that processes data at a rate of 0.82 TB/hr — an over 10x improvement over previous systems. This efficiency comes both from traditional optimizations and from the use of "frugal snap judgments" to judiciously exploit performance–accuracy trade-offs. A high-throughput image alignment pipeline was implemented and evaluated using the Quilter and Stacker algorithms on a range of platforms, including a common 18-core machine (Intel E5), a large 112-core machine (Intel Xeon Platinum), and a supercomputing cluster with 1600 cores. The pipeline achieves a throughput of 0.6–0.8 TB/hr on the 18-core machine, 1.4–1.5 TB/hr on the large 112-core machine, and 21.4 TB/hr on the supercomputing cluster with 1600 cores.

6.1 Introduction

Accurate and computationally efficient image alignment is vital within the field of connectomics [227, 226, 313, 298, 331], which seeks to study comprehensive maps of connections in the brain through the analysis of extremely high-resolution imagery obtained from electron microscopes.

Advances in electron microscopy have enabled the acquisition of image data sets that capture both the small- and large-scale features present in neural tissue. The resultant data sets are quite large, with a relatively small 1mm3 volume producing petabytes of data when imaged at (3 × 3 × 30)nm resolution. The scale of the acquired data necessitates the development of image processing algorithms that are both scalable and efficient. The process of imaging a large physical volume at such high resolution does not immediately produce a single representative three-dimensional image. Instead, the volume to be imaged is physically sliced into very thin sections (approximately 30nm thick), which are then each imaged separately. Imaging a single two-dimensional section in its entirety is, unfortunately, also not possible, since the size of each section vastly exceeds the limited field of view of the microscope. To image each section, therefore, the head of the microscope must physically move to scan over the entirety of the section, each time capturing a single image tile. These images must be aligned using 2D and 3D alignment algorithms in order to correct for the physical movements of the microscope and the deformations introduced during the slicing process [227]. This chapter presents a case study on the engineering of a high-performance image alignment pipeline for connectomics that makes efficient use of commodity multicore hardware whilst retaining both strong and weak scalability. The focus of our discussion is the development of a high-performance variant of an existing image alignment algorithm as implemented in software tools such as FijiBento [293], which is widely used within the connectomics research community [178, 312, 209, 311].

Designing an efficient alignment pipeline for multicore

Existing systems for performing image alignment in connectomics distribute fine-grained tasks over large compute clusters that have specialized hardware (GPUs) for accelerating performance-intensive tasks. Although there exist many, often lab-specific, alignment systems, they all tend to closely follow the methodology used in the FijiBento library [293] and the HHMI/Janelia Aligner project [306]. Each of these systems distributes work in stages, where each stage is composed of one or more tasks for each image tile (approximately 8 MB) in the dataset. The intermediate results needed to combine the results of these tasks to perform 2D and 3D alignment are communicated either directly over the network or, more commonly, through a distributed filesystem such as Lustre [45] or pNFS [152]. In Section 6.2 we review the algorithms used in connectomics to perform 2D and 3D alignment and discuss the performance challenges that previous systems have encountered. In Section 6.3 and Section 6.4 we describe the two multicore algorithms that form the backbone of this work's alignment pipeline. In Section 6.3, we introduce Quilter, an efficient multicore algorithm for 2D alignment that avoids costly serialization and communication of intermediate results. Key to the design of Quilter is its efficient task ordering, which allows it to reap the advantages of fast shared-memory communication of intermediate results whilst retaining the ability to scale to data sets that are much larger than the available memory of a single multicore. In Section 6.4, we introduce the Stacker algorithm, which performs 3D alignment of sections. Stacker enables the alignment of a stack of sections in parallel by computing the pairwise alignments of all adjacent sections concurrently. Although this strategy is often employed when performing affine alignment of sections, it has not previously been used with the non-uniform elastic transformations that are needed to align imagery in connectomics.

Exploiting performance–accuracy tradeoffs

A simple way to increase the throughput of an image processing pipeline is to downsample the data. Downsampling is a tempting strategy for improving the performance of Quilter: for the vast majority of tiles in a section, aligning neighboring tiles using downsampled images does not change the computed relative alignment. Our discussions with practitioners performing image alignment on connectomics datasets, however, revealed a wariness towards downsampling due to its potential to introduce errors. Their rationale is twofold: (a) the alignment errors introduced in 2D alignment, while small, can accumulate when stitching a large section, resulting in unnatural deformations that are difficult to correct; and (b) even small misalignments are sufficient to alter fine details in certain neuronal structures, such as the spine necks of dendrites. Indeed, small differences in alignment can impact later stages of the connectomics pipeline; e.g., small misalignments of 4–8 pixels can cause neuronal objects to be incorrectly split or merged [227]. The potential performance advantages of downsampling, however, were too alluring to pass up. Thus, we developed a technique called frugal snap judgments that allows us to obtain more advantageous performance–accuracy trade-offs. Figure 6-1a and Figure 6-1b show the throughput and accuracy obtained for 2D alignment when using frugal snap judgments (FSJ). Using frugal snap judgments, our system learns to identify when the result of the fast alignment algorithm is likely to match the result of the more-accurate code. This enables us to achieve a 5x improvement in throughput without introducing significant alignment error. In Section 6.5 we introduce frugal snap judgments and explain how we applied the technique to the image alignment problem to obtain more advantageous performance–accuracy trade-offs. The learned criterion is not based upon an objective notion of correctness (which would be costly to compute). Instead, it is based upon intermediate results generated by the fast alignment code, which are analyzed to extract a measure of reliability.

Performance evaluation of Quilter and Stacker

Section 6.6 presents an end-to-end performance evaluation of the image alignment pipeline we developed using Quilter and Stacker. We investigate the performance of Quilter and Stacker on a set of four computing platforms that range from a modestly sized desktop machine to a large compute cluster with thousands of cores. The efficient design of Quilter and Stacker, combined with the judicious exploitation of performance–accuracy trade-offs using frugal snap judgments, allows us to obtain state-of-the-art performance while using only commodity multicore hardware. Our pipeline achieves 0.6–0.8 TB/hr throughput on a commodity 18-core Intel Xeon CPU E5-2666; 1.4–1.5 TB/hr throughput on a large 112-core Intel Xeon Platinum 8180 (2.5GHz); 7.5 TB/hr on 5440 AMD Opteron 6274 cores (2.2GHz); and 21.4 TB/hr on 1600 Intel E5-2666 cores.

Contributions

A summary of the contributions is as follows.
• An efficient 2D alignment algorithm called Quilter with a low memory high-water mark that does not perform redundant file I/O or recomputation.

Algorithm          Runtime       Throughput
100%               180 minutes   0.18 TB/hr
50%                 86 minutes   0.38 TB/hr
33%                 49 minutes   0.67 TB/hr
25%                 30 minutes   1.1 TB/hr
20%                 27 minutes   1.2 TB/hr
FSJ(20%, 100%)      32 minutes   1.03 TB/hr

(a) Alignment throughput.

(b) Alignment errors (fraction of tiles with error >1px, >2px, >4px, >8px).

Figure 6-1: Impact of downsampling on throughput and accuracy. Figure 6-1a shows runtime and throughput of 2D alignment on the mouse50 dataset: a 550GB dataset composed of 65,000 tiles. Figure 6-1b illustrates alignment errors on mouse50 greater than 1px, 2px, 4px, and 8px relative to the algorithm run on full-resolution images. The algorithms 50%, 33%, and 20% downsample to 1/2, 1/3, and 1/5 respectively. FSJ(20%, 100%) downsamples to 1/5, but recomputes at full-resolution when detecting a likely error.

• A horizontally scalable 3D alignment algorithm called Stacker that supports affine and elastic transformations.
• We introduce the optimization technique of frugal snap judgments and apply it to improve the performance of Quilter by 3–5x with no significant accuracy loss.
• System evaluation of Quilter and Stacker on three datasets ranging in size from 550GB to 38TB, and on four computing platforms including a single 18-core workstation and a supercomputing cluster with thousands of cores. The evaluation illustrates end-to-end performance, vertical/horizontal scalability, and strong/weak scaling.

6.2 Alignment algorithms used in connectomics

In this section, we provide necessary background on the general structure of algorithms used to align large EM data sets in connectomics. The primary purpose of this discussion is to provide a basic understanding of the key operations that are performed during alignment when using FijiBento (and related) systems. Additionally, we discuss some performance challenges related to memory limitations and to overheads due to data serialization and file/network I/O.

Image data format

Let us briefly describe the format of the images provided by the microscope to make our later back-of-the-envelope calculations concrete. The microscope provides a set of image tiles where each image is 2724 × 3128 pixels. Each pixel in an image represents a (3 × 3 × 30)nm physical volume. We assume that the set of tiles that compose a 2D section fall within a bounding rectangle whose length and width differ by a constant factor, and we assume that the average tile overlaps with ≈ 8 neighbors. These assumptions are realistic guarantees that are ensured during the image acquisition process.


2D alignment algorithm

The first step of the alignment pipeline constructs a 2D section from the set of image tiles obtained by imaging a single physical slice of tissue. During the imaging process, each acquired image tile is associated with an approximate location in the plane, and adjacent tiles are imaged such that there is a small amount of overlap. This overlap allows the 2D alignment algorithm to identify common landmarks in adjacent tiles, which are then used to precisely determine the tiles' positions relative to each other. After the relative alignments are computed, an optimization is performed to find the tile locations that minimize the total energy of a system in which each adjacent tile pair is connected by a spring whose rest position is determined by the tile pair's relative alignment.

Algorithm 2: 2D image alignment
Result: locations of each tile in a single global space.

read in metadata file
create list P of all pairs of possibly overlapping tiles
all_local_offsets = ∅
for p ∈ P:
    load_image(p.first)
    load_image(p.second)
    kp_1 = getKeyPoints(p.first)
    kp_2 = getKeyPoints(p.second)
    matched_kp = matchKeyPoints(kp_1, kp_2)
    local_offset = find_best_offset(matched_kp)
    all_local_offsets[p] = local_offset
global_offset = combine_offsets(all_local_offsets)

Algorithm 2 illustrates the structure of the 2D alignment step. The input to Algorithm 2 is a metadata file that includes the location of each image tile on disk and the approximate coordinates of each tile in the plane. These approximate coordinates induce a poor alignment, but are sufficient to determine the pairs of tiles that overlap. The image data for each tile is often compressed to reduce file and network I/O. A common format is JPEG2000 [91], which obtains up to 10x compression ratios on connectomics data sets. Generating keypoints is commonly performed using the SIFT algorithm [235]. Other algorithms such as SURF [17] and ORB [296] are sometimes used in place of SIFT, but their use is less common. Keypoints in the two images are matched by finding nearest neighbors in the SIFT feature space and filtered using the ratio-of-distances heuristic described in [235]. Next, a random sample consensus algorithm (RANSAC) [110] is used to find a coordinate transformation (in the 2D case, a rigid translation) that is consistent with the maximum number of matched keypoints.
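As a concrete illustration of these per-pair steps, the following C++ sketch uses OpenCV (SIFT keypoints, a ratio-test filter, and RANSAC over a restricted transform). It is an assumed sketch rather than the pipeline's implementation; in particular, estimateAffinePartial2D stands in for the rigid-translation fit described above, and SIFT lives in the xfeatures2d contrib module for OpenCV 3.x.

// A minimal sketch of pairwise tile alignment with OpenCV (not the pipeline's code).
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/xfeatures2d.hpp>
#include <opencv2/calib3d.hpp>
#include <vector>

cv::Point2d relative_offset(const cv::Mat &tile_a, const cv::Mat &tile_b) {
    auto sift = cv::xfeatures2d::SIFT::create();
    std::vector<cv::KeyPoint> kp_a, kp_b;
    cv::Mat desc_a, desc_b;
    sift->detectAndCompute(tile_a, cv::noArray(), kp_a, desc_a);
    sift->detectAndCompute(tile_b, cv::noArray(), kp_b, desc_b);

    // Match descriptors and apply the ratio-of-distances filter.
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(desc_a, desc_b, knn, 2);
    std::vector<cv::Point2f> pts_a, pts_b;
    for (const auto &m : knn) {
        if (m.size() == 2 && m[0].distance < 0.7f * m[1].distance) {
            pts_a.push_back(kp_a[m[0].queryIdx].pt);
            pts_b.push_back(kp_b[m[0].trainIdx].pt);
        }
    }

    // RANSAC over a restricted transform; a partial affine stands in here for the
    // rigid translation used in practice.
    cv::Mat inliers;
    cv::Mat T = cv::estimateAffinePartial2D(pts_a, pts_b, inliers, cv::RANSAC);
    if (T.empty()) return {0.0, 0.0};                       // no consistent transform found
    return {T.at<double>(0, 2), T.at<double>(1, 2)};        // translation component
}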

Design considerations for 2D alignment

A variety of designs were considered for the 2D alignment algorithm, varying in their memory and communication requirements. Our goal in the design of Quilter, which we describe in Section 6.3, was to build a system that would: (a) read each image tile only once, to save on extra I/O costs; (b) not compute the same thing twice, to save on extra compute costs; (c) not serialize intermediate results to disk, to save on extra I/O costs; and (d) have projected memory requirements that allow a single multicore to align a human brain. To illuminate the design challenges addressed by Quilter, we shall discuss a few natural alternative designs and explain their shortcomings.

All-I/O and All-Mem

First, let us consider two extreme approaches: All-Mem, which stores all images and intermediate results in-memory; and All-I/O, which keeps only a single pair of tiles in-memory and writes intermediate results to disk. An All-Mem algorithm loads all images from disk and then computes the 2D alignment while keeping all images and intermediate results in-memory. All-Mem requires memory proportional to the size of the area being aligned, which precludes its use for large sections. The advantages of All-Mem are that it only reads each image once and does not need to perform recomputation. Thus, All-Mem satisfies (a)–(c), but fails to satisfy (d). An All-I/O algorithm maintains a constant memory footprint by performing redundant computation and file I/O. For each pair of overlapping image tiles, All-I/O reads both images from disk, computes the SIFT keypoints for both images, and uses those keypoints to find the tile pair's relative alignment. This relative alignment is written to disk, and then the images along with the computed keypoints are discarded. All-I/O must read and compute keypoints for each image approximately 4 times — increasing compute and I/O costs by 4x. Thus, All-I/O satisfies (c) and (d), but fails to satisfy (a) and (b).

Inter-Mem and Inter-I/O

Let us now consider two variations, Inter-Mem and Inter-I/O, that avoid recomputation by caching the keypoints computed for each image. The Inter-Mem algorithm reads each image and generates keypoints that are then cached in-memory. Then the relative alignments of all adjacent tiles are computed using the previously computed keypoints. The Inter-I/O algorithm operates analogously, but it caches keypoints on disk instead of in-memory. For a typical tile of size 8MB, the size of the cached keypoints is between 1.2–3.1MB1. As such, the Inter-Mem algorithm is generally more memory-efficient than All-Mem. Inter-I/O, however, is generally less efficient than All-I/O for two reasons: (1) images are often read from disk in a compressed format that can be 10x smaller than the uncompressed data; and (2) the Inter-I/O algorithm must read a tile's keypoints 4 times, on average, to align it to all of its neighbors. As such, Inter-I/O performs less computation than All-I/O by avoiding recomputation, but does so at the expense of extra I/O to access cached keypoints.

1A data table is provided in Figure 6-12 that contains the information needed for this back-of-the-envelope calculation

3D alignment algorithm

After the 2D alignment step, the now-constructed sections are aligned to form a 3D image that represents the imaged volume. A variety of processes during imaging can introduce distortions in the 2D sections that need to be corrected during 3D alignment. The physical cutting of the volume is the likely cause of most distortions, but other processes, such as the transport of the very thin sections via water bed and non-uniform expansion/compression of the tissue due to environmental conditions, can play a role as well. Regardless of the cause, these distortions must be corrected in order to obtain a representative 3D image of the sample. The 3D alignment algorithm approximates the function mapping the coordinates of one section into another using a collection of affine transformations. Specifically, a triangle mesh overlays each section, and a barycentric coordinate transformation [157] maps the points within each triangle into the coordinate space of adjacent sections. Typical systems for performing 3D alignment, such as those in FijiBento, operate by dividing the volume into smaller 3D blocks and performing iterative optimization of the triangle mesh. Corresponding points between adjacent sections are identified (using a procedure similar to that used in 2D). Then, an iterative optimization procedure adjusts the triangle meshes in the volume to minimize the distance between corresponding points. A disadvantage of this approach is that the entire volume must be aligned more-or-less simultaneously. The addition or removal of a single new section can necessitate recomputing the alignment for the entire volume. This can be especially problematic when one fails to identify a single bad section prior to investing the computing resources to optimize a large volume. Furthermore, this approach precludes interleaving 2D and 3D alignment in a principled manner, since the computed 3D alignment for a single section depends on all other sections.
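The barycentric mapping at the heart of this representation can be sketched in a few lines; the following code is an assumed illustration, not the pipeline's implementation, of mapping a point from a triangle in one section to the corresponding triangle in the adjacent section.

// A minimal sketch of a barycentric coordinate transformation between two triangles.
#include <array>

struct Pt { double x, y; };
using Tri = std::array<Pt, 3>;

// Express p in barycentric coordinates of triangle `src`, then re-instantiate the
// point at the same barycentric coordinates inside triangle `dst`.
Pt barycentric_map(const Pt &p, const Tri &src, const Tri &dst) {
    const Pt &a = src[0], &b = src[1], &c = src[2];
    double det = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
    double l0 = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / det;
    double l1 = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / det;
    double l2 = 1.0 - l0 - l1;
    return {l0 * dst[0].x + l1 * dst[1].x + l2 * dst[2].x,
            l0 * dst[0].y + l1 * dst[1].y + l2 * dst[2].y};
}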

6.3 Quilter algorithm

This section describes Quilter, a 2D alignment algorithm for stitching very large mosaics in-memory. Quilter employs careful task ordering to bound its memory use while avoiding unnecessary file I/O and recomputation. This enables Quilter to reap the performance advantages of in-memory computing even for very large mosaics. Quilter can process a 2D cross-section of an entire mouse brain with under 50 GB of memory, and a 2D cross-section of a human brain with less than 1 TB of memory.

Line-sweep ordering of tasks

The essential idea of Quilter is to order tasks based upon a 1D line-sweep through the section being 2D aligned. At any given point in the execution, the image data and intermediate results for tiles that are touched by the line-sweep, and for those tiles' neighbors, are kept in-memory. The data retained for a tile is released once the line sweep has advanced beyond all of the tile's potential neighbors. Figure 6-2 illustrates an in-progress execution of Quilter using this line-sweep task ordering. Quilter begins by computing, for each tile, a set of overlapping neighbors, and sorts all tiles by the y-coordinate of their bounding box's bottom-left corner. The initial set of tiles processed by Quilter consists of tiles whose bounding box overlaps with a horizontal slab extending along the bottom of the section. For each tile being processed, all of its neighbors that are not already in-memory are read from disk. Quilter then computes pairwise alignments between the selected tiles and all of their overlapping neighbors. Quilter then progresses by increasing the position of its horizontal slab to select a new set of tiles to process. This new slab will contain the old neighbors of the previous slab, which are already in memory. Before local alignment is computed, the slab's new neighbors are loaded into memory. Once the local alignment is finished, the first slab's data can be released, the slab moved up, and the process repeated. After computing the pairwise alignments of all tiles in the section, Quilter solves the optimization problem described in Section 6.2 to position tiles in the plane using a loss function determined by the local relative alignments between tile pairs. This optimization problem is formulated and solved in parallel by treating it as a data-graph computation and employing the techniques from [184].

Figure 6-2: An illustration of a Quilter execution in-progress. Tiles are shaded red if being processed, blue if their keypoints are in-memory, and light/dark gray if untouched/done.
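The sketch below illustrates the line-sweep ordering and the release rule in simplified form; the data layout and helper routines are hypothetical placeholders rather than Quilter's actual code.

// A minimal sketch (assumed structure) of the line-sweep task ordering: tiles are
// sorted by the bottom edge of their bounding boxes, a horizontal slab advances
// upward, and a tile's cached data is released once the sweep has passed everything
// that could still need it.
#include <algorithm>
#include <cstddef>
#include <vector>

struct Tile {
    double y_min, y_max;          // bounding-box extent along the sweep direction
    std::vector<int> neighbors;   // indices of potentially overlapping tiles
};

static void load_if_absent(int /*tile*/) {}      // placeholder: read image + keypoints
static void align_pair(int /*a*/, int /*b*/) {}  // placeholder: pairwise 2D alignment
static void release(int /*tile*/) {}             // placeholder: free image + keypoints

void line_sweep_align(const std::vector<Tile> &tiles, double slab_height) {
    std::vector<int> order(tiles.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = static_cast<int>(i);
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return tiles[a].y_min < tiles[b].y_min;
    });

    std::size_t next = 0, oldest = 0;
    for (double sweep = 0.0; next < order.size(); sweep += slab_height) {
        // Process every tile whose bounding box enters the current slab.
        // (A real implementation would also avoid aligning the same pair twice.)
        while (next < order.size() && tiles[order[next]].y_min < sweep + slab_height) {
            int t = order[next++];
            load_if_absent(t);
            for (int n : tiles[t].neighbors) {
                load_if_absent(n);
                align_pair(t, n);
            }
        }
        // Release tiles the sweep has fully passed; no later tile can overlap them.
        while (oldest < next && tiles[order[oldest]].y_max < sweep) {
            release(order[oldest++]);
        }
    }
}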

Memory requirements of Quilter

We shall now analyze the memory and I/O required by Quilter to align a section using the data from Figure 6-12. Quilter performs I/O to read compressed image data and to write its final result to disk. For a 10cm2 section, Quilter reads 100TB of compressed image data from disk. Quilter writes a list of tile ids and 2D offsets to disk, which are negligible in size (less than 64 bytes per tile). The total memory required by Quilter depends upon the maximum size of its working set. For each tile in the working set, we store the tile's uncompressed image data and all of its keypoints. The maximum size of the working set is approximately 3 rows of tiles. As such, the total memory required to process a 10cm2 section using Quilter is approximately 800GB. A comparison of Quilter to the methods described in Section 6.2 can be found in Figure 6-3.

Method      Memory          Total I/O        I/O Ops
All-Mem     1000 TB         100 TB           130 million
All-I/O     0.5 TB          400 TB           520 million
Inter-Mem   160 - 400 TB    100 TB           130 million
Inter-I/O   0.5 TB          640 - 1600 TB    780 million
Quilter     0.8 TB          100 TB           130 million

Figure 6-3: Memory and I/O characteristics of 2D alignment algorithms on a 10cm2 section imaged at (3 × 3 × 30)nm resolution with tile size 2724 × 3128 pixels.


The memory requirements of Quilter relative to All-Mem were measured empirically and plotted as a function of dataset size in Figure 6-4. For this experiment, we ran All-Mem and Quilter on progressively larger subsets of a large 2D section of 40,000 tiles. These subsets were circular regions, centered in the middle of the 2D section, with varied radius. As expected, the memory requirements of Quilter scale proportionately to the diameter of the region being aligned, while those of All-Mem scale proportionately to the area.

Figure 6-4: Memory high-water mark of Quilter (line sweep) compared to All-Mem as the data set grows. Data obtained by extracting regions from human100.

6.4 The Stacker algorithm

This section describes the Stacker 3D alignment algorithm. Stacker operates on pairs of adjacent sections that have been 2D-aligned using Quilter. Stacker supports both affine transformations and non-affine "elastic" transformations, which enable Stacker to compute high-quality 3D alignments that correct for distortions introduced by errors during sample preparation or imaging. Our discussion focuses on two aspects of Stacker that are especially relevant to the alignment pipeline described in this chapter: (a) its ability to align adjacent sections independently in parallel, and (b) the memory requirements of Stacker when operating on large sections.


Independent alignment of sections

Stacker is designed to align pairs of adjacent sections independently using composable transformations. Given a pair of sections (Si−1, Si), Stacker computes a coordinate transformation mapping points in Si to points in Si−1. This coordinate transform is represented using a hexagonal triangle mesh that overlays section Si. Each triangle vertex in Si has a corresponding transformed vertex in Si−1 whose position is computed by Stacker during 3D alignment. After 3D aligning Si to Si−1, a point in Si can be mapped to a coordinate in Si−1 by finding the triangle containing that point and performing a barycentric coordinate transformation [157] to obtain that point's location in the corresponding triangle within Si−1. Given a stack of sections S1, S2, ..., Sk, Stacker computes the pairwise alignments of (S1, S2), (S2, S3), ..., (Sk−1, Sk) independently. To map all sections into the global coordinate space of a single 3D volume, the relative alignments are combined using function composition, which can be accomplished efficiently by applying the barycentric coordinate transformations to the vertices of the triangle mesh itself.

Memory requirements of Stacker

Stacker is designed to align a pair of sections on a single multicore. As such, a natural concern is that Stacker will be unable to align very large sections efficiently due to the limited available memory of a single machine. Fortunately, the memory requirements of Stacker are actually quite modest and allow it to scale all the way to the human brain while requiring only about 4TB of memory. A good estimate of the memory requirements of Stacker can be obtained via a back-of-the-envelope calculation. For each image tile of size ≈ 8MB, Stacker requires memory for approximately 100 corresponding points and 12 triangles. Each corresponding point, along with its SIFT feature vector, is represented using 156 bytes, and each triangle requires 128 bytes. Each section of human brain (10cm2) is composed of approximately 125,000,000 tiles, so the total memory requirement of Stacker per section is approximately 2TB. Since Stacker stores data for 2 sections while computing a pairwise alignment, Stacker requires about 4TB of memory to perform 3D alignment on sections of human brain. Presently, the volumes being aligned in connectomics are substantially smaller than those of the human brain. For the "grand challenge" of reconstructing an entire mouse brain, which is 100x smaller by volume than the human brain, the same back-of-the-envelope calculation shows that Stacker requires only about 50GB of memory.
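The following small program reproduces this back-of-the-envelope calculation using the constants stated above; it is provided only as a worked version of the arithmetic, with the per-tile counts being the stated approximations rather than measured constants.

// A worked version of the per-section memory estimate for Stacker.
#include <cstdio>

int main() {
    const double tiles_per_section  = 125e6;  // ~10cm^2 human-brain section
    const double keypoints_per_tile = 100;    // corresponding points kept per tile
    const double bytes_per_keypoint = 156;    // point + SIFT descriptor
    const double triangles_per_tile = 12;
    const double bytes_per_triangle = 128;

    double per_tile = keypoints_per_tile * bytes_per_keypoint +
                      triangles_per_tile * bytes_per_triangle;      // ~17.1 KB
    double per_section_tb = tiles_per_section * per_tile / 1e12;    // ~2.1 TB
    std::printf("per section: %.1f TB, pairwise (2 sections): %.1f TB\n",
                per_section_tb, 2 * per_section_tb);
    return 0;
}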

6.5 Frugal snap judgments

This section describes a technique called frugal snap judgments (FSJ) that is used to accelerate the performance of Quilter by a factor of 3–5x without any appreciable loss of alignment accuracy.

Design of FSJ in Quilter

Let us now describe how frugal snap judgments are applied to improve the performance of Quilter. Frugal snap judgments are used to optimize the algorithm used to compute


Figure 6-5: Error with and without FSJ on human100 relative to the alignment Quilter produces with no downsampling.


Figure 6-6: Figure 6-6a shows the relative importance of variables used by FSJ on human100 dataset. Variables are numbered as follows: 1–3 and 12–14 are overlap area/length features before and after fast-path alignment; 5,15 ratios of keypoints filtered during matching; 6–11 aggregate statistics of overlap region; and, 4 is number of matched keypoints. Figure 6-6b illustrates FSJ classification error during training.

corresponding keypoints between two overlapping tiles. The fast-path algorithm for computing pairwise tile alignments operates on downsampled tile images at 30% resolution. For approximately 2% of tiles and 0.4% of all tile pairs, these modifications result in a less accurate result than the original code path operating on full-resolution images. Yet, for the remaining tile pairs, the fast path computes a result that matches the slow path within a tolerance of 1–2 pixels. In order to detect when the result of the fast path is unreliable, we employ random-forest classification over feature vectors that summarize the intermediate results of the fast-path algorithm. We employ feature vectors of dimension 15, which include the following information: the area, width, and height of the overlapping region between a pair of tiles; the fraction of keypoints matched; the fraction of matched keypoints that are filtered by RANSAC; the total number of filtered keypoints; and aggregate statistics from the filtered keypoints. Additional details are provided in Figure 6-6a. These feature vectors are small and inexpensive to compute, since they depend on the intermediate results of the fast path rather than directly on raw pixel data in the image tiles.

Training our fast-path detector

Data to train the detector is obtained by randomly sampling overlapping tile pairs from the stack and executing the fast- and slow-path codes on the same input. The relative offsets between the tiles computed by the fast and slow paths are compared and considered matching if they differ by less than 1px. We extract a feature vector from the fast-path result and insert it into either the training or the testing set. During training, a false positive occurs when the detector incorrectly predicts that the fast/slow codes will match, and a false negative occurs when it incorrectly predicts that they will disagree. A well-trained detector will minimize the false-negative rate while strictly constraining the false-positive rate. Figure 6-6b illustrates the evolution of the false-positive rate while training the FSJ classifier. A 95th-percentile confidence bound on the false-positive rate is estimated, using the testing set, during training and used as a stopping criterion. For our data sets, we typically stopped training once the 95th-percentile confidence bound on the false-positive rate fell below 0.4%. The false-negative rate of our FSJ classifiers varied between ≈ 4–10% depending on the resolution of the fast path and the dataset. Training time does not depend on the size of the dataset, and took between 10–20 minutes on an 18-core Intel Xeon CPU (E5-2666 v3, 2.9GHz). After training on the human100 data set using 30% resolution images in the fast path, we achieved an out-of-bag error of 6.7e−2, a 95th-percentile confidence false-positive rate of 0.4%, and a false-negative rate of 6%. The relative variable-importance scores for the random forest classifier are provided in Figure 6-6a. Figure 6-5 shows the 2D alignment errors of Quilter when using FSJ on a set of 4 sections that were not used during training. When using FSJ, there are nearly no errors greater than 1 pixel and no errors greater than 5 pixels.
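As an illustration of how such a detector could gate the fast path, the sketch below assembles a 15-dimensional feature row and queries a random-forest classifier using OpenCV's cv::ml::RTrees. The feature layout, struct, and decision threshold are assumptions for illustration, not the thesis's implementation; training would label sampled pairs by whether the fast and slow paths agree and fit the forest on those rows.

// A minimal sketch of the FSJ gate: summarize the fast path's intermediates and
// ask a random forest whether the fast-path result can be trusted.
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

struct FastPathSummary {
    float overlap_area, overlap_w, overlap_h;   // overlap-region geometry
    float frac_matched, frac_ransac_filtered;   // keypoint matching/filter ratios
    float num_filtered;                         // count of surviving matches
    float stats[9];                             // aggregate statistics of the overlap
};

cv::Mat to_feature_row(const FastPathSummary &s) {
    cv::Mat row(1, 15, CV_32F);
    float *p = row.ptr<float>(0);
    p[0] = s.overlap_area; p[1] = s.overlap_w; p[2] = s.overlap_h;
    p[3] = s.frac_matched; p[4] = s.frac_ransac_filtered; p[5] = s.num_filtered;
    for (int i = 0; i < 9; ++i) p[6 + i] = s.stats[i];
    return row;
}

// Returns true if the fast-path result is trusted; false means the tile pair
// should be recomputed on full-resolution images.
bool trust_fast_path(const cv::Ptr<cv::ml::RTrees> &forest, const FastPathSummary &s) {
    return forest->predict(to_feature_row(s)) > 0.5f;  // class 1 = "fast matches slow"
}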

6.6 System evaluation

This section provides end-to-end performance results for the alignment pipeline built using the Quilter and Stacker algorithms on three datasets across four computing platforms.

Dataset    FSJ     Platform           Hardware                                     Wall-clock runtime   Throughput
mouse200   FSJ30   AWS Cluster        200 8-core AWS C4 Instances (1600 cores)     5.6 minutes          21.4 TB/hr
mouse200   FSJ30   LLSC Cluster       170 Opteron nodes (5440 cores)               16 minutes           7.5 TB/hr
mouse200   FSJ30   Large Multicore    112-Core Intel Xeon Platinum 8180            80 minutes           1.5 TB/hr
human100   FSJ30   Large Multicore    112-Core Intel Xeon Platinum 8180            26.7 hours           1.4 TB/hr
mouse50    FSJ30   Common Multicore   18-Core AWS C4 Instance                      49 minutes           0.67 TB/hr
mouse50    FSJ20   Common Multicore   18-Core AWS C4 Instance                      40 minutes           0.82 TB/hr

Figure 6-7: End-to-end performance results for whole alignment pipeline executing both Quilter and Stacker. In the FSJ column, FSJ30 and FSJ20 indicate that the fast path employed by FSJ used 30% and 20% resolution images respectively.

2D Alignment Method       Runtime       Throughput
FijiBento                 362 minutes   0.091 TB/hr
Quilter Full Resolution   180 minutes   0.18 TB/hr
Quilter FSJ(20,100)       32 minutes    1.03 TB/hr

Figure 6-8: Performance comparison of FijiBento and Quilter for 2D Alignment on the Common Multicore platform.

Experimental setup

A summary of the software, datasets, and hardware used in our evaluation is as follows.
Software. The alignment pipeline composed of Quilter and Stacker was implemented as a C++ software library parallelized using Cilk Plus [36, 219] and the Tapir [304] branch of the LLVM [206, 207] compiler (version 6)2. The following software libraries were used: OpenCV v3.2.0 [47], OpenJPEG v3.2.0 [91], and Google protocol buffers [136].
Datasets. Three different data sets were employed in our evaluations: Mouse50, Mouse200, and Human100. Mouse50 is a 550GB dataset composed of 50 sections and 65,000 image tiles. Mouse200 is a 2TB dataset composed of 200 sections and 200,000 image tiles. Human100 is a 100-section, 38TB dataset.
Hardware. Our evaluations employ four different computing platforms to evaluate runtime performance: Common Multicore, Large Multicore, LLSC Cluster, and AWS Cluster. Figure 6-7 details the hardware for each platform, and additional details are provided in Section 6.7.

Multicore performance of Quilter and Stacker

Let us first analyze the efficiency of Quilter on the Common Multicore system relative to FijiBento. For this experiment, we used the mouse50 dataset. Both Quilter and FijiBento compute equivalent 2D alignments, but FijiBento has additional overheads due to the serialization of intermediate results to disk. In Figure 6-8, we see that Quilter outperforms FijiBento by a factor of 2x when operating on full-resolution data. The use of frugal snap judgments in Quilter (Quilter FSJ(20,100) in Figure 6-8) provides a further performance boost of 5.6x and causes Quilter to outperform FijiBento by a factor of 11.3x. The performance of Stacker on the Common Multicore platform is similarly efficient. To 3D align the mouse50 dataset, Stacker required only 8 minutes of compute time on Common Multicore. The total runtime to 2D and 3D align mouse50 was 40 minutes, as shown in Figure 6-7.

2Available from http://cilk.mit.edu

Figure 6-9: Vertical scalability. Reports the speedup obtained when executing 4 sections of mouse200 and 4 sections of human100 on the Large Multicore platform. The mouse200 scalability is relative to a 1-core execution and uses the left axis of the plot (speedup per core); the human100 scalability is relative to a 1-socket execution and uses the right axis (speedup per socket). The mapping of cores to sockets is indicated by the N0, N1, N2, N3 labels.

Next, let us analyze the performance of Quilter and Stacker on the Large Multicore platform to evaluate their scalability on a single machine with a larger number of processing cores. Figure 6-9 illustrates the speedup achieved on the Large Multicore platform when aligning 4 sections of the mouse200 and human100 datasets. The left y-axis shows the speedup achieved on the mouse200 dataset relative to a serial execution. On mouse200, approximately 10x speedup is achieved on a 28-core (1 full socket) execution relative to a serial execution. The sublinear speedup is attributable, primarily, to aggressive downclocking of the cores on the socket when using AVX instructions [346]. On 112 cores (4 sockets), the speedup achieved on mouse200 relative to a serial execution is approximately 39x. For the human100 dataset we report the scalability relative to a 1-socket execution using the right y-axis.3 Similar to mouse200, each additional socket provides near-linear improvements in performance when aligning the human100 dataset.

Scaling across multiple machines in a cluster

Now we evaluate the scalability of Quilter and Stacker when executing the pipeline across many multicores in a computing cluster. For these experiments we use the AWS Cluster platform to align the mouse200 and human100 datasets.

3A serial execution of a section of human100 is unduly time-consuming.


Figure 6-10: Strong and weak scalability on the AWS Cluster platform on the mouse200 and human100 datasets. Figure 6-10a shows strong scaling; Figure 6-10b shows weak scaling.

The strong scaling of the pipeline is shown in Figure 6-10a. For this experiment, when running the pipeline on N nodes, we divide the mouse200 stack into 200/N blocks. A block is first processed by a single task that runs Quilter and Stacker back-to-back as a single shared-memory process on the sections within the block. Next, N − 1 tasks are created that run Stacker on pairs of sections that border adjacent blocks. On the mouse200 dataset and the AWS Cluster platform we observe near-linear strong scaling. The weak scaling of the alignment pipeline is shown in Figure 6-10b. For the weak-scaling experiment, we ensured that the work per node was constant by scaling the number of sections being aligned proportionately to the number of machines used to execute the pipeline. We observe no appreciable decrease in efficiency per node when scaling from 1 to 128 machines, where each machine is an 8-core Intel E5 node.
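The decomposition used in this experiment can be sketched as follows; the task structure and names are illustrative assumptions, not the actual evaluation driver.

// A minimal sketch of the block decomposition: each node processes a contiguous
// block of sections with Quilter+Stacker, and N-1 extra tasks stitch the seams
// between adjacent blocks with Stacker.
#include <vector>

struct Task { int first_section, last_section; bool seam_only; };

std::vector<Task> make_tasks(int num_sections, int num_nodes) {
    std::vector<Task> tasks;
    int block = num_sections / num_nodes;  // e.g., 200/N sections per block
    for (int n = 0; n < num_nodes; ++n) {
        int lo = n * block;
        int hi = (n == num_nodes - 1) ? num_sections - 1 : lo + block - 1;
        tasks.push_back({lo, hi, /*seam_only=*/false});          // Quilter + Stacker on the block
        if (n > 0)
            tasks.push_back({lo - 1, lo, /*seam_only=*/true});   // Stacker across the block border
    }
    return tasks;
}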

End-to-end system experiments

Figure 6-7 illustrates the absolute runtime obtained on the mouse200, human100, and mouse50 data sets on the Common Multicore, Large Multicore, LLSC Cluster, and AWS Cluster systems. Due to the difficulty of moving the full human100 dataset to different platforms, we only ran a full end-to-end test for this dataset on the Large Multicore platform. The highest throughput was achieved on the AWS Cluster platform, which was able to align data at a rate of 21.4 terabytes per hour using 1600 cores.

6.7 Computing platforms and datasets

This section provides additional details on the computing platforms and datasets employed to evaluate Quilter and Stacker in Section 6.6.
Common Multicore is an 18-core 2-way hyperthreaded Intel Xeon CPU (E5-2666 v3, 2.9GHz) available as a 4th-generation compute-optimized machine from Amazon Web Services. It has 64GB of memory and runs Ubuntu 14.04 on Linux Kernel 3.13.0-106. Amazon EFS [6] (Elastic File System) is used to store image data files. Amazon EFS provides performance tiered to the size of the mounted volume. Based on these tiers and our mounted volume size of 2.8TiB, our maximum burst I/O throughput was 100 MiB/s during our experiments on this platform.


Figure 6-11: Memory high-water mark of Stacker vs. number of tiles aligned.

Large Multicore is a 112-core 2-way hyperthreaded Intel Xeon Platinum 8180 CPU (2.5GHz) with 1.5TB of memory running CentOS 7 on Linux Kernel 3.10.0-862. This machine was part of the Odyssey Cluster and retrieves stored image data from Lustre-mounted storage connected via a 56 Gb/s FDR InfiniBand network.
LLSC Cluster is the shared Lincoln Laboratory Supercomputing Center (LLSC) [292]; we use the AMD Opteron partition of its TX-Green system, consisting of 274 compute nodes, each containing two 16-core AMD Opteron 6274 processors running at 2.2GHz (32 cores per node) and 128 GB of memory per node. The nodes employ a shared Lustre filesystem and are connected with a 10 GigE Arista switch. A total of 170 Opteron compute nodes and 64 Intel E5 nodes were available for our experiments on this platform. Experiments run on more than 4 nodes were run on LLSC's private cluster by a third party with guidance from the authors.
AWS Cluster is a 200-node cluster where each node is an 8-core version of the Common Multicore platform. AWS ParallelCluster software is used to manage the nodes as a SLURM cluster, and EFS is used as the cluster's shared filesystem.

6.8 Empirical analysis of Stacker’s memory usage

This section provides details on the empirical memory usage of Stacker. We ran an experiment to empirically measure the memory requirements of Stacker, following a methodology similar to that used in Section 6.3 to analyze Quilter's memory requirements. Regions of varied size were extracted from a section of the human100 dataset, and then this section was duplicated to construct a stack of two identical sections. Stacker was then executed to align this dataset. Figure 6-11 shows the results of empirical experiments measuring Stacker's memory requirements.

constant              value             region size      # tiles
pixel resolution      3 × 3nm           1mm row          107
tile height           2724 pixels       1cm row          1,066
tile width            3128 pixels       10cm row         10,656
raw tile              8.5 MB            1mm2 section     13,040
JP2 tile              832 KB            1cm2 section     1,304,000
tile metadata         4 KB              10cm2 section    130,400,000
keypoint size         156 bytes
keypoints per image   8,000–20,000

Figure 6-12: Values for the different constants needed to calculate the memory usage of a 2D alignment algorithm.

Dataset    Z    Size      Description
mouse200   200  2 TB      Subset of 100umSept2017 dataset from 100µ3 volume of mouse brain, stored in J2K compressed format.
mouse50    50   0.55 TB   Subset of 100umIARPASep14 dataset from 100µ3 volume of mouse brain, stored in JPEG compressed format.
human100   100  38 TB     ROI2w05, a subset of a 600 TB dataset obtained from human brain biopsy, stored in J2K compressed format.

Figure 6-13: Dataset descriptions. The Z column provides the number of 2D sections in the dataset. Datasets were obtained from the Lichtman Lab using the Zeiss multiSEM microscope.

Stacker uses 300 KB of data per tile, which is ≈ 20x more than the analytic estimate. There are two reasons for this gap: (a) presently, our implementation of Stacker uses 4-byte floats to represent keypoint descriptors where 1 byte is sufficient (since values are discretized into 255 bins); and (b) aligning a section to its identical copy results in an abnormally large number of inter-section keypoint matches. As such, this experiment illustrates a worst-case scenario for Stacker's memory usage. Regardless, even in this scenario, all datasets but the human brain remain feasible on commodity multicore hardware.

6.9 Conclusion

The alignment algorithm Quilter described in this chapter provides the first fast and accurate multicore alignment pipeline that can align data at a TB/hr pace using commodity multicore hardware without compromising alignment accuracy.

Figure 6-14: Weak scalability experiments on the LLSC cluster; panel (a) uses 32-core Opteron nodes and panel (b) uses 16-core Intel E5 nodes. Illustrates reported runtimes on the LLSC Cluster platform on the mouse200 and human100 datasets when the stack input size scales with the number of nodes used to execute Quilter and Stacker. The left y-axis provides the runtime in seconds for mouse200 and the right y-axis provides the runtime in minutes for human100.

The design of Quilter was, to an extent, multicore-centric. Our development was principally performed using a single 18-core workstation, and our main test dataset was the relatively small 550GB stack from mouse50. Through careful algorithmic design and careful consideration of how Quilter's memory requirements scale with dataset size, these algorithms easily scaled to datasets 100x larger. Furthermore, the standard performance optimizations we employed to improve performance in the shared-memory setting were translatable to the distributed cluster-computing setting through coarse-grained parallelization across sections, facilitated by Stacker's use of "associative" elastic transformations. The performance of our alignment pipeline was achieved, in part, through aggressive downsampling of input images — an "optimization" that is often deemed unsuitable for connectomics due to its demand for highly accurate alignments. As we have illustrated, however, much of the performance benefit of downsampling can be achieved without sacrificing accuracy by using machine learning techniques to distinguish between highly reliable and not-so-reliable results produced by a fast, but sometimes inaccurate, code path.

Chapter 7

Cilkmem: Algorithms for Analyzing the Memory High-Water Mark of Fork-Join Parallel Programs

This chapter describes the Cilkmem algorithms for analyzing the memory high-water mark of fork-join parallel programs. The Cilkmem algorithms, and their implementation in the Cilkmem tool, enable the identification of the worst-case memory requirements of shared-memory multicore codes. This work was conducted in collaboration with William Kuszmaul, Tao B. Schardl, and Daniele Vettorel.

Abstract

Software engineers designing recursive fork-join programs destined to run on massively parallel computing systems must be cognizant of how their program's memory requirements scale in a many-processor execution. Although tools exist for measuring memory usage during one particular execution of a parallel program, such tools cannot bound the worst-case memory usage over all possible parallel executions. This chapter introduces Cilkmem, a tool that analyzes the execution of a deterministic Cilk program to determine its p-processor memory high-water mark (MHWM), which is the worst-case memory usage of the program over all possible p-processor executions. Cilkmem employs two new algorithms for computing the p-processor MHWM. The first algorithm calculates the exact p-processor MHWM in O(T1 · p) time, where T1 is the total work of the program. The second algorithm solves, in O(T1) time, the approximate threshold problem, which asks, for a given memory threshold M, whether the p-processor MHWM exceeds M/2 or whether it is guaranteed to be less than M. Both algorithms are memory efficient, requiring O(p · D) and O(D) space, respectively, where D is the maximum call-stack depth of the program's execution on a single thread. Our empirical studies show that Cilkmem generally exhibits low overheads. Across ten application benchmarks from the Cilkbench suite, the exact algorithm incurs a geometric-mean multiplicative overhead of 1.54 for p = 128, whereas the approximation-threshold algorithm incurs an overhead of 1.36 independent of p. In addition, we use Cilkmem to reveal and diagnose a previously unknown issue in a large image-alignment program contributing to unexpectedly high memory usage under parallel executions.

to unexpectedly high memory usage under parallel executions.

7.1 Introduction

To design a recursive fork-join parallel program¹, such as a Cilk program, to run on massively parallel computing systems, software engineers must assess how their program's memory requirements scale in a many-processor execution. Many tools have been developed to observe a program execution and report its maximum memory consumption (e.g., [173, 307, 285, 276, 265]). But these tools can only ascertain the memory requirements of the one particular execution of the program that they observe. For parallel programs, whose memory requirements can depend on scheduling decisions that vary from run to run, existing tools are unable to provide bounds on the maximum amount of memory that might be used during future program executions². This chapter studies the problem of computing the p-processor memory high-water mark (MHWM) of a parallel program, which measures the worst-case memory consumption of any p-processor execution. We introduce Cilkmem, an efficient dynamic-analysis tool that measures the MHWM of a Cilk program for an arbitrary number of processors p. Computing the MHWM of an arbitrary parallel program is a theoretically difficult problem. In the special case where a program's allocated memory is freed immediately, without any intervening parallel control structure, computing the MHWM corresponds to finding a solution to the poset chain optimization problem [315, 59, 316]. The poset chain optimization problem is well understood theoretically, and the fastest known algorithms run in (substantial) polynomial time using techniques from linear programming [315]. A direct application of these algorithms to compute the MHWM of a parallel program would require computation that is polynomially large in the execution time of the original program. Many dynamic-analysis tools (e.g., [105, 149, 303, 334, 351]) have been developed that exploit structural properties of fork-join programs to analyze a program efficiently. Specifically, these tools often leverage the fact that the execution of a fork-join program can be modeled as a series-parallel computation DAG (directed acyclic graph) [41, 105], where the edges model executed instructions, and the vertices model parallel-control dependencies. But even when restricted to series-parallel DAGs, computing the p-processor MHWM efficiently is far from trivial. Identifying the worst-case memory requirement of a p-processor execution involves solving an optimization problem that sparsely assigns a finite number of processors to edges in the program's computation DAG. Such a computation DAG can be quite large, because of the liberal nature in which fork-join programs expose logically parallel operations. Moreover, whereas the poset chain optimization problem assumes that memory is freed immediately after being allocated, fork-join programs can free memory at any point that serially follows the allocation. Efficient solutions for this optimization problem are not obvious, and seemingly require a global view of the program's entire computation DAG. To obtain such a view, a tool would need to store a complete trace of the computation for offline processing and incur the consequent time and space overheads. This work shows, however, that it is possible not only to compute the p-processor MHWM efficiently for a fork-join program, but also to do so in an online fashion, without

¹When we talk about fork-join parallelism throughout this chapter, we mean recursive fork-join parallelism.
²In this chapter, when we consider executions of a program, we shall assume a fixed input to the program, including fixed seeds to any pseudorandom number generators the program might use.

MemoryExplosion(n)
1  if n > 1
2      cilk spawn MemoryExplosion(n − 1)
3  b ← malloc(1)
4  cilk sync
5  free(b)
6  return

Figure 7-1: Example Cilk program whose heap-memory usage can increase dramatically depending on how the program is scheduled.

needing to store the entire computation DAG. Specifically, we provide an online algorithm for computing the exact p-processor MHWM in O(T1 · p) time, where T1 is the total work of the program. We also examine the approximate threshold problem, which asks, for a given memory threshold M, whether the p-processor MHWM exceeds M/2 or whether it is guaranteed to be less than M. We show how to solve the approximate threshold problem in O(T1) time using an online algorithm. Both of these algorithms are space efficient, requiring O(p · D) and O(D) space, respectively, where D is the maximum call-stack depth of the program's execution on a single thread.

7.1.1 Memory consumption of fork-join programs

Let us review the fork-join parallel programming model and see how scheduling can cause a fork-join program's memory consumption to vary dramatically. Recursive fork-join parallelism, as supported by parallel programming languages including dialects of Cilk [116, 219, 172], Fortress [3], Kokkos [101], Habanero [16], Habanero-Java [61], Hood [42], HotSLAW [258], Java Fork/Join Framework [208], OpenMP [275, 13], Task Parallel Library [218], Threading Building Blocks (TBB) [291], and X10 [68], has emerged as a popular parallel-programming model. In this model, subroutines can be spawned in parallel, generating a series-parallel computation DAG of fine-grained tasks. The synchronization of tasks is managed "under the covers" by the runtime system, which typically implements a randomized work-stealing scheduler [41, 116, 11, 38]. Constructs such as parallel_for can be implemented as syntactic sugar on top of the fork-join model. As long as the parallel program contains no determinacy races [105] (also called general races [266]), the program is deterministic, meaning that every program execution on a given input performs the same set of operations, regardless of scheduling. Even a simple fork-join program can exhibit dramatic and unintuitive changes in memory consumption,³ based on how the program is scheduled on p processors. Consider, for example, the Cilk subroutine MemoryExplosion in Figure 7-1,⁴ which supports parallel execution using the keywords cilk spawn and cilk sync. The cilk spawn keyword on line 2 allows the recursive call to MemoryExplosion(n − 1) to execute in parallel with the call to malloc(1) on line 3, which allocates 1 byte of heap memory. The cilk sync on line 4 waits on the spawned recursive call to MemoryExplosion to return before proceeding; if a thread reaches the cilk sync, and the recursive call to MemoryExplosion

³This work focuses on heap-memory consumption. In contrast, the Cilk runtime system is guaranteed to use stack space efficiently [41].
⁴Similar examples can be devised for other task-parallel programming frameworks.

has not yet completed, then the thread can be rescheduled to make progress elsewhere in the program. Cilk's randomized work-stealing scheduler [41] schedules the parallel execution of MemoryExplosion as follows. When a Cilk worker thread encounters the cilk spawn statement on line 2, it immediately executes the recursive call to MemoryExplosion(n − 1). If another worker thread in the system has no work to do, it becomes a thief and can steal the continuation of this parallel recursive call, on line 3. Because of Cilk's scheduler, the memory consumption of MemoryExplosion can vary dramatically and nondeterministically from run to run, even though MemoryExplosion is deterministic. When run on a single processor, the cilk spawn and cilk sync statements effectively act as no-ops. Therefore, MemoryExplosion uses at most 1 byte of heap memory at any point in time, because each call to malloc is followed by a call to free almost immediately thereafter. When run on 2 processors, however, the memory consumption of MemoryExplosion can increase dramatically, depending on scheduling. While one worker is executing line 2, a thief can steal the execution of line 3 and allocate 1 byte of memory before encountering the cilk sync on line 4. The thief might then return to work stealing, only to find another execution of line 3 to steal, repeating the process. As a result, the heap-memory consumption of MemoryExplosion(n) on two or more Cilk workers can vary from run to run between 1 byte and n bytes, depending on scheduling happenstance.
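For concreteness, the pseudocode of Figure 7-1 corresponds to a Cilk program along the following lines. This is an illustrative sketch, not code from the thesis: it assumes the Cilk Plus/OpenCilk header cilk/cilk.h, and it guards only the spawn with the if, which matches the observation that MemoryExplosion(n) can reach n bytes.

    #include <cilk/cilk.h>
    #include <cstdlib>

    void MemoryExplosion(int n) {
        if (n > 1)
            cilk_spawn MemoryExplosion(n - 1);  // line 2: the child may run in parallel
        char* b = (char*)std::malloc(1);        // line 3: allocate 1 byte of heap memory
        cilk_sync;                              // line 4: wait for the spawned child to return
        std::free(b);                           // line 5: free the byte
    }                                           // line 6: return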

7.1.2 Algorithms for memory high-water mark

This chapter presents algorithms for computing the p-processor MHWM of a program with a series-parallel computation DAG, and in particular, of a deterministic parallel Cilk program P. Let G = (V,E) be the computation DAG for P, and suppose each edge of G is annotated with the allocations and frees within that edge. Section 7.3 presents a simple offline algorithm for computing the exact p-processor MHWM of the parallel program P, given the DAG G. A straightforward analysis of the exact algorithm would suggest that it runs in time O(T1 · p²), where T1 is the 1-processor running time of the program P. By performing an amortized analysis over the parallel strands of the program, we show that a slightly modified version of the algorithm actually achieves a running time of O(T1 · p). Explicitly storing the DAG G can be impractical for large programs P. Section 7.4 presents a combinatorial restructuring of the exact algorithm that computes the MHWM in an online fashion, meaning that the algorithm runs as instrumentation on (a single-threaded execution of) the program P. The online exact algorithm introduces at most O(p) time and memory overheads when compared to a standard single-threaded execution of P. In particular, the algorithm runs in time O(T1 · p) and uses at most O(p · D) memory, where T1 is the 1-processor running time of the program, and D is the maximum call-stack depth of the program's execution on a single thread. The simple amortization argument used for the offline algorithm does not apply to the more subtle structure of the online algorithm. Instead, we employ a more sophisticated amortized analysis, in which subportions of the graph are assigned sets of leader vertices, and the algorithm's work is charged to the leader vertices in such a way that no vertex receives more than O(p) charge. The two exact algorithms for computing the p-processor MHWM have the additional advantage that they actually compute each of the i-processor MHWMs for i = 1, . . . , p. Thus a user can determine the largest i ≤ p for which the i-processor MHWM is below

some threshold M. We also consider the approximate-threshold version of the p-processor MHWM problem. Here, one is given a number of processors p and a memory threshold M, and wishes to determine whether p processors are at risk of coming close to running out of memory while executing on a system with memory M. Formally, an approximate-threshold algorithm returns a value of 1 or 0, where 1 indicates that the p-processor MHWM is at least M/2, and 0 indicates that the p-processor MHWM is bounded above by M. Section 7.5 presents a strictly linear-time online algorithm for the approximate-threshold problem, running in time O(T1). The independence of the running time from p means that the algorithm can be used for an arbitrarily large number of processors p while still having a linear running time. This property can be useful either for understanding the limit properties of a program (i.e., its behavior for very large p) or for predicting the behavior that the program will exhibit on a very large machine. The algorithm is also memory efficient. In particular, the memory usage of the algorithm never exceeds O(D), where D is the maximum call-stack depth of the program's serial execution. A key technical idea in the approximate-threshold algorithm is a lemma that relates the p-processor high-water mark to the infinite-processor MHWM taken over a restricted set of parallel execution states known as "robust antichains". The infinite-processor MHWM over robust antichains can then be computed in strictly linear time via a natural recursion. To obtain an online algorithm, we introduce the notion of a "stripped robust antichain" whose combinatorial properties can be exploited to remove dependencies between non-adjacent subproblems in the recursive algorithm.

7.1.3 The Cilkmem tool

We introduce the Cilkmem dynamic-analysis tool, which implements the online algorithms to measure the p-processor memory high-water mark of a deterministic parallel Cilk program. Both of Cilkmem's algorithms run efficiently in practice. We implemented Cilkmem using the CSI framework for compiler instrumentation [302] embedded in the Tapir/LLVM compiler [304]. In Section 7.6, we measure the efficiency of Cilkmem on a suite of ten Cilk application benchmarks. Cilkmem introduces only a small overhead for most of the benchmarks. For example, the geometric-mean multiplicative overhead across the ten benchmarks is 1.54 to compute the MHWM exactly for p = 128, and 1.36 to run the approximate-threshold algorithm. For certain benchmarks with very fine-grained parallelism, however, the overhead can be substantially larger (although still bounded by the theoretical guarantees of the algorithms). We find that for these applications, the strictly-linear running time of the approximate-threshold algorithm provides substantial performance benefits, allowing computations to use arbitrarily large values of p with only small constant-factor overhead. In addition to measuring Cilkmem's performance overhead, we use Cilkmem to analyze a big-data application, specifically, an image-alignment program [187] used for brain connectomics [247]. Section 7.6 describes how, for this application, Cilkmem reveals a previously unknown issue contributing to unexpectedly high memory usage under parallel executions.

7.1.4 Outline

The remainder of the chapter is structured as follows. Section 7.2 formalizes the problem of computing the p-processor MHWM in terms of antichains in series-parallel DAGs.

Section 7.3 presents the O(T1 · p)-time exact algorithm, and Section 7.4 extends this to an online algorithm. Section 7.5 presents an online linear-time algorithm for the approximate-threshold problem. The design and analysis of the online approximate-threshold algorithm is the most technically sophisticated part of the chapter. Section 7.6 discusses the implementation of Cilkmem and evaluates its performance. Section 7.7 discusses related work, and Section 7.8 concludes with directions for future work.

7.2 Problem formalization

This section formalizes the problem of computing the p-processor memory high-water mark of a parallel program.

The DAG model of multithreading

Cilk programs express logical recursive fork-join parallelism through spawns and syncs. A spawn breaks a single thread into two threads of execution, one of which is logically a new child thread, while the other is logically the continuation of the original thread. A sync by a thread t, meanwhile, joins thread t with the completion of all threads spawned by t, meaning the next continuation of t occurs only after all of its current child threads have completed. An execution of a Cilk program can be modeled as a computation DAG G = (V,E). Each directed edge represents a strand, that is, a sequence of executed instructions with no spawns or syncs. Each vertex represents a spawn or a sync. The DAG G is a series-parallel DAG [105], which means that G has two distinguished vertices, a source vertex from which one can reach every other vertex in G and a sink vertex that is reachable from every other vertex in G, and that G can be constructed by recursively combining pairs of series-parallel DAGs using series and parallel combinations. A series combination combines two DAGs G1 and G2 by identifying the sink vertex of G1 with the source vertex of G2. A parallel combination combines two DAGs G1 and G2 by identifying their source vertices with each other and their sink vertices with each other. We shall refer to any DAG used in a series or parallel combination during the recursive construction of G as a component of G. Although the recursive structure of series-parallel DAGs suggests a natural recursive framework for algorithms analyzing the DAG, Section 7.4 describes how a more complicated framework is needed to analyze series-parallel DAGs in an online fashion. The structure of G = (V,E) induces a poset on the edges E, in which e1 ≤ e2 if there is a directed path from e1 to e2. A collection of edges (e1, e2, . . . , eq) forms an antichain if there is no pair ei, ej such that ei < ej. Note that edges form an antichain if and only if there is an execution of the corresponding parallel program in which those edges at some point run in parallel.
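To make the recursive decomposition concrete, the following C++ sketch shows one possible way to represent a component of a series-parallel decomposition. The type and field names are ours, chosen for illustration; they are not taken from Cilkmem.

    #include <cstdint>
    #include <memory>

    // A component is either a single strand (edge) carrying its annotations
    // m(e) and t(e), or a series/parallel combination of two sub-components.
    struct Component {
        enum class Kind { Edge, Series, Parallel } kind;
        int64_t edgeMax = 0;    // m(e): local memory high-water mark of the strand (Edge only)
        int64_t edgeTotal = 0;  // t(e): allocations minus frees over the strand (Edge only)
        std::unique_ptr<Component> left, right;  // the two combined sub-components (Series/Parallel only)
    };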

The p-processor memory high-water mark

To analyze potential memory usage, we model the computation's memory allocations and frees (deallocations) in the DAG G using two weights, m(e) and t(e), on each edge e. The weight m(e), called the edge maximum, denotes the high-water mark of memory usage at any point during the execution of e when only the allocations and frees within e are considered. The edge maximum m(e) is always non-negative since, at the start of the

execution of an edge e, no allocations or frees have been performed, and thus the (local) memory usage is zero. The weight t(e), called the edge total, denotes the sum of allocations minus frees over the entire execution of the edge. In contrast to m(e), an edge total t(e) can be negative when memory allocated previously in the program is freed within e. The p-processor memory high-water mark is determined by the memory requirements of all antichains of length p or less in the computation DAG G. We define the water mark W(A) of an antichain A = (e1, . . . , eq) to be the maximum amount of memory that could be in use on a q-processor system that is executing the edges e1, . . . , eq concurrently. The p-processor high-water mark Hp(G) is the maximum water mark over antichains of length p or smaller⁵:
$$H_p(G) = \max_{(e_1,\ldots,e_q)\in\mathcal{A},\; q\le p} W(e_1,\ldots,e_q), \tag{7.1}$$
where $\mathcal{A}$ is the set of antichains in G.

Memory water mark of an antichain

The water mark W(A) of an antichain A = (e1, . . . , eq) is the sum of two quantities: W(A) = W1(A) + W2(A).

The quantity W1(A) consists of the contribution to the water mark from the edges e1, . . . , eq and from all the edges e ∈ G satisfying e < ei for some i:

$$W_1(A) = \sum_{e_i \in A} m(e_i) \;+\; \sum_{\substack{e \in E,\; e < e_i \\ \text{for some } e_i \in A}} t(e). \tag{7.2}$$

The quantity W2(A) counts the contribution to the water mark of what we call suspended parallel components. If the series-parallel construction of G combines two subgraphs G1 and G2 in parallel, we call them partnering parallel components of G. Consider two partnering parallel components G1 and G2, and suppose that G2 contains at least one edge from the antichain A, while G1 does not. Then there are two options for a parallel execution in which processors are active in the edges of A: either (1) the parallel component G1 has not been executed at all, or (2) the parallel component G1 has been executed to completion and is suspended until its partner parallel component completes. In the latter case, G1 will contribute $\sum_{e \in G_1} t(e)$ to the water mark of A. If this sum, which is known as G1's edge sum, is positive, then we call G1 a companion component to the antichain A. The quantity W2(A) counts the contribution to the water mark of edges in companion components. That is, if $\mathcal{G}$ is the set of companion components to A, then

$$W_2(A) = \sum_{G' \in \mathcal{G}} \sum_{e \in G'} t(e). \tag{7.3}$$
Note that the companion components of A are guaranteed to be disjoint, meaning that each edge total t(e) in Equation (7.3) is counted at most once.

⁵This definition of water mark makes no assumption about the underlying scheduling algorithm. In particular, when a thread spawns, we make no assumptions as to which subsequent thread the scheduler will execute first.

The downset non-negativity property

Several of our algorithms, specifically for the approximate-threshold problem, take advantage of a natural combinatorial property satisfied by edge totals t(e). Although t(e) can be negative for a particular edge e, the sum $\sum_{e \in E} t(e)$ is presumed to be non-negative, since the parallel program should not, in total, free more memory than it allocates. We can generalize this property to subsets of edges, called downsets, where a subset S ⊆ E is a downset if, for each edge e ∈ S, every edge e′ < e is also in S. The downset-non-negativity property requires that, for every downset S ⊆ E, $\sum_{e \in S} t(e) \ge 0$. This property corresponds to the real-world requirement that at no point during the execution of a parallel program can the total memory allocated be net negative.

7.3 An exact algorithm with O(p) overhead

This section presents ExactOff, an O(|E|·p)-time offline algorithm for exactly computing the high-water marks H1(G), . . . , Hp(G) of a computation DAG G for all numbers of processors 1, . . . , p. We first give a simple dynamic-programming algorithm that runs in time O(|E|·p²). We describe how ExactOff optimizes this simple algorithm. We then perform an amortization argument to prove that ExactOff achieves a running time of O(|E|·p). The algorithm exploits the fact that G can be recursively constructed via series and parallel combinations, as Section 7.2 describes. The algorithm builds on top of this recursive structure. Note that one can construct a recursive decomposition of a series-parallel DAG G in linear time [335]. Section 7.4 discusses how to adapt the algorithm in order to run in an online fashion, executing along with the parallel program being analyzed and introducing only O(p) additional memory overhead. Given a parallel program represented by a series-parallel DAG G, and a number of processors p, we define the (p + 1)-element array $R_G = (R_G[0], R_G[1], \ldots, R_G[p])$ so that, for i > 0, $R_G[i]$ is the memory high-water mark for G over all antichains of size exactly i. We define $R_G[0]$ to be $\max(0, t(G))$, where $t(G) := \sum_{e \in G} t(e)$. For i > 0, if the graph G contains no i-edge antichains, then $R_G[i]$ is defined to take the special value null, treated as −∞. One can compute Hp(G) from the array $R_G$ using the identity $H_p(G) = \max_{i=1}^{p} R_G[i]$. Our goal is therefore to recursively compute $R_G$ for the given DAG G.

An O(|E|·p²)-time algorithm

The algorithm computes $R_G$ using the recursive series-parallel decomposition of G. When G consists of a single edge e, we have $R_G[0] = \max(0, t(e))$, $R_G[1] = m(e)$, and $R_G[2], \ldots, R_G[p] = \text{null}$. Suppose that G is the parallel combination of two graphs G1 and G2. Then,
$$R_G[i] = \begin{cases} \max(0, t(G)) & \text{if } i = 0, \\ \max_{j=0}^{i} \bigl( R_{G_1}[j] + R_{G_2}[i-j] \bigr) & \text{otherwise.} \end{cases}$$

In the second case, if either of $R_{G_1}[j]$ or $R_{G_2}[i-j]$ is null, then their sum is also defined to be null. Moreover, note that the definitions of $R_{G_1}[0]$ and $R_{G_2}[0]$ ensure that suspended components are treated correctly in the recursion.

156 Suppose, on the other hand, that G is the series combination of two graphs G1 and G2.

Then $R_G$ can be expressed in terms of $R_{G_1}$, $R_{G_2}$, $t(G_1)$, and $t(G)$ using the equation
$$R_G[i] = \begin{cases} \max(0, t(G)) & \text{if } i = 0, \\ \max\bigl( R_{G_1}[i],\; t(G_1) + R_{G_2}[i] \bigr) & \text{otherwise.} \end{cases}$$

Combining the above cases yields an O(|E|·p²)-time algorithm for computing $R_G$.
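To make the recursion concrete, the following C++ sketch implements the two combination rules on the (p + 1)-element arrays, with null modeled by std::optional. This is an illustration under the definitions above, not Cilkmem's actual code; the function and type names are ours.

    #include <algorithm>
    #include <cstdint>
    #include <optional>
    #include <vector>

    using Entry  = std::optional<int64_t>;  // empty == null (no antichain of this size)
    using RArray = std::vector<Entry>;      // indices 0..p

    // Parallel combination: R_G[i] = max_j ( R_{G1}[j] + R_{G2}[i-j] ).
    RArray parallelCombine(const RArray& r1, const RArray& r2, int64_t tG) {
        const std::size_t p = r1.size() - 1;
        RArray r(p + 1);
        r[0] = std::max<int64_t>(0, tG);
        for (std::size_t i = 1; i <= p; ++i)
            for (std::size_t j = 0; j <= i; ++j)
                if (r1[j] && r2[i - j]) {
                    int64_t cand = *r1[j] + *r2[i - j];
                    r[i] = r[i] ? std::max(*r[i], cand) : cand;
                }
        return r;
    }

    // Series combination: R_G[i] = max( R_{G1}[i], t(G1) + R_{G2}[i] ).
    RArray seriesCombine(const RArray& r1, const RArray& r2, int64_t tG1, int64_t tG) {
        const std::size_t p = r1.size() - 1;
        RArray r(p + 1);
        r[0] = std::max<int64_t>(0, tG);
        for (std::size_t i = 1; i <= p; ++i) {
            Entry a = r1[i];
            Entry b;
            if (r2[i]) b = tG1 + *r2[i];
            if (a && b)      r[i] = std::max(*a, *b);
            else if (a)      r[i] = a;
            else if (b)      r[i] = b;
        }
        return r;
    }

The optimization described next amounts to restricting the inner loop of parallelCombine to j ≤ s(G1) and i − j ≤ s(G2).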

Achieving a running time of O(|E|·p)

To optimize the simple algorithm, we define, for a DAG G, the value s(G) to be the size of the largest antichain of edges in G, or p if G contains an antichain of size p or larger. The value s(G) is easy to compute recursively using the recursion s(G) = min(s(G1) + s(G2), p) when G is the parallel combination of components G1 and G2, and s(G) = max(s(G1), s(G2)) when G is the series combination of G1 and G2. ExactOff optimizes the simple dynamic program as follows. Suppose that G is a parallel combination of components G1 and G2. Notice that $R_{G_1}[i] = \text{null}$ whenever $i > s(G_1)$ and $R_{G_2}[i] = \text{null}$ whenever $i > s(G_2)$. It follows that
$$R_G[i] = \begin{cases} \max(0, t(G)) & \text{if } i = 0, \\ \displaystyle\max_{\substack{0 \le j \le i,\; j \le s(G_1),\\ (i-j) \le s(G_2)}} \bigl( R_{G_1}[j] + R_{G_2}[i-j] \bigr) & \text{otherwise,} \end{cases} \tag{7.4}$$
where the max for the second case is defined to evaluate to null if it has zero terms.

Theorem 31 For a series-parallel DAG G = (V,E), ExactOff recursively computes $R_G$ in time O(|E|·p).

To prove Theorem 31, let us consider the time needed to compute $R_G$ when G is obtained by combining two subgraphs G1 and G2 in parallel. For each value of j ≤ s(G1) and of i − j ≤ s(G2), the term $R_{G_1}[j] + R_{G_2}[i-j]$ will appear in Equation (7.4) for exactly one index i. It follows that the total time to compute $R_G$ from $R_{G_1}$ and $R_{G_2}$ is at most O(p + s(G1) · s(G2)). Since parallel combinations cost O(p + s(G1) · s(G2)) and series combinations cost O(p), Theorem 31 reduces to
$$\sum_{(G_1,G_2) \in \mathcal{C}} s(G_1) \cdot s(G_2) \le O(|E| \cdot p),$$
where the set $\mathcal{C}$ consists of all parallel combinations in the recursive construction of G.

Lemma 32 Let G = (V,E) be the series-parallel DAG modeling some parallel program's execution. Then,
$$\sum_{(G_1,G_2) \in \mathcal{C}} s(G_1) \cdot s(G_2) \le O(|E| \cdot p),$$
where the set $\mathcal{C}$ consists of all parallel combinations in the recursive construction of G.

Proof. Call a parallel combination between two components G1 and G2 fully-formed if s(G1) = s(G2) = p. We claim that there are at most O(|E|/p) fully-formed parallel

combinations in the recursive construction of G. Consider the full recursive construction of G from individual edges using series and parallel combinations. Then each fully-formed parallel combination reduces the total number of components H satisfying s(H) = p by one. On the other hand, the number of components satisfying s(H) = p can only be increased when two components H1, H2 satisfying s(H1), s(H2) < p are combined to form a new component H satisfying s(H) = p. The total number of such combinations is at most |E|/p, since each such H absorbs at least p edges. Since the number of components satisfying s(H) = p is incremented at most |E|/p times, it can also be decremented at most |E|/p times, which limits the number of fully-formed parallel combinations to |E|/p. Using the bound on the number of fully-formed parallel combinations, we have that

$$\sum_{(G_1,G_2) \in \mathcal{F}} s(G_1) \cdot s(G_2) \le O\!\left(\frac{|E|}{p} \cdot p^2\right) \le O(|E| \cdot p),$$
where $\mathcal{F}$ is the set of fully-formed parallel combinations. To complete the proof of the lemma, it suffices to show
$$\sum_{(G_1,G_2) \in \mathcal{F}'} s(G_1) \cdot s(G_2) \le O(|E| \cdot p), \tag{7.5}$$
where $\mathcal{F}'$ is the set of non-fully-formed parallel combinations. We prove Equation (7.5) with an amortization argument. Consider the recursive construction of G from edges via series and parallel combinations. Before beginning the combinations, we assign 2p − 1 credits to each edge e ∈ E. Every time two components G1 and G2 are combined in parallel and G1 satisfies s(G1) < p, we deduct s(G2) credits from each edge in G1. Note that if both s(G1) < p and s(G2) < p, then we deduct credits from the edges in both components. The number of credits charged for each non-fully-formed parallel combination is at least s(G1) · s(G2). Thus the total number of credits deducted from all edges over the course of the construction of G is at least the left side of Equation (7.5). In order to prove Equation (7.5), it suffices to show that every edge still has a non-negative number of credits after the construction of G.

Consider an edge e ∈ E as G is recursively constructed. Define $H_t$ to be the component containing e after t steps in the construction, $c_t$ to be the total amount of credit deducted from e in the first t steps, and $a_t$ to be the size of the largest antichain in $H_t$. We claim as an invariant that $c_t \le a_t$. Indeed, whenever $r = c_t - c_{t-1}$ credits are deducted from e during some step t, the parallel combination during that step also increases $a_t$ to be at least r larger than $a_{t-1}$.

Since $s(H_t) = \min(a_t, p)$, the invariant tells us that whenever e is in a component $H_t$ with $s(H_t) < p$, the total amount $c_t$ deducted from e so far must satisfy $c_t < p$. Prior to the step t in which $s(H_t)$ finally becomes p, the total amount deducted from e is at most p − 1. During the step t when $s(H_t)$ becomes p, at most p credits can be deducted from e. And after the step t when $s(H_t)$ becomes p, no more credits will ever be deducted from e. Thus the total deductions from e sum to at most 2p − 1, as desired.


Figure 7-2: A multi-spawn combination. The components a0, . . . , ak and b1, . . . , bk are combined into a single component. If executed on a single processor in Cilk, the order of execution would be a0, b1, a1, b2, a2, . . . , bk, ak.

7.4 An online (memory-efficient) algorithm

The ExactOff algorithm in Section 7.3 computes Hp(G) by considering the construction of a computation DAG G using only series and parallel combinations. Although in principle any series-parallel DAG can be constructed using only these combinations, doing so in an online fashion (as the parallel program executes) can require substantial memory overhead. In particular, parallel programs implemented in Cilk implicitly contain a third primitive way of combining components: multi-spawn combinations (see Figure 7-2). A multi-spawn combination corresponds with all of the child spawns (i.e., cilk spawn statements) of a thread that rejoin at a single synchronization point (i.e., at a cilk sync). When a multi-spawn combination is executed on a single processor, the execution traverses the components in the order a0, b1, a1, b2, . . . , bk, ak. If one wishes to use the recursions from Section 7.3 in order to compute H∞(G), then one must store the recursively computed values for each of a0, b1, a1, . . . , ak before any series or parallel combinations can be performed. After computing the values, one can then combine ak and bk in parallel, combine this with ak−1 in series, combine this with bk−1 in parallel, and so on. When k is large, storing Θ(k) recursive values at a time can be impractical (though in total the memory overhead of ExactOff will still be bounded by the span of the parallel program). If one could instead design a recursion in which each multi-spawn combination could be performed using O(p) space, then the recursive algorithm would be guaranteed to use no more than O(p · D) space, where D is the maximum stack depth of the parallel program in Cilk. Appendix Section 7.9 presents the ExactOn algorithm, which implements this alternative recursion. The amortized analysis in Section 7.3 fails to carry over to ExactOn, because the work in the new algorithm can no longer be directly charged to the growth of components. Instead, we employ a more sophisticated amortized analysis in which components of the graph are assigned sets of leader vertices, and the work by the algorithm is charged to the leader vertices in such a way that no vertex receives a charge of more than O(p).

7.5 Online approximation in linear time

This section considers the approximate threshold version of the p-processor memory high-water mark problem. In particular, we construct a fully linear-time algorithm that processes a computation DAG G = (V,E) and returns a boolean with the following guarantee: a return value of 0 guarantees that Hp(G) ≤ M, while a return value of 1 guarantees that

Hp(G) > M/2. The algorithm will compute the high-water mark over a special class of antichains that satisfy a certain property that we call stripped robustness. Intuitively, the stripped robustness property requires that every edge e in the antichain contributes a substantial amount (at least M/(2p)) to the antichain's water mark. The algorithm solves the approximate threshold problem by computing the infinite-processor water mark over all stripped robust antichains, and then inferring information about Hp(G) from this quantity. Section 7.5.1 defines what it means for an antichain to be stripped robust and proves the correctness of the algorithm. Section 7.5.2 describes an online recursive algorithm for computing the quantity $H^\bullet_\infty(G)$ needed by the algorithm in linear time O(|E|). The algorithm uses space at most O(D), where D is the maximum stack depth during an execution of the parallel program being analyzed. A simpler offline algorithm is given in Section 7.10.

7.5.1 Stripped robust antichains

This section defines a special class of antichains that we call stripped robust. We prove that, by analyzing stripped robust antichains with arbitrarily many processors, we can deduce information about Hp(G). An antichain A is stripped robust if it satisfies two requirements:

• Large Local Contributions of Edges: We define the local contribution $L^\bullet_A(x_i)$ of each edge $x_i \in A$ to the water mark W(A) to be the value $W(A) - W(A \setminus \{x_i\})$. In order for A to be a stripped robust antichain, each $x_i$ must satisfy $L^\bullet_A(x_i) > M/(2p)$.

• Large Edge Contributions of Non-Critical Components: For each multi-spawn combination a0, b1, a1, . . . , ak in G, if the component bi contains at least one $x_i$, and if $a_i \cup b_{i+1} \cup \cdots \cup a_k$ contains at least one other $x_j$, then we call bi a non-critical component. Define the local contribution of bi to be $L^\bullet_A(b_i) = W(A) - W(A \setminus b_i)$, the reduction in water mark obtained by removing from A the edges also contained in bi. In order for A to be a stripped robust antichain, each non-critical component bi must satisfy $L^\bullet_A(b_i) > M/(2p)$.

The p-processor robust memory high-water mark $H^\bullet_p(G)$ is defined to be

$$H^\bullet_p(G) = \max_{A=(x_1,\ldots,x_q)\in S,\; |A|\le p} W(x_1, \ldots, x_q),$$
where S is the set of stripped robust antichains in E. The first step in our approximate-threshold algorithm will be to compute the infinite-processor robust memory high-water mark $H^\bullet_\infty(G)$. Then, if $H^\bullet_\infty(G) \le M/2$, our algorithm returns 0, and if $H^\bullet_\infty(G) > M/2$, our algorithm returns 1. Computing $H^\bullet_\infty(G)$ can be done online with constant overhead using a recursive algorithm described in Section 7.5.2. The computation is made significantly easier, in particular, by the fact that it is permitted to consider the infinite-processor case rather than restricting to p processors or fewer. On the other hand, the fact that $H^\bullet_\infty(G)$ should tell us anything useful about Hp(G) is non-obvious. In the rest of this section, we will prove the following theorem, which implies the correctness of the algorithm:

Theorem 33 If $H^\bullet_\infty(G) \le M/2$, then $H_p(G) \le M$, and if $H^\bullet_\infty(G) > M/2$, then $H_p(G) > M/2$.

It turns out that Theorem 33 remains true even if the second requirement for stripped robust antichains is removed (i.e., that non-critical components make large contributions). In fact, removing the second requirement (essentially) gives the notion of a robust antichain used in Section 7.10 in designing an offline algorithm for the same problem. As we shall see in Section 7.5.2, the second requirement results in several important structural properties of stripped robust antichains, making an online algorithm possible. The structural properties enable a recursive computation of $H^\bullet_\infty$ to handle multi-spawn combinations in a memory-efficient fashion. Our analysis begins by comparing $H^\bullet_p(G)$ to $H_p(G)$:

Lemma 34 $H^\bullet_p(G) \ge H_p(G) - \frac{M}{2}$.

Proof. Consider an antichain A1 = (x1, . . . , xq), with q ≤ p, that is not stripped robust. We wish to construct a stripped robust antichain B satisfying W(B) ≥ W(A1) − M/2. Since A1 is not stripped robust, there must be either an edge $x_i \in A_1$ satisfying $L^\bullet_{A_1}(x_i) \le M/(2p)$ or a non-critical component bi satisfying $L^\bullet_{A_1}(b_i) \le M/(2p)$. Define an antichain A2 obtained by removing either the single edge xi from A1 (in the case where such an xi exists) or all of the edges in $A_1 \cap b_i$ from A1 (in the case where such a bi exists). The antichain A2 contains at least one fewer edge than does A1, and satisfies W(A2) ≥ W(A1) − M/(2p). If A2 is still not stripped robust, then we repeat the process to obtain an antichain A3, and so on, until we obtain a stripped robust antichain Ak. Because the empty antichain is stripped robust, this process must succeed. Since each antichain Ai in the sequence is smaller than the antichain Ai−1, the total number k of antichains in the sequence can be at most q + 1, where q is the size of the antichain A1 = (x1, . . . , xq). On the other hand, since W(Ai) ≥ W(Ai−1) − M/(2p) for each i ≥ 2, we also have that
$$W(A_k) \;\ge\; W(A_1) - (k-1)\cdot\frac{M}{2p} \;\ge\; W(A_1) - q\cdot\frac{M}{2p} \;\ge\; W(A_1) - \frac{M}{2},$$
as desired.
Corollary 35 proves the first part of Theorem 33:

Corollary 35 If $H^\bullet_\infty(G) \le M/2$, then $H_p(G) \le M$.

The second half of Theorem 33 is given by Lemma 36:

Lemma 36 If $H^\bullet_\infty(G) > M/2$, then $H_p(G) > M/2$.

Proof. Since $H^\bullet_\infty(G) > M/2$, there are two cases:
Case 1: There is a stripped robust antichain A = (x1, . . . , xq) with q ≤ p such that W(A) > M/2. In this case, we trivially get that Hp(G) > M/2.
Case 2: There is a stripped robust antichain A = (x1, . . . , xq) with q > p such that W(A) > M/2. This case is somewhat more subtle, since the large number of edges in the antichain A could cause W(A) to be much larger than Hp(G). We will use the stripped robustness of A in order to prove that the potentially much smaller antichain B = (x1, . . . , xp) still has a large water mark W(B) > M/2.
Note that we cannot simply argue that $L^\bullet_B(x_i) \ge L^\bullet_A(x_i)$ for each i ∈ {1, . . . , p}. In particular, the removal of edges from A may significantly change the local contributions of

the remaining edges. Nonetheless, by exploiting the downset-non-negativity property we will still prove that W(B) > M/2. For a given edge $x_i \in A$, define $\mathcal{T}_i$ to be the set of companion components T to the antichain A such that T is not a companion component to $A \setminus \{x_i\}$. Define $P_i$ to be the set of edges e such that $e < x_i$ but $e \not< x_j$ for any other $x_j \in A$. Then the local contribution $L^\bullet_A(x_i)$ of $x_i$ to A satisfies

$$L^\bullet_A(x_i) \le m(x_i) + \sum_{T \in \mathcal{T}_i} \sum_{e \in T} t(e) + \sum_{e \in P_i} t(e). \tag{7.6}$$

(This would be an exact equality if not for the fact that removing $x_i$ from A can also introduce new companion components, which in turn reduces $L^\bullet_A(x_i)$.) Let $S_i$ denote the quantity on the right side of Equation (7.6). Since A is a stripped robust antichain, $S_i \ge L^\bullet_A(x_i) > M/(2p)$ for each i. Now let us consider the water mark W(B). For each i ∈ {1, . . . , p}, each component $T \in \mathcal{T}_i$ is a companion component to B, just as it was to A. Moreover, each edge $e \in P_i$ continues to contribute t(e) to the water mark of B. Define $\mathcal{T}$ to be the set of companion components to B that are not in any of $\mathcal{T}_1, \ldots, \mathcal{T}_p$, and P to be the set of edges e satisfying $e < x_i$ for some i ∈ {1, . . . , p} but $e \notin P_1 \cup \cdots \cup P_p$. Then the water mark W(B) can be written as

$$W(B) = \sum_{i=1}^{p} S_i + \sum_{T \in \mathcal{T}} \sum_{e \in T} t(e) + \sum_{e \in P} t(e) \tag{7.7}$$
$$\phantom{W(B)} > M/2 + \sum_{T \in \mathcal{T}} \sum_{e \in T} t(e) + \sum_{e \in P} t(e). \tag{7.8}$$
Since each $T \in \mathcal{T}$ satisfies $\sum_{e \in T} t(e) > 0$ (or else T would not be a companion component to B),
$$W(B) > M/2 + \sum_{e \in P} t(e).$$
In order to complete the proof that W(B) > M/2, it suffices by the downset-non-negativity property to show that P is a downset. Notice that P can be rewritten as

$$P = \{e \in E \mid e < x_i \text{ for some } i = 1, \ldots, p\} \;\cap\; \{e \in E \mid e < x_i \text{ and } e < x_j \text{ for some } x_i \ne x_j \in A\}$$
$$\phantom{P} = \left( \bigcup_{i=1}^{p} \{e < x_i\} \right) \cap \left( \bigcup_{x_i \ne x_j \in A} \{e < x_i\} \cap \{e < x_j\} \right).$$

Since the downset property is closed under unions and intersections, it follows that P is a downset.

7.5.2 Recursively computing $H^\bullet_\infty(G)$

This section discusses a recursive algorithm for computing $H^\bullet_\infty(G)$ in an online fashion. The algorithm treats G as being recursively constructed via series and multi-spawn combinations. For each multi-spawn combination, we assume we are recursively given the computed values

for a0, b1, a1, . . . , ak, one after the other. Because k may be large, the recursion is not permitted to store these values. Instead it stores a constant amount of metadata that is updated over the course of the multi-spawn combination. Finding a water-mark-maximizing stripped robust antichain A in a multi-spawn combination C = (a0, b1, . . . , ak) is complicated by the following subtlety: if we choose to include an edge in one of the bj's, then this may reduce the local contribution of any edges included in later aj's and bj's, resulting in those edges being unable to be included in the antichain. Therefore, greedily adding edges to the antichain A as we recursively execute a0, b1, . . . , ak may not result in an optimal stripped robust antichain. The second requirement for stripped robust antichains (that non-critical components must make large edge contributions) is carefully designed to eliminate this problem. It allows us to prove the following lemma, which characterizes how non-critical components behave in water-mark-maximizing antichains A that contain multiple edges.⁶

Lemma 37 Consider a multi-spawn combination C with components a0, b1, a1, . . . , ak. Consider i ∈ {1, . . . , k}, and suppose A is a stripped robust antichain in C that (1) contains multiple edges; (2) contains at least one edge in ai, bi+1, . . . , ak; and (3) achieves the maximum water mark over all stripped robust antichains in C that contain multiple edges. Let $t(b_i) = \sum_{e \in b_i} t(e)$, and let $m(b_i)$ denote the water mark of the best stripped robust antichain in bi. (Note that m(bi) considers only the subgraph bi.)

• If $t(b_i) > 0$ and $m(b_i) \le t(b_i) + M/(2p)$, then bi is a companion component of A.

• If $t(b_i) \le 0$ and $m(b_i) \le M/(2p)$, then bi is not a companion component of A and does not contribute any edges to A.

• If $m(b_i) > \max(0, t(b_i)) + M/(2p)$, then A restricted to bi is a stripped robust antichain with water mark m(bi).

Proof. Any edges that bi contributes to A must form a stripped robust antichain in bi. The water mark s of that antichain within bi can be at most m(bi). It follows that

$$L^\bullet_A(b_i) \le m(b_i) - \max(0, t(b_i)),$$
since the removal of the edges in bi from A will have the effect of (a) reducing the water mark by s and (b) introducing bi as a companion component to A if t(bi) > 0. Since $m(b_i) - \max(0, t(b_i)) \le M/(2p)$ in the first two cases of the lemma, bi cannot contribute any edges to A in these cases. This ensures that in the first case bi will be a companion component of A, and in the second case bi will neither be a companion component nor contribute any edges. The third case of the lemma is somewhat more subtle. Suppose that $m(b_i) > \max(0, t(b_i)) + M/(2p)$. We wish to show that A restricted to bi is a stripped robust antichain with water mark m(bi). If A contains at least one edge in bi, then since A has maximum water mark over multi-edge stripped robust antichains in C, it must be that A restricted to bi is stripped robust and has water mark m(bi), as desired.

⁶In fact, the same lemma would be true if we removed the restriction that A contain multiple edges. The restriction is necessary for our applications of the lemma, however.

Suppose, on the other hand, that A contains no edges in bi. We will show that A does not achieve the maximum water mark over all stripped robust antichains in C that contain multiple edges. Define A′ to be A with the addition of edges in bi so that A′ restricted to bi is stripped robust and has water mark m(bi). Since $m(b_i) \ge \max(0, t(b_i)) + M/(2p)$, the water mark of A′ must be more than M/(2p) greater than that of A. Since A has maximum water mark, and A′ has a larger water mark, A′ must no longer be stripped robust. Notice, however, that the local contribution of bi in A′ is greater than M/(2p), and the local contributions of the other non-critical components of C in A′ are the same as in A. Thus the only way that A′ can fail to be stripped robust is if there is a single edge $x_j \in A'$ with local contribution at most M/(2p). Moreover, $x_j$ must be the only edge from A′ that is contained in any of the components ai, bi+1, . . . , ak. Define A″ to be A′ with the edge $x_j$ removed. Note that A″ has at least as many edges as did A initially, and is thus still a multi-edge antichain. Since the local contribution $L^\bullet_{A'}(x_j)$ of $x_j$ to A′ was at most M/(2p), the water mark of A″ still exceeds that of A. We claim, however, that A″ is a stripped robust antichain, contradicting the fact that A has maximum water mark out of all multi-edge stripped robust antichains. If bi contains multiple edges in A″, then the fact that those edges form a stripped robust antichain when restricted to bi, and that the other noncritical components and edges in A″ have the same local contributions to A″ as they did to A, ensure that A″ is a stripped robust antichain. If, on the other hand, bi contains a single edge $x_i$ in A″, then the removal of that edge would reduce W(A″) by at least as much as would have the removal of the edge $x_j$ from the original antichain A (since both removals result in the same antichain). Thus $L^\bullet_{A''}(x_i) \ge L^\bullet_A(x_j) > M/(2p)$, ensuring that A″ is a stripped robust antichain. Since A″ has a larger water mark than A, we have reached a contradiction, completing the proof of the third case of the lemma.

Our algorithm for computing $H^\bullet_\infty(G)$ recursively computes three quantities for each component C of the graph.

• The total allocation and freeing work done in C,
$$\text{MemTotal} = \sum_{e \in C} t(e).$$

• The memory high-water mark with one processor,

MaxSingle = H1(C).

• The infinite-processor high-water mark restricted to stripped robust antichains containing more than one edge:

$$\text{MultiRobust} = \max_{A \in S,\; |A| > 1} W(A),$$

where S is the set of stripped robust antichains in C. If C contains no such multi-edge antichains, then MultiRobust = null.

The special handling in the recursion of antichains with only one edge (i.e., by MaxSingle) is necessary because the local contribution of that edge x is not yet completely determined until at least one other edge is added to the antichain. On the other hand, once a stripped

robust antichain contains multiple edges, the local contribution of each edge is now fixed, even if we combine this antichain with other antichains as we recursively construct the graph. This allows for all multi-edge robust antichains to be grouped together in the variable MultiRobust. As a base case, for a component C consisting of a single edge e, we initialize the variables as follows: MemTotal = t(e), MaxSingle = m(e), and MultiRobust = null. When we combine two components C1 and C2 in series to build a new component C, we have:
C.MemTotal = C1.MemTotal + C2.MemTotal,

C.MaxSingle = max(C1.MaxSingle, C1.MemTotal + C2.MaxSingle),

C.MultiRobust = max(C1.MultiRobust, C1.MemTotal + C2.MultiRobust).

In the computations of C.MaxSingle and C.MultiRobust we implicitly use the fact that every antichain in C must lie either entirely in C1 or entirely in C2. Moreover, the antichains in C2 have a water mark that is C1.MemTotal greater in C than they did in C2. Note that the computation of C.MultiRobust would not be correct if MultiRobust were also considering single-edge antichains. In particular, the local contribution of an edge in a single-edge antichain A in C2 will differ from the local contribution of the same edge in the same antichain in C1 ∪ C2, allowing it to possibly form a stripped robust antichain in one but not the other. Because MultiRobust considers only multi-edge antichains, however, this is not a problem. The recursion for combining components in a multi-spawn combination is more sophisticated. Consider a multi-spawn combination C as in Figure 7-2 with components C1 = a0, C2 = b1, C3 = a1, . . . , C2k+1 = ak. As our algorithm receives information on each of C1, C2, C3, . . . , it must gradually construct the best multi-edge stripped robust antichain in the multi-spawn component. Lemma 37 ensures that this is possible, because the role that each bi plays in such an antichain depends only on whether any additional edges will be included from later components ai, bi+1, . . . , and not on the specific properties of the components. Nonetheless, the bookkeeping for the recursion is made subtle by the handling of suspended components and other casework. We defer the full recursion to Section 7.11.
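As a concrete illustration of the series-combination bookkeeping above, the following C++ sketch maintains the three per-component quantities and applies the series rule. The names and the representation of null (an empty std::optional) are ours; Cilkmem's actual implementation differs.

    #include <algorithm>
    #include <cstdint>
    #include <optional>

    struct ComponentState {
        int64_t memTotal  = 0;               // MemTotal: sum of t(e) over the component
        int64_t maxSingle = 0;               // MaxSingle: H_1 of the component
        std::optional<int64_t> multiRobust;  // MultiRobust: empty plays the role of null
    };

    // max over possibly-null water marks: a null operand is ignored.
    static std::optional<int64_t> maxOrNull(std::optional<int64_t> a, std::optional<int64_t> b) {
        if (!a) return b;
        if (!b) return a;
        return std::max(*a, *b);
    }

    // Shift a possibly-null water mark by a memory offset; null stays null.
    static std::optional<int64_t> shift(std::optional<int64_t> x, int64_t offset) {
        if (!x) return x;
        return *x + offset;
    }

    // Series combination C = C1 followed by C2, per the three rules above.
    // Base case for a single edge e: { t(e), m(e), null }.
    ComponentState combineSeriesState(const ComponentState& c1, const ComponentState& c2) {
        ComponentState c;
        c.memTotal    = c1.memTotal + c2.memTotal;
        c.maxSingle   = std::max(c1.maxSingle, c1.memTotal + c2.maxSingle);
        c.multiRobust = maxOrNull(c1.multiRobust, shift(c2.multiRobust, c1.memTotal));
        return c;
    }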

7.6 Empirical evaluation

This section discusses the implementation and evaluation of Cilkmem on a suite of benchmark programs as well as on a large image processing pipeline performing image alignment.

7.6.1 Implementation

We implemented Cilkmem as a CSI tool [302] written in C++ for the Tapir compiler [304]. The following discussion describes how these facilities are used to implement Cilkmem's algorithms for MHWM analysis.

                                       ------------------------------ Texact/T1 ------------------------------
Benchmark   Input size                 p=32    p=64    p=128   p=256   p=512   p=1024   p=2048   p=4096    Tapprox/T1
strassen    4096 x 4096 matrix         1.00    1.00    1.00    0.98    1.02    0.98     0.98     1.00      0.99
nBody       1,000,000 points           1.00    1.00    1.00    0.99    0.99    0.99     1.00     1.00      1.00
lu          4096 x 4096 matrix         1.01    1.02    1.03    1.02    1.06    1.61     3.02     5.77      1.01
remDups     100,000,000 integers       1.03    1.03    1.03    1.01    1.02    1.04     1.58     3.05      1.02
dict        100,000,000 integers       1.11    1.10    1.12    1.11    1.10    1.12     1.66     3.17      1.10
ray         small                      1.11    1.11    1.11    1.09    1.11    1.21     1.38     2.74      1.10
delaunay    5,000,000 points           1.16    1.16    1.17    1.13    1.13    1.15     1.16     1.19      1.16
nqueens     13 x 13 board              1.44    1.46    1.54    1.66    4.67    7.80     14.82    28.53     1.23
qsort       50,000,000 elements        3.63    3.73    4.07    5.18    12.75   28.60    53.89    103.63    2.61
cholesky    2000 x 2000 matrix         6.62    6.86    7.51    10.08   34.68   59.23    113.79   219.18    4.68

Table 7.1: Application benchmarks from the Cilkbench suite showing the overhead of Cilkmem over a single-threaded execution. The overhead is computed as the geometric-mean ratio of at least 5 runs with Cilkmem enabled and at least 5 runs of the uninstrumented program for each benchmark in the table.

The Cilkmem CSI tool tracks the evolution of a program's series-parallel DAG by inserting shadow computation before and after the instructions used by the Tapir compiler to represent fork-join parallelism. The language constructs used by the program-under-test to represent fork-join parallelism (e.g., Cilk's cilk_spawn and cilk_sync keywords) are lowered to Tapir's detach and sync instructions during compilation. The CSI framework provides instrumentation hooks that enable a tool to insert shadow computation before and after instructions in the compiler's intermediate representation of the program. Memory allocations and frees are tracked by Cilkmem using process-wide hooks that intercept calls to the major allocation facilities provided by glibc via library interpositioning [54, Ch. 7.13]. These allocation facilities include malloc, aligned_alloc, realloc, and free. While Cilkmem could use CSI's instrumentation hooks to track memory allocations, the use of interpositioning allows Cilkmem to capture calls to allocation functions performed in shared dynamic libraries that may not have been compiled with instrumentation enabled. Furthermore, interpositioning makes it possible for Cilkmem to track the requested sizes of allocations without maintaining an additional auxiliary data structure, by prepending to each allocation a small payload containing the size of the allocated block of memory. This payload is retrieved at deallocation time to determine how much memory has been freed. As an alternative to the payload-based technique, Cilkmem can also be instructed to retrieve the size of allocations using the Linux-specific function malloc_usable_size. The difference between the two methods comes down to whether the allocation size seen by Cilkmem is the requested size or the usable size determined by the memory allocator, which is allowed to reserve more memory than requested by the user. Special care is taken to ensure that the memory activity of Cilkmem's instrumentation is properly distinguished from activity originating from the program-under-test. Cilkmem separates its instrumentation and analysis logic into two separate threads that communicate in a producer–consumer pattern. The producer thread executes the program-under-test and generates records that keep track of the allocation or deallocation of memory as well as the evolution of the series-parallel structure of the program execution. These records are sent to the consumer thread, which executes either the exact or approximate MHWM algorithm in a manner faithful to the descriptions of the online MHWM

Memory high-water mark for p = 1 : 894727307 bytes
Source map for p = 1:
  [strassen.c:517]: 492013632 bytes
  [strassen.c:776]: 402653184 bytes
  [strassen.c:459]: 60491 bytes
Memory high-water mark for p = 2 : 1017641835 bytes
Source map for p = 2:
  [strassen.c:517]: 614928160 bytes
  [strassen.c:776]: 402653184 bytes
  [strassen.c:459]: 60491 bytes
Memory high-water mark for p = 3 : 1140556363 bytes
Source map for p = 3:
  [strassen.c:517]: 737842688 bytes
  [strassen.c:776]: 402653184 bytes
  [strassen.c:459]: 60491 bytes

Figure 7-3: An example of Cilkmem's verbose-mode output for the strassen benchmark. This excerpt shows the reported high-water mark in bytes for p from 1 to 3 and which lines of code contribute to it.

algorithms in Section 7.4 and Section 7.5. In order to maintain the theoretical space bounds of the online MHWM algorithms, the Cilkmem tool's producer thread blocks in the rare case that there is a backlog of unconsumed records. Although it would be possible to implement the online MHWM algorithms without the use of this producer–consumer pattern, such an approach can result in increased instrumentation overhead due to, among other things, an increase in instruction cache misses. In addition to computing the memory high-water mark, Cilkmem supports a verbose mode which provides the user with actionable information for identifying the root cause of the memory high-water mark. In verbose mode, Cilkmem constructs the full computation DAG, annotates each edge (strand) with information about how much memory is allocated within that strand, and outputs the resulting graph in a graphical format. Furthermore, Cilkmem identifies the lines of code responsible for allocations that contribute to the memory high-water mark for a given p, and reports them to the user along with how many bytes each line is responsible for. Figure 7-3 shows the output generated by Cilkmem's verbose mode when run on the strassen benchmark from the Cilkbench suite with a 4096×4096 matrix as the input. The program's memory high-water mark increases by about 122MB for each additional processor, and Cilkmem identifies line 517 of source file strassen.c as responsible for this increase. Cilkmem also reveals that the allocations performed by two other lines of code do not increase in size as p increases.
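To illustrate the payload-based size-tracking idea described in Section 7.6.1, the following C++ sketch prepends a small header recording each allocation's requested size. This is purely illustrative: the function names are hypothetical, and the real tool interposes on glibc's allocation symbols rather than exposing wrapper functions like these.

    #include <cstddef>
    #include <cstdlib>

    // Header prepended to every allocation so that the matching free can report
    // how many bytes were released. (Real interpositioning code must also keep
    // the user block suitably aligned; that detail is omitted here.)
    struct SizeHeader { std::size_t size; };

    // Hypothetical hooks that would forward events to the analysis thread.
    static void record_alloc(std::size_t) { /* forward to the analysis thread */ }
    static void record_free(std::size_t)  { /* forward to the analysis thread */ }

    void* tracked_malloc(std::size_t size) {
        auto* h = static_cast<SizeHeader*>(std::malloc(sizeof(SizeHeader) + size));
        if (h == nullptr) return nullptr;
        h->size = size;                 // remember the requested size
        record_alloc(size);
        return h + 1;                   // hand the caller the region after the header
    }

    void tracked_free(void* ptr) {
        if (ptr == nullptr) return;
        auto* h = static_cast<SizeHeader*>(ptr) - 1;  // recover the header
        record_free(h->size);
        std::free(h);
    }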

7.6.2 Benchmarks

We tested the runtime overhead of Cilkmem on ten Cilk programs from the Cilkbench suite⁷. The Cilkbench suite contains a variety of programs that implement different kinds of algorithms, such as Cholesky decomposition, matrix multiplication, integer sorting, and Delaunay triangulation. Table 7.1 shows the geometric-mean overhead as the ratio between an execution of the benchmark program through Cilkmem and a serial execution of the uninstrumented

⁷The Cilkbench suite is at https://github.com/neboat/cilkbench

program (T1). The performance of Cilkmem was tested for both the exact algorithm and the approximate-threshold algorithm. When the exact algorithm was used, Cilkmem was run with p set to all powers of 2 from 32 to 4096 inclusive, and its runtime is reported in the table as Texact. The approximate-threshold algorithm's runtime (Tapprox) is independent of p. As the results show, Cilkmem's overheads are generally low, typically amounting to less than a 20% slowdown relative to an uninstrumented execution. Cilkmem's overheads are especially low for benchmarks that do not exhibit substantial fine-grained parallelism. For programs that do (e.g., qsort or cholesky), however, the exact MHWM algorithm can incur a significant performance degradation for large values of p. In these cases, the approximate-threshold algorithm is substantially faster than the exact algorithm.

Differential MHWM for p = 3 -> p = 4
  [sift.cpp:355]: 63.79 MB
  [sift.cpp:259]: 21.26 MB
  [sift.cpp:319]: 7.08 MB
  [tile.cpp:2039]: 5.31 MB
  [sift.cpp:327]: 226.76 MB

Figure 7-4: Differential MHWM report on the image alignment application. Illustrates the output of Cilkmem's differential MHWM report showing the relative increase in the MHWM when increasing the number of processors from p = 3 to p = 4.

7.6.3 Optimizations

The Cilkmem tool implements two critical optimizations to achieve low instrumentation overheads. Many Cilk programs do not exhibit memory activity in every strand. The MHWM algorithms were optimized to avoid performing unnecessary work whenever a component is guaranteed not to contribute to the water mark. This often allows Cilkmem to quickly skip over large sections of the series-parallel structure. As outlined in Section 7.6.1, Cilkmem utilizes two threads which act as a producer and a consumer. Since the data produced (allocated) by the first thread is consumed (freed) by the second in FIFO order, the memory management of Cilkmem's internal data structures can be greatly simplified to avoid a large number of small allocations and deallocations. Memory is managed in a memory pool that takes advantage of the FIFO nature of the producer–consumer relation.
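The following C++ sketch illustrates one way such a FIFO-friendly pool could work; it is an assumption-laden illustration, not Cilkmem's data structure. Because records are retired in exactly the order they are produced, whole chunks of storage can be recycled instead of performing one allocation and deallocation per record.

    #include <cstddef>
    #include <deque>
    #include <vector>

    // Single-producer/single-consumer FIFO pool: records are handed out in order
    // by allocate() and retired in the same order by release(), so storage is
    // recycled a whole chunk at a time. Cross-thread synchronization is omitted,
    // and Record must be default-constructible.
    template <typename Record, std::size_t ChunkSize = 4096>
    class FifoPool {
        std::deque<std::vector<Record>> chunks_;  // oldest chunk at the front
        std::size_t producePos_ = 0;              // next free slot in the back chunk
        std::size_t consumePos_ = 0;              // next unreleased slot in the front chunk

    public:
        // Producer side: obtain storage for the next record.
        Record* allocate() {
            if (chunks_.empty() || producePos_ == ChunkSize) {
                chunks_.emplace_back(ChunkSize);
                producePos_ = 0;
            }
            return &chunks_.back()[producePos_++];
        }

        // Consumer side: retire the oldest outstanding record
        // (precondition: at least one record is outstanding).
        void release() {
            if (++consumePos_ == ChunkSize) {
                chunks_.pop_front();   // the whole front chunk has been drained
                consumePos_ = 0;
            }
        }
    };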

7.6.4 Case study: multicore image processing pipeline

We conducted a case study on an existing image-processing pipeline [187] that performs alignment and reconstruction of high-resolution images produced via electron microscopy8. The alignment code processes thousands of 8.5 MB image tiles in order to stitch them together to form a 2D mosaic. The pipeline was designed to carefully manage memory resources so as to be able to 2D-align very large mosaics on a single multicore. The memory usage of the application scaled predictably as a function of the size of the mosaic, but there was an unexplained increase in memory usage when adding additional processors. Given the size of the individual images being aligned, a natural expectation would be for the MHWM to increase by approximately 8.5 MB per processor. Empirically, however,

8For the purposes of this study, we ran the pipeline of [187] on full-resolution images without using FSJ.

the application's 18-core execution used several gigabytes more memory than the 1-core execution, and the precise amount of extra memory used varied from run to run. We used Cilkmem to analyze the pipeline's p-processor MHWM. Cilkmem revealed that the MHWM increased by approximately 350 MB per processor. To identify the source of this per-processor memory requirement, we used Cilkmem to generate a differential MHWM report9 that attributes an increase in the MHWM between p and p + 1 processors to particular source-code locations. Figure 7-4 shows the differential MHWM report generated by Cilkmem for the alignment code, which reveals that lines 259, 327, and 355 of sift.cpp are responsible for increasing the MHWM of the application by a total of approximately 311 MB per processor. These lines perform allocations to store a difference-of-Gaussian pyramid of images as part of the scale-invariant feature transform employed by the alignment code. This procedure doubles the resolution of the images (to allow for subpixel localization of features), performs 6 GaussianBlurs on the images, downsamples the blurred image by a factor of 2 in each dimension, and repeats until the image size falls below a threshold. This routine is called on overlapping regions of image tiles. Although these overlapping regions are typically small, the largest 1% are as large as 2 MB in size. Whereas the original input images represent each pixel with a single byte, this procedure generates intermediate results that use 4-byte floating-point values for each pixel. A back-of-the-envelope calculation revealed that this procedure can require 200–400 MB of space for tile pairs with large overlapping regions, which conforms with Cilkmem's analysis (a sample calculation is sketched at the end of this subsection).

This study illustrates a case in which Cilkmem allowed application developers to make their memory-use projections for the pipeline precise, and illuminated a source of memory blow-up when running the pipeline on very large multicores. Cilkmem's low instrumentation overhead made it practical to perform frequent and iterative tests on multiple versions of the pipeline. In fact, both the exact and approximate Cilkmem MHWM algorithms had less than 5% overhead on the alignment application for p = 128. These low overheads, coupled with the illuminating insights provided by Cilkmem, led to the incorporation of Cilkmem into the regression-testing process for the alignment pipeline.
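To make the back-of-the-envelope estimate concrete, the following illustrative calculation uses round numbers consistent with the figures quoted above (the per-octave constants and the number of pyramid levels retained are assumptions made for the sake of the example):

$$2\,\mathrm{MB} \times \underbrace{4}_{\text{doubled resolution}} \times \underbrace{4}_{\text{4-byte floats}} = 32\,\mathrm{MB} \text{ at the base level of the pyramid},$$

$$32\,\mathrm{MB} \times 6 \times \left(1 + \tfrac{1}{4} + \tfrac{1}{16} + \cdots\right) \approx 32 \times 6 \times \tfrac{4}{3} = 256\,\mathrm{MB},$$

which falls within the 200–400 MB range mentioned above.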

7.7 Related work

This section overviews related work in analysis of parallel programs, focusing on analysis of memory consumption and parallel-program analyses that do not depend on the particulars of the task-parallel program’s scheduling.

Related theoretical work From a theoretical perspective, the memory high-water mark problem (MHWM) is closely related to the poset chain optimization problem (PCOP) [315, 59, 316, 243]. In an instance of PCOP, one is given a parameter p and an arbitrary DAG G = (V,E) in which each edge has been assigned a non-negative weight, and one wishes to determine the weight of the heaviest antichain containing p or fewer edges (where the weight of each antichain is the sum of the weights of its edges). Shum and Trotter [315] showed that PCOP can be solved in (substantial) polynomial time in the size of G using a linear-programming based

9Cilkmem generates differential MHWM reports via a lightweight script that parses Cilkmem’s verbose- mode output.

algorithm; for the special case of p = ∞, an algorithm based on maximum flows is also known [59, 316, 243].

The relationship between PCOP and MHWM was previously made explicit by Marchal et al. [243], who applied algorithms for PCOP in order to design polynomial-time algorithms for computing the memory high-water mark of parallel algorithms. Because none of the known algorithms for PCOP run in (or even close to) linear time, however, the memory high-water mark algorithms of [243] are too inefficient to be used in practice. Additionally, in order to apply PCOP algorithms to the computation of memory high-water marks, [243] were forced to make a number of simplifying assumptions about the parallel programs being analyzed, and their algorithms require that the parallel programs be in what they call the simple-data-flow model.

The difficulty of solving PCOP efficiently on arbitrary DAGs motivates the focus in this chapter on the important special case in which the DAG G is series-parallel. Moreover, by modeling memory with arbitrary allocations and frees (rather than using the simple-data-flow model), we ensure that our algorithms have theoretical guarantees when applied to analyzing arbitrary task-parallel programs.

In addition to considering the high-water mark problem, Marchal et al. [243] considered the problem of adding new dependencies to a parallel program's DAG in order to reduce the high-water mark. They prove that this problem is NP-complete, and empirically evaluate several heuristics. These techniques would be difficult to apply to most real-world parallel programs, however, since they require the offline analysis of the parallel program's computation DAG.

Related tools In practice, many tools exist to measure and report the maximum memory consumption of a running program. For example, the Linux kernel tracks memory-usage information for every running process and publishes that information through the proc pseudo-filesystem [285], including the virtual-memory size and the resident set size (RSS); the RSS is the portion of the process's memory stored in main memory and is therefore upper-bounded by the memory high-water mark. Performance-analysis toolkits including Intel VTune Amplifier [173], the Sun Performance Analyzer [276], the Massif tool [336] in the Valgrind tool suite [265], and Linux's memusage tool [307] measure the memory consumption of an execution of a specified program and report its peak stack- and heap-memory consumption. Like Cilkmem, these tools intercept calls to dynamic memory-allocation functions, such as malloc and free. Unlike Cilkmem, however, all of these memory-analysis tools gather information that is specific to how the program was scheduled for a particular run. These analyses do not capture the worst-case memory consumption over all parallel executions of the program on a given processor count.

Several other dynamic-analysis tools for task-parallel programs have been developed whose analyses do not depend on the scheduling of the program. Tools such as Cilkview [149] and Cilkprof [303] analyze the execution of a Cilk program and report on the program's parallel scalability, which reflects how much speedup the program might achieve using different numbers of parallel processors. Several other tools analyze parallel memory accesses in a task-parallel program that might exhibit nondeterministic behavior between runs of the program [105, 20, 334, 287, 288, 97]. Like Cilkmem, the analyses performed by these tools do not depend on how the program was scheduled for a particular run, and instead provide insight into the behavior or performance of all parallel executions of the

program. Unlike Cilkmem, however, these tools do not analyze memory consumption.
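As a concrete point of reference for how such interception works in practice, the following is a minimal sketch of the standard LD_PRELOAD interposition pattern (a generic illustration, not Cilkmem's or any of the above tools' actual implementation; a production tool must also handle re-entrancy during dlsym and attribute each event to the correct thread and strand):

#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <atomic>
#include <cstddef>

// Running total of requested heap bytes (illustrative bookkeeping only).
static std::atomic<long> g_requested_bytes{0};

extern "C" void* malloc(std::size_t size) {
  static void* (*real_malloc)(std::size_t) =
      reinterpret_cast<void* (*)(std::size_t)>(dlsym(RTLD_NEXT, "malloc"));
  void* p = real_malloc(size);
  if (p) g_requested_bytes.fetch_add(static_cast<long>(size));
  return p;
}

extern "C" void free(void* p) {
  static void (*real_free)(void*) =
      reinterpret_cast<void (*)(void*)>(dlsym(RTLD_NEXT, "free"));
  // A real tool would look up the allocation's size here (e.g., via a side
  // table) and record the matching decrement before forwarding the call.
  real_free(p);
}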

7.8 Conclusion

This chapter introduced Cilkmem, a tool that analyzes the p-processor memory high-water mark of fork-join programs. Cilkmem is built on top of novel algorithms which provide Cilkmem with both accuracy and running-time guarantees. We conclude with several directions for future work.

Although Cilkmem analyzes the behavior of parallel programs, it currently runs these programs serially while performing the analysis. Extending Cilkmem to run in parallel is an important direction of future work. Theoretically, all of our algorithms could be implemented in parallel at the cost of requiring additional memory. In particular, both online algorithms adapt to this setting using total space O(p·T∞) for the exact algorithm and O(T∞) for the approximate-threshold algorithm, where T∞ is the span of the parallel program being analyzed. This can be significant, especially for parallel programs with large multi-spawn combinations. Furthermore, there are technical challenges in parallelizing Cilkmem. The current instrumentation approach is not easily amenable to parallelization, since thread scheduling is hidden by the Cilk runtime system. Finally, capturing memory allocations in a multi-threaded program is made more difficult by the fact that each allocation needs to be properly attributed to the correct thread and strand.

A theoretically interesting direction of work would be to extend our work on approximation algorithms to consider the memory high-water mark of parallel programs with arbitrary DAGs. Whereas computing the exact memory high-water mark of an arbitrary DAG is known to be difficult to do with low overhead, much less theoretical work has been done on the approximation version of the same question.

7.9 Appendix: Online exact computation of Hp(G)

This section describes ExactOn, an online algorithm to compute the exact memory high-water mark for processor counts 1, . . . , p. ExactOn adapts the O(|E|·p)-time exact algorithm, ExactOff, from Section 7.3 to space-efficiently handle multi-spawn combinations. Formally, ExactOn recursively computes three quantities for each component C: (1) the (p + 1)-element array R_C; (2) the total memory allocated t(C) over the edges in C; and (3) the value of s(C). Since s(C) can be recovered in time O(p) from R_C, the last of these quantities can be computed non-recursively for each component.

Now let us consider a multi-spawn combination C = (a_0, b_1, a_1, . . . , b_k, a_k), and let x_1, x_2, . . . , x_{2k+1} be a consecutive labeling of a_0, b_1, a_1, . . . , b_k, a_k (i.e., x_1 = a_0, x_2 = b_1, . . .).

During a multi-spawn combination, we are given the values of R_{x_i}, t(x_i), s(x_i) for each i = 1, . . . , 2k + 1, and we wish to compute R_C and t(C) (after which we can obtain s(C) from R_C in time O(p)).

The quantity t(C) is easy to compute recursively, since t(C) = Σ_i t(x_i). What is more difficult is to obtain the array R_C. To do this, as we receive R_{x_l}, t(x_l), s(x_l) for each l, we maintain three intermediate variables. In order to define these intermediate variables, we must first introduce the notions of the suspended-end and ignored-end water marks of antichains in a multi-spawn component C.

If an antichain A in C contains only edges in b_1, . . . , b_k, and t is the largest index such that b_t contains an edge of A, then we say A has a suspended end if the components a_{t+1}, b_{t+2}, . . . , a_k form a companion component of A (which occurs if the sum of their edge costs is net positive). The suspended-end water mark of A is W(A) if A has a suspended end, and is W(A) + Σ_{e ∈ a_{t+1} ∪ b_{t+2} ∪ ··· ∪ a_k} t(e) if A does not have a suspended end (i.e., it is the water mark A would have if it had a suspended end). Similarly, the ignored-end water mark of A is W(A) if A does not have a suspended end, and is W(A) − Σ_{e ∈ a_{t+1} ∪ b_{t+2} ∪ ··· ∪ a_k} t(e) if A does have a suspended end. Additionally, for any antichain A that contains an edge in some a_t, we define the ignored-end water mark of A to be the water mark of A; thus the ignored-end water mark is defined for all antichains A of C.

As we receive R_{x_l}, t(x_l), s(x_l) for each l, we maintain three intermediate variables, each of which is an O(p)-element array indexed either by i = 0, . . . , p or by i = 1, . . . , p. Define l_1 to be the index of the largest-indexed a_i before or at x_l, and l_2 to be the index of the largest-indexed b_i before or at x_l. After receiving R_{x_l}, t(x_l), s(x_l), the three intermediate variables are updated so that they are defined as follows:

• SuspendEnd_l[i] for i = 1, . . . , p: This is the maximum suspended-end cost of any antichain A in x_1 ∪ ··· ∪ x_l such that (1) A contains exactly i edges and (2) all of A's edges are in b_1, . . . , b_{l_2}. (Note that here x_1 ∪ ··· ∪ x_l is treated as a multi-spawn component and the costs of the antichains are considered just within the graph containing x_1, . . . , x_l.)

• IgnoreEnd_l[i] for i = 1, . . . , p: This is the maximum ignored-end cost of any antichain A in x_1 ∪ ··· ∪ x_l that contains exactly i edges. If no such A exists, this is null, and is treated as −∞.

• Partial_l[i] for i = 0, . . . , p: Consider antichains A_1, . . . , A_{l_2} in b_1, . . . , b_{l_2}, respectively, such that the total number of edges in the antichains is i. Then Partial_l[i] is the sum of two quantities: (1) the sum of the edge totals in the a_i's seen so far, given by Σ_{i=0}^{l_1} t(a_i); and (2) the maximum possible value of the sum Σ_{j=1}^{l_2} W(A_j), where the water mark of each A_j is considered within only the graph b_j, and the water mark of an antichain A_j containing no edges is set to max(0, t(b_j)). (If no such A_1, . . . , A_{l_2} exist, then Partial_l[i] = null.)

In terms of R_{b_j}[t_j], one can express Partial_l[i] as

$$\mathrm{Partial}_l[i] = \sum_{i=0}^{l_1} t(a_i) + \max_{t_1 + t_2 + \cdots + t_{l_2} = i} \sum_{j=1}^{l_2} R_{b_j}[t_j]. \quad (7.9)$$

When l = 0, before starting the computation, we initialize each entry of each of the intermediate variables to null; the exception to this is Partial_0[0], which we initialize to zero.

Upon receiving a given x_l for an odd l (meaning x_l = a_{l_1}), we can compute the new intermediate variables as follows:

• SuspendEnd_l[i] for i = 1, . . . , p: This equals SuspendEnd_{l−1}[i] + t(x_l).

• IgnoreEnd_l[i] for i = 1, . . . , p: This equals

$$\max\left(\mathrm{IgnoreEnd}_{l-1}[i],\ \max_{j=1}^{i}\bigl(\mathrm{Partial}_{l-1}[i-j] + R_{x_l}[j]\bigr)\right).$$

In particular, the right side of the maximum considers as an option for IgnoreEnd_l[i] the possibility that our antichain A has some non-zero number j of edges in a_{l_1}, and then i − j edges spread across b_1, . . . , b_{l_2}.

• Partial_l[i] for i = 0, . . . , p: This is Partial_{l−1}[i] + t(x_l).

Upon receiving a given x_l for an even l (meaning x_l = b_{l_2}), we can compute the new intermediate variables as follows:

• SuspendEnd_l[i] for i = 1, . . . , p: This equals

$$\max\left(\mathrm{SuspendEnd}_{l-1}[i] + t(b_{l_2}),\ \max_{j=1}^{i}\bigl(\mathrm{Partial}_{l-1}[i-j] + R_{x_l}[j]\bigr)\right).$$

In particular, the right side of the maximum considers as an option for SuspendEnd_l[i] the possibility that our antichain A has some non-zero number j of edges in b_{l_2}, and then i − j edges spread across b_1, . . . , b_{l_2 − 1}. Note that the suspended-end cost and ignored-end cost of such an antichain in x_1 ∪ ··· ∪ x_l will be equal (since there is no end to be suspended); consequently, we will use a similar maximum to compute the new value of IgnoreEnd_l[i].

• IgnoreEnd_l[i] for i = 1, . . . , p: This equals

$$\max\left(\mathrm{IgnoreEnd}_{l-1}[i],\ \max_{j=1}^{i}\bigl(\mathrm{Partial}_{l-1}[i-j] + R_{x_l}[j]\bigr)\right).$$

As before, the right side of the maximum considers as an option for IgnoreEnd_l[i] the possibility that our antichain A has some non-zero number j of edges in b_{l_2}, and then i − j edges spread across b_1, . . . , b_{l_2 − 1}.

• Partial_l[i] for i = 0, . . . , p: This equals

$$\max\left(\mathrm{Partial}_{l-1}[i] + R_{x_l}[0],\ \max_{j=1}^{i}\bigl(\mathrm{Partial}_{l-1}[i-j] + R_{x_l}[j]\bigr)\right),$$

where the right side of the maximum is null for i = 0. Once again, the right side of the maximum considers as an option for Partial_l[i] the possibility that our antichain A has some non-zero number j of edges in b_{l_2}, and then i − j edges spread across b_1, . . . , b_{l_2 − 1}. The left side, on the other hand, represents the case where no antichain edges appear in b_{l_2}. (A code sketch of these odd- and even-l updates appears after this list.)
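The following is a minimal C++ sketch of these per-component updates (illustrative only: null is represented by a large negative sentinel treated as −∞, the R_{x_l} arrays are assumed to be given as plain vectors indexed 0, . . . , p, and the largest-non-null-index optimization described in the next paragraph is omitted).

#include <algorithm>
#include <limits>
#include <vector>

// Sketch of the per-component updates in ExactOn for one multi-spawn
// combination (illustrative; not the tool's actual implementation).
using Val = long long;
const Val kNull = std::numeric_limits<Val>::min() / 4;  // stands in for "null"

struct MultiSpawnState {
  int p;
  std::vector<Val> SuspendEnd;  // indices 1..p
  std::vector<Val> IgnoreEnd;   // indices 1..p
  std::vector<Val> Partial;     // indices 0..p

  explicit MultiSpawnState(int p_)
      : p(p_), SuspendEnd(p_ + 1, kNull), IgnoreEnd(p_ + 1, kNull),
        Partial(p_ + 1, kNull) {
    Partial[0] = 0;  // Partial_0[0] = 0; everything else starts null
  }

  // Receive x_l = a_{l_1} (odd l) with array R = R_{x_l} and total t = t(x_l).
  void receiveA(const std::vector<Val>& R, Val t) {
    for (int i = 1; i <= p; ++i) {
      if (SuspendEnd[i] != kNull) SuspendEnd[i] += t;
      Val best = IgnoreEnd[i];
      for (int j = 1; j <= i; ++j)  // j edges of the antichain lie in a_{l_1}
        if (Partial[i - j] != kNull)
          best = std::max(best, Partial[i - j] + R[j]);
      IgnoreEnd[i] = best;
    }
    for (int i = 0; i <= p; ++i)
      if (Partial[i] != kNull) Partial[i] += t;
  }

  // Receive x_l = b_{l_2} (even l) with array R = R_{x_l} and total t = t(x_l).
  void receiveB(const std::vector<Val>& R, Val t) {
    // best[i] = max over 1 <= j <= i of Partial[i-j] + R[j]  (null if none).
    std::vector<Val> best(p + 1, kNull);
    for (int i = 1; i <= p; ++i)
      for (int j = 1; j <= i; ++j)
        if (Partial[i - j] != kNull)
          best[i] = std::max(best[i], Partial[i - j] + R[j]);
    for (int i = 1; i <= p; ++i) {
      Val withEnd = (SuspendEnd[i] != kNull) ? SuspendEnd[i] + t : kNull;
      SuspendEnd[i] = std::max(withEnd, best[i]);
      IgnoreEnd[i] = std::max(IgnoreEnd[i], best[i]);
    }
    for (int i = 0; i <= p; ++i) {
      Val keep = (Partial[i] != kNull) ? Partial[i] + R[0] : kNull;
      Partial[i] = std::max(keep, best[i]);  // best[0] is null, as required
    }
  }

  // After all 2k+1 components: R_C[i] for i > 0, and R_C[0] = max(0, t(C)).
  Val finalR(int i, Val tC) const {
    return (i == 0) ? std::max<Val>(0, tC)
                    : std::max(SuspendEnd[i], IgnoreEnd[i]);
  }
};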

Using the recursions above, we can compute the intermediate values for each l in time O(p²). We can then compute R_C using the identity

R_C[i] = max(SuspendEnd_{2k+1}[i], IgnoreEnd_{2k+1}[i]) for i > 0, and R_C[0] = max(0, t(C)).

For a series-parallel graph G, we now have an online (space-efficient) algorithm for computing R_G in time O(|E|·p²). Using a similar optimization as in the previous section, we can improve this running time to O(|E|·p). In particular, for each Partial_l, we keep track of the largest index containing a non-null entry; when computing each of the maximums

in our updates, we can then ignore any terms involving a null entry of either Partial_{l−1} or of R_{x_l}. This ensures that computing the intermediate variables for a given value of l takes time at most

$$O\left(p + \min\left(\sum_{j=1}^{\lfloor (l-1)/2 \rfloor} s(b_j),\ p\right) \cdot s(x_l)\right). \quad (7.10)$$

Call this the optimized algorithm. Using an amortized analysis, we will prove that the optimized algorithm has running time O(|E|·p).

Theorem 38 For a graph G = (V,E) recursively constructed with series and multi-spawn combinations, the optimized algorithm recursively computes RG in time O(|E|·p) and using space O(p) for each combination.

The fact that the product in Equation (7.10) involves a summation $\sum_{j=1}^{\lfloor (l-1)/2 \rfloor} s(b_j)$ means that the simple credit-charging argument used to prove Theorem 31 no longer suffices for proving Theorem 38. Nonetheless, by splitting the problem into two separate amortization arguments, we are able to complete the analysis. This is done in Lemma 39.

Lemma 39 Consider a series-parallel graph G = (V, E) recursively built from series and multi-spawn parallel combinations. Denote each such multi-spawn combination by the tuple (x_1, x_2, x_3, . . . , x_t), where the odd-indexed x_i's represent the a_i components and the even-indexed x_i's represent the b_i components. Then

t−1 bi/2c  X X X min  s(x2j), p · s(xi) ≤ O(|E|·p), (7.11) (x1,...,xt)∈M i=1 j=1 where the set M contains all multi-spawn combinations in the recursive construction of G.

Proof. We think of each multi-spawn combination (x_1, . . . , x_t) as consisting of t − 1 sub-combinations, in which after i sub-combinations we have combined x_1, . . . , x_{i+1}. One should think of the cost of the (i − 1)-th sub-combination as

$$\min\left(\sum_{j=1}^{\lfloor i/2 \rfloor} s(x_{2j}),\ p\right)\cdot s(x_i).$$

We say the sub-combination is heavy if s(x_i) = p and is light if s(x_i) < p. The light sub-combinations can be handled using a similar argument as for the non-fully-formed case in Lemma 32. At the beginning of the recursive construction of G, assign to each edge 2p − 1 credits. For each light sub-combination, combining some x_1, . . . , x_{i−1} with some x_i, we deduct $\min\bigl(\sum_{j=1}^{\lfloor i/2 \rfloor} s(x_{2j}),\ p\bigr)$ credits from each edge in x_i. Since the number of edges in x_i is at least s(x_i), the sub-combination deducts a total of at least $\min\bigl(\sum_{j=1}^{\lfloor i/2 \rfloor} s(x_{2j}),\ p\bigr)\cdot s(x_i)$ credits. In order to bound the contribution of light sub-combinations to Equation (7.11), it suffices to prove that each edge e ∈ E has a total of at most 2p − 1 credits deducted from it. This follows by the exact same invariant-based argument as used in the proof of Lemma 32.

In order to analyze the contribution of the heavy sub-combinations to Equation (7.11), we introduce a second amortization argument. Again we assign credits to edges, this time giving p credits to each edge e. As we recursively construct G through series and multi-spawn combinations, we assign to each component C a set of up to p representative edges, which includes all of C's edges when C contains p or fewer edges, and p edges otherwise. When a component C is constructed by combining two components C_1 and C_2 in series, C's representative edges are the union of C_1's and C_2's (truncated to at most p edges). When a component C is constructed by a multi-spawn combination (x_1, . . . , x_t) such that at least one of the x_i's contains p or more edges, C's representative edges are inherited from the first such x_i; if none of the x_i's contain p or more edges, C's representative edges are the union of the representative edges of each of the x_i's (truncated to at most p edges).

Now consider a heavy sub-combination between sub-components x_1, . . . , x_{i−1} and x_i (recall that since the sub-combination is heavy, we have s(x_i) = p). If each of x_1, . . . , x_{i−1} contains fewer than p edges, then we deduct p credits from each representative edge in each of x_1, . . . , x_{i−1}. (Note that this is actually all of the edges in x_1, . . . , x_{i−1}.) If at least one of x_1, . . . , x_{i−1} contains p or more edges, then we deduct p credits from each representative edge of x_i. In both cases, we deduct at least $\min\bigl(\sum_{j=1}^{\lfloor i/2 \rfloor} s(x_{2j}),\ p\bigr)\cdot s(x_i)$ credits in total, corresponding with the work done during the sub-combination.

The deductions of credits are designed so that two important properties hold: (1) whenever an edge e has credits deducted during a multi-spawn combination, the edge e will no longer be a representative edge in the new component C constructed by the multi-spawn combination; and (2) within a multi-spawn combination, each edge e will have credits deducted from it at most once. Combined, these properties ensure that each edge has credits deducted at most once during the full construction of G. This, in turn, ensures that the total number of credits deducted by the algorithm is at most |E|·p, and that the contribution of heavy sub-combinations to Equation (7.11) is also at most |E|·p.

7.10 Appendix: An offline approximate-threshold algorithm

In this section we present our algorithm for the approximate-threshold problem in the simpler offline setting, in which, rather than supporting multi-spawn combinations, our recursive algorithm needs only support series and parallel combinations.

Our algorithm for the approximate-threshold problem will compute the high-water mark over a special class of antichains that satisfy a certain property that we call robustness. (This is similar to the notion of stripped robustness from Section 7.5, except without any requirements about non-critical components; there are also several other minor differences designed to yield the simplest possible final algorithm.) The return value of the algorithm will then be determined by whether the computed value h is greater than M/2. In this section, we define what it means for an antichain to be robust and prove the correctness of our algorithm. Then, in Section 7.10, we describe a recursive algorithm for computing the quantity h needed by the algorithm in linear time O(|E|).

When considering an antichain A = (x_1, . . . , x_q), we partition the predecessors of the antichain, {e | e < x_i for some i}, into two categories. The core predecessors CP(A) of the antichain A are the set of edges that are predecessors of more than one member of the antichain,

$$\mathrm{CP}(A) = \{e \mid e < x_i,\ e < x_j \text{ for some } i \neq j\}.$$

If an edge e is a predecessor of A but not a core predecessor, then e is a local predecessor of some x_i. We denote the set of local predecessors of x_i by

$$\mathrm{LP}_A(x_i) = \{e \mid e < x_i \text{ and } e \not< x_j\ \forall j \neq i\}.$$

We define the core companions CC(A) of the antichain A to be the set of edges e contained in a parallel component T_1 with positive edge sum whose partnering parallel component T_2 contains multiple edges from the antichain A. For each x_i, we define the local companions LC_A(x_i) of x_i to be the set of edges e in a parallel component T_1 with positive edge sum whose partnering parallel component T_2 contains the edge x_i but no other edge x_j ∈ A. The core water mark C(A) is the sum of the edge totals over all edges in the core predecessors and companions of A,

$$C(A) = \sum_{e \in \mathrm{CP}(A)\,\cup\,\mathrm{CC}(A)} t(e).$$

Similarly, the local water mark L^∘_A(x_i) of each edge x_i ∈ A is given by

$$L^\circ_A(x_i) = m(x_i) + \sum_{e \in \mathrm{LP}_A(x_i)\,\cup\,\mathrm{LC}_A(x_i)} t(e).$$

The total water mark of the antichain can be rewritten as

$$W(A) = C(A) + \sum_{i=1}^{q} L^\circ_A(x_i).$$

Our algorithm for the approximate-threshold problem will compute the infinite-processor high-water mark, except restricted only to antichains A whose local water marks all exceed M/(2p). We call an antichain A = (x_1, . . . , x_q) robust if L^∘_A(x_i) > M/(2p) for each edge x_i. The p-processor robust memory high-water mark H^∘_p(G) is defined to be

$$H^\circ_p(G) = \max_{A \in R,\ |A| \le p} W(x_1, \ldots, x_q),$$

where R is the set of robust antichains in E.

The first step in our algorithm will be to compute the infinite-processor robust memory high-water mark H^∘_∞(G). Then, if H^∘_∞(G) ≤ M/2, our algorithm will return 0, and if H^∘_∞(G) > M/2, our algorithm will return 1.

Computing H^∘_∞(G) can be done in linear time O(|E|) using a recursive algorithm described in Section 7.10. The computation is made significantly easier, in particular, by the fact that it is permitted to consider the infinite-processor case rather than restricting to p processors or fewer.

On the other hand, the fact that H^∘_∞(G) should tell us anything useful about H_p(G) is non-obvious. In the rest of this section, we will prove the following theorem, which implies the correctness of the algorithm:

Theorem 40

• If H^∘_∞(G) ≤ M/2, then H_p(G) ≤ M.

• If H^∘_∞(G) > M/2, then H_p(G) > M/2.

To prove Theorem 40, we begin by comparing H^∘_p(G) and H_p(G):

Lemma 41

$$H^\circ_p(G) \ge H_p(G) - \frac{M}{2}.$$

Proof. Consider an antichain A = (x_1, . . . , x_q), with q ≤ p, that is not robust. One might try to construct a robust antichain B by removing each x_i ∈ A that satisfies L^∘_A(x_i) ≤ M/(2p). The removal of these x_i's, however, would change the sets of local predecessors and companions for the remaining x_j's, making it so that the new antichain B may still not be robust.

One can instead obtain a robust antichain through a more iterative approach. Begin with the antichain A_1 = A, which is not robust. Since A_1 is not robust, some x_i ∈ A satisfies L^∘_A(x_i) ≤ M/(2p). Define A_2 to be the same antichain with x_i removed. If the antichain A_2 is also not robust, then pick some edge x_j ∈ A_2 such that L^∘_{A_2}(x_j) ≤ M/(2p), and define A_3 to be A_2 with x_j removed. Continue like this until we reach some A_r that is robust. (Note that one legal option for A_r is the empty antichain, which is considered to be robust.) For any two consecutive antichains A_i and A_{i+1} in the sequence that differ by the removal of an edge x_j, the water marks satisfy

$$W(A_{i+1}) \ge W(A_i) - L^\circ_{A_i}(x_j). \quad (7.12)$$

(Note that the reason that Equation (7.12) is not true with equality is simply that the removal of x_j may allow for the addition of a new companion component to the antichain A_{i+1}, thereby making W(A_{i+1}) greater than W(A_i) − L^∘_{A_i}(x_j).) Since we only remove edges x_j satisfying L^∘_{A_i}(x_j) ≤ M/(2p), it follows that

$$W(A_{i+1}) \ge W(A_i) - \frac{M}{2p}.$$

Moreover, in the process of constructing the robust antichain A_r, we can remove a total of at most q edges from the original antichain A = (x_1, . . . , x_q). Thus

$$W(A_r) \ge W(A) - q \cdot \frac{M}{2p} \ge W(A) - \frac{M}{2}.$$

This, in turn, implies that H^∘_p(G) ≥ H_p(G) − M/2, as desired.

The following corollary proves the first part of Theorem 40.

Corollary 42 If H^∘_∞(G) ≤ M/2, then H_p(G) ≤ M.

Proof. If H^∘_∞(G) ≤ M/2, then H^∘_p(G) ≤ M/2, and thus by Lemma 41, H_p(G) ≤ M.

The second half of Theorem 40 is given by Lemma 43:

Lemma 43 If H^∘_∞(G) > M/2, then H_p(G) > M/2.

Proof. Since H^∘_∞(G) > M/2, one of the following must be true:

• There is a robust antichain A = (x_1, . . . , x_q) with q ≤ p such that W(A) > M/2: In this case, we trivially get that H_p(G) > M/2.

• There is a robust antichain A = (x_1, . . . , x_q) with q > p such that W(A) > M/2: This case is somewhat more subtle, since the large number of edges in the antichain A could cause W(A) to be much larger than H_p(G). We will use the robustness of A in order to prove that the potentially much smaller antichain B = (x_1, . . . , x_p) still has a large water mark W(B) > M/2.

Let T denote the set of edges e ∈ E such that either e ≤ x_i for some i ∈ [p], or e is contained in a companion parallel component of B. The quantity W(B) can be written as

$$W(B) = \sum_{e \in T} t(e) = \sum_{i=1}^{p} L^\circ_A(x_i) + \sum_{e \in T \cap (\mathrm{CP}(A) \cup \mathrm{CC}(A))} t(e).$$

By the robustness of A, each local water mark L^∘_A(x_i) is greater than M/(2p). Thus

$$W(B) > \frac{M}{2} + \sum_{e \in T \cap (\mathrm{CP}(A) \cup \mathrm{CC}(A))} t(e).$$

Recall the downset-non-negativity property, which requires that every downset S ⊆ E (meaning that the predecessors of the edges in S are all in S) satisfy $\sum_{e \in S} t(e) \ge 0$. To see that T is a downset, observe that it consists of two parts: the set T_1 of predecessors of B, and the set T_2 of edges contained in companion parallel components of B; since the set T_1 is a downset, and because the predecessors of edges in T_2 are all either in T_2 or in T_1, the full set T = T_1 ∪ T_2 is a downset. Similarly, we claim that CP(A) ∪ CC(A) is a downset; in particular, CP(A) is a downset by its definition, and the predecessors of edges in CC(A) are all either contained in CC(A) or in CP(A). Since we have shown that T and CP(A) ∪ CC(A) are downsets, and because the intersection of two downsets is necessarily also a downset, it follows that T ∩ (CP(A) ∪ CC(A)) is a downset. Applying the downset-non-negativity property, we get that

$$\sum_{e \in T \cap (\mathrm{CP}(A) \cup \mathrm{CC}(A))} t(e) \ge 0,$$

implying that W(B) > M/2 and completing the proof.

Computing H^∘_∞(G) in linear time

In this section, we present a recursive algorithm for computing the infinite-processor robust high-water mark H^∘_∞(G) in linear time O(|E|). We assume that we are given the series-parallel DAG G, along with the labels t(e) and m(e) for each e ∈ G.

Suppose we recursively build G from series and parallel combinations. Whenever we create a new component C (by combining two old ones), we will maintain the following information on the component:

• The total allocation and freeing work done in C,

$$\mathrm{MemTotal} = \sum_{e \in C} t(e).$$

• The memory high-water mark with one processor,

MaxSingle = H_1(C).

• The infinite-processor memory high-water mark restricted only to robust antichains containing more than one edge:

$$\mathrm{MultiRobust} = \max_{A \in R,\ |A| > 1} W(A),$$

where R is the set of robust antichains in G. If C contains no multi-edge robust antichains, then MultiRobust = null, and is treated as −∞.

As a base case, for a component C consisting of a single edge e, we initialize the variables as follows: MemTotal = t(e), MaxSingle = m(e), and MultiRobust = null. When we combine two components C_1 and C_2 in series to build a new component C, the three variables can be updated as follows:

• We update C.MemTotal as

C_1.MemTotal + C_2.MemTotal.

In particular, $\sum_{e \in C} t(e) = \sum_{e \in C_1} t(e) + \sum_{e \in C_2} t(e)$.

• We update C.MaxSingle as

max(C_1.MaxSingle, C_1.MemTotal + C_2.MaxSingle).

In particular, every single-edge antichain in C_1 has the same water mark in C as it did in C_1, and every single-edge antichain in C_2 has a water mark in C that is C_1.MemTotal greater than it was in C_2.

• We update C.MultiRobust as

max(C_1.MultiRobust, C_1.MemTotal + C_2.MultiRobust).

In particular, the set of multi-edge robust antichains in the new component C is the union of the set of multi-edge antichains in C_1 with the set of multi-edge antichains in C_2. Whereas each of the multi-edge antichains in C_1 has the same water mark in C as it did in C_1, the multi-edge antichains in C_2 each have their water marks increased by C_1.MemTotal.

When we combine two components C_1 and C_2 in parallel to build a new component C, the three variables can be updated as follows:

• We update C.MemTotal as

C_1.MemTotal + C_2.MemTotal.

In particular, just as before, $\sum_{e \in C} t(e) = \sum_{e \in C_1} t(e) + \sum_{e \in C_2} t(e)$.

• We update C.MaxSingle as

max(C_1.MaxSingle + max(0, C_2.MemTotal), C_2.MaxSingle + max(0, C_1.MemTotal)).

In particular, the set of single-edge antichains in C is the union of the set of single-edge antichains in C_1 with the set of single-edge antichains in C_2. Since the water marks of the antichains are the same in C_1 and C_2 as they are in C, except with the addition of max(0, C_2.MemTotal) and max(0, C_1.MemTotal) respectively (due to the possibility of C_2 and C_1 being suspended companion parallel components), C.MaxSingle can be updated by taking a simple maximum of the two options.

• The update of C.MultiRobust is slightly more subtle. Define $\overline{C_1.\mathrm{MaxSingle}}$ and $\overline{C_2.\mathrm{MaxSingle}}$ to be the highest water marks achieved by robust single-edge antichains in C_1 and C_2, respectively. That is,

$$\overline{C_1.\mathrm{MaxSingle}} = \begin{cases} C_1.\mathrm{MaxSingle} & \text{if } C_1.\mathrm{MaxSingle} > \frac{M}{2p}, \\ \text{null} & \text{otherwise,} \end{cases}$$

and

$$\overline{C_2.\mathrm{MaxSingle}} = \begin{cases} C_2.\mathrm{MaxSingle} & \text{if } C_2.\mathrm{MaxSingle} > \frac{M}{2p}, \\ \text{null} & \text{otherwise.} \end{cases}$$

Define R(C), R(C1), and R(C2) to be the sets of robust antichains in C, C1, and C2, respectively. Then, because C is obtained by combining C1 and C2 in parallel,

R(C) = {x ∪ y | x ∈ R(C1), y ∈ R(C2)}.

When computing C.MultiRobust, we are interested exclusively in the antichains A satisfying |A| > 1. If x = A ∩ C_1 and y = A ∩ C_2, then the requirement that |A| > 1 translates into the requirement that (at least) one of the following three requirements holds:

1. |x| = |y| = 1: The maximum water mark for robust antichains A such that |x| = |y| = 1 is given by $\overline{C_1.\mathrm{MaxSingle}} + \overline{C_2.\mathrm{MaxSingle}}$.

2. |y| > 1: The maximum water mark for robust antichains A such that |y| > 1 is given by

$$\max\bigl(\overline{C_1.\mathrm{MaxSingle}},\ C_1.\mathrm{MultiRobust},\ C_1.\mathrm{MemTotal},\ 0\bigr) + C_2.\mathrm{MultiRobust},$$

where the entries in the maximum correspond with the cases where |x| = 1; |x| > 1; |x| = 0 and C_1 is included as a suspended companion component; and |x| = 0 and C_1 is not included as a suspended companion component.

3. |x| > 1: The maximum water mark for robust antichains A such that |x| > 1 is given by

$$\max\bigl(\overline{C_2.\mathrm{MaxSingle}},\ C_2.\mathrm{MultiRobust},\ C_2.\mathrm{MemTotal},\ 0\bigr) + C_1.\mathrm{MultiRobust},$$

where the entries in the maximum correspond with the cases where |y| = 1; |y| > 1; |y| = 0 and C_2 is included as a suspended companion component; and |y| = 0 and C_2 is not included as a suspended companion component.

Combining the cases, we can update C.MultiRobust as

$$\max\Bigl(\overline{C_1.\mathrm{MaxSingle}} + \overline{C_2.\mathrm{MaxSingle}},\ \max\bigl(\overline{C_1.\mathrm{MaxSingle}},\ C_1.\mathrm{MultiRobust},\ C_1.\mathrm{MemTotal},\ 0\bigr) + C_2.\mathrm{MultiRobust},\ \max\bigl(\overline{C_2.\mathrm{MaxSingle}},\ C_2.\mathrm{MultiRobust},\ C_2.\mathrm{MemTotal},\ 0\bigr) + C_1.\mathrm{MultiRobust}\Bigr).$$

Using the recursive construction described above, we can compute the variables MemTotal, MaxSingle, and MultiRobust for our graph G in linear time O(|E|). In order to then compute H^∘_∞(G), the infinite-processor high-water mark considering only robust antichains, we simply compute

$$H^\circ_\infty(G) = \max\left(\begin{cases} \mathrm{MaxSingle} & \text{if } \mathrm{MaxSingle} > \frac{M}{2p} \\ \text{null} & \text{otherwise} \end{cases},\ \mathrm{MultiRobust},\ 0\right).$$
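The recursion above translates directly into code. The following is a minimal C++ sketch (illustrative only: null is encoded as a large negative sentinel, the threshold M/(2p) is taken with integer division, and the traversal of the series-parallel DAG that drives the series and parallel combinations is omitted); the overlined quantities correspond to robustSingle below.

#include <algorithm>
#include <limits>

// Sketch of the offline approximate-threshold recursion (illustrative only).
// Values are in bytes; "null" is encoded as a large negative sentinel.
using Val = long long;
const Val kNull = std::numeric_limits<Val>::min() / 4;

struct Component {
  Val memTotal;     // sum of t(e) over the component's edges
  Val maxSingle;    // best single-edge antichain water mark
  Val multiRobust;  // best multi-edge robust antichain water mark, or null
};

// Base case: a single edge e with labels t(e) and m(e).
Component leaf(Val t, Val m) { return Component{t, m, kNull}; }

// The overlined quantity: best *robust* single-edge water mark, or null.
Val robustSingle(const Component& c, Val M, Val p) {
  return (c.maxSingle > M / (2 * p)) ? c.maxSingle : kNull;
}

Val addOrNull(Val a, Val b) {  // null-propagating addition
  return (a == kNull || b == kNull) ? kNull : a + b;
}

Component combineSeries(const Component& c1, const Component& c2) {
  return Component{
      c1.memTotal + c2.memTotal,
      std::max(c1.maxSingle, c1.memTotal + c2.maxSingle),
      std::max(c1.multiRobust, addOrNull(c1.memTotal, c2.multiRobust))};
}

Component combineParallel(const Component& c1, const Component& c2,
                          Val M, Val p) {
  Val r1 = robustSingle(c1, M, p), r2 = robustSingle(c2, M, p);
  Val both = addOrNull(r1, r2);  // case |x| = |y| = 1
  Val yMulti = addOrNull(std::max({r1, c1.multiRobust, c1.memTotal, Val(0)}),
                         c2.multiRobust);  // case |y| > 1
  Val xMulti = addOrNull(std::max({r2, c2.multiRobust, c2.memTotal, Val(0)}),
                         c1.multiRobust);  // case |x| > 1
  return Component{
      c1.memTotal + c2.memTotal,
      std::max(c1.maxSingle + std::max<Val>(0, c2.memTotal),
               c2.maxSingle + std::max<Val>(0, c1.memTotal)),
      std::max({both, yMulti, xMulti})};
}

// Final answer: the infinite-processor robust high-water mark of G.
Val robustHighWaterMark(const Component& g, Val M, Val p) {
  return std::max({robustSingle(g, M, p), g.multiRobust, Val(0)});
}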

7.11 Appendix: Recursing on multi-spawn components

In this section, we complete the recursion for H^∘_∞(G) discussed in Section 7.5.2 by handling the case of multi-spawn combinations. Consider a multi-spawn combination C as in Figure 7-2 with components C_1 = a_0, C_2 = b_1, C_3 = a_1, . . . , C_{2k+1} = a_k. Throughout the section, we will use the notation m(b_i) and t(b_i) introduced in Lemma 37. When b_i is in the first case of the lemma, we say that b_i is a natural companion and that b_i's natural contribution is t(b_i); when b_i is in the second case, we say that b_i is naturally dormant and that b_i's natural contribution is 0; when b_i is in the third case, we say that b_i is naturally active and that b_i's natural contribution is m(b_i).

Recall that the execution of the parallel program on one thread computes the recursive values for each C_i with i iterating through the range i = 1, . . . , 2k + 1. We wish to use these in order to compute the recursive values for C. To do this, we maintain a collection of intermediate values during the execution of the components C_1, . . . , C_{2k+1}. Before introducing these intermediate values, we define a few terms.

We call a stripped robust antichain A in C a candidate antichain if for each b_i in C such that A contains an edge in one of a_i, b_{i+1}, a_{i+1}, . . ., the three properties stated in

Lemma 37 hold. (In particular, b_i's local contribution L_A(b_i) should be precisely b_i's natural contribution.) By Lemma 37, when computing C.MultiRobust, it suffices to consider only multi-edge candidate antichains.

In order to describe the intermediate values that we maintain during the execution of the components, we will also need the notions of a suspended-end and ignored-end water mark. (These are the same definitions as used in Section 7.9.) If an antichain A in C contains only edges in b_1, . . . , b_k, and t is the largest index such that b_t contains an edge of A, then we say A has a suspended end if the components a_{t+1}, b_{t+2}, . . . , a_k form a companion component of A (which occurs if the sum of their edge costs is net positive). The suspended-end water mark of A is W(A) if A has a suspended end, and is W(A) + Σ_{e ∈ a_{t+1} ∪ b_{t+2} ∪ ··· ∪ a_k} t(e) if A does not have a suspended end (i.e., it is the water mark A would have if it had a suspended end). Similarly, the ignored-end water mark of A is W(A) if A does not have a suspended end, and is W(A) − Σ_{e ∈ a_{t+1} ∪ b_{t+2} ∪ ··· ∪ a_k} t(e) if A does have a suspended end. These definitions will prove useful when defining the intermediate values maintained by our algorithm. Additionally, for any antichain A that contains an edge in some a_t, we define the ignored-end water mark of A to be the water mark of A; thus the ignored-end water mark is defined for all antichains A of C.

After having executed each of C_1, . . . , C_l, let l_1 be the index of the largest-indexed a_i executed and l_2 be the index of the largest-indexed b_i executed. We maintain the following intermediate values:

• MultiRobustSuspendEnd_l: This is the maximum suspended-end cost of any multi-edge candidate antichain A in C_1 ∪ ··· ∪ C_l containing only edges in b_1, . . . , b_{l_2}. If no such A exists, this is null. Note that here C_1 ∪ ··· ∪ C_l is treated as a multi-spawn component and the costs of the antichains are considered just within the graph C_1 ∪ ··· ∪ C_l, rather than the full graph C (which matters because we are considering the suspended-end cost of the antichain).

• MultiRobustIgnoreEndl: This is the maximum ignored-end cost of any multi-edge candidate antichain A in C1 ∪ · · · ∪ Cl. If no such A exists, this is null.

• SingleSuspendEnd_l: This is the maximum suspended-end cost of any single-edge antichain A in C_1 ∪ ··· ∪ C_l such that A contains only edges in b_1, . . . , b_{l_2}. (Again, we consider the suspended-end cost just within the graph C_1 ∪ ··· ∪ C_l.)

• SingleIgnoreEndl: This is the maximum ignored-end cost of any single-edge antichain A in C1 ∪ · · · ∪ Cl.

• RobustUnfinished_l: Let t be the largest t ≤ l_2 such that b_t is naturally active. Then RobustUnfinished_l is the sum of the natural contributions of b_1, . . . , b_t, along with t(a_0) + t(a_1) + ··· + t(a_{t−1}). If no such t exists, then RobustUnfinished_l is null. One should think of this as the contribution of b_1, . . . , b_t and a_0, . . . , a_{t−1} to any candidate antichain in C that contains at least one edge in b_{l_2+1}, . . . , b_k or a_{l_1+1}, . . . , a_k. (We separate this from the contribution of the edges in b_{t+1}, . . . , b_{l_2} and a_t, . . . , a_{l_1}, which are considered by the next quantity.)

• RobustUnfinishedTail_l: Let t be the largest t ≤ l such that b_t is naturally active, or 0 if no such t exists. Then RobustUnfinishedTail_l is the sum of the natural contributions of b_{t+1}, . . . , b_{l_2}, along with t(a_t) + t(a_{t+1}) + ··· + t(a_{l_1}). One should think of this as the contribution of b_{t+1}, . . . , b_{l_2} and a_t, . . . , a_{l_1} to any candidate antichain in C that contains at least one edge in b_{l_2+1}, . . . , b_k or a_{l_1+1}, . . . , a_k. The quantity RobustUnfinishedTail_l is handled separately from RobustUnfinished_l because if the candidate antichain contains only a single edge in b_{l_2+1}, . . . , b_k or a_{l_1+1}, . . . , a_k, then RobustUnfinishedTail_l can affect the local contribution of that edge.

• RunningMemTotal_l: This is $\sum_{i=0}^{l_1} t(a_i) + \sum_{i=1}^{l_2} t(b_i)$, the total sum of the edge totals over all edges in the components a_0, . . . , a_{l_1}, b_1, . . . , b_{l_2}.

• EmptyTail_l: This is $\sum_{i=0}^{l_1} t(a_i) + \sum_{i=1}^{l_2} \max(0, t(b_i))$. One should think of this as the contribution of a_0, . . . , a_{l_1}, b_1, . . . , b_{l_2} to any single-edge antichain in C whose edge lies in one of a_{l_1+1}, a_{l_1+2}, . . . or b_{l_2+1}, b_{l_2+2}, . . . .

Given the above variables for l = 2k + 1, one can compute

C.MemTotal = RunningMemTotal_{2k+1},

C.MultiRobust = max(MultiRobustSuspendEnd_{2k+1}, MultiRobustIgnoreEnd_{2k+1}), and

C.MaxSingle = max(SingleSuspendEnd_{2k+1}, SingleIgnoreEnd_{2k+1}).

Prior to beginning, we have l = 0, and have that MultiRobustSuspendEnd0 = null, MultiRobustIgnoreEnd0 = null, SingleSuspendEnd0 = null, SingleIgnoreEnd0 = null, RobustUnfinished0 = null, RobustUnfinishedTail0 = 0, RunningMemTotal0 = 0, and EmptyTail0 = 0. To complete the algorithm, we present the protocol for advancing l by one, and updating each of the intermediate values. Suppose for some odd l > 0 we are given the values of the above quantities for l − 1, and given the recursive values for a(l+1)/2. We obtain the new values for l as follows:

• Step 1: Simple Updates. We compute MultiRobustSuspendEnd_l as

MultiRobustSuspendEnd_{l−1} + a_{(l+1)/2}.MemTotal,

and SingleSuspendEnd_l as

SingleSuspendEnd_{l−1} + a_{(l+1)/2}.MemTotal.

We compute SingleIgnoreEnd_l as

max(SingleIgnoreEnd_{l−1}, a_{(l+1)/2}.MaxSingle + EmptyTail_{l−1}),

where the second entry in the maximum is the largest water mark of any single-edge antichain in C with an edge in a_{(l+1)/2}.

We set RobustUnfinished_l = RobustUnfinished_{l−1}. Finally, we increase RobustUnfinishedTail_l, RunningMemTotal_l, and EmptyTail_l by a_{(l+1)/2}.MemTotal over their values for l − 1 (where the outcome is null if they were previously null).

• Step 2: Computing MultiRobustIgnoreEnd_l. We update MultiRobustIgnoreEnd_l with Algorithm 3. The only antichains A that MultiRobustIgnoreEnd_l needs to consider, but that MultiRobustIgnoreEnd_{l−1} did not, are the candidate stripped robust antichains A containing at least one edge in a_{(l+1)/2}. The first if-statement checks whether any multi-edge candidate antichains exist in which a_{(l+1)/2} contributes only a single edge; this requires that RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MaxSingle > M/(2p) in order for the local contribution of the edge in a_{(l+1)/2} to exceed M/(2p), and that RobustUnfinished_{l−1} ≠ null so that the resulting antichain contains multiple edges.

The second if-statement considers candidate antichains in which a_{(l+1)/2} contributes multiple edges. If RobustUnfinished_{l−1} ≠ null, then the maximum water mark in C obtainable by such an antichain is RobustUnfinished_{l−1} + RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MultiRobust. If RobustUnfinished_{l−1} = null, then the maximum water mark in C obtainable is RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MultiRobust.

Suppose for some even l > 0 we are given the intermediate values for l − 1, and given the recursive values for b_{l/2}. We obtain the new values for l as follows:

• Step 1: Simple Updates. We compute SingleSuspendEnd_l as

max(SingleSuspendEnd_{l−1} + b_{l/2}.MemTotal, b_{l/2}.MaxSingle + EmptyTail_{l−1}),

and SingleIgnoreEnd_l as

max(SingleIgnoreEnd_{l−1}, b_{l/2}.MaxSingle + EmptyTail_{l−1}).

We compute RunningMemTotal_l as

RunningMemTotal_{l−1} + b_{l/2}.MemTotal.

Finally, we compute EmptyTail_l as

EmptyTail_l = EmptyTail_{l−1} + max(0, b_{l/2}.MemTotal).

• Step 2: Computing MultiRobustSuspendEnd_l and MultiRobustIgnoreEnd_l. We update MultiRobustSuspendEnd_l and MultiRobustIgnoreEnd_l with Algorithm 4. We begin by computing X, the largest ignored-end cost of any candidate stripped robust antichain in C that (1) contains multiple edges; (2) contains at least one edge in b_{l/2}; and (3) contains no edges in a_{l/2}, b_{l/2+1}, . . . , a_k. The first if-statement considers the case where the antichain has one edge in b_{l/2}, and the second considers the case where there are multiple such edges.

After computing X, we update MultiRobustSuspendEndl and MultiRobustIgnoreEndl based on X’s value.

• Step 3: Computing RobustUnfinished_l and RobustUnfinishedTail_l. We compute RobustUnfinished_l and RobustUnfinishedTail_l with Algorithm 5. We define m and t to be m(b_{l/2}) and t(b_{l/2}), as defined in Lemma 37. We then update RobustUnfinished_l and RobustUnfinishedTail_l appropriately based on the three cases in the lemma. (In the final case, we take the maximum of 0 and RobustUnfinished_{l−1} because if the latter is null, we wish to treat it as zero.)

This completes the recursion described in Section 7.5.2, allowing one to compute H^∘_∞(G) in an online manner (i.e., while executing the parallel program on a single thread) with constant time and space overhead.

Algorithm 3 Updating MultiRobustIgnoreEnd for a_{(l+1)/2}

MultiRobustIgnoreEnd_l = MultiRobustIgnoreEnd_{l−1};
if RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MaxSingle > M/(2p) and RobustUnfinished_{l−1} ≠ null
    MultiRobustIgnoreEnd_l = max(self, RobustUnfinished_{l−1} + RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MaxSingle);
if a_{(l+1)/2}.MultiRobust ≠ null
    if RobustUnfinished_{l−1} ≠ null
        MultiRobustIgnoreEnd_l = max(self, RobustUnfinished_{l−1} + RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MultiRobust);
    if RobustUnfinished_{l−1} = null
        MultiRobustIgnoreEnd_l = max(self, RobustUnfinishedTail_{l−1} + a_{(l+1)/2}.MultiRobust);

Algorithm 4 Updating MultiRobustSuspendEnd and MultiRobustIgnoreEnd for b_{l/2}

X = null;
if b_{l/2}.MaxSingle + RobustUnfinishedTail_{l−1} > M/(2p) and RobustUnfinished_{l−1} ≠ null
    X = b_{l/2}.MaxSingle + RobustUnfinishedTail_{l−1} + RobustUnfinished_{l−1};
if b_{l/2}.MultiRobust ≠ null
    if RobustUnfinished_{l−1} ≠ null
        X = max(X, RobustUnfinished_{l−1} + RobustUnfinishedTail_{l−1} + b_{l/2}.MultiRobust);
    if RobustUnfinished_{l−1} = null
        X = max(X, RobustUnfinishedTail_{l−1} + b_{l/2}.MultiRobust);
MultiRobustSuspendEnd_l = max(X, MultiRobustSuspendEnd_{l−1});
MultiRobustIgnoreEnd_l = max(X, MultiRobustIgnoreEnd_{l−1});

Algorithm 5 Updating RobustUnfinished and RobustUnfinishedTail for b_{l/2}

m = b_{l/2}.MultiRobust;
if b_{l/2}.MaxSingle > M/(2p)
    m = max(m, b_{l/2}.MaxSingle);
if m = null
    m = 0;
t = b_{l/2}.MemTotal;
if t > 0 and m ≤ t + M/(2p)
    RobustUnfinished_l = RobustUnfinished_{l−1};
    RobustUnfinishedTail_l = RobustUnfinishedTail_{l−1} + t;
if t ≤ 0 and m ≤ M/(2p)
    RobustUnfinished_l = RobustUnfinished_{l−1};
    RobustUnfinishedTail_l = RobustUnfinishedTail_{l−1};
if m ≥ max(0, t) + M/(2p)
    RobustUnfinished_l = max(0, RobustUnfinished_{l−1}) + RobustUnfinishedTail_{l−1} + m;
    RobustUnfinishedTail_l = 0;
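For concreteness, the following is a direct C++ rendering of Algorithm 5 (a sketch only: null is encoded as a large negative sentinel, M/(2p) is computed with integer division, and the three if-statements are treated as the mutually exclusive cases of Lemma 37).

#include <algorithm>
#include <limits>

// Direct rendering of Algorithm 5 (illustrative only). Values are in bytes;
// "null" is encoded as a large negative sentinel; M and p are the analysis
// parameters. RobustUnfinishedTail is never null, so it is a plain value.
using Val = long long;
const Val kNull = std::numeric_limits<Val>::min() / 4;

struct ComponentValues {  // recursive values reported for component b_{l/2}
  Val memTotal, maxSingle, multiRobust;
};

void updateRobustUnfinished(const ComponentValues& b, Val M, Val p,
                            Val& robustUnfinished,  // may be kNull
                            Val& robustUnfinishedTail) {
  const Val threshold = M / (2 * p);
  // m and t as defined in Lemma 37.
  Val m = b.multiRobust;
  if (b.maxSingle > threshold) m = std::max(m, b.maxSingle);
  if (m == kNull) m = 0;
  const Val t = b.memTotal;

  if (t > 0 && m <= t + threshold) {
    // b is a natural companion: it contributes t to the running tail.
    robustUnfinishedTail += t;
  } else if (t <= 0 && m <= threshold) {
    // b is naturally dormant: nothing changes.
  } else {
    // b is naturally active: fold the tail (and any earlier prefix) into
    // RobustUnfinished, then reset the tail.
    robustUnfinished =
        std::max<Val>(0, robustUnfinished) + robustUnfinishedTail + m;
    robustUnfinishedTail = 0;
  }
}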

Chapter 8

Optimal Reissue Policies for Reducing Tail-Latency

This chapter presents work on the design and formulation of the SingleR policy family for reducing tail-latency in distributed request-response workflows by judiciously sending redundant requests. The work presented in this chapter was conducted in collaboration with Yuxiong He and Sameh Elnikety.

Abstract

Interactive services send redundant requests to multiple different replicas to meet stringent tail-latency requirements. These additional (reissue) requests mitigate the impact of nondeterministic delays within the system and thus increase the probability of receiving an on-time response. There are two existing approaches to using reissue requests to reduce tail latency: (1) reissue requests immediately to one or more replicas, which multiplies the load and runs the risk of overloading the system; or (2) reissue requests if they have not completed after a fixed delay. The delay helps to bound the number of extra reissue requests, but it also reduces the chance for those requests to respond before a tail-latency target.

We introduce a new family of reissue policies, Single-Time / Random (SingleR), that reissue requests after a delay d with probability q. SingleR employs randomness to bound the reissue rate while allowing requests to be reissued early enough that they have sufficient time to respond, exploiting the benefits of both the immediate and delayed reissue strategies of prior work. We formally prove, within a simplified analytical model, that SingleR is optimal even when compared to more complex policies that reissue multiple times.

To use SingleR for interactive services, we provide efficient algorithms for calculating the optimal reissue delay and probability from response-time logs through a data-driven approach. We apply iterative adaptation for systems with load-dependent queuing delays. The key advantage of this data-driven approach is its wide applicability and effectiveness for systems with various design choices and workload properties.

We evaluated SingleR policies thoroughly. We use simulation to illustrate their internals and demonstrate their robustness to a wide range of workloads. We conduct system experiments on the Redis key-value store and the Lucene search server. The results show that for utilizations ranging from 40-60%, SingleR reduces the 99th-percentile latency of Redis by 30-70% by reissuing only 2% of requests, and the 99th-percentile latency of Lucene by 15-25% by reissuing only 1%.

8.1 Introduction

Interactive online services, such as web search, financial trading, and games, require consistently low response times to attract and retain users [144, 308]. Service providers therefore define strict targets for tail latencies (95th-percentile, 99th-percentile, or higher response times) [85, 87, 148, 349] to deliver consistently fast responses to user requests. For many distributed and layered services, a request can span several servers whose responses are aggregated, in which case the slower servers typically dominate the response time [194]. As a result, tail latencies are more suitable performance metrics than averages in latency-sensitive applications that employ concurrency.

Variability in a service's response time can lead to tail latencies that are several orders of magnitude larger than the average or median. Rare work-intensive requests can have a disproportionate impact on tail latency by causing other requests to be delayed. Other, often nondeterministic, factors also play a significant role: random load balancing can lead to short-term skew between machines; background tasks on servers can lead to temporary shortages in CPU cycles, memory, and disk bandwidth; and network congestion can increase latency and reduce the throughput of communication channels. Reducing tail latency, influenced by all of these contributing factors, is challenging.

The judicious use of redundant computation is often a highly effective technique for reducing tail latency in interactive services. The basic idea is to exploit inter-machine parallelism by sending multiple copies of a request to replicated servers in order to boost the probability of receiving at least one timely response. This technique is widely used by interactive services, yet despite its prevalence there has been little guidance on optimizing its usage.

We develop a methodology for designing reissue policies that is composed of three steps. First, we define several families of reissue policies of varied complexity. These reissue policies are parametrized by variables such as: (a) whether to reissue a request, (b) when to reissue a request, and (c) how many times to reissue a request. We choose an optimal family of policies among the candidates, guided by a theoretical analysis under a simplified model where the system's response-time distributions are static. Second, we provide an algorithm to find the optimal values for the policy's parameters using response-time logs, solving the constrained optimization problem efficiently. Third, we provide iterative algorithms for refining a policy's parameters in response to changes in system load, and for adjusting the total fraction of requests that are reissued to minimize tail latency.

Related work and challenges. The technique of reissuing latency-sensitive requests is not new. It has been employed by a wide variety of systems, such as key-value stores [327, 67, 204, 347], distributed request-response workflows [176], DNS lookup [339, 7], TCP flows [348, 111], and web search [85]. Existing systems that reissue requests to reduce tail latency predominantly employ one of two strategies.

For systems that run at low utilization, the common approach is to perform immediate reissue of requests — i.e., dispatch multiple copies of all requests. The effectiveness of immediate reissue has been investigated in previous studies [327, 339, 348, 111]. The primary advantage of the immediate-reissue approach is that all copies of a request have an equal chance to respond before a tail-latency deadline, since they are dispatched at the same time. This advantage is a motivation within RepFlow [348] for employing immediate reissue for

the replication of short TCP flows (under 100 KB). The disadvantage of immediate reissue, however, is that its impact on overall load renders it ineffective for systems with moderate and high utilization. A recent study in [339] on memcached, for example, shows that immediate reissue can degrade performance at utilizations as low as 10%.

For systems that run at higher utilization, an alternative approach is to perform delayed reissue of requests [85, 83, 347, 176] — i.e., dispatch a second copy of a request after a delay d, which we refer to as the Single-Time / Deterministic policy, or SingleD. The SingleD policy family corresponds to the scheme proposed in "The Tail at Scale" by Dean and Barroso [85], where, for example, the delay d could be chosen using the 95th-percentile latency of the workload. The advantage of delayed reissue is that we save the cost of reissuing the requests that would respond quickly anyway. If the delay d is picked to be too large, however, then there may not be sufficient time for a reissue request to respond before the latency target.

On the analytical side, prior work has studied only immediate reissue for average latency under very specific arrival and service-time distributions. Joshi et al. [181, 182] study the impact of immediate reissuing on log-concave and log-convex service-time distributions. Gardner et al. [118] present an exact analysis of immediate reissue for Poisson arrivals and exponential service times. Lee et al. [214] consider minimizing average latency by reissuing requests with a known cancellation overhead. Shah et al. [310] analyze the effectiveness of immediate reissuing in the MDS queue model.

When it comes to developing effective reissue policies for reducing tail latency on a wide range of workloads and systems, many questions remain largely unanswered. The problem is challenging for multiple reasons: (1) The impact of reissuing is complex: one must weigh the odds of reducing tail latency by sending a duplicate request against the increase in system utilization caused by adding load. (2) There is a large search space with many different choices of which requests to reissue and when. (3) The complex and different workload properties of various interactive services, such as service-time distributions, arrival patterns, request correlations, and system settings, make it difficult to derive general strategies for reducing tail latency. (4) Analytical work using queueing theory is challenging even when making strong assumptions about response-time distributions (e.g., drawn from the exponential family), and conclusions drawn from such simple models are hard to generalize to more complex systems.

Methodology and key results. The goal of our work is to find a reissue policy that minimizes a workload's kth-percentile tail latency by issuing a fixed percentage (or budget) of redundant requests. We explore the space and devise reissue policies in a principled manner — directed by theoretical analysis to identify the key insights of effective reissue policies, and driven by empirical data from actual systems for wide applicability.

We introduce a new family of reissue policies, Single-Time / Random (SingleR), that reissue requests after a delay d with probability q. The use of randomness in SingleR provides an important degree of freedom that allows us to bound the reissue budget while also ensuring that reissue requests have sufficient time to respond, exploiting the benefits of both the immediate and delayed reissue strategies of prior work.

Using a simplified analytical model, we formally prove that SingleR achieves the optimal trade-off between the immediate and delayed reissue strategies. More precisely, we define the Multiple-Time / Random (MultipleR) policies, which reissue requests multiple times

with different delays and reissue probabilities. We prove that, surprisingly, the optimal policies in MultipleR and SingleR are equivalent. This is a powerful result: it restricts the complexity of reissue policies to a single reissue while guaranteeing the effectiveness of SingleR.

Next, we show how to apply SingleR to interactive services through a data-driven approach that efficiently finds the appropriate parameters, the reissue time and probability, given sampled response times of the workload. Our approach takes into account correlations between primary- and reissue-request response times. It is computationally efficient, finding optimal values of the parameters in close to linear time with respect to the data size. Moreover, we show how to devise reissue policies for systems that are sensitive to added load by adaptively refining a reissue policy in response to feedback from the system. This method remains oblivious to many system design details and relies on iterative adaptation to discover a system's response-time distributions and its response to added load. This data-driven approach is performed in a principled manner: every refined policy is the solution to a well-defined optimization problem based on updated response-time distributions, applicable to a wide range of workloads with varying properties.
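As a simple illustration of this data-driven idea (not the chapter's actual optimization algorithm), the following C++ sketch estimates the tail latency achieved by a given SingleR policy (d, q) from a log of sampled response times. It makes two simplifying assumptions that the full approach relaxes: primary and reissue response times are independent draws from the log, and a reissue is sent only if the request is still outstanding at time d. The function and parameter names are illustrative.

#include <algorithm>
#include <random>
#include <vector>

// Monte Carlo estimate of the tail latency achieved by a SingleR(d, q)
// policy, given a log of observed response times (e.g., in milliseconds).
// Assumes a non-empty log; percentile is a fraction, e.g., 0.99.
double estimateTailLatency(const std::vector<double>& responseLog,
                           double d, double q, double percentile,
                           int trials = 200000) {
  std::mt19937 rng(12345);
  std::uniform_int_distribution<std::size_t> pick(0, responseLog.size() - 1);
  std::bernoulli_distribution sendReissue(q);

  std::vector<double> latencies;
  latencies.reserve(trials);
  for (int i = 0; i < trials; ++i) {
    double primary = responseLog[pick(rng)];
    double latency = primary;
    if (primary > d && sendReissue(rng)) {
      // The reissue starts at time d and takes its own (independently drawn)
      // response time; the request finishes when the first copy responds.
      double secondary = d + responseLog[pick(rng)];
      latency = std::min(primary, secondary);
    }
    latencies.push_back(latency);
  }
  std::sort(latencies.begin(), latencies.end());
  std::size_t idx =
      static_cast<std::size_t>(percentile * (latencies.size() - 1));
  return latencies[idx];
}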

Empirical evaluation

We illustrate the properties of SingleR using both simulation and system experiments. Through careful simulation, we illustrate two key points: 1) the use of randomization in SingleR is especially important for workloads with correlated service times and queueing delays; and 2) the effectiveness of SingleR is robust to varied workload properties and design choices including utilization, service-time distribution, target latency percentiles, service-time correlations, and load-balancing/request-prioritization strategies. We also evaluate SingleR using two distributed systems based on Redis [358] and Lucene enterprise search [237]. We demonstrate that, on a wide range of utilizations from 20-60%, SingleR is able to reduce tail-latency significantly while reissuing only a small number of requests. Even at 40-60% utilization, which is high for interactive services, SingleR reduces the 99th-percentile latency of Redis by 30-70% while reissuing only 2% of requests, and the 99th-percentile latency of Lucene by 15-25% while reissuing just 1% of requests.

Summary of contributions

1. We introduce the SingleR reissue policy family that reissues requests after a delay d with probability q. It exploits randomness to permit the timely reissue of requests with a bounded budget, achieving the benefits of both immediate and delayed reissue (Section 8.2).

2. We prove within a simplified analytical model that the optimal policies in MultipleR and SingleR are equivalent. Reissuing more than once does not offer additional benefit — SingleR is simple and effective (Section 8.3).

3. We show how to apply SingleR to interactive services by providing efficient algorithms for obtaining the reissue delay and probability parameters from response-time logs (Section 8.4).

4. We evaluate SingleR using both simulation and system experiments on the Redis key-value store and the Lucene search server (Section 8.5 and Section 8.6).

Note that our methodology for developing reissue policies utilizes multiple performance models of increasing complexity. This is a strategic choice that allows us to make definitive design choices that are guided by theoretical insights. The proof that SingleR is optimal relative to SingleD and MultipleR operates in a simplified model in which policies reissue only a fixed fraction of requests, and in which the service’s response-time distributions are static and uncorrelated. This simplified model allows us to address questions about the general structure of reissue policies that are otherwise intractable. Our algorithms for finding the optimal SingleR policy for a specific interactive service operate in a less constrained model where response-times may be correlated. Our techniques for adaptively refining SingleR policies operate in a more general model in which a system may have load-dependent queueing delays — i.e. reissue requests perturb the response-time distribution. This sequence of decisions with respect to performance models is not arbitrary: as shown in the empirical analysis of SingleR on simulated workloads in Section 8.5 and on real-world systems in Section 8.6, these steps lead to effective reissue policies, and the insights gained in simpler models are readily recognizable in our empirical results.

8.2 Deterministic versus random reissue

In this section, we introduce the Single-Time / Random (SingleR) policies, which reissue a request with probability q after a delay d. We show how the incorporation of randomness within SingleR policies enables requests to be reissued earlier while still meeting a specified reissue budget. This allows SingleR to reduce tail-latency significantly even when constrained by a small reissue budget.

This section is organized as follows. Section 8.2.1 presents the model and terminology. Section 8.2.2 defines the Single-Time / Deterministic (SingleD) policies, which formalize the “delayed reissue” strategy of prior work. We present SingleR policies in Section 8.2.3 and discuss their benefits over SingleD in Section 8.2.4.

8.2.1 Model and terminology

We shall, for the moment, operate within a simplified performance model in which there are no queueing delays and query response-times are independent and identically distributed. Later, in Section 8.4.2, we describe how these limitations can be overcome to adapt our techniques to workloads with correlated response-times and queueing delays.

Formally, we consider an interactive workload to be a collection of queries where each query is composed of exactly one primary request that is dispatched at time t = 0 and zero or more reissue requests dispatched at times d ≥ 0. The response-time of a query is the length of time between the dispatch of the primary request and the arrival of the first reply from either the primary or a reissue request. The reissue rate of a workload consisting of N queries and M reissue requests is defined as the ratio M/N. We look for a reissue policy that minimizes a workload’s kth percentile tail-latency with the reissue rate equal to a given reissue budget B.
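To make this terminology concrete, the following minimal Python sketch (illustrative only; the Pareto parameters are assumptions, not taken from the thesis) simulates a workload in which every query still outstanding at a fixed delay d is reissued once, and reports the empirical reissue rate M/N and the kth-percentile response time.

import random

def sample_response_time(shape=1.1, mode=2.0):
    # Illustrative Pareto-distributed response time (parameters are assumptions).
    return random.paretovariate(shape) * mode

def simulate(num_queries=100_000, d=20.0, k=0.95):
    latencies = []
    reissues = 0
    for _ in range(num_queries):
        x = sample_response_time()         # primary response time
        if x > d:                          # reissue only if still outstanding at time d
            reissues += 1
            y = sample_response_time()     # reissue response time (independent here)
            latencies.append(min(x, d + y))
        else:
            latencies.append(x)
    latencies.sort()
    return reissues / num_queries, latencies[int(k * num_queries)]

if __name__ == "__main__":
    rate, p95 = simulate()
    print(f"reissue rate M/N = {rate:.3f}, 95th-percentile latency = {p95:.1f}")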

8.2.2 The SingleD policies

The Single-Time / Deterministic (SingleD) policy family is a 1-parameter family of policies that is parametrized by a reissue delay d. A SingleD policy reissues a request

if a response has not been received after d seconds. Let the random variable X denote the response time of the primary request and Y denote the response time of the reissue request. A query Q completes before time t if its primary response-time X is less than t, or if the reissue request response-time Y is less than t − d. The probability that the query Q responds before time t is given by Equation (8.1).

Pr(Q ≤ t) = Pr(X ≤ t) + Pr(X > t)Pr(Y ≤ t − d) (8.1)

The expected number of reissue requests created by a SingleD policy is equal to the number of primary requests that respond after time d, i.e., the reissue budget is

B = Pr(X > d) . (8.2)

Therefore, if a system can tolerate 10% additional requests, then the delay d is chosen for the SingleD policy such that Pr(X > d) = 1/10. The smaller the delay d, the more requests are reissued, and the higher the budget B.
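As a small illustration (a sketch, not code from the thesis), the SingleD delay for a given budget can be read directly off the empirical distribution of primary response times as its (1 − B)-quantile:

def singled_delay(primary_times, budget):
    # Pick the SingleD delay d so that roughly a `budget` fraction of primary
    # requests are still outstanding at time d, i.e. Pr(X > d) is about `budget`.
    xs = sorted(primary_times)
    idx = min(len(xs) - 1, int((1.0 - budget) * len(xs)))
    return xs[idx]

# Example: with a 10% budget, d is (approximately) the 90th percentile of X.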

8.2.3 The SingleR policies

The Single-Time / Random (SingleR) policy family is a 2-parameter family of policies that is parametrized by a reissue delay d and a reissue probability q. A SingleR policy reissues a request with probability q if a response has not been received after d seconds. A query Q responds before time t if the primary request responds before time t, or if a reissue request was created and its response time is less than t − d. The probability that Q completes before time t while employing SingleR is given by Equation (8.3).

Pr(Q ≤ t) = Pr(X ≤ t) + q · Pr(X > t)Pr(Y ≤ t − d) (8.3)

The reissue budget is

B = q · Pr(X > d) .   (8.4)

Given Equation (8.3) and Equation (8.4), we write the constrained optimization problem which identifies the reissue delay and probability parameters of the optimal SingleR policy given the primary and reissue response-time distributions X and Y.

Optimal policy for SingleR

Given tail-latency percentile k, a reissue budget B, and policy family SingleR

minimize over d, q:    t
subject to:    Pr(X ≤ t) + q · Pr(X > t)Pr(Y ≤ t − d) ≥ k,
               q · Pr(X > d) ≤ B
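To make the two constraints concrete, the following sketch (illustrative, not from the thesis; it assumes independent primary and reissue response times) checks whether a candidate pair (d, q) meets a latency target t and the budget B on sampled response times.

def feasible(xs, ys, d, q, t, k, budget):
    # xs, ys: sampled primary and reissue response times.
    n = len(xs)
    pr_x_le_t = sum(1 for x in xs if x <= t) / n
    pr_x_gt_d = sum(1 for x in xs if x > d) / n
    pr_y_le = sum(1 for y in ys if y <= t - d) / len(ys)
    success = pr_x_le_t + q * (1.0 - pr_x_le_t) * pr_y_le   # Equation (8.3)
    spend = q * pr_x_gt_d                                   # Equation (8.4)
    return success >= k and spend <= budget

The optimal SingleR policy is then the (d, q) pair that admits the smallest feasible t; Section 8.4 gives an efficient algorithm for finding it from response-time logs.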

8.2.4 Randomization is essential

The use of randomization in SingleR allows the reissue budget, and thus the added resource usage and system load, to be bounded while also ensuring that requests can be reissued early enough that they have sufficient time to respond. This is not always possible under SingleD, as we illustrate in the following example.

Suppose, for example, that we want to minimize a workload’s 95th percentile tail-latency by reissuing no more than 5% of all queries. Clearly, this cannot be achieved using a SingleD policy: its limited reissue budget forces it to reissue requests later than the original 95th percentile tail-latency. In general, a SingleD policy cannot reduce any workload’s kth percentile latency with a budget B < 1 − k. Randomization is an essential part of an effective reissue policy.

8.3 Single versus multiple reissue

As we saw in Section 8.2, randomness provides SingleR policies an important degree of freedom that enables a continuous trade-off between the advantages of immediate and delayed reissuing. A natural question arises: can we obtain an even better policy family by introducing additional degrees of freedom? In this section, we address this question by introducing MultipleR policies that can reissue requests more than once, at multiple different times, and with different probabilities. We prove a surprising fact: for a given reissue budget B and tail-latency percentile k, the optimal MultipleR and SingleR policies achieve the same tail-latency reduction. Note that we continue to operate in the simplified model described in Section 8.2.1 in which there are no queueing delays and query response-times are independent and identically distributed. These limitations will be lifted in Section 8.4.2 as we show how to adapt SingleR policies to handle correlated response-times and queueing delays.

8.3.1 Multiple time policies

The Multiple-Time / Random (MultipleR) policy family contains policies that can reissue requests multiple times. A policy that reissues a request at most n times consists of a sequence of n delays d1, d2, . . . , dn and n probabilities q1, q2, . . . , qn. Like SingleR, the MultipleR family explores the space between two extremes — the “immediate reissue” and “delayed reissue” strategies. Specifically, the reissue times di of a MultipleR policy lie between 0, the time of immediate reissue, and d′, the time selected by a “delayed reissue” SingleD policy, where Pr(X > d′) = B. For any di, since di ≤ d′, the following condition holds:

Pr(X > di) ≥ B.   (8.5)

For the purpose of our later arguments, we also define the Double-Time / Random (DoubleR) policy family. The DoubleR family is a subset of MultipleR and contains policies that reissue requests at most twice.

8.3.2 Single is optimal

We prove the optimality of SingleR in two steps: (1) We show in Theorem 44 that the optimal policies in the SingleR and DoubleR families achieve identical tail-latency reduction; (2) We then prove a generalization in Theorem 45 for MultipleR policies that have n > 2 reissue times.

Theorem 44 The optimal SingleR and DoubleR reissue policies achieve the same kth percentile tail-latency when given the same reissue budget B.

Proof. Consider the optimal SingleR policy with budget B that minimizes t, the kth percentile tail-latency. Suppose that this policy reissues requests at time d∗. Then, the probability that a query using the optimal SingleR policy responds before time t is given by Equation (8.6) below.

Pr(Q ≤ t) = Pr(X ≤ t) + G∗SR   (8.6)

where

G∗SR = (B / Pr(X > d∗)) · Pr(X > t)Pr(Y ≤ t − d∗) .   (8.7)

The first term Pr(X ≤ t) is the probability that the primary request returns before the tail-latency deadline. The term G∗SR corresponds to the case in which the primary request misses the deadline, but the reissue request responds on time.

Now consider a DoubleR policy with reissue times d1, d2 and reissue probabilities q1, q2. The probability that a query using this policy responds before time t is given by Equation (8.8) below.

Pr(Q ≤ t) = Pr(X ≤ t) + G1 + G2   (8.8)

where

G1 = q1 Pr(X > t)Pr(Y1 ≤ t − d1)   (8.9)

G2 = q2 (1 − q1 Pr(Y1 ≤ t − d1)) Pr(X > t)Pr(Y2 ≤ t − d2)   (8.10)

The term G1 corresponds to the case in which the primary request misses the deadline, but the first reissue request responds on time. Lastly, the third term G2 corresponds to the case in which both the primary and first reissue request miss the deadline, but the second reissue request responds on time.

We shall show that G1 + G2 ≤ G∗SR. Once this has been shown, it follows that no DoubleR policy can achieve a lower tail-latency than a SingleR policy with the same budget.

First, we provide a bound on G1. Consider a SingleR policy that reissues requests at time d1 with probability B/Pr(X > d1). Using this policy, the probability that a query returns before time t is given by

Pr(Q ≤ t) = Pr(X ≤ t) + GSR,1   (8.11)

where

GSR,1 = (B / Pr(X > d1)) · Pr(X > t)Pr(Y ≤ t − d1) .   (8.12)

Since G∗SR is achieved by the optimal policy for a budget B, we have that

GSR,1 ≤ G∗SR .   (8.13)

Multiplying both sides of Inequality (8.13) by q1 Pr(X > d1)/B gives us the upper bound on G1 shown in Inequality (8.14).

G1 ≤ (q1 Pr(X > d1)/B) · G∗SR   (8.14)

Second, we provide an upper bound on G2. We begin by formulating an upper bound on G2 that is a function of q1. This requires a sequence of observations. We note that the budget constraint for the DoubleR policy

implies the following inequality:

q1Pr(X > d1) + q2Pr(X > d2)(1 − q1Pr(Y1 ≤ d2 − d1)) ≤ B (8.15)

Then, given q1, Inequality (8.15) implies the following upper bound on q2:

q2 ≤ (B − q1 Pr(X > d1)) / (Pr(X > d2)(1 − q1 Pr(Y1 ≤ d2 − d1))) .   (8.16)

Finally, we incorporate this bound on q2 into the expression for G2 given in Equation (8.10) to obtain an upper bound on G2 as a function of q1.

G2 ≤ γ · ((B − q1 Pr(X > d1)) / Pr(X > d2)) · Pr(X > t)Pr(Y2 ≤ t − d2)   (8.17)

where γ = (1 − q1 Pr(Y1 ≤ t − d1)) / (1 − q1 Pr(Y1 ≤ d2 − d1)). Note that γ is at most 1 since d2 is less than t, which allows us to omit γ in Inequality (8.17) to obtain a simpler (albeit weaker) upper bound on G2.

Now consider a SingleR policy that reissues at time d2 with probability B/Pr(X > d2). The probability that a query using this policy responds before time t is given by:

Pr(Q ≤ t) = Pr(X ≤ t) + GSR,2   (8.18)

where

GSR,2 = (B / Pr(X > d2)) · Pr(X > t)Pr(Y2 ≤ t − d2) .   (8.19)

We have for all positive a that a · GSR,2 ≤ a · G∗SR. Let a = 1 − q1 Pr(X > d1)/B, which is strictly positive since the budget constraint on the DoubleR policy implies the inequality q1 Pr(X > d1) < B. Then, combining Equation (8.19) and Inequality (8.17) we have that

G2 ≤ (1 − q1 Pr(X > d1)/B) · GSR,2
   ≤ (1 − q1 Pr(X > d1)/B) · G∗SR .   (8.20)

Together, the upper bounds on G1 and G2 imply that G1 + G2 ≤ G∗SR, completing the proof.
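As an informal, empirical sanity check of this result (not part of the thesis), one can simulate single-time and two-time random reissue policies with roughly matched budgets under independent Pareto response times; Theorem 44 predicts that the two-time policy should offer no advantage over the best single-time policy.

import random

def pareto(shape=1.1, mode=2.0):
    # Illustrative Pareto response time; the parameters are assumptions.
    return random.paretovariate(shape) * mode

def tail_latency(policy, n=200_000, k=0.95):
    # policy: list of (delay, probability) pairs sorted by delay. A reissue is sent
    # at time d only if no response has arrived by then; the query latency is the
    # earliest completion among the requests that were actually issued.
    latencies = []
    for _ in range(n):
        earliest = pareto()                  # primary completes at its response time
        for d, q in policy:
            if earliest > d and random.random() < q:
                earliest = min(earliest, d + pareto())
        latencies.append(earliest)
    latencies.sort()
    return latencies[int(k * n)]

if __name__ == "__main__":
    random.seed(0)
    # Budgets are only roughly matched; a careful check would equalize measured reissue rates.
    print("SingleR (d=25, q=0.5):", tail_latency([(25.0, 0.5)]))
    print("DoubleR (two times):  ", tail_latency([(15.0, 0.3), (30.0, 0.3)]))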

Theorem 45 The optimal SingleR and MultipleR reissue policies achieve the same kth percentile tail-latency when given the same reissue budget B.

Proof. Assume as an inductive hypothesis that the theorem holds for n- and (n + 1)-time MultipleR policies. The base cases for 1-time and 2-time MultipleR policies follow from Theorem 44.

Consider an optimal (n + 2)-time MultipleR policy Pn+2 with reissue times d1, . . . , dn+2. To complete the inductive argument, we will show that there exists an (n + 1)-time MultipleR policy with reissue times d1, . . . , dn, d′ that achieves the same kth percentile tail-latency.

Let Pn be the n-time MultipleR policy obtained by taking the first n reissue times and reissue probabilities in Pn+2. The policy Pn consumes budget αB ≤ B, where α ≤ 1. Let Q[Pn] be a random variable representing the response-time distribution of a query reissued using policy Pn.

Let us now transform the original problem into a new but equivalent problem of minimizing the kth percentile tail-latency of a workload W′ with primary response-time distribution Q[Pn] and reissue response-time distribution Y. We want to show that, for the workload W′, a reissue policy with budget (1 − α)B that reissues at times dn+1 and dn+2 is a DoubleR policy. In particular, we want to show that its budget and reissue times satisfy the condition of Inequality (8.5) in the definition of MultipleR, i.e., that the following two inequalities hold:

Pr(Q[Pn] ≥ dn+1) ≥ (1 − α)B (8.21)

Pr(Q[Pn] ≥ dn+2) ≥ (1 − α)B (8.22)

In order to show that Inequality (8.21) and Inequality (8.22) hold, we use the induction hypothesis for n-time MultipleR policies to obtain a lower bound on Pr(Q[Pn] ≥ dn+1) and Pr(Q[Pn] ≥ dn+2).

Let k′ = 1 − Pr(Q[Pn] > dn+1), so that dn+1 is the k′th percentile tail-latency of Q[Pn]. Consider the original workload W with primary response-time X and reissue response-time Y. By the induction hypothesis for n-time MultipleR policies, there exists a SingleR policy PSR with budget αB that achieves a k′th percentile tail-latency that is at most dn+1. Suppose that PSR reissues requests at time d∗. Then, we have that

Pr(Q[Pn] > dn+1) ≥ Pr(Q[PSR] > dn+1)   (8.23)

and that

Pr(Q[PSR] > dn+1) / Pr(X > dn+1) = 1 − αB · Pr(Y ≤ dn+1 − d∗) / Pr(X > d∗) .   (8.24)

By the definition of MultipleR we have that Pr(X > dn+1) ≥ B and by the definition of SingleR that Pr(X > d∗) ≥ B. Together with Inequality (8.23) this implies that

Pr(Q[Pn] > dn+1) ≥ Pr(Q[PSR] > dn+1) ≥ (1 − α)B (8.25)

This proves that Inequality (8.21) holds. The proof that Inequality (8.22) holds follows an identical argument. Therefore, we have shown that for the workload W′ the policy that reissues requests at times dn+1 and dn+2 is a DoubleR policy. By Theorem 44 it follows that there exists a SingleR policy that reissues at some time d′ and achieves the same kth percentile tail-latency as this DoubleR policy. We can, therefore, replace the (n + 2)-time MultipleR policy with an (n + 1)-time MultipleR policy that reissues at times d1, . . . , dn, d′ and achieves the same kth percentile tail-latency — completing the proof.

Analysis with Correlation. The analysis in Theorem 44 may be extended (with additional assumptions) to the case in which primary and reissue response times are correlated. Consider a DoubleR policy that reissues requests at times d1, d2, and let Q1 denote the response time of the query when considering only the primary and the first reissue request (issued at time d1). Then the analysis in Theorem 44 holds if a) Pr(Y2 ≤ t − d2 | Q1 > t) ≤

Pr(Y2 ≤ t − d2 | X > t), and b) Pr(Y1 ≤ d2 − d1 | X > d2) ≤ Pr(Y1 ≤ t − d1 | X > t). The first assumption (a) is fairly modest and is employed to simplify Inequality (8.15). Intuitively, assumption (a) states that the likelihood of a second reissue request responding before time t − d2 decreases (or is unchanged) if the first reissued request fails. The second assumption (b) is a technical requirement that allows our proof to use the budget constraint in Inequality (8.15) in the correlated case. Specifically, assumption (b) ensures that γ in Inequality (8.17) is at most 1. Informally, assumption (b) states that the positive correlation between primary and reissue response-times is weaker in the tail of the distribution (i.e. near time t) than it is near the reissue times d1, d2. We note that in the case where assumption (b) fails to hold, derived bounds on γ can still be used to obtain competitive ratios.

The optimality of SingleR is a powerful result: it restricts the complexity of reissue policies to a single reissue time while guaranteeing their effectiveness.

8.4 SingleR for interactive services

This section presents how to use SingleR for interactive services: we use a data-driven approach to efficiently find the appropriate parameters, the reissue time and probability, given sampled response times of the workload. We develop the parameter-search algorithm in three steps. (1) We start from a simple model in Section 8.4.1, assuming the response times of primary and reissue requests are independent. We present an algorithm, ComputeOptimalSingleR, that computes the optimal reissue time and probability, minimizing tail latency. Our algorithm is computationally efficient, taking O(N log N) time where N is the number of response-time samples. (2) We extend the algorithm in Section 8.4.2 to incorporate correlation between reissue and primary requests, guaranteeing optimality of the parameter selection while offering the same computational efficiency of O(N log N). (3) In Section 8.4.3, we show how to adaptively refine a SingleR policy to take into account additional queueing delays introduced to the system by the reissue requests.

8.4.1 Parameter search

The ComputeOptimalSingleR(RX, RY, k, B) procedure (in Figure 8-1) computes the optimal SingleR policy to minimize the kth percentile tail-latency of an interactive service with reissue budget B. The response-time distributions for the service are represented using two sets of samples: a set RX of response times for primary requests; and a set RY of response times for reissued requests, accommodating the cases in which these distributions differ, e.g., when reissue requests are executed using dedicated or specialized resources. The output of the procedure is the reissue time d∗ and the reissue probability q for the SingleR policy.

ComputeOptimalSingleR searches for the optimal reissue time. We preserve the following invariant throughout the procedure: the SingleR policy that reissues requests at time d∗ achieves a kth percentile tail-latency of at most t. The procedure begins on lines 2–3 by selecting a trivial policy that reissues all requests at time d∗ ← min{RX} and achieves a kth percentile tail-latency of t ← max{RX}. A search is then performed on lines 4–12 for each reissue time d ∈ RX to determine if the SingleR policy reissuing at time d achieves a kth percentile tail-latency smaller than t. For each time d, the success rate α of the SingleR policy that reissues at time d is computed on line 7, which is the probability that a query is serviced before time t. If this success rate is greater than the tail-latency

ComputeOptimalSingleR(RX, RY, k, B):

1   Q ← RX
2   d∗ ← min{Q}
3   t ← max{Q}
4   while Q ≠ ∅
5       d ← min{Q}
6       Q ← Q − {d}
7       α ← SingleRSuccessRate(RX, RY, B, t, d)
8       while α > k and t > d
9           Q ← Q − {t}
10          t ← max{Q}
11          d∗ ← d
12          α ← SingleRSuccessRate(RX, RY, B, t, d)
13  q ← B / (1 − DiscreteCDF(RX, d∗))
14  return (d∗, q)

SingleRSuccessRate(RX , RY , B, t, d):

1   Pr(X ≤ t) ← DiscreteCDF(RX, t)
2   Pr(X > d) ← 1 − DiscreteCDF(RX, d)
3   Pr(Y ≤ t − d) ← DiscreteCDF(RY, t − d)
4   q ← B / Pr(X > d)
5   α ← Pr(X ≤ t) + q · (1 − Pr(X ≤ t)) · Pr(Y ≤ t − d)
6   return α

DiscreteCDF(R, t):
1   s ← |{x ∈ R : x < t}|
2   return s/|R|

Figure 8-1: Pseudocode for the data-driven algorithm for finding the optimal SingleR policy.

percentile target k, we set d∗ ← d and decrease t to max{RX − {t}} while preserving the invariant. This iterative refinement of the policy is repeated on lines 8–12 until the success rate α of the SingleR policy reissuing at time d is less than k. By then, we have found the optimal d∗ value, and its corresponding q value is computed on line 13.
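For readers who prefer runnable code, the following is a direct Python transcription of the pseudocode in Figure 8-1 (a sketch: it uses plain sorted lists rather than the finger search trees discussed below, so it does not achieve the same asymptotic bounds, and it caps q at 1 as a small safety measure).

import bisect

def discrete_cdf(sorted_samples, t):
    # Fraction of samples strictly less than t (DiscreteCDF in Figure 8-1).
    return bisect.bisect_left(sorted_samples, t) / len(sorted_samples)

def singler_success_rate(rx, ry, budget, t, d):
    pr_x_le_t = discrete_cdf(rx, t)
    pr_x_gt_d = 1.0 - discrete_cdf(rx, d)
    pr_y_le = discrete_cdf(ry, t - d)
    q = budget / pr_x_gt_d if pr_x_gt_d > 0 else 0.0
    return pr_x_le_t + q * (1.0 - pr_x_le_t) * pr_y_le

def compute_optimal_singler(rx, ry, k, budget):
    rx, ry = sorted(rx), sorted(ry)
    cand = list(rx)               # Q: candidate reissue delays and latency targets
    lo, hi = 0, len(cand) - 1     # the remaining candidates are cand[lo..hi]
    d_star, t = cand[0], cand[-1]
    while lo <= hi:
        d = cand[lo]              # d <- min{Q}; Q <- Q - {d}
        lo += 1
        alpha = singler_success_rate(rx, ry, budget, t, d)
        while alpha > k and t > d and hi >= lo:
            hi -= 1               # Q <- Q - {t}; t <- max{Q}
            t = cand[hi]
            d_star = d
            alpha = singler_success_rate(rx, ry, budget, t, d)
    pr_x_gt_d = 1.0 - discrete_cdf(rx, d_star)
    q = min(1.0, budget / pr_x_gt_d) if pr_x_gt_d > 0 else 0.0
    return d_star, q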

Complexity.

ComputeOptimalSingleR is computationally efficient, with complexity Θ(N + Sort(N)), where N is the number of samples and Sort(N) is the time required to sort N response times. In particular, the list of potential reissue times Q is initialized with N response times. Each time SingleRSuccessRate is invoked, one element is removed from Q. Therefore, SingleRSuccessRate can be invoked at most N times. SingleRSuccessRate evaluates three cumulative distribution functions DiscreteCDF on lines 1–3. Although the success rate α computed on line 5 is not necessarily monotonic as a function of (t, d), its

composite CDFs are monotonic in t, d, and t − d, respectively. As a result, the amortized cost of DiscreteCDF is O(1), with a careful analysis considering order statistics and using a finger search tree [53, 141]. DiscreteCDF takes pre-sorted response-time samples as inputs, where the sorting takes Θ(Sort(N)) time. Summing these together, the complexity of ComputeOptimalSingleR is Θ(N + Sort(N)).

Theorem 46 shows that ComputeOptimalSingleR is a computationally-efficient linear-time algorithm.

Theorem 46 Given N sorted response-times RX and N sorted reissue response-times RY , ComputeOptimalSingleR computes the optimal SingleR policy for the case in which request response-times are independent in Θ(N) time.

Proof. The list of potential reissue times Q is initialized with N response times. Each time SingleRSuccessRate is invoked, one element is removed from Q. Therefore, SingleRSuccessRate can be invoked at most N times.

The SingleRSuccessRate procedure evaluates three cumulative distribution functions on lines 1–3. The runtime of DiscreteCDF depends on the time required to perform the order-statistic query on line 1. These order-statistic queries can be evaluated in O(1) amortized time by taking advantage of the temporal access pattern of ComputeOptimalSingleR. The algorithm considers N reissue times d = d1, d2, . . . , dN on lines 4–12 in ascending order, and the tail-latency target t monotonically decreases as better parameters are discovered. These two properties imply that t − d is also monotonically decreasing. The monotonic behavior of t, d, and t − d is exploited by storing the response times RX and RY in a finger search tree [53, 141, 165, 198, 240] that supports consecutive searches to tree elements whose order statistics differ by c in O(lg c) time. This property implies that a sequence of N order-statistic queries for monotonic sequences can be performed in Θ(N) time.

8.4.2 Incorporating response-time correlations

The response-time of a request can be divided into two components: the amount of time a request waits in a server’s queue before being processed (the queueing time), and the time required to execute the request (the service time). The response-times of primary and reissue requests, however, will often be correlated. For example, queries within a workload can have different service times: a query with a high service time (e.g., many instructions) is likely to take long for both the primary and reissue requests. Moreover, the system’s instantaneous load may be similar upon the arrival of the primary and reissue requests.

Correlations between primary and reissue requests influence the probability that a reissue request will respond before a tail-latency deadline. This influence can be taken into account in ComputeOptimalSingleR by modifying line 5 of SingleRSuccessRate in Figure 8-1 to use the conditional distribution Pr(Y ≤ t − d | X > t) in place of Pr(Y ≤ t − d). The conditional distribution Pr(Y ≤ t − d | X > t) may be estimated efficiently by using a 2D orthogonal range-query data structure [238, 2] over pairs (tx, ty), where tx and ty are the primary and reissue response times. Each range query performed within SingleRSuccessRate takes O(log N) time, and SingleRSuccessRate is invoked at most 2N times by ComputeOptimalSingleR. Therefore, the procedure ComputeOptimalSingleR that takes correlation into account computes the optimal SingleR policy in Θ(N lg N) time.
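As a small illustration of this change (a sketch; the thesis uses a 2D orthogonal range-query structure so that each conditional query takes O(log N) time rather than the linear scan below), the conditional probability Pr(Y ≤ t − d | X > t) can be estimated directly from paired response-time samples:

def conditional_cdf(pairs, t, d):
    # Estimate Pr(Y <= t - d | X > t) from (primary, reissue) response-time pairs.
    tail = [(x, y) for x, y in pairs if x > t]   # queries whose primary misses t
    if not tail:
        return 0.0
    return sum(1 for _, y in tail if y <= t - d) / len(tail)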


Figure 8-2: Convergence of the adaptive SingleR policy on a workload with correlated service-times and queueing delays. (a) Inverse CDF. (b) Adaptive algorithm.

8.4.3 Iterative adaptation for queue delays

The queueing delay of requests in a workload depends on the arrival process to a service. The use of a reissue policy can perturb this arrival process and change the response-time distributions used by ComputeOptimalSingleR to find a SingleR policy.

The impact of added load on a workload’s response-time distributions can be significant. Consider the inverse CDFs illustrated in Figure 8-2a for Original and Primary requests.¹ The Original curve illustrates the inverse CDF of the original primary response-time distribution of the system when no requests are reissued. The Primary curve illustrates the new inverse CDF of the primary response-time distribution when using a SingleR policy with a 30% reissue budget. The impact of these reissue requests on the primary response-time distribution is dramatic: the 85th percentile grows from 50 to 350.

We employ an adaptive approach to iteratively refine a SingleR policy in response to changes in the response-time distribution. First, we begin with a reissue policy P that reissues requests at time d = 0 with probability B. We then execute the system with the reissue policy and sample the response-time distributions of primary and reissue requests. The sampled response-time distributions are used within ComputeOptimalSingleR to compute the optimal SingleR policy Plocal for these response-time distributions. Next, we obtain a new policy P′ that has reissue delay d′ = d + λ(dlocal − d), where λ is a learning rate. Finally, this process is repeated until the empirical kth percentile tail-latency converges to the value predicted by ComputeOptimalSingleR and the empirical reissue rate converges to B.

This adaptive approach is based upon two observations: a) using the same budget, reissuing later tends to impact load more, as it is more likely to reissue requests with more work and higher resource demands; and b) small changes to the reissue delay result in only small changes to the response-time distributions. Observation (a) implies that the predicted kth percentile tail-latency increases after each step of the algorithm. Observation (b) implies that, for sufficiently small λ, the true optimal reissue time dopt lies between d′ and dlocal at each step of the algorithm.

Figure 8-2b shows the 95th percentile tail-latency achieved on each step of the adaptive

¹The corresponding simulation setup for Figure 8-2a is discussed in Section 8.5.

algorithm using a learning rate of 0.2 for a SingleR policy with a reissue budget of 30%. Convergence can be detected by comparing the policy optimizer’s predicted tail-latency with the observed latency when using the policy. For this workload, convergence is achieved after ≈ 6 iterations.
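The adaptive loop can be summarized with the following sketch. The helpers run_and_sample and optimize are hypothetical stand-ins (not thesis APIs) for executing the workload under a policy and for the parameter search of Section 8.4.1; the learning-rate update and the stopping test follow the description above.

def adaptive_singler(run_and_sample, optimize, k, budget, lam=0.2, max_trials=15, tol=0.05):
    # run_and_sample(d, q) -> (primary_samples, reissue_samples, observed_kth_latency)
    # optimize(rx, ry, k, budget) -> (d_local, q_local, predicted_kth_latency)
    d, q = 0.0, budget                  # start by reissuing immediately with probability B
    for _ in range(max_trials):
        rx, ry, observed = run_and_sample(d, q)
        d_local, q, predicted = optimize(rx, ry, k, budget)
        d = d + lam * (d_local - d)     # move the reissue delay toward the local optimum
        if abs(observed - predicted) <= tol * predicted:
            break                       # empirical tail latency matches the prediction
    return d, q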

8.4.4 Extended scenarios

The tools and algorithms presented in the preceding sections can be applied to handle common scenarios that occur in practice. Since space limitations prohibit an exhaustive examination of each of these scenarios, we shall instead sketch a few strategies for addressing common use cases.

Varying load / response-time distributions. In practice, a system’s response-time distribution can vary over time on both short (hourly, daily) and long (monthly, yearly) time scales. The iterative algorithm for adaptively refining a SingleR policy can be applied in an online fashion to address these temporal variations, but it requires modifications, which depend on specific application needs and the time-scale of interest, to properly balance exploration and exploitation in its search.

Selecting optimal reissue budget. The adaptive algorithm described in this section assumes the use of a fixed reissue budget. As we learned in Section 8.2, SingleR policies are able to reduce tail-latency in a “smooth” fashion even with very small reissue budgets. As a consequence, the tail-latency reduction of SingleR as a function of the reissue budget tends to be roughly parabolic (unimodal), and its extremum can be readily found through simple binary-search techniques. To evaluate the practicality of this simple approach, we implemented a budget-selection procedure that performs the following steps: 1) set δ = 1% and set best-budget = 0; 2) for budget best-budget + δ, run the adaptive SingleR policy optimizer for 5 adaptive trials to produce reissue policy P; 3) collect response-time data from the system when using reissue policy P; 4) if the budget best-budget + δ has a smaller 99th percentile tail-latency than best-budget, then set δ = 3δ/2; otherwise, set δ = −δ/2. An example of this binary search procedure is presented later in Figure 8-8 as part of our system experiments in Section 8.6.
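A minimal sketch of this budget-selection procedure follows; tail_latency_with_budget is a hypothetical placeholder for "run the adaptive SingleR optimizer for a few trials at this budget and measure the resulting 99th-percentile latency".

def find_best_budget(tail_latency_with_budget, start_step=0.01, max_iters=12):
    # Step-doubling/halving search over the reissue budget, as described above.
    best_budget = 0.0
    best_latency = tail_latency_with_budget(best_budget)
    step = start_step
    for _ in range(max_iters):
        candidate = max(0.0, best_budget + step)
        latency = tail_latency_with_budget(candidate)
        if latency < best_latency:
            best_budget, best_latency = candidate, latency
            step = 1.5 * step           # keep moving in the improving direction
        else:
            step = -0.5 * step          # overshot: reverse direction and shrink the step
    return best_budget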

Meeting tail-latency with minimal resources. Interactive services often formulate service-level agreements (SLAs) that guarantee a fixed latency for k% of all requests. In such a scenario, a system designer may be interested in minimizing the resources required to satisfy the SLA. Given a particular tail-latency target T, the budget can be minimized using either a brute-force search starting at small reissue rates, or a variation of the binary-search procedure for finding the optimal budget that transforms tail-latency values L using the function f(L) = min{T, L}.

8.5 Simulations

In this section we use a discrete-time event simulator to carefully evaluate the behavior and tail-latency impact of SingleR policies. Simulation allows us to vary workload and system properties, covering a wide range of scenarios.

First, we provide simulation results on three types of workloads: Independent, Correlated, and Queueing, corresponding to the three workload models in Section 8.4. This experiment demonstrates two important points: a) Randomness in SingleR is, in fact, especially important for workloads with correlated service-times and queueing delays; and, b) the optimal SingleR policy takes workload characteristics into account in order to maximize the value of each reissued request. Next, we conduct a sensitivity study that varies the Queueing workload along many dimensions: utilization, service-time distribution, percentile targets, strength of service-time correlations, load-balancing strategies, and request-prioritization strategies. The results demonstrate that SingleR is effective and robust over varying workload properties and system design choices.

Figure 8-3: Simulation results for SingleR and SingleD policies with varied reissue budgets on three simulated workloads: Independent, Correlated, and Queueing. (a) Tail-latency reduction ratio. (b) Reissue remediation rate. (c) Reissue time and probability.

8.5.1 Simulated workload

Figure 8-3 provides simulation results on a set of three workloads. The service-times in each workload are drawn from a Pareto distribution with shape parameter 1.1 and mode 2. In the Independent workload, the service-times of primary and reissue requests are independent and have no queueing delays (i.e. there are an infinite number of servers).


In the Correlated workload, the primary and reissue request service-times are correlated via the relationship Y = rx + Z, where x is the sampled primary request service-time, Z is an independently drawn service-time, and r = 0.5 is a linear correlation ratio. In the Queueing workload, requests have correlated service-times and arrive according to a Poisson process. The request is dispatched to the FIFO queue of one of 10 servers selected uniformly at random. The arrival rate is chosen to achieve a system utilization of 30%.

Figure 8-4: Response-time correlations between primary and reissue requests on the Correlated and Queueing workloads. The service-time X of the primary request is drawn from a Pareto distribution with shape 1.1 and mode 2. The service-time Y of a reissued request is drawn from Y = rx + Z, where x is the observed service-time of the primary request, Z is drawn from a Pareto distribution with shape 1.1 and mode 2, and r = 0.5.

Figure 8-3a compares the 95th percentile tail-latency reduction achieved by the optimal SingleR and SingleD policies for varied reissue budgets. For the Queueing workload, both the SingleR and SingleD policies are selected using adaptive policy refinement (for the SingleD policy this adaptive refinement is needed to ensure the reissue budget is satisfied). Figure 8-3b illustrates the “remediation rate” of SingleR and SingleD policies. The remediation rate measures the average value of added (i.e. actually issued) reissue requests and is defined to be the probability that a primary request X exceeds a tail-latency target t, but the reissued request Y responds before time t − d, i.e. Pr(X > t ∩ Y < t − d). Figure 8-3c plots the reissue times and probabilities used by the optimal SingleR policy for each budget.
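For concreteness, the correlated service-times used in the Correlated and Queueing workloads can be generated as in the following sketch (the Pareto parameters and r follow the caption above; the code itself is illustrative and is not the simulator used in the thesis).

import random

def pareto(shape=1.1, mode=2.0):
    return random.paretovariate(shape) * mode

def correlated_pair(r=0.5):
    # Return (primary, reissue) service times with linear correlation ratio r.
    x = pareto()              # primary request service time
    y = r * x + pareto()      # reissue service time: Y = r*x + Z
    return x, y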

8.5.2 Benefits of randomization

The results of Figure 8-3a illustrate the benefits of randomization in reissue policies. For all three workloads, there exists a range of reissue budgets for which the SingleD policy is ineffective at reducing the 95th percentile tail-latency. On the Independent workload, a SingleD policy is unable to achieve any tail-latency reduction when the reissue budget is less than 5%. On the Correlated workload, SingleD policies are ineffective for reissue budgets less than 10%. Worst of all, SingleD policies actually increase the 95th percentile latency of the Queueing workload with reissue budgets less than 10% — since these reissued requests increase system load.

In contrast, SingleR is able to reduce the 95th percentile tail-latency for all reissue budgets on the Independent and Correlated workloads. On the Queueing workload, SingleR begins to reduce tail-latency once the reissue budget is greater than 3%. For all three workloads, randomization allows SingleR to achieve better tail-latency reduction than SingleD for budgets less than 15%.

8.5.3 Impact of correlation and queueing

The procedure outlined in Section 8.4 for finding an optimal SingleR policy takes into account the properties of the primary and reissue response-time distributions, and adapts to queueing delays. By inspecting the three workloads in Figure 8-3, we can gain insight into how SingleR reissue policies are able to outperform SingleD.

The goal of ComputeOptimalSingleR is to find a SingleR policy that minimizes the workload’s kth percentile tail-latency with a reissue budget of B. One can think of ComputeOptimalSingleR as searching over all policies that use budget B in order to find the policy that maximizes the value of each added request. A convenient measure of the “value” of each reissue request is its remediation rate — i.e. the probability that the redundancy provided by the reissue request was necessary for the query to meet its tail-latency target.

Figure 8-3c illustrates the way in which SingleR changes its choice of reissue delay and probability based upon the reissue budget and workload characteristics. We shall discuss the behavior of SingleR policies for each of our three workloads in the case where the reissue budget is 10%.

On the Independent workload, the optimal SingleR policy reissues requests with probability 0.7 at a time d where approximately 15% of primary requests remain outstanding — resulting in approximately 10% of all requests being reissued in total. On the Correlated workload, the optimal SingleR policy instead reissues earlier and with a smaller probability, as we discuss next.

On the Correlated workload, the optimal SingleR policy must reissue requests earlier due to service-time correlations. When optimizing its success rate, it takes into account the fact that if a query’s primary request exceeds a tail-latency target, there is a higher chance of its reissue request responding slowly. By reissuing requests earlier in time, the probability that the reissued request will help tail latency (i.e. the remediation rate) increases. Therefore, on this workload the optimal policy reissues requests earlier, at a time d when 40% of requests are outstanding, and reissues with a smaller probability of 25%.

On the Queueing workload, the optimal SingleR policy reissues requests with probability 0.8 at a time d where approximately 13% of requests are outstanding. Although this workload’s service-times are correlated, the latency of requests in the tail of the response-time distribution is dominated by queueing delays, which depend on the service process as well as on the request arrival process and load balancing. Indeed, we can observe in Figure 8-4b that the addition of queueing delays dampens the strength of correlation between primary and reissue requests. Although the preexisting correlation can still be observed, the structure of the joint distribution exhibits more randomness due to request queueing time. This provides an explanation for why SingleR, and reissuing in general, can achieve more latency reduction on the Queueing workload than on the Correlated workload, as shown in Figure 8-3a.

This experiment shows that SingleR optimizes the choice of reissue time and probability based upon workload characteristics, maximizing the benefit of reissued requests for tail-latency reduction.

(a) Correlation (b) Load-balancing (c) Queuing

Figure 8-5: Illustrates impact of correlation ratio (Figure 8-5a), load-balancing strategies (Figure 8-5b), and server’s queue-management policies (Figure 8-5c) on the 95th percentile tail-latency of the Queueing workload.

P95 P99 3 3 LN(1, 1), 20% Util LN(1, 1), 20% Util 2.5 LN(1, 1), 30% Util 2.5 LN(1, 1), 30% Util LN(1, 1), 50% Util LN(1, 1), 50% Util 2 2

1.5 1.5

1 1 LogNormal(1,1) 99th Latency Reduction 95th Latency Reduction 0 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 0.4 0.5 Reissue Rate Reissue Rate 3 3 Exp(0.1), 20% Util Exp(0.1), 20% Util 2.5 Exp(0.1), 30% Util 2.5 Exp(0.1), 30% Util Exp(0.1), 50% Util Exp(0.1), 50% Util 2 2

1.5 1.5

1 1 Exponential(0.1) 99th Latency Reduction 95th Latency Reduction 0 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 0.4 0.5 Reissue Rate Reissue Rate

Figure 8-6: The 95th percentile tail-latency P 95 and the 99th percentile tail-latency P 99 for Exp(0.1) and LogNormal(1, 1) distributions with varied reissue rates and system utilization. tail-latency reduction.

8.5.4 Sensitivity study

We study the sensitivity of SingleR to workload properties and design choices, including: utilization, service-time distribution, target latency percentiles, correlation among requests, load-balancing among servers, and changing the priority processing reissued requests. As a baseline, we use a variant of the Queueing workload from Section 8.5.1 without service-time correlations unless otherwise specified.

205 Utilization, service-time distribution and percentiles

We use LogNormal(1, 1) and Exponential(0.1) as service time distributions and measure P95 and P99 tail latency reduction for three utilization levels: 20%, 30%, and 50%. Figure 8-6 illustrates the P 95 and P 99 tail-latency reduction (Y-axis) achieved by Sin- gleR policies over a range of reissue budgets (X-axis) compared to the original tail-latencies when no requests are reissued. The results demonstrate: (1) Reissue obtains higher ben- efit under less loaded systems, but even at rather high load of 50% utilization, SingleR achieves latency reduction of up to 1.5 times. (2) The benefit of reissue tends to increase for higher target percentiles.

Correlation.

We use the same default Pareto distribution to model service time, and progressively in- crease the service time correlation ratio r between the primary request and its corresponding reissued request (defined in Section 8.5.1). The P 95 latency without reissue is 567, and is independent of the correlation ratio r. Figure 8-5a reports P 95 latency of SingleR using a fixed reissue rate of 25% as a function of the correlation ratio. As expected, the less service- times are correlated the larger benefit reissuing has on tail-latency. Even when primary and reissue requests are strongly correlated (e.g. r = 1) SingleR is still able to reduce the response-times of queries delayed due to queueing delays.

Load-balancing.

Figure 8-5b shows the impact of different load-balancing strategies on tail latency. We make two observations: (1) Using more sophisticated load-balancing strategies such as Min-of-2 (select the server with shorter queue among two randomly selected servers to dispatch a request) or Min-of-All (select the server with the shortest queue among all servers to dispatch a request) helps reduce the P 95 tail-latency relative to the simpler Random strategy that picks a server uniformly at random. (2) In all cases, SingleR reduces the P 95th latency by a factor of 2 or more.

Changing priority of reissued requests.

We study three priority settings: (1) Baseline FIFO corresponds to a server maintaining a FIFO single queue, and does not differentiate between primary and reissue requests. (2) Prioritized FIFO corresponds to a server that maintains two separate FIFO queues for primary and reissue requests, and only processes reissue requests when the primary queue is empty: preventing multiple reissued requests from delaying a primary request. (3) Prioritized LIFO is the same as Prioritized FIFO but processes the reissue queue in LIFO order. Figure 8-5c compares the three systems and shows that changing the priority scheme has a modest impact on the tail latency improvements of SingleR. The overall results in this section show that SingleR and its adaptive policy optimizer is robust and reduces the tail latency for these different system design choices and workload characteristics.

206 ea.Pirt edn esu eus,tecmlto ttso t soitdqeyis the query compute associated its Linux of the status completion policy-specified array. the a boolean request, after client-local server reissue A a a a using to later. sending checked request reissued to the be dispatches Prior can and delay. request queue, FIFO the the that exponential from so entries with queue loop FIFO a open to an employ we in workload, requests query each send execute To that times. for memory. clients inter-arrival both main indices emulating 2 the associated dual-core machines in its a and fit several has sets Lucene server data for Each The and RAM. workload. of Redis the 32GB execute and to processor servers E5-2676 10 Intel of cluster workloads a use and We setup Experimental enterprise an is 8.6.1 which the [237] reducing is target Lucene main and Our procedures, engine. stored search supports that store key-value evaluate We evaluation Experimental 8.6 a the and shows budget 8-7c 60%. reissue Figure best utilization. the 8-7a 60% and using Figure the 40%, shows 20%, workloads. 8-7b for Lucene Figure rates and utilization. 40% Redis at the 6% for results the experiment compares System 8-7: Figure

l eotdsse tlztosrfrt P tlzto nasnl oea esrdby measured as core single a on utilization CPU to refer utilizations system reported All it add and timestamp, a request primary each assign we reissuing, request enable To Lucene Search Redis Set-Intersection (a)

99th Percentile Latency (ms)1000

99th340 350 Percentile360 370 380 390 Latency400 410 420 (ms)430 440 300 400 500 600 700 800 900 SingleR 0 0 0 0 sysstat . . SingleR 10 01 0 01 SingleD P 9ltnyof latency 99 . . esu Rate Reissue Rate Reissue 20 02 0 02 tlt.W s 0aatv trtos(ihlann rate learning (with iterations adaptive 10 use We utility. [130] vs SingleD SingleD SingleR SingleR . . SingleD 30 03 0 03 hc sa is which [358] Redis on based systems distributed two in policies and . . 40 04 0 04 . . SingleR 50 05 0 05 SingleR . . 06 06 b aec sRiseRate Reissue vs Latency (b) 1000 1500 2000 2500 3000 99th Percentile Latency1000 1100 1200 (ms)1300 1400 99th Percentile Latency (ms) 500 200 300 400 500 600 700 800 900 0 SingleR oiisstsyn h esu ugt h measured The budget. reissue the satisfying policies 0 0 0 0 and . 0 1 207 . 20 02 0 Utilization 60% Utilization 40% Utilization 20% 0 Utilization 60% Utilization 40% Utilization 20% SingleD esu Rate Reissue Rate Reissue P oiyfruiiain agn rm2%to 20% from ranging utilizations for policy . P 9ltnyfor latency 99 0 2 . 9ti latency. tail 99 40 04 . 0 3 . 60 06 o esu ugt ewe and 0 between budgets reissue for . 0 4 . . 08 c etLtnyv Utilization vs Latency Best (c) 5 SingleR esu thread reissue P 1000 1200 1400 1600 99th Percentile Latency (ms) 99th Percentile Latency1000 1200 1400 (ms)1600 200 400 600 800 200 400 600 800 9ltnyahee when achieved latency 99 0 02 03 04 05 60 55 50 45 40 35 30 25 20 02 03 04 05 60 55 50 45 40 35 30 25 20 etRiseRate Reissue Best Rate Reissue Best ihvre reissue varied with Utilization Utilization oReissue No Reissue No osmsthe consumes λ 0 = . GHz 4 . )to 5) 0.14 150 Trial Latency 0.12 140 Best Latency 0.1 130

0.08 120

0.06 110 Budget 0.04 100

0.02 Trial Budget 99th percentile (ms) 90 Best Budget 0 80 0 2 4 6 8 10 12 14 0 2 4 6 8 10 12 14

Trial Num Trial Num Figure 8-8: Illustration of binary search for optimal budget for the Intersection Counting query to minimize 99th percentile tail-latency. 20% utilization. 65536 65536 Lucene Redis 16384 16384 4096 4096 1024 1024 256 256 64 64 16 16 Num Requests Num Requests 4 4 1 1 10 30 50 70 90 110130150170190210230 10 30 50 70 90 110130150170190210230 Service Time (ms) Service Time (ms) Figure 8-9: Service time distributions for the Redis set-intersection and Lucene search workloads. reissue rate and the target reissue budget tend to closely agree with the predictions of the reissue policy optimizer and thus we report only the empirical rate in all figures.
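A sketch of the client-side reissue mechanism described above is shown below (illustrative Python using an in-process queue, with the reissue loop intended to run on a dedicated background thread; the actual clients are separate load-generating machines, and names such as send_request are placeholders).

import queue, random, time

pending = queue.Queue()   # (query_id, dispatch_time, delay) entries awaiting possible reissue
completed = {}            # query_id -> True once any response has arrived (the boolean array)

def on_response(query_id):
    completed[query_id] = True          # called when a primary or reissue reply arrives

def submit(query_id, send_request, delay):
    completed[query_id] = False
    send_request(query_id)                          # dispatch the primary request
    pending.put((query_id, time.time(), delay))     # remember it for the reissue thread

def reissue_loop(send_request, reissue_probability):
    # Reissue thread: wait out each policy-specified delay, then reissue if still unanswered.
    while True:
        query_id, dispatched, delay = pending.get()
        remaining = dispatched + delay - time.time()
        if remaining > 0:
            time.sleep(remaining)
        # Check the completion status, then flip the SingleR coin before reissuing.
        if not completed[query_id] and random.random() < reissue_probability:
            send_request(query_id)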

8.6.2 Redis set-intersection workload

The Redis workload consists of set-intersection queries performed over a synthetic collection of 1000 sets. Each set stores a random subset of integers in the range 1 to 10^6, and set cardinalities are distributed according to a lognormal distribution. Query traces consist of 40,000 intersections between randomly selected pairs of sets.

The service-time distribution for the Redis set-intersection workload is illustrated in Figure 8-9, discretized into 20 msec bins. Over 98% of set-intersection queries in this workload have a service-time of less than 10 msec. Indeed, the workload’s mean service time µR = 2.366 milliseconds and standard deviation σR = 8.64 may lead us to expect request latencies to be well-behaved, even in the tail.

A handful of queries (≈ 20), however, have service times greater than 150 msec. These

queries correspond to the rare case in which an intersection is performed between two abnormally large sets. These rare queries do little to skew the aggregate statistics of the workload’s service-time distribution, but have a substantial impact on tail-latency. As shown in Figure 8-7b, the 99th percentile tail-latency for the set-intersection workload is 900 msec when one does not reissue requests.

These “queries of death” are a common problem in database applications, and their impact on tail-latency can be difficult to predict a priori. In particular, the influence of these requests depends to a large extent on the queueing mechanisms used in the system. In Redis, requests are serviced in a round-robin fashion from each active client connection in a batch. If even a single client issues a long-running request, then the requests of all other clients will be delayed until its completion. Furthermore, in an open-loop system such delays lead to a backlog of requests that further extends the impact of the slow request for multiple rounds.
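The synthetic collection described above can be reproduced approximately with a few lines (a sketch; the lognormal parameters mu and sigma are assumptions, since the chapter does not state them).

import random

def build_sets(num_sets=1000, universe=10**6, mu=8.0, sigma=1.0, seed=0):
    # Sets with lognormally distributed cardinalities over integers in [1, 10^6].
    rng = random.Random(seed)
    sets = []
    for _ in range(num_sets):
        size = min(universe, max(1, int(rng.lognormvariate(mu, sigma))))
        sets.append({rng.randrange(1, universe + 1) for _ in range(size)})
    return sets

def query_trace(sets, num_queries=40_000, seed=1):
    # Each query intersects a randomly selected pair of sets.
    rng = random.Random(seed)
    return [tuple(rng.sample(range(len(sets)), 2)) for _ in range(num_queries)]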

Tail latency reduction under SingleR and SingleD

On the Redis workload, the SingleR and SingleD policies are able to reduce the P99 latency at 40% utilization from 900 milliseconds to 400 milliseconds. SingleR is able to meet this target P99 latency with a budget of just ≈ 3.5%, whereas SingleD requires a budget of at least 5%. For reissue budgets between 3 and 5%, the reissue probability of SingleR increases from 0.8 to 1.0, so that for budgets greater than 5% SingleR and SingleD are equivalent.

Figure 8-7a shows the P99 latency in msec (Y-axis) against reissue rates between 0 and 6% (X-axis) for SingleR and SingleD, with both Redis (top figure) and Lucene (bottom figure) running at 40% baseline utilization without any reissuing. We make two observations. First, both the SingleR and SingleD curves illustrate reduced tail latency relative to the baseline system without reissuing. Second, the SingleR policy achieves strictly better tail-latencies than SingleD for small reissue rates. For example, at a 2% reissue rate in Redis, SingleR reduces the P99 latency to 405 msec, compared to 900 msec for the baseline system and 820 msec for SingleD.

Varied system utilization

Next, we illustrate the performance of SingleR under three system utilization levels, 20%, 40%, and 60%, in Figure 8-7b, which depicts the P99 latency against a fixed reissue rate. For all utilizations between 20–60%, SingleR is able to reduce the 99th percentile tail latency. In particular, at 60% utilization (which is very high for interactive services), SingleR with a 3% reissue rate reduces the Redis P99 latency from 1750 msec to 1000 msec, and the Lucene P99 latency from 1603 msec to 1157 msec.

The best latency reduction occurs when choosing the optimal reissue rate, which depends on the system utilization. For 20% utilization, we illustrate the process of finding the optimal reissue rate via binary search in Figure 8-8. At 20% utilization the best reissue budget is approximately 8%. At both 40% and 60% utilization, the best reissue rate is approximately 5%.

Figure 8-7c illustrates the best tail-latency achieved by a SingleR policy for the Redis workload for utilizations between 20% and 60%. The Best Reissue Rate curve illustrates the P99 latency achieved by the best SingleR policy (with the reissue rate determined via

a binary search procedure), and the No Reissue curve illustrates the P99 latency of the baseline system without reissuing.

8.6.3 Lucene search workload

The Lucene search workload consists of search queries over a corpus of 33 million articles from the English Wikipedia dataset [112]. Queries are drawn randomly from a set of 10,000 queries from the Lucene nightly regression tests [249].

The service-time distribution for queries in the search workload contrasts with set-intersection in that it is not as highly skewed towards very low latencies. The distribution of search service-times is illustrated in Figure 8-9, discretized into 20 msec bins. Approximately 90% of all requests have service times between 1 and 70 msec, and the overall distribution has mean service-time µL = 39.73 msec with standard deviation σL = 21.88.

Similar to the set-intersection workload, the search workload also has rare slow queries. Approximately 1% of search queries have service-times greater than 100 msec. The impact of these slow queries on tail-latency, however, is different in Lucene than in Redis. At 40% utilization, the search workload’s 99th percentile latency is ≈ 435 msec when there are no reissued requests. This is not entirely due to the differences in service-time distribution, although that certainly is an important influence. The Lucene search server also differs in how it manages concurrent requests. Requests from all open connections are placed into a single FIFO queue, which results in relatively good tail-latency behavior — FIFO is, in fact, optimal for light-tailed service-time distributions [344].

Tail-latency reduction under SingleR and SingleD

On the Lucene workload, we observe in Figure 8-7a that SingleR reduces Lucene’s P99 latency at 40% utilization from 433 milliseconds to 339 milliseconds, while the SingleD policy reduces the P99 latency to 346 milliseconds. This gap, while small, is not merely measurement noise — all reported values reflect the median of multiple runs. The improved performance of the SingleR policy is due to its use of randomization, which allows it to reissue queries earlier than SingleD. At 40% utilization, the optimal reissue rate for SingleR is 4%, and the optimal policy reissues requests with probability approximately 0.75. As the reissue rate grows, the achieved latency gap between SingleR and SingleD closes, and the reissue probability of the optimal SingleR policy converges to 1.0. Randomization is more valuable on the Lucene search workload than it was for Redis because of the much higher mean service time of requests.

Varied system utilization

Next, we illustrate the performance of SingleR under three system utilization levels: 20%, 40%, and 60% in Figure 8-7b, which depicts the P99 latency against a fixed reissue rate. SingleR reduces the tail latency of the Lucene search workload for all utilizations between 20% and 60%. At 60% utilization (high load), SingleR reduces the P99 latency from 1603 msec to 1157 msec. Figure 8-7c illustrates the best tail latency achieved by a SingleR policy for the Lucene workload for utilizations between 20% and 60%. The Best Reissue Rate curve illustrates the P99 latency achieved by the best SingleR policy (with reissue rate determined via a binary search procedure), and the No Reissue curve illustrates the P99 latency of the baseline system without reissuing. We observe a significant tail-latency reduction due to SingleR over the baseline.

8.7 Conclusion

We have illustrated principled methods of generating reissue policies for interactive services. By operating within a simplified model, we were able to prove that SingleR is an optimal compromise between the commonly used immediate and delayed reissue strategies. There are a few general lessons that we think are useful to impart: (a) there is little reason to choose a reissue policy more complex than SingleR if that additional complexity does not leverage application-specific insight; and (b) reissue policies that reduce tail latency as a "smooth" function of their budget admit relatively simple strategies for adapting to load-dependent queueing delays and searching for optimal reissue budgets. Through iterative adaptation, we were able to adapt SingleR policies to systems and workloads with a wide range of properties, and this leads to a simple process for finding effective reissue policies in real systems: SingleR is able to reduce tail latency in simulated and real-world workloads even when reissuing only a small fraction of requests.

Chapter 9

Polylogarithmic Fully Retroactive Priority Queues via Hierarchical Checkpointing

This chapter presents a fully retroactive priority-queue data structure that supports updates and queries on a timeline containing m updates with overheads that are polylogarithmic functions of m. This improves upon the previous best-known fully retroactive priority queue data structure, which relied upon a general transformation requiring Θ(√m log m) time per operation. This work was conducted in collaboration with Erik D. Demaine, Quanquan Liu, Aaron Sidford, and Adam Yedidia.

Abstract

Since the introduction of retroactive data structures at SODA 2004 [88], a major open question has been the difference between partial retroactivity (where updates can be made in the past) and full retroactivity (where queries can also be made in the past). In particular, for priority queues, partial retroactivity is possible in O(log m) time per operation on an m-operation timeline, but the best previously known fully retroactive priority queue has cost Θ(√m log m) time per operation. We address this open problem by providing a general logarithmic-overhead transformation from partial to full retroactivity called "hierarchical checkpointing," provided that the given data structure is "time-fusible" (multiple structures with disjoint timespans can be fused into a timeline supporting queries of the present). As an application, we construct a fully retroactive priority queue which can insert an element, delete the minimum element, and find the minimum element, at any point in time, in O(log² m) amortized time per update and O(log² m log log m) time per query, using O(m log m) space. Our data structure also supports the operation of determining the time at which an element was deleted in O(log² m) time.

9.1 Introduction

Retroactivity. We can think of a data structure as being defined by a sequence of updates u1, u2, . . . , um applied to its initial (empty) state. Traditional data structures “live in the present” in the sense that the user can only append updates to this sequence, and ask

queries about the final state of the data structure resulting from the entire update sequence. Retroactive data structures, introduced at SODA 2004 [88], allow for updates to be inserted or deleted in the middle of the sequence, instead of just the end. Effectively, this feature enables the user to travel back in time and make a retroactive change to the data structure (similar to the movie Back to the Future). Thus we refer to the mutable update sequence as the timeline. We distinguish two forms of retroactivity. In partial retroactivity, queries can be made only of the final version resulting from all of the updates in the timeline; effectively, retroactive updates must be propagated all the way through the timeline in order to answer such queries correctly. In full retroactivity, queries can be made about the data structure at any time, corresponding to the result from a prefix of the timeline. In short, both forms of retroactivity enable modifying the past, and full retroactivity enables querying the past.

Known results. In some settings, retroactivity is easy to achieve. If updates commute with each other and have inverses, then retroactive updates can be moved to the end of the timeline, making partial (but not full) retroactivity easy. If updates are inserts and deletes, and the queries fall under Bentley and Saxe's decomposable search problems, then full retroactivity is possible with an O(log m) factor overhead [88]. Retroactivity becomes challenging when updates can have non-trivial interactions. Here one retroactive update can have a propagated effect on potentially all later updates. In the extreme, when the data structure is a general-purpose computer, a retroactive update can require an Ω(m) factor overhead [88]. The more interesting middle ground is when the updates have some but limited influence on each other—a common scenario in many classic data structures. For example, stacks (with push/pop), queues (with enqueue/dequeue), deques (with all four operations), union-find, dictionaries, and predecessor/successor structures all have logarithmic fully retroactive data structures [88, 127]. Of these results, predecessor/successor was the most challenging; the original paper [88] solved partial retroactivity in O(log m) but full retroactivity in O(log² m), which was later improved to O(log m) by Giora and Kaplan [127]. This problem is equivalent to dynamic rectilinear ray shooting, which was in fact the original motivation for defining retroactivity.

Challenges. A key open problem in retroactivity, posed at SODA 2004, is whether there is a difference in difficulty between obtaining partial versus full retroactivity. The only known upper bound on the separation is a conversion from partial to full retroactivity with O(√m) factor overhead [88]. Essentially, this conversion maintains Θ(√m) checkpoints of the timeline using a partially retroactive data structure, and to query in between, replays the necessary O(√m) intervening updates. On the other hand, the only known data structural problem with a polynomial separation between the best partially retroactive and best fully retroactive data structures is priority queues (with insert and delete-min operations). The logarithmic partially retroactive priority queue [88] is one of the most sophisticated retroactive data structures, propagating potentially linear-length chain reactions in just logarithmic time. However, the existing approach appeared limited to partial retroactivity. Until now, the fastest known fully retroactive priority queue was the O(√m log m) bound that follows from the general conversion.

Our results. In this chapter, we solve this 11-year-old open problem by constructing the first polylogarithmic fully retroactive priority queue. Specifically, our data structure supports inserting an element, deleting the minimum element, and finding the minimum element, at any time in the timeline, in O(log² m) amortized time per update and O(log² m log log m) time per query, using O(m log m) space. We also show how to support another natural query over the timeline: finding the time at which a given element gets deleted as the minimum (or finding that it remains in the structure in the present). More importantly, we present a new general transformation from partial to full retroactivity with only a logarithmic factor overhead. This result shows a strong upper bound on the separation between partial versus full retroactivity, but it requires one additional assumption. Specifically, we call a (partially retroactive) data structure time-fusible if, given two such data structures representing two different timelines (contained in disjoint time intervals), it is possible to form a new (read-only) data structure representing the concatenation of those timelines. Roughly speaking, this assumption lets us apply the O(√m) checkpointing idea recursively in a binary tree structure built over the timeline, storing a partially retroactive data structure for the sub-timeline represented by each rooted subtree. Hence we call the transformation hierarchical checkpointing. A retroactive query can then be answered by fusing O(log m) structures and asking a query about the present. Our fully retroactive priority queue data structure is an application of this general technique. With some modifications, we show how to fuse two of the logarithmic partially retroactive priority queues from [88] in polylogarithmic time. Applying the general technique gives us a polylogarithmic bound on fully retroactive priority queues, but with worse bounds than those stated above. By a more careful analysis tailored to priority queues, we show how to further tune the hierarchical checkpointing analysis to improve the running time by a logarithmic factor and get the claimed bounds of Õ(log² m).

Organization. We organize the sections of this chapter as follows. Section 9.2 introduces our hierarchical checkpointing framework in greater detail. Section 9.3 describes time-fusible partially retroactive priority queues whose timelines may be fused together in polylogarithmic time. Section 9.4 applies the technique of hierarchical checkpointing to obtain a fully retroactive priority queue with polylogarithmic overheads.

9.2 Hierarchical checkpointing

In this section, we present our hierarchical checkpointing technique for transforming a time-fusible partially retroactive data structure into one that is fully retroactive while incurring only polylogarithmic overheads. In later sections, these results will be employed to design a fully retroactive priority queue with polylogarithmic overheads. We begin by defining in Section 9.2.1 the notion of time fusibility for retroactive data structures. Then in Section 9.2.2 we describe the hierarchical checkpoint procedure and prove its correctness.

9.2.1 Definitions

Here we discuss the properties of partially retroactive data structures and the conditions necessary to use hierarchical checkpointing to obtain full retroactivity. We define a retroactive update operation to be the insertion or deletion of a data structure operation at a particular time. These operations are:

• Insert-Op(o, t): insert a data structure update operation o into the retroactive structure's timeline at time t.

• Delete-Op(o, t): delete a data structure update operation o from the retroactive structure’s timeline at time t.

We define a retroactive query operation to be one that can determine some aspect of the state of the retroactive data structure at some point in time. We use Get-View(t) as the canonical query procedure when we describe our transformation.

• Get-View(t): returns some aspect of the state of the retroactive data structure at time t.

For partially retroactive structures, query operations can only be performed in the present (i.e., t = ∞). Fully retroactive data structures, however, may be queried at any time t. It turns out that a collection of partially retroactive data structures can be used to support fully retroactive query operations when it is possible to "fuse" their timelines. Formally, we say a partially retroactive data structure is time fusible if it has the following two properties (a minimal interface sketch follows the list):

1. It supports a function, Fuse(d1, d2), that fuses the timelines of two instances d1 and d2 of the partially retroactive data structure, producing a version of the data structure that allows read-only queries and reflects the updates in both d1 and d2. Fuse(d1, d2) need only support fusion between structures containing updates that span disjoint and adjacent intervals of the timeline.

2. Sequences of operations made on it exhibit substring closure; in other words, given a valid sequence of operations, any contiguous subsequence of operations on the structure is also valid.
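To make these requirements concrete, the following C++ interface sketch captures what hierarchical checkpointing assumes of a time-fusible partially retroactive structure. The names and types are illustrative only and are not taken from an actual implementation.

```cpp
#include <memory>

// Illustrative placeholder types: an update on the underlying structure, a
// timestamp on the timeline, and whatever Get-View exposes about the state.
struct Update { /* e.g., insert(key) or delete-min for a priority queue */ };
struct View { /* e.g., the minimum key at the queried time */ };
using Time = long long;

// A time-fusible partially retroactive data structure.
class TimeFusiblePR {
 public:
  virtual ~TimeFusiblePR() = default;
  // Retroactive updates: insert or remove an update at time t on the timeline.
  virtual void InsertOp(const Update& op, Time t) = 0;
  virtual void DeleteOp(const Update& op, Time t) = 0;
  // Partial retroactivity: queries are only about the present (t = infinity).
  virtual View GetViewPresent() const = 0;
  // Time fusibility: produce a read-only structure whose timeline is the
  // concatenation of the two operands' disjoint, adjacent timelines.
  static std::unique_ptr<TimeFusiblePR> Fuse(const TimeFusiblePR& d1,
                                             const TimeFusiblePR& d2);
};
```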

9.2.2 The data structure

In this section we describe how to transform a time-fusible partially retroactive data structure into one that is fully retroactive using our hierarchical checkpointing framework. Specifically, we obtain a fully retroactive data structure with O(T(m) log m + Q(m, k)) query time and O(A(m) log² m) amortized update time, where T(m) and A(m) represent the merge and update time, respectively, in the original partially retroactive data structure, and Q(m, k) is the query time of a time-fused structure consisting of k fusions and containing m updates. The first step of our transformation is to build a checkpoint tree: a balanced binary search tree in which each node of the tree contains a partially retroactive data structure consisting of all the updates in the subtree rooted at that node. Our checkpoint tree is similar to a segment tree [21] in that each partially retroactive data structure can be viewed as a segment with endpoints given by the first and last chronological update in the structure. The structures in the leaves of our checkpoint tree each contain only one update, and the leaves are sorted by the time of their one update. The update operations Insert-Op(o, t) or Delete-Op(o, t) can be performed on the fully retroactive structure by inserting or deleting the update o in all of the partially retroactive structures on the search path. A query can be performed at time t by merging O(log m) disjoint partially retroactive structures obtained from the balanced binary tree such that the fused structure contains all updates in the time span (−∞, t].
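The following sketch illustrates the checkpoint-tree layout and the root-to-leaf update path described above. It is a simplified illustration under stated assumptions: the field and function names are hypothetical, PR stands for any partially retroactive structure exposing InsertOp, and the creation of the new leaf and the scapegoat rebalancing are omitted.

```cpp
#include <cstdint>
#include <memory>

using Time = int64_t;
struct Update {};  // stands in for a concrete update (e.g., insert(key), delete-min)

// Each node owns a partially retroactive structure covering its subtree's updates.
template <class PR>
struct CheckpointNode {
  Time lo, hi;                   // time span [lo, hi) covered by the subtree
  std::unique_ptr<PR> partial;   // reflects every update in the subtree
  CheckpointNode* left = nullptr;
  CheckpointNode* right = nullptr;
};

// Apply a retroactive update along the root-to-leaf path: each of the O(log m)
// nodes whose span contains t absorbs the update, for O(A(m) log m) work.
template <class PR>
void insert_op(CheckpointNode<PR>* node, const Update& op, Time t) {
  while (node != nullptr) {
    node->partial->InsertOp(op, t);
    node = (node->left != nullptr && t < node->left->hi) ? node->left : node->right;
  }
}
```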

Theorem 47 Given a partially retroactive data structure that is time fusible, we may construct a fully retroactive version of the data structure using hierarchical checkpointing. This data structure will have an O(A(m) log² m) amortized update time and O(T(m) log m + Q(m, k)) query time.

We prove Theorem 47 in two parts below.

Lemma 48 Our hierarchical checkpointing method produces a fully retroactive data structure with O(A(m) log² m) amortized update time.

Proof. Let F be a fully retroactive data structure based on a time-fusible partially retroactive data structure P. Suppose that m updates have been inserted into F and that the update operation for P runs in O(A(m)) time. We utilize a scapegoat tree [117] to represent the checkpoint tree for F. The checkpoint tree contains all updates to the fully retroactive structure at its leaves, ordered by time. Each internal node, x, is associated with an instance of P that reflects the application of all updates in its subtree. To perform Insert-Op(o, t) or Delete-Op(o, t), we insert the update as a leaf in the checkpoint tree, and apply the update to the instances of P associated with nodes along the root-to-leaf path in O(A(m) log m) time. To rebalance the checkpoint tree, the tree rooted at the scapegoat node is rebuilt. We begin by obtaining a sorted list of the k updates ordered by time by performing an in-order walk of the subtree. We create a balanced binary tree with these k updates at the leaves, and initialize an empty instance of P for each internal node of the subtree. Then, we insert the update at each leaf into each of its O(log k) ancestors. Because applying an update to an instance of P takes O(A(k)) time, the total time required to rebuild a subtree containing k updates is O(kA(k) log k). The total cost of an Insert-Op or Delete-Op operation for the fully retroactive structure is then the sum of the cost of an insertion or deletion and the amortized cost of rebuilding, which is O(A(m) log² m) amortized.

Lemma 49 Our hierarchical checkpointing method produces a fully retroactive data structure with O(T(m) log m + Q(m, k)) query time.

Proof. Suppose that T(m) is the time it takes to fuse any two instances of P, and Q(m, k) is the time it takes to query an instance of P, where m is the total number of updates in P, and k is the number of components that were used to create the fused structure. To perform Get-View(t), we first traverse the checkpoint tree to identify the O(log m) disjoint subtrees that represent the time interval (−∞, t]. The time-fusible partially retroactive structures associated with these subtrees are then fused in order, resulting in a single structure representing the interval (−∞, t]. We can fuse O(log m) instances of P in O(T(m) log m) time. Querying this structure then takes O(Q(m, k)) time. Therefore, the total runtime of Get-View(t) is O(T(m) log m + Q(m, k)).

9.3 Time-fusible partially retroactive priority queue

In this section we present a partially retroactive priority queue that supports a polylogarithmic fusion operation. Specifically, we describe an algorithm that fuses k = O(log m) partially retroactive priority queues containing m updates in O(k log k log m) time. This time-fusible partially retroactive priority queue enables the use of hierarchical checkpointing to obtain a fully retroactive priority queue with polylogarithmic overheads.

9.3.1 Partially retroactive priority queues

We begin with an informal review of a partially retroactive priority queue data structure. To simplify our exposition, we treat the partially retroactive priority queue from [88] as a black box and maintain two auxiliary data structures: Qnow, containing the set of all keys remaining in the priority queue at time t = ∞, and Qdel, containing all keys that were removed from the priority queue at some point in the past. We assume that the partially retroactive priority queue returns, following each retroactive update, the keys which should be inserted into or deleted from Qnow and Qdel. If a priority queue is empty at time t, then a delete-min operation will, by convention, insert a placeholder key of infinite weight into Qdel. It is known that, following a retroactive update at time t, it is only necessary to insert or delete a single key in Qnow and Qdel [88]. We can, therefore, synchronize our auxiliary data structures Qnow and Qdel with the partially retroactive priority queue in O(log m) time. A proof of this claim and an in-depth description of the partially retroactive priority queue data structure can be found in [88, Section 5.4]. The auxiliary Qnow and Qdel structures are maintained using weight-balanced B-trees [323, 19, 9], which for a balance factor d > 4 have the following properties:

• Insertion and deletion operations on a B-tree containing m elements take O(log m) time.

• For all non-root nodes u at height h, the weight w(u) of the subtree rooted at u is bounded as follows: d^h/2 ≤ w(u) ≤ 2d^h.

• The root r of a height-h tree has bounded weight w(r): d^(h−1) ≤ w(r) ≤ 2d^h.

• Tree-split and concatenate operations on a size-m tree take O(log m) time.

• A height-h′ subtree T′ of a height-h weight-balanced B-tree T can be deleted to form the weight-balanced B-tree T − T′ in O(d(h − h′)) time.

A weight-balanced B-tree data structure possessing these properties is described in [19, 9]. Specifically, we apply the result of [19] with balance factor d = 8 to maintain Qnow and Qdel.
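The fusion algorithm that follows only needs a handful of operations from these trees. The sketch below lists an assumed interface with the costs quoted above; the class and method names are illustrative and do not come from the thesis's code or from a particular library.

```cpp
#include <cstddef>
#include <utility>

// Operations the fusion algorithm assumes from the weight-balanced B-trees that
// store Qnow and Qdel (declaration-only sketch).
template <class Key>
class WeightBalancedTree {
 public:
  void insert(const Key& k);        // O(log m)
  void erase(const Key& k);         // O(log m)
  std::size_t size() const;         // number of keys stored
  // Split into (keys < x, keys > x), both weight-balanced, in O(log m) time.
  std::pair<WeightBalancedTree, WeightBalancedTree> split(const Key& x) const;
  // Concatenate with a tree whose keys all exceed ours, in O(log m) time.
  static WeightBalancedTree concat(WeightBalancedTree lo, WeightBalancedTree hi);
  // Delete a height-h' subtree of a height-h tree in O(d(h - h')) time.
  void erase_subtree(const WeightBalancedTree& sub);
};
```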

9.3.2 Fusion algorithm

Before describing our algorithm for fusion, let us better understand the structure of the problem by proving a mathematical relationship between two partially retroactive priority queues that represent two fusible (i.e., disjoint and adjacent) intervals of time.

Lemma 50 Consider two partially retroactive priority queues Q1 and Q2 whose update times lie in the intervals [a, b) and [b, c) respectively. Then, the partially retroactive priority queue Q3 containing all updates in Q1 and Q2 in the interval [a, c) has the property that

Q3,now = Q2,now ∪ max-A{Q1,now ∪ Q2,del} (9.1)

Q3,del = Q1,del ∪ min-D{Q1,now ∪ Q2,del} (9.2) where A = |Q1,now|−|Q2,del|, D = |Q2,del| and max-C{S} denotes the C largest elements in the set S.

GetSplitKey(s, T1, ..., Tk):
1. If N = Σi |Ti| < C (for a constant C), sort ∪i Ti and return the s-th element.
2. If s < N/2, set s = N − s and "invert" the order of each Ti.
3. For each Ti, pick a leftmost subtree Tmi containing keys in the range (−∞, mi), where mi has an order statistic in Ti contained in the range (|Ti|/256, |Ti|/4).
4. Assign each mi the weight wi = |Ti|. Using weighted selection, select the (N/4)-th element mj among m1, m2, ..., mk.
5. For mi ≤ mj, let Ti′ = Ti − Tmi. For mi > mj, let Ti′ = Ti.
6. Set snew = s − Σi (|Ti| − |Ti′|) and return GetSplitKey(snew, T1′, ..., Tk′).

Fuse(Q1^k, Q2^k):
1. A = |Q1,now| − |Q2,del|.
2. Form a list of 2k trees L = T1, ..., T2k by concatenating the list of k trees representing Q1,now with the k trees representing Q2,del.
3. x = GetSplitKey(A, T1, ..., T2k).
4. For i = 1, 2, ..., 2k, split the tree Ti on the key x to obtain two trees Ti,> and Ti,<.
5. Q3,now = Q2,now + T1,>, ..., T2k,>.
6. Q3,del = Q1,del + T1,<, ..., T2k,<.
7. Return Q3.

Figure 9-1: Pseudocode for (a) the GetSplitKey operation and (b) the Fuse operation. GetSplitKey takes a rank s and a list of k binary trees, and returns a key x such that s keys in T1, T2, ..., Tk are less than x.

Proof. Note that fusing Q1 with Q2 is equivalent to retroactively inserting all keys in Q1,now into the timeline of Q2 at t = −∞. With this observation, the lemma can be shown to follow from an iterative application of [88, Lemma 6]. Here we present an intuitive sketch of the proof.

First observe that all keys x ∈ Q1,del are also deleted in the fusion Q3. Next, suppose there is a key x ∈ Q1,now that is smaller than the maximum element dmax in Q2,del. If we insert x into Q2 at time t = −∞, then any delete-min in Q2's timeline will strike x before it strikes dmax. The element dmax is never deleted after the insertion of x, because all keys d′ deleted after dmax are less than dmax. We conclude that Q3,del contains the elements in Q1,del and the minimum D = |Q2,del| elements of Q1,now ∪ Q2,del. To determine the contents of Q3,now, we observe that no element in Q2,now can be deleted as the result of adding additional elements at t = −∞, and observe that keys which were inserted, but never deleted, in Q3's timeline must be in Q3,now. It follows that Q3,now contains the contents of Q2,now and the maximum A = |Q1,now| − |Q2,del| elements of Q1,now ∪ Q2,del.

Using Lemma 50, we can construct a time-fused representation of Q3 from Q1 and Q2 in polylogarithmic time. We will represent each of Q3,now and Q3,del as a list of trees obtained via tree-split operations consistent with the application of Equation (9.1) and Equation (9.2). We say that a time-fusible partially retroactive priority queue has order k, and use the superscript notation Q^k, if Qnow and Qdel are represented as lists of at most k trees. In Figure 9-1 we provide the pseudocode for Fuse, which fuses two partially retroactive priority queues Q1^k, Q2^k to obtain Q3^3k. Step 1 computes the value of A from Lemma 50, and Step 2 concatenates the lists of trees representing Q1,now and Q2,del to form a list L containing 2k trees. Step 3 computes a "split-key" x that is greater than A elements contained in trees of L. Next, each tree in L is split in Step 4 by performing a tree-split operation to divide each tree Ti into a tree Ti,< containing all keys in Ti that are less than x and a tree Ti,> containing all keys in Ti that are greater than x. The trees Ti,>

for i = 1, 2,..., 2k combined with the trees in Q2,now contain the elements satisfying the relation of Equation (9.1) in Lemma 50, and similarly the trees in Q1,del and in Ti,< for i = 1, 2,..., 2k contain the elements satisfying the relation of Equation (9.2). The following theorem proves that Fuse fuses two partially retroactive priority queues of order k in O(k log m) time.

Theorem 51 Consider two partially retroactive priority queues Q1^k and Q2^k with order k containing m operations. Then Fuse(Q1^k, Q2^k) runs in O(k log m) time.

Proof. We first show that GetSplitKey runs in O(k log m) time. Our algorithm for finding the split key is an adaptation of the approach of Frederickson and Johnson to compute order statistics for sorted arrays [113]. Steps 1, 2, and 4 of GetSplitKey run in O(k) time (step 4 uses linear-time weighted selection from [290]).

Step 3 finds a leftmost subtree Tmi whose contents are contained in the range (−∞, mi) and where the order statistic of mi is in the range (|Ti|/256, |Ti|/4). We show that step 3 runs in O(k) time by showing that for each Ti such a subtree exists at a distance of at most 2 from the root. Consider a height-h weight-balanced B-tree with balance factor d, root node r, and an internal node u at height h − 2. The weight-balance criteria for B-trees provided in Section 9.3.1 imply that the ratio w(u)/w(r) is bounded in the range (1/256, 1/4). The key mi can, therefore, be found in O(1) time by selecting the maximum key from the leftmost height-(h − 2) subtree of Ti.

Step 5 deletes the subtree Tmi from Ti if mi ≤ mj. The difference in the heights of Tmi and Ti is at most 2, which allows Ti − Tmi to be obtained in O(d) time while preserving weight balance. For d = 8, this step runs in O(k) time. Note that the subtrees deleted in this step contain elements whose order statistic is strictly less than N/2, and thus these subtrees cannot contain the s-th order statistic. To prove this we show that the order statistic of mj, computed in step 4, is less than N/2. The key mj is selected in step 4 such that 3N/4 elements are contained in trees Ti for which mi > mj. For each such i, the key mi is smaller than at least 3|Ti|/4 of the elements in Ti. The key mj is, therefore, smaller than at least 9N/16 elements, and thus has an order statistic less than N/2. Step 6 updates the value of s to reflect the reduced problem size, and recursively calls GetSplitKey. To bound the depth of the recursion, it is sufficient to show that step 5 eliminates a constant fraction of the elements. Since a total of N/4 elements are contained in trees Ti for which mi ≤ mj, and at least |Ti|/256 elements in Ti are smaller than mi, step 5 eliminates at least N/1024 elements. The recursion depth is, therefore, bounded by O(log N). Since N = O(m), the total runtime of GetSplitKey is O(k log m). Next let us analyze the Fuse operation. Steps 1–2 and 5–6 of Fuse can be performed in O(k) time. Step 3, which computes the split key, runs in O(k log m) time, and step 4 may be performed in O(k log m) time by performing an O(log m)-time tree-split operation on each of the 2k trees. The runtime of Fuse is bounded by the time to compute the split key, and therefore is O(k log m).
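To make the role of the split key concrete, the following self-contained baseline computes the same quantity by brute force over sorted key lists standing in for the trees T1, ..., Tk. The names are illustrative; GetSplitKey obtains the same answer in O(k log m) time without materializing the merged list.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Given a rank s and several sorted key lists, return a key x such that exactly
// s keys across all lists are less than x (keys assumed distinct).
long split_key_naive(long s, const std::vector<std::vector<long>>& lists) {
  std::vector<long> all;
  for (const auto& l : lists) all.insert(all.end(), l.begin(), l.end());
  std::sort(all.begin(), all.end());
  return all[s];   // exactly s keys are strictly smaller than this element
}

int main() {
  std::vector<std::vector<long>> lists = {{1, 4, 9}, {2, 3, 7}, {5, 6, 8}};
  std::printf("%ld\n", split_key_naive(4, lists));  // prints 5: keys {1,2,3,4} lie below it
  return 0;
}
```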

The bound proved in Theorem 51 depends on the order k of the two time-fusible partially retroactive priority queues Q1^k, Q2^k being merged. It turns out that the fusion of k partially retroactive priority queues can be constructed efficiently while being represented using only

O(k) trees by combining trees in Qnow and Qdel that originated from a split operation on a common tree. The ability to perform such a reduction relies on the following lemma.

Lemma 52 Let Q1, ..., Qk denote k partially retroactive priority queues, each with disjoint time intervals that increase monotonically with i. Let Q∗ be a priority queue containing the updates in Q1, ..., Qk applied consecutively. Then Q∗,now and Q∗,del consist of contiguous intervals of Qi,now and Qi,del, i.e.,

Q∗,now = ∪i∈Snow Qi,now[ai, bi] ∪ ∪i∈Sdel Qi,del[a′i, b′i]   (9.3)

Q∗,del = ∪i∈Tnow Qi,now[ci, di] ∪ ∪i∈Tdel Qi,del[c′i, d′i]   (9.4)

for some sets Snow, Sdel, Tnow, Tdel ⊆ {1, ..., k} and elements ai, a′i, bi, b′i, ci, c′i, di, d′i, where for a set S and a, b ∈ S we let S[a, b] = {x ∈ S : a ≤ x ≤ b}.

Proof. The proof follows almost immediately from Lemma 50. We prove it by induction on k. Note that when k = 1 the statement is trivial, as clearly in this case Q∗,now = Q1,now and Q∗,del = Q1,del. Now suppose we wish to prove the lemma for the given k > 1 and we know that it holds for k − 1. Let Q′ denote the priority queue that contains the updates in Q1, ..., Qk−1 applied consecutively. By the inductive hypothesis we know that Q′now and Q′del consist of contiguous intervals of Qi,now and Qi,del for i < k. Applying Lemma 50 we have that

Q∗,now = Qk,now ∪ max-A{Q′now ∪ Qk,del} for A = |Q′now| − |Qk,del|   (9.5)

and

Q∗,del = Q′del ∪ min-B{Q′now ∪ Qk,del} for B = |Qk,del|   (9.6)

However, as [a, b] ∩ (−∞, c] = [a, min{b, c}] and [a, b] ∩ [c, ∞) = [max{a, c}, b], we see that Q∗,now and Q∗,del each consist of contiguous intervals of the Qi,now and Qi,del (here the intervals are the intervals within Q′now as well as Qk,del, and the inequality constraint is the one imposed by restricting to max-A or min-B). Note that this argument uses the fact that Equation (9.5) and Equation (9.6) each consist of a union of disjoint sets and therefore of disjoint intervals. Thus we have shown that the result holds for k = 1, and we have shown that if the result holds for k − 1 then it holds for k. The result follows by induction.

The preceding lemma allows us to tweak the fusion algorithm to guarantee that the order of the fusion of k time-fusible partially retroactive priority queues is bounded by 2k. This is accomplished by adding a post-processing step PostFuse immediately after the fusion procedure Fuse. After obtaining the fusion Q3 of Q1 and Q2, the trees representing Q3,now are checked in PostFuse to identify pairs of split trees that were obtained by splitting a common tree. By Lemma 52 these pairs of trees span disjoint intervals and can, therefore, be concatenated in logarithmic time.

Lemma 53 The fusion of k time-fusible partially retroactive priority queues has order bounded by 2k and runs in O(k log m) time when using the PostFuse procedure.

Proof. We utilize the Fuse procedure to fuse two time-fusible partially retroactive priority queues Q1^k and Q2^k to obtain Q3^2k. Following the fusion, the PostFuse procedure identifies pairs of trees in Q3,now which originated from a previous split operation on a common tree T. These pairs contain keys

that span disjoint intervals I1, I2, and thus can be concatenated to form a single tree in O(log m) time. This is done by sorting such trees in O(k lg k) time by their associated interval, and then performing each concatenation in O(log m) time. By Lemma 52 the keys in the concatenated tree form a contiguous interval I1 ∪ I2 in T and, as such, performing the concatenation does not limit the algorithm's ability to perform additional concatenations between intervals in T. The time required to perform PostFuse is dominated by the time to perform Fuse, and therefore the time bound follows from Theorem 51.

Theorem 54 Consider k = O(log m) time-fusible partially retroactive priority queues. The time to fuse these k data structures is bounded by O(k log k log m), and the time required to query this structure is O(log² m).

Proof. We arrange the k time-fusible structures at the leaves of a balanced merge tree of height log k. By Lemma 53 the sum of the orders of the time-fusible partially retroactive priority queues at level i in the merge tree is O(k). The total work to perform the fusions at level i is, therefore, O(k log m). Since there are log k = O(log log m) levels in the merge tree, the total time is O(k log m log log m). To query the fused structure, we perform a query on each of the O(log m) trees representing Qnow, which can be done in O(log² m) time.
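A sketch of the level-by-level merging follows. It is illustrative only: PR stands for any time-fusible structure exposing a static Fuse that combines two adjacent timelines (as in the interface sketched in Section 9.2.1), and the input structures are assumed to be ordered by time span.

```cpp
#include <memory>
#include <vector>

// Repeatedly fuse adjacent pairs until a single fused structure remains,
// giving log k rounds of fusions over the k input structures.
template <class PR>
std::unique_ptr<PR> fuse_all(std::vector<std::unique_ptr<PR>> parts) {
  if (parts.empty()) return nullptr;   // nothing to fuse
  while (parts.size() > 1) {
    std::vector<std::unique_ptr<PR>> next;
    for (size_t i = 0; i + 1 < parts.size(); i += 2)
      next.push_back(PR::Fuse(*parts[i], *parts[i + 1]));
    if (parts.size() % 2 == 1)          // odd structure out moves up unfused
      next.push_back(std::move(parts.back()));
    parts = std::move(next);
  }
  return std::move(parts.front());
}
```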

9.4 Fully retroactive priority queue

In this section we describe the design of a fully retroactive priority queue that uses hierarchical checkpointing. We begin in Section 9.4.1 by showing how to apply our technique of hierarchical checkpointing using the time-fusible partially retroactive priority queue of Section 9.3. This yields a fully retroactive priority queue that supports retroactive updates in O(log³ m) amortized time, retroactive queries in O(log² m log log m) time, and Find-Deletion-Time in O(log³ m log log m) time. Next, in Section 9.4.2, we optimize our application of hierarchical checkpointing for priority queues to obtain O(log² m) amortized-time updates and O(log² m)-time Find-Deletion-Time queries.

9.4.1 Obtaining full retroactivity using hierarchical checkpointing

Here we analyze the fully retroactive priority queue obtained by a straightforward application of hierarchical checkpointing. The time-fusible partially retroactive priority queue described in Section 9.3 meets the prerequisites of Theorem 47 needed to perform the partial-to-full transformation. Consequently we can directly apply this theorem to obtain a fully retroactive priority queue which follows the structure laid out in Section 9.2. A checkpoint tree contains all retroactive updates ordered by time, and each internal node maintains a time-fusible partially retroactive priority queue that contains the updates within its subtree. The checkpoint-tree data structure used in this fully retroactive priority queue is shown in Figure 9-2(a) after 16 retroactive operations have been performed. In this example, the checkpoint tree has 16 leaves, each corresponding to a retroactive operation on the priority queue. The time-fusible partially retroactive priority queue data structure described in Section 9.3 is used to represent the partial checkpoints in a checkpoint tree. Each internal node, Q[a,b), maintains a time-fusible partially retroactive priority queue that contains all retroactive operations in its subtree (i.e., all operations occurring at times t ∈ [a, b)).

[Figure 9-2(a) depicts a 16-leaf checkpoint tree with root Q[0,16), internal nodes Q[0,8), Q[8,16), Q[0,4), Q[4,8), Q[8,12), Q[12,16), and Q[0,2) through Q[14,16), with one retroactive operation per leaf at times t = 0, ..., 15. Figure 9-2(b) depicts the partial checkpoints Q[0,8), Q[8,10), and Q[10,11), each with the contents and sizes of its Qnow and Qdel, that are fused to answer a query at t = 10.]

Figure 9-2: Hierarchical checkpointing for fully retroactive priority queue. Illustration of the checkpoint tree for a fully retroactive priority queue with 16 operations.

The Get-View(t) operation is illustrated in Figure 9-2(b). A checkpoint representing the priority queue at time t = 10 is constructed by combining 3 partial checkpoints from the checkpoint tree. The time-fusible partially retroactive priority queues Q[0,8), Q[8,10), and Q[10,11) that are highlighted in Figure 9-2 are collected and then merged to obtain a partially retroactive priority queue containing all updates in the interval (−∞, 10].
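The selection of partial checkpoints is the canonical prefix decomposition over the checkpoint tree. The following self-contained example (an illustrative sketch assuming a perfect tree over 16 leaves, as in Figure 9-2, with half-open node intervals) reproduces the three nodes used for t = 10.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Which checkpoint-tree nodes cover the prefix (-inf, t]? Intervals are [lo, hi).
void collect(long lo, long hi, long t, std::vector<std::pair<long, long>>* out) {
  if (lo > t) return;            // node lies entirely after t
  if (hi <= t + 1) {             // node lies entirely within (-inf, t]
    out->push_back({lo, hi});
    return;
  }
  long mid = lo + (hi - lo) / 2; // internal node: recurse into both children
  collect(lo, mid, t, out);
  collect(mid, hi, t, out);
}

int main() {
  std::vector<std::pair<long, long>> nodes;
  collect(0, 16, 10, &nodes);    // query time t = 10 on the 16-update tree
  for (auto [a, b] : nodes) std::printf("Q[%ld,%ld) ", a, b);
  std::printf("\n");             // prints: Q[0,8) Q[8,10) Q[10,11)
  return 0;
}
```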

Theorem 55 There exists a fully retroactive priority queue that supports retroactive updates in O(log³ m) amortized time, queries in O(log² m log log m) time, and the operation Find-Deletion-Time in O(log³ m log log m) time.

Proof. The time-fusible partially retroactive priority queue described in Section 9.3 supports retroactive updates in O(log m) time. Applying Lemma 48 with A(m) = log m shows that retroactive updates run in O(log³ m) amortized time. By Theorem 54, the time to merge O(log m) time-fusible partially retroactive priority queues is bounded by O(log² m log log m). Similarly, the time to query this merged structure is bounded by O(log² m), since the merged priority queue has order O(log m). Applying Lemma 49 with T(m) = O(log² m) and Q(m) = O(log² m log log m) shows that retroactive queries run in O(log² m log log m) time. Finally, the Find-Deletion-Time(x) operation can be performed via binary search to identify the first time t for which the key x is not in the queue. This involves O(log m)

retroactive queries, showing that Find-Deletion-Time runs in O(log³ m log log m) time.
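A sketch of that binary search is shown below. The predicate present_at is a hypothetical stand-in for a Get-View(t) query that checks whether x is still in the queue at time t; the example main uses a synthetic predicate purely so the sketch runs.

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>

using Time = int64_t;

// Find the first time at which key x is no longer present, given that
// present_at(lo) is true and present_at(hi) is false.
Time find_deletion_time(Time lo, Time hi,
                        const std::function<bool(Time)>& present_at) {
  while (hi - lo > 1) {
    Time mid = lo + (hi - lo) / 2;
    if (present_at(mid)) lo = mid; else hi = mid;  // keep the invariant
  }
  return hi;   // first time at which x has been deleted
}

int main() {
  // Synthetic predicate: the key survives strictly before time 7.
  auto present = [](Time t) { return t < 7; };
  std::printf("%lld\n", (long long)find_deletion_time(0, 16, present));  // prints 7
  return 0;
}
```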

9.4.2 Faster retroactive updates and Find-Deletion-Time queries

The general transformation described in Section 9.2 maintains balance in the checkpoint tree by reapplying all updates in rebuilt subtrees. As shown in Lemma 56, a checkpoint tree for priority queues can be rebuilt more efficiently.

Lemma 56 A subtree of the fully retroactive priority queue’s checkpoint tree containing m updates can be rebuilt in O(m log m) time.

Proof. Consider a node u in the checkpoint tree with children v and w whose subtree contains m updates. The time-fusible priority queue containing all updates in u's subtree can be computed in O(m) time from the two time-fusible priority queues associated with v and w. First, the Fuse operation outlined in Section 9.3 is performed to merge v and w. The resulting time-fusible priority queue may represent Qnow and Qdel using multiple trees, but these trees can be merged in O(m) time. Using this merge procedure, a subtree of the checkpoint tree is rebuilt by first placing all m updates at the leaves of a balanced tree, and then performing merges from the leaves upward. Each update is involved in O(log m) merges, so the total time to rebuild the subtree is O(m log m). A more efficient implementation of the Find-Deletion-Time(k) operation can be obtained by performing a binary search directly on the checkpoint tree. The high-level idea is to perform a binary search for the time of deletion by keeping track of the current number of surviving keys that are less than or equal to k at any particular time. This result is stated in Lemma 57 without proof.

Lemma 57 The Find-Deletion-Time operation, which performs a binary search directly on the checkpoint tree data structure, runs in O(log² m) time.

Theorem 58 The fully retroactive priority queue performs updates in O(log² m) amortized time when using a checkpoint tree with the memoized subtree rebuilding procedure, and performs Find-Deletion-Time operations in O(log² m) time.

Proof. We may improve the runtime of update operations given in Theorem 55 to O(log² m) amortized time by using the rebuild procedure given in Lemma 56. We still perform update operations by inserting or deleting the update from all the partially retroactive structures along our search path, as in Lemma 48. However, we provide a tighter analysis for rebalancing the tree. Since our scapegoat tree has balance factor α = 9/10, we perform Ω(m) operations before rebuilding a subtree. Each update operation may charge log m to every node along the search path. Therefore, each update operation creates O(log² m) charges. Rebuilding a subtree of size m takes O(m log m) time, as given in Lemma 56. We perform at most O(log m) rebuilds. Each node that is rebuilt has a stored charge of O(m log m). Therefore, the time of a rebuild may be paid for by the stored charge. The time per update operation is then O(log² m).

9.4.3 Faster Find-Deletion-Time queries

Next we discuss how to implement Find-Deletion-Time in O(log² m) time by performing a search directly on the checkpoint tree data structure. In this section, for ease of explanation, we will use a counter, di, to keep track of the number of Delete-Min operations performed on the partially retroactive structure in node i. Given a key k, Find-Deletion-Time(k) returns the deletion time, tk, of the element, or ∞ if the element was never deleted or never inserted into the queue. By inspection, an element was ever inserted into a partially retroactive priority queue if and only if it is present in either the Qnow or Qdel of the queue. Therefore, an element was inserted into the fully retroactive priority queue if and only if it is present in the partially retroactive priority queue at its root. Our procedure first checks the root to verify that the element was inserted and deleted; it returns ∞ otherwise. For the purposes of this operation, we assume that all keys inserted into the queue are unique. First, we check whether k is in the Qdel contained in the root of the fully retroactive tree. If k is in the root Qdel, then it was deleted at some time before the present, and we proceed with the following recursive procedure to find tk, the time of deletion of key k. If it is not in Qdel, then it was either never deleted or never inserted, and we return ∞. We define s to be the number of surviving elements with keys less than or equal to k, c to be the current node (at the very first step of the recursive procedure c is the root), cl to be the left child of c, cr to be the right child of c, cgl to be the left child of cr, and cgr to be the right child of cr. If a child does not exist, then the value of the corresponding variable is set to null. The key concept behind the Find-Deletion-Time(k) algorithm is to binary search for the time of deletion by keeping track of the current number of surviving elements that are less than or equal to k at any particular time. If this counter, s, becomes zero, then k has been deleted. To help us in the detailed explanation of this algorithm in Figure 9-3, we define wi to be the number of elements in the Qnow of node i that are at most k, wi = |{k′ : k′ ∈ Qi,now and k′ ≤ k}|, vi to be the number of elements in the Qdel of node i that are at most k, vi = |{k′ : k′ ∈ Qi,del and k′ ≤ k}|, and s′i = s + vi − di for node i. Each time we check a node i, its value s′i tells us whether k was deleted in the subtree rooted at i. If s′i ≤ 0, then an update contained in i deleted k, and we binary search for the precise update. Otherwise, we can conclude that none of the updates in the subtree rooted at i deleted k, and so we do not need to search any more nodes in the subtree.

Acknowledgments

This research was initiated during the open-problem sessions of the MIT class 6.851: Advanced Data Structures taught by E. Demaine in Spring 2014. We thank Adam Hesterberg, Ofir Nachum, and other members of the class for helpful discussion regarding this problem.

Find-Deletion-Time(k)

1. c is the current node. Set c to the root if c has not yet been set. Let s = 0.

2. If k ∈ Qnow of c:

   (a) If c is the root, return ∞.
   (b) Else, set s = max(0, s′c) + wc. Update c to be cgl if it exists. If cr only contains one update, then that update deletes k; return the time of that update. Otherwise, go back to Step 2.

3. Else if k ∈ Qdel of c:

   (a) Update c to be cl.
   (b) Go back to Step 2.

4. Else (k ∉ Qdel and k ∉ Qnow):

   (a) If k ∈ Qnow was found at a previous step:
       i. If s′c ≤ 0: check whether there is only one update within the subtree rooted at c. If there is only one update, return the time, t, of the update. If there is more than one update, set c to be cl.
       ii. If s′c > 0: set s = s′c + wc. Set c to be cgl if it exists. If cgl does not exist, then cr contains only one update and that update deletes k; return the time of the update.
       iii. Go back to Step 2.
   (b) Else (k was not found in a previous step):
       i. Set s = max(0, s′c) + wc.
       ii. Set c to be cgl.
       iii. Go back to Step 2.

Figure 9-3: Pseudocode for the Find-Deletion-Time(k) operation.

Chapter 10

Conclusion

10.1 Summary

This thesis has shown that the complexity of parallel programming can be reduced by developing programming technologies that facilitate the development of quality code that has a simple, understandable structure and performs well in practice. There were three categories of programming technologies developed in the artifacts of this thesis: shared-memory multicore algorithms, multicore-centric systems for scientific computing, and tools for measuring and optimizing anomalous, or worst-case, behavior in parallel systems. In Chapters 2–4, I discussed artifacts that developed shared-memory multicore algorithms that are deterministic and have the semantics of serial code. The Chromatic artifact in Chapter 2 showed how dynamic data-graph computations can be executed deterministically and with serial semantics by using chromatic scheduling. The Color artifact in Chapter 3 demonstrated that vertex-ordering heuristics can be used in parallel greedy graph coloring algorithms to compute the same result as the serial algorithm using the same ordering. The Color artifact showed how the parallelism of the greedy algorithm can be theoretically analyzed, and how the historically useful serial ordering heuristics can be naturally coarsened to increase theoretical parallelism. The PARAD artifact in Chapter 4 provided a work-efficient and parallelism-preserving algorithm for performing automatic reverse-mode differentiation of parallel programs, extending the commonly used serial program transformation. The results in these artifacts demonstrated that the complexity of shared-memory multicore programming can be reduced, in many cases, by using parallel algorithms that have the semantics of serial code without compromising theoretical scalability or real-world performance. In Chapters 5–6, I discussed artifacts that demonstrated the effectiveness of multicore-centric computing systems for scientific computing in the field of connectomics. The Connectomics artifact in Chapter 5 demonstrated that a mixture of traditional performance optimizations and the application of quality parallel algorithms can allow a single large multicore to outperform more complex systems employing distributed computing and GPUs. The Alignment artifact in Chapter 6 showed how relatively simple algorithm design techniques and careful pipeline design can enable a multicore-centric system to scale to larger input sizes and scale horizontally over multiple machines in the cluster with low overhead. The results in these artifacts demonstrated that well-designed multicore-centric systems can be highly effective at solving problems in scientific computing, even out-performing systems using GPUs and distributed computing.

In Chapters 7–8, I discussed artifacts related to the analysis and optimization of anomalous, or worst-case, behavior in parallel systems. The Cilkmem artifact in Chapter 7 provided theoretically efficient algorithms and implementations for measuring the exact and approximate p-processor memory high-water mark in fork-join parallel programs. The Cilkmem tool, which implements these algorithms, allows practitioners to measure the maximum memory their program requires in a worst-case execution. The Reissue artifact in Chapter 8 illustrated the effectiveness of the simple single-time random-reissue (SingleR) policy family in practice and in theory, using a simplified analytical model to compare the relative power of differently parametrized reissue policy families. The results in these artifacts demonstrated the ability of principled tools to analyze and optimize application-specific performance objectives in ways that have theoretical guarantees and are effective in practice.

10.2 Taming complexity in a post-Moore’s-law world

An examination of current trends in the semiconductor industry foretells a significant rise in the importance of software performance engineering.

The historical decoupling of application logic from performance.

In the early 2000s, computer hardware engineers were able to offer a simple and compelling value proposition to industry and the other sciences: focus on the logic of your domain-specific application, and rely on the seemingly inexorable march of microprocessor improvements to solve your performance and scaling problems. The end of Dennard scaling in 2005, however, marked the end of this simple argument. As Dennard scaling ended, the semiconductor industry turned towards the design of multicore hardware which could be predictably improved over time by relying on technological advances to increase transistor density. Thus, the value proposition was amended: if you design your application to have copious amounts of logical parallelism, the semiconductor industry will provide you with predictable performance improvements over time. By 2016, the semiconductor industry acknowledged that increases in transistor density had plateaued. The International Technology Roadmap for Semiconductors announced a strategy called "More than Moore" that turned to focus on the needs of applications to drive chip development instead of focusing on further scaling of semiconductor technology for general purpose computing. The simple value proposition between the semiconductor industry and users of computing was now dead. Going forward, the engineering of high performance computing technology and domain-specific program logic would be intertwined. The increased complexity of programming modern hardware has made it more difficult for even expert programmers to develop high-quality software systems. For the scientific computing community, developing quality software systems that can take advantage of the latest supercomputing technology has become a costly endeavour: requiring more time, more expensive hardware, more expert programmers, and more complex software systems.

Building a toolbox of principled technologies for average programmers

I believe that we can make it easier for average programmers to tap into the power provided by new hardware systems by developing a toolbox of principled programming technologies.

Developing quality software for scientific computing depends on the availability of performant algorithms, data structures, software libraries, and system design principles. The development of such programming technologies that are simple to use, theoretically sound, and performant can allow programmers to focus on their application-specific logic. The artifacts in this thesis presented several programming technologies for the shared-memory multicore platform. Of course, the programming technologies presented in this thesis only address a small subset of the challenges involved in general parallel programming. What this thesis demonstrates, however, is that the pursuit of simple, theoretically sound, and performant programming technologies can succeed in reducing the complexity of programming parallel systems.

Bibliography

[1] L. Adams and J. Ortega. A multi-color SOR method for parallel computation. In ICPP, 1982.

[2] Pankaj K Agarwal. Range searching. Technical report, DTIC Document, 1996.

[3] Eric Allen, David Chase, Joe Hallett, Victor Luchangco, Jan-Willem Maessen, Sukyoung Ryu, Guy L. Steele Jr., and Sam Tobin-Hochstadt. The Fortress Language Specification Version 1.0. Sun Microsystems, Inc., March 2008.

[4] J. R. Allwright, R. Bordawekar, P. D. Coddington, K. Dincer, and C. L. Martin. A comparison of parallel graph coloring algorithms. Technical report, Northeast Parallel Architecture Center, Syracuse University, 1995.

[5] Noga Alon, László Babai, and Alon Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. J. Algorithms, 7(4):567–583, December 1986.

[6] Amazon. Amazon elastic file system. https://aws.amazon.com/efs/, 2019.

[7] David G Andersen, Hari Balakrishnan, M Frans Kaashoek, and Rohit N Rao. Improving web availability for clients with monet. In Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation-Volume 2, pages 115–128. USENIX Association, 2005.

[8] Pablo Arbelaez, Michael Maire, Charless Fowlkes, and Jitendra Malik. Contour detec- tion and hierarchical image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(5):898–916, 2011.

[9] Lars Arge and Jeffrey Scott Vitter. Optimal dynamic interval management in external memory. In Foundations of Computer Science, 1996. Proceedings., 37th Annual Symposium on, pages 560–569. IEEE, 1996.

[10] Esther M Arkin and Ellen B Silverberg. Scheduling jobs with fixed start and end times. Discrete Applied Mathematics, 1987.

[11] Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. Thread scheduling for multiprogrammed multiprocessors. In SPAA, pages 119–129, 1998.

[12] Krste Asanovic, Rastislav Bodik, James Demmel, Tony Keaveny, Kurt Keutzer, John Kubiatowicz, Nelson Morgan, David Patterson, Koushik Sen, John Wawrzynek, et al. A view of the parallel computing landscape. Communications of the ACM, 52(10):56–67, 2009.

[13] E. Ayguade, N. Copty, A. Duran, J. Hoeflinger, Yuan Lin, F. Massaioli, X. Teruel, P. Unnikrishnan, and Guansong Zhang. The design of OpenMP tasks. IEEE Transactions on Parallel and Distributed Systems, 20(3):404–418, 2009.

[14] Lars Backstrom, Dan Huttenlocher, Jon Kleinberg, and Xiangyang Lan. Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '06, pages 44–54, New York, NY, USA, 2006. ACM.

[15] Leonid Barenboim and Michael Elkin. Distributed (∆+1)-coloring in linear (in ∆) time. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 111–120, New York, NY, USA, 2009. ACM.

[16] Rajkishore Barik, Zoran Budimlic, Vincent Cavè, Sanjay Chatterjee, Yi Guo, David Peixotto, Raghavan Raman, Jun Shirako, Sağnak Taşirlar, Yonghong Yan, Yisheng Zhao, and Vivek Sarkar. The Habanero multicore software research project. In Proceedings of the 24th ACM SIGPLAN Conference Companion on Object Oriented Programming Systems Languages and Applications, OOPSLA '09, pages 735–736, New York, NY, USA, 2009. ACM.

[17] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. Surf: Speeded up robust features. In European conference on computer vision, pages 404–417. Springer, 2006.

[18] Atılım Güneş Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res., 18(1):5595–5637, January 2017.

[19] Michael A Bender, Erik D Demaine, and Martin Farach-Colton. Cache-oblivious B-trees. In Foundations of Computer Science, 2000. Proceedings. 41st Annual Symposium on, pages 399–409. IEEE, 2000.

[20] Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, and Charles E. Leiserson. On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs. In 16th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 133–144, 2004.

[21] J. L. Bentley. Solutions to Klee's rectangle problems. Technical report, Carnegie Mellon University, 1977.

[22] Stefan Berchtold, Christian Böhm, Bernhard Braunmüller, Daniel A Keim, and Hans-Peter Kriegel. Fast parallel similarity search in multimedia databases. In ACM SIGMOD Int. Conf. on Management of Data, 1997.

[23] Tom Bergan, Owen Anderson, Joseph Devietti, Luis Ceze, and Dan Grossman. CoreDet: A compiler and runtime system for deterministic multithreaded execution. SIGPLAN Not., 45(3):53–64, March 2010.

[24] Emery D. Berger, Ting Yang, Tongping Liu, and Gene Novark. Grace: Safe multithreaded programming for C/C++. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '09, pages 81–96, Orlando, Florida, USA, 2009. ACM.

[25] Gilles Bertrand and Zouina Aktouf. Three-dimensional thinning algorithm using subfields. Volume 2356, pages 113–124, 1995.

[26] Dimitri P. Bertsekas and John N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, 1989.

[27] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The PARSEC benchmark suite: Characterization and architectural implications. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT '08, pages 72–81, New York, NY, USA, 2008. ACM.

[28] Christian Bischof, Niels Guertler, Andreas Kowarz, and Andrea Walther. Parallel reverse mode automatic differentiation for openmp programs with adol-c. In Advances in Automatic Differentiation, pages 163–173, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.

[29] Christian H. Bischof. Issues in parallel automatic differentiation. In Proceedings of the 1991 International Conference on Supercomputing, pages 146–153. ACM Press, 1991.

[30] Guy E. Blelloch. Prefix sums and their applications. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1990.

[31] Guy E. Blelloch. NESL: A nested data-parallel language. Technical report, Carnegie Mellon University, Pittsburgh, PA, USA, 1992.

[32] Guy E. Blelloch. Programming parallel algorithms. Commun. ACM, 39(3):85–97, March 1996.

[33] Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Julian Shun. Internally deterministic parallel algorithms can be fast. SIGPLAN Not., 47(8):181–192, February 2012.

[34] Guy E. Blelloch, Jeremy T. Fineman, and Julian Shun. Greedy sequential maximal independent set and matching are parallel on average. In ACM SPAA, 2012.

[35] Guy E. Blelloch, Charles E. Leiserson, Bruce M. Maggs, C. Greg Plaxton, Stephen J. Smith, and Marco Zagha. A comparison of sorting algorithms for the connection machine CM-2. In Proceedings of the Third Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’91, pages 3–16, New York, NY, USA, 1991. ACM.

[36] Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, and Keith H. Randall. An analysis of dag-consistent distributed shared-memory algorithms. In Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 297–308, Padua, Italy, June 1996.

[37] Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 207–216, Santa Barbara, California, July 1995.

[38] Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 37(1):55–69, 1996.

[39] Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. In Proceedings of the IEEE Symposium on Foundations of Computer Science, pages 356–368, November 1994.

[40] Robert D. Blumofe and Charles E. Leiserson. Space-efficient scheduling of multithreaded computations. SIAM J. Comput., 27(1):202–229, February 1998.

[41] Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46(5):720–748, September 1999.

[42] Robert D. Blumofe and Dionisios Papadopoulos. Hood: A user-level threads library for multiprogrammed multiprocessors. Technical report, University of Texas at Austin, 1998.

[43] Robert L. Bocchino, Jr., Vikram S. Adve, Sarita V. Adve, and Marc Snir. Parallel programming must be deterministic by default. In Proceedings of the First USENIX Conference on Hot Topics in Parallelism, HotPar’09, pages 4–4, Berkeley, CA, USA, 2009. USENIX Association.

[44] Léon Bottou, Frank E. Curtis, and Jorge Nocedal. Optimization methods for large-scale machine learning. SIAM Review, 60(2):223–311, 2018.

[45] Peter J Braam and Rumi Zahir. Lustre: A scalable, high performance file system. Cluster File Systems, Inc, 2002.

[46] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, and Skye Wanderman-Milne. JAX: Composable transformations of Python+NumPy programs, 2018.

[47] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000.

[48] Daniel Brélaz. New methods to color the vertices of a graph. CACM, 1979.

[49] Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, April 1974.

[50] Preston Briggs. Register allocation via graph coloring. PhD thesis, Rice University, 1992.

[51] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst., 30(1-7):107–117, April 1998.

[52] Andrej Brodnik, Svante Carlsson, Erik D. Demaine, J. Ian Munro, and Robert Sedgewick. Resizable arrays in optimal time and space. In Proceedings of the 6th International Workshop on Algorithms and Data Structures, WADS ’99, pages 37–48, London, UK, UK, 1999. Springer-Verlag.

[53] Mark R Brown and Robert E Tarjan. Design and analysis of a data structure for representing sorted lists. SIAM journal on computing, 9(3):594–614, 1980.

[54] Randal E. Bryant and David R. O’Hallaron. Computer Systems: A Programmer’s Perspective. Pearson, 3rd edition, 2015.

[55] H. Martin Bücker, Bruno Lang, Dieter an Mey, and Christian H. Bischof. Bringing together automatic differentiation and OpenMP. In Proceedings of the 15th International Conference on Supercomputing, ICS ’01, pages 246–251, New York, NY, USA, 2001. Association for Computing Machinery.

[56] R.L. Burden, J.D. Faires, and A.M. Burden. Numerical Analysis. Cengage Learning, 2015.

[57] F. Warren Burton and M. Ronan Sleep. Executing functional programs on a virtual tree of processors. In Proceedings of the 1981 Conference on Functional Programming Languages and Computer Architecture, FPCA ’81, pages 187–194, New York, NY, USA, 1981. ACM.

[58] Irina Calciu, Dave Dice, Yossi Lev, Victor Luchangco, Virendra J. Marathe, and Nir Shavit. NUMA-aware reader-writer locks. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’13, pages 157–166, New York, NY, USA, 2013. ACM.

[59] K Cameron and J Edmonds. Algorithms for optimal anti-chains. In Research report CORR 79, volume 22. Department of Combinatorics and Optimization, University of Waterloo, 1979.

[60] Chi Cao Minh, JaeWoong Chung, Christos Kozyrakis, and Kunle Olukotun. STAMP: Stanford transactional applications for multi-processing. In IISWC ’08, September 2008.

[61] Vincent Cavé, Jisheng Zhao, Jun Shirako, and Vivek Sarkar. Habanero-Java: The new adventures of old X10. In Proceedings of the 9th International Conference on Principles and Practice of Programming in Java, PPPJ ’11, pages 51–61, New York, NY, USA, 2011. ACM.

[62] Ümit V. Çatalyürek, John Feo, Assefaw Hadish Gebremedhin, Mahantesh Halappanavar, and Alex Pothen. Graph coloring algorithms for multi-core and massively multithreaded architectures. CoRR, 2012.

[63] Gregory J Chaitin. Register allocation & spilling via graph coloring. In ACM SIGPLAN Notices, 1982.

[64] Gregory J. Chaitin, Marc A. Auslander, Ashok K. Chandra, John Cocke, Martin E. Hopkins, and Peter W. Markstein. Register allocation via coloring. Computer Languages, 1981.

[65] D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In SDM. SIAM, 2004.

[66] Bradford L. Chamberlain, Sung-Eun Choi, E. Christopher Lewis, Calvin Lin, Lawrence Snyder, and W. Derrick Weathersby. ZPL: A machine independent programming language for parallel computers. IEEE Trans. Softw. Eng., 26(3):197–211, March 2000.

[67] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst., 26(2):4:1–4:26, June 2008.

[68] Philippe Charles, Christian Grothoff, Vijay Saraswat, Christopher Donawa, Allan Kielstra, Kemal Ebcioglu, Christoph von Praun, and Vivek Sarkar. X10: An object-oriented approach to non-uniform cluster computing. In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA ’05, pages 519–538, New York, NY, USA, 2005. Association for Computing Machinery.

[69] Jie Chen, Tengfei Ma, and Cao Xiao. Fastgcn: fast learning with graph convolutional networks via importance sampling. arXiv preprint arXiv:1801.10247, 2018.

[70] Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. cudnn: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759, 2014.

[71] Trishul Chilimbi, Yutaka Suzue, Johnson Apacible, and Karthik Kalyanaraman. Project Adam: Building an efficient and scalable deep learning training system. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation, OSDI’14, pages 571–582, Berkeley, CA, USA, 2014. USENIX Association.

[72] Soumith Chintala. Convnet benchmarks. https://github.com/soumith/convnet-benchmarks.

[73] Dan Ciresan, Alessandro Giusti, Luca M Gambardella, and Jürgen Schmidhuber. Deep neural networks segment neuronal membranes in electron microscopy images. In Advances in neural information processing systems, pages 2843–2851, 2012.

[74] R Cole and U Vishkin. Deterministic coin tossing and accelerating cascades: Micro and macro techniques for designing parallel algorithms. In Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing, STOC ’86, pages 206–219, New York, NY, USA, 1986. ACM.

[75] T. Coleman and J. Moré. Estimation of sparse Jacobian matrices and graph coloring problems. SIAM J. Numer. Anal., 1983.

[76] Melvin E. Conway. A multiprocessor system design. In AFIPS, 1963.

[77] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.

[78] Intel Corporation. Intel math kernel library, 2016.

[79] P. J. Courtois, F. Heymans, and D. L. Parnas. Concurrent control with “readers” and “writers”. Commun. ACM, 14(10):667–668, October 1971.

[80] Henggang Cui, Hao Zhang, Gregory R. Ganger, Phillip B. Gibbons, and Eric P. Xing. Geeps: Scalable deep learning on distributed gpus with a gpu-specialized parameter server. In Proceedings of the Eleventh European Conference on Computer Systems, EuroSys ’16, pages 4:1–4:16, New York, NY, USA, 2016. ACM.

[81] Joseph C. Culberson. Iterated greedy graph coloring and the difficulty landscape. Technical report, University of Alberta, 1992.

[82] John S. Danaher, I-Ting Angelina Lee, and Charles E. Leiserson. Programming with exceptions in JCilk. Science of Computer Programming, 63(2):147–171, December 2008.

[83] DataStax, Inc. DataStax distribution of Apache Cassandra 3.x, 2016.

[84] Timothy A. Davis and Yifan Hu. The University of Florida sparse matrix collection. ACM Trans. Math. Softw., 38(1):1:1–1:25, December 2011.

[85] Jeffrey Dean and Luiz André Barroso. The tail at scale. Communications of the ACM, 56(2):74–80, 2013.

[86] Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.

[87] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: amazon’s highly available key-value store. In SOSP, 2007.

[88] Erik D Demaine, John Iacono, and Stefan Langerman. Retroactive data structures, 2007. Originally in SODA 2003.

[89] Li Deng. The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Processing Magazine, 29(6):141–142, 2012.

[90] J E Dennis, Jr. and Trond Steihaug. On the successive projections approach to least-squares problems. SIAM J. Numer. Anal., 23(4):717–733, August 1986.

[91] A Descampe, F Devaux, H Drolon, D Janssens, and Y Verschueren. OpenJPEG 2.0.0, 2012.

[92] Joseph Devietti, Brandon Lucia, Luis Ceze, and Mark Oskin. DMP: Deterministic shared memory multiprocessing. SIGPLAN Not., 44(3):85–96, March 2009.

[93] Joseph Devietti, Jacob Nelson, Tom Bergan, Luis Ceze, and Dan Grossman. RCDC: A relaxed consistency deterministic computer. SIGPLAN Not., 47(4):67–78, March 2011.

[94] Dave Dice, Virendra J. Marathe, and Nir Shavit. Flat-combining NUMA locks. In Proceedings of the Twenty-third Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’11, pages 65–74, New York, NY, USA, 2011. ACM.

[95] David Dice, Virendra J. Marathe, and Nir Shavit. Lock cohorting: A general technique for designing NUMA locks. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’12, pages 247–256, New York, NY, USA, 2012. ACM.

[96] Krzysztof Diks. A fast parallel algorithm for six-colouring of planar graphs. In Mathematical Foundations of Computer Science. 1986.

[97] Dimitar Dimitrov, Martin Vechev, and Vivek Sarkar. Race detection in two dimensions. In Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’15, pages 101–110, Portland, Oregon, USA, 2015. ACM.

[98] Cynthia Dwork, Maurice Herlihy, and Orli Waarts. Contention in shared memory algorithms. In STOC, 1993.

[99] Derek L Eager, John Zahorjan, and Edward D Lazowska. Speedup versus efficiency in parallel systems. IEEE transactions on computers, 38(3):408–423, 1989.

[100] AL Eberle, S Mikula, R Schalek, J Lichtman, ML Knothe Tate, and D Zeidler. High-resolution, high-throughput imaging with a multibeam scanning electron microscope. Journal of microscopy, 259(2):114–120, 2015.

[101] H. Carter Edwards, Christian R. Trott, and Daniel Sunderland. Kokkos: Enabling manycore performance portability through polymorphic memory access patterns. Journal of Parallel and Distributed Computing, 74(12):3202–3216, 2014. Domain-Specific Languages and High-Level Frameworks for High-Performance Computing.

[102] Faith Ellen, Yossi Lev, Victor Luchangco, and Mark Moir. Snzi: Scalable nonzero indicators. In Proceedings of the Twenty-sixth Annual ACM Symposium on Principles of Distributed Computing, PODC ’07, pages 13–22, New York, NY, USA, 2007. ACM.

[103] Rainer Feldmann, Peter Mysliwietz, and Burkhard Monien. Studying overheads in massively parallel min/max-tree evaluation. In Proceedings of the Sixth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 94–103, June 1994.

[104] Linqing Feng, Ting Zhao, and Jinhyun Kim. neuTube 1.0: a New Design for Efficient Neuron Reconstruction Software Based on the SWC Format. eneuro, January 2015.

[105] M. Feng and C. E. Leiserson. Efficient detection of determinacy races in Cilk programs. Theory of Computing Systems, 32(3):301–326, 1999.

[106] Mingdong Feng and Charles E. Leiserson. Efficient detection of determinacy races in Cilk programs. In Proceedings of the Ninth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pages 1–11, June 1997.

[107] Jeremy T. Fineman and Charles E. Leiserson. Race detectors for Cilk and Cilk++ programs. In David Padua, editor, Encyclopedia of Parallel Computing, pages 1706–1719. Springer, 2011.

[108] Raphael Finkel and Udi Manber. DIB — A distributed implementation of backtracking. ACM TOPLAS, 9(2):235–256, April 1987.

[109] Matteo Fischetti, Silvano Martello, and Paolo Toth. The fixed job schedule problem with spread-time constraints. Operations Research, 1987.

[110] Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.

[111] Tobias Flach, Nandita Dukkipati, Andreas Terzis, Barath Raghavan, Neal Cardwell, Yuchung Cheng, Ankur Jain, Shuai Hao, Ethan Katz-Bassett, and Ramesh Govindan. Reducing web latency: the virtue of gentle aggression. In ACM SIGCOMM Computer Communication Review, volume 43, pages 159–170. ACM, 2013.

[112] Wikimedia Foundation. Wikipedia: Database, 2016.

[113] Greg N. Frederickson and Donald B. Johnson. The complexity of selection and ranking in X+Y and matrices with sorted columns. Journal of Computer and System Sciences, 24(2):197–208, 1982.

[114] Vincent W. Freeh, David K. Lowenthal, and Gregory R. Andrews. Distributed Filaments: Efficient fine-grain parallelism on a cluster of workstations. In Proceedings of the First Symposium on Operating Systems Design and Implementation, pages 201–213, Monterey, California, November 1994.

[115] Matteo Frigo, Pablo Halpern, Charles E. Leiserson, and Stephen Lewin-Berlin. Reducers and other Cilk++ hyperobjects. In Proceedings of the Twenty-first Annual Symposium on Parallelism in Algorithms and Architectures, SPAA ’09, pages 79–90, New York, NY, USA, 2009. ACM.

[116] Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The implementation of the Cilk-5 multithreaded language. SIGPLAN Not., 33(5):212–223, May 1998.

[117] Igal Galperin and Ronald L Rivest. Scapegoat trees. In Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, pages 165–174. Society for Industrial and Applied Mathematics, 1993.

[118] Kristen Gardner, Samuel Zbarsky, Sherwin Doroudi, Mor Harchol-Balter, and Esa Hyytia. Reducing latency via redundant requests: Exact analysis. In Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pages 347–360. ACM, 2015.

[119] M. R. Garey, D. S. Johnson, and L. Stockmeyer. Some simplified NP-complete problems. In Proceedings of the Sixth Annual ACM Symposium on Theory of Computing, STOC ’74, pages 47–63, New York, NY, USA, 1974. ACM.

[120] M.R. Garey, D.S. Johnson, and L. Stockmeyer. Some simplified NP-complete graph problems. Theoretical Computer Science, 1976.

[121] Assefaw H. Gebremedhin, Duc Nguyen, Md. Mostofa Ali Patwary, and Alex Pothen. ColPack: Software for graph coloring and related problems in scientific computing. ACM Trans. on Mathematical Software, 2013.

[122] Assefaw Hadish Gebremedhin and Fredrik Manne. Scalable parallel graph coloring algorithms. Concurrency: Practice and Experience, 2000.

[123] Alan E. Gelfand and Adrian F. M. Smith. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85(410):398–409, June 1990.

[124] Stuart Geman and Donald Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6(6):721–741, November 1984.

[125] P. B. Gibbons. A more practical PRAM model. In Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’89, pages 158–168, New York, NY, USA, 1989. ACM.

[126] John R. Gilbert, Cleve Moler, and Robert Schreiber. Sparse matrices in Matlab: Design and implementation. SIAM J. Matrix Anal. Appl., 13(1):333–356, January 1992.

[127] Yoav Giora and Haim Kaplan. Optimal dynamic vertical ray shooting in rectilinear planar subdivisions. ACM Transactions on Algorithms (TALG), 5(3):28, 2009.

[128] Alessandro Giusti, Dan Claudiu Ciresan, Jonathan Masci, Luca Maria Gambardella, and Jürgen Schmidhuber. Fast image scanning with deep max-pooling convolutional neural networks. In ICIP, 2013.

[129] Robert K. Gjertsen Jr., Mark T. Jones, and Paul E. Plassmann. Parallel heuristics for improved, balanced graph colorings. JPDC, 1996.

[130] Sébastien Godard. Sysstat: System performance tools for the Linux OS, 2004.

[131] Andrew V. Goldberg, Serge A. Plotkin, and Gregory E. Shannon. Parallel symmetry-breaking in sparse graphs. SIAM J. Discret. Math., 1(4):434–446, October 1988.

[132] Mark Goldberg and Thomas Spencer. A new parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 18(2):419–427, 1989.

[133] G. Golub and W. Kahan. Calculating the singular values and pseudo-inverse of a matrix. Journal of the Society for Industrial and Applied Mathematics Series B Numerical Analysis, 2(2):205–224, 1965.

[134] Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. PowerGraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI’12, pages 17–30, Berkeley, CA, USA, 2012. USENIX Association.

[135] Google. Google cloud platform blog: Google supercharges machine learning tasks with tpu custom chip, 2016.

[136] Google. Protocol buffers. https://developers.google.com/protocol-buffers/, 2019.

[137] R. L. Graham. Bounds for certain multiprocessing anomalies. Bell System Technical Journal, 45:1563–1581, 1966.

[138] Felix Gremse, Andreas Höfter, Lukas Razik, Fabian Kiessling, and Uwe Naumann. GPU-accelerated adjoint algorithmic differentiation. Computer Physics Communications, 200:300–311, 2016.

[139] Andreas Griewank et al. On automatic differentiation. Mathematical Programming: recent developments and applications, 6(6):83–107, 1989.

[140] Yan Gu, Julian Shun, Yihan Sun, and Guy E Blelloch. A top-down parallel semisort. In Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures, pages 24–34, 2015.

[141] Leo J Guibas, Edward M McCreight, Michael F Plass, and Janet R Roberts. A new representation for linear lists. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 49–60. ACM, 1977.

[142] Robert H. Halstead, Jr. Implementation of MultiLisp: Lisp on a multiprocessor. In Proceedings of the 1984 ACM Symposium on LISP and Functional Programming, LFP ’84, pages 9–17, New York, NY, USA, 1984. ACM.

[143] Robert H. Halstead, Jr. MultiLisp: A language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst., 7(4):501–538, October 1985.

[144] James Hamilton. The cost of latency, 2009.

[145] Laurent Hascoet and Valérie Pascual. The Tapenade automatic differentiation tool: Principles, model, and specification. ACM Trans. Math. Softw., 39(3), May 2013.

[146] William Hasenplaugh, Tim Kaler, Tao B. Schardl, and Charles E. Leiserson. Ordering heuristics for parallel graph coloring. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’14, pages 166–177, New York, NY, USA, 2014. ACM.

[147] Johann Hauswald, Yiping Kang, Michael A. Laurenzano, Quan Chen, Cheng Li, Trevor Mudge, Ronald G. Dreslinski, Jason Mars, and Lingjia Tang. Djinn and Tonic: DNN as a service and its implications for future warehouse scale computers. In Proceedings of the 42nd Annual International Symposium on Computer Architecture, ISCA ’15, pages 27–40, New York, NY, USA, 2015. ACM.

[148] Y. He, S. Elnikety, J. Larus, and C. Yan. Zeta: Scheduling interactive services with partial execution. In ACM Symposium on Cloud Computing (SOCC), page 12, 2012.

[149] Yuxiong He, Charles E. Leiserson, and William M. Leiserson. The Cilkview scalability analyzer. In Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’10, pages 145–156, New York, NY, USA, 2010. ACM.

[150] John Hennessy and David Patterson. A conversation with John Hennessy and David Patterson. ACM Queue, 4(10), 2006.

[151] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2008.

[152] Dean Hildebrand and Peter Honeyman. Exporting storage systems in a scalable manner with pnfs. In 22nd IEEE/13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST’05), pages 18–27. IEEE, 2005.

[153] F. L. Hitchcock. The expression of a tensor or a polyadic as a sum of products. Journal of Mathematical Physics, 1927.

[154] M. Hitz, J. Grabmeier, E. Kaltofen, and V. Weispfenning. Computer Algebra Handbook: Foundations · Applications · Systems. Springer Berlin Heidelberg, 2012.

[155] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.

[156] Robin J Hogan. Fast reverse-mode automatic differentiation using expression templates in C++. ACM Transactions on Mathematical Software (TOMS), 40(4):26, 2014.

[157] Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, and Werner Stuetzle. Mesh optimization. In Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’93, pages 19–26, New York, NY, USA, 1993. ACM.

[158] P. Hovland and C. Bischof. Automatic differentiation for message-passing parallel programs. In IPPS, pages 98–104, March 1998.

[159] Paul D. Hovland, Christian H. Bischof, and Lucas Roh. Automatic differentiation of a parallel molecular dynamics application. In PPSC. SIAM, 1997.

[160] Derek R. Hower, Polina Dudnik, Mark D. Hill, and David A. Wood. Calvin: Deterministic or not? free will to choose. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture, HPCA ’11, pages 333–334, Washington, DC, USA, 2011. IEEE Computer Society.

[161] Jan Hückelheim, Paul Hovland, Michelle Mills Strout, and Jens-Dominik Müller. Reverse-mode algorithmic differentiation of an OpenMP-parallel compressible flow solver. The International Journal of High Performance Computing Applications, 33(1):140–154, 2019.

[162] Jan Hückelheim and Paul D. Hovland. Automatic differentiation of parallelised convolutional neural networks - lessons from adjoint PDE solvers. In NIPS, 2017.

[163] Jan Hückelheim, Navjot Kukreja, Sri Hari Krishna Narayanan, Fabio Luporini, Gerard Gorman, and Paul Hovland. Automatic differentiation for adjoint stencil loops. In Proceedings of the 48th International Conference on Parallel Processing, ICPP 2019, New York, NY, USA, 2019. Association for Computing Machinery.

[164] Jan Christian Hückelheim, Paul D. Hovland, Michelle Mills Strout, and Jens-Dominik Müller. Parallelizable adjoint stencil computations using transposed forward-mode algorithmic differentiation. Optimization Methods and Software, 33:672–693, 2018.

[165] Scott Huddleston and Kurt Mehlhorn. A new data structure for representing sorted lists. Acta informatica, 17(2):157–184, 1982.

[166] IBM. Introducing a brain-inspired computer, 2016.

[167] Mike Innes. Flux: Elegant machine learning with julia. Journal of Open Source Software, 3(25):602, 2018.

[168] Institute of Electrical and Electronic Engineers. Information technology — Portable Operating System Interface (POSIX) — Part 1: System application program interface (API) [C language]. IEEE Standard 1003.1, 1996 Edition.

[169] Intel. The Threading Building Blocks. http://software.intel.com, 2012.

[170] Intel. Intel Cilk Plus. http://software.intel.com, 2013.

[171] Intel. Intel Cilk Plus. Available from http://software.intel.com, 2013.

[172] Intel Corporation. Intel Cilk Plus Language Specification, 2010. Document Number: 324396-001US. Available from http://software.intel.com/sites/products/cilk-plus/cilk_plus_language_specification.pdf.

[173] Intel Corporation. Intel VTune Amplifier. https://software.intel.com/en-us/vtune, 2019.

[174] itseez. Open source computer vision library, 2016.

[175] Kenneth E. Iverson. A Programming Language. John Wiley & Sons, Inc., New York, NY, USA, 1962.

[176] Virajith Jalaparti, Peter Bodik, Srikanth Kandula, Ishai Menache, Mikhail Rybalkin, and Chenyu Yan. Speeding up distributed request-response workflows. ACM SIGCOMM Computer Communication Review, 43(4):219–230, 2013.

[177] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the ACM International Conference on Multimedia, pages 675–678. ACM, 2014.

[178] Maximilian Joesch, David Mankus, Masahito Yamagata, Ali Shahbazi, Richard Schalek, Adi Suissa-Peleg, Markus Meister, Jeff W Lichtman, Walter J Scheirer, and Joshua R Sanes. Reconstruction of genetically identified neurons imaged by serial-section electron microscopy. eLife, 5:e15015, 2016.

[179] Mark T. Jones and Paul E. Plassmann. A parallel graph coloring heuristic. SIAM J. Sci. Comput., 14(3):654–669, May 1993.

[180] Mark T Jones and Paul E Plassmann. Scalable iterative solution of sparse linear systems. Parallel Computing, 1994.

[181] Gauri Joshi, Emina Soljanin, and Gregory Wornell. Efficient redundancy techniques for latency reduction in cloud systems. arXiv preprint arXiv:1508.03599, 2015.

[182] Gauri Joshi, Emina Soljanin, and Gregory Wornell. Queues with redundancy: Latency-cost analysis. ACM SIGMETRICS Performance Evaluation Review, 43(2):54–56, 2015.

[183] T. Kaler, W. Hasenplaugh, T. B. Schardl, and C. E. Leiserson. Executing dynamic data-graph computations deterministically using chromatic scheduling. Transactions on Parallel Computing, 3(1):2:1–2:31, 2016.

[184] Tim Kaler, William Hasenplaugh, Tao B Schardl, and Charles E. Leiserson. Executing dynamic data-graph computations deterministically using chromatic scheduling. In SPAA, 2014.

[185] Tim Kaler, Yuxiong He, and Sameh Elnikety. Optimal reissue policies for reducing tail latency. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pages 195–206, 2017.

[186] Tim Kaler, William Kuszmaul, Tao B Schardl, and Daniele Vettorel. Cilkmem: Algorithms for analyzing the memory high-water mark of fork-join parallel programs. In Symposium on Algorithmic Principles of Computer Systems, pages 162–176. SIAM, 2020.

[187] Tim Kaler, Brian Wheatman, and Sarah Wooders. High-throughput image alignment for connectomics using frugal snap judgments: Poster. In Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, PPoPP ’19, pages 433–434, New York, NY, USA, 2019. ACM.

[188] Tim Kaler, Brian Wheatman, and Sarah Wooders. High-throughput image alignment for connectomics using frugal snap judgments. In 2020 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2020.

[189] Richard M. Karp and Yanjun Zhang. Randomized parallel algorithms for backtrack search and branch-and-bound computation. JACM, 40(3):765–789, July 1993.

[190] Narayanan Kasthuri, Ken Hayworth, Juan C Tapia, Richard Schalek, S Nundy, and Jeff W Lichtman. The brain on tape: Imaging an ultra-thin section library (utsl). In Soc. Neurosci. Abstr, 2009.

[191] Narayanan Kasthuri, Kenneth Jeffrey Hayworth, Daniel Raimund Berger, Richard Lee Schalek, José Angel Conchello, Seymour Knowles-Barley, Dongil Lee, Amelio Vázquez-Reina, Verena Kaynig, Thouis Raymond Jones, et al. Saturated reconstruction of a volume of neocortex. Cell, 162(3):648–661, 2015.

[192] Verena Kaynig, Amelio Vazquez-Reina, Seymour Knowles-Barley, Mike Roberts, Thouis R Jones, Narayanan Kasthuri, Eric Miller, Jeff Lichtman, and Hanspeter Pfister. Large-scale automatic reconstruction of neuronal processes from electron microscopy images. Medical image analysis, 22(1):77–88, 2015.

[193] Gershon Kedem. Automatic differentiation of computer programs. Technical report, Mathematics Research Center, University of Wisconsin-Madison, 1976.

[194] Saehoon Kim, Yuxiong He, Seung-won Hwang, Sameh Elnikety, and Seungjin Choi. Delayed-dynamic-selective (dds) prediction for reducing extreme tail latency in web search. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, WSDM ’15, pages 7–16, New York, NY, USA, 2015. ACM.

[195] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.

[196] Seymour Knowles-Barley. Rhoana git. https://github.com/Rhoana/membrane_cnn/tree/master/maxout.

[197] Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, Guy L. Steele, Jr., and Mary E. Zosel. The High Performance Fortran Handbook. MIT Press, Cambridge, MA, USA, 1994.

[198] S Rao Kosaraju. Localized search in sorted lists. In Proceedings of the thirteenth annual ACM symposium on Theory of computing, pages 62–69. ACM, 1981.

[199] David A. Kranz, Robert H. Halstead, Jr., and Eric Mohr. Mul-T: A high-performance parallel Lisp. In Proceedings of the SIGPLAN ’89 Conference on Programming Lan- guage Design and Implementation, pages 81–90, June 1989.

[200] Fabian Kuhn. Weak graph colorings: Distributed algorithms and applications. In Proceedings of the Twenty-first Annual Symposium on Parallelism in Algorithms and Architectures, SPAA ’09, pages 138–144, New York, NY, USA, 2009. ACM.

[201] Fabian Kuhn and Roger Wattenhofer. On the complexity of distributed graph coloring. In Proceedings of the Twenty-fifth Annual ACM Symposium on Principles of Distributed Computing, PODC ’06, pages 7–15, New York, NY, USA, 2006. ACM.

[202] Bradley C. Kuszmaul. Synchronized MIMD Computing. PhD thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, May 1994. Available as MIT Laboratory for Computer Science Technical Report MIT/LCS/TR-645.

[203] Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. GraphChi: Large-scale graph computation on just a PC. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI’12, pages 31–46, Berkeley, CA, USA, 2012. USENIX Association.

[204] Avinash Lakshman and Prashant Malik. Cassandra: A decentralized structured storage system. SIGOPS Oper. Syst. Rev., 44(2):35–40, April 2010.

[205] Cliff Lasser and Steve M. Omohundro. The essential Lisp manual. Technical report, Thinking Machines, Cambridge, MA USA, 1986.

[206] Chris Lattner. LLVM: An Infrastructure for Multi-Stage Optimization. Master’s thesis, Computer Science Dept., University of Illinois at Urbana-Champaign, Urbana, IL, December 2002. See http://llvm.cs.uiuc.edu.

[207] Chris Lattner and Vikram Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO’04), page 75, Palo Alto, California, March 2004.

[208] Doug Lea. A Java fork/join framework. In Proceedings of the ACM 2000 Conference on Java Grande, JAVA ’00, pages 36–43, New York, NY, USA, 2000. ACM.

[209] Jonathan I Leckenby, Miranda A Chacon, Adriaan O Grobbelaar, and Jeff W Lichtman. Imaging peripheral nerve regeneration: A new technique for 3D visualization of axonal behavior. Journal of Surgical Research, 242:207–213, 2019.

[210] Yann LeCun et al. LeNet-5, convolutional neural networks. URL: http://yann.lecun.com/exdb/lenet, 20, 2015.

[211] Edward A. Lee. The problem with threads. Computer, 39(5):33–42, May 2006.

[212] I-Ting Angelina Lee. Memory Abstractions for Parallel Programming. PhD thesis, MIT Department of Electrical Engineering and Computer Science, 2012.

[213] I-Ting Angelina Lee, Aamir Shafi, and Charles E. Leiserson. Memory-mapping support for reducer hyperobjects. In Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’12, pages 287–297, New York, NY, USA, 2012. ACM.

[214] Kangwook Lee, Ramtin Pedarsani, and Kannan Ramchandran. On scheduling redundant requests with cancellation overheads. In Proc. of the 53rd Annual Allerton conference on Communication, Control, and Computing, 2015.

[215] Kisuk Lee, Aleksandar Zlateski, Ashwin Vishwanathan, and H Sebastian Seung. Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Prediction. In Advances in Neural Information Processing Systems, pages 3559–3567, 2015.

[216] Victor W. Lee, Changkyu Kim, Jatin Chhugani, Michael Deisher, Daehyun Kim, Anthony D. Nguyen, Nadathur Satish, Mikhail Smelyanskiy, Srinivas Chennupaty, Per Hammarlund, Ronak Singhal, and Pradeep Dubey. Debunking the 100x gpu vs. cpu myth: An evaluation of throughput computing on cpu and gpu. In Proceedings of the 37th Annual International Symposium on Computer Architecture, ISCA ’10, pages 451–460, New York, NY, USA, 2010. ACM.

[217] Wei-Chung Allen Lee, Vincent Bonin, Michael Reed, Brett J Graham, Greg Hood, Katie Glattfelder, and R Clay Reid. Anatomy and function of an excitatory network in the visual cortex. Nature, 532(7599):370–374, 2016.

[218] Daan Leijen and Judd Hall. Optimize managed code for multi-core machines. MSDN Magazine.

[219] Charles E. Leiserson. The Cilk++ concurrency platform. Journal of Supercomputing, 51(3):244–257, 2010.

[220] J. Leskovec. SNAP: Stanford network analysis platform. http://snap.stanford.edu/data/index.html, 2013.

[221] Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. Graph evolution: Densification and shrinking diameters. ACM transactions on Knowledge Discovery from Data (TKDD), 1(1):2–es, 2007.

[222] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.

[223] Jure Leskovec, Kevin J Lang, Anirban Dasgupta, and Michael W Mahoney. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Mathematics, 6(1):29–123, 2009.

[224] Hao Li, Asim Kadav, Erik Kruus, and Cristian Ungureanu. Malt: Distributed data-parallelism for existing ML applications. In Proceedings of the Tenth European Conference on Computer Systems, EuroSys ’15, pages 3:1–3:16, New York, NY, USA, 2015. ACM.

[225] Yawei Li and Zhiling Lan. Exploit failure prediction for adaptive fault-tolerance in cluster computing. In Cluster Computing and the Grid, 2006. CCGRID 06. Sixth IEEE International Symposium on, volume 1, pages 8–pp. IEEE, 2006.

[226] Jeff W Lichtman and Winfried Denk. The big and the small: challenges of imaging the brain's circuits. Science, 334(6056):618–623, 2011.

[227] Jeff W Lichtman, Hanspeter Pfister, and Nir Shavit. The big data challenges of connectomics. Nature neuroscience, 17(11):1448–1454, 2014.

[228] Jeff W. Lichtman and Joshua R. Sanes. Ome sweet ome: what can the genome tell us about the connectome? Current Opinion in Neurobiology, 18(3):346–353, June 2008.

[229] Nathan Linial. Locality in distributed graph algorithms. SIAM J. Comput., 21(1):193–201, February 1992.

[230] Seppo Linnainmaa. Taylor expansion of the accumulated rounding error. BIT Numerical Mathematics, 16(2):146–160, Jun 1976.

[231] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.

[232] L. Lovász, M. Saks, and W. T. Trotter. An on-line graph coloring algorithm with sublinear performance ratio. Discrete Math., 1989.

[233] Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. Distributed GraphLab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow., 5(8):716–727, April 2012.

[234] Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. GraphLab: A new parallel framework for machine learning. In Conference on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California, July 2010.

[235] David G Lowe. Distinctive image features from scale-invariant keypoints. International journal of computer vision, 60(2):91–110, 2004.

[236] Michael Luby. A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 1986.

[237] Apache Lucene. Apache lucene, 2010.

[238] George S Lueker. A data structure for orthogonal range queries. In Foundations of Computer Science, 1978., 19th Annual Symposium on, pages 28–34. IEEE, 1978.

[239] Dougal Maclaurin, David Duvenaud, and Ryan P Adams. Autograd: Effortless gradients in NumPy. In ICML 2015 AutoML Workshop, 2015.

[240] David Maier and Sharon C Salveter. Hysterical b-trees. Information Processing Letters, 12(4):199–202, 1981.

[241] Jeremy Maitin-Shepard, Viren Jain, Michal Januszewski, Peter Li, Jörgen Kornfeld, Julia Buhmann, and Pieter Abbeel. Combinatorial energy learning for image segmentation. arXiv preprint arXiv:1506.04304, 2015.

[242] Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD ’10, pages 135–146, New York, NY, USA, 2010. ACM.

[243] Loris Marchal, Hanna Nagy, Bertrand Simon, and Frédéric Vivien. Parallel scheduling of DAGs under memory constraints. In IPDPS, pages 204–213. IEEE, 2018.

[244] Dániel Marx. Graph colouring problems and their applications in scheduling. John von Neumann Ph.D. Students Conf., 2004.

[245] Jonathan Masci, Alessandro Giusti, Dan Ciresan, Gabriel Fricout, and Jürgen Schmidhuber. A fast learning algorithm for image segmentation with max-pooling convolutional networks. In Image Processing (ICIP), 2013 20th IEEE International Conference on, pages 2713–2717. IEEE, 2013.

[246] David W. Matula and Leland L. Beck. Smallest-last ordering and clustering and graph coloring algorithms. JACM, 1983.

[247] Alexander Matveev, Yaron Meirovitch, Hayk Saribekyan, Wiktor Jakubiuk, Tim Kaler, Gergely Odor, David Budden, Aleksandar Zlateski, and Nir Shavit. A multicore path to connectomics-on-demand. In PPoPP, pages 267–281, 2017.

[248] Andrew McCallum. Cora data set. http://people.cs.umass.edu/mccallum/data.html, 2012.

[249] Mike McCandless. Lucene nightly benchmarks, 2010.

[250] Michael D McCool. Structured parallel programming with deterministic patterns. In Proceedings of the 2nd USENIX conference on Hot topics in parallelism, pages 5–5. USENIX Association, 2010.

[251] D. McGrady. Avoiding contention using combinable objects. Microsoft Developer Network, 2008.

[252] Marina Meilă. Comparing clusterings – an information based distance. Journal of multivariate analysis, 98(5):873–895, 2007.

[253] Marina Meilă. Comparing clusterings – an information based distance. Journal of multivariate analysis, 98(5):873–895, 2007.

[254] Y. Meirovitch, A. Matveev, H. Saribekyan, D. Budden, D. Rolnick, G. Odor, S. Knowles-Barley, T. R. Jones, H. Pfister, J. W. Lichtman, and N. Shavit. A Multi-Pass Approach to Large-Scale Connectomics. ArXiv e-prints, December 2016.

[255] John M. Mellor-Crummey and Michael L. Scott. Algorithms for scalable synchro- nization on shared-memory multiprocessors. ACM Trans. Comput. Syst., 9(1):21–65, February 1991.

[256] Robert Meusel, Oliver Lehmberg, Christian Bizer, and Sebastiano Vigna. Web data commons — hyperlink graphs.

[257] Robert Meusel, Sebastiano Vigna, Oliver Lehmberg, and Christian Bizer. The graph structure in the Web — analyzed on different aggregation levels. Journal of Web Science, 1(1):33–47, 2015.

[258] Seung-Jai Min, Costin Iancu, and Katherine Yelick. Hierarchical work stealing on manycore clusters. In Fifth Conference on Partitioned Global Address Space Programming Models (PGAS ’11), October 2011.

[259] Tom Mitchell. NPIC500 data set. http://www.cs.cmu.edu/tom/10709_fall2009/NPIC500.pdf, 2009.

[260] John Mitchem. On various algorithms for estimating the chromatic number of a graph. The Computer Journal, 1976.

[261] Alina N. Moga, Bogdan Cramariuc, and Moncef Gabbouj. Parallel watershed transformation algorithms for image segmentation. Parallel Comput., 24(14):1981–2001, December 1998.

[262] Josh Lyskowski Morgan, Daniel Raimund Berger, Arthur Willis Wetzel, and Jeff William Lichtman. The fuzzy logic of network connectivity in mouse visual thalamus. Cell, 165(1):192–206, 2016.

[263] Kevin P. Murphy, Yair Weiss, and Michael I. Jordan. Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, UAI’99, pages 467–475, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.

[264] Nervana. Neon. https://github.com/NervanaSystems/neon.

[265] Nicholas Nethercote and Julian Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In PLDI, 2007.

[266] Robert H. B. Netzer and Barton P. Miller. What are race conditions?: Some issues and formalizations. ACM Lett. Program. Lang. Syst., 1(1):74–88, March 1992.

[267] Donald Nguyen, Andrew Lenharth, and Keshav Pingali. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP ’13, pages 456–471, New York, NY, USA, 2013. ACM.

[268] Donald Nguyen, Andrew Lenharth, and Keshav Pingali. Deterministic Galois: On-demand, portable and parameterless. In Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’14, pages 499–512, New York, NY, USA, 2014. ACM.

[269] Kamal Nigam and Rayid Ghani. Analyzing the effectiveness and applicability of co-training. In Proceedings of the Ninth International Conference on Information and Knowledge Management, CIKM ’00, pages 86–93, New York, NY, USA, 2000. ACM.

[270] Rishiyur S. Nikhil. Cid: A parallel, shared-memory C for distributed-memory machines. In Proceedings of the Seventh Annual Workshop on Languages and Compilers for Parallel Computing, August 1994.

[271] Juan Nunez-Iglesias, Ryan Kennedy, Toufiq Parag, Jianbo Shi, and Dmitri B Chklovskii. Machine learning of hierarchical clustering to segment 2D and 3D images. PloS one, 8(8):e71715, 2013.

[272] Juan Nunez-Iglesias, Ryan Kennedy, Stephen M Plaza, Anirban Chakraborty, and William T Katz. Graph-based active learning of agglomeration (GALA): a Python library to segment 2D and 3D neuroimages. Frontiers in neuroinformatics, 8, 2014.

[273] NVIDIA. NVIDIA cuDNN - GPU accelerated deep learning, 2016.

[274] Marek Olszewski, Jason Ansel, and Saman Amarasinghe. Kendo: Efficient deterministic multithreading in software. SIGARCH Comput. Archit. News, 37(1):97–108, March 2009.

[275] OpenMP Application Program Interface, Version 3.0, May 2008.

[276] Oracle. Sun Studio 12: Performance analyzer. Available at https://docs.oracle.com/cd/E19205-01/819-5264/, 2010.

[277] Mark S. Papamarcos and Janak H. Patel. A low-overhead coherence solution for multiprocessors with private cache memories. In ISCA, 1984.

[278] Toufiq Parag, Anirban Chakraborty, Stephen Plaza, and Louis Scheffer. A context-aware delayed agglomeration framework for electron microscopy segmentation. PloS one, 10(5):e0125825, 2015.

[279] Toufiq Parag, Anirban Chakrobarty, and Stephen Plaza. A context-aware delayed agglomeration framework for EM segmentation. CoRR, 2014.

[280] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. 2017.

[281] Suhas S. Patil. Record of the Project MAC conference on concurrent systems and parallel computation, chapter Closure Properties of Interconnections of Determinate Systems, pages 107–116. ACM, New York, NY, USA, 1970.

[282] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1988.

[283] Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, M. Amber Hassaan, Rashid Kaleem, Tsung-Hsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, Dimitrios Prountzos, and Xin Sui. The Tao of parallelism in algorithms. In ACM PLDI, 2011.

[284] Stephen M Plaza and Stuart E Berg. Large-scale electron microscopy image segmentation in Spark. arXiv preprint arXiv:1604.00385, 2016.

[285] Daniel Quinlan and Michael Kerrisk. proc — process information pseudo-filesystem. Available at http://man7.org/linux/man-pages/man5/proc.5.html, 2017.

[286] Louis B Rall and George F Corliss. An introduction to automatic differentiation. Computational Differentiation: Techniques, Applications, and Tools, 89, 1996.

[287] Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, and Eran Yahav. Efficient data race detection for async-finish parallelism. In Howard Barringer, Ylies Falcone, Bernd Finkbeiner, Klaus Havelund, Insup Lee, Gordon Pace, Grigore Rosu, Oleg Sokolsky, and Nikolai Tillmann, editors, Runtime Verification, volume 6418 of Lecture Notes in Computer Science, pages 368–383. Springer Berlin / Heidelberg, 2010.

[288] Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin Vechev, and Eran Yahav. Scalable and precise dynamic datarace detection for structured parallelism. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, pages 531–542, 2012.

[289] Santiago Ramón y Cajal. Textura del Sistema Nervioso del Hombre y de los Vertebrados, volume 2. Madrid: Nicolas Moya, 1904.

[290] A. Rauh and G.R. Arce. A fast weighted median algorithm based on quickselect. In Image Processing (ICIP), 2010 17th IEEE International Conference on, pages 105– 108, Sept 2010.

[291] James Reinders. Intel Threading Building Blocks. O’Reilly & Associates, Inc., Sebastopol, CA, USA, first edition, 2007.

[292] Albert Reuther, Jeremy Kepner, Chansup Byun, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, et al. Interactive supercomputing on 40,000 cores for machine learning and data analysis. In 2018 IEEE High Performance extreme Computing Conference (HPEC), pages 1–6. IEEE, 2018.

[293] Rhoana. Fijibento, 2018.

[294] William R Gray Roncal, Dean M Kleissas, Joshua T Vogelstein, Priya Manavalan, Kunal Lillaney, Michael Pekala, Randal Burns, R Jacob Vogelstein, Carey E Priebe, Mark A Chevillet, et al. An automated images-to-graphs framework for high resolution connectomics. Frontiers in neuroinformatics, 9, 2015.

[295] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015, pages 234–241. Springer, 2015.

[296] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision, pages 2564–2571. IEEE, 2011.

[297] Youcef Saad. SPARSKIT: A basic toolkit for sparse matrix computations. Research Institute for Advanced Computer Science, NASA Ames Research Center, 1990.

[298] Stephan Saalfeld, Albert Cardona, Volker Hartenstein, and Pavel Tomančák. As-rigid-as-possible mosaicking and serial section registration of large ssTEM datasets. Bioinformatics, 26(12):i57–i63, 2010.

[299] Punam K. Saha, Gunilla Borgefors, and Gabriella Sanniti di Baja. A survey on skeletonization algorithms and their applications. Pattern Recognition Letters, 76:3–12, June 2016.

[300] A.E. Sariyuce, E. Saule, and U.V. Catalyurek. Improving graph coloring on distributed-memory parallel computers. In HiPC, 2011.

[301] Michel Schanen, Uwe Naumann, Laurent Hascoët, and Jean Utke. Interpretative adjoints for numerical simulation codes using MPI. Procedia Computer Science, 1(1):1825–1833, 2010. ICCS 2010.

[302] Tao B. Schardl, Tyler Denniston, Damon Doucet, Bradley C. Kuszmaul, I-Ting Angelina Lee, and Charles E. Leiserson. The CSI framework for compiler-inserted program instrumentation. Proc. ACM Meas. Anal. Comput. Syst., 1(2):43:1–43:25, December 2017.

[303] Tao B. Schardl, Bradley C. Kuszmaul, I-Ting Angelina Lee, William M. Leiserson, and Charles E. Leiserson. The Cilkprof scalability profiler. In SPAA, pages 89–100, 2015.

[304] Tao B. Schardl, William S. Moses, and Charles E. Leiserson. Tapir: Embedding fork-join parallelism into LLVM’s intermediate representation. In PPoPP, pages 249–265, 2017.

[305] Tao B. Schardl, William S. Moses, and Charles E. Leiserson. Tapir: Embedding recursive fork-join parallelism into LLVM’s intermediate representation. ACM Trans. Parallel Comput., 6(4), December 2019.

[306] Louis Scheffer, Bill Karsh, and Shiv Vitaladevuni. Automated alignment of imperfect EM images for neural reconstruction. abs/1304.6034, April 2013.

[307] Peter Schiffer and Michael Kerrisk. memusage — profile memory usage of a program. Available at http://man7.org/linux/man-pages/man1/memusage.1.html, 2014.

[308] Eric Schurman and Jake Brutlag. The user and business impact of server delays, additional bytes, and http chunking in web search. In Velocity Conference, 2009.

[309] Sebastian Seung. Connectome: How the brain’s wiring makes us who we are. Houghton Mifflin Harcourt, 2012.

[310] Nihar B Shah, Kangwook Lee, and Kannan Ramchandran. The MDS queue: Analysing the latency performance of erasure codes. In Information Theory (ISIT), 2014 IEEE International Symposium on, pages 861–865. IEEE, 2014.

[311] Ali Shahbazi. Computer Vision-Based Approaches to Neural Circuit Tracing at Scale. University of Notre Dame, 2018.

[312] Ali Shahbazi, Jeffery Kinnison, Rafael Vescovi, Ming Du, Robert Hill, Maximilian Joesch, Marc Takeno, Hongkui Zeng, Nuno Maçarico Da Costa, Jaime Grutzendler, et al. Flexible learning-free segmentation and reconstruction of neural volumes. Scientific reports, 8(1):1–15, 2018.

[313] Nir Shavit. A multicore path to connectomics-on-demand. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, pages 211–211. ACM, 2016.

[314] F.H. She, R.H. Chen, W.M. Gao, P.H. Hodgson, L.X. Kong, and H.Y. Hong. Improved 3D Thinning Algorithms for Skeleton Extraction. In Digital Image Computing: Techniques and Applications, 2009. DICTA ’09., pages 14–18, December 2009.

[315] Henry Shum and Leslie E Trotter Jr. Cardinality-restricted chains and antichains in partially ordered sets. Discrete applied mathematics, 65(1-3):421–439, 1996.

[316] HK Shum. Chains of bounded length and antichains of bounded width in partially ordered sets. 1990.

[317] Julian Shun. Shared-Memory Parallelism Can be Simple, Fast, and Scalable. Morgan & Claypool, 2017.

[318] Julian Shun and Guy E. Blelloch. Ligra: A lightweight graph processing framework for shared memory. SIGPLAN Not., 48(8):135–146, February 2013.

[319] Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, and Phillip B. Gibbons. Reducing contention through priority updates. In Proceedings of the Twenty-fifth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’13, pages 152–163, New York, NY, USA, 2013. ACM.

[320] Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan. Brief announcement: The problem based benchmark suite. In Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’12, pages 68–70, New York, NY, USA, 2012. ACM.

[321] Julian Shun, Laxman Dhulipala, and Guy E. Blelloch. Smaller and faster: Parallel processing of compressed graphs with Ligra+. In 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, April 7-9, 2015, pages 403–412, 2015.

[322] Parag Singla and Pedro Domingos. Entity resolution with Markov logic. In Proceedings of the Sixth International Conference on Data Mining, ICDM ’06, pages 572–582, Washington, DC, USA, 2006. IEEE Computer Society.

[323] Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. Journal of the ACM (JACM), 32(3):652–686, 1985.

[324] Bert Speelpenning. Compiling Fast Partial Derivatives of Functions Given by Algorithms. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1980.

[325] Filip Srajer, Zuzana Kukelova, and Andrew Fitzgibbon. A benchmark of selected algorithmic differentiation tools on some problems in computer vision and machine learning. Optimization Methods and Software, 33(4-6):889–906, 2018.

[326] Guy L. Steele, Jr. Making asynchronous parallelism safe for the world. In Proceedings of the 17th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’90, pages 218–231, New York, NY, USA, 1990. ACM.

[327] Christopher Stewart, Aniket Chakrabarti, and Rean Griffith. Zoolander: Efficiently meeting very strict, low-latency SLOs. In ICAC, volume 13, pages 265–277, 2013.

[328] Josef Stoer, Roland Bulirsch, Richard H. Bartels, Walter Gautschi, and Christoph Witzgall. Introduction to numerical analysis. Texts in applied mathematics. Springer, New York, 2002.

[329] Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pages 3104–3112, 2014.

[330] Márió Szegedy and Sundar Vishwanathan. Locality based graph coloring. In Proceedings of the Twenty-fifth Annual ACM Symposium on Theory of Computing, STOC ’93, pages 201–207, New York, NY, USA, 1993. ACM.

[331] Tolga Tasdizen, Pavel Koshevoy, Bradley C Grimm, James R Anderson, Bryan W Jones, Carl B Watt, Ross T Whitaker, and Robert E Marc. Automatic mosaicking and volume assembly for high-throughput serial-section transmission electron microscopy. Journal of neuroscience methods, 193(1):132–144, 2010.

[332] Fabian Tschopp. Efficient convolutional neural networks for pixelwise classification on heterogeneous hardware systems. arXiv preprint arXiv:1509.03371, 2015.

[333] Alan M Turing. Rounding-off errors in matrix processes. The Quarterly Journal of Mechanics and Applied Mathematics, 1(1):287–308, 1948.

[334] Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, and I-Ting Angelina Lee. Provably good and practically efficient parallel race detection for fork-join programs. In SPAA, pages 83–94, 2016.

[335] Jacobo Valdes. Parsing Flowcharts and Series-Parallel Graphs. PhD thesis, Stanford University, December 1978. STAN-CS-78-682.

[336] Valgrind Developers. Massif: a heap profiler. Available at http://valgrind.org/docs/manual/ms-manual.html, 2018.

[337] Mark T. Vandevoorde and Eric S. Roberts. WorkCrews: An abstraction for controlling parallelism. International Journal of Parallel Programming, 17(4):347–366, August 1988.

[338] Jacob Vogelstein. Machine intelligence from cortical networks (MICrONS), 2016.

[339] Ashish Vulimiri, Oliver Michel, P. Brighten Godfrey, and Scott Shenker. More is less: reducing latency via redundancy. In Proceedings of the 11th ACM Workshop on Hot Topics in Networks, pages 13–18. ACM, 2012.

[340] Andrea Walther, Andreas Griewank, and Olaf Vogel. ADOL-C: Automatic differentiation using operator overloading in C++. In PAMM: Proceedings in Applied Mathematics and Mechanics, volume 2, pages 41–44. Wiley Online Library, 2003.

[341] D. J. A. Welsh and M. B. Powell. An upper bound for the chromatic number of a graph and its application to timetabling problems. The Computer Journal, 10(1):85–86, 1967.

[342] R. E. Wengert. A simple automatic derivative evaluation program. Commun. ACM, 7(8):463–464, August 1964.

[343] J. G. White, E. Southgate, J. N. Thomson, and S. Brenner. The structure of the nervous system of the nematode Caenorhabditis elegans. Philosophical Transactions of the Royal Society B: Biological Sciences, 314(1165):1–340, 1986.

[344] Adam Wierman and Bert Zwart. Is tail-optimal scheduling possible? Operations Research, 60(5):1249–1257, 2012.

[345] Wiki. Advanced vector extensions, 2016.

[346] WikiChip. Xeon Platinum 8180 - Intel, 2020.

[347] Zhe Wu, Curtis Yu, and Harsha V. Madhyastha. CosTLO: Cost-effective redundancy for lower latency variance on cloud storage services. In 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), pages 543–557, Oakland, CA, May 2015. USENIX Association.

[348] Hong Xu and Baochun Li. RepFlow: Minimizing flow completion times with replicated flows in data centers. In INFOCOM, 2014 Proceedings IEEE, pages 1581–1589. IEEE, 2014.

[349] Jeonghee Yi, Farzin Maghoul, and Jan Pedersen. Deciphering mobile search patterns: A study of Yahoo! mobile search queries. In ACM International Conference on World Wide Web (WWW), pages 257–266, 2008.

[350] Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich. Local higher-order graph clustering. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 555–564, 2017.

[351] Adarsh Yoga and Santosh Nagarakatte. A fast causal profiler for task parallel programs. In ESEC/FSE, pages 15–26, 2017.

[352] Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015.

[353] Jie Yu and Satish Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. In Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pages 325–336, 2009.

[354] Jie Yu and Satish Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. SIGARCH Comput. Archit. News, 37(3):325–336, June 2009.

[355] Marco Zagha and Guy E. Blelloch. Radix sort for vector multiprocessors. In Proceedings of the 1991 ACM/IEEE Conference on Supercomputing, Supercomputing ’91, pages 712–721, New York, NY, USA, 1991. ACM.

[356] Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, pages 2–2. USENIX Association, 2012.

[357] Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. Spark: Cluster computing with working sets. HotCloud, 10:10–10, 2010.

[358] Jeremy Zawodny. Redis: Lightweight key/value store that goes the extra mile. Linux Magazine, 79, 2009.

[359] Aleksandar Zlateski, Kisuk Lee, and H Sebastian Seung. ZNN-A fast and scalable algorithm for training 3D convolutional networks on multi-core and many-core shared memory machines. arXiv preprint arXiv:1510.06706, 2015.
