Hardware-Driven Evolution in Storage Software

by

Zev Weiss

A dissertation submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy (Computer Sciences)

at the

UNIVERSITY OF WISCONSIN–MADISON

2018

Date of final oral examination: June 8, 2018

The dissertation is approved by the following members of the Final Oral Committee:
Andrea C. Arpaci-Dusseau, Professor, Computer Sciences
Remzi H. Arpaci-Dusseau, Professor, Computer Sciences
Michael M. Swift, Professor, Computer Sciences
Karthikeyan Sankaralingam, Professor, Computer Sciences
Johannes Wallmann, Associate Professor, Mead Witter School of Music

© Copyright by Zev Weiss 2018
All Rights Reserved

To my parents, for their endless support, and my cousin Charlie, one of the kindest people I’ve ever known.

Acknowledgments

I have taken what might be politely called a “scenic route” of sorts through grad school. While Ph.D. students more focused on a rapid graduation turnaround time might find this regrettable, I am glad to have done so, in part because it has afforded me the opportunities to meet and work with so many excellent people along the way. I owe debts of gratitude to a large cast of characters:

To my advisors, Andrea and Remzi Arpaci-Dusseau. It is one of the most common pieces of wisdom imparted on incoming grad students that one’s relationship with one’s advisor (or advisors) is perhaps the single most important factor in whether these years of your life will be pleasant or unpleasant, and I feel exceptionally fortunate to have ended up with the advisors that I’ve had. I have always been granted plenty of independence, but also given the guidance and suggestions to point me in a useful direction when I’ve gotten stuck. Andrea’s thorough, thoughtful feedback on paper drafts (and of course this very document) has been invaluable.
Her CS402 program was a rewarding, enjoyable experience (even though I unfortunately missed the semesters when she was actually around running it herself), and provides a wonderful service to Madison schools and their students. I am glad Remzi happened to notice my shenanigans in 537 projects (and deem them acceptable, even when they were horrifying), and in spite of them still provide me the opportunity to teach the same course some years later. I will miss our weekly meetings, especially the ones that veered off course.

To Ed Almasy and Rachael Bower, fearless co-directors of the Internet Scout Research Group, who welcomed me into their wonderful work environment, where I happily remained through nearly my entire career as a grad student.

To Rustam Lalkaka, officemate and co-sysadmin at Scout, who perhaps unintentionally ended up responsible for a large part of how the rest of my time here at UW-Madison has unfolded by suggesting that I take CS537 from Remzi. (“Who?” I said. “He’s cool, you won’t regret it”, he replied, prophetically.)

To Corey Halpin, with whom I have shared countless lengthy and enjoyable conversations on many matters, but most often a shared (excellent!) taste in software.

To Johannes Wallmann, for guiding me through my music minor, running an excellent jazz program at the UW School of Music, and bravely serving on a CS dissertation defense committee!

To Karu Sankaralingam and Mike Swift, for the interesting courses they’ve taught (and taught well), and for agreeing to listen to me defend this dissertation.

To Tyler Harter – the only student coauthor I’ve worked with in my time here, but as excellent a coauthor as one could hope for, with a remarkable knack for presenting complex topics in clear, comprehensible ways.

To Jun He, who has endured me as an officemate for longer than anyone else, providing numerous interesting discussions along the way.
And to all the other members of the Arpaci-Dusseau group I’ve worked with and learned so much from over the last seven years: Ram Alagappan, Leo Arulraj, Vijay Chidambaram, Thahn Do, Aishwarya Ganesan, Joo Yung Hwang, Sudarsun Kannan, Samer al-Kiswany, Jing Liu, Lanyue Lu, Yuvraj Patel, Thanu Pillai, Kan Wu, Suli Yang, Yiying Zhang, Yupu Zhang, and Dennis Zhou.

To the friends I’ve made in the department here: Ben Bramble, Mark Coatsworth, Adam Everspaugh, Thomas Griebel, Rob Jellinek, Kevin Kowalski, Ben Miller, Evan Radkoff, Will Seale, Brent Stephens, Venkatanathan Varadarajan, Ara Vartanian. The diverse discussions, project collaborations, beers on the terrace, and other adventures many and varied have been a pleasure.

To the people I worked with at Fusion-io: Sriram Subramanian, Swaminathan Sundararaman, Nisha Talagala, and the members of the Clones team. I thoroughly enjoyed my time there, and am grateful for the honor of being granted, via custom-emblazoned sweatshirt, the status of “intern emeritus” when I left to return to Madison.

To the many people of SimpleMachines, where there’s never a dull moment, and where almost everything is good.

To Angela Thorpe for being so helpful with administrative questions and deftly coordinating the CS graduate program – especially critical for those among us who perhaps fall slightly toward the less-organized end of the spectrum.

To Eric Siereveld for skillfully directing the UW Latin Jazz Ensemble the two years I was fortunate enough to play in it, and Josh Agterberg, Andrew Baldwin, Rachel Heuer, and Will Porter for providing a wonderful ensemble to perform with for my minor recital.

To Michaela Vatcheva, for so many interesting times and conversations, and for getting me to (after sufficient prodding) finally join the sailing club – my only regret is that I took so long to heed this advice.
To Amy De Simone for befriending me when I had just moved to a new and unfamiliar city, and bringing me along on walks with her dog.

To Becky, for her patience and encouragement, especially in these last few weeks.

And to my sister, Samara, and parents, Alan and Cheryl, for their love and support.

Contents

Contents
List of Figures
Abstract
1 Introduction
  1.1 Trace Replay in the Multicore Era
  1.2 Advanced Virtualization for Flash Storage
  1.3 Cache-Compact Filesystems for NVM
  1.4 Overview
2 Accurate Trace Replay for Multithreaded Applications
  2.1 Introduction
  2.2 Trace Mining
    2.2.1 Trace Inputs
    2.2.2 Inference
  2.3 ROOT: Ordering Heuristics
    2.3.1 Trace Model
    2.3.2 Ordering Rules
  2.4 ARTC: System-Call Replay
    2.4.1 Goals
    2.4.2 ROOT with System-Call Traces
    2.4.3 Implementation
  2.5 Evaluation
    2.5.1 Semantic Correctness: Magritte
    2.5.2 Performance Accuracy
  2.6 Case Study: Magritte
    2.6.1 fsync Semantics
  2.7 Related Work
  2.8 Conclusion
3 Storage Virtualization for Solid-State Devices
  3.1 Introduction
  3.2 Background
  3.3 Structure
  3.4 Interfaces
    3.4.1 Range Operations
    3.4.2 Complementary Properties
  3.5 Implementation
    3.5.1 Log Structuring
    3.5.2 Metadata Persistence
    3.5.3 Space Management
  3.6 Garbage Collection
    3.6.1 Design Considerations
    3.6.2 Possible Approaches
    3.6.3 Design
    3.6.4 Scanner
    3.6.5 Cleaner
    3.6.6 Techniques and Optimizations
  3.7 Case Studies
    3.7.1 Snapshots
    3.7.2 Deduplication
    3.7.3 Single-Write Journaling
  3.8 GC Evaluation
    3.8.1 Garbage Collection in Action
    3.8.2 GC Capacity Scaling
  3.9 Conclusion
4 Cache-Conscious Filesystems for Low-Latency Storage
  4.1 Introduction
  4.2 Filesystem Cache Access Patterns
  4.3 DenseFS
    4.3.1 Data Cache Compaction
    4.3.2 Instruction Cache Compaction
    4.3.3 A Second Generation
  4.4 Evaluation
    4.4.1 Microbenchmark results
    4.4.2 DenseFS1 application results: grep
    4.4.3 DenseFS2 application results: SQLite
  4.5 Related Work
  4.6 Conclusion
5 Conclusions
  5.1 Increasing Core Counts and Trace Replay
  5.2 Flash and Storage Virtualization
  5.3 NVM and Filesystem Cache Behavior
  5.4 Future Work
  5.5 Final Thoughts
Bibliography

List of Figures

2.1 Techniques for I/O-space inference. Active tracing perturbs timing by artificially delaying specific events so as to observe which other events are affected; passive tracing allows all events to occur at their natural pace.

2.2 Example action series. A snippet from a simple system-call trace for two threads is shown in 2.2(a). Beneath each event, a comment lists the resource touched by each system call. 2.2(b) shows the action series corresponding to each resource that appears in the trace.

2.3 Ordering Rules. a1 < a2 means action a1 must be replayed before action a2. acts[create] and acts[delete] represent acts[first] and acts[last], respectively, when the first action in a series is a create or when the last action is a delete. When this is not the case, the constraint does not apply.

2.4 Examples of valid and invalid orderings. Each square represents an action. Different colors represent consecutive generations of the same name. Thick borders indicate creation and deletion events.

2.5 Replay modes. Circles represent reasonable ways to apply rules to resources; filled circles are modes currently supported by ARTC.