RECOVERY TECHNIQUES TO IMPROVE FILE SYSTEM RELIABILITY

by

Swaminathan Sundararaman

B.E. (Hons.) Science (Birla Institute of Technology and Science, Pilani, India) 2005
M.S. Computer Sciences (Stony Brook University) 2007

A dissertation submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Computer Sciences

University of Wisconsin-Madison 2011

Committee in charge:
Prof. Andrea C. Arpaci-Dusseau (Co-chair)
Prof. Remzi H. Arpaci-Dusseau (Co-chair)
Prof. Ben Liblit
Prof. Michael M. Swift
Prof. Shan Lu
Prof. Jon Eckhardt

To my parents

Acknowledgements

I would like to express my heartfelt thanks to my advisors, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau, who guided me in this endeavor and taught me how to do research. I always wanted to build really cool systems during my Ph.D. Not only did they let me do this, but they also taught me the valuable lesson that solutions (or systems) are just a small part of doing research. I also learnt the importance of clarity in both writing and presentation from them. I would like to thank Andrea for teaching me the values of patience, good questions, and paying attention to details. I would like to thank Remzi for helping me select interesting projects and, most importantly, for showing me that research is a lot of fun. I am ever grateful to Remzi for taking the time to write me a long email and talk with me on the phone while I was deciding between universities. If not for that email, I would have ended up taking a full-time position at one of the companies in the Bay Area – and regretted not doing a Ph.D. for the rest of my life.

I am grateful to my thesis committee members – Mike Swift, Ben Liblit, Shan Lu, and Jon Eckhardt – for their insightful comments and suggestions. Their feedback during my preliminary and final exams improved this thesis greatly. I would like to thank Mike Swift for collaborating with us on the Membrane project. It was one of the best learning experiences for me. During our meetings, he would go into the smallest implementation detail and then quickly switch back to the big-picture ideas. After my meetings with him, I would go back and spend a few hours trying to understand what he had been able to say in only a few seconds during the meetings.

I am thankful to Sriram Subramanian for putting up with me during my best and worst times in Madison. Even though both of us graduated together from BITS, we hardly knew each other at the time we came to the UW. The friendship began with me sending him an email asking if he wanted to be my roommate – to which he agreed. Since then, it has been a great journey; he has been my roommate for the last four years, office mate for three years, project mate and collaborator for the last four years, and, last but not least, a great friend. I have spent many hours brainstorming ideas with him and he has helped a lot in refining and validating my ideas and designs.

I am fortunate to have had the opportunity to work with smart and hardworking colleagues during the course of my Ph.D.: Leo Arulraj, Lakshmi Bairavasundaram, Abhishek Rajimwale, Sriram Subramanian, Laxman Visampalli, Yiying Zhang, and Yupu Zhang. I also have enjoyed interacting with other members of the group: Ishani Ahuja, Nitin Agrawal, Vijay Chidambaram, Thanh Do, Haryadi Gunawi, Tyler Harter, Ao Ma, Joe Meehean, Thanumallayan Pillai, Deepak Ramamurthi, Lanyue Lu, and Suli Yang. I would especially like to thank Abhishek, Sriram, Laxman, and Yupu for their hard work, contributions, and often burning the midnight oil with me during deadline times.

A lot of my motivation for pursuing a Ph.D. came from my experiences at Stony Brook University and Birla Institute of Technology and Science (BITS).
I would like to especially thank my advisors: Erez Zadok at Stony Brook, for fostering in me an interest in file and storage systems and teaching me that kernel programming is a lot of fun; Rahul Banerjee at BITS, for enabling me to think big and dream about projects that were beyond my capabilities; and Sundar Balasubramanian at BITS, for teaching me the importance of novelty and creativity in doing research.

I have benefited greatly from interning at VMware, Microsoft Research, and EPFL. I would like to thank the organizations as well as my mentors and managers there: Satyam Vaghani at VMware, Dennis Fetterly at Microsoft Research, and George Candea at EPFL. I would also like to thank the other collaborators at these organizations: Abhishek Rai, Yunlin Tan, and Mayank Rawat at VMware; Michael Isard and Chandu Thekkath at Microsoft Research; and Vitaly Chipounov and Vova Kuznetsov at EPFL.

I would like to thank my friends for making my stay in Madison enjoyable. To name a few: Koushik Narasimhan, Srinath Sridharan, Uthra Srinath, Piramanayagam Arumuga Nainar, Siddharth Barman, Asim Kadav, Janani Kalyanam, Ashok Anand, Deepthi Pachauri, Giri Ravipati, Rathijit Sen, Neelam Goyal, Nikhil Teletia, and Abhishek Iyer. They were happy even for my small successes and cheered me during difficult times. I am going to miss all the fun we had: evening movies, potluck dinners, tennis, cricket, poker sessions, and road trips. I would like to specially thank Kaushik for coming up with fancy names and for the constant nudging during deadlines, which kept it fun for me.

Finally, I would like to thank my family, without whose love and support this Ph.D. would have been impossible. Most of the credit, however, is due to my parents. They live their life to make me happy, and they have given me everything I have needed to find my way in the world and succeed. They have always been very supportive of my decisions and have been the two pillars that I can always lean on. My sister, Harini, has cheered me on during the entire course of my Ph.D. I would also like to thank my cousin, Rajesh, and his wife, Niranjana, for their constant encouragement and support. Finally, I would also like to thank all my relatives who always encouraged and supported me in all of my endeavors.

Abstract

RECOVERY TECHNIQUES TO IMPROVE FILE SYSTEM RELIABILITY

Swaminathan Sundararaman

We implement selective restart and resource reservation for commodity file systems to improve their reliability. Selective restart allows file systems to quickly recover from failures; resource reservation enables file systems to avoid certain failures altogether. Together they enable a new class of more robust and reliable file systems to be realized.

In the first part of this dissertation (on selective restart), we develop Membrane, a generic framework built inside the operating system to selectively restart kernel-level file systems. Membrane allows an operating system to tolerate a broad class of file system failures and does so while remaining transparent to running applications; upon failure, the file system restarts, its state is restored, and pending application requests are serviced as if no failure had occurred. We also develop Re-FUSE, a generic framework designed to restart user-level file systems upon failures. Re-FUSE monitors the user-level file system and on a crash restarts the file system and restores its state; the restart process is completely transparent to applications. We evaluate both Membrane and Re-FUSE, and show, through experimentation, that both of these frameworks induce little performance and space overhead and can tolerate a wide range of crashes with minimal code change.

In the second part of the dissertation (on resource reservation), we develop Anticipatory Memory Allocation (AMA), a technique that uses static analysis to simplify recovery code dealing with memory-allocation failures. AMA determines the memory requirements of a particular call into a file system, and then pre-allocates said amount immediately upon entry; subsequent allocation requests are serviced from the pre-allocated pool and thus guaranteed never to fail. We evaluate AMA by transforming the Linux ext2 file system into a memory-failure robust version of itself (called ext2-mfr). Experiments reveal that ext2-mfr avoids memory-allocation failures successfully while incurring little space and time overheads.

Contents

Acknowledgements vii

Abstract xi

1 Introduction 1
  1.1 Reliability Through Restartability 3
  1.2 Reliability Through Reservation 6
  1.3 Contributions 8
  1.4 Outline 9

2 File Systems 11
  2.1 Overview 11
  2.2 File System Components 15
  2.3 Handling Application Requests 16
  2.4 On-Disk State 17
  2.5 Consistency 20
    2.5.1 Journaling 21
    2.5.2 Snapshotting 22
  2.6 Deployment Types 23
    2.6.1 Kernel-level File Systems 23
    2.6.2 User-level File Systems 24
  2.7 Summary 25

3 Reliability through Restartability 27
  3.1 Failure Model 28
    3.1.1 Definitions 28
    3.1.2 Taxonomy of Faults 29
    3.1.3 System Model 29
    3.1.4 Behavior of Systems on Failures 29
    3.1.5 Occurrence Pattern 32
    3.1.6 Operating System Response to a Failure 32
    3.1.7 Our Approach 33
  3.2 Restarting a file system 34
    3.2.1 Goals 34
    3.2.2 State Associated with File Systems 35
    3.2.3 Different Ways to Restart a File System 37
  3.3 Components of a Restartable Framework 39
    3.3.1 Fault Detection 39
    3.3.2 Fault Anticipation 40
    3.3.3 Fault Recovery 41
  3.4 Summary 42

4 Restartable Kernel-level File Systems 43
  4.1 Design 44
    4.1.1 Overview 44
    4.1.2 Fault Detection 46
    4.1.3 Fault Anticipation 46
    4.1.4 Fault Recovery 49
  4.2 Implementation 50
    4.2.1 Fault Detection 51
    4.2.2 Fault Anticipation 52
    4.2.3 Fault Recovery 57
    4.2.4 Implementation Statistics 60
  4.3 Discussion 61
  4.4 Evaluation 62
    4.4.1 Generality 62
    4.4.2 Transparency 63
    4.4.3 Performance 69
  4.5 Summary 73

5 Restartable User-level File Systems 75
  5.1 FUSE 76
    5.1.1 Rationale 76
    5.1.2 Architecture 77
  5.2 User-level File Systems 79
    5.2.1 The User-level File-System Model 79
    5.2.2 Challenges 81
  5.3 Re-FUSE: Design and Implementation 82
    5.3.1 Overview 83
    5.3.2 Fault Anticipation 83
    5.3.3 Fault Detection 87
    5.3.4 Fault Recovery 87
    5.3.5 Leveraging FUSE 89
    5.3.6 Limitations 89
    5.3.7 Implementation Statistics 90
  5.4 Evaluation 91
    5.4.1 Generality 91
    5.4.2 Robustness 93
    5.4.3 Performance 100
  5.5 Summary 102

6 Reliability through Reservation 105
  6.1 Linux Memory Allocators 106
    6.1.1 Memory Zones 106
    6.1.2 Kernel Allocators 106
    6.1.3 Failure Modes 107
  6.2 Bugs in Memory Allocation 108
  6.3 Overview 111
    6.3.1 A Case Study: Linux ext2-mfr 113
    6.3.2 Summary 115
  6.4 Static Transformation 116
    6.4.1 The AMAlyzer 117
    6.4.2 AMAlyzer Summary 120
    6.4.3 Optimizations 122
    6.4.4 Limitations and Discussion 123
  6.5 The AMA Run-Time 124
    6.5.1 Pre-allocating and Freeing Memory 124
    6.5.2 Using Pre-allocated Memory 125
    6.5.3 What If Pre-Allocation Fails? 126
  6.6 Analysis 127
    6.6.1 Robustness 128
    6.6.2 Performance 128
    6.6.3 Space Overheads and Cache Peeking 129
    6.6.4 Page Recycling 130
    6.6.5 Conservative Pre-allocation 131
    6.6.6 Policy Alternatives 132
  6.7 Summary 132

7 Related Work 135
  7.1 Reliability through Restartability 135
    7.1.1 Restartable OS Components 135
    7.1.2 Restartable User-level Components 138
  7.2 Reliability Through Reservation 140

8 Future Work and Conclusions 143
  8.1 Summary 144
    8.1.1 Reliability Through Restartability 144
    8.1.2 Reliability Through Reservation 146
  8.2 Lessons Learned 147
  8.3 Future Work 148
    8.3.1 Reliability through Automation 148
    8.3.2 Other Data Management Systems 149
  8.4 Closing Words 149

Chapter 1

Introduction

“It’s not the prevention of bugs but the recovery, the ability to gracefully exterminate them, that counts.” – Victoria Livschitz

With the advent of low-cost disk drives, storage capacities and disk utilization have grown at rapid rates [58]. It is now possible for users to store terabytes of data on a modern disk drive. As the amount of data increases, access to data is even more critical, with data unavailability costing users mental anguish and companies millions of dollars per hour [38, 97, 131].

File systems are the commonly used software for managing data on disk. Modern file systems are large code bases with tens of thousands of lines of code, and they support many features, protocols, and operations [23, 191]. Further, file systems are still under active development, and new ones are introduced quite frequently. For example, Linux has many established file systems, including ext2 [32] and ext3 [183], among others [145], and there is still great interest in next-generation file systems such as ZFS [23] and other newer designs [113, 191]. Thus, file systems are large, complex, and under development – the perfect storm for numerous bugs to arise.

A great deal of recent activity in systems research has focused on new techniques for finding bugs in file systems [37, 49, 50, 74, 195]. Researchers have built tools that use static analysis [49, 74], model checking [107, 197], symbolic execution [29, 196], machine learning [106], and other testing-based techniques [11, 14, 138], all of which have uncovered hundreds of bugs in commonly-used file systems.

The majority of the software defects are found in recovery code; i.e., code that is run in reaction to a failure. Although these failures, whether from hardware (e.g., a disk) or software (e.g., a memory allocation), tend to occur infrequently in practice, the correctness of recovery code is nevertheless critical. For example, Yang et al. found a large number of bugs in file-system recovery code; when such bugs were triggered, the results were often catastrophic, resulting in data corruption or unmountable file systems [197]. Recovery code has the worst possible property: it is rarely run, but must work absolutely correctly.

It is challenging to implement robust recovery code for the following reasons. First, current file systems have poor failure models and policies [69]. The recovery code is distributed across the file system, and the success of recovery depends on the correctness of the recovery code in each involved function. Unfortunately, not all functions in file systems and the kernel handle and propagate errors correctly [71, 138]. Even if file system developers are aware of the specific problem, it is hard for them to implement the correct recovery strategy due to inherently complex file system designs [69]. Second, manual implementation of recovery code combined with the large number of error scenarios inhibits the scalability of recovery code. For example, there are around 100 different error cases that can arise in the Linux operating system, and currently there is no single way of handling errors in a file system. A developer has to manually write recovery code for every function in the file system that correctly handles the error code and propagates the error back to the caller. In many cases, the developer needs to implement different recovery strategies for different types of errors [138]. Given the number of errors that one must handle in the file system, it is easy to miss an error scenario or implement an incorrect recovery strategy.

The implications of poor recovery code depend on the nature of the fault and the state of the file system.
In worst-case scenarios, failures can lead to file system crashes. There are two primary reasons that such file system crashes are harmful. First, when a file system crashes, manual intervention is often required to repair any damage and restart the file system; thus, crashed file systems stay down for noticeable stretches of time and decrease availability dramatically, requiring costly human time to repair. Second, crashes give users the sense that a file system “does not work” and thus decrease the chances for adoption of new file systems.

The fact that file systems do not have robust recovery code, combined with the existence of bugs in file system code, leaves us with a significant challenge: How can we promise users that file systems work robustly in spite of their massive software complexity and all the complex failures that can arise? We propose that the right approach is to accept the fact that failures are inevitable in file systems; we must learn to cope with failures rather than hoping to avoid them. To respond to this challenging question in a manner that accepts inevitable failures, we build new recovery techniques on top of increasingly complex and less reliable file systems.

The goals of this thesis are two-fold: first, to develop techniques to improve the availability of file systems in the event of failures; second, to develop techniques to minimize the complexity of recovery code in file systems.

We address the goals of this thesis as follows. To improve availability, we restart file systems on failures, and to simplify recovery code, we employ resource reservation. First, we develop Membrane, an operating system framework that restarts kernel-level file systems on failures [165, 166, 169], and Re-FUSE, a framework built inside the operating system and FUSE to restart user-level file systems on crashes [167]. Second, we develop Anticipatory Memory Allocation (AMA), a technique that combines static analysis, dynamic analysis, and file-system domain knowledge to simplify and minimize the recovery code needed to handle memory allocation failures in operating systems [168]. The following sections elaborate on each of these contributions of the thesis.

1.1 Reliability Through Restartability

Data availability can be improved by restarting file systems after failure. Though recent research has developed techniques to tolerate mistakes in the other components with which file systems interact, these techniques still cannot tolerate failures inside the file system code itself [15, 69, 138]. Failures in file systems can be avoided through prevention techniques (such as bug detection and removal) or tolerated through restart mechanisms. Unfortunately, the prevention techniques that exist today do not uncover all possible bugs in file system code [49, 128]. Hence, a practical solution would be to selectively restart file systems on failures.

In the past, many restart mechanisms have been proposed for restarting both kernel-level and user-level components upon failure. Most of these techniques work only for stateless components such as device drivers or applications that do not have to maintain persistent state on a disk [52, 79, 80, 104, 171, 172, 200]; hence, these techniques are not applicable to file systems. Other restart techniques that can handle persistent storage are heavyweight and require wide-scale code changes in the operating system, file system, or both [31, 42, 103, 156]; the overheads and code changes make such techniques less attractive for commodity file systems.

In the first part of the dissertation, we explore the possibility of selectively restarting kernel-level and user-level file systems on failures. File systems deployed inside the operating system are known as kernel-level file systems, and file systems deployed outside the operating system (i.e., in user space) are known as user-level file systems. As mentioned earlier, the selective restarting of file systems tolerates bugs in the file system code and hence improves data availability in systems. Also, instead of providing a customized solution for individual file systems, we explore the possibility of providing a generic framework that can be leveraged by different commodity file systems.

Kernel-level file systems

To tolerate kernel-level file-system failures, we develop Membrane, an operating system framework to support lightweight, stateful recovery from file system crashes [165]. During normal operation, Membrane logs file system operations, tracks file system objects, and periodically performs lightweight checkpoints of file system state. If a file system crash occurs, Membrane parks pending requests, cleans up existing state, restarts the file system from the most recent checkpoint, and replays the in-memory operation log to restore the state of the file system. Once finished with recovery, Membrane allows the file system to resume service to application requests; applications are unaware of the crash and recovery except for a small performance blip during restart.

Membrane achieves its performance and robustness through three novel mechanisms. First, a generic checkpointing mechanism enables low-cost snapshots of file system state that serve as recovery points after a crash, with minimal support from existing file systems. Second, a page stealing technique greatly reduces the logging overheads of write operations, which would otherwise increase time and space overheads. Finally, an intricate skip/trust unwind protocol is applied to carefully unwind in-kernel threads through both the crashed file system and the kernel proper. This unwind protocol restores kernel state while preventing further file-system-induced damage from taking place.

Membrane does not add new fault detection mechanisms; instead, it leverages the existing fault-detection mechanisms in file systems. File systems already contain many explicit error checks throughout their code. When triggered, these checks crash the operating system (e.g., by calling panic), after which the file system either becomes unusable or unmodifiable. Membrane leverages these explicit error checks and invokes recovery instead of crashing the file system. We believe that this approach will have the propaedeutic side-effect of encouraging file system developers to add a higher degree of integrity checking in order to fail quickly rather than run the risk of further corrupting the system. If such faults are transient (as many important classes of bugs are [111]), crashing and quickly restarting is a sensible manner in which to respond to them.

As performance is critical for file systems, Membrane only provides a lightweight fault detection mechanism and does not place an address-space boundary between the file system and the rest of the kernel. Hence, it is possible that some types of crashes (e.g., wild writes [33]) will corrupt kernel data structures and thus prohibit complete recovery, an inherent weakness of Membrane’s architecture. Users willing to trade performance for reliability could use Membrane on top of stronger protection mechanisms such as Nooks [171].

We demonstrate the benefits of Membrane by evaluating it on three different file systems: ext2, VFAT, and ext3. Through experimentation we show that Membrane enables file systems to recover through a wide range of fault scenarios. We also show that Membrane incurs less than 5% performance overhead for commonly used benchmarks; furthermore, only 5 lines of code needed to be changed to enable existing file systems to work with Membrane.
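To make the control flow concrete, the following user-space C sketch illustrates the general pattern Membrane builds on: an explicit integrity check that would normally panic instead transfers control to a recovery routine that restores the last checkpoint and replays the operation log. This is only an illustration; the names (fs_check, restore_checkpoint, replay_op_log) are hypothetical, and the real Membrane performs this recovery inside the kernel using the skip/trust unwind protocol rather than setjmp/longjmp.

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf recovery_point;          /* where control lands after a "crash" */

    /* Illustrative stand-ins for Membrane-style mechanisms. */
    static void restore_checkpoint(void) { printf("restoring last checkpoint\n"); }
    static void replay_op_log(void)      { printf("replaying logged operations\n"); }

    /* Instead of calling panic(), a failed integrity check triggers recovery. */
    #define fs_check(cond)                                          \
        do {                                                        \
            if (!(cond)) {                                          \
                fprintf(stderr, "fault detected: %s\n", #cond);     \
                longjmp(recovery_point, 1);                         \
            }                                                       \
        } while (0)

    static void handle_request(int block_nr)
    {
        fs_check(block_nr >= 0);            /* a lightweight integrity check */
        printf("request served for block %d\n", block_nr);
    }

    int main(void)
    {
        if (setjmp(recovery_point) == 0) {
            handle_request(10);             /* succeeds */
            handle_request(-1);             /* fails its check, triggers recovery */
        } else {                            /* fault path: restore and continue */
            restore_checkpoint();
            replay_op_log();
            printf("file system back online\n");
        }
        return 0;
    }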

User-level file systems

User-level file systems are an alternative to kernel-level file systems and are commonly run using a platform like File systems in USEr space (FUSE) [141]. FUSE simplifies the development and deployment of user-level file systems, as the file systems run outside the operating system in a separate address space. In the last five years, around 200 different user-level file systems have been implemented using FUSE [192].

Faults in user-level file systems still impact their availability. Though faults in user-level file systems do not impact the correctness or availability of the operating system, applications that depend on the file system are still affected and, in almost all cases, such applications are prematurely terminated.

To understand how user-level file systems work in the real world, we first look at six representative user-level file systems: NTFS-3g, ext2fuse, SSHFS, AVFS, HTTPFS, and TagFS. Using these six file systems, we create a reference model for user-level file systems. From the reference model, we derive the common properties of this type of file system; we also find that, excluding the in-memory file system state, the rest of the state (including on-disk data) is preserved during a user-level file system crash.

We leverage the reference model to develop Re-FUSE, a framework built inside the operating system and FUSE that restarts user-level file systems on crashes [167]. During normal operation, Re-FUSE tracks the progress of in-flight file-system operations and caches the results of the system calls executed by the user-level file system. On a file-system failure, Re-FUSE automatically restarts the file system and continues executing requests from their last execution point using the information recorded during normal operation. Similar to Membrane, an application that is using the user-level file system will not notice the file-system failure, except perhaps for a small drop in performance during the restart.

Re-FUSE implements three basic techniques to enable lightweight restart of user-level file systems. The first is request tagging, which differentiates activities being performed on behalf of concurrent requests; the second is system-call logging, which carefully tracks the system calls executed by a user-level file system and caches their results; the third is non-interruptible system calls, which ensure atomicity of system-call execution by user-level file system threads. We also add page versioning and socket buffering optimizations to further reduce the performance overheads.

We evaluate Re-FUSE with three popular file systems: NTFS-3g, SSHFS, and AVFS. Through evaluation, we show that Re-FUSE can statefully restart the user-level file system while hiding crashes from applications. We test the file systems with commonly used workloads and show that the space and performance overheads associated with running user-level file systems on Re-FUSE are minimal. Moreover, less than ten lines of code changes were required in each of the three file systems in order for them to work with Re-FUSE.
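As a rough illustration of the system-call logging idea, the user-space C sketch below tags each cached result with a request identifier and a sequence number so that, after a restart, a re-executed request can consume cached results instead of re-issuing the calls. The structures and function names are invented for this example and are far simpler than Re-FUSE's actual bookkeeping.

    #include <stdio.h>

    #define MAX_LOG 128

    struct syscall_record {
        int  req_id;        /* which file-system request issued the call */
        int  seq;           /* position of the call within that request  */
        long result;        /* cached return value                       */
    };

    static struct syscall_record log_buf[MAX_LOG];
    static int log_len;

    /* Record the result of a completed system call. */
    static void log_syscall(int req_id, int seq, long result)
    {
        if (log_len < MAX_LOG)
            log_buf[log_len++] = (struct syscall_record){ req_id, seq, result };
    }

    /* After a restart, look up a call that was already executed. */
    static int lookup_syscall(int req_id, int seq, long *result)
    {
        for (int i = 0; i < log_len; i++) {
            if (log_buf[i].req_id == req_id && log_buf[i].seq == seq) {
                *result = log_buf[i].result;
                return 1;       /* cache hit: do not re-issue the call */
            }
        }
        return 0;               /* cache miss: execute the call normally */
    }

    int main(void)
    {
        log_syscall(7, 0, 4096);            /* request 7, first call returned 4096 */

        long r;
        if (lookup_syscall(7, 0, &r))
            printf("replayed cached result: %ld\n", r);
        if (!lookup_syscall(7, 1, &r))
            printf("call not in cache; execute it for real\n");
        return 0;
    }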

1.2 Reliability Through Reservation

Reservation is a popular technique that is used in many systems. In computer systems, reservation can be done for resources such as processors [93], memory [186], and network bandwidth [139]. The reservation of resources helps in improving fairness, quality of service, and reliability. In the context of reliability, the reservation of resources helps to simplify recovery code, as all of the needed resources can be acquired at the beginning of an operation. In other words, through resource reservation, a system ensures that a resource-allocation failure can only happen during the reservation phase, which can be handled easily.

File systems extensively use heap-allocated memory to store in-memory copies of on-disk user data and on-disk metadata, as well as for other temporary objects. Unfortunately, the memory-allocation calls in operating systems can fail; commodity operating systems do not provide guarantees for the success of memory allocation calls. As a result, components (such as file systems) that use memory allocation routines are forced to handle memory-allocation failures at each calling site.

First, to understand the robustness of memory-allocation failure-handling code, we perform fault injection during memory allocation calls in commodity file systems. We test seven different Linux file systems: btrfs, ext2, ext3, ext4, jfs, reiserfs, and xfs. Interestingly, we found that none of the seven file systems is robust to memory-allocation failure. In many cases, the memory-allocation failure resulted in an unusable or inconsistent file system. Moreover, the processes executing file system requests are killed due to the inability of file systems to handle allocation failures correctly. These results are in alignment with previous work that has also shown that memory allocation failures are not handled properly in file systems and can lead to catastrophic results (such as data loss or data corruption) [50, 71, 120, 197].

Given that memory allocation failures are poorly handled in commodity file systems, we develop a technique called Anticipatory Memory Allocation (AMA) to simplify the recovery code that is responsible for handling memory-allocation failures. The idea is simple: we move all the memory allocation calls to a single function that allocates all memory at the beginning of a request. If the pre-allocation succeeds, AMA guarantees that no file-system allocation fails downstream, as all subsequent memory allocations are serviced using the pre-allocated pool. Of course, allocation requests can fail during the pre-allocation phase, but unlike existing systems, we only need to perform shallow recovery, wherein no state modifications have been made to the existing system.

Pre-allocation of memory is challenging in commodity operating systems. File-system requests pass through different layers of the operating system; one example of this is the virtual file system. In each layer, many functions can be invoked, and any of these functions can potentially call a memory allocation routine. Moreover, commodity operating systems such as Linux allocate memory in a variety of ways (e.g., kmalloc, kmem_cache_alloc, and alloc_pages). Hence, to pre-allocate all objects, one needs to identify the potential allocation sites, object types, object sizes, and the total number of objects.
Making this process somewhat more difficult is the fact that the sizes and parameters passed to the memory-allocation calls also depend on the on-disk state of the file system, the input parameters to the system call, and the cached state in the operating system. To determine a conservative estimate of the total memory-allocation demand, AMA combines static analysis, dynamic analysis, and file-system domain knowledge. Using AMA, a developer can augment the file system code to pre-allocate all the memory at the entry of a system call. At run time, AMA transparently returns memory from the pre-allocated chunk for all memory-allocation calls. Thus, when a memory allocation takes place deep in the heart of the kernel subsystem, it is guaranteed never to fail.

With AMA, kernel code is written naturally, with memory allocations inserted wherever the developer requires them; however, the developer need not be concerned with downstream memory-allocation failures and the scattered (and often buggy) recovery code that would otherwise be required. Furthermore, by allocating memory in one large chunk upon entry, failure of the anticipatory pre-allocation is straightforward to handle. We also implement two uniform failure-handling policies (retry forever and fail fast) with little effort.

To show the practicality of AMA, we apply it to the ext2 file system in Linux. We transform ext2 into ext2-mfr, a memory-failure robust version of the ext2 file system. Through experimentation, we show that ext2-mfr is robust to memory allocation failure; even for an allocation-failure probability of .99, ext2-mfr is able to retry and eventually make progress, thus hiding failures from application processes. Moreover, we show that for many commonly-used benchmarks the performance and space overheads of running ext2-mfr are less than 7% and 8%, respectively.
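The following user-space C sketch shows the shape of the run-time side of this approach under simplifying assumptions: a hypothetical per-request pool is reserved once at entry (the only point where failure can occur, handled here with a retry-forever policy), and later allocations are carved out of the pool and therefore cannot fail. AMA itself operates inside the kernel against kmalloc and related allocators; this sketch is only meant to convey the control flow.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    struct ama_pool {
        char   *base;
        size_t  size;
        size_t  used;
    };

    /* Reserve the estimated worst-case memory for one request up front.
     * This is the only place an allocation failure can occur; here we
     * implement a simple "retry forever" policy. */
    static void ama_preallocate(struct ama_pool *p, size_t bytes)
    {
        for (;;) {
            p->base = malloc(bytes);
            if (p->base != NULL)
                break;
            sleep(1);                       /* back off and retry */
        }
        p->size = bytes;
        p->used = 0;
    }

    /* Downstream allocations are served from the pool and never fail,
     * provided the up-front estimate was conservative enough. */
    static void *ama_alloc(struct ama_pool *p, size_t bytes)
    {
        bytes = (bytes + 7) & ~(size_t)7;   /* keep allocations 8-byte aligned */
        if (p->used + bytes > p->size)
            abort();                        /* estimate was wrong: a bug, not a runtime error */
        void *obj = p->base + p->used;
        p->used += bytes;
        return obj;
    }

    static void ama_release(struct ama_pool *p)
    {
        free(p->base);
        p->base = NULL;
    }

    int main(void)
    {
        struct ama_pool pool;
        ama_preallocate(&pool, 4096);       /* entry of the "system call" */
        int  *counts = ama_alloc(&pool, 16 * sizeof(int));
        char *name   = ama_alloc(&pool, 64);
        counts[0] = 1;
        snprintf(name, 64, "inode-%d", counts[0]);
        printf("%s allocated from a %zu-byte pool\n", name, pool.size);
        ama_release(&pool);                 /* exit of the "system call" */
        return 0;
    }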

1.3 Contributions

The contributions of this thesis are as follows:

• We implement Membrane, a framework inside operating systems to restart kernel-level file systems on failures. To the best of our knowledge, this is the first work that shows stateful restarts of file systems are possible in commodity operating systems.

• We show that it is possible to checkpoint the state of kernel-level file systems in a generic way that requires minimal changes to the file system code. In contrast, existing solutions such as journaling or snapshotting require extensive file-system code changes and need to be implemented on a per-file-system basis.

• We develop a new protocol (skip/trust unwind) to selectively avoid recovery code in kernel-level file systems on failures. This protocol also helps clean up any residual state in the kernel and restores the kernel back to a consistent state. Though the protocol was developed in the context of file systems, it is applicable to other operating-system components and user-level programs.

• We develop Re-FUSE, a framework inside the operating system and FUSE to statefully restart user-level file systems on failures. This framework is generic, lightweight, and independent of the underlying user-level file systems. To the best of our knowledge, no other solutions exist to restart user-level file systems on failures.

• We describe a reference model for user-level file systems. Our reference model helps guide the development of new file systems and modifications to existing file systems to work with Re-FUSE.

• We develop AMA to show that it is possible to have shallow recovery for memory-allocation failures during file system requests in commodity operating systems. We also show that it is possible to combine static analysis, dynamic analysis, and domain knowledge to estimate the heap-allocated memory required to satisfy requests in commodity operating systems.

1.4 Outline

The rest of this thesis is organized as follows.

• Background: Chapter 2 provides background on different aspects of file systems: components, request handling, on-disk state, consistency mechanisms, and deployment types.

• Reliability through Restartability: To help understand the design choices of our restartable systems, we first provide a taxonomy of fault models, describe the restart process in file systems, and present the components of a restartable framework in Chapter 3. We then begin presenting the first contribution of this thesis, where we describe the design and implementation of Membrane in Chapter 4; Membrane is a generic framework built inside the operating system to restart kernel-level file systems on failures. Then, in Chapter 5, we describe the design and implementation of Re-FUSE, a generic framework built inside the operating system and FUSE to restart user-level file systems on crashes.

• Reliability through Reservation: Chapter 6 presents the second major contribution of this thesis, where we explore reservation as a mechanism to improve file system reliability. In this chapter, we present the design and implementation of AMA, a solution that combines static analysis, dynamic analysis, and domain knowledge to estimate, and hence pre-allocate, the amount of memory required to satisfy file system requests.

• Related Work: Chapter 7 summarizes research efforts related to building restartable systems and systems that use reservation as a mechanism to improve their reliability.

• Conclusions and Future Work: Chapter 8 concludes the thesis, first summarizing our work and highlighting the lessons learned, and then discussing various avenues for future work that arise from our research.

Chapter 2

File Systems

“There are two ways to write error-free programs; only the third works.” – Alan J. Perlis

This chapter provides background on the various aspects of file systems that are integral to this dissertation. It begins with an overview of the role of file systems in managing user data on a disk, in Section 2.1. Next, Section 2.2 presents the common file-system components and their inter-component interactions. Section 2.3 describes how application requests are processed in file systems. In Section 2.4, the important objects that constitute the on-disk state of file systems and the need for on-disk consistency are described, and Section 2.5 provides an overview of existing consistency mechanisms in file systems. Finally, Section 2.6 presents user-level and kernel-level file systems – the two common ways of deploying commodity file systems.

2.1 Overview

It is difficult for users to directly manage data on a disk. Users are accustomed to the notion of files and directories. Unfortunately, commodity disk systems do not come with interfaces that can deal with files and directories. Rather, commodity disks have a simplified block-based interface, where a disk presents stored data as a sequence of fixed-sized blocks of information [55]. The disk interface typically supports read and write operations; the arguments to these operations are a block number and a buffer, and the result is a data transfer from the buffer to the disk location or vice versa.

File systems are software modules that help users organize their data on one or more disks. File systems allow users to access data at the granularity of files and directories and internally translate user requests into reads and writes of disk blocks. File systems internally maintain metadata to map files to a list of block locations on a disk. Figure 2.1 shows an example of how a file denoted by the name Document A could be stored on a disk by a file system. In addition to the mapping information, file systems also maintain additional information about directory contents, free blocks, and other relevant bookkeeping information.

The fundamental unit of data storage in file systems is a file. A file can be a regular file or a directory. A regular file contains user data and is represented by a name assigned by the user along with a unique identifier that is internally generated by the file system. The unique identifiers help to locate files efficiently on a disk. A directory is a special file whose contents are file or directory names along with their unique identifiers. Directories help in grouping related entities inside a file system for easier access.

For efficient data organization, files are typically maintained in a hierarchical structure within a file system. At the topmost level, there is the main directory, popularly known as the root directory. The purpose of the root directory is to serve as an origin from which the file system grows. Figure 2.2 shows an example of a file-system hierarchy. From the figure, we can see that all files and directories can be reached from the root directory.

File systems need to support a variety of operations in order to enable users to effectively access their data on a disk. The operations that file systems need to support are well-defined by the operating system and library (such as FUSE) developers for kernel-level and user-level file systems, respectively. These operations are required by users and applications to manipulate file system contents. Table 2.1 shows some of the common operations that are supported by file systems.

A file system caches frequently accessed data in memory and periodically synchronizes file system updates to the disk to improve performance. Disks are a few orders of magnitude slower than memory and have significantly longer access times. Because of this, frequent access to the disk significantly slows down the performance of file systems. To improve performance, file systems (or operating systems) typically maintain a cache of frequently accessed data; when data is read or written to a disk, it is first cached in memory by the file system, so that subsequent requests can be serviced from memory instead of going to the disk. Moreover, the caching of frequent updates and lazy writes of dirty data is beneficial because file accesses typically exhibit both spatial and temporal locality [16, 127, 148].

File systems are initialized and removed using mount and unmount operations, respectively. The mount operation initializes and loads the file system state after reading the disk contents. The unmount operation is used to safely persist recent changes to the disk and to clean up the in-memory file system state. More specifically, on unmount, file systems write back all dirty data that has not yet been written to the disk, free any cached data (or objects) in other operating-system components, and destroy any in-memory objects allocated by file systems.


Figure 2.1: Mapping Files to Disk Blocks. This figure shows a user file being internally mapped by a file system and stored in different locations on disk. The user file shown here is Document A, which contains three blocks of data (i.e., ABC, DEF, and XYZ). These three blocks of data are internally mapped by the file system to three different disk locations: 100, 200, and 150. Each disk block contains data and is identified by a unique block number.
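The translation pictured in Figure 2.1 can be sketched in a few lines of C: the file system keeps a per-file list of block numbers and turns a read of the file into reads of those blocks through the disk's block interface. The toy disk array, disk_read function, and file_map structure below are invented purely for illustration.

    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 4

    /* A toy "disk": an array of fixed-sized blocks addressed by block number. */
    static char disk[256][BLOCK_SIZE + 1];

    static void disk_read(int block_nr, char *buf)
    {
        memcpy(buf, disk[block_nr], BLOCK_SIZE + 1);
    }

    /* Per-file metadata kept by the file system: name, identifier, block list. */
    struct file_map {
        const char *name;
        int         identifier;
        int         blocks[3];
        int         nblocks;
    };

    int main(void)
    {
        /* Mirror Figure 2.1: "Document A" maps to blocks 100, 200, and 150. */
        strcpy(disk[100], "ABC");
        strcpy(disk[200], "DEF");
        strcpy(disk[150], "XYZ");
        struct file_map doc = { "Document A", 5432, { 100, 200, 150 }, 3 };

        char buf[BLOCK_SIZE + 1];
        printf("%s (id %d): ", doc.name, doc.identifier);
        for (int i = 0; i < doc.nblocks; i++) {
            disk_read(doc.blocks[i], buf);
            printf("%s", buf);
        }
        printf("\n");                       /* prints: Document A (id 5432): ABCDEFXYZ */
        return 0;
    }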


Figure 2.2: File System Hierarchy Example. The figure shows an example of a simple file-system hierarchy. At the topmost level, we have the root directory. Underneath the root directory, we have usr, home, and lib directories. Further down, the lib directory has both directories (gview and glibc) and files (list.log and Readme) in it.

Operation   Functionality                  Arguments                    Return Value
open        opens a file                   path to a file               fh
close       closes a file                  fh                           status
read        reads data from a file         fh, offset, bytes            data
write       writes data to a file          fh, offset, buffer, bytes    status
mkdir       creates a directory            dir path                     status
rmdir       deletes a directory            dir path                     status
readdir     returns directory contents     dir path                     entries
fsync       flushes dirty data of a file   fh                           status
sync        flushes dirty data to disk     none                         none

Table 2.1: Common File-system Operations. The table shows a few common operations supported by file systems. The Arguments column denotes the parameters that must be passed with the file system operation. Various symbols have been used to condense the presentation: fh - file handle, entries - directory entries, and dir path - path to a directory.
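The operations in Table 2.1 map closely onto the calls applications actually issue; for a kernel-level file system on a POSIX system, the short C program below exercises a few of them (open, write, fsync, read, and close) against a temporary file whose path is arbitrary.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/tmp/fs-ops-demo";
        char buf[64];

        int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);  /* open  -> fh     */
        if (fd < 0) { perror("open"); return 1; }

        write(fd, "hello, file system\n", 19);                  /* write -> status */
        fsync(fd);                           /* flush this file's dirty data to disk */

        lseek(fd, 0, SEEK_SET);
        ssize_t n = read(fd, buf, sizeof(buf) - 1);             /* read  -> data   */
        if (n > 0) {
            buf[n] = '\0';
            printf("read back: %s", buf);
        }

        close(fd);                                              /* close -> status */
        unlink(path);                        /* clean up the temporary file */
        return 0;
    }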

2.2 File System Components

File systems can contain a variety of components depending on the features and guarantees they provide. In general, the majority of file systems contain the following four components: namespace management, data management, consistency management, and recovery. Figure 2.3 shows these file system components along with their inter-component interactions.

The namespace-management component handles all file system requests and is responsible for mapping user-visible filenames to unique identifiers that are internal to the file system. As mentioned earlier, other file system and operating system components do not understand filenames and require the unique identifiers in order to process requests. The common operation that this component handles is path traversal. In path traversal, a user sends in a list of directory names that need to be looked up in the file system hierarchy. Upon a path traversal request, the namespace component checks the permission and validity of each entry in the directory name list and returns the file handle to the caller if the request succeeds.

The data-management component is responsible for managing storage space on disks. The responsibilities of the data-management component include locating and reclaiming disk blocks used by the file system (i.e., free-space management), managing file-system metadata and data locations, and persistently storing file-system metadata on the disk (see Section 2.4 for details). This component is essential in file systems, as disk systems are very simple and do not provide any support for high-level functionality such as free-space management [157].

The consistency-management component is responsible for recording stable file system states to the disk. These stable states are used during the mount operation of file systems. To create such stable states, this component periodically groups file-system updates and atomically writes the updates to the disk (see Section 2.5 for details).

The recovery component is responsible for cleaning up file system state on errors. An error can arise for various reasons: a file system mistake or a bug [49, 195, 197], data corruption [12, 13, 154], or unexpected behavior from the components with which file systems interact [153, 173, 196]. Ideally, file systems should be able to handle or tolerate such errors and continue servicing requests. Hence, this component is critical for ensuring file system availability even in the presence of errors. To the best of our knowledge, this component is never implemented as a stand-alone component, and is always tightly integrated with other file-system components [138].


Figure 2.3: Components of a File System. The figure shows the common components of a file system and their inter-component interactions. The common components are namespace management, data management, consistency management, and recovery. The namespace-management component interacts with the data-management component to retrieve directory contents. The consistency-management component interacts with both the namespace- and data-management components to record file-system state on disk. The recovery component is spread across all file-system components.

2.3 Handling Application Requests

Applications interact with file systems through requests. As mentioned earlier, file systems support a variety of requests, and these requests provide different functionality and guarantees depending on the type and parameters passed to them.

File system requests go through multiple layers in the storage stack when they are executed. Figure 2.4 shows how requests are executed in a kernel-level file system. From the figure, we can see that requests enter the operating system through the system call (or syscall) layer, which checks the validity of requests and then forwards them to a virtual file system or an equivalent layer [100]. The virtual file system layer acts as a switch and forwards the request to the corresponding file system. The file system then processes these requests by accessing its in-memory contents, on-disk contents, or both. File systems might also have to interact with other components in order to reserve resources (such as memory) or use their services to complete requests.

File system requests can be executed in either synchronous or asynchronous mode. In synchronous mode, modifications to user and file-system data are immediately written to the disk. In asynchronous mode, modifications to user and file system data are first cached in memory; a worker thread (or daemon) periodically writes the modifications to the disk in the background. By default, file system requests are executed in asynchronous mode, which helps improve file system performance [89, 125].

2.4 On-Disk State

The on-disk state of file systems consists of both file system metadata and user data. File system metadata consists of data structures such as inodes, bitmaps, extents, superblocks, etc. These metadata objects help locate and maintain user data and system metadata efficiently on disks. To improve their performance, file systems create in-memory copies of their on-disk objects and cache them in memory.

The correctness of the on-disk state is critical to the proper operation of the file system. As mentioned earlier, on-disk data is used to bootstrap the file system to its initial state on a mount operation, and all further actions on the file system depend on this initial state. As a result, file systems constantly write back their in-memory changes to disk and also use additional mechanisms (such as checksums) to ensure the correctness of file system objects on disk [23, 198].

The on-disk format of file system objects differs from one file system to the next. Despite these differences, the functionality and use of such objects remain the same across file systems. To give an overview of the on-disk objects of file systems, we will briefly discuss the on-disk state of the ext2/3 file system. Figure 2.5 shows the ext2/3 on-disk layout. In this on-disk organization (based loosely on FFS [114]), the disk space is split into a number of block groups; within each block group are bitmaps, an inode table, and data blocks. Each block group also contains a redundant copy of crucial file-system control information, such as the superblock and the group descriptors.


Figure 2.4: Processing Requests in File Systems. The figure presents a simplified version of a storage stack and shows how requests flow through different layers in the operating system to execute a file-system operation. Requests enter through the system call layer and, after execution, return to applications with the response. The dotted lines represent calls to the memory-management component by different layers while executing a file-system request.
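The "switch" role of the virtual file system layer described in Section 2.3 (and pictured in Figure 2.4) is commonly realized with tables of operation function pointers. The C sketch below uses a made-up vfs_ops structure and two toy file systems to show the dispatch; real kernels use much richer operation tables and per-mount state.

    #include <stdio.h>

    /* Each file system registers its own implementation of the common operations. */
    struct vfs_ops {
        const char *name;
        int (*read)(int inode_nr, char *buf, int len);
    };

    static int ext2_read(int inode_nr, char *buf, int len)
    {
        (void)buf; (void)len;
        printf("ext2: reading inode %d\n", inode_nr);
        return 0;
    }

    static int vfat_read(int inode_nr, char *buf, int len)
    {
        (void)buf; (void)len;
        printf("vfat: reading inode %d\n", inode_nr);
        return 0;
    }

    static const struct vfs_ops ext2_fs = { "ext2", ext2_read };
    static const struct vfs_ops vfat_fs = { "vfat", vfat_read };

    /* The VFS layer: validate the request, then forward it to the file system
     * that owns the mount point.  This is the "switch" in Figure 2.4. */
    static int vfs_read(const struct vfs_ops *mounted_fs, int inode_nr,
                        char *buf, int len)
    {
        if (inode_nr < 0 || len < 0)
            return -1;                       /* basic sanity checks at the boundary */
        return mounted_fs->read(inode_nr, buf, len);
    }

    int main(void)
    {
        char buf[16];
        vfs_read(&ext2_fs, 12, buf, sizeof(buf));
        vfs_read(&vfat_fs, 99, buf, sizeof(buf));
        return 0;
    }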


Figure 2.5: Ext2/3 On-disk Layout. The figure shows the layout of an ext2/3 file system. The disk address space is broken down into a series of block groups (similar to FFS cylinder groups), each of which is described by a group descriptor and has bitmaps to track allocations and regions for inodes and data blocks. The lower figure shows the organization of an inode. An ext2/3 inode has some attributes and twelve direct pointers to data blocks. If the file is large, indirect pointers are used.

The superblock is the most important block and is used to bootstrap the ext2/3 file system. The superblock contains important layout information, such as the inode count, the block count, and how the block groups are laid out. Without the information in the superblock, the file system cannot be mounted properly and would be unusable.

Group descriptor blocks maintain the summary for the block group they represent. Each one contains information about the location of the inode table, block bitmap, and inode bitmap for the corresponding group. In addition, each group descriptor also keeps track of allocation information such as the number of free blocks, free inodes, and used directories in the group.

Inodes contain the necessary information to locate file and directory data on disk. An inode table consists of an array of inodes, and it can span multiple blocks. An inode can represent a user file, a directory, or another special file (e.g., a symbolic link or device file). An inode primarily stores file attributes (e.g., size, access control list) and pointers to its data blocks. An ext2/3 inode has 12 direct pointers to its data blocks. If more blocks are required (to hold a larger file), the inode will use its indirect pointer, which points to an indirect block that contains pointers to data blocks. If the indirect block is not enough, the inode will use a double indirect block, which contains pointers to indirect blocks. At most, an ext2/3 inode can use a triple indirect block, which contains pointers to double indirect blocks.

A data block can contain user data or directory entries. If an inode represents a user file, its data blocks contain user data. If an inode represents a directory, its data blocks contain directory entries. Directory entries are managed as linked lists of variable-length entries. Each directory entry contains the inode number, the entry length, the file name, and its length.
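A simplified C rendering of this pointer scheme is shown below. The structure is illustrative and does not match the exact on-disk format of ext2/3; the helper assumes 4 KB blocks, and therefore 1024 four-byte block pointers per indirect block, and reports how many levels of indirection are needed to reach a given file block.

    #include <stdio.h>
    #include <stdint.h>

    #define PTRS_PER_BLOCK 1024u   /* 4096-byte block / 4-byte block pointer */

    /* A simplified ext2/3-style inode (not the exact on-disk layout). */
    struct simple_inode {
        uint32_t size;             /* file size in bytes            */
        uint16_t mode;             /* type and permission bits      */
        uint32_t direct[12];       /* 12 direct block pointers      */
        uint32_t indirect;         /* single indirect block         */
        uint32_t double_indirect;  /* double indirect block         */
        uint32_t triple_indirect;  /* triple indirect block         */
    };

    /* How many pointer-block lookups are needed to find file block n? */
    static int indirection_level(uint64_t n)
    {
        if (n < 12)
            return 0;                                   /* direct pointer   */
        n -= 12;
        if (n < PTRS_PER_BLOCK)
            return 1;                                   /* single indirect  */
        n -= PTRS_PER_BLOCK;
        if (n < (uint64_t)PTRS_PER_BLOCK * PTRS_PER_BLOCK)
            return 2;                                   /* double indirect  */
        return 3;                                       /* triple indirect  */
    }

    int main(void)
    {
        uint64_t samples[] = { 0, 11, 12, 1035, 1036, 1049612 };
        for (int i = 0; i < 6; i++)
            printf("file block %llu -> %d level(s) of indirection\n",
                   (unsigned long long)samples[i], indirection_level(samples[i]));
        return 0;
    }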

2.5 Consistency

A file system is said to be in a consistent state if the following conditions are met. First, all of its metadata must correctly point to their respective metadata and data blocks; for example, in the ext2/3 file system, an inode block should correctly point to the indirect and direct blocks that belong to that particular file. Second, the internal set of tables and bitmaps (i.e., file-system-specific metadata) that are used for various pieces of bookkeeping information should match the count and locations of the allocated on-disk objects. For example, in the ext2/3 file system, the summary information in the group descriptor should match the actual utilization and locations of the objects within that group. At a high level, a consistent file-system state ensures that the semantics of the file-system state are respected, which also includes proper security and privacy attributes on users’ data.

The file system state can become inconsistent due to a variety of reasons. For example, power failures can result in only partially updated file-system state on a disk [115], and incorrect behavior can result in wrong updates [138]. Another reason can be hardware failures that lead to file-system updates being lost [61, 173]. Finally, bugs in file and operating-system code can also result in an inconsistent state [49].

A consistent on-disk state is critical for the correct operation of a file system. File systems are stateful; they use their on-disk state to bootstrap the file system during the mount operation and depend on this bootstrapped state to perform subsequent operations. If the on-disk state is inconsistent, the file system will always remain inconsistent, as the same state is observed across reboots. In extreme cases, file systems can perform incorrect operations, which have the potential to lead to catastrophic results (e.g., data loss, data corruption, unusable file systems, or unrecoverable file systems) [15, 70, 196, 197].

The file system state can be fixed using a consistency checker (such as fsck) [19, 76, 99, 115]. Consistency checkers are offline tools or programs that read the on-disk file-system state, validate the contents, and fix any inconsistencies that might exist. Although the repair might lose some file-system updates, there is an attempt to ensure that the fixed file system state is consistent [70]. The main drawback of such consistency checkers is that they are very slow; their scanning and repairing of the on-disk state of file systems can take hours or even days [76–78].

Modern file systems have different crash-consistency mechanisms to help avoid file system inconsistencies due to power failures. The popular crash-consistency techniques that exist today are journaling [18, 113, 170, 183] and snapshotting [82, 191, 198]. These crash-consistency mechanisms are orthogonal to fsck: the file system records additional information during regular operations to help restore it to a consistent state on reboots after power failures.

2.5.1 Journaling

In journaling file systems, extra information is recorded on the disk in the form of a write-ahead log or journal [67]. The common approach taken by journaling file systems is to group multiple file-system updates into a single transaction and atomically commit the updates to the journal. File system updates are typically pinned in memory until the journal records are safely committed to stable storage. By forcing journal updates to disk before updating complex file system structures, this write-ahead logging technique enables efficient crash recovery; a simple scan of the journal and a redo of any incomplete committed operations bring the file system to a consistent state. During normal operation, the journal is treated as a circular buffer; once the necessary information has been propagated to its fixed-location structures, journal space can be reclaimed.

Different modes of journaling exist in modern file systems. These modes provide different recovery guarantees for the stored file-system metadata and data. In some cases, a file system supports multiple journaling modes, and users have the freedom to choose a journaling mode depending on their requirements. The common journaling modes are writeback, ordered, and data.

In writeback mode, only updates to file system metadata are journaled; data blocks are written directly to their fixed location. The writeback mode does not enforce any ordering between the journal and fixed-location data writes, and as a result, writeback mode has the weakest integrity and consistency semantics of the three modes: file-system metadata and data could be out of sync after crash recovery. Although it guarantees integrity and consistency for file system metadata, it does not provide any corresponding guarantees for the data blocks.

In ordered journaling mode, again only metadata writes are journaled; however, data writes to their fixed location are ordered before the journal writes of the metadata. Though the data blocks are not journaled, the ordering attempts to keep file-system metadata and data in sync after recovery. In contrast to writeback mode, the ordered mode provides better integrity semantics, where a metadata block is guaranteed not to point to a block that does not belong to the file.

In data journaling mode, the file system logs both metadata and data to the journal. In this mode, typically both metadata and data will be written out to disk twice: once to the journal, and then later to their fixed location. Of the three modes, data journaling mode provides the strongest integrity and consistency guarantees. However, it has different performance characteristics compared to the other modes, and the performance depends heavily on the workload [137].
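The ordering that makes write-ahead logging safe can be sketched with ordinary file I/O: the transaction and a commit record are forced to the journal before any in-place update is issued. The file names and record format below are invented, and real journaling file systems batch many updates per transaction and write to a dedicated journal area on disk.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Append a record to a file and force it to stable storage. */
    static int write_and_sync(const char *path, const char *record)
    {
        int fd = open(path, O_CREAT | O_WRONLY | O_APPEND, 0644);
        if (fd < 0)
            return -1;
        write(fd, record, strlen(record));
        fsync(fd);                        /* ordering point: record is durable */
        close(fd);
        return 0;
    }

    int main(void)
    {
        const char *update = "set inode 12 size=4096\n";

        /* 1. Log the pending update in the journal and force it to disk. */
        write_and_sync("/tmp/demo-journal", update);

        /* 2. Force a commit record; the transaction is now recoverable. */
        write_and_sync("/tmp/demo-journal", "commit\n");

        /* 3. Only now update the fixed on-disk location (checkpointing).
         *    A crash before this step is repaired by replaying the journal. */
        write_and_sync("/tmp/demo-fixed-location", update);

        printf("transaction journaled, committed, and checkpointed\n");
        return 0;
    }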

2.5.2 Snapshotting

Snapshotting, or copy-on-write (COW), is an alternative technique to journaling that helps preserve a consistent file-system state across crashes. Many recently developed file systems have adopted snapshots as their mechanism for enforcing file-system consistency during crash recovery [23, 82, 191]. Snapshotting works on a simple principle: never overwrite existing metadata or data contents on disk. All updates to file-system data are first cached in memory and are periodically written to a separate location on disk in an atomic fashion. The idea behind this approach is that the original data contents are always preserved, and on recovery, recent updates are visible to users if and only if the snapshot that they belong to has been committed to disk.
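The copy-on-write principle just described can be sketched as follows: an update writes new blocks to fresh locations, makes them durable, and only then switches a single root pointer to the new snapshot. The names used here (struct superblock, alloc_block(), and so on) are illustrative assumptions, not the on-disk format of any real snapshotting file system.

```c
#include <stdint.h>

/* Hypothetical on-disk root: points at the committed snapshot. */
struct superblock {
    uint64_t root;        /* block number of the current snapshot's root */
    uint64_t generation;  /* incremented on every committed snapshot     */
};

/* Stand-ins for block allocation and disk I/O. */
static uint64_t alloc_block(void)                        { static uint64_t n = 100; return ++n; }
static void write_block(uint64_t blk, const void *data)  { (void)blk; (void)data; }
static void flush(void)                                  { /* wait until prior writes are durable */ }

/* Commit a new snapshot without ever overwriting live data. */
static void commit_snapshot(struct superblock *sb, const void *new_tree)
{
    uint64_t new_root = alloc_block();   /* fresh location: old tree untouched   */
    write_block(new_root, new_tree);     /* write the new snapshot's blocks      */
    flush();                             /* make them durable first              */

    sb->root = new_root;                 /* single atomic switch to the snapshot */
    sb->generation++;
    write_block(0, sb);                  /* superblock written last              */
    flush();
    /* A crash before the final flush leaves the old snapshot intact; the
     * partially written blocks of the new one are simply reclaimed. */
}
```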

Snapshotting works well due to its copy-on-write approach. Since snapshotting does not overwrite any existing data on disk, it always preserves file-system consistency by writing updates to a new location. Once all updates of a snapshot are written to disk, the file system atomically switches to the new snapshot. In the event of a failure, the file system simply discards (i.e., reclaims) the partially written blocks of any uncommitted snapshots and bootstraps the file system from the last consistent (i.e., committed) snapshot on disk. It is easy to see that recovery through snapshots can be lossy; recent updates that are part of an uncommitted snapshot are simply discarded during recovery.

In summary, these crash-consistency approaches ensure that the file system can always be restored to the most recent consistent state using the associated metadata they maintain. The recovered file-system state is guaranteed to be consistent if there are no errors or bugs in the underlying storage system [138]. The disadvantage of these approaches is that they do not recover all file-system updates made before a crash, but only those up to the last committed checkpoint (or transaction). Moreover, crash-consistency mechanisms are specialized for a particular file system and are not generic enough to be applied to a variety of file systems.

2.6 Deployment Types

File systems can be deployed in two ways: inside the kernel or as a user-level process. We will now discuss both types of deployments.

2.6.1 Kernel-level File Systems

File systems that are deployed and executed as part of an operating system are known as kernel-level file systems. Kernel-level file systems are widely deployed and used in real systems [61, 83, 87]. Kernel-level file systems run in the same address space as the operating system and directly manage the data stored on disks or across the network. Examples of kernel-level file systems are ext3 [183], xfs [170], FAT32 [116], NTFS [158], and HFS [6].

Kernel-level file systems come with many advantages. They can leverage the design and features of the operating system to the fullest, as they are tightly integrated with the operating system code. These file systems can effectively control when data gets written to the disk or sent across the network, as they have direct access to operating system components, such as block and network drivers. Moreover, these file systems also have better control over cached in-memory data. The combination of tight integration with the operating system, direct access to operating system components, and control over cached data helps kernel-level file systems provide better performance. Kernel-level file systems are also considered secure, as it is difficult for a malicious user to change the file-system code or manipulate its in-memory objects.

Though kernel-level file systems are widely used in real systems, they do have a few limitations. First, it is hard to port kernel-level file systems to other operating systems. Second, kernel-level file systems have longer development cycles [199]. Third, a bug in kernel-level file-system code has the potential to easily bring down the entire operating system, as kernel-level file systems run in the same address space as the operating system. Fourth, one needs skilled programmers with in-depth operating-system knowledge to design and develop these file systems. Finally, it is difficult to add all user-desired features to these kernel-level file systems.

2.6.2 User-level File Systems

File systems that are deployed and executed in user space are known as user-level file systems. User-level file systems provide better flexibility in terms of the features that they offer. Moreover, these file systems are completely isolated from the operating system and run as regular processes with no special privileges. Examples of user-level file systems are SSHFS [163], NTFS-3g [181], AVFS [159], and HTTPFS [161].

User-level file systems have many advantages. They are relatively easy to develop and deploy, as most of them are only a few thousand lines of code. They can easily be ported to other operating systems with little to no effort. They can provide specialized functionality on top of existing kernel-level file systems. These file systems can be developed by a regular programmer and do not require a highly skilled developer with a deep understanding of operating-system components. User-level file-system failures no longer impact the availability, correctness, and consistency of the entire operating system, as user-level file systems are completely isolated from the operating system.

Though user-level file systems are simpler than kernel-level file systems, they are not widely deployed, for the following reasons. First, they are less secure than kernel-level file systems, as they run in user space. Second, their performance overheads are higher than those of kernel-level file systems, due to the additional copying of data across the file-system/kernel boundary and context switches in the operating system [141]. Finally, they do not have control over the quantity of dirty data that gets written to the disk, nor do they have control over when dirty data gets written to the disk; hence, they cannot provide good crash-consistency guarantees.
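As an illustration of how little code a user-level file system can require, the sketch below is a minimal read-only file system written against the libfuse high-level API (the FUSE 2.x interface is assumed); it exposes a single file whose contents are a fixed string. Each operation runs in an ordinary user process, so a crash in this code affects only the process itself.

```c
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>

static const char *msg        = "hello from user space\n";
static const char *hello_path = "/hello";

/* Report attributes for the root directory and the single file. */
static int hello_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755; st->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        st->st_mode = S_IFREG | 0444; st->st_nlink = 1;
        st->st_size = (off_t)strlen(msg);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* List the root directory: ".", "..", and "hello". */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                         off_t off, struct fuse_file_info *fi)
{
    (void)off; (void)fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    fill(buf, ".", NULL, 0);
    fill(buf, "..", NULL, 0);
    fill(buf, hello_path + 1, NULL, 0);
    return 0;
}

/* Serve reads out of the fixed in-memory string. */
static int hello_read(const char *path, char *buf, size_t size, off_t off,
                      struct fuse_file_info *fi)
{
    (void)fi;
    size_t len = strlen(msg);
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if (off >= (off_t)len)
        return 0;
    if (size > len - (size_t)off)
        size = len - (size_t)off;
    memcpy(buf, msg + off, size);
    return (int)size;
}

static struct fuse_operations hello_ops = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* The FUSE library forwards kernel requests to the handlers above. */
    return fuse_main(argc, argv, &hello_ops, NULL);
}
```

Compiling this against libfuse and running the binary with a mount point as its argument is enough to deploy it, which is the sense in which user-level file systems are easy to develop and port.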

2.7 Summary

In this chapter, we provided a brief background on how file systems manage user data on a disk. We then gave an overview of namespace management, data management, consistency management, and recovery in file systems, along with their interactions to persist file-system changes to disk. We then described the different on-disk objects in file systems, along with the need for consistency of the on-disk state. We concluded the chapter by giving an overview of kernel-level and user-level file systems. In the following chapters, we will explore different solutions to tolerate failures in file systems.

Chapter 3

Reliability through Restartability

“Failure is not falling down but refusing to get up.” – Chinese Proverb

It is difficult for a file system to recover from a failure, as file systems maintain a significant amount of in-memory state, on-disk state, and OS state. Upon a file-system failure, the common recovery solution is to crash and restart the affected file system. Such crash-restart recovery mechanisms may require an entire operating-system reboot, a manual restart of the file system followed by a consistency check (fsck) of the file system, or both. This process of recovering from a file-system failure through an explicit restart is slow; applications can no longer use the crashed file system and hence are forcefully killed.

A popular way to improve reliability is to restart systems on failures [31, 174], with the goal being to selectively restart a particular component (or a sub-component) on failures. Such selective restarts can potentially help hide file-system failures from applications and other operating-system components. Moreover, selective restarts allow applications to survive file-system crashes.

Recent research, such as EROS and CuriOS, has proposed solutions for tolerating file-system bugs through stateful restart mechanisms [42, 156]. Unfortunately, these solutions have required a complete redesign and rewrite of both OS and file-system code. Moreover, solutions that require extensive code restructuring are not viable for commodity operating systems and file systems, as extensive code changes take a long time to be adopted in the mainline kernel. This is attributed to the fact that extensive changes tend to reduce the stability, and hence the reliability, of the system.

In this chapter, we explore the possibility of creating generic frameworks to restart both kernel- and user-level file systems.

The intuition behind our approach is that we do not want to customize the restart process to suit a particular file system. Instead, our intention is to create a generic restart mechanism that is suited to most file systems. We believe that because this would likely require only a one-time change to the operating-system code, it could easily be leveraged by the file systems that run on it.

The rest of this chapter is organized as follows. First, in Section 3.1, we explain the fault space and failure model in file systems. Section 3.2 gives an overview of the restart process in file systems. Finally, in Section 3.3, we look at the three important components of a restartable framework for file systems: detection, anticipation, and recovery.

3.1 Failure Model

File systems can fail in a variety of ways [90, 138, 173]. A failure in a file system impacts its availability. A failure may be caused by developer mistakes, an incomplete implementation (such as missing or improper error handling), or a variety of other issues. When a failure occurs, a file system becomes unavailable until it is restarted. It is important to understand how systems fail in order to determine the trade-offs between performance and reliability in a framework designed for restartable file systems. Before we present the failure model, we first define the common terms used in this research and provide a taxonomy of faults. We then present our system model, the behavior of systems on failures, failure occurrence patterns, and the operating-system response to a failure. Finally, we present our approach to handling file-system failures.

3.1.1 Definitions

• System: An entity that interacts with other entities. An entity may refer to hardware, software, a human, etc.

• System Boundary: The common frontier between the system and its environment.

• Fault: A fault is a flaw in the software or hardware.

• Error: In a system, any deviation from the system’s correct state is defined as an error. The correct state of a system is defined as the state that is achieved when the system’s functionality is implemented correctly. Alternatively, we can define an error as a condition that occurs when a fault is executed or activated and the system state is corrupted.

• Failure: We define failure as an event that occurs when the observable output of a system deviates from its correct state. Alternatively, a failure is said to have happened if the error causes the system to behave incorrectly.

3.1.2 Taxonomy of Faults

All faults that can affect a system when activated are classified into eight fundamental types, which are shown in Figure 3.1. Faults need not be restricted to these eight classes; combinations of faults from different categories are also possible. Most relevant to file and storage systems are the following fault categories: system boundary, dimension, persistence, capability, and occurrence phase [12, 49, 95, 128, 138, 173, 176].

3.1.3 System Model

In the context of our research, the system is a file system. The system boundary is the file-system interface that file systems use to interact with other components. Our goal is not to handle failures outside of file systems, but rather to improve the fault tolerance of file systems; we therefore focus on failures inside file systems. Faults may occur either within or outside of the file system, but in this research, we assume that the consequence of a fault is a file-system failure. Figure 3.2 shows the system model for user-level and kernel-level file systems.

For kernel-level file systems, we assume that only the file-system state is affected by the fault and that the failure is isolated within the file system; we trust that the data available in the other operating-system components (such as memory management, the virtual file system, and the block layer) can be used to recover the crashed file system. For user-level file systems, we assume that a fault results in a file-system failure and affects the file-system state; all the other components (i.e., the operating system, FUSE, and any remote host) work correctly after a user-level file-system failure.

3.1.4 Behavior of Systems on Failures

The failure behavior of a component determines how a system contains and detects faults. For example, the operating system detects user-level file-system crashes through process termination.

Figure 3.1: Basic Fault Classes. The figure shows the elementary fault classes in systems. The first level in the hierarchy shows the base fault classes and the second level shows further classification within the base classes.

• Phenomenological Cause: Natural Faults [caused by natural phenomenon]; Human-Made Faults [caused due to human actions]

• Phase of Occurrence: Development Faults [occur during development and testing]; Operational Faults [occur during execution in the real world]

• System Boundary: Internal Faults [originate inside the system boundary]; External Faults [originate outside the system boundary]

• Persistence: Permanent Faults [assumed to be continuous in time]; Transient Faults [assumed to be bounded in time]

• Capability: Accidental Faults [introduced inadvertently]; Incompetence Faults [result due to lack of professional competence]

• Dimension: Hardware Faults [originate in, or affect, hardware]; Software Faults [affect the data or program]

• Intent: Deliberate Faults [result due to a conscious decision]; Non-Deliberate Faults [introduced without awareness]

• Objective: Malicious Faults [caused harm to the system]; Non-Malicious Faults [introduced without awareness]

Figure 3.2: System Model. The figure shows the deployment of user-level and kernel-level file systems. The system consists of the operating system and the file system. In our model, we assume that faults can occur within or outside the file system, but failures happen only inside file systems.

Failures are categorized into five different groupings based on failure behavior [41]:

• Omission Failures: which occur when a system fails to produce an expected output.

• Crash Failures: which occur when the system stops producing any output.

• Arbitrary Failures: which occur when some or all of the system users perceive incorrect service (or output).

• Response Failures: which occur when a system outputs an incorrect value.

• Timing Failures: which occur when a system violates timing constraints.

Each failure category could require its own detection and isolation mechanism, with such mechanisms also requiring additional support in terms of software and hardware implementation.

3.1.5 Occurrence Pattern

The activation of the fault that causes a failure determines the possible strategy for tolerating that failure. A failure caused by a fault that is activated by a regularly occurring event is defined as a deterministic failure. Since such failures are deterministic, the recovery strategy should ensure that the triggering event does not occur again during recovery. In the context of file systems, deterministic failures can occur for a variety of reasons: hardware failures [12, 138], or corrupt file-system or operating-system state [13, 14].

A failure caused by an irregularly occurring fault is defined as a transient failure. Transient failures are usually triggered by a combination of inputs, such as request interleavings or rarely occurring environmental conditions [66]. In the context of file systems, transient failures may be caused by a faulty SCSI backplane, cables, or Ethernet cables [173]. Such faults typically do not recur on subsequent requests or retries. Empirical evidence also suggests that many failures are transient in nature [65].

3.1.6 Operating System Response to a Failure

When a file-system (or component) failure occurs, the operating system may translate the failure from one category to another. The translation of a failure also depends on the failure policy defined in the file system. For example, Linux either crashes or remounts the file system as read-only when it detects file-system failures. Moreover, a file system could first attempt to fix the failure by itself. For example, ZFS attempts to fix disk corruption by using redundant copies of the data stored on other disks. Failure translation also helps simplify the failure-handling policy: a higher-level system – such as an operating system – must only handle a single category of component failure rather than manage the variety of component failures that may occur.

Many fault-tolerant systems, such as Phoenix and Hive, perform failure translation in order to simplify recovery [33, 54]. These systems, on detection of a failure in a component, halt the component as a means of preventing further corruption. In order to halt a component (i.e., translate to a fail-stop failure), a system must meet the following three conditions, defined in the fail-stop processor model [193]:

• Halt on failure: the system halts the component before performing an erroneous state transformation.

• Failure status: the failure of a component can be detected.

• Stable storage: the component state can be separated into volatile storage, which is lost after a failure, and stable storage, which is preserved and unaffected by the failure.

The fail-stop model of handling failure simplifies the design of reliable systems. The system and its components deal only with correct and incorrect states and not with other failure modes. In other words, the simplicity in design is achieved through the ability to differentiate between correct and incorrect states, and to partition memory into correct and corrupted storage.

The drawback of the fail-stop model is that it is difficult to achieve in commodity operating systems. Commodity operating systems typically run their components in the operating-system address space, allowing components to spread their state throughout the operating system. To transform component failures into fail-stop faults, the operating system might require additional support, such as address-space isolation [171].

3.1.7 Our Approach

In designing our restartable systems, we make several assumptions about file-system failures. First, we assume that most file-system failures are crash failures and response failures. Furthermore, we assume that these failures are fail-stop and can be detected and stopped before the kernel or other components are corrupted. When possible, we also add mechanisms that can transform detected faults in file systems into fail-stop failures. Second, we assume that most file-system failures are transient. Thus, a possible recovery solution is to restart the file system and retry the requests that were in progress at the time of the failure, as the failure is unlikely to occur again. We believe that this is a reasonable solution, as most failures in the real world, including storage-system failures, are transient in nature [66, 173]. Although we assume transient failures, we attempt to handle deterministic failures if they occur in the context of a file-system request.

Our goal is not to handle malicious faults, natural faults, human-made faults, development faults, incompetence faults, or deliberate faults. For example, avoiding development faults (such as logic errors) is critical for the correct operation of the file system; we believe that such bugs should (and likely will) be detected and eliminated during development and testing.

3.2 Restarting a File System

Restarting a crashed file system is not straightforward. File systems have a great deal of state spread across memory and disk. Moreover, applications also have some in-memory state (such as file descriptors) that is associated with a particular file system. Additional care must be taken to restore all such state after a file-system restart. In this section, we first describe the goals of restartable frameworks designed for user- and kernel-level file systems. We then describe the state associated with a running file system. Finally, we describe the different ways of restarting a crashed file system.

3.2.1 Goals

We believe there are five major goals for a system that supports restartable file systems.

• Fault Tolerant: A large range of faults can occur in file systems. Failures can be caused by faulty hardware and buggy software, can be permanent or transient, and can corrupt data arbitrarily or be fail-stop. The ideal restartable file system recovers from all possible faults.

• Lightweight: Performance is important to most users, and most file systems have had their performance tuned over many years. Thus, adding significant overhead is not a viable alternative: a restartable file system will only be used if it has performance comparable to existing file systems.

• Transparent: We do not expect application developers to be willing to rewrite or recompile applications for this environment. We assume that it is difficult for most applications to handle unexpected failures in the file system. Therefore, the restartable environment should be completely transparent to applications; applications should not be able to discern that a file system has crashed.

• Generic: A large number of commodity file systems exist, and each has its own strengths and weaknesses. Ideally, the infrastructure should enable any file system to be made restartable with little or no changes.

• Maintain File-System Consistency: File systems provide different crash-consistency guarantees, and users typically choose their file system depending on their requirements. Therefore, the restartable environment should not change the existing crash-consistency guarantees.

Many of these goals are at odds with one another. For example, higher levels of fault resilience can be achieved with heavier-weight fault-detection mechanisms. Thus in designing restartable file systems, we explicitly make the choice to favor performance, transparency, and generality over the ability to handle a wider range of faults. In this thesis, we investigate the following questions:

• Is it possible to implement a generic framework to restart file systems?

• Is a lightweight solution sufficient?

• How transparent is the restart process to applications?

• How many modifications are required to transform commodity file systems to work with a framework that supports restartability?

• Can a restartable framework respect file-system-consistency guarantees?

3.2.2 State Associated with File Systems

File systems have a great deal of state spread across different components. The number of components depends on how the file system is deployed. File systems running inside the operating system (i.e., kernel-level file systems) have their state spread across other operating-system components and share the same address space as the operating system. File systems running in user space have most of their state within their own address space, with some of their state spread across the underlying FUSE and storage systems (such as disks or networks).

The state that needs to be restored on a file-system restart consists of application-specific state, in-memory file-system state, on-disk file-system state, and operating-system state. Restoring all of the above-mentioned state after a failure makes restarting file systems challenging. We now discuss the file-system-associated state in detail.

Application-specific state

An application (or a process) using the file system has state inside the operating system that is specific to the files or directories it is using. The state typically associated with an application consists of file handles, file positions, file locks, and callbacks. All of the above-mentioned state is typically maintained by the virtual file system or an equivalent layer inside the operating system [100]. A file handle, as the name suggests, serves as a unique identifier to access a file that has been previously opened by the application. A file position is an index into a file’s data and is updated at the end of a read or write operation. A file lock is used to provide atomicity for updates by user-level processes. Finally, a callback (such as inotify or dnotify) helps utilities (such as desktop search) easily identify modified or updated files or directories [112]. All such state needs to be tracked and restored after a restart.
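The sketch below summarizes, as a plain C structure, the kind of per-open-file session state that must be captured so it can be reattached after a restart; the field names are illustrative and do not correspond to any particular kernel’s data structures.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Illustrative record of the application-visible state behind one open
 * file descriptor.  After a file-system restart the descriptor must keep
 * working, so this is the state that has to be captured and later
 * reattached to the recreated file object. */
struct session_state {
    int      fd;          /* descriptor number handed to the application */
    ino_t    inode;       /* identity of the file it refers to           */
    off_t    position;    /* current file offset                         */
    bool     locked;      /* does the process hold a lock on the file?   */
    unsigned watch_mask;  /* change-notification events registered       */
};
```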

Operating-system state

File systems leverage operating-system components (such as memory management, the network, the block layer, and the virtual file system) to execute file-system requests to completion. For example, during the execution of a file-system request, in-memory objects in the operating system could be created or updated, or a lock could be acquired or released. In the event of a file-system failure, one needs to ensure that changes made in the operating system by partially completed requests are properly cleaned up. This cleanup helps ensure that the operating system is restored to a consistent state, which allows it to service subsequent requests.

In-memory File-system State

The in-memory state of the file system mainly consists of three components. First, application-specific file-system state is created in the process of executing application requests; a good example of such state is a file object. Second, file systems cache recently accessed data from the disk to improve overall performance. Such cached in-memory state need not be restored, as it will be recreated during subsequent accesses; in other words, since read-only data does not affect the correctness of the file-system state, it is not necessary to restore all of the cached in-memory state. Finally, file systems also maintain their own metadata (such as bitmaps, extents, etc.) in memory. Only the dirty metadata that has not yet been written back to the disk needs to be restored, as in-memory copies of metadata are also persistently stored on disk.

On-Disk File-system State

Changes to the file-system state need to be persistently written to the disk. These persistent writes help ensure that modified or updated data survives operating-system reboots. The on-disk state of a file system consists of metadata specific to the file system (described in Section 2.4) and data written by applications. The file-system-specific metadata needs to be consistent on disk to prevent further damage to the stored data [13, 15, 138].

3.2.3 Different Ways to Restart a File System

File systems can be restarted in different ways depending on the user’s needs and requirements. We characterize the restart process into three different categories depending on the recovered state. The recovered state includes the file-system state (both in-memory and on-disk), the application state, and the operating-system state.

Primitive Restart

Primitive restart is the restart mechanism currently used in commodity file systems. The goal of primitive restart is to restore the file system to a consistent state after a crash. This “consistent” state need not be the state that existed at the time of the crash; in other words, some of the updates to the file system prior to a crash could be lost during the restart process. Upon a fault in the file system, the operating system is restarted (applicable only to kernel-level file systems) and the file-system state is then restored using the recently recorded file-system state on disk. The recorded file-system state could be created using any of the crash-consistency mechanisms (see Section 2.5). In the event that a file system does not have any crash-consistency mechanism, a file-system utility (such as fsck) is run to repair and recover the file-system state.

The advantage of primitive restart is that it does not require any explicit support from the operating system or the file system. The drawbacks of this approach are that it is lossy, manual, and slow. The application state is lost, and applications have to be manually restarted to use the file system. Moreover, for kernel-level file systems, the operating-system state is also lost, and all other applications or processes are also killed and have to be manually restarted.

Stateless Restart

We define stateless restart as an automatic restart and restore of a crashed file system, but not of applications or the underlying operating system. Stateless restart can be achieved if faults or failures are isolated within the file system; in other words, the other components that the file system interacts with should not be affected by the fault. One can restore the file system to a consistent state by simply leveraging a crash-consistency mechanism, an offline checker, or both.

Stateless restart is better than primitive restart for the following reasons. First, applications or processes that are not using the file system are not affected by the file-system restart. Second, the entire operating system need not be restarted (i.e., there is less downtime). Third, no manual intervention is required to restart a crashed file system.

The drawback of stateless restart is that application-specific state is not restored after a file-system restart. The disconnected application state (e.g., a file position pointing to a non-existent file location) forces developers to handle incorrect file-system behavior inside applications. Also, like primitive restart, some of the recent file-system updates could be lost; the magnitude of the loss depends on the time of the crash and the parameters defined in the crash-consistency mechanism.

Stateless restart is difficult to implement for kernel-level file systems and easier for user-level file systems. Implementing stateless restart is tricky for kernel-level file systems, as they have state that is spread across operating-system components; moreover, sophisticated techniques are still required to isolate failures within file systems in order to correctly restart them and restore their state without corrupting the underlying operating-system state. Stateless restart can be easily achieved for user-level file systems, as they do not have any explicit state inside the operating system.

Stateful Restart

Stateful restart, as the name indicates, restarts and restores the state associated with applications, the file system, and the operating system on a file-system failure. The restored state is close (or equivalent) to the state that existed prior to the file-system crash. The idea behind stateful restart is to recover the file system in such a way that applications and other operating-system components are oblivious to file-system failures and restarts. Also, after a restart, the file system can continue servicing both pending and new requests.

There are many advantages to stateful restart. First, applications can be made oblivious to file-system failures. Second, the services running inside the operating system, including those that depend on file systems, can continue to work correctly. Third, the downtime on faults can be minimized to a large extent. Fourth, no user intervention is required to restart the application or the file system after a failure.

The major drawback of stateful restart is that one needs to track many updates (or states) across the file system and the operating system in order to correctly restore the file system to the state it was in before the crash. Moreover, one also needs to ensure that the side effects of all in-flight requests are correctly undone; this undo process allows in-flight requests to be re-executed.

Our goal in designing restartable frameworks is to support stateful restart of file systems. As mentioned earlier, stateful restarts enable applications, file systems, and the operating system to gracefully tolerate a wide variety of file-system failures. In most cases, when we have perfect recovery, applications can continue to use the file system even after a crash. In other cases, the file system can continue to service new requests that arrive after the crash. Either way, file-system reliability is significantly improved in comparison with existing commodity file systems.

3.3 Components of a Restartable Framework

A framework that provides restartability for stateful systems needs to identify faults when they occur, continually record system state in preparation for a failure, and recover the system when faults happen. Thus, fault detection, fault anticipation, and fault recovery are the three fundamental components of a framework for restartable file systems. The functionality of the three components is common across kernel- and user-level file systems, but their implementations can differ significantly depending on the system (or subsystem) that they interact with.

3.3.1 Fault Detection

The fault-detection component is responsible for identifying occurrences of faults within file systems. As seen before, faults may only be detected after arbitrary periods of time. Without timely fault detection (i.e., in the absence of fail-stop faults), the file system or the operating system could become corrupted and unrecoverable after a failure. Hence, the goal of fault detection is to reduce the time delay between the occurrence and the detection of faults. The detection of faults can be implemented inside (such as assertions) and outside (such as hardware checks) of a file system.

Fault detection can be implemented at different granularities. In terms of granularity, a file-system developer can detect faults at the level of instructions, functions, requests, or modules. At the instruction level, a file-system developer can check whether the syntax and semantics of the operations are respected; hardware-level checks, such as segment-violation checks, are a good example of instruction-level fault detection. At the function level, a file-system developer can add simple checks (such as assertions) to one or more statements or can add a higher-level semantic check at the beginning or at the end of a function. At the request level, a file-system developer can add syntactic and semantic checks at various execution points. Finally, at the module level, a file-system developer can detect faults by monitoring the module’s liveness and responses. For a flexible fault-detection mechanism, a file-system developer can also use a combination of checks at different granularities.

Fault detection can also be implemented at different boundaries. Boundaries act as a natural divide between two entities; an entity could be an instruction, a function, or a module. At the instruction-level boundary, a file-system developer can check the input parameters for the next instruction. At the function-level boundary, a file-system developer can check the input and return values between function calls. At the module-level boundary, we can add checks for the input and output values. It is important to note that the above-mentioned checks could also include checks for verifying or validating the state of different objects before or after a boundary crossing.

One can further improve fault-detection techniques by adding heavyweight mechanisms. A good example of a heavyweight mechanism is running specialized file-system checks after a few operations to verify both the data and the semantics of the operations. Another example is adding address-space protection [171]. Unfortunately, such checks come at a very high performance cost and, hence, are not in alignment with our goals.

We must use checks for fault detection with caution. On one extreme, too many checks could significantly slow down the system; on the other extreme, too few checks might not catch many faults. There is always a trade-off between performance and reliability. Since our focus is more on anticipation and recovery, we leave the choice of implementing such checks to file-system developers, who can make an informed decision based on the needs of users (or applications). Ideally, we envision many lightweight checks that would improve fault detection without impacting the overall performance of the system.
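As one example of a lightweight, function-level check, a failed assertion can be routed into a recovery hook rather than halting the machine. The sketch below shows the idea in C; fs_recovery_entry() and fs_read_block() are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Hypothetical entry point into the recovery machinery.  In a real system
 * it would mark the file system as failed, unwind the thread, and restart;
 * here it merely reports, as a stand-in. */
static void fs_recovery_entry(const char *file, int line, const char *expr)
{
    fprintf(stderr, "fault detected at %s:%d: %s\n", file, line, expr);
}

/* A fault-detection check: instead of panicking, hand control to the
 * recovery engine when the condition is violated. */
#define FS_ASSERT(expr)                                        \
    do {                                                       \
        if (!(expr))                                           \
            fs_recovery_entry(__FILE__, __LINE__, #expr);      \
    } while (0)

/* Example use inside a (hypothetical) file-system function. */
static int fs_read_block(int blk, char *buf)
{
    FS_ASSERT(blk >= 0);       /* function-level semantic check */
    FS_ASSERT(buf != NULL);    /* boundary check on an argument */
    /* ... actual read elided ... */
    (void)buf;
    return 0;
}
```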

3.3.2 Fault Anticipation

Anticipation in the context of file systems involves recording file-system state, along with the associated application- and OS-specific state. Anticipation is pure overhead, paid even when the system is behaving well; it should be minimized to the greatest extent possible while retaining the ability to recover.

Anticipation can be implemented at different layers and granularities depending on how much of the state needs to be restored after a fault has occurred. In terms of implementation location (i.e., layer), anticipation can be performed at the system-call layer [171, 172], the file-system layer [5, 82, 113, 183, 191, 198], the block-driver layer [130, 132], and the virtual file system or an equivalent layer [15, 94]. In terms of granularity, anticipation can be implemented at the instruction level [152, 188], the file-system request level [5, 82], and the epoch level (i.e., at the granularity of time) [132].

The correct layer and granularity of anticipation depend on the design and deployment of the file system of interest. It is important to keep in mind that, as with any system that improves reliability, there is a performance and space cost to enabling recovery when a fault occurs.

3.3.3 Fault Recovery

Fault recovery is likely the most complex component of a restartable framework. Correct recovery is critical for the proper operation of applications, the file system, and the OS after a fault. For stateful recovery, we need to restore the system to the state it was in just before the fault. The fault-recovery component is responsible for cleaning up residual state, restarting the file system and restoring its state, and restoring application-specific state. We now discuss the responsibilities of the fault-recovery component in detail.

Cleanup of Residual State

Cleanup of residual state is critical for correct recovery. Upon a failure, the file system, the operating system, or both could be in an inconsistent (or corrupt) state. As a result, one must first perform a cleanup before any repair is attempted on the failed system. The goal of the cleanup process is to eliminate any residual state and restore the operating system to a consistent state. Residual state is created by in-flight requests, which could generate dirty data, create or modify existing OS objects, acquire locks, and modify on-disk contents. If requests are prematurely terminated, the corresponding cleanup actions are never run and thus could leave the system in an inconsistent or corrupt state.

Cleanup of residual state includes the following actions. First, undo any actions (or effects) of in-flight requests that were executing at the time of the fault. Second, free any in-memory objects of the file system; examples of such objects are bitmaps, extents, and inodes. Finally, clean up any file-system objects that are stored in operating-system components; examples of such objects are files, directory entries, and locks.

Restore and Restart Failed File System

After a fault, a file system typically becomes inconsistent, unusable, or both. Cleanup typically discards all file-system state and undoes any residual state in the OS or other components. For stateful recovery, we must first restore this lost state. It is difficult to recreate the lost state in file systems; the difficulty arises because the file-system state is spread across many components and needs to be consistent in memory and on disk (see Section 3.2.2). To recreate the lost file-system state, we can leverage the state recorded by the fault-anticipation component, along with the state maintained in the other components that the file system interacts with.

One must then restart the file system to service pending and new requests. A file-system restart mainly consists of reinitializing the failed file system. Typically, this reinitialization is a mount (or an equivalent) operation, along with some repair of operating-system state.

Restore Application-specific State

The connection between application-specific state (such as file descriptors) and the actual file-system state is lost on a file-system restart. The cleanup and restore process discards and recreates file-system objects, but applications still have pointers to the old file-system objects, which could result in wrong (or faulty) behavior. To restore the application-specific state, the recovery component should reattach the application-specific metadata in the OS to the newly created file-system objects. The application-specific state typically includes open files (i.e., file descriptors), file positions, locks, and handles registered with notification daemons (such as inotify and dnotify).

3.4 Summary

In this chapter, we first discussed the fault space and described our failure model. Then, we described the restart process in file systems, the goals of our restartable framework, and the different ways to restart a file system. Finally, we described the three components of a restartable framework: fault detection, fault anticipation, and fault recovery.

Chapter 4

Restartable Kernel-level File Systems

“The best performance improvement is the transition from the nonworking state to the working state.” – John Ousterhout

File systems have traditionally been built inside the OS. Modern operating systems support a variety of file systems. For example, Linux supports 30 or so different block-based file systems. These file systems differ from each other in terms of the features, performance, and reliability guarantees they provide.

Kernel-level file systems come with both advantages and disadvantages. The primary advantage of using a kernel-level file system is that it eliminates the additional data copying and context-switching costs that are associated with user-level file systems. The main drawback of kernel-level file systems is that they spread their state across other OS components and run in the same address space as the OS.

The design of kernel-level file systems makes recovery difficult to implement. When a fault occurs inside a file system, it is hard to isolate the fault within the file system and restore the OS to a consistent state by undoing the actions of the file system. As a result, the recovery solution that exists today is to simply crash the entire OS and hope that the problem goes away on a reboot. Though this is practical, we believe it is not acceptable, as applications and other services running in the OS are forcefully killed, making them unavailable to users.

In this chapter, we explore the possibility of implementing a generic framework inside the OS to restart kernel-level file systems. Such a generic framework eliminates the need for an entire OS reboot and for a tailored solution to restart individual file systems on crashes.

Moreover, as performance is critical for file systems, we also explore the possibility of minimizing the reliability trade-offs while still maintaining performance characteristics similar to those of vanilla file systems.

Our solution to restarting file systems is Membrane, an operating-system framework to support lightweight, stateful recovery from file-system crashes. During normal operation, Membrane logs file-system operations, tracks file-system objects, and periodically performs lightweight checkpoints of file-system state. If a file-system crash occurs, Membrane parks pending requests, cleans up existing state, restarts the file system from the most recent checkpoint, and replays the in-memory operation log to restore the state of the file system. Once finished with recovery, Membrane begins to service application requests again; applications are unaware of the crash and restart, except for a small performance blip during recovery.

The rest of this chapter is organized as follows. Sections 4.1 and 4.2 present the design and implementation, respectively, of Membrane. Section 4.3 discusses the consequences of having Membrane in the operating system; finally, we evaluate Membrane in Section 4.4.

4.1 Design

Membrane is designed to transparently restart the affected kernel-level file system upon a crash, while applications and the rest of the OS continue to operate normally. A primary challenge in restarting file systems is to correctly manage the state associated with the file system (e.g., file descriptors, locks in the kernel, and in-memory inodes and directories). In this section, we first give an overview of our solution. We then present the three major pieces of the Membrane system: fault detection, fault anticipation, and recovery.

4.1.1 Overview

The main design challenge for Membrane is to recover file-system state in a lightweight, transparent fashion. At a high level, Membrane achieves this goal as follows. Once a fault has been detected in the file system, Membrane rolls back the state of the file system to a point in the past that it trusts: this trusted point is a consistent file-system image that was checkpointed to disk. This checkpoint serves to divide file-system operations into distinct epochs; no file-system operation spans multiple epochs. To bring the file system up to date, Membrane replays the file-system operations that occurred after the checkpoint.

Figure 4.1: Membrane Overview. The figure shows a file being created and written to on top of a restartable file system. Halfway through, Membrane creates a checkpoint. After the checkpoint, the application continues to write to the file; the first write succeeds (and returns success to the application), and the program issues another write, which leads to a file-system crash. Steps 1 to 4, denoted by gray circles, indicate the sequence of operations that Membrane performs to restart the file system after a crash.

In order to correctly interpret some operations, Membrane must also remember small amounts of application-visible state from before the checkpoint, such as file descriptors. Since the purpose of this replay is only to update file-system state, non-updating operations such as reads do not need to be replayed. Finally, to clean up the parts of the kernel that the buggy file system interacted with in the past, Membrane releases the kernel locks and frees the memory the file system allocated. All of these steps are transparent to applications and require no changes to file-system code. Applications and the rest of the OS are unaffected by the fault. Figure 4.1 gives an example of how Membrane works during normal file-system operation and upon a file-system crash.

From the figure, we can see that Membrane creates a checkpoint after the open and the first write operation (w0) on the file system. After the checkpoint, the second write operation (w1) also successfully completes, but the next write operation (w2) causes the file system to crash. Upon a crash, for Membrane to operate correctly, it must (1) unwind the currently executing write and park the calling thread, (2) clean up file-system objects (not shown), restore state from the previous checkpoint, and (3) replay the activity from the current epoch (i.e., write w1). Once file-system state is restored from the checkpoint and session state is restored, Membrane can (4) unpark the unwound calling thread and let it reissue the write, which (hopefully) will succeed this time. The application should thus remain unaware, perhaps only noticing that the timing of the third write (w2) was a little slow.

Thus, there are three major pieces in the Membrane design. First, fault-detection machinery enables Membrane to detect faults quickly. Second, fault-anticipation mechanisms record information about current file-system operations and partition operations into distinct epochs. Finally, the fault-recovery subsystem executes the recovery protocol to clean up and restart the failed file system.

4.1.2 Fault Detection

The main aim of fault detection within Membrane is to be lightweight while catching as many faults as possible. Membrane uses both hardware and software techniques to catch faults. The hardware support is simple: null pointers, divide-by-zero, and many other exceptions are caught by the hardware and routed to the Membrane recovery subsystem. More expensive hardware machinery, such as address-space-based isolation, is not used.

The software techniques leverage the many checks that already exist in file-system code. For example, file systems contain assertions as well as calls to panic() and similar functions. We take advantage of such internal integrity checking and transform calls that would crash the system into calls into our recovery engine. An approach such as that developed by SafeDrive [200] could be used to automatically place out-of-bounds pointer and other checks in the file-system code.

Membrane provides further software-based protection by adding extensive parameter checking on any call from the file system into the kernel proper. These lightweight boundary wrappers protect the calls between the file system and the kernel and help ensure that such routines are called with proper arguments, thus preventing the file system from corrupting kernel objects through bad arguments. Sophisticated tools (e.g., Ballista [102]) could be used to generate many of these wrappers automatically.
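The sketch below conveys the flavor of such a boundary wrapper: a thin shim that validates arguments before forwarding a call from the file system into the kernel proper. The names kernel_alloc_page(), membrane_fault(), and the limit used here are illustrative assumptions, not actual kernel or Membrane interfaces.

```c
#include <stddef.h>

#define MAX_ALLOC_ORDER 11   /* illustrative limit on allocation size */

/* Stand-ins for a real kernel routine and for Membrane's fault handler. */
static void *kernel_alloc_page(unsigned order) { (void)order; return NULL; /* stub */ }
static void  membrane_fault(const char *why)   { (void)why; /* would trigger recovery */ }

/* Wrapper interposed on the file-system-to-kernel call path: check the
 * arguments before letting the (possibly buggy) file system reach the
 * kernel allocator.  A bad argument becomes a detected fault rather
 * than silent kernel corruption. */
static void *wrapped_alloc_page(unsigned order)
{
    if (order > MAX_ALLOC_ORDER) {
        membrane_fault("bad allocation order from file system");
        return NULL;
    }
    return kernel_alloc_page(order);
}
```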

4.1.3 Fault Anticipation

In Membrane, there are two components of fault anticipation. First, the checkpointing subsystem partitions file-system operations into different epochs (or transactions) and ensures that the checkpointed image on disk represents a consistent state. Second, updates to data structures and other state are tracked with a set of in-memory logs and parallel stacks. The recovery subsystem (described below) uses these pieces in tandem to restart the file system after a failure.

Fault anticipation is difficult due to the complex interactions between the file system and the core kernel services. File-system operations use many core kernel services (e.g., locks, memory allocation), are heavily intertwined with major kernel subsystems (e.g., the page cache), and have application-visible state (e.g., file descriptors). In order to selectively restart the crashed file system and restore the operating-system state, careful checkpointing and state tracking are thus required. We now discuss our checkpointing and state-tracking mechanisms in detail.

Checkpointing

Checkpointing is critical because a checkpoint represents a point in time to which Membrane can safely roll back and initiate recovery. We define a checkpoint as a consistent boundary between epochs where no operation spans multiple epochs. By this definition, file-system state at a checkpoint is consistent, as no file-system operations are in flight.

We require such checkpoints for the following reason: file-system state is constantly modified by operations such as writes and deletes, and file systems lazily write back the modified state to improve performance. As a result, at any point in time, file-system state is comprised of (i) dirty pages (in memory), (ii) in-memory copies of its metadata objects (that have not been copied to their on-disk pages), and (iii) data on the disk. Thus, the file system is in an inconsistent state until all dirty pages and metadata objects are quiesced to the disk. For correct operation, one needs to ensure that the file system is consistent at the beginning of the mount process (or the recovery process, in the case of Membrane).

Modern file systems take a number of different approaches to the consistency-management problem: some group updates into transactions (as in journaling file systems [73, 145, 170, 179]); others define clear consistency intervals and create snapshots (as in shadow-paging file systems [23, 82, 149]). All such mechanisms periodically create checkpoints of the file system in anticipation of a power failure or OS crash. Older file systems do not impose any ordering on updates at all (as in Linux ext2 [178] and many simpler file systems). In all cases, Membrane must operate correctly and efficiently.

The main challenge with checkpointing is to accomplish it in a lightweight and non-intrusive manner. For modern file systems, Membrane can leverage the built-in journaling (or snapshotting) mechanism to periodically checkpoint file-system state, as these mechanisms atomically write back data modified within a checkpoint to the disk. To track file-system-level checkpoints, Membrane only requires that these file systems explicitly notify it of the beginning and end of the file-system transaction (or snapshot), so that it can discard the log records from before the checkpoint. Upon a file-system crash, Membrane uses the file system’s recovery mechanism to go back to the last known checkpoint and initiate the recovery process. Note that the recovery process uses on-disk data and does not depend on the in-memory state of the file system.

For file systems that do not support any consistency-management scheme (e.g., ext2), Membrane provides a generic checkpointing mechanism at the VFS layer. Membrane’s checkpointing mechanism groups several file-system operations into a single transaction and commits it atomically to the disk. A transaction is created by temporarily preventing new operations from entering the file system for a small duration, during which dirty metadata objects are copied back to their on-disk pages and all dirty pages are marked copy-on-write. Through copy-on-write support for file-system pages, Membrane improves performance by allowing file-system operations to run concurrently with the checkpoint of the previous epoch. Membrane associates each page with a checkpoint (or epoch) number to prevent pages dirtied in the current epoch from reaching the disk.
It is important to note that the checkpointing mechanism in Membrane is implemented at the VFS layer; as a result, it can be leveraged by all file systems with little or no modification.
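A simplified model of this epoch-based, copy-on-write checkpointing is sketched below: each dirty page is tagged with the epoch in which it was dirtied, and at a checkpoint the closing epoch's pages are marked copy-on-write so the next epoch cannot modify them while they are written back. The page structure and helper functions are illustrative, not Linux's actual page-cache machinery.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative per-page bookkeeping. */
struct page_info {
    unsigned long epoch;   /* epoch in which this page was dirtied */
    bool dirty;
    bool cow;              /* further writes must copy the page    */
};

static unsigned long current_epoch = 0;

static void mark_cow(struct page_info *p)  { p->cow = true; }
static void writeback(struct page_info *p) { p->dirty = false; p->cow = false; }

/* Take a checkpoint: freeze the pages of the finishing epoch and let new
 * operations proceed against fresh copies in the next epoch. */
static void checkpoint(struct page_info *pages, size_t n)
{
    unsigned long closing = current_epoch;
    current_epoch++;                        /* new operations join the next epoch */

    for (size_t i = 0; i < n; i++)
        if (pages[i].dirty && pages[i].epoch == closing)
            mark_cow(&pages[i]);            /* protect the checkpointed image */

    for (size_t i = 0; i < n; i++)
        if (pages[i].dirty && pages[i].epoch == closing)
            writeback(&pages[i]);           /* write back the frozen pages
                                             * (done in the background in Membrane) */
}
```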

Tracking State with Logs and Stacks

Membrane must track the changes to file-system state that transpired after the last checkpoint. This tracking is accomplished with five different types of logs or stacks that track: file-system operations, application-visible sessions, memory allocations, locks, and execution state.

First, an in-memory operation log (op-log) records all state-modifying file-system operations (such as open) that have taken place during the epoch or are currently in progress. The op-log records enough information about requests to enable full recovery from a given checkpoint.

Membrane also requires a small session log (s-log). The s-log tracks which files are open at the beginning of an epoch and the current position of the file pointer. The op-log is not sufficient for this task, as a file may have been opened in a previous epoch; thus, by reading the op-log alone, one can only observe reads and writes to various file descriptors without knowing which files such operations refer to.

Third, an in-memory malloc table (m-table) tracks heap-allocated memory. Upon failure, the m-table can be consulted to determine which blocks should be freed. If failure is infrequent, an implementation could ignore memory left allocated by a failed file system; memory may leak slowly enough not to impact overall system reliability.

Fourth, lock acquires and releases are tracked by the lock stack (l-stack). When a lock is acquired by a thread executing a file-system operation, information about that lock is pushed onto a per-thread l-stack; when the lock is released, the information is popped off. Unlike memory allocation, the exact order of lock acquires and releases is critical; by maintaining the lock acquisitions in LIFO order, recovery can release them in the proper order as required. Also note that only locks that are global kernel locks (and hence survive file-system crashes) need to be tracked in such a manner; private locks internal to a file system will be cleaned up during recovery and therefore require no such tracking.

Finally, an unwind stack (u-stack) is used to track the execution of code in the file system and kernel. By pushing register state onto the per-thread u-stack when the file system is first called on kernel-to-file-system calls, Membrane records sufficient information to unwind threads after a failure has been detected in order to enable restart.

Note that the m-table, l-stack, and u-stack are compensatory [189]; they are used to compensate for actions that have already taken place and must be undone before proceeding with restart. On the other hand, both the op-log and s-log are restorative in nature; they are used by recovery to restore the in-memory state of the file system before continuing execution after restart.
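To give a sense of what these structures contain, the sketch below defines illustrative entry formats for the op-log, s-log, l-stack, and u-stack; the layouts are assumptions made for exposition and are not Membrane's actual definitions.

```c
#include <stdint.h>
#include <sys/types.h>

/* op-log: one record per state-modifying operation in the current epoch. */
struct op_log_entry {
    unsigned long epoch;
    int      op;               /* an operation code: create, write, unlink, ... */
    char     path[256];
    off_t    offset;           /* for writes */
    size_t   length;
    /* the payload (or a reference to it) would follow in a real system */
};

/* s-log: files already open at the start of the epoch, with positions. */
struct s_log_entry {
    pid_t pid;
    int   fd;
    char  path[256];
    off_t position;
};

/* l-stack: per-thread record of global kernel locks held, in LIFO order. */
struct l_stack_entry {
    void *lock;                /* pointer to the kernel lock object */
};

/* u-stack: register state saved on each kernel-to-file-system crossing,
 * enough to unwind the thread back to the boundary after a fault. */
struct u_stack_entry {
    void *saved_sp;
    void *saved_pc;
    /* callee-saved registers would also be recorded */
};
```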

4.1.4 Fault Recovery

The fault-recovery subsystem is the largest subsystem within Membrane. Once a fault is detected, control is transferred to the recovery subsystem, which executes the recovery protocol. This protocol has the following phases:

• Halt execution and park threads: Membrane first halts the execution of threads within the file system. Such “in-flight” threads are prevented from further execution within the file system in order both to prevent further damage and to enable recovery. Late-arriving threads (i.e., those that try to enter the file system after the crash takes place) are parked as well.

• Unwind in-flight threads: The crashed thread and any other in-flight threads are unwound and brought back to the point where they are about to enter the file system; Membrane uses the u-stack to restore register values before each call into the file-system code. During the unwind, any held global locks recorded on the l-stack are released.

• Commit dirty pages from previous epoch to stable storage: Membrane moves the system to a clean starting point at the beginning of an epoch; all dirty pages from the previous epoch are forcefully committed to disk. This action leaves the on-disk file system in a consistent state. Note that this step is not needed for file systems that have their own crash-consistency mechanism.

• “Unmount” the file system: Membrane consults the m-table and frees all in-memory objects allocated by the file system. The items in the file-system buffer cache (e.g., inodes and directory entries) are also freed. Conceptually, the pages from this file system in the page cache are also released, mimicking an unmount operation.

• “Remount” the file system: In this phase, Membrane reads the super block of the file system from stable storage and performs all other necessary work to reattach the FS to the running system.

• Roll forward: Membrane uses the s-log to restore the sessions of active processes to the state they were in at the last checkpoint. It then processes the op-log, replaying previous operations as needed and restoring the active state of the file system as it was before the crash. Note that Membrane uses the regular VFS interface to restore sessions and to replay logs; hence, Membrane does not require any explicit support from file systems.

• Restart execution: Finally, Membrane wakes all parked threads. Those that were in-flight at the time of the crash begin execution as if they had not entered the file system; those that arrived after the crash are allowed to enter the file system for the first time. Both remain oblivious of the crash.

4.2 Implementation

We now present the implementation of Membrane. We first present each of the main components of Membrane and then describe the operating system (Linux) changes. Much of the functionality of Membrane is encapsulated within two components: the checkpoint manager (CPM) and the recovery manager (RM). Each of these subsystems is implemented as a background thread and is needed during anticipation (CPM) and recovery (RM). Beyond these threads, Membrane also makes heavy use of interposition to track the state of various in-memory objects and to provide the rest of its functionality. We ran Membrane with the ext2, VFAT, and ext3 file systems.

In implementing the functionality described above, Membrane employs three key techniques to reduce overheads and make lightweight restart of a stateful file system feasible. The techniques are (i) page stealing, for low-cost operation logging; (ii) COW-based checkpointing, for fast in-memory partitioning of pages across epochs using copy-on-write techniques for file systems that do not support transactions; and (iii) control-flow capture and the skip/trust unwind protocol, to halt in-flight threads and properly unwind in-flight execution.

4.2.1 Fault Detection

There are numerous fault detectors within Membrane, each of which, when triggered, immediately begins the recovery protocol. We describe the detectors Membrane currently uses; because they are lightweight, we imagine more will be added over time, particularly as file-system developers learn to trust the restart infrastructure.

Hardware-based Detectors

The hardware provides the first line of fault detection. In our implementation inside Linux on the x86 (64-bit) architecture, we track the following runtime exceptions: null-pointer exception, invalid operation, general protection fault, alignment fault, divide error (divide by zero), segment not present, and stack segment fault. These exception conditions are detected by the processor; software fault handlers, when run, inspect system state to determine whether the fault was caused by code executing in the file-system module (i.e., by examining the faulting instruction pointer). Note that the kernel already treats these runtime exceptions as kernel errors and triggers a panic, as it does not know how to handle them. We only check whether these exceptions were generated in the context of the restartable file system and, if so, initiate recovery, thus preventing a kernel panic.
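As a rough sketch (with hypothetical helper names and bookkeeping), the check added to the exception path boils down to testing whether the faulting instruction pointer lies within the restartable file system's module text; the exact pt_regs field and hook points differ across kernel versions.

    /* Sketch only: hypothetical helpers, not the kernel's real hook names. */
    #include <linux/kernel.h>
    #include <asm/ptrace.h>

    static unsigned long membrane_text_start;   /* recorded at module load */
    static unsigned long membrane_text_end;

    void membrane_begin_recovery(struct pt_regs *regs);

    static int fault_in_restartable_fs(unsigned long ip)
    {
        return ip >= membrane_text_start && ip < membrane_text_end;
    }

    void membrane_handle_exception(struct pt_regs *regs)
    {
        if (fault_in_restartable_fs(regs->ip))
            membrane_begin_recovery(regs);        /* fail-stop the FS only */
        else
            panic("unhandled kernel exception");  /* default kernel behavior */
    }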

Software-based Detectors

A large number of explicit error checks are extant within the file-system code base; we interpose on these macros and procedures to detect a broader class of semantically meaningful faults. Specifically, we redefine macros such as BUG(), BUG_ON(), panic(), and assert() so that the file system calls our version of said routines. These routines are commonly used by kernel programmers when some unexpected event occurs and the code cannot properly handle the exception.

File System    assert()   BUG()   panic()
xfs              2119       18       43
ubifs             369       36        2
ocfs2             261      531        8
gfs2              156       60        0
jbd               120        0        0
jbd2              119        0        0
afs               106       38        0
jfs                91       15        6
ext4               42      182       12
ext3               16        0       11
reiserfs            1      109       93
jffs2               1       86        0
ext2                1       10        6
ntfs                0      288        2
nfs                 0       54        0
fat                 0       10       16

Table 4.1: Software-based Fault Detectors. The table depicts how many calls each file system makes to assert(), BUG(), and panic() routines. The data was gathered simply by searching for various strings in the source code. A range of file systems and the ext3 journaling devices (jbd and jbd2) are included in the micro-study. The study was performed on the latest stable Linux release (2.6.26.7).

For example, Linux ext2 code that searches through directories often calls BUG() if directory contents are not as expected; see ext2_add_link(), where a failed scan through the directory leads to such a call. Other file systems, such as reiserfs, routinely call panic() when an unanticipated I/O subsystem failure occurs [138]. Table 4.1 presents a summary of the calls present in existing Linux file systems.

In addition to those checks within file systems, we have added a set of checks across the file-system/kernel boundary to help prevent fault propagation into the kernel proper. Overall, we have added roughly 100 checks across various key points in the generic file-system and memory-management modules as well as in twenty or so header files. As these checks are low-cost and relatively easy to add, we believe that operating-system developers will continue to “harden” the file-system/kernel interface as Membrane gets integrated into commodity operating systems.
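A sketch of how this interposition might look: a header included only when building a restartable file system redefines the failure macros to call a hypothetical membrane_fault() entry point instead of bringing down the kernel.

    /* Hypothetical header for restartable file systems; membrane_fault()
     * is an assumed entry point into Membrane's fail-stop recovery path. */
    #include <linux/compiler.h>

    void membrane_fault(const char *file, int line);

    #undef BUG
    #define BUG()           membrane_fault(__FILE__, __LINE__)

    #undef BUG_ON
    #define BUG_ON(cond)    do { if (unlikely(cond)) BUG(); } while (0)

    #undef panic
    #define panic(fmt, ...) membrane_fault(__FILE__, __LINE__)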

4.2.2 Fault Anticipation

We now describe the fault-anticipation support within the current Membrane implementation. Anticipation consists of the following techniques: page stealing, state tracking, and COW-based checkpointing. We begin by presenting our approach to reducing the cost of operation logging via a technique we refer to as page stealing.

Figure 4.2: Page Stealing. The figure depicts the op-log both with and without page stealing. Without page stealing (left side of the figure), user data quickly fills the log, thus exacting harsh penalties in both time and space overheads. With page stealing (right), only a reference to the in-memory page cache is recorded with each write; further, only the latest such entry is needed to replay the op-log successfully.

Low-Cost Op-Logging via Page Stealing

Membrane interposes at the VFS layer in order to record to the op-log the necessary information about file-system operations during an epoch. Thus, for any restartable file system that is mounted, the VFS layer records an entry for each operation that updates the file-system state in some way.

One key challenge of logging is to minimize the amount of data logged in order to keep interposition costs low. A naive implementation (including our first attempt) might log all state-updating operations and their parameters; unfortunately, this approach has a high cost due to the overhead of logging write operations. For each write to the file system, Membrane has to not only record that a write took place but also log the data to the op-log, an expensive operation in both time and space.

Membrane avoids the need to log this data through a novel page stealing mechanism. Because dirty pages are held in memory before checkpointing, Membrane is assured that the most recent copy of the data is already in memory (in the page cache). Thus, when Membrane needs to replay the write, it steals the page from the cache (before it is removed from the cache by recovery) and writes the stolen page to disk. In this way, Membrane avoids the costly logging of user data. Figure 4.2 shows how page stealing helps reduce the size of the op-log.

When two writes to the same block have taken place, note that only the last write needs to be replayed; earlier writes simply update the file position correctly. This strategy works because reads are not replayed (indeed, they have already completed); hence, only the current state of the file system, as represented by the last checkpoint and the current op-log and s-log, must be reconstructed.
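The following sketch illustrates the idea: a logged write carries only a reference (inode and page index), and replay fetches the newest data from the page cache. The surrounding structure and the VFS re-write helper are assumptions; find_get_page() and put_page() are standard kernel page-cache calls.

    /* Sketch of page stealing: log a reference, steal the data at replay. */
    #include <linux/fs.h>
    #include <linux/pagemap.h>

    struct oplog_write {
        struct inode *inode;
        pgoff_t index;        /* page index within the file */
        loff_t pos;
        size_t len;
    };

    /* Placeholder: re-issue a write through the VFS from cached data. */
    int membrane_vfs_rewrite(struct inode *inode, struct page *page,
                             loff_t pos, size_t len);

    static int replay_write(struct oplog_write *w)
    {
        struct page *page = find_get_page(w->inode->i_mapping, w->index);

        if (!page)
            return -ENOENT;   /* nothing left in the cache to replay */

        /* Re-issue the write using the cached ("stolen") page contents. */
        membrane_vfs_rewrite(w->inode, page, w->pos, w->len);
        put_page(page);
        return 0;
    }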

Other Logging and State Tracking

Membrane also interposes at the VFS layer to track all necessary session state in the s-log. There is little information to track here: simply which files are open (with their pathnames) and the current file position of each file.

Membrane also needs to track memory allocations performed by a restartable file system. We add a new allocation flag, GFP_RESTARTABLE, and provide a new header file that file-system code includes to append GFP_RESTARTABLE to all memory-allocation calls in Membrane. This enables the memory-allocation module in the kernel to transparently record the necessary per-file-system information into the m-table and thus prepare for recovery.

Tracking lock acquisitions is also straightforward. As we mentioned earlier, locks that are private to the file system will be ignored during recovery, and hence need not be tracked; only global locks need to be monitored. Thus, when a thread is running in the file system, the instrumented lock function saves the lock information in the thread's private l-stack for the following locks: the global kernel lock, the super-block lock, and the inode lock.

Finally, Membrane must track register state across certain code boundaries to unwind threads properly. To do so, Membrane wraps all calls from the kernel into the file system; these wrappers push and pop register state, return addresses, and return values onto and off of the u-stack.
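One lightweight way to apply the flag is a header that wraps the standard allocators, as sketched below; the flag value and the macro-wrapping approach are assumptions (a function-like macro that expands to a call of the same name is not re-expanded, so the wrapping is legal C).

    /* Hypothetical header included by restartable file-system code so that
     * every allocation is tagged and recorded in the m-table. */
    #include <linux/slab.h>

    #define GFP_RESTARTABLE  0x800000u   /* assumed spare gfp bit */

    #define kmalloc(size, flags)  kmalloc((size), (flags) | GFP_RESTARTABLE)
    #define kzalloc(size, flags)  kzalloc((size), (flags) | GFP_RESTARTABLE)
    #define kmem_cache_alloc(cachep, flags) \
            kmem_cache_alloc((cachep), (flags) | GFP_RESTARTABLE)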

COW-based Checkpointing

Our goal for checkpointing was to find a solution that is lightweight and works correctly despite the lack of transactional machinery in file systems such as Linux ext2, many UFS implementations, and various FAT file systems; these file systems do not include journaling or shadow paging to naturally partition file-system updates into transactions.

Figure 4.3: COW-based Checkpointing. The picture shows what happens during COW-based checkpointing. At time=0, an application writes to block 0 of a file and fills it with the contents “A”. At time=1, Membrane performs a checkpoint, which simply marks the block copy-on-write. Thus, Epoch 0 is over and a new epoch begins. At time=2, block 0 is over-written with the new contents “B”; the system catches this overwrite with the COW machinery and makes a new in-memory page for it. At time=3, Membrane decides to flush the previous epoch's dirty pages to disk, and thus commits block 0 (with “A” in it) to disk.

One could implement a checkpoint using the following strawman protocol. First, during an epoch, prevent dirty pages from being flushed to disk. Second, at the end of an epoch, checkpoint file-system state by first halting file-system activity and then forcing all dirty pages to disk. At this point, the on-disk state would be consistent. If a file-system failure occurred during the next epoch, Membrane could roll back the file system to the beginning of the epoch, replay logged operations, and thus recover the file system.

The obvious problem with the strawman is performance: forcing pages to disk during checkpointing makes checkpointing slow, which slows applications. Further, update traffic is bunched together and must happen during the checkpoint, instead of being spread out over time; as is well known, forcing pages to disk can reduce I/O performance [117].

Our lightweight checkpointing solution instead takes advantage of the page-table support provided by modern hardware to partition pages into different epochs. Specifically, by using the protection features provided by the page table, the CPM implements a copy-on-write-based checkpoint that partitions pages into different epochs. This COW-based checkpoint is simply a lightweight way for Membrane to partition updates to disk into different epochs. Figure 4.3 shows an example of how COW-based checkpointing works.

We now present the details of the checkpoint implementation. First, at the time of a checkpoint, the checkpoint manager (CPM) thread wakes and indicates to the session manager (SM) that it intends to checkpoint. The SM parks new VFS operations and waits for in-flight operations to complete; when finished, the SM wakes the CPM so that it can proceed.

The CPM then walks the lists of dirty objects in the file system, starting at the superblock, and finds the dirty pages of the file system. The CPM marks these kernel pages copy-on-write; further updates to such a page will induce a copy-on-write fault and thus direct subsequent writes to a new copy of the page. Note that copy-on-write machinery is present in many systems to support (among other things) fast address-space copying during process creation. This machinery is either implemented within a particular subsystem (e.g., file systems such as [133] and WAFL [82] manually create and track their COW pages) or built into the kernel for application pages. To our knowledge, copy-on-write machinery is not available for kernel pages. Hence, we explicitly added support for copy-on-write machinery for kernel pages in Membrane, thereby avoiding extensive changes to file systems to support COW machinery.

The CPM then allows these pages to be written to disk (by tracking a checkpoint number associated with the page), and the background I/O daemon (pdflush) is free to write COW pages to disk at its leisure during the next epoch. Checkpointing thus groups the dirty pages from the previous epoch and allows only said modifications to be written to disk during the next epoch; newly dirtied pages are held in memory until the complete flush of the previous epoch's dirty pages.

There are a number of different policies that can be used to decide when to checkpoint. An ideal policy would likely consider a number of factors, including the time since the last checkpoint (to minimize recovery time), the number of dirty blocks (to keep memory pressure low), and current levels of CPU and I/O utilization (to perform checkpointing during relatively idle times). Our current policy is simpler, and just uses time (5 secs) and a dirty-block threshold (40MB) to decide when to checkpoint. Checkpoints are also initiated when an application forces data to disk.
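The trigger policy itself is a few lines; the sketch below (with assumed variable and helper names) checkpoints after five seconds, after 40MB of dirty blocks, or when an application forces data to disk, matching the defaults described above.

    /* Sketch of the CPM's checkpoint-trigger policy; names are assumed. */
    #include <linux/jiffies.h>

    #define CKPT_INTERVAL   (5 * HZ)       /* 5 seconds */
    #define CKPT_DIRTY_MAX  (40UL << 20)   /* 40 MB of dirty blocks */

    static unsigned long last_ckpt;               /* jiffies of last checkpoint */
    static unsigned long dirty_bytes_this_epoch;
    static int app_forced_sync;                   /* e.g., an fsync() arrived */

    void membrane_checkpoint(void);               /* assumed entry point */

    static void cpm_maybe_checkpoint(void)
    {
        if (time_after(jiffies, last_ckpt + CKPT_INTERVAL) ||
            dirty_bytes_this_epoch >= CKPT_DIRTY_MAX ||
            app_forced_sync) {
            /* Park new VFS operations, wait for in-flight ones, then mark
             * the epoch's dirty pages copy-on-write (see Figure 4.3). */
            membrane_checkpoint();
            last_ckpt = jiffies;
            dirty_bytes_this_epoch = 0;
            app_forced_sync = 0;
        }
    }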

4.2.3 Fault Recovery

We now describe the last piece of our implementation, which performs fault recovery. Most of the protocol is implemented by the recovery manager (RM), which runs as a separate thread. The most intricate part of recovery is how Membrane gains control of threads after a fault occurs in the file system and the unwind protocol that takes place as a result. We describe this component of recovery first.

Gaining Control with Control-Flow Capture

The first problem encountered by recovery is how to gain control of threads already executing within the file system. The fault that occurred (in a given thread) may have left the file system in a corrupt or unusable state; thus, we would like to stop all other threads executing in the file system as quickly as possible to avoid any further execution within the now-untrusted file system.

Membrane, through the RM, achieves this goal by immediately marking all code pages of the file system as non-executable and thus ensnaring other threads with a technique that we refer to as control-flow capture. When a thread that is already within the file system next executes an instruction, a trap is generated by the hardware; Membrane handles the trap and then takes appropriate action to unwind the execution of the thread so that recovery can proceed after all these threads have been unwound. File systems in Membrane are inserted as loadable kernel modules; this ensures that the file-system code resides in its own 4KB pages and is not part of a large kernel page that could potentially be shared among different kernel modules. Hence, it is straightforward to transparently identify the code pages of file systems.
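On current x86 Linux kernels the page-protection flip can be expressed with set_memory_nx() and set_memory_x(); the bookkeeping around them below is an assumption, and the 2.6.15-era prototype would have used a lower-level interface.

    /* Sketch of control-flow capture: make the module's code pages
     * non-executable so threads still inside the FS trap on their next
     * instruction. The bookkeeping variables are assumed. */
    #include <asm/set_memory.h>

    static unsigned long fs_text_start;   /* recorded when the module loads */
    static int fs_text_pages;

    void membrane_capture_control_flow(void)
    {
        set_memory_nx(fs_text_start, fs_text_pages);
    }

    void membrane_release_control_flow(void)   /* after recovery completes */
    {
        set_memory_x(fs_text_start, fs_text_pages);
    }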

Intertwined Execution and the Skip/Trust Unwind Protocol

Unwinding a thread is challenging, as the file system interacts with the kernel in a tightly-coupled fashion. Thus, it is not uncommon for the file system to call into the kernel, which in turn calls into the file system, and so forth. We call such execution paths intertwined.

Intertwined code puts Membrane in a difficult position. Ideally, Membrane would like to unwind the execution of the thread to the beginning of the first kernel-to-file-system call as described above. However, the fact that (non-file-system) kernel code has run complicates the unwinding; kernel state will not be cleaned up during recovery, and thus any state changes made by the kernel must be undone before restart. For example, assume that the file-system code is executing (e.g., in function f1()) and calls into the kernel (function k1()); the kernel then updates kernel state in some way (e.g., allocates memory or grabs locks) and then calls back into the file system (function f2()); finally, f2() returns to k1(), which returns to f1(), which completes. The tricky case arises when f2() crashes; if we simply unwound execution naively, the state modifications made while in the kernel would be left intact, and the kernel could quickly become unusable.

To overcome this challenge, Membrane employs a careful skip/trust unwind protocol. The protocol skips over file-system code but trusts the kernel code to behave reasonably in response to a failure and thus manage kernel state correctly. Membrane coerces such behavior by carefully arranging the return value on the stack, mimicking an error return from the failed file-system routine to the kernel; the kernel code is then allowed to run and clean up as it sees fit. We found that the kernel did a good job of checking return values from file-system functions and of handling error conditions. In places where it did not (12 such instances), we explicitly added code to do the required check.

In the example above, when the fault is detected in f2(), Membrane places an error code in the appropriate location on the stack and returns control immediately to k1(). This trusted kernel code is then allowed to execute, hopefully freeing any resources that it no longer needs (e.g., memory, locks) before returning control to f1(). When the return to f1() is attempted, the control-flow capture machinery again kicks into place and enables Membrane to unwind the remainder of the stack. A real example from Linux is shown in Figure 4.4.

Throughout this process, the u-stack is used to capture the necessary state to enable Membrane to unwind properly. Thus, both when the file system is first entered as well as any time the kernel calls into the file system, wrapper functions push register state onto the u-stack; the values are subsequently popped off on return, or used to skip back through the stack during unwind.
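The effect of the unwind can be illustrated in user space with setjmp/longjmp: state saved at the trusted/untrusted boundary lets a fault deep inside the untrusted code surface as an ordinary error return at that boundary. This is only an analogy for the kernel-level u-stack machinery, not the actual implementation.

    /* User-space analogy of skip/trust unwinding. */
    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf boundary;            /* analog of one u-stack frame */

    static int fs_get_block(void)       /* untrusted code that "crashes" */
    {
        longjmp(boundary, 1);           /* analog of Membrane catching a fault */
        return 0;                       /* never reached */
    }

    static int kernel_prepare_write(void)   /* trusted kernel code */
    {
        if (setjmp(boundary) != 0)
            return -5;                  /* mimicked error return (e.g., -EIO) */
        return fs_get_block();
    }

    int main(void)
    {
        printf("prepare_write returned %d\n", kernel_prepare_write());
        return 0;
    }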

Correctness of Recovery

We now discuss the correctness of our recovery mechanism. Membrane throws away the corrupted in-memory state of the file system immediately after the crash. Since faults are fail-stop in Membrane, the control-flow capture mechanism ensures that the on-disk data is never corrupted after a fault. Membrane also prevents any new operation from being issued to the file system while recovery is being performed. The file-system state is then reverted to the last known checkpoint (which is guaranteed to be consistent). Next, successfully completed operations recorded in the op-log are replayed to restore the file-system state to that at the time of the crash. Finally, the unwound processes are allowed to execute again.

Figure 4.4: The Skip/Trust Unwind Protocol. The figure depicts the call path from the open() system call through the ext2 file system. The first sequence of calls (through vfs_create()) is in the generic (trusted) kernel; then the (untrusted) ext2 routines are called; then ext2 calls back into the kernel to prepare to write a page, which in turn may call back into ext2 to get a block to write to. Assume a fault occurs at this last level in the stack; Membrane catches the fault and skips back to the last trusted kernel routine, mimicking a failed call to ext2_get_block(); this routine then runs its normal failure recovery (marked by the circled “3” in the diagram), and then tries to return again. Membrane's control-flow capture machinery catches this and then skips back all the way to the last trusted kernel code (vfs_create()), thus mimicking a failed call to ext2_create(). The rest of the code unwinds with Membrane's interference, executing various cleanup code along the way (as indicated by the circled 2 and 1).

Non-determinism could arise while replaying the completed operations. The order recorded in the op-log need not be the same as the order executed by the scheduler. This new execution order could potentially pose a problem while replaying completed write operations, as applications could have observed the modified state (via read) before the crash.

Components     No Checkpoint            With Checkpoint
               Added    Modified        Added    Modified
FS              1929       30            2979       64
MM               779        5             867       15
Arch               0        0             733        4
Headers          522        6             552        6
Module           238        0             238        0
Total           3468       41            5369       89

Table 4.2: Implementation Complexity. The table presents the code changes required to transform a vanilla Linux 2.6.15 x86_64 kernel to support restartable file systems. Most of the modified lines indicate places where the vanilla kernel did not check or handle errors propagated by the file system. As our changes were non-intrusive in nature, none of the existing code was removed from the kernel.

On the other hand, operations that modify the file-system state (such as create, unlink, etc.) would not be a problem, as conflicting operations are resolved by the file system through locking.

Membrane avoids non-deterministic replay of completed write operations through page stealing. While replaying completed operations, Membrane reads the final version of the page from the page cache and re-executes the write operation by copying the data from it. As a result, replayed write operations will end up with the same final version no matter what order they are executed in. Lastly, as the in-flight operations have not returned back to the application, Membrane allows the scheduler to execute them in arbitrary order.

4.2.4 Implementation Statistics

Our prototype is implemented in the Linux 2.6.15 kernel. Table 4.2 shows the code changes required to transform a vanilla 2.6.15 kernel into an operating system that implements Membrane. We now briefly describe the code changes in the different OS components that support Membrane.

The FS changes include support for op-logging, COW-based checkpointing, and the skip/trust unwind protocol. The op-logging support mainly consists of logging input arguments, return values, and lock acquisitions or releases of file-system requests. The COW-based checkpointing changes mainly consist of support for parking (or releasing) threads executing file-system operations during (or after) a checkpoint and maintaining the count of requests that are currently executing inside the file system of interest. The skip/trust unwind support consists of calls to macros that record the current operating-system state before a request makes a transition from the OS into the file system.

The MM changes include support for the COW-based checkpointing and page stealing mechanisms. The COW-based checkpointing changes consist of support to manage attributes of COW pages in the operating system. The page stealing support consists of tracking and freeing pages that belong to the file system before a crash. Finally, the arch code changes correspond to COW support for kernel pages, and the module changes include support to locate file-system code pages and methods to change the page protection bits.

4.3 Discussion

The major negative of the Membrane approach is that, without address-space-based protection, file-system faults may corrupt other components of the system. If the file system corrupts other kernel data or code, or data that resides on disk, Membrane will not be able to recover the system. Thus, an important factor in Membrane's success will be minimizing the latency between when a fault occurs and when it is detected.

An assumption we make is that kernel code is trusted to work properly, even when the file-system code fails and returns an error. We found that this is true in most of the cases across the kernel-proper code. But in twenty or so places, we found that the kernel proper did not check the return value from the file system, and additional code was added to clean up the kernel state and propagate the error back to the callee.

A potential limitation of our implementation is that, in some scenarios, a file-system restart can be visible to applications. For instance, when a file is created, the file system assigns it a specific inode number, which an application may query (e.g., rsync and similar tools may use this low-level number for backup and archival purposes). If a crash occurs before the end of the epoch, Membrane will replay the file create; during replay, the file system may assign a different inode number to the file (based on in-memory state). In this case, the application would possess what it thinks is the inode number of the file, but which may in fact be either unallocated or allocated to a different file. Thus, to guarantee that the user-visible inode number is valid, an application must sync the file-system state after the create operation.

On the brighter side, we believe Membrane will encourage two positive fault-detection behaviors among file-system developers. First, we believe that quick-fix bug patching will become more prevalent. Imagine a scenario where an important customer has a workload that is causing the file system to occasionally corrupt data, thus reducing the reliability of the system. After some diagnosis, the development team discovers the location of the bug in the code, but unfortunately there is no easy fix. With the Membrane infrastructure, the developers may be able to transform the corruption into a fail-stop crash. By installing a quick patch that crashes the code instead of allowing further corruption to continue, the deployment can operate correctly while a longer-term fix is developed. Even more interestingly, if such problems can be detected, but would require extensive code restructuring to fix, then a patch may be the best possible permanent solution. As Tom West said: not all problems worth solving are worth solving well [98].

Second, with Membrane, file-system developers will see significant benefits to putting integrity checks into their code. Some of these lightweight checks could be automated (as was nicely done by SafeDrive [200]), but we believe that developers will be able to place much richer checks, as they have deep knowledge about expectations at various locations. For example, developers understand the exact meaning of a directory entry and can signal a problem if one has gone awry; automating such a check is a great deal more complicated [43]. The motivation to check for violations is low in current file systems since there is little recourse when a problem is detected. The ability to recover from the problem in Membrane gives greater motivation.

4.4 Evaluation

We now evaluate Membrane in the following three categories: transparency, performance, and generality. All experiments were performed on a machine with a 2.2 GHz Opteron processor, two 80GB WDC disks, and 2GB of memory running Linux 2.6.15. We evaluated Membrane using ext2, VFAT, and ext3. The ext3 file system was mounted in data journaling mode in all experiments.

4.4.1 Generality

We chose ext2, VFAT, and ext3 to evaluate the generality of our approach. In other words, we want to understand the effort involved in porting existing file systems to work with Membrane. The ext2 and VFAT file systems were chosen for their lack of crash-consistency machinery and for their completely different on-disk layouts. The ext3 file system was selected for its journaling machinery, which provides better crash-consistency guarantees than ext2.

Table 4.3 shows the code changes required in each file system. From the table, we can see that the file-system-specific changes required to work with Membrane are minimal. For ext3, we also added 4 lines of code to JBD to notify the checkpoint manager of the beginning and the end of transactions, so that it could discard the operation logs of committed transactions.

File System    Added   Modified   Deleted
ext2             4        0          0
VFAT             5        0          0
ext3             1        0          0
JBD              4        0          0

Table 4.3: File-system code changes. The table presents the code changes required to transform the ext2, VFAT, and ext3 file systems in the Linux 2.6.15 kernel into their restartable counterparts.

All of the additions were straightforward, including adding a new header file to propagate the GFP_RESTARTABLE flag and code to write back the free block/inode/cluster count when the write_super method of the file system was called. No modification (or deletion) of existing code was required in any of the file systems.

In summary, Membrane represents a generic approach to achieving file-system restartability; existing file systems can work with Membrane with the minimal change of adding a few lines of code.
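Conceptually, the JBD additions amount to a pair of notification calls around transaction commit, as in the sketch below; the hook names are hypothetical and the placement inside JBD's commit path is an assumption.

    /* Hypothetical hooks: tell the checkpoint manager when a journaled
     * transaction commits so older op-log records can be discarded. */
    #include <linux/jbd.h>

    void membrane_txn_begin(tid_t tid);
    void membrane_txn_end(tid_t tid);

    static void commit_with_notification(journal_t *journal, tid_t tid)
    {
        membrane_txn_begin(tid);    /* transaction is about to commit */
        /* ... existing JBD commit processing for this transaction ... */
        membrane_txn_end(tid);      /* on-disk state is now a checkpoint */
    }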

4.4.2 Transparency

We employ fault injection to analyze the transparency offered by Membrane in hiding file-system crashes from applications. The goal of these experiments is to show the inability of current systems to hide faults from applications and how using Membrane can avoid this.

Our injection study is quite targeted; we identify places in the file-system code where faults may cause trouble, inject faults there, and observe the result. These faults represent transient errors from three different components: virtual memory (e.g., kmap, d_alloc_anon), disks (e.g., write_full_page, sb_bread), and the kernel proper (e.g., clear_inode, iget). In all, we injected 47 faults in different code paths in three file systems. We believe that many more faults could be injected to highlight the same issue.

We use four different metrics to understand the impact of each fault-injection experiment: how detected, application, FS:consistent, and FS:usable. How detected denotes how (or if at all) the fault was detected, and the reaction of the operating system to that fault. Application denotes the state of the application (or process) that is executing the file-system request. FS:consistent denotes whether the file system was consistent after the injected fault. FS:usable denotes whether the file system was able to service subsequent requests after the injected fault.

Tables 4.4, 4.5, and 4.6 present the results of our study. The captions explain how to interpret the data in the tables. In all experiments, the operating system was always usable after fault injection (not shown in the tables). We now discuss our observations and conclusions.

Vanilla OS and File Systems

First, we analyzed the vanilla versions of the file systems on a standard Linux kernel as our base case. The results are shown in the leftmost result column in Tables 4.4, 4.5, and 4.6.

For ext2, 85% of faults triggered a kernel “oops”. An oops typically indicates that something went seriously wrong within the file system. The other 15% of faults resulted in a general protection error inside the OS. The file system was only consistent for 55% of the fault-injection experiments. In cases where the file system was consistent, the operation triggering the fault did not internally modify any file-system state to cause an inconsistency. Finally, the file system was unusable in 80% of the fault-injection experiments. In the remaining 20% of experiments, the file system was unmountable 75% of the time, even though it was still usable.

For VFAT, all of the injected faults triggered an oops and the application (i.e., the process executing the file-system request) was killed after the fault was triggered in the file system. Only in a few experiments (around 38%) was the file system consistent after a fault injection. Finally, the file system was usable after fault injection in 54% of experiments. In cases where the file system was usable, it was not unmountable, indicating that file-system state was corrupted and did not get correctly cleaned up after the fault-injection experiment.

For ext3, 93% of the fault-injection experiments resulted in a kernel oops. After every fault-injection experiment, the application was killed, as the file system or the OS was unable to recover from the fault correctly. The file system was left in a consistent state only in 43% of experiments and was both usable and not unmountable in 7% of experiments.

Overall, we observed that Linux does a poor job of recovering from the injected faults; most faults (around 91%) triggered a kernel “oops” and the application (i.e., the process performing the file-system operation that triggered the fault) was always killed. Moreover, in one-third of the cases, the file system was left unusable, thus requiring a reboot and repair (fsck).

ext2            ext2+boundary            ext2+Membrane

ext2 Function Fault How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? create null-pointer o o d √√√ create mark inode dirty o ×× × o ×× × d √√√ a a writepage write full page o ××√√ × d×× s √ × d √√√ a a writepages write full page o × √ d s × √ d √√√ ×× b × a free inode mark buffer dirty o o √ d √√√ mkdir d instantiate o ×× × d×× s √ √ d √√√ ×× ×a b get block map bh o √ o d √√√ readdir page address G ×× G ×× × d √√√ ×× × b ×× × get page kmap o √ o √ d √√√ × × b × × get page wait page locked o √ o √ d √√√ get page read cache page o × √ × o × √ × d √√√ × × b × × lookup iget o √ o √ d √√√ add nondir d instantiate o × × d× e √ √× d √√√ ×× × b find entry page address G √ G √ d √√√ symlink null-pointer o × × o × √ × d √√√ rmdir null-pointer o ××√ × o × √ × d √√√ empty dir page address G × √ × G × √ × d √√√ × × b × × make empty grab cache page o √ o d √√√ commit chunk unlock page o × √ × d×× e × d √√√ readpage mpage readpage o × √ √× i √× √× d √√√ × ×

Table 4.4: Fault Study of ext2. The table shows the results of fault injections on the behavior of Linux ext2. Each row presents the results of a single experiment, and the columns show (in left-to-right order): which routine the fault was injected into, the nature of the fault, how/if it was detected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable. Various symbols are used to condense the presentation. For detection, “o”: kernel oops; “G”: general protection fault; “i”: invalid opcode; “d”: fault detected, say by an assertion. For application behavior, “×”: application killed by the OS; “√”: application continued operation correctly; “s”: operation failed but application ran successfully (silent failure); “e”: application ran and returned an error. Footnotes: a - file system usable, but un-unmountable; b - late oops or fault, e.g., after an error code was returned.

vfat            vfat+boundary            vfat+Membrane

vfat Function Fault How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? create null-pointer o o d √√√ create d instantiate o ×× × o ×× × d √√√ a a writepage blk write fullpage o ××√ × d×× s √ × d √√√ mkdir d instantiate o ××√ d s √× √ d √√√ a rmdir null-pointer o × √ × o √√ d √√√ lookup d find alias o × √ × d× e √ √ d √√√ get entry sb bread o × √ × o √ d √√√ a a get block map bh o × √× o × √× d √√√ a remove entries mark buffer dirty o ×× √ d×× s √ d √√√ a write inode mark buffer dirty o ×× √ d s √× √ d √√√ a clear inode is bad inode o ×× √ d s √ √ d √√√ ×× a b get dentry d alloc anon o √ o d √√√ a a readpage mpage readpage o ××√√ o ××√√ × d √√√ × ×

Table 4.5: Fault Study of VFAT. The table shows the results of fault injections on the behavior of Linux VFAT. Each row presents the results of a single experiment, and the columns show (in left-to-right order): which routine the fault was injected into, the nature of the fault, how/if it was detected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable. Various symbols are used to condense the presentation. For detection, “o”: kernel oops; “G”: general protection fault; “i”: invalid opcode; “d”: fault detected, say by an assertion. For application behavior, “×”: application killed by the OS; “√”: application continued operation correctly; “s”: operation failed but application ran successfully (silent failure); “e”: application ran and returned an error. Footnotes: a - file system usable, but un-unmountable; b - late oops or fault, e.g., after an error code was returned.

ext3            ext3+boundary            ext3+Membrane

ext3 Function Fault How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? How Detected? Application? FS:Consistent? FS:Usable? create null-pointer o o √ d √√√ a get blk handle bh result o ×× × d× s √× d √√√ a follow link nd set link o ××√ × d e √× √ d √√√ mkdir d instantiate o ×× d s √ √ d √√√ symlink null-pointer o ×× × d √ d √√√ a a readpage mpage readpage o ××√ × d × √√× d √√√ add nondir d instantiate o ××√ o × √ d √√√ prepare write blk prepare write o × √ × i× e √ √× d √√√ read blk bmap sb bread o × √ × o √ d √√√ new block dquot alloc blk o × √ × o × √ × d √√√ a readdir null-pointer o × × o × √√× d √√√ file write file aio write G ××√ √× i× e √ √ d √√√ free inode clear inode o × o √ d √√√ a new inode null-pointer o ××√ × i × √× d √√√ × × ××

Table 4.6: Fault Study of ext3. The table shows the results of fault injections on the behavior of Linux ext3. Each row presents the results of a single experiment, and the columns show (in left-to-right order): which routine the fault was injected into, the nature of the fault, how/if it was detected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable. Various symbols are used to condense the presentation. For detection, “o”: kernel oops; “G”: general protection fault; “i”: invalid opcode; “d”: fault detected, say by an assertion. For application behavior, “×”: application killed by the OS; “√”: application continued operation correctly; “s”: operation failed but application ran successfully (silent failure); “e”: application ran and returned an error. Footnotes: a - file system usable, but un-unmountable; b - late oops or fault, e.g., after an error code was returned.

Hardening Through Parameter Checks

Second, we analyzed the usefulness of fault detection without recovery by hardening the kernel and file-system boundary through parameter checks. The goal of this experiment is to understand whether we really need Membrane to handle file-system crashes or whether hardening the kernel and file-system boundary would suffice. The second result column (denoted by +boundary) of Tables 4.4, 4.5, and 4.6 shows the results.

For ext2, faults were correctly detected in 25% of the fault-injection experiments. In 35% of experiments, the injected fault went undetected and later triggered an oops when the corrupted data was subsequently accessed. Moreover, in 5% and 10% of cases, the injected fault resulted in an invalid operation and triggered a protection fault, respectively. An error was correctly returned to the application in 10% of the fault-injection experiments. In 15% of cases, errors were silently discarded by the operating system and the application was falsely notified that the request had completed successfully. The consistency percentages for file systems in the hardened kernel were the same as those for the vanilla OS. Finally, the file system was usable in 30% of cases, a slight improvement compared to the vanilla OS. In 50% of the cases where the file system was usable, it was unmountable.

For VFAT, in 46% of the fault-injection experiments, faults were successfully detected inside the OS. Only in 8% of cases did the faults initially go undetected but later result in an oops. In 38% of cases, a failure was incorrectly propagated as a successful operation to the application. Only in one experiment (i.e., 8% of cases) was an error correctly returned back to the application. File-system consistency improved to 54%, compared to 38% for the vanilla OS. Finally, in around 70% of cases, the file system was usable after a fault. In 44% of the experiments where the file system was usable, it was not unmountable after the fault injection.

For ext3, in around 36% of cases, faults were correctly detected in the OS. In 21% of the cases, an injected fault was incorrectly detected as an invalid opcode in the OS. In terms of application state, only in 21% of cases was an error correctly returned back to the application. In 14% of cases, the error was silently ignored inside the OS and a success was incorrectly propagated to the application. The file system was consistent in 86% of cases, which was much more than for the vanilla OS running ext3 (around 43%). Finally, in 57% of cases, the file system was usable after the fault-injection experiment. Amongst the 57% of usable file-system states, the file system was not unmountable in 50% of experiments.

Overall, although assertions detect the bad argument passed to the kernel-proper function, in the majority of the cases the returned error code was not handled properly (or propagated) by the file system. The application was always killed and the file system was left inconsistent, unusable, or both.

Benchmark      ext2   ext2+Membrane   o/h %   VFAT   VFAT+Membrane   o/h %   ext3   ext3+Membrane   o/h %
Seq. read      17.8       17.8          0     17.7       17.7          0     17.8       17.8          0
Seq. write     25.5       25.7         0.8    18.5       19.4         4.9    56.3       56.3          0
Rand. read    163.2      163.5         0.2   163.5      163.6          0    163.2      163.2          0
Rand. write    20.3       20.5          1     18.9       18.9          0     65.5       65.5          0
create         34.1       34.1          0     32.4       34.0         4.9    33.9       34.3         1.2
delete         20.0       20.1         0.5    20.8       21.0         0.9    18.6       18.7         0.5

Table 4.7: Microbenchmarks. This table compares the execution time (in seconds) for various benchmarks for restartable versions of ext2, VFAT, and ext3 (on Membrane) against their regular versions on the unmodified kernel. Sequential reads/writes are 4 KB at a time to a 1-GB file. Random reads/writes are 4 KB at a time to 100 MB of a 1-GB file. Create/delete copies/removes 1000 files each of size 1MB to/from the file system respectively. All workloads use a cold file-system cache.

Recovery Using Membrane

Finally, we focused on file systems surrounded by Membrane. The results of the experiments are shown in the rightmost column of Tables 4.4, 4.5, and 4.6. In all cases, for all file systems, faults were handled, applications did not notice faults, and the file system remained in a consistent and usable state. In summary, even in a limited and controlled set of fault-injection experiments, we can easily realize the usefulness of Membrane in recovering from file system crashes. In a standard or hardened environment, a file system crash is almost always visible to the user and the process performing the operation is killed. Membrane, on detecting a file system crash, transparently restarts the file system and leaves it in a consistent and usable state.

4.4.3 Performance

To evaluate the performance of Membrane, we run a series of both microbenchmark and macrobenchmark workloads where ext2, VFAT, and ext3 are run in a standard environment and within the Membrane framework. The goal of our benchmarks is to understand the overheads of running the above-mentioned file systems on top of Membrane during regular operation (i.e., measuring anticipation costs) and during crash recovery.

Benchmark      ext2   ext2+Membrane   o/h %   VFAT   VFAT+Membrane   o/h %   ext3   ext3+Membrane   o/h %
Sort          142.2      142.6         0.3   146.5      146.8         0.2   152.1      152.5         0.3
OpenSSH        28.5       28.9         1.4    30.1       30.8         2.3    28.7       29.1         1.4
PostMark       46.9       47.2         0.6    43.1       43.8         1.6   478.2      484.1         1.2

Table 4.8: Macrobenchmarks. The table presents the performance (in seconds) of different benchmarks running on both standard and restartable versions of ext2, VFAT, and ext3. The sort benchmark (CPU intensive) sorts roughly 100MB of text using the command-line sort utility. For the OpenSSH benchmark (CPU+I/O intensive), we measure the time to copy, untar, configure, and make the OpenSSH 4.51 source code. PostMark (I/O intensive) parameters are: 3,000 files (sizes 4KB to 4MB), 60,000 transactions, and 50/50 read/append and create/delete biases.

Regular Operations

Micro-benchmarks help analyze file-system performance for frequently performed operations in isolation. We use sequential read/write, random read/write, create, and delete operations as our micro-benchmarks. These operations exercise the most frequently accessed code paths in file systems. The caption of Table 4.7 describes our micro-benchmark configuration in more detail.

We also use commonly-used macro-benchmarks to help analyze file-system performance. Specifically, we use the sort utility, PostMark [96], and OpenSSH [162]. The sort benchmark represents data-manipulation workloads, PostMark represents I/O-intensive workloads, and OpenSSH represents user-desktop workloads. Table 4.8 shows the results of our macrobenchmark experiments.

From the tables, one can see that the performance overheads of our prototype for both micro- and macro-benchmarks are quite minimal; in all cases, the overheads were between 0% and 5%.

Recovery Time

Beyond baseline performance under no crashes, we were interested in studying the performance of Membrane during recovery. Specifically, how long does it take Membrane to recover from a fault? This metric is important, as high recovery times may be noticed by applications.

The recovery time in Membrane depends on the amount of dirty data, open sessions (or file handles), and log records (i.e., operations completed after the last checkpoint). Dirty data denotes the number of dirty pages from the most recent checkpoint that have not yet been written to the disk. Open sessions denote the number of file handles that need to be opened and re-attached to the file descriptor table of applications after a restart.

(a)  Data (MB)    Recovery time (ms)
     0                  8.6
     10                12.9
     20                13.2
     40                16.1

(b)  Open Sessions    Recovery time (ms)
     0                      8.6
     200                   11.4
     400                   14.6
     800                   22.0

(c)  Log Records    Recovery time (ms)
     0                    8.6
     1K                  15.3
     10K                 16.8
     100K                25.2

Table 4.9: Recovery Time. Tables (a), (b), and (c) show recovery time as a function of dirty pages (at checkpoint), the s-log, and the op-log, respectively, for the ext2 file system. Dirty pages are created by copying new files. Open sessions are created by getting handles to files. Log records are generated by reading and seeking to arbitrary data inside multiple files. The recovery time was 8.6ms when all three states were empty.

Log records denote the number of file-system requests logged in the op-log that need to be re-executed from the VFS layer during recovery.

We measured the recovery time in a controlled environment by varying the amount of state kept by Membrane. To vary the amount of dirty data, we execute a write-intensive workload and forcefully create a checkpoint after the required amount of data has been written to the file system. To vary the number of open sessions, we simply open files in an application (i.e., through the syscall layer) and crash the file system when the required number of file handles has been created. To vary the amount of log records, we run the PostMark benchmark and crash the file system after the required number of log records has been created. In all experiments, the two other parameters were maintained at 0.

Table 4.9 shows the result of varying the amount of dirty pages from the previous checkpoint, open sessions (i.e., s-log), and completed file-system operations which are not yet part of a checkpoint (i.e., op-log) for the ext2 file system. From the table, we can see that the recovery time is less than 17 ms even when the amount of dirty data was varied between 0 and 40 MB. This gives an upper bound on the time that would be needed to write back dirty data. It is important to note that in our current prototype, 40MB is the upper watermark before Membrane forcefully creates a new checkpoint to limit the amount of dirty data in memory. It is also important to note that the dirty pages are not synchronously written back to the disk and hence do not incur a large overhead.

For session logs, the recovery time did not vary significantly with the number of open file handles. Recovering open file handles includes performing a path lookup, creating a new file object, and associating the file object with the entry in the file descriptor table of an application.


Figure 4.5: Recovery Overhead. The figure shows the overhead of restarting ext2 while running the random-read microbenchmark. The x axis represents the overall elapsed time of the microbenchmark in seconds. The primary y axis shows the execution time per read operation, as observed by the application, in milliseconds. A file-system crash was triggered at 34s; as a result, the total elapsed time increased from 66.5s to 67.1s. The secondary y axis shows the number of indirect blocks read by the ext2 file system from the disk per second.

Even when the number of open file handles was varied between 0 and 800, the recovery time was still on the order of a few milliseconds.

For log records, we can see that the recovery time does not increase linearly with the number of records. The log records used for this experiment consisted of lookups, reads, creates, deletes, and a few writes. One might observe larger recovery times depending on the type and number of log records that need to be replayed.

In summary, from these experiments, we found that the recovery time grows sub-linearly with the amount of state. Moreover, the recovery time is only a few milliseconds in all the cases. Hence, applications need not see any significant performance drop caused by the recovery process.

Figure 4.5 shows the results of performing recovery during the random-read microbenchmark for the ext2 file system. From the figure, we can see that Membrane restarts the file system within 10 ms of the point of crash. Subsequent read operations are slower than in the regular case because the indirect blocks that were cached by the file system are thrown away at recovery time in our current prototype and have to be read back again after recovery (as shown in the graph).

In summary, both micro- and macrobenchmarks show that the fault anticipation in Membrane comes almost for free. Even in the event of a file-system crash, Membrane restarts the file system within a few milliseconds.

4.5 Summary

File systems fail. With Membrane, failure is transformed from a show-stopping event into a small performance issue. The benefits are many: Membrane enables file-system developers to ship file systems sooner, as small bugs will not cause massive user headaches. Membrane similarly enables customers to install new file systems, knowing that doing so won't bring down their entire operation. Membrane further encourages developers to harden their code and catch bugs as soon as possible. This fringe benefit will likely lead to more bugs being triggered in the field (and handled by Membrane, hopefully); if so, diagnostic information could be captured and shipped back to the developer, further improving file-system robustness.

We live in an age of imperfection, and software imperfection seems a fact of life rather than a temporary state of affairs. With Membrane, we can learn to embrace that imperfection, instead of fearing it. Bugs will still continue to arise, but those that are rare and hard to reproduce will remain where they belong, automatically “fixed” by a system that can tolerate them.

Chapter 5

Restartable User-level File Systems

“Retrofitting reliability to an existing design is very difficult.” – Butler Lampson

File System in USEr Space (FUSE) was designed to simplify the development and deployment of file systems [192]. FUSE provides fault isolation by moving file systems out of the kernel and running them in a separate address space. FUSE also simplifies the file-system interface and minimizes the interaction with other operating-system components. Nearly 200 FUSE file systems have already been implemented [160, 192], indicating that the move towards user-level file systems is significant.

Unfortunately, support for recovery in file systems using FUSE does not exist today. The current solution is to crash the user-level file system on a fault and wait for the user (or administrator) to manually repair and restart the file system; in the meantime, FUSE returns an error to applications if they attempt to access the crashed file system. This solution is not useful when applications and users depend on these file systems to access their data.

In this chapter, we explore the possibility of implementing a generic framework inside the operating system and FUSE to restart user-level file systems. Such a generic framework eliminates the need for a tailored solution to restart each individual user-level file system after a crash. We also develop a user-level file-system model that represents some of the popular user-level file systems. We believe this will help current and future file-system developers to modify and design their file systems to work with Re-FUSE.


Our solution to providing such a generic framework is Restartable FUSE (Re-FUSE), a restartable file-system layer built as an extension to FUSE, the Linux user-level file-system infrastructure [159]. In our solution, we add a transparent restart framework around FUSE which hides many file-system crashes from users; Re-FUSE simply restarts the file system and user applications continue unharmed.

The rest of the chapter is organized as follows. Section 5.1 presents the FUSE framework. Section 5.2 discusses the essentials of a restartable user-level file-system framework. Section 5.3 presents the design and implementation of Re-FUSE. Section 5.4 evaluates the robustness and performance of Re-FUSE.

5.1 FUSE

FUSE is a framework that enables users to create and run their own file systems as user-level processes [177]. In this section, we discuss the rationale for such a framework and present its basic architecture.

5.1.1 Rationale

FUSE was implemented to bridge the gap between the features that users want in a file system and those offered in kernel-level file systems. Users want simple yet useful features on top of their favorite kernel-level file systems. Examples of such features are encryption, de-duplication, and accessing files inside archives. Users also want simplified file-system interfaces to access systems like databases, web servers, and new web services such as Amazon S3. The simplified file-system interface obviates the need to learn new tools and languages to access data. Such features and interfaces are lacking in many popular kernel-level file systems.

Kernel-level file-system developers may not be open to the idea of adding all of the features users want in file systems for two reasons. First, adding a new feature requires a significant amount of development and debugging effort [199]. Second, adding a new feature to a tightly coupled system (such as a file system) increases the complexity of the already-large code base. As a result, developers are likely only willing to include functionality that will be useful to the majority of users.

FUSE enables file systems to be developed and deployed at user level and thus simplifies the task of creating a new file system in a number of ways. First, programmers no longer need to have an in-depth understanding of kernel internals (e.g., memory management, VFS, and block devices). Second, programmers need not understand how these kernel modules interact with others. Third, programmers can easily debug user-level file systems using standard debugging tools such as gdb [63] and valgrind [123]. All of these improvements combine to allow developers to focus on the features they want in a particular file system.

In addition to Linux, FUSE has been developed for the FreeBSD [40], Solaris [126], and OS X [64] operating systems. Though most of our discussion revolves around the Linux version of FUSE, the issues faced herein are likely applicable to FUSE within other systems.

5.1.2 Architecture

FUSE consists of two main components: the Kernel File-system Module (KFM) and a user-space library, libfuse (see Figure 5.1). The KFM acts as a pseudo file system and queues application requests that arrive through the VFS layer. The libfuse layer exports a simplified file-system interface that each user-level file system must implement and acts as a liaison between user-level file systems and the KFM.

A typical application request is processed as follows. First, the application issues a system call, which is routed through VFS to the KFM. The KFM queues this application request (e.g., to read a block from a file) and puts the calling thread to sleep. The user-level file system, through the libfuse interface, retrieves the request off of the queue and begins to process it; in doing so, the user-level file system may issue a number of system calls itself, for example to read or write the local disk, or to communicate with a remote machine via the network. When the request processing is complete, the user-level file system passes the result back through libfuse, which places it within a queue, where the KFM can retrieve it. Finally, the KFM copies the result into the page cache, wakes the application blocked on the request, and returns the desired data to it. Subsequent accesses to the same block will be retrieved from the page cache, without involving the FUSE file system.

Unlike kernel file systems, where the calling thread executes the bulk of the work, FUSE has a decoupled execution model, in which the KFM queues application requests and a separate user-level file-system process handles them. As we will see in subsequent sections, this decoupled model is useful in the design of Re-FUSE. In addition, FUSE uses multi-threading to improve concurrency in user-level file systems. Specifically, the libfuse layer allows user-level file systems to create worker threads to concurrently process file-system requests; as we will see in subsequent sections, such concurrency will complicate Re-FUSE.

The caching architecture of FUSE is also of interest. Because the KFM pretends to be a kernel file system, it must create in-memory objects for each user-level file-system object accessed by the application. Doing so improves performance greatly, as in the common case, cached requests can be serviced without consulting the user-level file system.
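To make the libfuse interface concrete, the following sketch shows a minimal read-only file system written against the FUSE 2.x high-level API (the same major version used later in this chapter). The file name, its contents, and the handler names are illustrative only; no file system discussed in this chapter is implemented this way, but the callback structure is representative of what libfuse expects.

```c
/* Minimal illustrative FUSE file system (FUSE 2.x high-level API).
 * It exposes a single read-only file, /hello; names and contents are
 * hypothetical and serve only to show the libfuse callback interface. */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

static const char *hello_path = "/hello";
static const char *hello_str  = "Hello from a user-level file system!\n";

/* Return attributes for the root directory and the single file. */
static int hello_getattr(const char *path, struct stat *stbuf)
{
    memset(stbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* List the root directory: ".", "..", and "hello". */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void) offset; (void) fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);
    return 0;
}

/* Allow read-only opens of /hello. */
static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

/* Copy the requested byte range of the file into the caller's buffer. */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    size_t len;
    (void) fi;
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    len = strlen(hello_str);
    if ((size_t) offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);
    return (int) size;
}

static struct fuse_operations hello_oper = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* libfuse parses the mount point and options and runs the event loop. */
    return fuse_main(argc, argv, &hello_oper, NULL);
}
```

Each callback runs inside the file-system process, in a libfuse worker thread; a bug in any of these handlers crashes only this process, which is precisely the class of failure Re-FUSE is designed to hide.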

Figure 5.1: FUSE Framework. The figure presents the FUSE framework. The user-level file system (in the solid white box) is a server process that uses libfuse to communicate with the Kernel-level FUSE Module (KFM). The client process is the application process invoking operations on the file system. File-system requests are processed in the following way: (1) the application sends a request to the KFM via the VFS layer; (2) the request gets tagged and is put inside the request queue; (3) the user-level file-system worker thread dequeues the request; (4) the worker services the request and returns the response; (5) the response is added back to the queue; (6) finally, the KFM copies the data into the page cache before returning it to the application.

5.2 User-level File Systems

In this section, we discuss the essentials of a restartable user-level file-system framework. We discuss both our assumptions of the fault model as well as assumptions we make about typical FUSE file systems. We conclude by discussing some challenges a restartable system must overcome.

5.2.1 The User-level File-System Model

To design a restartable framework for FUSE, we must first understand how user-level file systems are commonly implemented; we refer to these assumptions as our reference model of a user-level file system.

It is infeasible to examine all FUSE file systems to obtain the “perfect” reference model. Thus, to derive a reference model, we instead analyze six diverse and popular file systems. Table 5.1 presents details on each of the six file systems we chose to study. NTFS-3g (2009.4.4) and ext2fuse (0.8.1) are kernel-like file systems “ported” to user space. AVFS (0.9.8) allows programs to look inside archives (such as tar and gzip) and TagFS (0.1) allows users to organize documents using tags inside existing file systems. Finally, SSHFS (2.2) and HTTPFS (2.06) allow users to mount remote file systems or websites through the SSH and HTTP protocols, respectively. We now discuss the properties of the reference file-system model.

• Simple Threading Model: A single worker thread is responsible for processing a file-system request from start to finish, and only works on a single request at any given time. Amongst the reference-model file systems, only NTFS-3g is single-threaded (i.e., a single worker thread services all requests) by default; the rest all operate in multi-threaded mode (i.e., multiple worker threads concurrently process file-system requests). All reference-model file systems adhere to the simple threading model.

• Request Splitting: Each request to a user-level file system is eventually translated into one or more system calls. For example, an application-level write request to an NTFS-3g file system is translated into a sequence of block reads and writes, in which NTFS-3g reads in the meta-data and data blocks of the file and writes them back after updating them.

• Access Through System Calls: Any external calls that the user-level file system needs to make are issued through the system-call interface. These requests are serviced by either the local system (e.g., the disk) or a remote server (e.g., a web server); in either case, system calls are made by the user-level file system in order to access such services.

File System   Category        LOC   Downloads
NTFS-3g       block-based     32K   n/a
ext2fuse      block-based     19K   40K
AVFS          pass-through    39K   70K
TagFS         pass-through     2K   400
SSHFS         network-based    4K   93K
HTTPFS        network-based    1K   8K

Table 5.1: Reference Model File Systems. The table shows different aspects of the reference-model file systems. The Category column indicates the underlying communication mechanism used by each user-level file system to persist data. The LOC column indicates the lines of code in each file system. The Downloads column indicates the download count reported on each file system’s website as of September 1, 2010. Various definitions have been used to compress the description. Block-based denotes that the file system uses the raw block-device interface to store data; pass-through denotes that the user-level file system runs on top of a kernel-level file system and stores data in that kernel-level file system; network-based denotes that the user-level file system uses sockets to communicate with the underlying data-access mechanism to store data.


• Output Determinism: For a given request, the user-level file system always performs the same sequence of operations. Thus, on replay of a particular request, the user-level file system outputs the same values as the original invocation [2].

• Synchronous Writes: Both dirty data and meta-data generated while serving a request are immediately written back to the underlying system. Unlike kernel-level file systems, a user-level file system does not buffer writes in memory; doing so makes the user-level file system stateless, a property adhered to by many user-level file systems in order to afford a simpler implementation.

Our reference model clearly does not describe all possible user-level file-system behaviors. The FUSE framework does not impose any rules or restrictions on how one should implement a file system; as a result, it is easy to deviate from our reference model, if one desires. We discuss this issue further at the end of Section 5.3.

5.2.2 Challenges

FUSE in its current form does not tolerate any file-system mistakes. On a user-level file-system crash, the kernel cleans up the resources of the killed file-system process, which forces FUSE to abort all new and in-flight requests of the user-level file system and return an error (a “connection abort”) to the application process. The application is thus left responsible for handling failures from the user-level file system. FUSE also prevents any subsequent operations on the crashed file system until a user manually restarts it. As a result, the file system remains unavailable to applications during this process. Three main challenges exist in restarting user-level file systems; we now discuss each in detail.

Generic Recovery Mechanism

Currently, there are hundreds of user-level file systems, and most of them do not have built-in crash-consistency mechanisms. Crash-consistency mechanisms such as journaling or snapshotting could help restore file-system state after a crash. Adding such mechanisms would require significant implementation effort, not only in the user-level file system but also in the underlying data-management system. Thus, any recovery mechanism should not depend upon the user-level file system itself in order to perform recovery.

Synchronized State

Even if a user-level file system has some built-in crash-consistency mechanism, leveraging such a mechanism could still lead to a disconnect between the application-perceived file-system state and the state of the recovered file system. This discrepancy arises because crash-consistency mechanisms group file-system operations into a single transaction and periodically commit them to the disk; they are designed only for power failures and not for soft crashes. Hence, on restart, a crash-consistency mechanism only ensures that the file system is restored to the last known consistent state, which results in a loss of the updates that occurred between the last checkpoint and the crash. As applications are not killed on a user-level file-system crash, the file-system state recovered after a crash may not be the same as that perceived by applications. Thus, any recovery mechanism must ensure that the file system and applications eventually realize the same view of file-system state.

Figure 5.2: SSHFS Create Operation. The figure shows a simplified version of SSHFS processing a create request. The numbers within the gray circles indicate the sequence of steps SSHFS performs to complete the operation. The FUSE, application process, and network components of the OS are not shown for simplicity.

Residual State

The non-idempotent nature of system calls in user-level file systems can leave residual state behind on a crash. This residual state prevents file systems from recreating the state of partially-completed operations; as a result, neither undo nor redo of partially-completed operations through the user-level file system works in all situations. The create operation in SSHFS is a good example of such an operation. Figure 5.2 shows the sequence of steps performed by SSHFS during a create request. SSHFS can crash either before the file is created at the remote host (Step 4) or before the result is returned to the FUSE module (Step 5). Undo would incorrectly delete the file if it was already present at the remote host and the crash happened before Step 4; redo would incorrectly return an error to the application if the crash happened before Step 5. Thus, any recovery mechanism for user-level file systems must properly handle residual state.

5.3 Re-FUSE: Design and Implementation

Re-FUSE is designed to transparently restart the affected user-level file system upon a crash, while applications and the rest of the operating system continue to operate normally. In this section, we first present an overview of our approach. We then discuss how Re-FUSE anticipates, detects, and recovers from faults. We conclude with a discussion of how Re-FUSE leverages many existing aspects of FUSE to make recovery simpler, and some limitations of our approach.

5.3.1 Overview

The main challenge for Re-FUSE is to restart the user-level file system without losing any updates, while also ensuring that the restart activity is both lightweight and transparent. File systems are stateful, and as a result, both in-memory and on-disk state needs to be carefully restored after a crash.

Unlike existing solutions, Re-FUSE takes a different approach to restoring the consistency of a user-level file system after a file-system crash. After a crash, most existing systems roll back their state to a previous checkpoint and attempt to restore the state by re-executing operations from the beginning [31, 140, 165]. In contrast, Re-FUSE does not attempt to roll back to a consistent state, but rather continues forward from the inconsistent state towards a new consistent state. Re-FUSE does so by allowing partially-completed requests to continue executing from where they were stopped at the time of the crash. This action has the same effect as taking a snapshot of the user-level file system (including on-going operations) just before the crash and resuming from the snapshot during the recovery.

Most of the complexity and novelty in Re-FUSE comes in the fault-anticipation component of the system. We now discuss this piece in greater detail, before presenting the more standard detection and recovery protocols in our system.

5.3.2 Fault Anticipation

In anticipation of faults, Re-FUSE must perform a number of activities in order to ensure it can properly recover once said fault arises. Specifically, Re-FUSE must track the progress of application-level file-system requests in order to continue executing them from their last state once a crash occurs. The inconsistency in file-system state is caused by partially-completed operations at the time of the crash; fault anticipation must do enough work during normal operation in order to help the file system move to a consistent state during recovery.

To create light-weight continuous snapshots of a user-level file system, Re-FUSE fault anticipation uses three different techniques: request tagging, system-call logging, and non-interruptible system calls. Re-FUSE also optimizes its performance through page versioning.

Request Tagging

Tracking the progress of each file-system request is difficult in the current FUSE implementation. The decoupled execution model of FUSE, combined with request splitting at the user-level file system, makes it hard for Re-FUSE to correlate an application request with the system calls performed by a user-level file system to service said application request (see Sections 5.1.2 and 5.2.1 for details).

Request tagging enables Re-FUSE to correlate application requests with the system calls that each user-level file system makes on behalf of the request. As the name suggests, request tagging transparently adds a request ID to the task structure of the file-system process (i.e., worker thread) that services it.

Re-FUSE instruments the libfuse layer to automatically set the ID of the application request in the task structure of the file-system thread whenever it receives a request from the KFM. Re-FUSE adds an additional attribute to the task structure to store the request ID. Any system call that the thread issues on behalf of the request thus has the ID in its task structure. On a system call, Re-FUSE inspects the tagged request ID in the task structure of the process to correlate the system call with the original application request. Re-FUSE also uses the tagged request ID in the task structure of the file-system process to differentiate system calls made by the user-level file system from those made by other processes in the operating system. Figure 5.3 presents these steps in more detail.
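The sketch below illustrates where the tag is applied, assuming a hypothetical kernel hook refuse_set_request_id() and a simplified request structure; neither is a real FUSE or Linux interface, and the actual Re-FUSE implementation performs this work inside the instrumented libfuse layer and the kernel.

```c
/* Hypothetical sketch of request tagging; names are illustrative only. */
#include <stdint.h>
#include <stdio.h>

/* Stub for the assumed kernel hook that records req_id in the calling
 * thread's task structure; not a real Linux or FUSE interface. */
static void refuse_set_request_id(uint64_t req_id)
{
    printf("tagging current thread with request %llu\n",
           (unsigned long long) req_id);
}

struct ufs_request { uint64_t unique; /* ... request body elided ... */ };
typedef void (*fs_handler_t)(struct ufs_request *);

/* Sketch of the libfuse dispatch path: tag the worker thread before the
 * file-system handler runs, untag after the reply has been produced. */
static void dispatch_request(fs_handler_t handle, struct ufs_request *req)
{
    refuse_set_request_id(req->unique); /* syscalls made by the handler now
                                           carry this ID in the task struct */
    handle(req);
    refuse_set_request_id(0);           /* clear once the response is queued */
}

static void dummy_handler(struct ufs_request *req)
{
    /* A real handler would service the request and issue system calls. */
    (void) req;
}

int main(void)
{
    struct ufs_request req = { .unique = 42 };
    dispatch_request(dummy_handler, &req);
    return 0;
}
```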

System-Call Logging

Re-FUSE checkpoints the progress of individual application requests inside the user-level file system by logging the system calls that the user-level file system makes in the context of each request. On a restart, when the request is re-executed by the user-level file system, Re-FUSE returns the results from the recorded state to mimic the original execution.

The logged state contains the type, input arguments, and the response (return value and data), along with the request ID, and is stored in a hash table called the syscall request-response table. This hash table is indexed by the ID of the application request. Figure 5.3 shows how system-call logging takes place during regular operation.

Re-FUSE also maintains a count of the system calls that the file-system process makes on behalf of a request, in order to differentiate between calls to the same system call with identical parameters. For example, on a create request, NTFS-3g reads the same meta-data block multiple times between other read and write operations. Without a sequence number, it would be difficult to identify the corresponding entry in the syscall request-response table. Additionally, the sequence number also serves as a sanity check to verify that the system calls happen in the same order during replay. Re-FUSE removes the entries of an application request from the hash table when the user-level file system returns the response to the KFM.
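The following sketch illustrates the shape of such a syscall request-response table using illustrative C types; the sizes, hashing scheme, and function names are assumptions made for the example (the real table is a kernel data structure, and collision handling is omitted here for brevity).

```c
/* Sketch of a syscall request-response table; all names and sizes are
 * illustrative, and hash-collision handling is deliberately omitted. */
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES   1024
#define MAX_ARG_BYTES 64

struct syscall_record {
    uint64_t req_id;               /* FUSE request being serviced           */
    uint32_t seq;                  /* nth syscall made on behalf of req_id  */
    int      type;                 /* syscall number                        */
    uint8_t  args[MAX_ARG_BYTES];  /* serialized input arguments            */
    long     retval;               /* value returned to the file system     */
    int      valid;
};

static struct syscall_record table[MAX_ENTRIES];

/* Record a completed syscall during normal operation. */
void log_syscall(uint64_t req_id, uint32_t seq, int type,
                 const void *args, size_t arg_len, long retval)
{
    struct syscall_record *r = &table[(req_id * 31 + seq) % MAX_ENTRIES];
    r->req_id = req_id;
    r->seq    = seq;
    r->type   = type;
    memcpy(r->args, args, arg_len < MAX_ARG_BYTES ? arg_len : MAX_ARG_BYTES);
    r->retval = retval;
    r->valid  = 1;
}

/* During replay: if the re-issued syscall matches a logged one, return the
 * cached result instead of re-executing it; otherwise report a mismatch so
 * the caller can fall back to default FUSE behavior. */
int replay_syscall(uint64_t req_id, uint32_t seq, int type,
                   const void *args, size_t arg_len, long *retval)
{
    struct syscall_record *r = &table[(req_id * 31 + seq) % MAX_ENTRIES];
    size_t n = arg_len < MAX_ARG_BYTES ? arg_len : MAX_ARG_BYTES;

    if (!r->valid || r->req_id != req_id || r->seq != seq || r->type != type)
        return -1;
    if (memcmp(r->args, args, n) != 0)
        return -1;
    *retval = r->retval;
    return 0;
}
```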

Figure 5.3: Request Tagging and System-call Logging. The figure shows how Re-FUSE tracks the progress of an individual file-system request. The KFM queues application requests (denoted by R with a subscript), and Re-FUSE tracks the progress of each request in the following way: (1) the request identifier is transparently attached to the task structure of the worker thread at the libfuse layer; (2) the user-level file-system worker thread issues one or more system calls (denoted by S with a subscript) while processing the request; (3 and 4) Re-FUSE (at the system-call interface) identifies these calls through the request ID in the caller’s task structure and logs the input parameters along with the return value; (5) the KFM, upon receiving the response from the user-level file system for a request, deletes its entries from the log.

Non-interruptible System Calls

The threading model in Linux prevents this basic logging approach from working correctly. Specifically, the threading model in Linux forces all threads of a process to be killed when one of the threads terminates (or crashes) due to a bug. Moreover, the other threads are killed regardless of whether they are executing in user or kernel mode. Our logging approach only works if the system call issued by the user-level file system finishes completely, as a partially-completed system call could leave some residual state inside the kernel, thus preventing correct replay of in-flight requests (see Section 3.3.3 for details).

To remedy this problem, Re-FUSE introduces the notion of non-interruptible system calls. Such a system call provides the guarantee that if a system call starts executing a request, it continues until its completion. Of course, the system call can still complete by returning an error, but the worker thread executing the system call cannot be killed prematurely when one of its sibling threads is killed within the user-level file system. In other words, by using non-interruptible system calls, Re-FUSE allows a user-level file-system thread to continue to execute a system call to completion even when another user-level file-system thread is terminated due to a crash.

Re-FUSE implements non-interruptible system calls by changing the default termination behavior of a thread group in Linux. Specifically, Re-FUSE modifies the termination behavior in the following way: when a thread abruptly terminates, Re-FUSE allows other threads in the group to complete whatever system call they are processing until they are about to return the status (and data) to the user. Re-FUSE then terminates said threads after logging their responses (including the data) to the syscall request-response table.

Re-FUSE eagerly copies input parameters to ensure that the crashed process does not infect the kernel. Lazy copying of input parameters to a system call in Linux could potentially corrupt the kernel state, as non-interruptible system calls allow other threads to continue accessing the process state. Re-FUSE prevents access to corrupt input arguments by eagerly copying in parameters from the user buffer into the kernel and also by skipping the copy_from_user and copy_to_user functions after a crash. It is important to note that the process state is never accessed within a system call except for copying arguments from the user to the kernel at the beginning. Moreover, non-interruptible system calls are enforced only for user-level file-system processes (i.e., enforced only for processes that have a FUSE request ID set in their task structure). As a result, other application processes remain unaffected by non-interruptible system calls.

Performance Optimizations

Logging the responses of read operations has high overheads in terms of both time and space, as we also need to log the data returned with each read request. To reduce these overheads, instead of storing the data as part of the log records, Re-FUSE implements page versioning, which can greatly improve performance. Re-FUSE first tracks the pages accessed (and also returned) during each read request and then marks them as copy-on-write. The operating system automatically creates a new version whenever a subsequent request modifies a previously-marked page. The copy-on-write flag on the marked pages is removed when the response is returned from the user-level file system to the KFM layer. Once the response is returned, the file-system request is removed from the request queue at the KFM layer and need not be replayed after a crash.

Page versioning does not work for network-based file systems, which use socket buffers to send and receive data. To reduce the overheads of logging read operations for such file systems, Re-FUSE also caches the socket buffers of file-system requests until the request completes.
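The bookkeeping behind page versioning can be sketched as follows; this user-space illustration copies pages explicitly, whereas the real mechanism relies on kernel copy-on-write, and all structure and function names here are hypothetical.

```c
/* Simplified user-space sketch of page versioning: before a logged page is
 * modified, save a copy so the pre-modification version can be returned on
 * replay.  The real mechanism uses kernel copy-on-write. */
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

struct page_version {
    void *data;                  /* snapshot of the page contents */
    struct page_version *older;  /* previous snapshot, if any     */
};

struct tracked_page {
    void *page;                  /* live page used by the file system */
    struct page_version *versions;
};

/* Snapshot the page before its first modification after being logged. */
int save_version(struct tracked_page *tp)
{
    struct page_version *v = malloc(sizeof(*v));
    if (!v)
        return -1;
    v->data = malloc(PAGE_SIZE);
    if (!v->data) { free(v); return -1; }
    memcpy(v->data, tp->page, PAGE_SIZE);
    v->older = tp->versions;
    tp->versions = v;
    return 0;
}

/* On replay, recovery wants the oldest snapshot: the page contents as they
 * were before the in-flight request started modifying them. */
const void *oldest_version(const struct tracked_page *tp)
{
    const struct page_version *v = tp->versions;
    if (!v)
        return tp->page;         /* never modified: live page is the version */
    while (v->older)
        v = v->older;
    return v->data;
}
```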

5.3.3 Fault Detection

Re-FUSE detects faults in a user-level file system through file-system crashes. As discussed earlier, Re-FUSE only handles faults that are both transient and fail-stop. Unlike kernel-level file systems, detection of faults in a user-level file system is simple: Re-FUSE inspects the return value and the signal attached to the killed file-system process to differentiate between regular termination and a crash.

Re-FUSE currently only implements a lightweight fault-detection mechanism. Fault detection can be further hardened in user-level file systems by applying techniques used in other systems [39, 121, 200]. Such techniques can help to automatically add checks (by code or binary instrumentation) that crash file systems more quickly when certain types of bugs are encountered (e.g., out-of-bounds memory accesses).
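The distinction between a crash and a regular exit can be illustrated with ordinary POSIX process-monitoring code; Re-FUSE performs the equivalent check inside the kernel, so the snippet below is only an analogy, not the actual detection path.

```c
/* User-space illustration of fail-stop crash detection for a monitored
 * file-system process; Re-FUSE performs the analogous check in the kernel. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 1 if the child identified by pid died because of a signal
 * (e.g., SIGSEGV), 0 if it exited normally, -1 on error. */
int detect_crash(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status)) {
        fprintf(stderr, "file-system process killed by signal %d\n",
                WTERMSIG(status));
        return 1;   /* crash: trigger recovery */
    }
    if (WIFEXITED(status))
        return 0;   /* regular termination (e.g., unmount) */
    return -1;
}
```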

5.3.4 Fault Recovery

The recovery subsystem is responsible for restarting and restoring the state of the crashed user-level file system. To restore the in-memory state of the crashed user-level file system, Re-FUSE leverages the information about the file-system state available through the KFM. Recovery after a crash mainly consists of the following steps: clean up, re-initialize, restore the in-memory state of the user-level file system, and re-execute the file-system requests that were in flight at the time of the crash

(see Section 3.3.3 for details). The decoupled execution model in FUSE preserves application state on a crash; hence, application state need not be restored. We now explain the steps in the recovery process in detail.

The operating system automatically cleans up the resources used by a user-level file system on a crash. FUSE runs the file system as a normal process with no special privileges. On a crash, as with other killed user-level processes, the operating system cleans up the resources of the file system, obviating the need for explicit state cleanup.

Re-FUSE holds an extra reference on the FUSE device file object owned by the crashed process. This file object is the gateway to the request queue that was being handled by the crashed process and to the KFM’s view of the file system. Instead of doing a new mount operation, the restarted file-system process sends a restart message to the KFM to attach itself to the old instance of the file system in the KFM. This action also informs the KFM to initiate the recovery process for the particular file system.

The in-memory file-system state required to execute file-system requests is restored using the state cached inside the kernel (i.e., the VFS layer). Re-FUSE exploits the following property: an access to a user-level file-system object through the KFM layer recreates it. Re-FUSE performs a lookup for each of the objects cached in the VFS layer, which recreates the corresponding user-level file-system object in memory. Re-FUSE also uses the information returned in each call to point the cached VFS objects to the newly created file-system objects. It is important to note that lookups do not recreate all file-system objects but only those required to re-execute both in-flight and new requests. To speed up recovery, Re-FUSE looks up file-system objects lazily.

Finally, Re-FUSE restores the on-disk consistency of the user-level file system by re-executing in-flight requests. To re-execute the crashed file-system requests, a copy of each request that is available in the KFM layer is put back on the request queue for the restarted file system. For each replayed request, the FUSE request ID, the sequence number of the external call, and the input arguments are matched against the entry in the syscall request-response table, and if they match correctly, the cached results are returned to the user-level file system. If the previously encountered fault is transient, the user-level file system successfully executes the request to completion and returns the results to the waiting application.

On an error during recovery, Re-FUSE falls back to the default FUSE behavior, which is to crash the user-level file system and wait for the user to manually restart it. An error could be due to a non-transient fault or a mismatch in one or more input arguments in a replayed system call (i.e., a violation of our assumptions about the reference file-system model). Before giving up on recovering the file system, Re-FUSE dumps useful debugging information about the error for the file-system developer.

5.3.5 Leveraging FUSE

The design of FUSE simplifies the recovery process in a user-level file system for the following four reasons.

First, in FUSE, the file system is run as a stand-alone user-level process. On a file-system crash, only the file-system process is killed; other components such as FUSE, the operating system, the local file system, and even a remote host are not corrupted and continue to work normally.

Second, the decoupled execution model blocks the application issuing the file-system request at the kernel level (i.e., inside the KFM) while a separate file-system process executes the request on behalf of the application. On a crash, the decoupled execution model preserves application state and also provides a copy of the file-system requests that were being serviced by the user-level file system.

Third, requests from applications to a user-level file system are routed through the VFS layer. As a result, the VFS layer creates an equivalent copy of the in-memory state of the file system inside the kernel. Any access (such as a lookup) to the user-level file system using the in-kernel copy recreates the corresponding in-memory object.

Finally, application requests propagated from the KFM to a user-level file system are always idempotent (i.e., this idempotency is enforced by the FUSE interface). The KFM layer ensures idempotency of operations by changing all relative arguments from the application to absolute arguments before forwarding them to the user-level file system. The idempotent requests from the KFM allow requests to be re-executed without any side effects. For example, the read system call does not take the file offset as an argument and uses the current file offset of the requesting process; the KFM converts this relative offset to an absolute offset (i.e., an offset from the beginning of the file) during a read request.
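The value of absolute arguments can be seen with the standard read() and pread() calls: the former consumes the shared file offset and is therefore unsafe to replay, while the latter, like the rewritten requests the KFM forwards, can be re-executed with identical results. The snippet below is a generic illustration, not Re-FUSE code.

```c
/* Generic illustration of why absolute arguments make replay safe. */
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

ssize_t read_block_replayable(int fd, void *buf, size_t len, off_t offset)
{
    /* Replay-safe: the offset is an explicit argument, so re-execution
     * after a crash observes exactly the same file position. */
    return pread(fd, buf, len, offset);
}

ssize_t read_block_stateful(int fd, void *buf, size_t len)
{
    /* Not replay-safe: each call advances the shared file offset, so a
     * replayed request would read from a different position. */
    return read(fd, buf, len);
}
```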

5.3.6 Limitations

Our approach is obviously not without limitations. First, one of the assumptions that Re-FUSE makes in handling non-idempotency is that operations execute in the same sequence every time during replay. If file systems have some internal non-determinism, additional support would be required from the remote (or host) system to undo the partially-completed operations of the file system. For example, consider block allocation inside a file system. The block-allocation process is deterministic in most file systems today; however, if the file system randomly picked a block during allocation, the arguments to the subsequent replayed operations (i.e., the block number of the bitmap block) would change and thus could potentially leave the file system in an inconsistent state.

Re-FUSE does not currently support all I/O interfaces. For example, file systems cannot use mmap to write back data to the underlying system, as updates to mapped files are not immediately visible through the system-call interface. Similarly, page versioning does not work in direct-I/O mode; Re-FUSE requires the data to be cached within the page cache.

Multi-threading can also limit the applicability of Re-FUSE. For example, multi-threading in block-based file systems could lead to race conditions during replay of in-flight requests and hence data loss after recovery. Different threading models could also involve multiple threads in handling a single request. For such systems, the FUSE request ID needs to be explicitly transferred between the (worker) threads so that the operating system can identify the FUSE request ID on whose behalf the corresponding system call is issued.

The file systems in our reference model do not cache data in user space, but user-level file systems certainly could do so to improve performance (e.g., to reduce disk or network traffic). For such systems, the assumption about the completion of requests (by the time the response is written back) would be broken and would result in lost updates after a restart. One solution to this issue is to add a commit protocol to the request-handling logic: in addition to sending a response message back, the user-level file system would also issue a commit message after the write request has completed. Requests in the KFM could then be safely removed from the request queue only after a commit message is received from the user-level file system. In the event of a crash, all cached requests for which the commit message has not been received would be replayed to restore file-system state. For multi-threaded file systems, Re-FUSE would also need to maintain the execution order of requests to ensure correct replay. Moreover, if a user-level file system internally maintains a special cache (for some reason), the file system would need to explicitly synchronize the contents of the cache with Re-FUSE for correct recovery.

5.3.7 Implementation Statistics

Our Re-FUSE prototype is implemented in Linux 2.6.18 and FUSE 2.7.4. Table 5.2 shows the code changes made in both FUSE and the kernel proper. For Re-FUSE, around 3300 and 1000 lines of code were added to the Linux kernel and FUSE, respectively. The code changes in libfuse include request tagging, fault detection, and state restoration; the changes in the KFM center around support for recovery. The code changes in the VFS layer correspond to the support for system-call logging, and the modifications in the MM and NET modules correspond to page versioning and socket-buffer caching, respectively.

FUSE Changes
Component   Original   Added   Modified
libfuse     9K         250     8
KFM         4K         750     10
Total       13K        1K      18

Kernel Changes
Component   Original   Added   Modified
VFS         37K        3K      0
MM          28K        250     1
NET         16K        60      0
Total       81K        3.3K    1

Table 5.2: Implementation Effort. The table presents the code changes required to transform FUSE and Linux 2.6.18 into their restartable counterparts.

5.4 Evaluation

We now evaluate Re-FUSE in the following three categories: generality, robustness, and performance. Generality helps to demonstrate that our solution can be easily applied to other file systems with little or no change. Robustness helps show the correctness of Re-FUSE. Performance results help us analyze the overheads during regular operations and during a crash to see if they are acceptable.

All experiments were performed on a machine with a 2.2 GHz Opteron processor, two 80GB WDC disks, and 2GB of memory running Linux 2.6.18. We evaluated Re-FUSE with FUSE (2.7.4) using the NTFS-3g (2009.4.4), AVFS (0.9.8), and SSHFS (2.2) file systems. For SSHFS, we use public-key authentication to avoid typing the password on restart.

5.4.1 Generality

To show that Re-FUSE can be used by many user-level file systems, we chose NTFS-3g, AVFS, and SSHFS. These file systems differ in their underlying data-access mechanisms, reliability guarantees, features, and usage. Table 5.3 shows the code changes required in each of these file systems to work with Re-FUSE.

From the table, we can see that the file-system-specific changes required to work with Re-FUSE are minimal. To each user-level file system, we added fewer than 10 lines of code and modified a few more. Some of these lines were added to daemonize the file system and to restart the process in the event of a crash. A few further lines were added or modified to make recovery work properly. We now discuss the changes in individual file systems.

File System   Original   Added   Modified
NTFS-3g       32K        10      1
AVFS          39K        4       1
SSHFS         4K         3       2

Table 5.3: Implementation Complexity. The table presents the code changes required to transform NTFS-3g, AVFS, and SSHFS into their restartable counterparts.

NTFS-3g: NTFS-3g reads a few key metadata pages into memory during initialization, just after the creation of the file system, and uses these cached pages to handle subsequent requests. However, any changes to these key metadata pages are immediately written back to disk while processing requests. On a restart of the file-system process, NTFS-3g would again perform the same initialization process. However, if we allowed the process to read the current version of the metadata pages, it could potentially access inconsistent data and might thus fail. To avoid this situation, we return the oldest version of the metadata pages (i.e., through page versioning) on restart, as the oldest version is the version that existed before the handling of the particular request (note that NTFS-3g works in single-threaded mode).

AVFS: AVFS caches file handles from open requests to help speed up subsequent accesses to the underlying file systems. On a restart, the cached file handles are thrown away, which prevents requests from being properly executed through AVFS. To make AVFS work with Re-FUSE, we simply increment the reference count of open files and cache the file descriptor so that we can return the same file handle when the file is reopened again after a restart.

SSHFS: SSHFS internally generates its own request IDs to match the responses from the remote host with waiting requests. The request IDs are stored inside SSHFS and are lost on a crash. After a restart, on replay of an in-flight request, SSHFS generates new request IDs, which could be different from the old ones. The mismatch in request IDs would prevent system-call logging from operating correctly, as all parameters need to be matched exactly for SSHFS to process the response. To make SSHFS work correctly with Re-FUSE, we made Re-FUSE use the FUSE request ID tagged in the worker thread, along with the sequence number, to match new request IDs with the old ones. Once requests are matched, Re-FUSE correctly returns the cached response. Also, to mask the SSHFS crash from the remote server, Re-FUSE holds an extra reference count on the network socket and re-attaches it to the newly created process. Without this action, upon a restart, SSHFS would start a new session, and the cached file handles would not be valid in the new session.

In summary, Re-FUSE represents a generic approach to achieving user-level file-system restartability; existing file systems can work with Re-FUSE with the minimal change of adding a few lines of code.

5.4.2 Robustness

To analyze the robustness of Re-FUSE, we use fault injection. We employ both controlled and random fault injection to show the inability of current file systems to tolerate faults and how Re-FUSE helps them.

The injected faults are fail-stop and transient. These faults try to mimic some of the possible crash scenarios in user-level file systems. We first run the fault-injection experiments on a vanilla user-level file system running over FUSE and then compare the results by repeating them on the adapted user-level file system running over Re-FUSE, both with and without kernel modifications. The experiments without the kernel modifications are denoted by Restart and those with the kernel changes are denoted by Re-FUSE. We include the Restart column to show that, without kernel support, simple restart and replay of in-flight operations does not work well for FUSE.

Controlled Fault Injection

We employ controlled fault injection to understand how user-level file systems react to failures. In these experiments, we exercise different file-system code paths (e.g., create(), mkdir(), etc.) and crash the file system by injecting transient faults (such as a null-pointer dereference) in these code paths. We performed a total of 60 fault-injection experiments across the three file systems; we present the user-visible results.

User-visible results help analyze the impact of a fault at both the application and the file-system level. We choose application state, file-system consistency, and file-system state as the user-visible metrics of interest. Application state indicates how a fault affects the execution of the application that uses the user-level file system. File-system consistency indicates whether a potential data loss could occur as a result of a fault. File-system state indicates whether the file system can continue servicing subsequent requests after a fault.

Tables 5.4, 5.5, and 5.6 summarize the results of our fault-injection experiments. The captions explain how to interpret the data in the tables. We now discuss the major observations and conclusions of our fault-injection experiments.

[Table 5.4 layout: for each operation (create, mkdir, symlink, link, rename, open, read, readdir, readlink, write, unlink, rmdir, truncate, and utime), the table lists the fault-injected NTFS-3g function and the application behavior, file-system consistency, and file-system usability observed under the Regular, Restart, and Re-FUSE configurations.]

Table 5.4: NTFS-3g Fault Study. The table shows the effect of fault injections on the behavior of NTFS-3g. Each row presents the results of a single experiment, and the columns show (in left-to-right order) the intended operation, the file-system function that was fault-injected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable for other operations. Various symbols are used to condense the presentation. For application behavior, “√”: the application observed successful completion of the operation; “×”: the application received the error “software caused connection abort”; “e”: the application incorrectly received an error.

[Table 5.5 layout: for each operation (create, mkdir, symlink, rename, open, read, readdir, readlink, write, unlink, rmdir, truncate, chmod, and stat), the table lists the fault-injected SSHFS function and the application behavior, file-system consistency, and file-system usability observed under the Regular, Restart, and Re-FUSE configurations.]

Table 5.5: SSHFS Fault Study. The table shows the effect of fault injections on the behavior of SSHFS. Each row presents the results of a single experiment, and the columns show (in left-to-right order) the intended operation, the file-system function that was fault-injected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable for other operations. Various symbols are used to condense the presentation. For application behavior, “√”: the application observed successful completion of the operation; “×”: the application received the error “software caused connection abort”; “e”: the application incorrectly received an error.

[Table 5.6 layout: for each operation (create, mkdir, symlink, link, rename, open, read, readdir, readlink, write, unlink, rmdir, truncate, chmod, and stat), the table lists the fault-injected AVFS function and the application behavior, file-system consistency, and file-system usability observed under the Regular, Restart, and Re-FUSE configurations.]

Table 5.6: AVFS Fault Study. The table shows the effect of fault injections on the behavior of AVFS. Each row presents the results of a single experiment, and the columns show (in left-to-right order) the intended operation, the file-system function that was fault-injected, how it affected the application, whether the file system was consistent after the fault, and whether the file system was usable for other operations. Various symbols are used to condense the presentation. For application behavior, “√”: the application observed successful completion of the operation; “×”: the application received the error “software caused connection abort”; “e”: the application incorrectly received an error.

Vanilla OS, FUSE, and File Systems

First, we analyze the vanilla versions of the file systems running on vanilla FUSE and a vanilla Linux kernel. The results are shown in the leftmost result columns of Tables 5.4, 5.5, and 5.6.

For NTFS-3g, in all experiments, the application always received a connection-abort error after the fault injection and was killed, as it could not handle the error from the file system. In all experiments, the file system was unusable after a fault injection. Finally, the file system was consistent in only 36% of the fault-injection experiments. The inconsistency was caused by partially-completed operations inside the file system (see request splitting in Section 5.2.1 for details).

For SSHFS, in all experiments, the application received a connection-abort error and the file system was unusable after a fault injection. Unlike NTFS-3g, the file system was always consistent after fault injection. By design, SSHFS atomically updates the changes to the remote host. The remote host is unaffected by the fault, as faults are localized within the user-level file system (see Section ?? for details).

For AVFS, in all experiments, the application received a connection-abort error and the file system was unusable after a fault injection. The file system was consistent in only 40% of the experiments. This is because, unlike in SSHFS, not all file-system requests are atomic; hence, some of the partially-completed file-system requests left the underlying system inconsistent.

Overall, we observe that the vanilla versions of user-level file systems and FUSE do a poor job of hiding failures from applications. In all experiments, the user-level file system is unusable after the fault; as a result, applications have to prematurely terminate their requests after receiving an error (a “software-caused connection abort”) from FUSE. Moreover, in 40% of the cases, crashes lead to inconsistent file-system state.

Simple Restart

Second, we analyze the usefulness of fault detection and simple restart at the KFM without any explicit support from the operating system. The second set of result columns (denoted Restart) in Tables 5.4, 5.5, and 5.6 shows the results.

For NTFS-3g, in 50% of the fault-injection experiments, an error was incorrectly returned to the application on a request retry after a restart. The file system was consistent in 38% of the experiments; the percentage of consistent file-system states is nearly the same as that of the vanilla NTFS-3g file system. Finally, unlike the vanilla file system, the restart version of the file system was usable after all fault-injection experiments.

For SSHFS, in 43% of the experiments, an error was incorrectly returned to the application on a request retry after a restart. In all experiments, the file system was consistent and usable after a fault injection. The simple restart mechanism added to FUSE was sufficient to restart the crashed file system and enable it to service subsequent requests.

For AVFS, in 47% of the experiments, an error was incorrectly returned to the application on a request retry after a restart. The file system was always usable after the fault-injection experiments. However, in 40% of the fault-injection experiments, the file system was left in an inconsistent state.

Overall, we observe that a simple restart of the user-level file system and replay of in-flight requests at the KFM layer ensure that the application completes the failed operation in the majority of cases (around 60%). This approach still cannot, however, re-execute a significant fraction (around 40%) of partially-completed operations due to the non-idempotent nature of the particular file-system operations. In these cases, an error is wrongly returned to the application and the crash leaves the file system in an inconsistent state.

Restart Using Re-FUSE

Finally, we analyze the usefulness of Re-FUSE, which restarts the crashed user-level file system, replays in-flight requests, and has support from the operating system for re-executing non-idempotent operations (i.e., all the support described in Section 5.3). The results of these experiments are shown in the rightmost columns of Tables 5.4, 5.5, and 5.6. From the tables, we can see that all faults are handled properly, applications successfully complete the operation, and the file system is always left in a consistent state.

Random Fault Injection

In order to stress the robustness of our system, we use random fault injection. In the random fault-injection experiments, we arbitrarily crash the user-level file system during different workloads and observe the user-visible results. The sort, Postmark, and OpenSSH macro-benchmarks are used as workloads for these experiments; each is described further below. We perform the experiments on the vanilla versions of the user-level file systems, FUSE, and Linux kernel, and on the adapted versions of the user-level file systems that run with Re-FUSE.

We use three commonly-used macro-benchmarks to help analyze file-system robustness (and later, performance). Specifically, we utilize the sort utility, Postmark [96], and OpenSSH [162]. The sort benchmark represents data-manipulation workloads, Postmark represents I/O-intensive workloads, and OpenSSH represents user-desktop workloads.

File System + FUSE   Injected Faults   Sort (Survived)   OpenSSH (Survived)   Postmark (Survived)
NTFS-3g              100               0                 0                    0
SSHFS                100               0                 0                    0
AVFS                 100               0                 0                    0

Table 5.7: Random Fault Injection in FUSE. The table shows the effect of randomly injected crashes on the three file systems running on FUSE. The second column gives the total number of random (in terms of the crash point in the code) crashes injected into the file system over the time it serves each macro-benchmark. The crashes are injected by periodically sending the signal SIGSEGV to the file-system process. The rightmost three columns indicate the number of crashes survived by the file systems during each macro-benchmark.

File System + Re-FUSE   Injected Faults   Sort (Survived)   OpenSSH (Survived)   Postmark (Survived)
NTFS-3g                 100               100               100                  100
SSHFS                   100               100               100                  100
AVFS                    100               100               100                  100

Table 5.8: Random Fault Injection in Re-FUSE. The table shows the effect of randomly injected crashes on the three file systems supported with Re-FUSE. The details of the fault injection and data interpretation are as in Table 5.7.

Tables 5.7 and 5.8 present the results of our study. From Table 5.8, we see that Re-FUSE ensures that the application continues executing through the failures, thus making progress. We also found that a vanilla user-level file system with no support for fault handling cannot tolerate crashes (shown in Table 5.7).

In summary, both the controlled and random fault-injection experiments show the usefulness of Re-FUSE in recovering from user-level file-system crashes. In a standard environment, a user-level file system is unusable after a crash and applications using it are killed. Moreover, in many cases, the file system is left in an inconsistent state. In contrast, Re-FUSE, upon detecting a user-level file-system crash, transparently restarts the crashed file system and restores it to a consistent and usable state. It is important to understand that even though Re-FUSE recovers cleanly from both controlled and random faults, it is still limited in its applicability (i.e., Re-FUSE only works for faults that are both fail-stop and transient, and for file systems that strictly adhere to the reference file-system model).

5.4.3 Performance

Though fault tolerance is our primary goal, we also evaluate the performance of Re-FUSE in the context of regular operations and recovery time. Performance evaluation enables us to understand the overhead of running the system in the absence and presence of faults. Specifically, we measure the overhead of our system during regular operations and also during user-level file-system crashes to see if a user-level file system running on Re-FUSE has acceptable overheads.

Regular Operations

We use both micro- and macro-benchmarks to evaluate the overheads during regular operation. Micro-benchmarks help analyze file-system performance for frequently executed operations in isolation. We use sequential read/write, random read/write, create, and delete operations as our micro-benchmarks. These operations exercise the most frequently accessed code paths in file systems. The caption of Table 5.9 describes our micro-benchmark configuration in more detail. We also use the previously described macro-benchmarks sort, Postmark, and OpenSSH; the caption of Table 5.10 describes the configuration parameters for our experiments.

Tables 5.9 and 5.10 show the results of the micro- and macro-benchmarks, respectively. From the tables, we can see that for both micro- and macro-benchmarks, Re-FUSE has minimal overhead, often less than 3%. The overheads are small due to in-memory logging and our optimization through page versioning (or socket-buffer caching in the case of SSHFS). The overheads of running NTFS-3g on Re-FUSE are noticeable in some of the write-intensive micro-benchmark experiments due to page versioning. These results show that the additional reliability Re-FUSE achieves comes with negligible overhead for common file-system workloads, thus removing one important barrier to the adoption of Re-FUSE.

Benchmark          ntfs    ntfs+      o/h    sshfs   sshfs+     o/h    avfs    avfs+      o/h
                           Re-FUSE    %              Re-FUSE    %              Re-FUSE    %
Sequential read    9.2     9.2        0.0    91.8    91.9       0.1    17.1    17.2       0.6
Sequential write   13.1    14.2       8.4    519.7   519.8      0.0    17.9    17.9       0.0
Random read        150.5   150.5      0.0    58.6    59.5       1.5    154.4   154.4      0.0
Random write       11.3    12.4       9.7    90.4    90.8       0.4    53.2    53.7       0.9
Create             20.6    23.2       12.6   485.7   485.8      0.0    17.1    17.2       0.6
Delete             1.4     1.4        0.0    2.9     3.0        3.4    1.6     1.6        0.0

Table 5.9: Microbenchmarks. This table compares the execution time (in seconds) of various benchmarks for the restartable versions of NTFS-3g, SSHFS, and AVFS (on Re-FUSE) against their regular versions on the unmodified kernel. Sequential reads/writes are 4 KB at a time to a 1-GB file. Random reads/writes are 4 KB at a time to 100 MB of a 1-GB file. Create/delete copies/removes 1000 files, each of size 1 MB, to/from the file system. All workloads use a cold file-system cache.

Benchmark   ntfs    ntfs+      o/h   sshfs   sshfs+     o/h   avfs    avfs+      o/h
                    Re-FUSE    %             Re-FUSE    %             Re-FUSE    %
Sort        133.5   134.2      0.5   145.0   145.2      0.1   129.0   130.3      1.0
OpenSSH     32.5    32.5       0.0   55.8    56.4       1.1   28.9    29.3       1.4
PostMark    112.0   113.0      0.9   5683    5689       0.1   141.0   143.0      1.4

Table 5.10: Macrobenchmarks. The table presents the performance (in seconds) of different benchmarks running on both the standard and restartable versions of NTFS-3g, SSHFS, and AVFS. The sort benchmark (CPU intensive) sorts roughly 100MB of text using the command-line sort utility. For the OpenSSH benchmark (CPU+I/O intensive), we measure the time to copy, untar, configure, and make the OpenSSH 4.51 source code. PostMark (I/O intensive) parameters are: 3000 files (sizes 4KB to 4MB), 60000 transactions, and 50/50 read/append and create/delete biases.

              Vanilla      Re-FUSE
              Total        Total        Restart
File System   Time (s)     Time (s)     Time (ms)
NTFS-3g       133.5        134.45       65.54
SSHFS         145.0        145.4        255.8
AVFS          129.0        130.7        6.0

Table 5.11: Restart Time in Re-FUSE. The table shows the impact of a single restart on the restartable versions of the file systems. The benchmark used is sort and the restart is triggered approximately mid-way through the benchmark.

Recovery Time

We now measure the overhead of recovery time in Re-FUSE. Recovery time is the time Re-FUSE takes to restart and restore the state of the crashed user-level file system. To measure the recovery-time overhead, we ran the sort benchmark ten times and crashed the file system half-way through each run. Sort is a good benchmark for testing recovery as it makes many I/O system calls, and reads and updates in-memory file-system state.

Table 5.11 shows the elapsed time and the average time Re-FUSE spent in restoring the crashed user-level file-system state. The restoration process includes the restart of the file-system process and the restoration of its in-memory state. From the table, we can see that the restart time is at most a few hundred milliseconds. The application also does not see any observable increase in its execution time due to the file-system crash.

In summary, both micro- and macro-benchmark results show that the performance overheads during regular operations are minimal. Even in the event of a file-system crash, Re-FUSE restarts the user-level file system within a few hundred milliseconds.

5.5 Summary

Software imperfections are common and are a fact of life especially for code that has not been well tested. Even though user-level file systems crashes are isolated from the operating system by FUSE, the reliability of individual file systems has not necessarily improved. File systems still remain unavailable to applications after a crash. Re-FUSE embraces the fact that failures sometimes occur and provides a framework to transparently restart crashed file systems. 103

We develop a number of new techniques to enable efficient and correct user- level file system restartability. In particular, request tagging allows Re-FUSE to differentiate between concurrently-serviced requests; system-call logging enables Re-FUSE to track (and eventually, replay) the sequence of operations performed by a user-level file system; non-interruptible system calls ensure that user-level file-system threads move to a reasonable state before file system recovery begins. Through experiments, we demonstrate that our techniques are reasonable in their performance overheads and effective at detection and recovery from a certain class of faults. It is unlikely developers will ever build the “perfect” file system; Re-FUSE presents one way to tolerate these imperfections. 104 Chapter 6

Reliability through Reservation

“Act as if it were impossible to fail.” – Dorothea Brande

Memory is one of the important resources that is widely used within the file system and also across other operating-system components. Moreover, in a com- plex system such as Linux, memory allocations can happen in a variety of ways (e.g., kmalloc, kmem cache alloc, etc.). Previous studies have shown that memory-allocation failure can lead to catastrophic results in file systems [50, 71, 120, 197]. In this chapter, we explore the possibility of improving the reliability of file sys- tems through resource reservation. We take a new approach to solving the problem presented by memory-allocation failures by following a simple mantra: the most robust recovery code is recovery code that never runs at all. In other words, our goal is to eliminate the recovery code that deals with memory-allocation failures to the largest possible extent. There are a few challenges in doing it this way. First, we need a mechanism to identify all possible memory allocation calls during each system call. Second, we need to know the type and parameters of the objects that need to be allocated. Third, seamlessly pre-allocate and return the corresponding objects at runtime. Four, safe way to clean up any unused pre-allocated objects at the end of each system call. Our approach is called Anticipatory Memory Allocation (AMA). The basic idea behind AMA is simple. First, using both a static analysis tool and domain knowl- edge, the developer determines a conservative estimate of the total memory alloca- tion demand of each call into the kernel subsystem of interest. Using this informa- tion, the developer then augments the code to pre-allocate the requisite amount of memory at run-time, immediately upon entry into the kernel subsystem. The AMA

105 106 run-time then transparently redirects existing memory-allocation calls to use mem- ory from the pre-allocated chunk. Thus, when a memory allocation takes place deep in the heart of the kernel subsystem, it is guaranteed never to fail. The rest of this chapter is organized as follows. First, in Section 6.1, we present a background on memory allocation in Linux. Then in Section 6.2, we present our study of how Linux file systems react to memory failure. Then we give an overview of our approach in Section 6.3. We present the design and implementation of AMA in Section 6.4 and Section 6.5, respectively. Finally, in Section 6.6 we evaluate AMA’s robustness and performance.

6.1 Linux Memory Allocators

We provide some background on kernel memory allocation. We describe the many different ways in which memory is explicitly allocated within the kernel. Our dis- cussion revolves around the Linux kernel (with a focus on file systems), although in our belief the allocation types shown here likely to exist in other modern operating systems.

6.1.1 Memory Zones At the lowest level of memory allocation within Linux is a buddy-based allocator of physical pages [25]. The buddy-based allocator uses low-level routines such as alloc pages() and free pages() to request and return pages, respectively. These functions serve as the basis for the allocators used for kernel data structures (described below), although they can be called directly if so desired.

6.1.2 Kernel Allocators Most dynamic memory requests in the kernel use the Linux slab allocator, which is based on Bonwick’s original slab allocator for Solaris [22] (a newer SLUB allocator provides the same interfaces but is internally simpler). The premise behind the slab allocator is that certain objects are repeatedly created and destroyed by the kernel; the allocator thus keeps separate caches for a range of allocation sizes (from 32 bytes to 128 KB, in powers of 2), and thus can readily recycle freed memory and avoid fragmentation. One simply calls the generic memory allocation routines kmalloc() and kfree() to use these facilities. For particularly popular objects, specialized caches can be explicitly created. To create such a cache, one calls kmem cache create(), which (if success- ful) returns a reference to the newly-created object cache; subsequent calls to 107

kmem cache mempool alloc kmalloc alloc vmalloc create pages btrfs 93 7 3 01 ext2 81 0 00 ext3 12 1 0 00 ext4 26 10 1 0 0 jfs 18 1 2 10 reiser 17 1 5 00 xfs 11 1 0 11

Table 6.1: Usage of Different Allocators. The table shows the number of differ- ent memory allocators used within Linux file systems. Each column presents the number of times a particular routine is found in each file system. kmem cache alloc() are passed this reference and return memory for the spe- cific object. Hundreds of these specialized allocation caches exist in a typical sys- tem (see /proc/slabinfo); a common usage for a file system, for example, is an inode cache. Beyond these commonly-used routines, there are a few other ways to request memory in Linux. A memory pool interface allows one to reserve memory for use in emergency situations. Finally, the virtual malloc interface requests in-kernel pages that are virtually (but not necessarily physically) contiguous. To demonstrate the diversity of allocator usage, we present a study of the pop- ularity of these interfaces within a range of Linux file systems. Table 6.1 presents our results. As one can see, although the generic interface kmalloc() is most popular, the other allocation routines are used as well. For kernel code to be robust, it must handle failures from all of these allocation routines.

6.1.3 Failure Modes

When calling into an allocator, flags determine the exact behavior of the alloca- tor, particularly in response to failure. Of greatest import to us is the use of the GFP NOFAIL flag, which a developer can use when they know their code can- not handle an allocation failure; using the flag is the only way to guarantee that an allocator will either return successfully or not return at all (i.e., keep trying for- ever). However, this flag is rarely used. As lead Linux kernel developer Andrew Morton said [119]: “ GFP NOFAIL should only be used when we have no way 108 of recovering from failure. ... Actually, nothing in the kernel should be using GFP NOFAIL. It is there as a marker which says ’we really shouldn’t be doing this but we don’t know how to fix it’.” In all other uses of kernel allocators, failure is thus a distinct possibility.

6.2 Bugs in Memory Allocation

Memory-allocation failures are an issues in many systems, as developers do not always handle rare events like allocation failures. Earlier work has repeatedly found that memory-allocation failure is often mishandled [49, 197]. In Yang et al.’s model-checking work, one key to finding bugs is to follow the code paths where memory allocation has failed [197]. We now perform a brief study of memory-allocation failure handling within Linux file systems. We use fault injection to fail calls to the various memory allo- cators and determine how the code reacts as the number of such failures increases. Our injection framework picks a certain allocation call (e.g., kmalloc()) within the code and fails it probabilistically; we then vary the probability and observe how the kernel reacts as an increasing percentage of memory-allocation calls fail. For the workload, we created a micro-benchmark that performs a mix of file-system operations (such as read, write, create, delete, open, close, and lookup). Table 6.2 presents our results, which sums the failures seen in 15 runs per file system, while changes the probability of an allocation request failing for 0%, 10% and 50% of the time. We report the outcomes in two categories: process state and file-system state. The process state results are further divided into two groups: the number of times (in 15 runs) that a running process received an error (such as ENOMEM), and the number of times that a process was terminated abnormally (i.e., killed). The file system results are split into two categories as well: a count of the number of times that the file system became unusable (i.e., further use of the file system was not possible after the trial), and the number of times the file system became inconsistent as a result, possible losing user data. From the table, we can make the following observations. First, we can see that even a simple, well-tested, and slowly-evolving file system such as Linux ext2 still does not handle memory-allocation failures very well; we take this as evidence that doing so is challenging. Second, we observe that all file systems have difficulty handling memory-allocation failure, often resulting in an unusable or inconsistent file system. An example of how a file-system inconsistency can arise is found in Figure 6.1. 109

Process State File-System State Error Abort Unusable Inconsistent btrfs0 0 0 0 0 btrfs10 0 14 15 0 btrfs50 0 15 15 0 ext20 0 0 0 0 ext210 10 5 5 0 ext250 10 5 5 0 ext30 0 0 0 0 ext310 10 5 5 4 ext350 10 5 5 5 ext40 0 0 0 0 ext410 10 5 5 5 ext450 10 5 5 5 jfs0 0 0 0 0 jfs10 15 0 2 5 jfs50 15 0 5 5 reiserfs0 0 0 0 0 reiserfs10 10 4 4 0 reiserfs50 10 5 5 0 xfs0 0 0 0 0 xfs10 13 1 0 3 xfs50 10 5 0 5

Table 6.2: Fault Injection Results. The table shows the reaction of the Linux file systems to memory-allocation failures as the probability of a failure increases. We randomly inject faults into the three most-used allocation calls: kmalloc(), kmem cache alloc(), and alloc pages(). For each file system and each probability (shown as subscript), we run a micro benchmark 15 times and report the number of runs in which certain failures happen in each column. We categorize all failures into process state and file-system state, in which ’Error’ means that file system operations fail (gracefully), ’Abort’ indicates that the process was termi- nated abnormally, ’Unusable’ means the file system is no longer accessible, and ’Inconsistent’ means file system metadata has been corrupted and data may have been lost. Ideally, we expect the file systems to gracefully handle the error (i.e., return error) or retry the failed allocation request. Aborting a process, inconsistent file-system state, and unusable file system are unacceptable actions on an memory allocation failure. 110 empty_dir() [file: namei.c] if (...|| !(bh = ext4_bread(..., &err))) ... return 1; // XXX: should have returned 0 ext4_rmdir() [file: namei.c] retval = -ENOTEMPTY; if (!empty_dir(inode)) goto end_rmdir; retval = ext4_delete_entry(handle, dir, de, bh); if (retval) goto end_rmdir;

Figure 6.1: Improper Failure Propagation. The code shown in the figure is from the ext4 file system, and shows a case where a failed low-level allocation (in ext4 bread()) is not properly handled, which eventually leads to an inconsistent file system.

In this example, while trying to remove a directory from the file system (i.e., the ext4 rmdir function), the routine first checks if the directory is empty by calling empty dir(). This routine, in turn, calls ext4 bread() to read the directory data. Unfortunately, due to our fault injection, ext4 bread() tries to allocate memory but fails to do so, and thus the call to ext4 bread() returns an error (correctly). The routine empty dir() incorrectly propagates this error, simply returning a 1 and thus accidentally indicating that the directory is empty and can be deleted. Deleting a non-empty directory not only leads to a hard-to-detect file- system inconsistency (despite the presence of journaling), but also could render inaccessible a large portion of the directory tree. Finally, a closer look at the code of some of these file systems reveals a third interesting fact: in a file system under active development (such as btrfs), there are many places within the code where memory-allocation failure is never checked for; our inspection has yielded over 20 places within btrfs such as this. Such trivial mishandling is rarer inside more mature file systems. Overall, our results hint at a broader problem: developers write code as if mem- ory allocation will never fail; only later do they (possibly) go through the code and attempt to “harden” it to handle the types of failures that might arise. Proper han- dling of such errors, as seen in the ext4 example, is a formidable task, and as a result, such hardening sometimes remains “softer” than desired. 111

Summary Kernel memory allocation is complex, and handling failures still proves challeng- ing even for code that is relatively mature and generally stable. We believe these problems are fundamental given the way current systems are designed; specifically, to handle failure correctly, a deep recovery must take place, where far downstream in the call path, one must either handle the failure, or propagate the error up to the appropriate error-handling location while concurrently making sure to unwind all state changes that have taken place on the way down the path. Earlier work has shown that the simple act of propagating an error correctly in a complex file system is challenging [71]; doing so and correctly reverting all other state changes presents further challenges. Although deep recovery is possible, we believe it is usually quite hard, and thus error-prone. More sophisticated bug-finding tools could be built, and further bugs unveiled; however, to truly solve the problem, an alternate approach to deep recovery is likely required.

6.3 Overview

We now present an overview of Anticipatory Memory Allocation (AMA), a novel approach to solve the memory-allocation failure-handling problem. The basic idea is simple: first, we analyze the code paths of a kernel subsystem to determine what its memory requirements are. Second, we augment the code with a call to pre- allocate the necessary amounts. Third, we transparently redirect allocation requests during run-time to use the pre-allocated chunks of memory. Figure 6.2 shows a simple example of the transformation. In the figure, a simple entry-point routine f1() calls one other downstream routine, f2(), which in turn calls f3(). Each of these routines allocates some memory during their normal execution, in this case 100 bytes by f2() and 25 bytes by f3(). With AMA, we analyze the code paths to discover the worst-case allocation possible; in this example, the analysis is simple, and the result is that two memory chunks, of size 100 and 25 bytes, are required. Then, before calling into f2(), one calls into the anticipatory memory allocator to pre-allocate chunks of 100 and 25 bytes. The modified run-time then redirects all downstream allocation requests to use this pre-allocated pool. Thus the calls to allocate 100 and 25 bytes in f2() and f3() (respectively) will use memory already allocated by AMA, and are guaran- teed not to fail. The advantages of this approach are many. First, memory-allocation failures never happen downstream, and thus there is no need to handle said failures; the complex unwinding of kernel state and error propagation are thus avoided entirely. 112 void f2() { void *p = malloc(100); f3();

} void f3() { void *q = malloc(25); } int f1() // AMA:{ Pre-allocate 100- and 25- chunks f2(); // AMA: Free any unused chunks

} Figure 6.2: Simple AMA Example. The code presents a simple example of how AMA is used. In the unmodified case, routine f1() calls f2(), which calls f3(), each of which allocate some memory (and perhaps incorrectly handle their failure). With AMA, f1() pre-allocates the full amount needed; subsequent calls to allocate memory are transparently redirected to use the pre-allocated chunks instead of calling into the real allocators, and any remaining memory is freed.

Second, because allocation failure can only happen in only one place in the code (at the top), it is easy to provide a unified handling mechanism; for example, if the call to pre-allocate memory fails, the developer could decide to immediately return a failure, retry, or perhaps implement a more sophisticated exponential backoff- and-retry approach, all excellent examples of the shallow recovery AMA enables. Third, very little code change is required; except for the calls to pre-allocate and perhaps free unused memory, the bulk of the code remains unmodified, as the run- time transparently redirects downstream allocation requests to use the pre-allocated pool.

Unfortunately, code in real systems is not as simple as that found in the figure, and indeed, the problem of determining how much memory needs to be allocated given an entry point into a complex code base is generally undecidable. Thus, the bulk of our challenge is transforming the code and gaining certainty that we have done so correctly and efficiently. To gain a better understanding of the problem, we must choose a subsystem to focus upon, and transform it to use AMA. 113 void ext2 init block alloc info(struct inode *inode) { struct ext2 inode info *ei = EXT2 I(inode); struct ext2 block alloc info *block i = ei i block alloc info; → block i = kmalloc(sizeof(*block i), GFP NOFS); ... } Figure 6.3: A Simple Call. This code is a good example of a simple call in the ext2 file system. The memory allocation routine kmalloc allocates an object (block i) that is equal to the size of ext2 block alloc info structure. This size is determined at the compile time and does not change across invocations.

6.3.1 A Case Study: Linux ext2-mfr

The case study we use is the Linux ext2 file system. Although simpler than its modern journaling cousins (i.e., ext3 and ext4), ext2 is a real file system and has enough complex memory-allocation behavior (as described below) to demonstrate the intricacies of developing AMA for a real kernel subsystem. We describe our effort to transform the Linux ext2 file system into a memory- robust version of itself, which we call Linux ext2-mfr (i.e., a version of ext2 that is Memory-Failure Robust). In our current implementation, the transformation re- quires some human effort and is aided by a static analysis tool that we have de- veloped. The process could be further automated, thus easing the development of other memory-robust file systems; we leave such efforts to future work. We now highlight the various types of allocation requests that are made, from simpler to more complex. By doing so, we are showing what work needs to be done to be able to correctly pre-allocate memory before calling into ext2 routines, and thus shedding light on the types of difficulties we encountered during the trans- formation process.

Simple Calls

Most of the memory-allocation calls made by the kernel are of a fixed size. Allo- cating file system objects such as dentry, file, inode, and page have pre-determined sizes. For example, file systems often maintain a cache of inode objects, and thus must have memory allocated for them before being read from disk. Figure 6.3 shows one example of such a call from ext2. 114 struct dentry *d alloc(..., struct qstr *name) ... { if (name len > DNAME INLINE LEN-1) dname =→ kmalloc(name len + 1, GFP{ KERNEL); if (!dname) → return NULL; ... } } Figure 6.4: A Parameterized and Conditional Call. This code represents a sim- plified version of the dentry allocation function, which is a good example of both parameterized and conditional call. The size of the object (dname) that needs to be allocated depends on the input parameter (name) and the allocation will only happen if the condition (name len > DNAME INLINE LEN-1) holds true. →

Parameterized and Conditional Calls Some allocated objects have variable lengths (e.g., a file name and extended at- tributes) and the exact size of the allocation is determined at run-time; sometimes allocations are not performed due to conditionals. Figure 6.4 shows how ext2 al- locates memory for a directory entry, which uses a length field (plus one for the end-of-string marker) to request the proper amount of memory. This allocation is only performed if the name is too long and requires more space to hold it.

Loops In many cases, file systems allocate objects inside a loop or inside nested loops. In ext2, the upper bound of the loop execution is determined by the object passed to the individual calls. For example, allocating pages to search for directory entries are done inside a loop. Another good example is searching for a free block within the block bitmaps of the file system. Figure 6.5 shows the page allocation code during directory lookups in ext2.

Function Calls Of course, a file system is spread across many functions, and hence any attempt to understand the total memory allocation of a call graph given an entry point must be able to follow all such paths, sometimes into other major kernel subsystems. For example, one memory allocation request in ext2 is invoked 21 calls deep; this 115 ext2 find entry (struct inode * dir, ...)

{ unsigned long npages = dir pages(dir); unsigned long n = 0; do page{ = ext2 get page(dir, n,..); // allocate a page ... if (ext2 match entry (...)); goto found; ... n++; while (n != npages); // worst case: n = npages found:} return entry; } Figure 6.5: Loop Calls. This is a code snippet from the ext2 file system that belongs to the directory entry search function. In the worst case, the number of pages that need to be allocated before the entry is found (if present) depends on the size of the directory (dir) that is being searched. example path starts at sys open, traverses through some link-traversal and lookup code, and ends with a call to kmem cache alloc.

Recursions

A final example of an in-kernel memory allocation is one that is performed within a recursive call. Some portions of file systems are naturally recursive (e.g., path- name traversal), and thus perhaps it is no surprise that recursion is commonplace. Figure 6.6 shows the block-freeing code that is called when a file is truncated or removed in ext2; in the example, ext2 free branches calls itself to recurse down indirect-block chains and free blocks as need be.

6.3.2 Summary To be able to pre-allocate enough memory for a call, one must handle parameter- ized calls, conditionals, loops, function calls, and recursion. If file systems only contained simple allocations and minimal amounts of code, pre-allocation would be rather straightforward. 116 static void ext2 free branches(struct inode *inode, ..., int depth) if (depth--) { ... { // allocate a page and buffer head bh = sb bread(inode i sb, ..); ... → ext2 free branches(inode, ( le32*) bh b data, → ( le32*) bh b data + addr per block, depth); → else } ext2 free data(inode, ...);

} Figure 6.6: Recursion. This code snippet is an example of memory allocation invocation inside a recursion. The function shown here is the ext2 free branches, which frees any triple-, double-, and single-indirect blocks associated with a data block that is being freed in the ext2 file system. The termination condition for the recursion is the depth of the branch (i.e., the input parameter depth), which also determines the number of buffer heads and pages that need to be pre-allocated to avoid any allocation failures while executing this function.

6.4 Static Transformation We now present the static-analysis portion of AMA, in which we develop a tool, the AMAlyzer, to help decide how much memory to pre-allocate at each entry point into the kernel subsystem that is being transformed (in this case, Linux ext2). The AM- Alyzer takes in the entire relevant call graph and produces a skeletal version, from which the developer can derive the proper pre-allocation amounts. After describing the tool, we present two novel optimizations we employ, cache peeking and page recycling, to reduce memory demands. We end the section with a discussion of the limits of our current approach. We build the AMAlyzer on top of CIL [122], a tool that allows us to read- ily analyze kernel source code. CIL does not resolve function pointers automati- cally, which we require for our complete call graph, and hence we perform a small amount of extra work to ensure we cover all calls made in the context of the file system; because of the limited and stylized use of function pointers within the ker- nel, this process is straightforward. The AMAlyzer in its current form is comprised of a few thousand lines of OCaml code. 117

6.4.1 The AMAlyzer

We now describe the AMAlyzer in more detail. AMA consists of two phases. In the first phase, the tool searches through the entire subsystem to construct the allocation-relevant call graph (i.e., the complete set of downstream functions that contain kernel memory-allocation requests). In the second phase, a more complex analysis determines which variables and state are relevant to allocation calls, and prunes away other irrelevant code. The result is a skeletal form of the subsystem in question, from which the pre-allocation amounts are readily derived.

Phase 1: Allocation-Relevant Call Graph

We start our analysis with the entire call graph (shown in Figure 6.7). The relevant portion of the call graph for ext2 (and all related components of the kernel) con- tains nearly 2000 nodes (one per relevant function) and roughly 7000 edges (calls between functions) representing roughly 180,000 lines of kernel source code. Even for a relatively-simple file system such as ext2, the task of manually computing the pre-allocation amount would be daunting, without automated assistance. The first step of our analysis prunes the entire call graph and generates what we refer to as the allocation-relevant call graph (ARCG). The ARCG contains only nodes and edges in which a memory allocation occurs, either within a node of the graph or somewhere downstream of it. We perform a Depth First Search (DFS) on the call graph to generate ARCG. An additional attribute namely calls memory allocation (CMA), is added to each node (i.e., function) in the call graph to speed up the ARCG generation. The CMA attribute is set on two occasions. First, when a memory allocation routine is en- countered during the DFS. Second, a node has its CMA set if at least one of the node’s children has its CMA attribute set. At the end of the DFS, the functions that do not have the CMA attribute set are safely deleted from the call graph. The remaining nodes in the call graph constitute the ARCG. Figure 6.8 shows the ARCG for the ext2 file system.

Phase 2: Loops and Recursion

At this point, the tool has reduced the number of functions that must be examined. In this part of the analysis, we add logic to handle loops and recursions, and where possible, to help identify their termination conditions. The AMAlyzer searches for all for, while, and goto-based loops, and walks through each function within such a loop to find either direct calls to kernel memory allocators or indirect calls 118

override_creds revert_creds prepare_creds

audit_putname

__audit_inode

strcpy

in_gate_area

acct_auto_close_mnt

sync_quota_sb

put_device kobject_put

copy_to_user strncpy_from_user __audit_getname __audit_inode_child _atomic_dec_and_lock

cap_inode_setxattr

strncmp cap_inode_removexattr inotify_inode_is_dead fsnotify inotify_inode_queue_event vfs_dq_transfer cap_inode_need_killpriv cap_inode_killpriv

inotify_d_instantiate inotify_dentry_parent_queue_event __fsnotify_parent timespec_trunc

inotify_d_move

strlen down_read down_write up_write inotify_unmount_inodes fsnotify_unmount_inodes

yield

__delayacct_freepages_end __delayacct_freepages_start

prop_fraction_percpu radix_tree_next_hole current_fs_time

prop_fraction_single

radix_tree_lookup

untrack_pfn_vma

__builtin_memcmp

kmap kunmap huge_pte_alloc

schedule_timeout

copy_from_user __put_user_bad __copy_to_user_ll

follow_huge_addr pud_huge pmd_huge follow_huge_pmd hugetlb_get_quota hugetlb_put_quota get_gate_vma

prio_tree_next __cond_resched_lock ptep_clear_flush_young send_sig up_read down_read_trylock mutex_unlock vfs_dq_drop mutex_lock put_pid ptep_set_access_flags cap_file_mmap

__fsnotify_inode_delete __init_rwsem __mutex_init kmap_atomic huge_pmd_unshare remove_wait_queue huge_pte_offset kunmap_atomic pte_alloc_one

__builtin_return_address ida_remove radix_tree_tagged radix_tree_lookup_slot printk_ratelimit radix_tree_prev_hole radix_tree_gang_lookup_tag_slot radix_tree_gang_lookup_slot

__builtin_extract_return_addr __print_symbol __wake_up_bit radix_tree_preload get_random_bytes radix_tree_insert schedule_work __get_user_bad __copy_from_user_ll_nozero

memmove radix_tree_delete in_group_p capable __put_cred

panic __builtin_va_end vprintk __builtin_va_start call_rcu

memscan rb_prev get_seconds

radix_tree_tag_clear wait_for_completion find_next_zero_bit rb_insert_color rb_erase

wake_up_bit __prop_inc_percpu_max radix_tree_tag_set

warn_slowpath_null cap_vm_enough_memory

strcmp out_of_line_wait_on_bit out_of_line_wait_on_bit_lock

bit_waitqueue schedule __wait_on_bit wake_up_process __wait_on_bit_lock mempool_create vmalloc_32

__prop_inc_single prepare_to_wait finish_wait __wake_up io_schedule_timeout _cond_resched io_schedule

bdevname

dump_stack add_taint rb_next rb_first printk warn_slowpath_fmt

__bad_percpu_size __init_waitqueue_head

__builtin_expect

Figure 6.7: The ext2 Call Graph. The figure plots the call graph for the ext2 file system. The gray nodes at the top represent entry points into the file system (such as sys open()), and the black dots at the bottom are different memory allocators (such as kmalloc()). The dots in the middle represent functions in ext2 and the kernel, and lines between such dots (i.e., functions) indicate a function call. 119

Figure 6.8: The ext2 Allocation-Relevant Call Graph. The figure plots the ARCG for the ext2 file system. The larger gray nodes represent entry points into the file system, and the black nodes are different memory allocators called by ext2 either directly or indirectly. The smaller gray nodes represent functions in ext2 and the kernel, and dotted lines between such functions indicate the presence of a loop. 120 through other routines. To identify goto-based loops, AMA uses the line num- bers of the labels that the goto statements point to. To identify both recursions and function-call based loops, AMA performs a DFS on the ARCG and for every function encountered during the search, it checks if the function has been explored before. Once these loops are identified, the tool searches for and outputs the ex- pressions that affect termination.

Phase 3: Slicing and Backtracking The goal of this next step is to perform a bottom-up crawl of the graph, and produce a minimized call graph with only the memory-relevant code left therein. We use a form of backward slicing [190] to achieve this end. In our current prototype, the AMAlyzer only performs a bottom-up crawl until the beginning of each function. In other words, the slicing is done at the function level and developer involvement is required to perform backtracking. To backtrack until the beginning of a system call, the developer has to manually use the output of slicing for each function (including the dependent input variables that affect the allocation size/count) and invoke the slicing routine on its caller functions. The caller functions are identified using the ARCG. 6.4.2 AMAlyzer Summary As we mentioned above, the final output is a skeletal graph which can be used by the developer to arrive at the final pre-allocations with the help of slicing support in the AMAlyzer. For ext2-mfr, the reduction in code is dramatic: from nearly 200,000 lines of code across 2000 functions (7000 function calls) down to less than 9,000 lines across 300 functions (400 function calls), with all relevant vari- ables highlighted. Arriving upon the final pre-allocation amounts then becomes a straightforward process. Table 6.3 summarizes the results of our efforts. In the table, we present the pa- rameterized memory amounts that must be pre-allocated for the 13 most-relevant entry points into the file system. From the table, we can see that number of allo- cated objects depend on the input paramenters (such as NameLength), format-time parameters (such as Worst(Bitmap)), and size of files or directories that are cur- rently being accessed (such as Size(ParentDir)). 121 (132 : the ) Filp cache are ) + + inode ( ) Dentry f kmem Depth + allocinfo : the number of blocks ) ) BufferHead NameLength NameLength () + block, + + block : the number of pages in the ) Count 2 (128 bytes), and PageSize : ) ext + truncate × ( Dentry Dentry inode Filp + )) ( PageSize BufferHead BufferHead ( + + + + × : the number of bitmap blocks that needs sizeof ) Size )) Inode PageSize Inode Inode Bitmap )+ + ( × BufferHead )+ )+ (476 bytes), + Bitmap PageSize PageSize Indirect ( ( ( ( Worst )+4) × × )+ )) )) Inode lar inode, and Worst Worst BufferHead BufferHead PageSize ( + The table shows the worst-case memory requirements of the BufferHead BufferHead Inode Inode Inode × )+ ( ( ( × + + )) )) ParentInode ( (52 bytes), Depth Depth Depth Bitmap ( PageSize ( Bitmap Depth ( Indirect PageSize PageSize × ( ( ( )) Worst × × cache, kmalloc, and page allocations. The following types o () + (4 + () + ( () + (1 + () + (3 + BufferHead Worst + )) )) × Worst )+ BufferHead Bitmap )+ ( )+8) Inode lookuphash lookuphash lookuphash lookuphash NamesCache NamesCache NamesCache NamesCache ( : the number of indirect blocks to be read for that particular ParentDir ParentDir ) ( ( Worst Bitmap Inode ( ( + + : the number of read-ahead blocks, () + () + () + () + () + () + () + () + Size Size Depth is constant at 4096 bytes. The other terms used above include (4096 bytes), re-allocation required Worst count count Depth Indirect P ( (1 + NamesCache (1 + lookup (5 + ( ( lookup ( lookup lookup lookup lookup lookup lookup ( ReadAhead P ageSize Worst Pre-Allocation Requirements for ext2-mfr. 
NamesCache ntry point syswrite() unlink() lookup() E truncate() sysread() chdir() chroot() statfs() lookuphash() mkdir() rmdir() access() sysopen() to be read, used: Table 6.3: various system calls in terms of the kmem read/written, bytes). The maximum number of indirect blocks to be read for that particu inode. 122

6.4.3 Optimizations

As we transformed ext2 into ext2-mfr, we noticed a number of opportunities for optimization, in which we could reduce the amount of memory pre-allocated along some paths. We now describe two novel optimizations.

Cache Peeking

The first optimization, cache peeking, can greatly reduce the amount of pre-allocated memory. An example is found in code paths that access a file block (such as a sys read()). To access a file block in a large file, it is possible that a triple- indirect, double-indirect, and indirect block, inode, and other blocks may need to be accessed to find the address of the desired block and read it from disk. With repeated access to a file, such blocks are likely to be in the page cache. However, the pre-allocation code must account for the worst case, and thus in the normal case must pre-allocate memory to potentially read those blocks. This pre- allocation is often a waste, as the blocks will be allocated, remain unused during the call, and then finally be freed by AMA. With cache peeking, the pre-allocation code performs a small amount of extra work to determine if the requisite pages are already in cache. If so, it pins them there and avoids the pre-allocation altogether; upon completion, the pages are un- pinned. The pin/unpin is required for this optimization to be safe. Without this step, it would be possible that a page gets evicted from the cache after the pre-allocation phase but before the use of the page, which would lead to an unexpected memory allocation request downstream. In this case, if the request then failed, AMA would not have served its function in ensuring that no downstream failures occur. Cache peeking works well in many instances as the cached data is accessible at the beginning of a system call and does not require any new memory allocations. Even if cache peeking requires additional memory, the memory allocation calls needed for cache peeking can be easily performed as part of the pre-allocation phase.

Page Recycling

A second optimization we came upon was the notion of page recycling. The idea for the optimization arose when we discovered that ext2 often uses far more pages than needed for certain tasks (such as file/directory truncates, searches on free/allocated entries inside block bitmaps and large directories). 123

For example, consider truncate. In order to truncate a file, one must read every indirect block (and double indirect block, and so forth) into memory to know which blocks to free. In ext2, each indirect block is read into memory and given its own page; the page holding an indirect block is quickly discarded, after ext2 has freed the blocks pointed to by that indirect block. To reduce this cost, we implement page recycling. With this approach, the pre- allocation phase allocates the minimal number of pages that need to be in memory during the operation. For a truncate, this number is proportional to the depth of the indirect-block tree, instead of the size of the entire tree. Instead of allocating thousands of blocks to truncate a file, we only allocate a few (for the triple-indirect, a double indirect, and an indirect block). When the code has finished freeing the current indirect block, we recycle that page for the next indirect block instead of adding the page back to the LRU page cache, and so forth. In this manner, substan- tial savings in memory is made possible.

6.4.4 Limitations and Discussion

We now discuss some of the limitations of our anticipatory approach. Not all pieces are yet automated; instead, the tool currently helps turn the intractable problem of examining 180,000 lines of code into a tractable one providing a lot of assistance in finding the correct pre-allocations. Further work is required in slicing and back- tracking to streamline this process, but is not the focus of our current effort: rather our goal here is to demonstrate the feasibility of the anticipatory approach. The anticipatory approach could fail requests in cases where normal execution would successfully complete. Normal execution need not always take the worst case (or longest) path. As a result, normal execution might be able to complete with fewer memory allocations than the anticipatory approach. In contrast, the anticipatory approach must always allocate memory for the worst case scenario, as it cannot afford to fail on a memory allocation call after the pre-allocation phase. Cache peeking can only be used when sufficient information is available at the time of allocation to determine if the required data is in the cache. For file systems, sufficient information is available at the beginning of a system call, which allows cache peeking to avoid pre-allocation with little implementation effort. More im- plementation effort could be required in other systems to determine if the required data is in its cache. 124

6.5 The AMA Run-Time

The final piece of AMA is the runtime component. There are two major pieces to consider. First is the pre-allocation itself, which is inserted at every relevant entry point in the kernel subsystem of interest, and the subsequent cleanup of pre- allocated memory. Second is the use of the pre-allocated memory, in which the run-time must transparently redirect allocation requests (such as kmalloc()) to use the pre-allocated memory. We discuss these in turn, and then present the other run-time decision a file system such as Linux ext2-mfr must make: what to do when a pre-allocation request fails? 6.5.1 Pre-allocating and Freeing Memory For pre-allocation, we require that file systems implement a single new VFS-level call, which we call vfs get mem requirements(). This call takes as argu- ments information about which call is about to be made, any relevant arguments about the current operation (such as the file position) and state of the file system, and then returns a structure to the caller (in this case, the VFS layer) that describes all of the necessary allocations that must take place. The structure is referred to as the anticipatory allocation description (AAD). The VFS layer unpacks the AAD, allocates memory chunks (perhaps using dif- ferent allocators) as need be, and links them into the task structure of the calling process for downstream use (described further below). With the pre-allocated mem- ory in place, the VFS layer then calls the desired routine (such as vfs read()), which then utilizes the pre-allocated memory during its execution. When the op- eration completes, a generic AMA cleanup routine is called to free any unused memory. To give a better sense of this code flow, we provide a simplified example from the read() system call code path in Figure 6.9. Without the AMA additions, the code simply looks up the current file position (i.e., where to read from next), calls into vfs read() to do the file-system-specific read, updates the file offset, and returns. As described in the original VFS paper [100], this code is generic across all file systems. With AMA, two extra steps are required, as shown in the figure. First, the VFS layer checks if the underlying file system is using AMA, and if so, calls the file system’s vfs get mem requirements()routine to determine the pending call’s memory requirements. The VFS layer then allocates the needed memory using the AAD returned from the vfs get mem requirements(). All of this work is neatly encapsulated by the AMA CHECK AND ALLOCATE() call in the figure. 125

SYSCALL DEFINE3(read, ...) ... { loff t pos = file pos read(file); err = AMA CHECK AND ALLOCATE(file, AMASYS READ, pos, count); if (err) ... ret = vfs read(file, buf, count, &pos); file pos write(file, pos); ... AMA CLEANUP(); } int AMA CHECK AND ALLOCATE(..., int syscallno, ...) { struct relevant arguments *ra; struct anticipatory allocation description *aad; ... err = sb->vfsget mem requirements(syscallno, ra, aad); if (err) return err; else return allocatememory(aad);

} Figure 6.9: A VFS Read Example. This code snippet shows how pre-allocation happens during a read system call. Pre-allocation happens at the beginning of a system call and the call continues executing only if the pre-allocation (i.e., the AMA CHECK AND ALLOCATE function) succeeds.

Second, after the call is complete, a cleanup routine AMA CLEANUP() is called. This call is required because the AMAlyzer provides us with a worst-case estimate of possible memory usage, and hence not all pre-allocated memory is used during the course of a typical call into the file system. In order to free this unused memory, the extra call to AMA CLEANUP() is made.

6.5.2 Using Pre-allocated Memory Central to our implementation is transparency; we do not change the specific file system (ext2) or other kernel code to explicitly use or free pre-allocated mem- ory. File systems and the rest of the kernel thus continue to use regular memory- allocation routines and pre-allocated memory is returned back during such alloca- tion calls. 126

To support this transparency, we modified each of the kernel allocation routines as follows. Specifically, when a process calls into ext2-mfr, the pre-allocation code (in AMA CHECK AND ALLOCATE() above) sets a new flag within the per-task task structure. This anticipatory flag is then checked upon each entry into any kernel memory-allocation routine. If the flag is set, the routine attempts to use pre-allocated memory and returns one of the pre-allocated chunks; if the flag is not set, the normal allocation code is executed (and failure is a possibility). Calls to kfree() and other memory-releasing routines operate as normal, and thus we leave those unchanged. Allocation requests are matched with the pre-allocated objects using the pa- rameters passed to the allocation call at runtime. The parameters passed to the allocation call are size, order (or the cachep pointer), and the GFP flag. The type of the desired memory object is inferred through the invocation of the allocation call at runtime. The size (for kmalloc and vmalloc) or order (for alloc pages) helps to exactly match the allocation request with the pre-allocated object. For cache objects, the cachep pointer help identify the correct pre-allocated object. One small complication arises during interrupt handling. Specifically, we do not wish to redirect memory allocation requests to use pre-allocated memory when requested by interrupt-handling code. Thus, when interrupted, we take care to save the anticipatory flag of the currently-running process and restore it when the interrupt handling is complete.

6.5.3 What If Pre-Allocation Fails?

Adding pre-allocation into the code raises a new policy question: how should the code handle the failure of the pre-allocation itself? We believe there are a number of different policy alternatives, which we now describe:

Fail-immediate. This policy immediately returns an error to the caller (such • as ENOMEM).

Retry-forever (with back-off). This policy simply keeps retrying forever, • perhaps inserting a delay of some kind (e.g., exponential) between retry re- quests to reduce the load on the system and control better the load on the memory system. Also, such delays could help make progress in a heavily contended system.

Retry-alternate (with back-off). This form of retry also requests memory • again, but uses an alternate code path that uses less memory than the original 127

through page/memory recycling and thus is more likely to succeed. This retry can also back-off as need be.

Using AMA to implement these policies is superior to the existing approach, as it enables shallow recovery, immediately upon entry into the subsystem. For example, consider the fail-immediate option above. Clearly this policy could be implemented in the traditional system without AMA, but in our opinion doing so is prohibitively complex. To do so, one would have to ensure that the failure was propagated correctly all the way through the many layers of the file system code, which is difficult [71, 151]. Further, any locks acquired or other state changes made would have to be undone. Deep recovery is difficult and error-prone; shallow re- covery is the opposite. Another benefit that the shallow recovery of AMA permits is a unified pol- icy [69]. The policy, whether failing immediately, retrying, or some combination, is specified in one or a few places in the code. Thus, the developer can easily decide how the system should handle such a failure and be confident that the implementa- tion meets that desire. A third benefit of our approach: file systems could expose some control over the policy to applications [7, 51]. Whereas most applications may not be prepared to handle such a failure, a more savvy application (such as a file server or database) could set the file system to fail-fast and thus enable better control over failure han- dling. Pre-allocation failure is not a panacea, however. Depending on the installation and environment, the code that handles pre-allocation failures will possibly run quite rarely, and thus may not be as robust as normal-case code. Although we believe this to be less of a concern for pre-allocation recovery code (because it is small, simple, and usually correct “by inspection”), further efforts could be applied to harden this code. For example, some have suggested constant “fire drilling” [27] as a way to ensure operators are prepared to handle failures; similarly, one could regularly fail kernel subsystems (such as memory allocators) to ensure that this recovery code is run.

6.6 Analysis

We now analyze Linux ext2-mfr. We measure its robustness under memory-allocation failure, as well as its baseline performance. We further study its space overheads, exploring cases where our estimates of memory-allocation needs could be overly conservative, and whether the optimizations introduced earlier are effective in re- 128

Process State File-System State Error Abort Unusable Inconsistent ext2-mfr10 0 0 0 0 ext2-mfr50 0 0 0 0 ext2-mfr99 0 0 0 0

Table 6.4: Fault Injection Results: Retry. The table shows the reaction of the Linux ext2-mfr file system to memory failures as the probability of a failure in- creases. The file system uses a “retry-forever” policy to handle each failure. A detailed description of the experiment is found in Table 6.2. ducing these overheads. All experiments were performed on a 2.2 GHz Opteron processor, with two 80GB WDC disks, 2GB of memory, running Linux 2.6.32.

6.6.1 Robustness

Our first experiment with ext2-mfr reprises our earlier fault injection study found in Table 6.2. In this experiment, we set the probability that the memory-allocation routines will fail to 10%, 50%, and 99%, and observe how ext2-mfr behaves both in terms of how processes were affected as well as the overall file-system state. For this experiment, the retry-forever (without any back-off) policy is used. Table 6.4 reports our results. As one can see from the table, ext2-mfr is highly robust to memory allocation failure. Even when 99 out of 100 memory-allocation calls fail, ext2-mfr is able to retry and eventually make progress. This application never notices that the failures are occurring, and file system usability and state remain intact.

6.6.2 Performance

In our next experiment, we study the performance overheads of using AMA. We utilize both simple microbenchmarks as well as application-level tests to gauge the overheads incurred in ext2-mfr due to the extra work of memory pre-allocation and cleanup. Table 6.5 presents the results of our study. From the table, we can see that the performance of our relatively-untuned prototype is excellent across both microbenchmarks as well as application-level workloads. In all cases, the extra work done by the AMA runtime to pre-allocate memory, redirect allocation requests transparently, and subsequently free unused 129

ext2 ext2-mfr Overhead Workload (secs) (secs) (%) Sequential Write 13.46 13.69 1.71 Sequential Read 9.04 9.05 0.11 Random Writes 11.58 11.67 0.78 Random Reads 146.33 151.03 3.21 Sort 129.64 136.50 5.29 OpenSSH 48.30 49.80 3.11 PostMark 55.90 59.60 6.62

Table 6.5: Baseline Performance. The baseline performance of ext2 and ext2-mfr (without optimizations) are compared. The first four tests are microbenchmarks: sequential read and write either read or write 1-GB file in its entirety; random read and write read or write 100 MB of data over a 1-GB file. Note that random-write performance is good because the writes are buffered and thus can be scheduled when written to disk. The three application-level benchmarks: are a command-line sort of a 100MB text file; the OpenSSH benchmark which copies, untars, configures, and builds the OpenSSH 4.5.1 source code; and the PostMark benchmark run for 60,000 transactions over 3000 files (from 4KB to 4MB) with 50/50 read/append and create/delete biases. All times are reported in seconds, and are stable across repeated runs. memory has a minimal cost. With further streamlining, we feel confident that the overheads could be reduced even further.

6.6.3 Space Overheads and Cache Peeking We now study the space overheads of ext2-mfr, both with and without our cache- peeking optimization. The largest concern we have about conservative pre-allocation is that excess memory may be allocated and then freed; although we have shown there is little time overhead involved (Table 6.5), the extra space requested could in- duce further memory pressure on the system, (ironically) making allocation failure more likely to occur. We run the same set of microbenchmarks and application- level workloads, and record information about how much memory was allocated for both ext2 and ext2-mfr; we also turn on and off cache-peeking for ext2-mfr. Table 6.6 presents our results. From the table, we make a number of observations. First, our unoptimized ext2- mfr does indeed conservatively pre-allocate a noticeable amount more memory than needed in some cases. For example, during a sequential read of a 1 GB file, normal 130

ext2-mfr ext2 ext2-mfr (+peek) Workload (GB) (GB) (GB) Sequential Write 1.01 1.01 (1.00x) 1.01 (1.00x) Sequential Read 1.00 6.89 (6.87x) 1.00 (1.00x) Random Write 0.10 0.10 (1.05x) 0.10 (1.00x) Random Read 0.26 0.63 (2.41x) 0.28 (1.08x) Sort 0.10 0.10 (1.00x) 0.10 (1.00x) OpenSSH 0.02 1.56 (63.29x) 0.07 (3.50x) PostMark 3.15 5.88 (1.87x) 3.28 (1.04x)

Table 6.6: Space Overheads. The total amount of memory allocated for both ext2 and ext2-mfr is shown. The workloads are identical to those described in the caption of Table 6.5. ext2 allocates roughly 1 GB (mostly to hold the data pages), whereas unoptimized ext2-mfr allocates nearly seven times that amount. The file is being read one 4-KB block at a time, which means on average, the normal scan allocates one block per read whereas ext2-mfr allocates seven. The reason for these excess pre-allocations is simple: when reading a block from a large file, it is possible that one would have to read in a double-indirect block, indirect block, and so forth. However, as those blocks are already in cache for these reads, the conservative pre-allocation performs a great deal of unnecessary work, allocating space for these blocks and then freeing them immediately after each read completes; the excess pages are not needed. With cache peeking enabled, the pre-allocation space overheads improve sig- nificantly, as virtually all blocks that are in cache need not be allocated. Cache peeking clearly makes the pre-allocation quite space-effective. The only workload which does not approach the minimum is OpenSSH. OpenSSH, however, places small demand on the memory system in general and hence is not of great concern.

6.6.4 Page Recycling We also study the benefits of page recycling. In this experiment, we investigate the memory overheads that arise during truncate. Figure 6.10 plots the results. 131

8MB ext2-mfr ext2-mfr (+recycle) 1MB ext2

100KB

10KB Log (Mem Allocated)

10KB 10MB 4GB Log (File Size)

Figure 6.10: Space Costs with Page Recycling. The figure shows the measured space overheads of page recycling during the truncate of a file. The file size is varied along the x-axis, and the space cost is plotted on the y-axis (both are log scales).

In the figure, we compare the space overheads of standard ext2, ext2-mfr (with- out cache peeking or page recycling), and ext2-mfr with page recycling (without cache peeking). As one can see from the figure, as the file system grows, the space overheads of both ext2 and ext2-mfr converge, as numerous pages are allocated for indirect blocks. Page recycling obviates the need for these blocks, and thus uses many fewer pages than even standard ext2.

6.6.5 Conservative Pre-allocation We also were interested in whether, despite our best efforts, ext2-mfr ever under- allocated memory in the pre-allocation phase. Thus, we ran our same set of work- loads (i.e., all performance benchmarks) and checked for the same. To identify under-allocated memory, we add a check inside each memory allocation function (such as kmalloc); the check reports an error when objects are not found in the pre-allocated pool during a file-system request. In no run during these experiments 132 and other stress-tests did we ever encounter an under-allocation, giving us further confidence that our static transformation of ext2 was properly done.

6.6.6 Policy Alternatives We also were interested in seeing how hard it is to use a different policy to react to allocation failures. Table 6.7 shows the results of our fault-injection experiment, but this time with a “fail-fast” policy which immediately returns to the user should the pre-allocation attempt fail.

             Process State        File-System State
             Error    Abort       Unusable    Inconsistent
ext2-mfr10   15       0           0           0
ext2-mfr50   15       0           0           0
ext2-mfr99   15       0           0           0

Table 6.7: Fault Injection Results: Fail-Fast. The table shows the reaction of Linux ext2-mfr using a fail-fast policy. A detailed description of the experiment is found in Table 6.2.

The results show the expected outcome. In this case, the process running the workload immediately receives the ENOMEM error code; the file system remains consistent and usable. By changing only a few lines of code, an entirely different failure-handling behavior can be realized.
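As an illustration of how little code such a policy switch involves, the sketch below contrasts the two policies at the single point where a pre-allocation failure is handled. This is a hedged, user-space rendition with hypothetical names (enter_fs_request, try_preallocate); it is not the kernel code itself.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

enum ama_policy { RETRY_FOREVER, FAIL_FAST };

static void *prealloc_buf;    /* worst-case reservation for this request */

static bool try_preallocate(size_t bytes)
{
    prealloc_buf = malloc(bytes);
    return prealloc_buf != NULL;
}

/* Entry point of a request: reserve the worst case up front, reacting to
 * a reservation failure according to the chosen policy. */
static int enter_fs_request(enum ama_policy policy, size_t worst_case)
{
    while (!try_preallocate(worst_case)) {
        if (policy == FAIL_FAST)
            return -ENOMEM;   /* hand ENOMEM straight back to the caller */
        sleep(1);             /* retry forever: back off, then try again */
    }
    return 0;                 /* downstream allocations cannot now fail  */
}

int main(void)
{
    int err = enter_fs_request(FAIL_FAST, 1 << 20);
    printf("fail-fast request: %s\n", err ? "ENOMEM returned" : "admitted");
    free(prealloc_buf);
    return 0;
}

Switching between retry-forever and fail-fast amounts to changing which branch is taken inside the loop, which matches the observation that only a few lines of code separate the two behaviors.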

6.7 Summary

It is common sense in the world of programming that code that is rarely run rarely works. Unfortunately, some of the most important code in systems falls into this category, including any code that is run during a "recovery". If the problem that leads to the recovery code being enacted is rare enough, the recovery code itself is unlikely to be battle tested, and is thus prone to failure. In this chapter, we presented Anticipatory Memory Allocation (AMA), a new approach to avoiding memory-allocation failures deep within the kernel. By pre-allocating the worst-case allocation immediately upon entry into the kernel, AMA ensures that requests further downstream will never fail, in those places within the code where handling failure has proven difficult over the years. The small bits of recovery code that are scattered throughout the code need never run, and system robustness is improved by design.

As we build increasingly complex systems, we should consider new methods and approaches that help build robustness into the system by design. AMA presents one method (early resource allocation) to handle one problem (memory-allocation failure), but we believe that the approach could be applied more generally. We believe that the only true manner in which to have working recovery code is to have none at all.

Chapter 7

Related Work

Reliability has been a major focus of computer systems designers since the early days of computing. Initially, hardware failures were orders of magnitude more frequent than they are today [46, 185]. Due to the advances in hardware reliability mechanisms, software has now become the dominant source of system failures [66]. In this thesis, we developed recovery techniques to improve the reliability of file systems through restartability and resource reservation. We now look at previous work that has similar goals of restartability and resource reservation and discuss how our techniques differ from it. The rest of the chapter is organized as follows. First, in Section 7.1, we review previous work on improving reliability by restarting components on failures. We then look at previous work on improving reliability through failure avoidance via pre-allocation of resources in Section 7.2.

7.1 Reliability Through Restartability

Restarting components on failures has been a popular method to survive software failure. We now look at solutions designed to restart kernel- and user-level components on failures.

7.1.1 Restartable OS Components

We now discuss previous systems designed to increase operating system fault resilience via a restartable approach. We classify previous approaches along two axes: overhead and statefulness. We classify fault isolation techniques that incur little overhead as lightweight and more costly mechanisms as heavyweight.

Heavyweight mechanisms are not likely to be adopted by file systems, which have been tuned for high performance and scalability [23, 82, 170], especially when used in server environments. We also classify techniques based on how much system state they are designed to recover after failure. Techniques that assume the failed component has little in-memory state are referred to as stateless; most recovery techniques are stateless. Techniques that can handle components with in-memory and persistent storage are stateful; when recovering from file-system failure, stateful techniques are required. We now discuss the previous systems in detail. The renaissance in building isolated OS subsystems is found in Swift et al.'s work on Nooks and subsequently shadow drivers [171, 172]. In Nooks, the authors use memory-management hardware to build an isolation boundary around device drivers; not surprisingly, such techniques incur high overheads [171]. The shadow driver work shows how recovery can be transparently achieved by restarting failed drivers and diverting clients by passing them error codes and related tricks. However, such recovery is relatively straightforward: only a simple reinitialization must occur before reintegrating the restarted driver into the OS; such an approach cannot be directly applied to file systems. Device driver isolation can also be obtained through Virtual Machine Monitors (VMMs) [52, 104]. In Xen, Fraser et al. show that it is possible to isolate and share device drivers across operating systems by running device drivers in separate virtual machines [52]. In L4, LeVasseur et al. went a step further and showed that it is possible to isolate and still run unmodified drivers in their original operating systems in a virtual machine [104]. In both approaches, communication to the device driver happens through the VMM. In the event of an error, only the virtual machine running the buggy device driver is affected. To restart a failed driver, the VMM starts a new virtual machine and initializes the failed device driver to a predefined state. The disadvantages of a VMM-based approach are that one needs to run separate instances of virtual machines for each device driver, the restart mechanism is stateless, and isolation and data copying costs are high. Isolation and restart of buggy device drivers have also been performed using microkernel-based approaches [79, 80, 88, 103, 194]. In Minix, the device driver is run as a user-mode process by encapsulating it in a private address space that is protected by the MMU hardware. Faults in a device driver do not impact other operating system components, as the driver is run in a separate address space. Upon a driver failure, Minix simply reincarnates the failed driver to service subsequent requests. Moreover, in subsequent work, Herder et al. also implemented parameterized policy scripts to provide flexibility in the way drivers are restarted after a failure [80].

Nexus is another microkernel-based approach, where drivers are executed as user-level processes [194]. Nexus uses a global, trusted reference validation mechanism [4], a device safety specification, and a device-specific reference monitor. In Nexus, the safety specification is compiled with the reference monitor. Upon a transition from (or to) the driver, the reference monitor leverages the safety specifications to check driver interactions for permissible and normal behavior. CuriOS, a recent microkernel-based operating system, also aims to be resilient to subsystem failure [42]. CuriOS achieves this end through classic microkernel techniques (i.e., address-space boundaries between servers) with an additional twist: instead of storing session state inside a service, CuriOS places such state in an additional protection domain where it can remain safe from a buggy service. However, the added protection is expensive. Frequent kernel crossings, as would be common for file systems in data-intensive environments, would dominate performance [141]. As far as we can discern, CuriOS represents one of the few systems that attempt to provide failure resilience for more stateful services such as file systems. In the paper there is a brief description of an "ext2 implementation"; unfortunately it is difficult to understand exactly how sophisticated this file service is or how much work is required to recover from failures. In summary, the advantage of microkernel-based approaches is that the shared state between drivers and the operating system is small; such small state simplifies recovery to a large extent. The drawback of microkernel-based approaches is that they work well only for stateless systems such as network, block, and character drivers and cannot be applied to stateful components such as file systems. Moreover, most commodity operating systems are monolithic kernels and not microkernels [28, 171]. SafeDrive takes a different approach to fault resilience [200]. Instead of address-space based protection, SafeDrive automatically adds assertions into device drivers. When an assertion is triggered (e.g., due to a null pointer or an out-of-bounds index variable), SafeDrive enacts a recovery process that restarts the driver and thus survives the would-be failure. Because the assertions are added in a C-to-C translation pass and the final driver code is produced through the compilation of this code, SafeDrive is lightweight and induces relatively low overheads. However, the SafeDrive recovery machinery does not handle stateful subsystems; as a result the driver will be in an initial state after recovery. Thus, while currently well-suited for a certain class of device drivers, SafeDrive recovery also cannot be applied directly to file systems. EROS is a capability-based operating system designed to support the security and reliability needs of active systems [156]. EROS provides restartability through three principles: no kernel allocation, atomicity of operations, and a stateless kernel. No kernel allocation ensures that the OS does not explicitly allocate or free resources. Atomicity of operations guarantees that all operations either complete in bounded time or are not executed at all. Finally, the stateless kernel ensures that all of the kernel state resides in user-allocated storage. In EROS, a checkpoint of the entire system is taken periodically, which can be used in the event of a failure. The advantage of this approach is that the entire system can be restored upon failure. This approach has two major drawbacks: first, one needs to rewrite the entire operating system and other components to adhere to the design principles of EROS; second, the overheads of checkpointing the entire operating-system state are prohibitively expensive for deployment in commodity systems.

             Heavyweight                               Lightweight
Stateless    Nooks/Shadow [171,172]∗, Xen [52],        SafeDrive [200]∗,
             Minix [79,80], L4 [104], Nexus [194]      Singularity [103]
Stateful     CuriOS [42], EROS [156]                   Membrane∗

Table 7.1: Summary of Approaches. The table performs a categorization of previous approaches that handle OS subsystem crashes. Approaches that use address spaces or full-system checkpoint/restart are too heavyweight; other language-based approaches may be lighter weight in nature but do not solve the stateful recovery problem as required by file systems. Finally, the table marks (with an asterisk) those systems that integrate well into existing operating systems, and thus do not require the widespread adoption of a new operating system or virtual machine to be successful in practice.

The results of our classification are presented in Table 7.1. From the table, we can see that many systems use methods that are simply too costly for file systems; placing address-space boundaries between the OS and the file system greatly increases the amount of data copying (or page remapping) that must occur and thus is untenable. We can also see that fewer lightweight techniques have been developed. Of those, we know of none that work for stateful subsystems such as file systems.

7.1.2 Restartable User-level Components

In the context of restarting user-level components, there have been significant advances in the past few decades. We now discuss the prior work on restarting user-level components and its relevance and applicability to stateful components such as user-level file systems. Simple, whole-program restart was proposed as a first attempt to handle software failures [65, 164]. Restart of an entire program helps programs survive failures caused by non-deterministic bugs. The advantage of these solutions is that the restart mechanism is simple and straightforward. The drawback of simple restart is that it is stateless and lossy; requests that arrive between a crash and a restart are discarded in these systems. Software rejuvenation, an alternative to whole-program restart, was proposed to reduce the downtime periods of services (or applications) [21, 60, 101]. Software rejuvenation is a proactive approach instead of the commonly used reactive approach. The key idea of software rejuvenation is to periodically rejuvenate (or restart) to a fresh state even if there are no failures, thus eliminating any residual or corrupt in-memory state. The goal of software rejuvenation is to handle corrupt states that could eventually result in resource leaks or deadlocks. Software rejuvenation is not applicable to user-level file systems, as the restart mechanism could result in lost updates and have noticeable downtime. Microreboot was proposed to avoid entire program restarts [30, 31, 129]. Microreboot is a fine-grained restart approach, where individual application components are selectively restarted on a failure. Microreboot accomplishes selective restart by separating process recovery from data recovery. The advantage of microreboot is that the restart times are significantly lower than an entire program restart. The microreboot mechanism has a few drawbacks: first, microreboot works only for stateless components where the application state needs to be recorded in specialized state stores; second, requests need to be idempotent without any side effects; finally, frequently-used resources should be leased. In other words, applications and services must be redesigned and rewritten to work with microreboot mechanisms. Many general checkpoint and restart mechanisms have been proposed in the past to survive failures [24, 47, 144]. The common approach taken in these solutions is to checkpoint the program state, roll back the program state on failure, and then re-execute the program after recovery. The checkpoint refers to a consistent state that the system can trust, and can be recorded on disk [34, 91, 105, 187], non-volatile memory [108], or in remote memory [3, 134, 201]. Additional support (such as logging) is needed to deal with messages and in-flight operations [20, 91, 109, 110]. The drawback of these approaches is that they are primarily designed to work with distributed systems, require application rewrites, or both; hence, these approaches are not applicable to stand-alone user-level file systems.

Rx, a checkpoint-restart mechanism, was proposed to tolerate application failures statefully [140]. Rx has goals similar to those of Re-FUSE but differs in its implementation. Rx checkpoints the process state using COW-based techniques and records the application-specific state (such as file handles). On an error, Rx restarts the crashed application and changes the environment (such as allocated memory) in the hope that the failure does not happen again. Though Rx statefully restores the in-memory state of user-level processes, it does not have any mechanism to restore the on-disk state of processes; this is left as future work. N-version programming is another popular approach to tolerating failures [9, 10, 15, 147]. The idea is very simple: different instances of the software are run concurrently within the same system. The diversity in the software implementations helps to avoid the same failure in all instances. As long as there are sufficient running instances to determine the majority, the system can continue operating even in the presence of failures. The advantage of this approach is that one need not build any checkpoint-restart or other heavyweight mechanisms. Recovery blocks is a variant of N-version programming where multiple versions of the same block exist [86, 143]. A block is a unit of execution, and a variant of a block is only executed if the original block encounters an error during its execution. Another approach related to N-version programming is the multi-process model, where the same application instance is run multiple times [175]. Unfortunately, N-version systems and their variants are too expensive to be deployed in the real world. Moreover, such systems have to pay significant performance and storage costs. In summary, there has been a great deal of work on restarting user-level processes on failures. Unfortunately, these solutions cannot be directly applied to user-level file systems, as recovery mechanisms in file systems need to be stateful and lightweight.

7.2 Reliability Through Reservation

Eager reservation of resources helps prevent allocation failures in systems. We now look at the related work that deals with improving reliability of systems through pre-allocation or reservation of resources. We also compare and contrast the related work with our anticipatory memory allocation approach. A large body of related work is found in the programming languages community on heap usage analysis, wherein researchers have developed static analyses to determine how much heap (or stack) space a program will use [1, 26, 35, 36, 84, 85, 180, 184]. Determination of heap space enables one to preallocate memory, thus avoiding memory allocation failures during later stages of program execution. The general use-case suggested for said analyses is in the embedded domain, where memory and time resources are generally quite constrained [35]. Whereas many of the analyses focus on functional or garbage-collected languages, and thus are not directly applicable to our problem domain (i.e., languages that are procedural and require explicit memory management), we do believe that some of the more recent work in this space could be applicable to anticipatory memory allocation. In particular, Chin et al.'s work on analyzing low-level code [35] and the live heap analysis implemented by Albert et al. [1] are promising candidates for further automating the AMA transformation process. The more general problem of handling memory bugs has also been investigated in great detail [8, 17, 44, 140, 146]. Berger and Zorn provide an excellent discussion of the range of common problems, including dangling pointers, double frees, and buffer overruns [17]. Many interesting and novel solutions have been proposed, including rolling back and trying again with a small change to the environment (e.g., more padding) [140], using multiple randomized heaps and voting to determine correctness [17], and even returning "made up" values when out-of-bounds memory is accessed [146]. The problem we tackle is both narrower and broader at once: narrower in that one could view the poor handling of an allocation failure as just one class of memory bug; broader in that true recovery from such a failure in a complex code base is quite intricate and reaches beyond the scope of typical solutions to these classic memory bugs. Our approach of using static analysis to predict memory requirements is similar in spirit to that taken by Garbervetsky et al. [59]. Their approach estimates the memory allocated within a given region, whereas AMA estimates the memory allocated over an entire file-system operation. Moreover, their system does not consider the allocations done by native methods or internal allocation performed by the runtime system, and does not handle recursive calls. In contrast, AMA estimates the allocations done by the kernel along with handling recursive calls inside file systems. Finally, the AMA approach to avoiding memory-allocation failure is reminiscent of the banker's algorithm [45] and other deadlock-avoidance techniques. Indeed, with AMA, one could build a sort of "memory scheduler" that avoids memory over-commitment by delaying some requests until other frees have taken place.

Chapter 8

Future Work and Conclusions

“Program testing can be used to show the presence of bugs, but never to show their absence!” – Edsger Dijkstra

A great deal of research has been done in the design and implementation of local file systems to improve various aspects of them. For example, performance [57, 114, 117, 149], scalability [113, 170], consistency management [53, 56, 155, 182], and indexing support [62] are a few areas that have seen a significant amount of innovation in the recent past. However, some of the critical aspects of file system design and implementation have not been improved at all. In particular, recovery in file systems has been ignored to a large extent. Recovery is a critical component in file systems, as it is the component that deals with faults within the file system or in other components that the file system interacts with. Unfortunately, the recovery component in file systems is not robust due to the presence of numerous bugs. Researchers have developed many tools in the last decade that use language-based support [71, 92], software engineering [48, 106], model checking [50, 197], static analysis [49, 195], fault injection [13, 15, 69, 138], and symbolic execution [196] to identify bugs in file system code; most of the bugs identified by these tools are in the recovery component of such systems. Even though many tools can detect bugs in file system code, they cannot guarantee that file systems are free from them [49]. Moreover, previous works have also shown that even when file system developers are aware of the problems, they do not know how to reproduce, react to, or fix them [71, 72, 138]. Hence, we believe that the right approach is to accept the fact that failures are inevitable in file systems; we must learn to cope with failures and not just hope to avoid them. In this dissertation, we explored two different approaches to improving the reliability of commodity file systems: restartability and reservation.

These two approaches help file systems survive faults, and hence failures, within themselves and in other components that they interact with. First, we introduced Membrane, an operating system framework to support restartable kernel-level file systems (Chapter 4). Then, we introduced Re-FUSE, a framework built inside the operating system and FUSE to restart user-level file systems on crashes (Chapter 5). Finally, we presented AMA, a mechanism that combined static analysis, dynamic analysis, and domain knowledge to simplify recovery code that dealt with memory allocation failures in file systems (Chapter 6). In this chapter, we first summarize our solutions and results (Section 8.1). We then list a set of lessons learned from years of researching file system reliability (Section 8.2). Finally, we outline future directions in which our work can be extended (Section 8.3).

8.1 Summary

This dissertation focused on developing recovery techniques to improve file system reliability and is mainly divided into two parts: reliability through restartability and reliability through reservation. We focus on local file systems due to their ubiquitous presence and the new challenges they present. We now summarize the results in both parts.

8.1.1 Reliability Through Restartability

The first part of this dissertation is about improving file system reliability through restartability. In this work, we focused on the reliability of kernel-level and user-level file systems and developed frameworks to statefully restart them upon failures.

Kernel-level File Systems

For kernel-level file systems, we designed and implemented a generic framework (Membrane) inside operating systems to support restartability. Membrane enabled kernel-level file systems to tolerate a wide range of fail-stop faults by selectively restarting the failed file system in a transparent and stateful way. The transparent and stateful restart of kernel-level file systems allowed applications to continue executing requests in file systems even in the presence of failures. Lightweight and stateful restart of kernel-level file systems was difficult to implement with existing techniques. To solve this problem, we came up with three novel techniques: generic COW-based checkpointing, page stealing, and a skip/trust unwind protocol. Our generic COW-based checkpointing mechanism enabled low-cost snapshots of file-system state that served as recovery points after a crash with minimal support from existing file systems. The page stealing technique greatly reduced logging overheads of write operations, which would otherwise have increased the time and space overheads. Finally, the skip/trust unwind protocol prevented file-system-induced damage to itself and other kernel components on failures through careful unwinding of in-kernel threads via both the crashed file system and the kernel proper. We evaluated Membrane with the ext2, VFAT, and ext3 file systems. Through experimentation, we showed that Membrane enabled existing file systems to crash and recover from a wide range of fault scenarios. We also showed that Membrane has less than 5% overhead across a set of file system benchmarks. Moreover, Membrane achieved these goals with little or no intrusiveness to existing file systems: only 5 lines of code were added to make ext2, VFAT, and ext3 restartable. Finally, Membrane improved robustness with complete application transparency; even though the underlying file system had crashed, applications continued to run.
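To give a flavor of what generic COW-based checkpointing does, the toy user-space model below tags each page with the checkpoint epoch that last dirtied it and copies the old contents the first time a new epoch writes to it. All names (fs_page, checkpoint, page_write) are hypothetical, and the model omits everything Membrane must actually handle (locking, on-disk commit, and so on).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

struct fs_page {
    int   epoch;              /* checkpoint epoch that last dirtied it */
    char *data;               /* current contents                      */
    char *frozen;             /* copy belonging to the committed epoch */
};

static int current_epoch;

/* Begin a new checkpoint: later writes must not disturb pages that the
 * previous epoch still needs as a recovery point. */
static void checkpoint(void) { current_epoch++; }

/* Copy-on-write: if the page was last dirtied in an older epoch, preserve
 * that version before letting the new epoch modify it. */
static void page_write(struct fs_page *p, const char *buf, size_t len)
{
    if (p->epoch != current_epoch) {
        free(p->frozen);
        p->frozen = malloc(PAGE_SIZE);
        memcpy(p->frozen, p->data, PAGE_SIZE);
        p->epoch = current_epoch;
    }
    memcpy(p->data, buf, len < PAGE_SIZE ? len : PAGE_SIZE);
}

int main(void)
{
    struct fs_page p = { .epoch = 0, .data = calloc(1, PAGE_SIZE) };
    page_write(&p, "epoch-0 contents", 17);
    checkpoint();                       /* epoch 0 becomes the recovery point */
    page_write(&p, "epoch-1 contents", 17);
    printf("frozen:  %s\ncurrent: %s\n", p.frozen, p.data);
    free(p.data);
    free(p.frozen);
    return 0;
}

On a crash, recovery would discard the in-flight epoch and fall back to the frozen copies, which is the role the checkpoints play as recovery points in Membrane.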

User-level File Systems

For user-level file systems, we designed and implemented a generic framework (Re-FUSE) inside the operating system and FUSE to support restartability. Re-FUSE enabled user-level file systems to tolerate a wide range of fail-stop and transient faults through stateful restart of the entire user-level file system on failures. Re-FUSE also ensured that applications were oblivious to file-system failures and could continue executing requests in user-level file systems even during recovery. In Re-FUSE, we added three new techniques to statefully restart user-level file systems. The first was request tagging, which differentiated activities that were being performed on behalf of concurrent requests; the second was system-call logging, which carefully tracked the system calls issued by a user-level file system and cached their results; the third was non-interruptible system calls, which ensured that no user-level file-system thread was terminated in the midst of a system call. Together, these three techniques enabled Re-FUSE to recover correctly from a crash of a user-level file system by simply re-issuing the calls that the FUSE file system was processing when the crash took place. Additional performance optimizations, including page versioning and socket buffering, were employed to lower the performance overheads. We evaluated Re-FUSE with three popular file systems, NTFS-3g, SSHFS, and AVFS, which differ in their data-access mechanisms, on-disk structures, and features. Less than ten lines of code were added to each of these file systems to make them restartable, showing that the modifications required to use Re-FUSE are minimal. We tested these file systems with both micro- and macro-benchmarks and found that performance overheads during normal operations are minimal. The average performance overhead was less than 2% and the worst-case performance overhead was 13%. Moreover, recovery time after a crash is small, on the order of a few hundred milliseconds in our tests. Overall, we showed that Re-FUSE successfully detects and recovers from a wide range of fail-stop and transient faults. By doing so, Re-FUSE increases system availability, as many faults no longer make the entire file system unavailable for long periods of time. Re-FUSE thus removes one critical barrier to the deployment of future file-system technology.
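The following sketch illustrates the spirit of system-call logging: each call issued on behalf of a request is recorded, and when the request is re-executed after a crash the logged result is returned instead of re-issuing the call. The code is a deliberately simplified user-space model with hypothetical names (logged_pread, sc_log_entry); the (req, seq) key plays the role that request tagging provides in Re-FUSE, and a real log would also cache the data returned so that read buffers can be refilled during replay.

#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_LOG 128

struct sc_log_entry { int req, seq; long result; };

static struct sc_log_entry log_tab[MAX_LOG];
static int  log_len;
static bool recovering;       /* true while a crashed request is replayed */

static long logged_pread(int req, int seq, int fd, void *buf,
                         size_t n, off_t off)
{
    if (recovering) {
        for (int i = 0; i < log_len; i++)
            if (log_tab[i].req == req && log_tab[i].seq == seq)
                return log_tab[i].result;   /* replay from the log         */
    }
    long r = pread(fd, buf, n, off);        /* normal path: issue the call */
    if (log_len < MAX_LOG)
        log_tab[log_len++] = (struct sc_log_entry){ req, seq, r };
    return r;
}

int main(int argc, char **argv)
{
    (void)argc;
    char buf[16];
    int fd = open(argv[0], O_RDONLY);       /* any readable file will do */
    long n1 = logged_pread(1, 0, fd, buf, sizeof buf, 0);
    recovering = true;                      /* pretend the FS crashed    */
    long n2 = logged_pread(1, 0, fd, buf, sizeof buf, 0);
    printf("live result %ld, replayed result %ld\n", n1, n2);
    close(fd);
    return 0;
}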

8.1.2 Reliability Through Reservation

The second part of this dissertation is about improving file system reliability through reservation. The reservation was done for in-memory objects that could be allocated while executing file system requests. We focused on kernel-level file systems and developed Anticipatory Memory Allocation (AMA), a mechanism to eliminate the scattered recovery code for memory-allocation failures inside operating systems. As part of AMA, we added a few lines of recovery code (around 200) inside a single function to deal with pre-allocation failures. In other words, we showed that it is possible to perform shallow recovery for memory-allocation failures during file system requests. We also added flexible, unified recovery policies (retry-forever and fail-fast) on top of it. To identify and reserve all the in-memory objects required to satisfy file-system requests, we used a combination of static analysis, dynamic analysis, and domain knowledge in file systems. We also added two novel optimizations, cache peeking and page recycling, to reduce the space overheads of AMA. Cache peeking avoids pre-allocation of cached objects by peeking into the in-memory cache before execution, and page recycling reuses one or more of the pre-allocated pages during a request, thus reducing the space overheads. We demonstrated the benefits of AMA by applying it to the Linux ext2 file system and built a memory-failure robust version of ext2 called ext2-mfr. Through experimentation, we showed that ext2-mfr is robust to memory-allocation failure; even for a memory-allocation failure probability of 0.99, ext2-mfr is able to either retry and eventually make progress or fail quickly and return an error, depending on the specified failure policy. In all cases, AMA ensured that the file system and the operating system remain consistent and usable. We also showed that ext2-mfr has less than 7% performance overhead and 8% space overhead for commonly used micro- and macro-benchmarks. Further, little code change is required, thus demonstrating the ease of transforming a significant subsystem. Overall, we showed that AMA achieves its goal of tolerating memory-allocation failure, and thus altogether avoids one important class of recovery bug commonly found in kernel code.

8.2 Lessons Learned

In this section, we present a list of general lessons we learned while working on this dissertation.

• Recovery as a first-class citizen. Traditionally, systems have been designed and built with performance as the main goal. As a result, recovery features are not designed carefully and are added as an afterthought. For example, we have shown that, in btrfs, a newly evolving file system, all necessary recovery components needed to handle memory-allocation failure have not yet been built (Section 6.2). We have also observed that, in file systems, recovery features are scattered and buried deep within the code. The complex designs combined with the bias for performance make both the intent and the realization of recovery in file systems difficult to evolve.

• Hardened operating-system interfaces. Operating systems consist of many components, and these components need to interact with one another to complete application requests. Unfortunately, many of these components blindly trust each other, and as a result, the interface between these layers is not as hardened as it should be. For example, in Membrane, we showed that the operating system does not check parameters and return values from file systems in many places; this resulted in corruption of operating-system state. To build robust operating systems, we need to clearly specify the intent (or semantics) of each operation, its input and output parameters, and expected return values [88]. The operating-system proper or its components should use such information to check and validate parameters during interactions with other components to prevent bugs from silently corrupting its state.

• Interfaces to support fault injection. For easier reliability testing, systems should provide suitable interfaces that enable a variety of fault-injection scenarios. For our reliability experiments, such interfaces would have helped greatly. In our experience, to perform our fault-injection experiments, we had to change a considerable amount of operating system and file system code.

More specifically, we had to modify different file system components (such as namespace management), the virtual file system layer, the memory management layer, the block driver layer, and the journaling layer to return failures during our fault-injection experiments.
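A hedged sketch of the kind of interface we would have liked is shown below: a single hook that any layer can consult, configured from outside the layer under test, instead of ad hoc modifications scattered across the code. The names (fault_rule, fault_inject_set, fault_inject_check) and the "block:submit_io" site label are invented for illustration.

#include <errno.h>
#include <stdio.h>
#include <string.h>

struct fault_rule {
    const char *site;          /* which instrumented call site to fail */
    int         error;         /* error code to inject                 */
    int         after_n_calls; /* let this many calls succeed first    */
};

static struct fault_rule rule;

/* Configure one injection rule (in a kernel this might be exposed
 * through a debugfs- or sysctl-like knob). */
static void fault_inject_set(const char *site, int error, int after)
{
    rule = (struct fault_rule){ site, error, after };
}

/* Called by instrumented layers at interesting points; returns 0 or the
 * error to inject. */
static int fault_inject_check(const char *site)
{
    if (rule.site && strcmp(rule.site, site) == 0 && rule.after_n_calls-- <= 0)
        return rule.error;
    return 0;
}

int main(void)
{
    fault_inject_set("block:submit_io", -EIO, 2);   /* fail the 3rd and later calls */
    for (int i = 0; i < 4; i++)
        printf("call %d -> %d\n", i, fault_inject_check("block:submit_io"));
    return 0;
}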

8.3 Future Work

In terms of future work, our vision is to build highly reliable and highly available systems. This section outlines various directions for this vision.

8.3.1 Reliability Through Automation

Reliability research in the last decade has shown time and again that recovery code in systems is insufficient, missing, or incorrect [37, 49, 68, 71, 138, 151, 171, 195–197]. From our own experience in building recovery techniques in the operating system, we believe that the right approach to improving operating system reliability is to automate the recovery process to the largest possible extent. The automation would help eliminate the need for manual implementation of recovery code in operating systems. More importantly, automation could be independent of components (such as file systems, device drivers, etc.) and resources (such as memory, I/O, etc.). Previous research that had similar goals built transactional support inside operating systems [75, 124, 135, 136]. In these solutions, the transactional support tracks the changes performed during request execution; on an error, the transactional mechanism automatically reverts all recent changes. There are a few drawbacks of these approaches that make them less attractive in commodity systems. First, these systems are heavyweight. Second, they still rely on programmers to correctly invoke and implement all of the transactional code. Third, they do not handle all possible errors; for example, memory-allocation failures or I/O errors during transactions are not handled in TxOS [136]. Finally, they require wide-scale changes to the operating-system code. Unlike previous approaches, we would like to explore the possibility of building transactional support outside the operating system. Specifically, we would like to explore the possibility of leveraging the support available in hardware transactional memory to automate the recovery process in operating systems [118]. Our vision of hardware-assisted recovery is as follows. First, during regular operations, the hardware could automatically log the changes done in the context of a request [81, 118, 142]; to prevent corruption, the hardware could restrict the operating system from accessing the memory locations of the logs. Second, for atomicity and isolation, the hardware could eagerly detect and resolve conflicts on updates to shared operating system data structures [118]. Third, on an error, the hardware would automatically revert the actions of the request using its log and either retry the request or directly return an error back to the application with little support from the operating system [150]. Finally, to optimize the recovery time overheads, we would like to explore the possibility of adding support for implicit (at the function boundary) and explicit checkpoints (requested by the operating system). There are many advantages to automating recovery using a hardware-based mechanism. First, a large portion of the recovery code no longer needs to be implemented by operating system developers and would be automated in the hardware layer. Second, the recovery process is transparent to the applications and the operating system. Third, it is possible to get near-native performance, as fault anticipation and recovery can be performed at the hardware layer. Fourth, it is easy for the operating system to implement flexible recovery policies on top of a hardware layer. Finally, detection of errors at the operating-system level could be an optimization and not a requirement for many error scenarios (e.g., null pointer exceptions).

8.3.2 Other Data Management Systems

While we limit the focus of this dissertation to local file systems, several of the issues we discuss here, such as fault tolerance and recovery, are applicable to other data management systems such as distributed file systems and database management systems. The solutions developed in this dissertation could be applied to other systems to improve their reliability. First, the skip/trust unwind protocol could help recover and clean up systems that have intricate interactions between components (or layers). Second, non-interruptible system calls could be used in multi-threaded systems that require stronger atomicity guarantees. Third, the static analysis technique and runtime support developed in AMA could be used to estimate and pre-allocate memory in systems (such as databases) that perform many dynamic memory allocations.

8.4 Closing Words

“Simplicity is prerequisite for reliability.” – Edsger Dijkstra

Availability is important in all systems and data availability is of utmost importance. File systems will become more powerful and complex in the future, and the reliability of file systems will directly determine data availability. In this dissertation, we have adhered to three important principles that helped us face the colossal challenge of building practical recovery techniques to improve file system reliability. First, simplicity should be the pursuit in achieving reliable systems. Recovery code in operating systems is complex, difficult to get right, and is on the order of thousands of lines of code written in a low-level language such as C. To make things worse, the recovery code is scattered all over the operating system and file system, making it hard to reason about or verify, either manually or using sophisticated techniques such as static analysis. As Dijkstra said (quoted above), we aimed at designing simple recovery mechanisms such that they are easy to reason about and verify. Second, reliability need not come at the cost of performance. File systems have been tuned for high performance and scalability [23, 82, 170]; hence, heavyweight reliability techniques are not likely to be adopted by them. In all our solutions, we made a conscious effort to minimize the performance overheads through our design and added optimizations, when possible, to reduce the overheads further. Third, favor generality over particularity. There is a gamut of user-level and kernel-level file systems in existence today. Solutions that are tailored to a particular system tend to be limited in their applicability and require wide-scale design and code changes; hence, they are not widely adopted in commodity systems. In all our solutions, we favor generality and backward compatibility, in the hope that developers will adopt our recovery techniques to improve the reliability of commodity operating systems and file systems.

Bibliography

[1] Elvira Albert, Samir Genaim, and Miguel Gomez-Zamalloa. Live Heap Space Analysis for Languages with Garbage Collection. In International Symposium on Memory Management (ISMM ’09), Dublin, Ireland, June 2009.
[2] Gautam Altekar and Ion Stoica. ODR: Output-deterministic replay for multicore debugging. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP ’09), pages 193–206, Big Sky, Montana, October 2009.
[3] Cristiana Amza, Alan L. Cox, and Willy Zwaenepoel. Data replication strategies for fault tolerance and availability on commodity clusters. In Proceedings of the International Conference on Dependable Systems and Networks, 2000.
[4] J. Anderson. Computer security technology planning study volume II. USAF, 1972.
[5] T.E. Anderson, M.D. Dahlin, J.M. Neefe, D.A. Patterson, and R.Y. Wang. Serverless Network File Systems. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP ’95), pages 109–26, Copper Mountain Resort, Colorado, December 1995.
[6] Apple. Technical Note TN1150. http://developer.apple.com/technotes/tn/tn1150.html, March 2004.
[7] Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Nathan C. Burnett, Timothy E. Denehy, Thomas J. Engle, Haryadi S. Gunawi, James Nugent, and Florentina I. Popovici. Transforming Policies into Mechanisms with Infokernel. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03), pages 90–105, Bolton Landing, New York, October 2003.
[8] T. M. Austin, S. E. Breach, and G. S. Sohi. Efficient Detection of All Pointer and Array Access Errors. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI ’04), pages 290–301, Washington, DC, June 2004.
[9] A. Avizienis. The n-version approach to fault-tolerant software. IEEE Trans. Softw. Eng., 11:1491–1501, December 1985.
[10] A. Avizienis and L. Chen. On the implementation of n-version programming for software fault tolerance during execution. In Proceedings of the 1st International Computer Software and Applications Conference, 1977.
[11] Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Dependability Analysis of Virtual Memory Systems. In Proceedings of the International Conference on Dependable Systems and Networks (DSN ’06), Philadelphia, Pennsylvania, June 2006.


[12] Lakshmi N. Bairavasundaram, Garth R. Goodson, Shankar Pasupathy, and Jiri Schindler. An Analysis of Latent Sector Errors in Disk Drives. In Proceedings of the 2007 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’07), San Diego, California, June 2007.
[13] Lakshmi N. Bairavasundaram, Garth R. Goodson, Bianca Schroeder, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. An Analysis of Data Corruption in the Storage Stack. In Proceedings of the 6th USENIX Symposium on File and Storage Technologies (FAST ’08), pages 223–238, San Jose, California, February 2008.
[14] Lakshmi N. Bairavasundaram, Meenali Rungta, Nitin Agrawal, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift. Systematically Benchmarking the Effects of Disk Pointer Corruption. In Proceedings of the International Conference on Dependable Systems and Networks (DSN ’08), Anchorage, Alaska, June 2008.
[15] Lakshmi N. Bairavasundaram, Swaminathan Sundararaman, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Tolerating File-System Mistakes with EnvyFS. In Proceedings of the USENIX Annual Technical Conference (USENIX ’09), San Diego, California, June 2009.
[16] Mary Baker, John Hartman, Martin Kupfer, Ken Shirriff, and John Ousterhout. Measurements of a Distributed File System. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (SOSP ’91), pages 198–212, Pacific Grove, California, October 1991.
[17] Emery D. Berger and Benjamin G. Zorn. DieHard: Probabilistic Memory Safety for Unsafe Languages. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation (PLDI ’06), Ottawa, Canada, June 2006.
[18] Steve Best. JFS Overview. www.ibm.com/developerworks/library/l-jfs.html, 2000.
[19] Eric J. Bina and Perry A. Emrath. A Faster fsck for BSD Unix. In Proceedings of the USENIX Winter Technical Conference (USENIX Winter ’89), San Diego, California, January 1989.
[20] Kenneth P. Birman. Building secure and reliable network applications, chapter 19. Manning Publications Co., Greenwich, CT, USA, 1997.
[21] Andrea Bobbio and Matteo Sereno. Fine grained software rejuvenation models. Computer Performance and Dependability Symposium, International, 0:4, 1998.
[22] Jeff Bonwick. The Slab Allocator: An Object-Caching Kernel Memory Allocator. In Proceedings of the USENIX Summer Technical Conference (USENIX Summer ’94), Boston, Massachusetts, June 1994.
[23] Jeff Bonwick and Bill Moore. ZFS: The Last Word in File Systems. http://opensolaris.org/os/community/zfs/docs/zfs last.pdf, 2007.
[24] Anita Borg, Jim Baumbach, and Sam Glazer. A message system supporting fault tolerance. In Proceedings of the ninth ACM symposium on Operating systems principles, SOSP ’83, pages 90–99, New York, NY, USA, 1983. ACM.
[25] Daniel P. Bovet and Marco Cesati. Understanding the Linux Kernel. O’Reilly, 2006.
[26] V. Braberman, F. Fernandez, D. Garbervetsky, and S. Yovine. Parametric Prediction of Heap Memory Requirements. In International Symposium on Memory Management (ISMM ’08), Tucson, Arizona, June 2008.
[27] Aaron B. Brown and David A. Patterson. To Err is Human. In EASY ’01, 2001.

[28] Edouard Bugnion, Scott Devine, and Mendel Rosenblum. Disco: Running commodity operating systems on scalable multiprocessors. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP ’97), pages 143–156, Saint-Malo, France, October 1997.
[29] Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, and Dawson R. Engler. EXE: Automatically Generating Inputs of Death. In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS ’06), Alexandria, Virginia, November 2006.
[30] George Candea and Armando Fox. Crash-Only Software. In The Ninth Workshop on Hot Topics in Operating Systems (HotOS IX), Lihue, Hawaii, May 2003.
[31] George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, and Armando Fox. Microreboot – A Technique for Cheap Recovery. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI ’04), pages 31–44, San Francisco, California, December 2004.
[32] Remy Card, Theodore Ts’o, and Stephen Tweedie. Design and Implementation of the Second Extended Filesystem. In Proceedings of the First Dutch International Symposium on Linux, 1994.
[33] John Chapin, Mendel Rosenblum, Scott Devine, Tirthankar Lahiri, Dan Teodosiu, and Anoop Gupta. Hive: Fault Containment for Shared-Memory Multiprocessors. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP ’95), Copper Mountain Resort, Colorado, December 1995.
[34] Yuqun Chen, Kai Li, and James S. Plank. Clip: A checkpointing tool for message passing parallel programs. SC Conference, 0:33, 1997.
[35] Wei-Ngan Chin, Huu Hai Nguyen, Corneliu Popeea, and Shengchao Qin. Analysing Memory Resource Bounds for Low-Level Programs. In International Symposium on Memory Management (ISMM ’08), Tucson, Arizona, June 2008.
[36] Wei-Ngan Chin, Huu Hai Nguyen, Shengchao Qin, and Martin Rinard. Memory Usage Verification for OO Programs. In Static Analysis Symposium (SAS ’05), 2005.
[37] Andy Chou, Junfeng Yang, Benjamin Chelf, Seth Hallem, and Dawson Engler. An Empirical Study of Operating System Errors. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP ’01), pages 73–88, Banff, Canada, October 2001.
[38] Wu-chun Feng. Making a case for efficient supercomputing. Queue, 1:54–64, October 2003.
[39] Crispin Cowan, Calton Pu, Dave Maier, Heather Hinton, Jonathan Walpole, Peat Bakke, Steve Beattie, Aaron Grier, Perry Wagle, and Qian Zhang. StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks. In Proceedings of the 7th USENIX Security Symposium (Sec ’98), San Antonio, Texas, January 1998.
[40] Creo. Fuse for FreeBSD. http://fuse4bsd.creo.hu/, 2010.
[41] Flavin Cristian. Understanding fault-tolerant distributed systems. Communications of the ACM, 34:56–78, February 1991.
[42] Francis M. David, Ellick M. Chan, Jeffrey C. Carlyle, and Roy H. Campbell. CuriOS: Improving Reliability through Operating System Structure. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI ’08), San Diego, California, December 2008.
[43] Brian Demsky and Martin Rinard. Automatic Detection and Repair of Errors in Data Structures. In The 18th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA ’03), Anaheim, California, October 2003.

[44] D. Dhurjati, S. Kowshik, V. Adve, and C. Lattner. Memory Safety Without Runtime Checks Or Garbage Collection. In LCTES ’03, 2003.
[45] E. W. Dijkstra. EWD623: The Mathematics Behind The Banker’s Algorithm. Selected Writings on Computing: A Personal Perspective (Springer-Verlag), 1977.
[46] R. H. Doyle, R. A. Meyer, and R. P. Pedowitz. Automatic failure recovery in a digital data-processing system. In Papers presented at the March 3-5, 1959, western joint computer conference, IRE-AIEE-ACM ’59 (Western), pages 159–168, New York, NY, USA, 1959. ACM.
[47] E. N. (Mootaz) Elnozahy, Lorenzo Alvisi, Yi-Min Wang, and David B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv., 34:375–408, September 2002.
[48] Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions. In Proceedings of the 4th Symposium on Operating Systems Design and Implementation (OSDI ’00), San Diego, California, October 2000.
[49] Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP ’01), pages 57–72, Banff, Canada, October 2001.
[50] Dawson Engler and Madanlal Musuvathi. Static Analysis versus Software Model Checking for Bug Finding. In 5th International Conference Verification, Model Checking and Abstract Interpretation (VMCAI ’04), Venice, Italy, January 2004.
[51] Dawson R. Engler, M. Frans Kaashoek, and James W. O’Toole. Exokernel: An Operating System Architecture for Application-Level Resource Management. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP ’95), pages 251–266, Copper Mountain Resort, Colorado, December 1995.
[52] K. Fraser, S. Hand, R. Neugebauer, I. Pratt, A. Warfield, and M. Williamson. Safe Hardware Access with the Xen Virtual Machine Monitor. In Workshop on Operating System and Architectural Support for the On-Demand IT Infrastructure, 2004.
[53] Christopher Frost, Mike Mammarella, Eddie Kohler, Andrew de los Reyes, Shant Hovsepian, Andrew Matsuoka, and Lei Zhang. Generalized File System Dependencies. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP ’07), Stevenson, Washington, October 2007.
[54] Jason Gait. Phoenix: a safe-in-memory file system. Communications of the ACM, 33(1):81–86, January 1990.
[55] Gregory R. Ganger. Blurring the Line Between Oses and Storage Devices. Technical Report CMU-CS-01-166, Carnegie Mellon University, December 2001.
[56] Gregory R. Ganger, Marshall Kirk McKusick, Craig A. N. Soules, and Yale N. Patt. Soft updates: a solution to the metadata update problem in file systems. ACM Transactions on Computer Systems, 18:127–153, May 2000.
[57] Gregory R. Ganger and Yale N. Patt. Metadata Update Performance in File Systems. In Proceedings of the 1st Symposium on Operating Systems Design and Implementation (OSDI ’94), pages 49–60, Monterey, California, November 1994.

[58] John F. Gantz, Christopher Chute, Alex Manfrediz, Stephen Minton, David Reinsel, Wolfgang Schlichting, and Anna Toncheva. The Diverse and Exploding Digital Universe. www.emc.com/collateral/analyst-reports/diverse-exploding-digitaluniverse.pdf, 2007.
[59] Diego Garbervetsky, Sergio Yovine, Víctor Braberman, Martín Rouaux, and Alejandro Taboada. On transforming java-like programs into memory-predictable code. In JTRES ’09: Proceedings of the 7th International Workshop on Java Technologies for Real-Time and Embedded Systems, pages 140–149, New York, NY, USA, 2009. ACM.
[60] Sachin Garg, Antonio Puliafito, Miklos Telek, and Kishor Trivedi. On the analysis of software rejuvenation policies. Proceedings of the Annual Conference on Computer Assurance, 1997.
[61] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google File System. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03), pages 29–43, Bolton Landing, New York, October 2003.
[62] Dominic Giampaolo. Practical File System Design with the Be File System. Morgan Kaufmann, 1999.
[63] GNU. The GNU Project Debugger. http://www.gnu.org/software/gdb, 2010.
[64] Google Code. MacFUSE. http://code.google.com/p/macfuse/, 2010.
[65] Jim Gray. Why do computers stop and what can be done about it? In Symposium on Reliable Distributed Systems, pages 3–12, 1986.
[66] Jim Gray. Why Do Computers Stop and What Can We Do About It? In 6th International Conference on Reliability and Distributed Databases, June 1987.
[67] Jim Gray and Andreas Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993.
[68] Weining Gu, Z. Kalbarczyk, Ravishankar K. Iyer, and Zhenyu Yang. Characterization of Linux Kernel Behavior Under Errors. In Proceedings of the International Conference on Dependable Systems and Networks (DSN ’03), pages 459–468, San Francisco, California, June 2003.
[69] Haryadi S. Gunawi, Vijayan Prabhakaran, Swetha Krishnan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Improving File System Reliability with I/O Shepherding. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP ’07), pages 283–296, Stevenson, Washington, October 2007.
[70] Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. SQCK: A Declarative File System Checker. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI ’08), San Diego, California, December 2008.
[71] Haryadi S. Gunawi, Cindy Rubio-Gonzalez, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Ben Liblit. EIO: Error Handling is Occasionally Correct. In Proceedings of the 6th USENIX Symposium on File and Storage Technologies (FAST ’08), pages 207–222, San Jose, California, February 2008.
[72] Philip J. Guo and Dawson Engler. Linux Kernel Developer Responses to Static Analysis Bug Reports. In Proceedings of the USENIX Annual Technical Conference (USENIX ’09), San Diego, California, June 2009.
[73] Robert Hagmann. Reimplementing the Cedar File System Using Logging and Group Commit. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP ’87), Austin, Texas, November 1987.

[74] Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson R. Engler. A System and Language for Building System-Specific, Static Analyses. In Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’02), Berlin, Germany, June 2002.
[75] Roger Haskin, Yoni Malachi, Wayne Sawdon, and Gregory Chan. Recovery management in quicksilver. ACM Transactions on Computer Systems, 6:82–108, 1988.
[76] Val Henson. The Many Faces of fsck. http://lwn.net/Articles/248180/, September 2007.
[77] Val Henson, Zach Brown, Theodore Ts’o, and Arjan van de Ven. Reducing fsck time for ext2 file systems. In Ottawa Linux Symposium (OLS ’06), Ottawa, Canada, July 2006.
[78] Val Henson, Arjan van de Ven, Amit Gud, and Zach Brown. Chunkfs: Using divide-and-conquer to improve file system reliability and repair. In IEEE 2nd Workshop on Hot Topics in System Dependability (HotDep ’06), Seattle, Washington, November 2006.
[79] Jorrit N. Herder, Herbert Bos, Ben Gras, Philip Homburg, and Andrew S. Tanenbaum. Construction of a Highly Dependable Operating System. In Proceedings of the 6th European Dependable Computing Conference, October 2006.
[80] Jorrit N. Herder, Herbert Bos, Ben Gras, Philip Homburg, and Andrew S. Tanenbaum. Failure Resilience for Device Drivers. In Proceedings of the International Conference on Dependable Systems and Networks (DSN ’07), pages 41–50, Edinburgh, UK, June 2007.
[81] Maurice Herlihy, Victor Luchangco, and Mark Moir. A flexible framework for implementing software transactional memory. In Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications, OOPSLA ’06, pages 253–262, 2006.
[82] Dave Hitz, James Lau, and Michael Malcolm. File System Design for an NFS File Server Appliance. In Proceedings of the USENIX Winter Technical Conference (USENIX Winter ’94), San Francisco, California, January 1994.
[83] Dave Hitz, James Lau, and Michael Malcolm. File System Design for an NFS File Server Appliance. In Proceedings of the 1994 USENIX Winter Technical Conference, Berkeley, CA, January 1994.
[84] M. Hofmann and S. Jost. Static Prediction of Heap Space Usage for First Order Functional Languages. In The 30th SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’03), New Orleans, Louisiana, January 2003.
[85] Martin Hofmann and Steffen Jost. Type-based amortised heap-space analysis. In ESOP 2006, LNCS 3924, pages 22–37. Springer, 2006.
[86] James J. Horning, Hugh C. Lauer, P. M. Melliar-Smith, and Brian Randell. A program structure for error detection and recovery. In Operating Systems, Proceedings of an International Symposium, pages 171–187, London, UK, 1974. Springer-Verlag.
[87] J. Howard, M. Kazar, S. Menees, D. Nichols, M. Satyanarayanan, R. Sidebotham, and M. West. Scale and Performance in a Distributed File System. ACM Transactions on Computer Systems, 6(1), February 1988.
[88] Galen C. Hunt, James R. Larus, Martin Abadi, Paul Barham, Manuel Fahndrich, Chris Hawblitzel, Orion Hodson, Steven Levi, Nick Murphy, Bjarne Steensgaard, David Tarditi, Ted Wobber, and Brian Zill. An Overview of the Singularity Project. Technical Report 2005-135, Microsoft Research, 2005.

[89] Sitaram Iyer and Peter Druschel. Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), pages 117–130, Banff, Canada, October 2001.
[90] Weihang Jiang, Chongfeng Hu, Yuanyuan Zhou, and Arkady Kanevsky. Are disks the dominant contributor for storage failures?: A comprehensive study of storage subsystem failure characteristics. ACM Transactions on Storage, 4:7:1–7:25, November 2008.
[91] David B. Johnson and Willy Zwaenepoel. Recovery in distributed systems using optimistic message logging and checkpointing. Journal of Algorithms, 11:462–491, September 1990.
[92] Rob Johnson and David Wagner. Finding User/Kernel Pointer Bugs With Type Inference. In Proceedings of the 13th USENIX Security Symposium (Sec '04), San Diego, California, August 2004.
[93] Michael B. Jones, Daniela Roşu, and Marcel-Cătălin Roşu. CPU reservations and time constraints: efficient, predictable scheduling of independent activities. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), pages 198–211, Saint-Malo, France, October 1997.
[94] N. Joukov, A. M. Krishnakumar, C. Patti, A. Rai, S. Satnur, A. Traeger, and E. Zadok. RAIF: Redundant Array of Independent Filesystems. In Proceedings of the 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007), pages 199–212, San Diego, California, September 2007. IEEE.
[95] Hannu H. Kari, H. Saikkonen, and F. Lombardi. Detection of Defective Media in Disks. In The IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems, pages 49–55, Venice, Italy, October 1993.
[96] Jeffrey Katcher. PostMark: A New File System Benchmark. Technical Report TR-3022, Network Appliance Inc., October 1997.
[97] Kimberly Keeton, Cipriano Santos, Dirk Beyer, Jeffrey Chase, and John Wilkes. Designing for disasters. In Proceedings of the 3rd USENIX Symposium on File and Storage Technologies (FAST '04), San Francisco, California, April 2004.
[98] Tracy Kidder. The Soul of a New Machine. Little, Brown, and Company, 1981.
[99] Gene H. Kim and Eugene H. Spafford. The Design and Implementation of Tripwire: A File System Integrity Checker. In Proceedings of the 2nd ACM Conference on Computer and Communications Security (CCS '94), Fairfax, Virginia, November 1994.
[100] Steve R. Kleiman. Vnodes: An Architecture for Multiple File System Types in Sun UNIX. In Proceedings of the USENIX Summer Technical Conference (USENIX Summer '86), pages 238–247, Atlanta, Georgia, June 1986.
[101] Nick Kolettis and N. Dudley Fulton. Software rejuvenation: Analysis, module and applications. In Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing (FTCS '95), pages 381–, Washington, DC, USA, 1995. IEEE Computer Society.
[102] Nathan P. Kropp, Philip J. Koopman, and Daniel P. Siewiorek. Automated Robustness Testing of Off-the-Shelf Software Components. In Proceedings of the 28th International Symposium on Fault-Tolerant Computing (FTCS-28), Munich, Germany, June 1998.
[103] James Larus. The Singularity Operating System. Seminar given at the University of Wisconsin, Madison, 2005.

[104] J. LeVasseur, V. Uhlig, J. Stoess, and S. Gotz. Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI '06), Seattle, Washington, November 2006.
[105] K. Li, J. F. Naughton, and J. S. Plank. Real-time concurrent checkpoint for parallel programs. In Proceedings of the Second ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming (PPOPP '90), pages 79–88, New York, NY, USA, 1990. ACM.
[106] Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI '04), San Francisco, California, December 2004.
[107] David Lie, Andy Chou, Dawson Engler, and David L. Dill. A Simple Method for Extracting Models from Protocol Code. In Proceedings of the 28th Annual International Symposium on Computer Architecture (ISCA '01), Goteborg, Sweden, June 2001.
[108] D. E. Lowell and P. M. Chen. Discount Checking: Transparent, Low-Overhead Recovery for General Applications. Technical Report CSE-TR-410-99, University of Michigan, 1999.
[109] David E. Lowell, Subhachandra Chandra, and Peter M. Chen. Exploring failure transparency and the limits of generic recovery. In Proceedings of the 4th Symposium on Operating Systems Design and Implementation (OSDI '00), pages 20–20, Berkeley, CA, USA, 2000. USENIX Association.
[110] David E. Lowell and Peter M. Chen. Free transactions with Rio Vista. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP '97), pages 92–101, New York, NY, USA, 1997. ACM.
[111] Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from Mistakes — A Comprehensive Study on Real World Concurrency Bug Characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII), Seattle, Washington, March 2008.
[112] Martin Streicher. Monitor file system activity with inotify. http://www.ibm.com/developerworks/linux/library/l-ubuntu-inotify/index.html, 2008.
[113] Avantika Mathur, Mingming Cao, Suparna Bhattacharya, Andreas Dilger, Alex Tomas, Laurent Vivier, and Bull S.A.S. The New Ext4 Filesystem: Current Status and Future Plans. In Ottawa Linux Symposium (OLS '07), Ottawa, Canada, July 2007.
[114] Marshall K. McKusick, William N. Joy, Sam J. Leffler, and Robert S. Fabry. A Fast File System for UNIX. ACM Transactions on Computer Systems, 2(3):181–197, August 1984.
[115] Marshall Kirk McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry. Fsck - The UNIX File System Check Program. Unix System Manager's Manual - 4.3 BSD Virtual VAX-11 Version, April 1986.
[116] Microsoft Corporation. http://www.microsoft.com/hwdev/, December 2000.
[117] Jeffrey C. Mogul. A Better Update Policy. In Proceedings of the USENIX Summer Technical Conference (USENIX Summer '94), Boston, Massachusetts, June 1994.
[118] Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, and David A. Wood. LogTM: Log-based transactional memory. In Proceedings of the 12th International Symposium on High-Performance Computer Architecture, pages 254–265, February 2006.

[119] Andrew Morton. Re: [patch] jbd slab cleanups. kerneltrap.org/mailarchive/linux-fsdevel/2007/9/19/322280/thread#mid-322280, September 2007.
[120] Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler, and David L. Dill. CMC: A Pragmatic Approach to Model Checking Real Code. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI '02), Boston, Massachusetts, December 2002.
[121] George C. Necula, Jeremy Condit, Matthew Harren, Scott McPeak, and Westley Weimer. CCured: Type-Safe Retrofitting of Legacy Software. ACM Transactions on Programming Languages and Systems, 27(3), May 2005.
[122] George C. Necula, Scott McPeak, S. P. Rahul, and Westley Weimer. CIL: An infrastructure for C program analysis and transformation. In International Conference on Compiler Construction (CC '02), pages 213–228, April 2002.
[123] Nicholas Nethercote and Julian Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Notices, 42(6):89–100, 2007.
[124] Edmund B. Nightingale, Peter M. Chen, and Jason Flinn. Speculative execution in a distributed file system. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP '05), pages 191–205, Brighton, United Kingdom, October 2005.
[125] Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn. Rethink the Sync. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI '06), pages 1–16, Seattle, Washington, November 2006.
[126] Open Solaris. Fuse on Solaris. http://hub.opensolaris.org/bin/view/Project+fuse/, 2010.
[127] John K. Ousterhout, Herve Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson. A Trace-Driven Analysis of the UNIX 4.2 BSD File System. In Proceedings of the 10th ACM Symposium on Operating System Principles (SOSP '85), pages 15–24, Orcas Island, Washington, December 1985.
[128] Nicolas Palix, Gael Thomas, Suman Saha, Christophe Calves, Julia L. Lawall, and Gilles Muller. Faults in Linux: ten years later. In Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XVI), pages 305–318, Newport Beach, California, March 2011.
[129] David Patterson, Aaron Brown, Pete Broadwell, George Candea, Mike Chen, James Cutler, Patricia Enriquez, Armando Fox, Emre Kiciman, Matthew Merzbacher, David Oppenheimer, Naveen Sastry, William Tetzlaff, Jonathan Traupman, and Noah Treuhaft. Recovery Oriented Computing (ROC): Motivation, Definition, Techniques, and Case Studies. Technical Report CSD-02-1175, U.C. Berkeley, March 2002.
[130] David Patterson, Garth Gibson, and Randy Katz. A Case for Redundant Arrays of Inexpensive Disks (RAID). In Proceedings of the 1988 ACM SIGMOD Conference on the Management of Data (SIGMOD '88), pages 109–116, Chicago, Illinois, June 1988.
[131] David A. Patterson. Availability and maintainability performance: New focus for a new century. In USENIX Conference on File and Storage Technologies, 2002.
[132] R. Hugo Patterson, Stephen Manley, Mike Federwisch, Dave Hitz, Steve Kleiman, and Shane Owara. SnapMirror: File-system-based asynchronous mirroring for disaster recovery. In Proceedings of the 1st USENIX Conference on File and Storage Technologies (FAST '02), Berkeley, CA, USA, 2002. USENIX Association.

[133] Zachary Peterson and Randal Burns. Ext3cow: a time-shifting file system for regulatory compliance. ACM Transactions on Storage, 1(2):190–212, 2005.
[134] J. S. Plank, K. Li, and M. A. Puening. Diskless checkpointing. IEEE Transactions on Parallel and Distributed Systems, 9(10):972–986, October 1998.
[135] G. Popek, B. Walker, J. Chow, D. Edwards, C. Kline, G. Rudisin, and G. Thiel. LOCUS: A Network Transparent, High Reliability Distributed System. In Proceedings of the 8th ACM Symposium on Operating Systems Principles (SOSP '81), pages 169–177, Pacific Grove, California, December 1981.
[136] Donald E. Porter, Owen S. Hofmann, Christopher J. Rossbach, Alexander Benn, and Emmett Witchel. Operating Systems Transactions. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI '08), San Diego, California, December 2008.
[137] Vijayan Prabhakaran, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Analysis and Evolution of Journaling File Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX '05), pages 105–120, Anaheim, California, April 2005.
[138] Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, Nitin Agrawal, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. IRON File Systems. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP '05), pages 206–220, Brighton, United Kingdom, October 2005.
[139] Martin De Prycker. Asynchronous Transfer Mode: Solution for the Broadband ISDN. Prentice Hall PTR, 1994.
[140] Feng Qin, Joseph Tucek, Jagadeesan Sundaresan, and Yuanyuan Zhou. Rx: Treating Bugs As Allergies. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP '05), Brighton, United Kingdom, October 2005.
[141] Aditya Rajgarhia and Ashish Gehani. Performance and extension of user space file systems. In SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing, pages 206–213, New York, NY, USA, 2010. ACM.
[142] Hany E. Ramadan, Christopher J. Rossbach, Donald E. Porter, Owen S. Hofmann, Aditya Bhandari, and Emmett Witchel. MetaTM/TxLinux: Transactional memory for an operating system. In Proceedings of the 34th Annual International Symposium on Computer Architecture (ISCA '07), pages 92–103, San Diego, California, June 2007.
[143] B. Randell. System structure for software fault tolerance. In Proceedings of the International Conference on Reliable Software, pages 437–449, New York, NY, USA, 1975. ACM.
[144] B. Randell, P. Lee, and P. C. Treleaven. Reliability issues in computing system design. ACM Computing Surveys, 10:123–165, June 1978.
[145] Hans Reiser. ReiserFS. www.namesys.com, 2004.
[146] Martin Rinard, Cristian Cadar, Daniel Dumitran, Daniel M. Roy, Tudor Leu, and William S. Beebee, Jr. Enhancing Server Availability and Security Through Failure-Oblivious Computing. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI '04), San Francisco, California, December 2004.
[147] Rodrigo Rodrigues, Miguel Castro, and Barbara Liskov. BASE: Using abstraction to improve fault tolerance. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), pages 15–28, New York, NY, USA, 2001. ACM.

[148] Drew Roselli, Jacob R. Lorch, and Thomas E. Anderson. A comparison of file system workloads. In Proceedings of the USENIX Annual Technical Conference (USENIX '00), pages 4–4, Berkeley, CA, USA, 2000. USENIX Association.
[149] Mendel Rosenblum and John Ousterhout. The Design and Implementation of a Log-Structured File System. ACM Transactions on Computer Systems, 10(1):26–52, February 1992.
[150] Christopher J. Rossbach, Owen S. Hofmann, Donald E. Porter, Hany E. Ramadan, Aditya Bhandari, and Emmett Witchel. TxLinux: Using and managing hardware transactional memory in an operating system. In Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles (SOSP '07), pages 87–102. ACM, 2007.
[151] Cindy Rubio-Gonzalez, Haryadi S. Gunawi, Ben Liblit, Remzi H. Arpaci-Dusseau, and Andrea C. Arpaci-Dusseau. Error Propagation Analysis for File Systems. In Proceedings of the ACM SIGPLAN 2009 Conference on Programming Language Design and Implementation (PLDI '09), Dublin, Ireland, June 2009.
[152] Fred B. Schneider. Implementing Fault-Tolerant Services Using The State Machine Approach: A Tutorial. ACM Computing Surveys, 22(4):299–319, December 1990.
[153] Bianca Schroeder and Garth Gibson. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proceedings of the 5th USENIX Symposium on File and Storage Technologies (FAST '07), pages 1–16, San Jose, California, February 2007.
[154] Bianca Schroeder, Eduardo Pinheiro, and Wolf-Dietrich Weber. DRAM errors in the wild: A Large-Scale Field Study. In Proceedings of the 2009 Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS/Performance '09), Seattle, Washington, June 2009.
[155] Margo I. Seltzer, Gregory R. Ganger, M. Kirk McKusick, Keith A. Smith, Craig A. N. Soules, and Christopher A. Stein. Journaling Versus Soft Updates: Asynchronous Meta-data Protection in File Systems. In Proceedings of the USENIX Annual Technical Conference (USENIX '00), pages 71–84, San Diego, California, June 2000.
[156] J. S. Shapiro and N. Hardy. EROS: A Principle-Driven Operating System from the Ground Up. IEEE Software, 19(1), January/February 2002.
[157] Gopalan Sivathanu, Swaminathan Sundararaman, and Erez Zadok. Type-Safe Disks. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI '06), Seattle, Washington, November 2006.
[158] David A. Solomon. Inside Windows NT. Microsoft Programming Series. Microsoft Press, 2nd edition, May 1998.
[159] Sourceforge. AVFS: A Virtual Filesystem. http://sourceforge.net/projects/avf/, 2010.
[160] Sourceforge. File systems using FUSE. http://sourceforge.net/apps/mediawiki/fuse/index.php?title=FileSystems, 2010.
[161] Sourceforge. HTTPFS. http://httpfs.sourceforge.net/, 2010.
[162] Sourceforge. OpenSSH. http://www.openssh.com/, 2010.
[163] Sourceforge. SSH Filesystem. http://fuse.sourceforge.net/sshfs.html, 2010.
[164] Mark Sullivan and Ram Chillarege. Software defects and their impact on system availability: A study of field failures in operating systems. In Symposium on Fault-Tolerant Computing, pages 2–9, 1991.

[165] Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift. Membrane: Operating System Support for Restartable File Systems. In Proceedings of the 8th USENIX Symposium on File and Storage Technologies (FAST '10), San Jose, California, February 2010.
[166] Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift. Membrane: Operating System Support for Restartable File Systems. ACM Transactions on Storage (TOS), 6(3):133–170, September 2010.
[167] Swaminathan Sundararaman, Laxman Visampalli, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Refuse to Crash with Re-FUSE. In Proceedings of the EuroSys Conference (EuroSys '11), Salzburg, Austria, April 2011.
[168] Swaminathan Sundararaman, Yupu Zhang, Sriram Subramanian, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Making the Common Case the Only Case with Anticipatory Memory Allocation. In Proceedings of the 9th USENIX Symposium on File and Storage Technologies (FAST '11), San Jose, California, February 2011.
[169] Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Michael M. Swift. Why panic()? Improving Reliability with Restartable File Systems. In Workshop on Hot Topics in Storage and File Systems (HotStorage '09), Big Sky, Montana, October 2009.
[170] Adam Sweeney, Doug Doucette, Wei Hu, Curtis Anderson, Mike Nishimoto, and Geoff Peck. Scalability in the XFS File System. In Proceedings of the USENIX Annual Technical Conference (USENIX '96), San Diego, California, January 1996.
[171] Michael M. Swift, Brian N. Bershad, and Henry M. Levy. Improving the Reliability of Commodity Operating Systems. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP '03), Bolton Landing, New York, October 2003.
[172] Michael M. Swift, Brian N. Bershad, and Henry M. Levy. Recovering device drivers. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI '04), pages 1–16, San Francisco, California, December 2004.
[173] Nisha Talagala and David Patterson. An Analysis of Error Behaviour in a Large Storage System. In The IEEE Workshop on Fault Tolerance in Parallel and Distributed Systems, San Juan, Puerto Rico, April 1999.
[174] Andrew S. Tanenbaum, Jorrit N. Herder, and Herbert Bos. Can we make operating systems reliable and secure? Computer, 39:44–51, 2006.
[175] The Apache Software Foundation. Apache HTTP Server Project. http://httpd.apache.org, 2010.
[176] The Data Clinic. Hard Disk Failure. http://www.dataclinic.co.uk/hard-disk-failures.htm, 2004.
[177] The Open Source Community. FUSE: File System in Userspace. http://fuse.sourceforge.net/, 2009.
[178] Theodore Ts'o. http://e2fsprogs.sourceforge.net, June 2001.
[179] Theodore Ts'o and Stephen Tweedie. Future Directions for the Ext2/3 Filesystem. In Proceedings of the USENIX Annual Technical Conference (FREENIX Track), Monterey, California, June 2002.
[180] TUGS. StackAnalyzer Stack Usage Analysis. http://www.absint.com/stackanalyzer/, September 2010.

[181] Tuxera. NTFS-3g. http://www.tuxera.com/, 2010.
[182] Stephen C. Tweedie. Journaling the Linux ext2fs File System. In The Fourth Annual Linux Expo, Durham, North Carolina, May 1998.
[183] Stephen C. Tweedie. EXT3, Journaling Filesystem. olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html, July 2000.
[184] Leena Unnikrishnan and Scott D. Stoller. Parametric Heap Usage Analysis for Functional Programs. In International Symposium on Memory Management (ISMM '09), Dublin, Ireland, June 2009.
[185] J. von Neumann. Probabilistic logics and the synthesis of reliable organisms from unreliable components. In C. Shannon and J. McCarthy, editors, Automata Studies, pages 43–98. Princeton University Press, 1956.
[186] Carl A. Waldspurger. Memory Resource Management in VMware ESX Server. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI '02), Boston, Massachusetts, December 2002.
[187] Yi-Min Wang, Yennun Huang, Kiem-Phong Vo, Pi-Yu Chung, and C. Kintala. Checkpointing and its applications. In Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing (FTCS '95), pages 22–, Washington, DC, USA, 1995. IEEE Computer Society.
[188] S. Webber and J. Beirne. The Stratus architecture. In Proceedings of the 21st International Symposium on Fault-Tolerant Computing (FTCS-21), pages 79–87, Montreal, Canada, June 1991.
[189] W. Weimer and George C. Necula. Finding and Preventing Run-time Error-Handling Mistakes. In The 19th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '04), Vancouver, Canada, October 2004.
[190] Mark Weiser. Program Slicing. In International Conference on Software Engineering (ICSE '81), pages 439–449, San Diego, California, May 1981.
[191] Wikipedia. Btrfs. en.wikipedia.org/wiki/Btrfs, 2009.
[192] Wikipedia. Filesystem in Userspace. http://en.wikipedia.org/wiki/Filesystem_in_Userspace, 2010.
[193] Wilfredo Torres. Software fault tolerance: A tutorial. Technical Report TM-2000-210616, NASA Langley Technical Report Server, 2000.
[194] Dan Williams, Patrick Reynolds, Kevin Walsh, Emin Gün Sirer, and Fred B. Schneider. Device Driver Safety Through a Reference Validation Mechanism. In Proceedings of the 8th Symposium on Operating Systems Design and Implementation (OSDI '08), San Diego, California, December 2008.
[195] Junfeng Yang, Can Sar, and Dawson Engler. EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI '06), Seattle, Washington, November 2006.
[196] Junfeng Yang, Can Sar, Paul Twohey, Cristian Cadar, and Dawson Engler. Automatically Generating Malicious Disks using Symbolic Execution. In IEEE Security and Privacy (SP '06), Berkeley, California, May 2006.

[197] Junfeng Yang, Paul Twohey, Dawson Engler, and Madanlal Musuvathi. Using Model Checking to Find Serious File System Errors. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI '04), San Francisco, California, December 2004.
[198] Yupu Zhang, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. End-to-end Data Integrity for File Systems: A ZFS Case Study. In Proceedings of the 8th USENIX Symposium on File and Storage Technologies (FAST '10), San Jose, California, February 2010.
[199] E. Zadok and J. Nieh. FiST: A language for stackable file systems. In Proceedings of the USENIX Annual Technical Conference, pages 55–70, San Diego, California, June 2000. USENIX Association.
[200] Feng Zhou, Jeremy Condit, Zachary Anderson, Ilya Bagrak, Rob Ennals, Matthew Harren, George Necula, and Eric Brewer. SafeDrive: Safe and Recoverable Extensions Using Language-Based Techniques. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI '06), Seattle, Washington, November 2006.
[201] Yuanyuan Zhou, Peter M. Chen, and Kai Li. Fast cluster failover using virtual memory-mapped communication. In Proceedings of the 13th International Conference on Supercomputing (ICS '99), pages 373–382, New York, NY, USA, 1999. ACM.