Using Model Checking to Find Serious File System Errors

Junfeng Yang, Paul Twohey, Dawson Engler ∗
{junfeng, twohey, engler}@cs.stanford.edu
Computer Systems Laboratory, Stanford University, Stanford, CA 94305, U.S.A.

Madanlal Musuvathi
[email protected]
Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.

∗ This research was supported by NSF grant CCR-0326227 and DARPA grant F29601-03-2-0117. Dawson Engler is partially supported by Coverity and an NSF Career award.

Abstract

This paper shows how to use model checking to find serious errors in file systems. Model checking is a formal verification technique tuned for finding corner-case errors by comprehensively exploring the state spaces defined by a system. File systems have two dynamics that make them attractive for such an approach. First, their errors are some of the most serious, since they can destroy persistent data and lead to unrecoverable corruption. Second, traditional testing needs an impractical, exponential number of test cases to check that the system will recover if it crashes at any point during execution. Model checking employs a variety of state-reducing techniques that allow it to explore such vast state spaces efficiently.

We built a system, FiSC, for model checking file systems. We applied it to three widely-used, heavily-tested file systems: ext3 [13], JFS [21], and ReiserFS [27]. We found serious bugs in all of them, 32 in total. Most have led to patches within a day of diagnosis. For each file system, FiSC found demonstrable events leading to the unrecoverable destruction of metadata and entire directories, including the file system root directory “/”.

1 Introduction

File system errors are some of the most destructive errors possible. Since almost all deployed file systems reside in the operating system kernel, even a simple error can crash the entire system, most likely in the midst of a mutation to stable state. Bugs in file system code can range from those that cause “mere” reboots to those that lead to unrecoverable errors in stable on disk state. In such cases, mindlessly rebooting the machine will not correct or mask the errors and, in fact, can make the situation worse.

Not only are errors in file systems dangerous, file system code is simultaneously both difficult to reason about and difficult to test. The file system must correctly recover to an internally consistent state if the system crashes at any point, regardless of what data is being mutated, flushed or not flushed to disk, and what invariants have been violated. Anticipating all possible failures and correctly recovering from them is known to be hard; our results do not contradict this perception.

The importance of file system errors has led to the development of many file system stress test frameworks; two good ones are [24,30]. However, these focus mostly on non-crash based errors such as checking that the file system operations create, delete and link objects correctly. Testing that a file system correctly recovers from a crash requires doing reconstruction and then comparing the reconstructed state to a known legal state. The cost of a single crash-reboot-reconstruct cycle (typically a minute or more) makes it impossible to test more than a tiny fraction of the exponential number of crash possibilities. Consequently, just when implementors need validation the most, testing is least effective. Thus, even heavily-tested systems have errors that only arise after they are deployed, making their errors all but impossible to eliminate or even replicate.

In this paper, we use model checking to systematically test and find errors in file systems. Model checking [5,19,22] is a formal verification technique that systematically enumerates the possible states of a system by exploring the nondeterministic events in the system. Model checkers employ various state reduction techniques to efficiently explore the resulting exponential state space. For instance, generated states can be stored in a hash table to avoid redundantly exploring the same state. Also, by inspecting the system state, model checkers can identify similar set of states and prioritize the search towards previously unexplored behaviors in the system. When applicable, such a systematic exploration can achieve the effect of impractically massive testing by avoiding the redundancy that would occur in conventional testing.

The dominant cost of traditional model checking is the effort needed to write an abstract specification of the system (commonly referred to as the “model”). This up-front cost has traditionally made model checking completely impractical for large systems. A sufficiently detailed model can be as large as the checked system. Empirically, implementors often refuse to write them; those that are written have errors and, even if they do not, they “drift” as the implementation is modified but the model is not [6].

Recent work has developed implementation-level model checkers that check implementation code directly without requiring an abstract specification [18,25,26]. We leverage this approach to create a model checking infrastructure, the File System Checker (FiSC), which lets implementors model-check real, unmodified file systems with relatively little model checking knowledge. FiSC is built on CMC, an explicit state space, implementation model checker we developed in previous work [25,26], which lets us run an entire operating system inside of the model checker. This allows us to check a file system in situ rather than attempting the difficult task of extracting it from the operating system kernel.

We applied FiSC to three widely-used, heavily-tested file systems, JFS [21], ReiserFS [27], and ext3 [13]. We found serious bugs in all of them, 32 in total. Most have led to patches within a day of diagnosis. For each file system, FiSC found demonstrable events leading to the unrecoverable destruction of metadata and entire directories, including the file system root directory “/”.

The rest of the paper is as follows. We give an overview of both FiSC (§2) and how to check a file system with it (§3). We then describe: the checks FiSC performs (§4), the optimizations it does (§5), and how it checks file system recovery code (§6). We then discuss results (§7) and our experiences using FiSC (§8), including sources of false positives and false negatives. We then conclude.

2 Checking Overview

Our system is comprised of four parts: (1) CMC, an explicit state model checker running the Linux kernel, (2) a file system test driver, (3) a permutation checker which verifies that a file system can recover no matter what order buffer cache contents are written to disk, and (4) a fsck recovery checker. The model checker starts in an initial pristine state (an empty, formatted disk) and recursively generates and checks successive states by systematically executing state transitions. Transitions are either test driver operations or FS-specific kernel threads which flush blocks to disk. The test driver is conceptually similar to a program run during testing. It creates, removes, and renames files, directories, and hard links; writes to and truncates files; and mounts and unmounts the file system. Figure 1 shows this process.

Figure 1: State exploration and checking overview. FiSC’s main loop picks a state S from the state queue and then iteratively generates its successor states by applying each possible operation to a restored copy of S. The generated state S′ is checked for validity and, if valid and not explored before, inserted onto the state queue.

As each new state is generated, we intercept all disk writes done by the checked file system and forward them to the permutation checker, which checks that the disk is in a state that fsck can repair to produce a valid file system after each subset of all possible disk writes. This avoids storing a separate state for each permutation and allows FiSC to choose which permutations to check. This checker is explained in Section 4.2. We run fsck on the host system outside of the model checker and use a small shared library to capture all the disk accesses fsck makes while repairing the file system generated by writing a permutation. We feed these fsck generated writes into the crash recovery checker. This checker allows FiSC to recursively check for failures in fsck and is covered in Section 6.

Figure 2 outlines the operation of the permutation and fsck recovery checkers. Both checkers copy the disk from the starting state of a transition and write onto the copy to avoid perturbing the system. After the copied disk is modified the model checker traverses its file system, recording the properties it checks for consistency in a model of the file system. Currently these are the name, [...]

[...] atop the system call interface. The other layer provides a “fake environment” that the checked system runs on. We need this environment model because the checked file system does not run on bare hardware. Instead, FiSC provides a virtual block device that models a disk as a collection of sectors that can be written atomically. The block device driver layer is a natural place to cut as it is the only relatively well-documented boundary between in-core and persistent data.

Modern Unix derivatives provide a Virtual File System (VFS) interface [28].