Evaluating File System Reliability on Solid State Drives

Evaluating File System Reliability on Solid State Drives Shehbaz Jaffer, Stathis Maneas, Andy Hwang, Bianca Schroeder USENIX ATC ‘19 J1 Introduction & Motivation • Storage landscape has changed: ▪ HDDs -> SSDs. ▪ What about their failure characteristics? ➢Partial failures are a magnitude higher for SSDs! ➢FTL is prone to bugs during power faults. ▪ New/Evolved file systems: ➢eXt3 -> ext4 (journaling). ➢Btrfs (copy-on-write). ➢F2FS (log-structured, tailored for flash). • Our goal: How do these file systems deal with partial drive errors? 82 Research Questions & Methodology 83 Research Questions & Methodology • What we know (IRON File Systems, 2005): ▪ Only eXt3, JFS, ReiserFS, NTFS (partial). ▪ Hard disks only. ▪ Does not consider file system checkers. 83 Research Questions & Methodology • What we know (IRON File Systems, 2005): ▪ Only eXt3, JFS, ReiserFS, NTFS (partial). ▪ Hard disks only. ▪ Does not consider file system checkers. • What we want to know: ▪ Btrfs, eXt4, F2FS. ▪ Can they detect errors? ▪ Can they recover from errors? ▪ Can the system checker (fsck) fiX errors? 83 SSD Faults and their Manifestation • What if the storage device starts misbehaving and generating errors? OS • How eXactly file systems deal with these errors? Block-based FS … ext4 F2FS … VFS Btrfs Physical Device SSD 84 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Shorn Write Physical Device Lost SSD Write 85 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Shorn Write Physical Device Lost SSD Write 86 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Read Read Read I/O Error Request I/O Error! • The SSD could not successfully (1) (2) complete the operation! Shorn Write Physical Device Lost SSD Write 87 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Shorn Write Physical Device Lost SSD Write 8 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Read Corrupted Corruption Request Data! • Silent! The SSD does not detect it! (1) (2) Shorn Write Physical Device Lost SSD Write 89 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Write Future reads after Shorn Write Request a shorn write! • A write issued by the file system but (1) partially persisted on the SSD. Shorn ▪ Power fault (Zheng et al., FAST Write Physical Device ‘13). Write Lost is shorn! SSD Write (2) 810 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Dropped Writes Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Shorn Write Physical Device Lost SSD Write 811 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Dropped Writes Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Write Write Write I/O Error Request I/O Error! • The SSD could not successfully (1) (2) complete the operation! Shorn Write Physical Device Lost SSD Write 812 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Dropped Writes Block-based FS Write … ext4 F2FS … VFS I/O Btrfs Corrup tion Write Write Lost Write Request Complete! • No error is returned! (1) (2) Shorn Write Physical Device Write Lost is lost! SSD Write (3) 813 SSD Faults and their Manifestation SSD/Flash Faults Manifestation Uncorrectable bit corruption Incomplete Program Operation Read I/O OS Dropped Writes Silent Bit Corruption Block-based FS Write … ext4 F2FS … Misdirected Writes VFS I/O Btrfs Shorn Writes FTL Metadata Corruption Corrup Incomplete Erase Operation tion Shorn Write Physical Device Lost SSD Write 814 How to test the resiliency of file systems against SSD faults? OS Block-based FS … ext4 F2FS … VFS Btrfs Block Layer Physical Device SSD 815 How to test the resiliency of file systems against SSD faults? • Emulate the faults’ manifestation! ▪ Inject errors at the Block Layer. OS Block-based FS … ext4 F2FS … VFS Btrfs Block Layer Physical Device SSD 815 How to test the resiliency of file systems against SSD faults? • Emulate the faults’ manifestation! ▪ Inject errors at the Block Layer. OS Block-based FS … ext4 F2FS … VFS Btrfs Device Mapper Module Block Layer Physical Device SSD 815 How to test the resiliency of file systems against SSD faults? • Emulate the faults’ manifestation! ▪ Inject errors at the Block Layer. OS • Device Mapper Module: ▪ Intercept every I/O request. Block-based FS … … ▪ Fail a request & return an error. ext4 F2FS VFS Btrfs ▪ Silently drop a request. ▪ Alter block contents online. Device Mapper Module Block Layer Physical Device SSD 815 Targeted Error Injection • Understand the effect of every injected error. OS • Identify block types and specific data structures within each block! Block-based FS … ext4 F2FS … VFS Btrfs • Target specific data structures and fields within them: ▪ Trace all I/O requests (blktrace). Device Mapper Module ▪ Logic inside our device mapper module. ▪ FS tools, such as dump, to inspect the disk image offline. Block Layer Physical Device SSD 816 How do file systems detect and recover from errors? • Each application focuses on one Application particular operation: read(2) ▪ mkdir, creat, etc. write(2) OS • Run an application and collect all Block-based FS accessed blocks. … ext4 F2FS … VFS Btrfs • Targeted error injection: ▪ Repeat the eXecution and inject a single error into each accessed block. Device Mapper Module ▪ Target one block at a time. Block Layer • Better isolation and characterization of the file system’s reaction to every Physical Device injected error! SSD 817 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Device Mapper Module Block Layer Physical Device SSD 818 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Detection Recovery Fsck Crash/Fail Device Mapper Module Error Code Fail Sanity Partial Block Layer Original Physical Device Full SSD 819 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Detection Recovery Fsck Device Mapper Module Error Code Retry Sanity Propagate Block Layer Physical Device SSD 820 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Detection Recovery Fsck* Zero Zero Crash/Fail Device Mapper Module Error Code Retry Fail Sanity Propagate Partial Block Layer Redundancy Previous Original Physical Device Fsck Stop Full SSD * Colors indicate severity. 821 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Experiments Device Mapper Module Block Layer Physical Device SSD 822 How do file systems detect and recover from errors? • Categorize each file system’s detection and Application recovery policies: ▪ read(2) Across all visible aspects, such as logs, return write(2) codes, etc. OS ▪ Check how effectively fsck recovers the file system. Block-based FS … ext4 F2FS … VFS Btrfs Experiments 22 6 27 2 Device Mapper Module Applications Read/Write/ Data Recovery & Corruption Structures Detection Block Layer Experiments Physical Device 7000+ Test Cases! SSD 822 Results – Overview (1/2) 823 Results – Overview (1/2) Unmountable/Unrecoverable file system in 16% of all cases! 823 Results – Overview (2/2) File System Detection Recovery 824 Results – Overview (2/2) File System Detection Recovery ext4 824 Results – Overview (2/2) File System Detection Recovery ext4 Btrfs 824 Results – Overview (2/2) File System Detection Recovery ext4 Btrfs F2FS 824 eXt4 Results The Good News • Capable of recovering from a large range of fault scenarios. • Little use of checksums: ▪ Still, it can deal with corruption and shorn writes due to a very rich set of sanity checks. • System checker capable of reconstructing

Evaluating File System Reliability on Solid State Drives

F2FS) Overview

In Search of Optimal Data Placement for Eliminating Write Amplification in Log-Structured Storage

W4118: Linux File Systems

Certifying a File System

XFS: There and Back ...And There Again? Slide 1 of 38

NOVA: a Log-Structured File System for Hybrid Volatile/Non

Filesystem Considerations for Embedded Devices ELC2015 03/25/15

Filesystems HOWTO Filesystems HOWTO Table of Contents Filesystems HOWTO

Acronis® Disk Director® 12 User's Guide

Journaling File Systems

Analysis and Design of Native File System Enhancements for Storage Class Memory

NOVA: the Fastest File System for Nvdimms