Snapshots in a Flash with Iosnap

Snapshots in a Flash with ioSnap Sriram Subramanian, Andrea C. Arpaci-Dusseau, Swaminathan Sundararaman, Remzi H. Arpaci-Dusseau Nisha Talagala Computer Sciences Department, University of Fusion-IO Inc. Wisconsin-Madison {srsubramanian,swami,ntalagala}@fusionio.com {dusseau,remzi}@cs.wisc.edu Abstract tation of the state of a data volume, used in backups, repli- Snapshots are a common and heavily relied upon feature in cation, and other functions. storage systems. The high performance of flash-based stor- However, the world of storage is in the midst of a sig- age systems brings new, more stringent, requirements for nificant change, due to the advent of flash-based persistent this classic capability. We present ioSnap, a flash optimized memory [2, 5, 9]. Flash provides much higher throughput snapshot system. Through careful design exploiting com- and lower latency to applications, particularly for those with mon snapshot usage patterns and flash oriented optimiza- random I/O needs, and thus is increasingly integral in mod- tions, including leveraging native characteristics of Flash ern storage stacks. Translation Layers, ioSnap delivers low-overhead snapshots As flash-based storage becomes more prevalent by the with minimal disruption to foreground traffic. Through our rapid fall in prices (from hundreds of dollars per GB to evaluation, we show that ioSnap incurs negligible perfor- about $1 per GB in the span of 10 years [8]) and significant mance overhead during normal operation, and that common- increase in capacity (up to several TBs [7]), customers re- case operations such as snapshot creation and deletion in- quire flash devices to deliver the same features as traditional cur little cost. We also demonstrate techniques to mitigate disk-based systems. However, the order of magnitude per- the performance impact on foreground I/O during intensive formance differential provided by flash requires a rethink of snapshot operations such as activation. Overall, ioSnap rep- both the requirements imposed on classic data services such resents a case study of how to integrate snapshots into a mod- as snapshots and the implementation of these data services. ern, well-engineered flash-based storage system. Most traditional disk-based snapshot systems rely on Copy-on-Write (CoW) or Redirect-on-Write (RoW) [1, 11, 13, 18] since snapshotted blocks have to be preserved and 1. Introduction not overwritten. It is a well known fact that SSDs also rely on Modern disk-based storage systems need to support a rich a form of Redirect-on-Write (or Remap-on-Write) to deliver array of data services. Beyond a bevy of reliability ma- high throughput and manage device wear [2]. Thus, at first chinery, such as checksums [4, 28], redundancy [17], and glance, it seems natural for snapshots to seamlessly work crash consistency techniques [10, 14, 23], customers expect with an SSD’s flash translation layer (FTL) since the SSD systems to provide time and space saving capabilities, in- does not overwrite blocks in-place and implicitly provides cluding caching [6, 16], deduplication [33] and archival ser- time-ordering through log structured writes. vices [22]. In this paper, we describe the design and implementation An important feature common in disk-based file systems of ioSnap, a system which adds flash-optimized snapshots and block stores is the ability to create and access snap- to an FTL, in our case the Fusion-io Virtual Storage Layer shots [11, 15, 30]. A snapshot is a point-in-time represen- (VSL). Through ioSnap we show how a flash-aware snapshot implementation can meet more stringent performance requirements as well as leverage the RoW capability native to the FTL. Permission to make digital or hard copies of all or part of this work for personal or We also present a careful analysis of ioSnap performance classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation and space overheads. Through experimentation, we show on the first page. Copyrights for components of this work owned by others than ACM that ioSnap delivers excellent common case read and write must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a performance, largely indistinguishable from the standard fee. Request permissions from [email protected]. FTL. We also measure the costs of creating, deleting, and EuroSys 2014, April 13 - 16 2014, Amsterdam, Netherlands. Copyright © 2014 ACM 978-1-4503-2704-6/14/04. $15.00. accessing snapshots, showing that they are reasonable; in http://dx.doi.org/10.1145/2592798.2592825 addition, we show how our rate-limiting machinery can re- duce the impact of ioSnap activity substantially, ensuring System T ype Metadata Consistency little impact on foreground I/O. Efficiency The main contributions of this paper are as follows: WAFL [11] CoW FS - Yes Btrfs [1] CoW FS - Yes ext3cow [18] CoW FS - Yes • We revisit snapshot requirements and interpret them anew PersiFS [20] CoW FS - Yes for the flash-based storage era where performance, capac- Elephant [24] CoW FS Journaled Yes ity/performance ratios, and predictable performance re- Fossil [22] CoW FS - Yes quirements are substantially different. Spiralog [29] Log FS Log Yes NILFS [13] Log FS Log Yes • We exploit the RoW capability natively present in FTLs to VDisk [30] BC on VM - No meet the stringent snapshot requirements described above, Peabody [15] BC on VD - Yes while also maintaining the performance efficiency for reg- Virtual Disks [19] BC on VD - Yes ular operations that is the primary deliverable of any pro- BVSSD [12] BC - No duction FTL. LTFTL [27] BC - No SSS [26] Object CoW Journaled [25] Yes • We explore a unique design space, which focuses on ZeroSnap [21] Flash Array - Yes avoiding time and space overheads in common-case oper- TRAP Array [32] RAID CoW XOR No ations (such as data access, snapshot creation, and snapshot deletion) by sacrificing performance of rarely used operations (such as snapshot access). Table 1. Versioning storage systems. The table presents a summary of several versioning systems comparing some of the rel- • We introduce novel rate-limiting algorithms to minimize evant characteristics including the type (Copy-on-Write, log struc- impact of background snapshot work on user activity. tured, file system based or block layer), metadata versioning ef- • We perform a thorough analysis depicting the costs and ficiency and snapshot consistency. (BC: Block CoW, VD: Virtual benefits of ioSnap, including a comparison with Btrfs, a Disk) well known disk-optimized snapshotting system [1]. lack of metadata storage efficiency, and the need for other The rest of this paper is organized as follows. First, we tools outside the file system to access the snapshots [15]. describe background and related work on snapshots (§2). File system support for snapshots can overcome most Second, we motivate the need to revisit snapshot require- of the issues with block level systems including consis- ments for flash and outline a new set of requirements (§3). tent snapshots and efficient metadata storage. Systems like Third, we present our design goals and optimizations points PureStorage [21], Self Securing Storage [26] and NetApp for ioSnap (§4). We then discuss our design and implemen- filers [11] are standalone storage boxes and may implement tation (§5), present a detailed evaluation (§6), followed by a a proprietary file system to provide snapshots. File system discussion of our results(§7), and finally conclude (§8). level snapshotting can give rise to questions on the correct- ness, maintenance, and fault tolerance of these systems. For example, when namespace tunnels [11] are required for ac- 2. Background cessing older data, an unstable active file system could make Snapshots are point-in-time representations of the state of snapshots inaccessible. Some block level systems also face a storage device. Typical storage systems employ snapshots issues with system upgrades (e.g., Peabody [15]) but others to enable efficient backups and more recently to create au- have used user level tools to interpret snapshotted data, thus dit trails for regulatory compliance [18]. Many disk based avoiding the dependency on the file system [30]. snapshot systems have been implemented with varied design Copy-on-Write vs. Redirect-on-Write requirements: some systems implement snapshots in the file In Copy-on-Write (CoW) based snapshots, new data system while others implement it at the block device layer. writes to the primary volume are updated in place while Some systems have focused on the efficiency and security the old data, now belonging to the snapshot, is explicitly aspects of snapshots, while others have focused on data re- copied to a new location. This approach has the advantage tention and snapshot access capabilities. In this section we of maintaining contiguity for the primary data copy, but in- briefly overview existing snapshot systems and their design duces overheads for regular I/O when snapshots are present. choices. Table 1 is a more extensive (though not exhaustive) Redirect-on-Write (RoW) sends new writes to alternate lo- list of snapshotting systems and their major design choices. cations, sacrificing contiguity in the primary copy for lower Block Level or File System? overhead updates in the presence of snapshots. Since FTLs From the implementation perspective, snapshots may be perform Remap-on-Write, which for our purposes is con- implemented at the file system or at the block layer. Both ceptually similar to Redirect-on-Write, we use the two terms systems have their advantages. Block layer implementations interchangeably in this paper. let the snapshot capability stay independent of the file system Metadata: Efficiency and Consistency above, thus making deployment simpler, generic, and hassle- Any snapshotting system has to not only keep versions of free [30].

Snapshots in a Flash with Iosnap

LLNL Computation Directorate Annual Report (2014)

Replication, History, and Grafting in the Ori File System

Sistemas De Archivos Distribuido Para Clúster HPC Utilizando Ceph

The Evolution of File Systems

Content Addressed, Versioned, P2P File System (DRAFT 3)

The File Systems Evolution

February 2012 Vol

Máster En Bioinformática Y Biología Computacional Trabajo Fin De

Filesystems” by Vince Freeh (NCSU) Journaling

Fossil an Archival File Server

Comparison of File Systems

Replication, History, and Grafting in the Ori File System