Recursive Updates in Copy-On-Write File Systems - Modeling and Analysis

2342 JOURNAL OF COMPUTERS, VOL. 9, NO. 10, OCTOBER 2014 Recursive Updates in Copy-on-write File Systems - Modeling and Analysis Jie Chen*, Jun Wang†, Zhihu Tan*, Changsheng Xie* *School of Computer Science and Technology Huazhong University of Science and Technology, China *Wuhan National Laboratory for Optoelectronics, Wuhan, Hubei 430074, China [email protected], {stan, cs_xie}@hust.edu.cn †Dept. of Electrical Engineering and Computer Science University of Central Florida, Orlando, Florida 32826, USA [email protected] Abstract—Copy-On-Write (COW) is a powerful technique recursive update. Recursive updates can lead to several for data protection in file systems. Unfortunately, it side effects to a storage system, such as write introduces a recursively updating problem, which leads to a amplification (also can be referred as additional writes) side effect of write amplification. Studying the behaviors of [4], I/O pattern alternation [5], and performance write amplification is important for designing, choosing and degradation [6]. This paper focuses on the side effects of optimizing the next generation file systems. However, there are many difficulties for evaluation due to the complexity of write amplification. file systems. To solve this problem, we proposed a typical Studying the behaviors of write amplification is COW file system model based on BTRFS, verified its important for designing, choosing, and optimizing the correctness through carefully designed experiments. By next generation file systems, especially when the file analyzing this model, we found that write amplification is systems uses a flash-memory-based underlying storage greatly affected by the distributions of files being accessed, system under online transaction processing (OLTP) which varies from 1.1x to 4.2x. We further found that write workloads. That’s because the OLTP workloads amplification is also affected by the number of files being introduce random write access pattern, which would accessed, the number of files contained in a file system, and trigger the worst case of recursive updates. Besides, the as well as the space utilization of file system trees. flash-memory media would also suffer from the effects of Index Terms—copy-on-write, file systems, write high write amplification because of its limited write- amplification endurance and poor write performance. There are many difficulties for evaluating the behaviors of write amplification. First, the mainstream I. INTRODUCTION COW file systems like ZFS [2], BTRFS [3] are running in the OS kernel. It is hard to hack these file systems for Copy-On-Write (COW) is one of the fundamental evaluation because of their complex implementation. update policies used when modifying data in disk blocks. Second, a recursive update process is affected by many With COW update policy, the target block is read into factors such as the organization of a file system, the memory, modified, and then written to disk at an alternate number of files contained in a file system, the distribution location (not overwriting the old data). Since it never of files being accessed, as well as the time epoch a overwrites old data, COW is usually used to prevent data checkpoint lasts. It is hard to evaluate how these factors loss from system crashes in file systems [1-3]. influence the write amplification in a real file system. Nevertheless, COW introduces an unpleasant recursive To solve this problem, we proposed a typical COW file updating procedure. Assuming that a file system is a large system model based on BTRFS, and verified its tree made up of disk blocks, when a leaf block is correctness through carefully designed experiments. By modified with the COW policy, its parent node also needs analyzing this model, we found that write amplification is to be modified to update the new location of the modified greatly affected by the distributions of files being child block. This update process will recursively occur accessed, which varies from 1.1x to 4.2x. We further until it reaches the root block which can be updated in a found that write amplification is also affected by the fixed place on disk. We define such a procedure as a number of files accessed, the number of files contained in the file system, and as well as the space utilization of file Manuscript received March 2, 2014; revised May 21, 2014; accepted system trees. May 22, 2014. The contributions of this paper are: © 2014 ACADEMY PUBLISHER doi:10.4304/jcp.9.10.2342-2351 JOURNAL OF COMPUTERS, VOL. 9, NO. 10, OCTOBER 2014 2343 • To our knowledge, the first study to systematically analyze the behaviors of write amplification caused by recursive updates; • Proposed a B-tree based file system model. The rest of this paper is organized as follows. In Section II, we motivate our work by discussing the background of recursive updates. In Section III, we describe our COW file system model. Section IV describes the methodology of calculation and verification. Section V describes the verification results and theoretical analysis results. Section VI discusses the related work and Section VII concludes our work. II. BACKGROUND This section provides the background of recursive updates. Here, we discuss what is copy-on-write, what is the definition of recursive updates, how does it work in Figure 1. A file system can be conceptually modeled as a tree made up file systems, and what are their effects. of disk blocks. A. What Is COW? not modified directly. Instead, a new block F is allocated, COW is one of the fundamental update policies used in and then the data in f is copied to F, and then requested storage systems. The basic schema is never overwriting modifications are made in F. However, the modified data old data. When updating a block with COW policy, the in block F cannot be seen by the file system until its data block is read into memory, modified, and then block address has been updated in its parent block d. This written to a new location, leaving the old data unmodified. means that the modification to the child propagates to its COW update policy has been used vastly in storage parent block. Furthermore, this modification to the parent systems: block will continue propagating along with the child- Protecting data: File systems like WAFL [1], ZFS [2], parent path until it reaches a special node which can be and BTRFS [3] use COW update policy to implement modified at a fixed place. We define such a procedure as snapshot for data protection. a recursive update. Improving performance: Log-structured file systems, In order to mitigate the overhead of recursive updates, such as LFS [7], use COW update policy to transform the COW file systems usually use a checkpoint (or access pattern from a large amount of small random transaction) mechanism. The checkpoint (or transaction) writes to a single large sequential write, which leverages mechanism is used to accumulate updates in memory and the disk sequential I/O semantics. apply them all at once to form a consistency view of the Updating data on special media: Write-once-read- whole file system structure. (In the following, we refer many media, such as optical disk [8], uses COW to checkpoint as the operation of flushing all modified data implement random write. Flash-memory file systems, to disk, while referring transaction as the process of such as CFFS [9], FlashFS [10], JFFS [11], use COW to accumulating updates and flushing them back during two optimize update operations to improve write performance contiguous checkpoints.) As seen from Fig. 2, after the and implement wear-leveling. last transaction is flushed to disk, a new transaction will Different than COW, the natural update policy is called start. Within the transaction, modifications to a block Update-In-Place (UIP), which means the target data is only trigger its COW operation once at the first time it is read into memory, modified, and then written to disk at modified, which means a previously modified block can its original location (overwriting the old data). be updated in place. Finally, the transaction commit operation will perform a checkpoint. Several conditions B. What Is the Definition of Recursive Updates? can trigger the commit operation, such as a calling to File systems can be conceptually modeled as a tree of fsync, a write-operation with O_SYNC flag, the number of disk blocks, as seen in Fig. 1. The file system tree rooted modified blocks reaches a predefined upper limit, as well at the super block. Inodes are the immediate children of as the current transaction is timeout (usually 30s). the root, and they in turn are the parents of data blocks and/or indirect or even double-indirect blocks. Thus, C. What Are the Effects of Recursive Updates? every allocated block with the exception of the super Here, we identify three side effects of recursive block has a parent. updates: COW update policy causes recursive updates in a file Write amplification: Recursive updates may cause system tree. In COW file systems, a modification to a write amplification. As shown in Fig. 2, the application disk block is always written to a newly allocated block, only needs to modify one leaf data block f, however, the which recursively updates the appropriate pointers in the recursive update causes a total of four tree blocks (super, parent blocks. Fig. 2 illustrates this process. When the A, D, F) modified. So, the data actually flushed is as high application requests to modify the block f, the block f is as 4x of the data requested. In practice, the amount of © 2014 ACADEMY PUBLISHER 2344 JOURNAL OF COMPUTERS, VOL. 9, NO. 10, OCTOBER 2014 checkpoint checkpoint Figure 2. Recursive updates within a transaction in COW file systems.

Recursive Updates in Copy-On-Write File Systems - Modeling and Analysis

Development of a Verified Flash File System ⋆

DASH: Database Shadowing for Mobile DBMS

Redbooks Paper Linux on IBM Zseries and S/390

CS 152: Computer Systems Architecture Storage Technologies

AMD Alchemy™ Processors Building a Root File System for Linux® Incorporating Memory Technology Devices

BEC Brochure for Partners

F2punifycr: a Flash-Friendly Persistent Burst-Buffer File System

Yaffs Direct Interface

Zálohuj S BTRFS!

Architecture and Performance of Perlmutter's 35 PB Clusterstor

YAFFS a NAND Flash Filesystem

FRASH: Hierarchical File System for FRAM and Flash