An Efficient Locking Scheme for Path-Based File Systems

Ritik Malhotra
University of California, Berkeley
[email protected]

ABSTRACT
Path-based file system drivers like FUSE, OSXFUSE, and CBFS have become increasingly popular tools for developers to create their own file systems. Developing a file system with one of these drivers differs in many ways from developing one from scratch; in particular, these drivers use paths to execute queries on files and folders instead of file handles or inode addresses. This limitation causes issues when handling concurrent requests: because these drivers do not implement any locking mechanisms out of the box, such file systems cannot provide the same guarantees as non-path-based ones. To fix this problem, we propose an efficient and correct locking scheme that allows path-based file systems to properly handle concurrent requests.

Keywords
file systems, locking, FUSE, concurrency

1. INTRODUCTION
File system development is a difficult task for most developers, especially because file systems must be written in the kernel-mode framework, with all of its idiosyncrasies. There are various quirks and problems to deal with, and a lot of boilerplate code has to be written just to get a barebones version of a file system up and running.

1.1 File System in Userspace
Fortunately, various open source projects, commercial third-party extensions, and other efforts have produced solutions that alleviate the difficulty of writing such file systems. These solutions allow the developer to abstract away the complexity of the kernel and write file systems without having to worry about some of the kernel's problems. One class of these solutions has emerged as a popular choice for developers writing file systems today: FUSE, which stands for File System in Userspace [7]. FUSE and its various incarnations, OSXFUSE for the Mac OS X operating system line [8] and Callback File System (CBFS) for the Windows operating system line [6], allow a developer to write a file system completely in user-space, without having to touch the kernel or develop anything for it. For the sake of this paper, we refer to this class of tools as FUSE, since these solutions are all based on its original concept.

FUSE accomplishes this goal by installing its kernel extension, which serves as the middleman between the file system the developer writes in user-space and the corresponding virtual file system (VFS) module, which lives in kernel-mode and handles the various file system method calls made by the operating system. FUSE's kernel extension acts as a thin wrapper between the VFS and the developer's file system. It simply bounces calls from one end to the other while taking care of the kernel-mode quirks, edge cases, security, and more, as a generic solution that the developer does not have to think about.

On top of the simplicity FUSE provides, it has a myriad of other benefits for the developer to take advantage of, including:

• Access to multiple languages to write the file system instead of being limited to using C for a kernel-mode file system driver. The FUSE file system a developer writes is essentially treated as any other application running in user-space. This gives the developer the freedom to write it in any language, provided that it follows the contract set by the FUSE kernel extension and can communicate back and forth with it.

• Access to the network. Making network requests in the kernel is not possible without some complicated machinery, but the opposite is true of FUSE file systems. Since they run like other applications, they are free to pick up any regular library and use it in the file system, including ones that handle network requests. This allows for easier development of network file systems and network-connected distributed file systems, which have recently become more popular.

However, FUSE is not without its limitations and has a few drawbacks versus writing a file system purely from scratch [4], including:

• Much slower performance due to the heavy context switching required between user-space and kernel-mode. This is expected, since each call that reaches the FUSE kernel extension has to be transported across the user-kernel boundary and queried on the user-space file system before the appropriate data is returned to the caller (which is another round trip through kernel-mode and back up to the calling application in user-space).

• No granular control over memory and cache management. Since FUSE file systems don't live in the kernel, we're restricted in the amount of low-level control we have over things like memory management, data caching, and other kernel-specific functions.

• Queries are done repeatedly via paths instead of an ID-based or pointer-based system. This is problematic since paths aren't the best way of identifying a node in a file system, as we discuss later in this paper.

The last issue presented, related to path-based queries, is the one we focus on in this paper. We delve into this issue to determine the ramifications of such a system, what limitations it poses, what potential solutions look like, how we solve the problem, and the benefits of doing so.

1.2 Path-based Queries
One of the biggest differences between a FUSE file system and a regular file system is that method queries are done via paths to file system objects, rather than an inode identifier, a file handle, or any other sort of pointer-based query method. This design is implemented in FUSE's kernel extension and is a deliberate choice, not a limitation of the underlying kernel software. FUSE's philosophy is to keep the kernel extension thin, so that it acts purely as a pass-through layer and performance stays as high as possible. This in turn means that FUSE doesn't keep a mapping between the path to an object and its inode ID, but rather passes the path that it receives from the VFS module up to the user-space file system implementation. From there on, the user-space file system is expected to keep its own consistent mapping of paths to the appropriate data structures containing the information about the object at that path.

This makes sense purely from a development standpoint: let the developers handle their own data structures and interfere as little as possible. This is generally a sign of a good API that abstracts out the right things, and on its own it sounds like a harmless feature. Unfortunately, this concept introduces more problems when combined with other aspects of the file system, which we introduce later in this paper.

1.3 Concurrency
Like most regular file systems, FUSE file systems are inherently designed to support concurrent operations on them. The FUSE kernel extension handles concurrent requests and passes them in a non-blocking fashion up to the user-space file system implementation, leaving it to handle proper concurrency control, serialization, etc. This is fantastic for performance, but pushes the burden of handling parallelism within the file system onto the developer.

Again, on its own, this is a great feature for FUSE file systems to support since it helps with performance, but combined with other quirks of how FUSE works, it poses a problem.

1.4 Concurrency in a Path-based File System
When we combine both these concepts, concurrency and path-based querying, a larger issue pops up: how do we properly serialize requests that could potentially run into conflicts or concurrent modification exceptions on the file system's core data structures?

To depict this problem, let's take a look at an example:

Algorithm 1 Example of Problem
procedure InterleavingThreadsExample
    Thread 1:
        write('/a/b', DataChunk1)
    Thread 2:
        rename('/a', '/c')
    Thread 1:
        write('/a/b', DataChunk2)

This example shows two interleaving threads (presumably two different applications working on the file system at the same time) acting on conflicting data. Let's take a look at what really happened here:

Thread #1 is undergoing a series of write() requests to write data to the file at /a/b. It executes the write requests in chunks defined by the kernel, so it has to make multiple write() calls to get all the data written to the file. Thread #2 is unaware of thread #1's intention to write data to a file that is a child node of the folder thread #2 is modifying. Thread #2 continues its operation and renames the folder at /a to /c. When thread #1 makes another write() call to write the second chunk of data, the file system is in an inconsistent state from what thread #1 thought it was at after [...]

[...] does the file system have to deal with interleaving threads modifying related data and leaving the file system in an inconsistent and corrupt state. If the file system has a global access lock for all operations, it immediately serializes all file system method calls and turns the file system into a single-threaded system.

But even this approach is limited, as it still puts the burden on the developer to ensure that certain file system operations, like reads and writes, are transactional.
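To make the limitation of a global lock concrete, the following minimal Python sketch replays the interleaving from Algorithm 1 against a toy in-memory store keyed by path. The `PathFS` class, its methods, and `replay_algorithm_1` are hypothetical names introduced only for illustration; they are not part of FUSE or of the locking scheme this paper proposes. The three calls are issued sequentially from a single thread, so that the problematic ordering is reproduced deterministically rather than left to the scheduler.

```python
import threading


class PathFS:
    """Hypothetical in-memory file system keyed by absolute path (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()  # one global lock shared by every operation
        self._files = {}               # absolute path -> bytearray of file contents

    def create(self, path):
        with self._lock:
            self._files[path] = bytearray()

    def write(self, path, chunk):
        # Each call is atomic, but a multi-chunk write spans several calls.
        with self._lock:
            if path not in self._files:
                raise FileNotFoundError(path)
            self._files[path] += chunk

    def rename(self, old, new):
        # Re-key the entry at `old` and every entry beneath it.
        old, new = old.rstrip("/"), new.rstrip("/")
        with self._lock:
            moved = {}
            for path in list(self._files):
                if path == old or path.startswith(old + "/"):
                    moved[new + path[len(old):]] = self._files.pop(path)
            self._files.update(moved)

    def read(self, path):
        with self._lock:
            return bytes(self._files[path])


def replay_algorithm_1():
    """Issue the three calls of Algorithm 1 in the interleaved order."""
    fs = PathFS()
    fs.create("/a/b")
    fs.write("/a/b", b"chunk1")      # Thread 1, first chunk
    fs.rename("/a", "/c")            # Thread 2 renames the parent folder
    try:
        fs.write("/a/b", b"chunk2")  # Thread 1, second chunk
        outcome = "second write succeeded"
    except FileNotFoundError:
        outcome = "second write failed: /a/b no longer exists"
    return outcome, fs.read("/c/b")


if __name__ == "__main__":
    print(replay_algorithm_1())
```

In this sketch every individual call holds the global lock, yet the second write still fails and the file, now at /c/b, holds only the first chunk. The lock serializes single calls but does not make the sequence of write() calls a transaction. How a real file system reacts at this point (an error, a newly created file at the old path, or something else) depends on its implementation; raising `FileNotFoundError` is just the choice made here.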
