An Efficient Locking Scheme for Path-Based File Systems

Total Page:16

File Type:pdf, Size:1020Kb

An Efficient Locking Scheme for Path-Based File Systems An Efficient Locking Scheme for Path-based File Systems Ritik Malhotra University of California, Berkeley [email protected] ABSTRACT day and age - FUSE, which stands for File System Path-based file system drivers like FUSE, OSX- in Userspace [7]. FUSE, and its various incarna- FUSE, and CBFS have become increasingly popu- tions - OSXFUSE for the Mac OS X operating lar tools for developers to create their own file sys- system line [8] and Callback File System (CBFS) tem. Developing a file system using one of these for the Windows operating system line [6] - allow drivers has many differences than developing one a developer to write a file system completely in from scratch, including the fact that these drivers user-space, without having to touch the kernel or use paths to execute queries on files and folders in- develop anything for it. For the sake of this paper, stead of file handles or inode addresses. This limi- we refer to this class of tools as FUSE, since these tation causes issues when trying to handle concur- solutions are all based off its original concept. rent requests since these file systems cannot pro- vide the same guarantees as non-path-based ones FUSE is able to accomplish this goal by installing since they don't implement any locking mecha- its kernel extension and having that be the middle- nisms out of the box. To fix this problem, we man communication library between the file sys- propose an efficient and correct locking scheme to tem the developer develops in user-space, and the allow path-based file systems to properly handle corresponding virtual file system (VFS) module concurrent requests. that lives in kernel-mode and interacts with the various file system method calls made by the op- Keywords erating system. FUSE's kernel extension acts as a thin wrapper between the VFS and the developer's file systems, locking, FUSE, concurrency file system, simply acting as a unit that bounces calls from one end to the other, all while taking 1. INTRODUCTION care of the kernel-mode quirks, edge cases, secu- File system development is a difficult task for most rity, and more as a generic abstract solution for developers, especially because of the dependency the developer to not have to think about. on writing file systems in the kernel-mode frame- work and having to deal with its various idiosyn- On top of the simplicity FUSE provides, it has a crasies. There are various quirks and problems and myriad of other benefits for the developer to take a lot of boilerplate code that has to be written just advantage of, including: to get a barebones working version of a file system up and running. • Access to multiple languages to write the file system instead of being limited to using C 1.1 File System in Userspace for a kernel-mode file system driver. The Fortunately, there have been various solutions to FUSE file system a developer gets to write alleviate the difficulty of writing such file systems is essentially treated as any other applica- by different open source projects, commercial third- tion running in user-space. This gives the party extensions, and more. These solutions allow developer the freedom to write it in any lan- the developer to abstract the complexity of the guage, provided that it follows the contract kernel and write file systems without having to set by the FUSE kernel extension and can worry about some of the kernel's problems. One communicate back and forth with it. class of these solutions has emerged to be a popu- lar choice for developers writing file systems in this • Access to the network. Doing network re- quests in the kernel is not possible without some complicated machinery, but is the com- plete opposite with FUSE file systems. Since they run like other applications, they also the VFS module up to the user-space file system have the freedom to pick up any regular li- implementation. From there on, the user-space brary and use it in the file system, including file system is expected to keep its own consistent ones that handle network requests. This al- mapping of paths to the appropriate data struc- lows for easier development of network file tures containing the information about the object systems and network-connected distributed at that path. file systems, which are becoming more pop- ular recently. This makes sense purely on a development side - let the developers handle their own data structures and interfere as little as possible. This is generally However, FUSE is not without its limitations and a sign of a good API that abstracts out the right has a few drawbacks versus writing a file system things, and standalone sounds like a harmless fea- purely from scratch [4], including: ture. Unfortunately this concept introduces more problems when combined with other aspects of the file system, which we introduce later in this paper. • Much slower performance due to the heavy context switching required between user-space and kernel-mode. This is expected since each 1.3 Concurrency call that comes to the FUSE kernel extension Like most regular file systems, FUSE file systems has to be transported across the user-kernel are inherently designed to support concurrent op- boundary to be queried on the user-space file erations on it. The FUSE kernel extension han- system before returning the appropriate data dles concurrent requests and passes them in a non- to the caller (which is another round trip blocking fashion up to the user-space file system through kernel-mode and back up the call- implementation and leaves it to handle proper con- ing application in user-space). currency control, serialization, etc. This is fantas- tic for performance reasons, but pushes the burden • Unable to get granular control over memory of handling parallelism within the file system on to and cache management. Since FUSE file sys- the developer. tems don't live in the kernel, we're restricted on the amount of low-level control we have Again, standalone, this is a great feature for FUSE over things like memory management, data file systems to support since it helps with perfor- caching, and other kernel-specific functions. mance, but combined with other quirks of how FUSE works, this poses a problem. • Queries are done repeatedly via paths in- stead of an ID-based or pointer-based sys- tem. This is problematic since paths aren't 1.4 Concurrency in a Path-based File the best form of identifying a node in a file System system, as we later discuss in this paper. When we combine both these concepts - concur- rency and path-based querying - there's a larger issue that pops up: how do we properly serialize The last issue presented, related to path-based requests that could potentially run into conflicts queries, is the one we focus on in this paper. We or concurrent modification exceptions on the file delve into this issue to determine the ramifica- system's core data structures. tions of such a system, what limitations it poses, what potential solutions look like, how we solve To depict this problem, let's take a look at an ex- the problem, and the benefits from doing so. ample: 1.2 Path-based Queries Algorithm 1 Example of Problem One of the biggest differences with a FUSE file 1: procedure InterleavingThreadsExample system and a regular file system is the fact that 2: Thread 1 : method queries are done via paths to file system 3: write(0=a=b0; DataChunk1) objects, rather than an inode identifier, a file han- 4: Thread 2 : dle, or any other sort of pointer-based query method. 5: rename(0=a0;0 =c0) This design choice is done within FUSE's kernel 6: Thread 1 : extension, and is done by choice, not by some lim- 7: write(0=a=b0; DataChunk2) itation of the underlying kernel software. FUSE's mantra resides with keeping the kernel extension thin in order to keep performance as high as pos- sible and act purely as a pass-through layer. This This example shows two interleaving threads (pre- in turn means that FUSE doesn't keep a mapping sumably two different applications working on the between the path to an object and its inode ID, file system at the same time) acting on conflicting but rather passes the path that it receives from data. Let's take a look at what really happened here: Thread #1 is undergoing a series of write() does the file system have to deal with interleaving requests to write data to the file at /a/b. It exe- threads modifying related data and leaving the file cutes the write requests in chunks defined by the system in an inconsistent and corrupt state. If kernel, so it has to make multiple write() calls to the file system has a global access lock for all op- get all the data written to the file. Thread #2 is erations, it immediately serializes all file system unaware of thread #1's intention to write data to method calls and turns the file system into a single- a file that is a child node of the folder thread #2 threaded system. is modifying. Thread #2 continues its operation and renames the folder at /a to /c. When thread But even this approach is limited as it still puts #1 makes another write() call to write the second the burden on the developer to ensure that certain chunk of data, the file system is in an inconsistent file system operations are transactional, like reads state from what thread #1 thought it was at after and writes.
Recommended publications
  • Configuring UNIX-Specific Settings: Creating Symbolic Links : Snap
    Configuring UNIX-specific settings: Creating symbolic links Snap Creator Framework NetApp September 23, 2021 This PDF was generated from https://docs.netapp.com/us-en/snap-creator- framework/installation/task_creating_symbolic_links_for_domino_plug_in_on_linux_and_solaris_hosts.ht ml on September 23, 2021. Always check docs.netapp.com for the latest. Table of Contents Configuring UNIX-specific settings: Creating symbolic links . 1 Creating symbolic links for the Domino plug-in on Linux and Solaris hosts. 1 Creating symbolic links for the Domino plug-in on AIX hosts. 2 Configuring UNIX-specific settings: Creating symbolic links If you are going to install the Snap Creator Agent on a UNIX operating system (AIX, Linux, and Solaris), for the IBM Domino plug-in to work properly, three symbolic links (symlinks) must be created to link to Domino’s shared object files. Installation procedures vary slightly depending on the operating system. Refer to the appropriate procedure for your operating system. Domino does not support the HP-UX operating system. Creating symbolic links for the Domino plug-in on Linux and Solaris hosts You need to perform this procedure if you want to create symbolic links for the Domino plug-in on Linux and Solaris hosts. You should not copy and paste commands directly from this document; errors (such as incorrectly transferred characters caused by line breaks and hard returns) might result. Copy and paste the commands into a text editor, verify the commands, and then enter them in the CLI console. The paths provided in the following steps refer to the 32-bit systems; 64-bit systems must create simlinks to /usr/lib64 instead of /usr/lib.
    [Show full text]
  • Mac Keyboard Shortcuts Cut, Copy, Paste, and Other Common Shortcuts
    Mac keyboard shortcuts By pressing a combination of keys, you can do things that normally need a mouse, trackpad, or other input device. To use a keyboard shortcut, hold down one or more modifier keys while pressing the last key of the shortcut. For example, to use the shortcut Command-C (copy), hold down Command, press C, then release both keys. Mac menus and keyboards often use symbols for certain keys, including the modifier keys: Command ⌘ Option ⌥ Caps Lock ⇪ Shift ⇧ Control ⌃ Fn If you're using a keyboard made for Windows PCs, use the Alt key instead of Option, and the Windows logo key instead of Command. Some Mac keyboards and shortcuts use special keys in the top row, which include icons for volume, display brightness, and other functions. Press the icon key to perform that function, or combine it with the Fn key to use it as an F1, F2, F3, or other standard function key. To learn more shortcuts, check the menus of the app you're using. Every app can have its own shortcuts, and shortcuts that work in one app may not work in another. Cut, copy, paste, and other common shortcuts Shortcut Description Command-X Cut: Remove the selected item and copy it to the Clipboard. Command-C Copy the selected item to the Clipboard. This also works for files in the Finder. Command-V Paste the contents of the Clipboard into the current document or app. This also works for files in the Finder. Command-Z Undo the previous command. You can then press Command-Shift-Z to Redo, reversing the undo command.
    [Show full text]
  • Powershell Integration with Vmware View 5.0
    PowerShell Integration with VMware® View™ 5.0 TECHNICAL WHITE PAPER PowerShell Integration with VMware View 5.0 Table of Contents Introduction . 3 VMware View. 3 Windows PowerShell . 3 Architecture . 4 Cmdlet dll. 4 Communication with Broker . 4 VMware View PowerCLI Integration . 5 VMware View PowerCLI Prerequisites . 5 Using VMware View PowerCLI . 5 VMware View PowerCLI cmdlets . 6 vSphere PowerCLI Integration . 7 Examples of VMware View PowerCLI and VMware vSphere PowerCLI Integration . 7 Passing VMs from Get-VM to VMware View PowerCLI cmdlets . 7 Registering a vCenter Server . .. 7 Using Other VMware vSphere Objects . 7 Advanced Usage . 7 Integrating VMware View PowerCLI into Your Own Scripts . 8 Scheduling PowerShell Scripts . 8 Workflow with VMware View PowerCLI and VMware vSphere PowerCLI . 9 Sample Scripts . 10 Add or Remove Datastores in Automatic Pools . 10 Add or Remove Virtual Machines . 11 Inventory Path Manipulation . 15 Poll Pool Usage . 16 Basic Troubleshooting . 18 About the Authors . 18 TECHNICAL WHITE PAPER / 2 PowerShell Integration with VMware View 5.0 Introduction VMware View VMware® View™ is a best-in-class enterprise desktop virtualization platform. VMware View separates the personal desktop environment from the physical system by moving desktops to a datacenter, where users can access them using a client-server computing model. VMware View delivers a rich set of features required for any enterprise deployment by providing a robust platform for hosting virtual desktops from VMware vSphere™. Windows PowerShell Windows PowerShell is Microsoft’s command line shell and scripting language. PowerShell is built on the Microsoft .NET Framework and helps in system administration. By providing full access to COM (Component Object Model) and WMI (Windows Management Instrumentation), PowerShell enables administrators to perform administrative tasks on both local and remote Windows systems.
    [Show full text]
  • Where Do You Want to Go Today? Escalating
    Where Do You Want to Go Today? ∗ Escalating Privileges by Pathname Manipulation Suresh Chari Shai Halevi Wietse Venema IBM T.J. Watson Research Center, Hawthorne, New York, USA Abstract 1. Introduction We analyze filename-based privilege escalation attacks, In this work we take another look at the problem of where an attacker creates filesystem links, thereby “trick- privilege escalation via manipulation of filesystem names. ing” a victim program into opening unintended files. Historically, attention has focused on attacks against priv- We develop primitives for a POSIX environment, provid- ileged processes that open files in directories that are ing assurance that files in “safe directories” (such as writable by an attacker. One classical example is email /etc/passwd) cannot be opened by looking up a file by delivery in the UNIX environment (e.g., [9]). Here, an “unsafe pathname” (such as a pathname that resolves the mail-delivery directory (e.g., /var/mail) is often through a symbolic link in a world-writable directory). In group or world writable. An adversarial user may use today's UNIX systems, solutions to this problem are typ- its write permission to create a hard link or symlink at ically built into (some) applications and use application- /var/mail/root that resolves to /etc/passwd. A specific knowledge about (un)safety of certain directories. simple-minded mail-delivery program that appends mail to In contrast, we seek solutions that can be implemented in the file /var/mail/root can have disastrous implica- the filesystem itself (or a library on top of it), thus providing tions for system security.
    [Show full text]
  • An Incremental Path Towards a Safer OS Kernel
    An Incremental Path Towards a Safer OS Kernel Jialin Li Samantha Miller Danyang Zhuo University of Washington University of Washington Duke University Ang Chen Jon Howell Thomas Anderson Rice University VMware Research University of Washington LoC Abstract Incremental Progress Tens of Linux Safe Linux Linux has become the de-facto operating system of our age, Millions FreeBSD but its vulnerabilities are a constant threat to service availabil- Hundreds of Singularity ity, user privacy, and data integrity. While one might scrap Thousands Biscuit Linux and start over, the cost of that would be prohibitive due Theseus Thousands RedLeaf seL4 to Linux’s ubiquitous deployment. In this paper, we propose Hyperkernel Safety an alternative, incremental route to a safer Linux through No Type Ownership Functional proper modularization and gradual replacement module by Guarantees Safety Safety Verification module. We lay out the research challenges and potential Figure 1: Our vision and the current state of systems. solutions for this route, and discuss the open questions ahead. security vulnerabilities are reported each year, and lifetime CCS Concepts analysis suggests that the new code added this year has intro- duced tens of thousands of more bugs. ! • Software and its engineering Software verification; While operating systems are far from the only source of se- ! • Computer systems organization Reliability. curity vulnerabilities, it is hard to envision building trustwor- Keywords thy computer systems without addressing operating system correctness. kernel safety, verified systems, reliable systems One attractive option is to scrap Linux and start over. Many ACM Reference Format: past works focus on building more secure and correct op- Jialin Li, Samantha Miller, Danyang Zhuo, Ang Chen, Jon Howell, erating systems from the ground up: ones based on strong and Thomas Anderson.
    [Show full text]
  • Simple File System (SFS) Format
    Simple File System (SFS) Format Version SFS-V00.01 The SFS is a simple file system format optimized for writing and for low overhead.. The advantages of this format are: • Event navigation is possible using simple content-independent file system like functions. • Very low overhead. No loss due to block size granularity • Entire valid file system can be created by appending content On the other hand, random access directory navigation is rather slow because there is no built-in indexing or directory hierarchy. For a 500MB file system containing files with 5k bytes this represents an initial search overhead of ~1-2 sec (~100,000 seeks). SFS Structure The structure of a SFS file is as follows VolumeSpec Head File1 File1 Binary Data File2 File2 Binary Data ... ... Tail VolumeSpec: This is simply a 12 byte character string representing filesystem version. For example: “SFS V00.01” Head: This is a short header record. The byte order only applies to the time field of this record. type = “HEAD” byte_order = 0x04030201 time File: The File records are a variable length record containing information about a file. type = “FILE” byte_order = 0x04030201 Sz head_sz attr reserved name.... name (continued).... “byte_order” corresponds only to this header. The endiness of the data is undefined by SFS “sz” corresponds to the datafile size. This may be any number, but the file itself will be padded to take up a multiple of 4 bytes “head_sz” this must be a multiple of 4 “attr” SFS_ATTR_INVALID: file deleted SFS_ATTR_PUSHDIR: push current path to path stack SFS_ATTR_POPDIR: pop current path from path stack SFS_ATTR_NOCD: this record doesn’t reset the basedir “name” the name of the file.
    [Show full text]
  • A Plethora of Paths
    A Plethora of Paths Eric Larson Seattle University [email protected] Abstract A common static software bug detection technique is to use path simulation. Each execution path is simulated using symbolic variables to determine if any software errors could occur. The scalability of this and other path-based approaches is dependent on the number of paths in the pro- gram. This paper explores the number of paths in 15 differ- ent programs. Often, there are one or two functions that contribute a large percentage of the paths within a program. A unique aspect in this study is that slicing was used in dif- ferent ways to determine its effect on the path count. In par- ticular, slicing was applied to each interesting statement individually to determine if that statement could be analyzed without suffering from path explosion. Results show that slic- ing is only effective if it can completely slices away a func- tion that suffers from path explosion. While most programs had several statements that resulted in short path counts, slicing was not adequate in eliminating path explosion occurrences. Looking into the tasks that suffer from path explosion, we find that functions that process input, produce stylized output, or parse strings or code often have signifi- Figure 1. Sample Control Flow Graph. cantly more paths than other functions. a reasonable amount of time. One technique to address path 1. Introduction explosion is to use program slicing [11, 20] to remove code not relevant to the property being verified. Static analysis is an important tool in software verifica- Consider the sample control flow graph in Figure 1.
    [Show full text]
  • File Manager
    File Manager 2016.12 Rev. 1.0 Contents 1.Introduction....................................................................................1 1-1.Overview .........................................................................................................1 1-2. System Requirements and Restrictions .......................................................1 2.Installation on Windows PC ...........................................................2 2-1.Installation......................................................................................................2 2-2.Uninstallation.................................................................................................4 3.How to use RSD-FDFM ..................................................................5 3-1.How to proceed with format CF media..........................................................6 3-2.How to proceed with writing data to CF media ............................................8 3-3.How to create files/folders at CF media.......................................................11 3-4.How to delete files/folders at CF media.......................................................14 3-5.How to save files/folders at CF media .........................................................15 3-6.A list of error messages of RSD-FDFM .......................................................16 *All trademarks and logos are the properties of their respective holders. *The specifications and pictures are subject to change without notice. 1.Introduction Thank you for purchasing file
    [Show full text]
  • When Powerful SAS Meets Powershell
    PharmaSUG 2018 - Paper QT-06 ® TM When Powerful SAS Meets PowerShell Shunbing Zhao, Merck & Co., Inc., Rahway, NJ, USA Jeff Xia, Merck & Co., Inc., Rahway, NJ, USA Chao Su, Merck & Co., Inc., Rahway, NJ, USA ABSTRACT PowerShell is an MS Windows-based command shell for direct interaction with an operating system and its task automation. When combining the powerful SAS programming language and PowerShell commands/scripts, we can greatly improve our efficiency and accuracy by removing many trivial manual steps in our daily routine work as SAS programmers. This paper presents five applications we developed for process automation. 1) Automatically convert RTF files in a folder into PDF files, all files or a selection of them. Installation of Adobe Acrobat printer is not a requirement. 2) Search the specific text of interest in all files in a folder, the file format could be RTF or SAS Source code. It is very handy in situations like meeting urgent FDA requests when there is a need to search required information quickly. 3) Systematically back up all existing files including the ones in subfolders. 4) Update the attributes of a selection of files with ease, i.e., change all SAS code and their corresponding output including RTF tables and SAS log in a production environment to read only after database lock. 5) Remove hidden temporary files in a folder. It can clean up and limit confusion while delivering the whole output folder. Lastly, the SAS macros presented in this paper could be used as a starting point to develop many similar applications for process automation in analysis and reporting activities.
    [Show full text]
  • Apple File System Reference
    Apple File System Reference Developer Contents About Apple File System 7 General-Purpose Types 9 paddr_t .................................................. 9 prange_t ................................................. 9 uuid_t ................................................... 9 Objects 10 obj_phys_t ................................................ 10 Supporting Data Types ........................................... 11 Object Identifier Constants ......................................... 12 Object Type Masks ............................................. 13 Object Types ................................................ 14 Object Type Flags .............................................. 20 EFI Jumpstart 22 Booting from an Apple File System Partition ................................. 22 nx_efi_jumpstart_t ........................................... 24 Partition UUIDs ............................................... 25 Container 26 Mounting an Apple File System Partition ................................... 26 nx_superblock_t ............................................. 27 Container Flags ............................................... 36 Optional Container Feature Flags ...................................... 37 Read-Only Compatible Container Feature Flags ............................... 38 Incompatible Container Feature Flags .................................... 38 Block and Container Sizes .......................................... 39 nx_counter_id_t ............................................. 39 checkpoint_mapping_t ........................................
    [Show full text]
  • The OS/2 Warp 4 CID Software Distribution Guide
    SG24-2010-00 International Technical Support Organization The OS/2 Warp 4 CID Software Distribution Guide January 1998 Take Note! Before using this information and the product it supports, be sure to read the general information in Appendix D, “Special Notices” on page 513. First Edition (January 1998) This edition applies to OS/2 Warp 4 in a software distribution environment and to NetView Distribution Manager/2 (NVDM/2) with Database 2 (DB2) Version 2.11 for OS/2 and Tivoli TME 10 Software Distribution 3.1.3 for OS/2 (SD4OS2) software distribution managers. This edition also applies to OS/2 Warp 4 subcomponents, such as TCP/IP 4.0, MPTS, NetFinity 4.0, Peer Services, and LAN Distance, and to OS/2- related products, such as eNetwork Personal Communications for OS/2 Warp, eNetwork Communications Server for OS/2 Warp, Transaction Server, Lotus Notes, and Lotus SmartStuite 96 for OS/2 Warp. Comments may be addressed to: IBM Corporation, International Technical Support Organization Dept. DHHB Building 045 Internal Zip 2834 11400 Burnet Road Austin, Texas 78758-3493 When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. © Copyright International Business Machines Corporation 1998. All rights reserved Note to U.S Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp. Contents Figures. .xiii Tables. xvii Preface. .xix How This Redbook in Organized .
    [Show full text]
  • Software Product Description
    Software Product Description PRODUCT NAME: PATHWORKS for VMS, Version 4.0 SPD 30.50.07 (Formerly VMS Services for pes) DESCRIPTION • PATHWORKS for DOS (TCP/IP) - Software required for a personal computer running the DOS Operat­ PATHWORKS for VMS is based on the Personal Com­ ing System to access services of a PATHWORKS puting Systems Architecture (PCSA), which is an ex­ for VMS or PATHWORKS for ULTRIX server via tension of Digital Equipment Corporation's systems and the TCP/IP network transport. (Described in SPD networking architecture that merges the VAX, RISC and 33.45.xx.) personal computer environments. The PATHWORKS product family, developed under the PC SA architecture, • DECnetlPCSA Client: VAXmate - Required software provides a framework for integrating personal comput­ for VAXmates to use the facilities provided by PATH­ ers into an organization's total information system so WORKS for VMS. (Described in SPD 55.10.xx.) that different types of users can share information and PATH WORKS for VMS software allows VAX, MicroVAX, network services across the entire organization. VAXstation and VAXserver computers to act as applica­ The PATHWORKS family of software products includes: tion, data and resource servers to groups of personal computers. By using these server systems, personal • PATHWORKS for VMS - Software that allows a VMS­ computers can share applications, data and resources. based VAX system to act as a file, print, mail and Information can be accessed from local and remote sys­ disk server to DOS and OS/2® personal computers. tems and that information can be applied in DOS or (Described in this document.) OS/2 applications.
    [Show full text]