Lifs: an Attribute-Rich File System for Storage Class Memories

LiFS: An Attribute-Rich File System for Storage Class Memories SashaAmes NikhilBobb KevinM.Greenan OwenS.Hofmann Mark W. Storer Carlos Maltzahn Ethan L. Miller Scott A. Brandt Storage Systems Research Center Computer Science Department University of California, Santa Cruz Abstract find what one really wants. This problem has been ad- dressed on the web with the development of search en- As the number and variety of files stored and accessed by gines, and it is now often harder to find information on a typical user has dramatically increased, existing file sys- one’s own hard drive than on the web. tem structures have begun to fail as a mechanism for man- To address this shortcoming, application developers aging all of the informationcontained in those files. Many have been forced to develop their own metadata infras- applications—email clients, multimedia management ap- tructures. Email applications, digital photo albums, dig- plications, and desktop search engines are examples— ital music applications, desktop search applications, and have been forced to develop their own richer metadata many others have their own file system metadata to enable infrastructures. While effective, these solutions are gener- the organizing, searching, browsing, viewing/playing, an- ally non-standard, non-portable, non-sharable across ap- notating, and generally working with specific types of plications, users or platforms, proprietary, and potentially files. Many of these applications have special-purpose inefficient. In the interest of providing a rich, efficient, code for dealing with the specific properties of the type of shared file system metadata infrastructure, we have devel- data they manage, but they also include code that is not oped the Linking File System (LiFS). Taking advantage data-specific for organizing, annotating, browsing, etc. of non-volatile storage class memories, LiFS supports a Because this code was developed as part of an application, wide variety of user and application metadata needs while it rarely adheres to any standard, is often not portable, it efficiently supporting traditional file system operations. is difficult to share between applications, users, or platforms, it is typically owned by the company that devel- 1 Introduction oped it, and it is potentially inefficient. Key functions of the operating system are to efficiently File system interfaces have changed relatively little in the provide services that are used by a variety of applications, three decades since the UNIX file system was first intro- to abstract away low-level details by providing a use- duced. Metadata in standard file systems includes direc- ful high-level API, and to facilitate sharing of resources tory hierarchies and some fixed per-file attributes includ- used by multiple users and applications. The wide vari- ing file name, permissions, size, and access/modification ety of applications developing their own file system meta- times. While primitive, these interfaces have served well. data infrastructure shows the need for this infrastructure In the same time frame, demandson the storage subsys- to be provided by the file system. Recently, researchers tem have increased both quantitatively and qualitatively. (ourselves included) have taken steps in that direction Storage systems have grown, the amount of storage ac- by including additional per-file metadata in the form of cessed by individual users has increased, and the variety hkey,valuei pairs. While useful, this is inadequate to of data stored has grown dramatically. General-purpose support the needs of the wide variety of applications de- file systems are now used to store tremendous volumes of scribed above. text documents, web pages, application programs, email We present the Linking File System (LiFS). In addi- files, calendars, contacts, music files, movies, and many tion to the standard file system operations, LiFS pro- other types of data. Although current file systems are rel- vides searchable application-defined file attributes and at- atively effective at reliably storing the data, the increasing tributed links between files. File attributes are in the form size and complexity of the information stored has made of application-defined hkey,valuei pairs. Links are ex- management and retrieval of the information problematic. plicit relationships between files that can themselves have Simply stated, with so much information, it is difficult to attributes expressed as hkey,valuei pairs to express the nature of the relationship created by the link. These simple 2.1 Information Management additions dramatically change the nature of file systems, enabling a wide variety of operations, providing a rich, 2.1.1 Searching shared metadata infrastructure, and allowing applications Finding data on the vast World-wide Web is often eas- to focus on managing application-specific data instead of ier than finding the same content on a local hard drive. managing things that are best managed by the file system. The same search engine technology that does well on In LiFS, all files may contain data and links to other the Internet typically performs poorly when applied to files. Thus, a traditional data file is one with contents and enterprise-level file systems. To our knowledge both of no links, and a traditionaldirectoryis one with no contents these statements are only based on anecdotal albeit com- and directory containment links to other files. Many more mon evidence but can be plausibly explained: web links interesting examples are possible. For example, a .c file convey relevance and semantic information that turns out can contain links to the .h files it includes. An executable to be very useful for searching and presenting search re- can contain links to its source .c files, the compiler used sults [36]. However, traditional file systems only convey to generate it, and the library files on which it depends. relationships among files through the hierarchical direc- A document can contain a link to the application used to tory system. edit and view it. With respect to the application-specific Hierarchical directories are actually a compromise be- infrastructures mentioned above, an email client, a digital tween the need of users to organize their data on one hand photo album, and a music player now all need only pro- and file system designers who aim to reduce the cost of vide a GUI for managing their specific type of data and maintaining metadata on the other. This dearth of rela- manipulating the attributes of and relationships among the tionships between files is the primary reason that finding files they each manage, and the file system can take care data in enterprise-wide file systems or even in local file of storing the attributes and relationships and efficiently systems appears to be harder than on the Internet. supporting their manipulation, search, and other actions. There is in fact a wealth of explicit and implicit relationships among files. Because hierarchical directo- LiFS is enabled by storage class memories—non- ries make it difficult to express explicit relationships, this volatile, byte-addressable RAM—by making the reading, information often ends up being stored in application- writing, indexing, and searching of such rich metadata fast specific files and obscured by proprietary file formats. and efficient. Our prototype is designed with MRAM in There are also implicit relationships such as provenance, mind, but is implementedin Linux using standard DRAM. application and data dependencies, as well as contextsthat Our results demonstrate that the performance of LiFS may span applications; this information is typically not is comparable to, if not better than, that of other Linux recorded at all. However, maintaining provenance and file systems, while providing far richer metadata seman- context relationships enables powerful search capabilities. tics. As a proof-of-concept we have implemented a sim- For example, two downloaded files that are in some way ple browser that functions alternatively as a file system related on the web, e. g., originating from the same web browser, an email browser, a digital photo browser, or a site, normally are stripped from their context when stored music browser, depending upon the files and links con- on a local file system. Provenancepreserves their relation- tained in the file system hierarchy it is exploring. The ships and provides important clues to context-sensitive following sections examine some motivating examples in searches. more detail, discuss the LiFS design, present the details The history of Internet search engines shows a progres- of our LiFS implementation, and show the performance sion of increasingly sophisticated ways of mining rela- of LiFS in various scenarios. tionships, extending successful searching even to documents with content obscured by proprietary formats. The introduction of rich relationships for files will allow the successful use of advanced search technologies within 2 Motivating Examples personal or enterprise-wide file systems. 2.1.2 Repository Sharing Relational links and attributes are surprisingly useful. In this section we will illustrate their utility in finding and or- Because applications are often the sole maintainer of ganizing information and in coping with change in com- meaningful relationships among files, repositories are of- puting infrastructure. These areas are of particular interest ten partitioned. As a result, each repository can only in the context of rapid growth of personal and enterprise- be accessed by one application and relationships that

Lifs: an Attribute-Rich File System for Storage Class Memories

Development of a Verified Flash File System ⋆

STAT 625: Statistical Case Studies 1 Our Computing Environment(S)

System Calls and I/O

Enhancing the Accuracy of Synthetic File System Benchmarks Salam Farhat Nova Southeastern University, [email protected]

DASH: Database Shadowing for Mobile DBMS

Redbooks Paper Linux on IBM Zseries and S/390

11.7 the Windows 2000 File System

Partitioner Och Filsystem 2

[13주차] Sysfs and Procfs

“Application - File System” Divide with Promises

CS 152: Computer Systems Architecture Storage Technologies

Refs: Is It a Game Changer? Presented By: Rick Vanover, Director, Technical Product Marketing & Evangelism, Veeam