Organizing, Indexing, and Searching Large-Scale File Systems

Technical Report UCSC-SSRC-09-09 December 2009

Andrew W. Leung [email protected]

Storage Systems Research Center
Baskin School of Engineering
University of California, Santa Cruz
Santa Cruz, CA 95064
http://www.ssrc.ucsc.edu/

UNIVERSITY OF CALIFORNIA SANTA CRUZ

ORGANIZING, INDEXING, AND SEARCHING LARGE-SCALE FILE SYSTEMS

A dissertation submitted in partial satisfaction of the requirements for the degree of

DOCTOR OF PHILOSOPHY

in

COMPUTER SCIENCE

by

Andrew W. Leung

December 2009

The Dissertation of Andrew W. Leung is approved:

Ethan L. Miller, Chair

Darrell D.E. Long

Garth R. Goodson

Tyrus Miller
Vice Provost and Dean of Graduate Studies

Copyright © by Andrew W. Leung 2009

Table of Contents

List of Figures

List of Tables

Abstract

Dedication

Acknowledgments

1 Introduction
  1.1 Main Contributions
  1.2 Analysis of Large-Scale Properties
  1.3 New Approaches to File Indexing
  1.4 Towards Searchable File Systems
  1.5 Organization

2 Background and Related Work
  2.1 The Growing Data Volume Trend
  2.2 Scalable Storage Systems
    2.2.1 Parallel File Systems
    2.2.2 Peer-to-Peer File Systems
    2.2.3 Other File Systems
  2.3 The Data Management Problem in Large-Scale File Systems
    2.3.1 Search-Based File Management
    2.3.2 File System Search Background
    2.3.3 Large-Scale File System Search Use Cases
  2.4 Large-Scale File System Search Challenges
  2.5 Existing Solutions
    2.5.1 Brute Force Search
    2.5.2 Desktop and Enterprise Search
    2.5.3 Semantic File Systems
    2.5.4 Searchable File Systems

    2.5.5 Ranking Search Results
  2.6 Index Design
    2.6.1 Relational Databases
    2.6.2 Inverted Indexes
    2.6.3 File System Optimized Indexing
  2.7 Summary

3 Properties of Large-Scale File Systems
  3.1 Previous Workload Studies
    3.1.1 The Need for a New Study
    3.1.2 The CIFS Protocol
  3.2 Workload Tracing Methodology
  3.3 Workload Analysis
    3.3.1 Terminology
    3.3.2 Workload Comparison
    3.3.3 Comparison to Past Studies
    3.3.4 File I/O Properties
    3.3.5 File Re-Opens
    3.3.6 Client Request Distribution
    3.3.7 File Sharing
    3.3.8 File Type and User Session Patterns
  3.4 Design Implications
  3.5 Previous Snapshot Studies
  3.6 Snapshot Tracing Methodology
  3.7 Snapshot Analysis
    3.7.1 Spatial Locality
    3.7.2 Frequency Distribution
  3.8 Summary

4 New Approaches to File Indexing
  4.1 Metadata Indexing
    4.1.1 Spyglass Design
    4.1.2 Hierarchical Partitioning
    4.1.3 Partition Versioning
    4.1.4 Collecting Metadata Changes
    4.1.5 Distributed Design
  4.2 Content Indexing
    4.2.1 Index Design
    4.2.2 Query Execution
    4.2.3 Index Updates
    4.2.4 Additional Functionality
    4.2.5 Distributing the Index
  4.3 Experimental Evaluation

    4.3.1 Implementation Details
    4.3.2 Experimental Setup
    4.3.3 Metadata Collection Performance
    4.3.4 Update Performance
    4.3.5 Space Overhead
    4.3.6 Selectivity Impact
    4.3.7 Search Performance
    4.3.8 Index Locality
    4.3.9 Versioning Overhead
  4.4 Summary

5 Towards Searchable File Systems
  5.1 Background
    5.1.1 Search Applications
    5.1.2 Integrating Search into the File System
  5.2 Integrating Metadata Search
    5.2.1 Metadata Clustering
    5.2.2 Indexing Metadata
    5.2.3 Query Execution
    5.2.4 Cluster-Based Journaling
  5.3 Integrating Functionality
    5.3.1 Copernicus Design
    5.3.2 Graph Construction
    5.3.3 Cluster Indexing
    5.3.4 Query Execution
    5.3.5 Managing Updates
  5.4 Experimental Results
    5.4.1 Implementation Details
    5.4.2 Microbenchmarks
    5.4.3 Macrobenchmarks
  5.5 Summary

6 Future Directions
  6.1 Large-Scale File Systems Properties
  6.2 New Approaches to File Indexing
  6.3 Towards Searchable File Systems

7 Conclusions

Bibliography

List of Figures

2.1 Network-attached storage architecture. Multiple clients utilize a centralized server for access to shared storage.
2.2 Parallel file system architecture. Clients send metadata requests to metadata servers (MDSs) and data requests to separate data stores.
2.3 Inverted index architecture. The dictionary is a data structure that maps keywords to posting lists. Posting lists contain postings that describe each occurrence of the keyword in the file system. Searches require retrieving the posting lists for each keyword in the query.

3.1 Request distribution over time. The frequency of all requests, read requests, and write requests are plotted over time. Figure 3.1(a) shows how the request distribution changes for a single week in October 2007. Here request totals are grouped in one hour intervals. The peak one hour request total for corporate is 1.7 million and 2.1 million for engineering. Figure 3.1(b) shows the request distribution for a nine week period between September and November 2007. Here request totals are grouped into one day intervals. The peak one day intervals are 9.4 million for corporate and 19.1 million for engineering.
3.2 Sequential run properties. Sequential access patterns are analyzed for various sequential run lengths. The X-axes are given in a log-scale. Figure 3.2(a) shows the length of sequential runs, while Figure 3.2(b) shows how many bytes are transferred in sequential runs.

3.3 Sequentiality of data transfer. The frequency of sequentiality metrics is plotted against different data transfer sizes. Darker regions indicate a higher fraction of total transfers. Lighter regions indicate a lower fraction. Transfer types are broken into read-only, write-only, and read-write transfers. Sequentiality metrics are grouped by tenths for clarity.
3.4 File size access patterns. The distribution of open requests and bytes transferred are analyzed according to file size at open. The X-axes are shown on a log-scale. Figure 3.4(a) shows the size of files most frequently opened. Figure 3.4(b) shows the size of files from which most bytes are transferred.
3.5 File open durations. The duration of file opens is analyzed. The X-axis is presented on a log-scale. Most files are opened very briefly, although engineering files are opened slightly longer than corporate files.
3.6 File lifetimes. The distributions of lifetimes for all created and deleted files are shown. Time is shown on the X-axis on a log-scale. Files may be deleted through explicit delete request or truncation.
3.7 File I/O properties. The burstiness and size properties of I/O requests are shown. Figure 3.7(a) shows the I/O inter-arrival times. The X-axis is presented on a log-scale. Figure 3.7(b) shows the sizes of read and write I/O. The X-axis is divided into 8 KB increments.
3.8 File open properties. The frequency of and duration between file re-opens is shown. Figure 3.8(a) shows how often files are opened more than once. Figure 3.8(b) shows the time between re-opens and time intervals on the X-axis are given in a log-scale.
3.9 Client activity distribution. The fraction of clients responsible for certain activities is plotted.
3.10 File sharing properties. We analyze the frequency and temporal properties of file sharing. Figure 3.10(a) shows the distribution of files opened by multiple clients. Figure 3.10(b) shows the duration between shared opens. The durations on the X-axis are in a log-scale.

3.11 File sharing access patterns. The fraction of read-only, write-only, and read-write accesses are shown for differing numbers of sharing clients. Gaps are seen where no files were shared with that number of clients.
3.12 User file sharing equality. The equality of sharing is shown for differing numbers of sharing clients. The Gini coefficient, which measures the level of equality, is near 0 when sharing clients have about the same number of opens to a file. It is near 1 when clients unevenly share opens to a file.
3.13 File type popularity. The histograms on the right show which file types are opened most frequently. Those on the left show the file types most frequently read or written and the percentage of accessed bytes read for those types.
3.14 File type access patterns. The frequency of access patterns are plotted for various file types. Access patterns are categorized into 18 groups. Increasingly dark regions indicate higher fractions of accesses with that pattern.
3.15 Examples of Locality Ratio. Directories that recursively contain the ext attribute value html are black and gray. The black directories contain the value. The Locality Ratio of ext value html is 54% (= 7/13) in the first tree and 38% (= 5/13) in the second tree. The value of html has better spatial locality in the second tree than in the first one.
3.16 Attribute value distribution examples. A rank of 1 represents the attribute value with the highest file count. The linear curves on the log-log scales in Figures 3.16(a) and 3.16(b) indicate a Zipf-like distribution, while the flatter curve in Figure 3.16(c) indicates a more even distribution.

4.1 Spyglass overview. Spyglass resides within the storage system. The crawler extracts file metadata, which gets stored in the index. The index consists of a number of partitions and versions, all of which are managed by a caching system.
4.2 Hierarchical partitioning example. Sub-tree partitions, shown in different colors, index different storage system sub-trees. Each partition is stored sequentially on disk. The Spyglass index is a tree of sub-tree partitions.

4.3 Signature file example. Signature files for the ext and size metadata attributes are shown. Each bit corresponds to a value or set of values in the attribute value space. One bits indicate that the partition may contain files with that attribute value, while zero bits indicate that it definitely does not. In the top figure, each bit corresponds to an extension. False-positives occur in cases where multiple extensions hash to the same bit position. In the bottom figure, each bit corresponds to a range of file sizes.
4.4 Versioning partitioning example. Each sub-tree partition manages its own versions. A baseline index is a normal partition index from some initial time

T0. Each incremental index contains the changes required to roll query results forward to a new point in time. Each sub-tree partition manages its versions in a version vector.
4.5 Snapshot-based metadata collection. In snapshot 2, block 7 has changed since snapshot 1. This change is propagated up the tree. Because block 2 has not changed, we do not need to examine it or any blocks below it.
4.6 Indirect Index Design. The indirect index stores the dictionary for the entire file system and each keyword's posting lists contain locations of partition segments. Each partition segment is kept sequential on-disk.
4.7 Segment Partitioning. The namespace is broken into partitions that represent disjoint sub-trees. Each partition maintains posting list segments for keywords that occur within its sub-trees. Since each partition is relatively small, these segments can be kept sequential on-disk.
4.8 Metadata collection performance. We compare Spyglass's snapshot-based crawler (SB) to a straw-man design (SM). Our crawler has good scalability; performance is a function of the number of changed files rather than system size.
4.9 Update performance. The time required to build an initial baseline index is shown on a log-scale. Spyglass updates quickly and scales linearly because updates are written to disk mostly sequentially.

4.10 Space overhead. The index disk space requirements are shown on a log-scale. Spyglass requires just 0.1% of the Web and Home traces and 10% of the Eng trace to store the index.
4.11 Comparison of Selectivity Impact. The selectivity of queries in our query set is plotted against the execution time for that query. We find that query performance in Spyglass is much less correlated to the selectivity of the query predicates than the DBMSs, which are closely correlated with selectivity.
4.12 Query set run times. The total time required to run each set of queries. Each set is labeled 1 through 3 and is clustered by trace file. Each trace is shown on a separate log-scale axis. Spyglass improves performance by reducing the search space to a small number of partitions, especially for query sets 2 and 3, which are localized to only a part of the namespace.
4.13 Query execution times. A CDF of query set execution times for the Eng trace. In Figures 4.13(b) and 4.13(c), all queries are extremely fast because these sets include a path predicate that allows Spyglass to narrow the search to a few partitions.
4.14 Index locality. A CDF of the number of partitions accessed and the number of accesses that were cache hits for our query set. Searching multiple attributes reduces the number of partition accesses and increases cache hits.
4.15 Impact of signature file size. The percent of one bits in signature files for four different attributes is shown. The percent of one bits decreases as signature file size increases. However, even in the case where signature files are only 1 KB, less than 4% of total bits are set to one. As mentioned earlier, the size and atime attributes use special hashing functions that treat each bit as a range rather than a discrete value. This approach keeps the fraction of one bits below 1% in all cases.
4.16 Versioning overhead. The figure on the left shows total run time for a set of 450 queries. Each version adds about 10% overhead. On the right, a CDF shows per-query overheads. Over 50% of queries have an overhead of 5 ms or less.

5.1 File search applications. The search application resides on top of the file system and stores file metadata and content in separate search-optimized indexes. Maintaining several large index structures can add significant space and time overheads.
5.2 Metadata clustering. Each block corresponds to an inode on disk. Shaded blocks labeled 'D' are directory inodes while non-shaded blocks labeled 'F' are file inodes. In the top disk layout, the indirection between directory and file inodes causes them to be scattered across the disk. The bottom disk layout shows how metadata clustering co-locates inodes for an entire sub-tree on disk to improve search performance. Inodes reference their parent directory in Magellan; thus, the pointers are reversed.
5.3 Inode indexing with a K-D tree. A K-D tree is shown in which nodes that are directory inodes are shaded and labeled 'D'. File inodes are not shaded and labeled 'F'. K-D trees are organized based on attribute value, not namespace hierarchy. Thus, a file inode can point to other file inodes, etc. The namespace hierarchy is maintained by inodes containing the inode number of their parent directory. Extended attributes, such as title and file type, are included in the inode.
5.4 Copernicus overview. Clusters, shown in different colors, group semantically related files. Files within a cluster form a smaller graph based on how the files are related. These links, and the links between clusters, create the Copernicus namespace. Each cluster is relatively small and is stored in a sequential region on disk for fast access.
5.5 Cluster indexing performance. Figure 5.5(a) shows the latencies for balanced and unbalanced K-D tree queries, brute force traversal, and inserts as cluster size increases. A balanced K-D tree is the fastest to search and inserts are fast even in larger clusters. Figure 5.5(b) shows latencies for K-D tree rebalancing and disk writes. Rebalancing is slower because it requires O(N × log N) time.

5.6 Create performance. The throughput (creates/second) is shown for various system sizes. Magellan's update mechanism keeps create throughput high because disk writes are mostly to the end of the journal, which yields good disk utilization. Throughput drops slightly at larger sizes because more time is spent searching clusters.
5.7 Metadata clustering. Figure 5.7(a) shows create throughput as maximum cluster size increases. Performance decreases with cluster size because inode caching and Bloom filters become less effective and K-D tree operations become slower. Figure 5.7(b) shows that query performance is worse for small and large sizes.
5.8 Metadata workload performance comparison. Magellan is compared to the Ceph metadata server using four different metadata workloads. In all cases, both provide comparable performance. Performance differences are often due to K-D tree utilization.
5.9 Query execution times. A CDF of query latencies for our three query sets. In most cases, query latency is less than a second even as system size increases. Query set 2 performs better than query set 1 because it includes a directory path from where Magellan begins the search, which rules out files not in that sub-tree from the search space.
5.10 Fraction of clusters queried. A CDF of the fraction of clusters queried for query set 1 on our three data sets is shown. Magellan is able to leverage Bloom filters to exploit namespace locality and eliminate many clusters from the search space.

List of Tables

3.1 Summary of observations. A summary of important file system trace observations from our trace analysis. Note that we define clients to be unique IP addresses, as described in Section 3.3.1.
3.2 Summary of major file system workload studies over the past two decades. For each study, we show the date of trace collection, the file system or protocol studied, whether it involved network file systems, and the workloads studied.
3.3 Summary of trace statistics. File system operations broken down by workload. All operations map to a single CIFS command except for file stat (composed of query path info and query file info) and directory read (composed of find first2 and find next2). Pipe transactions map to remote IPC operations.
3.4 Comparison of file access patterns. File access patterns for the corporate and engineering workloads are compared with those of previous studies. CAMPUS and EECS [48] are university NFS mail server and home directory workloads, respectively. Both were measured in 2001. Sprite [17], Ins and Res [137] are university computer lab workloads. Sprite was measured in 1991 and Ins and Res were measured between 1997 and 2000. NT [177] is a combination of development and scientific workloads measured in 1998.

3.5 Summary of major file system snapshot studies over the past two decades. For each study, the date of trace collection, the file system or protocol studied, whether it involved network file systems, and the kinds of workloads it hosted are shown.
3.6 Metadata traces collected. The small server capacity of the Eng trace is due to the majority of the files being small source code files: 99% of files are less than 1 KB.
3.7 Attributes used. We analyzed the fields in the inode structure and extracted ext values from path.
3.8 Locality Ratios of the 32 most frequently occurring attribute values. All Locality Ratios are well below 1%, which means that files with these attribute values are recursively contained in less than 1% of directories.

4.1 Use case examples. Metadata search use cases collected from our user survey. The high-level questions being addressed are on the left. On the right are the metadata attributes that are being searched and example values. Users used basic inode metadata as well as specialized extended attributes, such as legal retention times. Common search characteristics include multiple attributes, localization to part of the namespace, and "back-in-time" search.
4.2 Query Sets. A summary of the searches used to generate our evaluation query sets.
4.3 Query throughput. We use the results from Figure 4.12 to calculate query throughput (queries per second). We find that Spyglass can achieve query throughput that enables fast metadata search even on large-scale storage systems.

5.1 Inode attributes used. The attributes that inodes contained in our experiments.
5.2 Metadata workload details.
5.3 Metadata throughput (ops/sec). The Magellan and Ceph throughput for the four workloads.

5.4 Query Sets. A summary of the searches used to generate our evaluation query sets.

Abstract

Organizing, Indexing, and Searching Large-Scale File Systems

by

Andrew W. Leung

The world is moving towards a digital infrastructure. This move is driving the demand for data storage and has already resulted in file systems that contain petabytes of data and billions of files. In the near future file systems will be storing exabytes of data and trillions of files. This data growth has introduced the key question of how we effectively find and manage data in this growing sea of information. Unfortunately, file organization and retrieval methods have not kept pace with data volumes. Large-scale file systems continue to rely on hierarchical namespaces that make finding and managing files difficult. As a result, there has been an increasing demand for search-based file access. A number of commercial file search solutions have become popular on desktop and small-scale enterprise systems. However, providing effective search and indexing at the scale of billions of files is not a simple task. Current solutions rely on general-purpose index designs, such as relational databases, to provide search. General-purpose indexes can be ill-suited for file system search and can limit performance and scalability. Additionally, current search solutions are designed as applications that are separate from the file system. Providing search through a separate application requires file attributes and modifications to be replicated into separate index structures, which presents consistency and efficiency problems at large scales.

This thesis addresses these problems through novel approaches to organizing, indexing, and searching files in large-scale file systems. We conduct an analysis of large-scale file system properties using workload and snapshot traces to better understand the kinds of data being stored and how it is used. This analysis represents the first major workload study since 2001 and the first major study of enterprise file system contents and workloads in over a decade. Our analysis shows that a number of important workload properties have changed since previous studies (e. g., read to write byte ratios have decreased to 2:1 from 4:1 or higher in past studies) and examines properties that are relevant to file organization and search. Other important observations include highly skewed workload distributions and clustering of metadata attribute values in the namespace.

We hypothesize that file search performance and scalability can be improved with file system specific index solutions. We present the design of new file metadata and file content indexing approaches that exploit key file system properties from our study. These designs introduce novel file system optimized index partitioning, query execution, and versioning techniques. We show that search performance can be improved by up to 1–4 orders of magnitude compared to traditional approaches. Additionally, we hypothesize that directly integrating search into the file system can address the consistency and efficiency problems with separate search applications. We present new metadata and semantic file system designs that introduce novel disk layout, indexing, and updating methods to enable effective search without degrading normal file system performance. We then discuss ongoing challenges and how this work may be extended in the future.

To my family for all of their love and support.

Acknowledgments

My career in graduate school would not have been possible without the help, guidance, and friendship of many. I first want to thank my adviser Ethan L. Miller, who is the main reason I decided to pursue a Ph.D. He taught me the important research questions to ask and how to critically analyze a problem. He was always willing to help me, whether it was with research, giving talks and presentations, or looking for a job. Most importantly he taught me how to make research an enjoyable and fulfilling experience. I want to thank him for all of the late night hours he spent helping me with paper submissions when I am sure that sleep was far more appealing. Also, his knack for general trivia proved extremely useful during multiple trivia nights at 99 Bottles.

I would also like to thank my other thesis committee members, Darrell Long and Garth Goodson. Their willingness to listen to my wild ideas and help me turn them into actual meaningful research was critical in helping me complete this thesis. I also want to thank Erez Zadok for his very helpful and detailed comments on my advancement proposal.

I want to thank my friends and colleagues in the Storage Systems Research Center (SSRC) and NetApp's Advanced Technology Group (ATG) who made my graduate career, despite the long hours and hard work, so enjoyable. My internship experience at NetApp's ATG was one of the most valuable parts of my graduate school success. I want to thank my mentors and colleagues in the ATG, including Shankar Pasupathy, Garth Goodson, Minglong Shao, Timothy Bisson, Kiran Srinivasan, Kaladhar Voruganti, and Stephen Harpster. They provided wonderful guidance and I was surprised by how eager they were to see me succeed. Additionally, they provided me with access to file system trace data that is a major part of this thesis.

My friends and lab mates in the SSRC were one of the main reasons I decided to stay for a Ph.D. They made staying in the lab fun and provided encouragement during the long hours before paper or presentation deadlines. Though they are numerous, I would like to thank Eric Lalonde, Mark Storer, Kevin Greenan, Jonathan Koren, Jake Telleen, Stephanie Jones, Ian Adams, Christina Strong, Aleatha Parker-Wood, Keren Jin, Deepavali Bhagwat, Alex Nelson, Damian Eads, and Sage Weil for putting up with me and helping me with my research. The

numerous barbecues, video game breaks, and late night drinks at a bar or at someone's house were a much needed and appreciated distraction from work.

During graduate school I was lucky to avoid the headaches that other students have worrying about their funding sources. I was fortunate, especially during these current economic times, to have a number of funding sources for my research. I want to thank these funding sources, which include the Department of Energy's Petascale Data Storage Institute under award DE-FC02-06ER25768, the National Science Foundation under award CCF-0621463, and the SSRC's industrial sponsors.

Finally, I would like to thank my family: my mom, my dad, and my sister Allison. Without them none of this would have been possible. They provided me with financial help, a place to come home to on weekends with free food and laundry, and an outlet for me to vent about work, all with unwavering love and support. Their encouragement helped me through multiple stressful and difficult times. My sister's tireless copy editing caught many of my typos, incorrect grammar, and generally poor writing. I dedicate this thesis to them.

Chapter 1

Introduction

Today there is a need for data storage like never before and this need is only increasing. Modern businesses, governments, and people's daily lives have shifted to a digital infrastructure, which is projected to require close to two zettabytes (two million petabytes) of storage by 2011 [58]. This demand, coupled with improvements in hard disk capacities and network bandwidth, has yielded file systems that store petabytes of data, billions of files, and serve data to thousands of users [46, 60, 91, 139, 146, 180, 183]. File systems of this scale introduce a new challenge: how do we organize so much data so that it is easy to find and manage? These systems store data that makes up the backbone of today's digital world and it is paramount that data can be effectively organized and retrieved. Additionally this problem is pervasive, impacting scientific [70, 72], enterprise [51], cloud [126], and personal [142] storage.

File systems of this scale make effectively finding and managing data extremely difficult. Current file organization methods are based on designs that are over forty years old and were designed for systems with less than 10 MB of data [38]. These systems use a basic hierarchical namespace that is restrictive, difficult to use, and limits scalability. They require users to manage and navigate huge hierarchies with billions of files, which wastes significant time and can lead to misplaced or permanently lost data. Similarly, users and administrators must rely on slow, manual tools to try and understand the data they manage, which can lead to underutilized or poorly managed systems. The basic problem is that file systems have grown beyond

the scope of basic file organization and retrieval methods, which has decreased our ability to effectively find, manage, and utilize data.

This growing data management problem has led to an increased emphasis on search-based file access. File search involves indexing attributes derived from file metadata and content, allowing them to be used for retrieval. File search is useful in large-scale file systems because it allows users to specify what they want rather than where it is. For example, a user can access his document by just knowing it was accessed in the last week and deals with the quarterly financial records. Additionally, finding files to migrate to secondary storage requires only knowing the kinds of files to be migrated (e. g., files larger than 50 MB that have not been accessed in 6 months), instead of where all of these files are located. In most cases, a file's location is irrelevant; users need to retrieve their data using whatever information they may have about it. Moreover, search allows complex, ad hoc questions to be asked about the files being stored that help to locate, manage, and analyze data. The effectiveness of search-based file access has led many file system search applications to become commercially available for desktop [14, 66, 110] and small-scale enterprise [55, 67, 85] file systems. File system search has also become a popular research area [49, 63, 69, 124, 154].

Unfortunately, enabling fast and effective search in large-scale file systems is very difficult. Current solutions, which are designed as applications outside of the file system and which rely on general purpose index designs, are too expensive, slow, and cumbersome to be effective in large-scale systems. The index is the data structure that enables effective search functionality. General-purpose indexes, such as relational databases, are not designed for file system search and can be ill-suited to address file system search needs. As Stonebraker et al. state, there is "a substantial performance advantage to specialized architectures" [161] because general-purpose solutions make few specialized optimizations and can have mismatched or unused functionality, which limits their performance and scalability.

Additionally, while increasingly important, file search is provided by an application that is outside of the file system, not the file system itself. Separating the two is an odd model given that search applications and file systems share a common goal: organization and retrieval of files. Implementing search functionality in an application outside of the file system leads

to duplicated functionality and requires metadata attributes and changes to be replicated in a separate database or search engine. Maintaining two indexes of file attributes (e. g., the file system's and the search application's index) leads to many consistency and efficiency issues, particularly at the scale of billions of files. As a result, users and administrators continue to struggle to find, organize, and manage data in large-scale file systems.
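As a concrete illustration of the attribute-based access discussed above, consider the migration example of finding files larger than 50 MB that have not been accessed in six months. The following is a minimal sketch in Python using standard POSIX metadata; the function name and thresholds are illustrative, and the brute-force namespace walk it performs is exactly what becomes impractical at billions of files without an index.

import os
import time

def find_migration_candidates(root, min_size=50 * 1024 * 1024, idle_days=180):
    """Walk a sub-tree and yield files matching an attribute query:
    size > min_size and last access older than idle_days."""
    cutoff = time.time() - idle_days * 24 * 3600
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_size > min_size and st.st_atime < cutoff:
                yield path, st.st_size

if __name__ == "__main__":
    for path, size in find_migration_candidates("/tmp"):
        print(f"{size:>12}  {path}")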

1.1 Main Contributions

In this thesis we address this problem by improving how files are organized, indexed, and searched in large-scale file systems. This thesis examines two key hypotheses. First, search performance and scalability can be improved with new indexing structures that leverage file system specific properties. Second, it is possible to enable efficient search directly in the file system without degrading normal file system performance. These two hypotheses are evaluated in the three key contributions of this thesis:

(1) We measure and analyze workloads and metadata snapshots from several large-scale file systems used in real world deployments. We conduct a number of studies to compare how workloads have changed since previous studies, as well as perform several new experiments. Our analysis reveals a number of observations that are relevant to better file organization and indexing and to file system design in general.

(2) Using observations from our file system analysis, we present the design of two new indexing structures for file metadata and content. Our designs exploit file attribute locality and distribution properties to improve performance and scalability. Additionally, new approaches to index updating and versioning are used. An evaluation shows that search performance can be improved by 1–4 orders of magnitude compared to traditional solutions.

(3) We then present two novel file system designs that directly integrate search functionality, eliminating the need for external search applications. The first organizes metadata so that it can be easily and effectively queried while maintaining good performance for normal metadata workloads. The second is a new approach to semantic file system design that organizes the

namespace into a graph-based structure. This new index structure allows dynamic, search-based file access and navigation using inter-file relationships.

The following sections detail the individual thesis contributions.

1.2 Analysis of Large-Scale File System Properties

Designing better methods for organizing and managing files requires first understanding the properties of the files being stored and how they are used. This understanding is often guided by measuring and analyzing file system workload and snapshot traces. Trace-based file system studies have guided the designs of many past file systems [88, 116, 138]. For example, caching in the Sprite network file system [116] was guided by the observation that even small client-side caches can be effective for improving performance [123]. Similarly, the log-structured file system [138] was guided by observations that network file system workloads were becoming increasingly write-oriented [17, 123] due to the presence of client-side caching.

We collect and analyze traces of file system workloads and contents from several large-scale network file servers deployed in the NetApp corporate data center. These servers were used by thousands of employees from multiple departments. Our analysis focuses on trends since past studies, conducts a number of new experiments, and looks at the impact on file organization and indexing. Our study represents the first major workload study since 2001 [48], the first to look at large-scale CIFS [92] network workloads, and the first major study of enterprise file server contents and workloads in over ten years.

Our analysis reveals a number of interesting observations, such as that workloads are becoming increasingly write-heavy and that files are increasing in size and have longer lifetimes compared to previous studies. Additionally, we find that file access is mostly transient. Only 66% of opened files are re-opened and 95% are re-opened less than five times. Files are also rarely shared as 76% of files are never opened by more than one client. Workload distribution is heavily skewed with only 1% of clients accounting for almost 50% of file requests. Similarly, metadata attribute distributions are highly skewed and follow the power-law distribution [152].

Also, metadata attribute values are heavily clustered in the namespace. All metadata values that we studied occurred in fewer than 1% of total directories.
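To illustrate the clustering observation, such namespace locality can be measured by computing, for one attribute value, the fraction of directories that recursively contain at least one matching file (the Locality Ratio concept used in Chapter 3). The following is a minimal sketch assuming a metadata snapshot is available as (path, attributes) records; the toy data and function names are illustrative and are not the analysis code used in this thesis.

import posixpath

def ancestor_dirs(path):
    """Yield every directory on the path from the file up to the root."""
    d = posixpath.dirname(path)
    while True:
        yield d
        parent = posixpath.dirname(d)
        if parent == d:
            break
        d = parent

def locality(snapshot, attr, value):
    """Fraction of directories that recursively contain a file whose
    attribute `attr` equals `value` (smaller means more clustered)."""
    all_dirs, matching_dirs = set(), set()
    for path, attrs in snapshot:
        dirs = set(ancestor_dirs(path))
        all_dirs |= dirs
        if attrs.get(attr) == value:
            matching_dirs |= dirs
    return len(matching_dirs) / len(all_dirs) if all_dirs else 0.0

# Toy snapshot: (path, metadata) records.
snapshot = [
    ("/home/andrew/proj/a.html", {"ext": "html"}),
    ("/home/andrew/proj/b.html", {"ext": "html"}),
    ("/home/beth/notes.txt",     {"ext": "txt"}),
    ("/src/kernel/main.c",       {"ext": "c"}),
]
print(locality(snapshot, "ext", "html"))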

1.3 New Approaches to File Indexing

Fast and effective search in large-scale file systems is difficult to achieve. The index is the data structure that enables search functionality, and ineffective index design can limit performance and scalability. Current file search solutions utilize general-purpose indexing methods that are not designed for file systems. These indexes were designed for other workloads, have few file system search optimizations, and have extra functionality that is not needed for file system search. For example, file metadata is often indexed using relational database management systems (DBMSs). However, DBMSs are designed for on-line transaction processing workloads [16], use locking and transactions that can add overhead [165], and are not a perfect fit for metadata search [162].

We hypothesize that new index designs that leverage file system properties can improve performance and scalability. We propose two new index designs, one for structured metadata search, called Spyglass, and one for unstructured content search, that leverage observations from our trace-based analysis to improve performance. For example, we found that metadata attribute values exhibit spatial locality, which means that they tend to be highly clustered in the namespace. Thus, files owned by user Andrew are often clustered in locations such as the user's home directory or active project directories and are not scattered across the namespace. We introduce the notion of hierarchical partitioning, which allows the index to exploit spatial locality by partitioning and allowing fine-grained index control based on the namespace. Hierarchical partitioning makes it possible for searches, updates, and caching to be localized to only the relevant parts of the namespace. We also introduce a number of other techniques that improve query execution, update and versioning operations, and metadata collection. An evaluation of our metadata index prototype shows that search performance can be improved by up to 1–4 orders of magnitude compared to basic DBMS setups, while providing update performance that is up to 40× faster and requiring less than 0.1% of total disk space.
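The hierarchical partitioning idea can be sketched as follows: the index is split into sub-tree partitions, each keeping a coarse per-attribute summary so that a search touches only partitions that fall inside the queried part of the namespace and may contain matching values. This is a minimal, assumption-laden sketch: a Python set stands in for the bit-vector signature files described in Chapter 4, and all class and attribute names are illustrative rather than Spyglass's implementation.

class Partition:
    """Index over one namespace sub-tree, with a coarse per-attribute
    signature used to skip the partition without reading it."""
    def __init__(self, root):
        self.root = root
        self.files = []        # (path, metadata) records
        self.signature = {}    # attribute -> set of observed values

    def insert(self, path, attrs):
        self.files.append((path, attrs))
        for k, v in attrs.items():
            self.signature.setdefault(k, set()).add(v)

    def may_contain(self, attr, value):
        return value in self.signature.get(attr, ())

    def search(self, attr, value):
        return [p for p, a in self.files if a.get(attr) == value]

class PartitionedIndex:
    def __init__(self, partition_roots):
        self.partitions = [Partition(r) for r in partition_roots]

    def insert(self, path, attrs):
        # Route the file to the deepest partition whose root prefixes it.
        best = max((p for p in self.partitions if path.startswith(p.root)),
                   key=lambda p: len(p.root))
        best.insert(path, attrs)

    def search(self, attr, value, scope="/"):
        hits = []
        for p in self.partitions:
            if not p.root.startswith(scope) and not scope.startswith(p.root):
                continue  # partition lies outside the queried sub-tree
            if not p.may_contain(attr, value):
                continue  # signature rules the partition out entirely
            hits.extend(p.search(attr, value))
        return hits

idx = PartitionedIndex(["/", "/home/andrew", "/src"])
idx.insert("/home/andrew/report.pdf", {"owner": "andrew", "ext": "pdf"})
idx.insert("/src/main.c", {"owner": "beth", "ext": "c"})
print(idx.search("owner", "andrew", scope="/home"))

The benefit mirrors the argument above: a query localized to part of the namespace, or one whose value cannot appear in a partition, never reads that partition at all.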

1.4 Towards Searchable File Systems

Despite the increasing trend towards search becoming a primary way to access and manage files, file systems do not provide any search functionality. Current file system hierarchies can only be searched with brute force methods, such as grep and find. Instead, a separate search application that maintains search-based index structures and is separate from the file system is often used [14, 55, 67]. However, search applications and the file system share the same goal: organizing and retrieving files. Keeping file search separate from the file system leads to consistency and efficiency issues, as all file attributes and changes must be replicated in separate applications, which can limit performance and usability, especially at large scales.

We hypothesize that a more complete, long-term solution is to integrate search functionality directly into the file system. Doing so eliminates the need to maintain a secondary search application, allows file changes to be searched in real-time, and allows data organization to correspond to the need for search functionality. However, enabling effective search within the file system has a number of challenges. First, there must be a way to organize file attributes internally so that they can be efficiently searched and updated. Second, this organization must not significantly degrade performance for normal file system workloads.

We propose two new file system designs that directly integrate search. The first, Magellan, is a new metadata architecture for large-scale file systems that organizes the file system's metadata so that it can be efficiently searched. Unlike previous work, Magellan does not use relational databases to enable search. Instead, it uses new query-optimized metadata layout, indexing, and journaling techniques to provide search functionality and high performance in a single metadata system. In Magellan, all metadata look ups, including directory look ups, are handled using a single search structure, eliminating the redundant index structures that plague existing file systems with search grafted on.

The second, Copernicus, is a new semantic file system design that provides a search-based namespace. Unlike previous semantic file systems, which were designed as naming layers above a traditional file system or general-purpose index, Copernicus uses a dynamic, graph-based index that stores file attributes and relationships. This graph replaces the traditional

directory hierarchy and allows the construction of dynamic namespaces. The namespace allows "virtual" directories that correspond to a query and navigation to be done using inter-file relationships. An evaluation of our Magellan prototype shows that it is capable of searching millions of files in under a second, while providing metadata performance that is comparable to, and sometimes better than, other large-scale file systems.
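A minimal sketch of the search-based namespace idea described above: a "virtual" directory is the result of evaluating a query over indexed attributes, and navigation follows inter-file relationships rather than a fixed hierarchy. The in-memory structures and names below are illustrative assumptions, not the Copernicus implementation detailed in Chapter 5.

class SemanticNamespace:
    def __init__(self):
        self.attrs = {}    # file id -> {attribute: value}
        self.links = {}    # file id -> set of related file ids

    def add_file(self, fid, **attributes):
        self.attrs[fid] = attributes
        self.links.setdefault(fid, set())

    def relate(self, a, b):
        # Inter-file relationship used for navigation (bidirectional here).
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def virtual_dir(self, **query):
        """A 'directory' whose contents are the files matching the query."""
        return [fid for fid, a in self.attrs.items()
                if all(a.get(k) == v for k, v in query.items())]

    def related(self, fid):
        return sorted(self.links.get(fid, ()))

ns = SemanticNamespace()
ns.add_file("q3_report.pdf", owner="andrew", topic="finance")
ns.add_file("q3_data.xls", owner="andrew", topic="finance")
ns.add_file("notes.txt", owner="beth", topic="misc")
ns.relate("q3_report.pdf", "q3_data.xls")

print(ns.virtual_dir(owner="andrew", topic="finance"))
print(ns.related("q3_report.pdf"))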

1.5 Organization

This thesis is organized as follows:

Background and related work: Chapter 2 outlines the file retrieval and management problems caused by very large data volumes. We also provide the necessary background information on basic file system search concepts and discuss why large-scale file systems make search difficult. Then, we discuss why existing solutions do not address these difficulties.

Properties of large-scale file systems: Chapter 3 presents the measurement and analysis of large-scale file system workloads and snapshots. We compare our findings with previous studies, conduct several new experiments, and discuss how our findings impact how file systems organize and manage files.

New approaches to file indexing: Chapter 4 presents new index designs that exploit the file system properties that we observed in Chapter 3. We discuss index designs for file metadata and content and compare performance to general-purpose DBMS solutions.

Towards searchable file systems: Chapter 5 discusses how search can be integrated directly into the file system. We present the design of a hierarchical file system metadata architecture and a semantic file system that use search-optimized organization and indexing.

Future directions: Chapter 6 discusses the future directions of this work. We present ways that current work can be extended and the new research directions that this work enables.

Conclusions: Chapter 7 summarizes our findings and concludes this thesis.

Chapter 2

Background and Related Work

This chapter discusses the rapid and continuing growth of data volumes and the impact it has on file and data management. To address this challenge we motivate the use of search-based file access in large-scale file systems. We then provide a basic introduction to file system search concepts and discuss the challenges in enabling such access in large-scale file systems.

2.1 The Growing Data Volume Trend

The digital universe is rapidly expanding [58]. Many aspects of business, science, and daily life are becoming increasingly digital, producing a wealth of data that must be stored. For example, today's media such as photos and videos are mostly digital, and the web and cloud services that host this media must be able to store and serve an indefinite amount of it. Facebook must manage over 60 billion image files and store over 25 TB of new photo data every week [52]. It is expected that CERN's Large Hadron Collider will annually produce over 15 PB of data [42]. In another example, government mandates such as HIPAA [175] and Sarbanes-Oxley [176] require the digitization and retention of billions of medical and financial records. In 2007 the digital universe (i. e., the total number of bytes created, captured, and replicated) was estimated to be 281 exabytes and is expected to grow tenfold by 2011 to about 2.8 zettabytes [58].



Figure 2.1: Network-attached storage architecture. Multiple clients utilize a centralized server for access to shared storage.

The value of, and requirements placed on, the data being generated are also increasing. Many important aspects of society are now dependent on the ability to effectively store, locate, and utilize this data. Some data, such as oil and gas survey results, can be worth millions of dollars [179]. Other data, such as government department of defense files or nuclear test results, can be vital to national security, and data such as genome sequences and bio-molecular interactions are key to the future of modern medicine. Additionally, the digitization of people's personal lives (e. g., personal photos, letters, and communications) has given data great sentimental value and made its storage critical to preserving personal histories.

2.2 Scalable Storage Systems

The increasing demand for data storage has driven the design of data storage systems for decades. The scale at which data is produced has forced system designs to focus on scalability and eliminating bottlenecks that may limit performance. There have been major improvements in throughput and latency, reliability, cost-effectiveness, and distributed designs over the years. As a result, today's large-scale file systems are capable of storing petabytes of data and billions of files, are composed of thousands of devices, and can serve data to thousands of users and applications [2, 36, 46, 60, 62, 91, 139, 146, 180, 183].

Early scalable file systems used a basic network-attached storage model where many clients were directly connected to a single centralized storage server, as shown in Figure 2.1. Storing data on a centralized server allowed it to be more easily shared and accessed from multiple locations, and provided more efficient utilization of storage resources. Protocols such as AFS [79] and NFS [127] are used to transfer data between the client and server. A number of file systems have been designed for the centralized file server, such as LFS [138] and WAFL [77]. These systems are often optimized for write performance because client caches are able to reduce the read traffic seen by the server. However, a single centralized server can often become a performance bottleneck that limits scalability. A single server can only serve a limited amount of storage and can become overwhelmed with requests as the number of clients or requests increases. To address this problem, clustered file systems, such as Frangipani [172] and ONTAP GX [46], allow multiple central servers to be used.

2.2.1 Parallel File Systems

Parallel file systems are a type of clustered file system that can achieve better scalability by allowing storage devices to be highly clustered (e. g., up to thousands of devices) and by separating the data and metadata paths. The basic design is illustrated in Figure 2.2. These systems are composed of metadata servers (MDSs) that store and handle file metadata requests and data servers that store and handle file data requests. Clients communicate directly with both metadata and data servers for file operations. Files are often striped across many storage devices, allowing clients to access a file using many parallel I/O streams.

Network-Attached Secure Disk (NASD) [62] was an early parallel file system architecture that introduced the concept of object-based storage. Object-based storage exploits growing price reductions and performance improvements in commodity CPU and memory hardware to provide intelligent network-attached storage devices. These object-based storage devices (OSDs) greatly improve the design because MDSs need to do less management, as OSDs can manage their own local storage.

The original NASD design has spawned a number of other parallel file systems. The General Parallel File System (GPFS) [146] from IBM uses a single metadata server, called


Figure 2.2: Parallel file system architecture. Clients send metadata requests to metadata servers (MDSs) and data requests to separate data stores.

the metanode, to handle metadata requests and perform distributed metadata locking. The distributed locking protocol hands out lock tokens to lock manager nodes, which eliminates the need for the metanode to handle all locking requests, thereby improving scalability. A fail-over mechanism allows another node to step in to perform metadata activities if the metanode fails.

Similarly, the Google File System (GFS) [60] uses a single metadata server, called the master, and many data servers, called chunkservers. By using a single master, the overall metadata design is simplified; however, the authors acknowledge it can present a performance bottleneck. To reduce load on the master, the master only stores two kinds of in-memory metadata: a chunk location mapping and an operations log for crash recovery. This approach reduces the number of client requests made to the master. Also to improve performance, GFS uses a loose consistency model that is acceptable for its workloads. The Linux PVFS file system [34] also uses a single metadata server.

However, when metadata requests cannot be avoided, such as during a metadata-intensive workload, a single metadata server can present a bottleneck. To alleviate this problem, PanFS [183] from Panasas uses a cluster of metadata servers, called manager nodes. Manager nodes, unlike other MDSs, do not store metadata themselves. Instead, file metadata (i. e., owner, size, modification time, etc.) is stored directly with the file's objects on the storage devices. The metadata manager handles coordination for these files, such as locking and synchronization.

Additionally, it maintains an operations log for crash recovery. Each manager is responsible for managing a single logical volume, which is usually made up of about ten data nodes.

The Ceph file system [180] also clusters MDSs for performance, though it uses a different approach. Rather than have the storage devices handle metadata requests for the data they store, the MDS cluster handles all metadata requests and uses the data stores only for persistent storage. The MDS cluster provides good load-balancing by allowing fragments of the namespace, as small as portions of a directory, to be replicated across the cluster using a method called dynamic sub-tree partitioning [182]. MDS load is decreased through a file mapping function [181] that maps inode identifiers to their location on storage devices, eliminating the need for the MDS to perform block look ups. Inodes are embedded directly within their parent directory, allowing more efficient reading and writing of metadata.
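The separation of metadata and data paths common to these systems can be sketched as follows: the client asks an MDS to resolve a path into a striping layout, then issues I/O directly to the data servers in parallel (steps 1–4 in Figure 2.2). This is an illustrative sketch only; the class and method names are assumptions and do not correspond to any particular system's API.

from concurrent.futures import ThreadPoolExecutor

class MetadataServer:
    """Resolves a path to a striping layout: an ordered list of
    (data_server, object_id) pairs."""
    def __init__(self, layouts):
        self.layouts = layouts

    def open(self, path):
        return self.layouts[path]

class DataServer:
    def __init__(self, objects):
        self.objects = objects

    def read_object(self, object_id):
        return self.objects[object_id]

def parallel_read(mds, servers, path):
    layout = mds.open(path)                     # steps 1-2: metadata request
    with ThreadPoolExecutor() as pool:          # steps 3-4: parallel data I/O
        chunks = list(pool.map(
            lambda loc: servers[loc[0]].read_object(loc[1]), layout))
    return b"".join(chunks)

servers = {
    "osd0": DataServer({"obj1": b"hello "}),
    "osd1": DataServer({"obj2": b"world"}),
}
mds = MetadataServer({"/data/file": [("osd0", "obj1"), ("osd1", "obj2")]})
print(parallel_read(mds, servers, "/data/file"))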

2.2.2 Peer-to-Peer File Systems

Another trend in large-scale file systems has been to look at how to more effectively scale costs by relying on commodity desktop computers or those available over the Internet instead of high-end, customized servers. These systems are often composed of less powerful machines, such as personal computers, that leverage cheap or freely available resources. Each node's role is dynamic and there is often no clear client-server model, causing these systems to often be referred to as peer-to-peer file systems. As the performance of commodity hardware increases to the point where it can rival custom hardware, peer-to-peer file systems are becoming increasingly common. These systems are popular for content distribution networks, such as BitTorrent, and are beginning to move into enterprise environments [8, 41].

OceanStore [91] is a global-scale file system that provides data redundancy and security. Files are addressed using globally unique identifiers (GUIDs) and different storage mechanisms are used for active and archival data. A distributed routing algorithm locates files, using their GUID, that may be on servers scattered across the globe. Data can be placed on any node in the file system, which allows for flexible replication, caching, and migration. Since devices are far less reliable than dedicated machines, complex update algorithms must be used to ensure

basic file system semantics are preserved even when nodes frequently fail or may be prone to abuse.

Like OceanStore, PAST [139] is another global-scale file system where commodity nodes organize into an overlay network. PAST leverages the global distribution of nodes to improve the reliability and availability of data, as no single physical disaster will likely destroy all copies of a data object, making it an attractive solution for backup and archival storage. PAST uses the Pastry [140] overlay routing system, which takes into account geographical location when routing data.

Pangaea [141] is a peer-to-peer file system that is intended for day-to-day use rather than backup or archival storage. Pangaea uses the under-utilized resources of most computers on a LAN to provide high-performance file access that should resemble a local file system. This is done through pervasive replication, which makes copies of data whenever it is accessed, ensuring that data is always close to the users that are accessing it. To provide data consistency, replicas use an optimistic coordination protocol that allows them to efficiently exchange update messages.

FARSITE [2] attempts to address a similar problem as Pangaea, providing performance comparable to a local file system. FARSITE notes that, since desktop computers are being utilized, the system is likely to experience a much wider variety of errors and faults than traditional file systems. As a result, it uses a byzantine fault-tolerant protocol to ensure that the system remains available even in the face of unpredictable errors. A distributed directory protocol [44] allows the namespace to be distributed over many nodes using immutable tree-structured file identifiers. This approach eliminates the need to migrate files when their pathnames change because distribution is done using immutable file identifiers rather than pathnames. Various kinds of metadata leases allow FARSITE to effectively balance metadata load to avoid bottlenecks.
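A minimal sketch of how a peer-to-peer system can locate data from a GUID alone: consistent hashing maps both node identifiers and GUIDs onto the same identifier space, and each GUID is owned by the node that follows it on the ring. This illustrates the general technique rather than OceanStore's or Pastry's actual routing, which also considers locality and maintains per-node routing tables; all names below are illustrative.

import bisect
import hashlib

def h(key):
    """Map a string onto a 160-bit identifier space."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class Ring:
    """Consistent-hashing ring: each GUID is owned by the first node
    whose identifier follows it on the ring."""
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)
        self.ids = [p[0] for p in self.points]

    def locate(self, guid):
        i = bisect.bisect_right(self.ids, h(guid)) % len(self.points)
        return self.points[i][1]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
for guid in ["file-123", "file-456", "file-789"]:
    print(guid, "->", ring.locate(guid))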

2.2.3 Other File Systems

While basic scalability and performance are the focal points for many applications, a number of other file systems have been designed to enable other kinds of storage functionality.

BigTable [36] and Boxwood [102] provide logical abstractions above the physical storage that change the way applications interact with data. These systems are often intended to provide a better way to represent complex stored data than a basic hierarchical namespace. BigTable is a large-scale multi-dimensional map where data is addressed using row, column, and time stamp identifiers rather than traditional file IDs. It is built on top of the Google File System and uses the Google SSTable, a file format that maps ordered keys to values, to map keys to their data locations in the file system. Ranges of rows are grouped into tablets, which are the unit at which data is distributed across the file system. BigTable is intended to be used in large-scale data processing applications, such as building a web index. Similarly, Amazon's Dynamo [41] uses a basic key-value storage interface built on top of a peer-to-peer file system. Its focus is on providing effective Service Level Agreements (SLAs) to customers and applications.

Boxwood [102] allows the construction of general data structures over a cluster of storage servers. The storage system can be used as any other data structure would be used by an application. For instance, building a distributed database is significantly easier when the storage system presents distributed B-tree or hash table abstractions. Boxwood handles many of the underlying communication mechanisms that make developing traditional distributed systems difficult.

The need to preserve data for decades or centuries has resulted in the design of several large-scale archival file systems. It is predicted that archival storage demands will exceed primary storage demands as more data is created and kept [58]. Venti [130] is a content-addressable distributed archival storage system that seeks to ensure data is maintained through write-once storage. The namespace consists of the content hashes of the data blocks and is distributed across nodes in the system. Venti uses disk-based storage rather than tape, which provides better reliability and performance. Similarly, Pergamum [166] uses disk storage rather than tape, and through the use of spin-down and data encoding techniques can provide cheaper, more reliable long-term storage.
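The content-addressable, write-once idea behind systems like Venti can be sketched as follows: a block is stored under the hash of its contents, so identical blocks are stored only once and every address can be used to verify the data it names. This is an illustrative sketch under those assumptions, not Venti's on-disk format.

import hashlib

class ContentStore:
    """Write-once block store addressed by the SHA-1 of block contents."""
    def __init__(self):
        self.blocks = {}

    def put(self, data):
        addr = hashlib.sha1(data).hexdigest()
        self.blocks.setdefault(addr, data)   # duplicate writes are no-ops
        return addr

    def get(self, addr):
        data = self.blocks[addr]
        assert hashlib.sha1(data).hexdigest() == addr  # self-verifying address
        return data

store = ContentStore()
a1 = store.put(b"archival block")
a2 = store.put(b"archival block")   # identical content yields the same address
print(a1 == a2, store.get(a1))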

2.3 The Data Management Problem in Large-Scale File Systems

The increasing amount of data being stored in enterprise, cloud, and high-performance file systems is changing the way we access and manage files. The improvements in file system design that we discussed have enabled the storage of petabytes of data and billions of files. However, this growth has resulted in a new and largely unsolved problem: how to effectively find and manage data at such large scales. File systems organize files into a basic hierarchical namespace that was designed over forty years ago, when file systems contained less than 10 MB of data [38]. File access requires explicit knowledge of the file’s name and location. While the hierarchical namespace has been successful as file systems have grown to millions of files (as evidenced by its longevity), its limitations become obvious and very problematic as file systems reach billions of files. The basic problem is that, as file systems have grown in scale, improvements in file organization and retrieval have not kept pace, leaving no effective way to find and manage files in large-scale file systems. The goal of the file system is to provide reliable, persistent storage of files and to organize them in a way that they can be easily accessed and retrieved. Stored data is of limited value if it cannot be effectively accessed and utilized. When file systems lack effective organization and retrieval functionality, data essentially becomes “trapped” within the file system, cannot be utilized, and is of little value. In some cases poor organization can cause files to be completely lost. For example, NASA’s original footage of the moon landing has been permanently misplaced despite being stored on tape shortly after it was recorded [171]. Moreover, storage users and administrators waste considerable time trying to organize and locate data. At the petabyte scale and beyond, basic tasks such as recalling specific file locations, finding which files consume the most disk space, or sharing files between users become difficult and time consuming. Additionally, there are a growing number of complicated tasks, such as regulatory compliance, managing hosted files in a cloud, and migrating files between storage tiers for maximum efficiency, that businesses must solve. For users, this wasted time can decrease productivity and limit the overall effectiveness of the file system. If a file’s location is not known, a great deal of effort and time must be spent navigating and traversing the namespace trying to locate it.

This process becomes much more difficult, time consuming, and less accurate when users must sift through billions of files. As a result, great care must be taken in organizing files and monitoring how data is used. Directories and files may be meticulously named with the hope that they can be recalled later or located through basic file browsing. For a storage administrator, not being able to properly find and analyze data can lead to under-utilized, poorly performing, and less reliable storage systems. For example, if an administrator cannot find out which files are rarely or frequently accessed, they cannot determine proper tiering strategies. Ineffective management is perhaps more serious, as it puts all data in the file system in jeopardy. File systems organize files into a hierarchical namespace. This organization presents a number of problems for finding and managing files at large scales.

1. File organization is restrictive. File system organization closely resembles the way objects would be organized in the physical world, such as files in a filing cabinet. In the physical world, retrieving a file from a filing cabinet may require knowing that it is in the first folder, in the second drawer of the third cabinet of the library. Retrieving a digital file requires knowing its physical location in the file system. For example, retrieving a file may require knowing that it is in directory /usr/local/foo/ under the name myfile.txt and under the /cs/students/ mount point. However, the file’s location by itself is a poor metric for retrieval because it does not describe the file and is irrelevant to the data that it contains. What matters is being able to find files with whatever information may be known about them.

Additionally, as in the physical world, files can only be organized along a single axis. For example, academic papers in a physical filing cabinet may be sorted by the author’s last name. This organization is helpful if the author’s last name is known but useless if only the title or subject of the paper is known. Similarly, in a file system, if directory and file organization is based on the file’s owner, then it is very difficult to find a file if only its type or subject matter is known. It is not hard to imagine how difficult it would be to find and manage a physical filing cabinet containing billions of files; it is similarly difficult to manage file systems of this size. File systems do maintain a number of per-file attributes, such as size, access and modification times, extension, and owner, that would be useful for retrieval. However, these attributes are not indexed and can only be used for retrieval during a brute force search. Similarly, files can only be related through parent → child relationships with directories. Other important inter-file relationships, such as provenance, temporal context, or use in a related project, are lost. Even the physical world is better in some regards. For example, libraries use card catalogs, which are a level of indirection between a book’s location in the library and its attributes (e. g., author, date of publication, topic) that aid retrieval.

2. File organization and retrieval are manual. Accessing a file requires manually telling the file system specifically where the file is. When a file’s location is not known, the file system’s namespace must be manually navigated. The file system provides no way to automatically locate these files or aid the search. For example, if an administrator wants to know which files have not been accessed in the past month, then a brute force traversal of the namespace must be performed in which each file’s access time is examined. Answering these kinds of questions with brute force search is far too slow to be practical at large scales.

Additionally, users and administrators need to answer questions about the properties of the files being stored in order to properly manage their data. At large scales, it can be very difficult to manually answer these questions because there is often only a limited understanding of what is being stored and how it is being used. For example, questions such as “which files should be backed up and which should be deleted?”, “where are the files with the flight information from my last vacation?”, or “which files were recently modified in my source code tree?” are very difficult to answer because they often require traversal of billions of files. At the scale of billions of files, manual organization and access are often not practical.

2.3.1 Search-Based File Management

The data management problem stems from the fact that file system hierarchies have grown beyond the ability of any single user or administrator to effectively manage them. The lack of an effective and scalable method to locate and manage files has resulted in a tendency for data utility or usefulness to decrease as the size of the system increases. Fortunately, decades of research and practice in the file systems, information retrieval, and database communities have shown that search provides a highly scalable retrieval method that can address many of these problems. File system search improves file system management by allowing files to be retrieved using any of their features or attributes. Retrieving a file requires knowing only what one wants rather than where to find it. Search eases the burden of organizing and navigating huge file hierarchies and allows files to be quickly summarized to provide a better understanding of the state of the file system [148]. Additionally, prior work has shown that search is far better aligned with how users think about and recall their data than standard file hierarchies [47, 157, 170]. The scalability of search as a retrieval method is also made evident by its success on the Internet, where search engines, such as Google and Yahoo!, have revolutionized how web pages are organized and accessed. Additionally, file system search has found commercial success on both desktop [14, 66, 110] and small-scale enterprise [55, 67, 85] file systems. There has also been an increasing demand for file search in high-end computing (HEC) [72], cloud [126], personal [142], and enterprise [51] file systems. While we are not suggesting that hierarchies are never useful (they are the best solution in some cases), they are a poor choice as a general solution.

2.3.2 File System Search Background

File systems store two kinds of data: the file data itself and metadata, which is data describing the file data. These two kinds of data allow different kinds of file searches to be performed.

2.3.2.1 File metadata

File metadata, such as inode fields (e. g., size, owner, timestamps, etc.) generated by the storage system and extended attributes (e. g., document title, retention policy, backup dates, etc.) generated by users and applications, is typically represented as ⟨attribute, value⟩ pairs that describe file properties. Today’s storage systems can contain millions to billions of files, and each file can have dozens of metadata attribute-value pairs, resulting in a data set with 10^10 to 10^11 total pairs. Metadata search involves indexing file metadata such as inode fields and extended attributes. Metadata search allows point, range, top-k, and aggregation search over file properties, facilitating complex, ad hoc queries about the files being stored. Metadata search can help users and administrators understand the kinds of files being stored, where they are located, how they are used, how they got there (provenance), and where they should belong. For example, it can help an administrator answer “which files can be moved to second tier storage?” or “which applications’ and users’ files are consuming the most space?”. Metadata search can also help a user find their ten most recently accessed presentations or largest virtual machine images, manage their storage space, or track file changes. Efficiently answering these questions can greatly improve how users and administrators manage files in large-scale file systems. As a result, metadata search tools are becoming more prevalent; recent reports state that 37% of enterprise businesses use such tools and 40% plan to do so in the near future [51]. Additionally, it is one of the research areas deemed “very important” by the high-end computing community [72].
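To make the kinds of queries above concrete, the following is a minimal, illustrative sketch of point, range, and top-k queries over in-memory ⟨attribute, value⟩ records. The record layout, attribute names, and query helpers are assumptions made for this example only; they are not the interface of any system discussed in this thesis, and a real metadata index would organize the records for sub-linear lookups rather than scanning them.

from dataclasses import dataclass
from typing import List

@dataclass
class FileMeta:
    path: str    # full pathname
    owner: str   # owning user
    size: int    # bytes
    mtime: int   # modification time (seconds since epoch)
    ext: str     # file extension

# A toy "index": just a list of records. A real metadata index would
# organize these records for fast lookups instead of scanning them.
records: List[FileMeta] = [
    FileMeta("/home/alice/talk.ppt", "alice", 4_000_000, 1_250_000_000, "ppt"),
    FileMeta("/home/bob/vm1.img", "bob", 8_000_000_000, 1_240_000_000, "img"),
    FileMeta("/home/alice/vm2.img", "alice", 12_000_000_000, 1_255_000_000, "img"),
]

def point_query(owner: str) -> List[FileMeta]:
    """Point query: exact match on a single attribute."""
    return [r for r in records if r.owner == owner]

def range_query(min_size: int) -> List[FileMeta]:
    """Range query: all files larger than min_size bytes."""
    return [r for r in records if r.size > min_size]

def top_k_largest(ext: str, k: int) -> List[FileMeta]:
    """Top-k query: the k largest files of a given type."""
    matches = [r for r in records if r.ext == ext]
    return sorted(matches, key=lambda r: r.size, reverse=True)[:k]

if __name__ == "__main__":
    # "Which virtual machine images are consuming the most space?"
    for r in top_k_largest("img", k=2):
        print(r.path, r.size)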

2.3.2.2 File content

File content search involves searching the data that exists within the file’s contents. The content that can be searched consists of keywords, terms, and attributes that are extracted from a file using transducers. Transducers are programs that read a file’s contents and parse specific file types to extract information, often in the form of string-based keywords. For example, a transducer that understands the pdf file type can parse an academic systems paper about file systems and extract keywords such as disk, log-structured, or workload from its contents.

A large-scale file system may contain thousands of keywords per file, yielding possibly up to 10^12 to 10^13 keyword occurrences. Content search is the foundation of most modern file system search applications. This type of search is very similar to the type of web search that search engines like Google and Yahoo! provide. These search engines parse keywords and terms from web pages and documents and index them. File content search allows a file to be retrieved using almost any piece of information contained within the file. Search results are ranked using algorithms, such as TF/IDF (term frequency-inverse document frequency) [135], which rank higher the files believed to better match what the user is looking for. In contrast, metadata search uses Boolean search, where query results either completely satisfy all fields in the query or they do not.
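To illustrate the contrast between ranked content search and Boolean metadata search, the sketch below scores files with a simple TF/IDF-style function over keywords extracted by a transducer. It is one common, simplified formulation of TF/IDF, assumed here for illustration; it is not the specific ranking function used by any engine cited above.

import math
from collections import Counter
from typing import Dict, List

# A toy corpus: file ID -> extracted keywords (as produced by a transducer).
corpus: Dict[str, List[str]] = {
    "paper1.pdf": ["disk", "log", "structured", "workload", "disk"],
    "paper2.pdf": ["cache", "workload", "latency"],
    "notes.txt":  ["disk", "cache"],
}

def tf_idf_scores(query_terms: List[str]) -> Dict[str, float]:
    """Score each file by summing tf * idf over the query terms.

    tf  = term frequency of the term in the file
    idf = log(N / df), where df is the number of files containing the term
    """
    n_docs = len(corpus)
    term_counts = {doc: Counter(terms) for doc, terms in corpus.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        for term in counts:
            doc_freq[term] += 1

    scores: Dict[str, float] = {}
    for doc, counts in term_counts.items():
        score = 0.0
        for term in query_terms:
            if doc_freq[term] == 0:
                continue
            idf = math.log(n_docs / doc_freq[term])
            score += counts[term] * idf
        if score > 0:
            scores[doc] = score
    return scores

if __name__ == "__main__":
    # Files that better match the query rank higher; Boolean metadata
    # search, by contrast, would only report exact matches.
    for doc, score in sorted(tf_idf_scores(["disk", "workload"]).items(),
                             key=lambda kv: kv[1], reverse=True):
        print(doc, round(score, 3))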

2.3.3 Large-Scale File System Search Use Cases

To further emphasize the importance of search in large-scale file systems we discuss several use case examples.

1. Managing scientific data. Large-scale file systems are commonly used for high-performance computing (HPC) applications, such as scientific simulations [39, 178]. A single high-performance physics simulation, for example, can generate thousands of files containing experiment data. The large number of files makes finding those with relevant or interesting results very difficult. As a result, a scientist may go to great lengths to organize the experiment files in a way that aids later navigation [23]. For example, a file’s name may be a composition of the experiment’s attributes. An experiment that succeeded, took 1 hour and 30 minutes to complete, and calculated a particle collision value of 22 microjoules may be named run 1 succ 1h30m 22uj.data. However, this still requires the scientist to manually parse thousands of files, which is slow, tedious, and inaccurate. On the other hand, file search allows the scientist to easily navigate these files by simply querying the attributes that they are interested in. For example, finding all successful experiment files can be done by querying all .data files in the .../phys sim/data/ directory that are associated with the attribute successful.

Likewise, a scientist may be able to find the average energy from a particle collision by querying all successful tests and averaging the energy attributes of the search results.

2. Aiding administrator policies. File system administrators rely on an understanding of what kinds of data are stored on a file system and how it is used to design and enforce their management policies. As file systems grow, it becomes more difficult to gather this information, often resulting in poor management policies. For example, large-scale file systems often employ tiered storage that allows different classes of data to reside on different classes of storage [184]. Finding which files should reside on top-tier storage and which should reside on lower-tier storage is a difficult chore for administrators [111], causing migration policies to often be simplistic and inaccurate. However, search can help an administrator decide on and enforce migration policies. An administrator who wants to migrate all files that have not been accessed in the past six months to tier-two storage is hard pressed to traverse possibly a billion files to find them. Using search, the administrator can simply query for files with access times older than six months and get an answer much faster. Likewise, before deciding on a six-month policy, the administrator may try to find how many files are actively being accessed. Without search, finding the files accessed during the course of a day is difficult; with search, a simple daily query answers this question.

3. Meeting legal obligations. With today’s increasing digital legislation, such as Sarbanes-Oxley [176] and HIPAA [175], many file systems are required to keep data for certain periods of time and to ensure it is not changed and is always available. This is a difficult task in large file systems because it requires an administrator to navigate and monitor up to billions of files to track compliance data. Additionally, the administrator must be able to ensure that these files have not been modified and must be able to produce them when subpoenaed to do so. Failure to meet these requirements can result in potential legal action. However, search greatly eases these pains, as tracking, monitoring, and producing files can all simply be done through queries. For example, an administrator may attach attributes to a file that define its legal retention period, a hash of its contents for verification, and the internal financial audits it may be related to.

Thus, in response to a subpoena or lawsuit, a query on these attributes can quickly bring up all related documents and verify their contents.

4. Archival data management. Often the largest file systems are those that act as digital archives [130, 166, 189]. Archival data is often written once and retrieved infrequently, possibly decades later by different users (e. g., children or heirs). As a result, directory and file organizations are often long forgotten and can be too large to easily re-learn. When archived data cannot effectively be retrieved or cannot later be found, it makes the preservation of the bits less important. However, search provides a simple and effective mechanism for the contents of a digital archive to later be found. For example, if one were to inherit their grandparents’ archived data in a will and were looking for digital photos from a specific event, they could easily find this data by querying attributes related to the event. Likewise, if an archive stores medical records, a simple search for the patient and the date can return their records. Manual search of an archive may take weeks and still not guarantee that the data is found.

5. Everyday file usage. While the examples above demonstrate areas where search is helpful, they do not represent most users’ everyday interactions with the file system. Typically, large-scale file systems are used for a variety of applications, with different users performing different tasks. For example, a large file system may have one user managing digital photos taken during an archeology excavation, one using AutoCAD drafting files, others working on a source code project, and others working on financial documents. Each of these users faces the challenge of organizing and managing their data. File search can ease the burden of trying to organize files into hierarchies because users are no longer worried about forgetting a single file path. Likewise, they no longer need to waste time or risk losing files when they cannot recall their pathnames. Since organizing, finding, and managing files is the primary way users interact with the file system, search has the potential to drastically change how the file system, in general, is used and to improve its overall utility.

In other words, search can potentially revolutionize the file system just as it revolutionized the Internet.

2.4 Large-Scale File System Search Challenges

While search is important to the overall utility of large-scale file systems, there are a number of challenges that must still be addressed. Both discussions with large-scale file system customers [129] and personal experience have shown that existing enterprise search tools [12, 55, 67, 85, 109] are often too expensive, slow, and cumbersome to be effective in large-scale systems. We now discuss some of the key challenges.

1. Cost. Searching large-scale file systems requires indexing billions of attributes and keywords. Large-scale search engines and database systems rely on dedicated hardware to achieve high performance at these scales. Dedicated hardware allows these systems to serve searches from main memory, utilize all CPU resources, and not worry about disk space utilization. Additionally, most are not embedded within the system, requiring additional network bandwidth to communicate with the storage server. Businesses that use these search systems, such as large banks and web search companies, can afford to provide this hardware because search performance is key to their business’s success. Many DBMSs used in banking systems assume abundant CPU, memory, and disk resources [75]. Large-scale search engines, such as Google and Yahoo!, use large, dedicated clusters with thousands of machines to achieve high performance [15]. However, with these hardware requirements it can cost tens of thousands of dollars to search just millions of files [65]. This cost is far too high for most file system budgets. Even if the cost can be budgeted, other file system necessities, such as I/O performance and capacity, generally take precedence. As a result, reliance on dedicated hardware for high-performance search can limit deployment in most large-scale file systems.

2. Performance. Search performance is critical for usability. For example, web search engines aim to return results in several hundred milliseconds [41]. Likewise, update performance is important because updates must be frequently applied so that search results accurately reflect the state of the file system.

Achieving fast search and update performance is difficult in file systems with billions of files and frequent file modifications [10, 48, 178]. Unfortunately, current solutions often rely on general-purpose, off-the-shelf indexing and search technology that is not optimized for file systems. Although standard indexing solutions, such as DBMSs, have benefited from decades of performance research and optimizations, such as vertical partitioning [87] and materialized views, their designs are not a perfect fit for file system search. These systems lack file system specific optimizations and have functionality that is not needed for file system search and that can add overhead even when disabled [165]. This is not a new concept; the DBMS community has argued that general-purpose DBMSs are not a “one size fits all” solution [28, 162, 165], instead saying that application-specific designs are often best. Similarly, Rosenblum and Ousterhout argued that “file system design is governed by...technology...and workload” [138], which is not the approach taken by general-purpose solutions. As a result, it is difficult to achieve scalable, high-performance search in large-scale file systems. While many desktop search systems can achieve adequate performance on a single, small desktop, it is difficult to scale these solutions to multi-user file systems with billions of files.

3. Data collection and consistency. File search is often implemented as an application outside of the file system. In order for search results to accurately reflect the file system, file changes must be extracted from the file system and replicated in the search application’s index. However, large-scale file systems have billions of files and highly demanding workloads with rapidly changing files [10, 48, 178]. This size and workload make efficiently collecting these changes very difficult. Also, collection methods, such as crawling the file system or interposing hooks along I/O paths, can have a negative impact on file system performance. We have observed commercial systems that took 22 hours to crawl 500 GB and 10 days to crawl 10 TB.

Due to the inability to quickly collect changes and update the index, large-scale web search engine updates are applied off-line on a separate cluster and the index is re-built weekly [15].

4. Highly distributed. In order to achieve the needed performance and scalability, large file systems are distributed. Similarly, a large-scale search index, which can have space requirements as high as 20% of the file system’s capacity [20], must also be distributed. However, distributed search engines, such as web search engines, often perform static partitioning across a cluster of dedicated machines and perform only manual, offline updates [106, 132]. These hardware and update methods are not feasible for file system search, which must often be integrated with the file system to limit cost, must be frequently updated, and must handle changing workloads. Additionally, distributing the index can help co-locate indexes near the files that they need to access and enables parallel execution, such as with MapReduce [40].

5. Ranking. Searching the web has been greatly improved through successful search result ranking algorithms [29]. These algorithms often rank results so well that they only need to return the few top-K results to satisfy most queries [151]. However, such algorithms do not yet exist for file systems, particularly large-scale file systems. Current desktop and enterprise file systems often rely on simplistic ranking algorithms that require users to sift through possibly thousands of search results. File systems currently lack the semantic information, such as hyperlinks, that web ranking algorithms leverage [25]. In large-scale file systems, a single search can return millions of results, making accurate ranking critical. To address this problem, there has been an increasing amount of work looking at how semantic links [73, 149, 155] in the file system can be used to improve ranking.

6. Security. Large-scale file systems often store data, such as nuclear test results, that makes security critical. File system search must not leak privileged data; otherwise it cannot be used in a wide variety of systems. Unfortunately, current file system search tools either do not enforce file permissions [12] or significantly degrade performance [33] to do so. In many cases, security is addressed by building a separate index for each user [33]. This approach guarantees that the user has permission to access the files in his or her index.

However, this approach also requires prohibitively large disk space since files are often replicated in many indexes. Moreover, changing a file’s permissions can require updating a large number of user indexes. Another approach is to perform permission checks (i. e., stat() calls) for every search result and only return results that pass the check. However, performing a permission check on what may be millions of files can significantly impact performance and pollute file system caches; a sketch of this per-result filtering appears after this list.

7. Interface. The traditional file system interface (i. e., POSIX) has lasted more than three decades in part because it uses a simple (albeit limited) organization paradigm. However, search-based access methods require a more complicated interface since files can be accessed with any of their attributes, search results must be shown, and inter-file relationships need to be visualized. Moreover, the interface must be simple enough that users are able and willing to use it frequently. Basic search interfaces, such as the simple Google keyword box, are likely too simplistic to express the kinds of queries that users need to ask, while languages such as SQL [35] are likely too complicated. Current and previous work has looked at how new interfaces [7, 89] can improve interaction with the file system.
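Returning to the security challenge above, the following sketch shows per-result permission filtering using stat()-style checks. The owner/group/other check and helper names are hypothetical simplifications (ACLs and caching are ignored); the point is that every result costs an additional metadata operation.

import os
import stat
from typing import List

def user_can_read(path: str, uid: int, gids: List[int]) -> bool:
    """Crude owner/group/other read check using stat(); ACLs are ignored.

    Every call is an extra metadata operation against the file system,
    which is exactly the overhead discussed above when a query returns
    millions of candidate results.
    """
    try:
        st = os.stat(path)
    except OSError:
        return False
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)

def filter_results(results: List[str], uid: int, gids: List[int]) -> List[str]:
    """Return only the search results the user is allowed to see."""
    return [p for p in results if user_can_read(p, uid, gids)]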

2.5 Existing Solutions

This thesis builds on work in the file system, information retrieval, and database communities. We now highlight existing work in file system search from these fields and discuss the challenges that still remain for large-scale file system search.

2.5.1 Brute Force Search

Early file system search tools aimed to make brute force search less cumbersome. Tools such as find and grep walked the file system hierarchy, accessing each file and checking whether it matched the query. When file systems were relatively small these tools were quite effective. However, at large scales this approach is not practical because searches can take hours or days and utilize most of the file system’s resources.
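A brute force content search can be written in a few lines, which also makes its cost obvious: every query walks the entire namespace and reads every file it can open. The sketch below is illustrative only and is not how find or grep are implemented.

import os
from typing import Iterator

def brute_force_search(root: str, keyword: str) -> Iterator[str]:
    """Yield paths of files under root whose contents contain keyword.

    Every query re-walks the namespace and re-reads every file it can
    open, so the cost grows with the size of the file system rather
    than with the size of the result set.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    if keyword.encode() in f.read():
                        yield path
            except OSError:
                continue  # unreadable files are skipped

if __name__ == "__main__":
    for match in brute_force_search("/tmp", "storage"):
        print(match)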

A number of approaches have looked at how to improve the performance of brute force search. For example, MapReduce [40] can distribute grep operations across a cluster so that they are close to the data they need to access and run in parallel. Diamond [81] uses an approach called Early Discard to quickly identify whether a file is relevant to a query. Early Discard uses application-specific “searchlets” to determine when a file is irrelevant to a given query. This approach reduces the amount of data that must be read by analyzing a small part of a file to determine whether it is worth continuing to search.

2.5.2 Desktop and Enterprise Search

More recently, search systems have relied on indexing to improve performance. The index is a data structure that pre-computes the location of attributes and keywords in the file system. Thus a query only needs to perform an index lookup rather than a traversal of the entire file system. This is the approach taken by many of the desktop [12, 66, 108, 110] and small-scale enterprise [55, 67, 82, 85] file system search applications. These applications often consist of a general-purpose relational database (DBMS) and an inverted index. The DBMS provides structured metadata search while the inverted index provides unstructured content search. Additionally, they are implemented as applications outside of the file system. These applications supplement the file system’s lack of search support and do not require the file system to be modified. The file system is periodically crawled to reflect new file changes in the search application’s index. Virtually all major operating systems (Windows, OS X, and Linux) now ship with file system search functionality included. Additionally, recent reports show that most enterprise businesses use or are planning to use an enterprise file system search appliance in the near future [51]. Businesses often use these appliances to improve employee productivity and to ensure legal compliance guidelines are followed. Both desktop and enterprise search target smaller-scale systems storing gigabytes of data and millions of files [65]. As a result, the general-purpose indexing solutions these applications use are relatively effective at such scales. Since these applications reside outside of the file system, file attributes are usually collected by walking the file system namespace and reading each file’s metadata and data.

Often transducers are used to parse various file type formats (e. g., doc, pdf, etc.) and extract keywords. While desktop and enterprise file system search is popular and becoming ubiquitous, it is not well suited to scale to large-scale file systems in cost, performance, or manageability. Their reliance on general-purpose index designs means they make few file system optimizations, do not efficiently enforce security, and require significant CPU, memory, and disk resources (for example, enterprise appliances ship with their own dedicated hardware resources [65]). Additionally, crawling the file system and updating the index are very slow, which causes the search index and file system to frequently be inconsistent, which can yield incorrect search results. Thus, these systems demonstrate the importance and popularity of file system search but also the challenges in scaling to very large file systems.

2.5.3 Semantic File Systems

File system search applications, such as desktop and enterprise search tools, do not address the limitations of hierarchical file systems. Instead they provide an auxiliary application that can be searched. However, the limitations of hierarchical file system organizations, which were identified over two decades ago [113], have prompted significant research into how file systems should present information to users. One of the first file systems to do this was the Semantic File System (SFS) [63]. SFS points out the limitations of traditional file system hierarchies and proposes an extension to the hierarchy that allows users to navigate the file system by searching file attributes that are automatically extracted from files. Virtual directories, which are directories whose contents are dynamically created by a query, are used to support legacy applications. These file systems are called “semantic” because the namespace allows files to be organized based on their attributes and semantic meaning (e. g., with virtual directories) rather than simply a location. Thus, a user can navigate a dynamically created hierarchy based on any file attributes. The SFS design is made to work as a back-end file server with the existing NFS and AFS protocols; however, it is not distributed across multiple servers. SFS relies on B-trees to index and provide search over file attributes.

SFS introduced the concept of a dynamic, search-based namespace for file systems. A number of file systems extended the basic concepts described in SFS to provide additional functionality. The Hierarchy and Content (HAC) [69] file system aims to integrate existing hierarchical file organizations with content-based search of files, with the goal of lowering the barrier to entry to semantic file systems for users. The authors argued that a key reason that SFS never caught on as a popular file system design was that it was too difficult for users to move from a standard hierarchical file system to a search-based one. Starting with a traditional hierarchical file system, HAC adds content-based access by allowing new semantic directories to be created: normal directories that are associated with a query, contain symbolic links to search results, and can be scoped to only a part of the hierarchy. They introduced the notion of scope consistency to deal with file attribute changes in nested virtual directories. HAC was implemented on top of a normal UNIX file system and uses the GLIMPSE [104] file system search tool to provide search of file contents. The pStore file system [188] extends traditional file systems to allow complex semantic file metadata and relationships to be defined. It introduces a new model for defining the attributes of a file and data schemas that are more flexible than those offered by traditional databases. The data schemas are based on the Resource Description Framework (RDF) and are used to construct customized namespaces that are meaningful to specific users or applications. The authors also argue that general-purpose databases are not the right structures for storing semantic file system data. The Logic File System (LISFS) [124] takes a more theoretical approach to semantic file retrieval. LISFS proposes a new model where file names are represented as Boolean equations. For example, accessing a file can be done with a Boolean equation in conjunctive normal form such as (a1 ∨ a2) ∧ (¬a3), where a1, a2, and a3 are file attributes. Using these equations, file paths can be constructed that allow a large number of different paths to be used to locate a file. In addition, traditional file system hierarchies can be constructed because these equations exhibit a commutative property. File attributes are stored in special tables that implement directed acyclic graphs, and file retrievals are translated into axiomatic look-ups in these tables.
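The Boolean naming model used by LISFS can be illustrated with a small sketch that evaluates a conjunctive normal form query, such as (a1 ∨ a2) ∧ (¬a3), against per-file attribute sets. The representation below is an assumption made purely for illustration and is far simpler than the table-based implementation LISFS actually uses.

from typing import Dict, List, Set, Tuple

# A CNF query is a list of clauses; each clause is a list of
# (attribute, negated) literals. (a1 OR a2) AND (NOT a3) becomes:
QUERY: List[List[Tuple[str, bool]]] = [
    [("a1", False), ("a2", False)],
    [("a3", True)],
]

def matches(attrs: Set[str], query: List[List[Tuple[str, bool]]]) -> bool:
    """True if the file's attribute set satisfies every clause."""
    return all(
        any((attr not in attrs) if negated else (attr in attrs)
            for attr, negated in clause)
        for clause in query
    )

files: Dict[str, Set[str]] = {
    "report.txt": {"a1"},
    "old.txt": {"a1", "a3"},
    "misc.txt": {"a2"},
}

if __name__ == "__main__":
    print([name for name, attrs in files.items() if matches(attrs, QUERY)])
    # -> ['report.txt', 'misc.txt']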

The Property List DIRectory system (PLDIR) [113] provided better ways to internally represent file attributes in a file system. PLDIR defined a general model for describing metadata using property lists, which could be used to represent file search abstractions. While its indexing capabilities were very basic, it did address some index update and consistency issues. Damasc [27] is a file system intended for high-performance computing environments. Damasc provides a declarative query interface to files, which is effective for many common HPC queries. File data is stored in a normal file system with a number of modules above it that provide file parsing, indexing, provenance collection, and interface capabilities. Indexes are constructed based on application query patterns in order to speed up common searches. Other semantic file systems provide better file retrieval mechanisms by improving naming rather than adding explicit file search. For example, the Prospero file system [117] builds a global file system where files are scattered across multiple machines and each user builds their own view of the data on the system, rather than trying to organize a global namespace. In that way, no matter how large the file system is, users see and deal with only the files that they deem relevant to them. The Linking File System (LiFS) [6] is designed to express semantic links between files (e. g., provenance and context) through actual links implemented in the file system. This allows users to use these links for quick traversal of the file system along axes other than just the hierarchy. Semantic file systems provide a more scalable way to organize, navigate, and search files than traditional file system hierarchies. However, most do not focus on the underlying search algorithms and data structures that enable scalable search performance. Instead, they focus on the interface and query language and use basic general-purpose indexes, such as simple B-trees or databases. Additionally, most are implemented on top of a traditional file system with a number of separate search applications that are used to index file attributes. Thus, while they provide a more scalable file system interface, they do not provide the scalable search performance or consistency guarantees that are needed to completely address search problems in massive-scale file systems. At relatively small scales (e. g., tens of thousands of files), file access performance can still be 3-5× slower than in a traditional file system.

2.5.4 Searchable File Systems

There are other file systems that enable file search without forgoing the traditional hierarchical file system. The Be File System (BeFS) [61] has support for file search built directly into the file system. BeFS stores extended attributes in hidden file system directories that are indexed in B+trees, allowing queries to be issued over them. These B+trees are stored in a special directory, with each attribute being represented as a directory entry and the index being stored as a file pointed to by the entry. Common attributes, like file size, owner, and modification times, are automatically indexed by BeFS, and the user can specify any additional attributes to be indexed. The Inversion File System [119] takes a different approach than BeFS. Inversion implements the file system using a PostgreSQL [163] database as the back-end storage mechanism. This provides transactional storage, crash recovery, and the ability to issue database queries over the file system. Each file is represented as a row in a table. File metadata and data are stored in separate tables, where the metadata table points to file data in the data table. Each normal file system operation maps to an SQL query. The Provenance-Aware Storage System (PASS) [114] stores provenance (i. e., lineage) metadata with each file, showing where its data came from and how it was created. This tackles a specific set of queries users and administrators often have about their data, such as “which source files were used to build this binary?” or “which file was this paragraph copied from?”. All provenance attributes are stored in Berkeley DB databases [120], and a query layer is added on top of these databases to allow users to search provenance data. Most of the underlying file system is left unchanged, except for the operations that are annotated to ensure provenance is kept.

2.5.5 Ranking Search Results

In addition to making search fast and consistent with the file system, another key challenge is making searches effective. That is, searches must actually be able to help users find the data they are looking for.

The importance of effective search is evident on the Internet, where search engines can satisfy searches with only the top few (e. g., ten) ranked search results [151]. Web ranking algorithms, such as Google’s PageRank [29], benefit from the semantic links between pages, which they can use to infer the importance of a page. As a result, most file system ranking algorithms attempt to extract semantic information from files to improve ranking. The Eureka [25] ranking algorithm extracts semantic links from files, such as name overlap or shared references, which allow techniques similar to PageRank to be used for file systems. Similarly, Connections [155] extracts temporal relationships from files, such as two source code files being opened at the same time, which are used as semantic links. Connections builds a graph of these links that is used to reorder search results from a file system’s search engine. Shah, et al. [149] infer semantic links using file provenance and build a graph-based ranking method similar to that of Connections. They conducted a user study to validate that such ranking improvements do enhance the overall user search experience. Finally, Gyllstrom, et al. [73] only consider the data that users have actually seen (i. e., information displayed on the screen) as searchable information. This approach is built around the intuition that users are only going to search for information that they have seen previously; if they never saw it, and therefore likely do not know it exists, it is unlikely they will want to search for it. This approach reduces the overall size of the search corpus and puts more weight on ranking data that the user has seen before. Additionally, web search has shown that personalized ranking and views of data can greatly enhance search relevance [83, 90, 99]. This approach has the potential to be useful on large-scale file systems, where search needs, and often the files to search, vary between users.
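As a rough illustration of link-based ranking applied to files, the sketch below runs a PageRank-style power iteration over a graph of inter-file semantic links (for example, provenance or temporal links). The graph, damping factor, and iteration count are illustrative assumptions; this is not the actual algorithm used by Eureka, Connections, or the other systems above.

from typing import Dict, List

def rank_files(links: Dict[str, List[str]], damping: float = 0.85,
               iterations: int = 50) -> Dict[str, float]:
    """PageRank-style scores over a graph of semantic links between files."""
    nodes = set(links) | {dst for dsts in links.values() for dst in dsts}
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for src, dsts in links.items():
            if not dsts:
                continue  # dangling files simply stop propagating score
            share = damping * score[src] / len(dsts)
            for dst in dsts:
                new[dst] += share
        score = new
    return score

if __name__ == "__main__":
    # e.g., provenance links from source files to the binary built from them
    links = {
        "main.c": ["app.bin"],
        "util.c": ["app.bin"],
        "app.bin": [],
        "notes.txt": ["main.c"],
    }
    for f, s in sorted(rank_files(links).items(), key=lambda kv: -kv[1]):
        print(f, round(s, 3))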

2.6 Index Design

Fast search of a file system is achieved by building an index beforehand. The two fundamental index structures used in file system search are relational databases (DBMSs) [37] and inverted indexes [74, 192]. DBMSs are often used to store file metadata, which consists of structured ⟨attribute, value⟩ pairs such as inode fields and extended attributes. This structure maps well onto the DBMS relational model.

Inverted indexes are used to store unstructured file keywords, which are often text-based. Inverted indexes, which are often called text databases, are the primary method of text indexing. These two search structures have been researched for decades and have a considerable number of different designs. We now highlight those related to file system search.

2.6.1 Relational Databases

The traditional relational database architecture is rooted in the design of System R [16], which was designed more than 25 years ago. System R and its descendants, such as DB2, MySQL, and PostgreSQL, are designed for business data processing workloads. At the time, these systems were designed to provide disk-oriented indexing and storage structures, and they relied heavily on threading for performance, coarse locking mechanisms for transaction support, and log-based recovery. In most DBMSs, data is stored in a table with a generally fixed number of columns and a variable number of rows that define relations. For example, a table for a file system may contain an inode number, owner, size, file type, and modification time as columns. As more files are added, more rows are added to the table. Each row is stored sequentially on-disk, making it efficient to read and write an entire row. New rows are appended to regions of the table, making it efficient to write a large number of consecutive rows. These DBMSs are often called row-stores or write-optimized databases: they are optimized for reading and writing entire rows, and new rows are written efficiently because they are appended sequentially to the table. Additionally, this approach is space efficient since a table with N rows and M columns stores only about N × M values. In the simplest case, querying the table involves a full sequential table scan to find the rows that match the query. When data is clustered in the table or when most of the table must be read, this can be quite efficient. However, when this is not the case, table scans can be extremely slow, as extra data may be read and a large number of disk seeks may be incurred. There are a number of optimizations that can improve query performance. A common approach to improving query performance is building B+-tree indexes on the columns in the table that need to be quickly searched.

Each B+-tree is indexed on a column’s attribute values and stores lists of row IDs that match each value. Queries for a specific attribute value or range of values can quickly identify the specific rows in the table that need to be retrieved, avoiding the cost of a table scan. This approach does not, however, eliminate the seeks that may occur if the rows are not adjacent in the table. When multiple indexed attributes are queried, such as querying for files where (owner = Andrew) ∧ (size > 5 MB), a query planner chooses the attribute it believes to be most selective (i. e., the attribute that will yield the fewest search results) and searches that attribute’s index. The N results from that index are then linearly scanned and filtered to make sure that the final results satisfy both query predicates. This pruning will thus yield between [0, N] results. By selecting the index with the fewest likely results, the fewest results need to be linearly scanned. Selecting records from the index likely to have the fewest results is much faster than searching both indexes and calculating the intersection of their results, since less data is read from the table and processed. However, doing so requires the query planner to be able to efficiently choose which index will produce the fewest results. To do this, query planners use selectivity estimators to estimate the selectivity of the predicates in a query; more specifically, they estimate the number of tuples that will satisfy each predicate, using statistical information about the table that the DBMS maintains. Unfortunately, when attribute value distributions are skewed, query planners tend to be inaccurate [101]. This is because the sampling methods, which rely on only a subset of the data and must be general, have difficulty adapting to the variance in highly skewed distributions. Additionally, when both values match many records, extra data will still be read and processed, even if their intersection is small. It should also be noted that building additional data structures, such as B+-tree indexes, does incur a space overhead. This overhead grows proportionally with the size of the table and increases as more columns are indexed. Additionally, update performance decreases because, in addition to the table, each index must also be updated.
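A minimal sketch of the query plan described above, probing the index assumed to be most selective and then filtering its candidate rows against the remaining predicate, is shown below. The table layout, the stand-in dictionary index, and the fixed choice of the owner index are assumptions for illustration, not the internals of any particular DBMS.

from typing import Dict, List, Tuple

# Rows of a toy file table: (row_id, owner, size_bytes)
rows: List[Tuple[int, str, int]] = [
    (0, "andrew", 1_000_000),
    (1, "andrew", 8_000_000),
    (2, "bob",    9_000_000),
    (3, "andrew", 12_000_000),
]

# Per-column "B+-tree" index: value -> row IDs (a dict stands in for a tree).
owner_index: Dict[str, List[int]] = {}
for rid, owner, _size in rows:
    owner_index.setdefault(owner, []).append(rid)

def query_owner_and_size(owner: str, min_size: int) -> List[int]:
    """Find rows where (owner == owner) AND (size > min_size).

    This planner always assumes the owner predicate is the more selective
    one, probes that index, and then filters the candidate rows against
    the size predicate. A real planner would compare selectivity
    estimates for both predicates before choosing.
    """
    candidates = owner_index.get(owner, [])
    results = []
    for rid in candidates:
        _rid, _owner, size = rows[rid]   # fetch the row from the table
        if size > min_size:              # filter on the second predicate
            results.append(rid)
    return results

if __name__ == "__main__":
    print(query_owner_and_size("andrew", 5_000_000))  # -> [1, 3]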

Another approach to improving performance is vertical partitioning [87]. Vertical partitioning physically breaks the table up into multiple tables, one for each column. Each table stores an additional position integer that allows rows to be identified across tables. Often tables are sorted to provide more efficient scanning. Many queries, such as the example above, do not need to retrieve an entire row’s worth of data. In a non-partitioned table, a scan requires retrieving the full row for each row scanned, which can be very slow. By partitioning the table along column axes, only columns with data relevant to a query need to be retrieved. Joins (often hash joins) are then used to facilitate searches across columns. Compared to building an index on each column of a non-partitioned table, vertical partitioning can be faster in cases where only a few columns are needed from many rows. This is because less disk bandwidth is used, as the row data that is not used in the query is not read. However, vertical partitioning trades this for computation overhead, as join operations must be used across the tables and can impede performance [1]. Also, storing an extra position ID with each row in each table to identify which tuple it belongs to adds a space overhead, potentially doubling the space required if all other attributes are integers. Vertical partitioning also degrades update performance since writing a row now requires a disk seek between each table. Column-stores are a more recent DBMS design that is read-optimized, rather than write-optimized like traditional DBMSs [164]. A column-store DBMS stores a table’s column data contiguously on-disk instead of row data. Doing so allows queries that need to read only a subset of the columns to perform much faster than if the entire row had to be read. Additionally, many column-stores use compression and dense packing of column values on-disk to reduce disk bandwidth requirements [56, 164] and lean more heavily on CPU performance. Additionally, extra tuple information does not need to be stored with each column. Column-stores use join indexes to provide more efficient queries across columns. However, since column-stores are read-optimized, their design comes at a cost to write performance. Writing an entire row now requires a disk seek between each column, similar to vertical partitioning, which decreases performance.
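The difference between the row-oriented and column-oriented layouts discussed above can be sketched by keeping each column in its own array, so that a scan touches only the columns a query needs while a row insert must touch every column. This is an illustrative simplification of the cited designs, not an implementation of any of them.

from typing import List, Tuple

# Row-store layout: each row stored together.
row_store: List[Tuple[str, int, str]] = [
    ("andrew", 1_000_000, "txt"),
    ("bob",    9_000_000, "img"),
    ("andrew", 12_000_000, "img"),
]

# Column-store layout: one array per column, aligned by position.
owners: List[str] = ["andrew", "bob", "andrew"]
sizes:  List[int] = [1_000_000, 9_000_000, 12_000_000]
exts:   List[str] = ["txt", "img", "img"]

def total_size_row_store() -> int:
    # Touches every field of every row even though only size is needed.
    return sum(size for _owner, size, _ext in row_store)

def total_size_column_store() -> int:
    # Touches only the size column; on disk this means far less data read.
    return sum(sizes)

def append_row_column_store(owner: str, size: int, ext: str) -> None:
    # Writing one row must touch every column array, which is why column
    # layouts (like vertical partitioning) pay a write penalty.
    owners.append(owner)
    sizes.append(size)
    exts.append(ext)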

The column-store design stems from significant changes in technology trends and business processing workloads since the 1970s, when the original System R was designed. Processor speeds have increased significantly, as have disk capacities, while disk bandwidth has increased at a much slower rate. Additionally, the price per gigabyte of memory has decreased [165]. These trends make the heavily disk-based DBMS data structures a bottleneck. Similarly, business processing workloads (e. g., OLTP, OLAP, and data warehousing) have become more read heavy, and their need for heavyweight logging, threading, and transactional operations has been decreasing [75]. These continual changes in technology and workload, and a continued reliance on traditional general-purpose DBMSs, have given rise to a belief that existing DBMS designs are not a “one size fits all” solution [28, 162, 165]. That is, a general-purpose DBMS cannot simply be tuned and calibrated to properly fit every workload. Many in the database community have argued and shown [75, 161] that using a traditional DBMS for a variety of search and indexing applications is often a poor solution and that a customized, application-specific design that considers the technology and workload requirements of the specific problem can often significantly outperform general-purpose DBMSs. The “one size fits all” concept suggests that general-purpose DBMSs may not be an optimal solution for file system search. DBMSs were designed for business processing in the 1970s, not modern large-scale file system search. Achieving reasonable performance will often require additional hardware as well as a database administrator to constantly “tune” database knobs. DBMSs also have functionality, such as transactions and coarse locking, that is not needed for file system search and can add overhead [165]. Similarly, as evidenced by their contrasting designs, databases are often optimized for either read or write workloads and have difficulty doing both well [1, 76, 78]. File system search must provide both fast search performance and frequent, real-time updates of the index.

2.6.2 Inverted Indexes

Inverted indexes [74, 192] are designed to be text databases and are the foundation of content (also known as keyword, full-text, or term) search on the Internet and in current file system search. An inverted index, for a given text collection, builds a dictionary that contains a mapping from each of the K keywords in the collection to a list of the locations where the keyword occurs in the file system. The basic inverted index architecture is illustrated in Figure 2.3. Each keyword location is known as a posting, and the list of locations for a keyword is called the keyword’s posting list. For a file system, these locations are generally offsets within files where the keyword occurs.


Figure 2.3: Inverted index architecture. The dictionary is a data structure that maps keywords to posting lists. Posting lists contain postings that describe each occurrence of the keyword in the file system. Searches require retrieving the posting lists for each keyword in the query.

In most cases, each posting in the list contains a tuple with the ID of the file (such as an inode number) in which the term occurs, the number of times it occurs, and a list of offset locations within the file for each occurrence. Thus each posting generally has the format (docid, tf, ⟨p1, ..., p_tf⟩), where docid is the unique document identifier, tf is the number of term occurrences, and p_i is the offset within the file where the term occurs.

The dictionary is usually implemented as a hash table or sort-based data structure that allows efficient lookup using a single term as a key. The dictionary can often grow quite large, since it requires close to K × N bytes, where K is the average keyword length in bytes and N is the number of keywords. For large-scale file systems with many keywords, even with compression, the dictionary will be far too large to fit in a single machine’s main memory [30]. Thus, dictionaries are often either stored on-disk or distributed across a cluster of machines. Each posting list is stored sequentially on-disk. Given a query, such as index ∧ search ∧ storage, each keyword is looked up in the dictionary and their corresponding posting lists are read from disk. The intersection of the posting lists is calculated and a result ranking function is then applied to produce the final search results. In most cases, search performance is a function of the amount of data that must be read from disk, the number of disk seeks incurred to read the data, and the amount of processing (e. g., decompression, ranking, file permission checks) that must be done to produce the final set of results.
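A minimal in-memory inverted index using the posting format described above, together with a conjunctive query that intersects posting lists, might look like the following sketch. It omits compression, ranking, the on-disk layout, and permission checks, and the class and method names are chosen only for this example.

from collections import defaultdict
from typing import Dict, List, Set, Tuple

# posting: (docid, tf, [offsets]) -- one entry per file containing the term
Posting = Tuple[int, int, List[int]]

class InvertedIndex:
    def __init__(self) -> None:
        # dictionary: term -> posting list
        self.dictionary: Dict[str, List[Posting]] = defaultdict(list)

    def add_document(self, docid: int, terms: List[str]) -> None:
        positions: Dict[str, List[int]] = defaultdict(list)
        for offset, term in enumerate(terms):
            positions[term].append(offset)
        for term, offs in positions.items():
            self.dictionary[term].append((docid, len(offs), offs))

    def conjunctive_query(self, terms: List[str]) -> Set[int]:
        """Return docids containing ALL query terms (intersection of lists)."""
        doc_sets = []
        for term in terms:
            postings = self.dictionary.get(term, [])
            doc_sets.append({docid for docid, _tf, _offs in postings})
        if not doc_sets:
            return set()
        result = doc_sets[0]
        for s in doc_sets[1:]:
            result &= s
        return result

if __name__ == "__main__":
    idx = InvertedIndex()
    idx.add_document(1, ["index", "search", "storage", "index"])
    idx.add_document(2, ["search", "cache"])
    print(idx.conjunctive_query(["index", "search", "storage"]))  # -> {1}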

However, keeping posting lists sequential on-disk can make list updates very slow, especially as the posting list grows in size. As a result, incremental posting list update algorithms are generally either optimized for search performance (i. e., they try to maintain posting list sequentiality) or for update performance (i. e., they sacrifice posting list sequentiality for fast writes) [96]. In-place update algorithms are update-optimized approaches. When a posting list is created, a sequential region on disk is allocated that is larger than the required size, leaving some unused space at the end. As new postings are added, they are written into the over-allocated region of the posting list, without requiring any extra data reads or disk seeks. If the over-allocated region is not used, it becomes wasted space. If the posting list grows beyond the size of the region, a new sequential region is allocated elsewhere on disk to hold new postings. A disk seek is then required to read the entire posting list (i. e., all on-disk regions). Thus, as posting lists grow in size, in-place updates can make search performance poor, as many disk seeks may be required to read a posting list from disk. Merge-based algorithms are search-optimized approaches. When a posting list is created, a sequential region of the exact size is allocated on disk. When the list is modified, the original list is read from disk, modified in-memory, and then written sequentially to a new location on disk. Thus, merge-based approaches incur no extra space overhead and ensure that posting lists are sequential on disk. However, update performance can be very slow since the entire posting list must be read and written for each update. There has been additional work to improve the trade-offs between search and update performance. Hybrid index maintenance [32] divides the index into two sections, one that is managed with in-place updates and one that is managed with merge-based updates. This division is based on posting list length: short posting lists (i. e., shorter than some calculated threshold V) are managed with merge-based updates. Short posting lists can more easily be read and written with a merge-based update and would waste significant disk space under an in-place approach. Longer posting lists are then managed with an in-place strategy, where merge-based updates would simply be too slow. However, an in-place update mechanism still requires extra disk seeks to retrieve long posting lists that are larger than a single region.
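The trade-off between merge-based and in-place list maintenance can be sketched with a simple length threshold in the spirit of hybrid index maintenance [32]. The threshold value, the segment-based model of on-disk regions, and the function names below are illustrative assumptions, not the published algorithm.

from typing import Dict, List

THRESHOLD_V = 4  # lists shorter than this are merged; longer lists grow in place

# Simulated on-disk state: term -> list of contiguous "segments" of postings.
# A single segment models one sequential region on disk; multiple segments
# model the extra regions (and extra seeks) created by in-place growth.
segments: Dict[str, List[List[int]]] = {}

def append_postings(term: str, new_postings: List[int]) -> None:
    segs = segments.setdefault(term, [[]])
    total = sum(len(s) for s in segs)
    if total < THRESHOLD_V:
        # Merge-based update: read the whole list, rewrite it sequentially.
        merged = [p for s in segs for p in s] + new_postings
        segments[term] = [merged]
    else:
        # In-place style update: leave existing regions alone, add a new region.
        segs.append(list(new_postings))

def read_posting_list(term: str) -> List[int]:
    # Each extra segment would cost an additional disk seek in a real index.
    return [p for s in segments.get(term, []) for p in s]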

Often there is a delay between when file changes are collected and when they are reflected in the index. The JiTI [93] algorithm allows changes that have not yet been applied to the index, and are buffered in-memory, to still be searched. A postings cache prepares posting lists in-memory prior to their being written out to the main index. A separate query engine then issues queries over both the main index and the postings cache, allowing the most recent updates to also be returned.

Web search engines, such as Google and Yahoo!, maintain possibly the largest inverted indexes, indexing trillions of web pages [68]. These search engines utilize multi-million dollar data centers to search this many pages. The inverted index is partitioned across the data center using one of two methods: the local inverted file (IF_L) or the global inverted file (IF_G) [15, 132, 133]. The IF_L strategy divides posting lists across the P servers in the data center; searches are then broadcast to all P servers and each returns a disjoint set of results. The IF_G strategy partitions the keyword space amongst the P servers, where each server is responsible for a part of the space, such as a 1/P fraction of the keywords.

The IFG strategy partitions the keyword space amongst the P servers, where each server is responsible for a part of the space, such as a 1/P fraction of the keywords. Each of the P servers has enough memory and proxy cache nodes to ensure that queries can be served without going to disk. The index is re-built off-line on a close to weekly basis after additional web pages are crawled [15]. During this re-build process, a large-scale staging area (a separate cluster of machines) is used to modify or re-create posting lists and calculate page rankings. An additional link index is often used to improve rank calculations. While inverted indexes are the mainstay of modern text retrieval, not all of these designs are well suited for file system search. Small-scale, disk-based inverted indexes make significant trade-offs between search and update performance. As a result, either updates are too slow to handle real-time update requirements or search performance becomes unacceptably slow. Additionally, small-scale designs have few methods for appropriately distributing the index. Large-scale designs, such as web search engines, are far too expensive and heavyweight to be effective for a file system; they require dedicated resources and re-build the index too infrequently. These issues suggest that designs tailored to file system search may be able to improve performance.
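As a rough illustration of the difference between the two partitioning schemes, the sketch below shows how a query term could be routed under each. The server count and hash function are assumptions, not details taken from the cited systems.

```python
# Sketch of routing a query term under the two partitioning schemes above.
import hashlib

P = 8  # assumed number of index servers

def ifg_server(term: str) -> int:
    """Global inverted file: each server owns about 1/P of the keyword space,
    so a term maps to exactly one server."""
    digest = hashlib.md5(term.encode()).hexdigest()
    return int(digest, 16) % P

def ifl_servers(term: str) -> range:
    """Local inverted file: posting lists are split across all servers,
    so every query is broadcast and each server returns a disjoint result set."""
    return range(P)

print(ifg_server("storage"))         # single owner under IFG
print(list(ifl_servers("storage")))  # all P servers under IFL
```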

2.6.3 File System Optimized Indexing

We are not the first to argue that general-purpose indexing solutions are not the right approach to file system search, and a number of file-system-specific indexing techniques exist. GLIMPSE [104] reduced disk space requirements, compared to a normal full-text inverted index, by maintaining only a partial inverted index that does not store the location of every term occurrence. They argued that the space requirements for maintaining a full inverted index were too high for most file systems. GLIMPSE partitioned the search space using fixed-size blocks of the file space, which were then referenced by the partial inverted index. A tool similar to grep was used to find exact term locations within each fixed-size block. However, the GLIMPSE tool is almost two decades old and disk capacity is not nearly as significant a problem today as it was then. As a result, in most cases, even for file systems, maintaining a full inverted index for performance is preferred. Geometric partitioning [95] aimed to improve inverted index update performance by breaking up the inverted index's inverted lists by update time. The most recently updated inverted lists were kept small and sequential, allowing future updates to be applied quickly. A merging algorithm created new partitions as the lists grew over time. Query-based partitioning [112] used a similar approach, though it partitioned the inverted index based on file search frequency, allowing index data for infrequently searched files to be offloaded to second-tier storage to reduce cost. The Wumpus desktop search system [31] introduces a number of improvements to conventional inverted index design that improve full-text search on desktop file systems. However, its current design targets desktop file systems and lacks a number of features critical to large-scale file system search. SmartStore [80] is a search system that indexes metadata in a distributed R-tree and uses Latent Semantic Indexing (LSI) to group correlated metadata. SmartStore does not handle real-time updates, and the use of LSI limits its ability to perform index updates quickly.

2.7 Summary

The amount of digital data being produced is increasing rapidly and is only expected to grow in the future. In this chapter we described how this rapid growth in data volumes has introduced a new challenge for large-scale file systems: effectively organizing, finding, and managing such a large sea of information. In response to this challenge there has been an increasing trend towards search-based file access. Through a discussion of file search properties and use cases we showed how search addresses many of the limitations and restrictions of current simple hierarchical file organizations. We argued that existing file system search solutions are ill-equipped to scale to billions of files. We discussed the search challenges that make existing solutions too expensive, slow, and cumbersome to be effective at such scales. We showed that these systems rely on general-purpose index designs that are not designed for file system search and can limit search and update performance. We also provided a high-level overview of how general-purpose indexes such as DBMSs are used for file system search. Additionally, we described how designing search systems outside of the file system causes consistency and efficiency issues that are difficult to address. These observations drive the motivation for the following chapters. Since effective file system design is driven by an understanding of the kinds of data stored and how it is used, we begin by looking at the properties of large-scale file systems. We then explore how the use of general-purpose index structures can be avoided with index designs that are built specifically for file systems and that leverage file system properties. We use these new indexing techniques to explore how search functionality can be integrated directly into the file system to avoid the need to maintain a replica of all file attributes in a separate search application.

Chapter 3

Properties of Large-Scale File Systems

Large-scale network file systems are playing an increasingly important role in today’s data storage. The motivation to centralize data behind network file systems has been driven by the desire to lower management costs and the need to reliably access growing amounts of data from multiple locations and is made possible by improvements in processing and network capabilities. The design of these systems [88, 116, 138] is guided by an understanding of the kinds of data being stored and how data is being used, which is often obtained by measuring and analyzing file system traces. There are two different methods used to understand file system properties; measuring and analyzing workloads, which shows how data is being used as well as measuring and analyz- ing snapshots, which shows the kinds of data being stored. Workload traces contain data access requests, such as a file being opened and read. Snapshot traces contain information about the properties of files stored at a given time, such as their location in the namespace and the types of files (e. g., doc, pdf). While a number of trace-based file system workload [17, 48, 123, 137, 177, 178] and snapshot [5, 22, 39, 43, 145] studies have been conducted in the past, there are factors indicating that further study is necessary. First, the last major network file system trace study [48] analyzed traces from 2001, over half a decade ago; there have been significant changes in the architecture and use of network storage since then. Second, no published workload or snapshot study has ever analyzed large-scale enterprise network file systems, focusing instead on research or desk-

top systems, such as those seen in university settings. While useful, their findings likely differ from those in large enterprise file systems. Third, most studies focus on analyzing properties that are applicable to improving disk performance. There is little analysis that focuses on file organization and workload semantics, which are important for designing better organization and indexing strategies. In this chapter, we collect and analyze traces of both file system workloads and snapshots from real-world, large-scale enterprise network file systems. Our workload traces were collected over three months from two network file servers deployed in NetApp's data center. One server hosts data for the marketing, sales, and finance departments, and the other hosts data for the engineering departments. Combined, these systems contain over 22 TB of actively used storage and are used by over 1500 employees. We traced CIFS [92] network traffic, which is the primary network file protocol used in Windows. Our snapshot traces were collected from three network file servers also deployed at NetApp. One server hosted web and Wiki server files, another was a build server for the engineering department, while another served employee home directories. Combined, these systems contain almost 80 TB of actively used storage. The analysis of these traces focused on: (1) changes in file access patterns since previous studies, (2) aspects specific to network systems, such as file sharing, and (3) how the namespace impacts file attributes. All of these are important for better understanding how we can organize and manage files. Our analysis revealed important changes in several aspects of file system workloads. For example, we found that read-write file access patterns, which are highly random, are much more common relative to read-only and write-only access patterns as compared to past studies. We also found our workloads to be more write-oriented than those previously studied, with only about twice as many bytes read as written. Both of these findings challenge traditionally held assumptions about access patterns and sequentiality. We also found that metadata attribute values are highly clustered in the namespace and have very skewed distributions. These findings demonstrate important properties about how files are organized in large-scale network file systems. A summary of key observations can be found in Table 3.1.

Compared to Previous Studies

1. Both workloads are more write-oriented. Read to write byte ratios have significantly decreased.
2. Read-write access patterns have increased 30-fold relative to read-only and write-only access patterns.
3. Most bytes are transferred in longer sequential runs. These runs are an order of magnitude larger.
4. Most bytes transferred are from larger files. File sizes are up to an order of magnitude larger.
5. Files live an order of magnitude longer. Fewer than 50% are deleted within a day of creation.

New Observations

6. Files are rarely re-opened. Over 66% are re-opened once and 95% fewer than five times.
7. File re-opens are temporally related. Over 60% of re-opens occur within a minute of the first.
8. A small fraction of clients account for a large fraction of file activity. Fewer than 1% of clients account for 50% of file requests.
9. Files are infrequently shared by more than one client. Over 76% of files are never opened by more than one client.
10. File sharing is rarely concurrent and sharing is usually read-only. Only 5% of files opened by multiple clients are concurrent and 90% of sharing is read-only.
11. Most file types do not have a common access pattern.
12. Specific file metadata attribute values exist in only a small fraction of directories.
13. A small number of metadata attribute values account for a large fraction of the total values.

Table 3.1: Summary of observations. A summary of important file system trace observations from our trace analysis. Note that we define clients to be unique IP addresses, as described in Section 3.3.1.

The remainder of this chapter is organized as follows. The previous workload studies are discussed in Section 3.1 and our workload tracing methodology is discussed in Section 3.2. Our workload analysis is discussed in Section 3.3 with the implications described in Section 3.4. Section 3.5 describes the previous snapshot studies. How we collected the snapshot traces is detailed in Section 3.6. Our snapshot trace observations and their impact are presented in Section 3.7. In Section 3.8 we summarize this chapter.

3.1 Previous Workload Studies

File system workload studies have guided file system designs for decades. In this section, we discuss the previous studies, which are summarized in Table 3.2, and outline factors that we believe motivate the need for new file system analysis. In addition, we provide a brief background of the CIFS protocol.

Study | Date of Traces | FS/Protocol | Network FS | Workload
Ousterhout, et al. [123] | 1985 | BSD | | Engineering
Ramakrishnan, et al. [131] | 1988-89 | VAX/VMS | x | Engineering, HPC, Corporate
Baker, et al. [17] | 1991 | Sprite | x | Engineering
Gribble, et al. [71] | 1991-97 | Sprite, NFS, VxFS | x | Engineering, Backup
Vogels [177] | 1998 | FAT, NTFS | | Engineering, HPC
Zhou and Smith [191] | 1999 | VFAT | | PC
Roselli, et al. [137] | 1997-00 | VxFS, NTFS | | Engineering, Server
Ellard, et al. [48] | 2001 | NFS | x | Engineering, Email
Anderson [10] | 2003 & 2007 | NFS | x | Computer Animation
Leung, et al. | 2007 | CIFS | x | Corporate, Engineering

Table 3.2: Summary of major file system workload studies over the past two decades. For each study, we show the date of trace collection, the file system or protocol studied, whether it involved network file systems, and the workloads studied.

Early file system workload studies, such as those of the BSD [123], VAX/VMS [131], and Sprite [17] file systems, revealed a number of important observations and trends that guided file system design for the last two decades. In particular, they observed a significant tendency towards large, sequential read access, limited read-write access, bursty I/O patterns, and very short file lifetimes. Another study observed workload self-similarity during short time periods, though not for long time periods [71]. A more recent study of the Windows NT file system [177] supported a number of the past observations and trends. In 2000, Roselli, et al. compared four file system workloads [137]; they noted that block lifetimes had increased since past studies and explored the effect on caching strategies. At the time of our study, the most recent study

was from 2001 and analyzed NFS traffic to network file servers [48], identifying a number of peculiarities with NFS tracing and arguing that pathnames can aid file system layout. Since we conducted our study, another study that examined NFS traces from an intensive computer animation environment was published [10].

3.1.1 The Need for a New Study

Although there have been a number of previous file system workload studies, several factors indicate that a new workload study may aid ongoing network file system design.

Time since last study. There have been significant changes in computing power, network bandwidth, and network file system usage since the last major study in 2001 [48]. A new study will help understand how these changes impact network file system workloads.

Few network file system studies. Only a few studies have explored network file system workloads [17, 48, 131], despite their differences from local file systems. Local file system workloads include the access patterns of many system files, which are generally read-only and sequential, and are focused on the client's point of view. While such studies are useful for understanding client workload, it is critical for network file systems to focus on the workload seen at the server, which often excludes system files or accesses that hit the client cache.

No CIFS protocol studies. The only major network file system study in the past decade analyzed NFSv2 and v3 workloads [48]. Though NFS is common on UNIX systems, most Windows systems use CIFS. Given the widespread Windows client population and differences between CIFS and NFS (e.g., CIFS is a stateful protocol, in contrast to NFSv2 and v3), analysis beyond NFS can add more insight into network file system workloads.

Limited file system workloads. University [17, 48, 71, 123, 137] and personal computer [191] workloads have been the focus of a number of past studies. While useful, these workloads may differ from the workloads of network file systems deployed in other environments.

3.1.2 The CIFS Protocol

The CIFS protocol, which is based on the Server Message Block (SMB) protocol that defines most of the file transfer operations used by CIFS, differs in a number of respects from the oft-studied NFSv2 and v3. Most importantly, CIFS is stateful: CIFS user and file operations act on stateful session IDs and file handles, making analysis of access patterns simpler and more accurate than in NFSv2 and v3, which require heuristics to infer the start and end of an access [48]. Although CIFS may differ from other protocols, we believe our observations are not tied exclusively to CIFS. Access patterns, file sharing, and other workload characteristics observed by the file server are influenced by the file system users, their applications, and the behavior of the file system client, which are not closely tied to the transfer protocol.

3.2 Workload Tracing Methodology

We collected CIFS network traces from two large-scale, enterprise-class file servers deployed in the NetApp corporate headquarters. One is a mid-range file server with 3 TB of total storage, with almost 3 TB used, deployed in the corporate data center that hosts data used by over 1000 marketing, sales, and finance employees. The other is a high-end file server with over 28 TB of total storage, with 19 TB used, deployed in the engineering data center. It is used by over 500 engineering employees. Throughout the rest of this paper, we refer to these workloads as corporate and engineering, respectively. All NetApp storage servers support multiple protocols including CIFS, NFS, iSCSI, and Fibre Channel. We traced only CIFS on each file server. For the corporate server, CIFS was the primary protocol used, while the engineering server saw a mix of CIFS and NFS protocols. These servers were accessed by desktops and laptops running primarily Windows for the cor- porate environment and a mix of Windows and Linux for the engineering environment. A small number of clients also ran Mac OS X and FreeBSD. Both servers could be accessed through a Gigabit Ethernet LAN, a wireless LAN, or via a remote VPN. The traces were collected from both the corporate and engineering file servers be- tween August 10th and December 14th, 2007. For each server, we mirrored a port on the

network switch to which it was connected and attached it to a Linux workstation running tcpdump [169]. Since CIFS often utilizes NetBIOS [3], the workstation recorded all file server traffic on the NetBIOS name, datagram, and session service ports as well as traffic on the CIFS port. The trace data was periodically copied to a separate file server. Approximately 750 GB and 1.5 TB of uncompressed tcpdump traces were collected from the corporate and engineering servers, respectively. All traces were post-processed with tshark 0.99.6 [185], a network packet analyzer. All filenames, usernames, and IP addresses were anonymized. Our analysis of CIFS presented us with a number of challenges. CIFS is a stream-based protocol, so CIFS headers do not always align with TCP packet boundaries. Instead, CIFS relies on NetBIOS to define the length of the CIFS command and data. This became a problem during peak traffic periods when tcpdump dropped a few packets, occasionally causing a NetBIOS session header to be lost. Without the session header, tshark was unable to locate the beginning of the next CIFS packet within the TCP stream, though it was able to recover when it found a new session header aligned with the start of a TCP packet. To recover CIFS requests that fell within this region, we wrote a program to parse the tcpdump data and extract complete CIFS packets based on a signature of the CIFS header while ignoring any NetBIOS session information. Another issue we encountered was the inability to correlate CIFS sessions to usernames. CIFS is a session-based protocol in which a user begins a session by connecting to the file server via an authenticated login process. However, authentication in our trace environment almost always uses Kerberos [118]. Thus, regardless of the actual user, user authentication credentials are cryptographically changed with each login. As a result, we were unable to match a particular user across multiple sessions. Instead we relied on the client's IP address to correlate users to sessions. While less accurate, it provides a reasonable estimate of users since most users access the servers via the LAN with the same IP address.
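The recovery step can be sketched as follows. This is not the tool used in the study, only a simplified illustration of scanning a reassembled TCP byte stream for the SMB header signature (the byte 0xFF followed by the ASCII characters 'SMB') and slicing out candidate messages; the example data is fabricated.

```python
# Simplified illustration of recovering CIFS messages from a reassembled TCP
# byte stream by scanning for the SMB header signature rather than trusting
# NetBIOS session headers.

SMB_MAGIC = b"\xffSMB"

def extract_cifs_messages(stream: bytes):
    """Yield byte slices that each start at an SMB header signature."""
    offsets = []
    pos = stream.find(SMB_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = stream.find(SMB_MAGIC, pos + 1)
    for start, end in zip(offsets, offsets[1:] + [len(stream)]):
        yield stream[start:end]

# Example: two back-to-back messages preceded by unparseable bytes
# (e.g., the remains of a lost NetBIOS session header).
data = b"garbage" + SMB_MAGIC + b"...first message..." + SMB_MAGIC + b"...second message..."
print(len(list(extract_cifs_messages(data))))  # 2
```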

3.3 Workload Analysis

This section presents the results of our analysis of our corporate and engineering CIFS workloads. We first describe the terminology used throughout the analysis. Our analysis begins with a comparison of our workloads and then a comparison to past studies. We then analyze workload activity with a focus on I/O and sharing distributions. Finally, we examine properties of file type and user session access patterns. We italicize our key observations following the section in which they are discussed.

3.3.1 Terminology

Our study relies on several frequently used terms to describe our observations. Thus, we begin by defining the following terms:

I/O A single CIFS read or write command.

Sequential I/O An I/O that immediately follows the previous I/O to a file within an open/close pair (i.e., its offset equals the sum of the previous I/O’s offset and length). The first I/O to an open file is always considered sequential.

Random I/O An I/O that is not sequential.

Sequential Run A series of sequential I/Os. An opened file may have multiple sequential runs.

Sequentiality Metric The fraction of bytes transferred sequentially. This metric was derived from a similar metric described by Ellard, et al. [48]. A small worked example of this computation appears after these definitions.

Open Request An open request for a file that has at least one subsequent I/O and for which a close request was observed. Some CIFS metadata operations cause files to be opened without ever being read or written. These open requests are artifacts of the CIFS client implementation, rather than the workload, and are thus excluded.

Client A unique IP address. Since Kerberos authentication prevents us from correlating usernames to users, we instead rely on IP address to identify unique clients.
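The following minimal sketch applies the Sequential I/O and Sequentiality Metric definitions above to a per-open list of (offset, length) I/Os; the trace representation is an assumption for illustration.

```python
# Worked example of the sequentiality metric: an I/O is sequential if its
# offset equals the previous I/O's offset plus length (the first I/O of an
# open is always sequential); the metric is the fraction of bytes transferred
# sequentially.

def sequentiality_metric(ios):
    """ios: list of (offset, length) pairs observed within one open/close pair."""
    total = sequential = 0
    next_expected = None
    for offset, length in ios:
        total += length
        if next_expected is None or offset == next_expected:
            sequential += length
        next_expected = offset + length
    return sequential / total if total else 0.0

# Example: 3 of 4 I/Os (48 KB of 64 KB) are sequential, so the metric is 0.75.
print(sequentiality_metric([(0, 16384), (16384, 16384), (65536, 16384), (81920, 16384)]))
```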

3.3.2 Workload Comparison

Table 3.3 shows a summary comparison of overall characteristics for both corporate and engineering workloads. For each workload we provide some general statistics along with the frequency of each CIFS request. Table 3.3 shows that engineering has a greater number of requests, due to a longer tracing period, though, interestingly, both workloads have similar request percentages. For both, about 21% of requests are file I/O and about 50% are metadata operations. There are also a number of CIFS-specific requests. The I/O percentages differ from NFS workloads, in which 30–80% of all operations were I/O [48, 159]. This difference can likely be attributed to both differences in workload and protocol. The total data transferred in the two traces combined was just over 1.6 TB of data, which is less than 10% of the file servers’ active storage of over 22 TB of data. Since the data transfer summaries in Table 3.3 include files that were transferred multiple times, our observations show that somewhat more than 90% of the active storage on the file servers was untouched over the three month trace period. The read/write byte ratios have decreased significantly compared to past studies [17, 48, 137]. We found only a 2:1 ratio indicating workloads are becoming less read-centric, in contrast to past studies that found ratios of 4:1 or higher. We believe that a key reason for the decrease in the read-write ratio is that client caches absorb a significant percentage of read requests. It is also interesting that the corporate and engineering request mix are similar, per- haps because of similar work being performed on the respective clients (e. g., Windows office workloads) or because client caching and I/O scheduling obfuscate the application and end-user behavior. Observation 1: Both of the workloads are more write-heavy than workloads studied previously. Figures 3.1(a) and 3.1(b) show the distribution of total CIFS requests and I/O requests for each workload over a one week period and a nine week period, respectively. Figure 3.1(a) groups request counts into hourly intervals and Figure 3.1(b) uses daily intervals. Figure 3.1(a) shows, unsurprisingly, that both workloads have strongly diurnal cycles and that there are very evident peak and idle periods throughout a day. The cyclic idle periods show there are op-


Statistic | Corporate | Engineering
Clients | 5261 | 2654
Days | 65 | 97
Data read (GB) | 364.3 | 723.4
Data written (GB) | 177.7 | 364.4
R:W I/O ratio | 3.2 | 2.3
R:W byte ratio | 2.1 | 2.0
Total operations | 228 million | 352 million

Operation name | % | %
Session create | 0.4 | 0.3
Open | 12.0 | 11.9
Close | 4.6 | 5.8
Read | 16.2 | 15.1
Write | 5.1 | 6.5
Flush | 0.1 | 0.04
Lock | 1.2 | 0.6
Delete | 0.03 | 0.006
File stat | 36.7 | 42.5
Set attribute | 1.8 | 1.2
Directory read | 10.3 | 11.8
Rename | 0.04 | 0.02
Pipe transactions | 1.4 | 0.2

Table 3.3: Summary of trace statistics. File system operations broken down by workload. All operations map to a single CIFS command except for file stat (com- posed of query path info and query file info) and directory read (composed of find first2 and find next2). Pipe transactions map to remote IPC operations.

portunities for background processes, such as log-cleaning and disk scrubbing, to run without interfering with user requests. Interestingly, there is a significant amount of variance between individual days in the number and ratio of both requests and I/O. On days when the total number of requests increases, the number of read and write requests does not necessarily increase. This is also the case between weeks in Figure 3.1(b). The variation between total requests and I/O requests implies that any single day or week is likely an insufficient profile of the overall workload, so it is probably inaccurate to extrapolate trace observations from short time periods to longer periods, as was also noted in past studies [48, 71, 174, 177]. It is interesting to note that the overall request mix presented in Table 3.3 is different from the mix present in any single day or week, suggesting that the overall request mix might be different if a different time period were traced and is influenced more by workload than by the behavior of the file system client.

3.3.3 Comparison to Past Studies

We now compare our CIFS network file system workloads with those of past studies, including the NFS [48], Sprite [17], VxFS [137], and Windows NT [177] studies. In particular, we analyze how file access patterns and file lifetimes have changed. For comparison purposes, we use tables and figures consistent with those of past studies.

3.3.3.1 File Access Patterns

Table 3.4 provides a summary comparison of file access patterns, showing access patterns in terms of both I/O requests and bytes transferred. Access patterns are categorized by whether a file was accessed read-only, write-only, or read-write. Sequential access is divided into two categories, entire accesses, which transfer the entire file, and partial accesses, which do not. Table 3.4 shows a remarkable increase in the percentage of read-write I/O and bytes transferred. Most previous studies observed less than 7% of total bytes transferred to files ac- cessed read-write. However, both our corporate and engineering workloads have over 20% of bytes transferred in read-write accesses. Furthermore, 45.9% of all corporate I/Os and 32.1%



(a) Requests over a week.


(b) Requests over 9 weeks.

Figure 3.1: Request distribution over time. The frequency of all requests, read requests, and write requests are plotted over time. Figure 3.1(a) shows how the request distribution changes for a single week in October 2007. Here request totals are grouped in one hour intervals. The peak one hour request total for corporate is 1.7 million and 2.1 million for engineering. Figure 3.1(b) shows the request distribution for a nine week period between September and November 2007. Here request totals are grouped into one day intervals. The peak one day intervals are 9.4 million for corporate and 19.1 million for engineering.


Access Pattern | Corporate I/Os | Corporate Bytes | Engineering I/Os | Engineering Bytes | CAMPUS Bytes | EECS Bytes | Sprite Bytes | Ins Bytes | Res Bytes | NT Bytes
Read-Only (% total) | 39.0 | 52.1 | 50.6 | 55.3 | 53.1 | 16.6 | 83.5 | 98.7 | 91.0 | 59.0
Entire file sequential | 13.5 | 10.5 | 35.2 | 27.4 | 47.7 | 53.9 | 72.5 | 86.3 | 53.0 | 68.0
Partial sequential | 58.4 | 69.2 | 45.0 | 55.0 | 29.3 | 36.8 | 25.4 | 5.9 | 23.2 | 20.0
Random | 28.1 | 20.3 | 19.8 | 17.6 | 23.0 | 9.3 | 2.1 | 7.8 | 23.8 | 12.0
Write-Only (% total) | 15.1 | 25.2 | 17.3 | 23.6 | 43.8 | 82.3 | 15.4 | 1.1 | 2.9 | 26.0
Entire file sequential | 21.2 | 36.2 | 15.6 | 35.2 | 37.2 | 19.6 | 67.0 | 84.7 | 81.0 | 78.0
Partial sequential | 57.6 | 55.1 | 63.4 | 61.0 | 52.3 | 76.2 | 28.9 | 9.3 | 16.5 | 7.0
Random | 21.2 | 8.7 | 21.0 | 3.8 | 10.5 | 4.1 | 4.0 | 6.0 | 2.5 | 15.0
Read-Write (% total) | 45.9 | 22.7 | 32.1 | 21.1 | 3.1 | 1.1 | 1.1 | 0.2 | 6.1 | 15.0
Entire file sequential | 7.4 | 0.1 | 0.4 | 0.1 | 1.4 | 4.4 | 0.1 | 0.1 | 0.0 | 22.0
Partial sequential | 48.1 | 78.3 | 27.5 | 50.0 | 0.9 | 1.8 | 0.0 | 0.2 | 0.3 | 3.0
Random | 44.5 | 21.6 | 72.1 | 49.9 | 97.8 | 93.9 | 99.9 | 99.6 | 99.7 | 74.0

Table 3.4: Comparison of file access patterns. File access patterns for the corporate and en- gineering workloads are compared with those of previous studies. CAMPUS and EECS [48] are university NFS mail server and home directory workloads, respectively. Both were mea- sured in 2001. Sprite [17], Ins and Res [137] are university computer lab workloads. Sprite was measured in 1991 and Ins and Res were measured between 1997 and 2000. NT [177] is a combination of development and scientific workloads measured in 1998.

of all engineering I/Os are in read-write accesses. This shows a departure from the read-only oriented access patterns of past workloads. When looking closer at read-write access patterns we find that sequentiality has also changed; 78.3% and 50.0% of bytes are transferred sequentially, as compared to roughly 1% of bytes in past studies. However, read-write patterns are still very random relative to read-only and write-only patterns. These changes may suggest that network file systems store a higher fraction of mutable data, such as actively changing documents, which make use of the centralized and shared environment, and a smaller fraction of system files, which tend to have more sequential read accesses. These changes may also suggest that the sequential read-oriented patterns for which some file systems are designed [105] are less prevalent in network file systems, and that write-optimized file systems [77, 138] may be better suited. Observation 2: Read-write access patterns are much more frequent compared to past studies. Another interesting observation is that far fewer bytes are transferred from files accessed in their entirety, that is, files that are read or written from beginning to end. In past studies, well over 50% of read-only and write-only bytes came from files accessed in their entirety. However, we found this to be the case for only 10.5% and 27.4% of read-only bytes in corporate and engineering, and only 36.2% and 35.2% of write-only bytes.

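For reference, the categorization used in Table 3.4 can be sketched as a small classifier over per-open statistics. The sequentiality cut-offs below are assumptions chosen for illustration, not the exact rules used in this study.

```python
# Sketch of the access-pattern categories of Table 3.4: each open/close pair is
# labeled read-only, write-only, or read-write, and then as an entire sequential,
# partial sequential, or random access. Thresholds are illustrative.

def classify_access(bytes_read, bytes_written, file_size, seq_metric):
    if bytes_read and bytes_written:
        mode = "read-write"
    elif bytes_read:
        mode = "read-only"
    else:
        mode = "write-only"
    transferred = bytes_read + bytes_written
    if seq_metric >= 0.99 and transferred >= file_size:
        pattern = "entire file sequential"
    elif seq_metric >= 0.5:          # assumed cut-off between sequential and random
        pattern = "partial sequential"
    else:
        pattern = "random"
    return mode, pattern

print(classify_access(bytes_read=4096, bytes_written=0, file_size=4096, seq_metric=1.0))
```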
3.3.3.2 Sequentiality Analysis

Next, we compared the sequential access patterns found in our workloads with past studies. A sequential run is defined as a series of sequential I/Os to a file. Figure 3.2(a) shows the distribution of sequential run lengths. We see that sequential runs are short for both work- loads, with almost all runs shorter than 100 KB, which is consistent with past studies. This observation suggests that file systems should continue to optimize for short sequential common- case accesses. However, Figure 3.2(b), which shows the distribution of bytes transferred during sequential runs, has a very different implication, indicating that many bytes are transferred in long sequential runs: between 50–80% of bytes are transferred in runs of 10 MB or less. In ad- dition, the distribution of sequential runs for the engineering workload is long-tailed, with 8% of bytes transferred in runs longer than 400 MB. Interestingly, read-write sequential runs exhibit


(a) Length of sequential runs.


(b) Bytes transferred in sequential runs.

very different characteristics from read-only and write-only runs: most read-write bytes are transferred in much smaller runs. This implies that the interactive nature of read-write accesses is less prone to very large transfers, which tend to be mostly read-only or write-only. Overall, we found that most bytes are transferred in much larger runs, up to 1000 times longer, when compared to those observed in past studies, though most runs are short. Our results suggest file systems must continue to optimize for small sequential access, though they must be prepared to handle a small number of very large sequential accesses. This also correlates with the heavy-tailed distribution of file sizes, which is discussed later; for every large sequential run there must be at least one large file. Observation 3: Bytes are transferred in much longer sequential runs than in previous studies.

Figure 3.2: Sequential run properties. Sequential access patterns are analyzed for various sequential run lengths. The X-axes are given in a log-scale. Figure 3.2(a) shows the length of sequential runs, while Figure 3.2(b) shows how many bytes are transferred in sequential runs.



(a) Corporate.


(b) Engineering.

Figure 3.3: Sequentiality of data transfer. The frequency of sequentiality metrics is plotted against different data transfer sizes. Darker regions indicate a higher fraction of total transfers. Lighter regions indicate a lower fraction. Transfer types are broken into read-only, write-only, and read-write transfers. Sequentiality metrics are grouped by tenths for clarity.

57 We now examine the relationship between request sizes and sequentiality. In Fig- ure 3.3(a) and Figure 3.3(b) we plot the number of bytes transferred from a file against the sequentiality of the transfer. This is measured using the sequentiality metric: The fraction of bytes transferred sequentially, with values closer to one meaning higher sequentiality. Fig- ures 3.3(a) and 3.3(b) show this information in a heat map in which darker regions indicate a higher fraction of transfers with that sequentiality metric and lighter regions indicate a lower fraction. Each region within the heat map represents a 10% range of the sequentiality met- ric. We see from Figures 3.3(a) and 3.3(b) that small transfers and large transfers are more sequential for read-only and write-only access, which is the case for both workloads. How- ever, medium-sized transfers, between 64 KB and 4 MB, are more random. For large and small transfers, file systems may be able to anticipate high sequentiality for read-only and write-only access. Read-write accesses, on the other hand, are much more random for most transfer sizes. Even very large read-write transfers are not always very sequential, which follows from our pre- vious observations in Figure 3.2(b), suggesting that file systems may have difficulty anticipating the sequentiality of read-write accesses. Next, we analyze the relationship between file size and access pattern by examining the size of files at open time to determine the most frequently opened file sizes and the file sizes from which most bytes are transferred. It should be noted that since we only look at opened files, it is possible that this does not correlate to the file size distribution across the file system. Our results are shown in Figures 3.4(a) and 3.4(b). In Figure 3.4(a) we see that 57.5% of opens in the corporate workload are to newly-created files or truncated files with zero size. However, this is not the case in the engineering workload, where only 6.3% of opens are to zero-size files. Interestingly, both workloads find that most opened files are small; 75% of opened files are smaller than 20 KB. However, Figure 3.4(a) shows that most bytes are transferred from much larger files. In both workloads we see that only about 60% of bytes are transferred from files smaller than 10 MB. The engineering distribution is also long-tailed with 12% of bytes being transferred from files larger than 5 GB. By comparison, almost all of the bytes transferred in previous studies came from files smaller than 10 MB. These observations suggest that larger files play a more significant role in network file system workloads than in those previously

58 1 0.8 0.6 0.4 0.2 0


(a) Open requests by file size.


(b) Bytes transferred by file size.

Figure 3.4: File size access patterns. The distribution of open requests and bytes transferred are analyzed according to file size at open. The X-axes are shown on a log-scale. Figure 3.4(a) shows the size of files most frequently opened. Figure 3.4(b) shows the size of files from which most bytes are transferred.

59 1 0.8 0.6 0.4 0.2 0


Figure 3.5: File open durations. The duration of file opens is analyzed. The X-axis is presented on a log-scale. Most files are opened very briefly, although engineering files are opened slightly longer than corporate files.

studied. This may be due to frequent small file requests hitting the local client cache. Thus, file systems should still optimize small file layout for frequent access and large file layout for large transfers. Observation 4: Bytes are transferred from much larger files than in previous studies. Figure 3.5 shows the distribution of file open durations. We find that files are opened for shorter durations in the corporate workload than in the engineering workload. In the cor- porate workload, 71.1% of opens are shorter than 100 ms, but just 37.1% are similarly short in the engineering workload. However, for both workloads most open durations are less than 10 seconds, which is similar to observations in past studies. This is also consistent with our pre- vious observations that small files, which likely have short open durations, are most frequently accessed.

3.3.3.3 File Lifetime

This section examines how file lifetimes have changed compared to past studies. In CIFS, files can be either deleted through an explicit delete request, which frees the entire file and its name, or through truncation, which only frees the data. Figure 3.6 shows the distribution of file lifetimes, broken down by deletion method. We find that most created files live longer than 24 hours, with 57.0% and 64.9% of corporate and engineering files persisting for more than a day. Both distributions are long-tailed, meaning many files live well beyond 24 hours. However, files that are deleted usually live less than a day: only 18.7% and 6.9% of eventually deleted



Figure 3.6: File lifetimes. The distributions of lifetimes for all created and deleted files are shown. Time is shown on the X-axis on a log-scale. Files may be deleted through explicit delete request or truncation.

files live more than 24 hours. Nonetheless, compared to past studies in which almost all deleted files live less than a minute, deleted files in our workloads tend to live much longer. This may be due to fewer temporary files being created over the network. However, we still find that some files live very short lifetimes. In each workload, 56.4% and 38.6% of deleted files are deleted within 100 ms of creation, indicating that file systems should expect fewer files to be deleted and files that live beyond a few hundred milliseconds to have long lifetimes. Observation 5: Files live an order of magnitude longer than in previous studies.

3.3.4 File I/O Properties

We now take a closer look at the properties of file I/O where, as defined in Sec- tion 3.3.1, an I/O request is defined as any single read or write operation. We begin by looking at per-file, per-session I/O inter-arrival times, which include network round-trip latency. Intervals are categorized by the type of requests (read or write) that bracket the interval; the distribution of interval lengths is shown in Figure 3.7(a). We find that most inter-arrival times are between


(a) Per-file I/O inter-arrival times.


(b) I/O request sizes.

Figure 3.7: File I/O properties. The burstiness and size properties of I/O requests are shown. Figure 3.7(a) shows the I/O inter-arrival times. The X-axis is presented on a log-scale. Fig- ure 3.7(b) shows the sizes of read and write I/O. The X-axis is divided into 8 KB increments.

62 100 µs and 100 ms. In fact, 96.4% and 97.7% of all I/Os have arrival times longer than 100µs and 91.6% and 92.4% are less than 100 ms for corporate and engineering, respectively. This tight window means that file systems may be able to make informed decisions about when to prefetch or flush cache data. Interestingly, there is little distinction between read-read or read- write and write-read or write-write inter-arrival times. Also, 67.5% and 69.9% of I/O requests have an inter-arrival time of less than 3 ms, which is shorter than some measured disk response times [134]. These observations may indicate cache hits at the server or possibly asynchronous I/O. It is also interesting that both workloads have similar inter-arrival time distributions even though the hardware they use is of different classes, a mid-range model versus a high-end model. Next, we examine the distribution of bytes transferred by a single I/O request. As Fig- ure 3.7(b) shows, most requests transfer less than 8 KB, despite a 64 KB maximum request size in CIFS. This distribution may vary between CIFS and NFS since each buffers and schedules I/O differently. The distribution in Figure 3.7(b) increases for only a few I/O sizes, indicat- ing that clients generally use a few specific request sizes. This I/O size information can be combined with the I/O inter-arrival times from Figure 3.7(a) to calculate a distribution of I/Os per-second (IOPS) that may help file systems determine how much buffer space is required to support various I/O rates.

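A rough sketch of the buffer-sizing calculation suggested above, using made-up observations, is shown below; it simply turns matched inter-arrival times and I/O sizes into request-rate and throughput estimates.

```python
# Rough sketch: estimate I/Os per second and bytes per second from observed
# inter-arrival times and I/O sizes. The numbers are invented for illustration.

def iops_and_bandwidth(inter_arrival_s, io_sizes_bytes):
    """Estimate request rate and throughput from matched observations."""
    mean_gap = sum(inter_arrival_s) / len(inter_arrival_s)
    mean_size = sum(io_sizes_bytes) / len(io_sizes_bytes)
    iops = 1.0 / mean_gap
    return iops, iops * mean_size

iops, bps = iops_and_bandwidth([0.003, 0.010, 0.002, 0.005], [8192, 8192, 4096, 8192])
print(f"{iops:.0f} IOPS, {bps / 1024:.0f} KB/s")  # 200 IOPS, 1400 KB/s
```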
3.3.5 File Re-Opens

In this section, we explore how frequently files are re-opened, i. e., opened more than once during the trace period. Figure 3.8(a), shows the distribution of the number of times a file is opened. For both workloads, we find that the majority of files, 65%, are opened only once during the entire trace period. The infrequent re-access of many files suggests there are opportunities for files to be archived or moved to lower-tier storage. Further, we find that about 94% of files are accessed fewer than five times. However, both of these distributions are long- tailed—some files are opened well over 100,000 times. These frequently re-opened files account for about 7% of total opens in both workloads. Observation 6: Most files are not re-opened once they are closed.

63 1 0.8 0.6 0.4 0.2


(a) File open frequencies.


(b) Re-open inter-arrival times.

Figure 3.8: File open properties. The frequency of and duration between file re-opens is shown. Figure 3.8(a) shows how often files are opened more than once. Figure 3.8(b) shows the time between re-opens and time intervals on the X-axis are given in a log-scale.

We now look at inter-arrival times between re-opens of a file. Re-open inter-arrival time is defined as the duration between the last close of a file and the time it is re-opened. A re-open is considered concurrent if a re-open occurs while the file is still open (i. e.,, it has not yet been closed). The distribution of re-open inter-arrival times is shown in Figure 3.8(b). We see that few re-opens are concurrent, with only 4.7% of corporate re-opens and 0.7% of engineering re-opens occurring on a currently-open file. However, re-opens are temporally related to the previous close; 71.1% and 58.8% of re-opens occur less than one minute after the file is closed. Using this information, file systems may be able to decide when a file should be removed from the buffer cache or when it should be scheduled for migration to another storage tier. Observation 7: If a file is re-opened, it is temporally related to the previous close.

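The bookkeeping behind these measurements can be sketched as follows, assuming a chronological list of (time, operation, path) events; the event format is an assumption for illustration.

```python
# Sketch of re-open tracking: record the gap between a close and the next open
# of the same file, and flag re-opens that arrive while the file is still open
# as concurrent.

def reopen_gaps(events):
    """events: chronological list of (time, op, path) with op in {'open', 'close'}."""
    open_count, last_close, gaps = {}, {}, []
    for time, op, path in events:
        if op == "open":
            if open_count.get(path, 0) > 0:
                gaps.append((path, 0.0, True))                       # concurrent re-open
            elif path in last_close:
                gaps.append((path, time - last_close[path], False))  # gap since last close
            open_count[path] = open_count.get(path, 0) + 1
        else:
            open_count[path] -= 1
            last_close[path] = time
    return gaps

events = [(0, "open", "a"), (5, "close", "a"), (20, "open", "a"),
          (22, "open", "a"), (30, "close", "a"), (31, "close", "a")]
print(reopen_gaps(events))  # [('a', 15, False), ('a', 0.0, True)]
```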
64 1 0.8 0.6 0.4 0.2


Figure 3.9: Client activity distribution. The fraction of clients responsible for certain activities is plotted.

3.3.6 Client Request Distribution

We next examine the distribution of file open and data requests amongst clients; recall from Section 3.3.1 that “client” refers to a unique IP address rather than an individual user. We use Lorenz curves [100]—cumulative distribution functions of probability distributions— rather than random variables to show the distribution of requests across clients. Our results, shown in Figure 3.9, find that a tiny fraction of clients are responsible for a significant fraction of open requests and bytes transferred. In corporate and engineering, 0.02% and 0.04% of clients make 11.9% and 22.5% of open requests and account for 10.8% and 24.6% of bytes transferred, respectively. Interestingly, 0.02% of corporate clients and 0.04% of engineering clients correspond to approximately 1 client for each workload. Additionally, we find that about 35 corporate clients and 5 engineering clients account for close to 50% of the opens in each workload. This suggests that the distribution of activity is highly skewed and that file systems may be able to take advantage of this information by doing informed allocation of resources or quality of service planning. Observation 8: A small fraction of clients account for a large fraction of file activity.
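The Lorenz-curve construction behind Figure 3.9 can be sketched in a few lines; the per-client activity counts below are invented for illustration.

```python
# Sketch of a Lorenz-style curve: clients are sorted by activity (most active
# first, matching the skew discussed above) and the cumulative share of
# activity is reported for each cumulative share of clients.

def lorenz_curve(activity_per_client):
    counts = sorted(activity_per_client, reverse=True)
    total = float(sum(counts))
    points, running = [], 0.0
    for i, c in enumerate(counts, start=1):
        running += c
        points.append((i / len(counts), running / total))  # (frac of clients, frac of activity)
    return points

# Toy example: 1 of 5 clients issues 80% of the opens.
print(lorenz_curve([800, 50, 50, 50, 50]))
```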

3.3.7 File Sharing

This section looks at the properties of file sharing in our workloads. A file is shared when two or more clients open the same file some time during the trace period; the sharing need not

65 0.9


(a) File sharing frequencies.


(b) Sharing inter-arrival times.

Figure 3.10: File sharing properties. We analyze the frequency and temporal properties of file sharing. Figure 3.10(a) shows the distribution of files opened by multiple clients. Figure 3.10(b) shows the duration between shared opens. The durations on the X-axis are in a log-scale.

be concurrent. Since we can only distinguish IP addresses and not actual users, it is possible that two IP addresses may represent a single (human) user and vice-versa. However, the drastic skew of our results indicates this likely has little impact on our observations. Also, we only consider opened files; files which have only had their metadata accessed by multiple clients are not included in the these results. Figure 3.10(a) shows the distribution of the frequency with which files are opened by multiple clients. We find that most files are only opened by a single client. In fact, 76.1% and 97.1% of files are only opened by one client in corporate and engineering, respectively. Also, 92.7% and 99.7% of files are ever opened by two or fewer clients. This suggests that the shared environment offered by network file systems is not often taken advantage of. Other methods of sharing files, such as email, web and Wiki pages, and content repositories (e. g., svn and ),



Figure 3.11: File sharing access patterns. The fraction of read-only, write-only, and read-write accesses are shown for differing numbers of sharing clients. Gaps are seen where no files were shared with that number of clients. may reduce the need for clients to share files via the file system. However, both distributions are long-tailed, and a few files are opened by many clients. In the corporate workload, four files are opened by over 2,000 clients and in the engineering workload, one file is opened by over 1,500 clients. This shows that, while not common, sharing files through the file system can be heavily used on occasion. Observation 9: Files are infrequently accessed by more than one client. In Figure 3.10(b) we examine inter-arrival times between different clients opening a file. We find that concurrent (simultaneous) file sharing is rare. Only 11.4% and 0.2% of shared opens from different clients were concurrent in corporate and engineering, respectively. When combined with the observation that most files are only opened by a single client, this suggests that synchronization for shared file access is not often required, indicating that file systems may benefit from looser locking semantics. However, when examining the duration between shared opens we find that sharing does have a temporal relationship in the corporate workload; 55.2% of shared opens occur within one minute of each other. However, this is not true for engineering, where only 4.9% of shared opens occur within one minute.


Figure 3.12: User file sharing equality. The equality of sharing is shown for differing numbers of sharing clients. The Gini coefficient, which measures the level of equality, is near 0 when sharing clients have about the same number of opens to a file. It is near 1 when clients unevenly share opens to a file.

We now look at the manner (read-only, write-only, or read-write) with which shared files are accessed. Figure 3.11 shows the usage patterns for files opened by multiple clients. Gaps are present where no files were opened by that number of clients. We see that shared files are accessed read-only the majority of the time. These may be instances of reference documents or web pages that are rarely re-written. The number of read-only accesses slightly decreases as more clients access a file and a read-write pattern begins to emerge. This suggests that files accessed by many clients are more mutable. These may be business documents, source code, or web pages. Since synchronization is often only required for multiple concurrent writers, these results further argue for loose file system synchronization mechanisms. Observation 10: File sharing is rarely concurrent and mostly read-only. Finally, we analyze which clients account for the most opens to shared files. Equality measures how open requests are distributed amongst clients sharing a file. Equal file sharing implies all sharing clients open the shared file an equal number of times. To analyze equality, we use the Gini coefficient [64], which measures statistical dispersion, such as the inequality

68 of income in economic analysis. We apply the equality concept to how frequently a shared file is opened by a client. Lower coefficients mean sharing clients open the file more equally (the same number of times), and higher coefficients mean a few clients account for the majority of opens. Figure 3.12 shows Gini coefficients for various numbers of shared clients. We see that as more clients open a file, the level of equality decreases, meaning that fewer clients begin to dominate the number of open requests. Gini coefficients are lower, less than 0.4, for files opened by fewer than 20 clients, meaning that when a few clients access a file, they each open the file an almost equal number of times. As more clients access the file, a small number of clients begin to account for most of the opens. This may indicate that as more clients share a file, it becomes less reasonable for all sharing clients to access the file evenly, and a few dominant clients begin to emerge.

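A sketch of the Gini computation applied to per-client open counts for one shared file is shown below; it uses the standard rank-based formulation and invented inputs.

```python
# Sketch of the Gini coefficient used in Figure 3.12, applied to the number of
# opens each sharing client issued for one file. A value near 0 means clients
# open the file about equally; near 1 means a few clients dominate.

def gini(opens_per_client):
    xs = sorted(opens_per_client)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-based formulation over the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

print(gini([5, 5, 5, 5]))    # 0.0  -> perfectly equal sharing
print(gini([1, 1, 1, 97]))   # ~0.72 -> one client dominates the opens
```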
3.3.8 File Type and User Session Patterns

There have been a number of attempts to make layout, caching, and prefetching decisions based on how specific file types are accessed and the access patterns of certain users [50, 107]. In this section, we take a closer look at how certain file types are accessed and the access patterns that occur between when a user begins a CIFS “user session” by log- ging on and when they log-off. Our emphasis is on whether file types or users have common access patterns that can be exploited by the file system. We begin by breaking down file type frequencies for both workloads. Figures 3.13(a) and 3.13(b) show the most frequently opened and most frequently read and written file types. For frequently read and written file types, we show the fraction of bytes read for that type. Files with no discernible file extension are labeled as “unknown”. We find that the corporate workload has no file type, other than unknown types, that dominates open requests. However, 37.4% of all opens in the engineering workload are for C header files. Both workloads have a single file type that consumes close to 20% of all read and write I/O. Not surprisingly, these types correspond to generally large files, e. g., mdb (Microsoft Access Database) files and vmdk (VMWare Virtual Disk) files. However, we find that most file types do not consume a significantly large fraction of open or I/O requests. This shows that file



(a) Corporate.


(b) Engineering.

Figure 3.13: File type popularity. The histograms on the right show which file types are opened most frequently. Those on the left show the file types most frequently read or written and the percentage of accessed bytes read for those types.


Figure 3.14: File type access patterns. The frequency of each access pattern is plotted for various file types in the corporate and engineering workloads. Access patterns are categorized into 18 groups. Increasingly dark regions indicate higher fractions of accesses with that pattern.

This shows that file systems likely can be optimized for the small subset of frequently accessed file types. Interestingly, there appears to be little correlation between how frequently a file is opened and how frequently it is read or written. Only three corporate and two engineering file types appear as both frequently opened and frequently read or written; the mdb and vmdk types only constitute 0.5% and 0.08% of opens. Also, it appears that file types which are frequently read or written are mostly read.

We now analyze the hypothesis that file systems can use file type and user access patterns to improve layout and prefetching [48, 50, 107]. We do so by examining access signatures: vectors containing the number of bytes read, the number of bytes written, and the sequentiality metric of a file access. We start by defining an access signature for each open/close pair for each file type above; we then apply K-means clustering [103] to the access signatures of each file type. K-means groups access signatures with similar patterns into unique clusters with varying densities. Our results are shown in Figure 3.14. For clarity, we have categorized access signatures by the access type: read-only, write-only, or read-write. We further group signatures by their sequentiality metric ranges: 1–0.81 is considered highly sequential, 0.8–0.2 is considered mixed sequentiality, and 0.19–0 is considered highly random. Finally, access signatures are

categorized by the number of bytes transferred; access signatures are considered small if they transfer no more than 100 KB and large otherwise. Darker regions indicate that the file type has a higher fraction of access signatures with the properties shown on the y-axis, and lighter regions indicate fewer signatures with those characteristics. Figure 3.14 shows that most file types have several distinct kinds of access patterns, rather than one as previously presumed. Also, each type has multiple patterns that are more frequent than others, suggesting that file systems may not be able to properly predict file type access patterns using only a single pattern. Interestingly, small sequential read patterns occur frequently across most of the file types, implying that file systems should be optimized for this pattern, as is often already done.

Observation 11: Most file types do not have a single pattern of access.

Surprisingly, file types such as vmdk that consume a large fraction of total I/Os are frequently accessed with small sequential reads. In fact, 91% of all vmdk accesses are this pattern, contradicting the intuition derived from Figure 3.13(b) that vmdk files have large accesses. However, a much smaller fraction of vmdk accesses transfer huge numbers of bytes in highly random read-write patterns. Several patterns read and write over 10 GB of data with a sequentiality metric less than 0.5, showing that frequent patterns may not be representative of the significant patterns in terms of bytes transferred or sequentiality. This argues that file systems should anticipate several patterns of access for any file type if layout or prefetching benefits are to be gained. Also, it is critical that they identify transitions between patterns. For example, a file system may, by default, prefetch data for vmdk files in small chunks: 100 KB or less. However, when over 100 KB of a vmdk file is accessed, this signals the likely start of a very large transfer. In this case, the file system must properly adjust its prefetching.

Our observation that many file types exhibit several access patterns of varying frequency and significance draws an interesting comparison to the results in Table 3.4. Table 3.4 shows significant read-write I/O and byte transfer activity. However, file types in Figure 3.14 rarely have read-write patterns. This implies that read-write file accesses are in general uncommon; however, when they do occur, a large number of bytes are accessed.
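As a rough illustration of the clustering step, the sketch below groups per-open access signatures for one file type into clusters. The signatures are hypothetical, scikit-learn's KMeans stands in for whatever K-means implementation the study used, and the feature scaling is likewise an illustrative choice.

```python
# Cluster per-open access signatures for one file type.
# Sketch only: signatures are hypothetical and scikit-learn's KMeans is a
# stand-in for the clustering implementation used in the study.
import numpy as np
from sklearn.cluster import KMeans

# Each signature: (bytes read, bytes written, sequentiality metric in [0, 1]).
signatures = np.array([
    [80e3,  0.0, 0.95],   # small sequential read
    [60e3,  0.0, 0.90],   # small sequential read
    [12e9, 9e9,  0.40],   # huge mixed read-write
    [50e3,  0.0, 0.97],   # small sequential read
    [10e9, 8e9,  0.35],   # huge mixed read-write
])

# Scale features so byte counts do not swamp the sequentiality metric.
scaled = (signatures - signatures.mean(axis=0)) / (signatures.std(axis=0) + 1e-9)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
for sig, label in zip(signatures, kmeans.labels_):
    print(label, sig)  # two clusters: frequent small reads vs. rare huge transfers
```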

Next, we apply the same K-means clustering approach to access signatures of access patterns that occur within a CIFS user session. Recall that CIFS users begin a connection to the file server by creating an authenticated user session and end by eventually logging off. We define signatures for all accesses performed while the user is logged on. However, we only consider sessions in which bytes are transferred. The CIFS client opens short, temporary sessions for various auxiliary functions, which we exclude from this study as they do not represent a normal user log-in. Like file types, user sessions have several common patterns and no single pattern can summarize all of a user’s accesses. The majority of user sessions have read-write patterns with less than 30 MB read and 10 MB written with a sequentiality metric close to 0.5, while a few patterns have much more significant data transfers that read and write gigabytes of data.

3.4 Design Implications

We now explore some of the possible implications of our workload analysis on network file system designs. We found that read-write access patterns have significantly increased relative to previous studies (see Section 3.3.3.1). Though we observed higher sequentiality in read-write patterns than past studies, they are still highly random compared to read-only or write-only access patterns (see Section 3.3.3.1). In contrast, a number of past studies found that most I/Os and bytes are transferred in read-only sequential access patterns [17, 123, 177], which has impacted the designs of several file systems [105, 116]. The observed shift towards read-write access patterns suggests file systems should look towards improving random access performance, perhaps through alternative media, such as flash. In addition, we observed that the ratio of data read to data written is decreasing compared to past studies [17, 48, 137] (see Section 3.3.2). This decrease is likely due to increasing effectiveness of client caches and fewer read-heavy system files on network storage. When coupled with increasing read-write access patterns, write-optimized file systems, such as LFS [138] and WAFL [77], or NVRAM write caching appear to be good designs for network file systems. We observed that files are infrequently re-opened (see Section 3.3.5) and are usually accessed by only one client (see Section 3.3.7). This suggests that caching strategies which

exploit this, such as exclusive caching [186], may have practical benefits. Also, the limited reuse of files indicates that increasing the size of server data caches may add only marginal benefits. Rather, file servers may find larger metadata caches more valuable because metadata requests made up roughly 50% of all operations in both workloads, as Section 3.3.2 details. The finding that most created files are not deleted (see Section 3.3.3.3) and few files are accessed more than once (see Section 3.3.5) suggests that many files may be good candidates for migration to lower-tier storage or archives. This is further motivated by our observation that only 1.6 TB were transferred from 22 TB of in-use storage over three months. While access to file metadata should be fast, this indicates much file data can be compressed, de-duplicated, or placed on low power storage, improving utilization and power consumption, without significantly impacting performance. In addition, our observation that file re-accesses are temporally correlated (see Section 3.3.5) means there are opportunities for intelligent migration scheduling decisions.

3.5 Previous Snapshot Studies

Like workload trace studies, snapshot studies have greatly influenced file system design [2, 57, 144]. Unlike workload studies, snapshot studies analyze the properties of the files at rest that make up the contents of the file system. In most cases, file metadata attributes (e. g., inode fields and extended attributes) are studied. These studies are called snapshots because they represent a snapshot of the contents of the file system at a given point in time. Table 3.5 summarizes the previous snapshot studies. Early file system snapshot studies focused on two key areas: file size and file lifetime. These two aspects were particularly important in early file system design because block allocation and space management algorithms were still in nascent stages. Two early studies on a local DEC machine [145] and NFS file servers [22] found that most files were less than several KB. For example, Satyanarayanan [145] found that 50% of files had fewer than 5 blocks (about 20 KB with a 4 KB block size), 95% had fewer than 100 blocks (about 400 KB), and 99% had fewer than 1000 blocks (about 4 MB). These distributions show a high skew towards small files.

Study | Date of Traces | FS/Protocol | Network FS | Environment
Satyanarayanan [145] | 1981 | DEC PDP-10 | | Academic lab
Bennett, et al. [22] | 1991 | NFS | x | Academic lab
Sienknecht, et al. [150] | 1991-92 | BSD | | Corporate
Smith and Seltzer [153] | 1994 | FFS | x | Academic lab
Douceur and Bolosky [43] | 1998 | FAT, FAT32, NTFS | | Engineering/PC
Agrawal, et al. [5] | 2000-2004 | FAT, FAT32, NTFS | | Engineering/PC
Dayal [39] | 2006 | Various | x | HPC
Leung, et al. | 2007 | WAFL | x | Web, Engineering, Home

Table 3.5: Summary of major file system snapshot studies over the past two decades. For each study, the date of trace collection, the file system or protocol studied, whether it involved network file systems, and the kinds of workloads it hosted are shown.

Additionally, functional lifetime, the time between a file’s last modification time and its last access time, is generally short. Satyanarayanan found that 32% of files had a functional lifetime of less than a day. However, Bennett, et al. [22] found that functional lifetimes were longer on their NFS file servers. A study of BSD file systems in a corporate setting [150] had similar findings, which supported these observations. Another study [153] looked at snapshots to find how effectively FFS [105] allocates and organizes data on disk. Interestingly, they found that small files tend to be more fragmented than larger files: fewer than 35% of files with 2 blocks were optimally allocated, though 85% of blocks in files larger than 64 blocks were allocated optimally. More recently, two studies [5, 43] have examined five and ten thousand Windows personal computer file systems, respectively. These studies were significantly larger than previous studies and were the first to look at many snapshots from very similar environments, Windows desktop machines at Microsoft; 85% of these systems used NTFS and the others used FAT and FAT32. These studies have confirmed that many observations from older studies still hold true. For example, most files were still very small, less than 189 KB. However, they also observed changing trends since the previous studies, such as the mean file size increasing from 108 KB to 189 KB. This finding corresponds with their observation that median file system capacities had increased from 5 GB to 40 GB.

Data Set | Description | # of Files | Capacity
Web | web & Wiki server | 15 million | 1.28 TB
Eng | build space | 60 million | 30 GB
Home | home directories | 300 million | 76.78 TB

Table 3.6: Metadata traces collected. The small server capacity of the Eng trace is due to the majority of the files being small source code files: 99% of files are less than 1 KB.

Attribute | Description
inumber | inode number
owner | file owner
path | full path name
size | file size
ext | file extension
ctime | change time
type | file or directory
atime | access time
mtime | modification time
hlink | # of hard links

Table 3.7: Attributes used. We analyzed the fields in the inode structure and extracted ext values from path.

They also noted that directory size distributions had changed very little. They found that 90% of directories still had two or fewer sub-directories and 20 or fewer entries.

3.6 Snapshot Tracing Methodology

Our file metadata snapshot traces were collected from three large-scale, enterprise-class file servers in the NetApp corporate headquarters. One hosts web and Wiki server files, another is an engineering build server, and another stores employee home directories. The sizes of these traces, which we refer to as Web, Eng, and Home, respectively, are described in Table 3.6. They represent over a quarter of a billion files and over 80 TB of actively used storage. The traces were collected using a program we wrote that performs a parallelized crawl of the namespace and collects metadata using the stat() system call. The crawls were performed

during the summer of 2007. The attributes that we collected are shown in Table 3.7. NetApp servers support extended attributes, though they were rarely used in these traces and were thus ignored.

3.7 Snapshot Analysis

Our analysis revealed two key properties that we focus on: metadata attribute values exhibit spatial locality, and their frequency distributions are highly skewed.

3.7.1 Spatial Locality

Spatial locality means that metadata attribute values are clustered in the namespace (i. e., occurring in relatively few directories). For example, Andrew’s files reside mostly in directories in the /home/andrew sub-tree, not scattered evenly across the namespace. Thus, files with owner equal to andrew likely occur in only a small fraction of the total directories in the file system. Spatial locality comes from the way that users and applications organize files in the namespace, and has been noted in other file system studies [5, 43]. Users and applications group files into locations in the namespace that correspond to their semantic meaning (e. g., a common project, such as a source code tree, or similar file types, such as a directory of binary executable files). We earlier found that the workload is not evenly distributed, which causes a similar property to exist for timestamps. To measure spatial locality, we use an attribute value’s locality ratio: the percent of directories that recursively contain the value, as illustrated in Figure 3.15. A directory recursively contains an attribute value if it or any of its sub-directories contains the value. The tree on the right has a lower locality ratio because the ext attribute value html is recursively contained in fewer directories. Using recursive accounting allows our analysis to be broader, since it looks at entire sub-trees rather than individual directories. The root directory (e. g., /) recursively contains all of the attributes that occur in the file system. Thus, by definition, the set of directories that recursively contain an attribute value is a superset of the set of directories that directly contain it.

(a) Locality Ratio = 54%. (b) Locality Ratio = 38%.

Figure 3.15: Examples of Locality Ratio. Directories that recursively contain the ext attribute value html are black and gray. The black directories contain the value. The Locality Ratio of ext value html is 54% (= 7/13) in the first tree and 38% (= 5/13) in the second tree. The value of html has better spatial locality in the second tree than in the first one.
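The locality ratio calculation can be sketched as a recursive walk over a toy directory tree; the tree, its attribute values, and the helper names below are hypothetical and only illustrate the recursive-containment accounting, not the analysis tooling used in the study.

```python
# Locality ratio: percent of directories that recursively contain a value.
# Illustrative sketch over a toy tree; not the tooling used in the study.

# Map each directory to (attribute values of files directly in it, child dirs).
tree = {
    "/":            ({"c"},           ["/home", "/proj"]),
    "/home":        (set(),           ["/home/andrew"]),
    "/home/andrew": ({"html", "pdf"}, []),
    "/proj":        ({"c", "h"},      ["/proj/src"]),
    "/proj/src":    ({"c"},           []),
}

def recursively_contains(directory, value):
    """True if this directory or any of its sub-directories contains value."""
    values, children = tree[directory]
    return value in values or any(recursively_contains(c, value) for c in children)

def locality_ratio(value):
    containing = sum(1 for d in tree if recursively_contains(d, value))
    return 100.0 * containing / len(tree)

print(locality_ratio("html"))  # 60.0 (3 of 5 directories: /, /home, /home/andrew)
print(locality_ratio("c"))     # 60.0 (3 of 5 directories: /, /proj, /proj/src)
```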

Table 3.8 shows the locality ratios for the 32 most frequently occurring values of various attributes (ext, size, owner, ctime, mtime) in each of the traces. Locality ratios are less than 1% for all attributes, meaning that over 99% of directories do not recursively contain the value. We expect extended attributes to exhibit similar properties since they are often tied to file type and owner attributes.

Observation 12: Metadata attribute values are heavily clustered in the namespace.

Utilizing spatial locality can help prune a query’s search space by identifying only the parts of the namespace that contain a metadata value. This approach can eliminate a large number of files from the search space. Unfortunately, most general-purpose DBMSs treat pathnames as flat string attributes. As a result, they do not interpret the hierarchical layout of file attributes, making it difficult for them to utilize this information. Instead, DBMSs typically must consider all files for a search, regardless of the query’s locality.

3.7.2 Frequency Distribution

Metadata values also have highly skewed frequencies: their popularity distributions are asymmetric, causing a few very popular metadata values to account for a large fraction of all values. This distribution has also been observed in other metadata studies [5, 43]. Figures 3.16(a) and 3.16(b) show the distribution of ext and size values from our Home trace on a log-log scale.

Trace | ext | size | uid | ctime | mtime
Web | 0.000162% – 0.120% | 0.0579% – 0.177% | 0.000194% – 0.0558% | 0.000291% – 0.0105% | 0.000388% – 0.00720%
Eng | 0.00101% – 0.264% | 0.00194% – 0.462% | 0.000578% – 0.137% | 0.000453% – 0.0103% | 0.000528% – 0.0578%
Home | 0.000201% – 0.491% | 0.0259% – 0.923% | 0.000417% – 0.623% | 0.000370% – 0.128% | 0.000911% – 0.0103%

Table 3.8: Locality Ratios of the 32 most frequently occurring attribute values. All Locality Ratios are well below 1%, which means that files with these attribute values are recursively contained in less than 1% of directories.

The linear appearance indicates that the distributions are Zipf-like and follow a power law distribution [152]. In these distributions, 80% of files have one of the 20 most popular ext or size values, while the remaining 20% of the files have thousands of other values. Figure 3.16(c) shows the distribution of the Cartesian product (i. e., the intersection) of the top 20 ext and size values. The curve is much flatter, which indicates a more even distribution of values. Only 33% of files have one of the top 20 ext and size combinations. In Figure 3.16(c), file percentages for corresponding ranks are at least an order of magnitude lower than in the other two graphs. This means, for example, that there are many files with owner andrew and many files with ext pdf, but often there are over an order of magnitude fewer files with both owner andrew and ext pdf attributes.

Observation 13: Metadata attribute values have highly skewed frequency distributions.

These distribution properties show that multi-attribute searches will significantly reduce the number of query results, as Boolean queries return the intersection of the results for each query predicate. Unfortunately, most DBMSs rely on attribute value distributions (also known as selectivity) to choose a query plan. When distributions are skewed, query plans often require extra data processing [101]; for example, they may retrieve all of andrew’s files to find the few that are andrew’s pdf files, or vice-versa. Our analysis shows that query execution should utilize attribute values’ spatial locality rather than their frequency distributions. Spatial locality provides a more effective way to execute a query because it is more selective and can better reduce a query’s search space. Additionally, if frequency distribution is to be used, the frequency of the Cartesian product of the query predicates should be used rather than that of a single predicate.
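The following sketch illustrates, on hypothetical file records, why single-attribute selectivity is a poor planning signal: the combined (owner, ext) predicate is far more selective than either attribute alone.

```python
# Compare single-attribute frequencies with the frequency of the combined
# (owner, ext) predicate. Records are hypothetical; the point is only that
# the conjunction is far more selective than either attribute alone.
from collections import Counter

files = (
    [{"owner": "andrew", "ext": "c"}] * 900 +     # andrew's source files
    [{"owner": "andrew", "ext": "pdf"}] * 10 +    # andrew's few pdfs
    [{"owner": "jim",    "ext": "pdf"}] * 800 +   # jim's many pdfs
    [{"owner": "jim",    "ext": "c"}] * 290
)

owners = Counter(f["owner"] for f in files)
exts   = Counter(f["ext"] for f in files)
pairs  = Counter((f["owner"], f["ext"]) for f in files)

n = len(files)
print("owner=andrew:", owners["andrew"] / n)          # 0.455
print("ext=pdf:     ", exts["pdf"] / n)               # 0.405
print("both:        ", pairs[("andrew", "pdf")] / n)  # 0.005
```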

(a) Rank of ext. (b) Rank of size. (c) Rank of (ext, size).

Figure 3.16: Attribute value distribution examples. A rank of 1 represents the attribute value with the highest file count. The linear curves on the log-log scales in Figures 3.16(a) and 3.16(b) indicate a Zipf-like distribution, while the flatter curve in Figure 3.16(c) indicates a more even distribution.

3.8 Summary

In order to design better file system organization and indexing techniques, it is important to understand the kinds of data that file systems store and how they are used. In this chapter we presented an analysis of both file system workload and snapshot traces. We analyzed two large-scale CIFS network file system workloads and three metadata snapshots from enterprise-class file servers. We compared these workloads to previous file system studies to understand how file access patterns have changed and conducted a number of other experiments.

We found that read-write file access patterns and random file access are far more common than previously thought and that most file storage remains unused, even over a three month period. Our observations on sequentiality, file lifetime, file reuse and sharing, and request type distribution also differ from those in earlier studies. Based on these observations, we made several recommendations for improving network file server design to handle the workload of modern corporate and engineering environments. Additionally, we found that metadata attribute values are heavily clustered in the namespace and that attribute values have highly skewed distributions. We then discussed how these attributes impact DBMS performance and how they can be exploited to improve search performance.

While our analysis is a first step toward designing better organization and indexing strategies, there are a number of other important experiments that have yet to be studied. We detail some of these in Section 6.1.

Chapter 4

New Approaches to File Indexing

Achieving effective search in large-scale file systems is difficult because these systems contain petabytes of data and billions of files. Index structures must index up to 10^10 to 10^11 metadata attributes and even more content keywords. In addition, they must handle frequent file updates. Thus, proper index design is critical to achieving effective file system search. Unfortunately, current file system search solutions rely on general-purpose index structures that are not designed for file system search and can limit performance at large scales. As discussed in Section 2.5, metadata attributes are often indexed in a relational DBMS and content keywords are usually indexed in a traditional inverted index. General-purpose index solutions are appealing as off-the-shelf solutions that can provide the needed search functionality and are widely available. For example, Microsoft’s enterprise search indexes metadata in their Extensible Storage Engine (ESE) DBMS [109]. Also, the Linux desktop search tool Beagle [18] relies on the standard Lucene inverted index [11] for content search. While general-purpose index solutions are quick to deploy, they lack the optimized designs required to effectively search billions of files. An effective indexing system must meet several requirements. Search performance must be fast and scalable enough to make it a common method of file access. General-purpose solutions are optimized for other workloads. For example, DBMSs are optimized for OLTP workloads, which can cause needless disk accesses, poor cache utilization, and extra processing when used for file system search [162]. Additionally, index update performance must be fast enough to quickly index frequent file changes.

Unfortunately, most general-purpose index structures are optimized for search performance, meaning that updates can be very slow [1, 96]. The index must also have limited resource requirements to ensure that a scalable solution is cost-effective and possible to integrate within an existing storage system. General-purpose indexes often depend on dedicated hardware to ensure performance [15, 75], which makes them expensive at large scales. While not part of the index itself, it is also critical that file metadata and keywords can be effectively gathered from the file system. It must be possible to quickly collect changes from billions of files without impacting normal file system performance. In this chapter, we examine the hypothesis that file system search requirements can be better met through new index designs that are specifically optimized for file systems. To do this, we present the design of a file metadata index and a file content index that leverage the file system-specific properties discussed in the previous chapter to guide their designs and improve performance.

Metadata index: We present the design of Spyglass, a novel metadata search system that exploits file metadata properties to enable fast, scalable search that can be embedded within the storage system. Our design introduces several new metadata indexing techniques. Hierarchical partitioning is a new method of namespace-based index partitioning that exploits namespace locality to provide flexible control of the index. Signature files are compact descriptions of a partition’s contents, helping to route queries only to relevant partitions and prune the search space to improve performance and scalability. A new snapshot-based metadata collection method provides scalable collection by re-crawling only the files that have changed. Finally, partition versioning, a novel index versioning mechanism, enables fast update performance while allowing “back-in-time” search of past metadata.

Inverted content index: We present an inverted index design that leverages hierarchical partitioning to decompose posting lists into many smaller, disjoint segments based on the file system’s namespace. Through the use of an indirect index that manages these segments, our approach provides flexible, fine-grained index control that can enhance scalability

and improve both search and update performance. In addition, we discuss how to leverage partitioning to enforce file security permissions, provide personalized search result rankings, and distribute the index across a cluster.

An evaluation of our Spyglass prototype, using our real-world, large-scale metadata traces, shows that search performance is improved by 1–4 orders of magnitude compared to basic DBMS setups. Additionally, search performance is scalable; it is capable of searching hundreds of millions of files in less than a second. Index update performance is up to 40× faster than basic DBMS setups and scales linearly with system size. The index itself typically requires less than 0.1% of total disk space. Index versioning allows “back-in-time” metadata search while adding only a tiny overhead to most queries. Finally, our snapshot-based metadata collection mechanism performs 10× faster than a straw-man approach. Our evaluation demonstrates that file system-specific designs can greatly improve performance compared to general-purpose solutions. The remainder of this chapter is organized as follows. Section 4.1 presents the design of our Spyglass metadata index and Section 4.2 presents the design of our content index. We evaluate performance using our Spyglass prototype in Section 4.3 and summarize our findings in Section 4.4.

4.1 Metadata Indexing

In addition to the file system properties presented in Chapter 3, we wanted to better understand user and administrator metadata search needs. To do this, we surveyed over 30 large-scale storage system users and administrators. We asked subjects to rank the perceived usefulness of various queries that we supplied, as well as to supply the kinds of queries they would like to run and why. We found subjects using metadata search for a wide variety of purposes. Use cases included managing storage tiers, tracking legal compliance data, searching large scientific data output files, finding files with incorrect security ACLs, and resource/capacity planning. Table 4.1 provides examples of some popular use cases and the metadata attributes searched.

File Management Question / Metadata Search Query

Which files can be migrated to tape?
    size > 50 GB, atime > 6 months.
How many duplicates of this file are in my home directory?
    owner = andrew, datahash = 0xE431, path = /home/andrew.
Where are my recently modified presentations?
    owner = andrew, type = (ppt | keynote), mtime < 2 days.
Which legal compliance files can be expired?
    retention time = expired, mtime > 7 years.
Which of my files grew the most in the past week?
    Top 100 where size(today) > size(1 week ago), owner = andrew.
How much storage do these users and applications consume?
    Sum size where owner = andrew, type = database.

Table 4.1: Use case examples. Metadata search use cases collected from our user survey. Each high-level question being addressed is shown with the metadata attributes that are searched and example values. Users used basic inode metadata as well as specialized extended attributes, such as legal retention times. Common search characteristics include multiple attributes, localization to part of the namespace, and “back-in-time” search.

From our survey we observed three important metadata search characteristics. First, over 95% of searches included multiple metadata attributes to refine search results; a search on a single attribute over a large file system can return thousands or even millions of results, which users do not want to sift through. The more specific their queries, the narrower the scope of the results. Second, about 33% of searches were localized to part of the namespace, such as a home or project directory. Users often have some idea of where their files are and a strong idea of where they are not; localizing the search focuses results on only relevant parts of the namespace. Third, about 25% of the searches that users deemed most important searched multiple versions of metadata. Users use “back-in-time” searches to understand file trends and how files are accessed.

4.1.1 Spyglass Design

We designed Spyglass to address the file system search requirements discussed earlier. Spyglass is specially designed to exploit metadata search properties to achieve scale and performance while being embedded within the storage system. Spyglass focuses on crawling, updating, and searching metadata.

Figure 4.1: Spyglass overview. Spyglass resides within the storage system. The crawler extracts file metadata, which gets stored in the index. The index consists of a number of partitions and versions, all of which are managed by a caching system.

Spyglass uses several novel techniques that exploit the file system properties discussed in Chapter 3 to provide fast, scalable search in large-scale storage systems. First, hierarchical partitioning partitions the index based on the namespace, preserving spatial locality in the index and allowing fine-grained index control. Second, signature files [53] are used to improve search performance by leveraging locality to identify only the partitions that are relevant to a query. Third, partition versioning batches index updates into versions, which improves update performance and allows “back-in-time” search of past metadata versions. Finally, Spyglass utilizes storage system snapshots to crawl only the files whose metadata has changed, providing fast collection of metadata changes. Spyglass resides within the storage system and consists of two major components, shown in Figure 4.1: the Spyglass index, which stores metadata and serves queries, and a crawler that extracts metadata from the storage system.

4.1.2 Hierarchical Partitioning

To exploit metadata locality and improve scalability, the Spyglass index is partitioned into a collection of separate, smaller indexes, with a technique we call hierarchical partitioning.

86 /

home proj usr

andrew jim distmeta reliability include

thesis scidac src experiments

Spyglassindex

Ondisk layout

Figure 4.2: Hierarchical partitioning example. Sub-tree partitions, shown in different colors, index different storage system sub-trees. Each partition is stored sequentially on disk. The Spyglass index is a tree of sub-tree partitions.

Hierarchical partitioning is based on the storage system’s namespace and encapsulates separate parts of the namespace into separate partitions, thus allowing more flexible, fine-grained control of the index. Similar partitioning strategies are often used by file systems to distribute the namespace across multiple machines [122, 180]. Each of the Spyglass partitions is stored sequentially on disk, as shown in Figure 4.2. Thus, unlike a DBMS, which stores records adjacently on disk using their row or column order, Spyglass groups records that are nearby in the namespace together on disk. This approach improves performance since the files that satisfy a query are often clustered in only a portion of the namespace, as shown by our observations in Section 3.7. For example, a search of the storage system for andrew’s ppt files likely does not require searching sub-trees such as other users’ home directories or system file directories. Hierarchical partitioning allows only the sub-trees relevant to a search to be considered, thereby enabling reduction of the search space and improving scalability. Also, a user may choose to localize the search to only a portion of the namespace. Hierarchical partitioning allows users to control the scope of the files that are searched. A DBMS-based solution usually encodes pathnames as flat strings, making it oblivious to the hierarchical

nature of file organization and requiring it to consider the entire namespace for each search. If the DBMS stores the files sorted by file name, it can improve locality and reduce the fraction of the index table that must be scanned; however, this approach can still result in performance problems for index updates, and it does not encapsulate the hierarchical relationship between files. Spyglass partitions are kept small, on the order of 100,000 files, to maintain locality in the partition and to ensure that each can be read and searched very quickly. We discuss the reason for choosing this size in Section 4.3. Since partitions are stored sequentially on disk, searches can usually be satisfied with only a few small sequential disk reads to retrieve the partitions that are needed to satisfy a query. Also, sub-trees often grow at a slower rate than the system as a whole [5, 43], which provides scalability because the number of partitions to search will often grow more slowly than the size of the system. We refer to each partition as a sub-tree partition; the Spyglass index is a tree of sub-tree partitions that reflects the hierarchical ordering of the storage namespace. Each partition has a main partition index, in which file metadata for the partition is stored; partition metadata, which keeps information about the partition; and pointers to child partitions. Partition metadata includes information used to determine if a partition is relevant to a search and information used to support partition versioning. The Spyglass index is stored persistently on disk; however, all partition metadata, which is small, is cached in-memory. A partition cache manages the movement of entire partition indexes to and from disk as needed. When a file is accessed, its neighbor files will likely need to be accessed as well, due to spatial locality. Paging in entire partition indexes allows metadata for all of these files to be fetched in a single, small sequential read. This concept is similar to the use of embedded inodes [57] to store inodes adjacent to their parent directory on disk. In general, Spyglass search performance is a function of the number of partitions that must be read from disk. Thus, the partition cache’s goal is to reduce disk accesses by ensuring that most partitions searched are already in-memory. While we know of no studies of file system query patterns, we believe that a simple LRU algorithm is effective. Both web queries [19] and file system access patterns (see Section 3.3) exhibit skewed, Zipf-like popularity distributions,

suggesting that file metadata queries may exhibit similar popularity distributions; this would mean that only a small subset of partitions will be frequently accessed. An LRU algorithm keeps these recently accessed partitions in-memory, ensuring high performance for common queries and efficient cache utilization.
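A minimal sketch of such a partition cache is shown below. It assumes a hypothetical loader function that reads one sub-tree partition from disk in a single sequential I/O; it is not the Spyglass implementation.

```python
# LRU cache that pages whole partition indexes in and out of memory.
# Minimal sketch; read_partition_from_disk is a hypothetical loader that
# reads one sub-tree partition with a single sequential I/O.
from collections import OrderedDict

class PartitionCache:
    def __init__(self, capacity, loader):
        self.capacity = capacity          # max partitions held in memory
        self.loader = loader              # e.g., read_partition_from_disk
        self.cache = OrderedDict()        # partition id -> in-memory index

    def get(self, partition_id):
        if partition_id in self.cache:
            self.cache.move_to_end(partition_id)      # mark most recently used
            return self.cache[partition_id]
        index = self.loader(partition_id)             # one sequential read
        self.cache[partition_id] = index
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)            # evict least recently used
        return index
```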

4.1.2.1 Partition Indexes

Each partition index must provide fast, multi-dimensional search of the metadata it indexes. To do this we use a K-D tree [24], which is a k-dimensional binary tree, because it provides lightweight, logarithmic point, range, and nearest neighbor search over k dimensions and allows multi-dimensional search of a partition in tens to hundreds of microseconds. However, other index structures can provide additional functionality. For example, FastBit [187] provides high index compression, Berkeley DB [121] provides transactional storage, cache-oblivious B-trees [21] improve B-tree update time, and K-D-B-trees [136] allow partially in-memory K-D trees. However, in most cases, the fast, lightweight nature of K-D trees is preferred. The drawback is that K-D trees are difficult to update; Section 4.1.3 describes techniques to avoid continuous K-D tree updates.
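To illustrate the kind of multi-dimensional search a partition index provides, the sketch below builds a tiny two-dimensional K-D tree over hypothetical (size, mtime) points and performs an axis-aligned range query; it is a toy example, not the K-D tree code used by Spyglass.

```python
# Minimal K-D tree over (size, mtime) metadata points with an axis-aligned
# range query. Sketch only; not the Spyglass implementation.

def build(points, depth=0):
    if not points:
        return None
    axis = depth % 2                      # alternate between size and mtime
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build(points[:mid], depth + 1),
        "right": build(points[mid + 1:], depth + 1),
    }

def range_search(node, lo, hi, out):
    """Collect points p with lo[i] <= p[i] <= hi[i] in every dimension i."""
    if node is None:
        return out
    p, axis = node["point"], node["axis"]
    if all(lo[i] <= p[i] <= hi[i] for i in range(2)):
        out.append(p)
    if lo[axis] <= p[axis]:
        range_search(node["left"], lo, hi, out)
    if p[axis] <= hi[axis]:
        range_search(node["right"], lo, hi, out)
    return out

# (size in KB, mtime as days ago) -- hypothetical partition contents.
tree = build([(4, 1), (900, 30), (12, 2), (50_000, 400), (64, 10)])
print(range_search(tree, lo=(0, 0), hi=(100, 14), out=[]))  # small, recent files
```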

4.1.2.2 Partition Metadata

Partition metadata contains information about the files in the partition, including the paths of indexed sub-trees, file statistics, signature files, and version information. File statistics, such as file counts and minimum and maximum values, are kept to answer aggregation and trend queries without having to process the entire partition index. These statistics are computed as files are being indexed. A version vector, which is described in Section 4.1.3, manages partition versions. Signature files are used to determine if the partition contains files relevant to a query. Signature files [53] are bit arrays that serve as compact summaries of a partition’s contents and exploit metadata locality to prune a query’s search space. A common example of a signature file is the Bloom filter [26]. Spyglass can determine whether a partition may index any files that match a query simply by testing bits in the signature files. A signature file and an associated hashing function are created for each attribute indexed in the partition.


Figure 4.3: Signature file example. Signature files for the ext and size metadata attributes are shown. Each bit corresponds to a value or set of values in the attribute value space. One bits indicate that the partition may contain files with that attribute value, while zero bits indicate that it definitely does not. In the top figure, each bit corresponds to an extension. False positives occur in cases where multiple extensions hash to the same bit position. In the bottom figure, each bit corresponds to a range of file sizes.

All bits in the signature file are initially set to zero. As files are indexed, their attribute values are hashed to a bit position in the attribute’s signature file, which is set to one. We illustrate the design of two signature files in Figure 4.3. To determine if the partition indexes files relevant to a query, each attribute value being searched is hashed and its bit position is tested. The partition needs to be searched only if all bits tested are set to one. Thus, this approach does not depend on the frequency distribution of a single attribute value, which we showed in Section 3.7 is a poor query execution metric. Due to spatial locality, most searches can eliminate many partitions, reducing the number of disk accesses and the amount of processing a query must perform. Because collisions in the hashing function cause false positives, a signature file determines only whether a partition may contain files relevant to a query, potentially causing a partition to be searched when it does not contain any files relevant to a search. A false positive is shown in the top part of Figure 4.3, where we see both the c and pdf extensions hash to the same bit location. The one bit at that location can only tell that one of those attribute values is stored in the partition, but not which one. However, signature files cannot produce false negatives,

so partitions with relevant files will never be missed. False positive rates can be reduced by varying the size of the signature or changing the hashing function. Increasing signature file sizes, which are initially around 2 KB, decreases the chance of a collision by increasing the total number of bits. This trades increased memory requirements for lower false positive rates. Changing the hashing function allows a bit’s meaning, and how it is used, to be improved. For example, consider a signature file for file size attributes, as shown in the bottom part of Figure 4.3. Rather than have each bit represent a single size value (e. g., 522 bytes), we can reduce false positives for common small files by mapping each 1 KB range to a single bit for sizes under 1 MB. The ranges for less common large files can be made more coarse, perhaps using a single bit for sizes between 25 and 50 MB. While Spyglass stores signature files in memory, it is possible to store them efficiently on disk. Signature files can be written to disk in a bit-sliced [54] fashion, which allows only the data for the few bits being tested to be read from disk. Bit-slicing is done by grouping N signature files together, such as the signature files describing the ext attribute for N partitions. These signature files are stored on disk in N-bit slices, where the ith slice contains the ith bit from each of the N signature files. Thus, retrieving slice i during query execution for an ext attribute value will read bit i for the N different signature files from disk. This approach allows the bit in question to be accessed sequentially for N signature files and eliminates the need to read untested bit positions. When Spyglass contains many partitions, the number of signature files that must be tested can become large. The number of signature files that have to be tested can be reduced by utilizing the tree structure of the Spyglass index to create hierarchically defined signature files. Hierarchical signature files are smaller signatures (roughly 100 bytes) that summarize the contents of a partition and the partitions below it in the tree. Hierarchical signature files are the logical OR of a partition’s signature files and the signature files of its children. A single failed test of a hierarchical signature file can eliminate huge parts of the index from the search space, preventing every partition’s signature files from being tested. Hierarchical signature files are kept small to save memory at the cost of increased false positives.
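A minimal sketch of a per-attribute signature file appears below; the hash function and bit-array size are illustrative choices, not Spyglass’s actual parameters.

```python
# Per-attribute signature file: a fixed-size bit array summarizing which
# attribute values a partition may contain. Sketch only; the hash and the
# bit-array size are illustrative, not Spyglass's actual parameters.
import hashlib

class SignatureFile:
    def __init__(self, num_bits=2 * 1024 * 8):      # ~2 KB of bits
        self.num_bits = num_bits
        self.bits = bytearray(num_bits // 8)

    def _position(self, value):
        digest = hashlib.sha1(str(value).encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):                            # called while indexing a file
        pos = self._position(value)
        self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, value):                    # tested during query routing
        pos = self._position(value)
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))

# One signature file per attribute indexed in the partition.
ext_sig = SignatureFile()
for ext in ["c", "h", "pdf"]:
    ext_sig.add(ext)
print(ext_sig.may_contain("pdf"))   # True
print(ext_sig.may_contain("mp3"))   # False (barring a hash collision)
```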

91 4.1.3 Partition Versioning

Spyglass improves update performance and enables “back-in-time” search using a technique called partition versioning that batches index updates, treating each batch as a new incremental index version. The motivation for partition versioning is two-fold. First, we wish to improve index update performance by not having to modify existing index structures. In-place modification of existing indexes can generate large numbers of disk seeks and can cause a partition’s K-D tree index structure to become unbalanced. Second, “back-in-time” search can help answer many important storage management questions by tracking file trends and how they change. Spyglass batches updates before they are applied as new versions to the index, meaning that the index may be stale because file modifications are not immediately reflected in the index. However, batching updates improves index update performance by eliminating many small, random, and frequent updates that can thrash the index and cache. Additionally, from our user survey, most queries can be satisfied with a slightly stale index. It should be noted that partition versioning does not require updates to be batched. The index can be updated in real time by versioning each individual file modification, as is done in most versioning file systems [144, 156]; however, this will increase space requirements and decrease performance.

4.1.3.1 Creating Versions

Spyglass versions each sub-tree partition individually rather than the entire index as a whole in order to maintain locality. A versioned sub-tree partition consists of two components: a baseline index and incremental indexes, which are illustrated in Figure 4.4. A baseline index is a normal partition index that represents the state of the storage system at time T_0, the time of the initial update. An incremental index is an index of metadata changes between two points in time, T_{n-1} and T_n. These changes are indexed in K-D trees, and smaller signature files are created for each incremental index. Storing changes differs from the approach used in some versioning file systems [144], which maintain full copies for each version. VersionFS [115] provides similar semantics to our method by versioning only the deltas between blocks.

92 /

home proj usr

john jim distmeta reliability include

thesis scidac src experiments Spyglass indexer

T0 T1 T2 T3 T0 T0 T2 T0 T2 T3

Baseline Incremental index indexes

Figure 4.4: Versioning partitioning example. Each sub-tree partition manages its own versions. A baseline index is a normal partition index from some initial time T_0. Each incremental index contains the changes required to roll query results forward to a new point in time. Each sub-tree partition manages its versions in a version vector.

Changes consist of metadata creations, deletions, and modifications. Maintaining only changes requires a minimal amount of storage overhead, resulting in a smaller footprint and less data to read from disk. Each sub-tree partition starts with a baseline index, as shown in Figure 4.4. When

a batch of metadata changes is received at T_1, it is used to build incremental indexes. Each partition manages its incremental indexes using a version vector, similar in concept to inode logs in the Elephant File System [144]. Since file metadata in different parts of the file system changes at different rates, as was shown in Section 3.3 and in other studies [5], partitions may have different numbers and sizes of incremental indexes. Incremental indexes are stored sequentially on disk adjacent to their baseline index. As a result, updates are fast because each partition writes its changes in a single, sequential disk access. Incremental indexes are paged into memory whenever the baseline index is accessed, increasing the amount of data that must be read when paging in a partition, though not typically increasing the number of disk seeks. As a result, the overhead of versioning on overall search performance is usually small.

Performing a “back-in-time” search that is accurate as of time T_n works as follows. First, the baseline index is searched, producing query results that are accurate as of T_0. The incremental indexes T_1 through T_n are then searched in chronological order. Each incremental index searched produces metadata changes that modify the search results, rolling them forward in time and eventually generating results that are accurate as of T_n. For example, consider a query for files with owner andrew that matches two files, F_a and F_b, at T_0. A search of the incremental indexes at T_1 may yield changes that cause F_b to no longer match the query (e. g., it is no longer owned by andrew), and a later search of the incremental indexes at T_n may yield changes that cause file F_c to match the query (i. e., it is now owned by andrew). The results of the query are F_a and F_c, which are accurate as of T_n. Because this process is done in memory and each version is relatively small, searching through incremental indexes is often very fast. In rolling results forward, a small penalty is paid to search the most recent changes; however, updates are much faster because no data needs to be copied, as is the case in CVFS [156], which rolls version changes backwards rather than forwards.
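The roll-forward logic can be sketched as follows; the change-record format, the match() predicate, and the example changes are all hypothetical and simplified.

```python
# Roll query results forward through incremental indexes. Sketch only; the
# change-record format and the match() predicate are hypothetical.

def back_in_time_search(baseline, incrementals, match, target_version):
    """Return the set of files matching `match` as of `target_version`."""
    results = {f["file"] for f in baseline if match(f)}
    for version in incrementals[:target_version]:    # T_1 .. T_n, in order
        for change in version:
            if change["op"] == "delete":
                results.discard(change["file"])
            elif match(change):                       # created or modified into a match
                results.add(change["file"])
            else:                                     # modified so it no longer matches
                results.discard(change["file"])
    return results

baseline = [{"file": "Fa", "owner": "andrew"}, {"file": "Fb", "owner": "andrew"}]
incrementals = [
    [{"op": "modify", "file": "Fb", "owner": "jim"}],      # T_1: Fb leaves the results
    [{"op": "modify", "file": "Fc", "owner": "andrew"}],   # T_n: Fc joins the results
]
match = lambda rec: rec.get("owner") == "andrew"
print(back_in_time_search(baseline, incrementals, match, target_version=2))  # {'Fa', 'Fc'}
```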

4.1.3.2 Managing Versions

Over time, older versions tend to decrease in value and should be removed to reduce search overhead and save space. Spyglass provides two efficient techniques for managing partition versions: version collapsing and version checkpointing. Version collapsing applies each partition’s incremental index changes to its baseline index. The result is a single baseline for each partition that is accurate as of the most recent incremental index. Collapsing is efficient because all original index data is read sequentially and the new baseline is written sequentially. During collapsing, the signature files are re-computed to remove one bits that may correspond to attribute values that no longer exist. An extensible hashing [98] method may be used to incrementally grow or shrink the signature files if needed. Version checkpointing allows an index to be saved to disk as a new copy to preserve an important landmark version of the index and is similar to file landmarks in Elephant [144]. We describe how collapsing and checkpointing can be used with an example. During the day, Spyglass is updated hourly, creating new versions every hour.


Figure 4.5: Snapshot-based metadata collection. In snapshot 2, block 7 has changed since snapshot 1. This change is propagated up the tree. Because block 2 has not changed, we do not need to examine it or any blocks below it.

This allows “back-in-time” searches to be performed at per-hour granularity over the day. At the end of each day, incremental versions are collapsed, reducing space overhead at the cost of prohibiting hour-by-hour searching over the last day. Also, at the end of each day, a copy of the collapsed index is checkpointed to disk, representing the storage system state at the end of that day. At the end of each week, all but the latest daily checkpoints are deleted, and at the end of each month, all but the latest weekly checkpoints are deleted. This results in versions of varying time scales. For example, over the past day any hour can be searched, over the past week any day can be searched, and over the past month any week can be searched. The frequency of index collapsing and checkpointing can be configured based on user needs and space constraints.

4.1.4 Collecting Metadata Changes

The Spyglass crawler takes advantage of NetApp Snapshot™ technology in the NetApp WAFL® file system [77], on which it was developed, to quickly collect metadata changes.

Given two snapshots, T_{n-1} and T_n, Spyglass calculates the difference between them. This difference represents all of the file metadata changes between T_{n-1} and T_n. Because of the way snapshots are created, only the metadata of changed files is re-crawled. This approach is faster than current approaches, which often re-crawl all or most of the file system.

All metadata in WAFL resides in a single file called the inode file, which is a collection of fixed-length inodes. Extended attributes are included in the inodes. Performing an initial crawl of the storage system is fast because it simply involves sequentially reading the inode file. Snapshots are created by making a copy-on-write clone of the inode file. Calculating the difference between two snapshots leverages this mechanism, as shown in Figure 4.5. By looking at the block numbers of the inode file’s indirect and data blocks, we can determine exactly which blocks have changed. If a block’s number has not changed, then it does not need to be crawled. If the block is an indirect block, then no blocks that it points to need to be crawled either, because block changes propagate all the way back up to the inode file’s root block. As a result, the Spyglass crawler can identify just the data blocks that have changed and crawl only their data. This approach greatly enhances scalability because crawl performance is a function of the number of files that have changed rather than the total number of files. Spyglass is not dependent on snapshot-based crawling, though snapshots provide benefits compared to alternative approaches. Periodically walking the file system can be extremely slow because each file must be traversed. Moreover, traversal can utilize significant system resources and alter the file access times on which file caches depend. Another approach, file system event notifications (e. g., inotify [86]), requires hooks into critical code paths, potentially impacting performance. A change log, such as the one used in NTFS, is another alternative; however, since we are not interested in every system event, a snapshot-based scheme is more efficient.
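A simplified sketch of the block-comparison idea is shown below, using a hypothetical nested-dictionary representation of the inode file’s block tree; WAFL’s actual on-disk structures are more involved.

```python
# Find changed inode-file data blocks by comparing block trees from two
# snapshots. Sketch only; the block-tree representation is hypothetical and
# stands in for WAFL's copy-on-write indirect blocks.

def changed_data_blocks(old, new):
    """Yield data blocks present in `new` whose block numbers changed."""
    if new is None or (old is not None and old["blockno"] == new["blockno"]):
        return  # unchanged sub-tree: nothing below it needs to be crawled
    if not new.get("children"):          # leaf: a data block full of inodes
        yield new
        return
    old_children = (old or {}).get("children", [])
    for i, child in enumerate(new["children"]):
        old_child = old_children[i] if i < len(old_children) else None
        yield from changed_data_blocks(old_child, child)

snap1 = {"blockno": 1, "children": [
    {"blockno": 2, "children": [{"blockno": 4}, {"blockno": 5}]},
    {"blockno": 3, "children": [{"blockno": 6}]},
]}
snap2 = {"blockno": 9, "children": [
    {"blockno": 2, "children": [{"blockno": 4}, {"blockno": 5}]},   # untouched sub-tree
    {"blockno": 8, "children": [{"blockno": 7}]},                   # block 7 changed
]}
print([b["blockno"] for b in changed_data_blocks(snap1, snap2)])    # [7]
```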

4.1.5 Distributed Design

Our discussion thus far has focused on indexing and crawling on a single storage server. However, large-scale storage systems are often composed of tens or hundreds of servers. While we do not currently address how to distribute the index, we believe that hierarchical partitioning lends itself well to a distributed environment because the Spyglass index is a tree of partitions. A distributed file system with a single namespace can view Spyglass as a larger tree composed of partitions placed on multiple servers. As a result, distributing the index is a matter of effectively scaling the Spyglass index tree. Also, the use of signature files may be

effective at routing distributed queries to relevant servers and their sub-trees. Obviously, there are many challenges to actually implementing this. A complete distributed design is one of the future directions for this work that we discuss in Section 6.2.

4.2 Content Indexing

In this section we look at how similar file system-specific indexing techniques can be applied to file content search. As mentioned previously, the inverted index is the chief data structure for keyword search. We outline how these indexing techniques can be applied to an inverted index. In particular, our inverted index design utilizes hierarchical partitioning, which we introduced in Section 4.1 and which exploits namespace locality. Namespace locality implies that a file’s location within the namespace influences its properties. Part of our design is based upon the assumption that keyword distributions exhibit a namespace locality property similar to that of metadata; that is, content keywords are also clustered in the namespace. For example, files containing the keywords “financial”, “budget”, and “shareholder” are more likely to be contained in directories pertaining to a company’s quarterly financial documents than in a developer’s source code tree or a directory of system files. While most evidence for file system content keywords exhibiting namespace locality is anecdotal, a previous study did find a variety of keywords that were more common in some file system data sets than others [167]. A more complete file system keyword analysis is future work discussed in Section 6.2. However, even in the absence of keyword namespace locality, our approach provides a method for fine-grained control of the index and posting lists, which users can use to localize their searches and improve update performance, and which can utilize other partitioning strategies, such as partitioning along file security permissions.

4.2.1 Index Design

Recall from Section 2.6.2 that an inverted index contains a dictionary of keywords that map to posting lists, which specify the locations in the file system where the keywords occur. Our new index design consists of two levels.


Figure 4.6: Indirect index design. The indirect index stores the dictionary for the entire file system, and each keyword’s posting list contains the locations of partition segments. Each partition segment is kept sequential on disk.

At the first level is a single inverted index, called the indirect index, which points to the locations of posting list segments rather than to entire posting lists; it is illustrated in Figure 4.6. The indirect index is similar to the inverted index used in GLIMPSE [104]. At the second level is a large collection of posting list segments. A posting list segment is a region of a posting list that is stored sequentially on disk. Posting lists are partitioned into segments using hierarchical partitioning. Thus, a segment represents the postings for a keyword that occurs within a specific sub-tree in the namespace. An illustration of how a posting list is partitioned into segments is shown in Figure 4.7. The namespace is partitioned so that each sub-tree’s partition is relatively small, on the order of 100,000 files, similar to our design in Spyglass. By partitioning the posting lists into segments, we ensure fast performance for searching or updating any one partition, as each segment is small enough to be efficiently read, written, and cached in memory. In essence, partitioning makes the index namespace-locality-aware and allows the index to be controlled at the granularity of sub-trees. However, it should be pointed out that the partitioning does not need to be based on namespace location; other metrics, such as security or owner, may also be appropriate.

Figure 4.7: Segment Partitioning. The namespace is broken into partitions that represent disjoint sub-trees. Each partition maintains posting list segments for the keywords that occur within its sub-tree. Since each partition is relatively small, these segments can be kept sequential on disk.

The purpose of the indirect index is to identify which sub-tree partitions contain any postings for a given keyword. Doing so allows search, update, security, and ranking to operate at the granularity of sub-trees. The indirect index maintains the dictionary for the entire file system; keeping a dictionary per partition would require too much space, since many keywords would be replicated across dictionaries. Each keyword's dictionary entry points to a posting list that contains the on-disk addresses of the segments holding the actual postings, as shown in Figure 4.6. Since the indirect index only maintains a dictionary and posting lists containing segment pointers, it can be kept in-memory if properly distributed across the nodes in the file system, which we discuss later in this chapter.
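
To make the two-level structure concrete, the following minimal Python sketch models an indirect index whose dictionary maps each keyword to a list of segment pointers. The class and field names (and the notion of a numeric on-disk address) are illustrative assumptions, not the prototype's actual data layout:

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class SegmentPointer:
        partition_id: int    # sub-tree partition that owns the segment
        disk_address: int    # hypothetical address of the sequential on-disk segment

    class IndirectIndex:
        """Dictionary for the whole file system; posting lists hold segment pointers."""

        def __init__(self):
            self._dictionary = defaultdict(list)   # keyword -> [SegmentPointer, ...]

        def add_segment(self, keyword, partition_id, disk_address):
            self._dictionary[keyword].append(SegmentPointer(partition_id, disk_address))

        def partitions_for(self, keyword):
            """Sub-tree partitions that hold any postings for this keyword."""
            return {p.partition_id for p in self._dictionary.get(keyword, [])}

        def segments_for(self, keyword, partition_ids):
            """Segment pointers for a keyword, restricted to the given partitions."""
            return [p for p in self._dictionary.get(keyword, [])
                    if p.partition_id in partition_ids]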

4.2.2 Query Execution

All search queries go through the indirect index. The indirect index identifies which segments contain the posting data relevant to the search. Since each segment is sequential on disk, retrieving a single segment is fast, though a disk seek will often be required between segments. While retrieving a keyword's full posting list (i.e., all segments for the keyword) requires a disk seek between segments, our use of hierarchical partitioning allows us to exploit namespace locality to retrieve fewer segments. As mentioned earlier, it is assumed that keywords and phrases have namespace locality and occur in only a fraction of the partitions (which we plan to quantify in future work). For example, the Boolean query storage ∧ research ∧ santa ∧ cruz requires (depending on the ranking algorithm) that a partition contain files with all four terms before it should be searched; if it does not contain all four terms, it often does not need to be searched at all. Using the indirect index, we can easily identify the partitions that contain the full intersection of the query terms. By taking the intersection of the partitions returned for each term, we can identify just the segments that contain files matching the query. Reading only these small segments can significantly reduce the amount of data read compared to fetching postings from across the entire file system. Likewise, by reducing the search space to a few small partitions, with disk seeks occurring only along partition boundaries, the total number of disk seeks can be significantly reduced. The search space can also be reduced when a search query is localized to part of the namespace. For example, a user may want to search only their home directory or the sub-tree containing files for a certain project. In existing systems, the entire file system is searched and the results are then pruned to ensure they fall within the sub-tree. Our approach, however, uses a lookup table that maps directory pathnames to their partitions to restrict the search space to the scope specified in the query. For example, a query scoped to a user's home directory eliminates all segments outside that directory from the search space. Thus, users can control the scope and performance of their queries, which is critical in large-scale file systems. Often, as the file system grows, the set of files a user cares about searching

and accessing grows at a much slower rate. Our approach allows search to scale with what the user wants to search, rather than with the total size of the file system. Once in-memory, segments are managed by an LRU cache. As mentioned previously, there have been no studies of file system query patterns, though web searches [19, 94] and file access patterns (see Section 3.3) both exhibit Zipf-like distributions. This implies that skewed popularity distributions are likely for partitions and that an LRU algorithm will be able to keep popular partitions in-memory, greatly improving performance for common-case searches. Additionally, this enables better cache utilization, since only index data related to popular partitions is kept in-memory rather than data from all over the file system. Efficient cache utilization is important for direct integration with the file system, since the index will often share hardware with the file system itself.
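
As a rough illustration of this query path, the sketch below evaluates a conjunctive query against the hypothetical IndirectIndex sketched earlier: it intersects the partitions that contain every term, optionally restricts them to a user-specified scope, and only then reads the matching segments. The read_segment callback stands in for one sequential on-disk segment read and is an assumption, not part of the prototype:

    def conjunctive_search(index, terms, read_segment, scope_partitions=None):
        """Evaluate term1 AND term2 AND ... over a partitioned inverted index.

        read_segment(pointer) -> set of file ids; a stand-in for one sequential
        read of a posting list segment from disk.
        """
        # Step 1: partitions that contain postings for *all* query terms.
        candidate = None
        for term in terms:
            parts = index.partitions_for(term)
            candidate = parts if candidate is None else candidate & parts
            if not candidate:
                return set()        # some term occurs nowhere; nothing to read

        # Step 2: optionally narrow the search to a sub-tree scope (e.g., a home directory).
        if scope_partitions is not None:
            candidate &= set(scope_partitions)

        # Step 3: read only the small segments in the candidate partitions and intersect.
        results = None
        for term in terms:
            files = set()
            for pointer in index.segments_for(term, candidate):
                files |= read_segment(pointer)
            results = files if results is None else results & files
        return results or set()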

4.2.3 Index Updates

One of the key challenges with file system search is balancing search and update performance. As discussed in Section 2.6.2, inverted indexes traditionally use either an in-place or a merge-based update strategy [96]. An in-place update strategy is an update-optimized approach. When posting lists are written to disk, a sequential on-disk region is allocated that is larger than the required amount; when new postings are added to the list, they are written to the empty part of the region. When the region fills and a new posting needs to be written, a new sequential region is allocated elsewhere on disk and the new postings are written to it. Thus, in-place updates are fast to write, since they can usually be written sequentially and do not require much pre-processing. However, as posting lists grow they become very fragmented, which degrades search performance. Alternatively, a merge-based update strategy is a search-optimized approach. When a posting list is modified, it is read from disk, modified in-memory, and written out sequentially to a new location. This strategy ensures that posting lists are sequential on disk, though it requires the entire posting list to be read and written in order to update it, which can be extremely slow for large posting lists. Our approach achieves a better balance in two ways. First, since each posting list segment contains only the postings from a single partition, segments are small enough to make merge-based updates

efficient. When modifying a segment, we are able to quickly read the entire list, modify it in memory, and quickly write it out sequentially to disk. Doing so keeps segment updates relatively fast and ensures that segments remain sequential on disk. An in-place approach is also feasible, since small segments often will not need more than one disk block, though the space overhead from over-allocating disk blocks for many segments can become quite high. Second, our approach can exploit locality in file access patterns to reduce overall disk I/Os. Often only a subset of file system sub-trees are frequently modified, as we showed in Section 3.3.6 and as was shown in other studies [5, 43]. As a result, updates often only need to read segments from a small number of partitions. By reading fewer segments, far less data needs to be read for an update compared to retrieving an entire posting list, cache space is better utilized, and updates can be coalesced in-memory before being written back to disk.
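
A merge-based segment update might be organized as in the sketch below, which buffers new postings per (keyword, partition) pair and, on flush, reads each small dirty segment, merges it in memory, and rewrites it sequentially. The read_segment and write_segment_sequential callbacks, and the way the new address is recorded, are illustrative assumptions:

    class SegmentUpdater:
        """Coalesce postings in memory and merge-write whole segments on flush."""

        def __init__(self, read_segment, write_segment_sequential):
            self._read = read_segment               # (keyword, partition_id) -> list of postings
            self._write = write_segment_sequential  # (keyword, partition_id, postings) -> new address
            self._pending = {}                      # dirty postings buffered in memory

        def add_posting(self, keyword, partition_id, posting):
            self._pending.setdefault((keyword, partition_id), []).append(posting)

        def flush(self, indirect_index):
            """Merge each dirty segment: read it, merge in memory, rewrite sequentially."""
            for (keyword, partition_id), new_postings in self._pending.items():
                old = self._read(keyword, partition_id)
                merged = sorted(set(old) | set(new_postings))
                address = self._write(keyword, partition_id, merged)
                # Record the segment's new location; a real index would replace
                # the stale pointer rather than simply appending a new one.
                indirect_index.add_segment(keyword, partition_id, address)
            self._pending.clear()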

4.2.4 Additional Functionality

In addition to improving scalability, hierarchical partitioning can potentially improve how file permissions are enforced, aid result ranking, and improve space utilization. Secure file search is difficult because either an index is kept for each user, which requires a huge amount of disk space, or permissions must be checked for all search results, which can be very slow [33]. However, most users only have access privileges to a limited number of sub-trees in the namespace [125]. Hierarchical partitioning, through additional security metadata stored in the indirect index, can eliminate sub-trees a user cannot access from the search space. Doing so prevents users from searching files they cannot access without requiring any additional indexes, and it reduces the total number of search results returned, which limits the number of files whose permissions must be checked. Ranking file system search results is difficult because most files are unstructured documents with little semantic information. However, sub-trees in the namespace often have distinct purposes, such as a user's home directory or a source code tree. Using hierarchical partitioning, we can leverage this information to improve how search results are ranked in two ways. First, files in the same partition may have a semantic relationship (i.e., files for the same project) that can be used when calculating rankings. Second, different ranking requirements

may be set for different partitions. Rather than use a "one size fits all" ranking function for billions of files, we can potentially use different ranking functions for different parts of the namespace. For example, files from a source code tree can be ranked differently than files in a scientist's test data, potentially improving search relevance for users. Both file system access patterns and web searches have Zipf-like distributions. Assuming these distributions hold for file system search as well, a large set of index partitions will be cold (i.e., not frequently searched). Our approach allows us to identify these cold partitions and either heavily compress them or migrate them to lower-tier storage (low-end disk or tape) to improve cost and space utilization. A similar concept has been applied to legal compliance data in file systems and has shown the potential for significant space savings [112].
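
One way such per-partition security metadata could be used is sketched below: each partition records which users or groups can possibly access files within its sub-tree, and partitions the querying user cannot reach are dropped before any segments are read. The partition_acl table is a hypothetical structure assumed for illustration:

    def filter_partitions_by_access(candidate_partitions, partition_acl, uid, gids):
        """Drop partitions whose sub-trees the user cannot possibly access.

        partition_acl: partition_id -> set of user and group ids that have access
        somewhere inside the partition's sub-tree (hypothetical security metadata
        kept alongside the indirect index).
        """
        principals = {uid} | set(gids)
        return {pid for pid in candidate_partitions
                if partition_acl.get(pid, set()) & principals}

Results surviving this filter would still need a final per-file permission check, but far fewer files are involved.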

4.2.5 Distributing the Index

Large-scale inverted indexes are usually distributed [15], as are large-scale file systems. A distributed design enables better scalability and parallel processing. It is important that the index is aware of how the file system is distributed so that it can place index components near the data that they index and can effectively balance load across the nodes. We now discuss how our index can be distributed across the nodes in a file system. We use parallel file systems, such as Ceph [180], as the context for our discussion because they are intended for large-scale environments and are highly distributed. In a parallel file system, a cluster of metadata servers (MDSs) handles metadata operations while a cluster of object storage devices (OSDs) handles data operations. The MDS cluster manages the namespace and stores all file metadata persistently on the OSD cluster. A more in-depth discussion of parallel file systems is provided in Section 2.2. We intend for the indirect index to be distributed across the MDS cluster, and across enough nodes that it can be kept in-memory. Since a significant amount of query pre-processing takes place in the indirect index (e.g., determining which partitions to search), keeping it in-memory will significantly improve performance, though it is not required. Posting list segments will be stored on the OSD cluster, and since they are small they can map directly to

physical objects. Storing segments on the OSD cluster also provides parallel access to many segments at a time. The indirect index will be partitioned across the MDS cluster using a global inverted

file (IFG) partitioning approach [15]. In this approach, keywords are used to partition the index so that each MDS stores only a subset of the keywords in the file system. For example, with two MDS nodes A and B, A may index and store all keywords in the range [a–q] and B may

index and store all remaining keywords. Along with a good keyword partitioning strategy, IFG can provide good load balancing and limit network bandwidth requirements, as messages are sent only to the MDS nodes responsible for the keywords in the query. In our design, the example Boolean query storage ∧ santa ∧ cruz will be evaluated as follows. A user will issue the query through a single MDS node (possibly of their choosing), which will shepherd the query execution. This shepherd node will query the MDS nodes

responsible for the keywords "storage", "santa", and "cruz" based on the IFG partitioning. These nodes will return their indirect index posting lists, which are stored in-memory, and the shepherd will compute their intersection to determine which partitions contain all three terms and are thus relevant to the query. The shepherd will cache these posting lists (to improve subsequent query performance) and then query the other MDS nodes for the segments that correspond to the relevant partitions. These segments will be read from the OSD cluster, cached at the three MDS nodes, and returned to the shepherd. The shepherd will aggregate the result lists from the segments and rank the results before returning them to the user.
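
The keyword-to-server routing could be expressed as in the sketch below, which maps each keyword to an MDS node by hashing (a simple stand-in for a real IFG keyword-range policy) and groups the query terms by responsible node, as a shepherd would before contacting them. The node names and the hash-based rule are assumptions for illustration:

    import hashlib
    from collections import defaultdict

    def mds_for_keyword(keyword, mds_nodes):
        """Map a keyword to one MDS node; a hash-based stand-in for IFG partitioning."""
        digest = hashlib.sha1(keyword.encode("utf-8")).digest()
        return mds_nodes[int.from_bytes(digest[:4], "big") % len(mds_nodes)]

    def plan_query(terms, mds_nodes):
        """Group query terms by the MDS node responsible for them (the shepherd's view)."""
        plan = defaultdict(list)
        for term in terms:
            plan[mds_for_keyword(term, mds_nodes)].append(term)
        return dict(plan)

    # The shepherd for "storage AND santa AND cruz" would ask each responsible node
    # for its in-memory indirect posting lists, then intersect the returned partitions.
    print(plan_query(["storage", "santa", "cruz"], ["mds-a", "mds-b", "mds-c"]))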

4.3 Experimental Evaluation

In this section we evaluate how well the new indexing techniques we presented address the file system search challenges described at the beginning of the chapter, and how our designs compare to general-purpose solutions. To do this, we evaluate a Spyglass prototype implementation. Due to the lack of a representative file system content keyword data set, and the need to first examine whether namespace locality affects keywords in large-scale file systems, we do not evaluate our inverted index design. However, since our metadata and content index

designs share many features, the results of our Spyglass evaluation are still very relevant. To evaluate Spyglass we first measured metadata collection speed, index update performance, and disk space usage. We then analyzed search performance and how effectively index locality is utilized. Finally, we measured partition versioning overhead.

4.3.1 Implementation Details

Our Spyglass prototype was implemented as a user-space process on Linux. An RPC-based interface to WAFL gathers metadata changes using our snapshot-based crawler. Our prototype dynamically partitions the index as it is being updated. As files and directories are inserted into the index, they are placed into the partition with the longest pathname match (i.e., the pathname match farthest down the tree). New partitions are created when a directory is inserted and all matching partitions are full; a partition is considered full when it contains over 100,000 files. We use 100,000 as the soft partition limit in order to ensure that partitions are small enough to be efficiently read from and written to disk. Using a much smaller partition size would often increase the number of partitions that must be accessed for a query, incurring extra expensive disk seeks. Using a much larger partition size decreases the number of partitions that must be accessed for a query; however, it poorly encapsulates spatial locality, causing extra data to be read from disk. In the case of symbolic and hard links, multiple index entries are used for the file. During the update process, partitions are buffered in-memory and written sequentially to disk when full; each is stored in a separate file. K-D trees were implemented using libkdtree++ [97]. Signature file bit-arrays are about 2 KB, but hierarchical signature files are only 100 bytes, ensuring that signature files can fit within our memory constraints. Hashing functions that allow each signature file bit to correspond to a range of values were used for the file size and time attributes to reduce false positive rates. When incremental indexes are created, they are appended to their partition on disk. Finally, we implemented a simple search API that allows point, range, top-k, and aggregation searches; we plan to extend this interface in future work.
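
The dynamic partitioning rule described above, placing each new entry in the partition with the longest pathname match and starting a new partition when a directory arrives at a full one, can be read as the simplified sketch below; the Partition class is hypothetical, and the 100,000-file soft limit mirrors the prototype's description:

    SOFT_LIMIT = 100_000    # soft cap on files per partition, as in the prototype

    class Partition:
        def __init__(self, root_path):
            self.root_path = root_path   # sub-tree of the namespace this partition covers
            self.num_files = 0

    def longest_match(path, partitions):
        """Partition whose root is the longest pathname prefix of `path`."""
        best = None
        for part in partitions:
            root = part.root_path.rstrip("/") or "/"
            if path == root or path.startswith(root if root == "/" else root + "/"):
                if best is None or len(part.root_path) > len(best.root_path):
                    best = part
        return best

    def place(path, is_directory, partitions):
        """Insert a new file or directory into the partitioned index (simplified rule)."""
        target = longest_match(path, partitions)
        if is_directory and target.num_files >= SOFT_LIMIT:
            # The matching partition is full: root a new partition at this directory.
            target = Partition(path)
            partitions.append(target)
        target.num_files += 1
        return target

    # Usage: start with a single partition rooted at "/" and insert paths as they
    # are discovered during an update.
    partitions = [Partition("/")]
    place("/home/andrew", True, partitions)
    place("/home/andrew/thesis.tex", False, partitions)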

4.3.2 Experimental Setup

We evaluated performance using our metadata snapshot traces described in Table 3.6. These traces have varying sizes, allowing us to examine scalability. Our Web and Eng traces also have incremental snapshot traces of daily metadata changes for several days. Since no standard metadata search benchmarks exist, we constructed synthetic sets of queries, discussed later in this section, from our metadata traces to evaluate search performance. All experiments were performed on a dual-core AMD Opteron machine with 8 GB of main memory running Ubuntu Linux 7.10. All index files were stored on a network partition that accessed a high-end NetApp file server over NFS. We also evaluated the performance of two popular relational DBMSs, PostgreSQL and MySQL, which serve as relative comparison points to the DBMS-based solutions used in other metadata search systems. The goal of our comparison is to provide some context to frame our Spyglass evaluation, not to compare performance to the best possible DBMS setup. We compared Spyglass to an index-only DBMS setup, which is used in several commercial metadata search systems, and tuned various options, such as page size, to the best of our ability. This setup is effective at pointing out several basic DBMS performance problems. DBMS performance can be improved through the techniques discussed in Chapter 2; however, as stated earlier, they do not completely match metadata search cost and performance requirements. Our Spyglass prototype indexes the metadata attributes listed in Table 3.7. Our index-only DBMSs include a base relation with the same metadata attributes and a B+-tree index for each attribute; each B+-tree maps attribute values to table row IDs. An index-only design reduces space usage compared to some more advanced setups, though it has slower search performance. Cache sizes were configured to 128 MB, 512 MB, and 2.5 GB for the Web, Eng, and Home traces, respectively. These sizes are small relative to the size of each trace and correspond to about 1 MB for every 125,000 files, which provides linear scaling of cache sizes.

Figure 4.8: Metadata collection performance. (a) Baseline; (b) incremental: 2%, 5%, and 10% changes from the baseline. We compare Spyglass's snapshot-based crawler (SB) to a straw-man design (SM). Our crawler has good scalability; performance is a function of the number of changed files rather than system size.

4.3.3 Metadata Collection Performance

We first evaluated our snapshot-based metadata crawler and compared it to a straw-man approach. Fast collection performance affects how often updates can occur and how many system resources are consumed. Our straw-man approach performs a parallelized walk of the file system using stat() to extract metadata. Figure 4.8(a) shows the performance of a baseline crawl of all file metadata. Our snapshot-based crawler is up to 10× faster than the straw-man for 100 million files because our approach simply scans the inode file. As a result, a 100 million file system is crawled in less than 20 minutes. Figure 4.8(b) shows the time required to collect incremental metadata changes. We examine systems with 2%, 5%, and 10% of their files changed; for example, a baseline of 40 million files with 5% change has 2 million changed files. For the 100 million file tests, each of our crawls finishes in under 45 minutes, while the straw-man takes up to 5 hours. Our crawler is able to crawl the inode file at about 70,000 files per second. Our crawler scales effectively because we incur only a fractional overhead as more files are crawled; this is due to crawling only the changed blocks of the inode file.

Figure 4.9: Update performance. The time required to build an initial baseline index, shown on a log scale. Spyglass updates quickly and scales linearly because updates are written to disk mostly sequentially.

4.3.4 Update Performance

Figure 4.9 shows the time required to build the initial index for each of our metadata traces. Spyglass requires about 4 minutes, 20 minutes, and 100 minutes for the three traces, respectively. These times correspond to a rate of about 65,000 files per second, indicating that update performance scales linearly. Linear scaling occurs because updates to each partition are written sequentially, with seeks occurring only between partitions. Incremental index updates have a similar performance profile because metadata changes are written in the same fashion and few disk seeks are added. Our reference DBMSs take between 8× and 44× longer to update because they must load their base table and update index structures. While loading the table is fast, updating index structures often requires seeks back to the base table or extra data copies. As a result, DBMS updates with our Home trace can take a day or more; however, approaches such as cache-oblivious B-trees [21] may be able to reduce this gap.

Figure 4.10: Space overhead. The index disk space requirements, shown on a log scale. Spyglass requires just 0.1% of the Web and Home traces and 10% of the Eng trace to store the index.

4.3.5 Space Overhead

Figure 4.10 shows the disk space usage for all three of our traces. Efficient space usage has two primary benefits: less disk space is taken from the storage system, and a higher fraction of the index can be cached. Spyglass requires less than 0.1% of the total disk space for the Web and Home traces. However, it requires about 10% for the Eng trace because that system's total size is low due to very small files. Spyglass requires about 50 bytes per file across all traces, resulting in space usage that scales linearly with system size. Space usage in Spyglass is 5×–8× lower than in our reference DBMSs because they require space to store both the base table and the index structures. Figure 4.10 shows that building index structures can more than double the total space requirements.

Figure 4.11: Comparison of Selectivity Impact. The selectivity of each query in our query set is plotted against its execution time. Query performance in Spyglass is much less correlated with the selectivity of the query predicates than that of the DBMSs, which track selectivity closely.

4.3.6 Selectivity Impact

We evaluated the effect of metadata selectivity on the performance of Spyglass and the DBMSs. We again generated query sets using ext and owner values from the Web trace with varying selectivity, the ratio of the number of results to the total number of records. Figure 4.11 plots query selectivity against query execution time. We found that the performance of PostgreSQL and MySQL is highly correlated with query selectivity. This correlation is much weaker in Spyglass, which exhibits much more variance. For example, a Spyglass query with selectivity 7 × 10^-6 runs in 161 ms while another with selectivity 8 × 10^-6 requires 3 ms. This variance is caused by Spyglass's higher sensitivity to hierarchical locality and query locality, as opposed to simple query selectivity. This behavior is unlike that of a DBMS, which accesses records from disk based on the predicate it estimates to be the most selective. The weak correlation with selectivity means that Spyglass is less affected by the highly skewed distribution of storage metadata, which makes determining selectivity difficult.

Set   | Search                                                                      | Metadata Attributes
Set 1 | Which user and application files consume the most space?                   | Sum sizes for files using owner and ext.
Set 2 | How much space, in this part of the system, do files from query 1 consume? | Use query 1 with an additional directory path.
Set 3 | What are the recently modified application files in my home directory?     | Retrieve all files using mtime, owner, ext, and path.

Table 4.2: Query Sets. A summary of the searches used to generate our evaluation query sets.

4.3.7 Search Performance

To evaluate Spyglass search performance, we generated sets of queries derived from real-world queries in our user study; there are, unfortunately, no standard benchmarks for file system search. These query sets are summarized in Table 4.2. Our first set is based on a storage administrator searching for the user and application files that are consuming the most space (e.g., the total size of andrew's vmdk files), an example of a simple two-attribute search. The second set is an administrator localizing the same search to only part of the namespace, which shows how localizing the search changes performance. The third set is a storage user searching for recently modified files of a particular type in a specific sub-tree, demonstrating how searching many attributes impacts performance. Each query set consists of 100 queries, with attribute values randomly selected from our traces. Randomly selecting attribute values means that our query sets loosely follow the distribution of values in our traces and that a variety of values are used.

Figure 4.12 shows the total run times for each set of queries. In general, query set 1 takes Spyglass the longest to complete, while query sets 2 and 3 finish much faster. This difference is caused by the ability of sets 2 and 3 to localize the search to only a part of the namespace by including a path with the query. Spyglass is able to search only files from this part of the storage system by using hierarchical partitioning; as a result, the search space for these queries is bounded by the size of the sub-tree, no matter how large the storage system. Because the search space is already small, using many attributes has little impact on performance for set 3. Query set 1, on the other hand, must consider all partitions and tests each partition's signature files to determine which to search. While many partitions are eliminated, there are more partitions to search than in the other query sets, which accounts for the longer run times.

Figure 4.12: Query set run times. The total time required to run each set of queries. Each set is labeled 1 through 3 and is clustered by trace file; each trace is shown on a separate log-scale axis. Spyglass improves performance by reducing the search space to a small number of partitions, especially for query sets 2 and 3, which are localized to only a part of the namespace.

Our comparison DBMSs perform closer to Spyglass on our smallest trace, Web; however, the gap widens as the system size increases. In fact, Spyglass is over four orders of magnitude faster for query sets 2 and 3 on our Home trace, which is our largest at 300 million files. The large performance gap has several causes. First, our DBMSs consider files from all parts of the namespace, making the search space much larger. Second, skewed attribute value distributions cause our DBMSs to process extra data even when there are few results. Third, the DBMSs' base tables ignore metadata locality, causing extra disk seeks to find files that are close in the namespace but far apart in the table. Spyglass, on the other hand, uses hierarchical partitioning to significantly reduce the search space, performs only small, sequential disk accesses, and can exploit locality in the workload to greatly improve cache utilization.

Using the results from Figure 4.12, we calculated query throughput, shown in Table 4.3. Query throughput (queries per second) provides a normalized view of our results and of the query loads that can be achieved. Spyglass achieves a throughput of multiple queries per second in all but two cases; in contrast, the reference DBMSs do not achieve one query per second in any instance and, in many cases, cannot even sustain one query per five minutes.

System     | Web: Set 1 | Set 2 | Set 3 | Eng: Set 1 | Set 2 | Set 3 | Home: Set 1 | Set 2 | Set 3
Spyglass   | 2.38       | 2.12  | 71.4  | 0.315      | 14.1  | 18.9  | 0.05        | 15.4  | 14.1
PostgreSQL | 0.418      | 0.418 | 0.94  | 0.062      | 0.034 | 0.168 | 0.003       | 0.001 | 0.003
MySQL      | 0.714      | 0.68  | 0.063 | 0.647      | 0.123 | 0.115 | 0.019       | 0.004 | 0.009

Table 4.3: Query throughput. We use the results from Figure 4.12 to calculate query throughput (queries per second). Spyglass can achieve query throughput that enables fast metadata search even on large-scale storage systems.

Figure 4.13 shows an alternate view of performance: a cumulative distribution function (CDF) of query execution times on our Home trace, allowing us to see how each query performed. In query sets 2 and 3, Spyglass finishes all searches in less than a second because localized searches bound the search space. For query set 1, 75% of queries take less than one second, indicating that most queries are fast and that a few slow queries contribute significantly to the total run times in Figure 4.12. These queries take longer because they must read many partitions from disk, either because few were previously cached or because many partitions are searched.

4.3.8 Index Locality

We now evaluate how well Spyglass exploits spatial locality to improve query performance. We generated another query set, based on query 1 from Table 4.2, consisting of 500 queries with owner and ext values randomly selected from our Eng trace. We generated similar query sets for the individual ext and owner attributes. Figure 4.14(a) shows a CDF of the fraction of partitions searched. Searching more partitions often increases the amount of data that must be read from disk, which decreases performance. We see that 50% of searches using just the ext attribute reference fewer than 75% of the partitions. However, 50% of searches using ext and owner together reference fewer than 2% of the partitions, since searching more attributes increases the locality of the search, thereby reducing the number of partitions that must be searched. Figure 4.14(b) shows a CDF of cache hit percentages for the same set of queries. Higher cache hit percentages mean that fewer partitions are read from disk. Searching the owner and ext attributes together results in 95% of queries having a cache hit percentage of 95% or better, due to the higher locality exhibited by multi-attribute searches. The higher locality causes repeated searches in the sub-trees where these files reside and allows Spyglass to ignore more non-relevant partitions.

Figure 4.13: Query execution times ((a) query set 1, (b) query set 2, (c) query set 3). A CDF of query set execution times for the Eng trace. In Figures 4.13(b) and 4.13(c), all queries are extremely fast because these sets include a path predicate that allows Spyglass to narrow the search to a few partitions.

Figure 4.14: Index locality ((a) CDF of sub-tree partition accesses, (b) CDF of partition cache hits). A CDF of the number of partitions accessed and of the number of accesses that were cache hits for our query set. Searching multiple attributes reduces the number of partition accesses and increases cache hits.

Figure 4.15: Impact of signature file size. The percent of one bits in the signature files for four different attributes is shown. The percent of one bits decreases as signature file size increases; however, even when signature files are only 1 KB, less than 4% of the total bits are set to one. As mentioned earlier, the size and atime attributes use special hashing functions that treat each bit as a range rather than a discrete value, which keeps the fraction of one bits below 1% in all cases.

The number of partitions searched during a query depends on how effectively signature files can eliminate partitions from the search space. Signature files that are small will often produce more false positives, which increases the number of partitions searched. The probability of a false positive depends on the percentage of bits in the signature file that are set to one [26]. Figure 4.15 shows how the percentage of one bits changes with signature file size for four different attributes in Spyglass using our Web trace; the average percentage of one bits across all partitions is shown for each attribute. We see that the percentage of one bits decreases as the size of the signature files increases. For the ext and owner attributes, the percentage is 2–3× lower for 6 KB signature files than for 1 KB. However, it is important to point out that even with 1 KB signature files, the total percentage of bits set to one is less than 4%. The reason is that hierarchical partitioning exploits namespace locality to keep files with similar attribute values together; as a result, each partition contains only a limited number of attribute values. Also, the size and atime signature files use hashing functions in which each bit corresponds to a range rather than a discrete value, as described in Section 4.1.2, so their percentage of one bits is always well below 1%. Our Spyglass prototype uses 2 KB signature files for each of the ten attributes described in Table 3.7. This size yields good accuracy and is compact, requiring only 200 MB of memory for 1 billion files.
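
A toy version of such a per-partition signature file is sketched below: discrete attributes like ext or owner set one bit per hashed value, while size-like attributes hash each value to a bit that covers a band of values, mirroring the range-based scheme described above. The 2 KB size matches the prototype, but the specific hash functions and band width are assumptions:

    import hashlib

    SIG_BITS = 2 * 1024 * 8    # a 2 KB signature file per attribute, as in the prototype

    class SignatureFile:
        """Per-partition bit array used to test whether an attribute value may occur there."""

        def __init__(self, nbits=SIG_BITS):
            self.nbits = nbits
            self.bits = bytearray(nbits // 8)

        def _set(self, bit):
            self.bits[bit // 8] |= 1 << (bit % 8)

        def _test(self, bit):
            return bool(self.bits[bit // 8] & (1 << (bit % 8)))

        # Discrete attributes (e.g., ext, owner): one bit per hashed value.
        def add_value(self, value):
            self._set(self._hash(value) % self.nbits)

        def may_contain_value(self, value):
            return self._test(self._hash(value) % self.nbits)   # False => skip this partition

        # Range attributes (e.g., size, atime): each bit covers a band of values.
        def add_range_value(self, value, band=4096):
            self._set((int(value) // band) % self.nbits)

        def may_contain_range(self, lo, hi, band=4096):
            return any(self._test(b % self.nbits)
                       for b in range(int(lo) // band, int(hi) // band + 1))

        @staticmethod
        def _hash(value):
            return int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:4], "big")

A query only reads a partition's index when every queried attribute's signature file answers "maybe", which is how false positives translate directly into extra partition reads.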

4.3.9 Versioning Overhead

To measure the search overhead added by partition versioning, we generated 450 queries based on query 1 from Table 4.2, with values randomly selected from our Web trace. We included three full days of incremental metadata changes and used them to perform three incremental index updates. Figure 4.16 shows the time required to run our query set with an increasing number of versions; each version adds about a 10% overhead to the total run time. However, the overhead added to most queries is quite small. Figure 4.16 also shows, via a CDF of the query overheads incurred for each version, that more than 50% of the queries have less than a 5 ms overhead. Thus, a few much slower queries contribute most of the 10% overhead. This behavior occurs because overhead is typically incurred when incremental indexes are read from disk, which does not happen once a partition is cached. Since reading extra versions does not typically incur extra disk seeks, the overhead for the slower queries is mostly due to reading partitions with much larger incremental indexes from disk.

Figure 4.16: Versioning overhead. The figure on the left shows the total run time for a set of 450 queries; each version adds about 10% overhead. On the right, a CDF shows per-query overheads; over 50% of queries have an overhead of 5 ms or less.

4.4 Summary

Providing effective search at the scale of billions of files is not easy. There are a number of requirements, such as scalable file crawling, fast search and update performance, and efficient resource utilization, that must be addressed. The design of an effective search index is critical to meeting these requirements. However, existing search solutions rely on general-purpose index structures, such as DBMSs, that are not designed for file search and that limit performance and scalability. In this chapter we examined the hypothesis that these requirements can be better met with index designs that are optimized for file system search. To examine this hypothesis we presented the design of two new indexing structures: one for file metadata and one for content search. Unlike general-purpose indexes, our index designs leverage the large-scale file system properties presented in Chapter 3 to improve performance and scalability. Our designs introduced several novel file system indexing techniques. Flexible index control is provided by an index partitioning mechanism, called hierarchical partitioning, that leverages namespace locality. We introduced the use of signature files and an indirect index to effectively route queries and significantly reduce a query's search space. A novel index versioning mechanism was used to provide both fast index updates and "back-in-time" search. An evaluation of our metadata index shows search performance improvements of one to four orders of magnitude compared to existing DBMS-based solutions, while providing faster update performance and using only a fraction of the disk space. Our findings support a similar hypothesis from the DBMS community, which argues that application-specific designs will often outperform a general-purpose solution.

Chapter 5

Towards Searchable File Systems

Search is becoming an increasingly important way for users to access and manage their files. However, current file systems are ill-suited to meet these emerging needs because they organize files using a basic hierarchical namespace that is not easy to search. Modern file organizations still resemble those designed over forty years ago, when file systems contained orders of magnitude fewer files and basic hierarchical namespace navigation was more than sufficient [38]. As a result, searching a file system requires brute-force namespace traversal, which is not practical at large scale. Currently, this problem is addressed by implementing file search as a search application, a separate index or database of the file system's attributes and metadata maintained outside of the file system, as is done in Linux (e.g., the locate tool), personal computers [14], and enterprise search appliances [67, 85]. Though search applications have been somewhat effective for desktops and small-scale servers, they face several inherent limitations at larger scales. First, search applications must track all file changes in the file system, a difficult challenge in a system with billions of files and constant file changes. Second, file changes must be quickly re-indexed to prevent a search from returning very old and inaccurate results. Keeping the application's index and the file system consistent is difficult because collecting file changes is often slow [81, 158] and search applications are often inefficient to update [1, 173]. Third, search applications often require significant disk, memory, and CPU resources to manage larger file systems using the same techniques that are successful at smaller scales. Thus, a new approach is necessary to scale file system search to large-scale file systems.

An alternative solution is to build file search functionality directly into the file system. This eliminates the need to manage a secondary database, allows file changes to be searched in real time, and enables an internal file organization that corresponds to users' need for search functionality. However, enabling search within the file system has its own challenges. First, file metadata and data must be organized and indexed so that they can be searched quickly, even as the system scales. Second, this organization must still provide good file system performance. Previous approaches, such as replacing the file system with a relational database [59, 119], have had difficulty addressing these challenges. In this chapter we explore the hypothesis that search can be integrated directly into the file system's design to enable scalable and efficient search while providing good file system performance. To examine this hypothesis we present two new approaches to how file systems internally organize and index files. The first is Magellan, a searchable metadata architecture; the second is Copernicus, a semantic file system design that organizes files into a dynamic, search-based namespace.

Magellan: Unlike previous work, Magellan does not use relational databases to enable search. Instead, it uses new query-optimized metadata layout, indexing, and update techniques to ensure searchability and high performance in a single file system. Users view a traditional hierarchical interface, though in Magellan all metadata and file lookups, including directory lookups, are handled using a single search structure, eliminating the redundant data structures that plague existing file systems with search grafted on. Our evaluation of Magellan shows that it is possible to provide a scalable, fast, and searchable metadata system for large-scale storage, thus facilitating file system search without hampering performance.

Copernicus: Unlike Magellan, which enables search within a traditional hierarchical namespace, Copernicus integrates search directly into a dynamic, search-based namespace. Copernicus uses a dynamic graph-based index that clusters semantically related files into vertices and allows inter-file relationships to form edges between them. This architecture differs significantly from previous semantic file system designs, which simply impose a naming layer over a standard file system or database. The graph replaces the traditional directory hierarchy, can be efficiently queried, and allows the construction of dynamic namespaces. The namespace provides "virtual" directories that correspond to queries and allows navigation using inter-file relationships.

The remainder of this chapter is organized as follows. Section 5.1 discusses the challenges of combining search and file systems. The Magellan design is presented in Section 5.2 and the Copernicus design in Section 5.3. Our Magellan prototype implementation is evaluated in Section 5.4. We summarize our findings in Section 5.5.

5.1 Background

Hierarchical file systems have long been the “standard” mechanism for accessing file systems, large and small. As file systems have grown in both size and number of files, however, the need for file search has grown; this need has not been adequately met by existing approaches.

5.1.1 Search Applications

File system search is traditionally addressed with a separate search application, such as the Linux locate program, Apple Spotlight [14], and Google Enterprise Search [67]. Search applications re-index file metadata and file content in a separate search-optimized structure, often a relational database or an information retrieval engine. These applications augment the file system, providing the ability to efficiently search files without requiring file system modifications, which makes them easy to deploy onto existing systems. Figure 5.1 shows how these applications interact with the file system. The application maintains search indexes for file metadata and content, such as databases or inverted files, which are stored persistently as files in the file system. These applications have been somewhat successful on desktops and smaller-scale file systems. However, they require that two separate indexes of all metadata be maintained: the file system's own index and the search application's index. This duplication presents several inherent challenges as a large-scale and long-term solution.

Figure 5.1: File search applications. The search application resides on top of the file system and stores file metadata and content in separate search-optimized indexes. Maintaining several large index structures can add significant space and time overheads.

1) File attributes and changes must be replicated in the search application. Metadata and content are replicated into the search application by pulling them from the file system or by having the file system push them in. A pull approach, as used by Google Enterprise, discovers file changes through periodic crawls of the file system. These crawls are slow in file systems containing tens of millions to billions of files. Worse, the file system's performance is usually disrupted during the crawl because of the I/O demands imposed by a complete file system traversal. Crawls cannot collect changes in real time, which often leads to inconsistency between the search application and the file system, causing incorrect (out-of-date) search results to be returned. Pushing updates from the file system into the application allows real-time updates; however, the file system must be modified to be aware of the search application. Additionally, search applications are search-optimized, which often makes update performance notoriously slow [1, 173]. Apple Spotlight, which uses a push approach, does not apply updates in real time for precisely these reasons. As a result, searches in Apple Spotlight may not reflect the most recent changes in the file system, though such recently changed files are often the ones the user wants.

2) Search applications consume additional resources. Search applications often rely on abundant hardware resources to enable fast performance. For example, many commercial products can only be purchased as a hardware and software bundle [67, 85]. Requiring additional hardware resources makes these systems expensive and difficult to manage at large scales. Modern large file systems focus on energy consumption and consolidation [9], making efficient resource utilization critical.

3) Search applications add a level of indirection. Building databases on top of file systems has inefficiencies that have been known for decades [160]; thus, accessing a file through a search application can be much less efficient than through the file system [148]. Accessing a file requires the search application to query its index to find the matching files, which often requires accessing index files stored in the file system. Once file names are returned, they are passed to the file system and the files are then retrieved from the file system itself, which requires navigating the file system's namespace index for each file. Accessing files found through a search application therefore requires at least double the number of steps, which is both inefficient and redundant.

4) Users must interact with two interfaces. Accessing files requires users to interact with two different interfaces depending on how they want to retrieve their data. The application's query interface must be used for file search, while normal file access is performed through the file system's standard interface (e.g., POSIX). Using multiple interfaces to achieve a common goal is cumbersome and complicates interactions with the storage system.

5.1.2 Integrating Search into the File System

We believe that a more complete solution is for the file system to organize files in a way that facilitates efficient search. Search applications and file systems share the same goal: organizing and retrieving files. Implementing the two functions separately leads to duplicate functionality and inefficiencies. With search becoming an increasingly common way to access and manage files, file systems must provide search as an integral part of their functionality. However, organizing file system metadata so that it can be searched efficiently is not an easy task.

Because current file systems provide only basic directory tree navigation, search applications are the only option for flexible, non-hierarchical access. The primary reason behind this shortcoming is that, despite drastic changes in technology and usage, metadata designs remain similar to those developed over 40 years ago [38], when file systems held less than 10 MB. These designs make metadata search difficult for several reasons.

1) Queries must quickly access large amounts of data from disk. File systems often have metadata scattered throughout the disk [57]. Thus, scanning the metadata of millions of files for a search can require many expensive disk seeks. Moreover, scanning file content for a search (e.g., grep) may read most of the disk's contents, which is slow, especially given that disk capacities are growing much faster than disk bandwidth, and file data can also be highly scattered on disk [153].

2) Queries must quickly analyze large amounts of data. File metadata and data must be linearly scanned to find files that match the query. File systems do not directly index file metadata or content keywords, which forces slow linear search techniques to be used to find relevant files.

3) File systems do not know where to look for files. The file system does not know where the relevant files for a query are located and must often search a large portion of (or the entire) file system. In large-scale systems, searching large parts of the file system can be impractical because of the sheer volume of data that must be examined.

5.2 Integrating Metadata Search

In this section we present the design of Magellan, a searchable metadata architecture for large-scale file systems. We designed Magellan with two primary goals. First, we wanted a metadata organization that could be quickly searched. Second, we wanted to provide the same metadata performance and reliability that users have come to expect from other high-performance file systems. We focus on the problems that make current designs difficult to search, leaving other useful metadata design techniques intact. Our design leverages the metadata-specific indexing techniques we developed in Chapter 4.

This section discusses the new metadata techniques that Magellan uses to achieve these goals:

• The use of a search-optimized metadata layout that clusters the metadata for a sub-tree in the namespace on disk to allow large amounts of metadata to be quickly accessed for a query.

• Indexing metadata in multi-dimensional search trees that can quickly answer metadata queries.

• Efficient routing of queries to particular sub-trees of the file system using Bloom filters [26].

• The use of metadata journaling to provide good update performance and reliability for our search-optimized designs.

Magellan was designed to be the metadata server (MDS) for Ceph, a prototype large-scale parallel file system [180]. In Ceph, metadata is managed by a separate metadata server outside of the data path. We discuss issues specific to Ceph where necessary, though our design is applicable to many file systems: systems such as PVFS [34] use separate metadata servers, and an optimized metadata system can be integrated into standard Linux file systems via the VFS layer, since Magellan's interface is similar to POSIX, with the addition of a query interface.

5.2.1 Metadata Clustering

In existing file systems, searches must read large amounts of metadata from disk, since searching the file system requires traversing the directory tree and may need to perform millions of readdir() and stat() operations to access file and directory metadata. For example, a search to find where a virtual machine has saved a user's virtual disk images may read all metadata below /usr/ to find files with owner equal to 3407 (the user's UID) and file type equal to vmdk.

Figure 5.2: Metadata clustering. Each block corresponds to an inode on disk. Shaded blocks labeled 'D' are directory inodes, while non-shaded blocks labeled 'F' are file inodes. In the top disk layout, the indirection between directory and file inodes causes them to be scattered across the disk. The bottom disk layout shows how metadata clustering co-locates the inodes for an entire sub-tree on disk to improve search performance. Inodes reference their parent directory in Magellan; thus, the pointers are reversed.

Accessing metadata often requires numerous disk seeks to reach the file and directory inodes, limiting search performance. Though file systems attempt to locate inodes near their parent directory on disk, inodes can still be scattered across the disk. For example, FFS stores inodes in the same on-disk cylinder group as their parent directory [105]. However, prior work has shown that the inodes for a directory are often spread across multiple disk blocks. Furthermore, directory inodes are not usually adjacent to the first file inode they name, nor are file inodes often adjacent to the next named inode in the directory [57]. We illustrate this in the top part of Figure 5.2, which shows how a sub-tree can be scattered across the disk.

Magellan addresses this problem by grouping inodes into large groups called clusters, which are similar in concept to the hierarchical partitioning presented in Section 4.1.2. Each cluster contains the metadata for a sub-tree in the namespace and is stored sequentially in a serialized form on disk, allowing it to be quickly accessed for a query. For example, a cluster may store the inodes corresponding to the files and directories in the /projects/magellan/ sub-tree. The bottom part of Figure 5.2 shows how clusters are organized on disk. Retrieving all of the metadata in this sub-tree can be done in a single large sequential disk access. Conceptually, metadata clustering is similar to embedded inodes [57], which store file inodes adjacent to their parent directory on disk; metadata clustering goes further and stores a group of file and directory inodes adjacent on disk. Co-locating directories and files makes hard links difficult to implement; we address this with a table that tracks hard-linked files whose inodes are located in another cluster. Metadata clustering exploits several file system properties. First, disks are much better at sequential transfers than at random accesses; metadata clustering leverages this to prefetch an entire sub-tree in a single large sequential access. Second, file metadata exhibits namespace locality: metadata attributes depend on their location in the namespace, as we showed in Section 3.7. For example, files owned by a certain user are likely to be clustered in that user's home directory or active project directories, not randomly scattered across the file system. Thus, queries will often need to search files and directories that are nearby in the namespace, and clustering allows this metadata to be accessed more quickly using fewer I/O requests. Third, metadata clustering works well for the many file system workloads that exhibit similar locality, as shown in Section 3.3 and in prior studies [137]; workloads often access multiple related directories, which clustering serves well.

5.2.1.1 Cluster organization

Clusters are organized into a hierarchy, with each cluster maintaining pointers to its child clusters: the clusters containing the sub-trees below it in the namespace. A simple example is a cluster storing inodes for /usr/ and /usr/lib/ and pointing to a child cluster that stores inodes for /usr/include/ and /usr/bin/, each of which points to its own children. This hierarchy

can be navigated in the same way as a normal directory tree. Techniques for indexing inodes within a cluster are discussed in Section 5.2.2. While clustering can improve performance by allowing fast prefetching of metadata for a query, a cluster that grows too large can hurt performance, since overly large clusters waste disk bandwidth by prefetching metadata for unneeded files. Magellan prevents clusters from becoming too large by using a hard limit on the number of directories a cluster can contain and a soft limit on the number of files. While a hard limit on the number of directories can be enforced by splitting clusters that have too many directories, we chose a soft limit on files so that each file can remain in the same cluster as its parent directory. Our evaluation found that clusters with tens of thousands of files provide the best performance, as discussed in Section 5.4.
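
A minimal model of this hierarchy is sketched below: each cluster records the sub-tree roots of its children, and finding the cluster that holds a given path walks down just as a directory lookup would. The field names and the specific limit values are illustrative assumptions (the prose above only fixes a hard directory limit and a soft file limit in the tens of thousands):

    class Cluster:
        DIR_HARD_LIMIT = 1_000      # hypothetical hard cap on directories per cluster
        FILE_SOFT_LIMIT = 50_000    # soft cap on files ("tens of thousands")

        def __init__(self, root_path):
            self.root_path = root_path   # namespace sub-tree whose metadata this cluster stores
            self.children = {}           # child sub-tree root -> child Cluster
            self.num_dirs = 1            # counts the sub-tree root directory itself
            self.num_files = 0

    def cluster_for(path, root_cluster):
        """Walk the cluster hierarchy like a directory tree to find the owning cluster."""
        current = root_cluster
        while True:
            child = next((c for root, c in current.children.items()
                          if path == root or path.startswith(root.rstrip("/") + "/")),
                         None)
            if child is None:
                return current
            current = child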

5.2.1.2 Creating and caching clusters

Magellan uses a greedy algorithm to cluster metadata. When an inode is created, it is assigned to the cluster containing its parent directory. File inodes are always placed in this cluster. If the new inode is a directory inode and the cluster has reached its size limit, a new cluster is created as a child of the current directory and the inode is inserted into it; otherwise, it is inserted into the current cluster. Though this approach works fairly well in practice, it does have drawbacks. First, a very large directory will result in a very large cluster. Second, no effort is made to achieve a uniform distribution of cluster sizes. These issues could be addressed with a clustering algorithm that re-balances cluster distributions over time. Allocation of a new cluster is similar to allocation of extents in extent-based file systems, such as XFS [168]. When a cluster is created, a sequential region is allocated on disk with a size that is a function of the preset maximum cluster size (generally on the order of 1 to 2 MB). This allocated region is intended to be larger than the current size of the cluster, allowing the cluster to grow over time without having to be relocated and without causing fragmentation. Since Magellan is being designed for use in Ceph, it is beneficial if an allocated region corresponds to a single Ceph object, because this eases cluster management. These regions are dynamically allocated because, in Ceph, free space and object allocation are managed using a pseudo-random hashing function [181]; however, it is also possible to use an extent tree, as is

With this hashing function, the location of an object is calculated from its object identifier, which maps the object to the Ceph storage device on which it is stored. The storage device manages its own internal allocation and storage of objects. If a cluster grows too large for a single object, it can become fragmented across objects, slowing access; however, Ceph provides the opportunity for parallel access to objects that reside on separate storage devices. Magellan manages memory using a cluster cache that is responsible for paging clusters to and from disk, using a basic LRU algorithm to determine which clusters to keep in the cache. Clusters can be flushed to disk under five conditions: (1) the cluster cache is full and needs to free up space; (2) a cluster has been dirty for too long; (3) there are too many journal entries and a cluster must be flushed to free up journal space (as discussed in Section 5.2.4); (4) an application has requested that the cluster be flushed (e. g., via sync()); or (5) a background thread that periodically flushes clusters to keep the number of dirty clusters low has selected it. Clusters index inodes using in-memory search trees that cannot be partially paged in or out of memory, so the cache is managed in large, cluster-sized units. The cluster cache writes a cluster back to its previous on-disk location provided there is enough space; if not, the cluster is written to a new location that is large enough and the old space is freed. In Ceph, clusters larger than a single object may be striped across multiple storage devices to provide parallel access.
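
As a concrete illustration, the sketch below shows the greedy cluster-assignment policy described above; it is our own pseudocode rather than the prototype's code, and the 2,000-directory and 20,000-inode limits match the defaults reported later in Section 5.4.1.

```python
# A sketch of the greedy cluster-assignment policy; the limits and class
# layout are illustrative assumptions, not the prototype's implementation.
class Cluster:
    MAX_DIRS = 2000           # hard limit on directories per cluster
    SOFT_MAX_INODES = 20000   # soft limit on total inodes per cluster

    def __init__(self, parent=None):
        self.inodes = []      # inodes stored in this cluster
        self.children = []    # child clusters (sub-trees farther down)
        self.parent = parent

    def num_dirs(self):
        return sum(1 for i in self.inodes if i["type"] == "dir")

    def full(self):
        return (self.num_dirs() >= self.MAX_DIRS or
                len(self.inodes) >= self.SOFT_MAX_INODES)

def assign(parent_cluster, inode):
    """Place a new inode: files always join the parent directory's cluster;
    a directory arriving at a full cluster starts a new child cluster."""
    if inode["type"] == "dir" and parent_cluster.full():
        child = Cluster(parent=parent_cluster)
        parent_cluster.children.append(child)
        child.inodes.append(inode)
        return child
    parent_cluster.inodes.append(inode)
    return parent_cluster
```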

5.2.2 Indexing Metadata

Searches must quickly analyze large amounts of metadata to find the files matching a query. However, current file systems do not index the metadata attributes that need to be searched. For example, searching for files with owner equal to UID 3047 and modification time earlier than 7 days ago requires linearly scanning every inode, because it is not known in advance which inodes may match the query.

Figure 5.3: Inode indexing with a K-D tree. A K-D tree is shown in which directory inodes are shaded and labeled 'D' and file inodes are unshaded and labeled 'F'. K-D trees are organized by attribute value, not by namespace hierarchy, so a file inode may point to other file inodes. The namespace hierarchy is maintained by inodes storing the inode number of their parent directory. Extended attributes, such as title and file type, are included in the inode.

5.2.2.1 Indexing with K-D trees

To address this problem, each cluster indexes its inodes in a K-D tree: a k-dimensional binary search tree [24]. Inode metadata attributes (e. g., owner, size) are dimensions in the tree, and any combination of these can be searched using point, range, or nearest-neighbor queries. K-D trees are similar to binary search trees, though a different dimension is used to pivot at each level of the tree. K-D trees allow a single data structure to index all of a cluster's metadata; a one-dimensional data structure, such as a B-tree, would require an index for each attribute, making reading, querying, and updating metadata more difficult. Each inode is a node in the K-D tree and contains basic attributes and any extended attributes (e. g., file type, last backup date) that are indexed in the tree, as shown in Figure 5.3. The figure shows that inodes are organized based on their attribute values, not their position in the namespace; for example, a file inode's child pointers may point to other file inodes, ordered by the value of some pivot attribute rather than by any namespace relationship. It is important to note that inodes often store information not indexed by the K-D tree, such as block pointers. To maintain namespace relationships, each inode stores its own name and the inode number of its parent directory, as shown in Figure 5.3. A readdir() operation simply queries the directory's cluster for all files whose parent inode number equals the directory's inode number.

Storing names with their inodes allows filename queries to avoid locating the parent directory first and eliminates the level of indirection between directories and their entries' inodes that exists in normal file systems. In fact, all file system metadata operations translate to K-D tree queries. For example, an open() on /usr/foo.txt is a query, in the cluster containing /usr/'s inode, for a file with filename equal to foo.txt, parent inode equal to /usr/'s inode number, and the appropriate mode permissions.
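
The following minimal sketch illustrates how a cluster's K-D tree can answer a multi-attribute query such as the earlier example (files owned by UID 3047 with an old modification time). The node layout, attribute set, and values are illustrative assumptions; the actual prototype uses libkdtree++ rather than code like this.

```python
# A sketch of per-cluster inode indexing in a k-d tree and a multi-attribute
# range query. Attributes, values, and the cutoff are made up for illustration.
from dataclasses import dataclass
from typing import Optional

DIMS = ("owner", "size", "mtime")          # dimensions indexed by the tree

@dataclass
class Node:
    inode: dict                            # attribute -> value
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root, inode, depth=0):
    """O(log N) insert; a different dimension pivots at each level."""
    if root is None:
        return Node(inode)
    axis = DIMS[depth % len(DIMS)]
    if inode[axis] < root.inode[axis]:
        root.left = insert(root.left, inode, depth + 1)
    else:
        root.right = insert(root.right, inode, depth + 1)
    return root

def range_query(root, lo, hi, depth=0, out=None):
    """Return inodes whose indexed attributes all fall within [lo, hi]."""
    if out is None:
        out = []
    if root is None:
        return out
    axis = DIMS[depth % len(DIMS)]
    v = root.inode[axis]
    if all(lo[a] <= root.inode[a] <= hi[a] for a in DIMS):
        out.append(root.inode)
    if v >= lo[axis]:                      # left subtree may still match
        range_query(root.left, lo, hi, depth + 1, out)
    if v <= hi[axis]:                      # right subtree may still match
        range_query(root.right, lo, hi, depth + 1, out)
    return out

# Build a tiny cluster index and query it for files owned by UID 3047 whose
# modification time is at or before an assumed cutoff of 900.
root = None
for ino in [{"owner": 3047, "size": 4096, "mtime": 100},
            {"owner": 115,  "size": 812,  "mtime": 400},
            {"owner": 3047, "size": 9000, "mtime": 950}]:
    root = insert(root, ino)

print(range_query(root,
                  lo={"owner": 3047, "size": 0,      "mtime": 0},
                  hi={"owner": 3047, "size": 10**15, "mtime": 900}))
```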

5.2.2.2 Index updates

As a search-optimized data structure, K-D trees provide the best search performance when they are balanced; however, adding or removing nodes can unbalance them. Metadata modifications must do both: remove the old inode and insert an updated one. While individual updates are fast O(log N) operations, many updates can unbalance the K-D tree. The cluster cache addresses this problem by rebalancing a K-D tree before it is written to disk. Doing so piggybacks the O(N log N) cost of rebalancing onto the bandwidth-limited serialization back to disk, hiding the delay. This approach also ensures that, when a K-D tree is read from disk, it is already optimized.
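
A sketch of this rebalancing step, under assumed attribute dimensions: the tree is rebuilt by recursive median splits, the O(N log N) cost that Magellan piggybacks onto the cluster's write-back.

```python
# A sketch (our illustration) of rebalancing before write-back: rebuild the
# K-D tree by recursive median splits so subsequent queries stay O(kN^(1-1/k)).
DIMS = ("owner", "size", "mtime")   # assumed indexed dimensions

def rebalance(inodes, depth=0):
    """Return a balanced tree as nested (inode, left, right) tuples."""
    if not inodes:
        return None
    axis = DIMS[depth % len(DIMS)]
    ordered = sorted(inodes, key=lambda i: i[axis])
    mid = len(ordered) // 2
    return (ordered[mid],
            rebalance(ordered[:mid], depth + 1),
            rebalance(ordered[mid + 1:], depth + 1))
```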

5.2.2.3 Caching inodes

While K-D trees are good for multi-attribute queries, they are less efficient for some common operations. Many file systems, such as Apple's HFS+ [13], index inodes using just the inode number, often in a B-tree. Operations such as path resolution that look up an inode using just an inode number are done more efficiently in a B-tree than in a K-D tree that indexes multiple attributes, since searching a single dimension in a K-D tree uses a range query that requires O(kN^(1-1/k)) time, where k is the number of dimensions and N is the size of the tree, compared to a B-tree's O(log N) look up. To address this issue, each cluster maintains an inode cache that stores pointers to inodes previously accessed in the K-D tree. The inode cache is a hash table that short-circuits inode look ups, avoiding K-D tree look ups for recently-used inodes. The inode cache uses

filename and parent inode as the keys, and is managed by an LRU algorithm. The cache is not persistent; it is cleared when the cluster is evicted from memory.
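
A minimal sketch of such an inode cache, assuming a bounded LRU hash table keyed by (parent inode number, filename); the capacity and names are illustrative, not the prototype's.

```python
# A sketch of the per-cluster inode cache: a bounded LRU hash table that
# short-circuits look ups before falling back to the K-D tree.
from collections import OrderedDict

class InodeCache:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.entries = OrderedDict()        # (parent_ino, name) -> inode pointer

    def lookup(self, parent_ino, name):
        key = (parent_ino, name)
        if key in self.entries:
            self.entries.move_to_end(key)   # mark as most recently used
            return self.entries[key]
        return None                         # miss: query the K-D tree instead

    def insert(self, parent_ino, name, inode):
        key = (parent_ino, name)
        self.entries[key] = inode
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
```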

5.2.3 Query Execution

Searching through many millions of files is a daunting task, even when metadata is effectively clustered on disk and indexed. Fortunately, as mentioned in Section 5.2.1, metadata attributes exhibit namespace locality: attribute values are influenced by their namespace location, and files with similar attributes are often clustered in the namespace. Magellan exploits this property by using Bloom filters [26] to describe the contents of each cluster and to route queries to only the sub-trees that contain relevant metadata. Bloom filters serve a role similar to that of signature files in our Spyglass design from Chapter 4. Each cluster stores a Bloom filter for each attribute type that it indexes. Bits in a Bloom filter are initialized to zero when it is created. As inodes are inserted into the cluster, their attribute values are hashed to positions in the bit array, which are set to one. In a Bloom filter, a one bit indicates that a file with that attribute value may be indexed in the cluster, while a zero bit indicates that the cluster contains no such file. A set bit is probabilistic because of hash collisions: two attribute values may hash to the same bit position, causing false positives. A query searches a cluster only when all bits tested by the query are set to one, eliminating many clusters from the search space. False positives cause a query to search clusters that do not contain relevant files, degrading performance but not producing incorrect results. Magellan keeps Bloom filters small (a few kilobytes) to ensure that they fit in memory. Unfortunately, deleting values from a Bloom filter is difficult: when an attribute is removed or modified, the bit corresponding to the old value cannot be cleared, because the cluster may contain other values that hash to the same position. Not deleting values, however, causes false positives to accumulate. To address this, Magellan clears and recomputes a cluster's Bloom filters when its K-D tree is flushed to disk; writing the K-D tree visits each inode, allowing the Bloom filters to be rebuilt.
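
The sketch below illustrates a per-attribute Bloom filter of the kind described above; the double-hashing scheme, filter size, and hash count are assumptions rather than Magellan's actual parameters.

```python
# A sketch of a per-attribute Bloom filter used to route queries to clusters.
# Sizes and the hashing scheme are assumptions; a real filter would pack bits.
import hashlib

class BloomFilter:
    def __init__(self, nbits=16384, nhashes=4):
        self.nbits, self.nhashes, self.bits = nbits, nhashes, bytearray(nbits)

    def _positions(self, value):
        h = hashlib.sha1(repr(value).encode()).digest()
        a = int.from_bytes(h[:8], "big")
        b = int.from_bytes(h[8:16], "big") or 1
        return [(a + i * b) % self.nbits for i in range(self.nhashes)]

    def add(self, value):
        for p in self._positions(value):
            self.bits[p] = 1

    def may_contain(self, value):
        # False => definitely absent; True => possibly present (false positives)
        return all(self.bits[p] for p in self._positions(value))

# One filter per indexed attribute per cluster; a query reads the cluster from
# disk only when every tested bit is set.
owner_filter = BloomFilter()
owner_filter.add(3047)
if owner_filter.may_contain(3047):
    pass  # search this cluster's K-D tree
```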

5.2.4 Cluster-Based Journaling

Search-optimized systems organize data so that it can be read and queried as quickly as possible, often causing update performance to suffer [1]. This is a difficult problem in a search-optimized file system because updates are very frequent, particularly in file systems with hundreds of millions of files. Moreover, metadata must be kept safe, requiring synchronous updates. Magellan's design complicates efficient updates in two ways. First, clusters are too large to be written to disk every time they are modified. Second, K-D trees are in-memory structures; information cannot be inserted into the middle of the serialized stream on disk. To address this, Magellan uses a cluster-based journaling technique that writes updates safely to an on-disk journal and updates the in-memory cluster, but delays writing the cluster back to its primary on-disk location. This technique provides three key advantages. First, updates in the journal are persistent across a crash because they can be replayed. Second, metadata updates are indexed and can be searched in real time. Third, update operations are fast because disk writes are mostly sequential journal writes that need not wait for the cluster to be written. This approach differs from most journaling file systems, which use the journal as a temporary staging area and write metadata back to its primary disk location shortly after the update is journaled [128, 147]. In Magellan, the journal is a means to recreate the in-memory state after a crash; thus, update performance is closer to that of a log-structured file system [138]. Cluster-based journaling achieves good disk utilization: writes are either streaming sequential writes to the journal or large sequential cluster writes. Since clusters are managed by the cluster cache, Magellan can exploit the temporal locality in workloads shown in Section 3.3, keeping frequently-updated clusters in memory and updating them on disk only when they become "cold". This approach also allows many metadata operations to be reflected in a single cluster optimization and write, and allows many journal entries to be freed at once, further improving efficiency. While cluster-based journaling provides several nice features, it does have trade-offs. Since metadata updates are not immediately written to their primary on-disk location, the journal can grow very large, because journal entries are not trimmed until the cluster is committed to disk.

Magellan allows the journal to grow to hundreds of megabytes before requiring that clusters be flushed. A larger journal requires more memory as well as more journal replay time in the event of a crash. Since journal writes are a significant part of update performance, staging the journal in non-volatile memory such as MRAM or phase-change memory could significantly boost performance if the hardware cost can be justified. In addition to requiring memory space for the journal, delaying cluster writes and performing them in the background can still impact foreground workloads: an update to a cluster that is being flushed to disk blocks until the flush completes, and since flushing a cluster often requires both a rebalance and a large disk write, this latency can be quite high. It is possible to lazily apply the update after the flush and let the update return before the flush completes, though this requires extra data structures and processing. One benefit of designing Magellan for use in a parallel file system like Ceph is that bandwidth is abundant, so cluster writes and journal writes can proceed simultaneously without competing for I/O bandwidth.
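
The following sketch shows the shape of cluster-based journaling, with assumed structures and names: updates are synchronously appended to the journal, the in-memory cluster is modified, and the journal entries are trimmed only after the cluster's K-D tree is written back.

```python
# A sketch (assumed structure, not the prototype's code) of cluster-based
# journaling: appends are synced to an on-disk journal; the serialized cluster
# is written back lazily, and only then are its entries trimmed.
import json, os

class ClusterJournal:
    def __init__(self, path):
        self.log = open(path, "a")
        self.pending = {}                   # cluster id -> journal entries

    def record(self, cluster_id, op, inode):
        entry = json.dumps({"cluster": cluster_id, "op": op, "inode": inode})
        self.log.write(entry + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())         # update survives a crash
        self.pending.setdefault(cluster_id, []).append(entry)

    def trim(self, cluster_id):
        # Called after the cluster's rebalanced K-D tree reaches its primary
        # on-disk location; a real system would also reclaim journal space.
        self.pending.pop(cluster_id, None)

    def replay(self):
        # After a crash, re-read the journal to rebuild in-memory clusters.
        self.log.flush()
        with open(self.log.name) as f:
            return [json.loads(line) for line in f]
```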

5.3 Integrating Semantic File System Functionality

Thus far, our discussion of index and searchable file system design has focused on hierarchical namespaces. We highlighted the key limitations of these namespaces in Section 2.3. While our new index and file system designs enable more efficient search, which improves their effectiveness, hierarchical namespaces are not a well-suited long-term solution. Instead, it is often argued that search-based namespaces are a more effective way to manage billions of files [47, 148, 170]. In these namespaces, search is the primary access method and files are organized based on their attributes and search queries. File systems that implement a search-based namespace are generally called semantic file systems. Semantic file systems are generally designed as a naming layer above a traditional file system or database [63, 69, 124], as we discussed in Section 2.5. While providing users with a better interface to storage, this design is not very scalable because it changes only how files are presented to users, not how they are stored and indexed. As a result, current designs suffer

from many of the index and file system design limitations that we have already discussed and are not able to scale effectively. In this section we outline the architecture of Copernicus, a semantic file system that uses new internal organization and indexing techniques to improve performance and scalability. The architecture that we propose aims to achieve several goals:

Flexible naming. The main drawback of current hierarchical file systems is their inability to allow flexible, semantic access to files. Files should be accessible through their attributes and relationships; thus, the file system must efficiently extract or infer the necessary attributes and relationships and index them in real time.

Dynamic navigation. While search is extremely useful for retrieval, users still need a way to navigate the namespace. Navigation should be more expressive than just parent → child hierarchies, should allow dynamically changing (or virtual) directories, and need not be acyclic. Relationships should be allowed between two files, rather than only between directories and files.

Scalability. Large file systems are the most difficult to manage, making it critical that both search and I/O performance scale to billions of files. Effective scalability requires fine-grained control of file index structures so that disk layouts and memory utilization can be matched to workloads.

Backwards compatibility. Existing applications rely on hierarchical namespaces. It is critical that new file systems support legacy applications to facilitate migration to a new paradigm.

5.3.1 Copernicus Design

Copernicus is designed as an object-based parallel file system so that it can achieve high scalability by decoupling the metadata and data paths and allowing parallel access to storage devices. However, Copernicus's techniques are applicable to a wide range of architectures. As mentioned in Section 2.2, object-based file systems consist of three main components: clients, a metadata server cluster (MDS), and a cluster of object-based storage devices (OSDs). Clients perform file I/O directly with OSDs. File data is placed and located on OSDs using a pseudo-random hashing algorithm [181]. Metadata and search requests are submitted to the

MDS, which manages the namespace. Thus, most of the Copernicus design is focused on the MDS. Copernicus achieves a scalable, semantic namespace using several new techniques as well as techniques derived from our previous Magellan and Spyglass designs. A dynamic graph-based index provides file metadata and attribute layouts that enable scalable search, as shown in Figure 5.4. Files that are semantically similar and likely to be accessed together are grouped into clusters, which are similar to traditional directories and form the vertices of the graph. These clusters are similar to the clusters Magellan uses in Section 5.2, but they are not grouped based on the hierarchical namespace. Inter-file relationships, such as provenance [114, 149] and temporal access patterns [155], create edges between files that enable semantic navigation. Directories are instead "virtual," instantiated by queries. Backwards naming compatibility can be enabled by creating a hierarchical tree from the graph. Clusters store metadata and attributes in search-optimized index structures; using search indexes as the native storage mechanism allows Copernicus to be searched without additional search applications. Finally, a new journaling mechanism allows file metadata modifications to be written quickly and safely to disk while still providing real-time index updates. Before we discuss specific design details, we present some examples of how Copernicus can improve how files are managed in large-scale file systems.

Understanding file dependencies. Consider a scientist running an HPC DNA sequencing application. To interpret the results, it is useful to know how the data is being generated. As the experiment runs, Copernicus allows the results to be searched in real time. If a compelling result is found, a virtual directory can be created using a query for all files from past experiments with similar results. By searching the provenance links of those files, the scientist can find which DNA sequencing libraries or input parameters are the common factor for all of the result files.

System administration. Imagine a storage administrator who discovers a serious bug in a script that has affected an unknown number of files. To locate and fix these files, the administrator can search provenance relationships to find the contaminated files (e. g., files opened by the script) and build a virtual directory containing them. A corrected version of the script can then be run over the files in this directory to quickly undo the erroneous changes.

Figure 5.4: Copernicus overview (in-memory structures and their on-disk layout). Clusters, shown in different colors, group semantically related files. Files within a cluster form a smaller graph based on how the files are related. These links, and the links between clusters, create the Copernicus namespace. Each cluster is relatively small and is stored in a sequential region on disk for fast access.

Finding misplaced files. Consider a user working on a paper about file system search and looking for related work. The user recalls reading an interesting paper while working on "motivation.tex" but does not know the paper's title or author. Using temporal links and metadata, a virtual directory can be constructed of all files that were accessed at the same time as "motivation.tex", are PDFs, and contain "file system" and "search". The directory allows the user to easily browse the results.

5.3.2 Graph Construction

Copernicus uses a graph-based index to provide a metadata and attribute layout that can be efficiently searched. The graph is managed by the MDS. Each file is represented by an inode and is uniquely identified by its inode number. Inodes and their associated attributes (content keywords and relationships) are grouped into physical clusters based on their semantic similarity. Clusters are like directories in that they represent a physical grouping of related files likely to be accessed together, in the same way that file systems try to keep files adjacent to their containing directory on disk. This grouping provides a flexible, fine-grained way to

access and control files. However, unlike directories, cluster groupings are semantic rather than hierarchical and are transparent to users: clusters only provide physical organization for inodes. Given a file's inode number, a pseudo-random placement algorithm, CRUSH [181], identifies the locations of the file's data on the OSDs, meaning data pointers are not stored within the inode. Inodes are grouped into clusters using clustering policies, which define their semantic similarity. Clustering policies may be set by users, administrators, or Copernicus itself, and can change over time, allowing layouts to adjust to current access patterns. Inodes may move between clusters as their attributes change. Example clustering policies include clustering files for a common project (e. g., files related to an HPC experiment), grouping files with shared attributes (e. g., files owned by Andrew or all virtual machine images), or clustering files with common access patterns (e. g., files often accessed in sequence or in parallel). Previous work has used Latent Semantic Indexing (LSI) as a policy to group related files [80]. In Copernicus, a file may reside in only one cluster because maintaining multiple active replicas makes synchronization difficult. Clusters are kept relatively small, around 10^5 files, to ensure fast access to any one cluster; thus, a large file system may have 10^4 or more clusters. Copernicus creates a namespace using the semantic relationships that exist between files. Relationships are directed and are represented as triples of the form ⟨relationship type, source file, target file⟩, and can define any kind of relationship. Relationship links may exist between files within the same cluster or in different clusters, as illustrated in Figure 5.4. The graph need not be acyclic, permitting more flexible relationships. Relationship links are created implicitly by Copernicus depending on how files are used, and can also be created explicitly by users and applications. Unlike in a traditional file system, links exist only between two files; directories in Copernicus are "virtual" and simply represent the set of files matching a search query. These virtual directories are represented by an inode in the graph, and the associated query is stored in the data region on the OSDs pointed to by the inode (similar to how directory entries are pointed to by the directory's inode in traditional file systems). A virtual directory's query is periodically re-evaluated to ensure the results are not too stale. Users can also force the file system to execute

the query again at any time. Combining virtual directories and links provides an easy way to represent related files and navigate the namespace, as well as providing backwards compatibility.
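
As a small illustration of these namespace primitives, the sketch below models directed relationship triples and a virtual directory defined by a stored query; all names and values are hypothetical.

```python
# A sketch of Copernicus namespace primitives: directed relationship triples
# between files and a "virtual" directory materialized by a stored query.
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

Relationship = Tuple[str, int, int]        # (type, source inode, target inode)

@dataclass
class VirtualDirectory:
    query: Callable[[dict], bool]          # predicate over inode attributes
    cached: Optional[List[int]] = None     # last materialized result set

    def evaluate(self, inodes):
        # Re-run the stored query; Copernicus does this periodically or when
        # the user forces a refresh.
        self.cached = [i["ino"] for i in inodes if self.query(i)]
        return self.cached

relationships: Set[Relationship] = {
    ("provenance", 18921, 18256),          # 18921 was derived from 18256
    ("temporal",   18921, 20417),          # accessed around the same time
}

recent_pdfs = VirtualDirectory(
    query=lambda i: i["type"] == "PDF" and i["mtime"] > 900)

print(recent_pdfs.evaluate([
    {"ino": 18921, "type": "PDF", "mtime": 950},
    {"ino": 20417, "type": "TXT", "mtime": 980},
]))                                        # -> [18921]
```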

5.3.3 Cluster Indexing

Files in Copernicus may be retrieved using their metadata, content, or relationship attributes. Each cluster stores these attributes in separate search-optimized index structures, improving efficiency by allowing files to be searched without a separate application. File metadata is represented as ⟨attribute, value⟩ pairs and includes simple POSIX metadata and extended attributes. Metadata is indexed in an in-memory, multi-dimensional binary search tree called a K-D tree [24], which we also used in our Spyglass and Magellan designs. Since clusters are relatively small, each K-D tree can often be stored in a sequential region on disk. This layout, which is similar to embedded inodes [57], provides fast read access and prefetching of related metadata. Relationship attributes are also stored in a K-D tree, with three dimensions for the three fields of the relationship triple; this allows any combination of the triple's fields to be queried. If a relationship exists between files in different clusters, the cluster storing the source file's inode indexes the relationship, to prevent duplication. Each cluster stores full-text keywords, which are extracted from its files' contents using application-specific transducers, in its own inverted index. This design allows keyword search at the granularity of clusters and helps keep posting lists small so that they can be kept sequential on disk. A global indirect index, which we introduced in Section 4.2, is used to identify which clusters contain posting lists for a keyword. As mentioned previously, an indirect index consists of a keyword dictionary in which each keyword entry points to a list of ⟨cluster, weight⟩ pairs, allowing the MDS to quickly identify the clusters most likely to contain an answer to a query and to rule out those clusters that cannot satisfy it.

5.3.4 Query Execution

All file accesses (e. g., open() and stat()) translate to queries over the Copernicus graph index. While navigation can be done using graph traversal algorithms, queries must also be able to identify the clusters containing files relevant to the search. Since semantically related files are clustered in the namespace, it is very likely that the vast majority of clusters do not need to be searched; we showed this to be the case in Section 4.1, despite only modest semantic clustering. Additionally, Copernicus employs an LRU-based caching algorithm to ensure that queries for hot or popular clusters do not go to disk. For file metadata and relationships, Copernicus identifies relevant clusters using Bloom filters [26]: bit arrays with associated hashing functions that compactly describe the contents of a cluster. Bloom filters provide functionality similar to that of the Bloom filters in Magellan and the signature files in Spyglass. Each cluster maintains a Bloom filter for each metadata attribute that it indexes. In addition, each cluster maintains three Bloom filters that describe the relationships it contains, one for each field of the ⟨relationship type, source file, target file⟩ relationship triple. When a cluster stores a metadata or relationship attribute, it hashes the value to a bit position in a Bloom filter, which is then set to one. To determine whether a cluster contains any files related to a query, the values in the query are also hashed to bit positions, which are then tested; if, and only if, all tested bits are set to one is the cluster read from disk and searched. To ensure fast access, Bloom filters are kept in memory; to allow this, each is kept small, with 10^3 to 10^5 bit positions per filter. While false positives can occur when two values hash to the same bit position, the only effect is that a cluster is searched when it does not contain files relevant to the query, degrading search performance but not impacting accuracy. The sheer number of possible keywords occurring in file content makes Bloom filters ineffective for keyword search. However, the indirect index allows fast identification of the clusters containing posting lists for the query keywords. For each keyword in the query, the list of clusters containing the keyword is retrieved. Assuming Boolean search, the lists are then intersected, producing the set of clusters that appear in all lists. Only the posting lists from the clusters in this set are retrieved and searched. The weights can be used to further

optimize query processing, first searching the clusters that are most likely to contain the desired results.
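
A sketch of this keyword routing step, using made-up data: the indirect index maps each keyword to (cluster, weight) pairs, a Boolean AND query intersects the cluster lists, and the summed weights order the clusters most likely to contain results first.

```python
# A sketch of keyword query routing through a global indirect index; the
# sample index contents are fabricated for illustration.
indirect_index = {
    "file":   [("cluster-12", 0.8), ("cluster-31", 0.4), ("cluster-77", 0.1)],
    "search": [("cluster-12", 0.9), ("cluster-77", 0.3)],
}

def clusters_for_query(keywords):
    """Return candidate clusters for a Boolean AND query, best weight first."""
    candidate_sets, weights = [], {}
    for kw in keywords:
        postings = indirect_index.get(kw, [])
        candidate_sets.append({c for c, _ in postings})
        for c, w in postings:
            weights[c] = weights.get(c, 0.0) + w
    if not candidate_sets:
        return []
    clusters = set.intersection(*candidate_sets)
    return sorted(clusters, key=lambda c: weights[c], reverse=True)

print(clusters_for_query(["file", "search"]))   # ['cluster-12', 'cluster-77']
```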

5.3.5 Managing Updates

Copernicus must effectively balance search and update performance, provide real-time index updates, and provide the data safety that users expect. Copernicus uses an approach similar to Magellan's cluster-based journaling for managing metadata and relationship updates, and a client-assisted approach for managing content keywords. When file metadata or relationships are created, removed, or modified, the update is first written safely to a journal on disk; journaling updates first gives Copernicus the needed data safety in case of a crash. The K-D tree containing the file's inode or relationship information is then modified and marked as dirty in the cache, reflecting changes in the index in real time. When a cluster is evicted from the cache, the entire K-D tree is written sequentially to disk and its entries are removed from the journal. Copernicus allows the journal to grow up to hundreds of megabytes before it is trimmed, which helps amortize multiple updates into a single disk write. As mentioned in Section 5.2, K-D trees do not efficiently handle frequent inserts and modifications. Inserting new inodes into the tree can cause it to become unbalanced, degrading search performance; as in Magellan, K-D trees are therefore re-balanced before they are written to disk. Also, an inode modification first requires the original inode to be removed and then a new inode to be inserted. Both of these operations are fast compared to writing to the journal, but since disk speed dictates update performance, storing the journal in NVRAM could significantly boost performance. Clients write file data directly to OSDs. When a file is closed, Copernicus reads the file's data from the OSDs and uses a transducer to extract keywords. To aid this process, clients submit a list of write offsets and lengths to the MDS when they close a file. These offsets tell the MDS which parts of the file to analyze and can greatly improve performance for large files. Cluster posting lists are then updated with the extracted keywords.

Since cluster posting lists are small, an in-place update method [96] can be used, ensuring that they remain sequential on disk.
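
A sketch of the close-time extraction path, with read_object() and extract_keywords() as assumed placeholders: the client-reported write extents let the MDS analyze only the modified regions of the file.

```python
# A sketch of keyword extraction at close time; read_object() and
# extract_keywords() are hypothetical placeholders, not real APIs.
def on_close(ino, write_extents, read_object, extract_keywords):
    """Extract keywords only from the byte ranges the client reports writing."""
    keywords = set()
    for offset, length in write_extents:
        data = read_object(ino, offset, length)   # fetch only modified regions
        keywords |= set(extract_keywords(data))
    return keywords                               # merged into posting lists
```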

5.4 Experimental Results

In this section we evaluate the performance of our searchable file system designs. To do so, we evaluate a prototype implementation of Magellan. Our Copernicus design is still in its early stages, and a number of practical design questions must be addressed before it can be evaluated; however, since the two share a number of design features, this evaluation is also relevant to Copernicus. Our current evaluation seeks to examine the following questions: (1) How does Magellan's metadata indexing impact performance? (2) How does our journaling technique affect metadata updates? (3) Does metadata clustering improve disk utilization? (4) How does our prototype's metadata performance compare to other file systems? (5) What kind of search performance is provided? Our evaluation shows that Magellan can search millions of files, often in under a second, while providing performance comparable to other file systems for a variety of workloads.

5.4.1 Implementation Details

We implemented our prototype as the metadata server (MDS) for the Ceph parallel file system [180], for several reasons. First, parallel file systems often handle metadata and data separately [34, 180]: metadata requests are handled by the MDS while data requests are handled by separate storage devices, allowing us to focus solely on MDS design. Second, Ceph targets the same large-scale, high-performance systems as Magellan. Third, data placement is done with a separate hashing function [181], freeing Magellan from the need to perform data block management. Like Ceph, our prototype is a Linux user process that uses a synchronous file in a local file system for persistent storage. In our prototype, each cluster has a maximum of 2,000 directories and a soft limit of 20,000 inodes, keeping them fast to access and query. We discuss the reasoning behind these numbers later in this section. K-D trees were implemented using libkdtree++ [97], version

Attribute   Description
ino         inode number
pino        parent inode number
name        file name
type        file or directory
size        file size
mtime       modification time
ctime       change time
atime       access time
owner       file owner
group       file group
mode        file mode

Table 5.1: Inode attributes used. The attributes that inodes contain in our experiments.

0.7.0. Each inode has eleven attributes that are indexed, listed in Table 5.1. Each Bloom filter is about 2 KB in size, small enough to represent many attribute values while not using significant amounts of memory. The hashing functions we use for the file size and time attributes allow bits to correspond to ranges of values. Each cluster's inode cache is around 10 KB in size, which can cache pointers to at most roughly 10% of the cluster's inodes. Given the maximum cluster size, clusters generally contain between one and two MB of inode metadata. When a cluster's inode cache is full, the ratio of memory used for inode metadata to the metadata describing the cluster (e. g., Bloom filters, inode caches) is roughly 40:1. However, inode caches hold pointers only for the most recently accessed inodes, which means they generally use much less than the full 10 KB, and they contain pointers only while the cluster is in memory. While our prototype implements most metadata server functionality, a few features are not yet implemented, among them hard and symbolic links, handling of client cache leases, and metadata replication. None of these presents a significant implementation barrier, and none should significantly impact performance. All of our experiments were performed on an Intel Pentium 4 machine with dual 2.80 GHz CPUs and 3.1 GB of RAM. The machine ran CentOS 5.3 with Linux kernel version 2.6.18. All data was stored on a directly attached Maxtor ATA 7Y250M0 7200 RPM disk.
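
One plausible way to realize the range-oriented hashing mentioned above is to quantize size and time values into buckets before hashing, so that a single Bloom-filter bit covers a range of values; the bucket widths below are assumptions, not the prototype's parameters.

```python
# A sketch of range-bucketed hashing for Bloom-filter bits: values are
# quantized before hashing so one bit covers a range of sizes or times, and a
# range predicate can be tested against those buckets. Widths are assumed.
import math

def size_bucket(size_bytes):
    return int(math.log2(size_bytes + 1))      # power-of-two size classes

def mtime_bucket(mtime_secs, day=86400):
    return mtime_secs // day                   # one bucket per day

# e.g., add size_bucket(st_size) to the size filter instead of the raw size;
# a query for "size >= 1 MB" then tests every bucket >= size_bucket(2**20).
```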

5.4.2 Microbenchmarks

We begin by evaluating the performance of individual Magellan components using microbenchmarks.

5.4.2.1 Cluster Indexing Performance

We evaluated the update and query performance of a single cluster in order to understand how indexing metadata in a K-D tree affects performance. Figure 5.5(a) shows the latencies for creating and querying files in a single cluster as the cluster size increases. Results are averaged over five runs, with standard deviations shown. We randomly generated files because different file systems have different attribute distributions that can unbalance the K-D tree and bias results in different ways [4]. The queries we used were range queries over between two and five attributes. We measured query latencies in a balanced and an unbalanced K-D tree, as well as a brute force traversal. Querying an unbalanced K-D tree is 5-15× faster than a brute force traversal, which is already a significant speed-up for just a single cluster. Unsurprisingly, brute force traversal scales linearly with cluster size; in contrast, K-D tree query performance scales mostly sub-linearly. However, it is clear that K-D tree organization impacts performance: some queries in a tree with 70,000 files are 10% slower than queries across 140,000 files. A balanced cluster provides a 33-75% query performance improvement over an unbalanced cluster. However, when storing close to 200,000 files, queries can still take longer than 10 ms. While this performance may be acceptable for "real" queries, it is too slow for many metadata look ups, such as path resolution. Below 50,000 files, however, all queries require only hundreds of microseconds, assuming the cluster is already in memory. The slow performance at large cluster sizes demonstrates the need to keep cluster sizes limited. While an exact match query in a K-D tree (i. e., one in which all indexed metadata values are known in advance) takes O(log N) time, such queries are rarely useful because it is rarely the case that all metadata values are known prior to accessing a file. Instead, many queries are range queries that use fewer than k dimensions. These queries require O(kN^(1-1/k))

[Figure 5.5: latency (sec) versus cluster size (in files) for (a) query and insert performance (unbalanced query, balanced query, traversal, insert) and (b) write and rebalance performance.]

Figure 5.5: Cluster indexing performance. Figure 5.5(a) shows the latencies for balanced and unbalanced K-D tree queries, brute force traversal, and inserts as cluster size increases. A balanced K-D tree is the fastest to search, and inserts are fast even in larger clusters. Figure 5.5(b) shows latencies for K-D tree rebalancing and disk writes. Rebalancing is slower because it requires O(N log N) time.

time, where N is the number of files and k is the dimensionality, meaning that performance increasingly degrades with cluster size. While hundred-microsecond latencies are acceptable, Magellan further improves performance using the hash-table-based inode cache that each cluster maintains for recently accessed inodes. In contrast to query performance, insert performance remains fast as cluster size increases. The insert algorithm is similar to the exact match query algorithm, requiring only O(log N) time to complete. Even for larger K-D trees, inserts take less than 10 µs. The downside is that each insert makes the tree less balanced, degrading performance for subsequent queries until the tree is rebalanced. Thus, while inserts are fast, there is a hidden cost paid in slower queries and in having to rebalance the tree later. Figure 5.5(b) shows latencies for writing a cluster to disk and for rebalancing, the two major steps performed when a dirty cluster is written to disk. Each inode is roughly 100 bytes in size, meaning a cluster with 50,000 files is close to 5 MB in size. Surprisingly, rebalancing is the more significant of the two steps, taking 3-4× longer than writing the cluster to disk. The K-D tree rebalancing algorithm takes O(N log N) time, which accounts for this difference. However, even if we did not rebalance the K-D tree prior to flushing it, K-D tree writes are not fast enough to be performed synchronously when metadata is updated, as they can take tens to hundreds of milliseconds. Since a K-D tree is always written asynchronously, its performance does not affect user operation latencies, though it can impact server CPU utilization.

5.4.2.2 Update Performance

To evaluate how cluster-based journaling impacts update performance, we used a benchmark that creates between 100,000 and 2,000,000 files and measured throughput at various sizes. To do this, we used the metadata traces that we collected from three storage servers deployed at NetApp and studied in Section 3.7. We used different traces because each has a different namespace organization that impacts performance [4] (e. g., a few very large directories versus many small directories). The servers were used by different groups within NetApp: a web server (Web), an engineering build server (Eng), and a home directory server

[Figure 5.6: throughput (ops/sec) versus number of files for the Web, Eng, and Home traces.]

Figure 5.6: Create performance. The throughput (creates/second) is shown for various system sizes. Magellan's update mechanism keeps create throughput high because disk writes are mostly to the end of the journal, which yields good disk utilization. Throughput drops slightly at larger sizes because more time is spent searching clusters.

(Home). Files were inserted in the order in which they were crawled; since multiple threads were used in the original crawl, the traces interleave around ten different crawls, each performing a depth-first traversal. Figure 5.6 shows the throughput, averaged over five runs with standard deviations, as the number of creates increases. We find that, in general, throughput is very high, between 1,500 and 2,500 creates per second, because of Magellan's cluster-based journaling. This throughput is higher than recently published numbers for comparable parallel file system metadata servers on comparable hardware: Ceph achieves around 1,000 creates per second [180] and Panasas achieves between 800 and 1,600 creates per second on various hardware setups [183]. Each create appends an update entry to the on-disk journal and then updates the in-memory K-D tree. Since the K-D tree write is delayed, this cost is paid later as the benchmark streams largely sequential updates to disk. However, create throughput drops slightly as the number of files in a cluster increases because the K-D tree itself is larger. While only a few operations experience latency increases from waiting for a K-D tree to be flushed to disk, larger K-D trees also cause more inode cache misses, more Bloom filter false positives, and longer query latencies, thus increasing create latencies (e. g., because a file creation operation must check whether the file already

exists). In most cases, checking the Bloom filters is sufficient to determine whether a file already exists; again, though, the higher rate of false positives causes more K-D tree searches. Since inode caches contain only the most recently accessed files, they cannot prevent K-D tree queries when Bloom filters yield false positives. There are only a limited number of instances where a create operation waits for a cluster being written back to disk, because the create benchmark usually updates clusters in a linear fashion and accesses other clusters only for path resolution; thus, it is unlikely that a cluster will be accessed while it is being flushed.

5.4.2.3 Metadata Clustering

We next examined how different maximum cluster sizes affect performance and disk utilization. To do this, we evaluated Magellan's create and query throughput as its maximum cluster size increases. The maximum cluster size is the size at which Magellan tries to cap clusters. If a file is inserted, it is placed in the cluster of its parent directory, regardless of size. For a directory, however, Magellan creates a new cluster if the cluster has too many directories or total inodes. Maximum cluster size refers to the maximum inode limit; we set the maximum directory limit to one tenth of that. Figure 5.7(a) shows the total throughput for creating 500,000 files from the Web trace, averaged over five runs, as the maximum cluster size varies from 500 to 40,000 inodes. As the figure shows, create throughput steadily decreases as maximum cluster size increases. While the throughput at cluster size 500 is around 2,800 creates per second, at cluster size 40,000, an 80× increase, throughput drops by roughly 50%. Disk utilization is not the issue, since both use mostly sequential disk writes. Rather, the decrease is primarily due to operating on larger K-D trees. Smaller clusters have more effective inode caching (less data to cache per K-D tree) and Bloom filters (fewer files yielding fewer false positives). Additionally, queries on smaller K-D trees are faster. Since journal writes and K-D tree insert performance do not improve with cluster size, a larger maximum cluster size has little positive impact. Figure 5.7(b) shows that query performance scales quite differently from create performance. We used a simple query that represents a user searching for a file she owns with a particular name (e. g., filename equal to mypaper.pdf and owner id equal to 3704). We

[Figure 5.7: (a) throughput (creates/sec) and (b) throughput (queries/sec) versus maximum cluster size (in files).]

Figure 5.7: Metadata clustering. Figure 5.7(a) shows create throughput as maximum cluster size increases. Performance decreases with cluster size because inode caching and Bloom filters become less effective and K-D tree operations become slower. Figure 5.7(b) shows that query performance is worse at both small and large cluster sizes.

Name             Application           Metadata Operations
Multiphysics A   Shock physics         70,000
Multiphysics B   Shock physics         150,000
Hydrocode        Wave analysis         210,000
Postmark         E-Mail and Internet   250,000

Table 5.2: Metadata workload details.

find that query throughput increases 7-8× as the maximum cluster size varies from 500 to 25,000. When clusters are small, metadata clustering is not as helpful because many disk seeks may still be required to read the needed metadata; as clusters get larger, disk utilization improves. However, throughput decreases 15% when the maximum cluster size increases from 30,000 to 40,000 files. When clusters are too large, time is wasted reading unneeded metadata, which can also displace useful information in the cluster cache. In addition, larger K-D trees are slower to query. The "sweet spot" appears to be around 20,000 files per cluster, which we use as our prototype's default and which works well for our experiments. While the precise location of this sweet spot will vary between workloads, the general trend that we observe should be consistent across workloads.

5.4.3 Macrobenchmarks

We next evaluated general file system and search performance using a series of macrobenchmarks.

5.4.3.1 File System Workload Performance

We compared our prototype to the original Ceph MDS using four different application workloads. Three workloads are HPC application traces from Sandia National Laboratory [143] and the other is the Postmark [84] benchmark. Table 5.2 provides additional workload details. We used HPC workloads because they represent performance-critical applications. Postmark was chosen because it presents a more general workload and is a commonly used benchmark.

[Figure 5.8: run time (s) for Magellan and Ceph on each application workload.]

Figure 5.8: Metadata workload performance comparison. Magellan is compared to the Ceph metadata server using four different metadata workloads. In all cases, the two provide comparable performance. Performance differences are often due to K-D tree utilization.

            Multiphysics A   Multiphysics B   Hydrocode   Postmark
Magellan    3041             2455             10345       1729
Ceph        2678             2656             14187       1688

Table 5.3: Metadata throughput (ops/sec). The Magellan and Ceph throughput for the four workloads.

While the benchmarks are not large enough to evaluate all aspects of file system performance (many common metadata benchmarks are not [174]), they are able to highlight some important performance differences. We modified the HPC workloads to ensure that directories were created before they were used. We used Postmark version 1.51 and configured it to use 50,000 files, 20,000 directories, and 10,000 transactions. All experiments were performed with cold caches. Figure 5.8 shows the run times and standard errors averaged over five runs, with the corresponding throughputs shown in Table 5.3. Total run times are comparable for both systems, showing that Magellan is able to achieve file system performance similar to the original Ceph MDS. However, performance varies between the two: our prototype ranges from 13% slower than Ceph to 12% faster. As in the previous experiments, a key reason for performance decreases was K-D tree performance. The Multiphysics A and Multiphysics B traces have very similar distributions of operations, though Multiphysics B creates about twice as many

Set     Query                                                                    Metadata Attributes
Set 1   Where is this file located?                                              Retrieve files using filename and owner.
Set 2   Where, in this part of the system, is this file located?                 Use query 1 with an additional directory path.
Set 3   Which of my files were modified near this time and at least this size?   Retrieve files using owner, mtime, and size.

Table 5.4: Query Sets. A summary of the searches used to generate our evaluation query sets.

files. Table 5.3 shows that between the two, our prototype's throughput drops by about 20%, from about 3,000 operations per second to 2,500, while Ceph's throughput remains nearly constant. This 20% overhead is spent almost entirely doing K-D tree searches. Our prototype yields a 12% performance improvement over Ceph for the Postmark workload. Postmark creates a number of files consecutively, which benefits from cluster-based journaling; throughput for these operations can be up to 1.5-2× faster than Ceph. Additionally, the ordered nature of the workload produces good inode cache hit ratios (path resolution look ups frequently hit the inode cache because these inodes were recently created). These workloads show differences and limitations (e. g., large K-D tree performance) of our design, though they indicate that it can achieve good file system performance. A key reason for this is that, while Magellan makes a number of design changes, it keeps the basic metadata structure (e. g., using directory and file inodes and organizing inodes into a physical hierarchy). This validates an important goal of our design: address issues with search performance while maintaining the many aspects that current metadata designs already do well.

5.4.3.2 Search Performance

To evaluate search performance, we created three file system images using the Web, Eng, and Home metadata traces, with two, four, and eight million files, respectively. The cluster cache size was set to 20, 40, and 80 MB for the respective images so that searches are not performed solely in memory. Before running the queries, we warmed the cluster cache with random cluster data. Unfortunately, there are no standard file system search benchmarks. Instead, we generated synthetic query sets based on queries that we believe represent common metadata search use cases. The queries and the attributes used are given in Table 5.4.

[Figure 5.9: CDFs of query latency (s) versus fraction of total queries for the three query sets on (a) the web server, (b) the engineering server, and (c) the home directory server traces.]

Figure 5.9: Query execution times. A CDF of query latencies for our three query sets. In most cases, query latency is less than a second, even as system size increases. Query set 2 performs better than query set 1 because it includes a directory path from which Magellan begins the search, which rules out files not in that sub-tree from the search space.

[Figure 5.10: CDF of the percent of clusters queried versus fraction of total queries for the Web, Eng, and Home traces.]

Figure 5.10: Fraction of clusters queried. A CDF of the fraction of clusters queried for query set 1 on our three data sets is shown. Magellan is able to leverage Bloom filters to exploit namespace locality and eliminate many clusters from the search space.

Query attributes are populated with random data from the traces, which allows the queries to follow the same attribute distributions as the data sets while providing some randomness. Query sets 1 and 2 produced very few search results (generally around one to five), while query set 3 could yield thousands. Figure 5.9 shows the cumulative distribution functions (CDFs) for our query sets run over the three traces. Our prototype achieves search latencies of less than a second in most cases, even as file system size increases. In fact, all queries across all traces completed in under six seconds, with the exception of several in the Home trace that took between eight and fifteen seconds. Evaluating query set 1 using brute force search, the only option if no separate search application is available, took 20 minutes on the Web trace and 80 minutes on the Home trace on a local ext3 file system on the same hardware configuration. Compared to Magellan, brute force search is up to 4-5 orders of magnitude slower. In addition to the search-optimized on-disk layout and inode indexing, Magellan is able to leverage namespace locality by using Bloom filters to eliminate a large fraction of the clusters from the search space. Figure 5.10 shows a CDF of the fraction of clusters accessed for queries from query set 1 on all three of our traces. We see that 50% of queries access fewer than 40% of the clusters in all traces. Additionally, over 80% of queries access fewer than 60% of the

clusters. Query set 2 includes a directory path from which to begin the search, which explicitly limits the search space; as a result, search performance for query set 2 is consistently the fastest. However, in some cases query set 1 is faster than query set 2. This is because the queries in query set 2 use a different set of randomly selected attribute values than query set 1, which can alter run time and cache hit ratios. While queries typically run in under a second, some take longer. For example, latencies are mostly between 2 and 4 seconds for query set 1 on our Web data set in Figure 5.9(a). In these cases, many of the clusters are accessed from disk, which increases latency: the Web trace contained many common file names (e. g., index.html and banner.jpg) that were spread across the file system. We believe these experiments show that Magellan is capable of providing search performance fast enough to allow metadata search to be a primary way for users and administrators to access and manage their files.

5.5 Summary

The rapid growth in data volume is changing how we access and manage our files. Large-scale systems increasingly require search to better locate and utilize data. While search applications that are separate from the file system are adequate for small-scale systems, they have inherent limitations when used as large-scale, long-term file search solutions. We believe that a better approach is to build search functionality directly into the file system itself. In this chapter, we examined the hypothesis that it is possible to enable effective search performance directly within the file system without sacrificing file system performance. We presented the design of a new file system metadata architecture called Magellan that enables metadata to be efficiently searched while maintaining good file system performance. Unlike previous solutions that relied on relational databases, Magellan uses several novel search-optimized metadata layout, indexing, and update techniques. Additionally, we outlined the architecture of Copernicus, a scalable, semantic file system. Unlike previous semantic file systems, which are designed as naming layers on top of a normal file system or database, Copernicus uses a novel graph-based index to enable a scalable search-based namespace and namespace navigation using inter-file relationships.

155 tion using inter-file relationships. Using real-world data, our evaluation showed that Magellan can search over file systems with millions of files in less than a second and provide file system performance comparable to other systems. While Magellan and Copernicus’s search-optimized designs do have limitations, they demonstrates that search and file systems can be effectively combined, representing a key stepping stone in the path to enabling better ways to locate and manage our data.

Chapter 6

Future Directions

In this thesis we present a very different approach to organizing and indexing files in large-scale file systems. While we discuss some initial designs and findings that support our hypotheses, this research area is in its nascent stages and there are a number of practical trade-offs and design issues that need to be addressed. In addition, there are a number of ways in which this work can be extended in the future to further improve data management in large-scale file systems. We outline possible future work for each chapter individually.

6.1 Large-Scale File System Properties

A number of studies that could impact file system search designs have not yet been conducted. Our examination of file system workloads looked at many traditional properties, such as access sequentiality and access patterns. Many of our initial observations can be extended to provide additional useful information. For example, we found that a small fraction of clients can constitute a large portion of the workload traffic. Taking a closer look at the access patterns of individual clients, particularly the "power clients" who generate significant traffic, could provide valuable insights into how workload properties are distributed across clients and how file systems can better tailor data organization, caching, and prefetching to client needs. Also, our study found that metadata requests constitute roughly half of all requests, making a closer examination of metadata workloads important.

In particular, the temporal correlations between metadata requests (e. g., in Linux, a readdir() is commonly followed by multiple stat() operations) are highly relevant. An important area that has not been thoroughly examined is spatial namespace locality. Thus far, we have examined only how the namespace impacts metadata attribute values. It is also important to know how workload properties differ across parts of the namespace. For example, directories containing read-only system files may have very different workload properties than a directory storing database or log files. Understanding how namespace location impacts the workload can be used to improve disk layout, prefetching, and caching designs. Previous work has demonstrated potential file system performance benefits from using file type and size attributes to guide on-disk layout [190], attributes which we have shown are influenced by location in the namespace. Our index designs and other previous file systems [57, 182] assume some amount of namespace locality in the workload, but a proper characterization, such as calculating the distribution of requests across directories in the namespace, is important. Additionally, understanding namespace locality can improve how file systems organize and index data for different parts of the namespace. Our Spyglass metadata index design benefited from empirical analysis of real-world metadata properties. Similarly, large-scale web search engine designs [15] have benefited from studies of web page content and keyword properties [19, 45]. However, to the best of our knowledge, no equivalent study has been conducted for large-scale file system content keywords. The collection and analysis of file keywords would help guide future file system inverted index designs. Particular areas of interest include keyword namespace locality and keyword correlations for ranking. An important consideration for keyword collection in file systems is security, as many users will not want their data analyzed in the clear.

6.2 New Approaches to File Indexing

Our index designs are among the first meant specifically for file systems. However, there is only limited knowledge of how file system users will use search and only a limited corpus of available file search data from which to draw design decisions. As a result, there are many design trade-offs and improvements that can still be made as search becomes more ubiquitous.

In our designs we leveraged only namespace locality to partition the index. However, other partitioning mechanisms may provide additional benefits. Partitioning based on access patterns may improve cost-effectiveness. For example, files that have not been accessed in over six months may be rarely queried, since in most cases querying for a file is indicative of a future access. These files can be grouped together and possibly migrated to cheaper, lower-tier storage; a small sketch of this idea appears at the end of this section. Other mechanisms, such as machine learning clustering algorithms, classification algorithms, or provenance, may also be useful.

Another important area that we have not fully addressed is how to distribute the index across a large-scale file system. It is likely that in large-scale file systems the index will have to be distributed to achieve the needed performance and scalability. However, there are a number of trade-offs that must be considered when distributing the index, and many of these depend on the file system's architecture. For example, in a clustered NAS environment [46], it may be appropriate to build an inverted index on each file server, since each manages an entire sub-tree (e.g., a logical volume) and files are not striped across servers. However, in a parallel file system [62], it may be beneficial to centralize some parts of the index (e.g., the dictionary) at the metadata server for management purposes. Additionally, co-locating index structures with the files they index can greatly improve update performance.
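To make the access-pattern partitioning idea above concrete, the following minimal sketch splits index entries into "hot" and "cold" partitions by last access time; cold partitions could then be updated lazily or migrated to lower-tier storage. The six-month threshold, entry format, and partition names are illustrative assumptions rather than part of the Spyglass or indirect index designs.

    import time

    SIX_MONTHS = 182 * 24 * 3600   # threshold in seconds; tunable

    def partition_by_age(entries, now=None):
        """entries: iterable of (path, last_access_time) pairs."""
        now = time.time() if now is None else now
        partitions = {"hot": [], "cold": []}
        for path, last_access in entries:
            key = "cold" if now - last_access > SIX_MONTHS else "hot"
            partitions[key].append(path)
        return partitions

    entries = [("/home/a/paper.tex", time.time() - 3600),
               ("/archive/2007/logs.tgz", time.time() - 400 * 24 * 3600)]
    print(partition_by_age(entries))   # paper.tex is hot, logs.tgz is cold

In a real system the same decision rule would be applied to whole index partitions rather than individual paths, but the split itself is this simple.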

6.3 Towards Searchable File Systems

We presented the designs of two file systems that use new internal organizations to allow files to be efficiently searched. These file systems are very different from traditional designs, and thus a number of practical questions remain. The use of K-D trees in Magellan and Copernicus provided a reasonably effective method for multi-dimensional attribute search. However, K-D trees do have drawbacks, such as being an in-memory only data structure, needing to be rebalanced for better performance, and performing poorly when large. Thus, looking at how other multi-dimensional index structures can improve or enable new functionality is important. For example, FastBit [187] provides high compression ratios that trade off CPU with disk and cache utilization. Also, K-D-B-trees [136] are K-D trees that need only reside partially in memory. While this structure may negate some of the prefetching benefits of metadata clustering, it may reduce the amount of unneeded data being read from or written to disk.

Thus far our work has not looked at how to enforce file security permissions in search. Doing so is an important problem because many data sets, such as medical records and scientific results, are in need of effective search but contain highly sensitive data. As discussed in Section 2.4, enforcing security while maintaining good performance is difficult. Existing solutions either ignore permissions [66], build a separate index for each user [110], which adds significant space and update overhead, or perform permission checks for every search result (e.g., stat()), which degrades search performance and pollutes the file cache. Additionally, permission changes must be synchronously applied to the index; otherwise a security leak is possible. One approach to this problem is to embed security permission information into the metadata clustering. This approach can partition the index along permission boundaries and use this information to eliminate clusters from the search space that the user does not have permission to access, reducing the size of the search space while enforcing file permissions; a small sketch of this pruning idea appears at the end of this section.

Additionally, the dynamic graph-based index that Copernicus uses still has a number of basic questions that need to be resolved. For example, it is expected that a general graph based on inter-file relationships will perform similarly to current hierarchical graphs. However, it is unclear whether it can provide efficient update performance under normal to intense file system workloads. An effective implementation is needed to verify these ideas and understand the differences. Additionally, automatic ways to extract inter-file relationships and proper clustering attributes are needed.
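The permission-aware pruning described above can be illustrated with a small sketch: each metadata cluster records the identities (owners and groups) that can read the files it holds, and a query from a given user only descends into clusters whose identity set intersects the user's. The cluster representation and field names here are assumptions made for illustration; they are not the Magellan or Copernicus on-disk format.

    def clusters_to_search(clusters, uid, gids):
        """Return only the clusters the user could possibly read."""
        identities = {("user", uid)} | {("group", g) for g in gids}
        searchable = []
        for cluster in clusters:
            # cluster["principals"]: identities allowed to read files in this cluster
            if cluster["world_readable"] or cluster["principals"] & identities:
                searchable.append(cluster)
        return searchable

    clusters = [
        {"name": "c0", "principals": {("user", 1001)}, "world_readable": False},
        {"name": "c1", "principals": {("group", 20)},  "world_readable": False},
        {"name": "c2", "principals": set(),            "world_readable": True},
    ]
    # uid 1001 in group 100: cluster c1 is pruned from the search space
    print([c["name"] for c in clusters_to_search(clusters, 1001, [100])])

Because pruning happens before any per-file stat() calls, this approach avoids both the per-result permission checks and the per-user index copies used by existing solutions, though permission changes must still be reflected in the cluster metadata promptly.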

Chapter 7

Conclusions

The world is moving towards a digital infrastructure, creating an unprecedented need for data storage. Today's file systems must store petabytes of data across billions of files and may be storing exabytes of data and trillions of files in the near future [58]. This data storage need has introduced a new challenge: how do we effectively manage and organize such large file systems? This is a challenge because large-scale file systems organize files using a hierarchical namespace that was designed over forty years ago for file systems containing less than 10 MB [38]. This organization is restrictive, difficult to use, and can limit scalability. As a result, there has been increasing demand for search-based file access, which allows users to access files by describing what they want rather than where the files are located.

Unfortunately, large-scale file systems are difficult to search. Current file system search solutions are designed as applications that are separate from the file system and utilize general-purpose index structures to provide search functionality. Keeping search separate from the file system leads to consistency and efficiency issues at large scales, and general-purpose indexes are not optimized for file system search, which can limit their performance and scalability. As a result, current solutions are too expensive, slow, and cumbersome to be effective at large scales.

This thesis has demonstrated several novel approaches to how files are organized, indexed, and searched in large-scale file systems. We hypothesized that more effective search can be achieved using new index structures that are specifically designed for file systems and new file system designs that better integrate search-based access. We explored these hypotheses with three main contributions: (1) a fresh look at large-scale file system properties using workload and snapshot traces, (2) new index designs for file metadata and content that leverage these properties, and (3) new file metadata and semantic file system designs that directly integrate search functionality. We now describe the conclusions of each specific contribution.

Properties of large-scale file systems: We measured and analyzed file system workload and snapshot traces collected from several large-scale file systems in the NetApp data center. Our study represents the first major workload study since 2001 [48], the first large-scale analysis of CIFS [92] workloads, and the first to study large-scale enterprise snapshot and workload traces in over a decade.

Our analysis showed that a number of important file system workload properties, such as access patterns and sequentiality, have changed since previous studies and differ between network file systems and the local file systems studied in the past. Additionally, we made new observations regarding file sharing, file re-access, and metadata attribute distributions, among others. Some of our important findings: workloads are more write-heavy than in the past, with read-to-write byte ratios of only 2:1, compared to 4:1 or higher in past studies. A large portion of file data is cold, with less than 10% of total storage being accessed during our three-month tracing period, and files are infrequently re-accessed, with 66% of opened files not being accessed again. We found that metadata attribute values are heavily clustered in the namespace; the attribute values we studied occurred in fewer than 1% of the total directories. Also, metadata attribute distributions are highly skewed, though their intersections are more uniformly distributed. We discussed how these observations can impact future file system design and organization.
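For clarity, the two headline workload numbers above are simple aggregates over the trace. A minimal sketch of their computation, assuming a hypothetical list of per-request records with 'op', 'path', and 'bytes' fields (not the actual trace format used in the study), is:

    from collections import Counter

    def workload_summary(records):
        """Return (read:write byte ratio, fraction of files opened exactly once)."""
        read_bytes = write_bytes = 0
        opens = Counter()
        for r in records:
            if r["op"] == "read":
                read_bytes += r["bytes"]
            elif r["op"] == "write":
                write_bytes += r["bytes"]
            elif r["op"] == "open":
                opens[r["path"]] += 1
        ratio = read_bytes / write_bytes if write_bytes else float("inf")
        opened_once = (sum(1 for n in opens.values() if n == 1) / len(opens)) if opens else 0.0
        return ratio, opened_once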

New approaches to file indexing: We developed two new index structures that aim to improve search and update performance and scalability by leveraging the file system properties that we observed. We presented designs of an index for file metadata and an index for file content search. Unlike the general-purpose indexes that current solutions rely on, our index designs are the first that are meant specifically for file systems and which exploit file system properties. We introduced hierarchical partitioning as a method for providing flexible index control that leveraged namespace locality. Our metadata index used signature files to reduce the search space during query execution, and our content index used a new index structure called the indirect index for the same purpose. Our metadata index introduced partition-based versioning to provide fast update performance, while our content index used a merge-based algorithm to update posting lists in our indirect index. An evaluation of our metadata index using real-world trace data showed that it can provide search performance that is up to 1–4 orders of magnitude faster than basic DBMS setups, while providing update performance that is up to 40× faster and using less than 0.1% of the file system's disk space. These results showed that file system search performance and scalability can be significantly improved with specialized index designs.
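The signature-file pruning mentioned above can be sketched briefly: each partition keeps a small bit-vector summary of the attribute values it contains, and a query descends only into partitions whose signature may match. The hash count, signature size, and value encoding below are arbitrary choices for illustration, not the Spyglass on-disk format.

    import hashlib

    SIG_BITS = 1024   # size of each partition signature

    def _bits(value, k=3):
        """Map a value to k bit positions (Bloom-filter style)."""
        digest = hashlib.sha1(str(value).encode()).digest()
        return {int.from_bytes(digest[i * 4:i * 4 + 4], "big") % SIG_BITS
                for i in range(k)}

    def make_signature(values):
        sig = set()
        for v in values:
            sig |= _bits(v)
        return sig

    def may_contain(signature, value):
        return _bits(value) <= signature   # all bits set => value possibly present

    partitions = {"/home/alice": make_signature(["ext=pdf", "owner=alice"]),
                  "/var/log":    make_signature(["ext=log", "owner=root"])}
    print([p for p, sig in partitions.items() if may_contain(sig, "ext=pdf")])

False positives only cost extra partition reads; they never cause missed results, which is why such summaries can safely shrink the search space.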

Towards searchable file systems: We designed two new file systems that can directly provide file search. Rather than relying on external search applications, which face significant consistency and efficiency problems at large scales, our designs use novel internal data layout, indexing, and update algorithms to provide fast file search directly within the file system. We introduced a new metadata architecture, called Magellan, that uses a novel metadata layout to improve disk utilization for searches. Inodes are stored in multi-dimensional index structures that provide efficient multi-attribute search, and a new journaling mechanism allows fast and reliable metadata updates. We also outlined the design of a new semantic file system called Copernicus. Unlike previous semantic file system designs that used a basic naming layer on top of a traditional file system or database, Copernicus uses a novel graph-based index to provide a dynamic, searchable namespace. Semantically related files are clustered together, and inter-file relationships allow navigation of the namespace. An evaluation of our Magellan prototype showed that searches could often be performed in under a second, while performance for normal file system workloads ranged from 13% slower to 12% faster than the Ceph file system. We also found that cluster-based journaling enabled good performance for metadata updates, and we showed the performance trade-offs associated with indexing inodes in multi-dimensional data structures. Our evaluation demonstrated that efficient search performance can be enabled within the file system while providing normal workload performance comparable to that of other file systems.
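As a small illustration of the multi-dimensional indexing summarized above, the toy K-D tree below stores inodes keyed by (size, mtime) and answers a two-sided range query such as "files larger than 1 MB modified recently." It is a simplified, in-memory sketch for exposition, not Magellan's index structure.

    def build_kdtree(points, depth=0):
        """points: list of ((size, mtime), inode_name) pairs."""
        if not points:
            return None
        axis = depth % 2
        points = sorted(points, key=lambda p: p[0][axis])
        mid = len(points) // 2
        return {"point": points[mid], "axis": axis,
                "left": build_kdtree(points[:mid], depth + 1),
                "right": build_kdtree(points[mid + 1:], depth + 1)}

    def range_search(node, lo, hi, out):
        if node is None:
            return
        (key, inode), axis = node["point"], node["axis"]
        if all(lo[i] <= key[i] <= hi[i] for i in range(2)):
            out.append(inode)
        if lo[axis] <= key[axis]:
            range_search(node["left"], lo, hi, out)
        if key[axis] <= hi[axis]:
            range_search(node["right"], lo, hi, out)

    inodes = [((2_000_000, 900), "big_recent"), ((500, 100), "small_old"),
              ((5_000_000, 200), "big_old")]
    tree = build_kdtree(inodes)
    results = []
    range_search(tree, (1_000_000, 800), (10**9, 1_000), results)
    print(results)   # only "big_recent" falls in the query range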

In summary, effective data access is a primary goal of a file system, though it is becoming increasingly difficult to provide with the rapid growth of data volumes. The new file indexing and file system search designs presented in this thesis allow data to be more effectively accessed and managed at large scales. This work, along with the new research areas that follow from it, should play a key role in enabling the continued construction of highly scalable file systems.

Bibliography

[1] Daniel J. Abadi, Samuel R. Madden, and Nabil Hachem. Column-stores vs. row-stores: How different are they really? In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver, Canada, June 2008.

[2] Atul Adya, William J. Bolosky, Miguel Castro, Ronnie Chaiken, Gerald Cermak, John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, and Roger Wattenhofer. FARSITE: Federated, available, and reliable storage for an incompletely trusted environment. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002. USENIX.

[3] Avnish Aggarwal and Karl Auerbach. Protocol standard for a NetBIOS service on a TCP/UDP transport. IETF Network Working Group RFC 1001, March 1987.

[4] Nitin Agrawal, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Generating realistic impressions for file-system benchmarking. In Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST), pages 125–138, February 2009.

[5] Nitin Agrawal, William J. Bolosky, John R. Douceur, and Jacob R. Lorch. A five-year study of file-system metadata. In Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST), pages 31–45, February 2007.

[6] Sasha Ames, Nikhil Bobb, Kevin M. Greenan, Owen S. Hofmann, Mark W. Storer, Carlos Maltzahn, Ethan L. Miller, and Scott A. Brandt. LiFS: An attribute-rich file system for storage class memories. In Proceedings of the 23rd IEEE / 14th NASA Goddard Conference on Mass Storage Systems and Technologies, College Park, MD, May 2006. IEEE.

[7] Sasha Ames, Carlos Maltzahn, and Ethan L. Miller. Quasar: A scalable naming language for very large file collections. Technical Report UCSC-SSRC-08-04, University of California, Santa Cruz, October 2008.

[8] David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan. FAWN: A fast array of wimpy nodes. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP ’09), pages 1–14, Big Sky, MT, October 2009.

[9] David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan. FAWN: A fast array of wimpy nodes. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP ’09), pages 1–14, 2009.

[10] Eric Anderson. Capture, conversion, and analysis of an intense NFS workload. In Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST), pages 139–152, February 2009.

[11] Apache. Welcome to Lucene! http://lucene.apache.org/, 2009.

[12] Apple. Spotlight Server: Stop searching, start finding. http://www.apple.com/server/macosx/features/spotlight/, 2008.

[13] Apple. HFS Plus volume format. http://developer.apple.com/mac/library/technotes/tn/tn1150.html, 2009.

[14] Apple. Spotlight Server: Stop searching, start finding. http://www.apple.com/server/macosx/features/spotlight/, 2009.

[15] Arvind Arasu, Junghoo Cho, Hector Garcia-Molina, Andreas Paepcke, and Sriram Raghavan. Searching the web. ACM Transactions on Internet Technology, 1(1):2–43, 2001.

[16] M. M. Astrahan, M. W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P. P. Griffiths, W. F. King, R. A. Lorie, P. R. McJones, J. W. Mehl, G. R. Putzolu, I. L. Traiger, B. W. Wade, and V. Watson. System R: Relational approach to database management. ACM Transactions on Database Systems, 1(2):97–137, 1976.

[17] Mary G. Baker, John H. Hartman, Michael D. Kupfer, Ken W. Shirriff, and John K. Ousterhout. Measurements of a distributed file system. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (SOSP ’91), pages 198–212, October 1991.

[18] Beagle. Beagle – quickly find the stuff you care about. http://beagle-project.org/, 2009.

[19] Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury, David Grossman, and Ophir Frieder. Hourly analysis of a very large topical categorized web query log. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’04), Sheffield, UK, June 2004.

[20] Timothy C. Bell, Alistair Moffat, Craig G. Nevill-Manning, Ian H. Witten, and Justin Zobel. Data compression in full-text retrieval systems. Journal of the American Society for Information Science, 44(9):508–531, 1993.

[21] Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan R. Fogel, Bradley C. Kuszmaul, and Jelani Nelson. Cache-oblivious streaming B-trees. In Proceedings of the 19th Symposium on Parallel Algorithms and Architectures (SPAA ’07), pages 81–92, 2007.

[22] J. Michael Bennett, Michael A. Bauer, and David Kinchlea. Characteristics of files in NFS environments. In Proceedings of the 1991 ACM Symposium on Small Systems, pages 33–40, 1991.

[23] John Bent. On managing large-scale scientific files, 2008.

[24] Jon Louis Bentley. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9):509–517, September 1975.

[25] Deepavali Bhagwat and Neoklis Polyzotis. Searching a file system using inferred semantic links. In HYPERTEXT ’05: Proceedings of the Sixteenth ACM Conference on Hypertext and Hypermedia, pages 85–87, 2005.

[26] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426, July 1970.

[27] Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis, and Wang-Chiew Tan. Fusing data management services with file systems. In Proceedings of the 4th ACM/IEEE Petascale Data Storage Workshop (PDSW ’09), November 2009.

[28] Eric Brewer. Readings in Database Systems, chapter Combining Systems and Databases: A Search Engine Retrospective. MIT Press, 4th edition, 2005.

[29] Sergey Brin and Larry Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1-7):107–117, 1998.

[30] Nader H. Bshouty and Geoffrey T. Falk. Compression of dictionaries via extensions to front coding. In Proceedings of the Fourth International Conference on Computing and Information (ICCI ’92), pages 361–364, Washington D.C., 1992.

[31] Stefan Büttcher. Multi-User File System Search. PhD thesis, University of Waterloo, 2007.

[32] Stefan Büttcher and Charles L. A. Clarke. Indexing time vs. query-time trade-offs in dynamic information retrieval systems. In Proceedings of the 14th ACM Conference on Information and Knowledge Management (CIKM 2005), pages 317–318, November 2005.

[33] Stefan Büttcher and Charles L. A. Clarke. A security model for full-text file system search in multi-user environments. In Proceedings of the 4th USENIX Conference on File and Storage Technologies (FAST), pages 169–182, San Francisco, CA, December 2005.

[34] Philip H. Carns, Walter B. Ligon, Robert B. Ross, and Rajeev Thakur. PVFS: A parallel file system for Linux clusters. In Proceedings of the 4th Annual Linux Showcase and Conference, pages 317–327, Atlanta, GA, October 2000.

[35] Donald D. Chamberlin and Raymond F. Boyce. SEQUEL: A structured English query language. In Proceedings of the 1974 ACM SIGFIDET Workshop on Data Description, Access and Control (FIDET ’74), pages 249–264, 1974.

[36] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pages 205–218, Seattle, WA, November 2006.

[37] Edgar Frank Codd. A relational model of data for large shared data banks. Communications of the ACM, 26(1):64–69, 1983.

[38] R.C. Daley and P.G. Neumann. A general-purpose file system for secondary storage. In Proceedings of the Fall Joint Computer Conference, Part I, pages 213–229, 1965.

[39] Shobhit Dayal. Characterizing HEC storage systems at rest. Technical Report CMU-PDL-08-109, Carnegie Mellon University, Pittsburgh, PA, 2008.

[40] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), San Francisco, CA, December 2004.

[41] Giuseppe DeCandia, Deniz Hastorun, Gunavardhan Kakulapati, Madan Jampani, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon’s highly available key-value store. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP ’07), pages 205–220, October 2007.

[42] Cory Doctorow. Welcome to the petacentre. Nature, 455:16–21, September 2008.

[43] John R. Douceur and William J. Bolosky. A large-scale study of file system contents. In Proceedings of the 1999 SIGMETRICS Conference on Measurement and Modeling of Computer Systems, Atlanta, GA, May 1999.

[44] John R. Douceur and Jon Howell. Distributed directory service in the Farsite file system. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pages 321–334, Seattle, WA, November 2006. USENIX.

[45] Fred Douglis, Anja Feldmann, Balachander Krishnamurthy, and Jeffrey Mogul. Rate of change and other metrics: A live study of the World Wide Web. Monterey, CA, 1997.

[46] Michael Eisler, Peter Corbett, Michael Kazar, and Daniel S. Nydick. Data ONTAP GX: A scalable storage cluster. In Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST), pages 139–152, San Jose, CA, February 2007.

[47] Daniel Ellard. The file system interface is an anachronism. Technical Report TR-15-03, Harvard University, November 2003.

[48] Daniel Ellard, Jonathan Ledlie, Pia Malkani, and Margo Seltzer. Passive NFS tracing of email and research workloads. In Proceedings of the Second USENIX Conference on File and Storage Technologies (FAST), pages 203–216, San Francisco, CA, March 2003. USENIX.

[49] Daniel Ellard, Jonathan Ledlie, and Margo Seltzer. The utility of file names. Technical Report TR-05-03, Harvard University, March 2003.

[50] Daniel Ellard, Michael Mesnier, Eno Thereska, Gregory R. Ganger, and Margo Seltzer. Attribute-based prediction of file properties. Technical Report TR-14-03, Harvard University, Cambridge, MA, 2004.

[51] Enterprise Strategy Group. ESG Research Report: Storage resource management on the launch pad. Technical Report ETSG-1809930, Enterprise Strategy Group, 2007.

[52] Facebook, Inc. Needle in a haystack: Efficient storage of billions of photos. http://www.facebook.com/note.php?note_id=76191543919, 2009.

[53] Chris Faloutsos and Stavros Christodoulakis. Signature files: An access method for documents and its analytical performance evaluation. ACM Transactions on Information Systems, 2(4):267–288, 1984.

[54] Christos Faloutsos and Raphael Chan. Fast text access methods for optical and large magnetic disks: Designs and performance comparison. In Proceedings of the 14th Conference on Very Large Databases (VLDB), Los Angeles, CA, August 1988.

[55] Fast, A Microsoft Subsidiary. FAST – enterprise search. http://www.fastsearch.com/, 2008.

[56] Clark D. French. One size fits all database architectures do not work for DSS. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pages 449–450, San Jose, CA, May 1995.

[57] Gregory R. Ganger and M. Frans Kaashoek. Embedded inodes and explicit groupings: Exploiting disk bandwidth for small files. In Proceedings of the 1997 USENIX Annual Technical Conference, pages 1–17. USENIX Association, January 1997.

[58] John F. Gantz, Christopher Chute, Alex Manfrediz, Stephen Minton, David Reinsel, Wolfgang Schlichting, and Anna Toncheva. The diverse and exploding digital universe: An updated forecast of worldwide information growth through 2011. IDC white paper, sponsored by EMC, March 2008.

[59] Narain Gehani, H.V. Jagadish, and William D. Roome. OdeFS: A file system interface to an object-oriented database. In Proceedings of the 20th Conference on Very Large Databases (VLDB), pages 249–260, Santiago de Chile, Chile, 1994.

[60] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP ’03), pages 29–43, Bolton Landing, NY, October 2003. ACM.

[61] Dominic Giampaolo. Practical File System Design with the Be File System. Morgan Kaufmann, 1st edition, 1999.

[62] Garth A. Gibson, David F. Nagle, Khalil Amiri, Jeff Butler, Fay W. Chang, Howard Gobioff, Charles Hardin, Erik Riedel, David Rochberg, and Jim Zelenka. A cost-effective, high-bandwidth storage architecture. In Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 92–103, San Jose, CA, October 1998.

[63] David K. Gifford, Pierre Jouvelot, Mark A. Sheldon, and James W. O’Toole, Jr. Semantic file systems. In Proceedings of the 13th ACM Symposium on Operating Systems Principles (SOSP ’91), pages 16–25. ACM, October 1991.

[64] Corrado Gini. Measurement of inequality and incomes. The Economic Journal, 31:124–126, 1921.

[65] Goebel Group Inc. Compare search appliance tools. http://www.goebelgroup.com/sam.htm, 2008.

[66] Google, Inc. Google Desktop: Information when you want it, right on your desktop. http://www.desktop.google.com/, 2007.

[67] Google, Inc. Google enterprise. http://www.google.com/enterprise/, 2008.

[68] Google, Inc. We knew the web was big... http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html, 2008.

[69] Burra Gopal and Udi Manber. Integrating content-based access mechanisms with hierarchical file systems. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation (OSDI), pages 265–278, February 1999.

[70] Jim Gray, David T. Liu, Maria Nieto-Santisteban, Alex Szalay, David J. DeWitt, and Gerd Heber. Scientific data management in the coming decade. SIGMOD Record, 34(4):34–41, December 2005.

[71] Steven Gribble, Gurmeet Singh Manku, Eric Brewer, Timothy J. Gibson, and Ethan L. Miller. Self-similarity in file systems: Measurement and applications. In Proceedings of the 1998 SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 141–150, Madison, WI, June 1998.

[72] Gary Grider, James Nunez, John Bent, Rob Ross, Lee Ward, Steve Poole, Evan Felix, Ellen Salmon, and Marti Bancroft. Coordinating government funding of file system and I/O research through the High End Computing University Research Activity. Operating Systems Review, 43(1):2–7, January 2009.

[73] Karl Gyllstrom and Craig Soules. Seeing is retrieving: Building information context from what the user sees. In IUI ’08: Proceedings of the 13th International Conference on Intelligent User Interfaces, pages 189–198, 2008.

[74] Donna Harman, R. Baeza-Yates, Edward Fox, and W. Lee. Inverted files. Information retrieval: Data structures and algorithms, pages 28–43, 1992.

[75] Stavros Harizopoulos, Daniel J. Abadi, Samuel Madden, and Michael Stonebraker. OLTP through the looking glass, and what we found there. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 981–992, Vancouver, BC, Canada, 2008.

[76] Stavros Harizopoulos, Velen Liang, Daniel J. Abadi, and Samuel Madden. Performance tradeoffs in read-optimized databases. In Proceedings of the 32nd Conference on Very Large Databases (VLDB), pages 487–498, Seoul, Korea, September 2006.

[77] Dave Hitz, James Lau, and Michael Malcolm. File system design for an NFS file server appliance. In Proceedings of the Winter 1994 USENIX Technical Conference, pages 235–246, San Francisco, CA, January 1994.

[78] Allison L. Holloway and David J. DeWitt. Read-optimized databases, in depth. In Proceedings of the 34th Conference on Very Large Databases (VLDB ’08), pages 502–513, Auckland, New Zealand, August 2008.

[79] John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols, M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West. Scale and performance in a distributed file system. ACM Transactions on Computer Systems, 6(1):51–81, February 1988.

[80] Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, and Lei Tian. SmartStore: A new metadata organization paradigm with metadata semantic-awareness for next-generation file systems. In Proceedings of SC09, Portland, OR, November 2009.

[81] Larry Huston, Rahul Sukthankar, Rajiv Wickremesinghe, M. Satyanarayanan, Gregory R. Ganger, Erik Riedel, and Anastassia Ailamaki. Diamond: A storage architecture for early discard in interactive search. In Proceedings of the Third USENIX Conference on File and Storage Technologies (FAST), pages 73–86, San Francisco, CA, April 2004. USENIX.

[82] Index Engines. Power over information. http://www.indexengines.com/online_data.htm, 2008.

[83] Glen Jeh and Jennifer Widom. Scaling personalized web search. In Proceedings of the 12th International World Wide Web Conference, Budapest, Hungary, 2003.

[84] Jeffrey Katcher. PostMark: a new file system benchmark. Technical Report TR-3022, NetApp, Sunnyvale, CA, 1997.

[85] Kazeon. Kazeon: Search the enterprise. http://www.kazeon.com/, 2008.

[86] Kernel.org. inotify official readme. http://www.kernel.org/pub/linux/kernel/people/rml/inotify/README, 2008.

[87] Setrag Khoshafian, George Copeland, Thomas Jagodits, Haran Boral, and Patrick Valduriez. A query processing strategy for the decomposed storage model. In Proceedings of the 3rd International Conference on Data Engineering (ICDE ’87), pages 636–643, 1987.

[88] James J. Kistler and Mahadev Satyanarayanan. Disconnected operation in the Coda file system. ACM Transactions on Computer Systems, 10(1):3–25, 1992.

[89] Jonathan Koren, Yi Zhang, Sasha Ames, Andrew Leung, Carlos Maltzahn, and Ethan L. Miller. Searching and navigating petabyte scale file systems based on facets. In Proceedings of the 2007 ACM Petascale Data Storage Workshop (PDSW ’07), pages 21–25, Reno, NV, November 2007.

[90] Jonathan Koren, Yi Zhang, and Xue Liu. Personalized interactive faceted search. In Proceedings of the 17th International World Wide Web Conference, pages 477–486, Beijing, China, 2008.

[91] John Kubiatowicz, David Bindel, Yan Chen, Patrick Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon, Westley Weimer, Christopher Wells, and Ben Zhao. OceanStore: An architecture for global-scale persistent storage. In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pages 190–201, Cambridge, MA, November 2000. ACM.

[92] Paul J. Leach and Dilip C. Naik. A common Internet file system (CIFS/1.0) protocol. IETF Network Working Group RFC Draft, March 1997.

[93] Ronny Lempel, Yosi Mass, Shila Ofek-Koifman, Yael Petruschka, Dafna Sheinwald, and Ron Sivan. Just in time indexing for up to the second search. In Proceedings of the 2007 International Conference on Information and Knowledge Management Systems (CIKM ’07), pages 97–106, Lisboa, Portugal, 2007.

[94] Ronny Lempel and Shlomo Moran. Predictive caching and prefetching of query results in search engines. In Proceedings of the 12th International World Wide Web Conference, pages 19–28, Budapest, Hungary, 2003.

[95] Nicholas Lester, Alistair Moffat, and Justin Zobel. Fast on-line index construction by geometric partitioning. In Proceedings of the 2005 International Conference on Information and Knowledge Management Systems (CIKM ’05), pages 776–783, November 2005.

[96] Nicholas Lester, Alistair Moffat, and Justin Zobel. Efficient online index construction for text databases. ACM Transactions on Database Systems, 33(3), 2008.

[97] libkdtree++. http://libkdtree.alioth.debian.org/, 2008.

[98] Witold Litwin, Marie-Anna Neimat, and Donovan A. Schneider. LH*—a scalable, distributed data structure. ACM Transactions on Database Systems, 21(4):480–525, 1996.

[99] Fang Liu, Clement Yu, and Weiyi Meng. Personalized web search by mapping user queries to categories. In Proceedings of the 2002 International Conference on Information and Knowledge Management Systems (CIKM ’02), pages 558–565, McLean, Virginia, 2002.

[100] Max O. Lorenz. Methods of measuring the concentration of wealth. Publications of the American Statistical Association, 9:209–219, 1905.

[101] Clifford A. Lynch. Selectivity estimation and query optimization in large databases with highly skewed distribution of column values. In Proceedings of the 14th Conference on Very Large Databases (VLDB), pages 240–251, San Francisco, CA, August 1988.

[102] John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou. Boxwood: Abstractions as the foundation for storage infrastructure. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), pages 105–120, San Francisco, CA, December 2004.

[103] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297. University of California Press, 1967.

[104] Udi Manber and Sun Wu. GLIMPSE: A tool to search through entire file systems. In Proceedings of the Winter 1994 USENIX Technical Conference, 1994.

[105] Marshall Kirk McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry. A fast file system for UNIX. ACM Transactions on Computer Systems, 2(3):181–197, August 1984.

[106] Sergey Melnik, Sriram Raghavan, Beverly Yang, and Hector Garcia-Molina. Building a distributed full-text index for the web. ACM Transactions on Information Systems, 19(3):217–241, 2001.

[107] Michael Mesnier, Eno Thereska, Daniel Ellard, Gregory R. Ganger, and Margo Seltzer. File classification in self-* storage systems. In Proceedings of the International Conference on Autonomic Computing (ICAC ’04), pages 44–51, New York, NY, May 2004.

[108] metaTracker. metaTracker for Linux. http://www.gnome.org/projects/tracker/, 2008.

[109] Microsoft, Inc. Enterprise search from Microsoft. http://www.microsoft.com/Enterprisesearch/, 2008.

[110] Microsoft, Inc. Windows Search 4.0. http://www.microsoft.com/windows/products/winfamily/desktopsearch/default.mspx, 2009.

[111] Ethan Miller and Randy Katz. An analysis of file migration in a Unix supercomputing environment. In Proceedings of the Winter 1993 USENIX Technical Conference, pages 421–433, January 1993.

[112] Soumyadeb Mitra, Marianne Winslett, and Windsor W. Hsu. Query-based partitioning of documents and indexes for information lifecycle management. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 623–636, June 2008.

[113] Jeffrey Clifford Mogul. Representing Information About Files. PhD thesis, Stanford University, March 1986.

[114] Kiran-Kumar Muniswamy-Reddy, David A. Holland, Uri Braun, and Margo Seltzer. Provenance-aware storage systems. In Proceedings of the 2006 USENIX Annual Technical Conference, pages 43–56, Boston, MA, 2006.

[115] Kiran-Kumar Muniswamy-Reddy, Charles P. Wright, Andrew Himmer, and Erez Zadok. A versatile and user-oriented versioning file system. In Proceedings of the Third USENIX Conference on File and Storage Technologies (FAST), pages 115–128, San Francisco, CA, April 2004. USENIX.

[116] Michael N. Nelson, Brent B. Welch, and John K. Ousterhout. Caching in the Sprite network file system. ACM Transactions on Computer Systems, 6(1):134–154, 1988.

[117] B. Clifford Neuman. The Prospero file system: A global file system based on the virtual system model. Computing Systems, 5(4):407–432, 1992.

[118] B. Clifford Neuman, Jennifer G. Steiner, and Jeffrey I. Schiller. Kerberos: An authentication service for open network systems. In Proceedings of the Winter 1988 USENIX Technical Conference, pages 191–201, Dallas, TX, 1988.

[119] M. A. Olson. The design and implementation of the Inversion file system. In Proceedings of the Winter 1993 USENIX Technical Conference, pages 205–217, San Diego, California, USA, January 1993.

[120] Michael A. Olson, Keith Bostic, and Margo Seltzer. Berkeley DB. In Proceedings of the Freenix Track: 1999 USENIX Annual Technical Conference, pages 183–192, Monterey, CA, June 1999.

[121] Oracle. Oracle Berkeley DB. http://www.oracle.com/technology/products/berkeley-db/index.html, 2008.

[122] John K. Ousterhout, Andrew R. Cherenson, Frederick Douglis, Michael N. Nelson, and Brent B. Welch. The Sprite network operating system. IEEE Computer, 21(2):23–36, February 1988.

[123] John K. Ousterhout, Hervé Da Costa, David Harrison, John A. Kunze, Mike Kupfer, and James G. Thompson. A trace-driven analysis of the Unix 4.2 BSD file system. In Proceedings of the 10th ACM Symposium on Operating Systems Principles (SOSP ’85), pages 15–24, December 1985.

[124] Yoann Padioleau and Olivier Ridoux. A logic file system. In Proceedings of the 2003 USENIX Annual Technical Conference, pages 99–112, San Antonio, TX, June 2003.

[125] Aleatha Parker-Wood. Fast security-aware search on large scale file systems. UCSC CMPS 221 Class Project Report, March 2009.

[126] Swapnil Patil, Garth A. Gibson, Gregory R. Ganger, Julio Lopez, Milo Polte, Wittawat Tantisiriroj, and Lin Xiao. In search of an API for scalable file systems: Under the table or above it? In Proceedings of the 1st USENIX Workshop on Hot Topics in Cloud Computing (HotCloud ’09), 2009.

[127] Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, and Dave Hitz. NFS version 3: Design and implementation. In Proceedings of the Summer 1994 USENIX Technical Conference, pages 137–151, 1994.

[128] Vijayan Prabhakaran, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Analysis and evolution of journaling file systems. In Proceedings of the 2005 USENIX Annual Technical Conference, pages 105–120, 2005.

[129] Private NetApp, Inc. Customers. On the efficiency of modern metadata search appliances, 2008.

[130] Sean Quinlan and Sean Dorward. Venti: A new approach to archival storage. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST), pages 89–101, Monterey, California, USA, 2002. USENIX.

[131] K. K. Ramakrishnan, Prabuddha Biswas, and Ramakrishna Karedla. Analysis of file I/O traces in commercial computing environments. In Proceedings of the 1992 SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 78–90, 1992.

[132] Berthier A. Ribeiro-Neto and Ramurti A. Barbosa. Query performance for tightly coupled distributed digital libraries. In Proceedings of the Third ACM International Conference on Digital Libraries (DL ’98), pages 182–190, 1998.

[133] Berthier Ribeiro-Neto, Edleno S. Moura, Marden S. Neubert, and Nivio Ziviani. Efficient distributed algorithms to build inverted files. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’99), pages 105–112, 1999.

[134] Alma Riska and Erik Riedel. Disk drive level workload characterization. In Proceedings of the 2006 USENIX Annual Technical Conference, pages 97–102, Boston, MA, May 2006.

[135] Stephen E. Robertson, Steve Walker, and Micheline Hancock-Beaulieu. Okapi at TREC-7. In Proceedings of the 7th Text REtrieval Conference (TREC 1998), Gaithersburg, MD, 1998.

[136] John T. Robinson. The K-D-B-tree: A search structure for large multidimensional dynamic indexes. In Proceedings of the 1981 ACM SIGMOD International Conference on Management of Data, pages 10–18, 1981.

[137] Drew Roselli, Jay Lorch, and Tom Anderson. A comparison of file system workloads. In Proceedings of the 2000 USENIX Annual Technical Conference, pages 41–54, San Diego, CA, June 2000. USENIX Association.

[138] Mendel Rosenblum and John K. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems, 10(1):26–52, February 1992.

[139] Antony Rowstron and Peter Druschel. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. In Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP ’01), pages 188–201, Banff, Canada, October 2001. ACM.

[140] Antony Rowstron and Peter Druschel. Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems. In Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329–350, 2001.

[141] Yasushi Saito, Christos Karamanolis, Magnus Karlsson, and Mallik Mahalingam. Taming aggressive replication in the Pangaea wide-area file system. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), pages 15–30. USENIX, December 2002.

[142] Brandon Salmon, Frank Hady, and Jay Melican. Learning to share: A study of sharing among home storage devices. Technical Report CMU-PDL-07-107, Carnegie Mellon University, 2007.

[143] Sandia National Laboratories. PDSI SciDAC released trace data. http://www.cs.sandia.gov/Scalable_IO/SNL_Trace_Data/index.html, 2009.

[144] Douglas S. Santry, Michael J. Feeley, Norman C. Hutchinson, Alistair C. Veitch, Ross W. Carton, and Jacob Ofir. Deciding when to forget in the Elephant file system. In Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP ’99), pages 110–123, December 1999.

[145] Mahadev Satyanarayanan. A study of file sizes and functional lifetimes. In Proceedings of the 8th ACM Symposium on Operating Systems Principles (SOSP ’81), pages 96–108, December 1981.

[146] Frank Schmuck and Roger Haskin. GPFS: A shared-disk file system for large computing clusters. In Proceedings of the 2002 Conference on File and Storage Technologies (FAST), pages 231–244. USENIX, January 2002.

[147] Margo Seltzer, Greg Ganger, M. Kirk McKusick, Keith Smith, Craig Soules, and Christopher Stein. Journaling versus soft updates: Asynchronous meta-data protection in file systems. In Proceedings of the 2000 USENIX Annual Technical Conference, pages 18–23, June 2000.

[148] Margo Seltzer and Nicholas Murphy. Hierarchical file systems are dead. In Proceedings of the 12th Workshop on Hot Topics in Operating Systems (HotOS-XII), 2009.

[149] Sam Shah, Craig A. N. Soules, Gregory R. Ganger, and Brian D. Noble. Using provenance to aid in personal file search. In Proceedings of the 2007 USENIX Annual Technical Conference, pages 171–184, June 2007.

[150] Tracy F. Sienknecht, Richard J. Friedrich, Joseph J. Martinka, and Peter M. Friedenbach. The implications of distributed data in a commercial environment on the design of hierarchical storage management. Performance Evaluation, 20(3), 1994.

[151] Craig Silverstein, Hannes Marais, Monika Henzinger, and Michael Moricz. Analysis of a very large web search engine query log. ACM SIGIR Forum, 33(1):6–12, 1999.

[152] H. A. Simon. On a class of skew distribution functions. Biometrika, 42:425–440, 1955.

[153] K. Smith and M. Seltzer. File layout and file system performance. Technical Report TR-35-94, Harvard University, 1994.

[154] Craig A. N. Soules. Using Context to Assist in Personal File Retrieval. PhD thesis, Carnegie Mellon University, 2006.

[155] Craig A. N. Soules and Gregory R. Ganger. Connections: Using context to enhance file search. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP ’05), pages 119–132, New York, NY, USA, 2005. ACM Press.

[156] Craig A. N. Soules, Garth R. Goodson, John D. Strunk, and Gregory R. Ganger. Metadata efficiency in versioning file systems. In Proceedings of the Second USENIX Conference on File and Storage Technologies (FAST), pages 43–58, San Francisco, CA, 2003.

[157] Craig A.N. Soules and Gregory R. Ganger. Why can’t I find my files? New methods for automatic attribute assignment. In Proceedings of the 9th Workshop on Hot Topics in Operating Systems (HotOS-IX), Sydney, Australia, 1999.

[158] Craig A.N. Soules, Kimberly Keeton, and Charles B. Morrey III. SCAN-Lite: Enterprise-wide analysis on the cheap. In Proceedings of EuroSys 2009, pages 117–130, Nuremberg, Germany, 2009.

[159] SPEC benchmarks. http://www.spec.org/benchmarks.html.

[160] Michael Stonebraker. Operating system support for database management. Communications of the ACM, 24(7), 1981.

[161] Michael Stonebraker, Chuck Bear, Uğur Çetintemel, Mitch Cherniack, Tingjian Ge, Nabil Hachem, Stavros Harizopoulos, John Lifter, Jennie Rogers, and Stan Zdonik. One size fits all?—part 2: Benchmarking results. In 3rd Biennial Conference on Innovative Data Systems Research (CIDR), pages 173–184, January 2007.

[162] Michael Stonebraker and Uğur Çetintemel. “One Size Fits All”: An idea whose time has come and gone. In Proceedings of the 21st International Conference on Data Engineering (ICDE ’05), pages 2–11, Tokyo, Japan, 2005.

[163] Michael Stonebraker and Lawrence C. Rowe. The design of POSTGRES. ACM SIGMOD Record, 15(2):340–355, 1986.

[164] Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O’Neil, Pat O’Neil, Alex Rasin, Nga Tran, and Stan Zdonik. C-Store: A column oriented DBMS. In Proceedings of the 31st Conference on Very Large Databases (VLDB), pages 553–564, Trondheim, Norway, 2005.

[165] Mike Stonebraker, Sam Madden, Daniel Abadi, Stavros Harizopoulos, Nabil Hachem, and Pat Helland. The end of an architectural era (it’s time for a complete rewrite). In Proceedings of the 33rd Conference on Very Large Databases (VLDB), pages 1150–1160, Vienna, Austria, 2007.

[166] Mark W. Storer, Kevin M. Greenan, Ethan L. Miller, and Kaladhar Voruganti. Pergamum: Replacing tape with energy efficient, reliable, disk-based archival storage. In Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST), pages 1–16, February 2008.

[167] Christina R. Strong. Keyword analysis of file system contents. UCSC CMPS 221 Class Project Report, March 2009.

[168] A. Sweeney, D. Doucette, W. Hu, C. Anderson, M. Nishimoto, and G. Peck. Scalability in the XFS file system. In Proceedings of the 1996 USENIX Annual Technical Conference, pages 1–14, January 1996.

[169] Tcpdump/libpcap public repository. http://www.tcpdump.org/.

[170] Jaime Teevan, Christine Alvarado, Mark S. Ackerman, and David R. Karger. The perfect search engine is not enough: a study of orienteering behavior in directed search. In Proceedings of the 2004 Conference on Human Factors in Computing Systems (CHI ’04), pages 415–422. ACM Press, 2004.

[171] The Washington Post. The Saga of the Lost Space Tapes: NASA is stumped in search for videos of 1969 moonwalk. http://www.washingtonpost.com/wp-dyn/content/article/2007/01/30/AR2007013002065.html, 2007.

[172] Chandramohan A. Thekkath, Timothy Mann, and Edward K. Lee. Frangipani: A scalable distributed file system. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP ’97), pages 224–237, 1997.

[173] Anthony Tomasic, Hector Garcia-Molina, and Kurt Shoens. Incremental updates of inverted lists for text document retrieval. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, pages 289–300, Minneapolis, MN, May 1994.

[174] Avishay Traeger, Erez Zadok, Nikolai Joukov, and Charles P. Wright. A nine year study of file system and storage benchmarking. ACM Transactions on Storage, 4(2), May 2008.

[175] United States Congress. The Health Insurance Portability and Accountability Act (HIPAA). http://www.hhs.gov/ocr/hipaa/, 1996.

[176] United States Congress. The Sarbanes-Oxley Act (SOX). http://www.soxlaw.com/, 2002.

[177] W. Vogels. File system usage in Windows NT 4.0. In Proceedings of the 17th ACM Symposium on Operating Systems Principles (SOSP ’99), pages 93–109, December 1999.

[178] Feng Wang, Qin Xin, Bo Hong, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, and Tyce T. McLarty. File system workload analysis for large scale scientific computing applications. In Proceedings of the 21st IEEE / 12th NASA Goddard Conference on Mass Storage Systems and Technologies, pages 139–152, College Park, MD, April 2004.

[179] Peter Webb. Digital data management needs holistic approach to obtain quantifiable results. The American Oil & Gas Reporter, November 2005.

[180] Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Darrell D. E. Long, and Carlos Maltzahn. Ceph: A scalable, high-performance distributed file system. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pages 307–320, Seattle, WA, November 2006. USENIX.

[181] Sage A. Weil, Scott A. Brandt, Ethan L. Miller, and Carlos Maltzahn. CRUSH: Controlled, scalable, decentralized placement of replicated data. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC ’06), Tampa, FL, November 2006. ACM.

[182] Sage A. Weil, Kristal T. Pollack, Scott A. Brandt, and Ethan L. Miller. Dynamic metadata management for petabyte-scale file systems. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing (SC ’04), Pittsburgh, PA, November 2004. ACM.

[183] Brent Welch, Marc Unangst, Zainul Abbasi, Garth Gibson, Brian Mueller, Jason Small, Jim Zelenka, and Bin Zhou. Scalable performance of the Panasas parallel file system. In Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST), pages 17–33, February 2008.

[184] John Wilkes, Richard Golding, Carl Staelin, and Tim Sullivan. The HP AutoRAID hierarchical storage system. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP ’95), pages 96–108, Copper Mountain, CO, 1995. ACM Press.

[185] Wireshark: Go deep. http://www.wireshark.org/.

[186] Theodore M. Wong and John Wilkes. My cache or yours? Making storage more exclusive. In Proceedings of the 2002 USENIX Annual Technical Conference, pages 161–175, Monterey, CA, June 2002. USENIX Association.

[187] Kesheng Wu, Ekow J. Otoo, and Arie Shoshani. Optimizing bitmap indices with efficient compression. ACM Transactions on Database Systems, 31(1):1–38, March 2006.

[188] Zhichen Xu, Magnus Karlsson, Chunqiang Tang, and Christos Karamanolis. Towards a semantic-aware file store. In Proceedings of the 9th Workshop on Hot Topics in Operating Systems (HotOS-IX), May 2003.

[189] Lawrence L. You, Kristal T. Pollack, and Darrell D. E. Long. Deep Store: An archival storage system architecture. In Proceedings of the 21st International Conference on Data Engineering (ICDE ’05), pages 804–815, Tokyo, Japan, April 2005.

[190] Zhihui Zhang and Kanad Ghose. hFS: A hybrid file system prototype for improving small file and metadata performance. In Proceedings of EuroSys 2007, pages 175–187, March 2007.

[191] Min Zhou and Alan Jay Smith. Analysis of personal computer workloads. In Proceedings of the 7th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS ’99), pages 208–217, Washington, DC, 1999.

[192] Justin Zobel and Alistair Moffat. Inverted files for text search engines. ACM Computing Surveys, 38(2):6, 2006.
