Toward Automatic Context-Based Attribute Assignment for Semantic file Systems
Toward automatic context-based attribute assignment for semantic file systems Craig A. N. Soules, Gregory R. Ganger CMU-PDL-04-105 June 2004 Parallel Data Laboratory Carnegie Mellon University Pittsburgh, PA 15213-3890 Abstract Semantic file systems enable users to search for files based on attributes rather than just pre-assigned names. This paper devel- ops and evaluates several new approaches to automatically generating file attributes based on context, complementing existing approaches based on content analysis. Context captures broader system state that can be used to provide new attributes for files, and to propagate attributes among related files; context is also how humans often remember previous items [2], and so should fit the primary role of semantic file systems well. Based on our study of ten systems over four months, the addition of context-based mechanisms, on average, reduces the number of files with zero attributes by 73%. This increases the total number of classifiable files by over 25% in most cases, as is shown in Figure 1. Also, on average, 71% of the content-analyzable files also gain additional valuable attributes. Acknowledgements: We thank the members and companies of the PDL Consortium (including EMC, Hewlett-Packard, HGST, Hitachi, IBM, Intel, LSI Logic, Microsoft, Network Appliance, Panasas, Seagate, Sun, and Veritas) for their interest, insights, feedback, and support. We thank IBM and Intel for hardware grants supporting our research efforts. Keywords: semantic filesystems, context, attributes, classfication 1 Introduction As storage capacity continues to increase, the number of files belonging to an individual user has increased accordingly. Already, storage capacity has reached the point where there is little reason for a user to delete old content—in fact, the time required to do so would be wasted.
[Show full text]