Building Blocks for Semantic Data Organization on the Desktop

Total Page:16

File Type:pdf, Size:1020Kb

Building Blocks for Semantic Data Organization on the Desktop DISSERTATION Building Blocks for Semantic Data Organization on the Desktop Verfasser Dipl.-Ing. Niko Popitsch angestrebter akademischer Grad Doktor der technischen Wissenschaften (Dr. techn.) Wien, im August 2011 Studienkennzahl lt. Studienblatt: A 786 881 Dissertationsgebiet lt. Studienblatt: Informatik Betreuer: Univ.-Prof. Dipl.-Ing. Dr. Wolfgang Klas Zweitbetreuer: Univ.-Prof. Dipl.-Ing. DDr. Gerald Quirchmayr Moji družini Acknowledgments Working as a Ph.D. student at the University of Vienna in the past four years was both a magnificent as well as a challenging experience. I wish to thank my supervisor Prof. Dr. Wolfgang Klas and my secondary advisor Prof. DDr. Gerald Quirchmayr for giving me the opportunity and the freedom to pursue my research and to write this thesis as well as for their guidance and all their support. In addition, I am indebted to my colleagues Bernhard Haslhofer and Bernhard Schandl for the fruitful col- laboration. Special thanks go also to my colleagues Stefan Leitich, Robert Mosser, Wolfgang Jochum, Ross King, Jan Stankovsky, Stefan Zander, Elaheh Momeni-Roochi and all the others with whom I had the plea- sure to work with during the past years. Thank you all for sharing your excellent ideas with me and for all the fun we had at work. Above all, however, my gratitude and my love go to my family, my friends and in particular to Maja for motivating and supporting me in the past years. iii Abstract The organization of (multimedia) data on current desktop systems is done to a large part by arranging files in hierarchical file systems, but also by specialized applications (e.g., music or photo organizing software) that make use of file-related metadata for this task. These metadata are predominantly stored in embedded file headers, using a magnitude of mainly proprietary formats. Generally, metadata and links play the key roles in advanced data organization concepts. Their limited support in prevalent file system implementations, however, hinders the adoption of such concepts on the desktop: First, non-uniform access interfaces require metadata consuming applications to understand both a file’s format and its metadata scheme; second, separate data/metadata access is not possible, and third, metadata cannot be attached to multiple files or to file folders although the latter are the primary constructs for file organization. As a consequence of this, current desktops suffer, inter alia, from (i) limited data organization possibilities, (ii) limited navigability, (iii) limited data findability, and (iv) metadata fragmentation. Although there were attempts to improve this situation, e.g., by introducing semantic file systems, most of these issues were successfully addressed and solved in the Web and in particular in the Semantic Web and reusing these solutions on the desktop, a central hub of data and metadata manipulation, is clearly desirable. In this thesis a novel, backwards-compatible metadata model that addresses the above-mentioned issues is introduced. This model is based on stable file identifiers and external, file-related, semantic metadata descriptions that are represented using the generic RDF graph model. Descriptions are accessible via a uniform Linked Data interface and can be linked with other descriptions and resources. In particular, this model enables semantic linking between local file system objects and remote resources on the Web or the emerging Web of Data, thereby enabling the integration of these data spaces. As the model crucially relies on the stability of these links, we contribute two algorithms that preserve their integrity in local and in remote environments. This means that links between file system objects, metadata descriptions and remote resources do not break even if their addresses change, e.g., when files are moved or Linked Data resources are re-published using different URIs. Finally, we contribute a prototypical implementation of the proposed metadata model that demonstrates how these building blocks sum up to constitute a metadata layer that may act as a foundation for semantic data organization on the desktop. v Zusammenfassung Die Organisation von (Multimedia-) Daten auf Desktop-Systemen wird derzeit hauptsächlich durch das Ein- ordnen von Dateien in ein hierarchisches Dateisystem bewerkstelligt. Zusätzlich werden gewisse Inhalte (z.B. Musik oder Fotos) von spezialisierter Software mit Hilfe Datei-bezogener Metadaten verwaltet. Diese Metadaten werden meist direkt im Dateikopf in einer Unzahl verschiedener, vorwiegend proprietärer For- mate gespeichert. Allgemein nehmen Metadaten und Links die Schlüsselrollen in fortgeschrittenen Datenor- ganisationskonzepten ein, ihre eingeschränkte Unterstützung in vorherrschenden Dateisystemen macht die Einführung solcher Konzepte auf dem Desktop jedoch schwierig: Erstens müssen Anwendungen sowohl Dateiformat als auch Metadatenschema verstehen um auf Metadaten zugreifen zu können; zweitens ist ein getrennter Zugriff auf Daten und Metadaten nicht möglich und drittens kann man solche Metadaten nicht mit mehreren Dateien oder mit Dateiordnern assoziieren obgleich letztere die derzeit wichtigsten Konstrukte für die Dateiorganisation darstellen. Dies bedeutet in weiterer Folge: (i) eingeschränkte Möglichkeiten der Datenorganisation, (ii) eingeschränkte Navigationsmöglichkeiten, (iii) schlechte Auffindbarkeit der gespei- cherten Daten, und (iv) Fragmentierung von Metadaten. Obschon es Versuche gab, diese Situation (zum Beispiel mit Hilfe semantischer Dateisysteme) zu verbessern, wurden die meisten dieser Probleme bisher vor allem im Web und im Speziellen im semantischen Web adressiert und gelöst. Das Anwenden dort ent- wickelter Lösungen auf dem Desktop, einer zentralen Plattform der Daten- und Metadatenmanipulation, wäre zweifellos von Vorteil. In der vorliegenden Arbeit wird ein neues, rückwärts-kompatibles Metadatenmodell als Lösungsversuch für die oben genannten Probleme präsentiert. Dieses Modell basiert auf stabilen Datei-Identifikatoren und ex- ternen, semantischen, Datei-bezogenen Metadatenbeschreibungen welche im RDF Graphenmodell repräsen- tiert werden. Diese Beschreibungen sind durch eine einheitliche Linked-Data-Schnittstelle zugänglich und können mit anderen Beschreibungen und Ressourcen verlinkt werden. Im Speziellen erlaubt dieses Modell semantische Links zwischen lokalen Dateisystemobjekten und Netzressourcen im Web sowie im entstehen- den “Daten Web” und ermöglicht somit die Integration dieser Datenräume. Das Modell hängt entscheidend von der Stabilität dieser Links ab weshalb zwei Algorithmen präsentiert werden, welche deren Integrität in lokalen und vernetzten Umgebungen erhalten können. Dies bedeutet, dass Links zwischen Dateisys- temobjekten, Metadatenbeschreibungen und Netzressourcen nicht brechen wenn sich deren Adressen än- dern, z.B. wenn Dateien verschoben oder Linked-Data Ressourcen unter geänderten URIs publiziert werden. Schließlich wird eine prototypische Implementierung des vorgeschlagenen Metadatenmodells präsentiert, welche demonstriert wie die Summe dieser Bausteine eine Metadatenschicht bildet die als Grundlage für semantische Datenorganisation auf dem Desktop verwendet werden kann. vii Table of Contents I Background1 1 Introduction 3 1.1 Data organization on the desktop . .4 1.2 Existing solution strategies . .6 1.3 Thesis and contributions . .8 2 Definitions 9 II Metadata model 19 3 The role of metadata and links 21 3.1 Core building blocks . 21 3.2 Categories of metadata . 22 3.3 Links . 25 3.4 Discussion . 30 4 A model for external, semantic, file-related metadata 33 4.1 Introduction . 34 4.2 A novel metadata model for the desktop . 35 4.3 A formal definition of the Y2 metadata model . 42 4.4 Realization of a file metadata store . 45 4.5 Related work . 54 4.6 Discussion . 58 4.7 Conclusions . 62 III Methods for preserving link integrity 65 5 The broken link problem 67 5.1 Introduction . 67 5.2 Change events . 67 5.3 Broken links . 69 ix x TABLE OF CONTENTS 5.4 Solution strategies . 70 6 DSNotify – fixing broken links in a Web of Data 75 6.1 Introduction . 75 6.2 Dataset dynamics . 76 6.3 The broken link problem in the Web of Data . 78 6.4 The DSNotify Eventset vocabulary . 79 6.5 Event detection with DSNotify . 81 6.6 Evaluation . 90 6.7 Related work . 98 6.8 Discussion . 101 7 A study of low-level file system features 105 7.1 Introduction . 105 7.2 Related work . 105 7.3 Data sets . 107 7.4 Methodology . 107 7.5 Results . 110 7.6 Discussion . 113 7.7 Conclusions . 117 8 Gorm – a heuristic, user-space method for event detection in the file system 119 8.1 Introduction . 119 8.2 The file move detection problem . 119 8.3 Proposed heuristic approach . 122 8.4 Evaluation . 136 8.5 Related work . 142 8.6 Discussion . 146 IV Implementation 149 9 Y2 – a prototype for semantic file annotation 151 9.1 User interface . 152 9.2 Implementation of the proposed metadata model . 159 9.3 Related work . 163 9.4 Discussion . 171 10 Conclusions and Outlook 175 10.1 Summary of contributions . 175 10.2 Discussion . 176 10.3 Limitations and future research directions . 177 10.4 Epilogue . 179 TABLE OF CONTENTS xi V Addenda 181 A Prevalent file systems 183 A.1 Special purpose file systems . 187 A.2 File system virtualization . 188 B Current file system linking concepts 193 C Popular metadata standards 195 C.1 Multimedia metadata standards . 195 C.2 Semantic vocabularies . 197 D Y2 RDF examples 201 Bibliography 205 Curriculum Vitae 221 List of Figures 1.1 An approximate timeline of some key file formats and technologies that did or are expected to sustainably change the way how digital information is produced, processed and shared . .3 1.2 A simplified workflow of media and its metadata . .4 1.3 Exemplary folder hierarchy for grouping semantically related files. .5 2.1 File fork / alternate data stream concept . 11 2.2 Popular Web 2.0 applications . 12 2.3 Linking Open Data cloud . 16 3.1 Metadata categories and examples . ..
Recommended publications
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • A Searchable-By-Content File System
    A Searchable-by-Content File System Srinath Sridhar, Jeffrey Stylos and Noam Zeilberger May 2, 2005 Abstract Indexed searching of desktop documents has recently become popular- ized by applications from Google, AOL, Yahoo!, MSN and others. How- ever, each of these is an application separate from the file system. In our project, we explore the performance tradeoffs and other issues encountered when implementing file indexing inside the file system. We developed a novel virtual-directory interface to allow application and user access to the index. In addition, we found that by deferring indexing slightly, we could coalesce file indexing tasks to reduce total work while still getting good performance. 1 Introduction Searching has always been one of the fundamental problems of computer science, and from their beginnings computer systems were designed to support and when possible automate these searches. This support ranges from the simple-minded (e.g., text editors that allow searching for keywords) to the extremely powerful (e.g., relational database query languages). But for the mundane task of per- forming searches on users’ files, available tools still leave much to be desired. For the most part, searches are limited to queries on a directory namespace—Unix shell wildcard patterns are useful for matching against small numbers of files, GNU locate for finding files in the directory hierarchy. Searches on file data are much more difficult. While from a logical point of view, grep combined with the Unix pipe mechanisms provides a nearly universal solution to content-based searches, realistically it can be used only for searching within small numbers of (small-sized) files, because of its inherent linear complexity.
    [Show full text]
  • A Semantic File System for Integrated Product Data Management', Advanced Engineering Informatics, Vol
    Citation for published version: Eck, O & Schaefer, D 2011, 'A Semantic File System for Integrated Product Data Management', Advanced Engineering Informatics, vol. 25, no. 2, pp. 177-184. https://doi.org/10.1016/j.aei.2010.08.005 DOI: 10.1016/j.aei.2010.08.005 Publication date: 2011 Document Version Publisher's PDF, also known as Version of record Link to publication University of Bath Alternative formats If you require this document in an alternative format, please contact: [email protected] General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Download date: 02. Oct. 2021 Advanced Engineering Informatics 25 (2011) 177–184 Contents lists available at ScienceDirect Advanced Engineering Informatics journal homepage: www.elsevier.com/locate/aei A semantic file system for integrated product data management ⇑ Oliver Eck a, , Dirk Schaefer b a Department of Computer Science, HTWG Konstanz, Germany b Systems Realization Laboratory, Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA, USA article info abstract Article history: A mechatronic system is a synergistic integration of mechanical, electrical, electronic and software tech- Received 31 October 2009 nologies into electromechanical systems. Unfortunately, mechanical, electrical, and software data are Received in revised form 10 June 2010 often handled in separate Product Data Management (PDM) systems with no automated sharing of data Accepted 17 August 2010 between them or links between their data.
    [Show full text]
  • Using Virtual Directories Prashanth Mohan, Raghuraman, Venkateswaran S and Dr
    1 Semantic File Retrieval in File Systems using Virtual Directories Prashanth Mohan, Raghuraman, Venkateswaran S and Dr. Arul Siromoney {prashmohan, raaghum2222, wenkat.s}@gmail.com and [email protected] Abstract— Hard Disk capacity is no longer a problem. How- ‘/home/user/docs/univ/project’ directory lists the ever, increasing disk capacity has brought with it a new problem, file. The ‘Database File System’ [2], encounters this prob- the problem of locating files. Retrieving a document from a lem by listing recursively all the files within each directory. myriad of files and directories is no easy task. Industry solutions are being created to address this short coming. Thereby a funneling of the files is done by moving through We propose to create an extendable UNIX based File System sub-directories. which will integrate searching as a basic function of the file The objective of our project (henceforth reffered to as system. The File System will provide Virtual Directories which SemFS) is to give the user the choice between a traditional list the results of a query. The contents of the Virtual Directory heirarchial mode of access and a query based mechanism is formed at runtime. Although, the Virtual Directory is used mainly to facilitate the searching of file, It can also be used by which will be presented using virtual directories [1], [5]. These plugins to interpret other queries. virtual directories do not exist on the disk as separate files but will be created in memory at runtime, as per the directory Index Terms— Semantic File System, Virtual Directory, Meta Data name.
    [Show full text]
  • Privacy Engineering for Social Networks
    UCAM-CL-TR-825 Technical Report ISSN 1476-2986 Number 825 Computer Laboratory Privacy engineering for social networks Jonathan Anderson December 2012 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 http://www.cl.cam.ac.uk/ c 2012 Jonathan Anderson This technical report is based on a dissertation submitted July 2012 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/techreports/ ISSN 1476-2986 Privacy engineering for social networks Jonathan Anderson In this dissertation, I enumerate several privacy problems in online social net- works (OSNs) and describe a system called Footlights that addresses them. Foot- lights is a platform for distributed social applications that allows users to control the sharing of private information. It is designed to compete with the performance of today’s centralised OSNs, but it does not trust centralised infrastructure to en- force security properties. Based on several socio-technical scenarios, I extract concrete technical problems to be solved and show how the existing research literature does not solve them. Addressing these problems fully would fundamentally change users’ interactions with OSNs, providing real control over online sharing. I also demonstrate that today’s OSNs do not provide this control: both user data and the social graph are vulnerable to practical privacy attacks. Footlights’ storage substrate provides private, scalable, sharable storage using untrusted servers. Under realistic assumptions, the direct cost of operating this storage system is less than one US dollar per user-year.
    [Show full text]
  • Modern Infrastructure for Dummies®, Dell EMC #Getmodern Special Edition
    Modern Infrastructure Dell EMC #GetModern Special Edition by Lawrence C. Miller These materials are © 2017 John Wiley & Sons, Ltd. Any dissemination, distribution, or unauthorized use is strictly prohibited. Modern Infrastructure For Dummies®, Dell EMC #GetModern Special Edition Published by: John Wiley & Sons Singapore Pte Ltd., 1 Fusionopolis Walk, #07-01 Solaris South Tower, Singapore 138628, www.wiley.com © 2017 by John Wiley & Sons Singapore Pte Ltd. Registered Office John Wiley & Sons Singapore Pte Ltd., 1 Fusionopolis Walk, #07-01 Solaris South Tower, Singapore 138628 All rights reserved No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted by the Copyright Act of Singapore 1987, without the prior written permission of the Publisher. For information about how to apply for permission to reuse the copyright material in this book, please see our website http://www.wiley.com/go/ permissions. Trademarks: Wiley, For Dummies, the Dummies Man logo, The Dummies Way, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries, and may not be used without written permission. Dell EMC and the Dell EMC logo are registered trademarks of Dell EMC. All other trademarks are the property of their respective owners. John Wiley & Sons Singapore Pte Ltd., is not associated with any product or vendor mentioned in this book. LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: WHILE THE PUBLISHER AND AUTHOR HAVE USED THEIR BEST EFFORTS IN PREPARING THIS BOOK, THEY MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS BOOK AND SPECIFICALLY DISCLAIM ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
    [Show full text]
  • Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques
    Comparative Analysis of Distributed and Parallel File Systems’ Internal Techniques Viacheslav Dubeyko Content 1 TERMINOLOGY AND ABBREVIATIONS ................................................................................ 4 2 INTRODUCTION......................................................................................................................... 5 3 COMPARATIVE ANALYSIS METHODOLOGY ....................................................................... 5 4 FILE SYSTEM FEATURES CLASSIFICATION ........................................................................ 5 4.1 Distributed File Systems ............................................................................................................................ 6 4.1.1 HDFS ..................................................................................................................................................... 6 4.1.2 GFS (Google File System) ....................................................................................................................... 7 4.1.3 InterMezzo ............................................................................................................................................ 9 4.1.4 CodA .................................................................................................................................................... 10 4.1.5 Ceph.................................................................................................................................................... 12 4.1.6 DDFS ..................................................................................................................................................
    [Show full text]
  • Performance and Scalability of a Sensor Data Storage Framework
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Aaltodoc Publication Archive Aalto University School of Science Degree Programme in Computer Science and Engineering Paul Tötterman Performance and Scalability of a Sensor Data Storage Framework Master’s Thesis Helsinki, March 10, 2015 Supervisor: Associate Professor Keijo Heljanko Advisor: Lasse Rasinen M.Sc. (Tech.) Aalto University School of Science ABSTRACT OF Degree Programme in Computer Science and Engineering MASTER’S THESIS Author: Paul Tötterman Title: Performance and Scalability of a Sensor Data Storage Framework Date: March 10, 2015 Pages: 67 Major: Software Technology Code: T-110 Supervisor: Associate Professor Keijo Heljanko Advisor: Lasse Rasinen M.Sc. (Tech.) Modern artificial intelligence and machine learning applications build on analysis and training using large datasets. New research and development does not always start with existing big datasets, but accumulate data over time. The same storage solution does not necessarily cover the scale during the lifetime of the research, especially if scaling up from using common workgroup storage technologies. The storage infrastructure at ZenRobotics has grown using standard workgroup technologies. The current approach is starting to show its limits, while the storage growth is predicted to continue and accelerate. Successful capacity planning and expansion requires a better understanding of the patterns of the use of storage and its growth. We have examined the current storage architecture and stored data from different perspectives in order to gain a better understanding of the situation. By performing a number of experiments we determine key properties of the employed technologies. The combination of these factors allows us to make informed decisions about future storage solutions.
    [Show full text]
  • Network Working Group H. Birkholz Internet-Draft Fraunhofer SIT Intended Status: Informational C
    Network Working Group H. Birkholz Internet-Draft Fraunhofer SIT Intended status: Informational C. Vigano Expires: January 4, 2018 Universitaet Bremen C. Bormann Universitaet Bremen TZI July 03, 2017 Concise data definition language (CDDL): a notational convention to express CBOR data structures draft-greevenbosch-appsawg-cbor-cddl-11 Abstract This document proposes a notational convention to express CBOR data structures (RFC 7049). Its main goal is to provide an easy and unambiguous way to express structures for protocol messages and data formats that use CBOR. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on January 4, 2018. Copyright Notice Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust’s Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
    [Show full text]
  • Orion File System : File-Level Host-Based Virtualization
    Orion File System : File-level Host-based Virtualization Amruta Joshi Faraz Shaikh Sapna Todwal Pune Institute of Computer Pune Institute of Computer Pune Institute of Computer Technology, Technology, Technology, Dhankavadi, Pune 411043, India Dhankavadi, Pune 411043, India Dhankavadi, Pune 411043, India 020-2437-1101 020-2437-1101 020-2437-1101 [email protected] [email protected] [email protected] Abstract— The aim of Orion is to implement a solution that The automatic indexing of files and directories is called provides file-level host-based virtualization that provides for "semantic" because user programmable transducers use better aggregation of content/information based on information about these semantics of files to extract the semantics and properties. File-system organization today properties for indexing. The extracted properties are then very closely mirrors storage paradigms rather than user- stored in a relational database so that queries can be run access paradigms and semantic grouping. All file-system against them. Experimental results from our semantic file hierarchies are containers that are expressed based on their system implementation ORION show that semantic file physical presence (a separate drive letter on Windows, or a systems present a more effective storage abstraction than the particular mount point based on the volume in Unix). traditional tree structured file systems for information We have implemented a solution that will allow sharing, storage and retrieval. users to organize their files
    [Show full text]
  • The Sile Model — a Semantic File System Infrastructure for the Desktop
    The Sile Model — A Semantic File System Infrastructure for the Desktop Bernhard Schandl and Bernhard Haslhofer University of Vienna, Department of Distributed and Multimedia Systems {bernhard.schandl,bernhard.haslhofer}@univie.ac.at Abstract. With the increasing storage capacity of personal computing devices, the problems of information overload and information fragmen- tation become apparent on users’ desktops. For the Web, semantic tech- nologies aim at solving this problem by adding a machine-interpretable information layer on top of existing resources, and it has been shown that the application of these technologies to desktop environments is helpful for end users. Certain characteristics of the Semantic Web archi- tecture that are commonly accepted in the Web context, however, are not desirable for desktops; e.g., incomplete information, broken links, or disruption of content and annotations. To overcome these limitations, we propose the sile model, an intermediate data model that combines attributes of the Semantic Web and file systems. This model is intended to be the conceptual foundation of the Semantic Desktop, and to serve as underlying infrastructure on which applications and further services, e.g., virtual file systems, can be built. In this paper, we present the sile model, discuss Semantic Web vocabularies that can be used in the context of this model to annotate desktop data, and analyze the performance of typical operations on a virtual file system implementation that is based on this model. 1 Introduction A large amount of information is stored on personal desktops. We use our per- sonal computing devices—both mobile and stationary—to communicate, to write documents, to organize multimedia content, to search for and retrieve informa- tion, and much more.
    [Show full text]
  • Windows Internals, Sixth Edition, Part 2
    spine = 1.2” Part 2 About the Authors Mark Russinovich is a Technical Fellow in ® the Windows Azure™ group at Microsoft. Windows Internals He is coauthor of Windows Sysinternals SIXTH EDITION Administrator’s Reference, co-creator of the Sysinternals tools available from Microsoft Windows ® The definitive guide—fully updated for Windows 7 TechNet, and coauthor of the Windows Internals and Windows Server 2008 R2 book series. Delve inside Windows architecture and internals—and see how core David A. Solomon is coauthor of the Windows Internals book series and has taught components work behind the scenes. Led by a team of internationally his Windows internals class to thousands of renowned internals experts, this classic guide has been fully updated Windows developers and IT professionals worldwide, SIXTH for Windows 7 and Windows Server® 2008 R2—and now presents its including Microsoft staff. He is a regular speaker 6EDITION coverage in two volumes. at Microsoft conferences, including TechNet As always, you get critical, insider perspectives on how Windows and PDC. operates. And through hands-on experiments, you’ll experience its Alex Ionescu is a chief software architect and internal behavior firsthand—knowledge you can apply to improve consultant expert in low-level system software, application design, debugging, system performance, and support. kernel development, security training, and Internals reverse engineering. He teaches Windows internals courses with David Solomon, and is ® In Part 2, you will: active in the security research community.
    [Show full text]