Extending Basic Spotlight Technology by Using APIs

Faculty of Computer Science, Free University of Bozen-Bolzano
Piazza Domenicani 3, 39100 Bolzano, Italy
Tel: +39 04710 16000, Fax: +39 04710 16009, http://www.inf.unibz.it/krdb/

EMCL Project Report
Muhammad Faheem

Abstract

Spotlight is an advanced search technology based on metadata and integrated with the file system, which enables users to search files quickly and efficiently. Spotlight does more than search for documents: it provides APIs that help developers use this metadata in their own domains. The aim of this project is to explore the Spotlight APIs and extend them so that we can run intelligent queries over them.

Contents

1 Introduction
2 Motivation
3 Preliminaries
4 Spotlight APIs
  4.1 Metadata
  4.2 Spotlight Store
  4.3 Different ways to examine the file's metadata
    4.3.1 High Level Language Program
    4.3.2 Command Line
  4.4 Spotlight Queries
    4.4.1 Finder Tool
    4.4.2 Programming Languages
    4.4.3 mdfind Tool
  4.5 Xcode 3.2
    4.5.1 Creating New Project
    4.5.2 Xcode WorkSpace
    4.5.3 Interface building
    4.5.4 Objective C
    4.5.5 A simple Graphical User Interface application based on Predicate
5 Enriching basic Spotlight Technology
  5.1 Spotlight Schema
  5.2 Beyond Spotlight APIs
  5.3 A Graphical User Application based on Key metadata attribute
  5.4 Use Cases and Evaluation
6 Conclusion and Future Work
References

1 Introduction

Nowadays, new technologies from different fields of research may converge to implement new and more sophisticated office automation systems. One of these technologies allows a modern operating system to store in the file system meta-information about the content of various types of documents, such as media files, instant messages, and office documents.
The Internet, the web, and electronic mail have revolutionized the way we communicate and collaborate. We are much more connected, and in turn our demands have increased. Metadata about files, contacts, and messages is now available; the challenge is to use this information in the right way. Spotlight provides all the facilities we need for searching files in the file system. It is tightly integrated with the operating system, which gives it an edge over other technologies such as Google Desktop. To use the metadata in other applications, we need to extend the capabilities of Spotlight. We can query the system using the gray search box at the top-right corner of the Mac screen, using the Finder, or from the command line. To run more intelligent queries, however, we need the Spotlight APIs. With them we can run queries based on key attributes, e.g. kMDItemKind.

2 Motivation

Spotlight allows text searching of user emails, files, photos, music, chats, web history, etc. We can use the Spotlight APIs to make better use of this meta-information. The main motivation for exploring the Spotlight APIs is to assist the ongoing research on the Semantic Desktop Tool at the KRDB¹ center. We aim to assist at the data level of the Semantic Desktop application, and we discuss a more organized approach than the one already in use. To use meta-information in the field of the Semantic Web, the data must be organized in a better structure. Spotlight organizes metadata poorly for this purpose, because its main goal is to search for files quickly. We therefore extend the Spotlight APIs to fill this gap and make the data well structured. The Semantic Desktop Tool uses a methodology to extract a conceptual schema from raw data and a raw schema. We believe that the Spotlight APIs can help to extract the conceptual schema and also to populate the ABox based on it.
The past work done on the Semantic Desktop motivates us to develop this strategy. Put another way, we are trying to provide a search engine (a wrapper developed using Spotlight) for the Semantic Desktop Tool. For example, the Semantic Desktop needs to run queries like:
• Give me the names of all persons who work on Project AAA.
• Tasks associated with the project.
• Person X's role in a company.
• Who replied to an email with subject A, sent by the Manager on 20-09-2010?
• PDF files written by Faheem that are part of a project.
To run these kinds of queries against the file system we need a fast and efficient tool at the data level. We believe that we cannot run these queries directly on Spotlight; rather, we need to use the APIs provided by Spotlight and extend them so that we can run such complex queries. Although these queries look simple, they require joins between different types of files.

One point needs to be clarified here: why do we prefer Spotlight over Google Desktop²? There are several reasons, but we will only pinpoint those relevant to the requirements of the Semantic Desktop Tool. We prefer Spotlight because it has better integration with the OS, so it uses fewer resources and can index better and search faster. Owing to this integration, Spotlight updates itself every time the hard drive is written to. In our experience, a file deleted from the file system is sometimes still shown in Google Desktop's search results, and trying to open it gives the message "File not found". Spotlight, by contrast, indexes quickly and updates its database within a few seconds because of its integration with the OS.

¹ KRDB Research Centre for Knowledge and Data, Free University of Bozen-Bolzano.
² Google Desktop makes searching your computer as easy as searching the web with Google; see [1].

3 Preliminaries

We do not require any special skills, but we assume that the reader of this paper has basic programming skills and a good understanding of the Mac environment.
We will develop our small application using the Xcode IDE [3] [6] and write our code in Objective-C [9]. Readers are recommended to read the tutorials [3] [6] [9] before reading this paper. We start with a brief introduction to Spotlight, but the main purpose of this paper is to extend basic Spotlight technology by using the Spotlight APIs.

4 Spotlight APIs

Spotlight [2] is a fast desktop search technology and a fundamental feature of Mac OS X that allows the user to organize and search files based on metadata. For years people have talked about making the file system fast and easy to search using metadata, but it remained talk, with no technical development behind it. Other operating systems have long promised this capability but have not delivered it. Third-party add-ons have started to appear that can search across the file system based on metadata, but with many limitations. Tiger, however, is the first industrial-strength operating system to provide fully integrated, fast, and efficient search across all the files on the system.

Organizing the files on a system so that they are easily accessible is a difficult task, and end users are mostly responsible for it. Even the most organized user will not find it easy to arrange files in a way that makes metadata easy to find. As the file system provides only one way of organizing information, users must rely on some special tool to search for what they want. Most such tools are slow, limited in how they search, and inefficient on complex queries. For instance, a user may want to search for something other than a file, e.g. an email sent by John on 13-08-2010.

Spotlight is an advanced search technology based on metadata and integrated with the file system.
It keeps track of the file system and keeps the Spotlight store up to date so that every file is easily accessible. Every time a file is created, moved, saved, copied, or deleted, the file system automatically ensures that the file is properly cataloged, indexed, and ready for whatever search query might be issued.

Spotlight is more than just searching for documents. Spotlight importers define the metadata that the Finder can display in its Get Info panel. This information provides further details about documents. Examples of metadata:
• Video files provide their dimensions, pixel depth, and other color-related information.
• Movies provide their duration.
• PDF files provide information about the authors, creation date, dimensions, encoding, and where they originated.
• Contacts provide information about the first name, last name, email address, phone number, and instant messaging address.

Spotlight is available not only to end users but also to developers, to help them enrich their applications with Spotlight capabilities. Tiger does not impose any restrictions or limits on the use of the Spotlight APIs. Several technologies power Spotlight and give it an advantage over other existing technologies. Spotlight technologies:
• A database consisting of a high-performance metadata store and content index that is fully integrated with the file system.
• Programmatic APIs, part of the CoreServices and Cocoa frameworks, that help users query the metadata store and content index.
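
To make the idea of querying the metadata store by key attribute more concrete, the following is a minimal sketch in Python. It does not call the CoreServices or Cocoa APIs; instead it builds a Spotlight query string and hands it to the mdfind command-line tool, which is only present on Mac OS X. The helper names build_query and run_mdfind are hypothetical, and the attributes kMDItemContentType and kMDItemAuthors with their values are assumptions chosen for illustration, not taken from this report.

```python
import shutil
import subprocess


def build_query(conditions):
    """Join (attribute, value) pairs into one Spotlight query string.

    Each pair becomes `attribute == "value"`; pairs are combined with &&.
    Hypothetical helper, not part of any Spotlight API.
    """
    parts = []
    for attribute, value in conditions:
        # Escape backslashes and double quotes inside the quoted value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{attribute} == "{escaped}"')
    return " && ".join(parts)


def run_mdfind(query, only_in=None):
    """Run mdfind with the query and return the matching paths.

    Returns an empty list when mdfind is not installed (non-Mac systems).
    """
    if shutil.which("mdfind") is None:
        return []
    command = ["mdfind"]
    if only_in is not None:
        # Restrict the search to one directory tree.
        command += ["-onlyin", only_in]
    command.append(query)
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return [line for line in result.stdout.splitlines() if line]


if __name__ == "__main__":
    # Assumed example: PDF files whose author metadata is "Faheem".
    query = build_query([
        ("kMDItemContentType", "com.adobe.pdf"),
        ("kMDItemAuthors", "Faheem"),
    ])
    print(query)
    for path in run_mdfind(query):
        print(path)
```

A query of this kind covers only a single file type; the joins across different file types discussed in the Motivation section would still have to be done by the calling program on top of such results.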
