METADATA BASED SEARCH IN LAN

¹SALABHA P S, ²ABDUL NIZAR M
College of Engineering, Trivandrum

Proceedings of Fifth IRAJ International Conference, 15th September 2013, Pune, India, ISBN: 978-93-82702-29-0

Abstract: In recent years, the number of ways to keep and manage personal information has increased considerably, in line with the overall increase in the number of devices, technologies and applications on which knowledge workers rely. The fragmentation of personal information increases the probability of keeping something locked away in a device, application or format and forgetting that it was ever seen, heard, or read in the first place. Information does not only exist on personal computers, but is continuously produced and revised in local area networks. Due to the increased availability of data and evolved standards in recent years, applications of semantic technologies in organizational information systems have increased. The application of ontologies in organizational information systems allows the integration of heterogeneous information items within the organizational memory. Semantic architectures bring together information sources which, previously, would have been more difficult to manage. In this paper we develop a tool that aims to use semantic technologies effectively to support searching data in a LAN based on the attributes of a file. The data stored in the LAN is indexed periodically and the information collected is stored in the form of RDF triples. This metadata can later be used to search for documents.

Keywords: indexing, LAN, metadata, search, semantic desktop

I. INTRODUCTION

Traditional personal computers manage resources in two ways. One way is based on directory and file name. The other is managing resources through the application. Each type of resource file is associated with one or more applications and can only be accessed by the appropriate application. For example, Word documents can be edited by Microsoft Word, and video files may be processed by Windows Media Player, RealPlayer or another application. However, with the rapid development and popularization of personal computers and the Internet, the traditional management of personal computer resources can no longer meet the actual needs of users, mainly in the following aspects:

- Rapid increase in the size of hard disks
- Fragmentation of personal information
- Filename may not reflect content
- Computers cannot get a great deal of information about the content of files
- Inability to manage semantic links between the resources
- Time to locate information

This paper employs the emerging technology of the semantic desktop to provide a new way to solve this problem. The Semantic Desktop is a collective term for ideas related to changing a computer's user interface and data handling capabilities so that data is more easily shared between different applications or tasks, and so that data that once could not be automatically processed by a computer could be. It also encompasses some ideas about being able to automatically share information between different people. This concept is very much related to the Semantic Web but is distinct insofar as its main concern is the personal use of information. The semantic desktop improves the efficiency with which we search for and locate resources. It can also create relations between different types of resources, so that users can handle their personal affairs more efficiently.

II. SEMANTIC TECHNOLOGY

Semantic technology encodes meanings separately from data and content files, and separately from application code. This enables machines as well as people to understand, share and reason with them at execution time. With semantic technologies, adding, changing and implementing new relationships or interconnecting programs in a different way can be just as simple as changing the external model that these programs share.

With traditional information technology, on the other hand, meanings and relationships must be predefined and “hard wired” into data formats and the application program code at design time. This means that when something changes, previously unexchanged information needs to be exchanged, or two programs need to interoperate in a new way, humans must get involved. Off-line, the parties must define and communicate between them the knowledge needed to make the change, then recode the data structures and program logic to accommodate it, and then apply these changes to the database and the application. Then, and only then, can they implement the changes.

Semantic technologies are “meaning-centered.” They include tools for auto recognition of topics and concepts, information and meaning extraction, and categorization. Given a question, semantic technologies can directly search topics, concepts and associations that span a vast number of sources. Semantic technologies provide an abstraction layer above existing IT technologies that enables bridging and interconnection of data, content, and processes. From the portal perspective, semantic technologies can be thought of as a new level of depth that provides far more intelligent, capable, relevant, and responsive interaction than with information technologies alone.

III. DESIGN CONCEPT

A. Resource Identification

In the semantic desktop, URIs are used to identify data. Each resource has a unique identifier that can be used to locate data from any computer in a network. The basic syntax of a URI is <scheme>://<authority><path>?<query>. For example, it can be of the form mail://<email id>/<program name>/<type>/<identity>.

B. Information Representation using Resource Description Framework (RDF)

Resource Description Framework (RDF) is a language for describing information resources on the World Wide Web. It is specifically used for representing metadata about web resources, such as a web page's title, author, modification time etc. By generalizing the concept of web resources, RDF can be used to express anything that can be identified on the web, even if it cannot be accessed directly from the web. The relationship between computer resource, URI and RDF is described in Figure 1.

Figure 1: Resource described in RDF

RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.

C. RDF Data Storage

Currently, there are three kinds of RDF data storage: RDF/XML files, a dedicated RDF database (triple store) and a traditional relational database. For a small amount of data, storage in RDF/XML files is adequate. But for large amounts of data, taking into account scalability, data integrity and security, query efficiency and many other factors, using a relational database or a triple store to store RDF data is a relatively good choice.

A triplestore is a purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred". Much like a relational database, one stores information in a triplestore and retrieves it via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported/exported using Resource Description Framework (RDF) and other formats. Some triplestores can store billions of triples. Some of the common implementations of triple stores are Soprano, Virtuoso Universal Server, Apache Jena, Sesame etc.

D. Ontology

In the context of knowledge sharing, the term ontology means a specification of a conceptualization. That is, an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. An ontology provides a shared vocabulary, which can be used to model a domain, that is, the type of objects and/or concepts that exist, and their properties and relations.

We can establish hundreds of relationships between resources. Different people may refer to the same relationship (or property) using different names. Also, the same name may mean different things to different people. This may create ambiguity while creating an ontology. Instead of creating our own ontology, we used ontologies developed as an open-source project called Shared Desktop Ontologies (SDO). The development process is centered around the SDO Trac repository, which is open to contributions from everyone.

IV. PROTOTYPE SYSTEM

In order to support metadata based search in LAN, a prototype system called Locator was developed. It utilizes semantic technologies to collect metadata and identify relationships between resources.

A. System Architecture

Figure 2 shows the system architecture of Locator.

[...] directories based on these relationships. When a user issues a query, it is analyzed and matched with the ontology library to find results. Query results are ranked according to relevance. The frequency of keyword in the file and [...]
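The idea of indexing file attributes as subject-predicate-object triples and then searching them (the abstract and Section III-C) can be illustrated with a minimal sketch. It is not the Locator implementation: the predicate names (`ex:fileName`, `ex:fileSize`, `ex:lastModified`) are hypothetical stand-ins for Shared Desktop Ontologies terms, the "store" is a plain in-memory Python set rather than a real triplestore, and the helper names `index_directory`, `match` and `demo` are assumptions made for illustration.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical predicate names; the paper relies on the Shared Desktop
# Ontologies vocabulary, which is not reproduced here.
FILE_NAME = "ex:fileName"
FILE_SIZE = "ex:fileSize"
LAST_MODIFIED = "ex:lastModified"


def index_directory(root):
    """Walk root and return a set of (subject, predicate, object) triples."""
    triples = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath, name).resolve()
            info = path.stat()
            subject = path.as_uri()  # file:// URI identifies the resource
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            triples.add((subject, FILE_NAME, name))
            triples.add((subject, FILE_SIZE, info.st_size))
            triples.add((subject, LAST_MODIFIED, modified.isoformat()))
    return triples


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def demo():
    """Index a throwaway directory with two files and search by file name."""
    with tempfile.TemporaryDirectory() as root:
        Path(root, "report.txt").write_text("hello")
        Path(root, "notes.txt").write_text("semantic desktop")
        store = index_directory(root)
        hits = match(store, predicate=FILE_NAME, obj="report.txt")
        return store, hits
```

Calling `demo()` returns the full triple set and the triples whose file name is `report.txt`. A periodic indexer would rerun `index_directory` on a schedule and write the result to a persistent triplestore, where the pattern match would be expressed in the store's query language.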

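The URI layout from Section III-A, <scheme>://<authority>/<path segments>, can be sketched with the Python standard library. This is only an illustration under stated assumptions: the helper names `make_resource_uri` and `parse_resource_uri` and the sample values (an address at example.org, a program called "mail client") are hypothetical and do not come from the paper.

```python
from urllib.parse import quote, unquote, urlsplit


def make_resource_uri(scheme, authority, *segments):
    """Build <scheme>://<authority>/<segment>/... with percent-encoding."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{scheme}://{quote(authority, safe='@.:')}/{path}"


def parse_resource_uri(uri):
    """Split a resource URI into (scheme, authority, decoded path segments)."""
    parts = urlsplit(uri)
    segments = [unquote(s) for s in parts.path.split("/") if s]
    return parts.scheme, parts.netloc, segments


# Hypothetical identifier following the mail://<email id>/<program name>/
# <type>/<identity> pattern described in the text.
uri = make_resource_uri("mail", "alice@example.org", "mail client", "message", "42")
```

Percent-encoding each path segment keeps names that contain spaces or slashes from changing the structure of the identifier, so parsing the URI recovers the original segments.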