Metadata Based Search in LAN

SALABHA P S, ABDUL NIZAR M
College of Engineering, Trivandrum

Proceedings of Fifth IRAJ International Conference, 15th September 2013, Pune, India, ISBN: 978-93-82702-29-0

Abstract: In recent years, the number of ways to keep and manage personal information has increased considerably, in line with the overall increase in the number of devices, technologies and applications on which knowledge workers rely. The fragmentation of personal information increases the probability of keeping something locked away in a device, application or format and forgetting that it was ever seen, heard, or read in the first place. Information does not only exist in personal computers, but is continuously produced and revised in local area networks. Due to the increased availability of data and the standards that have evolved in recent years, applications of semantic technologies in organizational information systems have increased. The application of ontologies in organizational information systems allows the integration of heterogeneous information items within the organizational memory. Semantic architectures bring together information sources which, previously, would have been more difficult to manage. In this paper we try to develop a tool that aims at effectively using semantic technologies to support searching data in a LAN based on the attributes of a file. The data stored in the LAN is indexed periodically, and the information collected is stored in the form of RDF triples. This metadata can later be used for searching documents.

Keywords: indexing, lan, metadata, search, semantic desktop

I. INTRODUCTION

Traditional personal computers manage resources in two ways. One way is based on directory and file name. The other is managing resources through the application: each type of resource file is associated with one or more applications and can only be accessed by the appropriate application. For example, Word documents can be edited by Microsoft Word, and video files may be processed by the Microsoft Media Player, Real Player or another application. However, with the rapid development and popularization of personal computers and the Internet, the traditional management of personal computer resources can no longer meet the actual needs of users, mainly in the following aspects:

• Rapid increase in the size of hard disks
• Fragmentation of personal information
• Filenames may not reflect content
• Computers cannot get a great deal of information about the content of files
• Inability to manage semantic links between resources
• Time needed to locate information

This paper employs the emerging technology of the semantic desktop to provide a new way to solve this problem. The Semantic Desktop is a collective term for ideas related to changing a computer's user interface and data handling capabilities so that data is more easily shared between different applications or tasks, and so that data that once could not be automatically processed by a computer could be. It also encompasses some ideas about being able to automatically share information between different people. This concept is very much related to the Semantic Web but is distinct insofar as its main concern is the personal use of information. The semantic desktop improves the efficiency with which we search for and locate resources. It can also create relations between different types of resources, so that users can handle their personal affairs more efficiently.

II. SEMANTIC TECHNOLOGY

Semantic technology encodes meanings separately from data and content files, and separately from application code. This enables machines as well as people to understand, share and reason with them at execution time. With semantic technologies, adding, changing and implementing new relationships or interconnecting programs in a different way can be just as simple as changing the external model that these programs share.

With traditional information technology, on the other hand, meanings and relationships must be predefined and "hard wired" into data formats and the application program code at design time. This means that when something changes, previously unexchanged information needs to be exchanged, or two programs need to interoperate in a new way, humans must get involved. Off-line, the parties must define and communicate between them the knowledge needed to make the change, then recode the data structures and program logic to accommodate it, and then apply these changes to the database and the application. Then, and only then, can they implement the changes.

Semantic technologies are "meaning-centered." They include tools for auto recognition of topics and concepts, information and meaning extraction, and categorization. Given a question, semantic technologies can directly search topics, concepts and associations that span a vast number of sources.

Semantic technologies provide an abstraction layer above existing IT technologies that enables bridging and interconnection of data, content, and processes. Second, from the portal perspective, semantic technologies can be thought of as a new level of depth that provides far more intelligent, capable, relevant, and responsive interaction than with information technologies alone.

III. DESIGN CONCEPT

A. Resource Identification

In the semantic desktop, URIs are used to identify data. Each resource has a unique identifier that can be used to locate data from any computer in a network. The basic syntax of a URI is <scheme>://<authority><path>?<query>. For example, it can be of the form mail://<email id>/<program name>/<type>/<identity>.

B. Information Representation

Resource Description Framework (RDF) is a language used for the expression of World Wide Web information resources. It is specifically used for representing metadata about web resources, such as a web page's title, author, modification time etc. By generalizing the concept of web resources, RDF can be used to express anything that can be identified on the web, even if it cannot be accessed directly from the web. The relationship between computer resource, URI and RDF is described in Fig. 1.

Figure 1: Resource described in RDF

RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.

C. RDF Data Storage

Currently, there are three kinds of RDF data storage format: the RDF/XML file format, a special XML/RDF database (triple store) and a traditional relational database. For a small amount of data, storage in RDF/XML files is workable. But for large amounts of data, taking into account scalability, data integrity and security, query efficiency and many other factors, using a relational database or an RDF/XML database to store RDF data is a relatively good choice.

A triplestore is a purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred". Much like a relational database, one stores information in a triplestore and retrieves it via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of triples. In addition to queries, triples can usually be imported/exported using Resource Description Framework (RDF) and other formats. Some triplestores can store billions of triples. Some of the common implementations of triple stores are Soprano, Virtuoso Universal Server, Apache Jena, Sesame etc.

D. Ontology

In the context of knowledge sharing, the term ontology means a specification of a conceptualization. That is, an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. An ontology provides a shared vocabulary, which can be used to model a domain, that is, the type of objects and/or concepts that exist, and their properties and relations. We can establish hundreds of relationships between resources. Different people may refer to the same relationship (or property) using different names. Also, the same name may mean different things to different people. This may create ambiguity while creating an ontology. Instead of creating our own ontology, we used ontologies developed as an open-source project called Shared Desktop Ontologies (SDO). The development process is centered around the SDO Trac repository, which is open to contributions from everyone.

IV. PROTOTYPE SYSTEM

In order to support metadata based search in a LAN, a prototype system called Locator was developed. It utilizes semantic technologies to collect metadata and identify relationships between resources.

A. System Architecture

Figure 2 shows the system architecture of Locator.

[...] directories based on these relationships. When a user issues a query, it is analyzed and matched with the ontology library to find results. The results of queries are ranked according to relevance. The frequency of keyword in the file and [...]
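
The design concept of Section III (a URI per resource, file attributes stored as subject-predicate-object triples, and retrieval by attribute) can be illustrated with a minimal sketch. This is not the Locator implementation: it assumes a file:// scheme for files on a LAN host, uses a plain in-memory set of tuples in place of a triple store, and the predicate names (ex:fileName and so on) are hypothetical placeholders rather than terms from the Shared Desktop Ontologies.

```python
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import quote

# Hypothetical predicate names, used for illustration only.
# A real system would take these from an ontology.
HAS_NAME = "ex:fileName"
HAS_SIZE = "ex:fileSize"
HAS_TYPE = "ex:fileType"
HAS_MODIFIED = "ex:lastModified"


def file_uri(host, path):
    """Build a <scheme>://<authority><path> identifier for a file on `host`."""
    posix = Path(path).resolve().as_posix()
    if not posix.startswith("/"):
        posix = "/" + posix  # e.g. drive-letter paths
    return "file://" + host + quote(posix)


def index_directory(host, root):
    """Walk `root` and return a set of (subject, predicate, object) triples."""
    triples = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            subject = file_uri(host, path)
            extension = os.path.splitext(name)[1].lower().lstrip(".")
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            triples.add((subject, HAS_NAME, name))
            triples.add((subject, HAS_SIZE, info.st_size))
            triples.add((subject, HAS_TYPE, extension))
            triples.add((subject, HAS_MODIFIED, modified.isoformat()))
    return triples


def search(triples, wanted):
    """Return the URIs whose triples match every predicate/value pair in `wanted`."""
    matches = None
    for predicate, value in wanted.items():
        subjects = {s for (s, p, o) in triples if p == predicate and o == value}
        matches = subjects if matches is None else matches & subjects
    return sorted(matches or [])


if __name__ == "__main__":
    # Index a throwaway directory and look up files by attribute.
    with tempfile.TemporaryDirectory() as root:
        for name in ("report.pdf", "notes.txt"):
            with open(os.path.join(root, name), "w") as handle:
                handle.write("sample")
        store = index_directory("host-a", root)
        print(search(store, {HAS_TYPE: "pdf"}))
```

Periodic indexing, as described in the abstract, would amount to re-running index_directory on a schedule and replacing the stored triples. Ranking of results is left out of this sketch.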