Analysis on Semantic Web Information Retrieval Methodologies
Advances in Engineering Research (AER), volume 142
International Conference for Phoenixes on Emerging Current Trends in Engineering and Management (PECTEAM 2018)
Copyright © 2018, the Authors. Published by Atlantis Press. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Literature Survey: Analysis on Semantic Web Information Retrieval Methodologies

K. Ezhilarasi (1), G. Maria Kalavathy (2)
(1) Research Scholar, Anna University, India
(2) Professor, St. Joseph’s College of Engineering, India

Abstract: The Semantic Web is an extension of the existing web that defines a standard by which information is given well-defined meaning, enabling machines to understand it. Much research is under way in the Semantic Web space, and researchers follow different approaches to retrieve data from the Semantic Web. This paper investigates the current state of the Semantic Web with a focus on effective information retrieval. Based on the architecture of a semantic search engine, a list of parameters (such as ranking algorithm and reasoning mechanism) is framed to carry out a systematic analysis of the different techniques proposed by researchers. This survey identifies around 20 unique models that retrieve data from the Semantic Web and from information systems. A summary of the selected semantic search models is given, and the models are compared by means of the defined “classification parameters”. This comparison identifies common insights, unique features and open issues. This study can be used as a guide for future application development and research.

Keywords: Semantic search engine, Semantic Information retrieval, Ontology, semantic search

I. INTRODUCTION

The Web has become part of our daily life, and the amount of information on the web is ever growing. Besides plain text, multimedia information such as graphics, audio and video has become a predominant part of the web's information traffic. But how can we find the information we need in this huge information space? Nowadays there are many web search engines for retrieving information from the Internet. These traditional search engines retrieve and display information based on the occurrence of words in a document, geographical location, etc., instead of understanding the content by exploiting semantics. For example, suppose a user wants to find the names of all Indian researchers who have written documents on the Semantic Web during the last year. This kind of search is not feasible in traditional search engines, but the Semantic Web can execute it successfully because it associates formal meaning with the content.

The focus of the Semantic Web is to make the web machine-understandable by declaring annotations, so that automated agents can understand the content on the web, establish the relationships within it and take logical decisions to accomplish complex tasks with minimum human interaction. An ontology is generally defined as a formal vocabulary and is considered one of the main pillars of the Semantic Web. It is used to describe the world by declaring concepts and their relationships in an unambiguous way. The aim of Semantic Web search engines is to fetch the reliable, precise and relevant information the user actually needs, without spending time traversing irrelevant pages as traditional search engines do.

1.1 Search Process:

Various semantic search approaches have been published to date. Their application areas and realizations vary, though they share a common set of ideas. The search process therefore included a manual keyword search on IEEE, ACM, Scopus, and Google Scholar. In the first stage we focused on the following keywords and keyword combinations: Semantic web; {Information retrieval, Search Engine, intelligent information, search} and {prototype, model, methodology}.

For data providers, we disregarded all publications released prior to 2000. General papers (e.g. original publications on key topics) were not filtered by date. These initial searches yielded semantic search papers. Every publication was perused and summarized. Lastly, we extracted a sample set of relevant references from the papers we identified as key publications.

We categorize the papers into four approaches based on functionality. Semantic search approaches [13][25][17] can use any of the following:

First Approach (Contextual Search): The contextual approach disambiguates queries so that each query has a single meaning. [1][27][15]
Second Approach (Reasoning Search): The focus is on reasoning. It derives new information from the given facts. [7][10][22][23]
Third Approach (User Query Processing): The focus is on understanding the user's query language. The goal of this approach is to identify the user's intent. [2][4][5][11]
Fourth Approach (Ontology based search): Knowledge is represented using an ontology. The system interprets the typed query with the help of the ontology so that the search can be focused. [12][23][29][30]

Semantic search engines can mix more than one approach to fulfill different functions. There is also room for search engines that do not fit into any of these types.

The filtered papers [16] are grouped into two categories:
1. Research papers that perform a search on the web.
2. Research papers that perform a search on a specific domain.

After studying the research papers, around 20 unique models/prototypes were selected. The functionalities of these prototypes are summarized and compared based on the classification parameters; this survey addresses their common perceptions, unique features and open issues.

The rest of this paper is organized as follows. Section II presents the classification parameters we framed to analyze the semantic search engine approaches. Section III gives an overview of the selected approaches with their open issues. Section IV compares all prototypes/systems based on the classification parameters. Section V outlines the scope for future research directions and concludes the paper.

II. CLASSIFICATION PARAMETERS

The architecture of a conceptual semantic search engine has components such as the crawler, parser, ontology database, knowledge repository, inference engine, ranking algorithm and user interface. With this in mind, the following parameters [14] are formulated to analyze the semantic search approaches.

1. Focus:
Under this criterion, search engines may work on information retrieval from the (Semantic) web or from special purpose information systems.

2. Transparency:
Regarding how the user interface exposes the semantic features of the system, the transparency types are as follows:
Transparent: The semantic functions of the system are invisible to the user; the system appears to be an ‘ordinary’ search engine. Transparent systems have no means to request additional information from the user.
Interactive: Interactive systems may ask the user for clarification or suggest changes to the query.
Hybrid: Hybrid systems merge both interactive and transparent behavior. Normally, they act as transparent systems. If systems require user ...

3. User Context:
The usefulness of retrieved documents always depends on the user context. Many semantic search engines apply the user context to fetch the information the user needs.
Dynamic interaction: User context is extracted dynamically from user interaction. Based on the user’s query and query-refinement history, the system guesses the desired results.
Predefined Question Category: In this approach, queries are categorized into so-called question-categories that specify the user’s information need. The system provides a fixed number of question-categories that are exploited during query evaluation.

4. System Design:
Search engines can be designed in three possible ways:
Stand-alone search engine: A stand-alone search engine performs all functions on a single machine and consists of components such as the crawler, indexer and query engine.
Meta-search engine: A meta-search engine does not have an indexer, crawler or database. It distributes queries to other subordinate search engines and combines their results to provide the search result to the user.
Distributed search engine: A distributed search engine uses other machines to carry out processes such as crawling and reasoning. It distributes all work to other machines in order to improve scalability and performance.

5. Query Refinement:
The semantic modification of user queries is a well-known technique in information retrieval. In the area of semantic search, it often exploits information from ontologies, and it plays a central role in many semantic search engines. Different techniques have been developed to increase both the recall and the precision of a query. The increase of precision is often called query disambiguation.
Query transformation falls into three categories:
Manually: The simplest way to modify a query leaves the modification to the user. When the user enters a query, the system returns not only documents but also an appropriate part of the ontology. The user navigates the ontology and reformulates the query, i.e., adds or removes query terms.
Query rewriting: Query rewriting is driven by the idea that a query can be optimized by the system. Three different ways are observed: augmentation, trimming and term substitution.
In the case of augmentation, the query is enhanced with terms that are derived from the ontological context of the original query
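
As a rough illustration of query augmentation, the sketch below expands a keyword query with terms taken from a toy ontology. It is only a minimal example under stated assumptions: the dictionary `TOY_ONTOLOGY`, the function name `augment_query` and the one-hop expansion rule are all hypothetical and are not taken from any of the surveyed systems.

```python
from typing import Dict, List

# Hypothetical toy ontology (an assumption for illustration only):
# each concept maps to terms from its ontological context, here
# narrower concepts and synonyms.
TOY_ONTOLOGY: Dict[str, List[str]] = {
    "vehicle": ["car", "truck"],
    "car": ["automobile"],
}


def augment_query(terms: List[str], ontology: Dict[str, List[str]]) -> List[str]:
    """Return the original terms followed by related ontology terms.

    Only direct neighbours are added (one hop), and duplicates are
    dropped while the original order of the terms is preserved.
    """
    expanded: List[str] = []
    seen = set()
    for term in terms:
        # Look the term up case-insensitively; unknown terms add nothing.
        for candidate in [term] + ontology.get(term.lower(), []):
            key = candidate.lower()
            if key not in seen:
                seen.add(key)
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(augment_query(["vehicle", "rental"], TOY_ONTOLOGY))
```

In this sketch a term that is unknown to the ontology passes through unchanged, so augmentation can only widen the query. A real system would typically weight the added terms lower than the original ones, which is not modelled here.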