Domain Oriented Semantic Web Based Personalized Search Engine

2014 Fifth International Conference on Intelligent Systems, Modelling and Simulation
2166-0662/14 $31.00 © 2014 IEEE. DOI 10.1109/ISMS.2014.12

Shruti Kohli, Department of Computer Science, BIT, Noida
Sonam Arora, Department of Computer Science, BIT, Noida

Abstract: The present market is dominated by search engines that work on a keyword-based querying system. Such a system is of little use, and wastes the user's time, if the user is not aware of the keywords used to index the desired relevant pages. For example, if a user enters the keyword ‘Book’, Google will show results for both ‘Reading any book’ and ‘Book a Hotel’. The user therefore has to look into the contents of the web pages to shortlist the relevant ones. The same problem exists in image search engines: if the query is a search for images of ‘hotel in Delhi’, the image result set will contain irrelevant as well as relevant images. A solution is needed in which the machine itself divides the results into relevant and irrelevant images and shows only the relevant ones to the user. This is not feasible, however, because the machine would have to check the content of each image using image processing techniques and then check for similarity between all the images, which is not implementable for millions of records worldwide. Another solution is Semantic Web technologies. The Semantic Web is an extension of the current web that allows the meaning of information to be precisely described, so that it can be understood by computers as well as by users. Ontology is a very important ingredient of the Semantic Web, and in this work an ontology for the hotel domain is used. The user is provided with an easy-to-use interface to query the hotel ontology. The technologies used are the SPARQL query language and the JENA API for searching the user query inside the ontology. In this work, focus is given to preserving the user's preferences while displaying results on the web page. The challenge was to load the hotel dataset into the ontology dynamically in RDF format. This was done using a Semantic Tool which internally uses the Google AJAX API to populate the latest results from the Internet. The advantage of using the Semantic Web is that it returns only relevant images, which in turn increases the precision and recall rates of the search engine.

Keywords: SPARQL; RDF; Semantic Web; Web3.0

I. INTRODUCTION

Nowadays, whatever information a user needs, he gets it online, anywhere and anytime. If a user wants to visit some place, he would like to search for information about that place, the hotels available, the weather etc. beforehand. The user may also like to look at images of hotels before checking their websites. For this, lots of search engines are available online in today's market. The user enters a keyword for his query and is then flooded with the images that are available online. The problem is that the information displayed contains relevant as well as irrelevant results. The user's work increases, because he must separate his desired set of images from the pool of results displayed. This leads to wastage of time and energy. The efficiency of a search engine, in terms of the number of relevant documents returned, is measured by two factors, Precision and Recall. The above-said weakness of today's search engines leads to low precision and recall.

This weakness can be resolved to a large extent using Semantic Web technologies. This technology works by extending the existing web with semantics that provide meaning to a web image or document. This meaning is defined in a way that is understood by machines, thus reducing the effort of users in searching for relevant images. In Semantic technologies, information is represented in a new W3C standard called Resource Description Framework (RDF), and ontology is the main ingredient. Resource Description Framework has recommended a well-defined format for representing ontology. Currently existing formats are RDF/XML, OWL, Turtle etc. Research on semantic technologies is at a beginning stage, and therefore traditional search engines like Google, Bing, Yahoo etc. still dominate the market.

Take as an illustration a query to search for “Taj Mahal”. Traditional search engines will display thousands of images for “Taj Mahal in Agra” as well as for “Taj Mahal casino in Atlantic City, USA”. The search engine will not refine the results for the user; the user himself has to sift through the result set to find the results relevant to him. Information retrieval in the current scenario relies only on keyword searches using Google, Yahoo, Bing etc., or on simple metadata such as that of an RSS feed. Moreover, there is no provision to generate personalized searches easily, so users need to think of and write search keywords that match their own requirements. Such a process of searching is time consuming and requires a lot of human effort. If a user does not know the keywords to be used for searching, he cannot perform relevant information retrieval.

Semantic searching increases this efficiency by providing only relevant results to the user. It represents the data available over the internet in the format of an ontology, which contains the description of information using metadata. The user does not need to think of keywords that will give the desired result; instead, the user can simply provide the search engine with whatever information he has by selecting a domain.

The core idea of a domain-based search engine is to describe the query in the form of a domain description. For this, the user needs to build a query of a type that is well understood by the Semantic Web. This paper proposes an architecture where queries are not built using natural language, but through an easy-to-use user interface that helps users build the complex queries they want.

II. CHALLENGES

Keyword-based search is useful especially to a user who knows what keywords are used to index the images, as such a user can easily define queries. This approach becomes problematic when the user does not know how to write a query such that only desirable results appear, because for that he must know the semantic concepts used in the particular domain in which he is interested. Therefore, after the user enters the query, he is returned some irrelevant images along with the relevant ones. To check this efficiency of a search engine, two parameters are available: Precision and Recall. Consider Figure 1.

Figure 1. Precision vs. Recall

Let A be the number of relevant records retrieved, let B be the number of relevant records not retrieved and let C be the number of irrelevant records retrieved.

Precision: Percentage of returned pages that are relevant. In other words, the capability of minimizing the number of irrelevant links returned to the users.
Precision = A*100 / (A+C)

Recall: Percentage of relevant pages that are returned. In other words, the capability of maximizing the number of relevant links returned to the users.
Recall = A*100 / (A+B)

All search mechanisms to date operate with precision and recall percentages that are too low. For example, consider a situation where a user enters the query “images of hotels in Delhi” on Google; the search results may contain lots of unwanted images which are of no interest to the user. Consider Figure 2.

Figure 2. Google images results

Semantic Web based search aims to provide better precision and recall rates than keyword-based search. The challenge is to create a domain-based semantic web search engine which is highly user friendly and provides advanced search options with the help of various parameters that a user can think of. It is user friendly in the sense that the user need not think of appropriate keywords that might give the desired result; instead, the user can simply provide the search engine with whatever information he has by selecting the provided options.

It is now clear that ontology is the main ingredient of the Semantic Web and that it is built for a particular domain. This work uses the domain of hotels available in the continents of Asia and Europe. The challenge is to load the RDF data into the Hotel ontology dynamically. For this, a semantic tool is developed. This semantic tool makes use of the Google AJAX API internally to fetch results from the Google search engine. Over these results, it employs URL checking to separate relevant results from irrelevant ones. The relevant search results are then transformed into RDF format and populated in the Hotel ontology. From the given interface the user chooses the desired options and then sends the query to this search engine, which in turn provides only the reliable results from the ontology.

Another challenge was to display results to the users such that their preferences are taken into account. For this, the user's click history is tracked by the system.

An advantage of using the Semantic Web is that users need not be aware of the concepts supporting the search in order to use it. Their experience should be as close as possible to the one they currently have with the current web and the search engines they use daily.

Some of the latest works relating to semantic areas are:

WANG Yong-gui and JIA Zhen [1] gave an introduction to the Semantic Web and its mining and then proved that their integration can bring a lot of effectiveness to Web Mining. For that they used a five-step process which integrated all of Web mining with Semantic technologies.

Jiang Huiping [2] proposed a semantic web search model to enhance the efficiency and accuracy of IR for unstructured and semi-structured documents.

... simple variations of them in a prominent way. In case user has entity oriented queries then they will work well only ...
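
The precision and recall definitions given in Section II (with A, B and C as defined there) can be checked with a small worked example. The following is only a minimal sketch: the function name and the counts are hypothetical and do not come from the paper's experiments.

```python
def precision_recall(a: int, b: int, c: int) -> tuple[float, float]:
    """Return (precision, recall) as percentages.

    a: relevant records retrieved
    b: relevant records not retrieved
    c: irrelevant records retrieved
    """
    if a + c == 0 or a + b == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = a * 100 / (a + c)  # share of returned records that are relevant
    recall = a * 100 / (a + b)     # share of relevant records that were returned
    return precision, recall


# Hypothetical counts for one image query (not taken from the paper).
precision, recall = precision_recall(a=30, b=10, c=20)
print(f"Precision: {precision:.1f}%  Recall: {recall:.1f}%")
```

With these illustrative counts, 30 of the 50 returned images are relevant (precision 60%) and 30 of the 40 relevant images are returned (recall 75%).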
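
The loading step described in Section II (fetch results, apply URL checking, transform the relevant results into RDF, populate the Hotel ontology) could look roughly like the sketch below. The paper does not say how URL checking is done or which vocabulary the Hotel ontology uses, so the keyword rule, the `http://example.org/hotel#` namespace and the property names are all assumptions made for illustration. The search results are hard-coded instead of being fetched from a search API, and the output is written as N-Triples lines.

```python
from urllib.parse import urlparse

HOTEL_NS = "http://example.org/hotel#"   # hypothetical ontology namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RELEVANT_HINTS = ("hotel", "resort")     # assumed URL keywords, not from the paper


def is_relevant(url: str) -> bool:
    """Assumed URL check: keep results whose host or path mentions a hint."""
    parsed = urlparse(url.lower())
    return any(hint in parsed.netloc + parsed.path for hint in RELEVANT_HINTS)


def to_ntriples(index: int, title: str, url: str) -> list[str]:
    """Describe one search result as three N-Triples statements."""
    subject = f"<{HOTEL_NS}result{index}>"
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"{subject} <{RDF_TYPE}> <{HOTEL_NS}Hotel> .",
        f'{subject} <{HOTEL_NS}name> "{escaped}" .',
        f"{subject} <{HOTEL_NS}imageUrl> <{url}> .",
    ]


# Hard-coded stand-ins for results returned by a web search.
results = [
    ("Taj Palace Hotel", "http://example.com/hotels/taj-palace.jpg"),
    ("Taj Mahal Casino", "http://example.com/casino/taj.jpg"),
]

triples: list[str] = []
for i, (title, url) in enumerate(results):
    if is_relevant(url):
        triples.extend(to_ntriples(i, title, url))

print("\n".join(triples))
```

Only the first result passes the assumed URL check, so only its three statements would be added to the ontology.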
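
The paper's central idea, that users select options in an interface rather than typing keywords, and that the selection is turned into a SPARQL query over the hotel ontology, can be sketched as follows. The `h:` prefix, the property names (`h:name`, `h:imageUrl`, `h:city`, `h:stars`) and the function `build_query` are hypothetical. The paper executes queries through the JENA API, whereas this sketch only builds the query text and does not run it.

```python
from typing import Optional

PREFIX = "PREFIX h: <http://example.org/hotel#>"  # hypothetical namespace


def build_query(city: Optional[str] = None,
                min_stars: Optional[int] = None,
                limit: int = 20) -> str:
    """Turn options selected in a form into a SPARQL SELECT query string."""
    body = [
        "?hotel a h:Hotel .",
        "?hotel h:name ?name .",
        "?hotel h:imageUrl ?image .",
    ]
    if city is not None:
        # Escape backslashes and quotes so the value stays a valid literal.
        escaped = city.replace("\\", "\\\\").replace('"', '\\"')
        body.append(f'?hotel h:city "{escaped}" .')
    if min_stars is not None:
        body.append("?hotel h:stars ?stars .")
        body.append(f"FILTER(?stars >= {int(min_stars)})")
    lines = [PREFIX, "SELECT ?name ?image WHERE {"]
    lines += [f"  {pattern}" for pattern in body]
    lines += ["}", f"LIMIT {int(limit)}"]
    return "\n".join(lines)


# A user picks "Delhi" and "4 stars or more" from the interface.
print(build_query(city="Delhi", min_stars=4))
```

Because every constraint comes from a fixed set of interface options, the user never has to guess which keywords index the desired images; the resulting string could then be handed to any SPARQL engine.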