Riemann Space Model and Similarity-Based Web Retrieval

RIEMANN SPACE MODEL AND SIMILARITY-BASED WEB RETRIEVAL

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, University of Regina

By Zhiwei Wang
Regina, Saskatchewan
February 24, 2001

© Copyright 2001: Zhiwei Wang

National Library of Canada, Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats. The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Abstract

Similarity-based matching is widely used in the vector space model. However, the widespread adoption of similarity-based matching is hampered by disagreements over how similarity measures should be constructed and how large databases should be indexed so that similarity matching is even possible. This thesis intends to overcome these hindrances and to establish a theoretical basis and implementation guidelines for applying similarity-based matching in Web retrieval.
The thesis analyzes the vector space model and shows that Web space would be modeled more exactly as a curved space rather than as a Euclidean space. Based on this, the thesis claims that it is inappropriate to attempt to apply a single similarity/dissimilarity measure globally on Web space. The thesis proposes a Riemann space model that explains previously unexplained phenomena. In the Riemann space model, dissimilarity functions are integrated into a single form of geodesic distances, which can be locally computed in a uniform formula. To some extent, this answers the long-existing open problem of identifying conditions for the use of a particular similarity/dissimilarity measure.

According to the theory of the Riemann space model, we propose a multi-stage approach that combines exact matching and partial matching in the design of new Web retrieval systems. In this approach, a retrieval system first forms a neighborhood of a query. This can be done using exact matching. Then, in the chosen neighborhood, more complicated similarity-based matching is performed. The documents are ranked according to their geodesic distances to the query. This is equivalent to using a ranking function specially designed for the given neighborhood. Since the similarity-based matching is performed only in a neighborhood, the computational cost involved in the search process would be reduced. The Riemann space model provides a sound theoretical basis for this multi-stage approach.

As a demonstration of application, we designed and implemented a personal Web retrieval (PWR) system. Different from current search engines, subject trees, and metasearch engines, this system is a client-side program. It works like a personal secretary. It reads Web documents, ranks them according to their geodesic distances to the query, and also considers the user's general search interests. It can be viewed as a prototype of intelligent Web retrieval systems.
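The multi-stage approach described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. The abstract does not give the geodesic distance formula, so the sketch stands in the standard local line element of Riemannian geometry, ds² = Σ g_ij dx_i dx_j, evaluated with one fixed metric for the whole neighborhood. The function names (`exact_match_neighborhood`, `local_distance`, `retrieve`), the dictionary-based term weights, and the sample data are all hypothetical.

```python
import math

def exact_match_neighborhood(query_terms, documents):
    """Stage 1: keep only documents that contain every query term (exact matching)."""
    required = set(query_terms)
    return {doc_id: weights for doc_id, weights in documents.items()
            if required <= set(weights)}

def local_distance(query_vec, doc_vec, metric):
    """Stage 2: distance under a local quadratic form, sqrt(sum_ij g_ij * dx_i * dx_j).

    `metric` maps (term_i, term_j) pairs to g_ij; missing pairs count as 0.
    Assumption: a single metric is valid across the whole neighborhood.
    """
    terms = sorted(set(query_vec) | set(doc_vec))
    diff = {t: doc_vec.get(t, 0.0) - query_vec.get(t, 0.0) for t in terms}
    total = 0.0
    for ti in terms:
        for tj in terms:
            total += metric.get((ti, tj), 0.0) * diff[ti] * diff[tj]
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))

def retrieve(query_vec, documents, metric):
    """Form the neighborhood by exact matching, then rank it by local distance."""
    neighborhood = exact_match_neighborhood(query_vec, documents)
    return sorted(neighborhood,
                  key=lambda d: local_distance(query_vec, neighborhood[d], metric))

# Hypothetical sample data: term weights per document and an identity-like metric.
sample_docs = {
    "a": {"riemann": 1.0, "web": 1.0},
    "b": {"riemann": 0.5, "web": 1.0, "metric": 2.0},
    "c": {"web": 1.0},
}
sample_query = {"riemann": 1.0, "web": 1.0}
sample_metric = {(t, t): 1.0 for t in ("riemann", "web", "metric")}

ranking = retrieve(sample_query, sample_docs, sample_metric)
```

Because the expensive distance computation runs only on the documents that survive the exact-matching stage, the cost of ranking depends on the size of the neighborhood rather than the size of the whole collection, which is the saving the abstract points to.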
Acknowledgments

First, I would like to express my gratitude to my supervisor, Dr. R. B. Maguire. Without his valuable guidance and kind encouragement, this thesis would have been impossible. I am also grateful to my previous supervisor, Dr. S. K. M. Wong, for his instructive training and kind help over a long period of time. I thank the committee members, Dr. B. Gilligan, Dr. S. K. M. Wong, and Dr. Y. Y. Yao, and the external examiner, Dr. R. G. Goebel. Their instructive suggestions and comments greatly improved the thesis. I gratefully acknowledge the financial support provided by the Government of Saskatchewan, the Faculty of Graduate Studies and Research, and the Department of Computer Science. Especially, I appreciate the full-time career opportunity I have had since 1994 in the Department of Computer Science, University of Regina. I thank Dr. K. E. Denford, former Dean of the Faculty of Science, and Dr. L. V. Saston, former Head of Computer Science. I am also indebted to my wife Xiaozhu, my daughter Min, and son Zheng for their devoted love and support.

Contents

Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures

Chapter 1 Introduction
  1.1 World Wide Web
  1.2 Retrieval Techniques
  1.3 An Overview of the Riemann Space Model
    1.3.1 A Brief History of Non-Euclidean Geometry
    1.3.2 Riemann Space Model
  1.4 Summary of Contributions
  1.5 Thesis Outline

Chapter 2 A Geometrical Analysis of the Vector Space Model
  2.1 Generalized Linear Similarity Measures
    2.1.1 Definition
    2.1.2 Examples
  2.2 An Analysis of Generalized Similarity Measures
    2.2.1 Notations and Definitions
    2.2.2 The Pseudo-Cosine Measure
    2.2.3 The Cosine Measure
  2.3 Existence Theorems
    2.3.1 Preliminaries
    2.3.2 The First Existence Theorem
    2.3.3 The Second Existence Theorem
  2.4 User Preference, Query, and Underlying Surface

Chapter 3 The Theory of Riemann Space Model
  3.1 Motivation
  3.2 Formal Properties of Similarity and Dissimilarity
    3.2.1 Some Inappropriateness in the Conventional Similarity Measures
    3.2.2 Axiomatic Properties of Similarity and Dissimilarity Functions
  3.3 The Notion of Curved Web Space
    3.3.1 Integration of Dissimilarity Functions
    3.3.2 Local Linearization of Global Dissimilarity
  3.4 Mathematical Notions in Riemann Space Model
    3.4.1 Topological Space
    3.4.2 Differentiable Manifold
    3.4.3 Tangent Space
    3.4.4 Riemannian Metrics and Geodesics
  3.5 Curvilinear Coordinate System
  3.6 Keyword Analysis: the Topography in Web Space

Chapter 4 Application of the Riemann Space Model
  4.1 An Overview of Web Retrieval Systems
  4.2 A Personal Web Retrieval System
  4.3 System Outline and Architecture
  4.4 Automatic Query Formulating
    4.4.1 Query By Example
    4.4.2 User Modeling
  4.5 Dynamic Keyword Analysis
    4.5.1 Local Coordinate System
    4.5.2 Statistical versus Semantical Analysis
  4.6 Best First Search

Chapter 5 Sample Implementation
  5.1 Introduction
  5.2 Algorithm Outline
  5.3 System Features
    5.3.1 The Main Window
    5.3.2 Controlled Vocabulary
    5.3.3 Keyword Extraction
    5.3.4 Document Collection
    5.3.5 The URL-Keyword Window
    5.3.6 The Keyword-Keyword Window
    5.3.7 Ranking

Chapter 6 Conclusion and Discussion
  6.1 Main Contributions
    6.1.1 The Riemann Space Model
    6.1.2 Local Linearization
    6.1.3 User Modeling and Subspace
    6.1.4 Multi-Stage Personal Web Retrieval
  6.2 Future Research and Open Problems

Appendix A Sample Source Code
Bibliography

List of Tables

  1.1 Growth of the Internet
  4.1 A List of Some Subject Trees
  4.2 A List of Some Search Engines
  4.3 A List of Some Metasearch Engines
  Comparison of Different Sorting Algorithms

List of Figures

  Classification of Retrieval Techniques
  A Contour Curve
  A Contour Curve in Pseudo-Cosine Measure Model. The underlying hypersurface U is a plane. The contour curve usually is a straight line.
  A Contour Curve in Cosine Measure Model. The underlying hypersurface U is a sphere. The contour curve usually is a circle.
  The Four Modules of the PWR System
  Query Formulating Module
  Internet Searching Module
  The Algorithm of the PWR System
  Main Window
  Main Window with Keywords Extracted