Searching for New Search Technologies

Industry Trends

Ilan Greenberg and Lee Garber

Editor: Lee Garber, Computer, 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720-1314; [email protected]

Searching for Web sites is one of the most common tasks performed on the Web. It is also one of the most frustrating. In fact, the situation has become a notorious symbol of the Web’s growing size and lack of structure, as well as the inadequacy of Web search technologies.

However, a number of Web companies and research organizations are taking a variety of approaches to try to solve this problem.

Traditional search technology (see the sidebar “Traditional Web Search Technology”) is based on users typing in keywords for the information they want to receive. Search services then scan Web pages for those keywords. This approach consistently causes a number of well-known problems.

Users must try to come up with the correct keywords for their search. If the keywords are too general or have multiple meanings, users may receive too many results or too many irrelevant results to find the information they want. For example, a search for “history of rock” could yield results related to popular music, geology, or history classes at a university. Meanwhile, the wrong choice of keywords may lead to useless results or no results at all.

“In general, what’s happening is the Boolean search techniques from the 60s and 70s are running out of gas,” said Kevin Werbach, managing editor of Release 1.0, a newsletter on emerging communications and computing technologies.

There are considerable commercial incentives for developing better search technologies. Various search services, including AltaVista, Excite, Lycos, and Yahoo, are turning their Web pages into portals. Portals are Web home bases from which users can access a variety of services, including searches, e-commerce, stock prices, weather forecasts, chat rooms, and driving directions.

Companies want to attract more people to their portals because the more unique users they attract, the more money they can charge advertisers and partners on the sites. Figure 1 lists the five Web sites that attracted the most unique visitors as of May 1999.

Figure 1. According to Media Metrix, a company that provides Internet and digital-media measurement services, the Web-search sites with the greatest number of unique visitors as of May 1999 were Yahoo, Infoseek, Excite, Lycos, and AltaVista. (Bar chart; vertical axis: unique visitors, in millions. Source: Media Metrix.)

Search services are a key to attracting users because they are an important reason people use portals in the first place. Moreover, about 71 percent of Internet users utilize search services to find Web sites, according to Nielsen Media Research, a company that measures computer and Internet usage (as well as television audience levels).

Currently, though, said Werbach, “For consumers, most of the search engines are pretty comparable.” Users thus frequently choose a portal for reasons other than its search service.

However, a company might attract more users to its portal if it could offer an improved search technology. Researchers are thus looking at a variety of new technologies and techniques.

CHALLENGES

The sheer size of the Web is a challenge to improving search technology. There are more than 350 million Web pages, and AltaVista contains only about 140 million of them, one of the largest totals of any search service.

Meanwhile, the Web is constantly changing, with new URLs added and old pages discarded. NEC’s Research Institute estimated that in 1998, more than 5 percent of search results in one prominent search service were invalid or “dead” links.

NEW APPROACHES

Researchers are taking a variety of approaches to improving Web search technology. For example, some search services are making their Web indexes bigger, in an effort to make their results more comprehensive.

Human annotation

The human-annotation approach bases search results on the behavior of and the results obtained by previous Web searchers, rather than just on keywords. Proponents say the results of previous searches, as well as Webmasters’ decisions about which pages their sites should link to, better indicate which sites will satisfy new searches. They also say this technique reduces the ability of a Web site to use keywords to manipulate search services.

However, Release 1.0’s Werbach and other critics say this approach can limit a search service’s effectiveness by forcing it to reflect past usage and not leaving it open enough to meet the needs of new users.

Direct Hit. The Direct Hit search service uses a technology it calls the Popularity Engine. A proprietary algorithm tracks users through Web searches. Direct Hit cofounder Gary Culliss said the tracking is done anonymously and cannot match specific IP addresses to Web pages.

The Popularity Engine monitors which Web pages a user accesses, how much time the user spends at each site, and which hyperlinks the user clicks on. Direct Hit then uses this information to rate the relevance of individual Web sites to specific searches.

According to Culliss, this technique turns users into search editors.

Google. Google, founded by Stanford University doctoral students Sergey Brin and Larry Page, is something of a hybrid between the keyword and human-annotation approaches.

Google uses its own crawler, called Googlebot, to zip around the Web. But instead of looking for keywords, Googlebot searches for hyperlinks. In response to a search topic, Googlebot looks for Web pages that hyperlink to other pages that are deemed relevant to the topic, based on text-matching and other techniques. Google ranks such pages highly and is likely to return them in response to a search query.

Clever. IBM is developing search technology it calls Clever, which uses an algorithm it calls HITS (Hyperlink-Induced Topic Search). (See the related article, “Mining the Web’s Link Structure,” on page 60.) The technology starts with a standard keyword search to get a root set of results. It then looks for documents that link to and from the root results. Clever rates the Web pages in the root set and the linked pages on the basis of how many other sites link to them.

Pages that many Web site authors have chosen to link to are called authorities and are considered to be valuable sources of content. Web sites that link to many authorities are called hubs and are considered to be valuable reference tools.

Built for speed

Fast Search & Transfer (http://www.fast.no) is using several approaches in an effort to make its search service (http://www.alltheweb.com) faster.

Through the scalability in the architecture, the average response time for an advanced search is under a second, compared to an industry average of four to four-and-a-half seconds, said Ray Romagnolo, a vice president at Fast Search & Transfer.

The company credits its search service’s performance in part to fast index- [...]

[...] using natural language,” said John Lafferty, associate professor at Carnegie Mel- [...]

[...] queries is limited because the algorithms are still quite immature.
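
As a rough illustration of the hub and authority ratings described for Clever, the following Python sketch expands a root set with the pages that link to and from it, then scores the resulting pages. The article does not give IBM's actual procedure; the iterative update used here is the commonly published formulation of HITS, and the page names, the link graph, and the function names `expand_root_set`, `normalize`, and `hits` are hypothetical, chosen only for this example.

```python
from math import sqrt


def expand_root_set(root, links):
    """Return the root pages plus pages that link to or from them.

    `links` maps each page to the set of pages it links to.
    """
    base = set(root)
    for src, targets in links.items():
        if src in root:
            base.update(targets)          # pages a root result links to
        elif any(t in root for t in targets):
            base.add(src)                 # pages that link to a root result
    return base


def normalize(scores):
    """Scale scores to unit Euclidean length (leave all-zero scores alone)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Compute hub and authority scores for the graph in `links`."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        new_auths = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auths[dst] += hubs[src]
        # A page's hub score is the sum of the authority scores it links to.
        new_hubs = {p: sum(new_auths[d] for d in links.get(p, ())) for p in pages}
        auths = normalize(new_auths)
        hubs = normalize(new_hubs)
    return hubs, auths


# Hypothetical link graph: page -> pages it links to.
web = {
    "guide": {"archive", "museum"},
    "directory": {"archive", "museum", "blog"},
    "blog": {"archive"},
    "shop": {"cart"},                     # unrelated to the query
}
root = {"archive", "museum"}              # assumed keyword-search results

base = expand_root_set(root, web)
subgraph = {s: {t for t in ts if t in base} for s, ts in web.items() if s in base}
hubs, auths = hits(subgraph)

print("authorities:", sorted(auths, key=auths.get, reverse=True))
print("hubs:", sorted(hubs, key=hubs.get, reverse=True))
```

In this toy graph the page with the most inbound links from good hubs ends up as the top authority, and the page linking to the most authorities ends up as the top hub, which mirrors the article's description. A real service would also need the keyword search that produces the root set, which is assumed here.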
