
Moving Up the Information Food Chain: Deploying Softbots on the World Wide Web

Oren Etzioni

■ I view the World Wide Web as an information food chain. The maze of pages and hyperlinks that comprise the Web are at the very bottom of the chain. The WEBCRAWLERs and ALTAVISTAs of the world are information herbivores; they graze on Web pages and regurgitate them as searchable indices. Today, most Web users feed near the bottom of the information food chain, but the time is ripe to move up. Since 1991, we have been building information carnivores, which intelligently hunt and feast on herbivores in UNIX, on the Internet, and on the Web. Information carnivores will become increasingly critical as the Web continues to grow and as more naive users are exposed to its chaotic jumble.

I view the World Wide Web as an information food chain (figure 1). The maze of pages and hyperlinks that comprise the Web are at the very bottom of the chain. The WEBCRAWLERs and ALTAVISTAs of the world are information herbivores; they graze on Web pages and regurgitate them as searchable indices. Today, most Web users feed near the bottom of the information food chain, but the time is ripe to move up. Since 1991, we have been building information carnivores, which intelligently hunt and feast on herbivores in UNIX (Etzioni, Lesh, and Segal 1993), on the Internet (Etzioni and Weld 1994), and on the Web (Doorenbos, Etzioni, and Weld 1997; Selberg and Etzioni 1997; Shakes, Langheinrich, and Etzioni 1997). I predict that information carnivores will become increasingly critical as the Web continues to grow and as more naive users are exposed to its chaotic jumble.

[Figure 1. The Information Food Chain. Personal assistants such as AHOY!, METACRAWLER, and other softbots sit at the top; mass services, the indices and directories such as ALTAVISTA and YAHOO, occupy the middle; the World Wide Web itself lies at the bottom.]

Motivation

Today’s Web is populated by a panoply of primitive but popular information services. Consider, for example, an information cow such as ALTAVISTA. ALTAVISTA requires massive memory resources (to store an index of the Web) and tremendous network bandwidth (to create and continually refresh the index). The cost of these resources is amortized over millions of queries per day. As a result, the CPU cycles devoted to satisfying each individual query are sharply curtailed. There is no time for intelligence. Furthermore, each query is independent of the previous one. No attempt is made to customize ALTAVISTA’s responses to a particular individual. The result is homogenized, least-common-denominator service.

In contrast, visionaries such as Alan Kay and Nicholas Negroponte have been advocating agents—personal assistants that act on your behalf in cyberspace. While the notion of agents has been popular for more than a decade, we have yet to build agents that are both widely used and intelligent. The Web presents a golden opportunity and an implicit challenge for the AI community. As the old adage goes, “If not us, then who? And if not now, when?”

The challenge of deploying web agents will help revitalize AI and forge closer links with other areas of computer science. But be warned: the Web community is hungry, impatient, and skeptical. They expect robustness (a working system, accessible seven days a week, twenty-four hours a day); speed (virtually all widely used Web resources begin transmitting useful, or at least entertaining, information within seconds); and added value (any increase in sophistication had better yield a tangible benefit to users).


Is the Web challenge a distraction from our long-term goal of understanding intelligence and building intelligent agents? I believe that the field benefits from a mixture of long-term and short-term goals and from both empirical and theoretical work. Work toward the goal of deploying intelligent agents on the Web is a valuable addition to the current mix for two reasons. First, the Web suggests new problems and new constraints on existing techniques. Second, intelligent Web agents will provide tangible evidence of the power and utility of AI techniques. Next time you encounter AI bashing, wouldn’t it be satisfying to counter with a few well-chosen URLs?

Personally, I find the Web irresistible. To borrow Herb Simon’s phrase, it is today’s “Main Chance.” Simon describes his move from the “academic backwater” of public administration to AI and cognitive psychology as “gravitating toward the sun” (Simon 1991, pp. 113–114). Although AI is not an academic backwater, the Web is today’s sun. Turning towards the sun and responding to the Web challenge, my collaborators and I have begun to deploy a species of information carnivores (called softbots) on the Web.

Softbots

Softbots (software robots) are intelligent agents that use software tools and services on a person’s behalf (figure 2). Tool use is one of the hallmarks of intelligence. In many cases, softbots rely on the same tools and utilities available to human computer users—tools for sending mail, printing files, and so on. Mobile robots have yet to achieve the physical analog—using vacuum cleaners, lawn mowers, etc. Softbots are an attractive substrate for intelligent-agent research for the following reasons (Etzioni 1994, 1993). First, the cost, effort, and expertise necessary to develop and systematically experiment with software artifacts are relatively low. Second, software environments circumvent many of the thorny but peripheral problems that are inescapable in physical environments. Finally, in contrast to simulated physical worlds, software environments are readily available (sophisticated simulations can take years to perfect), intrinsically interesting, and real. However, softbots are not intended to replace robots; robots and softbots are complementary.

[Figure 2. The Softbot Family Tree. Nodes include RODNEY, BargainFinder, Sims, Simon, METACRAWLER, ILA, InfoManifold, Occam, AHOY!, and SHOPBOT. The black boxes represent work at the University of Washington; the white boxes represent softbots currently deployed on the Web.]

Much of our work has focused on the Internet softbot (also known as RODNEY) (Etzioni and Weld 1994). RODNEY enables a person to state what he or she wants accomplished. RODNEY disambiguates the request and dynamically determines how and where to satisfy it, utilizing a wide range of Internet services and UNIX commands. RODNEY relies on a declarative representation of the different software tools at its disposal, enabling it to chain together multiple tools in response to a user’s request. RODNEY uses automatic planning to dynamically generate the appropriate action sequence. The Internet softbots project has led to a steady stream of technical results (Etzioni, Golden, and Weld 1997, 1994; Kwok and Weld 1996; Perkowitz and Etzioni 1995; Golden, Etzioni, and Weld 1994; Etzioni et al. 1992). Closely related projects include Kirk et al. (1995) and Arens et al. (1993).
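To convey the flavor of this declarative approach, here is a minimal sketch of my own; the tool schemas and the naive depth-limited search below are hypothetical stand-ins for RODNEY’s far richer UWL representation and planner cited above.

    # A toy declarative tool model and planner in the spirit of RODNEY.
    # The schemas are hypothetical; the real softbot used the UWL
    # representation, with information goals and sensory actions.
    from collections import namedtuple

    Action = namedtuple("Action", ["name", "preconditions", "effects"])

    # Each tool is data: what it needs, and what it achieves.
    TOOLS = [
        Action("whois",    {"have.name"},              {"have.userid"}),
        Action("finger",   {"have.userid"},            {"know.email"}),
        Action("sendmail", {"know.email", "have.msg"}, {"mail.sent"}),
    ]

    def plan(goal, state, depth=5):
        """Depth-limited forward search that chains tools together."""
        if goal <= state:
            return []
        if depth == 0:
            return None
        for tool in TOOLS:
            applicable = tool.preconditions <= state
            useful = not tool.effects <= state
            if applicable and useful:
                rest = plan(goal, state | tool.effects, depth - 1)
                if rest is not None:
                    return [tool.name] + rest
        return None

    # "Send a message to this person": whois -> finger -> sendmail.
    print(plan({"mail.sent"}, {"have.name", "have.msg"}))

The payoff of the declarative encoding is that adding a new Internet service means adding a schema, not modifying the planner.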


Unfortunately, we have yet to produce a planner-based softbot that meets the stringent demands of the Web community. While continuing our ambitious long-term project to develop planner-based softbots, we have embraced a new strategy for the creation of intelligent agents which I call “useful first.” Instead of starting with grand ideas about intelligence and issuing a promissory note that they will eventually yield useful intelligent agents, we take the opposite tack; we begin with useful softbots deployed on the Web, and issue a promissory note that they will evolve into more intelligent agents. We are still committed to the goal of producing agents that are both intelligent and useful. However, I submit that we are more likely to achieve this conjunctive goal if we reverse the traditional sub-goal ordering and focus on building useful systems first.

The argument for “useful first” is analogous to the argument made by Rod Brooks (1991) and others (Etzioni 1993; Mitchell et al. 1990) for building complete agents and testing them in a real world. As Brooks put it, “with a simplified world…it is very easy to accidentally build a submodule of the systems which happens to rely on some of those simplified properties…the disease spreads and the complete system depends in a subtle way on the simplified world” (p. 150). This argument applies equally well to user demands and real-time constraints on Web agents.

There is a huge gulf between an AI prototype and an agent ready for deployment on the Web. One might argue that this gulf is of no interest to AI researchers. However, the demands of the Web community constrain the AI techniques we use, and lead us to new AI problems. We need to recognize that intelligent agents are ninety-nine percent computer science and one percent AI. The AI is critical, but we cannot ignore the context into which it is embedded. Patrick Winston has called this the “raisin bread” model of AI. If we want to bake raisin bread, we cannot focus exclusively on the raisins.1

Operating on a shoestring budget, we have been able to deploy several softbots on the Web. I review our fielded softbots and then consider both the benefits and pitfalls of the “useful first” approach.

METACRAWLER

The METACRAWLER softbot (see www.cs.washington.edu/research/metacrawler) provides a single, unified interface for Web document searching (Selberg and Etzioni 1997, 1995). METACRAWLER supports an expressive query language that allows searching for documents that contain certain phrases while excluding documents that contain others. METACRAWLER queries five of the most popular information herbivores in parallel. Thus, METACRAWLER eliminates the need for users to try and re-try queries across different herbivores. Furthermore, users need not remember the address, interface, and capabilities of each one.
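To make the fan-out concrete, here is a minimal sketch; the engine stubs and the hit format are hypothetical, and the deployed METACRAWLER speaks each service’s actual query interface and merges scored, ranked results.

    # Query several search services in parallel and merge their hits,
    # dropping duplicate URLs. The engines here are hypothetical stubs.
    from concurrent.futures import ThreadPoolExecutor

    def metasearch(engines, query):
        with ThreadPoolExecutor(max_workers=len(engines)) as pool:
            batches = pool.map(lambda engine: engine(query), engines)
        seen, merged = set(), []
        for hits in batches:
            for url, title in hits:
                if url not in seen:      # collation: one entry per URL
                    seen.add(url)
                    merged.append((url, title))
        return merged

    # Two stand-in "herbivores" that return canned (url, title) hits.
    engines = [
        lambda q: [("http://a.example/1", q + " at A")],
        lambda q: [("http://b.example/2", q + " at B")],
    ]
    print(metasearch(engines, "softbots"))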


[Figure 3. A Comparison of AHOY!’s Accuracy or “Precision” with General-Purpose Search Engines on Two Test Samples. The chart plots the percentage of targets ranked #1 (0% to 80%) by the search services YAHOO, ALTAVISTA, HOTBOT, and METACRAWLER, and by AHOY!, on a researchers sample and a transportation sample. See Shakes, Langheinrich, and Etzioni (1997) for the details.]

Consider searching for documents containing the phrase “four score and seven years ago.” Some herbivores support phrase searching whereas others do not. METACRAWLER frees the user from having to remember such details. If a herbivore supports phrase searching, METACRAWLER automatically invokes this feature. If a herbivore does not support phrase searching, METACRAWLER automatically downloads the pages returned by that herbivore and performs its own phrase search locally.2
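That fallback amounts to a simple verification loop. The sketch below is a hypothetical rendering of it; the fetch parameter and the hit format stand in for METACRAWLER’s actual page retrieval and result records.

    # Verify a phrase locally when a search service cannot.
    def phrase_filter(hits, phrase, fetch):
        """Keep only hits whose page text contains the exact phrase.
        `fetch` maps a URL to its page text (e.g., an HTTP GET)."""
        wanted = phrase.lower()
        kept = []
        for url, title in hits:
            try:
                page = fetch(url).lower()
            except OSError:        # dead link: silently drop the hit
                continue
            if wanted in page:
                kept.append((url, title))
        return kept

    def search_phrase(engine, phrase, fetch):
        if engine.get("supports_phrases"):
            return engine["search"]('"%s"' % phrase)  # native support
        # Otherwise query on the words, then check the phrase locally.
        return phrase_filter(engine["search"](phrase), phrase, fetch)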

In a recent article, Forbes Magazine asked Lycos’s Michael Mauldin “why aren’t the other spiders as smart as METACRAWLER?” Mauldin replied “with our volume I have to turn down the smarts...METACRAWLER will too if it gets much bigger.” Mauldin’s reply misses an important point: because METACRAWLER relies on information herbivores to do the resource-intensive grazing of the Web, it is sufficiently lightweight to run on an average PC and serve as a personal assistant. Indeed, METACRAWLER-inspired PC applications such as WEBCOMPASS and Internet FASTFIND are now on the market.

METACRAWLER demonstrates that Web services and their interfaces may be de-coupled. METACRAWLER is a meta-interface with three main benefits. First, the same interface can be used to access multiple services simultaneously. Second, since the meta-interface has relatively modest resource requirements, it can reside on an individual user’s machine, which facilitates customization to that individual. Finally, if a meta-interface resides on the user’s machine, there is no need to “turn down the smarts.” In a Web-mediated client/server architecture, where intelligence resides in the client, “volume” is no longer a limiting factor on the “smarts” of the overall system.

While METACRAWLER does not currently use AI techniques, it is evolving rapidly. For example, we have developed a novel document clustering algorithm that enables users to rapidly focus on relevant subsets of the references returned by METACRAWLER (Zamir et al. 1997). In addition, we are investigating mixed-initiative dialog to help users focus their search. Most important, METACRAWLER is an enabling technology for softbots that are perched above it in the information food chain.

AHOY! The Home Page Finder

Robot-generated Web indices such as ALTAVISTA are comprehensive but imprecise; manually generated directories such as YAHOO! are precise but cannot keep up with large, rapidly growing categories such as personal home pages or news stories on the American economy. Thus, if a person is searching for a particular page that is not cataloged in a directory, she is forced to query a web index and manually sift through a large number of responses. Furthermore, if the page is not yet indexed, then the user is stymied. In response we have developed Dynamic Reference Sifting—a novel architecture that attempts to provide both maximally comprehensive coverage and highly precise responses in real time, for specific page categories (Shakes, Langheinrich, and Etzioni 1997).

The AHOY! softbot (see www.cs.washington.edu/research/ahoy) embodies dynamic reference sifting for the task of locating people’s home pages on the Web. AHOY! takes as input a person’s name and affiliation, and attempts to find the person’s home page. AHOY! queries METACRAWLER and uses knowledge of Web geography (e.g., the URLs of home pages at the University of Washington end with washington.edu) and home page appearance to filter METACRAWLER’s output. Typically, AHOY! is able to cut the number of references returned by a factor of forty but still maintain very high accuracy (see figure 3).

To improve its coverage, AHOY! learns from experience. It rapidly collects a set of home pages to use as training data for an unsupervised learning algorithm that attempts to discover the conventions underlying home page placement at different institutions. For example, AHOY! learns that home pages at the University of Washington’s Computer Science Department typically have the form www.cs.washington.edu/homes/. After learning, AHOY! is able to locate home pages of individuals even if they are not indexed by METACRAWLER’s herd of information herbivores. In our experiments, 9% of the home pages located by AHOY! were found using learned knowledge of this sort.

In the context of AHOY! the “useful first” constraint led us to tackle an important impediment to the use of machine learning on the Web. Data is abundant on the Web, but it is unlabeled. Most concept learning techniques require training data labeled as positive (or negative) examples of some concept. Techniques such as uncertainty sampling (Lewis and Gale 1994) reduce the amount of labeled data needed, but do not eliminate the problem. Instead, AHOY! attempts to harness the Web’s interactive nature to circumvent the labeling problem. AHOY! relies on its initial power to draw numerous users to it; AHOY! uses its experience with these users’ queries to make generalizations about the Web, and improve its performance.3 Note that by relying on experience with multiple users, AHOY! rapidly collects the data it needs to learn; systems that are focused on learning an individual user’s taste do not have this luxury. AHOY!’s bootstrapping architecture is not restricted to learning about home pages; the architecture can be used in a variety of Web domains.
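A minimal sketch of this sort of convention learning follows; the (host, directory) template and the helper names are my own simplification, not AHOY!’s actual pattern language.

    # Learn (host, directory) home-page templates, then guess URLs.
    from collections import Counter
    from urllib.parse import urlparse

    def template(url):
        """Reduce a home page URL to the (host, directory) prefix that
        an institution tends to share across all of its members."""
        parts = urlparse(url)
        directory = parts.path.rstrip("/").rsplit("/", 1)[0] + "/"
        return parts.netloc, directory

    def learn_conventions(confirmed_home_pages):
        return Counter(template(url) for url in confirmed_home_pages)

    def guess_urls(conventions, host, userid):
        """Candidate home pages for a user at a known host, most
        frequently observed convention first."""
        return ["http://%s%s%s/" % (h, d, userid)
                for (h, d), count in conventions.most_common()
                if h == host]

    confirmed = ["http://www.cs.washington.edu/homes/etzioni/",
                 "http://www.cs.washington.edu/homes/weld/"]
    conventions = learn_conventions(confirmed)
    print(guess_urls(conventions, "www.cs.washington.edu", "selberg"))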


SHOPBOT

SHOPBOT is a softbot that carries out comparison shopping at Web vendors on a person’s behalf (Doorenbos, Etzioni, and Weld 1997). Whereas virtually all previous Web agents rely on hard-coded interfaces to the Web sites they access, SHOPBOT autonomously learns to extract product information from Web vendors given their URL and general information about their product domain (e.g., software). Specifically, SHOPBOT learns how to query a store’s searchable product catalog, learns the format in which product descriptions are presented, and learns to extract product attributes such as price from these descriptions.

SHOPBOT’s learning algorithm is based in part on that of the Internet Learning Agent (ILA) (Perkowitz and Etzioni 1995). ILA learns to extract information from unfamiliar sites by querying with familiar objects and analyzing the relationship of output tokens to the query object. SHOPBOT borrows this idea from ILA; SHOPBOT learns by querying stores for information on popular products, and analyzing the stores’ responses. Whereas ILA learns to understand the meaning of tokens returned by unfamiliar web sites, SHOPBOT learns the format in which items are presented.

In the software shopping domain, SHOPBOT has been given the home pages for 12 on-line software vendors. After its learning is complete, SHOPBOT is able to speedily visit the vendors, extract product information such as availability and price, and summarize the results for the user. In a preliminary user study, SHOPBOT users were able to shop four times faster (and find better prices!) than users relying only on a Web browser (Doorenbos, Etzioni, and Weld 1997).
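The probing idea can be sketched in a few lines. The vendor page format below is invented, and the real system induces its extractors far more robustly, but the core move is visible: generalize a result line for a product you already know into a pattern that matches products you do not.

    # Induce an extraction pattern from one result line for a product
    # whose title is already known (the page format is hypothetical).
    import re

    def induce_pattern(sample_line, known_title):
        pattern = re.escape(sample_line)
        # Generalize the known title into a capturing group...
        pattern = pattern.replace(re.escape(known_title),
                                  r"(?P<title>.+?)")
        # ...and generalize the price token the same way.
        pattern = re.sub(r"\\\$\d+(?:\\\.\d\d)?",
                         r"\\$(?P<price>[\d.]+)", pattern)
        return re.compile(pattern)

    # Probe the vendor with a product every software store carries...
    sample = "<li>Quicken Deluxe -- $39.95 -- in stock</li>"
    extractor = induce_pattern(sample, "Quicken Deluxe")

    # ...then read off attributes for products never seen before.
    hit = extractor.match("<li>Adobe Photoshop -- $549.00 -- in stock</li>")
    print(hit.group("title"), hit.group("price"))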
NETBOT JANGO

NETBOT JANGO is a softbot (see www.jango.com) that fuses and extends ideas from METACRAWLER, AHOY!, and SHOPBOT to create a single, unified entry point for on-line shopping in over twenty product categories. JANGO searches the Web for information to help a person decide what to buy and where to buy it; in response to a query, JANGO will visit manufacturer sites for product descriptions, magazine sites for product reviews, and vendor sites for price quotes. Unlike most softbots, JANGO goes beyond information gathering and acts on a person’s behalf. When visiting a vendor, JANGO will automatically create an account and a password (saving these for future visits!). Furthermore, JANGO can be instructed to buy a product at the click of a mouse.

Consider the following example. A person queries JANGO with “Adobe Photoshop on CD-ROM.” The softbot deduces that this is a request about computer software whose title is “Photoshop,” whose manufacturer is “Adobe,” and whose medium is “CD-ROM.” In response, JANGO visits Adobe’s home page, Ziff-Davis Publications, and numerous online merchants selling computer software (e.g., CDW, Internet Shopping Network, etc.). JANGO collates the information it receives, divides it into reviews and price quotes, eliminates irrelevant information, and presents the result to the user. At this point JANGO can be instructed to buy the product using the billing and shipping information in its user’s personal profile.

JANGO consists of four key components: a natural language front-end, a query router, an aggregation engine, and a filter. The front-end takes a natural language request such as “I want that book by Bill Gates” or “find me a Woody Allen movie starring Jody Foster on VHS” and parses it into a logical description of the desired product’s attributes, which is passed to the query router. The router examines this logical description and determines the set of web resources it is familiar with that contain information relevant to the query. To make this decision, the router relies on efficient taxonomic reasoning and a massive knowledge base to identify the precise category of the product—is it a cigar, a movie title, or a classical music CD? The router associates each product category and subcategory with a set of Web sites that are likely to contain information relevant to the category. Once the product’s exact category is identified, the router returns the set of relevant sites it is familiar with.

JANGO’s aggregation engine is very similar to METACRAWLER’s. Based on the router’s output, the aggregation engine queries the relevant sites in parallel, and collates their responses. JANGO extends the METACRAWLER architecture by introducing declarative “wrappers” that enable it to communicate with hundreds of web sites. The wrappers specify how to query a site and how to parse the information returned in response. Finally, the filter module is invoked to perform the kind of stringent analysis done by AHOY! to strip and discard any information irrelevant to the query before a report is generated for the user.

In the future, planning techniques of the sort pioneered in the Internet Softbot will be used to generate multi-step plans that compose the results of queries to multiple web sites in order to obtain a precise response to a query. Finally, JANGO’s wrappers can be generated automatically using the machine learning techniques introduced in ILA and SHOPBOT (see also Kushmerick [1997]).
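To make the wrapper idea concrete, here is a hypothetical declarative wrapper of my own devising; the field names, the example site, and the page format are all invented, and JANGO’s production wrappers are considerably richer. The point is that adding a vendor means writing data rather than code.

    # A wrapper is data: how to ask a site, and how to read its answer.
    import re

    WRAPPERS = [
        {
            "site": "books.example.com",   # hypothetical vendor
            "query_url": "http://books.example.com/find?title={title}",
            "hit": re.compile(r"<tr><td>(?P<title>[^<]+)</td>"
                              r"<td>\$(?P<price>[\d.]+)</td></tr>"),
        },
        # ...one entry per vendor, hundreds in all...
    ]

    def run_wrapper(wrapper, attributes, fetch):
        """Query one site as its wrapper directs; `fetch` maps a URL to
        page text. Returns a list of attribute dictionaries."""
        url = wrapper["query_url"].format(**attributes)
        return [m.groupdict() for m in wrapper["hit"].finditer(fetch(url))]

    # The aggregation engine fans run_wrapper out over every wrapper
    # the router selected, exactly as METACRAWLER fans out queries.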


Discussion

Every methodology has both benefits and pitfalls; the softbot paradigm is no exception. Perhaps the most important benefit has been the discovery of new research challenges, the imposition of tractability constraints on AI algorithms, and the resulting innovations. In recent years, planner-based softbots have led us to the challenge of incorporating information goals, sensory actions, and closed world reasoning into planners in a tractable manner. Our focus on tractability led us to formulate UWL (Etzioni et al. 1992) and Local Closed World Reasoning (Etzioni, Golden, and Weld 1997, 1994). We expect “useful first” to be equally productive over the next few years.

I acknowledge that our approach has numerous pitfalls. Here are a couple, phrased as questions: will we fail to incorporate substantial intelligence into our softbots? Does the cost of deploying softbots on the Web outweigh the benefit? Our preliminary success in incorporating AI techniques into our deployed softbots makes me optimistic, but time will tell.

Conclusion

Each of the softbots described above uses multiple Web tools or services on a person’s behalf. Each softbot enforces a powerful abstraction: a person is able to state what they want; the softbot is responsible for deciding which Web services to invoke in response and how to do so. Each softbot has been deployed on the Web, meeting the requirements of robustness, speed, and added value. Currently, METACRAWLER receives over 220,000 queries a day and has been licensed to an Internet startup (Go2net, Inc.). Due to its narrower scope, AHOY! receives only several thousand queries per day. However, a recent New York Times article called it “the most reliable single tool for tracking down home pages.” Finally, NETBOT JANGO can be downloaded from the Web, but no usage statistics are available as this article goes to press.

Having satisfied the “useful first” constraint, our challenge is to make our current softbots more intelligent, inventing new AI techniques and extending familiar ones. We are committed to doing so while keeping our softbots both usable and useful. If we succeed, we will help to rid AI of the stereotype “if it works, it ain’t AI.” To check on our progress, visit the URLs mentioned earlier. Softbots are standing by...

Acknowledgments

This article is loosely based on an invited talk presented at AAAI-96. I would like to thank Dan Weld, my close collaborator on many of the softbots described above, for his numerous contributions to the softbots project and its vision; he cannot be held responsible for the polemic tone and subversive methodological ideas of this piece. I would also like to thank members of the University of Washington (UW) Softbots Group and the folks at Netbot, Inc. for making softbots real. Special thanks are due to Erik Selberg, who created METACRAWLER as part of his master’s dissertation. Thanks are due to Steve Hanks and other members of the UW AI group for helpful discussions and collaboration. This research was funded in part by Office of Naval Research grant 92-J-1946, by ARPA / Rome Labs grant F30602-95-1-0024, by a gift from Rockwell International Palo Alto Research, and by National Science Foundation grant IRI-9357772.

Notes

1. See Brachman (1992) for an account of the massive re-engineering necessary to transform an “intelligent first” knowledge representation system into a usable one.

2. Like most systems deployed on the web, METACRAWLER is a moving target; it has been licensed to Go2net, Inc., and they have been adding and removing features at their discretion.

3. An early version of AHOY! solicited explicit user feedback on whether AHOY! returned the correct answer to a user query. Currently, AHOY! relies on an unsupervised learning algorithm in which it automatically labels its own training examples as success or failure.


References

Arens, Y.; Chee, C. Y.; Hsu, C.-N.; and Knoblock, C. A. 1993. Retrieving and Integrating Data from Multiple Information Sources. International Journal of Intelligent and Cooperative Information Systems 2(2): 127–158.

Brachman, R. 1992. “Reducing” CLASSIC to Practice: Knowledge Representation Theory Meets Reality. In Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning. San Francisco, Calif.: Morgan Kaufmann.

Brooks, R. 1991. Intelligence without Representation. Artificial Intelligence 47(1–3): 139–159.

Doorenbos, R.; Etzioni, O.; and Weld, D. 1997. A Scalable Comparison-Shopping Agent for the World-Wide Web. In Proceedings of Autonomous Agents, 39–48. New York: Association of Computing Machinery.

Etzioni, O. 1994. Etzioni Replies. AI Magazine 15(2): 13–14.

Etzioni, O. 1993. Intelligence without Robots: A Reply to Brooks. AI Magazine 14(4): 7–13.

Etzioni, O., and Weld, D. 1994. A Softbot-Based Interface to the Internet. Communications of the ACM 37(7): 72–76.

Etzioni, O.; Golden, K.; and Weld, D. 1997. Sound and Efficient Closed-World Reasoning for Planning. Artificial Intelligence 89(1–2): 113–148.

Etzioni, O.; Golden, K.; and Weld, D. 1994. Tractable Closed-World Reasoning with Updates. In Proceedings of the Fourth International Conference on Principles of Knowledge Representation and Reasoning, 178–189. San Francisco, Calif.: Morgan Kaufmann.

Etzioni, O.; Hanks, S.; Weld, D.; Draper, D.; Lesh, N.; and Williamson, M. 1992. An Approach to Planning with Incomplete Information. In Proceedings of the Third International Conference on Principles of Knowledge Representation and Reasoning, 115–125. San Francisco, Calif.: Morgan Kaufmann.

Etzioni, O.; Lesh, N.; and Segal, R. 1993. Building Softbots for UNIX (Preliminary Report). Technical Report 93-09-01, University of Washington.

Golden, K.; Etzioni, O.; and Weld, D. 1994. Omnipotence without Omniscience: Sensor Management in Planning. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 1048–1054. Menlo Park, Calif.: American Association for Artificial Intelligence.

Kirk, T.; Levy, A.; Sagiv, Y.; and Srivastava, D. 1995. The Information Manifold. Presented at the AAAI Spring Symposium on Information Gathering from Heterogeneous, Distributed Environments, 27–29 March, Stanford, California.

Kushmerick, N. 1997. Wrapper Construction for Information Extraction. Ph.D. thesis, Department of Computer Science, University of Washington.

Kwok, C., and Weld, D. 1996. Planning to Gather Information. Technical Report 96-01-04, Department of Computer Science and Engineering, University of Washington.

Lewis, D., and Gale, W. 1994. Training Text Classifiers by Uncertainty Sampling. In Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: Association of Computing Machinery.

Mitchell, T. M.; Allen, J.; Chalasani, P.; Cheng, J.; Etzioni, O.; Ringuette, M.; and Schlimmer, J. C. 1990. THEO: A Framework for Self-Improving Systems. In Architectures for Intelligence, ed. K. VanLehn, 323–356. Hillsdale, N.J.: Lawrence Erlbaum.

Perkowitz, M., and Etzioni, O. 1995. Category Translation: Learning to Understand Information on the Internet. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 930–936. Menlo Park, Calif.: International Joint Conferences on Artificial Intelligence.

Selberg, E., and Etzioni, O. 1997. The METACRAWLER Architecture for Resource Aggregation on the Web. IEEE Expert 12(1): 8–14.

Selberg, E., and Etzioni, O. 1995. Multi-Service Search and Comparison Using the MetaCrawler. In Proceedings of the Fourth International World Wide Web Conference, 11–14 December, Boston, Massachusetts.

Shakes, J.; Langheinrich, M.; and Etzioni, O. 1997. AHOY! The Home Page Finder. In Proceedings of the Sixth International World Wide Web Conference, 7–11 April, Santa Clara, California.

Simon, H. 1991. Models of My Life. New York: Basic Books.

Zamir, O.; Etzioni, O.; Madani, O.; and Karp, R. M. 1997. Fast and Intuitive Clustering of Web Documents. Submitted to the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), 14–17 August, Newport Beach, California. Forthcoming.

Oren Etzioni received his bachelor’s degree in computer science from Harvard University in June 1986, and his Ph.D. from Carnegie Mellon University in January 1991. He joined the University of Washington as assistant professor of computer science and engineering in February 1991, and was promoted to associate professor in 1996. In 1996 he founded Netbot, Inc. with Dan Weld. In the fall of 1991, he launched the Internet Softbots project. In 1993, Etzioni received an NSF Young Investigator Award. In 1995, the Internet Softbot was chosen as one of 5 finalists in the National DISCOVER Awards for Technological Innovation in Computer Software. He has published more than 50 technical papers on intelligent software agents, machine learning, planning, etc. His group’s work on software agents has been featured in The Economist, Business Week, Discover Magazine, Forbes Magazine, CyberTimes, and The New Scientist. His e-mail address is [email protected].
