AI Magazine Volume 27 Number 4 (2006) (© AAAI) Articles AI Meets Web 2.0

Building the Web of Tomorrow, Today

Jay M. Tenenbaum

Illustrations by Kevin Hughes

This article is derived from the 2005 Innovative Applica- niques. It also articulates an exciting set of business tions of Artificial Intelligence conference invited talk “AI opportunities to deliver on the economic promise of e-com- Meets Web 2.0: Building the Web of Tomorrow, Today” merce and AI. presented by Jay M. Tenenbaum in Pittsburgh, Pennsylva- – Neil Jacobstein, nia. I invited Marty to deliver that talk at IAAI-05 Chair, IAAI-05 because he is a true AI and web visionary. It is common in times of grade inflation and hyperbole that speakers ■ Imagine an -scale knowledge system where and authors are introduced as “visionaries.” Jay M. people and intelligent agents can collaborate on Tenenbaum (or Marty, as he is known) actually is a world-renowned Internet commerce pioneer and visionary. solving complex problems in business, engineer- Mary envisioned the commercial and industrial use of the ing, science, medicine, and other endeavors. Its Internet more than a decade before it became a reality, at resources include semantically tagged websites, a time when only academics, industrial R&D groups, and , and blogs, as well as social networks, vertical government labs had access to it. At the time, Marty’s e- search engines, and a vast array of web services commerce vision evoked all kinds of resistance and objec- from business processes to AI planners and domain tions about why it couldn’t or shouldn’t be done. models. Research prototypes of decentralized In 1991, Marty founded and became the chief execu- knowledge systems have been demonstrated for tive officer of Enterprise Integration Technologies, the first years, but now, thanks to the web and Moore’s law, company to conduct a commercial Internet transaction they appear ready for prime time. This article intro- (1992), a secure web transaction (1993), and an Internet duces the architectural concepts for incrementally auction (1993). In 1994, Marty formed the CommerceNet growing an Internet-scale knowledge system and industry association and jump-started the Internet mar- illustrates them with scenarios drawn from e-com- ketplace by convincing 50 leading corporations to get merce, e-science, and e-life. online. In 1997, he cofounded Veo Systems, which pio- neered the use of XML documents for automating busi- n this article, I want to share a vision of how ness-to-business transactions. In 1999, Commerce One to build or, more precisely, grow Internet- acquired Veo Systems, and Marty ultimately served as its scale knowledge systems. Such systems chief scientist. Marty now splits his time between Internet I enable large numbers of human and computer startups Efficient Finance, Medstory, and Patients Like agents to collaborate on solving complex prob- Me, and the nonprofit CommerceNet. This article points to the technical opportunity to meet lems in engineering, science, and business or Allen Newell’s criteria for intelligent systems by incremen- simply managing the complexities of life (say tally assembling complex web services applications out of planning a trip or an event). It’s a vision that’s component web services, tagged data, and inference tech- been evolving over 20 years since my days as

Copyright © 2006, American Association for Artificial Intelligence. All rights reserved. ISSN 0738-4602 WINTER 2006 47 Articles

oncile these two views and create a new breed of hybrid knowledge systems that combine the • Exhibit adaptive goal-oriented behavior best elements of AI and the web, and of humans and machines. Such hybrid systems • Learn from experience can solve heretofore intractable real-world problems and potentially demonstrate hereto- • Use vast amounts of knowledge fore unattainable levels of machine intelli- gence. • Exhibit self-awareness To give a hint of where I’m headed, I’ll revisit • Interact with humans using language and speech some classic AI problem-solving tasks, which are as timely now as they were in Newell’s day • Tolerate error and ambiguity in communication (table 2). I’ve worked on many of them, and I’m sure readers have too. How would you as • Respond in real time an individual approach any of these tasks today? Undoubtedly, by searching the web, which provides thousands of relevant informa- tion sources and services. Table 1. Newell’s Criteria for Intelligent Systems. If I were building an AI system to solve these problems, I too would start with the web, grad- ually adding a little AI here and there to auto- mate time-consuming functions and integrate Travel What’s the best way to get to Pittsburgh? processes. Take travel, for example. I might start by creating a simple agent to check many travel sites for available flights and fares. A sec- Meetings Who should I have dinner with tonight? Where? ond agent might use these results to select an optimal itinerary based on my personal prefer- Supply chain How can I get 100 PCs delivered by tomorrow? ences, frequent flyer accounts, and calendar schedule. Now imagine giving millions of people this Medicine What drugs might be effective against this new bug? ability to create agents that incrementally auto- mate tasks, as well as the ability to publish E-business Where can I get the best deal on a car and a loan? those agents as services that others can use and build upon. Before long, interesting forms of will begin to emerge. Approaching AI applications in this way has manifest benefits—the ability to leverage bil- Table 2. Classic AI Problems. lions of dollars of web infrastructure, millions of web services, and millions of knowledge an AI researcher and, more recently, as an engineers all building on each other’s work. Internet entrepreneur. Thanks to the explosive Moreover, such systems are useful from day growth of the web, it’s a vision whose time has one, because they start with the web. Automa- come. I also have a larger goal: to bridge the AI tion proceeds incrementally—one person, and and web communities, which have so much to one task at a time. Each step provides immedi- give to and learn from each other. ate, incremental benefit. And because people Twenty-five years ago, at the birth of AAAI, remain in the loop, such systems can fail grace- Allan Newell articulated a set of criteria that a fully, by handing off hard or unusual cases for system had to exhibit to be considered intelli- manual processing. gent (see table 1). Newell was very explicit that I’ll start this article by reviewing the evolu- an intelligent system had to exhibit all of these tion of the vision from its roots in AI research criteria. This requirement reflected the then through the early days of the web and more prevailing view that intelligent systems were recent developments such as web services and monolithic and were developed centrally by an Web 2.0 (referring to the spate of recent web individual or small group. innovations that includes blogs, wikis, social The web has shown us a different path to networks, and the like). These recent develop- intelligence—millions of simple knowledge ments make the web more programmable and services, developed collaboratively in a decen- more participatory, important attributes for tralized way by many individuals and groups, Internet-scale knowledge systems. I’ll review all building on each other, and demonstrating Web 2.0 technologies and methodologies from a collective form of intelligence. It’s time to rec- an AI perspective and then illustrate, through

48 AI MAGAZINE Articles

Program Management

Avionics Airframe

A Personal Agents A A Agents F

F F

F F

Facilitators A A

Propulsion Hydraulics

Figure 1. Concurrent Engineering and the PACT Architecture case studies and scenarios, some knowledge through a shared , representing systems that synthesize Web 2.0 and classic AI a model of the artifact. When an agent modi- approaches. Finally, I’ll conclude with a call for fied the design, affected agents were notified so action for realizing the vision on an Inter- they could critique the change or respond with netwide scale. changes of their own. For example, a hydraulics engineer installing some fluid lines might alert an airframe agent to check whether Evolution those lines interfered with the movement of Twenty years ago, I was running a research lab any control surface. Although a centralized for Schlumberger. At the time, Schlumberger shared knowledge base is depicted, the model was trying to break into the CAD/CAM busi- can be distributed in practice to facilitate scal- ness through its acquisition of Applicon. I was ing. Each agent maintains aspects of the model fascinated by the possibility of using the Inter- most relevant to it in local CAD systems and net to support large-scale engineering projects, provides that information to other agents that such as the design of a new airplane, involving need it. thousands of people at hundreds of companies (figure 1). The Palo Alto Inspired by distributed blackboard systems Collaborative Testbed (PACT) such as Hearsay, we modeled the process as We built several research prototypes of such human and computer agents collaborating agent-based collaborative engineering systems

WINTER 2006 49 Articles

Building a Virtual Company

Copyright © 1994 Enterprise Integration Technologies

Figure 2. The Virtual Company circa 1994.

during the late 1980s. The most impressive was ly mapped messages so that agents could focus the Palo Alto Collaborative Testbed (PACT), on design-related tasks. Finally, personal agents which was documented in the January 1992 enabled people to participate in this agent ecol- issue of IEEE Computer. PACT used the Internet ogy. Users could respond directly to messages to link four independently developed knowl- from other agents using a graphical user inter- edge-based design systems (covering require- face (GUI); they could also specify pattern- ments management, kinematics, dynamics, action rules that generated automated respons- and electronics) at three different organiza- es to routine messages. tions—EIT, Lockheed, and Stanford. The sys- In retrospect, PACT was a seminal paper, tems then collaborated on the design of a small articulating three key principles for building robotic manipulator. large distributed knowledge systems. Para- The PACT architecture consisted of three phrasing from the conclusion, the first princi- components, as depicted in figure 1. First, ple states that “Instead of literally integrating agents encapsulated existing design systems, code, modules should be encapsulated by transforming them into knowledge services. agents, and then invoked when needed as net- Second, facilitators handled interagent com- work services.” This is the core idea behind munication, routing messages on the basis of today’s service-oriented architectures. The sec- content to potentially interested agents. Facili- ond principle holds that agents should com- tators also filtered, prioritized, and semantical- municate on a knowledge level, using the

50 AI MAGAZINE Articles of the problem domain, rather than medium-size companies that compose the bulk the local standards employed by encapsulated of the electronics industry value chain. systems. Finally, messages should be routed With web services, any company can publish intelligently at run time based on their con- on its web server product descriptions, prices tent, since it’s often impossible to know a priori and availabilities, shipping and production which agents might be affected by a design schedules, and so on in a form that both peo- action. ple and machines can understand. With such information, it suddenly becomes feasible to Electronic Commerce optimize forecasting, production planning, By 1990, I had shifted my focus to Internet inventory management, and other enterprise commerce. The company I founded, Enterprise resource planning (ERP)-like functions across Integration Technologies (EIT), developed an entire industry rather than just within a sin- some of the earliest shopping sites including gle enterprise. Internet Shopping Network (ISN), launched in At least that’s the vision. Unfortunately, in April 1994. ISN pioneered many innovations practice, there are three problems with web including online catalogs, shopping carts, and services that stand in the way of doing busi- cookies and served as a role model for Amazon ness. and other shopping sites. PACT’s notion of First is the problem of complex standards. integrating services also figured prominently. Web services promised lightweight integration, As depicted in figure 2, we envisioned ISN as a using simple, open protocols inspired by the virtual company, outsourcing fulfillment serv- web. Things started out simply enough with ices to Merisel, shipping services to FedEx, and SOAP, WSDL, and UDDI. However, they rapidly payment services to Citibank. We even envi- got out of hand as people began adding fea- sioned integrating a build-to-order supplier like tures that pushed web services toward a full- Dell and creating a virtual computer company. fledged CORBA-like distributed operating sys- From an AI perspective, we thought of these tem. Today there are literally dozens of services as special=purpose problem solvers. standards supporting features such as business FedEx, for example, with more than 100,000 process orchestration, security, and service pro- employees and thousands of trucks and planes visioning and management. Additionally, was a special-purpose problem solver for logis- there are hundreds of specialized XML lan- tics, invoked through a simple message-based guages that define the semantics for doing application programming interface (API)— business in particular industries. Each such lan- ”Get this package to New York City by tomor- guage typically entails a painful multiyear stan- row morning.” It struck us as foolish to write a dards process whereby companies attempt to logistics planner rather than leverage such a reconcile their local data and business process powerful off-the-shelf resource. standards. The electronics industry has been attempting to standardize on RosettaNet, which is sponsored by leading manufacturers Web Services and distributors. However, after more than five Now let us fast-forward to today’s world of web years, only a few RosettaNet processes have services. Companies in competitive industries moved past initial pilots. like electronics are beginning to publish busi- Second is the problem of static processes. ness services (for example, place an order, make Business demands agility, but business process a payment, track a shipment) that others can execution language (BPEL), the emerging web discover in registries and integrate into their services standard for business process orches- own processes, then publish new services that tration, was designed for static processes. At the others can build on. With critical mass, busi- least, a rules engine is needed to select the right ness services are expected to grow explosively BPEL script at run time. Even better, rules like the web itself. should be embedded within BPEL so that Prior to web services, companies that wanted actions can respond dynamically to events. to integrate their business processes had to Third is the problem of point-to-point com- resort to expensive custom integration or elec- munications. Standard web services rely on tronic data interchange (EDI) solutions. Such point-to-point communications. However, as projects typically cost hundreds of thousands we’ve seen, it’s often not clear until run time of dollars and took months to implement. who needs to know about a design change, or They made sense for large trading partners like which supplier has the best prices and invento- an IBM or Ingram who do a billion dollars a ry on hand to fulfill a request. year of business together. However, they did We can overcome these problems by com- nothing for the tens of thousands of small and bining the best features of PACT and web serv-

WINTER 2006 51 Articles

I have 80 PCs available in two days Let me see what’s I need 100 PCs available by tomorrow

I have 100 PCs Warehouse B available now 20 PCs Warehouse A 80 PCs

Figure 3. Knowledge Services at Work.

ices. Agents wrap web services, transforming through Jane, HP’s account rep for GSA; her them into rule-based “knowledge services.” agent automatically approves the order Facilitators enable agents to communicate on a because it falls within GSA’s contractual guide- knowledge level; they route messages on the lines and credit limit. basis of their content and translate between local data standards and messaging protocols. Aggressive Interoperability Finally, personal agents enable humans to par- The knowledge services architecture relies on ticipate in automated processes. two important features. The first is aggressive Figure 3 illustrates how such a solution interoperation (or “A.I.” for short). Aggressive might work in practice. A purchasing agent at interoperation means never letting standards, the General Services Administration (GSA) or the lack thereof, get in the way of doing broadcasts a solicitation for a hundred PCs business. If Ingram and HP use different data needed by tomorrow. Ingram, a leading distrib- formats or messaging protocols, a translation utor, has an agent that subscribes to such solic- service can map them; if there’s one thing the itations. The agent picks up the request for AI community is good at, it’s mapping between quotation (RFQ) and forwards it to interested two structured representations. The translation suppliers, whose agents respond with current can be performed by the message sender, the inventory levels. In the case of one supplier recipient, or some third-party broker—whoever (HP), that involves polling and aggregating has the economic incentive. Thus, if HP is inventories from several warehouses. Ingram’s eager to attract new customers, it could accept agent informs the GSA agent that it can fulfill orders in any major format and translate them the request. The order is consummated into its internal standard. One enduring lesson

52 AI MAGAZINE Articles

Participatory (P2P) Blogs, wikis, social networking, RSS feeds, read/write

Semantic Tags, , , vertical search

Real time Instant messaging, events (publish/subscribe)

Pervasive Billions of edge devices with substantial computing and broadband access – telephones, cars, RFID readers . . . .

Community Simplicity, rapidity, mass collaboration, empowerment

Table 3. Web 2.0 Technologies and Methodologies. of the web is that if a service is economically mated not through an explicit design process, viable, someone will step forward to provide it. but rather through the independent actions of many, one person and one task at a time. This Incremental Automation instance of mass collaboration is an appropri- The second feature is incremental automation. ate segue to our tour of Web 2.0. Let’s assume that every step of this cross-orga- nizational purchasing process is initially per- formed manually, using e-mail, web browsers, Web 2.0 Tour and phones. One day, Joe in GSA’s stock room, The term Web 2.0 refers to a collection of decides he’s got better things to do than man- emerging web technologies and methodologies ually checking a website to determine whether (table 3) that make the web more participatory there are sufficient PCs on hand. He scripts his (that is, two-way versus read only), more personal agent to check the inventory level of semantic, and more real time (that is, event driv- PCs every morning and, if there are fewer than en). Perhaps most importantly, Web 2.0 is a 100 on hand, to e-mail a purchase request (or cultural phenomenon. Developers start with a fill out a web-based order form) for approval by simple but useful idea and get it out quickly so his supervisor Bill. Joe’s still there to handle the others can refine and embellish it. The process rest of his job as well as unexpected events con- has come to be known as mass collaboration— cerning stocking PCs, but now he’s got one less thousands of individuals, building incremen- rote task to do. tally upon each other’s work. Joe assumes his agent’s request will continue I’ve italicized the Web 2.0 concepts that are to be processed manually by Bill. However, at most important to realizing my vision, and will some point, Bill gets smart and has his personal give a brief tour of these concepts, both as agent automatically approve routine requests building blocks for Internet-scale knowledge that are within budget and pass them along to systems and as AI research opportunities for John in purchasing. In similar fashion, John enhancing their functionality. I’ll organize the might have his personal agent pass along rou- tour around four Web 2.0 themes from table 3, tine purchase orders to approved suppliers like beginning with making the web more partici- Ingram. At the supplier, an order entry clerk might task her agent to check whether the patory. The referenced websites provide more order can be fulfilled from stock on hand and, details. if not, to pass it along to the manufacturer (HP) Making the Web More Participatory for drop-shipping to the customer (GSA). Imagine tens of thousands of individuals at Our first Web 2.0 theme deals with transform- thousands of organizations in a supply chain, ing the web into a more participatory and per- each independently and for their own reasons, sonalized medium through blogs, syndication, deciding to automate some routine parts of wikis, and the like. their jobs. The result is a form of emergent Blogs. Blogs, the first major Web 2.0 phenom- intelligence, where complex processes are auto- enon, are shared online journals where people

WINTER 2006 53 Articles

Personal News Pittsburgh: 84˚F 85˚/56˚ Marty Tenenbaum Sunday, July 10, 2005

Today’s News Tomorrow’s News Choose Date Content Settings Customize Layout

(IAAI-05) The 17th Annual Conference To Do Today AAAI.org 13 minutes ago Meet with John Smith The Seventeenth Annual Conference on Innovative Applications of Artificial Intelli- plan meeting suggest venue gence (IAAI-05) is the premier venue for Plan flight to SFO learning about AI's impact through deployed fare search applications and emerging AI technologies and applications. Case studies of deployed Group dinner meeting applications with measurable benefits suggest venue arising from the use of AI technology Listen to jazz provide detailed evidence of the current impact and value of AI technology. IAAI-05 augments these case studies suggest venue with papers and invited talks that address emerging areas of AI technology and ... more Items of the Day

map find similar The Grand Concourse Festival In The Park It's like no other restaurant you've ever seen. This fine seafood emporium, a Chuck Muer pghevents.com today at 4 pm restaurant, is set in the Edwardian splendor Enjoy a Sunday afternoon and evening with of the Pittsburgh ... more this celebration of Indian culture while raising funds for child relief projects in India map plan find similar and the U.S. Biotech Outperforms Market The festival includes a gourmet selection of yahoo.com (PR) 3 hours ago authentic Indian cuisine; a DJ playing world SAN FRANCISCO, Calif., July 1 music plus live performances by Jim Dispirito, /PRNewswire/ -- "Heading into the summer, Big World and even some Carnatic Indian Music. In the evening the film, “Bride biotech widely outperformed the Dow and & Prejudice” will be shown. NASDAQ on a year-to-date basis," said more G. Steven ... more

map plan find similar find similar

Figure 4. The Learning Browser.

can post diary entries about their personal two big challenges: for bloggers, it’s getting experiences and hobbies. Thanks to user- noticed; for blog readers, it’s avoiding informa- friendly editing tools like Movable Type1 and tion overload. An individual with a browser services like Blogger,2 there are now over 12 can at best regularly track a few of dozen sites million blogs, with nine new ones created for updates. every minute (according to a recent estimate by Feeds. Syndication feeds, such as RSS and Wired Magazine). While many blogs are admit- , address both problems by defining an tedly full of drivel, some contain real-time XML format for about new content information and insights that would be diffi- added to a website—information such as the cult to find anywhere else. Those in business or title of a posting, its URL, and a short descrip- professionals ignore blogs at their peril. Those tion. Using a feed reader such as NetNewsWire, in AI and looking for a quick way to get rich individuals can periodically ping the feeds of should figure out how to mine this trove of favorite sites to determine quickly whether information and use the results to predict mar- anything has changed since their last visit. Site kets. owners can also submit their updates to search In a world of 12 million plus blogs, there are engines and aggregators. Aggregators subscribe

54 AI MAGAZINE Articles to numerous feeds and sort postings into sub- Semantics ject-specific channels such as “Science News” Our second Web 2.0 theme—semantics—in- or “Travel.” By subscribing to syndicated feeds, cludes tags, microformats, and vertical search. an individual can effectively track hundreds or even thousands of sites, qualitatively trans- Tagging and . Tags are personal- forming the web surfing experience. ized labels for describing web content—web RSS and Atom may disappoint AI folks with pages, blogs, news stories, photos, and the like. their limited semantic structure and content. One person might use the terms Beijing and In the example above, an aggregator might Buildings to describe a photo of downtown Bei- provide only the most basic information ele- jing. Another might prefer the tags China and ments, such as the name of the channel Vacation to describe the same photo. Collec- (“Yahoo! News: Science News”), a headline tively, the set of tags adopted by a community to facilitate the sharing of content is known as (“NASA Sets July 13 Shuttle Launch Date”), and a . a brief natural language summary (“NASA Folksonomies organize knowledge different- plans to blast into space on July 13 after more ly than traditional classification hierarchies. than two years on the ground…”). The AI com- While tags often correspond to traditional sub- munity can significantly improve the ability to ject headings or categories, they can also repre- find and filter web content by enriching the sent more subjective descriptors: “cool,” “rec- semantics of syndication feeds and adding ommended,” “reading list,” and so forth. inference rules to “feed-reader” software. I’ll Moreover, folksonomies are flat: categories can identify some specific opportunities later in overlap and items can be classified in multiple this article. ways. Instead of force-fitting an item into a tra- Wikis. Wikis are websites where content is ditional classification hierarchy like the Dewey developed and edited collaboratively. The most decimal system, just affix all the tags that 3 impressive example is , a -based apply. Most people find this easier and more encyclopedia with more than 600,000 articles intuitive than agonizing over whether a picture on every conceivable topic, contributed and shown should be classified under China, build- collaboratively edited by more than 300,000 ings, or vacation. The most defining character- people. Wikipedia provides interesting lessons istic of folksonomies is that they evolve for the AI community on developing and through mass collaboration within a commu- maintaining collaborative ontologies. For nity rather than through the dedicated efforts example, as of last July, the top-level subject of one (or a few) expert designers. Unlike most headings for geography included Antarctica, ontologies developed by AI researchers, folk- landforms, and villages. You may not agree sonomies are always widely used—at least that these categories deserve equal billing. The within the community that evolved them. good news is that with a wiki, if you don’t like Some of the best examples of collaborative how things are organized, edit it. If you dis- tagging can be found at Flickr,4 a photo-sharing agree with someone’s edit, open a discussion. site recently purchased by Yahoo! Most of the Imagine the power of combining these par- photos on Flickr have been tagged by multiple ticipatory technologies with the power of AI. community members. Useful tags stick. Popu- Figure 4 is a mockup of a “Learning Browser” lar tags get reused, and thus reinforced, because being developed with support from Com- people want their content to be found. Fre- merceNet. Unlike traditional aggregators that quently co-occurring tags, such as China and summarize new feeds by listing headlines or Beijing, provide an opportunity to infer rela- subject lines, it presents the user with a person- tionships (causal, structural, geographical, and alized “zine.” In the spirit of the Media Lab’s so on). “Daily Me,” the zine is a digest of the latest The Del.icio.us website5 pioneered collabora- feeds relevant to the task at hand—whether it’s tive tagging. Del.icio.us lets users assign arbi- researching a report, following a sport, or in trary tags to a web page to help them remem- the example shown, planning my trip to Pitts- ber it (for example, pages about China, cool burgh for AAAI-05. Over time, the browser can places, hot night spots). Del.icio.us also lets refine its model of my interests and modify users see how others tagged the page, which what’s presented, based on implicit feedback may influence their own choice of tags. They such as what I click and what I don’t, how long can click an existing and see all other pages I spend on an article, what I recommend to my sharing that tag. Additionally, they can drill friends, what they recommend to me, as well as down and see who affixed a particular tag, oth- what’s happening in the world. Many sites can er pages they labeled with that tag, as well as benefit from this kind of implicit feedback, any frequently co-occurring related tags (for from search engines to dating services. example, San Francisco and California), and

WINTER 2006 55 Articles

Onotologies: Precise, inflexible, formal, system-oriented, experts required

China

Business/Shopping Entertainment/Arts Education Real Estate

Dance Music Restaurants Theater

Chef Chu’s Mario’s PizzaSushi King Compadres

award friendly winning chang south garden historic hua facing herbs site garden rooms chicken Chef Chu’s li wan garden lake water lunch & chestnut terrace style halls guang- dinner cake zhou famous pastries

Folksonomies: Fuzzy, flexible, informal, human-oriented, no experts required

Figure 5. Ontologies and Folksonomies.

the pages associated with them. They can look enabling helpful inferences. Knowing that generically for cool or recommended pages or Guang-Zhou is in China, for instance, one for pages a specific individual thinks are cool could infer that Chef Chu’s was in China, even or worth recommending. though “China” did not appear explicitly as a The wealth of folksonomy data available tag for Chef Chu’s web page. A little knowledge from collaborative tagging sites like Flickr and can go a long way in increasing the usefulness Del.icio.us creates some tantalizing AI research of tags. Conversely, a lot of tags can go a long opportunities. One possibility: deriving formal way in increasing the usefulness of knowledge. ontologies automatically from raw folk- Microformats. Microformats are simple data sonomies. Imagine taking the collected tags schemas for describing people, places, events, assigned to thousands of restaurant sites in reviews, lists, and so forth—the things people China (figure 5) and inferring, for example, frequently blog about. People use them today that Chef Chu’s is a restaurant, that restaurants to embed structured metadata in web pages serve lunch and dinner, that Guang-Zhou is a and blogs so that their content can be more city in China, and so forth. If deriving ontolo- easily discovered, indexed, aggregated, and gies automatically proves too hard, a valuable summarized. Soon, microformats will also be fallback would be to map folksonomies onto used by knowledge services. Microformats are existing ontologies. Such a mapping would an important development for the AI commu- impose a hierarchical structure on tags, nity because they promise to significantly

56 AI MAGAZINE Articles increase the availability of structured web con- tent. Microformats adhere to five basic design principles, articulated by Tantek Çelik of Tech- • Solve a specific problem norati, which appear to be driving their popu- larity and rapid adoption (See table 4): • Start as simply as possible, and evolve First, design microformats to solve a specific problem, such as keeping track of your friends, • Humans first and machines second or sharing calendar events. Develop them in tandem with an initial web service or applica- • Reuse existing widely adopted standards tion that uses them. As John Seeley Brown observed,6 Microformats win because they • Modular and embeddable include pragmatics as well as semantics. Second, start with the simplest possible rep- resentation for the problem at hand, deploy it quickly, and add more features only when forced by a real application. By simple, think Table 4: Tantek Çelik’s Design Principles vCard—where a person might be represented initially by just their name and contact infor- mation. Third, design microformats for humans first iCalendar standard. Soon, a personal agent and machines second. Microformats consist of might use these listings to alert users to upcom- a little bit of metadata embedded in ordinary ing events of interest. They’ll never again have HTML—just enough so that a computer can to kick themselves for missing an event understand the meaning of what is being dis- because they didn’t hear about it in time. played. In Tantek’s words, information on the Ideally, one should be able to extract micro- web should be “Presentable and Parseable.” formatted content from any web page for use Microformats do that, as simply as possible. by knowledge services and applications. The Fourth, whenever possible, reuse existing Semantic Highlighter project at CommerceNet well-known standards rather than wasting time Labs is a step toward that goal. In the screen and money inventing and marketing new mockup shown in figure 7, the user has high- ones. HTML, hCard (based on the vCard for- lighted selected items of interest on a web page. mat), and hCalendar (based on the iCalendar The program then creates a structured repre- format for events)—few standards are more sentation of that information by selecting and well-known. populating relevant microformat(s). In this Fifth, microformats should be modular and example, the software is filling out a vCard embeddable. A reservation booking service, for describing me. Note that it has to follow the example, might use as input a composite URL to CommerceNet’s website to obtain my microformat that contained the previously contact information. In general, given some mentioned microformats for personal informa- highlighted data on a web page, an arbitrary tion and event descriptions. amount of inference may be required to pick Figure 6 illustrates how adding tiny bits of relevant microformats, and fill in missing markup through microformats can elegantly fields. More about microformats is available at transform any website into a structured knowl- microformats.org. edge source. On the left of the screen is a page Vertical Search. Another concept for making 7 from Eventful, a website that aggregates event the web more semantic is vertical search. Hori- listings. On the top is the vanilla HTML source zontal search engines such as Google work well for that page, with a few highlighted inser- for finding popular websites. However, when tions. These are hCalendar tags, and they tell a the site wanted is ranked 13,000 out of computer the meaning of each block of dis- 147,000, or when the answer to a question played text. Note that the hCalendar tags pig- must be compiled from information spread gyback on existing HTML syntax extensions over many sites, current search engines are for Cascading Style Sheets, such as and found wanting. . Thanks to the hCalendar tags, Eventful Vertical search engines overcome these limi- users can highlight events of interest and tations by using models of the user, the import them directly into their personal calen- domain, and the available information sources dars, in this example Apple’s iCal program. A to determine where and how to look. Vertical free web service8 translates the hCalendar-for- search often exploits deep web sources that matted event listings used by Eventful into the aren’t accessible to Google’s spiders, either

WINTER 2006 57 Articles

”> The 20th National Conference on Artificial Intelligence July 7, 2005 The Twentieth National Conference on Artificial Intellience (AAAI-05) and the Seventeenth Innovative Applications of Artificial Intelligence Conference (IAAI-05) will be held July 9-13, 2005 in Pittsburgh, Pennsylvania. The IAAI Conference maintains its own conference site...

Figure 6. The hCalendar Microformat.

because they are proprietary or require special- syndicated feeds to inform users when the ized access protocols (for example, SQL queries, memory card they want becomes available at filling out forms). The raw information their price. retrieved from these sources is initially organ- Medstory9 is a healthcare search engine. Its ized into data or knowledge bases. Domain and spiders scour the deep web, focusing on high- user models are then applied to connect the quality information sources that are inaccessi- dots and return useful answers. Consider three ble to Google’s web crawlers, such as the examples: Zoominfo, Dulance, and Medstory. National Library of Medicine’s PubMed data- Zoominfo is a vertical search engine for peo- base and subscription-based medical journals. ple in the business and professional worlds. Its It organizes the information it finds in a rela- spiders scour the web for information about tional and then applies domain mod- individuals, which is then aggregated into per- els of medicine and related fields to infer deep- sonal curricula vitae (CVs). AI-based tech- er connections. For example, it can use models niques are used to recognize redundant CVs of molecular pathways that are common to that describe the same person, and merge them several diseases to suggest novel uses of drugs into a composite summary of career accom- and infer potential side-effects. plishments. Typing in my name, for instance, Looking ahead, I see vertical search engines produces a CV compiled from more than 100 becoming a major source of structured infor- distinct Internet sources. It’s quite comprehen- mation for knowledge services. Agents will be sive and reasonably accurate. able to submit microformatted query forms Dulance is a shopping search engine that containing some blank fields and receive scours the web for information about products microformatted responses with those fields and prices. Dulance is not the first shopping instantiated. Search engines themselves will bot, but it’s the first one I know of that offers increasingly rely on collaboratively tagged and

58 AI MAGAZINE Articles

Figure 7. The Semantic Highlighter.

microformatted content to improve search quate for tracking blogs, which change rela- results, especially within professional commu- tively slowly, such centralized notification nities like healthcare and pharmaceutical services are not likely to scale. A truly Internet- research. scale event bus must accommodate potentially billions of agents publishing and subscribing The Real-Time Web to each other’s information and events in near The third Web 2.0 theme is making the web real time (100 milliseconds to a few seconds, more real time and event driven. A new gener- as fast as the underlying networks permit). ation of real-time search engines, such as Tech- And it must route information intelligently, norati and PubSub, are indexing the fast- not blindly copy spam. A supply chain man- changing subset of the web that some call the ager tracking events at airports is likely to be “World Live Web.” Technorati provides the lat- interested only in those events that could est blog, Flickr, and Del.icio.us posts, usually interrupt the supply chain—a fire or strike for within minutes of their posting. PubSub tracks example, but not the appointment of a new more than 12 million sources, primarily blogs landscaping contractor; and once notified, he and news feeds. Users subscribe to topics of or she doesn’t need 20 redundant messages interest and receive instant message (IM) alerts from other news sources on the same event. as soon as items of interest are published. Such a notification service requires a federated How do these real-time search and alerting peer-to-peer architecture and intelligent services work? Several major blogging tools agents that can filter and route content using ping a trackback cloud when new items are domain knowledge and context, not just key- posted. Technorati and PubSub.com monitor words. this cloud and use the postings to update their At CommerceNet Labs, we’re prototyping and alert users. While perhaps ade- such a federated, Internet-scale event bus.

WINTER 2006 59 Articles

1 craigslist > peninsula > apartments for rent

s.f. bayarea | san francisco | south bay | east bay | peninsula | north bay 2 Thu Jul 07 Automator $1700 - Nice 2 Bed 2 Bath Condo For Rent! (south san francisco) $2000 / 2br - Sunny Two Bedroom House for LEASE (burlingame)

Craigslist Data Source $3200 / 3br - 2ba - Bright and Clean Home near Park (palo alto) Query: 2 br rent apartment Palo Alto $1300 / 2br - GREAT PLACE & GREAT LOCATION! near Apple & Google

Check every: two hours $1000 / 2br - 1 Bath Apartment in Palo Alto (Photos) (pa(palo altoto)) pic Output: RSS $1900 / 2br - Burlingame 2br 2ba condo, near Broadway (burlingame) Microformat Transformation $21$2185 / 2b2br - Downtown Townhousese (palo alto) $2500 / 2br - Executive Designer Furnished Condo (palo alto) Format: hCard Script: if within 2 miles of CommerceNet then 3 E-mail set priority to medium if rent < $1500 then From: Automator Service set priority to high To: Marty Tenenbaum Subject: Potential Housing Found Transformation Notify

Script: if priority is high then page (650) 555-5555 if priority is medium then email [email protected] with subject“Potential Housing Found”

Figure 8. Transforming Websites Into Knowledge Services through Personal Agents.

Agents aggregate events in their areas of inter- a decentralized “blackboard” for triggering est and also act as facilitators, routing selected agents when the world changes. items to other agents. An agent aggregating Internet-scale pub-sub is a potentially dis- auto industry news, for example, might pass ruptive technology because it reduces the along items that affect the price of cars to an need for big aggregators like eBay and Google. agent aggregating car rental industry news. Once millions of players can directly subscribe That agent, in turn, might forward news affect- to offers to buy and sell, they’ll invent many ing the price or availability of rental cars to an new business models and ways to add value. agent aggregating travel industry news. Com- When the information you need can find merceNet’s notification service is being built you—that your order is ready for pickup, that on TPd, an experimental pub-sub network a new drug is available for your disease—with- infrastructure for semistructured syndication out your having to search for it, life will be a feeds. Unlike other middleware systems, it is little easier. designed from the ground up to deal with the potentially hostile world of decentralized infor- Community Empowerment mation, where multiple competing sites and The fourth major thrust of Web 2.0 is about users are sharing their own (possibly conflict- empowering end users. A great example is ing) opinions of the world. Aggregation, filter- Greasemonkey. Greasemonkey is a ing, and propagation occur throughout the extension that automatically invokes network at edge nodes, eliminating central JavaScript scripts when selected are points of congestion. In AI terms, TPd provides loaded. The scripts can transform the appear-

60 AI MAGAZINE Articles

Martin Guitar 600.00 mysite.com/guitar Acoustic Guitar 700.00 A model D15 Martin guitar in good condition, rarely played. Can deliver to anyone in NYC area. Looking for an acoustic guitar in CA area.

Market Maker Service Paul’s Blog Scott’s blog Buyer Seller For sale: Martin guitar, $600 Wanted: Acoustic guitar, $700 Match, can negotiate For sale: Gibson Les Paul Wanted: Jazz electric Match Found a A match!

Open Product A F Directory Trust Service F A A F

F

Figure 9. zBay.

ance of pages, perform computations, and even turn any website into a knowledge service. invoke external web resources. A popular Here’s a prospective scenario, inspired by the Greasemonkey script is Book Burro, written by Semantic Highlighter, and by Apple Automa- Jessie Andrews at the University of Kentucky. tor, a popular visual scripting tool for Macin- Activated by any Amazon page offering a book tosh computers.10 for sale, Book Burro fetches and displays prices The task is to find a reasonably priced apart- from competing booksellers. Another good ment near work. I start by going to Craigslist11 example is Greasemap. Greasemap scans every and highlighting a few apartments of interest page for street addresses, sends them to (step 1 in figure 8). Generalizing from those Google’s mapping service, and displays the examples, my agent infers that I’m interested results at the top of the page. in two-bedroom apartments in the Palo Alto By inserting computation between the area. Next, I create the Automator-like script viewer and content source, Greasemonkey shown in step 2 of figure 8: “Check every two scripts let users take charge of their screen real hours for new listings. If they’re within two estate. Imagine the possibilities if one could miles of work then notify me by e-mail (see insert logic, decision making, and learning. step 3 in figure 8). However, if the rent is less What if ordinary users could create scripts than $1500, page me.” Imagine millions of using visual programming techniques, with- people composing scripts like this and publish- out having to become JavaScript wizards? Our ing them as services that others can use and goal at CommerceNet Labs is precisely that: to build on. Such a vision would indeed trans- give everyone a personal agent tool that can form AI and the web as we know them.

WINTER 2006 61 Articles

1 • Fire at Heathrow • Oil hits $60/barrel Supplier Network 3 500 drives in Dallas, 700 in Denver, 300 in Chicago Transport Network Industry News Service A A A A F A A F Logistics F F F Service A F Spot Market A 5 Composite rush order service F A I need a new supplier 2 4 Delivery available from F that can deliver 1,000 Denver and Chicago! 80 GB drives by Friday

10,000 80GB drives are 6 available at 60% off

Figure 10. Electronics Industry Knowledge Services.

Case Studies and Scenarios In zBay, buyers and sellers post their desires directly on their blogs, using a microformat for Having concluded our tour of Web 2.0, we’re classified ads (see figure 9). In this example, now ready to talk about growing Internet-scale Scott on the left wants to sell a Martin guitar knowledge systems that combine Web 2.0 con- for $600, while Paul on the right is seeking an cepts with intelligent agents and web services. Acoustic guitar under $700. The problem is I’ll illustrate such systems using both visionary how to connect them. Step one is aggregating scenarios and case studies of actual deployed the raw listings. Posting tools might submit systems. listings automatically to large aggregators such as Google and Craigslist. Smaller, specialty zBay aggregators could spider for listings. These spe- zBay (as in decentralized) is an e-commerce sce- cialty market makers can now apply deep nario we’re exploring at CommerceNet, that domain knowledge, including open wiki-based involves literally blowing up today’s central- product ontologies, to match up compatible ized e-commerce ecosystems. Think eBay with- buyers and sellers. A market maker specializing out an eBay in the center! zBay was inspired by in used guitars would know, for example, that a prescient 2002 blog posting by writer and most Martin guitars are acoustic, while most consultant Paul Ford.12 Ford wanted to illus- Gibsons are electric. The big advantage of trate the potential of the semantic web. How- decentralized markets is that many specialists ever, his scenario is a lot more credible with can add value by bringing their domain knowl- Web 2.0 technologies like microformats and edge to bear. event buses. Blowing up the center unleashes other

62 AI MAGAZINE Articles opportunities for innovation. Paul himself How about a knowledge service that tracks sup- might decide to become proactive and task his pliers and learns which ones provide the most personal agent with monitoring favorite dealer reliable on-time delivery? Or a demand fore- sites for new guitar listings. Knowledge-based casting service that monitors blogs for subtle trust services can lubricate the market, using demand signals, such as an emerging interest social networking and blogs to track the repu- in 80 gigabyte iPods. Or a spot market service tations of buyers and sellers and to compile that notifies subscribers of buying opportuni- product reviews. Other service providers can ties that arise when another manufacturer loses facilitate fulfillment, shipping, and payment. a large order and is forced to dump inventory These ideas extend directly to industrial- (step 6). No ERP system can deal effectively strength business-to-business (B2B) applica- with such opportunistic, real-time informa- tions. tion. Humans aren’t very good at it either, due to information overload and limits on time- Electronics Industry Supply Chain bounded decision making. Couldn’t a little AI The electronics industry is fiercely competitive. help? Short product cycles make forecasting supply and demand difficult. Error rates of 50 percent Case Study: Webify’s or more are not uncommon. Firms must close- Insurance Ecosystem ly monitor changes in supply and demand and Webify14 is commercially deploying simple respond instantly to keep them balanced. The agent-based knowledge systems today in the traditional sources of supply and demand sig- insurance industry. The insurance industry is a nals are channel partners, such as distributors complex distributed ecosystem. It includes the and resellers. Increasingly, however, the most primary carriers, their agents, underwriters, up-to-date information comes from Web 2.0 and claims adjusters, reinsurers, fraud investi- sources including news feeds, blogs, and real- gators, and many other service providers (See time sensors. figure 11). Webify uses agents, operating with- Let’s return to our earlier electronics industry in a service-oriented architecture, to route doc- scenario (figure 3) and extend it to accommo- uments among service providers based on con- date these Web 2.0 resources. In the spirit of tent. Routine claims, for example, can be incremental automation, a supply chain man- routed automatically to $5-per-hour adjusters ager might initially just have a personal agent in Bangalore, while major claims with the monitor relevant feeds and notify him or her of potential for significant payouts are routed to events potentially warranting attention (figure $80-per-hour in-house specialists in Hartford. 10, step 1). Someone who had a critical ship- This ability to selectively outsource claims is ment of disk drives that was scheduled to be especially helpful following disasters, enabling routed through Heathrow around the time of carriers to handle peak loads without unduly the 2004 London bombings would have been inconveniencing other customers. wise to explore alternative sources of supply. There are many other opportunities for The manager or his or her personal agent pre- knowledge services to incrementally improve pares an RFQ and submits it to SupplyFX,13 a the performance of distributed insurance live, commercial knowledge service for sourc- ecosystems. For example, new policy applica- ing electronic parts (step 2). SupplyFX expe- tions for coverage can be automatically dites sourcing by selectively forwarding RFQs offloaded to reinsurers when the exposure to a to interested suppliers in its network and aggre- given risk exceeds predetermined levels. An gating their responses. outright moratorium on new hurricane cover- Fortunately, drives are available from several age can be instantly imposed at the first indica- suppliers (step 3). But can they arrive by Fri- tion of an approaching storm. Suspect claims day? Our supply chain manager submits that can be forwarded to fraud investigators, and so requirement to a third-party logistics service, forth. which queries its network of carriers and deter- Insurance was a prime domain for expert mines that Friday delivery is possible from systems in the 1980s. Several companies, such Denver and Chicago (step 4). Problems at as Syntelligence, were formed specifically to Heathrow will not disrupt production! Sup- address this lucrative market. None are still plyFX, sensing an opportunity, assembles and around. Their technology consisted of large, publishes a one-stop composite service for rush monolithic mainframe-based systems. It was orders that takes into account both product ill-suited to decentralized insurance ecosys- availability and shipping schedules (step 5). tems, where integration and automation are But why stop here? There are potentially best approached incrementally, one organiza- thousands of value-added knowledge services. tion and one task at a time. Web services over-

WINTER 2006 63 Articles

Chipped windshield: Claim routed to Bangalore Too much earthquake exposure in CA! New policies sent to reinsurers Insurance Carrier Fraud Agents and Detection Producers Outsourced Consumers Reinsurers Claims

Internal Claims Processing

House destroyed by mudslide: Consumers claim routed to Hartford

Figure 11. Insurance Industry Knowledge Services.

come this problem in insurance and other medical research. In collaboration with col- industries. They enable companies to blow up leagues at the Massachusetts Institute of Tech- existing value chains and reassemble the pieces nology, Carnegie Mellon University, and the into radically new cross-company processes Institute for Human and Machine Cognition, and business models. he’s developing a knowledge services grid for mobilizing the resources of the global scientific E-Science community in response to biological threats Scientific problems are increasingly complex (see figure 12). Suppose a virologist at CDC is and multidisciplinary, and thus beyond the confronting a suspicious new virus? Urgent reach of a single individual or group. In questions spring to mind. Is it natural or engi- response, many scientific communities are neered? How is it molecularly related to other now relying on Internet-based grids for sharing known infectious agents? Does that relation- information, computing, and other services. A ship present opportunities for intervention? recent issue of Science contained three papers The e-science grid springs into action. First, on service-oriented science.15 Most of the serv- a sequencing center with cutting edge viral ices discussed, however, involve only the shar- RNA chips analyzes the new virus’s genome ing of data and computing resources. Knowl- and publishes it to the grid (figure 12, step 1). edge services offer a significant extension, Based on that data, a knowledge service associ- enabling groups to share their specialized ated with the leading gene expression database knowledge and expertise. concludes that it is indeed a new bug (step 2). Alain Rappaport of Medstory is pioneering Other knowledge services look for close rela- the application of knowledge services in bio- tives to the new bug based on its phylogenetic

64 AI MAGAZINE Articles

1 Here are some with 4 l’ll sequence it similar structures Genome Here are some 3 Gene Expression Sequencing Service evolutionary 2 Nope, Database cousins It’s a new one 1 A Anyone seen a 3-D Structural A Phylogenic bug like this? Proteomics Analysis Analysis Database Research A A Coordination F A F A F F F Drug F Candidates Composite viral Expert A analysis service Finder F A From phylogeny 5 6 These drugs are and structure, it’s often effective Composite drug F likely in XYZ family on XYZ bugs candidate service

Figure 12. Service-Oriented Science. class (evolutionary neighbors) and three- laboratively by academic researchers, biotech dimensional molecular structure (steps 3 and scientists, and big pharma marketers, support- 4). Experts on these related viruses are alerted. ed by a vast ecosystem of third-party services— One of those experts creates a knowledge serv- combinatorial chemistry, toxicity assays, pro- ice that synthesizes the phylogenetic and struc- teomic analyses, clinical trials management, tural analyses to determine the precise family and so forth. Knowledge services help facilitate of known viruses the new bug most closely the sharing of information and expertise, the resembles (step 5). A pharmacology expert can allocation of resources, and the coordination of now combine this composite virus classifier workflows across organizations. with a service that provides access to a database of antiviral drugs, creating a knowledge service E-Life that can take a virus’s DNA and suggest the best My final example is an e-life scenario: planning drugs to defeat it. (step 6). a dinner for tonight in Pittsburgh. The task The circles associated with steps 5 and 6, and breaks down into a series of subtasks (see figure thousands more like them, represent compos- 13): Determining who is in town that I might ite knowledge services. Most will be created by like to spend some time with. Are they avail- scientists for their own use and then made able and interested in meeting for dinner? available for others to build on. Collectively Finding a restaurant, booking reservations, they will transform how science is done. making arrangements, and coordinating any Knowledge services are already impacting how last minute changes in plans. These are precise- drugs are developed (and not just for bioemer- ly the types of tasks the web excels at. gencies). Increasingly, drugs are developed col- Let’s start with who’s in town that I might

WINTER 2006 65 Articles

Who’s in Who do I Who’s the area? want to meet? available?

Who? Where will What will we eat? Where? What? we eat?

When? How?

When is the When is everyone How do we How will we restaurant open? available? coordinate? get there?

Figure 13: E-Life.

like to meet. That list would include people I see who’s available. An incremental service, know who live in Pittsburgh, as well as confer- built on Evite, might offer responders the ence attendees with whom I have something in opportunity to fill out a microform, indicating common—perhaps authors of papers I’ve liked preferences regarding cuisine, location, and so or colleagues who liked the same papers I did, on. Next up is selecting a place to eat. There are or those who fund research in my area. I might numerous sites on the web that rate and review start by asking my personal agent to compile a restaurants. A helpful knowledge service could list of the people I know who live in Pittsburgh, use these existing sites, together with guests’ using contacts from my address book and indicated preferences, to recommend restau- social networking sites. To compile a list of rants. Another knowledge service could then conference attendees, someone would have to check availability at OpenTable, book the reser- create a service that scraped the conference site vation, and notify the guests. for speakers, session chairs, and the like, and Each of these knowledge services is helpful published the results as a microformatted list; in its own right, so it’s a good bet that someone in the future, conference organizers might rou- will create services like them. Once they exist, tinely publish structured lists of registered it won’t take long before someone drops in a attendees. Someone else might build a collabo- constraint-based planner and offers a com- rative filtering service for CiteSeer, where users pletely automated dinner planning service. could share ratings of papers. Given such serv- ices, a simple agent could correlate the list of Conclusions conference attendees with the authors of papers I liked, to suggest possible dinner guests. I’m convinced that the time has come to Given a list of potential dinner guests, an rethink how we do AI applications in the con- agent could use e-mail or a service like Evite to text of the web. This is not an academic exer-

66 AI MAGAZINE Articles cise. It’s not a question of if or even when this close to meeting Newell’s condition. Amazon will happen. It’s already happening in the web has adaptive goal-oriented behavior—it tries to community. We just need to insert a little AI! sell you books. It certainly learns from experi- And if the AI community doesn’t step up to ence—what to recommend. Someone has built this opportunity, others will reinvent AI for us. a speech interface using Amazon’s published Stripped to its essence, I’m talking about (1) APIs. And Amazon even has a modicum of self- using microformats and agents to transform awareness—the ability to recognize when ordinary web resources into knowledge servic- servers go down and take corrective action. So es; (2) enabling these knowledge services to is Amazon intelligent? communicate directly and in a content-direct- Begging that question, the real issue is ed fashion, through events, to create new com- whether a decentralized web of knowledge pound knowledge services, and (3) providing services, each of which may satisfy at most a millions of end users with personal agent tools few of the criteria, can collectively achieve true to do this on an unprecedented scale, all build- intelligence. The jury is still out. However, for ing upon each other to scale exponentially like several reasons, I’m sanguine the answer will the web. be yes. First, as Google and others have demon- The result would be millions of simple, inter- strated, some of Newell’s criteria become sim- acting, Internet-scale knowledge systems, built pler at Internet scale—statistical learning and through mass collaboration, which support using large amounts of knowledge, for exam- how people actually work and live. And ple. Second, like open source software, decen- because they’re human-machine systems, they tralized knowledge systems tap the collective could utilize any web resource. Such systems wisdom and experience of large numbers of would stand in stark contrast to contemporary developers (potentially millions), rather than a AI and semantic web applications, which are handful of researchers. Although most of the typically built by a handful of researchers to resulting knowledge services likely will be sim- perform deep reasoning on a tiny subweb of ple compositions of existing building blocks formally structured information. Internet-scale glued together with end-user scripts, the pro- knowledge systems, by contrast, are developed gramming can be arbitrarily complex. Even pragmatically: Start with the most basic func- simple composite knowledge services can be tionality that does something useful. Get it out arbitrarily powerful when their components quickly so others can build on it. Automate can include anything from a FedEx-class logis- complex tasks incrementally; and never let a tics planner to a Newell-class general problem lack of standards get in the way of connecting solver. Clearly, conventional AI systems pro- things. vide a lower bound to the collective intelli- The resulting systems are not incompatible gence achievable by an ecosystem of cooperat- with the semantic web. Feel free to use the ing knowledge services. resource description framework (RDF) if you Neil Jacobstein suggested to me an intrigu- like—it’s just another microformat, one partic- ing third argument: in an ecosystem of mil- ularly good for deeper reasoning. Indeed, lions of diverse knowledge services, which microformats are a great way to bootstrap the evolve through composition, amid intense semantic web and get traction on problems competition, it seems likely that natural selec- that people actually care about. Some in the tion will apply—selecting for those with Web 2.0 community have begun referring to greater “fitness” to their business, scientific, or their microformat-based applications as the personal environments. lowercase semantic web. I prefer semantic web For all these reasons I’m confident we’re on 2.0, which appropriately acknowledges the the right track. We’re finally building software contributions from both the semantic web and that works the way a free society does—decen- Web 2.0 communities. tralized, evolutionary, democratic, and toler- ant. Markets aggregate the collective intelli- But Is It AI? gence of humans and computers, and rapidly Internet-scale knowledge systems are clearly disseminate the best ideas and knowledge serv- useful. But are they intelligent? Recall Newell’s ices. Mass collaboration bootstraps the result- criteria for intelligent systems (table 1). Most ing components into systems with unprece- websites exhibit at least a few of these charac- dented intelligence and flexibility. These ideas teristics. Moreover, each of them can be found are reminiscent of Marvin Minsky’s Society of somewhere on the web. Newell, however, was Mind.17 As AI meets Web 2.0, they are ideas adamant that all characteristics must be exhib- whose time has come. ited simultaneously. I invite you to contact me or visit the com- A few megasites, such as Amazon, come merce.net website16 to learn more about oppor-

WINTER 2006 67 Articles

tunities to get involved and help realize the vision. Acknowledgements AAAI This article is based on an invited talk delivered at IAAI-05. I’d like to thank the many colleagues who helped with that presentation—Alain Rappaport, Special Tantek Çelik, Randy Davis, Dan Gould, Chris Hib- bert, Rohit Khare, Greg Olsen, Barney Pell, Raj Red- dy, Adam Rifkin, Manoj Saxena, Allan Schiffman, Award Kragen Sitaker, Jay Weber, and especially Neil Jacob- stein, who persuaded me that these ideas were Nominations worth talking about, and Kevin Hughes, whose illustrations and collaboration helped crystallize for 2007 them. Notes 1. movabletype.org. 2. blogger.com. AAAI is pleased to announce the continu- 3. wikipedia.org. ation of its two special awards in 2006, 4. flickr.com. and is currently seeking nominations for 5. del.icio.us. 6. CommerceNet Workshop on Decentralized Commerce, the 2007 AAAI Classic Paper Award and Wharton West, San Francisco, 20 June 2005. the AAAI Distinguished Service Award. 7. eventful.com. 8. Available at suda.co.uk/projects/X2V/. The 2007 AAAI Classic Paper Award will 9. medstory.com. be given to the author of the most influ- 10. apple.com/macosx/features/automator/. ential paper(s) from the Seventh National 11. craiglist.org. 12. ftrain.com/google_takes_all.html. Conference on Artificial Intelligence, held 13. supplyframe.com. in 1988 in St. Paul, Minnesota. 14. webifysolutions.com (Webify is now part of IBM). 15. Science 6 May 2005, 308(5723): 809, 814–824. The 2007 AAAI Distinguished Service 16. commerce.net/semweb2. Award will recognize one individual for 17. Marvin Minsky, The Society of Mind. Simon and Schus- ter, New York, 1986. extraordinary service to the AI community. Jay M. “Marty” Tenenbaum spent the 1970s at SRI’s AI Center leading vision Awards will be presented at AAAI-07 in research, the 1980s at Schlumberger Vancouver, Canada. Complete nomina- managing AI Labs, and the 1990s founding a string of successful Internet tion information, including nomination commerce companies, ultimately serv- forms, is available at www.aaai.org/ ing as chief scientist of Commerce One. Awards/awards. php. He now splits his time between Internet startups Efficient Finance, Medstory, and Patients Like Me, and the nonprofit CommerceNet, The deadline for nominations is March where he’s using AI and the Web to improve healthcare by tapping the collective intelligence of doctors, patients and 15, 2007. medical researchers. Kevin Hughes is an internationally- For additional inquiries, please contact recognized pioneer in web design and Carol Hamilton at [email protected]. software. One of only six members of the Hall of Fame, in 1994 Hughes was presented the award by Tim Berners-Lee at the First Interna- tional World Wide Web conference in Geneva. He has worked with Tenen- baum at EIT, CommerceNet, VeriFone, and CommerceOne and is currently the CTO of ChipIn, a Web 2.0 startup based in Honolulu, Hawaii.

68 AI MAGAZINE