<<

AI Magazine Volume 26 Number 4 (2006)(2005) (© AAAI) 25th Anniversary Issue Knowledge Is Power: A View from the

James Hendler

■ The emerging Semantic Web focuses on bringing program could search for a house and negotiate knowledge representationlike capabilities to Web transfer of ownership of the house to a new applications in a Web-friendly way. The ability to owner. The land registry guarantees that the ti- put knowledge on the Web, share it, and reuse it tle actually represents reality.2 through standard Web mechanisms provides new In Berners-Lee’s view, the then young World and interesting challenges to artificial intelligence. Wide Web, could be seen as a Web not just of In this paper, I explore the similarities and differ- documents, but of concepts, where the infor- ences between the Semantic Web and traditional AI knowledge representation systems, and see if I mation described, if liberated from the text, can validate the analogy “The Semantic Web is to would provide information that machines KR as the Web is to .” could use for new, dare I say “knowledge based,” applications. The third quotation is again from Berners- et me start with three quotations. The first Lee, this time joined by of Nokia comes from a tutorial on expert systems Corporation and me, writing in a Scientific written by Robert Engelmore with Edward American article titled “The Semantic Web” that L appeared in 2001. In essence, we wrote about Feigenbaum in 1993. He wrote unifying the two earlier visions and said: … but in knowledge resides the power. Because of the importance of knowledge in expert sys- Knowledge representation … is currently in a tems and because the current knowledge acqui- state comparable to that of hypertext before the sition method is slow and tedious, much of the advent of the Web: it is clearly a good idea, and future of expert systems depends on breaking some very nice demonstrations exist, but it has the knowledge acquisition bottleneck and in not yet changed the world. It contains the seeds codifying and representing a large knowledge of important applications, but to realize its full infrastructure.1 potential it must be linked into a single global system (Berners-Lee, Hendler, and Lassila 2001) (Keep an eye on that last phrase “a large knowl- edge infrastructure”; we will return to it several That is, we argued that a knowledge represen- times.) tation (KR) system that could live on the Web, The second quotation is from Tim Berners- and function in a weblike way, could provide Lee, the inventor of the , in the “large knowledge infrastructure” that En- his keynote talk at the first World Wide Web gelmore had said was needed to “break the Conference held in 1994, around the same knowledge acquisition bottleneck” that held time as Engelmore penned the preceding back the use of AI techniques across a wide quote. Berners-Lee stated: range of applications. In this article, I revisit the Semantic Web vi- …documents on the Web describe real objects sion that we espoused in a more AI-specific way. and imaginary concepts, and give particular re- lationships between them… The title document I will explore where we are with respect to that to a house describes a house and also the own- vision four years later, look at some of the tech- ership relation with a person. This means that niques and tools we have available, and briefly machines, as well as operating on the Web in- present some of the challenges to AI inherent in formation, can do real things. For example, a this vision. I will examine some of the differ-

76 AI MAGAZINE Copyright © 2005, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2005 / $2.00 25th Anniversary Issue ences between what the Semantic Web brings simple text annotation, you could mark up im- and the AI tradition of knowledge representa- ages, or parts thereof, with more precise infor- tion languages, exploring the hypothesis inher- mation about the content of a photo. Instead ent in our article, that the Semantic Web is to of a sentence fragment such as “the crew of KR, what the Web was to hypertext. STS-100,” you could use some KR about space missions to make it easy for you to state who was the commander of the mission (and where Imagine If… he or she was in the photo), what the payload Here are three scenarios to challenge your of the mission was, who else was on the crew, thinking—imagine the knowledge-rich appli- what the date and time of the mission was, and cations you could build if these things were out the like. there now. Even better, assume that the very rich con- tent that you created in this way Scenario 1: Document could be submitted to a Web portal and show Metadata Knowledge Bases up indexed under the appropriate terms. Sub- We’re used to thinking of the Web as a web of mitting your photographs to a NASA site, using documents, linked together in a multibillion this enhanced markup, your photographs page book that, unfortunately, is organized by could be found by students looking for infor- few if any principles. It’s the encyclopedia of mation about a particular shuttle mission, by everything, with no index except what can be scientists looking for photographs of a particu- found through keyword search. But what could lar payload, or by fans looking for photos of you do if more of the pages in the book had their favorite astronaut. even simple metadata? That is, what could you And better still, imagine that the portals were do with the you could create if not necessarily tied to a particular application every PDF document out there on the Web had vocabulary like space, but that you could mark easily extractable meta-data in a standard ma- up the same or different photographs against chine-readable form? You could easily write a many different kinds of vocabularies. A search tool to tell who created each document, when, on the Web would find you vocabularies about with what application, and what some of its space or biology or scuba diving or terrorist net- key document features were. works or whatever else you were working on. That may not seem like much information Your personal or group portal would expose per document, but there are hundreds of mil- what vocabulary was used, and related concepts lions of PDF documents out there on the Web! between them. You could browse from the pho- With a few bits of information from each one, tograph of a conference to the conference Web we could be looking at a knowledge base with site, or to other people in the photograph, or to hundreds of billions of facts about what was the restaurant that the conference attendees published when and by whom. It would be were photographed in front of, and more. commonplace to know the history of a docu- Think of flickr on steroids, powered by ontolo- ment you found on the Web, or to see who was gies that were easy to find and use! authoring with whom, or what was the mean time for change of a Web publication, or a myr- Scenario 3: Large Scale iad of other ways that one could enhance the Knowledge Infrastructure raw document sets available on the Web. Now let’s imagine that applications like the pre- ceding are becoming more commonplace, and Scenario 2: Semantic that the products of the metadata, and content Annotation of Nontext Media fields, and annotations, and so on were becom- Sharing photographs on the Web has become ing easier to find. Imagine there was a distrib- commonplace, with many sites allowing pho- uted knowledge base, accessible through any tos to be posted and friends to be notified in a computer anywhere with a Web browser, that tightly controlled way. Recently, however, a had information about millions of people, new phenomenom in photo sharing has been places, things, transactions, processes, services, the advent of sites like flickr3 which allow not and well, everything! Imagine the facts about just the posting of photographs, but the ability all these things could be indexed against thou- to post some simple text annotations about sands of ontologies (in a standard and widely what is in them. Thus, a few keywords describ- available KR language), and that there were ing a photo can be attached, allowing flickr to dozens of open-source and freely available tools use keyword approaches to index and search for parsing, serializing, browsing, editing, stor- for photographs. ing, searching, and inferencing all these things. But imagine what you could do if instead of Imagine, in essence, what current AI re-

WINTER 2005 77 25th Anniversary Issue

http://ns.adobe.com/pdf/1.3/Producer Acrobat Distiller 7.0 for Macintosh

http://ns.adobe.com/xap/1.0/ModifyDate 2005-09-19T17:25:21-04:00

Word: cgpdftops CUPS filter http://ns.adobe.com/xap/1.0/CreatorTool

2005-09-19T17:25:21-04:00 http://ns.adobe.com/xap/1.0/CreateDate

http://purl.org/dc/elements/1.1/format http://www.w3.org/RDF/Validator/run/1127167096101 application/pdf http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq http://purl.org/dc/elements/1.1/creator http://www.w3.org/1999/02/22-rdf-syntax-ns#type

http://www.w3.org/1999/02/22-rdf-syntax-ns#_1 http://purl.org/dc/elements/1.1/title genid:ARP130317

http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://ns.adobe.com/xap/1.0/mm/DocumentID genid:ARP130318 http://www.w3.org/1999/02/22-rdf-syntax-ns#Alt http://www.w3.org/1999/02/22-rdf-syntax-ns#_1

http://ns.adobe.com/xap/1.0/mm/InstanceID uuid:e711f9d1-2953-11da-ac9a-000a95d6b344 Microsoft Word - Hendler.rtf

uuid:e7120064-2953-11da-ac9a-000a95d6b344

Figure 1. The Graph of the Metadata of This Article Saved in PDF. Showing the author and the creation and modification dates and other simple metadata.

searchers could do with the very “large scale Unique document IDs are assigned to each knowledge infrastructure” that Bob Engelmore document and version, and these can be linked described. Imagine what you could do with this together in a straightforward way (essentially enormous, Web-scale knowledge base. The pos- merging those nodes with the same ID). So a se- sibilities are endless… mantic network with the metadata from the huge number of documents already containing XMP, and from the thousands being created Stop Imagining… every day is now easily extractable. Couple The scenarios in the preceding section are not such a Web with text mining, coindexing, in the distant future, or even “just around the learning, or other such AI approaches, and the corner.” Each of the scenarios described above potential is huge. is already in existence and growing at a rapid pace. Let’s visit them each in turn. Content Annotation Given the popularity of Web sites like flickr, it Rich Metadata is not surprising that there is a great deal of re- Every PDF document created by Adobe Acrobat search, and now productization, exploring this 6.0 (or, in fact, any recent tool in the Adobe space. Several of the more advanced of these creativity suite and many of the tools that do projects are exploring ontology-based annota- automatic saving of documents as PDF) is auto- tion of images and portals that can index and matically tagged in a format called XMP4 with provide search of annotated images and other some document metadata using tags from the Web resources.7 One such project is the Photo- Metadata Initiative.5 XMP uses a stuff tool developed by my research group,8 subset of the Resource Description Framework which allows ontologies written in the Web- (RDF) (I will discuss this in greater detail later standard ontology language OWL (like RDF, on in this article) and embeds it in the docu- this topic will be discussed later on in this arti- ment in an easily extractable way (with tools cle) to be used to annotate pictures. freely available from the Adobe Web site6). Figure 2 shows an example of the use of Pho- What all that means is that embedded in your tostuff. In this case, the user of the tool has document is an XML encoding of some simple loaded an ontology about NASA crews from a metadata that is easily machine interpretable Web site and is using it to mark up the photo- and not dissimilar from an AI “semantic net- graph of a space shuttle crew from a particular work” graph. In fact, easily available tools can mission (STS-100). The user can draw a box indeed turn this metadata into such a graph. around part of the image and then drag a term For example, figure 1 shows the embedded from the ontology to that region to indicate metadata extracted from the PDF version of that an instance of a particular class is being this article that I am sending to the publisher, created. In this case, the user has drawn an im- which I have run through a tool that draws the age around a particular person and indicated graph encoded in the RDF. that this is the “payload commander.” The tool

78 AI MAGAZINE 25th Anniversary Issue

Figure 2. Photostuff Being Used to Mark Up Information About a NASA Mission Using an Ontology about NASA Crews. The pane on the left shows the classes in the ontology as a tree, the middle pane shows the photograph being marked up, and the right hand pane is a form, generated by the tool from the ontology, allowing the user to fill in the information about this particular astronaut. then extracts from the ontology the properties based on the ontologies to provide search based associated with that class and creates a simple on these terms. Thus, for example, figure 3a form on which the user can enter the facts shows the photographs available from our about the particular instance. In this case, the space portal10 as they are indexed, and figure 3b user is indicating that the payload commander shows the result of clicking on the STS-100 mis- is named “Kent Rominger” and that his em- sion (one that Kent Rominger commanded). ployer is “NASA.” The information about this instance is recorded in a Web document that Large-Scale Knowledge Infrastructure can then be submitted to a Web site through a The Semantic Web at the time I write this arti- WebDav or other submission interface. cle is well past the starting gate. Earlier in this Photostuff was designed to work with Seman- article I suggested we could imagine millions of tic Web portal technology that was also devel- facts, thousands of ontologies and dozens of oped by our research group (Goldbeck, Alford, tools all available through any machine on the and Hendler 2003). We have created a number Web. Those numbers are already here and are of specialized sites for viewing information easy to validate. about different domains, including space infor- On the Semantic Web, the facts are recorded mation, terrorism, biology, U.S. politicians, as using the RDF language alluded to earlier. RDF well as a general portal for information about can be used to create, essentially, semantic net- our research group and the work we do.9 These works that are encoded on Web documents, Semantic Web portals can index information and linked to each other through the use of

WINTER 2005 79 25th Anniversary Issue

AB

Figure 3. SemSpace Website. A. The Web site displays photos indexed by the terms in the ontology with which they were indexed. B. Choos- ing a photograph displays, in table form, the annotated information about that image.

common Web designators (the URLs and URIs organization). A number of blogging and other used to designate Web sites and other Web re- sites automatically create Foaf entries for their sources). Thus, for example, one could use my users, and thus it is currently estimated there Web page URI—www.cs.umd.edu/~hendler—as are well over five million individuals who are a unique identifier on which to tie information described in Foaf files. about me. You could say that that site describes Foaf is probably the most successful of the someone who is named “Hendler” and who Semantic Web vocabularies in common use, works at the organization described in but it is by no means the only one. There are www.cs.umd.edu (the site that describes the vocabularies for describing open source licens- department where I work). es,13 projects,14 thesaurus vocabularies,15 and In fact, having files on the Web to describe lots of other common domains.16 A Semantic oneself has become common, with the friend 17 of a friend (Foaf) vocabulary11 being a common Web crawler called Swoogle has been built by one to use. Foaf files are regular Web files (for Tim Finin and his students (Ding et al. 2004) example, one can see what is in them simply by using a targeted search (rather than a general clicking on a link12) and that file can be used by search like Google). At the time of this writing, anyone else anywhere on the Web to point to it finds more than 4,100 Web ontologies, 7.25 my designated Foaf file. It is estimated that million defined individuals, and 47.5 million there are several hundred thousands of Foaf total RDF “statements” in the approximately files out there (for today’s number, try 335,000 pages on which it discovers semantic Googling “ filetype:rdf” which will find a information. majority of, but by no means all, Foaf files). In- I also mentioned dozens of open source tools deed, a single file can designate many Foaf en- for using these Semantic Web languages, but I tries (for example all the people in a particular should have said dozens of good tools support-

80 AI MAGAZINE 25th Anniversary Issue

http://www.cs.umd.edu/users/hendler/index.html http://www.cs.umd.edu/

Figure 4. Setting Up a Link Between Two URIs. ed by reputable companies and large The unique name is a universal re- iad of domains of discourse being used research organizations. If we included source indicator (URI), with the most every day. all the tools that have been released, common being the familiar “http” The solution used in the Semantic no matter what quality, the number URIs we’ve all come to know and love. Web, realized through the resource de- would surely be in the hundreds. The So if I have my Web page20 and in it I scription framework (RDF), is that there World Wide Web Consortium (W3C) have a pointer to another Web docu- exists on the Web a mechanism for cre- maintains Web sites with pointers to a ment, for example “,” then I am set- an essentially infinite set of unique which are available open source or ting up a link between two URIs, as tags—the URI mechanism itself. That free. shown in figure 4. is, if a group of my friends and I agreed Since each of the URIs in figure 4 that having a “designatesPerson” link can, in turn, be linked to other URIs, would be beneficial to something we Semantic Web Languages the famous Web graph is created—an are doing, then we could agree to use (An AI Perspective) unlabeled, directed (distributed) graph the label http:/ /www.cs.umd.edu/ with billions of links. ~hendler/links# designatesPerson and I’ve already mentioned two of the If these links were labeled, rather build our applications around that. main Semantic Web languages (RDF than unlabeled, then Web applica- Thus, we reuse the naming scheme of and OWL), and there’s actually a tions could use the link types to do a the Web to get a unique, and unam- third—RDF Schema—which is used better job of searching, indexing, and biguous label to use among our group. for many of the vocabulary defini- displaying pages. For example, if the We could now create a labeled link that tions described earlier. In this section, link in figure 4 could be labeled some- would connect my home page to some I’ll tell you a little bit about these lan- thing like “WorksAt”, then a search image through the new link shown in guages and why they are interesting engine could let a user look specifical- figure 5. from an AI perspective. But I won’t try ly for where I work without getting And, if my home page had other to tell you much about the syntax of “confused” by all those other links named links to other resources, which these languages or the details of their that might or might not have any- might in turn link further, the Web use. All three are Web recommenda- thing to do with this. Even a simple la- could be extended, in essence, from an tions from the W3C, and primers, beling scheme (for example, designat- unlabeled to a labeled graph (with a po- guides, references, and technical doc- ing whether the thing at the end of a tentially infinite set of labels). Since the uments (including model theories) link designates information about a links would have at least human mean- are available through the W3C’s Se- person or not) could be used to help ing (that is, you as the reader can assign mantic Web activity page.19 Instead, I disambiguate things on the Web. For “designatesPerson” some sort of se- want to draw analogies to some of the example, if I did an image search, I mantics), we could think of the labeled earlier AI work and discuss some of could look for only those images that Web, from an AI point of view, a lot like the similarities and, more important- were linked to through a “designates- the early AI KR systems that used “se- ly, the differences. Person” link and thus tell photographs mantic networks” with labeling, but One of the key features that differ- of people from photographs of inani- not with a lot of meaning in the links entiates the hypertext markup lan- mate objects, animals, and so on. (Woods 1975). RDF as a language is guage (HTML) on the Web from earli- The problem, of course, is what link made up of triples, each of three URIs er hypertext languages is the ability to label set to use—much as AI re- (or a datatype, so we can link some- link a document to another across the searchers argued about what the thing to a string, or number) represent- . In HTML, this is achieved “primitives” of knowledge representa- ing exactly these “Subject Property Ob- through the “anchor” () tag in a tion systems for natural language ject” graphs, where any two references document having the ability to should be, Web researchers have pro- to the same URI are assumed to refer to uniquely name a Web resource, which posed various types of link sets. How- the same thing. A special property, can be a document, image, table, ever, given the scope and generality of called rdf:type is used to link a specific video, applet, or any other format the Web, it became clear that no sin- instance (say “book1”) to a category transferable over the Web’s protocols. gle, simple set could work for the myr- (“book) allowing instance data to be

WINTER 2005 81 25th Anniversary Issue

http://www.cs.umd.edu/~hendler/ links#designatesPerson http://.../hendler/index.htm http://.../hendler/pict1.jpg

Figure 5. A Labeled Link Connecting My Home Page to an Image through a New Link. created easily. There’s a lot more de- you would see the XML corresponding Terms for describing properties: allow- tails, but that’s basically it—so think of to: ing RDFS properties to be declared as RDF as the “” lan- :designatesPerson a rdf:Property; transitive or symmetric, or to be guage for the Web. rdfs:domain :Person; known to be one-to-one, one-to- Just as in AI where semantic net- rdfs:range :Photograph. many, or many-to-one. It also allows works are useful in themselves, but far which states that this property relates for distinguishing properties that link more useful with extensions, the Se- a Person to a Photograph (terms that classes to various datatypes instead of mantic Web community has realized are defined elsewhere on that page). to other classes (compare with assign- the need for more powerful representa- Representationally, RDFS is obvious- ing the “text-name” property to link a tion. The first such is the resource de- ly not very expressive. Many KR lan- person to a character string”). scription framework schema (RDFS), guages developed by AI researchers Terms for restrictions on property val- which is useful for creating vocabularies had the ability to express more com- ues of particular classes: in KR, it is im- on the Web, extending RDF into some- plex relationships between entities portant to be able to express that par- thing more like the frame representa- than simple domain and range con- ticular class/property pairings respect tions used in AI. This is done by adding straints, and a language that could certain restrictions. This permits the something akin to the ISA link (allow- represent more powerful ontologies expression of various kinds of quan- ing the creation of a “class” and the ex- was needed for a number of emerging tification (universal vs. existential, pression of “subclassOf” relations be- use cases on the Web.21 Based on the cardinality) on OWL classes. tween them in RDFS) and by being able progress of a number of research pro- Terms for expressing similarity or dif- to name the slots expected to be associ- jects that were exploring how to use ferences between classes, properties, ated with a class (using “domain” and URIs for ontologies on the Web (Heflin and/or individuals. “range” statements). The rdf:type state- and Hendler, 2000), how to represent among others. ment is used to link an individual to a knowledge in an XML syntax,22 and In doing this, OWL forms a fairly class, thus allowing instances and class- how to do logic-based reasoning for straightforward KR system, which is es to be distinguished (if desired). such Web languages (Fensel et al. restricted in its scope, but a standard One of the key advantages to the use 2000), the U.S. Defense Advanced Re- that is quite useful (and easy to ex- of RDFS over earlier such framelike rep- search Projects Agency (DARPA) creat- tend). resentations is that it is built to be com- ed a program aimed at funding the de- The importance of the fact that patible with the Web and Web architec- velopment of a standard in this area these languages are Web recommen- ture. First, all class names and (called the DARPA Agent Markup Lan- dations is easy to underestimate, but properties respect the RDF convention guage, DAML). Interest in the use of really terribly important. These Se- of being expressed as URIs, thus mak- DAML beyond the military grew, and mantic Web languages, by having ing the linking of classes and properties the World Wide Web Consortium cre- gone through the often-painful across documents possible. Second, ated the Web Ontology Working process of standardization, have and more importantly, there is a con- Group to create a Web standard based gained significant buy-in from the vention in RDFS that the URI of the on DAML. The Working Group began Web development and support com- document in which the vocabulary is in November 2001, and the OWL lan- munities. For example, the Firefox defined should be the same as the URI guage was officially recognized as a browser uses RDF extensively in sup- used in the relations. Thus, if I see a vo- W3C Recommendation in February of port of bookmarks, history, configura- cabulary term like the aforementioned 2004.23 tion management, and interface de- “www.cs.umd.edu/~hendler/links#de- Although OWL contains expressivity sign, and I already described the use signatesPerson,” then the document that goes beyond that of description of XMP by Adobe. IBM has an ontol- where the designatesPerson relation is logic reasoners, the easiest approxima- ogy management system called defined would be assumed to be the tion for the sake of this article is to SNObase, which supports OWL, avail- one in this URI. By HTTP-GETting this think of OWL as a DL-like KR language able on its Alphaworks portal. Oracle document I, or my computer agent, for the Web. It adds a number of repre- has recently announced support for can see the vocabulary definition asso- sentational features to RDF Schema, in- RDF in Oracle DB10.2 and has pub- ciated with it. That is, in that file there cluding, among others, the following: licly announced that it is considering

82 AI MAGAZINE 25th Anniversary Issue support for OWL in DB 11.0, and the list goes on with a number of compa- nies both large and small, developing data stores, APIs, reasoners, and appli- cations for these Semantic Web lan- guages. In short, these languages are not just another AI “religion” but are the sort of consensus standards that have a shot at wide distribution and use. (Contrast this to one of the mis- takes of the expert systems days, when AI companies competed to sell differ- ing shells, trying to lock users into one rule-based representation over anoth- er, rather than developing some stan- Figure 6. Feline Leukemia Ontology. dard rule languages and competing to provide better support for the com- mon representation.) ism_Affected” such that anything fill- bilities of RDFS and OWL from the What’s in a ing this property must be of type purely positive side. But there’s a po- “CYC:cat” (that is, cat as defined in the tential downside as well. After all, con- (Semantic Web) Link? OWL version of OpenCyc that has sider the following three issues: The real power of OWL and these oth- been released on the Web). First, OWL isn’t very expressive—it er Semantic Web languages, however, Why is this exciting? Consider, that certainly isn’t a KIF, let alone even a is not so much in their abilities as tra- the NCI ontology contains about full KL-ONE. The state of the art in AI ditional KR languages, or even in the 40,000 class definitions about cancers languages long ago surpassed the ex- standardization, but much more im- and cancer-related disease. The por- pressivity of OWL. portantly in what Berners-Lee refers to tion of the CYC ontology released in Second, Web linking is not for free; as the “webizing” of these languages. OWL to date has about 50,000 terms while the “network effect” of ontolo- He writes: about the “commonsense” world. So gies and data and Web resources The essential process in webizing is to this little ontology is linked to close to linked to other ontologies and data take a system which is designed as a 90,000 class definitions. This means and Web resources is an exciting new closed world, and then ask what hap- that if I create an instance of a cat with playpen, there are challenges. What pens when it is considered as part of this disease, (figure 7) my reasoner can happens if the NCI adds some new in- an open world. Practically, this effect know (by HTTP-GETting the defini- formation to the ontology I’m linked on a computer language is to replace tions from the linked files) that Puff is to and includes some information I the names/tokens/identifiers for URIs. a cat, is a feline, is an animal, could be don’t agree with (after all, that docu- Thus, where before reference could a pet owned by a person, is a member ment is out of my control). Consider only be made to something in the of the carnivore order (implying that same document/program/module what would happen to the reasoner one can with equal ease make refer- her teeth have adapted for eating trying to learn more about poor dis- ence to something in a different one meat), and so on (from Cyc) and that eased Puff if the CYC server is somewhere in that abstract space poor Puff is inflicted with feline down—what is an agent to do when which is the Web.24 leukemia, which is a leukemia, which faced with a semantic 404 error? So, by using RDF as the basis of OWL, is a cancer disease, that there has been Third, the web of knowledge I’ve we have webized KR, and this has pro- a clinical trial to explore whether Mer- been describing cannot be guaranteed found implications that we are only captoethane Sulfonate is effective consistent and, in fact, is pretty much starting to explore. against this disease in people, and so guaranteed not to be! On the Web Consider the ontology defined in on (from the NCI ontology). In short, there will be multiple causes of incon- OWL in figure 6 (this is the file in its the webized KR language lets us create sistency including disagreement full, with the exception of some header a web of terminology and definitions (when does life begin?), error (did I say information). This figure, in the XML that spans across multiple Web pages “Cat”? oops, I meant “Car”) and dis- form, states that this is defining a class and has the potential to reuse knowl- honesty (my site has just the term called “Feline Leukemia”, which is a edge across applications and domains. you’re looking for, you just have to subclass of the class Leukemia that is pay to use it…). KR in AI has worked defined elsewhere by the National But Wait, What About … off an assumption that KBs are largely Cancer Institute (“NCI:Leukemia” is consistent, and the approaches to rea- an abbreviation for the full URI.)25 It By this point, I suspect many a dis- soning in inconsistent knowledge goes on to say that there is a restriction cerning reader is ready to cry “foul.” bases are not yet scalable. on the property called “NCI:Organ- So far, I’ve talked about the new capa- To me, however, this seems more of

WINTER 2005 83 25th Anniversary Issue

:Puff a CYC:cat. http://www.cs.umd.edu/~hendler/links#feline-leukemia NCI:Organism_Affected :Puff.

Figure 7. An Instance of a Cat with Leukemia. a promise than a threat. Consider the S.; Sachs, J.; Doshi, V.; Reddivari, P.; and 14. usefulinc.com/doap. analogies—HTML (and even XML) is Peng, Y. 2004. Swoogle: A Search and Meta- 15. www.w3.org/2004/02/skos/. data Engine for the Semantic Web. In Pro- much less expressive than the lan- 16. See www.schemaweb.info/ for many ceedings of the Thirteenth ACM Conference on guages it derives from, and the Web more examples. “hypertext” is significantly less power- Information and . New York: Association for Computing Ma- 17. swoogle.umbc.edu. ful than many of the hyperbooks that chinery. 18. Or RDF: www.w3.org/RDF; for OWL: preceded it; the Web 404 error is crucial www.w3. org/ 2004/OWL. Fensel, D.; Horrocks, I.; Van Harmelen, F.; to the design of a scaleable open sys- Decker, S.; Erdmann, M.; and Klein, M. 19. www.w3.org/2001/sw. tem, and just take a brief look at “blog 2000. OIL in a Nutshell., In Knowledge Ac- 20. www.cs.umd.edu/users/hendler/in- space” if you want to see an open and quisition, Modeling, and Management, Pro- dex.html. thriving Web culture based on disagree- ceedings of the European Knowledge Acquisi- 21. www.w3.org/TR/webont-req/ is the ment, misinterpretation errors, and tion Conference (EKAW-2000), ed. R. Dieng “use cases and requirements” document yes, at times, calculated dishonesty. and O. Corby. Lecture Notes in Artificial In- for Web ontologies, which summarizes a In short, bringing AI to the amaz- telligence, Volume 1937. Berlin: Springer- number of these. ingly huge, open and changing world Verlag. 22. www.ai.sri.com/pkarp/xol/. that is the Web is one of the most ex- Golbeck, J.; Alford, A.; and Hendler, J. 23. The details of OWL, and of the com- citing, and potentially high payoff, 2003. Organization and Structure of Infor- promises between expressivity, flexibility, challenges to ever hit our field! We tru- mation Using Semantic Web Technologies. decidability, and so on that were involved ly may be where hypertext was before In Handbook of Human Factors in Web De- in its design, are beyond the scope of this sign, ed. R. Proctor and K. Vu. 2003. Mah- the Web—sitting on top of an impor- article—www.w3.org/2004/OWL is the of- weh, NJ: Lawrence Erlbaum. tant technology that can, for the first ficial OWL Web site, and contains not only Heflin, J., and Hendler, J. 2000. Dynamic the official documents but also documents time in fifty years, be linked into a Ontologies on the Web, In Proceedings of the and mail archives discussing the issues that “single global system” realizing its full Seventeenth National Conference on Artificial were, and sometimes were not, resolved. potential, and building what Bob En- Intelligence. Menlo Park, CA: AAAI Press. 24. www.w3.org/DesignIssues/webize. gelmore was calling for when he stated Woods, W. 1975. What’s in a Link: Founda- 25. The full URI is www.ncibi.nih.gov/ that the full potential of the advanced tions for Semantic Networks. In Representa- NCIT/NCIT.owl#leukemia. sort of expert systems he was envision- tion and Understanding: Studies in Cognitive ing required the creation of a “knowl- Science, ed. D. Bobrow and A. Collins. New edge infrastructure” that could help in York: Academic Press. “breaking the knowledge engineering bottleneck.” I contend that the we- Jim Hendler is a profes- bized KR language that OWL brings to Notes sor of computer science the Web (along with future exten- 1. See www.wtec.org/loyola/kb/c1_s1.htm. at the University of sions, which may include rule lan- 2. See www.w3.org/Talks/WWW94Tim/. Maryland. Having been born about nine months guages and more expressive logics), in 3. www.flickr.com/. after the Dartmouth allowing the linking of knowledge 4. www.adobe.com/products/xmp/ main. sources, in being standardized so that Workshop, he is fond of html. pointing out that he and the use of these languages brings 5. dublincore.org/. the field were conceived greater interoperability, and in build- 6. www.adobe.com/products/xmp/main. at about the same time. In the past few ing on the Web infrastructure that has html. years, Hendler has been working on bring- become so pervasive in modern soci- 7. A number of these are described at ing AI to the Web and has devoted much of his effort to technical development and ety, creates the very knowledge infra- w3photo.org/semantic/. structure that Bob prescribed more evangelism of and for the Semantic Web. 8. www.mindswap.org/2003/PhotoStuff/. than a decade ago. 9. www.mindswap.org. References 10. semspace.mindswap.org. Berners-Lee, T.; Hendler, J.; Lassila, O. 2001. 11. www.foaf-project.org/. The Semantic Web. Scientific American 12. Mine is at www.cs.umd.edu/~hendler/ 284(5): 34–43. 2003/foaf.rdf—click on it and see! Ding, L.; Finin, T.; Joshi, A.; Pan, R.; Cost, R. 13. creativecommons.org/.

84 AI MAGAZINE