Semantic Technology

Chris Welty
IBM Research

What are semantic technologies

- Dates back to the 60s, 70s, 80s, 90s
  - STRIPS, SNePS, CG, KL-ONE, NIKL, CLASSIC, LOOM, RACER, etc…
- Today we have standards, with normative XML syntaxes
  - Common Logic, IKL
  - RDF, SKOS, OWL, RIF
  - ODM, PRR

What can you do with Semantic Technology?

- Build information systems
  - Thesauri, terminologies
  - Learning, testing, training systems
  - Event processing, back-office systems
  - Software design automation, architecture
  - Web services, Planning/scheduling
  - Intelligence analysis
  - …what do you need?

Can’t Databases Do that???

- No
- Yes
- Well…

Advantages of Semantic Technology

- Consider Software Architecture
  - More declarative, open
  - Better abstraction
  - Cheaper maintenance
  - Better integration
- …by making the semantics explicit
  - At least a little…

Common Logic

- Standard (ISO/IEC 24707:2007) syntax and semantics for First Order Logic (FOL)
- XML and “KIF style” syntaxes
- “Web savvy” (can use URIs)
- A few implementations, still in nurturing stage
- A context-logic extension proposed (IKL)

Semantic Web

- RDF
  - A language for semantic graphs
  - The nodes are anywhere in the web
  - The arcs are labeled
- OWL
  - A language for giving more semantics to RDF graphs
  - Classes of nodes
  - Constraints, equality, negation
- RIF
  - Rules for extending graphs automatically

Can’t UML do that???

- There is overlap between UML and OWL
  - Classes, relations, constraints
- But there are significant differences
  - OWL has a full model-theoretic semantics
  - OWL is designed for specifying information systems
  - OWL limited to consistent, sound, and computable reasoning

ODM

- Interoperability standard between several semantic technologies
- Facilitates MOF-enabled tools to work with RDF, OWL, CL, …
  - Editors, translators
  - Visualization
- Gives semantic technologies a UML “flavor”

The Semantic Web Vision

- ~80% of web pages are generated from back-end databases
- Publish the semantics (schema?) as well as the data
- URIs provide a web-based form of identity
  - It’s the semantic WEB, not the SEMANTIC web
- NOT: humans will mark up their web pages with semantics
- NOT: NLP will populate the SW from web pages

Errors by analogy

- The web: just hypertext
- The web: bad UI design
- The semantic web: just semantic technology
- The semweb: bad KR design

History of Hypertext

- 1945: Vannevar Bush’s Memex
  - Associative indexing and links
- 1965: Ted Nelson coins “hypertext”
  - “Nonsequential writing”
- 1967: Andries van Dam’s Hypertext Editing System (sponsored by IBM)
- 1985: Janet Walker’s Symbolics Document Examiner
- 1987: Bill Atkinson’s HyperCard on the Mac
- 1991: Tim Berners-Lee proposes HTTP, HTML, & URL
  - Genesis c. 1989
- 1993: Marc Andreessen releases Mosaic for Mac, Unix, Windows…

Hypertext Research

- Dating back at least to the late 60s
- Many foci
  - Technology (mouse, software, protocols)
  - User interaction
  - Aesthetic
  - Post-modern
  - Engineering
- Largely ignored by web developers
  - Especially in the early days of the web (93-96)

Grassroots to the Web

- Early web dominated by “what it looks like” in Mosaic
  - Unimpressed UI and Hypertext researchers
- Focus on spreading the word, not doing it right
- Many early web pages didn’t have links in text at all
  - “Catalog” pages with lists of links
  - “Text” pages with few or no links
  - Embedded images more interesting than links
- Just do it rather than do it right
- But…
  - The web became serious
  - Then research started to matter
  - Tooling for web/UI design became important

Ontology Research

- Dating back…
- Multiple foci
  - Technology (logics, reasoners…)
  - Metaphysics (what there is)
  - Knowledge Acquisition
  - NLP
  - Engineering
- Largely ignored by SW developers
  - Web 2.0, groundswell
  - Specifically criticized by some SW pundits

Grassroots to the Semantic Web

- Dominated by putting lots of data up
  - Unimpressed KR and Ontology researchers
- Focus on spreading the word, not doing it right
- Most LOD sources don’t have published semantics at all
  - Sources that provide popular utility (imdb, )
  - LOD = “Linked Open” or “Lots Of” Data?
- Just do it rather than do it right

A little semantics…

- The SW catchphrase
  - “A little semantics goes a long way”
- Sometimes strengthened
  - A lot of semantics is too much
  - 80/20 rule
- Double-edged sword
  - FOAF doesn’t look like even 1%
  - The simplicity of FOAF hides any serious value proposition for SW
  - SW not for people, for data
  - Reasoning? Quality?

Wherefore Reasoning?

- Very hard to “sell” OWL reasoning
- Many users want very simple reasoning
  - Simple subclass
  - Simple range/domain constraints
  - Simple rules (q.v. RIF)
- Some users want more than OWL
  - But just to express their semantics, not in run-time system
- Reasoning supports quality
  - Improving precision? Must be measured.
  - Improving recall?

Getting it right

- Does quality matter?
  - Good quality ontologies cost more
  - Required for some applications
- Improvements in quality can improve performance [Welty, et al, 2004]
  - 18% f-improvement in search
  - Cleanup cost ~1mw/3000 classes
  - BUT … low quality ontology still improved base

Where’s it going?

- Real business uses will require reasoning & quality
- Reasoning & Quality drive need for tooling
- W3C not about APIs

Tooling

- Protégé (Stanford & Manchester)
- ODM metamodel & profiles
  - Eclipse-based
  - ATL
- IBM STK (Semantic Technology Toolkit)
  - alphaworks
- VOM
  - UML-profile plugin (e.g. Rose, MagicDraw, RSA…)
  - Integration with reasoning services
- …others (growing)
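
Sketch: simple reasoning over triples

The "very simple reasoning" listed under Wherefore Reasoning? (simple subclass, simple range/domain) can be illustrated in a few lines of plain Python. This is only a sketch under stated assumptions: a graph is a set of (subject, arc label, object) tuples, the ex: names are invented for illustration, and domain/range are read RDFS-style as rules that add type facts rather than as checks that reject data. It does not reflect how any particular reasoner or toolkit is implemented.

```python
# Arc labels borrowed from RDF/RDFS; used here as plain strings.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def infer(triples):
    """Return the closure of `triples` under three simple rules."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == SUBCLASS:
                # Subclass is transitive: s < o and o < o2 gives s < o2.
                new |= {(s, SUBCLASS, o2)
                        for s2, p2, o2 in facts
                        if p2 == SUBCLASS and s2 == o}
                # Instances of the subclass s are instances of o.
                new |= {(x, TYPE, o)
                        for x, p2, c in facts
                        if p2 == TYPE and c == s}
            elif p == DOMAIN:
                # Anything used as subject of property s has type o.
                new |= {(x, TYPE, o) for x, p2, _ in facts if p2 == s}
            elif p == RANGE:
                # Anything used as object of property s has type o.
                new |= {(y, TYPE, o) for _, p2, y in facts if p2 == s}
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


# Hypothetical example graph (all ex: names are made up).
graph = {
    ("ex:Actor", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:actedIn", DOMAIN, "ex:Actor"),
    ("ex:actedIn", RANGE, "ex:Film"),
    ("ex:alice", "ex:actedIn", "ex:film1"),
}

closure = infer(graph)
for triple in sorted(closure - graph):
    print(triple)  # the facts added by reasoning
```

The single asserted arc about ex:alice is enough to derive that she is an ex:Actor, an ex:Person, and an ex:Agent, which is the sense in which a little semantics extends the graph automatically.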