Semantic Web & Linked Data in Enterprises: An Overview
Semantic Web Company Workshops & Trainings

Welcome!
Andreas Blumauer, MSc IT
CEO of Semantic Web Company, Vienna
Acknowledged computer expert in the areas of Text Mining, Semantic Web, Knowledge Modelling & Linked Data

• Some initial thoughts on “Information Quality” & “Knowledge Creation”
• What is the Semantic Web? What is Linked Data?
  - The end of documents?
  - Standards & Norms
• Some examples of Linked Data Applications
• Linked Data in the context of information management

About Semantic Web Company (SWC)
• SWC founded 2001 in Vienna
• More than 25 Linked Data experts
• Product: PoolParty Suite (on the market since 2009)
• Customers from all sectors
• EU- & US-based Partner Network

Our network: Customers & Partners
Finance / Automotive / Publisher / Health Care / Public Administration / Energy / Education

Customers
● Credit Suisse
● Daimler
● Roche
● Wolters Kluwer
● Tieto
● Canadian Broadcasting Corporation (CBC)
● World Bank Group
● The Pokémon Company
● Healthdirect Australia
● Ministry of Finance (A)
● Wood Mackenzie
● Red Bull Media House
● Council of the E.U.
● TC Media
● American Physical Society
● Education Services Australia
● Pearson
● Techtarget
● Norwegian Directorate of Immigration
● REEEP
● European Commission
● Bank of America

Partners
● Cognizant
● EBCONT
● EPAM Systems
● iQuest
● PwC
● DTI AG
● Tenforce
● OpenLink Software
● Ontotext
● MarkLogic
● Gravity Zero
● Altotech
● Wolters Kluwer
● Term Management
● Taxonomy Strategies
● Search explained
● WAND
● Digirati
● Cognistreamer
● Linked Data Factory
● Taxonic
● semweb

Information Quality & Knowledge Creation

“Information Quality”: The Enterprise View
• Information is often treated as a ‘2nd class citizen’ in enterprises
• Information management lies in the responsibility of the CTO → Information as technical artefact
• Trend towards information silos, no standards
• The value of contextual information and premium metadata is often underestimated
• Business models rarely recognize the benefits of collaborative practices
→ Hypothesis 1: “The information demands of customers are often being neglected.”
→ Hypothesis 2: “Enterprises face increasing competitive pressure due to a lack of informational agility.”

“Information Quality”: A Meta-Perspective
Humans & Information (CIO-View)
[Figure: Hans Rosling: Growth of the global population; labels “Analog” and “Digital”]
Information increases in value,
• when communicators share a mutual understanding (common sense), and
• when information is designed according to the needs of its recipients (personalisation)
→ Hypothesis: “The ability to transfer knowledge (contexts, interdependencies) becomes more important.”

“Information Quality”: A Meta-Perspective
Humans & Information (CTO-View)
Information increases in value,
• the lower its integration costs, and
• the cheaper its reusability in various contexts
→ Hypothesis: “Providing information (content) in various formats as a service via APIs is key to increasing information quality from a technical perspective.”

What is the Semantic Web? What is Linked Data?

Data as Precursor of Knowledge
[Figure: LOD Cloud]

Challenges in Data & Information Management
1. Distributed Data Sources
2. Differing Formats
3. Implicit Semantics
4. Dubious Provenance
5. Missing Licenses
6. Unclear Topicality

The Semantic Web: ‘Things’ not Strings
[Figure: graph of concepts identified by URIs (http://www.mycom.com/taxonomy/97345854, http://www.mycom.com/taxonomy/62346723, http://www.mycom.com/taxonomy/4543567, http://www.mycom.com/images/90546089), connected by “has broader” and “image” relations and carrying the labels “St. Mark’s Square”, “Venice”, “Piazza San Marco”, “Piazza” and “Square” via prefLabel and altLabel]

The power of knowledge graphs: Agility, flexibility, complexity
[Figure, built up over several slides: documents about Norway, France, Austria and Canada, compared under a “Traditional approach” and a “Graph-based approach”. Queries shown: “Show me all documents about European countries”, “Show me all documents about EU member countries”, “French-speaking?”. In the traditional approach, terms such as “Europe”, “America”, “EU” and “French” appear as metadata per document; in the graph-based approach, “Europe”, “EU” and “French-speaking” appear as nodes above the countries, labelled “Knowledge about metadata”.]

Linked Data: Discovering Answers to Complex Questions
To answer the following question,
“Are there interdependencies between the Human Development Index of certain countries and the regional research activities concerning specific types of illnesses?”
the following sources can be consulted and linked:
● MeSH (Medical Subject Headings)
● PubMed
● Geonames
● DBpedia
● UNDP

Interlinking of various Knowledge Graphs & Ontologies is key
[Figure: the concepts “Venice” (http://www.mycom.com/taxonomy/5456544), “St. Mark’s Square” (http://www.mycom.com/taxonomy/62346723) and “Peggy Guggenheim Museum” (http://www.mycom.com/taxonomy/7835488), connected via http://schema.org/containedIn and http://schema.org/location and linked to external resources and types: http://www.geonames.org/7302945, http://www.freebase.com/m/0q9rr, http://dbpedia.org/resource/Peggy_Guggenheim_Collection, https://www.youtube.com/VeniceGuggenheim, http://schema.org/City, http://schema.org/TouristAttraction, http://schema.org/ArtGallery]

Semantic Web - The End of Documents?

The End of Documents?
What is a Document? What should it be?
● Production: A tool to create information?
● Storage: A method to store information?
● Visualization: A convention to visualize and represent information?
● Interface: An access point (API) or container to connect to information and make it findable?
● Craft: The art of telling stories, triggering emotions and/or creating common sense?
● ?

Knowledge workers link and contextualize information!
Journal article, Dossier, Social Web profile, Health Record, Blog post, Product information, Law, News article, Campaign, Regulation, Poem, Contract, Tweet, Product specification

“Follow your nose (‘nous’)”

...some more graphs
Microsoft “Office Graph”
Google “Knowledge Graph”
Facebook “Social Graph”

What exactly do knowledge workers interlink?
• Entities, not documents!
• Things, not strings!
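
The “things, not strings” idea and the graph-based approach shown earlier can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how PoolParty or any triple store works internally: the http://example.org/ URIs, the ex:memberOf predicate, the document names and the helper functions are all hypothetical. Only the prefLabel, altLabel and broader relation names are borrowed from SKOS. Documents are tagged with concept URIs, and knowledge such as “France is in Europe” is stated once in the graph instead of in each document's metadata.

```python
# Hypothetical namespace for illustration; not taken from the slides.
EX = "http://example.org/concept/"

# The knowledge graph as a set of (subject, predicate, object) triples.
TRIPLES = {
    (EX + "norway", "skos:prefLabel", "Norway"),
    (EX + "france", "skos:prefLabel", "France"),
    (EX + "austria", "skos:prefLabel", "Austria"),
    (EX + "canada", "skos:prefLabel", "Canada"),
    (EX + "europe", "skos:prefLabel", "Europe"),
    (EX + "eu", "skos:prefLabel", "European Union"),
    (EX + "eu", "skos:altLabel", "EU"),
    (EX + "norway", "skos:broader", EX + "europe"),
    (EX + "france", "skos:broader", EX + "europe"),
    (EX + "austria", "skos:broader", EX + "europe"),
    (EX + "france", "ex:memberOf", EX + "eu"),   # ex:memberOf is an assumed predicate
    (EX + "austria", "ex:memberOf", EX + "eu"),
}

# Documents carry only concept URIs ("things"), never free-text keywords.
DOC_TAGS = {
    "doc1": {EX + "norway"},
    "doc2": {EX + "france"},
    "doc3": {EX + "austria"},
    "doc4": {EX + "canada"},
}


def concept_for_label(label):
    """Resolve a string to the concept URI that carries it as a label."""
    for subject, predicate, obj in TRIPLES:
        if predicate in ("skos:prefLabel", "skos:altLabel") and obj == label:
            return subject
    return None


def documents_about(label, predicate):
    """Return documents tagged with a concept linked to `label` via `predicate`."""
    target = concept_for_label(label)
    related = {s for s, p, o in TRIPLES if p == predicate and o == target}
    return sorted(doc for doc, tags in DOC_TAGS.items() if tags & related)


european_docs = documents_about("Europe", "skos:broader")
eu_docs = documents_about("EU", "ex:memberOf")
print(european_docs)  # ['doc1', 'doc2', 'doc3']
print(eu_docs)        # ['doc2', 'doc3']
```

In this sketch, answering a new question such as “French-speaking?” would only require adding triples to the graph; the document tags stay untouched.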
PoolParty Tagging Workflow
[Figure: sample text (“Lorem ipsum dolor sit amet, ...”) in which “strings” become “things”; steps labelled Corpus Analysis and Quality Checks]

Concept-based Tagging in Enterprise Content Systems
Drupal, Confluence, SharePoint 2013

‘Google’s Knowledge Graph’ as an example of semantic information machines
Enterprises have just started to create their own, specific knowledge graphs.
Which new opportunities can be derived from this development for the information management industry?

Mashup from knowledge graphs and API calls!

BBC’s Linked Data Platform: How many information sources do you see?
“Individual CMSs are pretty good at keeping tabs on the content they create but if you wanted to get hold of the 20 most recent pieces of content from across the BBC (and hence across CMSs) on Burkina Faso, or Jarvis Cocker or global warming it would be very tricky.”
(Oli Bartlett, product manager for the BBC's Linked Data Platform)

Clean Energy Data - Country Profiles

Linked Data is a data model, which is based on graphs
● Linked Data is a graph-based data model that is expressive enough to represent and to process a wide spectrum of types of information
→ Being used for Data Integration & Dynamic Semantic Publishing (DSP) in distributed environments (“Semantic Web”)

Semantic Web Standards & Technologies

Resource Description Framework (RDF)
Subject                 Predicate        Object
Semantic Web Company    is a             Organization
Semantic Web Company    is located in    Vienna

Simple Knowledge Organization System (SKOS)
Taxonomies and controlled vocabularies
http://www.w3.org/2004/02/skos/

From Simple SKOS to large knowledge graphs

Generate 1st version of SKOS taxonomy
- Reuse of existing vocabularies
- Corpus Analysis
- Excel import
- XML import
- Linked data harvester

Edit, extend & curate taxonomy
- Taxonomy Editing
- Collaborative workflows
- Free term extraction
- Tag recommender
- Quality Checker

Extend schema, apply ontologies, use SKOS-XL
- Reuse existing ontologies
- Create custom schemes
- Apply SKOS-XL
- Apply ontologies on your SKOS taxonomy

Link and map between taxonomies and LD graphs
- Automatic mapping between taxonomies
- Linked Data frontend
- Link to other LD graphs, e.g. DBpedia or Geonames

[Figure labels: your data (e.g. Excel), your docs, your CMS]

Linked Vocabularies - Linked Contents
[Figure: Wolters Kluwer Working Law Thesaurus, Eurovoc, STW Thesaurus, DBpedia]

Linked Data & Linked Vocabularies can be reused with increased efficiency
● Linked Data is based on standards and embedded in a wide data ecosystem
→ Semantic Web based ontologies,