Revelytix Sicop Presentation DRM 3.0 with Wordnet Senses in A

Knoodl.com Semantic Wiki
Creating and using OWL vocabularies in a wiki
Michael Lang
Revelytix
June 18, 2007

Agenda
► What is a Semantic Wiki
► Building the Semantic Models
► Bootstrapping COI-based vocabularies
    - With WordNet context and description
► COI vocabularies in a semantic Wiki
    - OWL models
► Semantic Wiki

Wiki
► A website where anyone can edit the content of the site easily
► Wikis are now established as mainstream technology for collaboration
    - On the world wide web
    - Within the enterprise
► Also managing a lot of content
    - Many kinds of files can be linked to or embedded into the wiki

Wiki Drawbacks
► Information is organized in a manner similar to a file system
    - It can be very difficult to find documents on a wiki after the wiki reaches a certain size
    - Just like the file system on your personal computer
        ► Except: you organized everything on your laptop
        ► Everyone else organized content on the wiki

Wiki Drawbacks
► Even though wikis are collaboration and content management systems
    - There is no information model that can be used to manage the content
    - Wikis contain structured, unstructured and other sorts of content

Semantic Wiki 1.0
► A wiki-based tool for building formal semantics
    - Community based, collaborative
    - Both structured and unstructured content is managed in the same collaborative framework
    - Imports and exports OWL
    - Accessible by non-domain experts

Semantic Wiki 2.0
► A wiki that enables any content, structured and unstructured, to be “tagged” so the content can be queried and reasoned over
    - Tagged means adding content to an OWL-based ontology
► An integrated and queryable knowledgebase
    - Query is very different from search
    - Queries can be embedded into the wikitext

Wiki Vocabularies
► The semantics for any domain are created within the wiki as an OWL vocabulary
    - Project management, event management, social networks, logistics, acquisition, bioinformatics, CRM
► Multiple domain models might be available concurrently
► Once the vocabulary is created and published the semantics can be leveraged to achieve
    - Interoperability
    - Integration
    - Discovery
    - Semantic matching
    - Semantic Wikis

Bootstrapping COI Vocabularies in a Semantic Wiki

Bootstrapping Ontologies
► Step 1: Start at the bottom
    - Build vocabularies from existing physical systems
► Step 2: Collaborate
    - The community can document, review, discuss and change
    - Human-readable documentation and formal ontology definition
► Step 3: Share and Use
    - People access the vocabularies through web browsers to view the natural language documentation and navigate formal relationships
    - Machines can download OWL ontologies and use them for automated reasoning

Step 1: Start at the Bottom
► Bootstrap from existing systems and models
    - Import the schemas from databases to start building the terms in the vocabulary
    - Messages, Excel, metadata repositories
► Use a semantically enabled matching tool to associate semantics with the bootstrapped terms
    - Combine the terms used with knowledge bases to discover and assign semantics to information
    - Store the terms, definitions and semantics in vocabularies
    - Built-in knowledge base is WordNet, but can also use custom domain-specific ones

Vocabulary Management
Step 1: Extract semantics from existing data
[Diagram: DB and XML sources]

Vocabulary Management
Step 2: Create bootstrapped vocabulary
[Diagram: OWL outputs]

Step 2: Collaborate
► Creating vocabularies is naturally collaborative
    - Identify, define, document, standardize, edit, review, audit
    - Involve the right people
    - Reuse other vocabularies: benefit from the experts
► Community-oriented
    - A community consists of members that share experience, expertise and interest in a particular domain
    - Communities manage memberships, content, and access privileges
► Semantic Wiki
    - Captures the efforts of many over time
    - Adds semantic richness to wiki markup language

Vocabulary Development
[Diagram. Recoverable labels: MatchIT: semantic matching (client); Knoodl.com: web-based ontology editor (server, hosted or appliance); OEM: third-party modeling and data integration technology stacks; vocabulary matching & discovery algorithms; import of relational models & XSDs; export of match sets (ontologies); download of domain ontologies; upload of common terms & defs; import of files; export of vocabularies; community; governance; ontology / model validation; matching or inferencing / reasoning; repository / registry; OWL files; formal ontologies; Web 2.0 applications and mash-ups]

Step 3: Share and Use
► Machines use ontologies
    - The vocabularies are represented with formalisms that are rich and precise enough for software
    - Vocabularies can be downloaded as OWL ontologies
► People use natural language
    - (Most) People don’t understand XML, OWL, RDF, or even HTML
    - People understand text, images, tables, charts, links
    - Follow existing web paradigms that people are comfortable with (browsers, links, pages, addresses, search, discussions, etc.)
► Keep the two parts together
    - People have to understand the vocabulary to maintain and use it
    - If the parts are kept together, it is more difficult for them to diverge
    - It’s simply easier this way!
      (Manually aligning documentation with models is too much work)

Semantic Wiki

Knoodl.com
► Uses the Wiki paradigm to enable the development and use of OWL vocabularies by Communities of Interest (COIs)
    - W3C-based OWL editor, registry/repository
    - Facilitate sharing

Knoodl.com is …
► An internet application where people can collaborate with others in their communities of interest to
    - Create, edit, share and find
    - Vocabularies / ontologies
► OWL Repository
    - Free, but licensing controlled by COIs
► Institutional Knowledge Management
    - Users contribute content and benefit from the content
    - Vocabularies capture much of the institutional knowledge of an enterprise or community
    - Gain value over time

Knoodl.com
► Knoodl is a collaborative framework
► We need three groups of stakeholders contributing to the description and context of the domain
    ► Businesspeople
    ► Technical people
    ► Data people
    - Knoodl provides the features for the business people to participate

Vocabulary Management
Evolve vocabulary collaboratively

Vocabulary Management
Use vocabulary to understand

Semantic Wiki
► Incorporate formal semantic technology into the preeminent collaboration technology
    - Features that facilitate the construction of formal semantic models
    - Features that make it simple and even
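
As an illustration of the bootstrapping flow described under "Step 1: Start at the Bottom" (import database schemas, attach definitions from a knowledge base, store the result as an OWL vocabulary), here is a minimal Python sketch. It is not how Knoodl.com or MatchIT work internally; the slides do not say. The SQLite source, the TOY_GLOSSES dictionary (a hypothetical stand-in for a knowledge base such as WordNet), the example.org namespace, and all function names are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for a knowledge base such as WordNet: term -> definition.
# The definitions are invented for this sketch, not quoted from WordNet.
TOY_GLOSSES = {
    "customer": "someone who pays for goods or services",
    "name": "a language unit by which a person or thing is known",
}


def extract_terms(conn):
    """Read table and column names from the SQLite catalog: {table: [columns]}."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    schema = {}
    for table in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        rows = conn.execute(f'PRAGMA table_info("{table}")')
        schema[table] = [row[1] for row in rows]
    return schema


def literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe(subject, pairs):
    """Render one Turtle subject with its predicate/object pairs."""
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in pairs)
    return f":{subject} {body} ."


def to_turtle(schema, glosses, base="http://example.org/vocab#"):
    """Emit tables as owl:Class and columns as owl:DatatypeProperty.

    Assumes table and column names are already valid Turtle local names.
    """
    out = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for table, columns in schema.items():
        pairs = [("a", "owl:Class"), ("rdfs:label", literal(table))]
        gloss = glosses.get(table.lower())
        if gloss:
            pairs.append(("rdfs:comment", literal(gloss)))
        out.append(describe(table, pairs))
        for column in columns:
            pairs = [("a", "owl:DatatypeProperty"),
                     ("rdfs:domain", f":{table}"),
                     ("rdfs:label", literal(column))]
            gloss = glosses.get(column.lower())
            if gloss:
                pairs.append(("rdfs:comment", literal(gloss)))
            out.append(describe(f"{table}_{column}", pairs))
    return "\n".join(out)


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
    print(to_turtle(extract_terms(demo), TOY_GLOSSES))
```

The Turtle text this produces is only a starting point: terms with no entry in the knowledge base (such as `id` here) get a label but no definition, which is where the collaborative review described in Step 2 would come in.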