Semantic Integration Across Heterogeneous Databases: Finding Data Correspondences Using Agglomerative Hierarchical Clustering and Artificial Neural Networks


DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2018

MARK HOBRO
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Master in Computer Science
Date: April 11, 2018
Supervisor: John Folkesson
Examiner: Hedvig Kjellström
Swedish title: Semantisk integrering mellan heterogena databaser: Hitta datakopplingar med hjälp av hierarkisk klustring och artificiella neuronnät
School of Computer Science and Communication

Abstract

The process of data integration is an important part of the database field when it comes to database migrations and the merging of data. Research in the area has grown with the addition of machine learning approaches in the last 20 years. Due to the complexity of the research field, no go-to solutions have appeared. Instead, a wide variety of ways of enhancing database migrations have emerged. This thesis examines how well a learning-based solution performs for the semantic integration problem in database migrations.

Two algorithms are implemented. The first is based on information retrieval theory and yields a matching result that serves as a benchmark for measuring the performance of the machine learning algorithm. The machine learning approach is based on grouping data with agglomerative hierarchical clustering and then training a neural network to recognize patterns in the data.
This allows making predictions about potential data correspondences across two databases.

The results show that agglomerative hierarchical clustering performs well in the task of grouping the data into classes. The classes can in turn be used for training a neural network. The matching algorithm gives a high recall of matching tables, but improvements are needed to achieve both high recall and high precision.

The conclusion is that the proposed learning-based approach, using agglomerative hierarchical clustering and a neural network, works as a solid base for semi-automating the data integration problem seen in this thesis. However, the solution needs to be enhanced with scenario-specific algorithms and rules to reach the desired performance.

Sammanfattning (Swedish summary)

Data integration is an important part of the database field when it comes to database migrations and the merging of data. Research in the area has increased as machine learning has become an attractive approach over the last 20 years. Due to the complexity of the research area, no optimal solutions have been found. Instead, several different techniques have been developed, which together can improve database migrations. This thesis examines how well a solution based on machine learning performs for the data integration problem in database migrations.

Two algorithms have been implemented. One is based on information retrieval theory and is mainly used to provide a performance baseline for the algorithm based on machine learning. That algorithm consists of a first step in which data is grouped using hierarchical clustering. An artificial neural network is then trained to find patterns in these groupings, in order to predict whether different data instances are related across two databases.

The results show that agglomerative hierarchical clustering performs well in the task of classifying the data used. The results of the matching algorithm show that a large share of the matching tables can be found. However, improvements are needed to achieve both high recall of matches and high precision for the matches that are found.

The conclusion is that a learning-based approach, in this case using agglomerative hierarchical clustering and then training an artificial neural network, works well as a basis for partially automating a data integration problem like the one presented in this thesis. To obtain better results, the solution needs to be improved with more situation-specific algorithms and rules.

Acknowledgements

I would first like to thank my supervisor at KTH, John Folkesson, for all the support during the project, as well as my examiner, Hedvig Kjellström, for making this thesis possible. I also want to express my gratitude to my supervisor at Sokigo, Kevin James, for giving me the opportunity to do my thesis at their office. Finally, I would like to thank my family and friends, who have supported me throughout the entire process.

Contents

1 Introduction
  1.1 Motivation
  1.2 Problem Definition
  1.3 Limitation
  1.4 Sustainability and Ethics
  1.5 Outline of Report
2 Background
  2.1 Semantic Integration
    2.1.1 Schema Matching
    2.1.2 Instance Matching
    2.1.3 Match Cardinality
  2.2 Rule-based Matching
  2.3 Information Retrieval
    2.3.1 Tf-idf and Cosine Similarity
  2.4 Learning-based Approach
    2.4.1 Data Representation
    2.4.2 Agglomerative Hierarchical Clustering
    2.4.3 Neural Network
  2.5 Difficulties in Schema Integration
    2.5.1 Linguistic Challenges
    2.5.2 Structural Ambiguity
  2.6 Related Work
3 Method
  3.1 Setup and Development Tools
  3.2 Dataset
    3.2.1 Preprocessing
  3.3 Tf-idf and Cosine Similarity
  3.4 Agglomerative Hierarchical Clustering
    3.4.1 Effect of Normalization
    3.4.2 Analysing Clustering Outcome
    3.4.3 Testing Dissimilarity Measures
  3.5 Neural Network
    3.5.1 Defining the Model
    3.5.2 Grid Search
    3.5.3 Name Matcher
    3.5.4 Training Setup
  3.6 Evaluating Performance
    3.6.1 Application in Database Environments
    3.6.2 Evaluating Testing Accuracy
4 Result
  4.1 Tf-idf and Cosine Similarity
    4.1.1 Precision and Recall
  4.2 Agglomerative Hierarchical Clustering
    4.2.1 Clustering Execution Times
    4.2.2 Clustering Visualization
  4.3 Neural Network
    4.3.1 Grid Search
    4.3.2 Model Evaluation
    4.3.3 Precision and Recall
5 Discussion
  5.1 Tf-idf and Cosine Similarity
  5.2 Agglomerative Hierarchical Clustering
  5.3 Neural Network
  5.4 About the Semantic Integration Problem
  5.5 Methodology
6 Conclusion
  6.1 Future Work
Bibliography
A Agglomerative Hierarchical Clustering
B Cluster Content Explanation

Chapter 1 Introduction

Semantic integration has been a widely discussed topic over the last three decades. The earlier work focused on structural integration, i.e. building global data models with the goal of integrating well-structured data [39]. With the continuous growth of information on the Internet, semi-structured and unstructured data became more common. Semantic integration of data was born out of the problems that came with this evolution. Dealing with heterogeneity across different data sources is complex, especially since it needs to be done on several levels, such as the schema level and the instance data level [39]. Therefore, semantic integration will continue to be a research field that requires attention, and it is nowadays an important area of database research [12].

1.1 Motivation

Semantic integration has evolved into a significant research area in the database community, driven by the constant need to process complex data. The research has taken two different paths.
Firstly, it has been investigated whether semantic integration can be used to integrate data automatically (or at least semi-automatically) in specific scenarios. Secondly, the research has focused on finding generic solutions, where semantic integration can be applied to build ontologies that can integrate a wider scope of heterogeneous data. The latter is rather complex, due to the immense diversity and range of data sources that can be encountered. Because of the diversity of data today, there are many data scenarios in which semantic integration has not been attempted for automatic schema and data integration. Therefore, there is a need to examine how well different semantic integration techniques perform for real-world data.

Automating database migrations can be widely useful. It is not uncommon for companies to migrate data between databases, and since modern technology evolves so swiftly, it is interesting to investigate how data migrations can be made more efficient.

1.2 Problem Definition

Earlier research in semantic integration for database systems has shown that it is possible to automate the integration process to a certain extent. But in many studies, the test data has been generic and has not covered many of the problems that come with matching complex and heterogeneous data. In this study, one goal is to build a generic solution for the integration process of complex and field-specific data and to analyse the difficulties that may appear.

It is interesting to investigate how different strategies, such as rule-based and learning-based ones, scale with increased data complexity. Will deep neural networks yield better performance? Other questions that will be considered are how well a learning-based approach can be applied to relational data and whether a learning-based approach alone is enough to solve the integration problem.
Is agglomerative hierarchical clustering a preferable solution for establishing classes that can be used for training a neural network?

The hypothesis is that both agglomerative hierarchical clustering and neural networks can be used to semi-automate the semantic integration problem, but that they will need the help of well-constructed rules when the complexity of the data increases. Semi-automation here means that the learning-based approach can distinguish the patterns between databases but is not sufficient on its own to yield optimal results (finding all data correspondences, which would mean full automation).
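
To make the idea of establishing classes through agglomerative hierarchical clustering more concrete, the following is a minimal sketch in plain Python. It is not the implementation used in the thesis: the two-dimensional column features, the average-linkage criterion, the Euclidean distance and the fixed number of clusters are all assumptions chosen for illustration, and no normalization is applied.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative_clustering(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is still valid after the deletion
    # Convert the clusters into one class label per input point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical column features: (mean value length, fraction of digit characters).
columns = {
    "customer_name": (12.0, 0.0),
    "owner_name": (11.0, 0.05),
    "postal_code": (5.0, 1.0),
    "zip": (5.0, 0.95),
}
labels = agglomerative_clustering(list(columns.values()), n_clusters=2)
for name, label in zip(columns, labels):
    print(name, label)
```

In this toy setting the name-like columns end up in one class and the code-like columns in another. Labels of this kind are what a classifier such as a neural network could then be trained on.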
Recommended publications
  • Semantic Integration and Knowledge Discovery for Environmental Research
    Journal of Database Management, 18(1), 43-67, January-March 2007. Semantic Integration and Knowledge Discovery for Environmental Research. Zhiyuan Chen, Aryya Gangopadhyay, George Karabatis, Michael McGuire, and Claire Welty, University of Maryland, Baltimore County (UMBC), USA. ABSTRACT Environmental research and knowledge discovery both require extensive use of data stored in various sources and created in different ways for diverse purposes. We describe a new metadata approach to elicit semantic information from environmental data and implement semantics-based techniques to assist users in integrating, navigating, and mining multiple environmental data sources. Our system contains specifications of various environmental data sources and the relationships that are formed among them. User requests are augmented with semantically related data sources and automatically presented as a visual semantic network. In addition, we present a methodology for data navigation and pattern discovery using multi-resolution browsing and data mining. The data semantics are captured and utilized in terms of their patterns and trends at multiple levels of resolution. We present the efficacy of our methodology through experimental results. Keywords: environmental research, knowledge discovery and navigation, semantic integration, semantic networks,
  • Will Dataspaces Make Data Integration Obsolete?
    DBKDA 2011, January 23-28, 2011 – St. Maarten, The Netherlands Antilles. DBKDA 2011 Panel Discussion: Will Dataspaces Make Data Integration Obsolete? Moderator: Fritz Laux, Reutlingen Univ., Germany. Panelists: Kazuko Takahashi, Kwansei Gakuin Univ., Japan; Lena Strömbäck, Linköping Univ., Sweden; Nipun Agarwal, Oracle Corp., USA; Christopher Ireland, The Open Univ., UK; Fritz Laux, Reutlingen Univ., Germany. [Figure: The Dataspace Idea. Space of data management, with axes for administrative proximity (near to far) and semantic integration (high to low); adopted from Franklin, Halevy, Maier, 2005] Dataspaces (DS) [Franklin, Halevy, Maier, 2005] is a new abstraction for Information Management. DS are [paraphrasing and commenting Franklin, 2009]: Inclusive: deal with all the data of interest, in whatever form => but semantics matters; we need access to the metadata! Derive schema from instances? Discovering new data sources => The Münchhausen bootstrap problem? (Illustration: Theodor Hosemann, 1807-1875.) Co-existence
  • Learning to Match Ontologies on the Semantic Web
    The VLDB Journal manuscript No. (will be inserted by the editor) Learning to Match Ontologies on the Semantic Web AnHai Doan1, Jayant Madhavan2, Robin Dhamankar1, Pedro Domingos2, Alon Halevy2 1 Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA fanhai,[email protected] 2 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA fjayant,pedrod,[email protected] Received: date / Revised version: date Abstract On the Semantic Web, data will inevitably come from many different ontologies, and information processing across ontologies is not possible without knowing the semantic mappings between them. Manually finding such mappings is tedious, error-prone, and clearly not possible at the Web scale. Hence, the development of tools to assist in the ontology mapping process is crucial to the success of the Semantic Web. We describe GLUE, a system that employs machine learning techniques to find such mappings. Given two ontologies, for each concept in one ontology GLUE finds the most similar concept in the other ontology. We give well-founded probabilistic definitions to several practical similarity measures, and show that GLUE can work with all of them.
    and much of the potential of the Web has so far remained untapped. In response, researchers have created the vision of the Semantic Web [BLHL01], where data has structure and ontologies describe the semantics of the data. When data is marked up using ontologies, softbots can better understand the semantics and therefore more intelligently locate and integrate data for a wide variety of tasks. The following example illustrates the vision of the Semantic Web. Example 1 Suppose you want to find out more about some-
  • Semantic Integration in the IFF
    Semantic Integration in the IFF Robert E. Kent Ontologos [email protected] Abstract The IEEE P1600.1 Standard Upper Ontology (SUO) project aims to specify an upper ontology that will provide a structure and a set of general concepts upon which domain ontologies could be constructed. The Information Flow Framework (IFF), which is being developed under the auspices of the SUO Working Group, represents the structural aspect of the SUO. The IFF is based on category theory. Semantic integration of object-level ontologies in the IFF is represented with its fusion construction*.
    ogies, composing ontologies via fusions, noting dependencies between ontologies, declaring the use of other ontologies3, etc. The IFF takes a building blocks approach towards the development of object-level ontological structure. This is a rather elaborate categorical approach, which uses insights and ideas from the theory of distributed logic known as information flow (Barwise and Seligman, 1997) and the theory of formal concept analysis (Ganter and Wille, 1999). The IFF represents metalogic, and as such operates at the structural level of ontologies. In the IFF, there is a precise boundary between the metalevel and the object level. The modular architecture of the IFF consists of metalevels, namespaces and meta-ontologies. There are three metalevels: top, upper and lower. This partition, which corresponds to the set-theoretic distinction between small (sets), large (classes) and generic collections, is permanent. Each metalevel services the level below by providing a language that is used to declare and axiomatize that level. The top metalevel services the upper metalevel, the upper
  • Rule-Based Intelligence on the Semantic Web Implications for Military Capabilities
    UNCLASSIFIED Rule-Based Intelligence on the Semantic Web Implications for Military Capabilities Dr Paul Smart Senior Research Fellow School of Electronics and Computer Science University of Southampton Southampton SO17 1BJ United Kingdom 26th November 2007 UNCLASSIFIED Report Documentation Page Report Title: Rule-Based Intelligence on the Semantic Web Report Subtitle Implications for Military Capabilities Project Title: N/A Number of Pages: 57 Version: 1.2 Date of Issue: 26/11/2007 Due Date: 22/11/2007 Performance EZ~01~01~17 Number of 95 Indicator: References: Reference Number: DT/Report/RuleIntel Report Availability: APPROVED FOR PUBLIC RELEASE; LIMITED DISTRIBUTION Abstract Availability: APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED Authors: Paul Smart Keywords: semantic web, ontologies, reasoning, decision support, rule languages, military Primary Author Details: Client Details: Dr Paul Smart Senior Research Fellow School of Electronics and Computer Science University of Southampton Southampton, UK SO17 1BJ tel: +44 (0)23 8059 6669 fax: +44 (0)23 8059 2783 email: [email protected] Abstract: Rules are a key element of the Semantic Web vision, promising to provide a foundation for reasoning capabilities that underpin the intelligent manipulation and exploitation of information content. Although ontologies provide the basis for some forms of reasoning, it is unlikely that ontologies, by themselves, will support the range of knowledge-based services that are likely to be required on the Semantic Web. As such, it is important to consider the contribution that rule-based systems can make to the realization of advanced machine intelligence on the Semantic Web. This report aims to review the current state-of-the-art with respect to semantic rule-based technologies.
  • Semantic Web Technologies and Data Management
    Semantic Web Technologies and Data Management. Li Ma, Jing Mei, Yue Pan (IBM China Research Laboratory, Beijing 100094, China); Krishna Kulkarni (IBM Software Group, San Jose, CA 95141-1003, USA); Achille Fokoue, Anand Ranganathan (IBM Watson Research Center, New York 10598, USA). Introduction The Semantic Web aims to build a common framework that allows data to be shared and reused across applications, enterprises, and community boundaries. It proposes to use RDF as a flexible data model and use ontology to represent data semantics. Currently, relational models and XML tree models are widely used to represent structured and semi-structured data. But they offer limited means to capture the semantics of data. An XML Schema defines a syntax-valid XML document and has no formal semantics, and an ER model can capture data semantics well but it is hard for end-users to use them when the ER model is transformed into a physical database model on which user queries are evaluated. RDFS and OWL ontologies can effectively capture data semantics and enable semantic query and matching, as well as efficient data integration. The following example illustrates the unique value of semantic web technologies for data management. Figure 1. An example of ontology based data management In Figure 1, we have two tables in a relational database. One stores some basic information of several companies, and another one describes shareholding relationship among these companies. Sometimes, users want to issue such a query “find Company EDOX’s all direct and indirect shareholders which are from Europe and are IT company”. Based on the data stored in the database, existing RDBMSes cannot represent and answer the above query.
  • Semantic Integration and Analysis of Clinical Data-1024
    Semantic integration and analysis of clinical data Hong Sun, Kristof Depraetere, Jos De Roo, Boris De Vloed, Giovanni Mels, Dirk Colaert Advanced Clinical Applications Research Group, Agfa HealthCare, Gent, Belgium {hong.sun, kristof.depraetere, jos.deroo, boris.devloed, giovanni.mels, dirk.colaert }@agfa.com Abstract. There is a growing need to semantically process and integrate clinical data from different sources for Clinical Data Management and Clinical Decision Support in the healthcare IT industry. In the clinical practice domain, the semantic gap between clinical information systems and domain ontologies is quite often difficult to bridge in one step. In this paper, we report our experience in using a two-step formalization approach to formalize clinical data, i.e. from database schemas to local formalisms and from local formalisms to domain (unifying) formalisms. We use N3 rules to explicitly and formally state the mapping from local ontologies to domain ontologies. The resulting data expressed in domain formalisms can be integrated and analyzed, though originating from very distinct sources. Practices of applying the two-step approach in the infectious disorders and cancer domains are introduced. Keywords: Semantic interoperability, N3 rules, SPARQL, formal clinical data, data integration, data analysis, clinical information system. 1 Introduction A decade ago formal semantics, the study of logic within a linguistic framework, found a new means of expression, i.e. the World Wide Web and with it the Semantic Web [1]. The Semantic Web provides additional capabilities that enable information sharing between different resources which are semantically represented. 
    It consists of a set of standards and technologies that include a simple data model for representing information (RDF) [2], a query language for RDF (SPARQL) [3], a schema language describing RDF vocabularies (RDFS) [4], a few syntaxes to represent RDF (RDF/XML [20], N3 [19]) and a language for describing and sharing ontologies (OWL) [5].
  • A Case Study of Data Integration for Aquatic Resources Using Semantic Web Technologies
    A Case Study of Data Integration for Aquatic Resources Using Semantic Web Technologies By Janice Gordon, Nina Chkhenkeli, David Govoni, Frances Lightsom, Andrea Ostroff, Peter Schweitzer, Phethala Thongsavanh, Dalia Varanka, and Stephan Zednik Open-File Report 2015–1004 U.S. Department of the Interior U.S. Geological Survey U.S. Department of the Interior SALLY JEWELL, Secretary U.S. Geological Survey Suzette M. Kimball, Acting Director U.S. Geological Survey, Reston, Virginia: 2015 For more information on the USGS (the Federal source for science about the Earth, its natural and living resources, natural hazards, and the environment), visit http://www.usgs.gov or call 1–888–ASK–USGS For an overview of USGS information products, including maps, imagery, and publications, visit http://www.usgs.gov/pubprod To order this and other USGS information products, visit http://store.usgs.gov Suggested citation: Gordon, Janice, Chkhenkeli, Nina, Govoni, David, Lightsom, Frances, Ostroff, Andrea, Schweitzer, Peter, Thongsavanh, Phethala, Varanka, Dalia, and Zednik, Stephan, 2015, A case study of data integration for aquatic resources using semantic web technologies: U.S. Geological Survey Open-File Report 2015–1004, 55 p., http://dx.doi.org/10.3133/ofr20151004. ISSN 2331-1258 (online) Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. Although this information product, for the most part, is in the public domain, it also may contain copyrighted materials as noted in the text. Permission to reproduce copyrighted items must be secured from the copyright owner.
  • Semantic Integration of Heterogeneous NASA Mission Data Sources
    Semantic Integration of Heterogeneous NASA Mission Data Sources Richard M. Keller1, Daniel C. Berrios2, Shawn R. Wolfe1, David R. Hall3, Ian B. Sturken3 1National Aeronautics and Space Administration 2University of California, Santa Cruz 3QSS Group, Inc. Intelligent Systems Division, NASA Ames Research Center Mail Stop 269-2, Moffett Field, CA 94035-1000 {keller, berrios, shawn, dhall, sturken}@email.arc.nasa.gov One of the most important and challenging knowledge management problems faced by NASA is the integration of heterogeneous information sources. NASA mission and project personnel access information stored in many different formats from a wide variety of diverse information sources, including databases, web servers, document repositories, ftp servers, special-purpose application servers, and Web service applications. For example, diagnosing a problem with the International Space Station (ISS) communications systems might require a flight controller to access multiple pieces of
    solution is to build custom software that integrates a fixed set of data sources. Generally, the software must be reworked whenever an existing data source is modified or a new information source needs to be added. The brittleness of this approach is what makes information integration systems costly to build and maintain. To address these problems, we have designed and implemented a generalized data mediation architecture called SemanticIntegrator. In contrast with specialized integration solutions, this architecture is more easily reused for different domains
  • Ontological View-Driven Semantic Integration in Open Environments
    Western University Scholarship@Western Electronic Thesis and Dissertation Repository 9-20-2010 12:00 AM Ontological View-driven Semantic Integration in Open Environments Yunjiao Xue The University of Western Ontario Supervisor Hamada H. Ghenniwa The University of Western Ontario Joint Supervisor Weiming Shen The University of Western Ontario Graduate Program in Electrical and Computer Engineering A thesis submitted in partial fulfillment of the equirr ements for the degree in Doctor of Philosophy © Yunjiao Xue 2010 Follow this and additional works at: https://ir.lib.uwo.ca/etd Part of the Computer Engineering Commons Recommended Citation Xue, Yunjiao, "Ontological View-driven Semantic Integration in Open Environments" (2010). Electronic Thesis and Dissertation Repository. 16. https://ir.lib.uwo.ca/etd/16 This Dissertation/Thesis is brought to you for free and open access by Scholarship@Western. It has been accepted for inclusion in Electronic Thesis and Dissertation Repository by an authorized administrator of Scholarship@Western. For more information, please contact [email protected]. Ontological View-driven Semantic Integration in Open Environments (Spine title: Ontological View-driven Semantic Integration in Open Environments) (Thesis format: Monograph) by Yunjiao Xue Graduate Program in Engineering Science Department of Electrical and Computer Engineering A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy The School of Graduate and Postdoctoral Studies The University of Western Ontario London, Ontario, Canada © Yunjiao Xue 2010 THE UNIVERSITY OF WESTERN ONTARIO SCHOOL OF GRADUATE AND POSTDOCTORAL STUDIES CERTIFICATE OF EXAMINATION Supervisor Examiners ______________________________ ______________________________ Dr. Hamada H. Ghenniwa Dr. Yong Zeng ______________________________ Co-Supervisor Dr. Robert E.
  • Designing a Semantic Repository Integrating Architectures for Reuse and Integration
    Designing a Semantic Repository: Integrating architectures for reuse and integration. Cory Casanave, Cory-c (at) modeldriven.org, ModelDriven.org, May 2007. Overview The Semantic Metadata infrastructure will provide a “smart” repository for architectures at multiple levels of abstraction, from multiple sources and with multiple views. This infrastructure will integrate the OMG Meta Object Facility (MOF) and the Resource Description Framework (RDF) and RDF Schema as defined by W3C as part of the “Semantic Web” initiative, and will integrate the specification and provisioning concepts of the OMG Model Driven Architecture (MDA). The semantic repository is an open source project at modeldriven.org. Problem Statement Information about government processes, information and I.T. systems currently exists in a diverse set of forms, formats, repositories and documents with little to no management and coordination. The result is rampant redundancy in developing and re-developing the same information, with different models, architectures and studies about the same things. Understanding and integrating this wide variety of information is almost impossible, and thus such understanding and integration is generally not achieved. Not only is this information expensive and time-consuming to develop, re-analyzing the same area results in inconsistent designs, lack of interoperability, redundant systems and missed opportunities for improvement. A Semantic Repository will replace this fragmentation and loss of information with an architected approach to developing,
  • Approach for Information Systems Semantic Interoperability in Supply Chain Environment Yan Lu
    Approach for Information Systems Semantic Interoperability in Supply Chain Environment. Yan Lu. To cite this version: Yan Lu. Approach for Information Systems Semantic Interoperability in Supply Chain Environment. Computer Aided Engineering. Zhejiang University, 2012. English. tel-00759808. HAL Id: tel-00759808 https://tel.archives-ouvertes.fr/tel-00759808 Submitted on 2 Dec 2012. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. Classification number: TP14. Institution code: 10335. Confidentiality level: public. Student ID: 10508042. Doctoral dissertation. Chinese title: 供应链环境下的信息系统语义互操作方法. English title: Approach for Information Systems Semantic Interoperability in Supply Chain Environment. Applicant: 吕艳 (Yan Lu). Supervisor: Prof. 顾新建. Co-supervisor: Hervé Panetto. Major: Mechanical Manufacturing and Automation. Research area: Manufacturing Informatization. School: Department of Mechanical Engineering. Submission date: 2012-04-17. Abstract Multiple enterprises are usually involved in supply chain network, while almost each enterprise has its own information systems and databases (enterprise legacy systems). There are large quantities of information and data intersecting or even repeating with each other, which calls for system integration. Meanwhile in the cooperative process of supply chain,