Managing semantic metadata in public private information chains: A reference architecture for alignment of semantics, technology and stakeholders

Master thesis MSc Systems Engineering, Policy Analysis & Management (SEPAM)

V. den Bak Student #: 1268260

Delft University of Technology
Faculty of Technology, Policy & Management
Thauris B.V.

Graduation Section Information & Communications Technology

Graduation committee
Prof.dr. Y. Tan (chair), Information & Communications Technology
Dr.ir. M.F.W.H.A. Janssen (co-chair), Information & Communications Technology
Ir. N. Bharosa (first supervisor), Information & Communications Technology
Drs. H.G. van der Voort (second supervisor), Policy, Organization, Law & Gaming
R. van Wijk MSc (external supervisor), Thauris B.V.
Drs. S. Kockelkoren (external supervisor), Thauris B.V.


Managing semantic metadata in public private information chains: A reference architecture for alignment of semantics, technology and stakeholders

Master thesis MSc Systems Engineering, Policy Analysis & Management (SEPAM)

Author: Victor den Bak
Student #: 1268260

Institution: Delft University of Technology
Faculty of Technology, Policy & Management
Jaffalaan 5
2628 BK Delft
The Netherlands

In cooperation with: Thauris B.V.

Program: MSc Systems Engineering, Policy Analysis & Management
Section: Information & Communications Technology
Course: SEPAM Graduation Project (SPM5910)
Submitted: 24-10-2011
Graduation: 7-11-2011

Graduation committee
Prof.dr. Y. Tan (chair), ICT
Dr.ir. M.F.W.H.A. (Marijn) Janssen (co-chair), ICT
Ir. N. (Nitesh) Bharosa (first supervisor), ICT
Drs. H.G. (Haiko) van der Voort (second supervisor), POLG
R. (Remco) van Wijk MSc (external supervisor), Thauris B.V.
Drs. S. (Stephan) Kockelkoren (external supervisor), Thauris B.V.


Acknowledgements

Before presenting this research, I would like to express my gratitude to those without whom this research project could not have been completed successfully.

First, I would like to thank the members of my graduation committee for their support: Marijn Janssen, Nitesh Bharosa, Haiko van der Voort, Remco van Wijk and Stephan Kockelkoren. They have always been available to provide useful advice and valuable comments on my work.

Also, I am grateful to my colleagues at Thauris. They provided a friendly and motivating atmosphere that ensured steady progress amongst the many incentives for distraction. In particular Joris Hulstijn, who helped to structure this difficult problem on several occasions.

Finally, I would like to thank the 18 interviewees from TU Delft, Bureau Jeugdzorg and Belastingdienst. These people made time available in their busy schedules and have provided me with valuable insights. Their hands-on and subject-matter expertise added enormous value on top of the scientific literature available.


Summary

The use of a common set of semantic metadata is seen as one of the most promising developments in information exchange among public and private parties. Semantic metadata is data that provides context to core data and helps to convey the actual meaning and perspective of the information that is shared among people, systems and organizations. All information sharing activities are aimed at one objective: having the right information available to the end user, with as little loss, time delay and clutter as possible. Using a common set of semantics in electronic information exchange is believed to further reduce the costs and time of information exchange, increase information quality and remove many of the unforeseen side effects and complexities of interconnecting stand-alone information systems.

Semantic metadata management is required in order to use semantic metadata effectively in a PPIC. A common vocabulary is of little use if it does not match organizational requirements. The main difficulty in semantic metadata management is that it touches on many elements of the organizational architecture. Semantic metadata management is primarily an alignment effort and partially a standardization effort. It includes the alignment of processes, technology and data models, both within and beyond organizational boundaries. However, many existing approaches to semantic metadata management are ad hoc rather than coordinated and premeditated. There are many theories and studies on individual topics related to metadata management, but a documented approach that puts all elements within the given scope in perspective is non-existent.

This master thesis project was aimed at aiding those tasked with implementing a coordinated form of semantic metadata management within the domain of Public Private Information Chains (PPIC). The problem was approached from an enterprise architecture point of view. This means a broad, holistic view was applied. A PPIC is a digital information chain consisting of both public and private parties that is centered around a certain information process with a high rate of repetition and mutual responsibilities.

This research project started out with a literature review and expert interviews. Best practices were extracted and tested in an in-depth case study with two complementary cases in Dutch government organizations. The main research question has been answered by developing a reference architecture for semantic metadata management in a PPIC. A reference architecture is a generic blueprint that provides a holistic approach for a specific architecture archetype. It puts all elements required for semantic metadata management into perspective, making it easier to structure the many pieces of the puzzle. The reference architecture uses a format that on the one hand provides enough rigor to ensure interoperability, while on the other hand provides enough leeway to fit organizations with different characteristics or specific requirements. This mixture of rigor and leeway has been achieved by using both prescriptive design principles and tradeoffs that extend the design space.

The reference architecture is centered around mitigating the main challenge in this domain and reinforcing one of its main potentials: the reduction of complexity. Much of the complexity regarding information exchange in PPICs is artificial, not inherently present. Challenges have arisen from creating connections between systems, processes and organizations that were never designed from the outset to be interconnected in such a way. The semantic metadata management approach that is

introduced in this research has two pillars. First, a conceptual model is introduced to act as a single point of reference between all components, reducing the number of existing relations. Second, the relations between all components in the organizational architecture are actively managed. This proactive approach reduces incidents and improves information quality.

The solution presented in this thesis is generic. The design principles and tradeoffs apply in a similar way to both private and public organizations. Moreover, the solution applies to organizations with different maturity levels in technology, data management and processes, and with varying levels of ambition on this topic. In an information chain the diversity of stakeholders and their interests is a given. A certain degree of commitment and effort can be expected from the stakeholders in the chain, but semantic metadata management should not interfere with their private processes or impose an additional burden. The evaluated reference architecture presented in this study deals with this problem. Even though the solution is primarily aimed at providing benefits in inter-organizational information exchange, it is useful for internal use within individual organizations as well.


Table of contents

1 Introduction ------17
1.1 Problem statement ------17
1.2 Research domain ------17
1.3 Research goals ------19
1.4 Scope ------19
2 Methodology ------21
2.1 Research questions ------21
2.2 Methodology ------23
2.3 Case study approach ------27
2.4 Reference architecture ------29
3 Public Private Information Chains ------33
3.1 Public policy and bureaucratic processes ------33
3.2 Characteristics of Public Private Information Chains ------34
3.3 Public Private Information Chains in practice ------36
4 Semantic metadata: potential and challenges ------37
4.1 Potential benefits of standardizing semantic metadata ------37
4.2 Challenges regarding semantic metadata management ------41
5 Aspects of semantic metadata management in literature ------47
5.1 Semantic metadata management: Data perspective ------47
5.2 Semantic metadata management: Technology perspective ------54
5.3 Semantic metadata management: Process perspective ------61
6 Preliminary architecture ------69
6.1 Reference architecture design process ------69
6.2 Design propositions ------71
6.3 From prescriptive to evaluated reference architecture ------72
7 Case study 1: Child protective services ------73
7.1 Background ------74
7.2 Metadata management – Technology ------77
7.3 Metadata management – Data ------80
7.4 Metadata management – Processes ------83
7.5 Conclusion ------86
8 Case study 2: Tax office ------87


8.1 Background ------88
8.2 Metadata management – Technology ------91
8.3 Metadata management – Data ------93
8.4 Metadata management – Processes ------96
8.5 Conclusion of the tax office case ------100
9 Evaluated architecture ------101
9.1 Preconditions ------101
9.2 Design principles ------102
9.3 Reference architecture overview ------107
9.4 Tradeoffs ------110
9.5 Expert session ------118
10 Conclusion ------121
10.1 Conclusions ------121
10.2 Reflection on evaluated reference architecture ------126
10.3 Recommendations ------129
10.4 Scientific contribution ------131
10.5 Personal reflection ------134
11 References ------135
12 Appendix ------139


List of figures

Figure 1: Information System Research Framework adapted from Hevner (2003). The original generic content of the environment, research and knowledge base has been replaced by content applicable to this study...... 24
Figure 2: Reference architecture design model by Muller (2011)...... 25
Figure 3: Overview of the research process. Created by author...... 25
Figure 4: An indication of where semantics combined with automation can provide gains. Reducing the time spent at tasks with very little added value. Adopted from Hoffman et al...... 40
Figure 5: TBM Enterprise Architecture meta-framework. By Janssen (2009)...... 42
Figure 6: Graphic representation of an information chain, showing that each organization within the chain has an enterprise architecture. Alignment of semantics takes place both within and amongst organizations. Created by author...... 42
Figure 7: Metadata example. Created by author...... 48
Figure 8: A setup in which metadata is loosely coupled to the core data. From Brandt et al (2003)...... 49
Figure 9: The Information Maturity Model by McClowry (2008). Image is released in the public domain...... 50
Figure 10: Metadata levels in the business semantics management model. Derived from De Leenheer (2010)...... 53
Figure 11: Overview of optimal application of internal and external metadata. Created by author...... 57
Figure 12: External metadata storage in a large enterprise data storage system, showing where a metadata management tool enters the picture. Created by author...... 59
Figure 13: Business semantics management cycles. From De Leenheer (2010)...... 64
Figure 14: Flows of information to and from BJz. Created by author...... 75
Figure 15: Information chain within BJz, showing both communications within a team and as a team with others. Created by author...... 76
Figure 16: Overview of data streams in the primary process regarding sales tax. Created by author...... 89
Figure 17: Steps taken in filing a tax report through SBR. Created by author...... 90
Figure 18: Top down specification of metadata for creating the partial taxonomy (top) and bottom up reuse of existing metadata (bottom). Created by author...... 97
Figure 19: Overview of reference architecture for metadata management in a PPIC. The numbers correspond to the numbers of the design principles. Created by author...... 108
Figure 21: Cooperation models within the stakeholder constellation. Created by author...... 110
Figure 22: Governance archetypes. Created by author...... 111
Figure 23: Representation of potential positioning of moment of consultation within change management cycle. Created by author...... 114


Figure 23: A single definition provided with an example and characteristics. Ownership and status are shown on the right, with other options in the menu below. From Collibra...... 148
Figure 24: Overview of relations defined between several semantics. From Collibra...... 149
Figure 25: A relation between semantics being defined in a menu. From Collibra...... 149
Figure 26: A simple business rule added to a definition. From Collibra...... 150
Figure 27: A taxonomy created from a number of semantics. Combining both categories and relations. From Collibra...... 150
Figure 28: Information products and relations present in a single generic two year OTS case. Note that there may be multiple instances of each product, with the average case file having about 700 pages. Created by author...... 154
Figure 29: Overview that shows links among information products for any form of reuse of information. In a regular case there are multiple instances of most document types. Created by author...... 155
Figure 30: The reuse of information among information products relating to the planning, showing what information is reused in what way. Created by author...... 156
Figure 31: The reuse of information among information products relating to the evaluation, showing what information is reused in what way. Created by author...... 156
Figure 32: The reuse of information among information products related to the follow up planning, showing what information is reused in what way. Created by author...... 157


List of tables

Table 1: Features of complementary cases. Created by author...... 28
Table 2: Reference architecture quality indicators. Created by author...... 31
Table 3: TOGAF quality indicators relating to design principles. TOGAF (2007)...... 32
Table 4: Overview of XML based semantic metadata standards. Created by author...... 55
Table 5: Overview of theories on standardization. Created by author...... 61
Table 6: Roles identified in web service orchestration by Janssen, Gortmaker & Wagenaar (2006)...... 65
Table 7: Overview of the 14 steps of reference architecture design. The blue steps relate to the preliminary phase and the red steps to the evaluation phase. Created by author...... 70
Table 8: Overview of roles within BJz. Created by author...... 85
Table 9: Overview of roles within the tax office/SBR case. Created by author...... 99
Table 10: Remarks on the positioning of the design principles in the reference architecture overview figure. Created by author...... 109
Table 11: Roles in semantic metadata management. Adapted from Janssen, Gortmaker & Wagenaar (2006), created by author...... 113
Table 12: Overview of ranking (blue) and chronology (white) by experts. Created by author...... 119
Table 13: Test on quality indicators. Created by author...... 128
Table 14: Test on quality indicators for design principles. Created by author...... 128
Table 15: List of reference architecture design steps. Created by author...... 144
Table 16: Overview of reference architecture structures. Created by author...... 145
Table 17: Roles identified in web service orchestration by Janssen, Gortmaker & Wagenaar (2006)...... 152
Table 18: Roles in semantic metadata management. Adapted from Janssen, Gortmaker & Wagenaar (2006), created by author...... 153


1 Introduction

This chapter introduces the motivation behind this master thesis. The problem statement and the main research question are presented in the first section. This is followed by an introduction to the research domain and the research goals. The final section provides the scope of this research project.

1.1 Problem statement

The use of a common set of semantic metadata is seen as one of the most promising developments in information exchange among public and private parties (ICTU, 2006; Morgan, 2005; Sbodio, Moulin, Benamou, & Barth, 2010; WRR, 2011). Semantic metadata is data that provides context to core data. It helps to convey the actual meaning and perspective of the information that is shared among people, systems and organizations. A common vocabulary in semantics allows for reduced transmission costs, faster retrieval and processing of information, improved information quality and new types of automation (Ghosh, 2010). Successful use of a common set of semantic metadata requires a form of management (Houtevels, 2010). Without management of semantic metadata the potential advantages cannot be realized or are significantly inhibited. However, at this moment there is a lack of knowledge on how semantic metadata should be managed within the complex setting of public private information chains.
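To make the distinction between core data and semantic metadata concrete, it can be sketched in a few lines of code. This is an illustrative example only; all field names, values and vocabulary references below are invented and do not come from the thesis or its cases.

```python
# Core data as it might be exchanged in an information chain (values are made up).
core_data = {"amount": 1200, "period": "2011-Q3"}

# Semantic metadata: context that conveys the meaning of each core field.
# The "source_concept" entries point to a hypothetical shared vocabulary.
semantic_metadata = {
    "amount": {
        "definition": "Total sales tax due for the reporting period",
        "unit": "EUR",
        "source_concept": "taxonomy:SalesTaxDue",
    },
    "period": {
        "definition": "Reporting period in ISO 8601 quarter notation",
        "source_concept": "taxonomy:ReportingPeriod",
    },
}

def describe(field):
    """Combine a core value with its semantic context for an end user."""
    meta = semantic_metadata[field]
    return f"{field}={core_data[field]} ({meta['definition']})"

print(describe("amount"))  # amount=1200 (Total sales tax due for the reporting period)
```

Without the metadata, a receiving party would have to guess what "amount" means, in what unit it is expressed and to which period it applies; the semantic layer makes that context explicit.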

Semantic metadata requires management since semantics are dynamic and need to be aligned over the various links in the information chain. Semantics change over time, especially in a setting of multiple stakeholders, changing processes and technological advances (De Leenheer, 2009). Semantics have a lifecycle, since they need to be specified, verified, updated and removed (Hepp, De Leenheer, De Moor, & Sure, 2008). A common set of semantics has little value if it does not match the information it ought to describe, contains faults or contains many irrelevant items. Organizing semantic metadata management in Public Private Information Chains currently is a black box in both science and practice. Semantic metadata management approaches are ad hoc rather than coordinated and planned. There are many theories and studies on individual topics related to metadata management, but a documented approach that puts all elements within the given scope in perspective is non-existent.

This research shows that semantic metadata management is an alignment effort of processes, data models, technology and stakeholders. More specifically it provides an answer to the following main research question: What design principles are required and what tradeoffs still have to be made in a reference architecture for semantic metadata management in public bodies that operate in a Public Private Information Chain? As such it provides a generic perspective on how semantic metadata management should be carried out in Public Private Information Chains. This perspective should make implementing semantic metadata management in a specific context easier, allowing more of the potential benefits to materialize sooner and at lower costs.

1.2 Research domain

Private parties, consisting of both individuals and organizations such as companies, need to exchange information with the government. The government is not a single entity but consists of

numerous public bodies, each with its own services portfolio. The nature of this information exchange is diverse. It may relate to permits, regulation, taxes, statistics, subsidies and even primary processes. In order to save time and money, many processes carried out by humans using a paper based administrative system have been replaced by IT-systems in both private parties and public bodies. Additionally, with the rise of the internet many of those IT-systems have been interconnected to reduce the transaction costs of exchanging information. The combination of these two trends has led to a phenomenon called Public Private Information Chains (PPIC). These are digital information chains centered around a certain type of information, with a high rate of repetition and mutual responsibilities and dependencies amongst both public and private parties. PPICs have shown great benefits, reducing the costs and time of bureaucratic processes significantly. However, even with those benefits there still is a desire among politicians and private parties alike to reduce bureaucracy. The reduction of administrative burden has been one of the focal points of the Balkenende III and IV coalition agreements. The processes are vital to the executive branch of the government and the execution of policy, but their associated costs are a burden on society (WRR, 2011).

There is room to improve the performance of existing PPICs, since many inefficiencies remain. The root cause is that most information systems in use predate the genesis of contemporary PPICs and have not been designed to be interoperable in their current role (Hepp, et al., 2008). Note that in this context information systems are mentioned, not IT-systems. An information system is defined as the whole complex of people, IT-systems, data and processes. This means that the interoperability challenge exceeds the field of technology. Even when IT-systems have been made interoperable through web services, the organizations within a PPIC are still heterogeneous (Sun & Yen, 2005). Information is used for different purposes by experts of different backgrounds. This poses different quality needs, and even with standardized formats interpretations of exchanged information may still vary.

The heterogeneity of stakeholders in a PPIC means that the challenge of information exchange is wider than technology alone. The transaction costs incurred in information exchange do not only include transmission; the majority consists of translation costs (Delone & McLean, 1992). Translation is the effort of interpreting and reformatting information for use in another information system or for another audience. An alternative approach is to get the information from the source, avoiding translation in all other links in the chain. Often this is not an option within a PPIC: many intermediary information products are exchanged, while each link in the information chain adds a unique value, such as expertise or efficiency.

The solution for increased efficiency and improved information quality within the PPIC as a whole lies in using a common set of semantics within the chain. Semantics that describe information leave room for diversity in information itself, while lowering translation costs. The effort of aligning the various IT-systems, data models and stakeholders to use a common set of semantic metadata is called semantic metadata management.
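The reduction in translation effort can be sketched in a few lines of code: with a common vocabulary, each party maintains a single mapping from its local terms to the shared semantics, rather than a pairwise translation for every other party in the chain (n mappings instead of n·(n-1) translations). All party names, field names and values below are hypothetical, invented purely for illustration.

```python
# A hypothetical shared vocabulary agreed upon within the chain.
SHARED_VOCABULARY = {"tax_due", "reporting_period"}

# Each party maps only its own local terms to the shared vocabulary.
LOCAL_TO_SHARED = {
    "company_a": {"btw_bedrag": "tax_due", "tijdvak": "reporting_period"},
    "tax_office": {"verschuldigde_belasting": "tax_due", "periode": "reporting_period"},
}

def to_shared(party, record):
    """Translate a party's local record into the shared semantic terms."""
    mapping = LOCAL_TO_SHARED[party]
    shared = {mapping[key]: value for key, value in record.items()}
    # Every translated term must exist in the common vocabulary.
    assert set(shared) <= SHARED_VOCABULARY
    return shared

def translate(sender, receiver, record):
    """Exchange via the shared vocabulary: local -> shared -> local."""
    shared = to_shared(sender, record)
    reverse = {v: k for k, v in LOCAL_TO_SHARED[receiver].items()}
    return {reverse[key]: value for key, value in shared.items()}

msg = translate("company_a", "tax_office", {"btw_bedrag": 1200, "tijdvak": "2011-Q3"})
print(msg)  # {'verschuldigde_belasting': 1200, 'periode': '2011-Q3'}
```

The design point is that neither party needs to know the other's local data model; both only commit to the common set of semantics, which is exactly what semantic metadata management maintains.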


1.3 Research goals

This master thesis project was aimed at aiding those confronted with organizing semantic metadata management in a PPIC. The problem was approached from an enterprise architecture point of view. This means a broad, holistic view was applied. The goal was to provide a generic approach for semantic metadata management that can be used as a starting point for a specific implementation. The chosen format for the generic approach is a reference architecture.

This research project started out with a literature review and expert interviews. Best practices were extracted and tested in an in-depth case study with two complementary cases. The end result is a reference architecture that consists of (prescriptive) design principles and tradeoffs. Following from this methodology, the scientific value of this research is threefold:

1. This research provides an overview of various topics related to semantic metadata management in scientific literature. Additionally, the case study provides accurate descriptions of two real life examples of metadata management in a PPIC.
2. The cohesion among these topics and their implications are shown. The holistic view provides context to each topic, shows what tradeoffs apply and under which conditions each topic is applicable.
3. The format of the reference architecture is unique in that it is both prescriptive and allows for leeway through the inclusion of tradeoffs. The multi-stakeholder context required this leeway, but such a context is not unique to semantic metadata management. Stakeholder complexity is found in many other domains, and this type of reference architecture design could be applicable to other research areas as well.

1.4 Scope

The scope of this research project is limited to Public Private Information Chains in the Netherlands. The reason behind this delineation is that cross organizational bureaucratic processes in government are very visible and well documented. Semantic metadata management is also a challenge in the private sector, but examples there are less visible and statements on generality are harder to support. The Netherlands was chosen as the geographic delineation for practical reasons: for a Dutch student it is the most practical scope for interviews and site visits regarding the case studies.

The government institutions within the scope operate within a Public Private Information Chain (PPIC), a concept described in detail in chapter 3.1. As such they interact with many other public and private parties. This adds a multi-actor complexity on top of the technical challenges and integration within the primary processes as presented in the problem statement. This multi-actor complexity follows from the need to achieve a certain level of inter-organizational alignment in a domain that is immature at this moment. The public sector is further characterized by:

- Operating within a context of organizational stimuli and values that differ significantly from organizations operating on a competitive market. This includes a non-profit driven attitude, a low tolerance for faults and a duty to handle all cases regardless of respective effort.
- The use of very private personal or organizational information that is entrusted with the confidence and duty that it remains private.


- Strict regulation by laws regarding procedures, non-ambiguity, diligence, due dates, quality assurance and compliance.

Aside from the context, the subject of this research is delineated to semantic metadata management, not the everyday use of semantics. Only the relevant aspects of technology, data models and organizational alignment will be discussed. The definition used in this research is that semantic metadata management is the whole set of procedures and tools regarding the administration, application, alignment and governance of semantic metadata. Regarding semantic metadata management itself, the scope is limited to external semantic metadata that is used beyond organizational boundaries. This has the following consequences:

- Administrative and structural metadata are seen as context. The focus is on semantic metadata. Structural and administrative metadata are well researched and are used for very different purposes than semantic metadata. They are for example not visible or of any direct use to the end user of the information they relate to, unlike semantic metadata, which is specifically added for use by the end user of the information.
- Internal and non-standardized metadata are not included as they fall primarily within the domain of web 2.0 and crowd based tagging (De Leenheer, 2009), a domain that is seen as less than optimal for business and government use (Hepp, et al., 2008). The external metadata is stored and managed separately from the content, mostly in the form of a common glossary in a taxonomy or ontology.
- Only semantic metadata that is used in more than one link in the information chain is included. Very specific metadata that is only used internally or is otherwise irrelevant to the functioning of the information chain is out of scope.
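As an illustration of the idea of a common glossary stored separately from the content, external semantic metadata organized as a simple taxonomy might look as follows. The structure and all terms are assumptions made for the sake of the example, not the actual model used in the cases.

```python
# A hypothetical common glossary, organized as a taxonomy: each term points to
# its broader term (None for the root) and carries a shared definition.
TAXONOMY = {
    "FinancialData": (None, "Any financial concept exchanged in the chain"),
    "TaxDue": ("FinancialData", "Amount of tax owed for a period"),
    "SalesTaxDue": ("TaxDue", "Tax owed on sales in a period"),
}

def ancestors(term):
    """Walk up the taxonomy from a term to the root, broadest term last."""
    chain = []
    broader = TAXONOMY[term][0]
    while broader is not None:
        chain.append(broader)
        broader = TAXONOMY[broader][0]
    return chain

print(ancestors("SalesTaxDue"))  # ['TaxDue', 'FinancialData']
```

Because the glossary lives outside any single IT-system, every link in the chain can reference the same terms and their broader/narrower relations without embedding the definitions in its own content.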


2 Methodology

This chapter describes how this research project has been set up and carried out. First, the research questions are presented in section 2.1. The methodology used in this research project consists of a number of instruments; the rationale behind the choice of instruments can be found in section 2.2. This is followed by an overview of the case study approach in section 2.3. The main outcome of this research is a reference architecture for semantic metadata management in public private information chains. Section 2.4 describes the rationale behind the type of reference architecture that has been developed in this research.

2.1 Research questions

In this research a reference architecture for semantic metadata management in a public private information chain is developed and evaluated. To do so, research must be performed to accumulate the knowledge required to answer the main research question and to arrive at a good design. The information needs are encompassed by five research questions. Together, the answers to the research questions answer the main research question, which is presented first. All research questions are presented in their context. The relevant research instruments are described in the following three sections of this chapter.

Main research question

At this moment there is a lack of knowledge on how semantic metadata should be managed within the complex setting of public private information chains. Organizing semantic metadata management currently is a black box in both science and practice. There are many theories and studies on individual topics related to metadata management, but an approach that puts all elements in perspective is non-existent. Since metadata is credited with many potential benefits (Hepp, et al., 2008) and requires management to maximize its utility (De Leenheer, 2009), knowing how metadata management should be organized is relevant both for the scientific knowledge base and for project managers within public and private organizations that are confronted with this issue. This leads to the following main research question:

What design principles are required and what tradeoffs still have to be made in a reference architecture for semantic metadata management in public bodies that operate in a public private information chain?

2.1.1 The relevance of metadata

Question 1: Why is metadata mentioned in a wide range of solutions to an even wider range of challenges in large cross organizational IT-systems?

The answer to this question indicates the relevance of the research domain. Semantic metadata is mentioned in a wide range of solutions to an even wider range of challenges relating to information exchange across organizational boundaries. An overview of the potential gains and motives for implementation of semantic metadata in a PPIC has been compiled.

2.1.2 Challenges regarding semantic metadata management

Question 2: What makes implementing semantic metadata management within networks of organizations so difficult?


Employing a common set of semantic metadata for information exchange between organizations is technically feasible. Various taxonomies and ontologies have been developed and remain in use, albeit in less complex situations. The challenges regarding semantic metadata management exceed the technology domain.

2.1.3 Developing a reference architecture

Question 3: Which technological and organizational aspects should be incorporated in the reference architecture according to literature?

The use of semantic metadata impacts various elements of the enterprise architecture. Examples are technology, primary processes, data models, governance processes and stakeholder relations. Before implementing semantic metadata management, or developing a reference architecture, the scope must be well defined. The reference architecture for semantic metadata management should encompass all relevant topics in order to manage metadata effectively.

2.1.4 Real life application of metadata management

Question 4: What architectures do we find in practice for metadata management within the Dutch government?

The literature study has indicated that there are many individual theories on the use of semantic metadata. In practice, theories derived from literature could conflict, prove inapplicable or even prove false. Semantic metadata management could be subject to changes and challenges that are not yet covered by science. For that reason the theories previously found in literature have been validated against real life examples and expert opinions.

2.1.5 Validation and refinement
Question 5: What design principles can be derived from the application of the preliminary architecture on the cases?
The reference architecture is developed at the enterprise architecture level. Given this high level of abstraction the architecture is not highly detailed, but is based around generic design principles. In this research, design principles constitute the broader structural aspects of the composition of elements. Design principles are the leading element in both the preliminary and the evaluated reference architecture. For relevant areas where the implementation must be case specific, tradeoffs have been formulated.


2.2 Methodology
This chapter covers several aspects of the methodology. First, design science is introduced as the leading methodology in this research project. Then the Information System Research Framework is presented as the specific design science approach, followed by a description of the research process and expert validation. The case study approach and the reference architecture format are presented in the final two sections of this chapter.

2.2.1 Research characteristics
Developing a generic approach for managing semantic metadata in a Public Private Information Chain is very ambitious. First, the research domain is immature, requiring observations and expert interviews to explore this domain and to complement existing literature. Second, a holistic view is required to weigh all topics within the very broad scope. Third, the desired reference architecture can be considered a socio-technical system, thus requiring knowledge from both the engineering and social science domains. These elements result in two profound research characteristics, described below.

This research is characterized by a research domain that is new and immature. Semantic metadata management has been on the agenda in the Netherlands only for the last few years. The limited knowledge available in this field makes this research project explorative. A so-called Alice-in-wonderland approach was employed: the true subject-matter scope could only be determined during the execution of the research project. For this reason an iterative research approach was chosen, as shown in section 2.2.4.

The second major characteristic of this research is the way the main research question is answered. The goal of this research was to determine how semantic metadata management in PPICs should be carried out. This answer is hard to capture in a single paragraph given the number of relevant topics and their interrelations. For that reason the answer has been given the format of a reference architecture. A reference architecture is a generic, non-specific approach that provides a holistic view on a certain architecture archetype (Angelov & Grefen, 2008; Robertson, 2001). The number of topics makes the span of the reference architecture rather broad, which, given the available amount of time, resulted in a low granularity.

2.2.2 Design science
The leading methodology in this research is design science. This methodology was chosen since designs resulting from this approach are claimed to have both contextual and scientific merit: a solution for a problem is devised while at the same time the knowledge base is extended. According to Hevner, systems design can be a highly valued contribution to science if two requirements are met (Hevner, March, Park, & Ram, 2003). The first is that the design effort includes an appropriate scientific foundation. The second is that its field of application or use of theories and methodology is a novelty and not common practice, in which case no knowledge would be added. Since the design will be a reference architecture the addition to the scientific knowledge base will be relatively high (Angelov & Grefen, 2008).


Within design science a range of research instruments can be applied (Gonzalez, 2007). The proposed research combines three research instruments: literature review, case study and expert interviews. Case studies can be seen as both an instrument and a methodology, since a case study comprises several instruments itself, including interviews and observation (Horan & Schooley, 2007). Gonzalez indicates that these three instruments can be combined and are used increasingly often in design science. As stated in the chapter on research goals, a reference architecture is thought to be the best method to structure and present the answer to the main research question.

2.2.3 Information System Research Framework
Hevner poses that information systems research is influenced by scientific theories and methods on the one hand and relevant elements such as people, organizations and technology on the other. This is shown in the Information System Research Framework. Figure 1 shows the framework and the stated interdependence between research, environment and knowledge base. Figure 1 has been adapted from the original generic model by Hevner to fit this particular research project. The case study design was also carried out according to this framework, linking relevant research methodology to related case elements. Chapter 10.4.1 contains a reflection on the applicability of the design science approach.

[Figure 1: three linked panels. The Environment (people, organizations, technology) supplies business needs (relevance); the Knowledge Base (foundations such as enterprise architecture, business semantics management, IT-governance, master data management, information quality and expert systems, plus methodology) supplies applicable knowledge (rigor); IS Research iterates between develop/build (design propositions, design principles, reference architecture) and justify/evaluate (analytical framework, case studies, expert session, field study, observation). Results are applied in the environment and added to the knowledge base.]

Figure 1: Information System Research Framework adapted from Hevner (2003). The original generic content of the environment, research and knowledge base has been replaced by content applicable to this study.

The artifact that is developed in this research project is a reference architecture. According to Muller (2011) the design of a reference architecture is based on mining useful architecture patterns from existing architectures and theories, enriched by exploring and analyzing customer and business needs. This synthesis is shown in Figure 2. The existing architectures and patterns can be considered the knowledge base Hevner refers to; the business needs and future requirements make up the environment Hevner refers to. Using Muller's reference architecture design model thus adheres to both the relevance and the rigor proposed by the design science research approach.

Figure 2: Reference architecture design model by Muller (2011).

2.2.4 Research process and validation
The overall research process is as follows. The research started with an in-depth analysis of the problem area and the chosen delineation. Throughout the entire process scientific literature was used and interviews were held for the case studies and expert opinions. The resulting insights served as requirements for the design of the preliminary architecture. Two in-depth cases were used to apply this preliminary architecture, which was iteratively improved. Finally, the preliminary architecture was validated using expert opinions and the evaluation criteria set at the beginning of the process. Based on the validation, conclusions were drawn and an improved and validated reference architecture is presented. The entire research process is depicted in Figure 3.

[Figure 3: five sequential phases (problem analysis, preliminary architecture, case study, validation, reference architecture), with evaluation criteria defined up front. Scientific literature and experts feed the problem analysis; design propositions and tradeoffs form the preliminary architecture; the youth care and tax office cases are applied in an iterative loop; validation uses an expert session and the evaluation criteria; the final reference architecture consists of design principles and tradeoffs.]

Figure 3: Overview of the research process. Created by author.


Answering the research questions
The main methodology used to answer the first three research questions is literature review, but additional insights from the case studies and experts have been processed as well. Chapter 3 gives a description of the research domain. Both the motives for using semantic metadata and the challenges of managing semantic metadata can be found in chapter 4. Chapter 5 describes which organizational aspects should be incorporated in the reference architecture according to literature. These propositions have resulted in a preliminary reference architecture, which is untested. Research question 4 introduces the cases used for testing the preliminary architecture, found in chapters 7 and 8. The answer to question 5 merges the existing theories (question 3) with the lessons from real life cases (question 4). From the tested design propositions, design principles have been derived, turning the preliminary reference architecture into the evaluated reference architecture shown in chapter 9.

Validation by experts
The validation of the evaluated reference architecture was carried out by a number of selected experts. The experts have been involved in the creation of certain key areas of the framework and in the validation of the framework as a whole. The expert opinions have been gathered by means of interviews, for which an interview protocol was created beforehand. Due to the background of most experts the interviews were held in Dutch and a summary in English is provided. In the validation phase an expert session was planned in which the experts were confronted with the reference architecture. The ability to interact and discuss findings makes the expert opinions more explicit (Verschuren & Doorewaard, 2003). The experts for the validation have been selected by means of the following two criteria:
 The experts must possess a level of authority in the area of expertise in which they are questioned.
 The expert panel will include experts with research experience and experts with practical experience.


2.3 Case study approach
In line with the Alice-in-wonderland approach, the case studies make up the main element of this research. Literature alone did not provide enough information on the coherence, interaction or tradeoffs among topics. Case studies allow for the in-depth approach required given the research goals (Baarda & De Goede, 2001). Observations of real world use of shared semantics allow for insight into the need for, challenges with and devised solutions for semantic metadata management. This chapter starts off with the case study selection criteria. The in-depth cases used in this research are the Dutch youth care sector and the tax office; both are presented in the second section. The third section shows how both cases complement each other.

2.3.1 Case study selection criteria
Both cases are selected on a number of common criteria. The vast majority of the public sector in the Netherlands conforms to the scope as described in chapter 1. The youth care sector and the tax office were selected as cases since they were available for the in-depth review required for an explorative case study. Both cases:
 concern government organizations that operate in a network of organizations with multiple stakeholders and mutual relations.
 are heavily dependent on information, which is used for both compliance and decision making, both of which impact the third party to whom the information pertains and the functioning of the social system.
 make use of sensitive information relating to third parties, both individual persons and organizations. For that reason these cases are subject to strong regulation and strict rules.

2.3.2 Cases used in the case study
 The first case used to evaluate the preliminary reference architecture is the Dutch youth care sector, specifically one of the 15 Bureaus Jeugdzorg (BJz) in the Netherlands. The youth care sector consists of a large network of both public and private organizations, all of which are very different in nature. Within this chain a lot of information is shared, mostly in the form of reports. BJz has a coordinating role in this flow of information. Many of the reports that are created are used for informing organizations or for compliance, both internally and towards regulators. Within the youth care sector there are several initiatives regarding a common methodology, but a true semantic metadata architecture embedded in IT-systems does not exist yet. Given the intricate mix of Bureau Jeugdzorg’s own requirements, alignment with partners in the information chain and various national initiatives, this case is very interesting. It also requires the semantic metadata management effort to be able to handle the constant changes and major modifications of new methodologies, information needs and regulation.

 The second case used to evaluate the preliminary reference architecture is the Dutch tax office (Belastingdienst). Within this large organization the focus lies on the semantic metadata management effort relating to the Standard Business Reporting program. The tax office is responsible for many elements in the Dutch Taxonomy, the most prominent standardized set of semantic metadata in the Netherlands. These elements have to be


managed before being implemented by Logius, the public body that coordinates IT projects of the national government. The IT-infrastructure has been adapted for the use of semantic metadata as applied in the Dutch Taxonomy. However, a management and governance structure does not exist and these tasks are carried out on an ad hoc basis. What makes this case interesting is that the semantic metadata is not only relevant to the actual end user who does the financial reporting. When legislation is created, its impact on the back office, in this case the tax office, must be clear and possible to estimate. The whole process runs from the creation of new legislation to implementation in the Dutch Taxonomy. The focus will be on the steps of the process within the tax office; the actual law making remains out of scope. This twofold use of semantic metadata poses additional demands on management and governance.

2.3.3 Case study features
These cases were selected on a number of common criteria, but are also complementary in order to cover the entire spectrum of e-government institutions within the delineation. The cases differ enough to ensure the reference architecture is generic and not fitted to a particular situation. Yet the cases have enough in common to compare and contrast the findings.

The following three elements make the cases complementary. First, the tax office operates at the national government level while youth care is a government task delegated to the provincial and municipal level. Second, regarding the creation and governance of metadata the tax office is a body that specifies the semantic metadata, albeit with input from others. In the youth care sector this situation is reversed: much of the metadata, such as definitions, is designed at higher levels of government or must conform to partners in the information chain. Third, the tax office receives highly structured data, mostly in the form of numbers, some of which are structured in tables or other listings. The youth care sector has many ill structured information products, with most of the data stored in plain text format. Semantic metadata applies to all types of data, but the format may require or enable a specific approach.

Table 1 shows how both cases relate to each other and which properties are complementary.

Property | Youth care case | Tax office case
Data type | Information products mainly consist of written text. | Information products mainly consist of figures and tables with limited space for text.
Data structure | Limited structured data with contents varying per case | Highly structured data with many near similar products
Flow of data | Reciprocal data streams varying in intensity per case | Well defined and scheduled one way data streams
Collaboration with partners | Cooperation | Hierarchy
Government level | Provincial and municipal level | National government level
Size of organization | Several hundred employees | About 33,000 employees
Table 1: Features of complementary cases. Created by author.


2.4 Reference architecture
The main deliverable of this research project is an evaluated reference architecture, since that is the format chosen for the answer to the main research question. This chapter starts off with a definition of what a reference architecture is and which capabilities are credited to a reference architecture. Subsequently, the design principles and tradeoffs are presented as the core elements of the reference architecture. This chapter finishes with a section on the scientific value of an ‘evaluated’ reference architecture and a section on quality indicators and evaluation criteria.

In short, the chosen format for the reference architecture has the following characteristics:
 Very broad scope that ranges from technology to processes, resulting in limited granularity.
 Preliminary version based on literature, evaluation based on two case studies and experts.
 Consisting of both prescriptive design principles and tradeoffs that leave room for leeway.
 Emergent behavior: can be implemented per organization to achieve system wide effects.

2.4.1 Scope & granularity
A reference architecture is a general, non-specific approach that provides a holistic view on a certain architecture archetype (Angelov & Grefen, 2008; Robertson, 2001). Reference architectures differ in scope and granularity. The scope determines the span, the number of topics that are incorporated. The granularity is the level of detail in which these topics are addressed. As such, reference architectures may range from very detailed, like many IEEE models (Bass, Clements, & Kazman, 2003), to high level architectures based around design principles, such as the NORA (NORA, 2010). Examples of elements that can be included are process flows, roles, responsibilities, rules, guidelines, standards, system designs and data models.

Many reference architectures focus on software design and the scope usually includes the software and underlying technical architecture (Kazman et al., 1998). In this thesis an information system is defined to consist of people, processes, data and resources. This makes the scope of this research rather broad, since processes regarding cooperation and organizational alignment are added. Therefore, given the constraints on time and effort, the granularity is limited.

2.4.2 Reference architecture capabilities
According to Muller (2011) a reference architecture may facilitate multi-organization system creation and life-cycle support, especially in areas of increased complexity, scope and system size, and of increased integration dynamics due to multiple organizations. The use of a reference architecture may…
… provide guidance, principles and best practices.
… manage synergy and individual gains.
… capture and share architectural visions.
… provide an architecture baseline and blueprint.
… provide a common lexicon among partners.
… emphasize explicit modeling of functions and qualities above the systems level.
… emphasize explicit decisions about compatibility, upgrade and interchangeability.


The combination of the listed capabilities may make it easier for those tasked with implementing a semantic metadata architecture and corresponding management structure in a PPIC to achieve interoperability between many different and ever evolving system elements. These capabilities are valuable given the challenges presented in chapter 4.2.

2.4.3 Design principles
The main component of the reference architecture in this research is the set of design principles, see section 1 of this chapter. Design principles are the key element of principle-based design, which is to result in “a prescriptive theory which integrates normative and descriptive theories into design paths intended to produce more effective information systems” (Walls, Widmeyer, & El Sawy, 1992). The Open Group has defined principles, in the area of information technology, as “general rules and guidelines, that are intended to be enduring and seldom amended, that inform and support the way in which an organization sets about fulfilling its mission” (TOGAF, 2004). Bharosa (2011) has defined design principles as “normative and directive guidelines, formulated towards taking action by the information system architects”. These descriptions indicate that design principles are prescriptive but focus on goal attainment rather than compliance.

In this research the design principles are structured in the same format as detailed in the TOGAF architecture (TOGAF, 2007). This means that each design principle is captured in a short unambiguous statement which is provided with a name, rationale and implications.
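The TOGAF-style structure described above can be sketched as a simple record. The following Python snippet is purely illustrative: the class layout mirrors the four parts named in the text (name, statement, rationale, implications), while the example principle content is invented for this sketch and not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DesignPrinciple:
    """One TOGAF-style design principle record (illustrative only)."""
    name: str       # short handle for the principle
    statement: str  # the short unambiguous prescriptive statement
    rationale: str  # why the principle matters
    implications: List[str] = field(default_factory=list)  # consequences of adoption

# Hypothetical example entry, not a principle from this research:
principle = DesignPrinciple(
    name="Shared semantics",
    statement="All partners in the chain describe exchanged information "
              "with the agreed common set of semantic metadata.",
    rationale="Avoids per-partner translation effort in the chain.",
    implications=["Partners must map local data models onto the common set."],
)
print(principle.name)
```

Capturing each principle in such a uniform record makes the set easy to review against the quality indicators discussed later in this chapter.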

2.4.4 Tradeoffs
The reference architecture is a balancing act. On the one hand there must be enough rigor to provide interoperability. On the other hand the design that follows from its use must leave enough leeway to fit organizations with different characteristics or changes in functional requirements. This mixture of rigor and leeway has been achieved by using both design principles and tradeoffs.

Aside from the design principles the reference architecture consists of a number of tradeoffs. These tradeoffs are relevant topics related to metadata management in PPICs for which the implementation may strongly differ per situation. The term “tradeoffs” has been deliberately chosen to illustrate that these topics together form the metadata management infrastructure. According to Bass et al. (2003) part-whole decomposition enables a system to achieve modifiability and integrability qualities. The tradeoffs provide the ability to create different configurations as the designer sees fit.

The design principles can be seen as tradeoffs as well, but unidirectional ones: they are to be fulfilled as far as resources allow. The tradeoffs in this context are decisions on system characteristics. Functionality is the primary criterion, with resources being a secondary criterion.


2.4.5 Credibility
A reference architecture is the outcome of a design process, as indicated in the section on design science. As with any design, its actual suitability for the intended purpose is to be determined by some form of evaluation. Evaluation adds to the credibility of the design and corroborates existing scientific knowledge or adds new knowledge.

Regarding a reference architecture, three levels of credibility can be distinguished (Clements, Kazman, & Klein, 2001). At the conclusion of the design phase it can be regarded as a preliminary architecture, since it has not been tested in any way. Following expert validation and/or a case study it may be regarded as an evaluated architecture. Once it has been applied by the intended end user for its intended purpose it becomes a tested reference architecture (Kazman, et al., 1998).

In this research project a preliminary architecture is developed based on literature and evaluated using two explorative case studies. In the future it may be applied and become a tested reference architecture.

2.4.6 Quality indicators
The reference architecture is evaluated in two ways. First, the content is evaluated using the case studies and experts. Second, the format is evaluated, to see whether the architecture meets the demands set for its purpose and format. These can be seen as analogous to functional and non-functional requirements in software design. A list of quality indicators has been composed, shown below. The format is evaluated in chapter 10.2.

Common quality indicators
The format of the reference architecture is evaluated by checking its conformity to a number of quality indicators that were drawn up before the start of this research. The quality indicators are derived from literature (Angelov & Grefen, 2008; TOGAF, 2004). Additionally, during the interviews the experts were asked about their expectations of a reference architecture for metadata management. Table 2 lists the reference architecture quality indicators.

The reference architecture should…
… provide a holistic view on semantic metadata management in order to show interdependencies and trade-offs.
… address multiple stakeholder perspectives and roles.
… be a generic solution that is context and vendor neutral.
… be based on a scientific foundation and apply real life best practices.
… be able to deal with applicable laws and regulations.
… be concise, understandable and easy to communicate.
… merely assist [the real life designer] and leave as much design space as possible to accommodate a specific implementation.
… be understandable to people within the organization with various backgrounds.
Table 2: Reference architecture quality indicators. Created by author.


Quality indicators applying to design principles
TOGAF (2007) states a number of quality indicators that specifically relate to design principles. Given that the design principles are the most important part of the reference architecture and shape the final design, special attention to the design principles is prudent. The TOGAF quality indicators relating to design principles are listed in Table 3.

Quality indicators for design principles
Understandable | The intention of the design principle should be clear and unambiguous. They should be understood by individuals throughout the organization.
Robust | The design principles should be sufficiently definitive and precise to result in near similar solutions in near similar situations.
Complete | The design principles should cover any situation within the scope of application. They should also cover all topics that matter within the subject.
Consistent | The design principles should be consistent. Adhering to one design principle should not exclude adhering to another design principle.
Stable | The design principles should be enduring, yet be able to accommodate changes.
Table 3: TOGAF quality indicators relating to design principles. TOGAF (2007)


3 Public Private Information Chains
The topic of this research is semantic metadata management. This topic is rather comprehensive and is therefore delineated to the domain of Public Private Information Chains (PPIC), as shown in chapter 1.4. In order to substantiate the PPIC as a research domain, two topics are introduced. The first is a short introduction to bureaucratic processes in government, providing a historic perspective. This is followed by a definition of PPICs in the second section and two examples in the third section.

3.1 Public policy and bureaucratic processes
In Western society, laws and policy are implemented by the executive branch of government. The government is not a single entity but consists of numerous public bodies, each with its own area of responsibility and services portfolio. The executive public bodies perform a wide range of information intensive services, many of which have to do with regulation and decision making in individual cases. These services often require interaction with citizens, businesses and other government bodies (Wimmer, 2002). In order to carry out those responsibilities, public and private parties need to exchange information frequently. This need for information exchange is either a requirement by law, related to one of the primary processes, or in the interest of one of the parties. The nature of this information exchange is very diverse: it may relate to permits, accountability, taxes, statistics, subsidies and even primary processes of public and private parties alike.

Since the Industrial Revolution, information exchange in this context has been increasingly standardized in a trend called bureaucracy. In analogy to standardization in the production of physical goods, the information product and its production process have been standardized. For information processes this means that both the format of the information and the execution of transmission, transformation and decision making have been standardized (Albrow, 1970). Similar to the production of physical goods, information processes have been divided into well defined steps that are usually carried out sequentially, sometimes by different parties. Albrow lists a number of perceived benefits of bureaucracy:
 Transparency: insight into processes for both the subjects and management.
 Predictability: the use of logic makes the outcome of the process deterministic.
 Quality: due diligence can be proven and the process can be easily reviewed.
 Equality and objectivity: the use of logic restricts subjectivity.
 Efficiency: standardization allows for specialization and economies of scale.
 Reduction of complexity: decomposition of complex tasks makes them easier to understand.

Aside from benefits there are also perceived drawbacks of bureaucracy. Bureaucracy nowadays has a negative connotation of being costly and time consuming. Bureaucracy is also said to have become a goal in itself, resulting in more distance from the underlying purpose. This turnaround in connotation is described by the following quote: “It is hard to imagine today, but a hundred years ago bureaucracy meant something positive. It connoted a rational, efficient method of organization – something to take the place of the arbitrary exercise of power by authoritarian regimes. Bureaucracy brought the same logic to government work that the assembly line brought to the factory. With the

hierarchical authority and functional specialization, they made possible the efficient undertaking of large complex tasks.” (Osborne & Gaebler, 1993).

The automation of public private processes
The perceived cost of bureaucratic processes in time, effort and money has led to a desire to reduce the administrative burden. This can partly be achieved by abolishing some processes, but that is only a partial solution since most processes are deemed necessary to execute public policy. Therefore most attention is focused on more efficient execution of these processes (OECD, 2003). The high level of standardization and the extensive application of logic mean that bureaucratic processes lend themselves very well to automation. Since the 1980s two trends in automation have completely transformed the characteristics of bureaucracy.

First, in both private parties and public bodies many processes carried out by people using paper have been supplemented by IT-systems or even replaced entirely (Sbodio, et al., 2010). The result is that time and money are saved in every value adding step in the information chain. IT is unable to make professional judgments, but is very capable of carrying out logic such as classification based on characteristics. Additionally, many overhead tasks are simplified significantly using IT. An example is duplication of a document, which requires far less effort on a computer than on paper.

Second, with the rise of the internet many of those IT-systems have become interconnected, also across organizations. This allows for better cooperation, at the cost of more complex structures for sharing information (Janssen & Van Veenstra, 2005). The reduction in transaction costs has saved even more time and money.

3.2 Characteristics of Public Private Information Chains
Together, the presented trends regarding the automation of bureaucracy in government have led to what are called Public Private Information Chains (PPIC) in this research. The common terminology “e-government” is deliberately avoided. There are many definitions of e-government, many of which are either very broad or even contradictory. No clear threshold or definition exists to determine whether a public body is part of the e-government or not. Often the term e-government stands for electronic government and refers to the use of information technology by the government to provide information and services to citizens, businesses and other government bodies. This definition is much wider than the exchange of information. Introducing the term PPIC allows for an unambiguous description of the research domain.

A PPIC is defined in this research as a digital information chain spanning multiple organizations that is centered around a certain type of information and has a high rate of repetition and mutual responsibilities and dependencies. PPICs have shown great benefits, reducing costs and time significantly. However, many inefficiencies remain. Most information systems in use predate the PPIC and have not been designed to be interoperable in this way. Note that in this context information systems are mentioned, not IT-systems. An information system is defined as the whole complex of people, IT-systems, data and processes. Through web services and standardized data models many IT-systems have been made interoperable. However, since the organizations within a PPIC are heterogeneous, information is used for different purposes by experts of different backgrounds, posing different quality needs.

The transaction costs incurred in information exchange do not only include transmission; the majority consists of translation costs. Translation is the effort of interpreting information and refitting it for another information system. An option would be to get the information from the source, avoiding translation in all other links in the chain. However, this is often not an option since intermediary information products are shared and each link in the information chain adds a unique value. Therefore the key to increased efficiency and improved information quality lies in using the same semantics within the chain to describe information, leaving room for diversity in the information itself. This does away with the translation costs. The effort of aligning the various IT-systems, data models and stakeholders is called semantic metadata management.

Characteristics of information chains
Throughout this research organizations and systems are presented as architectures. An architecture is defined in this research as the constellation of people, processes, data and systems regarding a certain subject. The word architecture is used in order to indicate that such constellations can be analyzed and redesigned. In concordance with this descriptive view of architectures, the notion of a Public Private Information Chain (PPIC) is put in perspective by means of the theories of value chains and the information supply chain.

A value chain is a concept from business management introduced by Porter. It describes how products undergo a series of primary business activities that add value to the product. Information products and decision making processes are also subject to a value chain (Rayport & Sviokla, 2000). This value chain may take place within the boundary of a single organization or extend over a network of organizations. According to Rayport and Sviokla (2000) the value adding steps that can be identified in both structured and non-structured information products are gathering, organizing, selection, synthesis and distribution. Hoffman et al (2010) speak of analysis, interpretation, decision making, distributing, transmitting, verifying, reconciling, correcting, rekeying, generation, discovery and gathering.

Sun and Yen (2005) introduce the notion of an information supply chain and compare it to supply chain management theories that normally apply to tangible goods. They observe that information products are undergoing similar transitions as tangible goods. As with physical products the information product also matures in the value chain. Instead of fully finished products a lot of intermediate products are exchanged within the chain. These intermediate products, when joined together as a finished product, result in a better product or a similar quality product with lower costs. Each link in the chain adds its specific values.

In contrast to supply chains, the information supply chain possesses certain unique characteristics that tangible goods do not. Information products can be easily multiplied, especially when they exist digitally. Transportation and distribution also differ significantly. Moving a car or toaster to the other side of the world takes a day by plane and weeks by sea. Most information can be transported near instantaneously to any location by fax, email or web services. Therefore the need for warehousing (storage near the client) or demand estimation does not play a major role according to Sun and Yen. Additionally, most tangible goods are well structured, unless made or designed to order (Platier, 1996). Regarding information there are also many non-structured or ill-structured products. Even the same type of document with the same type of content often differs due to the author’s own habits and style of writing, although formats do increase the level of structure.

3.3 Public Private Information Chains in practice
The concept of public private information chains is illustrated by two examples. These also introduce the context in which the two case studies are situated. Additional information on the background of the cases, the nature of these information chains and the challenges in these cases is presented in the case studies in chapters 0 and 8.

Child protective services case
Bureau Jeugdzorg (BJz) has a coordinating role regarding the various agencies that support the families placed under their watch. Additionally, they are to constantly monitor the safety of the children. Reinforced by a string of incidents causing nationwide headlines, there was a desire for improved monitoring of safety. As a result information was to be obtained from more sources and at a higher frequency. Consequently the information chain grew steadily. The increased frequency also led to more data, which conflicts with the desire for a faster throughput and review of that information. Presented with these challenges, there are various initiatives to automate these new information exchange processes. All these initiatives take place in a context in which processes are redesigned and rationalized to reduce arbitrariness. Also, the intellectual capabilities of the employees are to be preserved for making judgments, not wasted on simpler forms of information processing and the drafting of reports.

Tax office case
Both on the municipal and national level the government aims to reduce the administrative burden on citizens and companies. As a result digital information flows are added to the existing paper based information flows. In order to orchestrate those flows and to provide a one-stop shop, independent government agencies have been established, such as Logius and Agentschap NL in the Netherlands. Another trend is that national government bodies strive not to request the same type of information twice. For instance it is preferable that an address needs to be changed in only one location, not a dozen times. Government bodies that previously had nothing to do with each other now have to interact digitally since they need to exchange information required for their primary processes. This results in a variety of public private information chains which are interlinked and form a network.


4 Semantic metadata: potential and challenges
This chapter details the motives for using semantic metadata and the challenges related to its implementation. Section 4.1 lists the potential of and the motives for using semantic metadata. Many motives are listed since semantic metadata may impact various parts of the organization. Which benefits are realized depends on the specific implementation of semantic metadata. Section 4.2 lists the challenges imposed on those who are to design and carry out semantic metadata management. These challenges are divided into the areas of technology, data and stakeholders in order to structure the problem. This chapter ends with conclusions and a view on the tradeoff between benefits and challenges.

4.1 Potential benefits of standardizing semantic metadata
This section provides insight into the potential benefits that semantic metadata management may offer within and amongst organizations. Semantic metadata management has no direct contribution, but it is a prerequisite for having a common set of semantics across organizational barriers, which does have benefits. Semantic metadata management can be viewed as a standardization effort of semantic metadata within and amongst organizations. The benefits of this standardization are presented first. Moreover, having a well managed set of semantic metadata allows for further automation of public processes that are carried out in chains (Fokkema & Hulstijn, 2011).

4.1.1 Effects of standardization
Semantic metadata management can be viewed as a standardization effort of semantic metadata. Whether employed within an organization or across organizational borders, it is aimed at increasing levels of conformity and reuse. Standards allow each individual entity, such as an organization within a PPIC, to conform individually, with an emergent performance increase across multiple entities as a result. This increases the performance of the PPIC as a whole (Egyedi, 2003). The advantages of standardization in PPICs are illustrated by the following three views.

Interoperability
The use of a common set of semantics improves interoperability in information exchange. Within an information chain this means that information can flow more easily, reducing transaction costs, throughput time and errors in translation. Houtevels (2010) describes the costs of closed world systems, where costs are incurred for translating and interpreting inbound information and for translating and compiling outbound information. The use of standards, both on the technical and the content level, allows for open world systems. An open world system is designed with inter-organizational information exchange in mind.

Reduction of complexity
Standardization enforces a form of structure. Categorization, which semantic metadata provides, is one form of structure. Standardization of semantics also reduces the amount of metadata through reuse. Both structure and data reduction allow for better insight. Insight into and understanding of the organization and its cross-organizational processes is a prerequisite for being in control. This in turn is much desired by those managing the organization. It allows for evaluating and increasing operational performance. Moreover it aids in accountability for production, spending, process compliance and performance.

Conformity of content
Semantic metadata is directly linked to the information contained within the core data. Standardization of semantic metadata results in a common vocabulary for describing data, lowering the chances of miscommunication. Subsequently, common metadata eliminates the need for interpreting and translating inbound and outbound information. This results in improved information quality throughout the information chain. The information quality is improved even further when quality indicators are added and the quality of information is actively monitored (Bharosa, 2011).

4.1.2 Possibilities of automation
The second motive for standardizing and managing semantic metadata is further automation of processes. Since the advent of the computer, more and more tasks previously carried out by people are being automated. A well managed and common set of semantic metadata is a precondition for and enabler of various types of automation which may be desirable but cannot yet be implemented. On the one hand these forms of automation may counter the side effects of automation presented in the problem statement. On the other hand they may further reduce paper based processes and human effort.

Compliance by design
In the public sector processes need to conform to numerous legal requirements. The government is not only a source of regulation, but is also highly regulated itself. Compliance depends on the way the processes are set up and how strictly these processes are adhered to. Computer systems are deterministic by nature, lacking any form of arbitrariness humans might display. Once the legal requirements are correctly embedded within the design there is reasonable assurance that the outcomes of the process will be compliant too (Fokkema & Hulstijn, 2011). Therefore the use of computer systems in the public sector allows for compliance by design.

Business intelligence
Having correct and consistent semantic metadata is of key importance for various business intelligence and data mining applications, which in turn are key to reducing human effort. Statistics based business intelligence allows for projections of trends, insight into production and process flows, and consistency checks. Rule based business intelligence allows for classification, input assistance and fault reduction (Witten & Eibe, 2005). Especially within the public sector the use of classification is appealing, since laws and regulations clearly indicate which persons or entities have which privileges and responsibilities. Classification is already common for numerical data, since attaching metadata to numbers is common practice and numbers have the right level of granularity. A good example is the tax office application for filing personal taxes. By providing several values the application itself determines whether you are entitled to certain tax advantages.
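As an illustration, this kind of deterministic, rule based classification can be sketched as follows. The rules, thresholds and function name are invented for this example and do not reflect actual tax law:

```python
# Hypothetical sketch of rule-based classification: given a few provided
# values, explicit rules decide whether a (fictitious) tax advantage applies.
# Because the rules are deterministic, every case with the same inputs is
# classified the same way, without human arbitrariness.

def eligible_for_advantage(age: int, annual_income: float, has_children: bool) -> bool:
    """Classify a taxpayer using explicit, auditable rules (illustrative only)."""
    if age < 18:
        return False                      # minors file via a parent
    if has_children and annual_income < 30000:
        return True                       # low-income family advantage
    if age >= 65 and annual_income < 20000:
        return True                       # senior low-income advantage
    return False

print(eligible_for_advantage(40, 25000, True))   # True
print(eligible_for_advantage(30, 50000, False))  # False
```

Because such rules operate on well-defined fields (age, income), they only work when the semantic metadata guarantees that each value means the same thing in every system that supplies it.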


Drafting of reports
Within a PPIC much information is exchanged. Usually this information is contained in a report of some sort: a (digital) document which provides a certain set of information for a certain purpose in a presentable form. Since a form of professional judgment is required in compiling reports, this often takes a lot of human effort. Jans (2007) divides this effort into five steps: overview, selection, ordering, describing and compiling. These are briefly presented. First there needs to be an overview of all information which might be relevant given the purpose of the report; this becomes easier when search criteria match the content well. Then a selection needs to be made, which is easier when excerpts can be reused. Subsequently this information needs to be ordered in a way that makes sense. Fourth, there needs to be a form of description of the purpose, content and quality assurance of the document. Finally the report needs to be compiled in a desired format. Semantic metadata can aid in all five steps, providing the end user with much better insight into the available information. For reports of a predetermined format many of these steps may even be fully automated.
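The five steps can be sketched in code. The items, tags and field names below are purely illustrative, and the semantic metadata is reduced to simple tag sets:

```python
# Illustrative sketch of Jans's five report-drafting steps (overview,
# selection, ordering, describing, compiling), where semantic metadata
# (here: tag sets) supports each step. All data is hypothetical.

items = [
    {"text": "Income rose 5%.",    "tags": {"finance", "2011"}, "order": 2},
    {"text": "Costs were stable.", "tags": {"finance", "2011"}, "order": 3},
    {"text": "Staff picnic held.", "tags": {"social"},          "order": 1},
]

def draft_report(items, required_tags, purpose):
    overview = items                                                 # 1. overview of candidates
    selected = [i for i in overview if required_tags <= i["tags"]]   # 2. selection via tags
    ordered = sorted(selected, key=lambda i: i["order"])             # 3. ordering
    header = f"Purpose: {purpose}"                                   # 4. describing the report
    return "\n".join([header] + [i["text"] for i in ordered])        # 5. compiling

print(draft_report(items, {"finance", "2011"}, "Annual summary"))
```

In this toy version the selection step works only because the tags are consistent across items, which is exactly what standardized semantic metadata is meant to guarantee.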

4.1.3 Conclusions on potential benefits of semantic metadata
Information exchange and (human) processing have much to gain from semantic metadata, especially when standardized and exchanged alongside the core data. The actual benefits that are realized depend on the actual implementation. In any case a form of management is required in order to achieve these benefits. Many of the stated benefits are overlapping, interrelated, or could even be mutually exclusive. Therefore there needs to be a focus on which benefits are desired most.

Hoffman et al (2010) describe how the advantages of standardization and automation reinforce each other. This matches the observation that benefits are overlapping and interrelated. As an example they describe the generic processes regarding the drafting of reports and the exchange of information, as seen in Figure 4. This figure uses arbitrary numbers but gives a rough indication of the benefits of standardization. In short, the tasks that are necessary but considered a waste of human intellect require less effort. This allows more human attention to go to areas in which people have a distinct added value, such as analysis and decision making. Alternatively, the combination of semantic metadata and further automation allows for lower costs and/or reduced process time.


Figure 4: An indication of where semantics combined with automation can provide gains by reducing the time spent on tasks with very little added value. Adapted from Hoffman et al.

The described processes can be found in nearly every type of organization. For some it is their core business, for others it is in support of their primary processes. The benefits of semantic metadata standardization are applicable to three generic groups (Hoffman, et al., 2010). These three groups include all who are potentially impacted by semantic metadata in their daily activities:
1. Those responsible for specifying metadata. This group consists of those who are responsible within an organization for determining which metadata is used and for coordinating what definitions are used. In the context of this research this group extends beyond a single organization to those cooperating in a consortium of organizations.
2. The creators of information. Those responsible for registering new information and creating new information products. Semantics ease registration efforts and result in higher-quality input information, improving performance in later processes.
3. The consumers of information. Those who receive information and need to use it for any task other than creating information. That task is made easier since the metadata potentially adds various benefits regarding interpretation, verification, validation, analysis, trust and extraction of information.

Organizations that operate within a chain are subject to all three categories. Inbound information needs to be interpreted. Outbound information is to be composed and drafted in various formats. Meanwhile metadata must be perpetually specified and reviewed for internal and external use, even when common standards are in use. The first category is encompassed by metadata management and enables the other two.


4.2 Challenges regarding semantic metadata management
System architects who are to implement semantic metadata management face a number of challenges, which make implementing semantic metadata management in PPICs complicated. The challenges regarding managing the use of semantic metadata in a PPIC are very diverse. Aside from the actual content of the semantics, the challenges range from technology and information exchange to cooperation and trust among stakeholders. The root of all the challenges lies in the fact that a PPIC is an information chain that consists of various heterogeneous organizations. The metadata management effort takes place both within each organization and over the chain as a whole. Metadata management efforts in PPICs have been on the agenda since the early 2000s (Sbodio, et al., 2010), but actual implementations and best practices are limited.

4.2.1 Impact on enterprise architecture
In this thesis semantic metadata management is reviewed from both an organizational and an inter-organizational view. These views are structured by plotting the challenges on a three layer enterprise architecture. The three layers are the technology level, the data level and the process level. This three layer model originates from the ArchiMate standard. It was chosen as it provides structure and is simple enough to communicate. The ArchiMate standard was developed to describe information system architectures in public and private organizations (Lankhorst et al., 2008). In ArchiMate a three level enterprise architecture is used to represent organizational structures, for both public and private organizations. These levels are the technology level, the data and application level and the business level. This three level view conforms with the Federal Enterprise Architecture Program of the US federal government (FEA-PMO, 2007).

The TBM Enterprise Architecture model by Janssen (2009) is shown in Figure 5. This model uses five levels and clearly shows how the elements on different levels interact. The TBM-EA model also shows the context of an enterprise architecture:
- The enterprise architecture is influenced by the business and stakeholder environment. The enterprise architecture must fulfill the needs posed by the business, which in turn is influenced by the stakeholder environment.
- The enterprise architecture is governed by the managers and architects responsible for the business processes, information architecture, applications and technical infrastructure. The architecture is never static but continuously changes. These changes may be ad hoc or part of a pre-planned growth path.
- Finally, the enterprise architecture is implemented, translating plans into reality.


Figure 5: TBM Enterprise Architecture meta-framework. By Janssen (2009).

Inter-organizational alignment
The target audience of this research operates within their own organization, in which there are dynamics and dependencies both within and between the three enterprise levels detailed in the next section. Aside from their own organization, those tasked with implementing semantic metadata management are confronted with other organizations in the PPIC over which they have no direct control. Figure 6 provides insight into the complexity and dependencies that make alignment difficult to achieve. It shows a chain of five organizations, each of which needs to apply a form of semantic metadata management. These management efforts can range from very extensive to being as limited as simply implementing the common standard.

Figure 6: Graphic representation of an information chain, showing that each organization within the chain has an enterprise architecture. Alignment of semantics takes place both within and amongst organizations. Created by author.


4.2.2 Challenges on sub domains
For each of the three levels an overview of the challenges is given.

Technology level
The main technological issue is that there is no green field regarding technologies. Semantic metadata initiatives have to cope with existing IT-systems. Many organizations still employ legacy IT-systems that have been developed, or even purpose built, many years ago. This has the following consequences:
- Many organizational information systems are ‘closed world systems’: they have not been designed to be interoperable with other systems, let alone with systems outside of the organization.
- Development of dedicated IT-systems, or the configuration of generic systems for that matter, is costly and very time consuming. Many specialist systems are not available off the shelf and the replacement of large IT-systems takes years. Hence there is no green field in which to implement external semantic metadata; legacy systems are a given context.
- The staff in the IT-department and the primary process employees are accustomed to the existing systems. This hinders both the acquisition of new systems and the implementation of external semantic metadata, which is new to many organizations.

These three technological issues are applicable to all IT-innovations and are not specifically related to semantic metadata. This means that other IT-projects may be influenced by metadata management. The chapter on potential benefits speaks of enabling other IT-projects, but they may also be impaired. Given that the central challenge in metadata management is alignment, these technological challenges are very relevant. A further complication is that these challenges exist within each organization in the PPIC.

Data level
Digital information exchange requires alignment between the two exchanging parties (Delone & McLean, 1992). When computer systems are in use but information exchange takes place in a non-digital manner (on paper or verbally), there are human ‘translators’ on either side of the exchange. Natural language and figures form a common standard both parties adhere to. The translators interpret the message and enter the information in a manner that suits their information system. Digital information exchange does not automatically remove the inefficiency of having these translators, since different data models and definitions may be in use.
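A minimal sketch of this translation effort, assuming two hypothetical data models that name the same concept differently; all field names are invented for illustration:

```python
# The 'translator' problem in miniature: system A and system B store the
# same concept under different field names, so an explicit mapping must be
# built and maintained for every pairwise exchange in the chain.

record_system_a = {"annual_income_eur": 42000}

# Explicit mapping from system A's data model to system B's data model.
FIELD_MAP = {"annual_income_eur": "income"}

def translate(record: dict, field_map: dict) -> dict:
    """Rename fields so system B can ingest system A's record."""
    return {field_map[key]: value for key, value in record.items()}

print(translate(record_system_a, FIELD_MAP))  # {'income': 42000}
```

With n parties and no shared semantics, up to n(n-1) such mappings may be needed; a common set of semantics replaces them with one mapping per party.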

A workaround for this problem, used primarily since the 1990s, is the use of plain text messages during the exchange, as if it were paper, leaving all of the interpretation to the end user of the information during reading. For example the US Securities and Exchange Commission (SEC) uses the XBRL standard to interact with publicly traded companies, but no semantics have been defined. A recent study shows that there is only a 2% commonality in semantic metadata (Bergeron, 2003). This means that the need for translation and interpretation by the end user is equal to receiving the same information on paper.

A solution to this problem would be the common application of the semantics in use. This requires that semantic metadata either be included within the message or be referred to in a manner accessible to the recipient. This could be a standard common to both parties or a publication of the semantics by the sender of the message (Brandt, Miller, Long, & Xue, 2003).
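A minimal sketch of such a message, assuming a hypothetical JSON format in which semantics are both referenced via a published vocabulary and embedded inline; the URL and field names are invented:

```python
import json

# Two options for making semantics accessible to the recipient, combined in
# one message: a reference to a published vocabulary (shared standard), and
# inline semantic metadata (unit, definition) travelling with each value.

message = {
    "semantics_ref": "http://example.org/vocab/profit-statement-v1",  # published semantics
    "data": {
        "net_profit": {
            "value": 12000,
            "unit": "EUR",                       # inline semantic metadata
            "definition": "income minus costs",  # inline semantic metadata
        }
    },
}

print(json.dumps(message, indent=2))
```

The recipient can then interpret `net_profit` without a human translator, either by resolving the referenced vocabulary or by reading the inline metadata directly.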

Process level
Management processes are very important for semantic metadata management. They directly touch upon the metadata lifecycle and the performance of the system (Kimball, Reeves, Ross, & Thornthwaite, 2002). The management processes take place on two levels: within the organization itself and within the information chain.

Within the organization the management focuses on the metadata lifecycle. Metadata must be specified, made available, verified and eventually deleted. With many IT-systems the management processes can be carried out as a standalone effort; with metadata management this is not possible. Semantics serve multiple end users and span various primary processes, technologies, data models and even organizations. Alignment is critical to stay in sync with all those other elements in the organizational architecture.

Cooperation among stakeholders poses a number of stakeholder related challenges (De Bruijn & Ten Heuvelhof, 2007). These challenges are grouped into three categories. In appendix 12.1.3 these are presented in detail.
- The first category of stakeholder related challenges applies to all projects with multiple stakeholders. These generic challenges need to be overcome in nearly any project. For instance, it is hard to determine exactly which stakeholders are involved and what their motives and interests are.
- Operating within a chain means that responsibilities and efforts are being redistributed (Fokkema & Hulstijn, 2011). Transitioning from a closed world organization to a situation in which information from various third parties is used requires organizational change. Before people and organizations are willing to be dependent on other parties there needs to be a level of trust, for instance in the form of agreements or transparency.
- Infrastructures have a number of unique features that provide their own set of specific stakeholder related challenges (Blecker & Kersten, 2006). These are also present in information technology based infrastructures such as metadata management in information chains.

4.2.3 Conclusion on challenges
The enterprise architecture indicates that there are challenges regarding semantic metadata management on various levels. The differentiation into technology, data and processes is artificial but gives a good indication of the wide variety of topics that relate to metadata management. For each of these topics there are different subject-matter experts, managers that are accountable and end users that want or need to be included in the design process. This applies to every organization within the information chain.

Semantic metadata management is partially complicated, but most of all it is complex. There is a difference between complicated and complex. Anderson (1999) defines complex as the opposite of independent, and complicated as the opposite of simple. Individually each of the presented challenges can be overcome. Even though some challenges are intrinsically more complicated than others, for each individual challenge one or more acceptable solutions can be devised. However, devising a solution that fits all challenges is difficult since every action will also impact another part of the system. All parts of the system are interconnected. Decisions or the given situation in one area will limit or open options in another area. For instance, the option technically most suitable for data transmission may not fit the type of semantics or the data model in use. The optimal data model may not be acceptable to some partners in the PPIC.

Aside from the interrelation of challenges there is another complicating factor: dynamics. None of the described challenges are static. All factors change over time due to continuous developments. New technologies may arise, processes adapt to changing needs and laws, people change their minds and partnerships evolve.

In summary, this complexity stems from a number of factors, including:
- The number of components;
- The number of relations;
- The number of processes;
- The number of stakeholders;
- The number of interests per stakeholder;
- The dynamics of each stated factor.

4.2.4 The balance between potential benefits and challenges
A characteristic of semantic metadata management is that it requires a significant short term investment for long term gains. A wise approach would be to perform a cost benefit analysis before starting on semantic metadata management in a chain. In some cases the effort of overcoming the challenges may indeed not be worth the gains. On the other hand, given the large design space there probably are some improvements that can be made.

Not pursuing semantic metadata management in the scope presented in the problem statement will possibly force all parties to carry out mitigation efforts somewhere in the future. These individual mitigation efforts are likely less beneficial than combined efforts. Combined and coordinated efforts may lead to a situation in which the gains are greater than the sum of all effort. Whether the benefits of semantic metadata management truly outweigh the cost of overcoming the challenges depends on the actual situation at hand and what solution is ultimately achieved.

Risk management theories argue that knowing what challenges and risks may occur allows for preparation, thus increasing the chances for success. Knowing the risks allows them to be treated, transferred, terminated or taken. The evaluated reference architecture developed in this research project may be used to identify what risks may occur.



5 Aspects of semantic metadata management in literature
This chapter describes what technological and organizational aspects should be incorporated in the reference architecture according to literature. The aspects are divided into three perspectives, in concordance with the enterprise architecture approach presented in chapter 4. The three perspectives are data, technology and processes. Within each perspective a number of aspects are covered. Each aspect ends with a summary of best practices presented in literature. Together, these best practices make up the preliminary reference architecture described in chapter 6.

5.1 Semantic metadata management: Data perspective
The data perspective is chosen as the first domain since the definitions needed to understand the other domains are introduced here. The introduction to semantic metadata is followed by a description of how the use of a common set of semantics benefits information quality. The third aspect discussed in this section is the relation between semantics and business rules.

5.1.1 Introducing semantic metadata
This section introduces semantic metadata and the need for consistency in order to preserve the relation between data and its actual meaning. Complementary to this, Appendix 0 provides an overview of metadata typologies and of the relation between semantic metadata and other types of metadata.

Semantic metadata
Semantic metadata provides semantics, meaning and context, to a data element (NISO, 2004). Semantic metadata may include tags, labels, definitions, context, concepts, units, references and notes. In short it represents all data that adds context to other data. The data that is being placed in a context is also referred to as core data. Unlike other types of metadata, semantic metadata is of particular interest to the human end user who is using data for a certain purpose or task (Borghoff & Pareschi, 1997). Implementations of descriptive metadata can vary. It may be generic or specific to a single data element, and can be stored externally or with the data itself (Elmasri & Navathe, 2007). Semantic metadata is also directly linked to information quality indicators, such as origin, owner, age and mutation (Strong, Lee, & Wang, 1997).

Interrelations among metadata types

Administrative and structural metadata are out of the scope of this research, as indicated in chapter 1.4. Nevertheless, there are some relations between semantic metadata and these other types of metadata. Although not related to meaning, structural metadata is related to semantic metadata because the semantics need to be linked in some way to the core data. Administrative metadata may in some cases supplement semantic metadata in providing a context for the data. A timestamp that tells when certain data was entered into the system is not only relevant for administrative purposes; the end user may also use it to put the information into perspective. Since administrative metadata is usually unique to the system, it is lost upon transmission to another system. Some administrative metadata can be added to the core data itself,

for instance a timestamp. In this example administrative metadata is transformed into semantic metadata as it becomes visible to the end user of the data.

Example of metadata types

To provide better insight into metadata types, a simple example is provided. Figure 7 shows an instance of a very rudimentary profit statement. All three types of metadata are present in this figure. The structural metadata indicates what core data (shown in blue) and internal metadata (shown in green) relate to the generic metadata (shown on the right, in orange). The actual date shown with the timestamp could be administrative metadata recorded when the instance was received, or be entered by hand. A business rule that could be present in this example is that net profit should be equal to income minus costs.

Figure 7: Metadata example. Created by author.
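The separation described for Figure 7 can be sketched in code. The following minimal Python sketch distinguishes the core data, the instance-specific (internal) semantic metadata and the generic (external) semantic metadata; the concrete values are taken from the figure as described in the text, while the structure and numeric amounts are illustrative assumptions.

```python
# Core data: the bare figures, meaningless without context (illustrative amounts).
core_data = {"income": 1500, "costs": 900, "net_profit": 600}

# Internal semantic metadata: stored with this specific instance.
internal_metadata = {"company": "ABC group", "currency": "euro",
                     "timestamp": "19-12-2008"}

# Generic (external) semantic metadata: shared definitions that apply to
# every instance of this type of profit statement.
generic_metadata = {
    "income": "Total revenue over the reporting period",
    "costs": "Total expenses over the reporting period",
    "net_profit": "Income minus costs",
}

# The business rule from the example, expressed over the defined concepts.
assert core_data["net_profit"] == core_data["income"] - core_data["costs"]
```

Note that only the generic metadata would be reused across instances; the internal metadata travels with each individual statement.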

The whole reason to carry out semantic metadata management is that semantic metadata only has value if it is correctly linked to the core data it describes. This means that semantic metadata must make sense and be linked to the data models that are in use. Semantic metadata must also be internally consistent: semantics should not contradict each other, and upon aggregation the meaning should not be lost.

Topic A: Consistency

 The data model should be consistent, with relations among metadata defined.
 Implemented semantic metadata should be linked to the data model.
 The data model should be horizontally and vertically consistent.
 There should be little or no overlap in semantic metadata.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.

5.1.2 Common set of semantics

Using a common set of semantics is a form of standardization. Master data management is a well-known form of data standardization in organizations. This analogy is used to present the benefits of using a common set of semantics. Subsequently, the use of a common set of semantics is linked to information maturity. A higher level of maturity requires standardization and reuse of semantics, which is achieved by using a common set.

Metadata management as a form of master data management

Master data management is the effort of maintaining a set of information within an organization which can be used as reference data. Semantic metadata management is related to master data

management in two ways. First, it is a form of master data management itself, but performed on a very specific subset of the data within the organization: the external semantic metadata. Second, metadata management may aid master data management efforts related to the core data.

Efficient metadata management is a critical aspect of overall system performance in large distributed storage systems (Brandt, et al., 2003). The goal in systems with specialized metadata management is to manage the metadata efficiently so that conventional directory and file semantics can be maintained without negatively affecting overall system performance. Although the size of metadata is generally small compared to the overall storage capacity of such a system, 50% to 80% of all file system accesses are metadata accesses. As such, applying master data management principles to semantic metadata increases technological performance. Keeping metadata external to the core data does not only improve performance, it also allows for easier management. Figure 8 shows a generic setup in which metadata is kept external in practice.

Figure 8: A setup in which metadata is loosely coupled to the core data. From Brandt et al (2003).

Information maturity and common set of semantics

The added value of semantics increases when organizations become more dependent on information, be it for carrying out primary processes or for management and control. McClowry (2008) has developed the Information Maturity Model (IMM), an information-oriented equivalent of the Capability Maturity Model (CMM). The CMM was originally developed to assess an organization's software development maturity level. The Information Maturity Model is shown in Figure 9; the description of the five levels of maturity is given within the figure.


Figure 9: The Information Maturity Model by McClowry (2008). Image is released in the public domain.

Relation between information maturity and semantics

Semantics add context to data, turning data into information. The Information Maturity Model focuses on information itself; metadata is not explicitly mentioned. As presented in chapter 4.1, metadata is key in enriching information and closely linked with monitoring information quality. As information becomes more important from level 1 to level 5, semantic metadata automatically becomes more important as well. The higher levels cannot be reached without dedicated attention to the use of semantic metadata. How information maturity reflects on semantic metadata management is shown in the list below:

 At level 1 the organization has no common information practices. All organization of data is solely the initiative of individuals. The use of metadata, including semantics, is ad hoc.
 At level 2 there are some information management practices. Certain elements within the organization are aware of the importance of information for the primary processes. Yet there is no shared approach and metadata is not shared or reused.
 At level 3 the organization is aware of the importance of information for the primary and management processes. At this level policies, procedures and standards exist throughout all parts of the organization and information management is supported by IT. This results in the structural use of administrative and structural metadata and a degree of semantic metadata.
 At level 4 information is managed as an organizational asset and staff is heavily engaged in information management procedures, including metadata management. Processes and structures that aid in management and quality control exist.
 At level 5 information is managed as one of the dominant organizational assets and is part of the organizational strategy. Metadata is managed by well coordinated processes and supported by dedicated tools. Information and metadata are preemptively shaped to the needs of the organization.

Relation to networks and information chains

The Information Maturity Model relates to a single organization, but its principles can be extended to information chains as well. In some respects a PPIC is similar to an information chain within a single large organization: the information is transferred among several departments, or partners, that may act as standalone entities with different responsibilities, procedures, frames of reference, standards and technology. The lower levels of the framework reflect how many organizations have historically exchanged data, for instance in full text with a lot of room for inconsistencies. The higher levels show increased standardization, for which more standardized processes and data models are used.

Topic B: Common set of semantics

 Use generic metadata as intermediate level between business needs and implementation.
 Generic metadata derived by subject-matter experts should translate into implementation.
 There should be a leading data model from which all implementations are derived.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.

5.1.3 Business rules

Semantics are of most value to experts in the primary process, the actual end users of the exchanged information. Semantics do not only provide context to human end users, but also to computer systems. This allows experts to be supported by dedicated expert systems. An expert system is a software system that supports human expertise and is usually found in knowledge intensive organizations. These systems serve niche markets and are often very specific, unlike accounting, document management and other systems that can be found in nearly any organization. Many expert systems use semantics to apply business rules.

Expert systems are usually divided into subject-matter support and management support systems. The first category relates to systems that support knowledge workers in their tasks, which are usually part of the primary process. Management support systems are more generic systems which are used to provide information for management decisions and accountability. Both types of systems use similar techniques but apply them to different data. Subject-matter systems usually pertain to a subset of specific application data sets, while management support systems are usually fed by data marts that combine selected data from several application data sets (Kimball, et al., 2002). Witten and Eibe (2005) describe a number of techniques that are an important part of many expert systems. The following techniques require metadata for functioning:
 Projections of existing and future trends, combining aggregated data, data analysis and models.


 Reasoning using inference rules for forward and backward chaining.
 Quasi-probabilistic judgments/classification and reasoning using Bayesian or fuzzy logic.
 Use and creation of decision models that can be based on either statistical data or business rules.
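Forward chaining, one of the rule-based techniques listed above, can be illustrated with a minimal sketch: rules fire whenever their conditions are satisfied by the known facts, until no new conclusions appear. The rules and facts below are invented for illustration only.

```python
# Each rule: (set of required facts, fact that may be concluded).
rules = [
    ({"return_client"}, "discount_allowed"),
    ({"discount_allowed", "in_stock"}, "offer_discount"),
]
facts = {"return_client", "in_stock"}

# Repeatedly fire rules whose conditions hold until nothing changes.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print("offer_discount" in facts)  # True
```

The semantic definitions of the terms used in the rules ("return client", "in stock") determine whether the initial facts may be asserted at all, which is where the metadata comes in.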

Need for consistent semantics

Within and among organizations existing definitions may differ. This may cause communication problems and misunderstandings, which may lead to loss of opportunities or even harm performance due to faults. The effect of conflicting definitions is similar to that of low data quality (Silvola, Jaaskelainen, Kropsu-Vehkapera, & Haapasalo, 2011). A clear and consistent data model, regarding both data and metadata, is required when experts cooperate. Therefore semantics must be determined and agreed upon by professionals from the primary process.

Business rules and semantics

Semantic metadata is closely linked to business rules and other types of rule based logic. Business rules are used to structure and control business processes. They can be carried out by people or be implemented in IT-systems (Borghoff & Pareschi, 1997). Business rules are in place to deliver the desired outcome in the primary processes as planned at the macro level, the organizational strategy (Papazoglou & Ribbers, 2008). Despite the use of the word 'business' this applies to private and public organizations alike.

Semantic metadata is related to business rules because definitions and context do not only apply to data but also to rules. A rule such as "every return client is entitled to perform action X without further checks" gets a different meaning depending on whether a client is defined as a "natural person" or as a "natural person aged 18 or older". Additionally, business rules may be incorporated within semantic metadata. In economics a term such as "net profit" is generally defined as "net income minus net loss", a definition that includes a formula. This relation is reciprocal, since the same business rule links to several definitions: the business rule that checks whether the net profit is correctly derived (net profit = net income - net loss) involves three concepts that each have a semantic context. According to Kimball (2002) consistency among metadata and business rules can be maintained by defining and specifying the relations between the two.
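The dependence of a business rule on the underlying definition can be made concrete with a small sketch. The following Python fragment applies the return-client rule from the text under the two competing definitions of 'client'; the person record and function names are illustrative assumptions, not taken from any real system.

```python
def is_client(person, definition):
    """Apply a semantic definition of 'client' to a person record."""
    if definition == "natural person":
        return person["is_natural_person"]
    if definition == "natural person aged 18 or older":
        return person["is_natural_person"] and person["age"] >= 18
    raise ValueError("unknown definition")

def may_perform_action_x(person, client_definition):
    # Business rule: every return client is entitled to perform action X
    # without further checks.
    return person["is_return_client"] and is_client(person, client_definition)

# A 16-year-old return client: the same rule yields different outcomes.
person = {"is_natural_person": True, "age": 16, "is_return_client": True}
print(may_perform_action_x(person, "natural person"))                   # True
print(may_perform_action_x(person, "natural person aged 18 or older"))  # False
```

The rule itself never changed; only the semantic metadata behind the word 'client' did, which is exactly why the two must be managed together.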

De Leenheer (2010) specifies four levels of metadata: conceptual, relational, technical and operational. This model is depicted in Figure 10, in which it is applied to the architecture design of the Belgian tax office. In this model the semantics are found at the conceptual level. De Leenheer views business rules as rules and patterns among concepts and positions them at the relational level. These in turn are translated into technical models, which are implemented at the operational level.


Figure 10: Metadata levels in the business semantics management model. Derived from De Leenheer (2010).

Topic C: Business rules

 Semantics (definitions) are a source for business rules.
 Business rules should be linked to semantics in order to safeguard their meaning.
 Semantic metadata should be linked to business rules and vice versa.
 Business rules must be kept at generic level to enforce consistent implementation.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.


5.2 Semantic metadata management: Technology perspective

The second category of elements that make up a semantic metadata management reference architecture is technology. Technology is an enabler for semantic metadata and data exchange. Additionally, technology in the form of tooling may help to carry out semantic metadata management.

5.2.1 Standards and interfaces

External semantic metadata must be linked in some way to the core data. Using a standardized interface allows applications and data models to be mapped to a standardized set of semantic metadata, lowering transaction costs in data exchange. This section introduces link types and their role in the standardization of semantic metadata.

Link types

Metadata needs to be linked in some way to the data it relates to. It must be known to both man and machine what relations exist between the data and the metadata (Sen, 2002). In the absence of links the information is lost. There are various ways of linking data to metadata. Five generic methods for linking are described in this paragraph: text, forms, tables, schemas and tags. The vast majority of implementations are variations on these five generic practices.

The most natural way for a person is to use text, for instance describing the origin, units, context and definitions. In a text the data and metadata are linked through grammar and are usually located close together. Text may exist on paper and digitally. The same goes for an alternative to text: forms. Forms are structured formats in which data is entered. The headers of the various sections usually act as metadata, indicating what the meaning of that bit of text is. Many forms require a date, with the label 'date' acting as metadata for the actual date that is entered.

In IT-systems the data and metadata do not need to be in the vicinity of each other and the links do not have to be comprehensible for humans. The system needs to be able to comprehend the links and the presentation should be comprehensible for the end user (Elmazri & Navathe, 2007). The actual data model and the presentation may be very different. The most common form of linking data to metadata is the use of tables in a database. Metadata can be found both in column headers and in another column of the same row. Instead of a table, other structures can be used as well. Schemas in data systems are definitions of structure, content and optionally some semantics. A schema may encompass some degree of semantics in a similar way as the headers in a database, but mostly indicates what metadata is linked to the data.
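The table-based linking described above can be sketched with a separate metadata table keyed by column name, so that the system (not the human reader) maintains the link. The schema and values below are illustrative assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A data table and a separate table holding semantic metadata per column.
con.execute("CREATE TABLE profits (company TEXT, net_profit REAL)")
con.execute("CREATE TABLE column_metadata "
            "(column_name TEXT, definition TEXT, unit TEXT)")
con.execute("INSERT INTO profits VALUES ('ABC group', 600)")
con.execute("INSERT INTO column_metadata VALUES "
            "('net_profit', 'Income minus costs', 'euro')")

# The link between data and metadata is the shared column name.
row = con.execute("SELECT definition, unit FROM column_metadata "
                  "WHERE column_name = 'net_profit'").fetchone()
print(row)  # ('Income minus costs', 'euro')
```

Here the metadata survives independently of any single data row, which is the property the external-metadata discussion later in this chapter builds on.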

Aside from fixed structures such as tables and forms it is also possible to loosely couple data and metadata. The most common type of loose coupling is the use of tags. In the setting of internal metadata a tag is semantic metadata itself, usually a single word or phrase that identifies, classifies or describes the related data. In the Netherlands the metadata initiative is used to make government information accessible on the web through tags (ICTU, 2006). Regarding external metadata a tag is a link to semantic metadata located elsewhere. In many XML standards links to external metadata are established by uniform resource locators, very similar to those used to identify websites (Debreceny, Felden, Ochocki, Piechocki, & Piechocki, 2009). Instead of linking to a website the URL links to a definition that is comprehensible to either man or machine.
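The URL-based loose coupling described above can be sketched as an XML element that carries a reference to an externally maintained definition, in the spirit of the XBRL-style standards discussed in this section. The element names and URL below are fictitious.

```python
import xml.etree.ElementTree as ET

doc = """
<report>
  <netProfit ref="http://example.org/taxonomy#netProfit" unit="euro">600</netProfit>
</report>
"""
root = ET.fromstring(doc)
element = root.find("netProfit")

print(element.text)        # the core data: 600
print(element.get("ref"))  # the link to the external semantic definition
```

The receiving system resolves the `ref` attribute against the shared taxonomy, so the definition itself never has to travel with the message.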


Standardization of semantic metadata

External metadata is usually standardized, which makes it easier to exchange alongside the core data. Standardization is an answer to a coordination problem. First of all, standardization leads to reuse, lowering the quantity of semantic metadata and reducing complexity. Second, information exchange is made easier since compatibility and interoperability are improved. Finally, standardization may improve information quality.

Standard   Description
XML        The generic eXtensible Markup Language that is used in countless applications and web services.
XBRL       eXtensible Business Reporting Language. An XML standard used for financial reporting.
UBL        Universal Business Language, a standard used for transactions and invoices.
HR-XML     Human Resource XML. A standard used for exchanging human resource and employment related information.
ebXML      Electronic Business using eXtensible Markup Language.

Table 4: Overview of XML based semantic metadata standards. Created by author.

Many semantic metadata standardization efforts are based on the XML format, as shown in Table 4. Besides XML based standards there are many others, but XML is the most dominant. XML is popular since it intrinsically carries metadata along with the data, and it calls for agreements and definitions among stakeholders (Bakker, 2006). XML is used in most web service technologies, and in its role as data carrying format in web services it has proved very useful. The inclusion of semantic metadata probably contributed to its success. This inclusion also forced parties to adapt their technological infrastructure for these semantics, compelling them to draft the aforementioned agreements and definitions. As such XML presented a bottom-up approach to semantic metadata management. According to Bakker, XML is not the solution to the challenges regarding the use of semantic metadata among multiple stakeholders; it merely enables a better approach.

Topic D: Interfaces/standards

 The metadata should be independent from (existing) technological implementations.
 There should be a loose coupling with the technical implementation.
 The metadata should be independent from data standards (including rules) in use.
 Use a flexible infrastructure based on interfaces.
 The metadata management effort should be independent of the standards in use.
 Standardize information formats as much as possible.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.


5.2.2 External semantic metadata

As presented in the scope of the research area, this thesis covers only external semantic metadata. External semantic metadata has a low level of granularity and is loosely coupled to the data it relates to. This section provides insight into two aspects: granularity and externality. Together these aspects define external semantic metadata and its specific added value.

Granularity

Semantic metadata may relate to entire clusters of data (documents and collections), to individual facts (numbers, words), or to anything in between (paragraphs in a document). The granularity is the level of detail the semantic metadata has, more specifically the level to which it is structured.

As with structural metadata, external semantic metadata is usually designed, i.e. specified, beforehand. This may be a list or an entire data model that includes hierarchy and relations. Figure 7 shows an instance of a very rudimentary profit statement. The core data shown in blue is what it is all about; without the context provided by the metadata (green/orange) these three figures would be meaningless. The values in green are quite specifically linked to the core data, with the three values ranging from high to low granularity. The values in orange have the lowest granularity since they apply to every instance of this type of profit statement and may even be used in a variety of other information products.

Even within external semantic metadata there is a range of granularity. For instance, the semantic 'address' has a very low level of granularity. When an address is divided into 'street', 'postal code' and 'city' the granularity is much higher. Even then it might be possible to divide 'street' into 'street name', 'number' and 'suffix'. As granularity rises, the quantity and specificity of the semantics increase as well.
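The address example above can be sketched at two granularity levels; as the data is split over more specific semantics, the number of semantic elements grows with it. The field names are illustrative, and the sample address is the faculty address from the title page.

```python
# Low granularity: one semantic ('address') covers the whole value.
low = {"address": "Jaffalaan 5, 2628BK Delft"}

# Higher granularity: the same value split over more specific semantics.
high = {"street_name": "Jaffalaan", "number": 5, "suffix": "",
        "postal_code": "2628BK", "city": "Delft"}

# Rising granularity means more, and more specific, semantic metadata.
print(len(low), len(high))  # 1 5
```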

Internal vs. external semantic metadata

Metadata can be either internal or external. Internal metadata is embedded with the core data itself, while external metadata is stored alongside the core data, as shown in Figure 8. Internal and external metadata are abstract concepts; they are best clarified by an everyday life analogy regarding price tags in a clothing store. With internal metadata the price tag would be attached to the t-shirt (and thus part of the shirt), indicating the price, size, brand and so on. This setup requires that every t-shirt has its own price tag, which takes a lot of effort to control uniformity and adds complexity. This way of adding metadata to t-shirts works very well when there is much diversity in the store and there are very few duplicates. In the case of external metadata the price tag is on the clothes rack and applies to every t-shirt on that rack. In this setup far fewer price tags are needed, it is easier to observe the price level of the entire store and when the price changes only one sign needs to be changed instead of a multitude of individual price tags.
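The price-tag analogy translates directly into the two storage patterns; the sketch below contrasts metadata repeated on every item with metadata stored once and shared. All names and values are illustrative.

```python
# Internal metadata: every t-shirt carries its own tag.
shirts_internal = [
    {"item": "t-shirt", "size": "M", "price": 10},
    {"item": "t-shirt", "size": "L", "price": 10},
]

# External metadata: one tag on the rack applies to every shirt on it.
racks = {"A": {"price": 10}}
shirts_external = [
    {"item": "t-shirt", "size": "M", "rack": "A"},
    {"item": "t-shirt", "size": "L", "rack": "A"},
]

# A price change now touches one record instead of every individual item.
racks["A"]["price"] = 8
print(racks["A"]["price"])  # 8
```

With the internal variant the same change would have to be applied to every shirt record, with the attendant risk of inconsistencies.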

The example shows that both internal and external metadata have their own sets of benefits and drawbacks. In practice this means that a mixture is used. Combining the observations of Kimball and Brandt, the optimal management approach can be determined by three factors, shown in Figure 11.


These factors lead to three categories:
1) Semantics which are generic and very common make up a small quantity of all semantics. The low granularity makes them applicable to a lot of core data and other metadata. In the example shown in Figure 7 these would be all semantics in orange.
2) Semantics with lower granularity but with significant levels of reuse can be considered part of the data, thus be kept internally, but can be patterned after a master data file. In Figure 7 these would be the "ABC group" and "euro".
3) Finally, the majority of semantics are (nearly) unique, such as the date "19-12-2008". For this type of metadata standardization is not of any use.

With a single instance of Figure 7 this may be hard to see, but with a thousand instances of Figure 7 all external metadata can be reused, there could be about a hundred company names and all timestamps are probably unique. The exact boundaries of each category are hard to determine and may be worthy of further research.

Figure 11: Overview of optimal application of internal and external metadata. Created by author.

Topic E: External metadata

 The metadata should be stored and managed independently from the core data.
 Semantic metadata should be linked to the data model.
 Semantic metadata should be exchanged among partners in the information chain.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.


5.2.3 Tooling

Tooling is a form of technology that may aid semantic metadata management processes. First the possible functions of tooling are described, followed by metadata repository standards. This section ends with a reflection on the availability of tools. An example of metadata tooling is provided in appendix 12.1.5.

Functions of tooling

Semantic metadata management can be supported by tooling. Such a tool is a dedicated piece of software that aids in managing a set of external semantic metadata. Tooling is able to support semantic metadata management in a variety of ways. All metadata management activities can in theory be carried out by people; tooling may reduce that effort significantly and produce better results. Based on literature (Pieter De Leenheer, 2009; Hepp, et al., 2008; Kimball, et al., 2002) six functions for a semantic metadata management tool have been identified: repository, access, relation management, business rules, versioning and translation. A single metadata management tool may fulfill one or more of these functions; if no single tool covers all of them, a mix of compatible tools may be used side by side. Yet according to Kimball it is uncommon for all functions to be supported by adequate tooling.

Identified functions for tooling:
1) Repository. A dedicated tool that acts as a collector for all semantic metadata at the conceptual level.
2) Access. A tool that makes semantic metadata available to human end users in a way that better suits human consumption, for instance for review or for use in the primary process.
3) Relation management. A tool in which relations among semantic metadata can be defined and reviewed.
4) Business rules. A tool that maps semantic metadata to business rules or even stores business rules.
5) Versioning. A tool that stores all past versions, maintains a change log and aids in the publication of a new version.
6) Translation. A tool that supports the export of generic level semantic metadata to operational systems and aids in the translation to other standards.
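Two of the functions listed above, the repository and versioning, can be combined in a minimal sketch: a store of current definitions that keeps a change log of every published revision. The interface is an illustrative assumption, not based on any existing product.

```python
class MetadataRepository:
    """Toy repository for semantic metadata with a simple change log."""

    def __init__(self):
        self.current = {}   # concept -> current definition
        self.history = []   # change log: (concept, old definition, new definition)

    def publish(self, concept, definition):
        old = self.current.get(concept)
        self.history.append((concept, old, definition))
        self.current[concept] = definition

repo = MetadataRepository()
repo.publish("net profit", "income minus costs")
repo.publish("net profit", "net income minus net loss")

print(repo.current["net profit"])  # net income minus net loss
print(len(repo.history))           # 2
```

A real tool would add the remaining functions (access views, relation management, export to operational standards) on top of such a core.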

Metadata repository standards

Most semantic metadata standards focus on the data model and schemas. Section 5.2.1 indicates that a lot of semantic metadata standards exist. Aside from using a standard to describe the data model it is also possible to standardize the capabilities and processes regarding a metadata repository. Unlike data model standards, there are hardly any standards relating to repositories and tools, except for a single ISO standard. The ISO has a standard for metadata registries, abbreviated by the ISO as MDR, called ISO 11179 (ISO/IEC, 2004). In the terms of the section above, the registry would be a tool that combines the roles of repository and access. Each data element in an ISO/IEC 11179 metadata registry:
 should be registered according to the Registration guidelines.
 will be uniquely identified within the register.
 should be named according to Naming and Identification Principles.


 should be defined by the Formulation of Data Definitions rules.
 may be classified in a Classification Scheme.
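The registry requirements above can be caricatured in a few lines: every registered data element receives an identifier that is unique within the register, along with its name, definition and optional classification. This is a drastic simplification of ISO/IEC 11179 for illustration only; the class and field names are invented.

```python
import itertools

class Registry:
    """Toy registry: unique identification plus name/definition/classification."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.elements = {}

    def register(self, name, definition, classification=None):
        element_id = next(self._ids)  # unique within this register
        self.elements[element_id] = {"name": name, "definition": definition,
                                     "classification": classification}
        return element_id

reg = Registry()
eid = reg.register("net profit", "income minus costs", "financial")
print(eid, reg.elements[eid]["name"])  # 1 net profit
```

The actual standard additionally prescribes registration procedures, naming principles and definition-formulation rules that a sketch like this cannot capture.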

The de facto standard model for data integration platforms and exchange among large enterprises is the Common Warehouse Metamodel (CWM). The ISO standard matches the CWM standard, but is a higher level description of what features should be present. It is not a model that is easily implemented and as such it has seen little adoption among corporate systems. On the contrary, it has become popular among government agencies, mostly in the USA, Australia and the UK (Sbodio, et al., 2010). This is possibly because off the shelf metadata technologies do not match government requirements due to very specific legacy systems, prompting the design of new dedicated systems. With the design of new systems, design guidelines can be valuable, and conforming to an international standard is favorable when justifying government spending.

Availability of tools

Metadata management tools are present in most business grade database management systems. These tools run at the application level. Most of these tools relate to administrative and structural metadata. To a degree semantic metadata is present, but mostly as internal metadata within the same table (Elmazri & Navathe, 2007). High end data systems have shown a need for external metadata. An enterprise grade data warehouse usually holds several application data servers and uses data marts for easy access to core data and aggregated core data. Figure 12 shows a simplified model of a common large enterprise system with dedicated external metadata storage. The metadata management tool directly interfaces with the external metadata and has no link with the actual data or other components.

Figure 12: External metadata storage in a large enterprise data storage system, showing where a metadata management tool enters the picture. Created by author

In case of multiple parties involved in metadata management, such as in a PPIC, standard tooling does not fully meet the needs. Metadata management in cooperative systems requires multiple user access and process support (Kimball, et al., 2002). Most metadata management tools are aimed at a single data architect, not multiple users. Where cooperative systems exist they are made from scratch, as seen in the tax office case in chapter 8. In PPICs metadata is usually centrally published in taxonomies or ontologies. Such semantic metadata models are relatively new and are structured very differently from semantic metadata stored in a data warehouse. Specific tooling for maintaining such taxonomies is hardly available and is mostly made from scratch to meet specific situational demands.

Topic F: Tooling

 Metadata management processes should be supported by adequate tooling.
 Tooling should support a variety of roles and needs in the management aspect.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.


5.3 Semantic metadata management: Process perspective

The third category of elements that make up a semantic metadata management reference architecture consists of processes. In this category human interaction and cooperation play a prominent role. The first section describes how management theory looks at standardization in the primary processes of organizations, which is partly the result of metadata management. No best practices are derived from this section, since the primary processes are not within the scope of semantic metadata management. The second section is about metadata management processes and the different roles that must be fulfilled. Finally, the third section is about cooperation amongst stakeholders.

5.3.1 Management theory on standardization

Applying semantic metadata management in a PPIC is a standardization effort. There are many theories that view standardization from different angles, including inter-organizational cooperation, innovation and competitive advantage, and knowledge management. Table 5 shows that standardization is thought to have both positive and detrimental effects on organizational performance.

Economic & management theory on standardization
  Positive effects:
  • Allows for being in control over (aggregated) information
  • Allows for being in control over technology
  • Standards reduce transaction costs in chains
  • Standardization enables innovation: reuse increases value
  • Standardization reduces time to market
  Detrimental effects:
  • Standardization reduces uniqueness, loss of competitive advantage
  • Alignment requires effort and creates lock-in
  • Standardization enforces obsolescence
  • Standards are a compromise and never cover every area

Knowledge management theory on standardization
  Positive effects:
  • Knowledge elicitation benefits from categories and procedures
  • Definitions are valuable to bridge the cognitive distance between creators and users
  • Definitions and interfaces enforce quality
  Detrimental effects:
  • Knowledge workers know best: infinite options
  • Limitations change organizational mentality
  • Standardization reduces creativity and quality in creation

Table 5: Overview of theories on standardization. Created by author.

Standardization is usually driven by those who benefit most, both amongst and within organizations (Egyedi, 2003). Within organizations these are the managers and those responsible for the IT systems. Standardization of information allows for easier aggregation and for the use of business intelligence, creating better insight into the organization and improving the control of the management over the organization. This has partly been achieved by master data management [reference other chapter]. Standardization of semantics might be even more attractive to the management, since semantic metadata will not only describe the content of data but also the processes in which this data is used. The advantages for those responsible for the IT systems have other causes.


Standardization may reduce the amount of data, making it easier to manage and structure IT systems in use.

According to Farrell and Saloner (1985) standardization and compatibility are key enablers of innovation: they reduce operational costs and provide a new starting point for even more advanced technologies. In their macroeconomic view they pose that standardization enables reuse. Reuse in turn adds value to both the users of the standard and society as a whole. Uniqueness commands a premium, as the fixed costs are relatively higher. This premium might seem desirable, but reuse allows the fixed costs to be spread over a larger number of products. Reduced price levels force competitors to develop new products. Increased gains allow for more spending on new products and innovation by those embracing innovation. From the market side, cheaper products allow the savings to be spent on new innovative products, which usually demand a risk or low-scale production premium at first. Additionally, standards reduce the dissimilarity of products and increase knowledge about the product, resulting in a market that performs better.

Aside from these positive effects, the literature indicates that there are also drawbacks to standardization. Organizational alignment requires an effort which is hard to trace to the products and their price level. It forms part of the overhead costs, which are often divided unequally over all products. Alignment may also create lock-in situations, in which potentially better performing alternative standards exist but are not adopted due to the sunk costs of aligning with and adhering to the standard in use. Farrell and Saloner (1985) also pose that industries can be trapped in inferior standards, but that these will eventually be overcome on the premise that information is complete. Those who use the standard should know their organizational environment well.

Apart from the costs and efforts relating to standardization, the use of standards itself may be detrimental to performance. Standards are a compromise to satisfy a group of users; this holds true both within and amongst organizations. A standard never perfectly fits all end user needs, especially when the standard favors a certain group more than others, as is often the case when a standard is forced on others (Egyedi, 2003). Another hazard of standardization is that it may reduce uniqueness, meaning that a competitive advantage or economic specialization might be lost.

Knowledge management and standardization From the field of knowledge management there are various views on standardization. Bessant and Tidd (2007) pose that for information products the knowledge workers need total freedom of expression in order to produce the most creative work and the best level of quality. This is in line with the infinite options theory, which poses that the combination of procedures, accountability, time limits and so on poses unnecessary barriers that limit the design space of professionals and potentially result in suboptimal performance. Standardization imposes new rules to add to this list. Semantic metadata management is a standardization effort that may reduce the available vocabulary and thus limit the design space. The extent of this limitation depends on the granularity, as shown in chapter 5.2.2.

Standardizing contents or processes has an impact on the organizational mentality when producing information products. The atmosphere in which tasks are carried out changes when guidelines are in

place. Standardization may reduce creativity, imagination, conceptual thinking, curiosity, effort, judgment, commitment, nuance, reflection or a number of other valuable attributes (Vanderfeesten, Reijers, & Van der Aalst, 2010).

Contrary to the theories presented above, there are knowledge management theories that do embrace the use of standards for capturing information. Knowledge elicitation is the effort of capturing expert knowledge in information products. This term is mostly linked to filling a knowledge repository, but the creation of every information product can be considered knowledge elicitation (Bessant & Tidd, 2007). Knowledge elicitation improves in speed and quality when the documents used to capture that information are well structured (Universiteit van Amsterdam, 2010). Additionally, definitions are valuable to bridge the cognitive distance between creators and users of information. In line with the notions of division of labor and specialization of individuals within organizations, standardization plays an important role here. It becomes harder to understand each other when the cognitive distance increases. Two lawyers with the same specialization working in the same department will understand each other fine. Lawyers with different specializations working in different organizations will have more difficulty understanding each other. Having a reduced, common, well documented overview of semantics aids in bridging communication barriers. A third reason for standardization is that the use of standards such as taxonomies and ontologies may enforce quality through uniformity and a stronger link between definitions and meaning.

Conclusion on management theories on standardization Standardization has both advantages and disadvantages. Allowing for further innovation, improved insight into and control over the organization, and increased quality of knowledge elicitation are advantages that are desirable to any organization. On the other hand, imposing barriers on the professional judgment of knowledge workers, and standards that do not fully meet operational needs, are serious threats to operational performance. Given the number of variables, the design space for semantic metadata management should allow for standardization in such a way that many of the benefits materialize and most disadvantages are avoided. Therefore the reference architecture should leave as much room as possible to those using it to implement semantic metadata management so as to incorporate or avoid certain characteristics.

5.3.2 Metadata management processes The metadata management processes revolve around managing the lifecycle of the common set of semantics. Versioning is one of the major activities in metadata management. Through verification and versioning the match between semantics and actual meaning is retained.

Metadata lifecycle Semantic metadata is never static; it has a lifecycle. Much of the metadata management effort is spent on versioning. The organizational structure, primary process, actor constellation, technology and all other components will change over time. Having more stakeholders, as in a PPIC, makes changes more frequent, and they may potentially impact more processes and stakeholders. Versioning must be incorporated into the management design, since it will be a recurring activity. If not well embedded within the management structure, versioning may be detrimental to quality. Change management requires well defined protocols, which are agreed upon by all involved stakeholders.


The metadata lifecycle consists of six phases (Luther, 2009):
• Specification. Semantics are created by those with subject-matter knowledge.
• Application. Semantics are applied, being used to provide context for core data.
• Duplication. Semantics are duplicated when used over multiple systems, or when printed out on paper.
• Verification. The relation between the definition and its actual use must be periodically verified.
• Evolution. The meaning attached to semantics may change over time when the meaning of the content changes or perspectives on the contents change.
• Deletion. Eventually every type of metadata will become obsolete and will be discarded.
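Read as a small state machine, these six phases can be sketched in code. The transition table below is an assumption made for illustration (Luther does not prescribe one); it merely shows how tooling could enforce, for instance, that semantics are verified before they evolve or are deleted.

```python
from enum import Enum

class Phase(Enum):
    SPECIFICATION = 1
    APPLICATION = 2
    DUPLICATION = 3
    VERIFICATION = 4
    EVOLUTION = 5
    DELETION = 6

# Hypothetical transition table: a definition moves forward through the
# lifecycle; verification can lead back to application, to evolution
# (a new version), or to deletion. Deletion is terminal.
TRANSITIONS = {
    Phase.SPECIFICATION: {Phase.APPLICATION},
    Phase.APPLICATION: {Phase.DUPLICATION, Phase.VERIFICATION},
    Phase.DUPLICATION: {Phase.VERIFICATION},
    Phase.VERIFICATION: {Phase.APPLICATION, Phase.EVOLUTION, Phase.DELETION},
    Phase.EVOLUTION: {Phase.APPLICATION},
    Phase.DELETION: set(),
}

def can_move(current: Phase, target: Phase) -> bool:
    """Return True if the lifecycle allows moving from `current` to `target`."""
    return target in TRANSITIONS[current]
```

A management tool built on such a table could reject, for example, deleting a definition that was never verified, making the lifecycle explicit rather than implicit in working procedures.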

De Leenheer (2010) has created a model for both versioning and maintaining organizational alignment, called Business Semantics Management. In this model a top-down view is employed. Semantics are specified during a reconciliation phase and verified during the application phase.

Figure 13: Business semantics management cycles. From De Leenheer (2010).

Metadata management roles According to Silviola (2011) data management roles within organizations are imperative. Semantic metadata management is seen as a key component in being able to manage the core data. Silviola does not provide any details on what specific roles should be present. Janssen, Gortmaker & Wagenaar (2006) have identified eight types of roles for web service orchestration in public administration. These observations are very valuable, since that context is very similar to this research. First, the public administration/e-government setting meets the criteria of the PPIC definition. Second, web services are one of the premier means of electronic data interchange. Third, the means of cooperation is not only at the technical level but is also strongly related to the content and primary processes.

The roles in Table 6 have been tested on the case studies in chapters 9 and 10, with an updated table shown in the tradeoffs of section 9.4.


Initiator and enabler role
This role is to convince and stimulate agencies to participate in and commit to an automated process execution. Some organizations might initially resist the idea to use Web service orchestration technology for improving cross-agency processes. This might be due to a lack of knowledge, but also healthy suspicion. Often it is necessary to educate agencies on the basics of the technology and to show the potential advantages.

Developer role
This role is about defining the requirements for each agency in order to enable cross-agency processes. This role involves the identification of the organizations and departments involved and determines the interests, objectives, and requirements for each of them.

Standardization role
Technology interface standards should be determined and set as a standard. Existing systems can be selected as standard, but it can also be better to develop and impose new, preferably open standards.

Control and progress monitoring role
The time-dependent sequence of activities performed by agencies needs to be managed. This role should control the sequence of Web service invocations and collect progress and status information. All unexpected events, such as non-availability of Web services, should be tracked as soon as they occur and analyzed to determine what actually did happen and why, to ensure reliable cross-agency process execution.

Facilitator role
This role facilitates the implementation of cross-agency processes by collecting and disseminating best practices, reference models, and reusable system functionality such as identification, authentication, and payment. Ideally, functionality and databases are shared when possible and duplication of efforts is avoided.

Service and product aggregator role
There should be a one-stop shop that provides a consistent point of aggregation and is equipped with logic to meet customers’ needs. Needs should be analyzed and translated into product and service requests, and related products and services should be recommended, multiple processes started, status information provided, and the results of each process aggregated into a single answer. For this purpose the services and products should be bundled into one large catalogue and rules determined to translate citizens’ and business’ needs into the appropriate multiple cross-agency processes.

Accountability role
As a general rule in modern societies, governmental decisions should have accountability. This role should ensure that the motivations behind decisions made by each agency and the performance and outcomes of the complete cross-agency process can be accounted for.

Process improvement role
Changes in processes and governmental rules often affect more than one agency. This role should maintain an overview of the cross-agency processes and define mechanisms and procedures to assess the implications of changes in law, technology, and other developments. This role initiates complex transformation processes to restructure the public sector.

Table 6: Roles identified in web service orchestration by Janssen, Gortmaker & Wagenaar (2006).


Topic G: Versioning

• The metadata management approach should be able to provide insight into the impact of versioning.
• There should be a process/protocol for versioning.
• There should be a process/protocol for changes to the metadata architecture.
• Besides the current metadata model, the historic versions should be accessible.
• The metadata architecture should be flexible in order to respond to changing needs.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.
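As a minimal sketch of the versioning practices above, the hypothetical `Definition` class below keeps historic versions accessible next to the current one, and `impact` gives a crude insight into which forms or systems would be affected by a new version. The class and the usage registry are illustrative assumptions, not part of any cited tooling.

```python
from dataclasses import dataclass, field

@dataclass
class Definition:
    term: str
    text: str
    version: int = 1
    # Older (version, text) pairs stay accessible next to the current model.
    history: list = field(default_factory=list)

    def revise(self, new_text: str) -> None:
        """Create a new version while preserving the historic one."""
        self.history.append((self.version, self.text))
        self.version += 1
        self.text = new_text

def impact(definition: Definition, usages: dict) -> list:
    """Hypothetical impact analysis: list the forms/systems that reference
    the term and would be affected by a new version."""
    return usages.get(definition.term, [])
```

In a PPIC the `usages` registry would itself be shared metadata, so that each partner can see the chain-wide impact of a proposed change before the versioning protocol is started.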

5.3.3 Cooperation with partners This section explains the stakeholder complexity in semantic metadata management beyond organizational boundaries. Best practices from other inter-organizational standardization efforts are presented as potential solutions.

Dealing with stakeholder complexity Most literature on semantic metadata management assumes that the designer or the management is able to take any decision that is desired. The project management approach that most literature uses assumes a form of hierarchy. With or without consultation among those involved, the designer or management takes a decision on what the end product will be like and how it is to be achieved. In practice the decision making process is not that simple, even when on paper it is deemed hierarchic. Within most hierarchic organizations there are departments with conflicting interests, or experts with a knowledge advantage compared to the management (De Bruijn & Ten Heuvelhof, 2007).

Since PPICs span multiple organizations, hierarchic decision making is even more difficult. Hierarchy is replaced by joint decision making, a process that is more laborious than hierarchical decision making and has less predictable outcomes. Cooperation amongst multiple stakeholders requires a form of trust, whether or not achieved by letters of intent or contracts. The mixture of both public and private organizations means that there are differences amongst organizational interests, information needs and mentality. The diversity of stakeholders makes cooperation more difficult, but also allows for interesting opportunities.

The stakeholder complexity in semantic metadata management in PPICs is illustrated in appendix 0. Four domains of potential stakeholder related issues are listed. According to De Bruijn, Ten Heuvelhof, & In 't Veld (2007) there are many options to deal with the listed stakeholder related issues. Instead of starting off with a substantive discussion, a process design leaves most of the substance and details of the agenda until a level of trust and shared goals has been built. Additional approaches presented by De Bruijn are substantive, command and control, and project management.


Irrespective of the chosen process approach and the stakeholder constellation, the reference architecture ought to be applicable. The impact on the reference architecture is that a degree of leeway is required. The reference architecture is a balancing act. On the one hand there must be enough rigor to provide interoperability. On the other hand there must be enough leeway in the designs that follow from its use to fit organizations with different characteristics or changes in functional requirements. The mixture of rigor and leeway has been achieved by using both design principles and tradeoffs.

Lessons from standards consortia The previous section poses many potential challenges relating to cooperation in networks. There are many theories available on cooperating with stakeholders in various constellations, ranging from management in networks (De Bruijn & Ten Heuvelhof, 2007), to project management, to policy measures (Janssen et al., 2010), to open cooperation models (Bessant & Tidd, 2007). This section draws lessons for this specific context from endeavors with similar complexity in a near-identical setting. These lessons can be used in combination with the mentioned theories.

Semantic metadata management is a relatively new phenomenon, especially when carried out in an information chain such as a PPIC. However, it is not the first standardization effort in chains. It is not even the first standardization effort regarding technology in the very same setting of PPICs. Janssen, Gortmaker & Wagenaar (2006) have already identified challenges and growth stages in web service orchestration in public administration and PPICs. The wide-scale use of web services is one of the reasons semantic metadata management is an issue.

Regarding stakeholder relations, the main area for drawing lessons is standards consortia. Standards consortia are conglomerates of stakeholders who participate together in a standardization effort. They differ from standard setting organizations in that they are usually a single-issue conglomeration and operate at a much smaller scale. Often they do not only result in a standard, but also in a form of pact or declaration of intent.

Egyedi (2003) describes three features of standards consortia that might benefit metadata management in public private information chains:
• First, consortia impose a form of clarity within the actor constellation. They turn a difficult situation into one which is easier to understand. Instead of viewing the situation from a holistic view, which is hard to grasp, the problem is cut into several topics (technological standards, data formats, processes, etc.). For each of these fields conformity is measurable.
• Second, the use of interfaces allows for individual alignment. Conforming to the interfaces defined between stakeholders allows each stakeholder to do a lot of work on their own. This limits the need for cooperation, reducing time and effort. Standards lead to pre-planned emergent behavior: if each link in the chain conforms to the agreements, the whole chain will function in a new manner profitable to all.
• Third, consortia act as a high level steering committee. This functions as a platform in which decisions are made and interests are weighed. It allows the conglomerate to cope with the diversity of actors and to manage the various interests. Alignment between many organizations at once reduces the overall inter-organizational alignment effort.


Topic H: Cooperation with partners

• The metadata management processes and coordination structures must be defined and implemented.
• The goals of the semantic metadata architecture should be defined and accepted.
• Roles and responsibilities among organizations should be present and formalized.
• All required roles and responsibilities should be filled to an adequate degree.
• Management processes and protocols should be defined and respected.
• The ownership model should match the PPIC needs.

The table above summarizes the best practices distilled from the literature discussed in this paragraph. These best practices form one of the topics in the preliminary reference architecture shown in chapter 6.


6 Preliminary architecture The preliminary reference architecture has been based on literature and several expert opinions. Chapter 6.1 explains that the preliminary architecture is a snapshot of the reference architecture halfway through the research process. Chapter 6.2 lists the topics and their best practices that have been distilled from the research in chapter 5. Finally, chapter 6.3 describes the relation between the topics and the final reference architecture.

6.1 Reference architecture design process This section gives an overview of the steps that were taken to develop the preliminary and validated reference architecture. The development of the (preliminary) architecture was an iterative process. New insights were continuously gained from literature, case studies and experts. Literature provided a lot of insight and best practices on many individual topics related to semantic metadata management. How these topics relate to each other and which topics should be leading was hard to find out: a holistic view was missing. This view could be gained from the case studies. Reality forces designers of metadata management processes to consider the relations between all elements included in the design. Using these observations, the earlier knowledge gained from literature was put into perspective. This in turn led to a search for literature in areas that had not yet been explored earlier in the research process. For that reason the design of the preliminary architecture and the drawing of lessons from the case studies form an iterative process, as is stated in chapter 2.2.4 on research methodology.

Additionally, along the way the final format and scope of the reference architecture became clear. In this thesis the preliminary reference architecture is presented as if it were a single design (version A), which was then tested and turned into an evaluated reference architecture (version B). In reality the design was constantly updated, modified, rephrased and changed. There were no two designs (A & B) but many iterations. The preliminary architecture is a snapshot of what the architecture looked like when most of the literature study was complete. The evaluated reference architecture is the version that incorporates the lessons learned from the case studies and the comments of the expert panel.

All actions taken during the design of the reference architecture have been listed in 14 steps. These 14 steps are listed in Table 7. Appendix 0 provides the details of the actions performed in each step. This appendix is very valuable for those who want to know the rationale behind the design process.


Step 1: Deriving principles from theories and best practices in literature
Step 2: Clustering principles into similar topics
Step 3: Adding first lessons from case studies/additional literature
Step 4: Removing principles unrelated to metadata management
Step 5: Adding first lessons from case studies
Step 6: Moving some principles to preconditions
Step 7: Restructuring topics according to new insights
Step 8: Defining the tradeoffs and listing their contents
Step 9: Validation of principles in case study interviews
Step 10: Rewriting clusters into design principles
Step 11: Matching design principles to each other
Step 12: Writing design principles in TOGAF format
Step 13: Validation of principles in expert session
Step 14: Finalizing and updating the tradeoffs

Table 7: Overview of the 14 steps of reference architecture design. The blue steps relate to the preliminary phase and the red steps to the evaluation phase. Created by author.


6.2 Design propositions The principles and best practices distilled from the literature and expert interviews have been categorized into 8 topics. These design propositions are listed below and are a collection of the tables with best practices from chapter 5. Each topic has been evaluated in the case studies. The evaluation process has been presented in the previous section. Appendix 0 provides greater detail on the evaluation process. Appendix 12.1.9 shows the interview protocol that was built around these 8 topics.

Topic A: Consistency

• The data model should be consistent, with relations among metadata defined.
• Implemented semantic metadata should be linked to the data model.
• The data model should be horizontally and vertically consistent.
• There should be little or no overlap in semantic metadata.
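To illustrate how the consistency practices above could be checked mechanically, the sketch below flags overlapping definitions and semantic metadata that is not linked to any data model element. Both checks are simplified assumptions (exact-text matching on definitions, a flat set of model elements), not an implementation from the literature.

```python
def overlapping_terms(definitions: dict) -> set:
    """Return terms whose definition text is shared by more than one term,
    a crude indicator of semantic overlap (assumption: exact-text match)."""
    seen, overlap = {}, set()
    for term, text in definitions.items():
        key = text.strip().lower()
        if key in seen:
            overlap.update({term, seen[key]})
        else:
            seen[key] = term
    return overlap

def unlinked_metadata(definitions: dict, data_model_elements: set) -> set:
    """Semantic metadata that is not linked to any element in the data
    model violates the linkage practice above."""
    return set(definitions) - data_model_elements
```

A real implementation would compare meanings rather than literal text, for instance via the relations defined in a taxonomy or ontology, but the principle of an automated consistency report remains the same.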

Topic B: Common set of semantics

• Use generic metadata as an intermediate level between business needs and implementation.
• Generic metadata derived by subject-matter experts should translate into implementation.
• There should be a leading data model from which all implementations are derived.

Topic C: Business rules

• Semantics (definitions) are a source for business rules.
• Business rules should be linked to semantics in order to safeguard their meaning.
• Semantic metadata should be linked to business rules and vice versa.
• Business rules must be kept at a generic level to enforce consistent implementation.

Topic D: Interfaces/standards

• The metadata should be independent from (existing) technological implementations.
• There should be a loose coupling with the technical implementation.
• The metadata should be independent from data standards (including rules) in use.
• Use a flexible infrastructure based on interfaces.
• The metadata management effort should be independent of the standards in use.
• Standardize information formats as much as possible.

Topic E: External metadata

• The metadata should be stored and managed independently from the core data.
• Semantic metadata should be linked to the data model.
• Semantic metadata should be exchanged among partners in the information chain.
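A minimal sketch of exchanging semantic metadata independently from the core data: the metadata is serialized into a small envelope that chain partners can publish and import. The envelope fields (`model_version`, `definitions`) are illustrative assumptions; in practice a standard vocabulary format such as SKOS or an XBRL taxonomy would be a more likely carrier.

```python
import json

def export_metadata(definitions: dict, version: int) -> str:
    """Serialize the semantic metadata (not the core data) so it can be
    published to, and exchanged with, partners in the information chain."""
    envelope = {
        "model_version": version,
        "definitions": [
            {"term": t, "definition": d} for t, d in sorted(definitions.items())
        ],
    }
    return json.dumps(envelope, indent=2)

def import_metadata(payload: str) -> dict:
    """Rebuild the term-to-definition mapping from a partner's payload."""
    envelope = json.loads(payload)
    return {e["term"]: e["definition"] for e in envelope["definitions"]}
```

Because the metadata travels separately from the core data, each partner can validate incoming data against the agreed model version without ever sharing the underlying records.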

Topic F: Tooling

• Metadata management processes should be supported by adequate tooling.
• Tooling should support a variety of roles and needs in the management aspect.


Topic G: Versioning

• The metadata management approach should be able to provide insight into the impact of versioning.
• There should be a process/protocol for versioning.
• There should be a process/protocol for changes to the metadata architecture.
• Besides the current metadata model, the historic versions should be accessible.
• The metadata architecture should be flexible in order to respond to changing needs.

Topic H: Cooperation with partners

• The metadata management processes and coordination structures must be defined and implemented.
• The goals of the semantic metadata architecture should be defined and accepted.
• Roles and responsibilities among organizations should be present and formalized.
• All required roles and responsibilities should be filled to an adequate degree.
• Management processes and protocols should be defined and respected.
• The ownership model should match the PPIC needs.

6.3 From prescriptive to evaluated reference architecture The topics are distilled from the literature study. Each consists of a number of best practices on the same subject. This makes the topics prescriptive and very broad. Evaluation in the case studies allowed the topics to become more specific and mutually consistent.

Relation between topics and evaluated architecture As described in appendix 0, all topics proved relevant to semantic metadata management. However, not all best practices from literature could be represented by design principles. The applicable best practices that make up the topics can be found in the evaluated architecture in three forms:
1. All topics were turned into design principles with a much narrower scope that are mutually consistent. The design principles have the same subject as the topics but are more specific and much better defined.
2. A number of best practices were found to be relevant, but not generically applicable. These form the basis of the tradeoffs: best practices that are applicable in a certain situation or require a balance of positive and negative effects.
3. Some best practices have been rewritten as preconditions. For instance, having managerial support is not something that is designed, but should be present or be reached before the start of the project.


7 Case study 1: Child protective services This chapter presents the first of two complementary case studies. The selection criteria for the case are given in chapter 2.3. This case study was carried out at one of the 15 Bureaus Jeugdzorg in the Netherlands. Because of the long name, Bureau Jeugdzorg is hereafter referred to by the commonly used abbreviation BJz. The particular BJz in this case study covers a large geographical area in the Netherlands and employs several hundred people.

Scope BJz has a wide variety of responsibilities, ranging from minor incidental care, to probation, to providing access to various kinds of care providers, to safeguarding children’s wellbeing. This case study focused on the responsibility called Jeugdbescherming, which is best translated as child protective services. This responsibility was chosen since it fits best with the PPIC definition. Additionally, the focus in this case study lies on the information system. The information system extends beyond the technical domain, since it includes the aspects people, processes, data and technology.

Case study approach The findings in this case study are based on four ways in which information regarding this case was extracted from BJz:
• A large number of interviews with various employees at BJz. The interviews relating to the primary process include 2 secretaries, 3 family supervisors, 3 behavioral scientists and 2 team leaders. The other interviews relate to the information technology department and metadata management and include the data warehouse specialist, the application manager and an information analyst.
• Close inspection of 10 case files. Each paper case file ranges in thickness from 2 to 25 centimeters. The analyzed files are from two different locations and were selected to reflect reality. They range from simple to very complex, differ in the issues at hand and originate from various family supervisors from different teams.
• Site visits to three locations of BJz in three different cities. Several interviews were held in this setting and daily activities and difficulties were shown on site.
• Dozens of documents, including manuals on working procedures regarding child protective services, manuals for the information system IJ, forms and checklists, descriptions of the information architecture and data structure, lists and glossaries with definitions, metadata standards and formats, and aggregated data reports for internal and external use.

The preliminary reference architecture from chapter 6 has been reviewed in the light of the information gathered in this case study. During the case study notable topics were identified. A topic can be seen as a cluster that consists of an introduction, observations and findings. In the introduction the topic and its link to the architecture are presented. This is followed by observations, which are presented as factually as possible. Each topic has its own findings, in which the observations are valued. To conform with the literature study, the same approach has been used and the topics are grouped in the sections technology, data and processes.


7.1 Background
In order to understand the specific needs and complexities of the information system and the stakeholder context, background knowledge regarding child protective services is required. This section provides a brief description of the primary processes and the position of BJz in the information chain.

7.1.1 The primary processes
A child can be placed under the care of BJz in two cases. The most common situation is that the Raad voor de Kinderbescherming, the child protection council, asks the court for an ondertoezichtstelling (supervision by BJz), normally abbreviated to OTS. An OTS is requested when it is found that the child resides in a situation which is (potentially) detrimental to its physical or mental wellbeing. Various reasons can call for an OTS, including parents with drug or mental problems, neglect of proper care and food, domestic violence, child molestation and loverboy situations. The court signs off on a one-year OTS period, which can be extended one year at a time. The average OTS period is about 3 years. The second, less common possibility is that BJz is given custody because the family situation is permanently untenable. This second category is much less laborious, since in such a case the child normally resides in a safe environment with proper care, leaving only various administrative tasks.

In the child protective services role BJz has two major responsibilities. First, it must safeguard the physical and mental wellbeing of the children placed under its care and monitor their situation continuously. Second, it must coordinate all types of care given to the children and family members, ensure that all points of concern are being addressed and monitor progress.

These two tasks are carried out by the many family supervisors working for BJz, who form the bulk of the BJz employees tasked with child protective services. They operate in small teams. The caseload varies from 15 to 25 cases per supervisor, depending on the severity of the cases and working hours. The procedures are based on the Delta methodology, a relatively standardized approach that is applied to all cases and used throughout the Netherlands. Since the cases are rather complex, the family supervisors are aided by a number of behavioral analysts: specialists with in-depth training in the various aspects of child care. The family supervisors are also supported by secretaries for administrative tasks.

The team leader is formally responsible for the content and timeliness of all information products and monitors how the family supervisors and behavioral analysts are performing. The team leader also serves as the interface between the primary processes and the higher levels of the organization, and must manage the capabilities of the team, provide insights and implement organizational policies.


7.1.2 Information chain BJz can be considered part of a Public Private Information Chain as it is part of a network consisting of a variety of public and private parties in which child care related information is exchanged. BJz is the central node regarding information collection and is considered the director for all bodies that aid or monitor the client.

The information chain in which Jeugdbescherming is located can be viewed on two levels of aggregation. The first is a high-level view in which BJz is seen as an institution within a chain of institutions; this is the primary view within this research, since the research is centered around Public Private Information Chains. The second view is how the organization looks at itself. The proposed architecture must work in both circumstances to be acceptable.

Public Private Information Chain
Figure 14 shows BJz from the Public Private Information Chain perspective. Yet this figure hardly looks like a chain. The organizations and individuals other than BJz interact with each other as well, but it proved too difficult to map these interactions: every case is unique and the flow of information differs from case to case. Therefore Figure 14 is limited to the flows that are certain to exist, which also corresponds with the image of BJz as the central node regarding information on child protective services.

Figure 14: Flows of information to and from BJz. Created by author

Figure 14 shows both individual case-related information flows and aggregated flows. Green represents parties directly related to child protective services, light blue the official bodies that ‘contract’ BJz, purple the care providers and dark blue other related parties.

Organizational view Figure 15 shows the information chain from the point of view of the primary process within BJz. The team, as described in the introduction, is posed as the central entity that carries out the coordination in each case.

Figure 15: Information chain within BJz, showing both communications within a team and as a team with others. Created by author.

In Figure 15 four clusters of types of communication can be distinguished. First, the communications within a team are shown in blue. The family supervisors can be seen as the central node in the exchange of case-related information. Additionally, some information is aggregated for the team leader in order to manage the team, including planning, goals and caseloads. Second, case-related information is shared on a peer level with other dedicated child care bodies, colored green. Third, information is used in aggregated form within BJz, shown in red. This mostly concerns aggregated information for management and accountability, occasionally supplemented with individual cases involving difficult and important decisions. Fourth, information is continuously shared with the clients and other partners in the chain, plus there are sporadic contacts with a variety of third parties.


7.2 Metadata management – Technology This section tests three principles that are mainly related to the technology level. The technology level is introduced by providing an overview of the current architecture and a view on its adaptability. This is followed by the interfaces and standards that link the semantic metadata to its application. Finally, the third section describes what tooling is in use and which roles for tooling have been identified.

7.2.1 Architecture & adaptability
Within BJz the main IT-system that supports the primary processes and collects the data that is aggregated into reports is called IJ, an abbreviation for “Informatiesysteem Jeugdzorg”, which translates to “information system for child care services”. IJ supports the primary processes with report generation and provides the organization with information. Besides IJ there are 27 other software programs in use.

IJ is based on an Oracle 10g database platform in which the tables contain the data that make up the content of the forms and reports, which are hardcoded in the Microsoft .NET environment. The user interface is implemented in Microsoft Active Server Pages and accessed through the browser. Drafted reports remain stored in the database, but can be exported to Microsoft Word through a COM-object relationship. Some information products are exported as PDF.

When it was implemented over a decade ago IJ represented a state-of-the-art system, yet nowadays its performance is found less than satisfactory. Over time changes have been carried out regarding the system architecture, data model and hardware. As with many system designs from that era the adaptability of the system proved low, making changes expensive. An extra complicating factor is that there is no real forum in which architecture changes in IJ can be discussed among all 15 BJz in the Netherlands. Consequently each BJz adapted its own system to meet local needs, many of which are comparable. As a result, the IJ that started out years ago as a common system is now different in each BJz. This means that case files are handed over on paper between two BJz and manually entered into the other system.

Principle No.8: Adaptable architecture
The lesson that can be drawn from IJ is that not only the contents of the information system change over time, but also the context. When the context changes, the fundamentals of the architecture might need to change. The contents can be changed through versioning protocols (see principle 4). Changes in architecture pose two requirements. First, the technical design needs to be adaptable, for instance through loose coupling, the use of standards and SOA. Second, there needs to be a platform in which the fundamentals of the architecture can be discussed amongst all users.


7.2.2 Interfaces & standards
The PPIC in which BJz operates originally exchanged information on paper only. Many information products were reports that existed on paper before being implemented in IJ. Other implementations, such as new paper forms, are specified alongside IJ; they are not derived from the de facto leading data model. This means that changes are not consistently implemented, causing errors. Orchestration and consistency checks amongst all implementations are best effort: inconsistencies are remedied when they are encountered.

The principle states that there should be loose coupling with the implementation using standardized interfaces. Since the semantics are hardcoded in IJ there is no possibility to use an interface, and creating one in this particular situation is not worth the effort. As IJ may be replaced in the future, it is better to include such an interface in the list of desired specifications for a new system.

Principle No. 9: Mapping with implementation
Currently there is no loose coupling between semantics and their implementation: the semantics are hardcoded in IJ. Having a leading data model that is mapped to the other implementations would avoid inconsistencies in the implementation.
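The mapping idea behind this principle can be sketched as follows. In this minimal, hypothetical example (all term and field names are invented; they are not taken from IJ), every implementation declares which leading-model term each local field realizes, so that a simple check can flag fields that point at a term the leading model does not define:

```python
# Hypothetical sketch of principle no. 9: a leading data model whose terms
# are mapped to the local fields of each implementation (forms, reports).
# All names below are invented for illustration.

LEADING_MODEL = {
    "client_birth_date": "Date of birth of the client",
    "ots_start_date": "Start date of the supervision order (OTS)",
    "case_status": "Current status of the case",
}

# Each implementation maps its local field names to leading-model terms.
IMPLEMENTATIONS = {
    "IJ_form_intake": {"geb_datum": "client_birth_date", "status": "case_status"},
    "paper_form_A": {"geboortedatum": "client_birth_date",
                     "ots_begin": "ots_start_dat"},  # deliberate dangling mapping
}

def check_mappings(model, implementations):
    """Return (implementation, local_field, unknown_term) for every local
    field mapped to a term that does not exist in the leading model."""
    errors = []
    for impl, fields in implementations.items():
        for local, term in fields.items():
            if term not in model:
                errors.append((impl, local, term))
    return errors

print(check_mappings(LEADING_MODEL, IMPLEMENTATIONS))
```

Running such a check whenever an implementation changes would surface exactly the kind of inconsistencies that are now only remedied when encountered.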

7.2.3 Tooling for metadata management
At this moment no dedicated tooling for metadata management is in use at BJz. Semantics regarding aggregated reports are kept on paper and in MS Word files. Additionally, the terms from the Rapportageformat and the newly developed national quality indicators are available as PDF. The definitions are kept in these lists to ensure access and to act as a form of repository: a single location in which all definitions are stored. The semantics used within the primary process are either hardcoded in the forms in IJ or non-standardized, being internal semantic metadata stored alongside the core data in IJ.

The lack of tooling is not deemed problematic, since the quantity of semantic metadata used in the forms and reports can still be maintained by one or more individuals. A further reason that tooling is not yet required is that semantics are not actively shared within the information chain; as of now there is no need for sharing them.

It is expected that tooling will be required once the IT-infrastructure matures and information is exchanged in a more standardized manner:
 When the contents of the information products are provided with semantic metadata, and improved versions of IJ offer higher granularity in categories and additional process-related information, the quantity of semantics will increase. This quantity is expected to be beyond what can be managed without processes or tooling.
 There is a desire, both within and amongst BJz, to define relations between semantics. A first step in this direction is the JZ-XML project. Defining relations can be done manually, but a tool that aids in mapping would greatly reduce effort and improve consistency, quality and presentation.


 As changes are gradually implemented in IJ and the corresponding data model, it becomes harder to keep an overview of the various versions of semantics. This overview is required for the change management processes in section 7.4.2 and the creation of reports with aggregated data as described in section 7.3.3.

Principle No.5: Adequate tooling
Without standardized exchange of information and with little use of standardized semantics, metadata management can be carried out without tooling. Any increased use of semantic metadata makes tooling very desirable, or even necessary, as the BJz case shows. The roles for tooling that were found in use or desired are repository, access and relation management. In section 7.3.3 the roles of version management and mapping with business rules are confirmed as well. The need for translation might arise in the future but was not directly observed in this case.
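The three tooling roles found in use or desired (repository, access and relation management) can be illustrated with a minimal sketch. The class, terms and relation below are invented for illustration; they do not describe any tool actually in use at BJz:

```python
# Minimal sketch of the three tooling roles observed or desired at BJz:
# repository (single storage location), access (lookup) and relation
# management. Terms and relations are invented for illustration.
class MetadataRepository:
    def __init__(self):
        self.terms = {}          # repository role: single location for definitions
        self.relations = []      # relation management role: (term, relation, term)

    def add_term(self, name, definition):
        self.terms[name] = definition

    def relate(self, a, relation, b):
        self.relations.append((a, relation, b))

    def related_to(self, name):
        """Access role: return all relations a term participates in."""
        return [r for r in self.relations if name in (r[0], r[2])]

repo = MetadataRepository()
repo.add_term("youth_care", "Care provided to minors under supervision.")
repo.add_term("foster_care", "Placement of a minor in a foster family.")
repo.relate("foster_care", "narrower_than", "youth_care")
print(repo.related_to("youth_care"))
```

Even such a simple structure already covers what the paper lists and PDF glossaries provide today, while making the relations between terms explicit and queryable.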


7.3 Metadata management – Data
In this section three principles are tested that relate mainly to the data level of semantic metadata management. First the topic of external metadata is discussed, since it is related to having a consistent model, which is described in the second section. The third and final section describes how the semantics are related to the business rules used in the primary process and for creating aggregated reports.

7.3.1 External metadata
When making an inventory of the semantic metadata in use at BJz, three categories can be observed:
 Semantic metadata related to categorization and context, such as the field names in the forms and the column headers in the database. The column headers are not for end-user consumption as they are linked to IJ. Both groups are external metadata. However, all of them are hardcoded in IJ, which was a common approach in the era when IJ was developed. This diminishes the advantages of external metadata with respect to metadata management.
 Semantic metadata related to the contents of the forms, which is internal metadata. In order to allow maximum freedom the end user may provide any input, as long as it is within a 10,000-character limit. The downside of this freedom is that additional text must be added to provide context for the findings that are central to that section. Additionally, free input may reduce the consistency of this added context: a different choice of words may not show up during a search.
 Semantic metadata related to aggregated information. This category is fully external, as these definitions are kept on paper, as shown in section 7.3.3.

The current implementation allows for limited management options and automation. Versioning is very difficult: changes to the first category have to be hardcoded in IJ, whereby older versions are lost and a mismatch between older content and new semantics may arise. Creating new columns in the database for changed semantics creates problems during aggregation, as detailed in section 7.3.3. With the second category being semantic metadata within plain text, the options for automation to support the knowledge workers in their daily activities are limited. End users have expressed the wish for contents to be interpretable by computer systems. When composing a new information product it would be very valuable if every observation regarding, for example, 'education' could be easily retrieved, instead of having to manually search through dozens of information products for the right excerpts.
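The retrieval wish expressed by the end users can be illustrated with a small sketch of externalized semantics. The tag names and report fragments below are invented; the point is only that once free-text fragments carry concept tags alongside the text, all observations about a concept such as 'education' become retrievable without scanning whole documents:

```python
# Sketch of externalizing semantics: free-text report sections carry
# concept tags alongside the text, so every observation regarding e.g.
# 'education' can be retrieved directly. Tags and contents are invented.
reports = [
    {"case": 101, "text": "Attendance at school has improved.", "tags": ["education"]},
    {"case": 101, "text": "Home situation remains tense.", "tags": ["home_situation"]},
    {"case": 102, "text": "Tutoring was arranged via school.", "tags": ["education", "care"]},
]

def find_by_concept(sections, concept):
    """Return all text fragments tagged with the given concept."""
    return [s["text"] for s in sections if concept in s["tags"]]

print(find_by_concept(reports, "education"))
```

Note that the lookup matches on the tag, not on the wording of the text, so a different choice of words in the free text no longer hides an observation from a search.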

Principle No. 3: External metadata
The use of external metadata is limited within BJz. This was to be expected, since literature indicated that full-text information products often lack external metadata; highly standardized products such as financial records or measurements are more likely to use it. Making all metadata external would benefit BJz in two ways. First, it may make metadata management efforts much easier, especially regarding versioning. Second, tasks that are essential but take up time that could otherwise be spent on professional judgment could be further automated. Having the metadata external may also increase system performance, but this could not be fully verified.


7.3.2 Consistent data model
The data model within IJ originates from the paper forms in use before IJ was introduced: 10 years ago IJ merely provided a digital way of filling in the forms that already existed on paper. Nowadays the implementation within IJ can be considered the de facto data model, even though it is hardcoded. The database on which IJ runs is a relational database, describing a number of relations. As a result of this heritage the data model in IJ is not fully consistent and there is a lot of overlap. Additionally, not all semantics are well defined, since they are form titles, not definitions.

The need for definitions used to be limited, since the forms were always seen and analyzed by experts in their original context. With the current automated reuse of fields from various forms this context is lost. According to the end users the reuse of fields is helpful, but from time to time it can be detrimental to information quality as well. Over time changes have been made to make the data model more consistent and to reduce overlap: partially because of automation such as the reuse of fields over several forms, partially to provide a better match with the queries used in reports with aggregated data, but mostly at the request of end users within the primary process.

Currently the data models regarding single cases (in IJ) and aggregated reports are unrelated. No relations are defined between the two; the consequences are presented in section 7.3.3.

Principle No.6: Consistent data model
The data model within BJz is not fully consistent, a heritage from digitizing the paper forms previously used. These inconsistencies hamper the reuse of information between different information products, a reuse that, when properly implemented, may reduce the effort required of family supervisors. Creating consistency requires the relations among semantics to be specified. This cannot be done by the application manager alone: it requires the cooperation of the information analyst and subject-matter experts, in particular the behavioral analysts tasked with checking various documents.

7.3.3 Link to business rules
Within BJz many reports are drafted for both internal and external use. These reports vary from a single table with an overview for a single municipality to the quarterly reports to the province, which are dozens of pages long. Some reports are drafted periodically, such as the weekly caseloads per team, the monthly production figures or the quarterly and annual reports to the province. On occasion unique reports are drafted to answer a specific question for management purposes or at the request of third parties.

The contents of the reports are extracted by aggregating all relevant individual cases. Every night relevant data from the data storage is extracted and reformatted through ETL (Extract, Transform and Load, a common database technique) and loaded into a dedicated business intelligence database. On this database queries are run to aggregate the individual cases into the desired information. Business rules are incorporated within these queries in order to correctly classify cases and to determine whether thresholds or terms are violated. The queries are created, stored and run in the IBM Cognos software, which is commonly used in medium and large enterprises.

Depending on the type of report, drafting the queries takes half an hour to several weeks. Most of this time is spent on determining what data is exactly present within the various tables and on checking whether the queries provide the exact information that is desired. The exact definitions are looked up manually: the definitions are on paper, while most of the metadata is present in the forms available digitally within IJ. There is no mapping between semantic metadata and business rules.
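How a business rule ends up embedded inside an aggregation query can be sketched with a small in-memory example. The table, column names and the 5-day threshold below are invented for illustration; the real queries run in IBM Cognos against the business intelligence database:

```python
import sqlite3

# Sketch of a reporting query with an embedded business rule (here: an
# assumed term of 5 days for first contact). Table and column names are
# invented; the real queries are created and run in IBM Cognos.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cases (team TEXT, days_to_first_contact INTEGER)")
con.executemany("INSERT INTO cases VALUES (?, ?)",
                [("A", 3), ("A", 8), ("B", 4), ("B", 2)])

# Business rule: a case violates the term when first contact took > 5 days.
rows = con.execute("""
    SELECT team,
           COUNT(*) AS total,
           SUM(CASE WHEN days_to_first_contact > 5 THEN 1 ELSE 0 END) AS violations
    FROM cases GROUP BY team ORDER BY team
""").fetchall()
print(rows)  # [('A', 2, 1), ('B', 2, 0)]
```

The threshold lives only inside the query text; without a mapping between the semantic definition of the term and this CASE expression, a change in the definition must be found and corrected by hand in every query that uses it.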

The close relation between business rules and semantic metadata is clearly shown in this case. Over time definitions have changed, and for altered semantics new columns have been added to the tables. In queries ranging over a time span that covers both the old and the new situation, combining both to provide accurate data is very hard and time consuming.

The creation of reports with aggregated data is a transformation process in which each of the four links of the chain has the potential to go wrong:
 First, within the primary process the data must match the semantics (in the case of forms) or the semantics must match the context (in the case of tags).
 The ETL that extracts the relevant data from the primary process and transforms it in a way that the business intelligence software can use must be correct.
 It is possible that the queries do not match what the report claims to provide.
 Based on the reports created within the official reporting process, employees carry out their own calculations and modifications of tables. This is out of scope for this research.

Principle No.2: Mapping with business rules The case shows that there is a strong relation between semantics and business rules, even though their use is very different. A change in the semantic metadata may have impact on the associated business rules and vice versa.

When the semantics are mapped to the existing queries it becomes much easier to create queries and check their validity. This is especially true when the data model is consistent (principle no. 6) and the relations between single instances and aggregations are well defined. From an organizational point of view, effort is reduced and the quality of reports with aggregated data is increased.


7.4 Metadata management – Processes
This section tests the three final principles. The first section is related to having a conceptual model and the role concepts may play in alignment efforts. This is followed by a review of the change management processes and the lessons drawn from them. Finally, the third section identifies what division of roles and responsibilities has been encountered and how they match the roles found in literature.

7.4.1 Conceptual level
As stated earlier, the use of semantic metadata within BJz (and all other partners within the chain) is currently very limited. Semantic metadata is mostly found in the forms in IJ that are used to draft reports and in the various reports in which data is aggregated for management and accountability purposes. Regarding the latter category there is a form of conceptual level: Jeugdzorg Nederland and the Ministry of Health, Welfare and Sport created a list of definitions called the Rapportageformat. These definitions are listed in written text and the format of the corresponding tables is given. Additional definitions drafted by the management of BJz are held centrally at the management level. All these definitions apply to aggregated data; those related to single cases are hardcoded in the system (IJ) and forms.

Within BJz a total of 28 software systems are in use, excluding various office applications. Only some of these systems use client data; other programs are used for HRM, finance and other tasks. Most digital forms also exist on paper, although the use of paper declines steadily. Altogether these systems hold many forms, and many tags could be applicable. A list of terminology is thought to be very useful. In fact this would constitute a conceptual level, although without the mapping and digital availability presented in this thesis.

Principle No.1: Conceptual level
There is a conceptual level for metadata within BJz. This level is very thin, since it only holds concepts used for information aggregation for a selection of management and accountability reports. Despite its limited size, the conceptual level already shows its benefits within the processes of drafting and analyzing these reports: it increases consistency and information quality. It also serves as a master data file that can be used as a reference for implementing changes or for reviewing the performance and compliance of the system.
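What a machine-readable version of such a conceptual level could look like is sketched below. The entries are invented for illustration; the real definitions currently live in the Rapportageformat and other paper lists:

```python
# Minimal sketch of a machine-readable conceptual level: each concept has
# a definition, a source, and a record of where it is implemented.
# Entries are invented; the real definitions live on paper lists such as
# the Rapportageformat.
glossary = {
    "caseload": {
        "definition": "Number of open cases assigned to one family supervisor.",
        "source": "Rapportageformat",
        "implemented_in": ["weekly team report", "quarterly province report"],
    },
    "ots_duration": {
        "definition": "Elapsed time between start and end of a supervision order.",
        "source": "internal BJz definition",
        "implemented_in": ["annual report"],
    },
}

def where_used(term):
    """Look up every information product a concept appears in."""
    return glossary[term]["implemented_in"]

print(where_used("caseload"))
```

Recording the implementation locations alongside each definition is what turns the glossary into a master data file: a change to a concept immediately shows which reports are affected.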

7.4.2 Change management
BJz employs a number of information analysts whose job it is to match the information needs of the primary process with those of the aggregated reports. Additionally, the information needs and the implementation are matched. Requests for change are drawn up in a document that specifies what the new situation should look like.

Modifications to IJ are edited directly into the Microsoft .NET environment and are directly visible through the Active Server Pages rendering. Implementing changes is considered costly and time consuming. The application manager has a high workload, partially because not all changes in IJ have been documented. When the ETL or the business intelligence server has to be changed, external expertise is required.

Principle No. 4: Change management
External metadata would allow dedicated tooling to be used for changing semantics. Additionally, external metadata combined with a rich taxonomy in which the information products are defined would allow easy modifications. This would reduce complexity, since it disconnects the data model from the implementation.
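How external metadata eases change management, and in particular the versioning problem noted in section 7.3.1, can be sketched as follows. The term and its two definitions are invented for illustration: when a definition changes, the older version remains retrievable, so data recorded under either era can still be interpreted correctly:

```python
# Sketch of versioned external metadata: when a definition changes, older
# versions remain retrievable, so queries over mixed-era data can still
# be interpreted. The term and definitions are invented for illustration.
from datetime import date

versions = {
    "active_case": [
        (date(2008, 1, 1), "Case with a running OTS."),
        (date(2011, 1, 1), "Case with a running OTS or pending extension."),
    ],
}

def definition_at(term, when):
    """Return the definition of a term that was valid on a given date."""
    valid = [d for start, d in versions[term] if start <= when]
    return valid[-1] if valid else None

print(definition_at("active_case", date(2010, 6, 1)))
```

With the semantics hardcoded in IJ this history is lost as soon as a form is edited; kept externally, both versions coexist and the change itself becomes an auditable record.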

7.4.3 Roles and responsibilities
Within BJz there are four roles in the semantic metadata management domain:
 The data warehouse specialist maintains an overview of the semantics for aggregated reports and creates and stores the business rules.
 The information manager matches the information needs of the primary process with those of the aggregated reports and with the implementation.
 The application manager collects requests for change regarding IJ and carries them out in concordance with the information managers.
 The management specifies semantics for aggregated reports in case definitions are not yet covered by the Rapportageformat.
See chapter 9.4.3 and appendix 12.1.6 for further observations on required roles.

Role – Observations

Initiator & enabler – The initiator and enabler role was found lacking, both within BJz and within the PPIC. Within BJz the responsibility for metadata, let alone semantics, was not specifically laid down. Some of the tasks are included in other responsibilities, such as those of the information manager and the data warehouse specialist. Within the PPIC there also is no initiator for information exchange, mainly out of fear that in the future a different standard will be adopted. There is a pilot project, but most organizations are waiting for a communication standard that is to be developed by industry organization Jeugdzorg Nederland, yet has little priority.

Developer – The developer role is taken on by multidisciplinary project teams. Since BJz is the leading party in information orchestration it takes a leading role when it comes to information exchange. The multiple disciplines represented within the team ensure that all fields are covered, including strategic interests and technological capabilities.

Standardization – Regarding semantics for aggregated reports the standardization role clearly lies with industry organization Jeugdzorg Nederland. On the technology level Jeugdzorg Nederland, along with the council of the 15 BJz, is working on a standard, but progress is slow.

Control & process monitoring – Within BJz the timely and correct delivery of incoming and outgoing information is the responsibility of the family supervisors. Their performance is monitored by the team leaders. Regarding the information system as a whole the I&A department is responsible: the application manager stores requests for change and the information managers review operational performance.

Facilitator – The facilitator role is carried out on a high level by Jeugdzorg Nederland, which collects and disseminates best practices amongst the 15 BJz. However, most are related to the primary process, a few to IJ and hardly any to inter-organizational cooperation.

Service and product aggregator – The service and product aggregator roles were not found in the BJz case. This probably has to do with the size of the organizations involved: many links within the PPIC are only a few dozen or a few hundred people large. These organizations do not feature large dedicated departments that support the primary processes. Additionally, many IT-systems in use are off-the-shelf products provided by software developers.

Accountability – BJz is accountable for every information product it produces, even when it is based on information acquired from third parties. For that reason every information product from the primary process shared with any third party is signed off by a team leader. Higher up in the organization the management checks outgoing reports. In the end the accountability lies with the director.

Process improvement – Cross-agency processes are revised and improved in pilot projects. There is no standard approach or trigger for these pilot projects. However, all pilot projects are reviewed by the management and carried out by multidisciplinary teams.

Information analyst – The information analyst ensures a good fit between the semantics within the information system and the requirements in the primary process. Semantics have a lifecycle, starting at specification and requiring constant alignment.

Table 8: Overview of roles within BJz. Created by author.

Principle No. 7: Cooperation with stakeholders
In the BJz case study most roles described in literature were found to exist. However, they are often not formalized and not consistently carried out in practice. Many of the described roles relate to cooperation with partners, but in practice most roles are carried out with a focus on the organization itself. There are very few dedicated specialists, since BJz has relatively few support departments. As a result a single role can be fulfilled by multiple people, but more often one person holds various roles.


7.5 Conclusion
All 9 principles tested in the case were found to be applicable and relevant. They were confirmed by real-life practices and examples and were found relevant by the experts at BJz. As expected, the maturity level of semantic metadata management was intermediate: not many principles are carried out in the current state. Thus far semantic metadata management has not been high on the agenda within BJz, within the information chain, or within the groups that represent the interests of the 15 BJz in the Netherlands. This particular BJz aims to take a leading role.

The case study does indicate that there is a lot of potential for semantic metadata management in this particular PPIC. Transaction costs of information may be significantly reduced. This does not only reduce costs, but also reduces the burden on specialists and increases information quality.

Recommendations for Bureau Jeugdzorg
The first thing BJz should do is create a conceptual level in order to align the primary processes and to align the semantics of the individual cases with the creation of aggregated reports. This will steadily improve the quality of management information, allowing better decisions to be taken in all areas. Having a conceptual level is a no-regret measure: even when other organizations in the chain do not adopt the model, it helps in three ways. It makes internal information exchange easier, results in more accurate management information and makes the output of BJz easier to interpret for other parties such as the courts and care providers.

The second priority should be making the semantic metadata external and adding a form of tagging. This will unlock the contents for advanced search options. The employees within BJz would immediately save a significant amount of time during their daily activities, alleviating some of the work pressure that BJz employees perceive as high. In addition it will make versioning less costly and less time consuming.

Third, semantic metadata should be actively managed by BJz, with clearly defined roles and responsibilities. Ownership and active management prevent faults and misinterpretation. Protocols for change management should be implemented in order to match semantics with the needs of the primary process. Those changes must also be aligned with the creation of aggregated reports.


8 Case study 2: Tax office This chapter entails the second of two complementary case studies. The selection criteria for the case are given in chapter 2.3. This case study was carried out at Logius (the Dutch digital government office) and the Belastingdienst (the Dutch tax office, further referred to simply as the tax office). Both Logius and the tax office operate at the national level. The tax office is one of the largest organizations in the Netherlands, with over 33,000 employees in many locations. The tax office is also one of the few organizations that has its own department for developing its business applications.

Scope Semantic metadata is relevant in nearly every branch of the tax office, but that scope would be too large. This case study focuses on the Standard Business Reporting (SBR) program, which is one of the most mature metadata management efforts in the Netherlands. The SBR program involves other branches of government as well: the chambers of commerce (KvK) and the central bureau for statistics (CBS) also take part, but are out of scope. The focus is on the upstream part of the SBR chain, ranging from Logius to the tax office. This is where the specification and most of the metadata management effort take place.

Case study approach The findings in this case study are based on four ways in which information regarding this case was extracted from both Logius and the tax office:
 Several interviews with senior employees at Logius and four tax office business units. The interviews touched upon the current state of metadata management, pilot projects and future projects, roles, responsibilities and cooperation, and best practices for use in any context.
 A review of the published architecture and presentations on XBRL. The Dutch Taxonomy is published and available to the public. The contents and structure have been reviewed.
 Site visits to the tax office in Apeldoorn and Utrecht.
 A wide variety of documents, including workflows for versioning, procedures for consultation, checklists, lists and glossaries, and high level overviews of processes, departments and cooperation schemes. Additionally, there is information regarding the SBR project that is available to the public, including brochures and the SBR wiki.

The preliminary reference architecture from chapter 6 has been reviewed in the light of the information gathered in this case study. During the case study notable topics were identified. A topic can be seen as a cluster that consists of an introduction, observations and findings. In the introduction the topic and its link to the architecture are presented. This is followed by observations, which are presented as factually as possible. Each topic has its own findings, in which the observations are valued. To conform with the literature study the same approach has been used and the topics are grouped in the sections technology, data and processes.


8.1 Background The Dutch Standard Business Reporting (SBR) program stems from the desire to reduce the administrative burden. In the 1990s it was found that companies spent considerable time and effort filing taxes and reporting statistics. This was found to be a waste of effort: spending time and money on tasks not related to the core business is detrimental to economic performance and lowers the competitive power of businesses. The SBR program was one of the initiatives designed to lower the transaction costs of reporting to the government.

8.1.1 Primary processes Companies vested in the Netherlands are required to pay taxes and to report various statistics to the government. The primary process within the SBR program can be viewed as the information exchange between the government and private parties. The SBR program allows companies to file a number of tax statements and statistics to three types of government bodies:

Tax office:
 Inkomstenbelasting / IB (income tax)
 Omzetbelasting / OB (sales tax)
 Vennootschapsbelasting / VPB (corporate tax)

CBS (central bureau for statistics):
 Opgaven productiestatistieken (statement of production statistics)
 Opgaven investeringsstatistieken (statement of investment statistics)
 Opgaven korte termijnstatistieken (statement of short term statistics)

Chambers of commerce:
 Jaarrekeningen (balance sheets)

In this case study only the reporting to the tax office is in scope. Return messages follow a different process; the metadata management effort is similar, but the processes are not. Combining all message types, the tax office expects to receive as little as 125,000 messages in the year 2011. This number is expected to grow to nearly 3 million messages in 2012 and well over 10 million messages in the year 2014. All the stated reports are to be handed in using the XBRL format, an adaptation of the XML standard used for financial reporting. Aside from the XBRL format, the semantic metadata is specified in the Nederlandse Taxonomie (Dutch Taxonomy), which is published for public use.
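To illustrate how such a report is structured, the following Python sketch builds a minimal XBRL-style instance document. The namespaces and element names (Turnover, VatDue) are hypothetical simplifications; the real Dutch Taxonomy defines its own namespaces and concepts.

```python
import xml.etree.ElementTree as ET

# Illustrative namespaces; the actual Dutch Taxonomy publishes its own.
NS_XBRLI = "http://www.xbrl.org/2003/instance"
NS_NT = "http://example.org/nt/ob"  # placeholder, not a real taxonomy URI

def build_sales_tax_message(turnover, vat_due):
    """Build a minimal XBRL-style instance for a sales tax (OB) report.

    Element names are hypothetical; a real instance would also carry
    contexts, units and references into the taxonomy schema.
    """
    root = ET.Element("{%s}xbrl" % NS_XBRLI)
    ET.SubElement(root, "{%s}Turnover" % NS_NT).text = str(turnover)
    ET.SubElement(root, "{%s}VatDue" % NS_NT).text = str(vat_due)
    return ET.tostring(root, encoding="unicode")

message = build_sales_tax_message(100000, 21000)
```

The point of the format is that each figure is tagged with a taxonomy concept, so the receiver can interpret the message against the published semantics rather than against a form layout.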

According to the tax office, the following advantages apply to the use of XBRL messages and standardized semantic metadata in the Dutch Taxonomy:
1. Reduced costs for companies and intermediaries by standardizing reports by means of a taxonomy, allowing their processes to be standardized as well.
2. Reduced costs for companies and intermediaries by reducing transaction costs.
3. Improved data integrity by means of Horizontaal Toezicht, a form of benchmarking.
4. Improved certainty on fiscal position through faster delivery and processing.
5. Reduced costs, since forced checks result in fewer reports to be reviewed manually.
6. Improved information quality by using a common set of semantics within the information chain.
7. Improved consistency of reports and a better match with bookkeeping software.


Overview of the information chain In order to understand the context of this case study, the PPIC of the SBR project is introduced. The PPIC is shown in Figure 16. The PPIC starts with the companies who are to report to the government. The majority of the companies in the Netherlands (an estimated 80%) have an intermediary, such as a bookkeeper, who handles tax reports and other official financial and statistical statements. One of the ideas of the SBR project is that in time companies will also use XBRL for communications. Dutch banks are already participating; however, these other private parties are out of scope for this research.

Figure 16: Overview of data streams in the primary process regarding sales tax. Created by author.

All messages are sent to the Digipoort, one of the government bodies established to function as a portal for electronic messages directed to the government (Fokkema & Hulstijn, 2011). The Digipoort takes care of the reception of the messages, as is shown in green in Figure 17. Upon reception a validation process runs in order to check whether the message is intact, the type of message is known, the XBRL and Dutch Taxonomy standards are adhered to, and whether the message contains any viruses or other undesirable payload. The message is then transmitted to the government body it was addressed to, and the sender is notified of its delivery by means of an acknowledgement.
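The reception checks can be sketched as a small validation pipeline. This is an illustrative Python sketch, not the actual Digipoort implementation: the schema validation against the Dutch Taxonomy and the virus scan are stubbed out, and the message types listed are a guessed subset.

```python
import xml.etree.ElementTree as ET

# Illustrative message types; the actual list is defined by the SBR program.
KNOWN_MESSAGE_TYPES = {"IB", "OB", "VPB"}

def validate_message(raw, message_type):
    """Run the reception checks sketched above; an empty list means accepted.

    Taxonomy schema validation and virus scanning depend on external
    tooling and are therefore not shown here.
    """
    errors = []
    if message_type not in KNOWN_MESSAGE_TYPES:
        errors.append("unknown message type")
    try:
        ET.fromstring(raw)  # is the message intact (well-formed XML)?
    except ET.ParseError:
        errors.append("message is not well-formed")
    return errors
```

For example, `validate_message(b"<xbrl/>", "OB")` yields no errors, while an unknown type or a broken payload produces a rejection that can be reported back in the acknowledgement.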

At the tax office the message arrives at the central administration, which handles all digital processing. These steps are shown in dark blue in Figure 17. Even though the message was validated at the Digipoort, some additional checks are carried out. The XBRL message is stored and archived in its original format. Subsequently it is converted to XML for internal processing and sent to the respective back office. Much of the fiscal processing is done automatically; areas of special attention and messages with suspicious values are checked manually by fiscal specialists. Eventually there is an outcome of the process and a response to the company is provided, on paper and digitally if desired. These are both out of scope for this research.


Figure 17: Steps taken in filing a tax report through SBR. Created by author.


8.2 Metadata management – Technology This section tests three principles that are mainly related to the technology level. The technology level is introduced by providing an overview of the current architecture and a view on its adaptability. This is followed by the interfaces and standards that link the semantic metadata to its application. Finally, the third section describes what tooling is in use and which roles for tooling have been identified.

8.2.1 Architecture & adaptability The SBR case shows the necessity of having an adaptable architecture. None of the systems in use before the SBR project was initiated were designed to be interoperable in this specific way, or to accommodate XBRL. One of the major lessons that can be drawn is that new data streams, new partners and new content may become part of the information system, with the information system defined in a broad sense to include people, technology, processes and data. The information system must be able to respond to and facilitate changing needs and requirements.

The Dutch Taxonomy (NT) holds the semantics. The Dutch Taxonomy Architecture (NTA) contains the format of the taxonomy. For the NTA there is a special expert group that is to be consulted. A platform for discussing the architecture is not only relevant for major changes but also for more practical issues. For instance, it is possible to agree on semantics, only to disagree on format. Stakeholders may have different systems: in this case the tax office wanted a name to be 40 characters long, while another partner's system was only able to cope with 25 characters.

Requests for changes in the Dutch Taxonomy Architecture (NTA) follow a well defined protocol, which is described in detail in section 8.4.2.

Principle No. 8: Adaptable architecture The SBR case shows that an architecture has to be adaptable. Requirements for an information system do not only exist prior to its development; requirements are defined continuously. Having a platform to discuss changes helps the architecture respond to these evolving requirements.


8.2.2 Interfaces & standards Processes have been defined for each type of tax, and various technologies are in use to support them. Additionally, various standards containing the same concepts are in use. For the types of tax covered by the SBR project, each semantic exists in both XML (back office) and XBRL (delivery by the beneficiary). Currently the leading data model is the implementation in XML. Since this is a technical implementation, it is difficult to use it as a reference.

When semantics are changed, it is hard to determine everywhere the changes should be implemented. There is no mapping of the semantics to a leading data model, let alone a conceptual one. Oversight of where particular semantics are used is easily lost. Some types of tax still use paper forms, others do not. Some semantics are used in the tax office computer application, others are not. Most are used in third party software packages, some are not. Some semantics are explained on the website or in brochures. All semantics should be known to the service desks that aid customers. Types of tax used in the SBR program also require XBRL translation.

Principle No. 9: Mapping with implementation Having a mapping with the implementation creates an overview of where which semantics are applied. When changes are to be implemented, it is immediately known where they apply. This also allows forecasting of the impact of the proposed changes.
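As a sketch of how such a mapping could be recorded, the Python fragment below links semantic concepts to the artefacts in which they are implemented, so the impact of a proposed change can be listed directly. Both the concepts and the artefact names are hypothetical examples, not the actual tax office mapping.

```python
# Hypothetical mapping from semantic concepts to implementation artefacts;
# the real mapping would cover many more concepts and locations.
IMPLEMENTATION_MAP = {
    "Turnover": ["back-office XML schema", "Dutch Taxonomy (XBRL)",
                 "paper form", "website glossary"],
    "VatDue": ["back-office XML schema", "Dutch Taxonomy (XBRL)"],
}

def impact_of_change(concept):
    """Return every artefact that must be updated when a concept changes."""
    return IMPLEMENTATION_MAP.get(concept, [])
```

With such a registry in place, a request for change can be accompanied by the list of affected forms, schemas and publications, which is exactly the impact forecast the principle calls for.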


8.3 Metadata management – Data In this section three principles are tested that relate mainly to the data level of semantic metadata management. First, the topic of external metadata is discussed, since it is related to having a consistent model, described in the second section. The third and final section describes how the semantics are related to the business rules used in the primary process and for creating aggregated reports.

8.3.1 External metadata External metadata is used both within the tax office processes and within the SBR chain.

Tax office primary processes Within the primary processes of the tax office, the large information systems that process the tax reports all use external semantic metadata. The semantics are stored in a repository linked to the implementation. Due to its age, the repository does not meet modern standards. For instance, the primary key is an auto number and definitions are optional and not standardized. Because of the auto number, no relations among semantics are defined. The external metadata makes versioning much easier. This is an important requirement for the tax office, since changes have to be implemented every year. This means that each year a new set of semantics is used, although most remain unchanged. Previous versions are stored in order to be able to interpret tax statements made several years ago: for at least 5 years in the past, the tax office still has the option to review taxes and adjust its notification.
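The versioning behaviour described here can be sketched as a small repository that keeps a set of semantics per year, so that statements filed in earlier years can still be interpreted with the semantics valid at the time of filing. This is an illustrative design, not the actual tax office repository, and the definitions used are placeholders.

```python
class SemanticsRepository:
    """Versioned store of semantic definitions (illustrative sketch).

    Past versions are retained so tax statements from earlier years can
    still be interpreted during review, as described above.
    """

    def __init__(self):
        self._versions = {}  # year -> {term: definition}

    def publish(self, year, definitions):
        """Publish the set of semantics that is valid for a given year."""
        self._versions[year] = dict(definitions)

    def lookup(self, term, year):
        """Interpret a term with the definition valid in the given year."""
        return self._versions[year][term]

repo = SemanticsRepository()
repo.publish(2010, {"Turnover": "2010 definition"})
repo.publish(2011, {"Turnover": "2011 definition"})
```

Because each year's set is stored whole, a reviewer looking at a 2010 statement resolves its terms against the 2010 semantics even after later versions have been published.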

Standard Business Reporting In the Standard Business Reporting chain all semantics are encompassed by the Dutch Taxonomy, which is published in the public domain. All reports are made up in XBRL and are linked to the Dutch Taxonomy. The semantics are also provided to software developers, which allows them to be integrated in accountancy software. With the right labels (semantics) attached, semantics are linked to figures during ordinary accountancy tasks. A tax statement can then be created at the push of a button: the right figures are extracted from the accountancy software and an XBRL message is created.

Principle 3: External metadata The SBR case is a great example of how external semantic metadata allows for much easier metadata management, improved performance and additional means of automation. These include validation of messages using a predefined schema with threshold values and automated creation of reports from a larger dataset.

8.3.2 Consistent data model The tax office case shows that a consistent data model is desirable. Processes need to be reviewed by subject-matter experts when changed, and changes are common, as shown in 10.4.2. The high level of automation means that even a small mistake in a process that handles millions of cases per year may have significant consequences. Doing it right the first time saves a lot of time. Even when processes are segregated in such a way that inconsistencies do not matter, removing them is desirable. End users are confronted with multiple types of tax, and inconsistencies undermine the reputation of the tax office. Moreover, segregated systems may become interconnected over time, as shown by the move towards PPICs. That move is characterized by creating digital relations among organizations that did not exist earlier.

Data models are created on three levels: per type of tax, the partial taxonomy and the top down data models that are under construction. Currently the leading models are the functional designs that are created per type of tax. A number of these models are consolidated into the partial taxonomy, but most types of tax are not in the SBR program.

Currently there are two efforts to create a consistent data model that spans multiple processes. First, there is the bottom up approach of the partial taxonomy. This is a bottom up approach since the semantic metadata from the application level is bundled and reused. The partial taxonomy is made consistent for the end user, but is not used as a data model for organizational redesign. Second, there is a top down approach that aims to create an object model and conceptual data model. The first step in this process is to determine the objects and relations per type of tax. In the year 2013 these are to be combined into a single model. This requires tooling for relation management and versioning, which is not available to the project yet and possibly has to be made to specification. The top down approach has the semantics derived from the business information needs, as argued in principle 1. In time this is seen as the proper way; the bottom up data model serves as an intermediate model that is used because it is already available.

Principle No. 6: Consistent data model The case study shows that having a consistent data model is desirable. Inconsistencies may lead to loss of information quality and to incidents. Not having a consistent data model requires additional quality checks, claiming effort from knowledge workers that is not spent on adding value. Consistency in the data model is attained by specifying the relations. This requires subject-matter expertise, at least for quality review. In order to maintain consistency from the conceptual level to the implementation, a mapping as suggested in principle 9 is required.

8.3.3 Link to business rules Business rules play a vital role within the primary process of the tax office. The fiscal process of processing and reviewing tax statements is based on tax law. The outcome of the process is defined by many rules, values, thresholds, entity characteristics, transitional arrangements and so on. These values are clearly stated in the law. The principle is that the same rules apply to any entity in an equal way, making the process fully deterministic.

The tax office currently uses Regelspraak, an adaptation of the RuleSpeak restricted language, for defining many of the business rules in use. The use of Regelspraak makes review easier, since it is quite close to natural language. Currently a program is carried out within the tax office to orchestrate the use of business rules. Since business rules and semantics are the responsibilities of different departments, a direct link between the two seems unthinkable; the tax office tries to reduce complexity by limiting interdependencies. But there is a possibility that a mapping will be made with semantics. If a mapping is adopted, it will be loosely coupled and used only to inform the other department of changes. Whether this mapping will be with conceptual level semantics or with the implementation is not known yet.

The rules are clearly linked to the definitions (semantics) in use. For instance, the definition determines the category an entity falls into, and the linked business rules determine whether it is entitled to a certain arrangement or not. In practice, tax laws are updated every year. The number of annual changes differs per type of tax, but it is often easy to lose overview. Court cases are also a source of changes: a judge may rule that under a certain circumstance an entity should be viewed differently, with corresponding entitlements or obligations. Such a ruling requires both new semantics and new business rules, which need to be in sync.

Those responsible for specifying the metadata agree that business rules should be managed on a conceptual level as well. In line with the conceptual level for semantic metadata, there should be a link between the business rules in concept and those in use. Business rules and semantics are related, and insight into these relations is deemed useful. In case these relations are to be specified, there should be a form of loose coupling to comply with the tax office strategy of independent business units. Forms of loose coupling could be references or a mapping.
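A minimal sketch of such loose coupling: business rules reference semantics by identifier only, so a change in a definition can be traced to the rules that need review without tightly binding the two administrations together. The concept and rule identifiers below are invented for illustration.

```python
# Hypothetical semantics; real definitions follow from tax law.
SEMANTICS = {
    "SmallEnterprise": "an entity meeting the small-enterprise criteria",
}

# Loose coupling: rules refer to semantics by identifier only, so the two
# can be managed independently by different departments.
BUSINESS_RULES = [
    {"id": "R1", "uses": ["SmallEnterprise"],
     "text": "A SmallEnterprise is entitled to the reduced arrangement."},
]

def rules_affected_by(concept):
    """When a definition changes, list the rules that must be reviewed."""
    return [rule["id"] for rule in BUSINESS_RULES if concept in rule["uses"]]
```

The mapping is only informational: changing a definition does not automatically rewrite any rule, it merely tells the other department which rules to look at, which matches the "inform only" intent described above.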

Principle No. 2: Mapping with business rules The tax office case indicates there are strong links between business rules and semantics. A change in the semantic metadata may have impact on the associated business rules and vice versa. In the future there probably will be a mapping between semantics and business rules. This mapping will be loosely coupled, as proposed in the principle.


8.4 Metadata management – Processes This section tests the three final principles. The first section is related to having a conceptual model and the role concepts may play in alignment efforts. This is followed by a review of the change management processes and the lessons drawn from them. Finally, the third section identifies the division of roles and responsibilities that was encountered and how it matches the roles found in literature.

8.4.1 Conceptual level Currently semantic metadata is specified by specialists on behalf of the business level. The semantics are specified right into the implementation. A conceptual level is currently non-existent, although there are calls for implementing one. A conceptual level is deemed valuable for a number of reasons:
 The implementation of semantics has to be reviewed by subject-matter experts in order to validate the fit between semantics and their intended purpose. This is much easier at the conceptual level, since subject-matter experts have a hard time reading code.
 As with the primary process, having a conceptual model is easier for high level review. The most important of these is the legal review. Definitions follow from the law and the task given to the government body; these in turn lead to additional semantics practically required to carry out the given task. The tax office is only allowed to request information as stated by the law or absolutely required to carry out that task.
 A conceptual level allows for vendor independency on the lower levels. Not being bound to a certain standard makes it easier to change to a different format or standard. This increases vendor options and makes transitions less costly.
 A conceptual level allows for a higher level of complexity. This complexity is required in order to define all relations that exist, especially when making a model that spans multiple types of tax. In practice there are multiple realities per entity depending on the context: a person can be both an individual and a one person undertaking. Having a consistent high level model allows for much simpler derivative models implemented in technology that are still consistent.
 Concepts need to be distributed to software developers and other partners. Concepts in plain text are the easiest to communicate. Plain text semantics are also most favorable for creating forms.

A conceptual level is being created from various angles. In practice the Dutch Taxonomy is the closest thing to a normalized set of semantics. It covers only a portion of all types of tax and is not truly conceptual, since it is expressed in the XBRL standard. Another initiative is the phased top down specification of an all-encompassing data model, which is to be finished around 2014.

Principle No. 1: Conceptual level A conceptual level is currently non-existent but desired by nearly all those involved in some way with semantic metadata management. These are experts with very different backgrounds and motives.


8.4.2 Change management Change management processes take place in two areas. First, the tax office creates its partial taxonomy and incorporates changes in its primary process. Then the partial taxonomies are combined and normalized at Logius. Besides the content, there is also a dedicated change management process for the architecture.

Partial taxonomy tax office In 2010 the tax office started the Competence Center Taxonomy (CCT). The CCT bundles the knowledge that was previously present in a number of pilot teams related to taxonomy development. The main task of the CCT is to annually produce the tax office partial taxonomy. In practice this is done by combining the existing data models for each type of tax: the models are combined, overlap is eliminated and a translation from XML to XBRL is carried out. The workflow is shown in the bottom chain in Figure 18. This method ensures that the partial taxonomy, and messages based on that taxonomy, fit the implementation of the primary processes well.
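The consolidation step, combining per-tax data models and eliminating overlap, can be sketched as follows. The model contents are invented for illustration; in practice the CCT works with far richer models and an XML-to-XBRL translation step not shown here.

```python
def combine_partial_models(models):
    """Merge per-tax data models into one, eliminating overlapping terms.

    Terms with conflicting definitions are reported for manual resolution
    rather than silently overwritten.
    """
    combined, conflicts = {}, []
    for model in models:
        for term, definition in model.items():
            if term in combined and combined[term] != definition:
                conflicts.append(term)
            else:
                combined[term] = definition
    return combined, conflicts

# Hypothetical fragments of per-tax data models:
ib = {"Turnover": "total revenue", "Income": "taxable income"}
ob = {"Turnover": "total revenue", "VatDue": "VAT owed"}
merged, conflicts = combine_partial_models([ib, ob])
```

Reporting conflicts instead of merging them reflects the bottom up risk named below: identical symbols from different tax models may carry subtly different meanings and need expert review.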

The downside of this approach is that the semantics may lose their fit with the real life meaning. A top down specification is seen as more favorable: subject-matter experts, such as fiscalists and legal experts, should interpret the tax laws and specify the semantic metadata. This reinforces the link between symbol and actual meaning. Ultimately this should result in a better organizational fit with the law the tax office is to carry out. In practice this is harder than the bottom up approach and remains in the pilot phase for the moment. The top down approach is shown in Figure 18 as well.

Figure 18: Top down specification of metadata for creating the partial taxonomy (top) and bottom up reuse of existing metadata (bottom). Created by author.


Creation of Dutch Taxonomy Logius receives the partial taxonomies in January. Each partner is responsible for drafting its partial taxonomy against the then current Dutch Taxonomy Architecture. Logius is responsible for normalizing the partial taxonomies by combining the overlapping semantics in the GEN-base. Besides the GEN-base there are domain specific parts within the Dutch Taxonomy. In July the alpha versions are released for public review. This is followed by a period of consultation and modification lasting until November 1st, when the beta version of the Dutch Taxonomy is released. The beta version is considered error free, but is published to be certain. On the 1st of December the definitive version is published.

Dutch Taxonomy Architecture Regarding requests for changes in the Dutch Taxonomy Architecture (NTA) there is a well defined protocol. Changes can be put on the agenda in several ways. These include bringing them up during a meeting of the associated expert group and through request-for-change-forms. Each request must meet a predefined format and motivation. Before any decisions are made there is a period in which all stakeholders can review the request and provide their comments. The request for change is judged on 7 criteria. Among these criteria there is the check on a set of hierarchically ordered architecture principles. Each year the final architecture for next year is determined by the 15th of May. The months up to December are required to implement the requested changes and to test the architecture for flaws and performance.

Principle No.4: Change management Semantic metadata management should be part of standard procedures for organizational changes and review. Well defined procedures, protocols and criteria make versioning more objective and transparent. The well defined timeline for versioning allows all those involved to work towards the deadlines. Strict deadlines also reduce the possibility for endless discussion and delays.

8.4.3 Roles and responsibilities Table 9 shows the testing of the roles found by Janssen, Gortmaker & Wagenaar (2006). The implementation orchestrator was added as a new role. See chapter 9.4.3 and appendix 12.1.6 for further observations on required roles.

Initiator & enabler: Regarding the SBR project the initiative was with the Dutch government. Currently the initiator role is carried out by the SBR team at Logius. The SBR team cooperates with the public bodies in the SBR program and maintains support amongst the intermediaries, software developers and end users.

Developer: Within the tax office there is not a single developer for the further implementation of semantic metadata. Various departments are working on semantic metadata from different perspectives. For the SBR project a project manager is responsible. Regarding the cooperation between the partners in the SBR project, the requirements for cooperation are sought in three fields: technology, data and processes.

Standardization: The major technology standards have been set by using XBRL and a taxonomy. There is a council in which the participants are able to discuss the Dutch Taxonomy Architecture. Protocols for changes to the architecture exist.

Control & process monitoring: Both within the Digipoort and in the three phases of receiving messages at the tax office the process of information exchange is closely monitored. At the tax office the department Ontvangen en Mededelen is responsible for reviewing message integrity.

Facilitator: There are various steering committees that facilitate cooperation between the partners in the SBR project. These so-called expert groups are centered around various subjects, such as data and architecture.

Service and product aggregator: The service and product aggregator role has not been observed. There is no such thing as a one-stop shop that aggregates services to meet customer needs.

Accountability: Regarding semantic metadata management in the SBR program the SBR project leader is accountable. For creating and publishing the Dutch Taxonomy the Logius taxonomy project team is responsible.

Process improvement: The process improvement role is mainly carried out by Logius. Logius has the overview of the interests of the stakeholders and, being interconnected with each of them, can measure operational performance.

Implementation orchestrator: Semantic metadata management is to be centered around a conceptual set of semantics. This serves as a reference for the implementation in various systems, processes, data models, forms and business rules. Within the tax office the data specification team ensures that semantics are uniformly applied over various back office systems, paper forms, end user software developers and the Dutch Taxonomy.

Table 9: Overview of roles within the tax office/SBR case. Created by author.

Principle 7: Cooperation with stakeholders. The tax office case indicates that all roles are very relevant, except for the service and product aggregator. One additional role has been found, the implementation orchestrator. This role is unique to having a conceptual model for semantic metadata as proposed in principle 1.


8.5 Conclusion of the tax office case All 9 principles were tested on the case and were found to be applicable and relevant. Compared to the youth care case, the tax office had many more of the principles already (partially) implemented or under development. This can be explained by the nature of the organization. For both BJz and the tax office proper information management is a core value, but the tax office is a much larger organization and information management is a more prominent topic, receiving more attention and funds. The tax office has several departments that are focused on different aspects of managing information and its exchange, ranging from the conceptual level down to the development of proprietary programs. Together these departments are several times larger than the entire organization of BJz. Having a dedicated IT staff of over 2,000 people allows for a more mature infrastructure than a staff of about a dozen. Additionally, the information is much more standardized than in the case of BJz: numbers and tables are easier to support with semantics than written text.

Even with such capabilities and level of maturity, the case study shows that there is a lot of potential for semantic metadata management in this particular PPIC. This is primarily because SBR is quite new and the ad hoc processes are in the process of being formalized.

Recommendations for the tax office The first priority of the tax office should be the creation of a conceptual level in order to align the primary processes and their implementation, within and beyond organizational boundaries. The conceptual level may act as a point of reference, a master data file. Given the scale of the effort, this will not be the first principle to be in effect. Even during development, the insights that are gained can be put to use in other projects that are implemented earlier, such as the two described below.

Second, change management protocols must be in place. The management of semantic metadata should support follow-on processes, such as its implementation in processes, protocols and technology. In the tax office case a number of changes regarding processes and the corresponding data models must be processed every year. Before changes can be implemented the design, including the semantic metadata, must be finished and reviewed. Communication with partners in the chain and across the organizational implementation takes time. This reduces the one-year window to only a few months.

Third, the tax office should use an adequate set of tooling for the various metadata management activities. Given the amount of semantic metadata, the rapid cycle of changes and the number of people involved, alignment is very challenging. Tooling helps to capture and record all actions and results of metadata management efforts.


9 Evaluated architecture
This chapter presents the evaluated architecture. This architecture is the result of the validation of the preliminary architecture by means of two complementary in-depth case studies and an expert review.

The format of the reference architecture is discussed in section 2.4. The preliminary architecture is based on theory as found in the literature review. The lessons from the literature review that are present in this architecture are indicated in the blue tables in chapter 5. The red boxes in the literature review and case studies relate to lessons learned during the validation phase. The lessons from the literature review that proved correct remain; those that were falsified have been removed. The additional lessons learned from the case studies and expert review have been added.

This chapter starts with a section on the preconditions. This is followed by the design principles and the reference architecture overview. Additional knowledge is provided in the tradeoffs that complete the reference architecture. The chapter ends with the conclusions from the expert session.

9.1 Preconditions
This section lists a number of preconditions for the setting in which the reference architecture and the corresponding design principles must be viewed. For any situation that does not meet the preconditions the value of the reference architecture is not guaranteed. In such a situation it can be presented as one of the arguments to go through a process of change and cooperation that will result in a situation that does meet the preconditions.

1) The reference architecture has been designed to be applicable within the scope as defined in chapter 1.4: the public private information chain (PPIC). Any applicability of the reference architecture outside the defined scope is speculative.
2) Managerial support among partners in the PPIC is required for inter-organizational cooperation. Far-reaching cooperation is of little use when there is no desire or trust to do so.
3) Managerial support and organizational changes are required within each organization. Metadata management must be on the agenda and must be incorporated within the enterprise architecture.
4) Exchange of information and information products should already take place within the PPIC. Semantic metadata management over organizational boundaries is not the first topic one should pick to start cooperation and develop a new chain. It becomes a relevant topic once it is known what information is to be exchanged.
5) An intermediate level of both information and IT maturity is required before semantic metadata management is of any use; otherwise efforts are better spent in those areas first. At too low a maturity level the benefits of semantic metadata management are limited and additional implementation challenges arise.
6) There must be a basic level of common vocabulary and semantics within the PPIC. An infrastructure that enables reuse of metadata and information is of little use in a situation that holds only unique information.


7) There must be agreement on which added value the metadata management effort is focused. How the metadata management architecture is arranged specifically depends on which benefits are desired most, e.g. automation, reduction of complexity, information quality, etc.
8) The scope of the semantics to be shared should be determined beforehand. All relevant external metadata should be shared within the PPIC. Non-relevant external and internal semantic metadata should not be shared in this manner.

9.2 Design principles
This section lists the design principles of the reference architecture. In total 9 principles have been derived. Each principle is provided with a rationale and a brief description of the implications that follow from adhering to it. These design principles complement each other and are all interrelated to a certain level. Their interdependence is shown in section 9.3, which shows how these principles translate into a generic, high-level organizational model for semantic metadata management.

No. 1: Conceptual metadata model
Statement: Use conceptual, generic semantic metadata as an intermediate level between specification and implementation.
Rationale: There is a gap between those who specify the business information needs (and thus metadata) and the technical implementation. This gap may be bridged by an intermediate level in which the conceptual design is laid down, a practice that is common in many other fields of design. This level may act as a master data layer, combining the various business needs and many technical implementations into a single, normalized, non-overlapping data model. This conceptual semantic metadata model is much easier to understand and maintain than a multitude of coexisting implementations. As such it may not only act as an interface between the business and technological layers, but also among multiple stakeholders in the PPIC. And as a master data file it may be the lead model after which all implementations are modeled, creating consistency.
Implications: Conceptual designs often exist for process design and technological architectures, but are lacking for semantic metadata and must be created. Such an implementation requires organizational and cultural changes. The conceptual metadata model is not a one-time development effort, but should be continuously updated. A dedicated group should be given the responsibility for this task and should be equipped with the right skills, tools and managerial support. Operating as a hub between the business and technological level requires communication skills, knowledge of business requirements and a view on technological implications. With multiple stakeholders involved (as in a PPIC) communication skills and a solid approach are even more important.
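The idea of a single, normalized conceptual model that is mapped to several technical implementations can be illustrated with a minimal sketch. This is not taken from the thesis; the concept names, system names and XBRL-style tags are invented for illustration.

```python
# Hypothetical sketch: a conceptual metadata model as a master data layer
# between business specification and multiple technical implementations.
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One entry in the conceptual (master) metadata model."""
    concept_id: str
    name: str
    definition: str
    # How the concept is realized per technical implementation,
    # e.g. {"tax_xbrl": "bd-i:AnnualIncome", "crm_db": "income_year"}.
    implementations: dict = field(default_factory=dict)


class ConceptualModel:
    """Normalized, non-overlapping set of concepts shared in the PPIC."""

    def __init__(self):
        self._concepts = {}

    def add(self, concept: Concept):
        # Enforce the non-overlapping property: one id, one concept.
        if concept.concept_id in self._concepts:
            raise ValueError(f"duplicate concept: {concept.concept_id}")
        self._concepts[concept.concept_id] = concept

    def implementation_of(self, concept_id: str, system: str) -> str:
        """Translate a conceptual term into a system-specific term."""
        return self._concepts[concept_id].implementations[system]


model = ConceptualModel()
model.add(Concept("C001", "annual income",
                  "Gross income of a person over one calendar year",
                  {"tax_xbrl": "bd-i:AnnualIncome", "crm_db": "income_year"}))
print(model.implementation_of("C001", "crm_db"))  # income_year
```

The single `ConceptualModel` object plays the role of the master data file: all implementations are modeled after it, and the duplicate check mirrors the non-overlapping requirement of the principle.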


No. 2: Mapping with business rules
Statement: Semantic metadata should be mapped to business rules on both the conceptual and the implementation level.
Rationale: Business rules and semantic metadata are both closely linked to the primary process. A change in the semantic metadata may have impact on the associated business rules, or vice versa. Introducing a link between the two (either tight or loose) provides additional insights for those who design and coordinate business processes, and it reduces the effort of change management. This allows the involved stakeholders to act proactively: the impact of changes and inconsistencies is known beforehand, rather than only emerging when faults are made and detected in the primary process.
Implications: Process design and data architecture are often separate responsibilities within organizations, making links between semantic metadata and business rules uncommon. Both on the conceptual level and in functional designs, business rules and metadata are normally managed in different tools and procedures, lacking support for any kind of coupling. Once the coupling issue is resolved this new functionality must be actively used before it is of any value. The management processes and protocols for either product must be changed to take the other product into account. Depending on the situation the implementation may vary from simple automated change notifications to close cooperation.

No. 3: External metadata
Statement: The semantic metadata should be stored and managed independently from the core data.
Rationale: Metadata has a different purpose and different properties than the core data it describes. For a machine the vicinity of the metadata to the core data is of no concern; only the end user requires a clear representation. The information system performs best when both are stored and managed in ways that best match their individual properties. To do so the metadata must be separated from the core data. Storing semantic metadata externally makes it easier to manage and reduces data quantity and complexity, while all functionality is maintained or even expanded.
Implications: When metadata is not yet stored externally, legacy data must be migrated; most data warehousing implementations support this activity. In small and simple systems using external metadata increases complexity and overhead. In a setting with large volumes, processes that alter data, and information exchange with various parties, this overhead structure greatly increases performance. Due to its importance this structure requires robustness, since when the link between metadata and data is lost, information is lost. Robustness is achieved by a good technological design, supported by protocols (testing) and competent staff.
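A minimal sketch of this principle, with invented store and field names: core records carry only a reference to externally managed metadata, and the robustness concern from the rationale shows up as an explicit check for dangling references.

```python
# Hypothetical sketch: semantic metadata stored externally, separate from
# the core data it describes; the link is a metadata identifier per record.

metadata_store = {   # managed independently from the core data
    "md-income": {"definition": "Gross yearly income", "unit": "EUR"},
}

core_data = [        # core records carry only a reference to their metadata
    {"value": 42000, "metadata_ref": "md-income"},
]


def describe(record):
    """Resolve a record's external metadata. The robustness of this link
    matters: a dangling reference means the meaning of the data is lost."""
    ref = record["metadata_ref"]
    if ref not in metadata_store:
        raise KeyError(f"dangling metadata reference: {ref}")
    return metadata_store[ref]


print(describe(core_data[0])["unit"])  # EUR
```

Because the metadata lives in one external store, a definition can be corrected once instead of in every record, which is exactly the manageability gain the principle claims.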


No. 4: Change management
Statement: The metadata management approach should support change management and versioning.
Rationale: The metadata architecture is never static. The organizational structure, primary process, actor constellation, technology and all other components will change over time. Having more stakeholders, as in a PPIC, makes changes more frequent, and they may potentially impact more processes and stakeholders. Versioning must be incorporated into the management design, since it will be a recurring activity that, if not well embedded within the management structure, may be detrimental to quality.
Implications: Change management requires well-defined protocols that are agreed upon by all involved stakeholders. Frequently releasing new versions creates a better match with operational needs, but versioning has its costs and many versions may create a confusing situation. Furthermore, it must be determined who has what access level to the semantic metadata. This also relates to how the use of metadata is enforced: strict enforcement creates consistency but limits professional judgment.
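Versioning as described here can be sketched as definitions that are released rather than overwritten, so that past information products remain interpretable. The definition texts and dates below are invented.

```python
# Hypothetical sketch: versioned semantic metadata. Each change produces a
# new version with an effective date instead of overwriting the old one.
from datetime import date


class VersionedDefinition:
    def __init__(self):
        self._versions = []  # list of (version, valid_from, definition)

    def release(self, valid_from: date, definition: str) -> int:
        version = len(self._versions) + 1
        self._versions.append((version, valid_from, definition))
        return version

    def at(self, when: date) -> str:
        """Return the definition that was in force on a given date."""
        applicable = [v for v in self._versions if v[1] <= when]
        if not applicable:
            raise LookupError("no version in force on that date")
        return max(applicable, key=lambda v: v[1])[2]


income = VersionedDefinition()
income.release(date(2010, 1, 1), "gross income, excluding benefits")
income.release(date(2011, 1, 1), "gross income, including benefits")
print(income.at(date(2010, 6, 1)))  # gross income, excluding benefits
```

Keeping the full version history is what allows an information product created under an old definition to be interpreted correctly later, which is the quality concern the rationale raises.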

No. 5: Adequate tooling for a multi-stakeholder context
Statement: Semantic metadata management efforts and processes should be supported by adequate tooling.
Rationale: Semantic metadata management consists of a variety of processes, including publishing present and past metadata models, reducing redundancy, removing unused metadata, translating concepts into implementation, and various others. Many of these tasks can be supported by tools. Tools are advisable since many of these tasks are complex, complicated and larger than a single person can grasp. Tools may reduce the management effort, provide a structure for working, log management activities and reduce errors.
Implications: Semantic metadata management in chains often requires other tools than those currently in use at most organizations. This requires new tools to be adopted or developed. As indicated, tools may aid in a variety of metadata management processes. These multiple roles for tooling may result in multiple tools being used alongside each other. The need for and importance of tools varies with the setting of implementation.

No. 6: Consistent data model
Statement: The relations among concepts in the data model should be defined and consistent.
Rationale: On the conceptual level the semantics can merely be listed and stored, but a model that defines relations is preferable. If inconsistencies are found and removed at the conceptual level they are avoided in practice. This results in lower error rates within the primary process, creates fewer versioning requests and reduces effort, as a single conceptual model is changed instead of redoing various primary processes. Links among semantics are also required to add additional context and to indicate the impact of change.
Implications: This design principle has two implications. First, if not yet present, relations among semantics must be defined. This requires the active involvement of subject-matter experts. Additionally it requires a semantic metadata model that supports relations and tooling to be able to define the relations. Second, the consistency requirement demands review of the relations by those with subject-matter experience.
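The kind of consistency check this principle asks for can be automated at the conceptual level. The sketch below, with invented concept names and relation types, flags relations that point at undefined concepts and cycles in the specialization hierarchy.

```python
# Hypothetical sketch: a consistency check on the conceptual data model.
# Every relation must point at a defined concept, and specialization
# ("is_a") relations must not form cycles.

concepts = {"person", "taxpayer", "income"}
is_a = {"taxpayer": "person"}          # specialization relations
references = {"income": "taxpayer"}    # other typed relations


def inconsistencies():
    problems = []
    for rel in (is_a, references):
        for src, dst in rel.items():
            if src not in concepts or dst not in concepts:
                problems.append(f"undefined concept in relation {src}->{dst}")
    # Detect cycles by walking each specialization chain upwards.
    for start in is_a:
        seen, node = set(), start
        while node in is_a:
            if node in seen:
                problems.append(f"cycle in is_a involving {node}")
                break
            seen.add(node)
            node = is_a[node]
    return problems


print(inconsistencies())  # []
```

Running such a check before a model version is released is one way inconsistencies can be "found and removed at the conceptual level" so they never reach the implementations.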


No. 7: Cooperation with stakeholders
Statement: The metadata management coordination structures and roles must be defined and implemented.
Rationale: When cooperating across various organizational departments and among stakeholders in the PPIC it is wise to define coordination structures. These are processes and protocols for cooperation, alignment, change management, versioning and strategic decision making. Additionally, formalized roles and responsibilities provide clarity on who does what, aside from creating trust among cooperating stakeholders.
Implications: Formalizing semantic metadata management in a PPIC creates two types of new organizational structures and processes. First, internally within organizations a structure is required to create and maintain the conceptual level metadata. Second, coordination among stakeholders within the PPIC should take place. The design on this level depends on the level of cooperation.

No. 8: Adaptable architecture
Statement: The metadata architecture should be adaptable in order to respond to changing needs.
Rationale: Lessons learned from IT projects indicate that the requirements on IT systems change over time. The reasons for change may include the use of new technology, changes in the primary processes or a change in the stakeholder constellation. With a larger number of stakeholders involved, the chances that requirements change increase. Should the architecture not be adaptable, changes are either not made or made at a later stage, reducing alignment with the needs. Additionally, such changes require more effort and are more expensive.
Implications: In order to meet changing demands the architecture needs to be adaptable. This means that there are not only protocols for changes within the architecture, but that changes to the architecture itself are possible as well. It also means that the architecture is well defined and documented, so that when a modification is proposed the effort and the impact are clear.

No. 9: Mapping semantics with implementation
Statement: There should be a loose coupling with the technical implementation using standardized interfaces.
Rationale: The conceptual layer should be linked to the technical implementation, since the implementation, not the concepts, is what is used in the day-to-day primary processes. On the one hand this allows an organization, or an entire PPIC, to be in control of the technical implementation and the semantics used in the primary process; any translation error between concept and implementation is detrimental to performance. On the other hand the use of standardized interfaces allows a wide variety of systems, databases and forms to be used without much alignment effort. Orchestration of the implementation includes alignment of various systems, consistency checks and change management. Orchestration should empower the implementations, but its existence should not reduce the design space.
Implications: Orchestration is difficult without any links between concept and implementation. These links should provide insight into how the concept is translated into reality. The existence of various technologies and legacy systems requires the coupling between concept and implementation, and among systems, to be standardized. With the design of the right interfaces, coupling can be achieved without (seriously) impacting existing technologies and future design space.
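Loose coupling through a standardized interface can be sketched as a small shared contract that every implementation provides, so the orchestrator never depends on system internals. The class and system names below are hypothetical.

```python
# Hypothetical sketch: each implementation exposes the same small contract
# for looking up which local term realizes a conceptual term, so systems
# can be aligned or swapped without changing the conceptual layer.


class MetadataInterface:
    """The standardized contract every implementation must provide."""

    def local_term(self, concept_id: str) -> str:
        raise NotImplementedError


class XbrlTaxonomy(MetadataInterface):
    def __init__(self, mapping):
        self._mapping = mapping

    def local_term(self, concept_id):
        return self._mapping[concept_id]


class LegacyDatabase(MetadataInterface):
    def __init__(self, mapping):
        self._mapping = mapping

    def local_term(self, concept_id):
        return self._mapping[concept_id]


def impact_of_change(concept_id, systems):
    """Orchestration task: which local terms must be reviewed when a
    conceptual term changes."""
    return {name: s.local_term(concept_id) for name, s in systems.items()}


systems = {
    "xbrl": XbrlTaxonomy({"C001": "bd-i:AnnualIncome"}),
    "legacy": LegacyDatabase({"C001": "INCOME_YR"}),
}
print(impact_of_change("C001", systems))
```

Because the orchestrator only uses the shared `local_term` contract, a new or legacy system can join the constellation by implementing that one method, which is the "standardized interface" idea of the principle.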



9.3 Reference architecture overview
The 9 design principles in the final reference architecture have been designed to be implemented in sync. They complement each other, and their combined effect exceeds the sum of their individual effects. The actual implementation will differ per case in which the architecture is applied. Each of the principles has some degree of leeway: they can be carried out at different levels of maturity. For instance, a conceptual metadata model can be realized on paper, in a relational database or in a dedicated software tool.

Even though each implementation will differ, it is possible to show how the principles are interrelated in a generic way. Figure 19 shows all 9 principles embedded in a conceptual overview. This model shows how the conceptual level is located between the business level and the technical implementation. The arrows indicate the direction in which metadata is specified. Table 10 contains remarks on the positioning of the design principles in Figure 19.

Categories for semantic metadata specification
The categories on the right-hand side of the architecture overview have been added to provide better insight into the three levels that are portrayed. The levels are based on enterprise architecture models (sources) and are confirmed by the case studies and expert opinions. All categories relate to external semantic metadata.

The goals and strategy are at the top of the enterprise architecture and determine what business processes take place. The business information needs are the metadata requirements on the business level, which are to be represented by the conceptual level and provided by the implementation. The object model is the highest level of concepts, in which all concepts are unified into a single non-overlapping data model showing both relations and meaning. The functional design does the same for each implementation, meaning a functional design exists for each implementation, thus allowing for partial overlap of concepts among functional designs. The concepts of the functional design are translated one-to-one into a technical design, which translates the universal concepts into a syntax a computer can use, such as XML, XBRL and many others. The technical implementation portrays semantic metadata as implemented within the technology itself, such as in electronic forms. Within the technical implementation resides the actual data, such as instances of XML messages or rows in a relational database.
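The one-to-one translation of a functional-design concept into a technical design can be sketched for XML, one of the syntaxes mentioned above. The element layout and datatype below are invented for illustration, not taken from any actual taxonomy.

```python
# Hypothetical sketch: rendering one functional-design concept as a
# schema-like XML element, i.e. the step from universal concept to a
# syntax a computer can use.
import xml.etree.ElementTree as ET


def to_technical_design(concept_name: str, datatype: str) -> str:
    """Render one functional concept as an XML element definition."""
    element = ET.Element("element",
                         name=concept_name.replace(" ", "_"),
                         type=datatype)
    return ET.tostring(element, encoding="unicode")


print(to_technical_design("annual income", "xs:decimal"))
```

The point of the sketch is that the technical design is derived mechanically from the functional concept, so consistency between the two levels can be maintained by regeneration rather than by hand.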


Figure 19: Overview of reference architecture for metadata management in a PPIC. The numbers correspond to the numbers of the design principles. Created by author.


1: Conceptual metadata level. The overview figure shows an intermediate level between the top level from which metadata originates and the bottom level in which metadata is implemented in various systems and processes. Combined with principle 2 there are also generic business rules on this level, which are likewise implemented one level lower.

2: Link to business rules. Both on the conceptual level and in the implementation the business rules are linked to the semantic metadata. How they are linked depends on the implementation; the link represents the minimum coupling between the two: for each concept it is possible to see how it relates to one or more rules, and vice versa.

3: External metadata. On the implementation level the semantics are stored separately from the actual data. This allows them to be managed more easily. Combined with principle 9 it allows the semantics in use to be coupled with the semantics on the conceptual level. In turn this coupling allows the organization, or stakeholder constellation, to be in control of their organization.

4: Change management. Change management ranges from the business level to the implementation level. The number is positioned between the business and conceptual level since the majority of the change management takes place at this level.

5: Adequate tooling. The tooling is positioned at the entire conceptual metadata level. In this area tooling is most valuable for semantic metadata management. The tooling supports the tasks carried out in principles 2, 4, 6, 7 and 9.

6: Consistent data model. The consistency is positioned at the conceptual level. If the data model is consistent at the conceptual level the consistency will cascade to the implementation level. This principle is of key importance to realize the alignment between the business and implementation level.

7: Cooperation with partners. The cooperation with partners is located on the business level, since cooperation on this level is required should organizations want to (partly) share the same conceptual metadata design or align the primary processes.

8: Adaptable architecture. The principle of an adaptable architecture is applicable to the entire architecture. When the design on the conceptual level is adaptable, then it can easily incorporate significant changes on both the business level and the implementation level. The adaptability of the implementation is partly assured by implementing design principles 2 and 9.

9: Interfaces/loose coupling. This principle acts as an interface between the conceptual level and the implementation. This coupling shows how changes on the conceptual level impact the implementation. This indicates where changes will/must take place, which can be reviewed preemptively.

Table 10: Remarks on the positioning of the design principles in the reference architecture overview figure. Created by author.


9.4 Tradeoffs
The tradeoffs make up the second part of the reference architecture. They cover topics that are deemed important but could not be covered by the design principles. The reason that they are not principles is that they are not prescriptive features. The design principles have some leeway, but the tradeoffs have a much larger design space. That design space is required since organizations operate within different contexts, have different goals and priorities, and differ in ambition and resources. The name 'tradeoff' is used since it has the connotation of deciding on the preference for one set of characteristics over another: a balance of interests is required. Tradeoffs allow the designer to apply them in a way that matches the views and demands of the stakeholders. Additionally they have the connotation of being interdependent: there is much freedom in their application, but they also impact other elements in the reference architecture.

9.4.1 Cooperation archetypes and growth path
The level at which alignment takes place determines the transaction costs that are incurred. Figure 20 shows three archetypes of cooperation between two parties. The alignment can also be organized as a growth path, starting at the technology level and moving up to data and process alignment. The nature of the primary processes limits the level of alignment that can be realized.

Low-level alignment vs. high-level alignment
Alignment at lower levels of the enterprise architecture is easier to achieve. Cooperation at higher levels reduces the transaction costs the most, but it also requires more alignment effort, which is costly. The disparity between organizational goals may limit the level of alignment that can be achieved.
- The cooperation between A and B takes place on the operational level. Information products including metadata are exchanged, but there is no alignment; the metadata is available to a human end user but cannot be reused. In practice this situation is common, but without the conceptual level. This figure, including the conceptual level, has been adopted in order to show that creating a conceptual level might be a first step in a growth path.
- The cooperation between K and L takes place on the data level. Parts of the data model that are used by the other party but are not covered by the own data model are copied, allowing for easier exchange and reuse of semantic metadata.
- The cooperation between X and Y takes place on the business level. This means that alignment takes place on the business level, not only on the data level. Since the business level determines the data level, a common set of concepts is required.

Figure 20: Cooperation models within the stakeholder constellation. Created by author.


9.4.2 Metadata publication and commonality
Four generic types of semantic metadata publication can be identified. Each of them is presented in this section. Within an information chain, however, multiple typologies may exist side by side. In practice this means that information chains are more complex than the archetypes presented here.

Using a common set of semantics vs. independence
Using a single common semantic metadata standard results in the lowest transaction costs, but requires significant alignment efforts that limit the freedom of self-determination.

Publishing semantics vs. using a common set of semantics
Publishing semantics counters loss of information quality and makes interpretation easier, but does not allow for easy reuse. Reuse is enabled by common semantics. Having common semantics requires cooperation and consultation, for which costs are incurred.

Situation ABC shows a situation in which each organization specifies its own metadata and keeps this metadata repository to itself. When information is exchanged metadata might be included, but it cannot be reused internally, and things like definitions are often not included. This means that information is partially lost and translation costs are high.

Situation EFG shows a situation in which all organizations publish their metadata. This means that when E receives a message from G it is possible to look up the external metadata. This means that not all external metadata has to be added to each information exchange. In the same way F may use G’s metadata when creating a product specifically for G.

In situation KLM the overlap among each other's metadata is harmonized, either through cooperation or because it is enforced. All red metadata is standardized and can thus be easily exchanged and reused internally. Information products that feature a mixture of common and unique metadata pose challenges.

Figure 21: Governance archetypes. Created by author.

In situation XYZ a single common metadata standard is in use. This standard covers all semantics in use within the organizations, or at least all semantics that are exchanged with others in the PPIC. This means that all metadata exchanged within the PPIC is in common use and can be easily exchanged and reused internally.

9.4.3 Roles in semantic metadata management
A number of roles within an organization operating in a PPIC have been identified. Depending on the amount of work to be carried out, multiple roles can be assigned to a single person, or an entire department can be assigned a single role. With respect to the roles defined by Janssen, Gortmaker & Wagenaar (2006), the roles have been reworked to match semantic metadata management. The service and product aggregator role was found not to be relevant in this domain. The end user, information analyst and implementation orchestrator have been added. Appendix 12.1.6 shows the original table and a reflection on the changes that follow from the case studies.

Formalized roles vs. professional judgment
The roles that are described can be highly formalized. This results in clear ownership of tasks and aids communication: it is known who is responsible for each defined task. A drawback is that relevant tasks that are not defined may not be carried out by anyone, since nobody feels responsible for them. Fulfilling a role can also become a goal in itself, leading to inefficiencies. When roles are not strictly defined a more flexible situation is created, which is not as inherently transparent. The execution of the required roles can then be reviewed periodically to verify that coordination amongst professionals is sufficient.

Initiator & enabler: This role is to convince and stimulate agencies to participate in and commit to an automated process execution. Often it is necessary to educate agencies on the basics of the technology and to show the potential advantages.

End user: The end user is the one who actually uses the semantic metadata. In general these are the experts within the primary process that handle the information that is provided with semantic metadata. Given the number of organizations and different types of specialists within the PPIC this group is rather large and heterogeneous. As such the end user is both the expert in the own organization as well as the next link in the chain. In metadata management the explicit role of the end user is to verify the validity and applicability of the common set of semantics.

Developer: Defining the requirements for each organization in order to enable cross-agency processes. This role involves the identification of the organizations and departments involved and determines the interests, objectives, and requirements for each of them.

Information analyst: The information analyst ensures a good fit between semantics within the information system and the requirements in the primary process. Semantics have a lifecycle, starting at specification and requiring constant alignment.

Standardization: Technology interface standards should be determined and set as a standard. Existing systems can be selected as standard, but it can also be better to develop and impose new, preferably open standards.

Implementation orchestrator: Semantic metadata management is to be centered around a conceptual set of semantics. This serves as a reference for the implementation in various systems, processes, data models, forms and business rules. Whether this conceptual set is self-maintained or imposed, the application of the metadata in the various forms of implementation must be orchestrated.

Control and process monitoring role: The time-dependent sequence of activities needs to be managed. All unexpected events should be tracked as soon as they occur and analyzed to determine what actually did happen and why, to ensure reliable cross-agency process execution.

Facilitator: This role facilitates the implementation of cross-agency processes by collecting and disseminating best practices, reference models, and reusable system functionality such as identification, authentication, and payment. Ideally, components are shared when possible and duplication of efforts is avoided.

Accountability management: Governmental decisions should have accountability. This role should ensure that the motivations behind decisions made by each agency and the performance and outcomes of the complete cross-agency process can be accounted for.

Process improvement: Changes in processes and governmental rules often affect more than one agency. This role should maintain an overview of the cross-agency processes and define mechanisms and procedures to assess the implications of changes in law, technology, and other developments.

Table 11: Roles in semantic metadata management. Adapted from Janssen, Gortmaker & Wagenaar (2006), created by author.

9.4.4 Consultation protocols
Semantic metadata management requires cooperation amongst stakeholders, both within and across organizational boundaries. Having a common set of semantic metadata in a situation without absolute hierarchy demands that, at a given moment, other stakeholders be consulted whenever changes are deemed necessary. Consultation is a repetitive operation in metadata management and protocols can safeguard these management activities.

Early consultation vs. late consultation of stakeholders
Early consultation allows the additional input of other stakeholders to jump-start the change management process. In case the proposed changes are a no-go or a very different direction is chosen, hardly any effort has been wasted. The drawback of early consultation is that there is hardly any material for the consulted parties to review; much depends on the presentation skills of the initiator to convey a clear picture. Consultation at a later moment is much more focused on the presented proposal, avoiding confusion or a lack of momentum. The risk is that effort is wasted when there is no agreement. Figure 22 shows in which phases of change management consultation can be located.


Figure 22: Representation of potential positioning of moment of consultation within change management cycle. Created by author.

The positioning of the moment of consultation is related to the responsibilities and decision making authority the stakeholders possess. This means that stakeholders can be grouped and multiple moments for consultation can be planned, with varying stakeholder constellations.

Frequent meetings vs. avoiding endless conversation
Low thresholds for triggering meetings may lead to endless conversation and little action. Setting the thresholds too high may result in dysfunctional cooperation.

Metadata management is a continuous process. In the previous section roles have been identified; protocols determine how these roles interact with each other. Consultation can be ensured by implementing triggers for meetings. Active management of interfaces may indicate at which points consultation is required:
- Consultation should take place when the primary processes change. An indication of such a change is a change in business rules.
- Consultation should take place when interfaces or relations between objects change.
- Consultation should take place when technologies and data models change.
- A list of unresolved issues may be used as input for meetings. Problems that do not require immediate attention can be noted and will not be forgotten. The same applies to good ideas and opportunities.
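To make this concrete, the triggers above can be sketched as a small registry that maps change events to the stakeholder group that must be consulted, with unmatched events parked on the issue list for the next scheduled meeting. This is an illustrative sketch only; all event names and group names are hypothetical, not taken from the cases.

```python
# Illustrative sketch: consultation triggers for semantic metadata management.
# Event names and stakeholder groups are hypothetical examples.

TRIGGERS = {
    "business_rule_changed": "primary process owners",
    "interface_changed": "chain partners sharing the interface",
    "data_model_changed": "system owners and data modellers",
}

class IssueList:
    """Unresolved issues, ideas and opportunities, kept as meeting input."""
    def __init__(self):
        self.items = []

    def note(self, description):
        self.items.append(description)

def consultation_needed(event, issue_list):
    """Return the group to consult for a change event, or None if it can wait.

    Events that do not trigger immediate consultation are noted on the
    issue list so they are not forgotten."""
    group = TRIGGERS.get(event)
    if group is None:
        issue_list.note(event)
    return group
```

In this sketch the thresholds discussed above correspond to how many event types appear in the registry: a larger registry means more frequent meetings, a smaller one shifts topics to the periodic issue-list review.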


9.4.5 Versioning protocols
Versioning is an integral part of semantic metadata management. Semantics have a lifecycle: they have to be specified, applied, reviewed, changed and removed. In a PPIC the frequency of change is even higher, since within each link a request for modification of the semantics may occur. Versioning is covered by principle 4, but several tradeoffs remain.

Long versioning cycle vs. fast versioning cycle
A low rate of versioning allows time for consultation, has lower annual implementation costs and creates more awareness of the changes in each version. A higher frequency allows for a better fit with changes and demands in the primary process. The time between an identified need for change and its implementation is reduced, but the overview of versions and their changes may be lost.

Actively removing unused semantics vs. no removal
Semantics that are not used can be actively traced and removed. Removing unused semantics requires additional effort, but results in a smaller, better fitting and more up-to-date set of semantics that reduces clutter and improves the overview.

Top down vs. bottom up specification
Semantics can be specified by experts and enforced on the end users in the primary process. Another option is crowdsourcing, by allowing the end users to add their own metadata. This may result in a better fit and allows for more freedom, but may also lead to synonyms and lower quality metadata.

9.4.6 Functions of tooling
A number of tasks regarding semantic metadata management can be supported by tooling. Tooling is an investment which may reduce alignment efforts. In theory all activities can be carried out by people using pen and paper, although this is not recommended.

Using tools right from the start vs. later in the process
Using dedicated tools from the start may ease the implementation of a more mature form of semantic metadata management and may result in a more consistent approach. Adopting tools later in the process allows the best tools to be selected using the lessons learned in the meantime; however, at that moment it may also be much harder to switch tooling.

Single tools vs. multiple tools
Having multiple roles carried out by a single tool is often less costly and often requires less effort. The drawback is that dedicated tools may perform better and allow a single function to be switched to another tool; with multiple functions in a single tool, it is a package deal.

Designed to order vs. off the shelf tools
Developing tools to fit the particular needs of the organization may produce better results. On the other hand, such tools are more expensive than off-the-shelf products and might not fully conform to standards.


Decision making on tooling requires insight into the functions that tooling may have. The following list describes a number of functions. Which functions are required depends on the specific situation. Any number of functions can be present in a tool and several tools can be used side by side.

1. Repository. First of all, a metadata management tool is used as a repository for external semantic metadata. A repository acts as a single point of storage, ensuring all metadata is captured and no metadata remains out of view. Such a repository may also be used to preserve metadata that is no longer in use. A repository tool may be integrated with the metadata server or use stand-alone data. Often the repository function is combined with one or more of those described below.
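The repository function can be sketched as follows: a single point of storage in which retired entries are preserved rather than deleted, so that metadata no longer in active use remains available for interpreting older data. This is a minimal illustration under assumed names, not a prescribed design.

```python
# Minimal sketch of the repository function; class and method names are
# illustrative assumptions.

class MetadataRepository:
    """Single point of storage for external semantic metadata.

    Retired entries are preserved rather than deleted, so metadata that
    is no longer in use stays available for older data."""

    def __init__(self):
        self._entries = {}   # term -> definition, in active use
        self._retired = {}   # term -> definition, preserved but inactive

    def store(self, term, definition):
        self._entries[term] = definition

    def retire(self, term):
        # move the entry out of active use without losing it
        self._retired[term] = self._entries.pop(term)

    def lookup(self, term):
        # active entries take precedence over preserved ones
        return self._entries.get(term) or self._retired.get(term)
```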

2. Relation management. A metadata management tool can also be used as a tool in which relations between the semantics are drawn. These relations exist in code, but a relation management tool may provide a graphical representation that matches human perceptual capabilities better. It may also act as a tool to simplify the coding, combining a simple graphical user interface with a coding engine. Both practices are common in other types of coding, such as software and website development. Relation management is of particular use in taxonomies and ontologies.

3. Access. Metadata management tools can be used to access metadata in a different way than an IT-system would: a way that is more suitable for human end users who want to access the metadata directly. Motives for accessing the metadata directly could be communication and review; in both cases a graphical user interface is an added value. Options within an access tool may be querying the metadata or categorizing metadata, thus creating semantic metadata regarding semantic metadata.

4. Business rules. A tool that maps semantic metadata to business rules, or even stores business rules. There are strong links between business rules and semantics: a change in the semantic metadata may have impact on the associated business rules and vice versa. Tooling may support forms of loose coupling such as references or a mapping.

5. Version control. Another possible use of a metadata tool is version control. Since organizational requirements are not static, semantic metadata will change over time. Within networks the rate of change may be even higher due to the greater number of stakeholders. Regarding metadata versioning there are two approaches: first, a new metadata set may be released on an interval basis; second, changes can be made continuously with checks before or after each change. A tool may support both types of processes. Furthermore, it is often valuable to track which changes have been made, or to be able to access old versions that are associated with older data. For managing the versioning process it may be useful to add versioning process metadata, such as which metadata has been checked and by whom, and which partners in the chain nominated the metadata.
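The interval-based approach with versioning process metadata could be sketched as below: each release of a term keeps its full history, together with who checked it and which chain partner nominated it. All class and field names are hypothetical illustrations of the idea, not an actual tool.

```python
# Sketch of interval-based version control for semantic metadata,
# including versioning process metadata (checked_by, nominated_by).
# All names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class MetadataVersion:
    term: str
    definition: str
    version: int
    checked_by: list = field(default_factory=list)  # reviewers of this version
    nominated_by: str = ""                          # chain partner that nominated it

class VersionedRepository:
    """Keeps the full version history so older data can still be interpreted."""

    def __init__(self):
        self._history = {}  # term -> list of MetadataVersion, oldest first

    def release(self, term, definition, nominated_by=""):
        versions = self._history.setdefault(term, [])
        versions.append(MetadataVersion(term, definition, len(versions) + 1,
                                        nominated_by=nominated_by))
        return versions[-1]

    def latest(self, term):
        return self._history[term][-1]

    def history(self, term):
        return list(self._history[term])
```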

6. Reduction. Metadata management tools may also be used to reduce overlap and redundancy. Different systems and/or databases often use the same data. In a metadata repository duplicates may be removed, resulting in a reduction of metadata with improvements in performance and consistency. Additionally, there may be different standards in use with an overlap in semantics. Some tools allow for more than one representation of the definition, ensuring that changes are carried out consistently over the various data standards in use.

7. Translation. Metadata management tools may also aid in the translation of metadata to different formats. Within a network it is possible that more than one metadata standard is used; the adopted standard may differ from legacy systems that are still in use. A metadata management tool may map metadata one-to-one, but may also provide predefined rules for translation. Translation may be done either periodically or on the fly; the former is less flexible but more reliable and less complex than the latter.
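The translation function can be illustrated as a one-to-one mapping between two metadata standards, with an optional predefined rule as fallback for unmapped terms. The example terms and names are hypothetical.

```python
# Sketch of the translation function: one-to-one mapping between metadata
# standards, with an optional predefined rule for unmapped terms.
# Class, term and rule names are illustrative assumptions.

class MetadataTranslator:
    def __init__(self, mapping, fallback=None):
        self.mapping = mapping    # term in standard A -> term in standard B
        self.fallback = fallback  # predefined rule applied to unmapped terms

    def translate(self, term):
        if term in self.mapping:
            return self.mapping[term]
        if self.fallback is not None:
            return self.fallback(term)
        raise KeyError(f"no translation for {term!r}")

    def translate_batch(self, terms):
        # periodic translation of a whole set: less flexible than on-the-fly
        # translation, but more reliable and easier to verify as one batch
        return [self.translate(t) for t in terms]
```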

8. Orchestration of processes. A tool that aids in cooperation amongst various roles and organizations. Semantic metadata management is a series of processes that are carried out by people. Information exchange and planning of these processes can be supported by tooling. It may include relation management, notification of deadlines, gathering requests for change, indicating level of participation, publication of new versions, etc.

9. Testing. A tool that simulates performance for a new version or a larger volume of data, for instance by applying historic data sets. External metadata has its own technical infrastructure. Since both metadata and core data make up the information, both must be available. If the capacity of the semantic metadata infrastructure cannot meet the demands, a bottleneck is created that impairs information services.


9.5 Expert session
During an expert session the reference architecture, as validated by the case studies, has been reviewed, as indicated in chapter 2.2.4. On the one hand these experts are part of the target group of the reference architecture. On the other hand they are in the best position to judge its value, since they are senior experts in the field of semantic metadata management.

Expert selection
The experts have been selected for their holistic view on semantic metadata management and their complementary expertise. Each expert has years of experience, and metadata management is a key part of their daily activities, unlike many IT experts for whom metadata is one of many topics. Expert 1 has been involved for many years with a multitude of metadata management initiatives at the tax office. These include projects very close to the actual implementation, mapping several implementations for an improved data model and sharing semantics with various other stakeholders. Expert 2 specifies semantics “on behalf of the experts at the business level” for use in the technical implementation of processes at the tax office. Expert 3 has been involved with merging the partial taxonomies in the SBR project and creating a stable release, which can be considered “one of the largest mergers of semantics coming from multiple stakeholders operating in the same field”.

The experts are complementary since they have a different focus. Expert 1 is most aware of the link with the technology, expert 2 is closer to the business processes and expert 3 is confronted with cooperation with partners. All experts share a holistic view, since they are aware of the various facets of metadata management, including cooperation with other stakeholders, metadata specification, change management and the link with technology.

The expert session focused on the design principles, but the tradeoffs were also reviewed. The experts agreed on the topics chosen as tradeoffs, but noted that more topics could be viewed as tradeoffs. There was no agreement on what these topics ought to be. The experts agreed that the design principles are mostly a balancing act between features and investments, and that the described tradeoffs are mostly about balancing different features of the semantic metadata management effort. The principles should be carried out up to a level that is limited by available effort and funds. The tradeoffs are carried out up to a level that suits the interests of the stakeholders best.

During the expert sessions four aspects of the design principles were reviewed: 1) The experts indicated whether they agreed with each individual design principle and its rationale. 2) The second aspect called for a holistic view on all principles regarding consistency and completeness: the principles should not conflict, cancel each other out or overlap too much. 3) All principles that were found valid were ranked in order of importance, ranging from 1 to X. 4) All principles that were found valid were ranked in order of chronology of implementation, ranging from 1 to X.

Validity of the principles
Regarding the first step, all principles were found relevant and correct by all the experts. Various comments were given and the rationale and implications were adjusted to be more internally consistent or understandable. The second aspect did not lead to the removal of any design principles. There was unanimous agreement that none of the principles were conflicting. However, the question was raised whether change management and adaptability were the same. The rationale behind this division is that change management is related to the contents and adaptability is related to the architecture. The experts did agree that it is possible to have a versioning approach in place that functions well, but within a rigid system that is unable to accommodate changes in structure and technology. Additionally, it was found that all major topics were covered. Some lower level topics were added to the rationale or the implications of the design principles.

Ranking the design principles
Since all principles were found valid, all principles were ranked by the experts in order of importance. These rankings are shown in Table 12. Having a consistent metadata model and a conceptual metadata level both rank very high. This is no surprise, since these are part of the reason metadata management is carried out. Change management follows with a rank of 3 or 4. According to expert 1, “combining all semantics and sharing them with partners did not prove as hard as expected, keeping track of all changes that followed is what proved to be the major challenge”. Expert 3 adds: “determining the release frequency, processing all comments and providing a stable release is much more difficult than most technical challenges”. While the experts agreed on the higher ranks, the lower ranks show hardly any similarities. Unlike the other experts, Expert 1 ranked the adaptable architecture lowest as “it is nice to think ahead and make a system durable over time, but making a working system is hard enough already and we have grown accustomed to work with systems that are not adaptable”. Expert 2 ranked adequate tooling the lowest since it has very little to do with the contents and the primary processes the semantics are to support, but did indicate that “without tooling there is no metadata management”. Expert 3 ranked the mapping with business rules the lowest since “it is something that is nice to have, but even though it can save time, this can be done manually”.

Design principle                 Expert 1 (JB)    Expert 2 (CZ)    Expert 3 (SK)
                                 Rank  Planning   Rank  Planning   Rank  Planning
1. Conceptual metadata level      1     1          1     1          5     1
2. Mapping with business rules    6     2          5     3          7     3
3. External metadata              4     2          7     4          2     3
4. Change management              3     2          4     3          3     2
5. Adequate tooling               7     3          9     4          2     1
6. Consistent data model          2     2          3     2          1     1
7. Cooperation with partners      8     3          6     4          4     1
8. Adaptable architecture         9     3          2     1          3     2
9. Mapping with implementation    5     2          8     4          6     3

Table 12: Overview of ranking and planning (chronology of implementation) by experts. Created by author.


Chronology
The fourth aspect is included since importance does not necessarily equal chronology: principles which are less important may lay the foundation for those that are deemed more important. This ranking is shown in Table 12. Interestingly, the experts did not rank the principles from 1 through 9, but each of them independently opted for using 3 or 4 stages. Their rationale was that some design principles could, and perhaps should, be implemented concurrently. A three-stage approach is also easier to communicate than a nine-stage one.

All experts indicated that having a conceptual semantic metadata level was the place to start, even though it is a difficult topic to start on. The implementation of a conceptual level will mature over time: at first it may exist only on paper, while later, with all protocols and tooling in place, the management effort will truly bear fruit and the benefits will outweigh the effort. Implementing change management is the runner-up. According to an interviewee from the tax office, “change management is where the real challenge of metadata management lies”. Early implementation allows change management to become accepted and a routine.


10 Conclusion
This chapter presents the conclusions of this research project. First, section 10.1 presents the answers to the research questions that form the pinnacle of this study. Subsequently, section 10.2 reflects on the reference architecture that is the main result of this study. Section 10.3 presents the societal value of this research and provides recommendations. Then section 10.4 discusses the scientific contribution and remaining knowledge gaps that provide a basis for further research. Section 10.5 finalizes the conclusion with a personal reflection.

10.1 Conclusions
This section presents the main conclusions of this research project. The potential and complexity of semantic metadata management are presented first. These findings match the findings and opinions of other research, subject-matter experts and real life examples from the case studies. This section continues by drawing conclusions on the evaluated reference architecture. Subsequently the cornerstones of semantic metadata management in PPICs are presented. These are followed by a view on implementation and on how the unique characteristics of the reference architecture provide their merit.

10.1.1 Potential benefits of using a common set of semantic metadata
The first research question asked: Why is metadata mentioned in a wide range of solutions to an even wider range of challenges in large cross organizational IT-systems? It is believed that using a common set of semantic metadata can increase the quality and speed of creation and exchange of information products, while at the same time costs and effort can be reduced even further. All information sharing activities are aimed at one objective: having the right information available to the end user, with as little loss, time delay and clutter as possible. Using external semantic metadata enables a number of benefits not attainable before and removes a number of existing barriers:
- Electronic information exchange reduces transmission costs for information, which are only a part of the overall transaction costs. Those transaction costs also include translation. Commonality in semantics removes most of the costs regarding translation on both ends of the information exchange.
- Retrieval and reuse of existing information is made much easier because information is indexed in a manner that suits the content and the processes in which the information is used. This makes creation of new information products less time consuming. In turn this allows more time to be spent on activities that add more value.
- Increased use of semantics, improved consistency among semantics and a better fit with the primary processes improve overall information quality. The link between contents and their actual use is safeguarded. Especially amongst organizations operating in a PPIC, information quality and insight in the quality level are much improved.
- Semantics do not only provide context to data for human end users; information technology is also able to use semantics. This allows for further automation of processes normally carried out by people. Technologies that are enabled or improved include better workflow support, business intelligence, data mining and automated quality checks.


10.1.2 Complexity as fundamental challenge
The second research question asked: What makes implementing semantic metadata management within networks of organizations difficult? Semantic metadata management is required in order to use semantic metadata effectively in a PPIC. Semantic metadata management is primarily an alignment effort and partially a standardization effort. The alignment effort is very challenging due to the complexity of the situation in which it is carried out. The complexity of metadata management can be characterized by the following four properties. Each property reinforces complexity due to the number of relations, making alignment difficult.
- There are strong interdependencies between the technology in use, data models and formats, semantic metadata management processes, the actual primary processes, stakeholders and their various interests.
- Compared to other topics in IT, semantics have a much closer link to primary processes and end users. Subject-matter experts need to be included in metadata management and coordination with the end user is required.
- Metadata management takes place both within the own organization as well as between the organizations that cooperate within the PPIC. Cooperation among stakeholders has very different dynamics compared to processes within an organization. This makes it more difficult and the results unpredictable.
- Certain components or activities are inherently complex by themselves. For instance, mapping all relations in a single conceptual data model is challenging both from a technological (tooling) point of view and from a subject-matter point of view.

Many of the challenges regarding information exchange in PPICs are artificial; they are not inherently complex. Challenges have arisen by creating connections between systems, processes and organizations that were never designed from the outset to be interconnected in such a way. The benefits of this interconnectivity are desirable. However, the many related inefficiencies and incidents are not.

This complexity, we believe, is the reason that well orchestrated semantic metadata management is currently not very common in PPICs. Since semantic metadata management touches on so many aspects of the organization, it can be considered an organizational redesign. Stopgap measures have proven more attractive in the short term. Ad hoc efforts are triggered by incidents and focus on certain aspects of semantic metadata management. For instance, creating a point-to-point information exchange between two specific systems is a much simpler solution when looking exclusively at the transmission costs of information exchange. However, according to the cases, translation makes up most of the transaction costs and often remains unchanged.

Fortunately, awareness of and interest in semantic metadata management are increasing. Ad hoc coordination and stopgap measures are a reactive approach. Semantic metadata management should be a proactive approach that coordinates and streamlines management activities in order to deal with the complexity. This is believed to reduce the complexity and to allow the potential benefits to materialize sooner and at lower costs.


10.1.3 Evaluated reference architecture
Three research questions remain unanswered. Together, the answers to these research questions have led to the development of an evaluated reference architecture for semantic metadata management in Public Private Information Chains.

The preliminary architecture in chapter 6 answers the third research question: Which technological and organizational aspects should be incorporated in the reference architecture according to literature? The case studies in chapters 7 and 8 answer the fourth research question: What architectures do we find in practice for metadata management within the Dutch government? The evaluated reference architecture in chapter 9 includes the answer to the fifth and final research question: What design principles can be derived from the application of the preliminary architecture on the cases?

Answer to the main research question The reference architecture in chapter 9 answers the main research question: What design principles are required and what tradeoffs still have to be made in a reference architecture for semantic metadata management in public bodies that operate in a public private information chain?

The 15 elements that make up the evaluated reference architecture are nine design principles and six domains for tradeoffs. The design principles are prescriptive, while the tradeoffs provide leeway by balancing the characteristics of the chosen approach to semantic metadata management. The design principles include creating a conceptual metadata model, mapping semantics with business rules, using external metadata, having change management incorporated, using adequate tooling for the multi-stakeholder context, having a consistent data model, cooperation with stakeholders, using an adaptable architecture and mapping semantics with implementation. The domains for the tradeoffs are cooperation archetypes, metadata publication and level of commonality, roles in semantic metadata management, consultation protocols, versioning protocols and functions of tooling.

The evaluated reference architecture was derived by applying the preliminary reference architecture, which was based on scientific literature, to two complementary case studies, followed by an expert review. The reference architecture is regarded as complete: all best practices from literature and practice could be grouped under the 15 elements that make up the evaluated reference architecture, or proved to be not relevant. Conclusions related to the content of the evaluated reference architecture can be found in the next section, followed by conclusions related to the chosen format.

10.1.4 Fundamental solution
The reference architecture is the answer to the main research question of this thesis; a summary is provided here. The solution for metadata management presented in the reference architecture is based on mitigating the main challenge and reinforcing one of the main potentials: reduction of complexity. As stated earlier, much of the complexity regarding information exchange in PPICs is artificial. Introduction of metadata management enhances performance over time by reducing complexity. This reduction is achieved through establishing alignment between processes, data, technology and organizations. The semantic metadata management approach that is introduced in this research has two pillars. First, a conceptual model is introduced. Second, the relations between all components in the organizational architecture are actively managed.

There are three reasons why having a conceptual model for semantics within the organization is desirable. First of all, it acts as a master file, being a single authoritative location for semantics. In its role as a master file it can be used as a guide for specifying and implementing changes in semantics and for ensuring consistency among processes and implementations. Second, it acts as a bridge between the source of the semantics, the primary processes, and their implementation in technology and data models. Experts in the primary process who are consulted to specify and review the semantics in use often have little knowledge of the technical implementation; therefore, they might find it hard to grasp the context and review software code. Third, it acts as a bridge between the organization and the other organizations within the PPIC, making it easier to communicate and to prove compliance with agreements and standards.

The active management of relations is the second pillar of the solution. The effort takes place on various levels and is captured in procedures. First of all, the relations among the semantics on the conceptual level are mapped. The dynamics of the relations over time are covered as well by including versioning. Second, the relations between concepts and implementation are mapped; this mapping relates to both technology and data models. Third, the link between the actual meaning and the implementation is carefully monitored, which comprises consulting subject-matter experts from the primary process and mapping the business rules that are used within those processes. Finally, in the ideal model there is interaction and coordination amongst the partners in the PPIC.

10.1.5 Implementing semantic metadata management
The general direction of the solution is described above. This is the first step towards implementation. Developing the actual implementation, including both physical components and the intangible roles and procedures, requires a much higher degree of detail. The optimal implementation strongly depends on the specific context in which semantic metadata management is implemented. The reference architecture developed in this research supports the persons tasked with semantic metadata management, both within and between organizations. The reference architecture provides the final picture, making it easier to structure the many pieces of the puzzle.

The solution presented in this thesis is generic. The design principles and tradeoffs apply in a similar way to both private and public organizations. Moreover, they apply to organizations with different base maturity levels in technology, data management and processes, and with varying levels of ambition on this topic. In an information chain the diversity in stakeholders and their interests is a given. A certain degree of commitment and effort can be expected from the partners in the chain, but much of the alignment should not interfere with their own processes or bring an additional burden. The solution in this research deals with this problem. Even though the solution is primarily aimed at providing benefits in inter-organizational information exchange, it is beneficial to individual organizations as well.


The reference architecture in this thesis differs from many ‘classic’ IT reference architectures dating from the 1990s. These types of architectures have shown a number of pitfalls that have been avoided in this reference architecture. First of all, they depicted a utopia which assumed a green field to start with, and they would provide their merit only once fully finished; the current structure of an organization or a roadmap were not included. The proposed reference architecture allows for incremental implementation and transition in an order and pace that suits the balance of interests of an organization. Second, most reference architectures have the end goal that all implemented infrastructures are identical and therefore compatible. This reference architecture improves alignment but allows for much freedom and diversity to suit organizational needs and characteristics. Finally, the classic reference architectures have a very strong focus on information technology and little regard for the end users and the primary organizational processes that they support. The presented reference architecture does include primary processes, their professionals and their business rules, in order to maintain the relationship between the semantics in the system and their real life meaning and application.


10.2 Reflection on evaluated reference architecture The preliminary reference architecture was based on literature and then evaluated by applying it to two case studies. Many individual topics presented in literature proved right in reality. However, the interrelations and tradeoffs between topics found during this research were hardly mentioned in literature. In reality each topic had more depth, issues and tradeoffs.

The reference architecture is very generic by nature and contains little of the specifics that are found in many other architectures, such as those from the IEEE. This was clear from the start of the research, as shown in chapter 2.4. The generic nature is required since the organizations that will have to apply the architecture differ a lot in technology, data and processes. They also differ in maturity levels, size and competences. Even within a PPIC there are many different types of organizations. This reality imposed a new requirement which was not found in literature before the start of the research.

The reference architecture covers the most important topics, specifically those linked to semantic metadata management. These are addressed in the design principles and tradeoffs. Due to this method of selecting which topics the reference architecture covers, some are on a very high level (strategic choices on the level of cooperation) while others reach much lower levels (tooling). The topics that are not covered in depth can be found in the scientific literature referred to throughout this thesis.

The reference architecture provides a good view of what the desired end state looks like, but provides limited advice on how to get there. Initiation of cooperation and growth stages are mentioned, and some topics, like cooperation and integration levels, are covered. However, there are no clear approaches or guidelines for growth towards the desired situation. Including a growth path was never part of the research plan. Given the added value of such a feature, this omission has been partly addressed by having the experts list a number of growth stages for the design principles.

10.2.1 Assumptions

The reference architecture is based on a number of assumptions. Should any of them prove faulty, the impact on the validity of the architecture should be reviewed. The main assumptions are:

- A reference architecture helps during the design phase in creating awareness, easing communication, exchanging knowledge and providing insight into relations and tradeoffs.
- Alignment within the organization and across organizational boundaries is possible, even with so many variables and components that have to be aligned.
- Every relevant actor in the PPIC cooperates and takes an active part, at least at the interface (standards) level. In reality the chain (or even network) can break without full cooperation, and alternative solutions have to be found.

The final assumption relates to the methodology:

- Having two extensive, complementary cases provides enough validation to label the reference architecture as evaluated.


10.2.2 Test on quality indicators

Before the reference architecture was created, a number of quality indicators were listed in chapter 2.4.6. In Table 13 the evaluated reference architecture is compared against these quality indicators. In general the comparison shows that the reference architecture meets the goals set in advance.

Interdependencies & tradeoffs: Literature provided many insights into components for semantic metadata management, but hardly any interdependencies and tradeoffs. The figure in which all design principles are combined into a single generic enterprise architecture for semantic metadata management shows the interdependencies between the design principles. The tradeoffs are specifically listed as well. As such, both interdependencies and tradeoffs in characteristics are covered.

Multiple perspectives & roles: The final reference architecture encompasses the views, requirements and best practices of many (if not all) perspectives and roles. This was assured by using experts with various backgrounds, including the end users in the primary process, those who check the system's compliance, those responsible for data models, those who specify and maintain semantics, those responsible for technology and information exchange, those coordinating inter-organizational cooperation, and so on.

Generic solution, neutral: The evaluated reference architecture is a very generic solution applicable to a wide range of organizations. It encompasses commonly used technologies, although applied in a new context. Regarding tooling for semantic metadata there are very few vendors. However, none are explicitly mentioned and the functions of tooling are described in a generic way. Since no existing tool fulfills all roles, a mixture has to be acquired in any case.

Science & best practices: The preliminary architecture is based on a literature review, while the evaluation is carried out on two case studies. The final evaluation was carried out by experts with hands-on experience. Therefore both requirements are met.

Laws & regulations: Using the reference architecture does not automatically result in compliance with laws. There is enough room for compliance while adhering to the design principles at the same time. With semantic metadata management in place it actually becomes easier to prove compliance, since processes and semantics will be well defined, aligned and described.

Concise, understandable, easy to communicate: Spanning multiple pages, the reference architecture is not very compact. Given the complexity and large scope of the architecture it can also be considered very concise; there is no actual criterion on what makes a reference architecture concise or lengthy. The architecture by Angelov and Grefen is only a few pages long, the NORA well over 300 pages including appendices. The experts viewed the reference architecture as understandable, and the use of jargon was avoided where possible.

Open design space: Leaving as much design space as possible to meet conditions unique to the given situation was one of the premises of the design of the reference architecture. The design principles capture the prescriptive part of the reference architecture; even though prescriptive, they still offer room. The tradeoffs offer a lot of design space, since they allow for different solutions in many areas.


Understandable to various backgrounds: There is very little jargon and the technological nature is limited. This makes the reference architecture easy to understand while the complexity and depth are still maintained and visible. This characteristic was shown during the expert session with experts of different backgrounds.

Table 13: Test on quality indicators. Created by author.

10.2.3 Test on quality indicators for design principles

Given that the design principles are of such importance to the quality of the reference architecture, special quality indicators for them have been listed in chapter 2.4.6. Each quality indicator is addressed in Table 14. There are no quality indicators dedicated specifically to the tradeoffs that make up the other half of the reference architecture; their content has already been covered under the general quality indicators.

Understandable: Despite the complex nature of the research topic, the design principles proved understandable, both on their own and as a set. This has been corroborated by the experts that reviewed the principles, who mentioned that they were clearly presented and understandable.

Robust: The robustness of the design principles is questionable, as they have been designed to leave as much of the design space as possible open to the needs of the specific situation. Their application will result in roughly similar, but not identical, designs in near-similar situations.

Complete: During the expert sessions no principles were found to be missing. Topics that were deemed important but are more a tradeoff than a principle are covered in the listed tradeoffs.

Consistent: The consistency was tested during the expert session. None of the experts who reviewed the design principles found any inconsistencies. Additionally, the principles have been combined in a figure with three levels to show their impact and mutual relationships.

Stable: The stability of the design principles is to be proven over time. However, they have been designed with adaptability in mind and should be able to cope with new technologies, trends in data usage, and new processes and stakeholder constellations.

Table 14: Test on quality indicators for design principles. Created by author.


10.3 Recommendations

The value for society of this research project materializes in several ways. In the long term the evaluated reference architecture may aid the performance of Public Private Information Chains. In the short term the evaluated reference architecture is a practical tool for mapping existing initiatives and creating awareness. Additionally, practical recommendations have been made for the specific challenges presented in the tax office and Bureau Jeugdzorg case studies.

Long term benefits

The long term goal of this research project is to contribute to increasing the effectiveness of Public Private Information Chains. If, within PPICs, a common set of semantic metadata can be adequately managed and transaction costs can be reduced even further while improving information quality, this presents great added value to society as a whole. The exact value of this research in that grand setting is hard to determine; the short term benefits are much more tangible.

Short term benefits

In the short term the evaluated reference architecture can be used to map existing semantic metadata management efforts in order to determine the maturity within organizations. The holistic approach can be used to align efforts that already take place independently or on an ad hoc basis.

Moreover, the evaluated reference architecture may be used as a communications tool to create shared awareness among stakeholders, both within and across organizational boundaries. Once the awareness is present that semantic metadata management enhances organizational performance, it may set off a chain reaction. Awareness allows a cost benefit analysis to be carried out and ownership to be determined. Actions that are already carried out can be consciously linked to semantic metadata management and aligned with other initiatives. Ownership and accountability will eventually result in implementation.

Finally, several no regret measures can be implemented. No regret measures provide benefits even when using a common set of semantics within the PPIC is not achieved. Which initiatives are no regret measures depends on the organizational characteristics. The most likely candidate is the use of a conceptual level of semantics. Even when other organizations do not participate in using a common set of semantics, there are benefits to having a conceptual level. Mapping all processes, data models and technology to that conceptual level results in increased control over, and insight into, operational performance. This is valuable to organizations seeking to increase operational excellence, reduce the number of errors, increase efficiency or prove adherence to regulation.
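To make the idea of a conceptual level concrete, the sketch below shows how organization-specific field names can be mapped onto shared conceptual terms, so that data from different systems can be compared through one point of reference. All system and field names are hypothetical illustrations, not taken from the case studies.

```python
# Minimal sketch of a conceptual level: each system keeps its own field
# names, but every field is mapped to one shared conceptual term.
# All names below are invented for illustration.

CONCEPTUAL_MAPPING = {
    "system_a": {"bsn_nr": "citizen_id", "geb_datum": "date_of_birth"},
    "system_b": {"SSN": "citizen_id", "birthDate": "date_of_birth"},
}

def to_conceptual(system: str, record: dict) -> dict:
    """Translate a system-specific record to the shared conceptual terms."""
    mapping = CONCEPTUAL_MAPPING[system]
    return {mapping[field]: value for field, value in record.items()}

record_a = to_conceptual("system_a", {"bsn_nr": "123", "geb_datum": "1980-01-01"})
record_b = to_conceptual("system_b", {"SSN": "123", "birthDate": "1980-01-01"})
assert record_a == record_b  # both systems now speak the same conceptual language
```

Even if only one organization maintains such a mapping, it gains the increased control and insight described above, which is what makes the conceptual level a no regret measure.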

Recommendations for Bureau Jeugdzorg

The first thing BJz should do is create a conceptual level in order to align the primary processes and to align the semantics of the individual cases with the creation of aggregated reports. This will steadily improve the quality of management information, allowing better decisions to be taken in all areas. Having a conceptual level is a no regret measure. Even when other organizations in the chain do not adopt the model, it helps in three ways: it makes internal information exchange easier, it results in more accurate management information, and it makes the output of BJz easier to interpret for other parties such as the courts and care providers.


The second priority should be externalizing the semantic metadata and adding a form of tagging. This unlocks the contents for advanced search options. The employees within BJz would immediately save a significant amount of time during their daily activities, which will alleviate some of the work pressure that BJz employees perceive as high. In addition, it will make versioning less costly and less time consuming.
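As an illustration of what externalized metadata with tagging could enable, the sketch below keeps semantic tags in an index outside the documents, so that advanced search does not require opening each file. The document ids and tags are invented examples, not BJz's actual records.

```python
# Sketch: semantic metadata stored externally as tags per document id,
# instead of buried inside the documents themselves. Hypothetical data.

documents = {
    "case-001": "free-text case report ...",
    "case-002": "free-text case report ...",
}

# External tag index: document id -> set of semantic tags.
tags = {
    "case-001": {"supervision-order", "under-12"},
    "case-002": {"voluntary-care"},
}

def find_by_tag(tag: str) -> list:
    """Return the ids of all documents carrying the given tag."""
    return sorted(doc_id for doc_id, doc_tags in tags.items() if tag in doc_tags)

# Advanced search without scanning document contents:
assert find_by_tag("supervision-order") == ["case-001"]
```

Because the tags live outside the documents, a new version of a document only requires updating its entry in the index rather than re-editing embedded metadata, which is what makes versioning cheaper.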

Third, semantic metadata should be actively managed by BJz. Roles and responsibilities should be defined; ownership and active management prevent faults and misinterpretation. Protocols for change management should be implemented in order to match the semantics with the needs of the primary process. Those changes must also be aligned with the creation of aggregated reports.

Recommendations for the tax office

The first priority of the tax office should be the creation of a conceptual level in order to align the primary processes and their implementation within and beyond organizational boundaries. The conceptual level may act as a point of reference, a master data file. Given the scale of the effort, this will not be the first principle to take effect. Even during development, the insights that are gained can be put to use in other projects that are implemented earlier, such as the two described below.

Second, change management protocols must be in place. The management of semantic metadata should support follow-on processes, such as its implementation in processes, protocols and technology. In the tax office case a number of changes to processes and the corresponding data models must be processed every year. Before changes can be implemented, the design must be finished and reviewed; this includes the semantic metadata. Communication to partners in the chain and cross-organizational implementation take time, which reduces the one year window to only a few months.
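The review-before-implementation rule described above could be sketched as a small state machine for a single metadata change. The states and the example element name are illustrative assumptions, not the tax office's actual protocol.

```python
# Sketch of a change management protocol for a semantic metadata element:
# a change must pass review and approval before it may be implemented.
# States and transitions are illustrative assumptions.

ALLOWED = {
    "draft": {"in_review"},
    "in_review": {"approved", "draft"},  # reviewers may send a change back
    "approved": {"implemented"},
    "implemented": set(),                # terminal state
}

class MetadataChange:
    def __init__(self, element: str):
        self.element = element
        self.state = "draft"

    def advance(self, new_state: str) -> None:
        """Move to a new state, rejecting transitions the protocol forbids."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state

change = MetadataChange("taxable_income_definition")  # hypothetical element
change.advance("in_review")
change.advance("approved")
change.advance("implemented")
```

A protocol like this makes it explicit that a draft can never be implemented directly, which is exactly the gatekeeping the one year change window depends on.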

Third, the tax office should use an adequate set of tooling for the various metadata management activities. Given the amount of semantic metadata, the rapid cycle of changes and the number of people involved, alignment is very challenging. Tooling helps to capture and record all actions and results of metadata management efforts.


10.4 Scientific contribution

This section reflects on the chosen methodology and the scientific literature that has been found. Additionally, given the low level of granularity of the evaluated reference architecture, there is much room for further research.

10.4.1 Reflection on methodology

The main methodology used in this research project is design science, using the framework by Hevner. A design approach based on both rigor and real life requirements has proven valuable. Table 7 shows the 14 steps that led to the evaluated reference architecture: the first 8 steps relate to using the existing knowledge base, the final 6 rely on real life practices. Although design science has proven valuable, there are some reservations as well. Design science leaves little room for exploratory research; it is hard to assess and refine a design when the existing knowledge base is inadequate. Moreover, the final design is to be determined by the stakeholders that are involved.

The focus on an artifact with well defined characteristics also leaves little room for process designs that focus on stakeholder constellations and interaction. The evaluated reference architecture has a unique format that combines the rigor proposed by Hevner with the leeway required to respond to multi-stakeholder complexities. The end result adds value to the design science approach. The tradeoffs that complement the prescriptive design principles add leeway to the rigor of design science.

10.4.2 Reflection on literature

The available literature can be characterized in two ways. On the one hand there is an enormous amount of literature available that is only partly relevant to semantic metadata management. On the other hand there is little literature focused on semantic metadata management across organizational boundaries. The conclusion is that the overall framework is missing, but that many individually relevant topics are well researched.

Most available literature has a single focus. A holistic view is missing; the role of each topic in the constellation of other topics is not defined. In practice, initiatives on all levels proved interrelated, resulting in tradeoffs or even mutual exclusiveness. Many research projects conclude that there is a single truth given a well defined setting. In practice the problems at hand never fit those well defined settings, and multiple options with different characteristics are available. In this research project this proved most prominent in the areas of tooling, stakeholder cooperation, metadata standards and the specification of semantics.

Aside from the engineering and project management approach there is also an area of study related to process design. This area of research is much better able to deal with stakeholder complexity, which is abundantly present in this domain. With process design the focus lies on getting from A to B rather than on what B should look like. Instead of a well defined solution, the solution remains vague. This vagueness is excellent for forging an alliance of stakeholders. The downside is that this domain relies on well defined standards for IT, data models and semantics. Once there is agreement on cooperation, the process should quickly focus on setting such standards.


The holistic view of the reference architecture may provide a context for the more focused literature, allowing those involved with semantic metadata management to identify the relations with other topics and look for potential tradeoffs or mutual benefits.

10.4.3 Grounds for further research

This research has filled in a blank spot in research on semantic metadata management. During this research project a number of questions were answered, but even more emerged. The most interesting questions might form the basis for further research in the areas of semantic metadata management and reference architectures.

Testing the performance in a real life application

The outcome of this thesis is an evaluated reference architecture. Once applied by the target audience it will become a tested reference architecture. Such a test can show whether the architecture is helpful in the design phase or of little use and, either way, in which areas it can be improved.

New generation of reference architectures

The reference architecture presented in this thesis is very different from the classic ones from the 1990s. It was designed in a way that avoids the elements considered flawed in those reference architectures. One of its unique elements is that the approach is incremental and assumes a legacy organization. Additionally, it sets out to improve alignment even though the final implementation and maturity level will differ for every organization within the PPIC. Finally, it can be reviewed whether the increased focus on the primary process translates into a better fit. The assumption that these three features make a reference architecture perform better can be tested.

Emergent behavior and independent alignment

A feature of the reference architecture is emergent behavior in systems architecture. The reference architecture constrains design choices at the micro level; at the macro level this should translate into desirable effects and system behavior. The theory of emergent behavior partially exists already in the field of interface management, but the setting of semantic metadata management in PPICs involves procedures and other forms of non-technical cooperation as well.

Wider applicability of the reference architecture

The reference architecture in this thesis was designed for a very narrow context: public private information chains in the Netherlands. Additional research may be performed to check whether it is applicable in the same situation in other nations. Furthermore, given its generic nature and the inclusion of private parties, it may also be applicable to information chains among private parties only.

Optimum size of information chains and common semantics

This research was based on the two existing and well defined information chains in the case studies. There is reason to believe that there is an optimum span for an information chain and an optimum amount of common semantics. For both factors one can argue that the overhead is disproportionate for a very small span or a very large one.

Implications of semantic metadata management

The impact and implications of every design principle and tradeoff can be researched individually; the scope of this research project did not allow for much depth in the many topics that are touched upon. Semantic metadata management is presented in this research as an enabler of further automation and a mitigation of the side effects of current IT systems. However, it might show undesirable side effects in the future, or present costs and benefits of which science is not yet aware.

Hierarchic information chains

The current approach to semantic metadata management is loosely coupled regarding inbound and outbound information, and also includes metadata management of information that is not shared in the chain. What could be researched are the variables that determine whether a hierarchic approach or the presented loosely coupled approach is more effective. It might be that in small and simple information chains, in chains with a dominant stakeholder, or in chains with a major central node, a hierarchic approach presents economic or managerial advantages.


10.5 Personal reflection

This report presents the results of an eight month research project on semantic metadata management. Although this research has been fruitful in terms of the new insights, knowledge and reference architecture it produced, a number of challenges were encountered during the process as well. These challenges were primarily related to the complexity and scope of the subject and the reference architecture that was to be designed.

First, it proved very difficult to define the scope. The topic of semantic metadata management is very broad. Starting out with ‘metadata management’ I eventually ended up with ‘metadata management regarding external semantic metadata in public private information chains’. Even with such a well defined scope the number of applicable topics, literature, theories and questions remained enormous.

Second, many challenges regarding semantic metadata management are commonly found in large IT-projects spanning multiple organizations. Others proved unique to semantic metadata management. Choices had to be made on which topics and theories to include in this research. I decided to include all topics and theories that seemed to have a major influence on the eventual design, plus some that were unique to semantic metadata management.

Third, it was very difficult to determine what the format of the reference architecture was to be like. During the SEPAM study I encountered a number of reference architectures, all of which differed in size, scope and purpose. Literature on reference architectures in general proved very limited, so I turned to examples of actual reference architectures. Using those examples I defined the scope and nature of the architecture myself.

Fourth, according to literature the 1990s school of thought regarding reference architectures has proven to have some flaws. These include green field thinking, leaving very little design space and having too strong a focus on IT alone. The proposed reference architecture was designed in a way that circumvents these known flaws, using both design principles and tradeoffs.

Fifth, keeping the reference architecture generic proved very hard. It is tempting to show insight into this difficult topic by making a much more in-depth design for a specific situation.

Finally, during the literature review, the case studies, the expert interviews and the interviews with the independent experts a lot of information was gathered. Only a portion was relevant to the narrowed down scope, with the rest providing context to the research domain and case studies. For reasons of readability and confidentiality, only a selection could be incorporated in this thesis. Having a personal desire to provide as much context and support for my observations as possible, it proved very hard to restrict myself to only the most relevant information.


11 References

Albrow, M. (1970). Bureaucracy. London: MacMillan.
Anderson, P. (1999). Complexity theory and organization science. Organization Science, 10, 17.
Angelov, S., & Grefen, P. (2008). An e-contracting reference architecture. Systems and Software, 28.
Baarda, D. B., & De Goede, M. P. M. (2001). Basisboek methoden en technieken: Handleiding voor het opzetten en uitvoeren van onderzoek. Groningen: Stenfert Kroese.
Bakker, J. G. M. (2006). De (on)betrouwbaarheid van informatie: Pearson Education Benelux.
Bass, L., Clements, P., & Kazman, R. (2003). Software Architecture in Practice (2nd ed.): Addison-Wesley Professional.
Bergeron, B. (2003). Essentials of XBRL: Financial Reporting in the 21st Century. Hoboken: John Wiley & Sons.
Bessant, J., & Tidd, J. (2007). Innovation and entrepreneurship. Chichester: John Wiley & Sons.
Bharosa, N. (2011). Netcentric Information Orchestration: Assuring information and system quality in public safety networks. PhD thesis, TU Delft.
Blecker, T., & Kersten, W. (2006). Complexity Management in Supply Chains: Concepts, Tools and Methods. Berlin: Erich Schmidt Verlag.
Borghoff, U. M., & Pareschi, R. (1997). Information Technology for Knowledge Management. Journal of Universal Computer Science, 3(8).
Brandt, S. A., Miller, E. L., Long, D. D. E., & Xue, L. (2003). Efficient metadata management in large distributed storage systems. Paper presented at the Conference on Mass Storage Systems and Technologies, San Diego.
Clements, P., Kazman, R., & Klein, M. (2001). Evaluating software architectures: methods and case studies: Addison-Wesley Professional.
De Bruijn, H., & Ten Heuvelhof, E. (2007). Management in netwerken: over veranderen in een multi-actor context (3rd ed.). Den Haag: Lemma.
De Bruijn, H., Ten Heuvelhof, E., & In 't Veld, R. (2008). Procesmanagement: over procesontwerp en besluitvorming (3rd ed.). The Hague: SDU Uitgevers.
De Leenheer, P. (2009). On Community-based Ontology Evolution. PhD thesis, Vrije Universiteit Brussel, Brussel.
De Leenheer, P., De Moor, A., & Christiaens, S. (2010). Metadataroadmap voor de Vlaamse overheid. Informatie, June.
Debreceny, R., Felden, C., Ochocki, B., Piechocki, M., & Piechocki, M. (2009). XBRL for Interactive Data: Engineering the Information Value Chain. Heidelberg: Springer-Verlag.


Delone, W., & McLean, E. (1992). Information Systems Success: the quest for the dependent variable. Information Systems Research, 35.
Egyedi, T. (2003). Consortia problem redefined: negotiating democracy in the actor network on standardization. International Journal of IT Standards and Standardization Research, 1(2), 17.
Elmasri, R., & Navathe, S. B. (2007). Fundamentals of database systems. Boston: Pearson.
Farrell, J., & Saloner, G. (1985). Standardization, compatibility and innovation. The RAND Journal of Economics, 16(1), 14.
FEA-PMO. (2007). US Federal Enterprise Architecture Practice Guidance.
Fokkema, W., & Hulstijn, J. (2011). Process compliance in public information chains. Paper presented at the IFIP e-government conference 2011, Delft.
Ghosh, S. (2010). Net centricity and technological interoperability in organizations: Perspectives and strategies. Hershey: IGI Global.
Gonzalez, R. (2007). A Concept Map of Information Systems Research Approaches: Idea Group Inc.
Hepp, M., De Leenheer, P., De Moor, A., & Sure, Y. (Eds.). (2008). Ontology Management: Semantic Web, Semantic Web Services, and Business Applications: Springer.
Hevner, A. R., March, S. T., Park, J., & Ram, S. (2003). Design science in information systems research. MIS Quarterly, 28(1), 30.
Hoffman, C., Watson, L. A., Van Hilvoorde, M., Tan, C., Van Egmond, R., & Watanabe, E. (2010). XBRL For Dummies. Indianapolis: Wiley Publishing.
Horan, T., & Schooley, B. (2007). Design science in information systems research. Communications of the ACM, 50(3), 6.
Houtevels, Y. (2010). Master data management. Informatie.
Humphreys, P. K., Lai, M. K., & Sculli, D. (2001). An inter-organizational information system for supply chain management. International Journal of Production Economics, 70, 11.
ICTU. (2006). Het verbeteren van de toegankelijkheid van digitale informatie binnen de Nederlandse overheid: Advies Overheid.nl.
ISO/IEC. (2004). ISO 11179 Metadata Registry.
Jans, E. O. J., Wezeman, K., & Van Dijk, M. (2007). Grondslagen Administratieve Organisatie (20th ed.). Houten, the Netherlands: Wolters-Noordhoff.
Janssen, M. F. W. H. A. (Ed.). (2009). Framing Enterprise Architecture: A meta-framework for analyzing architectural efforts in organizations: International Enterprise Architecture Institute.


Janssen, M. F. W. H. A., Gortmaker, J., & Wagenaar, R. W. (2006). Web service orchestration in public administration: challenges, roles and growth stages. Information Systems Management.
Janssen, M. F. W. H. A., & Van Veenstra, A. F. E. (2005). Stages of Growth in e-Government: An Architectural Approach. The Electronic Journal of e-Government, 3(4), 8.
Janssen, M. F. W. H. A., Van Veenstra, A. F. E., Groenleer, M., Van der Voort, H., De Bruijn, H., & Bastiaansen, C. (2010). Uit het Zicht: Beleidsmaatregelen voor het versnellen van het gebruik van ICT-toepassingen voor administratieve lastenverlichting. Delft: TU Delft.
Kazman, R., Klein, M., Barbacci, M., Longstaff, T., Lipson, H., & Carriere, J. (1998). The Architecture Tradeoff Analysis Method. Pittsburgh: Software Engineering Institute, Carnegie Mellon University.
Kimball, R., Reeves, L., Ross, M., & Thornthwaite, W. (2002). The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses: Wiley.
Lankhorst, M. M., Klievink, A. J., Oude Luttighuis, P. H. W. M., Fielt, E., Heerink, L., & Van Leeuwen, D. (2008). Kanaalpatronen. Delft: Telematica Instituut.
Luther, J. (2009). Streamlining Book Metadata Workflow. Ardmore: NISO.
McClowry, S. (2008). Information Maturity Model. Retrieved 2011, from http://mike2.openmethodology.org/wiki/Information_Maturity_Model
Morgan, T. (2005). Expressing Business Semantics. Northface University.
Muller, G. (2011). A reference architecture primer. Eindhoven: Embedded Systems Institute.
NISO. (2004). Understanding metadata: National Information Standards Organization.
NORA. (2010). NORA 3.0: Principes voor samenwerking en dienstverlening.
OECD. (2003). The e-government imperative: main findings: Organisation for Economic Co-operation and Development.
Osborne, D., & Gaebler, T. (1993). Reinventing Government: How the Entrepreneurial Spirit is Transforming the Public Sector. New York: Plume.
Papazoglou, M., & Ribbers, P. (2008). e-Business: organizational and technical foundations. Chichester: John Wiley & Sons.
Platier, E. A. H. (1996). Een logistieke kijk op bedrijfsprocessen. PhD thesis, Technische Universiteit Eindhoven, Amersfoort.
Rayport, J. F., & Sviokla, J. J. (2000). Exploiting the Virtual Value Chain. Harvard Business Review, 11.
Robertson, S. (2001). Requirements Trawling: techniques for discovering requirements. International Journal of Human-Computer Studies, 55, 17.
Sbodio, M. L., Moulin, C., Benamou, N., & Barth, J. P. (Eds.). (2010). Toward an E-Government Semantic Platform.


Sen, A. (2002). Metadata management: past, present and future. Decision Support Systems, 37, 23.
Silvola, R., Jaaskelainen, O., Kropsu-Vehkapera, H., & Haapasalo, H. (2011). Managing one master data - challenges and preconditions. Industrial Management & Data Systems, 111(1), 17.
Strong, D. M., Lee, Y. W., & Wang, R. Y. (1997). Data Quality in Context. Communications of the ACM, 40(5), 8.
Sun, S., & Yen, J. (2005). Information Supply Chain: A Unified Framework for Information-Sharing. Intelligence and Security Informatics, 7.
TOGAF. (2004). The Open Group Architecture Framework (Version 8.5, Enterprise Edition).
TOGAF. (2007). TOGAF Architecture Principles, Section IV: Resource Base: The Open Group.
Universiteit van Amsterdam. (2010). Knowledge Acquisition and Documentation Structuring. Retrieved from www.commonkads.uva.nl
Vanderfeesten, I. T. P., Reijers, H. A., & Van der Aalst, W. M. P. (2010). Product-based workflow support. Information Systems, 36, 19.
Verschuren, P., & Doorewaard, H. (2003). Het ontwerpen van een onderzoek. Utrecht: Lemma.
Walls, J. G., Widmeyer, G. R., & El Sawy, O. A. (1992). Building an Information System Design Theory for Vigilant EIS. Information Systems Research, 3(1), 24.
Wimmer, M. A. (2002). Integrated Service Modelling for Online One-stop Government. Electronic Markets, 12(3), 7.
Witten, I., & Frank, E. (2005). Data mining: practical machine learning tools and techniques. San Francisco: Reed Elsevier.
WRR. (2011). iOverheid. Amsterdam: Wetenschappelijke Raad voor het Regeringsbeleid.


12 Appendix

12.1.1 Glossary

The desk study revealed that the literature contains many synonyms, homonyms and slight variations in meaning in the nomenclature of the topics discussed in this research project. The glossary below lists the meaning used for this terminology in this thesis.

Architecture: Defined in this research as the constellation of people, processes, data and objects regarding a certain subject. It is descriptive in nature, unlike a reference architecture, which is prescriptive.

Data: A collection of facts, including numbers, text, graphs, pictures, etc. When given a context, data turns into information.

Information: Data which is presented in a context, thus providing meaning for people and computers alike.

Information system: The complex of people, processes, data and technology within an organization that is used to distribute and analyze information.

Metadata: All data about other data. It is divided into administrative, structural and semantic metadata.

Metadata management: The whole set of procedures and tools for the administration, application, alignment and governance of semantic metadata. These efforts are often laid down in protocols and may be supported with tooling.

Public Private Information Chain (PPIC): A value chain that relates to information products and that extends over the organizational boundaries of various public bodies and private parties. This means that multiple heterogeneous stakeholders are involved, each with its own perceptions, motives, responsibilities and resources.

Reference architecture: Defined in this research as prescribing what the constellation of people, processes, data and objects regarding a certain subject should look like.

Semantic metadata: Metadata that adds context and meaning to data, turning it into information. Since it adds contextual information to data, it makes data interpretable, which in turn makes it easier to use for man and machine.

Structural metadata: Metadata that defines the location and structure of data. As such it has no direct relation with the meaning of the data.


12.1.2 Metadata types

The US National Information Standards Organization (2004) distinguishes three types of metadata: administrative, structural and descriptive metadata. In this research the definitions of the NISO are used, although descriptive metadata will be referred to as semantic metadata in order to conform to the most common terminology in use today. All three types of metadata are briefly introduced in order to distinguish semantic metadata from the other types of metadata.

Administrative metadata

Administrative metadata relates to the logging of all operations in a database or data warehouse (NISO, 2004). This type of metadata is closely related to the technology and the performance the technology provides when processing the data. It describes usage and indicates the service level the users are receiving. Administrative metadata is hardly ever visible to the end user and is generally used only by those monitoring the functioning of the IT systems. It has no relation at all with the meaning of the data it describes.
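To make the distinction concrete, administrative metadata can be sketched as an operations log record. This is a minimal, hypothetical illustration in Python; the record fields below are invented for the example and do not come from the NISO definitions.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical sketch of an administrative-metadata record: it describes one
# database operation and its performance, not the meaning of any data.
@dataclass
class OperationLogEntry:
    timestamp: datetime
    operation: str      # e.g. "INSERT" or "SELECT"
    table: str          # which table was touched
    duration_ms: float  # performance indicator, used for service-level monitoring

entry = OperationLogEntry(datetime(2011, 10, 24, 9, 30), "INSERT", "contactjournaal", 12.5)
print(f"{entry.operation} on {entry.table} took {entry.duration_ms} ms")
```

Such a record is typically only inspected by those monitoring the IT systems, matching the description above.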

Structural metadata

Structural metadata describes the structure and logic of the various components of a data object (NISO, 2004). Since it describes a structure, it is recurring and not unique to a certain instance; examples are the design of a table or the structure of a form. For each row in the table the structure remains the same. This makes structural metadata a key element in data warehouses and very important for retrieving and presenting the data desired by the end user (Witten & Eibe, 2005). Structural metadata is not related to the meaning of the data in any way; the structure is developed before data is entered into it. In short, structural metadata is about the whereabouts of and relations among data.
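The idea that the structure is defined once and recurs for every instance can be sketched as follows. This is a hypothetical Python illustration; the table and column names are invented for the example.

```python
# Hypothetical sketch: structural metadata as a table schema. The schema says
# where data lives and how it is laid out, not what the data means, and it is
# defined before any data is entered.
schema = {
    "table": "case_file",
    "columns": [
        {"name": "case_id", "type": "int"},
        {"name": "created", "type": "date"},
        {"name": "body", "type": "text"},
    ],
}

# Two rows of core data: each row conforms to the same recurring structure.
rows = [
    {"case_id": 1, "created": "2011-01-05", "body": "..."},
    {"case_id": 2, "created": "2011-02-17", "body": "..."},
]

column_names = {c["name"] for c in schema["columns"]}
print(all(set(row) == column_names for row in rows))  # every row matches the schema
```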

Semantic metadata

Semantic metadata provides semantics, meaning and context, to a data element (NISO, 2004). Semantic metadata may include tags, labels, definitions, context, concepts, units, references and notes. In short, it represents all data that adds context to other data. The data that is being placed in a context is also referred to as core data. Unlike other types of metadata, semantic metadata is of particular interest to the human end user who is using data for a certain purpose or task (Borghoff & Pareschi, 1997). Implementations of semantic metadata can vary: it may be a general tag or specific to a single data element, and it can be stored externally or with the data itself (Elmasri & Navathe, 2007). Semantic metadata is also directly linked to information quality indicators, such as origin, owner, age and mutation (Strong, et al., 1997).
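As a minimal sketch, the relation between core data and semantic metadata can be pictured as follows. The labels, units and values are hypothetical and only serve to show how metadata turns a bare value into information.

```python
# Hypothetical sketch: a bare value (core data) plus the semantic metadata
# that gives it meaning and context.
core_data = 12

semantic_metadata = {
    "label": "age",
    "definition": "Age of the minor at the start of the case, in whole years",
    "unit": "years",
    "origin": "municipal records",  # links to information quality indicators
}

def as_information(value, meta):
    """Present core data in its context, turning data into information."""
    return f"{meta['label']}: {value} {meta['unit']} (source: {meta['origin']})"

print(as_information(core_data, semantic_metadata))
```

Without the metadata, the value 12 is merely data; with it, both a person and a machine can interpret what it stands for.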


12.1.3 Four domains of stakeholder challenges

In this appendix a number of potential challenges for cooperation among stakeholders are listed. First, a number of challenges related to the characteristics of individual stakeholders are presented. Second, difficulties regarding stakeholder constellations are discussed. Third, challenges regarding knowledge and trust among stakeholders are described. Finally, infrastructures pose their own set of challenges.

Stakeholder characteristics

A number of challenges relate to the characteristics of each individual stakeholder. These challenges are listed below.

• The first issue regarding stakeholders is to determine who the stakeholders actually are in each case. (Potential) stakeholders are affected and involved in varying degrees. Given a certain subject there are one or more parties that are clearly stakeholders, since the subject in question is their core business. However, in the PPIC they operate in there will be various parties that are only partly or marginally involved with the subject in question, but may be very relevant to other stakeholders or may be severely impacted.

• In a PPIC individual stakeholders have varying roles, core businesses, sizes and abilities. Semantic metadata in PPICs relates to the entire enterprise architecture, ranging from business to information technology and from the boots on the ground to top level management. On both these axes there is a wide variety of people with different world views and vocabularies [EXP1]. This makes cooperation and coordination difficult. In this light it is hard to determine where the boundary of a stakeholder lies. Organizations may officially be a single entity but may have various internal stakeholders, such as departments. These internal stakeholders may have conflicting interests or perceptions (De Bruijn, Ten Heuvelhof, & In 't Veld, 2008).

• The attitude may vary among stakeholders. Information oriented organizations with the same core business may range from avant-garde early adopters to very conservative bureaucratic organizations. These different mentalities may create conflicts even though the goals may coincide [EXP2].

Stakeholder constellation

Aside from the characteristics of individual stakeholders, the constellation of stakeholders requires thought. It is hard to determine which stakeholders should be included in the cooperation and what approach and type of network are desirable. A variety of network types and constellations is possible. Networks may differ in span, level of cooperation, number of stakeholders, and type of coupling. Each type of network has its own advantages, drawbacks and opportunities.

• A PPIC does not appear out of the blue. Cooperation and exchange of information (products) will already be prevalent to a certain degree. This means that there are preexisting dependencies and agreements among stakeholders. These may impact the design space and freedom of choice in selecting partners for cooperation [EXP1]. Existing structures may require a growth path for change or be gradually phased out.

• There are various approaches available for starting and maintaining cooperation and alignment in networks. Options for control range from a hierarchy to close cooperation. Cooperation may be highly structured or ad hoc. The technical interconnection among stakeholders may range from loose coupling to tight coupling. Often large government bodies adopt a (near) hierarchic approach due to their power [EXP2]. Smaller bodies are


more likely to seek cooperation and mutual benefit [EXP1]. Inspections are an exception due to their supervisory nature [EXP2].

• The stakeholder constellation will probably not be static, but dynamic. Changes over time may relate to the stakeholders themselves: new actors may enter the arena, existing actors may merge or split up, or cease to exist due to a change of strategy or bankruptcy. Other sources of change over time are new insights, changing requirements, new opportunities, changes in laws and regulations, and various others.

Knowledge and trust

In all situations where multiple stakeholders cooperate and are dependent on each other, there are challenges regarding information dissimilarity and mutual trust. According to Farrell and Saloner (1985) knowledge of standards, partners and competition is key, while Egyedi (2003) claims that trust is the primary issue.

• There is an information dissimilarity among stakeholders. The dissimilarity exists in various areas, including knowledge of potential benefits and cost structures, risks and uncertainties, and stakeholder views and capabilities.

• Within the constellation of actors there may be free riders: those willing to reap the benefits without performing the same level of effort as others, or any effort at all. Aside from free riders, stakeholders may differ in risk aversion. Stakeholders may exhibit a wait-and-see attitude, which may not be appreciated by other stakeholders.

• Trust among partners is key before agreeing to cooperate in a project with large sunk costs. Sunk costs are investments and effort which cannot be capitalized in any other way than the original purpose. Loss of independence requires solid guarantees before commitment.

• In a network it is likely that there are existing dependencies and cooperation. Collaboration on the level of semantic metadata in a PPIC does not start out of the blue. However, the holistic approach may result in new relations that did not previously exist, since many organizations only deal with their direct relations. As a result a new constellation of actors exists with either very strong or hardly any prior ties, creating a knowledge and trust imbalance.

Infrastructures and stakeholders

Infrastructures pose their own set of specific stakeholder related challenges (Blecker & Kersten, 2006). These challenges do not only apply to physical infrastructures but also to information technology based infrastructures (Humphreys, Lai, & Sculli, 2001).

• Infrastructures provide a wide set of benefits, which are hard to determine and sell. Not all benefits are explicitly known beforehand and it is often unknown what part of the potential will materialize. Additionally, the gains are distributed over a large set of stakeholders.

• Infrastructure projects are long term projects with few quick wins. They require investments of time and money beforehand and are only operational when fully finished. A nearly finished bridge with a gap is still useless until that very last segment is installed. The same applies to an IT infrastructure, but in a less visible manner (Humphreys, et al., 2001).

• The sunk costs may seem (or even be) insurmountable. Infrastructures require high sunk costs in order to reduce small, highly repetitive costs. The single large effort is usually outweighed by smaller annual gains over a longer time span, albeit not visible at a glance. Also, from a financing and cost structure perspective, repetitive small costs


may be more favorable than a single large investment, even though in the long run this is less efficient [EXP2].

• Infrastructures require conformity, usually to standards that do not fully match each individual organization's requirements and needs (Ghosh, 2010). Conformity also limits the freedom of choice in the (near) future. Additionally, conformity may render old investments useless.

• Infrastructures are known for the number of options and alternatives that are available, making it difficult to agree on a design. Large systems with many components result in an enormous design space (Blecker & Kersten, 2006). Large budgets and long development and implementation spans make even technology that exists merely in theory an option as well. This may lead to endless discussions and everlasting considerations and trade-offs.

• Infrastructure design and implementation is made more difficult by coordination problems. The effort (time and money) may not coincide with the gains [EXP1]. Given the scale and number of involved actors a knowledge disparity may occur. It is hard to determine which benefits have materialized and who has made what effort.


12.1.4 Reference architecture design process

This appendix details each of the 14 steps of the design of the preliminary and evaluated architecture. A quick overview is presented in chapter 6.1. Table 15 lists the design steps. The blue steps relate to the design of the preliminary architecture and match the blue tables with best practices in the literature review. The red steps relate to the creation and evaluation of the final architecture and match the red boxes with conclusions in the literature study and case studies. Below the table each of the 14 steps is explained in further detail.

Step 1: Deriving principles from theories and best practices in literature
Step 2: Clustering principles into similar topics
Step 3: Adding first lessons from case studies/additional literature
Step 4: Removing principles unrelated to metadata management
Step 5: Adding first lessons from case studies
Step 6: Moving some principles to preconditions
Step 7: Restructuring topics according to new insights
Step 8: Defining the tradeoffs and listing their contents
Step 9: Validation of principles in case study interviews
Step 10: Rewriting clusters into design principles
Step 11: Matching design principles to each other
Step 12: Writing design principles in TOGAF format
Step 13: Validation of principles in expert session
Step 14: Finalizing and updating the tradeoffs

Table 15: List of reference architecture design steps. Created by author.


Step 1: Deriving principles from theories and best practices in literature

In order to write the research proposal and to answer the first three questions in this thesis a literature study was carried out. A number of principles, theories and best practices were encountered. These were listed in an Excel sheet. Double entries were removed and near-similar entries were merged.

Step 2: Clustering principles into similar topics

When reviewing the list it became apparent that many of the listed principles were related or shared the same topic. A list of over 30 principles, or likely over 50 after the early stages of the case studies, is too much to test in the given amount of time. Additionally the reference architecture would become too large, which would conflict with the desire to keep the reference architecture concise and understandable. NORA 3.0 has 40 principles and the specification of the NORA principles alone is 75 pages (NORA, 2010). The complete NORA documentation is several hundred pages long.

A lesson that can be drawn from the NORA architecture is the use of topics. The 40 design principles are clustered into 7 topics. These 7 topics cover the whole spectrum addressed by the reference architecture. Their relevance and interrelations are presented before the 40 principles are listed. Viewed from the TOGAF perspective the topics in the NORA are very close to design principles, but not phrased that way. The 40 individual principles are very specific and their combined impact on the architecture is much smaller than what the 7 topics prescribe.

Angelov and Grefen (2008) developed a reference architecture for e-contracting which is more similar to the classic IEEE reference architectures. They do not use design principles but their approach is very similar to the NORA. Their reference architecture consists of 9 components. The characteristics and interdependence of these building blocks are described and within these blocks there are more detailed functionalities, best practices and examples.

High level elements that define the architecture:
• NORA 3.0: 7 main topics, plus their relations
• E-contracting: 9 components, plus interdependencies
• This research project: 9 principles, plus their relations (prescriptive)

Low level components related to each higher level element:
• NORA 3.0: 40 low level principles, 3 to 8 per topic
• E-contracting: various features and functionalities per component
• This research project: 6 tradeoffs/dilemmas to balance characteristics

Table 16: Overview of reference architecture structures. Created by author.

Step 3: Adding first lessons from case studies and additional literature

The next step was to draw some lessons from the case studies. In scientific literature many aspects of metadata management, plus its benefits and challenges, are covered. However, most literature is focused on a single topic, and how it fits into the enterprise architecture is often not covered. In the cases the interdependencies and consequences could be observed. For nearly each topic one or more principles were added. Other observations found in the real life cases were already listed. Some observations led to additional literature review.


Step 4: Removing principles unrelated to metadata management

Semantic metadata management is closely linked to the goals it helps achieve. As such it is easy to lose sight of where metadata management ends and the effects it can have on the enterprise architecture start. A partial reduction of the number of principles could be achieved by focusing on the subject of this study. In the process of developing this reference architecture this meant that principles such as 'separate the know from the flow', 'information should be derived as close to the source as possible' and 'information quality must be documented' were deleted.

Step 5: Adding first lessons from case studies

In the iterative process of developing an architecture based on literature and real life practices, the next step was to include lessons from the case studies. A number of principles were added to the topics derived under step 2. Additionally a new topic was introduced: business rules. At first the idea was that business rules were not in the scope of this research. In practice there proved to be a close relation. Therefore the link between semantic metadata and business rules is included; any other aspect of business rules is not.

Step 6: Moving some principles to preconditions

Design principles are prescriptive: they dictate what a design should look like. Not all listed principles met the description of design principles given in chapter 2.4.3. These statements related more to the setting in which metadata management is to be conducted than to the design of the semantic metadata management itself. They were therefore moved to the preconditions.

Step 7: Restructuring topics according to new insights

The early case study observations and expert interviews led to new insights. In relation to semantic metadata management, data standards mainly function as an interface, both within the organization and when exchanging information with partners in the chain. For that reason the topics standards and interfaces were merged. The other aspect of standardization, the reduction of overlap and complexity, is already encompassed by a conceptual metadata level.

Step 8: Defining the tradeoffs and listing their contents

With all the topics specified, all elements that were to be prescribed were covered. Other topics proved to be very relevant, but not necessarily prescriptive. The reference architecture needs to be generally applicable within the setting that is presented: the PPIC. This setting still allows for a wide variety of cases. There is no single approach to metadata management that works in every case. A portion is generic, but some topics and many of the details differ per case. This knowledge is applied in the tradeoffs. The tradeoffs contain useful information for semantic metadata management, but are not prescriptive. It is up to the designer(s) to determine how the contents of the tradeoffs are to be applied. Most tradeoffs pose dilemmas, since characteristics have to be weighed against each other.

Step 9: Validation of principles in case study interviews

Each of the topics that was found relevant was discussed during the case study interviews. The topics were applied to the case and it was discussed which role and added value each would have. Additionally the experts named examples and expressed their opinions on each topic. The list of topics and example questions can be found in the interview protocol. During the validation phase it became apparent that versioning of the contents and adaptability of the infrastructure are not the same. Adaptability was added as a separate topic, bringing the total up to 9.

Step 10: Rewriting clusters into design principles

The validation of the principles in the case study interviews allowed the content of the clusters of principles to be made more specific. Each cluster was rewritten into a single design principle which covered the content of the entire cluster. The design principles were also adjusted to match one another to make sure that as a whole they made sense and covered the entire architecture.

Step 11: Matching design principles to each other

All the design principles influence each other in one way or another. It is not a set of independent principles. The set as a whole results in the characteristics that are desired. This means that all design principles must be in sync. The effect of one principle must not cancel another out. Some principles may reinforce each other. Adaptability of the design is partly achieved by loose coupling with the implementation. Change management is partly enabled by having the right roles and responsibilities. Tools may aid in any metadata management activity that is performed.

Step 12: Writing design principles in TOGAF format

In this research the design principles are structured in the same format as detailed in the TOGAF architecture (TOGAF, 2007). This means that each design principle is captured in a short unambiguous statement which is provided with a name, rationale and implications. This meant that the statements were adapted to become as unambiguous as possible. A rationale was added in which some of the statements of the extensive lists were incorporated. The implications were listed as far as possible. The true implications differ depending on the starting characteristics of the actual case.
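As an illustration, a design principle in this format can be represented as a simple record with the four TOGAF elements. The wording of the example principle below is invented for illustration and is not a quote from the reference architecture itself.

```python
# Hypothetical illustration of the TOGAF principle format: a named statement
# accompanied by a rationale and a list of implications.
principle = {
    "name": "Single conceptual set of semantics",
    "statement": "All implementations derive their semantics from one shared conceptual set.",
    "rationale": "A single reference prevents divergence of meaning across systems, "
                 "processes and data models in the chain.",
    "implications": [
        "A role must be assigned to maintain the conceptual set.",
        "Local deviations must be documented and reconciled.",
    ],
}

# The format prescribes exactly these four elements per principle.
print(sorted(principle))
```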

Step 13: Validation of principles in expert session

An expert session was held in order to validate the reference architecture. The conclusion was that the experts agreed on all principles, but each of them had some remarks or suggestions. During the expert session the principles were reviewed on three aspects. First the experts could indicate whether they agreed with the design principles. Then the principles were ranked in order of importance. Finally the experts could indicate what would be the best order in which to implement the principles.

Step 14: Finalizing and updating the tradeoffs

Based on the validated set of design principles the tradeoffs were developed. Many of the topics and contents were already listed under step 8. These were expanded into full text. The contents of the tradeoffs come directly from literature and best practices, although the way they are presented differs from the literature, case studies and interview transcripts. As with the other parts of the reference architecture they are presented as statements. The scientific foundation for these statements is this research.


12.1.5 Metadata tooling example

Tooling for semantic metadata management is a niche product. This appendix shows an example in order to give a feel for what a metadata management tool may look like. The screenshots shown are from the Collibra tool, which was selected since it combines several functions described in section 9.4. The functions shown in each screenshot are described in the figure caption.

Figure 23: A single definition provided with an example and characteristics. Ownership and status are shown on the right, with other options in the menu below. From Collibra.


Figure 24: Overview of relations defined between several semantics. From Collibra.

Figure 25: A relation between semantics being defined in a menu. From Collibra.


Figure 26: A simple business rule added to a definition. From Collibra.

Figure 27: A taxonomy created from a number of semantics. Combining both categories and relations. From Collibra.


12.1.6 Roles and responsibilities

Janssen, Gortmaker & Wagenaar (2006) have identified eight types of roles for web service orchestration in public administration. These observations are very valuable since the context is very similar to this research. First, the public administration/e-government setting meets the criteria of the PPIC definition. Second, web services are one of the premier means of electronic data interchange. Third, the means of cooperation is not only on the technical level but is also strongly related to the content and primary processes.

Table 17 shows the original list by Janssen, Gortmaker & Wagenaar. Table 18 shows the list that includes the additional insights gained during the evaluation of the reference architecture, shown in chapter 7.4.3 and chapter 8.4.3. There are four major adaptations:

• The service and product aggregator role has been removed, since it was found not applicable in both cases. This role is more specifically related to web services.

• The end user role was added. In the original list this was partly covered by the developer role, which looks after the interests, objectives and requirements on behalf of the end user. In semantic metadata management the end user is actively involved.

• The information analyst role was added to ensure a good fit between semantics within the information system and the requirements in the primary process.

• The implementation orchestrator role was added, which maintains the conceptual set of semantics that are implemented in the various systems, processes and data models.


Initiator and enabler role: This role is to convince and stimulate agencies to participate in and commit to an automated process execution. Some organizations might initially resist the idea to use Web service orchestration technology for improving cross-agency processes. This might be due to a lack of knowledge, but also healthy suspicion. Often it is necessary to educate agencies on the basics of the technology and to show the potential advantages.

Developer role: This role is about defining the requirements for each agency in order to enable cross-agency processes. This role involves the identification of the organizations and departments involved and determines the interests, objectives, and requirements for each of them.

Standardization role: Technology interface standards should be determined and set as a standard. Existing systems can be selected as standard, but it can also be better to develop and impose new, preferably open standards.

Control and progress monitoring role: The time-dependent sequence of activities performed by agencies needs to be managed. This role should control the sequence of Web service invocations and collect progress and status information. All unexpected events, such as non-availability of Web services, should be tracked as soon as they occur and analyzed to determine what actually did happen and why, to ensure reliable cross-agency process execution.

Facilitator role: This role facilitates the implementation of cross-agency processes by collecting and disseminating best practices, reference models, and reusable system functionality such as identification, authentication, and payment. Ideally, functionality and databases are shared when possible and duplication of efforts is avoided.

Service and product aggregator role: There should be a one-stop shop that provides a consistent point of aggregation and is equipped with logic to meet customers’ needs. Needs should be analyzed and translated into product and service requests, and related products and services should be recommended, multiple processes started, status information provided, and the results of each process aggregated into a single answer. For this purpose the services and products should be bundled into one large catalogue and rules determined to translate citizens’ and business’ needs into the appropriate multiple cross-agency processes.

Accountability role: As a general rule in modern societies, governmental decisions should have accountability. This role should ensure that the motivations behind decisions made by each agency and the performance and outcomes of the complete cross-agency process can be accounted for.

Process improvement role: Changes in processes and governmental rules often affect more than one agency. This role should maintain an overview of the cross-agency processes and define mechanisms and procedures to assess the implications of changes in law, technology, and other developments. This role initiates complex transformation processes to restructure the public sector.

Table 17: Roles identified in web service orchestration by Janssen, Gortmaker & Wagenaar (2006).


Initiator & enabler: This role is to convince and stimulate agencies to participate in and commit to an automated process execution. Often it is necessary to educate agencies on the basics of the technology and to show the potential advantages.

End user: The end user is the one who actually uses the semantic metadata. In general these are the experts within the primary process that handle the information that is provided with semantic metadata. Given the number of organizations and different types of specialists within the PPIC this group is rather large and heterogeneous. As such the end user is both the expert in the own organization as well as the next link in the chain. In metadata management the explicit role of the end user is to verify the validity and applicability of the common set of semantics.

Developer: Defining the requirements for each organization in order to enable cross-agency processes. This role involves the identification of the organizations and departments involved and determines the interests, objectives, and requirements for each of them.

Information analyst: The information analyst ensures a good fit between semantics within the information system and the requirements in the primary process. Semantics have a lifecycle, starting at specification and requiring constant alignment.

Standardization: Technology interface standards should be determined and set as a standard. Existing systems can be selected as standard, but it can also be better to develop and impose new, preferably open standards.

Implementation orchestrator: Semantic metadata management is to be centered around a conceptual set of semantics. This serves as a reference for the implementation in various systems, processes, data models, forms and business rules. Whether this conceptual set is self-maintained or imposed, the application of the metadata in the various forms of implementation must be orchestrated.

Control and progress monitoring: The time-dependent sequence of activities needs to be managed. All unexpected events should be tracked as soon as they occur and analyzed to determine what actually did happen and why, to ensure reliable cross-agency process execution.

Facilitator: This role facilitates the implementation of cross-agency processes by collecting and disseminating best practices, reference models, and reusable system functionality such as identification, authentication, and payment. Ideally, components are shared when possible and duplication of efforts is avoided.

Accountability management: Governmental decisions should have accountability. This role should ensure that the motivations behind decisions made by each agency and the performance and outcomes of the complete cross-agency process can be accounted for.

Process improvement: Changes in processes and governmental rules often affect more than one agency. This role should maintain an overview of the cross-agency processes and define mechanisms and procedures to assess the implications of changes in law, technology, and other developments.

Table 18: Roles in semantic metadata management. Adapted from Janssen, Gortmaker & Wagenaar (2006), created by author.


12.1.7 Schematic overview of primary process at BJz

Figure 28: Information products and relations present in a single generic two year OTS case. Note that there may be multiple instances of each product, with the average case file having about 700 pages. Created by author.


12.1.8 Information chain within BJz
Within the primary process, BJz employees reuse a lot of information, mostly excerpts and paragraphs, when creating other information products. This reuse is mainly caused by the partial information overlap among products: each individual product must be understandable as a standalone product, or must provide context. For example, the conclusion from the planning can trigger the drafting of a healthcare indication, the goals of which are then copied to the evaluation and matched with the conclusions drawn in a healthcare provider's report.

Figure 29 shows what relations exist among the most important information products within BJz. These relations are based on the review of case files, the interviews and the design for IJ.

There is also sequential reuse, meaning the same information is reused multiple times: an excerpt from product A can be reused in product B and, at a later moment, in product C. This reuse also takes place across the information chain. The inspection of case files revealed that a paragraph from the RvdK was used in the verdict by the judge, which was in turn literally typed into the planning, copy-pasted into a healthcare indication and turned up again in the evaluation. Figure 28, located in the previous appendix, shows a two-year OTS period and how information can be sequentially reused.

The reuse of information saves a lot of time when drafting products, and when used properly it also ensures consistency among information products. However, when information is retyped or copy-pasted, most metadata is lost, unless it is part of the text itself.

[Figure 29 is a diagram that cannot be reproduced in extracted text. Its nodes are the information products: Veiligheidslijst, Veiligheidsplan, Stamblad, Voorblad documenten IJ, Contactjournaals, Vervolg plan van aanpak, Onderzoeksrapport RvdK, Actieagenda, Plan van aanpak, Evaluatie/afsluiting plan van aanpak, Verzoekschrift rechtbank, Indicatiebesluit, Melding RvdK, Rapportages zorgaanbieders and Verzoek tot verderstrekkende maatregel.]

Figure 29: Overview that shows links among information products for any form of reuse of information. In a regular case there are multiple instances of most document types. Created by author.


[Figure 30 is a diagram that cannot be reproduced in extracted text. It links the Stamblad, Indicatiebesluit, Voorblad documenten IJ, Onderzoeksrapport RvdK, Plan van aanpak, Actieagenda, Veiligheidslijst and Veiligheidsplan. Edge labels indicate the reuse mechanism (via IJ, typing, or copy/paste) and the reused content, e.g. personalia, onderbouwing zorg, voorgeschiedenis & bedreigingen, conclusies, observaties.]

Figure 30: The reuse of information among information products relating to the planning, showing what information is reused in what way. Created by author.

[Figure 31 is a diagram that cannot be reproduced in extracted text. It links the Plan van aanpak, Verzoekschrift rechtbank, Actieagenda (IJ/Word), Melding RvdK, Evaluatie/afsluiting plan van aanpak, Contactjournaals, Verzoek tot verderstrekkende maatregel, Rapportages zorgaanbieders and Veiligheidslijst. Edge labels indicate the reuse mechanism (via IJ, typing, or copy/paste) and the reused content, e.g. voorgeschiedenis & bedreigingen, observaties, onderbouwing, afspraken, conclusies.]

Figure 31: The reuse of information among information products relating to the evaluation, showing what information is reused in what way. Created by author.


[Figure 32 is a diagram that cannot be reproduced in extracted text. It links the Evaluatie/afsluiting plan van aanpak, Plan van aanpak, Onderzoeksrapport RvdK, Vervolg plan van aanpak (IJ/Word), Actieagenda, Rapportages zorgaanbieders, Veiligheidslijst and Veiligheidsplan. Edge labels indicate the reuse mechanism (via IJ, typing, or copy/paste) and the reused content, e.g. observaties, voorgeschiedenis, bedreigingen, conclusies.]

Figure 32: The reuse of information among information products related to the follow-up planning, showing what information is reused in what way. Created by author.

Conclusion
The reuse of data among information products follows from the lack of a separate knowledge base. Several cycles of reuse increase the possibility of errors, and information may become outdated. Every retyping or copy-paste action can introduce errors; when such actions occur sequentially, the probability of errors rises significantly. Retyping and copy-pasting also strip the original context, leading to misinterpretation and uncertainty about correctness. The lack of any type of link between the original data and the (multitude of) copied data leads to a loss of provenance.

The information may also become outdated over time: it is unknown how old the reused information actually is, since it may already have been reused one or more times. Moreover, it is uncertain whether newer information on the topic is available, unless the author remembers changes in the situation or specifically searches all available information on that particular topic.

Additionally, metadata is implicitly treated as unnecessary in final products. This was never an active decision: in the implemented system metadata is simply lost upon reuse, which was neither regarded as a problem nor made an explicit design feature. Yet metadata would be of value to the recipient of the end product. Furthermore, the extensive sequential reuse indicates that most products are not final products at all, as they serve as input for other products.
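The difference between the copy-paste practice observed at BJz and reuse that preserves metadata can be illustrated schematically. The following Python sketch uses hypothetical data structures (not part of any BJz system; names such as Excerpt and the field choices are the author's illustration): plain copy-paste keeps only the text, whereas metadata-aware reuse carries the original creation date along and extends a provenance chain at every step.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Excerpt:
    """A piece of reusable text together with its semantic metadata."""
    text: str
    source_product: str          # product the excerpt currently lives in
    author: str
    created: date                # when the information was originally recorded
    provenance: list = field(default_factory=list)  # chain of earlier products

def reuse_by_copy_paste(excerpt: Excerpt) -> str:
    """Copy-paste as observed in the case files: only the text survives."""
    return excerpt.text  # author, date and provenance are all lost

def reuse_with_provenance(excerpt: Excerpt, target_product: str) -> Excerpt:
    """Reuse that keeps metadata and extends the provenance chain."""
    return Excerpt(
        text=excerpt.text,
        source_product=target_product,
        author=excerpt.author,
        created=excerpt.created,  # original date preserved: age stays visible
        provenance=excerpt.provenance + [excerpt.source_product],
    )

# Sequential reuse A -> B -> C, as described above.
original = Excerpt("Conclusies veiligheid", "Veiligheidslijst",
                   "case worker", date(2011, 3, 1))
step1 = reuse_with_provenance(original, "Plan van aanpak")
step2 = reuse_with_provenance(step1, "Evaluatie")
print(step2.provenance)  # ['Veiligheidslijst', 'Plan van aanpak']
print(step2.created)     # 2011-03-01
```

With such a chain the reader of the evaluation can still see where the conclusion originated and how old it is, which is exactly the information lost in the retyping and copy-paste practice described in this appendix.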


12.1.9 Interview protocol
This appendix shows the interview protocol that was used in both case studies. The additional experts were also interviewed using this protocol; this wide application is possible due to its generic nature. The interview is structured into topics to ensure that all topics and their relations are covered (Verschuren & Doorewaard, 2003). All interviews were held to validate the preliminary architecture or the final set of design principles. Depending on the case and the interviewee's area of expertise, the focus on the topics varied per interview.

Name and function of interviewee

Introduction
- What is the background and expertise of the interviewee?
- What is the function of the interviewee and how does it relate to metadata management?

Potential of semantics
- What is the rationale behind implementing metadata management in this case?
- What added value is desired?
- How important is semantic metadata for the organization?

Barriers & challenges
- What are the main technical/semantic/organizational barriers and challenges for metadata management in this case?

Metadata specification
- Should metadata specification come bottom-up (existing technical designs) or top-down (subject-matter experts)?
- (How) Is semantic metadata reviewed by those who use it in the primary processes?

Metadata architecture
- What does the current semantic metadata model look like?
- Is semantic metadata stored separately from the core data?

Business intelligence
- Is data aggregated for managerial purposes or compliance?
- Should business rules be linked to semantic metadata?

Tooling
- What tooling is needed to support metadata management?
- Is that tooling available? If not, why not?

Versioning
- Does versioning play a significant role in metadata management?
- What is the frequency and extent of the versions?
- How is versioning arranged (protocols, cooperation, review)?
- Are old versions stored or is the current version overwritten?

Management processes, roles & responsibilities
- Who is responsible for metadata management?
- Who is responsible for the alignment of primary processes?
- What protocols exist for change management? Is semantic metadata a standard part of those protocols?

Interoperability & technology
- Do legacy systems play a significant role?
- What standards are used? What role do the standards fulfill?
- How adaptable is the technical infrastructure?

Role of the reference architecture
- What should a reference architecture provide for the architects (involved in developing and employing semantic metadata governance in a PPIC)?
- What are best practices that have not yet been covered in this interview?
- What are areas of interest that have not yet been covered in this interview?
