SNOMED CT – OWL in a FAIR web of data

SNOMED CT Research Webinar
Today's Presenter: Dr. Ronald Cornet
WELCOME! *We will begin shortly*

Upcoming webinars (Research Web Series, Clinical Web Series)
• Save the Date! August, TBA soon!
• August 19, 2020, Time: TBA
• https://www.snomed.org/news-and-events/events/web-series
• Dr Hyeoun-Ae Park, Emeritus Dean & Professor, Seoul National University; Past President, International Medical Informatics Association

Research Reference Group
Join our SNOMED Research Reference Group! Be notified of upcoming Research Webinars and other SNOMED CT research-related news. Email Suzy ([email protected]) to join.

SNOMED CT Research Webinar: SNOMED CT – OWL in a FAIR web of data
Dr. Ronald Cornet

Outline: Me, SNOMED CT, Use case, Semantic web, FAIR data

Me
• Associate professor at Amsterdam UMC, Amsterdam Public Health Research Institute, department of Medical Informatics
• Research on knowledge representation; ontology auditing; SNOMED CT; reusable healthcare data; FAIR data

Conflicts of Interest
• 10+ years involvement with SNOMED International (Quality Assurance Committee, Technical Committee, Implementation SIG, Modeling Advisory Group)
• Chair of the GO-FAIR Executive board
• Funding from European Union (Horizon 2020)

FAIR Guiding Principles
https://go-fair.org/

FAIR Principles – concise
• Findable
  ○ Metadata and data should be easy to find for both humans and computers
• Accessible
  ○ The user needs to know how data can be accessed, possibly including authentication and authorization
• Interoperable
  ○ Data need to be integrated with other data and interoperate with applications for analysis, storage, and processing
• Reusable
  ○ (Licensing & provenance) metadata and data should be well-described so that they can be replicated and/or combined in different settings

FAIR Principles = “What”, not “how”
• Globally unique and persistent identifiers
  ○ https://orcid.org/0000-0002-1704-5980
  ○ https://www.linkedin.com/in/ronaldcornet/
  ○ …
• Freedom of format
Open license → structured → open format → URI-based → linked

FAIR ≠ Open (meta)data
• Not all of SNOMED CT can be used by all for all tasks
• Clinical data capture using SNOMED CT requires a license, but a Global Patient Set is available for sharing patient health information
• Research licenses exist for SNOMED CT, among others in UMLS
• Much of SNOMED CT can be used by many for many tasks

Examples of (more or less) FAIR repositories
• https://home.fairdatapoint.org/ ← Links to FAIR data points
• https://fairsharing.org/
• https://www.openaire.eu/
• https://www.ohdsi.org/ ← “Human” entry to harmonized data

SNOMED CT – more than the numbers
[Figure: SNOMED CT - active elements over time; series: concepts, relationships, descriptions; vertical axis from 0 to 1,400,000]
https://www.icthealth.nl/online-magazine/editie-04-2018/onder-de-motorkap-helpt-snomed-zorgverleners-met-eenheid-van-taal/
https://boston.cbslocal.com/2017/12/15/salem-new-hampshire-owl-found-under-hood-during-pep-boys-oil-change/

OWL - Web Ontology Language
• A Semantic Web language to represent rich and complex knowledge (things, groups of things, and relations between things).
• A computational logic-based language
  ○ OWL ontologies provide classes, properties, individuals and data values and are stored as Semantic Web documents
• One of the distinguishing features of OWL is that it can be used to express extremely complicated and subtle ideas about your data.
• Primary uses
  ○ Fast and flexible data modeling
  ○ Efficient automated reasoning

Before July 2019 - OWL conversion
OWL version of SNOMED CT can be generated from RF2 tables
• Concept
• (Stated) Relationship
• Description
Two available transformations
• Spackman OWL script tls2_StatedRelationshipsToOwlkRSS_INT.pl
• SNOMED OWL Toolkit https://github.com/IHTSDO/snomed-owl-toolkit

Before July 2019 - Drawbacks
• Limited expressiveness in RF2 tables
• Implicit knowledge included in OWL-transformation scripts
  ○ Role chains
  ○ Transitivity of relationships
  ○ Reflexivity

From July 2019 - Possibilities
In theory: Full OWL expressiveness
In practice:
• Multiple axioms
• Role chains; transitivity; reflexivity
• Generalized Concept Inclusions (GCIs)

Multiple axioms

Role chains, transitivity, reflexivity
• Role chains, e.g.,
  ○ SubObjectPropertyOf(ObjectPropertyChain(:127489000 :738774007) :127489000)
• Transitivity
  ○ TransitiveObjectProperty(:738774007 | Is modification of (attribute))
• Reflexivity
  ○ ReflexiveObjectProperty(:738774007 | Is modification of (attribute))

Generalized Concept Inclusions (GCIs)

OWL RefSet (not: OWL-file) – benefits
• Versioning of OWL axioms
  ○ Challenging in OWL-only
• Maintaining RF2 infrastructure
  ○ Compatibility with overall infrastructure
• Easy to create OWL-file

Semantic Web Standards
• OWL - ontologies
  ○ Syntax: owl-functional, owl-manchester
• ShEx - (clinical) data models
  ○ Syntax: ShExC or any RDF-syntax
• RDF - instances
  ○ Syntax: rdf-jsonld, rdf-nq, rdf-nt, rdf-trix, rdf-turtle, rdf-xml
(Side labels on the slide: Metadata, Data)
Open license → structured → open format → URI-based → linked

HL7 data modeling
• ShEx – Shape Expressions
  ○ Describe permitted attributes
  ○ Including cardinalities
  ○ Including allowed values

RDF – instances
Triple:
  Subject: https://www.linkedin.com/in/ronaldcornet/
  Predicate: http://hl7.org/fhir/patient#gender
  Object: http://hl7.org/fhir/codesystem-administrative-gender.html#administrative-gender-male
Triple:
  Subject: https://orcid.org/0000-0002-1704-5980
  Predicate: http://snomed.info/id/263495000 (gender)
  Object: http://snomed.info/id/703117000 (masculine)

Required (interoperability) services
• Ontologies
• Data models (Resources & profiles a.k.a. archetypes & templates)
• Value sets
• Ontology alignment services
  ○ Static, e.g., UMLS, Athena
  ○ Dynamic, e.g., AML, FCA-Map, LogMap
• Instance alignment services (e.g., https://www.sameas.cc/)
• Data access services
[Diagram legend: SNOMED CT, Ontology, VS = Value set, Alignment, Data Model, Individ., EHR, Reg., Reasoning, EDC]

Added value of linked data with OWL
• “Enrichment”
  ○ Viral pneumonia → Infective pneumonia →
• Applying reasoning on instances
  ○ PatientX: finding = Infective pneumonia
  ○ PatientX: finding = COVID-19
  ⇒ PatientX: finding = (Viral pneumonia : causative agent = SARS-CoV-2)

Use Case (2019/2023)
EU project, 35 countries: 26 EU countries, 7 associated, UK & Canada
• Selecting and combining semantic pieces
• To create a Virtual Platform to interconnect FAIR rare disease data
• Fully based on semantic web technology
• Includes ORDO, NCIt, SIO, …
• Current focus on high-quality RDF-data and ontology alignment

Use Case (2019/2023)
ADAM project (Adequate Data Capture and Monitoring)
• Assess and improve quality of data on the problem list
• Based on “Diagnosethesaurus”, Dutch Interface Terminology on SNOMED CT
• Assess completeness of COVID-19-cohort
• Medical entity linking of free text to SNOMED CT
FAIR COVID-Predict project
• Harmonize data of Dutch COVID-19 patients in a single datawarehouse
• Increase use of (LOINC & SNOMED CT) coding in EHR & Lab-systems
• Realize (de-)centralized OHDSI-compliant FAIR data points

Use Case (2020/2023)
EU project to develop a coaching system for improving the quality of life of cancer home patients
• Harmonization of cancer data from the Netherlands and Italy
• Application of OHDSI approach for data harmonization
• Data exchange using HL7 FHIR
• Research into reasoning over instance data, to support SPARQL querying

Use Case (2016/2023)
EHR vendor
• Benefit from SNOMED CT hierarchy, properties and patient-friendly terms to increase patients’ understanding of their record
EDC vendor
• Make research data FAIR upon collection
• Facilitating specification of rich metadata and establishing FAIR data points

Summary
• Two approaches to FAIR data and metadata
• Rooted in Semantic Web technology
• Growing from harmonized models and vocabularies
• Ontologies are essential metadata
• SNOMED CT being an expressive OWL ontology contributes to reasoning over EHR data
• Infrastructure is being established, integrating the pieces is the next step

SNOMED CT Research Webinar: Q & A

SNOMED CT Research Webinar: Contact
SNOMED International [email protected]
Suzy Roy [email protected]
Dr. Ronald Cornet
THANK YOU!
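
Illustrative sketch (not part of the original slides): the "RDF – instances" and "Added value of linked data with OWL" slides can be made concrete with a small Python example. It is a minimal sketch under stated assumptions, not how SNOMED CT tooling or an OWL reasoner works. The triple reuses the ORCID and snomed.info IRIs quoted on the slide. The names Triple, to_ntriples, ancestors and enrich are hypothetical helpers, the concept labels stand in for SNOMED CT identifiers, and the "Pneumonia" parent is an assumed continuation of the hierarchy shown with a trailing arrow on the slide.

```python
from typing import Dict, NamedTuple, Set


class Triple(NamedTuple):
    """One RDF statement: subject, predicate and object, all given as IRIs."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize an IRI-only triple as a single N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# The second triple from the "RDF - instances" slide.
GENDER_TRIPLE = Triple(
    subject="https://orcid.org/0000-0002-1704-5980",
    predicate="http://snomed.info/id/263495000",  # gender
    obj="http://snomed.info/id/703117000",        # masculine
)

# Hypothetical mini-hierarchy using labels instead of SNOMED CT identifiers.
# "Pneumonia" is an assumed parent; the slide only shows a trailing arrow.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "Viral pneumonia": {"Infective pneumonia"},
    "Infective pneumonia": {"Pneumonia"},
}


def ancestors(concept: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return every superclass reachable by following subclass links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def enrich(findings: Dict[str, Set[str]],
           subclass_of: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Add to each patient's findings all more general findings they imply."""
    enriched: Dict[str, Set[str]] = {}
    for patient, concepts in findings.items():
        implied = set(concepts)
        for concept in concepts:
            implied |= ancestors(concept, subclass_of)
        enriched[patient] = implied
    return enriched


if __name__ == "__main__":
    print(to_ntriples(GENDER_TRIPLE))
    print(enrich({"PatientX": {"Viral pneumonia"}}, SUBCLASS_OF))
```

With this kind of enrichment, a query for patients with "Infective pneumonia" would also return PatientX, whose record only states "Viral pneumonia". The COVID-19 inference on the slide additionally relies on attribute relationships such as causative agent, which this sketch does not model; that step needs an OWL reasoner over the full axioms.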