
The University of Manchester Research

JAMI: a Java library for molecular interactions and data interoperability

DOI: 10.1186/s12859-018-2119-0
Document Version: Final published version

Citation for published version (APA):
Sivade, D. M., Koch, M., Shrivastava, A., Alonso-López, D., De Las Rivas, J., Del-Toro, N., Combe, CW., Meldal, BHM., Heimbach, J., Rappsilber, J., Sullivan, J., Yehudi, Y., & Orchard, S. (2018). JAMI: a Java library for molecular interactions and data interoperability. BMC Bioinformatics. https://doi.org/10.1186/s12859-018-2119-0

Published in: BMC Bioinformatics

Sivade (Dumousseau) et al. BMC Bioinformatics (2018) 19:133
SOFTWARE, Open Access

M. Sivade (Dumousseau)1, M. Koch1, A. Shrivastava1, D. Alonso-López2, J. De Las Rivas2, N. del-Toro1, C. W. Combe3, B. H. M. Meldal1, J. Heimbach5,6, J. Rappsilber3,4, J. Sullivan5,6, Y. Yehudi5,6 and S. Orchard1*

* Correspondence: [email protected]
1 European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton CB10 1SD, UK
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Abstract

Background: A number of different molecular interaction data download formats now exist, designed to allow access to these valuable data by diverse user groups. These formats include the PSI-XML and MITAB standard interchange formats developed by the Molecular Interaction workgroup of the HUPO-PSI, in addition to other, use-specific downloads produced by other resources. The onus is currently on the user to ensure that a piece of software is capable of reading and writing all necessary versions of each format. This problem may increase as data providers strive to meet ever more sophisticated user demands and data types.

Results: A collaboration between EMBL-EBI and the University of Cambridge has produced JAMI, a single library to unify standard molecular interaction data formats such as PSI-MI XML and PSI-MITAB. The free, open-source JAMI library enables the development of molecular interaction computational tools and pipelines without the need to produce different versions of software to read different versions of the data formats.

Conclusion: Software and tools developed on top of the JAMI framework are able to integrate and support both PSI-MI XML and PSI-MITAB. The use of JAMI avoids the requirement to chain conversions between formats in order to reach a desired output format, and prevents code and unit test duplication as the code becomes more modular. JAMI’s model interfaces are abstracted from the underlying format, hiding the complexity and requirements of each data format from developers using JAMI as a library.

Keywords: Molecular interactions, Protein-protein interaction, Protein complexes, Data standards, HUPO-PSI, PSI-MI

Background

Molecular interaction data are crucial to the study and understanding of the molecular biology of a cell. These data are large and complex, but the creation of a standardised data interchange format (PSI-MI XML) allowed easier access, enabling users to merge data from disparate resources and encouraging the development of tools and software that facilitated network visualisation and analysis. Version 1.0 [1] of the format only allowed a relatively simple description of protein interactions, but as the data grew, limitations of the original format were identified, and an updated version, PSI-MI XML 2.5 [2], was released in 2007. It allows the description of interactions between molecules other than proteins, and enables the detailed capture of both the experimental context and the constructs used in each assay. This version of the interchange format is still widely used to capture experimental data, but the need to describe more abstract concepts has recently resulted in the release of PSI-MI XML 3.0 [3]. PSI-MI XML 3.0 allows the capture of details of cooperative or allosteric binding sites, the composition of protein complexes taken from multiple publications, and more complex data types such as dynamic interaction networks that change with time or with concentration of agonist. A simpler tab-delimited representation of molecular interaction data has also been available since 2007, but this too has grown in complexity in response to user requests, and MITAB 2.5, 2.6 and 2.7 are now all available [2]. Additionally, at the 2017 HUPO-PSI workshop, the Molecular Interaction workgroup decided that the newly developed MI-JSON will be its recommended protocol for serving interaction data to web pages and visualisation tools.

PSI-MI XML, MITAB and MI-JSON are all capable of holding the same data, in differing degrees of detail, and are all annotated using a single shared controlled vocabulary, but exist to serve different user groups. The XML format is largely used by software developers and database managers, MITAB by biologists interested in a simple binary representation, and MI-JSON for visual representation. Updating any data format necessitates changes to many dependent systems. A broad range of software, including curation, editing, export, visualisation, validation and analysis packages, uses the PSI-MI formats to access and manipulate the data and consequently needs to be updated with every format update. Format updates add complexity to existing software packages, as the programs need to be extended to utilise the new version whilst continuing to support those already existing and widely used. The software and standards are consumed by a diverse group of organisations with different levels of resources, ranging from PhD students in small research groups to data pipeline specialists in pharmaceutical or bioinformatics companies. Some groups may end up using legacy standards and software for many years simply because they do not possess the skills, time, or budget to update their software.

Supporting such diverse needs is time and resource intensive, yet securing funding for software maintenance is challenging [4]. Each new data format is useful and must be maintained, but each update generates a new library, with duplicated code, requiring parallel testing and generating its own bugs. In summary, while new formats meet genuine need, they also result in an expensive cascade of changes to software and tools.

The JAMI (Java Molecular Interaction framework) library was developed, using an object-orientated approach, to address these concerns. JAMI can import, inter-convert and re-export molecular interaction data in a variety of formats and versions.

[...] a default implementation. Implementations may be added, edited or removed if necessary over time. Main entities in the data model include Complex, Interaction, Entity, Participant, and Publication, all of which are interfaces with a default implementation and format-specific overloaded behaviours. For example, PSI-XML 2.5 [2] allowed experiment descriptions to contain either a cross-reference to a Publication object, or directly contain a list of attributes such as author and journal, whereas in XML 3.0 it is possible to associate both of these data members with an experiment [3]. Since the Publication and XML export classes are only interfaces, exporting the two different types of Publication can be handled by the same software, with implementation classes reconciling the two XML versions.

When included as a library in bioinformatics software, JAMI hides the complexity of supporting multiple data formats. It facilitates data import, integration and analysis, simplifying software development by offering a single API. JAMI also eases the creation of new interchange formats, such as JSON-LD or RDF. Additional formats can be added once to JAMI and are then supported in multiple software packages with little effort. Similarly, JAMI prevents code duplication: each software package drawing on JAMI now shares code, so less effort is put into the development of multiple XML/MITAB parsing modules.

Implementation

Figure 1 shows the overall architecture of the JAMI library. JAMI is implemented in Java, using Maven for distribution and dependencies (https://www.ebi.ac.uk/intact/maven/nexus/content/repositories/ebi-repo/psidev/psi/mi/jami/). The code is released under the Apache 2.0 licence and is available on GitHub [5]. The architecture is highly modular, driven by the anticipated need to modify or add input and output types in the future, without af-
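As an illustration of the interface-based model described in the Background (a Publication interface with a default implementation, and export classes that reconcile the "either a cross-reference or an attribute list" rule of PSI-XML 2.5 with the "both" rule of XML 3.0), a minimal Java sketch might look as follows. Every name here (the getters on Publication, DefaultPublication, ExperimentPublicationWriter, EitherOrWriter, BothWriter, JamiSketch) and the output strings are hypothetical simplifications chosen for this example; they are not JAMI's actual API and do not follow the PSI-MI XML schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model interface (not the real JAMI interface).
interface Publication {
    String getIdentifier();      // cross-reference id, may be null
    List<String> getAuthors();   // never null, may be empty
    String getJournal();         // may be null
}

// Hypothetical default implementation shared by all formats.
final class DefaultPublication implements Publication {
    private final String identifier;
    private final List<String> authors;
    private final String journal;

    DefaultPublication(String identifier, List<String> authors, String journal) {
        this.identifier = identifier;
        this.authors = Collections.unmodifiableList(new ArrayList<>(authors));
        this.journal = journal;
    }

    public String getIdentifier() { return identifier; }
    public List<String> getAuthors() { return authors; }
    public String getJournal() { return journal; }
}

// Hypothetical export interface: callers depend only on this type.
interface ExperimentPublicationWriter {
    String write(Publication publication);
}

// Mimics the 2.5-style rule: a cross-reference OR an attribute list.
final class EitherOrWriter implements ExperimentPublicationWriter {
    public String write(Publication p) {
        return p.getIdentifier() != null ? JamiSketch.reference(p) : JamiSketch.attributes(p);
    }
}

// Mimics the 3.0-style rule: cross-reference AND attributes may both appear.
final class BothWriter implements ExperimentPublicationWriter {
    public String write(Publication p) {
        List<String> parts = new ArrayList<>();
        if (p.getIdentifier() != null) {
            parts.add(JamiSketch.reference(p));
        }
        if (!p.getAuthors().isEmpty() || p.getJournal() != null) {
            parts.add(JamiSketch.attributes(p));
        }
        return String.join(" ", parts);
    }
}

final class JamiSketch {
    // Illustrative plain-text output, deliberately not real PSI-MI XML.
    static String reference(Publication p) {
        return "ref=" + p.getIdentifier();
    }

    static String attributes(Publication p) {
        String journal = p.getJournal() == null ? "" : p.getJournal();
        return "authors=" + String.join(",", p.getAuthors()) + ";journal=" + journal;
    }

    public static void main(String[] args) {
        Publication p = new DefaultPublication(
                "12345", Arrays.asList("Author A", "Author B"), "Example Journal");
        // The same model object is passed to both writers unchanged.
        for (ExperimentPublicationWriter w :
                Arrays.asList(new EitherOrWriter(), new BothWriter())) {
            System.out.println(w.write(p));
        }
    }
}
```

In this sketch the calling code in main never inspects which format rule applies; it holds only the two interfaces, which is the property the text attributes to JAMI's design. Supporting a further format would mean adding another writer implementation rather than changing the model.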