MAGPIE: Simplifying Access and Execution of Computational Models in the Life Sciences
Total Page:16
File Type:pdf, Size:1020Kb
RESEARCH ARTICLE MAGPIE: Simplifying access and execution of computational models in the life sciences Christoph Baldow1☯, Sebastian Salentin2☯, Michael Schroeder2, Ingo Roeder1,3, Ingmar Glauche1* 1 Institute for Medical Informatics and Biometry, Medizinische FakultaÈt Carl Gustav Carus, Technische UniversitaÈt Dresden, Dresden, Germany, 2 Biotechnology Center (BIOTEC), Technische UniversitaÈt Dresden, Dresden, Germany, 3 National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany a1111111111 ☯ These authors contributed equally to this work. a1111111111 * [email protected] a1111111111 a1111111111 a1111111111 Abstract Over the past decades, quantitative methods linking theory and observation became increasingly important in many areas of life science. Subsequently, a large number of mathe- OPEN ACCESS matical and computational models has been developed. The BioModels database alone lists more than 140,000 Systems Biology Markup Language (SBML) models. However, while the Citation: Baldow C, Salentin S, Schroeder M, Roeder I, Glauche I (2017) MAGPIE: Simplifying exchange within specific model classes has been supported by standardisation and data- access and execution of computational models in base efforts, the generic application and especially the re-use of models is still limited by the life sciences. PLoS Comput Biol 13(12): practical issues such as easy and straight forward model execution. MAGPIE, a Modeling e1005898. https://doi.org/10.1371/journal. pcbi.1005898 and Analysis Generic Platform with Integrated Evaluation, closes this gap by providing a software platform for both, publishing and executing computational models without restric- Editor: TimotheÂe Poisot, Universite de Montreal, CANADA tions on the programming language, thereby combining a maximum on flexibility for pro- grammers with easy handling for non-technical users. MAGPIE goes beyond classical Received: July 20, 2017 SBML platforms by including all models, independent of the underlying programming lan- Accepted: November 27, 2017 guage, ranging from simple script models to complex data integration and computations. We Published: December 15, 2017 demonstrate the versatility of MAGPIE using four prototypic example cases. We also outline Copyright: © 2017 Baldow et al. This is an open the potential of MAGPIE to improve transparency and reproducibility of computational mod- access article distributed under the terms of the els in life sciences. A demo server is available at magpie.imb.medizin.tu-dresden.de. Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. This is a PLOS Computational Biology Software paper. Data Availability Statement: The source code of the latest server application can be downloaded from the GitHub repository at https://github.com/ christbald/magpie. Furthermore, the source code at Introduction the time of publication can be found at https:// zenodo.org/record/1044034. The corresponding R Although the development and use of computational models is gaining more and more impor- package, providing a remote access to MAGPIE, is tance in the era of biological and big data, there is a distinct gap between the development of available at https://github.com/christbald/magpie_ computational models by theoreticians and their application by end users in experimental api_r. research or clinical practice [1±3]. In many cases, technical issues limit the application of Funding: This work was partly supported by the computational approaches: models commonly lack an easily accessible user interface or German Federal Ministry of Education and require advanced programming skills on the side of the end user. Furthermore, the limited PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1005898 December 15, 2017 1 / 11 MAGPIE: Simplifying access and execution of computational models Research (www.bmbf.de/en/), Grant number access to models, or missing versioning of underlying algorithms, data, and parameters lead to 031A315 ªMessAgeº to IG and Grant number model developments, which in many cases can not be easily reproduced [4, 5]. In a recent sur- 031A424 ªHaematoOptº to IR. The funders had no vey conducted by Nature, more than half of the participating researchers believed that there is role in study design, data collection and analysis, decision to publish, or preparation of the a significant crisis in model and data reproducibility [6]. Our platform MAGPIE addresses this manuscript. issue by providing a single platform where researchers from different domains can publish, exchange and execute models on-line (Fig 1) as well as publish and exchange data. Competing interests: The authors have declared that no competing interests exist. In the past, there have been several approaches to provide a unified platform for computa- tional models in the biological domain: Standardised description languages for models, fore- most SBML (Systems Biology Markup Language) [7, 8], but also CellML [9], NeuroML [10] or MultiCellDS [11] and associated model databases such as BioModels [12] provide platforms to store and exchange standardised models. The Cardiac Electrophysiology Web Lab [13] is a fur- ther example of a database for standardised models, which allows to compare model behaviour under different conditions. Such tailor-made approaches for the description of specific pro- cesses are valuable for specific domains, but impose restrictions on other applications and their interoperatibility. Going one step further, workflow tools such as Taverna or GALAXY [14, 15] are application-oriented and help the user to integrate multiple analysis steps. Here, it is possible to combine SBML models, custom scripts, and web services. While this approach is powerful, the platforms themselves bring a high overhead when it comes to simple applications and may be unintuitive for inexperienced users. On the other side of the software spectrum, social networks for scientists, e.g. ResearchGate or Mendeley, aim to make individual research more visible by allowing the exchange of publications and expertise. However, they do not cur- rently offer to share and publish data-driven analyses or modelling approaches. The main objective of MAGPIE is to provide a user-friendly system to publish, version, access and execute models in an environment that is accessible for different end users. Con- trary to the idea of developing a new modelling standard, different types of computational Fig 1. Collaborative research with MAGPIE. Physicians and biologists (left) can use models provided by computational experts (right) to analyse their data in MAGPIE (center). The central element of the platform comprises a computing framework and a model database. https://doi.org/10.1371/journal.pcbi.1005898.g001 PLOS Computational Biology | https://doi.org/10.1371/journal.pcbi.1005898 December 15, 2017 2 / 11 MAGPIE: Simplifying access and execution of computational models models can be uploaded with minimal preparation. While MAGPIE also supports SBML, it puts no restrictions on programming languages or frameworks. The focus for model execution is not on complex workflows, but on one-click execution of single models. This makes it easy for end users to apply algorithms that have been previously deposited by computational experts. A hashtag-based system allows to share results and projects with other users. The central layers of MAGPIE are a repository of models and analysis tools, a project-ori- ented access to the models that allows changing parameter configurations and data input, pro- gram execution, visualization of results, as well as a twitter-like hashtag system to simplify the sharing of results increasing the visibility of individual researchers. Automated version control, remote access, and virtualization are available by default. Design and implementation MAGPIE server MAGPIE was implemented in Ruby on Rails 5. The public demo server is based on an nginx [16] environment in combination with foreman [17], Phusion Passenger [18], and Sidekiq [19] for simplified application management. Furthermore, it allows flexible scaling of MAGPIE for local installations and distributed cluster environments. The demo server uses PostgreSQL [20] as a database backend system. Security and virtualization MAGPIE ensures security features such as separateness, reproducibility and computation security, thereby allowing to be applied in systems medicine and potentially in clinical practice. It is implemented in a privacy-oriented fashion, such that everything a user wants to share has to be published explicitly. Additionally, MAGPIE relies on Docker [21] virtualization for exe- cuting jobs using docker-api for Rails. This brings the advantage of separateness and at the same time it can be ensured that a model can not harm the host system (e.g. the web server itself). The Docker containers also provide lightweight virtual model execution environments. Administrators of MAGPIE can install components and programming languages in these con- tainers to be used by the models or completely exchange the execution environment. Addi- tionally, they can use a role-based access control system. Version control The storage of models and corresponding data is organized by the repository system git to enable full reproducibility of calculated results. Whenever a new model is uploaded or an exist- ing model is updated, a new git repository is created or the existing model