BioModels Database

Camille Laibe

HARMONY 2012, 21­25th May 2012, Maastricht EBI is an Outstation of the European Laboratory. What is BioModels Database?Who are we?

BioModels.net team Technology part of the Computational Systems Neurobiology group (Nicolas Le Novère) at EMBL­EBI Standards: Minimal Information Required In the Annotation of Models (MIRIAM), Minimal Information About a Simulation Experiment (MIASE), Graphical Notation (SBGN), … Formats: Systems Biology Markup Language (SBML), Simulation Experiment Description Markup Language (SED­ML), ... Ontologies: Systems Biology Ontology (SBO), Kinetic Simulation Algorithm Ontology (KiSAO), TErminology for the Description of DYnamics (TEDDY), … Services: BioModels Database, MIRIAM Registry, Identifiers.org, ... Tools: libSBML, JSBML, SBFC, SBMLeditor, …

2 Basic (biochemical) model exampleWhat is BioModels Database?

k k A+B 1 A_B 3 Ap+B k2

d[A]/dt = - k1[B][A] + k2[A_B] d[Ap]/dt = + k3[A_B] d[B]/dt = - k1[B][A] + k2[A_B] + k3[A_B] d[A_B]/dt = + k1[B][A] - k2[A_B] - k3[A_B]

[x]

3 t How to build a model?

biological model mathematical model

simulation computational model4 Tyson et al (1991) PNAS 88(1):7328-32 What is BioModels Database?Why are models useful?

quantitative / dynamic understanding of biological systems integration of data from various scales make clear the current state of knowledge effective way of highlighting gaps in the knowledge

prediction of the behaviour of systems under certain conditions sometimes the only tool available

design novel experiments

Models are significant tools in Systems Biology

5 Requirements for storage and exchangeStorage and exchange of models

Modellers need to: find understand reuse existing models

6 Requirements for storage and exchangeStorage and exchange of models

Modellers need to: find understand reuse combine existing models

This requires: standard formats

access to published models

provision of reliable models: curated and annotated 7 http://www.ebi.ac.uk/biomodels/ 8 Examples of models in BioModels DatabaseWhat is BioModels Database?

Biochemical models interactions between molecules in multiple cellular compartments Pharmacometrics models tumor growth and treatment response Single­compartment neurons membrane voltage, current flow, concentrations of various ions intra­ and extracellularly Spread of infectious diseases outbreak of zombie infection Ecosystem models interaction of living in a given environment ...

9 What is BioModels Database?Curated branch content

10 Production pipeline

11 Model submission

12 Support for depositionModels provenance

From authors prior to publication Supported (listed in instructions for authors) by > 300 journals, including:

● Molecular Systems Biology

● All PLoS journals

● All BioMedCentral journals

● …

Submitted by curators implemented from literature imported from journal supplementary materials exchanged with other repositories (DOQCS, CellML Model Repository, JWS Online, ...)

13 Provided by other people curating models out of interest Model submission (step 1)

14 Model submission (step 2)

15 Curation

16 MIRIAM

The Minimum Information Required In the Annotation of a Model

http://biomodels.net/miriam/

MIRIAM STANDARDMIRIAM Guidelines

set of guidelines for the curation and annotation of quantitative models about encoding and annotation applicable to any structured model format

cf. Nicolas Le Novère et al. Minimum Information Requested in the Annotation of biochemical Models (MIRIAM). Nature , 2005

MIRIAM Compliance

Models must (among other things):

be encoded in a public machine­readable format be clearly linked to a single publication reflect the structure of the biological processes described in the reference paper (list of reactions, ...) be instantiable in a simulation (possess initial conditions, ...) be able to reproduce the results given in the reference paper contain creator’s contact details annotated: each model constituent must be unambiguously identified

Annotation

20 Curated and non­curated branchesSupport for deposition

Curated branch MIRIAM compliant models Non­curated branch valid SBML but not curated or annotated not MIRIAM compliant models cannot reproduce published results different model structure non kinetic model (FBA, stoichiometric maps, ...) MIRIAM compliant models models contain kinetic that we cannot curate up to now work in progress, will be moved to curated branch in the near future

21 Why are annotations important?

Annotations, and generally metadata, are essential for:

understanding data reusing data comparing data integrating data converting data providing efficient search strategies ...

22 Why are annotations important?

Annotations, and generally metadata, are essential for:

understanding data reusing data comparing data integrating data converting data providing efficient search strategies ...

→ true for any kind of data!

23 Identifiers for annotations

Unique and unambiguous an identifier must never be assigned to two different objects Perennial the identifier is constant and its lifetime is permanent Standards compliant must conform on existing standards, such as URI Resolvable identifiers must be able to be transformed into locations of online resources storing the object or information about the object Free of use everybody should be able to use and create identifiers, freely and at no cost

Towards globally unique identifiers

Entity Namespace identifier

Identifies a Identifies a data data collection entry within the data collection

provided by the from a shared list of data collection namespaces unique within the data collection format defined by the data collection

25 Provision of URIs for annotations

MIRIAM Registry catalogue of data collections and their associated namespace provides perennial identifiers for annotation and cross­ referencing purposes

Human calmodulin: P62158 in UniProt urn:miriam:uniprot:P62158

Alcohol dehydrogenase: 1.1.1.1 in Enzyme Nomenclature urn:miriam:ec-code:1.1.1.1

Activation of MAPKK activity: GO:0000186 in Gene Ontology 26 urn:miriam:obo.go:GO%3A0000186 Provision of URIs for annotations

MIRIAM Registry catalogue of data collections and their associated namespace provides perennial identifiers for annotation and cross­ referencing purposes

Identifiers.org

● built on the information stored in the Registry

● provides directly resolvable URIs Human calmodulin: P62158 in UniProt http://identifiers.org/uniprot/P62158

Alcohol dehydrogenase: 1.1.1.1 in Enzyme Nomenclature http://identifiers.org/ec-code/1.1.1.1

Activation of MAPKK activity: GO:0000186 in Gene Ontology 27 http://identifiers.org/obo.go/GO:0000186 Qualified annotation

Current BioModels.net QualifiersBioModels.net qualifiers

bqmodel:is bqbiol:isPropertyOf bqmodel:isDerivedFrom bqbiol:isVersionOf bqmodel:isDescribedBy bqbiol:hasVersion bqbiol:isHomologTo bqbiol:is bqbiol:isDescribedBy bqbiol:isDescribedBy bqbiol:encodes bqbiol:hasPart bqbiol:isEncodedBy bqbiol:hasProperty bqbiol:occursIn bqbiol:isPartOf [...]

http://biomodels.net/qualifiers/ 29 Annotations in SBML

30 Publication

31 List of models 32 Model browsing via GO terms 33 Model browsing via GO terms 34 Model browsing via 35 Model search 36 Advanced model search 37 Search strategyBioModels.net

38 Taxonomic search

Searching for mammalia retrieves all models fitting mammals both up and down the phylogenetic tree

Fungi Homo sapiens

Fungi / Metazoa Mammalia Rattus norvegicus Metazoa

Arthropoda

39 Taxonomic search

metazoa/fungi homo sapiens

rattus

rattus norvegicus

mammalia

40 41 42 43 44 User interface

45 User interface

46 Curation information 47 Graphical export 48 Online simulation 49 JWS Online 50 Sub­model creation 51 Sub­model creation 52 Sub­model creation 53 Sub­model creation 54 Sub­model creation 55 Model of the month 56 Report issues 57 Web Services 58 Supported formats

XPP-Aut Octave

VCML (Vcell)

59 Converter framework

System Biology Format Converter generic framework that potentially allows any conversion between two formats aims to be easily extended currently supported: conversion from SBML to SBGN­ML, BioPAX Level 2 and Level 3, XPP, Octave, Dot, ... allows the combination of several existing converters (conversion pipeline) collaborative project developed in Java online conversion service: http://www.ebi.ac.uk/compneur­srv/converters/ (beta)

http://sourceforge.net/projects/sbfc Path2Models

large scale generation of quantitative models form pathways initial pathways coming from databases (such as KEGG) completed with information coming from other resources (BioCarta, MetaCyc, ...) converted into SBML enriched with cross­references mathematical models are generated (using common modular rate law for the metabolic networks and logical models for the signalling pathways) whole genome metabolism models are reconstructed using flux balance constraint methods http://www.ebi.ac.uk/biomodels­main/path2models

Path2Models 62 Path2Models: browsing 63 Path2Models: display 64 Path2Models acknowledgements

Finja Büchel Claudine Chaouiya Andreas Dräger Sarah Keating Camille Laibe Julio Saez­Rodriguez Nicolas Le Novère Martijn Van Iersel Florian Mittag Tobias Czauderna Nicolas Rodriguez Michael Hucka Michael Schubert Roland Keller Niel Swainston Falk Schreiber Clemens Wrzodek

65 What is BioModels Database?Database content (r21)

160000 1000 M s

o n

d o 900 i 140000 e t

l

s a

l 800 e 120000

R 700 100000 600

80000 500

400 60000 300 40000 200 20000 100

0 0 04/2005 04/2006 04/2007 04/2008 04/2009 04/2010 04/2011 04/2012

Evolution of the content of BioModels Database 66 What is BioModels Database?Database content (r22)

M s

o n 30000000 150000

d o i 140000 e t

l

s a 130000 l 25000000

e 120000

R 110000 20000000 100000 90000 80000 15000000 70000 60000 10000000 50000 40000 30000 5000000 20000 10000 0 0 04/2005 12/2005 08/2006 04/2007 12/2007 08/2008 04/2009 12/2009 08/2010 04/2011 12/2011

Evolution of the content of BioModels Database 67 Terms of use

68 New infrastructure requirements

Features format independent full model versioning ranked search results (making full use of annotations) private secured access to the pipeline for the models you submitted collaboration: model sharing and development standard access for reviewers (before model publication) Software easy deployment and reuse (independent of EBI infrastructure) easy to extend (usage of plugins) improved performance (more and larger models) improved security customisable theme 69 ... JUMMP: JUst a Model Management Platform

https://bitbucket.org/jummp/jummp/

70 JUMMP: architecture

71 Technologies

Groovy

Grails (Spring, Hibernate, …)

Spring Security

Hibernate Search

Apache ActiveMQ / D­Bus

jQuery, jQuery UI

JSBML

Subversion / Git 72 ... Development model

new application focused on security, performance and flexibility multiple instances running in various institutes (various projects using the software to run their infrastructure) EBI (and its mirrors) remains the location where models are publicly available

community developed project initially undertaken by: European Institute (EBI) German Cancer Research Center (DKFZ)

73 JUMMP: acknowledgements

Jürgen Eils Martin Gräßlin Jochen Schramm Michael Hoehl

74 Acknowledgements

BioModels.net Team:

Viji Chelliah Nicolas Le Novère Mihai Glont Stuart Moodie Nick Juty Nicolas Rodriguez Sarah Keating Maciej Swat Camille Laibe Yangyang Zhao

75 [email protected]