S-01

(Registration number 2002IS006) Markup Language Standard

Research coordinator
Hiroaki Kitano, The Systems Biology Institute, Japan

Research team members
John Doyle, California Institute of Technology, USA
Andrew Finney, University of Hertfordshire, UK
Poul Nielsen, University of Auckland, New Zealand

Duration: April 2002 – March 2005

Abstract

Building and analyzing models is the central approach in systems biology. However, no ideal software platform is available today for this research, which is a major problem. Solving it requires (a) the development and standardization of a modeling language that can represent biological phenomena, and (b) the development of software tools that comply with the standardized language.

Therefore, we have proposed the Systems Biology Markup Language (SBML) as a modeling language and have developed the Systems Biology Workbench (SBW). The software has been publicly released under an open source license (LGPL).

With the NEDO funding, we conducted the following activities:
(1) Definition of the SBML Level-2 specification
(2) Proposal of the SBML Level-3 specification
(3) Promotion of SBML

More than 90 groups have committed to developing SBML-compliant software. More are expected as Level-3 becomes available.

From the viewpoint of international standardization, we plan to establish SBML first as a de facto standard and then as a de jure standard. On the de facto side, major journals such as Nature have decided to endorse SBML as the standard representation for model data, and several commercial and public pathway databases have adopted SBML. On the de jure side, our submission to the IETF has been accepted. We have approached the OMG and continue to pursue more established approval bodies such as IEEE and ISO.

Keywords: Systems Biology, SBML, SBW, Standardization

1. Introduction

One of the big challenges in post-genomic biology is to model and understand entire biological processes, from cellular behavior to gene regulatory networks to metabolic pathways. Building models and analyzing them from various perspectives is the approach of computational systems biology, and it may have a major impact on science and drug discovery. A large number of developers and researchers are currently developing software for modeling and analysis. Despite those efforts, the software packages are scarcely compatible with each other, so a model built in one package often cannot be easily transferred to another. In addition, a wide variety of methodologies exists for analysis and even for model building, and it is not realistic for a single software package to cover all of these functions. Therefore, no ideal software platform is available today for systems biology research, which is a major problem.

We believe that solving this problem requires (a) the development and standardization of a modeling language in which various phenomena can be represented, and (b) the development of software tools that comply with the standardized language and can therefore exchange models with each other.

2. Strategy

Based on this idea, we have proposed the Systems Biology Markup Language (SBML) as a modeling language and have developed the Systems Biology Workbench (SBW). The software has been publicly released under an open source license (LGPL).

With the help of the NEDO funding, we were able to conduct the following main activities toward the standardization of SBML:
(1) Definition of the SBML Level-2 specification
(2) Proposal of the SBML Level-3 specification
(3) Publicity activities for SBML

Fig 1: SBML Website (http://sbml.org)

3. Research

3.1 Definition of SBML specifications

The first SBML specification, Level-1, was released in March 2001. The Level-2 specification (released June 2003) is now available and has been adopted by many software packages. The upcoming Level-3 (not yet released) will add various features that could not be represented in previous versions, such as (1) spatial information of components, (2) modulation of proteins, and (3) multi-cellular models. These features will help modelers and researchers simulate and analyze models in a much more practical manner. To date, 80 groups have committed to developing software with SBML-compliant features. More groups are expected to join our community as new SBML Levels are released, work that will continue in the next fiscal year.
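To make the idea of a tool-independent model format concrete, the following is a minimal sketch of a Level-2 style document and of how any program can read it with a generic XML parser. The toy model (`toy_model`, species `S1` and `S2`, reaction `R1`) and the helper `summarize` are hypothetical and introduced only for illustration; the namespace string is assumed to be the one used by SBML Level 2 Version 1. It does not represent how any of the tools in this report read SBML.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for SBML Level 2 Version 1 documents.
SBML_L2_NS = "http://www.sbml.org/sbml/level2"

# Hypothetical toy model: one reaction R1 converting S1 into S2.
MODEL_XML = """\
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" compartment="cell" initialConcentration="1.0"/>
      <species id="S2" compartment="cell" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <listOfReactants>
          <speciesReference species="S1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S2"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Return the level, species ids, and reactions found in an SBML string."""
    ns = {"sbml": SBML_L2_NS}
    root = ET.fromstring(sbml_text)
    model = root.find("sbml:model", ns)

    species = [
        s.get("id")
        for s in model.findall("sbml:listOfSpecies/sbml:species", ns)
    ]

    reactions = []
    for r in model.findall("sbml:listOfReactions/sbml:reaction", ns):
        reactants = [
            ref.get("species")
            for ref in r.findall("sbml:listOfReactants/sbml:speciesReference", ns)
        ]
        products = [
            ref.get("species")
            for ref in r.findall("sbml:listOfProducts/sbml:speciesReference", ns)
        ]
        reactions.append((r.get("id"), reactants, products))

    return {"level": root.get("level"), "species": species, "reactions": reactions}


summary = summarize(MODEL_XML)
print(summary)
```

Because the model is plain XML with a declared level and version, a modeling tool can write it and a separate simulator or analysis tool can read it without sharing any code. In practice a dedicated library such as libSBML (listed in Table 1) would be preferable to hand-written parsing.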

A) Modeling B) Simulation D) Utilities CellDesigner BALSA Kinsolver libSBML NetBuilder BIOCHAM MesoRD MathSBML CADLIVE BioCharon MMT2 KEGG2SBML Bio Sketch Pad BioNetGen Moleculizer CellML2SBML BioTapestry BioSPICE Dashboard NetBuilder biocyc2SBML BioUML BioSpreadsheet PathwayLab BioGrid JDesigner BioUML PathwayBuilder BSTLab PaVESy CADLIVE PNK CL-SBML SRS Pathway Editor CellDesigner PROTON ecellJ Cellerator PySCeS MATLAB SBToolbox

Cellware runSBML Monod C) Analysis Copasi SBML ODE Solver Pathway Tools BioSens (UCSB) DBsolve SBMLSim SBMLeditor Cytoscape Dizzy SigTran SBMLR XPPAUT E-Cell SIMBA SBMLToolbox FluxAnalyzer ESS Simpathica SBW MetaboLogica Fluxor StochSim MetaFluxNet Gepasi STOCKS SCIpath INSILICO discovery TERANODE Suite SimWiz Jarnac Trelis JigCell JSIM WinSCAMP

Table 1: List of SBML-compliant software

3.2 International Standardization

From the viewpoint of international standardization, our first step was to establish SBML as a de facto standard by involving large numbers of researchers and organizations in our community. As the second step, we will pursue de jure standardization of SBML under a carefully planned scheme. During this period we focused on establishing the de facto standard, but at the same time we wrote documents for de jure standardization. In fact, our submission of SBML to the IETF (The Internet Engineering Task Force, http://www.ietf.org/) to register the “application/sbml+xml” MIME media type has been accepted. Regarding de facto standard formation, it is noteworthy that major journals such as Nature have decided to endorse SBML as the standard representation for model data that should be submitted with a paper. In addition, several commercial and public pathway databases have started to use SBML as their underlying representation standard. To promote de jure standardization, we are already in contact with the OMG (Object Management Group, http://www.omg.org) and continue to pursue more established approval bodies such as IEEE and ISO.
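As a small illustration of what a registered media type offers in practice, the sketch below maps a file name to “application/sbml+xml” so that a server or data repository could label SBML files consistently. The `.sbml` file extension and the helper names `build_registry` and `content_type_for` are assumptions made for this example and are not taken from the project.

```python
import mimetypes

# Media type named in the text above.
SBML_MEDIA_TYPE = "application/sbml+xml"


def build_registry(extension=".sbml"):
    """Create a private MIME registry that knows about SBML files.

    The ".sbml" extension is an assumption for this sketch. A private
    MimeTypes instance is used so the interpreter-wide table is untouched.
    """
    registry = mimetypes.MimeTypes()
    registry.add_type(SBML_MEDIA_TYPE, extension)
    return registry


def content_type_for(filename, registry):
    """Return the media type for a file name, with a generic fallback."""
    media_type, _encoding = registry.guess_type(filename)
    return media_type or "application/octet-stream"


registry = build_registry()
print(content_type_for("model.sbml", registry))
```

A database that serves models over HTTP could send this value in the `Content-Type` header, letting client software recognize an SBML payload without inspecting its contents.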

Submission of SBML files

The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks. SBML is applicable to metabolic networks, cell-signaling pathways, regulatory networks, and many others. Where relevant and possible, authors are encouraged to submit datasets in SBML format. Authors should select 'SBML' from the available list of data file formats when uploading the data set file.

Fig 2: From Nature Molecular Biology Website (http://mts-msb.nature.com/cgi-bin/main.plex?form_type=display_auth_instructions)

Fig 3: PANTHER PATHWAY Applied Biosystems Inc.

4. Discussions

Today, SBML is well known as the de facto standard for computational modeling in systems biology. The success of SBML has led to requests from the community for new features and continued evolution of the language. We have served as organizers and editors in the development of SBML and have respected the extensive feedback from the open community. We invited individual researchers interested in SBML to our SBML forum and discussed the evolution of the language with them. At the same time, we developed software infrastructure, including programming libraries, conversion utilities, interface packages for commonly used software environments, and web-based online tools. Most of this software is distributed as open source to maximize its accessibility and utility. We think that this two-pronged approach (open, community-based discussion and support of software infrastructure) is essential for standardization.

5. Conclusions

Overall, our project went well and achieved its goal of establishing a de facto standard in the field of biological modeling. This is clearly illustrated by the fact that over 80 software packages have adopted SBML as their model representation language. In addition, in the May 5 issue of Nature, Nature Publishing Group made it clear that it endorses SBML as a format for submitting biological models. The next step is to make SBML a de jure standard, and we are working toward this goal as well. Importantly, the de facto status that SBML has established, with strong backing from major publishers such as Nature, makes it almost certain that we can also achieve this goal.

Acknowledgement (reproduced from SBML.ORG)

The development of SBML was originally funded by the Japan Science and Technology Agency under the ERATO Kitano Symbiotic Systems Project. Support for the continued development of SBML and associated software and activities today comes from the following sources:

• National Human Genome Research Institute (USA)
• National Institute of General Medical Sciences (USA)
• International Joint Research Program of NEDO (Japan)
• ERATO-SORST Program of the Japan Science and Technology Agency (Japan)
• Ministry of Agriculture (Japan)
• Ministry of Education, Culture, Sports, Science and Technology (Japan)
• BBSRC e-Science Initiative (UK)
• DARPA IPTO Bio-Computation Program (USA)
• Army Research Office's Institute for Collaborative Biotechnologies (USA)
• Air Force Office of Scientific Research (USA)

Additional support is provided by the California Institute of Technology (USA), the University of Hertfordshire (UK), the Molecular Sciences Institute (USA), and the Systems Biology Institute (Japan).

References

The list of the most important papers and patents from the project.

Papers (Total 34)
1. Kitano, H. Standards for modeling. Nature Biotechnology. 337, 20, UK, April 2002.
2. Hucka, M.; Finney, A.; Sauro, H.M.; Bolouri, H.; Doyle, J.C.; Kitano, H. et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 524-531, 19, UK, March 1, 2003.
3. Kitano, H. A graphical notation for biochemical networks. BIOSILICO. 1, 5, 169-176, USA, November 2003.
4. Funahashi, A.; Tanimura, N.; Morohashi, M.; Kitano, H. CellDesigner: a process diagram editor for gene-regulatory and biochemical networks. BIOSILICO. 1, 5, 159-162, USA, November 2003.
5. Funahashi, A.; Morohashi, M.; Hucka, M.; Finney, A.; Sauro, H.; Bolouri, H.; Doyle, J.; Kitano, H. Software infrastructure for systems biology (in Japanese). Nature. 423, 6940, Insight, 48-57, Japan, 2003.
6. Sauro, H.; Hucka, M.; Finney, A.; Wellock, C.; Bolouri, H.; Doyle, J.; Kitano, H. Next Generation Simulation Tools: The Systems Biology Workbench and BioSPICE Integration. Omics. 7, 4, 355-372, USA, 2003.
7. Finney, A.; Hucka, M. Systems biology markup language: Level 2 and beyond. Biochemical Soc. Trans. 1472-1473, 31, UK, 2003.
8. Crampin, E. J.; Halstead, M.; Hunter, P.; Nielsen, P.; Noble, D.; Smith, N.; Tawhai, M. Computational physiology and the physiome project. Experimental Physiology. 89, 1, 1-26, UK, January 2004.
9. Oda, K.; Kimura, T.; Matsuoka, Y.; Funahashi, A.; Muramatsu, M.; Kitano, H. Molecular Interaction Map of Macrophage. AfCS Research Reports. 2, UK, Aug. 25, 2004.
10. Hucka, M.; Finney, A.; Bornstein, B.J.; Keating, S.M.; Shapiro, B.E.; Matthews, J.; Kovitz, B.L.; Schilstra, M.J.; Funahashi, A.; Doyle, J.C.; Kitano, H. Evolving a lingua franca and associated software infrastructure for computational systems biology: the Systems Biology Markup Language (SBML) project. IEE Systems Biology. 41-53, 1, UK, June 2004.
11. Lloyd, C. M.; Halstead, M.; Nielsen, P. F. CellML: Its future, present and past. Progress in Biophysics and Molecular Biology. 85, 433-450, UK, 2004.
12. Kitano, H.; Funahashi, A.; Matsuoka, Y.; Oda, K. Using process diagrams for the graphical representation of biological networks. Nature Biotechnology. 23, 8, 961-966, UK, 2005.

Presentations (Total 151)

Patents (Total 0)