Metabolights: a Resource Evolving in Response to the Needs of Its

D440–D444 Nucleic Acids Research, 2020, Vol. 48, Database issue Published online 6 November 2019 doi: 10.1093/nar/gkz1019 MetaboLights: a resource evolving in response to the needs of its scientiﬁc community Kenneth Haug , Keeva Cochrane, Venkata Chandrasekhar Nainala, Mark Williams, Jiakang Chang, Kalai Vanii Jayaseelan and Claire O’Donovan * European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK Received September 24, 2019; Revised October 17, 2019; Editorial Decision October 17, 2019; Accepted October 31, 2019 ABSTRACT and environmental factors. Metabolomics is a powerful ap- proach because metabolites and their concentrations, un- MetaboLights is a database for metabolomics stud- like other ‘omics measures, directly reflect the underlying ies, their raw experimental data and associated biochemical activity and state of cells / tissues. Because metadata. The database is cross-species and cross- of this, metabolomics best represents the molecular pheno- technique and it covers metabolite structures and type. Metabolomics technologies yield many insights into their reference spectra as well as their biological basic biological research in areas such as systems biology roles and locations. MetaboLights is the recom- and metabolic modelling, pharmaceutical research, nutri- mended metabolomics repository for a number of tion and toxicology. leading journals and ELIXIR, the European infras- Our challenge is to capture the growing amount, depth tructure for life science information. In this article, we and diversity of metabolomic information and to make it describe the signiﬁcant updates that we have made easily available and interpretable to our users and integrated with the wider ‘omics community. We describe the signifi- over the last two years to the resource to respond cant developments that we have made with a focus on how to the increasing amount and diversity of data being we are positioning MetaboLights (1) to address its increas- submitted by the metabolomics community. We re- ing use in biosciences. freshed the website and most importantly, our submission process was completely overhauled to enable us to deliver a far more user-friendly submis- PROGRESS AND DEVELOPMENTS sion process and to facilitate the growing demand Growth of submissions for reproducibility and integration with other ‘omics. Metabolomics resources and data are available un- The MetaboLights repository has continued to see con- der the EMBL-EBI’s Terms of Use via the web at https: sistent year on year growth since its first release in 2012, //www.ebi.ac.uk/metabolights and under Apache 2.0 with a particularly notable exponential growth in the last at Github (https://github.com/EBI-Metabolights/). year (Figure 1A). The user base is global with the USA, China and the UK being the top countries for study submissions (Figure 1B). At the time of writing, the database INTRODUCTION hosts over 500 publicly available studies with a further 140 Metabolomics is the systematic study of the small molecu- awaiting public release due to publication requirements, and lar metabolites in a cell, tissue, biofluid or cell culture media ∼250 in preparation. MetaboLights supports publications that are the tangible result of cellular processes or responses in a range of journals with a significant proportion not to an environmental stress. Collectively, these metabolites only in specialist metabolomics journals but also in jour- and their interactions within a biological system are known nals from publication groups including Nature, Cell and as the metabolome. Just as genomics is the study of DNA PLOS. The studies in preparation reflect the fact that the and genetic information within a cell, and transcriptomics database is evolving into a resource where data can be de- is the study of RNA and differences in mRNA expres- posited throughout the experimental progress of a study sion; metabolomics is the study of substrates and prod- and in the future, we plan to also provide pre-processing ucts of metabolism, which are influenced by both genetic and analysis capabilities. *To whom correspondence should be addressed. Tel: +44 1223 494460; Fax: +44 1223 494468; Email: [email protected] Present address: Claire O’Donovan, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinx- ton, Cambridge CB10 1SD, UK. C The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Nucleic Acids Research, 2020, Vol. 48, Database issue D441 Figure 1. Growth and use of MetaboLights. (A) MetaboLights study submission rates per year (2019 total submissions estimated based on year to date). (B) Geographical distribution of submitted studies, including private studies (USA: 213, UK: 208, China: 207, Germany: 124, Spain: 44, Japan: 44, Italy: 36, France: 33, Netherlands: 31, Australia: 28, Belgium: 22, Sweden: 20). New online guided submission and editor In addition to the online submission process, we are actively working with the metabolomics community to The four stages in the MetaboLights submission and cu- provide solutions for larger scale resumable submission ration processes are: (i) Submitted, the user is adding all pipelines. A successful example of this is the recent devel- relevant raw data and information. (ii) In curation,the opment of a bespoke submission pipeline for metabolon MetaboLights curation team is editing where required. (https://www.metabolon.com). Metabolon submits raw and (iii) In review, the study is ready to be shared with jour- metadata to MetaboLights on behalf of their clients who nals. (iv) Public, the study is publicly available. As part of then complete the submission process with the online editor. our ongoing efforts to streamline and enhance the study Facilitated by the new MetaboLights RESTful API (https: submission and curation process, we have developed a //www.ebi.ac.uk/metabolights/ws/api/spec.html#!/spec), we new guided submission process to submit and edit stud- are now enabling programmatic submissions for Phenome ies online. This new submission tool (https://www.ebi.ac. centres and other large scale laboratories. uk/metabolights/editor/login) fully replaces our traditional desktop-based JAVA application (ISAcreator) (https://isa- tools.org/software-suite.html) while still using its function- Curation and content development alities relevant to metabolomics. The tool is also integrated with our resumable high-speed data transfer processes to A MetaboLights study’s discoverability and reusability is conveniently enable large volumes of data transfer simulta- enhanced by mapping the metadata to the most relevant neously while the user curates their study. The online sub- ontology resources. During the development of the online mission tool is context-aware and guides submitters step- tool, the curators took the opportunity to review the con- by-step through the process of describing the relevant exper- tent of all the studies and their related publications sub- imental metadata, such as study characteristics, protocols, mitted since 2012. This highlighted how the metabolomics instrumentation and related factors. This enhances both the field is evolving, in both the science and the technology. submitter experience and the accuracy and completeness of For example, while there are 3400 species represented in the the study. To aid the submitter further, there are short video MetaboLights database, there has been an increasing depo- tutorials available at each stage of the submission process, sition of human and mouse data which now constitutes 50% see Figure 2. of the studies, reflecting the use of metabolomics in clini- Upon completion of the submission, the submitter is pre- cal applications. The techniques are also diversifying from sented with a full study view to review and edit their study variations of liquid chromatography (LC) and gas chro- further should this be required. Going forward, we plan to matography (GC) based mass spectrometry (MS) and nu- extend this feature to facilitate ongoing studies/projects af- clear magnetic resonance (NMR) to more recent develop- ter the initial publication. Once the submitter is finished this ments in Image based MS and magnetic resonance. This re- initial stage, the MetaboLights curation team will start the view enabled us to align the metadata with the most relevant final curation and quality control process and will inter- ontology resources (see Table 1) As a result, submitters are act with the submitter as appropriate to ensure high quality provided with the closest matching ontology linked vocab- and comprehensive reporting of each study. Once this pro- ulary options within the new online editor. Where suitable cess is complete, the submitter will automatically receive in- terms are not available, submitters can make suggestions structions on how to safely allow direct read-only access to which are then reviewed by the curators. If the terms fall the study for the journals and their reviewers. The source within the remit of an existing ontology, they are submitted code for the new submission tool (https://github.com/EBI- to that resource. For more

Metabolights: a Resource Evolving in Response to the Needs of Its

The Structure and Antioxidant Properties

S41467-020-16549-2.Pdf

DATABASE Report April 2019 [email protected] TABLE of CONTENTS TABLE of CONTENTS

Using Natural Language Processing Methods to Support Curation of a Chemical Ontology., Doctor of Philosophy, May 2014 DECLARATION

The Experimental Factor Ontology Utilities in Knowledge Discovery and Sharing

SOT 2017 P225 AOP Poster Luettich Et Al Final

Expanding Opportunities for Mining Bioactive Chemistry from Patents, Drug Discov Today: Technol (2015), 10.1016/J.Ddtec.2014.12.001

The Role of Uniprot's Protein Sequence Databases in Biomedical Research

Carbon Dioxide - Wikipedia

Evaluation and Cross-Comparison of Lexical Entities of Biological Interest (Lexebi)

Chemistry Requirements

Extension of Roles in the Chebi Ontology