D1074–D1082 Nucleic Acids Research, 2018, Vol. 46, Database issue
Published online 8 November 2017
doi: 10.1093/nar/gkx1037

DrugBank 5.0: a major update to the DrugBank database for 2018

David S. Wishart1,2,3,4,*, Yannick D. Feunang1, An C. Guo1, Elvis J. Lo1, Ana Marcu1, Jason R. Grant1, Tanvir Sajed2, Daniel Johnson1, Carin Li1, Zinat Sayeeda1, Nazanin Assempour1, Ithayavani Iynkkaran1,4, Yifeng Liu2, Adam Maciejewski1, Nicola Gale5, Alex Wilson5, Lucy Chin5, Ryan Cummings5, Diana Le5, Allison Pon1,5, Craig Knox1,5 and Michael Wilson1,5

1Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada, 2Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada, 3Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2N8, Canada, 4Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2R3, Canada and 5OMx Personal Health Analytics, Inc., 301-10359 104 St NW, Edmonton, AB T5J 1B9, Canada

Received September 15, 2017; Revised October 12, 2017; Editorial Decision October 13, 2017; Accepted November 03, 2017

*To whom correspondence should be addressed. Tel: +1 780 492 0383; Fax: +1 780 492 1071; Email: [email protected]

© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

ABSTRACT

DrugBank (www.drugbank.ca) is a web-enabled database containing comprehensive molecular information about drugs, their mechanisms, their interactions and their targets. First described in 2006, DrugBank has continued to evolve over the past 12 years in response to marked improvements to web standards and changing needs for drug research and development. This year’s update, DrugBank 5.0, represents the most significant upgrade to the database in more than 10 years. In many cases, existing data content has grown by 100% or more over the last update. For instance, the total number of investigational drugs in the database has grown by almost 300%, the number of drug-drug interactions has grown by nearly 600% and the number of SNP-associated drug effects has grown by more than 3000%. Significant improvements have been made to the quantity, quality and consistency of drug indications, drug binding data as well as drug-drug and drug-food interactions. A great deal of brand new data have also been added to DrugBank 5.0. This includes information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics) and protein expression levels (pharmacoproteomics). New data have also been added on the status of hundreds of new drug clinical trials and existing drug repurposing trials. Many other important improvements in the content, interface and performance of the DrugBank website have been made and these should greatly enhance its ease of use, utility and potential applications in many areas of pharmacological research, pharmaceutical science and drug education.

INTRODUCTION

DrugBank is a comprehensive, freely available web resource containing detailed drug, drug-target, drug action and drug interaction information about FDA-approved drugs as well as experimental drugs going through the FDA approval process. The rich, high-quality, primary-sourced content found in DrugBank has allowed it to become one of the world’s most widely used reference drug resources. It is routinely used by the general public, educators, pharmacists, pharmacologists, medicinal chemists, pharmaceutical researchers and the pharmaceutical industry (1). Since its first appearance in 2006, the evolution of DrugBank’s content and interface has largely been directed by the requests of its diverse user community and the efforts of dozens of skilled programmers, domain-specific experts and trained biocurators.

DrugBank 1.0, released in 2006, provided novel (at the time) physico-chemical data on selected FDA-approved drugs and their drug-targets (2). DrugBank 2.0, released in 2008, added pharmacological, pharmacogenomic and molecular biological data (3). DrugBank 3.0, released in 2010, added drug–drug and drug–food interactions, drug transporter data as well as pharmacokinetic information (4). DrugBank 4.0, released in 2014, added significant amounts of drug metabolism data, QSAR (quantitative structure activity relationships) data and ADMET (absorption, distribution, metabolism, excretion and toxicity) data (5). While each of these prior releases provided notable improvements and greatly enriched DrugBank’s data content, this year’s release represents the most significant expansion of DrugBank in more than a decade.

In particular, the quantity of existing data in DrugBank 5.0 has increased enormously. For instance, the number of approved (FDA, Health Canada, EMA, etc.) drugs in the database has grown from 1836 to 2358, the number of reported phase I/II/III investigational drugs has grown from 1219 to 4501, the number of drugs with experimentally acquired mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectra has grown from 690 to 3620, the number of drug-drug interactions has grown from 14 150 to 365 984 and the number of pharmacogenomic and SNP-associated drug effects has grown from 201 to 5993. In addition to this very significant expansion of its existing data, DrugBank 5.0 has added many new datasets or novel kinds of data. This includes information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics) and protein expression levels (pharmacoproteomics). New data have also been added on thousands of investigational drug clinical trials and various drug repurposing trials. Additionally, DrugBank’s curation team has greatly improved the quality and consistency of all existing drug indications, enhanced the information on drug-drug and drug-food interactions, filled in data gaps on more than 600 existing drugs and greatly improved the quality and quantity of drug-target binding data. Major improvements to the spectral viewing and spectral search tools, spectral data formats (compatible with SPLASH (6), mzML (7) and nmrML (8)), chemical taxonomies, chemical ontologies (9), as well as text and structure searching/matching have also been made. Further details on the additions and enhancements made to DrugBank 5.0 are described below.

DATABASE ADDITIONS AND IMPROVEMENTS

This section is divided into four subsections: (i) expansion and improvements made to DrugBank’s existing data, (ii) the addition of new data content and new data fields to DrugBank, (iii) enhanced DrugBank interface features and (iv) a new model for updating and distributing DrugBank.

Enhancement of existing data

Since 2006 DrugBank has seen a progressive expansion in the depth and breadth of its data as well as a significant enhancement to the quality and reliability of its informa-

474 (in DrugBank 4.0) to 1551 compounds in DrugBank 5.0. This corresponds to an increase of more than 300%. Given the growing concerns over illicit drugs and designer drugs as well as the continued interest by pharmacologists in understanding the toxicology profiles of withdrawn drugs, the curators for DrugBank 5.0 have also worked hard to expand and enhance this information. In particular, we have nearly doubled the number of drug entries in these categories from 268 (in DrugBank 4.0) to 409 compounds in DrugBank 5.0.

Historically, DrugBank has been noted for its extensive and very comprehensive data on drug targets. This continues to be a major focus for the DrugBank team and the number of drug targets (including proteins, RNA, DNA and other macromolecules) has increased from 4115 to 4563 unique molecules. Much of this increase was due to the inclusion of detailed target data for nearly 300 antibiotics. This enriched antibiotic dataset contains information from hundreds of target organisms and their corresponding molecular target data. An even more significant expansion has been seen in the number of known drug metabolizing enzymes and drug transporters, which nearly doubled from 253 (in DrugBank 4.0) to 497 different proteins (in DrugBank 5.0). While drug-target information is important for many pharmaceutical research applications, knowing how strongly certain drugs bind to their targets is often even more important. For this year’s release, the number of compounds with drug-target binding constant data has grown from 791 to 2242. Much of this binding constant data, along with most of the data on drug targets added to this year’s release of DrugBank, was acquired from the primary literature. Using primary sources and employing expert biocurators is one of the reasons that DrugBank’s data content has become so unique and so reliable. Over the past 12 years, 27 572 different peer-reviewed papers have been collected and assessed. Those sources meeting the acceptance criteria had their data manually extracted, validated and entered by the DrugBank curation team.

DrugBank is also well regarded for its ongoing efforts to compile comprehensive, detailed information on experimental and investigational drugs as well as their protein targets. This kind of information has been used by many researchers to explore new drug leads or to repurpose existing drugs. For DrugBank 5.0 the number of phase I/II/III investigational drugs has grown from 1219 to 4501 compounds. With many compounds moving from the ‘experimental’ category (i.e. drugs that are at the preclinical or animal testing stage) to the ‘investigational’ category (i.e. drugs that are in human clinical trials), the number of experimental drugs in DrugBank 5.0 actually dropped from 6009 to 4964.
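
The before-and-after counts quoted in this section can be compared on a common scale with a few lines of arithmetic. The following is a minimal sketch, not part of DrugBank itself: the dictionary `COUNTS` and the helpers `percent_change` and `fold_change` are illustrative names introduced here, and the figures are simply the (DrugBank 4.0, DrugBank 5.0) pairs stated in the text. Both the relative increase and the fold change are reported, since a rounded percentage can refer to either quantity.

```python
# Counts quoted in the text, as (DrugBank 4.0, DrugBank 5.0) pairs.
# The category labels are shorthand chosen for this sketch.
COUNTS = {
    "approved drugs": (1836, 2358),
    "investigational drugs": (1219, 4501),
    "experimental drugs": (6009, 4964),
    "drugs with MS/NMR spectra": (690, 3620),
    "drug-drug interactions": (14150, 365984),
    "SNP-associated drug effects": (201, 5993),
    "drug targets": (4115, 4563),
    "metabolizing enzymes/transporters": (253, 497),
    "drugs with binding constants": (791, 2242),
}


def percent_change(old: int, new: int) -> float:
    """Relative change from old to new in percent (negative for a decrease)."""
    return (new - old) / old * 100.0


def fold_change(old: int, new: int) -> float:
    """Ratio of the new count to the old count."""
    return new / old


if __name__ == "__main__":
    # Print one row per category: old count, new count, change and ratio.
    for name, (old, new) in COUNTS.items():
        print(
            f"{name:36s} {old:>7d} -> {new:>7d}  "
            f"{percent_change(old, new):+9.1f}%  ({fold_change(old, new):.2f}x)"
        )
```

Running the script prints one row per category. A negative percentage, as for the experimental drugs, reflects entries that were reclassified as investigational rather than removed.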