Transitioning BioCyc to a Subscription Model

Peter D. Karp
SRI International
ecocyc.org  biocyc.org  metacyc.org
© 2014 SRI International

BioCyc.org Collection of 9,300 Pathway/Genome Databases
• Pathway/Genome Database (PGDB): combines information about
  – Pathways, reactions, substrates
  – Enzymes, transporters
  – Genes, replicons
  – Transcription factors/sites, promoters, operons
• Tier 1: Highly curated PGDBs
  – MetaCyc, HumanCyc, YeastCyc
  – EcoCyc: Escherichia coli K-12
  – AraCyc: Arabidopsis thaliana
• Tier 2: Moderately curated (44 PGDBs)
  – Bacillus subtilis, Mycobacterium tuberculosis
• Tier 3: Computationally-derived DBs

BioCyc Use Cases
• Access to an extremely wide range of curated and computationally predicted information:
  – Genes and proteins
  – Metabolic pathways, reactions, metabolites
  – Regulatory networks
• Gene expression data analysis
• Metabolomics data analysis
• Execute metabolic models
• Metabolic route searches
• Comparative analysis

Highly Curated Pathway/Genome Databases

Database   Organism        Organization               Publications Curated From
MetaCyc    Multiorganism   SRI                        51,000
EcoCyc     E. coli         SRI                        32,000
BsubCyc    B. subtilis     SRI                        4,000
HumanCyc   H. sapiens      SRI
AraCyc     A. thaliana     TAIR/Carnegie Institution  4,100
YeastCyc   S. cerevisiae   SGD/SRI                    980
MouseCyc   M. musculus     MGD/Jackson Laboratory

http://biocyc.org/otherpgdbs.shtml

Creation of BioCyc Databases
[Diagram: NIH RefSeq, computational inferences, curation, and data import feed into the PGDB]
• Computational Inferences
  – Predict metabolic reactions
  – Predict operons
  – Predict transport reactions
  – Compute orthologs
  – Predict metabolic pathways
  – Compute Pfam domains
  – Predict pathway hole fillers
• Curation
• Data Import
  – Regulatory data, database links [regtransbase]
  – Organism phenotype data
  – Subcellular locations [psortdb]
  – Gene essentiality data
  – Phenotype microarray data
  – GO terms [uniprot]
  – Protein features [uniprot]

BioCyc Curated Data
• Gene functions
• Metabolic pathways, reactions, metabolites
• Regulatory interactions

Current Funding Sources for Curation
• EcoCyc grant from NIH/NIGMS (3 FTE curators)
• MetaCyc grant from NIH/NIGMS (1 FTE curator)
• Support curation of two of our 9,600 databases
• Additional revenues will let us curate additional databases

BioCyc has Moved to a Subscription Model
• Gov’t supported databases remain free/open
• Other databases accessible via subscription
• Subscriptions available to individuals and institutions
• Institutional subscription price depends upon usage level
• Phoenix Bioinformatics provides us with sales, marketing, and paywall services

• Estimated cost/article for curation in EcoCyc project:
  – $219
  – 6-15% open-access publication fee
  – Slightly more than 10% of the cost of coffee breaks for an R01 project

• Randomly choose curated assertions from CGD and from EcoCyc
• Validate accuracy of those assertions in publications
• CGD error rate: 1.82%
• EcoCyc error rate: 1.40%

• No

• NL-understanding problem is 60 years old
• Lots of progress, but error rates are unacceptable (18%, 24%, 45%)
• Info extraction software typically extracts narrow slivers of info
• Cannot arbitrate among conflicts in the literature
• Some evidence that info-extraction software can speed curation

Curation Complexity Varies Among Databases
• Number of extracted datatypes
• Number of database fields
• Amount of meta-data (evidence codes)
• Amount of interpretation and synthesis
• Authoring of mini-reviews
• End-uses of information (metabolic modeling)

• Much evidence to date indicates crowd-sourced curation is not a successful model
• The author-curation model shows more promise for biocuration
