Egon Willighagen (@Egonwillighagen)

How I failed to do Open Notebook Cheminformatics
Egon Willighagen (@egonwillighagen)
14 July 2014, Jean-Claude Bradley Memorial Symposium #jcbms
Department of Bioinformatics - BiGCaT

Jean-Claude Bradley

“Open Notebook Science”
First response: jealousy

How I failed
• I did Open Science
  – Strong focus on reproducibility
  – Open Source, Open Data, Open Standards (ODOSOS)
• I did share notes...
• I wrote up the stories (in a blog)...

Realization
• Scholars need notebooks
  – They need exact instructions
  – Just giving the outcome and tools is not enough
• This applies to cheminformatics too

First notes were during education

ODOSOS
• Software:
  – The Chemistry Development Kit
    • based on CompChem, Jmol and JChemPaint
  – Bioclipse, Jmol, ...
• Data
  – Blue Obelisk Data Repository
  – RDF translations of knowledge bases
• Standards
  – eNanoMapper ...

Scribbling...

Scribbling...

Scribbling...

Why important?
• Going back to the original (raw data).
• Pedagogical effect
• Education (howto's)
• Machines care about negative data
• If we want to progress, we need to understand not just global patterns, but the fine details too

Why cheminformatics too?
• Where is the latest RDF of solubility data? Of the melting point data?
• The trust problem applies to algorithms as much as data
• What if...

What I have in mind...
Wikipedia, CC-BY-SA, http://en.wikipedia.org/wiki/Curtin%E2%80%93Hammett_principle

Possible ONS of cheminformatics
• Is this set of atom types covering ChEBI?
• Can we map this metabolomics data to pathways?
• How many CAS registry numbers can I resolve for this data set?

Conclusion
Some patience is needed, but I will start Open Notebook Science. (And I will push this concept with my Bart and Cristian too.)

Benchmarking / metrics
