Free and Open Source Software (FOSS) for Bioinformatics and Computational Biology
Pjotr Prins - 14 Feb 2013
Wageningen University

What is FOSS?
  FOSS: ‘free and open source software’
  Software that is both free software and open source
  Licensed to grant users the right to use, copy, study, change, and improve
  Availability of source code

Licensing
  License protects copyright and more
  GNU General Public License (GPL) - change code, but always make source code available when publishing the software (Linux)
  BSD License - change code, but retain the copyright notice (FreeBSD)
  Artistic License - do what you want with the code (Perl)
  Others (Mozilla)

Science
  Open, transparent -> reproducible
  Peer review
  Build on the shoulders of giants!
  Linux popular in Bioinformatics
    Great for writing software
    Great for running software

Open-Bio
  Open Bioinformatics Foundation (OBF)
  BioPerl, Biopython, BioRuby [a], BioJava. No BioJS (yet)
  State of the projects - GitHub
  [a] BioRuby: bioinformatics software for the Ruby programming language, by Naohisa Goto, Pjotr Prins et al., Bioinformatics, 2010

GitHub
  Track projects
  Publish (your) project!
  Use the code!

Biogems.info
  Decentralized development for (Bio)Ruby [a]
  Visibility & easy install
  GitHub integration
  Integration testing
  [a] Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics, by R. Bonnal, P. Prins et al., Bioinformatics, February 2012

Tradition: Ease vs Speed
  Easy but slow: dynamically typed (Perl, R, Python, Ruby)
  Hard but fast: statically typed (C, C++, FORTRAN)
  Mix (Perl|Python|Ruby|R) & C

Choosing a language
  Dynamic type, slowish
    Perl - ah, Perl, awful
    R - awful, but common
    Python - science! Graphics, math, statistics, R integration
    Ruby - cleaner, OOP, functional (like!)
    JavaScript (fast, common, a bit ugly)

Choosing a language (2)
  Fixed type, fast
    Structured: C, Go
    OOP: Java
    Functional languages: Clojure, Erlang, Haskell
    Hybrid: Scala (like!), D (like!)

Tips
  Don’t be afraid of languages
  Mix and match [a]
  Use one dynamic (easy), one static language (fast)
  JVM (Java, Groovy, Scala, Clojure, JRuby, JavaScript)
  Computer language shootout
  [a] Sharing programming resources between Bio* projects through remote procedure call and native call stack strategies, by Pjotr Prins et al., Evolutionary Genomics: statistical and computational methods, Meth. Mol. Biol. 2012

What libraries?
  Bio* projects. JVM has BioJava, BioScala and BioRuby!
  BioPerl; R/Bioconductor - large communities
  Biopython/BioRuby - growing

What tools?
  Search WWW. Say, ‘RNAseq tools’
  CloudBioLinux (see biogems.info) - & Galaxy
  Search R/Bioconductor modules/bindings
  biostars.org, Stack Exchange, etc. Q&A sites
  Ask on Bio* mailing lists
  But how do I know what to use?

Project activity
  Main website, documentation?
  Search the Q&A sites and mailing lists
  Check the bloody source! This is FOSS!
  # of contributors, # of commits
  Recent activity

Dead projects
  If a project looks dead, it is probably...
  If the source code looks bad, it is probably...
  If the documentation is old, it is probably...
  Still, it may be worth reviving.

What makes a bioinformatician?
  Can run BLAST?
  Can run CLI tools?
  Can chain tools?
  Can assess tools/libraries
  Loves programming
  Loves biology

Great bioinformaticians
  100+ papers
  10K citations
  1st or last author?
  Many of the best papers have bioinformatics input!
  Example: Yves Van de Peer
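
Sketch: checking project activity

The ‘Project activity’ and ‘Dead projects’ slides list what to look at before adopting a tool: documentation, number of contributors, number of commits, and recent activity. A minimal Python sketch of that checklist could look like the following. Everything in it is an assumption made for illustration: the `ProjectStats` record, the `activity_warnings` function, and the thresholds are hypothetical and not from the slides, and the numbers would have to be read off the project's repository page by hand.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ProjectStats:
    """Figures read off a project's repository page (hypothetical record)."""
    name: str
    contributors: int
    commits: int
    last_commit: date
    has_documentation: bool


def activity_warnings(
    stats: ProjectStats,
    today: date,
    max_idle_days: int = 365,   # assumed threshold, not from the slides
    min_contributors: int = 2,  # assumed threshold, not from the slides
    min_commits: int = 50,      # assumed threshold, not from the slides
) -> List[str]:
    """Return one warning per checklist item that looks bad.

    An empty list means no warning sign was found; it does not prove
    the project is healthy.
    """
    warnings = []
    idle_days = (today - stats.last_commit).days
    if idle_days > max_idle_days:
        warnings.append(f"no commits for {idle_days} days")
    if stats.contributors < min_contributors:
        warnings.append(f"only {stats.contributors} contributor(s)")
    if stats.commits < min_commits:
        warnings.append(f"only {stats.commits} commits")
    if not stats.has_documentation:
        warnings.append("no documentation found")
    return warnings


if __name__ == "__main__":
    # Made-up example project, for illustration only.
    candidate = ProjectStats("toolB", 1, 40, date(2009, 6, 1), False)
    for warning in activity_warnings(candidate, today=date(2013, 2, 14)):
        print(f"{candidate.name}: {warning}")
```

Warnings are returned as a list rather than a single yes/no verdict because, as the ‘Dead projects’ slide notes, a project that looks dead may still be worth reviving; the decision stays with the reader.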