Free and Open Source Software (FOSS) for Bioinformatics and Computational Biology

Pjotr Prins - 14 Feb 2013
Wageningen University

What is FOSS?
- FOSS: ‘free and open source software’
- Software that is both free software and open source
- Licensed to grant users the right to use, copy, study, change, and improve
- Availability of source code

Licensing
- License protects copyright and more
- GNU General Public License (GPL) - change code, but always make source code available when publishing the software (Linux)
- BSD License - change code, but retain the copyright notice (FreeBSD)
- Artistic License - do what you want with the code (Perl)
- Others (Mozilla)

Science
- Open, transparent -> reproducible
- Peer review
- Build on the shoulders of giants!
- Linux popular in Bioinformatics
  - Great for writing software
  - Great for running software

Open-Bio
- Open Bioinformatics Foundation (OBF)
- BioPerl, BioPython, BioRuby [a], BioJava. No BioJS (yet)
- State of the projects - GitHub
[a] BioRuby: Bioinformatics software for the Ruby programming language, by Naohisa Goto, Pjotr Prins et al., Bioinformatics, 2010

Github
- Track projects
- Publish (your) project!
- Use the code!

Biogems.info
- Decentralized development for (Bio)Ruby [a]
- Visibility & easy install
- GitHub integration
- Integration testing
[a] Biogem: an effective tool based approach for scaling up open source software development in bioinformatics, by R. Bonnal, P. Prins et al., Bioinformatics, February 2012

Tradition: Ease vs Speed
- Easy but slow: dynamically typed (Perl, R, Python, Ruby)
- Hard but fast: statically typed (C, C++, FORTRAN)
- Mix (Perl|Python|Ruby|R) & C

Choosing a language
Dynamic type, slowish
- Perl - ah, Perl, awful
- R - awful, but common
- Python - science! Graphics, math, statistics, R integration
- Ruby - cleaner, OOP, functional (like!)
- JavaScript (fast, common, a bit ugly)

Choosing a language (2)
Fixed type, fast
- Structured: C, Go
- OOP: Java
- Functional languages: Clojure, Erlang, Haskell
- Hybrid: Scala (like!), D (like!)

Tips
- Don’t be afraid of languages
- Mix and match [a]
- Use one dynamic (easy), one static language (fast)
- JVM (Java, Groovy, Scala, Clojure, JRuby, JavaScript)
- Computer language shootout
[a] Sharing programming resources between Bio* projects through remote procedure call and native call stack strategies, by Pjotr Prins et al., Evolutionary Genomics: statistical and computational methods, Meth. Mol. Biol. 2012

What libraries?
- Bio* projects. JVM has BioJava, BioScala and BioRuby!
- BioPerl; R/Bioconductor - large communities
- Biopython/BioRuby - growing

What tools?
- Search WWW. Say, ‘RNAseq tools’
- CloudBiolinux (see biogems.info) - & Galaxy
- Search R/Bioconductor modules/bindings
- biostars.org, stackexchange, etc. Q&A sites
- Ask on Bio* mailing lists
- But how do I know what to use?

Project activity
- Main website, documentation?
- Search the Q&A sites and mailing lists
- Check the bloody source! This is FOSS!
- # of contributors, # of commits
- Recent activity

Dead projects
- If a project looks dead, it is probably...
- If the source code looks bad, it is probably...
- If the documentation is old, it is probably...
- Still, it may be worth reviving.

What makes a bioinformatician?
- Can run BLAST?
- Can run CLI tools?
- Can chain tools?
- Can assess tools/libraries
- Loves programming
- Loves biology

Great bioinformaticians
- 100+ papers
- 10K citations
- 1st or last author?
- Many of the best papers have bioinformatics input!
- Example: Yves Van de Peer
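The checks on the ‘Project activity’ and ‘Dead projects’ slides (number of contributors, number of commits, recent activity) can be turned into a small script. What follows is only a sketch, in Python: the `ProjectStats` record, the `activity_report` function and all thresholds (one year idle, fewer than 2 contributors, fewer than 50 commits) are assumptions made for illustration and do not come from the slides. The numbers would have to be read off the project's repository page by hand.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ProjectStats:
    """Hypothetical summary of a project, filled in by hand from its repository page."""
    name: str
    contributors: int
    commits: int
    last_commit: date


def activity_report(stats: ProjectStats, today: date,
                    max_idle_days: int = 365) -> List[str]:
    """Return a list of warnings; an empty list means no red flags were found.

    All thresholds are illustrative assumptions, not established rules.
    """
    warnings = []
    idle_days = (today - stats.last_commit).days
    if idle_days > max_idle_days:
        warnings.append(f"no commits for {idle_days} days")
    if stats.contributors < 2:
        warnings.append("single contributor")
    if stats.commits < 50:
        warnings.append("few commits")
    return warnings


if __name__ == "__main__":
    # Made-up example values, not data about any real project.
    example = ProjectStats("some-tool", contributors=1, commits=12,
                           last_commit=date(2010, 1, 1))
    for warning in activity_report(example, today=date(2013, 2, 14)):
        print(f"{example.name}: {warning}")
```

A project that triggers warnings is not necessarily useless; as the ‘Dead projects’ slide notes, it may still be worth reviving, so the output is a prompt to go and read the source rather than a verdict.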

