Web & Grid Technologies in Bioinformatics, Computational And

Total Page:16

File Type:pdf, Size:1020Kb

Web & Grid Technologies in Bioinformatics, Computational And Current Bioinformatics, 2008, 3, 000-000 1 Web & Grid Technologies in Bioinformatics, Computational and Systems Biology: A Review Azhar A. Shah1, Daniel Barthel1, Piotr Lukasiak2,3, Jacek Blazewicz2,3 and Natalio Krasnogor*,1 1School of Computer Science, University of Nottingham, Jubilee Campus, NG81BB, Nottingham, UK 2Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland 3Institute of Bioorganic Chemistry, Laboratory of Bioinformatics, Polish Academy of Sciences, Noskowskiego 12/14, 61- 704 Poznan, Poland Abstract: The acquisition of biological data, ranging from molecular characterization and simulations (e.g. protein fold- ing dynamics), to systems biology endeavors (e.g. whole organ simulations) all the way up to ecological observations (e.g. as to ascertain climate change’s impact on the biota) is growing at unprecedented speed. The use of computational and networking resources is thus unavoidable. As the datasets become bigger and the acquisition technology more refined, the biologist is empowered to ask deeper and more complex questions. These, in turn, drive a runoff effect where large re- search consortia emerge that span beyond organizations and national boundaries. Thus the need for reliable, robust, certi- fied, curated, accessible, secure and timely data processing and management becomes entrenched within, and crucial to, 21st century biology. Furthermore, the proliferation of biotechnologies and advances in biological sciences has produced a strong drive for new informatics solutions, both at the basic science and technological levels. The previously unknown situation of dealing with, on one hand, (potentially) exabytes of data, much of which is noisy, has large experimental er- rors or theoretical uncertainties associated with it, or on the other hand, large quantities of data that require automated computationally intense analysis and processing, have produced important innovations in web and grid technology. In this paper we present a trace of these technological changes in Web and Grid technology, including details of emerging infra- structures, standards, languages and tools, as they apply to bioinformatics, computational biology and systems biology. A major focus of this technological review is to collate up-to-date information regarding the design and implementation of various bioinformatics Webs, Grids, Web-based grids or Grid-based webs in terms of their infrastructure, standards, pro- tocols, services, applications and other tools. This review, besides surveying the current state-of-the-art, will also provide a road map for future research and open questions. Keywords: Semantic web, web services, web agents, grid computing, middleware, in silico experiments. 1. INTRODUCTION networks, instruments, software applications and other com- puter related hardware equipment. These information and ‘The impact of computing on biology can fairly be con- resource-sharing capabilities of web and grid technologies sidered a paradigm change as biology enters the 21st cen- could upgrade a single user computer into a global tury. In short, computing and information technology ap- supercomputer with vast computational power and storage plied to biological problems is likely to play a role for 21st capacity. The so called upgraded web and grid-enabled century biology that is in many ways analogous to the role global super computer would make itself a potential candi- that molecular biology has played across all fields of bio- date to be used in resource-hungry computing domains. For logical research for the last quarter century and computing example, it could be used to efficiently solve complex calcu- and information technology will become embedded within lations such as parameter sweep scenario with Monte Carlo biological research itself’ [1]. simulation and modeling techniques, which would normally As a visualization of the above referred conclusion re- require several decades of execution time on a traditional garding the embedding of computing and information tech- single desktop processor. nology (IT) in biological research, one can look at the cur- A quick look at the literature reveals that web and grid rent state-of-the-art in web and grid technologies as applied technologies are continuously being taken up by the biologi- to bioinformatics, computational biology and systems biol- cal community as an alternate to traditional monolithic high ogy. The World Wide Web or simply the web has revolu- performance computing mainly because of the inherent na- tionized the field of IT and related disciplines, by providing ture of biological resources (distributed, heterogeneous and information-sharing services on top of the internet. Simi- CPU intensive), smaller financial costs, better flexibility, larly, grid technology has revolutionized the field of comput- scalability and efficiency offered by the web and grid- ing by providing location-independent resource sharing- enabled environment. An important factor that provides the services such as computational power, storage, databases, justification behind the ever growing use of web and grid technologies in life sciences is the continuous and rapid in- *Address correspondence to this author at the ASAP Group, School of crease in biological data production. About eight years ago it Computer Science, University of Nottingham, Jubilee Campus, NG81BB, has been approximated that the amount of information pro- Nottingham, UK; Tel: +44 115 8467592; Fax: +44 115 8467066; E-mail: duced by a single gene laboratory could be as higher as 100 [email protected] 1574-8936/08 $55.00+.00 © 2008 Bentham Science Publishers Ltd. 2 Current Bioinformatics, 2008, Vol. 3, No. 1 Shah et al. terabytes, which is equivalent to about 1 million encyclope- Some other large scale bioinformatics grid projects have dias [2]. Although being extensive, these data also differ in provided a platform where a biologist can design and run terms of storage and access technologies and are dispersed complex in silico experiments by combing several distrib- throughout the world. Currently, there are no uniform stan- uted and heterogeneous resources that are wrapped as web- dards, or at least not yet been adopted properly by the bio- services. Examples of these are myGrid [10, 11], BioMOBY logical community as a whole, to deal with the diverse na- [12-14], Seqhound (http://www.blueprint.org /seqhound) and ture, type, location and storage formats of this data. On the Biomart (http://www.biomart.org) etc., which allow for the other hand, in order to obtain the most comprehensive and automatic discovery and invocation of many bioinformatics competitive results, in many situations, a biologist may need applications, tools and databases such as EMBOSS [15] suite to access several different types of data which are publicly of bioinformatics applications and some other publicly avail- available in more than 700 [3] biomolecular databases. One able services from the National Center for Biomedical In- way to handle this situation would be to convert the required formatics (NCBI) (http://www.ncbi. nlm.nih.gov) and Euro- databases into a single format and then store it on a single pean Bioinformatics Institute (EBI) (http://www.ebi. ac.uk). storage device with extremely large capacity. Considering These projects also provide some special toolkits with neces- the tremendous size and growth of data, this solution seems sary application programming interfaces (APIs), which could to be infeasible, inefficient and very costly. The application be used to transform any legacy bioinformatics application of Web and Grid technology provides an opportunity to into a web-service that can be deployed on their platforms. standardize the access to these data in an efficient, automatic The availability of these BioGrid projects brought into sharp and seamless way through the use of appropriate DataGrid focus the need for better user interfaces as to provide the middleware technologies. These technologies include grid biologist with easier access to these web/grid resources. This middleware specific Data Management Services (DMS), has led to the development of various web based interfaces, distributed storage environments such as Open Source Grid portals, workflow management systems, problem solving Services Architecture-Data Access and Integration (OGSA- environments, frameworks, application programming envi- DAI) (http://www.ogsadia.org) with Distributed Query Proc- ronments, middleware toolkits, data and resource manage- essing (OGSA-DQP), Storage Resource Broker (SRB) [4] ment approaches along with various ways of controlling grid and IBM DiscoveryLink [5] middleware etc. access and security. This review attempts to provide an up- to-date coherent and curated overview of the most recent Furthermore, it is also very common for a typical bio- advances in these technologies as applied to life sciences. logical application that involves very complex analysis of large-scale datasets and other simulation related tasks to de- The review aims at providing a complementary source of additional information to some previous reviews in this field mand for high throughput computing power in addition to such as [16, 17]. seamless access to very large biological datasets. The tradi- tional approach towards this solution was to purchase ex- The organization of the paper is as follows: section 2 tremely costly special-purpose super computers or dedicated presents an overview of the state-of-the-art
Recommended publications
  • Bioinformatics Study of Lectins: New Classification and Prediction In
    Bioinformatics study of lectins : new classification and prediction in genomes François Bonnardel To cite this version: François Bonnardel. Bioinformatics study of lectins : new classification and prediction in genomes. Structural Biology [q-bio.BM]. Université Grenoble Alpes [2020-..]; Université de Genève, 2021. En- glish. NNT : 2021GRALV010. tel-03331649 HAL Id: tel-03331649 https://tel.archives-ouvertes.fr/tel-03331649 Submitted on 2 Sep 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE Pour obtenir le grade de DOCTEUR DE L’UNIVERSITE GRENOBLE ALPES préparée dans le cadre d’une cotutelle entre la Communauté Université Grenoble Alpes et l’Université de Genève Spécialités: Chimie Biologie Arrêté ministériel : le 6 janvier 2005 – 25 mai 2016 Présentée par François Bonnardel Thèse dirigée par la Dr. Anne Imberty codirigée par la Dr/Prof. Frédérique Lisacek préparée au sein du laboratoire CERMAV, CNRS et du Computer Science Department, UNIGE et de l’équipe PIG, SIB Dans les Écoles Doctorales EDCSV et UNIGE Etude bioinformatique des lectines: nouvelle classification et prédiction dans les génomes Thèse soutenue publiquement le 8 Février 2021, devant le jury composé de : Dr. Alexandre de Brevern UMR S1134, Inserm, Université Paris Diderot, Paris, France, Rapporteur Dr.
    [Show full text]
  • Easybuild Documentation Release 20210907.0
    EasyBuild Documentation Release 20210907.0 Ghent University Tue, 07 Sep 2021 08:55:41 Contents 1 What is EasyBuild? 3 2 Concepts and terminology 5 2.1 EasyBuild framework..........................................5 2.2 Easyblocks................................................6 2.3 Toolchains................................................7 2.3.1 system toolchain.......................................7 2.3.2 dummy toolchain (DEPRECATED) ..............................7 2.3.3 Common toolchains.......................................7 2.4 Easyconfig files..............................................7 2.5 Extensions................................................8 3 Typical workflow example: building and installing WRF9 3.1 Searching for available easyconfigs files.................................9 3.2 Getting an overview of planned installations.............................. 10 3.3 Installing a software stack........................................ 11 4 Getting started 13 4.1 Installing EasyBuild........................................... 13 4.1.1 Requirements.......................................... 14 4.1.2 Using pip to Install EasyBuild................................. 14 4.1.3 Installing EasyBuild with EasyBuild.............................. 17 4.1.4 Dependencies.......................................... 19 4.1.5 Sources............................................. 21 4.1.6 In case of installation issues. .................................. 22 4.2 Configuring EasyBuild.......................................... 22 4.2.1 Supported configuration
    [Show full text]
  • Bioperl What’S Bioperl?
    Bioperl What’s Bioperl? Bioperl is not a new language It is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. Bioperl and perl Bioperl Modules Perl Modules Perls script input Perl Interpreter output Bioperl and Perl Why bioperl for bioinformatics? Perl is good at file manipulation and text processing, which make up a large part of the routine tasks in bioinformatics. Perl language, documentation and many Perl packages are freely available. Perl is easy to get started in, to write small and medium-sized programs. Where to get help Type perldoc <modulename> in terminal Search for particular module in https://metacpan.org Bioperl Document Object-oriented and Process-oriented programming Process-oriented: Yuan Hao eats chicken Name object: $name Action method: eat Food object: $food Object-oriented: $name->eat($food) Modularize the program Platform and Related Software Required Perl 5.6.1 or higher Version 5.8 or higher is highly recommended make for Mac OS X, this requires installing the Xcode Developer Tools Installation On Linux or Max OS X Install from cpanminus: perlbrew install-cpanm cpanm Bio::Perl Install from source code: git clone https://github.com/bioperl/bioperl-live.git cd bioperl-live perl Build.PL ./Build test (optional) ./Build install Installation On Windows Install MinGW (MinGW is incorporated in Strawberry Perl, but must it be installed through PPM for ActivePerl) : ppm install MinGW Install Module::Build, Test::Harness and Test::Most through CPAN: Type cpan to enter the CPAN shell. At the cpan> prompt, type install CPAN Quit (by typing ‘q’) and reload CPAN.
    [Show full text]
  • The Bioperl Toolkit: Perl Modules for the Life Sciences
    Downloaded from genome.cshlp.org on January 25, 2012 - Published by Cold Spring Harbor Laboratory Press The Bioperl Toolkit: Perl Modules for the Life Sciences Jason E. Stajich, David Block, Kris Boulez, et al. Genome Res. 2002 12: 1611-1618 Access the most recent version at doi:10.1101/gr.361602 Supplemental http://genome.cshlp.org/content/suppl/2002/10/20/12.10.1611.DC1.html Material References This article cites 14 articles, 9 of which can be accessed free at: http://genome.cshlp.org/content/12/10/1611.full.html#ref-list-1 Article cited in: http://genome.cshlp.org/content/12/10/1611.full.html#related-urls Email alerting Receive free email alerts when new articles cite this article - sign up in the box at the service top right corner of the article or click here To subscribe to Genome Research go to: http://genome.cshlp.org/subscriptions Cold Spring Harbor Laboratory Press Downloaded from genome.cshlp.org on January 25, 2012 - Published by Cold Spring Harbor Laboratory Press Resource The Bioperl Toolkit: Perl Modules for the Life Sciences Jason E. Stajich,1,18,19 David Block,2,18 Kris Boulez,3 Steven E. Brenner,4 Stephen A. Chervitz,5 Chris Dagdigian,6 Georg Fuellen,7 James G.R. Gilbert,8 Ian Korf,9 Hilmar Lapp,10 Heikki Lehva¨slaiho,11 Chad Matsalla,12 Chris J. Mungall,13 Brian I. Osborne,14 Matthew R. Pocock,8 Peter Schattner,15 Martin Senger,11 Lincoln D. Stein,16 Elia Stupka,17 Mark D. Wilkinson,2 and Ewan Birney11 1University Program in Genetics, Duke University, Durham, North Carolina 27710, USA; 2National Research Council of
    [Show full text]
  • Trichoderma Reesei Complete Genome Sequence, Repeat-Induced Point
    Li et al. Biotechnol Biofuels (2017) 10:170 DOI 10.1186/s13068-017-0825-x Biotechnology for Biofuels RESEARCH Open Access Trichoderma reesei complete genome sequence, repeat‑induced point mutation, and partitioning of CAZyme gene clusters Wan‑Chen Li1,2,3†, Chien‑Hao Huang3,4†, Chia‑Ling Chen3, Yu‑Chien Chuang3, Shu‑Yun Tung3 and Ting‑Fang Wang1,3* Abstract Background: Trichoderma reesei (Ascomycota, Pezizomycotina) QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis, and plant–fungal interactions. The genomes of QM6a and its high enzyme-producing mutants have been sequenced by second-generation-sequencing methods and are pub‑ licly available from the Joint Genome Institute. While these genome sequences have ofered useful information for genomic and transcriptomic studies, their limitations and especially their short read lengths make them poorly suited for some particular biological problems, including assembly, genome-wide determination of chromosome architec‑ ture, and genetic modifcation or engineering. Results: We integrated Pacifc Biosciences and Illumina sequencing platforms for the highest-quality genome assem‑ bly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10877 genes) with 1630 newly predicted genes and >1.5 Mb of new sequences. Most new sequences are located on AT-rich blocks, including 7 centromeres, 14 subtelomeres, and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These fndings open up a new perspective for future centromere and chromosome architecture studies.
    [Show full text]
  • Introduction to Bioinformatics (Elective) – SBB1609
    SCHOOL OF BIO AND CHEMICAL ENGINEERING DEPARTMENT OF BIOTECHNOLOGY Unit 1 – Introduction to Bioinformatics (Elective) – SBB1609 1 I HISTORY OF BIOINFORMATICS Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biologicaldata. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques. Bioinformatics derives knowledge from computer analysis of biological data. These can consist of the information stored in the genetic code, but also experimental results from various sources, patient statistics, and scientific literature. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data. Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. It has many practical applications in different areas of biology and medicine. Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data. Computational Biology: The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems. "Classical" bioinformatics: "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.” The National Center for Biotechnology Information (NCBI 2001) defines bioinformatics as: "Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline.
    [Show full text]
  • Vasco Da Rocha Figueiras Algoritmos Para Genómica Comparativa
    Universidade de Aveiro Departamento Electrónica, Telecomunicações 2010 e Informática Vasco da Rocha Algoritmos para Genómica Comparativa Figueiras Universidade de Aveiro Departamento de Electrónica, Telecomunicações 2010 e Informática Vasco da Rocha Algoritmos para Genómica Comparativa Figueiras Dissertação apresentada à Universidade de Aveiro para cumprimento dos requisitos necessários à obtenção do grau de Mestre em Engenharia Electrónica e Telecomunicações, realizada sob a orientação científica do Doutor José Luís Oliveira, Professor associado da Universidade de Aveiro. o júri presidente Prof. Doutor Armando José Formoso de Pinho Professor Associado do Departamento de Electrónica, Telecomunicações e Informática da Universidade de Aveiro Prof. Doutor Rui Pedro Sanches de Castro Lopes Professor Coodenador do Departamento de Informática e Comunicações do Instituto Politécnico de Bragança orientador Prof. Doutor José Luis Guimarães de Oliveira Professor Associado do Departamento de Electrónica, Telecomunicações e Informática da Universidade de Aveiro agradecimentos Durante o desenvolvimento desta dissertação, recebi muito apoio de colegas e amigos. Agora que termino o trabalho não posso perder a oportunidade de agradecer a todas as pessoas que me ajudaram nesta etapa. Um agradecimento ao meu orientador Professor Doutor José Luís Oliveira e ao Doutor Miguel Monsanto Pinheiro, pela oportunidade de aprendizagem. À minha família pelo apoio incondicional, incentivo e carinho. Aos meus colegas e amigos por toda a ajuda e pelos momentos de convívio e de descontracção, que quebraram tantas dificuldades, ao longo desta etapa. A ti, Rita por todo o carinho, paciência e amor. palavras-chave Bioinformática, sequenciação, BLAST, alinhamento de sequências, genómica comparativa. resumo Com o surgimento da Genómica e da Proteómica, a Bioinformática conduziu a alguns dos avanços científicos mais relevantes do século XX.
    [Show full text]
  • Perl for Bioinformatics Perl and Bioperl I
    Perl for Bioinformatics Perl and BioPerl I Jason Stajich July 13, 2011 Perl for Bioinformatics Slide 1/83 Outline Perl Intro Basics Syntax and Variables More complex: References Routines Reading and Writing Regular Expressions BioPerl Intro Useful modules Data formats Databases in BioPerl Sequence objects details Trees Multiple Alignments BLAST and Sequence Database searching Other general Perl modules Perl for Bioinformatics Perl Intro Slide 2/83 Outline Perl Intro Basics Syntax and Variables More complex: References Routines Reading and Writing Regular Expressions BioPerl Intro Useful modules Data formats Databases in BioPerl Sequence objects details Trees Multiple Alignments BLAST and Sequence Database searching Other general Perl modules Perl for Bioinformatics Perl Intro Basics Slide 3/83 Why Perl for data processing & bioinformatics Fast text processing Regular expressions Extensive module libraries for pre-written tools Large number of users in community of bioinformatics Scripts are often faster to write than full compiled programs Cons Syntax sometimes confusing to new users; 'There's more than one way to do it' can also be confusing. (TMTOWTDI) Cons Not a true object-oriented language so some abstraction is clunky and hacky Scripting languages (Perl, Python, Ruby) generally easier to write simply than compiled ones (C, C++, Java) as they are often not strongly typed and less memory management control. Perl for Bioinformatics Perl Intro Basics Slide 4/83 Perl packages CPAN - Comprehensive Perl Archive http://www.cpan.org Perl
    [Show full text]
  • A Comprehensive Review and Performance Evaluation of Sequence Alignment Algorithms for DNA Sequences
    International Journal of Advanced Science and Technology Vol. 29, No. 3, (2020), pp. 11251 - 11265 A Comprehensive Review and Performance Evaluation of Sequence Alignment Algorithms for DNA Sequences Neelofar Sohi1 and Amardeep Singh2 1Assistant Professor, Department of Computer Science & Engineering, Punjabi University Patiala, Punjab, India 2Professor, Department of Computer Science & Engineering, Punjabi University Patiala, Punjab, India [email protected],[email protected] Abstract Background: Sequence alignment is very important step for high level sequence analysis applications in the area of biocomputing. Alignment of DNA sequences helps in finding origin of sequences, homology between sequences, constructing phylogenetic trees depicting evolutionary relationships and other tasks. It helps in identifying genetic variations in DNA sequences which might lead to diseases. Objectives: This paper presents a comprehensive review on sequence alignment approaches, methods and various state-of-the-art algorithms. Performance evaluation and comparison of few algorithms and tools is performed. Methods and Results: In this study, various tools and algorithms are studied, implemented and their performance is evaluated and compared using Identity Percentage is used as the main metric. Conclusion: It is observed that for pairwise sequence alignment, Clustal Omega Emboss Matcher outperforms other tools & algorithms followed by Clustal Omega Emboss Water further followed by Blast (Needleman-Wunsch algorithm for global alignment). Keywords: sequence alignment; DNA sequences; progressive alignment; iterative alignment; natural computing approach; identity percentage. 1. Brief History Sequence alignment is very important area in the field of Biocomputing and Bioinformatics. Sequence alignment aims to identify the regions of similarity between two or more sequences. Alignment is also termed as ‘mapping’ that is done to identify and compare the nucleotide bases i.e.
    [Show full text]
  • GPS@: Bioinformatics Grid Portal for Protein Sequence Analysis on EGEE Grid
    Enabling Grids for E-sciencE GPS@: Bioinformatics grid portal for protein sequence analysis on EGEE grid Blanchet, C., Combet, C., Lefort, V. and Deleage, G. Pôle BioInformatique de Lyon – PBIL Institut de Biologie et Chimie des Protéines IBCP – CNRS UMR 5086 Lyon-Gerland, France [email protected] www.eu-egee.org INFSO-RI-508833 TOC Enabling Grids for E-sciencE • NPS@ Web portal – Online since 1998 – Production mode • Gridification of NPS@ • Bioinformatics description with XML-based Framework • Legacy mode for application file access • GPS@ Web portal for Bioinformatics on Grid INFSO-RI-508833 GPS@ Bioinformatics Grid Portal, EGEE User Forum, March 1st, 2006, Cern 2 Institute of Biology and Enabling Grids for E-sciencE Chemistry of Proteins • French CNRS Institute, associated to Univ. Lyon1 – Life Science – About 160 people – http://www.ibcp.fr – Located in Lyon, France • Study of proteins in their biological context ♣ Approaches used include integrative cellular (cell culture, various types of microscopies) and molecular techniques, both experimental (including biocrystallography and nuclear magnetic resonance) and theoretical (structural bioinformatics). • Three main departments, bringing together 13 groups ♣ topics such as cancer, extracellular matrix, tissue engineering, membranes, cell transport and signalling, bioinformatics and structural biology INFSO-RI-508833 GPS@ Bioinformatics Grid Portal, EGEE User Forum, March 1st, 2006, Cern 3 EGEE Biomedical Activity Enabling Grids for E-sciencE • Chair: ♣ Johan Montagnat ♣ Christophe
    [Show full text]
  • Software List for Biology, Bioinformatics and Biostatistics CCT
    Software List for biology, bioinformatics and biostatistics v CCT - Delta Software Version Application short read assembler and it works on both small and large (mammalian size) ALLPATHS-LG 52488 genomes provides a fast, flexible C++ API & toolkit for reading, writing, and manipulating BAMtools 2.4.0 BAM files a high level of alignment fidelity and is comparable to other mainstream Barracuda 0.7.107b alignment programs allows one to intersect, merge, count, complement, and shuffle genomic bedtools 2.25.0 intervals from multiple files Bfast 0.7.0a universal DNA sequence aligner tool analysis and comprehension of high-throughput genomic data using the R Bioconductor 3.2 statistical programming BioPython 1.66 tools for biological computation written in Python a fast approach to detecting gene-gene interactions in genome-wide case- Boost 1.54.0 control studies short read aligner geared toward quickly aligning large sets of short DNA Bowtie 1.1.2 sequences to large genomes Bowtie2 2.2.6 Bowtie + fully supports gapped alignment with affine gap penalties BWA 0.7.12 mapping low-divergent sequences against a large reference genome ClustalW 2.1 multiple sequence alignment program to align DNA and protein sequences assembles transcripts, estimates their abundances for differential expression Cufflinks 2.2.1 and regulation in RNA-Seq samples EBSEQ (R) 1.10.0 identifying genes and isoforms differentially expressed EMBOSS 6.5.7 a comprehensive set of sequence analysis programs FASTA 36.3.8b a DNA and protein sequence alignment software package FastQC
    [Show full text]
  • Pipenightdreams Osgcal-Doc Mumudvb Mpg123-Alsa Tbb
    pipenightdreams osgcal-doc mumudvb mpg123-alsa tbb-examples libgammu4-dbg gcc-4.1-doc snort-rules-default davical cutmp3 libevolution5.0-cil aspell-am python-gobject-doc openoffice.org-l10n-mn libc6-xen xserver-xorg trophy-data t38modem pioneers-console libnb-platform10-java libgtkglext1-ruby libboost-wave1.39-dev drgenius bfbtester libchromexvmcpro1 isdnutils-xtools ubuntuone-client openoffice.org2-math openoffice.org-l10n-lt lsb-cxx-ia32 kdeartwork-emoticons-kde4 wmpuzzle trafshow python-plplot lx-gdb link-monitor-applet libscm-dev liblog-agent-logger-perl libccrtp-doc libclass-throwable-perl kde-i18n-csb jack-jconv hamradio-menus coinor-libvol-doc msx-emulator bitbake nabi language-pack-gnome-zh libpaperg popularity-contest xracer-tools xfont-nexus opendrim-lmp-baseserver libvorbisfile-ruby liblinebreak-doc libgfcui-2.0-0c2a-dbg libblacs-mpi-dev dict-freedict-spa-eng blender-ogrexml aspell-da x11-apps openoffice.org-l10n-lv openoffice.org-l10n-nl pnmtopng libodbcinstq1 libhsqldb-java-doc libmono-addins-gui0.2-cil sg3-utils linux-backports-modules-alsa-2.6.31-19-generic yorick-yeti-gsl python-pymssql plasma-widget-cpuload mcpp gpsim-lcd cl-csv libhtml-clean-perl asterisk-dbg apt-dater-dbg libgnome-mag1-dev language-pack-gnome-yo python-crypto svn-autoreleasedeb sugar-terminal-activity mii-diag maria-doc libplexus-component-api-java-doc libhugs-hgl-bundled libchipcard-libgwenhywfar47-plugins libghc6-random-dev freefem3d ezmlm cakephp-scripts aspell-ar ara-byte not+sparc openoffice.org-l10n-nn linux-backports-modules-karmic-generic-pae
    [Show full text]