Bioinformatics Databases and Applications

Bioinformatics databases and applications
Eitan Rubin, December 2002
Bioinformatics & Biological Computing Unit, Department of Biological Services

Outline
• Introduction
• A day in the life of a biologist
• Major databases
• Major tools

Life as a simple CS problem
[Diagram labels: Input1, Input2, Algorithm, Output]

A more realistic view
[Diagram labels: Input1, Input2, Algorithm1, decision, Algorithm2, Algorithm3, Output]

A typical real-life view
[Diagram: the "more realistic view" diagram repeated five times]

The life cycle of a bioinformatics project
• Clearly define the goals
• Define a strategy
• Run the process
• QA & optimize
  – Controls
  – External knowledge
  – Re-sampling
  – Correlation

Outline
• Introduction
• A day in the life of a biologist
• Major databases
• Major tools

Positional cloning of disease X
XM-417-L16, XM-417-L15

Genome browser @ UCSC
Looking at the region of interest: chrX:98100000-98500000
Gene prediction programs suggest there are 6-8 genes in the region

Get mRNA @ NCBI
>unkown_protein
MRLTEKSEGEQQLKPNNSNAPNEDQEEEIQQSEQHTPARQRTQRADTQPSRCRLPSR
RTPTTSSDRTINLLEVLPWPTEWIFNPYRLPALFELYPEFLLVFKEAFHDISHCLKA
QMEKIGLPIILHLFALSTLYFYKFFLPTILSLSFFILLVLLLFIIVFILIFF

BLAST @ NCBI

Search for domains @ InterPro

Get predicted protein @ UCSC
>naharu.b
MSSRKQGSQPRGQQSAEEENFKKPTRSNMQRSKMRGASSGKKTAGPQQKN
LEPALPGRWGGRSAENPPSGSVRKTRKNKQKTPGNGDGGSTSEAPQPPRK
KRARADPTVESEEAFKNRMEVKVKIPEELKPWLVEDWDLVTRQKQLFQLP
AKKNVDAILEEYANCKKSQGNVDNKEYAVNEVVAGIKEYFNVMLGTQLLY
KFERPQYAEILLAHPDAPMSQVYGAPHLLRLFVRIGAMLAYTPLDEKSLA
LLLGYLHDFLKYLAKNSASLFTASDYKVASAEYHRKAL

Outline
• Introduction
• A day in the life of a biologist
• Major databases
• Major tools

[Diagram labels, in slide order]
• BIND; MINT; BRITE …
• PDB; HSSP
• Swiss-Prot; InterPro; LAMA; GO
• StackDB; Gencarta; Ensembl
• AAAAA
• ???
• EPD
• INSD (GenBank, EMBL, DDBJ)
• Specialized databases: FlyBase, YPD, UCSC, TAIR

INSD
• GenBank, EMBL, DDBJ
• CleanBank
• Divisions (EST, HTG)
• Specialized databases

Major tools
• Transcript modelling from ESTs
  – Sequencher, Staden, StackPACK
• Database searching
  – BLAST
  – BLAT
  – FASTA
• Multiple Sequence Alignment
  – ClustalX
  – MACAW

Major tools
• Gene prediction
• (EST) assembly
• Promoter finding
• ORF identification
• Similarity searching
• MSA
• Phylogenetic analysis
• Structure prediction
• Docking

ClustalX
• Stepwise tree-guided alignment
• “Bag full of tricks”
• Demo

The effect of parameters
[Figure panels: Default parameters; Modified parameters]

Major tools
• Gene prediction
• (EST) assembly
• Promoter finding
• ORF identification
• Similarity searching
• MSA
• Phylogenetic analysis
• Structure prediction
• Docking

Similarity searching
• SW (accelerated)
• BLAST
  + The NCBI environment; fast; wide dynamic range; availability
  - DNA very bad stats, poor for proteins
  ? Highly local
• FASTA
• BLAT
  + Lightning fast; focused
  - Limited dynamic range

MSA
• ClustalX
  + Fast; familiar
  - Global; one, not very accurate algorithm
• MACAW
  + Very interactive; outstanding GUI; multiple algorithms
  - Immature; runs on PCs; incompatible
• BLOCKS maker
  + Fully automated; fast
  - Poor control; many mistakes
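
The similarity-searching slide lists "SW (accelerated)" without further detail. Assuming SW stands for the Smith-Waterman local alignment algorithm, the sketch below shows the underlying dynamic programming in plain Python. The function name and the match/mismatch/gap scores are illustrative assumptions, not values from the slides. Real protein searches use a substitution matrix and affine gap penalties, and accelerated implementations use far more optimized code than this.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions: a flat match/mismatch score
    and a linear gap penalty instead of a real substitution matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a new local alignment
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # First 20 residues of the two example proteins shown in the slides.
    query = "MRLTEKSEGEQQLKPNNSNA"
    subject = "MSSRKQGSQPRGQQSAEEEN"
    print(smith_waterman(query, subject))
```

The table fill takes time proportional to the product of the two sequence lengths, which is why heuristic tools such as BLAST, FASTA and BLAT are used for searching whole databases.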
