Alignments in Practice BLAST and CLUSTAL
Introduction to Bioinformatics
Dortmund, 16-20 July 2007
Lectures: Sven Rahmann
Exercises: Udo Feldkamp, Michael Wurst

Overview
● Dot Plots
● Nucleotide BLAST
● Protein BLAST
● BLAST Statistics
● BLAT
● CLUSTAL
● JalView

Dotter – Tool for Dot Plots
● http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html
● Dotlet: a Java applet for Dot Plots

Dot Plots
● Hemoglobin Alpha against Hemoglobin Beta

EBI Alignment Service

BLAST
● URL: http://www.ncbi.nlm.nih.gov/BLAST/
● Basic Local Alignment Search Tool

Choose the right BLAST

Nucleotide BLAST Interface

BLAST Parameters
● Expect threshold
  – low [0.01] = strict
  – high [100] = loose
● Word size: speed vs. sensitivity
  – high = faster
  – low = slower, but more sensitive

Protein BLAST

Protein BLAST Parameters

Translated BLAST
● protein query against nucleotide database
  – nucleotide sequence not unique
  – also consider reverse complement
● nucleotide query against protein database
  – consider all 6 reading frames

BLAST Output

BLAST Output II
Database + Accession, Description, Bit score, E-value, Link

BLAST Statistics
● How good / reliable is a hit found by BLAST?
● Raw score := score of the alignment according to scoring matrix and gap penalties
● Bit score := raw score converted to log2 units and normalized for the scoring system, so that scores obtained with different scoring matrices are comparable
● E-value := expected number of hits with this or a better score in a hypothetical database of random proteins of the same size

More on Statistics
● Null model := random model describing sequences without intentional signal (here: pair of random sequences without intentional similarity)
● (single) p-value for observed score s := Prob(Score >= s) in the null model
● (multiple) p-value := Prob(Score >= s at least once) over all comparisons made in the search

BLAT
● BLAST-Like Alignment Tool
● index-based
● developed at UC Santa Cruz
● especially for searching in whole genomes
● very fast
● limited to nearly exact matches

UCSC Genome Browser + BLAT

CLUSTAL

What Clustal Did (“Output file”)

Clustal Results (pretty)

Clustal Results (“alignment file”)
CLUSTAL W (1.83) multiple sequence alignment

FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_HUMAN       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOS_ZEBRAFISH   MMFTSLNADCDASS-RCSTASPSGDSVGYY------------PLNQTQEFTDLSVSSASF 47
                **: .: .: :*.* ***:***:***: **            *:*  :  :**:****.*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119
FOS_HUMAN       IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118
FOS_ZEBRAFISH   VPTVTAISSCPDLQWMVQP-MISSAAPS-------NGAAQSYNPSSYPKMRVTGAK---- 95
                :*******:.*****:*** ::**.***        * .    ..:*.:  :  :

FOS_RAT         AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_MOUSE       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_HUMAN       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_CHICK       GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178
FOS_ZEBRAFISH   --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153
                  : .:*.: **********:*:****.**************************:*****

FOS_RAT         QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234
FOS_MOUSE       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234
FOS_HUMAN       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234
FOS_CHICK       QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230
FOS_ZEBRAFISH   QNDIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213
                * :*********:********:* **:* :  *.*     ::  :  ::       :..

Clustal Guide Tree
● Guide Tree is not a phylogenetic tree, just a computational device
● Cladogram: edge lengths have no meaning
● Phylogram: edge lengths correspond to distances

JalView: Alignment Editor (start from the CLUSTAL web site)

Simple JalView Window
● Simple alignment editor (Java applet)
● Complex alignment editor (Java application)
  – Web Start, or
  – Download installer

Starting or Installing JalView
www.jalview.org

Multiple Alignment @ BiBiServ

For Windows/MAC: QAlign2
● URL: http://gi.cebitec.uni-bielefeld.de/QAlign/
● Live Demo of QAlign2
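
Sketch: BLAST Statistics in Code
The quantities from the BLAST Statistics slides (raw score, bit score, E-value, p-value) are linked by the standard Karlin-Altschul formulas. The following is a minimal sketch of those relations, not BLAST's actual implementation. The function names are made up for this example, and the values of lambda and K below are only illustrative; BLAST prints the values it really used at the end of its report. The sketch also ignores the length corrections ("effective lengths") that BLAST applies to the query and the database.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw score S to a bit score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score:
    E = m * n * 2^(-S'), with m = query length and n = database length."""
    return query_len * db_len * 2.0 ** (-bits)


def p_value(e):
    """Probability of at least one chance hit (Poisson model): P = 1 - exp(-E)."""
    return -math.expm1(-e)


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not taken from the slides).
    lam, k = 0.267, 0.041
    bits = bit_score(100, lam, k)
    e = e_value(bits, query_len=300, db_len=50_000_000)
    print(f"bit score = {bits:.1f}, E-value = {e:.3g}, p-value = {p_value(e):.3g}")
```

Each additional bit halves the E-value, and for small E the p-value and the E-value are nearly identical, which is why BLAST reports only the E-value.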