Alignments in Practice BLAST and CLUSTAL
Introduction to Bioinformatics Dortmund, 16.-20.07.2007
Lectures: Sven Rahmann
Exercises: Udo Feldkamp, Michael Wurst
1 Overview
● Dot Plots ● Nucleotide BLAST ● Protein BLAST ● BLAST Statistics ● BLAT ● CLUSTAL ● JalView
2 Dotter – Tool for Dot Plots
● http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html
● Dotlet: a Java applet for Dot Plots
3 Dot Plots
● Hemoglobin Alpha against Hemoglobin Beta
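A dot plot can be sketched in a few lines. The following is a minimal illustration, not how Dotter or Dotlet work internally: it marks position (i, j) whenever a word of length `word` starting at i in the first sequence equals the word starting at j in the second. The function names, the `word` parameter and the toy sequences are assumptions made for this example.

```python
def dot_plot(seq_a, seq_b, word=3):
    """Return the set of (i, j) with seq_a[i:i+word] == seq_b[j:j+word]."""
    # Index every word of seq_b by its start positions.
    positions = {}
    for j in range(len(seq_b) - word + 1):
        positions.setdefault(seq_b[j:j + word], []).append(j)
    # Look up every word of seq_a in that index.
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in positions.get(seq_a[i:i + word], []):
            dots.add((i, j))
    return dots


def render(seq_a, seq_b, dots):
    """Draw the plot as text: rows follow seq_a, columns follow seq_b."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(len(seq_b)))
        for i in range(len(seq_a))
    )


if __name__ == "__main__":
    a, b = "ACGTACGT", "TACGTA"  # toy sequences, not real hemoglobin data
    print(render(a, b, dot_plot(a, b, word=2)))
```

Similar regions show up as diagonal runs of dots; a larger `word` suppresses random matches at the cost of missing weak similarity.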
4 EBI Alignment Service
5 BLAST
● URL: http://www.ncbi.nlm.nih.gov/BLAST/ ● Basic Local Alignment Search Tool
6 Choose the right BLAST
7 Nucleotide BLAST Interface
8 BLAST Parameters
● Expect threshold:
  – low [0.01] = strict
  – high [100] = loose
● Word size: speed vs. sensitivity
  – high = faster
  – low = slower, but more sensitive
9 Protein BLAST
10 Protein BLAST Parameters
11 Translated BLAST
● protein query against nucleotide database
  – nucleotide sequence not unique
  – also consider reverse complement
● nucleotide query against protein database
  – consider all 6 reading frames
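The six reading frames can be illustrated with a short sketch: three offsets on the given strand and three on its reverse complement. This is a minimal example under stated assumptions, not BLAST's implementation: it uses the standard genetic code, writes stop codons as `*`, maps any codon containing a non-ACGT character to `X`, and the frame labels `+1` ... `-3` are a naming choice made here.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Complement each base, then reverse the string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the translations of all six reading frames, keyed '+1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCC").items():
        print(name, protein)
```

The reverse direction (protein to nucleotide) has no unique answer, because most amino acids are encoded by several codons; this is why a protein query is compared against translations of the database rather than being back-translated.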
12 BLAST Output
13 BLAST Output II
Database + Accession, Description, Bit score, E-value, Link
14 BLAST Statistics
● How good / reliable is a hit found by BLAST?
● Raw score := score of the alignment according to the scoring matrix and gap penalties
● Bit score := raw score rescaled to log2 units and normalized for the scoring system, so that scores obtained with different scoring matrices are comparable
● E-value := expected number of hits with this score or better in a hypothetical database of random proteins of the same size
15 More on Statistics
● Null model := random model describing sequences without intentional signal (here: pair of random sequences without intentional similarity)
● (single) p-value for observed score s := Prob(Score >= s) in the null model
● (multiple) p-value := Prob(Score >= s at least once)
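The quantities above are connected by the Karlin-Altschul formulas commonly used for BLAST statistics. The sketch below assumes E = K·m·n·exp(−λS) for raw score S, query length m and database length n, a bit score S' = (λS − ln K) / ln 2, and a Poisson approximation for the number of chance hits, which gives the multiple p-value 1 − exp(−E). The values of λ and K in the demo are illustrative assumptions (of the order reported for BLOSUM62 with gap costs 11/1); real values should be read from the BLAST report.

```python
import math


def bit_score(raw, lam, k):
    """Rescale a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def e_value(raw, lam, k, m, n):
    """Expected number of chance hits with score >= raw in an m x n search space."""
    return k * m * n * math.exp(-lam * raw)


def p_value(e):
    """Prob(at least one chance hit) = 1 - exp(-E), assuming a Poisson count."""
    return -math.expm1(-e)


if __name__ == "__main__":
    lam, k = 0.267, 0.041        # assumed, illustrative parameters
    m, n = 300, 100_000_000      # hypothetical query and database lengths
    for raw in (40, 60, 80, 100):
        e = e_value(raw, lam, k, m, n)
        print(raw, round(bit_score(raw, lam, k), 1), e, p_value(e))
```

For small E the p-value is almost equal to E, which is why the two are often used interchangeably for strong hits; for large E the p-value saturates at 1 while E keeps growing.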
16 BLAT
● BLAST-Like Alignment Tool ● index-based ● developed at UC Santa Cruz ● especially for searching in whole genomes ● very fast ● limited to nearly exact matches
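What "index-based" means can be shown with a toy k-mer index. The sketch assumes the basic scheme described for BLAT: the genome is indexed once by its non-overlapping k-mers, and every overlapping k-mer of the query is then looked up in that index. It only finds exact seed hits and omits everything that follows in the real tool (hit clustering, extension, alignment); the function names and toy sequences are assumptions.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def seed_hits(index, query, k):
    """Return (query_pos, genome_pos) for every query k-mer found in the index."""
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], []):
            hits.append((q, g))
    return hits


if __name__ == "__main__":
    genome = "ACGTACGTTTGA"  # toy sequence
    index = build_index(genome, k=4)
    print(seed_hits(index, "GTACGTT", k=4))
```

Because only exact k-mer matches produce seeds, a query that differs from the genome in many positions yields no hits, which matches the restriction to nearly exact matches; the index is built once and reused, which is where the speed comes from.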
17 UCSC Genome Browser + BLAT
18 CLUSTAL
19 What Clustal Did (“Output file”)
20 Clustal Results (pretty)
21 Clustal Results (“alignment file”)
CLUSTAL W (1.83) multiple sequence alignment
FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_HUMAN       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOS_ZEBRAFISH   MMFTSLNADCDASS-RCSTASPSGDSVGYY------PLNQTQEFTDLSVSSASF 47
                **: .: .: :*.* ***:***:***: ** *:* : :**:****.*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119
FOS_HUMAN       IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118
FOS_ZEBRAFISH   VPTVTAISSCPDLQWMVQP-MISSAAPS------NGAAQSYNPSSYPKMRVTGAK---- 95
                :*******:.*****:*** ::**.*** * . ..:*.: : :

FOS_RAT         AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_MOUSE       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_HUMAN       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_CHICK       GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178
FOS_ZEBRAFISH   --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153
                : .:*.: **********:*:****.**************************:*****

FOS_RAT         QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234
FOS_MOUSE       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234
FOS_HUMAN       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234
FOS_CHICK       QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230
FOS_ZEBRAFISH   QNDIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213
                * :*********:********:* **:* : *.* :: : :: :..
22 Clustal Guide Tree
23 Clustal Guide Tree
● Guide Tree is not a phylogenetic tree, just a computational device ● Cladogram: edge lengths have no meaning ● Phylogram: edge lengths correspond to distances
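A guide tree is derived from pairwise distances between the input sequences. As a rough illustration of such a distance, the sketch below computes the fraction of mismatching positions between two aligned rows, skipping columns with a gap. This is a simplification chosen for the example: Clustal computes its distances from separate pairwise alignments, and the function names and toy rows here are assumptions.

```python
def p_distance(row_a, row_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    compared = mismatches = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            continue  # gapped columns are ignored
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


def distance_matrix(rows):
    """Symmetric matrix of p-distances for a dict {name: aligned row}."""
    names = list(rows)
    return {a: {b: p_distance(rows[a], rows[b]) for b in names} for a in names}


if __name__ == "__main__":
    toy = {"s1": "ACGT-A", "s2": "ACGTTA", "s3": "TCGA-A"}  # toy aligned rows
    for name, dists in distance_matrix(toy).items():
        print(name, dists)
```

A tree-building method such as neighbor joining would then join the closest sequences first; the resulting tree only fixes the order in which sequences are aligned, which is why it should not be read as a phylogeny.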
24 JalView: Alignment Editor (start from the CLUSTAL web site)
25 Simple JalView Window
● Simple alignment editor (Java applet) ● Complex alignment editor (Java application) – Web Start, or – Download installer
26 Starting or Installing JalView
www.jalview.org
27 Multiple Alignment @ BiBiServ
28 For Windows/Mac: QAlign2
● URL: http://gi.cebitec.uni-bielefeld.de/QAlign/ ● Live Demo of QAlign2