
BLAST and Alignments in Practice

Introduction to Dortmund, 16.-20.07.2007

Lectures: Sven Rahmann

Exercises: Udo Feldkamp, Michael Wurst

1 Overview

● Dot Plots
● BLAST
● BLAST Statistics
● BLAT
● CLUSTAL
● JalView

2 Dotter – Tool for Dot Plots

● http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.

● Dotlet: a Java applet for Dot Plots

3 Dot Plots

● Hemoglobin Alpha against Hemoglobin Beta
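A dot plot marks every pair of positions at which two sequences agree, so similar regions show up as diagonal runs. The following is only a minimal sketch of that idea: the function names, the `window` parameter and the toy sequences are assumptions for illustration, not what Dotter or Dotlet actually implement, and the sequences are not the hemoglobin chains.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return a boolean matrix: cell (i, j) is True when the window of
    length `window` starting at seq_a[i] equals the one starting at seq_b[j]."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [seq_a[i:i + window] == seq_b[j:j + window] for j in range(cols)]
        for i in range(rows)
    ]


def render(matrix):
    """Draw the matrix as text, one '*' per matching cell."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


# Hypothetical toy sequences, chosen only to keep the output small.
matrix = dot_plot("GATTACA", "GATCACA", window=2)
print(render(matrix))
```

A larger `window` suppresses isolated chance matches, at the cost of hiding short similar stretches.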

4 EBI Alignment Service

5 BLAST

● URL: http://www.ncbi.nlm.nih.gov/BLAST/
● Basic Local Alignment Search Tool

6 Choose the right BLAST

7 Nucleotide BLAST Interface

8 BLAST Parameters

● Expect threshold:
– low [0.01] = strict
– high [100] = loose
● Word size: speed vs. sensitivity
– high = faster
– low = slower, but more sensitive
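The word-size trade-off can be seen by counting how many query words also occur exactly in a subject sequence. This is a simplified sketch with exact matching only; `shared_words` and both sequences are hypothetical, and real BLAST seeding is more elaborate.

```python
def shared_words(query, subject, word_size):
    """Count query positions whose word also occurs exactly in `subject`."""
    subject_words = {
        subject[i:i + word_size]
        for i in range(len(subject) - word_size + 1)
    }
    return sum(
        query[i:i + word_size] in subject_words
        for i in range(len(query) - word_size + 1)
    )


# Hypothetical sequences that differ at two positions.
query = "ACGTTGCAACGT"
subject = "ACGTAGCAACGA"
for w in (3, 5, 7):
    print(w, shared_words(query, subject, w))
```

With these inputs the counts are 7, 2 and 0: longer words leave fewer seeds to extend (faster), but at word size 7 the similarity is missed entirely.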

9 Protein BLAST

10 Protein BLAST Parameters

11 Translated BLAST

● protein query against nucleotide database
– nucleotide sequence not unique (several codons can encode the same amino acid)
– also consider reverse complement
● nucleotide query against protein database
– consider all 6 reading frames
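The six reading frames are the three codon offsets on the forward strand plus the three on the reverse complement. A minimal sketch using the standard genetic code follows; the helper names and the frame labels `+1` to `-3` are assumptions made for this example, and unknown codons are simply written as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Return the translations of all six reading frames, keyed by label."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frames("ATGGCCTAA")  # hypothetical toy sequence
for label, protein in frames.items():
    print(label, protein)
```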

12 BLAST Output

13 BLAST Output II

Columns of the hit list: Database + Accession, Description, Bit score, E-value, Link

14 BLAST Statistics

● How good / reliable is a hit found by BLAST?
● Raw score := score of the alignment according to scoring matrix and gap penalties
● Bit score := score in log2 units, normalized for the scoring system
● E-value := expected number of hits with this or a better score in a hypothetical database of random sequences of the same size

15 More on Statistics

● Null model := random model describing sequences without intentional signal (here: pair of random sequences without intentional similarity)
● (single) p-value for observed score s := Prob(Score >= s) in the null model
● (multiple) p-value := Prob(Score >= s at least once)
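The quantities on the two statistics slides can be related with the Karlin-Altschul formulas, which the slides do not spell out. The sketch below assumes those formulas and a Poisson approximation for the "at least once" probability; the values of `lam` and `k` are placeholders, since the real parameters depend on the scoring matrix and gap penalties.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw score S to bits: S' = (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits: E = k * m * n * exp(-lam * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def p_value_at_least_one(expect):
    """Prob(at least one chance hit) = 1 - exp(-E), Poisson approximation."""
    return -math.expm1(-expect)


# Placeholder parameters, assumed for illustration only.
lam, k = 0.267, 0.041
m, n = 250, 1_000_000  # hypothetical query and database lengths

for raw in (40, 60, 80):
    e = e_value(raw, m, n, lam, k)
    print(raw, round(bit_score(raw, lam, k), 1), e, p_value_at_least_one(e))
```

Written in bits, the same E-value is m * n * 2^(-S'), which is why bit scores obtained with different scoring systems can be compared. For small E the multiple p-value is almost equal to E.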

16 BLAT

● BLAST-Like Alignment Tool
● index-based
● developed at UC Santa Cruz
● especially for searching in whole genomes
● very fast
● limited to nearly exact matches
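"Index-based" means the database is preprocessed once into a lookup table of short words, and each query is then scanned against that table. The sketch below assumes an index of non-overlapping k-mers and exact-match lookup; the function names, `k=4` and the sequences are hypothetical, and this is not BLAT's actual implementation.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map each non-overlapping k-mer of `genome` to its start positions."""
    index = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index


def find_seeds(query, index, k):
    """Return (query_pos, genome_pos) pairs for every exact k-mer hit."""
    seeds = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            seeds.append((q, g))
    return seeds


# Hypothetical toy genome; the index is built once and reused per query.
genome = "TTGACCGTAAGCTTGGCATCGA"
index = build_index(genome, k=4)
print(find_seeds("CGTAAGCTTGG", index, k=4))
```

Both seeds found here lie on the same diagonal (genome position minus query position is 5), which is the signal that an alignment is worth extending. Because every seed must match exactly, strongly diverged sequences produce no seeds, consistent with the "nearly exact matches" limitation.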

17 UCSC Genome Browser + BLAT

18 CLUSTAL

19 What Clustal Did (“Output file”)

20 Clustal Results (pretty)

21 Clustal Results (“alignment file”)

CLUSTAL W (1.83) multiple sequence alignment

FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_HUMAN       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOS_ZEBRAFISH   MMFTSLNADCDASS-RCSTASPSGDSVGYY------PLNQTQEFTDLSVSSASF 47
                **: .: .: :*.* ***:***:***: ** *:* : :**:****.*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119
FOS_HUMAN       IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118
FOS_ZEBRAFISH   VPTVTAISSCPDLQWMVQP-MISSAAPS------NGAAQSYNPSSYPKMRVTGAK---- 95
                :*******:.*****:*** ::**.*** * . ..:*.: : :

FOS_RAT         AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_MOUSE       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_HUMAN       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_CHICK       GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178
FOS_ZEBRAFISH   --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153
                : .:*.: **********:*:****.**************************:*****

FOS_RAT         QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234
FOS_MOUSE       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234
FOS_HUMAN       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234
FOS_CHICK       QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230
FOS_ZEBRAFISH   QNDIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213
                * :*********:********:* **:* : *.* :: : :: :..

22 Clustal Guide Tree

23 Clustal Guide Tree

● Guide Tree is not a phylogenetic tree, just a computational device
● Cladogram: edge lengths have no meaning
● Phylogram: edge lengths correspond to distances

24 JalView: Alignment Editor (start from the CLUSTAL web site)

25 Simple JalView Window

● Simple alignment editor (Java applet)
● Complex alignment editor (Java application)
– Web Start, or
– Download installer

26 Starting or Installing JalView

www.jalview.org

27 Multiple Alignment @ BiBiServ

28 For Windows/Mac: QAlign2

● URL: http://gi.cebitec.uni-bielefeld.de/QAlign/
● Live Demo of QAlign2
