USER RESPONSIBILITY
GARBAGE IN = GARBAGE OUT
Each step relies on accuracy of previous steps
Just because you get an answer does not make it right: Appropriate test? Correct parameters? Applicable dataset?

ANALYSIS PIPELINE
Multiple Alignment: CLUSTALW, T-COFFEE, MAFFT, MUSCLE, PROBCONS
Visualization & Adjustment: GENEDOC, JALVIEW
Format Input Data: FASTA, PHYLIP, NEXUS, Newick
Phylogenetics:
  Methods: Distance Matrix, Max Parsimony, Max Likelihood
  Programs: PHYLIP, RAxML, MrBayes
Evolutionary Analyses: r8s, PAML, BEAST, Multidivtime

ALIGNMENT PROGRAMS
ClustalW (1994) http://www.ebi.ac.uk/Tools/msa/clustalw2/
Uses a progressive multiple alignment. Parameters (e.g. gap penalties) are adjusted according to the input (divergence, length, local hydropathy, etc.)

T-Coffee (2000) http://igs-server.cnrs-mrs.fr/Tcoffee/
Performs pairwise local and global alignments, then combines them in a progressive multiple alignment

MAFFT (2002) http://mafft.cbrc.jp/alignment/server/
Detects local homologous regions by Fast Fourier Transform (considers aa size & polarity), then uses a restricted global DP, a progressive algorithm, and horizontal refinement

MUSCLE (2004) http://www.drive5.com/muscle
Uses k-mer distances and log-expectation scores, then progressive alignment and horizontal refinement

PROBCONS (2005) http://probcons.stanford.edu
(<30 taxa) Pairwise consistency based on an objective function

COMPARISON OF ALIGNMENT PROGRAMS
ALIGNMENT: CLUSTALW
ALIGNMENT: MUSCLE
ALIGNMENT: MAFFT

ALIGNMENT VIEWERS/MANIPULATORS
GENEDOC
Program Description: A Full Featured Multiple Sequence Alignment Editor, Analyser and Shading Utility for Windows. http://www.nrbsc.org/gfx/genedoc/
Platform: Windows
Input: Amino acid and nucleotide FASTA, Clustal (.aln), Phylip, PIR, GCG (.msf), and GenBank formats.
Output: Default is .msf files. Can also export in FASTA, Clustal (.aln), Phylip, PIR, and text

JALVIEW
Program Description: Jalview is a multiple alignment editor written in Java. It is used widely in a variety of web pages but is available as a general purpose alignment editor and analysis workbench. http://www.jalview.org/
Platform: Mac, Windows, Linux, Solaris, Unix, etc.
Input: Amino acid and nucleotide FASTA, Clustal (.aln), BLC, PIR, GCG (.msf), and PFAM formats.
Output: Default is .msf files. Can also export in FASTA, Clustal (.aln), Phylip, PIR, and text

[Figure: alignment displayed with four coloring schemes: BLOSUM62, PERCENT IDENTITY, CLUSTAL, HYDROPHOBICITY]

REGIONS OF PROBLEMATIC ALIGNMENT
Accuracy of the alignment has an impact on the resulting phylogenetic tree!!
ALIGNMENT: MUSCLE - FULL LENGTH
ALIGNMENT: MUSCLE - CONSERVED REGIONS
Gblocks: Castresana (2000) Mol. Biol. Evol. 17: 540-552

[Figure: two phylogenetic trees with node support values, one inferred from the FULL LENGTH alignment and one from the CONSERVED REGIONS alignment (scale bars 0.1 and 0.2)]

EFFECTS BRANCH/NODE SUPPORT
[Figure: the FULL LENGTH and CONSERVED REGIONS trees shown side by side]

NO “CORRECT” SOLUTION
KNOW IMPLICATIONS OF YOUR DECISIONS
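How much the tree changes depends on which alignment columns are kept. The following is a minimal Python sketch of column trimming, and it is not the Gblocks algorithm: it simply drops every column whose gap fraction exceeds a threshold. The names `trim_gappy_columns` and `max_gap_fraction` are hypothetical, introduced only for illustration; the two sequence fragments are columns 47-58 of two sequences from the FASTA example in the FILE FORMATS section.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.0, gap="-"):
    """Return a copy of the alignment without columns that exceed the gap limit.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    max_gap_fraction: hypothetical threshold; 0.0 keeps only gap-free columns.
    """
    names = list(alignment)
    lengths = {len(alignment[n]) for n in names}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    keep = []
    for i in range(n_cols):
        gaps = sum(1 for n in names if alignment[n][i] == gap)
        if gaps / len(names) <= max_gap_fraction:
            keep.append(i)
    # Rebuild every sequence from the retained column indices only.
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


example = {
    "Struthio_camelus": "NNFAT--VDDYK",
    "Nothoprocta_cinerascens": "NNNNNT-VDGYK",
}
trimmed = trim_gappy_columns(example)
for name, seq in trimmed.items():
    print(name, seq)
```

Raising `max_gap_fraction` keeps more columns at the cost of more uncertain positions, which is exactly the kind of decision whose implications should be checked on the resulting tree.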
FILE FORMATS

FASTA FORMAT
>Struthio_camelus
VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSKNNNNNNFAT--VDDYKPVPLDYMLDSK
>Rhea_americana
VKYPNTNEEGKEVLLPEILNPVGTDGVYSNELANIEYTNVNKDNNNNNFAT--VDDHKPVSLEYMLDSK
>Pterocnemia_pennata
VKYPNTNEEGKEVLLPEILNPVGADGVYSNELANIEYTNVSKDHDNEVFAT--VDDHKPVSLEYMLDSK
>Casuarius_casuarius
VKYPNTNEDGKEVLLPKILNPIGSDGVYSDDLANIEYANVSKDHDKEVFAT--VDEYKPVSPEYMLDSK
>Dromaius_novaehollandiae
VKYPNTNEDGKEVLLPKILNPIGSDGVYSNDLANIEYANVNNDNNNNNFAT--VDDYKPVSLEYMLDSK
>Nothoprocta_cinerascens
VKYPNANDDGKEVPLPKTPSPIAANAVFGSDLANVEYTNISKDHDKNNNNNT-VDGYKPATLEYFLDNQ
>Eudromia_elegans
VRYPNANDDGKEVPLPKTPSPVGANGVYSSDLANVEYTNINKNNNNNNNNNS-IDGYKPATLEFFLDNQ

Sequence lines: up to 80 chars

PHYLIP FORMAT
7 69
S_camelus  VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSKNNNNNNFAT--VDDYKPVPLDYMLDSK
R_american VKYPNTNEEGKEVLLPEILNPVGTDGVYSNELANIEYTNVNKDNNNNNFAT--VDDHKPVSLEYMLDSK
P_pennata  VKYPNTNEEGKEVLLPEILNPVGADGVYSNELANIEYTNVSKDHDNEVFAT--VDDHKPVSLEYMLDSK
C_casuariu VKYPNTNEDGKEVLLPKILNPIGSDGVYSDDLANIEYANVSKDHDKEVFAT--VDEYKPVSPEYMLDSK
D_novaehol VKYPNTNEDGKEVLLPKILNPIGSDGVYSNDLANIEYANVNNDNNNNNFAT--VDDYKPVSLEYMLDSK
N_cinerasc VKYPNANDDGKEVPLPKTPSPIAANAVFGSDLANVEYTNISKDHDKNNNNNT-VDGYKPATLEYFLDNQ
E_elegans  VRYPNANDDGKEVPLPKTPSPVGANGVYSSDLANVEYTNINKNNNNNNNNNS-IDGYKPATLEFFLDNQ

Names: 10 chars, NO WHITE SPACE
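The step from FASTA to PHYLIP can be scripted. Below is a minimal Python sketch, assuming a sequential PHYLIP layout with a 10-character name field followed by the sequence on one line. The function names `parse_fasta` and `to_phylip` are hypothetical. Note one difference from the slide: the sketch shortens names by plain truncation (`Struthio_c`), whereas the slide abbreviates them by hand (`S_camelus`).

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records, name_width=10):
    """Format aligned records as PHYLIP text with fixed-width names."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP needs aligned sequences of equal length")
    lines = [f"{len(records)} {lengths.pop()}"]  # header: ntax nchar
    seen = set()
    for name, seq in records:
        # No white space inside names; cut to the fixed name width.
        short = name.replace(" ", "_")[:name_width]
        if short in seen:
            raise ValueError(f"name {short!r} is not unique after truncation")
        seen.add(short)
        lines.append(short.ljust(name_width) + " " + seq)
    return "\n".join(lines)


fasta = ">Struthio_camelus\nVKYPNTNEEG\n>Rhea_americana\nVKYPNTNEEG\n"
print(to_phylip(parse_fasta(fasta)))
```

The uniqueness check matters in practice: two species of the same genus often collapse to the same 10 characters after truncation, which is why hand-made abbreviations like those on the slide are common.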
NEXUS FORMAT
#NEXUS
begin data;
dimensions ntax=7 nchar=69;
format datatype=protein missing=? gap=- matchchar=.;

matrix
Struthio_camelus VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSK??????FAT--VDDYKPVPLDYMLDSK
Rhea_americana ...... L..E..N.V.T...... ?.D?????...--...H...S.E.....
Pterocnemia_pennata ...... L..E..N.V.A...... DHD?EV...--...H...S.E.....
Casuarius_casuarius ...... D....L.....N...... DD...... A....DHDKEV...--..E....SPE.....
Dromaius_novaehollandiae ...... D....L.....N...... D...... A..??D?????...--...... S.E.....
Nothoprocta_cinerascens .....A.D.....P...TP...A.NA.FGS....V....I..DHDK?????T-..G...AT.E.F..N
Eudromia_elegans .R.....D.....P...TP..V.AN....S....V....I?.?????????S-I.G...AT.EFF..N
;
end;

begin mrbayes;
prset aamodelpr=mixed;
end;

NEWICK TREE FORMAT
Topology: ((A,B),C)
Branch Length: ((A:2,B:4):10,C:8)
Confidence Stats: ((A:2,B:4):10[89],C:8)

[Figure: example tree with node support values, shown next to its Newick string]

((((Moss2:0.59223167356244488246,Moss1:0.48430519315771680677):0.47610587518093150372[100],(Rice3:0.55644328355758998494,(Brachy2:0.63383594852707514367,(Sorghum1:0.14451441234434442284,Maize2:0.55808284363435467501):0.29412654253200387622[63]):0.14718362545267285602[70]):0.72708851517482031568[97]):0.16225290952698268043[8],…)
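To show how the three Newick variants above encode topology, branch lengths, and confidence values, here is a minimal recursive-descent parser sketched in Python. It assumes the layout used in the slide's examples, with the support value in square brackets directly after the branch length, and it does not handle quoted names or white space between tokens. The function names `parse_newick` and `leaf_names` and the dictionary keys are hypothetical.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts (name, length, support, children)."""
    text = text.strip().rstrip(";")
    pos = 0

    def read_until(stops):
        # Read characters up to (not including) the next stop character.
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in stops:
            pos += 1
        return text[start:pos].strip()

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
        name = read_until(":[,()")
        length = support = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            length = float(read_until("[,()"))
        if pos < len(text) and text[pos] == "[":
            pos += 1
            support = float(read_until("]"))
            pos += 1  # consume "]"
        return {"name": name or None, "length": length,
                "support": support, "children": children}

    root = node()
    if pos != len(text):
        raise ValueError("unexpected trailing text")
    return root


def leaf_names(tree):
    """Collect leaf names from left to right."""
    if not tree["children"]:
        return [tree["name"]]
    return [n for child in tree["children"] for n in leaf_names(child)]


tree = parse_newick("((A:2,B:4):10[89],C:8)")
print(leaf_names(tree))
print(tree["children"][0]["length"], tree["children"][0]["support"])
```

The same function accepts the plain topology string `((A,B),C)`; branch lengths and support are then simply `None`.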
PHYLOGENETIC METHODS

DISTANCE MATRIX ANALYSES
• The number of differences between all sequence pairs is treated as a distance
• Clustering method
Neighbor-Joining: selects the tree with the smallest total branch length by sequential selection of neighbors

PROS & CONS
• Computationally fast
• Produces 1 tree; does not consider all possible topologies
• Can get different results based on input order
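The first step of a distance analysis, turning pairwise differences into a distance matrix, can be sketched in a few lines of Python. This is only the uncorrected proportion of differing sites (the p-distance); programs such as Protdist apply substitution-model corrections that are not shown here. The function names `p_distance` and `distance_matrix` are hypothetical, and the four short sequences are toy data.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Return {(name_i, name_j): distance} for every unordered pair of sequences."""
    return {(a, b): p_distance(alignment[a], alignment[b])
            for a, b in combinations(alignment, 2)}


toy = {"a": "AGGA", "b": "AGGG", "c": "AACA", "d": "AACG"}
matrix = distance_matrix(toy)
for pair, dist in matrix.items():
    print(pair, dist)
```

A clustering method such as Neighbor-Joining would then take this matrix as its only input, which is why it is fast and why it returns a single tree.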
PROGRAMS
• PAUP*
• MEGA5
• PHYLIP
MAXIMUM PARSIMONY ANALYSES
• The optimum tree requires the minimum number of changes needed to explain the divergence between the taxa
• The hypothesis that requires the fewest assumptions is the best

Site  1 2 3 4
a     A G G A
b     A G G G
c     A A C A
d     A A C G

[Figure: the possible unrooted tree topologies for taxa a, b, c, and d]
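The counting behind the four-taxon example can be made explicit with Fitch's small-parsimony algorithm. The Python sketch below scores the three possible groupings of a, b, c, and d using the site matrix above. The function names `fitch_length` and `parsimony_score` are hypothetical, and each unrooted topology is written as an arbitrarily rooted pair of pairs, which does not change the parsimony score.

```python
def fitch_length(tree, states):
    """Fitch pass for one site on a rooted binary tree.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping leaf name -> character state at this site.
    Returns (possible_states_at_node, changes_below_node).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_length(tree[0], states)
    right_set, right_changes = fitch_length(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: one change is required on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_length(tree, site)[1]
    return total


alignment = {"a": "AGGA", "b": "AGGG", "c": "AACA", "d": "AACG"}
topologies = {
    "((a,b),(c,d))": (("a", "b"), ("c", "d")),
    "((a,c),(b,d))": (("a", "c"), ("b", "d")),
    "((a,d),(b,c))": (("a", "d"), ("b", "c")),
}
scores = {label: parsimony_score(tree, alignment)
          for label, tree in topologies.items()}
for label, score in sorted(scores.items(), key=lambda item: item[1]):
    print(label, score)
```

Grouping a with b needs 4 changes, a with c needs 5, and a with d needs 6, so ((a,b),(c,d)) is the most parsimonious tree for this matrix.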
PROS & CONS
• Considers all possible trees (sort of)
• Computationally intensive: 10 taxa yield more than 2 million possible unrooted trees
• No multiple hit correction

PROGRAMS
• PAUP*
• PHYLIP
• MESQUITE
• MEGA5
MAXIMUM LIKELIHOOD ANALYSES
Uses the maximum likelihood for each possible topology to choose the best tree

• Choose a probability model to estimate the likelihood that a position will undergo a substitution within a given time
• Generate the likelihood for each possible tree
• Calculate which tree has the optimal likelihood

PROS & CONS
• Makes assumptions about both the rate of evolution and the pattern of site substitution
• Very slow: takes into consideration all possible trees AND calculates their likelihood
• As long as the assumptions are realistic, tends to be the most consistent method
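As a small illustration of "choose a probability model, then find the value with the optimal likelihood", the Python sketch below uses the Jukes-Cantor (JC69) nucleotide model, which is an assumption made here for simplicity and is not named on the slide. It scores a single branch between two aligned DNA sequences rather than a whole tree. The function names and the two toy sequences are hypothetical.

```python
import math


def jc69_log_likelihood(seq_a, seq_b, distance):
    """Log-likelihood of two aligned DNA sequences separated by `distance`
    expected substitutions per site under the JC69 model."""
    decay = math.exp(-4.0 * distance / 3.0)
    p_same = 0.25 + 0.75 * decay  # probability a site keeps the same base
    p_diff = 0.25 - 0.25 * decay  # probability of one specific different base
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        # 0.25 is the stationary frequency of the base in the first sequence.
        log_l += math.log(0.25 * (p_same if a == b else p_diff))
    return log_l


def jc69_ml_distance(seq_a, seq_b):
    """Closed-form maximum likelihood distance under JC69."""
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        raise ValueError("sequences too divergent for JC69")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTTT"

# Brute-force search over candidate branch lengths, as a stand-in for the
# numerical optimisation that ML programs perform for every branch.
candidates = [i / 1000 for i in range(1, 2001)]
best = max(candidates, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))
print(best, jc69_ml_distance(seq_a, seq_b))
```

A full ML analysis repeats this kind of optimisation for every branch of every candidate topology, which is where the long run times come from.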
PROGRAMS
• PAUP*
• MrBayes
• TREE-PUZZLE
• PHYLIP
• RAxML
• PhyML
VALUABLE RESOURCE
http://evolution.genetics.washington.edu/phylip/software.html

PHYLIP http://evolution.gs.washington.edu/phylip.html

PROGRAM DESCRIPTION: A package of programs for inferring phylogenies. Methods available include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees.

PLATFORMS: Windows, Mac OS X, and Linux

INPUT: PHYLIP format; Data types include: molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and discrete characters.

OTHER GENERAL PURPOSE PACKAGES:
• PAUP*
• MEGA5
• MESQUITE

PHYLIP: Distance Matrix Example Pipeline
Input data set: 63 proteins; 515 chars
Seqboot: Generates multiple resampled datasets from the input data set (100 replicates). Run time: instantaneous
Protdist: Computes distance matrix from protein sequences. Run time: 1 ½ hours
Fitch: Generates topology using distance matrix (Global readjustment; Jumble = 5). Run time: <2 days
Consense: Generates consensus tree from the replicates above. Run time: instantaneous

MrBayes http://mrbayes.sourceforge.net/index.php
PROGRAM DESCRIPTION: A program for Bayesian estimation of phylogeny.

PLATFORMS: Mac (serial or clusters), Windows & Unix

INPUT: Nucleotide or amino acid alignments in NEXUS format

RUN TIME:
• 12 taxa; 898 char (nt), ngen=10000; samplefreq=10: <5 mins
• 89 taxa; 88 char (aa), ngen=10000; samplefreq=10: <15 mins
• 63 taxa; 515 char (aa), ngen=500000; samplefreq=10: 19+ hours
MrBayes: Loading Input Data
MrBayes > execute filename.nex

MrBayes: Define Structure of the Model
Datatype
  Nucmodel: 4x4, Doublet, Codon
  Nst: 1 = F81, 2 = K80, 6 = GTR
  Rates: equal, gamma, propinv, invgamma, adgamma

MrBayes > lset nst=6 rates=invgamma
MrBayes > help lset

MrBayes: Setting the Priors
Types of parameters in the model:
1. Topology
2. Branch lengths
3. Stationary frequencies of the nucleotides
4. Nucleotide substitution rates (6)
5. Proportion of invariable sites
6. Shape parameter of the gamma distribution of rate variation

Default parameters work well for most analyses

MrBayes > help prset

MrBayes: Understanding Screen Printout
MrBayes > mcmc ngen=200000 samplefreq=10 printfreq=50
(1,000,000) (100)
Cold Chain
ngen TREE #1 TREE #2 Time
MrBayes > help mcmc

MrBayes: When to Stop Analysis?
MrBayes > sump burnin=#; # = value corresponding to 25% of samples

Example: if ngen=200000 and samplefreq=50, then burnin=1000 (200000 ÷ 50 × 0.25)
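The burn-in arithmetic above is easy to get wrong when ngen or samplefreq changes between runs, so it can help to compute it. This is a small Python sketch of the same rule of thumb; the function name `burnin_samples` and its `fraction` parameter are hypothetical and are not MrBayes options.

```python
def burnin_samples(ngen, samplefreq, fraction=0.25):
    """Number of samples to discard as burn-in: (ngen / samplefreq) * fraction."""
    if ngen <= 0 or samplefreq <= 0:
        raise ValueError("ngen and samplefreq must be positive")
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    total_samples = ngen // samplefreq
    return int(total_samples * fraction)


# The example from the text: 200000 / 50 = 4000 samples, 25% of which is 1000.
print(burnin_samples(200000, 50))
# The same run length sampled every 10 generations needs a larger burn-in value.
print(burnin_samples(200000, 10))
```

The second call shows why the burn-in value passed to sump must be recomputed whenever samplefreq changes: the number refers to samples, not generations.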
COMPLETE RUN vs. INCOMPLETE RUN

[Figure: sump plots for runs 1 and 2, shown for a COMPLETE RUN and an INCOMPLETE RUN]

                                    95% Cred. Interval
Parameter   Mean        Variance   Lower       Upper       Median      PSRF *
TL          19.978955   0.050256   19.597000   20.258000   20.084000   3.113

* PSRF = Potential Scale Reduction Factor (approaches 1.0 as the runs converge)

RAxML http://sco.h-its.org/exelixis/software.html
PROGRAM DESCRIPTION: A program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees.

PLATFORMS: Mac & Linux; online version http://phylobench.vital-it.ch/raxml-bb/

INPUT: Nucleotide or amino acid alignments in PHYLIP format; Newick trees

RUN TIME:
• 25,000 taxa; 1500 char (nt) on single CPU: 13 ½ days
• 63 taxa; 515 char (aa), 20 iterations; 100 bootstraps: 1 ¼ days
• 63 taxa; 134 char (aa), 20 iterations; 100 bootstraps: 9 hours
The Easy & Fast Way: (Works well in most practical cases)
raxmlHPC -f a -x 12345 -p 12345 -# 100 -m model -s infile -n TEST
# conducts BS search and then finds the best-scoring ML tree >> bootstrapped trees, best-scoring ML tree, & BS support values

The Hard & Slow Way
1. Determining initial rearrangement setting:
If not specified with the -i option, it will try settings of 5, 10, 15, 20, and 25 and use the minimal setting that yields the best likelihood improvement on the starting trees. Run the program several times with both the auto determination setting and a pre-defined value of 10.
raxmlHPC -y -s infile -m GTRCAT -n ST0
# generates random MP starting tree
raxmlHPC -f d -i 10 -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n FI0
# infers ML tree from starting tree using fixed setting
raxmlHPC -f d -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n AI0
# infers ML tree from starting tree using auto setting

2. Determining Number of Rate Categories:
Try several rate categories, e.g. 10, 25, 40, & 55, and choose the one that gives the best likelihood value
raxmlHPC -f d -i 10 -c 10 -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n C10

3. Finding the Best-Known Likelihood Tree (BKL):
raxmlHPC -f d -i 10 -c 25 -m GTRMIX -s infile -# 10 -n MO

4. Bootstrapping:
raxmlHPC -f d -i 10 -c 25 -m GTRCAT -s infile -# 100 -b 12345 -n MB

5. Generate Confidence Values:
raxmlHPC -f b -m GTRCAT -s infile -z RAxML_bootstrap.MB -t RAxML_result.MO -n BS_tree

TREE VISUALIZATION\MANIPULATION
FigTree http://tree.bio.ed.ac.uk/software/figtree/
Prepares graphical representations of trees for publication (specifically with BEAST)

MEGA5 (Tree Explorer) http://www.megasoftware.net/
Plotting, rearranging and editing trees

Dendroscope http://mafft.cbrc.jp/alignment/server/
Visualization and navigation of phylogenetic trees; designed specifically to handle very large trees, i.e. 100,000s of taxa (recommended by RAxML)

MacClade http://www.macclade.org/
Interactive analysis of evolution: observe the effect of tree manipulation, e.g. number of character steps & distribution of states of a given character

TREE TYPES
[Figure: the same tree with node support values drawn in several different tree styles (scale bars 0.2)]
POST-PHYLOGENETIC ANALYSES

r8s http://loco.biosci.arizona.edu/r8s/index.html
Analysis of rates ("r8s") of evolution: a program for estimating absolute rates ("r8s") of molecular evolution and divergence times on a phylogenetic tree.

BEAST http://beast.bio.ed.ac.uk/Main_Page
• Species phylogenies for molecular dating
• Coalescent-based population genetics
• Measurably evolving populations

Multidivtime http://statgen.ncsu.edu/thorne/multidivtime.html
• Studying rates of molecular evolution
• Estimating divergence times

PAML http://abacus.gene.ucl.ac.uk/software/paml.html
• Test evolutionary models
• Calculate substitution rates among sites
• Reconstruct ancestral sequences
• Simulate sequence evolution and phylogenetic reconstruction
• Estimate branch length
• Estimate parameters in evolutionary model:
  • transition/transversion rate ratio
  • the gamma parameter for variable substitution among sites
  • rate parameters for different genes
  • synonymous and nonsynonymous substitution rates