USER RESPONSIBILITY

GARBAGE IN = GARBAGE OUT

Each step relies on the accuracy of the previous steps

Just because you get an answer does not make it right:
• Appropriate test?
• Correct parameters?
• Applicable dataset?

ANALYSIS PIPELINE

Multiple Alignment: CLUSTALW, T-COFFEE, MAFFT, MUSCLE, PROBCONS
Visualization & Adjustment: GENEDOC, JALVIEW
Format Input Data: FASTA, PHYLIP, NEXUS, Newick
Phylogenetics
  Methods: Distance Matrix, Max Parsimony, Max Likelihood
  Programs: PHYLIP, RAxML, MrBayes
Evolutionary Analyses: r8s, PAML, BEAST, Multidivtime

ALIGNMENT PROGRAMS

ClustalW (1994) http://www.ebi.ac.uk/Tools/msa/clustalw2/
Uses a progressive multiple alignment; parameters, e.g. gap penalties, are adjusted according to the input, i.e. divergence, length, local hydropathy, etc.

T-Coffee (2000) http://igs-server.cnrs-mrs.fr/Tcoffee/
Performs pairwise local and global alignments, then combines them in a progressive multiple alignment

MAFFT (2002) http://mafft.cbrc.jp/alignment/server/
Detects local homologous regions by Fast Fourier Transform (considers aa size & polarity), then uses a restricted global DP and a progressive algorithm and horizontal refinement

MUSCLE (2004) http://www.drive5.com/muscle
Uses kmer distances and log-expectation scores, progressive and horizontal refinement

PROBCONS (2005) http://probcons.stanford.edu
<30 taxa**
Pairwise consistency based on an objective function

COMPARISON OF ALIGNMENT PROGRAMS

ALIGNMENT: CLUSTALW

ALIGNMENT: MUSCLE

ALIGNMENT: MAFFT

ALIGNMENT VIEWERS/MANIPULATORS

GENEDOC
Program Description: A Full Featured Multiple Sequence Alignment Editor, Analyser and Shading Utility for Windows.
http://www.nrbsc.org/gfx/genedoc/
Platform: Windows
Input: Amino acid and nucleotide FASTA, Clustal (.aln), Phylip, PIR, GCG (.msf), and GenBank formats.
Output: Default is .msf files. Can also export in FASTA, Clustal (.aln), Phylip, PIR, and text.

JALVIEW
Program Description: Jalview is a multiple alignment editor written in Java. It is used widely in a variety of web pages but is available as a general purpose alignment editor and analysis workbench.
http://www.jalview.org/
Platform: Mac, Windows, Solaris, Unix, etc.
Input: Amino acid and nucleotide FASTA, Clustal (.aln), BLC, PIR, GCG (.msf), and PFAM formats.
Output: Default is .msf files. Can also export in FASTA, Clustal (.aln), Phylip, PIR, and text.

[Figure: alignment color schemes. Panels: BLOSUM62, PERCENT IDENTITY, CLUSTAL, HYDROPHOBICITY]

REGIONS OF PROBLEMATIC ALIGNMENT

Accuracy of the alignment has an impact on the resulting phylogenetic tree!

ALIGNMENT: MUSCLE - FULL LENGTH

ALIGNMENT: MUSCLE - CONSERVED REGIONS

Gblocks: Castresana (2000) Mol. Biol. Evol. 17: 540-552

[Figure: trees from the FULL LENGTH and CONSERVED REGIONS alignments, side by side, with node support values; scale bars 0.1 and 0.2]

EFFECTS BRANCH/NODE SUPPORT

[Figure: the FULL LENGTH and CONSERVED REGIONS trees, repeated]

NO “CORRECT” SOLUTION
KNOW IMPLICATIONS OF YOUR DECISIONS
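The FULL LENGTH versus CONSERVED REGIONS contrast above depends on which alignment columns are kept. The sketch below is a deliberately crude column filter, not the Gblocks algorithm: the function names and both thresholds (`max_gap_fraction`, `min_identity`) are assumptions made for illustration, and the toy alignment is made up.

```python
from collections import Counter


def conserved_columns(alignment, max_gap_fraction=0.0, min_identity=0.5):
    """Return indices of alignment columns that pass a crude conservation filter.

    Hypothetical thresholds, chosen only for illustration:
    max_gap_fraction: largest tolerated fraction of gap characters in a column
    min_identity: smallest tolerated fraction of sequences sharing the
                  most common residue in a column
    """
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        column = [seq[col] for seq in alignment]
        residues = [c for c in column if c != "-"]
        gap_fraction = (n_seqs - len(residues)) / n_seqs
        if not residues or gap_fraction > max_gap_fraction:
            continue  # drop gappy columns
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / n_seqs >= min_identity:
            keep.append(col)
    return keep


def trim(alignment, columns):
    """Keep only the selected columns in every sequence."""
    return ["".join(seq[c] for c in columns) for seq in alignment]


# Toy alignment (made up for this example)
toy = ["MKV-LA", "MKVQLA", "MRV-IG", "MKVQTC"]
cols = conserved_columns(toy)
trimmed = trim(toy, cols)
print(cols)
print(trimmed)
```

Changing either threshold changes which columns survive, and therefore the input to every later step of the pipeline.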

FILE FORMATS

FASTA FORMAT

>Struthio_camelus
VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSKNNNNNNFAT--VDDYKPVPLDYMLDSK
>Rhea_americana
VKYPNTNEEGKEVLLPEILNPVGTDGVYSNELANIEYTNVNKDNNNNNFAT--VDDHKPVSLEYMLDSK
>Pterocnemia_pennata
VKYPNTNEEGKEVLLPEILNPVGADGVYSNELANIEYTNVSKDHDNEVFAT--VDDHKPVSLEYMLDSK
>Casuarius_casuarius
VKYPNTNEDGKEVLLPKILNPIGSDGVYSDDLANIEYANVSKDHDKEVFAT--VDEYKPVSPEYMLDSK
>Dromaius_novaehollandiae
VKYPNTNEDGKEVLLPKILNPIGSDGVYSNDLANIEYANVNNDNNNNNFAT--VDDYKPVSLEYMLDSK
>Nothoprocta_cinerascens
VKYPNANDDGKEVPLPKTPSPIAANAVFGSDLANVEYTNISKDHDKNNNNNT-VDGYKPATLEYFLDNQ
>Eudromia_elegans
VRYPNANDDGKEVPLPKTPSPVGANGVYSSDLANVEYTNINKNNNNNNNNNS-IDGYKPATLEFFLDNQ

Line length: 80 chars

PHYLIP FORMAT

7 69
S_camelus  VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSKNNNNNNFAT--VDDYKPVPLDYMLDSK
R_american VKYPNTNEEGKEVLLPEILNPVGTDGVYSNELANIEYTNVNKDNNNNNFAT--VDDHKPVSLEYMLDSK
P_pennata  VKYPNTNEEGKEVLLPEILNPVGADGVYSNELANIEYTNVSKDHDNEVFAT--VDDHKPVSLEYMLDSK
C_casuariu VKYPNTNEDGKEVLLPKILNPIGSDGVYSDDLANIEYANVSKDHDKEVFAT--VDEYKPVSPEYMLDSK
D_novaehol VKYPNTNEDGKEVLLPKILNPIGSDGVYSNDLANIEYANVNNDNNNNNFAT--VDDYKPVSLEYMLDSK
N_cinerasc VKYPNANDDGKEVPLPKTPSPIAANAVFGSDLANVEYTNISKDHDKNNNNNT-VDGYKPATLEYFLDNQ
E_elegans  VRYPNANDDGKEVPLPKTPSPVGANGVYSSDLANVEYTNINKNNNNNNNNNS-IDGYKPATLEFFLDNQ

Taxon names: 10 chars, NO WHITE SPACE
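Converting between the two formats above is mostly bookkeeping. The following is a minimal sketch, assuming aligned input and a plain FASTA layout; `parse_fasta` and `to_phylip` are hypothetical helper names, and the blunt truncation of names to 10 characters is an assumption (the slide uses hand-made abbreviations such as S_camelus instead). The two records are 10-residue excerpts of the example sequences.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # keep the first word of the header as the taxon name
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Write records as PHYLIP text: a 'ntax nchar' line, then 10-char names."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    short_names = [name[:10] for name, _ in records]
    if len(set(short_names)) != len(short_names):
        raise ValueError("names are not unique after truncation to 10 chars")
    lines = [f"{len(records)} {lengths.pop()}"]
    for short, (_, seq) in zip(short_names, records):
        lines.append(f"{short:<10}{seq}")  # pad the name field to 10 chars
    return "\n".join(lines)


fasta = """>Struthio_camelus
VKYPNTNEEG
>Rhea_americana
VKYPNTNEEG
"""
phylip = to_phylip(parse_fasta(fasta))
print(phylip)
```

The uniqueness check matters in practice: two species from the same genus can collapse to the same 10-character name.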

NEXUS FORMAT

#NEXUS
begin data;
dimensions ntax=7 nchar=69;
format datatype=protein missing=? gap=- matchchar=.;

matrix
Struthio_camelus VKYPNTNEEGKEVVLPKILSPIGSDGVYSNELANIEYTNVSK??????FAT--VDDYKPVPLDYMLDSK
Rhea_americana ...... L..E..N.V.T...... ?.D?????...--...H...S.E.....
Pterocnemia_pennata ...... L..E..N.V.A...... DHD?EV...--...H...S.E.....
Casuarius_casuarius ...... D....L.....N...... DD...... A....DHDKEV...--..E....SPE.....
Dromaius_novaehollandiae ...... D....L.....N...... D...... A..??D?????...--...... S.E.....
Nothoprocta_cinerascens .....A.D.....P...TP...A.NA.FGS....V....I..DHDK?????T-..G...AT.E.F..N
Eudromia_elegans .R.....D.....P...TP..V.AN....S....V....I?.?????????S-I.G...AT.EFF..N
;
end;

begin mrbayes;
prset aamodelpr=mixed;
end;

NEWICK TREE FORMAT

Topology:          ((A,B),C)
Branch Length:     ((A:2,B:4):10,C:8)
Confidence Stats:  ((A:2,B:4):10[89],C:8)

[Figure: example tree with node support values; scale bar 0.2]

((((Moss2:0.59223167356244488246,Moss1:0.48430519315771680677):0.47610587518093150372[100],(Rice3:0.55644328355758998494,(Brachy2:0.63383594852707514367,(Sorghum1:0.14451441234434442284,Maize2:0.55808284363435467501):0.29412654253200387622[63]):0.14718362545267285602[70]):0.72708851517482031568[97]):0.16225290952698268043[8],…)
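The three Newick variants above nest the same way, so a small recursive reader covers all of them. This is a minimal sketch, assuming the `:length[support]` layout shown on the slide and unquoted names; `parse_newick` and `leaf_names` are hypothetical helpers, not part of any of the programs listed here.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts (name, length, support, children)."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": "", "length": None, "support": None, "children": []}
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                node["children"].append(parse_node())
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ",":
                    pos += 1
                    continue
                if text[pos] == ")":
                    pos += 1
                    break
                raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        # optional label
        start = pos
        while pos < len(text) and text[pos] not in ",():[":
            pos += 1
        node["name"] = text[start:pos].strip()
        # optional branch length after ":"
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()[":
                pos += 1
            node["length"] = float(text[start:pos])
        # optional support value in square brackets
        if pos < len(text) and text[pos] == "[":
            end = text.index("]", pos)
            node["support"] = float(text[pos + 1:end])
            pos = end + 1
        return node

    root = parse_node()
    if pos != len(text):
        raise ValueError("trailing characters after tree")
    return root


def leaf_names(node):
    """Collect taxon names from left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


tree = parse_newick("((A:2,B:4):10[89],C:8)")
clade_ab = tree["children"][0]
print(leaf_names(tree), clade_ab["length"], clade_ab["support"])
```

Programs differ in where they write support values (in brackets, or as internal node labels), so check the convention of the program that produced the tree before reusing a reader like this.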

PHYLOGENETIC METHODS

DISTANCE MATRIX ANALYSES
• The number of differences between all sequence pairs is treated as a distance
• Clustering method

Neighbor-Joining: selects the tree with the smallest total branch length by sequential selection of neighbors

PROS & CONS
• Computationally fast
• Produces 1 tree > does not consider all possible topologies
• Can get different results based on input order

PROGRAMS
• PAUP*
• MEGA5
• PHYLIP
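The first step of a distance analysis, turning pairwise differences into a matrix, can be sketched as follows. This is an illustration under simple assumptions: the distance is the uncorrected proportion of differing sites, gap characters are compared like any other symbol, and the sequences are made-up toy data. The clustering step (e.g. Neighbor-Joining) is not shown.

```python
from itertools import combinations


def differences(a, b):
    """Count the sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(seqs):
    """Return a nested dict of uncorrected distances (differences / length)."""
    names = list(seqs)
    matrix = {n: {m: 0.0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = differences(seqs[n], seqs[m]) / len(seqs[n])
        matrix[n][m] = matrix[m][n] = d
    return matrix


# Toy aligned sequences (made up for this example)
toy = {"a": "AGGA", "b": "AGGG", "c": "AACA", "d": "AACG"}
dist = distance_matrix(toy)
for name, row in dist.items():
    print(name, [row[other] for other in toy])
```

Real programs usually apply a substitution model to correct these raw proportions for multiple hits before building the tree.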

MAXIMUM PARSIMONY ANALYSES
• The optimum tree requires the minimum number of changes needed to explain the divergence between the taxa
• Hypothesis that requires the fewest assumptions is the best

Site   1 2 3 4
a      A G G A
b      A G G G
c      A A C A
d      A A C G

Possible unrooted trees: ((a,b),(c,d))   ((a,c),(b,d))   ((a,d),(b,c))
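The four-taxon example above can be scored by hand or with a few lines of code. The sketch below uses Fitch's counting method, which is one standard way to count the minimum number of changes on a given tree; the source does not say which method the listed programs use. Each unrooted tree is written as a pair of pairs, rooted on its internal branch, which does not change the count.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree: a taxon name, or a 2-tuple of subtrees (rooted binary tree)
    states: dict mapping taxon name to its character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_score(left, states)
    right_set, right_changes = fitch_score(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # no shared state: one change is needed at this node
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the minimum number of changes over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for site in range(n_sites):
        states = {taxon: seq[site] for taxon, seq in alignment.items()}
        total += fitch_score(tree, states)[1]
    return total


alignment = {"a": "AGGA", "b": "AGGG", "c": "AACA", "d": "AACG"}
trees = {
    "((a,b),(c,d))": (("a", "b"), ("c", "d")),
    "((a,c),(b,d))": (("a", "c"), ("b", "d")),
    "((a,d),(b,c))": (("a", "d"), ("b", "c")),
}
scores = {label: parsimony_length(t, alignment) for label, t in trees.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Site 1 is constant and contributes nothing; sites 2 and 3 favor grouping a with b, while site 4 favors grouping a with c, so the preferred tree is the one with the smallest total.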

PROS & CONS
• Considers all possible trees (sort of)
• Computationally intensive: 10 taxa > 2 million possible trees
• No multiple hit correction

PROGRAMS
• PAUP*
• PHYLIP
• MESQUITE
• MEGA5
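The "10 taxa > 2 million possible trees" figure can be checked directly. A minimal sketch, assuming the count refers to unrooted, fully bifurcating trees, for which the number of topologies is the product of the odd numbers up to 2n - 5:

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating topologies: 1 x 3 x 5 x ... x (2n - 5)."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

The count grows so quickly that exhaustive searches are only feasible for small datasets, which is why most programs fall back on heuristic searches (the "sort of" in the list above).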

MAXIMUM LIKELIHOOD ANALYSES
Uses the maximum likelihood for each possible topology to choose the best tree

• Choose a probability model to estimate the likelihood that a position will undergo a substitution within a given time
• Generate likelihood for each possible tree
• Calculate which tree has the optimal likelihood
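To make "choose a probability model" concrete, the sketch below uses the Jukes-Cantor (JC69) model, picked here only because it is the simplest nucleotide model; the source does not prescribe it. It computes the likelihood of a single branch joining two aligned sequences, not of a whole tree, and finds the branch length that maximizes it by a simple grid search. The sequences are made up.

```python
import math


def count_differences(seq_a, seq_b):
    """Number of sites at which two aligned sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of two aligned sequences separated by branch length d.

    d is in expected substitutions per site and must be > 0.
    Under JC69, P(same base) = 1/4 + 3/4 e^(-4d/3)
                P(specific different base) = 1/4 - 1/4 e^(-4d/3)
    """
    n = len(seq_a)
    k = count_differences(seq_a, seq_b)
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e
    p_diff = 0.25 - 0.25 * e
    # 0.25 is the equal base frequency assumed by JC69
    return n * math.log(0.25) + (n - k) * math.log(p_same) + k * math.log(p_diff)


# Toy data (made up): 20 sites, 3 differences
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ATGTACGAACGTACCTACGT"

# Grid search for the branch length with the highest likelihood
grid = [i / 1000 for i in range(1, 2001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))

# Known closed-form JC69 estimate, for comparison
p = count_differences(seq_a, seq_b) / len(seq_a)
closed_form = -0.75 * math.log(1 - 4 * p / 3)
print(best_d, closed_form)
```

A full ML analysis repeats this kind of optimization for every branch of every candidate tree, which is where the long run times come from.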

PROS & CONS
• Makes assumptions about both the rate of evolution and pattern of site substitution
• Very slow: takes into consideration all possible trees AND calculates their likelihood
• As long as assumptions are realistic, tends to be the most consistent method

PROGRAMS
• PAUP*
• MrBayes
• TREE-PUZZLE
• PHYLIP
• RAxML
• PhyML

VALUABLE RESOURCE

http://evolution.genetics.washington.edu/phylip/software.html

PHYLIP
http://evolution.gs.washington.edu/phylip.html

PROGRAM DESCRIPTION: A package of programs for inferring phylogenies. Methods available include parsimony, distance matrix, and likelihood methods, including bootstrapping and consensus trees.

PLATFORMS: Windows, Mac OS X, and Linux

INPUT: PHYLIP format; data types include molecular sequences, gene frequencies, restriction sites and fragments, distance matrices, and discrete characters.

OTHER GENERAL PURPOSE PACKAGES:
• PAUP*
• MEGA5
• MESQUITE

PHYLIP: Distance Matrix Example Pipeline

Dataset: 63 proteins; 515 chars

Program    Function                                                     Run time
Seqboot    Generates multiple resampled datasets from input data set    Instantaneous
           (100 replicates)
Protdist   Computes distance matrix from protein sequences              1 ½ hours
Fitch      Generates topology using distance matrix                     <2 days
           (Global rearrangements, Jumble = 5)
Consense   Generates consensus tree from replicates above               Instantaneous

MrBayes
http://mrbayes.sourceforge.net/index.php

PROGRAM DESCRIPTION: A program for Bayesian estimation of phylogeny.

PLATFORMS: Mac (serial or clusters), Windows & Unix

INPUT: Nucleotide or amino acid alignments in NEXUS format

RUN TIME:
12 taxa; 898 char (nt); ngen=10000; samplefreq=10     <5 mins
89 taxa; 88 char (aa); ngen=10000; samplefreq=10      <15 mins
63 taxa; 515 char (aa); ngen=500000; samplefreq=10    19+ hours

MrBayes: Loading Input Data

MrBayes > execute filename.nex

MrBayes: Define Structure of the Model

Datatype
  Nucmodel: 4x4, Doublet, Codon
  Nst: 1 = F81, 2 = K80, 6 = GTR
  Rates: equal, gamma, propinv, invgamma, adgamma

MrBayes > lset nst=6 rates=invgamma
MrBayes > help lset

MrBayes: Setting the Priors

Types of parameters in the model:
1. Topology
2. Branch lengths
3. Stationary frequencies of the nucleotides
4. Nucleotide substitution rates (6)
5. Proportion of invariable sites
6. Shape parameter of the gamma distribution of rate variation

Default parameters work well for most analyses

MrBayes > help prset

MrBayes: Understanding Screen Printout

MrBayes > mcmc ngen=200000 samplefreq=10 printfreq=50
(1,000,000) (100)

[Screenshot: mcmc screen printout, annotated with ngen, TREE #1, TREE #2, Time, and the Cold Chain]

MrBayes > help mcmc

MrBayes: When to Stop Analysis?

MrBayes > sump burnin=#
# = value corresponding to 25% of samples

Example: if ngen=200000 and samplefreq=50, then burnin=1000 (200000 ÷ 50 × 0.25)
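The burn-in arithmetic above is easy to get wrong when ngen or samplefreq changes, so it can help to compute it. A minimal sketch of the same calculation; `burnin_samples` is a hypothetical helper name, and the 25% default simply follows the rule stated above.

```python
def burnin_samples(ngen, samplefreq, fraction=0.25):
    """Number of samples to discard: (ngen / samplefreq) x fraction."""
    n_samples = ngen // samplefreq
    return int(n_samples * fraction)


print(burnin_samples(200000, 50))
print(burnin_samples(200000, 10))
```

The result is the value to pass as `burnin=` to `sump`; it is counted in samples, not in generations.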

[Plot: sump output overlaying runs 1 and 2, shown for a COMPLETE RUN and an INCOMPLETE RUN]

                                 95% Cred. Interval
Parameter  Mean       Variance  Lower      Upper      Median     PSRF *
TL         19.978955  0.050256  19.597000  20.258000  20.084000  3.113

* Potential Scale Reduction Factor

RAxML
http://sco.h-its.org/exelixis/software.html

PROGRAM DESCRIPTION: A program for sequential and parallel Maximum Likelihood based inference of large phylogenetic trees.

PLATFORMS: Mac & Linux; online version http://phylobench.vital-it.ch/raxml-bb/

INPUT: Nucleotide or amino acid alignments in PHYLIP format; Newick trees

RUN TIME:
25,000 taxa; 1500 char (nt) on single CPU                  >> 13 ½ days
63 taxa; 515 char (aa); 20 iterations; 100 bootstraps      >> 1 ¼ days
63 taxa; 134 char (aa); 20 iterations; 100 bootstraps      >> 9 hours

The Easy & Fast Way (works well in most practical cases):
raxmlHPC -f a -x 12345 -p 12345 -# 100 -m model -s infile -n TEST
# conducts BS search and then finds the best-scoring ML tree
>> bootstrapped trees, best-scoring ML tree, & BS support values
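The bootstrap (BS) replicates requested above are built from resampled alignments. The sketch below shows only the classic resampling idea, drawing alignment columns with replacement; it does not reproduce RAxML's own bootstrap implementation. The toy alignment is made up, and the seed and replicate count merely echo the values in the command.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the same length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Toy alignment (made up for this example)
alignment = {
    "taxon1": "ACGTACGT",
    "taxon2": "ACGTTCGT",
    "taxon3": "ACCTACGA",
}

rng = random.Random(12345)  # fixed seed so the run can be repeated
replicates = [bootstrap_replicate(alignment, rng) for _ in range(100)]
print(replicates[0])
```

A tree is then inferred from every replicate, and the support for a branch is the percentage of replicate trees that contain it.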


The Hard & Slow Way

1. Determining initial rearrangement setting:
If not specified with the -i option, RAxML will try settings of 5, 10, 15, 20, and 25 and use the minimal setting that yields the best likelihood improvement on the starting trees.
Run the program several times with both the auto determination setting and with a pre-defined value of 10.
# generates random MP starting tree
raxmlHPC -y -s infile -m GTRCAT -n ST0
# infers ML tree from starting tree using fixed setting
raxmlHPC -f d -i 10 -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n FI0
# infers ML tree from starting tree using auto setting
raxmlHPC -f d -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n AI0

2. Determining Number of Rate Categories:
Try several rate categories, i.e. 10, 25, 40, & 55, and choose the one that gives the best likelihood value
raxmlHPC -f d -i 10 -c 10 -m GTRMIX -s infile -t RAxML_parsimonyTree.ST0 -n C10

3. Finding the Best-Known Likelihood Tree (BKL):
raxmlHPC -f d -i 10 -c 25 -m GTRMIX -s infile -# 10 -n MO

4. Bootstrapping:
raxmlHPC -f d -i 10 -c 25 -m GTRCAT -s infile -# 100 -b 12345 -n MB

5. Generate Confidence Values:
raxmlHPC -f b -m GTRCAT -s infile -z RAxML_bootstrap.MB -t RAxML_result.MO -n BS_tree

TREE VISUALIZATION/MANIPULATION

FigTree http://tree.bio.ed.ac.uk/software/figtree/
Prepares graphical representations of trees for publication (specifically with BEAST)

MEGA5 (Tree Explorer) http://www.megasoftware.net/
Plotting, rearranging and editing trees

Dendroscope http://dendroscope.org/
Visualization and navigation of phylogenetic trees; designed specifically to handle very large trees, i.e. 100,000s of taxa (recommended by RAxML)

MacClade http://www.macclade.org/
Interactive analysis of evolution: observe the effect of tree manipulation, i.e. # of char steps & distribution of states of a given character

TREE TYPES

[Figure: the example phylogeny drawn in several tree styles, with node support values; scale bars 0.2]

POST-PHYLOGENETIC ANALYSES

r8s http://loco.biosci.arizona.edu/r8s/index.html
Analysis of rates ("r8s") of evolution: a program for estimating absolute rates ("r8s") of molecular evolution and divergence times on a phylogenetic tree.

BEAST http://beast.bio.ed.ac.uk/Main_Page
• Species phylogenies for molecular dating
• Coalescent-based population genetics
• Measurably evolving populations

Multidivtime http://statgen.ncsu.edu/thorne/multidivtime.html
• Studying rates of molecular evolution
• Estimating divergence times

PAML http://abacus.gene.ucl.ac.uk/software/paml.html
• Estimate branch length
• Estimate parameters in evolutionary model:
  • transition/transversion rate ratio
  • the gamma parameter for variable substitution among sites
  • rate parameters for different genes
  • synonymous and nonsynonymous substitution rates
• Test evolutionary models
• Calculate substitution rates among sites
• Reconstruct ancestral sequences
• Simulate sequence evolution and phylogenetic reconstruction