Parsimony.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
Parsimony (and a little bit of the history of phylogenetics) Paul O. Lewis ~ Phylogenetics, Spring 2020 Cladistics 1950: Willi Hennig published Grundzüge einer Theorie der Phylogenetischen Systematik, which was later (1966) fly frog human translated into English as Phylogenetic Systematics. - - Emphasized that phylogenetic groups should be defined on vertebrae the basis of shared, derived characters (synapomorphies), not on the basis of shared, ancestral characters (symplesiomorphies) fern lily moss Although widely considered to be the father of cladistics, Hennig never described an algorithm based on the parsimony criterion, nor did he suggest what to do in the case of character conflict. ⑤vascular tissues Hennig, W. 1966. Phylogenetic systematics. University of Illinois Press, Urbana Photo: Frontispiece of Wiley, E. O. 1981. Phylogenetics: the theory and practice of Paul O. Lewis ~ Phylogenetics, Spring 2020 phylogenetic systematics. John Wiley & Sons, New York. Photo by George W. Byers, Univ. Kansas Changing terminology Oglabrous Otomentose glabrous tomentose tomentose tomentose Transformation series: Character: leaf pubescence leaf pubescence Characters: Character states: glabrous glabrous IEtomentose # tomentose Photo (tomentose): http://www.sbs.utexas.edu/bio406d/images/pics/eup/croton_capitatus.htm Paul O. Lewis ~ Phylogenetics, Spring 2020 Photo (glabrous): http://www.lhccrems.nsw.gov.au/CPR/CPR/glossary/glabrous.htm First computed phylogeny 1957: Charles Michener and o s Robert Sokal published the first phylogeny estimated using a computer (but not using parsimony). Robert Sokal, 1959 “...the computation of a 97 species X 97 species matrix of correlation coefficients presents serious technical difficulties. An operation of this magnitude cannot be reasonably undertaken without punched card or electronic computing equipment.”- - Michener, C. D., and R. R. Sokal. 1957. A Quantitative Approach to a Problem in Classification. Evolution 11:130-162 Paul O. Lewis ~ Phylogenetics, Spring 2020 Photo: Fig. 10.1, p. 124, in Felsenstein, J. 2004. Inferring phylogenies. Sinauer, Sunderland, MA. Photo courtesy of Robert Sokal. First parsimony analysis 1964: Anthony Edwards and Luca Cavalli-Sforza first used parsimony and likelihood for human blood group frequency data ×. “...we do not anticipate any =major difficulties in the corresponding computer programmes, although these €are not yet working.” This quote refers to the maximum likelihood approach: it would not be until 1973 that Joe Felsenstein identified the - I central problem (trying to estimate too many parameters) and produced a workable maximum likelihood method for continuous data. Edwards, A. W. F., and L. L. Cavalli-Sforza. 1964. Reconstruction of evolutionary trees. pp. 67-76 in Phenetic and phylogenetic classification, ed. V. H. Heywood and J. McNeill. Systematics Association Publ. No. 6, London. Photo: Fig. 10.3, p. 126, in Felsenstein, J. 2004. Inferring phylogenies. Sinauer, Sunderland, MA. Photo by Joe Anthony Edwards, 1963 Felsenstein. Paul O. Lewis ~ Phylogenetics, Spring 2020 Irreversible Parsimony 1965: Joseph FCamin and Robert Cladogram of fossil horses (Fig. 4 in Camin & Sokal, 1965) Sokal published the first parsimony - - method method using discrete character data Character states ordered from most ancestral to most derived, and reversals were not allowed “Comparison by Camin of these various schemes with the “truth” led him to the observation that those trees which most closely resembled the true cladistics invariably required for their construction the least number of postulated evolutionary ⇐steps for the characters studied. Subsequently we examined the possibility of reconstructing cladistics by the principle of evolutionary O parsimony.” Joseph Camin, mid-1970s Some representative “Caminalcules” Camin, J. H., and R. R. Sokal. 1965. A method for deducing branching sequences in phylogeny. Evolution 19:311-326 Paul O. Lewis ~ Phylogenetics, Spring 2020 Photo: Fig. 10.6, p. 129, in Felsenstein, J. 2004. Inferring phylogenies. Sinauer, Sunderland, MA. Photo courtesy of the Snow Entomological Division, Natural History Museum, University of Kansas. First molecular phylogeny 1966: Richard Eck and Margaret Dayhoff published the first molecular phylogeny using a numerical algorithm “In principle, for a family of closely related proteins, we seek the topology that has the minimum number of mutations.” Eck and Dayhoff method: • characters unordered (any amino acid can change to any other amino acid with 1 step) • starting tree by stepwise addition (parsimony score approximate) Margaret Dayhoff •= branch swapping (not explicitly defined) Eck, R. V., and M. O. Dayhoff. 1966. Atlas of protein sequence and structure. National Biomedical Research Foundation. Silver Spring, Maryland. Paul O. Lewis ~ Phylogenetics, Spring 2020 Photo: http://www.dayhoff.cc/MODayhoff_Contrib_Science.html Wagner’s Ground State 0 = presumed ancestral state, State 1 = presumed derived state Plan Divergence - ToCharacter Neocteniza Actinopus Missulena Warren “Herb” Wagner 1 1 0 0 is well-known for his 2 1 0 0 work on ferns as well 3 1 0 0 as his “ground plan divergence” method for 4 1 0 0 constructing 5 1 0 0 phylogenies 6 1 0 0 7 1 0 0 8 0 1 1 9 0 1 1 10 0 1 1 11 0 1 1 ÷ 12 0 1 1 13 0 1 0 14 0 1 0 15 0 1 1 . http://um2017.org/faculty-history/faculty/warren- ÷. 16 0 1 0 * 17 0 1 0 Wagner ground plan divergence diagram (Fig. 18 0 0 1 5.30, p. 177, in Wiley, 1981) 19 0 0 1 Photo of Wagner: Photo of Wagner: %E2%80%9Cherb%E2%80%9D-wagner total O7 O10 O8 Paul O. Lewis ~ Phylogenetics, Spring 2020 Ordered (Wagner) parsimony - - 1969: Arnold Kluge and Steve Farris produce software for computing Wagner parsimony trees (and introduce CI) Arnold Kluge Steve. Farris Kluge, A. G., and J. S. Farris. 1969. Quantitative phyletics and the evolution of anurans. Systematic Zoology 18:1-32 Phylogeny of anuran (frogs and toads) families Wiley, E. O. 1981. Phylogenetics: the theory and practice of phylogenetic (Fig. 5, p. 14, in Kluge and Farris, 1969) systematics. John Wiley & Sons, New York. Photo of Farris: http://systbio.org.whsites.net/?q=taxonomy/term/2&from=30 Photo of Kluge: http://um2017.org/faculty-history/faculty/arnold-g-kluge Paul O. Lewis ~ Phylogenetics, Spring 2020 Ordered states whorled alternate 2 steps 00-0opposite 1 step 1 step Paul O. Lewis ~ Phylogenetics, Spring 2020 10 Unordered (Fitch)- parsimony - 1971: Walter Fitch described his algorithm for computing tree length for parsimony with four unordered states (e.g., DNA or RNA data) Nearly all parsimony analyses of nucleotide data have used this Walter Fitch, 1975 algorithm Fitch, W. 1971. Toward Defining the Course of Evolution: Minimum Change for a Specific Tree Topology. Systematic Zoology 20:406-416 Paul O. Lewis ~ Phylogenetics, Spring 2020 Photo: Fig. 10.7, p. 131, in Felsenstein, J. 2004. Inferring phylogenies. Sinauer, Sunderland, MA. Photo courtesy of Walter Fitch. Sequence Data parsimonyinformahve parsimonffrmatie Taxa Characters# (=sites in this case) u t E Agave GGGCTGGGC-CGGCCGGTCCGC-CTCT... Bougainvillia GGGGTGGGT-CGACCGGTCCGC-CTTG... Ceratophyllum GGGTTGGGC-CGGTCGGTCOGTCACTCT... Digitalis GGGTTGGGT-CGGCCGGTCCGC-CTTT... Epilobium GGGTTGGGT-CGACCGGTCCGC-CTTA... Fagus GGGTTGGGC-AGAGCGGTCCGC-CCCT... Galax GGGTTGG?C-CG?CCGGTCCGC-CTAG... l A- Constant variable site sites constant site parsimony informative site Paul O. Lewis ~ Phylogenetics, Spring 2020 variable site parsimony uninformative site Discrete Morphology Taxa Characters P._fimbriata 000000001101010110?01... P._robusta 000000001101010110001... P._articulata 100111010110000001100... P._parksii 000111012110000001000... P._americana 100111010111100001000... P._myriophylla 100111010111000001000... P._macrophylla 110111012100000000000... P._polygama 110111013100100000000... P._gracilis 101111013000001000110... P._ciliata 001110112000001000110... P._basiramia 001110112000001000110... binary (2-state) site Paul O. Lewis ~ Phylogenetics, Spring 2020 multistate site Parsimony 123456789... Taxon1 CGACCAGGT... Taxon2 CGACCAGGT... Taxon3 CGGTCCGGT... Taxon4 CGGCCTGGT... Same tree but with data One of the from site 6 inserted three possible in place of taxon Taxon1 unrooted Taxon3 A C names trees 8Taxon2 Taxon4 A T Paul O. Lewis ~ Phylogenetics, Spring 2020 ) "Standard" Parsimony A C A C A C A C A A A C A G A T :iIA T A T A . T A T A C A C A C A C [C A 'C C 'C G 'C T A 5 T A 3 T A 5 T A 4 T A C A C A C A C GG A GG C GG G GG T A T A T A T A T A C A C A C A C TT A TT C TT G TT T A T A T A T A T Paul O. Lewis ~ Phylogenetics, Spring 2020 Important things to note about that last slide: • Two (2) steps was the minimum – no way to explain the observed data with just 1 evolutionary change • More than one way to assign ancestral character states to get 2 steps – one interior node must have A but the other interior node can have anything except G • Enumerating all possible combinations of ancestral states is not the most efficient way to determine the number of steps – more on this later Paul O. Lewis ~ Phylogenetics, Spring 2020 Parsimony Steps d Taxon1 Taxon3 123456789... ' Taxon1 CGACCAGGT... 4 µA 6 Taxon2 CGACCAGGT... 36 • Taxon3 CGGTCCGGT... Taxon2 Taxon4 Taxon4 CGGCCTGGT... Steps _________...001 I 02008 Let's call this tree 1: (1,2,(3,4)) Tree 1's length for first 9 sites = ⑧? 44 Paul O. Lewis ~ Phylogenetics, Spring 2020 Parsimony Steps 123456789... Taxon1 Taxon2 Taxon1 CGACCAGGT... Taxon2 CGACCAGGT... < Taxon3 CGGTCCGGT... 4=3 3% Taxon4 CGGCCTGGT... → 6 Taxon3 Taxon4 Steps _________...002A 02000 Tree 2: (1,3,(2,4)) Tree 2's length for first 9 sites = ?• 5 Paul O. Lewis ~ Phylogenetics, Spring 2020 Parsimony Steps 123456789... Taxon1 Taxon2 Taxon1 CGACCAGGT... Taxon2 CGACCAGGT... 3 4 - r Taxon3 CGGTCCGGT... ' G 643 Taxon4 CGGCCTGGT... Taxon4 Taxon3 Steps 002102000_________... Tree 3: (1,4,(2,3)) Tree 3's length for first 9 sites = •? 5 Paul O. Lewis ~ Phylogenetics, Spring 2020 Parsimony (using only 9 sites) Taxon1 Taxon3 Taxon1 Taxon2 Taxon1 Taxon2 Taxon2 Taxon4 Taxon3 Taxon4 Taxon4 Taxon3 4 steps 5 steps 5 steps most This is the simplest explanation of the data for parsimonious the first 9 sites according to the parsimony criterion.