Quick viewing(Text Mode)

Methods and Principles of Biological Systematics

Methods and Principles of Biological Systematics

2 Methods and Principles of Biological

iological systematics is literally “the study of biological B systems.” While this originally focused on classification, the field has broadened into discovery of phylogeny and use of phylogenies to understand the processes underlying diversification. In this broad definition of systematics, produc- ing a classification and assigning names to groups (a field also known as ) is only one component of systematics. This chapter is divided into two sections. The first outlines how one goes about determining the history of a group, and the second then discusses how the history can be used to uncover patterns of diversification and to construct a classification. This field has become so large that it can stand as a course by itself, so the presentation here is necessarily brief and introductory. Many of the topics in this chapter are discussed at greater length in books by Baum and Smith (2012) and Stuessy et al. (2014), which should be consulted for additional details, references, and examples.

Discovering Phylogeny: How Phylogenetic Are Constructed As described in Chapter 1, evolution is not simply descent with modification, but also involves the separation of lineages. This process can be visualized with diagrams such as those in Figures 1.3 and 1.4. We can also think of phylogeny as the history of DNA molecules. DNA replicates in a semi-conservative fashion to produce two daughter molecules, each of which replicates again to produce its own two daughter molecules. This process has been going on in an unbroken chain since the beginning of life on Earth. Thus each DNA molecule present today must have come from an ancestral DNA molecule in an earlier generation. By trac- ing the history of DNA molecules, we can uncover the history of life. evolutionary Trees and What They Depict: reading the The DNA replication process can be diagrammed as a tree (Figure 2.1). The tree shows ancestor-descendant relationships. Descendants 1 and 2 share with each other a more recent common ancestor (labeled “1-2” on the figure) than they do with descendants 3 and 4, and so 1 and 2 are interpreted as being more closely related to each other than either of them is to 3 and 4. Likewise, descendants 1

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 14 Chapter 2

1-16 (ancestral molecule)

1-8 9-16

1-4 5-8 9-12 13-16 13-16 Time

15-16 13-14 1-2 3-4 5-6 7-8 9-10 11-12 13-14 15-16

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 15 16 13 14

The order in which terminals are displayed along this axis has no meaning.

Figure 2.1 The history of life on Earth is the history of the 16 can be traced backward to the common ancestor. DNA molecules. In this diagram, an ancestral molecule rep- In the inset, the branches leading to 15-16 and 13-14 are licated to give rise two descendants, each of which gave rotated, but the ancestor–descendant history is still the rise to two more descendants, and so on. This diagram same. The horizontal axis in this diagram has no meaning. stops at 16 descendant molecules. The history of each of through 4 are more closely related to each other (i.e., pattern of ancestor-descendant relationships would re- share a more recent common ancestor) than they are to main the same. In addition, the shape, or topology, of 5 through 8. the tree is determined only by the connections between DNA replication can thus be represented easily as a the branches. As long as the connections between the branching diagram. It is common to trace the history of branches are unchanged, any branches can be rotated one gene (a piece of DNA) and from that to infer the his- around the nodes; the inset in Figure 2.1 depicts exactly tory of closely linked genes. From the history of genomes the same history as is shown in the main tree, but some we can then infer the history of organisms that bear them, of the branches have been rotated. and from the history of individual organisms we can infer The evolutionary “story” thus can be described by the Juddhistory 4e of species, genera, and families. It is worth starting at any branch tip in the tree and working toward keepingSinauer in Associates mind that a diagram of a the other branch tips. This means that the terms higher Morales Studio mayJudd4e_02.01.ai be labeled to make 09-04-15 it look like the history of species, and lower have no meaning but simply reflect how we but in fact it is an inference about species extrapolated from have chosen to draw the evolutionary tree. For example, data on DNA sequences of individual genes, genomes, some textbooks consider the Asteraceae an “advanced” and cells. For example, because the large subunit of ri- family and Magnoliaceae and Nymphaeaceae “primi- bulose 1,5-bisphosphate carboxylase/oxygenase (rubisco) tive” families. However, because the families have had is encoded in the chloroplast and we can sequence the exactly the same amount of time to evolve (i.e., the DNA large subunit of rubisco, we can infer the history of the has had exactly the same number of years over which to entire chloroplast genome. Before DNA data were easily accumulate mutations), each is a complex mix of ancestral available, such inferences were made from morphological and derived characteristics, meaning that no extant characters (see Box 2A). family is more primitive or more advanced than the other. The history of DNA molecules and thus the phylogeny Drawing a phylogenetic tree using DNA molecules as is not changed by the orientation of the tree. For example, in Figure 2.1 is cumbersome, so we could replace the DNA the tree in Figure 2.1 could be drawn with the most re- molecules with ovals, as in Figure 2.2A, or we could leave cent molecules at the bottom, top, right, or left, but the out the ovals but still keep the labels for the ancestors,

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 15

Figure 2.2 A portion of the (A) (B) (C) same tree as in Figure 2.1, but 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 with the ancestral DNA mol- ecules replaced with ovals (A), with the ovals removed but ancestors still labeled (B), and 1-2 3-4 5-6 7-8 1-2 3-4 5-6 7-8 with the ancestors unlabeled and just represented by the points of intersection of the branches (C). 1-4 5-8 1-4 5-8

1-8 1-8 as in Figure 2.2B. However, it is most common to draw Sources of DNA Sequence Data phylogenetic trees without DNA molecules, ovals, or even Sequencing determines the precise order of the nucleo- labeled ancestors. Instead, the ancestors are simply rep- tides A, C, G, and T in a stretch of DNA. The DNA se- resented by the nodes, or points where the branches join quences from multiple organisms can then be aligned, (Figure 2.2C). This is sometimes confusing for students and mutations can be detected by noting points at which because the lines are now more prominent visually than sequences differ between two organisms. DNA sequence the nodes, but the pattern of inference is still the same. data are currently being generated in two ways: (1) a The history is traced backward from the terminals (cur- gene-by-gene approach, in which a gene of interest is rent DNA molecules) to successively older nodes, which chosen, isolated from a large number of organisms, and represent the common ancestors. sequenced; and (2) a genomic approach, in which an en- In the examples shown in Figures 1.3, 1.4, 2.1, and tire chloroplast or nuclear genome is sequenced and the 2.2, we described evolution as though we were watch- sequences of many genes from the genome are analyzed. ing it happen. This is rarely possible, of course, so part The plant cell contains three different genomes: chlo- of the challenge of systematics is that we must infer what roplast, mitochondrial, and nuclear (Table 2.1). Not only went on in the past. The first step in making such infer- are these found in three locations in the cell, but they also ences is to identify characteristics of organisms that are differ in structure and gene content. Systematists use data heritable. A heritable character is any aspect of the plant from all three. The chloroplast and mitochondrion, and so that can be passed down genetically through evolutionary also their genomes, are generally inherited uniparentally time and still be recognizable. The most commonly used (usually maternally in angiosperms); the nuclear genome heritable characters are those of the DNA itself—the four is biparental. The three genomes differ dramatically in bases: adenine (A), cytosine (C), guanineJudd 4e (G), and thy- size, with the nuclear being by far the largest—measured mine (T). However, morphological charactersSinauer Associates can also be in megabases of DNA. The mitochondrial genome in- used. A ’s petal color,Morales Studio struc- cludes several hundred kilobase pairs (kbp) of DNA (200– ture, and habit (general growth pattern)Judd4e_02.02.ai are all known 09-04-15 to 2500 kbp), which makes it small relative to the nuclear be under genetic control, and these characters are gener- genome but large relative to the mitochondrial genomes ally stably inherited from one generation to the next. Each of animals (which tend to be about 16 kbp). The chloro- character is then divided into character states, which are plast genome is the smallest of the three plant genomes, used in a way similar to DNA mutations to track evolu- in most ranging from 135 to 160 kbp. tionary history. Many examples of such heritable charac- Like the bacteria from which they are derived, mito- ters are described in Chapter 4. chondria and chloroplasts have circular genomes. Large regions of noncoding DNA separate the genes in the mi- TABLe 2.1 tochondrial genome, and their order in the genome is vari- Comparison of the three genomes in a plant cell able; in fact, their order chang- Genome Genome size (kbp) Inheritance es so easily and frequently that many rearranged forms Chloroplast 120–160 Generally maternal (from the seed parent) can occur even within the Mitochondrion 200 –2740 Generally maternal (from the seed parent) same cell. Rearrangements of Nucleus 1.1 × 106 to 1.1 × 1011 Biparental the mitochondrial genome oc- cur so often within individual

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 16 Chapter 2

Rubisco subunit Photosystem Cytochrome-related ATP synthase NADH dehydrogenase Ribosomal protein subunit LSC Ribosomal RNA Plastid-encoded RNA polymerase Other Unknown function Vitis vinifera 160,928 bp Transfer RNA Intron

IRb IRa Figure 2.3 Diagram of the chloroplast SSC genome of grape (Vitis vinifera), showing the locations of some major genes and the inverted repeat (IR), large single-copy (LSC), and small single-copy (SSC) regions. (From Jansen et al. 2006.)

isms and reversals will accumulate to the point that all phylogenetic information is lost; the history of the se- quence will be obliterated. The latter problem is particu- plants that they do not characterize or differentiate species larly acute in work with noncoding sequences or remotely or groups of species and are thus not especially useful for related taxa. inferring relationships. The chloroplast genome, in con- Genes accumulate mutations at different rates, in part trast, is largely stable, both within cells and within species. because gene products ( and RNAs) differ in The most obvious feature of the chloroplast genome is the how many changes they can tolerate and still function. presence of two regions that encode the same genes, but in Histones, for example, are proteins that generally cease opposite directions; these are known as inverted repeats. to work if many of their amino acids are replaced with Between them are a small single-copy region and a large different ones. The internal transcribed spacer (ITS) of ri- single-copy region (Figure 2.3). bosomal RNA, on the other hand, can still fold properly even if many of its nucleotides are changed. Thus genes Judd 4eGene-by-Gene Approaches for histones do not accumulate mutations rapidly, whereas Sinauer AssociatesThe gene-by-gene approach is rapidly becoming of only genes for the ITS do, reflecting the different functional Morales Studiohistorical interest, but we discuss it briefly here because it constraints on their gene products. In general, chloroplast Judd4e_02.03.ai 08-12-15 is still in use. For several decades, systematists have used genes tend to accumulate mutations more rapidly than do the polymerase chain reaction (PCR) to amplify one gene mitochondrial genes in plants. It is more difficult to gen- at a time. They have then used this single gene as the ba- eralize about nuclear genes, which is hardly surprising, sis for inferring the history of the cell and organism that because there are so many of them. bear it. The frequency of mutation of a gene determines When systematists first began using DNA characters, how useful it is for addressing particular phylogenetic attention focused on chloroplast genes because of the problems. In general, a rapidly mutating gene is needed ease of amplification and alignment and because many to assess relationships among closely related populations of them exhibit an appropriate level of variation. Some of or species. Genes that mutate more slowly can be helpful the earliest insights from DNA data came from sequences in studies of older groups; however, if a gene is changing of the gene for the large subunit of rubisco, mentioned slowly, it will be difficult to find mutations from which above. This is still used from time to time but has been a phylogeny can be constructed. At a very low mutation augmented with data from many other chloroplast genes. rate, the level of variation will approach the expected level To track the history of the nuclear genome, many people of sequencing error, and inferences will become unreli- used, and still use, the ITS. This is also easy to amplify able. Conversely, if a gene is changing too fast, parallel- because of its high copy number.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 17

Both chloroplast genes and the ITS have their draw- DNA data are widely available and can be obtained backs. Because the chloroplast is maternally inherited, it quickly and cheaply if necessary. Sequences of both in- will provide only part of the history of a group if there dividual genes and whole genomes are available from has been a history of hybridization and/or allopolyploidy. GenBank, at the National Center for Biotechnology In- (Allopolyploidy refers to the doubling of the nuclear ge- formation (www.ncbi.nlm.nih.gov/genbank/), which is nome following hybridization. See Chapter 4 for discus- the repository of a vast amount of publicly accessible data. sion.) The high copy number of the ITS creates problems Nuclear genomes are also becoming far more accessible for data analysis because the thousands of copies in any for analysis because use of next-gen approaches is much cell are not all identical. In many studies, the ITS is treated less expensive than amplifying and sequencing one gene as though it is a single marker, but in fact it has a complex at a time. Entire nuclear genomes are being produced at a pattern of evolution that is not easily interpreted. Thus it rapid rate and are available for search and download from can provide an inaccurate picture of phylogenetic history. a variety of websites, including GenBank and Phytozome Nuclear genes have been used less frequently for sev- (http://phytozome.jgi.doe.gov/pz/portal.html), which is eral reasons. First, they vary enough that new sets of PCR supported by the U.S. Department of Energy. While the primers must be designed for each gene in each taxo- taxonomic representation of entire nuclear genomes is still nomic group studied. In addition, many nuclear genes are too sparse for addressing many systematic questions, they duplicated or exist as part of a small set of genes (a gene often provide useful sources of sequences for preliminary family). Because of this, some preliminary work is often analyses or for using as taxa. required to be sure that all the sequences used are from Other methods of sampling nuclear genomes are be- genes that are related by descent (orthologous genes) coming common. One of these is to sequence just the and not simply recent duplicates (paralogous genes). For- transcribed genes, or transcriptomes. A second approach tunately, population genetic theory suggests that allelic is to cut the nuclear genome into small pieces using re- variation should not be misleading in studies of closely striction enzymes and then to sequence just the ends of related species, because alleles in one species should be the fragments, an approach variously known as RAD- more closely related to one another than they are to al- seq (restriction site–associated DNA sequencing) or GBS leles in other species. Data on many nuclear genes sup- (genotyping by sequencing). A third possibility is to use port this expectation. methods that enrich the pool of sequences for particular nuclear genes of interest. Genomic Approaches It is now possible to sequence entire chloroplast genomes Sequence Alignment rapidly and cheaply, so phylogenetic studies are moving Sequence alignment is a critical step in phylogeny recon- rapidly beyond focusing on single genes and instead use struction. Alignment is the stage at which the scientist hundreds of genes. Sequencing technology is improving makes the initial assessment of similarity of nucleotide and changing at a remarkable rate, such that genomic sites. It is by far the most difficult part of using sequence studies are rapidly superseding studies of individual genes, data, and it is hard to automate. A poor alignment will and the techniques of molecular systematics are becoming lead to a meaningless phylogenetic tree. those of genomics. Advances in sequencing methods are In a sequence alignment, the columns correspond to such that we did not attempt to describe here the range nucleotides at homologous positions, and the rows corre- of technologies in the category of next generation (“next- spond to homologous sequences of DNA (see Figure 2.4). gen”) sequencing. These are developing and disappearing An alignment allows us to track mutations in the DNA so rapidly that whatever we wrote would have been obso- over evolutionary time. Although DNA is usually copied lete before this book was published. For an introduction to faithfully through many rounds of cell division, the cellu- current sequencing technology, see van Dijk et al. (2014). lar machinery makes occasional mistakes—mutations— Genome architecture has been used for many years to that get transmitted to subsequent generations. One com- track phylogenetic relationships. Rearrangements of the mon mutation is substitution of one base pair for another, chloroplast genome are rare enough in evolution that they also known as a point mutation. Two other common mu- can be used to demarcate major groups. For example, ge- tations are insertion and deletion, both of which result nome mapping has shown that one of the inverted repeats in sequences of two different lengths. Because we often has been lost independently in a of papilionoid le- cannot tell whether the length change is an insertion in gumes, in all conifers, and in Euglena (a flagellated photo- one sequence or a deletion in the other one, such length synthetic eukaryote unrelated to green plants). In another changes are usually called indels (insertion or deletion). example, in all Asteraceae except subfamily Barnadesioi- Other sorts of mutations include changes in the order of deae, a portion of the large single-copy region is inverted the bases, such as inversion. relative to other angiosperms (Jansen and Palmer 1987). Figure 2.4A shows two sequences that are similar at This provides evidence that Barnadesioideae is sister to all all sites except the position near the middle, a G in se- other Asteraceae. quence 1 and an A in sequence 2. Figure 2.4B shows an-

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 18 Chapter 2

(A) score the gap? One possibility would be for the gap not Sequence 1 ACCCGAACCAATCCATGTCGCTCTC T C to count at all. However, if that were the case, then there Sequence 2 ACCCGAACCAATCCATATCGCTCTC T C would be no way to distinguish between the alignment in Figure 2.4C and that in Figure 2.4D. In fact, if gaps are (B) “free,” then you can align anything with anything just by Sequence 1 ACCCGAACCAAT CCATGT CGC T C T CT C - inserting enough gaps. To prevent this nonsensical result, Sequence 2 - ACCCGAACCAAT CCATAT CGC T C T C TC a number called a gap penalty is subtracted from the total score. The alignment in Figure 2.4C would have 22 match- (C) es, 2 mismatches, and three gaps. If we decide that we Sequence 3 ACCCGAAC - - - T CCATGT CGC T C T CAC should subtract 3 points from the score for every position Sequence 4 ACCCGAACCAAT CCATAT CGC T C T C T C with a gap (i.e., a gap penalty of 3), then the total score for the alignment in Figure 2.4C would be 11 (i.e., 22 – 2 – 9), (D) whereas the one in Figure 2.4D would have 22 matches, no Sequence 3 ACCCGAAC - - - T CCATG - T CGC T C T CA-C mismatches, and 7 gaps, for a total score of 1. Sequence 4 ACCCGAACCAAT CCAT - AT CGC T C T C -TC The alignment scoring process builds in assumptions about the underlying mutational process. For most re- gions of most genomes, we assume that the probability of Figure 2.4 (A) Two DNA sequences, each 27 base pairs a base pair substitution is greater than the probability of long, aligned so that, for the most part, the bases are the an insertion or deletion. Therefore, the gap penalty is usu- same in any one column. This alignment shows a single ally assumed to have a larger absolute value (i.e., a larger difference, in red—a G-A substitution. (B) An alignment number is subtracted from the alignment score) than the of the two DNA sequences from (A) but with the bottom cost for a mismatch. Likewise, it is common to assume sequence shifted one base to the right. This alignment that a transition (substitution of a purine for a purine, or a postulates 20 differences between the sequences, in addi- pyrimidine for a pyrimidine) is easier and therefore more tion to the gaps at either end. (C) Two DNA sequences likely than a transversion (substitution of a purine for a that differ by two substitutions. One sequence is also three pyrimidine or vice versa). base pairs longer than the other. (D) Another alignment of the two DNA sequences in (C) but one that postulates four Many computer programs will produce alignments. gaps in addition to the three gaps identified in (C). For sequences from closely related plants, alignment may not be a serious problem. However, as more distantly related plants are compared, the sequences accumulate length mutations as well as nucleotide substitutions, and alignment becomes more difficult. For genes encoding other alignment of the same two sequences. Intuitively, RNAs, alignment sometimes can be guided by models you can see that the second alignment is not as “good” as of the secondary structure of the gene product (the way the first one, because fewer base pairs match up. How do the molecule folds). In this case the secondary structure is youJudd tell 4e a computer program to find the alignment with used as a template and the sequence is mapped on it. This theSinauer highest Associates possible number of matches? In a highly in- method ensures that the proposed alignments maintain fluentialMorales Studio paper in 1970, Needleman and Wunsch came up the structure of the molecule. (Methods for inferring sec- Judd4e_02.04.ai 09-22-15 with a scoring scheme and an algorithm that would allow ondary structure, however, have their own limitations.) a computer to find the best scoring alignment. They gave In protein-coding genes, alignments must consider a certain number of points to a match and subtracted the structure of the protein. The DNA sequence of such points for a mismatch. For example, if we give 1 point for genes is read in groups of three bases, or codons, with every match in Figure 2.4A (26 matches) and subtract 1 each codon specifying a particular amino acid. Most com- point for every mismatch (1 in this case), this alignment monly, insertions or deletions occur in sets of three bases will have a score of 25, whereas the alignment in Figure as well, corresponding to the gain or loss of an amino 2.4B will have a score of –14 (6 matches minus 20 mis- acid. Alignments need to incorporate this fact. Addition matches), not counting the gaps at either end. Applica- or subtraction of a single base (rather than a set of three) tion of the algorithm would indicate that the alignment will change the entire structure of the protein encoded in Figure 2.4A is preferred because it has a higher align- by the sequence by changing the start point of the sub- ment score than the one in Figure 2.4B. While there have sequent codons (the reading frame). For example, the been many changes to the original algorithm of Needle- sequence AAATTGACTTAC codes for the four amino man and Wunsch, the basic idea of alignment scoring acids lysine-leucine-threonine-tyrosine (K-L-T-Y, using has remained the same. their one-letter abbreviations, as in Figure 2.5). If a single One problem comes up when sequences are not of iden- base were lost from the first lysine codon—leaving, for tical lengths. For example, in Figure 2.4C, the sequences example, AATTGACTTAC—the protein would change to have 2 mismatches and a 3-base-pair gap. How do you an asparagine followed by a stop codon. The stop codon

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 19

Figure 2.5 Alignment of part of 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 the sequence for the large subunit Taxa G G A T T T A A A G C T G G T G T T A A G G A T of rubisco. Genera 2 through 12 are 1 Joinvillea grasses (); Joinvillea is an G F K A G V K D G G A T T T C A A G C T G G T G T T A A G G A T outgroup (family ). 2 Aristida Below each row of nucleotides is G F Q A G V K D G G A T T T C A A G C T G G T G T T A A G G A T the amino acid translation, with the 3 Stipagrostis G F Q A G V K D amino acid name abbreviated by a G G A T T T A A A G C T G G T G T T A A G G A T 4 Amphipogon standard one-letter code. The por- G F K A G V K D tion of the rubisco protein in all taxa G G A T T T A A A G C T G G T G T T A A G G A T 5 Arundo shown here starts with a glycine (G) G F K A G V K D G G A T T T A A A G C C G G T G T T A A G G A T followed by a phenylalanine (F). Varia- 6 Molinia tion at a given nucleotide position G F K A G V K D G G A T T T A A A G C C G G T G T T A A G G A T may or may not change the resulting 7 Phragmites protein. Note that an alignment is G F K A G V K D G G A T T T A A A G C T G G T G T T A A A G A T actually a character × matrix, 8 Danthonia G F K A G V K D similar to that in Table 2.3. (Alignment G G A T T T A A A G C T G G T G T T A A G G A T 9 Thysanolaena displayed in MacClade 4.0; Maddison G F K A G V K D and Maddison 2005.) G G A T T T A A A G C T G G T G T T A A G G A T 10 Gynerium G F K A G V K D 11 Eragrostis G G A T T T C A A G C C G G T G T T A A A G A T would prevent the remainder of the G F Q A G V K D protein from being synthesized. 12 Enneapogon G G A T T T C A A G C T G G T G T T A A A G A T Nucleotide substitutions may G F Q A G V K D or may not affect the protein pro- duced. For example, at position 10 in the alignment shown in Figure 2.5, some taxa have imaginary plants and have aligned them (Figure 2.6A). an A and some have a C. This variation in the character This particular alignment shows that most of the plants state has an effect on the amino acid produced at that po- have the same nucleotide at positions 1–7, 9, 10, 12–15, 17, sition; Aristida, Stipagrostis, Eragrostis, and Enneapogon all and 18, but they differ at positions 8, 11, and 16 (high- have glutamine (Q) at that position in rubisco, whereas the lighted in green). These three variable positions then other taxa have lysine (K) there. In contrast, the variation provide the information that allows us to begin to define at position 24 (G or A) does not change the protein, be- groups and infer the evolutionary history. cause both AAG and AAA encode lysine. Note that both The matrix in Figure 2.6A allows us to infer that at some position 10 and position 24 provide potentially useful phy- time in the past there were mutations at three positions. logenetic characters, even though only one has a biological However, the matrix does not provide the direction of the effect on the protein. mutation. For example, the ancestral nucleotide at position For this discussion, assume that we have produced 8 could have been a G that later mutated to a T, or the T DNA sequences for a particular gene from four groups of could have been ancestral and later mutated to a G. The matrix can also be represented as a network (Figure 2.6B) Judd 4e showing the relative positions of the three mutations. In (A) Sinauer Associates Figure 2.6B, all plants designated by the same shape are Position 1 4 In house7 10 13 16 Judd4e_02.05.ai 09-23-15 drawn as though they arose at the same time. This ar- Star plants T C C T G A C G C G G A G G T G A T rangement generally indicates ambiguity; for the purposes of this simplified example, we have not provided any infor- Circle plants T C C T G A C G C G G A G G T C A T mation on their order of evolutionary origin. We define the length of the network by the number of mutations it rep- Square plants T C C T G A C T C G G A G G T C A T resents, in this case three. Even though the network looks Diamond plants T C C T G A C T C G A A G G T C A T somewhat like a time line, it is not: it could be read from left to right, from right to left, or perhaps from the middle out-

(B) Figure 2.6 (A) An alignment of imaginary DNA G C G T G A sequences from four imaginary groups of plants. Position 16 Position 8 Position 11 Positions in the alignment are numbered across the top of the matrix. (B) An unrooted network based on the DNA data from the alignment.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.06.ai 09-04-15 20 Chapter 2

Figure 2.7 Three possible (A) (B) (C) rootings of the network in Figure Position 16 2.6B. Note that in each case the C G Position 11 number of evolutionary steps Position 11 G A G A (character state changes) is the Position 8 same as in the unrooted net- T G work. In (C), you cannot deter- mine the branch along which Position 8 Position 8 position 8 changed, so the bar G T G T marking the change in position 8 Position 11 is shown in two different places, A G in a lighter color.

Position 16 G C ward. To turn the network into an evolutionary tree, we must determine which changes are relatively more recent and which occurred further in the past. In other words, the tree must be rooted. Rooting polarizes the character changes, giving them a specific direction. Position 8 Rooting T G Rooting a phylogenetic tree is critical for interpreting how Position 16 plants evolved, and different C G rootings suggest different pat- terns of change (different char- acter polarizations). If you imagine that the network is a determining when lineages diverged from one another piece of string, you can keep the connections exactly the (i.e., when taxa originated). When taxa die out is interest- same, even when you pull down a root in different places. ing to know, but that fact by itself does not help in deter- The network from Figure 2.6B is redrawn in Figure 2.7 and mining their origins. Fossils are, of course, extremely use- rooted in three different places. Notice that the length of ful when included as additional taxa in a phylogeny. They each tree (or ) is the same as the length of the often have combinations of character states that no longer original network—three mutational steps—and that all the occur in extant taxa and can affect the overall structure of connections are the same but that the order in which the the tree, sometimes in surprising and informative ways. character-change events occur differs considerably. In general, evolutionary trees are rooted using a rela- For example, in the rooting shown in Figure 2.7A, the tive of the group under study: an outgroup. When select- ancestral plants had G at position 16, position 8, and posi- ing an outgroup, one must assume only that all ingroup tion 11. In contrast, Figure 2.7B suggests that the ancestral members (members of the group under study) are more plants had A at position 11, T at position 8, and C at posi- closely related to one another than to the outgroup; in tion 16. In Figure 2.7C, the tree is rooted in such a way that other words, the outgroup must have separated from the the ancestor must have had C at position 16 and G at posi- ingroup before the ingroup diversified. Often tion 11. However, notice that we can’t tell what the base several outgroups are used. If an outgroup is added to a was at position 8. It could have been a G and then mutated network, the point at which it attaches is determined as to a T in the ancestor of the squares + diamonds, or it could the root of the tree. Judd 4e have been a T that mutated to a G inSinauer the Associatesancestor of the In Figure 2.8A, an imaginary outgroup, a yellow poly- circles + stars. Morales Studio gon, is added to the matrix from Figure 2.6A. For the par- How is the position of the root determined?Judd4e_02.07.ai One 10-06-15 com- ticular gene sequence in the matrix, the yellow polygon mon suggestion is to use fossils, but this approach is prob- has the same set of nucleotides as the star plants. With lematic. The fact that an extinct plant has been fossilized this information, the polygon can be added to the network does not mean that its lineage originated earlier than those as an outgroup, as in Figure 2.8B. Because the polygon at- of plants now living; we know only that it died out earlier. taches among the star species, the tree can be rooted and In determining evolutionary history, we are interested in redrawn as in Figure 2.8C. This tree corresponds to the

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 21

(A) (C) Position 1 4 7 10 13 16 Position 11 Star plants T C C T G A C G C G G A G G T G A T G A

Circle plants T C C T G A C G C G G A G G T C A T

Square plants T C C T G A C T C G G A G G T C A T Position 8 G T Diamond plants T C C T G A C T C G A A G G T C A T

Polygon plants T C C T G A C G C G G A G G T G A T (outgroup) Position 16 G C (B)

G C G T G A Position 16 Position 8 Position 11

Figure 2.8 (A) The alignment from Figure 2.6A but with char- acter states added for an imaginary outgroup (polygon). (B) The unrooted network from Figure 2.6B but with the polygon attached according to the character states in (A). (C) The network of Figure 2.6B rooted with the polygon. Note that the evolution- ary history is now the same as in Figure 2.7A.

rooted tree in Figure 2.7A and strengthens the hypothesis that Figure 2.7A accurately reflects evolutionary history. With this tree we can illustrate the same point made in one that includes a common ancestor and some, but not Figure 2.1—the tree can be drawn in different ways and all, of its descendants. still reflect the same evolutionary history. Comparing Fig- A character state that is derived at one time may be- ures 2.9A and B with Figure 2.8C shows that we can ro- come ancestral later. In Figure 2.8C, C at position 16 is a tate the branches of the tree around any one of the nodes shared derived character of a large group. It is a synapo- without affecting the inferred order of events. morphy and indicates of the circles + squares With a rooted tree (and only with a rooted tree), we can + diamonds. For the clade of squares + diamonds, though, determine which groups are monophyletic (made up of C at position 16 is an ancestral, or plesiomorphic, char- an ancestor and all of its descendants) and form a clade. In acter: a character that all of the species in the group in- the example laid out in Figure 2.8C, the diamond plants herited from their common ancestor and thus indicates are monophyletic. The evidence for their monophyly is nothing about their relationships with one another. Ple- provided by the shared derived (synapomorphic) A at po- siomorphic character states cannot show evolutionary sition 11. But notice that the interpretation of monophyly relationships in the group being studied because they dependsJudd 4e on rooting. If Figure 2.7B were the correct root- evolved earlier than any of the taxa being compared and Sinauer Associates ing,Morales then Studio A at position 11, T at 8, and C at 16 would be have merely been retained in the group’s various lineages. theJudd4e_02.08.ai ancestral character 09-21-15 states (usually called symplesio- morphies, or shared ancestral states) rather than derived Homoplasy synapomorphies. In Figure 2.7B, the species indicated by As the preceding discussions have shown, determining diamonds and squares do not share any derived character the evolutionary history of a group of organisms fol- and do not include all the descendants of their common lows a series of standard steps. First, DNA from multiple ancestor; some of those descendants went on to become plants is sequenced and recorded as an alignment (see the circle and the star plants. Therefore, if Figure 2.7B were Figure 2.6A). Second, a branching network is constructed correct, the diamond plus square species would not be a (see Figure 2.6B). Third, by inclusion of an outgroup, the monophyletic group (as they are with the rooting in Figure network can be rooted to produce an evolutionary tree, 2.7A). Instead, they would constitute a paraphyletic group, cladogram, or phylogeny (see Figure 2.8C).

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 22 Chapter 2

(A) (B) 2.10B. Counting the number of changes on this network (its length), we find five: one each at positions 3, 11, and 16 and two at position 8. According to this network, there have been two parallel mutations at position 8. If we were to look solely at position 8 and create groups based on the bases at that position, the plants with a T would be considered polyphyletic. Polyphyletic groups have two or more ancestral lineages in which the parallel character states evolved. (Although we distinguish here Position 8 between paraphyletic and polyphyletic groups, many sys- G T Position 11 tematists have observed that the difference is slight and G A simply call any para- or polyphyletic group nonmonophy- letic.) Why not draw the network in such a way that there is only one change from G to T? Such a network is shown Position 11 Position 16 G A in Figure 2.10C. Now we have one change at position 8, Position 8 G C but that change then requires two changes at positions G T 3 and 16, making the network six steps long. There is no way to draw the network without at least one character changing more than once. Each of the networks can be converted into a phylog- eny by rooting at the polygon, but each makes different suggestions about how the plants have evolved. In Figure 2.10B, positions 3 and 16 have been stable over evolution- Position 16 ary time, whereas there have been two independent mu- G C tations at position 8. In Figure 2.10C, we postulate that positions 3 and 16 have mutated twice over evolutionary time, while position 8 has changed only once. By drawing either of these networks, we are proposing a hypothesis about how evolution has happened—about which genetic changes have occurred, at what frequency, and in which order. You can see that in fact you could construct quite a few different trees from these data, all of them produc- ing different ideas about evolution. How do we determine Figure 2.9 Two different ways to draw the tree in Figure 2.8C. Note that the length does not change, nor does the which one is correct? There is no way to be certain. No hypothesized order of events. one was there to watch the evolution of these plants. We can, however, make educated guesses, or inferences, and some such inferences seem more likely than others to be Two phenomena, however, make it much harder to correct. determine evolutionary history in practice: parallelism There are two general ways to infer a phylogenetic and reversal, which are sometimes referred to together as tree. One way is to use a clustering algorithm such as homoplasy. Parallelism is the appearance of similar char- . An algorithm is a series of repetitive acter states in unrelated organisms. (Many authors make steps that allow a large problem to be broken into a set a distinction between parallelism and convergence, but for of smaller problems. A second general way to produce a this discussion we will treat them as though they were the tree is to use an optimality criterion (plural, criteria), a same.) A reversal occurs when a derived character state specific rule that helps you determine which tree is best. changesJudd 4e back to the ancestral state. Three such criteria are in common use. First is the criteri- SinauerTo provide Associates a clear example, we will divide the group on of tree length, in which shorter, or more parsimonious, thatMorales we Studio have called star plants into white star plants, red trees are preferred to longer ones. Second is the maximum starJudd4e_02.09.ai plants, and 10-06-15 gold star plants. Let us assume that the likelihood criterion, in which the preferred tree is the one red star and white star plants have an A at position 3, that maximizes the likelihood of the data, given the tree. whereas all the rest of the plants have a C (including the Third is the Bayesian posterior probability, in which the polygon plants outgroup). Let us further assume that the preferred tree is the one with highest posterior probabil- white star plants have a T at position 8. We can revise the ity. We will discuss algorithms first and will discuss the alignment in Figure 2.8A to create Figure 2.10A, which optimality options in Methods That Use an Optimality gives the same information as the network in Figure Criterion (see page 27).

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 23

(A) Figure 2.10 (A) The alignment from Figure 2.8, Position 1 4 7 10 13 16 with the star plants divided into subgroups (white White star plants T C A T G A C T C G G A G G T G A T star, red star, and gold star). (B) Network derived from the alignment in (A). Notice that position 8 Red star plants T C A T G A C G C G G A G G T G A T has changed two times independently. (C) A dif- ferent network, postulating one change at posi- Gold star plants T C C T G A C G C G G A G G T G A T tion 8. Note that this forces two changes at posi- tions 3 and 16. Circle plants T C C T G A C G C G G A G G T C A T

Square plants T C C T G A C T C G G A G G T C A T

Diamond plants T C C T G A C T C G A A G G T C A T

Polygon plants T C C T G A C G C G G A G G T G A T (outgroup)

(B)

T G A C G C G T G A

Position 8 Position 3 Position 16 Position 8 Position 11

(C)

A C G C G T G A

Position 3 Position 16 Position 8 Position 11

C

Position 3 C A Position 16

Tree Building: G Clustering Algorithms and Neighbor Joining There are many different clustering algorithms, but most share a number of features. In general, they start with a dissimilarity, or distance, matrix. A dissimilarity ma- trix for the sequences in Figure 2.10A is shown in Figure 2.11A. While it is possible to record the dissimilarity of While a full description of this method is beyond the two sequences as the numbers of mutations separating scope of this book, a hallmark of the algorithm is that them, as shown above the diagonal of blank boxes, dis- it accounts for the possibility that the rate of molecular similarityJudd is 4emore commonly normalized by length of evolution may be different in different sequences. While sequence.Sinauer In this Associates case, the numbers of mutations distin- it starts with a dissimilarity matrix, the matrix is first cor- guishingMorales two sequencesStudio are divided by 18, which is the rected to account for differences in the mutation rate. The Judd4e_02.10.ai 09-11-15 total sequence length, as step 1. Step 2 is to join the two steps are then repeated until a full tree is produced. least dissimilar sequences to form one node with two The result is a tree in which the lengths of the branches branches. Step 3 is to remove those two from the ma- are distances, rather than particular mutations (Figure trix and replace them with a single value that represents 2.11B). Another valuable property of the neighbor-join- them. Step 4 is to recompute the matrix. These steps are ing algorithm is that the sequences that are most similar repeated until all sequences have been joined. may not necessarily be sisters in the tree. There are two The most commonly used algorithm for construct- possible reasons why two sequences might be similar to ing phylogenetic trees is the neighbor-joining algorithm. each other. First, and most intuitive, is the possibility that

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 24 Chapter 2

Figure 2.11 (A) A dissimilarity (A) matrix for the sequences from Fig- White Red Gold ure 2.10A. Numbers of mutations star star star Circle Square Diamond Polygon are shown to the right, above the diagonal of blank boxes; percent White star 1 2 3 2 3 2 dissimilarity is shown to the left, below the diagonal of blank boxes. Red star 0.0556 1 2 3 4 1 (B) A neighbor-joining tree based on Gold star 0.1111 0.0556 1 2 3 0 (A) and the sequences from Figure 2.10A. Branch lengths are adjusted Circle 0.1667 0.1111 0.0556 1 2 1 dissimilarity values. Numbers above Square 0.1111 0.1667 0.1111 0.0556 1 2 the branches are bootstrap values (explained later in this chapter). Diamond 0.1667 0.2222 0.1667 0.1111 0.0556 3 Polygon 0.1111 0.0556 0.0000 0.0556 0.1111 0.1667 the two are indeed very closely re- (B) lated and share a recent common 67 ancestor. The second possibility is 53 0.04167 0.05556 that the two are not particularly 0.0116 0.04167 closely related but that they just 0.01389 have not changed very much over 53 0.04630 time. An effective tree-building 0.04167 method must distinguish between 0.00926 these two possibilities, and neigh- 44 bor joining is able to make such a 0.01273 distinction. For example, in Figure 2.11A the red star and green circle plants are not very dissimilar (0.111, or 2 mutations apart), How can we address the problem of unseen mutations? which might make you think that they could be closely Start with the assumption that mutation is a random pro- related. However, looking at Figure 2.11B reveals that cess. To envision this, begin by thinking of raindrops fall- they are on different branches entirely but simply have ing on dry pavement. When there are not many drops, not changed much since the root of the tree. An older you will be able to see where each one has fallen, and method that was once widely used, the unweighted pair they will make a random pattern. You can count them all, group method with averaging, or UPGMA, is unable to probably fairly accurately. However, as raindrops continue distinguish between a true relationship and a slow rate to fall, sooner or later a drop will land right on top of a of change and is thus not appropriate for phylogenetic previous drop, and if this happens very often, you will reconstruction. quickly lose count of exactly how many drops have fallen In summary, neighbor joining is a very fast algorithmic at a given spot. If you now try to use the number of spots method that produces large trees in a fraction of a sec- to estimate the number of raindrops, your estimate will be ond. In addition, it considers the possibility that different too low because you will miss all the ones that have fallen sequences may mutate at different rates. Another useful directly on top of each other. characteristic of neighbor joining is that it can accommo- Like the pattern of random raindrops, it is easy to lo- date different models of evolution, described in the next cate the position of mutations in a DNA sequence if there section. are not very many mutations (Figure 2.12A). However, given enough mutations, sooner or later a base pair that Models of Evolution has mutated once will mutate again. At this point you lose In the description of the neighbor-joining method and in any evidence that the original mutation occurred. For ex- the examples earlier in this chapter, we have been pro- ample, a base that was originally A could mutate first to ceeding as though we could see every mutation that ever G and then to T. Unless you have an individual plant with happened. This, of course, is nonsense: DNA has existed the intermediate state (G), you will have no way of know- for several billion years and has been mutating the en- ing that it ever occurred; you will only see the A and T. tire time. Because of this long invisible history, we can Your estimate of the number of mutations, and hence the be certain that many mutations have occurredJudd 4e that are distance, will be too low. This pattern of multiple muta- Sinauer Associates no longer detectable. This is particularlyMorales true Studioif the time tions at the same site is sometimes called multiple hits. since divergence of sequences is long, orJudd4e_02.11.ai if the mutation 09-04-15 This can be diagrammed by graphing the dissimilarity rate is high. between the two sequences against time (Figure 2.12B).

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 25

(A) GTTCCAGCGGGAGCAGAACCTGCAGCTGCACCGGCGCGGCCACAACCTGCCG

6 11 17 4,21 8 19 13 3 7 18 10 1 9,15 2,14 20 16 5 12

Figure 2.12 (A) A DNA sequence with the positions of 21 random (B) mutations indicated by numbered arrows. Mutations 14, 15, and 21 (blue) occurred in the same positions as earlier mutations 2, 9, and 4, Actual dissimilarity respectively. (B) Dissimilarity between two sequences plotted against 0.75 time. Actual dissimilarity increases linearly over time, whereas observed dissimilarity is lower and approaches an asymptote of 0.75. (C) A DNA sequence over time, showing in red the numbers and positions of Observed dissimilarity mutations occurring between time 1 and time 2 and between time 2 and time 3.

not change, and the third position changed from C back to T. If the sequence at time 2 were not observed, Dissimilarity between two sequences Dissimilarity between two three mutations would be “invisible” and we would Time or evolutionary distance think that there had only been three rather than six (C) changes. Time 1 gaaccaatc catg tcgctctct ccccagaggaag attgggc A model of evolution is thus a correction factor to estimate the real dissimilarity between two sequences. Time 2 gaaccaatt cata tcgctctcc ccccagaggaaa attgggc In the context of a clustering algorithm such as neigh- Time 3 gaaccaatg cata tcgctctct ccccagaggaaa attgggc bor joining, the pairwise distance from the origi- nal sequences can be adjusted according to a model of evolution and then the algorithm can be applied. The actual number of mutations increases without limit For example, suppose one has two sequences that differ over time, but the observed number accumulates more at 20% of their sites. One assumes that mutation is ran- slowly and eventually approaches an asymptote at 75% dom, that nucleotide frequencies are equal, that all sub- distance. Because there are only four base pairs, there is a stitutions are equally probable, and that all sites are free 1/4 (25%) chance that two base pairs will match purely at to vary. (These are the assumptions of the Jukes-Cantor random. Thus two random sequences will always be ap- model, named after the authors who first proposed it.) proximately 25% similar and 75% dissimilar. At 75% dis- One can then calculate what the actual percentage of similarity, the actual divergence cannot be distinguished changes must have been to have an observed dissimilar- from random noise. ity of 20%. The actual percentage of changes is the cor- Basic probability theory can be used to calculate the rected dissimilarity, or branch length. The assumptions probability of multiple hits for a given pair of sequences, of the Jukes-Cantor model are oversimplified, and more and hence can come up with a corrected measure of dis- realistic models have been developed, but all serve the similarity that approximates the actual value. Such an same general purpose. estimate of the probability of multiple hits is known as What happens if you don’t correct for multiple muta- a model of evolution. In other words, if one observes a tions at the same site in a molecule? If all sequences are pairwise dissimilarity between two sequences of, say, 1%, fairly similar, i.e., if you are at a point near the origin in what is the probability that there have been additional in- the graph in Figure 2.12B, then the observed dissimilarity visible mutations? Inspection of the graph in Figure 2.12B is not far from the actual dissimilarity and correcting for suggests that at 1% dissimilarity, the actual value and multiple mutations does not make much difference. How- theJudd observed 4e value are very similar. Conversely, if one ever, if the dissimilarities between sequences are large observesSinauer Associates a value of 30% dissimilarity, one can expect to and, in particular, very unequal, uncorrected distances Morales Studio haveJudd4e_02.12.ai missed quite 09-11-15 a few mutations, and the actual value will produce a particular tree topology, whether it is cor- is probably a good deal higher than 30%. This can also be rect or not. This can be seen Figure 2.13. In this figure, we seen in Figure 2.12C, which shows a sequence sampled consider only four sequences, and we see in Figure 2.13A at three different times. If we look at time 2, we can see that sequences 1 and 4 have accumulated more mutations that between time 1 and time 2, the sequence underwent than sequences 2 and 3 and, furthermore, that there are four mutations—C to T, G to A, T to C, and another G not many mutations along the internal branch. Such a to A. Between times 2 and 3, the first mutated position pattern can be caused by a large difference in the rate of changed from T to G, the second and fourth positions did mutation. Because sequences 1 and 4 have accumulated

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 26 Chapter 2

(A) Figure 2.13 (A) An unrooted network show- ing the actual history of four DNA sequences. 1 Note that sequences 1 and 4 have accumulated AATGAAAGAATCGC 1 many more mutations than sequences 2 and 3, 2 and the internal branch has only one mutation. 3 (B) The result of an analysis that does not use 7 ACTGAACGTAACGT 9 a model of evolution. The long branches have 14 11 1 “attracted” each other, and the network now GCTGAACGTAACGC ACTGAACGTAACGC places sequences 1 and 4 together on one side 13 2 of the internal branch. Parallel changes in posi- GCTGAACGTAACCC 5 tions 2, 9, and 11 are erroneously interpreted 2 7 as indicating shared history and the ancestral 9 10 sequences are wrongly inferred. 11 AATGTAGGACTCGC 4 many mutations relative to other sequences in (B) the tree. Random similarities to various other 3 1 sequences can lead to many placements that ACTGAACGTAACGT AATGAAAGAATCGC are almost equally good. Such branches ap- 14 2 9 11 7 pear to “wander” in trees. Because the branch ACTGAACGTAACGC AATGAACGAATCGC leading to the outgroup is often the longest in 13 5 the tree, the position of the root itself may be 1 7 uncertain because of the long branch. 10 GCTGAACGTAACCC Sometimes the long-branch problem can be 2 AATGTAGGACTCGC solved and sometimes it can’t. While model- 4 based methods mitigate this problem, they do not eliminate it entirely. If there are close rela- more mutations, the chances of their having multiple hits tives to the species on the long branch, these can be in- goes up. Because there are only four nucleotides, some of cluded and will have the effect of breaking up the branch. those mutations will lead to identical bases in both se- However, in some cases (e.g., the root of the angiosperm quences, purely by chance. In this example, note that po- tree) none of the extant relatives are particularly recent. sitions 2, 9, and 11 are identical in sequences 1 and 4 but Thus some problems are inherently difficult and possibly that these similarities come from independent mutations; not solvable by current methods or data. We will return that is, sequences 1 and 4 are different from their ances- to this unfortunate property of trees when we talk about tral sequences at those positions. The rapidly changing parsimony algorithms on pages 27–28. sequences can thus appear to be closely related even if they are not. Should You Believe the Tree: An uncorrected neighbor-joining analysis will infer Bootstrapping and Comparison of Trees the tree in Figure 2.13B instead of the real tree (in Fig- An evolutionary tree is simply a model or hypothesis, a ure 2.13A). Note that sequences 1 and 4 are now placed best guess about the history of a group of plants. It fol- together and sequences 2 and 3 are together. In other lows that some guesses might be better, or at least more words, the algorithm will generate the wrong tree. In this convincing, than others. A computer will produce a tree case, the long branches are said to “attract” each other, for any data you give it. How can you tell whether the tree andJudd the 4e phenomenon is known as long-branch attrac- is a good reflection of the data? (Note that you have no tionSinauer. Even Associates worse, the more data you collect, the more way of knowing whether the tree is accurate or not—in correctMorales the Studio wrong tree will appear to be; that is, you will most cases the phylogeny is unknown and must simply becomeJudd4e_02.13..ai increasingly 09-04-15 confident that the wrong answer is be estimated.) correct. While this example is artificial, such trees are One way to assess the strength of the tree would be to surprisingly common in the study of plant evolution. All randomize the entire data set, construct a new tree, re- the problems with long branches are worse if the internal peat this many times, and determine whether the tree you branches of the tree are short. This issue appears often have produced is significantly different from one of the with rapid radiations of species in which there was a burst random trees. Would this be an interesting result? Prob- of speciation and little time for mutations to accumulate, ably not. Generally we are interested in saying something followed by a longer time in which lineages diverged. more specific than simply that our data are nonrandom. A version of the long-branch problem can also appear Usually instead we are interested in assessing whether with individual long branches, which have accumulated the tree is a good reflection of the data, or if it would be

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 27

produced by a similar sample of data. We are also gener- majority (more than 50%) of the bootstrap trees, then ally interested in knowing how well the data support par- this clade is included in the majority-rule consensus tree, ticular parts of the tree: we want a way to assess support along with an indication of the bootstrap value, that is, for particular groups, or . the percentage of bootstrap trees showing that clade. For One very common statistical method of resampling is example, Figure 2.15 shows a recent phylogeny of the bootstrap analysis, which randomizes columns in the (Refulio-Rodriguez and Olmstead 2014). Above matrix with respect to taxa. To create a bootstrapped each branch are three numbers, the second and third of data set, start with the original alignment. As an exam- which are bootstrap values. Higher values are better, so ple, begin with the matrix in Figure 2.10A, and choose nodes that appear in 100% of the bootstrap trees, such as one position (column) at random. Make that column the those at the deeper nodes of this particular tree, are more first position of a new alignment. Go back to the original strongly supported by the data than nodes with lower val- alignment and choose another column at random. Make ues. It is important to keep in mind that bootstrap values that the second position of the new alignment. Continue are not conventional probability values, so a bootstrap until the new alignment is the same length as the origi- value of 85% does not indicate that the clade has an 85% nal one. Because you are returning to the original ma- chance of being correct. It merely provides a rough gauge trix each time to choose a new column, some characters of how well the data support a given result. may be represented several times in each new matrix, while others may be omitted entirely. In other words, Methods That Use an Optimality Criterion sample the original alignment with replacement. Figure We have talked a great deal about one common method 2.14 shows the new matrix constructed from sampling of tree construction, the neighbor-joining method. This is randomly with replacement from the matrix in Figure an algorithmic method, in which the data are simply fed 2.10A; note that character 17 from the original matrix through a series of repetitive computational steps and a has been selected four times, whereas several characters single tree appears at the end. This approach is limited, (e.g., 5 and 6) have been missed by the random selection however, because it provides no way to consider trees that process. may be nearly as good as the one you arrived at. As in all Once a bootstrapped data matrix is created, it can be aspects of science, it is important to consider alternatives to analyzed by neighbor joining or by the parsimony or your results. In this case, it is often helpful to have a num- maximum likelihood methods described below. A hun- ber by which you can rank trees as the best, second best, dred or more bootstrapped matrices are created and ana- third best, etc. Also, many systematists wish to develop a lyzed, producing a set of at least 100 phylogenetic trees. tree that is consistent with a particular statistical or philo- The large set of trees is then summarized by a majority- sophical criterion. Such trees are said to use optimality cri- rule consensus tree. If a particular clade is present in the teria. The most common of these are parsimony, maximum likelihood, and Bayesian approaches.

Position in original alignment Parsimony In parsimony analyses, 4 9 3 1 15 17 2 9 9 17 2 17 7 18 1 11 17 18 the optimality criterion is tree length, or number of mutational steps. Spe- White star T C A T T A C C C A C A C T T G A T cifically, the shortest tree for the data is preferred. The parsimony method asks, Red star T C A T T A C C C A C A C T T G A T What is the simplest explanation of the observations? By asking this question, Gold star T C C T T A C C C A C A C T T G A T we apply a rule that is used throughout Circle T C C T T A C C C A C A C T T G A T science, known as Occam’s razor: Do not generate a hypothesis any more complex Square T C C T T A C C C A C A C T T G A T than is demanded by the data. Applying this principle of simplicity, or parsimony, Diamond T C C T T A C C C A C A C T T A A T leads us to prefer the shorter network. The fact that it is shorter does not make Polygon T C C T T A C C C A C A C T T G A T (outgroup) it correct, but it is the simplest explana- tion of the data. The network in Figure Figure 2.14 A data matrix created from the alignment in Figure 2.10A by 2.6 is in fact a parsimony network—the sampling randomly with replacement. Notice that the number of columns in shortest network that can be constructed the original matrix is the same as the number in this matrix (18) but that some for those data. The fact that we prefer the columns are included here more than once and others are missed entirely. A network in Figure 2.10B over the one in bootstrap analysis begins by creating 100 or more randomized matrices like Figure 2.10C is also based on the prin- this one. ciple of parsimony.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.14.ai 09-04-15 28 Chapter 2

Figure 2.15 Phylogeny of the Lamiales based on sequences of chloroplast and mitochondrial genes. The tree is the Bayesian majority-rule con- 1.0/100/96 Orobanchaceae sensus tree. Numbers above branches indicate 1.0/95/55 1.0/89/88 Rehmanniaceae Bayesian posterior probability, the maximum likeli- hood bootstrap value, and the parsimony boot- 1.0/93/74 Paulowniaceae strap value, respectively (see pages 33, 32, and 27). A hyphen indicates that the branch was not sup- Phrymaceae 0.99/59/12 1.0/93/34 ported or not found in a particular analysis. (After Mazaceae Refulio-Rodriguez and Olmstead 2014.) 1.0/94/65 Verbenaceae

0.77/25/- 0.79/-/- Thomandersiaceae Lentibulariaceae

0.99/-/- Schlegeliaceae 1.0/91/- Bignoniaceae

0.98/-/30 Acanthaceae 0.99/25/- 0.65/-/17 Martyniaceae

Pedaliaceae

1.0/81/39 Linderniaceae 1.0/-/-

1.0/99/68 Byblidaceae Stilbaceae 1.0/100/100 Scrophulariaceae

Plantaginaceae

In the example in Figure 2.10, in which there are few nents of this method, this feature is a virtue because it characters and little homoplasy, it is easy to construct the uses only data that can be observed directly and does not shortest network that can link the organisms. In most real rely on inferences. For detractors, however, the inability to cases, however, many networks are possible, and it is not consider multiple mutations (a model of evolution) means immediately obvious which one is the shortest. Computer that the method ignores much of what we know about the algorithms compare thousands of trees, calculate their mutational process. All users of this method know that it lengths, and determine which is shortest. If the taxa are is particularly susceptible to the problem of long-branch numerous, only heuristic algorithms can be used. These attraction. Because long-branch attraction is especially algorithms may not succeed in finding the shortest tree or problematic with DNA sequence data, it is becoming less trees, because of the large number of possible trees. and less common to use parsimony approaches. Con- Branch lengths in Juddparsimony 4e are recorded as the total versely, parsimony is particularly suited to morphological Sinauer Associates number of substitutions,Morales and Studio they are not normalized ac- data for which models of evolution are hard to apply. Thus cording to the lengthJudd4e_02.15.ai of the matrix (generally 07-24-15 a DNA se- parsimony methods are still valuable when incorporating quence). Because of this lack of normalization it is difficult fossils into phylogenetic analyses. to assess whether branch lengths are particularly short The parsimony method has been widely used, is eas- or particularly long. (In other words, if I see a clade sup- ily applicable to morphological changes, and is also pos- ported by a branch length of three mutations, should I be sibly the most intuitive of tree reconstruction methods. impressed or concerned?) Sites that appear to be the same Parsimony methods work well when evolutionary rates in all sequences (invariant sites) are considered to be par- are slow enough that chance similarities (due to the evo- simony uninformative and are simply ignored. Note that lution of identical derived character states independently this is quite different from the way model-based methods in two or more lineages) do not overwhelm character such as maximum likelihood and Bayesian analyses (see states shared by the common ancestor. At higher rates below) treat such sites. of change, however, parsimony methods are particularly Tree length does not provide any method for consider- susceptible to long-branch problems, described above for ing multiple mutations at the same site. For some propo- uncorrected neighbor-joining analyses. Support for parsi-

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 29

BOX 2A Building a Phylogeny from Morphological Characters

It is possible to build a phylogeny often all we can do is be sure that the observing similarity is only the first from morphological characters, character states really are distinct. step in determining homology, since although it is rarely done now that For quantitative characters such as not every observed similarity is the DNA data are so common and easy leaf length or corolla tube width, this result of homology.* (For example, to obtain. However, DNA data are means graphing the quantitative data structural similarities can evolve in- not always available, either because (i.e., the measurements) to be sure dependently in unrelated plants that the plant is represented by a single that the measurements of the species live in similar environments.) This herbarium specimen with limited we are studying do not overlap. If the text follows the viewpoint held by material, or because the plant is measurements do overlap, then the the many phylogenetic systematists fossilized. In such cases, a phylog- assumption of underlying genetic who argue that homology can be eny can still be constructed. This is switches—and therefore division determined only by constructing an usually done using the parsimony into character states—is unsupported evolutionary tree. method (see page 27) because of the by any evidence. In these cases, the Morphological characters may be difficulty of constructing a model characters in question should be divided into two states: binary (hav- of evolution for most morphological omitted from phylogenetic analysis ing two states) or multistate (having characters. unless the overlap is caused by only three or more states). Binary charac- The basic principles are the same a few individuals, in which case the ters are interpreted as representing a as those outlined in the main text, character could be scored as poly- single genetic switch—“on” produc- but instead of starting with an align- morphic for that species and retained ing one state (e.g., tricolpate pollen), ment of DNA sequences, the analysis in the analysis. Even though such and “off” resulting in the other state starts with a matrix of morphological overlapping characters might reflect (e.g., one-grooved, or monosulcate, characters, in which the rows corre- genetic changes over evolutionary pollen). Over evolutionary time, of spond to individual plants or groups time, overlap makes it difficult, given course, such characters can continue of plants and the columns correspond our current state of knowledge, to ex- to change. For example, tricolpate to characters. Individual cells in the tract any reliable information on the pollen is modified in some Caryo- matrix then carry information on underlying gene changes (although phyllales, where it is spherical with character states. When we describe methods of dealing with plants many pores evenly spaced around it the variation among similar mor- with variable characters have been (e.g., pantoporate, looking rather like phological structures by dividing the developed). a golf ball). If we were to include the character into character states, we are It may be harder than you think character pollen colpi in a matrix con- in fact putting forward a hypothesis to determine which structures of taining some taxa with pantoporate of underlying genetic control, even one plant can usefully be compared pollen, then pollen colpi would have though the assumption is rarely with structures of a different plant. three states (monosulcate, tricolpate, stated in these terms. For example, if Two structures may be deemed to and pantoporate), making it a multi- two species differ in the color of their be similar if (1) they are found in a state character. Multistate characters flowers, we may score the character similar position in both organisms, raise a difficult question: How many petal color as having two states, red (2) they are similar in their cellular genetic switches are involved? and blue. By scoring it this way, we and histological structure, and/or (3) It is possible that monosulcate are hypothesizing that the genes they are linked by intermediate forms pollen changed to tricolpate pollen, underlying petal color have switched, of the structure (either by intermedi- which then changed to pantoporate over evolutionary time, to produce ei- ates at different developmental stages pollen (Figure 2.16A). This scenario ther red flowers from a blue-flowered of the same organism or by interme- implies two genetic switches that ancestor or blue flowers from a red- diates in different organisms). These must have occurred in order; that is, flowered ancestor. In this instance, three statements constitute Remane’s pantoporate pollen could arise only we know that there are in fact genes criteria of similarity. after tricolpate pollen did. If we ac- (e.g., components of the anthocyanin Remane (1952) actually called this cept this series of events, the multi- pathway) that control petal color, list the “criteria of homology.” In this state character is considered ordered. and thus the inference of two states book, however, we use the term ho- If we decide to allow for reversals of controlled by a “genetic switch” is mology in a more restricted sense, to character states—that is, if we con- probably a reasonable one. mean identity by descent. In other continued on the next page In many cases we have no idea of words, if we say that a character is the genetic mechanisms that con- homologous among a group of spe- *You should be aware that homology has trol the character states observed. cies, we mean that all those species several different meanings, and when you read the literature, it is worth checking what In proposing hypotheses about the inherited that character from a com- particular authors mean when they use the nature of the underlying switches, mon ancestor. Under this definition, term.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 30 Chapter 2

BOX 2A Building a Phylogeny from Morphological Characters (continued)

(A) postulate only one switch between any two states. Note, for comparison, that DNA sequence characters are multistate characters with four states (A, T, G, C). To treat these as ordered would be nonsensical; adenine does not need to change to cytosine before (B) Figure 2.16 Three alternative changing to guanine. DNA charac- hypotheses about the evolution of ters are always treated as unordered pollen morphology: (A) Monosul- and fully reversible. cate changed to tricolpate, which In parsimony analyses of mor- then changed to pantoporate. As phological characters, it is common drawn, the character is ordered to assume that each character state and irreversible. (B) Monosulcate change represents a single genetic changed to tricolpate, and it also switch and that all state changes and independently changed to are equally likely. Although this pantoporate. Again, the character approach sometimes is described is ordered and irreversible. (C) Any as “unweighted parsimony,” in fact (C) pollen type can change to any other it assumes that all characters are pollen type. The double-headed equally likely to change and weights arrows indicate that the character is them accordingly. Mechanisms exist unordered and reversible. to give added weight to particular changes or particular directions of change, based on prior knowledge This sequence would suggest that of the character. For example, it is there is a genetic switch that allows possible to build into a morphological change from monosulcate to tricol- analysis the assumption that pres- pate pollen, as well as a switch that ence or absence of trichomes may be allows change from monosulcate to more likely to change over evolution- pantoporate pollen, but that a change ary time than numbers of anthers. sider the possibility that pantoporate from tricolpate to pantoporate pollen Also implicit in parsimony analy- pollen might switch back to tricolpate is impossible. The character in this ses of morphology is the assump- and tricolpate back to monosul- case is still ordered, but in a dif- tion that all characters of organisms cate—the character is still ordered. It ferent way than in Figure 2.16A. If evolve independently, that is, that requires two evolutionary (genetic) reversals are possible, then two steps change in one character does not steps to go from monosulcate to are required to get from tricolpate increase the probability of change pantoporate pollen, and two steps to to pantoporate pollen, and two from in another character. This assump- go from pantoporate to monosulcate pantoporate to tricolpate pollen. tion may be violated frequently; for pollen. With morphological characters example, a change in flower color If we didn’t know anything about and character states, we are usu- might well lead to a shift in pollina- the plants involved, we might want to ally unsure of which switches are tors, which would then increase the consider the possibility that mono- possible, so it is common to treat probability that corolla shape would sulcate pollen changed to tricolpate multistate characters as unordered change. Violations of this assumption Juddpollen 4e and, in an independent event, (Figure 2.16C). This method is some- obviously affect character weighting, Sinauermonosulcate Associates pollen also changed to times called Fitch parsimony. In the in that the likelihood of change for Moralespantoporate Studio pollen (Figure 2.16B). case of an unordered character, we any two characters is not the same. Judd4e_02.16.ai 08-12-15

mony trees is generally assessed by bootstrapping. In the of mutations rather than mutations per length of se- tree in Figure 2.15, the rightmost number at each branch quence. Branch lengths cannot be corrected for multiple is the parsimony bootstrap value. hits, making parsimony particularly prone to artifacts cre- In summary, parsimony is a mutation-counting meth- ated by long branches. Parsimony is used less frequently od that uses the optimality criterion of tree length (num- for nucleotide data but remains the most commonly used bers of mutations, or steps). Branch lengths are numbers method for morphological data (Box 2A). It also underlies

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 31

BOX 2B Hennigian Argumentation

In current methods of phylogenetic Outgroup(s) I II III analysis, a network is constructed and is then polarized by determin- ing where the outgroup attaches. However, in the original concept of phylogenetic analysis proposed by Willi Hennig (see Chapter 3), charac- ters were polarized first using one or several outgroups, and the phylogeny was constructed afterward. Consider, for example, the imagi- nary plants presented in Figure 2.17 and their imaginary morphological character states listed in Table 2.2. In Table 2.2, the character states of the outgroup are assumed to be ancestral (plesiomorphic) and are represented by a 0, while the derived (apomor- phic) states are represented by a 1. These character states are then used to produce a character × taxon matrix (Table 2.3). Next, a phylogenetic Figure 2.17 Three imaginary species (I, II, and III) and an outgroup. tree is constructed in which the taxa are grouped according to evidence provided by shared derived character unique common ancestor in which The phylogenetic tree, then, repre- states (synapomorphies). The pres- the apomorphy first evolved; the two sents the simplest hypothesis that ence of a derived character state in taxa are assumed to have inherited explains the pattern of derived char- two taxa suggests that they share a the apomorphy from this ancestor. acter states, following the principle of parsimony. A hypothesis of evolutionary TABLe 2.2 relationships for species I, II, and III of Figure 2.17 is presented in Figure States of morphological characters used in cladistic 2.18. Species II and III are hypoth- analysis of the three imaginaryJudd species 4e in Figure 2.17 esized to share a unique common Sinauer Associates ancestor because they share the Character statea Morales Studio derived states of characters 3 and Morphological Judd4e_02.17.ai 07-24-15 5, that is, they both have opposite, character Plesiomorphic Apomorphic petiolate leaves (see Table 2.2). These 1. Roots Less than 1 mm thick (0) Greater than 5 mm thick (1) are hypothesized to have evolved in their common ancestor. Similarly, the 2. Stems Glabrous (0) Pubescent (1) shared possession of solitary flowers 3. Leaves Alternate (0) Opposite (1) (character 9) supports the recogni- tion of a more inclusive monophyletic 4. Venation Pinnate (0) Palmate (1) group containing species I, II, and III. 5. Petiole Lacking (0) Present (1) The presence of pubescent (hairy) stems in species I and III is homopla- 6. Base of blade Acute (0) Cordate (1) sious; that is, hairy stems are hypoth- 7. Perianth parts 4 (0) 3 (1) esized to have evolved in parallel in these two species, so their similarity 8. Perianth parts Separate (0) Fused (1) is not based on common ancestry. 9. Flowers b In a group of 3 (0) Solitary (1) Note, however, that hairy stems could have evolved in the most recent aCharacter state codings are given in parentheses. common ancestor of all three species bNote that inflorescence condition (flowers solitary versus flowers in groups of 3) cannot be polarized unless additional outgroups are employed. continued on the next page

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 32 Chapter 2

BOX 2B Hennigian Argumentation (continued)

I II III Figure 2.18 A phylogenetic tree illustrating hypoth- esized evolutionary relationships of the three imaginary 8 species of Figure 2.17. Synapomorphies are indicated on 7 the tree by the numbered green lines. 6 4 2 these are shared ancestral are derived character states unique 1 2 characters. Such characters to species III. These unique derived do not indicate relationship. In character states () 3 contrast, palmate venation, cordate also tell us nothing about the phylo- 5 (heart-shaped) leaf bases, and flow- genetic relationships of species III. ers with three fused perianth parts

9 TABLe 2.3 Character × taxon matrix for the three hypothetical species of Figure 2.17, based on characters in Table 2.2 Characters and then been lost (in a reversal) in Taxa 1 2 3 4 5 6 7 8 9 species II. The sharing of pinnate venation Species I 0 1 0 0 0 0 0 0 1 (feather-shaped leaves) with acute Species II 0 0 1 0 1 0 0 0 1 leaf blade bases (forming an angle less than 90°) and flowers with four Species III 1 1 1 1 1 1 1 1 1 separate perianth parts by spe- Outgroup(s) 0 0 0 0 0 0 0 0 0 cies I and II is symplesiomorphic:

an older method of phylogeny reconstruction known as der a Jukes-Cantor model, compute the probability that Hennigian argumentation (Box 2B). that A will change to T in the diamond plants, T in the square plants, G in the circle plants, and G in the star Maximum likelihood Maximum likelihood is a power- plants. Then compute the probability for the case where ful approachJudd used 4e in many aspects of statistics, including the base at node X is A but the one at Y is G; record that constructionSinauer of trees, Associates and is based on a model of the prob- ability of changeMorales from Studio one character state (in this case) to another. ThisJudd4e_02.18.ai model is used 08-12-15 to calculate the likelihood that a given branching diagram will lead to the particular G G T T set of data observed. The best tree is the one that maxi- Length 3 Length 2 mizes the likelihood of the data, given the tree topology, Length 4 Length 5 a set of branch lengths, and a model of evolution, so this X model is known as the maximum likelihood method. In Length 1 other words, the method starts with a tree and then asks, If this were the real tree, what is the probability that it Y would have produced my set of data? For example, consider the tree in Figure 2.19. This is Figure 2.19 Maximum likelihood computation for posi- a tree topology and a set of arbitrarily chosen branch tion 8 from the matrix in Figure 2.6. The goal is to compute lengths. The terminals represent position 8 in the matrix the likelihood of observing the bases shown, given the from Figure 2.6. The likelihood calculation then proceeds branch lengths shown, testing for all possible combina- as follows: Suppose the base at node X is A and the one tions of bases at positions X and Y. This computation is then at node Y is also A. Given the particular set of branch repeated for all sites in the alignment and for various sets of lengths and assuming that mutation is proceeding un- branch lengths.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.19.ai 09-11-15 Methods and Principles of Biological Systematics 33 probability. Continue until you have done the probability frequentist statistics that are taught in common univer- calculations for all possible combinations of nucleotides sity courses. The approach dates to the eighteenth century at nodes X and Y, and add up the probabilities of all pos- and is distinguished from frequentist approaches because sible combinations. Then do a similar calculation for ev- it explicitly takes into account prior information. In other ery site in the alignment. Then change the branch length words, rather than assuming that nothing is known about and repeat the computations. The product of all these a system, it explicitly incorporates knowledge accrued be- very small probabilities is a very tiny number, beginning fore the data analysis in question. with a decimal point followed by several thousand zeros. In the context of tree building, Bayesian analyses are Likelihoods are thus generally expressed as logarithms or model based, like maximum likelihood methods. The negative logarithms, and the highest (maximum) likeli- optimality criterion is the posterior probability; the pre- hood is the one with the smallest absolute value. ferred tree is the one with the greatest posterior probabil- Note that you can estimate the probability of change ity. In contrast to the likelihood, which is the probability even for sites that appear to be the same for all sequences of the data given the tree, the posterior probability is the (e.g., position 1 in the matrix in Figure 2.6). There is a prob- probability of the tree given the data, so it represents a ability that these sites have not changed at all from their more intuitive measure of probability. According to Bayes’ common ancestor, but there is also some probability that theorem, the posterior probability is calculated as the like- “invariant sites” have simply changed to the same state but lihood multiplied by the prior probability (or just “prior,” from different ancestral nucleotides. For example, the nu- for short) that the hypothesis (in this case the tree) is true; cleotide at ancestor X could have been C and changed to A that product is then divided by a constant. The center of independently in both the diamond and square plants. Or disagreements between frequentist and Bayesian statisti- the nucleotide at X could have been A, changed to some- cians is, first, whether a prior should be incorporated at all thing else, and then changed back to their original state and, second, how that should be done. (e.g., A to C to A). The probability of such invisible changes We have already discussed how likelihoods are com- gets higher as the branch lengths get longer. This is differ- puted. How do we assign prior probabilities to a tree or ent from the way that parsimony methods treat invariant set of trees? Two approaches are in common use. The first sites; in parsimony methods such sites are assumed to be (sometimes called “flat priors”) is to assume that the prior the same in the ancestors and descendants, and are as- probabilities of all possible trees are the same. In this case, sumed not to have changed over evolutionary time. the posterior probability is simply the likelihood multi- Assessment of support for a maximum likelihood tree plied by a constant, so the Bayesian tree is the same as is done via bootstrapping as in neighbor joining and par- the likelihood tree, although the interpretation of the two simony. For many years maximum likelihood algorithms is different. A second approach is to permit the priors of were so slow that this was very hard to do, but rapid algo- all trees to be almost but not quite the same; this is the rithms now make bootstrapping a maximum likelihood approach implemented in the program MrBayes (http:// tree possible and indeed routine. In the tree in Figure mrbayes.sourceforge.net/index.php; Ronquist et al. 2012), 2.15, the middle number on each branch is the maximum which is the most common program used to compute likelihood bootstrap value. Because maximum likeli- Bayesian trees. hood uses a model of evolution, it is much less susceptible Rather than try to identify a single best tree, as maxi- than parsimony or uncorrected neighbor joining to long- mum likelihood methods do, Bayesian analyses examine branch attraction. However, it has a weak tendency to a set of “credible trees.” Within that set, the frequency of fall prey to the opposite problem, long-branch repulsion. each clade is then determined. This means that the es- In this case, if the long branches really are sisters (e.g., timate of support is built into the analysis, rather than if there is a speedup of the evolutionary rate of a clade), determined afterward by bootstrapping. maximum likelihood can tend to force them apart. The steps in a Bayesian analysis are as follows: First, In summary, the maximum likelihood method uses assign priors. Sample tree space for each tree, computing likelihood as its optimality criterion. The preferred tree is the likelihood and multiplying by the prior. Then, when the one that maximizes the likelihood of producing the ob- you have sampled enough trees (see the next paragraph served data, given a particular model of evolution. Branch for how to know how many is enough), remove the sub- support is generally evaluated by bootstrapping. While it optimal trees that were computed at the start. Finally, cal- is less susceptible than uncorrected neighbor-joining or culate a consensus tree. The result is generally a topology parsimony methods to artifacts caused by long branches, quite similar to that from a maximum likelihood analysis it is not wholly immune. It requires more computational with an assessment of support. power than neighbor joining but is still reasonably fast for Although the steps in a Bayesian phylogenetic analysis moderate-sized data sets (several hundred sequences). are conceptually straightforward, the tree search methods are computationally intractable. Thus a Bayesian search is Bayesian methods Bayesian statistics is a fundamen- carried out using a Markov chain Monte Carlo (MCMC) tally different branch of statistics from the more familiar method. MCMC is a “robot” with a series of rules for

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 34 Chapter 2 searching through many combinations of branch lengths, isms. In fact, this has proved true for many groups of tree topologies, and characters at ancestral nodes. Effec- plants. The monophyly of such families as the Poaceae, tively, the computer algorithm wanders around in tree Onagraceae, , Asteraceae, Fabaceae, and Orchi- space computing posterior probabilities as it goes. The al- daceae, and of other large groups of plants such as angio- gorithm (1) starts with a tree; (2) modifies all parameters sperms, monocots, and , has been supported by a little bit and evaluates the tree again; (3) following some virtually all phylogenetic analyses of many kinds of data, preset rules, accepts or rejects the new tree. Unlike tree including morphology, chloroplast gene sequences, and evaluation in maximum likelihood and parsimony meth- nuclear gene sequences. This list of robust groups is long. ods, the new tree may or may not be better than the pre- While many papers in the systematics literature wrestle vious ones. Periodically all the parameters are changed with situations in which phylogenies disagree, it is impor- substantially in an effort to prevent the algorithm from tant to recognize that in many cases phylogenetic data are getting stuck in one small set of trees. quite consistent. Each new tree is called a generation. The first trees Even though many aspects of plant phylogeny are computed are not yet optimized so are usually discarded. well supported, not all phylogenetic trees are identical. After these initial trees, posterior values generally settle We have already mentioned a number of reasons for this. down and vary around a mean. Once the values have First, the different analytical methods have different un- settled down, it is common to sample every 1000 trees derlying assumptions and thus are subject to different for several million generations. Finally all the sampled analytical artifacts, particularly when dealing with nucle- trees are combined into a majority-rule consensus tree, otide data. Because of this, it is common to use several as would be done for a bootstrap analysis. The value on analytical methods to determine which aspects of the re- any given branch is the posterior probability. sult are sensitive to the method of analysis and which are How are Bayesian posterior probability values com- robust. Often analyses will begin with a quick neighbor- pared with bootstrap proportions? Because the credible joining analysis just to guide subsequent thinking and set of Bayesian trees is not constructed or assessed in the data exploration and perhaps additional sampling. These same way as a set of bootstrap trees, it is not surprising fast preliminary analyses can then be followed with more that the two sorts of values cannot be directly compared; extensive analyses using the various optimality criteria. they are simply measuring different things. A general Second, trees may differ if they use different sorts of observation is that Bayesian posterior probabilities are characters. For example, phylogenies based on morphol- higher than maximum likelihood bootstrap values for ogy or chloroplast, mitochondrial, or nuclear DNA nu- the same branches on trees. Recall that bootstrap analy- cleotide sequences can be (and often are) compared. If sis resamples the data matrix, whereas Bayesian analysis these phylogenies show similar groups, we can be more does not. Therefore a node supported by only one or two confident that they reflect the true order of events. For ex- mutations will necessarily have a low bootstrap value, be- ample, chloroplast, mitochondrial, and nuclear DNA each cause those mutations will be missed by the resampling support three major clades in the Rosidae (Sun et al. 2015). procedure some of the time. The Bayesian analysis will All data support the monophyly of Fabidae (Fabales plus not be misled in the same way. A comparison between some ), Malvidae (Malvales plus other rosids), and Bayesian posterior probabilities and bootstrap values can the clade made up of Celastrales, Oxalidales, and Mal- be seen in the Lamiales tree in Figure 2.15, where the pighiales (COM), but the relationships among the three posterior probability, the maximum likelihood bootstrap clades are different when chloroplast gene sequences are value, and the parsimony bootstrap value are shown from used than when mitochondrial or nuclear gene sequences left to right along the branch for each node. For many are used (Figure 2.20A). Based just on these three trees, a nodes, the posterior probability (× 100) is considerably majority-rule consensus tree would appear as the one on higher than either of the two bootstrap values. the left in Figure 2.20B, reflecting that two-thirds of the In summary, a Bayesian analysis is model based and is trees place the COM clade as sister to Malvidae. similar in many respects to a maximum likelihood analy- One way to show groups that are found by all methods sis because the likelihood is one component of the pos- of analysis, or among different sorts of data, is a strict terior probability. Branch lengths are in substitutions per consensus tree, which displays only monophyletic groups site. The analysis itself tends to be slower than the other that are common to all trees. Lineages whose positions methods. Bayesian support values are generally interpret- are uncertain are then drawn as though they arose at the ed as conventional probability values, so any value below same node in the tree diagram. In Figure 2.20, the dif- 90% or 95% indicates poor support ferent sets of data find that COM could be sister to the Fabidae or sister to Malvidae, so the strict consensus tree Comparing Trees from Different Methods shows the branching order of the three clades as being and Sources of Data ambiguous (Figure 2.20B, right). Although it appears that In theory, every method of analysis and every source of the lineages arose at the same time, in fact the diagram data should give the same history for a group of organ- indicates that we cannot tell when they originated.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 35

Figure 2.20 (A) Phylogenetic his- (A) Fabidae Fabidae Fabidae tory of the Rosidae. All data support the monophyly of Fabidae, Malvidae, COM COM COM and the Celastrales, Oxalidales, and (COM) clade, but the Malvidae Malvidae Malvidae relationships among the three clades Outgroup Outgroup Outgroup differ depending on whether chloro- plast, mitochondrial, or nuclear gene Chloroplast DNA Mitchondrial DNA Nuclear DNA sequences are used. (B) Based just on the three trees in (A), a majority-rule (B) consensus tree would reflect that 67% of the trees place the COM clade as sis- Fabidae Fabidae 100 ter to Malvidae, and 100% of the trees COM COM support the monophyly of the Fabidae, 67 Malvidae, and COM clade together(left). Malvidae Malvidae The strict consensus would reflect that the relationships among Fabidae, COM, Outgroup Outgroup and Malvidae are unresolved, indicating Majority rule consensus Strict consensus that the precise relationship is unknown (right). (After Sun et al. 2015.) 3. Hybridization or introgression (described in Chapter 5) may transfer some DNA into a different lineage. This is particularly true in the case of chlo- As the example in Figure 2.20 shows, the history of any roplasts and mitochondria, which are not linked to particular gene may not reflect the history of the organ- particular nuclear genomes. In the example of the isms bearing that gene. Nuclear genes may or may not Rosidae in Figure 2.20, Sun et al. (2015) suggest track the history of the nucleus, and chloroplasts and mi- that hybridization might explain the differences in tochondria may or may not have a history different from the gene phylogenies. Allopolyploidy (duplication of that of the nucleus. There are three main reasons for these an entire nuclear genome) also often accompanies differences: hybridization. 1. Mutation is a random process; therefore the phylog- We now have multiple gene trees for many groups of or- eny reconstructed for a particular geneJudd may 4e dif- fer from those for other genes by chanceSinauer alone, Associates as ganisms, and it is possible that none of those gene trees discussed above in Models of Evolution.Morales Studio will be exactly the same as the species tree. For example, Judd4e_02.20.ai 09-04-15many plant taxa have been shown to have the “wrong” 2. Polymorphisms in an ancestral species can be lost chloroplast, presumably because of introgression. In a in descendant species. By chance, this can result in recent striking example, many species of (Sa- a history of the genes that is actually different from lix) have similar chloroplast sequences, even though the the history of the organisms, a process known as species are morphologically distinct, a pattern that is ex- incomplete lineage sorting (Figure 2.21). plained by widespread hybridization (Percy et al. 2014). Individual nuclear genes may also have quite differ- ent histories. Figure 2.22 shows two nuclear gene trees Figure 2.21 A comparison of gene trees and from the Linaria (toadflax, family Plantaginaceae) species trees. In both the left and right diagrams, (Blanco-Pastor et al. 2012). One tree is for the internal the history of a single gene (a gene tree) is shown transcribed spacer of the rRNA (ITS), and the other is for by black lines within green shading, and the his- tory of the species as a whole (the species tree) is shown by the green shading. (Left) This gene A B C A B C diverged at the same time as the species, such that the speciation event leading to B and C hap- pened at the same time as divergence of the gene lineages that now exist in B and C. (Right) In this tree, a polymorphism appears in the lineage leading to species B and C. One of the two gene copies is more closely related to A, a pattern known as incomplete lineage sorting. Sampling of this gene will lead to incorrect inferences about the species tree. (After Avise 1994.)

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates In house Judd4e_02.21.ai 09-04-15 36 Chapter 2

(A) Macrocentrum Versicolores L. meyeri L. ventricosa Diffusae clade 1 L. dalmatica L. genistifolia L. peleponnesiaca L. loeselii L. odora L. tibetica L. vulgaris Diffusae clade 2 L. bubanii L. arvensis/micrantha/simplex L. alpina L. munbyana L. bipunctata L. amethystea L. saxatilis L. licaulis L. tursica L. badalii L. glauca L. propinqua L. depauperata L. polygalifolia L. orbensis L. supina L. aeruginea L. anticaria L. almijarensis L. glacialis L. oblongifolia (B) L. tibetica L. platycalyx L. dalmatica L. cuartanensis L. genistifolia L. saturejoides L. loeselii L. ventricosa L. meyeri L. vulgaris L. odora L. peleponnesiaca Macrocentrum Diffusae clade 1 L. polygalifolia L. depauperata L. orbensis L. anticaria L. almijarensis L. cuartanensis L. aeruginea L. glacialis L. platycalyx Versicolores L. amethystea L. munbayana L. badalii L. licaulis Diffusae clade 2 L. supina L. alpina L. bubanii L. saturejoides L. propinqua L. saxatilis L. bipunctata L. oblongifolia L. glauca L. tursica L. arvensis/micrantha/simplex

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated Judd 4e in any form without express written permission from the publisher. Sinauer Associates Morales Studio Judd4e_02.22.ai 09-21-15 Methods and Principles of Biological Systematics 37

 Figure 2.22 Two distinct phylogenies for members of Gene copies from parent B will group with that parent, the genus Linaria (Plantaginaceae). (A) Tree for the nuclear and the copies from parent D will group with that parent. sequences of the internal transcribed spacer of the rRNA, The phylogeny thus provides two pieces of information: it or ITS. (B) Tree for the nuclear gene alanine:glyoxylate ami- shows that hybridization and polyploidy occurred, and it notransferase1, or AGT1. Branches highlighted in blue or identifies the parents of the cross. green indicate species that are morphologically placed in the same subsection. Note that the positions of these taxa are different in the two phylogenies, as would be expected Using Phylogenetic Trees if the history of the genus involved hybridization and incom- Phylogenetic trees have proved to be a remarkably pow- plete lineage sorting. (After Blanco-Pastor et al. 2012.) erful tool for comparative biologists in general, not just systematists. While phylogenies form the basis of most current classifications, they are also useful for studying alanine:glyoxylate aminotransferase1 (AGT1). Although how characters of plants evolve over time, which is ac- there are clearly similarities between the trees, there are complished by mapping characters on trees. In addition, notable differences, as can easily be seen by compar- phylogenies allow us to estimate the dates of major events ing the positions of the green and blue branches of the in evolutionary history so those dates can be correlated trees. In this case, the authors attributed the differences with geological changes. We will discuss each of these to a combination of hybridization and incomplete lineage topics in turn. sorting (see Chapter 5 for more information on these im- portant biological processes). Constructing a Classification Hybridization followed by polyploidy can also lead to A classification can be constructed in many ways. For ex- complex patterns of gene trees, as has been documented ample, plants can be classified on the basis of their me- in the case of cotton, wheat, and many other crop plants dicinal properties (as they are in some systems of herbal (Renny-Byfield and Wendel 2014). Hybridization can be, medicine) or on the basis of their preferred habitats (as and often is, accompanied by spontaneous doubling of they are in some ecological classifications). Phylogenetic the genome, also known as polyploidy. Organisms, such classification, as we use in this book, attempts to arrange as humans, with two sets of chromosomes are described organisms into groups on the basis of their evolutionary as being diploid. Organisms with more than two sets are relationships. known as polyploid, with the appropriate Greek prefix The theory of classification is a topic with which sys- used to indicate the numbers of chromosome sets. For tematists have been wrestling for centuries, and their example, plants with four sets of chromosomes are tet- struggles have led to a broad and frequently contentious raploid, and those with six sets are hexaploid. Polyploidy literature (see Chapter 3). The principles of phylogenetic that is the result of hybridization is sometimes referred to classification outlined here are commonly but not univer- as allopolyploidy. An allotetraploid contains four sets of sally held. In general, however, classification has several chromosomes, one pair from each diploid parent. goals. A classification is a common vocabulary designed Figure 2.23A illustrates a phylogeny of diploid species to aid communication. Therefore a classification should be A through F and a hybridization event between B and D stable; names that are frequently changed become useless that leads to an allotetraploid containing genomes from for communication. In addition, a classification should be both the B and D parents. When we sequence genes from predictive; that is, the name of a plant should help you to the diploids and tetraploids, the gene tree will look like learn more about that plant and guide you to its literature. the one in Figure 2.23B. DNA sequences from the tet- Systematists generally agree about the goals of classi- raploid plant will appear in two places in the gene tree. fication, but they may disagree profoundly about how to reach those goals. In this text, we take a particular point (A) (B) of view, using phylogenetic classifications throughout, as Diploid Tetraploid A opposed to phenetic or evolutionary school classifications (2x) (4x) B A B D Figure 2.23 (A) Phylogeny of a group of diploid species, B A–F, and an allotetraploid formed following hybridization of C C B D B and D. (B) Phylogeny of a gene from A–F and the allotet- D raploid BD. The single plant BD will have two distinct forms D of every gene, and these forms will fall into two distinct B D E places in the tree. One gene copy will fall with the B paren- tal diploid, as indicated by the black B and gray D, and the E F other gene copy will fall with the D parental diploid, as indi- F cated by the black D and gray B.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.23.ai 09-04-15 38 Chapter 2

(see below and Chapter 3). As far as possible, we recog- phyletic, that is, composed of an ancestor and all its de- nize monophyletic and avoid paraphyletic or polyphyletic scendants. For example, we infer that the Asteraceae are groups. In the few cases in which a nonmonophyletic monophyletic because they have flowers in heads. The family or subfamily has not yet been divided into mono- Asteraceae plus other plants that share the character state phyletic units, we have placed the taxon name in quo- fused petals are also monophyletic; this group also has a tation marks. The monophyly of many genera of angio- name, the Asteridae (or the asterid clade). Similarly, the sperms is questionable, but fewer phylogenetic analyses entire group of plants with tricolpate pollen is monophy- are available at this level, so generally we have not tried letic and is known as the (or the tricolpate clade). to indicate possible or probable or This group has been given a formal Latin name, Eudicot- of genera. yledoneae, under the PhyloCode (see Cantino et al. 2007). A classification can be derived directly from a phylog- In phylogenetic classification, paraphyletic groups are not eny. For example, the shared derived character states in named. Naming a group that includes species from two Figure 1.4 can be arranged in a hierarchy from more in- clades would imply that those species are closely related clusive (e.g., stems woody or petals red) to less inclusive even though they are not. (e.g., leaves hairy, seed coat spiny). This arrangement then This book contains examples of named groups of allows the plants themselves to be arranged in a hierar- plants that we now believe to be paraphyletic. One is chical classification that reflects their evolutionary history. “Caesalpinioideae,” a group of legumes that tradition- The plants could be divided into two groups: one with the ally includes the North American redbuds (Cercis) and shared derived character state red petals and the ancestral the sensitive pea (Chamaecrista) among many others. But character state of herbaceous stems, the other with the the redbuds and sensitive peas are more distantly related shared derived character state woody stems and the an- to one another than either of them is to other legumes. cestral character state white petals. Each of these groups Without quotation marks, the name Caesalpinioideae im- could also be divided into two groups. plies a closer relationship than actually exists. The biological diversity on Earth is the result of ge- As phylogenetic data accumulate, some families previ- nealogical descent with modification, and monophyletic ously thought to be monophyletic are being found to be groups owe their existence to this process. It is appropri- actually paraphyletic; examples include Apocynaceae and ate, therefore, to use monophyletic groups in biological Capparaceae, as traditionally delimited. Over the last de- classifications so that we may most accurately reflect this cades, these families have been recircumscribed to recog- genealogical history. Classifications based on monophy- nize monophyletic groups: Apocynaceae has been com- letic groups are more predictive and of greater heuristic bined with Asclepiadaceae, and Capparaceae has been value than those based on overall similarity or on idio- divided into Capparaceae sensu stricto (s.s., meaning, in syncratic weighting of particular characters (Farris 1979; the strict, or narrow, sense) and Cleomaceae. Donoghue and Cantino 1988). Phylogenetic classifications, because they reflect ge- Naming: Not all groups are named A phylogenetic clas- nealogy, will be the most useful in biological fields such sification attempts to name only monophyletic groups, but as the study of plant distributions (phytogeography), the fact that a group is monophyletic does not mean it host-parasite or plant-herbivore interactions, pollination needs to have a name. The reasons for this are practical. biology, and fruit dispersal and in answering questions We could put every pair of species into its own genus, ev- related to the origin of adaptive characters. Because of its ery pair of genera into its own family, every pair of fami- predictive framework, a phylogenetic classification can lies into its own superfamily, and so on. But such a clas- direct the search for genes, biological products, biocontrol sification would be cumbersome; in addition, it would not agents, and potential crop species. Phylogenetic informa- be stable, because our view of sister species would change tion is also useful for making conservation decisions. Fi- each time a new species was described, and our view of nally, phylogenetic classifications provide a framework for the entire classification would have to shift accordingly. biological knowledge and a basis for comparative studies In practice, many monophyletic groups are not named. linking all fields of biology (Cracraft and Donoghue 2004). For example, the genus Stenanthium (Melanthiaceae) is Constructing a classification involves two steps. The monophyletic and contains four species (Zomlefer et al. first step is delimitation and naming of groups. In a phy- 2001; Zomlefer and Judd 2002; Wofford 2006). Although logenetic classification this step is uncontroversial: named the data show that these four species fall into two mono- groups must be monophyletic. The second step involves phyletic pairs, the two pairs of species are not named, and ranking the groups and placing them in a hierarchy. This few systematists would consider doing so. In another ex- step remains problematic. ample, over half of the genera of the grass family fall into a single large clade that contains four large, traditionally Grouping: Named groups are monophyletic A phy- recognized subfamilies plus two smaller ones. Although logenetic classification reflects evolutionary history and agrostologists refer to this clade as the PACMAD clade attempts to give names only to groups that are mono- (an acronym for Panicoideae-Arundinoideae-Chloridoi-

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 39 deae-Micrairoideae-Aristidoideae-Danthonioideae), it has that have been named in the past can—and we would no formal Latin name. argue should—retain their names. This is yet another ar- No formal rules exist for deciding which monophyletic gument against formally naming the PACMAD clade of groups to name, but several criteria have been suggested, the grasses, because it would entail an unnecessary set of and some criteria are in common use despite not being changes affecting long-standing taxonomic usage (Back- fully articulated. A major criterion—perhaps the major lund and Bremer 1998; Stevens 1998). criterion—is the strength of the evidence supporting a group. Ideally, only clades with strong support should be Ranking: Ranks are arbitrary Having decided which formally recognized and named in classifications. This monophyletic groups to name, we still have the question makes sense if a classification is to function as a common of exactly how to name them. The groups could, for ex- vocabulary. ample, be numbered, and a central index could list what Names are most useful if they can be defined, and the is included in each numbered group. This approach is more precise the definition the better. In other words, if similar to the system used by telephone companies to a clade is to be named, it should have a set of characters organize telephone numbers. The difficulty, of course, is by which it can be distinguished from other clades, or that without a telephone book (a central index) and/or an diagnosed. This criterion is also important for nomen- excellent memory, the system is inaccessible. clatural stability: if the meaning of a name shifts every Biological classification attempts to provide a work- time a new phylogeny is produced or a new character is ing vocabulary that conveys phylogenetic information, examined, the name becomes effectively meaningless. yet can be learned by biologists who are not themselves Similarly, if, for example, the only way a field biologist primarily systematists. Because a phylogeny is similar in can identify an organism is by knowing whether it has structure to a hierarchy, in which small groups are in- an alanine or a serine at position 281 in its rubisco large cluded in larger groups, which themselves are included in subunit molecule, he may not find the classification much still larger groups, it makes sense for the classification to help. If, on the other hand, he knows that the organism reflect the hierarchy. is a grass with a particular spikelet structure, he can eas- Botanical classification uses a system developed in the ily and reliably infer many aspects of its biology. (Lack of eighteenth century in which taxa are assigned particu- an obvious morphological synapomorphy is one of sev- lar ranks, such as kingdom, phylum, class, order, family, eral reasons that the PACMAD clade of the grasses is not genus, and species (i.e., Linnaean ranks; see Chapter 3 given a name.) The characters used for classification do and Appendix 1). A classification of named monophyletic not have to be those used for identification, but many sys- groups should be logically consistent with the phyloge- tematists prefer to name clades that are easily recognized netic relationships hypothesized for the organisms being morphologically. classified. So, the categorical ranks of a Linnaean clas- Another criterion for naming is size of the group. Hu- sification are used to express sister-group relationships. man memory is easily able to keep track of small num- Even though monophyletic taxa are considered to bers of items (in the range of three to seven) (Stevens represent real groups that exist in nature as a result of 1998), but to organize and remember larger numbers of the historical process of evolution, the categorical ranks items requires additional mnemonic devices. (As an ex- themselves are only mental constructs. They have only ample, consider how many nine-digit zip codes you can relative (not absolute) meaning (Stevens 1998). In other remember compared with the five-digit variety, or with words, the familial level is less inclusive than the ordi- seven-digit telephone numbers.) Dividing a large group nal level and more inclusive than the generic level, but into smaller groups is a way to organize one’s thinking no criteria are available to indicate whether a particular about large numbers of taxa. For example, the genus Ste- taxon, such as the angiosperms, should be recognized at nanthium could be redefined to include only Stenanthium the level of phylum, class, or order. gramineum and S. diffusum, and a new genus could be de- In Figure 2.24, a cladogram of imaginary taxa A scribed to include S. densum and S. leimanthoides. There through E is converted into a hierarchical classification seems little reason to do this, however, because four spe- according to Linnaean categorical ranks. Note that sub- cies is not a difficult number to keep track of. That said, genus DE is nested within genus CDE, which is in turn there seems little reason to divide a large group if well- nested within family ABCDE. (But, we could have treated supported, morphologically distinct clades cannot be clade ABCDE as an order, clade CDE as a family, and identified within it, and even then there is no necessity clade DE as a genus.) Often, however, to fully express the to divide it. sister-group relationships (in the cladogram), one needs A third naming criterion is nomenclatural stability. A more ranks than are available (in the taxonomic hierar- classification is ultimately a vocabulary, a means of com- chy), even after creating additional ranks by use of the munication. It cannot function this way if the meanings of prefixes super- and sub-. the names continually change. Thus, given a set of well- Even though ranking is arbitrary, the criteria described supported, diagnosable, monophyletic groups, groups here for deciding which groups to name can also be ap-

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 40 Chapter 2

Classi cation 1 Classi cation 2 Some systematists have proposed abandoning the Lin- naean system altogether and replacing it with a “phylo- genetic taxonomy.” Full exploration of this possibility is A Family ABCDE Family ABCDE beyond the scope of this text, but we will briefly address Genus AB Genus AB the arguments against the use of Linnaean ranks here. B Species A Species A Because rank is arbitrary, a genus in one family may not be the same age as, encompass the same amount of varia- C Species B Species B tion as, or indeed have anything in common at all with a genus in another family—other than the fact that they are Genus CDE Genus C D both monophyletic groups. Trained systematists are gen- Subgenus C Species C erally aware of this (Darwin was, for example) and realize that genera, families, and so on are not comparable units E Species C Genus DE (Stevens 1997). Some scientists, however, frequently use Subgenus DE Species D such categories as though they were real. For example, it is common to measure plant diversity by listing the num- Species D Species E ber of families represented by a local flora, even though Species E family does not say anything in particular about a unit. If rank is arbitrary, then one logical step would be to Figure 2.24 Alternative classifications based on the eliminate ranks altogether. Taxa would be placed in named phylogeny of a hypothetical group of taxa A, B, C, D, and groups, but the groups would not be designated as genus, E. The classification on the left uses four categorical ranks family, order, or any other rank. Such categorization al- (family, genus, subgenus, and species); the one on the right ready exists informally, particularly among groups above uses only three ranks (family, genus, and species). Both are the level of orders. The eudicots, for example, are widely correct. recognized as monophyletic but are not given a particular Linnaean rank. Similarly, few systematists worry about whether the angiosperms should be recognized as a divi- plied to deciding the level at which to rank a group (see sion, class, subclass, superorder, or other rank; they are Stevens 1998 for full discussion). Nomenclatural stability clearly monophyletic and designated by the non-Linnaean again becomes important here. For example, the earliest- name angiosperm. Alternatively, rank could be based on diverging lineage in the family Poaceae includes only two absolute age of a clade, but this would cause major disrup- extant genera, Anomochloa and Streptochaeta (Figure 2.25). tions, even if age could be assigned unambiguously. Thus one could, in principle, create a new family for Ano- Eliminating ranks becomes more problematic among mochloa and Streptochaeta; after all, it would be monophy- orders, families, and genera. Groups assigned to those Juddletic and4e would leave the Poaceae as also monophyletic. ranks are familiar, their names are in common use, and Sinauer Associates MoralesFor the Studio purposes of stability, however, it makes sense to they tap into the implicit hierarchy we use in language, Judd4e_02.24.aileave the two genera09-04-15 in Poaceae, where they have been so an entirely new sort of nomenclature is unlikely to be given a subfamilial name: Anomochlooideae. accepted rapidly or without protest. Nonetheless, an al- ternative system of phylogenetic nomenclature, known as the PhyloCode, is being developed. The PhyloCode is de- signed entirely outside the rules of the International Code of Nomenclature for Algae, Fungi and Plants (ICN), which All other Poaceae governs the use of Linnaean ranks and has long been used

Anomochloa Streptochaeta by all plant taxonomists (see Appendix 1). It is an alter- native nomenclatural system rather than a revision of the existing system (see the PhyloCode website at www.ohiou. edu/). Many phylogenies are only partially resolved, so pre- cise placement of taxa is impossible given the available data. This means that some species cannot be placed for certain in a genus, and some genera cannot be reliably assigned to a family. The current system allows uncertain placements above the rank of species to be reflected by the category incertae sedis—literally “of uncertain po- sition.” An alternative would be a rank-free system, in Figure 2.25 Phylogeny of Poaceae, showing the position which neither placement in a larger group nor naming all of the genera Anomochloa and Streptochaeta. branches of a dichotomy or polytomy would be necessary.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.25.ai 08-12-15 Methods and Principles of Biological Systematics 41

Figure 2.26 Phylogeny and a resulting D 1 E 10 C 3 B 11 A nonphylogenetic classification produced according to the evolutionary school. The classification includes a mix of monophyletic and paraphyletic groups, separated from Autapomorphies, one another by morphological “gaps.” used to separate Monophyletic A from CB group, recognized Paraphyletic group, because recognized because separated phenetically uniform from CB and A by “gap” in were secondary. Thus a group could be recognized on pattern of variation the basis of some combination of derived and ancestral, unique and shared characters (Figure 2.26). Importance was given to the recognition of “gaps” in the pattern of variation among phylogenetically adjacent groups (Simp- Classi cation son 1961; Ashlock 1979; Cronquist 1987). Characters con- Family ABCDE sidered to be evolutionarily (or ecologically) significant Genus DE were stressed, and the expertise, authority, and intuition Species D of individual systematists were central. This sort of clas- Species E Genus CB sification led to recognition of the classical group “dicots” Species C as distinct from monocots. The morphological similarity Species B of the “dicots” was afforded more importance than the Genus A fact that “dicots” are not a monophyletic group. Species A It has been said that systematics is as much an art as a science (although this statement begs the question of how one might define art and science), in part because The authors of this text have been involved in reclas- so many aspects of the discipline have seemed to have sifications of genera, families, and orders on the basis of no objective basis. One fortunate result of phylogenetic phylogenetic data and have found that—as long as the systematics is that at least one major aspect of systemat- phylogeny is clear—use of the standard Linnaean hier- ics—the delimitation of groups—has become formalized archy is quite easy (especially when it is supplemented by such that there is general agreement on how it should be unranked informal names or PhyloCode names). When done. Whereas phenetic and evolutionary classifications the phylogeny is unclear, one should generally wait for were ambiguous about grouping criteria, phylogenetic Judd 4e more data before modifying the classification. classifications are precise. A named group can be taken Sinauer Associates as monophyletic, including all descendants of a single Morales Studio Judd4e_02.26.aiComparing 08-12-15 phylogenetic classifications with those common ancestor. derived using other taxonomic methods Not all tax- onomists use phylogenetic methods, although this is the Describing Evolution: majority approach. A few systematists have held the view Mapping Characters onto Trees that although evolution has occurred, parallelism and A major aspect of plant systematics is understanding the reversal are so common that the details of evolutionary evolution of particular characteristics—morphological, history can never be deciphered. This point of view led physiological, environmental, or other—of plants. Phy- to a school of systematics known as . Pheneti- logenies can be used to describe the evolutionary process cists argued that because evolutionary history can never and to develop hypotheses about adaptation, morphologi- be unequivocally detected, organisms might best be clas- cal and physiological change, or biogeography, among sified according to overall similarity. Thus they placed many other uses. This is the kind of study that system- similar organisms together in a group and very different atists frequently engage in because the details of character organisms in different groups (Sneath and Sokal 1973). evolution lead to hypotheses about how natural selection This approach is discussed in more detail in Chapter 3. has worked. In addition, when constructing classifica- The school of evolutionary taxonomy also differed tions, one frequently wants to know what morphological from phylogenetic taxonomy in its approach to classifica- characters can be attributed to and distinguish a particu- tion. In this school, the morphological similarity within lar monophyletic group. a group was of utmost importance, and monophyly and Making use of a phylogeny in this way requires that paraphyly (in the strict cladistic senses of those words) changes in particular characters be assigned to particular

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 42 Chapter 2 nodes in the phylogenetic tree. For example, we may wish Fused: Enkianthus to determine whether a group of cold-tolerant plants in the Arctic is derived from a different cold-tolerant Arc- Fused: Empetrum tic group or whether it is derived from cold-intolerant Fused: Rhododendron sect. 1 plants from lower latitudes. This requires that we create a hypothesis about the nature of the plants represented Fused: Rhododendron sect. 2 by each ancestral node on the tree and estimate whether they are cold-tolerant or cold-intolerant. Generating this Free: Rhododendron subsect. Ledum sort of hypothesis is known as mapping the character onto the tree. Fused: Rhododendron subsect. 3 The characters mapped onto the tree may be mor- Fused: Cassiope phological characters, physiological characters such as drought tolerance or high-efficiency photosynthesis, en- Fused: Harrimanella vironmental characters such as mean precipitation in July or density of shade, or biogeographic characters such as Fused: Lyonia location of plants on particular continents or in particular biomes. The characters may be quantitative or qualitative Fused: Epacris (although for methodological reasons, most are qualita- Free: Vaccinium sect. Oxycoccus tive). The process of identifying homologous structures, defining characters, and delimiting character states is the Fused: other Vaccinium same as that described for building a morphological clas- sification (see Box 2A). Figure 2.27 Phylogeny of a portion of the Ericaceae. The process of mapping characters is generally sepa- The genus Rhododendron is more complex than indicated rate from that of generating a tree. Mapping can be done here, but for simplicity we note three sections, two num- using only the topology of the tree or can also incorporate bered 1 and 2 and one divided into subsection Ledum and information on branch lengths. It can be done in a par- subsection 3. For the sample of species investigated, two simony, likelihood, or Bayesian framework. Note that in changes to free petals are hypothesized, with the changes this sort of study the tree is taken as given. This means indicated by green bars. (After Gillespie and Kron 2010.) that the method used to produce the tree does not need to be the same as the method used to map characters onto it. It does not matter how the tree was produced, and indeed sionJudd more 4e closely. If we were studying only species of Vac- Sinauer Associates the data used to produce the tree are not important. ciniumMorales, Studio we would have no way of knowing whether fused petalsJudd4e_02.27.ai were ancestral 09-04-15 or derived (Figure 2.28A). There Mapping in a parsimony framework It is perhaps easi- must have been one genetic change, but it could have est to start by thinking about simple parsimony map- happened as easily in the lineage leading to Vaccinium ping using only the topology of the tree, that is, ignoring sect. Oxycoccus as in the lineage leading to the other Vac- branch lengths for the time being. Consider a group of cinium species. Only by reference to the outgroup Epacris plants for which the phylogenetic tree is known; a good can we determine when petal fusion was lost. Because example is the Ericaceae, for which much information Epacris has fused petals, free petals must have originat- is available (Figure 2.27). Assume for the purposes of ed within Vaccinium sect. Oxycoccus. This is the same as this discussion that each of the terminal genera really saying that the ancestor of other Vaccinium species plus is monophyletic, as demonstrated by studies of multiple Vaccinium sect. Oxycoccus had fused petals. If we were to species of each. Then consider a study that is concerned postulate that the ancestor had free petals (Figure 2.28B), with the gain or loss of fused petals, which are intimately we would need two changes to fused petals: one in Epac- connected with the evolution of pollination systems. Fig- ris and one in the blueberries. The same argument applies ure 2.27 shows the observed character states (fused or free in the case of Rhododendron subsect. 3 and Rhododendron petals) for the genera. subsect. Ledum (see Figure 2.27). It seems trivially obvious from looking at the distri- Now suppose we were studying only species of Vac- bution of characters and character states that free petals cinium, but this time, instead of using Epacris or other must have evolved once in the lineage leading to Rhodo- Ericaceae as outgroups, we used only Rhododendron sub- dendron subsect. Ledum (Labrador tea) and again in the sect. Ledum. This could easily be the case if material of the lineage leading to Vaccinium sect. Oxycoccus (cranberries). other genera were hard to obtain or if those genera were Phrased another way, the ancestor of Vaccinium sect. Oxy- extinct and we did not know they had existed. Now the coccus and all other vacciniums (blueberries) had fused most parsimonious conclusion is that the ancestor of all petals, as did the ancestor of Rhododendron subsections vacciniums had free petals and that in response to some Ledum and 3. However, examine this “obvious” conclu- unknown selective pressure, there was a change to fused

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 43

(A) Free: Vaccinium sect. Oxycoccus Figure 2.28 (A) Two Vaccinium taxa differ in character states. It is impossible to determine from only this informa- Ancestor fused Fused: Other Vaccinium tion what the character state of the ancestor was, because either assumption involves one change in one descendant Free: Vaccinium sect. Oxycoccus lineage. (B) The addition of an outgroup determines the character state of the ancestor. In this case, it is more parsi- Ancestor free Fused: Other Vaccinium monious to assume that the ancestor had fused petals.

the same study of Vaccinium, but now use both Rhodo- (B) Fused: Epacris dendron subsect. Ledum and Rhododendron subsect. 3 as Free: Vaccinium sect. Oxycoccus outgroups. In this case the direction of change is ambigu- ous. It is as simple to postulate that the ancestor of the Ancestor fused Fused: Other Vaccinium group had fused petals and that there were two changes to free petals as it is to postulate that the ancestor had free petals and that there were two changes to fused. These Fused: Epacris two choices are known as equally parsimonious recon- Free: Vaccinium sect. Oxycoccus structions. For many characters on many trees, there are multiple equally parsimonious reconstructions. In other Ancestor free Fused: Other Vaccinium words, there are multiple equally good hypotheses about the direction and timing of character state change. Ambiguity can also come from including taxa for which petals (Figure 2.29A, bottom). This is exactly the opposite the character state is not known. Suppose, for example, conclusion from the one reached in the previous example, and two new taxa are discovered such that, on the basis of the only difference is the genera included in the analysis. other characters, new species 1 is clearly sister to blueber- One might try to improve the situation by using ad- ries (other Vaccinium) and new species 2 is clearly sister to ditional outgroups (Figure 2.29B). For example, consider Vaccinium sect. Oxycoccus (Figure 2.30). In addition, sup- pose that it is unclear whether the petals are fused or free. This type of ambiguity is more common than you might (A) Free: Rhododendron subsect. Ledum think; it can occur when the original description is vague and/or illustrations are unclear, or when the original plant Free: Vaccinium sect. Oxycoccus is known only from fruiting material. In this example we do not know what the ancestral state was for Vaccinium, Fused: Other Vaccinium Ancestor fused so we cannot make any hypothesis about the direction of evolutionary change. Nor can we be sure that the charac- Free: Rhododendron subsect. Ledum ter fused petals is a synapomorphy for the genus.

Free: Vaccinium sect. Oxycoccus Mapping in a likelihood or Bayesian framework In Ancestor free Fused: Other Vaccinium the process of producing a likelihood or Bayesian tree, the analysis automatically assigns the probability of each ancestral state for each nucleotide at each ancestor (see Maximum Likelihood and Bayesian Methods on pages (B) Free: Rhododendron subsect. Ledum 32–34). In the example described above (see Figure 2.19), Judd 4e we estimated the likelihood of the observed data if an- Fused: Rhododendron subsect. 3 Sinauer Associates cestor X had an A at a particular position, and then we Morales Studio estimated the probability of the data if X had a C or a Free: Vaccinium sect. Oxycoccus Judd4e_02.28.ai 08-11-15 G or a T. Using exactly the same process, we could esti- Ancestor fused Fused: Other Vaccinium

Figure 2.29 (A) Analysis of character state change in Free: Rhododendron subsect. Ledum Vaccinium using a different outgroup. Note that the inference Fused: Rhododendron subsect. 3 of the ancestral state is exactly the opposite of that reached when Epacris is used as an outgroup. (B) Analysis of character Free: Vaccinium sect. Oxycoccus state change in Vaccinium using two outgroups that differ in state. It is now impossible to determine the character state of Ancestor free Fused: Other Vaccinium the ancestor.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.29.ai 09-04-15 44 Chapter 2

Petals free Also unlike many parsimony methods, maximum or fused Unknown state: New species 1 likelihood methods take into account the lengths of the branches. Thus if a branch is short (i.e., there were few Fused: other Vaccinium mutations between an ancestor and a descendant), then change of the character state is less likely. Conversely, es- timation of ancestral states over a very long branch be- Petals free Free: Vaccinium sect. Oxycoccus or fused comes highly uncertain. Because the Bayesian posterior probability is computed Petals free Unknown state: New species 2 from the likelihood multiplied by the prior probability, or fused Bayesian estimates of ancestral character states are inti- Figure 2.30 Addition of species for which the charac- mately connected to likelihood values. Similar to maxi- ter state is unknown can prevent any inference about the mum likelihood reconstructions, Bayesian approaches ancestral state. to mapping ancestral states can incorporate uncertainty in the tree topology and in parameters of the underlying model of evolution. While this often means that the an- cestral character state appears ambiguous, most system- mate the likelihood that we would observe any other set of characteristics given a particular ancestral state. This would lead to a maximum likelihood estima- R. adenocalyx tion of a particular character for each ancestor in the R. tomestosa tree. Because we would be working in a likelihood framework, however, the ancestral states would not R. verbasciformis be simply binary (e.g., petals either fused or free, or R. incomta plant cold-tolerant or cold-intolerant), but rather the likelihood that a particular state of an ancestor would R. beyrichiana lead to the states of descendants that we see today. An R. magni ora example is shown in Figure 2.31, which shows corolla color mapped onto a portion of the phylogeny of the R. donnelsmithii genus Ruellia (Acanthaceae) (Tripp and Manos 2008). The common ancestor of the clade that includes R. R. gemini ora beyrichiana through R. geminiflora is inferred to have R. hookeriana had purple flowers, a result that likely would have appeared also if the mapping had used a parsimony R. leucantha algorithm. However, the corolla color of the common R. alboviolacea ancestor of the entire clade is uncertain in most trees (indicated by the large gray region of the pie chart on R. amoena the far left), but there is a small probability, R. foetida indicatedJudd 4e by the very small pie slice, that the corollaSinauer might Associates have been purple. The pie charts R. matagalpae at theMorales other Studio nodes also indicate considerable R. mcvaughii uncertaintyJudd4e_02.30.ai about the 09-04-15 ancestral state. R. novogaliciana

R. petiolaris

Figure 2.31 Mapping of a character R. erythopus using maximum likelihood on a set of 1000 R. tarapotana Bayesian trees. A portion of the phylogeny of the genus Ruellia (Acanthaceae) shows R. yunmaguensis evolution of corolla color. The colors of the R. humboltiana taxon names show corolla color, except that gray indicates white corollas. Each pie R. longilamentosa chart at a node indicates the likelihood of that particular color at that hypothetical R. tubi ora ancestor. Gray in the pie charts indicates R. densa uncertainty in reconstructions in individual trees. (After Tripp and Manos 2008.) R. villosa

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.

Judd 4e Sinauer Associates Morales Studio Judd4e_02.31.ai 08-11-15 Methods and Principles of Biological Systematics 45 atists now see this as an advantage because it prevents a smoothing methods, the rate of molecular evolution is false sense of certainty. permitted to vary from one branch to the next, but the amount of change is specified by a number known as a Assigning Dates to Nodes in Phylogenies smoothing parameter. Perhaps not surprisingly, changes A phylogenetic tree is a diagram that depicts time. As in the smoothing parameter can give different estimates such it can be used to assess both relative time (Did event of the date of an internal node. X occur before event Y?) and absolute time (How many Once the rate of mutation is forced to be approximately years ago did event X occur?). constant throughout the tree, the next step is to deter- Many questions in evolution can be answered with mine how many mutations have occurred in a particular knowledge of only relative time. The actual date of an period of time. In other words, if a mutation is one tick event is not especially relevant, only that one event oc- of a clock, how fast is the clock ticking? Answering this curred before or after another. In general, ancestors deep- question requires that the clock be calibrated. To do this, er in the tree must have existed before ones at more shal- we must assign one or more fossils a position in the phy- low nodes. Likewise, if characters are mapped onto the logeny based on whatever characters are available. The tree, then the phylogeny provides information on the rela- date of the fossil then provides a minimal age of the node tive timing of character state change. Thus the phylogeny to which it attaches. For example, if we have a fossil that of the Ericaceae in Figure 2.27 shows that free petals orig- is dated to 50 million years ago (mya), and if that fossil is inated after fused petals in this group, even though we well enough preserved that we can be sure that it has the don’t know the exact dates for any of the nodes of the tree. synapomorphy of a particular clade, then the clade must Although the tree alone can answer questions of rela- be at least 50 million years old. The fossil does not give tive time, many evolutionary questions require more pre- us a maximum age, because the group could have been cise data on how fast mutations are occurring and what around since before 50 mya; we have no way of knowing. geological date should be assigned to a particular node. The tree in Figure 2.32 shows the history of fern ge- For example, a dated phylogeny is needed to determine nus Osmunda, its sister genus Osmundastrum (which was whether Gnetaceae underwent most speciation during formerly included in Osmunda), and two outgroups, To- the Cretaceous, or whether Malvaceae is old enough to dea and Leptopteris. Node 1 is known as the stem node, have been affected by the breakup of Gondwana. If spe- the point of origin of Osmunda, and node 2 is the crown ciation in Gnetaceae or Malvaceae is more recent than node, the point of the earliest speciation events within about 80 million years ago, then it was probably not af- the genus. Suppose you find a fossil that is dated to 153 fected by the breakup. mya and it has the synapomorphies of both Osmunda s.s. When molecular data first became available, there was and the Osmundastrum + Osmunda clade. Should the fossil some hope that DNA might mutate in a clocklike fashion, be attached to the stem or crown node? The generic sy- so every mutation would equal one “tick of the clock.” napomorphy may have arisen anywhere along the inter- If this were the case, then we would only have to pro- node between the stem and crown node, but the earliest duce a molecular phylogeny and count the number of possible time that synapomorphy could have arisen was mutations and we could estimate time. However, DNA immediately after the stem node. Thus we would assign does not provide a well-behaved clock. Different genes 153 million years as the minimal age of the stem node. in different lineages have rates of mutation that are far Note that the stem node could be older—in fact, quite a more different than would be expected purely by random lot older—and that 153 mya is merely the latest that spe- chance. This means that over the same period of time, ciation event could have occurred. some branches of a tree accumulate mutations more rap- The methods just described separate the steps of mak- idly than others and thus appear as longer branches on a ing the tree clocklike and then calibrating it. However, in phylogenetic tree. For example, woody plants in general another set of methods, a Bayesian framework is used to tend to have slower rates than herbaceous plants (Smith integrate over a broad range of possible rates and possible and Donoghue 2008). calibration points. These methods are becoming increas- Methods for assigning dates to phylogenies involve a ingly popular because they require less precise (and hence series of steps. First, basic probability theory is used to presumably more realistic) assumptions about calibration test whether the mutation rate in any particular lineage is dates. Still other methods are being developed that are behaving in an approximately clocklike fashion. In some less heavily influenced by the date of the oldest fossil and cases, the DNA sequences all mutate in a way consistent that incorporate more extensive morphological data along with the same underlying rate, but in most trees there is with DNA data (see Grimm et al. 2015 for discussion). significant variation in mutation rate from one branch to Despite considerable improvement in methods for es- the next. In the latter case, the rate of molecular evolu- timating dates of ancestral nodes, estimates of dates have tion must be made predictable throughout the tree. There huge errors associated with them. For example, in dating are many ways to do this, and all involve complex sets the phylogeny of Osmunda shown in Figure 2.32, Grimm of assumptions. In one set of methods, known as rate- et al. (2015) tested four different analytical methods,

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. 46 Chapter 2

Todea

Leptopteris

Osmundastrum (Osmunda) cinnamomeum (-a)

Osmunda claytoniana 1 Osmunda vachellii Osmunda stem node 2 Osmunda banksiifolia Osmunda Osmunda javanica Figure 2.32 Phylogeny of the genus crown node Osmunda Osmunda japonica and relatives. The stem node of Osmunda is the node marking the split between Osmunda and Osmun- Osmunda lancea dastrum (node 1), whereas the crown node is the one Osmunda regalis marking the earliest speciation event within Osmunda (node 2). The distinction is important for calibrating the phylogeny. (After Grimm et al. 2015.) rable to that of extant taxa. While some fossils are indeed astonishingly well preserved, others are less so. In addi- tion, it is often hard to double-check the paleontological which dated the split between Osmunda and Osmundas- data, since the specimens usually have to be checked in trum as 238, 182, 176, or 156 mya. While the authors offer the museums that hold them. In addition, the stratigraphy reasons to prefer the oldest date, the discrepancies are still can be hard for a nonspecialist to assess. greater than 25%. Likewise, estimates of the age of the In summary, current methods for dating phylogenies crown node of the angiosperms mostly place the diver- are constantly improving, but any assessment of the time gence between Amborella and all other angiosperm taxa of origin of a particular clade still needs to be viewed in the range of 210–130 mya, although some estimates are skeptically. Estimates of dates can vary by tens of millions 270 mya (Stevens 2001 onwards; Bell et al. 2010; Moore et of years, which makes correlating biological events with al. 2010; Magallón et al. 2015). This is a very large period changes in geology or climate risky at best. Combining of Earth history. improved analytical methods with a better understanding All dates assume that the fossils used for calibration of the fossil record and of morphological evolution pro- have been identified and dated correctly and that their vides the best hope for a temporal understanding of plant morphology has been interpreted in a way that is compa- diversification. Judd 4e Sinauer Associates MoralesLi Studioterature Cited & Suggested Readings Judd4e_02.32.ai 08-11-15 Items marked with an asterisk are espe- Blanco-Pastor, J. L., P. Vargas and B. E. Pfeil. classification of the subfamily Ericoi- cially recommended to those readers who 2012. Coalescent simulations reveal hy- deae (Eriaceae). Mol. Phylogenet. Evol. 56: are interested in further information on bridization and incomplete lineage sorting 343–354. in Mediterranean Linaria. PLoS ONE 7(6): Grimm, G. W., P. Kapli, B. Bomfleur, S. the topics discussed in this chapter. e39809. Doi: 10.1371/journal.pone.0039809 McLoughlin and S. S. Renner. 2015. Us- Ashlock, P. D. 1979. An evolutionary system- Cantino, P. D., J. A. Doyle, S. W. Graham, W. ing more than the oldest fossils: dating atist’s view of classification. Syst. Zool. 28: S. Judd, R. G. Olmstead, D. E. Soltis, P. S. Osmundaceae with three Bayesian clock 441–450. Soltis and M. J. Donoghue. 2007. Towards a approaches. Syst. Biol. 64: 396–405. Avise, J. C. 1994. Molecular markers, natural phylogenetic nomenclature of Tracheophy- *Hall, B. G. 2011. Phylogenetic trees made easy, history and evolution. Chapman & Hall, ta. Taxon 56: 1E–44E. 4th ed. Sinauer Associates, Sunderland, New York. *Cracraft, J. and M. J. Donoghue. 2004. As- MA. Backlund, A. and K. Bremer. 1998. To be sembling the tree of life. Oxford University *Hennig, W. 1966. Phylogenetic systematics. or not to be: Principles of classification Press, Oxford and New York. University of Illinois Press, Urbana. and monotypic plant families. Taxon 47: Cronquist, A. 1987. A botanical critique of Jansen, R. K. and J. D. Palmer. 1987. A chlo- 391–400. cladism. Bot. Rev. 53: 1–52. roplast DNA inversion marks an ancient *Baum, D. and S. Smith. 2013. Tree thinking: *Donoghue, M. J. and P. D. Cantino. 1988. evolutionary split in the sunflower family an introduction to phylogenetic biology. Rob- Paraphyly, ancestors, and the goals of (Asteraceae). Proc. Natl. Acad. Sci. USA 84: erts & Co., Greenwood Village, Colorado. taxonomy: A botanical defense of cladism. 5818–5822. *Bell, C. D. 2015. Between a rock and a hard Bot. Rev. 54: 107–128. Jansen, R. K., C. Kaittanis, S.-B. Lee, C. Saski, place: Applications of the “” Farris, J. S. 1979. The information content J. Tomkins, A. J. Alverson and H. Daniell. in systematic biology. Syst. Bot. 40: 6–13. of the phylogenetic system. Syst. Zool. 28: 2006. Phylogenetic analyses of Vitis (Vi- Bell, C. D., D. E. Soltis and P. S. Soltis. 2010. 458–519. taceae) based on complete chloroplast ge- The age and diversification of the angio- Gillespie, E. and K. Kron. 2010. Molecular nome sequences: effects of taxon sampling sperms revisited. Am. J. Bot. 97: 1296–1303. phylogenetic relationships and a revised and phylogenetic methods on resolving

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher. Methods and Principles of Biological Systematics 47

relationships among rosids. BMC Evol. Biol. natural system, comparative anatomy, and *Stuessy, T. R., D. J. Crawford, D. E. Soltis 6: 32–46. ). Geest & Portig, Leipzig. and P. S. Soltis. 2014. Plant systematics: The Maddison, D. and W. Maddison. 2005. Mac- Renny-Byfield, S. and J. F. Wendel. 2014. origin, interpretation, and ordering of plant Clade 4.0. Sinauer Associates, Sunderland, Doubling down on genomes: polyploidy biodiversity. Regnum Vegetabile Volume MA. and crop plants. Am. J. Bot. 101: 1711–1725. 156. Koeltz Scientific Books, Königstein, Magallón, S., S. Gómez-Acevedo, L. L. Sán- Ronquist, F., M. Teslenko, P. van der Mark, D. Germany. chez-Reyes and T. Hernández-Hernández. L. Ayres, A. Darling, S. Höhna, B. Larget, Sun, M., D. E. Soltis, P. S. Soltis, X. Zhu, J. G. 2015. A metacalibrated time-tree docu- L. Liu, M. A. Suchard and J. P. Huelsen- Burleigh and Z. Chen. 2015. Deep phylo- ments the early rise flowering plant phylo- beck. 2012. MrBayes 3.2: Efficient Bayesian genetic incongruence in the angiosperm genetic diversity. New Phytol. DOI: 10.1111/ phylogenetic inference and model choice clade Rosidae. Mol. Phylogenet. Evol. 83: nph.13264 across a large model space. Syst. Biol. 61: 156–166. Moore, M. J., P. S. Soltis, C. D. Bell, J. G. Bur- 539–542. Tripp, E. A., and P. S. Manos. 2008. Is floral leigh and D. E. Soltis. 2010. Phylogenetic Simpson, G. G. 1961. Principles of animal tax- specialization an evolutionary dead-end? analysis of 83 plastid genomes further re- onomy. Columbia University Press, New Pollination system transitions in Acantha- solves the early diversification of eudicots. York. ceae. Evolution 62: 1712–1737. Proc. Natl. Acad. Sci. USA 107: 4623–4628. Smith, S. A. and M. J. Donoghue. 2008. Rates Van Dijk, E. L., H. Auger, Y. Jaszczyszyn Needleman, S. B., and C. D. Wunsch. 1970. A of molecular evolution are linked to life his- and C. Thermes. 2014. Ten years of next- general method applicable to the search for tory in flowering plants. Science 322: 86–89. generation sequencing technology. Trends similarities in the amino acid sequence of Sneath, P. H. A. and R. R. Sokal. 1973. Numer- Genet. 30: 418–426. two proteins. J. Mol. Biol. 48: 443–453. ical taxonomy. Freeman, San Francisco. Wofford, B. E. 2006. A new species of Stenan- Percy, D. M., G. W. Argus, Q. C. Cronk, A. J. Stevens, P. F. 1997. How to interpret botanical thium (Melanthiaceae) from Tennessee, Fazekas, P. R. Kasanakurti, K. S. Burgess, classifications: Suggestions from history. U.S.A. Sida 22: 447–459. B. C. Husband, S. G. Newmaster, S. C. H. BioScience 47: 243–250. Zomlefer, W. B. and W. S. Judd. 2002. Resur- Barrett and S. W. Graham. 2014. Under- Stevens, P. F. 1998. What kind of classification rection of segregates of the polyphyletic standing the spectacular failure of DNA should the practising taxonomist use to be genus Zigadenus s.l. (Liliales: Melanthiace- barcoding in willows (Salix): Does this re- saved? In Plant diversity in Malesia III: Pro- ae) and resulting new combinations. Novon sult from a trans-specific selective sweep? ceedings of the 3rd International Flora Malesi- 12: 299–308. Mol. Ecol. 23: 4737–4756. ana Symposium 1995, J. Dransfield, M. J. E. Zomlefer, W. B., N. H. Williams, W. M. Whit- Refulio-Rodriguez, N. F. and R. G. Olmstead. Coode and D. A. Simpson (eds.), 295–319. ten and W. S. Judd. 2001. Generic circum- 2014. Phylogeny of Lamiidae. Am. J. Bot. Royal Botanical Gardens, Kew, London. scription and relationships in the tribe 101: 287–299. Stevens, P. F. 2001 onwards. Angiosperm Melanthieae (Liliales, Melanthiaceae), with Remane, A. 1952. Die Grundlagen des naturli- phylogeny website. Version 12, July 2012 emphasis on Zigadenus: Evidence from ITS chen Systems, der vergleichenden Anatomie [and more or less continuously updated and trnL-F sequence data. Am. J. Bot. 88: and der Phylogenetik (The principles of the since]. http://www.mobot.org/MOBOT/ 1657–1669. research/APweb/.

© 2016 Sinauer Associates, Inc. This material cannot be copied, reproduced, manufactured or disseminated in any form without express written permission from the publisher.