
Syst. Biol. 56(2):163-181, 2007
Copyright © Society of Systematic Biologists
ISSN: 1063-5157 print / 1076-836X online
DOI: 10.1080/10635150701258787

Widespread Genealogical Nonmonophyly in Species of Pinus Subgenus Strobus

JOHN SYRING,1* KATHLEEN FARRELL,2 ROMAN BUSINSKY,3 RICHARD CRONN,4 AND AARON LISTON2
1Department of Biological and Physical Sciences, Montana State University-Billings, Billings, Montana 59101, USA
2Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331, USA; E-mail: [email protected] (A.L.)
3Silva Tarouca Research Institute for Landscape and Ornamental Gardening, Pruhonice, Czech Republic
4Pacific Northwest Research Station, USDA Forest Service, 3200 SW Jefferson Way, Corvallis, Oregon 97331, USA
*Corresponding author: John Syring. E-mail: [email protected]

Abstract. Phylogenetic relationships among Pinus species from subgenus Strobus remain unresolved despite combined efforts based on nrITS and cpDNA. To provide greater resolution among these taxa, a 900-bp intron from a late embryogenesis abundant (LEA)-like gene (IFG8612) was sequenced from 39 species, with two or more alleles representing 33 species. Nineteen of 33 species exhibited allelic nonmonophyly in the strict consensus tree, and 10 deviated significantly from allelic monophyly based on topology incongruence tests. Intraspecific nucleotide diversity ranged from 0.0 to 0.0211, and analysis of variance shows that nucleotide diversity was strongly associated (P < 0.0001) with the degree of species monophyly. Although species nonmonophyly complicates phylogenetic interpretations, this nuclear locus offers greater topological support than previously observed for cpDNA or nrITS. Lacking evidence for hybridization, recombination, or imperfect taxonomy, we feel that incomplete lineage sorting remains the best explanation for the polymorphisms shared among species. Depending on the species, coalescent expectations indicate that reciprocal monophyly will be more likely than paraphyly in 1.71 to 24.0 × 10^6 years, and that complete genome-wide coalescence in these species may require up to 76.3 × 10^6 years. The absence of allelic coalescence is a severe constraint in the application of phylogenetic methods in Pinus, and taxa sharing similar life history traits with Pinus are likely to show species nonmonophyly using nuclear markers. [Lineage sorting; monophyly; nonmonophyly; nuclear genes; Pinus; phylogeny.]
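
The time scales quoted in the abstract rest on the rules of thumb the Methods attribute to Rosenberg (2003): reciprocal monophyly becomes more likely than paraphyly at about 1.67 × 2Ne diploid generations, genome-wide coalescence may take about 5.30 × 2Ne generations, and Ne is estimated as θ/(4μG). The arithmetic can be sketched as follows. This is an illustrative sketch only: the function names and the example values of θ and generation time are assumptions and are not taken from the study; the mutation rate of 7.0 × 10^-10 substitutions/site/year is the figure the Methods cite from Willyard et al. (2007).

```python
"""Coalescent time scales from Watterson's theta (illustrative sketch)."""

# Multipliers attributed to Rosenberg (2003) in the Methods.
RECIPROCAL_MONOPHYLY_FACTOR = 1.67  # multiples of 2Ne generations
GENOME_WIDE_FACTOR = 5.30           # multiples of 2Ne generations


def effective_population_size(theta, mu_per_year, generation_years):
    """Ne = theta / (4 * mu_G), where mu_G is the per-generation mutation rate."""
    mu_per_generation = mu_per_year * generation_years
    return theta / (4.0 * mu_per_generation)


def coalescent_expectations(theta, mu_per_year, generation_years):
    """Return Ne and the two expected waiting times, expressed in years."""
    ne = effective_population_size(theta, mu_per_year, generation_years)
    two_ne_in_years = 2.0 * ne * generation_years  # 2Ne generations, in years
    return {
        "Ne": ne,
        "reciprocal_monophyly_years": RECIPROCAL_MONOPHYLY_FACTOR * two_ne_in_years,
        "genome_wide_years": GENOME_WIDE_FACTOR * two_ne_in_years,
    }


if __name__ == "__main__":
    # Hypothetical inputs: theta and generation time are NOT from the paper.
    result = coalescent_expectations(
        theta=0.007, mu_per_year=7.0e-10, generation_years=20.0
    )
    for key, value in result.items():
        print(f"{key}: {value:.3e}")
```

In this sketch the generation time cancels once the waiting time is expressed in years (it reduces to 1.67 × θ/(2μ)), so the generation time changes the estimate of Ne but not the estimate in years.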

Whenever a phylogenetic study uses a single individual to represent a species, an implicit assumption is made that the species is monophyletic (Funk and Omland, 2003; Shaw and Small, 2005). To the extent that complicating factors (e.g., reticulation) are rare in the divergence history of terminal taxa, this assumption offers a convenient simplification for sampling. However, sampling a single individual per species provides no opportunity to test the hypothesis of allelic monophyly (coalescence) within species. This issue can become acute because gene trees are used to infer organismal phylogenies. These gene trees are subject to processes that can result in the nonmonophyly of sequences sampled from a single species (lineage sorting, reticulate evolution; Nei, 1987; Wendel and Doyle, 1998). Errors in phylogenetic estimation can occur when gene tree/species tree incongruence exists but species sampling is insufficient to detect the responsible phenomena. Many species remain poorly known, and the presence of cryptic taxa or an inadequate taxonomic treatment can be recognized when nonmonophyletic species are discovered in a phylogenetic analysis.

Awareness of the existence of intraspecific polymorphism, and of the complications arising from it, is growing. This is illustrated by recently published plant molecular phylogenetic studies (Appendix 1) that increasingly include sampling to evaluate species-level monophyly. Not surprisingly, varying levels of nonmonophyly are encountered as more intensive population-level sampling is included in phylogenetic analyses (Appendix 1). Despite the prevalence of nonmonophyly, many recent studies do not include multiple samples per species, even in the species-rich genera Rhododendron (86 included species; Goetsch et al., 2005), Aconitum (54 included species; Luo et al., 2005), Utricularia (31 included species; Müller and Borsch, 2005), Solanum (14 included species; Levin et al., 2005), Viburnum (41 included species; Winkworth and Donoghue, 2005), Silene (16 included species; Popp and Oxelman, 2004), and Dioscorea (67 included species; Wilkin et al., 2005).

Although many examples of species nonmonophyly are the direct result of inadequate phylogenetic signal (Cross et al., 2002; Roalson and Friar, 2004; Levin and Miller, 2005; Shaw and Small, 2005), recent publications on plants have demonstrated that species-level paraphyly and polyphyly can be well supported (Roalson and Friar, 2004; Alvarez et al., 2005; Church and Taylor, 2005; Kamiya et al., 2005; Oh and Potter, 2005; Yuan et al., 2005). Causative factors responsible for species nonmonophyly are often difficult to establish. Factors commonly cited include introgressive hybridization (Roalson and Friar, 2004; Kamiya et al., 2005; Mason-Gamer, 2005; Shaw and Small, 2005), incomplete lineage sorting (Chiang et al., 2004; Bouillé and Bousquet, 2005; Kamiya et al., 2005), unrecognized amplification of a paralogous locus (Roalson and Friar, 2004; Alvarez et al., 2005), recombination among divergent alleles (Schierup and Hein, 2000), and imperfect taxonomy including the occurrence of cryptic species (Goodwillie and Stiller, 2001; Treutlein et al., 2003; Roalson and Friar, 2004; Shaw and Small, 2005). Further, paraphyletic species may be the direct result of certain evolutionary processes, as suggested for recent progenitor-derivative speciation (Rieseberg and Brouillet, 1994; Rosenberg, 2003).

Funk and Omland (2003) reported a common trend of species-level coalescence failure from mitochondrial DNA studies in animals. Their survey of 584 studies and 2319 species found that 23.1% of the studies showed species-level paraphyly or polyphyly. Bouillé

and Bousquet (2005) recently demonstrated a striking case of trans-species allelic polymorphism in three low-copy nuclear genes in different species of spruce (Picea). Allelic coalescence times between randomly selected alleles from these spruce species were estimated at 10 to 18 million years ago, values that overlapped with estimated divergence times (13 to 20 million years ago) for the species studied. Because spruces share many life history traits with other temperate zone gymnosperm and angiosperm trees (e.g., highly outcrossing, long-lived perennials with large effective population sizes), Bouillé and Bousquet (2005) suggest that the incomplete lineage sorting phenomenon detected in Picea could hinder the utility of nuclear markers in phylogenetic analyses of such trees.

The similarities in life history traits between Picea and Pinus (both genera of Pinaceae) suggest that incomplete lineage sorting could be a common feature of the 100+ species in this genus, thus creating a potential obstacle to phylogenetic analyses based on low-copy nuclear genes. Pinus is a diverse and relatively ancient genus with origins that date to the early Cretaceous (145 to 125 million years ago; Alvin, 1960). Paleontological and molecular data suggest that the first major divergence event separating extant lineages occurred perhaps 85 to 45 million years ago (Miller, 1973; Meijer, 2000; Magallon and Sanderson, 2002; Willyard et al., 2007), giving rise to two distinct lineages recognized today as subg. Pinus and subg. Strobus, the "hard" and "soft" pines, respectively (Liston et al., 1999; Gernandt et al., 2005; Syring et al., 2005). Although there is extensive morphological variation in the genus, most character states exhibit homoplasy across subgenera and sections (Gernandt et al., 2005). The number of fibrovascular bundles per needle is the only diagnostic character that is nonhomoplastic for the two subgenera (one for subg. Strobus, two for subg. Pinus). Recent classifications of subg. Strobus recognize 36 to 40 species (Farjon, 2005; Gernandt et al., 2005), with complete agreement on 34 species and alternative treatments for the remaining taxa. Specific treatments for regions with high endemism, e.g., Mexico (Perry, 1991) and East Asia (Businsky, 1999, 2004), distinguish narrower taxonomic limits and thus recognize several additional species. Within subg. Strobus, the recent classification of Gernandt et al. (2005) recognizes two sections, Quinquefoliae and Parrya, each containing three subsections (Table 1).

Despite the relative antiquity of the subgenus Strobus-subgenus Pinus split, molecular evidence suggests that extant species within sections from these subgenera show a high degree of genetic similarity. For example, average pairwise nucleotide divergence (π) of species from subg. Strobus ranges from 0.84% for two plastid genes (Gernandt et al., 2005) to 5.6% for 11 nuclear genes (Willyard et al., 2007). By integrating this information with fossil calibrations, Willyard et al. (2007) showed that the lineages harboring the greatest number of soft pine species (subsects. Strobus [21 species] and Cembroides [11 species]) arose between 10 and 20 million years ago. The combination of recent divergence and generally very large effective population sizes (Ne; Ledig, 1998) makes it likely that mutations have spread and become fixed slowly across a species' range. This presents the potential for long-lived allelic diversity that spans one or more speciation events.

Pinus subg. Strobus has a long history of systematic inquiry (reviewed in Critchfield, 1986; Price et al., 1998; Wang et al., 1999; Gernandt et al., 2005; Syring et al., 2005). To date, however, relationships among the terminal taxa remain nearly unresolved (reviewed in Syring et al., 2005). Low-copy nuclear genes are an untapped resource for clarifying these terminal relationships, especially when genetic variation is interpreted within a framework where species monophyly can be assessed, and where the impact of incomplete lineage sorting can be determined. Data from multiple low-copy nuclear loci in pines (Syring et al., 2005) provide initial evidence that intraspecific diversity is confined within Pinus subsections, although the frequency of noncoalescence at the species level has not been previously addressed.

In this paper, we present a phylogenetic analysis of subg. Strobus using the most informative nuclear locus identified in a recent survey (Syring et al., 2005), a ca. 900-bp intron localized within a Late Embryogenesis Abundant (LEA)-like gene. The goal of this study is to intensively sample the remaining species of subg. Strobus, particularly the species-rich subsects. Strobus and Cembroides, and to place them within the phylogenetic framework developed in prior studies (Gernandt et al., 2005; Syring et al., 2005) with multiple markers. In addition, we seek to examine the impact of intraspecific variation and patterns of allelic coalescence on the accuracy of the derived phylogeny. To achieve this goal, we sequenced a minimum of two alleles for 33 of the 39 species used in this analysis. This sampling strategy allows us to address three important questions: (1) How frequently do species show complete allelic coalescence at this locus? (2) In the absence of species monophyly, at what taxonomic rank do the alleles coalesce? and (3) If species-level noncoalescence is common, what insights can this provide regarding the nature of pine species, the process of speciation in Pinus, or operational species definitions within this group?

MATERIALS AND METHODS

Plant Materials

Thirty-nine species of Pinus subgenus Strobus were sampled (Table 1). Haploid megagametophyte tissue was the DNA source for most amplifications; DNA was extracted using the FastPrep DNA isolation kit (Qbiogene, Carlsbad, California, USA). In the few cases where needle tissue was used, direct sequencing identified homozygotes and heterozygotes. Homozygous sequences were directly incorporated into the alignment, whereas heterozygous sequences were cloned into pGem-T Easy (Promega, Madison, Wisconsin). From each of the 39 sampled species, an effort was made to obtain a minimum of two unique alleles. Depending

TABLE 1. Sampled Pinus subg. Strobus representatives. When two alleles are listed for the same accession, either both alleles came from the same individual, or the accession is a bulk collection of multiple individuals and is marked as such with a footnote. Clade information refers to Figures 1 and 2.

Taxon^a | Allele | GenBank | Clade | Collection information | Collector or source (voucher^b)

Section Quinquefoliae
Subsection Gerardianae
P. bungeana Zuccarini ex Endlicher | A1, A2 | DQ642500, DQ642499 | | Shanxi Province, China^e | USDAFS^f Institute of Forest Genetics (OSC)
P. gerardiana Wallich ex D. Don | A1 | DQ018379 | | Gilgit, Pakistan | Businsky 41105 (RILOG^c)
 | A2 | DQ642498 | | Gilgit, Pakistan | Businsky 41123 (RILOG^c)
P. squamata X.W. Li | A1 | DQ642501 | | Yunnan, China | Businsky 46118 (RILOG^c)
 | A1 | | | Yunnan, China | Businsky 46120 (RILOG^c)
Subsection Krempfianae
P. krempfii Lecomte | A1 | DQ018380 | | Lam Dong, Vietnam | First Darwin Expedition 242 (RBGE)
Subsection Strobus
P. albicaulis Engelmann | A1 | DQ642447 | A | California, USA | Rancho Santa Ana Botanic Garden (OSC)
 | A1 | DQ642448 | A | Montana, USA | USDAFS Coeur d'Alene Nursery (OSC)
 | A2 | DQ642449 | A | Oregon, USA | USDAFS Dorena Genetic Resource Center (OSC)
P. armandii Franchet | A1 | DQ642471 | E | Anhui, China | Businsky 46136 (RILOG^c)
P. armandii Franchet subsp. mastersiana (Hayata) Businsky | A2 | DQ642472 | E | Kaohsiung, Taiwan | Businsky 32163 (RILOG^c)
P. ayacahuite Ehrenberg ex Schlechtendal var. veitchii (Roezl) Shaw | A1 | DQ642450 | B | Mexico, Mexico | USDAFS Institute of Forest Genetics (OSC)
P. ayacahuite Ehrenberg ex Schlechtendal var. veitchii (Roezl) Shaw | A2 | DQ642451 | B | Michoacán, Mexico | USDAFS Institute of Forest Genetics (OSC)
P. ayacahuite Ehrenberg ex Schlechtendal | A3 | DQ642452 | B | La Paz, Honduras | USDAFS Institute of Forest Genetics (OSC)
P. bhutanica Grierson, Long & Page | A1 | DQ642474 | D | Arunachal Pradesh, India | Businsky 57112 (RILOG^c)
 | A2 | DQ642473 | E | West Kameng, India | Businsky 57121 (RILOG^c)
P. cembra Linnaeus | A1 | DQ642475 | D | North Carpathians, Romania | Blada (OSC)
 | A2 | DQ642476 | D | Austria | USDAFS Dorena Genetic Resource Center (OSC)
P. chiapensis (Martinez) Andresen | A1 | DQ642453 | C | Chiapas, Mexico | Dvorak, CAMCORE (no voucher)
 | A1 | DQ642454 | C | Guatemala | USDAFS Institute of Forest Genetics (OSC)
 | A2 | DQ642455 | C | Veracruz, Mexico | Hernandez (OSC)
P. dalatensis Ferré subsp. procera Businsky | A1 | DQ642477 | E | Kon Tum Province, Vietnam | Businsky 44116b (RILOG^c)
P. dalatensis Ferré subsp. procera Businsky | A2 | DQ642478 | E | Kon Tum Province, Vietnam | Businsky 44114 (RILOG^c)
P. kwangtungensis Chun ex Tsiang | A1 | DQ642479 | E | China | Washington Park Arboretum 1943-40 (OSC)
P. flexilis James | A1 | DQ642458 | B | Alberta, Canada | Natural Resources Canada (OSC)
 | A2 | DQ642456 | B | California, USA | USDAFS Institute of Forest Genetics (OSC)
 | A3 | DQ642457 | B | Colorado, USA | Roelof (OSC)
P. koraiensis Siebold & Zuccarini | A1^d, A2^d | DQ642480, DQ642481 | A, A | Khabarovsky, Russia | Krutovsky (no voucher)
 | A3 | DQ642482 | A | Saitama, Japan | Forest Tree Breeding Center, Japan (OSC)
P. lambertiana Douglas | A1, A2 | DQ642459, DQ642460 | A, B | California, USA^e | USDAFS Institute of Forest Genetics (OSC)
 | A3 | DQ642461 | D | California, USA | USDAFS Central Zone Genetic Resource Program (OSC)
P. monticola Douglas ex D. Don | A1 | DQ642464 | A | California, USA | USDAFS Central Zone Genetic Resource Program (OSC)
 | A2 | DQ642462 | B | Oregon, USA | USDAFS Dorena Genetic Resource Center (OSC)
 | A3 | DQ642463 | B | California, USA | USDAFS Central Zone Genetic Resource Program (OSC)
P. morrisonicola Hayata | A1, A2 | DQ642483, DQ642484 | E, E | Taiwan^e | USDAFS Dorena Genetic Resource Center (OSC)
P. parviflora Siebold & Zuccarini subsp. parviflora | A1^d | DQ642485 | D | Shikoku, Japan | Businsky 45155 (RILOG^c)
P. parviflora Siebold & Zuccarini | A2 | DQ642486 | D | Japan | Iseli Nursery (OSC)
P. parviflora Siebold & Zuccarini subsp. pentaphylla (Mayr) Businsky | A2 | | D | Hokkaido, Japan | Forest Tree Breeding Center, Japan (OSC)
P. parviflora Siebold & Zuccarini subsp. pentaphylla (Mayr) Businsky | A2 | | D | Hokkaido, Japan | Forest Tree Breeding Center, Japan (OSC)
P. peuce Grisebach | A1 | DQ642487 | D | Bulgaria | USDAFS Dorena Genetic Resource Center (OSC)
 | A1 | DQ642488 | D | Macedonia, Yugoslavia | USDAFS Institute of Forest Genetics (OSC)
 | A1 | DQ642489 | D | Macedonia, Yugoslavia | USDAFS Institute of Forest Genetics (OSC)
P. pumila (Pallas) Regel | A1^d | DQ642490 | A | Byelorussia, Russia | Krutovsky (no voucher)
 | A2 | DQ642491 | A | Hokkaido, Japan | Watano (OSC)
 | A3 | DQ642492 | A | Hokkaido, Japan | Watano (OSC)
P. sibirica Du Tour | A1 | DQ642494 | D | Kemerovo District, Russia | Natural Resources Canada (OSC)
 | A2 | DQ642493 | D | Krasnoyarsk Krai, Russia | Buck (OSC)
P. strobiformis Engelmann | A1 | DQ642467 | A | Texas, USA | Gernandt DSG599 (OSC)
 | A2 | DQ642465 | B | Coahuila, Mexico | USDAFS Institute of Forest Genetics (OSC)
 | A3 | DQ642466 | B | Nuevo León, Mexico | USDAFS Institute of Forest Genetics (OSC)
P. strobus Linnaeus | A1 | DQ642468 | A | Wisconsin, USA | USDAFS Dorena Genetic Resource Center (OSC)
 | A2^d | DQ642470 | A | New Jersey, USA | Gernandt DSG00500 (OSC)
 | A3 | DQ642469 | A | Minnesota, USA | USDAFS Oconto River Orchard (OSC)
P. wallichiana A. B. Jackson | A1 | DQ642497 | D | Hutu, Rara, Nepal | Natural Resources Canada (OSC)
 | A2 | DQ642495 | E | Gilgit, Pakistan | Businsky 41101 (RILOG^c)
 | A3 | DQ642496 | E | Himalayas | Natural Resources Canada (OSC)

Section Parrya
Subsection Balfourianae
P. aristata Engelmann | A1 | DQ642503 | | Colorado, USA | USDAFS Institute of Forest Genetics (OSC)
 | A2 | DQ642504 | | Arizona, USA | Farrell 36 (OSC)
P. balfouriana Balfour | A1 | DQ642505 | | S. California, USA | Wisura (OSC)
 | A2 | DQ642506 | | N. California, USA | USDAFS Central Zone Genetic Resource Program (OSC)
P. longaeva Bailey | A1^d | AY634346 | | California, USA | Gernandt 03099 (OSC)
 | A2 | DQ642507 | | Utah, USA | McArthur (OSC)
Subsection Cembroides
P. cembroides Zuccarini | A1 | DQ642508 | F | San Luis Potosi, Mexico | USDAFS Institute of Forest Genetics (OSC)
 | A2 | DQ642510 | F | Texas, USA | Gernandt DSG593 (OSC)
 | A3 | DQ642509 | F | Hidalgo, Mexico | Hernandez (OSC)
P. culminicola Andresen & Beaman | A1, A2 | DQ642511, DQ642512 | F, G | Nuevo León, Mexico | Velazquez (OSC)
P. discolor Bailey & Hawksworth | A1^d, A3^d | DQ642513, DQ642514 | F, F | San Luis Potosi, Mexico | Gernandt 2101 (MEXU)
 | A2 | DQ642515 | F | Arizona, USA | Hammond (OSC)
P. edulis Engelmann | A1 | DQ642518 | F | Utah, USA | Gernandt DSG489 (OSC)
 | A2 | DQ642517 | G | Utah, USA | Gernandt DSG488 (OSC)
 | A3 | DQ642516 | G | New Mexico, USA | USDAFS Institute of Forest Genetics (OSC)
P. johannis Robert-Passini | A1 | DQ642519 | G | Nuevo León, Mexico | Frankis, M.P. 179 (E)
 | A2 | DQ642520 | I | San Luis Potosi, Mexico | Gernandt DSG501 (MEXU)
P. maximartinezii Rzedowski | A1 | DQ642521 | I | Zacatecas, Mexico | USDAFS Institute of Forest Genetics (OSC)
 | A1 | | I | Zacatecas, Mexico | USDAFS Institute of Forest Genetics (OSC)
P. monophylla Torrey & Fremont | A1 | DQ642522 | G | California, USA | Rancho Santa Ana Botanic Garden (OSC)
 | A2 | DQ642523 | G | Nevada, USA | Halse 6663 (OSC)
 | A3 | DQ642524 | G | Utah, USA | Gernandt DSG480 (OSC)
P. pinceana Gordon | A1 | DQ642525 | I | Querétaro, Mexico | USDAFS Institute of Forest Genetics (OSC)
 | A2 | DQ642526 | I | Coahuila, Mexico | USDAFS Institute of Forest Genetics (OSC)
P. quadrifolia Parlatore ex Sudworth | A1 | DQ642527 | H | California, USA | Winter (OSC)
 | A2 | DQ642528 | H | California, USA | Winter (OSC)
P. remota (Little) Bailey & Hawksworth | A1 | DQ642531 | F | Texas, USA | Gernandt DSG588 (OSC)
 | A2 | DQ642530 | F | Texas, USA | Zech 382-4 (OSC)
 | A3^d | DQ642529 | G | Coahuila, Mexico | Gernandt DSG0601 (MEXU)
P. rzedowskii Madrigal & Caballero | A1 | DQ642532 | I | Michoacán, Mexico | Businsky 47131 (RILOG^c)
Subsection Nelsoniae
P. nelsonii Shaw | A1 | DQ642502 | | Nuevo León, Mexico | Gernandt & Ortiz DSG10398 (OSC, MEXU)
 | A2^d | AY634347 | | Nuevo León, Mexico | Gernandt & Ortiz DSG11298 (OSC, MEXU)

aTaxonomy follows Gernandt et al. (2005). bHerbarium acronyms follow Index Herbariorum: http://sciweb.nybg.org/science2/IndexHerbariorum.asp. CRILOG = Silva Tarouca Research Institute for Landscape and Ornamental Gardening, 252 43 Pruhonice, Czech Republic. d Cloned product, see Methods for details. eBulk collections from numerous individuals at the same location. fUnited States Department of Agriculture Forest Service. 2007 SYRING ET AL.—WIDESPREAD NONMONOPHYLY IN PINUS SUBGENUS STROBUS 167 on availability, individuals were selected that spanned calculated from the alignment include the average the geographic range of the species. In cases of nar- number of characters, number of variable and parsi- row endemism or where collections were limited, two mony informative characters, average within-group alleles were sequenced from the same individual. Esti- p-distance, and average base composition (determined mates of species distribution area were based on pub- using MEGA 2.1; Kumar et al., 2001). lished maps (Critchfield and Little, 1966; Malusa, 1992; Evidence for recombination was assessed using both Farjon and Styles, 1997; Delgado et al, 1999, Ledig et al, sequence-based (maximum x2 method) and topological 1999). (difference in sum of squares; DSS) approaches. First, For the eight representatives of North American the substitution-based "maximum y}" method was cho- subsection Strobus, three alleles were chosen from a sen because of its high power and low false-positive rate more extensive data set that will be used to evalu- (Posada, 2002). This method (Smith, 1992; Posada and ate population-level variation (four to seven alleles per CrandalL 2001) identifies recombinant segments by test- species; Syring et al., unpublished data). 
The three alleles ing for significant differences among proportions of vari- selected represent the most divergent sequences based able and nonvariable polymorphic positions in adjacent Downloaded from https://academic.oup.com/sysbio/article/56/2/163/1685319 by guest on 28 September 2021 on calculated p-distances. regions of aligned sequences. For all possible sequence pairs, a sliding window containing 10% of the variable Choice ofOutgroup positions was divided into two equal partitions and moved in 1-bp increments along the alignment. At each As described in Syring et al. (2005), introns between 2 pine subgenera are frequently unalignable due to nu- increment, a 2 x 2 x was calculated as an expression of merous and overlapping indels, repeated motifs, and the difference in the number of variable sites on each side uncertain sequence homology. This precludes the use of of the partition for pairs of sequences. Putative recombi- members from subgenus Pinus or more distantly related nation points were identified by plotting y} values along genera as outgroups in this analysis. Independent evi- the length of the alignment. An alternative phylogeny- dence from five nuclear genes and cpDNA (Gernandt based test, the difference in sum of squares method (DSS; et al., 2005; Syring et al., 2005) shows that the sections of McGuire et al., 1997; Milne et al., 2004), was also used to subg. Strobus are monophyletic and sister to each other; identify putative recombinants using phylogenetic trees for this reason, members of one section were used as the constructed from adjacent regions of an alignment. For outgroup for analyzing the alternative section. Pinus nel- all sequences, a 150-bp window was divided into two sonii (Sect. Parrya, subsect. Nelsoniae) is exceptional. Evi-partitions and moved in 20-bp increments. 
For each in- dence from three nuclear genes (Syring et al., 2005) and crement, a distance matrix (Fitch 84) was calculated and cpDNA (Gernandt et al., 2005) resolve P. nelsonii as the a least squares tree constructed for each partition. Branch sister lineage to the remaining members of sect. Parrya. lengths were estimated using least squares, and sum of In contrast, the LEA-like locus used in this study places squares for each partition were recorded. Topologies for P. nelsonii in a unique, moderately supported (71% BS) partitions were then swapped, branch lengths for the position sister to sect. Quinquefoliae when midpoint root- forced topology were determined, and sum of squares ing is employed. Because of this uncertainty, we chose recorded. DSS values were plotted along alignments, and to include P. nelsonii as a member of the outgroup in an- recombinant segments were identified by peaks in DSS alyzing both sects. Quinquefoliae and Parrya. Therefore, values. Significance for both methods was determined by P. monticola, P. krempfii, P. gerardiana, and P. nelsonii were 1000 permutations (parametric bootstrapping) using a = used as outgroup taxa for analyzing sect. Parrya, and P. 0.01 as a threshold of significance and the program RDP2 aristata, P. monophylla, and P. nelsonii were used as out- (Martin et al., 2005). Analyses were conducted on data group taxa for analyzing sect. Quinquefoliae. Although sets where shared indels were discarded to prevent false the LEA-like locus is too labile to provide insight into positives. the placement of P. nelsonii, future studies using more evolutionarily constrained molecules will be used to in- Phylogenetic and Statistical Analysis vestigate the position of this lineage. Phylogenetic analyses were performed by taxonomic section using maximum parsimony (MP; PAUP* ver- Locus Amplification, Sequencing, Alignment, and sion 4.0bl0; Swofford, 2003). 
Most parsimonious trees Recombination Analysis were found from branch-and-bound searches, with all Description of the LEA-like locus and the protocols characters weighted equally and treated as unordered. for amplification, sequencing, and alignment are given Branch support was evaluated using the nonparamet- in Syring et al. (2005). Gaps were coded as phylogenetic ric bootstrap (Felsenstein, 1985), with 1000 replicates characters using the method of Simmons and Ochoterena and TBR branch swapping. Alternative phylogenetic (2000) and the online program Gap Recoder (R. Ree, http: hypotheses and statistical strength for species nonmono- / /maen.huh.harvard.edu:8080 / services / gap_recoder); phyly were analyzed using constraints on tree topolo- all coded gaps were verified manually. Four previously gies in PAUP*. The Wilcoxon signed-rank test (WSR; published sequences (DQ018379, DQ018380, AY634346, Templeton, 1983) was employed to test for significant AY634347) are incorporated in this study (Table 1). The differences among topologies. For this test, up to 1000 of alignment is available at TreeBase (S1538). Statistics the most-parsimonious trees were used as constraint 168 SYSTEMATIC BIOLOGY VOL. 56 topologies. The range of P values across all topologies is six species we obtained a single allele, although in three reported for every test. of these cases multiple sequences from different indi- Statistical associations between intraspecific nu- viduals returned an identical allele (see Table 1); these cleotide diversity, geographic range (a proxy for species include P. peuce (N = 3 different ), P. maximartinezii abundance and census population size), and extent of (N = 2), P. squamata (N = 2), and P. kiuangtungensis, P. monophyly were explored using analysis of variance krempfii, and P. rzedowskii (N = 1 template each). In Pi- (ANOVA). 
For these tests, species were categorized into nus krempfii alone, the locus was amplified directly from two "PHYLY" classes, either strongly nonmonophyletic diploid needle tissue; direct sequencing indicated this in- (i.e., intraspecific allelic coalescence resulted in statis- dividual was homozygous. P'or three species having two tically significant topological distortion as shown by unique alleles, further seed sampling from other trees the WSR constraint test) or weakly nonmonophyletic to recovered one of the known alleles: albicaulisAl (allele monophyletic (i.e., WSR constraint tests were insignif- Al, N = 2), chiapensisAl (allele Al; N = 2), and parvi- icant). Nucleotide diversity was analyzed directly, and

geographic ranges (km²; Critchfield and Little, 1966) were log10-transformed to minimize skewness and kurtosis. Differences in univariate measures between PHYLY classes were tested using one-factor ANOVA. Statistical analyses were performed using SAS (SAS Institute, 1999).

Coalescent Expectations of Genic and Genomic Monophyly

According to Rosenberg (2003), reciprocal monophyly is predicted to be more likely than paraphyly under conditions of neutrality at ~1.67 × 2Ne diploid generations, and complete genome-wide coalescence may require ~5.30 × 2Ne diploid generations. In order to make these calculations for species of Pinus, Watterson's estimate of theta (θ) was determined for select species using DnaSP v.4.00.6 (Rozas et al., 2004). Effective population sizes were estimated using the formula $N_e = \theta / (4 \mu_G)$, where $\mu_G$ is the absolute silent mutation rate per nucleotide, adjusted for generation time. The absolute silent mutation rate for Pinus was recently estimated from 11 nuclear genes (Willyard et al., 2007) to average 7.0 × 10^-10 substitutions/site/year, assuming an 85 million year divergence of pine subgenera. Generation times for individual species were taken as the average for the range of years to seed bearing age cited in Krugman and Jenkinson (Woody Plant Seed Manual, 2nd edition [R. G. Nisley, ed.], http://www.nsl.fs.fed.us/wpsm/). Although estimating the generation time of species with uneven-aged populations and overlapping generations can be contentious, the range of variability in this parameter is not great enough to affect the order of magnitude for the calculations of Ne.

In this study we test the extent of monophyly for LEA-like allele lineages within species. For simplicity, we abbreviate this as "species monophyly" or "species nonmonophyly," but want to emphasize that we are referring to the coalescence of genealogies of the LEA-like gene and not the extent of monophyly for actual species.

RESULTS

Sequences, Alignment Characteristics, Allelic Diversity, and Recombination

We acquired 86 LEA-like alleles from 39 species of Pinus subgenus Strobus (Table 1). From 14 of the 39 species we sequenced three unique alleles, and from 19 species we sequenced two unique alleles. From the remaining

floraA1 (allele A2; N = 3). For P. bungeana, P. culminicola, and P. morrisonicola, both alleles were sequenced from the same individual; for P. discolor (A1 and A3), P. koraiensis (A1 and A2), and P. lambertiana (A1 and A2), two of the three alleles were from the same individual. Ten sequences included in this data set were cloned: discolorA1, discolorA3, koraiensisA1, koraiensisA1, longaevaA1, nelsoniiA1, parvifloraA1, pumilaA1, remotaA3, and strobusA1. Based on this sample, our phylogenetic estimates of allelic monophyly can be assessed in 33 of 39 included species.

Our aligned sequence for the LEA-like locus was 1554 bp in length, with individual sequences averaging 898.5 bp from sect. Quinquefoliae (range = 821-983 bp) and 987.3 bp from sect. Parrya (range = 937-1132 bp). The aligned sequence includes a partial exon on the 3' end with 44 complete and two partial codons (135 bp). The remaining sequence is intron and has an aligned length of 1419 bp. The exon has 20 variable positions within subg. Strobus (12 in sect. Quinquefoliae, 8 in sect. Parrya), of which 3 were localized in first codon positions, 5 in second positions, and 12 in third positions. Within subg. Strobus, inferred amino acid replacements occur at 7 of 44 sites. The intron segment included 267 variable sites and 152 parsimony-informative (PI) sites within subg. Strobus. This included 157 variable sites (88 PI) in sect. Quinquefoliae, and 110 variable sites (41 PI) in sect. Parrya. Nucleotide frequencies were relatively AT-rich (25.9% A, 33.5% T, 20.2% G, 20.6% C). Complex and simple indels are frequent across the length of the intron and range in length from a single nucleotide to a 110-bp deletion in armandiiA1. In total, 74 gaps were scored and appended to the alignment.

Average interspecific π for subg. Strobus at the LEA-like locus is 0.0330 ± 0.0029. Estimates of intraspecific π show that soft pine species display a wide range of genetic diversity and differentiation. Across the 33 species with nonidentical alleles, intraspecific nucleotide diversity ranged from 0.0 in P. albicaulis (alleles differ by one indel) to 0.0211 in P. strobiformis, and averaged 0.0085 (Table 2). Other species showing high nucleotide diversity in our sample include P. lambertiana (π = 0.0196), P. johannis (0.0186), P. bhutanica (0.0185), P. monticola (0.0155), P. edulis (0.0158), and P. aristata (0.0149). Members of sect. Parrya showed a trend toward higher intraspecific nucleotide diversity relative to sect. Quinquefoliae, with 10 of 13 species (76.9%) versus 11 of 20 species (55.0%) showing π > 0.005, respectively (Table 2).

TABLE 2. Intraspecific nucleotide diversity (π), approximate geographic ranges, and monophyletic status of the species included in this study. N = number of unique alleles upon which π was based; π was determined using p-distances; SD = standard deviation of the π calculations. Approximate ranges for each species were determined from several sources and represent best estimates (see Methods).

Species              N   π (SD)             Approx. range (km²)   Monophyletic in the strict consensus tree?
P. strobiformis      3   0.0211 (0.0041)a   160,000               N
P. lambertiana       3   0.0196 (0.0038)    110,000               N
P. johannis          2   0.0186 (0.0040)    18,000                N
P. bhutanica         2   0.0185 (0.0045)    8000                  N
P. edulis            3   0.0158 (0.0033)    260,000               N
P. monticola         3   0.0155 (0.0033)    370,000               N
P. aristata          2   0.0149 (0.0040)    15,000                Y
P. longaeva          2   0.0128 (0.0038)    10,000                Y
P. discolor          3   0.0112 (0.0026)    27,000                N
P. remota            3   0.0112 (0.0026)    27,000                N
P. wallichiana       3   0.0112 (0.0028)    150,000               N
P. balfouriana       2   0.0107 (0.0034)    2500                  N
P. cembra            2   0.0100 (0.0032)    80,000                N
P. pumila            3   0.0092 (0.0025)    6,000,000             Y
P. sibirica          2   0.0088 (0.0033)    5,000,000             N
P. strobus           3   0.0078 (0.0024)    1,800,000             Y
P. cembroides        3   0.0076 (0.0023)    190,000               N
P. culminicola       2   0.0075 (0.0022)    50                    N
P. dalatensis        2   0.0077 (0.0028)    50                    N
P. monophylla        3   0.0058 (0.0021)    75,000                N
P. armandii          2   0.0051 (0.0027)    600,000               N
P. pinceana          2   0.0047 (0.0023)    25,000                Y
P. nelsonii          2   0.0044 (0.0022)    6000                  Y
P. koraiensis        3   0.0038 (0.0018)    600,000               Y
P. gerardiana        2   0.0035 (0.0019)    25,000                Y
P. chiapensis        2   0.0031 (0.0017)    5000                  Y
P. ayacahuite        3   0.0022 (0.0012)    50,000                N
P. parviflora        2   0.0022 (0.0015)    140,000               Y
P. flexilis          3   0.0014 (0.0010)    250,000               N
P. albicaulis        2   0.0000 (0.0000)    400,000               Y
P. bungeana          2   0.0012 (0.0012)    50,000                Y
P. morrisonicola     2   0.0011 (0.0011)    5000                  Y
P. quadrifolia       2   0.0010 (0.0011)    5000                  Y
P. kwangtungensis    1   n/a                500                   n/a
P. krempfii          1   n/a                300                   n/a
P. maximartinezii    1   n/a                4                     n/a
P. peuce             1   n/a                2000                  n/a
P. rzedowskii        1   n/a                100                   n/a
P. squamata          1   n/a                0.05                  n/a

a Value excludes the region where tests indicated potential recombination between sites 388 and 399; see text for details.

Only one allele was shared between two species, namely ayacahuiteA1 and flexilisA1. This observation was striking, because these samples came from wild collections separated by ~3600 km (P. ayacahuite originated from the state of Mexico, Mexico and P. flexilis from Alberta, Canada). This allele has also been sequenced for P. strobiformis from both New Mexico and Texas (J. Syring, unpublished data). All other comparisons between P. ayacahuite and P. flexilis alleles show pairwise distances between 0.0011 and 0.0043, suggesting a close genetic affinity. Although not containing identical alleles, seven interspecific comparisons yielded highly similar alleles with p-distances < 0.0011: ayacahuiteA1/flexilisA3, ayacahuiteA1/flexilisA1, ayacahuiteA1/strobiformisA2, cembraA1/parvifloraA2, bhutanicaA2/wallichianaA2, bhutanicaA2/wallichianaA3, and culminicolaA2/remotaA3.

Of the 86 LEA-like sequences used in this analysis, only one allele showed evidence of recombination, despite a relatively lax threshold for significance (α = 0.01). The maximum χ² method identified one recombinant region from strobiformisA1 between positions 388 and 399. Close inspection shows that strobiformisA1 contains an 11-nucleotide motif (CTTTTGTAGCC) with 37% identity to the same region (e.g., TTCYACAGGA) from all other species of sect. Quinquefoliae. If this segment is recombinant, it likely arose via xenologous recombination or the PCR process because it lacks similarity to other homologous alleles. The topology-based DSS method provides no evidence for recombination in strobiformisA1 or other sequences. Because the single putatively recombinant segment from strobiformisA1 only adds autapomorphic steps to this lineage and does not distort the topology (data not shown), we included this sequence in all analyses (the potentially recombinant 12 bases were omitted from estimates of π; Table 2).

Phylogenetic Analyses

Section Quinquefoliae.—From the branch-and-bound search of the sect. Quinquefoliae data set, twenty most parsimonious trees were recovered (Fig. 1). Trees were 350 steps in length, had a consistency index (CI) of 0.8229, and a retention index (RI) of 0.8835. Subsection Krempfianae is sister to a clade of subsects. Strobus and Gerardianae (67% bootstrap support, BS). Support for the topology within subsect. Gerardianae is relatively high, with 81% BS for the monophyly of the subsection and 90% BS for the resolution of the rare endemic P. squamata as sister to P. gerardiana and P. bungeana.

Alleles from species of subsect. Strobus resolved into five major groups (Fig. 1), all of which were present in the strict consensus tree. Bootstrap support ranged from lacking (<50%; clades A and D), or moderate (74%; clade E), to very strong (98%; B and C). Clade A includes alleles from five North American species (P. strobiformis, P. albicaulis, P. strobus, P. lambertiana, P. monticola) and two East Asian species (P. koraiensis, P. pumila). Noteworthy is the 100% BS uniting alleles lambertianaA1 and monticolaA1. Despite the sister relationship among these alleles, a sister relationship among these species is contradicted by the lack of allelic coalescence within these two species (described below). Sister to clade A are two strongly supported and exclusively North American groups, one of which includes alleles from five species (clade B) and the second of which includes only P. chiapensis alleles (clade C). Clade C is the only group where species do not share alleles with another clade. Relationships between clades A, B, and C are essentially unresolved, and the node supporting clade C as sister to clades A/B collapses in the strict consensus tree. The remaining alleles from P. monticola (A2, A3) both resolved within clade B, as did one allele from P. lambertiana (A2). Clade D is composed entirely of alleles sampled from European (P. cembra, P. peuce) and Asian (P. bhutanica, P. wallichiana, P. sibirica, P.

[Figure 1 image: phylogram of LEA-like alleles for sect. Quinquefoliae, showing clades A to E of subsect. Strobus, subsect. Gerardianae, subsect. Krempfianae, and the outgroups (nelsoniiA1, aristataA2, edulisA3), with bootstrap values at nodes and a scale bar of 5 changes.]

FIGURE 1. One of 20 most-parsimonious trees for section Quinquefoliae. Trees are rooted with P. aristata, P. edulis, and P. nelsonii. Bootstrap values from 1000 replicates and TBR branch swapping are shown near nodes. Number of characters = 1628; length of trees = 358; consistency index = 0.827; retention index = 0.889. Asterisks (*) indicate nodes that collapse in the strict consensus tree. Numbers in parentheses following alleles refer to the number of additional times the same allele was sequenced. The letters "L" and "M" are placed on the node of the most recent common ancestors for all alleles of P. lambertiana and P. monticola, respectively.

[Figure 2 image: phylogram of LEA-like alleles for sect. Parrya, showing subsect. Balfourianae, the clades of subsect. Cembroides, and the outgroups (monticolaA2, gerardianaA2, krempfiiA1, nelsoniiA1), with bootstrap values at nodes and a scale bar of 5 changes.]

FIGURE 2. One of 4495 most-parsimonious trees for section Parrya. Trees are rooted with P. gerardiana, P. krempfii, P. monticola, and P. nelsonii. Bootstrap values from 1000 replicates and TBR branch swapping are shown near nodes. Number of characters = 1628; length of trees = 300; consistency index = 0.843; retention index = 0.883. Asterisks (*) indicate nodes that collapse in the strict consensus tree. Numbers in parentheses following alleles refer to the number of additional times the same allele was sequenced.

parviflora) species, with the exception of lambertianaA3, which is in an unsupported position as the sister to the remaining members of this clade. Pinus bhutanica and P. wallichiana have alleles in both clade D and clade E. The latter clade is composed strictly of alleles from species distributed in southeastern Asia.

Section Parrya.—Branch-and-bound searches on the sect. Parrya data recovered 4495 most parsimonious trees that were 292 steps in length (Fig. 2; CI = 0.843, RI = 0.883). Subsections Balfourianae and Cembroides were monophyletic sister lineages, with 97% BS each. Support for relationships within subsect. Balfourianae is high, with

P. longaeva resolving as sister to the clade of P. aristata/P. balfouriana (97% BS, although alleles of P. balfouriana are nonmonophyletic in the strict consensus tree). Species of subsect. Cembroides resolved into four clades (F through I; Fig. 2), all of which were present in the strict consensus tree with the exception of the culminicolaA1 and discolorA3 alleles (these collapse out of clade F in the strict consensus tree). With the exclusion of these two poorly supported alleles and johannisA2 from clade I, each of the four major clades within subsect. Cembroides has moderate to strong support (78% to 90% BS), though relationships among and within the clades remain largely unresolved.

Clade F contains all alleles from P. cembroides and P. discolor, along with some of the alleles for P. culminicola, P. edulis, and P. remota, the remainder of which are found in clade G. Clade G contains all three alleles of P. monophylla and one of two P. johannis alleles. Support for the monophyly and relationships among the three alleles of P. monophylla is weak (57% and 55% BS, respectively), and all remaining support within clade G unites alleles from differing species. Clade H contains only the two alleles of P. quadrifolia (90% BS) and is unique in being the only clade from this section where species do not share alleles with another clade. Clade I contains both alleles of P. pinceana and the singleton alleles of P. maximartinezii and P. rzedowskii. The support for the node uniting these three species is strong (89% BS), but the node supporting the sister relationship of P. maximartinezii and P. pinceana is only weakly supported (68% BS). Sister to these three species is the poorly supported johannisA1 (62% BS), whose other allele is found in a strongly supported position within clade G.

Species Monophyly and Topological Conflict Arising from Genealogical Nonmonophyly

Section Quinquefoliae.—The strict consensus tree indicates that of the 20 species for which multiple unique alleles were sequenced, 9 species (45%) were monophyletic (Table 2). For species exhibiting allelic monophyly, seven showed moderate to very high support (82% to 100% BS). For species exhibiting paraphyly, eight had at least one well-supported allele (83% to 100% BS), ensuring nonmonophyly. Most severe in this regard were the five species that possess alleles in two or more of the major clades: these include P. bhutanica (clades D, E), P. lambertiana (clades A, B, D), P. monticola (clades A, B), P. strobiformis (clades A, B), and P. wallichiana (clades D, E). For species including three alleles, the proportion of monophyletic species was nearly identical to the broader sample (50%; 6 of 12 species), suggesting that small sample sizes generally capture enough intraspecific heterogeneity to be indicative of allelic monophyly in this section.

Species-level monophyly in the strict consensus tree is not necessarily dependent on having low nucleotide diversity (Table 2). Although intraspecific mean p-distances are low for P. flexilis (0.001) and P. armandii (0.005), these species are not monophyletic. In contrast, P. strobus has a relatively high mean intraspecific p-distance (0.008) and is well supported as monophyletic (88% BS). However, of the seven species having mean intraspecific p-distances >0.008, only P. pumila was monophyletic in the strict consensus tree, and none had bootstrap support >50%.

Enforcing topological constraints for species monophyly shows that 6 of the 11 nonmonophyletic members of sect. Quinquefoliae return significant results in the WSR test (Table 3), indicating statistical support for their nonmonophyly (α = 0.05). For example, constraining topologies for the monophyly of P. flexilis or P. ayacahuite led to trees with lengths 351 and 352, respectively (one step and two steps longer than unconstrained topologies). These topologies returned insignificant WSR results (P = 0.3173-0.6547 for P. flexilis and 0.1573-0.4142 for P. ayacahuite), suggesting that the nonmonophyly of these species is not strongly supported. In contrast, results from P. lambertiana (369 steps, P = <0.0001-0.0003), P. monticola (368 steps, P = <0.0001-0.0001), and P. bhutanica (367 steps, P = <0.0001-0.0002) strongly support the polyphyly of alleles from these species.

Section Parrya.—The strict consensus tree indicates that of the 13 species for which multiple alleles were sequenced, only 5 species (38.5%) were monophyletic (Table 2). The remaining eight species (61.5%) were either paraphyletic or polyphyletic. For those species showing allelic monophyly, four were strongly supported (85% to 100% BS; the monophyly of P. longaeva has 70% BS). For those species showing paraphyly or polyphyly, five had at least one allele in a moderately or strongly supported position (79% to 90% BS) ensuring nonmonophyly. Alleles from four species (P. culminicola, P. edulis, P. johannis, P. remota) are spread across two or more of the major clades. None of the five species represented with three sequences were monophyletic in the strict consensus tree, suggesting that, in contrast to sect. Quinquefoliae, additional sampling may reduce the number of monophyletic species in sect. Parrya.

Following the trend in sect. Quinquefoliae, allelic monophyly is not a simple function of intraspecific genetic diversity (Table 2). Although nucleotide diversity is low for the three alleles of P. monophylla (π = 0.006), this species is not monophyletic in the strict consensus tree. In contrast, P. aristata (π = 0.015) and P. longaeva (π = 0.013), once considered to be a single species (see Bailey, 1970), show high nucleotide diversity yet are each monophyletic.

Enforcing topological constraints for species monophyly shows that four of the eight nonmonophyletic members of sect. Parrya return significant results in the WSR test (Table 3), and 335 of the 951 trees (35.2%) indicate that the nonmonophyly of P. discolor is significant. It is noteworthy that in both sects. Quinquefoliae and Parrya, all species whose alleles fall into two or more of the major clades return significant WSR results, thereby providing further confidence in the distinctiveness of these clades.

Subgenus summary.—Across 33 species from subgenus Strobus, we found that 14 were monophyletic (42.4%) and 19 (57.6%) were nonmonophyletic at the LEA-like locus in the strict consensus tree (Table 2). Ten of 33 species

TABLE 3. Wilcoxon signed rank (WSR) tests of species nonmonophyly. All species from the strict consensus tree that were either para- or polyphyletic were constrained to monophyly and the resulting trees were tested for topological incongruence against the unconstrained trees. Up to 1000 most parsimonious trees were saved in the constraint analysis. Significant results at the α = 0.05 level are marked with an asterisk (*).

Species            Clades in which           No. of trees saved from   Length of   Range of P values
                   alleles appear (a)        constraint analysis       trees (b)
sect. Quinquefoliae (Figure 1)
P. armandii        E-2                       114                       351         0.3173-0.6547
P. ayacahuite      B-3                       150                       352         0.1573-0.4142
P. bhutanica       D-1, E-1                  1000                      367         <0.0001-0.0002*
P. cembra          D-2                       397                       359         0.0027-0.0126*
P. dalatensis      E-2                       61                        351         0.3173-0.6547
P. flexilis        B-3                       487                       351         0.3173-0.6547
P. lambertiana     A-1, B-1, D-1             1000                      369         <0.0001-0.0003*
P. monticola       A-1, B-2                  831                       368         <0.0001-0.0001*
P. sibirica        D-2                       35                        353         0.2568-0.3173
P. strobiformis    A-1, B-2                  154                       364         0.0005-0.0010*
P. wallichiana     D-1, E-2                  70                        364         0.0005-0.0017*
sect. Parrya (Figure 2)
P. balfouriana     subsect. Balfourianae     1000                      292         1.0000
P. cembroides      F-3                       1000                      294         0.1573-0.5637
P. culminicola     F-1, G-1                  1000                      302         0.0016-0.0352*
P. discolor        F-3                       951                       298         0.0339*-0.1336 (c)
P. edulis          F-1, G-2                  594                       304         0.0047-0.0073*
P. johannis        G-1, I-1                  1000                      302         0.0016-0.0124*
P. monophylla      G-3                       1000                      292         1.0000
P. remota          F-2, G-1                  1000                      308         0.0001-0.0036*

(a) Letters refer to clades in Figures 1 and 2; numbers following letters indicate the number of sequences in each clade.
(b) Length of most parsimonious trees without constraints is 350 steps for sect. Quinquefoliae and 292 steps for sect. Parrya.
(c) Thirty-five percent of the topologies for P. discolor returned significant results in the WSR test.
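To illustrate how a Wilcoxon signed rank comparison of two topologies can yield P values like those in Table 3, the following Python is a minimal sketch under stated assumptions, not the procedure actually run for this study. It assumes the test is applied to per-character differences in step counts between the constrained and unconstrained trees, uses a two-tailed normal approximation with midranks for ties and no continuity correction, and takes hypothetical input vectors. The function name `wsr_test` is likewise an invention for this example.

```python
import math


def wsr_test(step_differences):
    """Two-tailed Wilcoxon signed rank test (normal approximation).

    step_differences: per-character (constrained - unconstrained) step
    counts; characters with a difference of zero are ignored.
    Returns (z, p).
    """
    diffs = [d for d in step_differences if d != 0]
    if not diffs:
        return 0.0, 1.0  # the two trees fit every character equally well

    # Assign midranks to the absolute differences (ties share a mean rank).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1

    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = sum(ranks) / 2                   # expected T+ under the null
    var = sum(r * r for r in ranks) / 4     # tie-corrected variance
    z = (t_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))    # two-tailed normal P value
    return z, p


# Hypothetical cases: one or two characters each cost one extra step
# on the constrained tree; all other characters are unaffected.
print(wsr_test([1, 0, 0, 0]))
print(wsr_test([1, 1, 0, 0]))
```

Under these assumptions, a single character requiring one extra step gives P of about 0.3173 and two such characters give about 0.1573, which coincide with the smallest P values in Table 3 for constrained trees one and two steps longer than the unconstrained trees. The match is suggestive only; the source does not state which approximation its software applied.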

(30.3%) were supported as nonmonophyletic both in the strict consensus tree and in the WSR constraint analyses (referred to as cases of "strong" nonmonophyly; Table 3). Data for the remaining nine species that were nonmonophyletic in the strict consensus tree but did not return significant results in the WSR tests are considered ambiguous (including P. discolor; referred to as cases of "weak" nonmonophyly). Excluding cases of weak nonmonophyly, 6 of 15 species from sect. Quinquefoliae (40.0%) and 4 of 9 species (44.4%) from sect. Parrya show strong allelic nonmonophyly.

Across both pine sections, the PHYLY of a sample of alleles from a species (e.g., strongly nonmonophyletic versus weakly nonmonophyletic or monophyletic) was evaluated for statistical associations with two dependent variables, π and log10(geographic range). These two variables are uncorrelated with each other (Pearson's R = 0.0554; P = 0.759). Individually, PHYLY shows a statistically significant association with nucleotide diversity. Average nucleotide diversities in species showing strong deviation from monophyly are significantly higher (π = 0.0149) than comparable values from monophyletic and weakly nonmonophyletic species combined (π = 0.0056; one-way ANOVA, P < 0.0001; Table 4). Monophyletic and weakly nonmonophyletic species are not significantly different (π = 0.0049 and 0.0067, respectively; one-way ANOVA, P = 0.339; results not shown). A relationship between PHYLY and geographic range of a species, in contrast, was not detected. Average geographic ranges for these two classes are essentially indistinguishable (one-way ANOVA, P = 0.8007; Table 4), with the average geographic range for strongly nonmonophyletic species at 40,615 km², and the average range for weakly nonmonophyletic-to-monophyletic species at 52,516 km². In combination, these analyses indicate that the most important determinant of allelic monophyly is intraspecific variation (and possibly apportionment of genetic variation across species), rather than the geographic range of a species or their census population size.

Estimating Ne, the Time to Reciprocal Monophyly, and Genome-Wide Coalescence

Estimates of Ne, the number of years for reciprocal monophyly to be more likely than paraphyly, and the number of years for complete genome-wide coalescence were calculated for three species of pines (Table 5). These species were chosen because they represent a range of nucleotide diversities (Table 2), as well as the full phylogenetic spectrum from essentially monophyletic (P. flexilis; π = 0.0014), to weakly nonmonophyletic (P. discolor; π = 0.0112), to strongly nonmonophyletic (P. lambertiana; π = 0.0196). Estimates of θ are not significantly different than values of π (based on Tajima's D statistic; data not shown), so our estimates of θ are unlikely to reflect effects from natural selection. Estimated values for Ne range from ca. 1.7 × 10^4 for P. flexilis to 12 × 10^4 for P. lambertiana (Table 5). Based on these estimates, the time for reciprocal monophyly to become more likely than paraphyly at this locus is substantially different across species, ranging from 1.7 million years for the nearly monophyletic species P. flexilis to 24 million years for the strongly nonmonophyletic P. lambertiana. The estimated

time for complete genome-wide coalescence ranges from 5.4 million years to 76 million years.

TABLE 4. Nucleotide diversity and geographic range of strongly nonmonophyletic versus monophyletic to weakly nonmonophyletic species of pines. Significant differences between Phyly categories were tested using one-way ANOVA.

Dependent variable         Phyly                                     N    Mean      SD
Nucleotide diversity       Strongly nonmonophyletic                  10   0.01494   0.00471
                           Monophyletic to weakly nonmonophyletic    23   0.00566   0.00416
                           F = 32.02; df = 1 between groups, 31 within groups; P < 0.0001
Log10 (geographic range)   Strongly nonmonophyletic                  10   4.6087    1.1613
                           Monophyletic to weakly nonmonophyletic    23   4.7203    1.1554
                           F = 0.06; df = 1 between groups, 31 within groups; P = 0.8007

DISCUSSION

Considering that widespread allelic nonmonophyly was observed and the fact that this is a single-locus estimate of relationships, phylogenetic conclusions regarding relationships among the terminal taxa remain premature, even in those cases where species monophyly is clear and support values are high. However, the phylogeny presented in this paper is in strong agreement with both cpDNA (Gernandt et al., 2005) and nrDNA ITS (Liston et al., 1999) in resolving the same monophyletic subsections, thereby further increasing our confidence in these intrageneric ranks. Although this study has detected nine cases of species having alleles in two or more of the major clades in subsects. Strobus and Cembroides, no intraspecific variability crossed the taxonomic subsections defined by Gernandt et al. (2005). Lack of species monophyly appears to be a greater complication within the species-rich subsects. Strobus and Cembroides than in the species-poor subsects. Gerardianae and Balfourianae: 11 of 28 species tested for allelic monophyly in subsects. Strobus and Cembroides were strongly nonmonophyletic, whereas none of the six members of subsects. Gerardianae and Balfourianae were strongly nonmonophyletic.

The LEA-like intron offers greater support for the relationships among and within subsections than previously observed for cpDNA (Wang et al., 1999; Gernandt et al., 2005) and nrITS (Liston et al., 1999). For example, average p-distances at the LEA-like locus across four subsections range from 0.0131 (subsect. Cembroides) to 0.0177 (subsect. Strobus), which are 4.5 times (subsects. Strobus, Cembroides) to 6.1 times (subsect. Balfourianae) greater than comparable values for cpDNA (Gernandt et al., 2005). Despite greater divergence for nrITS (Liston et al., 1999) relative to the LEA-like locus in the range of 1.6- (subsect. Cembroides) to 2.9-fold (subsect. Gerardianae), orthological complexity in rDNA precludes its use for assessing species-level monophyly.

Our results suggest that the allele pool for any given pine species may be very large, and it is likely that our small sample sizes failed to capture all of the major allele lineages within any given species. For example, genetic differences partitioned by geographical subdivision may not have been sampled in many species, particularly in those cases where monophyly was assessed with two alleles from a single individual (Table 1) or where the range of a species was extensive (Table 2). Therefore, further allele sampling may increase the number of species exhibiting allelic nonmonophyly.

Possible Factors Leading to Species-Level Nonmonophyly

A number of factors have been suggested as potential mechanisms for apparent species nonmonophyly based on cytoplasmic and nuclear markers (Pamilo and Nei, 1988; Rieseberg and Brouillet, 1994; Moore, 1995; Crisp and Chandler, 1996; Doyle, 1997; Shaw, 2001; Hudson and Coyne, 2002; Rosenberg and Nordborg, 2002; Funk and Omland, 2003; Rosenberg, 2003; Sites and Marshall, 2003; Bouillé and Bousquet, 2005). Here, we review the main factors in reference to the LEA-like topologies (Figs. 1 and 2), and consider how these factors may have played a role in species-level monophyly and nonmonophyly in subg. Strobus. As Funk and Omland (2003) point out, definitive causes for species polyphyly are difficult to prove; however, causal inferences can be made from the observed topological patterns.

Inadequate phylogenetic signal.—The signature of insufficient phylogenetic signal is apparent, but not universal, in this data set. Inadequate information almost certainly plays a role in the nine cases of weak allelic nonmonophyly across the subgenus where insufficient data were available to determine the status of species monophyly. Two of the clearest examples are P. balfouriana and P. monophylla, which have no alleles separated by supported nodes in the strict consensus tree, and show insignificant nonmonophyly in WSR tests (Table 3).

TABLE 5. Population and genetic parameters for select Pinus species (see text for details on calculations).

Species           Generation     θ (b)     Ne            Years for monophyly to be     Years to reach
                  time (a)                               more likely than paraphyly    genome-wide coalescence
P. lambertiana    60             0.02016   12.0 × 10^4   24.0 × 10^6                   76.3 × 10^6
P. discolor       50             0.01124   8.03 × 10^4   13.4 × 10^6                   42.6 × 10^6
P. flexilis       30             0.00143   1.70 × 10^4   1.71 × 10^6                   5.41 × 10^6

(a) Generation time in years, taken as the average number of years to seed bearing (Krugman and Jenkinson, online Woody Plant Seed Manual). Estimates for P. discolor are not available, so we used values from P. edulis, which is highly similar.
(b) Estimates of theta are calculated using Watterson's approximation (Rozas et al., 2004) based on silent positions.

However, inadequate phylogenetic information cannot explain many of the cases of nonmonophyly. There were 14 species across the subgenus that had one or more alleles in a supported (79% to 100% BS) topological position that ensured allelic nonmonophyly. Further, of these 14 species, constrained topologies resulted in 10 species being significantly nonmonophyletic (WSR tests; Table 3). The fact that 10 species are resolved as nonmonophyletic, show strong nodal support, and have significant WSR results provides compelling evidence that the species-level polyphyly uncovered in this study is the product of true biological signal and not simply inadequate phylogenetic information.

Imperfect taxonomy.—In pines, overreliance on labile characters, such as number of needles per fascicle, or the misinterpretation of intraspecific variation has caused imperfect taxonomic circumscriptions in subg. Strobus (reviewed in Farjon and Styles, 1997; Businsky, 2004; Gernandt et al., 2005). Although issues of taxonomic uncertainty remain, particularly in the Asiatic members of subsect. Strobus and the Mexican members of subsect. Cembroides, it appears that imperfect taxonomy plays only a minor role in explaining the observed species nonmonophyly. For taxa in subsect. Cembroides, broader species circumscription (e.g., synonymizing P. cembroides and P. remota [Farjon and Styles, 1997], synonymizing P. cembroides, P. discolor, and P. remota [Kral, 1993], or synonymizing P. cembroides, P. discolor, and P. johannis [Farjon and Styles, 1997]) does not result in monophyly due to allele sharing across the major clades or well-supported nodes within one of the major clades. Even the broadest species concept, which treats all taxa of clades F, G, and H as varieties of P. cembroides (Shaw, 1914, with the caveat that several of these taxa were described subsequent to his publication), would fail to yield a monophyletic P. cembroides sensu lato due to the position of johannisA2 in clade I (Fig. 2).

The situation in sect. Quinquefoliae is similar to that of sect. Parrya. Pinus bhutanica was recently recognized as a of P. wallichiana (Businsky, 1999, 2004). Although their alleles are closely related, they appear in two separate clades (D and E), and thus even a broader species concept does not result in a monophyletic allele lineage. Pinus kwangtungensis, recently considered to be conspecific with P. wangii Hu & W. C. Cheng and related to P. parviflora (Businsky, 2004), is found in the well-supported (98% BS) clade E, several steps removed from P. parviflora. Pinus chiapensis, first described as a variety of P. strobus (Martinez, 1940), was very strongly supported (100% BS) as monophyletic in a monotypic clade C, and with no allele sharing between this species and P. strobus. Further, in the 20 recovered trees for sect. Quinquefoliae, the alternative placement for P. chiapensis (clade C) was sister to clade B, which contains two of three alleles from P. monticola but none of the diversity of P. strobus. In a final example, Critchfield and Little (1966) recognize P. strobiformis as a morphological and geographical link between P. flexilis to the north and P. ayacahuite to the south. Neither a broadly circumscribed P. ayacahuite nor P. flexilis would restore species monophyly in this taxonomic complex due to the diversity within P. strobiformis.

Hybridization and introgression.—The ability for pine species to interbreed under artificial (Little and Righter, 1965; Garrett, 1979; Critchfield, 1986) and natural conditions (P. monophylla × P. edulis: Lanner, 1974a; Lanner and Phillips, 1992; P. parviflora × P. pumila: Watano et al., 2004) has been well documented. In addition, there are at least three cases where hypothesized hybridization has given rise to a named taxon in subg. Strobus, namely P. hakkodensis (Farjon, 1998), P. quadrifolia (Lanner, 1974b), and P. edulis var. fallax (Little, 1968). However, none of these cases have been supported with molecular evidence, in contrast to the well-documented hybrid origin of P. densata in subg. Pinus (Wang et al., 2001; Song et al., 2003). In Pinus, natural hybridization is usually of local occurrence in areas of sympatry, and often associated with disturbance (Ledig, 1998). The two subgenera, Pinus and Strobus, are completely isolated, and crossing among subsections is very rare (Little and Critchfield, 1969).

In order to determine whether hybridization could explain the distribution of alleles in the nonmonophyletic species, we looked specifically at the groups where hybridization has been documented either in the wild or under artificial conditions. In subsect. Strobus, hybridization between P. pumila × P. parviflora (Watano et al., 2004) is not reflected in Figure 1, where both species are monophyletic and found in unique clades (A and D, respectively). Gernandt et al. (2005) resolve in a clade of North American species based on a cpDNA analysis. The discrepancy between cpDNA and the LEA-like locus may be a result of cpDNA introgression. In subsect. Parrya, the documented hybridization between P. edulis × P. monophylla is widely recognized (Lanner, 1974a). In this case, the introgression of alleles in regions of sympatry could be observed, but we uncovered no cases of allele sharing among these species (Fig. 2). The hybrid origin of P. quadrifolia could not be evaluated because we did not sample populations of putative parents "P. juarezensis" Lanner from and P. monophylla from southern California.

Pinus lambertiana serves to illustrate the potential for allelic nonmonophyly in the absence of hybridization. This species is easily distinguished from all other members of subg. Strobus, and it is reproductively isolated from all North American pines (Critchfield, 1986; Fernando et al., 2005). In our study, P. lambertiana is polyphyletic, with three alleles appearing in as many clades in the strict consensus tree. The most recent common ancestor for all P. lambertiana alleles is at the node supporting the divergence of clade D from clades A/B/C (Fig. 1, marked with an "L"). The allele lambertianaA1 (clade A) is sister to monticolaA1 with 100% BS and could be interpreted as a potentially introgressant allele between these species. However, Critchfield (1975, 1986) demonstrated that P. lambertiana can hybridize only with Eurasian members of subsect. Strobus, and the failure of P. lambertiana × P. monticola crosses traces to prefertilization barriers (Fernando et al., 2005). It is noteworthy that lambertianaA3 is found associated with clade D, an otherwise strictly Eurasian clade. There remains the possibility that allele A3 represents an ancient hybridization event, but this seems unlikely given the geographic and reproductive isolation of P. lambertiana.

Paralogy versus orthology.—We have no evidence to suggest that the observed pattern of species nonmonophyly is the result of paralogous gene amplification. On the contrary, evidence suggests that the sequences amplified from across subg. Strobus were orthologous.

estimates of Willyard et al. (2007), the species-rich subsections Cembroides and Strobus diverged from their sister lineages (subsects. Balfourianae and Gerardianae, respectively) between 20 and 10 million years ago, depending upon the calibration date used. Given the rapidity with which lineages radiated in these subsections (e.g., clades A to E in Fig. 1, clades F to I in Fig. 2), divergence events may have occurred so rapidly that lineage sort-
First, ing rarely approached allelic fixation within daughter the LEA-like locus has been shown to have a low copy lineages. number and is not a member of a large gene (Kru- Recombination.—The effect of recombination on recon- tovsky et al., 2004). The locus was amplified from haploid structing genealogies has been well documented (re- (megagametophyte) tissue, and this provides a power-

viewed in Posada et al., 2002). Because recombination Downloaded from https://academic.oup.com/sysbio/article/56/2/163/1685319 by guest on 28 September 2021 ful screen for evaluating PCR pool heterogeneity. If du- produces sequence segments that have different ge- plicate paralogs were amplified from the genome, this nealogical histories, organismal history cannot be accu- would produce a heterogeneous PCR pool that would rately depicted by a single phylogenetic "tree," but rather be easily identified by direct DNA sequencing. This was not observed, so we infer that only a single LEA-like tar- a set of correlated trees across recombinant segments in get was amplified and seqeunced. the alignment. Under scenarios of ancient recombina- tion, recombination between highly similar segments, or Lineage sorting.—Lineage sorting is the process by low recombination rates, the majority of positions sam- which ancestral polymorphism is "sorted" and later pled will accurately reflect phylogenetic history and the fixed in daughter lineages, either by drift or by se- impact of recombination will be limited to alleles within lection. Incomplete lineage sorting, by contrast, is the species or recently diverged species (Schierup and Hein, persistence and retention of ancestral polymorphisms 2000; Posada and Crandall, 2002). In contrast, recombina- through multiple speciation events. Incomplete lineage tion between divergent sequences or high recombination sorting can potentially impact any single-locus gene tree rates can produce inaccurate phylogenies that show arti- in any taxon. The theoretical impact of incomplete lin- factually long terminal branches, apparent trans-specific eage sorting on gene trees and inferred species trees has polymorphism (as shown here and the study of Bouille" been known for some time (Nei, 1987; Doyle, 1992). 
In- and Bousquet, 2005), and phylogenies that are signifi- complete lineage sorting can be promoted by biological cantly different from the true histories underlying the conditions that encourage the retention of genetic vari- data (Posada and Crandall, 2002). ability in a species, e.g., long life spans, large Ne, and The locus examined in this study has certainly ex- outcrossing. The retention of shared ancestral polymor- perienced recombination during the divergence of sec- phisms is also affected by natural selection (Broughton tions of subg. Strobus, but different methods employed and Harrison, 2003), in that balancing selection works (maximum x2; DSS) fail to detect recombination in our to oppose directional selection and maintain genetic di- data. We chose these methods because they use differ- versity. The timing of speciation events is another crit- ent approaches to infer recombination and because the ical factor impacting the process of lineage sorting be- maximum x2 method shows high sensitivity and a low cause rapid radiations or multiple speciation events in false-error rate in detecting recombination from empiri- quick succession reduce the chances for lineage sorting cal data sets (Posada, 2002). The low nucleotide diversity to reach completion before cladogenesis. If the time inter- characteristic of pines from subg. Strobus (n = 0.02 for vals between species divergence events are short relative Sect. Quinquefoliae and 0.03 for Sect. Parrya) may be the to the time intervals between lineage-branching events primary obstacle in detecting recombination because n in each species, ancestral polymorphisms may be carried values of ~5% are required to obtain statistical power through successive rounds of divergence. (Posada and Crandall, 2002). 
The added combination of Recent studies implicate incomplete lineage sorting high haplotype diversity and large effective population as a major factor in the retention of polymorphism in sizes for these species makes the detection of parental se- plants (Ioerger et al., 1990; Comes and Abbott, 2001; quences and daughter recombinants unlikely given our Chiang et al., 2004; Bouille and Bousquet, 2005) and ani- sampling strategy. mals (Nagl et al., 1998; Hare et al., 2002; Broughton and For pine species exhibiting monophyly across multiple Harrison, 2003; citations in Funk and Omland, 2003). LEA-like alleles (e.g., P. albicaulis, P. chiapensis; Figs. 1,2), Estimates for the retention of ancestral polymorphisms undetected recombination will have mimimal impact on range from 27 to 36 million years for SI alleles under bal- phylogenetic resolution because recombinants between ancing selection in the (Ioerger et al., 1990) "divergent" alleles will show congruent phylogenetic to 10 to 18 million years for genes of unknown function signal. For species deviating significantly from mono- in three species of Picea (Bouille and Bousquet, 2005). In phyly, recombination among divergent alleles could Pinus, ancestral retention on this order could explain all have a pronounced impact. Simulations by Schierup cases of species nonmonophyly within each of the sub- and Hein (2000) show that low levels of recombina- sections. According to the molecular clock divergence tion can produce trees that underestimate the amount 2007 SYRING ET AL.—WIDESPREAD NONMONOPHYLY IN PINUS SUBGENUS STROBUS 177 of true divergence between parental alleles and yield wide coalescence would extend far beyond the diver- more "star-like" phylogenies than would be obtained gence of the two sections from subg. Strobus. Clearly, the with nonrecombined sequences. 
Clearly, the presence of accuracy of these estimates depends upon many assump- divergent allelic polymorphism is primarily responsible tions; nevertheless, they highlight the fallacy of assum- for the pattern of nonmonophyly in pines; recombination ing 'species monophyly' in groups characterized by large simply adds complexity and uncertainty to the pattern. population sizes, and the complexity of resolving phylo- genetic relationships among pine species. In light of this information, the historic effective pop- How Do Pine Species Attain Monophyly? ulation size, as reflected by nucleotide diversity within Rieseberg and Brouillet (1994) suggest that species contemporary species, seems to be the driving factor in concepts based on monophyly are inadequate be- determining whether pine species are genetically unique, cause paraphyletic species should be expected from and whether genes can accurately trace a species phylo- progenitor-derivative speciation. Under that mode of genetic history. Noteworthy in this regard is that cur- speciation, paraphyly should be expected as a direct re- rent geographic ranges (a proxy for census population Downloaded from https://academic.oup.com/sysbio/article/56/2/163/1685319 by guest on 28 September 2021 sult of incomplete lineage sorting. Given enough time sizes and global abundance) are uncorrelated with nu- following a speciation event, drift or directional selection cleotide diversity, and show no association with the ex- should theoretically lead to genome-wide monophyly tent of monophyly or nonmonophyly of a species (Table via the sorting and extinction of lineages (Rieseberg and 4). Based on this analysis (Table 4), it seems that effective Brouillet, 1994; Rosenberg, 2003) as long as balancing se- population size either has an unpredictable association lection does not maintain polymorphisms that predate with monophyly in pines, or perhaps more likely, that ge- speciation events. 
At the point when lineage sorting is ographic range (or census count) is a poor predictor of the complete, all new mutation will result from the same effective population size of a species. These results were lineage, and intraspecific variation will reflect postspe- not entirely unexpected, because studies of pines have ciation mutation. The situation in Pinus is more com- shown that geographically widespread species can lack plex because multiple speciation events appear to have genetic diversity, whereas narrowly distributed pines occurred before lineage sorting was completed in any can show ample genetic diversity (Ledig, 1998; Ledig single bifurcation event. As a consequence, ancestral et al., 1999; Delgado et al., 1999). We note, for instance, polymorphisms have been retained through multiple that P. chiapensis (a narrow endemic of Mexico, limited speciation events. One potential outcome of ancestral al- to ca. 5,000 km2) shows far greater nucleotide diversity lelic retention on this order is that allele lineages within than P. albicaulis (n = 0.0031 versus 0.0000), even though species become polyphyletic. the latter species is dispersed across ca. 400,000 km2 of The occurrence of paraphyletic and polyphyletic western North America. species in both Figures 1 and 2 appears consistent with The multiple cycles of glaciation in North America, our estimates for the number of years until reciprocal Europe, and Asia have had a pronounced impact on monophyly is expected to be more likely than paraphyly genetic diversity (MacDonald et al., 1998; Petit (Table 5; Rosenberg, 2003). Even though our estimates of et al., 2003), and it is likely that contemporary ranges re- 6 are based on a small sample, the values in Table 5 are in- flect nonequilibrium processes of recent expansion (e.g., structive. For example, the estimated time for reciprocal species with low genetic diversity and large geographic monophyly to be more likely than paraphyly is 13.4 mil- ranges like P. 
koraiensis and P. strobus), recent range con- lion years for P. discolor. This value is bounded by the traction (e.g., species with high genetic diversity and estimated age of sect. Cembroides, which is calculated at small ranges like P. bhutanica and P. johannis), and pos- 19 million years (Willyard et al., 2007). Topological anal- sibly geographic or ecological isolation coupled with yses suggest that P. discolor is weakly nonmonophyletic long-distance dispersal events (e.g., P. chiapensis and P. (35% of WSR tests were significant; Table 3) at the LEA- albicaulis). This observation has implications for molec- like locus, a finding that shows surprisingly good agree- ular phylogenetic studies focusing on recently diverged ment with the approximations of allele coalescence and taxa (e.g., genera, and species complexes) because it high- molecular evolutionary age of the Cembriodes lineage. lights the importance of considering the magnitude of in- Nevertheless, the estimate for genome-wide coalescence traspecific diversity within the overall pattern of phyletic in this species is ca. 43 million years, suggesting that divergence. Until accurate estimates of relative or abso- portions of the genome may harbor deep trans-species lute effective population sizes become available, the con- polymorphisms, even under neutrality. nection between monophyly and Ne can only be inferred For species with large values of 0, such as P. lambertiana, from simulation studies (e.g., Rosenberg, 2003). the phylogenetic implications are striking. A calculated value of ca. 24 million years until reciprocal monophyly is expected to be more likely than paraphyly would predate Broader Implications for Phylogenic Studies of Seed Plants the divergence of subsects. Strobus and Gerardianae. 
If this Coalescence of allele lineages is dependent on a estimate is correct, it suggests that the potential exists for suite of interacting processes that occurred prior to and trans-species polymorphisms to be snared among pine during speciation, and continue to the present. Pro- species from different subsections. Further, a calculated cesses responsible for genie nonmonophyly in species value of ca. 76 million years for complete genome- include hybridization with subsequent introgression, 178 SYSTEMATIC BIOLOGY VOL. 56 incomplete lineage sorting, and recombination. These the limitation of noncoalescence, discrete nuclear genes processes may become superimposed, and their present- can provide important insights into the historical, demo- day genealogical patterns may reflect ancient and recent graphic, and possibly even selective processes that forge events. Sequences of this nature track the mutational his- new species. This information offers new perspectives tory of an allele, but they are unlikely to track the com- (relative to organellar DNA or nrlTS) as to the biolog- paratively simple cladistic history of recently diverged ical basis for the presence (or absence) of phylogenetic species. It is noteworthy that these processes are found to patterns. varying degrees in cpDNA and mtDNA; thus, the lack of Given the patterns of nonmonophyly detected in this species monophyly cannot be dismissed as exclusively a study, the use of nuclear genes (including nrlTS; see nuclear gene phenomenon. 
Nonetheless, coalescent sim- Gernandt et al., 2001) for the identification of species us- ulations by Rosenberg (2003) show that when three coa- ing DNA barcoding (Chase et al., 2005; Kress et al., 2005) lescent units of time have passed for haploid organellar is probably inappropriate in pines because species are genes, the probability of reciprocal monophyly exceeds segregating for polymorphisms that are retained across 0.8; this same length of time equates to 0.75 coalescent multiple and ancient speciation events. In some cases Downloaded from https://academic.oup.com/sysbio/article/56/2/163/1685319 by guest on 28 September 2021 units for nuclear loci, at which time the probability of (e.g., P. lambertiana and P. edulis), intraspecih'c allelic di- genie monophyly is less than 0.1. In the absence of com- versity is located in widely divergent phylogenetic po- plicating factors (e.g., organellar introgression, nuclear sitions (Figs. 1 and 2). Preliminary data from the matK recombination), the prevalence of noncoalescence is ex- region of the chloroplast genome among members of the pected to be much more problematic for nuclear loci than North American subsection Quinquefoliae (Liston et al., organellar loci. unpublished data) indicate that although allelic non- Given the long time frame predicted by coalescent the- monophyly occurs, it may not be as widespread as in ory for species of Pinus to attain monophyly (Table 5), the nuclear genome. genealogical-based species concepts (de Queiroz and Perhaps the most important general finding of this Donoghue, 1988; Baum and Donoghue, 1995; Shaw, 1998) work is that coalescence failure can lead to significantly derived from molecular markers may be inappropriate different phylogenetic interpretations that are only de- for pines and other species sharing similar life history tectable by sampling multiple individuals per species, traits. Distinct morphological and ecological differences and perhaps multiple loci. 
In Pinus, large Ne, long gener- are readily apparent between most of the species of ation times, and high outcrossing rates combine to make subgenus Strobus included in this study, yet species- allelic noncoalescence a readily detected pattern, even level paraphyly or polyphyly appears in nearly half of with a small sample size. Given this combination of traits, the species examined. Evidence from Pinus is consistent it seems reasonable to expect that other species-rich tree with the theoretical expectation that a large portion of genera with large, widespread populations, e.g., Quer- cus (400 species), Salix (450 species), Ficus (750 species), the genome frequently remains common to closely re- and Eucalyptus (680 species), should be prone to similar lated species well after speciation has occurred (Rosen- levels of nonmonophyly if examined by nuclear mark- berg, 2003). Wu (2001) outlined a theory of speciation ers. Less clear is the impact noncoalescence will have whereby "differentiation loci" become fixed within a pair on less widespread, more genetically uniform taxa. In of species whereas regions of "neutral divergence" in be- order to ensure the robustness of future studies, we rec- tween these fixed sites are free to have unrestricted gene ommend that researchers explicitly test the monophyly flow. Under this theory, the fixed regions in both genomes of species and lower level taxa by including multiple in- become larger over time as gene flow is further restricted. dividuals across the range of a species, especially prior Our data suggest the possibility for retention of ancestral to proposing new classifications or delimiting species. In polymorphisms in these regions of neutral divergence, the absence of such sampling,2 and working under the regardless of whether gene flow is ongoing. 
In other assumption of species monophyly, it seems likely that words, even after interspecific mating barriers become many more cases of species nonmonophyly will remain fixed at many loci, pines can be expected to harbor ances- undetected. tral polymorphisms at a large fraction of their genome. The demonstration of widespread species-level non- monophyly appears to be a severe constraint in the ACKNO WLEDG EMENTS application of nuclear genes in resolving organismal evolutionary history of pines at low taxonomic ranks. We thank Martin Gardner, Fiona Inches, and Philip Thomas (Royal Similar conclusions were reported by Bouille and Botanic Garden Edinburgh, Scotland), David Gernandt (Universidad Bousquet (2005) and Broughton and Harrison (2003), Aut6noma de Hidalgo, Mexico), Dave Johnson (USDA Forest Service, all of whom have suggested that nuclear gene genealo- Institute of Forest Genetics, Placerville, CA), Richard Sneizko (USDA gies have limited potential to reconstruct evolutionary Forest Service, Dorena Genetic Resource Center, Cottage Grove, OR), Michael Wall (Rancho Santa Ana Botanic Garden, Claremont, CA), histories among closely related species. Although we Jesus Vargas Hernandez (Institute de Recursos Naturales, Mexico), generally agree with this conclusion for our work in de- Randall Hitchin (Washington Park Arboretum, Seattle), Dale Simpson ciphering relationships among the terminal species of and Jean Beaulieu (Natural Resources Canada), Steven Roelof (Native Pinus, it is difficult to extrapolate these findings to dis- Plant Society of Oregon), Jim Buck (University of Michigan), Richard similar taxa with different life history traits (e.g., small Halse (Oregon State University), Bill Dvorak (CAMCORE), Forest Tree Breeding Center Gapan), Sanderson McArthur (USDAFS Rocky Mt. Re- Ne, short life spans, higher inbreeding rates). 
Even with search Station), Paul Halladin (Iseli Nursery, Oregon), Dennis Ringnes 2007 SYRING ET AL.—WIDESPREAD NONMONOPHYLY IN PINUS SUBGENUS STROBUS 179

(USDAFS Central Zone Genetic Resource Program, California), Carrie Sweeney (USDAFS Oconto River Seed Orchard, Minnesota), Joe Myers (USDAFS Coeur d'Alene Nursery, Idaho), Yasayuki Watano (Chiba University, Japan), Ioan Blada (Forest Research and Management Institute of Bucharest, Romania), Frank Hammond (U.S. Army, Fort Huachuca, AZ), Kirsten Winter (Cleveland National Forest, San Diego, CA), James C. Zech (Sul Ross State University, Alpine, TX), and Konstantin Krutovsky (Texas A&M University) for the generous contributions of seed and needle tissue. Funding for this study was provided by the National Science Foundation grant DEB 0317103 to Aaron Liston and Richard Cronn and the USDA Forest Service Pacific Northwest Research Station.

REFERENCES

Alvarez, I., R. Cronn, and J. F. Wendel. 2005. Phylogeny of the New World diploid cottons (Gossypium L., Malvaceae) based on sequences of three low-copy nuclear genes. Plant Syst. Evol. 252:199-214.
Alvin, K. 1960. Further conifers of the Pinaceae from the Wealden Formation of Belgium. Mem. Inst. Roy. Sci. Nat. Belg. 146:1-39.
Andresen, J. W. 1966. A multivariate analysis of the Pinus chiapensis-monticola-strobus phylad. Rhodora 68:1-24.
Bailey, D. K. 1970. Phytogeography and taxonomy of Pinus subsection Balfourianae. Ann. Missouri Bot. Gard. 57:210-249.
Baum, D. A., and M. J. Donoghue. 1995. Choosing among alternative "phylogenetic" species concepts. Syst. Bot. 20:560-573.
Berry, P. E., A. L. Hipp, K. J. Wurdack, B. Van Ee, and R. Riina. 2005. Molecular phylogenetics of the giant genus Croton and Crotoneae (Euphorbiaceae sensu stricto) using ITS and trnL-trnF DNA sequence data. Am. J. Bot. 92:1520-1534.
Bouillé, M., and J. Bousquet. 2005. Trans-species shared polymorphisms at orthologous nuclear gene loci among distant species in the conifer Picea (Pinaceae): Implications for the long-term maintenance of genetic diversity in trees. Am. J. Bot. 92:63-73.
Broughton, R. E., and R. G. Harrison. 2003. Nuclear gene genealogies reveal historical, demographic and selective factors associated with speciation in field crickets. Genetics 163:1389-1401.
Businsky, R. 1999. Taxonomic revision of Eurasian pines (genus Pinus L.)—Survey of species and infraspecific taxa according to latest knowledge. Acta Pruhoniciana 68:7-86.
Businsky, R. 2004. A revision of the Asian Pinus subsection Strobus (Pinaceae). Willdenowia 34:209-257.
Chase, M. W., N. Salamin, M. Wilkinson, J. M. Dunwell, R. P. Kesanakurthi, N. Haidar, and V. Savolainen. 2005. Land plants and DNA barcodes: Short-term and long-term goals. Phil. Trans. R. Soc. Lond. B 360:1889-1895.
Chiang, T.-Y., K.-H. Hung, T.-W. Hsu, and W.-L. Wu. 2004. Lineage sorting and phylogeography in Lithocarpus formosanus and L. dodonaeifolius (Fagaceae) from Taiwan. Ann. Missouri Bot. Gard. 91:207-222.
Church, S. A., and D. R. Taylor. 2005. Speciation and hybridization among Houstonia (Rubiaceae) species: The influence of polyploidy on reticulate evolution. Am. J. Bot. 92:1372-1380.
Comes, H. P., and R. J. Abbott. 2001. Molecular phylogeography, reticulation, and lineage sorting in Mediterranean Senecio sect. Senecio (Asteraceae). Evolution 55:1943-1962.
Crisp, M. D., and G. T. Chandler. 1996. Paraphyletic species. Telopea 6:813-844.
Critchfield, W. B. 1975. Interspecific hybridization in Pinus: A summary review. Pages 99-105 in Proceedings 14th Canadian Tree Improvement Association, Part 2, Symposium on interspecific hybridization in forest trees, NB, 28-30 August 1973 (Fowler, D. P., and C. W. Yeatman, eds.). Canadian Forestry Service, Ottawa, Ontario.
Critchfield, W. B. 1986. Hybridization and classification of the white pines (Pinus section Strobus). Taxon 35:647-656.
Critchfield, W. B., and E. L. Little, Jr. 1966. Geographic distribution of the pines of the world. USDA Forest Service Miscellaneous Publication 991. Washington, DC.
Cross, E. W., C. J. Quinn, and S. J. Wagstaff. 2002. Molecular evidence for the polyphyly of Olearia (Astereae: Asteraceae). Plant Syst. Evol. 235:99-120.
de Queiroz, K., and M. J. Donoghue. 1988. Phylogenetic systematics and the species problem. Cladistics 4:317-338.
Delgado, P., D. Piñero, A. Chaos, N. Pérez-Nasser, and E. R. Alvarez-Buylla. 1999. High population differentiation and genetic variation in the endangered Mexican pine Pinus rzedowskii (Pinaceae). Am. J. Bot. 86:669-676.
Doyle, J. J. 1997. Trees within trees: Genes and species, molecules and morphology. Syst. Biol. 46:537-553.
Doyle, J. J. 1992. Gene trees and species trees: Molecular systematics as one-character taxonomy. Syst. Bot. 17:144-163.
Farjon, A. 1998. World checklist and bibliography of conifers. Royal Botanical Gardens at Kew, Richmond, UK.
Farjon, A. 2005. Drawings and descriptions of the genus Pinus, 2nd edition. E. J. Brill and W. Backhuys, Leiden, Netherlands.
Farjon, A., and B. T. Styles. 1997. Pinus (Pinaceae). Flora Neotropica Monograph 75. The New York Botanical Garden, New York.
Felsenstein, J. 1985. Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.
Feng, Y., S.-H. Oh, and P. S. Manos. 2005. Phylogeny and historical biogeography of the genus Platanus as inferred from nuclear and chloroplast DNA. Syst. Bot. 30:786-799.
Fernando, D. D., S. M. Long, and R. A. Sniezko. 2005. Sexual reproduction and crossing barriers in white pines: The case between Pinus lambertiana (sugar pine) and P. monticola (western white pine). Tree Genet. Genomes 1:143-150.
Funk, D. J., and K. E. Omland. 2003. Species-level paraphyly and polyphyly: Frequency, causes, and consequences, with insights from animal mitochondrial DNA. Annu. Rev. Ecol. Evol. Syst. 34:397-423.
Garrett, P. W. 1979. Species hybridization in the genus Pinus. USDA Forest Service, Northeast Forest Experiment Station, Station Research Paper 436.
Gernandt, D. S., A. Liston, and D. Piñero. 2001. Variation in the nrDNA ITS of Pinus subsection Cembroides: Implications for molecular systematic studies of pine species complexes. Mol. Phylogenet. Evol. 21:449-467.
Gernandt, D. S., A. Liston, and D. Piñero. 2003. Phylogenetics of Pinus subsections Cembroides and Nelsoniae inferred from cpDNA sequences. Syst. Bot. 4:657-673.
Gernandt, D. S., G. G. Lopez, S. O. Garcia, and A. Liston. 2005. Phylogeny and classification of Pinus. Taxon 54:29-42.
Gilbert, C., J. Dempcy, C. Ganong, R. Patterson, and G. S. Spicer. 2005. Phylogenetic relationships within Phacelia subgenus Phacelia (Hydrophyllaceae) inferred from nuclear rDNA ITS sequence data. Syst. Bot. 30:627-634.
Goetsch, L., A. J. Eckert, and B. D. Hall. 2005. The molecular systematics of Rhododendron (Ericaceae): A phylogeny based upon RPB2 gene sequences. Syst. Bot. 30:616-626.
Goodwillie, C., and J. W. Stiller. 2001. Evidence for polyphyly in a species of Linanthus (Polemoniaceae): Convergent evolution in self-fertilizing taxa. Syst. Bot. 26:273-282.
Hare, M. P., F. Cipriano, and S. R. Palumbi. 2002. Genetic evidence on the demography of speciation in allopatric dolphin species. Evolution 56:804-816.
Hudson, R. R., and J. A. Coyne. 2002. Mathematical consequences of the genealogical species concept. Evolution 56:1557-1565.
Ioerger, T. R., A. G. Clark, and T. H. Kao. 1990. Polymorphism at the self-incompatibility locus in Solanaceae predates speciation. Proc. Natl. Acad. Sci. USA 87:9732-9735.
Kamiya, K., K. Harada, H. Tachida, and P. S. Ashton. 2005. Phylogeny of PgiC gene in Shorea and its closely related genera (Dipterocarpaceae), the dominant trees in southeast Asian tropical rain forests. Am. J. Bot. 92:775-788.
Kral, R. 1993. Pinus. Pages 373-398 in Flora of North America (North of Mexico), volume 2 (Flora of North America Editorial Committee, ed.). Oxford University Press, New York.
Kress, W. J., K. J. Wurdack, E. A. Zimmer, L. A. Weigt, and D. H. Janzen. 2005. Use of DNA barcodes to identify flowering plants. Proc. Natl. Acad. Sci. USA 102:8369-8374.
Krutovsky, K. V., M. Troggio, G. R. Brown, K. D. Jermstad, and D. B. Neale. 2004. Comparative mapping in the Pinaceae. Genetics 168:447-461.
Kumar, S., K. Tamura, I. B. Jakobsen, and M. Nei. 2001. MEGA2: Molecular evolutionary genetics analysis software. Bioinformatics 17:1244-1245.

Lanner, R. M. 1974a. Natural hybridization between and in the American Southwest. Silvae Genet. 23:108-116.
Lanner, R. M. 1974b. A new pine from Baja California and the hybrid origin of Pinus quadrifolia. Southwest. Nat. 19:75-95.
Lanner, R. M., and A. M. Phillips III. 1992. Natural hybridization and introgression of pinyon pines in northwestern Arizona. Int. J. Plant Sci. 153:250-257.
Ledig, F. T. 1998. Genetic variation in Pinus. Pages 251-280 in Ecology and biogeography of Pinus (D. M. Richardson, ed.). Cambridge University Press, Cambridge, UK.
Ledig, F. T., M. T. Conkle, B. Bermejo-Velazquez, T. Eguiluz-Piedra, P. D. Hodgskiss, D. R. Johnson, and W. S. Dvorak. 1999. Evidence for an extreme bottleneck in a rare Mexican pinyon: Genetic diversity, disequilibrium, and the mating system in Pinus maximartinezii. Evolution 53:91-99.
Lee, C., S.-C. Kim, K. Lundy, and A. Santos-Guerra. 2005. Chloroplast DNA phylogeny of the woody Sonchus alliance (Asteraceae: Sonchinae) in the Macaronesian Islands. Am. J. Bot. 92:2072-2085.
Levin, R. A., and J. S. Miller. 2005. Relationships within tribe Lycieae (Solanaceae): Paraphyly of and multiple origins of gender dimorphism. Am. J. Bot. 92:2044-2053.
Levin, R. A., K. Watson, and L. Bohs. 2005. A four-gene study of evolutionary relationships in Solanum section Acanthophora. Am. J. Bot. 92:603-612.
Liston, A., W. A. Robinson, D. Piñero, and E. R. Alvarez-Buylla. 1999. Phylogenetics of Pinus (Pinaceae) based on nuclear ribosomal DNA internal transcribed spacer region sequences. Mol. Phylogenet. Evol. 11:95-109.
Little, E. L., Jr. 1968. Two new pinyon varieties from Arizona. Phytologia 17:329-342.
Little, E. L., Jr., and W. B. Critchfield. 1969. Subdivisions of the genus Pinus. USDA Forest Service Miscellaneous Publication 1144. Washington, DC.
Little, E. L., Jr., and F. I. Righter. 1965. Botanical descriptions of forty artificial pine hybrids. USDA Forest Service Technical Bulletin 1345:1-47. Washington, DC.
Luo, Y., F. Zhang, and Q.-E. Yang. 2005. Phylogeny of Aconitum subgenus Aconitum (Ranunculaceae) inferred from ITS sequences. Plant Syst. Evol. 252:11-25.
MacDonald, G. M., L. C. Cwynar, and C. Whitlock. 1998. The late Quaternary dynamics of pines in northern North America. Pages 122-149 in Ecology and biogeography of Pinus (D. M. Richardson, ed.). Cambridge University Press, Cambridge, UK.
Magallón, S., and M. J. Sanderson. 2002. Relationships among seed plants inferred from highly conserved genes: Sorting conflicting phylogenetic signals among ancient lineages. Am. J. Bot. 89:1991-2006.
Malusa, J. 1992. Phylogeny and biogeography of the pinyon pines (Pinus subsect. Cembroides). Syst. Bot. 17:42-66.
Martin, D. P., C. Williamson, and D. Posada. 2005. RDP2: Recombination detection and analysis from sequence alignments. Bioinformatics 21:260-262.
Martinez, M. 1940. Pinaceas Mexicanas. Anales del Instituto de Biologia de Mexico 11:57-84.
Mason-Gamer, R. J. 2005. The β-amylase genes of grasses and a phylogenetic analysis of the Triticeae (Poaceae). Am. J. Bot. 92:1045-1058.
McGuire, G., F. Wright, and M. J. Prentice. 1997. A graphical method for detecting recombination in phylogenetic data sets. Mol. Biol. Evol. 14:1125-1131.
McKown, A. D., J.-M. Moncalvo, and N. G. Dengler. 2005. Phylogeny of Flaveria (Asteraceae) and inference of C4 photosynthesis evolution. Am. J. Bot. 92:1911-1928.
Meijer, J. 2000. Fossil woods from the late Cretaceous Aachen Formation. Rev. Palaeobot. Palynol. 112:297-336.
Miller, C. 1973. Silicified cones and vegetative remains of Pinus from the Eocene of British Columbia. Cont. Univ. Mich. Museum Paleo. 24:101-118.
Milne, I., F. Wright, G. Rowe, D. F. Marshall, D. Husmeier, and G. McGuire. 2004. TOPALi: Software for automatic identification of recombinant sequences within DNA multiple alignments. Bioinformatics 20:1806-1807.
Moore, W. S. 1995. Inferring phylogenies from mtDNA variation: Mitochondrial-gene trees versus nuclear-gene trees. Evolution 49:718-726.
Muller, K., and T. Borsch. 2005. Phylogenetics of Utricularia (Lentibulariaceae) and molecular evolution of the trnK intron in a lineage with high substitutional rates. Plant Syst. Evol. 250:39-67.
Nagl, S., H. Tichy, W. E. Mayer, N. Takahata, and J. Klein. 1998. Persistence of neutral polymorphisms in Lake Victoria cichlid fish. Proc. Natl. Acad. Sci. USA 95:14238-14243.
Nei, M. 1987. Molecular evolutionary genetics. Columbia University Press, New York.
Oh, S.-H., and D. Potter. 2005. Molecular phylogenetic systematics and biogeography of tribe Neillieae (Rosaceae) using DNA sequences of cpDNA, rDNA, and LEAFY. Am. J. Bot. 92:179-192.
Oline, D. K., J. B. Mitton, and M. C. Grant. 2000. Population and subspecific genetic differentiation in the foxtail pine (Pinus balfouriana). Evolution 54:1813-1819.
Pamilo, P., and M. Nei. 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:568-583.
Perez de la Rosa, J., S. A. Harris, and A. Farjon. 1995. Noncoding chloroplast DNA variation in Mexican pines. Theor. Appl. Genet. 91:1101-1106.
Perry, J. P. 1991. The pines of Mexico and Central America. Timber Press, Portland, Oregon.
Petit, R. J., I. Aguinagalde, J. L. de Beaulieu, C. Bittkau, S. Brewer, R. Cheddadi, R. Ennos, S. Fineschi, D. Grivet, M. Lascoux, A. Mohanty, G. Müller-Starck, B. Demesure-Musch, A. Palmé, J. P. Martin, S. Rendell, and G. G. Vendramin. 2003. Glacial refugia: Hotspots but not melting pots of genetic diversity. Science 300:1563-1565.
Popp, M., and B. Oxelman. 2004. Evolution of a RNA polymerase gene family in Silene (Caryophyllaceae)—Incomplete concerted evolution and topological congruence among paralogues. Syst. Biol. 53:914-932.
Posada, D. 2002. Evaluation of methods for detecting recombination from DNA sequences: Empirical data. Mol. Biol. Evol. 19:708-717.
Posada, D., and K. A. Crandall. 2001. Evaluation of methods for detecting recombination from DNA sequences: Computer simulations. Proc. Natl. Acad. Sci. USA 98:13757-13762.
Posada, D., and K. A. Crandall. 2002. The effect of recombination on the accuracy of phylogeny estimation. J. Mol. Evol. 54:396-402.
Price, R. A., A. Liston, and S. H. Strauss. 1998. Phylogeny and systematics of Pinus. Pages 49-68 in Ecology and biogeography of Pinus (D. M. Richardson, ed.). Cambridge University Press, Cambridge, UK.
Rieseberg, L. H., and L. Brouillet. 1994. Are many plant species paraphyletic? Taxon 43:21-32.
Roalson, E. H., and E. A. Friar. 2004. Phylogenetic relationships and biogeographic patterns in North American members of Carex section Acrocystis (Cyperaceae) using nrDNA ITS and ETS sequence data. Plant Syst. Evol. 243:175-187.
Robba, L., M. A. Carine, S. J. Russell, and F. M. Raimondo. 2005. The monophyly and evolution of Cynara L. (Asteraceae) sensu lato: Evidence from the internal transcribed spacer region of nrDNA. Plant Syst. Evol. 253:53-64.
Rosenberg, N. A. 2003. The shapes of neutral gene genealogies in two species: Probabilities of monophyly, paraphyly, and polyphyly in a coalescent model. Evolution 57:1465-1477.
Rosenberg, N. A., and M. Nordborg. 2002. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nature 3:380-390.
Rozas, J., J. Sánchez-DelBarrio, X. Messeguer, and R. Rozas. 2004. DnaSP: DNA sequence polymorphism, v4.00.6. Bioinformatics 19:2496-2497.
Rzedowski, J., and L. Vela. 1966. Pinus strobus var. chiapensis en la Sierra Madre del Sur de Mexico. Ciencia 24:211-216.
SAS Institute, Inc. 1999. SAS/STAT user's guide, version 8, volume 1. SAS Institute, Cary, North Carolina.
Schierup, M. H., and J. Hein. 2000. Consequences of recombination on traditional phylogenetic analysis. Genetics 156:879-891.
Schneeweiss, G., P. Schonswetter, S. Kelso, and H. Niklfeld. 2004. Complex biogeographic patterns in Androsace (Primulaceae) and related genera: Evidence from phylogenetic analyses of nuclear internal transcribed spacer and plastid trnL-F sequences. Syst. Biol. 53:856-876.

Shaw, G. R. 1914. The genus Pinus. Publications of the Arnold Arboretum No. 5, Riverside Press, Cambridge, Massachusetts.
Shaw, J. 2001. Biogeographic patterns and cryptic speciation in bryophytes. J. Biogeogr. 28:253-261.
Shaw, J., and R. L. Small. 2005. Chloroplast DNA phylogeny and phylogeography of the North American plums (Prunus subgenus Prunus section Prunocerasus, Rosaceae). Am. J. Bot. 92:2011-2030.
Shaw, K. L. 1998. Species and the diversity of natural groups. Pages 44-56 in Endless forms: Species and speciation (D. J. Howard and S. J. Berlocher, eds.). Oxford University Press, Oxford, UK.
Simmons, M. P., and H. Ochoterena. 2000. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 49:369-381.
Sites, J. W., and J. C. Marshall. 2003. Delimiting species: A renaissance issue in systematic biology. Trends Ecol. Evol. 18:462-470.
Smith, J. M. 1992. Analyzing the mosaic structure of genes. J. Mol. Evol. 34:126-129.
Song, B. H., X. Q. Wang, X. R. Wang, K. Y. Ding, and D. Y. Hong. 2003. Cytoplasmic composition in Pinus densata and population establishment of the diploid hybrid pine. Mol. Ecol. 12:2995-3001.
Swofford, D. L. 2003. PAUP*4.0b10: Phylogenetic analysis using parsimony (*and other methods). Sinauer Associates, Sunderland, Massachusetts.
Syring, J., A. Willyard, R. Cronn, and A. Liston. 2005. Evolutionary relationships among Pinus (Pinaceae) subsections inferred from multiple low-copy nuclear loci. Am. J. Bot. 92:2086-2100.
Templeton, A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221-244.
Treutlein, J., G. F. Smith, B.-E. van Wyk, and M. Wink. 2003. Evidence for the polyphyly of Haworthia (Asphodelaceae Alooideae; Asparagales) inferred from nucleotide sequences of rbcL, matK, ITS1 and genomic fingerprinting with ISSR-PCR. Plant Biol. 513-521.
Wang, X. R., A. E. Szmidt, and O. Savolainen. 2001. Genetic composition and diploid hybrid speciation of a high mountain pine, Pinus densata, native to the Tibetan plateau. Genetics 159:337-346.
Wang, X. R., Y. Tsumura, H. Yoshimaru, K. Nagasaka, and A. E. Szmidt. 1999. Phylogenetic relationships of Eurasian pines (Pinus, Pinaceae) based on chloroplast rbcL, matK, rpl20-rps18 spacer, and trnV intron sequences. Am. J. Bot. 86:1742-1753.
Watano, Y., A. Kanai, and N. Tani. 2004. Genetic structure of hybrid zones between and P. parviflora var. pentaphylla (Pinaceae) revealed by molecular hybrid index analysis. Am. J. Bot. 91:65-72.
Wendel, J. F., and J. J. Doyle. 1998. Phylogenetic incongruence: Window into genome history and molecular evolution. Pages 265-296 in Molecular systematics of plants II: DNA sequencing (P. S. Soltis, D. E. Soltis, and J. J. Doyle, eds.). Kluwer Academic Publishers, Dordrecht, Netherlands.
Wilkin, P., P. Schols, M. W. Chase, K. Chayamarit, C. A. Furness, S. Huysmans, F. Rakotonasolo, E. Smets, and C. Thapyai. 2005. A plastid gene phylogeny of the yam genus, Dioscorea: Roots, fruits and Madagascar. Syst. Bot. 30:736-749.
Willyard, A., J. Syring, D. Gernandt, A. Liston, and R. Cronn. 2007. Molecular evolutionary rates indicate a recent and rapid diversification for modern pine lineages. Mol. Biol. Evol. 23:1-12.
Winkworth, R. C., and M. J. Donoghue. 2005. Viburnum phylogeny based on combined molecular data: Implications for taxonomy and biogeography. Am. J. Bot. 92:653-666.
Wright, J. A., A. M. Marin V., and W. S. Dvorak. 1996. Conservation and use of the Pinus chiapensis genetic resource in Colombia. Forest Ecol. Manag. 88:283-288.
Wu, C.-I. 2001. The genic view of the process of speciation. J. Evol. Biol. 14:851-865.
Yamane, K., and T. Kawahara. 2005. Intra- and interspecific phylogenetic relationships among diploid Triticum-Aegilops species (Poaceae) based on base-pair substitutions, indels, and microsatellites in chloroplast noncoding sequences. Am. J. Bot. 92:1887-1898.
Yuan, Y. M., S. Wohlhauser, M. Moller, J. Klackenberg, M. Callmander, and P. Kupfer. 2005. Phylogeny and biogeography of Exacum (Gentianaceae): A disjunctive distribution in the Indian ocean basin resulted from long distance dispersal and extensive radiation. Syst. Biol. 54:21-34.

First submitted 14 February 2006; reviews returned 08 May 2006; final acceptance 05 September 2006
Associate Editor: Vincent Savolainen

APPENDIX 1. Representative phylogenetic studies that highlight the issue of species nonmonophyly. Papers include recent studies from four prominent journals (Systematic Biology, Systematic Botany, American Journal of Botany, Plant Systematics and Evolution). Only studies that included multiple samples per species and studies that sampled closely related species within a genus are included. Where possible, determinations of species monophyly were based on strict consensus trees (or ML trees).

| Authors | Taxonomic group | Loci | Number of species with multiple accessions (total species in study, % of species tested) | Number of species that were monophyletic (% of species tested that were monophyletic) | Number of accessions included per spp. (range) |
|---|---|---|---|---|---|
| Berry et al., 2005 | Croton, Euphorbiaceae | ITS + trnL-F | 7 (78, 9.0) | 7 (100) | 2-3 |
| Schneeweiss et al., 2004 | Androsace, Primulaceae | ITS + trnL-F | 15 (47, 31.9) | 14 (93) | 2-4 |
| Robba et al., 2005 | Cynara, Asteraceae | ITS1 | 5 (8, 62.5) | 4 (80) | 2-4 |
| Feng et al., 2005 | Platanus, Platanaceae | ITS + trnT-L | 4 (5, 80.0) | 4 (80) | 2-6 |
| Gilbert et al., 2005 | Phacelia, Hydrophyllaceae | ITS | 17 (53, 32.1) | 13 (76) | 2-4 |
| Oh and Potter, 2005 | Tribe Neillieae, Rosaceae | trnL-F + trnD-T + psbA-trnK + matK-trnK + ITS + ETS + leafy | 12 (16, 75.0) | 8 (66) | 2-4 |
| Yamane and Kawahara, 2005 | Triticum-Aegilops, Poaceae | trnC-rpoB + trnF-ndhJ + ndhF-rpl32 + atpI-atpH + 7 microsatellites | 13 (13, 100) | 8 (62) | 4-13 |
| Yuan et al., 2005 | Exacum, Gentianaceae | ITS + trnL intron | 5 (30, 16.7) | 3 (60) | 2-4 |
| Lee et al., 2005 | Sonchus, Asteraceae | ITS + trnT-L-F + matK | 12 (27, 44.4) | 6 (50) | 2-4 |
| Alvarez et al., 2005 | Gossypium, Malvaceae | AdhC | 6 (14, 42.9) | 3 (50) | 2-5 |
| Church and Taylor, | Houstonia, Rubiaceae | trnL intron + trnS-trnG spacer | 14 (17, 82.4) | 6 (43) | 2-13 |
| Kamiya et al., 2005 | Shorea, Dipterocarpaceae | PgiC | 17 (48, 35.4) | 6 (35) | 2-5 |
| Levin and Miller, 2005 | Lycium, Solanaceae | waxy + trnT-F | 6 (47, 12.8) | 2 (33) | 2-3 |
| McKown et al., 2005 | Flaveria, Asteraceae | trnL-F | 20 (21, 95.2) | 5 (25) | 2-7 |
| Roalson and Friar, 2004 | Carex, Cyperaceae | ITS + ETS | 4 (22, 18.2) | 1 (25) | 2-4 |
| Shaw and Small, 2005 | Prunus, Rosaceae | rplU | 13 (14, 92.9) | 0 | 3-53 |