
Molecular Phylogenetics and Evolution 64 (2012) 452–470

Three genome-based phylogeny of Cupressaceae s.l.: Further evidence for the evolution of gymnosperms and Southern Hemisphere biogeography

Zu-Yu Yang a,b, Jin-Hua Ran a, Xiao-Quan Wang a,⇑

a State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
b Graduate University of the Chinese Academy of Sciences, Beijing 100039, China

Article info

Article history:
Received 1 April 2011
Revised 1 May 2012
Accepted 2 May 2012
Available online 18 May 2012

Keywords:
LEAFY (LFY)
NEEDLY (NLY)
Allopolyploid origin
Ancient hybridization
Gymnosperm
Gondwana

Abstract

Phylogenetic information is essential to interpret the evolution of species. While DNA sequences from different genomes have been widely utilized in phylogenetic reconstruction, it is still difficult to use nuclear genes to reconstruct phylogenies of groups with large genomes and complex gene families, such as gymnosperms. Here, we use two single-copy nuclear genes, together with chloroplast and mitochondrial genes, to reconstruct the phylogeny of the ecologically important family Cupressaceae s.l., based on a complete sampling of its 32 genera. The different gene trees generated are highly congruent in topology, supporting the basal position of […] and the seven-subfamily classification, and the estimated divergence times based on different datasets correspond well with each other and with the oldest fossil record. These results imply that we have obtained the species phylogeny of Cupressaceae s.l. In addition, possible origins of all three polyploid conifers were investigated, and a hybrid origin was suggested for Cupressus, Fitzroya and Sequoia. Moreover, we found that the biogeographic history of Cupressaceae s.l. is associated with the separation between Laurasia and Gondwana and the further break-up of the latter. Our study also provides new evidence for the gymnosperm phylogeny.

© 2012 Elsevier Inc. All rights reserved.

⇑ Corresponding author. Address: State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, 20 Nanxincun, Xiangshan, Beijing 100093, China. Fax: +86 10 62590843.
E-mail address: [email protected] (X.-Q. Wang).

1055-7903/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ympev.2012.05.004

1. Introduction

Reconstructing plant phylogenies using sequences from independent nuclear loci and different genomic compartments has been increasingly popular due to the growing awareness that relying on a single data set may result in insufficient phylogenetic resolution or misleading inferences (Maddison, 1997; Wendel and Doyle, 1998). Phylogenetic congruence among different genomic compartments could strongly suggest that the gene trees are also congruent with the species phylogeny (Wang et al., 2000). On the other hand, molecular dating has proved very efficient in estimating evolutionary divergence times of diverse taxa (e.g., Wang et al., 2000; Sanderson, 2002; Knapp et al., 2005; Barker et al., 2007; Sauquet et al., 2009), although there are still some controversies regarding mainly the appropriateness of the selected model, calibration procedure, effect of long branches, and degree of congruence between time estimates and the fossil record (e.g., Kumar, 2005; Magallón and Sanderson, 2005; Rutschmann et al., 2007; Inoue et al., 2010; Magallón, 2010). It would be more convincing if divergence time estimates are congruent among different genes and consistent with the fossil record.

In the past decade, low-copy nuclear genes have been widely utilized to improve the resolution and robustness of plant phylogenetic reconstruction at various taxonomic levels (e.g., Wang et al., 2000; Sang, 2002; Peng and Wang, 2008). However, this use is limited by the problems associated with the complex evolutionary dynamics of nuclear genes, such as gene paralogy, recombination, lineage sorting, and lateral gene transfer (Small et al., 2004). This limitation is particularly notable for gymnosperms due to the large nuclear genomes and complex gene families (Kinlaw and Neale, 1997; Murray, 1998; Leitch et al., 2001; Ahuja and Neale, 2005), as well as the unavailability of complete genome sequences so far. In contrast to other low-copy nuclear genes with a high rate of birth and death evolution, the use of sister genes from ancient gene duplication could minimize these potential problems when both copies exist in the studied taxa.

Cupressaceae s.l., including Cupressaceae s.s. and traditional Taxodiaceae without Sciadopitys, is an important component of forests, and comprises 32 genera and more than 130 species (Farjón, 2005; Adams et al., 2009; Debreczy et al., 2009). Among them, only four genera, i.e., […], Cupressus, […] (a New World […] separated from Cupressus) (Adams et al., 2009) and Juniperus, have more than 10 species, and as many as 19 genera are monotypic. Cupressaceae s.s. was first separated from Taxodiaceae by Pilger
(1926), but afterwards the morphological, anatomical, embryological, immunological, and cladistic studies (Eckenwalder, 1976; Hart, 1987; Price and Lowenstein, 1989; Farjón, 2005; Schulz and Stutzel, 2007) as well as molecular investigations (Brunsfeld et al., 1994; Tsumura et al., 1995; Chaw et al., 1997, 2000; Stefanovic et al., 1998; Gadek et al., 2000; Kusumi et al., 2000; Quinn et al., 2002; Rydin et al., 2002; Schmidt and Schneider-Poetsch, 2002; Rai et al., 2008) consistently support a merger of the two families. For the infra-familial classification of Cupressaceae s.l., Gadek et al. (2000) divided the family into seven subfamilies based on morphological and cpDNA evidence, which include Cunninghamioideae, Taiwanioideae, Athrotaxidoideae, […], Taxodioideae, Callitroideae and Cupressoideae. However, Farjón (2005) did not recognize the subfamily Callitroideae that occurs in the Southern Hemisphere, and treated Cupressaceae s.s. as a subfamily (Cupressoideae) rather than two subfamilies.

All previous molecular phylogenies of Cupressaceae (s.l. or s.s.) were reconstructed based on chloroplast DNA (cpDNA) markers (Tsumura et al., 1995; Gadek and Quinn, 1993; Brunsfeld et al., 1994; Gadek et al., 2000; Kusumi et al., 2000), although 4–10 genera of the family were sampled in several other studies using nuclear genes (Chaw et al., 1997; Stefanovic et al., 1998; Kusumi et al., 2002; Rydin et al., 2002). In addition, the published cpDNA phylogenies comprise only 12–22 genera of Cupressaceae s.l. (Gadek and Quinn, 1993; Brunsfeld et al., 1994; Tsumura et al., 1995; Kusumi et al., 2000; Quinn et al., 2002) except that 31 genera were sampled by Gadek et al. (2000), and the intergeneric relationships, especially within Cupressaceae s.s., were poorly resolved in the rbcL gene trees (Gadek and Quinn, 1993; Brunsfeld et al., 1994; Gadek et al., 2000). In the study of Gadek et al. (2000), five genera occurring in the Southern Hemisphere ([…], […], Fitzroya, Pilgerodendron and […]) and two genera in the Northern Hemisphere [Cupressus s.s. and Xanthocyparis, a new genus from northern Vietnam (Farjón et al., 2002)] were not included in the combined matK + rbcL gene analysis. In particular, after diagnosing the inadequacies of the matK + rbcL analysis, the authors favored the matK + non-molecular analysis and used it as the basis for their classification. Furthermore, the chloroplast genome behaves as a single locus (Doyle, 1992), and can only provide genetic information of one parent given its predominantly paternal inheritance in Cupressaceae s.l. (reviewed by Mogensen, 1996). Thus, analysis of multiple genes from different genomic compartments, especially the nuclear genome, and extensive sampling would be necessary to clarify the intergeneric relationships within Cupressaceae s.l., since hybridization has played a major role in the development of plant species diversity (Soltis and Soltis, 2009).

Compared to the abundance of polyploids in angiosperms, conifers are mainly diploids with scattered natural polyploids only occurring in Cupressaceae s.l., such as the tetraploid Fitzroya cupressoides (Hair, 1968), and the hexaploid […] (Stebbins, 1948). Cupressaceae s.l. is also the only coniferous family with a virtually worldwide distribution, being represented in all continents except Antarctica. According to the fossil record and molecular divergence time estimates, the present distribution of the traditional Taxodiaceae is generally interpreted as a relic from a much more widespread and common occurrence in the past, while the split between the two clades of Cupressaceae s.s. (Northern and Southern Hemispheres) could be dated back to the separation of Laurasia and Gondwana (Li, 1953; Miller, 1977; Li and Yang, 2002). In particular, many genera of the Southern Hemisphere clade of Cupressaceae s.s. are confined to a single continent. Therefore, a robust phylogenetic reconstruction of Cupressaceae s.l., especially the intergeneric relationships, may also shed light on the origin and evolution of the rare natural polyploids of conifers and provide further evidence for the break-up history of Gondwana.

The LEAFY (LFY) gene that encodes a transcription factor involved in regulating cell division and arrangement or floral meristem identity occurs in all land plants (Frohlich and Meyerowitz, 1997; Maizel et al., 2005; Tanahashi et al., 2005; Moyroud et al., 2010). Although this gene exists as a single copy in most diploid angiosperms, its sister gene NEEDLY (NLY) that originated from a duplication event in the common ancestor of seed plants still remains in gymnosperms (Frohlich and Meyerowitz, 1997; Mouradov et al., 1998; Maizel et al., 2005; Vazquez-Lobo et al., 2007; Shiokawa et al., 2008). Thus, the duplicated sister genes LFY and NLY are very suitable to be used as nuclear gene markers for the phylogenetic reconstruction of Cupressaceae s.l.

Recently, the LFY gene, especially its second intron, has been widely used to reconstruct the phylogeny of many angiosperm groups (e.g., Oh and Potter, 2003; Grob et al., 2004; Kim et al., 2008), and several genera of gymnosperms such as Gnetum (Won and Renner, 2006), […] (Peng and Wang, 2008), and Pseudotsuga (Wei et al., 2010). Also, the NLY gene has been used to resolve the interspecific relationships of Cupressus (Little, 2006). Moreover, there is a rich fossil record of Cupressaceae s.l. (as summarized in Florin, 1963; Miller, 1977; Farjón, 2005), which is very helpful for estimating the divergence times of different lineages within the family.

In the present study, we use the two nuclear genes LFY and NLY, coupled with the chloroplast matK and mitochondrial rps3 genes, to reconstruct the phylogeny of Cupressaceae s.l. based on a complete sampling of its 32 genera. Then, we discuss the evolution of LFY and NLY in gymnosperms, and possible hybrid origins of Cupressus, Fitzroya and Sequoia. In addition, the biogeographical history of Cupressaceae s.l., in particular Cupressaceae s.s. in the Southern Hemisphere, is investigated with the help of molecular dating, the fossil record and geological evidence for the break-up of Gondwana.

2. Materials and methods

2.1. Taxon sampling

All the recognized 32 genera (totaling 45 species) of Cupressaceae s.l. (Farjón, 2005; Adams et al., 2009; Debreczy et al., 2009) were sampled. We used the names Callitropsis nootkatensis and Xanthocyparis vietnamensis following Debreczy et al. (2009). Taxus cuspidata var. nana and Cephalotaxus sinensis were chosen as outgroups to study the intergeneric relationships of Cupressaceae s.l. according to the sister relationship between Taxaceae–Cephalotaxaceae and Cupressaceae s.l. (Quinn et al., 2002). To investigate evolution of the LFY and NLY genes in gymnosperms, we also sampled the other families of conifers (except Phyllocladaceae), Zamia furfuracea, Ginkgo biloba and Gnetales. The LFY and NLY sequences of Zamia furfuracea, Ginkgo biloba, Gnetales, Pinus radiata, Picea abies, Podocarpus matudae var. reichei and Taxus globosa were downloaded from GenBank. All the above-mentioned samples were used in the DNA analysis. To preliminarily explore whether the LFY and NLY genes amplified from genomic DNA are functional, we further chose seven easily accessible species from three families (Araucariaceae, Cupressaceae s.l., Pinaceae) for RNA extraction and RT-PCR analysis. Moreover, to determine whether each of the LFY and NLY genes exists as a single locus, four species from different lineages of conifers that represent both diploid and polyploid species were selected for the Southern blot analysis. The origins of materials and voucher specimens are shown in Table S1.

2.2. DNA and RNA extraction, PCR and RT-PCR amplification, cloning and sequencing

Total DNA was extracted from silica gel dried leaves using either the modified CTAB method (Rogers and Bendich, 1985) or the
DNAsecure Plant Kit (Tiangen, Beijing, China). Young leaves of Araucaria heterophylla, Juniperus chinensis, Metasequoia glyptostroboides, Pinus armandii, Platycladus orientalis, Sequoia sempervirens and Thuja occidentalis (different individuals from those for DNA extraction except Pinus armandii) were collected for RNA extraction. Total RNA was isolated and the first-strand cDNA was produced following Ran et al. (2010a). For polymerase chain reaction, we used total DNA or cDNA as template. The LFY gene was amplified with the forward primer LFYE1F3 (Peng and Wang, 2008) or LFYE1F4 and reverse primer LFYE3R3 or LFYE3R4 (Peng and Wang, 2008), and the NLY gene with NLYE1F1 or NLYE1F2 (forward) and NLYE3R3 (reverse) (Fig. S1 and Table S2). To obtain the matK gene sequence of each species from the same individual as used in the other genes, also for the accuracy of sequences, we amplified this gene with primers matKF and matKR (Kusumi et al., 2000) rather than downloading sequences from GenBank. PCR amplification was carried out in a volume of 25 µl containing 100–200 ng DNA or 1–2 µl cDNA template for nuclear genes and 5–50 ng of DNA template for matK, 6.25 pmol of each primer, 0.2 mM of each dNTP, 2 mM MgCl2, and 0.75 U of ExTaq DNA polymerase (Takara Biotechnology Co., Ltd., Dalian, China). Amplification was conducted in a Tgradient thermal cycler (Biometra, Göttingen, Germany) or an Eppendorf Mastercycler (Eppendorf Scientific, Westbury, NY, USA). PCR cycles were as follows: one cycle of 4 min at 94 °C, four cycles of 1 min at 94 °C, 30 s at 55 °C (for matK and LFY) or 58 °C (for NLY), and 1.5–6.0 min at 72 °C, followed by 35 cycles of 30 s at 94 °C, 30 s at 53 °C (for matK and LFY) or 56 °C (for NLY), and 1.5–6.0 min at 72 °C, with a final extension step for 10 min at 72 °C. The amplification of the mitochondrial rps3 gene followed Ran et al. (2010a), using the same primer pair rps19-F + rpl16-R.

PCR products were separated by 1.2% agarose gel electrophoresis and purified using the TIANgel Midi Purification Kit (Tiangen, Beijing, China). For the nuclear genes, the purified PCR products were first sequenced directly with the PCR primers and two internal primers to cover all exons. Then, we cloned all the PCR products with the pGEM-T® Easy Vector System II (Promega, Madison, USA). Ten clones with the correct insertion, determined by digestion with EcoR I, were picked for each species and screened by sequencing with the primer T7. All distinct clones were further sequenced with the primer SP6 and internal primers (Table S2). For the cytoplasmic genes, the purified PCR products were directly sequenced. However, the matK gene of Juniperus chinensis, Libocedrus plumosa, Xanthocyparis vietnamensis, and Araucaria heterophylla has polymorphic sites in the sequencing chromatogram, so these samples were also cloned as described above.

Sequencing was performed using the DYEnamic ET Terminator Kit (Amersham Pharmacia Biotech) or the BigDye Terminator v3.1 Cycle Sequencing Kit. The sequencing products were separated on a MegaBACE 1000 automatic sequencer (Amersham Biosciences, Buckinghamshire, UK) or a 96-capillary 3730XL DNA analyzer (Applied Biosystems, Foster City, CA, USA). The sequences reported in this study are deposited in GenBank under accessions HQ245712–HQ245782 (NLY), HQ245783–HQ245805 and JN226115 (rps3), HQ245806–HQ245873 (LFY), and HQ245874–HQ245921 (matK) (Table S1).

2.3. Southern blot analysis

Southern blotting was used to detect the number of loci of the two nuclear genes LFY and NLY in the four conifer species Cunninghamia lanceolata, Sequoia sempervirens, Taxus cuspidata var. nana and Pinus armandii. Approximately 20–40 µg of genomic DNA was digested with restriction enzymes that do not have recognition sites in the probe sequence, separated on a 0.8% agarose gel, and transferred to a nylon Hybond N+ membrane using the VacuGene XL Vacuum Blotting System (Amersham Biosciences, Buckinghamshire, UK). For both genes, the probes were designed to cover exon 1 (for P. armandii) or exon 2 (for S. sempervirens and C. lanceolata) or exons 2–3 including intron 2 (for T. cuspidata var. nana), which were amplified from species-specific clones and labeled with alkaline phosphatase using the Gene Images AlkPhos Direct Labeling and Detection System (GE Healthcare, Buckinghamshire, UK). The membranes were hybridized and washed at 56–65 °C, following the protocol provided by the manufacturer. The hybridized signals were detected with the CDP-Star detection reagent (GE Healthcare, Buckinghamshire, UK), and visualized by a GeneGnome BioImaging System (Syngene, Cambridge, UK) or exposure to Kodak X-Omat BT Film at room temperature for 24 h.

2.4. Sequence analysis

Sequence alignments were made with CLUSTAL X (Thompson et al., 1997) and refined manually. The variable sites and variability of conspecific clones of LFY and NLY were calculated by MEGA version 3 (Kumar et al., 2004) and BioEdit software (Hall, 1999), respectively. The unalignable regions in LFY and rps3 and introns of the two nuclear genes that cannot be reliably aligned among different families of gymnosperms, even within Cupressaceae s.l., were excluded manually from the analyses at or above the family level. The aligned sequences were tested for substitutional saturation by a maximum likelihood method (Griffiths, 1997).

To assess congruence between the four gene data sets, LFY, NLY, matK and rps3, in Cupressaceae s.l., we used the incongruence length difference test (ILD) (Farris et al., 1994) as implemented in PAUP* 4.0b10 (Swofford, 2002) in a pair-wise fashion. In the first round of tests, all sequences (45 species) were included, and when a species harbors two distinct LFY or NLY clones, the other genes were duplicated and randomly paired with them. To identify sequences and species responsible for incongruence between data partitions, the ILD test was further conducted after excluding sequences from all species of which clones do not form monophyletic groups. When the data sets were reduced to include 39 species, the result showed that all genes except rps3 are not significantly incongruent (Table 1). However, the 39-species data set does not include the two genera Xanthocyparis and Fitzroya. Then, to include all genera of Cupressaceae s.l. in the combined analysis, we added Xanthocyparis vietnamensis and Fitzroya cupressoides into the 39-species data set, and tried all different combinations of the four clones of the two species (Xanthocyparis-NLY-1, Xanthocyparis-NLY-11, Fitzroya-LFY-1, and Fitzroya-LFY-7) for the ILD test (41 speciesA: Xanthocyparis-NLY-1 + Fitzroya-LFY-7; 41 speciesB: Xanthocyparis-NLY-1 + Fitzroya-LFY-1; 41 speciesC: Xanthocyparis-NLY-11 + Fitzroya-LFY-7; 41 speciesD: Xanthocyparis-NLY-11 + Fitzroya-LFY-1). The ILD test on the four 41-species data sets did not find significant incongruence among the three genes LFY, NLY and matK (Table 1), and preliminary phylogenetic analyses found that only the relationships of Xanthocyparis, Hesperocyparis and Callitropsis are different when different NLY clones of Xanthocyparis were used (trees not shown, and see Section 4). Finally, we randomly chose the 41 speciesA data set for the combined analysis (LFY + NLY, and LFY + NLY + matK). Although the ILD test showed that rps3 is significantly incongruent with other genes, we also tried to combine the four genes (LFY + NLY + matK + rps3) for analysis.

2.5. Phylogenetic analysis

Phylogenetic analyses were conducted on seven data matrices, including two of gymnosperms (LFY and NLY, Zamia furfuracea as outgroup), and five of Cupressaceae s.l., i.e., LFY + NLY, matK, rps3, LFY + NLY + matK, and LFY + NLY + matK + rps3. According to the ML test (Griffiths, 1997), none of the data sets was saturated with nucleotide substitutions, thus all phylogenetic analyses were based

Table 1
P values from the incongruence length difference (ILD) test.

Datasets        P values
                45 Species  43 Species  39 Species  41 SpeciesA  41 SpeciesB  41 SpeciesC  41 SpeciesD
LFY vs. NLY     0.043       0.142       0.267       0.317        0.289        0.290        0.292
LFY vs. matK    0.003       0.014       0.087       0.089        0.086        0.098        0.092
NLY vs. matK    0.022       0.110       0.105       0.099        0.099        0.097        0.101
rps3 vs. LFY    0.001       0.001       0.001       0.001        0.001        0.001        0.001
rps3 vs. NLY    0.029       0.087       0.023       0.016        0.023        0.023        0.029
rps3 vs. matK   0.001       0.001       0.001       0.001        0.001        0.001        0.001
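The ILD P values tabulated above can be screened programmatically to see which gene pairs conflict in each data set. The following is only a minimal sketch: the P values are transcribed from Table 1, but the 0.05 significance threshold, the short data set labels and the function name `incongruent_pairs` are assumptions made for illustration and are not stated in the text.

```python
# P values transcribed from Table 1 (ILD test). Column order follows the table:
# 45, 43, 39 species, then the four 41-species data sets A-D.
DATASETS = ["45", "43", "39", "41A", "41B", "41C", "41D"]

ILD_P = {
    ("LFY", "NLY"):   [0.043, 0.142, 0.267, 0.317, 0.289, 0.290, 0.292],
    ("LFY", "matK"):  [0.003, 0.014, 0.087, 0.089, 0.086, 0.098, 0.092],
    ("NLY", "matK"):  [0.022, 0.110, 0.105, 0.099, 0.099, 0.097, 0.101],
    ("rps3", "LFY"):  [0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001],
    ("rps3", "NLY"):  [0.029, 0.087, 0.023, 0.016, 0.023, 0.023, 0.029],
    ("rps3", "matK"): [0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001],
}

ALPHA = 0.05  # assumed significance threshold; not given explicitly in the text


def incongruent_pairs(dataset, alpha=ALPHA):
    """Return the gene pairs whose ILD P value is below alpha for one data set."""
    col = DATASETS.index(dataset)
    return sorted(pair for pair, ps in ILD_P.items() if ps[col] < alpha)


if __name__ == "__main__":
    for name in DATASETS:
        print(name, incongruent_pairs(name))
```

Under this assumed threshold, every pair flagged in the 39-species and 41-species data sets involves rps3, which matches the statement in Section 2.4 that all genes except rps3 are not significantly incongruent.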

Notes to Table 1. 45 Species: including all species and sequences. 43 Species: excluding Xanthocyparis and Fitzroya. 39 Species: excluding all "non-monophyletic species" (clones of a species do not form a monophyletic group). 41 SpeciesA: 39 species + Xanthocyparis-NLY-1 + Fitzroya-LFY-7. 41 SpeciesB: 39 species + Xanthocyparis-NLY-1 + Fitzroya-LFY-1. 41 SpeciesC: 39 species + Xanthocyparis-NLY-11 + Fitzroya-LFY-7. 41 SpeciesD: 39 species + Xanthocyparis-NLY-11 + Fitzroya-LFY-1.

on the complete alignments except the exclusion of introns of the two nuclear genes and the unalignable regions in LFY and rps3.

To construct phylogenetic trees, maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) were performed in PAUP* 4.0b10, PHYML version 2.4.4 (Guindon and Gascuel, 2003) and MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2003), respectively. Indels in all data matrices were treated as missing data. The MP analyses used heuristic searches with 1000 random addition sequence replicates, tree-bisection-reconnection (TBR) branch-swapping, and MulTrees option on, except that a maximum of 2000 trees were saved per round for the rps3 alignment. All character states were treated as unordered and equally weighted. The relative robustness of the clades found in the MP trees was evaluated by bootstrap analyses (Felsenstein, 1985) based on 1000 replicates, using the same options as above except that a maximum of 2000 trees were saved per round for the LFY and NLY alignments, and 50% majority rule consensus was used. In ML and BI analyses, the evolutionary models were optimized in Modeltest 3.7 (Posada and Crandall, 1998) and MrModeltest 2.2 (Nylander, 2004) using Akaike Information Criterion (AIC), respectively. The best models and the model settings for analyses are shown in Table 2. For the ML analysis in PHYML version 2.4.4, the GTR model was used for matK and rps3 due to the unavailability of the best-fit model TVM. ML parameters were optimized with a BIONJ tree as a starting point (Gascuel, 1997), and support values for nodes on the ML tree were estimated with 1000 bootstrap replicates. For the Bayesian inference, two independent runs of Markov chain Monte Carlo (MCMC) were conducted simultaneously, with each including one cold and three incrementally heated chains that started randomly in the parameters space (Temp = 0.20, Swapfreq = 1, Nswaps = 1) and ran for 1,000,000 generations. Every 100 generations were sampled, and the first 30% of samples were discarded as burn-in according to the analysis by Tracer v1.5 (Rambaut and Drummond, 2007). A 50% majority-rule consensus tree was generated based on the trees sampled after generation 300,000. In addition, we repeated the BI analysis four times for each data set to confirm the results.

2.6. Network analysis

To investigate the possible hybridization events in the evolution of Cupressaceae s.l., the Neighbor-Net method as implemented in SplitsTree 4.11.3 (Huson and Bryant, 2006) was used to reconstruct reticulate networks based on sequences of the combined nuclear genes (LFY + NLY), matK, and rps3, respectively. For distance calculations, we excluded indels and used the best-fit or the second best-fit model (GTR) selected by AIC in Modeltest (Table 2). The relative robustness of the clades was estimated by performing 1000 bootstrap replicates, based on which a 95% confidence network was constructed for each data set.

2.7. Molecular dating analysis

The molecular dating analysis was conducted on matK, LFY, NLY and combined LFY + NLY + matK + rps3, respectively (the rps3 gene was not independently used due to its great length variation and fast evolution in conifer II). The ML tree topologies of Cupressaceae s.l. used for analysis were generated from the simplified data sets, which include one species from each genus and a single clone from each species that were randomly chosen (all distinct NLY clones of Xanthocyparis and LFY clones of Fitzroya were included, since they do not cluster into a single clade), and Araucaria heterophylla was used as the outgroup. For absolute ages, we relied on the following calibrations. The root node, the split between Cupressaceae s.l. and Taxaceae–Cephalotaxaceae, was constrained to minimally 192 MA based on Paleotaxus in the lower Jurassic (Taylor and Taylor, 1993) as used in Magallón and Sanderson (2005), and maximally 237 MA based on Parasciadopitys from the Middle Triassic of Antarctica, a close relative of Sciadopitys (Sciadopityaceae) (Yao et al., 1997). According to the earliest fossil record, the most recent common ancestors (MRCA) of the five clades, i.e., Sequoia–Metasequoia–[…], […], Libocedrus–Pilgerodendron, […]–Fitzroya–[…] and Juniperus–Cupressus–Hesperocyparis, were constrained to the minimal age of 140 MA (Sequoia in early Cretaceous) (Penny, 1947; Ma et al., 2005), 99 MA (Glyptostrobus in Cenomanian, upper Cretaceous) (Miller, 1977; Aulenback and LePage, 1998), 61.7 MA (Libocedrus in early-middle Paleocene) (Pole, 1998), 95 MA (Widdringtonia in Cretaceous) (McIver, 2001), and 33.9 MA (Juniperus dated back to the Eocene/Oligocene boundary) (Kvacek, 2002), respectively.

Rate constancy across lineages was examined for each dataset using a likelihood ratio test (LRT) (Felsenstein, 1988), in which likelihood scores of the trees with and without an enforced molecular clock were compared. When the clock assumption was rejected (all P < 0.001), we used both Bayesian and penalized-likelihood (PL) methods to estimate the divergence times for comparison, since different dating methods may generate different results based on the same data set. The Bayesian approach was performed with PAML/multidivtime (Thorne et al., 1998; Yang, 2007), following the step-by-step manual, version 1.5 (Rutschmann, 2005). Prior gamma distributions on parameters of the relaxed clock model were as follows: the mean of the prior distribution for the root age (rttm) was set to 2 time units (100 MA) based on the split time between Cupressaceae s.l. and Taxaceae–Cephalotaxaceae (237–192 MA as mentioned above); the standard deviation of this prior (rttmsd) was also set to 2; the mean and SD of the prior distribution

Table 2
Sequence evolution models best fit to each data set as determined in Modeltest 3.7 and MrModeltest 2.2 using Akaike Information Criterion (AIC), and model settings for analyses.

Data set                 Best model for ML           ML model settings                                                  Best model for BI  BI model settings
LFY                      GTR + I + G                 GTR; Lset nst = 6; Rates = gamma; Shape = 1.7877; Pinvar = 0.3421  GTR + I + G        Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1)
NLY                      GTR + I + G                 GTR; Lset nst = 6; Rates = gamma; Shape = 1.3047; Pinvar = 0.2861  GTR + I + G        Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1)
LFY + NLY                GTR + G                     GTR; Lset nst = 6; Rates = gamma; Shape = 0.4382                   GTR + G            Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1)
matK                     TVM + G* (GTR + G)          GTR; Lset nst = 6; Rates = gamma, estimated                        GTR + G            Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1)
LFY + NLY + matK         GTR + I + G                 GTR; Lset nst = 6; Rates = gamma; Shape = 1.0148; Pinvar = 0.2344  GTR + G            Lset nst = 6; Rates = gamma; Prset statefreqpr = dirichlet(1,1,1,1)
rps3                     TVM + I + G* (GTR + I + G)  GTR; Lset nst = 6; Rates = gamma, estimated; Pinvar, estimated     GTR + I + G        Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1)
LFY + NLY + matK + rps3  (GTR + I + G)               GTR; Lset nst = 6; Rates = gamma; Shape = 0.8272; Pinvar = 0.2805  GTR + I + G        Lset nst = 6; Rates = invgamma; Prset statefreqpr = dirichlet(1,1,1,1)
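The models in Table 2 were selected under the Akaike Information Criterion, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood. The sketch below only illustrates how such a ranking yields a best and a second-best model; it is not the Modeltest or MrModeltest implementation. The log-likelihood values are hypothetical, and the parameter counts cover substitution-model parameters only (an assumption of this sketch; branch lengths are left out).

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(candidates):
    """Sort model names by increasing AIC.

    candidates maps a model name to a (log_likelihood, n_params) tuple.
    """
    return sorted(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical log-likelihoods chosen for illustration only.
# Parameter counts: relative rates + 3 free base frequencies + 1 gamma shape.
candidates = {
    "TVM+G": (-10234.1, 8),   # 4 rates + 3 frequencies + alpha
    "GTR+G": (-10233.9, 9),   # 5 rates + 3 frequencies + alpha
    "HKY+G": (-10260.5, 5),   # 1 ts/tv ratio + 3 frequencies + alpha
}

ranking = rank_models(candidates)

if __name__ == "__main__":
    for name in ranking:
        print(f"{name}: AIC = {aic(*candidates[name]):.1f}")
```

With these made-up numbers TVM+G ranks first and GTR+G second, the same pattern Table 2 reports for matK, where the second best-fit model is the one used when the best-fit model is unavailable in the analysis program.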

Note to Table 2: the second best-fit model is shown in brackets.

for the ingroup root rate (rtrate and rtratesd) were both set to 0.05; the brownmean and brownsd were both set to 0.5 following the manual's recommendation that the brownmean multiplied by rttm be about 1–2. Markov chains in multidivtime were run for 1,000,000 generations, sampling every 100th generation for a total of 10,000 trees, with a burn-in of the first 3000 trees.

The PL analysis (Sanderson, 2002) under the TN algorithm was implemented in the program r8s. The topologies with branch length generated by PHYML version 2.4.4 (Guindon and Gascuel, 2003) were used for time estimation. Optimal values of smoothing were found to be 1800, 1800, 1300 and 1000 for LFY, NLY, matK and combined LFY + NLY + matK + rps3, respectively, using the statistical cross-validation method (cvstart = 0, cvinc = 0.125, cvnum = 32). The credibility intervals were estimated using the profile command based on 100 topologically constrained trees with branch lengths, which were yielded from 100 bootstrap resampled data generated by PAUP* 4.0b10.

2.8. Biogeographic analysis

The traditional Taxodiaceae originated before the separation of Gondwana and Laurasia according to the divergence time estimation, and nearly all of its genera are monotypic and relictual. Moreover, the biogeographical study of some genera of Cupressaceae s.s. widely distributed in the Northern Hemisphere, such as Juniperus, Cupressus and Thuja, needs a dense species sampling and the integration of complex geological and climatic information (e.g., Peng and Wang, 2008). Therefore, this study focused on the biogeography of Cupressaceae s.s., especially its genera distributed in the Southern Hemisphere. In the analysis, we used the same data sets as used in the molecular dating, but the traditional Taxodiaceae was excluded. Nine geographical areas were defined in our analysis: (A) Australia including Tasmania, (B) South America, (C) New Caledonia, (D) New Zealand, (E) New Guinea and Moluccas, (F) Africa, (G) Asia, (H) North America, and (I) Europe. Ancestral range patterns were inferred by a statistical dispersal-vicariance analysis based on the BI topology constructed from each of the four simplified data matrices LFY, NLY, matK and combined LFY + NLY + matK + rps3 (each genus represented by one sequence), using S-DIVA, which reconstructs ancestral ranges while accounting for both phylogenetic uncertainty and multiple solutions in DIVA optimization (Yu et al., 2010). For the estimation of node support values and the final topology, BEAST v1.7.1 package (http://beast.bio.ed.ac.uk/Main_Page, Drummond and Rambaut, 2007) was used to generate the trees by the following three steps: (1) The BEAST file was generated with BEAUti by setting 1,000,000 generations and sampling every 1000 generations. (2) One thousand trees were obtained from the BEAST analysis, based on which a BI topology (the final topology) was generated in TreeAnnotator by discarding the first 300 trees as burn-in. (3) The obtained 1000 trees, the final topology and a file of taxon distribution were imported into S-DIVA, and the ancestral area reconstruction was performed by setting a maximum of seven areas at each node.

3. Results

3.1. Sequence characterization

All PCR products of LFY and NLY showed a single band after electrophoresis in 1.2% agarose gel except that Diselma archeri
had another LFY band of a much smaller size, which was confirmed by cloning to be a LFY pseudogene with a large deletion between intron 1 and intron 2 and thus was excluded from further analysis. Based on direct sequencing and cloning, no more than two distinct clones of LFY and NLY occurred in the same individual, and most species did not show clone polymorphism. If two distinct clones are present in the same species, conspecific clones share the highest similarity except for LFY of Fitzroya cupressoides, and NLY of Cupressus funebris, Juniperus chinensis, and Xanthocyparis vietnamensis. All the species harboring two distinct clones of LFY and NLY and the variability of conspecific clones are shown in Table 3. By RT-PCR, we obtained cDNA sequences of both genes from Juniperus chinensis, Metasequoia glyptostroboides, Platycladus orientalis, Thuja occidentalis, Sequoia sempervirens and Pinus armandii, and LFY from Araucaria heterophylla. These cDNAs are identical to the exon sequences of the genomic DNA obtained from the same species except for 1–2 site mutations in some species, which could be attributed to different individuals used for DNA and RNA extraction or PCR errors. The LFY and NLY variable sites among the five genera Callitropsis, Cupressus, Hesperocyparis, Juniperus and Xanthocyparis and among the three genera Metasequoia, Sequoia and Sequoiadendron are shown in Fig. S2.

The obtained sequence of LFY varied from 2170 bp (Araucaria heterophylla) to 3201 bp (Austrocedrus chilensis), containing three exons (partial exon 1, exon 2 and partial exon 3) and two introns. The exon length of this gene ranged from 1078 bp (Thujopsis dolabrata) to 1123 bp (Cedrus deodara), except for the short sequences (889 bp) obtained from Metasequoia glyptostroboides and Sequoiadendron giganteum with the primer pair LFYE1F4 + LFYE3R4. After excluding some unalignable regions, the alignment of LFY exons for analysis was 1168 bp, of which 628 were variable and 540 were parsimony-informative. The PCR-amplified NLY gene also contained three exons and two introns (partial exons 1 and 3), with a total length ranging from 3116 bp (Austrocedrus chilensis) to 4335 bp (Cephalotaxus sinensis). The sequence alignment of its exons for phylogenetic analyses was 1063 bp (578 variable sites, 407 parsimony-informative sites).

Direct sequencing of the matK gene failed in Juniperus chinensis, Libocedrus plumosa, Xanthocyparis vietnamensis and Araucaria heterophylla, and matK pseudogenes with predicted early stop codons were found after cloning and sequencing. Phylogenetic analysis showed that the conspecific clones clustered together with high support values (trees not shown), and thus the pseudogenes were excluded from further analysis. The sequence alignment of the matK region was 1577 bp (containing 173 bp of the trnK intron and 1404 bp of the matK gene), of which 657 nucleotide sites were variable and 414 were parsimony-informative. The aligned sequence of the mitochondrial rps3 region, including the rps3 gene and partial sequences of its two flanking genes rps19 and rpl16, was 2815 bp (excluding two highly variable regions that could not be reliably aligned), which included 531 variable sites and 232 parsimony-informative characters.

Table 3
The species harboring two distinct clones of LFY and NLY and the variability of conspecific clones.

Gene  Species                     Two conspecific   DNA variable   AA variable  DNA variable    Shared similarity
                                  clones compared   sites (exon)   sites        sites (intron)  (full length) (%)
LFY   Cunninghamia lanceolata     3 vs. 5           4              2            14              99
      Cupressus atlantica         4 vs. 7           4              2            8               99
      Fitzroya cupressoides (4×)  1 vs. 7           23             10           52              94
      Hesperocyparis bakeri       1 vs. 7           8              2            26              98
      Hesperocyparis macrocarpa   1 vs. 5           11             10           28              99
      Juniperus saltillensis      3 vs. 9           6              3            10              99
      Sequoia sempervirens (6×)   1 vs. 3           9              2            29              98
      Xanthocyparis vietnamensis  5 vs. 6           0              0            12              99.5
NLY   Cupressus dupreziana        1 vs. 8           3              1            52              98
      Cupressus funebris          1 vs. 3           11             3            67              96
      Fitzroya cupressoides (4×)  2 vs. 3           3              0            17              99
      Hesperocyparis bakeri       1 vs. 3           3              2            17              99
      Hesperocyparis macrocarpa   2 vs. 6           2              1            27              99
      Juniperus chinensis (4×)    1 vs. 5           16             6            104             97
      Juniperus saltillensis      1 vs. 3           6              2            46              98
      Sequoia sempervirens (6×)   1 vs. 8           2              1            20              98
      …                           1 vs. 4           7              3            33              99
      Widdringtonia nodiflora     2 vs. 3           2              0            14              99
      Xanthocyparis vietnamensis  1 vs. 11          9              5            88              97

3.2. Southern blotting

Southern blotting showed that both LFY and NLY have a single locus in conifers according to our study on three diploid (Cunninghamia lanceolata, Pinus armandii, Taxus cuspidata var. nana) and one hexaploid (Sequoia sempervirens) species. In each restriction digest of genomic DNA, only one band was detected when the LFY and NLY probes were used, respectively, except that two signals of LFY were observed in the Xba I digest of C. lanceolata (Figs. 1 and S3). We found that Xba I has no recognition site in the probe sequence or the LFY region of C. lanceolata covered by the probe, and one recognition site downstream of the probe region and at the same position (intron 2) of the two distinct LFY clones (alleles) obtained from this species. Thus, the two LFY signals of C. lanceolata with great difference in molecular weight could be caused by another recognition site of Xba I upstream of LFY that is heterozygous for its two alleles. To verify this inference, we further blotted the genomic DNA of C. lanceolata digested by EcoR V and Hind III, respectively, with a LFY probe (covering exon 2–exon 3) amplified from this species, and observed a single signal from each digestion (Fig. S3). The above evidence suggests that the LFY gene also exists as a single locus in C. lanceolata.

3.3. Phylogenetic analysis

The MP analyses generated 56 equally most-parsimonious trees (tree length = 2119, CI = 0.479, RI = 0.8021, including uninformative characters) for the LFY gene, and 144 (tree length = 1598, CI = 0.5463, RI = 0.7857) for the NLY gene. Except for the position of Welwitschia in NLY trees and some poorly resolved clades, the MP and BI trees generated are almost identical to the ML trees for both LFY and NLY (Fig. 2). According to the NLY gene, Welwitschia was clustered with Zamia in the MP trees, but with Pinaceae in ML and BI trees, although the support values are very low. Also,


Fig. 1. Southern blot hybridization of the genomic DNAs of Sequoia sempervirens and Cunninghamia lanceolata with the Sequoia-LFY probe (A) and the Sequoia-NLY probe (B) at 56 °C. Numbers on the right indicate DNA molecular weight marker.

except a little difference in the position of Podocarpaceae, the tree topologies of the two sister genes LFY and NLY are highly congruent in revealing the inter- and intra-family relationships of conifers. For example, Pinaceae is monophyletic and sister to the other conifers, among which Cephalotaxaceae–Taxaceae is sister to Cupressaceae s.l. (Fig. 2). In particular, the seven subfamilies of Cupressaceae s.l. recognized by Gadek et al. (2000) are highly supported. In addition, the distinct clones from the same species form a strongly supported monophyletic clade, except for LFY of Hesperocyparis bakeri, Hesperocyparis macrocarpa and Fitzroya, and NLY of Xanthocyparis, Cupressus dupreziana, Hesperocyparis macrocarpa and Thuja plicata (Figs. 2 and S4). In the LFY tree, Diselma was nested into the Fitzroya cupressoides clones with high support values, and in the NLY tree, the clone Xanthocyparis vietnamensis-11 forms a sister relationship with the Hesperocyparis–Callitropsis–X. vietnamensis-1 clade (Figs. 2B and S4B).

The parsimony analysis of the combined nuclear genes generated 16 equally most-parsimonious trees (tree length = 1368, CI = 0.6768, RI = 0.8531), which are nearly identical to the ML and BI trees except that Fokienia and Chamaecyparis have basal positions in Cupressoideae on the MP tree. The ML tree, with bootstrap percentages of MP and ML as well as Bayesian posterior probabilities, is shown in Fig. S5, in which the relationships within Sequoioideae, positions of Austrocedrus and Papuacedrus, and the relationships among …, … and Platycladus–Microbiota are still poorly resolved, as in the separate nuclear gene trees (Fig. 2).

Based on the matK gene, 16 equally most-parsimonious trees (tree length = 1232, CI = 0.6972, RI = 0.8399) were obtained, which are topologically identical to the ML and BI trees (see Fig. 3, the ML topology). Compared to the combined nuclear gene tree (Fig. S5), positions of three genera, i.e., …, Austrocedrus and Calocedrus, are different, but none of them are strongly supported (Fig. 3). Even though the mitochondrial rps3 gene has the lowest level of phylogenetic signal among the data sets, it also provided a relatively good resolution for some intergeneric relationships of Cupressaceae s.l. (Fig. 4).

The parsimony analysis of the combined LFY + NLY + matK generated six equally most-parsimonious trees (tree length = 2627, CI = 0.6829, RI = 0.8399), which are highly congruent with the ML and BI trees in topology except for some poorly resolved clades (see Fig. S6, the ML topology). The intergeneric relationships of Cupressaceae s.l. revealed by the combined LFY + NLY + matK are the same as those by the combined LFY + NLY (Fig. S5), and support values for some clades increased. Based on the combined LFY + NLY + matK + rps3, we also obtained six equally most-parsimonious trees (tree length = 3390, CI = 0.7068, RI = 0.8379) that are topologically identical to the ML and BI trees (Fig. 5, the ML topology). The intergeneric relationships of Cupressaceae s.l. revealed by the combined LFY + NLY + matK + rps3 (Fig. 5) are nearly the same as those by the combined LFY + NLY + matK (Fig. S6) except for the relationship between Thuja + Thujopsis and … + ….

3.4. Network analysis

Three confidence networks (at the 95% confidence level) of Cupressaceae s.l. were constructed based on sequences of the combined LFY + NLY, matK and rps3 genes, respectively. In the combined LFY + NLY network, two reticulations were found (Fig. 6). That is, Sequoia and Cupressus appeared to be recombinants between Metasequoia and Sequoiadendron, and between Juniperus and the Hesperocyparis–Callitropsis–Xanthocyparis clade, respectively. In addition, reticulation events could have occurred in the basal groups of the Northern Hemisphere Cupressaceae s.s., i.e., among Fokienia–Chamaecyparis, Thuja and Thujopsis (Fig. 6). In contrast, no reticulation was found in the networks based on the cytoplasmic genes matK and rps3 (Fig. S7).

3.5. Divergence time estimation

The estimated divergence times based on different data sets correspond well with each other when topologies are consistent, although the standard errors estimated by Multidivtime (Bayes


Fig. 2. ML trees of gymnosperms constructed from LFY (A) and NLY (B). Numbers associated with branches are bootstrap percentages of MP and ML greater than 50%. Bold lines indicate Bayesian posterior probabilities greater than 0.90. The letter c and numbers following species names denote sequences from cDNA and clone numbers, respectively.

method) are a little larger than those by PL in r8s (Table 4). The Multidivtime estimates based on the combined LFY + NLY + matK + rps3 are shown in Fig. 7. The results of molecular dating showed that all genera of the traditional Taxodiaceae originated in the Jurassic or lower Cretaceous and most genera of Cupressaceae s.s. diverged in the upper Cretaceous or Tertiary, which are highly congruent with the earliest fossil record (Fig. 7).

3.6. Biogeographic analysis

The results of ancestral range reconstruction for Cupressaceae s.s. are shown in Fig. 8A (based on the combined LFY + NLY + matK + rps3), which were compared with the modern distribution of its genera and the break-up history of Gondwana (Fig. 8B, modified from Sanmartin and Ronquist, 2004). Since several alternative ancestral patterns were suggested for each node, we only focused on the ancestral distributions with the highest relative probabilities for the nodes (Fig. 8A, nodes 1, 5, 6, 7, 8 and 9) consistently resolved in different gene trees (Figs. 2, 3 and 5). Although the nodes are congruent based on different data sets (LFY, NLY, matK, combined LFY + NLY + matK + rps3), the S-DIVA method suggested different ancestral ranges for some nodes (Table 5).

4. Discussion

4.1. High congruence among different gene trees and the phylogenies of Cupressaceae s.l. and gymnosperms

To date, cpDNA and nrDNA are still the most widely used molecular markers for studying phylogenetic relationships of gymnosperms (e.g., Chaw et al., 1997; Cheng et al., 2000; Quinn et al., 2002; Rydin et al., 2002; Little, 2006; Rai et al., 2008; Lin et al., 2010). The use of single or low copy nuclear gene markers may provide more phylogenetic information and be very helpful to test the reliability of phylogenies based on cpDNA and nrDNA, but it is very difficult in gymnosperms, which are remarkable for a large nuclear genome and highly complex gene families (Kinlaw and Neale, 1997; Murray, 1998; Leitch et al., 2001). In particular, most single copy nuclear genes reported in angiosperms exist as gene families in gymnosperms (Kinlaw and Neale, 1997). Thus, distinguishing orthologs from paralogs is the first and most important step in using nuclear genes for gymnosperm phylogenetic reconstruction.

In this study, although a LFY pseudogene was found in Diselma archeri, direct sequencing and cloning of both genomic DNA and


Fig. 3. The ML tree of Cupressaceae s.l. inferred from the chloroplast matK gene. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively. Numbers following species names denote the clone numbers.

cDNA, together with Southern blotting (Figs. 1 and S3), strongly suggest that both LFY and NLY exist as single copy in conifers, even in the polyploid species. None of the studied species harbors more than two distinct clones of LFY or NLY, and the results of our Southern blot analyses (Figs. 1 and S3) are consistent with the previous finding that LFY has a single locus in Gnetum parvifolium (Shindo et al., 2001), Pinus radiata (Mellerowicz et al., 1998) and Pinus caribaea var. caribaea (Dornelas and Rodriguez, 2005). Therefore, the two sister genes LFY and NLY are promising candidate markers for studying phylogenetic and evolutionary relationships of gymnosperms, in addition to the use of cytoplasmic DNA.

Despite different inheritance pathways of nuclear and organelle genomes in the family, the phylogenetic trees generated from the nuclear LFY and NLY, chloroplast matK and mitochondrial rps3 genes are highly congruent in topology except for several poorly resolved clades in the rps3 tree (Figs. 2–4). Our results corroborate several important findings by previous morphological, anatomical and cpDNA analyses of Cupressaceae s.l. (Brunsfeld


Fig. 4. The ML tree of Cupressaceae s.l. inferred from the mitochondrial rps3 gene. The branch length of outgroups is not shown to scale because of the great distance between outgroup and ingroup. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively.

et al., 1994; Gadek et al., 2000; Kusumi et al., 2000; Quinn et al., 2002; Schulz and Stutzel, 2007), including the basal position of Cunninghamia, the close relationships among Metasequoia, Sequoia and Sequoiadendron and among Cryptomeria, Glyptostrobus and Taxodium, and Cupressaceae s.s. as a monophyletic group derived from the traditional Taxodiaceae and sister to the Cryptomeria–Glyptostrobus–Taxodium clade. In particular, the three genome-based phylogeny (Figs. 2–5, S5 and S6) strongly supports the classification of Cupressaceae s.l. into seven subfamilies by Gadek et al. (2000) or six subfamilies by Farjón (2005). The six subfamilies recognized by Farjón (2005) include Cunninghamioideae, Taiwanioideae, Athrotaxidoideae, Sequoioideae containing Metasequoia, Sequoia and Sequoiadendron, Taxodioideae comprising Cryptomeria, Glyptostrobus and Taxodium, and Cupressoideae containing all the remaining genera (Cupressaceae s.s.). In contrast, as the only difference from Farjón (2005), Gadek et al. (2000) recognized two subfamilies in Cupressaceae s.s., i.e., Cupressoideae and Callitroideae occurring in the Northern and Southern Hemispheres, respectively, based both on morphological and cpDNA evidence. In addition, our study resolved six subclades in Cupressaceae s.s., i.e., Thuja–Thujopsis, Fokienia–Chamaecyparis, Platycladus–Microbiota, Cupressus–Juniperus–Hesperocyparis–Callitropsis–Xanthocyparis, Libocedrus–…, and Actinostrobus–Callitris–Neocallitropsis–Diselma–Fitzroya–Widdringtonia, and they are strongly supported in the nuclear, chloroplast and combined gene trees (Figs. 2–5, S5 and S6). Therefore, the high topological congruence among different gene trees implies that we may have recovered the species phylogeny of Cupressaceae s.l.

However, there are still some discrepancies among the different gene trees (Figs. 2–5), which include the systematic positions of Athrotaxis, Papuacedrus and Tetraclinis, the relationships within Sequoioideae, between Fokienia–Chamaecyparis and Thuja–Thujopsis, and among Cupressus, Juniperus and Hesperocyparis–Callitropsis–Xanthocyparis, as well as the intrageneric relationships of Cupressus and Hesperocyparis. These discrepancies could be attributed to insufficient resolution of the molecular markers, historical hybridization, incomplete lineage sorting or ancient radiation (Maddison, 1997; Wendel and Doyle, 1998; Whitfield and Lockhart, 2007), which need further studies.

The relationships among Cupressus, Juniperus, Hesperocyparis, Callitropsis and Xanthocyparis have been discussed in previous studies (Little et al., 2004; Little, 2006; Adams et al., 2009). A close relationship between Xanthocyparis vietnamensis and Callitropsis nootkatensis was suggested by some previous morphological studies (Farjón, 2005; Debreczy et al., 2009) and molecular evidence from nrDNA ITS and NLY intron 2 (Little et al., 2004; Little, 2006). This relationship is weakly supported by some gene trees yielded in the present study (Figs. 2 and S5). It may be worth investigating whether X. vietnamensis has originated from hybridization, considering the occurrence of two distinct NLY gene clones in this


Fig. 5. The ML tree of Cupressaceae s.l. generated from the combined LFY + NLY + matK+rps3 genes. Numbers associated with branches are bootstrap percentages of MP and ML greater than 50% and Bayesian posterior probabilities greater than 0.90, respectively.

species (Figs. 2B and S4B) and its inconsistent positions in different gene trees (Figs. 2–5). According to the nuclear genes LFY and NLY as well as the chloroplast matK gene, three monophyletic groups (Cupressus, Juniperus, and Hesperocyparis–Callitropsis–Xanthocyparis) are strongly supported (Figs. 2 and 3). However, relationships of the three subclades are incongruent among different data sets (Fig. S8). Cupressus is strongly supported as sister to Juniperus by the two single-copy nuclear genes (LFY and NLY), while, in contrast, it forms a sister group with the Hesperocyparis–Callitropsis–Xanthocyparis clade in both cpDNA (matK, Fig. 3; petN-psbM, Fig. S8G) and nrDNA ITS (Fig. S8A) trees. It is interesting that Cupressus shows similarity to both Juniperus and the Hesperocyparis–Callitropsis–Xanthocyparis clade in the nucleotide sequence of LFY and NLY (Fig. S2). Moreover, the network analysis indicates that Cupressus has evolved from a recombination between the two clades (Fig. 6). Based on all the above evidence, we infer that Cupressus might have originated through hybridization between Juniperus and the ancestor of Hesperocyparis–Callitropsis–Xanthocyparis. The tetraploid Fitzroya and the hexaploid Sequoia also have discordant positions in different gene trees, which will be discussed later.

To improve the understanding of evolutionary relationships within the gymnosperms, great efforts have been undertaken in the past two decades (e.g., Chaw et al., 1997, 2000; Gugerli et al., 2001; Quinn et al., 2002; Rydin et al., 2002; Hajibabaei et al., 2006; Rai et al., 2008; Ran et al., 2010a). There have been great debates on the position of Gnetales (see review in Ran et al., 2010a); a sister relationship between it and Pinaceae or conifers is supported by most molecular phylogenetic analyses (e.g., Soltis et al., 1999; Chaw et al., 1997, 2000; Donoghue and Doyle, 2000; Gugerli et al., 2001; Hajibabaei et al., 2006; McCoy et al., 2008; Rai et al., 2008; Ran et al., 2010a). In this study, we found that the pair of ancient duplicated genes LFY and NLY are very informative in reconstructing the phylogeny of gymnosperms. The two gene trees have nearly identical topologies such as the basal positions of Araucariaceae and Podocarpaceae in Conifer II (non-Pinaceae conifers) and the sister relationship between Taxaceae + Cephalotaxaceae and Cupressaceae s.l. (Fig. 2), which are in accordance with the results of most previous molecular studies (e.g., Chaw et al., 1997; Gugerli et al., 2001; Quinn et al., 2002; Rydin et al., 2002; Rai et al., 2008; Ran et al., 2010a). The phylogenetic position of Welwitschia is not consistent among the ML, MP and BI trees of the NLY gene as mentioned previously, and thus still needs more studies to resolve (Fig. 2B), since we failed to amplify this gene from the other two genera of Gnetales. Nevertheless, the LFY gene phylogeny strongly supports Gnetales as a sister group of conifers and the sister relationship between Pinaceae and Conifer II (Fig. 2A). Furthermore, introns of LFY and NLY can be well aligned among closely related gymnospermous genera and have good inter- and intrageneric resolution (Fig. S4; Little, 2006; Won and Renner, 2006; Peng and Wang, 2008; Wei et al., 2010). Thus, these two genes could be used to investigate the evolutionary relationships of gymnosperms at various taxonomic levels in the future. In particular, the LFY gene may have the potential to become a nuclear DNA barcode of land plants (Ran et al., 2010b).

Fig. 6. A confidence reticulate network constructed based on the combined LFY and NLY gene sequences of Cupressaceae s.l. (at the 95% confidence level).

Table 4
Estimated divergence times of the consistent nodes among different gene trees of Cupressaceae s.l.

Node a  Calibration           matK (MA)                       LFY (MA)                        NLY (MA)                        LFY + NLY + matK + rps3 (MA)
                              Multidivtime    PL              Multidivtime    PL              Multidivtime    PL              Multidivtime    PL
1       Min = 192, Max = 237  230.76 ± 5.65   237             228.91 ± 7.23   237             226.57 ± 8.88   237             233.45 ± 3.25   237
2                             220.10 ± 9.79   198.59 ± 3.85   220.75 ± 10.01  208.90 ± 4.40   217.99 ± 11.18  198.12 ± 6.19   198.82 ± 12.91  197.66 ± 2.54
3                             211.47 ± 10.35  187.83 ± 4.19   213.32 ± 11.24  197.64 ± 4.96   211.56 ± 11.98  188.92 ± 7.01   191.89 ± 11.37  187.51 ± 2.54
4                             201.14 ± 10.51  178.68 ± 3.90   206.00 ± 12.12  186.73 ± 5.86   204.24 ± 12.74  181.87 ± 6.69   183.80 ± 10.62  180.36 ± 2.33
5       Min = 140             153.48 ± 10.09  140.06 ± 3.24   163.74 ± 16.87  160.38 ± 11.83  177.04 ± 18.76  163.55 ± 13.96  147.18 ± 7.48   142.04 ± 3.56
6                             183.32 ± 11.60  163.18 ± 4.11   187.14 ± 14.09  163.16 ± 7.42   188.28 ± 14.25  161.49 ± 7.78   161.75 ± 8.77   162.48 ± 2.40
7                             148.91 ± 13.29  136.28 ± 7.49   125.45 ± 14.53  127.84 ± 14.49  127.96 ± 16.57  113.15 ± 13.04  139.62 ± 8.43   127.05 ± 5.83
8       Min = 99              116.66 ± 13.08  109.27 ± 10.54  112.21 ± 10.76  111.90 ± 12.68  112.06 ± 11.81  99.18 ± 1.19    111.69 ± 9.18   101.75 ± 5.06
9                             165.49 ± 11.79  149.17 ± 4.31   168.76 ± 16.32  143.02 ± 8.69   178.16 ± 14.96  154.75 ± 8.55   152.63 ± 7.75   149.94 ± 2.53
10                            137.77 ± 10.72  126.47 ± 3.86   145.91 ± 17.47  119.81 ± 10.28  153.66 ± 17.35  122.20 ± 9.56   126.30 ± 5.71   122.74 ± 2.37
11      Min = 61.7            71.43 ± 8.30    62.39 ± 2.45    78.11 ± 13.66   82.07 ± 13.55   75.99 ± 12.52   62.03 ± 2.10    68.87 ± 5.69    61.7
12                            115.37 ± 8.46   107.92 ± 3.15   112.52 ± 10.83  95.95 ± 14.58   132.84 ± 16.66  102.72 ± 12.86  107.06 ± 3.59   105.31 ± 1.61
13      Min = 95              100.10 ± 4.84   95.00           103.40 ± 8.12   95.00           106.50 ± 10.50  95.00           97.46 ± 2.34    95.00
14                            49.34 ± 15.61   49.39 ± 22.41   63.70 ± 16.92   57.80 ± 13.05   60.74 ± 21.28   48.53 ± 30.95   72.23 ± 6.89    66.92 ± 6.34
15                            46.27 ± 12.75   45.46 ± 6.75    51.70 ± 17.01   49.58 ± 14.49   40.27 ± 20.26   27.77 ± 11.24   63.38 ± 10.36   37.87 ± 5.21
16                            149.95 ± 13.01  139.22 ± 6.09   143.28 ± 18.28  133.97 ± 10.09  130.65 ± 22.63  118.49 ± 11.04  143.91 ± 7.10   133.00 ± 4.74
17                            102.55 ± 20.71  101.13 ± 13.70  65.83 ± 25.16   64.01 ± 40.09   73.86 ± 26.09   82.60 ± 30.65   111.93 ± 11.45  97.40 ± 8.79
18                            140.10 ± 14.26  130.88 ± 7.52   128.87 ± 18.78  122.59 ± 10.64  119.92 ± 22.28  111.76 ± 11.81  139.38 ± 7.39   127.54 ± 5.25
19                            91.45 ± 21.07   75.61 ± 15.31   91.82 ± 24.65   107.47 ± 14.40  78.78 ± 26.09   90.53 ± 19.99   106.93 ± 11.45  80.40 ± 8.88
20                            100.67 ± 18.05  93.62 ± 8.98    88.59 ± 18.68   80.54 ± 10.40   93.77 ± 22.39   78.47 ± 11.31   99.81 ± 11.19   85.42 ± 6.20
21                            39.74 ± 16.73   33.96 ± 8.62    45.54 ± 15.66   40.13 ± 9.03    41.61 ± 18.75   32.51 ± 11.26   57.42 ± 12.56   34.22 ± 5.50
22      Min = 33.9            65.17 ± 17.41   56.14 ± 7.80    46.13 ± 11.05   35.19 ± 2.60    55.60 ± 16.92   38.31 ± 6.56    47.15 ± 8.98    36.89 ± 3.32
23                            47.01 ± 15.09   47.43 ± 7.28    32.73 ± 10.76   26.44 ± 5.37    42.11 ± 15.24   31.31 ± 7.64    39.92 ± 9.03    29.07 ± 3.87

a Node numbers correspond to those in Fig. 7, except that node 18 is the stem group of Thuja and Thujopsis in the LFY gene tree.
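The statement that the Multidivtime errors are somewhat larger than the PL errors can be checked directly against Table 4. The following is a minimal sketch, not part of the original analysis: it copies the combined LFY + NLY + matK + rps3 cells for five uncalibrated nodes as printed, and the names `TABLE4_COMBINED`, `parse_cell` and `compare_methods` are hypothetical helpers introduced only for illustration.

```python
import re

# (Multidivtime, PL) cells from the combined LFY + NLY + matK + rps3 columns
# of Table 4, copied as printed for five uncalibrated nodes (illustrative subset).
TABLE4_COMBINED = {
    2: ("198.82 ± 12.91", "197.66 ± 2.54"),
    3: ("191.89 ± 11.37", "187.51 ± 2.54"),
    4: ("183.80 ± 10.62", "180.36 ± 2.33"),
    6: ("161.75 ± 8.77", "162.48 ± 2.40"),
    7: ("139.62 ± 8.43", "127.05 ± 5.83"),
}

CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*±\s*(\d+(?:\.\d+)?)\s*$")


def parse_cell(cell):
    """Split an 'age ± error' table cell into (age, error) as floats."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"not an 'age ± error' cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def compare_methods(table):
    """Per node: (Multidivtime age minus PL age, Multidivtime error / PL error)."""
    result = {}
    for node, (bayes_cell, pl_cell) in table.items():
        bayes_age, bayes_err = parse_cell(bayes_cell)
        pl_age, pl_err = parse_cell(pl_cell)
        result[node] = (bayes_age - pl_age, bayes_err / pl_err)
    return result


if __name__ == "__main__":
    for node, (diff, ratio) in sorted(compare_methods(TABLE4_COMBINED).items()):
        print(f"node {node}: age difference {diff:+.2f}, error ratio {ratio:.2f}")
```

An error ratio above 1 means the Multidivtime error for that node exceeds the PL error; the same helper could be applied to the other columns of Table 4 if their cells are entered in the same way.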


Fig. 7. The combined gene (LFY + NLY + matK+rps3) topology showing divergence times of Cupressaceae s.l. estimated by Multidivtime associated with the earliest fossil record of each genus. The fossil and modern distributions of the Southern Hemisphere Cupressaceae s.s. are indicated on the right. A: Australia, T: Tasmania, NC: New Caledonia, NZ: New Zealand, AF: Africa, NG: New Guinea, SA: Southern America, NA: Northern America, M: Moluccas.
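The dating settings reported in the methods behind these estimates can be sanity-checked with simple arithmetic. The sketch below is illustrative only and calls neither multidivtime, BEAST nor r8s; the function names are hypothetical, and the reading of cvstart, cvinc and cvnum as a log10 grid of smoothing values is an assumption about the r8s cross-validation options rather than something stated in the text.

```python
def sample_counts(generations, sample_every, burn_in_samples):
    """Return (samples drawn, samples kept) for a chain sampled at a fixed interval."""
    drawn = generations // sample_every
    if burn_in_samples > drawn:
        raise ValueError("burn-in exceeds the number of samples drawn")
    return drawn, drawn - burn_in_samples


def smoothing_grid(cvstart, cvinc, cvnum):
    """Candidate smoothing values, ASSUMING a log10 grid: 10 ** (cvstart + i * cvinc)."""
    return [10 ** (cvstart + i * cvinc) for i in range(cvnum)]


def nearest_grid_index(value, grid):
    """Index of the grid point closest to a reported smoothing value."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))


if __name__ == "__main__":
    # multidivtime: 1,000,000 generations, every 100th sampled, first 3000 discarded.
    print(sample_counts(1_000_000, 100, 3000))
    # BEAST/TreeAnnotator: 1,000,000 generations, every 1000th sampled, first 300 discarded.
    print(sample_counts(1_000_000, 1000, 300))
    # r8s cross-validation settings reported in the methods.
    grid = smoothing_grid(cvstart=0, cvinc=0.125, cvnum=32)
    for reported in (1800, 1300, 1000):
        i = nearest_grid_index(reported, grid)
        print(reported, "is closest to grid point", i, "=", round(grid[i], 1))
```

Under the log10 assumption, the reported optimal smoothing values of 1800, 1300 and 1000 fall on or near three adjacent grid points, which is consistent with their being rounded values from such a grid.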

4.2. Evolution of the LFY and NLY genes with implications for the origin and evolution of polyploids in gymnosperms

Polyploidy, as a common mode of evolution and speciation (Soltis et al., 2010), is particularly prominent in plants (Soltis and Soltis, 2009; Van de Peer et al., 2009; Wood et al., 2009; Ainouche and Jenczewski, 2010). The genome sequences or genome-scale data, coupled with cytogenetic and phylogenetic databases, have further shown that polyploidy is far more prevalent than expected (Cui et al., 2006; Meyers and Levin, 2006; Doyle et al., 2008; Hegarty and Hiscock, 2008; Wood et al., 2009; Soltis et al., 2010), and that 47–100% of flowering plants and most extant ferns might be derived from ancient polyploidy (Wood et al., 2009). However, compared to other vascular plants, many fewer natural polyploids have been reported from gymnosperms, and all natural polyploid conifers, including the two tetraploids Fitzroya cupressoides and Juniperus chinensis 'Pfitzeriana' and the hexaploid Sequoia sempervirens, belong to Cupressaceae s.l. (Khoshoo, 1959; Ahuja, 2005). Thus, the rarity of polyploids found in Cupressaceae s.l. implies a polyploidization pathway different from that of angiosperms, which might provide additional clues for understanding plant polyploidization.

Fig. 8. Ancestral range reconstruction for Cupressaceae s.s. based on the combined LFY + NLY + matK + rps3 genes compared with the modern distribution of its genera (A) and the break-up history of Gondwana (B) modified from Sanmartin and Ronquist (2004) (only four genera of Cupressoideae are shown). Pie charts at nodes show probabilities of alternative ancestral ranges obtained by S-DIVA. The divergence times shown for some nodes are the same as those in Fig. 7. A: Australia including Tasmania; B: South America; C: New Caledonia; D: New Zealand; E: New Guinea and Moluccas; F: Africa; G: Asia; H: North America; I: Europe.

Table 5
The highest relative probabilities of ancestral ranges for the consistent nodes among different gene trees of the Southern Hemisphere Cupressaceae s.s.

Datasets                  Ancestral range of each node a
                          1        6      7     8    9
matK                      ABCDEFG  ACF    ABF   AB   AC
LFY                       ABCDEFG  ABCF   BF    B    AC
NLY                       ABEG     A      ABF   AB   A
LFY + NLY + matK + rps3   ABCDEFG  ABCF   ABF   AB   C

a Node numbers and definition of geographical area correspond to those in Fig. 8.

As discussed earlier, both LFY and NLY generally exist as a single locus in gymnosperms. Although functions of the two sister genes in gymnosperms are still poorly understood, comparative spatiotemporal patterns of their expression in the three conifer genera Picea, Podocarpus and Taxus suggest a functional divergence between them. That is, they can be expressed simultaneously in a single reproductive axis, initially overlapping but later in mutually exclusive primordia and/or groups of developing cells in both female and male structures (Vazquez-Lobo et al., 2007). In angiosperms, the LFY gene regulates expression of the MADS-box genes responsible for floral-meristem identity (Soltis et al., 2002), and also occurs as a single copy, although two paralogs of LFY have been found in recent polyploids (Bomblies and Doebley, 2005; Esumi et al., 2005). While LFY has been successfully used in investigating the origin and evolution of some hybrid species of seed plants (Oh and Potter, 2003; Wei et al., 2010), allopolyploid speciation (Kim et al., 2008), and reticulate evolution (Peng and Wang, 2008), its sister gene NLY has also shown good resolution for interspecific relationships of Cupressus (Little, 2006). In particular, the

466 Z.-Y. Yang et al. / Molecular Phylogenetics and Evolution 64 (2012) 452–470 origin of Pseudotsuga wilsoniana from interspecific hybridization and stabilization process of these interesting coniferous was successfully revealed by the distribution of two distinct LFY polyploids. gene types in this species and the phylogenetic tree of this gene Diploidization is important to the stabilization of neopolyploids (Wei et al., 2010). (Ramsey and Schemske, 2002), during which the preservation of Sequoia semperviens is a hexaploid (2n =6x = 66) occurring in duplicated gene copies appears to be nonrandom, with some genes southwest Oregon and northwest California. According to morpho- being duplicated and reduplicated whereas others being iteratively logical and cytological studies, several hypotheses were proposed returned to singleton status (Blanc and Wolfe, 2004). In the present for its origin (Stebbins, 1948; Li, 1987, 1988; Ahuja and Neale, study, we did not find redundant copies of LFY or NLY in conifers, 2002; Ahuja, 2005). Stebbins (1948) inferred that Sequoia origi- even not in the hexaploid Sequoia, although the two genes origi- nated as an allopolyploid by hybridization between Metasequoia nated from an ancient gene duplication. However, two distinct and some probably extinct taxodiaceous plant. Li (1987, 1988) sug- copies are maintained in LFY of Fitzroya cupressoides and NLY of gested Metasequoia and Sequoiadendron or ancestors of the two Juniperus chinensis, which might represent ‘‘distant’’ alleles des- genera as the parental species of Sequoia. However, Ahuja and cended from the putative parents of the tetraploid species. The Neale (2002) thought that Sequoia could be an autohexaploid, maintenance of different genes in different groups could have autoallohexaploid or segemental allohexapliod. 
According to the caused by genetic drift and diploidization of the polyploids, during present study, Sequoia is clustered with Metasequoia glyptostrobo- which some loci retain contributions from both parents and other ides (a species historically widespread but currently confined to retain alleles only from one parent (Wolfe, 2001; Doyle et al., Sichuan and Hubei, China) in the LFY tree (Fig. 2A) but with 2008). Given the key regulatory function of LFY and NLY in the Sequoiadendron giganteum (distributed in the Sierra Nevada Moun- development of reproductive organs in gymnosperm (Vazquez- tains of California, USA) in the NLY tree (Fig. 2B), which are congru- Lobo et al., 2007), the two sister genes will very likely return to sin- ent with the sequence characters of the two nuclear genes (Fig. S2). gleton status following genome/gene duplication. Network analysis based on the combined nuclear genes strongly supports that Sequoia has originated from a recombination be- 4.3. Divergence times and biogeography of Cupressaceae s.s.: Further tween Metasequoia and Sequoiadendron. Based on previous cpDNA evidence for Southern Hemisphere biogeography phylogenies (Brunsfeld et al., 1994; Tsumura et al., 1995; Kusumi et al., 2000) and the present chloroplast matK and mitochondrial Southern Hemisphere biogeography has drawn tremendous rps3 gene trees (Figs. 3 and 4), Sequoia is sister to Sequoiadendron. interest from biologists and geologists (Sanmartin and Ronquist, The above evidence implies that Sequoiadendron is the paternal 2004; Knapp et al., 2005; Barker et al., 2007; Upchurch, 2008), ancestor of Sequoia if this genus really originated by hybridization and is considered a typical vicariance scenario responsible for the between Metasequoia and Sequoiadendron or ancestors of the two transoceanic disjunctions of biota that developed by the sequential genera, a hypothesis suggested by Li (1987, 1988). 
The hybrid ori- breakup of the Gondwana supercontinent during the last 165 Myr gin of Sequoia is also supported by the fossil record of Metasequoia (McLoughlin, 2001; Sanmartin and Ronquist, 2004). in Northern America (Stockey et al., 2001; Farjón, 2005). Unfortu- However, based on molecular estimates and more accurate nately, clues for the origin of Sequoia may have been blurred in paleogeographic reconstruction, recent studies indicate that dis- the long evolutionary history, given that the earliest fossil record persal has also played an important role in shaping these biogeo- of this genus can be dated back to the early Cretaceous (Penny, graphical patterns (Givnish and Renner, 2004; Sanmartin and 1947; Ma et al., 2005; Farjón, 2005). Nevertheless, the inconsistent Ronquist, 2004). In particular, unlike the situation in animals, bio- relationships among Metasequoia, Sequoia and Sequoiadendron re- geographical histories of plants, especially angiosperm groups such vealed by different data sets (Figs. 2–4) could be an important sign as Nothofagus, Proteaceae and Restionaceae, show less evidence for of reticulate evolution among the three genera, even though it is or are only partially consistent with the timing of the Gondwanan difficult to deduce how and when Sequoia originated. breakup (Linder et al., 2003; Knapp et al., 2005; Barker et al., 2007). Interestingly, the variability of conspecific clones of LFY or NLY That is, a more complicated Southern Hemisphere biogeography is is higher in the two tetraploids Juniperus chinensis ‘Pfitzeriana’ being revealed by integrating evidences from molecular dating, and Fitzroya cupressoides than in other diploid species. Juniperus paleontology and ecology, challenging the classic ‘Gondwana’ par- chinensis ‘Pfitzeriana’ has 22 pairs of chromosomes in the meiotic adigm (Crisp et al., 2011). cells (Sax and Sax, 1933) and its origin by hybridization between Biogeographical reconstruction of the ancient conifer family J. 
chinensis and J. sabina was suggested by RAPDs (Le Duc et al., Cupressaceae s.l., especially Cupressaceae s.s., could shed some 1999). Two distinct NLY clones were obtained from genomic light on Southern Hemisphere biogeography and the history of DNA of this cultivar, which differ in 16 sites in a total exon Gondwana. The molecular dating by Li and Yang (2002) indicated length of 1009 bp (Table 3). The two gene members might have that the divergence of major lineages of Taxodiaceae occurred in function, since both of them were also found in the cDNA. Fitz- the Jurassic, and that the Northern and Southern Hemisphere roya cupressoides is endemic to temperate forests in southern clades of Cupressaceae s.s. (Cupressoideae and Callitroideae) di- South America, with a chromosome number of 2n =44 (Hair, verged at least 124 MA. Regretfully, in their study, most genera 1968). This species harbors two distinct LFY sequences that differ of Cupressaceae s.s were excluded from the molecular clock analy- in 23 sites in a total exon length of 1081 bp (share 94% identity), sis due to substitution rate heterogeneity. In the present study, we while its two NLY clones share high similarity and are sister to used multiple calibration points (Fig. 7) considering the great im- each other (Table 3; Figs. 2 and S4). According to the fact that pact of fossil calibration on posterior time estimates (Inoue et al., the two LFY clones of F. cupressoides do not form a sister relation- 2010), and performed the relaxed molecular clock analysis for ship and the close relationship between Fitzroya and Diselma Cupressaceae s.l. based on nuclear, chloroplast and combined gene (Figs. 2–4 and S4c), together with divergence time estimation data sets. It is very interesting that the results yielded from differ- (Table 4), we infer that the monotypic genus Fitzroya might have ent data sets correspond well with each other (Table 4). 
Moreover, originated from hybridization between Diselma and an extinct the estimated times of unconstrained nodes correspond well with group, and F. cupressoides could be an allotetraploid. However, the oldest fossil record (Fig. 7). Here we use results of the Bayesian it is difficult to distinguish between ancient allopolyploid and analysis (multidivtime) for further discussion, since this method, autopolyploid speciation only based on a couple of genes. More compared with the PL method by r8s, provides a powerful frame- studies are still needed to investigate the evolutionary history work for integrating fossil information (Inoue et al., 2010). Author's personal copy


The divergence times of all main lineages of the Taxodiaceae, including Cunninghamia (node 2), (node 3), Sequoioideae (node 4), and Taxodioideae (node 7), can be dated back to the Jurassic or even earlier (Fig. 7 and Table 4), which is highly consistent with the fossil record (summarized in Miller, 1977; Farjón, 2005). The congruence also supports the idea that the present distribution of this traditional family is a relic of a much more widespread occurrence in the past (Miller, 1977; Li and Yang, 2002; Farjón, 2005). The split between Callitroideae and Cupressoideae, node 9 (Fig. 7), can also be dated back to the Jurassic (165.49 ± 11.79, 168.76 ± 16.32, 178.16 ± 14.96 and 152.63 ± 7.75 MA based on matK, LFY, NLY and combined LFY + NLY + matK + rps3, respectively, Table 4), providing strong evidence for the separation of the two subfamilies by the split of Laurasia and Gondwana (Li, 1953; Li and Yang, 2002), which is supported by the ancestral range reconstruction (Fig. 8, node 1; Table 5).

The Greater Cape, one of the global biodiversity hotspots, has been identified as a combination of ancient species repository and hot-bed of recent radiation in flora (Warren and Hawkins, 2006; Verboom et al., 2009). Widdringtonia is endemic to Southern Africa, and extends from the Cape to Malawi. Based on its oldest fossil record in North America (McIver, 2001) and the divergence between it and the Fitzroya–Diselma clade that was dated back to 65 MA, Warren and Hawkins (2006) supported the hypothesis of McIver (2001) that Widdringtonia originated in Laurasia in the early Cretaceous and later migrated to Africa. However, this hypothesis is very unlikely to be true considering our findings that all genera of Cupressaceae s.s. occurring in the Southern Hemisphere comprise a strongly supported clade in all gene trees (Figs. 2–5). Widdringtonia forms a clade with the two monotypic genera Fitzroya and Diselma, which are endemic to the Southern Andes and Tasmania, respectively, and the clade is sister to the Australian clade comprising Callitris, Actinostrobus and Neocallitropsis (Figs. 2, 3 and 5). These phylogenetic relationships and biogeographic histories of the six genera (Fig. 8A, node 6) are generally congruent with the break-up history of Gondwana (Fig. 8B). That is, the separation of East and West Gondwana at 165–130 MA (McLoughlin, 2001) led to the divergence between the two clades Callitris–Actinostrobus–Neocallitropsis and Widdringtonia–Fitzroya (Diselma will be discussed later) (Fig. 7, node 12), which can be dated back to the early Cretaceous (matK, 115.37 ± 8.46 MA; LFY, 112.52 ± 10.83 MA; NLY, 132.84 ± 16.66 MA; combined LFY + NLY + matK + rps3, 107.06 ± 3.59 MA, Table 4). Also, the fact that the split between Widdringtonia and Fitzroya–Diselma occurred at least 95 MA, according to the oldest fossil of Widdringtonia from the Tuscaloosa Formation of Alabama (McIver, 2001), is generally consistent with the final separation of Africa from South America at about 105 MA (McLoughlin, 2001). The above evidence, coupled with ancestral range reconstruction (Fig. 8, nodes 6 and 7; Table 5), may suggest that vicariance is mainly responsible for this biogeographic pattern.

Of great interest is the grouping of the Tasmanian Diselma with Fitzroya (Figs. 2 and 3). As discussed earlier, the tetraploid Fitzroya could have originated by hybridization with Diselma as one parent. It would be reasonable to see the sister relationship between the two genera if Diselma was the paternal ancestor, given the predominantly paternal cytoplasmic inheritance in Cupressaceae s.l. (reviewed by Mogensen, 1996). That is, the maternal ancestor of Fitzroya migrated into Tasmania from South America and hybridized with Diselma, giving rise to the putative allotetraploid Fitzroya, which migrated back to South America by the connection between Australia and South America through Antarctica (52–35 MA) (McLoughlin, 2001; Sanmartin and Ronquist, 2004). This inference is supported by the ancestral range reconstruction (Fig. 8, node 8), the estimate of the divergence time between Fitzroya and Diselma (Fig. 7, node 14) (matK, 49.34 ± 15.61 MA; LFY, 63.70 ± 16.92 MA; NLY, 60.74 ± 21.28 MA; combined LFY + NLY + matK + rps3, 72.23 ± 6.89 MA), and the rich fossils of Fitzroya found in Tasmania (Hill and Whang, 1996; Hill and Paull, 2003; Paull and Hill, 2010). However, it cannot be completely ruled out that the current distributions of the three genera Widdringtonia, Fitzroya and Diselma have resulted from long-distance dispersal and extinction of some of their close relatives, if Widdringtonia was distributed in Australia or its neighboring islands. Also, it is more parsimonious that both parents of Fitzroya originally occurred in Tasmania and hybridized, and then the generated tetraploid Fitzroya migrated into South America.

One may wonder why the fossil of Widdringtonia was found in North America (McIver, 2001). A low-latitude connection between Brazil and equatorial Africa might have been maintained until 119–105 MA due to translational movement of the continents along the Guinea Fracture Zone (McLoughlin, 2001), while North America and South America became connected in the middle Cretaceous (100 MA) and then separated in the early Eocene (Hay et al., 1999). These connections made it possible for Widdringtonia to disperse from Africa to North America in the Cretaceous.

The divergence times of New Zealand and New Caledonian Libocedrus, South American Pilgerodendron, New Guinean Papuacedrus and South American Austrocedrus were dated back to the Cretaceous, even the early Cretaceous (Fig. 7 and Table 4), suggesting their origins before the separation of those islands or continents from Australia (McLoughlin, 2001). These genera might have had a wider distribution in the past according to the fossil record (Hill and Brodribb, 1999; Farjón, 2005; Fig. 7).

Based on robust phylogenetic reconstruction, molecular dating, ancestral range reconstruction and the fossil record, our study provides some evidence for the biogeographic history of Cupressaceae s.l. and generally supports some hypotheses about the evolutionary history of continents, such as the separation of Gondwana from Laurasia in the Jurassic and the break-up process of Gondwana (McLoughlin, 2001). However, our inferences are very preliminary due to the limitation of sampling at the genus level, many extinction events, and the very complicated biogeographic and geological history in the Southern Hemisphere (McLoughlin, 2001; Sanmartin and Ronquist, 2004; Crisp et al., 2011). More samples at the species level are needed in future biogeographic studies of the Southern Hemisphere Cupressaceae s.s., especially those species occurring in Australia, New Caledonia and New Zealand.

Acknowledgments

We are indebted to Profs. Christopher Quinn (Royal Botanic Gardens of Australia), Robert P. Adams (Baylor University, USA), and Peter Hollingsworth (Royal Botanic Garden Edinburgh) for their great help in sampling the genera of Cupressaceae s.s. endemic to the Southern Hemisphere and America. We thank Drs. Dan Peng and Qiao-Ping Xiang (Institute of Botany, Chinese Academy of Sciences), Drs. Maurizio Rossetto and Carolyn Porter (Botanic Gardens Trust in Sydney), Dr. Shou-Zhou Zhang (Shenzhen FairyLake Botanical Garden, China), and the Royal Botanic Garden, Kew (UK) for providing some samples for DNA analysis; Dr. Kenneth H. Wolfe for suggestions on polyploid evolution; Dr. Hui Gao for assistance with lab work; Drs. Qiang Zhang and Fu-Sheng Yang for help in molecular dating and biogeographic analyses; Ms. Wan-Qing Jin and Rong-Hua Liang for their assistance in DNA sequencing. We also thank the handling editor and the two anonymous reviewers for their insightful comments and suggestions on the manuscript. This work was supported by the National Natural Science Foundation of China (Grant Nos. 31170197, 30730010, 30425028) and the Chinese Academy of Sciences (the 100-Talent Project).


Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2012.05.004.

References

Adams, P.R., Bartel, J.A., Price, R.A., 2009. A new genus, Hesperocyparis, for the of the Western Hemisphere (Cupressaceae). Phytologia 91, 160–185.
Ahuja, M.R., 2005. Polyploidy in gymnosperms: revisited. Silvae Genet. 54, 59–69.
Ahuja, M.R., Neale, D.B., 2002. Origins of polyploidy in coast redwood (Sequoia sempervirens (D. Don) Endl.) and relationship of coast redwood to other genera of Taxodiaceae. Silvae Genet. 51, 93–100.
Ahuja, M.R., Neale, D.B., 2005. Evolution of genome size in conifers. Silvae Genet. 54, 126–137.
Ainouche, M.L., Jenczewski, E., 2010. Focus on polyploidy. New Phytol. 186, 1–4.
Aulenback, K.R., LePage, B.A., 1998. Taxodium wallisii sp. nov.: first occurrence of Taxodium from the Upper Cretaceous. Int. J. Plant Sci. 159, 367–390.
Barker, N.P., Weston, P.H., Rutschmann, F., Sauquet, H., 2007. Molecular dating of the 'Gondwanan' plant family Proteaceae is only partially congruent with the timing of the break-up of Gondwana. J. Biogeogr. 34, 2012–2027.
Blanc, G., Wolfe, K.H., 2004. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16, 1679–1691.
Bomblies, K., Doebley, J.F., 2005. Molecular evolution of FLORICAULA/LEAFY orthologs in the Andropogoneae (Poaceae). Mol. Biol. Evol. 22, 1082–1094.
Brunsfeld, S.J., Soltis, P.S., Soltis, D.E., Gadek, P.A., Quinn, C.J., Strenge, D.D., Ranker, T.A., 1994. Phylogenetic relationships among the genera of Taxodiaceae and Cupressaceae: evidence from rbcL sequences. Syst. Bot. 19, 253–262.
Chaw, S.M., Zharkikh, A., Sung, H.M., Lau, T.C., Li, W.H., 1997. Molecular phylogeny of extant gymnosperms and seed plant evolution: analysis of nuclear 18S rRNA sequences. Mol. Biol. Evol. 14, 56–68.
Chaw, S.M., Parkinson, C.L., Cheng, Y.C., Vincent, T.M., Palmer, J.D., 2000. Seed plant phylogeny inferred from all three plant genomes: monophyly of extant gymnosperms and origin of Gnetales from conifers. Proc. Natl. Acad. Sci. USA 97, 4086–4091.
Cheng, Y.C., Nicolson, R.G., Tripp, K., Chaw, S.M., 2000. Phylogeny of Taxaceae and Cephalotaxaceae genera inferred from chloroplast matK gene and nuclear rDNA ITS region. Mol. Phylogenet. Evol. 14, 353–365.
Crisp, M.D., Trewick, S.A., Cook, L.G., 2011. Hypothesis testing in biogeography. Trends Ecol. Evol. 26, 66–72.
Cui, L.Y., Wall, P.K., Leebens-Mack, J.H., Lindsay, B.G., Soltis, D.E., Doyle, J.J., Soltis, P.S., Carlson, J.E., Arumuganathan, K., Barakat, A., Albert, V.A., Ma, H., dePamphilis, C.W., 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16, 738–749.
Debreczy, Z., Musial, K., Price, R.A., Rácz, I., 2009. Relationships and nomenclatural status of the nootka cypress (Callitropsis nootkatensis, Cupressaceae). Phytologia 91, 140–158.
Donoghue, M.J., Doyle, J.A., 2000. Seed plant phylogeny: demise of the anthophyte hypothesis? Curr. Biol. 10, R106–R109.
Dornelas, M.C., Rodriguez, A.P.M., 2005. A FLORICAULA/LEAFY gene homolog is preferentially expressed in developing female cones of the tropical pine Pinus caribaea var. caribaea. Genet. Mol. Biol. 28, 299–307.
Doyle, J.J., 1992. Gene trees and species trees: molecular systematics as one-character . Syst. Bot. 17, 144–163.
Doyle, J.J., Flagel, L.E., Paterson, A.H., Rapp, R.A., Soltis, D.E., Soltis, P.S., Wendel, J.F., 2008. Evolutionary genetics of genome merger and doubling in plants. Annu. Rev. Genet. 42, 443–461.
Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.
Eckenwalder, J.E., 1976. Re-evaluation of Cupressaceae and Taxodiaceae: a proposed merger. Madroño 23, 237–256.
Esumi, T., Tao, R., Yonemori, K., 2005. Isolation of LEAFY and TERMINAL FLOWER 1 homologues from six fruit tree species in the subfamily Maloideae of the Rosaceae. Sex. Plant Reprod. 17, 277–287.
Farjón, A., 2005. A Bibliography of Cupressaceae and Sciadopitys. Royal Botanic Gardens, Kew.
Farjón, A., Hiep, N.T., Harder, D.K., Loc, P.K., Averyanov, L., 2002. A new genus and species in Cupressaceae (Coniferales) from northern Vietnam, Xanthocyparis vietnamensis. Novon 12, 179–189.
Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1994. Testing significance of incongruence. Cladistics 10, 315–319.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Felsenstein, J., 1988. Phylogenies from molecular sequences: inference and reliability. Annu. Rev. Genet. 22, 521–565.
Florin, R., 1963. The distribution of conifer and taxad genera in time and space. Acta Hort. Berg. 20, 121–312.
Frohlich, M.W., Meyerowitz, E.M., 1997. The search for flower homeotic gene homologs in basal angiosperms and Gnetales: a potential new source of data on the evolutionary origin of flowers. Int. J. Plant Sci. 158, S131–S142.
Gadek, P.A., Quinn, C.J., 1993. An analysis of relationships within the Cupressaceae sensu stricto based on rbcL sequences. Ann. Mo. Bot. Gard. 80, 581–586.
Gadek, P.A., Alpers, D.L., Heslewood, M.M., Quinn, C.J., 2000. Relationships within Cupressaceae sensu lato: a combined morphological and molecular approach. Am. J. Bot. 87, 1044–1057.
Gascuel, O., 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14, 685–695.
Givnish, T.J., Renner, S.S., 2004. Tropical intercontinental disjunctions: Gondwana breakup, immigration from the boreotropics, and transoceanic dispersal. Int. J. Plant Sci. 165 (Suppl. 4), S1–S6.
Griffiths, C.S., 1997. Correlation of functional domains and rates of nucleotide substitution in cytochrome b. Mol. Phylogenet. Evol. 7, 352–365.
Grob, G.B.J., Gravendeel, B., Eurlings, M.C.M., 2004. Potential phylogenetic utility of the nuclear FLORICAULA/LEAFY second intron: comparison with three chloroplast DNA regions in Amorphophallus (Araceae). Mol. Phylogenet. Evol. 30, 13–23.
Gugerli, F., Sperisen, C., Buchler, U., Brunner, L., Brodbeck, S., Palmer, J.D., Qiu, Y.L., 2001. The evolutionary split of Pinaceae from other conifers: Evidence from an intron loss and a multigene phylogeny. Mol. Phylogenet. Evol. 21, 167–175.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Hair, J.B., 1968. The chromosomes of the Cupressaceae I. Tetraclineae and Actinostrobeae (Callitroideae). New Zeal. J. Bot. 6, 277–284.
Hajibabaei, M., Xia, J., Drouin, G., 2006. Seed plant phylogeny: gnetophytes are derived conifers and a sister group to Pinaceae. Mol. Phylogenet. Evol. 40, 208–217.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids Symp. Ser. 40, 95–98.
Hart, J.A., 1987. A cladistic analysis of conifers: preliminary results. J. Arnold Arboretum 68, 269–307.
Hay, W.W., DeConto, R.M., Wold, C.N., Wilson, K.M., Voigt, S., Schulz, M., Wold-Rossby, A., Dullo, W.C., Ronov, A.B., Balukhovsky, A.N., Soeding, E., 1999. Alternative global Cretaceous paleogeography. In: Evolution of Cretaceous ocean-climate systems. Special Paper 332. Geological Society of America, pp. 1–47.
Hegarty, M.J., Hiscock, S.J., 2008. Genomic clues to the evolutionary success of polyploid plants. Curr. Biol. 18, R435–R444.
Hill, R.S., Brodribb, T.J., 1999. Turner review no. 2 – Southern conifers in time and space. Aust. J. Bot. 47, 639–696.
Hill, R.S., Paull, R., 2003. Fitzroya (Cupressaceae) macrofossils from Cenozoic sediments in Tasmania, Australia. Rev. Palaeobot. Palynol. 126, 145–152.
Hill, R.S., Whang, S.S., 1996. A new species of Fitzroya (Cupressaceae) from Oligocene sediments in north-western Tasmania. Aust. Syst. Bot. 9, 867–875.
Huson, D.H., Bryant, D., 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267.
Inoue, J., Donoghue, P.C.J., Yang, Z.H., 2010. The impact of the representation of fossil calibrations on Bayesian estimation of species divergence times. Syst. Biol. 59, 74–89.
Khoshoo, T.N., 1959. Polyploidy in gymnosperms. Evolution 13, 24–39.
Kim, S.T., Sultan, S.E., Donoghue, M.J., 2008. Allopolyploid speciation in Persicaria (Polygonaceae): insights from a low-copy nuclear region. Proc. Natl. Acad. Sci. USA 105, 12370–12375.
Kinlaw, C.S., Neale, D.B., 1997. Complex gene families in pine genomes. Trends Plant Sci. 2, 356–359.
Knapp, M., Stockler, K., Havell, D., Delsuc, F., Sebastiani, F., Lockhart, P.J., 2005. Relaxed molecular clock provides evidence for long-distance dispersal of Nothofagus (southern beech). PLoS Biol. 3, 38–43.
Kumar, S., 2005. Molecular clocks: four decades of evolution. Nat. Rev. Genet. 6, 654–662.
Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5, 150–163.
Kusumi, J., Tsumura, Y., Yoshimaru, H., Tachida, H., 2000. Phylogenetic relationships in Taxodiaceae and Cupressaceae sensu stricto based on matK gene, chlL gene, trnL-trnF IGS region, and trnL intron sequences. Am. J. Bot. 87, 1480–1488.
Kusumi, J., Tsumura, Y., Yoshimaru, H., Tachida, H., 2002. Molecular evolution of nuclear genes in Cupressaceae, a group of conifer trees. Mol. Biol. Evol. 19, 736–747.
Kvacek, Z., 2002. A new from the Palaeogene of Central Europe. Fedd. Repert. 113, 492–502.
Le Duc, A., Adams, R.P., Zhong, M., 1999. Using random amplification of polymorphic DNA for a taxonomic reevaluation of Pfitzer . HortScience 34, 1123–1125.
Leitch, I.J., Hanson, L., Winfield, M., Parker, J., Bennett, M.D., 2001. Nuclear DNA C-values complete familial representation in gymnosperms. Ann. Bot. 88, 843–849.
Li, H.L., 1953. Present distribution and habitats of the conifers and taxads. Evolution 7, 245–261.
Li, L., 1987. The origin of Sequoia sempervirens (Taxodiaceae) based on karyotype. Acta Bot. Yunnan. 9, 187–192.
Li, L., 1988. The parents of Sequoia sempervirens (Taxodiaceae) based on morphology. Acta Bot. Yunnan. 10, 33–37.
Li, C., Yang, Q., 2002. Divergence time estimates for major lineages of Cupressaceae (sl). Acta Phytotaxon. Sin. 40, 323–333.
Lin, C.P., Huang, J.P., Wu, C.S., Hsu, C.Y., Chaw, S.M., 2010. Comparative chloroplast genomics reveals the evolution of Pinaceae genera and subfamilies. Genome Biol. Evol. 2, 504–517.
Linder, H.P., Eldenas, P., Briggs, B.G., 2003. Contrasting patterns of radiation in African and Australian Restionaceae. Evolution 57, 2688–2702.


Little, D.P., 2006. Evolution and circumscription of the true cypresses (Cupressaceae: Cupressus). Syst. Bot. 31, 461–480.
Little, D.P., Schwarzbach, A.E., Adams, R.P., Hsieh, C.F., 2004. The circumscription and phylogenetic relationships of Callitropsis and the newly described genus Xanthocyparis (Cupressaceae). Am. J. Bot. 91, 1872–1881.
Ma, Q.-W., Li, F.-L., Li, Ch.-S., 2005. The coast redwoods (Sequoia, Taxodiaceae) from the Eocene of Heilongjiang and the Miocene of Yunnan, China. Rev. Palaeobot. Palynol. 135, 117–129.
Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523–536.
Magallón, S., 2010. Using fossils to break long branches in molecular dating: a comparison of relaxed clocks applied to the origin of angiosperms. Syst. Biol. 59, 384–399.
Magallón, S., Sanderson, M.J., 2005. Angiosperm divergence times: the effect of genes, codon positions, and time constraints. Evolution 59, 1653–1670.
Maizel, A., Busch, M.A., Tanahashi, T., Perkovic, J., Kato, M., Hasebe, M., Weigel, D., 2005. The floral regulator LEAFY evolves by substitutions in the DNA binding domain. Science 308, 260–263.
McCoy, S.R., Kuehl, J.V., Boore, J.L., Raubeson, L.A., 2008. The complete plastid genome sequence of Welwitschia mirabilis: an unusually compact plastome with accelerated divergence rates. BMC Evol. Biol. 8, 130.
McIver, E.E., 2001. Cretaceous Widdringtonia Endl. (Cupressaceae) from North America. Int. J. Plant Sci. 162, 937–961.
McLoughlin, S., 2001. The breakup history of Gondwana and its impact on pre-Cenozoic floristic provincialism. Aust. J. Bot. 49, 271–300.
Mellerowicz, E.J., Horgan, K., Walden, A., Coker, A., Walter, C., 1998. PRFLL – a Pinus radiata homologue of FLORICAULA and LEAFY is expressed in buds containing vegetative shoot and undifferentiated male cone primordia. Planta 206, 619–629.
Meyers, L.A., Levin, D.A., 2006. On the abundance of polyploids in flowering plants. Evolution 60, 1198–1206.
Miller, C.N., 1977. Mesozoic conifers. Bot. Rev. 43, 217–280.
Mogensen, H.L., 1996. The hows and whys of cytoplasmic inheritance in seed plants. Am. J. Bot. 83, 383–404.
Mouradov, A., Glassick, T., Hamdorf, B., Murphy, L., Fowler, B., Maria, S., Teasdale, R.D., 1998. NLY, a Pinus radiata ortholog of FLORICAULA/LEAFY genes, expressed in both reproductive and vegetative meristems. Proc. Natl. Acad. Sci. USA 95, 6537–6542.
Moyroud, E., Kusters, E., Monniaux, M., Koes, R., Parcy, F., 2010. LEAFY blossoms. Trends Plant Sci. 15, 346–352.
Murray, B.G., 1998. Nuclear DNA amounts in gymnosperms. Ann. Bot. 82, 3–15.
Nylander, J.A.A., 2004. MrModeltest v2. Program Distributed by the Author. Evolutionary Biology Centre, Uppsala Univ., Uppsala, Sweden.
Oh, S.H., Potter, D., 2003. Phylogenetic utility of the second intron of LEAFY in Neillia and Stephanandra (Rosaceae) and implications for the origin of Stephanandra. Mol. Phylogenet. Evol. 29, 203–215.
Paull, R., Hill, R.S., 2010. Early Oligocene Callitris and Fitzroya (Cupressaceae) from Tasmania. Am. J. Bot. 97, 809–820.
Peng, D., Wang, X.-Q., 2008. Reticulate evolution in Thuja inferred from multiple gene sequences: implications for the study of biogeographical disjunction between eastern Asia and North America. Mol. Phylogenet. Evol. 47, 1190–1202.
Penny, J.S., 1947. Studies on the Conifers of the Magothy Flora. Am. J. Bot. 34, 281–296.
Pilger, R., 1926. Coniferae. In: Engler, A. (Ed.), Die Naturlichen Pflanzenfamilien, 2nd ed., Bd. 13. Dunker and Humblot, Berlin, Germany, pp. 121–403.
Pole, M., 1998. Paleocene gymnosperms from Mount Somers, New Zealand. J. Roy. Soc. New Zeal. 28, 375–403.
Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Price, R.A., Lowenstein, J.M., 1989. An immunological comparison of the Sciadopityaceae, Taxodiaceae, and Cupressaceae. Syst. Bot. 14, 141–149.
Quinn, C.J., Price, R.A., Gadek, P.A., 2002. Familial concepts and relationships in the conifer based on rbcL and matK sequence comparisons. Kew Bull. 57, 513–531.
Rai, H.S., Reeves, P.A., Peakall, R., Olmstead, R.G., Graham, S.W., 2008. Inference of higher-order conifer relationships from a multi-locus plastid data set. Botany 86, 658–669.
Rambaut, A., Drummond, A.J., 2007. Tracer v1.5.
Ramsey, J., Schemske, D.W., 2002. Neopolyploidy in flowering plants. Annu. Rev. Ecol. Evol. Syst. 33, 589–639.
Ran, J.-H., Gao, H., Wang, X.Q., 2010a. Fast evolution of the retroprocessed mitochondrial rps3 gene in Conifer II and further evidence for the phylogeny of gymnosperms. Mol. Phylogenet. Evol. 54, 136–149.
Ran, J.-H., Wang, P.-P., Zhao, H.-J., Wang, X.-Q., 2010b. A test of seven candidate barcode regions from the plastome in Picea (Pinaceae). J. Integr. Plant Biol. 52, 1109–1126.
Rogers, S.O., Bendich, A.J., 1985. Extraction of DNA from milligram amounts of fresh, herbarium and mummified plant-tissues. Plant Mol. Biol. 5, 69–76.
Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Rutschmann, F., 2005. Bayesian Molecular Dating Using PAML/Multidivtime. A Step-by-step Manual. Version 1.5 (July 2005). Institute of Systematic Botany, University of Zurich, Zurich, Switzerland.
Rutschmann, F., Eriksson, T., Salim, K.A., Conti, E., 2007. Assessing calibration uncertainty in molecular dating: the assignment of fossils to alternative calibration points. Syst. Biol. 56, 591–608.
Rydin, C., Kallersjo, M., Friis, E.M., 2002. Seed plant relationships and the systematic position of Gnetales based on nuclear and chloroplast DNA: conflicting data, rooting problems, and the monophyly of conifers. Int. J. Plant Sci. 163, 197–214.
Sanderson, M.J., 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Mol. Biol. Evol. 19, 101–109.
Sang, T., 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Crit. Rev. Biochem. Mol. Biol. 37, 121–147.
Sanmartin, I., Ronquist, F., 2004. Southern Hemisphere biogeography inferred by event-based models: plant versus animal patterns. Syst. Biol. 53, 216–243.
Sauquet, H., Weston, P.H., Barker, N.P., Anderson, C.L., Cantrill, D.J., Savolainen, V., 2009. Using fossils and molecular data to reveal the origins of the Cape proteas (subfamily Proteoideae). Mol. Phylogenet. Evol. 51, 31–43.
Sax, K., Sax, H.J., 1933. Chromosome number and morphology in the conifers. J. Arnold Arboretum 14, 356–375.
Schmidt, M., Schneider-Poetsch, H.A.W., 2002. The evolution of gymnosperms redrawn by phytochrome genes: the Gnetatae appear at the base of the gymnosperms. J. Mol. Evol. 54, 715–724.
Schulz, C., Stutzel, T., 2007. Evolution of taxodiaceous Cupressaceae (Coniferopsida). Org. Divers. Evol. 7, 124–135.
Shindo, S., Sakakibara, K., Sano, R., Ueda, K., Hasebe, M., 2001. Characterization of a FLORICAULA/LEAFY homologue of Gnetum parvifolium and its implications for the evolution of reproductive organs in seed plants. Int. J. Plant Sci. 162, 1199–1209.
Shiokawa, T., Yamada, S., Futamura, N., Osanai, K., Murasugi, D., Shinohara, K., Kawai, S., Morohoshi, N., Katayama, Y., Kajita, S., 2008. Isolation and functional analysis of the CjNdly gene, a homolog in Cryptomeria japonica of FLORICAULA/LFY genes. Tree Physiol. 28, 21–28.
Small, R.L., Cronn, R.C., Wendel, J.F., 2004. Use of nuclear genes for phylogeny reconstruction in plants. Aust. Syst. Bot. 17, 145–170.
Soltis, D.E., Soltis, P.S., 2009. The role of hybridization in plant speciation. Annu. Rev. Plant Biol. 60, 561–588.
Soltis, P.S., Soltis, D.E., Wolf, P.G., Nickrent, D.L., Chaw, S.-M., Chapman, R.L., 1999. The phylogeny of land plants inferred from 18S rDNA sequences: pushing the limits of rDNA signal? Mol. Biol. Evol. 16, 1774–1784.
Soltis, D.E., Soltis, P.S., Albert, V.A., Oppenheimer, D.G., dePamphilis, C.W., Ma, H., Frohlich, M.W., Theissen, G., 2002. Missing links: the genetic architecture of flower and floral diversification. Trends Plant Sci. 7, 22–31.
Soltis, D.E., Buggs, R.J.A., Doyle, J.J., Soltis, P.S., 2010. What we still don't know about polyploidy. Taxon 59, 1387–1403.
Stebbins, G.L., 1948. The chromosomes and relationships of Metasequoia and Sequoia. Science 108, 95–98.
Stefanovic, S., Jager, M., Deutsch, J., Broutin, J., Masselot, M., 1998. Phylogenetic relationships of conifers inferred from partial 28S rRNA gene sequences. Am. J. Bot. 85, 688–697.
Stockey, R.A., Rothwell, G.W., Falder, A.B., 2001. Diversity among taxodioid conifers: Metasequoia foxii sp nov from the Paleocene of central Alberta, Canada. Int. J. Plant Sci. 162, 221–234.
Swofford, D.L., 2002. PAUP*. Phylogenetic Analysis Using Parsimony (*and other methods), Version 4. Sinauer Associates, Sunderland, Massachusetts.
Tanahashi, T., Sumikawa, N., Kato, M., Hasebe, M., 2005. Diversification of gene function: homologs of the floral regulator FLO/LFY control the first zygotic cell division in the moss Physcomitrella patens. Development 132, 1727–1736.
Taylor, T.N., Taylor, E.L., 1993. The Biology and Evolution of Fossil Plants. Prentice Hall, NJ, USA.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Thorne, J.L., Kishino, H., Painter, I.S., 1998. Estimating the rate of evolution of the rate of molecular evolution. Mol. Biol. Evol. 15, 1647–1657.
Tsumura, Y., Yoshimura, K., Tomaru, N., Ohba, K., 1995. Molecular phylogeny of conifers using RFLP analysis of PCR-amplified specific chloroplast genes. Theor. Appl. Genet. 91, 1222–1236.
Upchurch, P., 2008. Gondwanan break-up: legacies of a lost world? Trends Ecol. Evol. 23, 229–236.
[...] genome duplications. Nat. Rev. Genet. 10, 725–732.
Vazquez-Lobo, A., Carlsbecker, A., Vergara-Silva, F., Alvarez-Buylla, E.R., Pinero, D., Engstrom, P., 2007. Characterization of the expression patterns of LEAFY/FLORICAULA and NEEDLY orthologs in female and male cones of the conifer genera Picea, Podocarpus, and Taxus: implications for current evo-devo hypotheses for gymnosperms. Evol. Dev. 9, 446–459.
Verboom, G.A., Archibald, J.K., Bakker, F.T., Bellstedt, D.U., Conrad, F., Dreyer, L.L., Forest, F., Galley, C., Goldblatt, P., Henning, J.F., Mummenhoff, K., Linder, H.P., Muasya, A.M., Oberlander, K.C., Savolainen, V., Snijman, D.A., van der Niet, T., Nowell, T.L., 2009. Origin and diversification of the Greater Cape flora: ancient species repository, hot-bed of recent radiation, or both? Mol. Phylogenet. Evol. 51, 44–53.
Wang, X.-Q., Tank, D.C., Sang, T., 2000. Phylogeny and divergence times in Pinaceae: evidence from three genomes. Mol. Biol. Evol. 17, 773–781.
Warren, B.H., Hawkins, J.A., 2006. The distribution of species diversity across a flora's component lineages: dating the Cape's 'relicts'. Proc. Roy. Soc. B 273, 2149–2158.
Wei, X.-X., Yang, Z.-Y., Li, Y., Wang, X.-Q., 2010. Molecular phylogeny and biogeography of Pseudotsuga (Pinaceae): insights into the floristic relationship between Taiwan and its adjacent areas. Mol. Phylogenet. Evol. 55, 776–785.

