GLOBAL AND FINE SCALE MOLECULAR STUDIES OF POLYPLOID EVOLUTION

IN CRATAEGUS L. (ROSACEAE)

by

Eugenia Yuk Ying Lo

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Ecology and Evolutionary Biology

University of Toronto

© Copyright by Eugenia Yuk Ying Lo 2008

Global and fine scale molecular studies of polyploid evolution in Crataegus L. (Rosaceae)

Doctor of Philosophy

Eugenia Yuk Ying Lo

Graduate Department of Ecology and Evolutionary Biology

University of Toronto

2008

Abstract

As many as 70% of angiosperm species are known to contain polyploids, but many aspects of polyploid evolution are unclear in woody plants. Crataegus is a woody genus of Rosaceae comprising 140-200 species that are widely distributed in the Northern Hemisphere. Several species, particularly those in North America, are shown to contain polyploids. The overall goal of the thesis is to provide a better understanding of polyploid evolution by resolving problems from intergeneric to intraspecific levels in Crataegus using phylogenetic and population genetic approaches. Three major aspects were investigated: (1) Phylogeography of the Old and New World Crataegus; (2) Reproductive system and distribution of cytotypes of the black-fruited series Douglasianae in the Pacific Northwest; and (3) Origins, population structure, and genetic diversity of diploid and polyploid species.

Phylogenetic analyses of molecular data provide evidence of historical events such as trans-Beringian migrations and North Atlantic vicariance that contributed to the modern distribution of Crataegus. Poor resolution and short internal branches in eastern North American species suggest genetic bottlenecks and/or rapid divergence following glaciations. In the Pacific Northwest, polyploids of series Douglasianae show a wider distribution and ecological amplitude than diploids. Parsimony tree and network analyses indicate that autotriploids and allotriploids occur in C. suksdorfii, while tetraploid C. suksdorfii are formed via the triploid bridge followed by introgression of sympatric C. douglasii. At the regional level, microsatellite data indicate a separation of the Pacific coastal diploids and triploids from the Columbia Plateau and Rocky Mountain triploids and tetraploids. High genetic differentiation among C. suksdorfii populations suggests that gene flow is limited by ploidy level differences as well as geographical distance.

Within-population multilocus genotypic variation is greatest in sexual diploids, and least in apomictic triploids. Frequent gene flow via seed dispersal contributes to an appreciable level of intrapopulation diversity in apomictic tetraploids, and counterbalances the effects of apomixis and/or self-fertilization, which diminish genetic variation within and between seed families. These findings collectively clarify and historical biogeography, provide an explicit reticulation model for polyploid formation, and shed light on evolution of natural populations in woody plants that show heterogeneous ploidy levels and reproductive systems.


Statement about previously copyrighted material

Chapter 2 appeared as:

Eugenia Y. Y. Lo, Saša Stefanović, and Timothy A. Dickinson. 2007. Molecular reappraisal of relationships between Crataegus and Mespilus (Rosaceae, Pyreae) – Two genera or one? Systematic Botany, 32(3): 596-616.

This chapter is reproduced with the permission of Systematic Botany, American Society of Plant Taxonomists, who granted authors the right to reprint or reuse their own material without requesting permission.

The other five chapters (chapters 3-7) have been submitted for publication. All six manuscripts, which were submitted for publication prior to submission of this thesis, are co-authored by my co-supervisors Dr. Timothy A. Dickinson and Dr. Saša Stefanović, who have authorized the inclusion of the material from these manuscripts in the thesis.


Acknowledgements

I would like to express my deepest appreciation to my co-supervisors Dr. Timothy A. Dickinson and Dr. Saša Stefanović, who have provided me with the greatest support and helpful editorial suggestions for the completion of the thesis. I also wish to sincerely acknowledge my supervisory and examining committee members Dr. James E. Eckenwalder, Dr. Jean-Marc Moncalvo, Dr. Nancy G. Dengler, Dr. David Guttman, Dr. Allan Baker, Dr. Asher Cutter, and Dr. Daniel Potter, as well as my colleagues and friends, particularly Dr. Nadia Talent, Dr. Maria Kuzmina, Dr. Knud Ib Christensen, and Ms. Pauline Wang, for their valuable advice and technical assistance. I thank the Department of Immunology (University of Toronto) for flow cytometry service, the Centre for Applied Genomics (Hospital for Sick Children) for microsatellite service, and the Molecular Systematics Laboratory (Royal Ontario Museum) for sequencing service. Finally, I am indebted to members of my family for their understanding and encouragement throughout my studies.


Table of Contents

Abstract

Statement about previously copyrighted material

Acknowledgements

Table of Contents

List of Tables

List of Figures

List of Appendices

Chapter 1. Background

1.1 Significance of polyploids

1.2 Reproductive biology in polyploids and evolutionary hypotheses

1.3 Polyploidy and related taxonomic problems in the Rosaceae

1.4 Classification and distribution of Crataegus

1.5 Characterizing Crataegus polyploids in the western North America

1.6 References

Chapter 2. Molecular reappraisal of relationships between Crataegus and Mespilus (Rosaceae, Pyreae) – Two genera or one?

2.1 Introduction

2.2 Materials and methods

2.2.1 Taxon sampling

2.2.2 Morphological data

2.2.3 DNA extraction, PCR, and sequencing

2.2.4 Sequence editing, alignment, and phylogenetic analyses

2.2.5 Alternative topologies

2.3 Results

2.3.1 Sequences

2.3.2 Nuclear phylogeny

2.3.3 Chloroplast phylogeny

2.3.4 Maximum likelihood analyses and tests of alternative phylogenetic hypotheses

2.3.5 Combined nuclear and chloroplast phylogeny

2.4 Discussion

2.4.1 Intergeneric divergence of LEAFY sequences

2.4.2 Phylogenetic utility of chloroplast regions in Pyreae

2.4.3 Implications of nuclear and chloroplast data incongruence

2.4.4 Re-evaluation of generic limits

2.5 Acknowledgements

2.6 References

Chapter 3. Evidences for genetic association between East Asian and Western North American Crataegus L. (Rosaceae) and rapid divergence of Eastern North American lineages based on multiple DNA sequences

3.1 Introduction

3.2 Materials and methods

3.2.1 Taxon sampling and DNA regions used

3.2.2 PCR amplification and sequencing

3.2.3 Computational analyses

3.2.3.1 Sequence alignment and variation comparisons

3.2.3.2 Tree reconstructions

3.2.3.3 Test for topological incongruence

3.2.3.4 Ancestral areas inference

3.2.3.5 Divergence time estimations

3.2.3.6 Mapping of morphological characters and ploidy level

3.3 Results

3.3.1 Sequence divergence

3.3.2 Species relationships

3.3.3 Incongruent topologies

3.3.4 Total sequence analyses

3.3.5 DIVA and r8s analyses

3.3.6 Character changes and homoplasy

3.4 Discussion

3.4.1 Ancestral areas of Crataegus

3.4.2 Intercontinental migrations and hybridization hypotheses

3.4.3 Evolution of eastern North American taxa and character homoplasy

3.4.4 Conclusions

3.5 Acknowledgements

3.6 References

Chapter 4. Cytotype diversity, heterogeneity in reproductive system, and climatic correlates of distribution of the Pacific Northwest hawthorns (Crataegus section Douglasianae; Rosaceae)

4.1 Introduction

4.2 Materials and Methods

4.2.1 Taxon identification

4.2.2 Sampling sites and plant materials

4.2.3 Ploidy level estimation and data analyses

4.2.4 Statistical tests of distribution and climate data

4.3 Results

4.3.1 Leaf ploidy level variation and distribution

4.3.2 Seed ploidy level among cytotypes

4.3.3 Comparisons of elevations and climatic variables among cytotypes

4.4 Discussion

4.4.1 Cytotype segregation and climatic correlates of distribution

4.4.2 Gametophytic apomixis in triploid and tetraploid plants

4.4.3 Conclusions

4.5 Acknowledgements

4.6 References

Chapter 5. Detecting origins and inferring reticulate history in diploid-polyploid complexes of Crataegus suksdorfii sensu lato (Rosaceae) using tree and network approaches

5.1 Introduction

5.2 Materials and Methods

5.2.1 Plant materials and ploidy level determinations

5.2.2 Gene markers and sequencing strategy

5.2.3 Sequence analyses

5.2.4 Phylogenetic tree and network reconstructions

5.3 Results

5.3.1 Flow cytometry

5.3.2 Chloroplast sequence polymorphism

5.3.3 Topologies of PISTILLATA

5.3.4 Topologies of PEPC

5.4 Discussion

5.4.1 Duplication of PISTILLATA and PEPC genes

5.4.2 Autopolyploids and allopolyploids formation

5.4.2.1 Route 1—Autotriploidy

5.4.2.2 Route 2—Allotriploidy

5.4.2.3 Route 3—Backcross to tetraploidy

5.4.2.4 Route 4—Gene flow between sympatric tetraploids

5.4.3 Conclusions

5.5 Acknowledgements

5.6 References

Chapter 6. Population genetic structure of diploid sexuals and polyploid apomicts of Crataegus suksdorfii and C. douglasii sensu lato (Rosaceae) in the Pacific Northwest

6.1 Introduction

6.2 Materials and Methods

6.2.1 Plant materials

6.2.2 DNA extraction and microsatellite markers

6.2.3 Data analyses

6.3 Results

6.3.1 Allelic and genotypic variation within populations

6.3.2 Partitioning genetic variation among populations

6.3.3 Genetic clustering of individuals

6.3.4 Isolation by distance

6.4 Discussion

6.4.1 Dispersion of C. douglasii in the Pacific Northwest

6.4.2 Source of genetic variation in apomictic populations

6.4.3 Gene flow limited by ploidy level differences in C. suksdorfii

6.4.4 Conclusions

6.5 Acknowledgements

6.6 References

Chapter 7. Limited pollen contribution in apomictic plants: Inference from the genetic variability in seed families of the Ontario Crataegus (hawthorns; Rosaceae) with microsatellite markers

7.1 Introduction

7.2 Materials and Methods

7.2.1 Plant materials

7.2.2 DNA extraction and microsatellite markers

7.2.3 Statistical analyses

7.3 Results

7.3.1 Comparison of genetic variation

7.3.2 Genetic relatedness among seed samples

7.4 Discussion

7.5 Acknowledgements

7.6 References

Chapter 8. Summary

Appendices

List of Tables

Table 2.1: Comparison of sequence variation in Crataegus, Mespilus, and outgroups for the two nuclear and four chloroplast regions

Table 2.2: Morphological variation, ploidy level, and geographic distribution as they are expressed in Amelanchier, Mespilus, and exemplar species of Crataegus

Table 3.1: Summary of Crataegus samples included

Table 3.2: Information of chloroplast and nuclear primers used

Table 3.3: Estimates of sequence divergence among Crataegus species with respect to the four major geographical areas for the chloroplast and nuclear regions

Table 3.4: DIVA results indicating the most recent common areas (MRCA) obtained for the root of Crataegus and potential number of dispersal events

Table 3.5: Age estimates and their confidence intervals of Crataegus lineages

Table 3.6: Estimate of number of changes and homoplasy of morphological characters on molecular tree

Table 4.1: Geographical localities, coordinates, elevations of respective sites, and total number of leaf and seed samples

Table 4.2: Summary of flow cytometry results of leaf and seed (embryo and endosperm) samples of the Douglasianae taxa

Table 4.3: One-way ANOVA comparisons and mean estimates of elevations and climate variables among cytotypes in C. suksdorfii and C. douglasii

Table 5.1: Summary of C. douglasii and C. suksdorfii individuals included in amplifications of the two nuclear gene regions PEPC and PISTILLATA

Table 5.2: Summary of C. douglasii and C. suksdorfii individuals included in amplifications of the two chloroplast intergenic regions

Table 5.3: Mean nuclear 2C values and standard deviation (SD) of C. douglasii and C. suksdorfii individuals resulting from flow cytometry

Table 5.4: Diversity measures from chloroplast sequences of C. douglasii and C. suksdorfii individuals from separate or combined localities estimated by DNASP

Table 5.5: Summary of results for PISTILLATA and PEPC paralogue sequences among C. douglasii and C. suksdorfii individuals

Table 6.1: Locality, ploidy level, habitat, and population size (N) of C. douglasii sensu lato and C. suksdorfii

Table 6.2: Nucleotide sequences of the 13 selected microsatellite primers used

Table 6.3: Descriptive statistics of diploid, triploid, and tetraploid populations of C. suksdorfii and C. douglasii based on the 13 microsatellite loci

Table 6.4: ANOVA-based F- and R-statistics for SSR data calculated for all populations, separately for C. douglasii and C. suksdorfii

Table 6.5: Analysis of Molecular Variance (AMOVA) showing the partitioning of genetic variation among and within populations of C. douglasii and C. suksdorfii, respectively

Table 6.6: Membership coefficient at K = 9 inferred from STRUCTURE analyses when all C. douglasii and C. suksdorfii individuals are included

Table 7.1: Descriptive statistics of genetic variation at five microsatellite loci among seed families of C. crus-galli and C. punctata

Table 7.2: Nucleotide sequences and information of microsatellite markers used in C. crus-galli and C. punctata

Table 7.3: Tukey (HSD) analysis of the differences in pairwise Rousset’s distances (2000) within and between seed families of C. crus-galli and C. punctata

List of Figures

Figure 1.1: Phylogenetic tree showing the evolution of research project and outline of the present thesis

Figure 2.1: Strict consensus trees, from maximum parsimony (MP) analyses of (a) ITS1-5.8S-ITS2 (2761 trees) and (b) LEAFY second intron sequence data (27684 trees)

Figure 2.2: Strict consensus of 790 maximum parsimony (MP) trees from the combined analysis of ITS and LEAFY second intron data

Figure 2.3: Strict consensus of 18,432 equally parsimonious trees from the maximum parsimony (MP) analysis of the combined trnG-trnS, psbA-trnH, trnH-rpl2, and rps20-rpl12 data

Figure 2.4: The maximum likelihood (ML) trees of the combined nuclear (a) and chloroplast (b) data, generated by PAUP* using the TIM and GTR models, respectively

Figure 2.5: Trees based on combined nuclear and chloroplast data generated by (a) maximum parsimony (MP) and (b) maximum likelihood (ML)

Figure 3.1: Strict consensus tree from maximum parsimony (MP) analyses of the combined chloroplast data

Figure 3.2: Strict consensus maximum parsimony trees from the combined analysis of ITS and LEAFY second intron data

Figure 3.3: Maximum likelihood (ML) tree based on combined nuclear and chloroplast data using the TVM model

Figure 3.4: (A) Strict consensus of 895 equally parsimonious trees from the maximum parsimony (MP) analysis using the combination of four chloroplast and five nuclear regions. (B) Phylogram generated by maximum likelihood analysis showing shorter internal branches within the ENA than WNA-EA clades

Figure 3.5: Biogeographic model for Crataegus based on molecular phylogenies and Dispersal-Vicariance (DIVA) results

Figure 3.6: Mapping of morphological characters and ploidy level using MacClade 4.0 based on the combined chloroplast and nuclear tree

Supplementary figure 3A: Bayesian tree of the chloroplast data

Supplementary figure 3B: Maximum likelihood tree of the chloroplast data

Supplementary figure 3C: Maximum parsimony phylogram showing branch length of diploid taxa only based on combined chloroplast and nuclear regions

Figure 4.1: Schematic diagrams showing (A) sexual and (B) gametophytic apomictic pathways in megagametophytes of tetraploid plants

Figure 4.2: Distribution sites of C. douglasii, C. suksdorfii, C. saligna, and C. rivularis cytotypes included in the present study

Figure 4.3: Histogram of leaf nuclei DNA content (mean and standard deviation) from individuals of C. douglasii, C. suksdorfii, C. saligna, and C. rivularis representing a total of 47 localities

Figure 4.4: Histogram of embryo and endosperm nuclear DNA content (mean and standard deviation) from seeds of C. douglasii, C. suksdorfii, C. saligna, and C. rivularis from 27 localities

Figure 4.5: Boxplots indicating the estimated ratio of endosperm to embryo DNA content of seed samples across (A) sites of C. douglasii and its segregates, and (B) sites of C. suksdorfii

Figure 4.6: Regression plots of leaf nuclear DNA content of C. douglasii and segregates and C. suksdorfii against selected climate variables including (A) temperature, (B) precipitation, (C) relative humidity, and (D) number of days with frost

Figure 4.7: (A) Biplot showing the variation and mean vectors of 84 monthly values of the seven climate variables. (B) Scatter plot generated by principal component analysis (PCA) of diploid and polyploid sites based on 84 monthly values of seven climate variables

Figure 5.1: Statistical network of chloroplast haplotypes A-S obtained from 132 sequences representing diploid and polyploid individuals of the Douglasianae

Figure 5.2: Strict consensus parsimonious tree for the S and L paralogs of the PISTILLATA sequence data

Figure 5.3: Strict consensus parsimonious tree for the S and L paralogs of PEPC sequence data that resolved as two distinct clades

Figure 5.4: Statistical network of nuclear haplotypes A-Q obtained from 91 PEPC-L paralog sequences representing diploid and polyploid individuals

Figure 5.5: Reticulation model synthesized from ploidy level, as well as chloroplast and nuclear data

Figure 6.1: Distribution of sampling sites of C. suksdorfii and C. douglasii sensu lato included in the population study

Figure 6.2: Neighbor-joining tree based on Jaccard coefficients of alleles obtained from the 13 microsatellite loci showing the genetic relationships among samples of C. douglasii and C. suksdorfii

Figure 6.3: Unrooted dendrogram based on DS distances (Nei 1978), estimates of F-statistics under IAM model, computed from SPAGEDI showing relatedness among sites of C. douglasii and C. suksdorfii

Figure 6.4: Bayesian inferences of clusters (K) estimated by STRUCTURE among individuals of (A) C. douglasii and (B) C. suksdorfii

Figure 6.5: Scatter plots of pairwise DS values among populations of C. douglasii (A) and C. suksdorfii (B) versus geographical distances

Supplementary figure 6A: Frequency distribution of the observed alleles of some selected microsatellite loci with respect to C. douglasii and C. suksdorfii

Supplementary figure 6B: Frequency distribution of the observed alleles with respect to 2x, 3x, and 4x individuals of C. suksdorfii

Figure 7.1: Frequency distribution of the pairwise IAM-based genetic distance among seed progenies in C. punctata (A) and C. crus-galli (B)

Figure 7.2: Neighbor-joining trees based on allelic similarity among seed samples of C. punctata (A) and C. crus-galli (B)

Figure 7.3: Box plots of pairwise Rousset’s distance calculated within (WS) and between (BS) seed families from C. crus-galli (C) and C. punctata (P)

Supplementary figure: Frequency distributions for detected alleles of five microsatellite loci in the two hawthorn species, C. crus-galli and C. punctata

Figure 8.1: Significance of the overall thesis research

List of Appendices

Appendix 1: Locality and voucher data for outgroup, Mespilus, and Crataegus taxa used for molecular analyses in Chapter 2

Appendix 2: Morphological characters and their states, together with ploidy level and geographic distribution, as they are expressed in Amelanchier, Mespilus, and Crataegus species

Appendix 3: GenBank accession numbers of representative species used in the phylogenetic reconstructions

Appendix 4: Locality and voucher data for outgroup and Crataegus taxa used for molecular analyses in Chapter 3

Appendix 5: Keys indicating ploidy level and the states of the eight morphological characters mapped on the molecular tree

Appendix 6: Locality and voucher data for Crataegus section Douglasianae accessions used for flow cytometry and molecular analyses in Chapters 4, 5, and 6

Appendix 7: Locality and source of C. crus-galli and C. punctata seed samples examined in Chapter 7


Chapter 1 Background

1.1 Significance of polyploids

Polyploidy is one of the processes leading to adaptation and speciation (Lewis 1980; Grant 1981; Leitch & Bennett 1997; Taylor et al. 2001; Levin 2002; Meyer & Levin 2002) and is well known in plants and some vertebrate organisms. About 70% of angiosperm species are believed to be polyploids (Soltis & Soltis 1999; Otto & Whitton 2000; Meyer & Levin 2002). While polyploidy occurs in almost all classes of vertebrates including fishes (Meyer & Schartl 1999; Comber & Smith 2004), rodents (Gallardo et al. 2004), as well as reptiles and amphibians (Ptacek et al. 1994; Becak & Becak 1998), it is much less common in these groups. Chromosome/genome multiplication events serve to introduce novel genotypes that enhance adaptation and enable changes in ecological habitat, geographical distribution, population structure, and reproductive system, as shown in herbaceous species (Lewis 1980; Husband & Schemske 1998; Allem 2003; Van Dijk 2003; Baack 2004; Jakob et al. 2004; Verduijn et al. 2004; Van Dijk & Vijverberg 2005; Hörandl 2006). However, very little is known about the evolutionary dynamics of polyploids in woody perennials. Therefore, the overall goal of the thesis is to provide a better understanding of polyploid evolution in one of the woody complexes of Crataegus by untangling problems related to polyploids from intergeneric to intraspecific levels using phylogenetic and population genetic approaches (Fig. 1.1), so as to supplement the existing knowledge of polyploid evolution not only in herbaceous but also in woody angiosperms.

1.2 Reproductive biology in polyploids and evolutionary hypotheses

Two types of polyploids, autopolyploids and allopolyploids, are recognized (Grant 1981). Autopolyploidy refers to whole-genome duplication within a species, involving fertilization of unreduced gametes or meristematic/somatic chromosome doubling within or between conspecific individuals (deWet 1980). This differs from allopolyploidy, which refers to duplication of two or more divergent genomes within a hybrid involving crosses between different species (Soltis & Rieseberg 1986; Ramsey & Schemske 1998). Apomixis, the asexual formation of seeds, occurs sporadically among the ca. 457 angiosperm families (see review in Whitton et al. 2008) and is often associated with polyploids (Grant 1981; Nogler 1984; Asker & Jerling 1992; Carman 1997). Gametophytic apomixis and adventitious embryony are the two major developmental pathways by which asexual (agamospermous) seed is produced (Nogler 1984; Czapik 1996; Whitton et al. 2008). The former requires the formation of an unreduced megagametophyte via diplospory or apospory in embryo development, whereas the latter involves direct embryo development from the nucellus tissue. About 126 genera are known to include gametophytic apomicts, and they are found mainly in the three families Rosaceae, Poaceae, and Asteraceae (Carman 1997; Whitton et al. 2008). One predicted consequence of asexual reproduction is the formation of genetically uniform populations that are likely to be differentiated only into patches of a few different genotypes by stochastic colonization events (Hamrick & Godt 1996; Starfinger & Stocklin 1996; McLellan et al. 1997). This has been shown in some agamospermous species, which show more uniform genotypes and lower levels of genetic variation than congeneric sexual relatives (e.g., Nybom & Schaal 1990; Kraft et al. 1996; Shi et al. 1996; Lyman & Ellstrand 1998; Nybom 1998; Kollmann et al. 2000; Hörandl et al. 2001; Štorchová et al. 2002; Paun et al. 2006), corroborating some early predictions that suggested an evolutionary “dead-end” in asexual organisms (Stebbins 1950; Clausen 1954; Darlington 1958; Grant 1981; Lynch & Gabriel 1983; Kondrashov 1993; Judson & Normark 1996). However, this notion is challenged by several other studies that indicate a paradoxically high level of genetic diversity in various agamospermous plants, one that is not significantly different from that of congeneric sexual relatives (e.g., Ellstrand & Roose 1987; Bayer 1990; Watkinson & Powell 1993; Widen et al. 1994; Menken et al. 1995; Noyes & Soltis 1996; Richards 1996; Gabrielsen & Brochmann 1998; Carino & Daehler 1999; Campbell et al. 1999; Durand et al. 2000; Van der Hulst et al. 2000; Persson-Hovmalm & Gustavsson 2001; Kjolner et al. 2004; D’Souza et al. 2005). Their findings argue against the evolutionary “dead-end” hypothesis and point toward the (direct or indirect) influence of dispersal, sexuality, and mutations on the genetic diversity of agamospermous plants, an influence that varies between species.

1.3 Polyploidy and related taxonomic problems in the Rosaceae

To date, molecular studies of polyploid and apomictic species have concentrated mainly on herbaceous plants of the Asteraceae, Poaceae, and Ranunculaceae (e.g., Watkinson & Powell 1993; Menken et al. 1995; Esselman et al. 1999; Van der Hulst et al. 2000; Garnier et al. 2002; Paun et al. 2006). Few attempts have been made to examine diploid-polyploid relationships and compare population genetic structure in sexual and apomictic taxa of woody plants, e.g., in Amelanchier, Crataegus, Malus, and Sorbus of the Rosaceae, where many species exhibit the natural phenomena of polyploidy and apomixis (Dickinson et al. 2007). Likewise, few of the consequences of polyploidy for the way in which these species interact with abiotic factors in their environment have been studied.

Rosaceae are well represented in the flora of the Northern Hemisphere, with about 100 genera and 2,000-3,000 species classified in three subfamilies: Spiraeoideae, Rosoideae, and Dryadoideae (Potter et al. 2007). The subtribe Pyrinae (formerly known as Maloideae) of the Spiraeoideae contains some 26 genera that are mainly woody plants with more or less fleshy fruits derived from a hypanthial ovary (known as berries or polypyrenous drupes) and have a base chromosome number of n = 17 (Campbell et al. 2007; Dickinson et al. 2007). Nine of these 26 genera are reported to comprise diploids and polyploids (Dickinson et al. 2007). According to Evans and Campbell (2002), the Pyrinae probably has a polyploid origin together with members of a lineage that contains the ancestors of Gillenia (n = 9). Such an origin may relate to the abundance of polyploids in large genera such as Crataegus, Malus, and Sorbus, because of a weak reproductive barrier that exists not only at the species level but also at higher taxonomic levels. In these genera, many polyploid species can reproduce by gametophytic apomixis, i.e., parthenogenetic development of an unreduced egg cell into an embryo (Dickinson et al. 2007).

Polyploidy associated with apomixis, together with other common events such as hybridization, often blurs taxonomic boundaries in several different ways, e.g., by segregating a taxon into microspecies/topodemes (Dickinson & Phipps 1985; Dickinson 1986, 1998, 1999) or by merging morphological and/or genetic characters between distant taxa, thus creating new nothotaxa at different levels (Campbell et al. 1991; Dickinson & Campbell 1991; Campbell et al.

2007; Dickinson et al. 2007). Such influence is exemplified by the genus Sorbus, which has been recently treated to include subgenera Aria, Cormus, Chamaemespilus, and Torminaria because their species are interfertile and produce offspring with overlapping morphological and genetic features, which cannot be properly classified (McAllister 1986; Aas et al. 1994;

Nelson-Jones et al. 2002). Another, and the most extreme, view of Pyrinae classification was that of Sax (1931), who proposed merging all genera into one genus and perhaps recognizing the present genera as separate species. However, this proposal was not adopted because of the marked morphological differences between most genera (Robertson et al. 1991). Overall, delimitation of the Pyrinae genera remains challenging because of the combination of hybridization and rapid radiation (Campbell et al. 2007).

Crataegus and Mespilus are sister genera in the Pyrinae (Campbell et al. 1995; Evans et al.

2000; Campbell et al. 2007) and share morphological features such as stony pyrenes, thorns, and paired superimposed ovules. However, their taxonomic status as separate genera may be questioned because members of those two genera overlap substantially in morphology and no 5 single character is entirely discriminative between the two. The modern concepts of Crataegus and Mespilus were originated with Medikus (1793) and based on the way in which the apices of pyrenes are covered by epidermis in Mespilus but exposed in the fruits of Crataegus. This concept has been in use throughout the twentieth century without in-depth investigations.

Crataegus is one of the largest genera in the Pyrinae, containing about 140-200 species distributed widely across the Northern Hemisphere (Phipps et al. 1990). By contrast, Mespilus contains only two species, the European M. germanica and the rare eastern North American M. canescens, known only from about 30 individuals on a nature reserve in Arkansas (Phipps 1991).

Variation in Mespilus in leaf margination and venation, number of flowers per inflorescence, and in stamen number per flower, encompasses most or all of the states exhibited in Crataegus.

Evidence of natural hybridization between species of Crataegus and Mespilus is limited despite the documentation of hybrids such as M. × grandiflora and M. × lobata in early literature

(Hooker 1835; Byatt 1977). While M. germanica is known to be a diploid, M. canescens proves to be a triploid and is predominantly sterile (Talent & Dickinson 2005). However, it is unclear (1) how the 3x M. canescens is derived from and related to Mespilus germanica and Crataegus taxa, and (2) whether Mespilus and Crataegus are phylogenetically distinct. Therefore, the first part of the thesis aims to determine the relationships among members of the two genera as well as to unravel the origin of the rare triploid M. canescens in the context of the phylogeny of the two genera (Fig. 1.1).

1.4 Classification and distribution of Crataegus

Crataegus trees, commonly known as hawthorns, are characterized by thorns and clusters of small pink or white flowers in late spring followed by red, apple-like fruits called haws. Fruits are marked by hypanthial openings and stony endocarps that are readily dispersed by larger birds and small rodents (Phipps & Muniyamma 1980; Courtney & Manzur 1985; Dickinson & Campbell 1991; Guitián 1998). Hawthorns are found widely in meadows and prairies, along creeks and rivers, in open woodlands, as well as in hilly montane forests in the northern temperate regions of the Old and New World (Phipps & Muniyamma 1980; Christensen 1992; Mabberley

1997). The latest and most comprehensive treatment of Crataegus divides the 140-200 species into 15 sections and 35 series based on morphological characters and geographical localities

(Phipps et al. 1990). In the Old World, 60 or more species representing four taxonomic sections are known from Europe and Asia (Phipps & Muniyamma 1980; Phipps et al. 1990; Christensen

1992; Gu & Spongberg 2003). About 85% of these Old World species are reported as diploids

(Talent & Dickinson 2005) and their taxonomy is not as complicated as that of the New World species

(Phipps et al. 1990; Christensen 1992; Gu & Spongberg 2003). By contrast, over 60% of the more than 100 New World species representing 11 sections are reported to include polyploids

(Longley 1924; Phipps & Muniyamma 1980; Phipps et al. 1990; Talent & Dickinson 2005). In eastern North America, only 12 out of some 80 described species are entirely diploids (Talent &

Dickinson 2005). Tremendous effort has been made by co-workers to clarify species in this complex, which is dominated by polyploids and perhaps hybrids (e.g., Phipps & Muniyamma

1980; Dickinson & Phipps 1985; Phipps 1988, 1998; Phipps et al. 1990; Christensen 1992;

Dickinson et al. 1996; Dickinson 1999). However, cladistic relationships are unclear and the existing classification has never been tested with molecular data. Questions that need to be addressed include: (1) Do phylogenetic data agree with the existing morphological treatment, particularly regarding the sectional limits? (2) How have morphological characters and ploidy levels evolved within the genus?

The rich fossil record of Rosaceae in North America and Europe has provided some insights into the biogeographic history of the family as a whole (DeVore & Pigg 2007). However, very few genera have been investigated in depth with regard to biogeographic relationships using other types of data. Crataegus has a wide geographical distribution in the Old and New World (Phipps et al. 1990) and is a woody genus with its earliest fossils dating from the mid-Tertiary

(MacGinitie 1934; Oliver 1936; Lamotte 1952; Wolfe & Wehr 1988; DeVore & Pigg 2007).

However, the phylogenetic relationships between the Old and New World Crataegus are still unknown. The ancestral origin and historical migratory pathways that contributed to the modern distribution of this genus are unclear. Previous morphological data suggested southwest China and Mexico as ancestral areas and a trans-Beringian movement between Asian and American

Crataegus (Phipps 1983), but these hypotheses have not been critically evaluated with other data. In the light of its modern distribution, I ask whether there is molecular evidence for not only the trans-Beringian but also the trans-Atlantic and other migration/vicariance events of ancient

Crataegus. Hence, chapter 3 of the thesis aims to infer the historical biogeography of Crataegus in a phylogenetic framework and to compare the resulting cladistic groupings with the existing morphological classification. Moreover, morphological characters are evaluated to understand their patterns of evolution as well as to identify those that are of key taxonomic significance (Fig.

1.1).

1.5 Characterizing Crataegus polyploids in western North America

About 8 of the 12 Crataegus species from western North America are ascribed to section

Douglasianae Loud., which is divided into two series, Douglasianae and Cerrones (Brunsfeld &

Johnson 1990; Dickinson et al. 1996; Phipps et al. 2003). The former includes C. douglasii and

C. suksdorfii as the two main species and C. castlegarensis, C. okennonii, and C. shuswapensis as segregates. The latter includes C. saligna, C. rivularis and its segregate C. erythropoda; they show distinct leaf morphologies and a narrower range of distribution compared to that of the

Douglasianae.

Species of sect. Douglasianae are known to contain diploid and polyploid individuals

(Dickinson et al. 1996; Talent & Dickinson 2005). Ploidy level has been shown to correlate with distributional features such as elevation and latitude, as well as other edaphic characters in herbaceous plants of the Ranunculaceae, Brassicaceae, Rubiaceae, and Asteraceae (Husband &

Schemske 1998; Baack 2004; Jakob et al. 2004; Verduijn et al. 2004). However, only a few woody plants have been investigated sufficiently with respect to the relationship between cytotype distribution and these abiotic factors (McKenzie et al. 2003). In the Douglasianae, it is still unclear (1) how diploid and polyploid individuals are distributed on a regional scale, and (2) whether there is a correlation between climate variables and cytotype distribution. Apart from distribution, other features such as reproductive system, origin(s), and population structure of cytotypes have never been investigated at a broad geographical scale for any Crataegus species.

Although sexual reproduction and gametophytic apomixis are known to occur in Crataegus

(Talent & Dickinson 2007), it is unclear whether seeds are produced heterogeneously within individuals of different ploidy levels in the Douglasianae. Moreover, the formation and subsequent establishment of these natural polyploids are poorly understood. Hence, chapter 4 of the thesis aims to investigate the reproductive biology and the abiotic factors associated with the distribution of cytotypes (Fig. 1.1). Also, I seek to determine (1) whether there are autopolyploids or allopolyploids present in the species complexes, (2) whether polyploids are derived independently in different localities, and (3) what the putative parental lineages involved in polyploid formation are, through modelling reticulation in the Douglasianae complexes as presented in chapter 5 of the thesis.

The existing knowledge of polyploid evolution is mainly based on studies of herbaceous groups, and documentation of evolutionary mechanisms for woody polyploids is very limited.

The evolutionary potential of a population or a species largely relies on the level and pattern of genetic variation. Such variation is governed by biotic factors such as dispersal ability, reproductive system, and selection regimes that might vary between species (Levin 1981;

Hamrick et al. 1992; Soltis & Soltis 1993; Hamrick & Godt 1996; Vavrek 1998; Gornall 1999; Ouborg et al. 1999; Mallet 2005). Reproductive system alone has been shown to affect genetic variability of local populations (Schoen & Brown 1991; Hamrick et al. 1992; Linhart & Grant

1996; Holsinger 2000; Bengtsson 2003; Houliston & Chapman 2004). For instance, asexual reproduction is generally expected to produce more homogeneous populations than does sexual reproduction. However, the combined effect of polyploidy and agamospermy on genetic variation of natural populations is not well known. Polyploidy is suggested as one of the mechanisms that counterbalances the loss of variation through asexual reproduction within populations, e.g., in tetraploid Amelanchier and Aronia species of the Rosaceae (Campbell et al.

1999; Persson-Hovmalm et al. 2004). At a broader scale, polyploid apomicts of the Asteraceae and Poaceae (see review in Whitton et al. 2008), as well as those studied in the Rosaceae

(Oddou-Muratorio et al. 2001; Robertson et al. 2004), have demonstrated effective gene flow via seed dispersal among populations. This has been proposed as an alternative mechanism that enhances genetic diversity in asexual plants. More data are required to understand the evolution of polyploid agamic complexes not only in herbaceous but also in woody angiosperms that exhibit different life histories. In North American Crataegus, the dynamics as well as the impact of reproductive system and polyploidy on population genetic structure have never been studied.

Crataegus suksdorfii and C. douglasii (including its segregates), as well as C. punctata and C. crus-galli (section Crus-galli), offer a unique opportunity to shed light on potential mechanisms that shape the genetic structure of diploid and polyploid species. Hence, in chapters 6 and 7 of the thesis (Fig. 1.1), I seek to disentangle the genetic architecture of these diploid and polyploid complexes at both population and individual levels by asking (1) Are diploid and polyploid populations genetically differentiated across their natural distribution range? (2) Is geographical distance a factor that limits gene flow? (3) Is there a reproductive barrier to gene flow for individuals of different ploidy levels? (4) Are there differences in the level of genetic variation between diploid sexual and tetraploid apomictic populations/taxa? The overall goal of the thesis is to resolve problems in the Crataegus complexes from global to fine scale investigations, so as to supplement the existing knowledge of polyploid evolution.

1.6 References

Aas G, Maier J 1994. Morphology, isozyme variation, cytology and reproduction of hybrids

between Sorbus aria (L.) Crantz and S. torminalis (L.) Crantz. Bot. Helv. 104: 195-214.

Allem AC 2003. Optimization theory in plant evolution: an overview of long-term evolutionary

prospects in the angiosperms. The Bot. Rev. 69: 225-251.

Asker SE, Jerling L 1992. Apomixis in plants. CRC Press. Boca Raton, FL.

Baack EJ 2004. Cytotype segregation on regional and microgeographic scales in snow

buttercups (Ranunculus adoneus: Ranunculaceae). Am. J. Bot. 91: 1783-1788.

Bayer RJ 1990. Patterns of clonal diversity in the Antennaria rosea (Asteraceae) polyploid

agamic complex. Am. J. Bot. 77: 1313-1319.

Becak ML, Becak W 1998. Evolution by polyploidy in Amphibia: new insights. Cytogenet. Cell

Genet. 80: 28-33.

Bengtsson BO 2003. Genetic variation in organisms with sexual and asexual reproduction. J.

Evol. Biol. 16: 189-199.

Brunsfeld SJ, Johnson FD 1990. Cytological, morphological, ecological and phenological

support for specific status of Crataegus suksdorfii (Sarg.) Kruschke. Madroño 37: 274-282.

Byatt JI, Ferguson IK, Murray BG 1977. Intergeneric hybrids between Crataegus L. and

Mespilus L.: a fresh look at an old problem. Bot. J. Linn. Soc. 74: 329-343.

Campbell CS, Donoghue MJ, Baldwin BG, Wojciechowski MF 1995. Phylogenetic relationships

in Maloideae (Rosaceae): Evidence from sequences of the internal transcribed spacers of

nuclear ribosomal DNA and its congruence with morphology. Am. J. Bot. 82: 903-918.

Campbell CS, Alice LA, Wright WA 1999. Comparisons of within-population genetic variation

in sexual and agamospermous Amelanchier (Rosaceae) using RAPD markers. Pl. Syst. Evol.

215: 157-167.

Campbell CS, Evans RC, Morgan DR, Dickinson TA, Arsenault MP 2007. Phylogeny of

subtribe Pyrinae (formerly the Maloideae, Rosaceae): Limited resolution of a complex

evolutionary history. Pl. Syst. Evol. 266: 119-145.

Carino DA, Daehler CC 1999. Genetic variation in an apomictic grass, Heteropogon contortus,

in the Hawaiian Islands. Mol. Ecol. 8: 2127-2132.

Carman JG 1997. Asynchronous expression of duplicated genes in angiosperms may cause

apomixis, bispory, tetraspory, and polyembryony. Biol. J. Linn. Soc. 61: 51-94.

Christensen KI 1992. Revision of Crataegus section Crataegus and Nothosect. Crataeguineae

(Rosaceae-Maloideae) in the Old World. Syst. Bot. Mono. 35: 1-199.

Clausen J 1954. Partial apomixis as an equilibrium system in evolution. Caryologia 6: 469-479.

Comber SCL, Smith C 2004. Polyploidy in fishes: patterns and processes. Biol. J. Linn. Soc.

82: 431-442.

Courtney SP, Manzur MI 1985. Fruiting and fitness in Crataegus monogyna: the effect of

frugivores and seed predators. Oikos 44: 398-406.

Czapik R 1996. Problems of apomictic reproduction in the families Compositae and Rosaceae.

Folia Geobotanica 31: 381-387.

D’Souza TG, Storhas M, Schulenburg H, Beukeboom LW, Michiels NK 2004. Occasional sex

in an 'asexual' polyploid hermaphrodite. Proc. Royal Soc. B (Biol. Sci.) 271: 1001-1007.

Darlington CD 1958. Evolution of genetic systems. Edinburgh: Oliver and Boyd.

deWet JMJ 1980. Origins of polyploids. In Polyploidy: Biological relevance. Ed. by Lewis

WH. Plenum Press, New York and London. Pp. 3-17.

DeVore ML, Pigg KB 2007. A brief review of the fossil history of the family Rosaceae with a focus on the Eocene Okanogan Highlands of eastern Washington State, USA, and British

Columbia, Canada. Pl. Syst. Evol. 266: 45-57.

Dickinson TA 1986. Topodeme differentiation in Ontario taxa of Crataegus (Rosaceae:

Maloideae): Leaf morphometric evidence. Can. J. Bot. 64: 2738-2747.

Dickinson TA 1998. Taxonomy of agamic complexes in plants: a role for metapopulation

thinking. Folia Geobotanica 33: 327-332.

Dickinson TA 1999. Species concepts in agamic complexes. In Evolution in man-made habitats.

L.W.D.v. Raamsdonk, and J.C.M.d. Nijs (eds). Institute for Systematics & Ecology,

Amsterdam. pp. 319-339.

Dickinson TA, Phipps JB 1985. Degree and pattern of variation in Crataegus section Crus-galli

in Ontario. Syst. Bot. 10: 322-337.

Dickinson TA, Phipps JB 1986. Studies in Crataegus (Rosaceae: Maloideae) XIV. The breeding

system of Crataegus crus galli sensu lato in Ontario (Canada). Am. J. Bot. 73: 116-130.

Dickinson TA, Campbell CS 1991. Population structure and reproductive ecology in the

Maloideae (Rosaceae). Syst. Bot. 16: 350-362.

Dickinson TA, Belaoussoff S, Love R 1996. North American black-fruited hawthorns: I.

Variation in floral construction, breeding system correlates, and their possible evolutionary

significance in Crataegus sect. Douglasii Loudon. Folia Geobotanica 31: 355-371.

Dickinson TA, Lo E, Talent N 2007. Polyploidy, reproductive biology, and Rosaceae:

understanding evolution and making classification. Pl. Syst. Evol. 266: 59-78.

Durand J, Garnier L, Dajoz I, Mousset S, Veuille M 2000. Gene flow in a facultative apomictic

Poaceae, the savanna grass Hyparrhenia diplandra. Genetics 156: 823-831.

Ellstrand NC, Roose ML 1987. Patterns of genotypic diversity in clonal plant species. Am. J.

Bot. 74: 123-131.

Esselman EJ, Jianqiang L, Crawford DJ, Windus JL, Wolfe AD 1999. Clonal diversity in the rare Calamagrostis porteri ssp. insperata (Poaceae): comparative results for allozymes and

random amplified polymorphic DNA (RAPD) and intersimple sequence repeat (ISSR)

markers. Mol. Ecol. 8: 443-451.

Evans RC, Alice LA, Campbell CS, Kellogg EA, Dickinson TA 2000. The Granule-Bound

Starch Synthase (GBSSI) gene in the Rosaceae: Multiple loci and phylogenetic utility. Mol.

Phylogenet. Evol. 17: 388-400.

Evans RC, Campbell CS 2002. The origin of the apple subfamily (Maloideae; Rosaceae) is

clarified by DNA sequence data from duplicated GBSSI genes. Am. J. Bot. 89: 1478-1484.

Gabrielsen TM, Brochmann C 1998. Sex after all: high levels of diversity detected in the arctic

clonal plant Saxifraga cernua using RAPD markers. Mol. Ecol. 7: 1701-1708.

Gallardo MH, Kausel G, Jiménez A, Bacquet C, González C, Figueroa J, Köhler N, Ojeda R

2004. Whole-genome duplications in South American desert rodents (Octodontidae). Biol. J.

Linn. Soc. 82: 443-451.

Garnier LKM, Durand J, Dajoz I 2002. Limited seed dispersal and microspatial population

structure of an agamospermous grass of West African savannahs, Hyparrhenia diplandra

(Poaceae). Am. J. Bot. 89: 1785-1791.

Gornall RJ 1999. Population genetic structure in agamospermous plants. In: Molecular

Systematics and Plant Evolution (Eds. Hollingsworth PM, Bateman RM, Gornall RJ).

Taylor and Francis, London. Pp. 118-138.

Grant V 1981. Plant speciation. New York, USA: Columbia University Press.

Gu CZ, Spongberg, SA 2003. Crataegus L.. Flora of China 9: 111-117.

Guitián P 1998. Latitudinal variation in the fruiting phenology of a bird-dispersed plant

(Crataegus monogyna) in Western Europe. Pl. Ecol. 137: 139-142.

Hamrick JL, Godt MJ, Sherman-Broyles SL 1992. Factors influencing levels of genetic diversity

in woody plant species. New Forests 6: 95-124.

Hamrick JL, Godt MJ 1996. Effects of life history traits on genetic diversity in plant species.

Philos. Trans. R. Soc. London [Biol] 351: 1291-1298.

Holsinger KE 2000. Reproductive systems and evolution in vascular plants. Proc. Natl. Acad.

Sci. USA 97: 7037-7042.

Hooker JD 1835. Curtis’s Botanical Magazine, 62: t3442.

Hörandl E 2006. The complex causality of geographical parthenogenesis. New Phytol. 171:

525-538.

Hörandl E, Jakubowsky G, Dobeš C 2001. Isozyme and morphological diversity within

apomictic and sexual taxa of the Ranunculus auricomus complex. Pl. Syst. Evol. 226:

165-185.

Houliston GJ, Chapman HM 2004. Reproductive strategy and population variability in the

facultative apomict Hieracium pilosella (Asteraceae). Am. J. Bot. 91: 37-44.

Husband BC, Schemske DW 1998. Cytotype distribution at a diploid–tetraploid contact zone in

Chamerion (Epilobium) angustifolium (Onagraceae). Am. J. Bot. 85: 1688-1694.

Jakob SS, Meister A, Blattner FR 2004. The considerable genome size variation of Hordeum

species (Poaceae) is linked to phylogeny, life form, ecology, and speciation rates. Mol. Biol.

Evol. 21: 860-869.

Judson OP, Normark BB 1996. Ancient asexual scandals. Trends Ecol. Evol. 11: A41-46.

Kjolner S, Sastad SM, Taberlet P, Brochmann C 2004. Amplified fragment length polymorphism

versus random amplified polymorphic DNA markers: clonal diversity in Saxifraga cernua.

Mol. Ecol. 13: 81-86.

Kollmann J, Steinger T, Roy B 2000. Evidence of sexuality in European Rubus (Rosaceae)

species based on AFLP and allozyme analysis. Am. J. Bot. 87: 1592-1598.

Kondrashov AS 1993. Classification of hypotheses on the advantage of amphimixis. J. Hered.

84: 372-387.

Kraft T, Nybom H, Werlemark G 1996. DNA fingerprint variation in some blackberry species

(Rubus subg. Rubus, Rosaceae). Pl. Syst. Evol. 199: 93-108.

Lamotte RS 1952. Catalogue of the Cenozoic plants of North America through 1950. Geol. Soc.

Am. Mem. 51.

Leitch IJ, Bennett MD 1997. Polyploidy in angiosperms. Trends Pl. Sci. 2: 470-476.

Levin DA 1981. Gene flow in plants revisited. Ann. Missouri Bot. Gard. 68: 233-253.

Levin DA 2002. The role of chromosomal change in plant evolution. Oxford University Press,

New York.

Lewis WH 1980. Polyploidy in species populations. In Polyploidy: Biological relevance. Ed. by

Lewis WH. Plenum Press, New York and London. Pp. 103-145.

Linhart YB, Grant MC 1996. Evolutionary significance of local genetic differentiation in plants.

Ann. Rev. Ecol. Syst. 27: 237-277.

Longley AE 1924. Cytological studies in the genus Crataegus. Am. J. Bot. 11: 295-317.

Lyman JC, Ellstrand NC 1998. Relative contribution of breeding system and endemism to

genotypic diversity: the outcrossing endemic Taraxacum californicum vs. the widespread

apomict T. officinale (sensu lato). Madroño 45: 283-289.

Lynch M, Gabriel W 1983. Phenotypic evolution and parthenogenesis. Am. Nat. 122: 745-764.

Mabberley DJ 1997. The plant book: a portable dictionary of the vascular plants. 2nd ed.

Cambridge, UK: Cambridge University Press

MacGinitie HD 1934. Contributions to paleobotany. II. The Trout Creek flora of southeastern

Oregon. Carnegie Inst.

Mallet J 2005. Hybridization as an invasion of the genome. Trends Ecol. Evol. 20: 229-237.

McKenzie D, Peterson DW, Peterson DL, Thornton PE 2003. Climatic and biophysical

controls on conifer species distributions in mountain forests of Washington State, USA.

Journal of Biogeography 30: 1093-1108.

Medikus FC 1793. Geschichte der Botanik unserer Zeiten. Mannheim, Schwan und Gotz.

McAllister HA 1986. The Rowan and its relatives (Sorbus spp.). Ness Series I Liverpool, Ness

Gardens (University of Liverpool Botanic Gardens).

McLellan AJ, Prati D, Kaltz O, Schmid B 1997. Structure and analysis of phenotypic and

genetic variation in clonal plants. In H. de Kroon and J. M. Van Groenendael [eds.], The

ecology and evolution of clonal plants. Backhuys, Leiden, Netherlands. Pp. 185–210.

Menken SBJ, Smit E, Hans Den Nijs JCM 1995. Genetical population structure in plants: gene

flow between diploid sexual and triploid asexual dandelions (Taraxacum section Ruderalia).

Evolution 49: 1108-1118.

Meyers LA, Levin DA 2006. On the abundance of polyploids in flowering plants. Evolution 60:

1198-1206.

Meyer A, Schartl M 1999. Gene and genome duplications in vertebrates: the one-to-four

(-to-eight in fish) rule and the evolution of novel gene functions. Curr. Opin. Cell Biol. 11:

699-704.

Nelson-Jones EB, Briggs D, Smith AG 2002. The origin of intermediate species of the genus

Sorbus. Theor. Appl. Genet. 105: 953-963.

Nogler GA 1984. Gametophytic apomixis. Embryology of angiosperms. J. B. M. Berlin,

Springer-Verlag. Pp. 475-518.

Noyes RD, Soltis DE 1996. Genotypic variation in agamospermous Erigeron compositus

(Asteraceae). Am. J. Bot. 83: 1292-1303.

Nybom H 1998. Biometry and DNA fingerprinting detected limited genetic differentiation

among populations of the blackberry Rubus nessensis (Rosaceae). Nord. J. Bot. 18: 323-333.

Nybom H, Schaal BA 1990. DNA "fingerprints" reveal genotypic distributions in natural

populations of blackberries and raspberries (Rubus, Rosaceae). Am. J. Bot. 77: 883-888.

Ouborg NJ, Piquot Y, van Groenendael JM 1999. Population genetics, molecular markers and the study of dispersal in plants. J. Ecol. 87: 551-568.

Oddou-Muratorio S, Petit RJ, Le Guerroue B, Guesnet D, Demesure B 2001. Pollen- versus

seed-mediated gene flow in a scattered forest tree species, Sorbus torminalis L. (Crantz).

Evolution 55: 1123-1135.

Palmer EJ 1932. The Crataegus problem. J Arnold Arboretum 13: 342-362.

Paun O, Greilhuber J, Temsch EM, Hörandl E 2006. Patterns, sources and ecological

implications of clonal diversity in apomictic Ranunculus carpaticola (Ranunculus auricomus

complex, Ranunculaceae). Mol. Ecol. 15: 897-910.

Persson-Hovmalm HA, Gustavsson BA 2001. The extent of clonality and genetic diversity in

lingonberry (Vaccinium vitis-idaea L.) revealed by RAPDs and leaf-shape analysis. Mol.

Ecol. 10: 1385-1397.

Persson-Hovmalm HA, Jeppsson N, Bartish IV, Nybom H 2004. RAPD analysis of diploid and

tetraploid populations of Aronia points to different reproductive strategies within the genus.

Hereditas 141: 301-312.

Phipps JB 1983. Biogeographic, taxonomic, and cladistic relationships between East Asiatic and

North American Crataegus. Ann. Missouri Bot. Gard. 70: 667-700.

Phipps JB 1988. Crataegus (Maloideae, Rosaceae) of the southeastern United States, I.

Introduction and series Aestivales. J. Arnold Arboretum 69: 401-431.

Phipps JB, Muniyamma M 1980. A taxonomic revision of Crataegus (Rosaceae) in Ontario.

Can. J. Bot. 58: 1621-1699.

Phipps JB, Robertson KR, Smith PG, Rohrer JR 1990. A checklist of the subfamily Maloideae

(Rosaceae). Can. J. Bot. 68: 2209-2269.

Phipps JB, Weeden NF, Dickson EE 1991. Isozyme evidence for the naturalness of Mespilus L.

(Rosaceae, subfam. Maloideae). Syst. Bot. 16: 546-552.

Phipps JB, O'Kennon RJ 1998. Three new species of Crataegus (Rosaceae) from western North America: C. okennonii, C. okanaganensis, and C. phippsii. Sida 18: 169-191.

Phipps JB, O'Kennon RJ, Lance RW 2003. Hawthorns and medlars. Timber Press, Portland OR.

Potter D, Eriksson T, Evans RC, Oh SH, Smedmark JEE, Morgan DR, Kerr M, Robertson KR,

Arsenault MP, Dickinson TA, Campbell CS 2007. Phylogeny and classification of Rosaceae.

Pl. Syst. Evol. 266: 5–43.

Ptacek MB, Gerhardt HC, Sage RD 1994. Speciation by polyploidy in treefrogs: multiple

origins of the tetraploid, Hyla versicolor. Evolution 48: 898-908.

Oliver E 1936. Contributions to paleontology: A Miocene flora from the Blue Mountains,

Oregon. Carnegie Inst.

Otto SP, Whitton J 2000. Polyploid incidence and evolution. Ann. Rev. Genet. 34: 401-437.

Ramsey J, Schemske DW 1998. Pathways, mechanisms, and rates of polyploid formation in

flowering plants. Ann. Rev. Ecol. Syst. 29: 467-501.

Richards AJ 1996. Genetic variability in obligate apomicts of the genus Taraxacum. Folia

Geobotanica 31: 405-414.

Robertson A, Newton AC, Ennos RA 2004. Multiple hybrid origins, genetic diversity and

population genetic structure of two endemic Sorbus taxa on the Isle of Arran, Scotland. Mol.

Ecol. 13: 123-134.

Robertson KR, Phipps JB, Rohrer JR, Smith PG 1991. A synopsis of genera of the Maloideae

(Rosaceae). Syst. Bot. 16: 376-394.

Sax K 1931. The origin and relationships of the Pomoideae. J. Arn. Arb. 12: 3-22.

Schoen D, Brown A 1991. Intraspecific variation in population gene diversity and effective

population size correlates with the mating system in plants. Proc. Nat. Acad. Sci. 88:

4494-4497.

Shi Y, Gornall RJ, Draper J, Stace CA 1996. Intraspecific molecular variation in Hieracium sect.

Alpina (Asteraceae), an apomictic group. Folia Geobotanica 31: 305-413.

Soltis DE, Rieseberg LH 1986. Autopolyploidy in Tolmiea menziesii (Saxifragaceae): evidence

from enzyme electrophoresis. Am. J. Bot. 73: 310-318.

Soltis PS, Soltis DE 1993. Molecular data and the dynamic nature of polyploidy. Crit. Rev. Pl.

Sci. 12: 243-273.

Soltis DE, Soltis PS 1999. Polyploidy: recurrent formation and genome evolution. Trends Ecol.

Evol. 14: 348-352.

Starfinger U, Stocklin J 1996. Seed, pollen, and clonal dispersal and their role in structuring

plant populations. Prog. Bot. 57: 337-355.

Stebbins GL 1950. Variation and evolution in plants. Columbia University Press, New York.

Štorchová H, Chrtek Jr J, Bartish IV, Tetera M, Kirschner J, Štepánek J 2002. Genetic variation

in agamospermous taxa of Hieracium sect. Alpina (Compositae) in the Tatry Mts. (Slovakia).

Pl. Syst. Evol. 235: 1-17.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae):

evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83:

1268-1304.

Talent N, Dickinson TA 2007. Ploidy level increase and decrease in seeds from crosses between

sexual diploids and asexual triploids and tetraploids in Crataegus L. (Rosaceae,

Spiraeoideae, Pyreae). Can. J. Bot. 85: 570-584.

Taylor JS, Van de Peer Y, Meyer A 2001. Genome duplication, divergent resolution and

speciation. Trends Genet. 17: 299-301.

Van der Hulst RGM, Mes THM, Den Nijs JCM, Bachmann K 2000. Amplified fragment length

polymorphism (AFLP) markers reveal that population structure of triploid dandelions

(Taraxacum officinale) exhibits both clonality and recombination. Mol. Ecol. 9: 1-8.

Van Dijk PJ 2003. Ecological and Evolutionary Opportunities of Apomixis: Insights from

Taraxacum and Chondrilla. Philos. Trans. R. Soc. London [Biol] 358: 1113-1121.

Van Dijk PJ, Vijverberg K 2005. The significance of apomixis in the evolution of the

angiosperms: a reappraisal. In Plant species-level systematics: New perspectives on pattern

and process. Pp.101-117.

Verduijn MH, Van Dijk PJ, Van Damme JMM 2004. Distribution, phenology and demography of

sympatric sexual and asexual dandelions (Taraxacum officinale s.l.): geographic

parthenogenesis on a small scale. Biol. J. Linn. Soc. 82: 205-218.

Vavrek M 1998. Within-population genetic diversity of Taraxacum officinale (Asteraceae):

differential genotype response and effect on interspecific competition. Am. J. Bot. 85:

947-954.

Watkinson AR, Powell JC 1993. Seedling recruitment and the maintenance of clonal diversity in

plant populations--a computer simulation of Ranunculus repens. J. Ecol. 81: 707-717.

Whitton J, Sears CJ, Baack EJ, Otto SP 2008. The dynamic nature of apomixis in the

angiosperms. Int. J. Plant Sci. 169: 169-182.

Widen B, Cronberg N, Widen M 1994. Genotypic diversity, molecular markers, and spatial

distribution of genets in clonal plants, a literature survey. Folia Geobotanica 29: 245-263.

Wolfe JA, Wehr W 1988. Rosaceous Chamaebatiaria-like foliage from the Paleogene of western

North America. Aliso 12: 177-200.

Figure 1.1 Phylogenetic tree showing the evolution of the research project and the outline of the present thesis. Each terminal box contains an abbreviated title of a thesis chapter, which represents an independent research paper. The background on polyploidy and Rosaceae is used as the outgroup, which provides the literature review and the main questions that are addressed in each of the chapters.

[Figure 1.1 diagram. Axis label: Taxonomic level and geographical scale. Terminal boxes: Ch 1, Polyploidy and Rosaceae; Ch 2, Generic limits of Crataegus and Mespilus; Ch 3, Phylogeography of Old and New World Crataegus; Ch 4, Cytotype distribution and reproductive biology of Douglasianae; Ch 5, Origins and reticulation of series Douglasianae; Ch 6, Population structure of C. douglasii and C. suksdorfii; Ch 7, Genetic variation of seed families.]

Chapter 2

Molecular reappraisal of relationships between Crataegus and Mespilus

(Rosaceae, Pyreae) – Two genera or one?

Abstract. Mespilus and Crataegus are sister genera in Rosaceae tribe Pyreae. Mespilus has been seen to comprise not only the medlar, Mespilus germanica, native to western Eurasia but also the Arkansas, U.S.A. endemic, Mespilus canescens. Crataegus, on the other hand, consists of

140-200 species found throughout the northern hemisphere. Diagnoses of these two genera rely on morphological features of leaves, flowers and fruits. However, character states supposed to be diagnostic of Mespilus occur in species of Crataegus. We used two nuclear (ribosomal ITS and LEAFY intron 2) and four intergenic chloroplast DNA regions (trnS-trnG, psbA-trnH, trnH-rpl2, and rpl20-rps12) to estimate the phylogeny of Mespilus and Crataegus. Maximum parsimony, maximum likelihood, and Bayesian analyses all corroborate the sister group relationship between Crataegus and Mespilus, and Crataegus brachyacantha is sister to the rest of that genus. However, incongruence between chloroplast and nuclear data supports the hypothesis of a hybrid origin for Mespilus canescens, with Crataegus brachyacantha or its ancestor as the maternal parent. Accordingly, we (1) restrict Crataegus section Brevispinae to

Crataegus brachyacantha; (2) distinguish the Arkansas endemic as a nothospecies; (3) describe a new section and a new nothosection within Crataegus to contain the former species of

Mespilus and Crataemespilus; and (4) merge Crataegus and Mespilus through making two new combinations under Crataegus.

Keywords: Crataegus, generic delimitation, hybrid origin, Mespilus, molecular data, phylogeny.

2.1. Introduction

Crataegus and Mespilus have a complicated taxonomic history. In brief, the modern concepts of

Crataegus and Mespilus originated with Medikus (1793), and are based on the way in which the pyrenes are covered by epidermis in Mespilus but exposed in the fruits of Crataegus. According to Medikus, Mespilus comprised a single species, the medlar M. germanica L. (Medikus 1793).

Crataegus, on the other hand, consisted of 12 hawthorn species and one species of what is now recognized as the genus Pyracantha M.Roem. Lindley (1822) maintained Medikus' concept of the two genera but reversed his distinction between them by suggesting that "in Mespilus the top of the cells is absolutely naked; and this is one of the distinctions between it and Crataegus," perhaps confusing the openness of the free portion of the hypanthium in the medlar fruit with the lack of any tissue covering the pyrenes. Despite alternative interpretations of these genera by others (see Table 1 in Robertson et al. 1991) this concept of Crataegus and a monotypic

Mespilus espoused by Medikus and Lindley was maintained by Candolle (1825), Decaisne

(1874), and Koehne (1890) and is the concept that has been in use throughout the twentieth century.

More recently the similarities and differences between Crataegus and Mespilus have been explored in two contexts: the expansion of Mespilus to include a North American entity endemic to Arkansas, M. canescens J.B.Phipps, and renewed interest, prompted by data from molecular systematic studies, in generic limits within Rosaceae tribe Pyreae

Baill. (formerly treated as subfamily Maloideae). Several molecular phylogenies have demonstrated a sister-group relationship between Crataegus and Mespilus (Campbell et al. 1995;

Evans et al. 2000; Evans & Campbell 2002; Campbell et al. 2007). In many of these analyses

Amelanchier Medik. and its related genera Peraphyllum Nutt. ex Torr. and Gray and

Malacomeles (Decne.) Engl. have been shown to be sister to the Crataegus-Mespilus clade.

There are more morphological differences, however, between the Amelanchier group and the Crataegus-Mespilus clade than there are between Mespilus and Crataegus. On the one hand, vegetative growth on fertile short shoots is sylleptic in Amelanchier, distinguishing it from most of the other genera in the Pyreae that have proleptic sympodial development of lateral short shoots. On the other hand, Crataegus and Mespilus are distinguished from the Amelanchier group and most other Pyreae by (1) lateral short shoots modified as thorns, (2) collateral ovules that become superposed by the time of anthesis so that typically only the lower one is fertilized,

(3) abundant endosperm in the mature seed (Aldasoro et al. 2005), and (4) a polypyrenous drupe

(rather than a berry or "pome") that develops from the hypanthial ovary. Variation within

Crataegus in leaf margination and venation, number of flowers per inflorescence, and in stamen number per flower, encompasses most or all of the states exhibited in Mespilus. While M. germanica has been shown to be a diploid, like many species of Crataegus, M. canescens is triploid and largely sterile (Talent & Dickinson 2005; Dickinson unpubl. data). Phipps et al.

(1991) argued that their phenetic analyses of isozyme data collected from both species of

Mespilus, several species of Crataegus, and a number of outgroup Pyreae genera supported the naturalness of Mespilus as a genus. Because of concern about the failure of the Konecny Grove population to recruit new individuals, McCue et al. (2001) used four RAPD primers to identify unique genotypes for ex situ conservation (seed set by M. canescens grown at the Dale Bumpers

Small Farms Research Center, Booneville, Arkansas, from Konecny Grove seedlings is extremely poor: two seeds, in 61 pyrenes from 13 fruits). Using 10 consistently amplified bands, comparison of RAPD phenotypes in the 25 M. canescens individuals in Konecny Grove with the phenotype of a single individual of M. germanica (McCue et al. 2001) demonstrated, upon reanalysis of these data (not shown; these results differ slightly from those reported by McCue et al. 2001), the presence of 11 unique RAPD phenotypes in the M. canescens individuals (one to eight individuals per phenotype), none of which had any of the 10 bands in common with the

M. germanica individual. More recently, Verbilaitė et al. (2006) demonstrated the similarity of DNA sequences from M. germanica and M. canescens to those of some Crataegus species for the trnL-trnF region of the chloroplast genome. These are the only molecular data that have been adduced to date. The present study seeks to resolve the relationship between these two genera using DNA sequence data from both the chloroplast and nuclear genomes.

This paper is part of a larger project on Crataegus systematics and evolution that has the following objectives: (1) to evaluate the support for Mespilus and Crataegus as distinct genera;

(2) to unravel the origin and relationships of M. canescens with other Mespilus and Crataegus taxa; (3) to discover the intrageneric taxonomic structure within Crataegus and find out to what extent the existing subgeneric classification represents distinct clades; (4) to infer the phylogenetic and biogeographic relationships between diploid and polyploid Crataegus entities and; (5) to establish what species concept best reflects the biology and evolutionary history of the North American black-fruited hawthorns (sections Brevispinae Beadle ex C.K.Schneid. and

Douglasianae Loud.).

This paper focuses on the first two of these objectives. We use a combination of nuclear and chloroplast sequences to infer the phylogeny of mainly diploid Mespilus and Crataegus species (Appendix 1). The commonly used nuclear ribosomal internal transcribed spacers (ITS) and the second intron of the floral homeotic gene, LEAFY were selected to represent the nuclear genome. LEAFY appears to be single copy in angiosperms (Frohlich & Meyerowitz 1997), but two orthologues have been reported in Malus species (Wada et al. 2002). The second intron has been shown to be informative in some previous phylogenetic studies (Archambault & Bruneau

2001; Grob et al. 2004) and it provided twice as many informative characters as the ITS and 10 times more than the cpDNA data among genera of the Rosaceae (Oh & Potter 2003, 2005). In addition to the nuclear sequences, four non-coding chloroplast regions (trnS-trnG, psbA-trnH, trnH-rpl2, and rpl20-rps12), adjacent to the junction of the large single copy (LSC) and inverted repeat (IR) regions, were used. These regions have been demonstrated to be informative for inferring phylogenies at both inter- and intraspecific levels (Goulding et al. 1996; Xu et al. 2000;

Vaillancourt & Jackson 2000). Together, they provide an independent plastid phylogeny that can be compared with the nuclear trees.

2.2. Materials and Methods

2.2.1. Taxon sampling

Plant material was either collected in the field or from botanical gardens (Appendix 1). Voucher specimens are deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT) unless noted otherwise in Appendix 1. Although every effort was made to include only diploid taxa of the genus Crataegus, two factors led to the inclusion of some polyploids in the samples studied here. First, we sought to represent as many sections of the genus as possible. In some cases where a section is monotypic (Parvifolieae Loudon, Cordatae Beadle ex C.K.Schneid.) it was necessary to use a polyploid entity (Appendix 1; Talent & Dickinson 2005). Second, sampling for this project took place before or concurrently with sampling for a parallel study of variation in nuclear DNA content (Talent & Dickinson 2005) so that in some instances we discovered that species we sampled vary in ploidy level (e.g. C. laevigata Poir., C. monogyna

Jacq.; Appendix 1). Other species, such as C. crus-galli L. and C. suksdorfii (Sarg.) Kruschke

(Appendix 1), were known to vary in ploidy level, but we nevertheless wished to include them in our study. A total of 31 Crataegus and two Mespilus species were included, with in most cases a minimum of two individuals representing each species. In three cases only a single individual was available to represent a section or series (sections Mexicanae Loud. and Lacrimatae

(J.B.Phipps) J.B.Phipps, and series Triflorae (Beadle) Rehder in section Coccineae Loud.;

Appendix 1). In some other cases where more than one species was available to represent a section or series, some species were represented by a single individual (Appendix 1). One individual was included in the sample on the supposition that it represented C. cuneata Siebold and Zucc. (section Cuneatae Rehder ex Schneider), but comparison with the image of the type specimen of this species demonstrated that this is not the case and this accession is listed under incertae sedis (Appendix 1). Species of Amelanchier, Malus and Aronia were used as outgroups because they have been shown to be divergent to varying degrees from Crataegus and Mespilus (Campbell et al. 2007).

2.2.2. Morphological data

Data on vegetative and reproductive morphology (Appendix 2) are based on field observations and herbarium specimens, and on data in Robertson et al. (1992), and Phipps et al. (2003).

Secondary venation of short shoot leaves was visualized on x-ray negatives prepared using a

Hewlett-Packard Faxitron x-ray system and Kodak Industrex film.

2.2.3. DNA Extraction, PCR, and sequencing

Total genomic DNA was extracted from leaves that were either frozen on dry ice and stored at –80°C or dried on silica gel and stored at room temperature. Frozen samples were extracted using the modified CTAB procedure of Doyle and Doyle (1987), while dried leaves were extracted using the method of Tsumura et al. (1995) modified to a small scale. The nuclear ribosomal region encompassing the ITS1 spacer, the 5.8S rRNA gene, and the ITS2 spacer was amplified using primers ITS4 and

ITS5 (White et al. 1990). The second intron of LEAFY was amplified using primers LFY1 and

LFY2 designed on the second and third exons (Oh & Potter 2003). Four chloroplast intergenic spacer regions, psbA-trnH (Sang et al. 1997), rpl20-rps12 and trnG-trnS (Hamilton 1999), and trnH-rpl2 (Vaillancourt & Jackson 2000), were amplified using the published primers.

Each 25 μl PCR reaction contained 5 pmol each of the 5’ and 3’ primers, 0.2 mM dNTP, 1 unit of Taq DNA polymerase (Fermentas), 2.5 mM MgCl2, and 2.5 μl 10× PCR buffer. DMSO was added to a final concentration of 10% in both ITS and LEAFY amplifications to increase the specificity of the PCR fragments and the intensity of the sequence peak profiles. All amplifications were carried out using a T1 Thermocycler (Whatman Biometra). PCR cycles involved an initial denaturing step at 94°C for 3 min, then 35 cycles of 94°C for 1 min, 50-56°C for 50 s, and 72°C for 2 min.

An additional extension was performed at 72°C for 5 min, then cooled to 4°C. PCR products were checked on 1% agarose gels. All chloroplast amplicons were sequenced directly after purification with MinElute purification columns (QIAGEN Inc.). Purified PCR products of ITS and LEAFY were cloned following the protocol of Qiagen’s pDrive Vector System and 3-5 clones per sample were sequenced using Perkin-Elmer BigDye terminator kits on ABI Model

3100 automated sequencer (PE Applied Biosystems, Inc.).

2.2.4. Sequence editing, alignment, and phylogenetic analyses

Multiple alignments of sequences were first obtained using the ClustalX program (Thompson et al. 1994) and then manually edited in Sequence Alignment Editor (Rambaut 2002). Gaps within the sequence data were treated as missing. However, the parsimony informative gaps, i.e., gaps shared by at least two ingroup species as determined by visual inspection of the alignment, were coded as either binary (presence or absence of indels) or multistate characters (depending on the length of indels) using the Simmons & Ochoterena (2000) method as implemented in SeqState version 1.32 (Müller 2005), and appended to the sequence matrixes for phylogenetic analyses

(Guillon 2004). Representative sequences for each region for each species were deposited in

GenBank (Appendix 3; accessions EF127007-127228).
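The gap-coding step described above can be illustrated with a minimal Python sketch of simple indel coding in the spirit of Simmons & Ochoterena (2000). This is not the SeqState implementation: the function names and the toy alignment are hypothetical, leading and trailing gaps are not given special treatment, the multistate option is not shown, and the filter for parsimony informative gaps shared by at least two ingroup species is not applied.

```python
def find_gaps(seq):
    """Return (start, end) index pairs (end exclusive) for each run of '-' in seq."""
    gaps, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(seq)))
    return gaps


def simple_indel_coding(alignment):
    """Code every distinct gap (unique start and end) as one binary character.

    '1' = the sequence has exactly this gap, '0' = it does not,
    '?' = inapplicable, because a longer gap in the sequence spans this indel.
    """
    gaps_per_seq = {name: find_gaps(seq) for name, seq in alignment.items()}
    indels = sorted({g for gaps in gaps_per_seq.values() for g in gaps})
    coded = {}
    for name, gaps in gaps_per_seq.items():
        states = []
        for (s, e) in indels:
            if (s, e) in gaps:
                states.append("1")
            elif any(gs <= s and e <= ge for gs, ge in gaps):
                states.append("?")
            else:
                states.append("0")
        coded[name] = "".join(states)
    return indels, coded


# Hypothetical toy alignment, not taken from the thesis data.
toy = {
    "taxon_A": "ACGT--ACGTAC",
    "taxon_B": "ACGT--ACGTAC",
    "taxon_C": "ACGTTTACGTAC",
    "taxon_D": "AC------GTAC",
}
indels, coded = simple_indel_coding(toy)
for name in toy:
    print(name, coded[name])
```

The coded strings would then be appended to the nucleotide matrix as additional characters, as described in the text.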

Phylogenetic analyses were conducted using PAUP*4.0b (Swofford 2002) for maximum parsimony (MP) and maximum likelihood (ML), and MrBayes version 3.0b4 (Huelsenbeck &

Ronquist 2001) for Bayesian inference (BI). Nuclear and chloroplast data were analysed both separately and jointly with the three methods. In order to obtain phylogenies based on a complete dataset, taxa in conflicting positions in the nuclear and chloroplast trees were removed from the combined analysis. Heuristic parsimony searches were performed using equally weighted characters, tree-bisection-reconnection (TBR) branch swapping, random addition of sequence

(1000 replicates), and with no limit to the number of trees saved. Character changes were interpreted with the ACCTRAN optimization. Branch support was assessed by bootstrap (BS) analyses (Felsenstein 1985) with full heuristic searches, 500 replicates using simple taxon addition and TBR swapping, MULTtrees option, and all trees saved.
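The resampling step behind the bootstrap support values can be sketched as follows. This is only an illustration of how a pseudoreplicate matrix is built by sampling alignment columns with replacement (Felsenstein 1985); the tree search on each pseudoreplicate was done in PAUP* and is not reproduced here. The function names, the toy alignment, and the replicate count in the usage lines are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    replicate = {name: "".join(alignment[name][c] for c in cols) for name in names}
    return replicate, cols


def bootstrap_support(n_replicates_with_clade, n_replicates):
    """Percentage of bootstrap replicates whose tree contains a given clade."""
    return 100.0 * n_replicates_with_clade / n_replicates


# Hypothetical toy alignment, not taken from the thesis data.
toy_alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACGAACGA",
}
rng = random.Random(42)  # fixed seed so that the sketch is repeatable
replicate, cols = bootstrap_alignment(toy_alignment, rng)
print(cols)
print(replicate)
```

With 500 replicates, a clade recovered in a hypothetical 435 of the replicate trees would receive `bootstrap_support(435, 500)`, i.e. 87% BS.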

In order to reduce computational time, one individual per species was included in the ML analyses of nuclear, chloroplast, and combined data. The substitution models for ML and

Bayesian analyses were obtained using Modeltest (version 3.06, Posada & Crandall 1998) with both Hierarchical Likelihood Ratio Tests (hLRTs) and Akaike Information Criterion (AIC) methods. Maximum likelihood analysis of the combined nuclear data was conducted with

the Transitional (TIM) model (parameters: base frequencies A = 0.19, C = 0.34, G = 0.30, T = 0.17, proportion of invariable sites (I) 0.5183, gamma 1.1819, Ti/Tv 1.463, 6 rate parameters and molecular clock not enforced). Analysis of the chloroplast data was conducted with the General

Time Reversible (GTR) model (parameters: base frequencies A = 0.3538, C = 0.1332, G = 0.1456, T = 0.3674, proportion of invariable sites (I) 0.6536, gamma 0.4233, Ti/Tv 0.622, 6 rate parameters and molecular clock not enforced). The smaller gamma value obtained for the chloroplast dataset compared with that for the nuclear data indicated more substantial heterogeneity of substitution rates across the chloroplast nucleotides. Analysis of the combined nuclear and chloroplast data was conducted with the Transversional (TVM) model (parameters: base frequencies A = 0.3095, C = 0.1852, G = 0.1862, T = 0.3191, proportion of invariable sites

(I) 0.5093, gamma 0.567, Ti/Tv 1.6414, 6 rate parameters and molecular clock not enforced).

Bayesian inference was initiated from a random starting tree, and the program was set to run four Markov chain Monte Carlo (MCMC) chains for 1,000,000 generations, with trees sampled every 100th generation. The likelihood scores, trees, and other sample points generated prior to 136,100 and 55,700 generations for the nuclear and chloroplast data, respectively, were discarded as burn-in because they do not provide accurate parameter estimates. The remaining trees were saved and imported into PAUP* to construct the majority rule consensus trees. The posterior probability of each clade was obtained to evaluate branch support in the resulting trees.
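The burn-in bookkeeping can be sketched in Python as below. This is an illustration only, not the MrBayes or PAUP* procedure: it assumes that sampling starts at generation 0, and the function names and the representation of a tree as a set of clades are hypothetical.

```python
def post_burnin_generations(n_generations, sample_every, burnin_generations):
    """Return the sampled generation numbers retained after discarding burn-in.

    Assumes (hypothetically) that the first sample is taken at generation 0.
    """
    sampled = range(0, n_generations + 1, sample_every)
    return [g for g in sampled if g >= burnin_generations]


def clade_posterior(retained_trees, clade):
    """Fraction of retained trees that contain the clade.

    Each tree is represented as a set of frozensets of taxon names.
    """
    clade = frozenset(clade)
    return sum(clade in tree for tree in retained_trees) / len(retained_trees)


# Run length and burn-in values as reported in the text.
nuclear_kept = post_burnin_generations(1_000_000, 100, 136_100)
chloroplast_kept = post_burnin_generations(1_000_000, 100, 55_700)
print(len(nuclear_kept), len(chloroplast_kept))

# Hypothetical retained trees, reduced to the clades they contain.
trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"B", "C"}), frozenset({"A", "B", "C"})},
]
print(clade_posterior(trees, {"A", "B"}))
```

Under the generation-0 assumption, 8,640 nuclear and 9,444 chloroplast samples would remain for building the majority rule consensus trees.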

2.2.5. Alternative topologies

We used the Shimodaira–Hasegawa (SH) test (Shimodaira & Hasegawa 1999) as implemented in PAUP* (Swofford 2002) to compare the best ML trees recovered from the nuclear and chloroplast data, respectively, with the constraint trees constructed in MacClade (Maddison &

Maddison 1992). To test the two genera hypothesis, Crataegus and Mespilus taxa were constrained into two monophyletic groups and the trees were loaded as backbone into PAUP*.

Heuristic searches were conducted using the same ML parameters outlined above to find the best trees compatible with the constraint. The likelihood score of the constrained tree was then compared with the score of the best ML tree using the one-tailed nonparametric SH test.

2.3. Results

2.3.1. Sequences

Our data conform to the generally lower GC content in the chloroplast sequences than in the nuclear sequences (Table 2.1). For the nuclear sequences, intraspecific polymorphism was no more than 0.01% among clones of our examined taxa including the triploid C. uniflora and M. canescens, and tetraploid C. phaenopyrum. Size variation was observed in both nuclear regions

(Table 2.1). In LEAFY, divergence between ingroup and outgroup taxa (Malus and Aronia) was as much as 35%, which was about three-fold higher than in the ITS region (Table 2.1). Because of the alignment difficulties with the divergent sequences, Malus and Aronia were removed from the phylogenetic analyses of the LEAFY data. The spacer regions between the chloroplast genes, like those in the

ITS and LEAFY intron, showed noticeable length variation across sequences of our studied taxa, and gave a total of 19 parsimony informative indels in the combined data matrix (Table 2.1).

Most of the indels were conserved across sequences and could be easily aligned, except for an AT-rich indel that was 245 bp long in the trnH-rpl2 region. Taxa showed remarkable variation in the length of this indel caused by irregular AT insertion; therefore, this region was excluded from the phylogenetic analyses.
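The summary statistics of the kind reported in Table 2.1 can be illustrated with a short Python sketch. It assumes that divergence is measured as the uncorrected proportion of differing sites (p-distance) over positions where both sequences have an unambiguous base; the thesis does not state which distance measure was used, and the function names and toy sequences are hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (gaps and N ignored)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def p_distance(seq1, seq2):
    """Uncorrected percentage divergence between two aligned sequences.

    Only columns where both sequences carry A, C, G, or T are compared.
    """
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    return 100.0 * sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical toy sequences, not taken from the thesis data.
print(round(gc_content("ACGT--GC"), 2))
print(round(p_distance("ACGTAC-T", "ACGAACGT"), 2))
```

In the second call one of the seven comparable columns differs, so the gap column does not contribute to the distance.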

2.3.2. Nuclear phylogeny

In all analyses, Amelanchier was shown to be less divergent from the ingroup taxa than were

Malus and Aronia. Heuristic parsimony searches of the ITS data alone yielded 2761 equally parsimonious trees. Within the ingroup, the Mespilus taxa were monophyletic, but the Crataegus taxa were not, because Mespilus was nested among them (Fig. 2.1a). Crataegus brachyacantha (section

Brevispinae) was associated with the two Mespilus species and was distinct from the rest of the genus (clade A). This relationship was supported additionally by two indels detected in the alignment. The remaining Crataegus taxa are divided into four clades labeled as B, C, D, and E with moderate bootstrap or Bayesian support (Fig. 2.1a). Clade B contains members of the

Eurasian sections Crataegus and Hupehensis. Clade C is a small group of three North American taxa: C. marshallii (sect. Crataegus), C. phaenopyrum (sect. Cordatae), and C. spathulata (sect.

Microcarpae). Clade D contains members of section Coccineae, Crus-galli, Virides, Mexicanae, and Aestivales exclusively from eastern North America, and this whole group was sister to clade

E, which contains members of sections Sanguineae and Douglasianae, and C. saligna (sect. Brevispinae). Over all of the ingroup branches the unrooted tree of the ITS data showed a maximum of 27 changes compared with 40 changes on the branch leading to the outgroup taxa.

In contrast, with the LEAFY data, about 205 changes were accumulated along the branch leading from Malus and Aronia to Amelanchier and the ingroup (other branch lengths in this area of the tree were < 15 changes). Thus, over 27,000 parsimony trees were produced when Malus and

Aronia were included as outgroup. We conclude that the extremely long branch of Malus and

Aronia in the LEAFY data could have distorted the topology of clades with relatively short branches, and resulted in an inaccurate phylogeny. In order to alleviate this rooting problem, further analyses of the LEAFY data used Amelanchier as the only outgroup.

Without Malus and Aronia, the LEAFY data yielded a total of 5053 parsimony trees. The strict consensus tree (Fig. 2.1b) divided the ingroup taxa into three main clades: {A, B}, {C, D}, and E. As in the ITS data (Fig. 2.1a), C. brachyacantha was allied with the Mespilus species

(clade A), but with poor support (<50% BS). This clade was strongly associated with the

Eurasian taxa of Crataegus (clade B; 87%BS, 98%BI). The three monotypic groups (sections

Cordatae and Microcarpae, and series Apiifoliae, in section Crataegus) that constituted clade C in the ITS data (Fig. 2.1a) were unresolved in LEAFY and were found in a polytomy together with the other eastern North American taxa (clade D; Fig. 2.1b). A similar pattern was also found in the ML tree (data not shown), as well as in the Bayesian results where the eastern

North American taxa were resolved as a polytomy (data not shown).

Because there was no strongly supported conflict between the topologies inferred from the

ITS (Fig. 2.1a) and LEAFY (Fig. 2.1b) data, the two datasets were combined to increase robustness and phylogenetic resolution. Analysis of the combined nuclear data resulted in 970 equally parsimonious trees. The strict consensus tree (Fig. 2.2) demonstrated the monophyly of

C. brachyacantha and the Mespilus species (clade A; Fig. 2.1a, b), and this clade was found to be closely related to the Eurasian species (clade B), as shown in the LEAFY data (Fig. 2.1b). However, this association was weakly supported in the bootstrap analysis (BS < 50%). Clades D and E were well supported as sister groups as shown in the ITS data (Fig. 2.1a), and clade C was shown adjacent to clades {D, E}.

2.3.3. Chloroplast phylogeny

Maximum parsimony analyses of individual chloroplast regions each recovered over 30,000 equally parsimonious trees with only a few resolved clades nested in widely unresolved topologies. Because the entire chloroplast genome is considered as one linkage group, individual regions are expected to exhibit the same phylogenetic pattern (Doyle 1992). We combined all sequences to obtain greater phylogenetic resolution.

Heuristic parsimony analyses of combined chloroplast data produced 18,432 trees. Clades

A, B, D, and E, as found in nuclear data (Fig. 2.2), were recovered in the chloroplast data (Fig.

2.3). However, apparent conflicts were detected in the relationships between C. brachyacantha and the Mespilus taxa within clade A, and in the position of C. marshallii, C. phaenopyrum, and

C. spathulata of clade C in the chloroplast and nuclear trees. Such incongruences were also supported by the ML (Fig. 2.4) and Bayesian results (data not shown). None of the analyses of the chloroplast data recovered M. canescens and M. germanica as a monophyletic group (Fig.

2.3). Interestingly, Mespilus was recovered as paraphyletic, with M. canescens more closely related to C. brachyacantha than to M. germanica (BS≥81 and BI≥97). This association coincided with 16 site changes and 5 diagnostic indels shared between the two taxa as detected in the alignment. Another major conflict was found in clade C in which C. phaenopyrum, C. marshallii, and C. spathulata were dispersed within the eastern and western North American taxa (clades D and E) instead of showing as a monophyletic group.

2.3.4. Maximum likelihood analyses and tests of alternative phylogenetic hypotheses

For the combined nuclear data, ML analysis using the TIM+G+I model (rAC = 1.00, rAG = 1.77, rAT = 0.64, rCG = 0.64, rCT = 2.48, α = 1.18, pinv = 0.52) recovered a single tree (Fig. 2.4a) with –lnL = 5,894.64. The topology found was similar to the MP (Fig. 2.2) and Bayesian (data not shown) results. Results of the Shimodaira–Hasegawa test based on nuclear data failed to reject the hypothesis that Crataegus and Mespilus are two separate monophyletic groups (P = 0.145). The difference in likelihood scores between the best ML tree and an ML tree constrained to fit the hypothesis was -5,894.64 - (-5,922.35) = 27.71. Hence, the result of the SH test on the nuclear sequence data is consistent with the traditional treatment of Crataegus and Mespilus as two distinct genera.

Maximum likelihood analysis of chloroplast data using the GTR+G+I model (rAC = 0.99, rAG = 1.14, rAT = 2.15, rCG = 0.89, rCT = 1.578, α = 0.42, pinv = 0.65) recovered a single tree with –lnL = 4,228.18 (Fig. 2.4b). This tree supported the topology observed in the parsimony

(Fig. 2.3) and Bayesian (data not shown) analyses except that the Eurasian taxa, M. germanica, and M. canescens-C. brachyacantha (i.e. clade A1, A2, and B in Fig. 2.4b) were resolved in a polytomy. Results of the SH test based on chloroplast data led us to reject the hypothesis that

Crataegus and Mespilus are two separate monophyletic groups (P < 0.05). The difference in likelihood scores between the best ML tree and an ML tree constrained to fit the hypothesis was

-4,228.18 - (-4,559.20) = 331.02. The SH test rejected the inclusion of C. brachyacantha within the Crataegus clade.
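The likelihood differences quoted for the nuclear and chloroplast SH tests can be checked with a few lines of Python. The sketch only reproduces the arithmetic on the reported –lnL values; the P values themselves come from the resampling procedure in PAUP* and are not recomputed here. The function name is hypothetical.

```python
def lnl_difference(neg_lnl_best, neg_lnl_constrained):
    """Difference in lnL between the best and the constrained tree.

    Both arguments are the reported -lnL values, so the signs are flipped first.
    """
    return (-neg_lnl_best) - (-neg_lnl_constrained)


# Reported -lnL values for the best and constrained ML trees.
nuclear = lnl_difference(5894.64, 5922.35)
chloroplast = lnl_difference(4228.18, 4559.20)
print(round(nuclear, 2), round(chloroplast, 2))
```

A larger positive difference means that the constraint of two monophyletic genera costs more likelihood, which is in line with the constraint being rejected for the chloroplast data but not for the nuclear data.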

2.3.5. Combined nuclear and chloroplast phylogeny

In order to test the two-genera hypothesis of Crataegus and Mespilus more thoroughly, we analyzed the combined nuclear and chloroplast data after removing taxa responsible for conflicting topologies (M. canescens, C. marshallii, C. phaenopyrum, and C. spathulata).

Parsimony analyses generated 29,113 trees, and the four major clades (A, B, D, and E) obtained with these taxa in the earlier analyses (Fig. 2.1-2.5) were recovered in the strict consensus tree (Fig. 2.5a). Bootstrap and posterior probability values were generally high among most clades (BS > 80% and BI > 97%) in this analysis. Only the association between C. brachyacantha and M. germanica was not strongly supported (BS = 57% and BI = 77%). The difference in likelihood scores between the strict consensus MP tree and constrained MP tree was -11,001.39 - (-10,968.21) = 33.19. The Shimodaira–Hasegawa test based on the combined data failed to reject the hypothesis that Crataegus and Mespilus are two separate monophyletic groups only when M. canescens was removed (P = 0.096).

Maximum likelihood analysis of the combined data using the TVM+G+I model (rAC = 1.06, rAG = 2.15, rAT = 2.89, rCG = 1.14, rCT = 2.15, α = 0.57, pinv = 0.51) recovered a single tree with –lnL = 11,001.39 (Fig. 2.5b). In the ML tree, the association of C. brachyacantha and M. germanica (clade A) collapsed and C. brachyacantha was clearly sister to all Crataegus species.

2.4. Discussion

2.4.1. Intergeneric divergence of LEAFY sequences

There are not many nuclear genes that have been used in the phylogeny of genera in the Pyreae due to the concerns about concerted evolution and paralogy especially in hybrid and polyploid taxa (Bailey et al. 2003). Genes such as waxy and s6pdh have been recently shown by Southern hybridization to contain more than one copy in Rosaceae (Evans et al. 2000; Bortiri et al. 2002).

LEAFY, a floral meristem identity gene that controls meristem development in Arabidopsis (Blazquez et al. 1997), is suggested to be single copy through the loss of its paralogous copy during angiosperm evolution (Frohlich & Meyerowitz 1997).

Although more than one ortholog is present in species of Malus (Wada et al. 2002), our PCR products appear to be single bands and the introns have been shown to be phylogenetically informative not only here but in other Rosaceae genera, e.g. Neillia and Stephanandra (Oh & Potter 2003, 2005). Variability of the LEAFY sequences in Crataegus and Mespilus was comparable to that in the ITS region (Table 2.1), but substantial divergence (three times more than in ITS) was found between the ingroup and Malus and Aronia. Such divergence, on the one hand, demonstrates the potential utility of using LEAFY elsewhere in the Pyreae. However, as shown in this study, when the sequences being analyzed are too divergent, or when rates of evolution show considerable variation among sequences, a spurious phylogeny could be produced due to long branch attraction (Felsenstein 1978). One approach to minimizing this effect is to include sequences with more changes along short internal branches in order to reduce the differences in branch length. An alternative is to include more samples to break down long branches of the diverged outgroup taxa. In our case, we took the former approach and combined the ITS and LEAFY data to obtain a more accurate phylogeny.

2.4.2. Phylogenetic utility of chloroplast regions in Pyreae

Regions such as rbcL, matK, and trnL/F have been used in earlier studies of Rosaceae phylogeny (Chase et al. 1993; Morgan et al. 1994; Potter et al. 2002). Some of these were shown to be informative at the generic level within subfamilies such as Spiraeoideae (Lee &

Wen 2001; Bortiri et al. 2002) and Rosoideae (Eriksson et al. 2003). Other regions such as rpl16, rps16, trnL, ndhF, and rbcL-atpB have been used singly or together to infer intergeneric relationships within Pyreae (Campbell et al., 2007), but the resolution was not as high as a single nuclear ITS region. In this regard, the attempt to reconstruct a maternal phylogeny of genera in the Pyreae, especially at lower taxonomic levels, was considered challenging.

Nevertheless, chloroplast regions vary broadly in their evolutionary rates and therefore provide different amounts of phylogenetic signal at any given taxonomic level (Zurawski & Clegg 1987;

Golenberg et al. 1993; Olmstead & Palmer 1994). In this study, we resolved a species-level phylogeny of Crataegus and Mespilus using four intergenic regions of the chloroplast genome that have never been used in the Rosaceae: trnG-trnS, psbA-trnH, trnH-rpl2, and rpl20-rps12.

The combined data yielded as much as 5.84% polymorphism (Table 2.1) among all taxa examined and gave a topology compatible with the nuclear results.

2.4.3. Implications of nuclear and chloroplast data incongruence

The combined nuclear trees (Fig. 2.2 and 2.5a) were considered to be good hypotheses for evolutionary relationships in Crataegus and Mespilus because of the high degree of congruence between parsimony, maximum likelihood, and Bayesian analyses, in addition to the higher resolution, bootstrap, and posterior probability values obtained from the combined datasets.

However, comparison with the chloroplast results (Fig. 2.3 and 2.5b) demonstrates conflicts such as the placement of three eastern North American taxa, C. marshallii, C. phaenopyrum, and C. spathulata. These conflicts are important and will be discussed in more detail elsewhere, as part of our overall appraisal of relationships within Crataegus.

Of greater concern here are the placements of the Mespilus species and the non-monophyly of section Brevispinae, which until now has included two species, C. brachyacantha and C. saligna. Crataegus brachyacantha occurs naturally in Louisiana, eastern Texas, and adjacent portions of Arkansas and Oklahoma; there is also a single record for southwestern Georgia (Phipps 1998). It is noteworthy for its petals that may turn orange upon drying, and for its dark purple to black fruit, covered with a waxy bloom. Its relatively isolated position within the genus is best accommodated by transferring C. saligna out of section Brevispinae and into section Douglasianae. This transfer is the best solution for C. saligna at this time, pending more comprehensive analyses of sections Douglasianae and Sanguineae (Fig. 2.1-2.5).

One cause of the incongruence between the nuclear and chloroplast trees could be the recent occurrence of hybridization between early-diverged taxa. Mespilus canescens and M. germanica share a common ancestor, as shown by the nuclear data (Fig. 2.2 and 2.5a), but M. canescens is shown to be more closely related to C. brachyacantha than it is to M. germanica based on the chloroplast data (Fig. 2.3 and 2.5b). Conflicting topologies like these suggest a hybrid origin of M. canescens, with C. brachyacantha as the probable maternal parent, given that their chloroplast sequences are over 99% identical.

Hybridization between C. brachyacantha and M. germanica could have occurred if the latter was cultivated within the range of C. brachyacantha sometime in the past 150-200 years. In fact, Baird and Thieret (1989) refer to an 1893 report of cultivation of M. germanica at an agricultural station in Louisiana, suggesting that there is no reason to exclude this possibility.

Hybridization among Crataegus species is well-documented (e.g. Christensen 1992; Phipps 2005), although its frequency and significance are debated. The factors likely most relevant to whether hybridization between M. germanica and C. brachyacantha could have occurred include proximity and phenology (Campbell et al. 1991). Hawthorns and medlars have relatively unspecialized entomophilous flowers with abundant pollen and are apparently pollinated primarily by bees (Dickinson 1985; Dickinson et al. 1996). Although the number of flowers per inflorescence varies (Table 2.2), even in many-flowered inflorescences anthesis is usually completed within a week or less, at a time that appears to be highly species-specific and controlled by vernal accumulated heat (Dickinson & Phipps 1986; Smith & Phipps 1988). Nothing, however, is known about the relative timing of anthesis in M. germanica and C. brachyacantha.

A single population of red-fruited M. canescens was discovered in 1970 in Konecny Grove, a small nature reserve in Arkansas, and this site remains the only one at which this species is known to occur naturally (Phipps 1990). Trees of M. canescens are triploid, whereas individuals of M. germanica that have been studied are exclusively diploid (Appendix 1; Talent & Dickinson 2005). A possible origin of M. canescens from a cross between two Pyreae species was considered by Phipps but dismissed “due to the lack of at least two suitable candidates” (Phipps 1990). Nevertheless, petals of M. canescens resemble those of C. brachyacantha in turning a faint orange color upon drying and, in the analysis of 44 isozyme phenotypes for eight enzyme systems (Phipps et al. 1991), the two Mespilus species for the most part exhibited a subset of the 35 phenotypes found in 21 Crataegus species. Two phenotypes were unique to M. canescens, and two more were also found in M. germanica but not in any of the Crataegus species. Mespilus canescens shared two phenotypes with C. brachyacantha, and three with C. chlorosarca. Both sexual and graft hybrids between M. germanica and Crataegus species are known, and have been described as the nothogenera ×Crataemespilus E.G.Camus and +Crataegomespilus Simon-Louis ex Bellair, respectively (Byatt et al. 1977; Baird & Thieret 1989). The Crataegus parents of the sexual hybrids, Crataegus ×grandiflora (Smith) E.G.Camus and C. ×gillotii Beck, are inferred to be, respectively, C. laevigata and C. monogyna. In the diploid ×C. grandiflora, pollen meiosis is disturbed, and pollen viability is around 5%, in contrast with viability in excess of 95% for all three parental diploids (Byatt et al. 1977). These results are more extreme than those from studies of hybridization between the introduced C. monogyna and North American diploid Crataegus species (Love & Feigen 1978; Wells & Phipps 1989) in which the pollen stainability of putative hybrids was typically greater than 40% (pollen stainability of the parental species was 80-95%).

A scenario that would account for the known facts can be outlined as follows. At some time, probably in the nineteenth century, pollen from cultivated M. germanica was transferred to stigmas of C. brachyacantha, resulting in hybrid seed formation. Hybrid individuals grew to maturity but were infertile, due to irregular meiosis as in C. ×grandiflora. Under these circumstances only occasional seeds were set, and these resulted from the fertilization of unreduced female gametes of the primary hybrid by reduced male gametes from the pollen of either M. germanica or a native, diploid (and probably red-fruited) Crataegus species. Such a scenario is at least as plausible as an origin for M. canescens as an autotriploid from a now extinct species of Mespilus that persisted in North America since the divergence of Mespilus and Crataegus. Recognition of what has up to now been known as M. canescens as a nothospecies of ×Crataemespilus thus would seem warranted on the basis of the molecular results obtained here if it were not for the question, discussed below, of whether to maintain Mespilus as a genus distinct from Crataegus.
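The ploidy bookkeeping implied by this scenario can be checked with a short sketch. It assumes a base chromosome number of x = 17, the value usually cited for this group but not stated in the text above, and the function names are hypothetical; it only counts chromosome sets and says nothing about how often unreduced gametes arise.

```python
# Assumption: base chromosome number x = 17 (not given in the text above).
BASE_NUMBER = 17


def gamete_sets(parent_ploidy, reduced=True):
    """Chromosome sets carried by a gamete of a parent with the given ploidy."""
    if not reduced:
        # An unreduced gamete carries the full somatic complement.
        return parent_ploidy
    if parent_ploidy % 2:
        raise ValueError("regular reduction assumed only for even ploidy")
    return parent_ploidy // 2


def offspring_ploidy(female, male, female_reduced=True, male_reduced=True):
    """Ploidy of a zygote formed from one female and one male gamete."""
    return gamete_sets(female, female_reduced) + gamete_sets(male, male_reduced)


# Step 1: diploid C. brachyacantha (female) x diploid M. germanica (male).
primary_hybrid = offspring_ploidy(2, 2)

# Step 2: unreduced egg of the primary hybrid fertilized by reduced pollen
# from a diploid parent (M. germanica or a diploid Crataegus species).
backcross = offspring_ploidy(primary_hybrid, 2, female_reduced=False)

print(primary_hybrid, backcross, backcross * BASE_NUMBER)
```

Under these assumptions the primary hybrid is diploid and the second-generation plant is triploid, consistent with the triploid condition reported for M. canescens.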

2.4.4. Re-evaluation of generic limits

After removing conflicts due to hybridization or other factors, the analyses of the combined nuclear and chloroplast sequence data (Fig. 2.5) suggest that C. brachyacantha is sister to the remaining Crataegus species rather than to M. germanica. This raises the question of whether M. germanica should be included within Crataegus, since there appear to be fewer differences between these two genera than between them and their sister genus, Amelanchier (Table 2.2).

For the characters that were suggested earlier as distinguishing Mespilus (hence M. canescens) from Crataegus (Table 2.2, Appendix 2), a closer examination suggests that these two genera are more similar than has been acknowledged previously (Table 2.2).

Differences between the Mespilus-Crataegus clade and Amelanchier include the timing of replacement growth on fertile short shoots, disposition of the ovules within the locule, and composition of the mature fruit and seeds (Aldasoro et al. 2005; Table 2.2, Appendix 2). Some of the characters that might provide synapomorphies for Crataegus (relative to Mespilus; Phipps 1990) in fact vary within Crataegus, such as short shoot leaf margination, shape, and venation pattern, the numbers of flowers per inflorescence, and number of stamens per flower (Table 2.2). Only whether the petals are notched apically (emarginate) and the way in which the apices of the pyrenes are, or are not, exposed in the fruit really distinguish the two species currently ascribed to Mespilus from Crataegus.

Although only about a quarter or fewer of species in Crataegus are included in our sample, out of the 15 sections in the genus we have covered all but section Cuneatae Rehder ex Schneider (eastern Asia), and have at least two individuals from different localities for most species (Appendix 1). The SH test of the combined nuclear and chloroplast data did not reject the hypothesis of Crataegus and Mespilus being two distinct lineages, but only when M. canescens was removed from the dataset. Since a hybrid origin of M. canescens is plausible and justifies its removal from the phylogenetic analyses, Crataegus and Mespilus can still be treated as two distinct genera. However, because the number of morphological differences supporting the branch between the Mespilus-Crataegus clade and Amelanchier is considerably greater than those distinguishing Mespilus and Crataegus from each other, it seems more reasonable to sink the smaller genus in the larger one and create a new section to accommodate it. Accordingly, we make the following new combinations, and a new nothosection to accommodate one of them.

Crataegus Linnaeus sect. Mespilus T.A.Dickinson & E.Y.Y.Lo stat. nov., comb. nov. (Mespilus Linnaeus in Sp. Pl. 1: 478, 1753; Gen. Pl., ed. 5: 549, 1754)

Ab omnibus sectionibus alteris Crataegi differt fructibus apicibus pyrenarum omnino tectis, necnon coniunctione foliorum venatione semicraspedodroma, non lobatorum, inflorescentiarum 1(-2)-florarum, atque staminum 30(-40) in quoque flore.

Deciduous trees or shrubs to 10 m. Bark gray-brown on young branches, becoming gray with age. Shoots dimorphic, lateral short shoots sympodial, sometimes developing as aphyllous thorns especially in wild genotypes; borne on twigs 2 years or more old, each bearing 5–7 or more preformed leaves, and often inflorescences; long shoots with both preformed and neoformed leaves. Leaves alternate, spirally arranged, simple, 5–10 (–15) cm long, 3–5 cm wide; stipules deciduous, distinct; petioles present. Leaf blade pinnately veined, secondary venation semi-craspedodromous. Leaf blades elliptic, apex pointed. Inflorescences terminal on short shoots, comprising 1 (–2) flowers. Flowers 3–5 cm across when open, bisexual, pentamerous, epigynous; sepals 5; petals 5, sometimes emarginate apically; stamens 30 (–40), pistil 1, ovary inferior, (4–) 5-locular; placentation axile; ovules 2 per locule, anatropous, apitropic, initiated collaterally at base of locule but becoming superposed, with a single funicular obturator adjacent to the micropyle of the lower ovule; styles (4–) 5, stigmas wet-papillate. Fruits polypyrenous drupes (“pomes”), brown at maturity, 1.5–3 cm in diameter (–7 cm in cultivated genotypes), completely enclosing 5 1-seeded pyrenes, the free portion of the hypanthium forming a low wall around the disk almost as wide as the fruit itself, the calyx lobes typically erect; seed coat membranous; endosperm present, thin at maturity; embryo straight, as long as seed; cotyledons flat. One species, C. germanica (L.) K.Koch.

Crataegus nothosection Phippsara T.A.Dickinson & E.Y.Y.Lo, nothosect. nov. (Crataegus sect. Mespilus × Crataegus sect. Brevispinae Beadle ex C.K.Schneid. × unknown section)

Deciduous trees or shrubs to 7 m. Bark gray-brown on young branches, flaking with age on the trunk. Shoots dimorphic, lateral short shoots sympodial, occasionally developing as aphyllous thorns; borne on twigs 2 years or more old, each bearing 5–7 or more preformed leaves, and often inflorescences; long shoots with both preformed and neoformed leaves. Leaves alternate, spirally arranged, simple, 3–5 cm long, (1–) 1.5–2 cm wide, canescent; stipules deciduous, distinct; petioles present. Leaf blade pinnately veined, secondary venation semi-craspedodromous. Leaf blades elliptic, apex pointed. Inflorescences flat-topped panicles usually terminating short shoots, pubescent, and comprising (2–) 5–10 flowers. Flowers 1.5–2 cm across when open, bisexual, pentamerous, epigynous; sepals 5; petals 5, emarginate apically, acquiring an orange tinge upon drying; stamens 20, pistil 1, ovary inferior, (4–) 5-locular; placentation axile; ovules 2 per locule, superposed; styles (4–) 5. Fruits polypyrenous drupes (“pomes”), red at maturity, 1–1.5 cm in diameter, completely enclosing 5 1-seeded pyrenes. One species, C. ×canescens (J.B.Phipps) T.A.Dickinson & E.Y.Y.Lo.

Etymology: James B. Phipps is the preeminent North American student of hawthorns and medlars. His energetic fieldwork and detailed revisionary studies have provided a wealth of new information about these plants as they occur across the continent.

Crataegus ×canescens (J.B.Phipps) T.A.Dickinson & E.Y.Y.Lo comb. nov. – Mespilus canescens J.B.Phipps Syst. Bot. 15: 26-32. Lawrence, Kansas 1990.

Crataegus nothosection Crataemespilus (E.G.Camus) T.A.Dickinson & E.Y.Y.Lo stat. nov., comb. nov. – ×Crataemespilus E.G.Camus Journal de Botanique 13: 326, Paris 1899.

Crataegus ×gillotii (Beck) T.A.Dickinson & E.Y.Y.Lo comb. nov. – ×Crataemespilus gillotii Beck Icones Florae Germanicae et Helveticae 25: 30, t. 107, Leipzig, Gera 1914.

The combination Crataegus ×grandiflora K.Koch has already been made and, under Article 4 of the International Code of Nomenclature for Cultivated Plants (Brickell et al. 2004), scientific names for graft-chimaeras below the rank of genus are unnecessary (they may be named as cultivars). Because the transfer of Crataegus species to Mespilus has already been made (Scopoli 1772), we have elsewhere (Talent et al. in press) proposed conservation of Crataegus over Mespilus in the interest of nomenclatural stability. There are potentially hundreds of new combinations that would be required if the phylogenetic results obtained here are to inform taxonomy and Crataegus is not conserved over Mespilus.

To conclude, molecular and morphological data indicate no clear genetic distinction between Crataegus and Mespilus. Although there is a certain arbitrariness in the assignment of taxonomic rank, we believe that the taxonomic solution that best reflects both the molecular phylogeny and the morphological data, as well as causing minimum disruption of existing nomenclature, is to sink the genus Mespilus in Crataegus as a new, monotypic section. Mespilus canescens is readily accommodated as an intersectional hybrid named as a nothospecies in a new nothosection ×Phippsara. Together with a monotypic section Brevispinae, these realignments combine a phylogenetic basis for the classification of hawthorns and medlars with the greatest nomenclatural stability.

2.5. Acknowledgements

The authors thank Nadia Talent and Cheryl Smith for flow cytometry advice; Simona Margaritescu and Jean-Marc Moncalvo for assistance with sequencing; Fannie Gervais for DNA extractions; Rhoda Love, Steve Brunsfeld, Christopher S. Reid, and Peter Zika for plant collection and identification; Nadia Talent, Sophie Nguyen, Melissa Purich, Fannie Gervais, Theo Witsell, Demetra Kandalepas, Jörg Trau, Adam Dickinson, and John Dickinson for help with field work; and Tray Lewis (Louisiana), the Arkansas Natural Heritage Commission, The Nature Conservancy, the Toronto and Region Conservation Authority, the Arnold Arboretum of Harvard University, Jardin Botanique de Montréal, the Morton Arboretum, the University of California Botanical Garden, the North Carolina Arboretum, the North Carolina Botanical Garden, and the Royal Botanical Gardens (Burlington, Ontario) for access to trees. Jenny Bull was invaluable in helping to organize the vouchers for this study. Cary Gilmour demonstrated to TAD the efficacy of x-ray imaging for documenting leaf venation pattern, and Patricia Ross prepared most of the leaf x-rays studied here. We are also indebted to Christopher Campbell for samples of Amelanchier bartramiana genomic DNA. Kayri Havens kindly made the unpublished data of McCue et al. available to us. Rodger Evans and Christopher Campbell discussed our results with us and generously commented on drafts of the manuscript, as did Nadia Talent to whom we are indebted for advice on the nomenclatural issues arising from our results. Comments from two anonymous reviewers, as well as those of the editors, further improved the presentation of our results. Financial support from the Natural Sciences and Engineering Research Council of Canada (grant A3430 to TAD), the Botany Department of the University of Toronto, and the Royal Ontario Museum is gratefully acknowledged, as is an award from the Royal Ontario Museum Foundation to Mark Engstrom and TAD for the purchase of the thermocycler used in this work.

2.6. References

Aldasoro JJ, Aedo C, Navarro C 2005. Phylogenetic and phytogeographical relationships in Maloideae (Rosaceae) based on morphological and anatomical characters. Blumea 50: 3-32.
Archambault A, Bruneau A 2001. How useful is the LEAFY gene in the phylogenetic reconstruction in the Caesalpinioideae. Am. J. Bot. 88: 97-108.
Bailey CD, Carr TG, Harris SA, Hughes CE 2003. Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes. Mol. Phylogenet. Evol. 29: 435-455.
Baird JR, Thieret JW 1989. The Medlar (Mespilus germanica, Rosaceae) from antiquity to obscurity. Econ. Bot. 43: 328-372.
Blazquez MA, Soowal LN, Lee I, Weigel D 1997. LEAFY expression and flower initiation in Arabidopsis. Development 124: 3835-3844.
Bortiri E, Oh SH, Gao FA, Potter D 2002. The phylogenetic utility of nucleotide sequences of sorbitol 6-phosphate dehydrogenase in Prunus (Rosaceae). Am. J. Bot. 89: 1697-1708.
Brickell CD, Baum BR, Hetterscheid WLA, Leslie AC, McNeill J, Trehane P, Vrugtman F, Wiersema JH 2004. International Code of Nomenclature for Cultivated Plants. Acta Horticulturae 647.

Byatt JI, Ferguson IK, Murray BG 1977. Intergeneric hybrids between Crataegus L. and Mespilus L.: a fresh look at an old problem. Bot. J. Linn. Soc. 74: 329-343.
Campbell CS, Baldwin BG, Donoghue MJ, Wojciechowski MF 1995. Phylogenetic relationships in Maloideae (Rosaceae): Evidence from sequences of the internal transcribed spacers of nuclear ribosomal DNA and its congruence with morphology. Am. J. Bot. 82: 903-918.
____, Evans RC, Morgan DR, Dickinson TA, Arsenault MP 2007. Phylogeny of subtribe Pyrinae (formerly the Maloideae, Rosaceae): Limited resolution of a complex evolutionary history. Pl. Syst. Evol. 266: 119-145.
____, Greene CW, Bergquist SE 1987. Apomixis and sexuality in three species of Amelanchier, Shadbush (Rosaceae, Maloideae). Am. J. Bot. 74: 321-328.
____, Greene CW, Dickinson TA 1991. Reproductive biology in subfamily Maloideae (Rosaceae). Syst. Bot. 16: 333-349.

Candolle AP 1825. Rosaceae trib. VIII. Pomaceae Juss. Pp. 626-639 in Prodromus systematis naturalis regni vegetabilis, vol. 2. Paris.
Chase MW, Soltis DE, Olmstead RG, Morgan D, Les DH, Mishler BD, Duvall MR, Price RA, Hills HG, Qiu YL, Kron KA, Rettig JH, Conti E, Palmer JD, Manhart JR, Sytsma KJ, Michaels HJ, Kress WJ, Karol KG, Clark WD, Hedren M, Gaut BS, Jansen RK, Kim KJ, Wimpee CF, Smith JF, Furnier GR, Strauss SH, Xiang QY, Plunkett GM, Soltis PS, Swensen SM, Williams SE, Gadek PA, Quinn CJ, Eguiarte LE, Golenberg E, Learn GH, Graham SW, Barrett SCH, Dayanandan S, Albert VA 1993. Phylogenetics of seed plants: An analysis of nucleotide sequences from the plastid gene rbcL. Ann. Missouri Bot. Gard. 80: 528-580.
Christensen KI 1992. Revision of Crataegus section Crataegus and Nothosect. Crataeguineae (Rosaceae-Maloideae) in the Old World. Syst. Bot. Mono. 35: 1-199.
Decaisne MJ 1874. Mémoire sur la famille des Pomacées. Nouv. Arch. Mus. Hist. Nat. Paris 10: 113-192, plates I-XV.
Dickinson TA 1985. The biology of Canadian weeds, 68. Crataegus crus-galli L. sensu lato. Can. J. Pl. Sci. 65: 641-654.
____, Phipps JB 1986. Studies in Crataegus (Rosaceae, Maloideae), XIV. The breeding system of Crataegus crus-galli sensu lato in Ontario. Am. J. Bot. 73: 116-130.
____, Belaousoff S, Love RM, Muniyamma M 1996. North American black-fruited hawthorns I. Variation in floral construction, breeding system correlates, and their possible evolutionary significance in Crataegus sect. Douglasii Loudon. Folia Geobotanica 31: 355-371.
Doyle JJ 1992. Gene trees and species trees: Molecular systematics as one-character taxonomy. Syst. Bot. 17: 144-163.
____, Doyle JL 1987. A rapid DNA isolation procedure for small quantities of fresh leaf material. Phytochemistry Bulletin 19: 11-15.
Eriksson T, Hibbs MS, Yoder AD, Delwiche CH, Donoghue MJ 2003. The phylogeny of Rosoideae (Rosaceae) based on sequences of the internal transcribed spacers (ITS) of nuclear ribosomal DNA and the trnL/F region of chloroplast DNA. Int. J. Pl. Sci. 164: 197-211.
Evans RC, Campbell CS 2002. The origin of the apple subfamily (Maloideae; Rosaceae) is clarified by DNA sequence data from duplicated GBSSI genes. Am. J. Bot. 89: 1478-1484.

____, Dickinson TA 2005. Floral ontogeny and Morphology in Gillenia ("Spiraeoideae") and Subfamily Maloideae C. Weber (Rosaceae). Int. J. Pl. Sci. 166: 427-447.
____, Alice LA, Campbell CS, Kellogg EA, Dickinson TA 2000. The Granule-Bound Starch Synthase (GBSSI) gene in the Rosaceae: Multiple loci and phylogenetic utility. Mol. Phylogenet. Evol. 17: 388-400.
Felsenstein J 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27: 401-410.
____ 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783-791.
Frohlich MW, Meyerowitz EM 1997. The search for flower homeotic gene homologs in basal Angiosperms and Gnetales: a potential new source of data on the evolutionary origin of flowers. Int. J. Pl. Sci. 158: S131-S142.
Golenberg EM, Clegg MT, Durbin ML, Doebley J, Ma DP 1993. Evolution of a noncoding region of the chloroplast genome. Mol. Phylogenet. Evol. 2: 52-64.
Goulding SE, Olmstead RG, Morden CW, Wolfe KH 1996. Ebb and flow of the chloroplast inverted repeat. Mol. Gen. Genet. 252: 195-206.
Grob GBJ, Gravendeel B, Eurlings MCM 2004. Potential phylogenetic utility of the nuclear FLORICAULA/LEAFY second intron: comparison with three chloroplast regions in Amorphophallus (Araceae). Mol. Phylogenet. Evol. 30: 13-23.
Guillon JM 2004. Phylogeny of Horsetails (Equisetum) based on the chloroplast rps4 gene and adjacent noncoding sequences. Syst. Bot. 29: 251-259.
Hamilton MB 1999. Four primer pairs for the amplification of chloroplast intergenic regions with intraspecific variation. Mol. Ecol. 8: 513-525.
Huelsenbeck JP, Ronquist F 2001. MrBayes: a program for the Bayesian inference of phylogeny. Bioinformatics 17: 754-755.

Koehne E 1890. Die Gattungen der Pomaceen. Wissenschaftliche Beilage zum Programm des Falk-Realgymnasiums. Ostern, 1890. Programm Nr. 95. Berlin, R. Gaertner.

Leaf Architecture Working Group 1999. Manual of Leaf Architecture - morphological description and categorization of dicotyledonous and net-veined monocotyledonous angiosperms. Smithsonian Institution, Washington DC.
Lee S, Wen J 2001. A phylogenetic analysis of Prunus and the Amygdaloideae (Rosaceae) using ITS sequences of nuclear ribosomal DNA. Am. J. Bot. 88: 150-160.
Lindley J 1822. Observations on the natural group of plants called Pomaceae. Philos. Trans. R. Soc. London [Biol] 13: 88-106.
Love R, Feigen M 1978. Interspecific hybridization between native and naturalized Crataegus (Rosaceae) in western Oregon. Madroño 25: 211-217.
Maddison WP, Maddison DR 1992. MacClade, version 3.01. Sunderland: Sinauer Associates.
Medicus FC 1793. Geschichte der Botanik unserer Zeiten. Mannheim, Schwan und Gotz.
Morgan DR, Soltis DE, Robertson KR 1994. Systematic and evolutionary implications of rbcL sequence variation in Rosaceae. Am. J. Bot. 81: 890-903.
Müller K 2005. Incorporating information from length-mutational events into phylogenetic analysis. Mol. Phylogenet. Evol. 38: 667-676.
Oh SH, Potter D 2003. Phylogenetic utility of the second intron of LEAFY in Neillia and Stephanandra (Rosaceae) and implication for the origin of Stephanandra. Mol. Phylogenet. Evol. 29: 203-215.
____ 2005. Molecular phylogenetic systematics and biogeography of tribe Neillieae (Rosaceae) using DNA sequences of cpDNA, rDNA, and LEAFY. Am. J. Bot. 92: 179-192.
Olmstead RG, Palmer JD 1994. Chloroplast DNA systematics: A review of methods and data analysis. Am. J. Bot. 81: 1203-1224.
Palmer EJ 1925. Synopsis of North American Crataegi. J. Arnold Arboretum 6: 5-128.
Phipps JB 1990. Mespilus canescens, a new Rosaceous endemic from Arkansas. Syst. Bot. 15: 26-32.
____ 1998. Synopsis of Crataegus series Apiifoliae, Cordatae, Microcarpae, and Brevispinae (Rosaceae subfam. Maloideae). Ann. Missouri Bot. Gard. 85: 475-491.
____ 2005. A review of hybridization in North American hawthorns – another look at “The Crataegus problem.” Ann. Missouri Bot. Gard. 92: 113-126.
____, Robertson KR 1990. A checklist of the subfamily Maloideae (Rosaceae). Can. J. Bot. 68: 2209-2269.

____, O'Kennon RJ, Lance RW 2003. Hawthorns and medlars. Timber Press, Portland OR.
____, Weeden NF, Dickson EE 1991. Isozyme evidence for the naturalness of Mespilus L. (Rosaceae, subfam. Maloideae). Syst. Bot. 16: 546-552.
Potter D, Gao F, Bortiri PE, Oh SH, Baggett S 2002. Phylogenetic relationships in Rosaceae inferred from chloroplast matK and trnL-trnF nucleotide sequence data. Pl. Syst. Evol. 231: 77-89.
Posada D, Crandall KA 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14: 817-818.
Rambaut A 2002. Se-Al Sequence Alignment Editor v2.0a11. Oxford: University of Oxford.
Robertson KR, Phipps JB, Rohrer JR 1992. Summary of leaves in the genera of Maloideae (Rosaceae). Ann. Missouri Bot. Gard. 79: 81-94.

Rohrer JR, Robertson KR, Phipps JB 1991. Variation in structure among fruits of Maloideae (Rosaceae). Am. J. Bot. 78: 1617-1635.

____ 1994. Floral morphology of Maloideae (Rosaceae) and its systematic relevance. Am. J. Bot. 81: 574-581.
Sang T, Crawford DJ, Stuessy TF 1997. Chloroplast DNA phylogeny, reticulate evolution, and biogeography of Paeonia (Paeoniaceae). Am. J. Bot. 84: 1120-1136.
Sargent CS 1905. Manual of the Trees of North America. Boston, Houghton Mifflin.
Schneider CK 1906. Illustrierte Handbuch der Laubholzkunde. Jena, Gustav Fischer.

Scopoli JA 1772. Flora Carniolica. J. P. Krauss, Vienna.
Shimodaira H, Hasegawa M 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16: 1114-1116.
Simmons MP, Ochoterena H 2000. Gaps as characters in sequence-based phylogenetic analyses. Syst. Biol. 49: 369-381.

Smith JE 1800. Flora Britannica, London.
____ 1819. Mespilus in: The Cyclopaedia; or, Universal Dictionary of Arts, Sciences and Literature. Rees, A., London.
Smith PG, Phipps JB 1988. Studies in Crataegus (Rosaceae, Maloideae), XIX. Breeding behavior in Ontario Crataegus series Rotundifoliae. Can. J. Bot. 66: 1914-1923.
Swofford DL 2002. PAUP*: Phylogenetic analysis using parsimony (*and other methods), version 4.0b10. Sunderland: Sinauer Associates.
Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae): evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83: 1268-1304.
____, Eckenwalder JE, Lo E, Christensen KI, Dickinson TA. In press. Proposal to conserve the name Crataegus L. against Mespilus L. (Rosaceae). Taxon.
Thompson JD, Higgins DG, Gibson TJ 1994. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties, and weight matrix choice. Nucleic Acids Res. 22: 4673-4680.
Tsumura Y, Yoshimura K, Tomaru N, Ohba K 1995. Molecular phylogeny of conifers using RFLP analysis of PCR-amplified specific chloroplast genes. Theoret. Appl. Genet. 91: 1222-1236.
Vaillancourt RE, Jackson HD 2000. A chloroplast DNA hypervariable region in eucalypts. Theoret. Appl. Genet. 101: 473-477.
Wada M, Cao QF, Kotoda N, Soegima JI, Masuda T 2002. Apple has two orthologues of FLORICAULA/LEAFY involved in flowering. Pl. Mol. Biol. 49: 567-577.
Wells TC, Phipps JB 1989. Studies in Crataegus (Rosaceae: Maloideae). XX. Interserial hybridization between Crataegus monogyna (series Oxyacanthae) and Crataegus punctata (series Punctatae) in southern Ontario. Can. J. Bot. 67: 2465-2472.
White TJ, Bruns T, Lee S, Taylor J 1990. Amplification and direct sequencing of fungal ribosomal genes for phylogenies. In PCR protocols: A guide to methods and applications, eds. M. Innis, D. Gelfand, J. Sninsky, and T. White. San Diego: Academic Press. Pp. 315-322.
Xu DH, Abe J, Sakai M, Kanazawa A, Shimamoto Y 2000. Sequence variation of non-coding regions of chloroplast DNA of soybean and related wild species and its implications for the evolution of different chloroplast haplotypes. Theoret. Appl. Genet. 101: 724-732.
Zurawski G, Clegg MT 1987. Evolution of higher-plant chloroplast DNA-encoded genes: Implications for structure-function and phylogenetic studies. Ann. Rev. Pl. Physiol. 38: 391-418.


Table 2.1 Comparison of sequence variation in Crataegus, Mespilus, and outgroups for the two nuclear and four chloroplast regions. NR = nuclear sequences; CP = plastid sequences; PI = parsimony informative; MPT = most parsimonious tree; C.I. = consistency index; R.I. = retention index.

| | ITS (NR) | LEAFY (NR) | combined nr | trnG-trnS (CP) | psbA-trnH (CP) | trnH-rpl2 (CP) | rpl20-rps12 (CP) | combined cp | NR + CP combined |
|---|---|---|---|---|---|---|---|---|---|
| Number of sequences | 156 | 156 | 156 | 82 | 82 | 82 | 82 | 82 | 77 |
| Number of characters | 671 | 696 | 1367 | 719 | 344 | 287 | 736 | 2085 | 3452 |
| GC content (%) | 66.3 | 43.25 | 64.42 | 30 | 27 | 31 | 39 | 30.2 | 37 |
| Number of variable characters | 203 | 282 | 539 | 70 | 41 | 37 | 54 | 211 | 726 |
| Number of PI characters with outgroup | 168 | 205 | 436 | 54 | 21 | 18 | 34 | 125 | 552 |
| Number of PI characters without outgroup | 151 | 93 | 243 | 41 | 18 | 13 | 33 | 105 | 272 |
| Number of observed PI indels | 13 | 8 | 21 | 8 | 7 | 0 | 4 | 19 | 40 |
| Divergence range within ingroup (%) | 0.31-8.11 | 0.23-8.56 | 0.34-6.67 | 0-2.71 | 0-3.01 | 0-2.79 | 0-1.94 | 0-2.56 | 0-3.45 |
| Divergence range between ingroup and outgroup (%) | 4.91-9.64 | 1.13-34.29 | 3.86-21.91 | 1.99-4.33 | 1.53-3.95 | 1.73-4.86 | 1.12-2.63 | 2.21-5.85 | 2.76-11.54 |
| Divergence within genus Crataegus (%) | 0.3-9.2 | 0.6-9.1 | - | - | - | - | - | 0.44-3.32 | - |
| Divergence within genus Mespilus (%) | 6.25 | 1.7-2.2 | - | - | - | - | - | 1.45 | - |
| Number of MPTs | 2,761 | 27,684 | 970 | 1,366 | 31,500 | 30,300 | 30,627 | 18,432 | 29,113 |
| Tree length | 509 | 411 | 917 | 98 | 89 | 322 | 85 | 335 | 1141 |
| C.I. | 0.73 | 0.82 | 0.74 | 0.93 | 0.64 | 0.6 | 0.84 | 0.75 | 0.81 |
| R.I. | 0.91 | 0.94 | 0.89 | 0.97 | 0.84 | 0.78 | 0.92 | 0.87 | 0.89 |

Table 2.2 Morphological variation, ploidy level, and geographic distribution (Characters 1-11, Appendix 2) as they are expressed in Amelanchier, Mespilus, and exemplar species of Crataegus (Appendix 1 and Fig. 2.1-2.4). In bold, character states that apply to the genus as a whole. NA, character not applicable; ND, no data. Data from field observations and herbarium specimens, Robertson et al. (1992), Phipps et al. (2003).

1. The secondary venation of this taxon is inadequately described by the term camptodromous since secondary veins frequently lead to the margin, but form nodes just below the sinuses between the marginal crenations.

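| Group | Taxon | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Outgroup, Amelanchier | A. arborea | 1 | 0 | 0 | NA | 0 | 1 | 1 | 0(1/2) | 4 | 0 | 1 |
| Outgroup, Amelanchier | A. bartramiana | 1 | 0 | 0 | NA | 0 | 1 | 1 | 0 | 4 | 0 | 1 |
| Outgroup, Aronia | A. arbutifolia | 0 | 0 | 0 | NA | 0 | 2 | 1 | 0/1/2/3 | 1 | 0/2 | 1 |
| Outgroup, Malus | M. angustifolia | 0 | 0 | 0 | NA | 0 | 0 | 1 | 0 | 2 | ND | 1 |
| Ingroup, Mespilus | M. germanica | 0 | 1 | 1 | 0 | 1 | 1/2 | 0 | 0 | 0 | 0 | 3 |
| Ingroup, Mespilus | M. canescens | 0 | 1 | 1 | 0 | 0 | 0/1/2 | 1 | 0 | 1 | 1 | 1 |
| Ingroup, Crataegus | C. brachyacantha | 0 | 1 | 1 | 1 | 0 | 2 (note 1) | 1 | 0 | 4 | 0 | 0 |
| Crataegus, clade B | C. monogyna | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 4 | 1 | 0(1) | 3 |
| Crataegus, clade B | C. pentagyna | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 4 | 0 | 3 |
| Crataegus, clade B | C. pinnatifida | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0(1/2) | 1 | 0(1/2/3) | 2 |
| Crataegus, clade D | C. calpodendron | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1/2 | 1 | 0(2/3) | 1 |
| Crataegus, clade D | C. crus-galli | 0 | 1 | 1 | 1 | 0 | 0/1 | 2(1) | (0/1)/2/3/4 | 1 | (0/1)/2 | 1 |
| Crataegus, clade D | C. opaca | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0/1/2 | 1 | 0 | 1 |
| Crataegus, clade D | C. mexicana | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 2 | 0 | 1 |
| Crataegus, clade D | C. phaenopyrum | 0 | 1 | 1 | 1 | 0 | 0/1 | 1 | 0/1/2 | 1 | 1/2 | 1 |
| Crataegus, clade D | C. triflora | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0/1/2 | 1 | 0/1/2 | 1 |
| Crataegus, clade D | C. uniflora | 0 | 1 | 1 | 1 | 1 | 0/1 | 1 | 0/1/2 | 1(2) | 1 | 1 |
| Crataegus, clade D | C. viridis | 0 | 1 | 1 | 1 | 0 | 0/1 | 1 | 0 | 1 | 0/1 | 1 |
| Crataegus, clade E | C. chlorosarca | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 4 | 0 | 2 |
| Crataegus, clade E | C. nigra | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0/1 | 4 | 0 | 3 |
| Crataegus, clade E | C. saligna | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 4 | 0 | 0 |
| Crataegus, clade E | C. suksdorfii sensu lato | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0(1/2) | 4 | 0/1/2 | 0 |
| Crataegus, clade E | C. sanguineae | 0 | 1 | 1 | 1 | 0 | 0 | 1 | (0/1)2 | 1 | 0(1/2) | 2 |
| Crataegus, clade E | C. wilsonii | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 2/3 | 1 | 0 | 2 |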

Figure 2.1 Strict consensus trees, from maximum parsimony (MP) analyses of (a) ITS1-5.8S-ITS2 (2761 trees) and (b) LEAFY second intron sequence data (27684 trees). Nodes with bootstrap (BS; above branches) and Bayesian posterior probability (BI; below branches) values >50% are indicated. In (a) Amelanchier, Malus, and Aronia are used as outgroups, while in (b) Amelanchier is the only outgroup because of the extreme divergence in Malus and Aronia (details in text). Each branch represents a sequence obtained from at least three clones of an individual. Sectional affiliations of the Crataegus taxa (Phipps et al. 1990) are indicated by thick lines on the right, while species of Mespilus and three monotypic sections of Crataegus are indicated by thin lines. Major clades are labeled as A (C. brachyacantha and Mespilus species); B (taxa of sections Crataegus and Hupehensis); C (C. marshallii, C. phaenopyrum, and C. spathulata); D (taxa of eastern North American sections); and E (C. saligna and taxa of sections Douglasianae and Sanguineae).

[Figure 2.1 image: (a) ITS and (b) LEAFY strict consensus trees, with sections and clades A-E labeled on the right; tree graphics not reproduced.]

Figure 2.2 Strict consensus of 790 maximum parsimony (MP) trees from the combined analysis of ITS and LEAFY second intron data. Nodes with bootstrap (BS; above branch) and posterior probability (BI; below branch) values >50% are indicated. Species, sections, and genera (Phipps and Robertson 1990) are listed on the right. Labels of clades A-E as in Fig. 2.1.

[Figure 2.2 image: NUC-MP tree, with species, sections, and clades A-E labeled on the right; tree graphics not reproduced.]

Figure 2.3 Strict consensus of 18,432 equally parsimonious trees from the maximum parsimony (MP) analysis of the combined trnG-trnS, psbA-trnH, trnH-rpl2, and rpl20-rps12 data. Nodes with bootstrap (BS; above branch) and posterior probability (BI; below branch) values >50% are indicated. Species, sections, and genera (Phipps and Robertson 1990) are listed on the right. Clade labels A-E follow Figure 2.1.

[Figure 2.3 image: CP-MP tree, with species, sections, clades A-E, and subclades C1-C3 labeled; tree graphics not reproduced.]

Figure 2.4 The maximum likelihood (ML) trees of the combined nuclear (a) and chloroplast (b) data, generated by PAUP* using the TIM and GTR models, respectively. For the nuclear data, lnL = -5894.64, I = 0.52, and G = 1.18; for the chloroplast data, lnL = -4228.18, I = 0.65, and G = 0.42. Nodes with bootstrap (BS) values >50% are indicated above branches.

[Figure 2.4 image: (a) NUC-ML and (b) CP-ML trees, with clades A-E labeled; scale bars 0.005 substitutions/site; tree graphics not reproduced.]

Figure 2.5 Trees based on combined nuclear and chloroplast data generated by (a) maximum parsimony (MP) and (b) maximum likelihood (ML), using the TVM model with lnL = -11,001.39, I = 0.51, and G = 0.57. In (a), bootstrap (BS; above branch) and posterior probability (BI; below branch) values >50% are indicated; the dotted line represents the branch that collapses in the maximum likelihood analysis. In (b), nodes with bootstrap (BS) values >50% are indicated above branches. Crataegus marshallii, C. phaenopyrum, C. spathulata, and Mespilus canescens were omitted from the analyses because of their conflicting positions in the nuclear (Fig. 2.4a) and chloroplast (Fig. 2.4b) trees.

[Figure 2.5 image: (a) MP and (b) ML trees from the combined nuclear and chloroplast data, with sections and clades A-E labeled; scale bar 0.005 substitutions/site; tree graphics not reproduced.]

Chapter 3 Evidence for genetic association between East Asian and Western North American Crataegus L. (Rosaceae) and rapid divergence of Eastern North American lineages based on multiple DNA sequences.

Abstract. Biogeographic relationships were reconstructed for the Old and New World Crataegus species using a combination of chloroplast (trnG-trnS, psbA-trnH, trnH-rpl2, rpl20-rps12 spacers) and nuclear regions (ITS, LEAFY introns, 3’-PISTILLATA, and 5’-PEPC). Maximum parsimony, maximum likelihood, and Bayesian analyses yield congruent relationships among major lineages. The close association between the East Asian (section Sanguineae) and western North American (series Cerrones and Douglasianae) species points towards an ancient trans-Pacific movement. The relationships among eastern North American species are poorly resolved, and no clear groups consistent with the existing classification are identified. Scarce variation and short internal branches among these species suggest genetic bottlenecks, hybridization, and/or rapid divergence potentially associated with polyploidy, which could lead to extensive homoplasy in morphological characters. Incongruence between the chloroplast and nuclear data, together with morphology, suggests hybrid origins of C. marshallii, C. phaenopyrum, and C. spathulata, which were potentially derived from European and North American ancestors, with the latter serving as maternal parents. Europe and eastern North America are suggested as the most recent common areas of modern Crataegus, and at least four dispersal events are inferred to explain its present distribution.

Keywords: Bering land bridge; Crataegus; DIVA; EA-WNA disjunction; hybrid origins; rapid divergence; topological incongruence.

3.1. Introduction

Hawthorns (Crataegus L.) are woody members of the Rosaceae with their earliest fossils dating from the mid-Tertiary (see review in DeVore & Pigg 2007). About 140-200 species have been described in the genus and are distributed widely in the north temperate regions of both hemispheres (Phipps et al. 1990). Phipps (1983) suggested that South China and Mexico may contain “the most primitive stock of Crataegus” based on cladistic analyses of morphological data. He hypothesized that the modern distribution of species could have resulted from exclusively trans-Beringian migrations. However, an alternative hypothesis, based on molecular and non-molecular characters (Evans & Campbell 2002), suggests that the subtribe Pyrinae (formerly Maloideae; Campbell et al. 2007), to which Crataegus belongs, originated in North America. This notion disagrees with the South China origin of Crataegus. In Crataegus, hypotheses of origin as well as proposed biogeographic migrations have never been evaluated with other types of data. In other plant groups, however, biogeographic disjunctions such as that between eastern Asia and eastern North America have been well documented by several biogeographic, ecological, paleobotanical, and phylogenetic studies in the last few decades (Graham 1972; Boufford & Spongberg 1983; Taylor 1990; Iwatsuki & Ohba 1994; Xiang et al. 1998; Guo 1999; Wen 1999; Donoghue et al. 2001). Two migratory pathways via land bridges, the North Atlantic (NALB) and Beringia (BLB), have been postulated as routes of floristic interchange between Eurasia and North America in the Tertiary that contributed to the modern global disjunct distributions of flora and fauna (Hopkins 1967; Tiffney 1985a, b; Tiffney & Manchester 2001). Crataegus is suitable for inferring the historical biogeography of the Northern Hemisphere in parallel with other such studies because of its wide distribution and rich species diversity there.

The existing classification of Crataegus divides species into 15 sections and 35 series based on geographical localities and morphologies (Phipps et al. 1990). More than 100 species representing 11 sections are found in the New World, whereas 60 or more species representing four sections are known from the Old World (Phipps & Muniyamma 1980; Phipps et al. 1990; Christensen 1992; Gu & Spongberg 2003). Taxonomy of this genus is particularly complicated for many North American species characterized by polyploidy and gametophytic apomixis (Muniyamma & Phipps 1979, 1984; Talent & Dickinson 2005, 2007). For instance, only two out of the ten species of series Cerrones and Douglasianae in western North America (WNA) are diploid (Talent & Dickinson 2005). The others have been shown to include tetraploids and reproduce by apomixis (Dickinson et al. 1996). In eastern North America (ENA), nearly two-thirds of the species are triploids or tetraploids (Talent & Dickinson 2005). Together with the diploids, these species are currently divided into eight sections based on a combination of vegetative and reproductive characters (Phipps & Muniyamma 1980; Phipps et al. 1990). However, this classification has not been tested with molecular data, and the cladistic relationships between diploid and polyploid species are unclear. In comparison with the New World species, Crataegus of the Old World seems less complicated taxonomically. Over three quarters of some 60 species in Europe (EUR), northern Africa, and East Asia (EA) are diploids (Talent & Dickinson 2005), and they are morphologically well defined (Phipps et al. 1990; Christensen 1992; Gu & Spongberg 2003). Tremendous effort has been made by earlier workers in clarifying the taxonomy of Crataegus (e.g., Phipps & Muniyamma 1980; Dickinson & Phipps 1985; Phipps et al. 1990; Christensen 1992; Dickinson et al. 1996; Dickinson 1998), but its biogeography in the context of phylogeny remains poorly understood.

Hence, in the present study we aim to (1) test biogeographic hypotheses (Phipps 1983) concerning the Old and New World (EA, ENA, WNA, and EUR) Crataegus; (2) identify the origin(s) of three ENA species that are morphologically related to European species (Phipps 1998); (3) resolve relationships among ENA species by adding three more nuclear gene regions; (4) compare relationships based on phylogenetic inference with the sectional limits of the existing morphological classification; and (5) infer changes of ploidy level and morphological characters between and within cladistic groups. Our ultimate goals are to shed light on the historical biogeography and evolution of Crataegus.

3.2. Materials and Methods

3.2.1. Taxon sampling and DNA regions used

The present sampling aims to maximize the taxonomic and geographical coverage of Crataegus. A total of 72 species representing the 15 sections from both the Old and New World are included regardless of ploidy level (Table 3.1). One to six individuals per species and at least five species per section are examined, except for the monotypic sections. Species of Amelanchier, Malus, and Aronia are used as outgroups. Samples were either field collected or obtained from botanical gardens. Herbarium vouchers were deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT) and information is provided in Appendix 4. Total DNA was extracted from either silica gel dried or frozen tissues using a small-scale modified method of Tsumura et al. (1995). Four intergenic regions of the chloroplast genome and two nuclear regions, ITS1-5.8S-ITS2 and the LEAFY second intron, which were used in Chapter 2 for less than half of the taxa studied here, are employed to obtain the fundamental phylogenetic framework of the genus.

For further resolution and statistical support among species of East Asia and North America, sequences of three additional nuclear regions, namely the LEAFY first intron, the 3’ portion of the PISTILLATA gene, and a partial PEPC gene, were obtained from 53 species representing ten sections (indicated with an asterisk in Table 3.1), together with Malus angustifolia as outgroup. The first two nuclear genes are members of the MADS-box gene family that are involved in floral and vegetative development of flowering plants. The second intron of the LEAFY gene has been used for phylogenetic reconstruction in genera of Rosaceae (Oh & Potter 2003, 2005; Chapter 2), whereas the longer first intron has not been investigated. PISTILLATA is suggested to be present as a single or low copy gene (Goto & Meyerowitz 1994). Its longest first intron has been shown to be phylogenetically more informative than the ITS and trnL regions in Brassicaceae (Bailey & Doyle 1999), but its four downstream introns are expected to be even more variable than the first intron because of the absence of transcription factor binding sites (Sieburth & Meyerowitz 1997). Lastly, the PEPC gene has been reported to have low copy number and to contain 9 introns and 10 exons in most plants (Matsuoka & Minami 1989; Lepiniec et al. 1994). Some of these introns have been shown to be phylogenetically informative in different flowering plant families (Gaskin & Schaal 2002; Olson 2002; Malcomber 2002; Helfgott & Mason-Gamer 2004; Lohmann 2006).

3.2.2. PCR amplification and sequencing

Primer sequences and PCR conditions for the chloroplast regions, nuclear ribosomal ITS, and LEAFY intron 2 were described in Chapter 2. For the other three nuclear regions, primers and PCR conditions are presented in Table 3.2. These primers were designed based on the conserved mRNA sequences of other Rosaceae species obtained from the NCBI database as potentially applicable to related genera. Chloroplast sequences were obtained by direct sequencing, whereas PCR products of most nuclear sequences were cloned using pDrive vector (QIAGEN), and at least three clones per individual were sequenced. Because triploid/tetraploid individuals may contain multiple allelic sequences within a locus when heterozygous, up to eight clones were sequenced to test for intraspecific polymorphisms. Plasmids were sequenced in both forward and reverse directions on either an ABI 377 or an ABI 3100 (Applied Biosystems) automated DNA sequencer with the DyeNamic or BigDye dye terminator cycle sequencing kits.

3.2.3. Computational analyses

3.2.3.1. Sequence alignment and variation comparisons

Sequences of the nine examined regions were aligned both jointly and separately with ClustalX (Thompson et al. 1997) and manually adjusted with the Sequence Alignment Editor version 1.d1 (SE-AL; Rambaut 2002). Gaps that are parsimony informative were coded as multistate characters with SeqState version 1.32 (Müller 2005) and appended to the sequence matrices. Pairwise divergences among taxa for chloroplast and nuclear regions were estimated using the DNADIST program of PHYLIP version 3.66c (Felsenstein 2006). The HKY85 model (Hasegawa et al. 1985), which allows unequal base frequencies and different transition and transversion rates, was used for divergence comparisons.
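The distinction between transitions and transversions that motivates the HKY85 model can be illustrated with a minimal Python sketch. It only counts raw differences between two aligned sequences and reports the uncorrected p-distance; it is not the HKY85 correction and not the DNADIST implementation, and the function name `pairwise_summary` is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pairwise_summary(seq_a, seq_b):
    """Return (p_distance, transitions, transversions) for two aligned sequences.

    Sites with a gap or ambiguity code in either sequence are skipped.
    This is an uncorrected distance, not a model-based (HKY85) estimate.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # purine<->purine or pyrimidine<->pyrimidine changes are transitions
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return (transitions + transversions) / compared, transitions, transversions
```

For example, `pairwise_summary("ACGT-A", "GCTTAA")` compares five sites (the gapped site is skipped) and finds one transition (A/G) and one transversion (G/T).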

3.2.3.2. Tree reconstructions

Phylogenetic trees were constructed for two major purposes with different datasets. The first was to infer relationships among the species from the four main biogeographic areas (EA, WNA, ENA, EUR). This was achieved by separate and combined analyses of the four cpDNA regions, ITS, and LEAFY intron 2. The second was to further resolve relationships among the East Asian and North American species. This was achieved by the addition of LEAFY intron 1, 3’ PISTILLATA gene, and partial PEPC gene sequences. All datasets were analyzed using maximum parsimony (MP) with equally weighted characters and maximum likelihood (ML) approaches in PAUP* 4.0b (Swofford 2002), as well as Bayesian inference (BI) in MrBayes 3.0b4 (Huelsenbeck & Ronquist 2001).

For parsimony analyses, heuristic searches were conducted with 1,000 random additions, tree bisection-reconnection (TBR) branch swapping, ACCTRAN optimization, MULTREES off, and no more than 10 trees saved per replicate. The tree output was then used as the starting point for a second round of searches with the same settings except with MULTREES on. To assess clade support, bootstrap analyses (BS; Felsenstein 1985) were conducted with 500 replicates, 10 random additions per replicate, TBR branch swapping, and MULTREES off. A reduced dataset with one individual per species was used in the ML analyses to reduce the computational burden. The nucleotide substitution model was determined by Hierarchical Likelihood Ratio Tests (hLRTs) and the Akaike Information Criterion (AIC) using Modeltest (version 3.06, Posada & Crandall 1998). The best-fitting model and related parameters of each dataset were used in both ML and Bayesian analyses. All ML searches were heuristic, with MULPARS and STEEPEST DESCENT options in effect, and TBR swapping. Bayesian analyses were performed with four Markov chains, each initiated with a random tree and run for 5,000,000 generations, sampling every 100th generation. Likelihood values were monitored for stationarity and to determine the burn-in cut-off. Trees and other sampling points prior to the burn-in cut-off were discarded and the remaining trees were imported into PAUP* to generate a majority-rule consensus. Posterior probability values (PP; Ronquist & Huelsenbeck 2003) were used to evaluate support of all nodes in the Bayesian trees.
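As a rough illustration of how the burn-in cut-off and posterior probabilities relate, the sketch below treats each sampled tree as a set of clades and reports the fraction of post-burn-in samples containing a given clade. The data structure and the function name `clade_posterior` are assumptions made for illustration; this is not how MrBayes or PAUP* store trees internally.

```python
def clade_posterior(sampled_trees, clade, burn_in):
    """Fraction of post-burn-in tree samples that contain `clade`.

    sampled_trees: list of trees, each represented (for illustration only)
                   as a set of frozensets of taxon names, one per clade.
    clade:         iterable of taxon names defining the clade of interest.
    burn_in:       number of initial samples to discard.
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("burn-in discards every sample")
    target = frozenset(clade)
    return sum(target in tree for tree in kept) / len(kept)


# 5,000,000 generations sampled every 100th generation give about
# 50,000 samples (ignoring the starting state).
samples_per_run = 5_000_000 // 100

# Toy example with four sampled trees over taxa A, B, and C.
toy_trees = [
    {frozenset({"A", "C"})},  # early sample, discarded as burn-in
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
]
support_ab = clade_posterior(toy_trees, {"A", "B"}, burn_in=1)
```

In the toy example, two of the three retained samples contain the clade (A, B), so its estimated posterior probability is 2/3.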

3.2.3.3. Test for topological incongruence

Compatibility of tree topologies and bootstrap values were used for initial assessment of congruence between datasets. All chloroplast sequences were combined in phylogenetic analyses because these regions are linked as a single unit and no well-supported conflict was detected among individual trees. However, in order to test for significant incongruence between nuclear datasets, we conducted Incongruence Length Difference tests (ILD; Farris et al., 1994), as implemented with the ‘partition homogeneity test’ option of PAUP* (Swofford, 2002), with 1,000 homogeneity replicates, each with 10 random sequence additions under the same settings as the MP heuristic searches outlined above, except that the MULTREES option was off. In addition, the significance of incongruence between chloroplast and nuclear trees was assessed by two non-parametric tests in PAUP* (Swofford 2002). The Templeton test (Wilcoxon signed-rank; Templeton 1983) was performed under the parsimony criterion. The one-tailed Shimodaira-Hasegawa test (SH; Shimodaira & Hasegawa 1999) was performed with selected substitution models (TVM+I+G for chloroplast data and HKY+I+G for nuclear data), the same ML parameters as outlined above, and FULL optimization with 1,000 bootstrap replicates.
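The logic of the ILD test can be sketched as follows, assuming tree lengths have already been obtained from parsimony searches. The statistic is the length of the combined-data tree minus the summed lengths of the trees from the separate partitions, and the P value is the share of randomized partitions whose statistic is at least as large as the observed one. The function names are hypothetical, and the P value convention shown (counting the observed value among the replicates) is one common choice that may differ in detail from PAUP*.

```python
def ild_statistic(length_combined, partition_lengths):
    """Incongruence length difference: extra steps required when the
    partitions are forced onto a single combined-data tree."""
    return length_combined - sum(partition_lengths)


def ild_p_value(observed, replicate_stats):
    """One-tailed P value from randomized partitions.

    Counts replicates whose ILD is >= the observed ILD, and includes the
    observed value itself in both numerator and denominator (an assumed
    convention for this sketch).
    """
    exceed = sum(1 for r in replicate_stats if r >= observed)
    return (exceed + 1) / (len(replicate_stats) + 1)


# Toy numbers, not taken from the analyses in this chapter.
observed_ild = ild_statistic(120, [50, 60])
p_value = ild_p_value(observed_ild, [2, 4, 12, 3])
```

A large P value (here 0.4 for the toy numbers) indicates that the observed incongruence is no greater than expected from random partitions of the same sizes, which is the usual justification for combining the datasets.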

3.2.3.4. Ancestral areas inference

Dispersal-Vicariance Analysis v1.1 (DIVA; Ronquist 1997) was used to search for the most probable ancestral geographical areas and relationships among regions where species are distributed. Ronquist’s (1994) reversible parsimony assumes that speciation is caused by geographical vicariance. The most probable ancestral area usually has the fewest dispersal and extinction events. The MP tree generated from the total evidence approach (i.e., combined cpDNA and nDNA data, with conflicting taxa removed), together with the taxon distribution matrix, was used as the primary input. The distribution of Crataegus was defined in five geographical areas: eastern North America (ENA), western North America (WNA), East Asia (EA), Europe (EUR), and Central America (CAM). Each area was coded for either presence or absence of each terminal taxon. The root of each node, defined as the most recent common area (MRCA), was obtained by optimal reconstruction with default settings. Maximum areas were constrained from 5 (i.e., the total number of our defined areas) to 2 (the minimal number of areas allowed by the program) in successive iterations in order to determine the most reliable MRCA and number of dispersal events. In parallel with DIVA, the ancestral state(s) of geographical area was also inferred with the maximum likelihood method using the program Mesquite version 2.01 (Maddison & Maddison 2007). The Markov k-state 1 parameter model (Mk1), which assumes a single rate of change among character states and does not allow a bias in gains versus losses, was used. The proportional likelihood values of all estimated areas were obtained for each node of the molecular tree.
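To make the single-rate assumption of the Mk1 model concrete, the following sketch computes its transition probabilities along a branch. With k states and one instantaneous rate for every change between two different states, the probability of remaining in the same state or ending in any particular different state has a simple closed form. The function name and the example rate are illustrative assumptions and are not values estimated in this study.

```python
import math


def mk1_transition_probs(k, rate, t):
    """Transition probabilities under the Mk1 model.

    k:    number of character states (e.g., 5 geographical areas)
    rate: instantaneous rate of change between any two different states
    t:    branch length (time)

    Returns (p_same, p_diff): the probability of ending in the starting
    state, and the probability of ending in one particular other state.
    """
    decay = math.exp(-k * rate * t)
    p_same = 1.0 / k + (k - 1) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff


# Five areas (ENA, WNA, EA, EUR, CAM) and an arbitrary illustrative rate.
p_same, p_diff = mk1_transition_probs(k=5, rate=0.1, t=2.0)
```

Because the same `p_diff` applies to every pair of different states, gains and losses are equally likely, which is the "no bias" property noted above; on very long branches both probabilities approach 1/k.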

3.2.3.5. Divergence time estimation

Divergence times among the major lineages of Crataegus were estimated by the penalized likelihood (PL) and non-parametric rate smoothing (NPRS) methods implemented in the program r8s version 1.71 (Sanderson 2006). The former method is a semiparametric rate-smoothing approach that allows heterogeneous evolutionary rates among branches when estimating node ages in the phylogenetic trees (Sanderson 2002), whereas the latter uses a least-squares smoothing approach that compares the sum of squared differences between branches (Sanderson 1997). The ML treeblock with branch lengths and the r8s command block with the following settings were included in the input file: 3477 sites, PL/NPRS method, and truncated-Newton (TN)/POWELL optimization for the PL and NPRS methods, respectively (see Sanderson 2006). The optimal smoothing parameter was first obtained by cross validation analysis of the data and then implemented in the successive run in which the ages of the nodes were calculated. Calibration was constrained with a minimum age of 40 million years (the Late Eocene; DeVore and Pigg, 2007) for the root of Maloideae genera (i.e., Aronia, Malus, Amelanchier, Crataegus, and Mespilus), and with a minimum age of 25 million years (Oligocene; MacGinitie, 1934; Oliver, 1936; Lamotte, 1952; Hickey, 1984; Wolfe and Wehr, 1988; DeVore and Pigg, 2007) for the stem group of Crataegus and Mespilus. Confidence intervals of divergence time were further estimated by the non-parametric bootstrap procedure (Baldwin and Sanderson, 1998; Sanderson and Doyle, 2001). One hundred bootstrap sequence matrices were generated with SeqBoot in PHYLIP 3.66c (Felsenstein, 2006). For each matrix, trees of the same topology but different branch lengths were generated from the ML heuristic searches and were used for age estimation in r8s with the same parameters as outlined above. The central 95% of the age distribution provides the confidence interval.
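The final step, taking the central 95% of the bootstrap age distribution, can be sketched in a few lines of Python. The sketch assumes a list of node ages (one per bootstrap replicate) is already available from r8s, and it uses a simple trimming convention in which an equal number of estimates is dropped from each tail; other percentile conventions would give slightly different bounds. The function name `central_interval` is hypothetical.

```python
def central_interval(ages, coverage=0.95):
    """Return (lower, upper) bounds of the central `coverage` share of `ages`.

    Trims the same number of estimates from each tail, rounding the
    number trimmed down to a whole estimate.
    """
    ordered = sorted(ages)
    n = len(ordered)
    if n == 0:
        raise ValueError("no age estimates supplied")
    drop = int(n * (1 - coverage) / 2)  # estimates removed from each tail
    return ordered[drop], ordered[n - 1 - drop]


# Placeholder ages 1..100 stand in for 100 bootstrap estimates.
lower, upper = central_interval(range(1, 101))
```

With 100 replicates this drops the two smallest and two largest estimates, so the interval spans the 3rd to the 98th ordered value.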

3.2.3.6. Mapping of morphological characters and ploidy level

Seven morphological characters, which include qualitative vegetative and reproductive characters modified from Phipps et al. (2003), as well as ploidy level (Talent & Dickinson 2005), were mapped on the combined cpDNA and nDNA tree using MacClade 4.0 (Maddison & Maddison 1992). Most morphological characters were scored from herbarium specimens and were treated as either binary or multistate (Appendix 5). To examine the level of homoplasy for individual characters, the number of changes, consistency index (CI), retention index (RI), and rescaled consistency index (RC) were estimated under ACCTRAN optimization (Farris 1989).
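The three homoplasy indices follow standard definitions and can be computed from three step counts per character: the minimum possible number of changes m, the observed number of changes on the tree s, and the maximum possible number of changes g. The sketch below assumes these counts have already been read from the mapping output; the function name is hypothetical.

```python
def homoplasy_indices(min_steps, observed_steps, max_steps):
    """Return (CI, RI, RC) for a single character.

    CI = m / s, RI = (g - s) / (g - m), RC = CI * RI, where m, s, and g
    are the minimum, observed, and maximum numbers of steps.
    """
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    if max_steps == min_steps:
        # RI is undefined for characters that cannot show homoplasy
        raise ValueError("RI undefined when max_steps equals min_steps")
    ci = min_steps / observed_steps
    ri = (max_steps - observed_steps) / (max_steps - min_steps)
    return ci, ri, ci * ri


# A binary character (m = 1) that changes twice on the tree and could
# change at most four times.
ci, ri, rc = homoplasy_indices(min_steps=1, observed_steps=2, max_steps=4)
```

Values of 1 indicate no homoplasy; in the example CI is 0.5, RI is 2/3, and RC is 1/3, reflecting one extra change beyond the minimum.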

3.3. Results

3.3.1 Sequence divergence

DNA divergence of chloroplast and nuclear regions was compared among taxa with respect to different geographical groups (Table 3.3). The variability found in individual chloroplast regions has been reported in Chapter 2. Their combined sequences in the present comparisons were less variable than the nuclear regions (0-2.67%; Table 3.3). All nuclear sequences revealed less than 1% intraspecific divergence among clones except PISTILLATA, which showed several substitutions and indels in the alignments in two out of the 8-10 clones in five of the individuals. These sequences are believed to be paralogous copies of the gene and were excluded from the analyses in order to recover an accurate species phylogeny. Briefly, the average DNA divergence from the total sequences of the EUR species is the highest among all geographical groups and is about two-fold higher than that of the ENA species (Table 3.3).

3.3.2 Species relationships

Based on the strict consensus parsimony tree, the chloroplast data resolved three major clades within Crataegus, labeled as A-C in Figure 3.1. Clade A contains the EA species of Crataegus sect. Sanguineae, WNA species of Crataegus series Cerrones (100%BS; 100%PP) and Douglasianae (78%BS; 99%PP), as well as C. spathulata of sect. Microcarpae. Clade B is the sister group to clade A and contains ENA species from eight different sections. Clade C, positioned as sister to clades A and B together, contains the EUR species of Crataegus sect. Crataegus and C. hupehensis (sect. Hupehensis) from EA (<50%BS; 84%PP). Although bootstrap support for all of these clades is relatively weak, similar topologies (i.e., the three major clades) are also found under the ML and Bayesian criteria (see Suppl. Figs. 3A-3C).

Nuclear data provide stronger bootstrap support and better resolution of species relationships than the chloroplast data. The ITS and LFY2 sequences were combined for tree reconstructions because no topological conflict was detected in the separate analyses of the two nuclear regions and the ILD test indicated no significant difference between the two datasets (P = 0.067). The strict consensus MPT (Fig. 3.2) supports the same three clades (A-C) found on the cpDNA tree (Fig. 3.1). This provides additional support for the association of EA with WNA species in clade A (72%BS; 99%PP), the sister group relationship of species in clades A and B (90%BS; 99%PP), as well as the monophyly of species in clade C (58%BS; 85%PP). Better resolution is also provided within these clades. For instance, species of sect. Sanguineae are unresolved in clade A of the cpDNA tree (Fig. 3.1). However, they are united by nuclear data as a monophyletic group (71%BS; 90%PP), adjacent to Douglasianae (78%BS; 96%PP) and Cerrones (79%BS; 99%PP), respectively (Fig. 3.2). Also, compared to the cpDNA tree (Fig. 3.1), which shows an almost complete polytomy among the ENA species in clade B, the nDNA tree (Fig. 3.2) provides some, albeit limited, resolution. For example, C. aestivalis, C. opaca, and C. rufula of sect. Aestivales are associated as a monophyletic group (76%BS; 96%PP). Finally, no clear relationships are detected among the species in clade C from the chloroplast data, while nuclear data indicate that the two EA species C. pinnatifida and C. hupehensis are closely related (97%BS; 100%PP) and distinct from EUR species (83%BS; 99%PP).

3.3.3 Incongruent topologies

Noticeable topological differences between the cpDNA and nDNA trees were detected for the position of three ENA species: C. marshallii, C. phaenopyrum, and C. spathulata. According to the cpDNA trees (Fig. 3.1), these species are nested in different positions within clades A and B with the WNA and ENA species, suggesting separate origins. However, they are found in the nDNA tree (Fig. 3.2) to be monophyletic and constitute a clade labeled as D (86%BS; 99%PP). Clade D shows a sister group relationship to species of clades A and B (84%BS; 99%PP). The Templeton tests on the cpDNA data with the nDNA-based topology, and vice versa, indicated significant differences in tree length (52 and 114 extra steps, respectively; P < 0.001 for both). Similarly, the SH tests rejected the monophyly of the three species (i.e., nDNA topology) with the cpDNA data (-lnL = 4483.55 and 4421.30; P = 0.005) as well as the polyphyletic origins of these species (i.e., cpDNA topology) with the nDNA data (-lnL = 6334.96 and 6234.17; P = 0.004). Based on these tests, we conclude with confidence that these species occupy different phylogenetic positions in the cpDNA and nDNA trees (Fig. 3.1 and 3.2).

3.3.4 Total sequence analyses

After exclusion of taxa that showed significant topological incongruence, chloroplast and nuclear regions were combined in a total-evidence phylogeny (3477 bp; 69 taxa). The resulting maximum likelihood tree (Fig. 3.3) reveals the same biogeographic and species relationships as found in the separate analyses (Fig. 3.1 and 3.2), but with additional resolution and stronger bootstrap support. These relationships include the backbone topology indicating a sister group relationship of clade C to clades A and B (99%BS; 100%PP), the close association of EA and WNA species in clade A (97%BS; 100%PP), and the monophyly of all ENA species in clade B regardless of sectional limits (80%BS; 95%PP). Within clades A and C, a sister group relationship between species of series Douglasianae and Cerrones (76%BS; 95%PP), as well as one between C. hupehensis (section Hupehensis) and C. pinnatifida (section Crataegus) (83%BS; 95%PP), are recovered, respectively.

In contrast to clades A and C, the relationships among the ENA species in clade B remain largely unresolved even after the chloroplast and nuclear data were combined. In an attempt to resolve these relationships further, sequences of the LFY1, PISTILLATA, and PEPC genes were added to the combined data, for a total of 7012 bp, given that no well-supported topological conflict was detected among the separate trees and no significant differences were found among datasets (P >

0.05; ILD tests). The strict consensus MPT (Fig. 3.4a) offers additional support for the alliance of WNA and EA species (100%BS). The three taxonomic groups, sect. Sanguineae (97%BS), ser. Douglasianae (96%BS), and ser. Cerrones (100%BS), are each shown to be monophyletic. However, resolution among the ENA species and their bootstrap values remain low. In clade B, only Crataegus section Aestivales appears as a monophyletic group, although with weak bootstrap support (<50%BS). Among species of sect. Coccineae, two pairs of sister species are identified: C. calpodendron and C. macracantha (76%BS), and C. mollis and C. submollis (97%BS). Other species such as C. uniflora (sect. Parvifoliae) and C. viridis (sect. Virides) are mixed with members of Coccineae and Crus-galli without the clear distinction expected from the existing taxonomic treatments (Fig. 3.4a). As evidenced by the phylogram (Fig. 3.4b), the internal branches in clade B are shorter than those in clade A (EA and WNA species).

3.3.5 DIVA and r8s analyses

The most recent common areas (MRCA) of Crataegus inferred by DIVA under different constraints are indicated in Table 3.4. Briefly, a single solution was obtained for the root of

Crataegus [ENA (eastern North America), WNA (western North America), EA (East Asia),

EUR (Europe)] when maximum areas were constrained to four or five, indicating that these four areas are equally probable to be the MRCA. When maximum areas were constrained to three, four alternative solutions were provided. When maximum areas were constrained to two, only one solution was obtained (Table 3.4). Two biogeographic areas, eastern North America

(ENA) and Europe (EUR), are consistently shown to be MRCA of Crataegus in almost all solutions under different constraints. These two areas show the proportional likelihoods of

0.23 (ENA) and 0.74 (EUR) at the root of Crataegus (Fig. 3.5). At least four dispersal events are inferred to explain the present distribution of Crataegus (Table 3.4; Fig. 3.5). Ancestors of

C. hupehensis, C. songarica, and C. pinnatifida are inferred to have dispersed from Europe into Asia. The eastern North American ancestors appear to have diversified into other areas such as Central America and western North America. Ancestors of western North America may have immigrated into East Asia or vice versa (Fig. 3.5).

Using the smoothing value of one obtained from the cross-validation procedure in r8s, the

PL method estimated the initial split of C. brachyacantha from the rest of the Crataegus species to be at 16.5±3.7 mya (Table 3.5). Divergence time between the EUR (clade C) and

{EA-NA} lineages (clade A and B) was estimated to be at 14.3±3 mya (approximately in the late Miocene), followed by the split between the ENA and {WNA-EA} lineages at about

9.9±1.7 mya (in the early Pliocene). The WNA taxa appear to have diverged from the EA taxa at 4.6±0.9 mya, which was about the same time that the ENA taxa diversified, i.e., at about

5.2±1.1 mya. All the estimated ages based on the PL method were shown to be 2-5 mya older than those based on the NPRS method (Table 3.5).
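The statement that two events occurred at "about the same time" can be checked by comparing the quoted age ranges. The short Python sketch below does this for the PL estimates given above; it assumes, purely for illustration, that each "±" value defines a simple symmetric interval (the text does not say whether it is a standard deviation or a confidence limit), and the dictionary keys and helper names are hypothetical labels.

```python
# PL point estimates and errors (mya) as quoted in the text above.
PL_AGES = {
    "C. brachyacantha vs. rest": (16.5, 3.7),
    "EUR vs. {EA-NA}": (14.3, 3.0),
    "ENA vs. {WNA-EA}": (9.9, 1.7),
    "WNA vs. EA": (4.6, 0.9),
    "ENA diversification": (5.2, 1.1),
}

def interval(mean, err):
    """Return (younger bound, older bound) for an estimate of mean +/- err."""
    return (mean - err, mean + err)

def overlaps(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

wna_ea = interval(*PL_AGES["WNA vs. EA"])
ena_div = interval(*PL_AGES["ENA diversification"])
ena_split = interval(*PL_AGES["ENA vs. {WNA-EA}"])

# The WNA-EA split and the ENA diversification have overlapping ranges,
# whereas the earlier ENA vs. {WNA-EA} split does not overlap the WNA-EA split.
print(overlaps(wna_ea, ena_div))
print(overlaps(ena_split, wna_ea))
```

Overlap of such ranges is only a rough guide; it does not replace a formal comparison of node ages on the dated tree.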

3.3.6 Character changes and homoplasy

Most of the morphological characters, as well as ploidy level, are shown to be variable among species within clades A, B, and C (Table 3.6; Fig. 3.6). Among all characters, leaf lobing shows the fewest changes and the highest CI, RI, and RC values (Table 3.6). This character is shown to differ between the Eurasian (EA and EUR) and American (ENA and WNA) species, of which the former have leaves that are deeply lobed with veins going to sinuses

(LLB; Fig. 3.6). Grooved nutlet surface appears to be synapomorphic for the EA and WNA species (clade A), but this characteristic is also observed in a few ENA and EUR species (NUS;

Fig. 3.6). Other characters such as ploidy level, leaf and branch vestiture, as well as stamen number are inferred to have evolved multiple times within the genus (Fig. 3.6) and therefore are highly homoplastic (CI < 0.2; Table 3.6).
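For readers less familiar with the homoplasy measures cited above, the standard definitions are CI = m/s, RI = (g - s)/(g - m), and RC = CI x RI, where m is the minimum possible number of steps for a character, s is the number of steps observed on the tree, and g is the maximum possible number of steps on any tree. The Python sketch below simply evaluates these formulas; the function name and the example numbers are hypothetical and are not taken from Table 3.6.

```python
def homoplasy_indices(m, s, g):
    """Return (CI, RI, RC) for a single character.

    m: minimum possible number of steps (e.g., 1 for a binary character)
    s: number of steps observed on the tree
    g: maximum possible number of steps on any tree
    RI (and hence RC) is undefined when g == m; None is returned for both
    in that case, which is a choice made for this sketch.
    """
    if s <= 0:
        raise ValueError("observed steps must be positive")
    ci = m / s
    if g == m:
        return ci, None, None
    ri = (g - s) / (g - m)
    return ci, ri, ci * ri


# Hypothetical binary character that changes five times on the tree and
# could change at most nine times on the worst possible tree.
ci, ri, rc = homoplasy_indices(m=1, s=5, g=9)
print(round(ci, 3), round(ri, 3), round(rc, 3))
```

A character with no homoplasy has s = m and therefore CI = RI = RC = 1, whereas characters that change many times, such as those described above as highly homoplastic, yield CI values approaching zero.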

3.4. Discussion

3.4.1 Ancestral areas of Crataegus

Crataegus has a wide distribution in the Northern Hemisphere. Based on cladistic analyses of morphological characters, Phipps (1983) suggested that the most basal section of the genus is sect. Mexicanae, which comprises the East Asian C. scabrifolia (Franchet) Rehder and Mexican

C. pubescens (synonym of C. mexicana). Although the samples here include only the latter, our molecular data place it in a derived position within clade B (Figs. 3.1-3.3) and disagree with this notion. Both DIVA and Mesquite analyses indicate instead that Eastern North

America and Europe are probably the most recent common areas for all species (Table 3.4; Fig. 3.5). These areas are likely to contain the most primitive stock of Crataegus, consistent in part with the hypothesis of a North American origin for all of the supertribe Pyrodae (= Maloideae plus Gillenia Moench) on the basis of molecular phylogeny, floral morphology, and some fossil evidence (Evans & Campbell 2002). Moreover, most species of the sister genus of

Crataegus, Amelanchier (and its segregate genera Malacomeles and Peraphyllum), as well as the other basal genera of the Pyrodae (e.g., Kageneckia, Lindleya, Vauquelinia, and Gillenia) are found today only in the New World (Campbell et al. 2007). An extensive rosaceous fossil record in North America and Europe from the Eocene on (DeVore & Pigg 2007), in addition to the earliest evidence of Crataegus fossils from the Okanogan Highlands in North America in at least the late Oligocene (MacGinitie 1934; Oliver 1936; Lamotte 1952; Hickey 1984; Wolfe

& Wehr 1988; DeVore & Pigg 2007) support the DIVA and Mesquite results in this study.

In our phylogenetic tree (Fig. 3.3), Crataegus germanica (formerly treated as genus

Mespilus; Chapter 2) and C. brachyacantha are the two most basal species in the genus.

Crataegus germanica and species of sect. Crataegus are endemic to Central, Eastern, and

Southeastern Europe, Western Asia, and most of the Mediterranean basin (Appendix 4;

Christensen 1992). These areas are viewed as glacial refugia for Tertiary biota and serve to preserve diversity of many extant species (Bennett et al. 1991; Hewitt 1996, 2000; Taberlet et al. 1998), as demonstrated by the highest sequence divergence detected in our European species (Table 3.3). We predict that ancestors of C. germanica and sect. Crataegus may represent some early lineages that survived through the Quaternary extinction in the Northern

Hemisphere. Crataegus brachyacantha occurs naturally throughout Louisiana and surrounding areas at 30°-34°N (Phipps 1998) beyond the previous belt of the glacial ages (Campbell 1982; Berggren & Prothero 1992; Hewitt 1996). Although no accurate fossil is known for this taxon, it is conceivable that ancestors of C. brachyacantha represent one of the few lineages that survived in the southern arid areas of North America during glaciation.

3.4.2 Intercontinental migrations and hybridization hypotheses

A few migratory pathways are suggested for the present distribution of Crataegus by the molecular data (Fig. 3.5). In the Old World, the European species are likely to have immigrated into central China, as evidenced by the phylogenetic association of C. hupehensis,

C. songarica, C. pinnatifida, with species of sect. Crataegus, as well as C. nigra with species of Sanguineae (Fig. 3.3). The sister group relationships of sect. Douglasianae and sect.

Sanguineae (Fig. 3.3 and 3.4), in conjunction with the synapomorphies in morphological features such as nutlet surface (Fig. 3.6), suggest that Crataegus could have dispersed in either direction between East Asia and western North America via the Bering Land Bridge, probably until its closure in the late Pliocene (Hopkins 1967; Tiffney 1985a; Donoghue et al. 2001). Such an

EA-WNA disjunction has also been reported in other temperate angiosperms (e.g., Liston et al.

1992; Xiang et al. 1998; Chen & Li 2004; Li & Xiang 2005; Oh & Potter 2005; Nie et al.

2005). The estimated divergence ages between the EA and WNA taxa obtained in some of these studies, e.g., in Datisca (in the late Miocene; Liston et al. 1992) and Kelloggia (5.4±2.3 mya; Nie et al. 2005), are similar to what we found in Crataegus (4.6±0.9 mya; Table 3.5).

According to the r8s analyses, the separation between the ENA and WNA Crataegus appears to have occurred earlier than that between the EA and WNA taxa (Table 3.5). This chronological order, which has also been documented in other plants (e.g., Aesculus; Xiang et al. 1998) and in vertebrates (e.g., lizards and frogs; Macey et al. 2006), supports the notion that the cooling and aridification of the North American interior preceded the closure of the Bering Land Bridge (Gladenkov et al. 2002; Milne & Abbott 2002; Milne 2006).

Although there is no direct evidence for genetic interchange between the European and

North American Crataegus, the conflicts detected between our chloroplast and nuclear data

(Fig. 3.1 and 3.2) as well as morphologies (Phipps 1998) suggest that three ENA species, C. marshallii, C. spathulata and C. phaenopyrum, are potentially hybrids derived from ancestors of some European and American species. Conflicts between chloroplast and nuclear phylogenies could be attributed to several factors. One is the stochastic outcome of the lineage sorting process, which arises when variation of the ancestral lineages is not fully represented in the chloroplast and/or nuclear DNA of a taxon (Pamilo & Nei 1988; Avise

2004). However, separate analyses of the two independent nuclear loci ITS and LEAFY1, with multiple clones from multiple individuals, reveal relationships (data not shown) similar to those in the combined analyses (Fig. 3.2), which eliminates the possibility of contamination and suggests that extant Crataegus lineages no longer show incomplete lineage sorting. Another concern is duplication of nuclear genes, which could lead to conflicting gene trees when paralogs of some taxa are not detected. In our samples, there is no indication of substantial within-taxon divergence in ITS and LEAFY1 that would suggest paralogy. Therefore, lineage sorting, contamination, and paralogy can be excluded with confidence as sources of conflict between our cpDNA and nDNA trees, leaving hybridization as the most likely explanation for the observed conflict.

Morphologies of C. marshallii, C. spathulata and C. phaenopyrum lend some support to our hybridization hypothesis. Although these three species are found in the southeastern United States, their individuals have deeply to moderately lobed leaves with veins going to major sinuses, which is a characteristic of the European species (Phipps 1998;

Fig. 3.6). On the one hand, these vegetative features point towards an affinity of C. marshallii, C. spathulata and C. phaenopyrum with European Crataegus. On the other hand, reproductive features such as small flowers and small orange-red fruits with 3-5 pyrenes, as well as distribution, indicate similarities of these three species with the native North American

Crataegus (Phipps 1998). The combination of European and North American morphological features in C. marshallii, C. spathulata and C. phaenopyrum lends additional confidence to the hybridization explanation for these species based on molecular data.

According to the chloroplast tree (Fig. 3.1), species of North America are likely to be the maternal parents, but no single species can be specified because of the limited variation. Paternal parents are not clearly indicated in the nuclear tree (Fig. 3.2); however, morphologies suggest that these could be European species. We predict that their paternal lineage(s) could either be extinct or not included in our samples. It is noteworthy that such hybridization has to be an ancient event, because the disjunct taxa could only have co-occurred through migration across the North Atlantic Land Bridge (NALB) in the early

Tertiary (Tiffney 1985b). This ancient paternal lineage could have been lost during the glacial periods, leaving the three putative hybrid taxa as a monophyletic group showing similar genetic distances to North American and European groups in the nuclear data (Fig. 3.2).

Among the three hybrid species, C. marshalli and C. spathulata are known to be diploid, whereas C. phaenopyrum contains both triploid and tetraploid individuals (Talent & Dickinson

2005; Appendix 4). In North America, apart from some well-known polyploid hybrids (Phipps

& Muniyamma 1980; Phipps 1988), a considerable number of diploid hybrids are also found to occur naturally (Phipps 1984). Many of these hybrids were derived from crossing between native and non-native Crataegus species, e.g., the North American C. suksdorfii and C. punctata with the native European C. monogyna, forming diploid sexual individuals (Purich et al. 2005). Diploid hybrids have also been reported in many other plant groups (e.g. Spooner et al. 1991; Wolfe et al. 1998; Brochmann et al. 2000; Wang et al. 2001; Welch & Rieseberg

2002; Borgen et al. 2003), in which progeny are shown to persist and to be maintained through sexual reproduction, as well as through spatial and ecological separation from their parental species

(Rieseberg 2000; Wang et al. 2001; Rosenthal et al. 2002). Provided that individuals of C. marshallii and C. spathulata are fertile and well established in the southeastern U.S. (Phipps

1998), it is not unreasonable to infer diploid hybrid speciation in Crataegus.

3.4.3 Evolution of eastern North American taxa and character homoplasy

The existing classification divides the eastern North American Crataegus into eight different sections and large sections such as Crus-galli and Coccineae are further divided into several series based on a combination of morphological characters (Phipps et al. 1990). Molecular data show strong support for sections such as Sanguineae (EA) and Crataegus (EUR), as well as series such as Cerrones and Douglasianae (WNA) (Fig. 3.3 and 3.4a). However, with few exceptions (e.g. sect. Aestivales; Fig. 3.4a) the ENA taxa remain poorly resolved and the reason is unclear in view of the higher resolution found in the other groups.

Internal branches of the ENA species are relatively short when compared with those of the WNA and EA groups (Fig. 3.4b), even when polyploid taxa are excluded (data not shown).

One explanation for such poor resolution and short branches is the genetic bottleneck effect.

Recurrent expansion and contraction of ice sheets from northeastern North America during the glacial periods may have caused massive extinction of Crataegus lineages/diversity (Carrara et al.

1996; Marshall et al. 2002). However, theory suggests that most informative sites of molecular regions evolve approximately neutrally, so molecular divergence is less likely to be affected by changes in population size. Another explanation is rapid divergence driven by polyploidy and reticulation, as documented in many North American hawthorns (Phipps &

Muniyamma 1980; Dickinson & Phipps 1985, 1986). Over 60% of the ENA taxa are reported to have polyploids (Talent & Dickinson 2005) and ploidy level is inferred to have changed at least 13 times independently within the genus (Table 3.6). Polyploidization, either via fertilization of unreduced gametes or hybridization, may lead to genome duplications and/or morphological changes, which enhance speciation (Levin 2002; Meyers & Levin 2006). Our analyses have shown that morphological characters have evolved multiple times and overlap between different ENA sections (Table 3.6; Fig. 3.6). For instance, C. calpodendron and C. macracantha have sulcate nutlets that are similar to those of taxa of sect. Sanguineae and Douglasianae, but different from those of other taxa of the same section, Coccineae (NUS; Fig. 3.6). This character has evolved at least five times and could have been derived independently in these two ENA species (Table 3.6). Other characters such as leaf shape and venation pattern appear to be more uniform within sections such as Crataegus (EUR), in which species have small, broad and deeply lobed leaves with veins extending to the sinuses, and Sanguineae (EA), in which species have moderately lobed leaves with veins extending only to the basalmost sinuses (Fig.

3.6; Christensen, 1992). However, these features are shown to be more variable among the

North American taxa, which vary from narrow elliptic to moderately lobed, and veins may or may not be extending to the margins (LLB, LVM; Fig. 3.6). More importantly, such variation is present not only between but also within sections of those ENA taxa, indicative of the extensive homoplasy of morphological characters that accompanied the rapid divergence of the species.

3.4.4 Conclusions

The present study indicates ancient trans-Pacific movements between East Asian and western

North American Crataegus based on their genetic associations. Europe and eastern North

America are suggested as the most recent common areas of modern Crataegus and at least four dispersal events are inferred to explain the present distribution of Crataegus.

Incongruence between the chloroplast and nuclear data as well as morphologies suggests hybrid origins of C. marshallii, C. phaenopyrum, and C. spathulata, which were potentially derived from hybridizations between the North American species serving as the maternal parents and European lineages that might be extinct or not included in our samples. The eastern North American species remain poorly resolved and no clear cladistic groups are identified to support the existing classification. Compared with other groups, poor resolution and short internal branches observed in the ENA species suggest genetic bottlenecks and/or rapid divergence potentially driven by polyploidy and reticulation. These processes provide an explanation for multiple changes and homoplasy in morphological characters.

3.5. Acknowledgements

The authors thank Simona Margaritescu and Maria Kuzmina for assistance with sequencing;

Fannie Gervais for DNA extractions; Rhoda Love, Steve Brunsfeld, Christopher Campbell,

Christopher S. Reid, Nadia Talent, and Peter Zika for plant collection and identification; Jenny

Bull and Annabel Por for organizing the vouchers; and Tray Lewis (Louisiana), the Arkansas

Natural Heritage Commission, The Nature Conservancy, the Toronto and Region Conservation

Authority, the Arnold Arboretum of Harvard University, Jardin Botanique de Montréal, the

Morton Arboretum, the University of California Botanical Garden, the North Carolina

Arboretum, the North Carolina Botanical Garden, and the Royal Botanical Gardens (Burlington,

Ontario) for access to trees. Financial support from the Natural Sciences and Engineering

Research Council of Canada (grant A3430 to TAD, 326439-06 to SS), the Carlsberg Foundation

(grant 2005-1-462 to KIC), the Botany Department of the University of Toronto, and the Royal

Ontario Museum is gratefully acknowledged.

3.6. References

Andreo CS, Gonzalez DH, Iglesias AA 1987. Higher plant phosphoenolpyruvate carboxylase:

Structure and Regulation. FEBS Lett. 213: 1-8.

Avise JC 2004. Molecular markers, natural history, and evolution, 2nd edition. Sinauer

Associates, Sunderland, Massachusetts.

Bailey CD, Doyle JJ 1999. Potential phylogenetic utility of the low-copy nuclear gene

PISTILLATA in Dicotyledonous Plants: Comparison to nrDNA ITS and trnL intron in

Sphaerocardamum and other Brassicaceae. Mol. Phylogenet. Evol. 13: 20-30.

Bennett KD, Tzedakis PC, Willis KJ 1991. Quaternary refugia of North European trees. J.

Biogeog. 18: 103-115.

Berggren WA, Prothero DR 1992. Eocene and Oligocene climatic and biotic evolution: an

overview. In Eocene and Oligocene climatic and biotic evolution. Princeton University

Press, Princeton. Pp. 568.

Borgen L, Leitch I, Santos-Guerra A 2003. Genome organization in diploid hybrid species of

Argyranthemum (Asteraceae) in the Canary Islands. Bot. J. Linn. Soc. 141: 491-501.

Boufford DE, Spongberg SA 1983. Eastern Asia-Eastern North American phytogeographical

relationships-a history from the time of Linnaeus to the twentieth century. Ann. Missouri

Bot. Gard. 70: 423-439.

Brochmann C, Borgen L, Stabbetorp OE 2000. Multiple diploid hybrid speciation of the

Canary Island endemic Argyranthemum sundingii (Asteraceae). Pl. Syst. Evol. 220: 77-92.

Campbell JJN 1982. Pears and persimmons: A comparison of temperate forests in Europe and

eastern North America. Vegetatio. 49: 85-101.

Campbell CS, Evans RC, Morgan DR, Dickinson TA, Arsenault MP 2007. Phylogeny of

subtribe Pyrinae (formerly the Maloideae, Rosaceae): Limited resolution of a complex

evolutionary history. Pl. Syst. Evol. 266: 119-145.

Carrara PE, Kiver EP, Stradling DF 1996. The Southern Limit of Cordilleran Ice in the

Colville and Pend Oreille Valleys of Northeastern Washington during the Late Wisconsin

Glaciation. Can. J. Earth Sci. 33: 769-778.

Chen ZD, Li JH 2004. Phylogeny and biogeography of Alnus (Betulaceae) inferred from

sequences of nuclear ribosomal DNA ITS region. Int. J. Plant Sci. 165: 325-335.

Christensen KI 1992. Revision of Crataegus section Crataegus and Nothosect. Crataeguineae

(Rosaceae-Maloideae) in the Old World. Syst. Bot. Mono. 35: 1-199.

Cushman JC, Bohnert HJ 1989. Nucleotide sequence of the Ppc2 gene encoding a

housekeeping isoform of PEPC from Mesembryanthemum crystallinum. Nucl. Acids Res.

17: 6743-6744.

DeVore ML, Pigg KB 2007. A brief review of the fossil history of the family Rosaceae with a

focus on the Eocene Okanogan Highlands of eastern Washington State, USA, and British

Columbia, Canada. Pl. Syst. Evol. 266: 45-57.

Dickinson TA 1998. Taxonomy of agamic complexes in plants: a role for metapopulation

thinking. Fol. Geobotanica 33: 327-332.

_____, Phipps JB 1985. Degree and pattern of variation in Crataegus section Crus-galli in

Ontario. Syst. Bot. 10: 322-337.

_____, _____ 1986. The breeding system of Crataegus crus galli sensu lato in Ontario

(Canada). Am. J. Bot. 73: 116-130.

_____, Belaoussoff S, Love RM, Muniyamma M 1996. North American black-fruited

hawthorns: I. Variation in floral construction, breeding system correlates, and their possible

evolutionary significance in Crataegus sect. Douglasii Loudon. Fol. Geobotanica 31:

355-371.

Donoghue M, Bell CD, Li JH 2001. Phylogenetic patterns in northern hemisphere plant

geography. Int. J. Plant Sci. 162: S41-52.

_____, Smith SA 2004. Patterns in the assembly of the temperate forests around the Northern

Hemisphere. Phil. Trans. Roy. Soc. London B 359: 1633-1644.

Evans RC, Campbell CS 2002. The origin of the apple subfamily (Maloideae; Rosaceae) is

clarified by DNA sequence data from duplicated GBSSI genes. Am. J. Bot. 89: 1478-1484.

Farris JS 1989. The Retention Index and Homoplasy Excess. Syst. Zool. 38: 406-407.

Felsenstein J 1985. Confidence limits on phylogenies: an approach using the bootstrap.

Evolution 39: 783-791.

_____ 2006. PHYLIP (phylogeny inference package), version 3.66c.

http://evolution.genetics.washington.edu/phylip.html

Gaskin JF, Schaal B 2002. Hybrid Tamarix widespread in U.S. invasion and undetected in

native Asian range. Proc. Nat. Acad. Sci., USA 99: 11256-11259.

Gehrig H, Heute V, Kluge M 1998. Towards a better understanding of the molecular evolution

of phosphoenolpyruvate carboxylase by comparison of partial cDNA sequences. J. Mol.

Evol. 46: 107-114.

_____, _____, _____ 2001. New partial sequences of Phosphoenolpyruvate Carboxylase as

molecular phylogenetic markers. Mol. Phylogenet. Evol. 20: 262-274.

Goto K, Meyerowitz E 1994. Function and regulation of the Arabidopsis floral homeotic gene

PISTILLATA. Gen. Devel. 8: 1548-1560.

Graham A 1972. Outline of the origin and historical recognition of floristic affinities between

Asia and eastern North America. In Graham, A. ed. Floristics and paleofloristics of Asia

and eastern North America. Elsevier, Amsterdam. Pp. 1-18.

Gu CZ, Spongberg SA 2003. Crataegus L. Flora of China 9: 111-117.

Guo Q 1999. Ecological comparisons between Eastern Asia and North America: historical and

geographical perspectives. J. Biogeog. 26: 199-206.

Hasegawa M, Kishino H, Yano T 1985. Dating of the human-ape splitting by a molecular

clock of mitochondrial DNA. J. Mol. Evol. 22: 160-174.

Helfgott DM, Mason-Gamer RJ 2004. The evolution of North American Elymus (Triticeae,

Poaceae) allotetraploids: Evidence from Phosphoenolpyruvate Carboxylase gene

sequences. Syst. Bot. 29: 850-861.

Hewitt GM 1996. Some genetic consequences of ice ages, and their role in divergence and speciation. Biol. J. Linn. Soc. 58: 247-276.

_____ 2000. The genetic legacy of the Quaternary ice ages. Nature 405: 907-913.

Hickey LJ 1984. Changes in the angiosperm flora across the Cretaceous-Tertiary boundary. In:

Berggren, W.A., Van Couvering, J.A. eds. Catastrophes and Earth History. Princeton

University Press, Princeton, N. J. Pp. 279-313.

Hopkins DM 1967. The Cenozoic history of Beringia-a synthesis. In Hopkins, D.M. ed., The

Bering Land Bridge. Stanford University Press, Stanford. Pp. 451-484.

Huelsenbeck JP, Ronquist F 2001. Mr. Bayes: a program for the Bayesian inference of

phylogeny. Bioinformatics 17: 754-755.

Iwatsuki K, Ohba H 1994. The floristic relationship between East Asia and eastern North

America. In Vegetation in Eastern North America, ed. Miyawaki, A., Iwatsuki, K.,

Grandtner, M.M. Tokyo University Press, Tokyo. Pp. 61-74.

Lamotte RS 1952. Catalogue of the Cenozoic plants of North America through 1950. Geol.

Soc. Am. Mem. 51.

Lepiniec L, Vidal J, Chollet R, Gadal P, Cretin C 1994. Phosphoenolpyruvate carboxylase:

Structure, regulation, and evolution. Plant Sci. 99: 111-124.

Levin DA 2002. The role of chromosomal change in plant evolution. Oxford University Press,

New York.

Li JH, Xiang QP 2005. Phylogeny and biogeography of Thuja L. (Cupressaceae), an Eastern

Asian and North American disjunct genus. J. Integr. Plant Biol. 47: 651-659.

Liston A, Rieseberg LH, Hansen MA 1992. Geographic partitioning of chloroplast DNA

variation in the genus Datisca (Datiscaceae). Plant Syst. Evol. 181: 121-132.

Lohmann LG 2006. Untangling the phylogeny of neotropical lianas (Bignonieae, Bignoniaceae).

Am. J. Bot. 93: 304-318.

Maddison WP, Maddison DR 1992. MacClade, version 3.01. Sunderland: Sinauer Associates.

_____, _____ 2007. Mesquite: A modular system for evolutionary analysis. Version 2.01.

MacGinitie HD 1934. Contributions to paleobotany. II. The Trout Creek flora of southeastern

Oregon. Carnegie Inst.

Malcomber ST 2002. Phylogeny of Gaertnera Lam. (Rubiaceae) based on multiple DNA

markers: evidence of a rapid radiation in a widespread, morphologically diverse genus.

Evolution 56: 42-57.

Marshall SJ, James TS, Clarke GKC 2002. North American ice sheet reconstructions at the last

glacial maximum. Quat. Sci. Rev. 21: 175-192.

Matsuoka M, Minami EI 1989. Complete structure of the gene for phosphoenolpyruvate

carboxylase from maize. Europ. J. Biochem. 181: 593-598.

Meyers LA, Levin DA 2006. On the abundance of polyploids in flowering plants. Evolution 60:

1198-1206.

Müller K 2005. Incorporating information from length-mutational events into phylogenetic

analysis. Mol. Phylogenet. Evol. 38: 667-676.

Muniyamma M, Phipps JB 1979. Cytological proof of apomixis in Crataegus (Rosaceae). Am.

J. Bot. 66: 149-155.

____, ____ 1984. Further cytological evidence for the occurrence of apomixis in North

American hawthorns. Can. J. Bot. 62: 2316-2324.

Nie ZL, Wen J, Sun H, Bartholomew B 2005. Monophyly of Kelloggia Torrey ex Benth.

(Rubiaceae) and evolution of its intercontinental disjunction between western North

America and eastern Asia. Am. J. Bot. 92: 642-652.

Oh SH, Potter D 2003. Phylogenetic utility of the second intron of LEAFY in Neillia and

Stephanandra (Rosaceae) and implication for the origin of Stephanandra. Mol. Phylogenet.

Evol. 29: 203-215.

____, ____ 2005. Molecular phylogenetic systematics and biogeography of tribe Neillieae (Rosaceae) using DNA sequences of cpDNA, rDNA, and LEAFY. Am. J. Bot. 92: 179-192.

Oliver E 1936. Contributions to paleontology: A Miocene flora from the Blue Mountains,

Oregon. Carnegie Inst.

Olson ME 2002. Combining data from DNA sequences and morphology for a phylogeny of

Moringaceae (Brassicales). Syst. Bot. 27: 55-73.

Pamilo P, Nei M 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:

568-583.

Phipps JB 1983. Biogeographic, taxonomic, and cladistic relationships between East Asiatic

and North American Crataegus. Ann. Missouri Bot. Gard. 70: 667-700.

_____ 1984. Problems of hybridity in the cladistics of Crataegus (Rosaceae). Plant

Biosystematics. W. F. Grant. Toronto, Academic Press Canada.

____ 1988. Crataegus (Maloideae, Rosaceae) of the southeastern USA: Introduction and

series Aestivales. J. Arnold Arbor. 69: 401-432.

____ 1998. Synopsis of Crataegus series Apiifoliae, Cordatae, Microcarpae, and Brevispinae

(Rosaceae subfam. Maloideae). Ann. Missouri Bot. Gard. 85: 475-491.

_____, Muniyamma M 1980. A taxonomic revision of Crataegus (Rosaceae) in Ontario. Can.

J. Bot. 58: 1621-1699.

____, Robertson KR, Rohrer JR 1990. A checklist of the subfamily Maloideae (Rosaceae).

Can. J. Bot. 68: 2209-2269.

____, O’Kennon RJ, Lance RW 2003. Hawthorns and medlars. Timber Press, Portland OR.

Posada D, Crandall KA 1998. Modeltest: testing the model of DNA substitution.

Bioinformatics 14: 817-818.

Purich MA, Lo EYY, Dickinson TA 2005. Characterizing hybridization between native and

non-native Crataegus species. The 90th Annual Meeting, Ecological Society of America,

Montreal, Canada.

Rambaut A 2002. Se-Al Sequence Alignment Editor v2.0a11. Oxford: University of Oxford.

Rieseberg LH 2000. Crossing relationships among ancient and experimental sunflower hybrid

lineages. Evolution 54: 859-865.

Ronquist F 1994. Ancestral areas and parsimony. Syst. Biol. 43: 267-274.

_____, Huelsenbeck JP 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models.

Bioinformatics 19: 1572-1574.

____ 1997. Dispersal-vicariance analysis: a new approach to the quantification of historical

biogeography. Syst. Biol. 46: 195-203.

Rosenthal DM, Schwarzbach AE, Donovan LA, Raymond O, Rieseberg LH 2002. Phenotypic

differentiation between three ancient hybrid taxa and their parental species. Int. J. Pl. Sci.

163: 387-398.

Sanmartin I, Enghoff H, Ronquist F 2001. Patterns of animal dispersal, vicariance and

diversification in the Holarctic. Biol. J. Linn. Soc. 73: 345-390.

Sieburth LE, Meyerowitz EM 1997. Molecular dissection of the AGAMOUS control region

shows that cis elements for spatial regulation are located intragenically. Plant Cell 9:

355-365.

Shimodaira H, Hasegawa M 1999. Multiple comparisons of log-likelihoods with applications to

phylogenetic inference. Mol. Biol. Evol. 16: 1114-1116.

Spooner DM, Sytsma KJ, Smith JF 1991. A molecular reexamination of diploid hybrid

speciation of Solanum raphanifolium. Evolution 45: 757-764.

Swofford DL 2002. PAUP*: Phylogenetic analysis using parsimony (and other methods),

version 4.0b10. Sunderland: Sinauer Associates.

Taberlet P, Fumagalli L, Wust-Saucy AG, Cosson JF 1998. Comparative phylogeography and

postglacial colonization routes in Europe. Mol. Ecol. 7: 453-464.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae):

evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83:

1268-1304.

____, ____ 2007. Ploidy level increase and decrease in seeds from crosses between sexual

diploids and asexual triploids and tetraploids in Crataegus L. (Rosaceae, Spiraeoideae,

Pyreae). Can. J. Bot. 85: 570-584.

Taylor DW 1990. Paleobiogeographic relationships of angiosperms from the Cretaceous and

early Tertiary of the North American area. Bot. Rev. 56: 279-417.

Templeton AR 1983. Phylogenetic inference from restriction endonuclease cleavage site maps

with particular reference to the evolution of humans and the apes. Evolution 37: 221-244.

Tiffney BH 1985a. Perspectives on the origin of the floristic similarity between eastern Asia

and eastern North America. J. Arnold Arbor. 66: 73-94.

_____ 1985b. The Eocene North Atlantic land bridge: its importance in Tertiary and modern

phytogeography of the Northern Hemisphere. J. Arnold Arbor. 66: 243-273.

_____, Manchester SR 2001. The use of geological and paleontological evidence in evaluating

plant phylogeographic hypotheses in the Northern Hemisphere Tertiary. Int. J. Plant Sci. 162:

S3-S17.

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG 1997. The ClustalX windows

interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.

Nucleic Acids Res. 25: 4876-4882.

Tsumura Y, Yoshimura K, Tomaru N, Ohba K 1995. Molecular phylogeny of conifers using

RFLP analysis of PCR-amplified specific chloroplast genes. Theoret. Appl. Genet. 91:

1222-1236.

Wang XR, Szmidt AE, Savolainen O 2001. Genetic composition and diploid hybrid speciation

of a high mountain pine, Pinus densata, native to the Tibetan Plateau. Genetics 159: 337-346.

Welch ME, Rieseberg LH 2002. Patterns of genetic variation suggest a single, ancient origin for

the diploid hybrid species Helianthus paradoxus. Evolution 56: 2126-2137.

Wen J 1999. Evolution of eastern Asian and eastern North American disjunct distributions in

flowering plants. Annu. Rev. Ecol. Syst. 30: 421-455.

Wolfe AD, Xiang QY, Kephart SR 1998. Diploid hybrid speciation in Penstemon

(Scrophulariaceae). Proc. Nat. Acad. Sci. 95: 5112-5115.

Wolfe JA, Wehr W 1988. Rosaceous Chamaebatiaria-like foliage from the Paleogene of

western North America. Aliso 12: 177-200.

Xiang QY, Soltis DE, Soltis PS 1998. The eastern Asian and eastern and western North

American floristic disjunction: Congruent phylogenetic patterns in seven diverse genera. Mol.

Phylogenet. Evol. 10: 178-190.

Table 3.1 Summary of Crataegus samples included in this study. Numbers of species that were reported as diploid (2x), triploid (3x), and tetraploid (4x) according to Talent and Dickinson (2005) are indicated for each taxonomic section. Detailed information on species can be found in Appendix 4. Asterisks indicate sections in which species were sequenced with the three additional nuclear regions LEAFY intron 1, partial PEPC and PISTILLATA genes for further phylogenetic analyses. Numbers in bold denote species that contain both diploid and polyploid individuals, as described in greater detail in Appendix 4.

Taxonomic section   No. of species  No. of individuals  Biogeographic regions    Ploidy level composition
Mexicanae*          1               1                   Central America          2x
Crataegus           13              37                  Central Europe, Eurasia  2x (7), 4x (6)
Sanguineae*         11              17                  East Asia                2x (5), 3x (1), 4x (2)
Hupehenses          1               2                   East Asia                2x
Douglasianae*       8               25                  Western North America    2x (2), 4x (6+1)
Parvifoliae*        1               2                   Eastern North America    3x
Cordatae            1               2                   Eastern North America    3-4x
Virides*            1               2                   Eastern North America    2x
Microcarpae         1               2                   Eastern North America    2x
Lacrimatae*         4               5                   Eastern North America    3x (1), 4x (3)
Aestivales*         3               7                   Eastern North America    2x (2), 3x (1)
Crus-galli*         5               10                  Eastern North America    2x (2), 3x (2), 4x (1)
Coccineae*          15              30                  Eastern North America    2x (3), 3x (2+1), 4x (9)
Brachyacanthae*     1               4                   Eastern North America    2x

Table 3.2 Information on chloroplast and nuclear primers used in the present study. Primers for LEAFY intron 1, PEPC, and PISTILLATA genes were designed based on conserved regions of the mRNA sequences of Maloideae taxa obtained from the NCBI database and their accession numbers are provided. The two forward primers for PISTILLATA (F1 and F2) are located respectively on the 3' MADS-domain and 3' K-domain, and the reverse primer is on the 5' C-terminus of the gene. Only PIST-F2 and PIST-R are used in this study because the amplified downstream introns are suggested to be more variable and of readable length for sequencing without internal primers. These nuclear primers are shown to be applicable to Crataegus, Amelanchier, Malus, and potentially other Pyrinae species.

Region          Primer sequence (5'-3')                   Tm (°C)  Amplified length (bp)  References/mRNA sequences used

Chloroplast regions
trnG-trnS       F: GAACGAATCACACTTTTACCAC                 58       750                    Hamilton (1999)
                R: GCCGCTTTAGTCCACTCAGC
psbA-trnH       F: GTTATGCATGAACGTAATGCTC                 55       500                    Sang et al. (1997)
                R: CGCGCATGGTGGATTCACAAATC
trnH-rpl2       F: CGGATGTAGCCAAGTGGATC                   55       500                    Vaillancourt and Jackson (2000)
                R: GATAATTTGATTCTTCGTCGCC
rpl20-rps12     F: TTTGTTCTACGTCTCCGAGC                   58       1000                   Hamilton (1999)
                R: GTCGAGGAACATGTACTAGG

Nuclear regions
ITS1-5.8S-ITS2  F: TCCTCCGCTTATTGATATGC                   55       700                    White et al. (1990)
                R: GGAAGGAGAAGTCGTAACAAGG
LEAFY intron 1  F: GGATCCRGATGCCTTCTCTGCGAACTTGTTCAAGTGG  62       1000                   Malus domestica (AB162034), Eriobotrya japonica (AB162039),
                R: GTTCTTTTTGCCACGCGCCACCTCCCCCGG                                         Pseudocydonia sinensis (AB162038), and Cydonia oblonga (AB162037)
LEAFY intron 2  F: CACCCACGACCITTYATIGTIACIGARCCIGGIGA    60       550                    Oh and Potter (2003)
                R: CCTGCCIACRTARTGICKCATYTTIGGYTT
PEPC            F: CCGKCTTGCWACACCWGAGCTGGAG              58       700                    Eriobotrya japonica (EF523436)
                R: CCRGGWGCRTACTCGC                                                       and Prunus persica (AJ243415)
PISTILLATA      F1: CARAGAAAATTGGGTAGGGGAAAGGTCGAGAT                                      Malus domestica (AJ291490)
                F2: CYCAGTACTACCARCAAGAAGC                62       1200                   and Prunus persica (AY773012)
                R: GTACTGATGATTGGGTTGTAAYGCRTTCACTTG
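Several of the nuclear primers in Table 3.2 contain ambiguity codes (R, Y, K, W) or inosine (I), so each written primer stands for a pool of sequences. The sketch below counts the size of that pool for two primers copied from the table. It uses the standard IUPAC nucleotide codes; treating inosine as matching any base is an assumption made only for this illustration, and the function name `degeneracy` is hypothetical.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
# Assumption for this sketch: inosine (I) is counted as "any base".
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of unambiguous sequences a primer represents."""
    return prod(len(IUPAC[base]) for base in primer.upper())


# Primer sequences copied from Table 3.2.
pepc_r = "CCRGGWGCRTACTCGC"          # PEPC reverse primer
pist_f2 = "CYCAGTACTACCARCAAGAAGC"   # PISTILLATA forward primer F2

print("PEPC-R degeneracy:", degeneracy(pepc_r))
print("PIST-F2 degeneracy:", degeneracy(pist_f2))
```

A primer without ambiguity codes, such as the trnG-trnS forward primer, has a degeneracy of one under this count.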

Table 3.3 Estimates of sequence divergence among Crataegus species with respect to the four major geographical areas (ENA, WNA, EA, and EUR) for the chloroplast and nuclear regions. Asterisks indicate parsimony results estimated from sequences without LEAFY intron 1, PISTILLATA, and PEPC gene regions. PI = parsimony informative characters; MPTs = most parsimonious trees; CI = consistency index; RI = retention index; RC = rescaled consistency index.

                                      Combined cpDNA  ITS          LEAFY2       LEAFY1       PISTILLATA   PEPC         Total
Aligned length (bp)                   2487            675          697          1232         1243         683          3859* / 7012
No. of taxa                           79              79           79           52           52           52           79* / 52
No. of sequences                      194             234          204          149          118          171          190* / 107
Divergence among taxa of EA           0-0.85%         0-5.96%      0.23-3.79%   0.78-1.65%   0.44-2.39%   0.63-1.73%   0.33-3.30%*
Divergence among taxa of ENA          0-2.3%          0.16-3.10%   0-1.74%      0.57-1.92%   0-2.04%      0.78-3.50%   0.58-2.20%*
Divergence among taxa of WNA          0.41-1.72%      0.48-4.25%   0.24-5.97%   0.29-1.97%   0.44-1.32%   0.63-2.86%   0.87-3.13%*
Divergence among taxa of EUR/EA       0.05-0.53%      0-6.86%      0.16-2.16%   -            -            -            0.57-4.28%*
Divergence among all Crataegus taxa   0-2.67%         0-7.79%      0-4.92%      0.23-4.20%   0.18-11.34%  0.78-4.65%   0.05-5.49%*
No. of PI                             142             227          321          223          336          149          602* / 864
No. of MPTs                           >50,000         >50,000      >50,000      >50,000      >50,000      >50,000      >50,000* / >50,000
Tree length                           476             836          589          521          699          476          1282* / 2545
CI                                    0.721           0.702        0.857        0.716        0.764        0.502        0.738* / 0.715
RI                                    0.849           0.887        0.92         0.899        0.975        0.841        0.894* / 0.881
RC                                    0.612           0.623        0.789        0.644        0.745        0.422        0.571* / 0.630
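The rescaled consistency index reported in Table 3.3 is conventionally the product of the consistency and retention indices, RC = CI x RI. As a minimal sketch, the snippet below recomputes RC for the first three columns of the table from the tabulated CI and RI. The function name is hypothetical, and because the inputs are already rounded to three decimals, the recomputed values are expected to match the table only to within rounding.

```python
def rescaled_consistency(ci: float, ri: float) -> float:
    """Rescaled consistency index as the product CI * RI."""
    return ci * ri


# (CI, RI, tabulated RC) copied from Table 3.3.
table_3_3 = {
    "Combined cpDNA": (0.721, 0.849, 0.612),
    "ITS": (0.702, 0.887, 0.623),
    "LEAFY2": (0.857, 0.92, 0.789),
}

for region, (ci, ri, rc_reported) in table_3_3.items():
    rc = rescaled_consistency(ci, ri)
    # Show the recomputed value next to the tabulated one.
    print(f"{region}: recomputed RC = {rc:.3f}, tabulated RC = {rc_reported}")
```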

Table 3.4 DIVA results indicating the most recent common areas (MRCA) obtained for the root of Crataegus and potential number of dispersal events computed by optimal reconstruction under default settings and maximum areas constrained from 5 to 2 in each of the iterations. The five biogeographic areas represented in our samples are as stated below. Only one solution is provided under all constraint tests except when the maximum areas are constrained to three, in which case four alternative solutions are provided.

Maximum areas   MRCA for the root of Crataegus                  No. of events
5               ENA,WNA,EA,EUR                                  4
4               ENA,WNA,EA,EUR                                  4
3               ENA,EUR; ENA,WNA,EUR; ENA,EA,EUR; WNA,EA,EUR    5
2               ENA,EUR                                         5

Notes: Eastern North America (ENA); Western North America (WNA); East Asia (EA); Europe (EUR); Central America (CAM)

Table 3.5 Age estimates and their confidence intervals for relevant nodes in the ML phylogeny of Crataegus and outgroup genera based on the penalized likelihood (PL) and non-parametric rate smoothing (NPRS) methods implemented in the program r8s. Numbers in brackets are confidence intervals estimated from 100 bootstrap replicates. EA1 and EA2 represent East Asian species of {C. hupehensis-C. pinnatifida} and section Sanguineae, respectively.

Estimated age (mya) of the most recent common ancestor of   PL           NPRS
C. brachyacantha (ENA) and the rest of Crataegus            16.5 ± 5.1   18.8 ± 3.1
EUR and {ENA{EA-WNA}}                                       14.3 ± 3.9   17.5 ± 3.8
EUR and EA1                                                 6.5 ± 2.2    10.9 ± 4.3
ENA and {EA-WNA}                                            9.9 ± 3.6    13.6 ± 4.7
EA2 and WNA                                                 4.6 ± 1.9    7.9 ± 5.1
ENA species                                                 5.2 ± 2.4    10.3 ± 4.6

Table 3.6 Estimate of number of changes and homoplasy of morphological characters on molecular tree with 52 taxa included.

Character label  Morphological characters        No. of changes  CI    RI    RC
LLB              Leaf lobing                     4               0.50  0.92  0.46
LVM              Leaf veins-to-margin            6               0.17  0.64  0.11
LLV              Lower leaf surface vestiture    10              0.10  0.61  0.06
LUV              Upper leaf surface vestiture    10              0.10  0.55  0.06
BRV              Branch vestiture                8               0.13  0.65  0.08
NUS              Nutlet surface                  5               0.20  0.83  0.17
STN              Stamen number                   10              0.20  0.27  0.05
PLL              Ploidy level                    13              0.15  0.52  0.08
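For a single character, the consistency index in Table 3.6 is conventionally the minimum possible number of changes divided by the number of changes observed on the tree. The sketch below applies that ratio to three characters from the table. The minimum is taken here as the number of states minus one: three states for leaf lobing (unlobed, slightly lobed, deeply lobed) and for ploidy level (2x, 3x, 4x), and two states for nutlet surface. These state counts are assumptions for illustration; the actual coding is given in Appendix 5.

```python
def consistency_index(min_changes: int, observed_changes: int) -> float:
    """Per-character consistency index: minimum changes / observed changes."""
    return min_changes / observed_changes


# label: (assumed number of states, observed changes from Table 3.6)
characters = {
    "LLB": (3, 4),   # leaf lobing
    "NUS": (2, 5),   # nutlet surface (binary coding is an assumption)
    "PLL": (3, 13),  # ploidy level
}

for label, (n_states, observed) in characters.items():
    # Assumed minimum: one change per additional state.
    ci = consistency_index(n_states - 1, observed)
    print(f"{label}: CI = {ci:.2f}")
```

Under these assumed state counts the ratios round to the CI values tabulated for LLB, NUS, and PLL.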

Figure 3.1 Strict consensus tree from maximum parsimony (MP) analyses of the combined trnG-trnS, psbA-trnH, trnH-rpl2, and rpl20-rps12 chloroplast data. Species of Amelanchier, Malus, and Aronia were used as outgroups. Bars indicate the biogeographic distribution of the examined taxa. EA: East Asia; WNA: western North America; ENA: eastern North America; CAM: Central America; and EUR: Europe. Sections, series, species, and accession numbers of the examined individuals are shown in separate columns, and details can be found in Appendix 4. The three clades are labeled as A (taxa of series Cerrones and Douglasianae from WNA, and section Sanguineae from EA); B (taxa of ENA sections); and C (taxa of sections Crataegus and Hupehenses from EUR and EA). Nodes with bootstrap values (BS; above branches) and posterior probabilities (PP; below branches) >50% are indicated. Asterisked branches indicate the three ENA taxa showing conflicting positions in the nuclear trees (Fig. 3.2).

[Figure 3.1: tree image. Tip labels give region, series, section, species, and accessions of individuals for clades A-C and the outgroups.]

Figure 3.2 Strict consensus maximum parsimony tree from the combined analysis of ITS and LEAFY second intron data. Biogeographic regions, sections, series, species, and accessions for all examined individuals are listed on the right. Taxa in clades A-C are the same as in Fig. 3.1, except that the three taxa C. marshallii, C. spathulata, and C. phaenopyrum, which show different positions in the chloroplast tree (Fig. 3.1), form an additional well-supported clade labeled as D. Nodes with bootstrap values (BS; above branches) and posterior probabilities (PP; below branches) >50% are indicated. The unrooted phylogram in the top left corner shows the branch lengths of the four clades found in the cladogram.

[Figure 3.2: tree image with an inset unrooted phylogram (scale bar: 5 changes). Tip labels give region, series, section, species, and accessions of individuals for clades A-D and the outgroups.]

Figure 3.3 Maximum likelihood (ML) tree based on combined nuclear and chloroplast data using the TVM model with lnL: -6867.76, base frequencies A = 0.3019, C = 0.1798, T = 0.3212, and G = 0.1970, proportion of invariable sites = 0.5747, and gamma shape = 0.6453. Crataegus marshallii, C. phaenopyrum, and C. spathulata were omitted from the analyses because of their conflicting positions in the chloroplast (Fig. 3.1) and nuclear trees (Fig. 3.2). Biogeographic regions represented by taxa in each clade are indicated on the left and major sections/series are indicated on the right. Ploidy level is labeled for each taxon as diploid (2x), triploid (3x), or tetraploid (4x) according to Talent and Dickinson (2005). Bootstrap (BS; above branch) and posterior probability (PP; below branch) values >50% are indicated.

[Figure 3.3: tree image (scale bar: 10 changes) with ploidy symbols for 2x, 3x, and 4x. Clade labels: East Asia (A1), Sanguineae; Western North America (A2), Douglasianae and Cerrones; Eastern North America (B); Europe and East Asia (C), Crataegus and Hupehenses; Brachyacanthae; Mespilus.]

Figure 3.4 (A) Strict consensus of 895 equally parsimonious trees from the maximum parsimony (MP) analysis of the eastern North American (ENA), western North American (WNA), and East Asian (EA) taxa using the combination of four chloroplast (trnG-trnS, psbA-trnH, trnH-rpl2, and rpl20-rps12) and five nuclear (ITS1-5.8S-ITS2, LEAFY intron 1 and 2, partial PEPC and PISTILLATA) regions. Malus angustifolia was used for rooting. Ploidy level, sections, series, species, and accessions of the examined individuals are indicated on the right. Branches in gray denote taxa of section Aestivales that are united as a monophyletic group, although with <50% BS values. (B) Phylogram generated by maximum likelihood analysis showing shorter internal branches within the ENA than WNA-EA clades. Nodes with bootstrap (BS) values >50% are indicated.

[Figure 3.4: (A) strict consensus tree with ploidy level (2x, 3x, 4x), section, series, species, and accessions of individuals; (B) maximum likelihood phylogram (scale bar: 0.005 substitutions/site). Clades EA (A1), WNA (A2), and ENA (B) are marked in both panels.]

Figure 3.5 Biogeographic model for Crataegus based on molecular phylogenies and Dispersal-Vicariance (DIVA) results. DIVA analyses were performed on the combined chloroplast and nuclear tree (Fig. 3.3), and species were assigned to five biogeographic areas, as indicated below the graph. Potential dispersal and vicariance events are indicated as bullets. The most recent common areas (MRCA) for each node as well as the root of Crataegus are indicated at the internodes. The proportional likelihood values of ancestral areas inferred by the maximum likelihood method implemented in Mesquite 2.01 are also indicated as pie charts at each node of the tree (right). Species of clades A, B, C, and D can be found in Fig. 3.2. The dashed line denotes the potential parental lineage of hybrid taxa in clade D, which may be extinct. Dotted lines denote potential hybridization between the North American species (A2 and B) and ancestors of the currently extinct European lineage. Putative hybrid species are labeled with stars. Cb and Cg denote C. brachyacantha and C. germanica, respectively.

[Figure 3.5: diagram of the biogeographic model with area labels ENA, WNA, EA, EUR, and CAM.]

Figure 3.6 Mapping of morphological characters and ploidy level using MacClade 4.0 (Maddison and Maddison, 1992) based on the combined chloroplast and nuclear tree (Fig. 3.3). Variation of leaf lobing among species is indicated by branches on the left and the other characters are indicated by cells on the right. Morphological characters were scored from herbarium specimens and treated as either binary or multistate, as indicated in Appendix 5. Almost all of these characters are variable among the eastern North American taxa with respect to different taxonomic sections in clade B and are highly homoplastic (Table 3.6). The consistency index (CI), retention index (RI), and rescaled consistency index (RC) for each character can be found in Table 3.6.

[Figure 3.6: tree image with a character matrix. Tip labels give region, section, and species; matrix columns are LVM, LLV, LUV, BRV, NUS, STN, and PLL. Branch shading shows lobing of short shoot leaves (LLB, unordered): unlobed, slightly lobed, deeply lobed with veins to sinuses, or polymorphic.]

Supplementary figure 3A. Bayesian tree of the chloroplast data using the GTR+I+G model with base frequencies A = 0.3484, C = 0.1196, T = 0.3999, and G = 0.1321, proportion of invariable sites = 0.6437, and gamma shape = 0.4185. Crataegus marshallii, C. phaenopyrum, and C. spathulata are labeled with stars. Biogeographic areas are indicated by arrows. The three clades (A-C) identified in parsimony analysis but with weak bootstrap support (Fig. 3.1) are found. Posterior probability (PP) values >50% are indicated above branches.

[Supplementary figure 3A: Bayesian chloroplast tree (panel title: CP-Bayesian; scale bar: 0.1) with clades A-C and areas EA, WNA, ENA, and EUR/EA marked.]

Supplementary figure 3B. Maximum likelihood tree of the chloroplast data using the GTR+I+G model with lnL: -5437.68, base frequencies A = 0.3484, C = 0.1196, T = 0.3999, and G = 0.1321, proportion of invariable sites = 0.6437, and gamma shape = 0.4185. Crataegus marshallii, C. phaenopyrum, and C. spathulata are labeled with stars. Biogeographic areas are indicated by arrows. The three clades (A-C) identified in parsimony analysis but with weak bootstrap support (Fig. 3.1) are found.

[Supplementary figure 3B: maximum likelihood chloroplast tree (panel title: CP-ML; scale bar: 5 changes) with clades A-C marked.]

Supplementary figure 3C. Maximum parsimony phylogram showing branch length of diploid taxa only based on combined chloroplast and nuclear regions (3859 bp).

[Supplementary figure 3C: maximum parsimony phylogram (panel title: Combined CPNR-Diploids only; scale bar: 10 changes) with clades A-C marked.]

Chapter 4

Cytotype diversity, heterogeneity in reproductive system, and climatic correlates of distribution of the Pacific Northwest hawthorns (Crataegus section Douglasianae; Rosaceae).

Abstract. In woody perennials, reproductive systems and abiotic factors relating to cytotype distribution are poorly understood. The present study encompasses a wide geographical sampling of the western North American black-fruited Crataegus, with the major goals of identifying ploidy levels and reproductive systems and of elucidating climatic factors that relate to cytotype distribution. Leaf and seed samples of about 300 individuals were included from 47 localities for which climatic data were available. Flow cytometry was used to determine nuclear DNA content of leaf, embryo, and endosperm tissues. Our findings indicate that plants from the Colorado Plateau and adjacent regions are either diploids or tetraploids, while plants along the Pacific coast and across the Rocky Mountains vary from diploids, through triploids, to tetraploids. All diploid plants display the usual ratio of endosperm to embryo DNA content (ratio of 1.5) indicating sexual reproduction, while triploid and tetraploid seeds reveal considerably higher endosperm DNA content (ratio of 2.5-3), indicative of unreduced megagametophytes and agamospermy. Significant differences were found in elevation, temperature, rainfall, and relative humidity between diploid and polyploid sites. We infer that heterogeneity in ploidy level and reproductive strategy may contribute to the expansion of climatic tolerance of Crataegus in the Pacific Northwest.

Keywords: Climatic factors; cytotypes; elevation; flow cytometry; gametophytic apomixis; geographical segregation; nuclear DNA content; Pacific Northwest; Rosaceae.

4.1. Introduction

Polyploidization is one of the processes leading to adaptation and speciation (Levin 1975). About 70% of flowering plant species are believed to be polyploids (Soltis & Soltis 1999; Otto & Whitton 2000) and the chromosome multiplying events are often associated with changes in ecological habitat, geographical distribution, and reproductive mode (Allem 2003; Jakob et al. 2004; Baack 2004; Van Dijk 2003; Van Dijk & Vijverberg 2005; Hörandl 2006). Apomixis, the asexual formation of seeds, has been known to occur in many plant families (Grant 1981; Nogler 1984; Asker & Jerling 1992). In Rosaceae, gametophytic apomixis is well documented in the subtribe Pyrinae of the subfamily Spiraeoideae, and apomictic individuals are usually polyploids in genera such as Amelanchier, Crataegus, Malus, and Sorbus (Muniyamma & Phipps 1979a; Dickinson et al. 2007).

Crataegus is one of the largest genera in the subtribe Pyrinae (formerly the subfamily Maloideae; Campbell et al. 2007). It includes deciduous shrubs or small trees of a temperate distribution in the Northern Hemisphere and there are around 100 species described for North America. The base chromosome number in the subtribe is n = 17 (Muniyamma & Phipps 1979a). A flow cytometry survey (Talent & Dickinson 2005) estimated that over 60% of the North American species include polyploid individuals, corroborating earlier chromosome counts (Longley 1924; Gladkova 1968; Muniyamma & Phipps 1979a; Dickinson & Phipps 1986; Dickinson et al. 1996). The black-fruited species of section Douglasianae have a wide western North American distribution and were previously reported to contain diploid and polyploid individuals (Brunsfeld & Johnson 1990; Dickinson et al. 1997; Talent & Dickinson 2005). Diploids, in general, show a fairly limited distribution range relative to polyploids. However, the reproductive system and factors relating to distribution pattern between cytotypes are unclear.

Early embryological studies provided some evidence for the occurrence of apomixis in Crataegus, but were based on very limited sampling (Muniyamma & Phipps 1979b, 1984). Also, previous surveys of ploidy levels in Crataegus (Longley 1924; Muniyamma & Phipps 1979a, b; Talent & Dickinson 2005) were not at the intraspecific level and fine geographical scales.

Abiotic factors that are potentially associated with polyploidy and reproductive mode changes in natural populations are not well understood. Studies in herbaceous species of the Ranunculaceae (Baack 2004), Brassicaceae (Husband & Schemske 1998), Poaceae (Jakob et al. 2004), and Asteraceae (Verduijn et al. 2004) have shown correlations between ploidy level and elevation, latitude, as well as edaphic characters. However, very few woody plants have been investigated in depth regarding the relationships among the distribution pattern of cytotypes, reproductive systems, and the underlying climatic attributes. In the western U.S., the Cascade Range uplift coupled with the atmospheric circulation patterns over the northern Pacific Ocean has played an important role in shaping the climate, with a gradual progression from wet to dry conditions west to east. Such climate differences in turn drive the diverse ecosystems and have significant ecological effects on species distribution, habitat availability, and disturbance regimes between the west and east of the Cascades (Walther et al. 2002; McKenzie et al. 2003; Parmesan & Yohe 2003; Parmesan 2006; Brunsfeld et al. 2007). The present study aims to evaluate differences in ploidy level and reproductive mode within the Douglasianae complexes and further investigate potential associations with regional climatic variation.

Measuring nuclear DNA content by flow cytometry has been employed in many different plant groups to deduce their ploidy levels (Bennett & Leitch 1995; Bennett et al. 2000; Lafuma et al. 2003; Dart et al. 2004; Jakob et al. 2004; Talent & Dickinson 2005; Mandakova et al. 2006). Moreover, the comparative DNA content in the embryo and endosperm tissues of seeds can be used with flow cytometric analyses to interpret reproductive pathways (Matzk et al. 2000), and this has been shown to work for Crataegus (Talent & Dickinson 2007). Sexual reproduction and gametophytic apomixis are the two major reproductive pathways in Crataegus (Talent & Dickinson 2007). Sexual reproduction involves meiotically reduced micro- and megagametophytes and double fertilization (Maheshwari 1950), while gametophytic apomixis involves unreduced megagametophytes and a single fertilization of the binucleate central cell of the megagametophyte (Czapik 1996; Nogler 1984). In unreduced megagametophytes the egg cell develops parthenogenetically into the embryo, whereas in the central cell a sperm (from pollen) is apparently always required to fertilize the two polar nuclei for the endosperm to develop (Talent & Dickinson 2007). Depending on whether the megagametophytes are reduced or unreduced and whether the polar cells are fertilized or not, different DNA content (ploidy levels) usually results in the endosperms of the seeds, even if the embryos of all seeds have the same ploidy levels (see Fig. 4.1). Sexual plants are expected to show a 3:2 DNA content ratio of endosperm to embryo resulting from fusion of reduced gametes, whereas apomictic plants show a ratio of at least 2.5 to 3 (or higher, if extra central-cell nuclei exist) because of the fusion of unreduced female gametes and either one or two sperm (Fig. 4.1; Talent & Dickinson 2007). These differences in DNA contents between embryo and endosperm nuclei are used here to distinguish the reproductive pathways by which individual Crataegus seeds were formed.
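The expected ratios above follow from simple arithmetic on genome copies. A minimal sketch in Python is given here for illustration only. It assumes that DNA content scales linearly with the number of monoploid genome copies (x) and that the central cell carries two polar nuclei; the function name and its arguments are hypothetical and are not part of the flow cytometry protocol.

```python
from fractions import Fraction


def expected_seed_ploidy(mother_x, reduced_female, sperm_x, n_sperm_endosperm=1):
    """Return (embryo_x, endosperm_x, endosperm:embryo ratio) for one seed.

    mother_x          -- ploidy of the seed parent (e.g. 2, 3, 4)
    reduced_female    -- True for a meiotically reduced megagametophyte (sexual),
                         False for an unreduced one (gametophytic apomixis)
    sperm_x           -- ploidy of each sperm nucleus
    n_sperm_endosperm -- number of sperm fusing with the central cell (1 or 2)
    """
    # Each female nucleus carries half the maternal genome if reduced, all of it if not.
    female = Fraction(mother_x, 2) if reduced_female else Fraction(mother_x)
    # Sexual pathway: the egg is fertilized. Apomictic pathway: parthenogenetic embryo.
    embryo = female + sperm_x if reduced_female else female
    # Endosperm: two polar nuclei plus the sperm that fertilize the central cell.
    endosperm = 2 * female + n_sperm_endosperm * sperm_x
    return embryo, endosperm, endosperm / embryo


# Sexual diploid: 2x embryo, 3x endosperm, ratio 3/2.
sexual_2x = expected_seed_ploidy(2, True, sperm_x=1)
# Apomictic tetraploid, one reduced (2x) sperm: 4x embryo, 10x endosperm, ratio 5/2.
apomict_4x_one = expected_seed_ploidy(4, False, sperm_x=2)
# Apomictic tetraploid, two reduced (2x) sperm: 4x embryo, 12x endosperm, ratio 3.
apomict_4x_two = expected_seed_ploidy(4, False, sperm_x=2, n_sperm_endosperm=2)
```

Exact fractions are used so that ratios such as 3/2 and 5/2 are not blurred by floating-point rounding; measured ratios will of course scatter around these values.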

As part of a broader investigation into polyploid evolution in the black-fruited hawthorns, we focused here on the three following questions: (1) How are diploid and polyploid trees distributed on a regional scale? (2) How are seeds produced within individuals of different ploidy level? (3) Is there a correlation between climate variables and cytotype distribution?

4.2. Materials and methods

4.2.1. Taxon identification

Our studied taxa represent two main series, Douglasianae and Cerrones, of section Douglasianae, found in the Pacific Northwest. Trees of these taxa occur from the Rocky Mountains west to the Pacific, from New Mexico north to and the panhandle (Fig. 2 in Dickinson et al. in press). Only C. douglasii can be found further east, in the Cypress Hills of southern Alberta and Saskatchewan, and in the upper Great Lakes basin (Brunsfeld & Johnson 1990; Dickinson et al. 1996; Phipps 1999). Taxa of series Douglasianae are diagnosed by fruits that are black in autumn. Crataegus douglasii is unambiguously distinguished from C. suksdorfii by stamen number (10 versus 20 stamens) and has a far wider range of distribution than C. suksdorfii. Crataegus castlegarensis, C. okennonii, C. phippsii, and C. shuswapensis are morphologically similar to C. douglasii and are considered as C. douglasii segregates in the present study. Taxa of series Cerrones are morphologically distinct from those of series Douglasianae in characters such as narrower and glossier leaves, straighter thorns, and coppery bark on branches (Phipps 1999). Within series Cerrones, C. saligna is distinguished from C. rivularis by stamen number (20 versus 10) and by smaller leaf, flower, and fruit size. The former species has a fairly limited distribution in northeastern Utah and western Colorado, while the latter has a much wider distribution. resembles C. rivularis but without such a wide range and with more lobed leaves and reddish fruits (Phipps 1998).

4.2.2. Sampling sites and plant materials

A total of 316 individuals of C. suksdorfii, C. saligna, C. rivularis, and C. douglasii including its segregates were collected from 47 localities mainly in western North America with three additional sites for C. douglasii in Ontario, Canada (Fig. 4.2). Mature leaves and fruits were collected in August and September 2004-2006, and the number of individuals collected depended on the population size per site (Table 4.1). Studied trees were tagged and vouchers were deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT). Because of the failure in fruit production and/or delayed fruiting time in some populations, fruits were collected in only 27 of our 47 studied sites (Table 4.1). With 3-4 seeds per tree and 2-12 trees (depending on the population size) in each locality, a total of 283 and 186 seeds were investigated for C. douglasii (including its segregates) and C. suksdorfii, respectively. However, due to limited sampling, only 5 and 16 seeds were examined for C. saligna and C. rivularis, respectively.

4.2.3. Ploidy level estimation and data analysis

Leaves and fruits were kept at 4°C after collection. Leaves were processed within a week, whereas fruits can be stored for 2-3 months without deterioration in seed quality. For most samples, seeds dissected from the fruits were analysed as a whole (i.e., embryo and endosperm together). The preparation of nuclear suspensions from leaf and seed materials, and the determination of DNA content followed the protocol by Talent and Dickinson (2005, 2007), using a FACSCalibur flow cytometer (Becton-Dickinson) equipped with an argon laser and detector (585 nm wavelength) for fluorescence of propidium iodide-stained samples. Nuclear DNA content was estimated as the ratio between the fluorescence of the Crataegus samples and the standard Pisum sativum with a 2C value of 9.56 pg DNA per Pisum nucleus (Johnston et al. 1999). Because the measurement of some Crataegus seed endosperm can overlap with that of the Pisum when both are mixed, Pisum was used as an external standard. Histograms showing fluorescence height of the selected particles were used for statistical analyses.

Apart from the mean 2C values (μ), the adjusted variance (σ²) in each measurement was calculated following the methods of Dickson et al. (1992). Measurements with a ratio of variance to mean (σ²/μ) greater than 15% were discarded because they might represent inaccurate estimation. Ploidy levels of leaf, embryo, and endosperm tissues were readily distinguished using the Pisum C-value range published in Talent and Dickinson (2005, 2007).

The nuclear DNA content ratio of endosperm and embryo tissues (End:Emb) was used to determine the type of reproductive system that occurs in an individual, as illustrated in Fig. 4.1.
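The estimation and screening steps described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the function names and the peak values are hypothetical, the adjusted variance is taken as an input rather than computed by the method of Dickson et al. (1992), and the 15% cutoff is applied to the variance-to-mean ratio as worded above.

```python
PISUM_2C_PG = 9.56  # 2C value of the Pisum sativum standard (pg), as given in the text


def estimate_2c_pg(sample_peak, pisum_peak, standard_2c_pg=PISUM_2C_PG):
    """Convert a fluorescence ratio (sample / external standard) into pg per 2C."""
    if sample_peak <= 0 or pisum_peak <= 0:
        raise ValueError("fluorescence peaks must be positive")
    return sample_peak / pisum_peak * standard_2c_pg


def passes_quality_check(mean_2c, adjusted_variance, max_ratio=0.15):
    """Keep a measurement only if variance / mean does not exceed the cutoff."""
    return adjusted_variance / mean_2c <= max_ratio


def endosperm_embryo_ratio(endosperm_2c, embryo_2c):
    """End:Emb ratio used to infer the reproductive pathway of a seed."""
    return endosperm_2c / embryo_2c


# Hypothetical peak positions (arbitrary fluorescence units), for illustration only.
leaf_2c = estimate_2c_pg(sample_peak=63.0, pisum_peak=200.0)
keep = passes_quality_check(mean_2c=leaf_2c, adjusted_variance=0.2)
```

Keeping the standard's 2C value as a named constant makes it straightforward to substitute a different reference species if one were used.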

4.2.4. Statistical tests of distribution and climate data

Because C. douglasii (and its segregates) and C. suksdorfii were more densely sampled than other taxa in the present study, these individuals were used to test whether there are significant relationships between ploidy level and longitude, latitude, elevation as well as climate variables. In addition, C. douglasii documented from two other Washington sites (Whitman Co.: 46.67°N, 117.01°W and Okanogan Co.: 48.27°N, 119.73°W) in Talent and Dickinson (2005) were included in the statistical analyses. Data of seven climate variables (monthly values) including precipitation, temperature, relative humidity, sunshine hours, wind run, days-with-frost, and daily temperature range (DTR) were obtained using the coordinates of the collection sites with the CAWQuer climate database (http://www.iwmi.cgiar.org/WAtlas/AtlasQuery.htm). Linear regression analyses and ANOVA (one way) were conducted with XLSTAT version 2007.4 to determine the correlations and level of significance between the leaf DNA content of individuals (as dependent variable) and elevation as well as climate data (as independent variables), respectively. Where significant F-ratios were calculated by ANOVA, Tukey's Honestly Significant Difference (HSD) test was applied to identify which data sets were different.
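For readers unfamiliar with the test, the F-ratio of a one-way ANOVA can be written out directly. The sketch below is a plain-Python illustration of the textbook calculation, not a reproduction of the XLSTAT analysis; the function name and the example values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)                      # number of groups, e.g. 2x, 3x, 4x sites
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical site values of one climate variable, grouped by cytotype.
f_ratio, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

With three cytotype groups the between-group degrees of freedom are always 2, and the within-group degrees of freedom equal the number of sites minus 3.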

Principal component analysis (PCA) was also performed with monthly values of all climate factors (84 variables in total), each rescaled to its own range by the formula (X1 - X1min)/(X1max - X1min), where X1 denotes any one value of variable 1 and X1max and X1min denote the maximum and minimum values of variable 1. By doing so, the total climate variation among sites is not biased by variables, e.g., precipitation (0.11-147.36 mm), that show a larger range of differences than others, e.g., wind speed (2.6-5 km/hr). The resulting plots were used to describe the total climate variation among cytotypes of our studied sites.
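The rescaling step can be illustrated as follows. This is a minimal sketch of min-max scaling only (the PCA itself is not shown); the function name is hypothetical, and the handling of a constant variable is an assumption that the text does not address.

```python
def min_max_scale(values):
    """Rescale one variable to the interval [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable carries no variation, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# The two endpoints come from the ranges quoted in the text; the middle values are made up.
precipitation_mm = [0.11, 60.0, 147.36]
wind_speed_kmh = [2.6, 3.8, 5.0]

scaled_precipitation = min_max_scale(precipitation_mm)
scaled_wind = min_max_scale(wind_speed_kmh)
```

After scaling, both variables span 0 to 1, so precipitation no longer dominates the analysis simply because its raw range is wider.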

4.3. Results

4.3.1. Leaf ploidy level variation and distribution

The leaf nuclear DNA content estimates of the Douglasianae species with respect to geographical localities are shown in Fig. 4.3. The DNA measurements corresponded well to the ploidy level categories derived by Talent and Dickinson (2005): diploid (1.37-1.67 pg), triploid (2.05-2.50 pg), and tetraploid (2.74-3.34 pg). A summary of estimates for each topodeme is shown in Table 4.2. Of the 47 sites in the east and west sides of the Cascades, 14 sites showed only diploids, 2 sites showed only triploids, and 28 sites showed only tetraploids in our samples. Three sites in Washington and Idaho (Fig. 4.2) showed a mixture of both triploid and tetraploid plants, which were identified respectively as C. suksdorfii and C. douglasii. The two cytotypes intersperse evenly within these sites without local segregation. No mixture of diploids and triploids, as documented in Brunsfeld and Johnson (1990), was observed in our studied sites.
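The assignment of a measurement to a ploidy category amounts to a range lookup. The sketch below uses the three 2C ranges quoted above; the function names are hypothetical, and returning None for values that fall between ranges is an assumption about how ambiguous measurements might be flagged.

```python
# 2C ranges (pg) for each ploidy category, as quoted from Talent and Dickinson (2005).
PLOIDY_RANGES_PG = {
    "2x": (1.37, 1.67),
    "3x": (2.05, 2.50),
    "4x": (2.74, 3.34),
}


def classify_ploidy(two_c_pg):
    """Return '2x', '3x', or '4x', or None if the value falls outside every range."""
    for label, (low, high) in PLOIDY_RANGES_PG.items():
        if low <= two_c_pg <= high:
            return label
    return None


def site_cytotypes(leaf_values_pg):
    """Sorted list of the distinct cytotypes detected among the trees of one site."""
    labels = {classify_ploidy(v) for v in leaf_values_pg}
    return sorted(label for label in labels if label is not None)


# Hypothetical mixed site with triploid and tetraploid trees.
mixed_site = site_cytotypes([2.23, 2.20, 3.01, 2.95])  # ['3x', '4x']

# Site tally reported in the text: diploid-only, triploid-only, tetraploid-only, mixed.
assert 14 + 2 + 28 + 3 == 47
```

The gaps between the ranges mean that a poor measurement is left unclassified rather than forced into the nearest category.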

In C. douglasii and its segregates, consistently high leaf DNA content (average 2C value = 3.01±0.12 pg, i.e., tetraploid) was detected among sites of California, Washington, Idaho, Montana, and Ontario with the larger proportion found in the western U.S. (Table 4.2; Fig. 4.3). By contrast, DNA contents of C. suksdorfii were shown to be remarkably different among sites (Table 4.2; Fig. 4.3). Of the 112 C. suksdorfii individuals from the 13 sites, 39.3% were diploids (average 2C value = 1.54±0.14 pg) that are largely restricted to the western coastal range from British Columbia, through Washington south to western and inner Oregon and northern California (Fig. 4.2). Approximately 40% were triploids (average 2C value = 2.23±0.12 pg), which can be found disjunctively in alpine meadows of interior Oregon, Washington, and west of the Continental Divide in Idaho. Nineteen percent were tetraploids (average 2C value = 2.95±0.06 pg), which occur in the eastern mountainous range of the Cascade in Montana.

Crataegus saligna and C. rivularis have similar habitat and distribution. They are more or less localized in the southeast of the Rocky Mountains at elevations around 2,000 m (Table 4.1). Although our samples covered only a portion of their complete distribution range as described in Phipps (1999), the two species were distinctly recognized as diploid (1.67±0.16 pg/2C) and tetraploid (3.08±0.31 pg/2C), respectively. No mixture of the two cytotypes was detected at our studied sites (Fig. 4.3), but herbarium specimens at COLO (Boulder, Colorado) suggest that the two species may co-occur (N. Talent, pers. obs.).

4.3.2. Seed ploidy level among cytotypes

The DNA content of both embryo and endosperm nuclei differed in the diploid, triploid, and tetraploid individuals of our studied species (Fig. 4.4), thus giving different endosperm to embryo ratios (end:emb) among cytotypes (Fig. 4.5). Of the total 437 examined seeds, the diploid C. saligna and C. suksdorfii (N = 95) contained the lowest average DNA content in both embryo (1.48±0.24 pg/2C and 1.52±0.06 pg/2C) and endosperm (2.55±0.007 pg/2C and 2.35±0.15 pg/2C) nuclei, indicative of 2x embryo and 3x endosperm throughout the diploid samples. The end:emb was calculated as 1.5-1.7 (Fig. 4.5). This ratio, as illustrated in Fig. 4.1, suggested that diploid seeds are sexually produced by double fertilization with reduced male and female gametes.

The 46 seeds obtained from triploid C. suksdorfii showed a higher DNA content in embryos and endosperms than those of the diploid individuals (Table 4.2; Fig. 4.4). Six seeds that appeared to show autonomous (i.e., 6x) endosperms were believed to be an outcome of the G2 phase embryo, and thus were excluded from the analyses. The average DNA content of embryos was 2.20±0.23 pg/2C and was readily classified into the 3x category, but the endosperms were less uniform and ranged from 5.71±0.27 pg/2C to 7.51±0.55 pg/2C, which could represent 8x, 9x, and 10x endosperms (Fig. 4.4). No discrete ploidy level was assigned because the estimated ranges of 2C-values overlapped for 8x and higher ploidy levels. The end:emb was calculated at around 3, which was twice that of the diploid seed end:emb values (Fig. 4.5). This discrepancy suggests that megagametophytes were unreduced in both polar and central nuclei and that fertilization with meiotically unreduced sperm occurred only in the central cell.

Seeds from tetraploid C. douglasii and its segregates, C. rivularis, and 4x C. suksdorfii revealed the highest DNA content in both embryos and endosperms (Table 4.2; Fig. 4.4). Embryos of the three species (2.93±0.14 pg/2C, 3.03±0.19 pg/2C, and 2.89±0.19 pg/2C) were clearly distinguished as 4x, whereas endosperms ranged from 10x-12x according to their 2C values (8.07±0.82 pg/2C, 8.32±0.66 pg/2C, and 7.91±0.74 pg/2C) and no one ploidy level could be confidently specified because of the overlapping range. The end:emb was calculated as around 2.6 (Fig. 4.5), which indicated the presence of unreduced megagametophytes. Hence, these seeds are inferred to be produced apomictically and no sexual seeds were detected.
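As a rough cross-check on the endosperm ploidy ranges given above, the mean DNA content of an endosperm can be expressed in genome copies by dividing it by the per-genome amount inferred from the embryo. The sketch below is illustrative only: the function name is hypothetical, and it assumes that embryo and endosperm nuclei carry the same amount of DNA per monoploid genome, which the mean values alone cannot confirm.

```python
def approx_endosperm_ploidy(endosperm_pg, embryo_pg, embryo_x):
    """Approximate endosperm ploidy (in x) from mean 2C values of endosperm and embryo."""
    per_genome_pg = embryo_pg / embryo_x  # DNA amount of one monoploid genome
    return endosperm_pg / per_genome_pg


# Triploid C. suksdorfii: embryo mean 2.20 pg (3x), endosperm means 5.71 and 7.51 pg.
low_3x = approx_endosperm_ploidy(5.71, 2.20, 3)   # a little under 8x
high_3x = approx_endosperm_ploidy(7.51, 2.20, 3)  # a little over 10x

# Tetraploid C. douglasii: embryo mean 2.93 pg (4x), endosperm mean 8.07 pg.
mean_4x = approx_endosperm_ploidy(8.07, 2.93, 4)  # close to 11x
```

These point estimates ignore the reported standard deviations, so they indicate only that the means are compatible with the 8x-10x and 10x-12x ranges and cannot resolve a single ploidy level.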

4.3.3. Comparisons of elevations and climatic variables among cytotypes

A summary of ANOVA comparisons of topographic and climatic variables among the 2x, 3x, and 4x cytotypes is shown in Table 4.3. Triploid and tetraploid sites were shown to be significantly different in elevation from the diploids (Table 4.3). A positive correlation was detected between leaf DNA content and elevation in the regression results, in which plants of lower 2C-values were found in lower areas (data not shown).

Regression analyses and ANOVA indicated significant correlation of four out of the seven tested climate variables with cytotype distribution (Table 4.3). They included precipitation (F2, 30 = 6.05, p = 0.006), temperature (F2, 30 = 5.11, p = 0.012), humidity (F2, 30 = 8.16, p = 0.001), and days-with-frost (F2, 30 = 9.6, p = 0.001), all of which showed negative correlations with the leaf DNA content except the days-with-frost variable, which showed a positive correlation (Fig. 4.6a-d). No significant differences were found in the remaining variables between diploid and polyploid sites (Table 4.3).

The largest amount of variation (67.28%) was obtained in the first two components of the PCA plot based on the seven climate variables. The first component (43.22%) was identified with most of the variables, whereas the second component (24.05%) was identified mainly with wind (Fig. 4.7a). Polyploid sites in general were shown to be more scattered than diploid sites, suggesting wider climatic amplitudes among polyploids (Fig. 4.7b). Most of the 2x sites from Oregon and Washington were separated from the 3x and 4x sites of California, Washington, Idaho, and Montana on the first component. In C. douglasii, sites of western Ontario appeared to have different climatic regimes from those of Washington, Idaho, and Montana.

4.4. Discussion

Geographical segregation between sexual diploids and apomictic polyploids has been reported in herbaceous perennials such as Crepis (Babcock & Stebbins 1938), Townsendia (Thompson & Whitton 2006), Taraxacum (Verduijn et al. 2004), Hieracium (Houliston et al. 2004), and Ranunculus (Paun et al. 2006; Hörandl 2006), but rarely in woody angiosperms. Moreover, the environmental attributes of such distribution patterns have not been sufficiently investigated. This study documents climatic correlates of cytotype distribution in woody plants, such as Crataegus, that can reproduce both sexually and asexually.

4.4.1. Cytotype segregation and climatic correlates of distribution

Clear differences were detected in leaf nuclear DNA content of C. suksdorfii, which contains diploid, triploid, and tetraploid individuals with respect to different geographical localities (Fig. 4.3). Plants with lower 2C values (i.e., 2x) occur further west along the coastal range in California, Oregon, Washington, and British Columbia at lower elevation (<100 m) (Table 4.2). In this study, it is apparent that diploid C. suksdorfii are spatially segregated from its tetraploids, which occur in the mountainous range of Montana, the eastern edge of C. suksdorfii distribution range (Fig. 4.1), whereas triploids found in the Cascade Mountains of Washington, Oregon, and Idaho are located between the diploid and tetraploid sites. For instance, an entirely triploid population of C. suksdorfii in Oregon is less than 200 km away from the nearest diploid relatives. Triploids in Idaho are about 50 km away from the nearest tetraploid relatives, but no triploid individuals are detected within the diploid and tetraploid sites of C. suksdorfii. The only exception to this is the case of a diploid plant found in one of the triploid sites in Idaho as documented by Brunsfeld and Johnson (1990). However, no voucher was available to verify that plant and our results did not provide evidence for this case.

Tetraploid C. douglasii and its segregates span from the west to further east across the Great Plains in western Ontario and display a much wider geographical range than C. suksdorfii. Only 3x and 4x C. suksdorfii are shown to coexist with C. douglasii (Fig. 4.1 and Table 4.1) where their trees grow next to each other without niche differentiation, but no 2x C. suksdorfii and 4x C. douglasii are found together. Because not every tree within the same site was examined, the possibility of under-sampling a different cytotype cannot be ruled out. Theories predict that competition between coexisting cytotypes within local populations often results in the exclusion of the minority or less successful cytotype (Levin 1975; Husband 1998). Absence of sympatric diploids and polyploids in our studied sites perhaps could be explained by this minority exclusion hypothesis.

Polyploids in general showed a wider climatic amplitude than diploids in our samples (Fig. 4.7b). Diploid sexual plants appear to favour those areas with higher precipitation, warmer temperature, and higher relative humidity along the Pacific coast and in the west of the Cascade Range (Table 4.3; Fig. 4.6). However, along the west-east transect (125°-100°W), when elevation increases to an average of 1,000 m in the Cascade mountains of Washington State and the Rockies at the Idaho/Montana border, and when annual and daily temperatures are much more variable and precipitation and humidity are much lower (Table 4.3; Fig. 4.6), triploids and tetraploids prevail. These individuals reproduce apomictically as evidenced from our seed data (Fig. 4.5). It is conceivable that apomixis serves as a more effective way to assure reproductive success in areas where lower temperature and precipitation result in shorter vegetative and flowering periods as well as lower pollinator availability and activity.

4.4.2. Gametophytic apomixis in triploid and tetraploid plants

The association of polyploidy with apomixis has been documented in many angiosperm species (Asker & Jerling 1992; Nogler 1984; Carman 1997) and various developmental pathways of agamospermous seed production have been investigated molecularly in the last decade (Noyes & Rieseberg 2000; Savidan 2000; Quarin et al. 2001; Koltunow & Grossniklaus 2003). Although the underlying genetic mechanism is still not fully understood, this association holds true in Douglasianae species in which diploid seeds are sexually produced whereas triploid and tetraploid seeds are produced via gametophytic apomixis.

Diploid embryos and triploid endosperms, as typically found in diploid plants, were detected in all examined seeds of 2x C. suksdorfii and C. saligna (Fig. 4.4 and Table 4.2). Female gametophytes are expected to be meiotically reduced in both central-cell and egg nuclei. Fertilizations of the binucleate central cell and the egg by two reduced sperm from a pollen microgametophyte thus lead to a 1.5 endosperm to embryo DNA content ratio (Fig. 4.5). However, such a ratio was not detected in any of the seeds from triploid and tetraploid mother plants.

All seeds from triploid plants in our samples contained 3x embryos and 8-10x endosperms. The 3x embryos are expected to develop from unreduced egg cells (♀3x) by parthenogenesis because of the difficulty of meiotic division and recombination with an odd number of chromosome sets. Absence of 4-6x embryos (♀3x + ♂1x, 2x, or 3x) indicated that the egg is not fertilized. The DNA content of endosperms ranged from 5.46±0.27 to 7.51±0.55 pg/2C (Fig. 4.4). Data points fall in the overlapping range of the 8x and 9x as well as the 9x and 10x categories. Therefore, we cannot confidently distinguish 9x endosperms, and instead specify them as either 8-9x or 9-10x. There are three possible scenarios by which 8x, 9x, and 10x endosperms can be formed. First, when egg nuclei remained unreduced, 8x endosperms could be products of fertilization between two unreduced polar nuclei (♀3x + ♀3x) and one reduced pollen nucleus from a tetraploid (♂2x). Second, if triploid pollen is viable and unreduced (♂3x), fertilization with polar nuclei will produce 9x endosperms. Third, fusion of either two tetraploid reduced pollen nuclei (♂2x + ♂2x), or one unreduced pollen nucleus from a tetraploid (♂4x), with polar nuclei can give rise to 10x endosperms. Among all scenarios, it is more likely that tetraploid pollen, regardless of whether it is reduced or unreduced, is involved in endosperm formation. It has been shown that pollen stainability of C. douglasii is significantly higher than that of C. suksdorfii in sympatric sites (Dickinson et al. 1996). Because the triploid seeds examined here were collected in sites where C. douglasii is found, it is not unreasonable to anticipate that the 4x C. douglasii is the pollen donor for successful pseudogamy. Furthermore, triploid Crataegus is usually completely or partially male-sterile (Muniyamma & Phipps 1979a, b, 1984b; Dickinson & Phipps 1986; Smith & Phipps 1988; Ptak 1989), except in a few cases in which fertile triploid pollen was reported (Smith & Phipps 1988). Although we are uncertain whether pollen of our 3x C. suksdorfii is viable and participated in pseudogamy, it is noteworthy that in sites with dominant triploid trees but without 4x C. douglasii, such as the Patterson Mountain (OR6; Table 4.1), no successful seed set has been observed in the past few years (pers. obs.). This phenomenon favours the sterile 3x pollen explanation and suggests the necessity of C. douglasii pollen for successful pseudogamy, and hence seed production, in triploid C. suksdorfii.

Pollen in tetraploids has been shown to be highly fertile (Muniyamma & Phipps 1985; Dickinson & Phipps 1986; Ptak 1986; Smith & Phipps 1988). Dickinson et al. (1996) estimated the pollen production per flower and the stainability in C. douglasii as 5,200-12,900 (mean = 8,679) and 81-96% (mean = 84.3), respectively. High ploidy-level (10-12x) endosperm found in almost all tetraploid plants of C. douglasii, C. suksdorfii, and C. rivularis clearly followed the pathway according to which the two unreduced polar nuclei (♀4x + ♀4x) fused with one (♂2x) or two reduced pollen nuclei (♂2x + ♂2x), as illustrated in Fig. 4.1. Megagametophytes are likely to be unreduced and to develop parthenogenetically into 4x embryos. In our study, no sexual seeds were detected in triploid and tetraploid plants, although Crataegus species are suggested to be facultative apomicts (Muniyamma & Phipps 1979a, 1984; Smith & Phipps 1988). This result does not come as a surprise because Talent and Dickinson (2007) also demonstrated near-obligate apomixis in tetraploid C. crus-galli and C. macracantha from their hand-pollination experiments. The reasons for obligate apomixis are unclear, but based on similar seed sets observed between 4x C. douglasii and 2x C. suksdorfii, obligate apomicts probably achieve reproductive success equivalent to that of sexual plants.

4.4.3. Conclusions

Flow cytometry data identify ploidy level differences among individuals of section Douglasianae in the Pacific Northwest. Diploids and polyploids of C. suksdorfii and C. douglasii are shown to be geographically segregated at the regional scale. Diploids are found mainly along the coastal mesic areas whereas triploids and tetraploids occur either separately or in sympatry in the Cascades and Rocky Mountains. This distribution pattern appears to be related to climate differences among sites. Moreover, diploid and polyploid individuals demonstrate heterogeneous reproductive systems. Apomictic plants are considered as effective colonizers in areas affected by human activities such as deforestation and agricultural practices (Stebbins 1985; Mitchell 1992). Here, we infer that polyploidy in combination with apomixis in C. suksdorfii and C. douglasii may contribute to adaptation in drier and cooler environments, and thus to a wider distribution range in the Pacific Northwest. Nevertheless, the natural history and population dynamics of these species, as well as other anthropogenic factors, cannot be ignored when explaining cytotype distribution. The present study sheds light on the reproductive systems and abiotic factors relating to cytotype distribution, which are poorly documented in woody angiosperms. This information allows us to postulate hypotheses of polyploid evolution in the Douglasianae complex. More questions such as the origins of 3x and 4x C. suksdorfii, the extent of gene mixing between sympatric cytotypes, and genetic structure of sexual and apomictic populations remain to be solved in further genetic studies.

4.5. Acknowledgements

The authors thank Cheryl Smith for flow cytometry advice; Rhoda Love and Peter Zika for plant collection and identification; Nan Lederer at the COLO herbarium for providing access to specimens; Jess Chung for helping with fruit dissection; Annabel Por, Jenny Bull, and Cheying Ng for helping to organize the vouchers for this study. Financial support from the Natural Sciences and Engineering Research Council of Canada (grant A3430 to TAD and 326439-06 to SS), the Botany Department of the University of Toronto, and the Royal Ontario Museum is gratefully acknowledged.

4.6. References

Allem AC 2003. Optimization theory in plant evolution: an overview of long-term evolutionary

prospects in the angiosperms. Bot. Rev. 69: 225-251.

Asker SE, Jerling L 1992. Apomixis in plants. CRC Press. Boca Raton, FL.

Babcock EB, Stebbins GLJ 1938. The American species of Crepis: Their interrelationships and

distribution as affected by polyploidy and apomixis. Carnegie Institution, Washington,

District of Columbia, U.S.A.

Baack EJ 2004. Cytotype segregation on regional and microgeographic scales in snow

buttercups (Ranunculus adoneus: Ranunculaceae). Am. J. Bot. 91: 1783-1788.

Bennett MD, Leitch IJ 1995. Nuclear DNA amounts in angiosperms. Ann. Bot. 76: 113-176.

_____, Bhandol P, Leitch IJ 2000. Nuclear DNA Amounts in Angiosperms and their Modern

Uses—807 New Estimates. Ann. Bot. 86: 859-909.

Brunsfeld SJ, Johnson FD 1990. Cytological, morphological, ecological and phenological

support for specific status of Crataegus suksdorfii (Sarg.) Kruschke. Madroño 37: 274-282.

_____, Miller TR, Carstens BC 2007. Insights into the biogeography of the Pacific Northwest of

North America: Evidence from the phylogeography of Salix melanopsis. Syst. Bot. 32:

129-139.

Campbell CS, Evans RC, Morgan DR, Dickinson TA, Arsenault MP 2007. Phylogeny of

subtribe Pyrinae (formerly the Maloideae, Rosaceae): Limited resolution of a complex

evolutionary history. Pl. Syst. Evol. 266: 119-145.

Carman JG 1997. Asynchronous expression of duplicate genes in angiosperms may cause

apomixis, bispory, tetraspory, and polyembryony. Biol. J. Linn. Soc. 61: 51-94.

Czapik R 1996. Problems of apomictic reproduction in the families Compositae and Rosaceae.

Folia Geobotanica 31: 381-387.

Dart S, Kron P, Mable BK 2004. Characterizing polyploidy in Arabidopsis lyrata using

chromosome counts and flow cytometry. Can. J. Bot. 82: 185-197.

Dickinson TA, Phipps JB 1986. Studies in Crataegus (Rosaceae: Maloideae) XIV. The breeding

system of Crataegus crus-galli sensu lato in Ontario (Canada). Am. J. Bot. 73: 116-130.

_____, Belaoussoff S, Love RM, Muniyamma M 1996. North American black-fruited

hawthorns: I. Variation in floral construction, breeding system correlates, and their possible

evolutionary significance in Crataegus sect. Douglasii Loudon. Folia Geobotanica 31:

355-371.

_____, Love RM 1997. North American black-fruited hawthorns: III. What is Douglas hawthorn?

Conservation and Management of Oregon's Native Flora. T. Kaye. Corvallis, OR, Native Plant Society of Oregon.

_____, Lo EYY, Talent N 2007. Polyploidy, reproductive biology, and Rosaceae: understanding

evolution and making classification. Pl. Syst. Evol. 266: 59-78.

Dickson EE, Arumuganathan K, Kresovic S, Doyle JJ 1992. Nuclear DNA content variation

within the Rosaceae. Am. J. Bot. 79: 1081-1086.

Fehrer J, Simek R, Krahulcova A, Krahulec F, Chrtek J, Brautigam E, Brautigam S 2005.

Evolution, hybridization, and clonal distribution of apo- and amphimictic species of

Hieracium subgen. Pilosella (Asteraceae, Lactuceae) in a Central European mountain range.

In Plant species-level systematics: New perspectives on pattern and process. Pp.175-203.

Gladkova VN 1968. Karyological studies of the genera Crataegus L. and Cotoneaster Medic.

(Maloideae) as related to their taxonomy. Botanicheskii Zhurnal 53: 1203-1273.

Grant V 1981. Plant Speciation. Ed. 2. Columbia University Press, New York.

Husband BC 1998. Constraints on polyploid evolution: A test of the minority cytotype exclusion

principle. Proc. R. Soc. Biol. Sci. London 267: 217-223.

_____, Schemske DW 1998. Cytotype distribution at a diploid–tetraploid contact zone in

Chamerion (Epilobium) angustifolium (Onagraceae). Am. J. Bot. 85: 1688-1694.

Hörandl E 2006. The complex causality of geographical parthenogenesis. New Phytol. 171:

525-538.

Houliston GJ, Chapman HM 2004. Reproductive strategy and population variability in the

facultative apomict Hieracium pilosella (Asteraceae). Am. J. Bot. 91: 37-44.

Jakob SS, Meister A, Blattner FR 2004. The considerable genome size variation of Hordeum

species (Poaceae) is linked to phylogeny, life form, ecology, and speciation rates. Mol. Biol.

Evol. 21: 860-869.

Johnston JS, Bennett MD, Rayburn AL, Galbraith DW, Price HJ 1999. Reference standards for

determination of DNA content of plant nuclei. Am. J. Bot. 86: 609-613.

Koltunow AM, Grossniklaus U 2003. Apomixis: A developmental perspective. Ann. Rev. Pl. Biol. 54: 547-574.

Lafuma L, Balkwill K, Imbert E, Verlaque R, Maurice S 2003. Ploidy level and origin of the

European invasive weed Senecio inaequidens (Asteraceae). Pl. Syst. Evol. 243: 59-72.

Levin DA 1975. Minority cytotype exclusion in local plant populations. Taxon 24: 35-43.

Longley AE 1924. Cytological studies in the genus Crataegus. Am. J. Bot. 11: 295-317.

Love R, Feigen R 1978. Interspecific hybridization between native and naturalized Crataegus

(Rosaceae) in western Oregon. Madroño 25: 211-217.

Maceira NO, Haan AAD, Lumaret R, Billon M, Delay J 1992. Production of 2n gametes in

diploid subspecies of Dactylis glomerata L. Occurrence and frequency of 2n pollen. Ann. Bot.

69: 335-343.

_____, Jacquard P, Lumaret R 1993. Competition between diploid and derivative autotetraploid

Dactylis glomerata L. from Galicia. Implications for the establishment of novel polyploid

populations. New Phytol. 124: 321-328.

McKenzie D, Peterson DW, Peterson DL, Thornton PE 2003. Climatic and biophysical controls

on conifer species distributions in mountain forests of Washington State, USA. J.

Biogeography 30: 1093-1108.

Maheshwari P 1950. An introduction to the embryology of angiosperms. London: McGraw-Hill.

Mandáková T, Münzbergová Z 2006. Distribution and ecology of cytotypes of the Aster amellus

aggregates in the Czech Republic. Ann. Bot. 98: 845-856.

Matzk F, Meister A, Schubert I 2000. An efficient screen for reproductive pathways using

mature seeds of monocots and dicots. Pl. J. 21: 97-108.

Mitchell WW 1992. Cytogeographic races of Arctagrostis latifolia. Can. J. Bot. 70: 80-83.

Muniyamma M, Phipps JB 1979a. Studies in Crataegus (Rosaceae: Maloideae). I. Cytological

proof of apomixis in Crataegus (Rosaceae). Am. J. Bot. 66: 149-155.

_____, _____ 1979b. Studies in Crataegus (Rosaceae: Maloideae). II. Meiosis and polyploidy in

Ontario species of Crataegus in relation to their systematics. Can. J. Genet. Cytol. 21:

231-241.

_____, _____ 1984. Studies in Crataegus. XI. Further cytological evidence for the occurrence

of apomixis in North American hawthorns. Can. J. Bot. 62: 2316-2324.

_____, _____ 1985. Studies in Crataegus. XII. Cytological evidence for sexuality in some

diploid and tetraploid species of North American hawthorns. Can. J. Bot. 63: 1319-1324.

Nogler GA 1984. Gametophytic apomixis. In Johri BM (ed.), Embryology of angiosperms.

Springer-Verlag, Berlin. Pp. 475-518.

Noyes RD, Rieseberg LR 2000. Two independent loci control agamospermy (apomixis) in the

triploid flowering plant Erigeron annuus. Genetics 155: 379-390.

Parmesan C 2006. Ecological and evolutionary responses to recent climate change. Ann. Rev.

Ecol. Evol. Syst. 37: 637-690.

_____, Yohe G 2003. A globally coherent fingerprint of climate change impacts across natural

systems. Nature 421: 37-42.

Paun O, Greilhuber J, Temsch EM, Hörandl E 2006. Patterns, sources and ecological

implications of clonal diversity in apomictic Ranunculus carpaticola (Ranunculus

auricomus complex, Ranunculaceae). Mol. Ecol. 15: 897-910.

Otto SP, Whitton J 2000. Polyploid incidence and evolution. Ann. Rev. Genet. 34: 401-437.

Phipps JB 1999. The relationships of the American black-fruited hawthorns Crataegus

erythropoda, C. rivularis, C. saligna, and C. brachyacantha to C. ser. Douglasianae

(Rosaceae). Sida 18: 647-660.

_____, Robertson KR, Smith PG, Rohrer JR 1990. A checklist of the subfamily Maloideae

(Rosaceae). Can. J. Bot. 68: 2209-2269.

_____, O'Kennon RJ, Lance RW 2003. Hawthorns and medlars. Timber Press, Portland OR.

Ptak K 1986. Cyto-embryological investigations on the Polish representatives of the genus

Crataegus L. I. Chromosome numbers; embryology of diploid and tetraploid species. Acta

Biologica Cracoviensia Series: Botanica 28: 107-122.

_____ 1989. Cyto-embryological investigations on the Polish representatives of the genus

Crataegus L. II. Embryology of triploid species. Acta Biologica Cracoviensia Series:

Botanica 31: 97-112, Pl. 5.

Quarin CL, Espinoza F, Martinez EJ, Pessino SC, Bovo OA 2001. A rise of ploidy level induces

the expression of apomixis in Paspalum notatum. Sex. Pl. Reprod. 13: 243-249.

Ramsey J 2007. Unreduced gametes and neopolyploids in natural populations of Achillea

borealis (Asteraceae). Heredity 98: 143-150.

Renner SS, Ricklefs RE 1995. Dioecy and its Correlates in the Flowering Plants. Am. J. Bot. 82:

596-606.

Robertson A, Newton AC, Ennos RA 2004. Breeding systems and continuing evolution in the

endemic Sorbus taxa on Arran. Heredity 93: 487-495.

Savidan Y 2000. Apomixis, genetics, and breeding. Pl. Breeding Rev. 18: 13-86.

Smith PG, Phipps JB 1988. Studies in Crataegus (Rosaceae, Maloideae), XIX. Breeding

behavior in Ontario Crataegus series Rotundifoliae. Can. J. Bot. 66: 1914-1923.

Soltis DE, Soltis PS 1999. Polyploidy: recurrent formation and genome evolution. Trends Ecol.

Evol. 14: 348-352.

Stebbins GL 1985. Polyploidy, hybridization, and the invasion of new habitats. Ann. Missouri

Bot. Gard. 72: 824-832.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae):

evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83:

1268-1304.

_____, _____ 2007. Endosperm formation in aposporous Crataegus L. (Rosaceae,

Spiraeoideae, Pyreae): Parallel to Ranunculaceae and Poaceae. New Phytol. 173: 231-249.

Thompson SL, Whitton J 2006. Patterns of recurrent evolution and geographic parthenogenesis

within apomictic polyploid Easter daisies (Townsendia hookeri). Mol. Ecol. 15: 3389-3400.

Van Dijk PJ 2003. Ecological and Evolutionary Opportunities of Apomixis: Insights from

Taraxacum and Chondrilla. Philos. Trans. Biol. Sci. 358: 1113-1121.

_____, Tas ICQ, Falque M, Bakx-Schotman T 1999. Crosses between sexual and apomictic

dandelions (Taraxacum). II. The breakdown of apomixis. Heredity 83: 715-721.

_____, Vijverberg K 2005. The significance of apomixis in the evolution of the angiosperms: a

reappraisal. In Plant species-level systematics: New perspectives on pattern and process.

Pp.101-117.

Verduijn MH, Van Dijk PJ, Van Damme JMM 2004. Distribution, phenology and demography of

sympatric sexual and asexual dandelions (Taraxacum officinale s.l.): geographic

parthenogenesis on a small scale. Biol. J. Linn. Soc. 82: 205-218.

Walther GR, Post E, Convey P, Menzel A, Parmesan C, Beebee TJC, Fromentin JM,

Hoegh-Guldberg O, Bairlein F 2002. Ecological responses to recent climate change. Nature

416: 389-395.

Table 4.1 Geographical localities, coordinates, elevations of respective sites, and total number of leaf and seed samples of C. douglasii, C. suksdorfii, C. saligna, and C. rivularis in the present study. Sites that contain the C. douglasii segregates ¹C. castlegarensis, ²C. okennonii, ³C. shuswapensis, ⁴C. phippsii, and the C. rivularis segregate ⁵C. erythropoda are indicated. Numbers of individuals with ploidy level determinations but not vouchered are indicated in parentheses.

Site label | Latitude | Longitude | Elevation (m) | State/Province; County; Locality | No. of leaf samples | No. of seeds

C. douglasii
BC1 | 50.51 | 119.10 | - | BC; Spallumcheen | 3 | 0
BC2 | 48.67 | 123.42 | - | BC; Saanich Peninsula | 1 | 0
³BC3 | 50.55 | 119.13 | 360 | BC; Enderby | 1 | 6
CAR2 | - | - | - | CA; Shasta; Dana | 1 | 5
CAR3 | 41.11 | 121.60 | 1,015 | CA; Shasta; Dusty Campground | 1 | 5
¹CAR3 | 40.97 | 121.56 | 841 | CA; Shasta; Hat Creek | 2 | 0
¹ID02 | 46.77 | 116.45 | 811 | ID; Latah; Little Boulder Creek | 16 (8) | 33 (19)
ID03 | 47.18 | 116.49 | 786 | ID; Benewah; St. Maries River, Santa Creek | 4 (2) | 9 (3)
ID06 | 44.99 | 116.19 | 1,420 | ID; Adams; Last Chance Campground, near Meadows | 20 (3) | 35
¹ID15 | 44.97 | 113.94 | 1,292 | ID; Lemhi; US 93 S of Gibbonville | 6 (3) | (5)
¹ID16 | 45.37 | 113.95 | 1,122 | ID; Lemhi; US93 N of Salmon | 5 (1) | 8 +2
ID20 | 46.52 | 116.73 | 280 | ID; Nez Perce; Little Potlatch Creek | 9 (2) | 14 (2)
MT2 | 47.07 | 112.91 | 1,356 | MT; Powell; Kleinschmidt Flat | 13 (1) | 30
ON18 | 48.45 | 89.19 | 200 | ON; Thunder Bay | 3 (2) | (10)
ON20 | 44.75 | 80.95 | 225 | ON; Grey; Big Bay, Colpoy's range | 21 (14) | 17 (9)
ON21 | 44.90 | 81.20 | 250 | ON; Bruce; Barrow Bay | 4 | 3
¹OR | 44.58 | 119.64 | 655 | OR; Grant; John Day River, South Fork | 3 | 0
¹SK | 49.62 | 109.63 | - | SK; Cypress Hills | 2 | 0
WA1 | 47.58 | 120.66 | 100 | WA; Chelan | 2 | 0
WA20 | 47.24 | 121.04 | 747 | WA; Kittitas; Cle Elum | 5 (1) | 24 (4)
¹WA21 | 46.84 | 122.98 | 64 | WA; Thurston; Mound Prairie | 22 (9) | 3 +14 (6)
²WA22 | 46.85 | 117.34 | 666 | WA; Whitman; South of Colfax1 | 5 (3) | 6 (2)
⁴WA24 | 48.79 | 119.40 | 259 | WA; Okanogan; Ellisforde | 1 | 7

C. suksdorfii
CAR5 | 41.40 | 122.84 | 871 | CA; Siskiyou; Fay Lane | 6 (1) | 35 (4)
ID02 | 46.77 | 116.45 | 811 | ID; Latah; Little Boulder Creek | 4 | 14
ID05 | 45.00 | 116.06 | 1,524 | ID; Valley; North Beach, Payette Lake | 4 (1) | 12 (5)
ID06 | 44.99 | 116.19 | 1,420 | ID; Adams; Last Chance Campground, near Meadows | 17 (5) | 18
MT2 | 47.07 | 112.91 | 1,356 | MT; Powell; Kleinschmidt Flat | 17 (6) | 40 (7)
OR01 | 44.33 | 123.12 | 88 | OR; Linn; Cogswell Foster Reserve | 7 (3) | (14)
OR02 | 44.04 | 123.15 | 119 | OR; Lane; Bertelson Rd. West Eugene | 3 | 3
OR04 | 43.53 | 122.91 | 1,250 | OR; Douglas; Elk Meadows RNA | 5 | 0
OR06 | 43.77 | 122.62 | 1,295 | OR; Lane; Patterson Mountain Prairie | 16 | 0
OR07 | 45.56 | 122.87 | 55 | OR; Washington; Hillsboro | 1 | 0
OR11 | 45.73 | 122.77 | 10 | OR; Columbia; Sauvie Island | 14 (8) | 35 (11)
WA20 | 47.24 | 121.04 | 747 | WA; Kittitas; Cle Elum | 2 (1) | 7 (3)
WA7 | 45.83 | 122.76 | 15 | WA; Clark | 8 (7) | 0

C. rivularis
NTCO15 | 38.67 | 107.64 | 2,114 | CO; Montrose; East of Cimarron | 1 | 0
NTCO17 | 37.49 | 107.57 | 2,316 | CO; Archuleta; Rte 160 near Yellow Jacket Pass | 1 | 3
NTCO18 | - | - | - | CO; Boulder (cultivated) | 1 | 0
ID13 | 42.34 | 111.21 | 1,951 | ID; Bear Lake; Montpelier Canyon | 17 (8) | (3)
NV2, NV3 | 41.01 | 115.26 | 1,794 | NV; Elko; Lamoille Valley | 3 | 0
NM2 | 36.96 | 106.82 | 2,369 | NM; Rio Arriba; US84, Chama | 2 | 0
UT2 | 40.40 | 109.92 | 1,701 | UT; Uintah; Hwy 121 NE of Roosevelt | 3 (1) | 6 (2)
WY1 | 42.84 | 105.30 | 1,506 | WY; Converse; Glenrock, beside north Platte river | 2 | 0
⁵CO | 40.26 | 105.54 | 1,730 | CO; Boulder | 12 (2) | 6 (3)
⁵NM2 | 36.96 | 106.82 | 2,369 | New Mexico; US843 | 3 | 0

C. saligna
NTCO12 | 38.83 | 106.92 | 2,418 | CO; Gunnison; Gunnison River | 3 | 4
NTCO13 | - | - | - | CO; Gunnison; W of Gunnison | 1 | 2
NTCO14 | 38.76 | 107.26 | 2,305 | CO; Gunnison; Neversink Picnic ground, trail | 1 | 0
CO1 | 40.03 | 107.86 | 1,926 | CO; Rio Blanco; 8 east of Meeker | 2 | 7
CO6 | 40.03 | 108.13 | 1,798 | CO; Rio Blanco; hwy 64 west of Meeker | 1 | 0
UT5 | 40.21 | 110.41 | 1,722 | UT; Duchesne; Duchesne River valley | 1 | 0

Table 4.2 Summary of flow cytometry results of leaf and seed (embryo and endosperm) samples of studied taxa as listed in Table 4.1. Mean nuclear 2C values and variance among measurements obtained from N individuals at each site are shown. For sites with only one individual, the variance of the DNA amount is shown instead of variance. Ploidy level was identified by comparing nuclear 2C values with the Pisum 2C-standard, following Talent and Dickinson (2005). Superscripts indicate sites that contain morphological segregates as described in Table 4.1.

Site label | Leaf N | Leaf mean (pg/2C) | Leaf ploidy level | Seed N | Embryo (pg/2C) | Embryo ploidy level | Endosperm (pg/2C) | Endosperm ploidy level

C. douglasii and segregates
BC1 | 3 | 3.01 ± 0.21 | 4x | - | - | - | - | -
BC2 | 1 | 3.28 ± 0.35 | 4x | - | - | - | - | -
³BC3 | 1 | 2.86 ± 0.08 | 4x | 7 | 2.85 ± 0.28 | 4x | 7.83 ± 1.34 | 10-12x
CAR2 | 1 | 2.85 ± 0.23 | 4x | 5 | 2.82 ± 0.05 | 4x | 8.01 ± 1.13 | 10-12x
¹CAR3 | 3 | 3.02 ± 0.07 | 4x | 5 | 2.83 ± 0.11 | 4x | 7.91 ± 0.93 | 10-12x
¹ID2 | 16 | 2.93 ± 0.17 | 4x | 37 | 2.90 ± 0.22 | 4x | 7.99 ± 0.66 | 10-12x
ID3 | 4 | 2.86 ± 0.14 | 4x | 9 | 2.91 ± 0.16 | 4x | 8.25 ± 0.86 | 10-12x
ID6 | 20 | 2.85 ± 0.07 | 4x | 35 | 2.82 ± 0.30 | 4x | 8.02 ± 0.77 | 10-12x
¹ID15 | 6 | 2.94 ± 0.20 | 4x | 5 | 2.93 ± 0.45 | 4x | 7.96 ± 0.42 | 10-12x
¹ID16 | 5 | 2.91 ± 0.16 | 4x | 9 | 3.07 ± 0.35 | 4x | 7.99 ± 1.69 | 10-12x
ID20 | 12 | 2.96 ± 0.12 | 4x | 7 | 3.05 ± 0.23 | 4x | 7.73 ± 1.26 | 10-12x
ON18 | 3 | 2.98 ± 0.34 | 4x | 10 | 2.93 ± 0.11 | 4x | 7.92 ± 0.45 | 10-12x
ON20 | 24 | 3.02 ± 0.11 | 4x | 15 | 2.96 ± 0.15 | 4x | 8.09 ± 0.80 | 10-12x
ON21 | 2 | 2.99 ± 0.01 | 4x | 5 | 2.99 ± 0.18 | 4x | 8.03 ± 0.65 | 10-12x
¹OR | 3 | 3.20 ± 0.21 | 4x | - | - | - | - | -
MT2 | 13 | 3.01 ± 0.11 | 4x | 26 | 2.96 ± 0.17 | 4x | 8.58 ± 0.92 | 10-12x
¹SK | 2 | 3.29 ± 0.13 | 4x | - | - | - | - | -
WA1 | 2 | 3.23 ± 0.07 | 4x | - | - | - | - | -
WA20 | 5 | 2.98 ± 0.05 | 4x | 24 | 2.93 ± 0.04 | 4x | 8.17 ± 0.95 | 10-12x
¹WA21 | 22 | 3.01 ± 0.13 | 4x | 17 | 2.93 ± 0.09 | 4x | 8.55 ± 1.36 | 10-12x
²WA22 | 5 | 2.96 ± 0.14 | 4x | 6 | 2.93 ± 0.20 | 4x | 7.74 ± 0.82 | 10-12x
⁴WA24 | 1 | 3.32 ± 0.10 | 4x | 6 | 2.86 ± 0.29 | 4x | 8.29 ± 0.82 | 10-12x

C. suksdorfii
CA5 | 6 | 1.35 ± 0.14 | 2x | 35 | 1.52 ± 0.04 | 2x | 2.37 ± 0.11 | 3x
ID2 | 4 | 2.93 ± 0.53 | 4x | 14 | 3.05 ± 0.25 | 4x | 7.94 ± 0.40 | 10x
ID5 | 4 | 2.25 ± 0.15 | 3x | 12 | 2.22 ± 0.24 | 3x | 6.87 ± 1.04 | 8-10x
ID6 | 19 | 2.17 ± 0.10 | 3x | 21 | 2.16 ± 0.25 | 3x | 6.86 ± 0.55 | 8-10x
OR01 | 10 | 1.52 ± 0.17 | 2x | 18 | 1.51 ± 0.06 | 2x | 2.28 ± 0.09 | 3x
OR02 | 1 | 1.88 ± 0.23 | 2x | 3 | 1.51 ± 0.10 | 2x | 2.33 ± 0.13 | 3x
OR04 | 5 | 2.19 ± 0.16 | 3x | - | - | - | - | -
OR06 | 16 | 2.24 ± 0.06 | 3x | - | - | - | - | -
OR07 | 1 | 1.42 ± 0.13 | 2x | - | - | - | - | -
OR11 | 14 | 1.59 ± 0.07 | 2x | 35 | 1.53 ± 0.12 | 2x | 2.38 ± 0.20 | 3x
MT2 | 19 | 2.95 ± 0.07 | 4x | 40 | 2.88 ± 0.14 | 4x | 7.80 ± 0.86 | 10-12x
WA1 | 2 | 2.54 ± 0.01 | 3x | 7 | 2.24 ± 0.09 | 3x | 6.73 ± 0.43 | 8-10x
WA7 | 8 | 1.43 ± 0.09 | 2x | - | - | - | - | -

C. rivularis and segregate
NTCO15 | 12 | 2.98 ± 0.13 | 4x | 7 | 2.94 ± 0.07 | 4x | 8.44 ± 0.98 | 10-12x
NTCO17 | 2 | 3.10 ± 0.04 | 4x | 5 | 3.19 ± 0.11 | 4x | 8.23 ± 0.56 | 10-12x
NTCO18 | 1 | 3.01 ± 0.15 | 4x | - | - | - | - | -
ID13 | 19 | 3.05 ± 0.29 | 4x | 6 | 3.12 ± 0.18 | 4x | 8.25 ± 2.34 | 10-12x
⁴NM2 | 5 | 3.19 ± 0.08 | 4x | 3 | 3.10 ± 0.18 | 4x | 8.85 ± 0.58 | 10-12x
NV2, NV3 | 3 | 3.32 ± 0.14 | 4x | - | - | - | - | -
UT2 | 4 | 2.79 ± 0.26 | 4x | 3 | 2.96 ± 0.16 | 4x | 8.20 ± 0.89 | 10-12x
WY1 | 2 | 3.09 ± 0.35 | 4x | - | - | - | - | -

C. saligna
NTCO12, 13 | 4 | 1.63 ± 0.32 | 2x | 4 | 1.58 ± 0.17 | 2x | 2.48 ± 0.03 | 3x
NTCO14 | 1 | 1.83 ± 0.08 | 2x | 2 | 1.49 ± 0.04 | 2x | 2.43 ± 0.02 | 3x
CO1 | 2 | 1.66 ± 0.01 | 2x | 7 | 1.48 ± 0.24 | 2x | 2.55 ± 0.05 | 3x
CO6 | 1 | 1.67 ± 0.08 | 2x | - | - | - | - | -
UT5 | 1 | 1.67 ± 0.01 | 2x | - | - | - | - | -

Table 4.3 One-way ANOVA comparisons and mean estimates of elevations and climate variables among cytotypes in C. suksdorfii and C. douglasii including its segregates. Asterisks indicate significance levels as listed below the table. “a” indicates test statistics that become non-significant when site ID06 (Table 4.1) is included as a diploid site according to Brunsfeld and Johnson (1990).

Variable | ANOVA F2, 30 | ANOVA P-value | Mean 2x | Mean 3x | Mean 4x | Tukey's HSD P-value, 2x vs 3x | Tukey's HSD P-value, 2x vs 4x | Tukey's HSD P-value, 3x vs 4x
Elevation (m) | 11.26 | 0.00** | 15-118 | 998-1472 | 280-1121 | 0.001** | 0.007** | 0.036*
Precipitation (mm) | 6.05 | 0.006** | 55.88 ± 9.49 | 33.56 ± 48.79 | 34.24 ± 24.27 | 0.983 | 0.015* | 0.057
Temperature (F) | 5.11 | 0.012*, a | 51.38 ± 2.23 | 39.86 ± 6.82 | 43.28 ± 7.8 | 0.022*, a | 0.023*, a | 0.660
Daily temperature range (DTR) | 2.81 | 0.076 | 52.71 ± 3.51 | 57.34 ± 5.18 | 55.23 ± 4.29 | 0.082 | 0.153 | 0.614
Humidity (%) | 8.16 | 0.001** | 74.75 ± 3.16 | 59.67 ± 10.04 | 63 ± 4.16 | 0.025*, a | 0.001** | 0.970
Sunshine hours (% of max) | 0.36 | 0.70 | 49.5 ± 4.41 | 58.83 ± 10.08 | 54.66 ± 12.25 | 0.692 | 0.798 | 0.914
Wind (km/hr) | 1.04 | 0.37 | 3.52 ± 0.21 | 3.72 ± 0.83 | 3.65 ± 0.38 | 0.877 | 0.676 | 0.396
Days with frost | 9.60 | 0.001** | 7.01 ± 1.51 | 14.82 ± 3.64 | 14.19 ± 5.57 | 0.012* | 0.000** | 0.989

Notes: * Significant at p = 0.05; ** Significant at p = 0.01

Figure 4.1 Schematic diagrams showing (A) sexual and (B) gametophytic apomictic pathways in megagametophytes of tetraploid plants. Shaded area represents the egg cell, where either fertilization or parthenogenesis occurs prior to embryo development. White area represents the central cell, where pseudogamy occurs for subsequent endosperm formation. The formulas below indicate the empirical estimation of the DNA content dosage in the resulting endosperm and embryos, which can be used to distinguish the two types of reproduction.


Figure 4.2 Distribution sites of C. douglasii, C. suksdorfii, C. saligna, and C. rivularis cytotypes included in the present study. Symbols distinguish 2x C. suksdorfii, 3x C. suksdorfii, 4x C. suksdorfii, 4x C. douglasii, 2x C. saligna, and 4x C. rivularis. Segregates of C. douglasii are labeled as C. castlegarensis (c), C. okennonii (o), C. phippsii (p), and C. shuswapensis (s); the C. rivularis segregate C. erythropoda is labeled (e). Locality identities are presented in Table 4.1. Climate diagrams of monthly temperature (dotted line) and precipitation (solid line) of selected sites are shown.

[Figure 4.2: map of sampling sites between 35-50° N and 105-125° W, with site labels and inset climate diagrams of monthly temperature and precipitation. Legend: 4x C. douglasii and segregates; 2x C. suksdorfii; 3x C. suksdorfii; 4x C. suksdorfii; 2x C. saligna; 4x C. rivularis and segregate.]

Figure 4.3 Histogram of leaf nuclei DNA content (mean and adjusted variance) from individuals of C. douglasii and its segregates (n = 150), C. suksdorfii (n = 112), C. saligna (n = 9), and C. rivularis (n = 45) representing a total of 47 localities. Species are indicated in the upper bar and locality labels as presented in Table 4.1 in the lower bar. Localities that contain morphological segregates of C. douglasii are indicated with superscripts, which are identified in Table 4.1. Shaded areas show the range for each ploidy level derived from an average estimate of 0.76 pg/genome±10%, giving 1.37-1.67 pg for diploids, 2.05-2.51 pg for triploids, and 2.74-3.34 pg for tetraploids.

[Figure 4.3: histogram of leaf nuclear DNA amount (pg) by locality and species, with shaded bands for the 3x and 4x ploidy levels.]

Figure 4.4 Histogram of embryo (black circle) and endosperm (open circle) nuclear DNA content (mean and standard deviation) from seeds of C. douglasii including its segregates (n = 235), C. suksdorfii (n = 182), C. saligna (n = 7), and C. rivularis (n = 20) collected from 27 localities. Species with the observed leaf ploidy level are indicated in the upper bar and locality labels as presented in Table 4.1 in the lower bar. Localities that contain morphological segregates of C. douglasii are indicated with superscripts, which are identified in Table 4.1. The estimated ranges of 2x, 3x, and 4x are the same as those described in Fig. 4.3. Using the average estimate of 0.76 pg/genome±10%, we define values of 5.47-6.68 pg as 8x, 6.15-7.52 as 9x, 6.84-8.36 as 10x, 7.52-9.16 as 11x, and 8.20-10.03 as 12x.

[Figure 4.4: histogram of embryo and endosperm nuclear DNA amount (pg) by locality, grouped by species and observed leaf ploidy level (2x, 3x, 4x).]
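The ploidy bands quoted in the captions of Figures 4.3 and 4.4 follow from simple arithmetic on the average estimate of 0.76 pg per genome with a ±10% tolerance. The short Python sketch below reproduces that calculation for illustration only. The function names `ploidy_range` and `assign_ploidy` are hypothetical and are not part of the original analysis, and bounds computed this way may differ slightly from the quoted values because of rounding.

```python
# Illustrative sketch only: names and structure are assumptions, not the
# procedure used in the study. Constants are taken from the figure captions.
PG_PER_GENOME = 0.76  # average DNA amount per 1x genome (pg)
TOLERANCE = 0.10      # +/-10% band around the expected value


def ploidy_range(ploidy, pg_per_genome=PG_PER_GENOME, tol=TOLERANCE):
    """Return (lower, upper) DNA amounts in pg expected for a ploidy level."""
    centre = ploidy * pg_per_genome
    return centre * (1 - tol), centre * (1 + tol)


def assign_ploidy(pg, candidates=(2, 3, 4)):
    """Return every candidate ploidy level whose band contains the value."""
    matches = []
    for level in candidates:
        lower, upper = ploidy_range(level)
        if lower <= pg <= upper:
            matches.append(level)
    return matches


if __name__ == "__main__":
    for level in (2, 3, 4):
        lower, upper = ploidy_range(level)
        print(f"{level}x: {lower:.2f}-{upper:.2f} pg")
    # A leaf estimate such as 2.17 pg falls in the triploid band only.
    print(assign_ploidy(2.17))
    # Endosperm bands overlap at high ploidy, so one value can match several.
    print(assign_ploidy(8.0, candidates=range(8, 13)))
```

Because adjacent bands overlap above 8x, a single endosperm measurement can be compatible with more than one level, which is consistent with endosperm ploidy being reported as a range (for example 10-12x) in Table 4.2.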

Figure 4.5 Boxplots indicating the estimated ratio of endosperm to embryo DNA content of seed samples across (A) sites of C. douglasii and its segregates, and (B) sites of C. suksdorfii. According to Fig. 4.1, a ratio of around 1.5 indicates sexual seed production, whereas a ratio of 2.5 or above indicates the occurrence of gametophytic apomixis. Explanations for the high end:emb ratio (2.5-3) are provided in the DISCUSSION section.

[Figure 4.5: boxplots of the ratio of endosperm to embryo DNA content (y-axis, 1.0-4.5) for (A) localities of C. douglasii and its segregates and (B) localities of C. suksdorfii (CA5, OR01, OR02, OR11, ID02, ID05, ID06, MT2, WA20).]
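The rule of thumb stated in the caption of Figure 4.5 can be written as a small classifier. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the width of the window around 1.5 (`sexual_tol = 0.25`) is an assumed value that does not come from the study, and the inputs are site means from Table 4.2 rather than measurements of individual seeds.

```python
# Illustrative sketch only. The 2.5 threshold and the 1.5 target come from the
# Figure 4.5 caption; the tolerance around 1.5 is an assumption for this example.

def end_emb_ratio(endosperm_pg, embryo_pg):
    """Ratio of endosperm to embryo nuclear DNA content."""
    return endosperm_pg / embryo_pg


def classify_seed(endosperm_pg, embryo_pg, sexual_tol=0.25):
    """Label a seed 'sexual', 'apomictic', or 'indeterminate' from its ratio."""
    ratio = end_emb_ratio(endosperm_pg, embryo_pg)
    if ratio >= 2.5:
        return "apomictic"
    if abs(ratio - 1.5) <= sexual_tol:
        return "sexual"
    return "indeterminate"


if __name__ == "__main__":
    # (endosperm pg/2C, embryo pg/2C) site means taken from Table 4.2
    site_means = {
        "C. suksdorfii CA5": (2.37, 1.52),
        "C. suksdorfii ID6": (6.86, 2.16),
        "C. douglasii MT2": (8.58, 2.96),
    }
    for site, (endosperm, embryo) in site_means.items():
        ratio = end_emb_ratio(endosperm, embryo)
        print(f"{site}: ratio {ratio:.2f} -> {classify_seed(endosperm, embryo)}")
```

Ratios that fall between the two windows are left as indeterminate here rather than forced into either category; how such seeds should be treated is a design choice that this sketch does not settle.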

Figure 4.6 Regression plots of leaf nuclear DNA content of C. douglasii and segregates and C. suksdorfii against selected climate variables including (A) temperature, (B) precipitation, (C) relative humidity, and (D) number of days with frost. Standard error of the regression and regression coefficients of each variable are indicated in the graphs. Solid lines represent the regression lines and dotted lines represent the 95% confidence and prediction intervals, respectively. Symbols are as in Figure 4.2. These variables were shown to be significantly different in the one-way ANOVA analyses as presented in Table 4.3.

[Figure 4.6: four regression panels of leaf DNA content (2C/pg) against (A) mean annual temperature (F), (B) precipitation (mm), (C) humidity (%), and (D) number of days with frost. Regression statistics printed in the panels: S = 0.579238, R-Sq = 18.8%, R-Sq(adj) = 16.0%; S = 0.596449, R-Sq = 13.9%, R-Sq(adj) = 10.9%; S = 0.571341, R-Sq = 21.0%, R-Sq(adj) = 18.2%; S = 0.546146, R-Sq = 27.8%, R-Sq(adj) = 25.3%.]

Figure 4.7 (A) Biplot showing the variation and mean vectors of 84 monthly values of the seven climate variables including temperature, precipitation, humidity, wind, days-with-frost, daily temperature range (DTR), and sunshine on the first and second components. (B) Scatter plot generated by principal component analysis (PCA) of diploid and polyploid sites of C. suksdorfii and C. douglasii (including its segregates) based on 84 monthly values of seven climate variables. A total variation of 67.28% is detected in the first two components (F1: 43.22% and F2: 24.05%). Symbols of respective ploidy level are indicated below the graph and locality identity is found in Table 4.1.

[Figure 4.7: (A) biplot of the seven climate variables (temperature, sunshine, precipitation, DTR, humidity, wind, days-with-frost) on Component 1 and Component 2; (B) scatter plot of sites on Component 1 and Component 2. Legend: 4x C. douglasii and segregates; 2x C. suksdorfii; 3x C. suksdorfii; 4x C. suksdorfii.]

Chapter 5 Detecting origins and inferring reticulate history in diploid-polyploid complexes of Crataegus suksdorfii sensu lato (Rosaceae) using tree and network approaches.

Abstract. Polyploidy, the multiplication of entire genomes, plays a prominent role in plant evolution. In autopolyploids, genomes are expected to associate within a single lineage that could be unambiguously represented by phylogenetic trees, whereas in allopolyploids genomes are merged from different lineages subsequent to hybridization and such reticulation events are better represented in a haplotype network. The present study uses both maximum parsimony and statistical parsimony algorithms to construct trees and networks from two unlinked nuclear genes, PISTILLATA and PEPC, together with two intergenic chloroplast regions, psbA-trnH and trnH-rpl2, in order to unravel origins and infer reticulation history of the diploid-polyploid complexes in Crataegus series Douglasianae. Duplicated paralogues were detected in both nuclear genes and they reveal topologies that are congruent with taxon relationships. Autopolyploids and allopolyploids were found in the C. suksdorfii and C. douglasii complexes. Triploid individuals of C. suksdorfii from Oregon and Idaho were independently derived, in one case, homogeneously by inheriting almost identical nuclear and chloroplast genomes from a diploid progenitor, and in the other case, heterogeneously by merging genomes from diploid C. suksdorfii and tetraploid C. douglasii. Tetraploid individuals are formed via a triploid bridge, i.e. by the backcross of allotriploid offspring with the diploid C. suksdorfii parent, followed by gene introgression of the sympatric C. douglasii. Our study identifies different pathways of polyploid formation in Crataegus and provides new insights into the reticulation history of this taxonomically complicated group. We also highlight the usefulness of a network approach in resolving relationships beyond the species level, especially in organisms where hybridization and polyploidization are key evolutionary processes.

Keywords: Autopolyploids; Allopolyploids; Crataegus; Haplotype network; Gene duplication; Reticulate evolution.

5.1. Introduction

Polyploidy, or genome duplication, has played an important role in the diversification of species lineages and has increasingly been found to be common in flowering plants (Grant 1981; Bowers et al. 2003; Cronn & Wendel 2004; Soltis et al. 2004; Adams & Wendel 2005a; Meyers & Levin 2006). Two fundamental types of polyploids, autopolyploids and allopolyploids, have been described. Autopolyploidy refers to whole genome duplication within a species, involving fertilization of unreduced gametes in crosses between conspecific individuals. Allopolyploidy refers to duplication of two or more divergent genomes within a hybrid and involves crosses between different species (Soltis & Rieseberg 1986; Ramsey & Schemske 1998). Compared to autopolyploidy, allopolyploidy has been suggested to be more common in natural plant lineages because of the weak reproductive barrier that often exists between closely related or even more distant species (Soltis et al. 2004; Meyers & Levin 2006). Vertebrates such as fishes (Meyer & Schartl 1999; Comber & Smith 2004), rodents (Gallardo et al. 2004), and amphibians (Ptacek et al. 1994; Becak & Becak 1998) also demonstrate significant impacts of allopolyploidy by producing novel genotypes in newly-derived species for adaptation and establishment in new environments.

Apart from genome duplication, phenomena such as gene silencing, neofunctionalization, and subfunctionalization of homoeologues are common consequences of polyploidy (Lynch & Conery 2000; Adams & Wendel 2005b; Veitia 2005; Adams 2007). These events usually render greater variation at the gene level, particularly in higher polyploids (Wendel & Doyle 2005), and can be used as sources of phylogenetic signal for identifying origins and parental lineages in diploid-polyploid complexes. Because an autopolyploid inherits genomes only from a single progenitor species, this can be viewed as a bifurcation of the same lineage in which newly derived polyploids could be directly detected as the sister group of the progenitor of lower ploidy level in a phylogenetic tree. On the other hand, allopolyploidy often produces polyploid species complexes entailing network-like histories that cannot be properly represented by bifurcating trees (Linder & Rieseberg 2004; Vriesendorp & Bakker 2005; Huson & Bryant 2006). Forcing reticulation to be displayed in a branching topology might lead to the lack of support for the resolved clades and/or collapse of hierarchical structure (Cassens et al. 2005; Vriesendorp & Bakker 2005; Huber & Moulton 2006). Although incongruence between nuclear and plastid trees has been conventionally used as a detector of allopolyploidy or reticulate evolution, phenomena such as lineage sorting, differential gene duplication/loss or introgression could also mislead the resulting topologies (Pamilo & Nei 1988; Lyons-Weiler & Milinkovitch 1997; Wendel & Doyle 1998). Hence, to examine polyploid relationships, the network approach using single/low-copy nuclear genes has become more common (e.g. Oxelman & Bremer 2000; Popp et al. 2001, 2005; Smedmark et al. 2003; Linder & Rieseberg 2004; Huber et al. 2006; Joly et al. 2006; Brysting et al. 2007).

Crataegus L. (commonly known as hawthorns) is a genus of the subtribe Pyrinae (Campbell et al. 2007). In North America, over 60% of the Crataegus species include polyploid individuals (Talent & Dickinson 2005). Such ploidy level variation is found not only among but also within species, e.g., in the black-fruited hawthorns recognized as series Douglasianae (Loud.). Crataegus douglasii sensu lato and C. suksdorfii sensu lato are members of the Douglasianae that are commonly found in western North America (Brunsfeld & Johnson 1990; Dickinson et al. 1996, 1997). Individuals of C. douglasii are characterized by 10-stamen flowers and are uniformly tetraploid. They are found not only across the Pacific Northwest but also in the Great Lakes Basin, showing a wider geographical range and broader ecological amplitude than C. suksdorfii (Dickinson et al. 1996; Chapter 4). On the other hand, individuals of C. suksdorfii are recognized by 20-stamen flowers and have been shown to include both diploids and polyploids (Talent & Dickinson 2005). Diploids occur more frequently in mesic lowland along the west coast while triploids and tetraploids are found in drier and cooler areas across the Cascades and Rocky Mountains. These cytotypes differ not only in their distribution, but also in their reproductive system and morphological characters (Dickinson et al. in press; Chapter 4). However, their origin(s) and relationships are as yet unknown. Therefore, this study aims to reconstruct phylogenies and reticulation networks in the C. suksdorfii and C. douglasii complexes using nuclear and chloroplast genes in order to: (1) identify origins of triploid and tetraploid C. suksdorfii from different localities; (2) construct a reticulation model that shows the putative parental lineages involved in cytotype formation; and (3) examine the effect of nuclear gene duplication associated with polyploidy in Crataegus.

Regarding the first objective, two alternatives are evaluated with the tree and network approaches. If individuals are of autopolyploid (single) origin, the null expectation is that these individuals should contain only paralogous sequences closely related or sister to those of the progenitor taxon, and thus will resolve as a monophyletic clade in a tree or as the same haplotype group in a network. If individuals are of allopolyploid (multiple) origins, the alternative expectation is that these individuals should contain a mixture of homoeologous sequences inherited from different progenitor taxa. These sequences are expected to associate differently with progenitors that contribute diverse copies, thus will resolve as polyphyletic groups in a tree or multiple haplotype groups in a network. Caution, however, is needed because autopolyploids sometimes might appear as allopolyploids when there is introgression of foreign alleles. Conversely, allopolyploids might appear as autopolyploids when there is lineage sorting or polysomic segregation. Therefore, it is crucial to sample sufficient sequences to maximally represent allelic variation within a polyploid taxon.
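The two expectations above amount to a monophyly test on the cloned sequences of a polyploid. The following is a minimal sketch of that test, assuming a rooted tree written as nested Python tuples; the tip labels and the toy topology are hypothetical and do not come from the data of this study.

```python
# Minimal sketch: do the cloned sequences of a polyploid form one clade?
# A rooted tree is a nested tuple; a tip is a string (hypothetical labels).

def tips(node):
    """Return the set of tip labels below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found

def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the tips in group."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

# Toy topology (hypothetical): Oregon triploid clones cluster together,
# Idaho triploid clones fall with two different putative progenitors.
toy_tree = (
    ("2x_suk_a", ("3x_OR_a", "3x_OR_b")),
    (("4x_doug_a", "3x_ID_a"), ("2x_suk_b", "3x_ID_b")),
)

print(is_monophyletic(toy_tree, ["3x_OR_a", "3x_OR_b"]))  # autopolyploid-like pattern
print(is_monophyletic(toy_tree, ["3x_ID_a", "3x_ID_b"]))  # allopolyploid-like pattern
```

As cautioned above, a monophyletic or polyphyletic result is only a first indication, because introgression, lineage sorting, and polysomic segregation can each produce the opposite pattern.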

5.2. Materials and methods

5.2.1. Plant materials and ploidy level determinations

Crataegus suksdorfii and C. douglasii were sampled at a total of 16 sites (Tables 5.1 and 5.2) in Oregon, Idaho, Montana, and Washington, and in eastern Canada (Ontario). Individuals used here (Appendix 6) were drawn from a larger sample whose morphological and cytotypic variation and breeding system are reported elsewhere (Dickinson et al. in press; Chapter 4). Vouchers were deposited in the Green Plant Herbarium at the Royal Ontario Museum (TRT). For the 34 individuals included in nuclear gene amplifications (Table 5.1), ploidy level was determined from leaf tissues by flow cytometry, following the protocol of Talent and Dickinson (2005) and using a FACSCalibur flow cytometer (Becton-Dickinson) equipped with an argon laser and a detector for fluorescence of propidium iodide-stained samples (585 nm wavelength). Nuclear DNA content was estimated from the ratio between the fluorescence of the Crataegus samples and a Pisum sativum standard (2C value of 9.56 pg DNA per Pisum nucleus; Johnston et al. 1999). The standard deviation in each measurement was calculated following the methods of Dickson et al. (1992).
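The DNA content estimate described above is a simple ratio calculation. A minimal sketch follows, assuming that the mean fluorescence peak positions of the sample and the standard are already available; the function name and the peak values are hypothetical, and only the 9.56 pg standard value is taken from the text.

```python
# Minimal sketch: sample 2C DNA content from the sample/standard fluorescence ratio.
PISUM_2C_PG = 9.56  # 2C value of the Pisum sativum standard (Johnston et al. 1999)

def estimate_2c(sample_peak: float, standard_peak: float,
                standard_2c_pg: float = PISUM_2C_PG) -> float:
    """Return the sample 2C DNA content in pg from mean fluorescence peaks."""
    if standard_peak <= 0:
        raise ValueError("standard peak must be positive")
    return sample_peak / standard_peak * standard_2c_pg

# Hypothetical peak positions in arbitrary fluorescence units.
sample_2c = estimate_2c(sample_peak=60.0, standard_peak=400.0)
print(f"{sample_2c:.3f} pg")
```

Assigning a ploidy level from the resulting value requires reference ranges for each cytotype, which are not encoded in this sketch.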

5.2.2. Gene markers and sequencing strategy

Different sampling strategies were applied to the nuclear and chloroplast gene markers because of the ploidy level differences in our taxa. Two unlinked nuclear genes, PISTILLATA and Phosphoenolpyruvate Carboxylase (PEPC), which were suggested to be single or low copy in other plants (Matsuoka & Minami 1989; Goto & Meyerowitz 1994; Bailey & Doyle 1999), were used in this study. Our sampling strategy sought to fully capture the allelic variation of these nuclear gene loci within individuals, especially within the polyploids at our sampling sites. For this reason, 8-15 clones per gene were sequenced from each individual, but only 1-4 individuals per site were included (Table 5.1). On the other hand, because chloroplast markers are uniparentally inherited and each individual, regardless of its ploidy level, should contain only one copy, no cloning was required to sample variation within an individual. However, we sought to maximally recover the variation among individuals within sites in order to accurately identify all possible maternal donors. Therefore, a total of 132 individuals representing C. douglasii and C. suksdorfii from the 16 sites were sequenced for the psbA-trnH and trnH-rpl2 regions (Table 5.2).

Details of DNA extraction, primer information, PCR conditions, and cloning protocols are described in Chapter 3. Briefly, nuclear and chloroplast regions were PCR amplified. Chloroplast gene amplicons were sequenced directly, whereas amplicons of PISTILLATA and PEPC were cloned and sequenced. To reduce the possibility of recovering PCR artefact recombinants and/or Taq polymerase-induced mutations among clones of nuclear genes, each individual was amplified twice and both amplicons were combined for purification prior to cloning (Judo et al. 1998; Cronn et al. 2002; Joly et al. 2006). Sequencing reactions were performed with BigDye terminator (version 3.1; Applied Biosystems) following the manufacturer’s protocols and were run on the 3100 automatic sequencer (Applied Biosystems).

5.2.3. Sequence analyses

Alignments were initially conducted with ClustalX (Thompson et al. 1997) followed by manual editing with Sequence Alignment Editor v1.d1 (SE-Al; Rambaut 2002). Indels were coded as multistate characters (Simmons & Ochoterena 2000) with SeqState version 1.32 (Müller 2005) and appended to the sequence matrix for phylogenetic analyses. Because the two chloroplast regions are linked, they were combined and treated as a single marker for analyses whereas the two nuclear datasets were treated separately.

Sequence recombination caused by PCR errors could be more common in polyploids when more than one copy is available (Bradley & Hillis 1997; Judo et al. 1998; Cronn et al. 2002). This occurs when the DNA polymerase stops functioning or detaches from the template before elongation is complete. The partially extended product may prime on another allele in a subsequent cycle and result in a recombinant sequence. To identify and exclude potential recombinant sequences from the alignment matrices before phylogenetic analyses, the aligned sequences from all clones of each individual were compared using the pairwise scanning method implemented in the Recombination Detection Program (RDP) version 2b.08 (Martin & Rybicki 2000). RDP examines every possible combination of three of the aligned sequences for evidence of recombination by first assigning relatedness within groups of three sequences based on variable sites and their relative positions in an estimated UPGMA tree. This is followed by calculations of average percentage identity for each of the three possible sequence pairs in the alignment. A region in which the percentage identity of the less closely related pair exceeds that of the more closely related pair is taken as evidence of a recombination breakpoint. For further confirmation, alignments with and without the detected recombinant sequences were split into partitions and phylogenetic trees were generated for each partition to compare topologies. This was done with the program TOPALi v2 (Milne et al. 2004), which allows manual selection of sequences and regions, as well as a quick overview of trees for individual partitions. Sequences with detected recombination points and/or showing incongruent positions among partitions were excluded from the analyses.
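The idea of a PCR recombinant can be illustrated with a much simpler check than the RDP triplet scan. The sketch below is not the RDP algorithm: it assumes two known parental alleles and one candidate clone of equal aligned length, and looks for the split position at which the clone matches one allele on the left and the other on the right. All names and sequences are hypothetical.

```python
# Minimal sketch (not RDP): find the split point at which a clone is best
# explained as the left part of one allele joined to the right part of another.

def mismatches(a: str, b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def best_breakpoint(clone: str, left_parent: str, right_parent: str):
    """Return (k, cost): clone[:k] is compared with left_parent and clone[k:]
    with right_parent; k minimises the total number of mismatches."""
    best_k, best_cost = 0, mismatches(clone, right_parent)
    for k in range(1, len(clone) + 1):
        cost = (mismatches(clone[:k], left_parent[:k])
                + mismatches(clone[k:], right_parent[k:]))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost

# Hypothetical alleles differing at three sites, and a clone built from both.
allele_a = "ACGTACGTACGT"
allele_b = "ATGTATGTATGT"
clone = allele_a[:6] + allele_b[6:]

k, cost = best_breakpoint(clone, allele_a, allele_b)
is_putative_chimera = cost < min(mismatches(clone, allele_a),
                                 mismatches(clone, allele_b))
print(k, cost, is_putative_chimera)
```

The reported position is only the first of several equally good split points, since a breakpoint can be located no more precisely than the interval between two informative sites.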

To compare the amount of chloroplast sequence variation among cytotypes, the haplotype diversity (Hd) and nucleotide diversity (pi) in C. suksdorfii and C. douglasii were estimated for the different sites listed in Table 5.2 using DnaSP 4.10.1 (Rozas et al. 2003), based on the equations of Nei (1987).
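For reference, the two diversity measures can be computed directly from an alignment. The sketch below uses the standard forms Hd = n/(n-1) × (1 - Σ p_i²) and pi = mean number of pairwise differences per site; it assumes gap-free aligned sequences of equal length, does not reproduce the way DnaSP treats gaps or missing data, and uses hypothetical toy sequences.

```python
# Minimal sketch: haplotype diversity (Hd) and nucleotide diversity (pi)
# for gap-free aligned sequences of equal length.
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

def nucleotide_diversity(seqs):
    """pi = average number of pairwise differences per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)

# Hypothetical sample of four sequences comprising three haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(round(haplotype_diversity(toy), 4))
print(round(nucleotide_diversity(toy), 4))
```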

5.2.4. Phylogenetic tree and network reconstructions

Phylogenetic trees were built using maximum parsimony in PAUP 4.0b* (Swofford 2002). Heuristic searches for the most parsimonious trees with unweighted data were performed with 1000 random-addition replicates, tree bisection-reconnection (TBR) branch swapping, MULTREES off, and with no more than 10 trees saved per replicate. The tree output was then used as a starting point for a second round of searches with the same settings except with MULTREES on.

Evolutionary relationships were also examined by constructing a haplotype network based on the statistical parsimony method of Templeton et al. (1992) implemented in TCS v. 1.13 (Clement et al. 2000). This method emphasizes what is shared among haplotypes that differ minimally (Posada & Crandall 2001). It first estimates the uncorrected distance above which the parsimony principle is violated with more than 5% probability. Haplotypes are then joined iteratively, starting with the shortest distance, only when parsimony has a probability of at least 0.95 of being true as determined by coalescence theory, until all haplotypes are joined or the distance exceeds the parsimony limit (Clement et al. 2000). Here, haplotypes were connected by mutational steps under the 95% confidence limit and the "gaps = missing" option. The same sequence alignment matrices used for the PAUP* analyses were used in the TCS analyses, except that outgroup sequences were excluded.
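The input to a haplotype network is a set of distinct haplotypes and the number of mutational steps between them. The sketch below illustrates only that preprocessing under a "gaps = missing" convention; it does not compute the 95% parsimony connection limit and is not the TCS implementation. The sequences, the individual names, and the greedy grouping rule are assumptions made for illustration.

```python
# Minimal sketch: collapse sequences into haplotypes and count mutational
# steps, ignoring sites where either sequence has a gap or missing base.
from itertools import combinations

MISSING = {"-", "?", "N"}

def steps(a: str, b: str) -> int:
    """Differences between two aligned sequences, skipping missing sites."""
    return sum(x != y for x, y in zip(a, b)
               if x not in MISSING and y not in MISSING)

def collapse(seqs: dict):
    """Greedily group names: a sequence joins the first haplotype whose
    representative is zero steps away, otherwise it founds a new one."""
    haplotypes = []  # list of (representative sequence, [names])
    for name, seq in seqs.items():
        for rep, names in haplotypes:
            if steps(rep, seq) == 0:
                names.append(name)
                break
        else:
            haplotypes.append((seq, [name]))
    return haplotypes

# Hypothetical aligned sequences.
toy_seqs = {
    "ind1": "ACGTAC", "ind2": "ACGTAC", "ind3": "AC-TAC",
    "ind4": "ACGTTC", "ind5": "TCGTTC",
}
haps = collapse(toy_seqs)
for (rep_a, names_a), (rep_b, names_b) in combinations(haps, 2):
    print(names_a, names_b, steps(rep_a, rep_b))
```

With missing data, zero-step identity is not transitive, so the greedy grouping shown here can depend on input order; this is one reason to rely on a dedicated program for the actual network.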

5.3. Results

5.3.1. Flow cytometry

Individuals of C. douglasii representing different localities all showed high 2C values (2.77-3.25 pg) and were estimated to be tetraploids, whereas C. suksdorfii varied considerably from 1.24-2.98 pg and were estimated to be diploids, triploids, and tetraploids (Table 5.3). These results are consistent with the earlier ones obtained at the same and other sites (Chapter 4; Talent & Dickinson 2005).

5.3.2. Chloroplast sequence polymorphism

The psbA-trnH and trnH-rpl2 regions gave a total aligned length of 652 bp in which 23 of the 63 variable positions and six indels were parsimony informative. Haplotype diversity (Hd) and nucleotide diversity (pi) of the overall data are 0.832 ± 0.032 and 6.19 × 10⁻³ ± 0.63 × 10⁻³, respectively. Diversity measures in diploid C. suksdorfii are consistently high (e.g. in sites OR1, OR11, and CA5) and these values are comparable with those of the tetraploid C. suksdorfii (MT2) and C. douglasii, with the exception of individuals from one Idaho site (ID6; Table 5.4). By contrast, triploid individuals revealed relatively low Hd and pi values (particularly those of site OR6), suggesting a homogeneous cytoplasmic gene pool or limited gene exchange among these individuals, as expected in triploid plants (Table 5.4).

To simplify the resulting network, five single sequence haplotypes with either one or two site changes from the other haplotypes were excluded. Such exclusion does not interfere with the relationships shown by the network. Nineteen haplotypes labelled as A-S were recovered (Table 5.2) and they were distinguished from each other by 2-8 mutation steps in the phylogenetic network (Fig. 5.1). Sequences of 2x C. suksdorfii (sites OR1, OR11, and CA5; Table 5.2) were found in seven haplotypes A-G that were more or less connected except F and G (Table 5.2; Fig. 5.1). Three of these haplotypes (A-C) were shared with 3x C. suksdorfii from Oregon (OR6). Although haplotype H appeared to be unique for the rest of the Oregon triploids, it was differentiated from haplotype B by only one mutation. Haplotype I containing 3x C. suksdorfii from Idaho (ID5, ID6) and 4x C. suksdorfii from Montana (MT2) was found neighbour to haplotype H. Sequences of C. douglasii were distributed among nine haplotypes (J, L, M, N, O, P, Q, R, and S) in which P was the most common one containing sequences of 41 individuals including 4x C. douglasii (N = 27), as well as 3x (N = 11) and 4x C. suksdorfii (N = 3). Apart from P, haplotypes J, L, and O were also shared with either 3x C. suksdorfii from Idaho (ID5, ID6) or 4x C. suksdorfii from Montana (MT2), or both (Table 5.2; Fig. 5.1). Haplotype K appeared to be unique to 4x C. suksdorfii (MT2) and was related to haplotype Q containing sequences of the Ontario C. douglasii (ON20).

Maximum parsimony trees showed very little resolution of our taxa but neighbour-joining analyses (data not shown) generally support the relationships reflected in the network (Fig. 5.1), including: (1) the association of 2x and 3x C. suksdorfii from Oregon; and (2) polyphyletic origins for the 3x (ID5, ID6) and 4x (MT2) C. suksdorfii, which associated with both 2x C. suksdorfii and 4x C. douglasii.

5.3.3. Topologies of PISTILLATA

Of the total 321 PISTILLATA sequences obtained from the 34 individuals, 13 sequences were detected with significant recombinant points and showed incongruent positions among trees generated from different partitions (Table 5.5), and were thus removed from the phylogenetic analyses. In the partial PISTILLATA gene of 1150 bp, about 140 bp in the first intron was deleted from all analyses because it contained a hypervariable AT-rich region that could not be unambiguously aligned. Two copies designated as S (short) and L (long) were clearly identified in the alignments based on both length (indels) and nucleotide differences. The PISTILLATA-L paralog appears to be more variable than the S paralog (Table 5.5) and no clear open reading frame was observed in its exons, thus suggesting a pseudogene copy. The strict consensus parsimony tree of PISTILLATA reveals two distinct and strongly supported clades, corresponding to the S and L paralogs (Fig. 5.2). Sequences of the S paralog are almost completely unresolved (not shown). However, phylogenetic resolution was observed with the L paralog, where sequences are divided into clades A and B. Within clade A, subclade A1 contains sequences of 4x C. douglasii (ID, MT, ON, and WA) and 4x C. suksdorfii (MT) only, without much resolution (Fig. 5.2). The remainder of clade A is a mixture of sequences from 2x (OR), 3x (ID), and 4x (MT) C. suksdorfii together with those from 4x C. douglasii, which are poorly resolved. Sister to clade A is clade B, which shows monophyly of all 3x C. suksdorfii from Oregon (OR6; N = 18) together with 2x C. suksdorfii sequences from Oregon and California (OR1, CA5; N = 5; Fig. 5.2), indicative of a homogeneous origin of these 3x individuals.

5.3.4. Topologies of PEPC

Of the total 332 PEPC sequences obtained from the 34 individuals, 19 sequences were detected with significant recombinant points and showed incongruent positions among trees generated from different partitions (Table 5.5), and were thus removed from the phylogenetic analyses. Amplicons of the partial PEPC genes are about 740 bp in size. Two paralogs designated as S (short) and L (long) were detected in the alignments based on both length (indels) and nucleotide differences. Open reading frames were identified in their exon sequences, suggesting functionality of both paralogs. The S and L paralogs of the PEPC gene are resolved as two distinct clades (Fig. 5.3). In the clade containing the L paralog, very little resolution is detected, except the clade containing sequences of 2x and 3x C. suksdorfii from Oregon. On the other hand, the S paralog provides better resolution to the taxa in which three subclades A1, A2, and A3 are observed in the large clade A (Fig. 5.3). All sequences of 3x C. suksdorfii from Oregon (OR6) are found to be monophyletic with 2x C. suksdorfii (OR1, OR11) in clade A2, which is similar to clade B of the PISTILLATA tree (Fig. 5.2). However, other sequences of 3x (ID) and 4x (MT) C. suksdorfii are found to be polyphyletic and nested in clades such as A1 and A3 together with 2x C. suksdorfii (OR) and 4x C. douglasii (ID, MT, WA). Outside clade A, i.e. the remainder of the S-paralog clade, only sequences of 4x C. douglasii (CA, ID, MT, ON, and WA) and 4x C. suksdorfii (MT) are observed and poorly resolved (labelled as S2; Fig. 5.3). These S2 sequences are united as a monophyletic clade in a single most parsimonious tree (not shown), equivalent to clade A1 of PISTILLATA (Fig. 5.2).

To provide an additional view of relationships, a haplotype network was constructed with 70 PEPC-L sequences (Fig. 5.4). Seventeen haplotypes labelled as A-Q were identified. Sequences of 2x C. suksdorfii are found in haplotypes A, B, D, E, F, and G. Sequences of 3x C. suksdorfii from Oregon (OR6) are either found in or related to haplotype A of the diploids, supporting their close association, as indicated in the PISTILLATA-L and PEPC-S trees (Fig. 5.2 and 5.3). On the other hand, sequences of the other 3x (ID5, ID6) and 4x (MT2) C. suksdorfii, either separately or in combination, appear in multiple haplotypes of 2x C. suksdorfii (F and G) and 4x C. douglasii (F, H, I, L, Q, and M; Fig. 5.4), suggesting heterogeneous origins of these individuals.

5.4. Discussion

In this paper, we examined diploid and polyploid samples at the interspecific and intraspecific interface with both the tree and network approaches to test the auto- and allopolyploid origin hypotheses in Crataegus. Crataegus suksdorfii and C. douglasii of series Douglasianae have previously been described as being distinct in morphology, chromosome number, and ecological habitats (Brunsfeld & Johnson 1990), but subsequent studies have demonstrated that the situation is more complex because of the variation in ploidy level and reproductive system (Dickinson et al. 1996; Chapter 4). Moreover, preliminary morphological data clearly indicate that C. suksdorfii is a more variable taxon than has been documented previously (Dickinson et al., in press). The present study is the first report of the origins and reticulation history of polyploids in these species complexes inferred from nucleotide sequence markers.

5.4.1. Duplications of PISTILLATA and PEPC genes

PISTILLATA and PEPC genes were suggested to be of single or low-copy number in some previous studies (Matsuoka & Minami 1989; Goto & Meyerowitz 1994; Bailey & Doyle 1999). Our results detected at least two paralogous copies in both genes and these copies were present in all of our samples, polyploids as well as diploids. Because sequences of one paralog obtained from all individuals were shown to be monophyletic and distinct from the other paralog of the same individuals (Fig. 5.2 and 5.3), gene duplication is inferred to have occurred in the ancestors of C. suksdorfii and C. douglasii, prior to polyploidization (i.e. genome duplication).

Theory suggests three most probable outcomes in the evolution of duplicated genes: (1) one copy may become non-functional (pseudogene) by degenerative mutations; (2) one copy may acquire an alternative function and become preserved by natural selection; or (3) both copies may become partially compromised and involved in the same function, which could alter regulation of gene expression (Guo et al. 1996; Lynch & Conery 2000; Adams & Wendel 2005b; Adams 2007). In PISTILLATA, the S-copy is believed to be the functional paralog because of the conserved reading frame and intron. In contrast, putative exons of the L-copy are riddled with indels and there is no clear open reading frame, consistent with pseudogene formation. In PEPC, long open reading frames were identified in both L- and S-copies. The L-copy harbours a conserved reading frame and intron, while the S-copy exons are characterized by several substitutions and indels resulting in a frame shift. Although the functionality of these paralogues is uncertain without expression data, mutations accumulated after gene duplication provide a valuable source of information for species history.

In PEPC, there is some indication that a second gene duplication occurred subsequent to genome duplication (polyploidy). Our extensive sequencing of clones indicates a subset of PEPC-S sequences (marked as S2; Fig. 5.3) present only in tetraploids but not diploids and triploids of our samples. These sequences were poorly resolved in the PEPC-S clade (Fig. 5.3) and might represent an additional paralog derived by genome duplication during tetraploid formation in C. douglasii and C. suksdorfii. Although no Southern hybridization data are available to confirm the number of PEPC gene copies, gene dosage has been shown to be positively correlated to ploidy level in plants and other organisms (Guo et al. 1996; Meyer & Schartl 1999; Veitia 2005).

5.4.2. Autopolyploid and allopolyploid formation

Individuals in C. suksdorfii were found not only to be of auto- but also of allopolyploid origins. A reticulation model was synthesized based on all findings including ploidy level and evolutionary relationships inferred from the chloroplast and nuclear genes to illustrate the following possible routes of polyploid formation in the Douglasianae (Fig. 5.5).

5.4.2.1. Route 1—Autotriploidy

Triploid individuals of C. suksdorfii from Oregon were shown to be monophyletic and closely associated only with 2x C. suksdorfii in the chloroplast and nuclear data (Fig. 5.1-5.4). These 3x individuals display considerably lower haplotype and nucleotide diversity than the others in the sampled chloroplast regions (Table 5.4), which is expected given the lower fertility of triploid than of diploid and tetraploid plants (Dickinson et al. 1996). Marhold and Lihová (2006) suggested that for polyploid taxa, a close resemblance to diploids and a lack of unique alleles are usually signatures of autopolyploid origin. Here, we infer that both the nuclear and organelle genomes were singly inherited from diploid to triploid C. suksdorfii in Oregon with no other taxa involved (Fig. 5.5). Reproduction via unreduced egg (2x♀) or pollen (2x♂) gametes in diploid trees is one possible mechanism for this autotriploid formation. However, such an occurrence would be rare given that the flow cytometry data available so far documented only diploid but no triploid embryos in diploid Crataegus plants (Talent & Dickinson 2007; Chapter 4). These interpretations call for further extensive surveys of seeds from several diploid and surrounding trees in the Oregon sites.

5.4.2.2. Route 2—Allotriploidy

Unlike the Oregon triploids, sequences of 3x C. suksdorfii from Idaho are consistently associated with both 2x C. suksdorfii and 4x C. douglasii in the nuclear and chloroplast data (Fig. 5.1-5.4) and reveal a greater amount of variation among individuals (Table 5.4). The heterogeneity in nuclear sequences suggests polyphyletic origins of these 3x individuals with more than one parental lineage involved and supports the allopolyploid hypothesis. We infer that individuals of 2x C. suksdorfii and 4x C. douglasii hybridized and contributed their nuclear genomes to the allotriploid progeny. Also, both parental species could have served as the maternal donor based on the chloroplast haplotype heterogeneity, with clear preponderance of maternal influence from 4x C. douglasii (Fig. 5.1 and 5.5).

5.4.2.3. Route 3—Backcrossing to tetraploidy

While tetraploid C. suksdorfii (MT2) and allotriploids (ID) share similar chloroplast haplotypes (e.g., I, P, O; Fig. 5.1), sequences of the former are found in polyphyletic lineages in the nuclear data (Fig. 5.3, 5.4). One likely pathway for the 4x C. suksdorfii formation is backcrossing between the allotriploids (3x♀) and their 2x progenitors (x♂), with allotriploids serving preponderantly as the maternal donor (Fig. 5.5). Dickinson et al. (1996) estimated that the pollen production per flower and stainability in these allotriploids are as high as those of the diploid plants. This suggests that allotriploids are equally fertile and capable of producing successful seed set. Here, we infer that through backcrossing of the allotriploids, genetic materials of not only 2x C. suksdorfii but also 4x C. douglasii are expected to be inherited and transmitted to the tetraploid progeny. An alternative pathway for the 4x C. suksdorfii formation is direct hybridization between 4x C. douglasii and 2x C. suksdorfii. In theory, the fusion of a reduced gamete (2x♀ or 2x♂) from 4x C. douglasii with an unreduced gamete (2x♀ or 2x♂) from 2x C. suksdorfii could give rise to allotetraploid offspring. However, this would require the bypassing of meiotic reduction in C. suksdorfii diploids, which is considered to be rare and less likely to occur (Chapter 4).
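The ploidy arithmetic behind Routes 1-3 and the alternative direct pathway can be checked with a short sketch. It assumes that an unreduced gamete carries the full parental chromosome complement and a reduced gamete carries half of it; the reduced gametes assigned to both parents in Route 2 are an assumption of this sketch, and the function and dictionary names are hypothetical.

```python
# Minimal sketch: offspring ploidy as the sum of egg and pollen gamete ploidy.

def gamete_ploidy(parent_ploidy: int, reduced: bool) -> int:
    """Ploidy of a gamete: half the parent if reduced, the same if unreduced."""
    if not reduced:
        return parent_ploidy
    if parent_ploidy % 2 != 0:
        raise ValueError("balanced reduced gametes are not modelled for odd ploidy")
    return parent_ploidy // 2

routes = {
    # Route 1: unreduced (2x) and reduced (x) gametes of diploid C. suksdorfii
    "route1_autotriploid": gamete_ploidy(2, reduced=False) + gamete_ploidy(2, reduced=True),
    # Route 2 (assumed reduced gametes): x from 2x C. suksdorfii, 2x from 4x C. douglasii
    "route2_allotriploid": gamete_ploidy(2, reduced=True) + gamete_ploidy(4, reduced=True),
    # Route 3: unreduced egg of the allotriploid (3x) and reduced pollen (x) of the diploid
    "route3_backcross": gamete_ploidy(3, reduced=False) + gamete_ploidy(2, reduced=True),
    # Alternative: reduced gamete of 4x C. douglasii and unreduced gamete of 2x C. suksdorfii
    "alternative_direct": gamete_ploidy(4, reduced=True) + gamete_ploidy(2, reduced=False),
}

for name, ploidy in routes.items():
    print(f"{name}: {ploidy}x")
```

Both tetraploid pathways give 4x offspring, so ploidy alone cannot distinguish them; the argument in the text rests on the rarity of unreduced gametes in diploids.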

5.4.2.4. Route 4—Gene flow between sympatric tetraploids

Subsequent to tetraploid C. suksdorfii formation, recurrent gene flow is suggested between these 4x individuals of C. suksdorfii and 4x C. douglasii, as evidenced by their unique associations detected in the chloroplast and nuclear data (e.g., haplotypes J and L in Fig. 5.1; clade A1 in Fig. 5.2; S2 sequences in Fig. 5.3). Individuals of these two taxa occur in sympatry at our Montana site where no clear niche segregation is observed. Moreover, their pollen has been shown to be highly viable (Dickinson et al. 1996), which makes crossing possible. Further investigations of population genetic structure using hypervariable markers are needed to understand the evolutionary dynamics of these polyploid individuals.

5.4.3. Conclusions

Findings of the present study, based on two unlinked nuclear and two chloroplast regions and employing both tree and network analytical methods, point to the multiple origins of polyploids in the C. suksdorfii complex. The occurrence of nuclear gene duplication prior to polyploidization in this species complex allows us to reconstruct relationships with separate paralogs. Autopolyploid and allopolyploid lineages are identified in the Douglasianae and a reticulation model is proposed from synthesis of ploidy level and molecular data to infer the potential parents and related gene flow events. In short, triploid individuals were derived either homogeneously by receiving almost identical nuclear and chloroplast genomes from a diploid progenitor (as autotriploid), or heterogeneously by merging genomes from diploid C. suksdorfii and tetraploid C. douglasii (as allotriploid). Tetraploid individuals are likely formed via the triploid bridge coupled with subsequent gene flow with C. douglasii. Taxonomic implications of these results will be discussed elsewhere in conjunction with data on morphological variation within what now must be seen as C. suksdorfii sensu lato. Polyploids are found to be increasingly common not only in plants but also in some vertebrates and will continue to raise questions regarding their origins. With the use of molecular gene markers, our study identifies different pathways of polyploid formation in Crataegus in western North America and sheds light on the reticulation history of this taxonomically complicated group. Our findings also highlight the usefulness of the network approach to resolving relationships beyond the species level, especially in organisms where hybridization and polyploidization are key evolutionary processes.

5.5. Acknowledgements

The authors thank Maria Kuzmina, Kristen Choffe, Matthew Hébert-Lee, and Jean-Marc Moncalvo for assistance with sequencing; Rhoda Love and Peter Zika for assistance in making plant collections; and Annabel Por, Cheying Ng, and Jenny Bull for organizing the vouchers. Financial support from the Natural Sciences and Engineering Research Council of Canada (grant A3430 to TAD, 326439-06 to SS), the Botany Department of the University of Toronto, and the Royal Ontario Museum, Department of Natural History is gratefully acknowledged.

5.6. References

Adams KL 2007. Evolution of duplicate gene expression in polyploid and hybrid plants. J. Hered. 98: 136-141.

_____, Wendel JF 2005a. Polyploidy and genome evolution in plants. Curr. Opin. Plant Biol. 8: 135-141.

_____, _____ 2005b. Novel patterns of gene expression in polyploid plants. Trends Genet. 21: 539-543.

Bailey CD, Doyle JJ 1999. Potential phylogenetic utility of the low-copy nuclear gene PISTILLATA in Dicotyledonous Plants: Comparison to nrDNA ITS and trnL intron in Sphaerocardamum and other Brassicaceae. Mol. Phylogenet. Evol. 13: 20-30.

Becak ML, Becak W 1998. Evolution by polyploidy in Amphibia: new insights. Cytogenet. Cell Genet. 80: 28-33.

Bowers JE, Chapman BA, Rong J, Paterson AH 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438.

Bradley R, Hillis D 1997. Recombinant DNA sequences generated by PCR amplification. Mol. Biol. Evol. 14: 592-593.

Brunsfeld SJ, Johnson FD 1990. Cytological, morphological, ecological and phenological support for specific status of Crataegus suksdorfii (Sarg.) Kruschke. Madroño 37: 274-282.

Brysting AK, Oxelman B, Huber KT, Moulton V, Brochmann C 2007. Untangling complex histories of genome mergings in high polyploids. Syst. Biol. 56: 467-476.

Cassens I, Mardulyn P, Milinkovitch MC 2005. Evaluating intraspecific “network” construction methods using simulated sequence data: Do existing algorithms outperform the global maximum parsimony approach? Syst. Biol. 54: 363-372.

Clement M, Posada D, Crandall KA 2000. TCS: a computer program to estimate gene genealogies. Mol. Ecol. 9: 1657-1659.

Comber SCL, Smith C 2004. Polyploidy in fishes: patterns and processes. Biol. J. Linn. Soc. 82: 431-442.

Cronn R, Cedroni M, Haselkorn T, Grover C, Wendel JF 2002. PCR-mediated recombination in amplification products derived from polyploid cotton. Theor. Appl. Genet. 104: 482-489.

_____, Wendel JF 2004. Cryptic trysts, genomic mergers, and plant speciation. New Phytol. 161: 133-142.

Dickinson TA, Belaoussoff S, Love RM, Muniyamma M 1996. North American black-fruited

hawthorns: I. Variation in floral construction, breeding system correlates, and their possible

evolutionary significance in Crataegus sect. Douglasii Loudon. Fol. Geobotanica 31: 355-371.

_____, Love RM 1997. North American black-fruited hawthorns: III. What is Douglas hawthorn?

Conservation and Management of Oregon's Native Flora. T. Kaye. Corvallis, OR, Native

Plant Society of Oregon.

_____, Lo EYY, Talent N, Love R In press. Black-fruited Hawthorns of North America – an

Agamic Complex? Can. J. Bot.

Dickson EE, Arumuganathan K, Kresovich S, Doyle JJ. 1992. Nuclear DNA content variation

within the Rosaceae. Am. J. Bot. 79: 1081-1086.

Evans RC, Campbell CS 2002. The origin of the apple subfamily (Maloideae; Rosaceae) is

clarified by DNA sequence data from duplicated GBSSI genes. Am. J. Bot. 89: 1478-1484.

Gallardo MH, Kausel G, Jiménez A, Bacquet C, González C, Figueroa J, Köhler N, Ojeda R

2004. Whole-genome duplications in South American desert rodents (Octodontidae). Biol. J.

Linn. Soc. 82: 443-451.

Goto K, Meyerowitz E 1994. Function and regulation of the Arabidopsis floral homeotic gene

PISTILLATA. Genes Dev. 8: 1548-1560.

Grant V 1981. Plant speciation. New York, USA: Columbia University Press.

Guo M, Davis D, Birchler JA 1996. Dosage Effects on Gene Expression in a Maize Ploidy

Series. Genetics 142: 1349-1355.

Husband BC 1998. Constraints on polyploid evolution: A test of the minority cytotype exclusion

principle. Proc. R. Soc. Lond. (Biol.) 267: 217-223.

Huber KT, Oxelman B, Lott M, Moulton V 2006. Reconstructing the evolutionary history of

polyploids from multilabeled Trees. Mol. Biol. Evol. 23: 1784-1791.

_____, Moulton V 2006. Phylogenetic networks from multi-labelled trees. J. Math. Biol. 52:

613-632.

Huson DH, Bryant D 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23: 254-267.

Joly S, Starr JR, Lewis WH, Bruneau A. 2006. Polyploid and hybrid evolution in roses east of

the Rocky Mountains. Am. J. Bot. 93: 412-425.

Judo M, Wedel A, Wilson C 1998. Stimulation and suppression of PCR-mediated recombination.

Nucleic Acids Res. 26: 1819-1825.

Levin DA 2002. The role of chromosomal change in plant evolution. Oxford University Press,

New York.

Linder CR, Rieseberg LH 2004. Reconstructing patterns of reticulate evolution in plants. Am. J.

Bot. 91: 1700-1708.

Lyons-Weiler J, Milinkovitch MC 1997. A phylogenetic approach to the problem of differential

lineage sorting. Mol. Biol. Evol. 14: 968-975.

Lynch M, Conery JS 2000. The evolutionary fate and consequences of duplicate genes. Science

290: 1151-1155.

Marhold K, Lihová J 2006. Polyploidy, hybridization and reticulate evolution: lessons from the

Brassicaceae. Pl. Syst. Evol. 259: 143-174.

Martin D, Rybicki E 2000. RDP: detection of recombination amongst aligned sequences.

Bioinformatics, 16: 562-563.

Matsuoka M, Minami EI 1989. Complete structure of the gene for phosphoenolpyruvate

carboxylase from maize. Eur. J. Biochem. 181: 593-598.

Meyer A, Schartl M 1999. Gene and genome duplications in vertebrates: the one-to-four

(-to-eight in fish) rule and the evolution of novel gene functions. Curr. Opin. Cell Biol. 11:

699-704.

Meyers LA, Levin DA 2006. On the abundance of polyploids in flowering plants. Evolution 60:

1198-1206.

Milne I, Wright F, Rowe G, Marshall DF, Husmeier D, McGuire G 2004. TOPALi: Software for automatic identification of recombinant sequences within DNA sequence alignments.

Bioinformatics 20: 1806-1807.

Müller K 2005. Incorporating information from length-mutational events into phylogenetic

analysis. Mol. Phylogenet. Evol. 38: 667-676

Nei M 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.

Oxelman B, Bremer B 2000. Discovery of paralogous nuclear gene sequences coding for the

second-largest subunit of RNA polymerase II (RPB2) and their phylogenetic utility in

Gentianales of the . Mol. Biol. Evol. 17: 1131-1145.

Pamilo P, Nei M 1988. Relationships between gene trees and species trees. Mol. Biol. Evol. 5:

568-583.

Phipps JB, Robertson KR, Smith PG, Rohrer JR 1990. A checklist of the subfamily Maloideae

(Rosaceae). Can. J. Bot. 68: 2209-2269.

_____, O' Kennon RJ, Lance RW 2003. Hawthorns and medlars. Timber Press, Portland OR.

Popp M, Oxelman B 2001. Inferring the history of the polyploid Silene aegaea

(Caryophyllaceae) using plastid and homoeologous nuclear DNA sequences. Mol.

Phylogenet. Evol. 20: 474-481.

_____, Erixon P, Eggens F, Oxelman B 2005. Origin and evolution of a circumpolar polyploid

species complex in Silene (Caryophyllaceae) inferred from low copy nuclear RNA

polymerase introns, rDNA, and chloroplast DNA. Syst. Bot. 30: 302-313.

Posada D, Crandall KA 2001. Evaluation of methods for detecting recombination from DNA

sequences: computer simulations. Proc. Natl. Acad. Sci. USA 98: 13757-13762.

Ptacek MB, Gerhardt HC, Sage RD 1994. Speciation by polyploidy in treefrogs: multiple

origins of the tetraploid, Hyla versicolor. Evolution 48: 898-908.

Rambaut A 2002. Se-Al Sequence Alignment Editor v2.0a11. Oxford: University of Oxford.

Ramsey J, Schemske DW 1998. Pathways, mechanisms, and rates of polyploid formation in flowering plants. Ann. Rev. Ecol. Syst. 29: 467-501.

Rozas J, Sánchez-Delbarrio JC, Messeguer X, Rozas R 2003. DnaSP, DNA polymorphism

analyses by the coalescent and other methods. Bioinformatics 19: 2496-2497.

Simmons MP, Ochoterena H 2000. Gaps as characters in sequence-based phylogenetic analyses.

Syst. Biol. 49: 369-381.

Smedmark JEE, Eriksson T, Rodger CE, Christopher CS 2003. Ancient allopolyploid speciation

in Geinae (Rosaceae): Evidence from nuclear Granule-Bound Starch Synthase (GBSSI) gene

sequences. Syst. Biol. 52: 374-385.

Soltis DE, Rieseberg LH 1986. Autopolyploidy in Tolmiea menziesii (Saxifragaceae): evidence

from enzyme electrophoresis. Am. J. Bot. 73: 310– 318.

_____, Soltis PS, Tate JA 2004. Advances in the study of polyploidy since plant speciation.

New Phytol. 161: 173-191.

Swofford DL 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods),

version 4.0b10 Sinauer Sunderland, Massachusetts, USA.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae):

evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83:

1268-1304.

_____, _____ 2007. Ploidy level increase and decrease in seeds from crosses between sexual

diploids and asexual triploids and tetraploids in Crataegus L. (Rosaceae, Spiraeoideae,

Pyreae). Can. J. Bot. 85: 570-584.

Templeton AR, Crandall KA, Sing CF 1992. A cladistic analysis of phenotypic associations

with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III.

Cladogram estimation. Genetics 132: 619-633.

Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG 1997. The ClustalX windows

interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 24: 4876-4882.

Veitia RA 2005. Paralogs in polyploids: One for All and All for One? Plant Cell 17: 4-11.

Vriesendorp B, Bakker FT 2005. Reconstructing patterns of reticulate evolution in angiosperms:

what can we do? Taxon 54: 593-604.

Wendel JF, Doyle JJ 2005. Polyploidy and evolution in plants. In: Henry R. J. (ed.) Plant

diversity and evolution: Genotypic and phenotypic variation in higher plants. Oxfordshire

(UK): CABI. Pp. 97-117.

Table 5.1 Summary of C. douglasii and C. suksdorfii individuals included in amplifications of the two nuclear gene regions PEPC and PISTILLATA. Two duplicated paralogs, -S and -L respectively, were identified in each gene. The total number of clones sequenced from each individual and the number of clones found for each paralog (in parentheses) are indicated. Superscripts 1 and 2 denote segregates of C. douglasii that are morphologically described as C. okennonii and C. castlegarensis according to Phipps and O'Kennon (1998). Bolded letters indicate the locality abbreviations used in the text.

Nuclear gene regions: PEPC and PISTILLATA

| Species | Label | State/Province; County; Locality | # of PEPC clones (S-, L-copy) | # of PISTILLATA clones (S-, L-copy) |
| --- | --- | --- | --- | --- |
| C. douglasii sensu lato | CA3L0613 | California; Shasta; Hat Creek | 10 (8, 2) | 9 (5, 4) |
| C. douglasii sensu lato | ON20L011 | Ontario; Grey; Big Bay, Colpoy's range | 9 (7, 2) | 14 (7, 7) |
| C. douglasii sensu lato | ON20L015 | Ontario; Grey; Big Bay, Colpoy's range | 8 (5, 3) | 8 (3, 5) |
| C. douglasii sensu lato | ID2AL138 | Idaho; Latah; Little Boulder Creek | 13 (8, 5) | 10 (5, 5) |
| C. douglasii sensu lato | ID6AL166 | Idaho; Adams; Last Chance Campground, near Meadows | 11 (8, 3) | 11 (4, 7) |
| C. douglasii sensu lato | ID6AL170 | Idaho; Adams; Last Chance Campground, near Meadows | 9 (6, 3) | 9 (5, 4) |
| C. douglasii sensu lato | ID15L197 | Idaho; Lemhi; US 93 S of Gibbonville | 11 (5, 6) | 8 (5, 3) |
| C. douglasii sensu lato | ID20L121 | Idaho; Nez Perce; Little Potlatch Creek | 10 (7, 3) | 9 (4, 5) |
| C. douglasii sensu lato | MT2AL032 | Montana; Powell; Kleinschmidt Flat | 11 (6, 5) | 11 (5, 6) |
| C. douglasii sensu lato | MT2AL039 | Montana; Powell; Kleinschmidt Flat | 15 (12, 3) | 9 (4, 5) |
| C. douglasii sensu lato | MT2A0141 | Montana; Powell; Kleinschmidt Flat | 8 (6, 2) | 8 (4, 4) |
| C. douglasii sensu lato | WA5S0703 | Washington; Chelan | 11 (7, 4) | 11 (3, 8) |
| C. douglasii sensu lato | WA22L155¹ | Washington; Whitman; South of Colfax | 11 (9, 2) | 8 (3, 5) |
| C. douglasii sensu lato | WA21L089² | Washington; Thurston; Mound Prairie | 12 (7, 5) | 8 (3, 5) |
| C. suksdorfii sensu lato | CA5L0616 | California; Siskiyou; Fay Lane | 8 (5, 3) | 8 (4, 4) |
| C. suksdorfii sensu lato | CA5L0622 | California; Siskiyou; Fay Lane | 8 (6, 2) | 8 (6, 2) |
| C. suksdorfii sensu lato | OR1EL070 | Oregon; Linn; Cogswell Foster Reserve | 9 (6, 3) | 9 (7, 2) |
| C. suksdorfii sensu lato | OR1EL072 | Oregon; Linn; Cogswell Foster Reserve | 8 (7, 1) | 8 (5, 3) |
| C. suksdorfii sensu lato | OR1EL075 | Oregon; Linn; Cogswell Foster Reserve | 8 (4, 4) | 7 (4, 3) |
| C. suksdorfii sensu lato | OR6EL050 | Oregon; Lane; Patterson Mountain Prairie | 9 (5, 4) | 9 (4, 5) |
| C. suksdorfii sensu lato | OR6EL057 | Oregon; Lane; Patterson Mountain Prairie | 15 (9, 6) | 8 (3, 5) |
| C. suksdorfii sensu lato | OR6EL062 | Oregon; Lane; Patterson Mountain Prairie | 13 (10, 3) | 8 (4, 4) |
| C. suksdorfii sensu lato | OR6EL065 | Oregon; Lane; Patterson Mountain Prairie | 12 (8, 4) | 13 (7, 6) |
| C. suksdorfii sensu lato | OR11L104 | Oregon; Columbia; Sauvie Island | 9 (6, 3) | 9 (5, 4) |
| C. suksdorfii sensu lato | OR11L115 | Oregon; Columbia; Sauvie Island | 8 (3, 5) | 7 (3, 4) |
| C. suksdorfii sensu lato | ID6BL165 | Idaho; Adams; Last Chance Campground, near Meadows | 10 (6, 4) | 12 (8, 4) |
| C. suksdorfii sensu lato | ID6BL172 | Idaho; Adams; Last Chance Campground, near Meadows | 11 (8, 3) | 9 (3, 6) |
| C. suksdorfii sensu lato | ID6BL173 | Idaho; Adams; Last Chance Campground, near Meadows | 9 (6, 3) | 8 (3, 5) |
| C. suksdorfii sensu lato | ID5BL188 | Idaho; Valley; North Beach, Payette Lake | 13 (7, 6) | 11 (8, 3) |
| C. suksdorfii sensu lato | MT2BL026 | Montana; Powell; Kleinschmidt Flat | 14 (12, 2) | 12 (6, 6) |
| C. suksdorfii sensu lato | MT2BL030 | Montana; Powell; Kleinschmidt Flat | 13 (10, 3) | 14 (7, 7) |
| C. suksdorfii sensu lato | MT2BL036 | Montana; Powell; Kleinschmidt Flat | 12 (8, 4) | 14 (6, 8) |
| C. suksdorfii sensu lato | MT2BL045 | Montana; Powell; Kleinschmidt Flat | 13 (9, 4) | 12 (7, 4) |
| C. suksdorfii sensu lato | WA7Z18485 | Washington; Clark | - | 7 (4, 3) |

Table 5.2 Summary of C. douglasii and C. suksdorfii individuals included in amplifications of the two chloroplast intergenic regions psbA-trnH and trnH-rpl2. Superscripts 1 and 2 denote segregates of C. douglasii that are morphologically described as C. okennonii and C. castlegarensis according to Phipps and O'Kennon (1998). The number of individuals (N) included from each site is indicated. A total of 19 haplotypes (labelled A-S) were identified from the 132 sequences, and these haplotypes are indicated with respect to sites of the two taxa. The statistical parsimony network of these haplotypes is shown in Figure 5.1.

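Chloroplast intergenic regions: psbA-trnH and trnH-rpl2

| Species | Label | State/Province; County; Locality | N | Haplotype groups |
| --- | --- | --- | --- | --- |
| C. douglasii sensu lato | ID2 | Idaho; Latah; Little Boulder Creek | 9 | N, P |
| C. douglasii sensu lato | ID3 | Idaho; Benewah; St. Maries River, Santa Creek | 2 | N, P |
| C. douglasii sensu lato | ID6 | Idaho; Adams; Last Chance Campground, near Meadows | 11 | M, N, O, P |
| C. douglasii sensu lato | ID15 | Idaho; Lemhi; US93 N of Salmon | 4 | M, P |
| C. douglasii sensu lato | ID16 | Idaho; Lemhi; US 93 S of Gibbonville | 3 | M, P |
| C. douglasii sensu lato | ON20 | Ontario; Grey; Big Bay, Colpoy's range | 9 | P, Q |
| C. douglasii sensu lato | MT2 | Montana; Powell; Kleinschmidt Flat | 12 | J, L, P |
| C. douglasii sensu lato | WA22¹ | Washington; Whitman; South of Colfax | 5 | P, R |
| C. douglasii sensu lato | WA21² | Washington; Thurston; Mound Prairie | 6 | R, S |
| C. suksdorfii sensu lato | CA5 | California; Siskiyou; Fay Lane | 5 | D, F |
| C. suksdorfii sensu lato | OR1 | Oregon; Linn; Cogswell Foster Reserve | 9 | A, B, C, E, G |
| C. suksdorfii sensu lato | OR6 | Oregon; Lane; Patterson Mountain Prairie | 20 | A, B, C, H |
| C. suksdorfii sensu lato | OR11 | Oregon; Columbia; Sauvie Island | 9 | A, B, E, G |
| C. suksdorfii sensu lato | ID5 | Idaho; Valley; North Beach, Payette Lake | 2 | O, P |
| C. suksdorfii sensu lato | ID6 | Idaho; Adams; Last Chance Campground, near Meadows | 16 | I, O, P |
| C. suksdorfii sensu lato | MT2 | Montana; Powell; Kleinschmidt Flat | 13 | I, J, K, L, O, P |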

Table 5.3 Mean nuclear 2C values and standard deviations (SD) of C. douglasii and C. suksdorfii individuals obtained from flow cytometry. Localities of these individuals are indicated in Table 5.1. SD denotes the standard deviation of the mean 2C value in each measurement.

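| Taxa | Individual | Mean 2C (pg) | SD | Ploidy level |
| --- | --- | --- | --- | --- |
| C. douglasii sensu lato | CA3L0613 | 2.86 | 0.38 | 4x |
| C. douglasii sensu lato | ON21L011 | 2.92 | 0.15 | 4x |
| C. douglasii sensu lato | ON20L015 | 3.00 | 0.24 | 4x |
| C. douglasii sensu lato | ID20L121 | 2.96 | 0.18 | 4x |
| C. douglasii sensu lato | ID6AL166 | 2.93 | 0.16 | 4x |
| C. douglasii sensu lato | ID6AL170 | 2.90 | 0.15 | 4x |
| C. douglasii sensu lato | ID2AL138 | 2.84 | 0.16 | 4x |
| C. douglasii sensu lato | ID15L197 | 2.85 | 0.21 | 4x |
| C. douglasii sensu lato | MT2AL032 | 2.92 | 0.14 | 4x |
| C. douglasii sensu lato | MT2AL039 | 2.90 | 0.17 | 4x |
| C. douglasii sensu lato | MT2A0141 | 3.15 | 0.31 | 4x |
| C. douglasii sensu lato | WA5S0703 | 3.25 | 0.24 | 4x |
| C. douglasii sensu lato | WA22L155¹ | 2.77 | 0.34 | 4x |
| C. douglasii sensu lato | WA21L089² | 3.01 | 0.18 | 4x |
| C. suksdorfii sensu lato | CA5L0616 | 1.59 | 0.09 | 2x |
| C. suksdorfii sensu lato | CA5L0622 | 1.24 | 0.09 | 2x |
| C. suksdorfii sensu lato | OR1EL070 | 1.49 | 0.08 | 2x |
| C. suksdorfii sensu lato | OR1EL072 | 1.51 | 0.09 | 2x |
| C. suksdorfii sensu lato | OR1EL075 | 1.52 | 0.11 | 2x |
| C. suksdorfii sensu lato | OR11L104 | 1.45 | 0.11 | 2x |
| C. suksdorfii sensu lato | OR11L115 | 1.67 | 0.09 | 2x |
| C. suksdorfii sensu lato | OR6EL050 | 2.25 | 0.11 | 3x |
| C. suksdorfii sensu lato | OR6EL057 | 2.25 | 0.14 | 3x |
| C. suksdorfii sensu lato | OR6EL062 | 2.22 | 0.14 | 3x |
| C. suksdorfii sensu lato | OR6EL065 | 2.22 | 0.14 | 3x |
| C. suksdorfii sensu lato | ID6BL165 | 2.21 | 0.12 | 3x |
| C. suksdorfii sensu lato | ID6BL172 | 2.03 | 0.11 | 3x |
| C. suksdorfii sensu lato | ID6BL173 | 2.29 | 0.15 | 3x |
| C. suksdorfii sensu lato | ID5BL188 | 2.26 | 0.11 | 3x |
| C. suksdorfii sensu lato | MT2BL026 | 2.98 | 0.17 | 4x |
| C. suksdorfii sensu lato | MT2BL030 | 2.92 | 0.16 | 4x |
| C. suksdorfii sensu lato | MT2BL036 | 2.91 | 0.19 | 4x |
| C. suksdorfii sensu lato | MT2BL045 | 2.95 | 0.18 | 4x |
| C. suksdorfii sensu lato | WA718485 | 1.42 | 0.13 | 2x |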
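The relationship between the mean 2C values and the ploidy levels in Table 5.3 can be illustrated with a minimal sketch. It assumes a monoploid DNA amount of about 0.75 pg, which is simply half of a typical diploid 2C value in the table; this constant and the function name `infer_ploidy` are illustrative assumptions, not the calibration used in the flow cytometry work.

```python
# Illustrative sketch only: the monoploid value is an assumption taken as
# roughly half of the diploid 2C values listed in Table 5.3 (about 1.5 pg).
ASSUMED_MONOPLOID_PG = 0.75


def infer_ploidy(two_c_pg: float, monoploid_pg: float = ASSUMED_MONOPLOID_PG) -> int:
    """Return the nearest whole number of chromosome sets for a 2C value."""
    return round(two_c_pg / monoploid_pg)


# A few mean 2C values copied from Table 5.3.
examples = {
    "OR1EL070": 1.49,  # listed as 2x
    "OR6EL050": 2.25,  # listed as 3x
    "MT2BL030": 2.92,  # listed as 4x
}

for label, two_c in examples.items():
    print(f"{label}: 2C = {two_c:.2f} pg, inferred ploidy = {infer_ploidy(two_c)}x")
```

Under this assumption the extreme values in the table (1.24 pg and 3.25 pg) still round to 2x and 4x, but a real assignment would rest on the internal standard and reference diploids used in the measurements.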

Table 5.4 Diversity measures from chloroplast sequences of C. douglasii and C. suksdorfii individuals from separate or combined localities, estimated with DnaSP. K: average number of nucleotide differences; Hd: haplotype diversity; Pi: nucleotide diversity; sd: standard deviation.

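| Species | Cytotype | Locality id | # of individuals | # of poly. sites | K | Hd ± sd | Pi ± sd (×10⁻³) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| C. suksdorfii | Diploid | OR1 | 9 | 7 | 2.87 | 0.933 ± 0.122 | 5.31 ± 0.98 |
| C. suksdorfii | Diploid | OR11 | 9 | 6 | 2.22 | 0.889 ± 0.091 | 4.28 ± 1.11 |
| C. suksdorfii | Diploid | CA5 | 5 | 5 | 2.40 | 0.900 ± 0.161 | 3.95 ± 0.67 |
| C. suksdorfii | Triploid | OR6 | 20 | 7 | 0.79 | 0.284 ± 0.128 | 1.41 ± 0.68 |
| C. suksdorfii | Triploid | ID5, ID6 | 16 | 4 | 0.70 | 0.600 ± 0.127 | 1.24 ± 0.33 |
| C. suksdorfii | Tetraploid | MT2 | 13 | 11 | 5.23 | 0.833 ± 0.081 | 9.29 ± 1.05 |
| C. douglasii | Tetraploid | ON20 | 9 | 9 | 3.67 | 0.917 ± 0.073 | 6.50 ± 1.24 |
| C. douglasii | Tetraploid | ID2, ID3 | 11 | 6 | 2.22 | 0.945 ± 0.054 | 3.95 ± 0.50 |
| C. douglasii | Tetraploid | ID6 | 11 | 2 | 0.66 | 0.327 ± 0.153 | 1.16 ± 0.54 |
| C. douglasii | Tetraploid | ID15, ID16 | 7 | 4 | 1.14 | 0.524 ± 0.209 | 2.03 ± 1.05 |
| C. douglasii | Tetraploid | MT2 | 12 | 7 | 1.17 | 0.682 ± 0.148 | 2.07 ± 0.66 |
| C. douglasii | Tetraploid | WA21, WA22 | 11 | 8 | 2.58 | 0.873 ± 0.071 | 4.91 ± 1.03 |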
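As a rough illustration of the three measures reported in Table 5.4, the sketch below computes K, Hd, and Pi for a small set of aligned sequences using the standard definitions of Nei (1987). The toy sequences and function names are hypothetical, and the sketch does not reproduce how DnaSP treats gaps, missing data, or standard deviations.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def mean_pairwise_differences(seqs):
    """K = average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Pi = K divided by the alignment length (gaps are not treated specially)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Hypothetical aligned sequences of equal length, not data from this study.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]

hd = haplotype_diversity(toy)
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
print(f"Hd = {hd:.3f}, K = {k:.3f}, Pi = {pi:.4f}")
```

In these terms, a population dominated by one haplotype (as in the triploid OR6 sample) gives a low Hd even when several polymorphic sites are present, because the squared frequency of the common haplotype dominates the sum.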

Table 5.5 Summary of results for PISTILLATA and PEPC paralogue sequences among C. douglasii and C. suksdorfii individuals. PI: parsimony informative sites; MPTs: equally most parsimonious trees; CI: consistency index; RI: retention index.

| Gene paralog | Size (bp) | Total seq. | Recombinant seq. | Analysed seq. | Variable sites | PI sites | # of MPTs | Tree length | CI | RI |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PISTILLATA-L | 1066 | 161 | 3 | 158 | 325 | 157 | >50,000 | 637 | 0.73 | 0.90 |
| PISTILLATA-S | 1052 | 160 | 10 | 150 | 147 | 69 | >50,000 | 322 | 0.84 | 0.83 |
| PEPC-L | 743 | 96 | 8 | 88 | 85 | 51 | >50,000 | 88 | 0.82 | 0.91 |
| PEPC-S | 705 | 236 | 11 | 225 | 189 | 106 | >50,000 | 385 | 0.68 | 0.94 |

Figure 5.1 Statistical network of chloroplast haplotypes A-S obtained from 132 sequences representing diploid and polyploid individuals of the Douglasianae from 16 localities for the psbA-trnH and trnH-rpl2 intergenic regions. Haplotypes observed in C. suksdorfii and C. douglasii with respect to sites are presented in Table 5.2. Ploidy level, locality identity, and number of individuals that share the same haplotype (in parentheses) are indicated in each haplotype. Sizes of haplotypes are proportional to the number of individuals. Dashes denote site changes from one haplotype to another under the parsimony criterion. Gray color denotes haplotypes that are shared between 2x and 3x C. suksdorfii from Oregon. Black color denotes haplotypes that are shared separately or in combination between 3x and 4x C. suksdorfii and C. douglasii.

[Figure 5.1 image: chloroplast haplotype network. Legend: 2x C. suksdorfii (s); 3x C. suksdorfii (s); 4x C. suksdorfii (s); 4x C. douglasii (d).]

Figure 5.2 Strict consensus parsimonious tree for the S and L paralogs of the PISTILLATA sequence data. Because the 150 sequences of the S-paralog obtained from all examined individuals as listed in Table 5.1 are unresolved, they are represented by a triangle in the cladogram. Each terminal is represented by a sequence obtained from clones of either a diploid, triploid, or tetraploid individual. Ploidy level and locality identity are indicated on the right. Sequences of 3x and 4x C. suksdorfii are bolded. Two clades (A and B) were identified within the L-paralog. A subclade (A1) is detected in clade A that contains exclusively sequences of 4x C. suksdorfii and 4x C. douglasii. Clade B contains only 2x and 3x C. suksdorfii from Oregon and California.

[Figure 5.2 image: cladogram of PISTILLATA clone sequences. Annotations: S paralog (N = 150, collapsed); L paralog with clades A (including subclade A1) and B; PISTILLATA gene duplication event; autotriploidy.]

Figure 5.3 Strict consensus parsimonious tree for the S and L paralogs of PEPC sequence data that resolved as two distinct clades. Each terminal was represented by a sequence from clones of either a diploid, triploid, or tetraploid individual. Ploidy level and locality identity are indicated on the right. Sequences of 3x and 4x C. suksdorfii are bolded. In the L-paralog clade, a subclade (in gray) containing 2x and 3x C. suksdorfii sequences from Oregon is recovered. In the S-paralog clade, three subclades (A1, A2, and A3) are detected in clade A. The remainder of the S-paralog clade contains sequences of 4x C. suksdorfii and 4x C. douglasii as marked by S2, which are poorly resolved. These S2 sequences are united as a monophyletic clade in a single most parsimonious tree.

[Figure 5.3 image: cladogram of PEPC clone sequences. Annotations: L paralog; S paralog with groups S1 and S2; clade A with subclades A1, A2, and A3; PEPC gene duplication event; autotriploidy.]

Figure 5.4 Statistical network of nuclear haplotypes A-Q obtained from 91 PEPC-L paralog sequences representing diploid and polyploid individuals of the Douglasianae as presented in Table 5.1. Ploidy level, individual identity, and number of sequences (in parentheses) are indicated in each haplotype. Sizes of haplotypes are proportional to the number of sequences. Dashes denote site changes from one haplotype to another under the parsimony criterion. Gray color denotes haplotypes that are shared between 2x and 3x C. suksdorfii. Black color denotes haplotypes that are shared separately or in combination between 3x and 4x C. suksdorfii and C. douglasii.

[Figure 5.4 image: PEPC-L haplotype network. Legend: 2x C. suksdorfii (s); 3x C. suksdorfii (s); 4x C. suksdorfii (s); 4x C. douglasii (d).]

Figure 5.5 Reticulation model synthesized from ploidy level, chloroplast, and nuclear data, indicating the origins of triploids and tetraploids of C. suksdorfii from different sites and the putative parental lineages involved. Solid lines indicate topologies resulting from the PEPC sequence data and dotted lines those from the PISTILLATA sequence data. Four hypothetical routes for C. suksdorfii polyploid formation are inferred subsequent to gene duplication of the two nuclear genes. These four routes are described in detail in the Discussion section. Branches in gray indicate hybridization between 2x C. suksdorfii and 4x C. douglasii. Branches in bold indicate the backcrossing of the allotriploids with their diploid progenitors. Lines with arrows indicate recurrent gene flow between the sympatric 4x C. suksdorfii and 4x C. douglasii.

[Figure 5.5 image: reticulation diagram joining the PEPC and PISTILLATA paralog topologies after the gene duplications. Taxa shown: C. douglasii 4x (CA, ON, WA, MT, ID); C. suksdorfii 2x (CA, WA, OR), 3x (ID, OR), and 4x (MT). Labelled routes: 1. Autotriploidy; 2. Hybridization to allotriploidy; 3. Backcross to tetraploidy; 4a. Recurrent gene flow in contact zone; 4b. Synapomorphy of PEPC-S via gene flow.]

Chapter 6

Population genetic structure of diploid sexuals and polyploid apomicts of Crataegus suksdorfii and C. douglasii sensu lato (Rosaceae) in the Pacific Northwest.

Abstract. Polyploidy and apomixis are two important and associated processes in plants. Many hawthorn species are characterized by polyploidy and the ability to reproduce both sexually and apomictically. However, the population structure and genetic diversity of these species are poorly understood. In western North America, Crataegus douglasii and C. suksdorfii are known to include diploid sexuals and polyploid apomicts. This paper investigates population structure and compares the genetic variability of these two related taxa with microsatellite (SSR) markers. A total of 251 alleles were detected at 13 SSR loci in 290 individuals sampled from 15 localities. Within-population multilocus genotypic variation is greatest in diploid sexuals and least in triploid apomicts. In C. douglasii, distance appears to be a barrier to gene flow between individuals from Ontario and those from the Pacific Northwest. However, frequent gene flow among individuals from Washington, Idaho, and Montana contributes to an appreciable level of genetic diversity within the tetraploid apomictic populations, and demonstrates effective colonization and successful establishment of C. douglasii in the west. In contrast, the high differentiation and strong genetic structure in C. suksdorfii suggest that little gene flow is occurring between populations, probably because of ploidy level differences and distance. We predict that apomixis and pre- or post-zygotic reproductive barriers are factors that limit gene flow between cytotypes, which may ultimately lead to morphological diversification and allopatric speciation in C. suksdorfii. Our findings shed light on the evolution of woody plants that show heterogeneous ploidy levels and reproductive systems.

Keywords: Gametophytic apomixis; gene flow; genotypic diversity; woody polyploids.

6.1. Introduction

Approximately 70% of the extant plant species are known to contain polyploids (Otto & Whitton 2000; Meyer & Levin 2002), but the genetic structure of polyploids in natural populations, particularly those of woody species, is poorly understood. The evolutionary potential of a population or a species largely relies on the levels and pattern of genetic variation. This variation is governed by biotic factors such as reproductive modes, dispersal ability, and selection regimes (Levin 1981; Hamrick et al. 1992; Soltis & Soltis 1993; Hamrick & Godt 1996; Vavrek 1998; Gornall 1999; Ouborg et al. 1999; Mallet 2005). Polyploidy has been shown to be associated with apomixis, i.e., the asexual formation of seeds, a process known to occur in over 400 angiosperm genera (Asker & Jerling 1992; Calzada et al. 1996; Carmen 1997; Whitton et al. 2008). In agamospermous species, populations are expected to be genetically uniform and likely to be differentiated only into patches of few different genotypes by stochastic colonization events (Hamrick et al. 1992; Starfinger & Stocklin 1996; McLellan et al. 1997; Nybom 1998; Paun et al. 2006). However, this notion is challenged by a series of studies that indicated surprising levels of genetic variation among and within populations of agamospermous plants (e.g., Ellstrand & Roose 1987; Bayer 1990; Watkinson & Powell 1993; Widen et al. 1994; Noyes & Soltis 1996; Richards 1996; Durand et al. 2000; D’Souza et al. 2005). Because polyploidy and apomixis are two tightly-linked processes, knowledge about the genetic structure of polyploid populations is essential for understanding the scale of colonization as well as the factors that govern evolution of wild apomicts.

To date, molecular investigations of polyploid and apomictic species have been mainly concentrated on herbaceous plants of the Asteraceae, Poaceae, Ranunculaceae, and Rosaceae (e.g., Watkinson & Powell 1993; Menken et al. 1995; Esselman et al. 1999; Van der Hulst et al. 2000; Garnier et al. 2002; Paun et al. 2006). Comparatively, very few attempts have been made to evaluate and compare population genetic structure between sexual and apomictic woody plants, whose different life form (e.g., growth habit, dispersal mechanism, and generation time) may influence the pattern and level of genetic variation (Hamrick & Godt 1990). Woody genera such as Amelanchier, Crataegus, Malus, and Sorbus of the Rosaceae are known to contain species that are characterized by polyploidy and gametophytic apomixis (Dickinson et al. 2007). Here, we seek to investigate in depth the genetic architecture in one of these genera, Crataegus, in North America. Over 60% of the approximately 100 North American species are documented to include polyploid individuals (Talent & Dickinson 2005) and many of these are taxonomically complicated as a result. In western North America, section Douglasianae, diagnosable for the most part by fruits that are black at maturity, comprises at least two such agamic complexes. Series Cerrones occupies the Colorado Plateau and adjacent regions, including eastern Nevada, southern Idaho, and Wyoming. Another complex, series Douglasianae, which we examine here, occupies the Pacific Northwest with outliers as far east as the upper Great Lakes basin (Dickinson et al. 1996). In common with other such groups of the North American hawthorns, the two best known species in series Douglasianae differ in stamen number, C. douglasii sensu lato with about 10 stamens per flower and C. suksdorfii with about 20 stamens (Evans & Dickinson 1996). These two species were thought to differ in ploidy level, with polyploids restricted to the more widely distributed C. douglasii (Brunsfeld & Johnson 1990). However, although C. suksdorfii has a narrower distribution than C. douglasii, it has been shown to comprise polyploids as well as diploids (Dickinson et al. 1996; Talent & Dickinson 2005). While C. douglasii is uniformly tetraploid, there are certain individuals that appear to be slightly differentiated morphologically and are sometimes recognized as 10-stamen segregate species (e.g., C. castlegarensis and C. okennonii; Phipps & O’Kennon 1998).

In polyploid individuals of C. suksdorfii and C. douglasii, gametophytic apomixis, i.e., formation of unreduced megagametophytes to give rise to a parthenogenetic embryo (Nogler 1984), has been shown to occur (Dickinson et al. 1996; Chapter 4). Such a reproductive system may influence genetic variation in a species, depending on the extent of gene flow between and within populations. Gene flow in hawthorns via pollen is dependent on floral visitors (e.g., bees, flies, and beetles) attracted to pollen and nectar found in the 10-20 unspecialized flowers of terminal inflorescences of sympodial lateral short shoots produced on extension growth of an earlier year. A fraction of these flowers mature into fleshy fruits in which one to five seeds are each enclosed by a woody endocarp (Dickinson 1985). Gene flow via seeds thus depends on birds and small mammals (Courtney & Manzur 1985; Guitian 1998). In Europe, diploid sexual species such as C. monogyna and C. laevigata have demonstrated frequent seed dispersal among populations based on chloroplast microsatellite markers (Fineschi et al. 2005). However, the population dynamics of North American species, particularly those with mixed reproductive systems (Chapter 4), are poorly understood. Moreover, gene flow between ploidy levels has been shown to be common in other plants when cytotypes are found in sympatry (Hagen et al. 2002; Meirmans et al. 2003; Daurelio et al. 2004; Robertson et al. 2004; Talent & Dickinson 2007), but both the degree to which this occurs between sympatric C. suksdorfii and C. douglasii cytotypes and its effects on local genetic diversity are unclear.

The overall goal of this paper is, thus, to examine the extent of gene flow and impact of reproductive system on population genetic diversity of C. douglasii and C. suksdorfii throughout their distribution ranges. Specifically, we ask the following questions: (1) Are populations of C. suksdorfii and C. douglasii genetically differentiated across our sampling sites? If so, is there a correlation between genetic and geographical distances? (2) Is there evidence that gene flow occurs between sympatric polyploid individuals of C. suksdorfii and C. douglasii? (3) Are there differences in the levels of genetic variation between diploid sexual and tetraploid apomictic populations?

6.2. Materials and methods

6.2.1. Plant materials

Our sampling localities (Fig. 6.1; Table 6.1) encompass almost the entire distribution ranges of C. suksdorfii and C. douglasii. In large populations, the trees studied were chosen randomly by two collectors using an “ignorant person” strategy (Ward 1974) to avoid biased sampling. The number of trees collected depended on the population size, with a target of at least 15 individuals (Appendix 6). Unexpanded leaves were collected, stored in silica gel, and used for DNA extraction. Fall collections, which included leaves and fruits, were used for flow cytometric determinations. Ploidy level is indicated in Table 6.1. In total, 239 trees representing 15 localities in California, the Pacific Northwest, and Ontario were included, of which 125 tetraploid individuals are C. douglasii (and its segregates) and 114 individuals (52 diploid, 41 triploid, and 21 tetraploid) are identified as C. suksdorfii. Species identification was mainly based on leaf, thorn, and flower or fruit characters such as stamen number and fruit color. Sample vouchers were deposited in the Green Plant Herbarium, Royal Ontario Museum (TRT).

6.2.2. DNA extraction and microsatellite markers

DNA was extracted from leaf tissues according to the modified protocol of Tsumura et al. (1995). Twenty-three dinucleotide microsatellite loci located on linkage groups 12 and 14 of Malus domestica (Liebhard et al. 2002) were selected for Crataegus in the preliminary primer testing because these primers were shown to be transferable to Crataegus and other Maloideae genera. All primers were designed on conserved flanking regions of tandem repeats in M. domestica and the complete procedure of primer development is described in Gianfranceschi et al. (1998). Prior to fragment analyses, we sequenced the PCR products of these primers from 2-3 individuals to confirm and identify the types of tandem repeats. PCR amplifications were performed in 15 μl volumes containing ~20 ng of genomic DNA, 1.5 μl 10 × PCR buffer (Fermentas), 0.2 mM of each dNTP, 1.5 mM MgCl2, 1 U of Taq polymerase (Fermentas), and 0.5 μM each of the forward and reverse primers. All forward primers were end-labelled with either FAM or HEX fluorescent dye. Because the same annealing temperature was used for all SSR primers, two primer pairs were combined in PCR amplifications (multiplexing). We followed the PCR conditions of Gianfranceschi et al. (1998). PCR products were analysed on an ABI3700 automatic sequencer (Applied Biosystems) in the Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Canada. Peaks were scored using the program GENEMAPPER version 3.5 (Applied Biosystems).

6.2.3. Data analyses

For each locus and population, the mean number of alleles, allelic frequency, expected heterozygosity or gene diversity (He) corrected for sample size (Nei 1987), fixation index FIS corrected for small sample size (Kirby 1975), and significance of deviation from Hardy-Weinberg equilibrium (HWE; H0 = 1 in all populations) were calculated by SPAGEDI version 1.2 (Hardy & Vekemans 2002). This software allows the estimation of individual and population relatedness based on codominant SSR data at any ploidy level.
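The sample-size correction for gene diversity can be illustrated with a short sketch. It assumes the common form He = n/(n − 1) · (1 − Σ p_i²), where n is the number of allele copies sampled at a locus and p_i are the allele frequencies. The function name and the flat-list input are hypothetical choices for illustration. The input presumes that allele dosage is known, which is often not the case for SSR peaks in polyploids, and the sketch does not reproduce how SPAGEDI handles that case.

```python
from collections import Counter


def gene_diversity(alleles):
    """Unbiased gene diversity for one locus in one population.

    `alleles` is a flat list of allele copies (hypothetical input format);
    it assumes allele dosage is known for every individual.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (expected homozygosity).
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    # n / (n - 1) is the small-sample correction.
    return n / (n - 1) * (1.0 - sum_p2)
```

For example, two alleles at equal frequency in four sampled copies give 4/3 × 0.5, while a monomorphic sample gives zero.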

For each population, genotypic variation was assessed with GENODIVE v2.0b4 (Meirmans & Van Tienderen 2004). We calculated genetic distances using the method of Bruvo et al. (2004), and the minimal distance class was set as the threshold to identify the following: (1) the number of multilocus genotypes (G); (2) the proportion of distinguishable multilocus genotypes (PD); (3) Simpson’s diversity index (D), also known as Nei’s (1987) genetic diversity corrected for sample size, which ranges from zero, where any two random individuals in a population share the same genotype, to one, where all individuals in a population are genetically different; and (4) genotype evenness (E), which ranges from zero, where one or a few genotypes dominate in a population, to one, where all genotypes are of equal frequency in a population.
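The four genotypic indices can be sketched as follows. This is a minimal illustration, not the GENODIVE implementation. It assumes that genotypes count as identical only when they match exactly (no Bruvo-distance threshold is applied), that D is the sample-size-corrected Simpson index, and that evenness is the effective number of genotypes (1/Σ p_i²) divided by G. The function name and input format are hypothetical.

```python
from collections import Counter


def genotypic_diversity(genotypes):
    """Return G, PD, D and E for a list of hashable multilocus genotypes."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(genotypes)
    g = len(counts)                      # number of distinct genotypes
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    d = n / (n - 1) * (1.0 - sum_p2)     # Simpson's D, corrected for n
    e = (1.0 / sum_p2) / g               # assumed evenness definition
    return {"G": g, "PD": g / n, "D": d, "E": e}
```

With four individuals that all carry different genotypes, the sketch returns PD = 1, D = 1, and E = 1, the pattern expected for a fully sexual population.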

To estimate molecular diversity of the samples, we calculated ANOVA-based global and pairwise F-statistics (based on allele identity under the infinite allele model; IAM) and R-statistics (based on allele size under the stepwise mutation model; SMM), with significant P values of two-sided tests obtained after 1,000 random permutations of genes, individuals, and populations with SPAGEDI. The standard genetic distances FST (Wright 1965) and DS (Nei 1978) based on IAM, as well as RST (Slatkin 1995) and Δµ² (Goldstein et al. 1995) based on SMM, were estimated. The concordance between the FST and RST estimates, and between the DS and Δµ² distance values, was tested by Mantel tests (H0 = matrices are not correlated) to determine whether our data are influenced by different mutation models. Neighbor-joining (NJ) trees based on DS and Δµ² were constructed in PHYLIP version 3.66 (Felsenstein 2006) to infer population relationships. The resulting trees were visualized with the TREEVIEW software program (Page 1996).
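A Mantel test of the kind described above can be sketched in a few lines. This is an illustrative permutation test, not the routine used by the software cited in the text. It assumes two symmetric distance matrices given as nested lists, a Pearson correlation over the upper triangles, a two-sided P value, and a fixed random seed. All function names and defaults are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper(m, order):
    """Upper-triangle entries of matrix m after relabelling rows/columns."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=1):
    """Return (r, two-sided permutation P) for distance matrices a and b."""
    ident = list(range(len(a)))
    yb = upper(b, ident)
    r_obs = pearson(upper(a, ident), yb)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = ident[:]
        rng.shuffle(order)  # permute populations, not single cells
        if abs(pearson(upper(a, order), yb)) >= abs(r_obs):
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)
```

Rows and columns are permuted together because the pairwise entries of a distance matrix are not independent observations.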

To examine the distribution of genetic variation, Analysis of Molecular Variance (AMOVA) (Excoffier et al. 1992) was performed with GENALEX version 6.0 software (Peakall & Smouse 2006). The codominant SSR data were first converted to a binary data matrix by coding the absence of a defined allele as “0” and its presence as “1”. The Jaccard coefficient (JC), which does not consider the shared absence of a character as similarity (Legendre & Legendre 1983), was then computed from the binary data to obtain an unbiased estimate of pairwise genetic distances between individuals. AMOVA analyses were then performed on the JC matrix, with significance tests based on 10,000 permutations, to determine how genetic diversity is partitioned within and between populations, so as to infer the extent of gene flow. Apart from AMOVA, the JC distance matrix was also used to construct an NJ dendrogram to reflect relationships among C. douglasii and C. suksdorfii individuals.
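The binary recoding and the Jaccard calculation can be sketched as below. The input layout (individual, then locus, then a set of allele sizes), the allele sizes themselves, and the function names are hypothetical, and taking the distance as 1 minus the Jaccard similarity is an assumption of this sketch rather than a documented detail of the analysis.

```python
def to_binary(genotypes):
    """Recode codominant genotypes as presence/absence vectors.

    `genotypes` maps individual -> locus -> set of observed alleles.
    Returns (vectors, columns); each column is a (locus, allele) pair.
    """
    columns = sorted(
        {(locus, allele)
         for loci in genotypes.values()
         for locus, alleles in loci.items()
         for allele in alleles}
    )
    vectors = {
        ind: [1 if allele in loci.get(locus, ()) else 0
              for locus, allele in columns]
        for ind, loci in genotypes.items()
    }
    return vectors, columns


def jaccard_distance(x, y):
    """1 - (shared presences / presences in either); shared absences ignored."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return 1.0 - both / either if either else 0.0
```

A diploid with alleles 100 and 102 and a triploid with alleles 102, 104, and 106 at the same locus share one of four observed alleles, so the distance under these assumptions is 0.75.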

In addition, a model-based Bayesian analysis was performed to delineate clusters of individuals on the basis of their genotypes with the software STRUCTURE (Pritchard et al. 2000). The model accounts for the presence of Hardy-Weinberg and linkage disequilibrium by introducing population structure and attempts to identify K groups that are not in disequilibrium (Pritchard et al. 2000). The number of clusters K is determined by simulating a range of K values. The posterior probability of each value was then used to detect the modal value of ΔK, a quantity related to the second order rate of change with respect to K of the likelihood function (Evanno et al. 2005). STRUCTURE estimates posterior probabilities using a Markov Chain Monte Carlo (MCMC) method, and 1,000,000 iterations of each chain following a 100,000-iteration burn-in period were performed, as recommended by Pritchard et al. (2000). Each MCMC chain for each value of K (ranging from 1 to 22) was run 15 times with the admixture model, which assumes mixed ancestry of individuals, and the independent allele frequency model, which assumes that allelic frequencies in different populations are reasonably different from each other. These parameters allow individuals with ancestors in more than one group to be assigned into one cluster. Individuals are partitioned into multiple groups according to the membership coefficient (Q), which ranges from 0 (lowest affinity to a group) to 1 (highest affinity to a group) across the K groups. Individual assignments can vary across runs if there is a weak genetic basis for affinity to one group. To address such variation, 100 separate MCMC chains were run for the optimal K (where ΔK was a maximum) to test for the consistency of membership coefficients. The partitioning of clusters is visualized in the program DISTRUCT (Rosenberg 2004).
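The ΔK criterion can be illustrated with a short sketch. It assumes the form ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the mean log-likelihood over replicate runs at a given K and s is the standard deviation across those runs. Using run means in the numerator is an assumption of this sketch, and the input format and function name are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Compute delta-K from replicate log-likelihoods.

    `lnp` maps K -> list of log-likelihood values, one per replicate run.
    Delta-K is undefined for the smallest and largest K tested, because
    both neighbouring K values are needed.
    """
    out = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            second = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
            sd = stdev(lnp[k])
            out[k] = second / sd if sd > 0 else float("inf")
    return out
```

The K with the largest ΔK, for example `max(dk, key=dk.get)` for a result `dk`, would be taken as the modal value under these assumptions.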

The relationships between genetic distances and geographic distance were analyzed using Isolation By Distance (IBD) version 1.52 software (Bohonak 2002). The IBD software program assesses the significance and evaluates the strength of the relationships between genetic distances (DS and Δµ²) and the Euclidean geographic distance (estimated by SPAGEDI based on spatial coordinates) by Mantel tests and reduced major axis (RMA) regression, respectively.
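Reduced major axis regression differs from ordinary least squares in that the slope is the ratio of the two standard deviations, signed by the direction of the covariance. A minimal sketch, assuming paired lists of geographic and genetic distances and a hypothetical function name, and not reproducing the IBD program itself:

```python
import math
from statistics import mean, stdev


def rma_regression(x, y):
    """Return (slope, intercept) of the reduced major axis line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("slope sign is undefined when covariance is zero")
    # Slope magnitude is sd(y) / sd(x); its sign follows the covariance.
    slope = math.copysign(stdev(y) / stdev(x), sxy)
    intercept = my - slope * mx
    return slope, intercept
```

Because both variables are treated as measured with error, the fitted line is the same whichever distance is placed on the x axis, apart from inversion of the slope.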

6.3. Results

6.3.1. Allelic and genotypic variation within populations

The 13 SSR loci yielded a total of 251 alleles (7-34 per locus) for our sample of 239 individuals (Table 6.2). Up to three and four alleles per locus were detected in triploid and tetraploid individuals, respectively, and up to two in diploids, consistent with their ploidy levels. Among all loci, CH05G07 is shown to be the most polymorphic in C. douglasii and CH01F02, CH01F07, and CH03C02 in C. suksdorfii with over 20 alleles (see supplementary figs.), whereas CH01D03 and CH03D08 appear to be the least variable and are nearly monomorphic in C. douglasii. The allele number and gene diversity (He) across populations of tetraploid C. douglasii are generally high and comparable to those of diploid C. suksdorfii (Tables 6.2 & 6.3). In contrast, the triploid and tetraploid populations of C. suksdorfii reveal lower diversity than diploids at the gene level. Significant negative FIS values were observed in polyploid but not in diploid populations (H0 = 1 in all populations; Table 6.3), which suggests the presence of excess heterozygotes, provided that no null alleles are present at these loci. As expected, multilocus genotypic variability (G, PD, D, and E) is shown to be the highest in diploid but lowest in triploid populations (Table 6.3). Diversity indices (D) of some tetraploid populations (e.g., MT2, ID6, ID20, and WA21) are comparable to those of diploids.

6.3.2. Partitioning genetic variation among populations

Estimates of the R-statistics are generally higher than those of the F-statistics in all groups (Table 6.4). Mantel tests indicated that FST and RST indices were significantly correlated in C. douglasii (r = 0.63, P = 0.001), but not in C. suksdorfii populations (r = 0.32, P = 0.111). The FST and RST indices of C. suksdorfii are about four-fold higher than those of C. douglasii (Table 6.4), indicative of stronger genetic differentiation among populations in C. suksdorfii. According to the AMOVA (Table 6.5), up to 37% of the variation is found among populations of C. suksdorfii (P = 0.001), which is higher than that of C. douglasii (22%; P = 0.001). When C. suksdorfii was partitioned with respect to ploidy level, up to 54% of the variation is detected between diploid and polyploid populations (P = 0.001), while only 15% is detected among diploid populations.

6.3.3. Genetic clustering of individuals

The neighbour-joining tree based on allelic similarity among C. douglasii and C. suksdorfii individuals (Fig. 6.2) indicates some differences between geographical areas but not between the two species. Individuals are divided into three major clusters (A, B, and C). Clusters A and B contain all 4x C. douglasii individuals. In cluster A, individuals from Ontario form a group neighbouring individuals from Montana, Washington, British Columbia, and Idaho. Triploid C. suksdorfii from Idaho and tetraploid C. suksdorfii from Montana are nested in cluster A with C. douglasii. In cluster B, no clear distinction is observed among C. douglasii individuals from the Pacific Northwest populations. Cluster C contains diploid individuals of C. suksdorfii from Oregon, Washington, and California (Fig. 6.2), together with triploids from Patterson Mountain (OR6). The individual-based NJ tree corroborates the interpopulation relationships shown in the Δµ²-based tree (Fig. 6.3). Populations of the west coast (OR1, OR6, OR11, WA7, and CA5) appear to be distinct from those in the Cascades and Rocky Mountains (e.g., WA21, ID2, ID6, ID15, and MT2) as well as the Great Lakes (ON20), regardless of the ploidy level and/or taxa differences. A similar topology was also found in the DS-based dendrogram (tree not shown).

STRUCTURE analysis attempts to infer the number of clusters (K) that best explains the structuring of C. douglasii and C. suksdorfii individuals when analysed together. The ln(K) values kept increasing with higher K and a small peak at K = 9 was found when the Evanno et al. (2005) posterior ΔK statistics were applied. Membership coefficients of these nine clusters are presented in Table 6.6. In C. douglasii, probability values in almost all clusters are less than 0.85, except cluster 8 (0.97), which contains individuals from Ontario. This agrees with the results of a separate analysis (Fig. 6.4A), which identified individuals from Ontario as one cluster while individuals from the Washington, Idaho, and Montana populations formed two other clusters even though they exhibit an admixture of alleles. For C. suksdorfii, high probability values (0.82-0.98) are found in five of the nine detected clusters in the combined analysis (Table 6.6). Separate analyses also recovered an optimal partitioning of K = 5 among C. suksdorfii individuals, of which polyploids from Montana, Idaho, and Oregon are identified respectively as three distinct clusters (Table 6.6; Fig. 6.4B). Diploid individuals from Washington and northern California constitute the two remaining clusters, respectively, while the Oregon diploids contain an admixture of alleles from the two. Similar results are obtained when diploid C. suksdorfii were analysed alone (data not shown).

In separate analyses of sites with sympatric cytotypes, for example in Idaho (ID06), where 3x C. suksdorfii and 4x C. douglasii individuals co-occur, STRUCTURE identified two most probable clusters (data not shown), indicating that individuals of these two taxa reveal almost no admixture of alleles. This result is different from that of the Montana site (MT02), where 4x C. suksdorfii and 4x C. douglasii co-occur. There, three clusters were identified, one of which is shared between the two taxa, suggesting that a subset of alleles was mixed by gene flow and shared among these individuals (Table 6.6).

6.3.4. Isolation by distance

Mantel tests did not reveal a significant relationship between geographical and genetic distances in C. douglasii for either DS (r = 0.299, P = 0.117) or Δµ² (r = 0.139, P = 0.475) (Fig. 6.5A). This agrees with the low FST values observed among populations (Table 6.4) and suggests the absence of isolation by distance. In contrast, there is evidence for isolation by distance in C. suksdorfii (Fig. 6.5B). A significant relationship was detected among C. suksdorfii populations for DS (r = 0.663, P = 0.002) but not for Δµ² (r = 0.064, P = 0.778). Such incongruence could be due to the differences between the IAM and SMM models applied to the C. suksdorfii data. According to the IAM (Fig. 6.5B), a distance barrier is suggested in C. suksdorfii and may account for the high genetic differentiation shown in the ANOVA-based statistics and AMOVA results (Tables 6.4 & 6.5). Alternatively, the SMM suggests no isolation by distance in C. suksdorfii, and genetic differentiation among its populations could be attributed to other factors as discussed below.

6.4. Discussion

6.4.1. Dispersion of C. douglasii in the Pacific Northwest

Individuals of C. douglasii are found in the west and east of the Cascades, Northern Rocky Mountains, and the upper Great Lakes basin where they are acclimated to a wider range of ecological conditions than C. suksdorfii (Brunsfeld & Johnson 1990; Dickinson et al. 1996; Lo 2008). Bayesian clustering (Fig. 6.4; Table 6.6) and AMOVA (Table 6.5) indicate little genetic structuring and differentiation among the C. douglasii samples. Individuals from Washington, Idaho, and Montana are closely related and identified as one mixed allelic cluster (Figs. 6.2 & 6.4A). No significant correlation was found between genetic and geographical distances of these individuals (Fig. 6.5), which suggests that distance within the Pacific Northwest is not an effective barrier to gene flow. Flow cytometry has shown that seeds from our C. douglasii samples contain unreduced megagametophytes and are likely to be produced by apomixis (Lo 2008). In sexual flowering plants gene flow is mediated by both pollen and seed movements, while for apomictic plants the major means of gene flow between populations appears to be by seed, as shown in several other such species (e.g., Watkinson & Powell 1993; Overath & Hamrick 1998; Vavrek 1998; Durand et al. 2000; Oddou-Muratorio et al. 2001; Persson & Gustavsson 2001; Rogstad et al. 2002; Pluess & Stocklin 2004). As birds and small mammals are common dispersal vectors of Crataegus seeds (Courtney & Manzur 1985; Guitian 1998), we infer that genotypes of C. douglasii are commingled among sites, probably through frequent seed dispersal by these agents. Such dispersal not only allows rapid colonization of suitable habitats in the Columbia Basin as well as the Rocky Mountains, which contributes to its present distribution, but also increases local genetic diversity of apomictic populations by recruiting new founders from other sites (Table 6.3).

On the other hand, dispersal is evidently interrupted between the western and eastern populations of C. douglasii. In North America, C. douglasii has a disjunct distribution. It is abundant in the Pacific Northwest and also in the upper Great Lakes, but absent in the Great Plains other than in the Cypress Hills (Marquis & Voss 1981; Dickinson et al. 1996). Although only one eastern population is examined in this study, the genetic structuring between the Ontario and western populations (Fig. 6.4A) suggests that gene flow between them has been limited, probably because of the distance involved and/or lack of suitable habitats in the Great Plains (Dickinson et al. 1997). We infer that the present distribution of C. douglasii could be a consequence of postglacial migration 10,000 years ago before the re-establishment of the mid-continental grasslands (McAndrews et al. 1991) and/or subsequent aridification of the Great Plains (Marquis & Voss 1981).

6.4.2. Source of genetic variation in apomictic populations

Reproductive regime is one of the factors that directly determine the amount of genetic variation of a population. Our diploid populations reveal the highest level of genetic diversity among all studied populations (PD = 1, D = 1, and E = 1; Table 6.3), which is expected given that diploid individuals reproduce sexually and are self-incompatible (Brunsfeld & Johnson 1990; Dickinson et al. 1996; Chapter 4). However, in two of these diploid C. suksdorfii sites (OR1 and OR11), C. suksdorfii occurs with diploid C. monogyna and hybrids between the two species are observed. Therefore, we cannot rule out the possibility that introgressive hybridization may have contributed to the high genetic diversity in these diploid samples, even though samples included here were carefully diagnosed as C. suksdorfii by morphology.

Triploid and tetraploid individuals of C. suksdorfii and C. douglasii have been shown to reproduce through gametophytic apomixis (Chapter 4). The lowest PD, D, and E values, observed in the two triploid populations (Table 6.3), indicate that individuals at each of these sites share only a small number of multilocus genotypes. This is to be expected, as a consequence of one or more of the following: establishment of the population by seed from a limited number of individuals (Dickinson et al. 1996), autopolyploid origin (Chapter 5), and a lack of recruitment of new genotypes due to apomixis or failure to set seeds for reasons that are unclear at present (Dickinson et al. 1996; Chapter 4). However, the diversity indices of some tetraploid populations (e.g., MT2, ID6, ID20, and WA21; Table 6.3) are only slightly lower than those of diploids. These estimates exceed the range documented in other asexual plants (Ellstrand & Roose 1987) and are considered to be high for apomictic and/or self-fertile individuals. Similar levels of genetic variation have also been reported in polyploid apomicts of the Asteraceae (Menken et al. 1995; Chapman et al. 2000; Van Der Hulst et al. 2000), Poaceae (Carino & Daehler 1999; Esselman et al. 1999), Ranunculaceae (Paun et al. 2006), and Rosaceae (Richards 1996; Campbell et al. 1999), as well as other groups (see also Persson & Gustavsson 2001; Kjolner et al. 2004). Here, we suggest four explanations for the high genetic variability in our tetraploid apomictic plants.

First, tetraploid apomicts of C. suksdorfii and C. douglasii have been shown to produce highly stainable pollen (Dickinson et al. 1996). Apart from endosperm fertilization (i.e., pseudogamy) that is required for seed to be set, sperm nuclei can also fertilize an unreduced egg cell in the embryo sac, given that endosperm balance requirements are often relaxed in apomictic plants (Talent & Dickinson 2007), thus producing offspring of new genotypes. Such occasional sexual reproduction is not unusual in asexual polyploids (Campbell et al. 1999; Van Der Hulst et al. 2000; D’Souza et al. 2004). Second, the formation of novel genotypes through allopolyploidy and introgression (e.g., in 3x C. suksdorfii individuals of site ID06 and 4x C. suksdorfii individuals of site MT02; Chapter 5) could enrich the initial gene pool (Leitch & Bennet 1997; Soltis et al. 2004; Comai 2005; Abbott et al. 2007). Third, mutations can accumulate and increase genetic variation in a population. Theoretical predictions suggest that for populations under stabilising selection, 0.1% of the genetic variance is attributed to mutation in each generation (Maynard Smith 1998). Such an effect cannot be ignored, particularly with microsatellites, because these regions often exhibit high mutation rates (Rienzo et al. 1994; Ellegren 2000; Schlotterer 2000; Estoup et al. 2002). In addition, tandem repeats on different chromosome linkage groups may be subjected to different selection regimes and thus may vary in mutation rate (Gustafson & Yano 2000; Liebhard et al. 2002). This may affect the estimation of multilocus genetic variation when using unlinked loci, as in the present study (Table 6.3). Because not all loci of a single linkage group can be amplified in our samples and/or are sufficiently variable for the purposes of this study, we cannot avoid the impact of mutations on our data. Last, gene flow via seed dispersal, as mentioned above for C. douglasii (Figs. 6.4 and 6.5), could mix gene pools between populations and increase local diversity. Hence, occasional sexual reproduction, allopolyploid origin, mutations, and dispersal are potential factors that increase genetic variation of our tetraploid apomictic individuals.

6.4.3. Gene flow limited by ploidy level differences

Crataegus suksdorfii has been shown to include 2x, 3x, and 4x individuals that are geographically segregated in the Pacific Northwest (Chapter 4). Diploids are found in more temperate, mesic areas at lower elevations near the Pacific coast, while polyploids occur in less temperate, more xeric settings at higher elevations and often in sympatry with C. douglasii, on the eastern slopes of the Cascades, on the Columbia Plateau, and in the Rocky Mountains (Dickinson et al. 1996, 1997). According to our data, up to 54% of the genetic variation was detected between cytotypes (Table 6.5), and individuals of different ploidy levels as well as populations are recognized as distinct genetic clusters (Table 6.6; Fig. 6.4B). Such high differentiation and strong genetic structuring across ploidy levels, as also shown between 3x C. suksdorfii and 4x C. douglasii within the sympatric site (ID06; Table 6.6), suggest that pre- or post-zygotic reproductive factors between cytotypes are effective barriers to gene flow in the standing populations of C. suksdorfii. The underlying reason for this is unclear. Although pollination between cytotypes can produce successful seed sets in some other Crataegus species (Talent & Dickinson 2007), reproductive isolation has been suggested as one of the instantaneous consequences subsequent to genome multiplications, potentially related to fitness and/or other epigenetic impacts on the offspring (Haig & Westoby 1991; Taylor et al. 2001; Husband et al. 2002; D’Souza et al. 2004; Scott & Spielman 2006; Kinoshita 2007). Moreover, breeding system differences between diploid and polyploid C. suksdorfii (Chapter 4) may reduce the extent of pollen-mediated gene flow between populations, even though individuals exhibit similar floral features and phenological regimes (Evans & Dickinson 1996).

A significant correlation was found between genetic and geographical distances under the IAM criterion for SSR data (Fig. 6.5B), which suggests that gene flow in C. suksdorfii could also be limited by distance throughout its distribution. However, this correlation should be interpreted with caution for the following reasons. First, the populations separated by the greatest geographical distances also differ in ploidy level in our samples (Fig. 6.5B). Diploid and tetraploid C. suksdorfii are not found in proximity in the Pacific Northwest, probably because of differences in habitat and/or competition during local colonization (Dickinson et al. 1996; Chapter 4). With such spatial segregation between cytotypes, it is difficult to distinguish whether gene flow is limited by ploidy level differences or by the physical distance of dispersal. Second, C. suksdorfii and C. douglasii are expected to share similar pollinating and dispersal agents (Evans & Dickinson 1996). Given that gene flow is not limited by distance in C. douglasii across the Pacific Northwest (Fig. 6.5A), it is difficult to see how a distance constraint would operate in C. suksdorfii, which has a narrower distribution range than C. douglasii (Fig. 6.1). Third, isolation by distance is indicated only under the IAM and not under the SMM criterion. Why the two models give different results for microsatellite data is still unclear. Some attribute this to differences in the calculation of expected heterozygosities at mutation equilibrium (Slatkin 1995; Luikart & Cornuet 1998), while others point to potential errors during electrophoresis that might lead to size homoplasy, i.e., alleles identical in size but different in state or vice versa (Estoup et al. 2002; Adams et al. 2004; Pompanon et al. 2005). Unfortunately, no existing program is available to test mutation models of microsatellite data, particularly in polyploids. Therefore, with these considerations in mind, we suggest that the correlation between genetic and geographical distances under IAM is less likely to be caused by a physical barrier that limits dispersal than by ploidy level differences that lead to genetic differentiation among distant populations. The results obtained under SMM would simply support the prediction that gene flow is limited by a reproductive barrier between ploidy levels.
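The isolation-by-distance argument above rests on a Mantel correlation between a genetic and a geographical distance matrix. The following is a minimal Python sketch of such a test under stated assumptions. It is not the procedure implemented in SPAGEDI or IBD, and the site positions, distance values, and function names (pearson, upper_triangle, mantel) are hypothetical and chosen only for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix, order=None):
    """Pairwise values above the diagonal, optionally after relabelling sites."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test: permute site labels of the genetic matrix."""
    rng = random.Random(seed)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    n = len(genetic)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)

# Hypothetical data: six sites along a line (arbitrary distance units),
# with genetic distance exactly proportional to geographical distance.
positions = [0, 1, 3, 6, 10, 15]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * d for d in row] for row in geographic]

r_obs, p_value = mantel(genetic, geographic, permutations=999, seed=1)
print(f"r = {r_obs:.3f}, P = {p_value:.3f}")
```

A test of this kind cannot by itself separate distance from ploidy when the most distant populations also differ in ploidy level, which is the reason for the caution expressed above.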

6.4.4. Conclusions

Polyploidy and apomixis are two tightly linked processes in Crataegus. Although the genetic basis of apomixis is unclear in this group, the high level of genetic variation detected in C. douglasii populations provides evidence of the success and continued evolution of apomicts not only in herbaceous but also in woody plants. Crataegus douglasii and C. suksdorfii of western North America display different population genetic structures, which suggests that different evolutionary dynamics are in operation. On the one hand, our data suggest that frequent dispersal, occasional sexuality, and mutations increase genetic diversity within populations of C. douglasii, in which individuals are tetraploid and reproduce predominantly by apomixis. These processes, which have also been suggested in other agamospermous species (see review in Whitton et al. 2008), contribute to successful colonization and establishment in C. douglasii. On the other hand, the high differentiation and strong genetic structure in C. suksdorfii suggest that little gene flow is occurring between populations, probably because of ploidy level differences. Such a reproductive barrier between ploidy levels is also evident in populations where cytotypes co-occur.

We predict that C. suksdorfii acquires the ability to reproduce apomictically through polyploidy, either in crosses between conspecific individuals or by hybridization (Lo et al. submitted), and disperses eastward to establish in environments that are less favourable for diploids (Dickinson et al. 1996). However, the preponderance of gametophytic apomixis among polyploid derivatives (Chapter 4), as well as pre- or post-zygotic reproductive barriers between cytotypes, may lessen the frequency of gene mixing among populations. It is conceivable that cumulative differentiation among cytotypes may eventually lead to morphological diversification and allopatric speciation in C. suksdorfii. The outcome of this study sheds light on the evolution of woody plants that show heterogeneous ploidy levels and reproductive systems.

6.5. Acknowledgements

The authors thank Tara Paton and Simone Russell for the ABI facilities; Rhoda Love, Steve Brunsfeld, Peter Zika, Sophie Nguyen, and Nadia Talent for plant collection and identification; and Annabel Por and Jenny Bull for organizing the vouchers for this study. Financial support from the Natural Sciences and Engineering Research Council of Canada (grant A3430 to TAD and 326439-06 to SS), the Botany Department of the University of Toronto, and the Royal Ontario Museum is gratefully acknowledged.

6.6. References

Abbott RJ, Ireland HE, Rogers HJ 2007. Population decline despite high genetic diversity in the

new allopolyploid species Senecio cambrensis (Asteraceae). Mol. Ecol. 16: 1023-1033.

Adams RI, Brown KM, Hamilton MB 2004. The impact of microsatellite electromorph size homoplasy on multilocus population structure estimates in a tropical tree (Corythophora alta) and an anadromous fish (Morone saxatilis). Mol. Ecol. 13: 2579-2588.

Asker S, Jerling L 1992. Apomixis in plants. CRC Press, Boca Raton.

Bayer RJ 1990. Patterns of clonal diversity in the Antennaria rosea (Asteraceae) polyploid

agamic complex. Am. J. Bot. 77: 1313-1319.

Bohonak AJ 2002. IBD (Isolation By Distance): A program for analyses of isolation by distance.

J. Hered. 93: 153-154.

Brunsfeld SJ, Johnson FD 1990. Cytological, morphological, ecological and phenological

support for specific status of Crataegus suksdorfii (Sarg.) Kruschke. Madroño 37: 274-282.

Calzada JPV, Crane CF, Stelly DM 1996. Apomixis: the asexual revolution. Science 274:

1322-1323.

Campbell CS, Alice LA, Wright WA 1999. Comparisons of within-population genetic variation

in sexual and agamospermous Amelanchier (Rosaceae) using RAPD markers. Pl. Syst. Evol. 215: 157-167.

Carino DA, Daehler CC 1999. Genetic variation in an apomictic grass, Heteropogon contortus,

in the Hawaiian Islands. Mol. Ecol. 8: 2127-2132.

Chapman HM, Parh D, Oraguzie N 2000. Genetic structure and colonizing success of a clonal,

weedy species, Pilosella officinarum (Asteraceae). Heredity 84: 401-409.

Comai L 2005. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 6:

836-846.

Courtney SP, Manzur MI 1985. Fruiting and fitness in Crataegus monogyna: the effect of

frugivores and seed predators. Oikos 44: 398-406.

D’Souza TG, Storhas M, Schulenburg H, Beukeboom LW, Michiels NK 2004. Occasional sex

in an 'asexual' polyploid hermaphrodite. Proc. R. Soc. Biol. Sci. 271: 1001-1007.

Daurelio LD, Espinoza F, Quarin CL, Pessino SC 2004. Genetic diversity in sexual diploid and apomictic tetraploid populations of Paspalum notatum situated in sympatry or allopatry. Pl. Syst. Evol. 244: 189-199.

Dickinson TA, Campbell CS 1991. Population structure and reproductive ecology in the

Maloideae (Rosaceae). Syst. Bot. 16: 350-362.

Dickinson TA, Belaoussoff S, Love R 1996. North American black-fruited hawthorns: I.

Variation in floral construction, breeding system correlates, and their possible evolutionary

significance in Crataegus sect. Douglasii Loudon. Folia Geobotanica 31: 355-371.

_____, Phipps JB 1986. Studies in Crataegus (Rosaceae: Maloideae) XIV. The breeding system

of Crataegus crus galli sensu lato in Ontario (Canada). Am. J. Bot. 73: 116-130.

_____, Lo EYY, Talent N 2007. Polyploidy, reproductive biology, and Rosaceae: understanding

evolution and making classification. Pl. Syst. Evol. 266: 59-78.

_____, _____, _____, Love R In press. Black-fruited Hawthorns of North America – an Agamic

Complex? Can. J. Bot.

Durand J, Garnier L, Dajoz I, Mousset S, Veuille M 2000. Gene flow in a facultative apomictic

Poacea, the savanna grass Hyparrhenia diplandra. Genetics 156: 823-831.

Ellegren H 2000. Microsatellite mutations in the germline. Trends Genet. 16: 551-558.

Ellstrand NC, Roose ML 1987. Patterns of genotypic diversity in clonal plant species. Am. J.

Bot. 74: 123-131.

Estoup A, Jarne P, Cornuet JM 2002. Homoplasy and mutation model at microsatellite loci and

their consequences for population genetics analysis. Mol. Ecol. 11: 1591-1604.

Esselman EJ, Jianqiang L, Crawford DJ, Windus JL, Wolfe AD 1999. Clonal diversity in the

rare Calamagrostis porteri ssp. insperata (Poaceae): comparative results for allozymes and

random amplified polymorphic DNA (RAPD) and intersimple sequence repeat (ISSR)

markers. Mol. Ecol. 8: 443-451.

Evans RC, Dickinson TA 1996. North American black-fruited hawthorns (Crataegus section Douglasii Loud.): Floral development of 10- and 20-stamen morphotypes. Am. J. Bot. 83: 961-978.

Evanno G, Regnaut S, Goudet J 2005. Detecting the number of clusters of individuals using the

software STRUCTURE: a simulation study. Mol. Ecol. 14: 2611-2620.

Excoffier L, Smouse PE, Quattro J 1992. Analysis of molecular variance inferred from metric

distances among DNA haplotypes: Application to human mitochondrial DNA restriction

data. Genetics 131: 479-491.

Felsenstein J 2006. PHYLIP (Phylogeny Inference Package), version 3.66c. Zoology

Department, University of Washington, Seattle.

Fineschi S, Salvini D, Turchini D, Pastorelli R, Vendramin GG 2005. Crataegus monogyna Jacq. and C. laevigata (Poir.) DC. (Rosaceae, Maloideae) display low level of genetic diversity assessed by chloroplast markers. Pl. Syst. Evol. 250: 187-196.

Garnier LKM, Durand J, Dajoz I 2002. Limited seed dispersal and microspatial population

structure of an agamospermous grass of West African savannahs, Hyparrhenia diplandra

(Poaceae). Am. J. Bot. 89: 1785-1791.

Gianfranceschi L, Seglias N, Tarchini R, Komjanc M, Gessler C 1998. Simple sequence repeats

for the genetic analysis of apple. Theoret. Appl. Genet. 96: 1069-1076.

Goldstein DB, Linares AR, Cavalli-Sforza LL, Feldman MW 1995. An evaluation of genetic

distances for use with microsatellite loci. Genetics 139: 463-471.

Gornall RJ 1999. Population genetic structure in agamospermous plants. In: Molecular Systematics and Plant Evolution (Eds. Hollingsworth PM, Bateman RM, Gornall RJ). Taylor and Francis, London. Pp. 118-138.

Guitián P 1998. Latitudinal variation in the fruiting phenology of a bird-dispersed plant (Crataegus monogyna) in Western Europe. Pl. Ecol. 137: 139-142.

Gustafson JP, Yano M 2000. Genetic mapping of hypervariable minisatellite sequences in rice (Oryza sativa L.). Theoret. Appl. Genet. 100: 447-453.

Hagen AR, Sæther T, Borgen L, Elven R, Stabbetorp OE, Brochmann C 2002. The arctic-alpine

polyploids Cerastium alpinum and C. nigrescens (Caryophyllaceae) in a sympatric situation:

breakdown of species integrity? Pl. Syst. Evol. 230: 203-219.

Haig D, Westoby M 1991. Genomic imprinting in endosperm: its effect on seed development

in crosses between species, and between different ploidies of the same species, and its

implications for the evolution of apomixis. Philos. Trans. R. Soc. London 333: 1–14.

Hamrick JL, Godt MJ 1996. Effects of life history traits on genetic diversity in plant species.

Philos. Trans. R. Soc. London 351: 1291-1298.

_____, _____, Sherman-Broyles SL 1992. Factors influencing levels of genetic diversity in

woody plant species. New Forest 6: 95-124.

Hardy OJ, Vekemans X 2002. SPAGEDI: a versatile computer program to analyse spatial

genetic structure at the individual or population levels. Mol. Ecol. Notes 2: 618-620.

Hokanson SC, Szewc-McFadden AK, Lamboy WF, McFerson JR 1998. Microsatellite (SSR)

markers reveal genetic identities, genetic diversity and relationships in a Malus ×domestica Borkh. core subset collection. Theoret. Appl. Genet. 97: 671-683.

Husband B, Schemske DW, Burton TL, Goodwillie C 2002. Pollen competition as a unilateral reproductive barrier between sympatric Chamerion angustifolium. Proc. R. Soc. London 269: 2565-2571.

Kinoshita T 2007. Reproductive barrier and genomic imprinting in the endosperm of flowering

plants. Genes Genet. Syst. 82: 177-186.

Kirby GC 1975. Heterozygote frequencies in small subpopulations. Theoret. Popul. Biol. 8:

31-48.

Kjolner S, Sastad SM, Taberlet P, Brochmann C 2004. Amplified fragment length polymorphism versus random amplified polymorphic DNA markers: clonal diversity in Saxifraga cernua. Mol. Ecol. 13: 81-86.

Koltunow AM, Grossniklaus U 2003. Apomixis: A developmental perspective. Ann. Rev. Pl.

Biol. 54: 547-574.

Legendre L, Legendre P 1983. Numerical Ecology. Developments in Environmental Modelling.

S. E. Jorgensen. Amsterdam, Elsevier Scientific Publishing Company. Pp.419.

Leitch IJ, Bennett MD 1997. Polyploidy in angiosperms. Trend Pl. Sci. 2: 470-476.

Levin DA 1981. Gene flow in plants revisited. Ann. Missouri Bot. Gard. 68: 233-253.

Liebhard R, Gianfranceschi L, Koller B, Ryder CD, Tarchini R, Van De Weg E, Gessler C 2002.

Development and characterisation of 140 new microsatellites in apple (Malus ×domestica

Borkh). Mol. Breeding 10: 217-241.

Longley AE 1924. Cytological studies in the genus Crataegus. Am. J. Bot. 11: 295-317.

Luikart G, Allendorf FW, Cornuet JM 1998. Distortion of allele frequency distributions provides

a test for recent population bottlenecks. J. Hered. 89: 238-247.

Mallet J 2005. Hybridization as an invasion of the genome. Trend Ecol. Evol. 20: 229-237.

Maynard Smith J 1998. Evolutionary genetics. Oxford University Press, Oxford.

McAndrews JH, Lin KB, Manville GC, Prest VK, Vincent JS 1991. Historical atlas of Canada.

In Harris RC, Matthews GJ.

McLellan AJ, Prati D, Kaltz O, Schmid B 1997. Structure and analysis of phenotypic and

genetic variation in clonal plants. In H. de Kroon and J. M. Van Groenendael [eds.], The

ecology and evolution of clonal plants. Backhuys, Leiden, Netherlands. Pp. 185–210.

Menken SBJ, Smit E, Hans Den Nijs JCM 1995. Genetical population structure in plants: gene

flow between diploid sexual and triploid asexual dandelions (Taraxacum section Ruderalia).

Evolution 49: 1108-1118.

Meirmans PG, Vlot EC, Den Nijs JCM, Menken SBJ 2003. Spatial ecological and genetic structure of a mixed population of sexual diploid and apomictic triploid dandelions. J. Evol. Biol. 16: 343-352.

_____, Van Tienderen PH 2004. GENOTYPE and GENODIVE: two programs for the analysis

of genetic diversity of asexual organisms. Mol. Ecol. Notes 4: 792-794.

Meyers LA, Levin DA 2006. On the abundance of polyploids in flowering plants. Evolution 60: 1198-1206.

Mohanty A, Martin JP, Aguinagalde I 2002. Population genetic analysis of European Prunus

spinosa (Rosaceae) using chloroplast DNA markers. Am. J. Bot. 89: 1223-1228.

Muniyamma M, Phipps JB 1979. Studies in Crataegus (Rosaceae: Maloideae). I. Cytological

proof of apomixis in Crataegus (Rosaceae). Am. J. Bot. 66: 149-155.

_____, _____ 1984. Studies in Crataegus: XI. Further cytological evidence for the occurrence of

apomixis in North American hawthorns. Can. J. Bot. 62: 2316-2324.

Nei M 1978. Estimation of average heterozygosity and genetic distance from a small number of

individuals. Genetics 89: 583-590.

_____ 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.

Nogler GA 1984. Gametophytic apomixis. Embryology of angiosperms. J. B. M. Berlin,

Springer-Verlag. Pp. 475-518.

Noyes RD, Soltis DE 1996. Genotypic variation in agamospermous

(Asteraceae). Am. J. Bot. 83: 1292-1303.

Ouborg NJ, Piquot Y, van Groenendael JM 1999. Population genetics, molecular markers and

the study of dispersal in plants. J. Ecol. 87: 551-568.

Oddou-Muratorio S, Petit RJ, Le Guerroue B, Guesnet D, Demesure B 2001. Pollen- versus

seed-mediated gene flow in a scattered forest tree species, Sorbus torminalis L. (Crantz).

Evolution 55: 1123-1135.

Overath RD, Hamrick JL 1998. Allozyme diversity in Amelanchier arborea and A. laevis. Rhodora 100: 272-292.

Page RDM 1996. TREEVIEW: An application to display phylogenetic trees on personal computers. Comp. Appl. Biosci. 12: 357-358.

Paun O, Greilhuber J, Temsch EM, Hörandl E 2006. Patterns, sources and ecological

implications of clonal diversity in apomictic Ranunculus carpaticola (Ranunculus

auricomus complex, Ranunculaceae). Mol. Ecol. 15: 897-910.

Peakall R, Smouse PE 2006. GENALEX 6: genetic analysis in Excel. Population genetic

software for teaching and research. Mol. Ecol. Notes 6: 288-295.

Persson HA, Gustavsson BA 2001. The extent of clonality and genetic diversity in lingonberry

(Vaccinium vitis-idaea L.) revealed by RAPDs and leaf-shape analysis. Mol. Ecol. 10:

1385-1397.

Phipps JB 1983. Biogeographic, taxonomic, and cladistic relationships between East Asiatic and

North American Crataegus. Ann. Missouri Bot. Gard. 70: 667-700.

_____, Muniyamma M 1980. A taxonomic revision of Crataegus (Rosaceae) in Ontario. Can. J.

Bot. 58: 1621-1699.

_____, O'Kennon RJ 1998. Three new species of Crataegus (Rosaceae) from western North

America: C. okennonii, C. okanaganensis, and C. phippsii. Sida 18: 169-191.

Pluess AR, Stocklin J 2004. Population genetic diversity of the clonal plant Geum reptans

(Rosaceae) in the Swiss Alps. Am. J. Bot. 91: 2013-2021.

Pompanon F, Bonin A, Bellemain E, Taberlet P 2005. Genotyping errors: causes, consequences and solutions. Nat. Rev. Genet. 6: 847-859.

Pritchard JK, Stephens M, Donnelly P 2000. Inference of population structure using multilocus

genotype data. Genetics 155: 945-959.

Richards AJ 1996. Genetic variability in obligate apomicts of the genus Taraxacum. Folia

Geobotanica 31: 405-414.

Rienzo AD, Peterson AC, Garza JC, Valdes AM, Slatkin M, Freimer NB 1994. Mutational processes of simple-sequence repeat loci in human populations. Proc. Nat. Acad. Sci. USA 91: 3166-3170.

Robertson A, Newton AC, Ennos RA 2004. Multiple hybrid origins, genetic diversity and

population genetic structure of two endemic Sorbus taxa on the Isle of Arran, Scotland. Mol.

Ecol. 13: 123-134.

Rogstad SH, Keane B, Beresh J 2002. Genetic variation across VNTR loci in central North

American Taraxacum surveyed at different spatial scales. Pl. Ecol. 161: 111-121.

Rosenberg NA 2004. DISTRUCT: a program for the graphical display of population structure.

Mol. Ecol. Notes 4: 137-138.

Sargent CS 1907. The black-fruited Crataegus of western North America. Botanical Gazette 44:

64-66.

Schlötterer C 2000. Evolutionary dynamics of microsatellite DNA. Chromosoma 109: 365-371.

Scott RJ, Spielman M 2006. Genomic imprinting in plants and mammals: how life history

constrains convergence. Cytogenet. Gen. Res. 113: 53–67.

Slatkin M 1995. A measure of population subdivision based on microsatellite allele frequencies.

Genetics 139: 457-462.

Smith PG, Phipps JB 1988. Studies in Crataegus (Rosaceae, Maloideae), XIX. Breeding

behavior in Ontario Crataegus series Rotundifoliae. Can. J. Bot. 66: 1914-1923.

Soltis PS, Soltis DE 1993. Molecular data and the dynamic nature of polyploidy. Crit. Rev. Pl.

Sci. 12: 243-273.

Soltis DE, Soltis PS, Tate J 2004. Advances in the study of polyploidy since plant speciation.

New Phytol. 161: 173-191.

Starfinger U, Stocklin J 1996. Seed, pollen, and clonal dispersal and their role in structuring

plant populations. Prog. Bot. 57: 337-355.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae): evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83: 1268-1304.

_____, _____ 2007. Ploidy level increase and decrease in seeds from crosses between sexual

diploids and asexual triploids and tetraploids in Crataegus L. (Rosaceae, Spiraeoideae,

Pyreae). Can. J. Bot. 85: 570-584.

Taylor JS, Van de Peer Y, Meyer A 2001. Genome duplication, divergent resolution and

speciation. Trend Genet. 17: 299-301.

Tsumura Y, Yoshimura K, Tomaru N, Ohba K 1995. Molecular phylogeny of conifers using

RFLP analysis of PCR-amplified specific chloroplast genes. Theoret. Appl. Genet. 91:

1222-1236.

Van der Hulst RGM, Mes THM, Den Nijs JCM, Bachmann K 2000. Amplified fragment length

polymorphism (AFLP) markers reveal that population structure of triploid dandelions

(Taraxacum officinale) exhibits both clonality and recombination. Mol. Ecol. 9: 1-8.

Vavrek M 1998. Within-population genetic diversity of Taraxacum officinale (Asteraceae):

differential genotype response and effect on interspecific competition. Am. J. Bot. 85:

947-954.

Ward DB 1974. The “Ignorant Man” technique of sampling plant populations. Taxon 23:

325-330.

Watkinson AR, Powell JC 1993. Seedling recruitment and the maintenance of clonal diversity in

plant populations--a computer simulation of Ranunculus repens. J. Ecol. 81: 707-717.

Widen B, Cronberg N, Widen M 1994. Genotypic diversity, molecular markers, and spatial

distribution of genets in clonal plants, a literature survey. Folia Geobotanica 29: 245-263.

Wright S 1965. The interpretation of population structure by F-statistics with special regard to

systems of mating. Evolution 19: 395-420.

Table 6.1 Locality, ploidy level, habitat, and population size (N) of C. douglasii sensu lato and C. suksdorfii included in the present study. Number of unvouchered samples is indicated in parentheses. Reproductive system (RS) is based on flow cytometry determinations of seeds as reported in Chapter 4. “A” denotes apomixis, “S” denotes sexual and “U” denotes not determined. Superscripts 1 and 2 (shown as ^1 and ^2) denote sites that contain morphological segregates of C. douglasii described as C. okennonii and C. castlegarensis according to Phipps and O’Kennon (1998).

Species | Ploidy level | N | Label | Latitude | Longitude | Elevation (m) | State/Province; County; Locality | Habitat | RS
C. douglasii sensu lato | 4x | 29 (14) | ON20 | 44.75 | 80.95 | 225 | Ontario; Grey; Big Bay | Abandoned farmland; on the slope | A
C. douglasii sensu lato | 4x | 11 (8) | ID20 | 46.52 | 116.73 | 280 | Idaho; Nez Perce; Little Potlatch Creek | Roadside | A
C. douglasii sensu lato | 4x | 12 (3) | ID6 | 44.99 | 116.19 | 1420 | Idaho; Adams; Last Chance Campground | Riverside and interior dry land | A
C. douglasii sensu lato | 4x | 14 (2) | ID2 | 46.77 | 116.45 | 811 | Idaho; Latah; Little Boulder Creek | Along creek | A
C. douglasii sensu lato | 4x | 10 (4) | ID15 | 44.97 | 113.94 | 1292 | Idaho; Lemhi | Roadside | A
C. douglasii sensu lato | 4x | 15 (1) | MT2 | 47.07 | 112.91 | 1356 | Montana; Powell; Kleinschmidt Flat | Along creek and on the slope | A
C. douglasii sensu lato | 4x | 5 (3) | WA22 | 46.85 | 117.34 | 666 | Washington; Whitman; South of Colfax^1 | On the slope on the roadside | A
C. douglasii sensu lato | 4x | 25 (9) | WA21 | 46.84 | 122.98 | 64 | Washington; Thurston; Mound Prairie^2 | Prairie | A
C. douglasii sensu lato | 4x | 3 | BC1 | 50.51 | 119.10 | - | British Columbia; Spallumcheen | Mountain prairie | U
C. douglasii sensu lato | 4x | 2 | OR | 44.58 | 119.64 | 655 | Oregon; Wheeler; Fossil | Roadside | U
C. suksdorfii | 2x | 8 (1) | CA5 | 41.4 | 122.84 | 871 | California; Siskiyou; Fay Lane | Roadside | S
C. suksdorfii | 2x | 13 (3) | OR1 | 44.33 | 123.12 | 88 | Oregon; Linn; Cogswell Foster Reserve | Grassland | S
C. suksdorfii | 2x | 19 (8) | OR11 | 45.73 | 122.77 | 10 | Oregon; Columbia; Sauvie Island | Sandy beach along coast | S
C. suksdorfii | 2x | 12 (7) | WA7 | 45.83 | 122.76 | 15 | Washington; Clark | | S
C. suksdorfii | 3x | 20 | OR6 | 43.77 | 122.62 | 1295 | Oregon; Lane; Patterson Mountain Prairie | Meadow surrounded by firs | U
C. suksdorfii | 3x | 21 (5) | ID6 | 44.99 | 116.19 | 1420 | Idaho; Adams; Last Chance Campground | Along river and interior dry land | A
C. suksdorfii | 3x | 2 | ID5 | 45.00 | 116.06 | 1524 | Idaho; Valley; North Beach, Payette Lake | Sandy area along beach | A
C. suksdorfii | 4x | 21 (6) | MT2 | 47.07 | 112.91 | 1356 | Montana; Powell; Kleinschmidt Flat | Along creek | A


Table 6.2 Nucleotide sequences of the selected microsatellite primers used in the present study. These primers are designed on the conserved SSR-flanking regions of Malus ×domestica and are transferable to Crataegus and other Maloideae species (Liebhard et al. 2002). Information on each locus is given. D: Number of alleles observed in C. douglasii; S: Number of alleles observed in C. suksdorfii.

Loci | Forward primer sequence (5' - 3') | Reverse primer sequence (5' - 3') | Total alleles | D | S | Size (bp) | Maps on LG no.
CH01D03 | CCG CTT GGC AAT GAC TCC TC | ACC CTG AAG CCA TGA GGG C | 7 | 4 | 6 | 125-149 | 4 and 12
CH01F02 | ACC ACA TTA GAG CAG TTG AGG | CTG GTT TGT TTT CCT CCA GC | 25 | 17 | 23 | 149-201 | 12
CH01F07 | CCC TAC ACA GTT TCT CAA CCC | CGT TTT TGG AGC GTA GGA AC | 32 | 17 | 28 | 149-263 | 10
CH03A02 | TTG TGG ACG TTC TGT GTT GG | CAA GTT CAA CAG CTC AAG ATG A | 17 | 15 | 14 | 122-162 | 14
CH03C02 | TCA CTA TTT ACG GGA TCA AGC A | GTG CAG AGT CTT TGA CAA GGC | 23 | 17 | 20 | 98-148 | 12
CH03D08 | CAT CAG TCT CTT GCA CTG GAA A | TAG GGC TAG GGA GAG ATG ATG A | 16 | 10 | 11 | 120-172 | 14
CH04F06 | GGC TCA GAG TAC TTG CAG AGG | ATC CTT AAG CGC TCT CCA CA | 14 | 14 | 7 | 146-176 | 14
CH04G04 | AGT GGA TGA TGA GGA TGA GG | GCT AGT TGC ACC AAG TTC ACA | 18 | 12 | 15 | 146-186 | 12
CH05D03 | TAC CTG AAA GAG GAA GCC CT | TCA TTC CTT CTC ACA TCC ACT | 15 | 13 | 13 | 141-173 | 14
CH05D04 | ACT TGT GAG CCG TGA GAG GT | TCC GAA GGT ATG CTT CGA TT | 16 | 10 | 12 | 138-174 | 12
CH05D11 | CAC AAC CTG ATA TCC GGG AC | GAG AAG GTC GTA CAT TCC TCA A | 13 | 11 | 8 | 161-193 | 12
CH5G07 | CCC AAG CAA TAT AGT GAA TCT CAA | TTC ATC TCC TGC TGC AAA TAA C | 21 | 20 | 16 | 142-192 | 12 and 14
CH05G11 | GCA AAC CAA CCT CTG GTG AT | AAA CTG TTC CAA CGA CGC TA | 34 | 27 | 26 | 173-245 | 14


Table 6.3 Descriptive statistics of diploid, triploid, and tetraploid populations of C. suksdorfii (indicated with “s”) and C. douglasii (indicated with “d”) based on the 13 microsatellite loci. Rows labelled “Average” (underlined in the original) give the overall estimates for the respective ploidy level. N: Sample size; He: Gene diversity corrected for sample size (Nei 1987); G: Number of detected multilocus genotypes; PG: Proportion of distinguishable genotypes; D: Genotypic diversity, also known as Simpson’s diversity index; E: Genotypic evenness.

Ploidy level | Population | N | Mean allele no. | Mean He | Multilocus FIS | P-value | G | PG | D | E
Diploid | CA5 (s) | 8 | 4.33 | 0.6 | 0.17 | 0.008 | 8 | 1 | 1 | 1
Diploid | WA7 (s) | 12 | 3.92 | 0.55 | 0.12 | 0.018 | 12 | 1 | 1 | 1
Diploid | OR11 (s) | 19 | 9.08 | 0.76 | 0.24 | 0.001 | 19 | 1 | 1 | 1
Diploid | OR1 (s) | 13 | 8.15 | 0.8 | 0.3 | 0.001 | 13 | 1 | 1 | 1
Diploid | Average | 52 | 6.37 | 0.68 | 0.21 | 0.01 | 13 | 1 | 1 | 1
Triploid | OR6 (s) | 20 | 3.46 | 0.5 | -0.46 | 0.001 | 7 | 0.35 | 0.76 | 0.52
Triploid | ID6 (s) | 21 | 3.69 | 0.46 | -0.32 | 0.001 | 11 | 0.52 | 0.84 | 0.47
Triploid | Average | 41 | 3.58 | 0.48 | -0.39 | 0.001 | 9 | 0.44 | 0.80 | 0.50
Tetraploid | MT2 (s) | 21 | 5.31 | 0.57 | -0.17 | 0.001 | 16 | 0.76 | 0.96 | 0.74
Tetraploid | ON20 (d) | 24 | 4.85 | 0.56 | -0.15 | 0.001 | 10 | 0.42 | 0.71 | 0.41
Tetraploid | MT2 (d) | 21 | 10.92 | 0.64 | -0.04 | 0.074 | 11 | 0.52 | 0.93 | 0.69
Tetraploid | ID15 (d) | 10 | 5.54 | 0.61 | -0.2 | 0.001 | 5 | 0.5 | 0.66 | 0.5
Tetraploid | ID6 (d) | 12 | 6.69 | 0.73 | -0.09 | 0.001 | 11 | 0.92 | 0.98 | 0.93
Tetraploid | ID20 (d) | 11 | 8.38 | 0.75 | -0.02 | 0.204 | 9 | 0.82 | 0.94 | 0.79
Tetraploid | ID2 (d) | 14 | 6.08 | 0.7 | -0.06 | 0.01 | 10 | 0.71 | 0.81 | 0.63
Tetraploid | WA22 (d) | 5 | 4.31 | 0.71 | -0.13 | 0.001 | 3 | 0.6 | 0.7 | 0.75
Tetraploid | WA21 (d) | 25 | 8 | 0.71 | -0.01 | 0.238 | 13 | 0.52 | 0.91 | 0.62
Tetraploid | Average | 143 | 6.68 | 0.66 | -0.10 | 0.004 | 10 | 0.64 | 0.84 | 0.66
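The genotypic statistics in Table 6.3 can be illustrated with a short Python sketch. PG is the ratio G/N (for example 7/20 = 0.35 for OR6). The formulas used here for D (Simpson's index corrected for sample size) and E (effective number of genotypes divided by G) are common definitions and are an assumption; they may differ in detail from what the software used in this study computes. The genotype counts and the function name clonal_diversity are hypothetical.

```python
from collections import Counter

def clonal_diversity(genotypes):
    """Summarize genotypic diversity from a list of multilocus genotype labels."""
    counts = Counter(genotypes)
    n = len(genotypes)          # N: sample size
    g = len(counts)             # G: number of distinct multilocus genotypes
    pg = g / n                  # PG: proportion of distinguishable genotypes
    # D: Simpson's diversity corrected for sample size (assumed definition)
    d = 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    # E: effective number of genotypes divided by G (assumed definition)
    effective = 1.0 / sum((c / n) ** 2 for c in counts.values())
    e = effective / g
    return {"N": n, "G": g, "PG": pg, "D": d, "E": e}

# Hypothetical sample: 20 individuals carrying 7 genotypes, one of them dominant.
sample = ["g1"] * 9 + ["g2"] * 4 + ["g3"] * 2 + ["g4"] * 2 + ["g5", "g6", "g7"]
stats = clonal_diversity(sample)
print({k: round(v, 2) for k, v in stats.items()})
```

When every individual carries a unique genotype, as in the diploid populations, the same function returns PG = D = E = 1.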


Table 6.4 ANOVA-based F- and R-statistics for SSR data calculated for all populations, and separately for C. douglasii (tetraploid) and C. suksdorfii. Diploids, triploids, and tetraploids of C. suksdorfii were analyzed both in combination and separately. Because only one site is found to contain tetraploid C. suksdorfii, triploids and tetraploids were combined in the analyses. The two-sided P-values were all < 0.001 unless specified. “N” denotes sample size.

Populations | N | FIT | FIS | FST | RIT | RIS | RST
All C. douglasii and C. suksdorfii | 241 | 0.145 | -0.038 | 0.176 | 0.195 | 0.051 | 0.152
C. douglasii – tetraploids | 127 | 0.054 | -0.029 | 0.079 | 0.096 | 0.049* | 0.051
C. suksdorfii – diploids, triploids, tetraploids | 114 | 0.245 | -0.049 | 0.280 | 0.326 | 0.077* | 0.269
C. suksdorfii – diploids | 52 | 0.294 | 0.189 | 0.129 | 0.526 | 0.375 | 0.242
C. suksdorfii – triploids | 41 | 0.215 | -0.339 | 0.414 | 0.194 | -0.144 | 0.295
C. suksdorfii – triploids and tetraploids | 62 | 0.187 | -0.261 | 0.355 | 0.202 | -0.128 | 0.293
*P = 0.02 *P = 0.049
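As a consistency check on the F-statistics in Table 6.4, Wright's hierarchical identity (1 - FIT) = (1 - FIS)(1 - FST) can be applied to the tabulated values. The Python sketch below is illustrative only; the function name fit_from and the choice of rows are ours, and agreement is expected only to within the rounding of the published estimates.

```python
def fit_from(fis, fst):
    """Overall inbreeding coefficient implied by Wright's identity."""
    return 1.0 - (1.0 - fis) * (1.0 - fst)

# (reported FIT, FIS, FST) taken from three rows of Table 6.4
rows = {
    "All C. douglasii and C. suksdorfii": (0.145, -0.038, 0.176),
    "C. suksdorfii diploids": (0.294, 0.189, 0.129),
    "C. suksdorfii triploids": (0.215, -0.339, 0.414),
}

for label, (fit_reported, fis, fst) in rows.items():
    fit_implied = fit_from(fis, fst)
    print(f"{label}: reported {fit_reported:.3f}, implied {fit_implied:.3f}")
```

The negative FIS of the triploids (an excess of heterozygotes within populations) combined with a high FST still yields a positive FIT, which is the pattern discussed for the apomictic polyploids.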


Table 6.5 Analysis of Molecular Variance (AMOVA) showing the partitioning of genetic variation among and within populations of C. douglasii and C. suksdorfii, respectively. The analyses were performed on the Jaccard coefficient matrix of the binary SSR data (absence as “0” and presence as “1” of a defined allele) with 10,000 permutations.

Source of variation | d.f. | Sum of squares | Expected mean squares | Variance components | % of total variation
Partitioning of variation in the two species
Among C. douglasii populations | 7 | 10.052 | 1.436 | 0.078 | 22%
Within C. douglasii populations | 115 | 31.527 | 0.274 | 0.274 | 78%
Among C. suksdorfii populations | 6 | 16.274 | 2.712 | 0.153 | 37%
Within C. suksdorfii populations | 107 | 27.867 | 0.260 | 0.260 | 63%
Partitioning within C. suksdorfii with respect to ploidy level
Among 2x and 3x | 5 | 12.381 | 2.476 | 0.145 | 35%
Within 2x and 3x | 87 | 23.705 | 0.272 | 0.272 | 65%
Among 2x and 4x | 4 | 7.567 | 1.892 | 0.111 | 26%
Within 2x and 4x | 68 | 21.290 | 0.313 | 0.313 | 74%
Among 3x and 4x | 2 | 9.091 | 4.546 | 0.211 | 54%
Within 3x and 4x | 59 | 10.739 | 0.182 | 0.182 | 46%
Among 2x | 3 | 3.424 | 1.141 | 0.063 | 15%
Within 2x | 48 | 16.645 | 0.347 | 0.347 | 85%
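The last column of Table 6.5 follows directly from the variance components: the among-group percentage is the among-group component divided by the sum of the among and within components. The short Python sketch below reproduces the tabulated percentages from the tabulated components; the function name percent_among is ours and the code is only a worked illustration, not the AMOVA procedure itself.

```python
def percent_among(var_among, var_within):
    """Share of total variance attributable to the among-group component."""
    return 100.0 * var_among / (var_among + var_within)

# (among component, within component) as listed in Table 6.5
components = {
    "C. douglasii populations": (0.078, 0.274),
    "C. suksdorfii populations": (0.153, 0.260),
    "2x and 3x": (0.145, 0.272),
    "2x and 4x": (0.111, 0.313),
    "3x and 4x": (0.211, 0.182),
    "2x": (0.063, 0.347),
}

for label, (among, within) in components.items():
    print(f"Among {label}: {percent_among(among, within):.0f}%")
```

The 54% quoted in Section 6.4.3 for differentiation between cytotypes is the value for the 3x and 4x comparison.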

Table 6.6 Membership coefficient at K = 9 inferred from STRUCTURE analyses when all C. douglasii and C. suksdorfii individuals are included. “N” denotes sample size. Asterisks denote sites that show probability higher than 0.80 in any one of the nine clusters.

Species | Locality ID | Ploidy level | N | K cluster 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
C. douglasii | ON20 | 4x | 29 | 0.003 | 0.001 | 0.002 | 0.001 | 0.006 | 0.012 | 0.006 | 0.967* | 0.001
C. douglasii | MT2 | 4x | 15 | 0.002 | 0.003 | 0.022 | 0.001 | 0.004 | 0.747 | 0.019 | 0.200 | 0.001
C. douglasii | WA21 | 4x | 25 | 0.004 | 0.002 | 0.380 | 0.004 | 0.011 | 0.022 | 0.567 | 0.009 | 0.003
C. douglasii | WA22 | 4x | 5 | 0.002 | 0.002 | 0.090 | 0.002 | 0.848* | 0.005 | 0.018 | 0.024 | 0.011
C. douglasii | ID20 | 4x | 11 | 0.002 | 0.002 | 0.009 | 0.003 | 0.547 | 0.116 | 0.294 | 0.017 | 0.009
C. douglasii | ID2 | 4x | 14 | 0.003 | 0.001 | 0.005 | 0.002 | 0.429 | 0.513 | 0.005 | 0.039 | 0.002
C. douglasii | ID6 | 4x | 12 | 0.005 | 0.001 | 0.002 | 0.002 | 0.414 | 0.009 | 0.542 | 0.023 | 0.002
C. douglasii | ID15 | 4x | 10 | 0.003 | 0.001 | 0.196 | 0.001 | 0.742 | 0.042 | 0.007 | 0.006 | 0.001
C. suksdorfii | MT2 | 4x | 21 | 0.003 | 0.001 | 0.024 | 0.001 | 0.058 | 0.008 | 0.023 | 0.879* | 0.002
C. suksdorfii | ID6 | 3x | 21 | 0.978* | 0.002 | 0.001 | 0.002 | 0.002 | 0.002 | 0.007 | 0.002 | 0.004
C. suksdorfii | OR6 | 3x | 20 | 0.003 | 0.004 | 0.006 | 0.978* | 0.001 | 0.002 | 0.001 | 0.001 | 0.004
C. suksdorfii | OR1 | 2x | 13 | 0.009 | 0.145 | 0.015 | 0.033 | 0.010 | 0.004 | 0.009 | 0.005 | 0.769
C. suksdorfii | OR11 | 2x | 19 | 0.042 | 0.062 | 0.008 | 0.020 | 0.005 | 0.006 | 0.007 | 0.009 | 0.841*
C. suksdorfii | WA7 | 2x | 12 | 0.009 | 0.816* | 0.004 | 0.002 | 0.001 | 0.004 | 0.003 | 0.001 | 0.161
C. suksdorfii | CA5 | 2x | 8 | 0.003 | 0.964* | 0.009 | 0.006 | 0.003 | 0.002 | 0.002 | 0.003 | 0.009

Figure 6.1 Distribution of sampling sites of C. suksdorfii and C. douglasii sensu lato included in the present study. Separate symbols mark 2x C. suksdorfii, 3x C. suksdorfii, 4x C. suksdorfii, and 4x C. douglasii. Locality identities can be found in Table 6.1.

[Figure 6.1 image: map spanning 40-50° N and 110-125° W covering BC, AB, WA, OR, ID, MT, and CA, with plotted sites BC1, WA1, WA7, WA20, WA21, WA22, ID2, ID3, ID5, ID6, ID15, ID16, ID20, MT2, OR1, OR2, OR4, OR6, OR7, OR11, CAR2, CAR3, and CAR5; legend: 4x C. douglasii sensu lato, 2x C. suksdorfii, 3x C. suksdorfii, 4x C. suksdorfii.]

Figure 6.2 Neighbor-joining tree based on Jaccard coefficients of alleles obtained from the 13 microsatellite loci, showing the genetic relationships among samples of C. douglasii and C. suksdorfii. C. rivularis and C. saligna were used for rooting. Three clusters (A-C) are identified, in which clusters A and B contain all individuals of 4x C. douglasii from the Pacific Northwest and western Ontario, as well as 3x and 4x C. suksdorfii from Idaho and Montana, respectively. Cluster C contains individuals of 2x C. suksdorfii from Oregon, Washington, and California together with 3x C. suksdorfii from Oregon. Localities (bars) and ploidy level are indicated on the right and correspond to those in Fig. 6.1.

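[Figure 6.2 image: neighbor-joining tree with clusters A, B, and C; locality bars labelled ON, MT, BC/WA, ID, WA/ID, MT/ID, ID, OR/WA/MT/ID, WA/OR/CA, OR, and CA/OR; outgroups C. rivularis and C. saligna; scale bar: 5 changes; legend: 4x C. douglasii sensu lato, 2x C. suksdorfii, 3x C. suksdorfii, 4x C. suksdorfii.]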

Figure 6.3 Unrooted dendrogram based on DS distances (Nei 1978; estimated under the IAM) computed with SPAGEDI, showing relatedness among sites of C. douglasii and C. suksdorfii. Sites of C. douglasii with fewer than five individuals (e.g., OR, BC1, and WA22) are not included in the analyses. Symbol sizes are proportional to the number of examined individuals from each site. Ploidy levels are labelled below the graph and locality identities can be found in Table 6.1.

[Figure 6.3 image: unrooted dendrogram with sites ID6, ID2/6/15/20, MT2, MT2, WA21, ON20, OR1, OR11, WA7, CA5, and OR6; legend: 4x C. douglasii sensu lato, 2x C. suksdorfii, 3x C. suksdorfii, 4x C. suksdorfii.]

Figure 6.4 Bayesian inferences of clusters (K) estimated by STRUCTURE among individuals of (A) C. douglasii and (B) C. suksdorfii. Plots of K values against ln Pr(X/K) and ΔK, calculated with the method of Evanno et al. (2005), are shown to identify the most probable K in the two taxa. Membership coefficients among individuals of the respective populations under the defined K, indicated by different shades of gray, are graphically presented with DISTRUCT. In C. douglasii (A), three clusters are identified. Cluster 1 (light gray) contains mostly the Ontario individuals (ON20); clusters 2 (moderate gray) and 3 (dark gray) contain a mixture of individuals from sites of the Pacific Northwest. In C. suksdorfii (B), five clusters are identified with respect to the five major sites (MT2, OR6, ID6, WA7, and CA5), as represented by different shades of gray. Individuals from Oregon (OR1 and OR11) contain an admixture of alleles from WA7 and CA5, as well as a minor proportion of the others.

[Figure 6.4: plots of ln Pr(X|K) and ΔK against K, and membership-coefficient bar plots by site for (A) C. douglasii, K = 3, and (B) C. suksdorfii, K = 5.]

Figure 6.5 Scatter plots of pairwise DS values (Nei 1978; IAM-based model) among populations of C. douglasii (A) and C. suksdorfii (B) versus geographical distances estimated by SPAGEDI from spatial coordinates. In the plot of C. douglasii (A), black dots denote comparisons of populations within the Pacific Northwest or within western Ontario, while gray dots denote comparisons between Pacific Northwest and western Ontario populations. In the plot of C. suksdorfii (B), 2x/2x comparisons are indicated by black dots, 2x/3x by red, 2x/4x by purple, 3x/3x by blue, and 3x/4x by green. The RMA regression line is indicated. Mantel tests did not indicate a significant relationship between geographical and genetic distances in C. douglasii (r = 0.299, P = 0.117), but did so in C. suksdorfii (r = 0.663, P = 0.002) under the IAM criterion.

[Figure 6.5: scatter plots for (A) C. douglasii populations and (B) C. suksdorfii populations. x-axis: geographic distance (km); y-axis: genetic distance based on Nei (1978).]

Supplementary figure 6A. Frequency distribution of the observed alleles of selected microsatellite loci in C. douglasii (gray) and C. suksdorfii (black). CH05G07 is the most polymorphic locus in C. douglasii, and CH01F02, CH01F07, and CH03C02 are the most polymorphic in C. suksdorfii, with over 20 alleles, whereas CH01D03 and CH03D08 appear to be the least variable and are nearly monomorphic in C. douglasii.

[Supplementary figure 6A: allele frequency histograms for loci CH01F02, CH01D03, CH01F07, CH05G07, CH05D03, and CH03C02. x-axis: allele size (bp); y-axis: allelic frequency. Legend: C. douglasii; C. suksdorfii.]

Supplementary figure 6B. Frequency distribution of the observed alleles of selected microsatellite loci in the three cytotypes of C. suksdorfii: diploids (2x; black), triploids (3x; pattern), and tetraploids (4x; white).

[Supplementary figure 6B: allele frequency histograms for loci CH01F07, CH01D03, CH01F02, CH03A02, CH05D03, and CH04G04. x-axis: allele size (bp); y-axis: allelic frequency. Legend: 2x, 3x, and 4x C. suksdorfii.]

Chapter 7

Limited pollen contribution in apomictic plants: Inference from the genetic variability in seed families of the Ontario Crataegus (hawthorns; Rosaceae) with microsatellite markers.

Abstract. Genetic variation in agamospermous plants is poorly understood for woody angiosperms. Very few attempts have been made to investigate such variation at a fine scale by using seed progenies and to compare it with that of sexuals. Many North American Crataegus are polyploid and can reproduce both sexually and apomictically, but the relationship between reproductive system and local genetic diversity has never been investigated directly in this genus. Here we examine the extent of pollen contribution by comparing genetic variation between and within seed families of two closely related species in Ontario, C. crus-galli and C. punctata, which differ in ploidy level and breeding system. Embryo DNA extracted from seeds was used to determine microsatellite genotypes for five loci in a total of 83 and 118 seeds of 18 C. crus-galli and 28 C. punctata individuals, respectively. Our analyses indicate that allelic diversity and multilocus genotypic diversity in diploid C. punctata are higher than in tetraploid C. crus-galli. Compared with C. punctata, C. crus-galli shows (1) a neighbor-joining tree with lower resolution and shorter internal branches among seed samples, and (2) a higher level of inter-individual gene identity within and between seed families. Based on the high genotypic uniformity in tetraploid C. crus-galli, we infer that apomixis has led to a replication of maternal genotypes and limited pollen contribution to offspring genotypes. Such a phenomenon may enhance reproductive success locally for this taxon. Our study also highlights analytical approaches for examining genetic variation across ploidy levels that are useful with organisms characterized by polyploidy.

Keywords: Crataegus; apomixis; genotypic diversity; microsatellite; Rousset’s distance; seed families

7.1. Introduction

Early biologists predicted that founder effects and the lack of segregation and recombination in asexual populations usually lead to deficiencies in genetic variation and gradually hamper their continued evolution (Darlington 1939; Stebbins 1950; Clausen 1954; Grant 1981; Lynch & Gabriel 1983; Kondrashov 1993). This view has been challenged by molecular data accumulated in the last decade that clearly illustrate surprising levels of genetic diversity in widespread apomicts (e.g., Diggle et al. 1998; Carino & Daehler 1999; Esselman et al. 1999; Van der Hulst et al. 2000; Hörandl et al. 2001; Paun et al. 2006). Occasional sexuality (i.e., meiosis and gamete fusion, either separately or together) is claimed to be one of the explanations for the high genetic variability observed in agamospermous plants (see review in Whitton et al. 2008). Such events depend on frequent pollen flow, not only for the endosperm fertilization that is required for asexual seed formation in pseudogamous apomicts, but also for embryo fertilization when seeds are produced sexually. However, there is little empirical evidence regarding differences in the extent to which pollen contributes to genetic variation in sexual and apomictic populations. Furthermore, very few of the data on genetic variation in relation to breeding system come from woody angiosperms. To remedy this lack, the present study examines genetic variation in seed families of sexual and apomictic populations of hawthorns.

Hawthorns (Crataegus L.; Rosaceae subtribe Pyrinae) are ruderal shrubs and small trees with unspecialized insect-pollinated flowers and fleshy, polypyrenous drupes. In North America their systematics have been confounded by the large number of species that were described by taxonomists between 1890 and 1910, before there was any appreciation of the relationships possible between patterns of morphological variation and variation in breeding system (Dickinson 1999). Subsequently many North American Crataegus species were shown to be at least partially male sterile (Standish 1916) and often triploid (Longley 1924). Documentation of asexual reproduction in other Rosaceae groups (e.g., Alchemilla, Potentilla, Rubus) led subsequent workers to infer similar reproductive behavior in Crataegus as an explanation for the taxonomic complexity of the genus (Palmer 1932). Gametophytic apomixis, i.e., formation of unreduced megagametophytes or embryo sacs that give rise to a parthenogenetic embryo (Nogler 1984), has been shown to be common in many North American Crataegus, particularly polyploids (Longley 1924; Muniyamma & Phipps 1979, 1984; Dickinson & Phipps 1986; Smith & Phipps 1988; Talent & Dickinson 2007). In Ontario, C. crus-galli L. is polyploid, self-fertile, and apomictic. Local populations (topodemes) of C. crus-galli were found to be differentiated among sites with respect to flowers, fruits, and leaves (Dickinson & Phipps 1985; Dickinson 1986). Such morphological differentiation is considerably more pronounced than in C. punctata Jacq., which is diploid, self-incompatible, and sexual (Dickinson & Phipps 1986). This contrast in morphological variability within the respective taxa suggests a narrative hypothesis of the origin of local populations from deposits of seeds from one or only a few parent individuals (Dickinson 1986, 1999; Dickinson & Campbell 1991). Within-topodeme diversity of genotypes would therefore be a function of whether the few parent individuals responsible reproduced sexually or not, and whether there is a difference in pollen contribution to genotypes of seed progeny.

Here, we use microsatellite loci as codominant genetic markers to compare seed families from individuals in two local populations (topodemes), one of C. crus-galli and one of C. punctata. Seeds of C. crus-galli and C. punctata are characterized by large embryos covered with a thin translucent layer of endosperm that usually adheres to the brown seed coat. Seed DNA obtained from the same or closely related families is expected to share similar maternal genotypes. With this in mind, we examine genetic variation between and within seed families in order to elucidate the contribution of paternal genotypes, and thus the consequences of different reproductive strategies within a population. Our study provides a test of the narrative hypothesis for the origin of local populations mentioned above and reports a fine-scale study of genetic variation in sexual and apomictic individuals of woody plants.

7.2. Materials and Methods

7.2.1. Plant materials

Seed families were collected in September 2005 from a random sample of hawthorns at each of two sites in southern Ontario, Canada (Table 7.1; Appendix 7). At one site all sampled individuals were tagged and identified unambiguously as C. crus-galli L. (glossy, unlobed leaves; fruits developed from 10-stamen flowers). At the other site, all sampled individuals were tagged and identified unambiguously as C. punctata Jacq. (pubescent, slightly lobed leaves with secondary venation markedly impressed on the adaxial surface; fruits developed from 20-stamen flowers). From each fruit only a single seed was selected for DNA extraction. Two to seven seeds that showed a promising yield of good-quality DNA were used to represent each individual tree. A total of 83 seed samples representing 18 C. crus-galli individuals and 118 seed samples representing 28 C. punctata individuals were used in this study (Appendix 7). Because the two species present at each site were limited in number and both could be identified unambiguously, voucher specimens were not collected together with the seed families. Site ON04 (Table 7.1) is already well documented by specimens deposited in the Herbarium of the Biology Department of the University of Western Ontario (UWO) and in the Green Plant Herbarium of the Royal Ontario Museum (TRT) (Dickinson & Phipps 1985), while C. punctata at site ON46 is documented by representative specimens at TRT.

7.2.2. DNA extraction and microsatellite markers

For each seed, the seed coat and the attached endosperm were carefully removed, and the remaining white embryo tissue, about 8-13 mg in fresh weight, was used for DNA extraction. The extraction protocol described in Chapter 2 was applied to seed tissues, except that all volumes were reduced by half and the final DNA pellet was eluted in 20 μl of water. Because of the limited DNA quantity for each seed sample, only five dinucleotide microsatellite loci were used in this study; these loci are located on linkage groups 12 and 14 of Malus ×domestica (Liebhard et al. 2002; Table 7.2). We followed the PCR amplification protocols and conditions described in Chapter 6. Amplification products were analysed on an ABI3700 automatic sequencer (Applied Biosystems) in the Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Canada. The resulting peaks were scored using the program GENEMAPPER version 3.5 (Applied Biosystems).

7.2.3. Statistical analyses

To compare genetic variation among loci, the number of alleles, allelic frequency, mean allelic size, average expected heterozygosity or gene diversity (He), and the fixation coefficient FIS corrected for small sample size (Kirby 1975) for seed families of C. crus-galli and C. punctata were calculated with SPAGeDI version 1.2 (Hardy & Vekemans 2002).

Multilocus genotypes of seed individuals in C. crus-galli and C. punctata were estimated using the software GENODIVE v2.0b4 (Meirmans & Van Tienderen 2004). Distances based on a two-phase mutation model for microsatellites and scaled by ploidy level (Bruvo et al. 2004) were calculated to determine the frequency distribution of distance values in the two taxa. Individual genotypes were then assigned to clonal lineages by selecting the minimal distance class as the threshold distance, so that any pair of individuals separated by less than the threshold distance was assigned as clonemates. Once clones were assigned, the following three parameters were estimated. First, the total number of multilocus genotypes (G) was determined. Second, the proportion of distinguishable genotypes (PG) was calculated as the number of genotypes divided by sample size (Ellstrand & Roose 1987). Third, Nei’s (1987) genetic diversity corrected for sample size (also known as Simpson’s diversity index, D) was calculated. This index ranges from zero, where all individuals in a population share a single genotype, to one, where every individual has a different genotype. We also estimated genotypic evenness (E) for the two topodeme samples based on multilocus genotypes of one randomly selected seed from each family (i.e., representing an individual tree) to infer the distribution of genotypes within the populations.
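The clone assignment and the first three indices described above can be illustrated with a minimal Python sketch. It is not the GENODIVE implementation: the function names, the toy distance matrix, and the threshold of 0.1 are hypothetical, and grouping clonemates transitively (single linkage) is an assumption made here for simplicity. D is computed as 1 minus the sum of n_i(n_i - 1) over N(N - 1), one common form of the sample-size-corrected Simpson index; evenness (E) is left out because the text does not give its formula.

```python
from collections import Counter


def assign_clones(dist, threshold):
    """Return one clone label per individual.

    Any pair whose distance is below `threshold` is placed in the same
    clone; groups are merged transitively (an assumption of this sketch).
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(n)]


def genotypic_diversity(genotypes):
    """Return (G, PG, D) for a list of multilocus genotype labels."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(genotypes)
    g = len(counts)                      # number of distinct genotypes
    pg = g / n                           # proportion distinguishable
    same = sum(c * (c - 1) for c in counts.values())
    d = 1.0 - same / (n * (n - 1))       # Simpson's D, sample-size corrected
    return g, pg, d


# Hypothetical pairwise distances among four seeds (not thesis data).
toy_dist = [
    [0.00, 0.05, 0.50, 0.50],
    [0.05, 0.00, 0.50, 0.50],
    [0.50, 0.50, 0.00, 0.02],
    [0.50, 0.50, 0.02, 0.00],
]
labels = assign_clones(toy_dist, threshold=0.1)
print(genotypic_diversity(labels))
```

With real data the threshold would be read from the observed distance histogram, as described in the text, rather than fixed in advance.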

Apart from diversity indices estimated for seed samples, the levels of genetic relatedness between and within seed families were also compared with respect to the two taxa. Distance matrices based on Bruvo et al. (2004) were used to construct neighbor-joining trees in PHYLIP 3.66 (Felsenstein 2006), which were visualized with TREEVIEW (Page 1996) for comparisons of branch length and topology. In addition, Rousset’s distances (â; Rousset 2000) were calculated based on allele sharing between all pairs of genetic individuals within a population using SPAGeDI v1.2 (Vekemans & Hardy 2004). This algorithm estimates the probability of a gene being identical by descent between two individuals i and j with respect to the mean Qo over all pairs of sampled individuals, defined as âij = (Qo – Qij)/(1 – Qo), and is an analogue of the kinship coefficient (Rousset 2000; Vekemans & Hardy 2004). Although this distance is often used to infer inter-individual relatedness in continuous distributions, here we use it to compare relatedness of individuals divided into three categories, namely the total seed samples (ALL), between seed families (BS), and within seed families (WS), with respect to the diploid and tetraploid taxa. One-way ANOVA (analysis of variance) with Tukey’s Honestly Significant Difference (HSD) test at the 95% confidence limit was performed using XLSTAT version 2007.4 to test whether there is a significant difference in the pairwise Rousset’s distance between all categories. Although distances are not completely independent and may not meet the assumptions of an ANOVA, we are making comparisons between categories that are independent, so as to demonstrate the degree of difference as if each distance were an independent observation.
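As a worked illustration of the definition âij = (Qo – Qij)/(1 – Qo) given above, the following Python sketch converts a matrix of pairwise identity probabilities into Rousset's distances. It is a sketch under stated assumptions, not the SPAGeDI implementation: the function name and the toy Q values are hypothetical, and the Qij are taken as given rather than estimated from genotypes.

```python
def rousset_distance(q):
    """Return {(i, j): a_ij} from a symmetric matrix of identity probabilities.

    q[i][j] is the probability of gene identity between individuals i and j.
    Q0 is taken as the mean of q over all distinct pairs, following the
    definition quoted in the text.
    """
    n = len(q)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    q0 = sum(q[i][j] for i, j in pairs) / len(pairs)
    if q0 >= 1.0:
        raise ValueError("Q0 must be below 1 for the ratio to be defined")
    return {(i, j): (q0 - q[i][j]) / (1.0 - q0) for i, j in pairs}


# Hypothetical identity probabilities among three individuals (not thesis data).
toy_q = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.00],
    [0.25, 0.00, 1.00],
]
a = rousset_distance(toy_q)
for pair, value in sorted(a.items()):
    print(pair, round(value, 3))
```

A pair that shares more alleles than the average pair (Qij greater than Qo) receives a negative value, which is why negative distances indicate above-average gene identity between two individuals.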

7.3. Results

7.3.1. Comparison of genetic variation

A total of 71 alleles were obtained from the five SSR loci of 201 seed samples, and the number of alleles varied considerably, from 7 to 20 per locus (Table 7.2; Supple. fig.). A maximum of two and four alleles were detected in C. punctata (2x) and C. crus-galli (4x) individuals, respectively, at each locus, consistent with their ploidy levels. The allele number and gene diversity (He) in diploid C. punctata are generally higher than in the tetraploid C. crus-galli (Tables 7.1 and 7.2). A positive, non-significant FIS estimate (P > 0.01) was obtained for the C. punctata population, whereas a significant negative estimate (P < 0.001) was obtained for the C. crus-galli population (Table 7.1).

Histograms of pairwise genetic distance using the Bruvo et al. (2004) method (Fig. 7.1) indicate different frequency distributions in C. punctata and C. crus-galli. A right-skewed (or nearly bell-shaped) distribution is found in C. punctata with 48% of the distance values centered at the mean of 0.50 ± 0.08 (Fig. 7.1A), whereas a markedly left-skewed distribution is detected in C. crus-galli (Fig. 7.1B) with up to 66% of the values found in the two smallest distance classes (0-0.15). Such differences are undoubtedly due to the preponderance of uniform genotypes among the C. crus-galli seed samples. Estimates of genotypic variation G, PG, and D are lower in C. crus-galli than in C. punctata individuals, consistent with the observed allele number and gene diversity (Table 7.1). In C. punctata, values of PG, D, and E inferred at the individual tree level (0.93, 0.99, and 0.89; Table 7.1) are higher than those at the seed level. These values are all close to one, which is expected for sexually reproduced individuals. In C. crus-galli, values of PG, D, and E among individual trees (0.33, 0.49, and 0.31; Table 7.1) are less than half of those in C. punctata.

7.3.2. Genetic relatedness among seed samples

Seed samples of C. crus-galli and C. punctata are distinguished as two separate clusters in the neighbor-joining dendrogram (Fig. 7.2) and differ substantially from each other in both topology and branch length. Samples of C. punctata are split into several small clusters that contain a mixture of seed families (i.e., originating from different mother trees), and terminal branches range from 0-0.27 (mean 0.06 ± 0.05) in length (Fig. 7.2A). In contrast, relationships among seed samples of C. crus-galli are not clearly resolved, largely because of the short branches in the resulting tree (Fig. 7.2B). Terminal branches range from 0-0.07 (mean 0.01 ± 0.04) and are considerably shorter than those of C. punctata. This suggests limited allelic differences among seed samples of C. crus-galli and is consistent with the differences found in the diversity indices between the two taxa.

ANOVA indicates that the pairwise Rousset’s distances (â) of all seed samples (ALL) of C. punctata (mean 0.16 ± 0.008) are significantly higher than those of C. crus-galli (mean -0.03 ± 0.02) (Table 7.3; Fig. 7.3). Negative â values observed in C. crus-galli indicate a higher inter-individual gene identity compared to the overall gene identity among sampled individuals (Rousset 2000). Further partitioning of seed samples with respect to families, i.e., within (WS) and between seed families (BS), indicates no significant difference between the pairwise â values of WS and BS for C. crus-galli (P = 0.881; Table 7.3). However, both sets of values are significantly lower than those of C. punctata (Fig. 7.3). These results mirror the differences in branch lengths between the two taxa in the NJ trees (Fig. 7.2). All analyses indicate scant genetic variation within and between seed families of tetraploid C. crus-galli when compared to C. punctata and imply a considerable impact of reproductive system on genetic diversity.

7.4. Discussion

Our study represents the first report of a fine-scale analysis of genetic variation among seed families in sexual and apomictic individuals of two Crataegus species that are common and locally abundant in Ontario, Canada. Analyses of our microsatellite data using both the Bruvo et al. (2004) and Rousset (2000) distance approaches proved useful in comparing genetic variation between diploid and tetraploid samples. They provided compelling evidence for lower genetic variability in tetraploid C. crus-galli than in diploid C. punctata.

The first species in our comparison is C. punctata, which has been documented to be diploid in Ontario (Muniyamma & Phipps 1979; Dickinson & Phipps 1986; Talent & Dickinson 2005) and to reproduce sexually (Dickinson & Phipps 1986; Talent & Dickinson 2007). Crataegus punctata is morphologically distinctive, and it was the dominant hawthorn species at the site where seed families were collected for this study (Table 7.1). The only other hawthorn at this site, C. calpodendron, flowers a minimum of one week after the sympatric C. punctata (Phipps & Muniyamma 1980); thus the genetic diversity found in our C. punctata samples is unlikely to be influenced by introgression from C. calpodendron.

In this study, C. punctata serves as a benchmark for genetic variability within and between seed families attributable to recombination in sexually reproduced individuals. Diversity indices inferred at the individual level (i.e., between seed families) of C. punctata (PG, D, and E; Table 7.1) were comparable to those of another diploid species, C. suksdorfii, whose values are also close to one (Chapter 6), a signature of sexual species under normal recombination and outcrossing (Ellstrand & Roose 1987). Moreover, the high within-family variation (P-WS; Fig. 7.3) implies a significant pollen contribution to maternal genotypes that gives rise to heterogeneous offspring, confirming the lack of apomixis and selfing in this species (Dickinson & Phipps 1986; Talent & Dickinson 2007). However, we are unable to accurately estimate the relative amount of outcrossing and inbreeding within families for this sexual taxon because of the limited number of progeny included. It is worth noting that multilocus diversity indices estimated from all seed samples appear to be lower than those from a subsample that represents individual trees (e.g., PG and D; Table 7.1). This clearly indicates the effect of family structure, i.e., sharing the same maternal parent, when examining variation even among sexually produced seeds.

The second species in our comparison, C. crus-galli, is known to be uniformly tetraploid in Ontario (Muniyamma & Phipps 1979; Dickinson & Phipps 1986; Talent & Dickinson 2005), and its reproduction appears to be predominantly apomictic (Dickinson & Phipps 1986; Talent & Dickinson 2007). While the site at which seeds were collected (Table 7.1) has a more diverse hawthorn flora than the C. punctata site, these species are well known (Dickinson, pers. obs.). The only species that might be confused with C. crus-galli at our collection site is referred to as C. grandis Ashe, which could be a hybrid between C. crus-galli and C. punctata (Muniyamma & Phipps 1980; Dickinson & Phipps 1985). However, C. grandis is male-sterile and frequently parthenocarpic (Dickinson & Phipps 1986), and hence the fruits of this taxon yield no seeds. We are confident that our seed families from this site are composed exclusively of C. crus-galli and, because of the self-compatibility and apomictic reproduction of this tetraploid taxon (Dickinson & Phipps 1986; Talent & Dickinson 2007), it is unlikely that the low level of genetic diversity found among these seed families has been influenced in any way by introgression from other Crataegus species.

Multilocus diversity indices of C. crus-galli (G = 9, PG = 0.12, D = 0.28; Table 7.1) are considerably lower than those of the sexual C. punctata (G = 47, PG = 0.40, and D = 0.64; Table 7.1). The significant negative FIS estimate implies an excess of heterozygous genotypes, which may relate to allopolyploidy in this tetraploid taxon. This characteristic has also been reported in other polyploid species (Birky 1996; Paun et al. 2006). The low level of genetic variation observed within and between seed families of C. crus-galli (Figs. 7.2 and 7.3) points to the preponderance of apomictic reproduction and selfing, as suggested in previous embryological studies (Dickinson & Phipps 1986; Talent & Dickinson 2007). Within seed families, we infer that the uniform but highly heterozygous multilocus genotypes detected among seed progeny may represent either relatively limited or homogeneous pollen (i.e., paternal genotype) contribution to the offspring genotypes, when compared to the sexual C. punctata. The extent of pollen flow and its involvement in reproduction, other than in pseudogamy, could be constrained by pollinator activity in our studied population (Dickinson & Phipps 1986) and by the endosperm balance requirement (Whitton et al. 2008). However, the latter factor appears not to be important in asexual seed formation in Crataegus species (Talent & Dickinson 2007). Other apomictic plants such as species of Rubus (Nybom & Schaal 1990; Kraft et al. 1996; Nybom 1998), Hieracium (Shi et al. 1996; Štorchová et al. 2002), Ranunculus (Hörandl et al. 2001; Paun et al. 2006), and Taraxacum (Lyman & Ellstrand 1998) also show scant genetic variability in natural populations, potentially because of the deficiency in pollen flow and/or polyploidy that lead to the evolution of apomixis.

Between seed families (i.e., among individual trees), the level of genetic variation observed in tetraploid C. crus-galli appears to differ from that of another tetraploid species, C. douglasii (section Douglasianae), which is also found in North America and is predominantly apomictic (Dickinson et al. 1996). Here, we detect six multilocus genotypes in 18 C. crus-galli trees, and the genotypic diversity and evenness are 0.49 and 0.31, respectively (Table 7.1). These estimates are lower than in C. douglasii, in which a maximum of 13 multilocus genotypes were detected in 24 individuals and the overall diversity values range from 0.67-0.98 within populations (Chapter 6). The contrast in genetic variation may suggest differences in the relative role of outcrossing (i.e., pollen flow) and/or sexual recombination between these two apomictic species, as also documented in herbaceous agamospermous species (see review in Ellstrand & Roose 1987; Widen et al. 1994; Whitton et al. 2008). For example, occasional sexuality contributes to a high allozyme marker-based genotypic diversity in the triploid apomicts of the European Taraxacum (0.71-0.89; Menken et al. 1995), which is strikingly high compared to the sexual relatives and other polyploid apomicts (Ellstrand & Roose 1987). However, the North American Taraxacum apomicts showed less than half of the genotypic diversity in the sexual endemics, demonstrating the prevalence of clonality (Lyman & Ellstrand 1998). In Rosaceae, while some apomicts of Rubus were shown to contain limited genetic variability (Nybom & Schaal 1990; Kraft et al. 1996; Nybom 1998), other studies revealed substantial amounts of variation within populations in apomicts of Amelanchier and Aronia, potentially due to occasional outcrossing events (Campbell et al. 1999; Persson-Hovmalm et al. 2004). Therefore, it seems increasingly clear that agamospermous species can vary in their levels of genetic variation, owing to different influences of outcrossing and/or sexuality. For C. crus-galli, the lack of outcrossing and the preponderance of apomixis are possible explanations for the low genetic variability observed between seed families.

To conclude, the present study reports the first fine-scale analyses of genetic variation using seed families of woody sexual and asexual plants. We also demonstrate the use of analytical methods for comparing genetic diversity between diploid and tetraploid samples, which can be applied to codominant and potentially dominant data of various ploidy levels. In C. crus-galli and C. punctata, contrasting levels of genetic variation were uncovered between and within seed families that are likely due to differences in reproductive systems. Uniparental inheritance related to gametophytic apomixis seems to prevail in tetraploid C. crus-galli and results in a replication of maternal genotypes with little pollen contribution among seed progenies within a population, corroborating early inferences based on morphological and embryological data (Dickinson 1986; Dickinson & Phipps 1986). Such a pattern may reflect rapid local colonization in areas that are frequently disturbed either naturally or by human activities, particularly in eastern North America (Dickinson 1999). Future studies at a broader geographical scale are required to examine the global genetic diversity and structure of C. crus-galli and C. punctata, so as to enrich our existing knowledge of the evolution of these apomictic groups.

7.5. Acknowledgements

The authors thank Tara Paton and Simone Russell of the Centre for Applied Genomics, Hospital for Sick Children, for the ABI facilities; Sophie Nguyen for plant collection; and Jess Chung for extracting seeds and helping with DNA extraction for this study. Financial support from the Natural Sciences and Engineering Research Council of Canada (grant A3430 to TAD and 326439-06 to SS), the Botany Department of the University of Toronto, and the Royal Ontario Museum is gratefully acknowledged.

7.6. References

Birky CW 1996. Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes.

Genetics 144: 427-437.

Campbell CS, Alice LA, Wright WA 1999. Comparisons of within-population genetic variation

in sexual and agamospermous Amelanchier (Rosaceae) using RAPD markers. Pl. Syst. Evol.

215: 157-167.

Carino DA, Daehler CC 1999. Genetic variation in an apomictic grass, Heteropogon contortus,

in the Hawaiian Islands. Mol. Ecol. 8: 2127-2132.

Clausen J 1954. Partial apomixis as an equilibrium system in evolution. Carylogia 6: 469–479.

Darlington CD 1958. Evolution of genetic systems. Edinburgh: Oliver and Boyd. 269 Dickinson TA 1986. Topodeme differentiation in Ontario taxa of Crataegus (Rosaceae:

Maloideae): Leaf morphometric evidence. Can. J. Bot. 64: 2738-2747.

_____ 1999. Species concepts in agamic complexes. In Evolution in man-made habitats.

L.W.D.v. Raamsdonk, and J.C.M.d. Nijs (eds). Institute for Systematics & Ecology,

Amsterdam. Pp. 319-339.

_____, Phipps JB 1985. Degree and pattern of variation in Crataegus section Crus-galli in

Ontario. Syst. Bot. 10: 322-337.

_____, _____ 1986. The breeding system of Crataegus crus galli sensu lato in Ontario (Canada).

Am. J. Bot. 73: 116-130.

Diggle PK, Lower S, Ranker TA 1998. Clonal diversity in alpine populations of Polygonum

viviparum (Polygonaceae). Int. J. Pl. Sci. 159: 606-615.

Ellegren H 2000. Microsatellite mutations in the germline. Trends Genet. 16: 551-558.

Ellstrand NC, Roose ML 1987. Patterns of genotypic diversity in clonal plant species. Am. J.

Bot. 74: 123-131.

Esselman EJ, Jian QL, Crawford DJ, Windus JL, Wolfe AD 1999. Clonal diversity in the rare

Calamagrostis porteri ssp. insperata (Poaceae): comparative results for allozymes and

random amplified polymorphic DNA (RAPD) and intersimple sequence repeat (ISSR)

markers. Mol. Ecol. 8: 443-451.

Felsenstein J 2006. PHYLIP (Phylogeny Inference Package), version 3.66c. Zoology

Department, University of Washington, Seattle.

Gabrielsen TM, Brochmann C 1998. Sex after all: high levels of diversity detected in the arctic

clonal plant Saxifraga cernua using RAPD markers. Mol. Ecol. 7: 1701-1708.

Gianfranceschi L, Seglias N, Tarchini R, Komjanc M, Gessler C 1998. Simple sequence

repeats for the genetic analysis of apple. Theor. Appl. Genet. 96: 1069-1076.

Grant V 1981. Plant speciation. Columbia University Press, New York.

Hamrick JL, Godt MJ 1996. Effects of life history traits on genetic diversity in plant species.

Philos. Trans. R. Soc. London 351: 1291-1298.

_____, _____, Sherman-Broyles SL 1992. Factors influencing levels of genetic diversity in

woody plant species. New Forest 6: 95-124.

Hardy OJ, Vekemans X 2002. SPAGEDI: a versatile computer program to analyse spatial

genetic structure at the individual or population levels. Mol. Ecol. Notes 2: 618-620.

Hörandl E, Jakubowsky G, Dobeš C 2001. Isozyme and morphological diversity within

apomictic and sexual taxa of the Ranunculus auricomus complex. Pl. Syst. Evol. 226: 165-185.

Kirby GC 1975. Heterozygote frequencies in small subpopulations. Theor. Popul. Biol. 8: 31-48.

Kjolner S, Sastad SM, Taberlet P, Brochmann C 2004. Amplified fragment length

polymorphism versus random amplified polymorphic DNA markers: clonal diversity in

Saxifraga cernua. Mol. Ecol. 13: 81-86.

Kollmann J, Steinger T, Roy B 2000. Evidence of sexuality in European Rubus (Rosaceae)

species based on AFLP and allozyme analysis. Am. J. Bot. 87: 1592-1598.

Kondrashov AS 1993. Classification of hypotheses on the advantage of amphimixis. J. Hered.

84: 372-387.

Kraft T, Nybom H, Werlemark G 1996. DNA fingerprint variation in some blackberry species

(Rubus subg.Rubus, Rosaceae). Pl. Syst. Evol. 199: 93-108.

Kudoh H, Shibaike H, Takasu H, Whigham DF, Kawano S 1999. Genetic structure and

determinants of clonal structure in a temperate deciduous woodland herb, Uvularia perfoliata.

J. Ecol. 87: 244-257.

Liebhard R, Gianfranceschi L, Koller B, Ryder CD, Tarchini R, Van De Weg E, Gessler C 2002.

Development and characterisation of 140 new microsatellites in apple (Malus ×domestica

Borkh). Mol. Breeding 10: 217-241.

Longley AE 1924. Cytological studies in the genus Crataegus. Am. J. Bot. 11: 295-317.

Lyman JC, Ellstrand NC 1998. Relative contribution of breeding system and endemism to

genotypic diversity: the outcrossing endemic Taraxacum californicum vs. the widespread

apomict T. officinale (sensu lato). Madroño 45: 283-289.

Lynch M, Gabriel W 1983. Phenotypic evolution and parthenogenesis. Am. Nat. 122: 745-764.

Menken SBJ, Smit E, Hans Den Nijs JCM 1995. Genetical population structure in plants: gene

flow between diploid sexual and triploid asexual dandelions (Taraxacum section Ruderalia).

Evolution 49: 1108-1118.

Meirmans PG, Van Tienderen PH 2004. GENOTYPE and GENODIVE: two programs for the

analysis of genetic diversity of asexual organisms. Mol. Ecol. Notes 4: 792-794.

Muniyamma M, Phipps JB 1979. Cytological proof of apomixis in Crataegus (Rosaceae). Am.

J. Bot. 66: 149-155.

_____, _____ 1984. Further cytological evidence for the occurrence of apomixis in North

American hawthorns. Can. J. Bot. 62: 2316-2324.

Nei M 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.

Nogler GA 1984. Gametophytic apomixis. In Embryology of angiosperms. B.M. Johri (ed). Springer-Verlag, Berlin. Pp. 475-518.

Nybom H 1998. Biometry and DNA fingerprinting detected limited genetic differentiation

among populations of the blackberry Rubus nessensis (Rosaceae). Nord. J. Bot. 18: 323-333.

_____, Schaal BA 1990. DNA "fingerprints" reveal genotypic distributions in natural

populations of blackberries and raspberries (Rubus, Rosaceae). Am. J. Bot. 77: 883-888.

Page RDM 1996. TREEVIEW: An application to display phylogenetic trees on personal

computers. Comput. Appl. Biosci. 12: 357-358.

Palmer EJ 1932. The Crataegus problem. J. Arnold Arboretum 13: 342-362.

Parker KC, Hamrick JL, Parker AJ, Nason JD 2001. Fine-scale genetic structure in Pinus clausa

(Pinaceae) populations: effects of disturbance history. Heredity 87: 99-113.

Paun O, Greilhuber J, Temsch EM, Hörandl E 2006. Patterns, sources and ecological

implications of clonal diversity in apomictic Ranunculus carpaticola (Ranunculus auricomus

complex, Ranunculaceae). Mol. Ecol. 15: 897-910.

Persson-Hovmalm HA, Jeppsson N, Bartish I, Nybom H 2004. RAPD analysis of diploid and

tetraploid populations of Aronia points to different reproductive strategies within the genus.

Hereditas 141: 301-312.

Phipps JB, Muniyamma M 1980. A taxonomic revision of Crataegus (Rosaceae) in Ontario.

Can. J. Bot. 58: 1621-1699.

Ritland K 2000. Marker-inferred relatedness as a tool for detecting heritability in nature. Mol.

Ecol. 9: 1195-1204.

Rousset F 2000. Genetic differentiation between individuals. J. Evol. Biol. 13: 58-62.

Schlötterer C 2000. Evolutionary dynamics of microsatellite DNA. Chromosoma 109: 365-371.

Shi Y, Gornall RJ, Draper J, Stace CA 1996. Intraspecific molecular variation in Hieracium sect.

Alpina (Asteraceae), an apomictic group. Folia Geobotanica 31: 305-413.

Smith PG, Phipps JB 1988. Breeding behavior in Ontario Crataegus series Rotundifoliae. Can. J.

Bot. 66: 1914-1923.

Smouse PE, Peakall R 1999. Spatial autocorrelation analysis of individual multiallele and

multilocus genetic structure. Heredity 82: 561-573.

Standish LM 1916. What is happening to the hawthorns? J. Hered. 7: 266-279.

Stebbins GL 1950. Variation and evolution in plants. Columbia University Press, New York.

Štorchová H, Chrtek Jr J, Bartish IV, Tetera M, Kirschner J, Štepánek J 2002. Genetic variation

in agamospermous taxa of Hieracium sect. Alpina (Compositae) in the Tatry Mts. (Slovakia).

Pl. Syst. Evol. 235: 1-17.

Talent N, Dickinson TA 2005. Polyploidy in Crataegus and Mespilus (Rosaceae, Maloideae):

evolutionary inferences from flow cytometry of nuclear DNA amounts. Can. J. Bot. 83:

1268-1304.

_____, _____ 2007. Endosperm formation in aposporous Crataegus L. (Rosaceae,

Spiraeoideae, Pyreae): Parallel to Ranunculaceae and Poaceae. New Phytol. 173: 231-249.

Van Der Hulst RGM, Mes THM, Den Nijs JCM, Bachmann K 2000. Amplified fragment length

polymorphism (AFLP) markers reveal that population structure of triploid dandelions

(Taraxacum officinale) exhibits both clonality and recombination. Mol. Ecol. 9: 1-8.

Vavrek M 1998. Within-population genetic diversity of Taraxacum officinale (Asteraceae):

differential genotype response and effect on interspecific competition. Am. J. Bot. 85: 947-954.

Vekemans X, Hardy OJ 2004. New insights from fine-scale spatial genetic structure analyses in

plant populations. Mol. Ecol. 13: 921-935.

Widen B, Cronberg N, Widen M 1994. Genotypic diversity, molecular markers, and spatial

distribution of genets in clonal plants, a literature survey. Folia Geobotanica 29: 245-263.


Table 7.1 Descriptive statistics of genetic variation at five microsatellite loci (Table 7.2) among seed families of C. crus-galli and C. punctata. Two to seven seed samples were used for each individual tree. Numbers in parentheses denote parameters estimated for individual trees based on the multilocus genotype of one randomly selected seed from each family (i.e., each individual tree). Dashes denote parameters that are not applicable at the individual seed level. An asterisk indicates a value that is significant at P < 0.001.

Species                                         C. crus-galli L.                    C. punctata Jacq.
Province                                        Ontario                             Ontario
County                                          Regional Municipality of Niagara    Perth
Coordinates                                     43° 14’ 40” N 79° 03’ 40” W; ON04   43° 18’ 54” N 81° 10’ 18” W; ON46
Ploidy level                                    4x                                  2x
No. of seed samples (no. of individual trees)   83 (18)                             118 (28)
Total allele no.                                37 (22)                             62 (41)
Mean allele no. per locus                       7.4 (4.4)                           12.4 (8.2)
Mean He (gene diversity based on Nei 1987)      0.42 (0.56)                         0.68 (0.70)
Mean FIS                                        - (-0.29*)                          - (0.11)
No. of multilocus genotypes (G)                 9 (6)                               47 (26)
Proportion of distinguishable genotypes (PG)    0.12 (0.33)                         0.40 (0.93)
Genotypic or Simpson’s diversity index (D)      0.28 (0.49)                         0.64 (0.99)
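As an illustration of how the genotypic statistics in Table 7.1 can be derived, the following minimal Python sketch counts the multilocus genotypes (G), the proportion of distinguishable genotypes (PG = G/N, which is consistent with the per-tree values in Table 7.1, e.g., 6/18 = 0.33), and a Simpson-type diversity index (D). The finite-sample form D = 1 - sum[n_i(n_i - 1)] / [N(N - 1)] is an assumption made here for illustration; the estimator implemented in the software used in this chapter may differ. The function name and the toy genotypes are hypothetical and are not data from this study.

```python
from collections import Counter


def genotype_diversity(genotypes):
    """Return (G, PG, D) for a list of hashable multilocus genotypes.

    G  : number of distinct multilocus genotypes
    PG : proportion of distinguishable genotypes, G / N
    D  : Simpson-type diversity, 1 - sum(n_i * (n_i - 1)) / (N * (N - 1))
         (finite-sample form; assumed here, not taken from the thesis)
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(genotypes)
    g = len(counts)
    pg = g / n
    d = 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return g, pg, d


# Hypothetical toy data: six seeds, each scored at two loci.
toy_seeds = [
    ("150/154", "128/132"),
    ("150/154", "128/132"),
    ("150/154", "128/132"),
    ("150/158", "128/132"),
    ("150/158", "128/132"),
    ("146/154", "130/132"),
]
G, PG, D = genotype_diversity(toy_seeds)
print(G, round(PG, 2), round(D, 2))
```

In a strictly clonal seed family all seeds share one genotype, so G = 1 and D = 0, whereas D approaches 1 when nearly every seed carries a unique genotype.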

Table 7.2 Nucleotide sequences and information of microsatellite markers used in this study. Primers were developed based on Malus ×domestica and were shown applicable to Crataegus species (Liebhard et al. 2002). C: Total number of alleles detected in C. crus-galli; P: Total number of alleles detected in C. punctata.

Loci     Forward primer sequence (5' - 3')  Reverse primer sequence (5' - 3')  Total alleles  C   P   Size (bp)  Maps on LG no.
CH01F02  ACC ACA TTA GAG CAG TTG AGG        CTG GTT TGT TTT CCT CCA GC         16             10  13  145-185    12
CH03A02  TTG TGG ACG TTC TGT GTT GG         CAA GTT CAA CAG CTC AAG ATG A      16             9   13  124-168    14
CH04F06  GGC TCA GAG TAC TTG CAG AGG        ATC CTT AAG CGC TCT CCA CA         20             10  18  128-190    14
CH04G04  AGT GGA TGA TGA GGA TGA GG         GCT AGT TGC ACC AAG TTC ACA        7              3   7   145-159    12
CH05D04  ACT TGT GAG CCG TGA GAG GT         TCC GAA GGT ATG CTT CGA TT         12             5   11  166-196    12

Table 7.3 Tukey (HSD) analysis of the differences in pairwise Rousset’s distances (2000) between three categories including total seed samples (ALL), between seed families (BS), and within seed families (WS) of C. crus-galli (c) and C. punctata (p) within local populations at a confidence interval of 95%.

Groups
  Contrast        F-ratio                 P-value    Significant
Between C. crus-galli and C. punctata
  p-ALL vs c-ALL  F(1, 8268) = 8260.49    < 0.0001   Yes
  p-BS vs c-WS    F(1, 5385) = 587.11     < 0.0001   Yes
  p-BS vs c-BS    F(1, 7896) = 8194.35    < 0.0001   Yes
  p-WS vs c-WS    F(1, 370) = 177.09      < 0.0001   Yes
  p-WS vs c-BS    F(1, 2881) = 877.86     < 0.0001   Yes
Within C. crus-galli
  c-BS vs c-WS    F(1, 2835) = 4.24       0.881      No
Within C. punctata
  p-BS vs p-WS    F(1, 5431) = 123.94     < 0.0001   Yes

Figure 7.1 Frequency distribution of Bruvo et al. (2004) based genetic distance computed by GENODIVE (Meirmans & Van Tienderen 2004). Y-axis represents the number of pairwise counts respectively in (A) C. punctata and (B) C. crus-galli. X-axis represents genetic distances that are divided into 13 classes as recognized under the calculation criteria.

[Figure 7.1: histograms. Panel (A) C. punctata, N = 118 (seeds of 2x sexuals); panel (B) C. crus-galli, N = 83 (seeds of 4x apomicts). Y-axis: number of pairwise counts (0-1000); X-axis: Bruvo et al. (2004) based genetic distance (13 classes, 0.08-1).]

Figure 7.2 Neighbor-joining trees based on Bruvo et al. (2004) distances among seed samples of C. punctata (A) and C. crus-galli (B). Terminals are represented by seed samples of either the same or different families. Branches are shown with the same scale for the two taxa. Terminal branches of C. crus-galli (mean 0.01 ± 0.04) are considerably shorter than those of C. punctata (mean 0.06 ± 0.05). The nearly zero branch lengths suggest limited allelic differences among seed samples of C. crus-galli.

[Figure 7.2: neighbor-joining trees. Panel (A) 118 seed samples representing 28 seed families of C. punctata (p); panel (B) 83 seed samples representing 18 seed families of C. crus-galli (c). Scale bar: 0.1.]

Figure 7.3 Box plots of pairwise Rousset’s distance (Rousset 2000) calculated for the total seed samples (ALL), as well as within (WS) and between (BS) seed families of C. crus-galli (C) and C. punctata (P). The overall â values of C. punctata (0.16 ± 0.008) are significantly higher than those of C. crus-galli (-0.03 ± 0.02) as indicated by the ANOVA comparisons (Table 7.3). No significant difference is found between the WS and BS â values of C. crus-galli (P = 0.881; Table 7.3).
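The ALL, WS, and BS categories compared in Table 7.3 and Figure 7.3 amount to a partition of the pairwise distance values by seed family membership. A minimal Python sketch of that partition is given below. It assumes that a symmetric matrix of pairwise distances has already been computed by some other program; the function name, the family labels, and the toy distance values are hypothetical and are not results from this study.

```python
from itertools import combinations


def split_by_family(dist, family):
    """Partition pairwise distances into within- and between-family sets.

    dist   : square symmetric matrix (list of lists) of pairwise distances
    family : list of seed-family labels, one per row/column of dist
    Returns (within, between); their concatenation is the ALL category.
    """
    within, between = [], []
    for i, j in combinations(range(len(family)), 2):
        if family[i] == family[j]:
            within.append(dist[i][j])   # WS: both seeds from the same tree
        else:
            between.append(dist[i][j])  # BS: seeds from different trees
    return within, between


# Hypothetical toy data: four seeds from two maternal trees.
toy_family = ["tree1", "tree1", "tree2", "tree2"]
toy_dist = [
    [0.00, 0.02, 0.30, 0.28],
    [0.02, 0.00, 0.31, 0.27],
    [0.30, 0.31, 0.00, 0.05],
    [0.28, 0.27, 0.05, 0.00],
]
ws, bs = split_by_family(toy_dist, toy_family)
print(len(ws), len(bs))
```

With N seeds there are N(N - 1)/2 pairs in total, so the number of WS and BS values together always equals the size of the ALL category.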

[Figure 7.3: box plots. Y-axis: categories (C-ALL, P-ALL, C-BS, P-BS, C-WS, P-WS); X-axis: Rousset's distance (-0.50 to 0.75). C: C. crus-galli; P: C. punctata.]

Supplementary figure. Frequency distributions for detected alleles of five microsatellite loci in the two hawthorn species, C. crus-galli (gray) and C. punctata (black). Among all loci, CH01F02, CH03A02, and CH04F06 are shown to be polymorphic with 10-18 alleles detected in C. crus-galli and C. punctata, whereas CH04G04 appears to be the least variable locus with only 3 and 7 alleles, respectively, in the two taxa.

[Supplementary figure: bar charts, one panel per locus (CH01F02, CH03A02, CH04F06, CH04G04, CH05D04). Y-axis: allelic frequency; X-axis: allele size (bp). Legend: C. crus-galli, C. punctata.]

Chapter 8 Summary

Genera of the Pyrinae are characterized by polyploidy, apomixis, and weak reproductive barriers such that species are interfertile. Classifications of the Pyrinae and of individual genera within it have been difficult in the sense that many morphological characters overlap and are homoplastic because of intergeneric as well as interspecific hybridization. Phylogenetic analyses using nuclear and chloroplast gene regions did not indicate clear genetic distinction between species of Crataegus and Mespilus. Mespilus canescens, a triploid plant that is currently restricted to a narrow eastern North American distribution, is best explained as a hybrid of C. brachyacantha and M. germanica based on overlapping genetic and morphological features. Although there is certain arbitrariness in the assignment of taxonomic rank, the taxonomic solution that best reflects both the molecular phylogeny and the morphological data, as well as causing minimum disruption of the existing nomenclature, is to sink the genus Mespilus in Crataegus as a new monotypic section. Mespilus canescens is readily accommodated as an intersectional hybrid named as a nothospecies in a new nothosection.

Within the large genus Crataegus, a strong genetic association was found between the East Asian (sect. Sanguineae) and western North American (sect. Douglasianae) species, suggesting an ancient trans-Beringian migration. Moreover, there is some evidence for trans-Atlantic connections as manifested by the hybrid origins of three extant eastern North American species, C. marshallii (series Apiifoliae), C. spathulata (sect. Microcarpae), and C. phaenopyrum (sect. Cordatae), which were potentially derived from hybridization between North American species and an extinct lineage of European species. The inference that Europe and eastern North America are the most probable ancestral areas of Crataegus disagrees with Phipps’ (1983) hypothesis of southwest China-Mexican origins but is consistent with the North American origins of the Pyrinae (Evans & Campbell 2002) and the earliest fossil record of Crataegus (DeVore & Pigg 2007). Taxonomically, molecular data support the monophyly of sections such as sect. Crataegus, Sanguineae, and Douglasianae, but fail to clearly resolve the eight sections described in eastern North America, despite the considerable amount of data generated for these species. Internal branches of the eastern North American species are relatively short when compared with those of the western North American and East Asian species. In addition, morphological characters are shown to be homoplastic and have undergone multiple independent changes. These lead to the conclusion that the evolutionary history of the eastern North American species might involve ancient genetic bottlenecks, rapid divergence, and/or extensive hybridization.

In western North America, species of C. sect. Douglasianae are shown to vary in ploidy level and reproductive system. Diploids, triploids, and tetraploids are found in C. suksdorfii. Polyploid individuals of C. suksdorfii show overlapping distribution with tetraploid C. douglasii in drier and cooler areas across the Cascades and Rocky Mountains. They are geographically segregated from diploid C. suksdorfii, which inhabits more humid and warmer areas along the Pacific coast. Seeds of diploids are sexually produced whereas seeds of triploids and tetraploids are predominantly produced through gametophytic apomixis. Heterogeneity in reproductive pathways coupled with differential climatic preferences may account for their present distribution pattern.

Gene duplication prior to polyploidization is shown to occur in the Douglasianae complexes. Duplicated paralogs were found to be useful for inferring reticulation history through phylogenetic tree and network reconstructions. Molecular data indicate that triploid individuals of C. suksdorfii were derived independently in the Pacific Northwest. They have originated either as autotriploids from diploid progenitors or as allotriploids from diploid C. suksdorfii and tetraploid C. douglasii. Tetraploid individuals appear to have formed via a triploid bridge, with allotriploids backcrossing with diploid C. suksdorfii parents. Introgression of sympatric C. douglasii also appears to occur through recurrent gene flow.

The population dynamics of these naturally derived polyploids were further investigated for a better understanding of their subsequent establishment. In the Pacific Northwest, frequent interpopulation gene flow, potentially via seed dispersal, is suggested as one of the processes that maintain genetic diversity in tetraploid apomictic C. douglasii. However, the genetic differentiation between C. douglasii of the Pacific Northwest and the upper Great Lakes Basin suggests that dispersal must have occurred following deglaciation and before re-establishment of the mid-continental grasslands. In contrast, gene flow in C. suksdorfii appears to be constrained by geographical distance as well as by the reproductive barrier imposed by ploidy level differences. Such limitation of interploidal gene flow is also indicated in areas where C. suksdorfii and C. douglasii co-occur. In C. suksdorfii and C. douglasii, diploid and tetraploid populations did not differ considerably in within-population variation. However, at the seed family level, the diploid C. punctata reveals remarkably higher genetic variation than the tetraploid C. crus-galli, suggesting differential reproductive systems and corroborating earlier work regarding morphological variability in these two species.

Findings of this thesis contribute to various aspects of polyploid evolution (Fig. 8.1). First, this thesis represents the first phylogenetic study of Crataegus and provides genetic evidence for previous morphological hypotheses with respect to biogeography and classification. Second, in-depth investigations of the Douglasianae complexes provide an explicit model to explain polyploid formation and identify some natural ecological as well as reproductive features related to polyploid distribution, which is less documented in woody plants. Third, inter- and intrapopulation studies of C. douglasii and C. suksdorfii, as well as C. crus-galli and C. punctata, shed light on the evolutionary dynamics of diploid sexuals and polyploid apomicts that occur in natural populations. Overall, the present global and fine scale molecular studies of Crataegus enrich our existing knowledge in two major areas, historical biogeography in the Northern Hemisphere and evolution of natural polyploids in woody angiosperms.

Figure 8.1 Contribution of the present thesis study based on the findings of each of the chapters as described in the modified phylogenetic tree (Fig. 1.1).

[Figure 8.1: diagram linking the chapters to their significance.
Chapters: Ch 1, Polyploidy and Rosaceae; Ch 2, Generic limits of Crataegus and Mespilus; Ch 3, Phylogeography of Old and New World Crataegus; Ch 4, Cytotype distribution and reproductive biology of Douglasianae; Ch 5, Origins and reticulation of series Douglasianae; Ch 6, Population structure of C. douglasii and C. suksdorfii; Ch 7, Genetic variation of seed families.
Significance: first phylogenetic study that provides molecular evidence for morphological hypotheses; clarify taxonomic uncertainties; provide explicit model of polyploid formation; identify natural features of polyploids in woody perennials; shed light on the evolutionary dynamics of diploid sexual and polyploid apomictic plants.
Enrich our existing knowledge in: 1. Northern Hemisphere biogeography; 2. Polyploid evolution in woody plants.]

Appendix 1 (Chapter 2). Locality and voucher data for outgroup, Mespilus, and Crataegus taxa used for molecular analyses. Nomenclature follows that used by Talent and Dickinson (2005). Except as noted, locality data are state (or Canadian provinces) and county (or parish) where leaf samples and herbarium vouchers were collected. All vouchers were deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT) unless noted otherwise; except as noted vouchers were collected by TAD. Ploidy data as reported in Talent and Dickinson (2005); in bold, ploidy determinations from that paper based on the accession studied here. [1] Campbell et al. (1987), vouchers in MAINE; [2] A. A. Dönmez coll. (vouchers in HUB); [3] det. R. Lance; [4] N. Talent coll.; [5] D. Kandalepas coll.

Genus, Section, and Series
  Species                             Voucher                  Source                                                           Ploidy level (x = 17)
Amelanchier Medik.
  A. arborea (Michx. f.) Fernald      2003-1                   Alabama, DeKalb                                                  2x
  A. bartramiana (Tausch) Roemer      B5 [1]                   Maine, Somerset                                                  2x
                                      B9 [1]
Aronia Mitch.
  Aronia arbutifolia (L.) Ell.        2003-2                   Alabama, DeKalb                                                  ca. 4x
Malus Miller
  M. angustifolia (Aiton) Michx.      2003-3                                                                                    2x, 3x
Mespilus L.
  M. canescens Phipps                 2003-35-03               Arkansas, Prairie                                                3x
                                      2003-36-11                                                                                3x
                                      2003-37-13                                                                                3x
                                      2003-38-17                                                                                3x
                                      2003-39-18                                                                                3x
                                      2003-40-19                                                                                3x
                                      2003-41-20                                                                                3x
                                      2003-42-22                                                                                3x
                                      2003-43-24                                                                                3x
  M. germanica L.                     M645-80/32-28 (cult.)    Morton Arboretum, Chicago, Illinois                              2x
                                      M682-80/50-62 (cult.)                                                                     2x
                                      2000-42 (cult.)          Jardin Botanique de Montréal, Montréal, Québec                   2-3x
                                      UCBG78.0184 (cult.)      University of California Botanical Garden, Berkeley, California
                                      AA727-89B                Arnold Arboretum, Boston, Massachusetts
                                      AAD11457 [2]             Turkey, Artvin
                                      AAD11600 [2]             Turkey, Kirklareli
                                      AAD11619 [2]             Turkey, Istanbul
                                      AAD11656 [2]             Turkey, Bursa
                                      AAD11660 [2]             Turkey, Bolu
                                      AAD11687 [2]
Crataegus L.
Mexicanae Loud.
Mexicanae (Loud.) Rehder
  C. mexicana                         UCBG76-2049 (cult.)      University of California Botanical Garden, Berkeley, California  2x
Parvifoliae Loud.
Parvifoliae (Loud.) Rehder
  C. uniflora Münchh.                 2003-26                  Alabama, Autauga                                                 3x
                                      2003-52                  Virginia, Franklin                                               3-4x
Crataegus
Crataegus
  C. laevigata Poir.                  Zika 18472               Washington, San Juan                                             2x
  C. laevigata Poir.                  Zika 18473                                                                                4x
  C. monogyna Jacq.                   Love C-2003-25           Oregon, Lane                                                     3x
  C. monogyna Jacq.                   99FW7-11                 Oregon, Linn                                                     2x
  C. songarica K. Koch                AA198-65A (cult.)        Arnold Arboretum, Boston, Massachusetts                          4x
  C. songarica K. Koch                AA113-96A (cult.)
Apiifoliae (Loudon) Rehder
  C. marshallii Egglest.              2003-05                  Alabama, Dekalb                                                  2-3x
  C. marshallii Egglest.              2003-30                  Mississippi, Scott                                               2-3x
Orientales (C.K.Schneid.) Pojark.
  C. heldreichii Boiss.               AA238-71A (cult.)        Arnold Arboretum, Boston, Massachusetts                          2x
Pentagynae (C.K.Schneid.) Russanov
  C. pentagyna Waldst. & Kit.         AA94-85B (cult.)         Arnold Arboretum, Boston, Massachusetts                          2x
                                      Christensen 312 (cult.)  Denmark, Taastrup
Sanguineae Zabel ex Schneider
Nigrae (Loudon) Russanov
  C. chloroscara Maxim.               AA281-71A (cult.)        Arnold Arboretum, Boston, Massachusetts                          2x
  C. kansuensis Wilson                AA-EN101 (cult.)         Arnold Arboretum, Boston, Massachusetts                          2x
                                      AA12-95 (cult.)
  C. maximowiczii Schneider           AA309-97 (cult.)         Arnold Arboretum, Boston, Massachusetts                          2x, 3x, 4x
  C. nigra Waldst. and Kit.           Christensen 294 (cult.)  Denmark, Taastrup                                                2x
Sanguineae (Zabel ex Schneider) Rehder
  C. dahurica Koehne ex Schneider     AA71-73A (cult.)         Arnold Arboretum, Boston, Massachusetts                          2x
                                      AA-EN250-2000 (cult.)
  C. sanguinea Pall. ex Bieb.         JBM1232-49 (cult.)       Jardin Botanique de Montréal, Montréal, Québec                   2x, 3x, 4x
  C. wilsonii Sarg.                   AA271-84A (cult.)        Arnold Arboretum, Boston, Massachusetts                          2x
                                      AA749-74A (cult.)                                                                         2x
Hupehensis J.B.Phipps.
Hupehenses J.B.Phipps.
  C. hupehensis Sarg.                 AA356-81B (cult.)        Arnold Arboretum, Boston, Massachusetts                          2-3x
                                      AA356-81C (cult.)
Cordatae Beadle ex Egglest.
Cordatae (Beadle ex Egglest.) Rehder
  C. phaenopyrum (L. f.) Medikus      99ME1 (cult.)            Maine, Penobscot                                                 3x, 4x
                                      AA195-52B (cult.)        Arnold Arboretum, Boston, Massachusetts
Virides (Beadle ex Sarg.) Schneider
Virides (Beadle ex Sarg.) Rehder
  C. viridis L.                       2003-44                  Arkansas, Prairie                                                2x+
                                      2003-45                                                                                   2-3x
Microcarpae Loud.
Microcarpae (Loud.) Rehder
  C. spathulata Michx.                2003-6                   Georgia, Floyd                                                   2-3x
                                      2003-34                  Louisiana, Boissier                                              2-3x
Lacrimatae (J.B.Phipps) J.B.Phipps
Lacrimatae J.B.Phipps
  C. lassa Beadle                     2003-18 [4]              Alabama, Dallas                                                  high
Aestivales (Sarg.) Schneider
Aestivales (Sarg.) Rehder
  C. aestivalis (Walt.) T. & G.       NC1992-250 (cult.) [5]   North Carolina Arboretum, Asheville, North Carolina              low
                                      Talent 321 [3]           North Carolina, Buncombe                                         low
  C. opaca Hook. & Arn.               2003-33 (cult.)          Louisiana, Sabine                                                2x
                                      AA387-96A (cult.)        Arnold Arboretum, Boston, Massachusetts
                                      2001-15                  Texas, Jasper
Brevispinae Beadle ex Schneider
Brevispinae (Beadle ex Schneider) Rehder
  C. brachyacantha Sarg. & Engelm.    2000-11                  Texas, Jasper
                                      2001-3A [5]              Texas, Jasper
                                      2003-32                  Louisiana, Sabine                                                2-3x
                                      Reid 5203                Louisiana, Morehouse                                             2-3x
  C. saligna Greene                   99FW1/1                  Colorado, Gunnison                                               2x, 2-3x
                                      2001-4A                  Colorado, Rio Blanco
                                      2001-7A
Douglasianae Loud.
Douglasianae (Loud.) Poletiko
  C. suksdorfii (Sarg.) Kruschke      D1619A                   Montana, Powell                                                  4x
                                      2001-27A                 Montana, Lake
                                      Love C-2003-11           Oregon, Lane                                                     2-3x
                                      Zika 18477               Oregon, Columbia                                                 2x
                                      Zika 18483               Oregon, Washington
                                      Zika 18485               Washington, Clark
                                      99FW8/9                  Washington, Klickitat
                                      99FW8/12
Crus-galli Loud.
Crus-galli (Loud.) Rehder
  C. crus-galli L.                    Talent 213A              Alabama, Montgomery                                              2x
                                      Talent 286               Georgia, Houston                                                 low
                                      2003-15                  Alabama, Lowndes                                                 4x
  C. punctata Jacq.                   2000-26                  Ontario, Lambton
                                      BB4                      Ontario, Bruce                                                   2x
Coccineae Loud.
  C. sp.                              2003-4                   Alabama, DeKalb
Macracanthae (Loud.) Rehder
  C. calpodendron (Ehrh.) Medikus     Talent 166               Ontario, Middlesex                                               2x
                                      Talent 172               Ontario, Niagara                                                 2x
                                      2000-28                  Ontario, Middlesex
                                      AA277-68A (cult.)        Arnold Arboretum, Boston, Massachusetts
Molles (Beadle ex Schneider) Rehder
  C. mollis (T. and G.) Scheele       D1655                    Ontario, Middlesex                                               2-3x
                                      Talent 208 (cult.)       Wisconsin, Madison                                               2x
Triflorae (Beadle) Rehder
  C. triflora Chapm.                  Talent 290a [3]          Georgia, Floyd                                                   2x
incertae sedis
  C. sp.                              RBG 54705                Royal Botanic Garden, Hamilton                                   2x

Appendix 2 (Chapter 2). Morphological characters and their states, together with ploidy level and geographic distribution, as they are expressed in Amelanchier, Mespilus, and Crataegus species.

1. Sympodial replacement growth on reproductive short shoots is proleptic (0; during the following growing season) or sylleptic (1; during the same growing season as flowering), from an axillary bud below the terminal inflorescence.
2. Disposition of ovules within the locule at anthesis is typically collateral (0) or the ovules are superposed (1) (Decaisne 1874; Evans and Dickinson 2005).
3. Seeds are enclosed in a cartilaginous core (0) or within a woody endocarp, or pyrene (1) (Rohrer et al. 1991).
4. In the mature fruit the apices of the pyrenes are covered by epidermis (0) or are exposed (1) (Decaisne 1874; Koehne 1890).
5. Inflorescence typically multiflorous (0) or uniflorous (1) (Rohrer, Robertson and Phipps 1994).
6. Secondary venation of short shoot leaves typically craspedodromous (0) (Robertson et al. 1992, Fig. 1; Leaf Architecture Working Group 1999, Fig. 29.7), semi-craspedodromous (1) (Leaf Architecture Working Group 1999, Fig. 29.8), or (eu-) camptodromous (2) (Robertson et al. 1992, Fig. 2; Leaf Architecture Working Group 1999, Fig. 29.3).
7. Flowers, mean number of stamens > 25 (0), 15-25 (1), or < 15 (2).
8. Flowers, number of gynoecial units (locules, styles) ≥ 5 (0), 4 (1), 3 (2), 2 (3), or ≤ 1 (4).
9. Fruits brown (0), red (1), yellow (2), white (3), or blue, purple, or black (4).
10. Ploidy level 2n = 2x = 34 (0), 2n = 3x = 51 (1), 2n = 4x = 68 (2), or 2n ≥ 5x = 85 (3).
11. Geographic distribution western North America (0), eastern North America (1), eastern Eurasia (2), or western Eurasia (including northern Africa) (3).
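The coding of character 10 follows directly from the base chromosome number x = 17 given in the appendix tables. The short Python sketch below reproduces the somatic chromosome numbers and the character states listed for character 10; the function names are hypothetical and serve only to make the arithmetic explicit.

```python
BASE_CHROMOSOME_NUMBER = 17  # x = 17, as given in the appendix table headers


def somatic_chromosome_number(ploidy):
    """Return 2n for a given ploidy level (e.g., 2 for 2x, 4 for 4x)."""
    return ploidy * BASE_CHROMOSOME_NUMBER


def ploidy_character_state(ploidy):
    """Code character 10: 2x -> 0, 3x -> 1, 4x -> 2, 5x or higher -> 3."""
    if ploidy < 2:
        raise ValueError("ploidy level must be at least 2")
    return min(ploidy, 5) - 2


for level in (2, 3, 4, 5):
    print(f"{level}x: 2n = {somatic_chromosome_number(level)}, "
          f"state {ploidy_character_state(level)}")
```

Ploidy levels above 5x share state 3, in line with the "2n ≥ 5x" wording of character 10.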

Appendix 3 (Chapter 2). GenBank accession numbers of representative species used in the phylogenetic reconstruction here.

Species                            Voucher                   trnS-trnG    psbA-trnH    trnH-rpl2    rpl20-rps12  ITS          LEAFY
A. arborea (Michx. f.) Fernald     2003-1                    EF127115     EF127152     EF127189     EF127226     EF127041     EF127078
Aronia arbutifolia (L.) Ell.       2003-2                    EF127117     EF127154     EF127191     EF127228     EF127043     EF127080
M. angustifolia (Aiton) Michx.     2003-31                   EF127116     EF127153     EF127190     EF127227     EF127042     EF127079
M. canescens Phipps                2003-37-13                EF127099     EF127136     EF127173     EF127210     EF127039     EF127076
M. germanica L.                    M645-80/32-28 (cult.)     EF127098     EF127135     EF127172     EF127209     EF127040     EF127077
C. mexicana                        UCBG76-2049 (cult.)       EF127082     EF127119     EF127156     EF127193     EF127021     EF127058
C. uniflora Münchh.                2003-26                   EF127112     EF127149     EF127186     EF127223     EF127020     EF127057
C. laevigata Poir.                 Zika 18472                EF127093     EF127130     EF127167     EF127204     EF127015     EF127052
C. monogyna Jacq.                  99FW7-11                  EF127091     EF127128     EF127165     EF127202     EF127014     EF127051
C. songarica K. Koch               AA198-65A (cult.)         EF127092     EF127129     EF127166     EF127203     EF127036     EF127073
C. marshallii Egglest.             2003-05                   EF127095     EF127132     EF127169     EF127206     EF127037     EF127074
C. heldreichii Boiss.              AA238-71A (cult.)         EF127090     EF127127     EF127164     EF127201     EF127016     EF127053
C. pentagyna Waldst. & Kit.        AA94-85B (cult.)          EF127094     EF127131     EF127168     EF127205     EF127035     EF127072
C. chloroscara Maxim.              AA281-71A (cult.)         EF127110     EF127147     EF127184     EF127221     EF127009     EF127046
C. kansuensis Wilson               AA12-95 (cult.)           EF127108     EF127145     EF127182     EF127219     EF127029     EF127066
C. maximowiczii Schneider          AA309-97 (cult.)          EF127109     EF127146     EF127183     EF127220     EF127030     EF127067
C. nigra Waldst. and Kit.          Christensen 294 (cult.)   EF127107     EF127144     EF127181     EF127218     EF127007     EF127044
C. dahurica Koehne ex Schneider    AA-EN250-2000 (cult.)     EF127105     EF127142     EF127179     EF127216     EF127028     EF127065
C. sanguinea Pall. ex Bieb.        JBM1232-49 (cult.)        EF127106     EF127143     EF127180     EF127217     EF127027     EF127064
C. wilsonii Sarg.                  AA749-74A (cult.)         EF127104     EF127141     EF127178     EF127215     EF127008     EF127045
C. hupehensis Sarg.                AA356-81B (cult.)         EF127111     EF127148     EF127185     EF127222     EF127038     EF127075
C. phaenopyrum (L. f.) Medikus     99ME1 (cult.)             EF127096     EF127133     EF127170     EF127207     EF127034     EF127071
C. viridis L.                      2003-45                   EF127113     EF127150     EF127187     EF127224     EF127013     EF127050
C. spathulata Michx.               2003-34                   EF127097     EF127134     EF127171     EF127208     EF127033     EF127070
C. lassa Beadle                    2003-185                  EF127081     EF127118     EF127155     EF127192     EF127024     EF127061
C. aestivalis (Walt.) T. & G.      Talent 3214               EF127089     EF127126     EF127163     EF127200     EF127023     EF127060
C. opaca Hook. & Arn.              2003-33 (cult.)           EF127088     EF127125     EF127162     EF127199     EF127022     EF127059
C. brachyacantha Sarg. & Engelm.   2000-11                   EF127100     EF127137     EF127174     EF127211     EF127032     EF127069
C. saligna Greene                  99FW1/1                   EF127101     EF127138     EF127175     EF127212     EF127031     EF127068
C. suksdorfii (Sarg.) Kruschke     Love C-2003-11            EF127103     EF127140     EF127177     EF127214     EF127025     EF127062
                                   Zika 18477                EF127102     EF127139     EF127176     EF127213     EF127026     EF127063
C. crus-galli L.                   Talent 213A               EF127087     EF127124     EF127161     EF127198     EF127010     EF127047
C. punctata Jacq.                  BB4                       EF127086     EF127123     EF127160     EF127197     EF127011     EF127048
C. calpodendron (Ehrh.) Medikus    Talent 172                EF127083     EF127120     EF127157     EF127194     EF127018     EF127055
C. mollis (T. and G.) Scheele      D1655                     EF127085     EF127122     EF127159     EF127196     EF127012     EF127049
C. triflora Chapm.                 Talent 290a4              EF127084     EF127121     EF127158     EF127195     EF127019     EF127056
C. sp.                             RBG 54705                 EF127114     EF127151     EF127188     EF127225     EF127017     EF127054

Appendix 4 (Chapter 3). Locality and voucher data for outgroup and Crataegus taxa used for molecular analyses. Nomenclature follows that used by Phipps (1990) and Talent and Dickinson (2005). Except as noted, locality data are the state (or Canadian province) and county (or parish) where leaf samples and herbarium vouchers were collected. All vouchers were deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT) unless noted otherwise; except as noted, vouchers were collected by TAD. Ploidy level data as reported in Talent and Dickinson (2005); in bold. Footnotes: i, Campbell et al. (1987), vouchers in MAINE; ii, Dickinson (1995); iii, A. A. Dönmez coll., vouchers in HUB; iv, det. R. Lance; v, K. I. Christensen.

Genus, Section, and Series; Species; Voucher; Locality; Ploidy level (x = 17)
(Genus, section, and series names are flush left; species are indented; each accession line gives voucher; locality; ploidy level. Bracketed roman numerals are footnote markers.)

Amelanchier Medik.
  A. arborea (Michx. f.) Fernald
    2003-1; Alabama, DeKalb; 2x
  A. bartramiana (Tausch) Roemer
    B5; Maine, Somerset; 2x [vi]
    B9
Aronia Mitch.
  Aronia sp.
    2003-2; Alabama, DeKalb; ca. 4x
Malus Miller
  M. angustifolia (Aiton) Michx.
    2003-3; 2x, 3x [vii]
    2003-10
Crataegus L.
Mespilus L.
  C. germanica (L.) Dickinson & Lo.
    UCBG78.0184 (cult.); University of California Botanical Garden, Berkeley, California; 2x
    AA727-89B; Arnold Arboretum, Boston, Massachusetts; 2x
    AAD11457; Turkey, Artvin [viii]; 2x
Mexicanae Loud.
Mexicanae (Loud.) Rehder
  C. pubescens Moc. & Sesse
    UCBG76-2049 (cult.); University of California Botanical Garden, Berkeley, California; 2x
Parvifoliae Loud.
  C. uniflora Münchh.
    2003-26; Alabama, Autauga; 3x
    2003-52; Virginia, Franklin; 3-4x
Crataegus
Crataegus
  C. laevigata Poir.
    Zika 18472; Washington, San Juan; 2x
    Zika 18473; Washington, San Juan; 4x
    26-5v; Denmark; 2x
    27-1v; Denmark, Jaegersborg Deer Park; 2x
  C. monogyna Jacq.
    Love C-2003-25; Oregon, Lane; 3x
    99FW7-11; Oregon, Linn; 2x
    34-1v; Denmark, Jaegersborg Deer Park; 2x
    8-1v; Greece, Mitsikeli, Elati; 2x
    B02v; Greece, Mt. Kourenta; 2x
  C. songarica K. Koch
    AA198-65A (cult.); Arnold Arboretum, Boston, Massachusetts; 4x
    AA113-96A (cult.); Arnold Arboretum, Boston, Massachusetts
    304-1v; Tian Shan; 4x
    1954-0509v; Afghanistan, Paski Parun; 4x
  C. rhipidophylla Gand.
    18-1v; Denmark, Sjaellands Odde; 4x
    1970-64v; The Arboretum, Hoersholm; 4x
  C. nevadensis K.I.Chr.
    271-2v; Morocco, Middle Atlas, Ban Iblance to Immanzer des Marmanela; 2x
    1991-61v; Morocco, Atlas Mts., M.F. de Tirrhist; 2x
  C. meyeri Pojark.
    1998-8010v; Denmark, Copenhagen, Botanical Garden; 4x
  C. pseudoheterophylla Pojark.
    313-1v; Uzbekistan, Tan San Mts., River Pskum; 4x
    314-1v; Turkmenistan, Kopet Dag, Tjuen; 4x
Apiifoliae (Loudon) Rehder
  C. marshallii Egglest.
    2000-1; Mt. Alto Rd. Rome
    2003-05; Alabama, Dekalb; 2-3x
    2003-30; Mississippi, Scott; 2-3x
Azaroli (Loud.) Rehder
  C. heldreichii Boiss.
    AA238-71A (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
    B05v; Greece, Mt. Kourenta; 2x
    C02v; Greece, Mt. Ossa; 2x
  C. pycnoloba Boiss. & Heldr.
    A03v; Greece, Mt. Chelmos; 2x
    A04v; Greece, Mt. Chelmos; 2x
  C. orientalis Pall.
    168-1v; Greece, Kalambaka-Chaliki, Kastanea; 4x
    C01v; Greece, Mt. Ossa; 4x
Pentagynae (C.K.Schneid.) Russanov
  C. pentagyna Waldst. & Kit.
    AA94-85B (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
    Christensen 312 (cult.); Turkey, Kastamonu, Yaglica
    312-2v; Turkey, Kastamonu, Yaglica; 2x
    312-3v; Turkey, Kastamonu, Yaglica; 2x
Pinnatifidae (Zabel ex Schneider) Rehder
  C. pinnatifida Bunge
    1691-49v; Jardin Botanique de Montréal, Montréal, Québec; 4x
    15.Vl.2005-#3v; Russia, Primorskiy Kray, Lazovskiy Zapovednik
    15.Vl.2005-#6v; Russia, Primorskiy Kray, Lazovskiy Zapovednik
Sanguineae Zabel ex Schneider
Nigrae (Loudon) Russanov
  C. chlorosarca Maxim.
    AA281-71A (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
  C. kansuensis Wilson
    AA-EN101 (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
    AA12-95 (cult.); Arnold Arboretum, Boston, Massachusetts
  C. maximowiczii Schneider
    AA309-97 (cult.); Arnold Arboretum, Boston, Massachusetts; 2x, 3x, 4x
    310-5v (cult.); Denmark, Taastrup; 4x
    310-1v (cult.); Denmark, Taastrup; 4x
    310-4v (cult.); Denmark, Taastrup; 4x
  C. nigra Waldst. and Kit.
    Christensen 294 (cult.); Denmark, Taastrup; 2x
Sanguineae (Zabel ex Schneider) Rehder
  C. dahurica Koehne ex Schneider
    AA71-73A (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
    AA-EN250-2000 (cult.); Arnold Arboretum, Boston, Massachusetts
  C. sanguinea Pall. ex Bieb.
    JBM1232-49 (cult.); Jardin Botanique de Montréal, Montréal, Québec; 2x, 3x, 4x
  C. wilsonii Sarg.
    AA271-84A (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
    AA749-74A (cult.); Arnold Arboretum, Boston, Massachusetts; 2x
  C. almaatensis Pojark.
    1196-65A (cult.); Arnold Arboretum, Boston, Massachusetts; 4x
  C. russanovii Cin.
    KIC287 (cult.); Denmark, Taastrup
  C. wattiana Hensl. ex Lace
    1401-52; Jardin Botanique de Montréal, Montréal, Québec; 3x
  C. altaica Lange
    1280-50; Jardin Botanique de Montréal, Montréal, Québec; 4x
Hupehenses J. B. Phipps
  C. hupehensis Sarg.
    AA356-81B (cult.); Arnold Arboretum, Boston, Massachusetts; 2-3x
    AA356-81C (cult.); Arnold Arboretum, Boston, Massachusetts
Cordatae Beadle ex Egglest.
  C. phaenopyrum (L. f.) Medikus
    99ME1 (cult.); Maine, Penobscot; 3x, 4x
  C. sp.
    AA195-52B (cult.); Arnold Arboretum, Boston, Massachusetts
Virides (Beadle ex Sarg.) Schneider
  C. viridis L.
    2003-44; Arkansas, Prairie; 2x+
    2003-45; Arkansas, Konecny Grove; 2-3x
Microcarpae Loud.
  C. spathulata Michx.
    2003-6; Georgia, Floyd; 2-3x
    2003-34; Louisiana, Boissier; 2-3x
Lacrimatae (J. B. Phipps) J. B. Phipps
  C. lassa Beadle
    2003-18 [ix]; Alabama, Dallas; high
  C. agrestina
    2003-20; Lowndes Co., near Cypress Creek, AL; 4x
  C. munda Beadle
    Lance2313; Georgia, Rabun; 3x
  C. sp.
    2003-46; Grassy Hill, VA; 4x
  C. sp.
    2003-50; Grassy Hill, VA; 3x
Aestivales (Sarg.) Schneider
  C. aestivalis (Walt.) T. & G.
    NC1992-250 (cult.) [x]; North Carolina, NC Arboretum, Asheville; low
    Talent 3214; North Carolina, Buncombe; low
  C. opaca Hook. & Arn.
    2003-33 (cult.); Louisiana, Sabine; 2x
    AA387-96A (cult.); Arnold Arboretum, Boston, Massachusetts; low
    2001-1 [xi]; Texas, Jasper
  C. rufula Sarg.
    RON; North Carolina, Old Fanning Fields Road; 3x-
    1992-425; North Carolina, NC arboretum; 2-3x
Brevispinae Beadle ex Schneider
  C. brachyacantha Sarg. & Engelm.
    2000-11; Texas, Jasper
    2001-3A6; Texas, Jasper
    2003-32; Louisiana, Sabine; 2-3x
    Reid 5203; Louisiana, Morehouse; 2-3x
  C. saligna Greene
    2001-4A; Colorado, Rio Blanco; 2x
    99FW1/1; Colorado, Gunnison River; 2x, 2-3x
    2001-7A; Colorado, Gunnison River; 2x
Douglasianae Loud.
  C. suksdorfii (Sarg.) Kruschke
    D1619A; Montana, Powell; 4x
    2001-27A; Montana, Lake; 4x
    Love C-2003-11; Oregon, Lane; 2-3x
    Zika 18477; Oregon, Columbia; 2x
    Zika 18483; Oregon, Washington; 2x
    Zika 18485; Washington, Clark; 2x
    99FW8/9; Washington, Klickitat
    99FW8/12; Washington, Klickitat
  C. enderbyensis Phipps & O’Kennon
    PZ18445; British Columbia, Enderby; 4x
    PZ18454; British Columbia, Enderby; 4x
  C. douglasii Lindley
    PZ18453; British Columbia, Spallumcheen; 4x
    NT189; Ontario, Big Bay, Colpoy’s range; 4x
    2003-23; Oregon, Goose Rock Bridge, John Day River; 4x
  C. castlegarensis Phipps & O’Kennon
    PZ18488A; Washington, Thurston Co.; 4x
    PZ18390; Washington, Whitman Co.; 4x
  C. okennonii Phipps
    B (PZ19239); Washington, Okanagan; 4x
    5A (PZ19235); Washington, Okanagan; 4x
  C. rivularis Nuttall
    2001-10A; Idaho, Montpelier Canyon; 4x
    2001-42; Wyoming, Glenrock, beside North Platte River; 4x
    NCA31; North Carolina, NC Arboretum, Asheville; 4x-
  C. erythropoda Ashe
    NT327; Colorado, Chautauqua Park; 4x
    NT333; Colorado, Chautauqua Park; 4x
Crus-galli Loud.
Crus-galli (Loud.) Rehder
  C. crus-galli L.
    Talent 213A; Alabama, Montgomery; 2x
    Talent 286; Georgia, Houston; low
    2000-15; Alabama, Lowndes; 4x
  C. tenax Sarg.
    D661; Ontario, Fansher; 3-4x
    D662; Ontario, Fansher; 3-4x
  C. engelmannii Sarg.
    AA312-87; Arnold Arboretum, Boston, Massachusetts
Punctatae (Loud.) Rehder
  C. punctata Jacq.
    2000-26; Ontario, Lambton
    BB4; Ontario, Bruce; 2x
  C. collina Chapm.
    NT300; UNC Botanical garden; 3x
    NT305; Monte Jones Road, Asheville; 3x+
Coccineae Loud.
Macracanthae (Loud.) Rehder
  C. calpodendron (Ehrh.) Medikus
    Talent 166; Ontario, Middlesex; 2x
    Talent 172; Ontario, Niagara; 2x
    2000-28; Ontario, Middlesex
    AA277-68A (cult.); Arnold Arboretum, Boston, Massachusetts
  C. macracantha Lodd. ex Loud.
    2001-25; Little Bitterroot; 4x
Molles (Beadle ex Schneider) Rehder
  C. mollis (T. and G.) Scheele
    D1655; Ontario, Middlesex; 2-3x
    Talent 208 (cult.); Wisconsin, Madison; 2x
  C. submollis Sarg.
    2000-97; Ontario, Ashbridge's Bay; 4x
Triflorae (Beadle) Rehder
  C. triflora Chapm.
    Talent 290a4; Georgia, Floyd; 2x
    2002-8; Bienville NF, MS; 3x
    Lance2314; Newton Co.; 3x
Pulcherrimae (Beadle ex Palmer) Robertson
  C. sp.
    2002-5; Bienville NF, MS; 3x
Suborbiculatae (Kruschke) Phipps
  C. compacta Sarg.
    D654; Ontario, Fansher; 3x
    D659; Ontario, Fansher; 3x
Intricatae (Sarg.) Rehder
  C. sargentii Sarg.
    NT288; Georgia, McGee Bend Road; 2-3x
  C. flavida Sarg.
    AA966-90H; Arnold Arboretum, Boston, Massachusetts
    2003-61; Arnold Arboretum, Boston, Massachusetts; 4x
    2003-65; Arnold Arboretum, Boston, Massachusetts; 4x
Rotundifoliae (Egglest. ex Egglest.) Rehder
  C. chrysocarpa Phipps.
    AA749-52; Arnold Arboretum, Boston, Massachusetts
    2001-23A; Little Bitterroot; 4x
    2001-24; Little Bitterroot; 4x
  C. dodgei Ashe
    EL2; Ontario, Grey, Colpoy range; 4x
  C. irrasa Sarg.
    NT193; Ontario, Grey, Colpoy range; 4x
    NT307; Ontario, Grey, Colpoy range; 4x
  C. sp.
    2001-29
Bracteatae (Palmer) Rehder
  C. harbisonii Beadle
    Lance4; North Carolina, NC Arboretum, Asheville; 4x
    Lance2307; North Carolina, NC Arboretum, Asheville; 4x
    1998-74A; North Carolina, NC Arboretum, Asheville; 4x
  C. ashei Beadle
    Lance2309; Seedling of the original plant in AL, Lowndes Co., N end of Cravey Hill, Holy Ground Battlefield Park; 4x
    2003-25; Jones Bluff, AL; 4x
Tenuifoliae (Beadle ex Sarg.) Rehder
  C. flabellata (Bosc.) K. Koch
    NT308; Ontario, Colpoy range; 4x

Appendix 5 (Chapter 3). Keys indicating ploidy level and the states of the eight morphological characters mapped on the molecular tree in Fig. 3.6.

Morphological character: Key
Lobing of short shoot leaves (LLB): (white) no; (gray) slightly lobed; (black) deeply lobed with veins extended to sinuses
Leaf veins-to-margin (LVM): (white) nil; (black) yes
Lower leaf surface vestiture (LLV): (white) along veins; (black) present throughout surface
Upper leaf surface vestiture (LUV): (white) sparse; (black) present throughout surface
Branch vestiture (BRV): (white) sparse; (black) present (+/- dense)
Nutlet surface (NUS): (white) smooth; (black) sulcate
Stamen number (STN): (white) 10 stamens; (gray) 20 stamens; (black) 30 stamens
Ploidy level (PLL): (white) 2x; (gray) 3x; (black) 4x

Appendix 6 (Chapters 4, 5, and 6). Locality and voucher data for Crataegus section Douglasianae accessions used for flow cytometry and molecular analyses. Nomenclature follows that used by Phipps (2003). Except as noted, locality data are the state (or Canadian province) and county (or parish) where leaf and fruit samples as well as herbarium vouchers were collected. All vouchers were deposited in the Green Plant Herbarium of the Royal Ontario Museum (TRT). Asterisks denote accessions for which no voucher was made. Collectors: EL, Eugenia Lo and co-workers; NT, Nadia Talent; PZ, Peter Zika; RE, Rodger Evans; RL, Rhode Love; SS, Saša Stefanović; TD, Timothy Dickinson and co-workers.

Section, Series, Species; Collection ID; State/Province; County; Locality No.; Locality; Ch 4 Leaf FC; Ch 4 Seed FC (#); Ch 5 NR, CP; Ch 6 SSR

Douglasianae
Douglasianae
  C. castlegarensis
    TD2007-06 California Shasta Co. CAR3 Hat Creek Park FC
    EL141 Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL192 Idaho Lemhi Co. ID16 US 93 FC 2 CP SSR
    TD2007-15 Saskatchewan SK Cypress Hills FC
    TD2007-16 Saskatchewan SK Cypress Hills FC
    PZ18488 Washington Thurston Co. WA21 Mound Prairie
    EL086 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC 2 CP SSR
    EL086A Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL087 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC 4 SSR
    EL087A Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL088 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL089 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC NR, CP SSR
    EL090 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC 2 SSR
    EL091 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC CP SSR
    EL092 Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC CP SSR
    EL093* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL094* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL095* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL096* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL097* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL098* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL099* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC 6 SSR
    EL100* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC 3 CP SSR
    EL101* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC CP SSR
    EL102* Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC CP SSR
    PZ18488A Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    PZ18488B Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    PZ18488C Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    PZ18488D Washington Thurston Co. WA21 Scatter Creek Rest Area Mound Prairie FC SSR
    EL2006-13 California Shasta Co. CAR3 Hat Creek Park FC NR, CP
    EL140 Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
  C. douglasii
    PZ18450 British Columbia Spallumcheen Municipality FC
    PZ18452 British Columbia Spallumcheen Municipality FC
    PZ18453 British Columbia Spallumcheen Municipality FC
    PZ18456 British Columbia North Saanich Municipality FC
    EL2006-14 California Shasta Co. CA03 Dusty Campground FC 5
    EL2006-12 California Shasta Co. CAR2 Dana FC 5
    TD2007-05 California Shasta Co. CAR3 Hat Creek Park FC
    EL2005-15 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 6
    TD1572 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 6
    EL129 Idaho Nez Perce Co. ID02 Little Boulder Creek Campground FC CP SSR
    EL131 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 2 CP SSR
    EL132* Idaho Latah Co. ID02 Little Boulder Creek Campground FC 3 SSR
    EL133* Idaho Latah Co. ID02 Little Boulder Creek Campground FC CP SSR
    EL134* Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL135* Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL136* Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL137* Idaho Latah Co. ID02 Little Boulder Creek Campground FC 4 SSR
    EL138* Idaho Latah Co. ID02 Little Boulder Creek Campground FC 5 NR, CP SSR
    EL139* Idaho Latah Co. ID02 Little Boulder Creek Campground FC 7 CP SSR
    EL143 Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL147 Idaho Benewah Co. ID03a St. Maries R. at Santa Creek FC 3 CP SSR
    EL148* Idaho Benewah Co. ID03a St. Maries R. at Santa Creek FC SSR
    EL149* Idaho Benewah Co. ID03a St. Maries R. at Santa Creek FC 3 SSR
    EL150 Idaho Benewah Co. ID03b St. Maries R. at Santa Creek FC 3 CP SSR
    EL185 Idaho Adams Co. ID06 Goose Creek FC 6 CP SSR
    EL2005-05 Idaho Adams Co. ID06a Goose Creek FC 4
    EL2005-06 Idaho Adams Co. ID06a Goose Creek FC 3
    EL2005-07 Idaho Adams Co. ID06a Goose Creek FC 3
    EL2005-08 Idaho Adams Co. ID06a Goose Creek FC 4
    EL2005-10 Idaho Adams Co. ID06a Goose Creek FC
    EL2005-11 Idaho Adams Co. ID06a Goose Creek FC 2
    EL158* Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL160* Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL166 Idaho Adams Co. ID06a Goose Creek FC 3 NR, CP SSR
    EL169 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL170 Idaho Adams Co. ID06a Goose Creek FC NR, CP SSR
    EL171 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL174 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL176* Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL2005-12 Idaho Adams Co. ID06b Goose Creek FC 3
    EL2005-13 Idaho Adams Co. ID06b Goose Creek FC 4
    EL180 Idaho Adams Co. ID06b Goose Creek FC 3 CP SSR
    EL181 Idaho Adams Co. ID06b Goose Creek FC SSR
    EL183 Idaho Adams Co. ID06b Goose Creek FC CP SSR
    EL194 Idaho Lemhi Co. ID15 US 93, milepost 336 FC CP SSR
    EL195* Idaho Lemhi Co. ID15 US 93, milepost 336 FC CP SSR
    EL196* Idaho Lemhi Co. ID15 US 93, milepost 336 FC CP SSR
    EL197* Idaho Lemhi Co. ID15 US 93, milepost 336 FC 5 NR, CP SSR
    EL198 Idaho Lemhi Co. ID15 US 93, milepost 336 FC SSR
    EL190 Idaho Lemhi Co. ID16 US 93 FC 5 CP SSR
    EL191 Idaho Lemhi Co. ID16 US 93 FC 3 CP SSR
    EL193* Idaho Lemhi Co. ID16 US 93 FC SSR
    EL189 Idaho Lemhi Co. ID17 US 93 FC CP SSR
    EL121 Idaho Nez Perce Co. ID20 Hwy 3 FC 3 NR, CP SSR
    EL123 Idaho Nez Perce Co. ID20 Hwy 3 FC 3 CP SSR
    EL124 Idaho Nez Perce Co. ID20 Hwy 3 FC SSR
    EL125 Idaho Nez Perce Co. ID20 Hwy 3 FC SSR
    EL126 Idaho Nez Perce Co. ID20 Hwy 3 FC SSR
    EL127 Idaho Nez Perce Co. ID20 Hwy 3 FC 3 SSR
    EL128* Idaho Nez Perce Co. ID20 Hwy 3 FC 3 CP SSR
    EL129A Idaho Nez Perce Co. ID20 Hwy 3 FC SSR
    EL130* Idaho Nez Perce Co. ID20 Hwy 3 FC 2 CP SSR
    EL130A* Idaho Nez Perce Co. ID20 Hwy 3 FC SSR
    EL024 Montana Powell Co. MT1a Big Nelson Campground FC CP SSR
    TD2001-40 Montana Lake Co. MT3 Tower Road 2 SSR
    TD2001-41 Montana Lake Co. MT3 Tower Road 4 NR, CP SSR
    TD1618 Montana Powell Co. MT2 Kleinschmidt Flat FC SSR
    EL031 Montana Powell Co. MT2 Kleinschmidt Flat FC CP SSR
    EL032 Montana Powell Co. MT2a Kleinschmidt Flat FC 6 NR, CP SSR
    EL034 Montana Powell Co. MT2a Kleinschmidt Flat FC 6 SSR
    EL035 Montana Powell Co. MT2a Kleinschmidt Flat FC CP SSR
    TD2001-36 Montana Powell Co. MT2b Kleinschmidt Flat FC 2 CP SSR
    TD2001-37 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL025* Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL038 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL039 Montana Powell Co. MT2b Kleinschmidt Flat FC 6 NR, CP SSR
    EL040 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL046 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    TD2001-21 Montana Lake Co. MT3 Tower Road SSR
    TD2007-22 Ontario Thunder Bay ON18 City of Thunder Bay FC SSR
    NT325 Ontario Thunder Bay ON18 City of Thunder Bay FC 4
    NT326 Ontario Thunder Bay ON18 City of Thunder Bay FC 6
    TD1416 Ontario Grey Co. ON20 Colpoy's Range FC 3 SSR
    EL010 Ontario Bruce ON20 Colpoy's Range FC SSR
    EL011 Ontario Bruce Co. ON20 Colpoy's Range FC NR, CP SSR
    EL012* Ontario Bruce Co. ON20 Colpoy's Range FC 4 SSR
    EL013* Ontario Bruce Co. ON20 Colpoy's Range FC CP SSR
    EL014* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL015* Ontario Bruce Co. ON20 Colpoy's Range FC NR, CP SSR
    EL016* Ontario Bruce Co. ON20 Colpoy's Range FC 3 CP SSR
    EL017* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL018* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL019* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL020* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL021* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL023* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    EL007 Ontario Bruce Co. ON20 Colpoy's Range FC 2 CP SSR
    EL008 Ontario Bruce Co. ON20 Colpoy's Range FC CP SSR
    EL009 Ontario Bruce Co. ON20 Colpoy's Range FC 3 CP SSR
    NT189* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    NT190* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    NT191* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    NT192* Ontario Bruce Co. ON20 Colpoy's Range FC 2 CP SSR
    NT194* Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    RE009 Ontario Bruce Co. ON20 Colpoy's Range FC SSR
    TD2003-76 Ontario Bruce Co. ON21 Barrow Bay FC 3 SSR
    TD2003-78 Ontario Bruce Co. ON21 Barrow Bay FC SSR
    EL004 Ontario Bruce Co. ON21 Barrow Bay FC SSR
    EL005 Ontario Bruce Co. ON21 Barrow Bay FC CP SSR
    RL2003-16 Oregon Wheeler Co. Fossil FC
    RL2003-17 Oregon Wheeler Co. Hwy 19 FC
    RL2003-20 Oregon Grant Co. John Day River, South Fork FC
    EL2005-18 Washington Kittitas Co. WA20a Cle Elum FC 4
    EL2005-19 Washington Kittitas Co. WA20a Cle Elum FC 5
    EL2005-20 Washington Kittitas Co. WA20a Cle Elum FC 5
    EL2005-21 Washington Kittitas Co. WA20a Cle Elum FC 6
    EL2005-22* Washington Kittitas Co. WA20a Cle Elum FC 4
    SS07-03 Washington Chelan Co. WA-SS1 FC NR
    SS07-04 Washington Chelan Co. WA-SS2 FC
    SS07-05 Washington Chelan Co. WA-SS3 FC
    PZ18390 Washington Whitman Co. Staley Rd. FC
    EL2005-26* Washington Okanagan Co. WA24 Ellisford FC
    EL2005-27* Washington Okanagan Co. WA24 Ellisford FC
    TD2001-14 Idaho Lemhi Co. ID15 US 93, milepost 336 FC CP SSR
  C. okennonii
    EL151 Washington Whitman Co. WA22 US 195 FC 2 CP SSR
    EL152 Washington Whitman Co. WA22 US 195 FC CP SSR
    EL153 Washington Whitman Co. WA22 US 195 FC 2 CP SSR
    EL154* Washington Whitman Co. WA22 US 195 FC CP SSR
    EL155* Washington Whitman Co. WA22 US 195 FC 2 NR, CP SSR
    PZ18389 Washington Whitman Co. Pullman FC SSR
  C. shuswapensis
    EL2005-28 British Columbia BC3 Enderby Indian Reserve #2 FC 6
  C. suksdorfii
    EL2006-15 California Siskiyou Co. CAR5 Scott Valley FC 6 CP SSR
    EL2006-16 California Siskiyou Co. CAR5 Scott Valley FC 5 NR, CP SSR
    EL2006-17 California Siskiyou Co. CAR5 Scott Valley FC 4 CP SSR
    EL2006-18 California Siskiyou Co. CAR5 Scott Valley FC 5 SSR
    EL2006-19 California Siskiyou Co. CAR5 Scott Valley FC 4 SSR
    EL2006-20 California Siskiyou Co. CAR5 Scott Valley FC 4 CP SSR
    EL2006-21 California Siskiyou Co. CAR5 Scott Valley FC 4 SSR
    EL2006-22 California Siskiyou Co. CAR5 Scott Valley FC 3 NR, CP SSR
    EL142 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 4 SSR
    EL144 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 5 SSR
    EL145 Idaho Latah Co. ID02 Little Boulder Creek Campground FC 5 SSR
    EL146 Idaho Latah Co. ID02 Little Boulder Creek Campground FC SSR
    EL2005-02 Idaho Valley Co. ID05 Payette Lake FC 3
    EL2005-03 Idaho Valley Co. ID05 Payette Lake FC 5
    EL186 Idaho Valley Co. ID05 Payette Lake FC CP SSR
    EL187 Idaho Valley Co. ID05 Payette Lake FC CP SSR
    EL188 Idaho Valley Co. ID05 Payette Lake FC 4 NR, CP SSR
    EL184 Idaho Adams Co. ID06 Goose Creek FC CP SSR
    EL2005-4 Idaho Adams Co. ID06a Goose Creek FC 4
    EL2005-9 Idaho Adams Co. ID06a Goose Creek FC 5
    EL162 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL163 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL164 Idaho Adams Co. ID06a Goose Creek FC 5 SSR
    EL165 Idaho Adams Co. ID06a Goose Creek FC NR, CP SSR
    EL167 Idaho Adams Co. ID06a Goose Creek FC SSR
    EL168 Idaho Adams Co. ID06a Goose Creek FC SSR
    EL172 Idaho Adams Co. ID06a Goose Creek FC NR, CP SSR
    EL173 Idaho Adams Co. ID06a Goose Creek FC NR, CP SSR
    EL175 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL178 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL179 Idaho Adams Co. ID06a Goose Creek FC CP SSR
    EL156* Idaho Adams Co. ID06b Goose Creek FC SSR
    EL157* Idaho Adams Co. ID06b Goose Creek FC CP SSR
    EL159* Idaho Adams Co. ID06b Goose Creek FC CP SSR
    EL161* Idaho Adams Co. ID06b Goose Creek FC CP SSR
    EL177* Idaho Adams Co. ID06b Goose Creek FC CP SSR
    EL182 Idaho Adams Co. ID06b Goose Creek FC CP SSR
    TD1611 Montana Powell Co. MT1.1 Dry Creek SSR
    EL2005-1 Montana Powell Co. MT2 Kleinschmidt Flat FC 4 CP
    TD1619 Montana Powell Co. MT2 Kleinschmidt Flat FC 4 CP SSR
    EL026* Montana Powell Co. MT2 Kleinschmidt Flat FC NR, CP SSR
    EL027* Montana Powell Co. MT2 Kleinschmidt Flat FC CP SSR
    EL028* Montana Powell Co. MT2 Kleinschmidt Flat FC SSR
    EL029 Montana Powell Co. MT2 Kleinschmidt Flat FC SSR
    EL030 Montana Powell Co. MT2 Kleinschmidt Flat FC NR, CP SSR
    EL033 Montana Powell Co. MT2a Kleinschmidt Flat FC 6 SSR
    EL224* Montana Powell Co. MT2a Kleinschmidt Flat FC 3 SSR
    EL225 Montana Powell Co. MT2a Kleinschmidt Flat FC 2 SSR
    EL226 Montana Powell Co. MT2a Kleinschmidt Flat FC 2 SSR
    EL228* Montana Powell Co. MT2a Kleinschmidt Flat FC 4 SSR
    TD2001-38 Montana Powell Co. MT2b Kleinschmidt Flat FC SSR
    TD2001-39 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL036 Montana Powell Co. MT2b Kleinschmidt Flat FC NR, CP SSR
    EL037 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL041 Montana Powell Co. MT2b Kleinschmidt Flat FC CP SSR
    EL043 Montana Powell Co. MT2b Kleinschmidt Flat FC 5 CP SSR
    EL044 Montana Powell Co. MT2b Kleinschmidt Flat FC 5 CP SSR
    EL045 Montana Powell Co. MT2b Kleinschmidt Flat FC 9 NR, CP SSR
    EL047* Montana Powell Co. MT2b Kleinschmidt Flat FC SSR
    TD2001-27 Montana Lake Co. MT5 US 93 CP SSR
    EL068* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC 4 CP SSR
    EL068A* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC CP SSR
    EL068B* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC CP SSR
    EL068C* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC CP SSR
    EL068D* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC SSR
    EL069* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC CP SSR
    EL070* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC 3 NR, CP SSR
    EL071* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC CP SSR
    EL072* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC 2 NR, CP SSR
    EL075* Oregon Linn Co. OR01 Cogswell-Foster Reserve FC 5 NR, CP SSR
    RL2003-35 Oregon Douglas Co. OR04 Elk Meadows Research Natural Area FC 4
    RL2003-36 Oregon Douglas Co. OR04 Elk Meadows Research Natural Area FC
    RL2003-37 Oregon Douglas Co. OR04 Elk Meadows Research Natural Area FC
    RL2003-38 Oregon Douglas Co. OR04 Elk Meadows Research Natural Area FC
    RL2003-39 Oregon Douglas Co. OR04 Elk Meadows Research Natural Area FC
    EL048 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL049 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL050 Oregon Lane Co. OR06 Patterson Mt. Prairie FC NR, CP SSR
    EL051 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL052 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL053 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL054 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL055 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL056 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL057 Oregon Lane Co. OR06 Patterson Mt. Prairie FC NR, CP SSR
    EL058 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL059 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL060 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL061 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL062 Oregon Lane Co. OR06 Patterson Mt. Prairie FC NR, CP SSR
    EL063 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL064 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL065 Oregon Lane Co. OR06 Patterson Mt. Prairie FC NR, CP SSR
    EL066 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    EL067 Oregon Lane Co. OR06 Patterson Mt. Prairie FC CP SSR
    PZ18477 Oregon Columbia Co. OR11 Sauvie Island SSR
    EL103 Oregon Columbia Co. OR11 Sauvie Island FC 8 CP SSR
    EL104 Oregon Columbia Co. OR11 Sauvie Island FC 5 NR, CP SSR
    EL105 Oregon Columbia Co. OR11 Sauvie Island FC CP SSR
    EL106 Oregon Columbia Co. OR11 Sauvie Island FC 6 SSR
    EL107 Oregon Columbia Co. OR11 Sauvie Island FC CP SSR
    EL109 Oregon Columbia Co. OR11 Sauvie Island FC CP SSR
    EL110* Oregon Columbia Co. OR11 Sauvie Island FC 3 CP SSR
    EL111* Oregon Columbia Co. OR11 Sauvie Island FC SSR
    EL112* Oregon Columbia Co. OR11 Sauvie Island FC SSR
    EL113* Oregon Columbia Co. OR11 Sauvie Island FC CP SSR
    EL114* Oregon Columbia Co. OR11 Sauvie Island FC CP SSR
    EL115* Oregon Columbia Co. OR11 Sauvie Island FC NR, CP SSR
    PZ18477A Oregon Columbia Co. OR11 FC 8 SSR
    PZ18477B Oregon Columbia Co. OR11 FC SSR
    EL2005-16 Oregon Columbia Co. OR11a Sauvie Island FC SSR
    EL2005-17 Oregon Columbia Co. OR11a Sauvie Island FC 5 SSR
    PZ18483 Oregon Washington Co. Hillsboro FC SSR
    PZ2003-10 Oregon Lane Co. West Eugene, Bertelson Rd. FC 3 SSR
    PZ2003-11 Oregon Lane Co. West Eugene, Bertelson Rd. FC SSR
    PZ2003-14 Oregon Lane Co. East Eugene, Eldon Shafer Dr. FC SSR
    PZ2003-15 Oregon Lane Co. East Eugene, Eldon Shafer Dr. FC SSR
    PZ2003-24 Oregon Lane Co. West Eugene, Bertelson Rd. FC SSR
    PZ18995 Oregon Lewis Co. FC SSR
    PZ19024 Oregon Cowlitz Co. FC SSR
    PZ18481 Oregon Columbia Co. FC SSR
    EL2005-23 Washington Kittitas Co. WA20a Cle Elum FC 3
    EL2005-24 Washington Kittitas Co. WA20a Cle Elum FC 4
    PZ18485 Washington Clark Co. WA7 FC SSR
    PZ18486 Washington Clark Co. WA7 FC SSR
    PZ18485A Washington Clark Co. WA7 FC SSR
    PZ18485B Washington Clark Co. WA7 FC SSR
    PZ18485C Washington Clark Co. WA7 FC NR SSR
    PZ18485D Washington Clark Co. WA7 FC SSR
    PZ18485E Washington Clark Co. WA7 FC SSR
    PZ18485F Washington Clark Co. WA7 FC SSR
    PZ18485G Washington Clark Co. WA7 FC SSR
    PZ18485H Washington Clark Co. WA7 FC SSR
Cerrones
  C. erythropoda
    NT327 Colorado Boulder NTCO01 Chautauqua Park FC 3 SSR
    NT336 Colorado Boulder NTCO01 Chautauqua Park FC SSR
    NT341 Colorado Boulder NTCO01 Chautauqua Park FC SSR
    NT334 Colorado Boulder NTCO02 Beach Park FC SSR
    NT333 Colorado Boulder NTCO03 CU campus FC SSR
    NT349 Colorado Boulder NTCO05 Mouth of Gregory Canyon FC SSR
    NT351 Colorado Boulder NTCO07 Sawhill ponds FC SSR
    NT352 Colorado Boulder NTCO07 Sawhill ponds FC SSR
    NT353 Colorado Boulder NTCO08 Bobolink trail FC SSR
    NT366 Colorado Boulder NTCO09 Bear Creek Park FC SSR
    NT355 New Mexico Rio Arriba NTNM01 US84 FC SSR
    NT356 New Mexico Rio Arriba NTNM01 US84 FC SSR
    NT358 New Mexico Rio Arriba NTNM01 US84 FC SSR
    NT379* NTCO01 FC 3 SSR
    NT378* NTCO05 FC SSR
  C. rivularis
    NT373 Colorado Montrose NTCO15 East of Cimarron FC SSR
    NT376 Colorado Archuleta NTCO17 Rte 160 FC 3 SSR
    NT377 Colorado Boulder NTCO18 Cultivated on 8th Avenue, Boulder FC SSR
    TD2001-10 Idaho Bear Lake Co. ID13 W of Whitman Hollow, Geneva FC SSR
    TD2001-11 Idaho Bear Lake Co. ID13 W of Whitman Hollow, Geneva FC SSR
    EL199 Idaho Bear Lake Co. ID13 W of Whitman Hollow, Geneva FC 3 SSR
    EL200* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL201* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL202* Idaho Bear Lake Co. ID13a Montpelier Canyon FC 3 SSR
    EL203 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL204 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL205 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL206 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL207 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL208* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL209* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL210 Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL211* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL212* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    EL213* Idaho Bear Lake Co. ID13a Montpelier Canyon FC SSR
    TD2007-03 Nevada Elko Co. NV2 Thorpe Creek, Lamoille Valley FC SSR
    TD2007-01 Nevada Elko Co. NV3 Starr Valley FC SSR
    TD2007-02 Nevada Elko Co. NV3 Starr Valley FC SSR
    NT354 New Mexico Rio Arriba NTNM01 US84 FC SSR
    NT357 New Mexico Rio Arriba NTNM01 US84 FC SSR
    TD2001-06 Utah Uintah Co. UT6a UT 121 FC 2 SSR
    TD2001-05 Utah Uintah Co. UT6b UT 121 FC 2 SSR
    TD2001-42 Wyoming Converse Co. WY1 North Platte River FC SSR
    TD2001-43 Wyoming Converse Co. WY1 North Platte River FC SSR
  C. saligna
    TD2001-07 Colorado Rio Blanco Co. CO1 Rio Blanco Rd 8, N bank of White R. FC SSR
    TD2004-08 Colorado Rio Blanco Co. CO1 Rio Blanco Rd 8, N bank of White R. FC 2 SSR
    TD2004-06 Colorado Rio Blanco Co. CO6 CO64 FC SSR
    NT368 Colorado Gunnison NTCO12 Gunnison River FC SSR
    NT369 Colorado Gunnison NTCO12 Gunnison River FC SSR
    NT370 Colorado Gunnison NTCO12 Gunnison River FC 4 SSR
    NT371 Colorado Gunnison NTCO13 West of Gunnison FC 2 SSR
    NT372 Colorado Gunnison NTCO14 Neversink Picnic ground FC SSR
    TD2004-05 Utah Duchesne Co. UT5 River Road FC SSR

Appendix 7 (Chapter 7). Locality and source of C. crus-galli and C. punctata seed samples examined in this study. All seed samples were collected in the fall of 2005. Because of the unambiguous identification of the two taxa at our study sites (and the fact that our collections were made when leaves and fruits were abscising), voucher specimens were collected for only a few representative individuals, and these are deposited in the Green Plant Herbarium (TRT) at the Royal Ontario Museum. Additional vouchers of C. crus-galli of ON04 can be found in the herbarium at The University of Western Ontario (UWO).

Species; Province; County; Locality; ID
  Tree accession: No. of seeds

C. crus-galli; Ontario; Niagara; Fort George National Historic Park; ON04
  SN004: 3
  SN005: 7
  SN006: 5
  SN007: 5
  SN009: 5
  SN010: 5
  SN013: 5
  SN014: 5
  SN015: 4
  SN017: 4
  SN019: 5
  SN020: 4
  SN021: 5
  SN022: 5
  SN023: 4
  SN024: 5
  SN025: 5
  SN026: 2
  Total: 18 trees, 83 seeds

C. punctata; Ontario; Perth; East side of Motherwell; ON46
  SN027: 3
  SN029: 3
  SN030: 5
  SN031: 2
  SN032: 6
  SN033: 5
  SN035: 3
  SN036: 4
  SN037: 4
  SN038: 5
  SN039: 2
  SN040: 5
  SN041: 5
  SN042: 5
  SN044: 5
  SN045: 4
  SN046: 3
  SN047: 4
  SN048: 5
  SN049: 5
  SN050: 2
  SN051: 5
  SN052: 4
  SN053: 5
  SN054: 5
  SN055: 4
  SN056: 5
  SN057: 5
  Total: 28 trees, 118 seeds