Genetic Diversity and Phylogeographic Structure of the Parasitic Plant Genus Conopholis (Orobanchaceae): Implications for Systematics and Post-glacial Colonization of North America

by

Anuar Gregory Rodrigues

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, Ecology and Evolutionary Biology, University of Toronto

© Copyright by Anuar Gregory Rodrigues 2013

Genetic Diversity and Phylogeographic Structure of the Parasitic Plant Genus Conopholis (Orobanchaceae): Implications for Systematics and Post-glacial Colonization of North America

Anuar Gregory Rodrigues

Doctor of Philosophy

Ecology and Evolutionary Biology, University of Toronto

2013

Abstract

Parasitism in plants is often accompanied by a suite of morphological and physiological changes resulting in a condition known as the ‘parasitic reduction syndrome’. With changes including extreme vegetative reduction, frequently beyond any resemblance to its photosynthetic relatives, accompanied by significant losses of genes linked to photosynthesis, the study of parasitic plants can be challenging. Conopholis (Orobanchaceae) is a small holoparasitic genus distributed across eastern and southwestern North America and Central America. This genus has never been the subject of a molecular phylogenetic or morphometric analysis. In addition, very little is known of the relationships among populations and of their post-glacial history.

To investigate the species limits and phylogenetic relationships in Conopholis, we conducted a comprehensive molecular phylogenetic study of the genus as well as a fine-scale morphometric study. Based on plastid and nuclear sequences, Conopholis was found to contain three distinct and well-supported lineages which have varying degrees of overlap with previously proposed taxa. The clustering and ordination analyses of the morphometric study corroborated the molecular data, demonstrating the morphological differentiation between the three lineages detected within Conopholis. A taxonomic re-alignment is proposed for the genus that recognizes three species, C. americana, C. panamensis, and C. alpina.

To address genetic diversity and phylogeographic structure of C. americana in eastern North America, microsatellite markers were developed and characterized for the first time in this species. Using these newly generated markers along with sequences from the plastid genome, the persistence of a minimum of two glacial refugia at the last glacial maximum was inferred, one in Florida and southern Alabama and another in the Appalachian Mountains near the southern tip of the Blue Ridge Mountains. The diversity seen across the southern Appalachian Mountains supports the hypothesis that populations derived from the southern and northern refugia come together in this area.

Acknowledgments

I would like to express my deepest appreciation to my supervisor, Saša Stefanović. Thank you for all the guidance and support you have given me over these past years and for always challenging me every step of the way. Thank you for everything Saša. I also wish to acknowledge my committee members Dr. John Stinchcombe and Dr. Tim Dickinson for all their support, guidance, and insight. Tim, thank you very much for all the assistance and encouragement during the time spent in your lab obtaining data for morphometric analyses. The support received from my supervisor and committee members was invaluable and greatly appreciated. In addition, I would also like to thank the extended members of my examination committee, Dr. James Eckenwalder, Dr. Peter Kotanen, and external examiner Dr. Daniel Nickrent.

I would like to thank all of my colleagues in the Stefanović lab for their support over the years and for making the lab environment an excellent one. Thank you Thomas Braukmann, Dr. Eugenio Lo, Dr. Maria Kuzmina, Michael Wright, Shana Shaya, and Lily Xiao. Thomas, we began our degree together as strangers, but I know we will be friends for life. To all my friends at the UTM campus, both students and staff, thank you for all the coffees, laughs, and great memories over the years.

To my mother who has always been my biggest fan, thank you for always believing in me, pushing me to dream big and supporting me in all my decisions. You have given me so much for which I am eternally grateful and I look forward to always making you proud! Finally, to Adam who has been my rock and constant source of support, love, and patience. Thank you for always understanding and for supporting me through the many highs and lows while I was completing my degree. I have always told you, words will never be able to describe how thankful and appreciative I am for absolutely everything you have given me and done for us.

Table of Contents

Acknowledgments

Table of Contents

List of Tables

List of Figures

List of Appendices

1 Overview
1.1 Parasitism in plants
1.2 Consequences of parasitism in plants
1.3 The parasitic plant family Orobanchaceae and its holoparasitic genus Conopholis
1.4 Project description

2 Molecular systematics of the parasitic genus Conopholis (Orobanchaceae) inferred from plastid and nuclear sequences
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.3.1 Taxon sampling
2.3.2 DNA extraction, amplification, and sequencing
2.3.3 Phylogenetic analyses
2.3.4 Network analyses
2.3.5 Parsimony analyses
2.3.6 Bayesian analyses
2.3.7 Evaluation of the rooting
2.3.8 Testing of alternative topologies
2.4 Results
2.4.1 DNA regions and alignments
2.4.2 Individual data set analyses
2.4.3 Phylogenetic analyses of combined data
2.4.4 Molecular clock and placement of the root
2.4.5 Tests of alternate topologies
2.5 Discussion
2.5.1 Phylogenetic and taxonomic implications
2.5.2 Historical biogeography
2.6 Conclusions
2.7 Acknowledgements

3 Morphometric analyses and taxonomic revision of the North American holoparasitic genus Conopholis (Orobanchaceae)
3.1 Abstract
3.2 Introduction
3.3 Materials and Methods
3.3.1 Taxon Sampling
3.3.2 Morphology and Morphometric Analysis
3.4 Results
3.5 Discussion
3.6 Taxonomic Treatment
3.7 Key to species of Conopholis
3.8 Acknowledgements

4 Development and characterization of polymorphic microsatellite markers for Conopholis americana (Orobanchaceae)
4.1 Abstract
4.2 Introduction
4.3 Methods and Results
4.4 Conclusion
4.5 Acknowledgements

5 Present-day genetic structure of the holoparasite Conopholis americana (Orobanchaceae) in eastern North America and location of its refugia during the last glacial cycle
5.1 Abstract
5.2 Introduction
5.3 Materials and Methods
5.3.1 Taxon Sampling and DNA Extraction
5.3.2 Plastid clpP Sequencing
5.3.3 Microsatellite Genotyping
5.3.4 Analyses of Population Diversity and Structure
5.3.5 Distribution Modelling
5.4 Results
5.4.1 Plastid sequencing and analysis
5.4.2 Microsatellite genotyping and analysis
5.4.3 Distribution modelling
5.5 Discussion
5.6 Conclusion
5.7 Acknowledgements

References

Appendix I

Appendix II

Publications

List of Tables

Table 2-1
Table 2-2 (Appendix 1)
Table 3-1
Table 3-2 (Supplemental Appendix 1)
Table 4-1
Table 4-2
Table 4-3 (Appendix 1)
Table 5-1

List of Figures

Figure 2-1
Figure 2-2
Figure 2-3
Figure 2-4
Figure 2-5 (Supplemental Figure S1)
Figure 2-6 (Supplemental Figure S2)
Figure 3-1
Figure 3-2
Figure 3-3
Figure 3-4
Figure 3-5
Figure 5-1
Figure 5-2
Figure 5-3

List of Appendices

Appendix I - AFLP
Appendix II - ITS

1 Overview

1.1 Parasitism in plants

Photosynthesis is a set of physicochemical processes used to acquire energy by fixing atmospheric carbon into organic compounds using the energy from sunlight. Photosynthesis occurs in terrestrial plants, algae, and cyanobacteria, which combined are responsible for the conversion of roughly 100-115 petagrams of carbon into biomass per year (Field et al., 1998).

The majority of land plants encountered are autotrophic, that is, they are able to fix their own carbon via photosynthesis. In contrast, heterotrophic plants have partially or completely abandoned photosynthesis and are only partly able, or entirely unable, to produce their own food. As a result, such plants obtain some or all of their water and carbohydrates from other organisms.

Heterotrophic plants show a wide range of evolutionary degradation of photosynthetic capability and are usually divided into two major categories, mycoheterotrophic and haustorial parasitic plants. Mycoheterotrophs utilize mycorrhizal fungi and their symbiotic network to connect to their host in order to acquire nutrients and water (Leake, 1994). The mycorrhizal fungus acts as a bridge between the green plant and the mycoheterotroph, where nutrients (carbon, nitrogen, etc.) flow from the host plant root, to the mycorrhizal fungus, and ultimately to the mycoheterotroph. This trophic relationship has been demonstrated using ¹⁴CO₂ tracers (McKendrick et al., 2000).

In contrast to mycoheterotrophs, haustorial parasites form a direct physical connection to their hosts via the haustorium, a specialized organ that allows the parasite access to the host vascular system from which they acquire water, carbon, and other nutrients.

Approximately 4500 species (~1%) of flowering plants are parasitic (Nickrent, 2013), that is, they derive some or all of their nutrients and water from other plants (hosts). Among angiosperms, haustorial parasitism is thought to have evolved independently a minimum of 12 times (Nickrent, 2002; Barkman et al., 2007; Davis et al., 2007) across the angiosperm phylogeny (Boraginaceae, Cucurbitales, Ericales, Krameriaceae, Lamiales, Laurales, Malpighiales, Malvales, Piperales, Santalales, Saxifragales, and Solanales).

The degree to which a parasite depends on its host varies. Facultative parasites can live and reproduce in the absence of a host; however, they are able to parasitize neighboring plants should a host be present. On the other hand, obligate parasites must attach to a host to complete their life cycle. Parasitic plants can also be categorized based on their photosynthetic ability. Hemiparasites retain the ability to photosynthesize but can still obtain nutrients from their host plant. Holoparasites lack photosynthetic ability and are thus obligately dependent on their host for all nutritional needs (Nickrent, 2002).

1.2 Consequences of parasitism in plants

Two major hypotheses address how plants switched from an autotrophic mode of life to a heterotrophic one. The evolutionary transition series supports a gradual change from autotrophs to holoparasites via facultative and obligate hemiparasites. The expectation is that the changes observed leading to the evolution of holoparasitism are phylogenetically progressive (Boeshore, 1920; Young et al., 1999). Alternatively, the punctuated equilibrium hypothesis suggests that sudden evolutionary modifications occur following long periods of a steady state during which no, or relatively few, events occur (Young et al., 1999; McNeal et al., 2007).

In addition to the presence of a haustorium, the evolution of holoparasitism is frequently accompanied by a suite of morphological and physiological changes, including the loss or near loss of chlorophyll and vegetative structures such as stems, leaves, and roots, resulting in a condition known as the ‘parasitic reduction syndrome’ (Colwell, 1994). Many root hemiparasites retain expanded leaves with green pigmentation, thus making it difficult to recognize them as parasites. However, as the degree of parasitism increases and moves towards holoparasitism, the profound changes seen in morphology become more apparent. Leaves become reduced to scales and the loss of green pigmentation is evident throughout the plant body. Due to their complete reliance on their host for nutrition, the vast majority of holoparasites only need to emerge from the soil to reproduce (an exception being Hydnora triceps, whose flowers are subterranean; Nickrent et al., 2002). As a result, they can spend several years underground before floral stalks emerge above ground. For the most part, the reproductive structures of these parasites remain little affected, and provide a broad idea about phylogenetic relationships with their respective green relatives.

Parasitic plants provide remarkable examples of convergent evolution, which occurs when species or genera that have no close phylogenetic relationship develop similar characters through the action of similar selective forces (Heide-Jorgensen, 2008, p. 398). The extreme reduction in morphology of these plants plays a significant role in achieving the similarities observed in structure and function between unrelated species. A classical example of convergent evolution in vegetative structures of parasitic plants is that between Cuscuta (Convolvulaceae) and Cassytha (Lauraceae) (Nickrent et al., 1998). The resemblance between the two is extraordinary. They are both vines that wind around stems, possess greatly reduced leaves, and produce haustoria that emerge laterally from their stems. Their floral morphology and stem circumnutation, however, can be used to differentiate the two, for indeed they belong to two distantly related plant families. Another example of convergence among these plants is the reduction of the embryo, endosperm, and seed size that is seen in many holoparasites.

Besides the changes in morphology described above, the plastid genomes of parasites are also significantly reduced in size. Accompanying the transition to holoparasitism is the relaxation of evolutionary pressure and functional constraints that are associated with the vital function of photosynthesis in plants. As a result, the plastid genomes of holoparasitic species exhibit a wide range of molecular modifications and gene losses. The losses of photosynthetic genes (such as rbcL, photosystem I, and photosystem II; Delavault et al., 1996) and genes involved in the chlororespiratory pathways (ndh genes; dePamphilis and Palmer, 1990) are presumably a consequence of relaxed or absent selection for maintenance of these genes associated with light harvesting in these non-photosynthetic plants. The similarity that exists between distantly related nonphotosynthetic organisms in their plastid gene content (protein encoding and tRNA genes) is suggestive of convergence toward a distinct shared gene set (Delannoy et al., 2011).

1.3 The parasitic plant family Orobanchaceae and its holoparasitic genus Conopholis

Orobanchaceae (as redefined by Young et al., 1999; Olmstead et al., 2001; APG III, 2009) represents one of the largest and most prominent groups of parasitic plants and contains approximately one-half of all known parasitic angiosperms, circumscribed in some 90 genera (Nickrent, 2013). The Orobanchaceae is near-cosmopolitan in distribution, but is most diversified in temperate regions. The family includes the entire range of trophic abilities, where aside from members of a few small non-parasitic genera such as Lindenbergia, Rehmannia, and Triaenophora (Jensen et al., 2008; Albach et al., 2009; Xia et al., 2009), all other species in the Orobanchaceae are hemi- and holoparasites.

The most comprehensive phylogenetic analysis to date of Orobanchaceae was published by McNeal et al. (2013) using a multilocus data set (nuclear ITS, PHYA, PHYB, plastid matK and rps2). Their study obtained good taxon sampling (54 of the 90 ingroup genera in the family), covering various taxonomic ranks and geographic ranges, and provided the highest levels of support for naming subfamilial clades within the family as compared to past studies (Young et al., 1999; Olmstead et al., 2001; Wolfe et al., 2005; Bennett and Mathews, 2006). Previously, transitions from hemiparasitism to holoparasitism within this family were hypothesized to have evolved independently at least five times (Young and dePamphilis, 2005; Wolfe et al., 2005; Bennett and Mathews, 2006) for the approximately 20 holoparasitic genera in the family. However, the most recent study showed strong support for three independent evolutionary origins of holoparasitism from hemiparasitism.

In order to address topics involving the systematics, speciation, biogeography, and molecular evolution in these plants, I focused on the holoparasite genus Conopholis Wallr. Members of the genus are perennial, achlorophyllous, obligate root parasites (Kuijt, 1969; Haynes, 1971). The genus is distributed throughout eastern and southwestern North America and Central America. Very little is known of the relationships among species and populations as well as their post-glacial history.

In 1971, Robert Haynes proposed a taxonomic classification of the genus based on geographic distribution, morphology, reproductive isolation, and host specificity. He considered Conopholis as being composed of two species, C. americana and C. alpina, with the latter being divided into two varieties, C. alpina var. alpina and var. mexicana. His classification was based on a combination of discrete (presence/absence) characters such as venation of scales along with a number of quantitative characters such as the size and relative proportions of bracts and the shape of the calyx. However, each of the traits Haynes (1971) used showed a range of variation that overlapped between the species. To him, the two species were indeed morphologically distinct, yet “No single character can be relied upon to determine all specimens encountered…” (p. 252). To get around this problem, he suggested that one needed to consider several characters in combination to correctly identify a particular specimen to species. Nevertheless, the two species were separated because of their partial morphological differences, and perhaps more importantly, their geographic isolation and apparent host specificity (Haynes, 1971). Prior to the publications arising from this thesis, there was no molecular phylogeny for the genus utilizing populations spanning the entire geographic distribution and morphological ranges of the species.

1.4 Project description

The topics discussed in this thesis are broad yet integrated, with a primary goal aimed at expanding our current understanding of the relationships between species and populations within a holoparasitic genus. It describes the various approaches used and results obtained that address topics involving phylogenetic relationships, morphological distinction, genetic diversity, speciation, and biogeography in Conopholis. Chapter Two is aimed at developing a well-supported molecular phylogeny for Conopholis so as to assess the species limits proposed by Haynes (1971). This is the first molecular phylogenetic study of the genus and it includes samples accounting for the entire morphological variation and geographic range of the genus in North America. In Chapter Two, the recovery of three well-supported lineages within Conopholis, none of which entirely corresponds to the species proposed by Haynes (1971), as well as the problematic way in which morphological characters are used to assign specimens of Conopholis to a particular species, prompted the morphometric study described in Chapter Three. In this third chapter, I present a morphometric study of the genus that emphasized calyx and bract morphology. Given that the results of the morphometric analyses agreed with the results of Chapter Two, the third chapter also provides a taxonomic revision for Conopholis.

Chapters Four and Five are concerned with the genetic diversity and phylogeographic structure of C. americana populations in eastern North America. In Chapter Four I present the method used to isolate and characterize the first published codominant microsatellite DNA markers for the genus Conopholis. Chapter Five describes the utilization of these fast-evolving and independent markers as well as plastid data to investigate the present-day genetic structure of C. americana in eastern North America. The pattern of genetic diversity was used in conjunction with results from paleodistribution modelling to identify geographic regions where populations may have persisted through the Last Glacial Maximum.

2 Molecular systematics of the parasitic genus Conopholis (Orobanchaceae) inferred from plastid and nuclear sequences

Other than thesis-specific formatting changes, this chapter was previously published as:

Rodrigues, A. G., A. E. L. Colwell, and S. Stefanović. 2011. Molecular systematics of the parasitic genus Conopholis (Orobanchaceae) inferred from plastid and nuclear sequences. American Journal of Botany 98: 896-908.


2.1 Abstract

• Premise of the study: Little is known of the evolutionary relationships within Conopholis, a small holoparasitic genus belonging to the broomrape family. Presently, Conopholis is described as having two species, C. americana and C. alpina. This classification is based on a combination of presence/absence of morphological characters along with a number of quantitative traits. We assessed the relationships among populations and species of this genus to determine whether the present taxonomic hypothesis is reflected in molecular phylogenies.

• Methods: We conducted the first phylogenetic study of Conopholis using plastid (trnfM-E intergenic spacer and clpP gene/introns) and nuclear (PHYA intron 1) sequences from a wide taxonomic sampling covering its entire geographical range in North America. Analyses were carried out using a variety of phylogenetic inference approaches.

• Key results: Reciprocal monophyly between the two traditionally accepted species has not yet been achieved. Instead, three distinct genetic clusters were recovered. Conopholis alpina is clearly paraphyletic and shows evidence of belonging to at least two distinct lineages. Specimens found in Costa Rica and Panama form a distinct group from those located in northern Mexico and the southwestern United States. The monophyly of C. americana was also not recovered; however, the possibility of it being monophyletic could not be rejected with confidence.

• Conclusions: These analyses recovered three distinct lineages indicating that there could be a minimum of three species within the genus. A re-evaluation of morphological features within Conopholis may reveal shared features that could further corroborate our molecular findings.

2.2 Introduction

Orobanchaceae, as redefined by Young et al. (1999), Olmstead et al. (2001), and the Angiosperm Phylogeny Group III (APG III, 2009), is a morphologically diverse family comprised of herbaceous parasitic plants containing approximately one-half of all known parasitic angiosperms (ca. 1800 species), circumscribed in some 90 genera (Nickrent, 2010). With the notable exception of species belonging to several small genera (e.g., Lindenbergia, Rehmannia, and Triaenophora; Jensen et al., 2008; Albach et al., 2009; Xia et al., 2009), all other members of this family are facultative or obligate root parasites. They may either retain varying degrees of photosynthetic capability (hemiparasitic species) or be completely dependent on their host for nutrients and water (holoparasitic species). The evolution of advanced holoparasitism is generally accompanied by extreme reduction or modification, entailing both physiological and morphological changes. These include the loss or reduction of chlorophyll production, photosynthesis, and vegetative structures such as leaves, roots, and branches, along with the gain of haustoria, the organs that enable these plants to connect to their hosts' vascular systems. As a result of this overall reduction in morphological features, also known as the “parasitic reduction syndrome” (Colwell, 1994), holoparasites remain difficult to study from a taxonomic and systematic point of view.

Conopholis is one such morphologically distinct group, one of ca. 20 holoparasitic genera in Orobanchaceae. Members of this genus are obligate perennial achlorophyllous parasites (Kuijt, 1969; Haynes, 1971). The mature plant body consists of several erect flowering stalks arising from a swollen subterranean haustorium (consisting of both parasite and host tissue) that connects the plant to the vascular system of its host (oaks; Baird and Riopel, 1986a). Leaves are reduced to scales and roots are absent. Following 3–4 yr of subterranean tubercle growth, these plants reach reproductive maturity and floral meristems may erupt above ground and produce inflorescences. The mature plant will continue to grow for ca. 10 yr, after which it dies, presumably due to a disruption of the haustorial connection (Baird and Riopel, 1986b).

Populations of Conopholis are best described as locally abundant but rare and isolated, at times separated by many kilometers of forest or other habitat in which no individuals occur. On occasion, bumblebees have been seen visiting populations of Conopholis (Haynes, 1971; Gomez, 1980; Baird and Riopel, 1986b) but the actual pollination by bees has not been confirmed. These plants do not have floral nectaries and are not known to produce scents that may attract bees. Bagging experiments performed to investigate the roles of wind or insects in Conopholis pollination (Baird and Riopel, 1986b) showed only a slight reduction in viable seed set (85%) compared to unbagged controls (87%). In addition, studies of flowers postanthesis have found that the anthers are in physical contact with the stigma. In aggregate, these observations suggest a predominant selfing mode of pollination for Conopholis. Dispersal of seeds occurs either from drying and decomposition of the capsule and the washing away of seeds following periods of rain or following the consumption of the inflorescence by mammals, especially deer (Baird and Riopel, 1986b). The role of ants has not been investigated in detail, but anecdotal observation suggests that their role in both pollination and dispersal of Conopholis seems minimal.

Little is known of the evolutionary relationships among populations and species of Conopholis or of their postglacial history. In the most recent taxonomic classification of this genus, taking into account geographic distribution, morphology, reproductive isolation, and host specificity, Conopholis is described as having two species: C. americana and C. alpina (Haynes, 1971). Conopholis americana parasitizes red oaks (Quercus section Lobatae; Manos et al., 2001) in moist, deciduous or mixed forests and is found across eastern North America, from Florida north to Nova Scotia west to Wisconsin and south to Alabama (Fig. 1). Conopholis alpina parasitizes various oak species (predominantly white oak; Quercus section Quercus) in oak woodlands and mixed montane forests found in southwestern North America. This species is divided into two varieties. Conopholis alpina var. mexicana is found from the Trans-Pecos area in Texas through northern New Mexico and central south down to the Trans-Mexican volcanic belt (TMVB), a large mountain range running from east to west in the central portion of Mexico, located approximately along the 19°N parallel and well-known as a gene-flow barrier for other plant species (Haynes, 1971; Nixon, 1993). The type variety (C. alpina var. alpina) is distributed from this same central area of Mexico south to Costa Rica and Panama (Haynes, 1971; Fig. 1).

Morphologically, the classification proposed by Haynes (1971) is based on a combination of presence/absence of characters along with a number of quantitative traits such as the size and relative proportions of bracts as well as the shape of the calyx. However, Haynes (1971) states that the calyx is the most variable part of the plant, and as a result, differences pertaining to it cannot alone be used as criteria for taxonomic placement of specimens. In his view, the two species of Conopholis are morphologically distinct, yet “No single character can be relied upon to determine all specimens encountered…” (p. 252). Haynes (1971) acknowledged this to be a conundrum, suggesting further that one needs to consider several characters at the same time to identify a specimen to the correct taxon. Nevertheless, he recognized them as two separate species because of their clear present day geographic isolation (>1400 km; Fig. 1) and apparent differential host specificity. This bicentric geographic pattern of Conopholis in North America with an east–west disjunction is also found in other plant groups (e.g., Chamaecyparis; Mylecraine et al., 2004; and Platanus; Feng et al., 2005).

Our overall research on Conopholis was undertaken with several major goals in mind: (1) to test the current taxonomic hypothesis by Haynes and assess the status of the proposed species; (2) to investigate the relationships between species and populations of this genus; (3) to conduct phylogeographic and population level analyses that may shed light on the postglaciation migration pattern(s) of the eastern North American species; (4) to investigate morphological character evolution within the genus and conduct morphometric analyses; and (5) to develop, in conjunction with the re-evaluation of the taxonomic characters used, a comprehensive phylogeny-based classification.

The present study is concerned with the first two of the aforementioned goals, i.e., to develop a well-supported molecular phylogenetic hypothesis for Conopholis and to assess the monophyly of the proposed species. To address these aims, we developed a multilocus molecular data set consisting of both plastid and nuclear DNA sequences. This is the first comprehensive molecular phylogenetic study of Conopholis, accounting for the entire geographic and morphological range of the species in North America.

2.3 Materials and Methods

2.3.1 Taxon sampling

A total of 42 specimens representing the two presently recognized species of Conopholis were sampled in this study. A complete list of species, voucher information, DNA extraction numbers, and approximate locality of sampled populations is provided in Appendix 1. These accessions represent individuals spanning the entire geographic range of the genus (Fig. 1) and all three traditionally described morphological types (C. americana, C. alpina var. alpina, and C. alpina var. mexicana). The names applied to these accessions follow the species delimitations by Haynes (1971), which emphasize geographical distinctions between the species. However, given the morphological variation shown by Conopholis, we also use the phylogenetic species concept (PSC) approach. Unlike morphological or various mechanistic species concepts, the PSC is historically based (Baum and Donoghue, 1995) and uses the criteria of monophyly and exclusivity to define species (de Queiroz and Donoghue, 1990; Baum, 1992; Baum and Shaw, 1995).
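The monophyly criterion can be illustrated with a minimal sketch in Python. The tree below is a hypothetical toy topology written as nested tuples, and the tip labels (out, x1, y1, and so on) are placeholders rather than accessions from this study; the function names are likewise illustrative. A group of tips is treated as monophyletic when some node of the rooted tree subtends exactly that set of tips and no others.

```python
# Toy rooted tree as nested tuples: a string is a tip, a tuple is an internal node.
# Hypothetical topology and labels, used only to illustrate the monophyly test.
TOY_TREE = ("out", (("x1", "x2"), (("y1", "y2"), ("z1", "z2"))))


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def is_monophyletic(node, group):
    """Return True if some node in the tree subtends exactly the tips in `group`."""
    group = set(group)
    if tips(node) == group:
        return True
    if isinstance(node, str):
        return False
    return any(is_monophyletic(child, group) for child in node)


if __name__ == "__main__":
    # The x tips form a clade on their own.
    print(is_monophyletic(TOY_TREE, {"x1", "x2"}))
    # The x and y tips together do not: their smallest enclosing clade also holds the z tips.
    print(is_monophyletic(TOY_TREE, {"x1", "x2", "y1", "y2"}))
```

In this toy case the grouping of the x and y tips fails the test because the smallest clade containing them also includes the z tips, which is the pattern referred to as paraphyly.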

In addition, Epifagus virginiana, represented here by seven accessions (Appendix 1), was chosen as the outgroup, based on the well-supported sister-group relationship of Conopholis and Epifagus resulting from previous broad molecular studies of Orobanchaceae (Nelson et al., 1999; Wolfe et al., 2005; Bennett and Mathews, 2006).

2.3.2 DNA extraction, amplification, and sequencing

Total genomic DNA was extracted from fresh, silica dried, or herbarium material using a modified hexadecyltrimethylammonium bromide (CTAB) technique from Doyle and Doyle (1987) and purified using Wizard minicolumns (Promega, Madison, Wisconsin, USA). The polymerase chain reaction (PCR) was used to obtain the double-stranded DNA fragments of interest. The plastid genome region containing the spacer between the trnfM (CAU) and trnE (UUC) exons (hereafter called trnfM-E) was amplified using the trnfM-r and trnE primers described by Doyle et al. (1992). These two genes are known to be more than 5 kb apart in Nicotiana (Wakasugi et al., 1998). However, in Epifagus, they are closer to each other due to significant intervening deletions in the plastid genome (dePamphilis and Palmer, 1990; Wolfe et al., 1992). Amplicons of the plastid clpP gene and its introns were generated via PCR following Stefanović et al. (2004). Primers used to amplify nuclear encoded phytochrome A (PHYA) sequences in Orobanchaceae (Bennett and Mathews, 2006) were used to generate the first round of Conopholis PHYA sequences. Nested within these sequences, we subsequently designed a new set of primers, specific to Conopholis, targeting the PHYA intron 1 (PHYA F-a678f: GAGATGGTCCGTTTGATTGAG; PHYA R-a787: CGATGAAACATACTCCCACC). For all three of the regions, PCR reactions were carried out in 50-µL volumes with annealing temperatures ranging between 50 and 55°C. Amplified products were cleaned by polyethylene glycol/NaCl precipitations or by Wizard minicolumns (Promega, Madison, Wisconsin, USA).
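As a quick plausibility check on the two Conopholis-specific PHYA primers, their length, GC fraction, and a rule-of-thumb melting temperature can be computed directly from the sequences given above. The sketch below is illustrative only: the function names are hypothetical, and the simple "2 °C per A/T, 4 °C per G/C" rule is a rough approximation usually applied to short oligonucleotides, not the method used by the author to choose annealing temperatures.

```python
# Primer sequences are copied from the text; all helper names are illustrative.
PRIMERS = {
    "PHYA F-a678f": "GAGATGGTCCGTTTGATTGAG",
    "PHYA R-a787": "CGATGAAACATACTCCCACC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Approximate melting temperature: 2 degrees C per A/T, 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 3), rule_of_thumb_tm(seq))
```

Under this approximation both primers come out at about 60°C, so an annealing range of 50 to 55°C sits a few degrees below the estimate, as is common practice.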

To ensure accuracy, we sequenced both strands of cleaned PCR products (for trnfM-E and PHYA, using external primers only; for clpP, two internal primers were used in addition). Cleaned fragments were sequenced using the DYEnamic ET dye terminator sequencing kit (GE Healthcare, Baie-d’Urfe, Quebec, Canada) on an Applied Biosystems model 377 automated DNA sequencer (PE Biosystems, Foster City, California, USA). To screen nuclear sequences for polymorphisms, we cloned the cleaned PHYA PCR products into the pSTBlue-1 Acceptor vector (EMD Biosciences, San Diego, California, USA) and sequenced multiple clones. Sequence chromatograms were proofed and edited, and contigs were assembled, using the program Sequencher version 4.8 (Gene Codes Corp., Ann Arbor, Michigan, USA). All sequences generated in this study are deposited in GenBank (accessions HQ895589–HQ895712; Appendix 1).

2.3.3 Phylogenetic analyses

Sequences were aligned manually with the program Se-Al version 2.0a11 (Rambaut, 2002). The sequences were readily alignable among all ingroup accessions in both the plastid and nuclear matrices. Gaps in the alignments were treated as missing data. Indels were coded with the program SeqState version 1.4.2 (Müller, 2005) using the procedure of Simmons and Ochoterena (2000) and appended to the respective sequence matrices. Phylogenetic analyses were conducted under a variety of distance- and character-based methods.
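To make the idea of coding indels as extra characters concrete, the following sketch implements a simplified version of simple indel coding: every distinct gap (defined by its start and end position) becomes one binary character, and a sequence whose longer gap completely spans the gap being coded is scored as inapplicable ("?"). The toy alignment and all function names are hypothetical, terminal gaps are treated like internal ones for brevity, and this is not the SeqState implementation used in the study.

```python
def find_gaps(seq, gap="-"):
    """Return the set of (start, end) gap intervals in a sequence (end exclusive)."""
    gaps, start = set(), None
    for i, ch in enumerate(seq):
        if ch == gap and start is None:
            start = i
        elif ch != gap and start is not None:
            gaps.add((start, i))
            start = None
    if start is not None:
        gaps.add((start, len(seq)))
    return gaps


def simple_indel_coding(alignment):
    """Code each distinct gap as a presence/absence character.

    '1' = the sequence has exactly this gap, '?' = a longer gap in the
    sequence spans this gap (inapplicable), '0' = gap absent.
    """
    per_seq = {name: find_gaps(seq) for name, seq in alignment.items()}
    characters = sorted(set().union(*per_seq.values()))
    coded = {}
    for name, gaps in per_seq.items():
        states = []
        for start, end in characters:
            if (start, end) in gaps:
                states.append("1")
            elif any(gs <= start and end <= ge for gs, ge in gaps):
                states.append("?")
            else:
                states.append("0")
        coded[name] = "".join(states)
    return characters, coded


# Hypothetical toy alignment, not data from this study.
toy = {
    "acc1": "ACGTACGTAC",
    "acc2": "AC--ACGTAC",
    "acc3": "AC----GTAC",
    "acc4": "ACGTAC--AC",
}
characters, coded = simple_indel_coding(toy)
print(characters)
print(coded)
```

The resulting strings of 0/1/? states are what would be appended to the nucleotide matrix as additional characters.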

2.3.4 Network analyses

To investigate relationships among and within the species and populations of Conopholis, we initially constructed phylogenetic networks for each individual data set. The networks were constructed using a neighbor-net (NN) algorithm (Bryant and Moulton, 2004), as implemented in the program SplitsTree version 4.11.3 (Huson and Bryant, 2006). Prior to network analyses, sequences were corrected by imposing corresponding models of DNA evolution. The program ModelTest version 3.7 (Posada and Crandall, 1998) was used to determine the model of sequence evolution that best fits each of the three data sets. The Akaike information criterion (AIC) method selected the F81 + I, HKY85 + I, and F81 models of DNA substitution for the trnfM-E, clpP, and PHYA Conopholis matrices, respectively. The TVM + G model was chosen for the clpP matrix containing both ingroup and outgroup taxa (network not shown).
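Model selection by AIC amounts to trading off fit against the number of free parameters, AIC = 2k − 2 ln L, and choosing the model with the lowest score. The sketch below shows that comparison only; the log-likelihood values are invented for illustration (they are not results from this study), and the parameter counts cover substitution-model parameters only, on the assumption that branch lengths add the same constant to every candidate.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * n_free_params - 2 * log_likelihood


# HYPOTHETICAL log-likelihoods for one data set; parameter counts are
# 3 free base frequencies (F81), plus kappa (HKY85), plus one for +I.
candidates = {
    "F81": (-1502.3, 3),
    "F81+I": (-1498.9, 4),
    "HKY85": (-1501.8, 4),
    "HKY85+I": (-1498.6, 5),
}

scores = {model: aic(lnl, k) for model, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

With these made-up numbers the extra parameter of HKY85 + I does not improve the likelihood enough to offset its penalty, so the simpler F81 + I is preferred.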

2.3.5 Parsimony analyses

Each data matrix was analyzed separately as well as in a single combined matrix using the program PAUP* version 4.0b10 (Swofford, 2002). In all of those analyses, heuristic searches for the most parsimonious (MP) trees were conducted using 1000 replicates with stepwise random sequence addition and tree-bisection-reconnection (TBR) branch swapping. All trees were saved during the search (MULTREES on). Support for relationships was inferred from nonparametric bootstrapping (Felsenstein, 1985) implemented in PAUP* by using 500 pseudoreplicates, each with 20 random sequence addition cycles, TBR branch swapping, and MULTREES option off (DeBry and Olmstead, 2000). Conflict between data sets was evaluated by visual inspection, looking for strongly supported yet conflicting tree topologies resulting from individual data matrices.
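The resampling step behind nonparametric bootstrapping can be sketched briefly: each pseudoreplicate is built by drawing alignment columns with replacement until the original number of sites is reached, and a tree search is then run on every pseudoreplicate. The code below covers only the resampling, on a hypothetical toy alignment; the tree searches themselves (done in PAUP* in this study) are not reproduced, and the function name is illustrative.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping taxa in step."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


# Hypothetical toy alignment, not data from this study.
toy = {"acc1": "ACGTAC", "acc2": "ACGTTC", "acc3": "ATGTTC"}

rng = random.Random(1)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_replicate(toy, rng) for _ in range(500)]
print(replicates[0])
```

Because whole columns are drawn, every site in a pseudoreplicate is a site pattern that exists in the original matrix; only the frequencies of the patterns change between pseudoreplicates.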

2.3.6 Bayesian analyses

Searches under the Bayesian criterion were done using the program MrBayes version 3.1.2 (Ronquist and Huelsenbeck, 2003) on the combined data set only (both with and without outgroup taxa). This combined plastid and nuclear data set was split into four partitions, three containing the trnfM-E, clpP, and PHYA sequences, respectively, and the fourth containing the combined indel characters. The models of sequence evolution as determined before were imposed for each sequence partition. The coded gaps (for trnfM-E, clpP, and PHYA) were included and analyzed separately from the sequence data. These characters were set to follow the Mk model (Lewis, 2001) with the possibility that some indels may be changing at different rates (Mk + G). Two runs starting from random trees were carried out. The Metropolis-coupled Markov chain Monte Carlo algorithm was used with four simultaneous chains set initially to one million generations and sampled every 100 generations. The likelihoods of the independent runs were considered indistinguishable when the average standard deviation of split frequencies was <0.01%, as suggested by Ronquist and Huelsenbeck (2003). To determine the burn-in cut-off point, we plotted the −ln likelihood values against generation time and discarded preasymptotic samples. The remaining data were analyzed in PAUP* where the 50% majority-rule consensus tree was constructed. With no significant difference between the two runs observed, we only report topologies and posterior probabilities based on pooled trees from the independent Bayesian analyses.
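The convergence diagnostic mentioned above compares how often each split (bipartition) appears in the trees sampled by the two runs. A minimal sketch of such a calculation follows. Each sampled tree is represented simply as a set of splits, the runs are hypothetical toy data, and both the minimum-frequency cutoff and the use of the sample (n − 1) standard deviation are assumptions made for illustration rather than a reproduction of the MrBayes code.

```python
from collections import Counter
from math import sqrt


def split_frequencies(trees):
    """Proportion of sampled trees in a run that contain each split."""
    counts = Counter(split for tree in trees for split in tree)
    return {split: n / len(trees) for split, n in counts.items()}


def average_sd_of_split_frequencies(run_a, run_b, min_freq=0.1):
    """Mean, over splits, of the standard deviation of split frequency across two runs."""
    fa, fb = split_frequencies(run_a), split_frequencies(run_b)
    sds = []
    for split in set(fa) | set(fb):
        a, b = fa.get(split, 0.0), fb.get(split, 0.0)
        if max(a, b) < min_freq:  # ignore splits that are rare in both runs (assumed cutoff)
            continue
        mean = (a + b) / 2
        # Sample standard deviation for two runs: divisor n - 1 = 1.
        sds.append(sqrt((a - mean) ** 2 + (b - mean) ** 2))
    return sum(sds) / len(sds) if sds else 0.0


# Hypothetical runs: each tree is reduced to the single split it contains.
AB, AC = frozenset({"A", "B"}), frozenset({"A", "C"})
run_a = [{AB}] * 90 + [{AC}] * 10
run_b = [{AB}] * 80 + [{AC}] * 20
print(average_sd_of_split_frequencies(run_a, run_b))
```

Values close to zero indicate that the two runs are sampling splits at nearly the same frequencies, which is the sense in which the runs are treated as indistinguishable.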

2.3.7 Evaluation of the rooting

In most phylogenies, the root node of a tree is usually determined extrinsically, by imposing outgroup(s). Alternatively, the enforcement of the molecular clock will result in ultrametric trees, rooted intrinsically by the tree-building algorithm itself (Felsenstein, 2004). It is generally assumed that among closely related species, which tend to have similar metabolic rates, life histories, and generation times, the rates of evolution for a particular gene are likely to be comparable, resulting in a “local” molecular clock (Li, 1993; Sanderson, 2002). To identify the position of the root within Conopholis but without the use of outgroups and to compare the divergence times (rates) of Conopholis with those of Epifagus, we evaluated the molecular clock hypothesis using the likelihood ratio tests (LRT; Felsenstein, 1981; Goldman, 1993). These tests were conducted on two data sets: the Conopholis-only matrix, including all available sequence data but without outgroups, as well as the clpP-only matrix, with Epifagus included. To assess whether the molecular clock could be applied to these data, we conducted maximum likelihood searches utilizing the models of DNA sequence evolution and parameter estimates as identified by ModelTest (Posada and Crandall, 1998; see above). For each data set, likelihood searches, with and without the molecular clock enforced, were performed using a two-stage strategy with PAUP*. First, the analyses involved 20 replicates with stepwise random taxon addition, TBR branch swapping saving no more than 10 trees per replicate, and MULTREES option off. The second round of analyses was performed on all trees in memory with the same settings except with the MULTREES option on. Both stages were conducted to completion or until 100 000 trees were found. The resulting likelihood estimates, with the clock imposed (Hnull) and no enforcement of clock (Halt), were then compared using the LRT with N − 2 degrees of freedom (where N is the number of operational units).

2.3.8 Testing of alternative topologies

Two alternative topologies, designed to investigate the monophyly of species as circumscribed traditionally, were constructed and their cost in parsimony assessed using PAUP* (Swofford, 2002). Constraining the monophyly of C. americana and that of C. alpina was done by using the combined data set (comprising all three sequence matrices) and including both ingroup and outgroup taxa (Epifagus). To statistically test and compare these alternatively enforced phylogenetic hypotheses with the optimal trees, we conducted two statistical tests. First, we conducted one-tailed Shimodaira–Hasegawa (SH) tests (Shimodaira and Hasegawa, 1999; Goldman et al., 2000) in PAUP* using 1000 replicates and full parameter optimization of the model. Second, we also carried out the less conservative, approximately unbiased tests (AU tests; Shimodaira, 2002). The P-values for the AU test were calculated in the program CONSEL version 0.1j (Shimodaira and Hasegawa, 2001), using 10 repetitions of multiscale bootstrapping, each consisting of 10 sets with 10 000 bootstrap replicates.

2.4 Results

2.4.1 DNA regions and alignments

The characteristics of the three sequenced regions as well as statistics of MP trees derived from separate and combined analyses are described in Table 1.

Sequences for the trnfM-E region were obtained from all accessions of Conopholis and Epifagus used in this study. These sequences were relatively easy to align within Conopholis as well as within Epifagus. However, we could not achieve unambiguous alignment between the ingroup and outgroup taxa. When indels were coded for the Conopholis trnfM-E aligned data matrix, 22 additional binary characters were obtained. However, the indel characters arising from complex gaps in the alignment produced from single base repeats (>8) were excluded from the analyses, leaving 14 coded gaps used in subsequent analyses.

Most specimens of Conopholis (40 of 42) were readily amplifiable for clpP, as were all seven individuals of E. virginiana. Attempts to amplify the remaining two accessions of C. alpina, including partial fragment amplification with internal primers, were unsuccessful. This was probably due to the poor quality of DNA extracted from herbarium material. Unlike trnfM-E, clpP gene/intron sequences were easily aligned across both ingroup and outgroup samples. Gap characters were scored using the modified complex indel-coding method (Simmons and Ochoterena, 2000; Müller, 2006) and resulted in 17 additional characters (coded for Conopholis only).

Intron 1 sequences of the PHYA gene were readily obtained for all 35 individuals of Conopholis acquired from fresh or silica dried material. A direct sequencing approach yielded results without polymorphisms being observed in sequence trace chromatograms. A number of amplicons were cloned and up to 10 clones sequenced from each. In all cases, clones from the same individual produced identical sequences. Several attempts to amplify this single-copy nuclear region from the remaining seven Conopholis accessions, all of which were from herbarium sources, met with failure. In addition, amplification of outgroup samples for this nuclear region proved to be difficult, and the few PHYA sequences for Epifagus that did amplify could not be aligned with those of Conopholis. Indel coding for available sequences provided three additional binary characters.

2.4.2 Individual data set analyses

To explore the data, a number of distinct phylogenetic analyses were initially conducted on individual matrices using distance, parsimony, and Bayesian approaches.

The phylogenetic networks of individual data sets each revealed three distinct clusters, labeled informally A–C (Fig. 2). Group A consists of all accessions of C. alpina var. mexicana from Arizona, New Mexico, Texas, and north of the Trans-Mexican volcanic belt (TMVB). Group B contains samples of C. alpina var. alpina exclusively from Costa Rica and Panama. Group C includes populations from southern Mexico (C. alpina var. alpina from the states of Chiapas, Oaxaca, and Puebla), found interspersed within this group that otherwise contains all samples of C. americana from eastern North America.

Tree characteristics for MP searches are shown in Table 1. Topological agreement was found among the three separate analyses (trees not shown). Parsimony analyses of the individual data sets produced clades identical to their respective phylogenetic network already described (Fig. 2). Taking the results from all three separate analyses into account, both under network and tree approaches, we deemed these three matrices to show no significant topological incongruence and thus combined them into one data set.

2.4.3 Phylogenetic analyses of combined data

Trees produced from the combined analyses had better resolution and overall support compared to those produced from individual analyses. The Bayesian analyses from each of the two runs starting from a random tree reached an asymptotic plateau no later than 150 000 generations, and all trees obtained prior to the plateau were excluded from the assemblage of a consensus tree.

Figure 3 shows the majority-rule consensus tree resulting from the Bayesian analysis of all available data, combined plastid and nuclear sequences as well as coded gaps, obtained from Conopholis and Epifagus accessions. The topology is consistent with the results from the separate data set analyses using distance (Fig. 2) and parsimony (trees not shown). Identical backbone clades (outgroup plus three ingroup clusters labeled A–C) were recovered from the combined analyses, all with ≥90% bootstrap support (BS) in parsimony and with 1.0 posterior probability (PP) in Bayesian analysis (Fig. 3). The majority of the southwestern North American specimens included in this study (C. alpina var. alpina and C. alpina var. mexicana) are found in two distinct clades (A and B). Clade A contains all individuals from northern Mexico and the southwestern portion of the USA. Clade B comprises the lineage found in Costa Rica and Panama. The eastern North American individuals (C. americana) are all found within the third well-supported group, clade C. However, as one of the most surprising results of this study, six accessions of C. alpina var. alpina sampled from the southern Mexican states of Chiapas, Oaxaca, and Puebla are also found nested within clade C, more closely related to C. americana than to the other specimens of C. alpina. The seven accessions of Epifagus cluster with each other to form a separate outgroup clade, sister to Conopholis.

Parsimony analyses of the combined plastid and nuclear data resulted in essentially the same relationships between populations and species as described above. The inset in Fig. 3 shows a phylogram of one of the MP trees resulting from the analysis of the combined data matrix. This phylogram also depicts three substantial branches subtending the three major clades A–C, while individuals within each of these major clades appear more homogeneous with relatively shorter branch lengths.

2.4.4 Molecular clock and placement of the root

All phylogenetic analyses recovered the same major ingroup lineages within Conopholis (clades A–C; Fig. 3). In addition, the combined data analyses, which included outgroups, indicated that the first split occurred between clade A on one side and clades B plus C on the other. All of these backbone relationships received strong internal support. However, two problems are raised by the analysis. First, because Conopholis includes only three major clades and a root (Fig. 3), even a simple topological distortion, such as nearest-neighbor interchange (NNI), would result in trees with different placements of the root (e.g., clades A + B sister to clade C or clades A + C sister to clade B). Such competing rooting solutions could be caused by the artifact of attraction involving long outgroup and ingroup branches (Felsenstein, 1978). Second, we were able to achieve unambiguous alignment between ingroup and outgroup accessions only for the clpP region; yet, a significant proportion of missing data could lead to inaccurate phylogenetic reconstruction (Scotland et al., 2003).

To explore the influence of these potentially adverse conditions on our results, we reanalyzed the same combined data set but with the exclusion of Epifagus sequences, thereby eliminating the possibly misleading long outgroup branch as well as a relatively large number of missing data cells. Instead, we used maximum likelihood (ML) to produce a rooted phylogeny. Results of the likelihood ratio test revealed that the null hypothesis (molecular clock enforced) could not be rejected (df = 40, χ²obs = 12.0824, p = 0.0001). By enforcing the molecular clock, the ML search results in an ultrametric tree intrinsically rooted by all members of C. alpina var. mexicana (Appendix S1; see Supplementary Figure S1). These ingroup-only analyses not only recovered the same major clades, with the same composition, but also resulted in identical inference of the root node within Conopholis, the same as when the outgroup Epifagus was included (compare Fig. 3 to online Appendix S1; see Supplementary Figure S1).

To test whether the molecular clock is in effect more broadly, between Conopholis and its sister Epifagus, the clpP data matrix (the most complete data set where the ingroup taxa were alignable with the outgroup) was analyzed using the ML approach as well. Results of the likelihood ratio test revealed that the null hypothesis (molecular clock enforced) could not be rejected (df = 45, χ²obs = 22.9202, p = 0.0026). The strict consensus tree resulting from ML analysis with the molecular clock enforced is shown in the online Appendix S2 (see Supplementary Figure S2).
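The arithmetic of the two clock tests can be retraced from the reported test statistics. The sketch below derives the degrees of freedom as N − 2 and evaluates the chi-square distribution with a standard series for the regularized incomplete gamma function, using only the Python standard library. The taxon counts (42 accessions for the Conopholis-only matrix; 40 Conopholis plus 7 Epifagus for clpP) are inferred from the sampling described in the text, and the function names are hypothetical. Both tail probabilities are printed because the clock is rejected only when the upper-tail probability is small. As an observation rather than a statement from the source, the value quoted for the clpP test lies close to the lower-tail probability.

```python
from math import exp, lgamma, log


def chi2_cdf(x, df, tol=1e-12, max_terms=10_000):
    """Lower-tail chi-square probability via the series for P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = total = 1.0 / a
    for n in range(1, max_terms):
        term *= half_x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * exp(-half_x + a * log(half_x) - lgamma(a))


def clock_lrt_df(n_units):
    """Degrees of freedom for the clock test: N - 2 for N operational units."""
    return n_units - 2


# (number of operational units inferred from the text, reported statistic)
tests = {
    "Conopholis only": (42, 12.0824),
    "clpP with Epifagus": (47, 22.9202),
}

for label, (n_units, statistic) in tests.items():
    df = clock_lrt_df(n_units)
    lower = chi2_cdf(statistic, df)
    print(f"{label}: df={df}, lower tail={lower:.6f}, upper tail={1 - lower:.6f}")
```

In both cases the test statistic is well below its degrees of freedom, so the upper-tail probability is close to 1, which agrees with the conclusion that the clock-enforced model cannot be rejected.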

2.4.5 Tests of alternate topologies

Given that neither species of Conopholis was found to be monophyletic on the optimal (unconstrained) trees, we wanted to determine the cost in parsimony and its significance when enforcing monophyly of species as circumscribed by Haynes (1971). These tests were done using the combined data matrix, containing both Conopholis and Epifagus sequences. When the topologies were constrained so that all eastern North American individuals were monophyletic (i.e., the monophyly of C. americana s.s.), this resulted in trees of 327 steps, only one step longer than the optimal tree (Fig. 3). Not surprisingly, this result proved not to be significantly different from the optimal tree (SH test P = 0.782; AU test P = 0.353). However, when the topology was constrained so that C. alpina was monophyletic, this produced trees of 342 steps, 16 steps longer than the MP tree. This result provides strong evidence in support of the optimal topologies as it rejects the monophyly of C. alpina as a significantly worse solution (SH test P = 0.004; AU test P = 2 × 10⁻⁴).

2.5 Discussion

This work represents the only fine-scale molecular phylogenetic study for Conopholis. It is based on a combination of plastid and nuclear DNA sequences obtained from individuals sampled across the entire taxonomic and geographic range of the two presently recognized species. The resulting phylogenetic inferences are robust and show significant support for the composition and relationships between major clades in the tree. Figure 4 summarizes our current understanding of phylogenetic relationships among populations of Conopholis, the relationship between the traditional and putative phylogenetic classification suggested here, and a biogeographic scenario proposed to explain present-day distribution of the genus.

2.5.1 Phylogenetic and taxonomic implications

Prior to the publication of the monograph by Robert R. Haynes (1971), upward of five species of Conopholis had been described: C. americana, C. alpina, C. mexicana, C. panamensis, and C. sylvatica. However, due to the severe reduction in morphology of parasitic plants, there is a limited number of characters that can be potentially relied upon to differentiate between these species, creating uncertainty as to the number of species in the genus in early floristic treatments, ranging from one to four (e.g., Beck-Mannagetta, 1930; Small, 1933; Fernald, 1950; Gleason, 1952). Following his seminal work, Haynes (1971) concluded that only two of these taxa warranted recognition at the species level, C. americana and C. alpina, with the latter being further subdivided into two varieties (var. alpina and var. mexicana). Specifically, after studying the relevant type specimens, he concluded that the individuals assigned to C. alpina, C. sylvatica, and C. panamensis represented only intraspecific variability and did not warrant separation into three different species. Therefore, these taxa were reduced to a single species, to which the specific epithet of C. alpina was assigned based on priority.

Regardless of the data set used or phylogenetic methodology employed, none of our analyses lend support for the strict subdivision of the genus into the two presently recognized species. Instead, molecular data presented in the present study provide strong evidence for three distinct lineages within Conopholis (Figs. 2 and 3) having various degrees of overlap with previously proposed taxa (Fig. 4). In addition, these analyses also reveal substantial branches subtending each of the three clades (see insets in Fig. 3 and online Appendices S1 and S2). These three branches are comparable in length to each other, and in all three cases they are substantially longer than the branch lengths observed within the major clades. Assuming that these relative lengths are indicative of the overall amount of genetic diversification, this suggests that the three lineages have been reproductively isolated from each other for a period of time long enough to allow them to accumulate greater genetic differences among lineages compared to within. In aggregate, the composition of the clades as well as the branches subtending those clades lends support for the recognition of three distinct lineages within Conopholis, possibly at the species level (Fig. 4).

Clade A corresponds entirely to C. alpina var. mexicana, containing all sampled individuals found in the southwestern portion of the USA and north of the TMVB. Therefore, taxonomically this clade corresponds to C. mexicana, a taxon originally recognized as a separate species by Watson (1883). He deemed it to be distinguished from C. americana by its longer and more rigid lanceolate acuminate scales as well as having a larger corolla and a less deeply toothed calyx. Aside from this clade, the remaining individuals traditionally assigned to C. alpina are also found in clades B and C. Clade B is composed of specimens occurring solely in Costa Rica and Panama, although according to Haynes’ (1971) scheme, individuals obtained from this geographic region should be more closely related to those found in clade A. Instead, clade B may correspond to another previously described species, C. panamensis (Woodson and Seibert, 1938), morphologically distinct from both C. mexicana and C. americana. According to its original description, C. panamensis can be distinguished from C. mexicana, a taxon that is geographically in relatively close proximity to the Central American populations, based on its shallow and broadly obtuse calyx. In addition, this putative species can be distinguished from both C. mexicana and C. americana by its seeds that are about half the size of those in the other two taxa. However, it shares with C. americana broad bracts concealing its calyx, while the loss of style in fruit resembles that of C. mexicana. The morphological distinction between the Central American populations and those found in northern Mexico and the southwestern USA is consistent with the two genetically distinct lineages of C. alpina (clades A and B; Figs. 2 and 3) recovered by phylogenetic networks and tree approaches. Also, for clade B to correspond to the description of C. alpina var. alpina, it would have to contain individuals sampled from the southern Mexican states. However, the six accessions of C. alpina sampled from Chiapas, Oaxaca, and Puebla are found interspersed within clade C. Excluding these six accessions from southern Mexico, clade C otherwise contains all sampled individuals of C. americana from eastern North America. While the monophyly of C. americana is not strictly recovered according to the optimal trees, the possibility of its monophyly added only one step to the MP trees and could not be rejected with confidence by the SH and AU tests.

Given the composition of clade C, we hypothesize that a comprehensive re-evaluation of morphological characters within this genus will likely reveal morphological features shared between the members of C. americana and those of C. alpina distributed in the southern Mexican states. The observation that no single character can be relied upon to distinguish the eastern from western species as noted by Haynes (1971; p. 252) may thus be explained by the fact that some Conopholis populations found in Mexico are actually disjunct members of C. americana rather than of C. alpina, as would be expected from their distribution. A similar case of intraspecific disjunction is indeed observed in Epifagus virginiana. This species, sister to Conopholis, also occurs predominantly throughout eastern North America, but it has disjunct populations located in known relict temperate forests in Mexico such as Rancho del Cielo (Thieret, 1969), mirroring closely the distribution of individuals found in clade C (Fig. 4). Other Mexican disjunct lineages that are considered conspecific with their eastern United States counterparts include Nyssa sylvatica (Miranda and Sharp, 1950) and Fagus grandifolia (Morris et al., 2010). Also, two members of the Corallorhiza striata species complex (C. bentleyi and C. striata var. involuta) show a similar pattern and are presumed relicts left from a once broader distribution of the ancestor of that clade (Barrett and Freudenstein, 2009). However, contrary to these other cases, which showed little or no genetic variation in the Mexican accessions studied (e.g., Fagus grandifolia; Morris et al., 2010), we observe substantial variation among the disjunct Mexican samples of C. alpina found in clade C even though our sampling of those populations is limited (Fig. 3).

The incomplete lineage sorting of ancestral polymorphisms (i.e., deep coalescence) could be seen as an alternative explanation for the phylogenetic patterns observed in our study, in particular regarding the nonmonophyly of C. alpina. However, several lines of evidence suggest that this is an unlikely scenario. First, two independent sources of data were used, plastid and nuclear DNA sequences, resulting in congruent gene trees, both producing three separate and well-supported groups (Figs. 2 and 3). Second, because the plastid genome has a significantly smaller effective population size compared to nuclear loci (Moore, 1995), the phylogenetic relationships resulting from the use of plastid regions have a higher probability of faster coalescence, leading to more rapid elimination of any polymorphisms. Third, in the case of C. alpina, whose members are found in all three of these well-differentiated clades (as evidenced by the branch lengths subtending those clades; see inset, Fig. 3), the polymorphisms would have had to persist through multiple cladogenesis events. Phylogenetic analyses of additional, independently inherited nuclear sequence data, as well as consideration of faster evolving markers, such as microsatellite loci, will help us to further test our current taxonomic and phylogeographic hypotheses (Hare, 2001).

2.5.2 Historical biogeography

Taxa such as Conopholis that exhibit an east–west geographic disjunction in North America are considered to be either: (1) Tertiary relicts of the mixed broadleaf (mesophytic) forest, or (2) disruptions in the continuous ranges caused by Pleistocene glaciations, or (3) the result of more recent long-distance dispersal events (Graham, 1964; Wood, 1972; Graham, 1993; Soltis et al., 2006). Oaks, the hosts of Conopholis, are documented to have been part of the broadleaf forest that was found across North America from the late Eocene through the Miocene (40–5 Ma; Braun, 1947; Axelrod, 1983). The range of Conopholis is thought to have been continuous during the latter portion of this time period with that of oaks. However, with the appearance of widespread prairie vegetation and the aridification of midcontinental North America during the Pliocene (5–2 Ma), a major east–west disruption of the broadleaved deciduous forest was created (Graham, 1993).

Given the composition of and rooted relationships among three separate lineages inferred within Conopholis (Fig. 3; online Appendices S1 and S2), we propose a four-step biogeographic scenario to explain the current distribution of the genus and place it in a historical context (Fig. 4). The first, and therefore the oldest, split seems to have occurred between clade A and the rest of the genus, with the Trans-Mexican volcanic belt functioning as a barrier. The formation of this belt began in the late Miocene (Ferrari et al., 1999) and continues today, forming the tallest mountain range in Mexico that runs from east to west in the central region of the country (Rzedowski, 1978). This mountain range is a recognized center for biodiversity (Quercus, Nixon, 1993; Pinus, Styles, 1993) and is established as a known vicariant barrier for a variety of organisms (insects, Halffter, 1964, 1976; mosses, Delgadillo, 1987). For Conopholis, this volcanic belt can be seen as an effective barrier to migration and gene flow, creating a north-south divide in Mexico and effectively separating populations to the north of the belt from the rest of Conopholis (clade A; Figs. 2 and 3).

The second step is the separation of high-mountain populations from Costa Rica and Panama from those that lie farther north and east, around the Gulf of Mexico. Members of clade B are geographically isolated from the nearest population of Conopholis, reported to be in Guatemala (Haynes, 1971). There are no known populations occurring in Honduras, El Salvador, or Nicaragua. Hence, these populations have presumably existed in isolation from other members of the genus for an extended period of time, resulting in accumulation of genetic differences between them and their nearest relatives along the Gulf Coast.

The third split in Conopholis can be explained by the east-west North American disjunction. In addition to the aridification of midcentral North America during the Pliocene described above, there is also the possibility that the range of Quercus and Conopholis remained continuous along the Gulf Coast during the last glacial period occurring in the Pleistocene, 110 000–10 000 ybp (Jackson et al., 2000). However, the harsh climate during the Pleistocene glaciations can be assumed to have eliminated the north-central portion of the range (Wood, 1972), and in so doing, created an east-west divide along the Gulf Coast. This geographic divide of more than 1400 km in the case of Conopholis resulted in the genetic differentiation of populations found today in northeastern North America from those in southern Mexico (Figs. 1 and 4).

Finally, the fourth step would involve repeated range expansion and contraction of eastern North American Conopholis populations following the glaciation minima and maxima. In eastern North America during the peak Pleistocene glaciation, plants and animals presumably survived primarily in several glacial refugia located in the southern portions of the United States, along the Gulf Coast (Pielou, 1991). Fossil data suggest that pockets of hardwood forests existed in the Lower Mississippi Valley during the last glacial maximum, forming a glacial refugium in the southern USA for temperate taxa (Delcourt and Delcourt, 1984). As the ice retreated, populations migrated northwards to their present-day ranges. The “Southern Refugia hypothesis” postulates higher diversity in the southern nonglaciated regions and loss of this diversity by populations moving northwards (Hewitt, 1996, 2000; Petit et al., 2002; McLachlan et al., 2005). The results of our present analyses offer initial support for this hypothesis in Conopholis. Specifically, within clade C, we observe greater diversity in populations collected from the southern Mexican and USA states (e.g., Chiapas, Oaxaca, Puebla, Alabama, and Kentucky) compared to the central and northern parts of its range (Fig. 3 and online Appendices S1 and S2). To gain a better understanding of relationships within clade C, as well as the amount and geographic distribution of genetic diversity within this lineage, a much denser sampling strategy is required, in combination with faster-evolving markers.

Given the present distribution of Epifagus, the last two steps (steps three and four) appear to apply equally well to populations of this genus, sister to Conopholis. Although not as dramatic as in Conopholis, an east–west North American geographic disjunction clearly exists in Epifagus (Thieret, 1969; Fig. 1), and this divide is also recovered through molecular phylogenetic analyses (Fig. 3 and online Appendix S2). The representative of the Epifagus population found in Mexico is genetically distinct from those located in northeastern North America. In addition, from the molecular clock trees (inset, online Appendix S2), we can deduce that populations of Epifagus have very similar diversification times (rates) compared to those of C. americana in eastern North America. The branches subtending populations within these clades are very short and indicative of low levels of sequence divergence within these two clades, particularly in the northern range, compatible with more comprehensive results from plastid and microsatellite data in Epifagus (Tsai and Manos, 2010).

A more integrative approach to historical phylogeny-based biogeography has been strongly advocated (e.g., Donoghue and Moore, 2003 and references therein), in particular regarding the need for an explicit incorporation of temporal information in such studies (Ree et al., 2005). However, primarily because Orobanchaceae as a family has no fossils that can be used to set a reference date (Cronquist, 1988), we feel efforts to estimate the absolute timing of the diversification of lineages within Conopholis remain premature. Nevertheless, an initial attempt was made by Wolfe et al. (2005) to put a timeframe on the origin of Orobanchaceae within a broader phylogenetic context, by taking estimates for the Lamiales crown-clade age based on sequence data (71–74 Ma; Wikström et al., 2001) and fossil data (37 Ma for Oleaceae; Magallón et al., 1999), and averaging these estimates to create a reference node time of 55.5 Ma. Wolfe et al. (2005, p. 125) acknowledged the limitations of their approach, stating that “not having a fossil for Orobanchaceae, and calculating divergence times based on an average estimated age of Lamiales […] means that our inference of divergence times should be considered as a baseline for future studies.” Aside from these calibration issues, their divergence estimates within Orobanchaceae were deduced from a phylogenetic tree with relatively poor support for many of the backbone relationships in this family, and confidence intervals were not provided for any of the calculated point estimates, making it altogether difficult to critically evaluate those results further or to use them here.

2.6 Conclusions

Altogether, these analyses reveal three distinct lineages lending support to the possibility of there being three species within the genus. A fine-scale morphometric analysis is needed to determine if there are morphological features that could further corroborate our molecular results. In addition, research with multiple individuals per population should be conducted to provide more accurate estimates of population-level genetic diversity within C. americana s.l. and to draw conclusions about its postglacial migration.

2.7 Acknowledgements

The authors warmly thank C. Campbell, W. Flynn, L. Goertzen, S. Hill, M. Kuzmina, J. Meyers, S. O’Kane, G. Porta, N. Prentiss, J. Robertson, A. Salina Tovar, G. Schatz, K. Shaw, E. Tsai, and the curators/directors of ARIZ, INBIO, MO, NY, and US for supplying plant material, and G. Yatskievych, T. Collins, and two anonymous reviewers for critical comments on the manuscript. Financial support from the Natural Sciences and Engineering Research Council of Canada Discovery grant (326439-06) and University of Toronto Connaught New Staff Matching grant to S.S. is gratefully acknowledged.


Table 2-1: Summary descriptions for sequences included in, and maximum parsimony trees derived from, individual and combined datasets of Conopholis and its close outgroup Epifagus.

Description                      Plastid clpPᵃ   Plastid trnfM-trnEᵇ   Nuclear phyAᵇ   All combinedᵃ
Number of OTUs included          47              42                    35              49
Sequence characteristics
  Analyzed lengthᶜ               1667            475                   479             2621
  Number of coded gaps           24              14                    3               41
  Variable sitesᵈ                232             24                    8               264
  Parsimony informative sitesᵈ   209             16                    5               230
  Mean AT content                0.70            0.73                  0.66            0.70
Tree characteristics
  Number of trees                711             120                   2               930
  Length                         287             28                    9               326
  CI/RI                          0.930/0.986     0.857/0.953           1/1             0.920/0.983

Note: CI, consistency index; RI, retention index; OTU, operational taxonomic unit.
ᵃ Including outgroup taxa (Epifagus).
ᵇ Excluding outgroup taxa that could not be aligned with the ingroup accessions.
ᶜ Excluding portions of the alignment spanning primer regions and ambiguously aligned regions.
ᵈ Including coded gaps.


Table 2-2-Appendix 1

Appendix 1-Taxa, DNA accession numbers, voucher information, locality from where specimen were collected, geographic coordinates, labels for names used in text, and GenBank accession numbers for sequences used in this study.

                                                                                                                    GenBank accessions
  DNA accessionᵃ/Voucherᵇ  Localityᶜ                                               Geographic coordinatesᵈ  Labelᵉ  clpP      trnfM-E   phyA

Conopholis alpina Liebm.
  AC.MX.11M.1; MO          Coxcatlan, Puebla, Mexico                               18°22’N 97°00’W          Z       HQ895610  HQ895687  HQ895645
  AC.MX.13M.2.2; MO        La Carbonera, Oaxaca, Mexico                            17°35’N 97°00’W          X       HQ895605  HQ895682  HQ895644
  AC.05.Panama; n/a        Chiriqui, Panama                                        08°33’N 82°24’W          AH      HQ895602  HQ895677  HQ895642
  INBIO-8382; INBIO        Limon, Talamanca, Costa Rica                            09°06’N 82°58’W          AF      N/A       HQ895678  N/A
  INBIO-8386a; INBIO       Puntarenas, Coto Brus, Costa Rica                       08°57’N 82°49’W          AG      N/A       HQ895679  N/A
  MO.04805064; MO          Huajuapan, Oaxaca, Mexico                               17°48’N 97°46’W          Y       HQ895608  HQ895685  N/A
  MO.04823722; MO          Reserva de la Biosfera, Chiapas, Mexico                 16°46’N 93°06’W          AC      HQ895606  HQ895683  N/A
  MO.04875049; MO          Cerro Quetzal, Chiapas, Mexico                          16°47’N 93°04’W          AD      HQ895607  HQ895684  N/A
  MO.04853204; MO          Reserva de la Biosfera, Chiapas, Mexico                 16°45’N 93°09’W          AE      HQ895609  HQ895686  N/A
  NYBG-857; NY             San Jose, Costa Rica                                    09°56’N 84°03’W          AJ      HQ895604  HQ895681  N/A
  SI-0023A; US             Chiriqui, Panama                                        08°36’N 82°22’W          AI      HQ895603  HQ895680  HQ895643

Conopholis alpina Liebm. var. mexicana (A.Gray, ex S. Watson) R.R. Haynes
  AC.AZ-FHR.4; MO          Gila Co., Arizona, USA                                  34°22’N 111°6’W          AO      HQ895600  HQ895675  HQ895640
  AC.NM-WWC.5; MO          Otero Co., New Mexico, USA                              32°53’N 105°57’W         AM      HQ895598  HQ895673  HQ895638
  AC.NM-BC.1.1; WTU        Colfax Co., New Mexico, USA                             36°33’N 105°03’W         AK      HQ895596  HQ895671  HQ895636
  AC.NM.DSNA.1; MO         Town of Las Cruces, New Mexico, USA                     32°18’N 106°46’W         AN      HQ895599  HQ895674  HQ895639
  ARIZ-6333b; ARIZ         Van Horn Rural, Texas, USA                              31°54’N 104°50’W         AL      HQ895597  HQ895672  HQ895637
  AC-MX-24Fb.3; MO         Aculco, Distrito Federal, Mexico                        19°20’N 99°30’W          AP      HQ895601  HQ895676  HQ895641

Conopholis americana (L.) Wallr.
  AC.IL-KSP.1; MO          Vermillion Co., Illinois, USA                           40°07’N 87°44’W          A       HQ895621  HQ895698  HQ895656
  SS0331; TRTE             Hickory Ridge Lookout, Monroe Co., Indiana, USA         39°02’N 86°19’W          B       HQ895622  HQ895699  HQ895657
  SS0680; TRTE             Cloudland Canyon, Dade Co., Georgia, USA                34°50’N 85°27’W          C       HQ895623  HQ895700  HQ895658
  SS0471; TRTE             Kanawha Co., West Virginia, USA                         39°11’N 81°27’W          D       HQ895624  HQ895701  HQ895659
  ET.CT.11.31; N/A         Hubbard Park, New Haven Co., Connecticut, USA           41°33’N 72°50’W          E       HQ895625  HQ895702  HQ895660
  ET.NC.21.21; N/A         Hot Springs, Madison Co., North Carolina, USA           35°53’N 82°49’W          F       HQ895626  HQ895703  HQ895661
  SS0582; TRTE             Holland, Ottawa Co., Michigan, USA                      42°47’N 80°06’W          G       HQ895627  HQ895604  HQ895662
  SS0594; TRTE             Halton Co., Ontario, Canada                             43°25’N 79°52’W          H       HQ895628  HQ895605  HQ895663
  SS05194; TRTE            Township of Archipelago, Parry Sound, Ontario, Canada   45°20’N 80°02’W          I       HQ895629  HQ895606  HQ895664
  SS06127; TRTE            Gatlinburg, Blount Co., Tennessee, USA                  35°42’N 83°30’W          J       HQ895630  HQ895707  HQ895665
  SS0758; TRTE             Shenandoah, Madison Co., Virginia, USA                  38°41’N 78°19’W          K       HQ895631  HQ895708  HQ895666
  SS0937; TRTE             Cheboygan Co., Michigan, USA                            45°33’N 84°40’W          L       HQ895633  HQ895710  HQ895668
  SS0941; TRTE             Port Severn, Ontario, Canada                            44°55’N 79°44’W          M       HQ895635  HQ895712  HQ895670
  SS0653; TRTE             Lake Warren, Hampton Co., South Carolina, USA           32°49’N 81°10’W          N       HQ895614  HQ895691  HQ895649
  SS0648; TRTE             Wakulla Spring, Wakulla Co., Florida, USA               30°07’N 84°21’W          O       HQ895615  HQ895692  HQ895650
  SS0581; TRTE             Huntland, Franklin Co., Tennessee, USA                  35°03’N 86°16’W          P       HQ895613  HQ895690  HQ895648
  AC.FL-SF.4.2; MO         Alachua Co., Florida, USA                               29°44’N 82°26’W          Q       HQ895616  HQ895693  HQ895651
  AC.PA-MSF.1; MO          Franklin Co., Pennsylvania, USA                         39°55’N 77°26’W          R       HQ895617  HQ895694  HQ895652
  SS06146; TRTE            Swain Co., North Carolina, USA                          35°25’N 83°27’W          S       HQ895618  HQ895695  HQ895653
  SS06174; TRTE            Granville, Licking Co., Ohio, USA                       40°04’N 82°31’W          T       HQ895619  HQ895696  HQ895654
  SS0780; TRTE             Montarville, Quebec, Canada                             45°32’N 73°21’W          U       HQ895620  HQ895697  HQ895655
  AC.WI-DL.1; N/A          Devil’s Lake State Park, Sauk Co., Wisconsin, USA       43°24’N 89°42’W          V       HQ895632  HQ895709  HQ895667
  SS0938B; TRTE            Industry Town, Franklin Co., Maine, USA                 44°45’N 70°04’W          W       HQ895634  HQ895711  HQ895669
  SS0579A; TRTE            Huntsville, Madison Co., Alabama, USA                   34°43’N 86°35’W          AA      HQ895611  HQ896788  HQ895646
  SS0311; TRTE             Gulf Bottom Trail, McCreary County, Kentucky, USA       36°41’N 84°28’W          AB      HQ895612  HQ895689  HQ895647

Epifagus virginiana (L.) W.P.C. Barton
  MO.03399414; MO          Sierra de Guatemala, Tamaulipas, Mexico                 23°5’N 99°15’W           –       HQ895590  N/A       N/A
  SS05200; TRTE            Mississauga, Peel, Ontario, Canada                      43°32’N 79°39’W          –       HQ895591  N/A       N/A
  SS05202; TRTE            Moon River, Ontario, Canada                             45°05’N 79°56’W          –       HQ895593  N/A       N/A
  SS04159; TRTE            Martin Co., Indiana, USA                                38°42’N 86°44’W          –       HQ895589  N/A       N/A
  SS04145; TRTE            Bloomington, Indiana, USA                               39°12’N 86°30’W          –       HQ895595  N/A       N/A
  SS05201; TRTE            Herkimer Co., New York, USA                             43°31’N 74°47’W          –       HQ895592  N/A       N/A
  SS04169; TRTE            Huntington Co., Indiana, USA                            40°50’N 85°26’W          –       HQ895594  N/A       N/A

Note: In column DNA accession/voucher: SS, Sasa Stefanović; AC, Alison Colwell; ET, Erica Tsai. In column GenBank accessions: N/A, sequences not available and not used in analyses. A dash in the Label column marks accessions without a network label.
ᵃ Extraction labels for the specimens indicated on the phylogenetic trees (Fig. 3 and Suppl. Figs. S1 and S2).
ᵇ Abbreviations of herbaria where vouchers are deposited follow Index Herbariorum.
ᶜ Geographic areas where the specimens were collected. When known, the lower administrative units within a country are listed (e.g., states or provinces, counties or townships).
ᵈ Approximate geographic coordinates for the localities from which the specimens were obtained.
ᵉ Letter(s) corresponding to node labels on the networks (Fig. 2).


Figure 2-1. Distribution of Conopholis (shaded) and Epifagus (dashed outlines) across their geographic ranges in eastern and western North America [modified from Haynes (1971) and Thieret (1969), respectively]. The approximate positions of sampling sites used in this study are indicated (for details, see Appendix 1). Circles represent sampling sites for populations of C. americana, diamonds represent those of C. alpina, while X symbols stand for E. virginiana. TMVB = Trans-Mexican volcanic belt.


Figure 2-2. Phylogenetic networks (neighbor-net split graphs) obtained from plastid (trnfM-E and clpP) and nuclear (phyA) sequences of Conopholis. Major groups recovered and discussed in this study are highlighted and labeled A-C. Numbers represent bootstrap values ≥50% (1000 replicates). Closed circles represent individuals traditionally identified as C. americana; diamonds represent those of C. alpina. Taxon labels are indicated in Appendix 1. Note: plastid networks are at the same scale.


Figure 2-3. Majority rule consensus tree resulting from the partitioned Bayesian analysis of the combined plastid (clpP, trnfM-E) and nuclear (phyA) sequence data plus coded gaps showing phylogenetic relationships among and between populations of Conopholis. The tree is rooted using individuals from the sister genus Epifagus as outgroups. The MP search resulted in a strict consensus tree with almost identical topology (326 steps in length). Bayesian posterior probabilities are indicated above branches while parsimony bootstrap values (≥ 50%) are indicated below branches. Major clades recovered from analyses and discussed in this study are labeled A-C. Species names are followed by abbreviations of states/provinces in which they were collected and their respective DNA accession numbers (Appendix 1). Inset shows one of the equally parsimonious trees chosen to illustrate branch lengths.


Figure 2-4. Schematic overview of the evolutionary hypothesis for Conopholis derived from plastid and nuclear sequence data. Correspondence between the major clades (A-C) recovered in this study, the traditional classification, new classification (in bold) presented here, and current geographic distribution is indicated. Shaded clade (dotted lines) represents the sister genus Epifagus and its current distribution. See Discussion for full description of proposed four-step biogeographic scenario (1-4).


Figure 2-5-Supplemental Figure S1


Supplementary Figure S1. Strict consensus of maximum likelihood trees with molecular clock imposed of the combined plastid (clpP, trnfM-E) and nuclear (phyA) sequence data matrix showing phylogenetic relationships within Conopholis. Inset shows one of the ML trees chosen to illustrate branch lengths. The MP search resulted in a strict consensus tree with almost identical topology (146 steps in length). Bayesian posterior probabilities are indicated above branches while parsimony bootstrap values (≥ 50%) are indicated below branches. Major clades recovered from analyses and discussed in this study are labeled A-C. Species names are followed by abbreviations of states/provinces in which they were collected and their respective DNA accession numbers (Appendix 1).


Figure 2-6-Supplemental Figure S2


Supplementary Figure S2. Strict consensus of maximum likelihood trees with molecular clock imposed on the clpP data matrix showing phylogenetic relationships between Conopholis and Epifagus. Inset shows one of the ML trees chosen to illustrate branch lengths. The MP search resulted in a strict consensus tree with identical topology (287 steps in length). Bayesian posterior probabilities are indicated above branches while parsimony bootstrap values (≥ 50%) are indicated below branches. Major clades recovered from analyses and discussed in this study are labeled A-C. Species names are followed by abbreviations of states/provinces in which they were collected and their respective DNA accession numbers (Appendix 1).


3 Morphometric analyses and taxonomic revision of the North American holoparasitic genus Conopholis (Orobanchaceae)

Other than thesis-specific formatting changes, this chapter was previously published as:

Rodrigues, A. G., S. Shaya, T. A. Dickinson, and S. Stefanović. 2013. Morphometric analyses and taxonomic revision of the North American holoparasitic genus Conopholis (Orobanchaceae). Systematic Botany 38(3): in press.


3.1 Abstract

Members of the small genus Conopholis are perennial holoparasites. They are found growing in eastern and southwestern North America, and in Central America, where they attach to the roots of their oak hosts. Two species were recognized in the last taxonomic revision of the group based on geographic range and differences in floral, capsule, and bract morphology. Due to the overlapping nature of the characters used to distinguish between taxa, no single morphological feature can be relied on to differentiate between the species. A recent molecular phylogenetic study of the genus recovered three well-supported lineages, none of which corresponds entirely to the current subdivision of the genus into two species. We undertook a fine-scale morphometric study of the genus, emphasizing calyx and bract morphology. Unweighted pair-group method using arithmetic averages and principal coordinate analyses corroborate molecular data and strongly support the distinction of three separate lineages within Conopholis. A taxonomic re-alignment is proposed for the genus including three species, C. americana, C. panamensis, and C. alpina, each with various degrees of overlap with previously described taxa.


3.2 Introduction

Conopholis Wallr. is a small holoparasitic genus distributed throughout eastern and southwestern North America and Central America. The genus was established by Wallroth (1825) based on a specimen from South Carolina (eastern U. S. A.) described originally by Linnaeus in 1767 as Orobanche americana. Since then, four other species have been described: C. alpina Liebm., C. sylvatica Liebm., C. mexicana Gray ex Watson, and C. panamensis Woodson. Conopholis belongs to Orobanchaceae (Young et al., 1999; Olmstead et al., 2001; APG, 2009), one of the largest and most prominent families of parasitic plants, containing approximately 1800 species, one-half of all known parasitic angiosperms (Nickrent, 2012). Within Orobanchaceae, Conopholis is closely related to other North American parasites in the holoparasitic clade III (as defined by Bennett and Mathews, 2006), specifically Epifagus Nutt., Boschniakia C. A. Mey ex Bong, and Orobanche L. It can be distinguished from Epifagus by its chasmogamous flowers and from Orobanche by its exserted stamens. Following the descriptive terminology as applied traditionally to this genus (see Woodson and Seibert, 1938; their Fig. 2), Conopholis possesses calyces that are lobed (i.e., with rounded margins) or toothed (i.e., with pointed margins), with tubes split deeply longitudinally along the anterior side, while those in Boschniakia are often zygomorphic but not split longitudinally.

In plants, parasitism is defined by the presence of haustoria, the organs that connect the parasite to the vascular system of its host. With the evolution of advanced parasitism, many holoparasitic species exhibit what is known as the “parasitic reduction syndrome” (Colwell, 1994), a suite of correlated morphological and physiological changes, including the loss or reduction of chlorophyll production, photosynthesis, and vegetative structures, along with complete reliance on their haustorial connection to hosts, from which they acquire carbon, water, and nutrients. Due to this syndrome, there is a limited number of morphological characters that can be relied upon to differentiate between species of Conopholis. This has led to disagreement regarding the number of species in the genus among early floristic treatments, and the genus has been variously treated as having one to four species. For example, Beck (1930) accepted two species; Small (1933) assigned three species to this genus; Fernald (1950) reduced it to only one; and Gleason (1952) accepted four species.

In 1971, Haynes determined that the genus was in need of a revision given the taxonomic uncertainty and the high degree of similarity among taxa. After studying the relevant type specimens, he concluded that the individuals assigned to C. alpina, C. sylvatica, and C. panamensis represented only intraspecific variability and did not warrant separation into three different species. Therefore, these three entities were combined under C. alpina. His classification is based on a combination of presence/absence of characters along with a number of quantitative traits such as the relative proportion of bracts and scales as well as the shape of the calyx (Haynes, 1971). Ultimately, Haynes (1971) recognized only two species, C. americana and C. alpina, with the latter being divided into two varieties, C. alpina var. alpina and var. mexicana (Gray ex Watson) R. R. Haynes. The two species were separated because of their partial morphological distinctiveness, and perhaps most importantly, because of their geographic isolation and apparent host specificity (Haynes, 1971). Figure 1 summarizes the relationship between the five species of Conopholis that were described before Haynes’ work in 1971 and the two species proposed by Haynes following his taxonomic treatment.

Conopholis americana parasitizes red oaks (Quercus section Lobatae Loudon; Manos et al., 2001) in moist deciduous or mixed forests and is found today across eastern North America, from Nova Scotia to Wisconsin in the north and from Florida to Alabama in the south. Compared to C. alpina, C. americana has a looser inflorescence, broader bracts nearly or wholly concealing the calyx, and styles mostly persistent in fruit. Conopholis alpina parasitizes various oak species, but predominantly white oaks (Quercus section Quercus), in oak woodlands and mixed montane forests found in southwestern North America. Conopholis alpina var. alpina occurs in the central area of Mexico across the Trans-Mexican volcanic belt (TMVB) south to Costa Rica and Panama. Conopholis alpina var. mexicana is distributed from the Trans-Pecos area of Texas through northern New Mexico and central Arizona south to central Oaxaca, including the same central area of Mexico as C. alpina var. alpina. The distributions of the two varieties thus overlap in the central region of Mexico along the TMVB, where identifying a specimen to a particular variety is especially challenging. The features that distinguish the two varieties, apart from their geographic range, are the texture and venation of scales, and whether or not the bract conceals the calyx.

In a recent molecular phylogenetic study (Rodrigues et al., 2011), Conopholis was found to contain three major lineages. Regardless of the source of data (plastid or nuclear sequences) and phylogenetic method utilized (distance or character-based methods), none of the analyses resulted in the strict subdivision of the genus into the two currently recognized species. Each of the three distinct and well-supported clades recovered had varying degrees of overlap with previously proposed taxa. In addition, the three clades showed much greater genetic differentiation among them than among individuals within each of those clades. Altogether, taking into account the composition of these clades and the branch lengths subtending them, the molecular results were interpreted as lending support to three distinct lineages within Conopholis, potentially at the species level (Rodrigues et al., 2011).

Given the overlapping distribution of variation in morphological traits used to assign individuals of Conopholis to their respective species, combined with the recent molecular findings suggesting three distinct lineages, a morphometric study of this genus is necessary. The specific objectives are to (1) examine the patterns of morphological variation among Conopholis taxa, (2) conduct morphometric analyses, and (3) provide a taxonomic realignment for the genus. We present new morphological evidence to expand upon the previous molecular phylogenetic study, and based on these combined lines of evidence we provide a comprehensive systematic treatment for Conopholis.

3.3 Materials and Methods

3.3.1 Taxon Sampling

Approximately 600 Conopholis herbarium specimens from ARIZ, ASU, AUA, F, IEB, INBIO, MEXU, NMC, NY, RSA, TEX/LL, TRTE, UNM, US, and XAL were examined. Many of these specimens could not be included in this morphometric study owing primarily to their inappropriate ontogenetic stage (e.g., young emerging inflorescence that had not yet expanded, flowers in buds, late fruiting specimens) or their poor condition. In total, 105 individuals sampled from across the geographic range of the two currently recognized species of Conopholis (including the two varieties of C. alpina) were used in this study (Appendix 1). This sampling includes 27 individuals of C. americana, 40 individuals of C. alpina var. alpina, and 38 individuals of C. alpina var. mexicana. The initial names applied to these accessions follow the species delimitations by Haynes (1971), which emphasize the geographical distinctions between the species. Given the difficulty of distinguishing the two varieties of C. alpina at their parapatric boundaries along the TMVB, it was important to investigate multiple individuals from a single herbarium sheet, when available. Of all examined sheets attributed to C. alpina var. alpina, seven collections included two plants on a single sheet. This allowed for an attempt to assess variation within populations, assuming that the collected specimens represent different individuals and do not come from the same tubercle.


3.3.2 Morphology and Morphometric Analysis

States for seven characters derived from bract and calyx morphology were recorded, five qualitative and two quantitative (Table 1; Supplemental Appendix 1). These characters were chosen based on (1) primary differences noted in previous species descriptions (Beck, 1930; Small, 1933; Woodson and Seibert, 1938; Fernald, 1950; Gleason, 1952), (2) the fact that they were deemed most taxonomically useful in the last comprehensive monograph of the genus (Haynes, 1971), and (3) personal observations made during a pilot study. Descriptions and measurements are based on rehydrated herbarium material. Material was rehydrated, fixed in FAA, and then stored in 70% ethanol. The character states were recorded at two positions along the inflorescence of the specimens: the observations and measurements made at the ‘top’ were always located four to six bract positions below the youngest bract subtending a flower while those made at the ‘base’ were always from the first bract that subtends a flower found just above basal stem scales that do not surround a flower. These landmark locations, depicted in Fig. 2, were established to ensure that observations and measurements would be made at the same relative position across all specimens, regardless of their exact ontogenetic stage or environmental conditions. Quantitative characters (bract width and length) were measured from digitally acquired bract outlines and computer-based measurements using MorphoSys (Meacham and Duncan, 1991) and an image capture system based on the PCvisionplus framegrabber from Imaging Technology Inc., Woburn, Mass., U. S. A. Length was measured from the base of the bract to its apex, and width was measured at the widest point of the bract (always at the base of the bract; Fig. 2).

To assess overall morphological variation, the data were visualized with clustering and ordination methods implemented in R (R Core Team, 2012). We first calculated the pairwise dissimilarities (distance) between observations in the data using Gower’s coefficient (Gower, 1971; function daisy in the R package cluster). For binary characters, 0/0 matches were treated as negative matches. Gower’s coefficient was used because it allows for the combination of qualitative and quantitative data. Phenograms were then constructed using the unweighted pair-group method using arithmetic averages (UPGMA; Sneath and Sokal, 1973; function hclust in the R package vegan) on the Gower’s coefficient matrix. The cophenetic correlation coefficient was calculated to determine how well the hierarchical structure of the dendrogram represents the actual distances. Finally, we applied principal coordinate analysis (PCoA) to the distance matrix (function pcoa in the R package ape). This form of analysis is more appropriate than principal component analyses when there are missing values in the data matrix (Rohlf, 1972). This allowed us to include calyx morphology in the analysis despite its absence by the time an individual bears mature fruit. In this study, the distances amongst specimens are illustrated by plotting the first two principal coordinates. Both UPGMA and PCoA analyses were performed on observations and measurements made from the ‘top’ and ‘base’ along the inflorescence of the specimens, as indicated in Figure 2. Two sets of measurements were made and analyzed because (1) for a number of specimens we could not establish what the ‘base’ on the inflorescence was, (2) the ‘top’ data set had more observations than the ‘base’, and (3) we wanted to determine if the individuals would cluster in the same manner, regardless of where observations were made.
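The dissimilarity step described above can be illustrated with a minimal Python sketch. It is not the R code used in this study (which relied on the function daisy); the character names, the specimen values, and the three character kinds are invented for illustration. The sketch also assumes that treating 0/0 matches as negative matches means that a shared absence is left out of the comparison for that pair, and that a missing value (None) simply removes that character from the pair, which is what lets specimens lacking a calyx stay in the matrix.

```python
# Character kinds used in this sketch (hypothetical labels, not the thesis's coding).
QUANT, NOMINAL, ASYMM = "quantitative", "nominal", "asymmetric_binary"


def gower_pair(a, b, kinds, ranges):
    """Gower dissimilarity between two specimens; None marks a missing value."""
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # character not scored in one specimen: skip it for this pair
        if kind == ASYMM and x == 0 and y == 0:
            continue  # assumed handling: a shared absence (0/0) is not counted
        if kind == QUANT:
            total += abs(x - y) / rng if rng > 0 else 0.0  # range-scaled difference
        else:
            total += 0.0 if x == y else 1.0  # simple mismatch
        used += 1
    return total / used if used else float("nan")


def gower_matrix(rows, kinds):
    """Full symmetric dissimilarity matrix for a list of specimens."""
    ranges = []
    for j, kind in enumerate(kinds):
        vals = [r[j] for r in rows if r[j] is not None] if kind == QUANT else []
        ranges.append(max(vals) - min(vals) if vals else 0.0)
    return [[gower_pair(a, b, kinds, ranges) for b in rows] for a in rows]


# Invented specimens: (bract length, bract width, bract shape, margin pubescent).
kinds = (QUANT, QUANT, NOMINAL, ASYMM)
specimens = [
    (10.0, 4.0, "acute", 1),
    (12.0, 5.0, "acute", None),   # pubescence not scorable on this specimen
    (20.0, 8.0, "obtuse", 0),
]
D = gower_matrix(specimens, kinds)
for row in D:
    print([round(v, 3) for v in row])
```

Because each pair is averaged only over the characters scored in both specimens, the second specimen is compared on three characters rather than four, which is the property that makes this coefficient suitable for mixed and incomplete data.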

3.4 Results

The UPGMA cluster analysis using the Gower’s coefficient matrix produced from measurements obtained from the ‘top’ of the inflorescence shows a clear separation of three backbone clusters (A–C; Fig. 3). The majority of specimens from southwestern North America used in this study (C. alpina var. alpina and C. alpina var. mexicana) are found in two separate clusters (A and B). Cluster A contains individuals from the southwestern portion of the U. S. A. and throughout Mexico (C. alpina var. mexicana and C. alpina var. alpina). Cluster B comprises the lineage found in Costa Rica and Panama (C. alpina var. alpina). Cluster C, sister to B, contains all samples from eastern North America (C. americana) along with one individual identified a priori as C. alpina var. mexicana from Texas and nine accessions named a priori as C. alpina var. alpina from the southern Mexican states of Veracruz, Puebla, Distrito Federal, Michoacán, and Hidalgo. These ten samples of C. alpina are positioned within this predominantly eastern North American cluster C, instead of being more closely related to other specimens of C. alpina (clusters A and B), as would be expected based on traditional classification. The cophenetic correlation coefficient of the analysis was 0.92. The UPGMA cluster analysis performed on the ‘base’ observations and measurements produced a topology consistent with that from the ‘top’ UPGMA analysis, recovering identical backbone clusters (dendrogram not shown; cophenetic correlation coefficient of the analysis was 0.93).
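For readers unfamiliar with the cophenetic correlation coefficient reported above, the following Python sketch shows how such a value can be obtained. It is an illustration only, not the R workflow used in this study: the four-specimen distance matrix is invented, and the functions are hypothetical helpers written for this example. UPGMA repeatedly merges the two closest clusters, using a size-weighted average of distances; the height at which two specimens first join is their cophenetic distance, and the coefficient is the Pearson correlation between those heights and the original distances.

```python
import math
from itertools import combinations


def upgma_cophenetic(D):
    """Run UPGMA on a square distance matrix and return the cophenetic matrix."""
    n = len(D)
    clusters = {i: [i] for i in range(n)}          # cluster id -> member specimens
    dist = {frozenset((i, j)): D[i][j] for i, j in combinations(range(n), 2)}
    coph = [[0.0] * n for _ in range(n)]
    next_id = n
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)             # closest pair of clusters
        a, b = tuple(pair)
        height = dist.pop(pair)
        for i in clusters[a]:                      # members first joined at this height
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        na, nb = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)
        for c in clusters:                         # size-weighted average linkage
            dac = dist.pop(frozenset((a, c)))
            dbc = dist.pop(frozenset((b, c)))
            dist[frozenset((next_id, c))] = (na * dac + nb * dbc) / (na + nb)
        clusters[next_id] = merged
        next_id += 1
    return coph


def cophenetic_correlation(D, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(combinations(range(len(D)), 2))
    x = [D[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


# Invented dissimilarities among four specimens (two tight pairs).
D = [
    [0.0, 0.1, 0.6, 0.7],
    [0.1, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.2],
    [0.7, 0.8, 0.2, 0.0],
]
coph = upgma_cophenetic(D)
r = cophenetic_correlation(D, coph)
print(round(r, 3))
```

A value close to 1 indicates that the dendrogram distorts the original dissimilarities very little, which is how the coefficients of 0.92 and 0.93 reported here should be read.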

Both the ‘top’ and ‘base’ ordination analyses (PCoA) revealed three clearly separated clusters (A–C; Fig. 4). The composition of species and populations within each group was identical to that of the three clusters obtained by the UPGMA analysis described above. The first coordinate axis for each analysis separates clusters B and C from cluster A, while the second axis separates B from C. In both plots of Figure 4, individuals in cluster C marked by an arrowhead are outliers within this group. These individuals are from sympatric populations in southern Mexico where the two varieties of C. alpina overlap in distribution. Character states and measurements were recorded for two individuals per herbarium sheet. For these three particular populations, one individual from each herbarium sheet was found to group with cluster C while the other was found in cluster A.
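The ordination step can be sketched in a few lines. The example below is a minimal, dependency-free illustration of classical PCoA: squared distances are transformed and double-centred, and the first axis is taken from the dominant eigenvector, found here by power iteration. It is not the implementation used in this study. The function name and the toy distance matrix (three points on a line) are hypothetical, and the share of variation is computed against the trace, which is only approximate when a Gower matrix yields negative eigenvalues.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """Return (coordinates, share of total variation) for PCoA axis 1."""
    n = len(dist)
    # Gower's transformation a_ij = -1/2 * d_ij^2, followed by double-centring.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
         for i in range(n)]
    # Power iteration for the dominant eigenvector. The start vector must not
    # be constant, because constant vectors lie in the null space of b.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * bv[i] for i in range(n))
    if eigenvalue <= 0.0:
        raise ValueError("dominant eigenvalue is not positive")
    trace = sum(b[i][i] for i in range(n))  # equals the sum of all eigenvalues
    coords = [x * math.sqrt(eigenvalue) for x in v]
    return coords, eigenvalue / trace


# Hypothetical toy matrix: three points lying on a line at 0, 1 and 3.
toy = [[0.0, 1.0, 3.0],
       [1.0, 0.0, 2.0],
       [3.0, 2.0, 0.0]]
coords, share = pcoa_first_axis(toy)
print([round(c, 3) for c in coords], round(share, 3))
```

Because the toy points are collinear, a single axis reproduces all pairwise distances. With real specimen data, further axes would be extracted in the same way after removing the first.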

3.5 Discussion

This work represents the first fine-scale morphometric study of Conopholis. The clustering and ordination analyses performed in this study failed to reveal groupings corresponding to the subdivision of the genus into the two species recognized by Haynes (1971), C. americana and C. alpina. Instead, our results demonstrate the morphological differentiation that has occurred between the three lineages detected in our molecular study (Rodrigues et al., 2011). The clear morphological separation among the three clusters recovered here stands in contrast with the traditional classification (synopsis provided in Fig. 1). These new morphological results complement our molecular findings (Rodrigues et al., 2011) and reinforce the distinction of three species within Conopholis. Figure 5 summarizes our understanding of the circumscription of species and their relationships based on all available morphological and molecular data. This best estimate of phylogeny is also used to map morphological synapomorphies and autapomorphies as well as to illustrate the relationship between the current classification of the genus (Haynes, 1971) and the revised classification being proposed here (Fig. 5).

Multivariate analyses of morphological data delineated three separate and distinct clusters. Conopholis alpina, as defined traditionally, is shown to be polyphyletic as was the case with molecular data (Rodrigues et al., 2011). Its representatives belong to all three clusters (A, B, and C; Figs. 3, 4). Cluster A consists of all members of C. alpina var. mexicana found in the southwestern portion of the U. S. A. and north of the TMVB as well as several individuals from southern Mexico (C. alpina var. alpina in part). Members of group A can be identified by their acute bract that does not conceal the calyx, pubescence along the margin of bracts, and obtusely toothed calyx. This species definition encompasses descriptions previously put forth for C. alpina (Liebmann, 1847), C. sylvatica (Liebmann, 1847), and C. mexicana (Gray ex Watson, 1883). Described from Puebla, Mexico, C. alpina was deemed to be different from C. americana by its unibracteolate calyx, corolla that is twice as long as the calyx, lobes of its lower lip that are short with much exserted stamens, and styles hardly longer than the stamens. Conopholis sylvatica from Veracruz, Mexico, was described at the same time as C. alpina by Liebmann in 1847 and was defined as having a slender stem, small calyx, slender corolla that was twice as long as the calyx, and short, more obtuse lower lip. Conopholis mexicana was described from Coahuila, Mexico, and was said to differ from C. americana by its less deeply toothed calyx, larger corolla, and longer and more rigid, lanceolate and acuminate scales. No defining characters were indicated to distinguish it from either C. alpina or C. sylvatica. For all three of these species, the differences were only noted relative to C. americana and not to each other. Based on name precedence, the specific epithet to be applied to this lineage corresponds to C. alpina (Liebmann, 1847).

Cluster B consists of all individuals sampled from Costa Rica and Panama. Members of this group can be identified by their acute bract that conceals the calyx, lack of pubescence along the margin of bracts, and lobed calyx. This lineage corresponds to the previously described species, C. panamensis (Woodson, 1938) from Chiriqui, Panama. In its original description, C. panamensis was said to differ from both C. americana and C. mexicana by its shallow, broadly obtuse lobed calyx. The broad bracts of C. panamensis were similar to those of C. americana, while its loss of style in fruit resembled that of C. mexicana. For this lineage to correspond with the description of C. alpina var. alpina, it would have to contain not only individuals from Costa Rica and Panama, but also all individuals occurring in southern Mexico. However, nine individuals from southern Mexico are confined to cluster C, therefore rendering C. alpina var. alpina polyphyletic. Along with these disjunct individuals from southern Mexico, cluster C contains all accessions of C. americana from eastern North America (and one member of C. alpina var. mexicana from Texas; see below). Members of cluster C can be distinguished by their obtuse bract that conceals the calyx, lack of pubescence along the margin of bracts, and acutely toothed calyx.

In his description, Haynes (1971) viewed the eastern and western species as morphologically distinct, yet stated that “No single character can be relied upon to determine all specimens encountered…” (p. 252). Haynes (1971) saw this as challenging, but implied that any given specimen can be placed in the correct taxon when several morphological features are considered in combination with geographic distribution, an extrinsic character. In light of the current morphometric study and previous molecular work, the problem he encountered can be explained by the fact that C. alpina, as he defined the species, is polyphyletic. Some Conopholis populations found in Mexico are actually disjunct members of C. americana, and hence do not belong to C. alpina, as would be expected solely from their geographic distribution. The persistence of both continuous and disjunct species distributions between Mexico and eastern North America is not uncommon. Epifagus, the monotypic sister genus to Conopholis, also exhibits this east-west disjunction. Epifagus virginiana (L.) W. P. C. Barton is predominantly found across eastern North America, but it does have small disjunct populations in Mexico (Thieret, 1969; Tsai and Manos, 2010). Other examples of Mexican disjunct lineages include Liquidambar styraciflua L. (Graham, 1973; Morris et al., 2008), Nyssa sylvatica Marshall (Miranda and Sharp, 1950), Fagus grandifolia Ehrh. (Morris et al., 2010), and two members of the Corallorhiza striata species complex (C. bentleyi and C. striata var. involuta; Barrett and Freudenstein, 2009).

In addition to the delimitations of populations and lineages described above, this study revealed three unexpected features. The first is the anomalous presence of a single individual from Jeff Davis Co., Texas (C. alpina 1848190; Fig. 3), found to group with cluster C instead of with cluster A. This can be explained in one of two ways. It is possible that this unusual result stems from a herbarium sheet that was mislabeled for the sampling locality. However, a more likely alternative is that this individual comes from an as yet undocumented, disjunct population of C. americana in Texas. Namely, we observed another specimen (C. alpina 1679772) from the same herbarium (US), collected by the same individual (Sperry) who collected C. alpina 1848190, and at the same locality in Texas (Jeff Davis County), but three years earlier (1936). This specimen could not be included in the morphometric analyses because its deterioration prevented sampling at the landmark locations (‘top/base’) along the inflorescence, but it also appears to share general morphological features with C. americana. Taken together, these findings suggest the presence of another disjunct population of C. americana in southwestern Texas, analogous to those discovered in southern Mexico. All other samples in this study from Texas (and the rest of the southwestern U.S.A.) are found in cluster A.

Second, sampling of localities where the two varieties of C. alpina occur in sympatry (at and just south of the TMVB) indicates the presence of mixed populations, containing individuals from both clusters A and C (e.g., accessions C. alpina 1004045A and B, 559A and B, 4237A and B, and 34174B and C; Fig. 3). Referring in particular to these areas of overlap, Haynes (1971) stated that “some specimens cannot confidently be placed into either taxon and for this reason I consider these two taxa to be varieties of one species.” (p. 255). Members of populations that occur in southern Mexico, including the TMVB, are shown here to belong to two separate species, C. alpina and disjunct members of C. americana. Because Haynes (1971) considered geographic distribution to be of overwhelming importance, he did not consider C. americana as a possibility. Instead, he placed individuals from this region into one of the two varieties of C. alpina (var. alpina or var. mexicana).

Finally, two of these four mixed populations (i.e., C. alpina 559 and 4237) as well as one additional disjunct C. americana population/individual (specimen labelled as C. alpina 2321) exhibit mixed morphological characters, as evidenced by their outlier position within cluster C in the PCoA (Fig. 4; highlighted with arrowheads). These three samples have narrow bracts that do not entirely conceal their calyces, normally a diagnostic trait for C. alpina. Other than this feature, their remaining character states are all shared with C. americana. The discovery of populations from southern Mexico that contain two species and include some individuals with intermediate morphology suggests the existence of hybrid swarms in zones of overlap. Individuals that possess this intermediate morphology may also have been another reason why Haynes (1971) was not able to confidently assign them to a particular taxon. To confirm whether hybridization is occurring between C. alpina and C. americana, further investigations involving multiple single- or low-copy nuclear genes are required.

3.6 Taxonomic Treatment

CONOPHOLIS Wallroth, Carl Friedrich Wilhelm. Orobanches Gen. Diask. 78. 1825.—TYPE: Conopholis americana (L.) Wallr.

Low, glabrous, yellow, cream, yellow-brown, or brown simple herbs, fleshy at first but becoming brittle, flowering stems arising from a brown to black subterranean tubercle. Leaves scale-like, of 2 types, the lower very tightly imbricate and wide at base; the upper alternate, glandular pubescent or not along the margins, ovate to ovate-oblong or lanceolate to narrowly elongate triangular, widest at base, apex acute or obtuse. Inflorescence a compact raceme, each flower subtended by a bract, bract longer than the calyx and may or may not entirely conceal the calyx. Calyx irregular, tube cylindrical, 4- to 5-toothed or 2-lobed, teeth acute or obtuse to apiculate, lobes obtuse. Corolla cream colored, tubular, 2-lipped. Stamens 4, inserted above the ovary, exserted. Style apically reflexed, persistent with stigma on or deciduous from fruit. Fruit 2-halved, non-fleshy, brown to black capsule, ovoid, dehiscing regularly or irregularly. Seeds oval, triangular, rhomboidal, and quadrangular, brown to dark brown.

3.7 Key to species of Conopholis

1. Bracts narrow, not concealing the calyx; bract margin glandular pubescent; calyx toothed and teeth obtuse; plants of southwestern U. S. A. and Mexico ..... C. alpina

1. Bracts wide, concealing the calyx; bract margin glabrous; calyx either lobed or toothed; plants of eastern North America, southern Mexico, Costa Rica and Panama ..... 2

2. Bract tips acute; calyx lobed (not toothed and lobes rounded); plants of Costa Rica and Panama ..... C. panamensis

2. Bract tips obtuse; calyx toothed and teeth acute; plants of eastern North America and southern Mexico ..... C. americana
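The morphological leads of the key can be sketched as a small lookup, using the 0/1 coding of characters 1–5 from Table 3-1 and ignoring the geographic part of each lead. This is an illustrative sketch only: the function and field names are hypothetical, and the rule that a lead is followed only when every scored character agrees with it (NA characters being ignored) is an assumption made here, not a procedure stated in this study.

```python
# Leads of the key, written with the 0/1 coding of Table 3-1.
COUPLET_1 = [
    ({"conceals_calyx": 0, "margin_hairy": 1, "calyx_toothed": 1, "teeth_obtuse": 1},
     "C. alpina"),
    ({"conceals_calyx": 1, "margin_hairy": 0}, "couplet 2"),
]
COUPLET_2 = [
    ({"tip_obtuse": 0, "calyx_toothed": 0}, "C. panamensis"),
    ({"tip_obtuse": 1, "calyx_toothed": 1, "teeth_obtuse": 0}, "C. americana"),
]

# Order of characters 1-5 in Table 3-1.
NAMES = ["conceals_calyx", "calyx_toothed", "tip_obtuse", "teeth_obtuse", "margin_hairy"]


def from_scores(scores):
    """Map characters 1-5 of an appendix row onto named states (None = NA)."""
    return dict(zip(NAMES, scores))


def agrees(states, lead):
    """True if every scored (non-NA) character agrees with the lead."""
    return all(states.get(name) is None or states[name] == value
               for name, value in lead.items())


def follow(states, couplet):
    """Follow a couplet; assumption: exactly one lead must agree."""
    hits = [target for lead, target in couplet if agrees(states, lead)]
    return hits[0] if len(hits) == 1 else "undetermined"


def identify(states):
    result = follow(states, COUPLET_1)
    return follow(states, COUPLET_2) if result == "couplet 2" else result


# 'Top' character states 1-5 copied from Supplemental Appendix 1.
for label, scores in [("1820563", [1, 0, 0, None, 0]),
                      ("215767B", [0, 1, 0, 1, 1]),
                      ("SS0480", [1, 1, 1, 0, 0]),
                      ("2321A", [0, 1, 1, 0, 0])]:
    print(label, identify(from_scores(scores)))
```

Under this strict rule, accession 2321A, whose bract does not conceal the calyx but whose other states match C. americana, comes out as undetermined. Specimens of this kind need the remaining characters weighed together rather than a single lead.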

CONOPHOLIS AMERICANA (L.) Wallr., Orob. Gen. Diask. 78. 1825. Orobanche americana L. Mant. Pl. 88. 1767.—TYPE: U. S. A., Carolina. No date recorded. Anon., s. n. (lectotype: LINN scanned image!, designated by Haynes)

Conopholis alpina Liebm. var. alpina sensu R. R. Haynes pro parte (excluding type).

Stem erect, simple, glabrous, 6–20 cm tall; bracts glabrous along the margins, ovate to ovate-oblong, widest at the base, concealing calyx, 10.5–20 mm long, (2) 4–8 mm wide, apex obtuse; calyx irregular 4- to 5-toothed, tube cylindrical, teeth acute; corolla 8–14 mm long; filaments 6–10.5 mm long; anthers glabrous; style 5–13 mm long; capsule ovoid, 5–13 x 5.5–11 mm, style and stigma persistent; seeds irregularly oval, triangular, and quadrangular, 0.5–1.5 mm long.

Distribution and Ecology—Found parasitizing oaks (Quercus section Lobatae) in moist, deciduous, or mixed forests from central Florida west to Alabama, north to Wisconsin, and east to Nova Scotia, and in central and southern Mexican states. In the eastern U. S. A., flowering mid-February in the south to mid-June in the north. Flowering in central and southern Mexico April to late July.

CONOPHOLIS PANAMENSIS Woodson, Ann. Missouri Bot. Gard. 25: 835–836, Fig. 2. 1938.

Conopholis alpina var. alpina R. R. Haynes, SIDA 4 246–264. 1971. pro parte—TYPE: Panama, Chiriqui, Trail from Bambito to Cerro Punta, April 1937, P. H. Allen 305. (holotype: MO scanned image!; isotypes: F, MICH, MO, NY, US scanned images!).

Stem erect, simple, glabrous, 5–20 cm tall; bracts glabrous along the margins, ovate to ovate-oblong, widest at the base, concealing calyx, 11–22 mm long, 3.5–8.5 mm wide, apex acute; calyx irregular 2-lobed, tube cylindrical, lobes obtuse; corolla 12–16 mm long, filaments 10–15 mm long, anthers glabrous; style 10–14 mm long; capsule ovoid, 7–16 x 5.5–12 mm, style and stigma deciduous; seeds irregularly oval, triangular, and quadrangular, 0.3–1.5 mm long.

Distribution and Ecology—Found parasitizing oaks (Quercus spp.) in high elevation forests in Costa Rica and Panama. Flowering mid-December to May.

CONOPHOLIS ALPINA Liebm., Fohr, Skand. Naturf. Mode 4: 184. 1847.—TYPE: Mexico, Puebla, March 1841, F. M. Liebmann 3719 (lectotype: C; isolectotype: F scanned image!, designated by R. R. Haynes).

Conopholis sylvatica Liebm., Fohr, Skand. Naturf. Mode 4: 185. 1847.—TYPE: Mexico, Vera Cruz, Liebmann s.n. (holotype: illustration s. n. no date, Mexico (C))

Conopholis alpina Liebm. var. mexicana (A. Gray ex S. Watson) R. R. Haynes Sida 3(5) 347 1969. Conopholis mexicana A. Gray ex S. Watson, Proc. Amer. Acad. Arts 18: 131, 1883.—TYPE: Mexico, Coahuila, In the Sierra Madre, south of Saltillo, 1880, Palmer 996. (holotype: GH photographed image!, isotypes: F, NY, PH, US, VT, K scanned image!).

Conopholis alpina var. alpina sensu R. R. Haynes pro parte (excluding type)

Stem erect, simple, glabrous, 11–33 cm tall, bracts pubescent along margin, lanceolate or narrowly elongate triangular, widest at the base, not entirely concealing the calyx, 9–21 mm long, 2–5.5 mm wide, apex acute; calyx irregular 4- to 5-toothed, tube cylindrical, teeth less deeply toothed and obtuse to apiculate; corolla 14–20 mm long, filaments 7–12 mm long, anthers sparingly pilose; style 5–12 mm long; capsule ovoid, 8–15 x 6–12 mm, style and stigma deciduous; seeds irregularly oval, triangular, and quadrangular, 0.5–1.3 mm long.

Distribution and Ecology—Found parasitizing oaks (Quercus spp.) in oak woodlands and mixed montane forests in the Trans-Pecos area of Texas, through northern and central Arizona, and south to Oaxaca, Mexico. Flowering from mid-February to late-July.

3.8 Acknowledgements

The authors warmly thank the curators/directors of ARIZ, ASU, AUA, F, IEB, INBIO, MEXU, NMC, NY, RSA, TEX/LL, TRTE, UNM, US, and XAL for supplying plant material. We thank Alison Colwell for helpful discussions and two anonymous reviewers for providing comments that improved a previous version of the manuscript. Financial support from the Natural Sciences and Engineering Research Council of Canada Discovery grant (326439) and University of Toronto Connaught New Staff Matching grant to S. Stefanović is gratefully acknowledged. We also gratefully acknowledge a grant from the Royal Ontario Museum to Timothy A. Dickinson that made possible the purchase of the image analysis equipment used in data capture.

TABLE 3-1. Characters and character states used to make observations/measurements at the ‘top’ and ‘base’ of the plant (see Fig. 2) and used in morphometric analyses.

1. Bract shape (0 = does not conceal calyx; 1 = conceals calyx)
2. Calyx shape (0 = lobed; 1 = toothed)
3. Bract tip (0 = acute; 1 = obtuse)
4. Calyx tooth shape (0 = acute; 1 = obtuse)
5. Bract margin (0 = glabrous; 1 = with hair)
6. Bract width (cm)
7. Bract length (cm)

SUPPLEMENTAL APPENDIX 1. List of herbarium specimens examined for morphometric analyses of the genus Conopholis. Character states gathered at the ‘top’ and/or ‘base’ of each individual (see Fig. 2 for location and Table 1 for the names of characters 1–7 and character states) and the new determination following this study are provided. The accession label is the alphanumeric code used to designate the specimen in our study (e.g., in Fig. 3). Abbreviations of herbaria follow Index Herbariorum. These are the two data sets used in UPGMA and PCoA. The left-hand species column gives the a priori identification; the right-hand species column gives the new classification of individuals using the Taxonomic Key proposed in this study.

Characters taken from the ‘top’ Characters taken from the ‘base’

Species Accession Label 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Species

Conopholis alpina Liebm. 1820563;US 1 0 0 NA 0 0.653 2.172 1 0 0 NA 0 0.459 1.387 C. panamensis

1010416;US 1 0 0 NA 0 0.403 1.401 C. panamensis

2490023;US 1 0 0 NA 0 0.364 1.702 1 0 0 NA 0 0.35 1.479 C. panamensis

577561;US 1 NA 0 NA 0 0.572 1.692 1 NA 0 NA 0 0.735 1.715 C. panamensis

857;NY 1 NA 0 NA 0 0.585 1.769 C. panamensis

7506;NY 1 0 0 NA 0 0.861 1.133 1 0 0 NA 0 0.933 1.396 C. panamensis

1252575;US 1 0 0 NA 0 0.616 1.667 1 0 0 NA 0 0.514 1.82 C. panamensis

677512;US 1 0 0 NA 0 0.627 1.343 C. panamensis

1808147;US 1 0 0 NA 0 0.566 1.345 C. panamensis

1820828;US 1 0 0 NA 0 0.355 1.507 C. panamensis

3422;NY 1 NA 0 NA 0 0.641 1.583 1 NA 0 NA 0 0.377 1.622 C. panamensis

3604;NY 1 NA 0 NA 0 0.565 1.188 C. panamensis

3121;NY 1 NA 0 NA 0 0.387 1.094 1 NA 0 NA 0 0.37 1.346 C. panamensis

880;NY 1 NA 0 NA 0 0.678 1.594 1 NA 0 NA 0 0.665 2.175 C. panamensis

305;NY 1 NA 0 NA 0 0.574 1.258 1 NA 0 NA 0 0.47 1.751 C. panamensis

215767A;US NA 1 0 1 1 0.271 1.455 C. alpina

215767B;US 0 1 0 1 1 0.267 1.749 0 1 0 1 1 0.339 2.079 C. alpina

4676A;NY 0 NA 0 NA 0 0.46 1.577 0 NA 0 NA 0 0.405 1.572 C. alpina

2613C;NY 1 1 0 1 1 0.484 2.101 C. alpina

1424;NY NA 1 0 1 1 0.352 1.638 NA 1 0 1 1 0.44 1.318 C. alpina

28456B;NY 0 1 0 1 0 0.329 1.022 0 1 0 1 0 0.402 1.556 C. alpina

2613B;NY 0 1 0 NA 1 0.261 1.137 0 1 0 NA 1 0.217 1.297 C. alpina

2321A;NY 0 1 1 0 0 0.285 1.51 0 1 1 0 0 0.273 1.633 C. americana

2321B;NY 1 1 1 0 0 0.431 1.894 C. americana

34174B;NY 1 1 1 0 0 0.427 1.419 C. americana

34174C;NY 1 1 0 1 1 0.296 1.496 C. alpina

461750;US 0 1 0 1 1 0.412 1.737 0 1 0 1 1 0.493 1.835 C. alpina

1358;NY NA 1 0 0 1 0.237 1.402 C. alpina

840015;US 0 1 0 1 1 0.253 1.87 0 1 0 1 1 0.345 1.894 C. alpina

464217A;US 1 1 1 0 0 0.459 1.53 1 1 1 0 0 0.447 1.777 C. americana

2923296;US 1 1 1 0 0 0.517 1.411 1 1 1 0 0 0.542 1.675 C. americana

1003322A;US 1 1 1 0 0 0.544 1.73 1 1 1 0 0 0.482 1.945 C. americana

1004044B;US NA 1 0 1 1 0.371 1.432 C. alpina

1003324;US 0 1 0 NA 1 0.467 1.763 0 1 0 NA 1 0.394 1.725 C. alpina

1004045A;US 0 1 0 1 1 0.229 1.577 0 1 0 1 1 0.529 1.857 C. alpina

1004045B;US 1 1 1 0 0 0.402 1.466 1 1 1 0 0 0.371 1.728 C. americana

4237A;NY 0 1 1 0 0 0.199 1.552 C. americana

4237B;NY 0 1 0 1 1 0.249 1.295 0 1 0 1 1 0.526 1.675 C. alpina

559A;NY NA 1 1 0 0 0.331 1.509 C. americana

559B;NY 0 1 0 1 1 0.318 1.341 C. alpina

Conopholis alpina Liebm. var. mexicana (A. Gray ex S. Watson) R. R. Haynes 337652;ARIZ 1 1 0 1 1 0.478 1.653 C. alpina

21769;NY 0 1 0 1 1 0.357 0.938 0 1 0 1 1 0.438 1.111 C. alpina

8028;NY 0 1 0 0 1 0.427 1.416 0 1 0 0 1 0.381 1.618 C. alpina

313673;ARIZ 0 1 0 1 1 0.292 1.575 0 1 0 1 1 0.273 1.544 C. alpina

3194;NY 0 NA 0 NA 1 0.327 1.888 C. alpina

203;NY 1 1 0 1 1 0.335 1.394 1 1 0 1 1 0.555 1.702 C. alpina

98-628;NY 0 1 0 1 1 0.475 1.375 0 1 0 1 1 0.57 1.581 C. alpina

497910;US 0 1 0 1 1 0.427 1.473 C. alpina

495338;US 0 1 0 NA 1 0.208 1.176 C. alpina

34379;NY 0 NA 0 NA 1 0.289 1.404 C. alpina

693;NY 0 1 0 1 1 0.314 1.779 C. alpina

589;NY 0 1 0 1 0 0.417 1.183 0 1 0 1 0 0.543 1.404 C. alpina

244914;ARIZ NA 1 0 1 1 0.267 1.292 C. alpina

737255;US 0 1 0 1 1 0.501 1.293 0 1 0 1 1 0.331 1.678 C. alpina

662474;US 0 1 0 1 1 0.279 1.102 0 1 0 1 1 0.375 1.676 C. alpina

1735136;US 0 1 0 1 1 0.234 1.235 0 1 0 1 1 0.437 1.409 C. alpina

1221674;US 0 NA 0 NA 1 0.215 1.117 C. alpina

737065;US 0 1 0 0 1 0.373 1.293 0 1 0 0 1 0.484 1.55 C. alpina

1739221;US 0 1 0 1 1 0.249 0.996 C. alpina

1439044;US 0 1 0 0 1 0.243 0.913 0 1 0 0 1 0.357 1.047 C. alpina

1435056;US 0 1 0 NA 1 0.431 1.506 C. alpina

332;NY 0 1 0 1 1 0.195 1.325 C. alpina

1367618;US 0 NA 0 NA 1 0.325 1.379 C. alpina

1679772;US 0 1 0 1 1 0.564 1.466 0 1 0 1 1 0.713 1.707 C. alpina

1848190;US 1 1 1 0 0 0.303 1.441 1 1 1 0 0 0.505 1.38 C. americana

661869;US 0 1 0 1 1 0.347 1.26 0 1 0 1 1 0.266 1.346 C. alpina

1286291;US 0 1 0 1 1 0.291 1.227 0 1 0 1 1 0.441 1.61 C. alpina

10;NY 0 1 0 1 1 0.241 1.408 0 1 0 1 1 0.343 1.625 C. alpina

22105;NY 0 1 0 1 1 0.232 1.781 0 1 0 1 1 0.343 1.8 C. alpina

147282;ARIZ 0 1 0 1 1 0.395 1.691 0 1 0 1 1 0.454 1.922 C. alpina

007126;NMC 1 1 0 NA 1 0.55 1.652 C. alpina

190111;ARIZ 0 1 0 1 0 0.293 1.793 0 1 0 1 0 0.39 1.826 C. alpina

P6111;NY 1 1 0 1 1 0.408 1.416 1 1 0 1 1 0.469 1.95 C. alpina

16001;NY NA 1 0 1 1 0.28 1.388 NA 1 0 1 1 0.589 1.638 C. alpina

00105228A;TEX/LL 0 1 0 NA 1 0.285 1.248 0 1 0 NA 1 0.303 1.486 C. alpina

00105228B;TEX/LL 0 1 0 1 1 0.262 1.051 0 1 0 1 1 0.327 1.571 C. alpina

00105228C;TEX/LL 0 1 0 1 1 0.267 1.395 0 1 0 1 1 0.37 1.729 C. alpina

85;NY 1 1 0 1 1 0.231 1.626 C. alpina

Conopholis americana (L.) Wallr. SS0330;TRTE 1 1 1 0 0 0.523 1.053 C. americana

SS04102;TRTE 1 NA 1 NA 0 0.822 1.606 C. americana

SS0489;TRTE 1 1 1 0 0 0.639 1.143 C. americana

SS0480;TRTE 1 1 1 0 0 0.551 1.081 1 1 1 0 0 0.661 1.395 C. americana

SS0494;TRTE 1 1 1 0 0 0.668 1.538 1 1 1 0 0 0.681 1.763 C. americana

SS0331;TRTE 1 1 1 0 0 0.548 1.14 1 1 1 0 0 0.742 1.602 C. americana

SS0493;TRTE 1 1 1 0 0 0.716 1.717 1 1 1 0 0 0.663 1.814 C. americana

SS0311;TRTE 1 1 1 0 0 0.681 1.275 1 1 1 0 0 0.606 1.373 C. americana

SS0329;TRTE 1 1 1 0 0 0.619 1.182 1 1 1 0 0 0.581 1.265 C. americana

SS0472;TRTE 1 1 1 0 0 0.544 1.462 1 1 1 0 0 0.628 1.737 C. americana

SS0483;TRTE 1 1 1 0 0 0.719 1.395 1 1 1 0 0 0.679 1.738 C. americana

SS04109;TRTE 1 1 1 0 0 0.576 1.477 C. americana

SS06127;TRTE 1 1 1 0 0 0.748 1.578 C. americana

SS0932;TRTE 1 1 1 0 0 0.607 1.208 C. americana

SS06146A;TRTE 1 1 1 0 0 0.503 1.972 C. americana

SS06146B;TRTE 1 1 1 0 0 0.742 1.608 C. americana

SS1005;TRTE 1 1 1 1 0 0.585 1.303 C. americana

SS06170;TRTE 1 1 1 0 0 0.63 1.43 C. americana

SS0925;TRTE 1 NA 1 NA 0 0.53 1.37 C. americana

SS0471;TRTE 1 1 1 0 0 0.483 1.398 1 1 1 0 0 0.594 1.612 C. americana

SS0908;TRTE 1 1 1 0 0 1.017 1.904 C. americana

SS0931;TRTE 1 1 1 0 0 0.415 1.099 C. americana

SS05001B;TRTE 1 1 1 0 0 0.674 1.488 C. americana

SS05001E;TRTE 1 1 1 0 0 0.801 1.683 C. americana

SS06160A;TRTE 1 1 1 1 0 0.636 1.326 C. americana

SS06160B;TRTE 1 1 1 0 0 0.592 1.553 C. americana

SS06133A;TRTE 1 1 1 0 0 0.71 1.986 C. americana

APPENDIX 1. List of herbarium specimens examined for morphometric analyses of the genus Conopholis. Country, locality, collectors, and herbaria in which the specimens are deposited are provided for each individual. Entries use the following format: Species name Authority: accession label, voucher information (Herbarium acronym), locality. Accession labels are the unique alphanumeric codes applied to the specimens indicated on the dendrogram (see Fig. 3). Abbreviations of herbaria follow Index Herbariorum (Thiers 2012).

Conopholis alpina Liebm. var. alpina sensu R. R. Haynes: 1820563, Davidson 399 (US), Boquete, Chiriqui, Panama; 1010416, Killip 3605 (US), Potrero, Chiriqui, Panama; 2490023, Stern 2033 (US), Boquete, Chiriqui, Panama; 577561, Pittier 12212 (US), Santa Rosa, Costa Rica; 857, Utley 857 (NY), San Jose, Costa Rica; 7506, Burger 7506 (NY), Canaan, Costa Rica; 1252575, Standley 42022 (US), San Jose, Costa Rica; 677512, Pittier 3122 (US), Boquete, Chiriqui, Panama; 1808147, White 66 (US), Chiriqui, Panama; 1820828, Davidson 956 (US), Chiriqui, Panama; 3422, Aranda 3422 (NY), Jurutungo, Chiriqui, Panama; 3604, Killip 3604 (NY), Chiriqui, Panama; 3121, Pittier 3121 (NY), Boquete, Chiriqui, Panama; 880, Woodson 880 (NY), Casita Alta, Chiriqui, Panama; 305, Allen 305 (NY), Cerro Punta, Chiriqui, Panama; 215767A, Pringle 4676 (US), Sierra De San Felipe, Oaxaca, Mexico; 215767B, Pringle 4676 (US), Sierra De San Felipe, Oaxaca, Mexico; 4676A, Pringle 4677 (US), Sierra De San Felipe, Oaxaca, Mexico; 2613C, Camp 2613 (NY), Cerro de San Felipe, Oaxaca, Mexico; 1424, Mickel 1424 (NY), Ixtlan, Oaxaca, Mexico; 28456B, Matuda 28456 (NY), Oaxaca, Mexico; 2613B, Camp 2613 (NY), Cerro de San Felipe, Oaxaca, Mexico; 2321A, Fryxell 2321 (NY), Distrito Federal, Mexico; 2321B, Fryxell 2321 (NY), Distrito Federal, Mexico; 34174B, Davidse 34174 (NY), Distrito Federal, Mexico; 34174C, Davidse 34174 (NY), Distrito Federal, Mexico; 461750, Pringle 13153 (US), Distrito Federal, Mexico; 1358, Ventura 1358 (NY), Distrito Federal, Mexico; 840015, Brandegee 1851 (US), Santiago Tuxtla, Vera Cruz, Mexico; 464217A, Arsene 1062 (US), Puebla, Mexico; 2923296, Ventura 4913 (US), Vera Cruz, Mexico; 1003322A, Arsene 1062 (US), Puebla, Mexico; 1004044B, Arsene 1004044 (US), Puebla, Mexico; 1003324, Arsene 5229 (US), Morelia, Michoacan, Mexico; 1004045A, Nicolas 109 (US), Manzanilla, Puebla, Mexico; 1004045B, Nicolas 109 (US), Manzanilla, Puebla, Mexico; 4237A, Steinmann 4237 (NY), Tingambato, Michoacan, Mexico; 4237B, Steinmann 4237 (NY), Tingambato, Michoacan, Mexico; 559A, Galvan 559 (NY), Hidalgo, Mexico; 559B, Galvan 559 (NY), Hidalgo, Mexico.

Conopholis alpina Liebm. var. mexicana (A. Gray ex S. Watson) R. R. Haynes: 337652, Turner 97-90 (ARIZ), Hidalgo Co., New Mexico, USA; 21769, Correll 21769 (NY), Majalca, Chihuahua, Mexico; 8028, Spellenberg 8028 (NY), Ocampo, Chihuahua, Mexico; 313673, Laferriere 355 (ARIZ), Temosachi, Chihuahua, Mexico; 3194, Moore 3194 (NY), Brewster Co., Texas, USA; 203, Palmer 203 (NY), San Ramon, Durango, Mexico; 98-628, Van Devender 98-628 (NY), Yecora, Sonora, Mexico; 497910, Metcalfe 1022 (US), New Mexico, USA; 495338, Metcalfe 241 (US), Socorro County, New Mexico, USA; 34379, Palmer 34379 (NY), Jeff Davis Co., Texas, USA; 693, Parry 693 (NY), San Luis Potosi, Mexico; 589, Palmer 589 (NY), Sierra De Alvarez, San Luis Potosi, Mexico; 244914, Yatskievych 83-81 (ARIZ), Cola De Caballo, Nuevo Leon, Mexico; 737255, Herrick 262 (US), Albuquerque, New Mexico, USA; 662474, Ellis 48 (US), Bernalillo Co., New Mexico, USA; 1735136, Studhalter S3000 (US), Grant Co., New Mexico, USA; 1221674, Lee 161 (US), Hidalgo Co., New Mexico, USA; 737065, Wooton 737065 (US), Dona Ana County, New Mexico, USA; 1739221, Peebles 13272 (US), Gila County, Arizona, USA; 1439044, Peebles 5862 (US), Cochise Co., Arizona, USA; 1435056, Peebles 5387 (US), Cochise Co., Arizona, USA; 332, Rusby 332 (NY), Greenlee Co., Arizona, USA; 1367618, Peebles 4404 (US), Graham Co., Arizona, USA; 1679772, Warnock T97 (US), Jeff Davis Co., Texas, USA; 1848190, Sperry T744 (US), Jeff Davis Co., Texas, USA; 661869, Gooding 1048 (US), Cochise Co., Arizona, USA; 1286291, Orcutt 1085 (US), Jeff Davis Co., Texas, USA; 10, Fryxell 10 (NY), San Isidro, Nuevo Leon, Mexico; 22105, Henrickson 22105 (NY), Galeana, Nuevo Leon, Mexico; 147282, Pringle 13746 (ARIZ), Nuevo Leon, Mexico; 007126, Henrickson 16000 (NMC), Cuatro Cienegas, Coahuila, Mexico; 190111, Pinkava 10472 (ARIZ), Cuatro Cienegas, Coahuila, Mexico; P6111, Pinkava P-6111 (NY), Sierra De San Marcos, Coahuila, Mexico; 16001, Henrickson 16001 (NY), Canon Desiderio, Coahuila, Mexico; 00105228A, Johnston 10824 (TEX/LL), Coahuila, Mexico; 00105228B, Johnston 10824 (TEX/LL), Coahuila, Mexico; 00105228C, Johnston 10824 (TEX/LL), Coahuila, Mexico; 85, Palmer 85 (NY), Santiago Papasquiaro, Durango, Mexico.

Conopholis americana (L.) Wallr.: SS0330, Stefanović SS.03.30 (TRTE), Monroe Co., Indiana, USA; SS04102, Stefanović SS.04.102 (TRTE), Bloomington, Monroe Co., Indiana, USA; SS0489, Stefanović SS.04.89 (TRTE), Martin Co., Indiana, USA; SS0480, Stefanović SS.04.80 (TRTE), Lawrence Co., Indiana, USA; SS0494, Stefanović SS.04.94 (TRTE), Crawford Co., Indiana, USA; SS0331, Stefanović SS.03.31 (TRTE), Hickory Ridge Lookout, Monroe Co., Indiana, USA; SS0493, Stefanović SS.04.93 (TRTE), Crawford Co., Indiana, USA; SS0311, Stefanović SS.03.11 (TRTE), Gulf Bottom Trail, McCreary Co., Kentucky, USA; SS0329, Stefanović SS.03.29 (TRTE), Monroe Co., Indiana, USA; SS0472, Stefanović SS.04.72 (TRTE), Kanawha Co., West Virginia, USA; SS0483, Stefanović SS.04.83 (TRTE), German Ridge, Perry Co., Indiana, USA; SS04109, Stefanović SS.04.109 (TRTE), Bloomington, Monroe Co., Indiana, USA; SS06127, Stefanović SS.06.127 (TRTE), Sugarlands Valley, Blount Co., Tennessee, USA; SS0932, Stefanović SS.09.32 (TRTE), Muskegon Co., Michigan, USA; SS06146A, Stefanović SS.06.146A (TRTE), Swain Co., North Carolina, USA; SS06146B, Stefanović SS.06.146B (TRTE), Swain Co., North Carolina, USA; SS1005, Stefanović SS.10.05 (TRTE), Gatineau Park, Quebec, Canada; SS06170, Stefanović SS.06.170 (TRTE), Halton Co., Ontario, Canada; SS0925, Stefanović SS.09.25 (TRTE), Summit Co., Ontario, Canada; SS0471, Stefanović SS.04.71 (TRTE), Kanawha Co., West Virginia, USA; SS0908, Stefanović SS.09.08 (TRTE), Niagara Co., Ontario, Canada; SS0931, Stefanović SS.09.31 (TRTE), Allegan Co., Michigan, USA; SS05001B, Stefanović SS.05.001B (TRTE), Lee Co., Alabama, USA; SS05001E, Stefanović SS.05.001E (TRTE), Lee Co., Alabama, USA; SS06160A, Stefanović SS.06.160A (TRTE), Jackson Co., North Carolina, USA; SS06160B, Stefanović SS.06.160B (TRTE), Jackson Co., North Carolina, USA; SS06133A, Stefanović SS.06.133A (TRTE), Blount Co., Tennessee, USA.

FIGURE 3-1. A summary of the relationships between the various names applied to taxa in the genus Conopholis according to the various authors before 1971, by Haynes in his monograph in 1971, and by our revised classification following this morphometric study.

FIGURE 3-2. A composite sketch of a stylized Conopholis specimen (adapted from Haynes 1971). The two positions along the inflorescence from where the morphometric observations and measurements were taken are labeled ‘top’ and ‘base.’ Scale bars equal 1 cm.

FIGURE 3-3. Illustration of the congruence between morphological and molecular data. On the left is the phenogram resulting from the UPGMA analysis performed in this study using Gower’s coefficient matrix on the observations and measurements made from the top of the inflorescence on 103 specimens. Major clusters recovered and discussed in this study are labeled A–C. Species names are followed by their respective accession label/collector numbers and abbreviations of states/provinces in which they were collected (Appendix 1). Underlined is an individual with an anomalous position; see text for discussion. On the right is a summary phylogenetic tree showing the relationships among the three major lineages within Conopholis inferred from a combined analysis of plastid (trnfM-E and clpP) and nuclear (PHYA) sequences (adapted from Rodrigues et al. 2011, which used Epifagus as the outgroup). Bayesian posterior probabilities and parsimony bootstrap support values are indicated above and below the branches, respectively.

FIGURE 3-4. Principal coordinates analysis (PCoA) for the specimens of the genus Conopholis. A. Plot of the first two axes following analyses utilizing observations and measurements made from the ‘top’ of the inflorescence on 103 specimens. PCoA axes 1 and 2 explain 14.33% and 4.21% of the variation, respectively. B. Plot of the first two axes following analyses utilizing observations and measurements made from the ‘base’ of the inflorescence on 59 specimens. PCoA axes 1 and 2 explain 25.47% and 7.52% of the variation, respectively. Major clusters recovered and discussed in this study are encased by convex hulls and labeled A–C. Triangles represent individuals traditionally identified as C. americana, open circles represent those of C. alpina var. alpina, while closed circles are those of C. alpina var. mexicana. Arrowheads highlight outliers in cluster C (see text for discussion).

Figure 3-5

FIGURE 3-5. Stylized phylogenetic tree, based on morphological and molecular data, showing the relationships between the three proposed species of Conopholis. Character state transformations for the morphological characters examined in this study are indicated above branches. Characters and character states are listed in Table 1.

4 Development and characterization of polymorphic microsatellite markers for Conopholis americana (Orobanchaceae)

Other than thesis-specific formatting changes, this chapter was previously published as:

Rodrigues, A. G., A. E. L. Colwell, and S. Stefanović. 2012. Development and characterization of polymorphic microsatellite markers for Conopholis americana (Orobanchaceae). American Journal of Botany 99: e4-e6.

4.1 Abstract

Premise of Study: Conopholis americana (L.) Wallr. is an obligate root parasite with highly reduced morphology. To investigate population structure, genetic diversity, and mating system of this predominantly eastern North American species, we developed polymorphic microsatellite markers for C. americana.

Methods and Results: Using an enrichment cloning protocol, we report the isolation and characterization of 11 microsatellites. Product size varied from 198 – 370 bp. These loci show moderate levels of allelic variation (averaging 4.182 alleles per locus) and very low levels of heterozygosity (average observed heterozygosity = 0.054).

Conclusions: These microsatellite markers could be used to obtain estimates of population-level genetic diversity and in phylogeographic studies of C. americana.

4.2 Introduction

Members of the genus Conopholis are perennial achlorophyllous obligate root parasites that form haustorial connections to the vascular system of oaks (Kuijt, 1969). Conopholis americana (L.) Wallr. specifically parasitizes red oaks (Quercus section Lobatae) in moist, deciduous or mixed forests and is found across eastern North America, from Florida north to Nova Scotia, west to Wisconsin, and south to Alabama (Haynes, 1971). To date, very little is known of the relationships among populations and of the species' post-glacial history. The populations are best described as locally abundant but rare and isolated, often separated by kilometers of forest. These plants do not possess floral nectaries and are not known to produce a fragrance that would attract insect pollinators. Bagging experiments designed to investigate pollination by wind or insects suggest a predominantly selfing mode of pollination (Baird and Riopel, 1986b). In addition, studies of flowers postanthesis have found the anthers to be in physical contact with the stigma. Dispersal occurs either by the washing away of seeds following periods of rain or through the consumption of the inflorescence by mammals such as deer (Baird and Riopel, 1986b).

At the height of the last glacial maximum, ranges of many species in eastern North America, particularly those that currently occupy temperate habitats, were restricted south of the Laurentide ice sheet that dominated the northern part of the continent (see McLachlan et al., 2005 and references therein). With the retreat of the ice, populations of C. americana are likely to have migrated northward, together with its host, to their present-day ranges. The “Southern Refugia hypothesis” postulates higher diversity in the southern nonglaciated regions and loss of this diversity by populations moving northwards. As part of our broader research on systematics and evolutionary history of Conopholis (Rodrigues et al., 2011), in this study we isolate and characterize fast-evolving co-dominant microsatellite markers to gain a better understanding of the mating system, relationships among populations, and the amount and geographic distribution of genetic diversity for C. americana.

4.3 Methods and Results

Total genomic DNA was extracted from fresh or silica-dried material using a modified hexadecyltrimethylammonium bromide (CTAB) technique from Doyle and Doyle (1987) containing polyvinylpyrrolidone (PVP) in concentrations of 0.3–4%, used to bind and remove tannins and other secondary plant compounds (Palmer, 1986). All DNA extracted from multiple individuals per population was purified using Wizard minicolumns (Promega, Madison, Wisconsin, USA). A microsatellite-enriched C. americana genomic library was constructed according to the fast isolation by AFLP of sequences containing repeats (FIASCO) protocol by Zane et al. (2002). Briefly, DNA of C. americana was simultaneously digested with MseI and ligated to an MseI AFLP adaptor (5’-GACGATGAGTCCTGAG-3’). The digestion-ligation mixture was then diluted (1:10) and directly amplified with AFLP adaptor-specific primers. The products were denatured, hybridized to a biotinylated probe, (AC)17, and fragments containing microsatellite sequences were captured by streptavidin-coated magnetic beads (Promega, Madison, Wisconsin, USA). Nonspecific DNA was removed by three nonstringency washes and three stringency washes. DNA fragments were separated from the bead-probe complex by two denaturation steps (Elution 1 and Elution 2). The last nonstringency wash, the last stringency wash, and the two elution steps should harbour increasing proportions of repeat-enriched DNA fragments carrying the MseI-N primer target site at each end. Each of the four recovered fractions was amplified by PCR using the MseI-N primer. PCR products from Elution 2 were cloned using the TOPO-TA cloning kit (Invitrogen, Carlsbad, California, USA).

A total of 395 colonies were screened using the universal SP6/T7 primer combination. Colonies producing insert sizes larger than 300 bp were sequenced using the DYEnamic ET dye terminator sequencing kit (GE Healthcare, Baie-d’Urfé, Québec, Canada) on an Applied Biosystems model 377 automated DNA sequencer (PE Biosystems, Foster City, California, USA).

Sixty-two sequences contained six or more dinucleotide repeats with sufficient flanking regions within which primers could be designed. Each candidate microsatellite locus was tested in five individuals from five different populations for amplification and polymorphisms. Amplifications were performed in 25 µL reactions containing 1X PCR buffer, 2.5 mM MgCl2, 0.2 mM dNTPs (Invitrogen, Burlington, Ontario, Canada), 0.1 µL forward and reverse primers, 0.5 µL DMSO, and 0.1 U JumpStart Taq DNA polymerase (Sigma-Aldrich, Oakville, Ontario, Canada). PCR conditions were 94°C for 4 min followed by 36 cycles of 20 s at 93°C, 50 s at specific annealing temperature, 50 s extension at 70°C, with a final extension step of 30 min at 72°C. Amplicons of expected length were purified and sequenced (same as above). Labeled primers (forward or reverse, labeled HEX or FAM; see Table 1) were ordered for 11 loci with variable repeat numbers.
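The screening criterion described above (six or more dinucleotide repeats, with enough flanking sequence on both sides for primer design) can be sketched in Python. This is only an illustration: the function name, the 20 bp flank threshold, and the toy sequence are assumptions and are not taken from the original protocol, which does not state how "sufficient" flanking sequence was defined.

```python
import re


def find_dinucleotide_repeats(seq, min_repeats=6, min_flank=20):
    """Return (motif, repeat_count, start, end) for each perfect dinucleotide repeat.

    A repeat qualifies if its two-base unit (two different bases) occurs at
    least `min_repeats` times in a row and at least `min_flank` bases remain
    on each side. `min_flank` is a hypothetical threshold for illustration.
    """
    seq = seq.upper()
    # Group 1 is the two-base unit; (?!\2) forbids units such as "AA".
    pattern = re.compile(r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        start, end = match.start(), match.end()
        if start >= min_flank and len(seq) - end >= min_flank:
            hits.append((match.group(1), (end - start) // 2, start, end))
    return hits


# Toy clone insert: 20 bp flank, (GA)10, 20 bp flank (invented sequence).
left_flank = "ACGTTGCAAGCTTAGCCATC"
right_flank = "CTTGACCATAGCTAGGCTAA"
print(find_dinucleotide_repeats(left_flank + "GA" * 10 + right_flank))
```

Only perfect repeats are detected here; compound or interrupted motifs such as those listed in Table 4-1 would be reported as separate runs.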

These 11 primer pairs revealing a polymorphism in two or more individuals were chosen for a larger screening of 72 individuals from 11 populations of C. americana (Table 1 and Appendix 1) and amplified separately under the same optimal conditions described previously. Fragments were also genotyped separately (by locus) on an Applied Biosystems 3730xl DNA Analyzer (Applied Biosystems, Foster City, California, USA) at The Centre for Applied Genomics at The Hospital for Sick Children (Toronto, Canada). Allele sizes were initially estimated using GENEMAPPER (Applied Biosystems), but all electropherograms were examined manually before assigning final genotypes.

Characteristics of these 11 polymorphic loci are summarized in Tables 1 and 2. Exact tests for Hardy-Weinberg equilibrium were performed in GENEPOP 4.0.10 (Raymond and Rousset, 1995; Rousset, 2008). The computer programs ARLEQUIN (Excoffier et al., 2005) and FSTAT version 2.9.3.2 (Goudet, 1995) were used to test for linkage disequilibrium among loci within C. americana. The number of alleles as well as observed and expected heterozygosity were estimated using the software GDA (Lewis and Zaykin, 2001). At the population level, the mean number of alleles per locus was 1.339, whereas an average of 4.182 alleles per locus was observed overall. A significant deviation from linkage equilibrium was observed for the pairwise locus combination SSR9–SSR10. The microsatellite loci SSR6 and SSR33 were the most polymorphic, with 7 and 8 alleles, respectively. Expected heterozygosity ranged from 0.158 to 0.778, while observed heterozygosity ranged from 0 to 0.286. These low levels of observed heterozygosity in C. americana are expected given the life history of the genus and would suggest a selfing mode of pollination, as proposed by Baird and Riopel (1986b).
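To make the heterozygosity statistics concrete, the sketch below computes the number of alleles, observed heterozygosity (Ho), and expected heterozygosity (He) for a single locus from diploid genotypes. It is a generic illustration rather than the GDA implementation: it assumes Nei's unbiased estimator for He, and the function name and example genotypes are invented.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Summarize one locus from a list of (allele1, allele2) genotypes.

    Returns (number_of_alleles, Ho, He). He uses Nei's unbiased estimator,
    2n / (2n - 1) * (1 - sum(p_i ** 2)); other software may differ in detail.
    """
    n = len(genotypes)
    if n < 1:
        raise ValueError("need at least one genotype")
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    gene_copies = 2 * n
    ho = sum(1 for a, b in genotypes if a != b) / n
    sum_p_squared = sum((c / gene_copies) ** 2 for c in allele_counts.values())
    he = gene_copies / (gene_copies - 1) * (1.0 - sum_p_squared)
    return len(allele_counts), ho, he


# Invented genotypes (allele sizes in bp): seven homozygotes, one heterozygote.
example = [(198, 198)] * 4 + [(202, 202)] * 3 + [(198, 202)]
n_alleles, ho, he = heterozygosity(example)
print(n_alleles, round(ho, 3), round(he, 3))
```

A large gap between Ho and He, as in this toy example, is the pattern expected under predominant selfing.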

4.4 Conclusion

We developed 11 microsatellite loci that show variability at the population level in Conopholis americana. These are the first microsatellite DNA markers for the genus, and will function as prime tools to quantify levels of genetic variation and patterns of population structure in this holoparasitic species. Although yet to be tested, the microsatellite markers described here are also likely to be extendable to the other two species within this genus, making them useful not only to draw conclusions about mating systems and postglacial migration of C. americana in eastern North America, but also for population, conservation, systematic, and phylogeographic studies of other Conopholis species.

4.5 Acknowledgements

The authors thank two anonymous reviewers for critical comments on the manuscript. Financial support from the Natural Sciences and Engineering Research Council of Canada Discovery grant (326439) and University of Toronto Connaught New Staff Matching grant to S. Stefanović are gratefully acknowledged.

Table 4-1

Table 4-1. Characteristics of 11 microsatellite loci developed for Conopholis americana (n = 72 individuals) including forward (F) and reverse (R) primer sequences, repeat type, allele size ranges, annealing temperatures (Ta), and GenBank accession numbers.

Locus  Primer sequences (5′–3′)        Repeat motif     Expected size (bp)  Ta (°C)  GenBank accession no.
SSR6   F: TGAAACATCGAACATGTGTGT        (GT)13(GA)20     245                 59       JN050982
       **R: CCTCAAGCGACACATAGAGC
SSR10  *F: TTCGCACCATAGATCTTGACC       (GA)10C(AG)15    257                 59       JN050983
       R: TCCCCTTATGATTTAGATTGAATTG
SSR27  F: CCAAATTCGACAATCTAAAACA       (CA)13           249                 61       JN050984
       *R: AGCCTCATTTCAGCCCTTAC
SSR33  **F: ATTCTGAGTCCGTACAATCCTC     (GA)18           362                 57       JN050985
       R: GCTAAAATTTCTCTCTCGTCTTG
SSR9   **F: GAACTCCCCTTATGATTTAGATTGA  (CT)14G(TC)10    227                 59       JN050986
       R: ATAAGACCTTGAGGCTGCTG
SSR22  *F: GAAGAGAGGGTGCGAAGAA         (AG)10           244                 56       JN050987
       R: AACTTCTTTCTTTCTCTTGATTCC
SSR49  F: TGGATGTTGAGTTATCTGTTCA       (GT)20...(AG)18  199                 55.4     JN050988
       *R: CCACCAAGCACTTTTTATCA
SSR42  **F: GCGCGCTTTTTAGAACACT        (GA)10…(GA)12    265                 55.4     JN050989
       R: AAGACAAGCCCTAGAATGGA
SSR43  **F: GGAGATCTATAACGGGGTTG       (TC)13           339                 55.4     JN050990
       R: GCCGATAACCAGACCATTAG
SSR56  *F: TGAGTCGAGTCGATTTACCA        (GT)7            198                 63       JN050991
       R: GACGGTGGCTCTGTAACTCT
SSR51  *F: CATACCCAAAAACCCTTTCA        (GT)10           242                 61       JN050992
       R: ACCCTCACAAACCGACACAT

Note: in column Primer sequences (5′–3′), * denotes FAM-labeled primer, ** denotes HEX-labeled primer.

Table 4-2

Table 4-2. Results of the initial primer screening in populations of Conopholis americana. The number of alleles (A), observed heterozygosity (Ho), and expected heterozygosity (He) are shown per locus.

Locus  No. of alleles (A)  Observed heterozygosity (Ho)  Expected heterozygosity (He)
SSR6   7                   0.167                         0.680
SSR10  4                   0.000                         0.356
SSR27  5                   0.000                         0.667
SSR33  8                   0.113                         0.778
SSR9   4                   0.028                         0.378
SSR22  3                   0.000                         0.287
SSR49  6                   0.286                         0.426
SSR42  3                   0.000                         0.213
SSR43  2                   0.000                         0.293
SSR56  2                   0.000                         0.179
SSR51  2                   0.000                         0.158
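The summary figures quoted in the abstract and text of this chapter (an average of 4.182 alleles per locus and an average observed heterozygosity of 0.054) can be checked directly against the per-locus values in Table 4-2. The short script below does this arithmetic; the variable names are ours, and the values are transcribed from the table.

```python
# Per-locus values transcribed from Table 4-2: (alleles, Ho, He).
table_4_2 = {
    "SSR6": (7, 0.167, 0.680),
    "SSR10": (4, 0.000, 0.356),
    "SSR27": (5, 0.000, 0.667),
    "SSR33": (8, 0.113, 0.778),
    "SSR9": (4, 0.028, 0.378),
    "SSR22": (3, 0.000, 0.287),
    "SSR49": (6, 0.286, 0.426),
    "SSR42": (3, 0.000, 0.213),
    "SSR43": (2, 0.000, 0.293),
    "SSR56": (2, 0.000, 0.179),
    "SSR51": (2, 0.000, 0.158),
}

n_loci = len(table_4_2)
mean_alleles = sum(a for a, _, _ in table_4_2.values()) / n_loci
mean_ho = sum(ho for _, ho, _ in table_4_2.values()) / n_loci
he_values = [he for _, _, he in table_4_2.values()]

print(round(mean_alleles, 3))            # 4.182
print(round(mean_ho, 3))                 # 0.054
print(min(he_values), max(he_values))    # 0.158 0.778
```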

Table 4-3 Appendix 1

Appendix 1. DNA accession numbers, locality from where specimens were collected and geographic coordinates for sequences of Conopholis americana used in this study.

DNA accession  No. of individuals  Locality(a)                                          Geographic coordinates(b)
AC.FL.JS       2                   Marion Co. Florida, USA                              29°10′ N 81°42′ W
AC.NC.L        10                  Madison Co. North Carolina, USA                      35°44′ N 82°51′ W
AC.IN.CCF      7                   Clarke Co. Indiana, USA                              38°32′ N 85°49′ W
AC.IL.KSP      9                   Vermillion Co., Illinois, USA                        40°07′ N 87°44′ W
AC.PA.MSF      9                   Franklin Co., Pennsylvania, USA                      39°55′ N 77°26′ W
AC.MA.MN       14                  Hampshire Co., Massachusetts, USA                    42°18′ N 72°30′ W
SS.05.79       3                   Huntsville, Madison Co., Alabama, USA                34°43′ N 86°35′ W
SS.09.37       2                   Cheboygan Co., Michigan, USA                         45°33′ N 84°40′ W
AC.FL.SF       5                   Alachua Co., Florida, USA                            29°44′ N 82°26′ W
AC.WI.DL       6                   Devil’s Lake State Park, Sauk Co., Wisconsin, USA    43°24′ N 89°42′ W
AC.VA.TRP      5                   Fairfax Co., Virginia, USA                           38°57′ N 77°09′ W

Note: in column DNA accession: SS, Saša Stefanović; AC, Alison Colwell.
(a) Geographic areas where the specimens were collected.
(b) Approximate geographic coordinates for the localities from which the specimens were obtained.

5 Present-day genetic structure of the holoparasite Conopholis americana (Orobanchaceae) in eastern North America and location of its refugia during the last glacial cycle

5.1 Abstract

Aim Understanding how various organisms responded to past climate changes could provide insight into how they may respond to current or future changes. Conopholis americana has a broad distribution across eastern North America, covering both previously glaciated and unglaciated regions. In this study, we investigated the post-glacial history and phylogeographic structure of this parasitic plant species to characterize its genetic variation and structure and to identify the number and locations of past refugial areas.

Location Eastern North America

Methods Molecular data from 10 microsatellite markers and DNA sequences from the plastid gene/intron (clpP) were collected for 281 individuals sampled from 75 populations spanning the current range of the species and analyzed using a variety of phylogeographic methods. Distribution modelling was carried out to determine regions with relatively suitable climatic niches for populations existing at the Last Glacial Maximum (LGM) and current populations.

Results We inferred the persistence of a minimum of two glacial refugia for C. americana at the LGM, one in Florida and southern Alabama and another in the Appalachian Mountains near the southern tip of Blue Ridge Mountains. High levels of genetic diversity were observed across the southern Appalachian Mountains, the region where populations from two refugia come together following re-colonization northward.

Main Conclusions The genetic and geographic patterns revealed by our results provide further evidence of the dynamic nature and phylogeographical history of eastern North American taxa. The recovery of a distinct southern lineage is in agreement with the location of a previously proposed southern glacial refugium spanning across Florida, southern Georgia and Alabama, and the Lower Mississippi Valley. The second lineage is dominant across the present northern range of the species and is hypothesized to have been located in the southern extent of the Blue Ridge Mountain Range of the Appalachian Mountains at the LGM.

5.2 Introduction

The Pleistocene epoch (approximately 2.6 – 0.01 MYA) was a time of great climate change that consisted of long glacial periods separated by shorter warm interglacial periods. Climate conditions during this time were highly variable, with more than 20 glacial cycles recorded, resulting in major alteration to the landscape (Pielou, 1991; Williams et al., 1998; Hewitt, 2000).

During the last glacial maximum (LGM; approximately 20–18 kya) at the end of the Wisconsin glaciation, the Laurentide ice sheet covered most of North America (Canada and the northern USA). In eastern North America, the ice margin extended south to an area comprising today the states of New York, Pennsylvania, and Ohio, and covered all of the Great Lakes. Permafrost and tundra continued even further south beyond the leading edge of the ice sheet. Due to the extreme climate during this time, the distribution of plant species occurring south of the ice sheet would have been greatly different from what we see today. As a result of the great expanse of the Laurentide ice sheet at the LGM, most species experienced a reduction or fragmentation in their habitat and population size, and would have been confined to southern ice-free refugia, thus escaping the harsh environment that made much of North America uninhabitable (Hewitt, 1996).

For plants, the fossil record and pollen data suggest that as the ice sheet advanced, the deciduous forest in this region retreated south of 33° N, spanning across the region that today comprises the states of Florida and Georgia as well as the Lower Mississippi Valley (Davis, 1981; Delcourt and Delcourt, 1993; Jackson et al., 2000; Jackson and Overpeck, 2000; Soltis et al., 2006). As temperatures rose and the glaciers receded at the end of the LGM, populations were able to expand their geographic distributions northward and re-colonize new areas that became suitable (Pielou, 1991). With the migration of populations northward from southern refugia, genetic patterns become evident, following a “southern richness to northern purity” scenario (Hewitt, 2000). This hypothesis posits that higher genetic diversity should be found in populations that now occupy southern, previously non-glaciated regions and predicts the loss/reduction in diversity by those populations moving northwards along the axes of recolonization. In North America, this is supported by phylogeographic studies that have revealed a lower genetic diversity in northern populations compared to those from the south (Hewitt, 1996; McLachlan et al., 2005; Hewitt, 2004).

Recently, phylogeographic studies have also shown that more northern, smaller, cryptic refugia may have existed (Stewart and Lister, 2001; Jaramillo-Correa et al., 2004; Godbout et al., 2005; McLachlan et al., 2005). These regions would have been located in ice-free areas that persisted near the ice margin as well as on the peaks of mountains protruding through the ice sheet (nunataks). These findings add to the complex history of postglacial colonization following the LGM. They affect our interpretation of how plants respond to changes in climate and whether distributional ranges can be better explained by range expansions from more northern-located refugia or by long distance dispersal events from the southern refugia. To discern among these hypotheses and to study patterns and tempo of post-glacial history in eastern North America, species with broad distributions are needed, spanning both previously glaciated and unglaciated regions.

In a recent molecular phylogenetic study of the North American holoparasitic genus Conopholis (Orobanchaceae; Rodrigues et al., 2011), C. americana (L.) Wallr. was identified as one of its three major lineages and subsequently was confirmed as a distinct species by comprehensive morphometric analyses (Rodrigues et al., 2013). Its present-day populations span the locations of traditional as well as potential cryptic glacial refugia along with several known barriers to plant movement on the continent (Soltis et al., 2006). Conopholis americana is primarily distributed throughout the eastern USA and adjacent Canada, from Nova Scotia to Wisconsin in the north and from Florida to Alabama in the south, with some of its populations found in southern Mexico, as disjunct members of this species (Rodrigues et al., 2011, 2013). In eastern North America, these plants are found in moist, deciduous, or mixed forests attached to the roots of red oaks (Quercus section Lobatae) via haustoria. Also, plants such as Conopholis that are self-fertilizing (Baird and Riopel, 1986b) and have relatively limited dispersal ability, especially given the reliance on their host for survival, represent excellent model systems that can be used to identify the locations of northern refugia. Namely, such species cannot rely primarily on long distance dispersal to explain how they expanded to occupy their current geographic range, but have instead likely existed in small populations closer to the ice margin from which they could expand their range following the retreat of the glaciers.

The overarching goal of this study was to investigate the glacial history of Conopholis americana in eastern North America. Our specific objectives were to employ phylogeographic analyses using plastid and nuclear markers along with species distributional modelling to (1) determine the genetic variation across the range of C. americana in eastern North America, (2) quantify the phylogeographic structure, (3) identify refugial locations and recolonization history, (4) attempt to shed further light on its breeding system, and (5) use the results as a proxy for the host range expansion (red oaks).

5.3 Materials and Methods

5.3.1 Taxon Sampling and DNA Extraction

A total of 281 individuals from 75 populations were used in this study, covering essentially the entire range of Conopholis americana in eastern North America (e.g., Fig. 1b). A complete list of collecting locations and sample sizes is provided in Table 1. Total genomic DNA was extracted from fresh or silica-dried material and purified as described in Rodrigues et al. (2011).

5.3.2 Plastid clpP Sequencing

Because the plastid genome is non-recombinant and usually only maternally inherited (Reboud and Zeyl, 1994), it can be used to identify the genetic signature of maternal lineages. Also, owing to relatively low mutation rates observed in these genomes (Wolfe et al., 1987), the majority of alleles (haplotypes) recovered are the genealogical derivatives of distinct lineages that predate postglacial colonization (McLachlan et al., 2005). Therefore, the modern geographic distribution of plastome haplotypes is expected to correspond largely to the migration routes of expanding populations from glacial refugia, and a direct mutational relationship among the haplotypes can be detected and traced. To assess haplotype diversity of C. americana, we targeted the plastid clpP gene and its introns. PCR reactions, amplicon purification, and sequencing for all 281 sampled individuals were carried out as described in our phylogenetic study (Rodrigues et al., 2011). Newly generated sequences were deposited in GenBank under accession numbers X - Y.

5.3.3 Microsatellite Genotyping

It has become increasingly clear that conclusions drawn from results based solely on a single non-recombining region/gene of the plastid genome can potentially be misleading (Schaal et al., 1998; Brito and Edwards, 2009). To make more reliable inferences of both past population history and present population structure, phylogeographic studies are moving towards using markers from the organellar genomes along with multiple unlinked nuclear markers in combination with species distribution modelling. The same 281 individuals (Table 1) were genotyped for ten unlinked microsatellite loci developed and characterized for C. americana (SSR6, SSR9, SSR22, SSR27, SSR33, SSR42, SSR43, SSR49, SSR51, SSR56), following the methods for amplification and scoring detailed in Rodrigues et al. (2012).

5.3.4 Analyses of Population Diversity and Structure

Plastid sequences were aligned manually in Se-Al version 2.0a11 (Rambaut, 2002). Gaps in the alignment were treated as missing data. However, indels were coded and binary codes appended to the nucleotide sequences. A statistical parsimony haplotype network was constructed using TCS version 1.21 (Clement et al., 2000). Support for relationships among major lineages was inferred from nonparametric bootstrapping (Felsenstein, 1985) implemented in PAUP* version 4.0b10 (Swofford, 2002) using 500 pseudoreplicates, each with 20 random sequence addition cycles, TBR branch swapping, and MULTREES option off (DeBry and Olmstead, 2000).
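The indel-coding step mentioned above can be pictured as follows: every distinct gap in the alignment, defined by its start and end columns, becomes one binary presence/absence character that is appended to the nucleotide sequence. The sketch below implements this simple scheme in Python. Whether the original matrix was coded in exactly this way is an assumption, and the function name and toy alignment are invented for illustration.

```python
import re


def code_indels(alignment):
    """Append one binary character per distinct gap to each aligned sequence.

    alignment: dict mapping sequence name -> aligned sequence ('-' = gap).
    Returns (gaps, coded), where `gaps` is the sorted list of distinct
    (start, end) gap intervals and `coded` maps each name to the original
    sequence followed by '1' (gap present) or '0' (gap absent) per interval.
    """
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    gaps_per_seq = {
        name: {(m.start(), m.end()) for m in re.finditer(r"-+", seq)}
        for name, seq in alignment.items()
    }
    gaps = sorted(set().union(*gaps_per_seq.values()))
    coded = {
        name: seq + "".join("1" if g in gaps_per_seq[name] else "0" for g in gaps)
        for name, seq in alignment.items()
    }
    return gaps, coded


# Toy alignment with two distinct indels (invented data).
toy = {"hapA": "ACGTACGT", "hapB": "AC--ACGT", "hapC": "AC--AC-T"}
gaps, coded = code_indels(toy)
for name, seq in coded.items():
    print(name, seq)
```

In this simplified version, overlapping gaps of different extent are treated as independent characters; more elaborate coding schemes handle such cases differently.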

The structure of the nuclear microsatellite data was explored using the Bayesian approach implemented in BAPS version 5.3 (Bayesian Analysis of Population Structure; Corander et al., 2003). BAPS identifies clusters of genetically similar populations that have restricted gene flow between them. In preliminary analyses, ten replicates were run for all possible number of clusters (K) up to a maximum of 75, the number of populations sampled in our study. We found that 23-25 clusters were supported by the data, with 25 being the optimal number of clusters. Therefore, the final runs were performed with the maximum possible number of clusters set to 35. To infer the relationships between the optimal number of clusters recovered, the distances between clusters obtained in BAPS were used to produce a Neighbour-Joining (NJ) tree in PAUP* version 4.0b10 (Swofford, 2002).

To test the significance of both the clusters recovered using TCS for the plastid sequenced data and BAPS for the microsatellite genotyped data, we performed analysis of molecular variance (AMOVA) implemented in Arlequin version 3.5.1.2 (Excoffier and Lischer, 2010). Each data set was partitioned into the groups recovered by each analysis (a two-by-two factorial design), resulting in a total of four analyses as follows: (1) plastid sequenced data partitioned by TCS major groups; (2) plastid sequenced data partitioned according to major BAPS lineages; (3) microsatellite genotyped data grouped by TCS major groups; and (4) microsatellite genotyped data grouped according to major BAPS lineages. In addition to partitioning the data in this manner, we also divided both data sets according to geographic barriers: (1) populations currently found north versus south of where the ice margin was located at the LGM; and (2) populations found north, on, and south of the Appalachian mountain range.
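For readers less familiar with AMOVA output, the fixation index FCT reported in the Results is the proportion of the total variance that lies among groups. The sketch below shows how the three hierarchical fixation indices follow from the variance components of a standard three-level AMOVA. It is a generic illustration, not Arlequin's code; the function name is ours and the variance components in the example are invented.

```python
def amova_fixation_indices(var_among_groups, var_among_pops, var_within_pops):
    """Compute (FCT, FSC, FST) from three-level AMOVA variance components.

    var_among_groups: variance among groups
    var_among_pops:   variance among populations within groups
    var_within_pops:  variance within populations
    """
    total = var_among_groups + var_among_pops + var_within_pops
    within_groups = var_among_pops + var_within_pops
    if total <= 0 or within_groups <= 0:
        raise ValueError("variance components must sum to a positive value")
    f_ct = var_among_groups / total                      # among groups
    f_sc = var_among_pops / within_groups                # among pops within groups
    f_st = (var_among_groups + var_among_pops) / total   # among all populations
    return f_ct, f_sc, f_st


# Invented variance components, for illustration only.
f_ct, f_sc, f_st = amova_fixation_indices(2.0, 1.0, 7.0)
print(round(f_ct, 3), round(f_sc, 3), round(f_st, 3))
```

The three indices are linked by the identity (1 − FST) = (1 − FSC)(1 − FCT), which offers a quick consistency check on reported values.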

5.3.5 Distribution Modelling

Ecological Niche Modelling provides us with the tools necessary to determine the location(s) of suitable habitat for species and to define potential distributional ranges at the LGM. In combination with traditional molecular phylogeographic data, it can be used to offer an independent, more objective, and more spatially defined hypothesis for the geographic distributions and patterns of species in the past (Waltari et al., 2007). To determine regions with relatively suitable climate niches for lineages/populations of C. americana in eastern North America at present day conditions and at the LGM (ca. 21 kybp), we took advantage of WorldClim climate data (Hijmans et al., 2005) available for 19 bioclimatic factors. These environmental conditions summarize aspects of climate that may be particularly relevant in determining species distribution and their limits. Employing the maximum entropy approach implemented in Maxent version 3.3.3 (Phillips et al., 2006; Phillips and Dudik, 2008), we used the data to predict where individuals of this species are most likely to occur. Maxent generates ecological niche models utilizing presence-only species records and contrasts them with pseudo-absence data sampled from the remainder of the study area. Layers were trimmed to the area surrounding North America and projected across the same dimensions after modelling. Present day species occurrence data for C. americana in eastern North America (640 entries) were downloaded from the Global Biodiversity Information Facility (GBIF) data portal (http://www.gbif.org, September 24, 2012). Prior to analyses, the data were mapped and any points that were not in the geographic range of C. americana specific to eastern North America were removed from the data set. In addition, duplicate sample records were removed to avoid the effects of spatial autocorrelation. For the present niche model predictions, we used the 19 bioclimatic variables from the WorldClim data set with a 2.5 min spatial resolution (Hijmans et al., 2005). For the LGM climate, data layers representative of that time were derived from the Community Climate System Model (CCSM) at the same resolution (2.5 min). In Maxent, the models were run using the default convergence setting (10⁻⁵) with 1000 iterations, using 25% of the localities for model training. Maxent outputs a continuous surface value ranging from 0 to 1, indicating regions of potentially suitable climate niches where individuals/populations of the species could be found. When projected onto the reconstructed LGM data, it can be used to identify potential refugial locations.
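The occurrence-cleaning steps described above (discarding records outside the eastern North American range and removing duplicate records) can be sketched as a small filter. The bounding box, the rounding precision used to detect duplicates, and the example records below are hypothetical values chosen for illustration; they are not the criteria or data used in this study.

```python
def clean_occurrences(records, bbox, decimals=4):
    """Keep unique occurrence points that fall inside a bounding box.

    records: iterable of (latitude, longitude) pairs in decimal degrees.
    bbox: (min_lat, max_lat, min_lon, max_lon); hypothetical study extent.
    decimals: coordinates are rounded to this precision before comparison,
              so near-identical records count as duplicates.
    Returns the retained points in first-seen order.
    """
    min_lat, max_lat, min_lon, max_lon = bbox
    seen = set()
    kept = []
    for lat, lon in records:
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            continue  # outside the assumed eastern North American extent
        key = (round(lat, decimals), round(lon, decimals))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        kept.append(key)
    return kept


# Hypothetical extent and records (one duplicate, one out-of-range point).
eastern_na = (24.0, 50.0, -95.0, -59.0)
raw = [(29.1667, -81.7), (29.1667, -81.7), (17.0, -96.5), (42.3, -72.5)]
print(clean_occurrences(raw, eastern_na))
```

A rectangular extent is the simplest possible range filter; a polygon of the known species range would be a closer match to mapping the points and removing outliers by eye.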

5.4 Results

5.4.1 Plastid sequencing and analysis

We amplified and sequenced the plastid clpP gene and its introns for all 281 individuals of Conopholis from 75 populations. Sequences were readily alignable and resulted in an overall alignment length of 1553 bp. Scoring indels resulted in an additional 13 characters that were appended to the nucleotide matrix. The network constructed from the combined data (nucleotides plus indels) revealed a total of 23 distinct haplotypes. Based on a combination of bootstrap support, presence of unambiguous characters, and substantial branch length subtending them, we divided these haplotypes into three major groups (plastid groups P1, P2, and P3; Table 1) and shaded them black, dark gray, and light gray, respectively, in Fig. 1. There were clear geographical differences in haplotype frequencies of the three major groups. Plastid group P3 is centered to the north while group P2 is centered to the south. These two haplotype groups each have a broad distribution, with a large area of overlap in their range (Fig. 1b). The third group, P1, is comprised of four very distinct and unique haplotypes found in the central overlapping region, specifically limited in distribution to the southern tip of the Appalachian Plateau (northeastern Alabama and southeastern Kentucky). Populations sampled from across the Appalachians are the most diverse, with representatives of all three haplotype groups present. The states of Florida and South Carolina, along with the southern ranges of Alabama and Georgia (all of which are found south of the Appalachians), are found to have populations of C. americana belonging only to haplotype group P2. With respect to the LGM boundary, both haplotype groups P3 and P2 are found north of the ice margin, but those belonging to P3 dominate the northern range. None of the haplotypes belonging to group P1 are found north of the LGM line.

5.4.2 Microsatellite genotyping and analysis

Of the 10 loci targeted in this study, there was only one locus with greater than 5% missing data

(SSR42). The number of alleles per locus ranged from 2-11, with a mean value of 6.5. No significant linkage disequilibrium was detected between any of 45 pairs of microsatellite loci tested, in accordance with our previous results (Rodrigues et al., 2012) based on a smaller sampling. Levels of observed and expected heterozygosity ranged from 0-0.375 and 0-0.4, respectively. Bayesian analysis of population structure resulted in 25 genetically distinct clusters. The unrooted NJ dendrogram (Fig. 2a) produced from the distance matrix obtained in 104

BAPS revealed the relationship between these genotype clusters. Based on a combination of branch lengths and geographical distribution, we divided these haplotypes into three major groups (microsatellite groups M1, M2, and M3) and shaded them black, dark gray, and light gray, respectively, in Fig. 2. Unlike the plastid case, microsatellite data show only a narrow zone of overlap between these three groups and each group has a relatively broad distribution.

Namely, genotype group M1 dominates the very southern range of Conopholis americana in eastern North America, group M2 has a distribution that is more concentrated in the central region, while in the north, populations from group M3 predominate. None of these three groups is restricted to a single localized region. On the other hand, similar to what has been seen with the plastid data, populations from the Blue Ridge Mountains and the Appalachians in general are the most diverse, with all three groups represented in that region. Of the 28 populations north of the LGM line, 24 belong to group M3 (shaded light gray) and only four belong to the genotypes of group M2 (shaded dark gray). None of the populations belonging to group M1 (shaded black) are found north of the ice margin; this group does not extend beyond southern Tennessee and southwestern North Carolina. When the NJ dendrogram is rooted by the mid-point method, the distribution of these three groups largely follows a latitudinal subdivision, where southern genotypes from group M1 are sister to the central and northern genotype groups M2 and M3, respectively.
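The per-locus summary statistics reported in this section (number of alleles, observed and expected heterozygosity) and the number of pairwise linkage tests can be illustrated with a minimal Python sketch. The function name `locus_summary` and the example genotypes are hypothetical and are not the data or software used in this study; expected heterozygosity is computed here as the simple, uncorrected gene diversity (one minus the sum of squared allele frequencies), which is an assumption of this sketch.

```python
from collections import Counter
from math import comb


def locus_summary(genotypes):
    """Summarize one diploid microsatellite locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual;
    individuals with missing data are assumed to be excluded beforehand.
    Returns (number_of_alleles, observed_het, expected_het).
    """
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    total_alleles = 2 * n
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity: 1 minus the sum of squared allele frequencies.
    h_exp = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return len(allele_counts), h_obs, h_exp


# Hypothetical genotypes (allele sizes in bp) for a largely homozygous sample.
example = [(152, 152), (152, 152), (152, 156), (156, 156)]
n_alleles, h_obs, h_exp = locus_summary(example)

# Number of unordered locus pairs available for linkage tests with 10 loci.
n_pairs = comb(10, 2)
```

With 10 loci, `comb(10, 2)` gives the 45 pairwise comparisons mentioned in the text.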

The AMOVA confirmed that the best regional differentiation is based on plastid haplotype groupings. Comparison of clusters recovered using both TCS and BAPS revealed that a significant proportion of the genetic variance was explained by differences among groups when the plastid data were partitioned according to the three major plastid groups recovered following TCS analyses (FCT = 0.89). The next best partition was the microsatellite data sorted by the BAPS clusters recovered (FCT = 0.19). The other two analyses in the two-by-two factorial design were also statistically significant. However, the percentage of variation explained between groups was minimal and therefore was likely not biologically significant. When geographic boundaries were compared, we observed high and significant structure in the plastid data (LGM, FCT = 0.34; Appalachians, FCT = 0.38) as compared to the microsatellite data (LGM, FCT = 0.07; Appalachians, FCT = 0.15). In cases where the boundaries are considered, most of the variation was partitioned among populations within regions as opposed to between regions.
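For readers unfamiliar with how FCT relates to the AMOVA variance components, the following Python sketch shows the standard hierarchical definitions. The function name and the variance components are hypothetical values chosen only for illustration; they are not results from this study, and the sketch does not reproduce the permutation tests used to assess significance.

```python
def amova_fixation_indices(var_among_groups, var_among_pops, var_within_pops):
    """Hierarchical fixation indices from three AMOVA variance components.

    var_among_groups: variance among groups (e.g., haplotype groups or regions)
    var_among_pops:   variance among populations within groups
    var_within_pops:  variance within populations
    Returns (F_CT, F_SC, F_ST).
    """
    total = var_among_groups + var_among_pops + var_within_pops
    f_ct = var_among_groups / total                        # among groups
    f_sc = var_among_pops / (var_among_pops + var_within_pops)  # among populations within groups
    f_st = (var_among_groups + var_among_pops) / total     # among all populations
    return f_ct, f_sc, f_st


# Hypothetical variance components (illustration only, not study results).
f_ct, f_sc, f_st = amova_fixation_indices(3.0, 1.5, 0.5)
```

A high FCT with most of the remaining variance among populations within groups corresponds to the situation described in the text, where groups are well differentiated and individual populations are nearly fixed.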

5.4.3 Distribution modelling

The geographic distribution of C. americana in eastern North America based on current climate data was well modeled by the ecological niche models (Fig. 3a), as evidenced by the good match between the predicted and observed current distribution. The modeled distribution accurately shows highly probable areas of habitat extending from as far south as central Florida, north to Nova Scotia, west to Wisconsin and south to Alabama. When the models were projected onto past reconstructed climate layers at the LGM, suitable regions for the persistence of populations were identified as highly probable in various areas of the south, in Florida, coastal Louisiana, and the south-eastern border of Texas and the Mexican state of Tamaulipas. In addition, a separate and more northern location with high suitability scores was the tri-state area in the southern reach of the Blue Ridge Mountains where Georgia, South Carolina, and North Carolina meet. Relatively habitable locations (with suitability scores between 0.25 and 0.50) extend further north to straddle the LGM line in southern Indiana and Ohio as well as northern Virginia, West Virginia and Maryland.

5.5 Discussion

Populations of Conopholis can be described as rare and isolated, at times being separated by kilometers, but usually locally abundant where present. These plants do not possess floral nectaries nor are they known to attract insect pollinators by producing a fragrance. Studies of flowers post-anthesis have found the anthers to be in physical contact with the stigma (Baird and Riopel, 1986b). This, combined with bagging experiments aimed at exploring the role of wind and insects in pollination, suggests selfing as a mode of pollination for these plants. The results of our current study shed light on the phylogeographic structure of C. americana in eastern North America and suggest that these populations are indeed self-fertilizing. Most of the populations are fixed for a particular plastid haplotype or microsatellite genotype (Table 1). Of the 75 populations surveyed, only seven populations have individuals belonging to more than one haplotype. According to the microsatellite data, the maximum value for observed heterozygosity was 0.375, with an average value of 0.042 across all populations. Of the 75 populations sampled, 67 are fixed for a particular genotype cluster. In addition, the inbreeding coefficient (FIS) was 0.88. This is a measure of the extent of genetic inbreeding within subpopulations, and such a high value is in agreement with the life history and previous bagging experiments showing that members of this species are highly self-fertilizing.
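To make the link between heterozygote deficit, FIS, and selfing concrete, the sketch below computes FIS from observed and expected heterozygosity and converts an FIS value into an approximate selfing rate. The conversion uses the classical equilibrium relation F = s / (2 - s), which assumes that selfing is the only source of inbreeding and that the population is at inbreeding equilibrium; the resulting selfing rate is an illustration under those assumptions and is not an estimate reported in this study. Function names are hypothetical.

```python
def inbreeding_coefficient(h_obs, h_exp):
    """F_IS = 1 - Ho / He for a single locus or averaged heterozygosities."""
    if h_exp == 0:
        raise ValueError("F_IS is undefined when expected heterozygosity is zero")
    return 1.0 - h_obs / h_exp


def equilibrium_selfing_rate(f_is):
    """Selfing rate s implied by F = s / (2 - s), i.e. s = 2F / (1 + F).

    Assumes inbreeding equilibrium with selfing as the only cause of inbreeding.
    """
    return 2.0 * f_is / (1.0 + f_is)


# Hypothetical heterozygosities for one locus (illustration only).
f_example = inbreeding_coefficient(0.05, 0.4)

# Selfing rate implied by the F_IS of 0.88 reported in the text,
# under the equilibrium assumption stated above.
s_hat = equilibrium_selfing_rate(0.88)
```

Under these assumptions, an FIS of 0.88 corresponds to a selfing rate of roughly 0.94, consistent with the qualitative conclusion that the species is highly self-fertilizing.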

If populations of C. americana existed in separate refugia during the LGM, each harboring separate haplotypes/genotypes, we would expect to recover genetic differentiation between regions when populations are clustered according to the major haplotype/genotype lineages. Our analyses show that populations of C. americana are indeed geographically structured in eastern North America. Based on the combination of the number of plastid haplotype and microsatellite genotype groups and their locations, we infer the persistence of a minimum of two glacial refugia at the LGM from which the populations we see today have likely originated. We identified a southern refugium located in Florida and southern Alabama. The multilocus microsatellite pattern observed is concordant with the plastid model. Plastid haplotypes within group P2 (haplotypes 5-13) are primarily found distributed to the south, though a few are found in more northern populations located in a previously glaciated region. Microsatellite genotypes belonging to group M1 also dominate the southern landscape (genotypes labeled black; Fig. 2) in the same locations as haplotypes from group P2 (labeled dark gray; Fig. 1). In addition, these genotypes harbor high genetic diversity, as evidenced by the long branches within microsatellite group M1 (Fig. 2a). Given that populations from Florida, South Carolina, and the southern ranges of Alabama and Georgia (Fig. 1) cluster together in both data sets, an out-of-Florida migration route is possible. A southern refugium in this location has been posited for several other species (Acer rubrum, McLachlan et al., 2005; Liriodendron tulipifera, Sewell et al., 1996; Sagittaria latifolia, Mylecraine et al., 2004; Trillium cuneatum, Gonzales et al., 2008) and is in agreement with where the hosts of C. americana, the red oaks, are presumed to have survived during the LGM along with other temperate hardwood taxa (Delcourt and Delcourt, 1993; Jackson and Overpeck, 2000).

In addition, we identified a region in the southern Appalachian Mountains, near the southern tip of the Blue Ridge Mountains, as the location of a second, more northern refugium for C. americana during the LGM. At that time, the southern Appalachians are believed to have been dominated by boreal forest. Today, neither populations belonging to plastid haplotype group P3 nor the microsatellite groups M2 and M3 are found south of the southern Appalachians, but instead are the most common haplotypes/genotypes found in the central and northern range of this species (Fig. 1 & 2). Such a pattern in geographic distribution for these haplotypes and genotypes suggests their persistence in a more northern refugium during the LGM. Their occurrence in the Appalachians today is not reflective of the species' most recent northward migration from the southern Florida refugium following the retreat of the ice margin. Instead, a few relictual populations could have survived in the southern Appalachians during the LGM, at which point they became geographically and genetically isolated from the southern refugium, resulting in their unique genetic signal that is found across the northern range of the species today. In the plastid haplotype network (Fig. 1a) the character that differentiates group P3 from P2 (shaded light and dark gray, respectively) is an eight-nucleotide deletion. These eight nucleotides are present in the haplotypes of group P2 as well as P1 (shaded black; see Fig. 1), in addition to the other two species of Conopholis and the sister genus Epifagus (data not shown). The deletion of this non-repetitive sequence of nucleotides in these haplotypes represents a strong and unique character that is unlikely to be homoplastic (Kelchner and Wendel, 1996; Graham et al., 2000). It is the defining character that supports the separation of populations/individuals belonging to haplotype group P3 (labeled in light gray; Fig. 1) from those originating from the southern refugium. This finding of a more northern refugium located in the southern Appalachians is consistent with that suggested for other plant and animal taxa (e.g., McLachlan et al., 2005; Jackson and Austin, 2010; Walker et al., 2009; Church et al., 2003).

Further evidence for the existence of two separate refugia is the starburst pattern observed in the plastid haplotype network. Such a pattern is an expected signature of a species that has recently expanded from a single geographic source (Avise, 2009). In this case we observe two starburst patterns, one for plastid group P2, shaded dark gray, and another for plastid group P3, shaded light gray (Fig. 1a). This indicates a recent expansion from two separate geographic sources, where the common and widespread haplotypes 8 (dark gray) and 14 (light gray) are the ancestral conditions from which the other haplotypes were more recently derived and are still rare. The relatively high diversity found across the Appalachians likely represents a secondary contact zone between populations following re-colonization from the two previously separated refugia. As populations from the southern refugium migrated northwards at the end of the LGM, they would have encroached on the geographic range harboring populations from the northern refugium. It is in this central region of eastern North America, along the Appalachians, that we see a mixture of the different haplotype/genotype groups (Fig. 1b & 2b). However, with the retreat of the glaciers, range expansions and re-colonization northward would primarily involve populations at the leading edge in this region. Populations from the established northern refugium (light gray groups) are likely to block the range expansion of the related southern refugium populations (“leading edge hypothesis”; Hewitt, 1996; Swenson and Howard, 2005). As a result, we expect a poleward decrease in genetic diversity within and among populations (Hewitt, 2000; Hampe and Petit, 2005). Our study supports such a hypothesis, whereby populations derived from the southern and northern refugia are both found along the Appalachian Mountains while in the north, populations are primarily derived from the northern refugium (i.e., the leading edge).
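One common way to quantify the expected poleward decrease in diversity is Nei's haplotype (gene) diversity, computed separately for groups of samples from different parts of the range. The sketch below is a minimal illustration with hypothetical haplotype lists; it is not the analysis performed in this study, and the function name and sample compositions are assumptions made for the example.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2)).

    haplotypes: list of haplotype labels, one per sampled individual.
    Returns 0.0 when fewer than two individuals are available.
    """
    n = len(haplotypes)
    if n < 2:
        return 0.0
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# Hypothetical samples (illustration only): a mixed contact-zone sample
# and a sample fixed for a single haplotype.
contact_zone_sample = [8, 8, 9, 10, 14, 14]
leading_edge_sample = [14, 14, 14, 14, 14, 14]

h_contact = haplotype_diversity(contact_zone_sample)
h_edge = haplotype_diversity(leading_edge_sample)
```

A sample fixed for one haplotype has a diversity of zero, whereas a sample mixing haplotypes from both refugia scores close to one, which is the contrast the leading edge hypothesis predicts between the contact zone and the recolonized north.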

When comparing the past and present ecological niche distribution models for Conopholis americana (Fig. 3), we notice a substantial reduction in availability of suitable habitat for populations at the LGM. Figure 3b shows that at the LGM, highly suitable habitats for C. americana east of the Mississippi River were located in only two regions. The first is the central region of Florida and the southern portion of Mississippi and Louisiana along the Gulf Coast, while a second location with high probability scores is in the Blue Ridge Mountains of the southern Appalachians. This finding is consistent with the presumed location of where the hosts of Conopholis and other temperate deciduous hardwood species (Populus, Quercus, Alnus, Betula) are believed to have survived during the most recent glacial cycle (Delcourt and Delcourt, 1993; Jackson et al., 2000; Soltis et al., 2006). The molecular data also support this (as discussed above), resulting in an overall agreement between where the genetic diversity is observed, the genetic signatures of glacial refugia, and where the most likely suitable habitats for populations are located following our LGM distribution modelling.

The detection and location of a unique and genetically distinct plastid haplotype group (P1, black haplotypes; Table 1, Fig. 1) provides some clues to the relationships between populations of C. americana in eastern North America and the disjunct members in southern Mexico. This infrequent haplotype group (present in only 5/75 populations and 8/281 individuals) likely represents relictual retention of the ancestral haplotype. A molecular phylogenetic study of Conopholis (Rodrigues et al., 2011) that used two of these five sampled populations (SS.05.79 and SS.03.11 only) found that these particular populations are more closely related to disjunct members in southern Mexico than to populations found in eastern North America. Such an east-west split is presumed to have occurred during the late Miocene to mid-Pleistocene (Graham, 1999; Wood, 1972). These particular five populations today still retain the distinctive and ancestral genetic signature of south Mexican populations (see number of steps separating haplotypes labeled in black from those shaded in dark gray; Fig. 1a).

Finally, it should be pointed out that a previous study using the holoparasite Epifagus virginiana, the monotypic sister genus to Conopholis which also exhibits similar intraspecific east-west disjunction in North America, found that the southern and mid-western regions contained higher allelic richness compared to the north and that population differentiation was the greatest in the south (Tsai and Manos, 2010). The results of our study are similar to those recovered for Epifagus. However, unlike their case, where the definition of regions was driven in part by the knowledge of the single host and its location as the ice margin retreated following the LGM, the particular species of red oaks that are the hosts to C. americana in eastern North America are unknown. Even though the distribution of oaks during the LGM at approximately 20 kya is well established based on isopollen maps (Jackson et al., 2000), it is difficult to distinguish between the different species of oak based on pollen grains (Bennett, 1983). As a result, the records of fossil oak pollen deposits only provide an indirect proxy for the presence of oak communities, and not for the particular species of oaks present at any given time. The results of our study on the locations of glacial refugia and the genetic diversity of C. americana can be used as a proxy for the location of glacial refugia and the range expansion of those species of red oaks that are the hosts for Conopholis in this region. A study focusing on the chloroplast DNA variation in one species of red oak (Q. rubra) in North America found weak phylogeographic structure and no spatial structure of genetic diversity (Magni et al., 2005). One haplotype was present in 75% of the sampled trees and was the most dominant haplotype north of where the ice margin was located at the LGM. The phylogenetic relationship between haplotypes also exhibited a starburst-like pattern, with populations in the southern Appalachians showing more diversity and harboring some rare haplotypes, suggestive of a glacial refugium being located in that region much the same as for C. americana. The range of C. americana in eastern North America, however, extends further south beyond the distribution of this particular species of red oak (compare Figs. 1 in Magni et al., 2005 and Rodrigues et al., 2011). This further supports the notion that C. americana parasitizes more than one species of red oak. Other related species of red oaks whose ranges overlap with that of C. americana and go beyond the distribution of Q. rubra are Q. coccinea, Q. falcata, Q. ilicifolia, Q. imbricaria, Q. marilandica, Q. pagoda, Q. palustris, Q. phellos, and Q. velutina (Aldrich et al., 2003; Samuelson and Hogan, 2003).

5.6 Conclusion

In summary, this study utilized both plastid and nuclear data in addition to paleodistribution modeling to identify two geographic regions where populations of C. americana in eastern North America persisted through the LGM. It provides support for a scenario where populations have existed in two separate and isolated refugia from which they expanded their ranges following the retreat of the ice. The recovery of a distinct southern lineage is in agreement with the location of a previously proposed southern glacial refugium spanning across Florida, southern Georgia and Alabama, and the Lower Mississippi Valley. The second lineage is dominant across the present northern range and is hypothesized to have been located in the southern extent of the Blue Ridge Mountains of the Appalachian Mountains at the LGM. Following the retreat of the glaciers, populations from the more northern refugium were the primary players at the leading edge of the northward migration. As a result, their haplotypes/genotypes are the most prevalent in the north, especially in the previously glaciated regions. In addition, the diversity seen across the southern Appalachian Mountains is congruent with the hypothesis that this is the area where populations derived from the southern and northern refugia come together. Future work in this group should focus on identifying the particular species of red oaks that are the hosts for C. americana. If a specific species (or a limited set of species) can be ascertained, a similar study can be conducted to determine if the host or hosts also exhibit a comparable (1) LGM history and (2) present-day geographic structure.

5.7 Acknowledgements

We thank Robert Brown and Tanya Kenesky (University of Toronto Mississauga Library) for their help using ArcMap to create, view, edit, and analyze geospatial data, as well as Alison Colwell (Yosemite National Park) for plant collections and helpful discussions. Financial support from the Natural Sciences and Engineering Research Council of Canada Discovery grant (326439) and University of Toronto Connaught New Staff Matching grant to S. Stefanović is gratefully acknowledged.


Table 5-1

Table 5-1 Collection and label information for Conopholis americana populations used in this study. For each population, plastid haplotypes are labeled (1-23) and the major group (P1, P2, and P3) to which they belong is indicated. Likewise, microsatellite genotypes are labeled (1-25) and the major genotype cluster (M1, M2, and M3) in which they were found is indicated.

| State/Province | Location (county) | Accession label | Sample size | Latitude (DD) | Longitude (DD) | Plastid haplotype | Microsatellite genotype |
|---|---|---|---|---|---|---|---|
| Alabama | Lee Co. | SS.05.01 | 5 | 32.5228 | 85.4969 | P2; 8 | M1; 21 |
| | Madison Co. | SS.05.79 | 3 | 34.7293 | 86.5505 | P1; 1, 4 | M2 & M3; 5, 11 |
| | Madison Co. | SS.05.80 | 1 | 34.7209 | 86.5320 | P1; 4 | M3; 11 |
| | Marshall Co. | SS.06.98 | 2 | 34.5577 | 86.2089 | P2; 11 | M1; 15 |
| | Lauderdale Co. | SS.06.103 | 1 | 34.8113 | 87.3574 | P3; 14 | M2; 25 |
| | Jackson Co. | SS.06.161 | 1 | 34.7090 | 86.0074 | P1; 3 | M2; 25 |
| | Jackson Co. | SS.06.162 | 2 | 34.6971 | 86.0247 | P1; 3 | M1 & M2; 15, 6 |
| Florida | Marion Co. | AC.FL.JS | 2 | 29.1778 | 81.7164 | P2; 6 | M1; 12 |
| | Alachua Co. | AC.FL.SF | 5 | 29.7299 | 82.4348 | P2; 9, 10 | M1; 3, 12 |
| | Wakulla Co. | SS.06.48 | 3 | 30.1325 | 84.3570 | P2; 5 | M1; 20 |
| Georgia | Early Co. | SS.06.24 | 3 | 31.4639 | 84.9229 | P2; 8 | M1; 13 |
| | Dade Co. | SS.06.80 | 2 | 34.8407 | 85.4829 | P3; 23 | M2; 25 |
| Illinois | Vermilion Co. | AC.IL.KSP | 9 | 40.1246 | 87.7352 | P3; 14 | M2; 24 |
| Indiana | Monroe Co. | SS.03.29 | 1 | 39.0202 | 86.3713 | P3; 14 | M3; 2 |
| | Monroe Co. | SS.03.30 | 1 | 39.0246 | 86.3607 | P3; 14 | M3; 7 |
| | Monroe Co. | SS.03.31 | 1 | 39.0348 | 86.3213 | P3; 19 | M2; 25 |
| | Lawrence Co. | SS.04.80 | 1 | 38.7319 | 86.4175 | P3; 14 | M2; 25 |
| | Perry Co. | SS.04.83 | 1 | 37.9925 | 86.5938 | P3; 22 | M2; 25 |
| | Martin Co. | SS.04.89 | 1 | 38.6676 | 86.7159 | P3; 14 | M3; 19 |
| | Crawford Co. | SS.04.93 | 1 | 38.3689 | 86.6414 | P2; 12 | M2; 25 |
| | Crawford Co. | SS.04.94 | 1 | 38.3706 | 86.6469 | P2; 12 | M3; 19 |
| | Parke Co. | SS.04.96 | 1 | 39.8916 | 87.2032 | P3; 14 | M3; 19 |
| | Monroe Co. | SS.04.102 | 1 | 39.2039 | 86.5298 | P3; 14 | M3; 7 |
| | Monroe Co. | SS.04.109 | 1 | 39.1955 | 86.5201 | P3; 14 | M3; 7 |
| | Steuben Co. | SS.04.170 | 1 | 41.6850 | 85.0074 | P3; 14 | M3; 19 |
| | Steuben Co. | SS.09.28 | 9 | 41.7123 | 85.0258 | P3; 14 | M3; 16, 19 |
| | Clark Co. | AC.IN.CCF | 10 | 38.4856 | 85.8326 | P3; 22 | M2; 17 |
| Kentucky | McCreary Co. | SS.03.11 | 1 | 36.6912 | 84.4697 | P1; 2 | M2; 25 |
| | Montgomery Co. | AC.KY.MC | 2 | 38.0716 | 83.9347 | P3; 14 | M2; 25 |
| Maine | Franklin Co. | SS.09.38 | 6 | 44.7554 | 70.0740 | P2; 8 | M3; 8 |
| Maryland | Montgomery Co. | AC.MD.MT | 20 | 39.0006 | 77.2101 | P2 & P3; 8, 13, 14 | M2 & M3; 9, 11 |
| Massachusetts | Hampshire Co. | AC.MA.MN | 14 | 42.3051 | 72.5127 | P3; 14 | M2; 23 |
| | Hampshire Co. | AC.MA.RK | 1 | 42.3061 | 72.4965 | P3; 16 | M3; 8 |
| Michigan | Ottawa Co. | SS.05.82 | 1 | 42.7958 | 86.1008 | P3; 14 | M3; 19 |
| | Van Buren Co. | SS.09.30 | 8 | 42.3331 | 86.2992 | P3; 20, 21, 22 | M3; 19 |
| | Allegan Co. | SS.09.31 | 6 | 42.6989 | 86.1954 | P3; 14 | M3; 19 |
| | Muskegon Co. | SS.09.32 | 8 | 43.4105 | 86.3287 | P3; 14 | M3; 19 |
| | Cheboygan Co. | SS.09.37 | 11 | 45.5505 | 84.6674 | P2 & P3; 8, 15 | M2; 22 |
| North Carolina | Macon Co. | SS.06.64 | 3 | 35.2064 | 83.4206 | P3; 14 | M3; 2 |
| | Swain Co. | SS.06.146 | 2 | 35.5053 | 83.6761 | P2; 8 | M1; 14 |
| | Jackson Co. | SS.06.160 | 2 | 35.4239 | 83.0848 | P3; 14 | M2; 25 |
| | Madison Co. | AC.NC.L | 10 | 35.7333 | 82.8697 | P2; 8 | M2; 25 |
| Ohio | Licking Co. | SS.06.173 | 1 | 40.0736 | 82.5193 | P2; 8 | M3; 8 |
| | Granville Co. | SS.06.174 | 1 | 40.0705 | 82.5330 | P2; 8 | M3; 8 |
| | Summit Co. | SS.09.25 | 11 | 41.2609 | 81.5693 | P3; 14 | M3; 1, 8 |
| Ontario | Simcoe Co. | SS.05.02 | 1 | 44.3991 | 79.8561 | P3; 14 | M3; 8 |
| | Halton Co. | SS.05.94 | 2 | 43.4257 | 79.8817 | P3; 14 | M3; 8 |
| | Township of Archipelago | SS.05.194 | 2 | 45.3401 | 80.0457 | P3; 14 | M3; 8 |
| | Halton Co. | SS.06.170 | 2 | 43.5066 | 79.9594 | P3; 14 | M3; 8 |
| | Peel Co. | SS.08.03 | 6 | 43.5524 | 79.6636 | P3; 14 | M3; 10 |
| | Bruce Co. | SS.08.04 | 6 | 45.2311 | 81.5983 | P3; 14 | M3; 19 |
| | Bruce Co. | SS.08.05 | 6 | 45.2005 | 81.5331 | P3; 14 | M3; 19 |
| | Lincoln Co. | SS.09.05 | 2 | 43.1348 | 79.1575 | P3; 14 | M3; 8 |
| | Lincoln Co. | SS.09.08 | 6 | 42.9095 | 79.2748 | P2; 8 | M3; 8 |
| | Simcoe Co. | SS.09.39 | 4 | 44.8497 | 79.9978 | P3; 14 | M3; 8 |
| | Simcoe Co. | SS.09.40 | 3 | 44.7583 | 79.8281 | P3; 14 | M3; 8 |
| Pennsylvania | Butler Co. | SS.07.42 | 1 | 40.9434 | 80.0875 | P3; 14 | M3; 8 |
| | Franklin Co. | AC.PA.MSF | 9 | 39.8246 | 77.5160 | P2; 8 | M3; 11 |
| Quebec | Vallée-du-Richelieu Co. | SS.07.80 | 1 | 45.5491 | 73.3569 | P2; 8 | M3; 8 |
| South Carolina | Hampton Co. | SS.06.53 | 4 | 32.8321 | 81.1755 | P2; 7 | M1; 12 |
| | Bamberg Co. | SS.06.54 | 5 | 33.0480 | 81.0970 | P2; 9 | M2; 25 |
| | Dorchester Co. | SS.06.63 | 4 | 33.0636 | 80.6167 | P2; 9 | M1; 4 |
| Tennessee | Franklin Co. | SS.05.81 | 1 | 35.0533 | 86.2732 | P2; 8 | M1; 15 |
| | Blount Co. | SS.06.127 | 2 | 35.7256 | 83.5083 | P3; 14 | M2; 25 |
| | Blount Co. | SS.06.133 | 2 | 35.6317 | 83.9435 | P3; 14, 17 | M1 & M2; 14, 25 |
| Virginia | Rappahannock Co. | SS.07.57 | 1 | 38.8194 | 78.1801 | P3; 18 | M2; 25 |
| | Shenandoah Co. | SS.07.58 | 1 | 38.6904 | 78.3323 | P3; 14 | M2; 25 |
| | Rockbridge Co. | AC.VA.NB | 7 | 37.6776 | 79.5073 | P3; 14, 22 | M3; 11, 18, 19 |
| | Fairfax Co. | AC.VA.TRP | 5 | 38.9579 | 77.1627 | P2; 8 | M3; 11 |
| | Fairfax Co. | AC.VA.TRP.2 | 9 | 38.9474 | 77.2692 | P2; 8 | M3; 11 |
| West Virginia | Kanawha Co. | SS.04.71 | 1 | 38.3172 | 81.6680 | P3; 14 | M2; 25 |
| | Kanawha Co. | SS.04.72 | 1 | 38.2529 | 81.6568 | P3; 14 | M2; 25 |
| | Summers Co. | SS.04.75 | 1 | 37.5421 | 80.9599 | P3; 14 | M2; 25 |
| | Cabell Co. | AC.WV.BFL | 2 | 38.3230 | 82.3785 | P3; 14 | M2; 25 |
| Wisconsin | Sauk Co. | AC.WI.DL | 6 | 43.4283 | 89.7274 | P2; 8 | M3; 19 |


Figure 5-1


Figure 5-1 Plastid haplotype reconstruction and its distribution for Conopholis americana in eastern North America. (a) Statistical parsimony network shows the relationships between the 23 recovered haplotypes; the size of each circle is proportional to the number of individuals found to have that particular haplotype. Haplotypes are numbered and their provenance (state or province) is indicated. See Table 1 for further details on haplotype identity and major groups to which they belong. Lines connecting haplotypes represent single mutational step, while small open filled circles represent unsampled/extinct haplotypes. Shades of haplotypes correspond to the three major groups recovered; bootstrap values supporting their separation are shown (≥60%). (b) Map showing the distribution of the plastid haplotypes. Dashed line shows the approximate location of the ice margin at the last glacial maximum (LGM).


Figure 5-2


Figure 5-2 Microsatellite dendrogram and its distribution for Conopholis americana in eastern North America. (a) Unrooted NJ dendrogram produced from the distance matrix obtained in BAPS showing the relationship between the genotypes. Shades correspond to the three major cluster groups recovered based on a combination of branch lengths and geographic distribution. The arrow represents the placement of the root according to the mid-point rooting. (b) Range map showing the distribution of the microsatellite clusters. Dashed line shows the approximate location of the ice margin at the last glacial maximum (LGM).


Figure 5-3


Figure 5-3 Ecological niche models showing regions with suitable climate envelopes for Conopholis americana in eastern North America based on scenarios for (a) current and (b) past climate. Past climate (ca. 21 kya) is reconstructed based on the Community Climate System Model (CCSM). Regions of suitability range from 0 (white) to 1 (black). Dashed line shows the approximate location of the ice margin at the last glacial maximum.

References

ALBACH, D. C., K. YAN, S. R. JENSEN, AND H.-Q. LI. 2009. Phylogenetic placement of Triaenophora (formerly Scrophulariaceae) with some implications for the phylogeny of Lamiales. Taxon 58: 749–756.

ALDRICH, P.R., G. R. PARKER, C. H. MICHLER, AND J. ROMEO-SEVERSON. 2003. Whole-tree silvic identifications and microsatellite genetic structure of a red oak species complex in an Indiana old-growth forest. Canadian Journal of Forest Research 33: 2228-2237.

ALVAREZ, I. AND J.F. WENDEL. 2003. Ribosomal ITS sequences and plant phylogenetic inference. Molecular Phylogenetics and Evolution 29: 417-434.

APG III [ANGIOSPERM PHYLOGENY GROUP III]. 2009. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Botanical Journal of the Linnean Society 161: 105–121.

AVISE, J.C. 2009. Phylogeography: Retrospect and prospect. Journal of Biogeography 36: 3-15.

AXELROD, D. I. 1983. Biogeography of oaks in the Arcto-Tertiary province. Annals of the Missouri Botanical Garden 70: 629–657.

BAILEY, C.D., T. CARR, S. HARRIS, AND C. HUGHES. 2003. Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes. Molecular Phylogenetics and Evolution 29: 435-455.

BAIRD, V. W., AND J. L. RIOPEL. 1986a. The developmental anatomy of Conopholis americana (Orobanchaceae) seedlings and tubercles. Canadian Journal of Botany 64: 710–717.

BAIRD, V. W., AND J. L. RIOPEL. 1986b. Life history studies of Conopholis americana (Orobanchaceae). American Midland Naturalist 116: 140–151.

BALDWIN, B.G., M.J. SANDERSON, J.M. PORTER, M.F. WOJCIECHOWSKI, C.S. CAMPBELL, AND M.J. DONOGHUE. 1995. The ITS region of nuclear ribosomal DNA: A valuable source of evidence on angiosperm phylogeny. Annals of the Missouri Botanical Garden 82: 247-277.

BARKMAN, T.J., J. R. MCNEAL, S. H. LIM, G. COAT, H. B. CROOM, N. D. YOUNG, AND C. W. DEPAMPHILIS. 2007. Mitochondrial DNA suggests at least 11 origins of parasitism in angiosperms and reveals genomic chimerism in parasitic plants. BMC Evolutionary Biology 7: 248-263.

BARRETT, C. F., AND J. V. FREUDENSTEIN. 2009. Patterns of morphological and plastid DNA variation in the Corallorhiza striata species complex (Orchidaceae). Systematic Botany 34: 496–504.

BAUM, D. A. 1992. Phylogenetic species concepts. Trends in Ecology & Evolution 7: 1–2.

BAUM, D. A., AND M. J. DONOGHUE. 1995. Choosing among alternative “phylogenetic” species concepts. Systematic Botany 20: 560–573.

BAUM, D. A., AND K. L. SHAW. 1995. Genealogical perspectives on the species problem. In P. C. Hoch and A. G. Stephenson [eds.], Experimental and molecular approaches to plant biosystematics, 289–303. Missouri Botanical Garden Press, St. Louis, Missouri, USA.

BECK-MANNAGETTA, G. 1930. Orobanchaceae. In A. Engler [ed.], Das Pflanzenreich IV. 261, Heft 96, 1–348.

BENNETT, K.D. 1983. Postglacial population expansion of forest trees in Norfolk, UK. Nature 303: 164-167.

BENNETT, J. R., AND S. MATHEWS. 2006. Phylogeny of the parasitic plant family Orobanchaceae inferred from phytochrome A. American Journal of Botany 93: 1039–1051.

BOESHORE, I. 1920. The morphological continuity of Scrophulariaceae and Orobanchaceae. Contributions from the Botanical Laboratory of the University of Pennsylvania 5: 139-177.

BRAUN, E. L. 1947. Development of the deciduous forests of eastern North America. Ecological Monographs 17: 211–219.

BRITO, P.H., AND S.V. EDWARDS. 2009. Multilocus phylogeography and phylogenetics using sequence-based markers. Genetica 135: 439-455.

BRYANT, D., AND V. MOULTON. 2004. Neighbor-net: An agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21: 255–265.


BUCKLER, E.S., A. IPPOLITO, AND T.P. HOLTSFORD. 1997. The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications. Genetics 145: 821-832.

CHURCH, S.A., A. M. KRAUS, J. C. MITCHELL, D. R. CHURCH, AND D. R. TAYLOR. 2003. Evidence of multiple Pleistocene refugia in the post glacial expansion of the eastern tiger salamander, Ambystoma tigrinum tigrinum. Evolution 57: 372-383.

CLEMENT, M., D. POSADA, AND K. A. CRANDALL. 2000. TCS: a computer program to estimate genetic genealogies. Molecular Ecology 9: 1657-1659.

COLWELL, A. E. 1994. Genome evolution in a non-photosynthetic plant, Conopholis americana. Ph.D. dissertation, Washington University, St. Louis, Missouri, USA.

CORANDER, J., P. WALDMANN, AND M. J. SILLANPÄÄ. 2003. Bayesian analysis of genetic differentiation between populations. Genetics 163: 367-374.

CRONQUIST, A. 1988. The evolution and classification of flowering plants, 433. New York Botanical Garden, Bronx, New York, USA.

DAVIS, M.B. 1981. Quaternary history and the stability of forest communities. In Forest Succession: Concepts and Application. pp 132-153. Springer Verlag, New York, USA.

DAVIS, C.C., M. LATVIS, D. L. NICKRENT, K. J. WURDACK, AND D. A. BAUM. 2007. Floral gigantism in Rafflesiaceae. Science 315: 1812.

DEBRY, R. W., AND R. G. OLMSTEAD. 2000. A simulation study of reduced tree-search effort in bootstrap resampling analysis. Systematic Biology 49: 171–179.

DELANNOY, E., S. FUJII, C. COLAS DES FRANCS-SMALL, M. BRUNDRETT, AND I. SMALL. 2011. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Molecular Biology and Evolution 28: 2077-2086.

DELCOURT, P. A., AND H. R. DELCOURT. 1984. Ice age haven for hardwoods. Natural History 93: 22–25.

DELCOURT, H.R., AND P. A. DELCOURT. 1993. Paleoclimates, paleovegetation, and paleofloras during the Late Quaternary. In Flora of North America. pp 71-94. Oxford University Press, New York.

DELGADILLO, M. C. 1987. Moss distribution and the phytogeographical significance of the Neovolcanic Belt in Mexico. Journal of Biogeography 14: 69–78.

DELAVAULT, P.M., N. M. RUSSO, N. A. LUSSON, AND P. A. THALOUARN. 1996. Organization of the reduced plastid genome of Lathraea clandestina, an achlorophyllous parasitic plant. Physiologia Plantarum 96: 674-682.

DEPAMPHILIS, C. W., AND J. D. PALMER. 1990. Loss of photosynthetic and chlororespiratory genes from the plastid genome of a parasitic flowering plant. Nature 348: 337–339.

DE QUEIROZ, K., AND M. J. DONOGHUE. 1990. Phylogenetic systematics or Nelson’s version of cladistics? Cladistics 6: 61–75.

DONOGHUE, M. J., AND B. R. MOORE. 2003. Toward an integrative historical biogeography. Integrative and Comparative Biology 43: 261–270.

DOYLE, J. J., AND L. DOYLE. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochemical Bulletin 19: 11–15.

DOYLE, J. J., J. I. DAVIS, R. J. SORENG, D. GARVIN, AND M. J. ANDERSON. 1992. Chloroplast DNA inversions and origin of the grass family (Poaceae). Proceedings of the National Academy of Sciences, USA 89: 7722–7726.

EXCOFFIER, L., AND H. E. L. LISCHER. 2010. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources 10: 564-567.

FELSENSTEIN, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Systematic Zoology 27: 401-410.

FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. Journal of Molecular Evolution 17: 368–376.

FELSENSTEIN, J. 1985. Confidence-Limits on Phylogenies—an approach using the bootstrap. Evolution; International Journal of Organic Evolution 39: 783–791.

FELSENSTEIN, J. 2004. Inferring Phylogenies. Sinauer, Sunderland, Massachusetts.

FENG, Y., S. H. OH, AND P. S. MANOS. 2005. Phylogeny and historical biogeography of the genus Platanus as inferred from nuclear and chloroplast DNA. Systematic Botany 30: 786–799.

FERRARI, L., M. LOPEZ-MARTINEZ, G. AGUIRRE-DIAZ, AND G. CARRASCO-NUNEZ. 1999. Space-time patterns of Cenozoic arc volcanism in central Mexico: From the Sierra Madre Occidental to the Mexican Volcanic Belt. Geology 27: 303–306.

FERNALD, M. L. 1950. Gray’s Manual of Botany, [ed. 8]. American Book Company, New York.

FIELD, C. B., M. J. BEHRENFELD, J. T. RANDERSON, AND P. FALKOWSKI. 1998. Primary production in the biosphere: Integrating terrestrial and oceanic components. Science 281: 237-240.

GLEASON, H. A. 1952. The new Britton and Brown illustrated flora of the northeastern United States and adjacent Canada. Volume 3. Lancaster Press Incorporated, Lancaster.

GODBOUT, J., J. P. JARAMILLO-CORREA, J. BEAULIEU, AND J. BOUSQUET. 2005. A mitochondrial DNA minisatellite reveals the postglacial history of jack pine (Pinus banksiana), a broad-range North American conifer. Molecular Ecology 14: 3497-3512.

GOLDMAN, N. 1993. Statistical tests of models of DNA substitution. Journal of Molecular Evolution 36: 182–198.

GOLDMAN, N., J. P. ANDERSON, AND A. G. RODRIGO. 2000. Likelihood-based tests of topologies in phylogenetics. Systematic Biology 49: 652–670.

GOMEZ, L. D. 1980. Notes on the biology of Central American Orobanchaceae. Brenesia 17: 389–396.

GONZALES, E., J. L. HAMRICK, AND S-M. CHANG. 2008. Identification of glacial refugia in south-eastern North America by phylogeographic analyses of a forest understorey plant, Trillium cuneatum. Journal of Biogeography 35: 844-852.

GOUDET, J. 1995. FSTAT (version 1.2): A computer program to calculate F-statistics. Journal of Heredity 86: 485-486.

GOWER, J. C. 1971. A general coefficient of similarity and some of its properties. Biometrics 27: 857–871.

GRAHAM, A. 1964. Origin and evolution of the biota of Southeastern North America: evidence from the fossil plant record. Evolution; International Journal of Organic Evolution 18: 571–585.

GRAHAM, A. 1973. History of the arborescent temperate element in the northern Latin American Biota. In Vegetation and Vegetational History of Northern Latin America, A. Graham [ed.]. pp. 301–314. New York: Elsevier Scientific Publishing Company.

GRAHAM, A. 1993. History of the vegetation: Cretaceous (Maastrichtian)–Tertiary. In Flora of North America. pp 57-70. Oxford University Press, New York.

GRAHAM, A. 1999. The Tertiary history of the northern temperate element in the northern Latin America biota. American Journal of Botany 86: 32-38.

GRAHAM, S.W., P. A. REEVES, A. C. E. BURNS, AND R. G. OLMSTEAD. 2000. Microstructural changes in noncoding chloroplast DNA: Interpretation, evolution, and utility of indels and inversions in basal angiosperm phylogenetic inference. International Journal of Plant Sciences 161: S83-S96.

HALFFTER, G. 1964. La entomofauna americana, ideas acerca de su orígen y distribución. Folia Entomologica Mexicana 6: 1–108.

HALFFTER, G. 1976. Distribución de los insectos en la zona de transición mexicana. Relaciones con la entomofauna de Norteamérica. Folia Entomologica Mexicana 35: 1–64.

HAMPE, A. AND R. J. PETIT. 2005. Conserving biodiversity under climate change: the rear edge matters. Ecology Letters 8: 461-467.

HARE, M. P. 2001. Prospects for nuclear phylogeography. Trends in Ecology & Evolution 16: 700–706.

HAYNES, R. R. 1971. A monograph of the genus Conopholis (Orobanchaceae). SIDA 4: 246–264.

HEIDE-JORGENSEN, H. S. 2008. Parasitic Flowering Plants. Brill NV, Leiden, The Netherlands.

HEWITT, G. M. 1996. Some genetic consequences of ice ages and their role in divergence and speciation. Biological Journal of the Linnean Society 58: 247–276.

HEWITT, G. M. 2000. The genetic legacy of Quaternary ice ages. Nature 405: 907–913.

HEWITT, G.M. 2004. Genetic consequences of climatic oscillations in the Quaternary. Philosophical Transactions of the Royal Society London Biological Sciences 359: 183-195.

HIJMANS, R.J., S. E. CAMERON, J. L. PARRA, P. G. JONES, AND A. JARVIS. 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25: 1965-1978.

HUSON, D. H., AND D. BRYANT. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23: 254–267.

JACKSON, S.T. AND J. T. OVERPECK. 2000. Responses of plant populations and communities to environmental changes of the late Quaternary. Paleobiology 26: 194-220.

JACKSON, S. T., R. S. WEBB, K. H. ANDERSON, J. T. OVERPECK, T. WEBB III, J. W. WILLIAMS, AND B. C. S. HANSEN. 2000. Vegetation and environment in eastern North America during the last glacial maximum. Quaternary Science Reviews 19: 489–508.

JACKSON, N.D. AND C. C. AUSTIN. 2010. The combined effects of rivers and refugia generate extreme cryptic fragmentation within the common ground skink (Scincella lateralis). Evolution 64: 409-428.

JARAMILLO-CORREA, J.P., J. BEAULIEU, AND J. BOUSQUET. 2004. Variation in mitochondrial DNA reveals multiple distant glacial refugia in black spruce (Picea mariana), a transcontinental North American conifer. Molecular Ecology 13: 2735-2747.

JENSEN, S. R., H.-Q. LI, D. C. ALBACH, AND C. H. GOTFREDSEN. 2008. Phytochemistry and molecular systematics of Triaenophora rupestris and Oreosolen wattii (Scrophulariaceae). Phytochemistry 69: 2162–2166.

KELCHNER, S.A. AND J. F. WENDEL. 1996. Hairpins create minute inversions in non-coding regions of chloroplast DNA. Current Genetics 30: 259-262.

KUIJT, J. 1969. The biology of parasitic flowering plants. University of California Press, Berkeley, California, USA.

LEAKE, J. R. 1994. The biology of mycoheterotrophic (‘saprophytic’) plants. New Phytologist 127: 171-216.

LEWIS, P. O. AND D. ZAYKIN. 2001. Genetic data analysis: Computer program for the analysis of allelic data. Free program distributed by the authors, http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php

LI, W. H. 1993. So what about the molecular clock hypothesis? Current Opinion in Genetics & Development 3: 896–901.

LIEBMANN, F. M. 1847. To nye arter af slaegten Conopholis Wallr. In Förhandlingar; Skandinaviske Naturforskeres Möte 4, pp. 184–186. Stockholm: P. A. Norstedt and Soner, Kongl Boktryckare.

LINNAEUS, C. 1767. In part 1 of Mantissa Plantarum, pp. 88–89. Stockholm: Impensis direct. Laurentii Salvii.

LEWIS, P. O. 2001. A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology 50: 913–925.

MAGALLÓN, S., P. R. CRANE, AND P. S. HERENDEEN. 1999. Phylogenetic pattern, diversity, and diversification of eudicots. Annals of the Missouri Botanical Garden 86: 297–372.

MAGNI, C. R., A. DUCOUSSO, H. CARON, R. J. PETIT, AND A. KREMER. 2005. Chloroplast DNA variation of Quercus rubra L. in North America and comparison with other Fagaceae. Molecular Ecology 14: 513-524.

MANOS, P. S., Z.-K. ZHOU, AND C. H. CANNON. 2001. Systematics of Fagaceae: Phylogenetic tests of reproductive trait evolution. International Journal of Plant Sciences 162: 1361–1379.

MCKENDRICK, S. L., J. R. LEAKE, AND D. J. READ. 2000. Symbiotic germination and development of myco-heterotrophic plants in nature: Transfer of carbon from ectomycorrhizal Salix repens and Betula pendula to the orchid Corallorhiza trifida through shared hyphal connections. New Phytologist 145: 539-548.

MCLACHLAN, J. S., J. S. CLARK, AND P. S. MANOS. 2005. Molecular indicators of tree migration capacity under rapid climate change. Ecology 86: 2088–2098.

MCNEAL, J.R., K. ARUMUGUNATHAN, J. V. KUEHL, AND J. L. BOORE. 2007. Systematics and plastid genome evolution of the cryptically photosynthetic parasitic plant Cuscuta (Convolvulaceae). BMC Biology 5: 55.

MCNEAL, J.R., J. R. BENNETT, A. D. WOLFE, AND S. MATHEWS. 2013. Phylogeny and origins of holoparasitism in Orobanchaceae. American Journal of Botany 100: 971-983.

MEACHAM, C. A. AND T. DUNCAN. 1991. MorphoSys. Berkeley, California: Regents of the University of California.

MIRANDA, F., AND A. J. SHARP. 1950. Characteristics of the vegetation in certain temperate regions of eastern Mexico. Ecology 31: 313–333.

MOORE, W. S. 1995. Inferring phylogenies from mtDNA variation: Mitochondrial-gene trees versus nuclear-gene trees. Evolution 49: 718–726.

MORRIS, A. B., S. M. ICKERT-BOND, D. B. BRUNSON, D. E. SOLTIS, AND P. S. SOLTIS. 2008. Phylogeographical structure and temporal complexity in the American sweetgum (Liquidambar styraciflua; Altingiaceae). Molecular Ecology 17: 3889–3900.

MORRIS, A. S., C. H. GRAHAM, D. E. SOLTIS, AND P. S. SOLTIS. 2010. Reassessment of phylogeographical structure in eastern North American trees using Monmonier’s algorithm and ecological niche modelling. Journal of Biogeography 37: 1657–1667.

MÜLLER, K. 2005. SeqState: Primer design and sequence statistics for phylogenetic DNA datasets. Applied Bioinformatics 4: 65–69.

MÜLLER, K. 2006. Incorporating information from length-mutational events into phylogenetic analysis. Molecular Phylogenetics and Evolution 38: 667–676.

MUELLER, U.G. AND L. L. WOLFENBARGER. 1999. AFLP genotyping and fingerprinting. Trends in Ecology and Evolution 14: 389-394.

MYLECRAINE, K. A., J. E. KUSER, P. E. SMOUSE, AND G. L. ZIMMERMANN. 2004. Geographic allozyme variation in Atlantic white-cedar, Chamaecyparis thyoides (Cupressaceae). Canadian Journal of Forest Research 34: 2443–2454.

NICKRENT, D. L., R. J. DUFF, A. E. COLWELL, A. D. WOLFE, N. D. YOUNG, K. E. STEINER, AND C. W. DEPAMPHILIS. 1998. Molecular Phylogenetic and Evolutionary Studies of Parasitic Plants. In Molecular Systematics of Plants II. DNA Sequencing. D. Soltis, P. Soltis, J. Doyle (eds.), pp. 211-241 (Chapter 8). Kluwer Academic Publishers, Boston, MA.

NICKRENT, D. L. 2002. Plantas parásitas en el mundo. In J. A. López-Sáez, P. Catalán and L. Sáez [eds.], Plantas Parásitas de la Península Ibérica e Islas Baleares, Capitulo 2, pp. 7-56. Mundi-Prensa Libros, S. A., Madrid.

NICKRENT, D. L., A. BLARER, Y-L. QIU, D. E. SOLTIS, P. S. SOLTIS, AND M. ZANIS. 2002. Molecular data place Hydnoraceae with Aristolochiaceae. American Journal of Botany 89: 1809-1817.

NICKRENT, D. L., A. BLARER, Y-L. QIU, R. VIDAL-RUSSELL, AND F. E. ANDERSON. 2004. Phylogenetic inference in Rafflesiales: the influence of rate heterogeneity and horizontal gene transfer. BMC Evolutionary Biology 4.

NICKRENT, D. L. 2010. The parasitic plant connection. Website http://www.parasiticplants.siu.edu/ [accessed 1 June 2010].

NICKRENT, D. L. 2012. The parasitic plant connection. Website http://www.parasiticplants.siu.edu/ [accessed 3 June 2012].

NICKRENT, D.L. 2013. The parasitic plant connection. Website http://www.parasiticplants.siu.edu/ [accessed 19 April 2013].

NIXON, K. C. 1993. El género Quercus en México. In T. P. Ramamoorthy, R. Bye, A. Lot, and J. Fa [eds.], Diversidad biológica de México: Orígenes y distribución, 435–448. Instituto de Biología, Universidad Nacional Autónoma de México, México, D. F., México.

OLMSTEAD, R. G., C. W. DEPAMPHILIS, A. D. WOLFE, N. D. YOUNG, W. J. ELISENS, AND P. A. REEVES. 2001. Disintegration of the Scrophulariaceae. American Journal of Botany 88: 348–361.

PALMER, J. D. 1986. Isolation and structural analysis of chloroplast DNA. Methods in Enzymology 118: 167-186.

PETIT, R. J., S. BREWER, S. BORDACS, K. BURG, R. CHEDDADE, E. COART, J. COTTRELL, ET AL. 2002. Identification of refugia and postglacial colonization routes of European white oaks based on chloroplast DNA and fossil pollen evidence. Forest Ecology and Management 156: 49–74.

PHILLIPS, S.J., R. P. ANDERSON, AND R.E. SCHAPIRE. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190: 231-259.

PHILLIPS, S.J. AND M. DUDIK. 2008. Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography 31: 161-175.

PIELOU, E. C. 1991. After the ice age: The return of life to glaciated North America. University of Chicago Press, Chicago, Illinois, USA.

POSADA, D., AND K. A. CRANDALL. 1998. ModelTest: Testing the model of DNA substitution. Bioinformatics 14: 817–818.

POULIN, R. AND S. MORAND. 2000. The diversity of parasites. The Quarterly Review of Biology 75: 277-293.

R CORE TEAM. 2012. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

RAMBAUT, A. 2002. Se-Al sequence alignment editor, version 2.0a11. University of Oxford, Oxford, UK.

RAYMOND, M., AND F. ROUSSET. 1995. GENEPOP (Version 1.2): Population genetics software for exact tests and ecumenicism. Journal of Heredity 86: 248-249.

REBOUD, X., AND C. ZEYL. 1994. Organelle inheritance in plants. Heredity 72: 132-140.

REE, R. H., B. R. MOORE, C. O. WEBB, AND M. J. DONOGHUE. 2005. A likelihood framework for inferring the evolution of geographic range on phylogenetic trees. Evolution 59: 2299–2311.

RIESEBERG, L.H. AND D.E. SOLTIS. 1991. Phylogenetic consequences of cytoplasmic gene flow in plants. Evolutionary Trends in Plants 5: 65–84.

RIESEBERG, L.H. AND J.F. WENDEL. 1993. Introgression and its consequences in plants. In R. Harrison [ed.], Hybrid Zones and the Evolutionary Process, pp. 70–109. Oxford University Press, Oxford.

RODRIGUES, A. G., A. E. L. COLWELL, AND S. STEFANOVIĆ. 2011. Molecular systematics of the parasitic genus Conopholis (Orobanchaceae) inferred from plastid and nuclear sequences. American Journal of Botany 98: 896-908.

RODRIGUES, A.G., A. E. L. COLWELL, AND S. STEFANOVIĆ. 2012. Development and characterization of polymorphic microsatellite markers for Conopholis americana (Orobanchaceae). American Journal of Botany 99: e4-e6.

RODRIGUES, A., S. SHAYA, T. A. DICKINSON, AND S. STEFANOVIĆ. 2013. Morphometric analyses and taxonomic revision of the North American holoparasitic genus Conopholis (Orobanchaceae). Systematic Botany 38(3): in press.

ROHLF, F. J. 1972. An empirical comparison of three ordination techniques in numerical taxonomy. Systematic Zoology 21: 271–280.

RONQUIST, F., AND J. P. HUELSENBECK. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572–1574.

ROUSSET, F. 2008. GENEPOP’007: A complete re-implementation of the GENEPOP software for Windows and Linux. Molecular Ecology Resources 8: 103-106.

RZEDOWSKI, J. 1978. Vegetación de México. Editorial Limusa, México, D.F., México.

SAMUELSON, L.J. AND M. E. HOGAN. 2003. Forest trees: A guide to the southeastern and mid-Atlantic regions of the United States. Van Hoffmann Press, Upper Saddle River, New Jersey.

SANDERSON, M. J. 2002. Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Molecular Biology and Evolution 19: 101–109.

SCHAAL, B.A., D. A. HAYWORTH, K. M. OLSEN, J. T. RAUSCHER, AND W. A. SMITH. 1998. Phylogeographic studies in plants: problems and prospects. Molecular Ecology 7: 465-474.

SCOTLAND, R. W., R. G. OLMSTEAD, AND J. R. BENNETT. 2003. Phylogeny reconstructions: The role of morphology. Systematic Biology 52: 539–548.

SEWELL, M.M., R. P. CLIFFORD, AND M. W. CHASE. 1996. Intraspecific chloroplast DNA variation and biogeography of North American Liriodendron L. (Magnoliaceae). Evolution 50: 1147-1154.

SHIMODAIRA, H. 2002. An approximately unbiased test of phylogenetic tree selection. Systematic Biology 51: 492–508.

SHIMODAIRA, H., AND M. HASEGAWA. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Molecular Biology and Evolution 16: 1114–1116.

SHIMODAIRA, H., AND M. HASEGAWA. 2001. CONSEL: For assessing the confidence of phylogenetic tree selection. Bioinformatics 17: 1246–1247.

SIMMONS, M. P., AND H. OCHOTERENA. 2000. Gaps as characters in sequence-based phylogenetic analyses. Systematic Biology 49: 369–381.

SMALL, J. K. 1933. Manual of the southeastern flora. University of North Carolina Press, Chapel Hill, North Carolina, USA.

SNEATH, P. H. A. AND R. R. SOKAL. 1973. Numerical taxonomy: The principles and practice of numerical classification. San Francisco: W. H. Freeman and Company.

SOLTIS, D. E., A. B. MORRIS, J. S. MCLACHLAN, P. S. MANOS, AND P. S. SOLTIS. 2006. Comparative phylogeography of unglaciated eastern North America. Molecular Ecology 15: 4261–4293.

STEFANOVIĆ, S., D. W. RICE, AND J. D. PALMER. 2004. Long branch attraction, taxon sampling, and the earliest angiosperms: Amborella or monocots? BMC Evolutionary Biology 4: 35.

STEWART, J.R. AND A. M. LISTER. 2001. Cryptic northern refugia and the origins of northern biota. Trends in Ecology and Evolution 16: 608-613.

STYLES, B. T. 1993. Genus Pinus: A Mexican purview. In T. P. Ramamoorthy, R. Bye, A. Lot, and J. Fa [eds.], Biological diversity of Mexico: Origins and distribution, pp 397-420. Oxford University Press, New York, New York, USA.

SWENSON, N.G. AND D. J. HOWARD. 2005. Clustering of contact zones, hybrid zones, and phylogeographic breaks in North America. The American Naturalist 166: 581-591.

SWOFFORD, D. L. 2002. Phylogenetic analysis using parsimony (*and other methods), version 4.0b10. Sinauer, Sunderland, Massachusetts, USA.

THIERET, J. W. 1969. Notes on Epifagus. Castanea 34: 397–402.

THIERS, B. 2012. Index Herbariorum: A global directory of public herbaria and associated staff. New York Botanical Garden's Virtual Herbarium. Website http://sweetgum.nybg.org/ih/ [accessed 2 November 2012].

TSAI, Y.-H. E., AND P. S. MANOS. 2010. Host density drives the postglacial migration of the tree parasite, Epifagus virginiana. Proceedings of the National Academy of Sciences, USA 107: 17035–17040.

VOS, P., R. HOGERS, M. BLEEKER, T. VAN DE LEE, M. HORNES, A. FRIJTERS, J. POT, AND M. KUIPER. 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Research 23: 4407–4414.

WALKER, M.J., A. K. STOCKMAN, P. L. MAREK, AND J. E. BOND. 2009. Pleistocene glacial refugia across the Appalachian Mountains and coastal plain in the millipede genus Narceus: Evidence from population genetics, phylogeographic, and paleoclimatic data. BMC Evolutionary Biology 9.

WALLROTH, K. F. W. 1825. In Orobanches Generis Diaskene, pp. 80. Frankfurt: Francofurti ad Moenum.

WALTARI, E., R. J. HIJMANS, A. T. PETERSON, A. S. NYARI, S. L. PERKINS, AND R. P. GURALNICK. 2007. Locating Pleistocene refugia: Comparing phylogeographic and ecological niche model predictions. PLoS One 7: e563-e573.

WATSON, S. 1883. Contributions to American botany. I. List of plants from southwestern Texas and northern Mexico. II. Gamopetalae to Acotyledons. Proceedings of the American Academy of Arts and Sciences 18: 96–191.

WAKASUGI, T., M. SUGITA, T. TSUDZUKI, AND M. SUGIURA. 1998. Updated gene map of tobacco chloroplast DNA. Plant Molecular Biology Reporter 16: 231–241.

WIKSTRÖM, N., V. SAVOLAINEN, AND M. W. CHASE. 2001. Evolution of the angiosperms: Calibrating the family tree. Proceedings. Biological Sciences 268: 2211–2220.

WILLIAMS, M., D. DUNKERLEY, P. DE DECKER, P. KERSHAW, AND J. CHAPPEL. 1998. Quaternary environments. Oxford University Press, New York.

WOLFE, K.H., W. H. LI, AND P. M. SHARP. 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proceedings of the National Academy of Sciences of the United States of America 84: 9054-9058.

WOLFE, K. H., C. W. MORDEN, AND J. D. PALMER. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proceedings of the National Academy of Sciences, USA 89: 10648–10652.

WOLFE, A. D., C. P. RANDLE, L. LIU, AND K. E. STEINER. 2005. Phylogeny and biogeography of Orobanchaceae. Folia Geobotanica 40: 115–134.

WOOD, C. E. J. 1972. Morphology and phytogeography: The classical approach to the study of disjunctions. Annals of the Missouri Botanical Garden 59: 107–124.

WOODSON, R. E. JR., AND R. J. SEIBERT. 1938. Contributions towards a flora of Panama II. Miscellaneous collections during 1936–1938. Annals of the Missouri Botanical Garden 25: 823–840.

XIA, Z., Y.-Z. WANG, AND J. F. SMITH. 2009. Familial placement and relations of Rehmannia and Triaenophora (Scrophulariaceae s.l.) inferred from five gene regions. American Journal of Botany 96: 519–530.

YOUNG, N. D., K. E. STEINER, AND C. W. DEPAMPHILIS. 1999. The evolution of parasitism in Scrophulariaceae/Orobanchaceae: Plastid gene sequences refute an evolutionary transition series. Annals of the Missouri Botanical Garden 86: 876–893.

YOUNG, N.D. AND C. W. DEPAMPHILIS. 2005. Rate variation in parasitic plants: correlated and uncorrelated patterns among plastid genes of different function. BMC Evolutionary Biology 5.

ZANE, L., L. BARGELLONI, AND T. PATARNELLO. 2002. Strategies for microsatellite isolation: A review. Molecular Ecology 11: 1-16.

Appendix I - AFLP

The first pilot study in this thesis used phylogenetic analyses to reconstruct the relationships within the genus Conopholis, as well as the biogeographic history of the species found in eastern North America. For these purposes, we used sequences from two plastid regions (clpP and trnfM) for a subset of populations spanning the geographic range of the two species described by Haynes (1971). At those very early stages of the study, the phylogeny obtained revealed that reciprocal monophyly between the two accepted species had not been achieved and that there could potentially be one, two, or three species within the genus. The results, however, did not resolve the relationships between individual specimens of C. americana. The complete results of the molecular phylogenetic study can now be found in Chapter Two.

In order to better resolve the phylogenetic hypothesis for Conopholis, and in particular the large section in C. americana that remained unresolved, the Amplified Fragment Length Polymorphism (AFLP) approach was considered. This method was known to be relatively easy, fast, and reliable at generating hundreds of informative characters for use in phylogenetic studies (Vos et al., 1995; Mueller and Wolfenbarger, 1999). In addition, this method can simultaneously screen different DNA regions distributed throughout the various plant genomes and does not require prior sequence information or probe generation (unlike microsatellite markers, which require primers to be developed first). Such an approach would allow polymorphisms in restriction fragments to be detected by PCR amplification, and these could then be used to assess differences between individuals, populations, and species. AFLP markers were also found to be applicable in analyses aimed at uncovering genetic variation below the species level (e.g., investigating population structure and differentiation). One known disadvantage of this approach is that it amplifies dominant markers, making it difficult to determine whether a sample is homozygous or heterozygous for any given fragment size (Mueller and Wolfenbarger, 1999). Nevertheless, we anticipated that the data gathered using this method would help resolve the relationships between the species of Conopholis and, at the same time, provide resolution for the relationships between populations sampled in eastern North America.
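The dominance problem mentioned above can be illustrated with a minimal Python sketch. It assumes a single diploid AFLP locus with one band-producing allele and one null allele; the allele labels and the function name are hypothetical and serve only as an illustration.

```python
from itertools import product

# Hypothetical allele labels for one AFLP locus: "A" amplifies a band,
# "a" is a null allele (e.g. a mutated restriction site) that does not.
ALLELES = ("A", "a")

def band_phenotype(genotype):
    """Return 1 if a band would be scored for this diploid genotype, else 0."""
    return 1 if "A" in genotype else 0

# Enumerate the unordered diploid genotypes and the band each would produce.
genotypes = sorted({"".join(sorted(pair)) for pair in product(ALLELES, repeat=2)})
scores = {g: band_phenotype(g) for g in genotypes}

for genotype, score in scores.items():
    print(genotype, "->", "band present" if score else "band absent")
```

Under these assumptions the homozygote AA and the heterozygote Aa both yield the same presence score, which is why allelic states cannot be read directly from an AFLP band matrix.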

The AFLP protocol was carried out as described in Vos et al. (1995), following modifications made in collaboration with Eugenia Lo (a previous PhD student in our lab). In summary, the DNA was digested using two restriction enzymes, EcoRI and MseI. An adaptor was subsequently ligated onto these restricted products, which were then pre-amplified and, lastly, selectively amplified using labelled primers.
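The digestion step can be sketched in silico. The following Python example is only an illustration, not the laboratory protocol: it searches a single strand for the EcoRI (G^AATTC) and MseI (T^TAA) recognition sequences, cuts at each site, and lists the fragments that carry one end of each type. The toy sequence and all function names are invented for this sketch, and adaptor ligation and amplification are not modelled.

```python
import re

# Recognition sequences and cut offsets (EcoRI: G^AATTC, MseI: T^TAA).
ENZYMES = {"EcoRI": ("GAATTC", 1), "MseI": ("TTAA", 1)}

def cut_sites(seq):
    """Return sorted (position, enzyme) pairs where each enzyme would cut."""
    sites = []
    for name, (motif, offset) in ENZYMES.items():
        # A lookahead is used so that overlapping occurrences are also found.
        for match in re.finditer(f"(?={motif})", seq):
            sites.append((match.start() + offset, name))
    return sorted(sites)

def double_digest(seq):
    """Split seq at every cut site; report each fragment with its two end types."""
    sites = [(0, "end")] + cut_sites(seq) + [(len(seq), "end")]
    return [
        (seq[left:right], left_enzyme, right_enzyme)
        for (left, left_enzyme), (right, right_enzyme) in zip(sites, sites[1:])
    ]

# Toy sequence, invented for illustration only.
toy = "CCGAATTCGGATCCTTAAGGCCGAATTCAA"
fragments = double_digest(toy)

# Keep fragments flanked by one EcoRI end and one MseI end.
mixed = [f for f in fragments if {f[1], f[2]} == {"EcoRI", "MseI"}]
for fragment, left, right in mixed:
    print(left, fragment, right, len(fragment))
```

Both recognition sequences are palindromic, so scanning one strand is sufficient for locating sites; the staggered cut on the complementary strand is ignored here for simplicity.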

At first, four EcoRI primers and eight MseI primers were used, resulting in a total of 32 possible primer combinations for the selective amplification step. All 32 primer combinations were screened using a subset of Conopholis specimens. The amount of raw data generated was large. The data matrix from each primer combination was analyzed for the total number of bands/characters and for the number of variable sites it provided. The number of primer combinations to be used was reduced to four (i.e., those that produced the highest proportion of variable sites), and these were applied to additional samples, bringing the total number of individuals with AFLP data to 36 (12 from round one and 24 from round two). On average, each primer combination provided approximately 250 characters, of which 68% were informative (Table 1). The most parsimonious tree from the combined data set of the four primer combinations is shown in Figure 1, along with bootstrap support. While there is support for some of the relationships, many of them are not supported, in particular those along the backbone of the tree.
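The screening criterion described above (proportion of variable sites per primer combination) can be sketched as a short Python calculation. The counts are those listed in Table 1; the variable and function names are hypothetical, and the ranking shown is only an illustration of the criterion, not a record of how the screening was actually scripted.

```python
# Counts taken from Table 1: (total characters, variable characters).
table1 = {
    "E-AGC + M-CAC": (101, 68),
    "E-ACT + M-CGA": (410, 279),
    "E-ACT + M-CGC": (392, 274),
    "E-ACT + M-CGG": (368, 235),
}

def percent_variable(total, variable):
    """Percentage of scored characters that are variable."""
    return 100.0 * variable / total

# Rank combinations from most to least variable.
ranked = sorted(table1, key=lambda name: percent_variable(*table1[name]), reverse=True)
for name in ranked:
    total, variable = table1[name]
    print(f"{name}: {variable}/{total} = {percent_variable(total, variable):.0f}%")

# Column totals, as in the last row of Table 1.
total_chars = sum(total for total, _ in table1.values())
total_variable = sum(variable for _, variable in table1.values())
print(f"Total: {total_variable}/{total_chars} = "
      f"{percent_variable(total_chars, total_variable):.0f}%")
```

The same function could be applied to all 32 screened combinations (4 EcoRI primers x 8 MseI primers) before keeping the top four.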

In addition to finding that relationships were not supported, a technical problem arose when analyzing the data matrix that was compiled following sequential addition of samples. When new data were added to an existing matrix, individuals would cluster with those that were generated at the same time, clearly resulting in spurious relationships (Fig. 1). After further troubleshooting, we found that this is one of the disadvantages of working with AFLP data. In addition, assessing the homology of fragment sizes is problematic (Mueller and Wolfenbarger, 1999). When combined, these two drawbacks can make the method unreliable for studies that require allelic states (heterozygosity), so that it is not particularly useful for population-level studies. At that point, it was decided that if we were to continue using AFLP as a tool for the molecular phylogenetics of the genus and for biogeographic analyses of eastern North American individuals, the full analysis would be done once collections were complete and the entire set could be run at the same time, with the same reagents and under the same conditions, so as to eliminate external factors as much as possible. The logistic disadvantage of such a plan was that, should we acquire new collections of Conopholis after performing an analysis, we could no longer include them in any study using AFLP data because of the technical problem described above. As a result, we opted to pursue other avenues for acquiring the data necessary for molecular phylogenetics and biogeography.
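One simple way to screen a band matrix for the run effect described above is to compare pairwise distances within and between data-acquisition rounds. The Python sketch below is an assumption-laden illustration: the Jaccard distance is one common choice for presence/absence data rather than the measure used in this study, and the four toy profiles and their round labels are invented.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """Jaccard distance between two binary band profiles (shared absences ignored)."""
    either = sum(1 for x, y in zip(a, b) if x or y)
    both = sum(1 for x, y in zip(a, b) if x and y)
    return 0.0 if either == 0 else 1.0 - both / either

def mean_distances_by_batch(profiles, batch):
    """Return (mean within-batch distance, mean between-batch distance)."""
    within, between = [], []
    for s1, s2 in combinations(sorted(profiles), 2):
        d = jaccard_distance(profiles[s1], profiles[s2])
        (within if batch[s1] == batch[s2] else between).append(d)
    return sum(within) / len(within), sum(between) / len(between)

# Invented toy data: four samples scored for six bands in two rounds.
profiles = {
    "s1": [1, 1, 0, 1, 0, 0],
    "s2": [1, 1, 0, 0, 0, 0],
    "s3": [0, 0, 1, 1, 1, 0],
    "s4": [0, 0, 1, 0, 1, 1],
}
batch = {"s1": 1, "s2": 1, "s3": 2, "s4": 2}

within, between = mean_distances_by_batch(profiles, batch)
print(f"within-round: {within:.2f}, between-round: {between:.2f}")
```

A markedly smaller within-round distance is only a warning sign: it would point to a technical artifact only if rounds are not confounded with geography or taxonomy, which has to be judged from the sampling design.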

With regard to the molecular systematics, we concluded that nuclear markers needed to be added to the study, in addition to the two plastid regions already used. Such nuclear sequences could then be used to develop separate phylogenetic hypotheses whose topologies can be compared and contrasted with that obtained from plastid sequences. To this end, two nuclear regions were explored, namely ITS and PHYA intron 1. The results of the pilot study using ITS in Conopholis are discussed in Appendix II. Given that nuclear-encoded PHYA sequences were used to provide the most comprehensive phylogeny of the Orobanchaceae to date, we designed a new set of primers specific to Conopholis, targeting PHYA intron 1. The sequences generated from this nuclear region proved to be useful and were used in Chapter Two.

In order to investigate the relationships between populations, as well as the genetic diversity and population structure of populations in eastern North America, we opted to develop microsatellite markers for C. americana. The development and characterization of these markers can be found in Chapter Four, while their use in phylogeographic analyses can be found in Chapter Five.

Table 1: List of the four primer combinations used in the expanded study. In the names of the primers, E represents the EcoRI primers and M represents the MseI primers with adaptor sequences.

Primer combination    Number of characters provided    Number of variable sites    Percentage of variable characters (%)
E-AGC + M-CAC         101                              68                          67
E-ACT + M-CGA         410                              279                         68
E-ACT + M-CGC         392                              274                         70
E-ACT + M-CGG         368                              235                         64
Total                 1271                             856                         67

Figure 1: Strict consensus of two equally most parsimonious trees resulting from analyses of AFLP binary data showing the relationships between 35 populations of Conopholis americana and one of C. panamensis. Bootstrap values (>50%) are indicated above branches. Species names are followed by the abbreviations of states/provinces in which they were collected along with their respective DNA accession numbers. The two major groups recovered by the analyses are labeled by gray (identifying samples used in round one of data acquisition) and white (specimens used in the second round of data acquisition) bars. The species names applied to the specimens follow the systematic treatment for Conopholis put forth by Rodrigues et al. (2013).

Appendix II - ITS

In order to develop a strong phylogenetic hypothesis at the species and population level, it is necessary to use multiple independent phylogenetic markers that provide sufficient sequence divergence. Early on in our studies focusing on Conopholis, we established that the two plastid markers used in the pilot study (clpP and trnfM) provided good support for the distinction of the lineages within the genus. However, one cannot rely exclusively on linked and uniparentally inherited plastid markers for phylogenetic inference at low phylogenetic levels (i.e., the species level) because a variety of organismal phenomena can cause discrepancies between organismal trees and gene trees (Rieseberg and Soltis, 1991; Rieseberg and Wendel, 1993). The use of nuclear markers is essential in order to provide an independent estimate of species relationships. The corroboration of phylogenetic hypotheses from the two different genomes (plastid vs. nuclear) provides further support for, and confidence in, a given phylogenetic tree.

To this end, we first explored the utility of the internal transcribed spacer (ITS) region from the nuclear ribosomal DNA. Nuclear ribosomal genes exist in tandem repeats of individual genes (18S-5.8S-26S repeats) with hundreds to thousands of copies per array (Buckler et al., 1997; Bailey et al., 2003). In angiosperms, ITS sequences vary in length between 500 and 750 bp (Baldwin et al., 1995). Such a short sequence length, coupled with the high copy number, makes their amplification by PCR relatively straightforward, even for the relatively low-quality DNA extractions obtained from herbarium specimens. ITS sequence-based phylogenetic analyses have been employed extensively for molecular systematics, making this region one of the most widely used molecular markers in plant systematics (Alvarez and Wendel, 2003).

Upwards of five months were spent amplifying and sequencing Conopholis ITS sequences. The initial PCR amplifications using universal ITS primers on a subset of samples all resulted in multiple banding patterns or a smear of PCR products. These samples were then purified and cloned. Approximately 15 cloned fragments of various insert lengths were sequenced. On average, 14 of 15 sequenced cloned fragments were of fungal origin. Plant-specific ITS primers were subsequently obtained (Daniel L. Nickrent, personal communication) and ordered. While multiple bands were still obtained per sample in almost all cases, these plant-specific primers were better at selectively amplifying Conopholis sequences. Accessions for which we were able to clone and sequence a Conopholis ITS fragment showed that individuals within this genus possess multiple copies of ITS sequences, having both a long (690–720 bp) and a short (500–515 bp) copy. A few of the short copies could be eliminated as non-functional due to the absence of the 5.8S coding region. When more than one long copy was obtained, the copies often differed at the sequence level. Nonetheless, a limit of screening no more than 25 long-fragment clones per sample was set. Within that limit, we were still only able to obtain a long ITS sequence for 20 of the 42 accessions (<50%) used in the pilot study.
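The sorting of cloned fragments into the two copy classes can be sketched with a few lines of Python. The length ranges are those reported above; the clone lengths in the example are invented, and the function name is hypothetical. Length alone does not establish that a clone is of Conopholis rather than fungal origin, so this is only a first-pass filter.

```python
# Length ranges (bp) reported above for the two ITS copy classes.
LONG_RANGE = (690, 720)
SHORT_RANGE = (500, 515)

def classify_clone(length_bp):
    """Assign a cloned ITS fragment to 'long', 'short', or 'other' by length."""
    if LONG_RANGE[0] <= length_bp <= LONG_RANGE[1]:
        return "long"
    if SHORT_RANGE[0] <= length_bp <= SHORT_RANGE[1]:
        return "short"
    return "other"

# Invented clone lengths for one accession, for illustration only.
clone_lengths = [702, 508, 715, 640, 512]
classes = [classify_clone(n) for n in clone_lengths]
print(dict(zip(clone_lengths, classes)))

# Fraction of pilot-study accessions that yielded a long copy (20 of 42).
success_rate = 20 / 42
print(f"{success_rate:.1%}")
```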

Parsimony analyses of the Conopholis ITS data set including all long copies of the sequence did not help to resolve relationships within the genus (Figure 1). There is a large unresolved polytomy for the relationships of individuals of C. americana, C. alpina, and C. panamensis. No definitive conclusion could be drawn from the results of the analyses regarding either the number of lineages within the genus or the relationships between specimens of the same species. When more than one ITS copy was obtained for a particular individual, the sequences from the same individual did not cluster with each other. This was seen across all three species (C. americana accessions AC.MA.MSF.1, AC.SF.4.2, SS06127, SS06174, SS07.80; C. alpina accessions AC.NM.WWC.5, AC.NM.BC.1.1, AC.AZ.FHR.4; C. panamensis accession AC.PAN.05).

Given that ITS was not found to be useful in addressing our main questions, for either the molecular phylogenetic study or the analysis of population-level genetic diversity, we decided to focus our attention on the use of nuclear low-copy PHYA sequences and on developing microsatellite markers specifically for Conopholis. Sequences generated from intron 1 of the PHYA gene were found to be useful for segregating three lineages in Conopholis, complementing the results from plastid sequences. The use of PHYA sequences is reported in Chapter Two. We were successful in developing microsatellite markers for C. americana (Chapter Four) that were subsequently used in phylogeographic analyses to characterize the genetic variation and structure across its distribution and to determine whether populations survived the Last Glacial Maximum in more than one refugium, as reported in Chapter Five.

Figure 1: Strict consensus of eight most parsimonious trees derived from analyses of nuclear ITS sequence data showing the relationships between clones and populations of Conopholis and Epifagus. Bootstrap values (>50%) are indicated above branches. Species names are followed by the abbreviations of the states/provinces in which the specimens were collected, along with their respective DNA accession numbers. Specimens of C. americana are labeled by a white bar, those of C. panamensis by a gray bar, and those of C. alpina by a black bar. The species names applied to the specimens follow the systematic treatment of Conopholis by Rodrigues et al. (2013).

Publications

Chapter 2 was published in American Journal of Botany. Rodrigues, A. G., A. E. L. Colwell, and S. Stefanović. (2011). American Journal of Botany 98: 896-908.

Chapter 3 was published in Systematic Botany. Rodrigues, A. G., S. Shaya, T. A. Dickinson, and S. Stefanović. (2013). Systematic Botany 38(3): in press.

Chapter 4 was published in American Journal of Botany. Rodrigues, A. G., A. E. L. Colwell, and S. Stefanović. (2012). American Journal of Botany 99: e4-e6.

Chapter 5 was submitted to Journal of Biogeography. Rodrigues, A. G. and S. Stefanović. (2013).

Copyright permission, if required, was granted by each of the publishers to reprint material.
