HYBRIDIZATION, GENOME DUPLICATION, AND CHEMICAL DIVERSIFICATION IN

THE EVOLUTION OF CALENDULA L. (COMPOSITAE)

A Dissertation

Presented to the Faculty of the Graduate School

of Cornell University

in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

by

Olofron Plume

January 2015

© 2015 Olofron Plume

HYBRIDIZATION, GENOME DUPLICATION, AND CHEMICAL DIVERSIFICATION IN

THE EVOLUTION OF CALENDULA L. (COMPOSITAE)

Olofron Plume, Ph. D.

Cornell University 2015

Hybridization and polyploidy are common in plants. Both processes can have extensive

genomic consequences, and resulting morphological, biochemical, and reproductive changes

may drive speciation. The effects of hybridization and polyploidy on speciation and biochemical

diversity were explored in Calendula, a small, circum-Mediterranean genus in Compositae.

Calendula officinalis (pot marigold), the best known species, has been cultivated for centuries

for ornamental and medicinal use. Calendula is remarkable for the wide range of chromosome

numbers (2n=14, 18, 30, 32, 44, and ~85), likely resulting from hybridization between species

with different chromosome numbers followed by genome duplication. Hypotheses of species

origins were tested using evidence from three non-coding chloroplast regions (atpI-atpH,

petL-psbE, and ndhF-rpl32), nuclear ribosomal DNA (ITS), and two putatively low-copy nuclear

markers (Chs and A39). Analyses of these markers provided support for a division of the genus

into annual and perennial polyploid complexes, multiple origins of most polyploid taxa, and a

single origin of C. officinalis. A39 was duplicated once in Calendula. Nine or more duplications

of Chs were inferred from analyses of Calendula sequences with others from Compositae.

Variation of four classes of phenolic (caffeic acid derivatives and flavonoid glycosides) and

isoprenoid (monoterpenes and sesquiterpenes) compounds was investigated within the context of

the annual and perennial polyploid complexes. Arising from different biosynthetic pathways, and

highly diverse in plants, phenolics and isoprenoids offered different perspectives on the effects of

speciation, hybridization, and polyploidy on chemical diversification. All four classes varied

quantitatively and/or qualitatively across Calendula species. For three of four compound classes,

proportions of compounds within each class were relatively steady within taxa or within tissue

types but varied across taxa or tissue types, suggesting that particular blends of compounds may

have evolutionary and ecological significance in Calendula. Neither the number of compounds

detected nor bulk concentration of any compound class increased consistently with ploidy, but

both were higher in floral tissue than in leaf tissue. Phenolic profiles were more consistent with

hypotheses of evolutionary relationships than were isoprenoid profiles. Compound profiles in the

annual polyploids showed additive compound complements or intermediate proportions of

dominant compounds relative to putative progenitors.

BIOGRAPHICAL SKETCH

Olofron Plume received a B.A. in English Literature from the University of Colorado in

1997, an M.A. in Foreign Language Education from Soka University of America in 2002, and a second B.A. in Ecology and Evolutionary Biology from the University of Colorado in 2006.


To Cindy and Cleo


ACKNOWLEDGEMENTS

Firstly, I want to thank my major advisor, Dr. Jeff Doyle, for his steadfast encouragement,

attention to detail, constructive criticism, availability, and flexibility (e.g., four years of weekly

meetings over Skype!) no matter what I tried to take on or where I was in the world, and my

entire committee, including Dr. Andre Kessler and Dr. Melissa Luckow, whose commitment to

me and to my research was (and is) unflagging. I would like to thank all of the faculty, staff,

and students of the Department of Biology and the L.H. Bailey Hortorium, and particularly

Dr. Jerold Davis, Dr. Alejandra Gandolfo, Dr. Kevin Nixon, Dr. Jim Reveal, Dr. Bill Crepet, Sue

Sherman-Broyles, Jane Doyle, Anna Stalter, and Peter Fraissinet, each of whom always made time for a conversation, a pep talk, or to share expertise and offer guidance; Peter Fraissinet for his translation of German keys to Calendula taxa; Bob Dirig who gave the best and most inspired tour of an herbarium ever when I was choosing a graduate program; Karin Jantz for keeping me on track toward graduation; Tara Nihil for managing all of my grants; and my fellow graduate students in the L.H. Bailey Hortorium who never ceased to inspire and motivate me, particularly

Cynthia Skema, Shannon Straub, Mariana Yazbek, Janelle Burke, Jim Cohen, Jeremy Coate,

Caroline Kellogg, Dan Ilut, Tee Havananda, Mary Futey, Simon Gunner, Gwynne Lim, and

Adrian Powell. I would also like to thank all the members of the Kessler Lab in the Department of Ecology and Evolutionary Biology, and particularly Rayko Halitschke for his assistance with

HPLC and GC/MS analyses. Many thanks also to the staff in the Plant Science and Gutterman greenhouses for caring for my living collection of Calendula plants, and thanks to members of

the Doyle Lab and Nancy Roberts for helping me collect nearly 300 samples on ice, over the

course of two mornings, for use in chemical analyses. As I prepare to leave Cornell, I fear I may

never again find such a vibrant and inspiring community of scientists and scholars, nor ever


again have access to such a depth and breadth of botanical and ecological knowledge and

resources all in one place.

Many people and institutions outside of Cornell contributed samples, offered field and

herbarium assistance, or otherwise provided space and/or resources for the completion of this

work. I am grateful for donations of germplasm and tissue for my research from the North

Central Regional Plant Introduction Station of the U. S. National Plant Germplasm System,

which has an extensive collection of Calendula germplasm, from the Israel Plant Gene Bank, and

from colleagues Dr. Paulo Silveira (Universidade de Aveiro, Portugal), Dr. Angelo Troia

(Università degli Studi di Palermo, Italy), and Dr. Alan Wood (Agricultural Research Council,

South Africa). I offer extra thanks to Dr. Paulo Silveira for bringing me on a collecting

expedition in southern Spain and for providing taxonomic advice and cytological observations

over the course of my studies; to Dr. Francesco Raimondo and Dr. Gianniantonio Domina for

providing lodging and facilitating my herbarium work at PAL; to Dr. Francesco Raimondo, Dr.

Gianniantonio Domina, Dr. Angelo Troia, Dr. Cynthia Skema, and several other researchers

at the Orto Botanico di Palermo for accompanying me on collecting expeditions in Sicily; to Dr.

Angelo Troia for involving me in his studies on hybridization between species of Calendula in

Sicily and in his efforts to save the threatened sea marigold (Calendula maritima); to Dr.

Mohamed Fennane and Dr. Jalal El Oualidi (Institut Scientifique de Rabat, Morocco) for

facilitating my field and herbarium work at RAB and for allowing destructive sampling from

some specimens; to Hamid Khamr (Institut Scientifique de Rabat) for accompanying me in the

field in the vicinity of Rabat; and finally to Mariana Yazbek for taking me on collecting

expeditions throughout Lebanon. I offer a very big thank you to Massey University (Palmerston


North, New Zealand) and the lab of Dr. Jennifer Tate and Dr. Vaughan Symmonds for hosting me as a visiting student during a portion of this work.

I owe an immense debt of gratitude to my family. Thank you to my parents, my sister, and my grandmother for instilling in me a fierce love of exploring and learning, whether it be within the walls of academia or elsewhere. Thank you to my in-laws who supported me and cheered me on to the finish. Thank you to my wife who has taught me much about science and even more about life, who inspired me, encouraged me, and endured my process every step of the way, who believed in me even when I did not, and who never let me give up. Finally, thank you to our daughter, who made all of this worthwhile and who continues to help me keep things in perspective.

I gratefully acknowledge funding for this work from a National Science Foundation Doctoral

Dissertation Improvement Grant (DEB-0909832), from the Botanical Society of America and the

American Society of Plant Taxonomists Graduate Research and Travel Awards, and from the

Harold E. Moore Fund.


TABLE OF CONTENTS

Biographical Sketch ...... iii
Dedication ...... iv
Acknowledgments ...... v
Table of Contents ...... viii
List of Figures ...... xi
List of Tables ...... xiii

Chapter 1 – Hybridization and polyploidy in the evolution of Calendula L. (Compositae): evidence from the chloroplast and ITS ...... 1
  1. Introduction ...... 1
  2. Materials and Methods ...... 7
    2.1. Taxon sampling ...... 7
    2.2. DNA extraction, PCR amplification, and sequencing ...... 8
    2.3. Phylogenetic and network analyses ...... 15
  3. Results ...... 19
    3.1. Phylogenetic analyses of the combined chloroplast matrix ...... 19
    3.2. Phylogenetic analyses of ITS ...... 25
    3.3. Network analysis of ITS ...... 29
  4. Discussion ...... 33
    4.1. Relationships of outgroups sampled from Calenduleae and the monophyly of Calendula ...... 33
    4.2. Divergence of Moroccan endemic species from the rest of the genus ...... 35
    4.3. Chloroplast polymorphism in C. stellata and C. tripterocarpa ...... 37
    4.4. Origins of the annual polyploids ...... 40
    4.5. Origins of the perennial polyploids ...... 42
    4.6. Origin of C. officinalis ...... 45
  References ...... 48

Chapter 2 – Utility of low-copy nuclear markers for use in phylogeny reconstruction in Calendula L. (Compositae) ...... 55
  1. Introduction ...... 55
  2. Materials and Methods ...... 59
    2.1. Taxon sampling and marker screening and selection ...... 59
    2.2. DNA extraction, PCR, cloning, and sequencing ...... 67
    2.3. Phylogenetic analyses ...... 70
  3. Results ...... 72
    3.1. Phylogenetic analyses of A39 ...... 72
    3.2. Phylogenetic analyses of Chs ...... 77
  4. Discussion ...... 84
    4.1. A39 ...... 84
    4.2. Chs ...... 87
    4.3. Implications for relationships in Calendula ...... 90
  References ...... 92

Chapter 3 – Diversity and evolution of secondary chemistry in Calendula L. (Compositae) ...... 98
  1. Introduction ...... 98
  2. Materials and Methods ...... 102
    2.1. Plant growth and sampling ...... 102
    2.2. Sampling of phenolics and small terpenes ...... 113
    2.3. Data analyses ...... 115
  3. Results ...... 131
    3.1. Monoterpenes ...... 131
    3.2. Sesquiterpenes ...... 138
    3.3. Caffeic acid derivatives ...... 143
    3.4. Flavonoid glycosides ...... 148
  4. Discussion ...... 155
    4.1. Summary of variation by chemical class ...... 155
    4.2. Some taxon-specific patterns of variation and implications ...... 159
    4.3. Variation in allopolyploids relative to progenitors ...... 161
    4.4. Tissue-specific variation ...... 166
    4.5. Problems and future directions ...... 168
  References ...... 170


LIST OF FIGURES

Figure 1.1. Visual summary of major hypotheses of species origins in Calendula ...... 3

Figure 1.2. Combined chloroplast cladogram showing identical parsimony and Bayesian topologies ...... 21

Figure 1.3. Strict consensus parsimony topologies of individual cp regions ...... 23

Figure 1.4. ITS cladogram showing the Bayesian topology in black and differences in the parsimony topology in gray ...... 26

Figure 1.5. Major Calendula network from analysis in TCS of reduced ITS dataset ...... 30

Figure 1.6. Simplified Bayesian trees showing branch lengths ...... 35

Figure 1.7. Modified hypotheses of species origins in Calendula based on 3CP and ITS analyses ...... 40

Figure 2.1. Pictorial summary of major hypotheses of species origins in Calendula from previously published work and from analyses of chloroplast and ITS markers ...... 58

Figure 2.2. Location of A39 and Chs primers relative to Arabidopsis gene models ...... 66

Figure 2.3. A39 cladogram showing the Bayesian topology in black and differences in the parsimony topology in gray ...... 73

Figure 2.4. Strict consensus tree resulting from parsimony analysis of the Chs matrix ...... 78

Figure 2.5. 50% majority rule tree resulting from Bayesian analysis of the Chs matrix ...... 80

Figure 2.6. Expected phylogeny for Chs sequences ...... 82

Figure 2.7. Pictorial summary of major hypotheses of species origins in Calendula based on evidence from the chloroplast, ITS, A39 and Chs ...... 91

Figure 3.1. Mean MT number and bulk abundance by chromosome number and tissue in the annual and perennial polyploid complexes ...... 132

Figure 3.2. 100% stacked column chart showing proportion of each MT peak in each sample ...... 134

Figure 3.3. NMDS plots of Bray-Curtis dissimilarities between the MT profiles of each sample in Calendula ...... 136


Figure 3.4. Mean ST number and bulk abundance by chromosome number and tissue in the annual and perennial polyploid complexes ...... 139

Figure 3.5. 100% stacked column chart showing proportion of each ST peak in each sample ...... 141

Figure 3.6. NMDS plots of Bray-Curtis dissimilarities between the ST profiles of each sample in Calendula ...... 142

Figure 3.7. Mean CAD number and bulk abundance by chromosome number and tissue in the annual and perennial polyploid complexes ...... 144

Figure 3.8. 100% stacked column chart showing proportion of each CAD peak in each sample ...... 146

Figure 3.9. NMDS plots of Bray-Curtis dissimilarities between the CAD profiles of each sample in Calendula ...... 147

Figure 3.10. Mean FG number and bulk abundance by chromosome number and tissue in the annual and perennial polyploid complexes ...... 150

Figure 3.11. 100% stacked column chart showing proportion of each FG peak in each sample ...... 151

Figure 3.12. NMDS plots of Bray-Curtis dissimilarities between the FG profiles of each sample in Calendula ...... 153


LIST OF TABLES

Table 1.1. Taxon sampling for 3CP and ITS analyses ...... 9

Table 1.2. Chloroplast primers generated for this study ...... 14

Table 1.3. 3CP and ITS partitions and models used for Bayesian analyses ...... 18

Table 1.4. Matrix statistics by region ...... 20

Table 2.1. Taxa used for screening of LCN primers and for generation of A39 and Chs sequences for this study ...... 60

Table 2.2. Loci, primer information, PCR programs, and amplification and sequencing outcomes for 17 low-copy nuclear markers tested for their utility in this study ...... 63

Table 2.3. Partitions, models (for Bayesian analyses), and matrix statistics for A39 and Chs matrices ...... 71

Table 2.4. Clade membership of A39 copies by species and individual ...... 76

Table 3.1. Summary of taxa sampled for this study with information on chromosome number, lifespan, breeding system, and taxon distribution ...... 104

Table 3.2. Taxon sampling for HPLC and GC/MS analyses ...... 105

Table 3.3. Mean ratio of amount of each peak to total amount of its type, bulk abundance of each type, and number of peaks for each compound class by chromosome number group and tissue type ...... 116

Table 3.4. Mean ratio of amount of each peak to total amount of its type, bulk abundance of each type, and number of peaks for each compound class by taxon and tissue type ...... 122

Table 3.5. Presence and mean proportion of several compounds in leaf samples of two putative progenitor species (C. stellata and C. tripterocarpa) and in the putative allopolyploid of these species (C. arvensis) ...... 164


CHAPTER 1

HYBRIDIZATION AND POLYPLOIDY IN THE EVOLUTION OF CALENDULA L.

(COMPOSITAE): EVIDENCE FROM THE CHLOROPLAST AND ITS

1. Introduction

Hybridization and polyploidy (whole genome duplication) have both played major roles in

the radiation and diversification of flowering plants (Arnold 2006; Mallet 2007), and it is now understood that all angiosperms have undergone at least one polyploidization event (Jiao et al.

2011). Polyploidy often accompanies hybridization (producing allopolyploids), especially in cases where hybrid fertility is reduced or absent (Chapman and Burke 2007). In addition, autopolyploidy, which results from the duplication of very similar genomes, may be much more prevalent than is generally recognized (Soltis et al. 2007). Allopolyploid evolution is complex at every level of analysis, including both pattern and process. At the level of phenotypic evolution, it has long been known that hybridization and genome doubling generate novelty at the morphological, biochemical, and physiological levels (e.g., Levin 1983; Mears 1980; Warner and

Edwards 1993; Orians 2000) and that these changes may allow polyploids to occupy new ecological niches (e.g., Otto and Whitton 2000; Madlung 2013). Wood et al. (2009) estimated that 15% of speciation events in angiosperms involved polyploidy. Despite progress toward understanding polyploid origins and effects of polyploidy on divergence and distribution in many plant groups (e.g., Glycine: Doyle et al. 2004a, 2004b; Gossypium: Wendel and Cronn 2003;

Tragopogon: Lim et al. 2008; Achillea: Ramsey 2011; Hedera: Green et al. 2013; Cardamine:


Mandáková et al. 2013; Silene: Popp and Oxelman 2007), there still remains much work to be

done, especially given the prevalence of polyploidy in plants. In this study, allopolyploid

evolution is explored in Calendula (Compositae: Calenduleae).

Calendula is a genus of twelve species (following Ohle 1974, 1975a, 1975b and Heyn et al.

1974) of annual and perennial herbs and subshrubs native to Macaronesia and the Mediterranean

basin. The most familiar of these species is C. officinalis (pot marigold), cultivated for centuries

for ornamental and medicinal use. The geographical distribution of Calendula is interesting

given that the remaining eight genera in tribe Calenduleae are South African, with only a few

members of two to three genera extending up the eastern coast of Africa, and only one other

species, Osteospermum vaillantii, occurring in the southeastern-most part of the range of

Calendula (Norlindh 1946; Greuter 2006+). The genus is well supported by morphological and

molecular data as monophyletic, and the tribe Calenduleae is well supported as monophyletic

within subfamily Asteroideae (Panero and Funk 2008; Funk 2009). Lanza’s (1919) Monografia

del Genere Calendula L. is the only monograph of the entire genus ever produced. Several

decades after the publication of Lanza’s monograph, revisions of the perennial (Ohle 1974,

1975a, 1975b) and annual (Heyn et al. 1974) taxa were produced.

Calendula is notable for its wide range of chromosome numbers across species, with counts

of 2n = 14, 18, 30, 32, 44, and ~85-88 consistently reported, each characteristic of a species or

group of species. These numbers are likely the result of multiple rounds of hybridization and

genome duplication during species divergence (Heyn et al. 1974). Morphological and karyological investigations, coupled with the additivity of observed chromosome numbers, led authors to suggest several hypotheses of species origins in the genus (summarized visually in

Fig. 1.1). Ohle (1975b) speculated that the diploid annual C. stellata (2n = 14) might represent


Figure 1.1. Visual summary of major hypotheses of species origins in Calendula (Heyn et al. 1974; Heyn and Joel 1983; Ohle 1974, 1975a, 1975b), including a simplified overview of capitula size and color. Capitula are concolorous (color of ray florets equals that of disc florets; represented by a solid circle) or bicolorous (color of ray florets differs from that of disc florets; represented by two concentric circles, the inner one darker than the outer one). Some species or groups of species have only one type of capitulum (a single circle with or without a smaller circle in its center), while others have color polymorphism (two or more partially overlapping circles with or without smaller circles in their centers). Sizes shown are not to scale, but are relative to one another (e.g., capitula in C. stellata are generally larger than those in C. arvensis, which are in turn generally larger than those in C. tripterocarpa, but some capitula in C. arvensis are about as small as those in C. tripterocarpa). There is also more variation in size than is shown here; e.g., C. officinalis can have capitula much larger than those of any other species. A proliferous capitulum is also shown for C. officinalis (a larger concentric circle with smaller concentric circles radiating from the darker inner circle). Proliferation (growth, from the ray florets of the primary capitulum, of new floral shoots terminating in smaller capitula) is common in “doubled” (having multiple series of ray florets) cultivars of C. officinalis. The small, white circle with “16” inside of it represents a hypothetical, dysploid intermediate (with 2n = 16) between C. lanzae and C. stellata. Each single circle or overlapping group of circles represents a single hypothesized lineage. The dotted box indicates a single lineage with further subdivision. X = hybridization;  = derivation from hybridization event or ancestral taxon; * = polyploidization.


the end point of a progression that began with a diploid perennial species with 2n = 18

chromosomes (such as C. maroccana) and continued through subspecies with decreasing

secondary growth, yielding an annual species with 2n = 18 chromosomes (such as C. lanzae).

From there, aneuploidy or dysploidy (resulting in a reduction from 2n = 18 to 14 chromosomes)

and continued divergence may have produced C. stellata. He found further support for the

possibility of a close relationship of C. stellata to C. lanzae in the colors of their capitula.

Calendula lanzae differs from the other species with 2n = 18 chromosomes in that it sometimes

has bicolorous capitula (with dark-brown to reddish disc florets and yellow to yellow-orange ray

florets) while the other species have concolorous capitula (discs and rays both yellow).

Calendula stellata also has bicolorous capitula (with red-purple discs and orange rays). Ohle

(1974, 1975a) had already hypothesized that hybridization and polyploidization events between

C. stellata, with 2n = 14 chromosomes, and any of the four species endemic to Morocco (the

perennial species C. eckerleinii, C. maroccana, C. meuselii and the annual species C. lanzae),

each with 2n = 18 chromosomes, may have given rise to the perennial polyploid species C.

incana and C. suffruticosa, each with 2n = 32 chromosomes. Specifically, Ohle considered C.

incana to be the product of a cross between C. stellata and C. meuselii, and C. suffruticosa the

product of C. stellata and C. eckerleinii. Implicit in these hypotheses is an additional hypothesis

(Ohle 1974) that C. incana and C. suffruticosa are, in fact, two distinct lineages. However, these

two species are often treated as a single, highly polymorphic species (C. suffruticosa) based on

morphological and karyological gradation between them (e.g., Meikle 1976; Nora et al. 2013;

Silveira et al. 2013; Greuter 2006+). Though the origin of C. officinalis is unknown, with 2n = 32

chromosomes and morphological affinities to both C. incana and C. suffruticosa, it may be the product of cultivation from one of these species, from crosses between these species, or from

another cross between C. stellata and a member of the Moroccan endemic group (Ohle 1974).

Various authors hypothesized that hybridization between C. stellata and the annual C. tripterocarpa, with 2n = 30 chromosomes, followed by genome duplication, may have produced the widespread and highly polymorphic annual, C. arvensis, with 2n = 44 chromosomes (Heyn et al. 1974; Heyn and Joel 1983). Calendula tripterocarpa, though often considered a diploid in the literature, is also apparently a polyploid based on chromosome number (2n = 30). Heyn and

Joel (1983) suggested that it could be derived from a cross between a hypothetical aneuploid or dysploid ancestor of C. stellata (with 2n = 16 chromosomes) and C. stellata itself. Heyn et al.

(1974) rejected the hypothesis that the two annual, high-polyploid species, C. pachysperma and

C. palaestina, were autopolyploids of C. arvensis despite the fact that the chromosome number they counted for both species (2n = ~85; but see 2n counts of 88 for each of the two species in

Pazy 2000) was nearly twice that of C. arvensis. They based their doubt on a strongly asymmetrical karyotype and regular (i.e., with only bivalents) meiosis in both species, and hypothesized that an additional species may have been involved in the origin of the high-polyploid species. Heyn and Joel (1983) implied that this additional species could have been C. tripterocarpa.

If these hypotheses are correct, then all taxa with 2n = 30, 32, 44, and ~85-88 chromosomes would be allopolyploid in origin, and C. stellata would have contributed its genome to all of them. It would also mean that Calendula could be divided into two polyploid complexes, a

(mostly) perennial polyploid complex of species with 2n = 14, 18, and 32 chromosomes and an annual complex of species with 2n = 14, 30, 44, and 85-88 chromosomes, with species having 2n

= 14 and 30 chromosomes being progenitors of those with 44 and 85-88 chromosomes.

Sequences from three intergenic chloroplast regions and from the internal transcribed spacers 1


and 2 and the 5.8S gene of the nuclear ribosomal DNA cistron (henceforth ITS) are here used to test these hypotheses, as well as to develop a hypothesis of relationships among all taxa.
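
The additivity of chromosome numbers that underlies the hypotheses summarized above can be illustrated with a short calculation. The following is a minimal sketch, not part of the analyses in this study: the function and constant names are hypothetical, and the model assumes simply that each parent contributes a reduced gamete (n = 2n/2) and that the hybrid genome is then doubled.

```python
# Illustrative sketch only: function and constant names are hypothetical
# and were introduced for this example.

def allopolyploid_2n(parent_a_2n: int, parent_b_2n: int) -> int:
    """Expected somatic number after hybridization plus genome duplication.

    Each parent contributes a reduced gamete (n = 2n / 2); doubling the
    hybrid genome gives a 2n equal to the sum of the parental 2n values.
    """
    hybrid_2n = parent_a_2n // 2 + parent_b_2n // 2
    return 2 * hybrid_2n


def autopolyploid_2n(parent_2n: int) -> int:
    """Expected somatic number after duplication of a single genome."""
    return 2 * parent_2n


# Somatic chromosome numbers (2n) as reported in the text.
C_STELLATA = 14
MOROCCAN_ENDEMIC = 18        # C. eckerleinii, C. maroccana, C. meuselii, C. lanzae
HYPOTHETICAL_DYSPLOID = 16   # hypothetical ancestor (Heyn and Joel 1983)
C_TRIPTEROCARPA = 30
C_ARVENSIS = 44

expected = {
    "C. incana / C. suffruticosa": allopolyploid_2n(C_STELLATA, MOROCCAN_ENDEMIC),
    "C. tripterocarpa": allopolyploid_2n(HYPOTHETICAL_DYSPLOID, C_STELLATA),
    "C. arvensis": allopolyploid_2n(C_STELLATA, C_TRIPTEROCARPA),
    "doubling of C. arvensis": autopolyploid_2n(C_ARVENSIS),
}

for taxon, two_n in expected.items():
    print(f"{taxon}: expected 2n = {two_n}")
```

Under this simple additive model the expected values are 32, 30, 44, and 88, matching the counts reported for the corresponding taxa, except that the count of ~85 for the high polyploids falls slightly short of twice that of C. arvensis.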

2. Materials and Methods

2.1. Taxon sampling

Sampling in Calendula was guided by the taxonomic treatments of Lanza (1919), Norlindh

(1962), Heyn et al. (1974), Ohle (1974, 1975a, 1975b), and Greuter (2006+), by observation of wild and cultivated material, and specimens at the herbaria of the Bailey Hortorium (BH),

Harvard (HUH), the University of Palermo (PAL), the American University in Beirut (BEI), and the Institut Scientifique in Rabat (RAB), and by discussions with colleagues working on

treatments of Calendula or on the Mediterranean flora. Sampling of outgroup taxa from

Calenduleae followed Norlindh (1977), Nordenstam (2007), Nordenstam and Källersjö (2009), and Barker et al. (2009). Some individuals included in these analyses were of wild origin, collected in the field either by the author or colleagues, or sampled from herbarium specimens.

Other individuals included were grown from seed in greenhouses at Cornell University (Ithaca,

NY, USA). Seed originated from nurseries, the USDA North Central or Washington Regional

Plant Introduction Stations (NCRPIS or WRPIS), or field collections. One individual

(Osteospermum barberiae) was purchased from a local nursery. DNA was extracted from leaf

tissue (fresh, silica-dried, or from herbarium sheets) from a total of 108 individuals. This

included seven individuals belonging to four outgroup genera (Garuleum, Dimorphotheca,

Osteospermum, and Tripteris; all in Calenduleae) and 100 individuals belonging to Calendula

representing all twelve currently accepted species of the genus, all five accepted subspecies of C. incana, six of nine subspecies of C. suffruticosa, a range of form variants of the highly polymorphic C. arvensis, and six different cultivars of C. officinalis. New sequence data were generated for this study from the individuals described. See Table 1.1 for more detailed sampling information, including voucher information, collection locality, contributing herbarium, and seed source (as applicable).

An additional thirteen ITS sequences were downloaded from GenBank

(http://www.ncbi.nlm.nih.gov/genbank/), with one (O. fruticosum AF422131) derived from

Wagstaff and Breitwieser (2002) and the remaining twelve derived from Barker et al. (2009).

These data added 10 new outgroup species, including three individuals from a fifth genus in

Calenduleae (Chrysanthemoides), to the ITS dataset. In the ITS tree, these sequences are preceded with an asterisk and followed by the GenBank accession number.

2.2. DNA extraction, PCR amplification, and sequencing

Total genomic DNA was extracted for this study from fresh or silica-dried leaf tissue using either a modified CTAB protocol (Doyle and Doyle 1987; with the addition of 2% polyvinylpyrrolidone to the extraction buffer) or the QIAGEN DNeasy Plant Mini Kit following the manufacturer's protocol, including recommended steps, with minor modifications (≤

50 mg fresh tissue or ≤ 20 mg dried tissue was ground in liquid nitrogen, 500 µl Buffer AP1 was added, RNase A was not added at step 2, 140 µl Buffer AP2 was added). Total genomic DNA was extracted from herbarium tissue using the QIAGEN DNeasy Plant Mini Kit as described above with the following additional modifications: immediately after addition of Buffer AP1 to


Table 1.1. Taxon sampling for 3CP and ITS analyses. *Cultivated individuals were grown from seed in greenhouses at Cornell University, Ithaca, NY, USA. Seed source letter codes indicate the following: PI = USDA-GRIN (www.ars-grin.gov); IGB = The Israel Gene Bank (igb.agri.gov.il/main/index.pl); SHS = Silver Hill Seeds (www.silverhillseeds.co.za); BA = Bakers Acres (www.bakersacres.net); AT = Angelo Troia (University of Palermo, Italy); AW = Alan Wood (University of Stellenbosch, South Africa); OP = O. Plume; PS = P. Silveira (University of Aveiro, Portugal); MS = M. Sequeira (University of Madeira, Portugal; via PS); pop. = seed collection from population rather than individual.


Taxon | Individual ID | Voucher (herbarium) | Collection type (seed source)* | Origin | 3CP | ITS
Calendula L. arvensis (Vaill.) L. | Carv5 | O. Plume 5 (BH) | Cultivated (PI 633645) | Beja, Portugal | x | x
C. arvensis | Carv6 | O. Plume 6 (BH) | Cultivated (PI 578097) | Turkey | x | x
C. arvensis | Carv7 | O. Plume 7 (BH) | Cultivated (PI 597586) | Cadiz, Spain | x | x
C. arvensis | Carv8 | O. Plume 8 (BH) | Cultivated (PI 597587) | Apulia, Italy | x | x
C. arvensis | Carv63 | O. Plume 63 (BH) | Wild | Agrigento, Italy | x | x
C. arvensis | Carv92 | O. Plume 92 (BH) | Wild | Lercara Friddi, Italy | x | x
C. arvensis | Carv118 | O. Plume 118 (BH) | Wild | Castellana Sicula, Italy | x | x
C. arvensis | Carv126 | O. Plume 126 (BH) | Wild | Gangi, Italy | x | x
C. arvensis | Carv129 | O. Plume 129 (BH) | Wild | Gangi, Italy | x | x
C. arvensis | Carv149 | O. Plume 149 (BH) | Cultivated (PI 6031092) | Greece | x | x
C. arvensis | Carv157 | O. Plume 157 (BH) | Cultivated (PS 2980) | Almeria, Spain | x | x
C. arvensis | Carv234 | O. Plume 234 (BH) | Wild | Hermel, Lebanon | x | x
C. arvensis | Carv416 | O. Plume 416 (BH) | Cultivated (PI 578099) | Morocco | x | x
C. arvensis | Carv424 | O. Plume 424 (BH) | Cultivated (PI 597585) | France | x | x
C. arvensis | Carv435 | O. Plume 435 (BH) | Cultivated (PI 578100) | Spain | x | x
C. arvensis | Carv436 | O. Plume 436 (BH) | Cultivated (PI 305289) | Spain | x | –
C. arvensis | Carv448 | O. Plume 448 (BH) | Cultivated (OP pop. 239) | Hermel, Lebanon | x | x
C. arvensis | Carv450 | O. Plume 450 (BH) | Cultivated (OP pop. 220) | Naqoura, Lebanon | x | x
C. arvensis | Carv452 | O. Plume 452 (BH) | Cultivated (OP 66) | Modica, Italy | x | x
C. arvensis | Carv455 | O. Plume 455 (BH) | Cultivated (PI 618687) | Coimbra, Portugal | x | –
C. arvensis | Carv462 | O. Plume 462 (BH) | Cultivated (PS 2979) | Almeria, Spain | x | x
C. arvensis | Carv465 | O. Plume 465 (BH) | Cultivated (OP pop. Sp1) | Spain | x | x
C. arvensis | Carv470 | O. Plume 470 (BH) | Cultivated (PI 613017) | Morocco | x | –
C. arvensis | Carv477 | O. Plume 477 (BH) | Cultivated (OP pop. 245) | Qaa, Lebanon | x | x
C. eckerleinii Ohle | Ceck15 | O. Plume 15 (BH) | Cultivated (PI 603110) | Morocco | x | x
C. eckerleinii | Ceck16 | O. Plume 16 (BH) | Cultivated (PI 603110) | Morocco | x | –
C. eckerleinii | Ceck287 | O. Plume 287 (BH) | Cultivated (PI 603110) | Morocco | x | –
C. eckerleinii | Ceck3064 | P. Silveira 3064 (AVE) | Wild | Morocco | x | x
C. incana Willd. subsp. algarbiensis (Boiss.) Ohle | Cialg145 | O. Plume 145 (BH) | Cultivated (PS 2899) | Beja, Portugal | x | x
C. incana subsp. incana | Cii151 | O. Plume 151 (BH) | Cultivated (PS 2937b) | Andalucia, Spain | x | x
C. incana subsp. incana | Cii367 | O. Plume 367 (BH) | Cultivated (PS 3043) | Spain | x | –
C. incana subsp. incana | Cii2937b | P. Silveira 2937b (AVE) | Wild | Andalucia, Spain | x | x
C. incana subsp. maderensis (DC.) Ohle | Cimad142 | O. Plume 142 (BH) | Cultivated (MS 5676) | Madeira | x | x
C. incana subsp. maderensis | CimadKEW | Chase 19982 (KEW) | Wild | Madeira | x | –
C. incana subsp. maritima (Guss.) Ohle | Cimar1 | O. Plume 1 (BH) | Cultivated (PI 597596) | Trapani, Italy | x | x
C. incana subsp. maritima | Cimar2 | O. Plume 2 (BH) | Cultivated (PI 597596) | Trapani, Italy | x | x
C. incana subsp. maritima | Cimar59 | O. Plume 59 (BH) | Wild | Trapani, Italy | x | x
C. incana subsp. maritima | Cimar60 | O. Plume 60 (BH) | Wild | Trapani, Italy | x | x
C. incana subsp. maritima X C. suffruticosa subsp. fulgida | Cmxf54 | O. Plume 54 (BH) | Wild | Trapani, Italy | x | x
C. incana subsp. maritima X C. suffruticosa subsp. fulgida | Cmxf55 | O. Plume 55 (BH) | Wild | Trapani, Italy | x | x
C. incana subsp. maritima X C. suffruticosa subsp. fulgida | Cmxf57 | O. Plume 57 (BH) | Wild | Trapani, Italy | x | x
C. incana subsp. microphylla (Willk.) Ohle | Cim17 | O. Plume 17 (BH) | Cultivated (PI 633647) | Serra da Boa Viagem, Portugal | x | x
C. incana subsp. microphylla (cf.) | Cim146 | O. Plume 146 (BH) | Cultivated (PS 2976) | Pontevedra, Spain | x | x
C. incana subsp. microphylla | Cim218 | O. Plume 218 (BH) | Cultivated (PI 633647) | Serra da Boa Viagem, Portugal | x | x
C. incana subsp. microphylla (cf.) | Cim2976 | P. Silveira 2976 (AVE) | Wild | Pontevedra, Spain | – | x
C. incana subsp. microphylla | Cim3027 | P. Silveira 3027 (AVE) | Wild | Berlengas, Portugal | x | x
C. lanzae Maire | Clan22 | 6522 (GAT) | Wild | Morocco | x | –
C. lanzae | Clan23 | 6523 (GAT) | Wild | Morocco | x | –
C. lanzae | Clan24 | 6524 (GAT) | Wild | Morocco | x | –
C. lanzae | Clan44066 | 44066 (RAB) | Wild | Morocco | x | x
C. maroccana (Ball) B. D. Jacks | Cmar13 | O. Plume 13 (BH) | Cultivated (PI 578104) | Morocco | x | x
C. maroccana | Cmar14 | O. Plume 14 (BH) | Cultivated (PI 578104) | Morocco | x | –
C. maroccana | Cmar165 | O. Plume 165 (BH) | Cultivated (PI 607416) | Morocco (Germany) | x | x
C. maroccana | Cmar213 | O. Plume 213 (BH) | Cultivated (PI 607417) | Morocco (France) | x | x
C. meuselii Ohle | Cmeu3063 | P. Silveira 3063 (AVE) | Wild | Morocco | x | x
C. officinalis L. | Co163 | O. Plume 163 (BH) | Cultivated (PI 578109) | Algeria | x | x
C. officinalis | Co170 | O. Plume 170 (BH) | Cultivated (PI 578106) | Kazakhstan | x | x
C. officinalis | Co172 | O. Plume 172 (BH) | Cultivated (PI 293762) | Former Soviet Union | x | x
C. officinalis | Co341 | O. Plume 341 (BH) | Cultivated (PI 420253) | Ribatejo, Portugal | x | –
C. officinalis | Co349 | O. Plume 349 (BH) | Cultivated (PI 420375) | Spain | x | x
C. officinalis | Co351 | O. Plume 351 (BH) | Cultivated (PS 2986c) | Ribatejo, Portugal | x | x
C. pachysperma Zohary | Cpac207 | O. Plume 207 (BH) | Cultivated (IGB 20562) | Samaria Mountains, Israel | x | x
C. palaestina Boiss. | Cpal219 | O. Plume 219 (BH) | Cultivated (IGB 21124) | Mount Carmel, Israel | x | x
C. suffruticosa Vahl. | Csuf3 | O. Plume 3 (BH) | Cultivated (PI 607419) | Libya | – | x
C. suffruticosa | Csuf4 | O. Plume 4 (BH) | Cultivated (PI 607419) | Libya | x | x
C. suffruticosa | Csuf9 | O. Plume 9 (BH) | Cultivated (PI 633646) | Italy | x | –
C. suffruticosa | Csuf103 | O. Plume 103 (BH) | Wild | Mondello, Italy | – | x
C. suffruticosa | Csuf107 | O. Plume 107 (BH) | Wild | Mondello, Italy | – | x
C. suffruticosa | Csuf133 | O. Plume 133 (BH) | Wild | Gangi, Italy | x | x
C. suffruticosa subsp. boissieri Lanza | Csb156 | O. Plume 156 (BH) | Cultivated (PI 656802) | Algeria | x | x
C. suffruticosa subsp. carbonellii Ohle | Csc176 | O. Plume 176 (BH) | Wild | Fuengirola, Spain | x | x
C. suffruticosa subsp. fulgida (Raf.) Guadagno | Csf58 | O. Plume 58 (BH) | Wild | Trapani, Italy | x | x
C. suffruticosa subsp. fulgida | Csf111 | O. Plume 111 (BH) | Wild | Mondello, Italy | – | x
C. suffruticosa subsp. fulgida | Csf121 | O. Plume 121 (BH) | Wild | Gangi, Italy | x | x
C. suffruticosa subsp. fulgida | Csf123 | O. Plume 123 (BH) | Wild | Gangi, Italy | x | x
C. suffruticosa subsp. fulgida | Csf137 | O. Plume 137 (BH) | Wild | Rebuttone, Italy | x | x
C. suffruticosa subsp. fulgida | Csf166 | O. Plume 166 (BH) | Cultivated (PI 607420) | Mt. Erice, Italy | x | x
C. suffruticosa subsp. fulgida | Csf202 | O. Plume 202 (BH) | Cultivated (AT pop. HA) | Mondello, Italy | x | x

Table 1.1. (Continued) Individual Voucher Collection type Taxon ID (herbarium) (seed source)* Origin 3CP ITS C. suffruticosa subsp. fulgida Csf204 O. Plume 204 (BH) Cultivated (AT pop. HA) Mondello, Italy x – C. suffruticosa subsp. fulgida Csf381 O. Plume 381 (BH) Cultivated (PI 613021) Italy x x C. suffruticosa subsp. greuteri Ohle Csg143 O. Plume 143 (BH) Cultivated (PS 2983) Granada, Spain x x C. suffruticosa subsp. greuteri Csg180 O. Plume 180 (BH) Wild Pampaneira, Spain – x C. suffruticosa subsp. greuteri Csg392 O. Plume 392 (BH) Cultivated (PS 2983) Granada, Spain x – C. suffruticosa subsp. lusitanica (Boiss.) Ohle Csl12 O. Plume 12 (BH) Cultivated (PI 649652) Santarem, Portugal x x C. suffruticosa subsp. lusitanica Csl159 O. Plume 159 (BH) Cultivated (PS 3025) Estremadura, Portugal x x C. suffruticosa subsp. lusitanica (cf.) Csl2942 P. Silveira 2942 (AVE) Wild Monchique, Portugal x x C. suffruticosa subsp. lusitanica Csl3015 P. Silveira 3015 (AVE) Wild Estremadura, Portugal x x C. suffruticosa subsp. lusitanica (cf.) Csl3016 P. Silveira 3016 (AVE) Wild Estremadura, Portugal x x C. suffruticosa subsp. lusitanica (cf.) Csl3017a P. Silveira 3017a (AVE) Wild Estremadura, Portugal x x C. suffruticosa subsp. lusitanica (cf.) Csl3023 P. Silveira 3023 (AVE) Wild Estremadura, Portugal – x C. suffruticosa subsp. suffruticosa Css3038 P. Silveira 3038 (AVE) Wild Tunisia x x C. suffruticosa var. tunetana (Cuenod) Ohle Cstun3039d P. Silveira 3039d (AVE) Wild Tunisia x x C. stellata Cav. Cste10 O. Plume 10 (BH) Cultivated (PI 603114) Morocco x x C. stellata Cste18 O. Plume 186 (BH) Cultivated (PI 603114) Morocco x x C. stellata Cste169 O. Plume 169 (BH) Cultivated (PI 649651) El Jadida, Morocco x x C. stellata Cste319 O. Plume 319 (BH) Cultivated (PI 649651) El Jadida, Morocco x – C. stellata Cste321 O. Plume 321 (BH) Cultivated (OP pop. 250) Sidi Kacem, Morocco x x C. tripterocarpa Rupr. Ctri139 O. Plume 139 (BH) Cultivated (PS 2982) Almeria, Spain x x C. 
tripterocarpa Ctri3066 P. Silveira 3066 (AVE) Wild Morocco x x C. tripterocarpa Ctri77193 77193 (RAB) Wild Morocco x – Dimorphotheca Moench acutifolia Hutch. Dacu214 O. Plume 214 (BH) Cultivated (SHS 4328) South Africa x x D. tragus (Aiton) B. Nord. Dtra19 O. Plume 19 (BH) Cultivated (PI 263145) New York, USA x x Garuleum Cass. pinnatifidum DC. Gpin174 O. Plume 174 (BH) Cultivated (SHS 11408) South Africa x x Osteospermum L. barberiae (Harv.) Norl. Obar173 O. Plume 173 (BH) Purchased (BA) Ithaca, USA x x O. fruticosum (L.) Norl. Ofru209 O. Plume 209 (BH) Cultivated (SHS 5205) South Africa x x O. rigidum Aiton Orig208 O. Plume 208 (BH) Cultivated (SHS 13557) South Africa x x Tripteris Less. clandestina Less. Tcla373 O. Plume 373 (BH) Cultivated (AW s.n.) South Africa x x

12

ground tissue, 30 µL QIAGEN Proteinase K (>600 mAU/ml) was added, and the sample was

incubated on an orbital shaker at 125 rpm at 42°C for 19.5 hours. The addition of Proteinase K

was repeated either twice, followed by incubation steps of 12 hours each, or once, followed by

an incubation step of 35 hours. After the final incubation with Proteinase K, the extraction

continued following the manufacturer's protocol (starting with the addition of Buffer AP2).

Ten non-protein-coding chloroplast regions with demonstrated phylogenetic utility at low

taxonomic levels were selected from the literature for screening: psbM-trnD and ycf6-psbM from

Shaw et al. (2005); atpI-atpH, ndhF-rpl32, psbD-trnT, petL-psbE, and trnQ-5’rps16 from Shaw et al. (2007); and ndhC-trnV, trnL-rpl32, and trnY-trnE-rpoB from Timme et al. (2007). To determine the utility of these regions, all were first sequenced for a subset of taxa (four individuals representing a range of morphological and karyological diversity), using primers as published for each region in the cited studies. Typical PCR reactions were performed in 20 µl volumes with 1 µl template DNA (0.1-1.0 ng), 2 µl 1X Standard Taq Reaction Buffer (New

England Biolabs, Ipswich, MA), 0.2 mM each dNTP (New England Biolabs, Ipswich, MA), 0.5

µM of each primer (prepared by Invitrogen™, Life Technologies, Grand Island, NY), and 1 unit

Taq DNA polymerase (New England Biolabs, Ipswich, MA) using either Techne or MJ

thermocyclers. PCR reaction conditions for the screening of all regions were as follows: 1 cycle

at 94°C for 2 min; 40 cycles of 94°C for 30 sec, 52°C for 30 sec, and 72°C for 1 min; 1 cycle at

72°C for 5 min. Criteria used to select regions for further analysis were amplification in all four

individuals, amplification of single bands, and phylogenetically informative sequence variation

between individuals. Of those that fit the first two criteria (atpI-atpH, ndhF-rpl32, petL-psbE, rpl32-trnL, trnQ-5’rps16, trnY-rpoB, ycf6-psbM), only three had enough sequence variation across all individuals to warrant further investigation (atpI-atpH, ndhF-rpl32, petL-psbE). These


three regions were then successfully sampled from all or most available taxa, again using the published primers (Shaw et al. 2007 primer pairs: atpI and atpH; ndhF and rpl32-R; petL and psbE) as well as six additional internal primers (two for each region) designed to improve amplification across the entire length of each region (see Table 1.2). All primers were used for both amplification and sequencing. PCR reaction conditions for amplification in the full taxon set were as described above except that annealing temperature was sometimes adjusted to improve amplification and ranged from 50°C to 55°C.

Table 1.2. Chloroplast primers generated for this study

Region | Primer name and sequence (5’ – 3’)
atpI-atpH | atpI2: AAATTCCCGTTTACTCCCTCCC
atpI-atpH | atpH2: CCCTAACCGCTCCTTGAATTCTTC
ndhF-rpl32 | ndhF2: TTCACCGGATCTTACCTCTTTCG
ndhF-rpl32 | rpl32R2: GGTTTGGAACTCTTAGTTCTCGTTG
petL-psbE | petL2: AACATCTACCCATACCGCGTTTGC
petL-psbE | psbE2: AGGGCTAGAAACTGAGGATCTGGT
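The thermocycling profile described in this section can be written as a small data structure, which makes the programmed run time easy to check. The sketch below is only an illustration: the names `Step`, `screening_profile`, and `programmed_minutes` are hypothetical, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


def screening_profile(anneal_c: float = 52.0, cycles: int = 40):
    """Return (initial, cycled, final) steps for the profile given in the text."""
    if not 50.0 <= anneal_c <= 55.0:
        # 52 C was used for screening; 50-55 C is the range reported for the full taxon set.
        raise ValueError("annealing temperature outside the 50-55 C range in the text")
    initial = [Step(94, 120)]
    cycled = [Step(94, 30), Step(anneal_c, 30), Step(72, 60)]
    final = [Step(72, 300)]
    return initial, cycled * cycles, final


def programmed_minutes(profile) -> float:
    """Sum hold times only; ramp times depend on the thermocycler and are not modelled."""
    return sum(step.seconds for part in profile for step in part) / 60
```

Under these assumptions the screening profile amounts to 87 minutes of programmed hold time, before any instrument-specific ramping.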

ITS was amplified using the primers ITS4 (White et al. 1990) and ITS5a (Downie and Katz-

Downie 1996). PCR reaction conditions were as described for chloroplast regions except that, if

cloning was to be performed, then three replicate reactions were prepared for each individual and

pooled after PCR but before cloning. This was to compensate for possible PCR amplification

bias and increase the chance of recovering all copies (Wagner et al. 1994). Also, the final hold at

72°C was extended to 10 min. Most individuals with more than two polymorphisms in ITS were

cloned using the TOPO® TA Cloning® Kit for Sequencing (Invitrogen™, Life Technologies,

Grand Island, NY) following the manufacturer's protocol. Eight or more colonies were selected for

sequencing from each cloned individual with the goal of obtaining enough clones to resolve all

polymorphisms visible in the direct sequence. For individuals with only one or two polymorphic


sites, all possible haplotypes were created (two real or four hypothetical haplotypes respectively)

from the direct sequences and included in analyses.
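The enumeration of haplotypes from a direct sequence can be sketched as follows. This is a minimal illustration, assuming that each polymorphic site is recorded as a two-base IUPAC ambiguity code; the function name `expand_haplotypes` and the `max_sites` guard are hypothetical and are not taken from the study.

```python
from itertools import product

# IUPAC two-base ambiguity codes, assumed here to mark polymorphic sites in a direct sequence.
IUPAC_TWO_BASE = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def expand_haplotypes(direct_seq: str, max_sites: int = 2) -> list[str]:
    """Enumerate every haplotype compatible with a direct sequence."""
    seq = direct_seq.upper()
    sites = [i for i, base in enumerate(seq) if base in IUPAC_TWO_BASE]
    if len(sites) > max_sites:
        # In the text, individuals with more polymorphisms were mostly cloned instead.
        raise ValueError("too many polymorphic sites to enumerate haplotypes")
    choices = [IUPAC_TWO_BASE[seq[i]] for i in sites]
    haplotypes = []
    for combo in product(*choices):
        bases = list(seq)
        for pos, base in zip(sites, combo):
            bases[pos] = base
        haplotypes.append("".join(bases))
    return haplotypes
```

With one polymorphic site the two resulting haplotypes are both real; with two sites the phase is unknown, so only two of the four enumerated haplotypes can actually be present, which is why the text calls them hypothetical.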

Success of PCR reactions was determined by gel electrophoresis of 2-5 µl of each PCR

product. All successful PCR products were cleaned before sequencing (but not before cloning)

by adding 3 µl Standard Taq Reaction Buffer, 10 units Exonuclease I, and 0.5 units Antarctic

Phosphatase (all reagents from New England Biolabs, Ipswich, MA) to each 20 µl of PCR product (reagents scaled as necessary to compensate for evaporation or removal of product for gel electrophoresis) and incubating for 45 min at 37°C, then 10 min at 90°C. Capillary DNA sequencing of both forward and reverse strands was carried out using either amplification primers (for direct sequencing) or vector primers (for clones; using T3 and T7 primers as published in the TOPO® TA Cloning® Kit for Sequencing manual) and BigDye Terminator

v3.1 chemistry (Applied Biosystems®, Life Technologies, Grand Island, NY) at Cornell

University Life Sciences Core Laboratories Center (CLC) on a 3730 DNA analyzer (Applied

Biosystems).
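The scaling of cleanup reagents to the remaining product volume can be illustrated with a short calculation. This sketch assumes simple linear scaling from the 20 µl recipe given above; the function name `cleanup_mix` and the dictionary keys are hypothetical.

```python
def cleanup_mix(product_ul: float) -> dict[str, float]:
    """Scale the per-20-ul cleanup recipe in the text to a given product volume."""
    if product_ul <= 0:
        raise ValueError("product volume must be positive")
    factor = product_ul / 20.0  # assumed linear scaling
    return {
        "taq_buffer_ul": 3.0 * factor,
        "exonuclease_i_units": 10.0 * factor,
        "antarctic_phosphatase_units": 0.5 * factor,
    }
```

For example, if 5 µl of a 20 µl reaction were used for the gel, `cleanup_mix(15)` gives the amounts for the remaining 15 µl.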

2.3. Phylogenetic and network analyses

Sequences were aligned using MUSCLE (Edgar 2004) via the MUSCLE webserver

(http://www.ebi.ac.uk) and alignments were adjusted by eye in Winclada (Nixon 1999a). A small

proportion of bases at the 5’ and 3’ ends of each chloroplast alignment was trimmed due to

missing or unreliable data for many taxa at the beginnings and ends of sequence chromatograms.

The ITS alignment was trimmed to span the region from the beginning of ITS1 to the end of

ITS2 (i.e., the 3’ portion of 18S and 5’ portion of 26S were removed). Identical sequences in


each of the four datasets (atpI-atpH, ndhF-rpl32r, petL-psbE, and ITS) were identified using the

ALTER web interface (Glez-Peña et al. 2010; available at http://sing.ei.uvigo.es/ALTER/) by uploading each alignment as a Nexus file, then using the following output settings: select program=General, format=FASTA, “lower case” and “match first” (under “general”) unselected,

“collapse sequences to haplotypes” selected and “treat gaps as missing data” and “count missing data as difference” both unselected (under “haplotypes collapse”). Gaps were coded in all four aligned datasets in SeqState (Müller 2005; 2006) using the simple indel coding method of

Simmons and Ochoterena (2002). The three chloroplast alignments plus chloroplast gap characters were concatenated into one dataset, and a new haplotype was determined for each individual based on the combined data. Identical haplotypes were collapsed into a single terminal per taxon, and the final matrix was named the 3CP matrix. Chloroplast haplotype codes are included in Figs. 1.2 and 1.3. Identical ITS sequences were also collapsed into a single terminal per taxon for the ITS matrix.
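The logic of simple indel coding can be sketched in a few lines. This is an illustration of the general method rather than a reproduction of SeqState: the function name `simple_indel_coding` is hypothetical, and treating leading and trailing gaps as missing data rather than as indels is an assumption of the sketch.

```python
import re


def simple_indel_coding(alignment: dict[str, str]) -> dict[str, str]:
    """Code each distinct internal gap as one character: 1 present, 0 absent, ? inapplicable."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    width = lengths.pop()

    runs = {}      # every gap run per sequence, as half-open (start, end) intervals
    internal = {}  # gap runs that touch neither end of the alignment
    for name, seq in alignment.items():
        runs[name] = [(m.start(), m.end()) for m in re.finditer(r"-+", seq)]
        internal[name] = [g for g in runs[name] if g[0] > 0 and g[1] < width]

    # One binary character per distinct pair of gap start and end positions.
    indels = sorted({g for gaps in internal.values() for g in gaps})

    coded = {}
    for name in alignment:
        states = []
        for start, end in indels:
            if (start, end) in internal[name]:
                states.append("1")
            elif any(s <= start and end <= e for s, e in runs[name]):
                # A longer gap spans this indel, so its presence cannot be scored.
                states.append("?")
            else:
                states.append("0")
        coded[name] = "".join(states)
    return coded
```

The resulting strings correspond to the binary gap characters that were appended to the DNA characters in each matrix.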

Parsimony and Bayesian phylogenetic analyses were performed separately on the 3CP and

ITS matrices. Parsimony analyses were conducted in TNT v1.1 (Goloboff et al. 2008). Twenty thousand tree bisection and reconnection (TBR) searches were first performed, saving 20 trees per replicate, with characters equally weighted, uninformative characters deactivated, random seed set as time, and space for 40,000 trees in memory. This was followed by 5000 iterations of the parsimony ratchet (Nixon 1999b) with probability of upweighting or downweighting a character set to 5, 50 cycles of tree drifting, and 5 cycles of tree fusion. With space for trees in memory (maxtrees) increased to 1,000,000, all trees in memory were swapped using TBR, either to completion or until the 1,000,000 tree limit was reached, and the strict consensus tree was calculated from all most-parsimonious trees. Ten thousand bootstrap replications were conducted


(each with 20 TBR searches, holding 20 trees per search, and 200 ratchet iterations with probabilities of upweighting and downweighting again set to 5). Bootstrap support values were calculated based on the frequency of each clade occurring in the pool of 10,000 strict consensus trees generated by the bootstrap replicates in Winclada.
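The frequency calculation behind the bootstrap values can be sketched as below. This is not the Winclada implementation; it assumes each replicate consensus tree has already been reduced to a set of clades, each clade being a frozenset of terminal names, and the function name `clade_support` is hypothetical.

```python
from collections import Counter


def clade_support(replicate_trees, reference_clades):
    """Percentage of replicate trees in which each reference clade appears."""
    if not replicate_trees:
        raise ValueError("need at least one replicate tree")
    counts = Counter()
    for clades in replicate_trees:
        # Count each clade at most once per replicate tree.
        counts.update(set(clades))
    n = len(replicate_trees)
    return {clade: 100.0 * counts[clade] / n for clade in reference_clades}
```

A clade present in 9,900 of 10,000 replicate consensus trees would receive a support value of 99 under this calculation.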

Bayesian analyses of the 3CP and ITS matrices were performed separately in MrBayes 3.1.2

(Huelsenbeck and Ronquist 2001; Ronquist and Huelsenbeck 2003) on XSEDE via the CIPRES

Science Gateway (Miller et al. 2010; available at http://www.phylo.org/portal2/login!input.action). Both 3CP and ITS matrices contained DNA characters and binary gap-code characters, so in the Nexus files for each matrix, “Datatype” was set to “Mixed” for both matrices, with DNA characters set to “DNA” and gap-code characters set to “Restriction” using the syntax “FORMAT

DATATYPE=Mixed(dna:[range],restriction:[range])”. The 3CP and ITS matrices were further partitioned by region for model estimation and analysis (Table 1.3). Models for each DNA partition were estimated using the corrected Akaike Information Criterion (AICc; Akaike 1974;

Sugiura 1978; Hurvich and Tsai 1989) in jModelTest 0.1.1 (Posada 2008). Gap-code partitions were analyzed under the binary model in MrBayes using the command “lset coding = variable”.

For each analysis, two independent Markov Chain Monte Carlo (MCMC) runs were performed in MrBayes starting from random trees, each with four chains (1 cold and 3 heated chains, temperature parameter set to 0.2), and run for 20 million generations, sampling every 1000 generations. The resulting MCMC trace files were evaluated both in Tracer (Rambaut and

Drummond 2003-2009; http://beast.bio.ed.ac.uk/Tracer) to check that effective sample size was greater than 200 for all parameters, and in MrBayes to see that the potential scale reduction factor measured 1.0 and the average standard deviation of split frequencies was <0.01. A burn-in


of 10% was determined to be sufficient and the first 2000 trees sampled for each run were

removed. All trees were rooted with Garuleum pinnatifidum. The strict consensus tree from each

parsimony analysis and the 50% majority rules consensus tree from each Bayesian analysis were

visualized in FigTree 1.3.1 (Rambaut 2006-2009; http://tree.bio.ed.ac.uk/software/figtree/).
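The burn-in count and convergence thresholds given above can be checked with a short calculation. The function names are hypothetical, and the tolerance used for a potential scale reduction factor of "1.0" is an assumption, since the text does not state one.

```python
def burnin_samples(generations: int, sample_every: int, burnin_fraction: float) -> int:
    """Number of sampled trees to discard per run for a given burn-in fraction."""
    sampled = generations // sample_every  # the starting tree at generation 0 is not counted
    return round(sampled * burnin_fraction)


def looks_converged(min_ess: float, psrf: float, asdsf: float) -> bool:
    """Apply the thresholds stated in the text: ESS > 200, PSRF of 1.0, ASDSF < 0.01."""
    # The 0.01 tolerance around a PSRF of 1.0 is an assumption of this sketch.
    return min_ess > 200 and abs(psrf - 1.0) < 0.01 and asdsf < 0.01
```

With 20 million generations sampled every 1000 generations, a 10% burn-in corresponds to the 2000 trees per run reported above.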

Table 1.3. 3CP and ITS partitions and models used for Bayesian analyses

3CP Partitions and Models
Region | Range | Model
atpI-atpH | 1-1260 | GTR+G
ndhF-rpl32r | 1261-2523 | GTR
petL-psbE | 2524-3866 | GTR
gaps | 3867-4046 | binary

ITS Partitions and Models
Region | Range | Model
ITS1 | 1-307 | K80+G
5.8S | 308-470 | SYM+I+G
ITS2 | 471-695 | SYM+G
gaps | 696-719 | binary
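One way the partitions and models in Table 1.3 could be expressed as a MrBayes command block is sketched below. The command block actually used is not given in the text, so this is an assumption throughout: the charset names, the partition name `by_region`, and the translation of each model name into `lset` and `prset` settings are illustrative choices.

```python
# Partition ranges and models for the 3CP matrix, copied from Table 1.3.
# Charset names are hypothetical.
CP_PARTITIONS = [
    ("atpI_atpH", 1, 1260, "GTR+G"),
    ("ndhF_rpl32", 1261, 2523, "GTR"),
    ("petL_psbE", 2524, 3866, "GTR"),
    ("gaps", 3867, 4046, "binary"),
]

# Assumed translation of model names into MrBayes lset and prset settings.
MODEL_SETTINGS = {
    "GTR": ("nst=6 rates=equal", None),
    "GTR+G": ("nst=6 rates=gamma", None),
    "K80+G": ("nst=2 rates=gamma", "statefreqpr=fixed(equal)"),
    "SYM+G": ("nst=6 rates=gamma", "statefreqpr=fixed(equal)"),
    "SYM+I+G": ("nst=6 rates=invgamma", "statefreqpr=fixed(equal)"),
    "binary": ("coding=variable", None),
}


def mrbayes_block(partitions) -> str:
    """Build a MrBayes block with one charset and one model per partition."""
    names = [p[0] for p in partitions]
    lines = ["begin mrbayes;"]
    for name, start, end, _ in partitions:
        lines.append(f"  charset {name} = {start}-{end};")
    lines.append(f"  partition by_region = {len(names)}: {', '.join(names)};")
    lines.append("  set partition=by_region;")
    for i, (_, _, _, model) in enumerate(partitions, start=1):
        lset, prset = MODEL_SETTINGS[model]
        lines.append(f"  lset applyto=({i}) {lset};")
        if prset:
            lines.append(f"  prset applyto=({i}) {prset};")
    # Run settings as described in the text: two runs, four chains, temp 0.2.
    lines.append("  mcmc nruns=2 nchains=4 temp=0.2 ngen=20000000 samplefreq=1000;")
    lines.append("end;")
    return "\n".join(lines)
```

Calling `mrbayes_block(CP_PARTITIONS)` returns the block as text; the ITS matrix could be handled the same way with its own ranges and models from Table 1.3.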

Network analysis in TCS v. 1.21 (Clement et al. 2000) was used to further analyze and

visualize variation in ITS. To focus the analysis on Calendula, all but one outgroup species, T.

microcarpa Harv., were removed from the TCS analysis. Also, because TCS treats all

ambiguous sites as missing, all heteromorphic sequences (with polymorphisms coded as

ambiguous) were removed since any information gained from the inclusion of these sequences

could be misleading. One individual of C. stellata, missing all of 5.8S, was also removed to

reduce the possible effects of missing data on the analysis. The default 95% connection limit was

used for the analysis of the reduced matrix (with 158 terminals) and gaps were treated as

missing.
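The removal of heteromorphic sequences before network analysis can be sketched as a simple filter. This is an illustration only: the function name `drop_heteromorphic` is hypothetical, and the filter also drops sequences containing N or ?, which goes slightly beyond the criterion described in the text.

```python
UNAMBIGUOUS = set("ACGT-")


def drop_heteromorphic(sequences: dict[str, str]) -> dict[str, str]:
    """Keep only sequences made up entirely of A, C, G, T, and gap characters."""
    return {
        name: seq
        for name, seq in sequences.items()
        if set(seq.upper()) <= UNAMBIGUOUS
    }
```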


3. Results

Chloroplast sequence data from 100 individuals (from all seven outgroups and from 93

Calendula) were included in analyses. All three chloroplast regions were included from all seven

outgroup taxa. Within Calendula, all three regions were included from 83 individuals, two out of three regions from five individuals, and one of three regions from an additional five individuals.

Partial sequences obtained from an additional three individuals were excluded from the analyses because missing data reduced resolution in resulting trees. Sequence data could not be obtained for any chloroplast region for four individuals. ITS sequence data were analyzed from a total of

90 individuals (all seven outgroups and 83 Calendula). For the remaining individuals available for sampling, either good sequence data could not be obtained for ITS, or clones were not obtained for heteromorphic individuals and polymorphisms in direct sequences from these individuals reduced resolution in the resulting trees. See Table 1.1 for complete information on which individuals were sampled for each region.

Statistics for each matrix (including aligned length, number of terminals (haplotypes by taxon), proportion of gaps and missing data, number of gap characters, variable sites and parsimony informative sites) are presented in Table 1.4.

Table 1.4. Matrix statistics by region

Region | Terminals with sequence data | Aligned length, excluding indel characters | Alignment gaps (% of matrix) | Missing data (% of matrix) | Variable sites/PICS, excluding indel characters | Indel characters | Variable sites/PICS, indel characters only
atpI-atpH | 59 | 1260 | 17.5 | 1.8 | 189/94 | 45 | 45/27
ndhF-rpl32 | 57 | 1263 | 30.9 | 4.0 | 190/88 | 95 | 95/45
petL-psbE | 54 | 1343 | 8.3 | 0.9 | 175/98 | 40 | 40/18
total CP | 60 | 3866 | 17.8 | 7.8 | 554/280 | 180 | 180/90
ITS 1 | 208 | 307 | 1.6 | 1.6 | 132/102 | 11 | 11/4
5.8S | 207 | 163 | 0 | 0 | 13/9 | 0 | 0/0
ITS 2 | 208 | 225 | 1.4 | 0.3 | 103/79 | 13 | 13/8
total ITS | 208 | 695 | 1.2 | 0.9 | 248/190 | 24 | 24/12

3.1. Phylogenetic analyses of the combined chloroplast matrix

Parsimony analysis of the 3CP matrix resulted in a single most parsimonious tree with a length (L) of 832 steps, a consistency index (CI) of 92, and a retention index (RI) of 95. The 50% majority rule tree produced by Bayesian analysis was identical in topology to the parsimony tree (Fig. 1.2). Calendula was strongly supported as monophyletic with a bootstrap support (BS)

of 100% and Bayesian posterior probability value (PP) of 1.0. Relationships of outgroup taxa to

each other were generally consistent with those obtained in other studies (e.g., Barker et al. 2009;

Nordenstam and Källersjö 2009). Ignoring the differences in the level of resolution, parsimony

analyses of each region separately (atpI-atpH, petL-psbE, ndhF-rpl32) were congruent with one

exception: C. pachysperma was in a different position in the petL-psbE analysis than it was in the other two analyses or in the 3CP tree (see branch marked with a star in Fig. 1.3), diverging after, rather than before, the C. tripterocarpa clade (see Fig. 1.2).
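For reference, the consistency and retention indices reported for the parsimony analyses follow the standard ensemble definitions, restated below. The symbols M, S, and G are introduced only for this restatement, and reading the reported values (e.g., CI of 92) as the indices multiplied by 100 is an assumption based on their magnitude.

```latex
\mathrm{CI} = \frac{M}{S}, \qquad \mathrm{RI} = \frac{G - S}{G - M}
```

Here M is the minimum number of steps the characters could require on any tree, S is the number of steps observed on the tree being evaluated (its length, L), and G is the maximum number of steps the characters could require on any tree.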

Within Calendula, chloroplast haplotypes from the four species endemic to Morocco, C. lanzae, C. eckerleinii, C. meuselii, and C. maroccana, formed a highly supported clade (BS

99/PP 1.0; henceforth the “Moroccan endemic clade”) sister to the rest of the genus (Fig. 1.2).

Two haplotypes were recovered from three individuals of C. tripterocarpa and were paraphyletic in the 3CP tree. One haplotype from a Moroccan sample of C. tripterocarpa grouped with the haplotype recovered from C. pachysperma (one of the two high-polyploid species from Israel), forming lineage L1 (Fig. 1.2; BS 58 / PP 1.0). The two other haplotypes of C. tripterocarpa

(recovered from one Moroccan and one Spanish individual) formed the C. tripterocarpa clade

Figure 1.2. Combined chloroplast cladogram showing identical parsimony (single most parsimonious tree) and Bayesian (50% majority rule tree) topologies. Above branches, numbers left of the slash are parsimony bootstrap support (BS), and plus signs right of the slash indicate Bayesian posterior probabilities (PP), with 0.9

Figure 1.3. Strict consensus parsimony topologies of individual cp regions. One instance of incongruence is marked with an asterisk. Shared haplotype codes are shown in parentheses. Note: with the exception of missing terminals, terminals for the individual analyses match those of the combined analysis (i.e., individuals from the same taxon sharing the same haplotype for a single region have not been further collapsed for the analysis of that region). Statistics for the analyses are as follows: atpI-atpH, 1 MP tree L=265, CI=92, RI=96; ndhF-rpl32, 2943 MP trees L=325, CI=92, RI=93 (strict consensus tree L=337, CI=88, RI=91); petL-psbE, 8 MP trees L=240, CI=94, RI=97 (strict consensus tree L=244, CI=93, RI=96). The name “C. maritima x fulgida” refers to putative hybrids between C. incana subsp. maritima and C. suffruticosa subsp. fulgida.

(also called lineage L2 in Fig. 1.2) together with four haplotypes recovered from nine individuals

of the highly polymorphic, pan-Mediterranean C. arvensis. The C. tripterocarpa clade and L1

together comprise the C. tripterocarpa grade. The three haplotypes of C. stellata, recovered from

five individuals (from three different Moroccan populations), were, like those of C. tripterocarpa, paraphyletic on the 3CP tree. One of the three belonged to a different lineage than

the other two. However, all three fell within a larger, well-supported clade (BS 94 / PP 1.0),

sister to the C. tripterocarpa clade, which also contained haplotypes recovered from individuals representing all other Calendula taxa sampled. This larger clade is henceforth referred to as the

C. stellata clade. The haplotype from C. palaestina, the other high-polyploid species, was in lineage L9 within the C. stellata clade, and was therefore not closely related to that of C.

pachysperma. A total of nine haplotypes were recovered from 24 individuals of C. arvensis, and

these sat in five major lineages (L2, L5, L7, L8, and L9 in Fig. 1.2). As described above, four

haplotypes from nine individuals fell into the C. tripterocarpa clade (L2). Five haplotypes from the remaining 19 individuals of C. arvensis fell within the C. stellata clade (within lineages L5,

L7, L8, and L9). All haplotypes of the perennial polyploids C. incana and C. suffruticosa, as well as all those of C. officinalis, fell within the C. stellata clade. Haplotypes of C. incana and C.

suffruticosa were distributed, like those of C. arvensis, over several lineages. Seven haplotypes

recovered from 13 individuals of C. incana fell into three lineages (L4, L5, and L10). Fourteen

haplotypes from 24 individuals of C. suffruticosa fell into five lineages (L3, L5, L8, L9, and

L10). A single haplotype was recovered from six individuals of C. officinalis, and this haplotype

was placed in lineage L5.

Several of the lineages within the C. stellata clade contained assemblages of haplotypes from

the polyploid species. For example, L5 includes haplotypes from C. incana, C. suffruticosa, and


C. arvensis. Lineage L9 is an assemblage of haplotypes from C. suffruticosa, C. arvensis, C.

palaestina, and also the diploid C. stellata. Lineage L10 is an assemblage of haplotypes from C.

incana and C. suffruticosa. In some cases, even haplotypes of subspecies of C. incana and C. suffruticosa repeat across lineages, with, for example, haplotypes of C. suffruticosa subsp. fulgida appearing in lineages L5, L8, and L10, and those of C. incana subsp. microphylla appearing in L4 and L10.

Haplotype lineages, as described above, did not correspond well to taxonomy, nor did they correspond to geography or morphology. For example, individuals of C. arvensis that shared haplotypes or had closely related haplotypes shared no obvious morphological traits (e.g., concolorous versus bicolorous capitula, presence or absence of rostrate achenes, presence or absence of broad wings on rostrate achenes) nor did they necessarily come from the same geographical region. Haplotypes of C. suffruticosa and C. incana were not divergent from each other. In some cases, individuals from both species shared one haplotype. In others, several haplotypes of both species were interdigitated in the same lineage.

3.2. Phylogenetic analyses of ITS

Parsimony analysis of the ITS matrix reached the “max trees” limit of 1,000,000. The strict consensus tree (L=884, CI=40, RI=85; depicted in gray in Fig. 1.4) was somewhat less resolved but otherwise topologically consistent with the Bayesian topology (indicated in black in Fig.

1.4). In both analyses, resolution was poor and many clades lacked support (bootstrap support

(BS) < 50, posterior probability (PP) < 0.9). However, as in the chloroplast analysis, Calendula

Figure 1.4. ITS cladogram showing the Bayesian topology in black and differences in the parsimony topology overlaid in gray. Branch support conventions are as described in Fig. 1.1. Triangles indicate that multiple haplotypes for a single individual, or multiple interdigitated haplotypes for multiple individuals have been collapsed for ease of viewing and reference. Numbers after taxon names are individual IDs. Letter and number codes immediately after some individuals indicate a unique clone recovered for that individual. As in Fig. 1, the six putative parental species are highlighted in orange (C. stellata), blue (the Moroccan endemics), and pink (C. tripterocarpa). Clades or grades relevant to the discussion are named at right. The color bar to the right of terminal names, as well as highlighting of some individual IDs, indicates country of origin (see key at left). The name “C. maritima x fulgida” refers to putative hybrids between C. incana subsp. maritima and C. suffruticosa subsp. fulgida. Taxa for which sequences were downloaded from GenBank from previously published studies are indicated with a small asterisk. The larger asterisk pointing to a branch indicates a clade containing the perennial polyploid clade plus C. stellata 10 and 18 and related ITS copies.

was strongly supported as monophyletic (BS 100/PP 1.0), and outgroup relationships were

generally consistent with previous studies. The Moroccan endemics again formed a strongly

supported clade (BS 99/PP 1.0; see Fig. 1.4) sister to the rest of the genus. In the ITS analysis,

unlike in the chloroplast analysis, all ITS copies recovered from species reported to have 2n = 32

chromosomes (C. suffruticosa, C. incana, and C. officinalis) were monophyletic (“Perennial

polyploid clade” in Fig. 1.4) and nested within an “Annual grade” that included all of the various

annual lineages of C. stellata, C. tripterocarpa, C. arvensis, C. pachysperma, and C. palaestina.

The Perennial polyploid clade was resolved in both parsimony and Bayesian analyses, but supported only in the Bayesian analysis.

Within the annual grade, a clade containing all ITS copies from both of the high-polyploid

species (C. pachysperma and C. palaestina), two individuals of C. stellata, and a single

individual of C. arvensis (C. arvensis 234), was recovered with moderate (BS 74) to strong (PP

1.0) support in parsimony and Bayesian analyses (henceforth the “High polyploid clade”). Most

ITS copies from C. arvensis (see “C. arvensis p. p. (20 individuals)” in Fig. 1.4) either formed a

clade with copies from C. tripterocarpa (Bayesian; unsupported) that was in a polytomy with the

High polyploid clade, one other C. arvensis clone (C. arvensis 149 C3f in Fig. 1.4), and a clade

(marked with a star in Fig. 1.4) comprising the rest of the genus, or they were in a polytomy with

copies of C. tripterocarpa, C. arvensis 149 C3f, and the High-polyploid clade in a lineage that

was sister to the rest of the genus (parsimony; unsupported). Calendula arvensis 234, in the

High-polyploid clade, was the only individual that did not have any copies that fell into “C.

arvensis p. p.” as described above. While most copies from two other individuals of C. arvensis

(C. arvensis 5 and 149) were part of the C. arvensis p. p. assemblage, two clones from C. arvensis

5 and one clone from C. arvensis 149 were part of a grade of early diverging lineages in the


clade marked with a star (Fig. 1.4) together with several lineages of copies from C. stellata individuals 10 and 18. Within the Perennial polyploid clade, all ITS copies from C. officinalis

belonged to a single lineage, and the same relationship observed among chloroplast haplotypes

of C. officinalis to those of two individuals of C. suffruticosa subsp. fulgida and a third C.

suffruticosa (not identified to subspecies because it was not in fruit) was seen between ITS copies of these taxa with weak (BS 65) to strong (PP 1.0) support in parsimony and Bayesian analyses. However, one of several clones recovered from an Algerian taxon, C. suffruticosa subsp. boissieri, also appeared in this clade. A larger clade containing the C. officinalis clade and all ITS copies from Sicilian and Tunisian individuals was resolved in both analyses but not supported. This clade was in a polytomy with two lineages of copies from Spanish and Algerian taxa, and the clade containing all of these (copies from Sicilian, Tunisian, Spanish, and Algerian taxa) was itself in a polytomy with copies from individuals mostly from Portugal (including the island of Madeira) but also from Spain and Libya.

3.3. Network analysis of ITS

The reduced ITS matrix of 142 unique sequences (see Methods) was analyzed in TCS. The

95% connection limit was 11 steps. At this connection limit, the outgroup species, T.

microcarpa, was not connected to a network, and neither were copies from the four Moroccan

endemic species, which formed their own network (not shown). Within the major Calendula

network (including copies from all other taxa in the analysis; Fig. 1.5), the clusters recovered by

network analysis in TCS of the ITS dataset were consistent with clades recovered by parsimony

and Bayesian analyses of ITS copies (Fig. 1.4). It is also evident, both from the small numbers of


Figure 1.5. Major Calendula network from analysis in TCS of reduced ITS dataset. The Moroccan endemics (not shown) formed their own network not connected to this one. Each box represents one unique ITS copy. Codes for copies by individual are listed in each box. Each line between copies represents one change. Each black tick on a line represents one additional change between connected copies. Each small circle is a junction point between three or more copies and also represents one additional change. The dotted line divides annuals from perennials (including C. officinalis). Clusters highlighted in gray correspond to major clades or grades recovered in the phylogenetic analyses (Fig. 1.4).

30

31

characters separating the clusters and from the lack of reticulation between clusters, that lack of

support for major lineages in the phylogenetic analysis is due more to low variation than to

character conflict (homoplasy). Although sequences from annual species were paraphyletic in phylogenetic analyses (the annual grade in Fig. 1.4) and the perennial polyploid clade was unsupported in the parsimony analyses, network analysis in TCS recovered distinct annual and perennial clusters without reticulation between them. Within the annual cluster, three distinct

clusters corresponding to clades or grades in the ITS tree (Fig. 1.4) were present in the ITS network (Fig. 1.5): the C. arvensis p. p. + C. tripterocarpa clade; the high polyploid clade; and

the grade of C. stellata and C. arvensis clones. The high polyploid cluster was especially well-

differentiated from other clusters and included (as did the phylogenetic analysis) copies from C. stellata 169 and C. arvensis 234. Copies from C. stellata 10 and 18, as well as closely related copies from C. arvensis, formed a cluster linking the Arvensis and Perennial clusters. Within the

Perennial cluster, there were Sicilian/Tunisian, Portuguese/Spanish, and Spanish/Algerian clusters (again, corresponding to the clades in Fig. 1.4). Within the Sicilian/Tunisian cluster was a distinct ITS copy shared by C. incana subsp. maritima, endemic to the north-west coast of

Sicily, and two Tunisian taxa, C. suffruticosa subsp. tunetana and C. suffruticosa subsp. suffruticosa. Reticulation was present within all major clusters, which corresponded to polytomies in the ITS phylogenetic trees.
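The effect of a connection limit on which copies end up in the same network can be illustrated with a minimal sketch. It is not the statistical parsimony procedure implemented in TCS: it simply collapses identical aligned sequences, counts pairwise differences, and joins any two unique sequences that differ by no more than a chosen number of steps. The function names, the toy sequences, and the treatment of every differing character (including gaps) as one step are assumptions made for illustration only.

```python
from itertools import combinations


def hamming(a, b):
    """Count differing positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def collapse_unique(seqs):
    """Map each distinct sequence to the list of sample labels that carry it."""
    unique = {}
    for label, seq in seqs.items():
        unique.setdefault(seq, []).append(label)
    return unique


def networks(seqs, limit):
    """Group unique sequences into networks by single linkage at `limit` steps.

    This is a simplification (an assumption of this sketch), not the
    statistical parsimony algorithm used by TCS.
    """
    uniq = list(collapse_unique(seqs))
    parent = list(range(len(uniq)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(uniq)), 2):
        if hamming(uniq[i], uniq[j]) <= limit:
            parent[find(i)] = find(j)

    groups = {}
    for i, seq in enumerate(uniq):
        groups.setdefault(find(i), []).append(seq)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical toy alignment: two identical copies, one close copy, one distant copy.
toy = {"a1": "AAAAAAAA", "a2": "AAAAAAAA", "b1": "AAAAAATT", "m1": "TTTTTTAA"}
for group in networks(toy, limit=2):
    print(group)
```

With the toy data, the distant copy remains in its own network at a limit of two steps, which mirrors in miniature how a divergent set of copies can fail to connect to the main network at a fixed connection limit.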

4. Discussion

4.1. Relationships of outgroups sampled from Calenduleae and the monophyly of Calendula

The monophyly of Calendula has never been in doubt and was strongly supported by both

chloroplast and ITS analyses. The rest of the genera in Calenduleae, in contrast, have long suffered from problems of paraphyly, with Osteospermum, Tripteris, and Chrysanthemoides all

poorly differentiated from each other. In this study, as was found in Barker et al. (2009) and

Nordenstam and Källersjö (2009), Osteospermum was paraphyletic, with O. barberiae and O.

fruticosum strongly supported as belonging to Dimorphotheca, while O. rigidum was sister to T. clandestina. Both of the former were part of Norlindh’s (1943) Osteospermum section Blaxium, which he had erected for these and several related species previously classified in

Dimorphotheca, but the section was returned to Dimorphotheca by Nordenstam (1994). Only

one species of Tripteris (T. clandestina) was included in the chloroplast analysis, but the two

species included in the ITS analysis (T. clandestina and T. microcarpa) were paraphyletic (in

agreement with a paraphyletic Tripteris in Nordenstam and Källersjö 2009). These problems

with generic delimitations in the outgroups have made it difficult to establish the nearest relative

of Calendula, although several studies (Norlindh 1943, 1946; Nordenstam and Källersjö 2009)

have proposed a closer relationship of Calendula to some sections of Osteospermum and

Tripteris than to other genera in the tribe. This was supported in the present study by the sister

relationship of Calendula to O. rigidum + T. clandestina in the 3CP analysis (Fig. 1.2) or to T.

microcarpa in the ITS analysis (Fig. 1.4). Norlindh (1946) suspected that Tripteris vaillantii

Decne. was the species in tribe Calenduleae most closely related to Calendula, and also

represented the biogeographical link between the Mediterranean Calendula and the South

African genera. The northernmost part of the range of T. vaillantii overlaps the southeasternmost part of the range of Calendula. Unfortunately, this species could not be sampled for this study, but Norlindh (1943) placed both his O. vaillantii (Decne.) Norl. (≡T. vaillantii) and O. microcarpum (≡T. microcarpa) in the same section of Osteospermum (section Trifenestrata of

subgenus Tripteris), so the sister relationship between Calendula and T. microcarpa recovered in

this study (albeit without support) was in line with his hypothesis. Clearly, much more extensive

sampling of outgroup taxa, including T. vaillantii, is needed to determine the nearest relative of

Calendula, and provide further insight into the biogeography of the tribe.

Although Calendula sequences were monophyletic in these analyses, sequences from species

within Calendula were not. This does not necessarily mean that the species, as currently

delimited, do not represent evolutionary lineages. The phylogenetic trees and the network

presented here show relationships of genes, not species. That the 3CP and ITS analyses of

Calendula species were largely incongruent with each other (see Figs. 1.2, 1.4) emphasizes this point.

The relatively short branches between sequences from most species in the phylogenetic trees

(Fig. 1.6; particularly in the ITS tree, Fig. 1.6b) also suggest recent divergence, making

incomplete lineage sorting a possible explanation for paraphyly in these analyses. Finally, the

hypothesized hybrid origins of most species would be expected to produce paraphyly in gene

trees since the progeny of hybrid crosses may share genetic material with two (or more)

divergent species. Nevertheless, it is possible to draw some conclusions about species

relationships and polyploid origins from the contrasting histories of plastid and nuclear ribosomal genes recovered in this study.

4.2. Divergence of Moroccan endemic species from the rest of the genus

Figure 1.6. Simplified Bayesian trees showing branch lengths, with outgroup branches shown in gray and ingroup branches shown in black, for (A) 3CP and (B) ITS.

The Moroccan endemic species (C. eckerleinii, C. maroccana, C. meuselii, and C. lanzae) were genetically divergent from the remainder of the genus in both 3CP and ITS analyses (Figs. 1.2, 1.4), as well as in analyses of each chloroplast region separately (Fig. 1.3). The length of the branches that separated the Moroccan endemics from the rest of the genus in phylogenetic analyses of both 3CP and ITS datasets (Fig. 1.6, but particularly of the 3CP dataset; Fig. 1.6a), as well as the fact that the Moroccan endemic network could not be connected to the network of other taxa at the TCS 95% connection limit (Fig. 1.5), further emphasized the divergence of

these species. If the hypothesis is accepted that one or more of these species hybridized with C.

stellata to form the perennial polyploid species (C. incana and C. suffruticosa) and C. officinalis,

the following interpretations are possible. The first is that the Moroccan endemics played no part

in forming any of these species, but rather diverged early and continued on their own

evolutionary trajectory away from that of the rest of the genus. A second interpretation, however,

is that these species were involved, as previously hypothesized, but that the 3CP and ITS data

could not provide evidence of their involvement. Since the chloroplast is assumed to be

maternally inherited in Calendula, the 3CP analysis provided evidence that no Moroccan

endemic species was ever the maternal parent in any hypothesized cross that may have led to the

foundation of polyploid lineages, but it could not tell us whether or not any of them served as the

paternal parent in any cross. If one or more did serve as paternal parents, we might have expected to see evidence of this in the ITS analyses, but biased concerted evolution of ITS copies toward the maternal parent could have erased any evidence of paternal parentage. Biased conversion of ITS copies toward one parent or the other has been shown in plants (Wendel et al.

1995; Joly et al. 2004). Evidence from a low-copy nuclear gene (A39; see Chapter 2), which showed sequences of the Moroccan endemics interdigitated with other taxa, as would be expected if the genomes of these diploid species were present in some of the polyploids, makes this second explanation of the 3CP and ITS results more likely.

Given the much narrower divergence of chloroplast haplotypes in the rest of the genus, the sharing of haplotypes across other species, and at least some overlap in geographical distribution of the Moroccan endemics with C. stellata, C. arvensis, and C. tripterocarpa, it seems strange that the Moroccan endemics have never contributed their plastids to the mix. Also, given evidence of multiple origins of the perennial polyploids (see below), any hypothesis that crosses may have occurred repeatedly between C. stellata and one or more Moroccan endemic species but always in one direction requires further justification. Why would they never have been maternal parents? Several researchers (Anderson et al. 2008; Tiffin et al. 2001) have found that paternal effects on seed germination (e.g., reduced seed germination in the maternal environment when fertilization is by pollen from outside that environment) can explain asymmetrical gene flow between populations of related taxa, and perhaps something similar is happening in crosses between the Moroccan endemics and other taxa in Calendula. It could be that the seeds of

Moroccan endemic species do not develop or do not germinate as well when pollination is by other species, but that these other species have fewer problems being pollinated by the Moroccan endemics. Heyn and Joel (1983) showed differences in seed set and viability depending on directionality of crosses between several annual Calendula species (C. stellata, C. tripterocarpa,

C. arvensis, C. pachysperma, and C. palaestina). Performing controlled, reciprocal crosses between the Moroccan endemics and other Calendula species (particularly C. stellata) would help determine whether there is some explanation (beyond coincidence) for the lack of exchange of chloroplasts between the Moroccan endemics and other taxa. In any case, this lack of exchange could explain why chloroplast sequences from these species have diverged so widely from those of the rest of the genus.

There was no evidence of any relationship of C. stellata to C. lanzae or that a putative ancestor of C. stellata (derived from C. lanzae) contributed its genome to C. tripterocarpa.

However, 3CP and ITS data may have been insufficient to reveal such a contribution (just as they may have been insufficient to reveal contribution of any Moroccan endemic species to the genome of the perennial polyploids).

4.3. Chloroplast polymorphism in C. stellata and C. tripterocarpa

Chloroplast capture (introgression of chloroplast DNA; Rieseberg and Soltis 1991) is often invoked as an explanation for paraphyly in plastid trees or for incongruence of plastid trees with nuclear trees (Tsitrone et al. 2003), especially if plastid clades exhibit a geographic signal (i.e., plastids from two or more species from one region or population are more closely related to each other than to those of conspecifics from another region or population). This becomes all the more

likely if there is evidence of hybridization and gene flow between species, as there is in

Calendula. Chloroplast capture may explain some relationships in the 3CP tree (e.g., one haplotype shared by Sicilian and southern Italian individuals of two species in lineage L8; Fig.

1.3). However, it does not adequately explain several clades in which different species, or even different individuals of the same species, from different countries, often with wide geographical separation, share identical or closely related haplotypes, nor does it explain why haplotypes from different individuals of these same species, from the same countries, belong to different clades

(e.g., lineages L2, L5, L9, L10; Fig. 1.3). For example, lineage L2 grouped nine individuals of C. arvensis from Portugal, Spain, Morocco, Italy, Greece, and Lebanon with C. tripterocarpa from

Morocco and Spain. Lineage L5 grouped three individuals of C. suffruticosa from Spain and

Sicily, three individuals of C. incana from Spain, seven individuals of C. arvensis from Spain,

Morocco, Italy, Turkey, and Lebanon, and six individuals of C. officinalis (either cultivated or assumed escaped from cultivation in Algeria, the former Soviet Union, Kazakhstan, Spain, and

Portugal). In lineage L9, individuals of three different species from four different countries spanning the Mediterranean (Portugal, France, Italy, and Israel) all share one identical haplotype.

Shared geography is no better explanation for the formation of these clades than shared taxonomy. Another explanation is that the clades in the 3CP tree represent relictual plastid polymorphism in the diploid progenitors of polyploid taxa, multiple formations of polyploids from crosses between these progenitors, and the maintenance of these polymorphisms within multiple polyploid lineages.

A likely consequence of chloroplast polymorphism in progenitor populations is that polyploid lineages formed from within these populations or via hybridization with individuals from these populations will reflect this polymorphic history. If a polyploid taxon has a single

origin, then it would be expected to group, in a phylogenetic analysis of plastid data, with the

most closely related haplotype recovered from the putative maternal progenitor. If there have been multiple, and possibly reciprocal, origins of a polyploid taxon from polymorphic diploid progenitors, then divergent haplotypes recovered from different individuals of the polyploid taxon would be expected to appear in different clades containing the divergent haplotypes of the maternal progenitor(s). If this story is further complicated by multiple formations of more than one polyploid taxon by polymorphic maternal progenitors, then a phylogenetic analysis would be expected to show several lineages, each grouping some of the same polyploid species with the most closely related haplotype recovered from the maternal progenitor taxon. Lineages that lack a haplotype of the maternal progenitor (i.e., contain only haplotypes sampled from polyploid individuals) would also be present if the most closely related haplotype from the maternal progenitor had not been sampled. However, these lineages would still be closely related to lineages that did contain a haplotype from the maternal progenitor, and, if the maternal progenitor were monophyletic when analyzed without any of its polyploid derivatives, then all the lineages formed by the maternal progenitor would also be monophyletic. If the maternal progenitor were paraphyletic, then so might be the lineages of its progeny. These patterns are congruent with those observed in the 3CP tree (Fig. 1.2). Implications of these patterns for the origins of polyploid taxa in Calendula, coupled with ITS evidence, are described in more detail below, and a schematic showing modified hypotheses of origin is presented in Fig. 1.7.
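The expectations laid out above can be made concrete with a small tabulation sketch. Given one (taxon, lineage) pair per sampled individual, it reports for each polyploid taxon how many plastid lineages its haplotypes occupy, which of those lineages also contain a haplotype from a putative maternal progenitor, and which contain polyploid haplotypes only (as expected when the closest progenitor haplotype was not sampled). The taxon labels, lineage labels, and function names are hypothetical, and occupying several lineages is only suggestive of multiple origins, since later introgression could produce a similar pattern.

```python
from collections import defaultdict


def lineages_by_taxon(assignments):
    """Collect the set of plastid lineages in which each taxon was recovered.

    `assignments` is an iterable of (taxon, lineage) pairs, one per individual.
    """
    out = defaultdict(set)
    for taxon, lineage in assignments:
        out[taxon].add(lineage)
    return dict(out)


def summarize(assignments, progenitors):
    """Summarize lineage membership of each non-progenitor (polyploid) taxon."""
    by_taxon = lineages_by_taxon(assignments)

    # Lineages that contain at least one haplotype from a putative progenitor.
    progenitor_lineages = set()
    for p in progenitors:
        progenitor_lineages |= by_taxon.get(p, set())

    summary = {}
    for taxon, lins in by_taxon.items():
        if taxon in progenitors:
            continue
        summary[taxon] = {
            "n_lineages": len(lins),
            "with_progenitor": sorted(lins & progenitor_lineages),
            "progenitor_unsampled": sorted(lins - progenitor_lineages),
        }
    return summary


# Hypothetical data: polyploids P1 and P2, diploid progenitor D, lineages A-C.
toy = [("P1", "A"), ("P1", "B"), ("P1", "C"),
       ("P2", "A"), ("D", "A"), ("D", "B")]
for taxon, row in summarize(toy, progenitors={"D"}).items():
    print(taxon, row)
```

In the toy data, P1 occupies three lineages, two shared with the progenitor and one without a sampled progenitor haplotype, while P2 occupies a single lineage, the pattern expected under a single origin.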

Figure 1.7. Modified hypotheses of species origins in Calendula based on 3CP and ITS analyses. Conventions for depicting lineages, capitula size and color, hybridization, and polyploidization events are as described in Fig. 1.1 with the following additions: the cladogram at the top of the image depicts relationships between the Moroccan endemic lineage, C. stellata, and C. tripterocarpa (with the latter two more closely related to each other than either is to the Moroccan endemics); double arrows indicate multiple origins; and male and female symbols indicate the sex of progenitors in hypothesized crosses.

4.4. Origins of the annual polyploids

When haplotypes of putative progenitor species (C. stellata, C. tripterocarpa, and the

Moroccan endemics C. eckerleinii, C. lanzae, C. maroccana, and C. meuselii) were analyzed alone, those of C. stellata were monophyletic within a paraphyletic C. tripterocarpa (due to the position of C. tripterocarpa 77193), and C. stellata + C. tripterocarpa were sister to the

Moroccan endemic clade (data not shown, but positions of these haplotypes were the same, relative to each other, as they are in Fig. 1.2). When the complete dataset was analyzed, all

polyploids either fell into one large clade with the C. stellata haplotypes (the C. stellata clade;

Fig. 1.2) or into two small, paraphyletic clades with the C. tripterocarpa haplotypes (the C.

tripterocarpa clade and grade; Fig. 1.2). The Moroccan endemics, as already discussed, could

not have played any role as maternal progenitors. Therefore, according to the proposed

hypotheses of origin (Fig. 1.1), C. stellata would have to have been the maternal progenitor of all

the perennial polyploids (C. incana and C. suffruticosa) and C. officinalis (see below) and may

also have been the maternal progenitor of some or all of the annual polyploids (C. arvensis, C.

pachysperma, and C. palaestina). Calendula tripterocarpa could also have been the maternal

progenitor of C. arvensis and possibly also of C. pachysperma and C. palaestina.

The fact that C. arvensis chloroplast haplotypes fell into five lineages, four within the C.

stellata clade and one in the C. tripterocarpa clade, supports a hypothesis of multiple, reciprocal

origins of C. arvensis from C. stellata and C. tripterocarpa, as many as four times with C.

stellata as the maternal parent and at least once with C. tripterocarpa as the maternal parent (Fig.

1.7). The ITS tree corroborated the reciprocal origins of C. arvensis, with most ITS copies from

C. arvensis grouping with C. tripterocarpa but some copies from C. arvensis grouping with C.

stellata. All possible scenarios for conversion of ITS copies were present, with conversion

toward C. stellata in C. arvensis 234, conversion toward C. tripterocarpa in all individuals of C.

arvensis except 5, 149, and 234, and retention of copies from both parents in C. arvensis 5 and

149.
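The three ITS outcomes described here follow a simple rule that can be written out as a sketch: for each individual, record which parental cluster(s) its cloned ITS copies fell into, then label the individual accordingly. The function name, the cluster labels, and the individual labeled "other" are illustrative assumptions. The entries for individuals 5, 149, and 234 simply restate the outcomes reported above. Because only a limited number of clones is sampled per individual, a rare retained copy could be missed, so a label of conversion is provisional.

```python
def classify_its(clusters_found, parents=("stellata", "tripterocarpa")):
    """Label an individual by the parental ITS clusters its clones fell into."""
    found = set(clusters_found) & set(parents)
    if len(found) == 2:
        return "both retained"
    if len(found) == 1:
        return "converted toward C. " + next(iter(found))
    return "unassigned"


# Outcomes for C. arvensis as reported in the text; "other" is a hypothetical
# stand-in for the remaining individuals, whose copies grouped with C. tripterocarpa.
individuals = {
    "234": {"stellata"},
    "5": {"stellata", "tripterocarpa"},
    "149": {"stellata", "tripterocarpa"},
    "other": {"tripterocarpa"},
}
for label, clusters in individuals.items():
    print(label, classify_its(clusters))
```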

The high polyploid species showed evidence of two origins, although it was not as simple to

determine parentage. The 3CP haplotype from C. pachysperma was sister to that of C.

tripterocarpa 77193 (lineage L1, Fig. 1.2), while C. palaestina shared a haplotype with several individuals of C. arvensis and C. suffruticosa that was closely related to two haplotypes of C.

stellata (lineage L5, Fig. 1.2). These results support separate origins for the two species, with C.

tripterocarpa as the maternal parent for C. pachysperma, and C. arvensis as the maternal parent

for C. palaestina. In the ITS analysis, both C. pachysperma and C. palaestina, together with C.

stellata 169 and 321 and C. arvensis 234, were part of the same lineage (one of the few lineages

in the ITS tree with support). Based on ITS results, C. stellata, C. arvensis, or both could have

contributed most recently to the genome of both high polyploids. Discerning which of C.

stellata, C. arvensis, or C. tripterocarpa has contributed most, or most recently, to the high

polyploid species is complicated by the presence of both the C. stellata and C. tripterocarpa

genomes in C. arvensis. 3CP and ITS results indicate that C. palaestina could be an

autopolyploid (or a narrow allopolyploid) resulting from hybridization between narrowly

diverged lineages of C. arvensis and that C. pachysperma is a wider allopolyploid resulting from hybridization between C. arvensis and C. tripterocarpa (Fig. 1.7). However, it could be that the

C. pachysperma chloroplast haplotype is actually most closely related to a C. arvensis haplotype

which is also closely related to the haplotype from C. tripterocarpa 77193, but that this

hypothetical haplotype was not sampled for the analysis. Also, since only one representative of

each high polyploid species was included in the analyses, it was not possible to determine

whether each species had a single or multiple origin(s).

4.5. Origins of the perennial polyploids

Based largely on characters describing growth habit (e.g., branching patterns and length of

vegetative and floral shoots), Ohle (1974) separated the polyploid perennials into two lineages,

C. incana and C. suffruticosa. He proposed that hybridization between the annual C. stellata and

the perennial C. meuselii, followed by genome duplication, may have produced the C. incana lineage, and later proposed (1975a, b) that a similar event between C. stellata and C. eckerleinii may have produced the C. suffruticosa lineage. As discussed above, chloroplast and ITS analyses could not establish whether or not any of the Moroccan endemic species were diploid progenitors of the perennial polyploid taxa (and this doubt is reflected in Fig. 1.7), but if they were (and low-copy nuclear data are consistent with this hypothesis; see Chapter 2), then they must have been the paternal parents and C. stellata must have been the maternal parent in all such crosses.

C. suffruticosa and C. incana haplotypes belonged to six major chloroplast lineages (L3, L4,

L5, L8, L9, and L10; Fig. 1.2), which suggests multiple origins of these taxa, likely from a maternal progenitor with wide chloroplast polymorphism (much like the scenario proposed for

C. arvensis). All of these lineages fell within the C. stellata clade, which is consistent with C. stellata being the maternal parent of these species. Phylogenetic analyses of ITS copies placed C.

incana and C. suffruticosa, as well as C. officinalis, in a single, monophyletic clade (without

parsimony support but with moderate support in Bayesian analysis; Fig. 1.4), and the network analysis also supported the monophyly of taxa with 2n = 32 chromosomes. Most closely related

to sequences from these species were ITS copies from C. stellata 10 and 18 (as well as copies

from C. arvensis most likely derived from C. stellata), a result also consistent with an origin of

the perennial polyploids from C. stellata. There was no evidence in either 3CP or ITS analyses

for C. suffruticosa and C. incana representing two distinct lineages. Chloroplast haplotypes and

ITS copies shared between the two species, as well as evidence of reticulation between them, are

more consistent with recognition of a single species, C. suffruticosa (including C. incana), rather

than two separate species.

Even if C. incana is not recognized as distinct from C. suffruticosa, some subspecies or groups of subspecies of C. suffruticosa s.l. may still, with better sampling and more variable molecular markers, prove to represent distinct lineages. Unlike in the 3CP

tree, there was some geographical signal in ITS that corresponded to clades (or grades) in the

phylogenetic analysis (Fig. 1.4) and to clusters in the network analysis (Fig. 1.5). All Portuguese

individuals (representing C. incana subsp. algarbiensis, C. incana subsp. maderensis, C. incana

subsp. microphylla, and C. suffruticosa subsp. lusitanica) clustered together with three

individuals from Spain (C. incana subsp. incana and C. incana subsp. microphylla) and two

from Libya (C. suffruticosa not identified to subspecies). The Spanish subspecies C. suffruticosa

subsp. greuteri and C. suffruticosa subsp. carbonellii clustered with a Spanish individual of C.

incana subsp. incana and the Algerian subspecies C. suffruticosa subsp. boissieri. All Sicilian

and Tunisian individuals (C. suffruticosa subsp. fulgida, C. incana subsp. maritima, C.

suffruticosa var. tunetana, C. suffruticosa subsp. suffruticosa) also formed a cluster (with the

inclusion of a single clone from the Algerian C. suffruticosa subsp. boissieri). Calendula incana

subsp. maritima, endemic to Trapani in northwestern Sicily and to small islands off the

northwestern coast, may be the most genetically uniform taxon sampled. All four individuals

sampled shared a single chloroplast haplotype (in lineage L10; Fig. 1.2) and identical ITS copies

(“Maritima clade”; Fig. 1.4). However, this taxon has been known to hybridize with C. suffruticosa subsp. fulgida in one mixed population (Plume et al. 2013). The Tunisian C.

suffruticosa var. tunetana appeared to be the result of hybridization between these Sicilian taxa

since it retained ITS copies of both taxa, and another Tunisian taxon, C. suffruticosa subsp.

suffruticosa, also shared an ITS copy with C. incana subsp. maritima (see shared network

haplotypes in Fig. 1.5). Considerations of both geography and genetic isolation, in addition to

new morphological, molecular, and chemical assessments, may prove useful in creating new

delimitations for the perennial polyploid taxa in the future.

4.6. Origin of C. officinalis

Calendula officinalis, cultivated for centuries for ornamental, medicinal, and culinary use,

differs from C. incana and C. suffruticosa in that it is an annual or short-lived perennial rather

than strictly a perennial. Extensive breeding has produced cultivars with capitula much larger than those of other taxa, increased number of disc and ray florets, and “doubled” capitula with

multiseriate rays compared to the uni- to biseriate rays found in other taxa (and also in some

cultivars of C. officinalis). There is a tendency for “doubled” individuals to become proliferous;

that is, instead of fruits developing from the ray florets, new floral shoots develop that terminate

in smaller capitula. Calendula officinalis may have concolorous capitula with yellow to orange

rays and discs or bicolorous capitula with yellow to orange rays (sometimes tinted with violet)

and red-brown to violet discs. Most taxa in Calendula (including C. suffruticosa and C. incana)

have yellow, concolorous capitula, but C. suffruticosa subsp. fulgida and the closely related

Tunisian taxon C. suffruticosa subsp. suffruticosa have orange rays and yellow to orange discs.

Ohle (1974) saw affinities between C. officinalis and other taxa, such as the orange capitula of

Sicilian (C. suffruticosa subsp. fulgida) and North African (C. suffruticosa subsp. suffruticosa)

taxa, the fruit morphology of both Iberian (C. incana subsp. maderensis) and Sicilian (C. incana

subsp. maritima) taxa, as well as the Moroccan endemic C. meuselii, and the violet disc florets

in C. stellata. He was not certain whether C. officinalis, C. incana and C. suffruticosa shared a

common ancestor or if C. officinalis was the product of later hybridization between C. incana and C. suffruticosa.

Chloroplast haplotypes from all sampled individuals of C. officinalis were identical and ITS copies from all individuals belonged to a single supported clade (the “Officinalis clade”; Fig.

1.4). Though morphological divergence across sampled individuals was relatively high (sampled individuals were bicolorous or concolorous, yellow, saffron, or orange, single, double, or proliferous), molecular divergence between individuals was lacking (3CP) or quite low (ITS).

These data fit well with a scenario of a single introduction of C. officinalis into cultivation, followed by centuries of cultivation and selection for rare or recessive traits or mutations to achieve maximum morphological diversity.

The C. officinalis chloroplast haplotype was also recovered in two individuals of C. suffruticosa subsp. fulgida (121 and 123) and a third C. suffruticosa (133), all collected in Sicily.

These individuals also had some ITS copies that were identical to those from C. officinalis. It would be tempting to conclude that C. officinalis is therefore most closely related to C. suffruticosa subsp. fulgida and is Sicilian in origin. However, the chloroplast haplotypes from all other individuals of C. suffruticosa subsp. fulgida were not closely related to those from C. officinalis, and no other C. suffruticosa subsp. fulgida individual had ITS copies identical to those in C. officinalis. Given evidence of multiple origins of the perennial polyploids (including

C. suffruticosa subsp. fulgida), it could be that these individuals do represent a native lineage to which C. officinalis also belongs, but the possibility that they might be hybrids between C. officinalis (which is as common in gardens in Sicily as it is everywhere) and Sicilian individuals of C. suffruticosa cannot be ruled out. ITS copies of all other individuals of C. suffruticosa subsp. fulgida, all other Sicilian and Tunisian perennials, and C. officinalis did form a clade (Fig.

1.4) or a distinct cluster (Fig. 1.5), but no single taxon was resolved as sister to the C. officinalis

clade, and this Sicilian/Tunisian clade is without support in the phylogenetic analysis. Still, lack

of reticulation between the Sicilian/Tunisian cluster and other clusters in the network analysis (Fig. 1.5) adds

weight to the possibility of a Sicilian origin of C. officinalis and a close relationship of this

species to C. suffruticosa subsp. fulgida. Further investigation, including better taxon sampling and more variable molecular markers, will be necessary to solve this mystery.

REFERENCES

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19: 716-723.

Arnold, M. L. 2006. Evolution through genetic exchange. Oxford; New York: Oxford University Press.

Barker, N. P., S. Howis, B. Nordenstam, M. Källersjö, P. Eldenäs, C. Griffioen, and H. P. Linder. 2009. Nuclear and chloroplast DNA-based phylogenies of Chrysanthemoides Tourn. ex Medik. (Calenduleae; Asteraceae) reveal extensive incongruence and generic paraphyly, but support the recognition of infraspecific taxa in C. monilifera. South African Journal of Botany 75: 560-572.

Chapman, M. A. and J. M. Burke. 2007. Genetic divergence and hybrid speciation. Evolution 61: 1773-1780.

Clement, M., D. Posada, and K. A. Crandall. 2000. TCS: a computer program to estimate gene genealogies. Molecular Ecology 9: 1657-1659.

Downie, S. R. and D. S. Katz-Downie. 1996. A molecular phylogeny of Apiaceae subfamily Apioideae: evidence from nuclear ribosomal DNA internal transcribed spacer sequences. American Journal of Botany 83: 234-251.

Doyle, J. J. and J. L. Doyle. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochemical Bulletin 19: 11-15.

Doyle, J. J., J. L. Doyle, J. T. Rauscher, and A. H. D. Brown. 2004a. Evolution of the perennial soybean polyploid complex (Glycine subgenus Glycine): a study of contrasts. Biological Journal of the Linnean Society 82: 583-583.

Doyle, J. J., J. L. Doyle, J. T. Rauscher, and A. H. D. Brown. 2004b. Diploid and polyploid reticulate evolution throughout the history of the perennial soybeans (Glycine subgenus Glycine). New Phytologist 161: 121-132.

Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792-1797.

Funk, V. A. and International Association for Plant Taxonomy. 2009. Systematics, evolution, and biogeography of Compositae. Vienna, Austria: International Association for Plant Taxonomy, Institute of Botany, University of Vienna.

Glez-Peña, D., D. Gómez-Blanco, M. Reboiro-Jato, F. Fdez-Riverola, and D. Posada. 2010. ALTER: program-oriented conversion of DNA and protein alignments. Nucleic Acids Research 38: W14-W18.

Goloboff, P. A., J. S. Farris, and K. C. Nixon. 2008. TNT, a free program for phylogenetic analysis. Cladistics 24: 774-774.

Green, A. F., T. S. Ramsey, and J. Ramsey. 2013. Polyploidy and invasion of English ivy (Hedera spp., Araliaceae) in North American forests. Biological Invasions 15: 2219-2241.

Greuter, W. 2006+. Compositae (pro parte majore)Compositae. Euro+Med Plantbase - the information resource for Euro-Mediterranean plant diversity, eds. W. Greuter and E. v. Raab-Straube. Published on the internet: http://ww2.bgbm.org/EuroPlusMed/

Heyn, C. C. and A. Joel. 1983. Reproductive relationships between annual species of Calendula (Compositae). Plant Systematics and Evolution 143: 311-329.

Heyn, C. C., O. Dagan, and B. Nachman. 1974. The annual Calendula species: taxonomy and relationships. Israel Journal of Botany 23: 169-201.

Huelsenbeck, J. P. and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics (Oxford, England) 17: 754-755.

Hurvich, C. M. and C. Tsai. 1989. Regression and Time Series Model Selection in Small Samples. Biometrika 76: 297-307.

Huson, D. H. 1998. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics (Oxford, England) 14: 68-73.

Huson, D. H. and D. Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23: 254-267.

Jiao, Y., P. S. Soltis, D. E. Soltis, S. W. Clifton, S. E. Schlarbaum, S. C. Schuster, H. Ma, J. Leebens-Mack, C. W. dePamphilis, N. J. Wickett, S. Ayyampalayam, A. S. Chanderbali, L. Landherr, P. E. Ralph, L. P. Tomsho, Y. Hu, and H. Liang. 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473: 97-100.

Joly, S., J. Rauscher, S. Sherman-Broyles, A. Brown, and J. Doyle. 2004. Evolutionary dynamics and preferential expression of homeologous 18S-5.8S-26S nuclear ribosomal genes in natural and artificial Glycine allopolyploids. Molecular Biology and Evolution 21: 1409-1421.

Lanza, D. 1919. Monografia del genere Calendula L. Palermo: Scuola Tip. "Boccone del Povero".

Levin, D. A. 1983. Polyploidy and Novelty in Flowering Plants. The American Naturalist 122: 1-25.

Lim, K. Y., A. R. Leitch, D. E. Soltis, P. S. Soltis, J. Tate, R. Matyasek, H. Srubarova, A. Kovarik, J. C. Pires, and Z. Xiong. 2008. Rapid chromosome evolution in recently formed polyploids in Tragopogon (Asteraceae). PloS One 3: e3353.

Mallet, J. 2007. Hybrid speciation. Nature 446: 279-283.

Mandáková, T., A. Kovarík, J. Zozomová-Lihová, R. Shimizu-Inatsugi, K. K. Shimizu, K. Mummenhoff, K. Marhold, and M. A. Lysak. 2013. The more the merrier: recent hybridization and polyploidy in Cardamine. The Plant Cell 25: 3280-3295.

Mears, J. A. 1980. The evolution of the pseudoguaianolides of Parthenium L. (Asteraceae, Ambrosiinae). Proceedings of the Academy of Natural Sciences of Philadelphia 132: 156-172.

Meikle, R. D. 1976. Calendula L. Pp. 206-207 in Flora Europaea, Plantaginaceae to Compositae vol.4, eds. T. G. Tutin, V. H. Heywood, N. A. Burges, D. H. Valentine, S. M. Walters, and D. A. Webb. Cambridge: Cambridge University Press.

Meusel, H. and H. Ohle. 1966. Zur taxonomie und cytologie der gattung Calendula. Plant Systematics and Evolution 113: 191-210.

Miller, M. A., W. Pfeiffer, and T. Schwartz. 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. Gateway Computing Environments Workshop (GCE), 2010: 1-8.

Müller, K. 2006. Incorporating information from length-mutational events into phylogenetic analysis. Molecular Phylogenetics and Evolution 38: 667-676.

Müller, K. 2005. SeqState - primer design and sequence statistics for phylogenetic DNA data sets. Applied Bioinformatics 4: 65-69.

Nixon, K. C. 1999a. Winclada (BETA) ver. 0.9.9. Ithaca, NY: Published by the Author.

Nixon, K. C. 1999b. The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 15: 407-414.

Nora, S., S. Castro, J. Loureiro, A. C. Gonçalves, H. Oliveira, M. Castro, C. Santos, and P. Silveira. 2013. Flow cytometric and karyological analyses of Calendula species from Iberian Peninsula. Plant Systematics and Evolution 299: 853-864.

Nordenstam, B. 2007. Tribe Calenduleae Cass. (1819). Pp. 241-245 in The Families and Genera of Vascular Plants, VIII: Flowering Plants, vol. 8, eds. K. Kubitzki and J. W. Kadereit. Berlin: Springer.

Nordenstam, B. 1994. Tribe Calenduleae. Pp. 365-376 in Asteraceae : cladistics & classification, eds. K. Bremer, A. A. Anderberg, P. O. Karis, B. Nordenstam, J. Lundberg, and O. Ryding. Portland: Timber Press.

Nordenstam, B. and M. Källersjö. 2009. Calenduleae. Pp. 527-538 in Systematics, Evolution, and Biogeography of Compositae, eds. V. A. Funk, A. Susanna, T. F. Stuessy, and R. J. Bayer. Vienna: International Association for Plant Taxonomy.

Norlindh, T. 1977. Calenduleae – systematic review. Pp. 961-987 in The Biology and Chemistry of the Compositae vol.11, eds. V. H. Heywood and J. B. Harborne. New York: Academic Press.

Norlindh, T. 1962. Studies in Calendula maderensis DC. Botaniska Notiser 115: 437-445.

Norlindh, T. 1946. Studies in the Calenduleae. 2, Phytogeography and interrelation. Lund, Sweden: Gleerup.

Norlindh, T. 1943. Studies in the Calenduleae. 1, Monograph of the genera Dimorphotheca, Castalis, Osteospermum, Gibbaria, and Chrysanthemoides. Lund, Sweden: Gleerup.

Ohle, H. 1974. Beiträge zur Taxonomie der Gattung Calendula II. Taxonomische Revision der südeuropäischen perennierenden Calendula-Sippen. Feddes Repertorium 85: 245-283.

Ohle, H. 1975a. Beiträge zur Taxonomie und Evolution der Gattung Calendula L. III. Revision der marokkanischen perennierenden Sippen unter Berücksichtigung einiger marokkanischer Annueller Mit 6 Tafeln und 4 Abbildungen. Feddes Repertorium 86: 1-17.

Ohle, H. 1975b. Beiträge zur Taxonomie und Evolution der Gattung Calendula L. IV. Revision der algerisch-tunesischen perennierenden Calendula-Sippen unter Berücksichtigung einiger marokkanisch-algerischer Annueller und der marokkanischen und südeuropäischen perennierenden Taxa Mit 5 Tafeln und 3 Abbildungen. Feddes Repertorium 86: 525-541.

Orians, C. M. 2000. The effects of hybridization in plants on secondary chemistry: implications for the ecology and evolution of plant-herbivore interactions. American Journal of Botany 87: 1749-1756.

Otto, S. P. and J. Whitton. 2000. Polyploid incidence and evolution. Annual Review of Genetics 34: 401-437.

Panero, J. L. and V. A. Funk. 2008. The value of sampling anomalous taxa in phylogenetic studies: major clades of the Asteraceae revealed. Molecular Phylogenetics and Evolution 47: 757-782.

Pazy, B. 2000. Flora palaestina: Chromosome numbers. Israel Journal of Plant Sciences 48: 7-32.

Plume, O., F. M. Raimondo, and A. Troia. 2013. Hybridization and competition between the endangered sea marigold (Calendula maritima, Asteraceae) and a more common congener. Plant Biosystems. Advance online publication. doi:10.1080/11263504.2013.810182

Popp, M. and B. Oxelman. 2007. Origin and evolution of North American polyploid Silene (Caryophyllaceae). American Journal of Botany 94: 330-349.

Posada, D. 2008. jModelTest: phylogenetic model averaging. Molecular Biology and Evolution 25: 1253-1256.

Rambaut, A. 2006. FigTree v1.3.1. Edinburgh: Published by the Author.

Rambaut, A. and A. J. Drummond. 2003-2009. Tracer v1.5.0. Edinburgh: Published by the Authors.

Ramsey, J. 2011. Polyploidy and ecological adaptation in wild yarrow. Proceedings of the National Academy of Sciences of the United States of America 108: 7096-7101.

Rieseberg, L. and D. Soltis. 1991. Phylogenetic consequences of cytoplasmic gene flow in plants. Evolutionary Trends in Plants 5: 65-84.

Ronquist, F. and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics (Oxford, England) 19: 1572-1574.

Shaw, J., E. B. Lickey, E. E. Schilling, and R. L. Small. 2007. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. American Journal of Botany 94: 275-288.

Shaw, J., R. L. Small, E. B. Lickey, J. T. Beck, S. B. Farmer, W. Liu, J. Miller, K. C. Siripun, C. T. Winder, and E. E. Schilling. 2005. The tortoise and the hare II: relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. American Journal of Botany 92: 142-166.

Silveira, P., A. Gonçalves, C. Santos, and J. Paiva. 2013. Two lectotypifications and a new combination in Calendula (Asteraceae) for Flora Iberica. Phytotaxa 145: 47-53.

Simmons, M. P. and H. Ochoterena. 2000. Gaps as Characters in Sequence-Based Phylogenetic Analyses. Systematic Biology 49: 369-381.

Soltis, D. E., P. S. Soltis, D. W. Schemske, J. F. Hancock, J. N. Thompson, B. C. Husband, and W. S. Judd. 2007. Autopolyploidy in angiosperms: have we grossly underestimated the number of species? Taxon 56: 13-30.

Stefanovic, S., B. E. Pfeil, J. D. Palmer, and J. J. Doyle. 2009. Relationships among phaseoloid legumes based on sequences from eight chloroplast regions. Systematic Botany 34: 115-128.

Sugiura, N. 1978. Further analysis of the data by akaike's information criterion and the finite corrections. Communications in Statistics - Theory and Methods 7: 13-26.

Timme, R. E., J. V. Kuehl, J. L. Boore, and R. K. Jansen. 2007. A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: identification of divergent regions and categorization of shared repeats. American Journal of Botany 94: 302-312.

Tsitrone, A., M. Kirkpatrick, and D. A. Levin. 2003. A model for chloroplast capture. Evolution 57: 1776.

Wagner, A., G. P. Wagner, J. Pendleton, N. Blackstone, P. Cartwright, M. Dick, B. Misof, P. Snow, J. Bartels, and M. Murtha. 1994. Surveys of gene families using polymerase chain reaction: PCR selection and PCR drift. Systematic Biology 43: 250-261.

Wagstaff, S. J. and I. Breitwieser. 2002. Phylogenetic relationships of New Zealand Asteraceae inferred from ITS sequences. Plant Systematics and Evolution 231: 203-224.

Warner, D. A. and G. E. Edwards. 1993. Effects of polyploidy on photosynthesis. Photosynthesis Research 35: 135-147.

Wendel, J. F. and R. C. Cronn. 2003. Polyploidy and the evolutionary history of cotton. Pp. 139-186 in vol. 78: Elsevier Science & Technology.

Wendel, J. F., A. Schnabel, and T. Seelanan. 1995. Bidirectional interlocus concerted evolution following allopolyploid speciation in cotton (Gossypium). Proceedings of the National Academy of Sciences of the United States of America 92: 280-284.

White, T. J., T. Bruns, S. Lee, and J. W. Taylor. 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. Pp. 315-322 in PCR protocols: a guide to methods and applications, eds. M. A. Innis, D. H. Gelfand, J. J. Sninsky, and T. J. White. New York: Academic Press, Inc.

Wood, T. E., N. Takebayashi, M. S. Barker, I. Mayrose, P. B. Greenspoon, and L. H. Rieseberg. 2009. The frequency of polyploid speciation in vascular plants. Proceedings of the National Academy of Sciences of the United States of America 106: 13875-13879.

CHAPTER 2

UTILITY OF LOW-COPY NUCLEAR MARKERS FOR USE IN PHYLOGENY

RECONSTRUCTION IN CALENDULA L. (COMPOSITAE)

1. Introduction

Low-copy nuclear (LCN) markers are a potentially rich source of phylogenetic information in plants. They are numerous, often rapidly evolving, biparentally inherited, and each unlinked marker can serve as an independent estimate of phylogeny (Doyle and Doyle 1999; Sang 2002; Small et al. 2004). Challenges temper the benefits of using LCN markers, and even include some of the same traits that make them appealing (Feliner and Rossello 2007). Rapid evolution, for example, means that even the most useful LCN markers in plants are not as “universal” as those available for plastid and nuclear ribosomal markers. Even within closely related taxa, it can be difficult to develop primers that amplify the expected targets and recover markers that are truly single or even “low” in copy number. Complex histories including whole-genome duplication, gene birth and death, recombination, and mutations in priming sites may lead, variously, to amplification of multiple paralogs when a single copy is expected, to no amplification at all, or to amplification of unexpected targets. Paralogs may be mistaken for orthologs (leading to incorrect estimations of relationships) especially if different copies of a marker have been retained or lost in different taxa or if sampling has simply failed to recover all orthologous copies in all taxa included in analyses (Doyle and Doyle 1999; of course, this can also be a problem for phylogenetic reconstruction using nuclear ribosomal markers, e.g., Álvarez and Wendel 2003).

Further, variation at single loci (heterozygosity) may produce misleading (and often non-monophyletic) hypotheses of species delimitations in phylogenetic trees, and may also be difficult to distinguish from variation at paralogous loci (Doyle 1995; Small et al. 2004).

Awareness of the challenges associated with finding and using LCN markers does not negate their allure, however. An ever-increasing pool of genetic sequence information in public databases (such as GenBank (http://www.ncbi.nlm.nih.gov/genbank/), The Arabidopsis Information Resource (TAIR; www.arabidopsis.org), and the Compositae Genome Project (compgenomics.ucdavis.edu)) and new approaches to utilizing the available data to identify useful regions and develop primers (e.g., Hughes et al. 2006; Steele et al. 2008; Ilut and Doyle 2012 and references therein) have made the use of LCN markers feasible in more plant groups. There has also been an explosion of new sequence data over the past few years due to next-generation sequencing (NGS) techniques, which is providing new resources for finding candidate LCN genes in both model and non-model organisms and has the potential to make thousands of genes available for study instead of one or a few. Best practices for obtaining, analyzing, and interpreting these data are often debated and are still evolving, but NGS is already changing the game of plant systematics (see Davey et al. 2011; Egan et al. 2012; Bombarely et al. 2014).

LCN markers are increasingly a part of strategies (e.g., Chapman et al. 2007; Álvarez et al. 2008) to elucidate relationships in what is arguably the largest family of angiosperms, the Compositae, comprising ~23,000 species in four subfamilies and 42 tribes, and occurring everywhere in the world except Antarctica (Funk et al. 2009). Although major inroads into understanding subfamilial and tribal relationships have been made recently (Funk et al. 2009; Funk et al. 2005), phylogenetic reconstruction within the Compositae has been and continues to be notoriously difficult. Extensive hybridization and polyploidy, both of which likely played roles in the spectacular radiation of species in the family (see Chapman and Burke 2007a; Mallet 2007; Rieseberg and Willis 2007; Wood et al. 2009), have not simplified matters, and Calendula represents a typical example. Both aneuploid and polyploid variation have been reported throughout the genera of the tribe Calenduleae (for which Calendula is the type genus), but in none more so than in Calendula (Nordenstam and Källersjö 2009). Untangling such complex histories both necessitates and complicates the use of LCN markers, as the present study will show.

Calendula is a small, circum-Mediterranean genus of 11 or 12 species (depending on whether the perennial polyploids, excluding C. officinalis, are placed in one or two species; see Chapter 1), with chromosome numbers of 2n=14, 18, 30, 32, 44, and ~85 consistently reported for species or groups of species and thought to represent a history of dysploid/aneuploid variation as well as hybridization between species with different chromosome numbers followed by genome duplication (see Chapter 1). Plume (Chapter 1) used chloroplast and ITS markers to begin to test hypotheses of relationships among taxa and, in particular, origins of putative allopolyploids in Calendula. Pictorial summaries of hypotheses from the literature as well as revised hypotheses based on chloroplast and ITS analyses are reproduced from Chapter 1 in Fig. 2.1 below. Chloroplast and ITS markers were able to provide evidence for some hypotheses, as follows: 1) C. arvensis likely originated from multiple, reciprocal crosses between C. stellata and C. tripterocarpa, 2) C. officinalis was likely the result of a single introduction into cultivation from a wild perennial polyploid taxon, and 3) a closer relationship exists between C. stellata and C. tripterocarpa than between C. stellata and the annual C. lanzae or any other diploid species.

Despite these advances, some questions remained unanswered, such as the possibility of involvement of any of four diploid species endemic to Morocco in the founding of perennial polyploid taxa (including the annual or short-lived perennial C. officinalis) via crosses with C. stellata. Chloroplast and ITS markers provided no evidence for a role of the Moroccan species as progenitors, in direct contrast to hypotheses based on morphological and karyological evidence from the literature (Ohle 1974, 1975a, 1975b). However, it is possible that chloroplast and ITS markers, due to uniparental (in all likelihood maternal) inheritance of the former and concerted evolution of the latter, were not sufficient to fully test these hypotheses.

Figure 2.1. Pictorial summary of major hypotheses of species origins in Calendula from (a) previously published work based on morphology and karyology (Heyn et al. 1974; Heyn and Joel 1983; Ohle 1974, 1975a, 1975b), and (b) from analyses of three chloroplast regions and ITS as markers (Chapter 1). Conventions for depicting lineages, capitula size and color, hybridization, and polyploidization events are as described in Chapter 1, Figs. 1.1 and 1.7.

The present study seeks to expand upon the previous work in three ways. First, it is an exploratory mission to survey the utility of available LCN markers for phylogenetic reconstruction in Calendula (a similar investigation of LCN markers was carried out in a group of vine cacti by Plume et al. 2013). Second, it brings sequence information from LCN markers to bear on questions that remained unanswered after analysis of chloroplast and ITS sequences from Calendula. Third, it briefly explores LCN variation observed in Calendula within the broader context of the Compositae and emphasizes the information that is to be gained by this approach.

2. Materials and Methods

2.1. Taxon sampling and marker screening and selection

New sequence data were generated from subsets of the individuals included in chloroplast and ITS phylogenetic analyses (Chapter 1). Details of these individuals, including voucher information and analyses in which they were included, are provided in Table 2.1. In addition, several sequences were downloaded from GenBank for inclusion in some analyses (see section 2.3. Phylogenetic Analyses for details). A small subset of seven individuals of Calendula (indicated with asterisks in Table 2.1), including an individual from each chromosome number group and an individual of C. officinalis, were used for screening primer pairs, via PCR, for 17 candidate single or low-copy nuclear loci, which were gathered from the literature for their potential utility in this study. Each locus was evaluated for its amplification in all or most of the individuals screened, the presence of single bands after gel electrophoresis (particularly in diploids), and for the identity of any sequences obtained (i.e., whether or not primers amplified the expected targets). Evaluation of this last criterion was performed by submitting direct sequences to basic nucleotide BLAST (blast.ncbi.nlm.nih.gov). All loci, primer information, PCR programs, and outcomes of amplification and sequencing (if applicable) for each locus are included in Table 2.2.

Table 2.1. Taxa used for screening of LCN primers (marked with “*”) and for generation of A39 and Chs sequences for this study. **Cultivated individuals were grown from seed in greenhouses at Cornell University, Ithaca, NY, USA. Seed source letter codes indicate the following: PI = USDA-GRIN (www.ars-grin.gov); IGB = The Israel Gene Bank (igb.agri.gov.il/main/index.pl); SHS = Silver Hill Seeds (www.silverhillseeds.co.za); BA = Bakers Acres (www.bakersacres.net); OP = O. Plume; PS = P. Silveira (University of Aveiro, Portugal); MS = M. Sequeira (University of Madeira, Portugal; via PS); pop. = OP seed collection from population rather than individual.

Taxon | Individual ID | Voucher (herbarium) | Collection type (seed source)** | Origin | A39 | Chs
Calendula arvensis (Vaill.) L. | Carv5 | O. Plume 5 (BH) | Cultivated (PI 633645) | Beja, Portugal | x | –
C. arvensis | Carv6 | O. Plume 6 (BH) | Cultivated (PI 578097) | Turkey | x | x
C. arvensis | Carv8 | O. Plume 8 (BH) | Cultivated (PI 597587) | Apulia, Italy | x | –
*C. arvensis | Carv149 | O. Plume 149 (BH) | Cultivated (PI 6031092) | Greece | x | x
C. arvensis | Carv234 | O. Plume 234 (BH) | Wild | Hermel, Lebanon | x | –
C. arvensis | Carv450 | O. Plume 450 (BH) | Cultivated (OP pop. 220) | Naqoura, Lebanon | x | –
C. arvensis | Carv452 | O. Plume 452 (BH) | Cultivated (OP 66) | Modica, Italy | x | –
C. arvensis | Carv470 | O. Plume 470 (BH) | Cultivated (PI 613017) | Morocco | x | –
C. arvensis | Carv477 | O. Plume 477 (BH) | Cultivated (OP pop. 245) | Qaa, Lebanon | x | –
C. eckerleinii Ohle | Ceck15 | O. Plume 15 (BH) | Cultivated (PI 603110) | Morocco | x | x
C. eckerleinii | Ceck287 | O. Plume 287 (BH) | Cultivated (PI 603110) | Morocco | x | –
*C. eckerleinii | Ceck3064 | P. Silveira 3064 (AVE) | Wild | Morocco | x | x
C. incana subsp. algarbiensis (Boiss.) Ohle | Cialg145 | O. Plume 145 (BH) | Cultivated (PS 2899) | Beja, Portugal | x | –
C. incana subsp. maderensis (DC.) Ohle | Cimad142 | O. Plume 142 (BH) | Cultivated (MS 5676) | Madeira | x | –
C. incana subsp. maderensis | CimadKEW | Chase 19982 (KEW) | Wild | Madeira | x | x
C. incana subsp. maritima (Guss.) Ohle | Cimar1 | O. Plume 1 (BH) | Cultivated (PI 597596) | Trapani, Italy | x | –
C. lanzae Maire | Clan22 | 6522 (GAT) | Wild | Morocco | x | –
C. lanzae | Clan24 | 6524 (GAT) | Wild | Morocco | x | –
C. maroccana (Ball) B. D. Jacks | Cmar14 | O. Plume 14 (BH) | Cultivated (PI 578104) | Morocco | x | –
C. maroccana | Cmar165 | O. Plume 165 (BH) | Cultivated (PI 607416) | Morocco (cult. Germany) | x | –
C. maroccana | Cmar213 | O. Plume 213 (BH) | Cultivated (PI 607417) | Morocco (cult. France) | x | –
C. meuselii Ohle | Cmeu3063 | P. Silveira 3063 (AVE) | Wild | Morocco | x | x
*C. officinalis L. | Co170 | O. Plume 170 (BH) | Cultivated (PI 578106) | Kazakhstan | x | x
*C. pachysperma Zohary | Cpac207 | O. Plume 207 (BH) | Cultivated (IGB 20562) | Samaria Mountains, Israel | x | –
C. palaestina Boiss. | Cpal219 | O. Plume 219 (BH) | Cultivated (IGB 21124) | Mount Carmel, Israel | x | –
*C. suffruticosa subsp. fulgida (Raf.) Guadagno | Csf166 | O. Plume 166 (BH) | Cultivated (PI 607420) | Mt. Erice, Italy | x | x
C. suffruticosa Vahl. subsp. suffruticosa | Css3038 | P. Silveira 3038 (AVE) | Wild | Tunisia | x | –
*C. stellata Cav. | Cste10 | O. Plume 10 (BH) | Cultivated (PI 603114) | Morocco | x | x
C. stellata | Cste18 | O. Plume 186 (BH) | Cultivated (PI 603114) | Morocco | x | –
C. stellata | Cste321 | O. Plume 321 (BH) | Cultivated (OP pop. 250) | Sidi Kacem, Morocco | x | –
*C. tripterocarpa Rupr. | Ctri139 | O. Plume 139 (BH) | Cultivated (PS 2982) | Almeria, Spain | x | x
Dimorphotheca tragus (Aiton) B. Nord. | Dtra19 | O. Plume 19 (BH) | Cultivated (PI 263145) | New York, USA | x | –
Garuleum Cass. pinnatifidum DC. | Gpin174 | O. Plume 174 (BH) | Cultivated (SHS 11408) | South Africa | x | –
D. barberiae Harv. | Obar173 | O. Plume 173 (BH) | Purchased (BA) | Ithaca, USA | x | –
D. fruticosa (L.) DC. | Ofru209 | O. Plume 209 (BH) | Cultivated (SHS 5205) | South Africa | x | –

Candidates included: two loci of purported general phylogenetic utility in angiosperms, a glyceraldehyde 3-phosphate dehydrogenase gene (G3pdh; Strand et al. 1997) and the floral identity gene pistillata (Bailey and Doyle 1999); ten of the numerous markers identified by Chapman et al. (2007) for their demonstrated phylogenetic utility in the Compositae (A01, A25, A39, C08, C32, D15, D27, D36, D39, and D46); a floricaula/leafy homolog identified in Chrysanthemum (Dfl; Ma et al. 2008); a globosa homolog from Gerbera (Yu et al. 1999); an alcohol dehydrogenase locus developed for a study of hybridization in Dubautia (Dadh; Friar et al. 2008); a member of a polyketide synthase family characterized by Helariutta et al. (1996) and modified for use in Senecioneae (and, specifically, for a single chalcone synthase locus) by Álvarez et al. (2008; Chs); and one locus (QG8140) identified specifically for the Senecioneae by Álvarez et al. (2008), following the method proposed in their study for selecting single-copy nuclear markers for phylogenetics. In an effort to decrease the phylogenetic distance between Calendula and the published primers for pistillata and globosa (developed for amplification of targets in Brassicaceae and Gerbera, respectively), pistillata primers Chryspi58F, Chryspi670R, and Comppi58F were designed for this study based on alignment of pistillata sequences from Brassicaceae (Bailey and Doyle 1999) with either Chrysanthemum expressed sequence tags (ESTs) or a wider set of Compositae ESTs from the Compositae Genome Project (compgenomics.ucdavis.edu), and globosa primers ChrysgloF and ChrysgloR were designed based on alignment of Gerbera Gglo1 (Yu et al. 1999) with the Chrysanthemum ESTs.

Only two of the markers tested, A39 and Chs, met the evaluation criteria after screening. The A39 locus, identified as a single- or low-copy nuclear marker of potential phylogenetic utility in the Compositae (Chapman et al. 2007), spans part of a gene that codes for a sugar transporter protein. This gene in Arabidopsis has seven exons and six introns. The A39 primers are anchored in exons 2 and 4. This locus has been used successfully in studies on the origin of cultivated safflower (Carthamus tinctorius; Chapman and Burke 2007b) and allopolyploidy and relationships among genera of everlasting daisies (Gnaphalieae; Smissen et al. 2011). In both studies, a single copy of A39 was suggested to be present. The Chs locus used here spans part of the second exon of a nuclear gene that codes for chalcone synthase (Chs) and Chs-like enzymes (Helariutta et al. 1996; Wang et al. 2000). See gene models and primer locations for A39 and Chs in Fig. 2.2.

Table 2.2. Loci, primer information, PCR programs, and amplification and sequencing outcomes for 17 low-copy nuclear markers tested for their utility in this study. A39 and Chs, both marked with an asterisk (*), were selected for further exploration. The A39 locus was found to have two copies in Calendula (A39c1 and A39c2), and the copy-specific reverse primers (A39c1R and A39c2R) used to amplify each copy (together with A39F) are included below. Conditions for each PCR program are given in Materials and Methods.

Locus | Primer name and sequence (5’ – 3’) or source | PCR program | Amplification and sequencing outcomes
A01 | A01F and A01R: Chapman et al. 2007 | Screen 55 | One faint band in a few individuals or no amplification
A25 | A25F and A25R: Chapman et al. 2007 | Screen 55 | One faint band in a few individuals or no amplification
*A39 | A39F and A39R: Chapman et al. 2007 | Screen 52-56 or Touchdown | One bright band in most individuals; polymorphisms and indels in direct sequences; expected target
A39c1 | A39F: Chapman et al. 2007; A39c1R: ATVCCAACYCCAACAAGTGG | A39c1c2 | One bright band in most individuals; polymorphisms and indels in direct sequences; expected target or (rarely) A39c2
A39c2 | A39F: Chapman et al. 2007; A39c2R: ATVCCAACYCCAACAAGTGA | A39c1c2 | One bright band in most individuals; polymorphisms and indels in direct sequences; expected target or (rarely) A39c1
C08 | C08F and C08R: Chapman et al. 2007 | Touchdown | Multiple bands or no amplification
C32 | C32F and C32R: Chapman et al. 2007 | Screen 56 | One or two bands in a few individuals or no amplification; short sequence fragments; expected target
D15 | D15F and D15R: Chapman et al. 2007 | Screen 56 | No amplification
D27 | D27F and D27R: Chapman et al. 2007 | Screen 56 | No amplification
D36 | D36F and D36R: Chapman et al. 2007 | Screen 55 | One faint to bright band in a few individuals or no amplification
D39 | D39F and D39R: Chapman et al. 2007 | Screen 56 | One faint band in a few individuals or no amplification; short sequence fragments; unknown target
D46 | D46F and D46R: Chapman et al. 2007 | Screen 52-56 | One bright band in most individuals (variable length); bad sequences or unexpected target
*Chs | 1266F and 1990R: Álvarez et al. 2008 | Screen 52-56 or Touchdown | One bright band in most individuals; polymorphisms and indels in direct sequences; expected target or related copies
QG8140 | 72F and 1070R: Álvarez et al. 2008 | Touchdown | Two bands in one individual, faint single band in one individual, no other amplification
Globosa | ChrysgloF: GCTAGAGGAAAGATCCAGATCAA; ChrysgloR: CCTTATATCAAATTCATGCATTAG | Screen 48 or Touchdown | One faint band in a few individuals or no amplification; bad sequences
Dadh | DadhA-95F and Dadh-E10R: Friar et al. 2008 | Screen 52 | One faint band in a few individuals or no amplification; bad sequences or unexpected targets
Dadh | DadhB-86F and Dadh-E10R: Friar et al. 2008 | Screen 52 | One faint band in one individual or no amplification; unexpected target
Dadh | DadhC-61F and Dadh-E10R: Friar et al. 2008 | Screen 52 | Multiple bands
Dfl | FLO/LFY F and FLO/LFY R: Ma et al. 2008 | Screen 56 or Touchdown | Multiple bands
G3pdh | GPDX7F and GPDX9R: Strand et al. 1997 | G3pdh | Multiple bands (rarely one band); primer dimers; both expected and unexpected targets
Pistillata | F18 and F16: Bailey and Doyle 1999 (provided by E. M. Meyerowitz and E. Krizek) | Touchdown | One faint band in one individual or no amplification; bad sequence
Pistillata | Chryspi58F: ATGGCTAGAGGAAAGATCCAG; Chryspi670R: GAGTGAGTATGCGTTGTGTAG | Touchdown | One faint band in a few individuals or no amplification
Pistillata | Comppi58F: ATGGSKAGAGGAAAGATMSAG; pi1254R: Bailey and Doyle 1999 | Touchdown | No amplification

Figure 2.2. Location of A39 and Chs primers relative to Arabidopsis gene models At2G28315 (sugar transporter family protein) and At5G13930 (chalcone synthase), respectively. Gene models are adapted from TAIR (www.arabidopsis.org); e = exon; i = intron; gray box = untranslated region (UTR).

Direct sequences from both markers suggested the presence of two or more copies varying both by nucleotide polymorphisms and insertion-deletion events (indels), which necessitated cloning to further assess the utility of these loci for reconstructing phylogenies in Calendula. Phylogenetic analyses of A39 clones (see below) revealed two distinct paralogs of A39 sequences (A39c1 and A39c2), so copy-specific reverse primers (A39c1R and A39c2R; Table 2.2) were designed for use with the A39F primer in an effort to amplify and sequence each copy separately and reduce the need for cloning.
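As an illustration only (not part of the original analyses), the degenerate primers in Table 2.2 can be examined with a short Python sketch. It assumes the standard IUPAC ambiguity codes and reads each primer 5' to 3' as given in the table; the function names are hypothetical. It counts how many unambiguous oligonucleotides each degenerate primer encodes and reports where A39c1R and A39c2R differ.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

def mismatches(a, b):
    """0-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Copy-specific reverse primers and a degenerate pistillata primer (Table 2.2).
A39C1R = "ATVCCAACYCCAACAAGTGG"
A39C2R = "ATVCCAACYCCAACAAGTGA"
COMPPI58F = "ATGGSKAGAGGAAAGATMSAG"

if __name__ == "__main__":
    print(degeneracy(A39C1R), degeneracy(A39C2R), degeneracy(COMPPI58F))
    print(mismatches(A39C1R, A39C2R))
```

Under these assumptions the two A39 reverse primers differ only at their final (3'-terminal) position, which would be consistent with the occasional cross-amplification of the other copy noted in Table 2.2, although the sketch itself says nothing about amplification behavior.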

The small screening subset was expanded to 35 individuals for sampling of A39 (including four outgroup species in two genera in Calenduleae) and to 10 individuals for sampling of Chs (see Table 2.1). For A39, an effort was made to sample from the range of variation in the genus while reducing the total number of individuals included for each taxon (to save time and reduce cost of multiple rounds of cloning and sequencing). Sampling of Chs was stopped after it became clear that more than two copies of this marker were present even in diploid individuals in Calendula and that continued sampling would be inefficient with the methods available for the study (Sanger sequencing and cloning). Nevertheless, sequences from individuals representing all but the highest chromosome number group (2n=~85) were obtained.

2.2. DNA extraction, PCR, cloning, and sequencing

Total genomic DNA was extracted as described in Chapter 1. Typical PCR reactions during the screening phase were performed in 20 ul volumes with 1 ul template DNA (0.1-1.0 ng), 2 ul 1X Standard Taq Reaction Buffer (New England Biolabs [NEB], Ipswich, MA), 0.2 mM each dNTP (NEB), 0.5 uM of each primer (prepared by Invitrogen™, Life Technologies, Grand Island, NY), and 1 unit Taq DNA polymerase (NEB). Some A39 and most Chs reactions were also performed using the NEB Taq protocol just described, but many A39, A39c1, and A39c2 reactions were performed using either FIREPol® (Solis BioDyne, Tartu, Estonia) or TaKaRa Ex Taq™ (Takara Bio Inc., Shiga, Japan) DNA polymerases. FIREPol reactions were performed in 20 ul volumes with 1 ul template DNA (0.1-1.0 ng), 0.2 mM each dNTP (Solis BioDyne), 2.0 mM MgCl2 (Solis BioDyne), 0.25-0.5 uM of each primer (prepared by Invitrogen™, Life Technologies), and 1 unit of FIREPol DNA polymerase (Solis BioDyne). TaKaRa Ex Taq reactions were performed in 12 ul volumes with 1 ul template DNA (0.1-1.0 ng), 1.2 ul 10x Ex Taq Buffer (Takara Bio Inc.), 0.2 mM each dNTP (Takara Bio Inc.), 0.5 uM of each primer (prepared by Invitrogen™, Life Technologies), and 0.3 units TaKaRa Ex Taq DNA polymerase (Takara Bio Inc.).

Most loci were amplified using the same basic PCR program (“Screen” in Table 2.2), except that the annealing temperature ranged from 48 to 56°C: 1 cycle of 94°C for 2 min; 40 cycles of 94°C for 30 sec, 48-56°C for 30 sec, 72°C for 1 min; 1 cycle of 72°C for 5 min. The annealing temperature(s) used for each locus is appended to the name “Screen” in Table 2.2 (e.g., “Screen 48” had an annealing temperature of 48°C). The “Touchdown” program (Table 2.2) of Chapman et al. (2007) was sometimes used to improve amplification of desired products: 1 cycle of 95°C for 3 min; 10 cycles during which the annealing temperature is decreased from 60 to 50°C by 1°C per cycle (94°C for 30 sec, 60 => 50°C for 30 sec, 72°C for 45 sec); 30 cycles of 94°C for 30 sec, 50°C for 30 sec, 72°C for 45 sec; 1 cycle of 72°C for 20 min. A39c1 and A39c2 were amplified using the program “A39c1c2” (Table 2.2): 1 cycle of 95°C for 3 min; 35 cycles of 95°C for 30 sec, 52°C for 30 sec, 72°C for 1 min; 1 cycle of 72°C for 5 min. G3pdh was amplified using the program of Strand et al. (1997; “G3pdh” in Table 2.2): 1 cycle of 95°C for 2 min; 35 cycles of 95°C for 1 min, 48°C for 90 sec, 72°C for 2 min; 1 cycle of 72°C for 9 min.

Reactions were performed on Techne, MJ, or MaxyGene thermocyclers.
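The “Touchdown” program described above can be written out cycle by cycle. The following is a minimal Python sketch for illustration only and was not part of the original protocol. The function names are hypothetical, and the sketch assumes that the first touchdown cycle anneals at 60°C, so that the tenth anneals at 51°C before the 30 cycles at 50°C begin.

```python
def touchdown_annealing_temps(start=60, final=50, touchdown_cycles=10, plateau_cycles=30):
    """Return the annealing temperature (deg C) for every cycle of the program.

    Assumption: the first touchdown cycle anneals at `start` and each later
    touchdown cycle is 1 deg C cooler; the plateau cycles then anneal at `final`.
    """
    touchdown = [start - i for i in range(touchdown_cycles)]
    plateau = [final] * plateau_cycles
    return touchdown + plateau


def nominal_run_seconds(temps, denature=30, anneal=30, extend=45,
                        initial=3 * 60, final_extension=20 * 60):
    """Sum the programmed hold times only (ramping between steps is ignored)."""
    return initial + len(temps) * (denature + anneal + extend) + final_extension


temps = touchdown_annealing_temps()
print(len(temps), temps[:3], temps[9:12])
print(nominal_run_seconds(temps) / 60, "min of programmed holds")
```

Under these assumptions the program has 40 cycles in total, the same number as the “Screen” program.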

PCR amplification was confirmed by agarose gel electrophoresis of 2-5 ul of each PCR product. PCR products were cleaned before sequencing (but not before cloning) by adding 3 ul Standard Taq Reaction Buffer (NEB), 10 units Exonuclease I (NEB), and 0.5 units Antarctic Phosphatase (NEB) to each 20 ul PCR product and incubating for 45 min at 37°C, then 10 min at 90°C, or by adding 2.25 ul sterile water, 5 units Exonuclease I, and 2.5 units Shrimp Alkaline Phosphatase (NEB) to each 20 ul product and incubating for 30 min at 37°C, then 15 min at 80°C. Reactions were scaled as necessary for smaller reaction volumes.
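The scaling of clean-up reactions to smaller volumes can be illustrated with a short Python sketch. It assumes simple linear scaling from the 20 ul reference amounts given above; the function and dictionary names are hypothetical, and the 12 ul example volume is chosen only because it matches the TaKaRa Ex Taq reaction volume.

```python
def scale_cleanup(product_ul, reference_ul=20.0, **per_reference):
    """Scale each reagent amount linearly with the PCR product volume."""
    factor = product_ul / reference_ul
    return {name: round(amount * factor, 3) for name, amount in per_reference.items()}


# Amounts stated in the text for a 20 ul PCR product (first clean-up recipe).
EXO_AP = {"buffer_ul": 3.0, "exonuclease_I_units": 10.0,
          "antarctic_phosphatase_units": 0.5}
# Amounts stated in the text for a 20 ul PCR product (second clean-up recipe).
EXO_SAP = {"water_ul": 2.25, "exonuclease_I_units": 5.0,
           "shrimp_alkaline_phosphatase_units": 2.5}

print(scale_cleanup(12.0, **EXO_AP))
print(scale_cleanup(12.0, **EXO_SAP))
```

In practice the volume remaining after gel electrophoresis may be smaller than the nominal reaction volume, so the product volume passed to the function would need to reflect what is actually in the tube.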

PCR reactions for cloning of A39 and Chs loci were performed in triplicate using the “Touchdown” or “Screen” programs and pooled before cloning (as described in Chapter 1 for ITS). Pooled products were cloned using the TOPO® TA Cloning® Kit for Sequencing (Invitrogen™, Life Technologies, Grand Island, NY) following the manufacturer’s protocol. At least 10 (and as many as 40) colonies were selected for sequencing from each cloned individual (when 10 or more colonies were available) with the goal of sampling, as much as possible, the true diversity of copies present. Amplification of cloned inserts was performed using the NEB Taq protocol, the T3 and T7 primers as published in the TOPO® TA Cloning® Kit for Sequencing manual, and the “Screen” PCR program, typically with an annealing temperature of 52°C. Direct sequencing and/or cloning of A39 was repeated multiple times for some species, especially to confirm the presence of only a single copy in one outgroup.
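The rationale for sequencing 10 to 40 colonies per individual can be illustrated with a simple probability calculation. The Python sketch below is an idealized illustration, not an analysis performed in this study: it assumes that every clone is an independent and equally likely draw of one of the copies present, which ignores PCR and cloning bias. The function name is hypothetical.

```python
from math import comb


def prob_all_copies_sampled(n_clones, n_copies):
    """Probability that every copy appears at least once among n_clones.

    Assumption: each clone is an independent, equally likely draw of one copy
    (inclusion-exclusion over the copies that could be missed).
    """
    return sum((-1) ** j * comb(n_copies, j) * (1 - j / n_copies) ** n_clones
               for j in range(n_copies + 1))


for copies in (2, 4):
    for clones in (10, 40):
        print(copies, clones, round(prob_all_copies_sampled(clones, copies), 4))
```

Unequal amplification of copies would lower these probabilities, which is one reason that recovery of a copy from only some individuals is weak evidence of its absence.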

Capillary DNA sequencing of both forward and reverse strands was carried out using the amplification primers (including T3 and T7 for all clones) and BigDye Terminator v3.1 chemistry (Applied Biosystems®, Life Technologies, Grand Island, NY) at Cornell University Life Sciences Core Laboratories Center (CLC, Ithaca, NY) or at Massey Genome Service of Massey University (Palmerston North, New Zealand) on a 3730 DNA analyzer (Applied Biosystems).

2.3. Phylogenetic analyses

The A39 matrix included only Calendula and outgroup sequences generated for this study. The Chs matrix included Calendula sequences generated for this study (Table 2.1) as well as 116 Chs sequences downloaded from GenBank (115 from Compositae and one from Solanum) used in three different studies: 1) a study of Chs duplication in Chrysanthemum (Compositae: Anthemideae) within the context of Chs diversity in the (Yang et al. 2002); 2) a method for selecting single-copy nuclear genes for use in Senecioneae (Compositae; Álvarez et al. 2008); and 3) a phylogeny of Adenostyles (Compositae: Senecioneae; Dillenberger and Kadereit 2013). For simplicity (and to decrease computation time), duplicate sequences within individuals and sequences identified by Álvarez et al. (2008) as pseudogenes were excluded. Because these sequences each grouped with other sequences recovered from the same individual (except for one sequence recovered twice from E. angustifolia which grouped with Ci. volubilis), it is not expected that their inclusion would alter topologies recovered without them. Sources for GenBank sequences are explained in Fig. 2.4. Four Calendula sequences suspected to be pseudogenes (due to the presence of premature stop codons) were also excluded.
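Screening an exon fragment for premature stop codons can be sketched in a few lines of Python. This is an illustrative sketch and not the procedure used in this study: the reading frame is assumed to be known, the standard nuclear stop codons are assumed, and the example string is a made-up fragment rather than a Calendula sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, frame=0):
    """Return 0-based nucleotide positions of stop codons in the given frame.

    Alignment gaps ("-") are removed first, which is only safe when gaps do
    not shift the reading frame (an assumption of this sketch).
    """
    seq = seq.upper().replace("-", "")
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


example = "ATGGCTTAGGCA"  # hypothetical fragment, not a Calendula sequence
print(in_frame_stops(example, frame=0))
print(in_frame_stops(example, frame=1))
```

For an internal exon fragment, any stop codon in the correct frame would be premature, so a non-empty result flags a candidate pseudogene.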

Sequences were aligned using MUSCLE (Edgar 2004) via the MUSCLE webserver (http://www.ebi.ac.uk) and alignments were adjusted by eye in Winclada (Nixon 1999). A comparison of A39 sequences from Calendula to one from Arabidopsis and to many from other genera in Compositae (ESTs and genomic sequences available in GenBank) revealed that the exonic regions were well conserved such that they were easily alignable even to Arabidopsis without ambiguity or indels. The introns, however, were extremely variable, and sequences from Calendula had high levels of nucleotide polymorphisms and large and numerous indels. Although the unaligned length of individual A39 sequences from Calendula ranged from 419 to 496 nucleotides (nt), the aligned length of the matrix was 709 nt to accommodate indels. The resulting gaps were coded in SeqState (Müller 2005; 2006) using the simple indel coding method of Simmons and Ochoterena (2002), which added 104 characters to the matrix (see matrix statistics in Table 2.3). Chs sequences were easily alignable between all taxa and no gaps were present in the Chs matrix. The primers were trimmed from both the A39 and Chs alignments.
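The logic of simple indel coding can be illustrated with a small Python sketch. This is not the SeqState implementation and the toy alignment is hypothetical. The sketch assumes that each gap with a unique pair of start and end positions becomes one presence/absence character, that a sequence with a longer gap spanning that region is scored as inapplicable (“-”), and that leading and trailing gaps are treated as missing data and not coded.

```python
import re


def internal_gap_runs(seq):
    """Return (start, end) columns of each gap run, skipping terminal gaps."""
    runs = [(m.start(), m.end() - 1) for m in re.finditer(r"-+", seq)]
    # Assumption: leading/trailing gaps are missing data, not indel events.
    return [(s, e) for s, e in runs if s > 0 and e < len(seq) - 1]


def simple_indel_coding(alignment):
    """Code each distinct gap (unique start and end) as one binary character."""
    runs_by_taxon = {name: internal_gap_runs(seq) for name, seq in alignment.items()}
    characters = sorted({run for runs in runs_by_taxon.values() for run in runs})
    coded = {}
    for name, runs in runs_by_taxon.items():
        states = []
        for start, end in characters:
            if (start, end) in runs:
                states.append("1")  # this exact gap is present
            elif any(g_start <= start and g_end >= end for g_start, g_end in runs):
                states.append("-")  # a longer gap spans it: inapplicable
            else:
                states.append("0")  # gap absent
        coded[name] = "".join(states)
    return characters, coded


alignment = {  # hypothetical toy alignment
    "taxonA": "ACGTACGTAC",
    "taxonB": "AC--ACGTAC",
    "taxonC": "AC----GTAC",
    "taxonD": "ACGTAC--AC",
}
characters, coded = simple_indel_coding(alignment)
print(characters)
for name, states in coded.items():
    print(name, states)
```

Each distinct gap adds one character to the matrix regardless of its length, which is how a 709 nt alignment can gain a block of binary gap characters appended after the nucleotide columns.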

Table 2.3. Partitions, models (for Bayesian analyses), and matrix statistics for A39 and Chs matrices

Region          Partition  Model   Terminals  Aligned  Alignment  Missing   Variable
                                   w/ data    length   gaps (%)   data (%)  sites/PICS
A39, exon 2     1-150      GTR+G   151        150      0.2        1.0       53/39
A39, intron 2   151-303    GTR+G   151        153      46.8       0.0       58/45
A39, exon 3     304-372    GTR+I   151        69       1.4        0.0       26/16
A39, intron 3   373-648    JC      151        276      66.2       2.0       90/62
A39, exon 4     649-709    GTR+G   146        61       0.0        21.9      23/12
A39, gaps       710-813    binary  151        104      -          -         103/76
Chs, exon 2     1-487      K80+G   168        487      0.2        0.3       356/302

Parsimony and Bayesian analyses were performed separately on the A39 and Chs matrices following the same methods and using the same settings described for the 3CP and ITS datasets in Chapter 1 (including methods and settings for the treatment of gap characters in Bayesian analysis). The A39 matrix was partitioned by exonic and intronic regions and gaps for model estimation and analysis, and the first Chs matrix was partitioned by coding region (exon 2) and gaps (see Table 2.3). Models for DNA partitions were estimated as described in Chapter 1. A39 trees were rooted with Garuleum pinnatifidum, and Chs trees were rooted with Solanum tuberosum.
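The partition boundaries in Table 2.3 can be checked for consistency with the matrix dimensions given in the text (709 aligned nucleotides plus 104 gap characters). The following Python sketch is illustrative only; the partition coordinates come from Table 2.3, while the variable and function names are hypothetical.

```python
# (name, first column, last column), as listed in Table 2.3 for the A39 matrix
A39_PARTITIONS = [
    ("exon 2", 1, 150),
    ("intron 2", 151, 303),
    ("exon 3", 304, 372),
    ("intron 3", 373, 648),
    ("exon 4", 649, 709),
    ("gaps", 710, 813),
]


def check_contiguous(partitions):
    """Raise if partitions overlap or leave columns out; return the last column."""
    expected_first = 1
    for name, first, last in partitions:
        if first != expected_first or last < first:
            raise ValueError(f"partition {name!r} does not start at column {expected_first}")
        expected_first = last + 1
    return expected_first - 1


lengths = {name: last - first + 1 for name, first, last in A39_PARTITIONS}
dna_length = sum(n for name, n in lengths.items() if name != "gaps")
print(check_contiguous(A39_PARTITIONS), dna_length, lengths["gaps"])
```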

3. Results

3.1. Phylogenetic analyses of A39

Parsimony analyses of the A39 matrix resulted in exactly 24,000 trees with a length (L) of 641 steps, a consistency index (CI) of 63, and a retention index (RI) of 95. The strict consensus tree had L = 674, CI = 60, and RI = 94. Both the 50% majority rule tree produced by Bayesian analysis (depicted in black in Fig. 2.3) and the strict consensus parsimony tree (depicted in gray in Fig. 2.3) show A39 sequences from Calendula resolved into two distinct clades, each with strong support (the first with parsimony bootstrap support (BS) of 99 and Bayesian posterior probability (PP) of 1 and the second with BS = 93/PP = 1). Sequences belonging to both clades were recovered from all species sampled except C. tripterocarpa. From these results, two paralogs of A39 (A39c1 and A39c2) were inferred in Calendula corresponding to the two clades. Together, all A39c1 and A39c2 sequences from Calendula were monophyletic relative to those from outgroup species (BS = 97/PP = 1). Neither Garuleum nor Dimorphotheca shared the duplication of A39 with Calendula. Reliable sequences could not be obtained for more closely related outgroups in Osteospermum and Tripteris.
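For readers less familiar with the tree statistics reported here, the ensemble consistency and retention indices can be computed as in the following Python sketch. The CI and RI values in the text appear to be expressed on a 0 to 100 scale. The step counts used below are hypothetical and chosen only to make the arithmetic easy to follow; they are not values from the A39 matrix.

```python
def consistency_index(min_steps, observed_steps):
    """CI = minimum possible steps / steps observed on the tree."""
    return min_steps / observed_steps


def retention_index(min_steps, observed_steps, max_steps):
    """RI = (max possible steps - observed steps) / (max - min possible steps)."""
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Hypothetical step counts summed over all characters (not from this study).
m, s, g = 12, 16, 36
print(round(100 * consistency_index(m, s)), round(100 * retention_index(m, s, g)))
```

A longer tree, such as a strict consensus, increases the observed number of steps while the minimum and maximum stay fixed, so both indices can only stay the same or decrease, as seen in the values reported above.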

Bayesian and parsimony topologies within A39c1 were congruent although the Bayesian topology was more resolved. Within A39c2, the parsimony topology differed from the Bayesian topology in three ways: 1) in the parsimony tree, C. stellata sequences within this clade were sister to a clade with good support (BS = 81) comprising sequences from C. incana subsp. algarbiensis, C. palaestina, and C. pachysperma, but in the Bayesian tree, sequences from all four were in a polytomy; 2) in both analyses, sequences from C. arvensis and C. tripterocarpa together appeared in two distinct lineages (the second of which also included sequences from C. suffruticosa subsp. fulgida), but the position of a single clone recovered from C. arvensis 452 differed between analyses: parsimony analysis placed it as sister to clade “Ctri A” (Fig. 2.3; without support) while Bayesian analysis placed it within clade “Ctri B” (Fig. 2.3; PP = 1); and 3) in parsimony analysis, Ctri A plus the sequence from C. arvensis 452 were sister to the clade containing Ctri B and clade “Cste+ME” (without support), while in Bayesian analysis, Ctri A was sister to Cste+ME with strong support (PP = 1).

Figure 2.3. A39 cladogram showing the Bayesian topology in black and differences in the parsimony topology in gray. One terminal (C. arvensis 452, represented by the single clone “d10”) changed clade membership depending on the analysis (indicated with a gray box and arrow). The two copies of A39 in Calendula are labelled A39c1 and A39c2. Above branches, numbers to the left of the slash indicate parsimony bootstrap support (BS), and plus signs to the right of the slash indicate Bayesian posterior probabilities (PP), with “+” = 0.90 – 0.99, and “++” = 1. BS less than 50 and PP less than 0.90 are not shown. A minus sign “-” on either side of the slash (e.g., “-/++”) indicates no support in that analysis, despite support in the other. Clades are shown as triangles when they contained multiple sequences from a single individual, or multiple interdigitated sequences from multiple individuals. Numbers after taxon names indicate individuals. Codes inside parentheses immediately after some individuals indicate the following: a letter “a” through “h” followed by a number one through twelve (e.g., h7) indicates a unique sequence recovered only once from a single clone (and corresponds to position on a sequencing plate); if a sequence was recovered more than once from an individual, the letter “x” followed by a number (e.g., x2) indicates the number of times the sequence was recovered; the letters “ds” indicate that the sequence is the result of direct sequencing (i.e., not from cloning). Five diploid and one polyploid species putatively and variously involved in the parentage of all other species are highlighted as follows: C. stellata is in orange; the four Moroccan endemic species are in blue; C. tripterocarpa is in pink. Subclades indicated at right could represent additional duplications or allelic diversity within each copy.

A39c1 sequences formed three subclades (Cste A, Cste B, and Moroccan endemics (MEs); Fig. 2.3) with clades Cste A and Cste B each containing a sequence from C. stellata. Clade Cste A also contained putative orthologs from all polyploid taxa except C. tripterocarpa (sequences of which were not recovered for A39c1), and Cste B contained additional sequences from the high polyploid species. Clade MEs contained all A39c1 sequences recovered for the Moroccan endemic species as well as sequences from each of the perennial polyploid taxa and C. officinalis. Five A39c2 subclades were present in both parsimony and Bayesian analyses (Cste A, MEs, Ctri A, Cste+ME, and Ctri B). The first four subclades were also supported in both analyses, while Ctri B was only supported in the Bayesian analysis. C. stellata sequences again appeared in two subclades, the first (in Cste A) with sequences from perennial and annual polyploids, and the second (in Cste+ME) with perennial polyploids and one sequence from the Moroccan endemic C. eckerleinii. Calendula tripterocarpa sequences also appeared in two subclades (Ctri A and Ctri B), and C. arvensis individuals had sequences that were included in one or both of these subclades. Sequences from the Moroccan endemic species C. maroccana and C. meuselii formed a clade (clade MEs) with a sequence from C. officinalis. Table 2.4 summarizes clade membership of all A39c1 and A39c2 sequences retrieved from each individual.

Table 2.4. Clade membership of A39 copies by species and individual (if more than one individual was sampled for a given species). Clades correspond to those named in Fig. 2.3. Sequences not part of these clades are included in the last column (i.s. = incertae sedis).

A39c1 A39c2
Species or ID 2n Cste A Cste B MEs … Cste A MEs Ctri A Cste+ME Ctri B i.s.
C. pachysperma 85 x x x
C. palaestina 85 x x x
C. arvensis 44 x x x x …
  Carv5 44 x x x
  Carv6 44 x x x x
  Carv8 44 x x x
  Carv149 44 x x x
  Carv234 44 x x x
  Carv450 44 x
  Carv452 44 x
  Carv470 44 x
  Carv477 44 x x x
C. tripterocarpa 30 x x
C. stellata 14 x x x x
  Cste10 14 x x
  Cste18 14 x
  Cste321 14 x x
C. eckerleinii 18 x x
  Ceck15 18 x
  Ceck287 18 x
  Ceck3064 18 x x
C. lanzae 18 x x
  Clan22 18 x x
  Clan24 18 x x
C. maroccana 18 x x
  Cmar13 18 x
  Cmar14 18 x
  Cmar165 18 x x
  Cmar213 18 x x
C. meuselii 18 x x
C. incana 32 x x x x
  Cialg145 32 x x x
  Cimad142 32 x x x
  CimadKEW 32 x
  Cimar1 32 x
C. officinalis 32 x x
C. suffruticosa 32 x x x x
  Csf166 32 x x x x
  Css3038 32 x x

Sequences from any given diploid individual (with 2n=14 or 18 chromosomes) never belonged to more than one lineage within A39c1 or A39c2. However, different individuals of some diploid species did appear in different lineages. Sequences from polyploid individuals (including C. tripterocarpa), on the other hand, sometimes did appear in different lineages within one or both A39 clades (Table 2.4). One exception was C. officinalis, sequences of which belonged to a single lineage within each A39 clade.
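The comparison described above amounts to asking, for each individual and each A39 copy, how many subclades its sequences occupy. A minimal Python sketch of that bookkeeping follows. The records are hypothetical placeholders, not the data in Table 2.4, and the function names are illustrative only.

```python
from collections import defaultdict


def lineages_per_individual(records):
    """Map (individual, copy) to the set of subclades in which its sequences fell."""
    membership = defaultdict(set)
    for individual, copy, subclade in records:
        membership[(individual, copy)].add(subclade)
    return membership


def multi_lineage_individuals(records):
    """Return (individual, copy) pairs whose sequences span more than one subclade."""
    membership = lineages_per_individual(records)
    return sorted(key for key, clades in membership.items() if len(clades) > 1)


records = [  # hypothetical placeholder data, one tuple per recovered sequence
    ("diploid_1", "A39c1", "MEs"),
    ("diploid_1", "A39c1", "MEs"),
    ("diploid_1", "A39c2", "MEs"),
    ("polyploid_1", "A39c1", "Cste A"),
    ("polyploid_1", "A39c1", "MEs"),
    ("polyploid_1", "A39c2", "Ctri B"),
]
print(multi_lineage_individuals(records))
```

An individual flagged by such a check carries sequences from more than one lineage within a single copy, the pattern reported here for some polyploids but for no diploid.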

3.2. Phylogenetic analyses of Chs

Parsimony analysis of the Chs matrix produced 1080 trees with L = 1766, CI = 36, and RI = 87 and a strict consensus tree (Fig. 2.4) with L = 1782, CI = 36, and RI = 87. The strict consensus parsimony tree (Fig. 2.4) and the 50% majority rule tree produced by Bayesian analysis (Fig. 2.5) differed from each other in many respects.

Analysis of Chs sequences from Calendula together with those from several studies that explored Chs variation and divergence in the Compositae (Helariutta et al. 1995, 1996; Yang et al. 2002; Álvarez et al. 2008; Dillenberger and Kadereit 2013) allowed an assessment of correspondence between Chs paralogs inferred in the previous and present studies. Three paralogs from Gerbera (Mutisieae; Gchs1, Gchs2, and Gchs3; Helariutta et al. 1995, 1996), three clades of sequences from Chrysanthemum (Anthemideae) that corresponded to the three paralogs from Gerbera (here named Y1, Y2, and Y3; Yang et al. 2002), and three clades of sequences from Senecioneae, Cichorieae, Astereae, and Heliantheae inferred to represent three Chs paralogs (here named A1, A2, and A3; Álvarez et al. 2008) are indicated in Figs. 2.4 and 2.5, though some were no longer monophyletic in one or both analyses. Tribal relationships recovered in the Chs analyses were compared to the expected tribal relationships following Funk et al. (2009; see Fig. 2.6), and when they were incongruent, additional paralogs were inferred. Correspondence between paralogs inferred from parsimony and Bayesian analyses is indicated in Figs. 2.4 and 2.5. Eleven paralogs were inferred from the parsimony analysis (c1 – c11, Fig. 2.4) and nine were inferred from the Bayesian analysis (c1, c3, c4, c5, c7 – c10, and c12). All of the lineages corresponding to the newly inferred paralogs (c1 – c12) were supported in both analyses except for c10, which was unsupported regardless of whether Lactuca sativa was included (parsimony analysis) or excluded (Bayesian analysis).

Figure 2.4. Strict consensus tree resulting from parsimony analysis of the Chs matrix. Copy numbers c1 through c11 indicate the minimum number of paralogs inferred from this analysis based on expected tribal relationships in Compositae (following Funk et al. 2009; Fig. 2.6). Correspondence of these copies to paralogs proposed in the literature is indicated as follows: Y1, Y2, and Y3 indicate three clades of Chs and Chs-like genes corresponding to those recovered by Helariutta et al. (1996; therein characterized by Gerbera genes Gchs1, Gchs2, and Gchs3) and by Yang et al. (2002; therein labeled “SF1,” “SF2,” and “SF3”). Y3 and Y1 are here rendered polyphyletic (p. p. = pro parte) by inclusion of new sequences and evidence of further duplications. A1, A2, and A3 indicate three paralogous clades recovered by Álvarez et al. (2008). Callistephus chinensis was part of the A1 clade in their analysis. Some clades have been collapsed to triangles as in Fig. 2.3, BS values ≥ 50 are shown above branches, and conventions for highlighting and naming terminals are like those in Fig. 2.3 with some additions. The color bar to the right of terminal names indicates tribal membership of each genus and corresponds to color codes for tribes in Fig. 2.6. Sequences downloaded from GenBank are preceded by symbols indicating their sources as follows: ¥ = Jeon et al. (1996); ¥¥ = Helariutta et al. (1996); * = Helariutta et al. (1995); ** = Yang et al. (2002); *** = Yang et al. (2002), GenBank accessions AF511459-60, AF511463-67, AF511469-70, AF511472-74; ‡ = Henkel et al. (unpublished) in Yang et al. (2002); Φ = Álvarez et al. (2008), GenBank accessions EF128532-35, EF128537-39, EF128543-46, EF128548-50, EF128553-54, EF128556-70, EF128573-76, EF128578, EF128581, EF128585-87, EF128589-92, EF128594-96; £ = Dillenberger and Kadereit (2013), GenBank accessions KC784375-KC78421 (“other Senecioninae” include species in Iranecio, Pojarkovia, and Caucasalia).

Figure 2.5. 50% majority rule tree resulting from Bayesian analysis of the Chs matrix. PP values = 1.0 (++) and <1.0 but >0.9 (+) are shown above branches. Conventions for highlighting and naming of terminals, as well as for indicating GenBank sequences, are like those in Fig. 2.4. Inferred paralogs are numbered relative to those in Fig. 2.4 for ease of reference. Correspondence of inferred paralogs to paralogs in the literature is similar to that in Fig. 2.4, but the membership of sequences in some paralogs has changed from the parsimony analysis as follows: c1 + c2 in Fig. 2.4 is now a single copy (c1); c4 + c6 in Fig. 2.4 is now a single copy (c4); c10 in Fig. 2.4 no longer includes Lactuca sativa (c12); c8 + c11 in Fig. 2.4 is now a single copy (c8).

Figure 2.6. Expected phylogeny for Chs sequences in Figs. 2.4 and 2.5 based on tribal relationships in Funk et al. (2009). Color-coding for tribes corresponds to the color bars showing tribal membership in Figs. 2.4 and 2.5.

Both analyses recovered Calendula sequences in the same four lineages (c1, c3, c9, and the Major Calendula Clade (MCC) in Figs. 2.4 and 2.5). In the parsimony analysis, all the sequences in the MCC shared (or were derived from) a paralog unique to the genus (c11), whereas in the Bayesian analysis, the Calendula sequences shared (or were derived from) the Gerbera paralog Gchs1 (c8). However, the relationship of the MCC to either c10+c9 (parsimony analysis) or c8 (Bayesian analysis) was unsupported. The clade of Calendula sequences in c1 was sister to the sequence from Gerbera hybrida in parsimony analysis, but sister to a clade of Chrysanthemum sequences in Bayesian analysis. Therefore, in parsimony analysis, the Chrysanthemum clade was considered to represent a separate Chs paralog (c2) and Y3 was considered to represent two paralogs (c1 and c2), while in Bayesian analysis the Chrysanthemum clade was considered to represent the same paralog as the Gerbera and Calendula sequences (c1) and Y3 was considered to represent a single paralog (c1). Y2 corresponded to a single paralog (c3) in both analyses. Y1 was shown to represent two paralogs (c4 and c8) in both analyses, although the first (c4) was only considered to be shared by a clade of Senecioid genera (A1, here inferred to be shared by “Adenostyles and other Senecioninae” from Dillenberger and Kadereit 2013) in the Bayesian analysis. A1 included the sequence from Callistephus chinensis in Álvarez et al. (2008), but, in the present analyses, this sequence could not be considered to represent the same paralog as other sequences in the A1 lineage due to its position relative to sequences from Chrysanthemum in both analyses. Lineage c7 was identical to A2 in both analyses, but its position differed. It was weakly supported (BS = 63) as sister to the clade comprising c8, c9, c10, and c11 in parsimony analysis and was in a polytomy with the clade containing all other ingroup sequences in Bayesian analysis. Lineage c10 was identical to A3 in parsimony analyses but excluded Lactuca sativa in Bayesian analysis. The sequences from Lactuca sativa were considered to represent a separate paralog (c12) in Bayesian analysis.

Six Calendula sequences (one from each species sampled) formed a clade with high support (BS = 99/PP = 1) within c1 as described above. Subclades divided sequences of the annual polyploids (C. tripterocarpa and C. arvensis) from those of perennial diploids (C. eckerleinii and C. meuselii) and polyploids (C. suffruticosa and C. officinalis). Within lineage c3, a single Calendula sequence (recovered twice from C. meuselii) was sister to two Chrysanthemum sequences, and all three were sister to a clade of two Gerbera sequences, one of which was Gchs2 (confirmed to have pyrone synthase activity by Eckermann et al. 1998). Three sequences from C. tripterocarpa formed a clade either sister to c10 (parsimony analysis; without support) or to c8 (Bayesian analysis; without support) and represented paralog c9 in both analyses. The majority of Calendula sequences formed the MCC, which was supported in both analyses (BS = 93/PP > 0.90). Within the MCC, at least three lineages in each analysis comprised sequences from many if not all species sampled, and most individuals (including diploids) had sequences that fell into two or more of these lineages. This suggests the presence of additional Chs paralogs within this clade, although some sequences recovered may represent variation at single loci (heterozygosity).

4. Discussion

4.1. A39

Although A39 was a single-copy locus in Carthamus tinctorius (Cardueae; Chapman and Burke 2007b) and in all sampled genera of Gnaphalieae in Smissen et al. (2011), two distinct copies were recovered even from diploid individuals of Calendula. The individual of Garuleum pinnatifidum (Calenduleae) sampled for this study also had a single copy of A39 (a single, invariant sequence was recovered from this individual repeatedly). Sequences recovered from Dimorphotheca, here including Osteospermum fruticosum (≡ D. fruticosa) and O. barberiae (≡ D. barberiae), were more variable and could, potentially, represent more than one copy (though more extensive sequencing would be necessary to confirm this). However, all sequences from Dimorphotheca formed a well-supported clade sister to Calendula, so Dimorphotheca did not share the duplication seen in Calendula. Reliable sequence data were not obtained for closer outgroup species in Osteospermum and Tripteris, but future sampling of A39 from these genera would help determine where the duplication of A39 occurred, either in the ancestor of Calendula or in the ancestor of Calendula plus closely related taxa in Calenduleae. This could also help determine the nearest relative to Calendula in the tribe.

Sequence variation within individuals and species in A39c1 suggested both heterozygosity and duplications resulting from allopolyploidy. For example, there was considerable intra-individual variation in Moroccan endemic species (all of which are diploids), but sequences from any given diploid individual never belonged to more than one lineage, suggesting recent divergence of these sequences. Sequences were more diverged across different individuals of the same species, and especially across individuals from different populations. For example, sequences from C. maroccana 165 and 213, individuals from the same population, formed a clade, whereas the sequence from C. maroccana 14, from a different population, was more distantly related. Likewise, two sequences recovered from two different individuals of C. stellata from different populations occupied different positions within A39c1. The polytomy, or “comb,” of sequences obtained from C. eckerleinii 3064 was suggestive of allelic recombination (see Doyle 1995) either in vivo or via PCR. An overarching pattern was clear, however. All sequences from the Moroccan endemics were in one clade (MEs) and the two sequences from C. stellata were in separate clades (Cste A and Cste B, with Cste B more closely related to MEs). All A39c1 sequences recovered from annual polyploids and at least one sequence from most perennial polyploids (excluding only C. officinalis and C. incana subsp. maritima) appeared in one or both of Cste A and Cste B, and at least one sequence from all perennial polyploids appeared in MEs. A39c1 sequences were not recovered from C. tripterocarpa, so A39c1 could not provide evidence for or against the contribution of C. tripterocarpa to the annual polyploids or for a close relationship of C. stellata to C. tripterocarpa, but the patterns observed were consistent with the hypotheses that C. stellata contributed its genome to all polyploid species and also that the Moroccan endemics contributed their genome to all perennial polyploids. Due to lack of resolution in the tree, it was not clear whether sequences from C. eckerleinii, C. maroccana, or C. meuselii were most closely related to those of the perennial polyploids. Sequences from C. lanzae were more distantly related than those from the other three Moroccan endemics, making it unlikely that C. lanzae was involved in the founding of the perennial polyploids.

In general, variation within A39c2 followed a similar pattern. Again, sequences from C. stellata appeared in two clades (Cste A and Cste B, with Cste B more closely related to MEs). All but one sequence from annual polyploids and all but one sequence from perennial polyploids appeared in either Cste A or Cste B, again providing evidence for the contribution of C. stellata to all polyploid taxa. In contrast to the A39c1 topology, the majority of sequences appeared in Cste B instead of Cste A, and sequences from the high polyploids appeared in Cste A instead of Cste B. This swapping, between loci, of relationships to C. stellata from one population or another (C. stellata 10 and 18 vs. C. stellata 321) was suggestive of multiple origins of polyploid species, which was consistent with patterns recovered in chloroplast and ITS analyses (Chapter 1). Sequences of C. tripterocarpa belonged to two different clades within Cste B (Ctri A and Ctri B), and sequences belonging to one or both of these clades were also recovered from all C. arvensis individuals, supporting hypotheses of origin of C. arvensis from C. tripterocarpa. The presence of the two clades was suggestive of fixed heterozygosity in the two related polyploid species. All but one sequence from Moroccan endemic species appeared in MEs together with a sequence from C. officinalis (the only representative of the perennial polyploids in that clade, but still suggestive of contribution of the Moroccan endemics to taxa with 2n = 32 chromosomes), and C. lanzae was more distantly related to C. maroccana and C. meuselii. However, the presence of one sequence from the annual high polyploid C. pachysperma in the MEs clade and of one sequence of the Moroccan endemic C. eckerleinii in Cste B contradicted expectations. The positions of those sequences could not be easily explained without further sampling of A39c2, but were suggestive of further duplications within the A39c2 locus.

4.2. Chs

The Chs sequences explored in this study are part of a large, multi-gene family of Chs and Chs-like genes (polyketide synthases), the evolution of which has been studied extensively in plants and even across kingdoms (e.g., in bacteria, fungi, and plants in Abe and Morita 2010). Helariutta et al. (1996) and later Yang et al. (2002) both proposed expansion of the gene family via duplication and divergence of copies. Specifically, Helariutta et al. (1996), in their analysis of Chs sequences from several members of asterid families, showed that Gchs2 (one of three copies characterized from Gerbera, and differing from the other two in that it codes for pyrone rather than chalcone synthase; see Eckermann et al. 1998) evolved from Chs after duplication in the ancestor of Compositae. Yang et al. (2002) observed duplications in Chrysanthemum (Anthemideae) that were shared with those seen previously in Gerbera. Álvarez et al. (2008) designed primers to target one of the three Gerbera copies (Gchs1), but still recovered additional paralogs across Senecioneae and three other tribes in the Compositae. It was not surprising that the present study recovered evidence of duplications in Calendula as well. Indeed, Dillenberger and Kadereit’s (2013) study of Adenostyles and closely related outgroups in Senecioneae, subtribe Senecioninae, should be considered unique among these studies in that the authors recovered a single copy from all sampled taxa, with extremely low variation both within and among individuals and species. The copy they sampled was shown in the present study to be the same copy present in several other genera of Senecioneae and possibly also Chrysanthemum (c6 in parsimony analysis, c4 in Bayesian analysis).

The addition of Chs sequences from Yang et al. (2002), Álvarez et al. (2008), and Dillenberger and Kadereit (2013) to phylogenetic analyses of Chs sequences from Calendula provided a context that not only facilitated interpretation of the variation in Calendula but also expanded understanding of the variation in other genera and tribes in the Compositae. A minimum of four Chs paralogs was inferred in Calendula, two of which were shared with Gchs2 and Gchs3 characterized in Gerbera in both analyses (c1 and c3). The presence of two (Bayesian analysis) or more (parsimony analysis) lineages within the MCC, each comprising sequences from all sampled taxa, suggested the presence of additional duplications within this clade, unique to Calendula based on the present study, but possibly closely related to Gchs1 from Gerbera. These duplications could not be attributed to polyploidy within Calendula because multiple sequences even from diploid individuals appeared in two or more distinct lineages. However, such paralogs could have been the result of earlier polyploidization events or single-gene duplications before the divergence of Calendula.

Relationships among Chs sequences in Calendula were interpreted with caution because sampling of Chs and Chs-like genes from Calendula was not exhaustive. Nevertheless, close relationships of C. arvensis to C. tripterocarpa and of the Moroccan endemics to perennial polyploids were both evident in clade c1, and the contribution of C. stellata to the genomes of polyploid taxa was reiterated in multiple lineages within the MCC.


Extensive duplication of Chs genes in Calendula, as well as the implications for lineage-specific evolution of these genes in Compositae (e.g., one copy with low variation in Adenostyles vs. multiple copies with high variation in Calendula), was compelling, and highlighted both the promise and the frustration of working with LCN markers. High copy number and lack of specificity of primers in Calendula made Chs unwieldy for phylogenetic analysis, but, within the context of prior studies, provided evidence of more duplications across these taxa than previously characterized. These duplications were also interesting in the context of polyploidy.

There is a large and growing body of literature on relatively recent polyploid events and polyploid complexes in various genera of Compositae (e.g., Bryce et al. 2012; Ramsey et al. 2003; Pellicer et al. 2010; Soltis et al. 2004), but there is also evidence of older events during the diversification of the family. Barker et al. (2008) found evidence for paleopolyploidization events at the base of the family as well as on the branches leading to Mutisieae and Heliantheae, the latter of which they suggested could explain, at least in part, the expansion of floral symmetry genes in Heliantheae observed in a related study (Chapman et al. 2008; ten copies of CYC/TB1 genes in Heliantheae compared to one to five in all other plants observed).

Paleopolyploidization on the branch leading to Mutisieae does not explain the duplicated Chs genes in Gerbera (Mutisieae), since relatives of all three copies were found in Chrysanthemum (Anthemideae; Yang et al. 2002) and Calendula (Calenduleae; this study). These copies are more likely related to the older event at the base of the family. Future work would require the addition of more taxa and genomic sequencing to fully explore evolution of this gene family in Calendula and elsewhere in Compositae.


4.3. Implications for relationships in Calendula

Analyses of A39 and Chs sequences from Calendula presented here complemented previous work on ITS and chloroplast sequences (Chapter 1). Taken all together, these analyses supported a division of Calendula into two polyploid complexes, as previously hypothesized (Fig. 2.1a), one consisting of perennial polyploids (C. incana, C. officinalis, and C. suffruticosa) with paternal contributions from a Moroccan endemic species (any of C. eckerleinii, C. maroccana, and C. meusellii, but not C. lanzae) and chloroplast contributions (assumed to be maternally transmitted) from C. stellata (sex of progenitors inferred from chloroplast data; see Chapter 1), and the other consisting of annual polyploids formed by multiple, reciprocal crosses between C. stellata and C. tripterocarpa (to form C. arvensis) and between C. arvensis and C. tripterocarpa (and possibly also C. stellata) to form the high polyploids C. pachysperma and C. palaestina.

The diploid progenitors of C. tripterocarpa (2n = 30), itself an annual polyploid, remained a mystery, although most analyses suggested a closer relationship of this species to C. stellata than to the Moroccan endemic species. It was not closely related to the perennial polyploids, to which it is closest in chromosome number, but neither was there strong evidence of an autopolyploid origin (with dysploidy) from C. stellata. Nevertheless, it was certainly part of the annual polyploid complex. A pictorial summary of new hypotheses, taking into account evidence from chloroplast, ITS, A39, and Chs markers, is shown in Fig. 2.7. These two complexes therefore represent independent replicates of polyploidization within which to test hypotheses of the effects of polyploidy on genetic and phenotypic evolution (e.g., the effects of polyploidy on evolution of chemical diversity in Calendula; see Chapter 3).


Figure 2.7. Pictorial summary of major hypotheses of species origins in Calendula based on evidence from three chloroplast regions, ITS, and two low-copy nuclear regions (A39 and Chs). Conventions for depicting lineages, capitula size and color, hybridization, and polyploidization events are as described in Chapter 1, Figs. 1 and 7, except that the question mark and crooked line to C. pachysperma depict the possibility of an as-yet-unknown additional contributor to this species.


REFERENCES

Álvarez, I., A. Costa, and G. N. Feliner. 2008. Selecting Single-Copy Nuclear Genes for Plant Phylogenetics: A Preliminary Analysis for the Senecioneae (Asteraceae). Journal of Molecular Evolution 66: 276-291.

Álvarez, I. and J. F. Wendel. 2003. Ribosomal ITS sequences and plant phylogenetic inference. Molecular phylogenetics and evolution 29: 417-434.

Bailey, C. D. and J. J. Doyle. 1999. Potential phylogenetic utility of the low-copy nuclear gene pistillata in dicotyledonous plants: comparison to nrDNA ITS and trnL intron in Sphaerocardamum and other Brassicaceae. Molecular Phylogenetics and Evolution 13: 20- 20.

Bombarely, A., J. E. Coate, and J. J. Doyle. 2014. Mining transcriptomic data to study the origins and evolution of a plant allopolyploid complex. PeerJ 2: e391. Published online: http://dx.doi.org/10.7717/peerj.391

Chapman, M. A. and J. M. Burke. 2007a. DNA sequence diversity and the origin of cultivated safflower (Carthamus tinctorius L.; Asteraceae). BMC Plant Biology 7: 60-60.

Chapman, M. A. and J. M. Burke. 2007b. Genetic divergence and hybrid speciation. Evolution 61: 1773-1780.

Chapman, M. A., J. Chang, D. Weisman, R. V. Kesseli, and J. M. Burke. 2007. Universal markers for comparative mapping and phylogenetic analysis in the Asteraceae (Compositae). Theoretical and Applied Genetics 115: 747-755.

Chapman, M. A., J. H. Leebens-Mack, and J. M. Burke. 2008. Positive selection and expression divergence following gene duplication in the sunflower CYCLOIDEA gene family. Molecular Biology and Evolution 25: 1260-1273.

Dillenberger, M. S. and J. W. Kadereit. 2013. The phylogeny of the European high mountain genus Adenostyles (Asteraceae-Senecioneae) reveals that edaphic shifts coincide with dispersal events. American Journal of Botany 100: 1171-1183.


Davey, J. W., P. A. Hohenlohe, P. D. Etter, J. Q. Boone, J. M. Catchen, and M. L. Blaxter. 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics 12: 499-510.

Doyle, J. 1995. The irrelevance of allele tree topologies for species delimitation, and a nontopological alternative. Systematic Botany 20: 574-588.

Doyle, J. J. and J. L. Doyle. 1999. Nuclear protein-coding genes in phylogeny reconstruction and homology assessment: some examples from Leguminosae. In: Molecular Systematics and Plant Evolution (P. Hollingsworth, R. Bateman, and R. Gornall, Eds.), London: Taylor and Francis.

Doyle, J. J. 2013. The promise of genomics for a "next generation" of advances in higher-level legume molecular systematics. South African Journal of Botany 89: 10.

Eckermann, S., G. Schröder, J. Schmidt, D. Strack, R. A. Edrada, Y. Helariutta, P. Elomaa, M. Kotilainen, I. Kilpeläinen, P. Proksch, T. H. Teeri, and J. Schröder. 1998. New pathway to polyketides in plants. Nature 396: 387-390.

Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32: 1792-1797.

Egan, A. N., J. Schlueter, and D. M. Spooner. 2012. Applications of next-generation sequencing in plant biology. American Journal of Botany 99: 175-185.

Feliner, G. N. and J. A. Rosselló. 2007. Better the devil you know? Guidelines for insightful utilization of nrDNA ITS in species-level evolutionary studies in plants. Molecular Phylogenetics and Evolution 44: 911-919.

Friar, E. A., L. M. Prince, J. M. Cruse-Sanders, M. E. McGlaughlin, C. A. Butterworth, and B. G. Baldwin. 2008. Hybrid origin and genomic mosaicism of Dubautia scabra (Hawaiian Silversword Alliance; Asteraceae, Madiinae). Systematic Botany 33: 589-589.

Funk, V. A., A. Susanna, T. F. Stuessy, and R. J. Bayer. 2009. Systematics, Evolution, and Biogeography of Compositae. International Association for Plant Taxonomy: Vienna.


Funk, V. A., R. J. Bayer, S. Keeley, R. Chan, L. Watson, B. Gemeinholzer, E. Schilling, J. L. Panero, B. G. Baldwin, N. T. Garcia-Jacas, A. Susanna, and R. K. Jansen. 2005. Everywhere but Antarctica: Using a supertree to understand the diversity and distribution of the Compositae. In: Friis, I. and H. Balslev [eds.], Proceedings of a Symposium on Plant Diversity and Complexity Patterns - Local, Regional and Global Dimensions. The Royal Danish Academy of Sciences and Letters, Copenhagen. Biologiske Skrifter 55: 343-374.

Helariutta, Y., P. Elomaa, M. Kotilainen, R. J. Griesbach, J. Schröder, and T. H. Teeri. 1995. Chalcone synthase-like genes active during corolla development are differentially expressed and encode enzymes with different catalytic properties in Gerbera hybrida (Asteraceae). Plant Molecular Biology 28: 47-60.

Helariutta, Y., M. Kotilainen, P. Elomaa, N. Kalkkinen, K. Bremer, T. H. Teeri, and V. A. Albert. 1996. Duplication and functional divergence in the chalcone synthase gene family of Asteraceae: evolution with substrate change and catalytic simplification. Proceedings of the National Academy of Sciences of the United States of America 93: 9033-9038.

Heyn, C. C. and A. Joel. 1983. Reproductive relationships between annual species of Calendula (Compositae). Plant Systematics and Evolution 143: 311-329.

Heyn, C. C., O. Dagan, and B. Nachman. 1974. The annual Calendula species: taxonomy and relationships. Israel Journal of Botany 23: 169-201.

Hughes, C. E., R. J. Eastwood, and C. D. Bailey. 2006. From famine to feast? Selecting nuclear DNA sequence loci for plant species-level phylogeny reconstruction. Philosophical Transactions: Biological Sciences 361: 211-225.

Ilut, D. C. and J. J. Doyle. 2012. Selecting nuclear sequences for fine detail molecular phylogenetic studies: a computational approach. Systematic Botany 37:7-14.

Jeon, J. H., H. S. Kim, K. H. Choi, Y. H. Joung, H. Joung, and S. M. Byun. 1996. Cloning and characterization of one member of the chalcone synthase gene family from Solanum tuberosum L. Bioscience, Biotechnology, and Biochemistry 60: 1907-1910.

Ma, Y., X. Fang, F. Chen, and S. Dai. 2008. DFL, a FLORICAULA/LEAFY homologue gene from Dendranthema lavandulifolium is expressed both in the vegetative and reproductive tissues. Plant Cell Reports 27: 647-654.

Mallet, J. 2007. Hybrid speciation. Nature 446: 279-283.


Müller, K. 2006. Incorporating information from length-mutational events into phylogenetic analysis. Molecular Phylogenetics and Evolution 38: 667-676.

Müller, K. 2005. SeqState - primer design and sequence statistics for phylogenetic DNA data sets. Applied Bioinformatics 4: 65-69.

Nixon, K. C. 1999. Winclada (BETA) ver. 0.9.9. Ithaca, NY: Published by the Author.

Nordenstam, B. and M. Källersjö. 2009. Calenduleae. Pp. 527-538 in Systematics, Evolution, and Biogeography of Compositae, eds. V. A. Funk, A. Susanna, T. F. Stuessy, and R. J. Bayer. Vienna: International Association for Plant Taxonomy.

Ohle, H. 1974. Beiträge zur Taxonomie der Gattung Calendula II. Taxonomische Revision der südeuropäischen perennierenden Calendula-Sippen. Feddes Repertorium 85: 245-283.

Ohle, H. 1975a. Beiträge zur Taxonomie und Evolution der Gattung Calendula L. III. Revision der marokkanischen perennierenden Sippen unter Berücksichtigung einiger marokkanischer Annueller Mit 6 Tafeln und 4 Abbildungen. Feddes Repertorium 86: 1-17.

Ohle, H. 1975b. Beiträge zur Taxonomie und Evolution der Gattung Calendula L. IV. Revision der algerisch-tunesischen perennierenden Calendula-Sippen unter Berücksichtigung einiger marokkanisch-algerischer Annueller und der marokkanischen und südeuropäischen perennierenden Taxa Mit 5 Tafeln und 3 Abbildungen. Feddes Repertorium 86: 525-541.

Pellicer, J., S. Garcia, M. Á. Canela, T. Garnatje, A. A. Korobkov, J. D. Twibell, and J. Vallès. 2010. Genome size dynamics in Artemisia L. (Asteraceae): following the track of polyploidy. Plant Biology 12: 820-830.

Plume, O., S. C. K. Straub, N. Tel-Zur, A. Cisneros, B. Schneider, and J. J. Doyle. 2013. Testing a hypothesis of intergeneric allopolyploidy in vine cacti (Cactaceae: Hylocereeae). Systematic Botany 38: 737-751.

Ramsey, J.M. 2003. Polyploidy and local adaptation in Achillea millefolium (Asteraceae). ProQuest, UMI Dissertations Publishing

Richardson, B. A., J. T. Page, P. Bajgain, S. C. Sanderson, and J. A. Udall. 2012. Deep sequencing of amplicons reveals widespread intraspecific hybridization and multiple origins of polyploidy in big sagebrush (Artemisia tridentata; Asteraceae). American Journal of Botany 99: 1962-1975.


Rieseberg, L. H. and J. H. Willis. 2007. Plant Speciation. Science 317: 910-914.

Sang, T. 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Critical Reviews in Biochemistry and Molecular Biology 37: 121-147.

Simmons, M. P. and H. Ochoterena. 2000. Gaps as Characters in Sequence-Based Phylogenetic Analyses. Systematic Biology 49: 369-381.

Small, R., R. Cronn, and J. Wendel. 2004. Use of nuclear genes for phylogeny reconstruction in plants. Australian Systematic Botany 17: 145-170.

Smissen, R., M. Galbany-Casals, and I. Breitwieser. 2011. Ancient allopolyploidy in the everlasting daisies (Asteraceae: Gnaphalieae): Complex relationships among extant clades. Taxon 60: 649-662.

Soltis, D. E., P. S. Soltis, J. C. Pires, A. Kovarik, J. A. Tate, and E. Mavrodiev. 2004. Recent and recurrent polyploidy in Tragopogon (Asteraceae): cytogenetic, genomic and genetic comparisons. Biological Journal of the Linnean Society 82: 485-485.

Steele, P. R., M. Guisinger-Bellian, C. R. Linder, and R. K. Jansen. 2008. Phylogenetic utility of 141 low-copy nuclear regions in taxa at different taxonomic levels in two distantly related families of rosids. Molecular Phylogenetics and Evolution 48: 1013-1026.

Strand, A. E., J. Leebens-Mack, and B. G. Milligan. 1997. Nuclear DNA-based markers for plant evolutionary biology. Molecular Ecology 6: 113-118.

Wang, J., L. Qu, J. Chen, H. Gu, and Z. Chen. 2000. Molecular evolution of the exon 2 of CHS genes and the possibility of its application to plant phylogenetic analysis. Chinese Science Bulletin 45: 1735-1742.

Wood, T. E., N. Takebayashi, M. S. Barker, I. Mayrose, P. B. Greenspoon, and L. H. Rieseberg. 2009. The frequency of polyploid speciation in vascular plants. Proceedings of the National Academy of Sciences of the United States of America 106: 13875-13879.

Yang, J., J. Huang, H. Gu, Y. Zhong, and Z. Yang. 2002. Duplication and adaptive evolution of the chalcone synthase genes of Dendranthema (Asteraceae). Molecular Biology and Evolution 19: 1752-1759.


Yu, D., M. Kotilainen, E. Pöllänen, M. Mehto, P. Elomaa, Y. Helariutta, V. A. Albert, and T. H. Teeri. 1999. Organ identity genes and modified patterns of flower development in Gerbera hybrida (Asteraceae). The Plant Journal 17: 51-62.


CHAPTER 3

DIVERSITY AND EVOLUTION OF SECONDARY CHEMISTRY IN CALENDULA L. (COMPOSITAE)

1. Introduction

Variation in secondary chemistry in plants, including both qualitative and quantitative variation, may be the result of numerous and often interacting genetic and environmental factors. These may include hybridization and genome duplication (Roose and Gottlieb 1976; Orians 2000; Kirk et al. 2005; Hull-Sanders et al. 2009; Soltis et al. 2014), biotic stresses such as herbivory, disease, and competition with other plants (Kessler and Baldwin 2002; Uesugi and Kessler 2013), abiotic stresses such as extremes of temperature and light (Jakopic et al. 2009), and reproductive pressures such as the need to attract pollinators or seed dispersers (Rodríguez et al. 2013; Irwin et al. 2014). Some changes will have taken place over evolutionary time, reflecting past events and pressures, and signaling bifurcating or reticulate histories (treated extensively in a great body of chemotaxonomic literature spanning decades, but see Alvarenga et al. 2001 and Calabria et al. 2007 for recent applications in Compositae), while other changes may take place over the course of a year or even a day and may correspond to developmental stage (Johnson et al. 2004; Kaškonienė 2011), seasonal or circadian rhythms (Martin et al. 2003; Llusia et al. 2012), direct attack by herbivores (Kaplan et al. 2008), or even signals from other plants indicating that an attack by herbivores could be imminent (Kessler and Heil 2011; Ueda et al. 2012). Variation may be specific to different organs, which may reflect different or even antagonistic adaptive strategies (e.g., defense against herbivores versus attraction of pollinators; Kessler and Halitschke 2009; Agrawal 2011; Manson et al. 2012). Patterns of chemical variation across species, therefore, are expected to be complex, and a wholly satisfying explanation of the effects of any single factor on observed diversity may not be possible. Nevertheless, exploration of chemical diversity within an evolutionary context may provide information about the types of changes that have occurred and some clues as to what may have caused them.

In this study, chemical variation was explored within the context of evolutionary relationships in Calendula L. (Compositae), a plant genus in which both hybridization and polyploidy have played a role in species diversification and radiation (see Chapters 1 and 2). The most familiar of the species is C. officinalis L. (pot marigold), cultivated for centuries for horticultural and medicinal use, including reducing skin and gastric inflammation and speeding the healing of wounds. A growing body of literature documents the diverse phenolic and terpene content in, and medicinal activity of, extracts from this species (e.g., Yoshikawa et al. 2001; Murukami et al. 2001; Hamburger et al. 2003; Ukiya et al. 2006; Muley et al. 2009; Baciu et al. 2012; Butnariu and Coradini 2012; Olennikov and Kashchenko 2013). Calendula arvensis L. (field marigold) also finds its way into the phytochemical literature (e.g., Pizza et al. 1987; De Tommasi and Pizza 1990; Kirmzebekmez et al. 2006; Paolini et al. 2010; Tosun et al. 2012), perhaps because it is the most common Calendula species not in cultivation. There is little to no information about chemical diversity in other Calendula species.

A classic hypothesis explaining the evolution of chemical diversity in plants is that the evolutionary “arms race” between plants and herbivores drives the escalation of chemical defenses during plant speciation. Thus, chemical defenses are predicted to intensify with evolutionary nestedness (see Ehrlich and Raven 1964; Vermeij 1994). Support for this hypothesis within a phylogenetic context has been found in Bursera (Becerra et al. 2009) and in Asclepias (Agrawal et al. 2008, 2009) with the caveats that the interplay of different defense strategies (e.g., the correlation of higher investment in mechanical defenses with lower investment in chemical complexity and vice versa), physiological constraints, or interactions with different communities of herbivores may have led to differences in evolutionary trends for different strategies across species. Because relationships among allopolyploid taxa in Calendula are reticulate rather than phylogenetic (which implies bifurcation of lineages over evolutionary time), a hypothesis of escalation of chemical complexity with increasing evolutionary nestedness (i.e., ongoing species divergence) loses meaning and cannot be adequately tested within the context of such relationships. However, it is interesting to consider that hybridization and polyploidy, both prevalent in plants, likely contribute to the successful participation of plants in the evolutionary arms race with herbivores, and that these processes, in bringing together divergent genomes and different biosynthetic pathways, rearranging genomes, and doubling the genic material at play, may provide the raw materials for chemical diversification, escalation of defense, and speciation (see Levin 1983; Mears 1980; Warner and Edwards 1993; Orians 2000; Otto and Whitton 2000). Therefore, a hypothesis of escalation of defense with multiple rounds of allopolyploid speciation may be more relevant in this context.

Patterns of variation of four classes of compounds produced from two major, independent biosynthetic pathways were assessed. These were caffeic acid derivatives (CADs) and flavonoid glycosides (FGs) produced by different branches of the phenylpropanoid pathway, and monoterpenes (MTs) and sesquiterpenes (STs) produced by the isoprenoid pathway. The products of these two pathways represent much of the diversity of secondary chemicals produced by plants, with over 8,000 phenolic compounds and over 25,000 terpenes estimated (Buchanan et al. 2000). The spectacular diversity of terpenes, in particular, stems from the fact that a single terpene synthase can produce multiple terpene products from a single substrate (Degenhardt et al. 2009; Köllner et al. 2006). The justification for surveying CADs, FGs, MTs, and STs is both biological and methodological. Biologically, CADs and FGs have demonstrated involvement in defense against pathogens and pests and in response to other environmental stresses (Del Moral 1972; Harborne 1977; Constable 1999; Harborne and Williams 2000; Niveyro et al. 2012), and MTs and STs may serve as mediators of plant-insect and plant-plant interactions, including attracting the predators of herbivores (Bruce et al. 2005; Bruce and Pickett 2011). As representatives of independent pathways, each of which produces chemicals with different functions (e.g., defense, attraction, free-radical scavenging), and each of which may be experiencing different selective pressures, they offer independent perspectives on the effects of hybrid and polyploid speciation on chemical diversification in Calendula. Methodologically, compounds in all four classes are known to be diverse in plants, and they are easily identifiable to class based on UV or mass spectra. The ability to identify each unique compound as a CAD, FG, MT, or ST and note its presence, absence, and abundance relative to other compounds in the same class allows interpretation of this variation within a phylogenetic context without requiring the identification of individual compounds.
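The scoring described at the end of the preceding paragraph (presence, absence, and abundance relative to other compounds in the same class) can be sketched as follows. This is only an illustration under stated assumptions: the peak labels and peak areas are hypothetical, and normalizing each peak by the summed area of its class is one plausible reading of "relative abundance", not necessarily the calculation used in this study.

```python
# Hypothetical integrated peak areas for a single sample, keyed by
# (compound class, unidentified peak label). Not data from this study.
peaks = {
    ("CAD", "cad_1"): 120.0,
    ("CAD", "cad_2"): 40.0,
    ("CAD", "cad_3"): 0.0,
    ("FG", "fg_1"): 30.0,
    ("FG", "fg_2"): 90.0,
}


def class_profiles(peak_areas):
    """Score each peak for presence and for abundance relative to its class."""
    class_totals = {}
    for (cls, _label), area in peak_areas.items():
        class_totals[cls] = class_totals.get(cls, 0.0) + area

    profile = {}
    for (cls, label), area in peak_areas.items():
        total = class_totals[cls]
        profile[(cls, label)] = {
            "present": area > 0,
            # Assumption: relative abundance = peak area / summed class area.
            "relative_abundance": area / total if total > 0 else 0.0,
        }
    return profile


profile = class_profiles(peaks)
for key, scores in profile.items():
    print(key, scores)
```

Because each peak is compared only with others of the same class, the compounds never need to be identified individually, only assigned to a class.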

The aim of this study was to test the following hypotheses of chemical diversity in Calendula within the context of evolutionary relationships among taxa: 1) the chemical profiles of Calendula plants will vary by chromosome number, taxon, and tissue type; 2) chemical profiles of allopolyploid individuals will be characterized by more compounds, greater abundance of compounds, more complex mixtures of compounds, and/or expression of compounds in different tissues relative to their progenitors; 3) differences in lifespan and breeding system will lead to different evolutionary trends in chemical diversity in the annual and perennial polyploid complexes; 4) MT and ST profiles will be less well correlated to evolutionary relationships than CAD and FG profiles because of lower enzyme specificity and higher number of products produced by individual enzymes in the isoprenoid pathway relative to the phenylpropanoid pathway.

2. Materials and Methods

2.1. Plant growth and sampling

Chemotype was expected to vary within taxa or even within populations, especially within the polyploid and highly polymorphic C. arvensis. Therefore, a minimum of eight seeds were selected for germination from each of 49 accessions representing ten species of Calendula. Seeds from two subspecies of C. incana, three subspecies of C. suffruticosa, and from twenty-nine accessions of C. arvensis from eight countries were included. Seeds were planted in late July in a greenhouse at Cornell University (Ithaca, NY) in twelve 72-well trays (for a total of 864 seeds). An effort was made to place seeds from each accession across several trays to reduce any potential positional effects in the greenhouse setup. Each seed was assigned a code depending on its position in each tray. Trays were kept moist and were checked every two days for germination. Seedlings were transplanted into six-inch pots when the first true leaves had emerged, and plants from each accession and species were spread out across benches in the greenhouse, intermingled with plants from other accessions and species. Seedlings germinated over the course of about two weeks. Germination rates were extremely low for many accessions and this resulted in some taxa being underrepresented in the study, particularly C. stellata, C. tripterocarpa, C. pachysperma, and C. palaestina.
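The planting layout described above (twelve 72-well trays, with seeds from each accession spread across several trays) can be sketched as follows. The text does not state how positions were chosen, so simple random shuffling of tray positions is used here as a stand-in; the accession labels, the number of accessions, and the function name are hypothetical.

```python
import random

N_TRAYS = 12
WELLS_PER_TRAY = 72


def assign_wells(seed_labels, rng):
    """Assign each seed a unique (tray, well) position chosen at random."""
    positions = [
        (tray, well)
        for tray in range(1, N_TRAYS + 1)
        for well in range(1, WELLS_PER_TRAY + 1)
    ]
    if len(seed_labels) > len(positions):
        raise ValueError("more seeds than available wells")
    # Assumption: random placement stands in for the effort to spread
    # accessions across trays; the actual procedure is not specified.
    rng.shuffle(positions)
    return dict(zip(seed_labels, positions))


# Hypothetical example: three accessions with eight seeds each.
seeds = [f"acc{a:02d}_seed{s}" for a in range(1, 4) for s in range(1, 9)]
layout = assign_wells(seeds, random.Random(0))
capacity = N_TRAYS * WELLS_PER_TRAY
print(capacity)
```

Recording the (tray, well) pair for each seed mirrors the position-based seed codes mentioned above and makes it possible to check later for positional effects.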

Two hundred and four transplanted seedlings were kept moist and grown under 12 hours of light. Daytime temperature was set at 72°F, and nighttime temperature did not fall below 55°F.

After three months, all plants were in full flower. In late October, tissue was harvested from all plants over the course of two days, in the morning of both days, flash frozen in liquid nitrogen, placed on dry ice, and subsequently stored at -80°C. Plants were sampled in no particular order on both days. Approximately seven grams of fully expanded leaves at mid-stem were collected from each individual into 50 mL microcentrifuge tubes. Floral sampling was less straightforward. To ensure that a sufficient amount of floral material was available for analysis for all species, floral tissue was collected in bulk from all individuals of each accession (rather than from each individual separately). For most of the accessions in the study, whole capitula (i.e., entire inflorescence heads including ray and disc florets, receptacular tissue, and bracts subtending the inflorescences) were collected. However, for some accessions, instead of whole capitula, separate collections were made of ray florets only and of disc florets only. Both whole capitulum collections and separate ray and disc collections were made for some groups, allowing both within-group and among-group comparisons of these tissues. In total, 266 samples were collected and analyzed for phenolic and terpene content. These included leaf samples from each of 204 individuals, bulked whole capitula from each of 36 accessions, and bulked ray florets and bulked disc florets from each of 11 accessions from three species. A summary of the number of accessions included, the number and type of collections made, and information on chromosome number, lifespan, breeding system, and distribution of each species is presented in Table 3.1. More detailed information for each sample is provided in Table 3.2.


Table 3.1. Summary of taxa sampled for this study with information on chromosome number, lifespan, breeding system, and taxon distribution. Taxa are ordered by polyploid complex, with the putative progenitors in the middle, the members of the annual polyploid complex (with 2n = 14, 30, 44, and 85) in order of increasing chromosome number from the middle to the top, and the members of the perennial polyploid complex (with 2n = 14, 18, and 32) in order of increasing chromosome number from the middle to the bottom.

| 2n¹ | Taxon | Lifespan | Breeding system² | Distribution³ | Accessions (#) | Phenolic samples⁴ | Terpene samples⁴ |
|---|---|---|---|---|---|---|---|
| 85 | C. pachysperma | annual | selfing | Israel, Jordan | 1 | L(4), C(1) | L(4), C(1) |
| 85 | C. palaestina | annual | selfing | Israel, Jordan, Lebanon, Syria | 1 | L(4), C(1) | L(4), C(1) |
| 44 | C. arvensis | annual | selfing | Widespread | 19 | L(79), C(19) | L(80), C(19) |
| 30 | C. tripterocarpa | annual | selfing | Widespread | 1 | L(5), C(2) | L(5), C(2) |
| 14 | C. stellata | annual | outcrossing | Algeria, Morocco, Sicily, Tunisia | 2 | L(7), R(2), D(2) | L(7), R(2), D(2) |
| 18 | C. eckerleinii | perennial | outcrossing | Morocco | 1 | L(18), C(2) | L(18), C(2) |
| 18 | C. maroccana | perennial | outcrossing | Morocco | 3 | L(15), C(2) | L(16), C(3) |
| 32 | C. officinalis | semi-perennial | outcrossing | Cultivated | 4 | L(19), R(5), D(4) | L(18), R(5), D(5) |
| 32 | C. incana subsp. incana | perennial | outcrossing | Spain | 2 | L(5), C(1) | L(5), C(1) |
| 32 | C. incana subsp. maritima | perennial | outcrossing | Sicily | 2 | L(12), R(1), D(1), C(1) | L(11), R(1), D(1), C(1) |
| 32 | C. suffruticosa subsp. carbonellii | perennial | outcrossing | Spain | 3 | L(10), C(3) | L(10), C(3) |
| 32 | C. suffruticosa subsp. fulgida | perennial | outcrossing | Italy, Morocco, Malta, Sicily | 3 | L(11), R(2), D(2), C(2) | L(11), R(2), D(2), C(2) |
| 32 | C. suffruticosa subsp. lusitanica | perennial | outcrossing | Portugal, Morocco | 3 | L(15), R(2), D(2), C(1) | L(15), R(2), D(2), C(1) |

¹ 2n counts reported in the literature (although counts are corroborated by flow cytometry performed on many of the individuals sampled for this study; P. Silveira, pers. comm.).
² The label “selfing” means only that these species do self, not that they do not outcross.
³ Distribution data from Euro-Med PlantBase (emplantbase.org).
⁴ L=leaf; C=capitulum; R=ray; D=disc; the number of each type of sample collected for each taxon is shown in parentheses.


Table 3.2. Taxon sampling for HPLC and GC/MS analyses. *All voucher specimens are held at BH. nv=no voucher collected. In a few instances, vouchers were collected for all individuals in an accession, but sample IDs were not recorded on each voucher. In these cases, either the pair or the series of vouchers including the individual from which the sample was taken is listed. Voucher information is not included for floral samples because these samples were collected in bulk from all individuals in each accession. **Type codes for each sample are as follows: L=individual leaf; C=bulk capitulum; D=bulk disc; R=bulk ray. ***Accession letter codes indicate the following: PI = USDA-GRIN (www.ars-grin.gov); IGB = The Israel Gene Bank (igb.agri.gov.il/main/index.pl); SHS = Silver Hill Seeds (www.silverhillseeds.co.za); BA = Bakers Acres (www.bakersacres.net); AT = Angelo Troia (University of Palermo, Italy); AW = Alan Wood (University of Stellenbosch, South Africa); OP = O. Plume; PS = P. Silveira (University of Aveiro, Portugal); MS = M. Sequeira (University of Madeira, Portugal; via PS); pop. = seed collection from population rather than individual.


| Taxon | Sample ID | Voucher* | Accession*** | Seed origin | Type** | HPLC | GC/MS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Calendula L. arvensis (Vaill.) L. | Carv2979_64-1 | nv | PS 2979 | Spain | L | x | x |
| C. arvensis | Carv2979_64-6 | O. Plume 462 | PS 2979 | Spain | L | x | x |
| C. arvensis | Carv2979_h | | PS 2979 | Spain | C | x | x |
| C. arvensis | Carv2980_13-9 | O. Plume 425 | PS 2980 | Spain | L | x | x |
| C. arvensis | Carv2980_rd | O. Plume 425 | PS 2980 | Spain | C | x | x |
| C. arvensis | Carv305289_62-2 | O. Plume 439 | PI 305289 | Spain | L | x | x |
| C. arvensis | Carv305289_62-3 | O. Plume 436 | PI 305289 | Spain | L | x | x |
| C. arvensis | Carv305289_62-4 | O. Plume 437 | PI 305289 | Spain | L | x | x |
| C. arvensis | Carv305289_62-6 | O. Plume 440 | PI 305289 | Spain | L | x | x |
| C. arvensis | Carv305289_62-7 | O. Plume 438 | PI 305289 | Spain | L | x | x |
| C. arvensis | Carv305289_h | | PI 305289 | Spain | C | x | x |
| C. arvensis | Carv578097_41-8 | O. Plume 398 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_42-1 | O. Plume 400 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_42-4 | O. Plume 402 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_42-7 | O. Plume 401 or 404 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_48-2 | O. Plume 399 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_48-3 | O. Plume 405 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_48-7 | O. Plume 401 or 404 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_sn | O. Plume 403 | PI 578097 | Turkey | L | x | x |
| C. arvensis | Carv578097_h | | PI 578097 | Turkey | C | x | x |
| C. arvensis | Carv578099_32-1 | O. Plume 415 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_32-4 | O. Plume 417 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_32-5 | O. Plume 413 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_33-2 | O. Plume 418 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_33-3 | O. Plume 414 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_33-5 | O. Plume 419 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_33-7 | O. Plume 416 | PI 578099 | Morocco | L | x | x |
| C. arvensis | Carv578099_h | | PI 578099 | Morocco | C | x | x |
| C. arvensis | Carv578100_63-1 | O. Plume 430 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_63-2 | O. Plume 435 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_63-3 | O. Plume 432 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_63-4 | O. Plume 434 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_63-6 | O. Plume 433 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_63-7 | O. Plume 431 | PI 578100 | Spain | L | x | x |
| C. arvensis | Carv578100_h | | PI 578100 | Spain | C | x | x |
| C. arvensis | Carv597585_30-7 | O. Plume 420 | PI 597585 | France | L | x | x |
| C. arvensis | Carv597585_39-2 | O. Plume 424 | PI 597585 | France | L | x | x |
| C. arvensis | Carv597585_39-3 | O. Plume 423 | PI 597585 | France | L | x | x |
| C. arvensis | Carv597585_39-7 | O. Plume 421 | PI 597585 | France | L | x | x |
| C. arvensis | Carv597585_h | | PI 597585 | France | C | x | x |
| C. arvensis | Carv597586_69-1 | O. Plume 427 | PI 597586 | Spain | L | x | x |
| C. arvensis | Carv597586_69-4 | O. Plume 428 | PI 597586 | Spain | L | x | x |
| C. arvensis | Carv597586_69-5 | O. Plume 426 | PI 597586 | Spain | L | x | x |
| C. arvensis | Carv597586_69-8 | O. Plume 429 | PI 597586 | Spain | L | x | x |
| C. arvensis | Carv597586_h | | PI 597586 | Spain | C | x | x |
| C. arvensis | Carv597587_44-2 | O. Plume 412 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_44-3 | O. Plume 406-411 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_44-4 | O. Plume 408 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_44-5 | O. Plume 406-411 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_45-3 | O. Plume 406-411 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_45-5 | O. Plume 406-411 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_45-7 | O. Plume 410 | PI 597587 | Italy | L | x | x |
| C. arvensis | Carv597587_h | | PI 597587 | Italy | C | x | x |
| C. arvensis | Carv603109_61-3 | O. Plume 445 | PI 603109 | Greece | L | x | x |
| C. arvensis | Carv603109_61-4 | O. Plume 446 | PI 603109 | Greece | L | x | x |
| C. arvensis | Carv603109_61-8 | O. Plume 447 | PI 603109 | Greece | L | x | x |
| C. arvensis | Carv603109_rd | O. Plume 447 | PI 603109 | Greece | C | x | x |
| C. arvensis | Carv613017_34-1 | O. Plume 467 or 470 | PI 613017 | Morocco | L | – | x |
| C. arvensis | Carv613017_34-5 | O. Plume 473 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_34-7 | O. Plume 471 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_34-8 | O. Plume 468 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_35-1 | O. Plume 467 or 470 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_35-2 | O. Plume 469 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_35-3 | O. Plume 472 | PI 613017 | Morocco | L | x | x |
| C. arvensis | Carv613017_h | | PI 613017 | Morocco | C | x | x |
| C. arvensis | Carv618687_59-2 | O. Plume 459 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-3 | O. Plume 460 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-4 | O. Plume 461 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-5 | O. Plume 457 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-6 | O. Plume 455 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-7 | O. Plume 456 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_59-8 | O. Plume 458 | PI 618687 | Portugal | L | x | x |
| C. arvensis | Carv618687_h | | PI 618687 | Portugal | C | x | x |
| C. arvensis | Carv63_47-1 | O. Plume 464 | OP 63 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv63_47-5_c | O. Plume 463 | OP 63 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv63_h | | OP 63 | Italy (Sicily) | C | x | x |
| C. arvensis | Carv633645_65-1 | O. Plume 442 | PI 633645 | Portugal | L | x | x |
| C. arvensis | Carv633645_65-2 | O. Plume 441 | PI 633645 | Portugal | L | x | x |
| C. arvensis | Carv633645_65-3 | O. Plume 443 | PI 633645 | Portugal | L | x | x |
| C. arvensis | Carv633645_65-4 | O. Plume 444 | PI 633645 | Portugal | L | x | x |
| C. arvensis | Carv633645_h | | PI 633645 | Portugal | C | x | x |
| C. arvensis | Carv66_13-2 | O. Plume 453 | OP 66 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv66_13-5 | O. Plume 454 | OP 66 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv66_13-6 | O. Plume 452 | OP 66 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv66_13-8 | O. Plume 451 | OP 66 | Italy (Sicily) | L | x | x |
| C. arvensis | Carv66_h | | OP 66 | Italy (Sicily) | C | x | x |
| C. arvensis | CarvLebA_49-1 | O. Plume 448 | OP pop. LebA | Lebanon | L | x | x |
| C. arvensis | CarvLebA_h | | OP pop. LebA | Lebanon | C | x | x |
| C. arvensis | CarvLebC_53-1 | O. Plume 477 | OP pop. LebC | Lebanon | L | x | x |
| C. arvensis | CarvLebC_53-4 | O. Plume 474 | OP pop. LebC | Lebanon | L | x | x |
| C. arvensis | CarvLebC_53-7 | O. Plume 475 | OP pop. LebC | Lebanon | L | x | x |
| C. arvensis | CarvLebC_53-8 | O. Plume 476 | OP pop. LebC | Lebanon | L | x | x |
| C. arvensis | CarvLebC_h | | OP pop. LebC | Lebanon | C | x | x |
| C. arvensis | CarvLebD_54-1 | O. Plume 450 | OP pop. LebD | Lebanon | L | x | x |
| C. arvensis | CarvLebD_54-4 | O. Plume 449 | OP pop. LebD | Lebanon | L | x | x |
| C. arvensis | CarvLebD_h | | OP pop. LebD | Lebanon | C | x | x |
| C. arvensis | CarvSp1_68-3 | O. Plume 465 | OP pop. Sp1 | Spain | L | x | x |
| C. arvensis | CarvSp1_68-4 | O. Plume 466 | OP pop. Sp1 | Spain | L | x | x |
| C. arvensis | CarvSp1_h | | OP pop. Sp1 | Spain | C | x | x |
| C. eckerleinii Ohle | Ceck603110A_19-2 | O. Plume 296 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_19-3_c | nv | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_40-1 | O. Plume 299 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_40-5 | O. Plume 297 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_55-3 | O. Plume 298 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_55-4 | O. Plume 300 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110A_h | | PI 603110 | Morocco | C | x | x |
| C. eckerleinii | Ceck603110B_1-10 | O. Plume 295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_1-9 | O. Plume 288 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_19-4 | O. Plume 284 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_19-5 | O. Plume 291 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_19-6 | O. Plume 289 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_19-7 | O. Plume 290 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_19-8 | O. Plume 284-295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_37-10 | O. Plume 284-295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_40-8 | O. Plume 284-295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_55-5 | O. Plume 284-295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_55-6 | O. Plume 294 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_55-8 | O. Plume 284-295 | PI 603110 | Morocco | L | x | x |
| C. eckerleinii | Ceck603110B_h | | PI 603110 | Morocco | C | x | x |
| C. incana Willd. subsp. incana | Ciinc2937b_25-5 | O. Plume 364 | PS 2937b | Spain | L | x | x |
| C. incana subsp. incana | Ciinc2937b_h | | PS 2937b | Spain | C | x | x |
| C. incana subsp. incana | Ciinc3043_7-1 | O. Plume 368 | PS 3043 | Spain | L | x | x |
| C. incana subsp. incana | Ciinc3043_7-2 | O. Plume 367 | PS 3043 | Spain | L | x | x |
| C. incana subsp. incana | Ciinc3043_7-3 | O. Plume 365 | PS 3043 | Spain | L | x | x |
| C. incana subsp. incana | Ciinc3043_7-4 | O. Plume 366 | PS 3043 | Spain | L | x | x |
| C. incana subsp. maritima (Guss.) Ohle | Cimar59_4-10 | O. Plume 325 | OP 59 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar59_d | | OP 59 | Italy (Sicily) | D | x | x |
| C. incana subsp. maritima | Cimar59_r | | OP 59 | Italy (Sicily) | R | x | x |
| C. incana subsp. maritima | Cimar597596_2-1 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_2-2 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_2-4 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_2-5_c | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_2-8 | O. Plume 333 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_9-1 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_9-2 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | – |
| C. incana subsp. maritima | Cimar597596_9-3_c | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_9-4 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_9-5 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_9-7 | O. Plume 326-335 | PI 597596 | Italy (Sicily) | L | x | x |
| C. incana subsp. maritima | Cimar597596_h | | PI 597596 | Italy (Sicily) | C | x | x |
| C. maroccana (Ball) B. D. Jacks | Cmar578104_51-10 | O. Plume 302 | PI 578104 | Morocco | L | x | x |
| C. maroccana | Cmar578104_51-4 | O. Plume 303 | PI 578104 | Morocco | L | x | x |
| C. maroccana | Cmar578104_51-6 | O. Plume 305 | PI 578104 | Morocco | L | x | x |
| C. maroccana | Cmar578104_51-7_c | O. Plume 301 | PI 578104 | Morocco | L | x | x |
| C. maroccana | Cmar578104_51-8 | O. Plume 304 | PI 578104 | Morocco | L | x | x |
| C. maroccana | Cmar578104_h | | PI 578104 | Morocco | C | x | x |
| C. maroccana | Cmar607416_15-1 | O. Plume 316 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-10 | O. Plume 313 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-2 | O. Plume 311 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-4 | O. Plume 317 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-5 | O. Plume 315 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-7 | O. Plume 314 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_15-8 | O. Plume 312 | PI 607416 | Morocco | L | x | x |
| C. maroccana | Cmar607416_h | | PI 607416 | Morocco | C | x | x |
| C. maroccana | Cmar607417_36-3 | O. Plume 308 | PI 607417 | France (cult.) | L | x | x |
| C. maroccana | Cmar607417_36-4 | O. Plume 309 | PI 607417 | France (cult.) | L | – | x |
| C. maroccana | Cmar607417_36-5 | O. Plume 306 | PI 607417 | France (cult.) | L | x | x |
| C. maroccana | Cmar607417_36-9 | O. Plume 307 | PI 607417 | France (cult.) | L | x | x |
| C. maroccana | Cmar607417_h | | PI 607417 | France (cult.) | C | – | x |
| C. officinalis L. | Coff2986c_29-2 | O. Plume 350 | PS 2986c | Portugal | L | x | x |
| C. officinalis | Coff2986c_29-5 | O. Plume 351 | PS 2986c | Portugal | L | x | x |
| C. officinalis | Coff2986c_d | | PS 2986c | Portugal | D | x | x |
| C. officinalis | Coff2986c_r | | PS 2986c | Portugal | R | x | x |
| C. officinalis | Coff420253_26-1 | O. Plume 340 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_26-2 | O. Plume 337 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_26-5 | O. Plume 339 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_26-6 | O. Plume 338 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_26-7 | O. Plume 336 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_26-8 | O. Plume 341 | PI 420253 | Portugal | L | x | x |
| C. officinalis | Coff420253_d | | PI 420253 | Portugal | D | x | x |
| C. officinalis | Coff420253_r | | PI 420253 | Portugal | R | x | x |
| C. officinalis | Coff420375_28-1 | O. Plume 343 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-2 | O. Plume 345 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-3 | O. Plume 347 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-4 | O. Plume 344 | PI 420375 | Spain | L | x | – |
| C. officinalis | Coff420375_28-5 | O. Plume 348 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-6 | O. Plume 346 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-7 | O. Plume 342 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_28-8 | O. Plume 349 | PI 420375 | Spain | L | x | x |
| C. officinalis | Coff420375_d | | PI 420375 | Spain | D | x | x |
| C. officinalis | Coff420375_r | | PI 420375 | Spain | R | x | x |
| C. officinalis | Coff578106_27-5 | O. Plume 363 | PI 578106 | Kazakhstan | L | x | x |
| C. officinalis | Coff578106_27-6 | O. Plume 362 | PI 578106 | Kazakhstan | L | x | x |
| C. officinalis | Coff578106_27-8 | O. Plume 361 | PI 578106 | Kazakhstan | L | x | x |
| C. officinalis | Coff578106_d1 | | PI 578106 | Kazakhstan | D | x | x |
| C. officinalis | Coff578106_d2 | | PI 578106 | Kazakhstan | D | – | x |
| C. officinalis | Coff578106_r1 | | PI 578106 | Kazakhstan | R | x | x |
| C. officinalis | Coff578106_r2 | | PI 578106 | Kazakhstan | R | x | x |
| C. pachysperma Zohary | Cpac20562_60-3 | O. Plume 478 | IGB 20562 | Israel | L | x | x |
| C. pachysperma | Cpac20562_60-7 | O. Plume 481 | IGB 20562 | Israel | L | x | x |
| C. pachysperma | Cpac20562_66-2 | O. Plume 479 | IGB 20562 | Israel | L | x | x |
| C. pachysperma | Cpac20562_66-4 | O. Plume 480 | IGB 20562 | Israel | L | x | x |
| C. pachysperma | Cpac20562_h | | IGB 20562 | Israel | C | x | x |
| C. palaestina Boiss. | Cpal21124_16-10 | O. Plume 484 | IGB 21124 | Israel | L | x | x |
| C. palaestina | Cpal21124_16-9 | O. Plume 482 | IGB 21124 | Israel | L | x | x |
| C. palaestina | Cpal21124_71-3 | O. Plume 483 | IGB 21124 | Israel | L | x | x |
| C. palaestina | Cpal21124_71-4 | O. Plume 485 | IGB 21124 | Israel | L | x | x |
| C. palaestina | Cpal21124_h | | IGB 21124 | Israel | C | x | x |
| C. suffruticosa Vahl. subsp. carbonellii Ohle | Cscar175_21-3 | O. Plume 379 | OP pop. 175 | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | Cscar175_21-4 | O. Plume 380 | OP pop. 175 | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | Cscar175_21-6 | O. Plume 378 | OP pop. 175 | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | Cscar175_21-8 | O. Plume 377 | OP pop. 175 | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | Cscar175_h | | OP pop. 175 | Spain | C | x | x |
| C. suffruticosa subsp. carbonellii | Cscar2983_20-2 | O. Plume 392 | PS 2983 | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | Cscar2983_h | | PS 2983 | Spain | C | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_24-1 | O. Plume 394 | OP pop. Fuen | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_24-2 | O. Plume 397 | OP pop. Fuen | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_24-3 | O. Plume 393 | OP pop. Fuen | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_24-4 | O. Plume 396 | OP pop. Fuen | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_24-5 | O. Plume 395 | OP pop. Fuen | Spain | L | x | x |
| C. suffruticosa subsp. carbonellii | CscarFuen_h | | | Spain | C | x | x |
| C. suffruticosa subsp. fulgida (Raf.) Guadagno | Csful607420A_12-3 | O. Plume 390 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420A_12-5 | O. Plume 391 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420A_d | | PI 607420 | Italy (Sicily) | D | x | x |
| C. suffruticosa subsp. fulgida | Csful607420A_r | | PI 607420 | Italy (Sicily) | R | x | x |
| C. suffruticosa subsp. fulgida | Csful607420B_12-1 | O. Plume 385 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420B_12-2 | O. Plume 382 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420B_12-7 | O. Plume 383 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420B_12-8 | O. Plume 384 | PI 607420 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful607420B_h | | PI 607420 | Italy (Sicily) | C | x | x |
| C. suffruticosa subsp. fulgida | Csful613021_10-5 | O. Plume 381 | PI 613021 | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | Csful613021_h | | PI 613021 | Italy (Sicily) | C | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_14-1 | O. Plume 387 | AT pop. HA | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_14-4 | O. Plume 389 | AT pop. HA | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_14-5 | O. Plume 386 | AT pop. HA | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_14-7 | O. Plume 388 | AT pop. HA | Italy (Sicily) | L | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_d | | AT pop. HA | Italy (Sicily) | D | x | x |
| C. suffruticosa subsp. fulgida | CsfulAdd_r | | AT pop. HA | Italy (Sicily) | R | x | x |
| C. suffruticosa subsp. lusitanica (Boiss.) Ohle | Cslus24677_18-2 | O. Plume 267 | PI 649652 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_18-3 | O. Plume 269 | PI 649652 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_18-4 | O. Plume 268 | PI 649652 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_18-6 | O. Plume 271 | PI 649652 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_18-7 | O. Plume 270 | PI 649652 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_d | | PI 649652 | Portugal | D | x | x |
| C. suffruticosa subsp. lusitanica | Cslus24677_r | | PI 649652 | Portugal | R | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_17-1 | O. Plume 280 | PS 3015 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_17-2 | O. Plume 281 | PS 3015 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_17-6 | O. Plume 282 | PS 3015 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_17-8 | O. Plume 279 | PS 3015 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_d | | PS 3015 | Portugal | D | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3015_r | | PS 3015 | Portugal | R | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-1 | O. Plume 278 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-2 | O. Plume 274 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-4_c | O. Plume 272 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-5 | O. Plume 276 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-6 | O. Plume 277 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_16-7 | O. Plume 273 | PS 3023 | Portugal | L | x | x |
| C. suffruticosa subsp. lusitanica | Cslus3023_h | | PS 3023 | Portugal | C | x | x |
| C. stellata Cav. | Cste649651_58-1 | O. Plume 322 | PI 649651 | Morocco | L | x | x |
| C. stellata | Cste649651_58-4 | O. Plume 321 | PI 649651 | Morocco | L | x | x |
| C. stellata | Cste649651_58-5 | O. Plume 324 | PI 649651 | Morocco | L | x | x |
| C. stellata | Cste649651_58-6 | O. Plume 320 | PI 649651 | Morocco | L | x | x |
| C. stellata | Cste649651_58-7 | O. Plume 323 | PI 649651 | Morocco | L | x | x |
| C. stellata | Cste649651_d | | PI 649651 | Morocco | D | x | x |
| C. stellata | Cste649651_r | | PI 649651 | Morocco | R | x | x |
| C. stellata | CsteP250_1-5 | O. Plume 319 | OP pop. 250 | Morocco | L | x | x |
| C. stellata | CsteP250_1-6 | O. Plume 318 | OP pop. 250 | Morocco | L | x | x |
| C. stellata | CsteP250_d | | OP pop. 250 | Morocco | D | x | x |
| C. stellata | CsteP250_r | | OP pop. 250 | Morocco | R | x | x |
| C. tripterocarpa Rupr. | Ctri139f2_1 | O. Plume 266 | OP 139 | Spain | L | x | x |
| C. tripterocarpa | Ctri139f2_2 | O. Plume 265 | OP 139 | Spain | L | x | x |
| C. tripterocarpa | Ctri139f2_3 | O. Plume 264 | OP 139 | Spain | L | x | x |
| C. tripterocarpa | Ctri139f2_4 | O. Plume 263 | OP 139 | Spain | L | x | x |
| C. tripterocarpa | Ctri139f2_5 | O. Plume 262 | OP 139 | Spain | L | x | x |
| C. tripterocarpa | Ctri139f2_rd | | OP 139 | Spain | C | x | x |
| C. tripterocarpa | Ctri139f2_h | | OP 139 | Spain | C | x | x |

2.2. Sampling of phenolics and small terpenes

Samples were removed from storage in small batches, which were kept on liquid nitrogen during processing of each sample. Each full tube was weighed, and the average weight of empty tubes was subtracted from this weight to obtain the weight of fresh tissue in each tube. Frozen tissue was ground to a rough powder with a metal rod in each tube. Fifteen mL of 80% methanol (MeOH) were added to the ground tissue (5 or 10 mL of MeOH were added to some small-volume floral samples and 20 mL were added to some large-volume leaf samples), and the slurry was blended with a Polytron homogenizer until smooth. Blended samples were placed on ice, and then each sample (up to 15 mL) was transferred to a 15 mL tube. Samples in 15 mL tubes were placed in a refrigerated centrifuge and spun at 5000 rpm for 10 minutes at 4°C. Two mL of supernatant were transferred to 2 mL microcentrifuge tubes and spun at 14,000 rpm for 10 minutes. Samples were removed and kept on ice. From each 2 mL sample, exactly 1 mL of supernatant was transferred to a new 2 mL microcentrifuge tube, and the remaining approximately 1 mL of supernatant was transferred to a 1 mL glass vial for HPLC analysis. To the new 2 mL microcentrifuge tube, 2240 ng of tetralin (Sigma-Aldrich) as an internal standard and 1 mL of pure hexane were added. Tubes were capped tightly, wrapped with parafilm to prevent evaporation, and shaken at moderate speed at 4°C for 60 minutes. The hexane phase of each sample was then filtered through a 1 mL silica column (pre-primed with 1 mL pure hexane; Sigma-Aldrich Discovery® DSC-Si SPE Tube, bed weight 100 mg) using a vacuum manifold and collected into a 1 mL glass vial for GC/MS analysis.
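The fresh-weight calculation described above can be sketched as follows. This is only an illustration: the function names and all tube weights are hypothetical, and the tissue-per-mL figure assumes the tissue is evenly dispersed in the stated volume of MeOH.

```python
def fresh_tissue_mg(full_tube_mg, empty_tube_weights_mg):
    """Fresh tissue weight = full tube weight minus the mean empty-tube weight."""
    mean_empty = sum(empty_tube_weights_mg) / len(empty_tube_weights_mg)
    return full_tube_mg - mean_empty


def tissue_per_ml(fresh_mg, solvent_ml=15.0):
    """Tissue concentration of the extract (mg fresh tissue per mL MeOH).

    15 mL is the default volume in the text; 5, 10, or 20 mL were used for
    some samples, so the volume is passed explicitly.
    """
    return fresh_mg / solvent_ml


# Hypothetical weights, for illustration only.
empty_tubes = [6500.0, 6510.0, 6490.0]
fresh = fresh_tissue_mg(7250.0, empty_tubes)
conc = tissue_per_ml(fresh, solvent_ml=15.0)
print(fresh, conc)
```

Keeping the solvent volume as an explicit argument matters because the extraction volume was not the same for every sample.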

Fifteen µL of each methanol extract were analyzed for phenolic content (CADs, FGs) on a Hewlett-Packard (Avondale, PA) 1100 series HPLC (machine specifications were as described in Keinänen et al. 2001) with a Gemini C18 reverse-phase column (3 µm, 150 × 4.6 mm; Phenomenex, Torrance, CA, USA). The method of Keinänen et al. (2001) was used with slight modifications. The elution system, using solvents (A) 0.25% H3PO4 in water (pH 2.2) and (B) acetonitrile, was as follows: 0-6 min, 0-12% of B; 6-10 min, 12-18% of B; 10-30 min, 18-58% of B. The flow rate was 1 mL/min and the column oven was set at 24°C. The eluent was monitored at 210, 254, 320, and 365 nm.
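As a reading aid, the elution gradient can be written as a piecewise function of time. The sketch below assumes that solvent B changes linearly within each segment; the text gives only the segment endpoints, so the linearity and the names `GRADIENT` and `percent_b` are assumptions made for illustration.

```python
# (time in min, % solvent B) breakpoints taken from the elution system above.
GRADIENT = [(0.0, 0.0), (6.0, 12.0), (10.0, 18.0), (30.0, 58.0)]


def percent_b(t_min):
    """Percent of solvent B at time t_min, assuming linear ramps between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-30 min gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (3.0, 8.0, 20.0):
    print(t, percent_b(t))
```

Solvent A makes up the remainder at any time point, that is, `100 - percent_b(t)`.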

Fifteen µL of each hexane extract were analyzed for terpene content (MTs and STs) on a Varian 2200 GC/MS with a polar EC-WAX column (30 m, 0.25 mm internal diameter, 0.25 µm film thickness; Alltech Associates, USA). The carrier gas was helium, kept at a constant flow rate of 1 mL/min. GC oven conditions were as follows: 45°C for 6 min, increased to 130°C at 10°C/min, increased to 180°C at 5°C/min, increased to 230°C at 20°C/min with a 5 min hold at 230°C, increased to 250°C, and then held at 250°C for 5 min.
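The duration of the oven program can be checked with a small calculation. The sketch below covers only the segments whose rates are stated; the rate of the final ramp from 230°C to 250°C is not given in the text, so that ramp is deliberately left out rather than guessed. The step encoding and the function name are illustrative assumptions.

```python
# Steps with stated parameters: ("hold", minutes) or ("ramp", target_C, rate_C_per_min).
KNOWN_STEPS = [
    ("hold", 6.0),          # 45 C for 6 min
    ("ramp", 130.0, 10.0),  # to 130 C at 10 C/min
    ("ramp", 180.0, 5.0),   # to 180 C at 5 C/min
    ("ramp", 230.0, 20.0),  # to 230 C at 20 C/min
    ("hold", 5.0),          # 5 min hold at 230 C
]


def program_minutes(start_temp_c, steps):
    """Total time of a temperature program made of holds and linear ramps."""
    total = 0.0
    temp = start_temp_c
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        else:
            _, target, rate = step
            total += abs(target - temp) / rate
            temp = target
    return total


known = program_minutes(45.0, KNOWN_STEPS)
print(known)  # minutes elapsed before the final ramp to 250 C
```

The full run is this value plus the final ramp of unstated rate plus the 5 min hold at 250°C.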

Compounds were identified to class by comparing UV spectra to those of authentic standards or by submitting mass spectra to searches of the NIST/EPA/NIH mass spectral library (NIST02, July 2002). For phenolic compounds, peak area was normalized to original tissue concentration in each MeOH sample (as area / mg fresh tissue) and served as a measure of relative abundance. For terpenoids, peak areas were normalized to tissue concentration in each original MeOH sample and concentrations of each analyte were then calculated relative to the internal tetralin standard.
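One way to picture the terpenoid quantification is the single-point internal-standard calculation sketched below. It is not taken from the source: it assumes a response factor of 1 between each analyte and tetralin, complete transfer of analytes into the hexane phase, and that the 1 mL MeOH aliquot carries a proportional share of the extracted tissue. All names and example numbers are hypothetical.

```python
TETRALIN_NG = 2240.0  # internal standard added to each 1 mL aliquot (from the text)


def analyte_ng(peak_area, tetralin_area, standard_ng=TETRALIN_NG, response_factor=1.0):
    """Analyte mass in the aliquot, scaled from the internal-standard peak.

    response_factor=1.0 is an assumption; the source does not report one.
    """
    return peak_area / tetralin_area * standard_ng * response_factor


def ng_per_mg_tissue(analyte_mass_ng, fresh_mg, solvent_ml, aliquot_ml=1.0):
    """Express the analyte per mg fresh tissue represented in the aliquot."""
    tissue_in_aliquot_mg = fresh_mg * aliquot_ml / solvent_ml
    return analyte_mass_ng / tissue_in_aliquot_mg


# Hypothetical peak areas and sample weight, for illustration only.
mass = analyte_ng(peak_area=500.0, tetralin_area=1000.0)
conc = ng_per_mg_tissue(mass, fresh_mg=750.0, solvent_ml=15.0)
print(mass, conc)
```

The result is in ng/mg, the unit used for bulk terpene content in Tables 3.3 and 3.4.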

2.3. Data analyses

Chromatograms were compared between all samples. Homology of each peak across samples was hypothesized based on retention time and spectral characteristics. For the phenolic dataset, HPLC peaks with a signal intensity area less than 250 were excluded because below this limit, peak purity declined, spectra degraded, and identification to class and assessment of homology across samples became unreliable. Similarly, for the terpene dataset, GC peaks with a signal intensity area below 50 were excluded. All peaks above these limits that could be reliably and consistently identified to one of the four compound classes under study, and for which homology could be reasonably assessed, were included in the study. Data matrices were prepared for the terpene and phenolic datasets. Because standards were not used for the analyses of phenolic content via HPLC, absolute quantities of each peak could not be assessed. Instead, each peak was expressed as the ratio of its area to the total area of all peaks of the same class in each sample. To standardize analyses of both the terpene and phenolic datasets, each terpene peak was similarly expressed as the ratio of its concentration (based on comparison to an internal standard) to the total concentration of all peaks in the same class in each sample. In addition, the total number of peaks and total bulk content (sum of all peaks) in each class were recorded for each sample. Mean ratios of each peak, mean peak number, and mean bulk content for each compound class are presented by chromosome number group and tissue type and (when chromosome number groups comprised more than one taxon) by taxon and tissue type in Table 3.3 and Table 3.4, respectively.
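The per-sample summary described above (threshold filter, ratio of each peak to its class total, peak count, and bulk content) can be sketched as follows. The function name and the peak values are hypothetical; only the cutoffs of 250 (HPLC) and 50 (GC) come from the text.

```python
def summarize_class(peaks, min_area):
    """Summarize one compound class in one sample.

    peaks maps peak name -> area (or concentration); values below min_area
    are dropped, mirroring the exclusion limits given in the text.
    """
    kept = {name: value for name, value in peaks.items() if value >= min_area}
    bulk = sum(kept.values())
    ratios = {name: value / bulk for name, value in kept.items()} if bulk else {}
    return {"ratios": ratios, "n_peaks": len(kept), "bulk": bulk}


# Hypothetical CAD peak areas for a single leaf sample.
sample = {"CAD1": 100.0, "CAD6": 3000.0, "CAD8": 1000.0}
summary = summarize_class(sample, min_area=250.0)
print(summary)
```

Because each ratio is taken within a class and within a sample, the ratios for a given sample sum to 1 whenever at least one peak passes the cutoff.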

Because there was no internal standard used in the HPLC analyses, mean content was relative for CADs and FGs (calculated as the sum of the areas under all CAD or all FG peaks per mg fresh tissue). The internal tetralin standard in the GC/MS analyses allowed estimation of the actual

Table 3.3. Mean ratio of amount of each peak to total amount of its type, bulk abundance of each type, and number of peaks for each compound class by chromosome number group and tissue type (showing mean±standard deviation). Chromosome number groups 14, 30, and 44 each comprise a single species (Cste=C. stellata; Ctri=C. tripterocarpa; Carv=C. arvensis), so chromosome number group is equivalent to taxon in these three instances. Chromosome number groups 18, 32, and 85 each comprise two or more species or subspecies. For these three groups, detailed information for each included taxon is presented in Table 3.4. Sample number by chromosome number group is shown for GC/MS and HPLC analyses (#Floral/#Leaf=number of floral samples/number of leaf samples). F=floral samples; L=leaf samples.

Perennial Polyploid Complex    Annual Polyploid Complex

GC/MS sampling for small terpenes

| 2n | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- |
| #Floral/#Leaf | 2/8 | 19/80 | 2/5 | 4/7 | 5/34 | 28/70 |

Monoterpenes. Mean ratio of each MT to bulk MTs, mean bulk MTs, and mean number of MTs

| Measure | Tissue | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MT1 | F | 0.25±0.05 | 0.29±0.16 | 0.01±0 | 0.34±0.32 | 0.14±0.08 | 0.37±0.32 |
| MT1 | L | 0.19±0.21 | 0.29±0.14 | 0.01±0 | 0.49±0.27 | 0.15±0.12 | 0.36±0.31 |
| MT2 | F | 0.58±0.12 | 0.55±0.17 | 0.69±0.04 | 0.22±0.17 | 0.59±0.11 | 0.21±0.20 |
| MT2 | L | 0.63±0.22 | 0.54±0.16 | 0.62±0.01 | 0.32±0.17 | 0.65±0.19 | 0.19±0.14 |
| MT3 | F | 0.03±0 | 0.03±0.01 | 0.01±0 | 0.04±0.02 | 0.11±0.03 | 0.19±0.23 |
| MT3 | L | 0.03±0.01 | 0.03±0.01 | 0.01±0 | 0.06±0.04 | 0.05±0.02 | 0.12±0.13 |
| MT4 | F | 0.07±0.01 | 0.06±0.02 | 0.07±0 | 0.01±0.02 | 0.05±0.01 | 0.08±0.09 |
| MT4 | L | 0.07±0.02 | 0.05±0.02 | 0.07±0.01 | 0.05±0.04 | 0.03±0.02 | 0.05±0.06 |
| MT5 | F | 0±0 | 0.01±0.01 | 0.01±0 | 0.01±0.01 | 0.01±0.01 | 0.01±0.01 |
| MT5 | L | 0.01±0.01 | 0.01±0.01 | 0.02±0.02 | 0±0.01 | 0.01±0.02 | 0.02±0.02 |
| MT6 | F | 0.04±0.05 | 0.03±0.02 | 0.06±0.03 | 0.35±0.31 | 0.10±0.05 | 0.11±0.24 |
| MT6 | L | 0.05±0.07 | 0.02±0.02 | 0.03±0.01 | 0±0.01 | 0.09±0.09 | 0.12±0.23 |
| MT7 | F | 0.02±0.02 | 0.04±0.04 | 0.15±0.06 | 0.03±0.01 | 0.02±0 | 0.04±0.04 |
| MT7 | L | 0.02±0.02 | 0.05±0.04 | 0.23±0.02 | 0.07±0.12 | 0.02±0.01 | 0.15±0.21 |
| Mean bulk MTs (ng/mg) | F | 199.70±61.87 | 105.52±96.43 | 364.70±154.25 | 17.19±15.72 | 117.64±90.91 | 122.25±316.17 |
| Mean bulk MTs (ng/mg) | L | 45.76±65.57 | 18.68±33.03 | 11.09±2.12 | 2.08±1.90 | 10.20±12.80 | 29.40±36.24 |
| Mean number of MTs | F | 7.0±0 | 6.95±0.23 | 7.0±0 | 6.0±0.82 | 6.8±0.45 | 6.75±0.59 |
| Mean number of MTs | L | 6.75±0.46 | 6.16±1.11 | 7.0±0 | 4.71±1.60 | 5.74±1.08 | 6.47±1.10 |

Sesquiterpenes. Mean ratio of each ST to bulk STs, mean bulk STs, and mean number of STs

| Measure | Tissue | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ST1 | F | 0.12±0.07 | 0±0 | 0±0 | 0±0 | 0.25±0.26 | 0±0.01 |
| ST1 | L | 0.15±0.12 | 0.01±0.05 | 0±0 | 0±0 | 0.16±0.25 | 0±0.01 |
| ST2 | F | 0±0 | 0.06±0.10 | 0±0 | 0±0 | 0.03±0.05 | 0.04±0.08 |
| ST2 | L | 0.04±0.04 | 0.02±0.07 | 0±0 | 0±0 | 0.02±0.07 | 0.01±0.04 |
| ST3 | F | 0.04±0.03 | 0±0 | 0±0 | 0±0 | 0.02±0.02 | 0±0.02 |
| ST3 | L | 0.03±0.03 | 0.01±0.04 | 0±0 | 0±0 | 0.01±0.03 | 0±0.02 |
| ST4 | F | 0±0 | 0.01±0.03 | 0±0 | 0±0 | 0±0 | 0.01±0.02 |
| ST4 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0.01 | 0±0.01 |
| ST5 | F | 0.24±0.02 | 0.21±0.16 | 0±0 | 0.09±0.11 | 0.27±0.15 | 0.23±0.25 |
| ST5 | L | 0.24±0.05 | 0.19±0.28 | 0±0 | 0.09±0.12 | 0.27±0.20 | 0.21±0.26 |
| ST6 | F | 0.19±0.01 | 0.41±0.25 | 0±0 | 0.19±0.25 | 0.28±0.26 | 0.30±0.23 |
| ST6 | L | 0.21±0.08 | 0.29±0.28 | 0±0 | 0.11±0.19 | 0.27±0.25 | 0.30±0.26 |
| ST7 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0.04±0.13 |
| ST7 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0.07±0.18 |
| ST8 | F | 0.29±0.01 | 0.09±0.23 | 0±0 | 0.25±0.29 | 0.04±0.02 | 0.04±0.06 |
| ST8 | L | 0.21±0.08 | 0.02±0.07 | 0±0 | 0.11±0.19 | 0.04±0.11 | 0.02±0.06 |
| ST9 | F | 0.06±0.01 | 0.06±0.07 | 0±0 | 0.20±0.25 | 0.06±0.06 | 0.19±0.26 |
| ST9 | L | 0.09±0.03 | 0.18±0.27 | 0.18±0.25 | 0.11±0.19 | 0.11±0.15 | 0.13±0.13 |
| ST10 | F | 0.04±0.01 | 0.03±0.07 | 0±0 | 0±0 | 0.02±0.04 | 0.07±0.10 |
| ST10 | L | 0.01±0.02 | 0.05±0.15 | 0.09±0.21 | 0±0 | 0.01±0.03 | 0.09±0.15 |
| ST11 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| ST11 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0.01 |
| ST12 | F | 0.02±0 | 0.01±0.03 | 0.50±0.71 | 0.27±0.31 | 0±0 | 0.04±0.08 |
| ST12 | L | 0.03±0.03 | 0.07±0.14 | 0.33±0.47 | 0.30±0.40 | 0.06±0.13 | 0.09±0.17 |
| ST13 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.02±0.03 | 0.01±0.02 |
| ST13 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0.01±0.02 | 0.01±0.03 |
| Mean bulk STs (ng/mg) | F | 9.43±2.97 | 2.13±4.36 | 0.24±0.25 | 0.55±0.42 | 5.13±4.37 | 5.97±15.90 |
| Mean bulk STs (ng/mg) | L | 3.78±5.62 | 0.48±1.09 | 0.04±0.04 | 0.14±0.18 | 1.57±2.71 | 1.29±1.38 |
| Mean number of STs | F | 8.0±0 | 3.32±1.80 | 0.50±0.71 | 2.50±0.58 | 6.60±2.19 | 4.79±2.94 |
| Mean number of STs | L | 6.88±1.13 | 2.43±1.87 | 1.00±1.0 | 1.71±1.38 | 3.91±1.88 | 3.87±2.06 |

HPLC sampling for phenolics

| 2n | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- |
| #Floral/#Leaf | 2/8 | 19/79 | 2/5 | 4/7 | 4/33 | 27/72 |

Caffeic acid derivatives. Mean ratio of each CAD to bulk CADs, mean bulk CADs, and mean number of CADs

| Measure | Tissue | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| CAD1 | F | 0±0.01 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| CAD1 | L | 0.04±0.01 | 0.04±0.02 | 0±0 | 0.04±0.01 | 0±0.01 | 0.02±0.02 |
| CAD2 | F | 0±0 | 0.02±0.01 | 0.01±0.02 | 0±0 | 0±0 | 0±0.01 |
| CAD2 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| CAD3 | F | 0.02±0 | 0.02±0.01 | 0.02±0.01 | 0±0.01 | 0.05±0.01 | 0.01±0.01 |
| CAD3 | L | 0.12±0.03 | 0.14±0.04 | 0.10±0.02 | 0.13±0.03 | 0.05±0.05 | 0.10±0.06 |
| CAD4 | F | 0.07±0.01 | 0.02±0.01 | 0±0 | 0.01±0.02 | 0.01±0.01 | 0.02±0.02 |
| CAD4 | L | 0.23±0.05 | 0.13±0.05 | 0.07±0.01 | 0.14±0.02 | 0.06±0.07 | 0.14±0.08 |
| CAD5 | F | 0.02±0 | 0±0 | 0±0 | 0±0 | 0±0.01 | 0.02±0.04 |
| CAD5 | L | 0.09±0.03 | 0±0.01 | 0±0 | 0±0.01 | 0±0.01 | 0.08±0.17 |
| CAD6 | F | 0.03±0.01 | 0.18±0.06 | 0.25±0.01 | 0.01±0.01 | 0.01±0.01 | 0.04±0.04 |
| CAD6 | L | 0.24±0.12 | 0.51±0.07 | 0.65±0.03 | 0.46±0.05 | 0.24±0.12 | 0.45±0.19 |
| CAD7 | F | 0.01±0.01 | 0±0.01 | 0±0 | 0±0 | 0.01±0 | 0.05±0.04 |
| CAD7 | L | 0.03±0.03 | 0.01±0.02 | 0±0 | 0.01±0.01 | 0±0.02 | 0.02±0.03 |
| CAD8 | F | 0.55±0.01 | 0.55±0.06 | 0.51±0.09 | 0.81±0.05 | 0.40±0.04 | 0.53±0.09 |
| CAD8 | L | 0.23±0.13 | 0.09±0.05 | 0.11±0.02 | 0.08±0.03 | 0.52±0.16 | 0.14±0.14 |
| CAD9 | F | 0±0.01 | 0.03±0.02 | 0.04±0.02 | 0±0 | 0±0 | 0±0.01 |
| CAD9 | L | 0.02±0.03 | 0.07±0.05 | 0.07±0.01 | 0.03±0.03 | 0.01±0.02 | 0.05±0.08 |
| CAD10 | F | 0.07±0.07 | 0.01±0.02 | 0.03±0.04 | 0.02±0.02 | 0.04±0 | 0.01±0.01 |
| CAD10 | L | 0.01±0.02 | 0.01±0.01 | 0±0 | 0.01±0.02 | 0.01±0.02 | 0±0.01 |
| CAD11 | F | 0.01±0.01 | 0.01±0.01 | 0.01±0.01 | 0±0 | 0±0 | 0±0.01 |
| CAD11 | L | 0±0 | 0±0.01 | 0±0 | 0.02±0.02 | 0±0.01 | 0±0 |
| CAD12 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.01±0.01 | 0±0.01 |
| CAD12 | L | 0±0 | 0±0 | 0±0 | 0.02±0.03 | 0±0 | 0±0 |
| CAD13 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| CAD13 | L | 0±0 | 0±0 | 0±0 | 0.03±0.05 | 0±0 | 0±0.01 |
| CAD14 | F | 0.21±0.07 | 0.16±0.04 | 0.12±0 | 0.13±0.07 | 0.43±0.04 | 0.31±0.09 |
| CAD14 | L | 0±0 | 0±0 | 0±0 | 0.02±0.03 | 0.08±0.05 | 0±0.01 |
| CAD15 | F | 0.01±0 | 0±0 | 0±0 | 0.02±0.02 | 0.02±0 | 0.02±0.02 |
| CAD15 | L | 0±0 | 0±0 | 0±0 | 0.01±0.01 | 0±0 | 0±0 |
| CAD16 | F | 0.01±0.01 | 0.01±0.01 | 0.01±0.01 | 0±0 | 0.02±0.01 | 0±0.01 |
| CAD16 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| Mean bulk CADs (area/mg) | F | 35634.42±7856.75 | 30540.88±15439.53 | 14311.32±5010.633 | 16433.20±11538.92 | 28560.50±7314.00 | 16614.718±9421.43 |
| Mean bulk CADs (area/mg) | L | 10640.74±4137.26 | 10306.02±3644.94 | 5927.08±1384.51 | 10464.18±3112.08 | 9844.62±5095.77 | 8138.84±4059.22 |
| Mean number of CADs | F | 11.0±0 | 8.32±1.83 | 7.0±2.83 | 4.5±2.38 | 9.25±0.96 | 6.60±2.69 |
| Mean number of CADs | L | 7.25±1.39 | 6.61±1.36 | 5.0±0 | 8.57±2.07 | 5.27±1.51 | 5.71±2.17 |

Flavonoid glycosides. Mean ratio of each FG to bulk FGs, mean bulk FGs, and mean number of FGs

| Measure | Tissue | 85 | 44 (Carv) | 30 (Ctri) | 14 (Cste) | 18 | 32 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FG1 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.06±0.02 | 0±0 |
| FG1 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG2 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.10±0.06 | 0±0 |
| FG2 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG3 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.14±0.06 | 0.08±0.09 |
| FG3 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0.15±0.33 | 0±0 |
| FG4 | F | 0±0 | 0.17±0.09 | 0.08±0.11 | 0±0 | 0±0 | 0.01±0.03 |
| FG4 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG5 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0.16±0.12 | 0.12±0.12 |
| FG5 | L | 0±0 | 0±0 | 0±0 | 0±0 | 0.13±0.33 | 0±0 |
| FG6 | F | 0.04±0.05 | 0.01±0.04 | 0±0 | 0±0 | 0.11±0.07 | 0.06±0.08 |
| FG6 | L | 0±0 | 0±0 | 0±0 | 0.04±0.08 | 0.05±0.19 | 0.07±0.14 |
| FG7 | F | 0.03±0.04 | 0±0.01 | 0.02±0.03 | 0.01±0.01 | 0.01±0.01 | 0.01±0.02 |
| FG7 | L | 0±0 | 0±0 | 0±0 | 0.02±0.05 | 0±0 | 0±0 |
| FG8 | F | 0.08±0.01 | 0.46±0.14 | 0.09±0.03 | 0.33±0.26 | 0.05±0.02 | 0.26±0.29 |
| FG8 | L | 0±0 | 0±0.01 | 0±0 | 0.16±0.15 | 0.04±0.18 | 0.04±0.18 |
| FG9 | F | 0.06±0.08 | 0.02±0.05 | 0±0 | 0±0 | 0.19±0.04 | 0.05±0.06 |
| FG9 | L | 0.02±0.06 | 0.01±0.05 | 0±0 | 0±0 | 0±0 | 0.20±0.29 |
| FG10 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0.02 |
| FG10 | L | 0±0 | 0±0.01 | 0±0 | 0±0 | 0±0 | 0.02±0.10 |
| FG11 | F | 0.36±0.02 | 0.13±0.08 | 0.43±0.01 | 0.15±0.19 | 0.07±0.03 | 0.07±0.08 |
| FG11 | L | 0.02±0.04 | 0.11±0.09 | 0±0 | 0.26±0.40 | 0±0 | 0±0.02 |
| FG12 | F | 0±0 | 0.02±0.05 | 0±0 | 0.24±0.28 | 0±0 | 0.20±0.28 |
| FG12 | L | 0±0 | 0.01±0.03 | 0±0 | 0.13±0.12 | 0±0 | 0.03±0.13 |
| FG13 | F | 0.03±0.04 | 0.09±0.04 | 0.11±0.04 | 0.04±0.05 | 0±0 | 0.02±0.02 |
| FG13 | L | 0.03±0.06 | 0.35±0.17 | 0.16±0.15 | 0±0 | 0±0 | 0.20±0.31 |
| FG14 | F | 0.23±0.09 | 0.03±0.04 | 0.17±0.12 | 0.03±0.05 | 0±0 | 0.01±0.03 |
| FG14 | L | 0.46±0.15 | 0.49±0.18 | 0.84±0.15 | 0.39±0.34 | 0±0 | 0.04±0.12 |
| FG15 | F | 0.03±0.04 | 0±0.01 | 0.02±0.02 | 0.05±0.04 | 0.05±0.01 | 0.02±0.03 |
| FG15 | L | 0.12±0.08 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG16 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG16 | L | 0.06±0.06 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG17 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG17 | L | 0.18±0.08 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0.03 |
| FG18 | F | 0.16±0.04 | 0.07±0.02 | 0.09±0.01 | 0.14±0.16 | 0.07±0.01 | 0.09±0.08 |
| FG18 | L | 0±0 | 0.01±0.04 | 0±0 | 0±0 | 0±0 | 0.01±0.05 |
| FG19 | F | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 | 0±0 |
| FG19 | L | 0.11±0.08 | 0.01±0.03 | 0±0 | 0±0 | 0±0 | 0±0 |
| Mean bulk FGs (area/mg) | F | 4377.66±1685.88 | 10320.88±5533.01 | 5605.29±2907.48 | 9338.23±2468.25 | 12275.07±3119.86 | 10345.39±6512.83 |
| Mean bulk FGs (area/mg) | L | 2616.20±950.09 | 2604.06±1428.23 | 1047.81±362.45 | 2620.58±1298.92 | 186.44±354.35 | 702.80±769.27 |
| Mean number of FGs | F | 6.50±2.12 | 5.74±0.93 | 6.50±2.12 | 4.75±2.22 | 10.25±0.50 | 6.19±2.37 |
| Mean number of FGs | L | 4.50±1.31 | 2.84±0.99 | 1.60±0.55 | 2.86±1.21 | 0.48±0.76 | 1.40±1.36 |

Table 3.4. Mean ratio of amount of each peak to total amount of its type, bulk abundance of each type, and number of peaks for each compound class by taxon and tissue type (showing mean±standard deviation). Each taxon included in chromosome number groups 18, 32, and 85 is presented here. Since chromosome number groups 14, 30, and 44 each comprise a single species (C. stellata, C. tripterocarpa, and C. arvensis, respectively), information for those species can be found in Table 3.3. Cpac=C. pachysperma; Cpal=C. palaestina; Ceck=C. eckerleinii; Cmar=C. maroccana; Ciinc=C. incana subsp. incana; Cimar=C. incana subsp. maritima; Coff=C. officinalis; Cscar=C. suffruticosa subsp. carbonellii; Csful=C. suffruticosa subsp. fulgida; Cslus=C. suffruticosa subsp. lusitanica. Sample number by taxon is shown for GC/MS and HPLC analyses (#Floral/#Leaf=number of floral samples/number of leaf samples).

Chromosome number groups: 2n=~85 (Cpac, Cpal); 2n=18 (Ceck, Cmar); 2n=32 (Ciinc, Cimar, Coff, Cscar, Csful, Cslus)

GC/MS sampling for small terpenes

| Taxon ID | Cpac | Cpal | Ceck | Cmar | Ciinc | Cimar | Coff | Cscar | Csful | Cslus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| #Floral/#Leaf | 1/4 | 1/4 | 2/18 | 3/16 | 1/5 | 3/11 | 10/18 | 3/10 | 6/11 | 5/15 |

Monoterpenes. Mean ratio of each MT to bulk MTs, mean bulk MTs, and mean number of MTs

| Measure | Tissue | Cpac | Cpal | Ceck | Cmar | Ciinc | Cimar | Coff | Cscar | Csful | Cslus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MT1 | F | 0.21±NA | 0.29±NA | 0.21±0.05 | 0.09±0.04 | 0.03±NA | 0.01±0.01 | 0.63±0.07 | 0.57±0.11 | 0.01±0 | 0.44±0.40 |
| MT1 | L | 0.14±0.14 | 0.25±0.28 | 0.21±0.13 | 0.09±0.05 | 0.07±0.03 | 0.02±0.01 | 0.64±0.05 | 0.65±0.09 | 0.04±0.03 | 0.43±0.32 |
| MT2 | F | 0.66±NA | 0.50±NA | 0.47±0 | 0.67±0.02 | 0.17±NA | 0.10±0.02 | 0.25±0.06 | 0.21±0.08 | 0.33±0.39 | 0.06±0.03 |
| MT2 | L | 0.73±0.13 | 0.53±0.26 | 0.52±0.15 | 0.80±0.08 | 0.26±0.19 | 0.12±0.04 | 0.27±0.05 | 0.21±0.08 | 0.26±0.25 | 0.06±0.04 |
| MT3 | F | 0.03±NA | 0.04±NA | 0.12±0.03 | 0.09±0.02 | 0.01±NA | 0.57±0.07 | 0.06±0.01 | 0.09±0.03 | 0.39±0.28 | 0.08±0.09 |
| MT3 | L | 0.03±0.01 | 0.03±0.02 | 0.06±0.02 | 0.04±0.02 | 0±0 | 0.33±0.12 | 0.05±0 | 0.06±0.01 | 0.08±0.07 | 0.14±0.12 |
| MT4 | F | 0.08±NA | 0.06±NA | 0.06±0 | 0.04±0.01 | 0.01±NA | 0.23±0.05 | 0.01±0 | 0.03±0.02 | 0.17±0.06 | 0.03±0.05 |
| MT4 | L | 0.08±0.01 | 0.06±0.03 | 0.04±0.02 | 0.03±0.02 | 0±0 | 0.12±0.04 | 0.01±0 | 0.02±0.01 | 0.04±0.03 | 0.07±0.08 |
| MT5 | F | 0±NA | 0.01±NA | 0±0 | 0.01±0 | 0.03±NA | 0±0 | 0.01±0 | 0.02±0 | 0±0.01 | 0.02±0.01 |
| MT5 | L | 0.01±0.01 | 0.01±0.01 | 0±0.01 | 0.01±0.03 | 0.01±0.02 | 0.02±0.01 | 0.01±0 | 0.01±0 | 0.01±0.01 | 0.03±0.05 |
| MT6 | F | 0.01±NA | 0.08±NA | 0.13±0.01 | 0.08±0.06 | 0.66±NA | 0.01±0.01 | 0.02±0.01 | 0.07±0.04 | 0.03±0.01 | 0.36±0.45 |
| MT6 | L | 0.01±0 | 0.08±0.10 | 0.16±0.08 | 0.02±0.03 | 0.52±0.13 | 0.05±0.05 | 0.01±0 | 0.03±0.04 | 0.08±0.05 | 0.25±0.38 |
| MT7 | F | 0.01±NA | 0.04±NA | 0.01±0 | 0.02±0 | 0.09±NA | 0.08±0.03 | 0.01±0 | 0.02±0 | 0.07±0.05 | 0.02±0.01 |
| MT7 | L | 0.01±0 | 0.04±0.03 | 0.02±0.01 | 0.01±0.01 | 0.13±0.05 | 0.34±0.14 | 0.01±0 | 0.02±0.01 | 0.49±0.20 | 0.02±0.01 |
| Mean bulk MTs (ng/mg) | F | 259.32±NA | 140.08±NA | 154.16±121.78 | 93.30±53.42 | 96.45±NA | 17.13±19.24 | 35.29±27.66 | 116.75±47.14 | 382.08±617.12 | 55.91±22.44 |
| Mean bulk MTs (ng/mg) | L | 22.83±7.82 | 68.70±87.22 | 5.79±4.98 | 15.16±16.59 | 2.32±2.13 | 28.95±27.39 | 31.32±14.83 | 39.55±22.69 | 6.96±6.57 | 46.14±63.33 |
| Mean number of MTs | F | 7.0±NA | 7.0±NA | 6.5±2.83 | 7.0±1.15 | 7.0±NA | 6.0±1.53 | 7.0±2.02 | 7.0±2.31 | 6.67±2.19 | 6.6±2.56 |
| Mean number of MTs | L | 6.5±0.82 | 7.0±0.50 | 6.0±1.41 | 5.44±2.33 | 4.4±1.48 | 6.82±1.91 | 7.0±1.14 | 7.0±2.23 | 5.73±1.69 | 6.47±4.87 |


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Sesquiterpenes
Mean ratio of each ST to bulk STs:
ST1  F  0.17±NA  0.08±NA  0±0  0.42±0.16  0±NA  0±0  0±0.01  0±0.01  0±0  0.01±0.01
ST1  L  0.19±0.14  0.11±0.11  0±0  0.35±0.25  0±0  0±0  0±0  0±0.01  0±0  0±0.01
ST2  F  0±NA  0±NA  0.05±0.08  0.01±0.01  0±NA  0±0  0±0.01  0.03±0.03  0.13±0.14  0.04±0.05
ST2  L  0.04±0.05  0.04±0.01  0.04±0.09  0±0.01  0±0  0.02±0.04  0±0  0.01±0.02  0.03±0.07  0.01±0.03
ST3  F  0.02±NA  0.07±NA  0.02±0.03  0.02±0  0±NA  0±0  0±0  0.01±0.01  0.01±0.04  0±0.01
ST3  L  0±0  0.05±0.02  0.02±0.03  0.01±0.02  0±0  0.01±0.02  0±0  0±0.01  0.01±0.03  0±0.01
ST4  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0.01±0.02  0±0  0±0  0.01±0.02
ST4  L  0±0  0±0  0±0.01  0±0  0±0  0±0  0±0.01  0±0  0±0  0.01±0.02
ST5  F  0.22±NA  0.25±NA  0.15±0.11  0.35±0.13  0.66±NA  0±0  0.17±0.04  0.71±0.16  0.14±0.25  0.22±0.20
ST5  L  0.26±0.05  0.21±0.03  0.18±0.15  0.37±0.22  0.13±0.21  0.04±0.06  0.11±0.06  0.72±0.10  0.13±0.22  0.18±0.22
ST6  F  0.19±NA  0.19±NA  0.56±0.08  0.09±0.02  0.21±NA  0.37±0.32  0.50±0.07  0.06±0.04  0.25±0.20  0.06±0.09
ST6  L  0.21±0.11  0.21±0.03  0.45±0.22  0.07±0.06  0±0  0.41±0.26  0.54±0.14  0.08±0.10  0.39±0.28  0.11±0.09
ST7  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0.20±0.27
ST7  L  0±0  0±0  0±0  0±0  0±0  0.01±0.03  0±0  0±0  0±0  0.33±0.27
ST8  F  0.28±NA  0.30±NA  0.04±0.05  0.04±0.01  0±NA  0±0  0.02±0.02  0.01±0.01  0.09±0.08  0.06±0.06
ST8  L  0.16±0.06  0.26±0.07  0.02±0.07  0.06±0.15  0.04±0.08  0.05±0.04  0±0.01  0±0  0±0.01  0.06±0.10
ST9  F  0.05±NA  0.06±NA  0.12±0.05  0.02±0.02  0.13±NA  0.26±0.23  0.11±0.07  0.03±0.04  0.26±0.37  0.36±0.41
ST9  L  0.10±0.01  0.08±0.04  0.15±0.13  0.06±0.15  0.22±0.23  0.23±0.13  0.08±0.07  0.01±0.02  0.14±0.16  0.15±0.11
ST10  F  0.04±NA  0.03±NA  0.05±0.07  0±0  0±NA  0.05±0.08  0.18±0.08  0±0.01  0±0  0±0
ST10  L  0±0  0.01±0.03  0.02±0.05  0±0  0±0  0.08±0.15  0.27±0.15  0.04±0.10  0±0  0±0.01
ST11  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
ST11  L  0±0  0±0  0±0  0±0  0±0  0.02±0.03  0±0  0±0  0±0  0±0
ST12  F  0.02±NA  0.02±NA  0±0  0±0.01  0±NA  0±0  0.01±0.03  0.08±0.06  0.12±0.14  0.03±0.03
ST12  L  0.04±0.03  0.01±0.02  0.12±0.16  0.01±0.02  0.41±0.43  0.14±0.16  0±0  0.08±0.06  0.12±0.15  0.06±0.10


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Sesquiterpenes (continued)
Mean ratio of each ST to bulk STs (continued):
ST13  F  0±NA  0±NA  0±0  0.04±0.02  0±NA  0±0  0±0  0.07±0.02  0±0  0.01±0.02
ST13  L  0±0  0±0  0±0  0.01±0.02  0±0  0±0  0±0  0.06±0.04  0±0  0±0.01
Mean bulk STs (ng/mg):
F  12.35±0  6.52±0  2.54±1.40  6.86±4.83  2.00±0  0.22±0.26  2.85±4.65  5.49±2.45  15.00±31.50  5.89±7.09
L  1.16±0.25  6.40±7.04  0.50±0.46  2.77±3.56  0.09±0.05  1.77±1.71  1.49±1.11  2.35±1.58  0.25±0.18  1.16±1.17
Mean number of STs:
F  8.0±NA  8.0±NA  5.0±2.83  7.67±1.15  3.0±NA  1.67±1.53  5.10±2.02  7.33±2.31  4.0±2.19  5.8±4.87
L  6.0±0.82  7.75±0.50  3.72±1.41  4.13±2.33  1.8±1.48  4.36±1.91  4.0±1.14  4.9±2.23  2.45±1.69  4.4±2.56

HPLC sampling for phenolics:
#Floral/#Leaf  1/4  1/4  2/18  2/15  1/5  3/12  9/19  3/10  6/11  5/15

Caffeic acid derivatives
Mean ratio of each CAD to bulk CADs:
CAD1  F  0.01±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
CAD1  L  0.04±0.01  0.04±0.01  0.01±0.01  0±0  0.03±0.03  0±0  0.04±0.02  0.02±0.02  0.01±0.02  0.01±0.02
CAD2  F  0±NA  0±NA  0±0  0±0  0.01±NA  0±0  0±0  0±0  0.01±0.01  0±0
CAD2  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
CAD3  F  0.02±NA  0.02±NA  0.04±0  0.06±0  0.02±NA  0.01±0.01  0±0.01  0.03±0.01  0±0  0±0.01
CAD3  L  0.13±0.04  0.11±0.02  0.07±0.05  0.03±0.02  0.14±0.03  0.05±0.04  0.15±0.02  0.11±0.04  0.06±0.06  0.09±0.06
CAD4  F  0.08±NA  0.07±NA  0.02±0  0.01±0.01  0.05±NA  0.01±0.01  0.02±0.02  0.03±0.01  0.02±0.02  0.01±0.01
CAD4  L  0.21±0.04  0.25±0.05  0.10±0.07  0.02±0.03  0.18±0.04  0.05±0.05  0.21±0.04  0.11±0.05  0.13±0.06  0.11±0.07
CAD5  F  0.02±NA  0.02±NA  0±0  0.01±0.01  0.01±NA  0±0  0.01±0.01  0±0  0.07±0.06  0±0
CAD5  L  0.08±0.02  0.10±0.03  0.01±0.02  0±0  0.01±0.01  0±0  0.07±0.03  0±0  0.37±0.28  0±0


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Caffeic acid derivatives (continued)
Mean ratio of each CAD to bulk CADs (continued):
CAD6  F  0.03±NA  0.02±NA  0.01±0.01  0.01±0.01  0.03±NA  0.08±0.09  0.01±0.02  0.07±0.01  0.02±0.02  0.05±0.02
CAD6  L  0.33±0.09  0.15±0.04  0.30±0.13  0.17±0.06  0.45±0.08  0.54±0.12  0.30±0.04  0.37±0.09  0.30±0.18  0.72±0.10
CAD7  F  0.01±NA  0.02±NA  0.02±0  0.01±0  0.02±NA  0.04±0.03  0.03±0.02  0.06±0.03  0.02±0.01  0.09±0.07
CAD7  L  0±0  0.05±0.01  0.01±0.02  0±0  0±0  0.04±0.03  0.05±0.02  0.01±0.01  0.01±0.02  0±0
CAD8  F  0.55±NA  0.54±NA  0.39±0.02  0.40±0.07  0.39±NA  0.51±0.12  0.57±0.05  0.55±0.01  0.50±0.06  0.50±0.15
CAD8  L  0.21±0.15  0.25±0.12  0.44±0.14  0.62±0.11  0.15±0.07  0.11±0.11  0.12±0.06  0.37±0.18  0.07±0.08  0.07±0.09
CAD9  F  0±NA  0.01±NA  0±0  0±0  0±NA  0.01±0.02  0±0  0±0  0±0  0±0
CAD9  L  0±0  0.05±0.01  0.01±0.02  0±0.01  0.03±0.03  0.20±0.10  0.04±0.02  0.01±0.01  0.03±0.05  0±0
CAD10  F  0.01±NA  0.12±NA  0.04±0  0.03±0  0.01±NA  0±0  0.01±0.01  0±0.01  0.01±0.01  0.02±0.01
CAD10  L  0±0  0.02±0.02  0±0.02  0.02±0.01  0±0  0.01±0.03  0±0  0±0  0±0  0±0.02
CAD11  F  0.01±NA  0±NA  0±0  0±0  0±NA  0.01±0.01  0±0  0±0  0±0  0±0.01
CAD11  L  0±0  0±0  0±0  0.01±0.01  0±0  0±0  0±0  0±0  0±0.01  0±0
CAD12  F  0±NA  0±NA  0.01±0.01  0.01±0  0.01±NA  0.01±0.01  0±0  0.01±0.01  0±0  0±0
CAD12  L  0±0  0±0  0±0  0±0.01  0±0  0±0  0±0  0±0  0±0  0±0
CAD13  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
CAD13  L  0±0  0±0  0±0  0±0  0.02±0.03  0±0  0±0  0±0  0±0  0±0
CAD14  F  0.25±NA  0.16±NA  0.43±0.05  0.42±0.06  0.44±NA  0.30±0.04  0.30±0.05  0.24±0.03  0.34±0.13  0.32±0.11
CAD14  L  0±0  0±0  0.05±0.05  0.11±0.03  0±0  0±0  0.01±0.02  0±0  0.01±0.02  0±0
CAD15  F  0.01±NA  0.01±NA  0.02±0.01  0.02±0  0±NA  0.01±0.01  0.03±0.02  0.01±0.01  0.01±0.01  0.01±0.01
CAD15  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0.01  0±0  0±0.01  0±0
CAD16  F  0±NA  0.02±NA  0.03±0  0.02±0  0.01±NA  0.01±0.01  0.01±0.01  0±0  0±0.01  0±0
CAD16  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0.01  0±0


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Caffeic acid derivatives (continued)
Mean bulk CADs (area/mg):
F  41189.98±NA  30078.86±NA  22724.64±4733.96  34396.37±1358.00  36930.27±NA  5141.02±4764.81  11793.93±8102.39  11555.80±5766.48  8483.51±5858.39  13349.95±4989.72
L  8473.78±1978.72  12807.70±4847.90  6993.15±2243.66  13266.39±5499.21  8374.73±3562.63  190.34±309.62  1345.01±708.11  845.68±775.52  608.33±849.73  179.07±327.04
Mean number of CADs:
F  11.00±NA  11.00±NA  9.00±1.41  9.50±0.71  11.00±NA  8.00±4.36  5.89±3.30  7.67±1.15  6.17±1.83  6.00±1.41
L  6.00±0  8.50±0.58  4.78±1.40  5.87±1.46  5.80±1.10  5.17±1.27  8.00±1.33  5.00±1.15  5.91±2.55  3.53±1.30

Flavonoid glycosides
Mean ratio of each FG to bulk FGs:
FG1  F  0±NA  0±NA  0.04±0.01  0.07±0.01  0±NA  0±0  0±0  0±0  0±0  0±0
FG1  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG2  F  0±NA  0±NA  0.06±0.03  0.15±0.05  0±NA  0±0  0±0  0±0  0±0  0±0
FG2  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG3  F  0±NA  0±NA  0.17±0.02  0.12±0.10  0.25±NA  0±0  0.11±0.06  0.10±0.10  0±0  0.13±0.11
FG3  L  0±0  0±0  0.20±0.32  0.60±0.49  0±0  0±0  0±0  0±0  0±0  0±0
FG4  F  0±NA  0±NA  0±0  0±0  0±NA  0.08±0.07  0±0  0±0  0±0  0±0
FG4  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG5  F  0±NA  0±NA  0.26±0.03  0.05±0.03  0.11±NA  0±0  0.26±0.05  0.10±0.08  0.07±0.07  0.03±0.03
FG5  L  0±0  0±0  0.73±0.43  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG6  F  0±NA  0.07±NA  0.07±0  0.16±0.08  0.14±NA  0±0  0.08±0.04  0.04±0.05  0.10±0.13  0±0
FG6  L  0±0  0±0  0.03±0.07  0.24±0.41  0.30±0.20  0±0  0.17±0.15  0.06±0.11  0±0  0.10±0.23
FG7  F  0±NA  0.06±NA  0±0  0.01±0.02  0±NA  0±0  0±0  0.02±0.03  0.03±0.05  0±0
FG7  L  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG8  F  0.08±NA  0.07±NA  0.05±0.02  0.04±0.03  0.16±NA  0.70±0.26  0.04±0.04  0.10±0.04  0.15±0.14  0.65±0.13
FG8  L  0±0  0±0  0.03±0.08  0.17±0.41  0±0  0.71±0.34  0±0  0±0  0±0  0.05±0.11


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Flavonoid glycosides (continued)
Mean ratio of each FG to bulk FGs (continued):
FG9  F  0.12±NA  0±NA  0.19±0.01  0.19±0.06  0±NA  0±0  0.12±0.06  0.05±0.04  0±0.01  0±0
FG9  L  0.04±0.08  0±0  0±0  0±0  0.05±0.10  0±0  0.56±0.21  0.45±0.30  0±0  0.08±0.17
FG10  F  0±NA  0±NA  0±0  0±0  0±NA  0.03±0.06  0±0  0±0  0±0  0±0
FG10  L  0±0  0±0  0±0  0±0  0.38±0.27  0±0  0±0  0±0  0±0  0±0
FG11  F  0.38±NA  0.34±NA  0.05±0  0.08±0.05  0.10±NA  0.07±0.06  0.04±0.04  0.06±0.02  0.15±0.13  0.03±0.03
FG11  L  0.03±0.05  0.02±0.03  0±0  0±0  0±0  0±0  0.01±0.03  0±0  0±0  0±0
FG12  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0.25±0.25  0.41±0.38  0.25±0.40  0.06±0.09
FG12  L  0±0  0±0  0±0  0±0  0.05±0.10  0±0  0±0  0±0  0.28±0.37  0±0
FG13  F  0±NA  0.05±NA  0±0  0±0  0±NA  0±0  0.02±0.03  0±0  0.03±0.03  0.02±0.02
FG13  L  0±0  0.07±0.08  0±0  0±0  0±0  0.12±0.24  0.25±0.14  0.28±0.40  0.57±0.47  0.77±0.31
FG14  F  0.29±NA  0.16±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0.05±0.03  0±0
FG14  L  0.56±0.15  0.36±0.08  0±0  0±0  0.17±0.20  0.09±0.18  0.01±0.05  0.21±0.26  0.08±0.13  0±0
FG15  F  0±NA  0.05±NA  0.04±0  0.06±0.01  0.05±NA  0.03±0.04  0.01±0.02  0.05±0.01  0.03±0.03  0.01±0.02
FG15  L  0.12±0.08  0.12±0.09  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG16  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
FG16  L  0±0  0.11±0.02  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0
FG17  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
FG17  L  0.19±0.03  0.16±0.11  0±0  0±0  0±0  0±0  0±0  0±0  0.04±0.10  0±0
FG18  F  0.14±NA  0.19±NA  0.08±0.01  0.07±0  0.19±NA  0.09±0.09  0.06±0.06  0.08±0.06  0.14±0.11  0.06±0.06
FG18  L  0±0  0±0  0±0  0±0  0.05±0.10  0.08±0.15  0±0  0±0  0.02±0.06  0±0
FG19  F  0±NA  0±NA  0±0  0±0  0±NA  0±0  0±0  0±0  0±0  0±0
FG19  L  0.06±0.08  0.17±0.02  0±0  0±0  0±0  0±0  0±0  0±0  0±0  0±0


Table 3.4. (Continued)
2n=~85: Cpac, Cpal; 2n=18: Ceck, Cmar; 2n=32: Ciinc, Cimar, Coff, Cscar, Csful, Cslus
Taxon ID:  Cpac  Cpal  Ceck  Cmar  Ciinc  Cimar  Coff  Cscar  Csful  Cslus

Flavonoid glycosides (continued)
Mean bulk FGs (area/mg):
F  3185.56±NA  5569.76±NA  12181.46±4637.23  12368.68±2767.96  5438.88±NA  5141.02±4764.81  11793.93±8102.39  11555.80±5766.48  8483.51±5858.39  13349.95±4989.72
L  2049.21±654.06  3183.19±906.18  228.72±446.23  135.71±200.42  985.62±707.29  190.34±309.62  1345.01±708.11  845.68±775.52  608.33±849.73  179.07±327.04
Mean number FGs:
F  5.00±NA  8.00±NA  10.00±0  10.50±0.71  7.00±NA  3.67±2.52  6.78±2.28  8.00±1.73  6.00±2.83  5.60±1.67
L  3.75±1.26  5.25±0.96  0.50±0.86  0.47±0.64  2.20±1.48  0.58±1.00  2.58±0.90  1.40±1.17  1.09±1.51  0.53±0.92


concentration of MTs and STs (calculated as the sum of the concentrations of all MT or all ST peaks in ng per mg fresh tissue).
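The per-sample quantities tabulated in this chapter (bulk abundance, number of peaks, and the ratio of each peak to the bulk) can be illustrated with a minimal Python sketch. It is not the code used in this study; the function name `summarize_sample` and the peak values are hypothetical and chosen only to show the arithmetic.

```python
def summarize_sample(peaks):
    """Summarize one sample given {peak name: concentration in ng/mg}.

    Returns (bulk concentration, number of detected peaks, peak proportions).
    """
    bulk = sum(peaks.values())                          # sum over all peaks
    n_detected = sum(1 for v in peaks.values() if v > 0)
    if bulk == 0:
        # No detectable compounds: every proportion is reported as zero.
        proportions = {name: 0.0 for name in peaks}
    else:
        proportions = {name: v / bulk for name, v in peaks.items()}
    return bulk, n_detected, proportions


# Hypothetical monoterpene concentrations (ng/mg) for a single sample.
sample = {"MT1": 12.0, "MT2": 30.0, "MT3": 0.0, "MT4": 6.0}
bulk, n_detected, proportions = summarize_sample(sample)
print(bulk, n_detected, proportions)
```

The proportions returned here are the kind of values that a 100% stacked column chart displays, one column per sample.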

Taxa were considered by chromosome group rather than individually for many analyses. This approach is justified because these chromosome number categories likely represent natural groups (i.e., members of each group are more closely related to each other than they are to members of other groups). However, analyses by taxon were sometimes conducted when it appeared that observed variation would be compelling or better explained by differences between taxa than by differences between chromosome groups. Ray, disc, and capitulum samples were considered separately for statistical comparison to each other, but the three types were collapsed into a single category, “Floral,” for comparison to leaf samples.

Mean number of peaks and mean abundance (relative abundance for phenolics) for each of the four compound classes were separated into annual and perennial data matrices, the mean abundance data were log-transformed, and each matrix was subjected to ANOVA in base R (R Core Team 2014) to assess significance of global differences in group means. Tukey’s HSD tests (Tukey 1949) were then performed to assess significance of pairwise comparisons. In order to evaluate variation in chemical profiles across groups, non-metric multidimensional scaling (using “metaMDS” in R package vegan, Oksanen et al. 2013) was used to visualize Bray-Curtis dissimilarities (Bray and Curtis 1957) between samples for each compound class in two dimensions. The Bray-Curtis dissimilarity matrices were calculated from the matrices of relative abundance of each peak in each sample (with a call to function vegdist by metaMDS in vegan). Ordispider (R; vegan) was used to show group membership and group medians of each plotted sample. Goodness of fit between the Bray-Curtis distances and the representation of distances in ordination space was determined by stress tests incorporated in each metaMDS call.

To assess significant differences in chemical composition across all groups, permutational multivariate analyses of variance (PERMANOVA) using dissimilarity matrices (adonis in R package vegan) were performed on leaf and floral matrices for each of the four compound classes. Adonis is robust to unbalanced designs and departures from normality (Oksanen 2013), and so was preferred over standard anova or amova for the chemical profile data, even though adonis in R has no implementation of pairwise comparisons of the groups included in a global analysis. For pairwise comparisons of particular interest, matrices were further subdivided and these subsets were analyzed in adonis. To test the possible effects of differences in dispersion around group means, mean distances to group medians were also calculated from the Bray-Curtis dissimilarity matrices using betadisper (R; vegan; implementing PERMDISP2 of Anderson et al. 2006) and then pairwise permutation tests for homogeneity of multivariate dispersions were performed using permutest (R; vegan). Ratio data for each compound class were also visualized in 100% stacked column charts showing percentages of every compound in every sample.
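As an illustration of the dissimilarity-based test described above, the following pure-Python sketch computes Bray-Curtis dissimilarities and a one-factor permutational pseudo-F statistic. It is a simplified stand-in, not the vegan implementation used in this study: the function names, the example profiles, the group labels, and the number of permutations are all assumptions made for the example.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den > 0 else 0.0


def distance_matrix(profiles):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    return [[bray_curtis(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def pseudo_f(dist, groups):
    """One-factor pseudo-F computed from squared dissimilarities."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx)
                         for j in idx[k + 1:]) / len(idx)
    if ss_within == 0:
        return float("inf")  # no within-group variation at all
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value) by shuffling labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical relative-abundance profiles (three peaks, six samples).
profiles = [[0.60, 0.30, 0.10], [0.55, 0.35, 0.10], [0.65, 0.25, 0.10],
            [0.10, 0.30, 0.60], [0.15, 0.25, 0.60], [0.10, 0.35, 0.55]]
groups = ["A", "A", "A", "B", "B", "B"]
f_obs, p_value = permanova(distance_matrix(profiles), groups)
print(f_obs, p_value)
```

With only six samples the number of distinct label arrangements is small, so the permutation p-value cannot be very low however well separated the groups are; this is one reason small groups limit the power of such tests.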

3. Results

3.1. Monoterpenes

Seven MT peaks (MT1–MT7) were detected. All seven peaks were present in 83% of floral samples and 55% of leaf samples. Mean number of peaks by chromosome group (Fig. 3.1a, b) did not fall below six except in leaf tissue of chromosome groups 14 (4.71±1.60) and 18 (5.74±1.08; see Table 3.3). The difference was significant (p<0.05) despite the fact that at least some samples in every chromosome group had all seven peaks. There were no other significant differences in peak number among groups. MT2 was never absent (and had the highest mean abundance across the genus). MT5 was most likely to be absent (and had the lowest mean abundance).

Figure 3.1. Mean MT number (a, b; y-axis is number of peaks detected) and bulk abundance (c, d; y-axis is log10-transformed concentration of combined MTs in ng/mg fresh tissue) by chromosome number and tissue (F=floral; L=leaf) in the annual and perennial polyploid complexes. Lower-case letters below x-axis labels denote significant differences at p≤0.05 (based on ANOVA and Tukey’s HSD tests).

Mean MT abundance varied greatly both within and across groups. Mean abundance (Fig. 3.1c, d) was significantly higher in floral tissue than in leaf tissue overall in both complexes, and was also significantly higher in floral tissue within most chromosome groups. There was an increasing trend in mean abundance in both floral and leaf tissue among chromosome groups 14, 44, and 85 in the annual complex and in leaf tissue among all groups in the perennial complex, but the only significant differences were that C. stellata (chromosome group 14) had the lowest MT concentration among annual flowers, annual leaves, and perennial leaves. The mean concentration in flowers of C. tripterocarpa (group 30) was higher than in other annual groups (but not significantly so). Floral MT concentrations in groups 44 and 85 were not significantly different from those in group 30. Floral mean abundance was not significantly different between groups in the perennial complex, but the polyploid perennials had significantly higher concentrations in their leaves.
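The global comparisons of abundance reported above rest on log-transformed values. The following minimal Python sketch shows how a one-way ANOVA F statistic is obtained from log10-transformed concentrations. The concentrations are hypothetical, the function name is an assumption, and the sketch stops at the F statistic: obtaining a p-value requires the F distribution (as provided by R or a statistics library), and Tukey’s HSD comparisons are not shown. All example values are positive because this sketch does not address how zero concentrations would be handled before transformation.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical bulk concentrations (ng/mg) for three chromosome groups.
raw = {
    "group_14": [12.0, 20.0, 15.0],
    "group_44": [60.0, 95.0, 80.0],
    "group_85": [150.0, 260.0, 210.0],
}
logged = [[math.log10(x) for x in vals] for vals in raw.values()]
print(one_way_anova_f(logged))
```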

Compared to the wide variation in mean abundance across tissue types and chromosome groups, the proportions of each peak appeared more stable. This was most evident in the 100% stacked column chart (Fig. 3.2). Similar patterns were shared across some taxa. Either MT1 or MT2 was dominant in most samples across the genus. The MT profile of C. arvensis (group 44) appeared to be a combination of the profiles of C. stellata and C. tripterocarpa (groups 14 and 30). This was most apparent in leaf samples. Within group 18, C. eckerleinii had a higher proportion of MT6 and C. maroccana had a very high proportion of MT2 and very little MT6 and MT7 (which was also true for some individuals of C. arvensis). The highest proportions of MT6 were found in perennial polyploid subspecies C. incana subsp. incana and one accession of C. suffruticosa subsp. lusitanica. High proportions of both MT3 and MT7 in leaf tissue and of MT3 and MT4 (with a reduction in MT7) in floral tissue were shared by C. incana subsp. maritima and C. suffruticosa subsp. fulgida. These taxa were exceptional in the genus in that their MT profiles differed between leaf and floral tissue. C. officinalis had the most consistent pattern of peak ratios across samples, with high proportions of MT1 and remarkably stable, though low, proportions of MT3–MT7. This pattern was shared by C. suffruticosa subsp. carbonellii. The most variable taxon was C. suffruticosa subsp. lusitanica, all three accessions of which differed markedly from each other.

Figure 3.2. 100% stacked column chart showing proportion of each MT peak in each sample. Floral samples (F) are to the left of the dividing line above, and leaf samples (L) are to the right. Floral and leaf samples are further subdivided by chromosome number group (below) and taxon (above). Taxa are only labeled for leaf samples but taxon divisions are indicated for floral samples and are in the same order as leaf samples.

The NMDS plots of MT profiles across chromosome groups, tissue types, and taxa (Fig. 3.3a-f) were consistent with patterns in Fig. 3.2. Most samples (points in Fig. 3.3) belonging to group 32 were divergent from other samples, and differed most by increases in MT3, MT6, and MT7 (Fig. 3.3a, e). Particularly in leaves (Fig. 3.3e), the remaining samples appeared less well differentiated by group and instead formed a continuum of divergence from group 14, through group 44, to groups 18 and 85. Grouping points by taxon (Fig. 3.3c, f) rather than chromosome group reiterated that the most divergent taxa were C. suffruticosa subsp. fulgida, C. incana subsp. incana, C. incana subsp. maritima, and C. suffruticosa subsp. lusitanica. The MT profiles of C. officinalis and C. suffruticosa subsp. carbonellii were most like those of some samples of C. stellata. Leaf tissue did not differ from floral tissue in MT composition (Fig. 3.3d), and the groupings by chromosome group in floral tissue (Fig. 3.3a) looked much like those in leaves (although the projection in Fig. 3.3a is flipped and rotated). Grouping the floral samples by collection type (ray, disc, or whole capitulum; Fig. 3.3b) indicated a divergence of both ray and disc florets from whole capitula which was significant (p<0.01), but this was confounded by the fact that all ray and disc collections were made from groups 14 and 32 (i.e., the significant result is indicative of a chromosome number group difference, not necessarily a tissue difference in this case). Rays and discs did not differ from each other, nor did floral tissue differ from leaf tissue. It was clear from the plots in Fig. 3.3 that there were large differences in dispersions around the median of each group (betadispersion), which could not be entirely explained by differences in sampling (i.e., one might expect greater variation with greater sampling). For example, samples from group 32 were spread widely in all directions from the group centroid, while most of those from group 44 were clustered near the group centroid, despite the fact that both groups 32 and 44 were extensively sampled. The difference in betadispersions of MT profiles between groups 32 and 44 was not significant in either leaf or floral tissue, however, likely due to the presence of a few more widely diverged samples in group 44. In general, group 32 was much more variable than group 44 across samples.

Figure 3.3. NMDS plots of Bray-Curtis dissimilarities between the MT profiles of each sample in Calendula. Points represent samples and MT1-MT7 are MT peaks. The closer peaks are to samples, the more abundant those peaks are in those samples relative to their abundance in other samples from which they are further. Plots show floral samples (a-c), all tissue (d), or leaf tissue (e-f), grouped by chromosome number (a, e), tissue (b, d), or taxon (c, f). Global significance (from adonis) as well as significant pairwise differences (p≤0.05) are indicated below each plot. “np” indicates that no pairwise tests were conducted. “ns” indicates no significant differences. Taxon codes in (c) and (f) indicate the following: Carv=C. arvensis; Ceck=C. eckerleinii; Ciinc=C. incana subsp. incana; Cimar=C. incana subsp. maritima; Cmar=C. maroccana; Coff=C. officinalis; Cpac=C. pachysperma; Cpal=C. palaestina; Cste=C. stellata; Cscar=C. suffruticosa subsp. carbonellii; Csful=C. suffruticosa subsp. fulgida; Cslus=C. suffruticosa subsp. lusitanica; Ctri=C. tripterocarpa.

Fig. 3.3 panel annotations: (a) Global p<0.001, np; (b) Global p=0.005, Capitulum≠Ray, Capitulum≠Disc; (c) Global p<0.001, np; (d) ns; (e) Global p<0.001, np; (f) Global p<0.001, np.
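The idea of group dispersion discussed above can be illustrated with a simplified Python sketch that computes each sample's distance to its group centroid directly from a dissimilarity matrix. This is a swapped-in simplification and not the betadisper procedure used in this study: it uses the group centroid rather than the spatial median, it relies on an identity that is exact only for Euclidean distances, and it truncates negative squared distances (which can arise with non-Euclidean measures such as Bray-Curtis) to zero. The function names and the example coordinates are hypothetical.

```python
import math


def distances_to_centroid(dist, members):
    """Distance of each listed sample to the centroid of those samples.

    For Euclidean distances, the squared distance of sample i to the
    centroid equals mean_j d(i, j)^2 minus the sum of d(j, k)^2 over all
    ordered pairs divided by 2 * n^2.
    """
    n = len(members)
    pair_term = sum(dist[j][k] ** 2
                    for j in members for k in members) / (2 * n * n)
    result = []
    for i in members:
        squared = sum(dist[i][j] ** 2 for j in members) / n - pair_term
        result.append(math.sqrt(max(squared, 0.0)))  # truncate negatives
    return result


def mean_dispersion(dist, groups):
    """Mean distance to centroid for each group label."""
    out = {}
    for g in sorted(set(groups)):
        members = [i for i, label in enumerate(groups) if label == g]
        d = distances_to_centroid(dist, members)
        out[g] = sum(d) / len(d)
    return out


# Hypothetical one-dimensional positions: a tight group and a spread group.
coords = [0.0, 0.1, 0.2, 1.0, 2.0, 3.0]
groups = ["tight", "tight", "tight", "spread", "spread", "spread"]
dist = [[abs(a - b) for b in coords] for a in coords]
print(mean_dispersion(dist, groups))
```

In this toy example the "spread" group has the larger mean distance to its centroid, which is the kind of contrast described above between groups 32 and 44.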

3.2. Sesquiterpenes

Thirteen ST peaks (ST1–ST13) were detected. No sample had all 13 peaks and no peak was always detected. Peaks ST6, ST5, and ST9 were the most frequently detected (in that order), while ST11 was detected only in two samples of one accession of C. incana subsp. maritima. Number of peaks detected varied across tissue and chromosome number groups (Fig. 3.4a, b). Across all samples, the average number of peaks per floral sample was about four (4.28±2.71) but ranged from zero to nine, while the average number per leaf sample was about three (3.28±2.17) and ranged from zero to 11, though there were few significant differences among groups or among tissues within groups. Only the high polyploids (group 85) had a significantly higher number of peaks than other groups, which was true in both tissue types. The pattern was much the same for mean ST concentration (Fig. 3.4c, d). Sesquiterpene abundance in the high polyploids (group 85) was significantly higher than in other groups in the annual complex in both leaf and floral tissue. Within group 85, concentration was significantly higher in floral than in leaf tissue. In the perennial complex, ST concentration was higher in floral tissue than in leaf tissue of group 18. There were no other significant differences.

Figure 3.4. Mean ST number (a, b) and bulk abundance (c, d) by chromosome number and tissue in the annual and perennial polyploid complexes. Labeling conventions are as described in Fig. 3.1.

Sesquiterpene profiles (see Fig. 3.5) varied considerably across samples, particularly within group 44 (in marked contrast to MT profiles in this group; Fig. 3.2). Profiles from C. officinalis and C. suffruticosa subsp. carbonellii (group 32) were relatively stable compared with other taxa, and each differed from the other. An increase in ST10 was characteristic of the former and a predominance of ST5 was characteristic of the latter. Peak ST1 was dominant in C. maroccana. Peak ST7 was found primarily in C. suffruticosa subsp. lusitanica and was rare in other samples. Patterns were fairly stable between tissue types from the same taxa.

NMDS plots showed considerable overlap in ST profiles between groups and tissue (Fig. 3.6), yet divergence of some chromosome groups from others was indicated. In floral tissue, groups 85 and 18 were more similar to each other, groups 32 and 14 were more similar, and group 44, though it had almost complete overlap with group 32, was pulled left and down by a single divergent sample. Group 30, represented by a single sample in this analysis, had only a single peak, ST12 (Fig. 3.6a). In leaves, groups 32 and 44 were completely overlapping, while group 14 was slightly diverged. Groups 85 and 18 remained close to each other, and group 30 was the most divergent from other groups (again, either by the singular presence or by the dominance of ST12). Samples from different types of floral tissue did not differ significantly from each other (Fig. 3.6b), nor did floral samples differ from leaf samples (Fig. 3.6d). Group dispersions in leaves and flowers were equivalent except that group 85 had lower betadispersion than all other chromosome groups in leaves (p≤0.01).

Figure 3.5. 100% stacked column chart showing proportion of each ST peak in each sample. Labeling conventions are as described in Fig. 3.2 with the following addition: white columns indicate the absence of detectable STs in these samples.

Figure 3.6. NMDS plots of Bray-Curtis dissimilarities between ST profiles in Calendula. Labeling conventions are as described in Fig. 3.3 except that S1-S13 are ST peaks. Plots show floral samples (a, b), leaf samples (c), and all tissue samples (d) grouped by chromosome number (a, c) or tissue (b, d).

Fig. 3.6 panel annotations: (a) Global p<0.001, np; (b) ns; (c) Global p<0.001, np; (d) ns.

3.3. Caffeic acid derivatives

Sixteen CAD peaks were detected (CAD1–CAD16). On average, about seven (7.36±2.59) peaks were detected in floral samples and about six (6.13±1.86) were detected in leaf samples, although peak number per sample ranged from two to 11 in both tissues. Peaks CAD8, CAD6, CAD4, and CAD3 were the most frequently detected (in that order), but only CAD8 occurred in almost all samples of both tissue types (94% of leaf samples and 100% of floral samples). CAD6, CAD4, and CAD3 occurred in over 90% of leaf samples but only in 76%, 72%, and 62% (respectively) of floral samples. Many CAD peaks were detected more frequently (or only) in one tissue type. There was an increasing trend in number of peaks in floral tissue in the annual complex (Fig. 3.7a). Groups 44 and 85 had significantly more peaks than group 14, but all three groups were equivalent to group 30. Number of peaks in leaves was significantly higher in group 14. Among perennials (Fig. 3.7b), group 18 had significantly more peaks than group 14 in floral tissue, and group 32 was equivalent to both 14 and 18. The floral mean of group 32 was intermediate between groups 14 and 18, though the range encompassed both. Group 14 had more peaks in leaves than groups 18 and 32 (which did not differ from each other). Mean number of peaks was higher in floral tissue than in leaf tissue within groups 44 and 85. There were fewer peaks in floral tissue than in leaf tissue of group 14. Number of peaks was equivalent between tissues in other groups.

Abundance of CADs was higher in floral tissue than in leaf tissue (Fig. 3.7c), although the difference was not significant in groups 14 and 30. Again, there was an increasing trend in relative abundance in annual floral tissue, but only group 44 was significantly higher than group 14. Perennial groups did not differ from each other within tissue types in relative abundance of CADs (Fig. 3.7d).

Figure 3.7. Mean CAD number (a, b) and bulk abundance (c, d) by chromosome number and tissue in the annual and perennial polyploid complexes. Labeling conventions are as described in Fig. 3.1 except that the y-axis for (c, d) is the log-transformed relative abundance (peak area/mg fresh tissue) rather than absolute abundance (ng/mg fresh tissue).

Caffeic acid derivative profiles within samples (Fig. 3.8) were relatively stable within taxa and within tissues, but not across tissues. The CAD profile was quite different between floral and leaf samples of most individuals. The presence of minor peaks CAD15 and CAD16, an increase in the proportions of CAD8 and CAD14, and a reduction in CAD9 were characteristic of floral tissue in both complexes. Leaf samples from group 18 were more similar to floral samples than were leaf samples from other groups. In particular, they appeared most similar to floral samples from the annual complex. Calendula officinalis and C. suffruticosa subsp. fulgida (group 32), as well as the annual high polyploids (group 85), had a higher proportion of CAD5 in leaf tissue than did other taxa, with C. suffruticosa subsp. fulgida having the highest ratios of this compound. Much higher ratios of CAD5 and CAD14 in floral tissue in perennial taxa, coupled with the (near) absence of CAD2 and CAD9 (both of which were prevalent in annual floral tissue), distinguished the perennial from the annual taxa.

NMDS of CAD profiles in floral tissue by chromosome group (Fig. 3.9a) showed samples from the six different chromosome groups in more discrete clusters than was the case for other compound classes. Annual polyploid groups 44 and 85 were intermediate between groups 30 and

14. Perennial polyploid group 32 was intermediate between groups 14 and 18. Group 14 was most divergent from other groups. Group 85 was not highly divergent from either group 44 or group 32, but was distinct. Comparison to the same plot grouped by floral collection type (Fig.

3.9b) showed that, once again, taxon differences may be confounded with tissue differences

(because ray and disc collections were only made from groups 14 and 44). It is possible that

145

Figure 3.8. 100% stacked column chart showing proportion of each CAD peak in each sample. Labeling conventions are as described in Fig. 3.2.

146

(a) Global p<0.001, np (b) Global p<0.001, ray≠capitulum, ray≠disc, disc≠capitulum

(c) Global p<0.001, np (d) Global p<0.001, floral≠leaf

Figure 3.9. NMDS plots of Bray-Curtis dissimilarities between CAD profiles in Calendula. Labeling conventions are as described in Fig. 3.3 except that C1-C16 are CAD peaks. Plots show floral samples (a,b), leaf samples (c), and all tissue samples (d) grouped by chromosome number (a,c) or tissue (b, d).

groups 44 and 30 differed from groups 32 and 14 because capitula (collected for the former two)

differed from rays and discs (collected from the latter two). The high proportion of CAD6 and

the prevalence of CAD9 in floral tissue of groups 30 and 44 (Fig. 3.8) was also a signature of

leaf samples from those groups, and so could have resulted from the inclusion of phyllaries in

capitulum samples. CAD2, however, still distinguished floral tissue of groups 30 and 44 from

147

other groups, and was not present in leaves. Also, only capitulum collections were made for

groups 18 and 85, yet these groups are divergent from groups 30 and 44 and not well

distinguished from ray and disc collections from group 32. Within group 32, rays and discs were

well distinguished from each other (Fig. 3.9b). In leaf tissue (Fig. 3.9c), samples from group 18

were particularly divergent, nearly forming a discrete cluster. Samples from the other groups

formed a continuum of divergence from group 30 through groups 14, 32, and 44, to group 85.

The spread between these groups was likely reduced by four, widely diverged samples from one

accession of C. suffruticosa subsp. fulgida, each dominated by peak CAD5. CAD profiles in

floral tissue differed significantly from those in leaf tissue (Fig. 3.9d). Samples in the area of

overlap in Fig. 3.9d are all from group 18, CAD profiles from which differed less between tissues than they did between tissues in all other groups. Betadispersion was significantly lower

in group 44 than group 32 in both floral (p=0.002) and leaf (p=0.001) tissues.
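The NMDS plots referred to above are built on Bray-Curtis dissimilarities between peak profiles. As a minimal sketch of how that dissimilarity is calculated for a pair of samples, the following uses hypothetical peak areas (the sample values and variable names are illustrative assumptions, not data from this study) and does not reproduce the ordination itself.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of peaks")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("at least one profile must contain a detected peak")
    # Sum of absolute differences divided by the summed abundance of both samples.
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical peak areas (one value per peak); illustrative only.
sample_a = [10.0, 0.0, 5.0]
sample_b = [10.0, 0.0, 5.0]  # same profile as sample_a
sample_c = [0.0, 8.0, 0.0]   # shares no peaks with sample_a
sample_d = [5.0, 5.0, 5.0]   # partly overlapping profile

d_identical = bray_curtis(sample_a, sample_b)
d_disjoint = bray_curtis(sample_a, sample_c)
d_partial = bray_curtis(sample_a, sample_d)
```

Identical profiles give a dissimilarity of 0 and profiles with no shared peaks give 1, so samples that cluster together in the plots have similar blends of peaks.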

3.4. Flavonoid glycosides

Nineteen FG peaks were detected. Of these, only five were detected in more than half of the

samples of either tissue, and only one was detected in over half of the samples of both tissue

types. Peaks FG8, FG11, and FG19 were detected in over 80% of floral samples but only in

6.4%, 29.4% and 3.9% of leaf samples, respectively. Peak FG13 was detected in 58.6% of floral and 50.5% of leaf samples. Fourteen of the nineteen peaks were detected more frequently in floral samples. Six peaks (FG, FG2, FG10, FG16, FG17, FG19) were detected in fewer than 10% of samples of either tissue (with the four latter peaks detected in fewer than 5% of samples).

Forty-six percent of leaf samples from perennial groups 18 and 32 lacked detectable FGs, compared to only one percent of leaf samples from annuals. However, C. officinalis leaf samples

never lacked FGs, nor did any floral samples from any taxon. In general, there were more peaks

in floral samples than in leaf samples in both complexes, though this difference was not

significant within groups 14 and 85 (Fig. 3.10a, b). Peak number did not increase with ploidy in

either tissue, in either complex. In annuals, mean peak number was equivalent across groups

within both tissues, except that group 85 had more leaf peaks than did other groups (Fig. 3.10a).

In the perennial complex, group 18 had more peaks in floral than in leaf tissue. Group 18 also

had a higher mean peak number than groups 14 and 32 in floral tissue, but was lower than both

groups in leaf tissue (Fig. 3.10b).

Mean FG abundance (Fig. 3.10c, d) was higher in floral tissue than in leaf tissue in both

complexes, though the difference was not significant within groups 2n=30 and 85. Within tissue

types, mean abundance did not differ significantly between groups in either complex except that

group 85 had a lower mean abundance in floral samples than did group 44 (Fig. 3.10c).

FG profiles (Fig. 3.11) were quite stable across group 44 but less so across other

chromosome groups. Despite minor variations across individuals, three peaks predominated in

most C. arvensis (group 44) leaf samples, FG11, FG13, and FG14, in an average ratio across all

samples of 11%, 35%, and 49% respectively (Table 3.3). FG11 was occasionally missing and

this was compensated by an increase in FG13. One accession was unusual in that FG9 was present instead of FG11, although in the expected ratio. In a few samples, only FG14 was detected. In floral samples of C. arvensis, FG4 and FG8 were predominant in an average ratio of

17% and 46% (Table 3.3). FG profiles were also relatively steady in leaf and floral tissue of C. officinalis within group 32, although FG17 was sometimes missing. Samples from one accession of C. suffruticosa subsp. carbonellii had FG profiles very much like those of C. officinalis, although FG13 was usually replaced by FG14 in leaf tissue. FG profiles in leaves in groups 44

Figure 3.10. Mean FG number (a, b) and bulk abundance (c, d) by chromosome number and tissue in the annual and perennial polyploid complexes. Labeling conventions are as described in Figs. 3.1 and 3.7.

Figure 3.11. 100% stacked column chart showing proportion of each FG peak in each sample. Labeling conventions are as described in Figs. 3.2 and 3.5.

and 85 were more similar to those in group 30 than any were to group 14, although some peaks

from group 14 occurred rarely in group 44. The proportion of FG8 and FG12 in leaf samples of

group 14 looked more typical of perennial leaf and floral samples, but the proportion of FG14

was much like that in all other annual leaves. As mentioned above, the most distinguishing

characteristic of many perennial leaf samples was their lack of detectable FGs.

NMDS of floral FG profiles by chromosome group (Fig. 3.12a) showed groups 18, 30, and

85 as being well-diverged from groups 14, 32, and 44. A widely dispersed group 14 entirely overlapped groups 32 and 44. Many samples from group 44 were distinct from those from group

32, but some overlap between the two groups was due to shared similarity between some samples from group 44 and two subspecies from group 32 (C. incana subsp. maritima and C. suffruticosa subsp. lusitanica; Fig. 3.12c). Most samples from group 32 occupied an intermediate position between groups 14 and 18, and samples from 44 likewise occupied an intermediate position between groups 14 and 30. Group 85 was the most diverged group from other groups in the plot. FG profiles in ray florets were significantly different from those in either disc florets or whole capitula (Fig. 3.12b), and there was no confounding effect of taxon differences. All ray floret samples from both groups 14 and 32 were distinct from disc florets, and disc and capitulum samples were not different from each other. However, two capitulum samples from C. suffruticosa subsp. carbonellii had FG profiles like those of ray florets. Samples from group 18 had the most divergent leaf FG profiles (Fig. 3.12e). Groups 30, 44, and 85 were highly similar while groups 14 and 32 were divergent from those groups. FG profiles in leaf tissue were significantly different from those in floral tissue, although some samples from group 18 were even more divergent from every other sample than leaf tissue was from floral tissue (Fig. 3.12d).

Figure 3.12. NMDS plots of Bray-Curtis dissimilarities between FG profiles in Calendula. Labeling conventions are as described in Fig. 3.3 except that F1-F19 are flavonoid peaks. Plots show floral samples (a-c), all tissue samples (d), and leaf samples (e) grouped by chromosome number (a, e), tissue (b, d), or taxon (c).

(a) Global p<0.001, np (b) Global p<0.001, ray≠capitulum, ray≠disc

(c) Global p<0.001, np (d) Global p<0.001, floral≠leaf

(e) Global p<0.001, np

As in CAD analyses, betadispersion of groups 32 and 14 was large and betadispersion of group 44 was relatively small (p≤0.05).
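To illustrate what a difference in dispersion between groups means, the following sketch compares a tightly clustered and a widely scattered set of hypothetical profiles. It does not reproduce the betadispersion test reported here; it substitutes a simpler proxy, the mean pairwise Bray-Curtis dissimilarity within a group, and all sample values and function names are assumptions made for illustration.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("at least one profile must contain a detected peak")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def mean_within_group_dissimilarity(profiles):
    """Average Bray-Curtis dissimilarity over all pairs of samples in one group.

    This is only a rough proxy for multivariate dispersion, not a
    distance-to-centroid test.
    """
    pairs = list(combinations(profiles, 2))
    if not pairs:
        raise ValueError("need at least two samples in the group")
    return sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)


# Hypothetical groups: one homogeneous, one heterogeneous.
tight_group = [[10, 5, 0], [11, 5, 0], [10, 6, 0]]
loose_group = [[10, 5, 0], [0, 5, 10], [5, 0, 10]]

tight = mean_within_group_dissimilarity(tight_group)
loose = mean_within_group_dissimilarity(loose_group)
```

Under this proxy a homogeneous group such as the hypothetical `tight_group` yields a much smaller value than `loose_group`, which is the kind of contrast described for group 44 relative to groups 32 and 14.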

4. Discussion

Plant secondary compounds from four chemical classes representing two major, independent

biosynthetic pathways (MTs and STs from the isoprenoid pathway and CADs and FGs from the phenylpropanoid pathway) were diverse in Calendula. Variation was qualitative and/or quantitative, and each class varied across groups of interest (species complex, chromosome number, taxon, or tissue) in different ways.

4.1. Summary of variation by chemical class

MT variation was almost entirely quantitative. Proportions of the seven compounds changed across some groups, were relatively stable within groups, and rarely differed across leaf and floral tissue within groups (though mean concentration of MTs was higher in floral tissue). In

other words, with few exceptions, Calendula species or subspecies had different and relatively stable whole-plant MT profiles characterized by unique mixtures of ubiquitous peaks rather than by unique peaks. In contrast, ST variation was both qualitative and quantitative. Though the data

appeared noisy, unique complements of different peaks, sometimes in stable proportions to each

other, were characteristic of taxa, and, like MT profiles, were steady across tissue types within

taxa. There were few differences in ST peak number or mean concentration between floral and

leaf tissue or between chromosome number groups.

Differences in CAD profiles were also both quantitative and qualitative. Within each tissue type, unique blends of more abundant CAD peaks coupled with unique complements of more minor peaks were characteristic of taxa or groups of taxa. Unlike MT and ST profiles, CAD profiles differed significantly between leaf and floral tissue. Mean number of CAD peaks varied inconsistently across groups, but relative abundance was typically higher in floral tissue.

Variation of FG profiles across complexes, chromosome number groups, taxa, and tissue types was largely qualitative. Mean peak number and mean relative abundance of FGs were higher in floral tissue. Both measures were typically lower in leaves and higher in floral tissue of perennial species compared to annual species.

Considering that the four compound classes surveyed in this study are the products of independent biosynthetic pathways, and that the compound classes within each pathway represent different branches of these pathways, it was expected that patterns of variation would differ between each of the four classes. In particular, it was predicted MT and ST profiles would be more variable within taxa than CAD and FG profiles because of the large number of terpene synthases in plants and the fact that many terpene synthases are capable of forming multiple products from the same substrate (Degenhardt et al. 2009; Kollner et al. 2006). As a result, any consistent pattern of differentiation of terpenoid profiles between taxa would be absent or obscured, and these compounds would not be expected to be useful for tracking evolutionary relationships due to this more tenuous connection between phenotype and genotype. CAD and

FG profiles, on the other hand, due to high enzyme specificity in the phenylpropanoid pathway

(see Vogt et al. 2010), would be expected to show better correspondence to evolutionary relationships.

Although patterns produced by each class of compounds did vary considerably, they did not

vary entirely as predicted. Monoterpenes, for example, were the least variable in number across

taxa and tissues of the four compound classes. Sesquiterpenes were variable in number, but this

could have been an artifact of low concentrations and sporadic detection. Flavonoids were the most variable in terms of the total number of peaks detected in Calendula. Monoterpene, CAD,

and FG profiles all showed a signal of allopolyploidization in the annual complex (see below),

and so all three tracked evolutionary relationships at least in this respect. However, NMDS plots

supported the hypothesis that MT and ST profiles were poorer at tracking evolutionary relationships than CADs and FGs. Particularly in floral tissue, CAD profiles from different chromosome number groups formed distinct clusters that were well diverged from each other.

Further, clusters from the annual and perennial complexes diverged from each other, radiating from a more central group 14 cluster (C. stellata), and the allopolyploid clusters (particularly groups 32 and 44) occupied intermediate positions between the clusters of samples from putative progenitors (14 and 18 for group 32 and 14 and 30 for group 44; Fig. 3.9a). NMDS plots of FG profiles showed a similar pattern. Although some groups were less well diverged from each other

(with samples from group 14 entirely overlapping those from groups 32 and 44, and some samples from groups 32 and 44 also overlapping), polyploid groups 32 and 44 still occupied intermediate positions between putative progenitors. CAD and FG profiles were not as distinct by group in leaf tissue as they were in floral tissue, which suggests greater selective pressure on these compounds in floral tissue than in leaf tissue. However, CAD and FG profiles in leaves from group 18 were well diverged from those of other groups (consistent with the divergence of the Moroccan endemic species from all other Calendula species seen in analyses of ITS and chloroplast markers; Chapter 1). In samples from group 18, CAD profiles varied much less

between floral and leaf tissue than they did between these tissues in other groups, suggesting

either that divergence of CAD profiles between floral and leaf tissue in Calendula may have

occurred after the divergence of the Moroccan endemic species from the rest of the genus, or that

convergence of CAD profiles in floral and leaf tissue may have occurred uniquely in the

Moroccan endemics. A survey of chemical differentiation between floral and leaf tissue across the Compositae would be useful for choosing between the two hypotheses.

These patterns of variation in CAD and FG profiles were all consistent with hypotheses of evolutionary relationships (particularly allopolyploid origins) between taxa in Calendula (see

Chapters 1 and 2). No such consistent patterns were evident in analyses of MT and ST profiles,

and clusters were less well diverged from each other in all tissues, consistent with a hypothesis

of blurred boundaries between groups because of high variability of MT and ST profiles within

groups. In these analyses, as in analyses of ITS, CP, and LCN markers, the high polyploid group

85 defied any simple explanation of polyploid origin. CAD profiles in floral tissue from group 85 were more similar to those of group 32 than to putative progenitor groups 30 or 44, but FG profiles in floral tissue were most similar to group 30.

A consistent feature of the variation in three of the four compound classes (and even to some extent within the fourth class, STs), was that proportions of compounds, especially dominant compounds, relative to others of the same class were relatively constant within taxa. Proportions were constant even when bulk abundance varied widely between samples. This suggests that it is the blend of compounds, more than any individual compound, that is important and thus may be adaptive. There is mounting evidence that, more often than not, taxonomically specific blends of ubiquitous compounds, rather than individual compounds specific to single taxa, are critical in

plant-insect interactions, including attracting pollinators or attracting the predators of herbivores

(Bruce and Pickett 2011; Bruce et al. 2005).
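The distinction drawn above between the blend of compounds and their bulk abundance can be made concrete with a short sketch. It converts peak areas to within-class proportions, as in the 100% stacked column charts, and shows that two samples differing tenfold in total abundance can share the same blend. The peak names and areas are hypothetical and are not taken from the data in this study.

```python
def to_proportions(peak_areas):
    """Convert a mapping of peak name -> area into within-sample proportions."""
    total = sum(peak_areas.values())
    if total == 0:
        # No detectable peaks: every proportion is reported as zero.
        return {peak: 0.0 for peak in peak_areas}
    return {peak: area / total for peak, area in peak_areas.items()}


# Hypothetical samples: same blend, tenfold difference in bulk abundance.
low_abundance = {"P1": 2.0, "P2": 6.0, "P3": 12.0}
high_abundance = {"P1": 20.0, "P2": 60.0, "P3": 120.0}

props_low = to_proportions(low_abundance)
props_high = to_proportions(high_abundance)
```

Because each profile is scaled by its own total, comparisons of proportions are insensitive to differences in overall concentration between samples.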

4.2. Some taxon-specific patterns of variation and implications

Chemical profiles (including both complement and proportion of compounds) of three of four

chemical classes in C. arvensis (group 44) were much more homogeneous than expected given

the geographical and morphological range of individuals included in sampling. Only STs varied

considerably across the species, generally involving changes in the presence and proportion of

five of the 13 ST peaks detected across the genus. Paolini et al. (2010) found that STs in essential

oil samples of C. arvensis from Corsica varied in proportion to each other depending on

environmental variables such as geographical location or season. Because the individuals

sampled in this study were grown in a common greenhouse, environmental variables are not

expected to have been a factor, nor is geographical origin of seeds, as ST profiles varied even

within accessions. It is more likely that the extremely low concentrations of STs in extracts from

fresh leaves and floral tissue found in Calendula in this study may have led to sporadic detection

of these compounds in many samples, and that ST profiles in C. arvensis plants grown in a

common greenhouse are more uniform than they appeared. This low chemical variability was at

odds with high morphological and molecular variability (particularly among chloroplast

haplotypes; see Chapter 1).

In all compound classes, both qualitative and quantitative variation across the perennial polyploids, C. incana and C. suffruticosa, were high. Profiles were often relatively stable within subspecies, but not species, and affinities were present between some pairs of subspecies

belonging to different species. For example, Calendula incana subsp. maritima, a Sicilian

endemic, had a MT profile similar to that of C. suffruticosa subsp. fulgida, a Sicilian native, and

these profiles differed considerably from those of all other taxa. On the other hand, their CAD

profiles had little in common. The CAD profiles from samples of Calendula incana subsp.

maritima were more like those of annual species, while the ones from C. suffruticosa subsp.

fulgida were unlike those in all other samples. Calendula suffruticosa subsp. carbonellii and C.

officinalis shared highly similar MT and flavonoid profiles, but the ST profile of the former

differed considerably from that of all other taxa. Calendula officinalis had more homogeneous

chemical profiles than other perennial polyploids.

The most consistent difference between the annual polyploid C. arvensis and the perennial

polyploids was the greater homogeneity of the former compared to the latter. Calendula arvensis

is the most widespread and morphologically polymorphic species in the genus. It has been

divided into species, subspecies, varieties, and forms in various treatments and floras. Likewise,

the perennial polyploids together have been recognized as one species or many, and evidence has

recently tipped the scales in favor of recognition of one species (C. suffruticosa; see Nora et al.

2013; Greuter 2006+) rather than two (C. suffruticosa and C. incana). Both groups were sampled

extensively (especially relative to other groups; 79 leaf and 19 floral samples in 19 accessions for

C. arvensis and 43 leaf and 18 floral samples in 13 accessions for C. incana and C. suffruticosa),

so differences in homogeneity cannot be explained by wider sampling in one group than the

other. Also, although C. officinalis was included in group 32 for the analyses of homogeneity, it

is unlikely that its inclusion affected the results. This is because the dispersion of C. officinalis was small and fell within the range of the other species. Two major hypotheses can be drawn from this difference in betadispersion. First, the recognition of C. arvensis as a single species is

wholly appropriate based on these results, but the collapse of all of the subspecies of C. incana

and C. suffruticosa into a single, polymorphic species might not be, particularly given the

divergence of some taxa from others evident in the NMDS plots. Second, the difference in

betadispersion of these groups could be explained in part by differences in breeding system.

Calendula arvensis, like the annual species C. tripterocarpa, C. pachysperma, and C. palaestina,

self-fertilizes readily and prolifically when pollinators are not present (pers. obs.; Heyn and Joel

1983). The perennial species very rarely self-fertilize, and thus would be expected to show

greater heterozygosity and, potentially, to harbor greater genetic diversity. Interestingly, the annual C. stellata, which also rarely self-fertilizes, had a degree of betadispersion equivalent to the perennial species in most analyses. Increased chemical diversity in these outcrossing species could be related to increased pressure to attract pollinators while remaining defended against herbivores. It could be that the decreased diversity in C. arvensis and other selfing annuals was

related to decreased genetic diversity due to selfing. However, chloroplast and ITS markers were

quite variable for these species (see Chapter 1).

4.3. Variation in allopolyploids relative to progenitors

Morphological, karyological (Heyn et al. 1974; Heyn and Joel 1983), and molecular

(Chapters 1 and 2) evidence support an allopolyploid origin of C. arvensis from crosses between

C. stellata and C. tripterocarpa. They also support an allopolyploid origin of the perennial

polyploids (including C. officinalis) from crosses between C. stellata and one or more of the

perennial diploid species (of which C. maroccana and C. eckerleinii were sampled here).

Following Orians (2000), though traits in hybrids are often expected to be either intermediate

between those of parents or more similar to those of one parent or the other, genetic and

biosynthetic changes resulting from hybridization and genome duplication may result in more

complex patterns of variation. Obstruction of existing pathways may lead to absent products or

accumulation of intermediate products; new combinations of products from merged pathways

may produce novel products; fully duplicated pathways may result in increased products; and

regulatory changes may lead to novel location of expression. Chemical profiles were assessed for

evidence of various patterns possibly resulting from hybridization and genome duplication.

As mentioned above, CAD and FG profiles in floral tissue of C. arvensis (group 44) and the perennial polyploids (group 32) were intermediate between those of their putative progenitors

(C. stellata (group 14) and C. tripterocarpa (group 30) for C. arvensis; C. stellata and the

Moroccan endemics (group 18) for the perennial polyploids) in NMDS plots. A similar pattern

was not obvious in profiles of these compounds from leaf tissue, but looking at presence and

proportion of individual peaks in leaf tissue offered some insights into the putative contributions

of progenitor species to allopolyploids, particularly in the annual complex. Calendula stellata

(group 14), one likely progenitor of all polyploid taxa, was characterized by a high proportion of

MT1 (nearly 70% in most samples, but with a mean across all samples of 49%), absent or low

MT7, the presence of ST8 in some samples, the presence of CAD1 in all samples, and the presence of FG8, FG11, and FG12, and the absence of FG13. In contrast, Calendula

tripterocarpa (group 30) had an extremely low proportion of MT1 (mean 1%) and higher MT7

(mean 23%) in all samples, lacked ST8, CAD1, FG8, FG11, and FG12, and had FG13 in some

samples (~25% when present). In Calendula arvensis (group 44), the proportions of both MT1

and MT7 ranged from nearly absent to almost 70% (MT1) or 20% (MT7), but, in most samples,

were roughly intermediate between those of progenitors (although the mean proportion of MT7 in C. arvensis was similar to that in C. stellata: 5% and 7%, respectively). Peak ST8 was lacking,

CAD1 was either present (and in the same proportion as in C. stellata) or absent (as in C. tripterocarpa), and FG11 (presumably from C. stellata) and FG13 (presumably from C. tripterocarpa) were two of three dominant peaks present in most samples. FG13 was present in

C. arvensis in roughly twice the proportion of FG13 in C. tripterocarpa. FG8 and FG12 appeared sporadically and were minor peaks. FG18 and FG19, present in only a few leaf samples of C. arvensis, were not present in leaves of C. stellata and C. tripterocarpa but were present in floral tissue of all three species.

Table 3.5 shows the presence and mean proportion of these compounds in the three species, and characterizes the presence and proportion of each compound in C. arvensis relative to those of C. stellata and C. tripterocarpa. Interestingly, if we accept that C. arvensis is indeed the allopolyploid product of C. stellata and C. tripterocarpa, then there is evidence that at least two of the types of biosynthetic changes that can occur upon genome merger and duplication

(summarized from Orians 2000 above) have occurred in C. arvensis. Several compounds (MT1,

ST8, FG11, and FG12) showed a pattern of intermediacy (in either proportion, presence, or both) between the two progenitors. Two (MT7 and CAD1) were similar to C. stellata in presence and proportion, while FG8 was like C. tripterocarpa. However, the high proportion of FG13 in C. arvensis relative to both parents was transgressive and could be indicative of duplication, particularly if the pathway yielding FG13 is present but inactive in C. stellata and was activated in C. arvensis after hybridization and genome duplication. In combination with the active pathway from C. tripterocarpa, twice the proportion of FG13 could then be produced. Finally, the presence of FG18 and FG19 in floral tissue of all three species but in leaf tissue only of

C. arvensis suggests a change in location of expression of these compounds possibly due to regulatory changes resulting from genome merger and duplication.

Table 3.5. Presence and mean proportion of several compounds in leaf samples of two putative progenitor species (C. stellata and C. tripterocarpa) and in the putative allopolyploid of these species (C. arvensis). + = present; (+) = sometimes present; – = absent. Mean percentages of each peak by species are taken from Table 3.3. Presence and proportion of each peak in C. arvensis relative to progenitor species is summarized at the bottom. int=intermediate between progenitors; Cste=more similar to C. stellata; Ctri=more similar to C. tripterocarpa; ↑ = in greater proportion than both progenitors; nov=novel relative to progenitors.

Species (2n)    MT1 MT7 ST8 CAD1 FG8 FG11 FG12 FG13 FG18 FG19
C. stellata (14)    + + (+) + + + + – –
    49% 7% 11% 4% 16% 26% 13%
C. arvensis (44)    + + (+) + – + (+) + + +
    29% 5% 2% 4% 11% 1% 35%
C. tripterocarpa (30)    + + – – – – – 16% – –
    1% 23%
C. arvensis relative to progenitors:    int Cste int Cste Ctri int int ↑ nov nov
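The categories in the bottom row of Table 3.5 can be thought of as a simple decision rule applied to the mean proportion of each peak in the allopolyploid and its two putative progenitors. The sketch below is one possible formalization, not the procedure used to build the table: the function name, the 3-percentage-point tolerance, the labels "P1", "P2", and "up" (standing in for Cste, Ctri, and ↑), and the 1% value used for the novel-peak example are all assumptions made for illustration.

```python
def classify(p1, poly, p2, tol=3.0):
    """Label a polyploid's mean proportion (percent) relative to two progenitors.

    p1, p2: mean percent in progenitor 1 and progenitor 2.
    poly:   mean percent in the putative allopolyploid.
    A value of 0 is treated as "absent"; tol is an arbitrary threshold.
    """
    if poly > 0 and p1 == 0 and p2 == 0:
        return "nov"  # present only in the polyploid
    if poly > max(p1, p2) + tol:
        return "up"   # in greater proportion than both progenitors
    # Similarity to a parent requires the same presence/absence status.
    same_presence_1 = (poly > 0) == (p1 > 0)
    same_presence_2 = (poly > 0) == (p2 > 0)
    d1, d2 = abs(poly - p1), abs(poly - p2)
    if same_presence_1 and d1 <= tol and d1 < d2:
        return "P1"
    if same_presence_2 and d2 <= tol and d2 < d1:
        return "P2"
    if min(p1, p2) <= poly <= max(p1, p2):
        return "int"
    return "unclassified"


# Mean leaf proportions (C. stellata, C. arvensis, C. tripterocarpa).
mt1 = classify(49, 29, 1)
mt7 = classify(7, 5, 23)
fg13 = classify(0, 35, 16)
# Hypothetical value for a peak absent from both progenitors.
novel = classify(0, 1, 0)
```

With these inputs the rule returns "int" for MT1, "P1" (C. stellata-like) for MT7, "up" for FG13, and "nov" for a peak found only in the polyploid, in agreement with the corresponding entries in the table. Other thresholds would shift borderline cases.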

Chemical profiles were similar between the high polyploids, C. pachysperma and C. palaestina (group 85). Molecular evidence for the allo- or autopolyploid origins of the high polyploids was inconclusive (Chapters 1 and 2), although sequences from these species were most closely related to those from the other annual species. Monoterpene profiles in the high polyploids were similar to those of other species in the annual complex. Some rare and some ubiquitous STs, presumably from C. arvensis, were present in very different proportions in the high polyploids. Three novel FGs (FG15, FG16, and FG17) were detected in high polyploids.

FG19, which was rare in C. arvensis and absent in both C. stellata and

C. tripterocarpa, was present in most high polyploid samples. Unexpectedly, the CAD profile in high polyploids was nearly indistinguishable from that in C. officinalis. Whether this was convergent or indicative of a past hybridization event (and the contribution of the genome of

either C. officinalis or its wild ancestor to that of the high polyploids) warrants future investigation. No evidence of such a contribution was seen in analyses of ITS, CP, or LCN markers (Chapters 1 and 2).

The perennial diploids differed from each other across all chemical classes but, for the most part, neither was so well differentiated from C. stellata or similar enough to any perennial polyploid taxon that a satisfying assessment of potential contributions of each of these species to the perennial polyploids could be made. One pattern was notable. Calendula eckerleinii had a high proportion of MT6 (~10 to 30%) in leaves, which was absent (or nearly so), in leaves of C. stellata. Calendula incana subsp. incana and one accession of C. suffruticosa subsp. lusitanica had very high proportions of this compound in their leaves as well (up to 90%). However, in a rare departure from the uniformity of MT profiles across leaves and flowers, some floral samples of C. stellata also had very high proportions of MT6 (up to 80%). The high proportion of this compound in some perennial polyploid accessions could, therefore, represent a contribution from

C. maroccana or a contribution from C. stellata with a change in location of expression. It could also be unrelated to polyploidy.

There was no consistently significant effect of higher ploidy across chemical classes on mean peak number and mean abundance. However, a general trend toward increase of CADs was observed for both peak number and abundance in floral tissue in the annual complex and for abundance of MTs in both leaf and floral tissue in both the annual and perennial complexes.

Also, peak number and abundance of STs were both much higher in the high polyploids than in any other group.

4.4. Tissue-specific variation

Although the pattern was not always significant, mean peak number and mean abundance of

products were typically higher in combined floral tissue than in leaves, with few exceptions.

Seven MT peaks were present in both tissue types of most taxa, but when fewer than seven were

present (i.e., in C. stellata and C. eckerleinii), more peaks were detected in floral tissue than in leaves. Number of ST peaks was roughly equivalent across floral tissue and leaves, but abundance was greater in floral tissue. Fewer CADs were detected in the florets of C. stellata

than in leaves (but mean abundance was still higher in florets). This last exception was likely due

to the fact that only ray and disc florets were collected from this species (see below).

As noted in “Methods and Materials,” three types of collections of floral tissue were made,

but not all types were made from all taxa. It was a concern that apparent differences between

taxa based on chemical differences in floral samples could be confounded by differences

between tissue type (ray, disc, or whole capitulum) and vice versa. This was because it was reasonable to expect that the phyllaries and receptacular tissue included in the whole-capitulum

collections might be more similar to leaves in their chemical composition than to florets, and that

groups for which only capitula were collected would therefore appear more diverged than they

actually were from groups for which rays and discs were collected separately. When all floral

samples together were not significantly different from leaf samples (Figs. 3.3d, 3.6d), then this

was not as much of a concern and any apparent differences between types of floral sample,

particularly divergence of both rays and discs from capitula (Fig. 3.3b and, to a lesser extent, Fig.

3.6b), were assumed to reflect differences between taxa for which those types of collections were

made. When floral tissue was significantly different from leaf tissue (Figs. 3.9d, 3.12d), then it

was inappropriate to compare groups for which only non-comparable collections were made. For example, in Fig. 3.9a, the NMDS plot of floral samples by chromosome number group showed groups 18, 30, 44, and 85 (for which only whole capitulum collections were made) as more or less divergent from group 14 (for which only floret collections were made) and group 32 (within which either floret samples or capitulum samples were made for each of the various accessions, but never both). Interpretation of divergence between groups with some or all comparable points, for example, that groups 44 and 30 are more similar to each other than either is to group 85, that all three are well diverged from group 18, and that groups 30 and 44 are more divergent from group 32 than are 18 and 85, is straightforward. Interpretation of the divergence of groups

14 and 32 from each other is also straightforward, but any interpretation of the divergence of group 14 from groups 18, 30, 44, and 85 is complicated by the fact that it was not clear if group

14 differed from groups 30 and 44 because of differences in CAD profiles in their florets, or if the difference actually stemmed from the difference between floral tissue and leaf (i.e., phyllary) tissue. Similarly, any discussion of significant differences between floral tissue types (e.g., Figs. 3.9b, 3.12b) is only applicable to groups for which those types of tissue were collected. With these considerations in mind, the following difference in floral tissue was noted: Within groups 14 and

32, CAD and FG profiles were significantly different between ray and disc florets. This difference may prove to be more widespread with more sampling of ray and disc florets from more taxa, and could be indicative of different adaptive strategies in the different florets, e.g., to attract pollinators or to protect developing fruits from predators (disc florets are functionally male and ray florets are female in Calendula).

4.5. Problems and future directions

Given high levels of polymorphism in some groups, as well as evidence of multiple origins

of polyploid species, more extensive sampling within taxa and chromosome number groups

could lead to different results. Particularly within the annual polyploid complex, the individuals of C. stellata and C. tripterocarpa sampled for this study varied in ways that made interpretation of their potential contributions to annual polyploid taxa possible (e.g., individuals from the two

species had distinct profiles, and elements of both were found together across individuals of C.

arvensis). Wider sampling of C. stellata and C. tripterocarpa could reveal a continuum of traits

present in these two species much like the continuum seen in C. arvensis, shifting the conclusion

from one of hybridization to one of shared polymorphism among all three species. Also, greater

sampling of perennial polyploid taxa could reveal more patterns of divergence between

subspecies and aid taxonomic revision of C. suffruticosa s. l. (including C. incana). One obstacle

to wider sampling was an extremely low germination rate for seeds of many taxa, and strategies

will have to be found to break seed dormancy in order to make fresh tissue from more taxa

available for study.

The result that chemical profiles differ between ray and disc florets of both C. stellata and C.

officinalis was compelling, but sampling fresh ray and disc florets into liquid nitrogen from over

two hundred plants while trying to control for potential diurnal effects or induction responses

in wounded plants or plants close to wounded plants proved difficult. One approach would be to

first assess whether any such effects are in fact a concern by making trial collections over many

time points from unwounded, wounded, and proximal-to-wounded plants, and then modify sampling methods based on these results.
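
A minimal sketch of how such a trial could be laid out as a fully crossed design is given below. The three treatment classes come from the text; the time points, the number of replicates, and the function name `trial_design` are hypothetical placeholders and would need to be chosen for the actual experiment.

```python
from itertools import product

TREATMENTS = ["unwounded", "wounded", "proximal-to-wounded"]  # from the text
TIME_POINTS = ["06:00", "10:00", "14:00", "18:00"]  # hypothetical sampling times
REPLICATES = 3  # hypothetical number of plants per treatment and time point

def trial_design(treatments, time_points, replicates):
    """Enumerate one collection per treatment x time point x replicate."""
    return [
        {"treatment": t, "time": tp, "replicate": r}
        for t, tp, r in product(treatments, time_points, range(1, replicates + 1))
    ]

design = trial_design(TREATMENTS, TIME_POINTS, REPLICATES)
print(len(design), "collections")
```

Crossing every treatment with every time point keeps the design balanced, so that diurnal and induction effects can be separated rather than confounded.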

Finally, there is much to be learned about the importance and potential adaptive significance

of chemical blends in Calendula. Little is known about plant-insect interactions in Calendula, and some pollinator studies would be a useful next step, especially for the highly variable perennial polyploid species.

REFERENCES

Agrawal, A. A. 2011. New synthesis--trade-offs in chemical ecology. Journal of Chemical Ecology 37: 230-231.

Agrawal, A. A., M. J. Lajeunesse, and M. Fishbein. 2008. Evolution of latex and its constituent defensive chemistry in milkweeds (Asclepias): a phylogenetic test of plant defense escalation. Entomologia Experimentalis et Applicata 128: 126-138.

Aguilar-Ortigoza, C. J., V. Sosa, and M. Aguilar-Ortigoza. 2003. Toxic Phenols in various Anacardiaceae species. Economic Botany 57: 354-364.

Alvarenga, S. A. V., M. J. P. Ferreira, V. P. Emerenciano, and D. Cabrol-Bass. 2001. Chemosystematic studies of natural compounds isolated from Asteraceae: characterization of tribes by principal component analysis. Chemometrics and Intelligent Laboratory Systems 56: 27-37.

Anderson, M. J., K. E. Ellingsen, and B. H. McArdle. 2006. Multivariate dispersion as a measure of beta diversity. Ecology Letters 9: 683-693.

Basch, E., S. Yong, S. Bent, I. Foppa, S. Haskmi, D. Kroll, M. Mele, P. Szapary, C. Ulbricht, M. Vora, and Natural Standard Research Collaboration. 2006. Marigold (Calendula officinalis L.): an evidence-based systematic review by the Natural Standard Research Collaboration. Journal of Herbal Pharmacotherapy 6: 135-159.

Bray, J. R. and J. T. Curtis. 1957. An ordination of upland forest communities of southern Wisconsin. Ecological Monographs 27: 325-349.

Bruce, T. J. A. and J. A. Pickett. 2011. Perception of plant volatile blends by herbivorous insects – Finding the right mix. Phytochemistry 72: 1605-1611.

Bruce, T. J. A., L. J. Wadhams, and C. M. Woodcock. 2005. Insect host location: a volatile situation. Trends in Plant Science 10: 269-274.

Butnariu, M. and C. Z. Coradini. 2012. Evaluation of Biologically Active Compounds from Calendula officinalis Flowers using Spectrophotometry. Chemistry Central Journal 6: 35.

Calabria, L., V. Emerenciano, M. Ferreira, M. Scotti, and T. Mabry. 2007. A phylogenetic analysis of tribes of the Asteraceae based on phytochemical data. Natural Product Communications 2: 277-285.

Constable, C. P. 1999. A survey of herbivore-inducible defensive proteins and phytochemicals. In: Agrawal, A. A., Tuzun, S., Bent, E. (Eds.), Induced Plant Defenses Against Pathogens and Herbivores, APS Press, St. Paul.

De Tommasi, N., C. Pizza, C. Conti, N. Orsi, and M. L. Stein. 1990. Structure and in vitro antiviral activity of sesquiterpene glycosides from Calendula arvensis. Journal of Natural Products 53: 830-835.

Degenhardt, J., T. G. Köllner, and J. Gershenzon. 2009. Monoterpene and sesquiterpene synthases and the origin of terpene skeletal diversity in plants. Phytochemistry 70: 1621-1637.

del Moral, R. 1972. On the Variability of Chlorogenic Acid Concentration. Oecologia 9: 289-300.

Ehrlich, P. R. and P. H. Raven. 1964. Butterflies and plants: a study in coevolution. Evolution 18: 586-608.

Games, P. A. and J. F. Howell. 1976. Pairwise Multiple Comparison Procedures with Unequal N's and/or Variances: A Monte Carlo Study. Journal of Educational Statistics 1: 113-125.

Greuter, W. 2006+. Compositae (pro parte majore). Euro+Med Plantbase - the information resource for Euro-Mediterranean plant diversity, eds. W. Greuter and E. v. Raab-Straube. Published on the Internet http://ww2.bgbm.org/EuroPlusMed/ [20 October 2014]

Hamrick, J. L. and M. J. W. Godt. 1996. Effects of Life History Traits on Genetic Diversity in Plant Species. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 351: 1291-1298.

Harborne, J. B. 1977. Flavonoids and the evolution of the angiosperms. Biochemical Systematics and Ecology 5: 7-22.

Harborne, J. B. and C. A. Williams. 2000. Advances in flavonoid research since 1992. Phytochemistry 55: 481-504.

Hamburger, M., S. Adler, D. Baumann, A. Förg, and B. Weinreich. 2003. Preparative purification of the major anti-inflammatory triterpenoid esters from Marigold (Calendula officinalis). Fitoterapia 74: 328-338.

Heyn, C. C. and A. Joel. 1983. Reproductive relationships between annual species of Calendula (Compositae). Plant Systematics and Evolution 143: 311-329.

Hull-Sanders, H. M., R. H. Johnson, H. A. Owen, and G. A. Meyer. 2009. Effects of polyploidy on secondary chemistry, physiology, and performance of native and invasive genotypes of Solidago gigantea (Asteraceae). American Journal of Botany 96: 762-770.

Huminiecki, L. and G. C. Conant. 2012. Polyploidy and the evolution of complex traits. International Journal of Evolutionary Biology 2012: 292068.

Irwin, R. E., D. Cook, L. L. Richardson, J. S. Manson, and D. R. Gardner. 2014. Secondary compounds in floral rewards of toxic rangeland plants: impacts on pollinators. Journal of Agricultural and Food Chemistry 62: 7335-7344.

Jakopic, J., F. Stampar, and R. Veberic. 2009. The influence of exposure to light on the phenolic content of ‘Fuji’ apple. Scientia Horticulturae 123: 234-239.

Johnson, C. B., A. Kazantzis, M. Skoula, U. Mitteregger, and J. Novak. 2004. Seasonal, populational and ontogenic variation in the volatile oil content and composition of individuals of Origanum vulgare subsp. hirtum, assessed by GC headspace analysis and by SPME sampling of individual oil glands. Phytochemical Analysis 15: 286-292.

Kaplan, I., R. Halitschke, and A. Kessler. 2008. Constitutive and induced defenses to herbivory in above- and belowground plant tissues. Ecology 89: 392.

Kaškonienė, V., P. Kaškonas, M. Jalinskaitė, and A. Maruška. 2011. Chemical Composition and Chemometric Analysis of Variation in Essential Oils of Calendula officinalis L. during Vegetation Stages. Chromatographia 73: 163-169.

Kessler, A. and I. T. Baldwin. 2002. Plant responses to insect herbivory: the emerging molecular analysis. Annual Review of Plant Biology 53: 299-328.

Kessler, A. and R. Halitschke. 2009. Testing the Potential for Conflicting Selection on Floral Chemical Traits by Pollinators and Herbivores: Predictions and Case Study. Functional Ecology 23: 901-912.

Kessler, A. and M. Heil. 2011. The multiple faces of indirect defences and their agents of natural selection. Functional Ecology 25: 348-357.

Keinänen, M., N. J. Oldham, and I. T. Baldwin. 2001. Rapid HPLC screening of jasmonate-induced increases in tobacco alkaloids, phenolics, and diterpene glycosides in Nicotiana attenuata. Journal of Agricultural and Food Chemistry 49: 3553-3558.

Kirk, H., Y. H. Choi, H. K. Kim, R. Verpoorte, and E. van der Meijden. 2005. Comparing Metabolomes: The Chemical Consequences of Hybridization in Plants. New Phytologist 167: 613-622.

Kirmizibekmez, H., C. Bassarello, S. Piacente, C. Pizza, and I. Calis. 2006. Triterpene saponins from Calendula arvensis. Zeitschrift für Naturforschung Section B: A Journal of Chemical Sciences 61: 1170-1173.

Köllner, T. G., P. E. O’Maille, N. Gatto, W. Boland, J. Gershenzon, and J. Degenhardt. 2006. Two pockets in the active site of maize sesquiterpene synthase TPS4 carry out sequential parts of the reaction scheme resulting in multiple products. Archives of Biochemistry and Biophysics 448: 83-92.

Leach, M. 2008. Calendula officinalis and wound healing: A systematic review. Wounds: A Compendium of Clinical Research and Practice 20: 236-243.

Levin, D. A. 1983. Polyploidy and Novelty in Flowering Plants. The American Naturalist 122: 1-25.

Llusia, J., J. Peñuelas, R. Seco, and I. Filella. 2012. Seasonal changes in the daily emission rates of terpenes by Quercus ilex and the atmospheric concentrations of terpenes in the natural park of Montseny, NE Spain. Journal of Atmospheric Chemistry 69: 215-230.

Manson, J. S., S. Rasmann, R. Halitschke, J. D. Thomson, A. A. Agrawal, and M. Johnson. 2012. Cardenolides in nectar may be more than a consequence of allocation to other plant parts: a phylogenetic study of Asclepias. Functional Ecology 26: 1100-1110.

Martin, D. M., J. Gershenzon, and J. Bohlmann. 2003. Induction of Volatile Terpene Biosynthesis and Diurnal Emission by Methyl Jasmonate in Foliage of Norway Spruce. Plant Physiology 132: 1586-1599.

Marukami, T., A. Kishi, and M. Yoshikawa. 2001. Medicinal flowers. IV. Marigold. (2): Structures of new ionone and sesquiterpene glycosides from Egyptian Calendula officinalis. Chemical & Pharmaceutical Bulletin 49: 974-978.

Mears, J. A. 1980. The evolution of the pseudoguaianolides of Parthenium L. (Asteraceae, Ambrosiinae). Proceedings of the Academy of Natural Sciences of Philadelphia 132: 156-172.

Meikle, R. D. 1976. Calendula L. Pp. 206-207 in Flora Europaea, Plantaginaceae to Compositae vol. 4, eds. T. G. Tutin, V. H. Heywood, N. A. Burges, D. H. Valentine, S. M. Walters, and D. A. Webb. Cambridge: Cambridge University Press.

Moore, B. D., R. L. Andrew, C. Külheim, and W. J. Foley. 2014. Explaining intraspecific diversity in plant secondary metabolites in an ecological context. New Phytologist 201: 733-750.

Muley, B. P., S. S. Khadabadi, and N. B. Banarase. 2010. Phytochemical Constituents and Pharmacological Activities of Calendula officinalis Linn (Asteraceae): A Review. Tropical Journal of Pharmaceutical Research 8: 455-465.

Niveyro, S. L., A. G. Mortensen, I. S. Fomsgaard, and A. Salvo. 2013. Differences among five amaranth varieties (Amaranthus spp.) regarding secondary metabolites and foliar herbivory by chewing insects in the field. Arthropod-Plant Interactions 7: 235-245.

Nora, S., S. Castro, J. Loureiro, A. C. Gonçalves, H. Oliveira, M. Castro, C. Santos, and P. Silveira. 2013. Flow cytometric and karyological analyses of Calendula species from Iberian Peninsula. Plant Systematics and Evolution 299: 853-864.

Oksanen, J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O'Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, and H. Wagner. 2013. vegan: Community Ecology Package, v. 2.0-10. URL: http://vegan.r-forge.r-project.org/

Olennikov, D. N. and N. I. Kashchenko. 2013. New Isorhamnetin Glycosides and other Phenolic Compounds from Calendula officinalis. Chemistry of Natural Compounds 49: 833-840.

Orians, C. M. 2000. The effects of hybridization in plants on secondary chemistry: implications for the ecology and evolution of plant-herbivore interactions. American Journal of Botany 87: 1749-1756.

Orians, C. 2005. Herbivores, vascular pathways, and systemic induction: facts and artifacts. Journal of Chemical Ecology 31: 2231-2242.

Otto, S. P. and J. Whitton. 2000. Polyploid incidence and evolution. Annual Review of Genetics 34: 401-437.

Paolini, J., T. Barboni, J. Desjobert, N. Djabou, A. Muselli, and J. Costa. 2010. Chemical composition, intraspecies variation and seasonal variation in essential oils of Calendula arvensis L. Biochemical Systematics and Ecology 38: 865-874.

Pizza, C. and N. de Tommasi. 1987. Plants Metabolites. A New Sesquiterpene Glycoside from Calendula arvensis. Journal of Natural Products 50: 784-789.

R Core Team. 2014. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.

Rodríguez, A., B. Alquézar, and L. Peña. 2013. Fruit aromas in mature fleshy fruits as signals of readiness for predation and seed dispersal. The New Phytologist 197: 36-48.

Roose, M. L. and L. D. Gottlieb. 1976. Genetic and Biochemical Consequences of Polyploidy in Tragopogon. Evolution 30: 818-830.

Schnee, C., T. G. Köllner, M. Held, T. C. J. Turlings, J. Gershenzon, and J. Degenhardt. 2006. The Products of a Single Maize Sesquiterpene Synthase Form a Volatile Defense Signal That Attracts Natural Enemies of Maize Herbivores. Proceedings of the National Academy of Sciences of the United States of America 103: 1129-1134.

Soltis, P. S., X. Liu, D. B. Marchant, C. J. Visger, and D. E. Soltis. 2014. Polyploidy and novelty: Gottlieb's legacy. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 369: 20130351.

Campbell, S. A. and A. Kessler. 2013. Plant mating system transitions drive the macroevolution of defense strategies. Proceedings of the National Academy of Sciences 110: 3973-3978.

Tosun, G., B. Yayli, T. Arslan, A. Yasar, S. Karaoglu, and N. Yayli. 2012. Comparative Essential Oil Analysis of Calendula arvensis L. Extracted by Hydrodistillation and Microwave Distillation and Antimicrobial Activities. Asian Journal of Chemistry 24: 1955-1958.

Tukey, J. 1949. Comparing individual means in the analysis of variance. Biometrics 5: 99-114.

Ueda, H., Y. Kikuta, and K. Matsuda. 2012. Plant communication: Mediated by individual or blended VOCs? Plant Signaling & Behavior 7: 222-226.

Uesugi, A. and A. Kessler. 2013. Herbivore exclusion drives the evolution of plant competitiveness via increased allelopathy. The New Phytologist 198: 916-924.

Ukiya, M., T. Akihisa, K. Yasukawa, H. Tokuda, T. Suzuki, and Y. Kimura. 2006. Anti-inflammatory, anti-tumor-promoting, and cytotoxic activities of constituents of marigold (Calendula officinalis) flowers. Journal of Natural Products 69: 1692-1696.

Vermeij, G. J. 1994. The evolutionary interaction among species: selection, escalation, and coevolution. Annual Review of Ecology and Systematics 25: 219-236.

Vidal-Ollivier, E., R. Elias, F. Faure, A. Babadjamian, F. Crespin, G. Balansard, and G. Boudon. 2007. Flavonol glycosides from Calendula officinalis flowers. Planta Medica 55: 73-74.

Vogt, T. 2010. Phenylpropanoid biosynthesis. Molecular Plant 3: 2-20.

Warner, D. A. and G. E. Edwards. 1993. Effects of polyploidy on photosynthesis. Photosynthesis Research 35: 135-147.

Yoshikawa, M., T. Murakami, A. Kishi, T. Kageura, and H. Matsuda. 2001. Medicinal flowers. III. Marigold. (1): hypoglycemic, gastric emptying inhibitory, and gastroprotective principles and new oleanane-type triterpene oligoglycosides, calendasaponins A, B, C, and D, from Egyptian Calendula officinalis. Chemical & Pharmaceutical Bulletin 49: 863-870.
