<<

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 367

Evolution of Flowering Time in the Tetraploid bursa-pastoris ()

TANJA SLOTTE

ACTA UNIVERSITATIS UPSALIENSIS ISSN 1651-6214 UPPSALA ISBN 978-91-554-7024-1 2007 urn:nbn:se:uu:diva-8311                     !   " #$$"  $%$$ &    &    & '  ( )   *      (

 +  )( #$$"(   & ! *  )   )        ,-( .   (                           /0"( 1$ ( ( 2+ 3" 43 45514"$#14 (

.           6    &     &*          &      *    ( 2   2        & & *      *    &       *        &   ( !      *          &     (  . 7  &  *    .             * &    &    &            &               &                      *      (              &     &        * &         *          8       &     ( )    & & *    *        ( .           4  4& *    &&      *     * &&  & *  ( +  * 9) ,9  )  - &  & *        :   ( )             !"!#$% , %-   &  9)    &!'$()* &!  ,&-    *  & *    ( &&  &          * & *      &  ;    ( )  %     &     *     &      && & *  ( 2            8                 & & *         

+, - & *         8                9)

.  /    $   / *     / $         * / )  %0/    / $ 12345 / ,

< ) 6 +  #$$"

2++ 05 40# 1 2+ 3" 43 45514"$#14  %  %%% 4 / , %== (:(= > ? %  %%% 4 / -

looking carefully,

a shepherd’s purse is blooming

under the fence

Bash

List of papers

This thesis is based on the following papers, which are referred to by their Roman numerals:

I Slotte, T., Ceplitis, A., Neuffer, B., Hurka, H., and M. Lascoux. 2006. Intrageneric phylogeny of Capsella (Brassicaceae) and the origin of the tetraploid C. bursa-pastoris based on chloroplast and nuclear DNA sequences. American Journal of Botany 93: 1714-1724 II Slotte, T., Huang, H., Lascoux, M. and A. Ceplitis. Polyploid speci- ation did not confer instant reproductive isolation in Capsella (Bras- sicaceae). (Manuscript) III Slotte, T., Holm, K., McIntyre, L. M., Lagercrantz, U. and M. Las- coux. 2007. Differential expression of genes important for adapta- tion in Capsella bursa-pastoris (Brassicaceae). Physiology 145: 160-173 IV Ceplitis, A., Slotte, T., Neuffer, B., Linde, M., Kraft, T, and M. Las- coux. QTL mapping of flowering time using AFLP and candidate gene markers in the tetraploid Capsella bursa-pastoris (Brassica- ceae). (Manuscript) V Slotte, T., Huang, H., Ceplitis, A., Chen, J., and M. Lascoux. The role of introgressed alleles for flowering time variation in the tetraploid Capsella bursa-pastoris (Brassicaceae). (Manuscript)

Papers I and III are reprinted with kind permission from the Botanical Soci- ety of America and the American Society for Plant Biology, respectively.

Contents

Introduction...... 9 Polyploid speciation ...... 9 The model system Capsella ...... 11 Genetic basis of flowering time variation ...... 12 Finding the genes that matter for quantitative variation...... 13 Aims ...... 16 Specific aims...... 16 Results and Discussion ...... 17 Polyploid speciation in Capsella...... 17 Intrageneric phylogeny of Capsella (Paper I)...... 17 Multiple origins or introgression (Paper II)...... 20 Genetic basis of flowering time variation in C. bursa-pastoris ...... 23 Differential gene expression and flowering time variation (Paper III)23 QTL mapping (Paper IV)...... 25 Association mapping (Paper V)...... 26 Conclusions...... 30 Svensk sammanfattning ...... 31 Acknowledgements...... 33 References...... 34

Abbreviations

AFLP Amplified fragment length polymor- phism cpDNA Chloroplast DNA FDR False discovery rate GA Gibberellic acid ka Years before present, thousands QTL Quantitative trait locus SNP Single nucleotide polymorphism

Introduction

“Natural selection merely modified, while redundancy created”

The above quote summarizes Ohno’s (1970) view of gene duplication as a major source of evolutionary novelty. It has since become clear that gene duplication can play a role in the evolution of new gene function and in speciation. What is less well known is what role gene and genome duplica- tion plays for the evolution of quantitative traits, and whether this could pro- vide an adaptive benefit in polyploids. Up to now, studies of the genetics of quantitative traits in polyploids have mostly been confined to crops. Here, we study the genetic basis of flowering time variation and the evolutionary history of the wild tetraploid crucifer Capsella bursa-pastoris. Almost a century ago, the prominent American geneticist Shull (1929) conducted ex- tensive studies on speciation and genetic variation in C. bursa-pastoris and its relatives. The conceptual and experimental toolbox available today is very different from that available to Shull. However, our aims are similar: to carry genetic analysis “beyond the field of domesticated and animals into the realm of wild nature”.

Polyploid speciation Polyploidization, the duplication of entire genomes, is common in flowering plants, where an estimated 47 to 70% of all have gone through poly- ploidization at some point in their history (Masterson 1994). Polyploid spe- cies include major crops such as wheat, sugarcane, cotton and coffee, to name a few. Even species possessing “streamlined genomes” such as Arabi- dopsis thaliana exhibit signs of in their genomes ( Genome Initiative 2000; Simillon et al. 2002; Bowers et al. 2003), and the number of such cryptic paleopolyploids can be expected to increase with the rapid increase in whole genome sequences. According to their mode of origin, polyploids can be classified as autopolyploids, which form by genome doubling within a single species, and allopolyploids, which form by interspecific hybridization and genome dou- bling (Ramsey and Schemske 1998). When orthologous chromosomes from different species come together in one nucleus, as in allopolyploid species,

9 these chromosomes are termed homoeologous (or homeologous), and this term can be used for genes or regions on such chromosomes as well (Fig. 1).

Figure 1. Modes of polyploid formation (modified from Chen 2007). A. Autopoly- ploid formation occurs through genome doubling within a species. B. Allopolyploids form by interspecific hybridization and genome doubling. There are several ways for this to occur, e.g. by genome doubling in a first generation hybrid, by the union of unreduced (2n) gametes or by hybridization between two autopolyploids (not shown).

Polyploid speciation is likely to be a predominant mode of sympatric speci- ation in plants (Otto and Whitton 2000) and can also contribute to allopatric speciation (Lynch and Conery 2000). Of all speciation processes in flower- ing plants, polyploidy has been estimated to account for 2-4% (Otto and Whitton 2000). It is less common among animals, but polyploids can still be found in many animal lineages (Otto and Whitton 2000). Unlike other speci- ation processes, polyploidization is often assumed to constitute a special form of “instant speciation” whereby the newly formed polyploid is immedi- ately and completely reproductively isolated from its progenitor species (Coyne and Orr 2004; Linder and Rieseberg 2004; Mallet 2007). This is expected to occur because hybrids between a polyploid and its ancestors have an uneven number of chromosome sets, potentially leading to meiotic non-disjunction and infertility in hybrids. However, not all hybrids are com- pletely sterile, and it has been shown that hybridization between polyploids and their ancestors can constitute a “bridge” that allows gene flow across ploidy levels (Ramsey and Schemske 1998; Petit et al. 1999; Husband 2004; Henry et al. 2005; Henry et al. 2007). Although this process can lead to shared polymorphism across ploidy levels, such observations have more often been interpreted as signs of multiple origins of the polyploid, and the consensus in the field is that most polyploids originated more than once (Soltis and Soltis 1993; 1999; 2000). Exceptions include the peanut Arachis hypogaea (Kochert et al. 1996), the salt marsh grass Spartina anglica (Ainouche et al. 2004), and Arabidopsis suecica (Jakobsson et al. 2006), all of which appear to have had a single origin. The fact that so many extant plant species are polyploids implies that ge- nome duplication may have been an important source of evolutionary nov-

10 elty, as hypothesized by Ohno (1970). Although our knowledge on the ge- netic and epigenetic changes accompanying polyploid formation is rapidly increasing (Chen 2007), the genetic basis of quantitative traits in established polyploids has attracted less attention. Such studies have mostly been con- fined to traits of agricultural importance in polyploid crops (e.g. Jiang et al. 1998; Ming et al. 2001; Schranz et al. 2002; Delourme et al. 2006). The very duplicated nature of polyploid genomes renders studies of the genetics of quantitative traits more difficult, as distinguishing homoeologous loci from loci duplicated by other means than whole genome duplication, or even from alleles is not always straightforward. Although challenging, increasing our knowledge on the role of genome duplication for quantitative trait variation in non-domesticated species is necessary in order to understand the role of polyploidy in adaptation.

The model system Capsella The family Brassicaceae comprises some 340 genera and 3,350 species of which 50 % or more are believed to be polyploids (Koch et al. 2003). With the ready availability of genomic data from and Bras- sica crop species, the Brassicaceae provide several attractive opportunities to study polyploid evolution (e.g. Jakobsson et al. 2006). The Capsella is one of the most closely related to A. thaliana, with an estimated divergence time of Arabidopsis and Capsella of about 10 million years (Koch et al. 2000; 2001). According to current classification the genus includes three species: the tetraploid (2n=4x=32) Capsella bursa-pastoris (L.) Medik., and two diploids (2n=2x=16), Reut. and (Fauché & Chaub.) Boiss. (Hurka and Neuffer 1997). The self-fertile C. rubella is mainly found in central-southern Europe, whereas the self-incompatible C. grandiflora has a distribution limited to the western Balkans (Chater 1993; Hurka and Neuffer 1997). Due to its phylogenetic proximity to A. thaliana, the diploid C. rubella has attracted attention as a model system for compara- tive studies of the evolution of genome organization in crucifers (Boivin et al. 2004; Yogeeswaran et al. 2005) and genome sequencing of C. rubella is currently underway (Schranz et al. 2006). C. bursa-pastoris, or Shepherd’s Purse, is a predominantly selfing tetraploid species with disomic inheritance and a nearly worldwide distribu- tion (Hurka and Neuffer 1997). It grows as a weed on disturbed ground, and is characterized by an extensive variability for many morphological traits such as leaf shape and fruit shape, as well as life-history traits such as flow- ering time (Shull 1929; Neuffer and Hurka 1986a, b; Neuffer and Bartelheim 1989; Neuffer and Eschner 1995; Ceplitis et al. 2005). C. bursa-pastoris is generally thought to have originated in the Middle East, and due to its great

11 colonizing ability rapidly spread across the world, reaching the Americas with European colonizers and Australia in the 18th century (Neuffer and Hurka 1999; Ceplitis et al. 2005). The close relationship to A. thaliana means that genomic tools and data from A. thaliana can be utilized in Capsella. The genus therefore offers an excellent opportunity to study both polyploid speciation and the genetic ba- sis of quantitative traits in a wild, non-model polyploid. We chose to concen- trate on elucidating the genetic basis of flowering time, a quantitative trait of potential adaptive importance that is well studied in A. thaliana.

Genetic basis of flowering time variation Flowering time is a major life-history trait contributing to reproduction and adaptation, especially in annual plants (Roux et al. 2006). The time to flow- ering is a quantitative trait of crucial importance for seed production, and different flowering strategies may have evolved in response to local climatic conditions (Engelmann and Purugganan 2006; Mitchell-Olds and Schmitt 2006). The genetic basis of flowering time variation is well understood in Arabidopsis thaliana, where over 60 genes with effects on flowering have been identified, mainly through mutant analysis (Ehrenreich and Purugganan 2006). Four main pathways, the photoperiod, vernalization, gibberellin and autonomous pathways, allow the plant to perceive and respond changes in daylength, temperature and hormonal status, respectively (Mouradov et al. 2002; Simpson and Dean 2002; Koornneef et al. 2004). Floral pathway in- tegrator genes combine the signals from these pathways to fine-tune the tran- sition from vegetative to reproductive development, although recent studies also indicate that there is direct cross-talk between pathways (Edwards et al. 2006; Gould et al. 2006; Salathia et al. 2007). Understanding the genetic basis of natural variation is of primary interest for evolutionary studies of adaptation (Mitchell-Olds and Schmitt, 2006). The precise role of flowering genes among and within species can vary sig- nificantly and the effect of allelic variation for these genes in natural popula- tions is a focus of current research (Werner et al. 2005; Engelmann and Pu- rugganan 2006; Roux et al. 2006; Salathia et al. 2007). Recent studies dem- onstrate that the main genes responsible for natural variation in flowering time can differ between populations or species, reflecting differences in ge- netic architecture, ecological niche and history. In A. thaliana, variation at the genes FRIGIDA (FRI) and FLOWERING LOCUS C (FLC) which are involved in the vernalization response can explain a great deal of genetic variation in flowering time (Johanson et al. 2000; Caicedo et al. 2004), and selection for earlier flowering appears to have acted on FRI (Hagenblad and Nordborg 2002; Le Corre et al. 2002; Toomajian et al. 2006). However, pho- toreceptor genes such as CRYPTOCHROME2 (CRY2) and PHYTOCHROME

12 C (PHYC) have also been implicated in natural variation flowering time in A. thaliana (El-Assal et al. 2001; Balasubramanian et al. 2006). Findings from A. thaliana have successfully been used to start to elucidate the genetic basis of natural flowering time variation in other crucifer species (B. rapa: Schranz et al. 2002, B. nigra: Kruskopf-Österberg et al. 2002, and B. ol- eracea: Okazaki et al. 2007). However, despite the availability of genomic tools, there is a dearth of studies on the genetic control of natural variation in flowering time in the closest relatives of A. thaliana such as Capsella spp..

Finding the genes that matter for quantitative variation Quantitative traits are traits that are affected by segregation of alleles at many loci, as well as by the environment. Identifying the genetic basis of quantitative trait variation is meaningful, not only because it allows us to answer long-standing questions on the types of polymorphisms affecting quantitative traits (e.g. cis-regulatory or coding sequence; Hoekstra and Coyne 2007; Wray 2007) and their effect sizes, but also because such studies can ultimately contribute to our understanding of the evolutionary forces that maintain such variation (Barton and Keightley 2002; Mitchell-Olds et al. 2007). However, the path to identifying the genes underlying phenotypic variation is often long and difficult, and in most cases a combination of ap- proaches is required for successful identification of causative polymor- phisms. Three of the currently most common approaches will be outlined below, namely QTL (Quantitative Trait Locus) and association mapping, and gene expression microarray analysis. A main aim in genetics is to find out “how the genes map onto the pheno- type”, in the words of Sydney Brenner’s 2002 Nobel Prize lecture. QTL mapping and association mapping both aim to achieve this goal, identifying parts of the genome that affect phenotypic variation for quantitative traits. Both methods rely on linkage disequlibrium (the non-random association of alleles at separate loci) between linked loci, but the source of the linkage disequilibrium differs between the two methods. Whereas QTL mapping studies are usually undertaken in an experimental context, where the map- ping population is created by some crossing scheme (see e.g. Lynch and Walsh 1998), association mapping relies on samples from natural popula- tions. For QTL mapping, a common strategy is to cross two parental lines that differ in the phenotypic trait of interest to create an F1, which is then inter- crossed or selfed to create an F2 mapping population, or back-crossed to one of the parental lines. The phenotypic trait is then scored in the mapping population alongside with a number of molecular markers spread across the genome, and statistical methods are applied to identify markers that are sig- nificantly associated with trait variation. Although QTL mapping studies are

13 useful as a first step towards identifying the genes underlying quantitative variation they also have several limitations. First, the exact QTLs identified can be dependent on the environment, making results from lab-based studies difficult to generalize to conditions in the field (Mauricio 2001). Second, when QTL studies are based on simple line crosses such as those described above, they often do not capture the genetic variation of importance in a wider population sample. Finally, as mentioned above, the limited resolution of QTL mapping is also an issue: QTLs can encompass chromosomal re- gions containing hundreds of genes, and without prior information on gene function and location, laborious fine-scale mapping is necessary to identify causative polymorphisms (Glazier et al. 2002). Association mapping offers (at least partial) solutions to the problems of generality and low resolution of QTL mapping. In association mapping stud- ies, instead of generating experimental mapping populations, samples are collected from natural populations, and associations between phenotypic variation and genetic markers are then sought in these samples. Association mapping has greater resolution than QTL mapping, because of the greater number of historical recombination events in the genomes of related indi- viduals from a natural population than between e.g. F2 individuals (Nord- borg and Tavare 2002). Provided that sampling is done in a representative way, results are also more general than those for QTL based on line crosses. However, association studies suffer from elevated rates of false positives unless population structure is properly accounted for (Lander and Schork 1994; Pritchard and Rosenberg 1999). In general, the prospects for success- ful association mapping depend on the rate of decay of linkage disequilib- rium, which is currently not known beyond a few model species (Flint- Garcia et al. 2003). The number of reports of successful genome-wide association studies (so far, mainly in humans) is currently rapidly increasing (e.g. Fellay et al. 2007; Scott et al. 2007). However, in most non-model organisms, association stud- ies are focused on a chromosomal region identified by QTL mapping or other means, or use a candidate gene approach (i.e. focuses on genes with known functions relevant to the trait) (e.g. Thornsberry et al. 2001; Krus- kopf-Österberg et al. 2002). When there are no known candidate genes for a trait, or when they are too many for them to be exhaustively evaluated, gene expression microarray studies can provide a means to select genes for further study (Wayne and McIntyre 2002). This method entails hybridizing fluorescently labeled cDNA to glass slides with probes for thousands of genes, and subsequently measuring the intensity of fluorescence for each probe. Whole-genome tran- scription profiles obtained using microarrays can be used to identify genes that are expressed at different levels in ecotypes, tissues, developmental stages or time points relevant to the trait under study (Remington and Pu- rugganan 2003). Such studies can provide additional information on the rela-

14 tive importance of different pathways for the trait, and recently have been extended to reconstruct regulatory networks and map expression QTL (Keurentjes et al. 2007). Co-localization between expression QTL and QTL for a quantitative trait of interest constitutes strong evidence that differential gene expression is important for quantitative variation (Kirst et al. 2004). When the aim is to find the genes that matter for quantitative trait variation, microarray gene expression studies are probably most powerful when used in combination with classical quantitative genetics techniques (Doerge 2002).

15 Aims

The main aims of this study were twofold: first, to elucidate the origin of the tetraploid C. bursa-pastoris, and second, to investigate the genetic basis of variation in flowering time in this species.

Specific aims

I Study phylogenetic relationships between diploid and tetraploid Capsella species using nuclear and cpDNA sequences (Paper I). II Evaluate the possibility of post-polyploidization gene flow between C. bursa-pastoris and C. rubella (Paper II). III Characterize genome-wide differential gene expression between early- and late-flowering C. bursa-pastoris accessions (Paper III). IV Map QTL for flowering time in an F2 mapping population (Paper IV). V Assess whether polymorphism at candidate flowering time genes are associated with flowering time in natural populations (Paper V).

16 Results and Discussion

Polyploid speciation in Capsella

Intrageneric phylogeny of Capsella (Paper I) In paper I, we investigated the phylogenetic relationships between the dip- loids C. rubella and C. grandiflora and the tetraploid C. bursa-pastoris. Based on isozyme and chloroplast restriction patterns, C. bursa-pastoris was previously hypothesized to be an allopolyploid of C. rubella and C. grandi- flora (Hurka et al. 1989; Mummenhoff and Hurka 1990), or an autopolyploid of C. grandiflora (Hurka and Neuffer 1997). The main aim of this study was to test these hypotheses by reconstructing phylogenetic trees based on DNA sequence data. To elucidate both maternal and paternal contributions in the origin of the tetraploid C. bursa-pastoris, we obtained DNA sequences for two maternally inherited cpDNA loci and parts of three biparentally inher- ited independent nuclear genes that are single-copy in A. thaliana: Alcohol dehydrogenase (Adh), PISTILLATA (PI) and LUMINIDEPENDENS (LD). cpDNA loci were sequenced in 20 accessions of C. bursa-pastoris sampled mainly in Europe but including accessions from China, North America and Africa, 9 accessions of C. rubella from throughout the species range, two accessions of C. grandiflora and in outgroup taxa ( paniculata and spp.). Nuclear loci were sequenced in four accessions of the tetraploid C. bursa-pastoris, and in all accessions of the diploid Capsella species. The 1.5 kb cpDNA sequenced was nearly devoid of intraspecific variation in C. bursa-pastoris, where the only intraspecific polymorphism was a single 11-bp indel found in one accession. In agreement with this, a low level of intraspecific polymorphism was previously observed at seven chloroplast microsatellites (Ceplitis et al. 2005). Together, these observations indicate that the maternal effective population size of C. bursa-pastoris is small, per- haps because the species had a single maternal origin.

17

Figure 2. Phylogenetic trees, with bootstrap values, for the cpDNA data set (A), Adh (B), LD (C) and PI (D). For the cpDNA data set and for Adh, a single most parsimo- nious tree was found. For LD and PI, one of the two most parsimonious trees is shown. C. bursa-pastoris is abbreviated by Cbp, C. rubella by Cr and C. grandiflora by Cg.

18 If C. rubella or C. grandiflora was the maternal ancestor of C. bursa- pastoris, one would expect cpDNA sequences from C. bursa-pastoris to cluster with sequences from either of these diploids. However, neither C. rubella nor C. grandiflora were sister to C. bursa-pastoris in the single, well-supported phylogeny retrieved for the cpDNA data set (Fig. 2). In order to maintain that either of these extant diploid Capsella species constitutes the maternal parent of C. bursa-pastoris, one would therefore have to invoke additional processes of chloroplast divergence and/or sorting to explain the observed pattern. For all three nuclear loci sequenced, the diploid Capsella accessions ex- hibited no signs of gene duplication, whereas two sequence types were re- trieved in all C. bursa-pastoris accessions. These sequence types probably correspond to duplicated loci because they did not segregate as alleles in offspring of each of the four C. bursa-pastoris accessions (“fixed heterozy- gosity”). In combination with the lack of evidence for duplication in the diploids, this indicates that these sequences are likely to correspond to ho- moeologous loci. In the ideal case, sequences for homoeologous loci should cluster with the same or with different diploid species if the tetraploid were an auto- or allo- polyploid, respectively. However, in the present study, neither of these two possible topologies appeared for any of the three nuclear genes (Fig. 2). Even though the topologies for the three different nuclear loci were similar, none of them can be easily reconciled with either of the previous hypotheses on the origin of the tetraploid C. bursa-pastoris. In each phylogenetic tree, sequences from each diploid species formed a clade, as did the sequences for one of the C. bursa-pastoris homoeologs (the homoeolog designated with a “B” in Fig. 2). At the other homoeolog (designated with an “A” in Fig. 2) however, some C. bursa-pastoris accessions harbored an allele shared with C. rubella. At Adh, C. bursa-pastoris accession SE14 from Sweden harbored such a shared allele, at LD, accession 721 from California had an allele shared with C. rubella, and at PI, accession 740 from Nevada had such an allele. The presence of alleles shared with C. rubella in C. bursa-pastoris was an unexpected finding. There are at least two possible explanations for this. First, the C. rubella alleles could be present in C. bursa-pastoris as remnants of a polyploidization event including C. rubella or a species harboring C. rubella-like alleles. This “retention of ancestral polymorphism” scenario would be more likely if the tetraploid had multiple origins, as a single poly- ploid speciation event is an extreme bottleneck that would be unlikely to leave several alleles segregating in the newly formed tetraploid population. Alternatively, the C. rubella alleles could have been introduced into C. bursa-pastoris after the establishment of the tetraploid, i.e. as a result of recent introgressive hybridization. Due to the limited number of C. bursa-

19 pastoris accessions sequenced for nuclear genes, distinguishing between these hypotheses was not possible in this study.

Multiple origins or introgression (Paper II) In paper II, we sequence six nuclear loci duplicated by polyploidy in 92 ac- cessions of the tetraploid C. bursa-pastoris, and in 21 accessions of its close diploid relative C. rubella. We compare patterns of allele sharing and nu- cleotide diversity in western Eurasia, where C. bursa-pastoris and C. rubella occur in partial sympatry, with those in eastern Eurasia, where C. rubella does not grow. In order to distinguish between ancestral polymorphism, ren- dered more likely if the tetraploid originated more than once, and introgres- sion as the cause of allele sharing, we employ coalescent-based divergence population genetic analyses to quantify gene flow across ploidy levels. Mean nucleotide diversity () of C. bursa-pastoris was an order of magni- tude higher in western than in eastern Eurasia (2.5 × 10-3 vs. 2.2 × 10-4). There was allele sharing between C. bursa-pastoris and C. rubella in west- ern but not in eastern Eurasia. In samples from western Eurasia, C. bursa- pastoris accessions shared alleles with C. rubella at one of the homoeologs of four nuclear genes: Alcohol dehydrogenase (Adh), CRYPTOCHROME 1 (CRY1), LUMINIDEPENDENS (LD) and PISTILLATA (PI). For each of these genes, the C. bursa-pastoris homoeolog that harbored C. rubella al- leles in western Eurasia was also the least divergent from C. rubella in C. bursa-pastoris from eastern Eurasia. Shared polymorphism in sympatry but not in allopatry is usually considered a hallmark of introgressive hybridiza- tion, but because C. bursa-pastoris is believed to have originated in the near Middle East or the Balkans (Hurka and Neuffer 1997) this pattern could also be due to retention of ancestral polymorphism, especially if the tetraploid arose more than once. We therefore proceeded to test whether our data was better explained by a model of speciation with or without gene flow between C. bursa-pastoris and C. rubella. For this purpose, we used an isolation- with-migration model that allows for population size change (Hey 2005). Our prediction was that, if allele-sharing between C. rubella and C. bursa- pastoris is only due to retention of ancestral polymorphism, then estimates of gene flow between these species should be zero. If, on the other hand, hybridization and introgression have occurred, then some estimates of gene flow between species should be non-zero. Indeed, we found the latter to be true. Using the isolation-with-migration model, there was evidence for uni-directional gene flow from C. rubella to C. bursa-pastoris in western Eurasia, where C. bursa-pastoris and C. rubella are partially sympatric and share alleles (Fig. 3). In eastern Eurasia, where C. rubella is absent, there was no evidence for gene flow in either direction (Fig. 3). Analyses of both geographical samples supported a scenario where

20 the origin of the tetraploid C. bursa-pastoris constituted a severe bottleneck, after which the species underwent a strong population expansion to a current effective population size of about 30000.

Figure 3. Probability density estimates for gene flow (A) and timing of migration (B). A. The left plot shows the probability density for gene flow in western Eurasia, from C. rubella to C. bursa-pastoris in black and from C. bursa-pastoris to C. ru- bella in grey. The right plot shows the probability density for gene flow in eastern Eurasia. B. Posterior probability distributions for the time of introgression from C. rubella to C. bursa-pastoris (black) and the time of gene flow from western to east- ern Eurasian C. bursa-pastoris populations (grey), for LD.

For each locus analyzed, only a few, quite recent introgression events are sufficient to explain the observed pattern of polymorphism. The timing of these events is necessarily coarse and dependent on model specifics and mutation rate assumptions. However, the general timeframe suggests that hybridization and introgression could have taken place during or after the last glacial maximum (ca 18 ka). The lack of shared alleles in eastern Eura- sia despite their presence at moderately high frequencies in western Eurasia suggests that C. bursa-pastoris may have spread to eastern Eurasia before hybridization and introgression took place in western Eurasia. Indeed, the timing of migration events supported such a scenario for all genes where C. bursa-pastoris shared alleles with C. rubella (Fig. 3). Because all loci had

21 negative values for Tajima’s D in the sample from eastern Eurasia, it appears that a demographic process such as a founder event followed by population expansion may have affected genetic variation in this region. A scenario where C. bursa-pastoris originated in the Middle East and subsequently spread both eastwards to Asia and westwards to Europe, where introgressive hybridization with C. rubella took place, is in agreement with this analysis. Divergence population genetic analyses are increasingly being used to as- sess gene flow between and within diploid species. In this study, we used an extension of the isolation-with-migration model (Hey 2005) which allows the specification of a set of parameters to create a realistic framework for polyploid speciation. Our results show that polyploid speciation need not result in immediate and complete reproductive isolation, that post- polyploidization hybridization and introgression can contribute significantly to genetic variation in a newly formed polyploid, and that divergence popu- lation genetic analysis constitutes a powerful way of testing hypotheses on polyploid speciation.

22 Genetic basis of flowering time variation in C. bursa- pastoris

Differential gene expression and flowering time variation (Paper III) The aim of paper III was to characterize gene expression differences be- tween early- and late-flowering C. bursa-pastoris accessions, in order to identify flowering pathways that might be involved in generating natural flowering time variation in this species. Based on a data from a survey of natural flowering time variation (Ceplitis et al. 2005), we chose two ecotypes that were extreme in their flowering responses, the early-flowering accession PL from Puli, Taiwan, and the late-flowering accession SE14 from Härnösand, Sweden. We assessed the flowering time of these accessions with and without vernalization treatment, and survival analysis showed that all four groups (non-vernalized and vernalized plants of each accession) differed in flowering time (Fig. 4). Both accessions responded to vernaliza- tion, although the effect was greater for accession SE14 than for PL.

Figure 4. Predicted probability of survival (not flowering) for each of the four strata, using a nonparametric Cox proportional hazards model. The horizontal axis displays the flowering time and the vertical axis displays the predicted probability of not flowering (a value of 1.0 indicates none of the plants are flowering). Accession SE14 is shown in grey and PL in black. Dashed lines are for vernalized plants and entire lines for non-vernalized plants of each accession.

Using Arabidopsis (Arabidopsis thaliana) microarrays, we found a large number of significant differences in gene expression between 1-week-old seedlings of these ecotypes. A total of 21 known flowering time genes were differentially expressed between extreme flowering ecotypes (at false dis- covery rate, FDR 0.1), and most of these were genes that were involved

23 either in regulating circadian rhythm or gibberellic acid (GA) biosynthesis and signaling (Fig. 5). In accordance with this, there was a significant over- representation of differentially expressed genes in the Gene Ontology cate- gory “circadian rhythm” (P = 0.023, Fisher’s exact test) as well as of genes involved in GA biosynthesis and signaling (P = 0.042, Fisher’s exact test). FLOWERING LOCUS C (FLC), a key flowering time regulator in A. thaliana, was not differentially expressed between ecotypes prior to vernali- zation.

Figure 5. Overview of flowering time pathways in A. thaliana. Figures are adapted from (Mouradov et al. 2002), (He and Amasino 2005) and (Roux et al. 2006). Genes that were significantly differentially expressed between accessions are in boldface. Genes that were up-regulated in SE14 compared to PL are marked in green and those that are down-regulated in SE14 compared to PL are marked in magenta.

A detailed time-series study of the expression of two core circadian genes clearly demonstrated that the two extreme flowering ecotypes differed in circadian rhythm. Good agreement in differential expression of flowering time genes was found in a biological validation using an independent pair of North American ecotypes with less extreme differences in flowering time, suggesting a shared genetic basis of these expression differences, or parallel evolution of similar regulatory differences. In A. thaliana, the circadian clock is an integral component in the external coincidence model for photo- periodic flowering, and altered clock function can also cause differences in

24 the expression of GA biosynthesis genes (Blazquez et al. 2002). Therefore, we conclude that genes involved in regulation of the circadian clock are strong candidates for the evolution of flowering time variation in C. bursa- pastoris.

QTL mapping (Paper IV) The aim of paper IV was to map QTL responsible for flowering time varia- tion using AFLP and gene-specific markers. For this purpose, we obtained F2 seeds from a cross of two North American flowering time ecotypes de- scribed in Linde et al. (2001). Seeds were germinated and the flowering time of 157 F2 individuals was assessed under long-day conditions. Using six selective primer combinations, a total of 102 AFLP loci were obtained. Gene-specific primers were developed to facilitate comparisons to other Brassicaceae genomes. The candidate flowering time genes CONSTANS (CO), DE-ETIOLATED 1 (DET1), FCA, FLOWERING LOCUS C (FLC), FRIGIDA (FRI), GA REQUIRING 1 (GA1) and LUMINIDEPENDENS (LD) were chosen for development of gene-specific markers. For each gene, we aimed at mapping both homoeologs, but this was only possible for FCA, GA1 and LD. Only one homoeolog of FRI and nei- ther of the homoeologs of CO, DET1 and FLC could be mapped because there were no sequence differences between the mapping parents for any of these loci in the regions examined. In total, seven loci corresponding to four different flowering time genes (GA1, FCA, FRI and LD) were therefore suc- cessfully mapped in this cross. The final linkage map consisted of 89 loci (82 AFLP loci and seven gene-based loci). These loci were distributed on 15 linkage groups (one fewer than the haploid chromosome number in C. bursa-pastoris). Putative homoeologous loci for each flowering-time gene mapped to different linkage groups, as expected for true homoeologs. Over- all, the genome structure of C. bursa-pastoris appears very similar to that in A. lyrata and C. rubella. For instance, the LD and FCA genes which are situ- ated on one chromosome in A. thaliana but on two separate chromosomes in A. lyrata, also mapped to independent linkage groups in this study. Of the 157 F2 individuals, 133 had flowered before 200 days. The flower- ing time of the early-flowering parental line was 45 days and that of the late- flowering parent was 91 days. The 133 F2 individuals had an average flow- ering time of 64 days and a right-skewed distribution of flowering times. A single-QTL genome scan revealed two QTL with LOD scores exceeding the genome-wide 1% significance level. These QTL explained 19.5% and 16.5% of the genetic variation in flowering time, and were found on linkage groups CBP06 and CBP14, located at 30.1 cM and 65.9 cM, respectively (Fig. 6). The most closely situated gene-based markers were LD A and GA1 A on CBP06 at 0.0 cM and 2.7 cM, respectively, and LD B and GA1 B on CBP14 at 15.8 cM and 14.5 cM, respectively.

25

Figure 6. The two QTL for flowering time map to linkage groups CBP06 (A) and CBP14 (B). The horizontal axis shows the map position in cM and the vertical axis the lod score.The positions of gene-based markers for LD and GA1 are shown on the horizontal axis.

Thus, the QTL for flowering time map to homoeologous linkage groups, which could mean that polymorphisms in homoeologous genes underlie these QTL. However, fine-scale mapping is necessary to determine whether this is the case. In summary, this study outlined two chromosomal regions of particular interest for studies of flowering time variation in C. bursa- pastoris, namely the parts of linkage groups CBP06 and CBP14 correspond- ing to the upper part of A. lyrata chromosome AL6. Genes involved in regu- lating the circadian rhythm and the GA pathway differed in expression be- tween the mapping parents (these accessions were used for biological valida- tion in Paper III). Therefore, candidate flowering time genes in this chromo- somal region with such effects on gene expression are of special interest for studies of the genetic basis of flowering time variation in C. bursa-pastoris.

Association mapping (Paper V) In paper V, we test whether introgressive hybridization has played a role in the evolution of flowering time variation in Capsella. The tetraploid C. bursa-pastoris has experienced introgressive hybridization with its close relative C. rubella in western Eurasia, but not in China (Paper II). Here, we measured flowering time and obtained DNA sequences for eight candidate genes for flowering time and ten background loci not involved in flowering time regulation, in two large samples from western Eurasia and China. Popu- lation stratification was assessed using the background loci, and we tested whether candidate gene polymorphisms were associated with flowering time

26 variation using a mixed-model approach which takes population structure into account (Yu et al. 2006; Zhao et al. 2007). Candidate genes were selected partly based on a previous study that mapped two major QTL for flowering time to homoeologous linkage groups (Paper IV). These linkage groups correspond to the upper part of A. lyrata chromosome AL6. Here, we sequenced parts of both homoeologs of two candidate genes located in this chromosomal region, LUMINIDEPENDENS (LD) and CRYPTOCHROME1 (CRY1). In addition, we included two candi- date flowering time genes in the vernalization pathway, FRIGIDA (FRI) and FLOWERING LOCUS C (FLC), which are responsible for a large part of the natural variation in flowering time in A. thaliana. We found no evidence for population structure in China, whereas analysis using STRUCTURE (Pritchard et al. 2000) indicated the presence of a southern and a northern cluster in western Eurasia. In western Eurasia, four SNPs in two candidate genes, CRY1 A and FLC A, were significantly asso- ciated with flowering time variation. In China, a SNP in FLC A was also significantly associated with flowering time variation. Both FLC A SNPs that were associated with flowering time are potentially functional polymor- phisms, because they disrupt consensus splice sites. Chinese accessions with a non-consensus splice donor site 3’ of exon five had an FLC A transcript which lacked exon five, and flowered earlier than accessions with a consen- sus splice donor site (Figs. 7, 8). Even though the FLC A splice site mutation that segregated in European accessions was located in a different position than that found in China, their predicted coding sequences were identical (Fig. 7). This suggests that parallel loss of FLC A exon five has evolved in different parts of the species distribution, perhaps as a result of selection for early flowering. If the absence of FLC exon five affects the function of this protein, early flowering would be the expected outcome given that FLC is a potent repressor of the transition to flowering.

27

Figure 7. Effect of different FLC A SNP495 alleles on splicing. A. The predicted cds in the sequenced fragment of FLC A lacks exon 5 in Chinese accessions (GY31 and GY36) with a non-consensus splice donor site at SNP position 495. Accessions with the consensus splice donor site (XY106, SE14) have a predicted cds containing exon 5. B. GY31 and GY 36 produce a shorter transcript than accessions with the consensus splice donor sequence AG (XA106, SE14). (The first eight wells after the ladder contain PCR products for samples not included in this study). C. Accessions with the non-consensus splice site have a 43-bp deletion in the FLC A transcript, corresponding to exon five. Accessions with the consensus splice donor sequence AG have a normal FLC A transcript. In XA106 and SE14, both homoeologs of FLC were detected, as indicated by the ambiguous positions marked by + signs. Only FLC A was detected in accessions GY36 and GY31.

The CRY1 A SNPs that were significantly associated with flowering time in western Eurasia define a haplotype that is shared with C. rubella, probably as a result of introgression (Paper II). Accessions harboring the C. rubella haplotype flowered later than those with C. bursa-pastoris haplotypes (Fig. 8). This suggests that introgressive hybridization has contributed to quantita- tive variation in flowering time. CRY1 is an attractive candidate locus for flowering time in C. bursa-pastoris, because differences in circadian rhythm and gibberellin biosynthesis gene expression previously observed between early- and late-flowering C. bursa-pastoris accessions (Slotte et al. 2007)

28 could be caused by functional polymorphism at CRY1. However, the CRY1 A SNPs that were associated with flowering time are not themselves ex- pected to affect the function of the protein (they are all synonymous) but instead might be in linkage disequilibrium with a causative polymorphism in CRY1 A or a linked gene.

Figure 8. Estimated effects of significant candidate gene SNPs on flowering time (mean number of days to flowering, DTF ± standard error). A. In western Eurasia, the C at SNP 8 in CRY1 A is only found in the northern cluster (Q1), where it is associated with later flowering. The haplotype containing the C allele at SNP 8 is shared with C. rubella. B. In China, the A at SNP 495 in FLC A which yields a non- consensus splice site is associated with earlier flowering.

In conclusion, the results suggest that different polymorphisms are associ- ated with flowering time variation in two parts of the species range. In west- ern Eurasia we find that alleles shared with C. rubella, probably as a result of introgressive hybridization, are associated with flowering time variation. In addition, splice site polymorphisms in FLC are associated with flowering time in both regions. Although further work is needed to fully characterize the genetic polymorphisms involved, this study suggests that introgression and differential splicing could be important for the evolution of flowering time in C. bursa-pastoris.

29 Conclusions

In summary, our investigations into polyploid speciation in Capsella yielded several unexpected results. First, C. bursa-pastoris does not appear to be an allopolyploid of C. rubella and C. grandiflora as was previously hypothe- sized, nor does it seem to be an autopolyploid of C. grandiflora (paper I). The true ancestor(s) of C. bursa-pastoris remain unknown, but must have been closely related to both C. rubella and C. grandiflora. Second, unlike commonly assumed, polyploid speciation did not confer immediate and complete reproductive isolation in Capsella (paper II). Instead, our studies indicate that unidirectional gene flow from C. rubella to C. bursa-pastoris has taken place in western Eurasia, where these species occur in mixed populations. Such post-polyploidization gene flow could constitute an im- portant but often overlooked source of genetic variation in polyploids. In the following papers (III-V), we used three different approaches to characterize the genetic basis of flowering time in C. bursa-pastoris. In summary, we found that genes involved in GA biosynthesis and response and regulation of the circadian clock differ in expression between early- and late-flowering C. bursa-pastoris (Paper III), that two main QTL for flower- ing time map to homoeologous regions of the C. bursa-pastoris genome, corresponding to the upper part of A. lyrata chromosome AL6 (Paper IV) and that introgressed alleles at one homoeolog of a gene situated in this re- gion, CRYPTOCHROME 1 (CRY1), as well as splice site polymorphisms in a FLOWERING LOCUS C (FLC) homoeolog are associated with natural flowering time variation in C. bursa-pastoris (Paper V). Future studies of the genetic basis of flowering time variation in C. bursa- pastoris should aim to identify causal polymorphisms in CRY1 and nearby genes, to functionally characterize the effects of FLC splicing variation on flowering time and to conduct fine-mapping of the identified QTL regions to determine whether homoeologous genes are responsible for flowering time variation in this tetraploid. In general, C. bursa-pastoris constitutes a good model system to investigate variation in introgression across the genome and its effect on quantitative trait variation in a polyploid species.

30 Svensk sammanfattning

Hos lomme, en senapsväxt som finns över hela världen, blommar populatio- ner från nordliga breddgrader senare än sådana från sydliga breddgrader. Blomningstiden, tiden från sådd till blomning, tros därför utgöra en anpass- ning till miljön hos denna art. I min avhandling har jag undersökt vilka gener detta beror på, samt hur arten lomme bildats. I den första artikeln använde vi oss av DNA-sekvenser för att undersöka hur den vanliga lommen är släkt med storlomme och skärlomme. Vi fann att en del av den genetiska variationen i kärn-DNAt delades mellan vanlig lomme och skärlomme, en art som enbart finns i Syd- och Centraleuropa. För att förstå hur detta kunde komma sig samlade vi in DNA-sekvenser för fler gener och prover. En populationsgenetisk analys visade att dessa arter troligen delar genetisk variation eftersom de hybridiserat med varandra (arti- kel II). Detta var ett oväntat resultat, eftersom lomme är tetraploid, dvs har har fyra kromosomer av varje sort medan de andra arterna har två. Sådana korsningar brukar leda till sterilitet hos avkomman, men detta gäller inte alltid hos växter och tydligen inte hos lomme. För att förstå hur blomningstiden styrs i lomme använde vi tre olika me- toder. I artikel III undersökte vi geners aktivitet. Två klasser av gener som har effekt på blomningstid hos andra arter skilde sig i aktivitet mellan tidig- och senblommande lomme. Dessa gener påverkar växtens dygnsrytm, samt signaleringen och produktionen av växthormonet gibberellin. Gener som är inblandade i dessa processer kan alltså ha betydelse för blomningstid hos lomme. Vi identifierade därefter områden i lommes arvsmassa som har stor bety- delse för variationen i blomningstid, på basen av en korsning mellan tidig- och senblommande lomme (artikel IV). Det fanns två sådana områden, och de satt på duplicerade kromosomer. Detta kan betyda att duplicerade gener är viktiga för blomningstid, men det behövs mer detaljerade studier för att avgöra om så är fallet. Slutligen studerade vi genetisk variation och blomningstid i en större upp- sättning prover av lomme från västra Eurasien och Kina (artikel V). Det fanns ett samband mellan genetisk variation och blomningstid för två gener, CRYPTOCHROME1 (CRY1) och FLOWERING LOCUS C (FLC). Eftersom CRY1 sitter i den del av arvsmassan som vi identifierade i artikel IV är det troligt att denna gen eller någon som sitter nära den på samma kromosom är viktig för blomningstid. Intressant nog delade lomme genetisk variation med

31 skärlomme för just denna gen, så hybridiseringen mellan dessa arter kan ha påverkat blomningstiden hos lomme. Det verkar alltså som om både duplice- rade kromosomer och hybridisering bidragit till variationen i blomningstid hos lomme. Resultaten av den här studien är viktiga för vår förståelse av huruvida duplicering av arvsmassan kan utgöra en fördel och bidra till väx- ters anpassning till miljön.

32 Acknowledgements

Several people have helped me in the work on this thesis, and I am deeply grateful to all of them. First and foremost I would like to thank my main supervisor Martin Lascoux for all his support and encouragement. Second, I would like to thank my assistant supervisor Ulf Lagercrantz, for numerous discussions on questions related to flowering time, for sharing his extensive technical expertise, and for being very friendly and helpful. Many thanks to fellow Capsella bursa-pastoris workers Alf Ceplitis, Huirun Huang and Karl Holm. I am deeply indebted to all of you for your input and the discussions that went into this thesis. I have greatly appreciated collaborating with Anna Palmé, Kate St. Onge and Jesper Bechsgaard on C. rubella and C. grandiflora. Several people helped me in the lab: Mattias Myrenås, Jun Chen and Robert Malmgren. Special thanks to everyone who read and commented on draft manuscripts, especially Sunniva Aagaard, Koen Verhoeven and Anna Palmé. Barbara Neuffer at Osnabrück University kindly contributed valuable plant material. My sincerest thanks to all of you! I learned a great deal during my stay in the Computational Genomics lab at Purdue University in 2005. I am very grateful to Lauren M. McIntyre for all her statistical advice and for generously hosting me. Many thanks also to Koen, Lisa and Jason, who made my stay in West Lafayette, Indiana very enjoyable. Thanks to everyone in the departments of Evolutionary Functional Ge- nomics and Population Biology for creating a nice working environment and for many laughs during the coffee breaks. Special thanks to Kerstin Santes- son and Susanne Gustafsson who kept the lab up and running during my time here! I would also like to thank the friends who took my mind off work every once in a while and my family for their constant support. Finally, my deepest thanks to Tobias, for always being there.

33 References

Ainouche, M. L., A. Baumel, and A. Salmon. 2004. Spartina anglica C. E. Hubbard: a natural model system for analysing early evolutionary changes that affect allopolyploid genomes. Biol J Linn Soc 82:475- 484. Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the Arabidopsis thaliana. Nature 408:796-815. Balasubramanian, S., S. Sureshkumar, M. Agrawal, T. P. Michael, C. Wessinger, J. N. Maloof, R. Clark, N. Warthmann, J. Chory, and D. Weigel. 2006. The PHYTOCHROME C photoreceptor gene medi- ates natural variation in flowering and growth responses of Arabi- dopsis thaliana. Nat Genet 38:711-715. Barton, N. H., and P. D. Keightley. 2002. Understanding quantitative genetic variation. Nat Rev Genet 3:11-21. Blazquez, M. A., M. Trenor, and D. Weigel. 2002. Independent control of gibberellin biosynthesis and flowering time by the circadian clock in Arabidopsis. Plant Physiol 130:1770-1775. Boivin, K., A. Acarkan, R. S. Mbulu, O. Clarenz, and R. Schmidt. 2004. The Arabidopsis genome sequence as a tool for genome analysis in Bras- sicaceae. A comparison of the Arabidopsis and Capsella rubella ge- nomes. Plant Physiol 135:735-744. Bowers, J. E., B. A. Chapman, J. K. Rong, and A. H. Paterson. 2003. Unrav- elling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433-438. Caicedo, A. L., J. R. Stinchcombe, K. M. Olsen, J. Schmitt, and M. D. Pu- rugganan. 2004. Epistatic interaction between Arabidopsis FRI and FLC flowering time genes generates a latitudinal cline in a life his- tory trait. P Natl Acad Sci USA 101:15670-15675. Ceplitis, A., S. Yingtao, and M. Lascoux. 2005. Bayesian inference of evolu- tionary history from chloroplast microsatellites in the cosmopolitan weed Capsella bursa-pastoris (Brassicaceae). Mol Ecol 14:4221- 4233. Chater, A. O. 1993. Capsella. Pp. 381-382 in T. G. Tutin, Heywood, H., Burges, N.A., Moore, D.M., Valentine, D.H., Walters S.M. & Webb D.A., ed. Flora Europaea. Cambridge University Press, Cambridge. Chen, Z. J. 2007. Genetic and epigenetic mechanisms for gene expression and phenotypic variation in plant polyploids. Ann Rev Plant Biol 58:377-406.

34 Coyne, J. A., and H. A. Orr. 2004. Speciation. Sinauer Associates, Sunder- land, MA. Delourme, R., C. Falentin, V. Huteau, V. Clouet, R. Horvais, B. Gandon, S. Specel, L. Hanneton, J. E. Dheu, M. Deschamps, E. Margale, P. Vincourt, and M. Renard. 2006. Genetic control of oil content in oil- seed rape (Brassica napus L.). Theor Appl Genet 113:1331-1345. Doerge, R. W. 2002. Mapping and analysis of quantitative trait loci in ex- perimental populations. Nat Rev Genet 3:43-52. Edwards, K. D., P. E. Anderson, A. Hall, N. S. Salathia, J. C. W. Locke, J. R. Lynn, M. Straume, J. Q. Smith, and A. J. Millar. 2006. FLOWERING LOCUS C mediates natural variation in the high- temperature response of the Arabidopsis circadian clock. Plant Cell 18:639-650. Ehrenreich, I. M., and M. D. Purugganan. 2006. The molecular genetic basis of plant adaptation. Am J Bot 93:953-962. El-Assal, S. E. D., C. Alonso-Blanco, A. J. M. Peeters, V. Raz, and M. Koornneef. 2001. A QTL for flowering time in Arabidopsis reveals a novel allele of CRY2. Nat Genet 29:435-440. Engelmann, K., and M. Purugganan. 2006. The molecular evolutionary ecol- ogy of plant development: Flowering time in Arabidopsis thaliana. Pp. 507-526. Advances in Botanical Research: Incorporating Ad- vances in Plant Pathology, Vol 44. Fellay, J., K. V. Shianna, D. Ge, S. Colombo, B. Ledergerber, M. Weale, K. Zhang, C. Gumbs, A. Castagna, A. Cossarizza, A. Cozzi-Lepri, A. De Luca, P. Easterbrook, P. Francioli, S. Mallal, J. Martinez-Picado, J. M. Miro, N. Obel, J. P. Smith, J. Wyniger, P. Descombes, S. E. Antonarakis, N. L. Letvin, A. J. McMichael, B. F. Haynes, A. Telenti, and D. B. Goldstein. 2007. A whole-genome association study of major determinants for host control of HIV-1. Science 317:944-947. Flint-Garcia, S. A., J. M. Thornsberry, and E. S. Buckler. 2003. Structure of linkage disequilibrium in plants. Ann Rev Plant Biol 54:357-374. Glazier, A. M., J. H. Nadeau, and T. J. Aitman. 2002. Finding genes that underlie complex traits. Science 298:2345-2349. Gould, P. D., J. C. W. Locke, C. Larue, M. M. Southern, S. J. Davis, S. Hanano, R. Moyle, R. Milich, J. Putterill, A. J. Millar, and A. Hall. 2006. The molecular basis of temperature compensation in the Arabidopsis circadian clock. Plant Cell 18:1177-1187. Hagenblad, J., and M. Nordborg. 2002. Sequence variation and haplotype structure surrounding the flowering time locus FRI in Arabidopsis thaliana. Genetics 161:289-298. He, Y. H., and R. M. Amasino. 2005. Role of chromatin modification in flowering-time control. Tr Plant Sci 10:30-35. Henry, I. M., B. P. Dilkes, and L. Comai. 2007. Genetic basis for dosage sensitivity in Arabidopsis thaliana. PLoS Genet 3:e70.

35 Henry, I. M., B. P. Dilkes, K. Young, B. Watson, H. Wu, and L. Comai. 2005. Aneuploidy and genetic variation in the Arabidopsis thaliana triploid response. Genetics 170:1979-1988. Hey, J. 2005. On the number of New World founders: A population genetic portrait of the peopling of the Americas. Plos Biol 3:965-975. Hoekstra, H. E., and J. A. Coyne. 2007. The locus of evolution: Evo devo and the genetics of adaptation. Evolution 61:995-1016. Hurka, H., S. Freundner, A. H. D. Brown, and U. Plantholt. 1989. Aspartate aminotransferase isozymes in the genus Capsella (Brassicaceae): subcellular location, gene duplication, and polymorphism. Biochem Genet 27:77 - 90. Hurka, H., and B. Neuffer. 1997. Evolutionary processes in the genus Capsella (Brassicaceae). Plant Syst Evol 206:295-316. Husband, B. C. 2004. The role of triploid hybrids in the evolutionary dynam- ics of mixed-ploidy populations. Biol J Linn Soc 82:537-546. Jakobsson, M., J. Hagenblad, S. Tavare, T. Sall, C. Hallden, C. Lind- Hallden, and M. Nordborg. 2006. A unique recent origin of the al- lotetraploid species Arabidopsis suecica: Evidence from nuclear DNA markers. Mol Biol Evol 23:1217-1231. Jiang, C. X., R. J. Wright, K. M. El-Zik, and A. H. Paterson. 1998. Polyploid formation created unique avenues for response to selection in Gos- sypium (cotton). P Natl Acad Sci USA 95:4419-4424. Johanson, U., J. West, C. Lister, S. Michaels, R. Amasino, and C. Dean. 2000. Molecular analysis of FRIGIDA, a major determinant of natu- ral variation in Arabidopsis flowering time. Science 290:344-347. Keurentjes, J. J. B., J. Y. Fu, I. R. Terpstra, J. M. Garcia, G. van den Ack- erveken, L. B. Snoek, A. J. M. Peeters, D. Vreugdenhil, M. Koorn- neef, and R. C. Jansen. 2007. Regulatory network construction in Arabidopsis by using genome-wide gene expression quantitative trait loci. P Natl Acad Sci USA 104:1708-1713. Kirst, M., A. A. Myburg, J. P. G. De Leon, M. E. Kirst, J. Scott, and R. Sederoff. 2004. Coordinated genetic regulation of growth and lignin revealed by quantitative trait locus analysis of cDNA microarray data in an interspecific backcross of eucalyptus. Plant Physiol 135:2368-2378. Koch, M., I. A. Al-Shehbaz, and K. Mummenhoff. 2003. Molecular sys- tematics, evolution, and population biology in the mustard family (Brassicaceae). Ann Mo Bot Gard 90:151-171. Koch, M., B. Haubold, and T. Mitchell-Olds. 2001. Molecular systematics of the Brassicaceae: evidence from coding plastidic matK and nuclear Chs sequences. Am J Bot 88:534-544. Koch, M. A., B. Haubold, and T. Mitchell-Olds. 2000. Comparative evolu- tionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol Biol Evol 17:1483-1498.

36 Kochert, G., H. T. Stalker, M. Gimenes, L. Galgaro, C. R. Lopes, and K. Moore. 1996. RFLP and cytogenetic evidence on the origin and evo- lution of allotetraploid domesticated peanut, Arachis hypogaea (Leguminosae). Am J Bot 83:1282-1291. Koornneef, M., C. Alonso-Blanco, and D. Vreugdenhil. 2004. Naturally occurring genetic variation in Arabidopsis thaliana. Ann Rev Pl Biol 55:141-172. Kruskopf-Österberg, M., O. Shavorskaya, M. Lascoux, and U. Lagercrantz. 2002. Naturally occurring indel variation in the Brassica nigra COL1 gene is associated with variation in flowering time. Genetics 161:299-306. Lander, E. S., and N. J. Schork. 1994. Genetic dissection of complex traits. Science 265:2037-2048. Le Corre, V., F. Roux, and X. Reboud. 2002. DNA polymorphism at the FRIGIDA gene in Arabidopsis thaliana: Extensive nonsynonymous variation is consistent with local selection for flowering time. Mol Biol Evol 19:1261-1271. Linde, M., S. Diel, and B. Neuffer. 2001. Flowering ecotypes of Capsella bursa-pastoris (L.) Medik. (Brassicaceae) analysed by a cosegrega- tion of phenotypic characters (QTL) and molecular markers. Ann Bot-London 87:91-99. Linder, C. R., and L. H. Rieseberg. 2004. Reconstructing patterns of reticu- late evolution in plants. Am J Bot 91:1700-1708. Lynch, M., and J. S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-1155. Lynch, M., and B. Walsh. 1998. Genetics and analysis of quantitative traits. Sinauer Associates, Inc., Sunderland, MA. Mallet, J. 2007. Hybrid speciation. Nature 446:279-283. Masterson, J. 1994. Stomatal size in fossil plants: evidence for polyploidy in majority of angiosperms. Science 264:421-423. Mauricio, R. 2001. Mapping quantitative trait loci in plants: Uses and cave- ats for evolutionary biology. Nat Rev Genet 2:370-381. Ming, R., S. C. Liu, P. H. Moore, J. E. Irvine, and A. H. Paterson. 2001. QTL analysis in a complex autopolyploid: Genetic control of sugar content in sugarcane. Genome Res 11:2075-2084. Mitchell-Olds, T., and J. Schmitt. 2006. Genetic mechanisms and evolution- ary significance of natural variation in Arabidopsis. Nature 441:947- 952. Mitchell-Olds, T., J. H. Willis, and D. B. Goldstein. 2007. Which evolution- ary processes influence natural genetic variation for phenotypic traits? Nat Rev Genet 8:845-856. Mouradov, A., F. Cremer, and G. Coupland. 2002. Control of flowering time: Interacting pathways as a basis for diversity. Plant Cell 14:S111-S130.

37 Mummenhoff, K., and H. Hurka. 1990. Evolution of the tetraploid Capsella bursa-pastoris (Brassicaceae): isoelectric focusing analysis of Rubisco. Plant Syst Evol 172:205 - 213. Neuffer, B., and S. Bartelheim. 1989. Genetic-ecology of Capsella bursa- pastoris from an altitudinal transsect in the Alps. Oecologia 81:521- 527. Neuffer, B., and S. Eschner. 1995. Life-history traits and ploidy levels in the genus Capsella (Brassicaceae). Can J Bot 73:1354-1365. Neuffer, B., and H. Hurka. 1986a. Adaptation in life-history traits of coloniz- ing plant-species - variation of development time until flowering in natural-populations of Capsella bursa-pastoris (Cruciferae). Plant Syst Evol 152:277-296. Neuffer, B., and H. Hurka. 1986b. Variation of growth form parameters in Capsella (Cruciferae). Plant Syst Evol 153:265-279. Neuffer, B., and H. Hurka. 1999. Colonization history and introduction dy- namics of Capsella bursa-pastoris (Brassicaceae) in North America: isozymes and quantitative traits. Mol Ecol 8:1667-1681. Nordborg, M., and S. Tavare. 2002. Linkage disequilibrium: what history has to tell us. Trends Genet 18:83-90. Ohno, S. 1970. Evolution by gene duplication. George Allen and Unwin, London. Okazaki, K., K. Sakamoto, R. Kikuchi, A. Saito, E. Togashi, Y. Kuginuki, S. Matsumoto, and M. Hirai. 2007. Mapping and characterization of FLC homologs and QTL analysis of flowering time in Brassica ol- eracea. Theor Appl Genet 114:595-608. Otto, S. P., and J. Whitton. 2000. Polyploid incidence and evolution. Ann Rev Genet 34:401-437. Petit, C., F. Bretagnolle, and F. Felber. 1999. Evolutionary consequences of diploid-polyploid hybrid zones in wild species. Trends Ecol Evol 14:306-311. Pritchard, J. K., and N. A. Rosenberg. 1999. Use of unlinked genetic mark- ers to detect population stratification in association studies. Am J Hum Genet 65:220-228. Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-959. Ramsey, J., and D. W. Schemske. 1998. Pathways, mechanisms, and rates of polyploid formation in flowering plants. Ann Rev Ecol Syst 29:467- 501. Remington, D. L., and M. D. Purugganan. 2003. Candidate genes, quantita- tive trait loci, and functional trait evolution in plants. Int J Plant Sci 164:S7-S20. Roux, F., P. Touzet, J. Cuguen, and V. Le Corre. 2006. How to be early flowering: an evolutionary perspective. Trends Plant Sci 11:375- 381.

38 Salathia, N., S. J. Davis, J. R. Lynn, S. D. Michaels, R. M. Amasino, and A. J. Millar. 2007. FLOWERING LOCUS C -dependent and - independent regulation of the circadian clock by the autonomous and vernalization pathways. BMC Plant Biol 6. Schranz, M. E., M. A. Lysak, and T. Mitchell-Olds. 2006. The ABC's of comparative genomics in the Brassicaceae: building blocks of cruci- fer genomes. Trends Plant Sci 11:535-542. Schranz, M. E., P. Quijada, S.-B. Sung, L. Lukens, R. Amasino, and T. C. Osborn. 2002. Characterization and effects of the replicated flower- ing time gene FLC in Brassica rapa. Genetics 162:1457-1468. Scott, L. J., K. L. Mohlke, L. L. Bonnycastle, C. J. Willer, Y. Li, W. L. Duren, M. R. Erdos, H. M. Stringham, P. S. Chines, A. U. Jackson, L. Prokunina-Olsson, C.-J. Ding, A. J. Swift, N. Narisu, T. Hu, R. Pruim, R. Xiao, X.-Y. Li, K. N. Conneely, N. L. Riebow, A. G. Sprau, M. Tong, P. P. White, K. N. Hetrick, M. W. Barnhart, C. W. Bark, J. L. Goldstein, L. Watkins, F. Xiang, J. Saramies, T. A. Bu- chanan, R. M. Watanabe, T. T. Valle, L. Kinnunen, G. R. Abecasis, E. W. Pugh, K. F. Doheny, R. N. Bergman, J. Tuomilehto, F. S. Collins, and M. Boehnke. 2007. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Sci- ence 316:1341-1345. Shull, G. H. 1929. Species hybridization among old and new species of Shepherd's Purse. Pp. 837-888. International Congress of Plant Sci- ences, Ithaca, New York. Simillon, C., K. Vandepoele, M. C. van Montagu, M. Z. Zabeau, and Y. van de Peer. 2002. The hidden duplication past of Arabidopsis thaliana. P Natl Acad Sci USA 99:13627-13623. Simpson, G. G., and C. Dean. 2002. Arabidopsis, the rosetta stone of flower- ing time? Science 296:285-289. Slotte, T., K. Holm, L. M. McIntyre, U. Lagercrantz, and M. Lascoux. 2007. Differential expression of genes important for adaptation in Capsella bursa-pastoris (Brassicaceae). Plant Physiol 145:160-173. Soltis, D. E., and P. S. Soltis. 1993. Molecular-data and the dynamic nature of polyploidy. Crit Rev Plant Sci 12:243-273. Soltis, D. E., and P. S. Soltis. 1999. Polyploidy: recurrent formation and genome evolution. Trends Ecol Evol 14:348-352. Soltis, P. S., and D. E. Soltis. 2000. The role of genetic and genomic attrib- utes in the success of polyploids. P Natl Acad Sci USA 97:7051- 7057. Thornsberry, J. M., M. M. Goodman, J. Doebley, S. Kresovich, D. Nielsen, and E. S. Buckler. 2001. Dwarf8 polymorphisms associate with variation in flowering time. Nat Genet 28:286-289. Toomajian, C., T. T. Hu, M. J. Aranzana, C. Lister, C. L. Tang, H. G. Zheng, K. Y. Zhao, P. Calabrese, C. Dean, and M. Nordborg. 2006. A non-

39 parametric test reveals selection for rapid flowering in the Arabidop- sis genome. PLoS Biol4:732-738. Wayne, M. L., and L. M. McIntyre. 2002. Combining mapping and arraying: An approach to candidate gene identification. P Natl Acad Sci USA 99:14903-14906. Werner, J. D., J. O. Borevitz, N. Warthmann, G. T. Trainer, J. R. Ecker, J. Chory, and D. Weigel. 2005. Quantitative trait locus mapping and DNA array hybridization identify an FLM deletion as a cause for natural flowering-time variation. P Natl Acad Sci USA 102:2460- 2465. Wray, G. A. 2007. The evolutionary significance of cis-regulatory muta- tions. Nat Rev Genet 8:206-216. Yogeeswaran, K., A. Frary, T. L. York, A. Amenta, A. H. Lesser, J. B. Nas- rallah, S. D. Tanksley, and M. E. Nasrallah. 2005. Comparative ge- nome analyses of Arabidopsis spp.: Inferring chromosomal rear- rangement events in the evolutionary history of A. thaliana. Genome Res 15:505-515. Yu, J. M., G. Pressoir, W. H. Briggs, I. V. Bi, M. Yamasaki, J. F. Doebley, M. D. McMullen, B. S. Gaut, D. M. Nielsen, J. B. Holland, S. Kres- ovich, and E. S. Buckler. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38:203-208. Zhao, K. Y., M. J. Aranzana, S. Kim, C. Lister, C. Shindo, C. L. Tang, C. Toomajian, H. G. Zheng, C. Dean, P. Marjoram, and M. Nordborg. 2007. An Arabidopsis example of association mapping in structured samples. PLoS Genet 3:e4.

40

Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 367

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology”.)

ACTA UNIVERSITATIS UPSALIENSIS Distribution: publications.uu.se UPPSALA urn:nbn:se:uu:diva-8311 2007