Characterization of a meiotic recombination hotspot in Arabidopsis thaliana Hossein Khademian

To cite this version:

Hossein Khademian. Characterization of a meiotic recombination hotspot in Arabidopsis thaliana. Agricultural sciences. Université Paris Sud - Paris XI, 2012. English. ￿NNT : 2012PA112051￿. ￿tel- 00800551￿

HAL Id: tel-00800551 https://tel.archives-ouvertes.fr/tel-00800551 Submitted on 14 Mar 2013

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés.

UNIVERSITE PARIS-SUD 11

U.F.R. Scientifique d’Orsay

Thèse

Présentée pour l’obtention du grade de

Docteur en Sciences de l’Université Paris-Sud XI

Spécialité : Sciences du Végétal

par

Hossein KHADEMIAN

Caractérisation d’un point chaud de recombinaison méiotique chez Arabidopsis thaliana

Composition du jury :

Valérie BORDE Rapporteur Michel DRON Président du Jury Corinne GREY Examinateur Christine MEZARD Directeur de Thèse Minoo RASSOULZADEGAN Rapporteur

Abstract

Meiotic recombination initiated in prophase I of generates either crossovers (COs), which are reciprocal exchanges between chromosome segments, or gene conversion not associated to crossovers (NCOs). Both kinds of events occur in narrow regions (less than 10 kilobases) called hotspots, which are distributed non-homogenously along chromosomes. The aim of my PhD was the characterization of a hotspot of meiotic recombination (named 14a) in Arabidopsis thaliana (i) across different accessions (ii) in msh4 mutant, a gene involved in CO formation. In both ColxLer and ColxWs hybrids (i) 14a had a very high rate of COs 0.85% and 0.49%, respectively (ii) COs clustered in two small regions of a few kilobases, 14a1 and 14a2 with typical Gaussian curve distribution observed in other organisms (iii) 14a1 was also a hotspot of NCO with high rate (0.5%) in ColxLer (iv) a bias of recombination initiation at 14a1 CO and NCO hotspot was found in ColxLer. A reduction of CO frequency was observed in msh4 mutant in ColxLer background at 14a1 (6.4%) and 14a2 (18.7%) compared to wild type. This confirmed previously known role of MSH4 protein in CO formation. Frequency of NCO at 14a1 was similar to wild type. The role of putative Arabidopsis histone H3K4 trimethyltransferase in meiotic recombination as previously observed like Set1 in S.cerevisiae or PRDM9 in mammals (mice and human) was also studied. None of ten putative histone methyltransferase genes was involved in meiosis. This could be due to (i) a strong redundancy of function between gene products (ii) another histone methyltransferase in charge of labeling meiotic recombination hotspots (more than 29 putative histone methyltransferase have been identified in the Arabidopsis genome!) (iii) contrary to S. cerevisiae, mice and humans, another mechanism for epigenetic control of meiotic recombination.

Résumé

La recombinaison méiotique initiée en prophase I de méiose génère soit des crossing-over (COs), qui sont des échanges réciproques entre segments chromosomiques, ou des conversions géniques non associées aux COs (NCOs). Les deux types d'événements se produisent dans de petites régions (moins de 10 kilobases) appelées points chauds, qui sont distribuées de manière non homogène le long des chromosomes. L'objectif de ma thèse était la caractérisation d'un point chaud de recombinaison méiotique (nommée 14a) chez Arabidopsis thaliana (i) dans différentes accessions (ii) dans le mutant msh4, un gène impliqué dans la formation des COs. Dans les deux hybrides ColxLer et ColxWs (i) 14a a un taux très élevé de COs 0,85% et 0,49%, respectivement (ii) Les COs sont regroupés dans deux petites régions de quelques kilobases, 14a1 et 14a2 avec une distribution de type gaussienne observée aux points chauds décrits dans d'autres espèces (iii) 14a1 est aussi un point chaud de NCO avec un taux aussi élevé que celui des COs (0,5%) dans ColxLer (iv) un biais de l'initiation de recombinaison a été trouvé dans 14a1 aussi bien pour les COs que les NCOs dans le fond génétique ColxLer. Une réduction de la fréquence de CO a été observée dans le mutant msh4 dans le fond génétique ColxLer à 14a1 et 14a2 (6,4% et 18,7% par rapport au sauvage). Cela confirme le rôle précédemment connu de la protéine MSH4 impliqué dans la formation de CO. La fréquence de NCO à 14a1 est similaire à celle observéedans le fond sauvage.

Le rôle des H3K4 histones trimethyltransferase d’Arabidopsis dans la recombinaison méiotique (comme précédemment observé comme Set1 chez S. cerevisiae ou PRDM9 chez les mammifères) a également été étudiée. Aucun des dix gènes d’histones méthyltransférase étudié n'a montré de rôle dans la méiose. Cela pourrait être dû à (i) une forte redondance de la fonction entre les protéines (ii) une autre histone méthyltransférase en charge de l'étiquetage des points chauds de recombinaison méiotique (plus de 29 putatif histone méthyltransférase ont été identifiés dans le génome d'Arabidopsis!) (iii) contrairement à S. cerevisiae, les souris et l'homme, un autre mécanisme de contrôle épigénétique de la recombinaison méiotique.

TABLE DES MATIERES

INTRODUCTION...... 3 1.1 Meiosis (general information)...... 4 1.1.1 A brief history of genetics...... 5 1.1.2 Meiosis as Mendel’s law key ...... 6 1.2. Chronology and cytological aspects of prophase I of meiosis...... 7 1.2.1 Leptotene ...... 8 1.2.2 Zygotene...... 9 I.2.3 Pachytene Stage ...... 10 I.2.4 Diplotene and Diakinesis Stages...... 11 1.2. 5 Metaphase, anaphase and telophase 1...... 11 1.2. 6 Metaphase, anaphase and telophase 2...... 11 1.3 Molecular description of meiotic recombination ...... 12 1.3.1 Double strand breaks (DSB) formation...... 13 1.3.2 Maturation of DSBs ...... 14 1.3.3 Strand invasion...... 15 1.3.4 Inter-homologue bias ...... 16 1.3.5 Pathways...... 17 1.3.5.1 Type I Crossover...... 17 1.3.5.2 Type II Crossover...... 19 1.3.5.3 Non Crossover (Gene Conversion non associated with CO) ...... 19 1.3.6. What about molecular mechanisms of meiotic recombination in Arabidopsis thaliana? ...... 21 1.4 Distribution of Crossovers ...... 25 1.4.1 Chromosomal Map Determination of Crossover Location...... 25 1.4.2 Genomic Map Determination of Crossover Location...... 26 1.5 Distribution of DSBs ...... 27 1.6 Meiotic Recombination Hotspots ...... 28 1.6.1 Meiotic Recombination Hotspots in Fungi ...... 29 1.6.2 Meiotic Recombination Hotspots in Mammals...... 31 1.6.2.1. Mammalian Hotspots Characterized by Sperm Typing...... 31 1.6.2.2 The HapMap project and historical hotspots in human...... 33 1.6.2.3 Towards Elucidation of Mammalian Hotspots Localization Determinant ...... 34 1.6.3 Meiotic Recombination Hotspots in Plants ...... 37 1.7 Human genetic diseases caused by meiotic recombination...... 37 1.7.1 Diseases caused by deletion and duplication...... 38 1.7.2 Diseases due to inversions by homologous recombination ...... 40 1.7.3 Diseases due to gene fusion and gene conversion mediated by recombination ...... 40 1.8 Influence of Meiotic Recombination Control on Plant Breeding ...... 42 1.9. Aims of my thesis work ...... 43 CHAPTER 2: CHARACTERIZATION OF THE 14A HOTSPOT...... 45 2.1 Introduction...... 46 2.2. Manuscript ...... 48 2.3. Conclusion ...... 85 CHAPTER 3: ROLE OF MSH4 IN CO AND NCO FORMATION AT 14A HOTSPOT IN ARABIDOPSIS THALIANA...... 87

1 CHAPTER 4: CHARACTERIZATION OF THE 14A HOTSPOT IN VARIOUS GENETIC BACKGROUNDS ...... 91 Rate and Distribution of COs in the F1 ColxWs ...... 93 CHAPTER 5: EPIGENETIC CONTROL OF MEIOTIC RECOMBINATION IN ARABIDOPSIS THALIANA ...... 95 5.1. Introduction...... 96 5.2 The screen...... 98 5.2.1 SDG6 (SUVR5, CZS) ...... 98 5.2.2 SDG4 (ASHR3) and SDG8 (ASHH2) ...... 99 5.2.3 SDG2 (ATXR3) and SDG25 (ATXR7)...... 100 5.2.4 ATX1 to ATX5...... 101 5.3 Conclusion ...... 102 CONCLUSION ...... 103 MATERIAL AND METHODS...... 107 Plant material ...... 108 The Arabidopsis thaliana lines used for CO and NCO analysis ...... 108 Methods...... 108 CO and NCO analysis ...... 108 Primers used for ColxWs analysis...... 109 Primers used for CO detection ...... 109 Primers used for mutant genotyping ...... 110 Light and confocal microscopy ...... 113 Manuscript...... 113 ANNEXES ...... 137 REFERENCES...... 142

2

INTRODUCTION

3 Homologous chromosomes Meiosis

DNA replication and setting up of cohesion between sister chromatids

Anaphase: Loss of Bivalent at metaphase I: cohesion and sister the genetic exchange between homologous chromatids separation chromatids following a crossing-over and Cohesin cohesion between sister complex chromatids resist the forces performed by the spindle. Microtubules

Mono-oriented kinetochores

Bi-oriented Anaphase I: loss kinetochores of cohesion at the arms and separation of homologous chromosomes

Anaphase II: Formation of two diploid daughter cells loss of Figure 1-1: Schematic representation of chromosome cohesion at the segregation in mitosis and meiosis. Cohesion is centromeres, established during meiotic and mitotic DNA replication. In kinetochore mitosis, loss of cohesion and tension exerted by re-orientation microtubules on bi-oriented kinetochores allows the and separation segregation of sister chromatids. In meiosis, homologous of sister recombination events lead to the formation of crossing- chromatids over allowing together with sister chromatid cohesion the association of homologous chromosomes in bivalents. In anaphase I, the loss of cohesion at the arms of chromosomes and the tension exerted by microtubules on mono-oriented kinetochores allows the segregation of homologous chromosomes at opposite poles. In meiosis II, the loss of centromeric cohesion and microtubule tension on bi-oriented kinetochores leads to the separation of sister Formation of four haploid daughter cells chromatids. From de Muyt, 2008, thesis

1.1 Meiosis (general information)

Meiosis is an essential stage for all organisms having sexual reproduction. It is characterized by two successive chromosomal divisions following one single DNA replication. Thus, a mother diploid cell undergoes a series of two divisions during which four haploid daughter cells are created (figure 1-1). These haploid cells will generate a gamete sothat the ploidy level will be restored by fertilization, through fusion between gametes from each parent. One of the major differences between meiosis and mitosis is extra chromosome segregation. The division called reductional, corresponding to the first meiotic division, allows segregation of homologous chromosomes to opposite poles of the cell. The second meiotic division (called equational) is similar to a mitotic type division because it is characterized by the separation of sister chromatids (figure 1-1). Many cases of aneuploidies come from chromosomal segregation defect during meiosis (Hassold et al., 2007). For instance, 95% of trisomy 21 in human is due to problems of meiotic chromosomal segregation (the rest 5% comes from mitotic miss-segregation at embryonic stage) (Kokotas et al., 2008). Generally, such defects of meiotic segregation are from maternal origin (90%) and take place mainly during first meiotic division (69% of defects of maternal segregation) (Kokotas et al., 2008). To explain the distribution of chromosomes in meiosis, an important question arose: which events are required for a proper segregation of meiotic chromosomes? There are three essential functions by which the chromosomes are segregated normally: (i) Each pair of homologous chromosomes associates through a reciprocal exchange of genetic material so-called Crossing Over (CO), which involves the mechanism of homologous recombination. The structure resulting of this association is called “bivalent” and is formed during Prophase I of meiosis. Absence of CO leads to the formation of univalent instead of bivalents, and subsequently a random segregation of chromosomes during the first meiotic division. However, in certain species such as the Drosophila male, the physical interaction between homologous chromosomes is not ensured by CO formation. Here, the bivalent formation relies on chromosome pairing without any CO. Thus it is more the physical link that the exchange of genetic material per se that is necessary to homologous chromosome segregation.

4 (ii) A proper distribution of chromosomes also requires a delicate regulation of meiotic cohesion between sister chromatids. The cohesion takes place immediately after DNA replication by recruiting a protein complex, the cohesin complex, which keeps sister chromatids joined together along their length (figure 1-1). A sequential degradation of this complex will occur through meiotic division: firstly, the cohesion is released at chromosomal arms during the first division of meiosis, which leads to separation of homologous chromosomes. At the centromeres, the cohesion between sister chromatids is maintained until anaphase II. The centromeric cohesion is lost during the second meiotic division which results in the separation of sister chromatids (figure 1-1) (Nasmyth, 1999). Problems of cohesion are often associated with chromosome segregation defects: a premature separation of bivalents or early separation of sister chromatids (Petronczki et al., 2003). (iii) The orientation of kinetochores, a multiprotein complex localized to centromeres, is crucial to achieve a proper separation of chromosomes. During the first meiotic division, the kinetochores organize on two sister chromatids in a way that they are oriented toward the same direction (mono-oriented or monopolar). During the second meiotic division, the kinetochores reorientate to become opposite one to another (bipolar or bi-oriented). Consequently, the sister chromatids can be separated toward the opposite poles of the cell (figure 1-1).

1.1.1 A brief history of genetics

During the eighteenth century biologists tried to answer some basic, but fundamental questions such as: Why did certain traits disappear in one generation only to reappear suddenly in the next? Why some crosses are similar to parents, while others produced progeny different from each parent? Why was inbreeding sometimes successful and sometimes a failure? And what was the phenomenon of heredity at last? A century before Mendel, J.G. Kolreuter (1733–1806) a German naturalist, demonstrated that different characters are segregated in a hybrid progeny (inherited independently of each other). The Englishman Thomas Andrew Knight (1759–1838) first identified the common garden pea, Pisum sativum, as a model for crossing tests: its breeding can be strongly controlled since the structure of its flowers avoids pollination from outsiders. He also showed that the progeny of a cross between two pea parents (F1) was always uniform in appearance; a cross between the F1s produced progeny (F2) revealing the characters present in the initial parental generation. A generation later, in 1824, J. Goss and A. Seton,

5 described the phenomenon of dominance (again, in peas): certain traits of one parent entirely overshadowed the expression of other traits of the other parent, so that, for example, all the progeny of a purebred yellow-flowered parent and a white-flowered parent would have yellow flowers. Later, Gregor Mendel (1822–1884) chose the garden pea, like previous scientists, and made thousands of crosses from which he hoped to statistically infer the laws of heredity. Between 1856 and 1864, Mendel chose seven variable traits present in 22 distinct varieties of the pea (seed color, position of flower, form of pod, etc.), which he knew segregated independently making all the possible crosses between them and recording the results. The 1866 paper, ‘Experiments in Plant Hybridization’, detailed what would later become known as Mendel’s Laws: each parent shares one of two factors (alleles) so that their segregation occurs independently into gametes. The gametes fuse in the next generation to form an individual; it makes no difference whether a factor is inherited from one or the other parent (the mother or father). When two different factors fuse in the zygote, a range of several possibilities could happen (expression of alleles such: additive, dominant, recessive etc.). Factors for one trait segregate independently of factors for another trait. Mendel deduced these inferences following: the 3:1 dominance ratios in the progeny of the crosses between F1 hybrids (found in F2s) and the 9:3:3:1 ratios while breeding for two traits together. Unfortunately, Mendel’s results were ignored for over 30 years. However, during this period important developments in cytology were made. For example, Oscar Hertwig described the sperm entering the nucleus of the egg and the initiation of fertilization; Walther Flemming described the stages of ‘mitosis’; August Weismann theorized the need for a reduction division during the formation of gametes (termed ‘meiosis’ in 1905) and Wilhelm Waldeyer in 1888, named the fibrous nuclear components, as ‘chromosomes’. In 1900, three botanists – Hugo de Vries in Amsterdam, Carl Correns in Tubingen and Eric von Tschermak in Vienna – independently rediscovered Mendel’s laws.

1.1.2 Meiosis as Mendel’s law key

The meiotic process and the discovery of CO provide a physical explanation for Mendel’s Laws. According to Mendel’s first law, each allele from either maternal or paternal types separate randomly. This fact is clarified because: first, each of the gametes produced after the two successive divisions of meiosis possess only one type (paternal or maternal) of a given chromosome region. Second, by random fusion of gametes from each parent, the zygote is formed. The second Mendel’s law reflects that alleles corresponding to two different traits

6 Figure 1-2: Diagram to summarize the durations of the meiotic stages in Arabidopsis pollen mother cell (PMC). S Meiotic S-phase, L leptotene, Z/P zygotene/pachytene, D–T diplotene to tetrads (From Armstrong and Jones, 2003). segregate independently one from each other. The explanation of this rule is simple if the two traits stand on different chromosomes: during meiosis I, each bivalent aligns independently on the spindle, without respect to the parental origin. The more complex situation occurs when the genes corresponding to two traits stand on the same chromosome. If we consider this situation, the segregation pattern will depend on the crossover frequencies between those two given loci. One of the usual contradictions to Mendel’s law is observed when a crossover takes place between the maternal and paternal alleles of two traits situated on the same chromosome. Indeed, in the case of the high frequency of crossover between two loci on the same chromosome, the four equally combinations of the pairs of alleles (called, paternal, maternal, paternal to maternal recombinant and maternal to paternal recombinant) will be observed. While an intermediary frequency of crossover happens, a partial linkage between the two traits will be detected. In fact, the association of genetic traits in a linear arrangement and the presence of were inferred from the documentation of such exceptions to Mendel’s second law (Janssens, 1909; Morgan and Cattell, 1912; Sturtevant, 1913).

1.2. Chronology and cytological aspects of prophase I of meiosis

Prophase I is the longest and the most complex stage of meiosis. The duration of meiotic prophase is longer in comparison to mitosis and is highly variable depending on species and sexes. For example, the prophase I in male mouse takes 10 days, while that’s only 33 hours in Arabidopsis (figure 1-2) (Armstrong et al., 2003; Cohen et al., 2006). Similarly, the duration of prophase can vary within the same species, ranging from simple to double between male and female meiocyte of Caenorhaditis elegans (54 to 60 hours for oocytes and 20 to 24 hours for spermatocytes) (Jaramillo-Lambert et al., 2007). In 1977, Bennett estimated the duration of meiosis in about 70 organisms. They showed a vast variability of meiosis duration ranging from less than 6 h in yeast to more than 40 years in the human female (Bennett, 1977). Bennett also compared chromosome number, ploidy level and duration of meiosis on several plant species (table 1-1) (Bennett, 1971). The question was whether the meiosis duration was correlated to chromosome number or ploidy level? But, they didn’t find any correlation between ploidy level and duration of meiosis. For instance, the meiosis duration in Triticale (octoploid), a hybrid between wheat and rye, took

7 Species Duration of meiosis in hours Chromosome number Ploidy level

Triticale (Wheat x Rye ) hybride 20.8 56 Octoploid

Antirrhinum majus 24.0 16 Diploid

Triticum aestivum (Wheat) 24.0 42 Hexaploid

Haplopappus gracilis 36.0 4 Diploid

Rhoeo discolor 48.0 12 Diploid

Secale cereal (Rye) 51.2 14 Diploid

Allium cepa (Onion) 96.0 16 Diploid

Ornithogalum virens 72.0 6 Diploid

Tradescantia paludosa 126.0 12 Diploid

Tulbaghia violacea 130.0 12 Diploid

Tradescantia reflexa 144.0 24 Tetraploid

Lilium candidum (Madonna Lily) 168.0 24 Diploid

Lilium henryi (Tiger Lily) 170.0 24 Diploid

Lilium longiflorum (November Lily) 192.0 24 Diploid

Trillium erectum* 274.0 10 Diploid

Agapanthus umbellatus 300.0 30 Diploid

Table 1-1: Chromosome number, ploidy level and duration of meiosis in 16 plant species * The duration of meiosis was measured at 20 °C; meiosis normally occurs in T.erectum at much lower temperatures and then has duration of several weeks (Sparrow, 1955). Adapted from (Bennett, 1971). The measurement technique for meiosis duration is different from one species to another (for more information see Wilson 1959, Ernst 1938, Bennett 1971). less time than Lily (diploid). The same result has been observed for the chromosome number: Trillium erectum, a diploid plant with 10 chromosomes, has longer meiosis duration in comparison to wheat, a hexaploid with 24 chromosomes. At that time, they undertook a time- course of the different stages of meiosis. However, they didn’t calculate the duration of the five stages of prophase I separately. They only compared leptotene, zygotene and pachytene in prophase I and metaphase I to telophase II. These measurements showed a non- homogenous distribution of time through meiosis during which leptotene to pachytene covered high proportion of meiosis duration (see table 1-2). Additionally, meiosis and mitosis times were compared. According to data shown in table 1-3, the meiosis cell cycle is longer than mitotic cell cycle from two-fold to nine-fold in Haplopappus gracilis and Trillium erectum, respectively. In addition to meiosis duration, many events occur at prophase I such as chromatin compaction around the axes, association between homologous chromosomes (establishment of synaptonemal complex) and homologous recombination (Zickler and Kleckner, 1998). Thanks to advances in microscopy and preparation methods of the chromosomes, prophase of meiosis could be subdivided into five stages depending on the state of the chromatin and structure of chromosomal axes. These steps have been called, in chronological order, leptotene, zygotene, pachytene, diplotene and diakinesis (Zickler, 1999).

1.2.1 Leptotene

Leptotene comes from the Greek “leptos” for “thin” and “taenia” for “band”. The meiotic chromosomes begin to condense and the protein components, called axial, establish along chromosomes (figure 1-3). Two sister chromatids from each chromosome are characterized by a series of DNA loops associated with common Axial Elements (AE). It is also at this stage that the primary events of homologous recombination take place (formation of DNA double strand breaks (DSB) and the primary steps of DSB repair). Furthermore, at leptotene stage, it has been observed, in some species, homologous chromosome alignment (or “pairing”) with a distance longer than 300 nm (Zickler, 1999). In many species, chromosomes are attached to nuclear membrane by telomeric ends. This configuration, known as bouquet, forms before the association between homologs and seems to facilitate their alignments. Unlike many species, Arabidopsis does not form a telomeric bouquet. However, telomeres tend to cluster at the nucleolus (Armstrong et al., 2001). Constituent proteins of AE [the Hop1 protein excluded] are highly divergent between different organisms. Several

8 Species Duration Leptotene Zygotene Pachytene Metaphase I

of meiosis to

in hours Telophase II Triticale (Wheat x Rye ) hybride 20.8 7.5 3.0 2.3 6.5

Antirrhinum majus 24.0 6.0 3.0 3.0 9.0

Triticum aestivum (Wheat) 24.0 10.4 3.4 2.2 8.0

Secale cereal (Rye) 51.2 20.0 11.4 8.0 10.8

Tradescantia paludosa 126.0 48.0 24.0 24.0 13.0

Lilium candidum (Madonna Lily) 168.0 40.0 40.0 40.0 24.0

Lilium longiflorum (November Lily) 192.0 40.0 36.0 40.0 48.0

Trillium erectum 274.0 70.0 70.0 50.0 64.0

Table 1-2: The duration of meiosis, of prophase I (only leptotene, zygotene, pachytene) Metaphase I to Telophase II in 8 plant species. Adapted from (Bennett, 1971).

Species Mitotic cycle time in hours Duration of meiosis in hours Haplopappus gracilis 11.9 24.0

Trillium erectum 29.0 274.0

Tradescantia paludosa 20.0 126.0

Secale cereal (Rye) 12.75 51.2

Allium cepa (Onion) 17.40 96.0

Lilium longiflorum (November Lily) 24.0 192.0

Triticum aestivum (Wheat) 10.5 24.0

Table 1-3: Mitotic cycle time and duration of meiosis in 7 plant species . Adapted from (Bennett, 1971). homologues of Hop1 have been identified in higher eukaryote such as HIM-3 in C.elegans, ASY1 in Arabidopsis and PAIR2 in rice (Zetka et al., 1999; Caryl et al., 2000; Nonomura et al., 2004). All of them are localized on chromosome axes (Zetka et al., 1999; Armstrong et al., 2002; Nonomura et al., 2004). A human homologue of Hop1, known as CT46/HORMAD1, has been also identified (Chen et al., 2005). Many data show that cohesion is very important for the assembly of AE. Immunolocalization experiments showed that cohesins are associated with AE (Eijpe et al., 2003; Kitajima et al., 2003; Prieto et al., 2004; Chelysheva et al., 2005; Krasikova et al., 2005; Valdeolmillos et al., 2007).

1.2.2 Zygotene

The word Zygotene comes from the Greek: “zygos” (yoke). At this stage, homologous chromosomes begin to align closely at a distance of 100 nm where both AE are connected by a Central Element (CE) (this is called synapsis). Thus, the Synaptonemal Complex (SC) is made of two AE (then renamed lateral elements) connected by a CE (figure 1-4). This triple structure is easily recognizable by electron microscopy (figure 1-4C). SYCP1 in mammals, Zip1 in S.cerevisiae, C(3)G in Drosophila, Syp-1 and Syp-2 in C.elegans and ZYP1a and ZYP1b in Arabidopsis are all components of transverse filaments of SC. They are used as synapsis markers (Meuwissen et al., 1992; Sym et al., 1993; Page and Hawley, 2001; MacQueen et al., 2002; Colaiacovo et al., 2003; Higgins et al., 2005). Although their amino acid sequence is poorly conserved, it appears that these proteins have a common structure and function: (a) they are exclusively localized at CE when the homologous chromosome synapsis occurs (and not at non-synapsed region), (b) their absence leads to problems of SC formation (Sym et al., 1993; Tarsounas et al., 1997; Page and Hawley, 2001; MacQueen et al., 2002; Higgins et al., 2005), (c) these proteins are structurally similar as they are characterized by two globular domains at the N and C-terminal connected by a large "coiled-coil" area (Meuwissen et al., 1992; Sym et al., 1993) (figure 1-4). (For complementary information about Zip1, see ZMM proteins in 1.3). The yeast S. pombe is part of a group of species that do not develop SC. However, it forms a protein body called linear element (LinEs). This closely resembles axial elements. Three proteins have been shown to be associated with LinEs and correspond to the homologous proteins from S. cerevisiae. First, Rec10 protein shows a low sequence similarity and with the axial element protein Red1 (Rockmill and Roeder, 1988; Smith and Roeder,

9 Leptotene Zygotene Pachytene Diplotene Diakinesis

A

B

Parental homologous chromosomes C The central element Components of the SC

Cohesin

LN The axial element components EN of the SC

Elements of recombination D machinery Crossing-over Zygotene stage Pachytene stage Figure 1-3: Schematic representation of cytological progression of meiotic prophase I. From de Muyt, 2008, thesis (A) At leptotene, the chromosomes are condensed around an axis called the axial element (AE). Homologous chromosomes begin to associate closely in zygotene. The formation of synaptonemal complex (SC) which involves the AE of each homologues (renamed lateral elements) and a central element can be visualized using antibodies directed against ASY1 and ZYP1 mark respectively the axes of the chromosomes and the central element of the SC. At pachytene SC polymerisation is completed. SC is depolymerized in diplotene, where the chiasmata can be visualized. In diakinesis, the chromosomes are condensed. (B) enlargement of bordered area in red. It shows in detail the axes, the SC and CO formation. (C) Immunolocalization of ASY1 and ZYP1 on the Arabidopsis male meiocytes. (D) recombination nodules on tomato chromosome spreads observed by electron microscopy (from Anderson and Stack, 2005). The early nodules (ENs) make their first appearance in leptotene. In pachytene, ENs have disappeared and late nodules (LNs) remain in a limited number (arrow). 1997). Rec10 appears to be the major component of LinEs as its absence leads to the disappearance of LinEs (Molnar et al., 2003). Like Rec10, the Mek1 and Hop1 homologous of S. pombe are located at the LinEs (Lorenz et al., 2004). However the formation of LinEs does not seem to be affected in their absence (reviewed by (Loidl, 2006)).

Early Recombination Nodules Recombination nodules (RN) have been described for the first time during the meiotic prophase in Drosophila females as dense ovoid structures localized at CE of SC (Carpenter, 1975). Subsequently, their existence has been confirmed in other species, like Sordaria macrospora (Zickler, 1977), human (Rasmussen and Holm, 1984), tomato (Stack, 1982) and maize (Anderson et al., 2003). Following studies have identified two types of recombination nodules that differ in size, shape, number, appearance and protein composition: early nodules (EN) and late nodules (LN) (reviewed by (Zickler and Kleckner, 1999)) (For LN see 1.2.3). ENs are present in large numbers (several hundreds in some organisms) and are distributed homogenously along chromosomes (Anderson et al., 2001; Stack and Anderson, 2002) (figure 1-3D). Immunolocalization experiments have revealed an association between ENs and several proteins involved in homologous recombination such as RAD51, DMC1 (Anderson et al., 1997; Moens et al., 2002; Moens et al., 2007) See 1.3)

I.2.3 Pachytene Stage

At pachytene (from Greek, "pachus" for thick) homologous chromosomes synapse along their axes length. They also get more compact and therefore become shorter and thicker than previous stages. It is during the pachytene stage that the maturation of recombination intermediate completes to generate crossovers (CO). In most organisms, the pachytene is the longest stage of prophase I. In Arabidopsis, zygotene and pachytene stages are half of the time of meiotic prophase I (about 15 hours of a total 33 hours) (figure 1-2) (Armstrong et al., 2003).

Late Recombination Nodules Many studies of different organisms suggest that LNs correspond to CO sites (i) the number of LNs is consistent with the number of reciprocal exchanges between homologs, predicted by the size or genetic maps based on the number of chiasmata (corresponding to cytological visualization of CO) (Sherman and Stack, 1995; Zickler and Kleckner, 1999;

10 A Zip1 molecule: A Zip1 dimer :

Globular N N C Globular C A domain domain N C

«coiled coil » domain 84 nm Transverse filaments

N N N N

B

N N N C N C

Sister chromatides Central Lateral Lateral element elements elements

100nm

Lateral elements Central element C

Transverse filaments

Figure 1-4: The central element of synaptonemal complex. From de Muyt, 2008, thesis (A) Structural model of the ZIP1 protein in the SC (According to Tung and Roeder 1998; Sym et al. 1993). (B) Schematic structure of the SC. The central element is composed of transverse filament proteins (green). The lateral elements are composed of cohesins and axial protein components (in red). (C) Longitudinal section of the SC of the beetle Blap cribrosa visualized by electron microscopy on spread pachytene chromosomes (Schmekel et al. 1993). Anderson et al., 2003). (ii) The LNs, similar to CO, are very rare in heterochromatin regions. (iii) The distribution of LNs, comparable to CO, is not random. It was shown that each bivalent possess at least one LN which is consistent with the notion of the obligatory crossover per bivalent (see below). (iv) Finally, most LNs contain the protein MLH1, a protein required for CO maturation (Moens et al., 2002; Lhuissier et al., 2007).

I.2.4 Diplotene and Diakinesis Stages

Diplotene stage ("diploos" for double) is characterized by temporary chromosomes decompaction, progressive loss of components of SC and RN. Thus, homologous chromosomes separate from each other, with exception to specific sites: chiasmata (figure 1- 3A). This connection between homologs, stabilized by cohesion between sister chromatids, remains until metaphase I. During diakinesis stage ("dia" for through and "kinesis" for movement) chromosomes condense again to reach their maximum level of compaction. It is also at this stage that the chromosomes are disconnected from nuclear envelope.

1.2. 5 Metaphase, anaphase and telophase 1

At Metaphase I, bivalents co-orientate on the spindle fibers, and each bivalent contains at least one chiasmata. The fibers connect to kinetochores at centromeres. Anaphase I accomplishes the separation of homologous chromosomes, but sister chromatids remain associated by centromere regions. Homologs migrate to opposite poles. Telophase I contains two polar groups of chromosomes that partially decondense without achieving the full interphase condition.

1.2. 6 Metaphase, anaphase and telophase 2

Metaphase II and anaphase II leads to separation of sister chromatids. Late anaphase II cells contain the expected four groups of chromatids within a common cytoplasm, which gradually re-form into tetrads of haploid interphase nuclei. This aspect of meiosis will not be presented in more detail.

11 Figure 1-5: The double Holliday junction (dHJ) model of Szostak, Orr-Weaver, Rothstein and Stahl (1983). A DSB is enlarged into a double-stranded gapped region, which is subsequently resected to have 3’-ended single-strand tails that could engage in strand invasion. The first strand invasion would produce a D-loop to which the end of the second resected end could anneal. The initial structure has one complete and one half HJ, but branch migration of the half HJ allows the formation of a second complete HJ. New DNA synthesis completes the formation of a fully ligated structure that can be resolved into crossovers if the two HJs are cleaved in different orientations. Depending on the position of the cleavage sites, the structure is determined either by two crossovers, or a non-crossover (From Haber 2008). 1.3 Molecular description of meiotic recombination

Our knowledge on molecular mechanisms of meiotic recombination was deduced from genetic data obtained from fungi such as Neurospora crassa, Ascobolus immersus, S. pombe, and S. cerevisiae. Then the progress of molecular biology and biochemistry essentially in S. cerevisiae and S. pombe allowed the identification of molecular actors involved at each stage. Three models of homologous recombination have marked the progress of research in this area. In 1922, the analysis on chromosomes irradiated by X-ray allowed Muller to introduce the idea that chromosome breaks are the cause of some chromosomal rearrangements ((Muller, 1922; reviewed by (Haber, 2007)). In 1964 Robin Holliday proposed a model now known as the “Holliday Junction” (Holliday 1964). Subsequently, several models suggest that homologous recombination was initiated by single-strand breaks (SSB) (reviewed by (Haber, 2007)). It was only in 1976 that Resnik proposed a model based on the formation of double- strand breaks (DSB) and the degradation of 5'-3' ends of single-stranded ((Resnick, 1976); reviewed by (Haber, 2007)). But it is really in 1983 that a consensus model of "DSB repair" has been proposed by Jack Szostack. The latter, called double-strand breaks repair (DSBR), served as a model until early 2000 (figure 1-5; Szostak et al., 1983). Then an alternative model was suggested (Allers and Lichten, 2001b; Hunter and Kleckner, 2001). The study of recombination mechanism at the molecular level is limited in higher eukaryotes such as plants and animals. This is due to several causes: First, it is often difficult to have a large number of synchronized meiocytes. This complicates the physical analysis of recombination intermediates and biochemical analysis of the proteins involved. Second, many genes involved in meiotic recombination are also required for somatic recombination, and their inactivation can have pleiotropic effects, sometimes lethal. However model species such as mouse, C. elegans and A. thaliana have already contributed very significantly to the advancement of knowledge on the mechanisms of meiotic recombination; their characteristics allow studying specific aspects of meiotic recombination. I will first detail the actual knowledge of the molecular mechanisms of meiotic recombination and then I will summarize the state of art in Arabidopsis thaliana.

12 1.3.1 Double strand breaks (DSB) formation

In 1909, Janssens proposed that meiotic recombination could result from the break followed by joining homologous chromosomes (Janssens, 1909). The requirement to chromosome breakage to initiate the recombination process has rapidly emerged, but the formal evidence of their existence have been made in 1989 (Game et al., 1989; Sun et al., 1989). The involvement of the Spo11 protein in the formation of the programmed DNA double-strand breaks (DSBs), during prophase of meiosis (leptotene stage) was demonstrated soon after (Cao et al., 1990; Bergerat et al., 1997; Keeney et al., 1997). Spo11 generates a protein-DNA intermediate that it is covalently bound to the DNA 5' ends by a phospho- tyrosine bond. Then, Spo11 is removed by endonucleolytic cleavage generating short Spo11- bound oligonucleotides. The DNA 5' ends are then degraded to produce 3' single stranded on each side of the break (Bergerat et al., 1997; Keeney et al., 1997). The mechanism of DSB formation appears to be highly conserved during since homologs of Spo11 were found both in protists and in multicellular eukaryotes (Ramesh et al., 2005; Malik et al., 2007). In most species, Spo11 is encoded by a single gene and its inactivation by mutation leads to a reduction in meiotic recombination and thus to reduced fertility. This is the case in S. cerevisiae (Klapholz et al., 1985), S. pombe (rec12) (Lin and Smith, 1994), T. thermophila (Mochizuki et al., 2008), C. cinereus (Celerin et al., 2000), D. melanogaster (mei-W68) (McKim et al., 1998; McKim and Hayashi-Hagihara, 1998), C. elegans (Dernburg et al., 1998) and Mus musculus (Baudat et al., 2000; Romanienko and Camerini-Otero, 2000).It has been proven later that this role is conserved in all eukaryotic species reproducing sexually, including A. thaliana (see 1.3.6; reviewed in (Keeney, 2008)). Spo11, a protein of 45 kDa that shares homology with the A-subunit (i.e. Top6A) of topoisomerase VI in the archaeon, Sulfolobus shibatae (Bergerat et al., 1997), is actually responsible for the catalysis of the DSB, but does not act alone. Spo11-dependent DSB formation in S. cerevisiae requires the presence of at least nine other proteins among which Mei4, Mer2, Rec102, Rec104 and Rec114 are meiosis-specific. The four other proteins, Ski8, Mre11, Rad50 and Xrs2, also have roles in vegetative cells. Indeed, Ski8 functions in metabolism of RNAs in mitotic cells (Masison et al., 1995; Araki et al., 2001), and the highly conserved MRX complex (Mre11–Rad50–Xrs2) has multiple functions in DSB repair, telomere maintenance and DNA damage checkpoint activation in both mitotic and meiotic cells (Borde, 2007; Keeney, 2008).

13 Figure 1-6: Model for strand invasion by Rad51 and mediator proteins, Rad52, Rad54 & Rad55/ Rad57 (only one side of the DSB is shown). 1) 3 ssDNA tails are produced by the MRX complex and/ or a 5’ to 3’ resection exonuclease. 2) The ssDNA tails are coated by RPA to eliminate secondary structures. 3) Rad52 recruits Rad51 to the RPA-ssDNA complex. 4) The Rad51 nucleoprotein filament extends on the ssDNA mediated by Rad55/Rad57 and RPA is displaced. 5) The Rad51 nucleoprotein filament locates a homologous DNA donor sequence. 6) Rad54 interacts with Rad51, promotes chromatin remodeling, DNA unwinding and strand annealing between donor DNA and the incoming Rad51 nucleoprotein filament (From Krogh et Symington 2004). In S. cerevisiae, it is possible to target the activity of Spo11 to a given DNA region, for example, fusing to the Gal4 DNA binding domain (Pecina et al., 2002; Robine et al., 2007). Recently, the role of histone modifications in the control of DSBs site distribution has been suggested in S. cerevisiae (Borde et al., 2009), and especially demonstrated in mammals (Baudat et al., 2010; Myers et al., 2010; Parvanov et al., 2010). This will be discussed in more details in other parts of this manuscript (1.6.2.3).

1.3.2 Maturation of DSBs

After breakage, the Spo11 protein is covalently attached to each of the DSB 5' ends via a phosphodiester bond. In S. cerevisiae, it is released following an endonucleolytic cleavage of DNA, presumably mediated by Mre11 (in the context of the MRX complex formed with the Rad50 and Xrs2) in collaboration with Sae2 (Keeney, 2008; Mimitou and Symington, 2009), leaving short 3′ single-stranded DNA tails (figure 1-6). Then, these single-stranded ends are resected by a mechanism not yet fully elucidated, involving protein Exo1, Sgs1 and Dna2 (Mimitou and Symington, 2008; Manfrini et al., 2010). In S. cerevisiae the tracts length of resection varies between 1 and 2 kb (Borts and Haber, 1989). It seems that limitation depends on sister chromatids cohesion (Klein et al., 1999). The RPA protein binds to the single strands DNA produced by resection and cover them for eliminating secondary structures (Ehmsen and Heyer, 2008; Heyer et al., 2010). Then mediator proteins allow RPA to be replaced by recombinases: Rad51 (expressed in both mitosis and meiosis) and Dmc1 (expressed specifically in meiosis). These two proteins are assembled like a helix around each single strand ends, forming nucleoprotein filaments or nucleofilament. The mediator proteins are well known in S. cerevisiae. For Rad51, Rad52 (Sung, 1997b), Rad55 and Rad57 proteins (Sung, 1997a) act as a mediator. Rad55 and Rad57 are paralogs of Rad51 (figure 1-6). The Mei5 and Sae3 proteins are specific mediators of Dmc1 (Hayase et al., 2004). In animals and plants, the situation is relatively less clear. The Rad52 protein is not required for meiotic recombination in mice (Bannister and Schimenti, 2004). In A. thaliana, two homologs of Rad52 have been recently identified (AtRAD52-1 and AtRAD52-2). Mutation in each of them lead to fertility reduction but their role in meiotic recombination has not yet been documented (Samach et al., 2011). In contrast, the BRCA2 protein could substitute for Rad52 in those two organisms (Sharan et al., 2004; Siaud et al., 2004; Dray et al., 2006; Thorslund et al., 2007). Furthermore, in mammals, there are five paralogs of RAD51 which form two distinct complexes RAD51B/ RAD51C/ RAD51D-XRCC2 and RAD51C-XRCC3 (Liu et al., 2007).

14 Their meiotic function is not documented in mice. In A. thaliana, the RAD51C-XRCC3 complex is necessary for meiotic recombination (Bleuyard and White, 2004; Bleuyard et al., 2005).

1.3.3 Strand invasion

The Rad51 and Dmc1 nucleofilaments have the ability to associate with a double- stranded DNA by interactions called “paranemic”. These interactions allow them to "detect" and "search" a complementary sequence with which they can form an intermediate (relatively) stable three-stranded molecule. Paranemic joints are then converted into plectonemic joints between the invasive strand and its complementary strand. The exchange of homologue strands initiates from the 3' end of the invasive strand (San Filippo et al., 2008). The molecular structure resulting from invasive strand is called displacement loop (D-loop) (Hunter and Kleckner, 2001). In S. cerevisiae, the process of strand invasion is facilitated by Rad54 (figure 1-6), Rdh54/Tid1 (Shinohara et al., 1997; Schmuckli-Maurer and Heyer, 2000; Shinohara et al., 2000) and Mer3 (Mazina et al., 2004). One or more homologs of Rad54 are present in vertebrates (Bannister and Schimenti, 2004; Wesoly et al., 2006) and plants (Osakabe et al., 2006; Klutstein et al., 2008), although playing a role in somatic homologous recombination. One Rad54 homolog is required to remove inter-chromosomal telomeric connections during meiotic recombination in Arabidopsis (Higgins et al., 2011). The Mer3 helicase is necessary, both in A. thaliana (Mercier et al., 2005), and rice (Wang et al., 2009), as in S. cerevisiae (Nakagawa and Kolodner, 2002), for most of CO formation. In addition, in all species containing the Dmc1 gene, the complex of the Hop2 and Mnd1 proteins is also involved in meiotic homologous recombination. In S. cerevisiae and mice, the Hop2-Mnd1 complex has in vitro a mediator activity of strand invasion by Rad51 nucleofilament and / or Dmc1 (Chen et al., 2004; Chi et al., 2007; Pezza et al., 2007). Furthermore, the Hop2-Mnd1 complex seems to be involved in the control of the strand invasion specificity in vivo. Indeed, inactivation of Hop2 and/or Mnd1 genes leads to the formation of non-homologous interactions in S. cerevisiae (Leu et al., 1998; Henry et al., 2006), mice (Petukhova et al., 2005), and probably A. thaliana (Vignard et al., 2007). While the Rad54, Rdh54, Mer3, Hop2 and Mnd1 have the nucleofilament stabilizing function and / or extend the duplex between the invasive strand and its complement, other proteins have exactly a reverse function: the helicases Mph1 (Prakash et al., 2009) and Srs2 (Dupaigne et al., 2008) of S. cerevisiae, BLM (Bachrati et al., 2006) and

15 RTEL-1 (Barber et al., 2008) of mammals, disassemble D-loop in vitro. Meiotic function of these proteins is not yet well understood.

1.3.4 Inter-homologue bias

During somatic homologous recombination, Rad51 is the only recombinase-forming nucleofilaments. The Dmc1 protein, which is encoded by a paralogue gene presented in most species (except D. melanogaster and C. elegans), is involved only during meiosis. On a functional level, these recombinases are not redundant and that seems to be related to a fundamental difference between somatic and meiotic cells. In the first case, the repair usually involves the sister chromatids, while in the second case the homologous chromatid is the preferred partner in most species (Haber et al., 1984; Game et al., 1989; Schwacha and Kleckner, 1994). This phenomenon, referred to as “inter-homologue bias” is likely to be shared by all eukaryotes, with the possible exception of S. pombe. In this organism, meiotic recombination involves two sister chromatids probably more frequently than two homologous non sister chromatids (Cromie et al., 2006). In S. cerevisiae, Rad51 and Dmc1 are both involved in the repair of DSBs. But since these two proteins are hardly distinguishable in vitro, both in terms of biochemical activity (strand invasion) and on the structure of the nucleofilament (Sheridan et al., 2008), it seems reasonable to assume that their partners allow them to differentiate their functions in vivo. In fact, it was shown that Mek1 and Hed1 proteins inhibit the stimulation of the Rad51 activity (invasion strand) by Rad54 (Busygina et al., 2008; Niu et al., 2009). Without these two proteins, Dmc1 is not necessary for the DSB repair: Rad51 alone allows recombination with homologous chromatids (Niu et al., 2009), although it is possible that the homologue / sister bias is reduced. However, the Dmc1 alone appears to invade only on the sister chromatids, not on the homologous chromatids (Schwacha and Kleckner, 1997). Moreover, in at least one strain of S. cerevisiae (BR2495) the absence of Dmc1 only leads to reduction and not the virtual disappearance of recombination related to homologous chromatids (Rockmill et al., 1995). In summary, Rad51 is necessary for recombination between homologue chromatids, but is normally not sufficient, while Dmc1 is sufficient for recombination between sister chromatids, but is not necessary. In S. pombe, as in the BR2495 strain of S. cerevisiae, Dmc1 plays relatively an accessory role. In its absence, inter-homologue recombination is reduced from 60 to 80%, but all the DSBs are repaired and the viability of spores is almost normal (Fukushima et al., 2000;

16 Grishchuk and Kohli, 2003; Grishchuk et al., 2004; Young et al., 2004). Dmc1 appears to simply facilitate recombination toward the homologous chromatids. In contrast, the Rhp51, homologue of Rad51, plays a crucial role in meiotic DSB repair by involving both sister and homologous chromatids (Grishchuk and Kohli, 2003). In mice, RAD51 is necessary for embryo survival (Lim and Hasty, 1996), and cells lacking it, exhibit chromosomal fragmentation leading to apoptosis (Sonoda et al., 1998). Its meiotic function is unknown but it is probably essential for programmed DSB repair. In A. thaliana, the absence of RAD51 does not lead to vegetative developmental defects, but has a very strong meiotic defect: the homologues do not synapse and appear to be fragmented at diakinesis stage (Li et al., 2004). Both in mice (Pittman et al., 1998; Yoshida et al., 1998) and A. thaliana (Couteau et al., 1999), the DMC1 protein seems necessary for recombination between homologous chromatids but not between sister chromatids, where mutants individuals exhibit a defect in synapsis of homologous, but no detectable chromosomal fragmentation. The mechanism of differentiation of meiotic functions of the RAD51 and DMC1 in animals and plants is still unknown.

1.3.5 Homologous Recombination Pathways

The step of strand invasion is a crossroad of the homologous recombination process. Positive or negative regulators of the D-loop extension determine the evolution of recombination intermediates. In the classical model of DSBR, the D-loop is stabilized, and the displaced strand probably binds RPA and Rad52, which allows then to hybridize to the second single-strand end (Nimonkar et al., 2009). This stage of second end "capture" guides the intermediates to the CO formation. Again according to the classical model of the DSBR, there was then a single pathway of CO formation. However, it is now accepted that there are at least two pathways, which differ both in nature of the intermediates and the proteins involved (figure 1-7).

1.3.5.1 Type I Crossover In the first pathway, the continuity of strands is restored by DNA synthesis and ligation. This process leads to a stable structure with two junctions called Double Holliday Junctions (DHJ). According to the DSBR model of Szostak, DHJ resolution can give rise to both NCO and CO, depending on the position of the endonucleolytic cuts at the HJ level. In

17 Figure 1-7: General model of homologous recombination pathways in meiosis. From Nicolas MACAISNE, thesis, 2010. The double-strand break (a) is resected to release 3’ ends on both sides of the break (b). One of the 3’ ends will be taken over by specialized proteins to invade the homologous chromatid leading to the formation of a D-loop (c). This structure is common to all meiotic HR pathways described to date. The Double Holliday Junction (DHJ) (d-f) processing can lead the formation of a type I CO (g) or NCO (h, i, j). The repair pathway by SDSA (k-m), which does not depend on the DHJ produces only NCO (m). The pathway dependent on the MUS81-EME1complex (n-v) will generate a type II CO (p-r), or the resolution of a single Holliday junction that can lead also to a NCO (s, u). Genes known to be involved in these pathways in Arabidopsis are indicated. The involvement of genes shown in gray is to date still putative. fact, the DHJ generates mainly CO and a few NCO (Allers and Lichten, 2001a). It seems however that this could happen through a process of dissolution of DHJ involving the protein complex Sgs1-Top3-Rmi1 in S. cerevisiae (Ira et al., 2003), or its homologs in higher eukaryotes: BLM-TOP3alpha in animals (Plank et al., 2006; Raynard et al., 2008) and RECQ4A -TOP3α-BLAP75 in plants (Hartung et al., 2007; Chelysheva et al., 2008; Hartung et al., 2008). In S. cerevisiae, genes Zip1, Zip2, Zip3, Zip4, Spo16, Msh4, Msh5 and Mer3 (collectively called ZMM) are all involved in the differentiation of type I crossing-over (Lynn et al., 2007). Indeed, the respective mutant phenotypes are comparable in the first approach (failure to form synaptonemal complex, reduction of CO, and residual non-interfering CO). However, differences can be observed; this suggests that the ZMM proteins, in fact, do not form a homogeneous group (Borner et al., 2004; Shinohara et al., 2008). Moreover, the coordinated control of the initiation of synapsis and differentiation of type I CO is observed specifically in S. cerevisiae. In contrast, in A. thaliana none of ZIP4, MSH4, MSH5, MER3 or SHOC1 (which is a homologue of Zip2 (Macaisne et al., 2008)) are necessary for synapsis. The Msh4 and Msh5 proteins form heterodimers that bind, in vitro, the branched DNA structures similar to the invasion strand intermediates and Holliday junctions (Snowden et al., 2004; Snowden et al., 2008). In vivo, Msh4-Msh5 heterodimers are presumed to stabilize recombination intermediates (D-loop and then DHJ), forming a kind of molecular clamps surrounding the chromatids involved in homologous recombination (Hoffmann and Borts, 2004; Ehmsen and Heyer, 2008). The Msh4-Msh5 heterocomplex could furthermore restricts the access to the dissolution complex (Sgs1-Top3-Rmi1), and in contrast recruit proteins required for subsequent maturation and/or resolution of DHJ (Hoffmann and Borts, 2004). The Mlh1 and Mlh3 proteins are not part of the ZMM, to the extent that, in S. cerevisiae, they are not necessary for synapsis. However their absence also causes a reduction of type I CO, but less than the one seen in zmm mutants of S. cerevisiae (Wang et al., 1999) and A. thaliana (Jackson et al., 2006). In this context, the underlying mechanism is not documented, but it is assumed that, similar to Msh4-Msh5, Mlh1-Mlh3 heterodimers stabilize recombination intermediates locally at later stage of their maturation (Argueso et al., 2003); (Hoffmann and Borts, 2004). It is quite likely that they may play a role to resolve DHJ for CO (Hoffmann and Borts, 2004; Franklin et al., 2006; Svetlanov et al., 2008). However in S. cerevisiae, the observation of post-meiotic segregation patterns in mlh1Δ mutant (Wang et al., 1999; Argueso et al., 2003) provides evidence that Mlh1 additionally has a mismatch repair function in the heteroduplex formed during the process of homologous recombination (Surtees et al., 2004). It seems that Mlh1 is a kind of pivot molecule to recruit not only

18 specific proteins involved in mismatch repair, but also proteins such as Exo1 and Sgs1, and thus may be part of a coordination between correction and resolution of recombination intermediates (figure 1-7).

1.3.5.2 Type II Crossover The second pathway of CO formation involves the proteins Mus81 and Mms4 (S. cerevisiae). Mus81 is an XPF endonuclease that would make breaks on the two strands of the intact chromatids, each end being associated with a different end of the invasive chromatid; the structure would be resolved in CO by DNA synthesis followed by ligation (Gaskell et al., 2007). According to an alternative model, the strand displaced by the invasive end is cleaved by Mus81 and strand hybridizes to the other end, leading to formation of a single Holliday junction, after DNA synthesis and ligation (Cromie et al., 2006) (figure 1-7). The two CO pathways can be distinguished by the nature of the recombination intermediates (DHJ versus SHJ), the identity of the proteins involved, and especially by their susceptibility to the interference phenomenon. The type II COs are formed independent of each other, but not the type I COs, which tend to be more spaced than expected simply by chance. In most eukaryotes, the first pathway determines the majority of COs: about 90% in mice (Guillon et al., 2005), 85% in A. thaliana (Franklin et al., 2006), 70% in tomato (Lhuissier et al., 2007) and S. cerevisiae (Argueso et al., 2004). The second pathway generates, in contrast, all COs in S. pombe: about 45 crossovers among only three chromosome pairs guarantee that each chromosome pair will be connected by a chiasma (Munz, 1994; Cromie et al., 2006).

1.3.5.3 Non Crossover (Gene Conversion non associated with CO) It is clear now that most of the ends are not involved in a stable manner in the process of strand invasion leading to the formation of Type I or Type II CO. DNA synthesis from the 3' end of the invasive strand allows, after dissolution of the nascent loop, the rehybridization of complementary strands beyond the break. The continuity of each strand, disrupted by the resection tracts, is reestablished by DNA synthesis. This mechanism of synthesis-dependent strand-annealing (SDSA) does not generate CO and its products are generally named non- crossovers (NCO) (Allers and Lichten, 2001a; Borner et al., 2004; Hunter, 2007; McMahill et al., 2007) (figure 1-7). This model of repair would not involve stable recombination intermediates. The major difference with the model involving DHJ repair is that the broken

19 strand quickly leaves the homologous strand after a repair synthesis avoiding induction of the nascent single-end invasion (SEI) stabilization (Paques and Haber, 1999). RTEL-1 in C. elegans (and its homologue RTEL1 in human) is the only protein known to date to promote the formation of NCO by SDSA. Cytological and genetic studies performed in nematode show that CO rate is severely increased in rtel-1 mutant compared to wild type: while in this species CO formation is strongly regulated by a single bivalent, it was noted the appearance of bivalent with two to three COs in rtel -1. This phenotype is not due to an increase in DSBs number. The biochemical analyses suggest that the activity of RTEL-1 would disassociate the D-loop to promote the NCO formation at the expense of CO formation (Barber et al., 2008; Youds et al., 2010). Since no gene has been described as influencing the formation of NCO without being involved in the regulation of the CO formation, it is possible to think that the NCO can be a product of the CO pathways. zip1, msh4, msh5, mlh1 and mer3 mutants of S. cerevisiae, which form more DHJ, show a drastic reduction in CO rate, but a wild-type level of NCO (Allers and Lichten, 2001a; Hollingsworth et al., 1995; Hunter and Borts, 1997; Nakagawa and Ogawa, 1999; Ross-Macdonald and Roeder, 1994; Storlazzi et al., 1996; Sym and Roeder, 1994), indicating that most of NCOs are not depend on the ZMM in yeast. In addition, in the ndt80 mutant of yeast, with a meiosis arrest at the pachytene stage, the stable recombination intermediates (SEI and DHJ) are accumulated. Nevertheless in the same mutant, the CO number is severely reduced, while the NCO number is not significantly different from wild type (Allers and Lichten, 2001a). This suggests that (i) the COs are mainly the product of the resolution of a joined molecule and that (ii) the NCOs appear principally via a mechanism of formation that does not require the formation of a stable intermediate. In all studies, NCOs are probably underestimated because of the difficulty in their detection and the limited number of polymorphisms available. In S. cerevisiae, genome wide studies have obtained 90.5 COs for 66.1 NCOs per meiosis for an estimated of 140-170 DSBs per meiosis (Mancera et al., 2008). It is not the case in other organisms as diverse as mice, humans, A. thaliana or Allium sativa, where DSB repair sites estimated by cytological methods (immunocytology using antibodies against the DSB repair protein RAD51 and/or DMC1, or early nodule visualized by electron microscopy) are in excess of 10 to 40 times compared to COs (estimated by chiasmata count, late nodules visualized by electron microscopy, or by genetic maps) (Baudat and de Massy, 2007a; de Muyt et al., 2009b). It is not known yet if all these DSBs in excess are repaired as NCOs but if so their number could be in large excess compared to COs.

20 In mammals, the immunolocalization of MLH1 protein, which co-localized with type I COs (Lynn et al., 2004), shows that the number of foci associated to this protein is significantly less than the number of MSH4 foci, which is, however, a ZMM protein essential for CO formation. In contrast, in S. cerevisiae, the number of Msh4 foci is correlated to the CO number. This suggests that a part of MSH4-dependent recombination events does not result in the appearance of CO in mammals. The detection of MER3 protein also shows, as well as in fungi and plants, the number of foci on the chromatin is more important than the number of CO (Storlazzi et al., 2010; Yu et al., 2010). Similarly for Zip2, Zip3 and Zip4 proteins, whose appearance is earlier than Msh4 in S. cerevisiae (Shinohara et al., 2008). If there are more ZMM foci than CO, this may be that they are involved in the formation of other products of the homologous recombination, as NCOs. We cannot exclude that at least part of the formation of NCO can be regulated by certain ZMM (Baudat and de Massy, 2007a). Recently Martini and her colleagues revisited the mechanisms of meiotic DSB repair in S. cerevisiae (Martini et al., 2011). Their genome wide study performed in S. cerevisiae mutated in the Msh2 gene, suggest a more complex interplay between sister and non-sister chromatids, and HJ migration in the resolution of recombination intermediates. Their results are compatible with NCOs being in part the results of the dissolution of DHJs.

1.3.6. What about molecular mechanisms of meiotic recombination in Arabidopsis thaliana?

One important characteristic of plants is that they lack a virtual predetermined germ line. On the other hand, their meiocytes differentiate late during plant development, following a series of homeotic activation phases (Feng and Dickinson, 2007). During the last fifteen years, following the first isolated meiotic gene in plants, roughly 50 genes were identified and characterized. At late 1990s, the first Arabidopsis thaliana recombination genes were isolated based on sequence similarity with S. cerevisiae (Sato et al., 1995; Klimyuk and Jones, 1997; Doutriaux et al., 1998). The first reports based on the detection of meiotic defect screens are from 1997 (Peirson et al., 1997; Ross et al., 1997), and these were rapidly followed by the cloning of the first Arabidopsis meiotic genes (Bai et al., 1999; Bhatt et al., 1999). Successively, screen of T-DNA mutants and systematic sequencing of insertion borders allowed the first isolation (see below) (and the subsequent widespread usage of this approach)

21 and functional characterization of meiotic genes using sequence similarity (Grelon et al., 2001). First: Forward genetic method involves screening mutated lines for a phenotype indicative of a meiosis defect, and the subsequent cloning of that gene by looking for the tag inserted. To date, almost half of the presently known meiotic genes in plants were identified using this method. Nevertheless, such approach has several limitations. (i) Functionally redundant and duplicated genes cannot be identified. (ii) Phenotype screens are difficult and time- demanding. (iii) Certain phenotypic criteria as fertility state are not very sensitive, which lead to missing probable meiotic phenotypes. For example, an Atmlh3 mutant, with an obvious meiotic phenotype, shows only a modest decrease in fertility (about 40%) (Jackson et al., 2006). This probably explains why such genes were not found in a large screen in which the mutants with stronger phenotypes are recurrently found. (iv) And finally, unfortunately when T-DNA insertion is used as mutagenic agent, only approximately 20% of the isolated mutations are linked with a T-DNA insert, the best scenario for rapid identification of the mutated gene. The other 80% mutations are mainly caused by very short insertion/deletions, probably caused by failed T-DNA insertions (Mathilde Grelon, personal communication). Second: Reverse genetics approach relies on candidate-gene selection, search for corresponding mutations and analyses of their phenotypic characteristics. Mutations due to T- DNA/transposons are found by PCR screening among pooled mutant DNA or more recently in databases generated by the systematic sequencing of insertion borders. The EMS mutagenesis associated with tilling is also another option to obtain mutant resources for Arabidopsis (Alonso and Ecker, 2006). An alternative approach is the gene silencing by RNA Interference (RNAi) technology, which is particularly valuable when the candidate gene is duplicated or crucial for plant development (Siaud et al., 2004; Higgins et al., 2005; Liu and Makaroff, 2006). Most frequently, genes are selected based on their sequence similarity with meiotic genes previously found in other kingdoms. An emerging approach for the selection of candidate genes is the transcriptomic/proteomic data. The level of expression at the right place and/or at the right time provides invaluable information for choosing reliable candidates approach to previously known meiotic genes.

Genes involved from early to late meiotic events in Arabidopsis thaliana Arabidopsis genes involved in several aspects of meiotic recombination have been cited when appropriate in previous sections. Here, will be provided information on more genes.

22 Gene Protein activity* ⁄ function Mutant phenotype

AtSPO11-1 ⁄ AtSPO11-2 Topoisomerase II-related proteins essential for Absence of chiasmata; DSB formation. AtSPO11-1 and AtSPO11-2 may univalent chromosomes function as a heterodimer observed at metaphase I AtPRD1 Required for DSB formation; potential functional Absence of chiasmata; homologue of mammalian MEI1 protein. Interacts univalents at metaphase I; with AtSPO11 absence of AtDMC1 foci AtPRD2 Required for DSB formation; contains motifs Absence of chiasmata; present in DNA-binding proteins. Probable univalents at metaphase I; orthologue of MEI4 found in S. cerevisiae and absence of AtDMC1 foci mouse AtPRD3 Required for DSB formation. Homologue of rice Absence of chiasmata; PAIR1 gene univalents at metaphase I; absence of AtDMC1 foci AtSDS unknown Absence of chiasmata; univalents at metaphase I; AtMRE11 MRN functions as a complex acting with Failure of chromosome AtRAD50 AtCOM1. Removal of AtSPO11 from 5’ DNA alignment and synapsis. AtNBS1 end on each side of the DSB; resection of 5’ DNA Extensive chromosome (MRN complex) to leave 3’ single-stranded DNA tails fragmentation observed at metaphase I

AtCOM1 Processing of DSBs. Interaction partner with Extensive chromosome SPO11. Possible functional homologue of the fragmentation observed mammalian MEI1 at metaphase I protein required for DSB formation *Proposed biochemical activity on the basis of studies of the corresponding proteins in other species

Table 1-4: Genes involved in Arabidopsis meiosis. Part 1 Proteins involved in DNA double-strand break (DSB) formation are in red and in processing of DSBs are in blue. Adapted from Osman et al 2011 and de Muyt et al 2009 Cohesin complexes contain four major subunits: two members of the Structural Maintenance of Chromosome family (SMC1/SMC3), Scc1/Rad21 and Scc3/Psc3 (Ishiguro and Watanabe, 2007), are essential for accurate segregation of chromosomes at meiosis. The meiosis specific Rad21 subunit (Rec8) was identified and functionally characterized in Arabidopsis. Unfortunately the smc1 and smc3 mutants are embryo lethal and their potential meiotic role could not be studied (Liu Cm et al., 2002). The unique Arabidopsis SCC3 homologue was also characterized and found to be an essential gene in meiosis, luckily, thanks to a leaky allele (Chelysheva et al., 2005). These studies obviously confirmed the functional conservation of the cohesin complex in meiotic chromosomal segregation in plants and emphasized a role of cohesins in the setting up of monopolar attachment of sister kinetochores at meiosis I (Ishiguro and Watanabe, 2007). Centromeric cohesion is specifically protected at anaphase I, by the Shugoshin protein (Watanabe, 2006).The Arabidopsis genome contains two shugoshin homologues, unlike yeast and similar to mammals, but none of these could be characterized so far because of the absence of mutants in the public databases. As mentioned in section 1.3.1, in yeast, Spo11 and its nine other partner are involved in the formation of accurate DNA double strand breaks. In A. thaliana, three homologues of the yeast Spo11 gene have been described; AtSPO11-1, AtSPO11-2, and AtSPO11-3. Only two of them (AtSPO11-1 and AtSPO11-2) had meiotic defects when they were inactivated (Grelon et al., 2001; Hartung and Puchta, 2001; Stacey et al., 2006). Both of these mutants showed a typical asynaptic phenotype (no synapsis, no or very few bivalents) associated with a strong reduction in meiotic recombination. Moreover, they suppress DNA repair defects when crossed to a wide range of meiotic mutants, indicating that both AtSPO11-1 and AtSPO11-2 are absolutely required for DSB formation. The lack of functional redundancy between these two genes suggests that DSB formation is catalyzed by a Spo11 heterodimer in plants, instead of a homodimer as it is thought to in other eukaryotes. However, no direct interaction between the two proteins has yet been described. In addition to AtSPO11-1 and AtSPO11-2, three other proteins are necessary for DSBs in Arabidopsis: AtPRD1, AtPRD2 and AtPRD3. These latter were identified through a phenotypic screen revealing an absence of DSB and therefore recombination (de Muyt et al., 2009a; de Muyt et al., 2007). The Rad50, Mre11, Nbs1, Ski8 are conserved (at the sequence level) in plants, but none of them have conserved all the functions described in yeast. SKI8 has no meiotic role (Jolivet et al., 2006). The function of both AtRad50 and AtMre11 showed that the MRX complex is required for DSB processing in plants, but not for their formation (Bleuyard and White, 2004; Puizina et al., 2004).

23 Gene Protein activity*⁄ function Mutant phenotype

AtRAD51 ⁄ AtDMC1 Homologues of the bacterial RecA protein. Atrad51: Chromosome catalyse strand exchange. AtRAD51 is present fragmentation at metaphase I. No in mitotic and meiotic cells, whereas AtDMC1 chromosome synapsis is meiosis specific Atdmc1: Absence of chiasmata; univalent observed at metaphase I, no fragmentation, no chromosome synapsis AtASY1/AtSDS DMC1-driven interhomologue repair Asynapc, strong reducon of bivalent number AtRAD51C/AtXRCC3 Required for DSB repair Chromosome fragmentation at metaphase I. No synapsis AtRPA1a Regulation of presynaptic filament assembly? Reduced chiasma formation Second-end capture? Yeast and mammals No fragmentation possess a single RPA large subunit gene, whereas five are found in Arabidopsis. AtRPA1a is present throughout prophase I of meiosis. AtBRCA2 Recruitment of AtRAD51 ⁄ AtDMC1 to nascent Univalents at metaphase I, some presynaptic filament. Role may be analogous chromosome to Rad52 in budding yeast fragmentation and mis-segregation AtMND1 ⁄ AtHOP2 Stabilization of presynaptic filament and Failure of chromosome pairing and promotion of duplex capture. Act as a synapsis. Defect in DSB repair complex, interaction demonstrated by yeast two-hybrid analysis

*Proposed biochemical activity on the basis of studies of the corresponding proteins in other species Table 1-4: Genes involved in Arabidopsis meiosis. Part 2 Proteins involved in DNA strand exchange. Adapted from Osman et al 2011 and de Muyt et al 2009 Atrad51 mutants do not repair meiotic DSBs (Li et al., 2004) whereas DSBs in Atdmc1 mutants are repaired, probably taking the sister chromatid as a template (Couteau et al., 1999; Siaud et al., 2004). This suggests that AtRAD51 initiates homology searches regardless of the target, while AtDMC1 is involved in inter-homologue repair. In addition to AtRAD51 and AtDMC1, two RAD51 paralogues, AtRAD51C and AtXRCC3, are involved in meiotic DSB repair. The results in Atrad51c and Atxrcc3 mutants suggest that both proteins cooperate with AtRAD51 (Kerzendorfer et al., 2006). Several proteins are thought to contribute with DMC1/RAD51 in strand invasion: homologs of the breast cancer susceptibility gene BRCA2 product and of the MND1/HOP2 complex were recently identified as key players together with recombinases in Arabidopsis meiotic recombination process (Schommer et al., 2003; Siaud et al., 2004; Dray et al., 2006; Kerzendorfer et al., 2006; Vignard et al., 2007). In 2002, Copenhaver and his colleagues first suggested the presence of two CO pathways in plants. Later, characterization of Arabidopsis MUS81 and ZMM homologues confirmed it. Disruption of the Arabidopsis ZMM does not seem to alter early meiotic prophase events and mutants present only mild synapsis defects (Mercier et al., 2003; Higgins et al., 2004; Chen et al., 2005; Jackson et al., 2006; Chelysheva et al., 2007; de Muyt et al., 2009b). However, it always provokes a severe decrease in CO formation. The CO formation was never superior to 15% of the wild-type level (Type I CO). This suggests that at least 15% of COs are ZMM-independent (Type II CO). The Arabidopsis genome comprises two putative MUS81 genes; one of these was shown to be a pseudogene (Hartung et al., 2006; Berchowitz et al., 2007). The other one is involved in somatic DNA repair (Hartung et al., 2006; Berchowitz et al., 2007), and probably also accounts for 9% of COs (Berchowitz et al., 2007). However, as observed in yeast (de los Santos et al., 2003; Argueso et al., 2004), COs remain when both (interfering and non-interfering) pathways are simultaneously disrupted in Arabidopsis (Berchowitz et al., 2007), suggesting the presence of a possible third mechanism for CO formation. In table 1-4 are summarized known Arabidopsis meiotic genes involved in early to late meiosis.

24 Gene Protein activity*⁄ function Mutant phenotype

(i) Class I crossover pathway DNA helicase, thought to stabilize and Reduced chiasma frequency. AtMER3 ⁄ RCK smulate DNA heteroduplex extension residual chiasmata are randomly distributed AtMSH4 ⁄ AtMSH5 Homologues of the bacterial MutS mismatch Reduced chiasma frequency, repair protein. Biochemical studies on residual chiasmata are randomly hMSH4 ⁄ MSH5 indicate that they form a distributed. Delay in prophase I complex that binds and stabilizes progression recombinaon intermediates (HJs and D- loops) AtZIP3 Yeast Zip3 possesses SUMO-E3 ligase acvity. Reduced chiasma frequency May regulate SC polymerizaon to ensure it is conngent on homologue pairing AtZIP4 In yeast, Zip4 funcons in conjuncon with Reduced chiasma frequency, Zip2 to promote polymerizaon of the SC residual chiasmata are randomly transverse filament protein Zip1 distributed AtPTD Unknown Reduced chiasma frequency, residual chiasmata are randomly distributed AtSHOC1 Putave XPF endonuclease. Interacon with Reduced chiasma frequency, PTD residual chiasmata are randomly distributed AtZYP1a ⁄ AtZYP1b Encodes transverse filament protein of the Asynapc. Slight reducon in synaptonemal complex. Required for COs. Nonhomologous COs controlled formaon of crossovers occur. AtMLH3 Thought to ensure that dHJs are resolved as Reduced CO formaon COs. AtMLH3 probably forms a heterodimer with AtMLH1 (ii) Class II crossover pathway Implicated in the formaon of c. 30% of class II Minor reducon in CO formaon AtMUS81 noninterfering COs. In vitro studies indicate that it can cleave nicked and intact HJs *Proposed biochemical activity on the basis of studies of the corresponding proteins in other species

Table 1-4: Genes involved in Arabidopsis meiosis. Part 3 Proteins required for CO formation. Adapted from Osman et al 2011 and de Muyt et al 2009 1.4 Distribution of Crossovers

1.4.1 Chromosomal Map Determination of Crossover Location

In the 1930s, the comparison between genetic maps and cytogenetic observation (polytene chromosomes) of Drosophila (Painter, 1934; Bridges, 1935; Mather, 1938) suggested that COs are not distributed uniformly along the chromosomes. The first observations of cytological localization of chiasmata are old and vague. It is only in the 1960s that studies of the fine localization of chiasmata along chromosomes in diplotene and diakinesis stages were published. The accuracy of chiasmata distribution maps was then quite largely overtaken by localization studies of LN along the synaptonemal complex. For more than ten years, the immunolocalization of MLH1 protein was commonly used to determine the position of crossovers along chromosomes at pachytene (figure 1-8). MLH1 foci distribution maps were established in species as diverse as mice (Anderson et al., 1999), the human male (Sun et al., 2004; Codina-Pascual et al., 2006; Sun et al., 2006; Oliver-Bonet et al., 2007; Ferguson et al., 2009) or female (Tease et al., 2002), tomato (Lhuissier et al., 2007), the common shrew (Borodin et al., 2008), the American mink (Borodin et al., 2009), dog (Basheva et al., 2008) the zebra finch (Pigozzi, 2008), the zebrafish (Kochakpour and Moens, 2008), etc.. It appears first, based on these works that the localization of crossing-over along bivalent is not homogeneous. Instead, they clearly reveal a significant variation in recombination frequency per physical length unit. A systematic analysis of distribution profiles constantly indicates that telomeres, pericentromeric and nucleolar organizers (consisting of rDNA repeats) are devoid of CO. Then, in many species it was described that the synaptonemal complex length varies between the sexes, sometimes remarkable variation (Lynn et al., 2002; Wallace and Wallace, 2003; Tease and Hulten, 2004; Kochakpour and Moens, 2008), and the number of COs varies correlatively. In addition, it is admitted that the bivalent size can vary significantly, both between cells of the same individual (Lynn et al., 2002; Codina-Pascual et al., 2006; Kochakpour and Moens, 2008) and between individuals of the same species (Sun et al., 2005). The size variation increases CO number variance or changes their localization, so that the superposition effect of heterogeneous profiles is further increased. In brief, the distribution maps show when there is a significant proportion of bivalents more than one CO, an increase in CO frequency is observed near the ends. The unique CO, present mainly on short

25 Figure 1-8: MLH1 foci at pachytene stage. Immunocytology of a human spermatocyte at pachytene. The axial SYN1 cohesin is marked in red, centromeres in blue and MLH1 foci in yellow (From Sun, Oliver-Bonet et al. 2004). chromosomes, seems rather in the middle of arms in mice (Lawrie et al., 1995) and human female (Tease et al., 2002).

1.4.2 Genomic Map Determination of Crossover Location

If there is little evidence of a chromosomal determinant of CO distribution per se, it seems reasonable to question the existence of a sub-chromosomal region control linked to the genome. For at least two different reasons, first there is an “evolutionary” implication: COs generate new genetic combinations, consequently their position regarding genes and more generally genetic elements considered "active" cannot be selectively neutral. Second, in most species, regarding physical reference, "regional" frequency of CO is not homogeneous at any considered scale. The exponential growth of genomic sequence information about ten years ago allowed to carry out the study at sub-chromosomal scale: correlations between recombination rates and various genomic parameters such as genes, pseudo-genes, transposable elements, various repetitive elements, GC content (GC %), CpG island density, etc... One of the first studies performed in 2002 on the human genome (Kong et al., 2002), revealed that the CO frequency is essentially correlated with GC content and CpG islands. It seems that recombination is increased in regions rich in G and C, but it is even more in areas poor in G and C whose CpG island density is high. Other factors analyzed (including gene density) are very weakly correlated with recombination rate. Subsequently, similar studies on mice and rats (Jensen-Seaman et al., 2004; Shifman et al., 2006) and bee (Beye et al., 2006) also showed that among all relatively low correlations, the GC% and/or CpG island are the best "explanatory" variables in recombination rate. In rat and mouse according to what was observed in humans, regions poor in G and C and rich in CpG exhibit increased level of COs. In chickens, CO rate is also highly correlated with GC% (Duret and Galtier, 2009). These results may not seem convincing, yet they are consistent. They were obtained by analysis of large genomic intervals (125 kb in bee, 680 kb in chicken and 3Mb in human). Nonetheless, subsequent studies on linkage disequilibrium (LD) in human populations, with a very small scale (1 kb) using mathematical tools, revealed that there are actually net correlations, but the power is highly dependent on the scale of analysis (Myers et al., 2006; Spencer et al., 2006). Thus the GC% is correlates positively with recombination rate, but at 8 to 512 kb intervals. The significance of apparently universal correlation between recombination rate and GC content is still being debated.

26 100 Hotspot center

Double Strand Breaks

A Repair 50 Percentage of “blue” allele allele of “blue” Percentage 0 -1 0 1 50% 50%

100

Double Strand Breaks B

Repair 50 Percentage of “blue” allele allele of “blue” Percentage 0 -1 0 1 100% Distance from hotspot center kb

Figure 1-9: Biased gene conversion of allelic transmission (A) Equal initiation of meiotic recombination between two parental alleles (bleu and red). Equal recovery of both alleles at the center of the hotspot in the recombinant products. (B) Bias of initiation of meiotic recombination where only one of two parental allele is broken (red). Strong bias in the recovery of the alleles. In 2006, a research about the correlation between CO rate and various chromosome features was done in A. thaliana. The CO rates were correlated positively with the CpG ratio but negatively with the GC content. Note that the experience was only limited to chromosome 4 and by using sex-averaged data (mix of male and female meiosis products) (Drouaud et al., 2006). Five years later, they analyze correlations not only across chromosome 4, but for all five Arabidopsis chromosomes in male and female meiosis separately (Giraut et al., 2011). Female CO rates correlated strongly and negatively with GC content and gene density but positively with transposable elements (TEs) density whereas male CO rates correlated positively with the CpG ratio. However, except for CpG, the correlations could be explained by the unequal repartition of these sequences along the Arabidopsis chromosome. For some authors, the recombination would increase GC% via a phenomenon of "biased gene conversion" (BGC) (Marais, 2003). BGC refers to two possible mechanisms: mismatches created during the recombination process could be more frequently repaired towards GC leading to an increased probability of fixing GC alleles alternatively (Marais et al., 2003), the allele containing the least GC may initiate DSBs more frequently and be thus repaired by the GC-rich allele (Marsolier-Kergoat, 2011). The former hypothesis is well supported by recent analysis in human (Duret and Arndt, 2008) but at contrario, in S. cerevisiae GC content is not driven by recombination (Marsolier-Kergoat and Yeramian, 2009). Unfortunately, a bias in the sense of gene conversion (i.e. an allele to another) resulting from changes in initiation of recombination frequency has also been called BGC (Coop and Myers, 2007; Mancera et al., 2008) (figure 1-9). This "other" BGC is supposed to be the cause of a separate phenomenon, the “hot spot paradox” which will be discussed in chapter 2.

1.5 Distribution of DSBs

Neither DSBs in S. cerevisiae (Baudat and Nicolas, 1997; Gerton et al., 2000; Blitzblau et al., 2007; Buhler et al., 2007) nor H2AX-gamma foci (which mark sites of meiotic DSB repair) in mice (Grey et al., 2009) and the ENs in maize (Stack and Anderson, 2002) are distributed homogeneously along the chromosomes. Is there a relationship between distributions of early recombination events and distributions of crossovers? Indeed, the ratio of ENs to LNs along the SC is almost constant in maize (Stack and Anderson, 2002). There are the same distributions of MLH1 and RPA1 foci in human spermatocytes (Oliver-Bonet et al., 2007). In mice, on the contrary, there is no correlation between distribution of H2AX-

27 gamma and MLH1 foci along chromosome 17 (Grey et al., 2009). In S. cerevisiae, CO and NCO distributions along chromosomes are also discordant (Mancera et al., 2008). So it seems rather doubtful that inhomogeneous distribution of crossovers across chromosomes results mainly from determination of DSBs localization. However, several studies have also demonstrated that early recombination events are not independent of one another. More precisely, the study of distance between successive events reveals that they are subject to significant interference. This is true for ENs in various plant species (Anderson et al., 2001), MSH4 foci in mice (de Boer et al., 2006), RPA in human males (Oliver-Bonet et al., 2007), Msh4 and Mer3 foci in S. macrospora (Storlazzi et al., 2010). In S. cerevisiae, such an analysis of cytological interference is not yet feasible, but examination of physical distances (in bp) separating different classes of recombination products, measured in tetrads, shows that NCOs interfere with COs although they do not interfere with each other (Mancera et al., 2008).

1.6 Meiotic Recombination Hotspots

The cytological studies performed in higher eukaryotes have established formally that COs are not localized homogeneously along the chromosomes. But resolution of those studies was limited and did not describe distribution of recombination events at a very small scale. From 1960s and 1970s, the genetic studies in fungi have revealed that meiotic recombination is actually localized in highly discrete areas of genomes. This observation has subsequently been extended to eukaryotes large-genomes, so that a unitary model for distribution of meiotic recombination events appeared in the early 1990s (Lichten and Goldman, 1995). By definition, hotspots are the cluster of COs (narrow region of 2 to 3 kb in length) isolated in “large” adjacent region in which recombination activity is reduced on sides (de Massy, 2003). For instance, in human 80% of all recombination takes place in 10 to 20% of genome (Myers et al., 2005).The "hotspots" of recombination model can be applied for various species of fungi, animals and plants, but there is currently no evidence suggesting that it is universally valid: despite abundance works on C. elegans and D. melanogaster no hotspot of meiotic recombination has yet been described.

28 1.6.1 Meiotic Recombination Hotspots in Fungi

Many studies of meiotic recombination in various species of fungi revealed that the frequency of recombination is locally dependent on the position. There is generally a bidirectional polarized variation from a localized site, near the 5' end of genes. This is especially true for me-2 locus in N. crassa (Murray, 1963), b2 in A. immersus (Leblon, 1972), M26 in S. pombe (Goldman and Smallets, 1979), ARG4 (Nicolas et al., 1989; Sun et al., 1989; Schultes and Szostak, 1990) and HIS4 (Detloff et al., 1992) in S. cerevisiae. The first homologous recombination models, including that of R. Holliday (Holliday 1964), suggested the presence of initiation sites, discrete DNA breaks, in narrow and specific zones of genome. This hypothesis was later confirmed, first by demonstrating that DNA DSBs actually initiate meiotic recombination (Sun et al., 1989), and then by high-resolution mapping of DSBs at the scale of whole chromosomes, first in S. cerevisiae (Baudat and Nicolas, 1997; Gerton et al., 2000; Blitzblau et al., 2007; Buhler et al., 2007) then in S. pombe (Cromie et al., 2007). Moreover, it has earlier been proved that the meiotic recombination initiation sites systematically produced gene conversions accompanied by exchange of flanking markers (i.e. a CO) in about 50% of cases in S. cerevisiae (Hurst et al., 1972). Finally, strong variations in frequency of recombination were observed from one region to another, which appeared due to variations in frequency of initiation sites (Nicolas, 1979). These initiation sites were referred to as "hot spots" (HS) a few years later (Cao et al., 1990; Symington et al., 1991; Rocco et al., 1992; Lichten and Goldman, 1995). In the mid-1990s, it thus seemed that meiotic recombination was initiated by DSB formed at variable frequencies at HS located upstream of genes, and produced both COs and NCOs in accordance with the DSBR model of Szostak. In addition to crucial role that the concept of localized initiation played in successive models of the homologous recombination, HSs were then used as tools to study the underlying mechanisms such as physical analysis of recombination intermediates. A rather interesting question then remained to be asked: what is the determining nature of the HS localization? Indeed in S. cerevisiae, analysis of DSBs distribution at scale of a chromosome (Baudat and Nicolas, 1997), or an entire genome (Gerton et al., 2000; Buhler et al., 2007) shows that the majority of them are located upstream of genes in regions containing a transcriptional promoter, but these regions do not systematically contain HS. Also the attempts to identify a conserved sequence motif, necessary for recombination initiation, have failed. Some studies have shown that activity of some HS were only dependent on the

29 presence of binding sites for factors, but not the transcription itself (White et al., 1993; Mieczkowski et al., 2006). Other HSs seemed rather localized in areas where the chromatin is devoid of nucleosomes (Ohta et al., 1994; Kirkpatrick et al., 1999; Ohta et al., 1999). Furthermore, the frequency of meiotic DSB seems positively correlated with the local proportion (at 5 kb intervals) of G and C nucleotides in the chromosomal DNA (Gerton et al., 2000). These various observations led Petes et al. to develop a typology of meiotic recombination HS in S. cerevisiae, the three types of HS: alpha, beta and gamma (Petes, 2001). Nevertheless, according to the same authors, in the three situations, specific histone modifications could be involved in recruitment proteins responsible for meiotic DSB formation by an unknown mechanism. A series of data acquired in S. cerevisiae, S. pombe and C. elegans (detailed in S. cerevisiae (Borde et al., 2009)) demonstrates that the level of meiotic DSB either globally or locally is very significantly correlated with trimethylation on histone H3 lysine 4 (H3K4). This work raises some questions: First, some sites of DSBs do not seem to depend on the presence of H3K4me3. It seems that other determining forms of DSB localization can operate in parallel or simultaneously. Second, trimethylation of H3K4 also marks the promoters of genes transcribed during meiosis, but the causal relationship between transcription and formation of meiotic DSB seems rather unclear. Third, the link between H3K4 trimethylation and recruitment of proteins involved in DSB formation is mechanistically unknown. Surprisingly, such mechanism does not exist in S.pombe. In 2008 two articles were published separately (Chen et al., 2008; Mancera et al., 2008) describing the fine distribution of CO and NCO events for the entire S. cerevisiae genome. Mostly, the conclusions of these studies confirm, clarify or expand what had been previously observed, including the analysis of meiotic DSB distribution (Baudat and Nicolas, 1997; Gerton et al., 2000; Buhler et al., 2007). One of the most original results concerns the variation of CO and NCO relative frequencies: in 1.5% of initiation sites, the proportion deviates significantly from average (two COs for one NCO). This difference may be important because sometimes only one type of event is detected (CO/NCO extremes are respectively 14/0 and 0/7) (Mancera et al., 2008). Curiously, it appears that CO/NCO balance is reduced within 20 kb adjacent to telomeres (Chen et al., 2008).

30 1.6.2 Meiotic Recombination Hotspots in Mammals

Advances in molecular genetic techniques in the 1970s led to the developments from physical mapping studies for eukaryotic genomes (with a large genome) to nucleotide sequence scale and multiplication of chromosomal approaches for fine mapping and positional cloning. These works revealed that in very narrow areas (a few kilo bases) of the genome, frequency of meiotic COs is much higher than average, and in adjacent areas. For example a 3.5 kb fragment between Aβ3 and Aβ2 genes of the mouse major histocompatibility complex (MHC) contains a CO in 0.6% of meiosis which represents a frequency (relative to the physical size) 300 times higher than the average genome (Uematsu et al., 1986). These regions called "hotspots" of recombination (before, this term has been applied by extension to DSBs sites in S. cerevisiae) were identified initially in the mouse MHC region because it was actively studied through classical genetics approaches (Steinmetz et al., 1987; Shiroishi et al., 1993). Subsequently, the HS were also characterized in humans by pedigree- analysis approaches or linkage disequilibrium (LD) studies. The term "hotspot" was also frequently (and sometimes still is) used to refer to regions of the genome hotter than average, without consideration to their size or fine distribution of CO contained. This rather vague definition contrasts with the HS model defined in S. cerevisiae that is much more precise. This can certainly be attributed to the particular difficulties that have a sufficiently high number of events. Indeed, at most HSs, the CO density is actually highly increased compared to genome average, but their overall frequency is still generally very low and rarely exceeds 1% (0.3% at the SHOX hotspot in H. sapiens (May et al., 2002)). This implies that, by classical genetic approaches, the population of several thousands individuals is required to have a few dozen events to the hottest HSs. In this regard, the development of PCR has greatly contributed to improving HS quality studies, first in mammals. To begin with, the development of methods for genetic mapping based on PCR, isolated from sperm (Boehnke et al., 1989), has produced a detailed map of human MHC (Hubert et al., 1994; Cullen et al., 2002) and other large areas (Lien et al., 2000; Greenawalt et al., 2006). These studies, although identifying areas that may contain HS, lacked the accuracy needed to study them finely.

1.6.2.1. Mammalian Hotspots Characterized by Sperm Typing An additional technological innovation has further opened the way to the precise characterization of HS in animals and plants genomes. It consisted of amplifying specifically

31 Hotspot Number Length Frequency Frequency Localization Species Sources of CO 95% of of peak analyzed CO (kb) CO x 10-5 (cM/Mb)

DNA1 1 69 1,9 0,5 0,4 promoter H. sapiens (Jeffreys, Kauppi et al. 2001)

DNA2 237 1,3 3,7 2,5 intergenic H. sapiens (Jeffreys, Kauppi et al. 2001)

DNA3 661 1,2 130 140 intergenic H. sapiens (Jeffreys, Kauppi et al. 2001)

DMB1 36 1,8 3,1 2 intron/ exon H. sapiens (Jeffreys, Kauppi et al. 2001)

DMB2 358 1,2 28100 20 intragenic H. sapiens (Jeffreys, Kauppi et al. 2001)

TAP2 141 1 5,8 9 intron H. sapiens (Jeffreys, Ritchie et al. 2000)

MS32 250 1,5 39 40 intergenic H. sapiens (Jeffreys, Neumann et al. 2005)

SHOX 527 1,9-2,5 300 190-370 intron/exon H. sapiens (May, Shone et al. 2002)

NID1 1345 1,5 67 70 intron H. sapiens (Jeffreys, Neumann et al. 2005)

NID2a 302 1,4 8,5 10 intron H. sapiens (Jeffreys, Neumann et al. 2005)

NID2b 107 1,1 3 4 intron H. sapiens (Jeffreys, Neumann et al. 2005)

NID3 1094 2 90 70 intergenic H. sapiens (Jeffreys, Neumann et al. 2005)

MSTM1 179 1,6 8,8 9 intergenic H. sapiens (Jeffreys, Neumann et al. 2005) a

MSTM1 374 2,1 15 16 intergenic H. sapiens (Jeffreys, Neumann et al. 2005) b

MSTM2 46 1,3 0,7 0,9 intergenic H. sapiens (Jeffreys, Neumann et al. 2005)

DPA1 130 1,7 29 27 intergenic H. sapiens (Kauppi, Stumpf et al. 2005)

Beta 283 1,2 150 200 intergenic H. sapiens (Holloway, Lawson et al. 2006) globine

Table 1-5. List of crossover’s hotspots studied by sperm typing in mammals. Part 1 recombinant molecules (mainly COs) in pools of genomic DNA extracted from gametic/gametophytic cells produced by hybrids. This technique called “sperm typing”, developed in 1998 by A. Jeffreys (Jeffreys et al., 1998), has allowed to study in detail more than 40 and 20 HS in humans and mice respectively. The total can seem modest, regarding the interest which this approach represents a priori, but it can be understood for two reasons. First, the typing sperms only apply to studies of unique or very close to previously identified or at least suspected HS. It is necessary to have a more classical genetic analysis, allowing their localization as a small scale as a few kb, with all the difficulties which this requires. However, currently the importance of this limitation is decreasing because the approaches of genomic characterization (sequencing, hybridization chips) tend to become more common. In fact, several HS characterized in humans were initially identified by high-resolution analysis of linkage disequilibrium at the entire genome scale (Webb et al., 2008; Jeffreys and Neumann, 2009). Second, the success of sperm typing analysis depends on the existence of polymorphisms to amplify specifically recombinant molecules by PCR and to localize the transition genotypes with sufficient resolution. If we ignore these inherent limitations, the typing of sperm is an extremely powerful method for analyzing COs. In theory, it allows to characterize a virtually infinite number of events occurring at possibly very low frequencies. But in practice, it is a technique which with a laborious development and implementation. When suitable polymorphisms are present within the same HS, the sperm typing can also be applied to characterization of NCO events. The accurate information published from analysis of HS, by sperm typing in mammals, is summarized in table 1-5, 1-6 and table 1-7. Several key facts have emerged: First, the CO distribution is symmetrical (as it is possible to consider the polymorphisms density, sometimes very inhomogeneous), and apparently consistent with a normal distribution (Jeffreys et al., 2001). Second, the HS size is fairly constant from 1 to 2 kb. The M2 mouse HS (Kauppi et al., 2007) is a notable exception, but may be a composite HS. Third, there are very hot and barely warm spots, between which CO frequency can vary by a factor of 1000. Fourth, at some HS and some crosses, allelic representation among CO is biased to center of HS. This reflects a shift in average position of genotype transitions in one direction and in another. The hypothesis most commonly advanced to explain this fact is that the DSBs do not occur at the same frequency on both parental haplotypes, thus gene conversion does not affect

32 Hot spot Number Length Frequency Frequency Localization Species Sources of CO 95% of of peak analyzed CO (kb) CO x 10-5 (cM/Mb)

A, B, ~110 1,2 – 1,9 15 - 1000 Intergénique H. sapiens (Webb, Berg et al. 2008) C1, C2, (13) /intron (6) D, E, F, G1, G2, H, J1, J2, K, L, M, N, P, Q, R

Psmb9 69 1,8 550 650 intergenic M. musculus (Guillon et de Massy 2002)

Ebeta 124 1,4 159 140 intron/exon M. musculus (Yauk, Bois et al. 2003)

HS22 327 2 217 62 intron M. musculus (Bois 2007)

M1 75 2 5 3 intergenic M. musculus (Kauppi, Jasin et al. 2007)

M2 57 9 7 <2 intergenic M. musculus (Kauppi, Jasin et al. 2007)

Hlx1 14 1,5 2800 ND ND M. musculus (Ng, Parvanov et al. 2008)

HS14.9 91 ND 5,1 ND ND M. musculus (Wu, Getun et al. 2010)

HS18.2 122 ND 14,7 ND ND M. musculus (Wu, Getun et al. 2010)

HS23.9 144 ND 2,5 ND ND M. musculus (Wu, Getun et al. 2010)

HS44.2 71 ND 21,1 ND ND M. musculus (Wu, Getun et al. 2010)

HS48.3 87 ND 0,9 ND ND M. musculus (Wu, Getun et al. 2010)

HS61.1 102 ND 49,9 ND ND M. musculus (Wu, Getun et al. 2010)

HS61.2 210 ND 55,8 ND ND M. musculus (Wu, Getun et al. 2010)

Table 1-5. List of crossover’s hotspots studied by sperm typing in mammals, ND, not determined Part 2 them as well, which leads to transmission bias (Jeffreys and Neumann, 2002; Yauk et al., 2003; Jeffreys et al., 2004; Jeffreys et al., 2005; Baudat and de Massy, 2007b; Cole et al., 2010) (figure 1-9). This regulation of recombination activity carried out by local determinants is called local control in cis. The initiation activity of a HS can also depend on polymorphisms located elsewhere in the genome: in this case a control in trans (see below). Fifth, sometimes a portion of CO has a complex transition profile between parental genotypes. Along these molecules, in HS region, alternating segments either specific to each parent or heteroduplex are observed. Generally, these complex molecules are rare (1% or less), but the proportion may reach 4% (Bois, 2007) or even 25% (Kauppi et al., 2007) in some HSs. These structures are likely mosaic from a defective or aberrant heteroduplexes repair that are formed during homologous recombination process or by strand invasion and second end capture. It was suggested that the frequency and nature (SNP/InDel) of polymorphisms present in HS can determine these accidental repairs of mismatched regions (Bois, 2007). Published data on the characterization of NCO events at HS among mammals are rare (table 1-7). But when available, the data are consistent with what has been observed in S. cerevisiae (section 1.6.1.): the balance between the CO and NCO frequencies fluctuate very much, about 3 to less than 1/12, while the genomic average expected in humans is less than one sixth (47 MLH1 foci for about 300 RAD51 foci). By contrast, gene conversion tract lengths associated with NCO are clearly smaller (probably less than 540 bp) in comparison to that of S. cerevisiae (1.8 kb on average) (Baudat and de Massy, 2007a; Chen et al., 2008; Mancera et al., 2008).

1.6.2.2 The HapMap project and historical hotspots in human One of the main limitations in recombination HS studies in large genomes concerns their preliminary location. A HS can only be analyzed in detail when it has been identified and positioned within a relatively small physical interval. The traditional methods of genetic mapping can be used in model species such as mice, but in humans the analysis of pedigrees is comparatively much less powerful. In humans, it is the LD study which proved to be the most powerful approach to detect "hot regions" (Chakravarti et al., 1984; Chakravarti et al., 1986; Yip et al., 1999; Jeffreys et al., 2001; May et al., 2002; Kauppi et al., 2003; Rana et al., 2004). Nonetheless at population- scale, LD variations not only result in the inhomogeneous distribution of COs along

33 Hot spot Number Length 95% Frequency of CO % - PRDM9 alleles Species Sources of CO CO (kb) analyzed

T 289 ND Up to 2.53 in Af – A/A H. sapiens (Berg et al. 2010)

U 289 ND Up to 0.66 in Eu – A/A H. sapiens (Berg et al. 2010)

S 289 ND Up to 0.76 in Af – A/A H. sapiens (Berg et al. 2010)

CF 289 ND Up to 0.79 in Eu – A/A H. sapiens (Berg et al. 2010)

CG 289 ND Up to 1.67 in Eu – A/E H. sapiens (Berg et al. 2010)

PAR2 289 ND Up to 1.51 in Eu – A/A H. sapiens (Berg et al. 2010)

2A 289 ND Up to 0.204 in Af – A/L11 H. sapiens (Berg et al. 2011)

5A 289 1.5 Up to 1.150 in Af – C/A H. sapiens (Berg et al. 2011)

12B 289 1.4 Up to 3.704 in Af – L6/L13 H. sapiens (Berg et al. 2011)

12D 289 1.3 Up to 0.370 in Af – L4/A H. sapiens (Berg et al. 2011)

22A 289 1.1 Up to 2.939 in Af – L17/L18 H. sapiens (Berg et al. 2011)

PAR2A 289 1.1 Up to 1.921 in Af – C/A H. sapiens (Berg et al. 2011)

Table 1-6. List of hotspots studied by sperm typing in European , Indian and African individuals according to their PRDM9 alleles. ND, not determined; Af, African; Eu, European; chromosomes, but are also the consequence of multiple phenomena related to the selection and demographic history (genetic drift, variation in size, migration and mixed populations). For example in the DPB1 region of MHC, there is a “good” breakdown of LD, but “sperm typing” reveals that in fact the COs are extremely infrequent and homogeneously distributed, that is to say there is no HS (Kauppi et al., 2005). Conversely, sometimes the HSs are detected in regions with a slight decrease in LD. This is particularly true near NID3 gene (Jeffreys et al., 2005). These examples suggest that (i) the recent appeared HSs may not have had "time" to mark their imprints on haplotypical structure of population and (ii) breaks in LD may have been caused by presently disappeared HS. In 2005 a study was published describing the identification of 25,000 “historical hotspots” (HHS) for the entire human genome (Myers et al., 2005). The authors estimated that it represented approximately 50% of HSs currently available, and deduced that there was an active HS every 50 kb, which corresponded to the density observed in the MHC class II (Jeffreys et al., 2001). In addition, the retro-transposons THE1A and THE1B were found to be strongly over-represented in the HHS. Among those retro-transposons found in HHS, a particular sequence motif, CCTCCCT, had been very frequently observed. More recently, a second study by the same team (Myers et al., 2008), based on additional data generated by the HapMap Project (HapMap 2007) made it possible to refine the results of the previous analysis: the motif degenerated CCTCCCTNNCCAC in the context of an THE1A element was associated with a HHS in 73% of cases, and "only" 10% of cases were outside that context. The authors noted in conclusion that this motif could be bound by a zinc finger protein.

1.6.2.3 Towards Elucidation of Mammalian Hotspots Localization Determinant The story of the mysterious motif CCTCCCTNNCCAC did not stop in 2008. In 2010, the Oxford team behind this discovery revealed that in human the PRDM9 gene encode a protein with a series of zinc fingers that is most likely to link the motif CCTCCCTNNCCAC (Myers et al., 2010), but not in chimpanzee, macaque, orangutan, mouse, rat or the elephant. In fact this gene evolves very quickly. Interestingly the motif CCTCCCTNNCCAC was associated to 41% of 22,700 LD-based HSs identified in the human genome (Myers et al., 2008). Despite very high nucleotide identity (99%) between their regions of genomes, it was found that neither human HSs nor HHSs seem to be conserved in chimpanzee. There was

34 Hot spot Frequency of Frequency of CO:NCO Tract lenght Species Sources CO x 10-5 NCO x 10-5 ration (bp)

Beta 290 <25 <1:12 60-300 H. sapiens (Holloway, Lawson et al. 2006) globine

DNA3 110 300 2,7:1 55-280 H. sapiens (Jeffreys et May 2004)

DMB2 5 4 01:01,3 H. sapiens (Jeffreys et May 2004)

NID1 50 13 <1:4 <1200 H. sapiens (Jeffreys, Neumann et al. 2005)

SHOX 370 90 01:03,3 H. sapiens (Jeffreys et May 2004)

Psmb9 550 270 02:01 <540 M. musculus (Guillon et de Massy 2002)

Hlx1 2800 1600 1,8:1 ND M. musculus (Parvanov, Ng et al. 2009)

Esrrg1 2000 2500 1:1;2 ND M. musculus (Parvanov, Ng et al. 2009)

Table 1-7. List of CO and NCO hotspots studied by sperm typing in mammals little concordance in their HSs location. At the site of a recombination HSs in one species, the recombination rate in the other species was typically lower by a factor of 10 to 60 (table 1-8) (Ptak et al., 2004; Ptak et al., 2005; Winckler et al., 2005). It was known since 2005 that in mice, the protein PRDM9 (originally named MEISETZ) was involved in trimethylation of histone H3 lysine 4, and its presence was also necessary for meiotic recombination (Hayashi et al., 2005). A few years later, the relationship between these two activities was demonstrated: in mice, the chromatin of Psmb9 hotspot contains specifically in spermatocytes at pachytene stage a set of modified histones. Some modifications characterize a cold allele, while others including H3K4me3, are associated with a hot allele. H3K4me3 is present in absence of meiotic DSBs (ie in a spo11-/- mutant), and therefore may be necessary for their formation. Moreover, the activity of the hot allele and the presence of histone modifications depend on a given haplotype distant of at least 10 Mb (Buard et al., 2009). Moreover, the locus responsible for this control in trans of Psmb9 activity modulates also the distribution of meiotic recombination events (both DSB and type I COs), at the scale of the entire chromosome 17 and probably the entire genome (Grey et al., 2009). This locus named Dsbc1 (double-strand break control 1) was found to contain Prdm9 (Baudat et al., 2010). Mice strains with the Dsbc1wm7 allele have a different genome-wide distribution of COs in comparison to strains possessing the Dsbc1b allele. Moreover, the single modification of PRDM9 zinc fingers leads to changes in HS activity, H3K4me3 levels, and chromosome-wide distribution of COs (Grey et al., 2011). Not only the zinc fingers sequences are important to hold proper HS activity, the mutations in cis located at HS centers, associated to reduction of HS activity, affect PRDM9 binding. Recently the group of Alec Jeffreys characterized alleles of PRDM9 in African populations (Berg et al., 2010; Berg et al., 2011). They observed that variation within the PRDM9 ZnF domain strongly influences hotspot activity (table 1-6). Moreover, their results point out that minor changes in the ZnF array that predicted not to influence PRDM9 DNA binding specificity have a major impact on hotspot activity and conversely specific PRDM9 variants influence hotspot activation not obviously depending on a binding motif. This suggests the presence of more complex regulators of HS activity, either in parallel or together to PRDM9. The hypothesis according to which the distribution of meiotic recombination events would be regulated by rapidly evolving proteins, PRDM9 as an archetypal example, provides a very convenient solution to the “hotspot paradox”. This hypothesis predicts that as the "hot" alleles, the initiators of meiotic recombination are converted to "cold" alleles in heterozygous

35 Table 1-8: Recombination rates estimated from population genetic data at known human hotspots. Rates estimated using LDhat (McVean et al., 2004) in Utah residents with ancestry from northern and western Europe from the CEPH resource (CEU), Beni sampled from Nigeria (BEN), and western African chimpanzees. P values were obtained by performing a one-sided test for the presence of a 2-kb HS centered at the position of the known human HS (McVean et al., 2004) , based on 10,000 simulations. ‘‘Chimp sites’’ indicates the number of SNPs found solely by resequencing in the region of the human HSs. The estimated rates are centered at the positions of the known human HSs. NA, not applicable (From Winckler et al 2005). by the only mechanism of homologous recombination, their frequency should gradually decreased in the population, so that the cooling HSs with time are expected to disappear finally (Boulton et al., 1997). It can be argued that this mechanism is only active in species with a high degree of intrapopulation polymorphisms, which is not the case when self- fertilization is the predominant mode of sexual reproduction, or when sex is used occasionally. However, the “sperm typing” studies demonstrate that HS can actually disappear in humans, at least one generation to the next. The PRDM9 polymorphisms observed in human populations suggests that HSs may well disappear gradually, as there is a mechanism to create new ones quickly. Some questions remain unanswered: how PRDM9 targets the activity of double-strand breaks catalyzed by SPO11? What is the proper role of H3K4 methylation? How other proteins involved in DSB formation are involved? There is now a group of arguments pointing to the role of histone modifications in determining DSB formation, as well as in S. cerevisiae and mammals (figure 1-10). It is therefore desirable to imagine that this mechanism is universal, even among plants, in which the distribution of recombination events seems to follow a pattern of hotspots similar to that of mammals. Despite the important role of PRDM9, it is absent in sauropsids (birds, lizards) and amphibians, nonetheless appears to be conserved in other metazoans diverging as much as 700 million (Munoz-Fuentes et al., 2011). Surprisingly, a recent research showed that the PRDM9 orthologue of dog (and their wild relatives), has accumulated a number of loss -of- function mutations, expected to render it nonfunctional (Axelsson et al., 2011). This is not true in the case of panda and cat, because these species show neither frameshifts nor premature stop codons in their PRDM9 sequence compared to that of human. Although the presence of similar recombination hotspots in dog (~40000) as in human (25000-50000) and similar spread of recombination rates as humans, but it lacks a functional PRDM9 protein. Furthermore many dog hotspots seem to have remained in the same location for many millions of years. The absence of non-functional PRDM9, or another similar gene (containing both SET and zinc finger domains) in the dog genome suggests that an alternative mechanism of recombination initiation predominates in the dogs. The presence of such mechanism could also be important in species that have a functional PRMD9. This could be used as an auxiliary mechanism in parallel with PRDM9 initiation (Axelsson et al., 2011).

36 Figure 1-10: Model of hotspot specification by PRDM9. Grey and collaborators (2011) proposed a very nice model to suggest a role for PRDM9 in the regulation of hotspot. PRDM9 would bind to a target sequence (green) and catalyzes tri methylation of H3K4 in the surrounding nucleosomes. Other proteins (partners of PRDM9 ?) could participate in the formation of a hotspot-specific signature. SPO11 and/or its partner would thus be able to recognize this signature and catalyzes DSB formation. 1.6.3 Meiotic Recombination Hotspots in Plants

So far the distribution of recombination events in small-scale has been very poorly studied in plants, in comparison to animals. Only two regions of maize have been described in detail: the bronze gene (bz) and the anthocyaninless 1-shrunken kernel-2 region (a1-sh2).The presence of a series of bz gene alleles and thank to phenotypic screens based on grain color allowed to isolate large numbers of intergenic recombination events (Yao and Schnable, 2005; Dooner, 1986; Dooner and He, 2008; He and Dooner, 2009). Several clusters of COs have thus been identified. However, due to the limitation of the phenotypic screen, only a few NCOs have been completely localized and characterized within CO clusters (Yandeau-Nelson et al., 2005; Dooner and Martinez-Ferez, 1997; Dooner, 2002). Moreover, the frequency and position of recombination events in flanking regions of bz gene are still unknown. The a1-sh2 region has also been characterized by using phenotypic screen on grain color and shape, but without a priori on intra/extra-genic localization of recombination events. The fine distribution of CO performed from 176 events among hybrids context has revealed three distinct peaks in 10 kb regions between a1 and yz1 genes, and one isolated peak at 35 kb distance (Yao and Schnable, 2005) (figure 1-11). This graph is consistent with the HS model validated in fungi and animals. In addition, 8 putative NCO events localized partially or completely in a1 gene have been identified (Yandeau-Nelson et al., 2005). The tract length associated to gene conversion exceeds more than 500 bp in most NCO (7/8). Without phenotypic markers with a quantifiable genetic linkage while being physically close, as in the case of a1 and sh2, HS identification depends on positional cloning strategies using anonymous molecular markers. The implementation of such approaches is laborious as chromosomes of plant genomes are physically large. Nonetheless, advances in technology, physical mapping and genetic typing makes possible, the investigation of meiotic recombination hotspots such in A. thaliana (Drouaud et al., 2006) or wheat (Saintenac et al., 2009; Saintenac et al., 2011). It therefore seems likely that the number of HSs characterized in detail in plants growing much in the fairly near future.

1.7 Human genetic diseases caused by meiotic recombination

One cause of human genetic diseases is due to genomic rearrangements. Although meiosis can increase genetic diversity in novel generation, however an improper recombination can lead to human disorders. For example Non-Allelic Homologous

37 /Mb) cM CO rates ( CO rates

Physical distance (kb)

Figure 1-11: Distribution of crossovers at the a1-yz1 region in maize. 161 crossovers were identified and localized in the progeny of a cross between two isogenic lines of Zea mays spp. mays carrying different a1 sh2 haplotypes. The number of CO is indicated above each interval. (From Drouaud, thesis, 2010). Recombination (NAHR), which is meiotic crossing-over between sequences that are not in allelic positions, could create genetic diseases due to recombination between Low Copy Repeats (LCRs). If recombination occurs between LCRs that flank unique genomic segments, an insertion or deletion can result. NAHR events between direct LCRs leads to a duplication or deletion, while that between inverted LCRs gives inversion (Lupski, 1998; Liu et al., 2011). Genes containing LCRs may be interrupted or fused with another gene/ pseudogene through meiotic recombination, leading to loss of gene function (figure 1-12). Certain genetic diseases occur more frequently in some populations than others. For instance : CF (Cystic Fibrosis) in European and North American Caucasians, SCD (Sickle Cell Anemia) in African, PKU(Phenylketonuria) in Celtic and Northern Europeans, and TSD (Tay-Sachs disease) in Ashkanazi Jews (Nussbaum et al., 2001). Such variation of disease based on population or associated with allele frequencies are frequently found in point mutations into single-gene leading to disease. In some populations, the high rate of diseases can be associated to either founder alleles or heterozygous advantage.

Contrary to population-related diseases, meiotic recombination-based disorders can happen in each population (For example, CMT1A (Charcot-Marie-Tooth disease type 1)) (Inoue et al., 2001). The human genome as most of eukaryotes contains a number of duplicated genes in addition to pseudogenes with high sequence similarity. Furthermore, some large segments of genome have significant regions of paralogy through genomic duplication (Khurana et al., 2010). Such paralogous segments can be served as substrates for NAHR. Here, I summarize only a few human genetic diseases caused by NAHR and finally the role of gene conversion to create novel protein, consequently genomic disorders.

1.7.1 Diseases caused by deletion and duplication

- Alpha-thalassemia is one of the first and well-characterized examples of a genomic disorder that affects 5 – 40 % in African and 40 – 80 % in South Asia, causing deletion of one or both of the paralogous alpha-globin genes resulted by NAHR (Higgs et al., 1980; Lauer et al., 1980). Each alpha gene is situated within about 4kb region of homology. NAHR between these regions results deletion of one functional alpha-gene.

38 Figure 1-12: Schematic models of molecular mechanisms for meiotic NAHR between LCRs. (A) CO between two paralogous LCRs (black and shaded rectangles) in direct orientation results in reciprocal duplication and deletion. These rearrangements lead to a duplication or deletion of a unique genomic segment (circle) flanked by the LCRs. Interchromosomal or intrachromosomal (interchromatid or intrachromatid) recombination can lead to this rearrangement. (B) Recombination between inverted LCRs results in an inversion of unique genomic segment (circle and hexagon), but no gain or loss of the genomic segment is observed. Interchromosomal or interchromatid recombination may lead to major change in chromosomal conformation (dicentric and acentric chromosomes). (C) A complex LCR that contains multiple modules, some are in a direct orientation, while others are in an inverted orientation. Depending on which modules are used as substrates for recombination, duplication/deletion (between tandem modules) or inversion (between inverted modules) rearrangement may occur ( From Inoue et al., 2002). - 5%–22% of patients with neurofibromatosis type 1 have deletion of the NF1 gene on 17q11.2 as long as 1.5 Mb (Valero et al., 1997). Most breakpoints occur in a 2-kb segment within two copies of LCR, NF1-REPs (repeats) about 85-kb in length. This 2-kb segment contains two regions with perfect sequence for more than 600 bp in which the recombination breakpoints are localized (Lopez-Correa et al., 2001). The molecular analysis of these patients revealed some interesting features: (i) the presence of NCO adjacent to the recombination breakpoints (Lopez-Correa et al., 2001) (ii) the presence of an apparent bias in the parental origin (maternal preference) of the de novo recombination event (Lazaro et al., 1996). - CMT1A and HNPP (Hereditary Neuropathy with liability to Pressure Palsies) are two autosomal dominant neuropathies caused by duplication and deletion, respectively, on chromosome 17p12. Meiotic NAHR between LCRs (called proximal and distal CMT1A-REPs containing the PMP22 gene) creates a duplication (two copy of PMP22 gene) and deletion (loss of PMP22 gene) (Chance et al., 1994). In this region, the recombination breakpoints were localized to a 1.7-Kb region within the about 24-Kb CMT1A-REPs (more than 78%) containing a 456 bp perfect sequence identity (Lopes et al., 1999). Further analysis revealed the presence of meiotic hotspot within about 24-kb LCRs together a sequence motif; mariner transposase-like element in CMT1A- REPs. Interestingly, a genetic variation in PRDM9 alleles can also cause this disorder (Berg et al., 2010). In addition, in both CMT1A and HNPP, the NCO events were accompanied the strand exchange (Reiter et al., 1996). Like NF1, an apparent bias in the paternal origin was only observed in CMT1A duplication (paternal preference) but not in HNPP deletion (Niedrist et al., 2009). - DGS/VCFS (DiGeorge Syndrome or VeloCardioFacial Syndrome) is often associated to small deletion in 22q11.2. NAHR between multiple copies of LCRs (called LCR22) was cause of this disorder. A large number of patients with DGS/VCFS had a common deletion at about 3 Mb (when NAHR occurred between LCR22A and LCR22D), while a small number of them had an about 1.5-Mb deletion (when NAHR happened between LCR22A and LCR22B) (Liling et al., 1999; Shaikh et al., 2007). - SMS (Smith-Magenis Syndrome) is associated with a 4-Mb deletion flanked by two direct LCRs (called proximal and distal SMS-REPs) on 17p11.2. A third inverted- orientation copy (middle SMS-REP) is situated between those proximal and distal copies. NAHR between distal and middle SMS-REPs located at several loci result in a

39 duplication of the same about 4-Mb genomic fragment, associated with a relatively mild mental retardation phenotype (Potocki et al., 2000; Elsea and Girirajan, 2008).

1.7.2 Diseases due to inversions by homologous recombination

Sometimes, high sequence similarity between inverted repeats may take place within or close to a gene. A mispairing between such sequences can result an inversion through NAHR. This can lead to disruption of gene function.

- About one fourth to one half of patients with severe hemophilia A have an inversion and disruption of the coagulation factor VIII gene mediated by NAHR (Lakich et al., 1993). This inversion, covering about 400 Kb, is mediated by inverted LCRs (Feuk, 2010). - A minor part of Hunter syndrome (also called known as mucopolysaccharidosis II) patients, an X-linked recessive disorder caused by defects of the IDS (iduronate- 2- sulfatase) gene. NAHR between IDS and IDS pseudogene (about 20kb away) generates an inversion causing the disruption of the functional IDS gene in intron 7 (Feuk, 2010). The breakpoints of this inversion are clustered in a 1-Kb region with more than 98% sequence identity. - Flanking FLN1/emerin region contains two large inverted repeats (11.3 kb) with high sequence identity (99%). NAHR between these sequences can lead to EMD (Emery– Dreifuss muscular dystrophy) and generates a complete inversion at FLN1/emerin region without any sequence alteration in either FLN1 or Emerin gene (Small et al., 1997).

1.7.3 Diseases due to gene fusion and gene conversion mediated by recombination

In addition to duplications, deletions and inversions, genes fusion is also another consequence of recombination between homologous sequences. Such rearrangements are generated sometimes between members of gene families or pseudogenes.

- Hemoglobin Lepore (Hb Lepore) was identified in 1958 as gene fusion related disease in which a NAHR led to gene fusion between the delta (HBD) and beta globin (HBB) gene (Gerald and Diamond, 1958). As a result, Hb Lepore includes normal alpha chains; its non-alpha chain is a delta–beta fusion chain. Many different varieties of Hb

40 Lepore have been described such as: Hb Lepore Hollandia, Hb Lepore Baltimore and Hb Lepore Boston; the transition from delta to beta arises at diverse sites (Chaibunruang et al., 2010). - Red-green color blindness (dichromacy) is another example of gene fusion. Red and green pigment genes (OPN1LW and OPN1MW) share 98% identity. These genes are located on Xq28 in a tandem direct orientation. Individuals with normal color vision (euchromates) have one copy of the red pigment gene, OPN1LW, and one or more copies of the green pigment gene, OPS1MW. While the majority of red-green color blindness (about 4%–5% of males) is caused by deletions (of the green pigment genes) or fusion genes (where one or more green pigment genes fuse at 3’ of the red pigment gene) created through NAHR between OPN1LW and OPN1MW (Deeb, 2005; Nathans et al., 1986). Gene conversion that transfers the genetic material from a ‘donor’ sequence to a homologous ‘acceptor’ can also be associated to genetic disease. This is true when some polymorphisms are associated to that transfer. As such polymorphisms leads to amino acid changes and subsequently loss of protein function (partially or completely). In mammals, gene conversion tracts are usually short, from 200 bp to 1 kb in length in average. For instance, 113–2,266 bp for the human globin genes (Papadakis and Patrinos, 1999), 1–1,365 bp for two Yq-located human endogenous retroviral (HERV) sequences (Bosch et al., 2004), 54–132 bp in human leukocyte antigen HLA-DPB1 locus (Zangenberg et al., 1995) and 55– 290 bp in various gene conversion hotspots (Jeffreys et al., 2004; Berg et al., 2010; Berg et al., 2011).

Gene conversion events can be non-allelic (also known as interlocus) or interallelic. Nonallelic gene conversion often is associated with biased directionality. For instance: biased gene conversion between two directly repeated HERV elements along the human Y chromosome (Bosch et al., 2004), and or in human globin genes (Papadakis and Patrinos, 1999).

Although interlocus gene conversion can cause several genetic disorders, however it has an important role in evolution of multigene families (like as the Rh blood group antigen genes RHD and RHCE) and highly repeated DNA sequence (Innan, 2003).

A non-exhaustive due to non-proper gene conversion leading to human genetic disorders is listed in table 1-9.

41 Disease/phenotype Donor gene Acceptor gene Chromosomal Converted localization tract length (bp)

Atypical haemolytic uraemic syndrome CFHR1 CFH 1q32 19–331

Congenital adrenal hyperplasia CYP21A1P CYP21A2 6p21.3 21–155 142–523 42–210 80–202 Syndrome of corticosterone methyloxidase II CYP11B1 CYP11B2 8q21–q22 446–626 deficiency

Increased 18-hydroxycortisol production CYP11B1 CYP11B2 8q21–q22 4–56

Autosomal dominant cataract CRYBP1 CRYBB2 22q11.2–q12.1 9–104 Neural tube defects FOLR1P FOLR1 11q13.3–q14.1 166–215 3–65 Gaucher disease GBAP GBA 1q21 604–974 525–848 36–175 50–475 Short stature GH2 GH1 17q22–q24 40–218 Mild microcytosis HBB HBD 11p15.5 ≥212–≤348 Hereditary persistence of fetal haemoglobin HBG2 HBG1 11p15.5 423–1554 Agammaglobulinaemia IGLL3 IGLL1 22q11.23 33–152 Chronic granulomatous disease NCF1B or NCF1 7q11.23 124–1474 NCF1C 369–1529 247–423 Blue cone OPN1MW OPN1LW Xq28 ≥636– ≤3676 Chronic pancreatitis PRSS2 PRSS1 7q35 289–457 Shwachman–Bodian–Diamond syndrome SBDSP SBDS 7q11.22 11–104 44–217 61–276 120–398 19–118 78–240 60–197 Spinal muscular atrophy SMN2 SMN1 5q13.2 264–776 Von Willebrand disease VWFP VWF 22q11.22– 63–131 q11.23 (VWFP)/ 7–95 12p13.3 (VWF) 112–195 9–99 47–195 163–291 117–229 271–335 21–191 175–297

Table 1-9: Interloci gene conversion events that cause human inherited diseases (From Chen et al 2007)

1.8 Influence of Meiotic Recombination Control on Plant Breeding

One of the most important efforts of plant breeders is the creation of new elite varieties by crossing different parents to produce a series of valuable traits through allele’s combination. The aim is to combine desired alleles together in one hybrid by choosing the suitable combination of chromosomes. Meiotic recombination has a key role in plant breeding by reshuffling chromosomal segments specifically during meiosis. The number of chromosomes in addition to the number and positions of COs between homologous can determine the maximum possible variation among hybrid plants. Today, by modern methods for high-throughput genotyping following markers development, the determination of CO frequencies and their position is easier than in the past (Gupta et al., 2008). For example, a fine-screen of all COs and NCOs issued from a single meiosis product (tetrad) is possible in Arabidopsis, by using a quartet mutant, in which the four meiotic spores remain together (Preuss et al., 1994; Copenhaver et al., 2000), and/or by combining the markers associated with florescent proteins (Berchowitz and Copenhaver, 2008). Thanks to these technologies, the plant breeder hope to control over CO formation is larger than ever before. The most important factors which can have significant impacts on CO frequencies are: genetic/epigenetic background and also morphological/developmental differences. For instance: among barley (Hordeum vulgare) cultivars, with variable genotypes, the difference of CO frequencies reached up to 30% (Sall, 1990). This difference also was observed across Arabidopsis ecotypes (17%; (Sanchez-Moran et al., 2002)). More interestingly, in 1963 Allard observed a three-time difference in recombination frequencies in the F6 than the F2 in lima bean (Phaseolus lunatus) (Allard, 1963). The morphological position of flowers can also influence on CO frequencies in some species such as Arabidopsis (anthers on secondary or tertiary branches have up to 16% more COs than those on primary branches (Francis et al., 2007)), while no difference of CO frequencies related to flower position was observed in barley and rye (Secale cereal) (Sall, 1990). The temperature can influence a wide variation of recombination rates. For instance, the high temperatures lead to increasing CO rates in Arabidopsis (up to 18%), Hordeum vulgare and Vicia faba while, in contrast, the high temperatures decrease recombination frequencies in Allium ursinum (Francis et al., 2007). This is not true in the case of barley and rye, which are less susceptible to environmental changes (Sall, 1990). Furthermore, a series of various chemical agents or physical stress, such

42 as ultraviolet exposure (UV) or temperature shock can also be useful as recombination-rate modifiers (Preuss and Copenhaver, 2000). COs frequency between two closely linked loci was increased three-times in Hordeum by using the actinomycin D or diepoxubytane (Sinha and Helgason, 1969). As in Arabidopsis, a two to sevenfold, fourfold and threefold increase of CO frequencies were observed by using various chemical agents, heat shock and UV respectively (Preuss and Copenhaver, 2000).

1.9. Aims of my thesis work

The aim of my PhD was to confirm the presence of meiotic recombination hotspots in Arabidopsis thaliana and subsequently their analysis. In fact the work done previously in the lab (Drouaud et al., 2006) on the detailed genetic map of chromosome 4 strongly suggests that COs clustered in small regions of a few kilobases but we wanted to confirm and to analyze more in details the characteristics of these regions

When I started my Ph.D in 2008, Jan Drouaud had already started to characterize a hotspot located on short arm of chromosome 4 (named 130x). I was asked to analyze another candidate hotspot (14a) on the long arm of chromosome 4 and to determine (i) if it was a real hotspot (ii) the similarities and differences between 130x and 14a (iii) the similarities and differences between Arabidopsis hotspots and yeasts or mammalian hotspots. One of the main questions concerns the NCO events. Until the beginning of my PhD, only one NCO event had been identified (Francis et al., 2007) in Arabidopsis and only a few in others in maize and rice (review in (Mezard, 2006)). Thus nothing was known about NCOs in Arabidopsis (rate, length, localization…) and more generally in plants.

To answer the different points, I had to adapt the pollen-typing technique set up by Jan Drouaud for 130x. To study NCOs, I could not use the pollen typing technique contrary to what Jan Drouaud did for 130x: the polymorphisms at the center of the hotspots were not suitable for the technique. Thus I developed a new technique in the lab based on both allelic PCR, cloning and genotyping. This technique is written for an article in press in Methods in Molecular Biology (see Material and Methods). This method has allowed me to analyze NCOs at 14a in wild type (Chapter 2) and mutant backgrounds (Chapter 3).

43 Then I decided to look at the activity of 14a in various genetic backgrounds to better understand the control of meiotic recombination hotspots. I choose a series of Arabidopsis accessions on criteria that I will explain in the appropriate chapter (Chapter 4) and started the characterization.

Finally, during my PhD, a series of striking results were published that linked the hotspot localization to an epigenetic control: both in S. cerevisiae and in mice, hotspots were marked by a trimethylation of histone H3 on lysine 4 (see Introduction). Thus I decided to look at the potential role of putative histone H3K4 trimethyltransferase in meiotic recombination in Arabidopsis (Chapter5).

44

CHAPTER 2: CHARACTERIZATION OF THE 14A HOTSPOT

45

2.1 Introduction In this Chapter, I present the results that I have obtained on the characterization of the hotspot 14a.

When I arrived in the lab, the region 14 (113 kb) on the long arm of the Arabidopsis chromosome 4 was known to have a high rate of CO (14.5 cM) in male meiosis (Drouaud et al., 2006; Drouaud et al., 2007; Giraut et al., 2011). Within this region, it had been shown that there was a CO cluster (5 COs among 730 F2 plants) in 5 kilobases. This subregion had been called 14a.

I was asked to characterize the meiotic recombination behavior of 14a: is it a hotspot of meiotic recombination? If so, what is the distribution of COs? Are there any NCOs in 14? If there are, what is their rate, what is the average conversion tract length?

I first sequenced a 10 kb DNA region containing 14a in the two accessions Columbia (Col) and Landsberg erecta (Ler) used for the genetic map (see (Drouaud et al., 2006; Drouaud et al., 2007; Giraut et al., 2011)) to identify all the polymorphisms. Then, on each side of the putative 14a hotspot, I designed a series of primers to be used in allele specific assay. Each primer was then assayed as described in (Drouaud and Mezard, 2011) to check the specificity for the parental alleles in PCR reaction. Among 39 primers designed, a set of four primers for each allele was obtained that could be used in allele-specific PCR (see article). Following the protocol set up in the laboratory by Jan Drouaud, I was able to characterize 14a. The results obtained on CO rate and distribution are presented in the following article submitted to Plant Cell.

Then, I tackled the problem of the identification of NCOs. This was not trivial. In the laboratory, Jan Drouaud used the allele specific assay described by Baudat (Baudat and de Massy, 2009) to isolate and characterize NCOs and then Laurène Giraud modified this technique to improve their detection (Khademian et al., in press). However, both methods relied on suitable polymorphisms on the center of the hotspot to be used in an allele-specific PCR. At the hotspot 130x characterized by Jan Drouaud and Laurène Giraud, it was efficient. However, along the central 600 bp of the 14a1 hotspot, there were three polymorphisms: #35, #36 and #37. #35 and #37 were single SNPs for which we never were able to obtain specific

46 primers. #36 was a microsatellite (GC repeats) and differed by 4 repeats between Col and Ler, thus not the right place to anchor a specific primer.

At the same time, Ng et al. (Ng et al., 2008) published an assay to detect and rate COs and NCOs events on DNA isolated from mice sperm. I adapted the protocol to detect specifically NCOs in pollen (Khademian et al., in press). This assay is now used currently in the lab. The results obtained on NCOs at 14a are presented in the following article.

The following article also describes the comparative analysis of 14a and 130x hotspots between them and with other hotpots described in yeast and mammals.

47 2.2 Manuscript Contrasted patterns of crossover and non-crossover at Arabidopsis thaliana meiotic recombination hotspots

Jan Drouaud1, Hossein Khademian1, 2, Laurène Giraut2, Matthieu Falque3 and Christine Mézard2*

1 these two authors contributed equally to this work

2 Institut Jean-Pierre Bourgin UMR1318 INRA-AgroParisTech, Route de St-Cyr (RD10) 78026 Versailles Cedex France.

3 UMR de Génétique Végétale du Moulon, INRA/CNRS/Univ Paris-Sud/AgroParisTech, Ferme du Moulon ,F-91190 Gif sur Yvette, France

*Corresponding author. E-mail: [email protected]. Tel.: (33)130833739. Fax: (33)130833319.

Running title: Meiotic recombination hotspots in Arabidopsis

Estimate length: 11.1 pages. 9080 words. 57266 characters including references and including spaces

The author responsible for distribution of materials: Christine Mézard ([email protected])

48 Abstract

The vast majority of meiotic recombination events (crossovers (COs) and non-crossovers

(NCOs)) cluster in narrow hotspots surrounded by large regions devoid of recombinational activity. Here, we show that meiotic recombination hotspots do exist in plants and that they have specific characteristics. Using a new molecular approach in plants, called "pollen- typing", we detected and characterized hundreds of CO molecules in two different hotspot candidate regions in A. thaliana. This analysis revealed several features. Firstly, COs are concentrated in regions of a few kilobases where their rates reach up to 80 times the genome average. Secondly, the hotspots themselves tend to cluster in less than 8 kilobases with overlaps between CO distributions. Thirdly, non-crossover (NCOs) events also occur in two hotspots but at very different levels (local CO/NCO ratio of 2/1 and 30/1) and with dissimilar conversion tract lengths. Fourth, at one hot spot, we obtained evidence that recombination is initiated at different levels on each allele.

Keywords: Crossover/ Meiosis/ Non crossover/ Plants/ Recombination,

49 Introduction

The aim of meiosis is to reduce the level of ploidy by half. To fulfil this goal, homologous chromosomes (homologs) are segregated at the first meiotic division. In most eukaryotes, accurate segregation is ensured by the formation of at least one reciprocal recombination event or crossover

(CO) between chromatids of homologs (Martinez-Perez and Colaiacovo, 2009). In addition to this crucial mechanical role, COs increase genetic diversity by reshuffling alleles along the genome.

In all eukaryotes, CO distribution along chromosomes is not homogeneous. Domains with high or low levels of CO rates alternate along chromosomes without clear rules. At a smaller scale, COs tend to be clustered in isolated narrow regions (2 to 3 kilobases wide) called hotspots where CO frequencies are greatly enhanced compared to large adjacent regions almost devoid of any recombinational activity (de Massy, 2003). For example, 80% of all recombination occurs in

10 to 20% of the human genome (Myers et al., 2005).

The baseline of hotspot organisation has been deciphered in the two yeasts Saccharomyces cerevisiae and Schizosacchamomyces pombe. As most of the proteins involved in the meiotic recombination process are evolutionary conserved, it is thought that its basic features are similar in all eukaryotes: meiotic recombination is initiated by DNA double-strand breaks (DSBs) formed early in meiotic prophase at the leptotene stage by the Spo11 protein (Bergerat et al., 1997;

Keeney et al., 1997). The initiating DSBs are repaired preferentially by interactions with a non- sister chromatid. After completion of the DSB repair process, both COs and non-reciprocal recombination events, also called non-crossovers (NCOs) can be recovered (Baudat and de Massy,

2007b) COs and NCOs cluster around the DSBs sites in hotspot regions. Their rates peak at the center of the hotspots and then decrease on both sides (de Massy, 2003; Kauppi et al., 2004).

COs per se are the major determinant of linkage disequilibrium (non random association of genetic markers) breakdown. In addition, the gene conversion tracts contained both in COs and

NCOs shape the haplotype landscape. Indeed, CO associated gene conversion events soften the

50 boundaries between haplotype blocks while NCOs create holes within blocks (Wall, 2004). Thus it is important to appreciate both phenomena as they have implications for genetic association analyses.

Although recombination hotspots are thought to be a universal feature of meiosis their characteristics – i.e. CO rates and distribution – have been extensively described in a few organisms only: S. cerevisiae, S. pombe, mice and human. The comparison of all the data obtained about these hotspots has highlighted common characteristics but also some important discrepancies.

Recent data have identified a level of control for DSB localization. In S. cerevisiae as in mice, DSB sites have been shown to be enriched in histone H3 trimethylated on lysine 4

(H3K4me3), even in the absence of the Spo11 protein (Borde et al., 2009; Buard et al., 2009).

When the histone methylase protein Set1 is absent in S. cerevisiae, cells show a reduced level of

DSBs while a few new hotspots are detected. In mice, 94% of hotspots colocalize with enriched

H3K4me3 peaks. Prdm9 has been demonstrated to be a master regulator of hotspots localization.

It encodes a H3K4 histone trimethyltransferase and has a DNA binding specificity due to a domain containing multiple Zn fingers. This domain is highly variable among and between species

(Ponting, 2011). Interestingly, the Zn fingers of the predominant allele of the human population are able to bind and recognize a 13-mer sequence (CCNCCNTNNCCNC), which is detected in

41% of historical hotspots (Myers et al., 2008; Baudat et al., 2010). In mice, 73% of the hotspots contain the particular consensus sequence that matches with the Prdm9 DNA binding sites of the strain (Smagulova et al., 2011) and the targeted modification of Prdm9 zinc fingers leads to a change in hotspot activity and H3K4me3 levels (Grey et al, 2011). However, Prdm9 is not a universal regulator as it is not functional in dogs and relatives (Axelsson et al., 2011; Munoz-

Fuentes et al., 2011). Moreover, the link between H3K4 trimethylation and hotspot localization is not a general rule: in S. pombe, a mutation in the histone methylase Set1 that severely reduces the

51 level of H3K4 trimethylation does not affect sporulation and thus probably not the global level of

DSBs (Noma and Grewal, 2002). In Arabidopsis thaliana, 41 SET domain proteins have been identified. However, none of them is a straightforward homolog of Prdm9. Moreover, no obvious meiotic phenotype has been reported when the corresponding genes were inactivated (Pontvianne et al., 2010; Thorstensen et al., 2011). Another difference is that, no consensus sequence for DSB sites has been yet identified in both yeasts at the DNA level.

In all hotspots described, the distribution pattern of CO junctions is rather similar. It shows clustering in very narrow regions and a shape following roughly a Gaussian curve centred on the mean CO position (Arnheim et al., 2007). This mean CO position reflects the position of meiotic

DSBs and the width of the curve reflects the length of the associated gene conversion event: indeed, a molecular marker located close to the initiating DSB has more chance of being converted in the repair process than a marker further away. The mean size of gene conversion tracts associated with COs is longer in S. cerevisiae (1.8 kb) (Mancera et al., 2008) than in mammals (500 bp; Baudat and de Massy, 2007b), but the control underlying this difference is unknown.

In all studies, NCOs are probably underestimated because of the difficulty in their detection and the limited number of polymorphisms available. In S. cerevisiae, genome wide studies have obtained 90.5 COs for 66.1 NCOs per meiosis with an estimated of 140-170 DSBs per meiosis (Buhler et al., 2007; Mancera et al., 2008; Pan et al., 2011). In other organisms as diverse as mice, humans, A. thaliana or Allium sativa, NCOs are thought to largely exceed COs.

Indeed, DSB repair sites estimated by cytological methods (immunocytology using antibodies against the DSB repair protein RAD51 and/or DMC1, or early nodule visualised by electron microscopy) exceed COs by 10 to 40 times compared to COs (estimated by chiasmata count, late nodules visualised by electron microscopy, or by genetic maps) (Baudat and de Massy, 2007b; de

Muyt et al., 2009). It is not known yet if all these DSBs in excess are repaired as NCOs but if so

52 their number could greatly exceed the number of COs. Moreover, the relative ratio of CO to NCO varies from one hotspot to another. It varies from 14:0 to 0:7 in S. cerevisiae with a very low CO to NCO ratio next to telomeres (excess of NCOs) and repression of both CO and NCOs close to centromeres (Mancera et al., 2008). In humans and mice CO to NCO ratio is also extremely variable from more than 12:1 (no NCO detected) to 1:2.7 (reviewed in (Baudat and de Massy,

2007b)). The basis for these variations is totally unknown.

The understanding of the organization of hotspots benefits from the analyses done in a few species, essentially fungi and mammals, however, it seems that rules may differ from one species to another. We thus undertook the characterization of hotspot regions in a very different model, the plant A. thaliana. We previously reported that, on the A. thaliana chromosome 4 genetic map,

COs were grouped in several regions, each a few kilobases in length (Drouaud et al., 2006). These were good hotspot candidates but could not be characterized further by classical genetic studies as the CO rates were estimated to be around 0.35 cM in these regions. Therefore, we set up a "pollen- typing" molecular approach (see Results), based on the “sperm typing” technique developed previously in mammals (Jeffreys et al., 1998), that allowed us to detect and characterize hundreds of CO and NCO molecules in two different hotspots. Here, we show that CO distribution differs from those described in other species. Numerous NCOs were also detected in these regions but at very different levels. Furthermore, at one hotspot the data suggest that recombination is not initiated at the same level on both alleles.

Results

Detection of COs at three meiotic hotspots by "pollen-typing"

The 14a and 130x regions were identified in a previous study of CO distribution over the entire chromosome 4 in large populations of A. thaliana hybrids between “Columbia-0”

(Col) and “Landsberg erecta-4” (Ler) (Drouaud et al., 2006). CO frequencies at both loci

53 were 0.36 cM, with a peak rate around 90 cM/Mb and 171 cM/Mb at 14a and 130x respectively, while the chromosome average was 4.8 cM/Mb. Thus, these two regions were good candidates for true hotspots. However, classical genetic techniques could not be used as thousands of plants would have been needed to obtain enough COs to characterize these regions. Thus we set up a "pollen-typing" technique (see Materials and Methods, Figure 1;

(Drouaud and Mezard, 2011)), that parallels the "sperm-typing technique used for hotspot studies in mice and humans (Baudat and de Massy, 2009; Kauppi et al., 2009; Cole and Jasin,

2011). Briefly, recombinant products are detected directly from pollen DNA by series of allelic PCRs. We were able to detect and characterize hundreds of recombinant CO molecules at both hotspots in genomic DNA extracted from the pollen of ColxLer hybrid plants. The meiotic origin of these molecules was assessed with control reactions carried out in parallel with pollen and leaf (somatic) DNA. Using the same amounts of DNA, CO molecules could be detected in pollen, but never in leaf (Figure 2). In fact, the few positive PCRs detected with an input of a large amount of leaf genomes (>16000) were sequenced and shown to be non- specific amplification.

The rate of COs in pollen genomic DNA was 0.85 % (Confidence Intervals (CI) 0.61-

1.21) at 14a, and 0.53 % (CI:0.34-0.80) at 130x. Then the CO molecules were sequenced to confirm and mapped precisely their exchange point. All CO molecules characterized (104 at

14a and 167 at 130x) contained a single transition between parental haplotypes.

The 14a region exhibited two distinct CO peaks (subsequently referred to as 14a1 and

14a2), which both fit a Gaussian distribution (Figure 3A). The width of hotspots within which

95% of COs occurred (determined by best fit normal distributions, Material and Methods) was 1,475 bp and 3,775 bp respectively, and their centers lie 3,047 bp apart from each other, with an in-between “valley” where just a few COs were detected. CO frequency was null on

54 each side of this region (Figure 3A). At 14a1 and 14a2, CO rates peak at 380 and 185 cM/Mb, respectively (80 and 39 times the chromosome average).

At 130x, CO rate was also null on both sides of the region and reached a maximum

(close to the centre) at 102 cM/Mb, which is 22 times the chromosome average (Figure 3B).

The distribution of COs differed from that observed in 14a: it was very broad (more than 7 kb) and irregular with alternating “peaks” and “valleys” (Figure 3B) which does not fit well with a unique Gaussian curve. However, both the CO rate and distribution at 14a and 130x indicate clearly the existence of hotspots in A. thaliana.

We then looked at the distribution of exchange points in each orientation at both hotspots. At the 14a hotspots (14a1+14a2), discrepancies between CO distribution in reciprocal orientations “Col to Ler” and “Ler to Col” (i.e. ‘CtoL’ and ‘LtoC’) were observed:

’LtoC‘ exchanges were displaced to the left of ‘CtoL’ exchanges (Figure 4A). This led to an excess of Col allele at the median of both hotspots as shown by the comparison of the CO cumulative distribution (Figure 4A). At 14a1, the Col allele was over-transmitted (68%) at the mediane of the CO distribution. The difference was highly significant at 14a1 (p-val 0.00111) while it was only barely significant (p-val 0.0734) for the 14a2 hotspot, probably due to the lower CO number. These patterns are consistent with the hypothesis that the Ler allele has a stronger initiation activity than the Col allele at these hotspots. The mean position of the two reciprocal distributions ‘CtoL’ and ‘LtoC’ is separated in average by 213 bp and 483 bp for the 14a1 and 14a2 hotspots respectively. In contrast, at the 130x hotspots both alleles seem equally proficient at initiating recombination (Figure 4B).

Detection of NCO events at meiotic recombination hotspots

At meiotic recombination hotspots, DSBs are repaired either as CO or NCOs. In plants, very few meiotic NCOs have been characterized because of the difficulty in detecting

55 molecular events not linked to a phenotypic change. We characterized NCO events at both the

14a1 and 130x hotspots, with different molecular approaches adapted to the polymorphism landscape of each hotspot.

For 14a1, cloned PCR fragments were genotyped at three SNPs: two (#35 and #37) located on opposite sides of the median of the hotspot and one (#33) on the left border (Figure

5). Positives clones were then sequenced (see Material and Methods; Figure 1). Among 3,000 molecules tested, 8 and 7 NCO events were detected at polymorphism #35 and #37 respectively, and none at polymorphism #33 (Figure 5). The cumulative NCO frequency for both SNPs (#35 and #37) (1/203, 0.50% (CI: 0.30-0.82)) was slightly lower than the overall

CO frequency estimated with the pollen typing approach (1/125, 0.85%). We recovered unequal number of NCOs in both directions: two ‘LtoCtoL’ and six ‘CtoLtoC’ were detected at polymorphism #35 while one ‘LtoCtoL’ and six 'CtoLtoC’ at polymorphism #37 (Figure 5).

When all NCOs were pooled, the difference between NCO rates in reciprocal orientations

(‘LtoCtoL’ versus ‘CtoLtoC’) was significant (p val = 0.018). This result was consistent with the hypothesis that initiation was preferentially on the Ler allele. NCO events at both sites were all restricted to a single polymorphism, either #35 or #37, i.e. without co-conversion of left and/or right flanking markers, which are located 111 bp and 482 bp away for #35 and 166 bp and 340 bp for #37 (Figure 5). Interestingly, the central region between SNP #35 and #37 contained a polymorphic microsatellite at 482 bp and 166 bp from the two SNPs. The NCO rate at this polymorphism could not be directly evaluated by fluorescent genotyping but, by

DNA sequencing of the positive clones, we observed that none of the conversion tracts isolated at SNP #35 and #37 included this polymorphic site. Thus the mean minimal tract was

1 bp if only the polymorphism converted was considered and the mean theoretical maximal tract was 552 bp if the tract extended just prior the next polymorphisms on both sides (276 bp when minimal and maximal mean are averaged).

56 At 130x, NCO molecules were characterized using a PCR-based “pollen-typing” strategy (see Materials and Methods, Figure 1) using three polymorphic sites (#21, #44 and

#52), 2339 and 2052 bp away respectively. #44 was next to the median of the hotspot where the CO frequency was maximal, #21 was located on the left part of 130x where CO rate was low and #52 was on the right part where CO rates were average (green marks on Figure 3B).

At each SNP, PCRs were performed on a set of 397,996 genomes extracted from pollen.

When both left and right PCR were positive at one SNP, recombinant PCRs molecules were fully sequenced to map the recombinant point (Figure 1). At #44, 18 ‘LtoCtoL’ and 11

‘CtoLtoC’ events were found (Figure 6A). Similar NCO rates were obtained at #21: 18

'LtoCtoL' and 12 'CtoLtoC' among the same pool of 397,966 genomes (Figure 6B). However, at #52 only 4 'LtoCtoL' molecules were recovered and no 'CtoLtoC' (Figure 6C). The observed NCO frequency was approximately 0.007 % (CI: 0.005-0.010), 0.008 % (CI: 0.005-

0.011) and, 0.001 % (CI: 0.0004-0.0033) at #44, #21 and #52 respectively. Pooling the results obtained at the 3 SNPs, the NCO rate 0.016% (CI: 0.012-0.020) was roughly thirty times lower than the overall CO frequency (0.53%).

We also noticed that as for 14a1, an excess of ‘LtoCtoL’ (36) NCOs compared to

‘CtoLtoC’ (23) were detected at both #21 and #44. At #52, only ‘LtoCtoL’ NCOs were obtained. In this latter case, we could not distinguish between the absence of ‘CtoLtoC’ NCOs in our starting pool of 397,996 genomes or if we missed them by pollen-typing. However, when the results obtained with the first two SNPs were pooled, the difference was not significant (‘LtoCtoL’ : 0.009 % (CI: 0.007-0.013); 'CtoLtoC': 0.006% (CI: 0.004-0.009)).

We then sequenced the PCR products to precise the length of the NCO tract. At #44,

23/29 NCO tracts extend to the right to the neighboring polymorphism (89 bp away), while polymorphisms on the left were co-converted in only 5 tracts, but over a greater distance (up to 791 bp)(Figure 6); similarly, at #21, 20/30 NCOs include the first two polymorphisms on

57 the left (275 bp) whereas only 5 extend to the right but again over a greater distance (up to

1028 bp)(Figure 6). Thus the minimal tract length mean are comparable for both SNPs (160 bp at #21 and 278 bp at #44) whereas the maximal mean tract length is more than 3 times longer at #21 (1798 bp) compared to #44 (492 bp) and the average is also two times longer at

#21 (1038 bp compared to 492 bp). The longest NCO tract was found at #52: 6 SNPs were co- converted along a tract of 1882 bp that could extend up to 3045 bp. Interestingly, none of the

NCO tracts cover either #21 and #44 or #44 and #52 (Figure 6).

At #21 three NCO tracts were chimeric: two 'CtoLtoCtoLtoC' and one

'LtoCtoLtoCtoL' (Figure 6B).

Discussion

Similarities and discrepancies between CO distributions at meiotic recombination hotspots

Here, we characterized three hotspots of meiotic recombination on Arabidopsis thaliana chromosome 4. Both the rate and distribution of COs and the occurrence of NCOs across these regions confirm that they are indeed meiotic recombination hotspots.

At 14a, CO distribution patterns fit with the existence of two independent hotspots located very close to each other. 14a1 has a very high peak rate of COs (380 cM/Mb) and is narrow (1475 bp) while 14a2 is broader (3775 bp) and peaks at less than half the rate of 14a1

(185 cM/Mb). At 130x, the CO landscape is more complex: CO distribution is broad and does not conform to a single Gaussian curve. Instead, it is irregular with alternating peaks and valleys. In this region, COs could originate from a single initiation zone: the irregularities observed in the distribution could be explained by the presence of several insertions/deletions

- which are 20, 13, 12, 70, 10, 7 and 10 bp wide respectively - along the region (Figure 3B;

Figure 6). Such heterologies could either block branch migration of double Holliday junctions

58 or channel recombination intermediates towards NCOs or exchanges between sister chromatids, as suggested previously for the mouse HS22 hotspot (Bois, 2007). However, even if at 130x, heterologies could be responsible for to a slight CO drop there is not a dearth of

COs as described in several mammalian hotspots (Jeffreys and Neumann, 2005; Baudat and de Massy, 2007a; Cole et al., 2010; Wu et al., 2010). Alternatively, 130x could be, like 14a, a cluster of several close hotspots (3 or more), each having arisen from a discrete initiation zone and the resulting CO distributions would overlap extensively. This last hypothesis is strengthened by the fact that NCOs are initiated independently at least in three regions around

SNPs #21, #44 and #52 and the conversions tract do not overlap between these three regions.

Interestingly, the only other region in which the fine scale distribution of COs has been characterized in a plant species (the a1 region of maize (Yao et al., 2002; Yao and

Schnable, 2005)), exhibited a pattern similar to that of 130x. COs were distributed throughout a wide region (10 kb) with peaks and valleys and there was no region devoid of COs between two peaks.

The occurrence of hotspot clusters (two or more) within less than 12 kb was also described in at least four regions in human: DNA1-DNA2-DNA3, DMB1-DMB2 (Jeffreys et al., 2001), NID2a-NID2b, MSTM1a-MSTM1b (Jeffreys and Neumann, 2005). However, these clusters differ from those we have identified in Arabidopsis and from a1 in maize in several aspects. Firstly, within the clusters, the adjacent hotspots can be distinguished from each other because the corresponding CO distributions do not overlap. A contrario, at 130x or a1 in maize (Yandeau-Nelson et al., 2005), the precise identification of individual hotspots is not possible as the peaks of CO rate are not separated from each other by regions devoid of COs.

In 14a1, two independent peaks can be distinguished but the valley in-between is not as neat as in mammalian hotspots. Secondly, in the clusters described in human, hotspots are not larger than the 2.1 kb described for single hotspots in mammals and all hotspots in S.

59 cerevisiae (Arnheim et al., 2007; Mancera et al., 2008). At both 130x and 14a2, however, we describe hotspots larger than 3 kb, suggesting that in contrast to mammals the underlying mechanism could accommodate for some variation in the size of recombination intermediates.

Thirdly, in most clusters described in mammals, one hotspot has a very high rate of CO while the others are often 10 times less recombinogenic or just above or below the genome average rate. At both 130x and 14a, each individual peak has a very high CO rate. The molecular mechanisms underlying this clustering are presently unknown. One hypothesis is that (i) there are regions of tens of kb that are permissive for recombination and then that (ii) within these regions, similar to S. cerevisiae, S. pombe and mammals, recombination hotspots arise at sites where particular sequence motifs and/or chromatin modifications target the activity of Spo11

(Noma and Grewal, 2002; Borde et al., 2009; Baudat et al., 2010; Myers et al., 2010).

Interestingly, the median of all three hotspots lies close to gene promoters, which are active in

A. thaliana meiocytes in both Col and Ler (3 to 4 times above the average transcription level,

(Libeau et al., 2011)). A similar localization was noted for most DSB hotspots in S. cerevisiae

(Baudat and Nicolas, 1997; Buhler et al, 2007; Ludin et al, 2008; Pan et al, 2011) while in S. pombe, hotspots lie preferentially in large intergenic regions (Cromie et al, 2007; Petes,

2001). In human, genome wide characterization of historical CO hotspots, through the analysis of linkage disequilibrium, provided evidence that recombination mainly occurs near genes but outside transcribed regions (Myers et al., 2005) even if some lie within genes. And in mice hotspots have a tendency to overlap genes (Smagulova et al., 2011). A more precise localization of more hotspots in Arabidopsis is needed to generalize our observations.

However, the potential similarity between the location of S. cerevisiae and Arabidopsis hotspots in promoter regions could be due to the resemblance of their genomic structures.

Both are compact, have a high density of genes along chromosomes and small intergenic regions (Goffeau et al., 1996; AGI, 2000).

60

CO to NCO ratio

Meiotic DSBs are repaired either as COs or NCOs. In Arabidopsis male meiosis, CO number estimated by counting chiasmata on metaphasic plates, varies from 7.90 to 9.36 depending on accessions analyzed (Sanchez-Moran et al., 2002). A detailed genetic map gave

11.6 COs per male meiosis in the ColxLer background used in this study (Giraut et al., 2011).

Using antibodies directed against the meiotic DSB repair protein DMC1, DSB sites have been estimated to be around 230 in male meiosis of various accessions (Chelysheva et al., 2005;

Chelysheva et al., 2007; Vignard et al., 2007). If these breaks were mainly repaired as NCOs, there should be a large excess (at least 20 times) of NCO events over COs but the relative ratio of DSBs repaired on homologous versus sister chromosomes is totally unknown in

Arabidopsis as in other higher eukaryotes.

At the 130x and 14a1 hotspots, we observed both COs and NCOs, but in very different relative local ratios. COs are relatively easy to score as they exchange a large set of markers among which it is relatively easy to get scorable SNPs by "pollen-typing". Our approach to measure NCO rates relies on the detection of conversion events at selected SNPs. By doing that, we could not detect NCO events that did not convert these SNPs, either because, even if initiated closed by they were small and did not reach the SNPs studied, or they initiated further away and did not cover the region of the SNPs chosen. Moreover, many SNPs cannot be studied by our pollen typing strategy either because specific primers are not specific enough to be used to amplify NCO recombinant molecules from pools containing a large excess of non recombinant molecules or because they are not suitable for the fluorescent mapping technique. Thus, NCOs are largely underestimated. However, we were able to observe and analyse tens of NCOs at both hotspots.

61 At 130x, NCOs were detected at three polymorphic sites distributed along the hotspot with a global rate of 0.016%, thus 30 times less than that of COs. For all reasons developed above, this rate is likely underestimated. At 130x, if we hypothesize that each base along the

7485 bp hotspot can be the target of a DSB at the same rate and that the average conversion tract length is around 781 bp (average global conversion tract length), then the NCO frequency could be up to 0.15%, a third of the CO rate. If we follow the same logic for 14a

(14a1+14a2), NCO could reach as high as 9.9% that is 12 times the CO rate. It is likely that our hypothesis oversimplifies the NCO frequency estimate as at both hotspots analyzed the

NCO rates are very inhomogeneous, however, NCOs could represent a large fraction of recombination events.

Our results show that in plants, as previously demonstrated in yeasts or mammals, the local CO to NCO ratio varies from one hotspot to another. However, in mice and human,

NCOs rarely largely exceed COs (Arnheim et al., 2007; Baudat and de Massy, 2007b) whereas cytological observations suggest a large excess of NCO events. It could be that there are NCOs exclusive hotspots at other locations but these would be difficult to identify. In

Arabidopsis in fact, the two regions studied here have been selected on their high CO rate in a previous study (Drouaud et al., 2006). Another hypothesis which is not exclusive to explain the lack of large excess of NCOs, would be that a high proportion of these DSBs are repaired on the sister chromatid. To date, the underlying mechanical basis of these fluctuations remains totally unknown.

Variable features of NCO events at Arabidopsis hotspots

At 14a1, we observed that none of the NCOs isolated co-converts SNP #35 and #37, which are 648 bp away from each other, and furthermore none co-converts a polymorphic microsatellite lying at 482 bp and 166 bp between the two SNPs. This result together with the

62 similar NCO rates at both positions, strongly suggests that the main DSB initiation zone lies close to this microsatellite and that the corresponding conversion tracts rarely include both the microsatellite and SNP #35 or #37.

We analyzed 63 NCO tracts at 130x. Their length belongs to the range 1 to 3045 bp with a mean length around 781 bp. At the a1 locus in maize, the five NCO tracts characterized are within a similar size range (31 bp to more than 1603 bp) (Yandeau-Nelson et al., 2005).

Thus in plants, NCO tracts could be shorter than those of S. cerevisiae, reported to be on average around 2 kb (Mancera et al., 2008), and probably longer than those of mammals.

Indeed, none of the NCO tracts reported so far in mice and human exceeds 1.2 kb in length, with an average estimated between 50 to 300 bp (Jeffreys and May, 2004; Jeffreys and

Neumann, 2005; Holloway et al., 2006; Arnheim et al., 2007; Baudat and de Massy, 2007a).

Additional studies of mammalian and plant hotspots are needed to generalize this hypothesis.

Crossover asymmetry and bias for recombination initiation

The DSB-repair model for meiotic recombination and its derivatives predict that the broken chromatid receives information from the homolog (Szostak et al., 1983; Allers and

Lichten, 2001; Hunter and Kleckner, 2001). It implies that, if one homolog is preferentially involved in DSB formation, a shift between crossover breakpoints in the two reciprocal orientations should be observed, with the alleles of the less active homolog near the hotspot centre being over transmitted (see (Baudat and de Massy, 2007a) for a comprehensive discussion). At both 14a1 and 14a2 hotspots, CO distribution in reciprocal orientations suggests that the Ler allele is more proficient for initiating meiotic recombination than the Col allele. This hypothesis is supported by the significantly higher rate of ‘LtoCtoL’ versus

‘CtoLtoC’ NCO at the centre of the 14a1 hotspot. As they have similar rates and conversion tract length, CO and NCO would contribute likewise to transmission distortion. However, it is

63 not a general rule for all hotspots as no bias was detected at 130x. Bias of initiation can lead to fixation of alleles and hotspot extinction and thus can substantially influence genome evolution.

Variation in the capability of an allele to initiate recombination has been shown previously in several mammalian hotspots (DNA2 (Jeffreys and Neumann, 2002), NID1

(Jeffreys and Neumann, 2005), Psmb9 (Baudat and de Massy, 2007a), A3 (Cole et al., 2010)).

This plasticity has been suggested to be controlled mainly by the capacity of Prdm9 zinc fingers arrays to recognize a consensus sequence within hotspots and thus to mark them. The rapid evolution of minisatellites encoding the zinc finger array would explain the highly polymorphic location of hotspots between individuals and related species in mice and human

(Ponting, 2011). However, no Prdm9-like mechanism has been reported yet in Arabidopsis and it is also not known if hotspots localization is highly variable. Biais of initiation

In conclusion, we have formally demonstrated that true meiotic recombination hotspots exist in the plant Arabidopsis thaliana. We have also established that COs and NCOs occur at very high rates at these hotspots, as in yeast and mammals. However, we have shown that the pattern of COs and NCOs differs from those described in other species, emphasizing the need for the analysis of more hotspots in various species in order to understand the underlying mechanisms that control them

64 Methods

Plant material

The Arabidopsis thaliana accessions “Columbia-0” (186AV) and “Landsberg erecta”

(213AV) were obtained from the “Centre de Ressources Biologiques” at the “Institut Jean

Pierre Bourgin”, Versailles, France. All plants were grown in the greenhouse under standard conditions.

Extraction of pollen genomic DNA

Genomic DNA from pollen was extracted as described in (Drouaud and Mezard, 2011).

Briefly, whole inflorescences from hybrid plants were harvested in 10% saccharose, and crushed in a “Waring Blendor” (two 4 sec pulses at full speed). The homogenate, containing intact microspores and pollen grains, was then filtered and stored at -20°C until DNA extraction. Pollen grains and microspores were resuspended and incubated with proteinase K at 65°C for three hours with gentle shaking. Then, pollen grains were disrupted by mixing with glass beads with a vortex at full speed for 1 to 3 minutes. One volume liquid phenol was added and tubes were rocked for 30 min at 4°C. After centrifugation, the supernatant was recovered and nucleic acids were precipitated by sodium acetate and ethanol addition.

Genomic DNA was dissolved in (10 mM Tris-Cl pH8, 1 mM EDTA, 100 µg/ml RNAseA) and incubated at room temperature for 15 min. Four volumes of freshly made (5M guanidine isothiocyanate, 50 mM Tris-Cl pH 8) were then added, and DNA was purified with DNeasy minicolumns (Qiagen ref. 69106).

Extraction of leaf genomic DNA

Genomic DNA was extracted from young leaves as described in (Weigel and Glazebrook,

2002). Then, four volumes of freshly made (5M guanidine isothiocyanate, 50 mM Tris-Cl pH

65 8) were added to the extract, and DNA was purified with DNeasy minicolumns (Qiagen ref.

69106).

Quantification of genomic DNA

Quantification of genomic DNAs was performed as described in (Drouaud and Mezard,

2011). Briefly, PCR reactions were performed in 20 µl of buffer (Jeffreys et al., 1990) with 1

U Taq DNA polymerase and 0.1 U Pfu DNA polymerase. Whenever less than 100 pg/µl of genomic DNA was used, herring sperm DNA (Clontech) was added into the reactions (1 ng/µl). Primers (sequence and genomic coordinates) are listed in Table S1. Pairs of oligonucleotides used for allele-specific PCR are listed in Table S2.

Amplifiable molecules in genomic DNA extracts were quantified on a series of dilutions through two rounds of PCR, using nested allele-specific oligonucleotides (ASOs) listed in

Table S1 and Table S2. The product of the first PCR was diluted 1/1000 in the second reaction. The thermal cycling profile of the reactions is: (((92°C;2 min)((92°C;20 sec)(Tm;30 sec)(68°C;30 sec + 45 sec/kb)))x30(68°C;90 sec/kb)(4°C;∞)). After the second PCR, the proportion of negative wells among a set of aliquot reactions is approximated by e-m, where

‘m’ is the mean number of DNA molecules per well in the first reaction. Parental molecules were thus quantified using ASOs all specific to either Col or Ler DNA.

CO detection and mapping of CO exchange points

CO molecules were amplified with ASOs specific to either Col or Ler on one side, and Ler or

Col respectively on the other side (Figure 1). Primers (sequence and genomic coordinates) are listed in Table S1. Pairs of oligonucleotides used for allele-specific PCR are listed in Table

S2.

66 For mapping CO exchange points, a series of aliquot reactions was carried out, which were predicted to contain an average of less than 0.2 CO molecules, so that more than 90 % of positive reactions issued from a single CO molecule. PCR products were then sequenced in order to locate exchange points from single CO molecules.

Characterization of NCO events at the 130x and the 14a1 hotspots

NCO molecules at polymorphism #44, #21 and #52 in the 130x hotspot were detected using a

PCR-based strategy adapted from (Baudat and de Massy, 2009). The outline of this approach is described in Figure 1. The recombinant molecules detected were then fully sequenced to map the breakpoint. Pairs of primers for these experiments are listed in Table S3.

For 14a1, a Col or Ler-specific 7.3 kb genomic DNA fragment was amplified by two rounds of allele specific PCR followed by one round of "universal" PCR with oligonucleotide pairs listed in (table S1 and Table S2). The PCR products were then digested with BglII and XbaI and ligated into pCRIITOPOblunt (Invitrogen) between the BamHI and XbaI unique sites, using standard procedures. The ligation products were then used to transform DH10B E. coli strain by electroporation. Transformed cells were spread onto LB agar plates containing 100

µg/ml carbenicillin, 0.2 mM IPTG and 40 µg/m X-gal. Following blue/white screening, individual colonies were transferred into 200 µl of LB medium containing 100 µg/ml carbenicillin in 1 ml MASTERBLOCK® microplates (Greiner Bio-One ref. 780215) and grown with gentle shaking at 37°C for 16 hours. Then, 100 µl of each cell culture was transferred to 96 well V-bottom microplates and spun down at 3200xg for 10 min. Cell pellets were resuspended in 100 µl of sterile water.

Bacterial clones were then genotyped using the “Chemicon Amplifluor SNPs Genotyping

System”. For this purpose, oligonucleotides either specific to each parent at polymorphism

67 #33, #35 and #37, or non-specific, were designed using the Amplifluor AssayArchitect software (https://apps.serologicals.com/AAA/) (see table S4 for oligonucleotides details).

Genotyping was then performed as described in (Ng et al., 2008). Plasmids supposed to contain a NCO event by genotyping were fully sequenced to map precisely the gene conversion event

Statistical analyses

The differences between distributions of CO in reciprocal orientations at a given hotspot were tested as follows: (i) CO breakpoints located on each side of the median position were grouped separately for the two reciprocal orientations (‘CtoL’ CL and ‘LtoC’ LC), thus providing four numbers COCLleft, COCLright, COLCleft, COLCright ; (ii) these numbers were grouped in a contingency table for testing the association between left/right and CL/LC classification using the two-tailed Fisher’s exact test.

The difference between ‘CtoLtoC’ and ‘LtoCtoL’ NCO rates at polymorphisms #35 and #37 in the 14a1 hotspot was tested using the one tailed Fisher’s exact test.

The parameters of the best fitting Gaussian distributions were calculated using an Excel macro (available on request), which computes the least sum of squared differences between observed and theoretical (Gaussian) integrated distributions over every interval between successive SNPs.

Estimated CO frequencies and associated confidence intervals have been analyzed as follows:

Repeated PCR experiments were performed on highly diluted pollen DNA samples collected on a F1 plant obtained from a cross between two homozygous parents carrying different alleles at two marker loci. Two primer pairs were used for PCR amplifications. The first pair was specific for molecules carrying the alleles of the first parent at both loci, and could thus amplify half of the non-recombinant molecules. The second primer pair was specific for

68 molecules carrying the first parental allele at the first locus and the second parental allele at the second locus, and could then amplify half of the recombinant molecules. It is considered that a PCR amplification gave a product if the template contained at least one molecule corresponding to the primer pair used. The same initial pollen DNA sample (unknown concentration C) was used as template for all experiments, but at different dilutions. Si series of experiments (indexed with i) were performed with the first primer pair, and Sj series

th (indexed with j) with the second primer pair. For the k series, a total of Nk PCR reactions were carried out using as template the initial DNA sample diluted at the rate . Let us note yk the number of reactions that did not produce any amplification product.

A Bayesian inference approach was used to infer the recombination rate between the two marker loci, as well as its 95% confidence intervals.

For the first primer pair, amplifying non-recombinant molecules, the probability of having no amplification in a given well follows a Poisson law and is: . The likelihood of the observed PCR results obtained with this primer pair follows a binomial law:

.

With the second primer pair, amplifying recombinant molecules, the probability of having no amplification is where r is the recombination rate between the two marker loci.

The likelihood of the observed results is then:

.

69 The joint likelihood of all observations (both primer pairs) is:

.

Because C and r are independent, the a posteriori probability of parameters C and r knowing

the observations is: where fC and fr are the prior density functions of C and r respectively.

The prior distributions for r and C were taken uniformly distributed in [0;0.1] and

respectively, which supposes that r<10%, and that at least 10% positive reactions are expected with the least concentrated sample.

The a posteriori distributions of r and C were numerically computed by a two-dimensional scan of the parameter space, and the 95% confidence intervals on r and C were determined from these distributions.

95% confidence intervals on NCO frequencies were computed based on the binomial law, by numerically adjusting the frequencies corresponding to distribution function values equal to

0.025 and 0.0975.

Acknowledgments

We thank all members of our laboratories for discussions, Mathilde Grelon, Eric Jenczewski,

Raphaël Mercier, Fabien Nogué, Andrew Lloyd for helpful comments on the manuscript,

Aurélie Bérard and Dominique Brunel for plasmid genotyping, Christine Dillmann and

Olivier Martin for advice on statistical analyses. The research leading to these results has received funding from the European Community's Seventh Framework Programme

FP7/2007-2013 under grant agreement number KBBE-2009-222883. This study was supported by a grant from Agence Nationale de la Recherche (grant ANR GNP05040G). The

70 funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Contributions

JD and CM conceived the experimental design. HK, JD, LG performed the experiments. MF performed the statistical analysis. JD, HK, LG, MF and CM analysed the results. CM wrote the manuscript.

Figure legends

Figure 1. Specific detection of CO and NCO molecules in genomic DNA extracted from pollen.

The F1 hybrid contains at each locus 1 allele of each parent. Filled circles represent polymorphisms on the C (Col, blue) or L (Ler, red) chromosomes. After meiosis, in DNA extracted from pollen, there are either non-recombinant molecules (C, L) or CO ('LtoC' or

'CtoL') or NCOs ('LtoCtoL' or 'CtoLtoC') in various proportions depending on the locus studied. To detect CO molecules, two rounds of allele specific PCR was performed with allele-specific oligonucleotides (ASOs, blue and black triangles) on pools of genomic DNA.

To detect NCO molecules, first alleles containing on each side polymorphisms of one specific parent were amplified by one or two rounds of allele specific PCRs. Then the region of interest was cloned in E.coli and interesting SNPs were genotyped using fluorescent mapping as described (Ng and Paigen, 200x). Alternatively, two complementary allele-specific PCRs were performed to analyze the status of interesting SNPs. A NCO was scored when both

PCRs were positive.

71 Figure 2. Specific detection of CO molecules in genomic DNA extracted from pollen.

PCR was performed with allele-specific oligonucleotides (ASOs) designed for the amplification of CO molecules (see Material and Methods, Figure 1), using decreasing amounts of genomic DNA (the number of template molecules is indicated on the photograph) extracted from F1 Col x Ler hybrid plants, either from leaf (two top rows) or pollen (two bottom rows). Eight aliquot reactions were carried out for each dilution. Very faint bands could be detected (16384 template molecules and more) for the high amounts of leaf DNA.

However, the PCR products were sequenced and shown not to be recombinant molecules. At lower amounts of leaf DNA no PCR products were amplified. Conversely, using equivalent low concentrations of DNA extracted from pollen (4096 molecules and less), a strong and specific amplification of CO molecules was obtained.

Figure 3. Distribution of CO rates across the 14a (A) and 130x (B) hotspots.

104 COs were analyzed at 14a (62 at 14a1 and 42 at 14a2) and 167 at 130x. Grey dotted curves show the underlying crossover distribution established by least-squares best fit analysis assuming that COs are normally distributed across each hotspot (see Materials and Methods).

Grey dotted vertical lines: median positions of the hotspots. Triangle: positions of insertions- deletions larger than 7 bp. Gene structures along the hotspot regions are displayed on top of graphical areas (light hashed grey: exons, dark grey: UTR, broken lines: introns; green marks: position of SNPs studied for NCOs, triangles are indels, filled circles are SNPs).

Figure 4. CO distribution and cumulative CO rates for reciprocal orientations at 14a (A) and 130x (B).

29 ‘CtoL’ and 33 ‘LtoC’ COs were analyzed at 14a1, 23 and 19 at 14a2, and 75 and 92 at

130x. The ‘LtoC’ distribution is presented in red and the ‘CtoL’ in blue. For all three hotspots

72 and both reciprocal orientations, cumulated (from left to right) relative CO rates were scored at successive polymorphic sites along the region. Blue curves: cumulated relative ‘CtoL’ CO rates. Red curves: cumulated relative ‘LtoC’ CO rates. Blue histogram: distribution of ‘CtoL’

CO rates. Red histogram: distribution of ‘LtoC’ CO rates. Grey dotted vertical lines: median positions of the hotspots.

Figure 5. NCO at 14a1.

The SNPs #33, #35 and #37 are indicated by filled green circles. The polymorphisms are indicated by filled black circles along the chromosome coordinate axes. Thick horizontal lines: converted SNPs, Thin horizontal lines: interval in which NCO tract ends are located.

Blue: Col; Red: Ler

Figure 6. NCO at 130x.

(A) SNP #44, (B) SNP #21, (C) SNP #52

Position of SNPs genotyped are indicated by green filled circles or triangles (indel). The polymorphisms are indicated by filled black circles along the chromosome coordinate axes.

Triangles: insertion or deletion above 7 nt. Thick horizontal lines: converted SNPs, Thin horizontal lines: interval in which NCO tract ends are located. Blue: Col; Red: Ler. Yellow line: chimeric SNPs

References

AGI. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815. Allers, T., and Lichten, M. (2001). Intermediates of yeast meiotic recombination contain heteroduplex DNA. Mol Cell 8, 225-231.

73 Arnheim, N., Calabrese, P., and Tiemann-Boege, I. (2007). Mammalian meiotic recombination hot spots. Annu Rev Genet 41, 369-399. Axelsson, E., Webster, M.T., Ratnakumar, A., Consortium, L., Ponting, C.P., and Lindblad-Toh, K. (2011). Death of PRDM9 coincides with stabilization of the recombination landscape in the dog genome. Genome Res. Baudat, F., and de Massy, B. (2007a). Cis- and Trans-Acting Elements Regulate the Mouse Psmb9 Meiotic Recombination Hotspot. PLoS Genet 3, e100. Baudat, F., and de Massy, B. (2007b). Regulating double-stranded DNA break repair towards crossover or non-crossover during mammalian meiosis. Chromosome Res 15, 565-577. Baudat, F., and de Massy, B. (2009). Parallel detection of crossovers and noncrossovers in mouse germ cells. In Methods Mol Biol, pp. 305-322. Baudat, F., Buard, J., Grey, C., Fledel-Alon, A., Ober, C., Przeworski, M., Coop, G., and de Massy, B. (2010). PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science 327, 836-840. Bergerat, A., de Massy, B., Gadelle, D., Varoutas, P.C., Nicolas, A., and Forterre, P. (1997). An atypical topoisomerase II from Archaea with implications for meiotic recombination. Nature 386, 414-417. Bois, P.R. (2007). A highly polymorphic meiotic recombination mouse hot spot exhibits incomplete repair. Mol Cell Biol 27, 7053-7062. Borde, V., Robine, N., Lin, W., Bonfils, S., Geli, V., and Nicolas, A. (2009). Histone H3 lysine 4 trimethylation marks meiotic recombination initiation sites. Embo J 28, 99- 111. Buard, J., Barthes, P., Grey, C., and de Massy, B. (2009). Distinct histone modifications define initiation and repair of meiotic recombination in the mouse. Embo J 28, 2616- 2624. Buhler, C., Borde, V., and Lichten, M. (2007). Mapping Meiotic Single-Strand DNA Reveals a New Landscape of DNA Double-Strand Breaks in Saccharomyces cerevisiae. PLoS Biol 5, e324. Chelysheva, L., Gendrot, G., Vezon, D., Doutriaux, M.P., Mercier, R., and Grelon, M. (2007). Zip4/Spo22 is required for class I CO formation but not for synapsis completion in Arabidopsis thaliana. PLoS Genet 3, e83. Chelysheva, L., Diallo, S., Vezon, D., Gendrot, G., Vrielynck, N., Belcram, K., Rocques, N., Marquez-Lema, A., Bhatt, A.M., Horlow, C., Mercier, R., Mezard, C., and Grelon, M. (2005). AtREC8 and AtSCC3 are essential to the monopolar orientation of the kinetochores during meiosis. J Cell Sci 118, 4621-4632. Cole, F., and Jasin, M. (2011). Isolation of meiotic recombinants from mouse sperm. Methods Mol Biol 745, 251-282. Cole, F., Keeney, S., and Jasin, M. (2010). Comprehensive, Fine-Scale Dissection of Homologous Recombination Outcomes at a Hot Spot in Mouse Meiosis. Mol Cell 39, 700-710. de Massy, B. (2003). Distribution of meiotic recombination sites. Trends Genet 19, 514-522. de Muyt, A.D., Mercier, R., Mezard, C., and Grelon, M. (2009). Meiotic recombination and crossovers in plants. In Meiosis, pp. 14-25. Drouaud, J., and Mezard, C. (2011). Characterization of meiotic crossovers in pollen from Arabidopsis thaliana. Methods Mol Biol 745, 223-249. Drouaud, J., Camilleri, C., Bourguignon, P.Y., Canaguier, A., Berard, A., Vezon, D., Giancola, S., Brunel, D., Colot, V., Prum, B., Quesneville, H., and Mezard, C. (2006). Variation in crossing-over rates across chromosome 4 of Arabidopsis thaliana reveals the presence of meiotic recombination "hot spots". Genome Res 16, 106-114.

74 Giraut, L., Falque, M., Drouaud, J., Pereira, L., Martin, O.C., and Mezard, C. (2011). Genome-Wide Crossover Distribution in Arabidopsis thaliana Meiosis Reveals Sex- Specific Patterns along Chromosomes. PLoS Genet 7, e1002354. Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J.D., Jacq, C., Johnston, M., Louis, E.J., Mewes, H.W., Murakami, Y., Philippsen, P., Tettelin, H., and Oliver, S.G. (1996). Life with 6000 genes. Science 274, 546, 563-547. Holloway, K., Lawson, V.E., and Jeffreys, A.J. (2006). Allelic recombination and de novo deletions in sperm in the human {beta}-globin gene region. Hum Mol Genet 15, 1099- 1111. Hunter, N., and Kleckner, N. (2001). The single-end invasion: an asymmetric intermediate at the double-strand break to double-holliday junction transition of meiotic recombination. Cell 106, 59-70. Jeffreys, A.J., and Neumann, R. (2002). Reciprocal crossover asymmetry and meiotic drive in a human recombination hot spot. Nat Genet 31, 267-271. Jeffreys, A.J., and May, C.A. (2004). Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nat Genet 36, 151-156. Jeffreys, A.J., and Neumann, R. (2005). Factors influencing recombination frequency and distribution in a human meiotic crossover hotspot. Hum Mol Genet 14, 2277-2287. Jeffreys, A.J., Neumann, R., and Wilson, V. (1990). Repeat unit sequence variation in minisatellites: a novel source of DNA polymorphism for studying variation and mutation by single molecule analysis. Cell 60, 473-485. Jeffreys, A.J., Murray, J., and Neumann, R. (1998). High-resolution mapping of crossovers in human sperm defines a minisatellite-associated recombination hotspot. Mol Cell 2, 267-273. Jeffreys, A.J., Kauppi, L., and Neumann, R. (2001). Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet 29, 217-222. Kauppi, L., Jeffreys, A.J., and Keeney, S. (2004). Where the crossovers are: recombination distributions in mammals. Nat Rev Genet 5, 413-424. Kauppi, L., May, C.A., and Jeffreys, A.J. (2009). Analysis of meiotic recombination products from human sperm. In Methods Mol Biol, pp. 323-355. Keeney, S., Giroux, C.N., and Kleckner, N. (1997). Meiosis-specific DNA double-strand breaks are catalyzed by Spo11, a member of a widely conserved protein family. Cell 88, 375-384. Libeau, P., Durandet, M., Granier, F., Marquis, C., Berthome, R., Renou, J.P., Taconnat-Soubirou, L., and Horlow, C. (2011). Gene expression profiling of Arabidopsis meiocytes. Plant Biol (Stuttg) 13, 784-793. Mancera, E., Bourgon, R., Brozzi, A., Huber, W., and Steinmetz, L.M. (2008). High- resolution mapping of meiotic crossovers and non-crossovers in yeast. Nature 454, 479-485. Martinez-Perez, E., and Colaiacovo, M.P. (2009). Distribution of meiotic recombination events: talking to your neighbors. Curr Opin Genet Dev. Munoz-Fuentes, V., Di Rienzo, A., and Vila, C. (2011). Prdm9, a major determinant of meiotic recombination hotspots, is not functional in dogs and their wild relatives, wolves and coyotes. PLoS One 6, e25498. Myers, S., Bottolo, L., Freeman, C., McVean, G., and Donnelly, P. (2005). A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321- 324.

75 Myers, S., Freeman, C., Auton, A., Donnelly, P., and McVean, G. (2008). A common sequence motif associated with recombination hot spots and genome instability in humans. Nat Genet 40, 1124-1129. Myers, S., Bowden, R., Tumian, A., Bontrop, R.E., Freeman, C., MacFie, T.S., McVean, G., and Donnelly, P. (2010). Drive against hotspot motifs in primates implicates the PRDM9 gene in meiotic recombination. Science 327, 876-879. Ng, S.H., Parvanov, E., Petkov, P.M., and Paigen, K. (2008). A quantitative assay for crossover and noncrossover molecular events at individual recombination hotspots in both male and female gametes. Genomics 92, 204-209. Noma, K., and Grewal, S.I. (2002). Histone H3 lysine 4 methylation is mediated by Set1 and promotes maintenance of active chromatin states in fission yeast. Proc Natl Acad Sci U S A 99 Suppl 4, 16438-16445. Pan, J., Sasaki, M., Kniewel, H., Blitzblau, H.G., Tischfield, S.E., Zhu, X., Neale, M.J., Jasin, M., Socci, N.D., Hochwagen, A., and Keeney, S. (2011). A hierarchical combination of factors shapes the genome-wide topography of yeast meiotic recombination initiation. Cell 144, 719-731. Ponting, C.P. (2011). What are the genomic drivers of the rapid evolution of PRDM9? Trends Genet 27, 165-171. Pontvianne, F., Blevins, T., and Pikaard, C.S. (2010). Arabidopsis Histone Lysine Methyltransferases. Adv Bot Res 53, 1-22. Sanchez-Moran, E., Armstrong, S.J., Santos, J.L., Franklin, F.C., and Jones, G.H. (2002). Variation in chiasma frequency among eight accessions of Arabidopsis thaliana. Genetics 162, 1415-1422. Smagulova, F., Gregoretti, I.V., Brick, K., Khil, P., Camerini-Otero, R.D., and Petukhova, G.V. (2011). Genome-wide analysis reveals novel molecular features of mouse recombination hotspots. Nature 472, 375-378. Szostak, J.W., Orr-Weaver, T.L., Rothstein, R.J., and Stahl, F.W. (1983). The double- strand-break repair model for recombination. Cell 33, 25-35. Thorstensen, T., Grini, P.E., and Aalen, R.B. (2011). SET domain proteins in plant development. Biochim Biophys Acta 1809, 407-420. Vignard, J., Siwiec, T., Chelysheva, L., Vrielynck, N., Gonord, F., Armstrong, S.J., Schlogelhofer, P., and Mercier, R. (2007). The interplay of RecA-related proteins and the MND1-HOP2 complex during meiosis in Arabidopsis thaliana. PLoS Genet 3, 1894-1906. Wall, J.D. (2004). Close look at gene conversion hot spots. Nat Genet 36, 114-115. Weigel, D., and Glazebrook, J. (2002). Arabidopsis: A Laboratory Manual. (Cold Spring Harbor: Cold Spring Harbor Laboratory Press). Wu, Z.K., Getun, I.V., and Bois, P.R. (2010). Anatomy of mouse recombination hot spots. Nucleic Acids Res. Yandeau-Nelson, M.D., Zhou, Q., Yao, H., Xu, X., Nikolau, B.J., and Schnable, P.S. (2005). MuDR transposase increases the frequency of meiotic crossovers in the vicinity of a Mu insertion in the maize a1 gene. Genetics 169, 917-929. Yao, H., and Schnable, P.S. (2005). Cis-effects on meiotic recombination across distinct a1- sh2 intervals in a common Zea genetic background. Genetics 170, 1929-1944. Yao, H., Zhou, Q., Li, J., Smith, H., Yandeau, M., Nikolau, B.J., and Schnable, P.S. (2002). Molecular characterization of meiotic recombination across the 140-kb multigenic a1-sh2 interval of maize. Proc Natl Acad Sci U S A 99, 6157-6162.

76

Abreviations

CO: crossover; NCO: non reciprocal recombination event also called gene conversion not associated with crossovers; DSB: double strand break; H3K4: histone H3 lysine 4; Col:

“Columbia-O”, Ler: “Landsberg erecta-4”.

Supplemental tables will be provided in Annexes of the thesis

77 C! Hybrid! L!

meiosis!

C ! L! CO: LtoC! Pollen! CO: CtoL! NCO: LtoCtoL! NCO: CtoLtoC!

CO assay! NCO assays!

1st PCR! Cloning strategy! 1st PCR! PCR strategy! 1st PCR! 2nd PCR! 2nd PCR!

2nds PCRs! DNA sequencing! DNA cloning! AND!

DNA sequencing!

SNPs fluorescent mapping!

DNA sequencing of! Figure 1! positive clones !

Figure 1. Specific detection of CO and NCO molecules in genomic DNA extracted from pollen.! The F1 hybrid contains at each locus 1 allele of each parent. Filled circles represent polymorphisms on the C (Col, blue) or L (Ler, red) chromosomes. After meiosis, in DNA extracted from pollen, there are either non-recombinant molecules (C, L) or CO ('LtoC' or 'CtoL') or NCOs ('LtoCtoL' or 'CtoLtoC') in various proportions depending on the locus studied. To detect CO molecules, two rounds of allele specific PCR was performed with allele-specific oligonucleotides (ASOs, blue and black triangles) on pools of genomic DNA. To detect NCO molecules, first alleles containing on each side polymorphisms of one specific parent were amplified by one or two rounds of allele specific PCRs. Then the region of interest was cloned in E.coli and interesting SNPs were genotyped using fluorescent mapping as described (Ng and Paigen, 200x). Alternatively, two complementary allele-specific PCRs were performed to analyze the status of interesting SNPs. A NCO was scored when both PCRs were positive. ! kb! Leaf DNA!

32768! 16384! 8192! genomes!

4096! 2048! 1024!

Pollen DNA!

4096! 2048! 1024! genomes!

512! 256! 128!

Figure 2. Specific detection of CO molecules in genomic DNA extracted from pollen.! PCR was performed with allele-specific oligonucleotides (ASOs) designed for the amplification of CO molecules (see Material and Methods, Figure 1), using decreasing amounts of genomic DNA (the number of template molecules is indicated on the photograph) extracted from F1 Col x Ler hybrid plants, either from leaf (two top rows) or pollen (two bottom rows). Eight aliquot reactions were carried out for each dilution. Very faint bands could be detected (16384 template molecules and more) for the high amounts of leaf DNA. However, the PCR products were sequenced and shown not to be recombinant molecules. At lower amounts of leaf DNA no PCR products were amplified. Conversely, using equivalent low concentrations of DNA extracted from pollen (4096 molecules and less), a strong and specific amplification of CO molecules was obtained.! A!

AT4G35070! AT4G35080! 400! 350!

300! 14a1! 14a2! 250!

200!

150!

CO rateCO ! (cM/Mb) 100!

50!

0!

chromosome coordinate (kb)! B!

AT4G00390! AT4G00400! AT4G00416! AT4G00420! 100! 130x! 75!

50! CO rateCO ! (cM/Mb)

25!

0! 172! 174! 176! 178! 180! 182! chromosome coordinate (kb)!

Figure 3. Distribution of CO rates across the 14a (A) and 130x (B) hotspots.! 104 COs were analyzed at 14a (62 at 14a1 and 42 at 14a2) and 167 at 130x. Grey dotted curves show the underlying crossover distribution established by least-squares best fit analysis assuming that COs are normally distributed across each hotspot (see Materials and Methods).! Grey dotted vertical lines: median positions of the hotspots. Triangle: positions of insertions-deletions larger than 7 bp. Gene structures along the hotspot regions are displayed on top of graphical areas (light hashed grey: exons, dark grey: UTR, broken lines: introns; green marks: position of SNPs studied for NCOs, triangles are indels, filled circles are SNPs).! Grey dotted vertical lines: median positions of the hotspots. positions median lines: Grey vertical dotted ʼ CO rates. of LtoC ʻ distribution ʼ CO histogram: rates. of CtoL Red ʻ distribution CO histogram: rates. Blue ʼ LtoC ʻ relative CO curves: cumulated ʼ rates. CtoL Red ʻ relative curves: cumulated Blue the region. along sites polymorphic at successive scored CO (from rates were left to relative right) cumulated orientations, reciprocal both and hotspots three For all blue. in ʼ CtoL the ʻ and red in is presented ʼ distribution LtoC ʻ at 92 130x. and 75 and at 19 14a2, and 23 at 14a1, analyzed COs ʼ were LtoC ʻ 33 ʼ and CtoL ʻ 29 The (B).! Figure 4. CO distribution andfor orientations reciprocal (A) andcumulative CO rates at 14a 130x

CO rate (cM/Mb)! CO rate (cM/Mb)! 150 300 450 600 125 100 25 50 75 0 0 A B ! ! 14a1 130x 174 ʻCtoLʼ ʻLtoCʼ ʻCtoLʼ ʻLtoCʼ 175 chromosome coordinatechromosome (kb)! chromosome coordinatechromosome (kb)! 176 177 178 179 180 181 14a2 182 100 0 20 40 60 80 0 100 25 50 75

cumultaive CO (%)! cumulative CO (%)! 400! 350!

300! 14a1! 250!

200!

150!

CO rateCO ! (cM/Mb) 100!

50!

0! 33! 35! 37! kb!

none!

Figure 5. NCO at 14a1.! The SNPs #33, #35 and #37 are indicated by filled green circles. The polymorphisms are indicated by filled black circles along the chromosome coordinate axes. Thick horizontal lines: converted SNPs, Thin horizontal lines: interval in which NCO tract ends are located. Blue: Col; Red: Ler! A! #44 !

177.3! 177.7! 178.1! 178.5! 178.9!

B! #21!

173.8 ! 174.3 ! 174.8 ! 175.3 ! 175.8 ! 176.3 ! 176.8 ! 177.3 kb!

C! #52!

179 ! 179.5 ! 178 ! 180.5 ! 181! 181.5 ! 182 ! 182.5 kb!

Figure 6. NCO at 130x.! (A) SNP #44, (B) SNP #21, (C) SNP #52! Position of SNPs genotyped are indicated by green filled circles or triangles (indel). The polymorphisms are indicated by filled black circles along the chromosome coordinate axes. Triangles: insertion or deletion above 7 nt. Thick horizontal lines: converted SNPs, Thin horizontal lines: interval in which NCO tract ends are located. Blue: Col; Red: Ler. Yellow line: chimeric SNPs!

2.3. Conclusion

We confirmed for the first time the presence of hotspot of meiotic recombination in Arabidopsis thaliana. A high-resolution analysis showed that not only 14a region has a high rate of CO (0.85%) but it revealed the presence of two close hotspots: 14a1 and 14a2. These two hotspots were very similar as those previously described in other organisms. They both had a typical Gaussian shape with a highest rate of CO in the centre and a gradually decrease of CO on each side. Their lengths were also similar to those of most mammals and yeast hotspots. Interestingly, 130x hotspot had quite different characteristics. It exhibits a non- Gaussian distribution of CO events: it is very large (8kb in length) suggesting a cluster of at least four or five hotspots. But it also has a very high rate of CO (0.7%).

Not only 14a and 130x were hotspots of CO events, they were hotspots of NCOs. The tract length of 14a1 NCO hotspots was 276 bp in average and never included co-conversion of other SNPs. CO/NCO rate was around 1 to 1. In contrast, 130x, had longer NCO tract lengths (from 492 to 1038 pb) with bilateral co-conversion for most of SNPs tested. The ratio CO/NCO was around 30 to 1.

I also detected a bias of recombination initiation in both 14a1 and 14a2 hotspots. Arabidopsis is self-fertile and this bias of recombination would have a limited effect on genome evolution.

Finally, I would like to emphasize that although meiotic recombination promotes genetic diversity, it can also be sometimes hazardous for organism’s life. I mentioned in the introduction a list of human genetic diseases due to improper meiotic recombination. In this work in Arabidopsis, I found that in the samples of pollen genomic DNA characterized, meiotic recombination created new gene sequences that potentially could encode proteins with a sequence different from each parent both at 14a and 130x. I have not yet tested the function of these recombinant proteins as the recombinant genes obtained were only found in

85 pollen DNA. But as we found these recombinant proteins at the three hotspots studied, we can hypothesize that the conception of a new protein by meiotic recombination is not uncommon and can lead to major change in the life of a plant, benefit, silent or deleterious.

86

CHAPTER 3: ROLE OF MSH4 IN CO AND NCO FORMATION AT 14A HOTSPOT IN ARABIDOPSIS THALIANA

87 cM/Mb 450 400 350 300 250 "normal" wild-type 200 msh4-/- 150 100 50 0 16692000 16693000 16694000 16695000 16696000 16697000 16698000 16699000 16700000

chromosome coordinates (bp)

cM/Mb

25

20

15 msh4-/- 10

5

0 16692000 16693000 16694000 16695000 16696000 16697000 16698000 16699000 16700000

chromosome coordinates (bp)

Figure 3-1 : CO distribution at 14a in "normal" wild type (upper panel) and atmsh4-/- (upper and lower panel). 108 and 48 COs were analyzed in "normal" wild-type (see Chapter 2) and atmsh4-/- respectively. Due to the difference in CO rates the distribution of CO is hardly visible in the upper panel.

In A. thaliana, on the basis of chiasma counts, it is assumed that 85% of COs belongs to the interference dependent pathway (class I), while the remaining 15% are interference-free (class II) (reviewed in (de Muyt et al., 2009b)). To test the contribution of both CO pathways, we analyzed their distribution at the 14a hotspot in an Atmsh4 mutant background in which interfering COs are absent. We set up a F1 hybrid between Col (SALK_136296) and Ler (CSHL_GT14269) each being hemizygous for a T-DNA insertion in the AtMSH4 gene. Both lines and the F1 have been analyzed and have been shown to have a null msh4 phenotype when the insertion is homozygous (Higgins et al., 2004; Laurène Giraut, personal communication). The descendants of the F1 plants were genotyped to isolate each population: one fourth of sterile plants (msh4-/-), one fourth of wild type plants (MSH4 +/+), one half of hemizygous plants (MSH4 +/-). We then extracted genomic DNA from their pollen and performed pollen-typing PCR as described in Chapter 2. For the populations that contain at least one wild-type AtMSH4 allele (MSH4+/+ and MSH4+/-), the CO rate obtained at 130x was very similar to the CO rate obtained in the "normal" wild-type ColxLer studied in chapter 2 (0.78% with confidence intervals (CI) 0.55-1.1 and 0.85% with CI 0.61-1.21 respectively). Thus we pooled the genomic DNA preparations for further experiments and thereafter we will refer to this pool of genomic DNA as "wild-type MSH4".

Thus we undertook the study in the pollen genomic DNA extracted from the Atms4 -/- plants at the 14a hotspot. We observed a lower rate of CO: 0.1 % (CI: 0.06-0.15) thus 12.5 % of the "normal" wild type CO rate (Figure 3-1).

We then analyzed the CO distribution in Atmsh4-/-. At 14a1, the CO rate was dramatically reduced (6.4%) whereas it was less pronounced at 14a2 (18.7%) (Figure 3-1). Only 31% of the CO recovered lied in 14a1 whereas they constituted 55% of the total numbers of COs in the "normal" wild type (p-value 10-4). We found intriguing that we could not detect any significant differences between the "ColtoLer" or "LertoCol" distributions (Figure 3-2).

Next, we tested the NCO frequency in the Atmsh4 mutant background. We genotyped the SNP#35 in the center of 14a1. Among 11896 colonies tested (half containing the Col allele and the other half the Ler allele), we detected and sequenced 41 NCO events. Thus, at this marker the NCO rate is comparable to those obtained in the "normal" wild type (1/290 and 1/381 respectively). Like in normal wild type, all the events recovered changed only one

88 positions ofthe hotspots. rates.CO Red histogram: distribution of‘LtoC’ rates. CO Greydotted vertical lines: median were analyzed at 14a1, 9and 25at 14a2 in were analyzed at 14a1, 23and 19at 14a2 in “normal wild-type” .7‘CtoL’ and 6 ‘LtoC’ COs Figure for 3-2.COdistribution reciprocal orientations at14a.

CO rate (cM/Mb)! CO rate (cM/Mb)! 150 300 450 600 16692000 20 25 10 15 0 0 5

Atmsh4-/- Atmsh4-/- "normal" "normal" wild-type 14a1

16693000 ʻCtoLʼ ʻLtoCʼ

16694000

16695000 atmsh4-/-

16696000 chromosome coordinatechromosome (bp)! chromosome coordinatechromosome (kb)! .Blue histogram: distribution of‘CtoL’

16697000

16698000 29 ‘CtoL’ and 33‘LtoC’ COs

16699000 14a2

16700000

CtoL LtoC

SNP (Chapter 2 and Figure 3-3), thus the average length of the conversion tract was similar (296 bp). Surprisingly, we recovered an equal proportion of NCO in both directions "LerColLer" and "ColLerCol" contrary to "normal" wild-type where we had 3 times more "LerColLer" events compared to "ColLerCol" (see chapter 2). This was an unexpected result, as MSH4 is not supposed either to play a role in the choice of the chromosome to be broken or in the way to mature NCO recombination intermediates.

This latter result prompted us to re-examine our "wild-type MSH4" population compared to the "normal" wild-type population. Both have the same CO rates but do they have the same CO distribution?

Thus we isolated by pollen-typing and sequenced more than 100 COs (54 in each direction "CtoL" and "LtoC") in the "wild-type MSH4" and analyze their distribution. The two distributions ("normal" wild-type and "wild-type MSH4") were quite similar. First the proportion of COs in 14a1 and 14a2 was almost the same (Figure 3-4) but there was not such a huge difference in the "hotness" of the two hotspots. Instead of having at 14a1, a peak almost three times higher than at 14a2 as in "normal" wild-type", in "wild-type MSH4" the peaks were comparable at both hotspots. Then we analyzed the two distributions “ColtoLer” and “LertoCol”. Surprisingly, in "wild-type MSH4", we could not anymore see asymmetry between CO distribution in reciprocal orientations “Col to Ler” and “Ler to Col” (Figure 3-4). We thus concluded that the two ColxLer F1, "normal" wild-type and "wildtype MSH4", even if they had similar CO rates at 14a, had a different behaviour for CO formation at 14a hotspot.

Thus we could not compare the results obtained in atmsh4-/- directly to the "normal" wild-type but we had to confront them to the results obtained in "wild-type MSH4". For global CO rates and CO rates at 14a1 and 14a2, our conclusions still hold because CO rates were comparable between "wild-type MSH4" and "normal" wild-type: atmsh4-/- CO rates at 14a is 12.8% of the "wild-type MSH4" CO rates and the decrease is more pronounced at 14a1 (6.6%) than at 14a2 (22.7%). However, as there was no significant difference between the "ColtoLer" or "LertoCol" distributions in "wild-type MSH4", the lack of visible differences in atmsh4-/- was not anymore intriguing.

Now, we have to look at NCO frequencies in "wildtype MSH4" at 14a. The expected result is that the NCO rate will be similar to the one found in "normal" wild-type and in Atmsh4-/- but without bias in the conversion between both alleles.

89 cM/Mb

25

20

15 msh4-/- 10

5 33!35! 37! 0 kb!

20 NCOs 21 NCOs

Figure 3-3. NCO at 14a1 in atmsh4-/- 41 NCOs have been recovered at SNPs #35. The polymorphisms are indicated by filled black circles along the chromosome coordinate axes. Thick horizontal lines: converted SNPs, Thin horizontal lines: interval in which NCO tract ends are located. Blue: Col; Red: Ler In parallel, we have re-examined CO and NCO rates at 130x in Atmsh4-/-. As we did not have yet the differences between the two ColxLer F1, some of the results we got could not be fully compared with the "wildtype MSH4". Nevertheless, we saw that (i) CO rates in Atmsh4-/- were also dramatically reduced, 5% of "normal" wild-type and "wildtype MSH4". The distribution looked different to the one observed in "normal" wild-type but we are currently doing the distribution in "wildtype MSH4" to get the appropriate comparison. We are also on the process of looking at NCO rates.

In conclusion to this part, we observed that our two wild type strains do not have the same behavior regarding the 14a hotspot. Although CO rates are similar, the distribution is different: the bias of initiation exists in one and not in the other one. This is puzzling as the sequence of the 8 kb region is totally identical in the two Col and two Ler accessions used. Thus the difference observed is not due to the hotspot sequence itself. We have to hypothesize that there is a factor genetic or epigenetic that controls the behavior of the 14a hotspot but independently of the sequence within the hotspot. Could it be a different epigenetic mark? A polymorphism adjacent to the hotspot? A different allele of a protein that control the DSB formation?

We do not know yet how to get insight into the mechanisms that control the differences but we hope to be able to do it in the future.

Regarding the role of MSH4 in CO formation, we observed as expected a dramatic decrease of COs when MSH4 is absent. What is more surprising is that it seems to act at different level on different hotspots: 5 to 7% of COs are independent of MSH4 at 14a1 and 130x whereas more than 20% can occur without MSH4 at 14a2. Thus it could be that at some loci, recombination intermediates are more prone to be processed by MSH4 that at others loci.

Regarding NCOs, our data are not sufficient yet to discuss the role or the absence of role of MSH4 in NCO formation.

90 cM/Mb

450 400 350 300

250 "normal" wild-type 200 Wild-type "MSH4" 150 100 50 0 16692 16693 16694 16695 16696 16697 16698 16699 16700 kb cM/Mb

160

140

120

100 LxC 80 CxL 60

40

20

0 16692 16693 16694 16695 16696 16697 16698 16699 16700 kb

Figure 3-4: distribution of COs at 14a in "normal" wild type and wild-type "MSH4" (upper panel). 'CtoL and 'LtoC' crossovers at 14a in "wild-type MSH4" (lower panel). Median of hotspots are presented with dotted lines. cM/Mb 40 Col x Cvi 35 Col x Ws Col x Sha 30 Col x Ler

25 14a 20 § § 15 # § + + § 10 § + § 5

0 0 2 4 6 8 10 12 14 16 18 Mb

Figure 4-1: Genetic map of chromosome 4 in 4 different crosses.

: Intervals with significantly different CO rates between Col x Ws and Col x Ler (Bonferroni 0.05). : Intervals with significantly different CO rates between Col x Ler and Col x Sha (Bonferroni 0.05). Triangle : Intervals with significantly different CO rates between Col x Ler and Col x Cvi (Bonferroni 0.05). # : Intervals with significantly different CO rates between Col x Sha and Col x Cvi (Bonferroni 0.05). + : Intervals with significantly different CO rates between Col x Sha and Col x Ws (Bonferroni 0.05). § : Intervals with significantly different CO rates between Col x Cvi and Col x Ws (Bonferroni 0.05).

CHAPTER 4: CHARACTERIZATION OF THE 14A HOTSPOT IN VARIOUS GENETIC BACKGROUNDS

91 Accessions Versailles CRB identification number Country Columbia-0 (Col-0) 186 AV Poland Landsberg erecta (Ler-1) 213 AV Poland Wassilewskija (WS-4) 530 AV Belarus Cvi-0 166 AV Cape Verde Islands Shahdara 236 AV Tadjikistan Jea 25 AV France Ita-0 157 AV Morocco Ct-1 162 AV Italy Bur-0 172 AV Ireland Blh-1 180 AV Czechoslovakia Oy-o 224 AV Norway Mt-0 94 AV Libya Gre-0 200 AV USA Sakata 257 AV Japon Yo-0 250 AV USA Hiroshima 254 AV Japon Ri-0 160 AV Canada Pa-1 50 AV Italy Per-1 100 AV Russia Bij 355 AV Russia Mah-1 323 AV Nepal Pyl-1 8 AV France Bl-1 42 AV Italy Tsu-0 91 AV Japan Stw-0 92 AV Russia Akita 252 AV Japan N13 266 AV Russia

Table 4-1: List of Arabidopsis thaliana accessions, sequenced for allele specific primers in the 14a region. In red are presented the accessions crossed with Col (maternal parent) for which flowers have been collected to extract pollen DNA and study the region 14a by pollen typing.

When I started my PhD, it was known that meiotic recombination activity of hotspots varied between individuals in humans or strains in mice or yeasts (examples in (Jeffreys et al., 2000; Baudat and de Massy, 2007b; Jeffreys and Neumann, 2009; Cole et al., 2010). Since then, part of these variations has been shown to be linked to the polymorphisms of the DNA binding domain of the PRDM9 protein (see Introduction and Chapter V) (Baudat et al., 2010) (Berg et al., 2010; Berg et al., 2011).

During the course of my PhD, I wanted to test if there was difference in the recombination activity of a hotspot in Arabidopsis thaliana depending on genetic backgrounds and if I could link these variations to some DNA features. In fact, I had the proper material to test the first question: a well-defined hotspot and the availability of a huge collection of Arabidopsis thaliana accessions (more than 400) in the Institute (http://www- ijpb.versailles.inra.fr/fr/cra/cra_accueil.htm). I also had several F1 hybrids already available in the lab: ColxSha (Shadahra, 236AV), ColxCvi (Cape Verde Islands, 166AV), ColxWs-4 (Wassilewskija, 530AV). For these three F1, the genetic map of chromosome 4 had been obtained and compared to the one of the ColxLer F1 (Christine Mézard, personal communication). The genetic length of chromosome 4 differed significantly between the different backgrounds. ColxSha and ColxWs had similar short maps, 75.3 cM and and 74.7 cM, respectively, whereas it reached 94.7 cM in ColxCvi and 85.6 cM for Colx Ler. CO distribution also differed between crosses but to a different extent (Figure 4-1). In Ws, there were two regions devoid of COs. Both may correspond to local rearrangement like inversions that preclude the occurrence of CO in these regions. These two rearrangements specific to Ws were not described previously even though the accessions used in this study have been thoroughly studied. Apart from the two putative inversions, CO rates differ significantly between the crosses for several intervals both on the long and short arm. Thus, CO rates differ between crosses implying different genetic backgrounds, but the distributions are rather similar.

Therefore, I wanted first to test the presence of 14a hotspot in these three crosses and I started to collect pollen from all these F1 and I sequenced the DNA regions corresponding to 14a in the three accessions. As shown in Table 4-2, only the ColxWs had 3 DNA sequences that could be properly used to anchor allele specific primers (1 on the left side and two on the right side of the hotspot). I spent time to anchor and test two others allele specific primers on

92 Figure 4-2: Geographical position of Arabidopsis thaliana accessions selected

Specific Primer n° 8 n° 15 n° 9 n° 23 n° 27 n° 54 n° 63 Accessions

Ler + + + +

Ws-4 + + - - + + +

Pyl-1 + - ? + + + +

Bur-0 + - ? + + + +

N13 + - - - + + +

Table 4-2: List of allele specific primers in the 14a region Ler-allele specific primers are shown in red and primers for other accessions are shown in yellow. Primers n° 8, 15, 9, 23 and 27 correspond to left side, and primers n° 54 and 63 correspond to right side of the 14a region. the left side (Table 4-2). As this is the main limitation to use the pollen-typing method, I decided not to spend time developing new allele specific primers in the ColxSha and the ColxCvi crosses but use a different approach to find other F1 that could be used "easily". I selected several Arabidopsis accessions from different geographical regions (Figure 4-2) and sequenced only the DNA regions where allele-specific primers were anchored in ColxLer and ColxWs. Among 22 accessions, I only find three where at least some of these DNA regions where conserved: Bur-0, Pyl-1 and N13 (Figure 4-1 and Table 4-1). Thus I started to cross the accessions to obtain the F1, saw the F1 seeds and collect the primers. To complete this part of the analysis, I also sequenced the DNA regions used to anchor allele specific primers at the 130x hotspot. The study of 130x in these F1 will need more work to develop new allele specific primers.

During the time of my PhD, I only had time to collect and analyze the pollen from the F1 ColxWs. However, I collected enough pollen of the other crosses (ColxBur-0, ColxPyl-1, ColxN13) to be analyzed by pollen-typing technique.

Rate and Distribution of COs in the F1 ColxWs We noticed that the DNA sequence is fully conserved in the centre of the two hotspots between ColxLer(s) and ColxWs. However between the hotspots 14a1 and 14a2 and on each side there are many different SNPs.

In the ColxWs F1, we found that (i) the hotspot was conserved (ii) the CO rate was found to be slightly lower (0.49% (CI: 0.33-0.73)) but in the same range than the two ColxLer F1s studied previously (0.85 % and 0.78%; see chapter 3). The two hotspots (14a1 and 14a2) were still present. The only difference was that they exhibit the same activity instead of having 14a1 "hotter" than 14a2 as in the two ColxLers (Figure 4-3).

Then we isolated 87 COs (54 Col to Ws and 33 Ws to Col) to analyze their distribution. When we examined the distribution in both directions, we observed that at 14a1, the “ColtoWs” CO distribution peaked on the left side than the “WstoCol” distribution. Strikingly, the highest peak for the “ColtoWs” contains 11 CO events whereas none was observed in the same interval in “WstoCol” (Figure 4-3). The statistical analysis of the comparison of the two distributions gave a p-value of 0.051, thus limit to but no significant.

93 cM/Mb

ColxWs

ColxWs ColxLer wild-type "MSH4"

ColxWs ColxLer "normal" wild-type

Cumulave CO (%) 100 87,5 75 62,5 50 ColtoWs WstoCol 37,5 25 12,5 0

Chromosome coordinates (bp) Figure 4-3: Distribution of COs in ColxWs (upper panel). Comparison with ColxLer (middle panels). ColtoWs and WstoCol distribution (lower panel). For a better visualization the cumulative frequencies of Col and Ws alleles have been calculated separately at 14a1 and 14a2. We are currently isolating additional COs to confirm or infirm the existence of a bias in CO distribution.

We are also looking at NCOs frequency and tract length. The SNP #35 is going to be particularly interesting as it lies in between two high peaks of CO in Col to Ws and a contrario in between a cold (0 CO) and a high peak in Ws to Col.

For other F1, ColxPyl-1, ColxBur-0 and ColxN13 the flowers have been collected and the analysis will be performed in the next weeks.

94 Figure 5-1. Metaphase 1 and telophase II from wild-type (a, b) and mcc1 (c, d) by DAPI staining in Arabidopsis thaliana (2n = 10). (a) 5 bivalents (b) four balanced nuclei (c ) 4 bivalents and two univalents (arrows). Univalents are never observed in a wild-type meiosis. (d) unbalanced number of chromosomes at Telophase II. Scale bar: 10 lm. (From Perrella et al 2010).

Human PRDM9

KRAB SSXRD SET ZnF

Figure 5-2. Human PRDM9 domain types: Human PRDM9 domain types: KRAB (Kruppel-associated box); SSXRD(synovial sarcoma,X breakpoint,repression domain); SET [Su(var)3-9, Polycomb-group protein Enhancer of Zeste and Trithorax group protein TRX,domain]; ZnF(Cys2-His2 ZF) (Adapted from Ponting 2011)

CHAPTER 5: EPIGENETIC CONTROL OF MEIOTIC RECOMBINATION IN ARABIDOPSIS THALIANA

95 Figure 5-3. Domain architecture of histone HKMTs in A. thaliana. Abbreviations: EZD, E(Z) domain; SANT, SWI3, ADA2, N-CoR and TFIIIB” DNA-binding domain; CXC, cysteine-rich region; PHD, plant homeodomain; zf-CW, a zinc finger with conserved Cys and Trp residues; PWWP, domain named after a conserved Pro–Trp–Trp–Pro motif; FYRN, F/Y-rich N-terminus; FYRC, F/Y-rich C-terminus. (From Pontvianne et al 2010). In fact SDG6 has been shown since to possess 3 ZFs (Krichevsky et al 2007) 5.1. Introduction

The PRDM9 mechanism has been described in 1.6.5. Briefly, it relies on the methylation of histone H3 on lysine 4. Although it seems that the mechanism is conserved in mammals and yeast, but there are exceptions. For example in S. pombe, instead of H3K4me3, histone acetyltransferases (HAT) are necessary for a proper meiotic recombination so that deletion of a HAT, SpGcn5, required for ade6–M26 hotspot-specific hyperacetylation, led to a partial decrease in DSBs and subsequently a partial reduction in recombination activity at this locus (Yamada et al., 2004). In contrast, deletion of a histone deacetylase (HDAC), ScRpd3p, increases HIS4 hotspot activity in S. cerevisiae (Merker et al., 2008).

In 2010, Perrella and colleagues have described the meiotic role of Meiotic Control of Crossovers1 (MCC1), a GCN5-related histone N-acetyltransferase, in Arabidopsis (Perrella et al., 2010). The over expression mutant showed that acetylation of histone H3 increased in male prophase I. MCC1 seems to be required for proper chromosome segregation, normal chiasma number and distribution during meiosis. The overexpressed MCC1 did not influence on crossover number at cellular level. But it had a differential effect on individual chromosomes increasing COs for chromosome 4, with a shift in chiasma distribution, and decreasing COs for chromosome 1 and 2 (Figure 5-1). This reduction of COs went with the loss of the obligate CO per bivalent but in only 8% of the male meiocytes. Furthermore, fifty percent of both male and female gametes were aborted as consequence of meiotic defects in the mutant. These evidences provide evidence for an impact of histone hyperacetylation in Arabidopsis meiosis. According to Perrella’s results, it is therefore possible that hyperacetylation might enhance at least the activity of hotspots cluster on the short arm of chromosome 4 or, alternatively, induce novel recombination sites possibly located at flanking regions (Perrella et al., 2010).

Regarding previous works on mammals and yeast, I wanted to ask the following questions:

First, is there an epigenetic regulation of meiotic recombination in plants?

Second, do plants have a similar mechanism such as PRDM9 or Set1 for regulating meiotic recombination events by methylation on H3K4?

96 Figure 5-4. Diagram representing the H3 and H4 Lysines targeted by Arabidopsis HKMT proteins. The HKMTs whose activity has been biochemically demonstrated are highlighted using in black, whereas the HKMTs whose specificities have not been confirmed are shown in gray. (From Pontvianne et al 2010) Third, if there are other mechanisms, are they common among plants?

Fourth, are they involved in plant speciation? In mice it has been shown that PRDM9, is involved in hybrid sterility between Mus m.domesticus and Mus m.musculus (Mihola et al 2009).

During the time of my PhD, I decided to have a look at the second question. I was interested to see if A. thaliana histone methyltransferases could be putative meiotic recombination regulator. A simple search for PRDM9 homology in A.thaliana genome failed to find any clear homologs.

PRDM9 is one of 17 human proteins containing a SET methyltransferase domain followed by a series of C-terminal Cys2His2 zing-finger (ZF) and single N-terminal copy of KRAB and SSXRD motifs ( figure 5-2 ) (Birtle and Ponting, 2006; Oliver et al., 2009). As the SET domain catalyzes trimethylation of lysine 4 of histone H3, and the ZF domains have DNA-binding function (determining the position of some meiotic recombination sites), we have setup such criteria for our screen across A.thaliana genome.

In Arabidopsis, a vast family of genes encodes putative histone methyltransferases. Although some of these enzymes function as arginine methyltransferases, however most of them are believed to be histone lysine methyltransferases (HKMTs) (reviewed in (Pontvianne et al., 2010)). So far, five lysine methylation sites have been identified in plants, namely lysines 4, 9, 27 and 36 of Histone 3 and lysine 20 of Histone 4. In A.thaliana, only 31 of 49 putative SET domain-containing proteins identified are known, or though, to have HKMT activity (www.chromDB.org; Ng et al., 2007)). These proteins based on their domain architectures and/or differences in enzymatic activity can be divided into five classes (I to V), (figue 5-3, 5-4). Among HKMTs shown in figure 5-3 and 5-4, we have chosen those have a putative H3K4me activity like PRDM9. In addition, we also decided to screen genes that were transcribed during meiosis (table 2-1).

97 A

Meiocytes Root Meiocytes-root Meiocytes Leaf Meiocytes-Leaves SDG4, ASHR3 AT4G30860 9,15 8,74 0,40 8,43 8,01 0,42 SDG8, ASHH2 AT1G77300 10,07 9,74 0,33 8,92 8,34 0,58 SDG27, ATX1 AT2G31650 7,88 7,88 -0,01 7,23 7,38 -0,15 SDG30, ATX2 AT1G05830 7,97 8,13 -0,16 7,14 7,08 0,05 SDG14, ATX3 AT3G61740 8,85 8,62 0,23 7,25 7,19 0,06 SDG16, ATX4 AT4G27910 miR867 10,25 9,10 1,15 9,61 8,32 1,30 SDG29, ATX5 AT5G53430 8,01 8,07 -0,05 7,22 7,24 -0,02

SDG25, ATXR7 AT5G42400 9,39 9,12 0,26 7,93 7,64 0,29

SDG2, ATXR3 AT4G15180 12,25 11,14 1,11 10,97 9,18 1,79

SDG6, SUVR5 AT2G23740 8,35 7,93 0,42 7,85 7,57 0,29

AT3G22880 DMC1 10,67 8,36 2,31 10,03 8,46 1,58 SPO11-1 AT3G13170 8,62 7,93 0,69 7,64 7,14 0,50 B

SDG14, ATX3 AT3G61740 1,7332 SDG16, ATX4 AT4G27910 miR867 5,8785 SDG29, ATX5 AT5G53430 3,0452

Table 5-1: Transcriptome of Arabidopsis male meiocytes.

(A) The transcriptional level of ten candidate genes (SDGs) compared to two meiotic genes (DMC1 and SPO11-1) (Libeau et al, 2011).

(B) The SDG genes expressed at high level during meiosis (Yang et al., 2011 ). 5.2 The screen

For each gene selected we searched for T-DNA insertions in databases. Then we grew seeds and genotype plants to select homozygous plants for the T-DNA insertion. These plants are then screened with the following protocol (table5-1; table 5-2; figure 5-5):

First: pollen viability. If there is no problem of pollen viability i.e. more than 95 % viable pollen, the corresponding mutation has no obvious role in meiosis thus the screen is stopped. If not, the pollen mortality can be due to pre-meiotic, meiotic or post-meiotic defects. To clarify the situation, a second step is necessary.

Second: tetrad observation. If there is no problem of tetrad formation, the pollen lethality is not due to a meiotic defect. When abnormal tetrads are observed (monad, dyads or polyads) we continue the analysis to detect a putative meiotic defect.

Third: observation of different stage/sub-stage of meiosis by DAPI staining on male meiotic cells.

5.2.1 SDG6 (SUVR5, CZS)

The SDG6 (also called: SUVR5, CZS) is the only A .thaliana HKMT which contains both SET and Cys2His2 ZF domains like PRDM9. According to Pontvianne’s HKMT classification, this protein belongs to class five which marks inactive chromatin via histone 3 lysine 9 methylation (figure 5-3) (Pontvianne et al 2010). The number of ZF varies among SDG6 plants homologs: two in Sorghum bicolor, three in A.thaliana and Vitis vinifera, four in Papulus trichocarpa, Ricinus communis and Zea mays. In Physcomitrella patens subspecies the number of ZF varies from three to five (data from www.ncbi.com, www.scripps.edu/mb/barbas/zfdesign/zfdesignhome.php , http://compbio.cs.princeton.edu/zf/form.html ).

In 2007, Krichevsky and colleagues reported that the AtSDG6 plant-specific ZF domain interacts with a SWIRM domain polyamine oxidase protein, AtSWP1 (Krichevsky et al., 2007). Distruption of either AtSWP1 or AtSDG6 reduced dimethylation of both H3K9 and H3K27, also hyperacetylation of histone H4 within the FLC locus. This led to elevation of the FLC transcription levels, subsequently, delayed flowering (Krichevsky et al., 2007).

98 Heterozygote plant (+/T-DNA)

Self-fertilization

DNA genotyping of the descendants and selection of Homozygous plants (T-DNA/T-DNA)

1 Pollen viability test

Normal tetrad

2

Polyad formation Tetrad observation

Viable pollen grains More that 5 % mortality of pollen grains

Observation of Meiosis stages By DAPI staining 3

Figure 5-5 :A simple schema of the screen Chromosome fragmentation Less bivalent No bivalent Unfortunately Krichevsky and his colleagues have not investigated fertility in czs-1 mutant, Thus we decided to include this gene in our screen.

Both T-DNA insertion lines used (one of two was the same was described in ((Krichevsky et al., 2007)) did not show any pollen mortality (figure 5-6). They both exhibit a delayed flowering like that observed in Krichevsky’s results. We concluded that SDG6 could not play a PRDM9-like role in Arabidopsis meiosis

In addition, we compared ZF sequence variation among the A. thaliana accessions chosen in chapter 4. All of them conserved the same sequence at their three ZFs.

5.2.2 SDG4 (ASHR3) and SDG8 (ASHH2)

SDG4 (ASHR3) and SDG8 (ASHH2) are both members of class II HKMTs. Generally, class II genes are thought to be implicated in the methylation of H3K36, enriched within actively transcribed gene regions. They both exhibit a C4HC3 ZF motif for PHD (Plant Homeo Domain), a SET domain and an AWS (Associated With SET) motif for SDG4 (figure 5-6) (Pontvianne et al., 2010). SDG4 is highly expressed in the pollen and contributes to the epigenetic regulation of pollen and stamen development (Pontvianne et al., 2010). Loss-of- function of SDG4 leads to down-regulation of multiple genes, probably due to defects in H3K4 dimethylation and H3K36 trimethylation. In 2008, Cartagena and colleagues demonstrated a modulating role for the SDG4 by H3K4 and H3K36 methylation on genes involved to pollen tube growth: sdg4 mutants exhbited a significant reduction of H3K4 and H3K36 methylation level in pollen vegetative nuclei leading to an arrest of pollen tube growth, and consequently a fertility reduction (Cartagena et al., 2008). Overexpression of SDG4 gave the same phenotype.

SDG8, known as EFS (Early Flowering in Short days), is thought to be an H3K4 or H3K36 methyltransferase. Mutation of this gene induces pleiotropic plant phenotypes. These include: (i) early flowering due to FLC deregulation (ii) perturbation of several metabolic pathways (iii) alteration of SPS/BUS (Supershoot/Bushy) and UGT74E2 genes expression, both shoot branching regulators (iv) biotic resistance deficiency (v) severe alteration of reproductive organ development (vi) In sdg8 (ashh2) mutant, near to 80% of mature ovules lack embryo sac and close to 90% deficiency of functional pollen per anther observe due

99 A B C D

E F G H

I J K L

M N O

P Q R S

Figure 5-6: Alexander-stained anthers : (A) wild-type Col-0;(B) wild-type Ws; (C) sdg6; (D) sdg4 ; (E) sdg8; (F) atx1; (G) atx2 ; (H) axt3 (I) atx4; (J) atx5 ; (K) atx1 atx2 ; (L) atx3 atx5; (M) atx3 atx4; (N) atx4 atx5 ; (O) atx3 atx4 atx5; (P) sdg25 ; (Q) sdg2; (R) sdg2 sdg25 (S) Tetrad formation of sdg2 stained with toluidine blue. either lack of pollen sac development or specific tapetum layer defects. SDG8 is necessary for genes regulation of inflorescences via H3K36 trimethylation (Grini et al., 2009).

However, nothing is known about the putative meiotic role of these two genes.

The first screen did not reveal any problem of pollen mortality for both sdg4 and sdg8 mutants (figure 5-6). Despite the presence of semi-sterility of sdg8 mutant, all pollen grains were viable in anthers. Then we decided to see if there was a putative redundant function for these genes. However, the self-fertilization of SDG4+/ -x SDG8 +/- double heterozygote revealed an embryonic lethality for the sdg4 sdg8 double-mutant.

5.2.3 SDG2 (ATXR3) and SDG25 (ATXR7)

SDG2 and SDG25 belong to class III HKMTs. SDG2 (also called ATXR3) was recently characterized by Guo and his colleagues as the major H3K4 trimethyltransferase in Arabidopsis (Guo et al., 2010). They performed an Arabidopsis genome-wide study of mono-, di- and trimethylation of H3K4 distribution in which they showed that more than two thirds of genes contain at least one type of H3K4me. Loss-of-function of SDG2 leads the misregulation of a large number of genes and thus to severe and pleiotropic phenotypes: dwarf accompanied to smaller rosettes during vegetative growth, and a significantly early flowering in all photoperiods tested, and sterility.

According to Guo’s results about the pivotal role of SDG2 as the major H3K4 trimethyltransferase, we chose SDG2 and SDG25 genes (both belong to the same sub-group, figure 5-7), as good candidates.

sdg25 mutant had normal pollen viability, but pollen viability of sdg2 mutant varied between 50 to 100% with a few meiocytes. However it had normal tetrads. This suggests that mortality of pollen grains is due to post-meiotic defects. Despite the complete sterility of sdg2 mutant, a few pollen grains were used to make cross between sdg2 and sdg25. An extremely severe phenotype in sdg2 sdg25 double-mutant was revealed. The floral organs were severely affected (figure 5-6). We did not follow meiosis in detail because of this severe developmental defect.

100 Figure 5-7. Classification of Class II and III HKMTs From Cazzonelli et al 2009 5.2.4 ATX1 to ATX5

The Class III HKMTs contain five Arabidopsis genes which encode homologues of Trithorax. These Arabidopsis Trithorax-like proteins are named ATX1 (SDG27), ATX2 (SDG30), ATX3 (SDG14), ATX4 (SDG16) and ATX5 (SDG29) (Avramova, 2009). Structurally, Class III proteins contains: SET, post-SET, PHD, PWWP (proline–tryptophane– tryptophane–proline), FYRN (F/Y-rich N-terminus) and FYRC (F/Y-rich C-terminus) domains (figure 5-3; 5-7). ATX1 and ATX2 form a protein sub-group: ATX1 is involved in H3K4 trimethylation, while ATX2 mediates H3K4 dimethylation. The atx1 mutants show an early flowering phenotype (like sdg8, sdg2 and sdg25 mutant) and altered leaf morphogenesis. Interestingly, the early flowering phenotype is more severe in double atx1 atx2 mutant than simple atx1 mutant. This suggests that ATX1 and ATX2 activities overlap for suitable genes expression implicated in flowering time regulation. Although, there is evidence for partial redundancy in monitoring flowering time, however the ATX1 and ATX2 do not seem to regulate the same group of genes (reviewed in (Avramova, 2009)). Transcriptome analysis confirmed this difference by revealing that in atx1 mutant 7% of overall gene expression is affected, while in atx2 mutant only 0.7% of all genes exhibit a different pattern of expression compared with controls. So far, no functions have been reported for ATX3, ATX4 or ATX5. Regarding their highly conserved amino acid sequences, it is possible that they have redundant functions.

The pollen viability screen did not detect any meiotic or gametophytic deficiency for simple atx1, atx2, atx3, atx4, atx5 mutants (figure 5-6) nor atx1 atx2, atx3 atx4, atx3 atx5, atx4 atx5 double-mutants (figure 5-6). Surprisingly, the atx3 atx4 atx5 triple mutant revealed a severe perturbation in both male and female gamete formation. In male organs, anther malformation led to a few and abnormal pollen grains: lethality varies from fifty to one hundred percent. However normal tetrad formation was observed indicating a normal meiosis in male side. By contrast, in female organs, normal gametophytes were observed, but a detailed analysis using confocal microscopy revealed fifty percent of abnormal ovules, with sometimes three-ovules instead of one (figure 5-8). There might be an alteration of Telophase II. However none of the phenotypes observed reflect a Prophase I perturbation related to a meiotic recombination machinery defect.

101 A B

C D

Figure 5-8: Comparison of male and female meiosis in atx 3 atx5 double–mutant (A, B) and atx3 atx4 atx5 triple-mutants (C,D) by Confocal microscopy. (A) Normal meiosis with healthy ovules . (B) Normal meiosis ( tetrad formation) (D) Aborted ovules and also three-ovule developed instead of one ( green arrow) (D) Abnormal anthers with few meiocytes, but accurate tetrad formation ( red arrow) 5.3 Conclusion

The results of our screen are presented in table 5-2. None of the genes selected, present when mutated a perturbation of meiotic recombination. Thus either there is no PRDM9/Set1- like control of recombination hot spots in Arabidopsis or we have not yet found the histone methyltransferase(s) involved.

102 Gene T-DN line/ Mutant Pollen viability / Tetrad Meiotic defect Genetic Fertility state background

SDG6 (SUVR5, CZS) SALK_026224 /Col-0 sdg6 simple mutant 100% /fertile _ _

SALK_050304 / Col-0 sdg6 simple mutant 100% /fertile _ _

SDG4 (ASHR3) FLAG_225G07 /Ws-4 sdg4 simple mutant 95% /fertile _ _

SDG8 (ASHH2, EFS) FLAG_135B08 /Ws-4 sdg8 simple mutant 98% /semi-sterile _ _

SALK_065480 /Col-0 sdg8 simple mutant 95% /semi-sterile _ _

SDG2 (ATXR3) FLAG_224A03 /Ws-4 sdg2 simple mutant 50 - 100% /sterile Normal _

SALK_120450 /Col-0 sdg2 simple mutant 100% /sterile _ _

SALK_138889 /Col-0 sdg2 simple mutant 95% /sterile _ _

SALK_120448 /Col-0 sdg2 simple mutant 97% /sterile _ _

SALK_055991 /Col-0 sdg2 simple mutant 100% /sterile _ _

SALK_129789 /Col-0 sdg2 simple mutant 98% /sterile _ _

SDG25 (ATXR7) FLAG_370G04 /Ws-4 sdg25 simple mutant 97% /fertile _ _

SALK_149692 /Col-0 sdg25 simple mutant 98% /fertile _ _

SDG27 (ATX1) SALK_149002 /Col-0 atx1 simple mutant 96% /fertile _ _

SDG30 (ATX2) GABI_057B09 /Col-0 atx2 simple mutant 98% /fertile _ _

SDG14 (ATX3) GABI_143H01 /Col-0 atx3 simple mutant 95% /fertile _ _

GABI_128H01 /Col-0 atx3 simple mutant 100% /fertile _ _

SDG16 (ATX4) SALK_006002 /Col-0 atx4 simple mutant 95% /fertile _ _

FLAG_368G11 /Ws-4 atx4 simple mutant 97% /fertile _ _

SDG29 (ATX5) SK25472 /Col-0 atx5 simple mutant 95% /fertile _ _

SK25155 /Col-0 atx5 simple mutant 98% /fertile _ _

Table 5-2: Phenotypes of mutants. Part1

CONCLUSION

103 Gene T-DN line/ Mutant Pollen viability / Tetrad Meiotic defect Genetic background Fertility state

SDG4 (ASHR3) FLAG_225G07 /Ws-4 sdg4 sdg8 Lethal _ _ double mutant SDG8 (ASHH2, EFS) FLAG_135B08 /Ws-4

SDG2 (ATXR3) FLAG_224A03 /Ws-4 sdg2 sdg25 Problem in _ _ double mutant anther development SDG25 (ATXR7) FLAG_370G04 /Ws-4

SDG27 (ATX1) SALK_149002 /Col-0 atx1 atx2 95% /fertile _ _ double mutant SDG30 (ATX2) GABI_057B09 /Col-0

SDG14 (ATX3) GABI_143H01 /Col-0 atx3 atx4 95% /fertile _ _ double mutant SDG16 (ATX4) SALK_006002 /Col-0

SDG14 (ATX3) GABI_143H01 /Col-0 atx3 atx5 96% /fertile _ _ double mutant SDG29 (ATX5) SK25472 /Col-0

SDG16 (ATX4) GABI_143H01 /Col-0 atx4 atx5 95% /fertile _ _ double mutant SDG29 (ATX5) SK25472 /Col-0

SDG14 (ATX3) GABI_143H01 /Col-0 atx3 atx4 atx5 0 – 50 % /semi-sterile Normal Telophase II arrest triple mutant in female meiosis SDG16 (ATX4) GABI_143H01 /Col-0

SDG29 (ATX5) SK25472 /Col-0

Table 5-2: Phenotypes of mutants. Part 2

The aim of my thesis was the identification and the characterization of a meiotic recombination hotspot in Arabidopsis thaliana. I confirmed that the 14a was a true hotspot of meiotic recombination based on specific features: (i) 14a has a very high rate of COs (0.85%) (ii) COs cluster in two small regions of a few kilobases, 14a1 and 14a2 and the shape of their distribution form a typical Gaussian curve observed in other organisms.(iii) 14a1 is also a hotspot of NCO; at the center of the hotspot (based on the peak of CO rates) NCOs occur at a similar rate than COs (0.5%) (iv) for the first time in Arabidopsis, a sufficient number of events was observed to obtain a mean tract length of conversion of 276 bp (1 to 552 bp). When these features were compared to the one obtained at the 130x hotspot described by Jan Drouaud, similarities and discrepancies were detected: 130x has a comparable rate of COs (0.7%) but COs are distributed along a broad region (8kb) without a proper Gaussian shape; NCOs are also detected in discrete regions of the hotspot with a lower rate than COs (CO/NCO ratio 1/30); NCOs conversion tract length tend to be longer at 130x (492 to 1038 bp) depending on the polymorphism studied.

A bias of in the recovery of the Col and the Ler allele at the peak of COs and NCOs was found at both 14a1 and 14a2 hotspots in one of the ColxLer hybrid studied. The Col allele more frequently recovered than the Ler allele in the recombination products. This suggests that there is a bias of recombination initiation with Ler being more frequently the target of meiotic breaks than Col. However, surprisingly, in different ColxLer hybrids the bias was not found anymore neither in a ColxLer hybrid homozygous for a msh4 mutation (see below). Yet, these three ColxLer F1s have the same DNA sequence at the hotspot. Thus either there is no real consensus sequence within the hotspot as described recently in mice and humans or there is a trans-acting factor to be identified that modulates the activity of hotspots regions. These results underlined the necessity to compare more hotspots in more hybrids.

The analysis of the 14a hotspot was undertaken in a series of F1. In the ColxWs it was observed that: (i) the two hotspots 14a1 and 14a2 are conserved with a similar global CO rate (0.49%) (ii) the distribution of Col and Ws allele suggest that there could be a bias of initiation of recombination but this time with the Col allele being more frequently broken than the Ws allele. Additional experiments (more COs and NCOs analysis) are needed to confirm or infirm this hypothesis. It would be very interesting to study the behavior of 14a in

104 the LerxWs F1 and we are very excited to see what happens in ColxPyl-1, ColxBur-0 and ColxN13 hybrids. The 130x needs also to be analyzed in this various F1.

I also looked at the role of MSH4 in the rate and distribution of recombination events. I confirmed that MSH4 is involved in CO formation as described previously in other species: 12.5% of the wild-type CO rate. However, this reduction of CO frequency was not homogenous at 14a1 (6.4%) and 14a2 (18.7%) and this difference was significant. At 130x only 5% of COs were MSH4 independent. It suggests that the proportion of MSH4 dependent COs could differ from one to another hotspot. It would be interesting to study other hotspots in the absence of MSH4 to look at their landscape to better understand MSH4's role. NCOs were also analyzed in the msh4 homozygous mutant. The NCO frequency was similar to the one found in wild type. Thus as described in S. cerevisiae previously, MSH4 does not seem to play a role in NCO formation. We did not detect a bias in recombination initiation in the NCO products (the bias could not be analyzed for CO products due to the low number of COs recovered). However, we have shown that in the wild type ColxLer sister plants (wild-type MSH4) of the msh4 homozygous plants, we do not detect any bias in the CO products. We are currently looking at the NCO products in these plants. We thus hypothesize (see above) that the two ColxLer F1 behave differently at 14a (see above).

I also wanted to look at a putative epigenetic control of meiotic recombination hotspots in Arabidopsis thaliana. I focused on histone methylation as Set1 in S.cerevisiae or PRDM9 in mammals (mice and human) had been described as main controller. These genes are histone methyltransferase that methylate lysine 4 on histone H3 (H3K3me3) and regulate the activity of meiotic recombination hotspots. I analyzed the role of ten putative histone methyltransferase genes in meiosis but none of them showed a meiotic defect. As I performed crosses to obtain double or triple mutants within subfamilies of histone methyltransferase to limit the potential redundancy of function between gene products, I cannot eliminate totally the possibility of redundancy between subfamilies of genes. Alternatively, there is another histone methyltransferase in charge of labeling meiotic recombination hotspots (more than 29 putative histone methyltransferase have been identified in the Arabidopsis genome!). Another hypothesis is that contrary to S. cerevisiae, mice and humans, there is another mechanism for epigenetic control of meiotic recombination. In S.pombe, histone acetylation but not histone methylation controls activity at some hotspots (Yamada et al 2004). Moreover, in dogs and their wild relatives, wolves and coyotes PRDM9 is not a meiotic recombination regulator because it is non-functional. In addition, PRDM9 is also absent in birds, reptiles, amphibians

105 and diptera (Muñoz-Fuentes et al 2011). Thus it would be interesting to study directly chromatin structure at meiotic recombination hotspots to determine the marks linked to the activity of hotspots. However, these studies are actually difficult to initiate due to the limited meiotic material.

Finally, we can suggest that although meiotic recombination promotes genetic diversity, it can also be sometimes hazardous in certain hybrids. This is the case of human genetic disease due to meiotic recombination, where improper CO occurs between non-allelic sequences or an NCO into genes leads to non-functional protein (see chapter I, 1-7). If meiotic recombination occurs within genes or at gene promoters or regulators it can lead to new proteins or abnormal expression. We have not yet tested the function of recombinant proteins at 14a and 130x as the recombinant genes obtained were only found in pollen DNA. But as we found these recombinant proteins the three hotspots studied, we can hypothesize that the conception of a new protein by meiotic recombination is not uncommon and can lead to major change in the life of a plant: benefit, silent or deleterious.

106

MATERIAL AND METHODS

107

Plant material

The Arabidopsis thaliana lines used for CO and NCO analysis

The Arabidopsis thaliana accessions “Columbia-0 (Col)” (186AV), “Landsberg erecta (Ler)” (213AV) and “Wassilewskija (Ws-4)” (530AV) were used for CO and NCO analysis (chapter II, III and IV). A list of all accessions used (in addition to their corresponding hybrids) for further studies on hotspot variation among different Arabidopsis hybrids is shown in chapter IV (see Table 4-1 and 4-2). All these accessions were obtained from the “Centre de Ressources Biologiques” at the “Institut Jean Pierre Bourgin”, Versailles, France http://dbsgap.versailles.inra.fr/vnat/.

To obtain a msh4-/- mutant line in ColxLer background the two heterozygous lines Col (SALK_136296) and Ler (CSHL_GT14269) have been crossed. In the descendants, sterile plants (msh4-/-) were used for CO and NCO analysis (see chapter III).

The Arabidopsis thaliana lines used in chapter 5

SDG6, SDG4, SDG8, SDG2, SDG25, ATX1, ATX2, ATX3, ATX4 and ATX5 genes were chosen as candidate genes. A list of their T-DNA insertion lines, in addition to their double and triple mutants, is shown in chapter V (see table 5-2). Double or triple mutants have been made either in Col-0 or in Ws genetic background (see table 5-2).

All plants presented above were grown in the greenhouse under standard conditions.

Methods

CO and NCO analysis

The method used for CO and analysis is briefly presented in chapter II but is detailed in (Drouaud and Mezard, 2011).

The methods used for NCO analysis are detailed in a paper in press in “Methods in Molecular Biology” ((Khademian et al., in press) see last part of this chapter).

108

Primers used for ColxWs analysis

Name Sequence 5'->3' Genomic coordinates

14a8CoL4 ATTAtgcaacagcaacaC 16691253..16691271

14a8Le (Ws) L2 ACAAcagcaACTAAACTAATC 16691258..16691279

14a15CoL1 CCagtaatcagtattccct 16692044..16692061

14a15WsL1 CCagtaatcagtaAtcccC 16692044..16692061

14a54CoR2 GGAGAGCTAATGCAGGC 16699799..16699783

14a54Le (Ws) R2 GCAGAGCCAATGCGGGT 16699799..16699783

14a75CoR1 GTATCAAGTGTGGGGTGTC 16700416..16700435

14a75Le (Ws) R1 GTATCAAGTGTGGGGTCTT 16700416..16700435

14af4L TTCTGAAATTTGCCATATTGAT 16693364.. 16693386

14aKH3R AAGCACATAAATATGGAATC 16697738..16697758

Nucleotides surlined in yellow are nucleotides added to increase artificially the Tm of the primer

Primers used for CO detection

First PCR Annealing Second PCR Annealing Third PCR Annealing Target:

temperature temperature temperature Parental

(°C) (°C) (°C) or

CO

1stPCR:

14a8CoL4 14a15CoL1 14af4L CO

58 63 51 2ndPCR:

109 14a75Le (Ws) R1 14a54Le (Ws) R2 14aKH3R CO

3rdPCR:

Parental

1stPCR:

14a15WsL1 14af4L CO

14a8Le (Ws) L2 58 63 51 2ndPCR:

14a54CoR2 14aKH3R CO

3rdPCR:

Parental

14a75CoR1

Primers used for mutant genotyping

Gene T-DNA line Wild-type alleles PCR T-DNA alleles PCR

SDG6 SALK_026224 LP LB- CATCATCGACGACACAAATTG Salk2GCTTTCTTCCCTTCCTTTCTC

RP RP TTGGAAATTCATGTGGAGGAG TTGGAAATTCATGTGGAGGAG

SDG6 SALK_050304 LP LB-Salk2 TTGGAAATTCATGTGGAGGAG GCTTTCTTCCCTTCCTTTCTC

RP RP CATCATCGACGACACAAATTG CATCATCGACGACACAAATTG

SDG4 FLAG_225G07 LP Lb-Bar2 GGAAAATCTAGCCGGAGAATG CGTGTGCCAGGTGCCCACGGAATAG RP RP

CATGCTGCTCAATCAACAGAG CATGCTGCTCAATCAACAGAG

110 SDG8 FLAG_135B08 LP Lb-Bar2 AAACATCCAAGCATCAAGTGG CGTGTGCCAGGTGCCCACGGAATAG RP RP AAGCAATCATCACATCGAACC AAGCAATCATCACATCGAACC

SDG8 SALK_065480 LP LB-Salk2

CCTTCATCGCAATCGTAAATC GCTTTCTTCCCTTCCTTTCTC

RP RP

TTTTGCGCTAAACTAGTTGGG TTTTGCGCTAAACTAGTTGGG

SDG2 FLAG_224A03 LP Lb-Bar2

ATGGAATCGATCCTTACACCC CGTGTGCCAGGTGCCCACGGAATAG RP RP AGGCAGGCTAGTTTATGCATG AGGCAGGCTAGTTTATGCATG

SDG2 SALK_120450 LP LB-Salk2

TGTCACTCTTGCCATTGTCAG GCTTTCTTCCCTTCCTTTCTC

RP RP

AAGTCTAAGGGTTTAGGGGGC AAGTCTAAGGGTTTAGGGGGC

SDG2 SALK_138889 LP LB-Salk2

ACAAAACTGATCCTGCACCAG GCTTTCTTCCCTTCCTTTCTC RP RP AACAAGCGCATACAACAAACC AACAAGCGCATACAACAAACC

SDG2 SALK_120448 LP LB-Salk2

TTTGATTTCTCGGTCAGATGC GCTTTCTTCCCTTCCTTTCTC RP RP GCTCAGATGCCAGAATACTCG GCTCAGATGCCAGAATACTCG

SDG2 SALK_055991 LP LB-Salk2

TCATCTCTCAACCTTCATGGG GCTTTCTTCCCTTCCTTTCTC RP RP TTGTTGTATGCGCTTGTTCAG TTGTTGTATGCGCTTGTTCAG

111 SDG2 SALK_129789 LP LB- GTTCGGATGATTCAGACGATG Salk2GCTTTCTTCCCTTCCTTTCTC

RP RP CGATGTTGCTTCCTTCTATGC CGATGTTGCTTCCTTCTATGC

SDG25 FLAG_370G04 LP Lb-Bar2

TGATTTTGATCCTGTTGCCTC CGTGTGCCAGGTGCCCACGGAATAG RP RP ATGACAATCCACGAACTCTGG ATGACAATCCACGAACTCTGG

SDG25 SALK_149692 LP LB-Salk2

TCTTGTGACAGGTGCAACTTG GCTTTCTTCCCTTCCTTTCTC RP RP AAACAAAGCTAGGCACAAGGC AAACAAAGCTAGGCACAAGGC

SDG27 SALK_149002 LP LB-Salk2

AATGAAAGCATGCGGATACAC GCTTTCTTCCCTTCCTTTCTC RP RP TCCGTGTTGACTGGAAAGATC TCCGTGTTGACTGGAAAGATC

SDG30 GABI_057B09 LP LB-gabi1

ATAAGCTTCCAGGAAGGATGG CCCATTTGGACGTGAATGTAGACAC RP RP AATGCAGACCAGGACACATTC AATGCAGACCAGGACACATTC

SDG14 GABI_143H01 LP LB-gabi1

TCAAAGAATGATGGCTTGGTC CCCATTTGGACGTGAATGTAGACAC RP RP CTTTAAAACCAAGCGACGTTG CTTTAAAACCAAGCGACGTTG

SDG14 GABI_128H01 LP LB-gabi1

ATTACTAGCGATTCCGGCTTC CCCATTTGGACGTGAATGTAGACAC RP RP CTAATACCCCGTAGGCTGAGG CTAATACCCCGTAGGCTGAGG

112 SDG16 SALK_006002 LP LB-Salk2

GTATGCAACCGTTTCTGCATC GCTTTCTTCCCTTCCTTTCTC RP RP TTCGGTGCTTGATAATTGGAG TTCGGTGCTTGATAATTGGAG

SDG16 FLAG_368G11 LP Lb-Bar2

TCTCCTCACCCATGCATAGTC CGTGTGCCAGGTGCCCACGGAATAG RP RP TGTTTGATAGAACGGGTACGC TGTTTGATAGAACGGGTACGC

SDG29 SK25472 LP pSKTAIL-L1

TTCCTTTTGTGCCATCTTCAC TTCTCATCTAAGCCCCCATTTGG

RP RP

TCCAGGTTCTTGAATACCGTG TCCAGGTTCTTGAATACCGTG

SDG29 SK25155 LP pSKTAIL-L1

TCGTCGAATTGAAACTGAACC TTCTCATCTAAGCCCCCATTTGG

RP RP

TGCTTCCTTCACGGTTTAATG TGCTTCCTTCACGGTTTAATG

LP : left primer; RP: right primer

Light and confocal microscopy

Alexander’s staining of anthers was performed as described by Alexander (Alexander, 1969) and observed by light microscopy. Tetrads were dissected and stained with 0.1% toluidine blue following observation by light microscopy. For confocal laser scanning microscopy, anthers and pistils of different stages were observed for double and triple mutants as described by Motamayor et al. (Motamayor et al., 2000) with laser confocal microscopy Leica SP2.

Manuscript

113 Characterization of meiotic Non Crossover molecules from

Arabidopsis thaliana pollen

Hossein Khademian, Laurène Giraut, Jan Drouaud and, Christine Mézard

Institut Jean-Pierre Bourgin UMR1318 INRA-AgroParisTech

Institut National de la Recherche Agronomique, Centre de Versailles-Grignon

Route de St-Cyr (RD10)

78026 Versailles Cedex France.

*Corresponding author.

E-mail: [email protected]. Tel.: (33)13083739. Fax.: (33)130833319. Summary

Meiotic recombination is essential for the proper segregation of homologous chromosomes and thus for the formation of viable gametes. Recombination generates either crossovers

(COs), which are reciprocal exchanges between chromosome segments, or gene conversion not associated to crossovers (NCOs). Both kinds of events occur in narrow regions (less than

10 kilobases) called hotspots, which are distributed along chromosomes. In plants as in many other higher eukaryotes, whereas NCOs may represent a large fraction of meiotic recombination events, the have been poorly characterized due to the technical difficulty to detect them. Here, we present a powerful approach based on allele-specific PCR amplification of single molecules from pollen genomic DNA allowing detection, quantification and characterization of NCO events arising at low frequencies in recombination hotspots.

Key words: Meiosis, recombination, hotspot, non crossover, pollen, allele-specific PCR

1. Introduction

In most eukaryotes, recombination between homologous chromosomes is essential to their proper segregation at the first meiotic division. Meiotic recombination between homologous chromosomes produces either reciprocal exchanges of large fragments

(Crossovers, COs) or non-reciprocal exchanges of small patches (Non Crossover, NCO) of genetic material (Figure 1). The recombination events are not distributed homogeneously along the chromosomes and also cluster at "hotspots": small regions of a few kilobases with a higher CO and NCO rates compared to the surrounding DNA. Studies in yeasts have shown that both COs and NCOs derive from the repair of DNA double strand breaks induced by the

Spo11 protein early in meiosis (Reviewed in (1)).

In higher eukaryotes, the sperm typing technique set up first by Arnheim and Jeffreys enabled the characterization of tens of hotspots in human and mice ((2, 3); Figure 2).

Mammalian hotspots exhibit similar characteristics to the ones described in yeasts: a 1 to 2 kb width, a quasi normal distribution of CO exchange points, and a gradient of NCOs rate peaking at the center of the hotspots.

In all studies conducted in higher eukaryotes, NCOs are probably underestimated because of the difficulty in their detection and the limited number of polymorphisms available. However, DSB repair sites estimated by cytological methods (immunocytology using antibodies against the DSB repair protein RAD51 and/or DMC1, or quantification of early nodules visualised by electron microscopy) are 10 to 40 times greater than CO numbers

(estimated by chiasmata count, quantification of late nodules visualised by electron microscopy, or by genetic maps) (1, 4). It is not known yet if all these DSBs in excess are repaired as NCOs but if so their number would be in large excess compared to COs.

Moreover, NCO tracts could be very short. Indeed, none of the NCO tracts reported so far in mice and human exceed 1.2 kb in length, with an average estimated between 50 to 300 bp (1,

3, 5-7). Thus just a few polymorphisms could be included in NCO tracts and that would add to the difficulty to detect them.

In plants, only a few hotspots have been isolated and characterized so far. All the hotspots described have been discovered due to phenotypic screens allowing the detection of

COs and sometimes NCOs. For example, the maize bronze hotspot has been identified because a mutation in the corresponding gene changes the color of the seeds (8) . At another well analyzed set of hotspots a1-sh2, the regions is flanked by two genes where mutations can change the color (a1) and the aspects (sh2) of corn kernels (9). In both regions, clusters of COs and NCOs have been found but with some differences compared to mammalian and yeasts hotspots: at bronze, NCOs rate do not seem to exhibit a gradient with a peak in the center (10); however, the CO distribution may not be sufficiently characterized to delimit the borders and thus the center; in the a1-sh2 region COs are detected over a large 10 kb region that could reflect a cluster of overlapping hotspots (11) and the NCOs distribution could not be properly analyzed because of the too few events recovered.

To overcome the limitations of the phenotypic screens on the characterization of plant meiotic hotspots, we set up a molecular approach adapted from the "sperm-typing" technique thoroughly described in a series of recent papers (1, 2, 12). The procedure is based on a series of nested PCRs for isolating recombinant molecules from pools of parental non-recombinant molecules (Figure 2). We previously described the method to characterize CO molecules (13).

Here we report a modified protocol to extract high quality DNA from Arabidopsis pollen and two "pollen-typing"-based methods to isolate, score and analyze NCO events (Figure 3).

2.Materials

2.1 DNA extraction

2.1.1 Extraction of DNA from pollen

1. 10% saccharose

2. RNase A 100 mg/ml

3. DNeasy Plant maxi Kit (Qiagen ®)

4. TE buffer: 5mM Tris-HCl (pH8), 0.5 mM EDTA

2.1.2 Extraction of DNA from leaves

Described in (13)

2.1.3 DNA Purification

Described in (13)

2.2. Allele-Specific Long PCR

Described in (13)

3. Methods

3.1 Genomic DNA preparation

In most experiments described below, especially those involving long PCR amplification, the use of high molecular weight, high purity genomic DNA (gDNA) is required. This is not always a trivial matter, however, because the mechanical disruption of the cell wall often leads to extensive gDNA shearing, in particular for Arabidopsis (and most other plant species) pollen. Hence, careful optimization of the tissue lysis procedure is mandatory, keeping in mind that the optimal protocol aims at reaching a compromise between yield (through wall breakage efficiency) and integrity of gDNA.

Several methods for pollen disruption have been described, but few have proven to be adequate for high molecular weight gDNA isolation. Among the latter, grinding in liquid nitrogen, bead-vortexing and French pressing have been successfully used in our laboratory.

We present here a protocol using a French press, which allows extracting a high yield of high molecular weight gDNA, providing that adequate cell pressure is used. The pressure value should be determined for each plant species, prior to routine large scale isolation of gDNA.

3.1.1 Extraction of Genomic DNA from Pollen

Pollen is first harvested from inflorescences as described in (13).

Then:

1. Resuspend the cell pellet in 5 ml of Buffer AP1 (DNeasy Plant maxi Kit-Qiagen)

Lyse cell with a French Press using adapted pressure. Add 1 µl of the suspension into

10 µl of water onto a microscope glass side. Confirm the disruption of the cells with

the microscope. If the lysis is successful, add RNase A to 0.2 mg/ml

Note: For the pollen suspension prepared from the Columbia x Landsberg erecta F1

we adjusted the pressure to 3 000 psi in medium. However, we recommend testing

different levels of pressure for each cross. After lysis, the percentage of broken cells

should be evaluated and the average DNA size should be estimated by gel

electrophoresis. Then, the pressure that maximizes the percentage of broken pollen

cells and minimizes the DNA fragmentation is chosen to perform the DNA extraction

on the rest of the pollen suspension.

2. Proceeding with the Qiagen DNeasy maxi kit (steps 9 to 17)

Note: For Step 16 and 17, the elution is performed three times with 1 ml of TE Buffer

3.1.2 Extraction of genomic DNA from leaves

As described in (13)

3.1.3 Purification of Genomic DNA

As described in (13)

3.1.4. Quantification of Genomic DNA As described in (13)

3.2. Designing and testing oligonucleotides

Two kinds of oligonucleotides are used for PCRs.

Those called universal oligonucleotides (UOs) anneal to identical sequences on both parental alleles. They should be designed following usual rules for PCRs primers (50% GC, avoid repeated regions….). Their efficiency should be tested in pairs between UOs and also in pairs with allele specific oligonucleotides (ASOs)(see below).

In contrast, ASOs are designed to anneal the most specifically as possible only to one of the two parental alleles. Thus, their positions depend on polymorphic sites between parental alleles. There are not clear rules to predict what will be a "highly" or "poorly" specific primer but with the only obligation that the 3' terminal nucleotide of the ASOs must be a polymorphic nucleotide between the parental alleles. Thus all ASOs will have to be tested empirically (see below). To detect NCOs, ASOs are designed at polymorphisms within the hotspot (Figure 3)

Additional criteria and methods to design and test the efficiency of UOs and ASOs and the specificity of ASOs have been extensively described elsewhere (2, 12-14).

3.3. Detection and Characterization of Single NCO Molecules

Prior to any detection of recombinant molecules, genomic DNA extracted from pollen must be tested to determine its amplifiability (proportion of genomic molecules that can be PCR amplified). Long PCRs up to 14 kilobases could be needed depending on the width of the hotspot and the polymorphisms available at the borders. Typically, amplifiability ranges from

20 to 50% for amplicons longer than 8 kilobases and drops around 10% for 14 kilobases. Thus there could be a real discrepancy between the DNA concentration estimated by gel electrophoresis (see above) and the amplifiability.

Methods to score amplifiable molecules are described in (13).

NCO detections methods described thereafter rely on a first allele-specific PCRs to amplify only molecules having both flanking markers of one allele (Figure 3). If there is a suspicion that the first allele-specific PCRs may not be highly specific, it is recommended to perform a second nested allele-specific PCRs. Then, on this amplified material, two methods can be used to identify NCOs (Figure 3).

The simplest and quickest method is based on two allele-specific PCRs with a pair of two

ASOs, performed in parallel. In each ASO pair, one ASO matches the flanking polymorphism from the allele amplified previously while the second ASO matches the polymorphism to be tested for the NCO and thus belonging to the other parental allele (Figure 3A). To score a

NCO, both PCRs with both ASOs in reverse orientation matching the NCO polymorphism must give amplifications (Figure 4).

When the types of polymorphisms do not allow designing specific ASOs, another method adapted from (15) can be used. As described above and in Figure 3B, after two rounds of allele-specific PCRs, PCRs fragments are digested and cloned into a vector. Individual clones are recovered from E. coli and genotyped using fluorescence at polymorphic sites to be tested for NCOs.

Thereafter, we will use the following name for ASOs designation: ASO_AL1 and ASO_AR1,

ASO-AL2 and ASO_AR2 are two couples of ASOs that amplify specifically allele A and couple "2" is internal to couple "1"; ASO_BL1 and ASO_BR1, ASO-BL2 and ASO_BR2 are two couples of ASOs that amplify specifically allele A and couple "2" is internal to couple

"1". ASO_ASnpUp and ASO_ASnpLo are the upper and lower primers respectively specific of the SNP to be scored in allele A. Conversely, ASO_BSnpUp and ASO_BSnpLo are specific for allele B (Figure 3).

3.3.1 NCO detection by PCR

As the NCO activity of hotspots could be variable (NCO to CO ratios have been described between 12:1 and 1:9 in mammals), a series of pollen DNA concentrations should be tested to determine the ideal range before performing large scale analysis. From our unpublished data

(Laurène Giraut, personal communication), NCO rates at one SNPs could vary between

1/100,000 to 1/250.

1. Prepare a reaction pre-mix for 14 reactions as follows:

28 µl of 4 µM ASO_AL1

28 µl of 4 µM ASO_AR1

28 µl of 10X PCR buffer

14 µl of 0.5 U/µl Taq:Pfu mix

196 µl of H20

2. Combine 42 µl of pre-mix and 2 µl of 30 ng/µl F1 pollen gDNA in PCR tube/well ‘1’.

3. Add 12 µl of H20 to the remaining pre-mix. Then aliquot 22 µl into PCR tubes/wells ‘2’ to

‘12’.

4. Transfer 22 µl from tube/well ‘1’ to tube/well ‘2’. Mix.

5. Transfer 22 µl from tube/well ‘2’ to tube/well ‘3’. Mix. So on until tube/well ‘12’.

gDNA is serially diluted at 1/2 from tube ‘1’ to ‘12’, starting from 30 ng (roughly 200000

Arabidopsis genomes, 10000 of each parent in a F1).

6. Proceed to thermal cycling as follows:

(92°C;2 min){(92°C;20 sec)(Topt; 30 sec)(68°C;30 sec + 45 sec/kb)}x30(68°C;90

sec/kb)(4°C;∞). 7. Proceed to the second PCR to test the presence of NCOs. Dilute 2 µl of PCR products in

100 µl of (5 mM Tris-HCl pH 8, 0.01% Triton X-100).

8. Than prepare two reaction pre mixes for 14 reactions as follows:

Pre mix 1:

28 µl of 4 µM ASO_AL2

28 µl of 4 µM ASO_BSnpLo

28 µl of 10X PCR buffer

14 µl of 0.5 U/µl Taq:Pfu mix

168 µl of H20

Pre mix 2:

28 µl of 4 µM ASO_AR2

28 µl of 4 µM ASO_BSnpUp

28 µl of 10X PCR buffer

14 µl of 0.5 U/µl Taq:Pfu mix

168 µl of H20

9. Add 14 µl of the diluted PCR product obtained in the first PCR (see 7) in pre mix1 and pre

mix2. Than aliquot 22 µl in 12 tubes for each mix

10. Proceed to thermal cycling as follows:

(92°C;2 min){(92°C;20 sec)(Topt;30 sec)(68°C;45 sec/kb)}x30(68°C;90 sec/kb)(4°C;∞).

11. Add 5 µl of DNA loading dye, and run 10 µl on a 0.8 % agarose gel (containing 0.2 µg/ml

ethidium bromide) in 1X TBE. Photograph the gel under UV, including a high exposure

time in order to see faint bands.

12. Choose the last dilutions and the first negative dilutions positive for both second PCRs to

perform a series of 12 reactions to refine the optimized dilutions to be used thereafter.

Choose the dilution that will give less than one NCO product per reaction to avoid the risk of gettingmultiple NCO products in the same PCR. However, if the NCO rate found is low,

it is preferable to multiply reactions with fewer genomes than to carry less reactions with a

high number of genomes. Indeed, PCRs start to be non specific when non recombinant

molecules are in large excess. The tolerated excess depends on the quality of ASOs (from a

thousand to ten thousands). The same analysis has to be conducted independently for allele

B. Actually, bias of initiation of recombination has been described in several hotspots and

in several species leading to unequal rate of NCO on each allele at one specific SNP (16)

13. Perform the reaction to detect and quantify NCOs. Prepare a reaction mix for 98 reactions

as follows:

196 µl of 4 µM ASO_AL1

196 µl of 4 µM ASO_AR1

196 µl of 10X PCR buffer

98 µl of 0.5 U/µl Taq:Pfu mix

69.3 µl of gDNA to get 98x optimal dilution determined in (12) in 5 ng/µl carrier DNA

126.7 µl of 5 ng/µl carrier DNA

1274 µl of H20

14. Aliquot 22 µl of the mix in a PCR plate

15. Proceed to thermal cycling as follows:

(92°C;2 min){(92°C;20 sec)(Topt;30 sec)(68°C;45 sec/kb)}x30(68°C;90 sec/kb)(4°C;∞)

16. Proceed to the second PCR. Dilute 2 µl of PCR products in 100 µl of (5 mM Tris-HCl pH

8, 0.01% Triton X-100).

17. Prepare two reaction pre-mixes for 98 reactions as follows:

Pre mix 1 196 µl of 4 µM ASO_AL2

196 µl of 4 µM ASO_BSnpLo

196 µl of 10X PCR buffer

98 µl of 0.5 U/µl Taq:Pfu mix

1372 µl of H20

Aliquot 21 µl of pre-mix 1 into the wells of PCR plate 2a.

Pre mix 2

196 µl of 4 µM ASO_AR2

196 µl of 4 µM ASO_BSnpUp

196 µl of 10X PCR buffer

98 µl of 0.5 U/µl Taq:Pfu mix

1372 µl of H20

Aliquot 21 µl of pre-mix 2 into the wells of PCR plate 2b.

18. Transfer 1 µl of diluted products from the first PCR into PCR plate 2a and plate 2b.

Proceed to thermal cycling as follows:

(92°C;2 min){(92°C;20 sec)(Topt;30 sec)(68°C;45 sec/kb)}x30(68°C;90 sec/kb)(4°C;∞).

19. Add 5 µl of DNA loading dye, and run 10 µl on a 0.8 % agarose gel (containing 0.2 µg/ml

ethidium bromide) in 1X TBE. Photograph the gel under UV, including a high exposure

time in order to see faint bands.

To be able to score a NCO, both corresponding second sets of PCRs performed with

ASO_AL2/ASO_BSnpLo and ASO_AR2/ASO_BSnpUp must be positive for the same

PCR product obtained in PCR1 (See Figure 5). The NCO rate is the ratio between the numbers of clear positive reactions to the numbers

of input genomes in PCR1.

NCO tract lengths are measured by sequencing PCR products.

3.3.2 NCO detection by cloning and high throughput genotyping

1. Allele-specific PCRs are first performed on pollen genomic DNAs. Typically a series of 12

to 24 reactions each containing one to ten thousands of amplificable molecules are

amplified with pairs of ASOs specific of one parental allele (Figure 3; See 3.3.1). Thus

after amplification, each set of reactions should contain a majority of non recombinant

PCR products of one parental allele and a minority of molecules with a NCO.

In parallel, amplify the same region on genomic DNAs extracted from leaves from each

parent. It will be used as a control in a fluorescent assay.

2. Then, analyze PCR products by gel electrophoresis. If amplifications are of good quality

(one unique strong band at the correctsize), pool PCR products with the same pair of

ASOs. If after the first allelic PCRs, several bands are seen after gel electrophoresis,

perform a second nested PCRs with ASOs specific to the same parental allele and control

the quality of the amplification.

3. Clone the region containing the polymorphisms to be tested for NCOs in an appropriate

cloning vector. It is important to optimize the efficiency of cloning by choosing the

appropriate restriction enzyme(s), dephosphorylating the vector ends before ligation or,

using additional restriction enzyme to avoid self-circularization of the cloning vector. The

efficiency of cloning should be estimated before further analyses either from blue/white

screening or by PCR on E. coli colonies using adapted primers. Ideally, each colony will

represent a unique molecule isolated from pollen genomic DNA. The number of haploid

genomes to be put in the first specific PCR depends on the number of colonies that will be tested. The ratio of the number of colonies to the number of amplifiable haploid genomes

will give the probability (m) that a given pollen genomic DNA is represented among the

selected clones. (15) have estimated the probability that any 2 (n) clones are duplicate from

the same initial event using a Poisson estimation P(n) = (e-m mn) n!. In the example they

provided, they showed that when 14 meiotic events were detected in 500 clones obtained

from an input of 11,000 haploid genome, there is a 1,25% probability that the 14 clones

contain a duplicate of the same initial molecule.

In parallel, clone the parental allele from the PCR with leaf DNA.

4. E. coli colonies are grown in 96 wells plates at 37°C for 18 hours. Cultures are then

centrifuged and the cell pellet is resuspended in water. There is no need to extract plasmid

DNA, genotyping is performed directly on cell cultures

5. Clones are then genotyped using the “Chemicon Amplifluor SNPs Genotyping System”

(Millipore). For this purpose, oligonucleotides specific to each parent at polymorphisms

(SNPs) to be tested, are designed using the Amplifluor AssayArchitect software

(https://apps.serologicals.com/AAA/). SNP genotyping is performed using the

recommended protocol of the manufacturer. Appropriate controls should be put in each

plate: at least 2 wells with only H20 and 2 wells with cloned fragments of each parental

allele in each plate. To increase the number of colonies genotyped it is possible to pool 2 or

3 colonies in the same well after cell growth. In this case, positive colonies give a

"heterozygous" signal (Figure 5).

The same plates can be used to genotype several different polymorphisms independently.

6. DNA from "positive" colonies is then extracted to (i) confirm the NCO event by DNA

sequencing and (ii) estimate the length of the meiotic event.

Acknowledgments We thank members of the group "Méiose et Recombinaison" for helpful discussions. The research leading to this work has received funding from the (European Community’s)

(European Atomic Energy Community’s) Seventh Framework Programme (FP7/2007-2013

FP7/2007-2011) under grant agreement number KBBE-2009-222883.

Figure 1. Schematic cartoon of CO and NCO formation during meiosis

In early prophase I of meiosis, recombination is initiated by programmed DNA double strand brakes (DSBs) made by the Spo11 protein. Repair of these DSBs using the homologous chromosome as a template leads to the formation of crossover or non-crossover.

Figure 2. Outline of the Sperm typing and Pollen typing assay

A. Genomic DNA is extracted from sperm or pollen of an F1 (A x B) hybrid. At a given hotspot locus, the majority of the molecules contains either the A or the B allele. Only a few molecules contain a crossover or a non crossover event.

B. Crossover molecules are specifically amplified using two rounds of PCRs with specific

ASOs (Allele Specific Oligonucleotide; ). (a) ASO_AL1, (b) ASO_AL2, (c) ASO_BR2, (d)

ASO_BR1

C. PCR fragments are then sequenced to analyze the distribution of exchange points

Figure 3. Isolation of NCO events from pollen DNA.

Genomic DNA is extracted from F1 (A x B) pollen. At a given hotspot locus, the majority of the molecules contain either the A or the B allele. A small fraction of the molecules contain a

NCO event. Molecules containing flanking polymorphisms of the same allele are specifically amplified with one or two rounds of PCRs with ASOs (1).

A. NCOs are specifically amplified using two different PCRs conducted in parallel with two sets of ASOs. PCR a with (b) ASO_AL2 and (f) ASO_BSnpLo and PCR b with ASO_AR2 and ASOBSnpp. To score a NCO, both PCRs have to be positive. Both PCRs fragments are then sequenced to measure the length of the conversion event.

B. The central region containing the polymorphism to be tested for a NCO is cloned into a vector and individual clones are genotyped using fluorescence at polymorphic sites. DNA is extracted from positive clones and sequenced to confirm the NCO event and measure its length.

Figure 4. Detection of SNPs using PCRs

PCRs have been performed as described in Figure 3. Typical results are shown in the photos of the two gels. Positive reactions correspond to the same well on each gel. They are considered to be true NCOs

Figure 5. Detection of NCOs using fluorescence.

A. Fluorescent assay on individual colonies. Controls with only water do not exhibit any fluorescence. Parental control clones exhibit specific fluorescence. Clones containing non- recombinant DNA exhibit the same fluorescence as the parental A allele whereas positive clones exhibit fluorescence of the B allele.

B. Fluorescent assay on pooled clones. Controls are the same as above. After cell growth, clones are pooled by two or three. Pools that do not contain any recombinant molecules exhibit the same fluorescence as the parental A allele whereas pools containing one recombinant molecule clone exhibit fluorescence for both A and B alleles.

References 1. Baudat, F., and de Massy, B. (2007) Regulating double-stranded DNA break repair towards crossover or non-crossover during mammalian meiosis, Chromosome Res 15, 565-577. 2. Kauppi, L., May, C. A., and Jeffreys, A. J. (2009) Analysis of meiotic recombination products from human sperm, In Methods Mol Biol, pp 323-355. 3. Arnheim, N., Calabrese, P., and Tiemann-Boege, I. (2007) Mammalian meiotic recombination hot spots, Annu Rev Genet 41, 369-399. 4. de Muyt, A. D., Mercier, R., Mezard, C., and Grelon, M. (2009) Meiotic recombination and crossovers in plants, In Meiosis, pp 14-25. 5. Jeffreys, A. J., and Neumann, R. (2005) Factors influencing recombination frequency and distribution in a human meiotic crossover hotspot, Hum Mol Genet 14, 2277-2287. 6. Jeffreys, A. J., Holloway, J. K., Kauppi, L., May, C. A., Neumann, R., Slingsby, M. T., and Webb, A. J. (2004) Meiotic recombination hot spots and human DNA diversity, Philos Trans R Soc Lond B Biol Sci 359, 141-152. 7. Holloway, K., Lawson, V. E., and Jeffreys, A. J. (2006) Allelic recombination and de novo deletions in sperm in the human {beta}-globin gene region, Hum Mol Genet 15, 1099-1111. 8. Dooner, H. K. (1986) Genetic fine structure of the BRONZE locus in maize, Genetics 113, 1021-1036. 9. Xu, X., Hsia, A. P., Zhang, L., Nikolau, B. J., and Schnable, P. S. (1995) Meiotic recombination break points resolve at high rates at the 5' end of a maize coding sequence, Plant Cell 7, 2151-2161. 10. Dooner, H. K., and Martinez-Ferez, I. M. (1997) Recombination occurs uniformly within the bronze gene, a meiotic recombination hotspot in the maize genome, Plant Cell 9, 1633-1646. 11. Yao, H., Zhou, Q., Li, J., Smith, H., Yandeau, M., Nikolau, B. J., and Schnable, P. S. (2002) Molecular characterization of meiotic recombination across the 140-kb multigenic a1-sh2 interval of maize, Proc Natl Acad Sci U S A 99, 6157-6162. 12. Cole, F., and Jasin, M. (2011) Isolation of meiotic recombinants from mouse sperm, Methods Mol Biol 745, 251-282. 13. Drouaud, J., and Mezard, C. (2011) Characterization of meiotic crossovers in pollen from Arabidopsis thaliana, Methods Mol Biol 745, 223-249. 14. Baudat, F., and de Massy, B. (2009) Parallel detection of crossovers and noncrossovers in mouse germ cells, In Methods Mol Biol, pp 305-322. 15. Ng, S. H., Parvanov, E., Petkov, P. M., and Paigen, K. (2008) A quantitative assay for crossover and noncrossover molecular events at individual recombination hotspots in both male and female gametes, Genomics 92, 204-209. 16. Baudat, F., and de Massy, B. (2007) Cis- and Trans-Acting elements regulate the mouse Psmb9 meiotic recombination hotspot, PLoS Genet 3, e100.

Double Strand Breaks

Spo11 Repair

Crossover (CO) Non Crossover (NCO) allele A F1 allele B

A

NCOs

(a) (b) COs

(c) (d) B

C cM/Mb

Kb allele A F1 allele B

Pollen DNA extraction

(a)

(d) 1

A B

(b) (e)

(f) (c)

+ SNP Fluoresent genotyping + Control: H Control: Control: H Control: A

Fluorescence of the allele A B Fluorescence of the allele A Control: allele allele Control: A Control: allele allele Control: A x x x x x x x 2 2 O O Fluorescence of the allele B of the allele Fluorescence Fluorescence of the allele B of the allele Fluorescence x Positive clone clone Positive Positive clone clone Positive x x x x x x x x x x x x x x Non Recombinant Recombinant Non Control: allele B allele Control: Non Recombinant Recombinant Non Control: allele B allele Control:

ANNEXES

137 Table S1. Primers used

Name Sequence 5'->3' Genomic coordinates 130x0CoL1 CGCACACTCTTATCCAACA 171696..171714 130x0LeL1 CGCACATTCTTTTCCAACT 171696..171714 130x7CoL4 GCCTAATGTTCCCATGGATCTC 173199..173220 130x7LeL5 TGCCTAATGTTCCCATATCGAAG 173198..173220 130x21CoL1 CGACCAAGATTGGATAAAAAG 176005..176025 130x21LeL5 TCGACCAAGATTGGAAC 176004..176033 130x21CoR5 GGTACTGTTCCAATCTTTTTATC 176039..176017 130x21LeR6 GCAAGGTACTGTTCCAACCTCGG 176043..176007 130x42LeL1 GGTTACAGAGAATTGTAGAATGTTAC 178103..178128 130x43CoL1 GAAGAATGGATTATATAATACAGTACAGTATAT 178138..178170 130x44CoL4 AGTGGTCACTGTCGCTATAG 178344..178363 130x44LeL4 AAGTGGTCACTGTCGTTATAC 178343..178363 130x44CoR1 CAGCCCAATAGAATCCTATAG 178378..178358 130x44LeR4 CAGCCCAATAGAATCGTATAA 178378..178358 130x47CoR2 TGAGTCAATTAACGTGGTTTGCA 179054..179032 130x47LeR4 GTCGATGTCGTTGGTTTTCG 179051..179032 130x52CoL1 GAATCTTGTCTCACATGTCTAGA 180401..180420 130x52CoR1 TGTCAGAGATATTCTAGACATGTGA 180432..180408 130x52LeR2 TGGGATGAATTTTGTCAGAGCTATGA 180444..180408 130x72CoR2 GGGCCAATACATTGGAAACA 185687..185668 130x72LeR2 CAAGGGCCAAGACCTCAATC 185690..185666 130x76CoR1 TTTCTTCCTATACTCGTCTCAG 185981..185960 130x78LeR3 GGTTGGCCTCTTCTG 186005..185991 14a8CoL3 TATGCAACAGCAACACG 16691260..16691275 14a5LeL3 CCCCCCGCATGCTTTTACATTATA 16691245..16691263 14a9Col2 ACCGTTGCACTTTCCTT 16692501..16692519 14a9LeL2 CCGAAACCGTTGCACTATG 16692500..16692530 14a23CoL1 CCCCCGATTTATTCATACATATC 16692993..16693014 14a23LeL1 CCCCCGATTTATTCGTACATATT 16692993..16693014 14a54CoR2 GGAGAGCTAATGCAGGC 16699799..16699783 14a54LeR2 GCAGAGCCAATGCGGGT 16699799..16699783 14a63CoR3 CCTTTTGTTCGTACAGGTG 16700136..16700120 14a63LeR3 CCCCCATTTTGTTCGCAAAGGTA 16700137..16700120 14aKH1L ATCTTATAAACGTTATTGTCA 16692865..16682885 14aKH2bisR TTCGCCCGGCAAACTATTCC 16695220..16695202

Table S2. Primers for allele-specific PCR. Annealing Annealing Target: First PCR temperature Second PCR temperature Parental (°C) (°C) or CO 130x0CoL1-130x76CoR1 61 130x7CoL4-130x72CoR2 58 parental 130x0LeL1-130x78LeR3 58 130x7LeL5-130x72LeR2 58 parental 130x0CoL1-78LeR3 59 130x7CoL4-130x72LeR2 64 CO 130x0LeL1-130x76CoR1 59 130x7LeL5-130x72CoR2 64 CO 130x0CoL1-130x52LeR2 59 130x7CoL4-130x47LeR4 59 CO 130x0LeL1-130x52CoR1 59 130x7LeL5-130x47CoR2 59 CO 130x43CoL1-130x78LeR3 57 130x44CoL4-130x72LeR2 60 CO 130x42LeL1-130x78CoR2 58 130x44LeL4-130x72CoR2 58 CO 14a9Col2-14a63CoR3 58 14a23CoL1-14a54CoR2 58 parental 14a9LeL2-14a63LeR3 58 14a23LeL1-14a54LeR2 58 parental 14a8CoL3-14a63LeR3 61 14a9Col2-14a54LeR2 58 CO 14a5LeL3-14a63CoR3 61 14a9LeL2-14a54CoR2 58 CO

Table S3. Primers for NCO detection at 130x. Ler to Col to Ler Annealing Annealing First PCR temperature Second PCR temperature (°C) (°C) 130x7LeL5-130x44CoR1 66 130x44CoL4-130x52LeR2 61 130x7LeL5-130x52CoR1 66 130x0LeL1-130x78LeR3 58 130x52CoL1-130x72LeR2 59 130x7LeL5-130x21CoR5 63 130x21CoL1-130x44LeR4 64

Col to Ler to Col Annealing Annealing First PCR temperature Second PCR temperature (°C) (°C) 130x7CoL4-130x44LeR4 62 130x44LeL4-130x52CoR1 61 130x0CoL1-130x76CoR1 61 130x7CoL4-130x21LeR6 60 130x21LeL5-130x44CoR1 63

Table S4. Oligonucleotides used for NCO characterization at 14a1 hotspot Oligonucleotides SNP Name Sequence Y-1201-Col GAAGGTCGGAGTCAACGGATTTTTCTTTTCGGCTTTTTCGG #35 Y-1201-Ler GAAGGTGACCAAGTTCATGCTATTTCTTTTCGGCTTTTTCGA Y-1021-Rev GTGCCCTGCCATAATTCCACT Y-1675-Col GAAGGTGACCAAGTTCATGCTAGACTGATCGTTACCGATTCCG #37 Y-1675-Ler GAAGGTCGGAGTCAACGGATTAGACTGATCGTTACCGATTCCA Y-1675-Rev GGTGGGTCCGAGTTTTCTTTGAT M-768-Col GAAGGTGACCAAGTTCATGCTTCTTAGTCAAACAATGGCTTAG #33 M-768-Ler GAAGGTCGGAGTCAACGGATTCTTAGTCAAACAATGGCTTAT M-768-Rev ATGGCTGTTTTCGTTTCCTTAT

REFERENCES

Alexander, M.P. (1969). Differential staining of aborted and non aborted pollen. Stain Technology 44, 117-122. Allard, R.W. (1963). Evidence for Genetic Restriction of Recombination in the Lima Bean. Genetics 48, 1389-1395. Allers, T., and Lichten, M. (2001a). Differential timing and control of noncrossover and crossover recombination during meiosis. Cell 106, 47-57. Allers, T., and Lichten, M. (2001b). Intermediates of yeast meiotic recombination contain heteroduplex DNA. Mol Cell 8, 225-231. Alonso, J.M., and Ecker, J.R. (2006). Moving forward in reverse: genetic technologies to enable genome-wide phenomic screens in Arabidopsis. Nat Rev Genet 7, 524-536. Anderson, L.K., and Stack, S.M. (2005). Recombination nodules in plants. Cytogenetic Genome Research 109. Anderson, L.K., Hooker, K.D., and Stack, S.M. (2001). The distribution of early recombination nodules on zygotene bivalents from plants. Genetics 159, 1259-1269. Anderson, L.K., Offenberg, H.H., Verkuijlen, W.M., and Heyting, C. (1997). RecA-like proteins are components of early meiotic nodules in lily. Proc Natl Acad Sci U S A 94, 6868-6873. Anderson, L.K., Reeves, A., Webb, L.M., and Ashley, T. (1999). Distribution of crossing over on mouse synaptonemal complexes using immunofluorescent localization of MLH1 protein. Genetics 151, 1569-1579. Anderson, L.K., Doyle, G.G., Brigham, B., Carter, J., Hooker, K.D., Lai, A., Rice, M., and Stack, S.M. (2003). High-resolution crossover maps for each bivalent of Zea mays using recombination nodules. Genetics 165, 849-865. Araki, Y., Takahashi, S., Kobayashi, T., Kajiho, H., Hoshino, S., and Katada, T. (2001). Ski7p G protein interacts with the exosome and the Ski complex for 3'-to-5' mRNA decay in yeast. EMBO J 20, 4684-4693. Argueso, J.L., Wanat, J., Gemici, Z., and Alani, E. (2004). Competing crossover pathways act during meiosis in Saccharomyces cerevisiae. Genetics 168, 1805-1816. Argueso, J.L., Kijas, A.W., Sarin, S., Heck, J., Waase, M., and Alani, E. (2003). Systematic mutagenesis of the Saccharomyces cerevisiae MLH1 gene reveals distinct roles for Mlh1p in meiotic crossing over and in vegetative and meiotic mismatch repair. Mol Cell Biol 23, 873-886. Armstrong, S.J., Franklin, F.C., and Jones, G.H. (2001). Nucleolus-associated telomere clustering and pairing precede meiotic chromosome synapsis in Arabidopsis thaliana. J Cell Sci 114, 4207- 4217. Armstrong, S.J., Franklin, F.C.H., and Jones, G.H. (2003). A meiotic time-course for Arabidopsis thaliana. Sexual Plant Reproduction 16, 141-149. Armstrong, S.J., Caryl, A.P., Jones, G.H., and Franklin, F.C. (2002). Asy1, a protein required for meiotic chromosome synapsis, localizes to axis-associated chromatin in Arabidopsis and Brassica. J Cell Sci 115, 3645-3655. Avramova, Z. (2009). Evolution and pleiotropy of TRITHORAX function in Arabidopsis. Int J Dev Biol 53, 371-381. Axelsson, E., Webster, M.T., Ratnakumar, A., Consortium, L., Ponting, C.P., and Lindblad-Toh, K. (2011). Death of PRDM9 coincides with stabilization of the recombination landscape in the dog genome. Genome Res. Bachrati, C.Z., Borts, R.H., and Hickson, I.D. (2006). Mobile D-loops are a preferred substrate for the Bloom's syndrome helicase. Nucleic Acids Res 34, 2269-2279.

142 Bai, X., Peirson, B.N., Dong, F., Xue, C., and Makaroff, C.A. (1999). Isolation and characterization of SYN1, a RAD21-like gene essential for meiosis in Arabidopsis. Plant Cell 11, 417-430. Bannister, L.A., and Schimenti, J.C. (2004). Homologous recombinational repair proteins in mouse meiosis. Cytogenet Genome Res 107, 191-200. Barber, L.J., Youds, J.L., Ward, J.D., McIlwraith, M.J., O'Neil, N.J., Petalcorin, M.I., Martin, J.S., Collis, S.J., Cantor, S.B., Auclair, M., Tissenbaum, H., West, S.C., Rose, A.M., and Boulton, S.J. (2008). RTEL1 maintains genomic stability by suppressing homologous recombination. Cell 135, 261-271. Basheva, E.A., Bidau, C.J., and Borodin, P.M. (2008). General pattern of meiotic recombination in male dogs estimated by MLH1 and RAD51 immunolocalization. Chromosome Res 16, 709- 719. Baudat, F., and Nicolas, A. (1997). Clustering of meiotic double-strand breaks on yeast chromosome III. Proc Natl Acad Sci U S A 94, 5213-5218. Baudat, F., and de Massy, B. (2007a). Regulating double-stranded DNA break repair towards crossover or non-crossover during mammalian meiosis. Chromosome Res 15, 565-577. Baudat, F., and de Massy, B. (2007b). Cis- and Trans-Acting Elements Regulate the Mouse Psmb9 Meiotic Recombination Hotspot. PLoS Genet 3, e100. Baudat, F., and de Massy, B. (2009). Parallel detection of crossovers and noncrossovers in mouse germ cells. In Methods Mol Biol, pp. 305-322. Baudat, F., Manova, K., Yuen, J.P., Jasin, M., and Keeney, S. (2000). Chromosome synapsis defects and sexually dimorphic meiotic progression in mice lacking Spo11. Mol Cell 6, 989-998. Baudat, F., Buard, J., Grey, C., Fledel-Alon, A., Ober, C., Przeworski, M., Coop, G., and de Massy, B. (2010). PRDM9 is a major determinant of meiotic recombination hotspots in humans and mice. Science 327, 836-840. Bennett, M.D. (1971). The Duration of Meiosis. Proceedings of the Royal Society of London 178, 277- 299. Bennett, M.D. (1977). The time and duration of meiosis. Philos Trans R Soc Lond B Biol Sci 277, 201- 226. Berchowitz, L.E., and Copenhaver, G.P. (2008). Fluorescent Arabidopsis tetrads: a visual assay for quickly developing large crossover and crossover interference data sets. Nat Protoc 3, 41-50. Berchowitz, L.E., Francis, K.E., Bey, A.L., and Copenhaver, G.P. (2007). The role of AtMUS81 in interference-insensitive crossovers in A. thaliana. PLoS Genet 3, e132. Berg, I.L., Neumann, R., Sarbajna, S., Odenthal-Hesse, L., Butler, N.J., and Jeffreys, A.J. (2011). Variants of the protein PRDM9 differentially regulate a set of human meiotic recombination hotspots highly active in African populations. Proc Natl Acad Sci U S A. Berg, I.L., Neumann, R., Lam, K.W., Sarbajna, S., Odenthal-Hesse, L., May, C.A., and Jeffreys, A.J. (2010). PRDM9 variation strongly influences recombination hot-spot activity and meiotic instability in humans. Nat Genet. Bergerat, A., de Massy, B., Gadelle, D., Varoutas, P.C., Nicolas, A., and Forterre, P. (1997). An atypical topoisomerase II from Archaea with implications for meiotic recombination. Nature 386, 414-417. Beye, M., Gattermeier, I., Hasselmann, M., Gempe, T., Schioett, M., Baines, J.F., Schlipalius, D., Mougel, F., Emore, C., Rueppell, O., Sirvio, A., Guzman-Novoa, E., Hunt, G., Solignac, M., and Page, R.E. (2006). Exceptionally high levels of recombination across the honey bee genome. Genome Res 16, 1339-1344. Bhatt, A.M., Lister, C., Page, T., Fransz, P., Findlay, K., Jones, G.H., Dickinson, H.G., and Dean, C. (1999). The DIF1 gene of Arabidopsis is required for meiotic chromosome segregation and belongs to the REC8/RAD21 cohesin gene family. Plant J 19, 463-472. Birtle, Z., and Ponting, C.P. (2006). Meisetz and the birth of the KRAB motif. Bioinformatics 22, 2841- 2845. Bleuyard, J.Y., and White, C.I. (2004). The Arabidopsis homologue of Xrcc3 plays an essential role in meiosis. Embo J 23, 439-449.

143 Bleuyard, J.Y., Gallego, M.E., Savigny, F., and White, C.I. (2005). Differing requirements for the Arabidopsis Rad51 paralogs in meiosis and DNA repair. Plant J 41, 533-545. Blitzblau, H.G., Bell, G.W., Rodriguez, J., Bell, S.P., and Hochwagen, A. (2007). Mapping of meiotic single-stranded DNA reveals double-stranded-break hotspots near centromeres and telomeres. Curr Biol 17, 2003-2012. Boehnke, M., Arnheim, N., Li, H., and Collins, F.S. (1989). Fine-structure genetic mapping of human chromosomes using the polymerase chain reaction on single sperm: experimental design considerations. Am J Hum Genet 45, 21-32. Bois, P.R. (2007). A highly polymorphic meiotic recombination mouse hot spot exhibits incomplete repair. Mol Cell Biol 27, 7053-7062. Borde, V. (2007). The multiple roles of the Mre11 complex for meiotic recombination. Chromosome Res 15, 551-563. Borde, V., Robine, N., Lin, W., Bonfils, S., Geli, V., and Nicolas, A. (2009). Histone H3 lysine 4 trimethylation marks meiotic recombination initiation sites. Embo J 28, 99-111. Borner, G.V., Kleckner, N., and Hunter, N. (2004). Crossover/noncrossover differentiation, synaptonemal complex formation, and regulatory surveillance at the leptotene/zygotene transition of meiosis. Cell 117, 29-45. Borodin, P.M., Basheva, E.A., and Zhelezova, A.I. (2009). Immunocytological analysis of meiotic recombination in the American mink (Mustela vison). Anim Genet 40, 235-238. Borodin, P.M., Karamysheva, T.V., Belonogova, N.M., Torgasheva, A.A., Rubtsov, N.B., and Searle, J.B. (2008). Recombination map of the common shrew, Sorex araneus (Eulipotyphla, Mammalia). Genetics 178, 621-632. Borts, R.H., and Haber, J.E. (1989). Length and distribution of meiotic gene conversion tracts and crossovers in Saccharomyces cerevisiae. Genetics 123, 69-80. Bosch, E., Hurles, M.E., Navarro, A., and Jobling, M.A. (2004). Dynamics of a human interparalog gene conversion hotspot. Genome Res 14, 835-844. Boulton, A., Myers, R.S., and Redfield, R.J. (1997). The hotspot conversion paradox and the evolution of meiotic recombination. Proc Natl Acad Sci U S A 94, 8058-8063. Bridges, C.B. (1935). Salivary chromosome maps with a key to the banding of the chromosomes of Drosophila melanogaster. J Hered 26, 60-64. Buard, J., Barthes, P., Grey, C., and de Massy, B. (2009). Distinct histone modifications define initiation and repair of meiotic recombination in the mouse. Embo J. Buhler, C., Borde, V., and Lichten, M. (2007). Mapping Meiotic Single-Strand DNA Reveals a New Landscape of DNA Double-Strand Breaks in Saccharomyces cerevisiae. PLoS Biol 5, e324. Busygina, V., Sehorn, M.G., Shi, I.Y., Tsubouchi, H., Roeder, G.S., and Sung, P. (2008). Hed1 regulates Rad51-mediated recombination via a novel mechanism. Genes Dev 22, 786-795. Cao, L., Alani, E., and Kleckner, N. (1990). A pathway for generation and processing of double-strand breaks during meiotic recombination in S. cerevisiae. Cell 61, 1089-1101. Carpenter, A.T. (1975). Electron microscopy of meiosis in Drosophila melanogaster females: II. The recombination nodule--a recombination-associated structure at pachytene? Proc Natl Acad Sci U S A 72, 3186-3189. Cartagena, J.A., Matsunaga, S., Seki, M., Kurihara, D., Yokoyama, M., Shinozaki, K., Fujimoto, S., Azumi, Y., Uchiyama, S., and Fukui, K. (2008). The Arabidopsis SDG4 contributes to the regulation of pollen tube growth by methylation of histone H3 lysines 4 and 36 in mature pollen. Dev Biol 315, 355-368. Caryl, A.P., Armstrong, S.J., Jones, G.H., and Franklin, F.C. (2000). A homologue of the yeast HOP1 gene is inactivated in the Arabidopsis meiotic mutant asy1. Chromosoma 109, 62-71. Cazzonelli, C.I., Millar, T., Finnegan, E.J., and Pogson, B.J. (2009). Promoting gene expression in plants by permissive histone lysine methylation. Plant Signal Behav 4, 484-488. Celerin, M., Merino, S.T., Stone, J.E., Menzie, A.M., and Zolan, M.E. (2000). Multiple roles of Spo11 in meiotic chromosome behavior. Embo J 19, 2739-2750.

144 Chaibunruang, A., Srivorakun, H., Fucharoen, S., Fucharoen, G., Sae-ung, N., and Sanchaisuriya, K. (2010). Interactions of hemoglobin Lepore (deltabeta hybrid hemoglobin) with various hemoglobinopathies: A molecular and hematological characteristics and differential diagnosis. Blood Cells Mol Dis 44, 140-145. Chakravarti, A., Elbein, S.C., and Permutt, M.A. (1986). Evidence for increased recombination near the human insulin gene: implication for disease association studies. Proc Natl Acad Sci U S A 83, 1045-1049. Chakravarti, A., Buetow, K.H., Antonarakis, S.E., Waber, P.G., Boehm, C.D., and Kazazian, H.H. (1984). Nonuniform recombination within the human beta-globin gene cluster. Am J Hum Genet 36, 1239-1258. Chance, P.F., Abbas, N., Lensch, M.W., Pentao, L., Roa, B.B., Patel, P.I., and Lupski, J.R. (1994). Two autosomal dominant neuropathies result from reciprocal DNA duplication/deletion of a region on chromosome 17. Hum Mol Genet 3, 223-228. Chelysheva, L., Vezon, D., Belcram, K., Gendrot, G., and Grelon, M. (2008). The Arabidopsis BLAP75/Rmi1 homologue plays crucial roles in meiotic double-strand break repair. PLoS Genet 4, e1000309. Chelysheva, L., Gendrot, G., Vezon, D., Doutriaux, M.P., Mercier, R., and Grelon, M. (2007). Zip4/Spo22 is required for class I CO formation but not for synapsis completion in Arabidopsis thaliana. PLoS Genet 3, e83. Chelysheva, L., Diallo, S., Vezon, D., Gendrot, G., Vrielynck, N., Belcram, K., Rocques, N., Marquez- Lema, A., Bhatt, A.M., Horlow, C., Mercier, R., Mezard, C., and Grelon, M. (2005). AtREC8 and AtSCC3 are essential to the monopolar orientation of the kinetochores during meiosis. J Cell Sci 118, 4621-4632. Chen, J., Cooper, D.N., Chuzhanova, N., Férec, C., and Patrinos, G.P. (2007). Gene conversion: mechanisms, evolution and human disease. Nature Reviews Genetics 8, 762-775. Chen, S.Y., Tsubouchi, T., Rockmill, B., Sandler, J.S., Richards, D.R., Vader, G., Hochwagen, A., Roeder, G.S., and Fung, J.C. (2008). Global analysis of the meiotic crossover landscape. Dev Cell 15, 401-415. Chen, Y.K., Leng, C.H., Olivares, H., Lee, M.H., Chang, Y.C., Kung, W.M., Ti, S.C., Lo, Y.H., Wang, A.H., Chang, C.S., Bishop, D.K., Hsueh, Y.P., and Wang, T.F. (2004). Heterodimeric complexes of Hop2 and Mnd1 function with Dmc1 to promote meiotic homolog juxtaposition and strand assimilation. Proc Natl Acad Sci U S A 101, 10572-10577. Chen, Y.T., Venditti, C.A., Theiler, G., Stevenson, B.J., Iseli, C., Gure, A.O., Jongeneel, C.V., Old, L.J., and Simpson, A.J. (2005). Identification of CT46/HORMAD1, an immunogenic cancer/testis antigen encoding a putative meiosis-related protein. Cancer Immun 5, 9. Chi, P., San Filippo, J., Sehorn, M.G., Petukhova, G.V., and Sung, P. (2007). Bipartite stimulatory action of the Hop2-Mnd1 complex on the Rad51 recombinase. Genes Dev 21, 1747-1757. Codina-Pascual, M., Campillo, M., Kraus, J., Speicher, M.R., Egozcue, J., Navarro, J., and Benet, J. (2006). Crossover frequency and synaptonemal complex length: their variability and effects on human male meiosis. Mol Hum Reprod 12, 123-133. Cohen, P.E., Pollack, S.E., and Pollard, J.W. (2006). Genetic analysis of chromosome pairing, recombination, and cell cycle control during first meiotic prophase in mammals. Endocr Rev 27, 398-426. Colaiacovo, M.P., MacQueen, A.J., Martinez-Perez, E., McDonald, K., Adamo, A., La Volpe, A., and Villeneuve, A.M. (2003). Synaptonemal complex assembly in C. elegans is dispensable for loading strand-exchange proteins but critical for proper completion of recombination. Dev Cell 5, 463-474. Cole, F., Keeney, S., and Jasin, M. (2010). Comprehensive, Fine-Scale Dissection of Homologous Recombination Outcomes at a Hot Spot in Mouse Meiosis. Mol Cell 39, 700-710. Coop, G., and Myers, S.R. (2007). Live Hot, Die Young: Transmission Distortion in Recombination Hotspots. PLoS Genet 3, e35.

145 Copenhaver, G.P., Keith, K.C., and Preuss, D. (2000). Tetrad analysis in higher plants. A budding technology. Plant Physiol 124, 7-16. Couteau, F., Belzile, F., Horlow, C., Grandjean, O., Vezon, D., and Doutriaux, M.P. (1999). Random chromosome segregation without meiotic arrest in both male and female meiocytes of a dmc1 mutant of Arabidopsis. Plant Cell 11, 1623-1634. Cromie, G.A., Hyppa, R.W., Taylor, A.F., Zakharyevich, K., Hunter, N., and Smith, G.R. (2006). Single Holliday junctions are intermediates of meiotic recombination. Cell 127, 1167-1178. Cromie, G.A., Hyppa, R.W., Cam, H.P., Farah, J.A., Grewal, S.I., and Smith, G.R. (2007). A discrete class of intergenic DNA dictates meiotic DNA break hotspots in fission yeast. PLoS Genet 3, e141. Cullen, M., Perfetto, S.P., Klitz, W., Nelson, G., and Carrington, M. (2002). High-resolution patterns of meiotic recombination across the human major histocompatibility complex. Am J Hum Genet 71, 759-776. de Boer, E., Stam, P., Dietrich, A.J., Pastink, A., and Heyting, C. (2006). Two levels of interference in mouse meiotic recombination. Proc Natl Acad Sci U S A 103, 9607-9612. de los Santos, T., Hunter, N., Lee, C., Larkin, B., Loidl, J., and Hollingsworth, N.M. (2003). The Mus81/Mms4 endonuclease acts independently of double-Holliday junction resolution to promote a distinct subset of crossovers during meiosis in budding yeast. Genetics 164, 81-94. de Massy, B. (2003). Distribution of meiotic recombination sites. Trends Genet 19, 514-522. de Muyt, A., Vezon, D., Gendrot, G., Gallois, J.L., Stevens, R., and Grelon, M. (2007). AtPRD1 is required for meiotic double strand break formation in Arabidopsis thaliana. Embo J 26, 4126- 4137. de Muyt, A., Pereira, L., Vezon, D., Chelysheva, L., Gendrot, G., Chambon, A., Laine-Choinard, S., Pelletier, G., Mercier, R., Nogue, F., and Grelon, M. (2009a). A High Throughput Genetic Screen Identifies New Early Meiotic Recombination Functions in Arabidopsis thaliana. PLoS Genet 5, e1000654. de Muyt, A.D., Mercier, R., Mezard, C., and Grelon, M. (2009b). Meiotic recombination and crossovers in plants. In Meiosis, pp. 14-25. Deeb, S.S. (2005). The molecular basis of variation in human color vision. Clin Genet 67, 369-377. Dernburg, A.F., McDonald, K., Moulder, G., Barstead, R., Dresser, M., and Villeneuve, A.M. (1998). Meiotic recombination in C. elegans initiates by a conserved mechanism and is dispensable for homologous chromosome synapsis. Cell 94, 387-398. Detloff, P., White, M.A., and Petes, T.D. (1992). Analysis of a gene conversion gradient at the HIS4 locus in Saccharomyces cerevisiae. Genetics 132, 113-123. Dooner, H.K. (1986). Genetic Fine Structure of the BRONZE Locus in Maize. Genetics 113, 1021-1036. Dooner, H.K. (2002). Extensive interallelic polymorphisms drive meiotic recombination into a crossover pathway. Plant Cell 14, 1173-1183. Dooner, H.K., and Martinez-Ferez, I.M. (1997). Recombination occurs uniformly within the bronze gene, a meiotic recombination hotspot in the maize genome. Plant Cell 9, 1633-1646. Dooner, H.K., and He, L. (2008). Maize genome structure variation: interplay between retrotransposon polymorphisms and genic recombination. Plant Cell 20, 249-258. Doutriaux, M.P., Couteau, F., Bergounioux, C., and White, C. (1998). Isolation and characterisation of the RAD51 and DMC1 homologs from Arabidopsis thaliana. Mol Gen Genet 257, 283-291. Dray, E., Siaud, N., Dubois, E., and Doutriaux, M.P. (2006). Interaction between Arabidopsis Brca2 and its partners Rad51, Dmc1, and Dss1. Plant Physiol 140, 1059-1069. Drouaud, J., and Mezard, C. (2011). Characterization of meiotic crossovers in pollen from Arabidopsis thaliana. In Methods Mol Biol, pp. 223-249. Drouaud, J., Mercier, R., Chelysheva, L., Berard, A., Falque, M., Martin, O., Zanni, V., Brunel, D., and Mezard, C. (2007). Sex-Specific Crossover Distributions and Variations in Interference Level along Arabidopsis thaliana Chromosome 4. PLoS Genet 3, e106. Drouaud, J., Camilleri, C., Bourguignon, P.Y., Canaguier, A., Berard, A., Vezon, D., Giancola, S., Brunel, D., Colot, V., Prum, B., Quesneville, H., and Mezard, C. (2006). Variation in crossing-

146 over rates across chromosome 4 of Arabidopsis thaliana reveals the presence of meiotic recombination "hot spots". Genome Res 16, 106-114. Dupaigne, P., Le Breton, C., Fabre, F., Gangloff, S., Le Cam, E., and Veaute, X. (2008). The Srs2 helicase activity is stimulated by Rad51 filaments on dsDNA: implications for crossover incidence during mitotic recombination. Mol Cell 29, 243-254. Duret, L., and Arndt, P.F. (2008). The impact of recombination on nucleotide substitutions in the human genome. PLoS Genet 4, e1000071. Duret, L., and Galtier, N. (2009). Biased Gene Conversion and the Evolution of Mammalian Genomic Landscapes. Annu Rev Genomics Hum Genet. Ehmsen, K., and Heyer, W.-D. (2008). Biochemistry of Meiotic Recombination: Formation, Processing, and Resolution of Recombination Intermediates. In Recombination and Meiosis - Models, Means, and Evolution, pp. 91-164. Eijpe, M., Offenberg, H., Jessberger, R., Revenkova, E., and Heyting, C. (2003). Meiotic cohesin REC8 marks the axial elements of rat synaptonemal complexes before cohesins SMC1beta and SMC3. J Cell Biol 160, 657-670. Elsea, S.H., and Girirajan, S. (2008). Smith-Magenis syndrome. Eur J Hum Genet 16, 412-421. Ernst, H. (1938). Meiosis und Crossing over Zytologische und genetische Untersuchungen an Antirrhinum majus L. Thesis. Feng, X., and Dickinson, H.G. (2007). Packaging the male germline in plants. Trends Genet 23, 503- 510. Ferguson, K.A., Leung, S., Jiang, D., and Ma, S. (2009). Distribution of MLH1 foci and inter-focal distances in spermatocytes of infertile men. Hum Reprod. Feuk, L. (2010). Inversion variants in the human genome: role in disease and genome architecture. Genome Med 2, 11. Francis, K.E., Lam, S.Y., Harrison, B.D., Bey, A.L., Berchowitz, L.E., and Copenhaver, G.P. (2007). Pollen tetrad-based visual assay for meiotic recombination in Arabidopsis. Proc Natl Acad Sci U S A 104, 3913-3918. Franklin, F.C., Higgins, J.D., Sanchez-Moran, E., Armstrong, S.J., Osman, K.E., Jackson, N., and Jones, G.H. (2006). Control of meiotic recombination in Arabidopsis: role of the MutL and MutS homologues. Biochem Soc Trans 34, 542-544. Fukushima, K., Tanaka, Y., Nabeshima, K., Yoneki, T., Tougan, T., Tanaka, S., and Nojima, H. (2000). Dmc1 of Schizosaccharomyces pombe plays a role in meiotic recombination. Nucleic Acids Res 28, 2709-2716. Game, J.C., Sitney, K.C., Cook, V.E., and Mortimer, R.K. (1989). Use of a ring chromosome and pulsed-field gels to study interhomolog recombination, double-strand DNA breaks and sister- chromatid exchange in yeast. Genetics 123, 695-713. Gaskell, L.J., Osman, F., Gilbert, R.J., and Whitby, M.C. (2007). Mus81 cleavage of Holliday junctions: a failsafe for processing meiotic recombination intermediates? Embo J 26, 1891-1901. Gerald, P.S., and Diamond, L.K. (1958). A new hereditary hemoglobinopathy (the Lepore trait) and its interaction with thalassemia trait. Blood 13, 835-844. Gerton, J.L., DeRisi, J., Shroff, R., Lichten, M., Brown, P.O., and Petes, T.D. (2000). Global mapping of meiotic recombination hotspots and coldspots in the yeast Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 97, 11383-11390. Giraut, L., Falque, M., Drouaud, J., Pereira, L., Martin, O.C., and Mezard, C. (2011). Genome-Wide Crossover Distribution in Arabidopsis thaliana Meiosis Reveals Sex-Specific Patterns along Chromosomes. PLoS Genet 7, e1002354. Goldman, S.L., and Smallets, S. (1979). Site specific induction of gene conversion: the effects of homozygosity of the ade6 mutant M26 of Schizosaccharomyces pombe on meiotic gene conversion. Mol Gen Genet 173, 221-225. Greenawalt, D.M., Cui, X., Wu, Y., Lin, Y., Wang, H.Y., Luo, M., Tereshchenko, I.V., Hu, G., Li, J.Y., Chu, Y., Azaro, M.A., Decoste, C.J., Chimge, N.O., Gao, R., Shen, L., Shih, W.J., Lange, K., and

147 Li, H. (2006). Strong correlation between meiotic crossovers and haplotype structure in a 2.5- Mb region on the long arm of chromosome 21. Genome Res 16, 208-214. Grelon, M., Vezon, D., Gendrot, G., and Pelletier, G. (2001). AtSPO11-1 is necessary for efficient meiotic recombination in plants. Embo J 20, 589-600. Grey, C., Baudat, F., and de Massy, B. (2009). Genome-wide control of the distribution of meiotic recombination. PLoS Biol 7, e35. Grey, C., Sommermeyer, V., Borde, V., and de Massy, B. (2011). [What defines the genetic map? The specification of meiotic recombination sites.]. Med Sci (Paris) 27, 63-69. Grini, P.E., Thorstensen, T., Alm, V., Vizcay-Barrena, G., Windju, S.S., Jorstad, T.S., Wilson, Z.A., and Aalen, R.B. (2009). The ASH1 HOMOLOG 2 (ASHH2) histone H3 methyltransferase is required for ovule and anther development in Arabidopsis. PLoS One 4, e7817. Grishchuk, A.L., and Kohli, J. (2003). Five RecA-like proteins of Schizosaccharomyces pombe are involved in meiotic recombination. Genetics 165, 1031-1043. Grishchuk, A.L., Kraehenbuehl, R., Molnar, M., Fleck, O., and Kohli, J. (2004). Genetic and cytological characterization of the RecA-homologous proteins Rad51 and Dmc1 of Schizosaccharomyces pombe. Curr Genet 44, 317-328. Guillon, H., and De Massy, B. (2002). An initiation site for meiotic crossing-over and gene conversion in the mouse. Nature Genetics 32, 296 - 299. Guillon, H., Baudat, F., Grey, C., Liskay, R.M., and de Massy, B. (2005). Crossover and Noncrossover Pathways in Mouse Meiosis. Mol Cell 20, 563-573. Guo, L., Yu, Y., Law, J.A., and Zhang, X. (2010). SET DOMAIN GROUP2 is the major histone H3 lysine [corrected] 4 trimethyltransferase in Arabidopsis. Proc Natl Acad Sci U S A 107, 18557-18562. Gupta, P.K., Rustgi, S., and Mir, R.R. (2008). Array-based high-throughput DNA markers for crop improvement. Heredity (Edinb) 101, 5-18. Haber, J. (2007). Multiple Mechanisms of Repairing Meganuclease-Induced Double-Strand DNA Breaks in Budding Yeast. In Molecular Genetics of Recombination, pp. 285-316. Haber, J. (2008). Evolution of Models of Homologous Recombination. In Recombination and Meiosis - Models, Means, and Evolution, pp. 1-64. Haber, J.E., Thorburn, P.C., and Rogers, D. (1984). Meiotic and mitotic behavior of dicentric chromosomes in Saccharomyces cerevisiae. Genetics 106, 185-205. Hartung, F., and Puchta, H. (2001). Molecular characterization of homologues of both subunits A (SPO11) and B of the archaebacterial topoisomerase 6 in plants. Gene 271, 81-86. Hartung, F., Suer, S., and Puchta, H. (2007). Two closely related RecQ helicases have antagonistic roles in homologous recombination and DNA repair in Arabidopsis thaliana. Proc Natl Acad Sci U S A 104, 18836-18841. Hartung, F., Suer, S., Bergmann, T., and Puchta, H. (2006). The role of AtMUS81 in DNA repair and its genetic interaction with the helicase AtRecQ4A. Nucleic Acids Res 34, 4438-4448. Hartung, F., Suer, S., Knoll, A., Wurz-Wildersinn, R., and Puchta, H. (2008). Topoisomerase 3alpha and RMI1 Suppress Somatic Crossovers and Are Essential for Resolution of Meiotic Recombination Intermediates in Arabidopsis thaliana. PLoS Genet 4, e1000285. Hassold, T., Hall, H., and Hunt, P. (2007). The origin of human aneuploidy: where we have been, where we are going. Hum Mol Genet 16, R203-208. Hayase, A., Takagi, M., Miyazaki, T., Oshiumi, H., Shinohara, M., and Shinohara, A. (2004). A protein complex containing Mei5 and Sae3 promotes the assembly of the meiosis-specific RecA homolog Dmc1. Cell 119, 927-940. Hayashi, K., Yoshida, K., and Matsui, Y. (2005). A histone H3 methyltransferase controls epigenetic events required for meiotic prophase. Nature 438, 374-378. He, L., and Dooner, H.K. (2009). Haplotype structure strongly affects recombination in a maize genetic interval polymorphic for Helitron and retrotransposon insertions. Proc Natl Acad Sci U S A 106, 8410-8416.

148 Henry, J.M., Camahort, R., Rice, D.A., Florens, L., Swanson, S.K., Washburn, M.P., and Gerton, J.L. (2006). Mnd1/Hop2 facilitates Dmc1-dependent interhomolog crossover formation in meiosis of budding yeast. Mol Cell Biol 26, 2913-2923. Heyer, W.D., Ehmsen, K.T., and Liu, J. (2010). Regulation of Homologous Recombination in Eukaryotes. Annu Rev Genet. Higgins, J.D., Armstrong, S.J., Franklin, F.C., and Jones, G.H. (2004). The Arabidopsis MutS homolog AtMSH4 functions at an early step in recombination: evidence for two classes of recombination in Arabidopsis. Genes Dev 18, 2557-2570. Higgins, J.D., Ferdous, M., Osman, K., and Franklin, F.C. (2011). The RecQ helicase AtRECQ4A is required to remove inter-chromosomal telomeric connections that arise during meiotic recombination in Arabidopsis. Plant J 65, 492-502. Higgins, J.D., Sanchez-Moran, E., Armstrong, S.J., Jones, G.H., and Franklin, F.C. (2005). The Arabidopsis synaptonemal complex protein ZYP1 is required for chromosome synapsis and normal fidelity of crossing over. Genes Dev 19, 2488-2500. Higgs, D.R., Old, J.M., Pressley, L., Clegg, J.B., and Weatherall, D.J. (1980). A novel alpha-globin gene arrangement in man. Nature 284, 632-635. Hoffmann, E.R., and Borts, R.H. (2004). Meiotic recombination intermediates and mismatch repair proteins. Cytogenet Genome Res 107, 232-248. Hollingsworth, N.M., Ponte, L., and Halsey, C. (1995). MSH5, a novel MutS homolog, facilitates meiotic reciprocal recombination between homologs in Saccharomyces cerevisiae but not mismatch repair. Genes Dev 9, 1728-1739. Holloway, K., Lawson, V.E., and Jeffreys, A.J. (2006). Allelic recombination and de novo deletions in sperm in the human β-globin gene region Human molecular genetics 15, 1099-1111. Hubert, R., MacDonald, M., Gusella, J., and Arnheim, N. (1994). High resolution localization of recombination hot spots using sperm typing. Nat Genet 7, 420-424. Hunter, N. (2007). Meiotic recombination. In Molecular Genetics of Recombination, A. Aguilera and R. Rothstein, eds (Berlin: Springer), pp. 381-442. Hunter, N., and Borts, R.H. (1997). Mlh1 is unique among mismatch repair proteins in its ability to promote crossing-over during meiosis. Genes Dev 11, 1573-1582. Hunter, N., and Kleckner, N. (2001). The single-end invasion: an asymmetric intermediate at the double-strand break to double-holliday junction transition of meiotic recombination. Cell 106, 59-70. Hurst, D.D., Fogel, S., and Mortimer, R.K. (1972). Conversion-associated recombination in yeast. Proc Natl Acad Sci U S A 69, 101-105. Innan, H. (2003). A two-locus gene conversion model with selection and its application to the human RHCE and RHD genes. Proc Natl Acad Sci U S A 100, 8793-8798. Inoue, K., and Lupski, J.R. (2002). Molecular mechanisms for genomic disorders. Annual Review of Genomics and Human Genetics 3, 199–242. Inoue, K., Dewar, K., Katsanis, N., Reiter, L.T., Lander, E.S., Devon, K.L., Wyman, D.W., Lupski, J.R., and Birren, B. (2001). The 1.4-Mb CMT1A duplication/HNPP deletion genomic region reveals unique genome architectural features and provides insights into the recent evolution of new genes. Genome Res 11, 1018-1033. Ira, G., Malkova, A., Liberi, G., Foiani, M., and Haber, J.E. (2003). Srs2 and Sgs1-Top3 suppress crossovers during double-strand break repair in yeast. Cell 115, 401-411. Ishiguro, K., and Watanabe, Y. (2007). Chromosome cohesion in mitosis and meiosis. J Cell Sci 120, 367-369. Jackson, N., Sanchez-Moran, E., Buckling, E., Armstrong, S.J., Jones, G.H., and Franklin, F.C. (2006). Reduced meiotic crossovers and delayed prophase I progression in AtMLH3-deficient Arabidopsis. Embo J 25, 1315-1323. Janssens, F.A. (1909). La théorie de la chiasmatypie. Nouvelle interprétation des cinèses de maturation. La Cellule 25, 389-411.

149 Jaramillo-Lambert, A., Ellefson, M., Villeneuve, A.M., and Engebrecht, J. (2007). Differential timing of S phases, X chromosome replication, and meiotic prophase in the C. elegans germ line. Dev Biol 308, 206-221. Jeffreys, A.J., and Neumann, R. (2002). Reciprocal crossover asymmetry and meiotic drive in a human recombination hot spot. Nat Genet 31, 267-271. Jeffreys, A.J., and May, C.A. (2004). Intense and highly localized gene conversion activity in human meiotic crossover hot spots. Nature Genetics 36, 151 - 156 Jeffreys, A.J., and Neumann, R. (2009). The rise and fall of a human recombination hot spot. Nat Genet. Jeffreys, A.J., Murray, J., and Neumann, R. (1998). High-resolution mapping of crossovers in human sperm defines a minisatellite-associated recombination hotspot. Mol Cell 2, 267-273. Jeffreys, A.J., Ritchie, A., and Neumann, R. (2000). High resolution analysis of haplotype diversity and meiotic crossover in the human TAP2 recombination hotspot. Hum Mol Genet 9, 725- 733. Jeffreys, A.J., Kauppi, L., and Neumann, R. (2001). Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet 29, 217-222. Jeffreys, A.J., Neumann, R., Panayi, M., Myers, S., and Donnelly, P. (2005). Human recombination hot spots hidden in regions of strong marker association. Nat Genet 37, 601-606. Jeffreys, A.J., Holloway, J.K., Kauppi, L., May, C.A., Neumann, R., Slingsby, M.T., and Webb, A.J. (2004). Meiotic recombination hot spots and human DNA diversity. Philos Trans R Soc Lond B Biol Sci 359, 141-152. Jensen-Seaman, M.I., Furey, T.S., Payseur, B.A., Lu, Y., Roskin, K.M., Chen, C.F., Thomas, M.A., Haussler, D., and Jacob, H.J. (2004). Comparative recombination rates in the rat, mouse, and human genomes. Genome Res 14, 528-538. Jolivet, S., Vezon, D., Froger, N., and Mercier, R. (2006). Non conservation of the meiotic function of the Ski8/Rec103 homolog in Arabidopsis. Genes Cells 11, 615-622. Kauppi, L., Sajantila, A., and Jeffreys, A.J. (2003). Recombination hotspots rather than population history dominate linkage disequilibrium in the MHC class II region. Hum Mol Genet 12, 33-40. Kauppi, L., Stumpf, M.P., and Jeffreys, A.J. (2005). Localized breakdown in linkage disequilibrium does not always predict sperm crossover hot spots in the human MHC class II region. Genomics 86, 13-24. Kauppi, L., Jasin, M., and Keeney, S. (2007). Meiotic crossover hotspots contained in haplotype block boundaries of the mouse genome. Proc Natl Acad Sci U S A 104, 13396-13401. Keeney, S. (2008). Spo11 and the Formation of DNA Double-Strand Breaks in Meiosis. In Recombination and Meiosis - Crossing-Over and Disjunction, pp. 81-123. Keeney, S., Giroux, C.N., and Kleckner, N. (1997). Meiosis-specific DNA double-strand breaks are catalyzed by Spo11, a member of a widely conserved protein family. Cell 88, 375-384. Kerzendorfer, C., Vignard, J., Pedrosa-Harand, A., Siwiec, T., Akimcheva, S., Jolivet, S., Sablowski, R., Armstrong, S., Schweizer, D., Mercier, R., and Schlogelhofer, P. (2006). The Arabidopsis thaliana MND1 homologue plays a key role in meiotic homologous pairing, synapsis and recombination. J Cell Sci 119, 2486-2496. Khademian, H., Giraut, L., Drouaud, J., and Mezard, C. (in press). Characterization of meiotic non crossover molecules from Arabidopsis thaliana pollen. Methods in Molecular Biology Plant Meiosis. Khurana, E., Lam, H.Y., Cheng, C., Carriero, N., Cayting, P., and Gerstein, M.B. (2010). Segmental duplications in the human genome reveal details of pseudogene formation. Nucleic Acids Res 38, 6997-7007. Kirkpatrick, D.T., Wang, Y.H., Dominska, M., Griffith, J.D., and Petes, T.D. (1999). Control of meiotic recombination and gene expression in yeast by a simple repetitive DNA sequence that excludes nucleosomes. Mol Cell Biol 19, 7661-7671. Kitajima, T.S., Yokobayashi, S., Yamamoto, M., and Watanabe, Y. (2003). Distinct cohesin complexes organize meiotic chromosome domains. Science 300, 1152-1155.

150 Klapholz, S., Waddell, C.S., and Esposito, R.E. (1985). The role of the SPO11 gene in meiotic recombination in yeast. Genetics 110, 187-216. Klein, F., Mahr, P., Galova, M., Buonomo, S.B., Michaelis, C., Nairz, K., and Nasmyth, K. (1999). A central role for cohesins in sister chromatid cohesion, formation of axial elements, and recombination during yeast meiosis. Cell 98, 91-103. Klimyuk, V.I., and Jones, J.D. (1997). AtDMC1, the Arabidopsis homologue of the yeast DMC1 gene: characterization, transposon-induced allelic variation and meiosis-associated expression. Plant J 11, 1-14. Klutstein, M., Shaked, H., Sherman, A., Avivi-Ragolsky, N., Shema, E., Zenvirth, D., Levy, A.A., and Simchen, G. (2008). Functional Conservation of the Yeast and Arabidopsis RAD54-Like Genes. Genetics 178, 2389-2397. Kochakpour, N., and Moens, P.B. (2008). Sex-specific crossover patterns in Zebrafish (Danio rerio). Heredity 100, 489-495. Kokotas, H., Grigoriadou, M., and Petersen, M. (2008). Meiotic Nondisjunction - The Major Cause of Trisomy 21. In Recombination and Meiosis - Crossing-Over and Disjunction, pp. 245-278. Kong, A., Gudbjartsson, D.F., Sainz, J., Jonsdottir, G.M., Gudjonsson, S.A., Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., Shlien, A., Palsson, S.T., Frigge, M.L., Thorgeirsson, T.E., Gulcher, J.R., and Stefansson, K. (2002). A high-resolution recombination map of the human genome. Nat Genet 31, 241-247. Krasikova, A., Barbero, J.L., and Gaginskaya, E. (2005). Cohesion proteins are present in centromere protein bodies associated with avian lampbrush chromosomes. Chromosome Res 13, 675- 685. Krichevsky, A., Gutgarts, H., Kozlovsky, S.V., Tzfira, T., Sutton, A., Sternglanz, R., Mandel, G., and Citovsky, V. (2007). C2H2 zinc finger-SET histone methyltransferase is a plant-specific chromatin modifier. Dev Biol 303, 259-269. Krogh, B.O., and Symington, L.S. (2004). Recombination proteins in yeast. Annual Review of Genetics 38, 233-271. Lakich, D., Kazazian, H.H., Jr., Antonarakis, S.E., and Gitschier, J. (1993). Inversions disrupting the factor VIII gene are a common cause of severe haemophilia A. Nat Genet 5, 236-241. Lauer, J., Shen, C.K., and Maniatis, T. (1980). The chromosomal arrangement of human alpha-like globin genes: sequence homology and alpha-globin gene deletions. Cell 20, 119-130. Lawrie, N.M., Tease, C., and Hulten, M.A. (1995). Chiasma frequency, distribution and interference maps of mouse autosomes. Chromosoma 104, 308-314. Lazaro, C., Gaona, A., Ainsworth, P., Tenconi, R., Vidaud, D., Kruyer, H., Ars, E., Volpini, V., and Estivill, X. (1996). Sex differences in mutational rate and mutational mechanism in the NF1 gene in neurofibromatosis type 1 patients. Hum Genet 98, 696-699. Leblon, G. (1972). Mechanism of gene conversion in Ascobolus immersus I. Existence of a Correlation Between the Origin of Mutants Induced by Different Mutagens and Their Conversion Spectrum. Mol Gen Genet 115, 36-48. Leu, J.Y., Chua, P.R., and Roeder, G.S. (1998). The meiosis-specific Hop2 protein of S. cerevisiae ensures synapsis between homologous chromosomes. Cell 94, 375-386. Lhuissier, F.G., Offenberg, H.H., Wittich, P.E., Vischer, N.O., and Heyting, C. (2007). The mismatch repair protein MLH1 marks a subset of strongly interfering crossovers in tomato. Plant Cell 19, 862-876. Li, W., Chen, C., Markmann-Mulisch, U., Timofejeva, L., Schmelzer, E., Ma, H., and Reiss, B. (2004). The Arabidopsis AtRAD51 gene is dispensable for vegetative development but required for meiosis. Proc Natl Acad Sci U S A 101, 10596-10601. Libeau, P., Durandet, M., Granier, F., Marquis, C., Berthomé, R., Renou, J.P., Taconnat-Soubirou, L., and Horlow, C. (2011). Gene expression profiling of Arabidopsis meiocytes. Plant Biology 13, 784–793. Lichten, M., and Goldman, A.S. (1995). Meiotic recombination hotspots. Annu Rev Genet 29, 423- 444.

151 Lien, S., Szyda, J., Schechinger, B., Rappold, G., and Arnheim, N. (2000). Evidence for heterogeneity in recombination in the human pseudoautosomal region: high resolution analysis by sperm typing and radiation-hybrid mapping. Am J Hum Genet 66, 557-566. Liling, J., Cross, I., Burn, J., Daniel, C.P., Tawn, E.J., and Parker, L. (1999). Frequency and predictive value of 22q11 deletion. J Med Genet 36, 794-795. Lim, D.S., and Hasty, P. (1996). A mutation in mouse rad51 results in an early embryonic lethal that is suppressed by a mutation in p53. Mol Cell Biol 16, 7133-7143. Lin, Y., and Smith, G.R. (1994). Transient, meiosis-induced expression of the rec6 and rec12 genes of Schizosaccharomyces pombe. Genetics 136, 769-779. Lio, Y.C., Mazin, A.V., Kowalczykowski, S.C., and Chen, D.J. (2003). Complex formation by the human Rad51B and Rad51C DNA repair proteins and their activities in vitro. J Biol Chem 278, 2469- 2478. Liu Cm, C.M., McElver, J., Tzafrir, I., Joosen, R., Wittich, P., Patton, D., Van Lammeren, A.A., and Meinke, D. (2002). Condensin and cohesin knockouts in Arabidopsis exhibit a titan seed phenotype. Plant J 29, 405-415. Liu, P., Lacaria, M., Zhang, F., Withers, M., Hastings, P.J., and Lupski, J.R. (2011). Frequency of nonallelic homologous recombination is correlated with length of homology: evidence that ectopic synapsis precedes ectopic crossing-over. Am J Hum Genet 89, 580-588. Liu, Y., Tarsounas, M., O'Regan, P., and West, S.C. (2007). Role of RAD51C and XRCC3 in genetic recombination and DNA repair. J Biol Chem 282, 1973-1979. Liu, Z., and Makaroff, C.A. (2006). Arabidopsis separase AESP is essential for embryo development and the release of cohesin during meiosis. Plant Cell 18, 1213-1225. Loidl, J. (2006). S. pombe linear elements: the modest cousins of synaptonemal complexes. Chromosoma 115, 260-271. Lopes, J., Tardieu, S., Silander, K., Blair, I., Vandenberghe, A., Palau, F., Ruberg, M., Brice, A., and LeGuern, E. (1999). Homologous DNA exchanges in humans can be explained by the yeast double-strand break repair model: a study of 17p11.2 rearrangements associated with CMT1A and HNPP. Hum Mol Genet 8, 2285-2292. Lopez-Correa, C., Dorschner, M., Brems, H., Lazaro, C., Clementi, M., Upadhyaya, M., Dooijes, D., Moog, U., Kehrer-Sawatzki, H., Rutkowski, J.L., Fryns, J.P., Marynen, P., Stephens, K., and Legius, E. (2001). Recombination hotspot in NF1 microdeletion patients. Hum Mol Genet 10, 1387-1392. Lorenz, A., Wells, J.L., Pryce, D.W., Novatchkova, M., Eisenhaber, F., McFarlane, R.J., and Loidl, J. (2004). S. pombe meiotic linear elements contain proteins related to synaptonemal complex components. J Cell Sci 117, 3343-3351. Lupski, J.R. (1998). Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits. Trends Genet 14, 417-422. Lynn, A., Ashley, T., and Hassold, T. (2004). Variation in human meiotic recombination. Annu Rev Genomics Hum Genet 5, 317-349. Lynn, A., Soucek, R., and Borner, G.V. (2007). ZMM proteins during meiosis: crossover artists at work. Chromosome Res 15, 591-605. Lynn, A., Koehler, K.E., Judis, L., Chan, E.R., Cherry, J.P., Schwartz, S., Seftel, A., Hunt, P.A., and Hassold, T.J. (2002). Covariation of synaptonemal complex length and mammalian meiotic exchange rates. Science 296, 2222-2225. Macaisne, N., Novatchkova, M., Peirera, L., Vezon, D., Jolivet, S., Froger, N., Chelysheva, L., Grelon, M., and Mercier, R. (2008). SHOC1, an XPF endonuclease-related protein, is essential for the formation of class I meiotic crossovers. Curr Biol 18, 1432-1437. MacQueen, A.J., Colaiacovo, M.P., McDonald, K., and Villeneuve, A.M. (2002). Synapsis-dependent and -independent mechanisms stabilize homolog pairing during meiotic prophase in C. elegans. Genes Dev 16, 2428-2442.

152 Malik, S.B., Ramesh, M.A., Hulstrand, A.M., and Logsdon, J.M., Jr. (2007). Protist homologs of the meiotic Spo11 gene and topoisomerase VI reveal an evolutionary history of gene duplication and lineage-specific loss. Mol Biol Evol 24, 2827-2841. Mancera, E., Bourgon, R., Brozzi, A., Huber, W., and Steinmetz, L.M. (2008). High-resolution mapping of meiotic crossovers and non-crossovers in yeast. Nature 454, 479-485. Manfrini, N., Guerini, I., Citterio, A., Lucchini, G., and Longhese, M.P. (2010). Processing of meiotic DNA double-strand breaks requires cyclin-dependent kinase and multiple nucleases. J Biol Chem. Marais, G. (2003). Biased gene conversion: implications for genome and sex evolution. Trends Genet 19, 330-338. Marais, G., Mouchiroud, D., and Duret, L. (2003). Neutral effect of recombination on base composition in Drosophila. Genet Res 81, 79-87. Marsolier-Kergoat, M.-C. (2011). A simple model for the influence of meiotic conversion tracts on GC content. PLoS One 6, e16109. doi:16110.11371/journal.pone.0016109. Marsolier-Kergoat, M.C., and Yeramian, E. (2009). GC Content and Recombination: Reassessing the Causal Effects for the Saccharomyces cerevisiae Genome. Genetics. Martini, E., Borde, V., Legendre, M., Audic, S., Regnault, B., Soubigou, G., Dujon, B., and Llorente, B. (2011). Genome-wide analysis of heteroduplex DNA in mismatch repair-deficient yeast cells reveals novel properties of meiotic recombination pathways. PLoS Genet 7, e1002305. Masison, D.C., Blanc, A., Ribas, J.C., Carroll, K., Sonenberg, N., and Wickner, R.B. (1995). Decoying the cap- mRNA degradation system by a double-stranded RNA virus and poly(A)- mRNA surveillance by a yeast antiviral system. Mol Cell Biol 15, 2763-2771. Mather, K. (1938). Crossing-over. Biological Reviews 13, 252-292. May, C., Slingsby, M., and Jeffreys, A. (2008). Human Recombination Hotspots: Before and After the HapMap Project. In Recombination and Meiosis - Crossing-Over and Disjunction, pp. 195-244. May, C.A., Shone, A.C., Kalaydjieva, L., Sajantila, A., and Jeffreys, A.J. (2002). Crossover clustering and rapid decay of linkage disequilibrium in the Xp/Yp pseudoautosomal gene SHOX. Nat Genet 31, 272-275. Mazina, O.M., Mazin, A.V., Nakagawa, T., Kolodner, R.D., and Kowalczykowski, S.C. (2004). Saccharomyces cerevisiae Mer3 helicase stimulates 3'-5' heteroduplex extension by Rad51; implications for crossover control in meiotic recombination. Cell 117, 47-56. McKim, K.S., and Hayashi-Hagihara, A. (1998). mei-W68 in Drosophila melanogaster encodes a Spo11 homolog: evidence that the mechanism for initiating meiotic recombination is conserved. Genes Dev 12, 2932-2942. McKim, K.S., Green-Marroquin, B.L., Sekelsky, J.J., Chin, G., Steinberg, C., Khodosh, R., and Hawley, R.S. (1998). Meiotic synapsis in the absence of recombination. Science 279, 876-878. McMahill, M.S., Sham, C.W., and Bishop, D.K. (2007). Synthesis-Dependent Strand Annealing in Meiosis. PLoS Biol 5, e299. Mercier, R., Armstrong, S.J., Horlow, C., Jackson, N.P., Makaroff, C.A., Vezon, D., Pelletier, G., Jones, G.H., and Franklin, F.C. (2003). The meiotic protein SWI1 is required for axial element formation and recombination initiation in Arabidopsis. Development 130, 3309-3318. Mercier, R., Jolivet, S., Vezon, D., Huppe, E., Chelysheva, L., Giovanni, M., Nogue, F., Doutriaux, M.P., Horlow, C., Grelon, M., and Mezard, C. (2005). Two meiotic crossover classes cohabit in Arabidopsis: one is dependent on MER3,whereas the other one is not. Curr Biol 15, 692- 701. Merker, J.D., Dominska, M., Greenwell, P.W., Rinella, E., Bouck, D.C., Shibata, Y., Strahl, B.D., Mieczkowski, P., and Petes, T.D. (2008). The histone methylase Set2p and the histone deacetylase Rpd3p repress meiotic recombination at the HIS4 meiotic recombination hotspot in Saccharomyces cerevisiae. DNA Repair (Amst) 7, 1298-1308. Meuwissen, R.L., Offenberg, H.H., Dietrich, A.J., Riesewijk, A., van Iersel, M., and Heyting, C. (1992). A coiled-coil related protein specific for synapsed regions of meiotic prophase chromosomes. Embo J 11, 5091-5100.

153 Mezard, C. (2006). Meiotic recombination hotspots in plants. Biochem Soc Trans 34, 531-534. Mieczkowski, P.A., Dominska, M., Buck, M.J., Gerton, J.L., Lieb, J.D., and Petes, T.D. (2006). Global analysis of the relationship between the binding of the Bas1p transcription factor and meiosis-specific double-strand DNA breaks in Saccharomyces cerevisiae. Mol Cell Biol 26, 1014-1027. Mimitou, E.P., and Symington, L.S. (2008). Sae2, Exo1 and Sgs1 collaborate in DNA double-strand break processing. Nature 455, 770-774. Mimitou, E.P., and Symington, L.S. (2009). DNA end resection: many nucleases make light work. DNA Repair (Amst) 8, 983-995. Mochizuki, K., Novatchkova, M., and Loidl, J. (2008). DNA double-strand breaks, but not crossovers, are required for the reorganization of meiotic nuclei in Tetrahymena. J Cell Sci 121, 2148- 2158. Moens, P.B., Marcon, E., Shore, J.S., Kochakpour, N., and Spyropoulos, B. (2007). Initiation and resolution of interhomolog connections: crossover and non-crossover sites along mouse synaptonemal complexes. J Cell Sci 120, 1017-1027. Moens, P.B., Kolas, N.K., Tarsounas, M., Marcon, E., Cohen, P.E., and Spyropoulos, B. (2002). The time course and chromosomal localization of recombination-related proteins at meiosis in the mouse are compatible with models that can resolve the early DNA-DNA interactions without reciprocal recombination. J Cell Sci 115, 1611-1622. Molnar, M., Doll, E., Yamamoto, A., Hiraoka, Y., and Kohli, J. (2003). Linear element formation and their role in meiotic sister chromatid cohesion and chromosome pairing. J Cell Sci 116, 1719- 1731. Morgan, T.H., and Cattell, E. (1912). Data for the study of sex-linked inheritance in Drosophila. Journal of Experimental Zoology 13, 79-101. Motamayor, J.C., Vezon, D., Sauvanet, A., Grandjean, O., Marchand, M., Bechtold, N., Pelletier, G., and Horlow, C. (2000). Switch (swi1), an Arabidopsis thaliana mutant affected in the female meiotic switch. sex Plant Reprod 12, 209-218. Muller, H. (1922). Variation due to change in the individual genes. Am Naturalist 56, 32–50. Munoz-Fuentes, V., Di Rienzo, A., and Vila, C. (2011). Prdm9, a major determinant of meiotic recombination hotspots, is not functional in dogs and their wild relatives, wolves and coyotes. PLoS One 6, e25498. Munz, P. (1994). An analysis of interference in the fission yeast Schizosaccharomyces pombe. Genetics 137, 701-707. Murray, N.E. (1963). Polarized recombination and fine structure within the me-2 gene of Neurospora crassa. Genetics 48, 1163-1183. Myers, S., Bottolo, L., Freeman, C., McVean, G., and Donnelly, P. (2005). A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321-324. Myers, S., Freeman, C., Auton, A., Donnelly, P., and McVean, G. (2008). A common sequence motif associated with recombination hot spots and genome instability in humans. Nat Genet 40, 1124-1129. Myers, S., Spencer, C.C., Auton, A., Bottolo, L., Freeman, C., Donnelly, P., and McVean, G. (2006). The distribution and causes of meiotic recombination in the human genome. Biochem Soc Trans 34, 526-530. Myers, S., Bowden, R., Tumian, A., Bontrop, R.E., Freeman, C., MacFie, T.S., McVean, G., and Donnelly, P. (2010). Drive against hotspot motifs in primates implicates the PRDM9 gene in meiotic recombination. Science 327, 876-879. Nakagawa, T., and Ogawa, H. (1999). The Saccharomyces cerevisiae MER3 gene, encoding a novel helicase-like protein, is required for crossover control in meiosis. Embo J 18, 5714-5723. Nakagawa, T., and Kolodner, R.D. (2002). The MER3 DNA helicase catalyzes the unwinding of holliday junctions. J Biol Chem 277, 28019-28024. Nasmyth, K. (1999). Separating sister chromatids. Trends Biochem Sci 24, 98-104.

154 Nathans, J., Piantanida, T.P., Eddy, R.L., Shows, T.B., and Hogness, D.S. (1986). Molecular genetics of inherited variation in human color vision. Science 232, 203-210. Ng, D.W., Wang, T., Chandrasekharan, M.B., Aramayo, R., Kertbundit, S., and Hall, T.C. (2007). Plant SET domain-containing proteins: structure, function and regulation. Biochim Biophys Acta 1769, 316-329. Ng, S.H., Parvanov, E., Petkov, P.M., and Paigen, K. (2008). A quantitative assay for crossover and noncrossover molecular events at individual recombination hotspots in both male and female gametes. Genomics 92, 204-209. Nicolas, A. (1979). Variation of gene conversion and intragenic recombination frequencies in the genome of Ascobolus immersus. Mol Gen Genet 176, 129-138. Nicolas, A., Treco, D., Schultes, N.P., and Szostak, J.W. (1989). An initiation site for meiotic gene conversion in the yeast Saccharomyces cerevisiae. Nature 338, 35-39. Niedrist, D., Joncourt, F., Matyas, G., and Muller, A. (2009). Severe phenotype with cis-acting heterozygous PMP22 mutations. Clin Genet 75, 286-289. Nimonkar, A.V., Sica, R.A., and Kowalczykowski, S.C. (2009). Rad52 promotes second-end DNA capture in double-stranded break repair to form complement-stabilized joint molecules. Proc Natl Acad Sci U S A 106, 3077-3082. Niu, H., Wan, L., Busygina, V., Kwon, Y., Allen, J.A., Li, X., Kunz, R.C., Kubota, K., Wang, B., Sung, P., Shokat, K.M., Gygi, S.P., and Hollingsworth, N.M. (2009). Regulation of meiotic recombination via Mek1-mediated Rad54 phosphorylation. Mol Cell 36, 393-404. Nonomura, K., Nakano, M., Fukuda, T., Eiguchi, M., Miyao, A., Hirochika, H., and Kurata, N. (2004). The novel gene HOMOLOGOUS PAIRING ABERRATION IN RICE MEIOSIS1 of rice encodes a putative coiled-coil protein required for homologous chromosome pairing in meiosis. Plant Cell 16, 1008-1020. Nussbaum, R.L., McInnes, R.R., and Willard, H.F. (2001). Genetics in Medicine. Philadelphia, PA: WB Saunders Co. Ohta, K., Shibata, T., and Nicolas, A. (1994). Changes in chromatin structure at recombination initiation sites during yeast meiosis. Embo J 13, 5754-5763. Ohta, K., Wu, T.C., Lichten, M., and Shibata, T. (1999). Competitive inactivation of a double-strand DNA break site involves parallel suppression of meiosis-induced changes in chromatin configuration. Nucleic Acids Res 27, 2175-2180. Oliver, P.L., Goodstadt, L., Bayes, J.J., Birtle, Z., Roach, K.C., Phadnis, N., Beatson, S.A., Lunter, G., Malik, H.S., and Ponting, C.P. (2009). Accelerated evolution of the Prdm9 speciation gene across diverse metazoan taxa. PLoS Genet 5, e1000753. Oliver-Bonet, M., Campillo, M., Turek, P.J., Ko, E., and Martin, R.H. (2007). Analysis of replication protein A (RPA) in human spermatogenesis. Mol Hum Reprod 13, 837-844. Osakabe, K., Abe, K., Yoshioka, T., Osakabe, Y., Todoriki, S., Ichikawa, H., Hohn, B., and Toki, S. (2006). Isolation and characterization of the RAD54 gene from Arabidopsis thaliana. Plant J 48, 827-842. Osman, K., Higgins, J.D., Sanchez-Moran, E., Armstrong, S., Chris, F., and Franklin, H. (2011). Pathways to meiotic recombination in Arabidopsis thaliana. New Phytologist 190, 523–544. Page, S.L., and Hawley, R.S. (2001). c(3)G encodes a Drosophila synaptonemal complex protein. Genes Dev 15, 3130-3143. Painter, T.S. (1934). The Morphology of the X Chromosome in Salivary Glands of Drosophila Melanogaster and a New Type of Chromosome Map for This Element. Genetics 19, 448-469. Papadakis, M.N., and Patrinos, G.P. (1999). Contribution of gene conversion in the evolution of the human beta-like globin gene family. Hum Genet 104, 117-125. Paques, F., and Haber, J.E. (1999). Multiple pathways of recombination induced by double-strand breaks in Saccharomyces cerevisiae. Microbiol Mol Biol Rev 63, 349-404. Parvanov, E.D., Petkov, P.M., and Paigen, K. (2010). Prdm9 controls activation of mammalian recombination hotspots. Science 327, 835.

155 Parvanov, E.D., Ng, S.H.S., Petkov, P.M., and Paigen, K. (2009). Trans-Regulation of Mouse Meiotic Recombination Hotspots by Rcr1. PLoS Biol 7. Pecina, A., Smith, K.N., Mezard, C., Murakami, H., Ohta, K., and Nicolas, A. (2002). Targeted stimulation of meiotic recombination. Cell 111, 173-184. Peirson, B.N., Bowling, S.E., and Makaroff, C.A. (1997). A defect in synapsis causes male sterility in a T-DNA-tagged Arabidopsis thaliana mutant. Plant J 11, 659-669. Perrella, G., Consiglio, M.F., Aiese-Cigliano, R., Cremona, G., Sanchez-Moran, E., Barra, L., Errico, A., Bressan, R.A., Franklin, F.C., and Conicella, C. (2010). Histone hyperacetylation affects meiotic recombination and chromosome segregation in Arabidopsis. Plant J 62, 796-806. Petes, T.D. (2001). Meiotic recombination hot spots and cold spots. Nat Rev Genet 2, 360-369. Petronczki, M., Siomos, M.F., and Nasmyth, K. (2003). Un menage a quatre: the molecular biology of chromosome segregation in meiosis. Cell 112, 423-440. Petukhova, G.V., Pezza, R.J., Vanevski, F., Ploquin, M., Masson, J.Y., and Camerini-Otero, R.D. (2005). The Hop2 and Mnd1 proteins act in concert with Rad51 and Dmc1 in meiotic recombination. Nat Struct Mol Biol 12, 449-453. Pezza, R.J., Voloshin, O.N., Vanevski, F., and Camerini-Otero, R.D. (2007). Hop2/Mnd1 acts on two critical steps in Dmc1-promoted homologous pairing. Genes Dev 21, 1758-1766. Pigozzi, M.I. (2008). Relationship between physical and genetic distances along the zebra finch Z chromosome. Chromosome Res 16, 839-849. Pittman, D.L., Cobb, J., Schimenti, K.J., Wilson, L.A., Cooper, D.M., Brignull, E., Handel, M.A., and Schimenti, J.C. (1998). Meiotic prophase arrest with failure of chromosome synapsis in mice deficient for Dmc1, a germline-specific RecA homolog. Mol Cell 1, 697-705. Plank, J.L., Wu, J., and Hsieh, T.S. (2006). Topoisomerase IIIalpha and Bloom's helicase can resolve a mobile double Holliday junction substrate through convergent branch migration. Proc Natl Acad Sci U S A 103, 11118-11123. Ponting, C.P. (2011). What are the genomic drivers of the rapid evolution of PRDM9? Trends in Genetics 27,, 165-171. Pontvianne, F., Blevins, T., and Pikaard, C.S. (2010). Arabidopsis Histone Lysine Methyltransferases. Adv Bot Res 53, 1-22. Potocki, L., Chen, K.S., Park, S.S., Osterholm, D.E., Withers, M.A., Kimonis, V., Summers, A.M., Meschino, W.S., Anyane-Yeboa, K., Kashork, C.D., Shaffer, L.G., and Lupski, J.R. (2000). Molecular mechanism for duplication 17p11.2- the homologous recombination reciprocal of the Smith-Magenis microdeletion. Nat Genet 24, 84-87. Prakash, R., Satory, D., Dray, E., Papusha, A., Scheller, J., Kramer, W., Krejci, L., Klein, H., Haber, J.E., Sung, P., and Ira, G. (2009). Yeast Mph1 helicase dissociates Rad51-made D-loops: implications for crossover control in mitotic recombination. Genes Dev 23, 67-79. Preuss, D., and Copenhaver, G.P. (2000). Chemical and physical treatment that stimulate recombination. University of Chicago WO/2000/054574. Preuss, D., Rhee, S.Y., and Davis, R.W. (1994). Tetrad analysis possible in Arabidopsis with mutation of the QUARTET (QRT) genes. Science 264, 1458-1460. Prieto, I., Tease, C., Pezzi, N., Buesa, J.M., Ortega, S., Kremer, L., Martinez, A., Martinez, A.C., Hulten, M.A., and Barbero, J.L. (2004). Cohesin component dynamics during meiotic prophase I in mammalian oocytes. Chromosome Res 12, 197-213. Ptak, S.E., Roeder, A.D., Stephens, M., Gilad, Y., Paabo, S., and Przeworski, M. (2004). Absence of the TAP2 human recombination hotspot in chimpanzees. PLoS Biol 2, e155. Ptak, S.E., Hinds, D.A., Koehler, K., Nickel, B., Patil, N., Ballinger, D.G., Przeworski, M., Frazer, K.A., and Paabo, S. (2005). Fine-scale recombination patterns differ between chimpanzees and humans. Nat Genet 37, 429-434. Puizina, J., Siroky, J., Mokros, P., Schweizer, D., and Riha, K. (2004). Mre11 deficiency in Arabidopsis is associated with chromosomal instability in somatic cells and Spo11-dependent genome fragmentation during meiosis. Plant Cell 16, 1968-1978.

156 Ramesh, M.A., Malik, S.B., and Logsdon, J.M., Jr. (2005). A phylogenomic inventory of meiotic genes; evidence for sex in Giardia and an early eukaryotic origin of meiosis. Curr Biol 15, 185- 191. Rana, N.A., Ebenezer, N.D., Webster, A.R., Linares, A.R., Whitehouse, D.B., Povey, S., and Hardcastle, A.J. (2004). Recombination hotspots and block structure of linkage disequilibrium in the human genome exemplified by detailed analysis of PGM1 on 1p31. Hum Mol Genet 13, 3089-3102. Rasmussen, S.W., and Holm, P.B. (1984). The synaptonemal complex, recombination nodules and chiasmata in human spermatocytes. Symp Soc Exp Biol 38, 271-292. Raynard, S., Zhao, W., Bussen, W., Lu, L., Ding, Y.Y., Busygina, V., Meetei, A.R., and Sung, P. (2008). Functional role of BLAP75 in BLM/Topo IIIalpha -dependent holliday junction processing. J Biol Chem. Reiter, L.T., Murakami, T., Koeuth, T., Pentao, L., Muzny, D.M., Gibbs, R.A., and Lupski, J.R. (1996). A recombination hotspot responsible for two inherited peripheral neuropathies is located near a mariner transposon-like element. Nat Genet 12, 288-297. Resnick, M.A. (1976). The repair of double-strand breaks in DNA; a model involving recombination. J Theor Biol 59, 97-106. Robine, N., Uematsu, N., Amiot, F., Gidrol, X., Barillot, E., Nicolas, A., and Borde, V. (2007). Genome-wide redistribution of meiotic double-strand breaks in Saccharomyces cerevisiae. Mol Cell Biol 27, 1868-1880. Rocco, V., de Massy, B., and Nicolas, A. (1992). The Saccharomyces cerevisiae ARG4 initiator of meiotic gene conversion and its associated double-strand DNA breaks can be inhibited by transcriptional interference. Proc Natl Acad Sci U S A 89, 12068-12072. Rockmill, B., and Roeder, G.S. (1988). RED1: a yeast gene required for the segregation of chromosomes during the reductional division of meiosis. Proc Natl Acad Sci U S A 85, 6057- 6061. Rockmill, B., Sym, M., Scherthan, H., and Roeder, G.S. (1995). Roles for two RecA homologs in promoting meiotic chromosome synapsis. Genes Dev 9, 2684-2695. Romanienko, P.J., and Camerini-Otero, R.D. (2000). The mouse Spo11 gene is required for meiotic chromosome synapsis. Mol Cell 6, 975-987. Ross, K.J., Fransz, P., Armstrong, S.J., Vizir, I., Mulligan, B., Franklin, F.C., and Jones, G.H. (1997). Cytological characterization of four meiotic mutants of Arabidopsis isolated from T-DNA- transformed lines. Chromosome Res 5, 551-559. Ross-Macdonald, P., and Roeder, G.S. (1994). Mutation of a meiosis-specific MutS homolog decreases crossing over but not mismatch correction. Cell 79, 1069-1080. Saintenac, C., Falque, M., Martin, O.C., Paux, E., Feuillet, C., and Sourdille, P. (2009). Detailed recombination studies along chromosome 3B provide new insights on crossover distribution in wheat (Triticum aestivum L.). Genetics 181, 393-403. Saintenac, C., Faure, S., Remay, A., Choulet, F., Ravel, C., Paux, E., Balfourier, F., Feuillet, C., and Sourdille, P. (2011). Variation in crossover rates across a 3-Mb contig of bread wheat (Triticum aestivum) reveals the presence of a meiotic recombination hotspot. Chromosoma 120, 185-198. Sall, T. (1990). Genetic control of recombination in barley. II.Variation in linkage between marker genes. Hereditas 112, 171-178. Samach, A., Melamed-Bessudo, C., Avivi-Ragolski, N., Pietrokovski, S., and Levy, A.A. (2011). Identification of Plant RAD52 Homologs and Characterization of the Arabidopsis thaliana RAD52-Like Genes. Plant Cell. San Filippo, J., Sung, P., and Klein, H. (2008). Mechanism of eukaryotic homologous recombination. Annu Rev Biochem 77, 229-257. Sanchez-Moran, E., Armstrong, S.J., Santos, J.L., Franklin, F.C., and Jones, G.H. (2002). Variation in chiasma frequency among eight accessions of Arabidopsis thaliana. Genetics 162, 1415-1422.

157 Sato, S., Hotta, Y., and Tabata, S. (1995). Structural analysis of a recA-like gene in the genome of Arabidopsis thaliana. DNA Res 2, 89-93. Schmekel, K., Wahrman, J., Skoglund, U., and Daneholt, B. (1993). The central region of the synaptonemal complex in Blaps cribrosa studied by electron microscope tomography. Chromosoma 102, 669-681. Schmuckli-Maurer, J., and Heyer, W.D. (2000). Meiotic recombination in RAD54 mutants of Saccharomyces cerevisiae. Chromosoma 109, 86-93. Schommer, C., Beven, A., Lawrenson, T., Shaw, P., and Sablowski, R. (2003). AHP2 is required for bivalent formation and for segregation of homologous chromosomes in Arabidopsis meiosis. Plant J 36, 1-11. Schultes, N.P., and Szostak, J.W. (1990). Decreasing gradients of gene conversion on both sides of the initiation site for meiotic recombination at the ARG4 locus in yeast. Genetics 126, 813- 822. Schwacha, A., and Kleckner, N. (1994). Identification of joint molecules that form frequently between homologs but rarely between sister chromatids during yeast meiosis. Cell 76, 51-63. Schwacha, A., and Kleckner, N. (1997). Interhomolog bias during meiotic recombination: meiotic functions promote a highly differentiated interhomolog-only pathway. Cell 90, 1123-1135. Shaikh, T.H., O'Connor, R.J., Pierpont, M.E., McGrath, J., Hacker, A.M., Nimmakayalu, M., Geiger, E., Emanuel, B.S., and Saitta, S.C. (2007). Low copy repeats mediate distal chromosome 22q11.2 deletions: sequence analysis predicts breakpoint mechanisms. Genome Res 17, 482- 491. Sharan, S.K., Pyle, A., Coppola, V., Babus, J., Swaminathan, S., Benedict, J., Swing, D., Martin, B.K., Tessarollo, L., Evans, J.P., Flaws, J.A., and Handel, M.A. (2004). BRCA2 deficiency in mice leads to meiotic impairment and infertility. Development 131, 131-142. Sheridan, S.D., Yu, X., Roth, R., Heuser, J.E., Sehorn, M.G., Sung, P., Egelman, E.H., and Bishop, D.K. (2008). A comparative analysis of Dmc1 and Rad51 nucleoprotein filaments. Nucleic Acids Res 36, 4057-4066. Sherman, J.D., and Stack, S.M. (1995). Two-dimensional spreads of synaptonemal complexes from solanaceous plants. VI. High-resolution recombination nodule map for tomato (Lycopersicon esculentum). Genetics 141, 683-708. Shifman, S., Bell, J.T., Copley, R.R., Taylor, M.S., Williams, R.W., Mott, R., and Flint, J. (2006). A High-Resolution Single Nucleotide Polymorphism Genetic Map of the Mouse Genome. PLoS Biol 4, e395. Shinohara, M., Gasior, S.L., Bishop, D.K., and Shinohara, A. (2000). Tid1/Rdh54 promotes colocalization of rad51 and dmc1 during meiotic recombination. Proc Natl Acad Sci U S A 97, 10814-10819. Shinohara, M., Oh, S.D., Hunter, N., and Shinohara, A. (2008). Crossover assurance and crossover interference are distinctly regulated by the ZMM proteins during yeast meiosis. Nat Genet 40, 299-309. Shinohara, M., Shita-Yamaguchi, E., Buerstedde, J.M., Shinagawa, H., Ogawa, H., and Shinohara, A. (1997). Characterization of the roles of the Saccharomyces cerevisiae RAD54 gene and a homologue of RAD54, RDH54/TID1, in mitosis and meiosis. Genetics 147, 1545-1556. Shiroishi, T., Sagai, T., and Moriwaki, K. (1993). Hotspots of meiotic recombination in the mouse major histocompatibility complex. Genetica 88, 187-196. Siaud, N., Dray, E., Gy, I., Gerard, E., Takvorian, N., and Doutriaux, M.P. (2004). Brca2 is involved in meiosis in Arabidopsis thaliana as suggested by its interaction with Dmc1. Embo J 23, 1392- 1401. Sigurdsson, S., Van Komen, S., Bussen, W., Schild, D., Albala, J.S., and Sung, P. (2001). Mediator function of the human Rad51B-Rad51C complex in Rad51/RPA-catalyzed DNA strand exchange. Genes Dev 15, 3308-3318. Sinha, R.P., and Helgason, S.B. (1969). The action of actinomycin D and diepoxybutane on recombination of two closely linked loci in Hordeum. Can J Genet Cytol 11, 745-751.

158 Small, K., Iber, J., and Warren, S.T. (1997). Emerin deletion reveals a common X-chromosome inversion mediated by inverted repeats. Nat Genet 16, 96-99. Smith, A.V., and Roeder, G.S. (1997). The yeast Red1 protein localizes to the cores of meiotic chromosomes. J Cell Biol 136, 957-967. Snowden, T., Acharya, S., Butz, C., Berardini, M., and Fishel, R. (2004). hMSH4-hMSH5 recognizes Holliday Junctions and forms a meiosis-specific sliding clamp that embraces homologous chromosomes. Mol Cell 15, 437-451. Snowden, T., Shim, K.S., Schmutte, C., Acharya, S., and Fishel, R. (2008). hMSH4-hMSH5 adenosine nucleotide processing and interactions with homologous recombination machinery. J Biol Chem 283, 145-154. Sonoda, E., Sasaki, M.S., Buerstedde, J.M., Bezzubova, O., Shinohara, A., Ogawa, H., Takata, M., Yamaguchi-Iwai, Y., and Takeda, S. (1998). Rad51-deficient vertebrate cells accumulate chromosomal breaks prior to cell death. Embo J 17, 598-608. Sparrow, A.H., Pond, V., and Kojan, S. (1955). Microsporogenesis in excised anthers of Trillium erectum grown on sterile media. American Journal of Botany 72. Spencer, C.C., Deloukas, P., Hunt, S., Mullikin, J., Myers, S., Silverman, B., Donnelly, P., Bentley, D., and McVean, G. (2006). The influence of recombination on human genetic diversity. PLoS Genet 2, e148. Stacey, N.J., Kuromori, T., Azumi, Y., Roberts, G., Breuer, C., Wada, T., Maxwell, A., Roberts, K., and Sugimoto-Shirasu, K. (2006). Arabidopsis SPO11-2 functions with SPO11-1 in meiotic recombination. Plant J 48, 206-216. Stack, S. (1982). Two-dimensional spreads of synaptonemal complexes from solanaceous plants. I. The technique. Stain Technol 57, 265-272. Stack, S.M., and Anderson, L.K. (2002). Crossing over as assessed by late recombination nodules is related to the pattern of synapsis and the distribution of early recombination nodules in maize. Chromosome Res 10, 329-345. Steinmetz, M., Uematsu, Y., and Lindahl, K.F. (1987). Hotspots of homologous recombination in mammalian genomes. Trends in Genetics 3, 7-10. Storlazzi, A., Xu, L., Schwacha, A., and Kleckner, N. (1996). Synaptonemal complex (SC) component Zip1 plays a role in meiotic recombination independent of SC polymerization along the chromosomes. Proc Natl Acad Sci U S A 93, 9043-9048. Storlazzi, A., Gargano, S., Ruprich-Robert, G., Falque, M., David, M., Kleckner, N., and Zickler, D. (2010). Recombination Proteins Mediate Meiotic Spatial Chromosome Organization and Pairing. Cell 141, 94-106. Sturtevant, A.H. (1913). The linear arrangement of six sex-linked factors in Drosophila, as shown by their mode of association. J Exp Zool 5, 168-173. Sturtevant, A.H. (1915). The behavior of the chromosomes as studied through linkage. Molecular and General Genetics 13, 234-287. Sun, F., Trpkov, K., Rademaker, A., Ko, E., and Martin, R.H. (2005). Variation in meiotic recombination frequencies among human males. Hum Genet 116, 172-178. Sun, F., Oliver-Bonet, M., Liehr, T., Starke, H., Turek, P., Ko, E., Rademaker, A., and Martin, R.H. (2006). Variation in MLH1 distribution in recombination maps for individual chromosomes from human males. Hum Mol Genet 15, 2376-2391. Sun, F., Oliver-Bonet, M., Liehr, T., Starke, H., Ko, E., Rademaker, A., Navarro, J., Benet, J., and Martin, R.H. (2004). Human male recombination maps for individual chromosomes. Am J Hum Genet 74, 521-531. Sun, H., Treco, D., Schultes, N.P., and Szostak, J.W. (1989). Double-strand breaks at an initiation site for meiotic gene conversion. Nature 338, 87-90. Sung, P. (1997a). Yeast Rad55 and Rad57 proteins form a heterodimer that functions with replication protein A to promote DNA strand exchange by Rad51 recombinase. Genes Dev 11, 1111- 1121.

159 Sung, P. (1997b). Function of yeast Rad52 protein as a mediator between replication protein A and the Rad51 recombinase. J Biol Chem 272, 28194-28197. Surtees, J.A., Argueso, J.L., and Alani, E. (2004). Mismatch repair proteins: key regulators of genetic recombination. Cytogenet Genome Res 107, 146-159. Svetlanov, A., Baudat, F., Cohen, P.E., and de Massy, B. (2008). Distinct Functions of MLH3 at Recombination Hot Spots in the Mouse. Genetics 178, 1937-1945. Sym, M., and Roeder, G.S. (1994). Crossover interference is abolished in the absence of a synaptonemal complex protein. Cell 79, 283-292. Sym, M., Engebrecht, J.A., and Roeder, G.S. (1993). ZIP1 is a synaptonemal complex protein required for meiotic chromosome synapsis. Cell 72, 365-378. Symington, L.S., Brown, A., Oliver, S.G., Greenwell, P., and Petes, T.D. (1991). Genetic analysis of a meiotic recombination hotspot on chromosome III of Saccharomyces cerevisiae. Genetics 128, 717-727. Szostak, J.W., Orr-Weaver, T.L., Rothstein, R.J., and Stahl, F.W. (1983). The double-strand-break repair model for recombination. Cell 33, 25-35. Tarsounas, M., Pearlman, R.E., Gasser, P.J., Park, M.S., and Moens, P.B. (1997). Protein-protein interactions in the synaptonemal complex. Mol Biol Cell 8, 1405-1414. Tease, C., and Hulten, M.A. (2004). Inter-sex variation in synaptonemal complex lengths largely determine the different recombination rates in male and female germ cells. Cytogenet Genome Res 107, 208-215. Tease, C., Hartshorne, G.M., and Hulten, M.A. (2002). Patterns of meiotic recombination in human fetal oocytes. Am J Hum Genet 70, 1469-1479. Thorslund, T., Esashi, F., and West, S.C. (2007). Interactions between human BRCA2 protein and the meiosis-specific recombinase DMC1. Embo J 26, 2915-2922. Tung, K.S., and Roeder, G.S. (1998). Meiotic chromosome morphology and behavior in zip1 mutants of Saccharomyces cerevisiae. Genetics 149, 817-832. Uematsu, Y., Kiefer, H., Schulze, R., Fischer-Lindahl, K., and Steinmetz, M. (1986). Molecular characterization of a meiotic recombinational hotspot enhancing homologous equal crossing- over. Embo J 5, 2123-2129. Valdeolmillos, A.M., Viera, A., Page, J., Prieto, I., Santos, J.L., Parra, M.T., Heck, M.M., Martinez, A.C., Barbero, J.L., Suja, J.A., and Rufas, J.S. (2007). Sequential loading of cohesin subunits during the first meiotic prophase of grasshoppers. PLoS Genet 3, e28. Valero, M.C., Pascual-Castroviejo, I., Velasco, E., Moreno, F., and Hernandez-Chico, C. (1997). Identification of de novo deletions at the NF1 gene: no preferential paternal origin and phenotypic analysis of patients. Hum Genet 99, 720-726. Vignard, J., Siwiec, T., Chelysheva, L., Vrielynck, N., Gonord, F., Armstrong, S.J., Schlogelhofer, P., and Mercier, R. (2007). The interplay of RecA-related proteins and the MND1-HOP2 complex during meiosis in Arabidopsis thaliana. PLoS Genet 3, 1894-1906. Wallace, B.M., and Wallace, H. (2003). Synaptonemal complex karyotype of zebrafish. Heredity 90, 136-140. Wang, K., Tang, D., Wang, M., Lu, J., Yu, H., Liu, J., Qian, B., Gong, Z., Wang, X., Chen, J., Gu, M., and Cheng, Z. (2009). MER3 is required for normal meiotic crossover formation, but not for presynaptic alignment in rice. J Cell Sci 122, 2055-2063. Wang, T.F., Kleckner, N., and Hunter, N. (1999). Functional specificity of MutL homologs in yeast: evidence for three Mlh1-based heterocomplexes with distinct roles during meiosis in recombination and mismatch correction. Proc Natl Acad Sci U S A 96, 13914-13919. Watanabe, Y. (2006). A one-sided view of kinetochore attachment in meiosis. Cell 126, 1030-1032. Webb, A.J., Berg, I.L., and Jeffreys, A. (2008). Sperm cross-over activity in regions of the human genome showing extreme breakdown of marker association. Proc Natl Acad Sci U S A 105, 10471-10476. Wesoly, J., Agarwal, S., Sigurdsson, S., Bussen, W., Van Komen, S., Qin, J., van Steeg, H., van Benthem, J., Wassenaar, E., Baarends, W.M., Ghazvini, M., Tafel, A.A., Heath, H., Galjart,

160 N., Essers, J., Grootegoed, J.A., Arnheim, N., Bezzubova, O., Buerstedde, J.M., Sung, P., and Kanaar, R. (2006). Differential contributions of mammalian Rad54 paralogs to recombination, DNA damage repair, and meiosis. Mol Cell Biol 26, 976-989. White, M.A., Dominska, M., and Petes, T.D. (1993). Transcription factors are required for the meiotic recombination hotspot at the HIS4 locus in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 90, 6621-6625. Wilson, J.Y. (1959). Duration of meiosis in relation to temperature. Heredity 13, 263–267. Winckler, W., Myers, S.R., Richter, D.J., Onofrio, R.C., McDonald, G.J., Bontrop, R.E., McVean, G.A., Gabriel, S.B., Reich, D., Donnelly, P., and Altshuler, D. (2005). Comparison of fine-scale recombination rates in humans and chimpanzees. Science 308, 107-111. Wu, Z.K., Getun, I.V., and Bois, P.R.J. (2010). Anatomy of mouse recombination hot spots. Nucleic Acids Res 38, 2346-2354. Yamada, T., Mizuno, K., Hirota, K., Kon, N., Wahls, W.P., Hartsuiker, E., Murofushi, H., Shibata, T., and Ohta, K. (2004). Roles of histone acetylation and chromatin remodeling factor in a meiotic recombination hotspot. Embo J 23, 1792-1803. Yandeau-Nelson, M.D., Zhou, Q., Yao, H., Xu, X., Nikolau, B.J., and Schnable, P.S. (2005). MuDR transposase increases the frequency of meiotic crossovers in the vicinity of a Mu insertion in the maize a1 gene. Genetics 169, 917-929. Yang, H., Lu, P., Wang, Y., and Ma, H. (2011). The transcriptome landscape of Arabidopsis male meiocytes from high-throughput sequencing: the complexity and evolution of the meiotic process. The Plant Journal 65, 503–516. Yao, H., and Schnable, P.S. (2005). Cis-effects on meiotic recombination across distinct a1-sh2 intervals in a common Zea genetic background. Genetics 170, 1929-1944. Yauk, C.L., Bois, P.R., and Jeffreys, A.J. (2003). High-resolution sperm typing of meiotic recombination in the mouse MHC Ebeta gene. Embo J 22, 1389-1397. Yip, S.P., Lovegrove, J.U., Rana, N.A., Hopkinson, D.A., and Whitehouse, D.B. (1999). Mapping recombination hotspots in human phosphoglucomutase (PGM1). Hum Mol Genet 8, 1699- 1706. Yoshida, K., Kondoh, G., Matsuda, Y., Habu, T., Nishimune, Y., and Morita, T. (1998). The mouse RecA-like gene Dmc1 is required for homologous chromosome synapsis during meiosis. Mol Cell 1, 707-718. Youds, J.L., Mets, D.G., McIlwraith, M.J., Martin, J.S., Ward, J.D., Oneil, N.J., Rose, A.M., West, S.C., Meyer, B.J., and Boulton, S.J. (2010). RTEL-1 Enforces Meiotic Crossover Interference and Homeostasis. Science 327, 1254-1258. Young, J.A., Hyppa, R.W., and Smith, G.R. (2004). Conserved and nonconserved proteins for meiotic DNA breakage and repair in yeasts. Genetics 167, 593-605. Yu, H., Wang, M., Tang, D., Wang, K., Chen, F., Gong, Z., Gu, M., and Cheng, Z. (2010). OsSPO11-1 is essential for both homologous chromosome pairing and crossover formation in rice. Chromosoma. Zangenberg, G., Huang, M.M., Arnheim, N., and Erlich, H. (1995). New HLA-DPB1 alleles generated by interallelic gene conversion detected by analysis of sperm. Nat Genet 10, 407-414. Zetka, M.C., Kawasaki, I., Strome, S., and Muller, F. (1999). Synapsis and chiasma formation in Caenorhabditis elegans require HIM-3, a meiotic chromosome core component that functions in chromosome segregation. Genes Dev 13, 2258-2270. Zickler, D. (1977). Development of the synaptonemal complex and the "recombination nodules" during meiotic prophase in the seven bivalents of the fungus Sordaria macrospora Auersw. Chromosoma 61, 289-316. Zickler, D. (1999). [The synaptonemal complex: a structure necessary for pairing, recombination or organization of the meiotic chromosome?]. J Soc Biol 193, 17-22. Zickler, D., and Kleckner, N. (1998). The leptotene-zygotene transition of meiosis. Annu Rev Genet 32, 619-697.

161 Zickler, D., and Kleckner, N. (1999). Meiotic chromosomes: integrating structure and function. Annu Rev Genet 33, 603-754.

162

(Armstrong et al., 2003) (Ernst, 1938; Sparrow et al., 1955; Wilson, 1959; Bennett, 1971; Schmekel et al., 1993; Sym et al., 1993; Tung and Roeder, 1998; Jeffreys et al., 2000; Jeffreys et al., 2001; Guillon and De Massy, 2002; Inoue and Lupski, 2002; May et al., 2002; Yauk et al., 2003; Jeffreys and May, 2004; Krogh and Symington, 2004; Sun et al., 2004; Anderson and Stack, 2005; Jeffreys et al., 2005; Kauppi et al., 2005; Winckler et al., 2005; Holloway et al., 2006; Bois, 2007; Chen et al., 2007; Kauppi et al., 2007; Krichevsky et al., 2007; Haber, 2008; Ng et al., 2008; Webb et al., 2008; Cazzonelli et al., 2009; de Muyt et al., 2009b; Parvanov et al., 2009; Berg et al., 2010; Perrella et al., 2010; Pontvianne et al., 2010; Wu et al., 2010; Berg et al., 2011; Grey et al., 2011; Libeau et al., 2011; Osman et al., 2011; Ponting, 2011; Yang et al., 2011) (McVean et al., 2004)

163