<<

University of New Hampshire University of New Hampshire Scholars' Repository

Master's Theses and Capstones Student Scholarship

Winter 2016

Marker assisted selection of sex determination in kiwiberry ( arguta and )

Haley Gustafson University of New Hampshire, Durham

Follow this and additional works at: https://scholars.unh.edu/thesis

Recommended Citation Gustafson, Haley, "Marker assisted selection of sex determination in kiwiberry ( and Actinidia kolomikta)" (2016). Master's Theses and Capstones. 1101. https://scholars.unh.edu/thesis/1101

This Thesis is brought to you for free and open access by the Student Scholarship at University of New Hampshire Scholars' Repository. It has been accepted for inclusion in Master's Theses and Capstones by an authorized administrator of University of New Hampshire Scholars' Repository. For more information, please contact [email protected].

MARKER ASSISTED SELECTION OF SEX DETERMINATION IN KIWIBERRY (ACTINIDIA ARGUTA AND ACTINIDIA KOLOMIKTA)

BY

HALEY GUSTAFSON BS Biology & Biotechnology, Worcester Polytechnic Institute, 2012

THESIS

Submitted to the University of New Hampshire in Partial Fulfillment of the Requirements for the Degree of

Master of Science in Biology

December, 2016

This thesis has been examined and approved in partial fulfillment of the requirements of the degree of Master of Science in Plant Biology by:

Thesis Director: Iago Hale, Assistant Professor; Biological Sciences

Thomas Davis, Professor; Biological Sciences

David Plachetzki, Assistant Professor; Molecular, Cellular and Biomedical Sciences

On November 22, 2016

Original approval signatures are on file with the University of New Hampshire Graduate School

ii

TABLE OF CONTENTS

LIST OF TABLES...... iv LIST OF FIGURES...... vi ABSTRACT...... v

CHAPTER PAGE

PREFACE...... 1

I. GENETIC MARKERS AND THEIR APPLICATIONS IN PLANT BREEDING...... 3 Morphological Markers Biochemical Markers Molecular Markers Molecular Marker Development Fragment-Based Markers Sequence-Based Markers Marker Assisted Breeding Breeding Actinidia spp.

II. SEX DETERMINATION IN ANGIOSPERMS...... 16 Pathways to Dioecy Breeding Dioecious Crop Sex determination in Actinidia

III. DEVELOPMENT OF PCR MARKERS FOR SEX IDENTIFICATION IN KIWIBERRY...... 26 Introduction Materials and Methods Results Discusson

CONCLUSION AND FUTURE DIRECTIONS...... 42

LIST OF REFERENCES...... 44

SUPPLEMENTARY MATERIALS...... 53

iii

LIST OF TABLES

Table 1: Comparison of common MAS marker systems

Table 2: The sex determination systems of some important dioecious crop plants

Table 3: Genotypic and Phenotypic descriptions of sex in Actinidia

Table 4: Summary of Bioinformatics Analysis

Table 5: Details of Putatively Sex-Linked Clusters

Table 6: Summary of PCR Scores

Table 7: Primer and Cluster Sequence Information

Table 8: Sex-marker scores in thirteen A. arguta F1 populations

iv

LIST OF FIGURES

Figure 1: Genetic Markers Commonly Used in Plants

Figure 2: Development of Chromosomal Sex Systems from Hermaphroditism

Figure 3: Floral Structure of Kiwiberry Sexes

Figure 4: Imaging of A. arguta and A. kolomikta Sex Markers

v

ABSTRACT

MARKER ASSISTED SELECTION OF SEX DETERMINATION IN KIWIBERRY

(ACTINIDIA ARGUTA AND ACTINIDIA KOLOMIKTA)

by

Haley Gustafson

University of New Hampshire

The selection of phenotypic traits in plant breeding has recently become augmented by the implementation of molecular markers. Marker-assisted breeding strategies are used to develop new of many important crop plants but have yet to be assimilated into breeding programs for many minor crops with untapped economic potential. In the following chapters, I discuss methods of marker development in crop plants, sex determination as a trait of economic interest in plant breeding, and the identification of two male-dominant, PCR-based markers in the commercially viable Kiwiberry Actinidia arguta and A. kolomikta.

vi

PREFACE

The genetic improvement of crops is a lengthy and costly process that can often be made more efficient through the implementation of molecular tools. The molecular underpinnings of traits can be used to monitor and exploit genetic linkage, pleiotropy, and confounding epistatic effects in ways that may be unachievable via traditional, phenotype-based approaches. Marker assisted selection (MAS) is used today in a number of economically-important crops, largely to facilitate the improvement of disease resistance, product quality, and yield. MAS is notably useful in breeding crops for which traditional approaches are particularly difficult or highly resource intensive – particularly in late-maturing and long lived perennial crops, like

(Actinidia spp.), in which plants can require several years of development before a cross can be made. The first chapter of this thesis describes common marker systems in plants, their uses in

MAS, and current breeding efforts in Actinidia spp.

A major complication of breeding efforts in many crops is the presence of dioecy, wherein the opposite (male and female) sexes develop on separate individuals, reducing not only the number of available crosses but also the overall resource efficiency of breeding programs.

Dozens of markers for sex-determination have been developed for use in the breeding and commercial production of dioecious species in the last 30 years, but only recently have any specific sex mechanisms been identified. In the second chapter, a review of sexual systems in plants elucidates the general mechanisms involved in many taxa and describes what is known about the hereditary basis of sex determination in the Actinidia.

Of specific interest to this study is the development of improved fruiting varieties of kiwiberry (Actinidia arguta and A. kolomikta), which is a highly resource intensive process

(land, labor, and time) due to the dioecy and extended juvenile phase (3-5 years) of these

1 vigorous, woody vining species. The ability to identify gynoecious (female) plants at the seedling stage is desired as a means of increasing the resource-use efficiency of breeding programs. In order to develop a molecular marker to this end, both the available marker systems and the hereditary basis of sex determination in kiwiberries must be understood. The third chapter describes how sex-linked polymorphisms in both kiwiberry species were identified through reference-independent (de novo) genotyping-by-sequencing (GBS) analysis, validated in multiple independent populations, and converted into robust gel-based PCR markers appropriate for marker-assisted breeding programs. Both developed markers are now being used for high- throughput sex-screening assays of seedling populations.

The development of sex-linked markers and practical protocols for genotypic screening in kiwiberries paves the way for successive research in marker-assisted selection in these species.

Markers for color, development and ripening, nutritional and allergenic components, and disease resistance are all possible to investigate with the genotyping strategy presented here, and breeding to improve these traits will benefit from parallel sex screening. Finally, the thesis ends with discussion of further optimization of these protocols and their amenability to multiplexing for efficient MAS of multiple traits, as well as other possible uses in Actinidia genomics.

2

CHAPTER I. GENETIC MARKERS AND THEIR APPLICATIONS IN PLANT BREEDING

In genetics, a marker is any heritable characteristic that exists in variant forms, generally referred to as alleles, and can be monitored or mapped relative to other markers. In plant breeding, markers become useful as a basis for selection when they are linked to a phenotype of interest within a population and throughout generations. To that end, such selectable markers must have either a functional relationship with or be in close genetic proximity to (i.e. linked to) a gene or genomic region underlying a trait of interest, reducing the likelihood of crossing over between the marker and associated gene(s) during meiosis. Markers may be morphological, biochemical, or molecular, and each class of marker has advantages and disadvantages relative to plant breeding (Figure 1). As described in detail in the following paragraphs, several considerations must be made in developing markers for traits of interest in breeding, namely (1) how informative the marker is; (2) the abundance, distribution, and polymorphism levels of the marker in the target population; (3) whether the marker is conducive to scaling and multiplexing with others; and, all else being equal, (4) the cost and time taken in both its development and implementation.

3

Figure 1 Genetic Markers Commonly Used in Plants

Morphological Markers

The first genetic markers were variable morphological characteristics, such as color or morphology, which result from underlying genetic polymorphism. Morphological markers were studied for heritable associations with other phenotypic traits; for example, Karl

Sax identified an association between coat patterning and seed size in common bean (Sax,

1923), and the first linkage map in tomato showed a strong association between habit (tall versus dwarf) and fruit pubescence (MacArthur, 1925). The study of trait associations revealed opportunities in plant breeding, such as making controlled crosses that were more likely to result in certain disease resistances with the presence of another trait. Some efforts have been made

4 recently in papaya and jojoba to identify juvenile leaf characteristics linked to plant sex (Inoti et al, 2015), in part because of the inexpensive and quick scoring of such traits. However, morphological markers are not only sparse but also inadequately informative; they tend to be heavily influenced by the environment, confounded by epistatic and pleiotropic gene action, and do not provide genotypic information without a pedigree.

Biochemical Markers

Biochemical markers are more direct indicators of allelic states than morphological markers in that they assay a gene product. The most common biochemical markers in plant breeding are isozymes, which are enzymes with variable structure or charge resulting from amino acid variations but that retain identical catalytic function. Apart the costs of identifying isozyme systems, isozymetic analysis of known systems consists of simple and low-cost methodology (Kumar et al, 2009). Many isozymes have been used in diversity analyses and fingerprinting, such as in grapevine (Stavrakakis and Loukas, 1983), kiwifruit (Messina et al, 1991), and potato (Huaman et al, 2000), and some have been associated with agronomic traits. One such marker is a specific high molecular weight gluten subunit (HMW-GS), which was determined to be linked to gluten strength in bread wheat and, consequentially, bread- making quality (Payne et al., 1987). The modern use of isozymes has become limited by their low abundance (Kumar et al, 2009) and low variability in many crop plants. For example, genotyping efforts in peach found isozymes to be insufficiently polymorphic for use (Ahmad,

2004). Also, isozymes may be altered by adaptive effects and environmental variance, and variable presence in tissues and during a plant’s development are obstacles in detection.

5

Molecular Markers

Molecular markers are specific DNA sequences with variants called alleles. DNA variation can be in the form of single nucleotide polymorphisms (SNPs), insertions and deletions

(indels), or variable number tandem repeats (VNTRs). Molecular markers have several advantages over morphological and biochemical markers for use in plant breeding. To begin with, they are generally not confounded by environmental, pleiotropic, penetrative, or epistatic effects. For example, the RNA of an isozyme variant linked to a disease resistance allele may be subjected to alternative splicing in the presence of some external factor, altering the enzyme conformation, while the DNA sequence remains unchanged; in this example, the DNA sequence is a more robust marker. Secondly, their abundance and wide genomic distribution facilitates the development of much denser genetic maps than have previously been possible. For example, only six dimorphic traits were included in MacArthur’s (1925) tomato linkage map. Similarly, of the 295 loci included in an early integrated genetic map of the barley genome, only seven were either morphological or isozymetic markers (Kleinhofs et al, 1993). More recent high-density linkage maps contain anywhere from thousands of molecular markers, as in pear (Wu et al,

2014) and sweet cherry (Guajardo et al, 2015), to millions, as in rice (Huang et al, 2010) and maize (Liu et al, 2015). Also of significance is that molecular markers do not need to be expressed in order to be detected and are assayable in all tissues from which genomic DNA can be isolated. Some morphological and biochemical markers have been converted into molecular markers to gain these benefits. For example, PCR-based markers of HMW-GS alleles in bread wheat are used as an efficient alternative to isozymetic analysis (Ahmad, 2000; Moczulski and

Salmanowicz, 2003), which can only be performed following grain production. While they

6 require some time and financial investment to develop, molecular markers are often a more informative, efficient and reliable option for marker-assisted selection.

Molecular Marker Development

A molecular marker can be defined by the type of variation as well as the method of discovery, both of which contribute to their relative benefits and drawbacks. In plant genomes,

SNPs are typically the most abundant (Mammadov et al, 2012, Zargar, 2015), while repeat variation is often hyper-variable and co-dominant in length (Sunnucks, 2000, Zargar, 2015).

Vital to marker development is that the marker must be reliably detectable by some means, typically hybridization probing, PCR, or sequencing. Historically, the most common marker systems in plants have been fragment based techniques: restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), randomly amplified polymorphic DNAs (RAPDs), and microsatellites or simple sequence repeats (SSRs)

(Table 1). Although many are still in use, particularly SSRs, array and sequencing based techniques are becoming increasingly common. Single nucleotide polymorphisms (SNPs), detectable by sequencing-based techniques and amenable to a number of genotyping methods, have most recently begun to surpass other marker systems.

7

Table 1 Comparison of common marker systems. Adapted from Agarwal et al,(2008) and Kumar et al, (2009).

Marker System Isozyme RFLP RAPD AFLP SSR SNP Abundance Low to High High High Moderate Very High Moderate to High Expression Co-Dom Co-Dom Dom Both Co-Dom Both Sequence No No No No Yes Yes Required Reproducability High High Low High High High Amenability to Low Low Moderate High High Moderate to Multiplexing High Genotyping Low Low Low High High High throughput Technical Low High Low Medium Medium Low to High Requirements Development/ High Moderate Low Moderate Moderate High start-up cost to High to High Cost per assay Low High Low Moderate Low Low

Fragment-Based Markers

In the past, mining for genetic variation involved primarily wet-lab techniques and was labor and time intensive. Developing RFLPs involves restriction digestion, electrophoresis, southern blotting, hybridization with labeled probes from specifically designed libraries, and autoradiography, all to detect only a few polymorphic loci per assay (Semagn et al, 2006; Kumar et al, 2009; Jiang, 2013). PCR-based approaches allow a significant reduction in the time spent per assay. The most direct example of this is AFLP, also called PCR-RFLP, which involves the selective amplification of restricted fragments via adapter-mediated PCR, removing several steps from the RFLP methodology. PCR-based markers are further simplified in the form of RAPDs, which avoids the digestion and adapter ligation of AFLPs and produces many more polymorphic loci per assay (Jiang, 2013). However, it does not typically produce co-dominant markers like

RFLPs and AFLPs and is not as reproducible due to sensitivity to assay conditions (Kumar et al,

2009). A RAPD marker can be converted into a Sequence Characterized Amplified Region

8

(SCAR) if sequence information can be obtained, allowing for the development of more transferable markers, but this does not improve its use in mining polymorphic loci.

Microsatellites, or simple sequence repeats (SSRs), are very commonly used as markers in plants due to their abundance and high rates of polymorphism, tendency toward co- dominance, and transferability among closely related species (Sunnucks, 2000; Semagn et al,

2006; Kumar et al, 2009; Jiang, 2013). Therefore, the potential for mining polymorphisms surpasses other hybridization and PCR based methods once a library of microsatellites is developed for a species. However, microsatellite library development is a time consuming procedure with high start-up costs (Jiang, 2013), involving several steps like DNA extraction,

DNA digestion with a restriction enzyme, ligation of linkers to DNA fragments, PCR-enrichment for microsatellite-containing fragments, hybridization to microsatellite-specific probes, recovery of microsatellite-containing fragments, cloning and sequencing of products, and validation across diverse germplasm (Rauscher and Simko, 2013). The availability and quality pf SSRs are consequently limited in organisms without available genomic information.

Sequence-Based Markers

Marker discovery techniques are largely trial-and-error processes, and resource-intensive laboratory methods like those described above (e.g. RFLP, AFLP, etc.) have become less attractive with the development of Next Generation Sequencing (NGS). NGS is a collection of high-throughput sequencing methods that may differ in sequencing chemistry but which all provide access to large-scale genomic sequence information. In conjunction with improved bioinformatics technologies, the trial-and-error process inherent to marker development is largely automated in silico and polymorphism recovery is significantly higher. The lower cost

9 per data point and short development timeframe of NGS methods (Davey and Blaxter, 2010), have made the bioinformatic mining of polymorphisms more accessible and have led to the proliferation of SNP genotyping (Mammadov et al,2012). Although wet-lab work is necessary for sequencing, it is comparatively higher-throughput than other marker-discovery techniques and the data collected can be used indefinitely, limiting sample handling and minimizing the amount of DNA needed. Markers can be analyzed strictly by sequencing, but dense marker assays, such as the 52,157 marker SNP Illumina Infinium array in rapeseed (Clarke et al, 2016) and the 95,062 marker Affymetrix Axiom SNP array for the cultivated (Bassil et al,

2014), remain cost-prohibitive in all but a small number of breeding programs. Targeted genome enrichment via methods like restriction-site or RNA-associated genome-reduction make

Genotyping-by-Sequencing (GBS) more affordable to smaller programs while still recovering a significant number of polymorphisms (Cronn et al, 2012).

In less-funded programs, or those for which the target species has a large and complex genome, NGS is commonly used as a marker detection system in conjunction with traditional marker assays. For example, Expressed-Sequence-Tagged (EST) databases are now commonly mined for microsatellites which occur regularly in plant ESTs. In kiwifruit, for example, an integrated SSR and AFLP female linkage map contained 71 SSR loci (Testolin et al, 2001), whereas 107 EST-derived loci were identified in significantly less time (Fraser et al, 2004). More recently, Fraser et al (2009) developed linkage maps with 644 SSRs from the Kiwifruit EST database and enriched genomic libraries, noting that EST-derived SSRs were more often suitable for mapping. SNPs and indels identified via sequencing are often converted into Cleaved

Amplified Polymorphisms (CAPs) and SCAR markers, as in kiwifruit (Fraser et al, 2009), barley

(Varshney et al, 2008), and soybean (Shu et al, 2011); and several allele specific methods, such

10 as the Amplification Refractory Mutation System (ARMS) developed by Newton et al (1989), can be used to directly amplify allele specific SNPs.

Marker Assisted Breeding

The identification of polymorphic loci leads to the selection of candidate markers for traits of interest via association analyses. Simple, qualitative traits or complex, quantitative traits can both be associated with markers; and whether a trait is monogenic or polygenic can be inferred by observing inheritance patterns. Markers of presumed monogenic traits tend to be more informative due to simple genetic control and limited environmental influence. Complex traits like yield and grain or fruit quality are of particular interest in crop improvement due to their economic importance, and advances in technologies for marker development and genetic mapping have enabled the creation of complex marker profiles for these traits. More recently, there has been a movement towards genomic selection for the improvement of complex traits

(Mammadov et al, 2012). Nonetheless, many mono- and pseudo-monofactorial traits (those determined monogenically with limited environmental influence), like many disease resistance, fertility, and fruiting characteristics, remain of economic interest and worthwhile targets for

MAS; and the high polymorphism recovery of NGS methods allow for genome-wide association analyses (Zargar, 2015).

When markers are verified to be associated with phenotypic traits in a population of interest and the association is retained in progeny, those traits can be tracked in subsequent generations without phenotypic observation, and breeding selections can be made based on marker profiles. Marker-trait associations are particularly useful when breeders want to track traits that are not easily discernible, such as in cases where a phenotype cannot be easily

11 measured, one gene masks the effects of another (epistasis), or multiple desired traits have the same presentation (such as in resistance gene pyramiding). For example, as shown in the work of

Xu et al, 2004, scoring submergence tolerance in rice involves a 20+ day procedure with a complex system of equipment, and doing so for thousands of individuals over many generations of a breeding stock is extremely costly. Although it took screening nearly 3000 seedlings to identify a PCR-based microsatellite marker tightly linked to the major controlling gene of submergence tolerance, that marker can now be used to screen future plants in a fraction of the time and cost. MAS also introduces new breeding capabilities, such as in resistance gene pyramiding, where multiple desired alleles all have the same phenotypic effect and are therefore difficult to assess for their combined presence. Markers have been used to combine resistance genes in a number of crops, such as those for cereal cyst resistance in wheat (Barloy et al, 2006), soybean mosaic virus in soybean (Maroof et al, 2008), and powdery and downy mildews in grape (Eibach et al, 2007).

MAS is a particularly promising technique for breeding crops that are slow-maturing and perennial, as these characteristics increase the time, space, and materials needed for traditional breeding. In many cases, the desired traits of perennial crops (e.g. fruit quality and yield) do not present for several years, even decades, and maintaining the thousands of plants needed for adequate breeding population sizes can be extremely resource inefficient. Most temperate fruit tree crops (e.g. apple, peach, pear, cherry, etc.) have extended juvenile phases, meaning many traits of interest (e.g. fruit quality and yield) cannot be assessed for at least a few years, and sometimes for up to 12 years, as is the case for some cultivars of apple (Iwata et al, 2016).

Although MAS cannot itself speed up the generational time of fruit tree crops, it can drastically reduce the time, space, and materials required to maintain effective breeding populations; and

12

MAS has been used in the development of tree crops that transition more quickly into the adult phase (Iwata et al, 2016). Unlike the wait-and-see approach of traditional phenotype-based selection, selection based on molecular markers, despite the initial investment of MAS assays, can be of great benefit to breeding programs.

Dioecy, a characteristic of angiosperms where unisexual develop on separate individuals, has implications in both plant breeding and crop establishment. In many cases, dioecy has no impact on crop production, as in the cases of plants grown for vegetative structures

(e.g. poplar and spinach). However, the productivity of crops grown for fruit produced via pollination of female flowers scales with the number of females planted, and the necessity of pollinizing males reduces the field space available for female plants. Dioecy complicates breeding efforts by reducing the number of possible crosses, forcing selection of pollenizers without direct knowledge of the fruiting characteristics they carry, and delaying selection choices when plants have no sexually dimorphic traits until flowering. Isozymes (particularly peroxidases) have been explored as sex markers due to sexually differential expression in kiwifruit, papaya, date palm, and others (Doust and Doust, 1988). Molecular markers for sex determination have been developed and employed in breeding programs for a number of dioecious crop plants, such as asparagus (Reamon-Buettner and Jung, 2000; Kanno et al, 2014), papaya (Parasnis et al, 2000; Deputy et al, 2002; Saxena et al 2016), pistachio (Homaza et al,

1994; Yakubov et al, 2005; Kafkas et al, 2015), and hops (Polley et al, 1997; Danilova and

Karlov, 2006).

13

Breeding Actinidia spp.

The kiwifruit () and the golden kiwifruit (A. chinensis) are relatively new crop , having been domesticated and cultivated commercially within the last century

(Ferguson, 2013). Kiwifruit are dioecious perennial that are space and management intensive, and all commercially grown cultivars are propagated clonally due to the extended juvenile-phase (3-5 years), variable fruit quality of hybrids (Ferguson and Huang, 2007), and variable ploidy levels within and between species (Mcneilage and Considine, 1989). As such, most available varieties are at most a few generations away from wild plants, with very few successful cultivars originating from controlled crosses (Ferguson and Huang, 2007). Despite the difficulties in their cultivation, the few available cultivars are the basis of a $500M global industry (FAO, 2013) and boast exceptional flavor and nutritional profiles (Nishiyama et al.,

2004; Ferguson and Huang, 2007; Crowhurst et al, 2008; Huang et al, 2013). Of similar potential are the yet-undomesticated species A. arguta and A. kolomikta, the two cold-hardiest species among the 76 within the highly diverse Actinidia genus (Ferguson and Huang, 2007).

Coalescing in the marketplace under the term "kiwiberry," these species produce grape-sized, smooth-skinned versions of the better known fuzzy and have been recognized for over a century as a promising horticultural crop, particularly in more northern latitudes (Ferguson and

Huang, 2007).

Numerous resources have recently become available in kiwifruit genomics, including male and female linkage maps (Fraser et al, 2009), a draft genome of A. chinensis (Huang et al,

2013) and multiple sex markers in both cultivated kiwifruit species (Harvey et al, 1997; Gill et al, 1998; Shirkot et al, 2002; Fraser et al, 2009). Some studies have characterized the frequencies of marker types in the A. deliciosa, showing SSRs to be more abundant and more polymorphic

14 than RAPDs (Palombi and Damiano, 2001); and a large EST database has been developed from several Actinidia species (A. deliciosa, A. chinensis, A. arguta, A. eriantha, and others) and analyzed for genes related to fruit flavor, color, nutritional components (e.g. C), and ripening (Crowhurst et al, 2008). Many markers in Actinidia are used for cultivar fingerprinting and diversity analysis (Messina et al, 1991; Cipriani et al, 1996); but few markers, apart from those of sex determination, are yet used for MAS in kiwifruit, and none have proven transferrable to the kiwiberries.

The success of sex marker development and improving genetic resources in kiwifruit prompts similar approaches in the kiwiberries. Developing sex markers in kiwiberries will begin to facilitate the development of new cultivars via controlled crosses, as is beginning to happen with the increasing discovery of molecular markers in Actinidia spp. In order to do this, the hereditary basis of sex determination in kiwiberries must be ascertained.

15

CHAPTER II. SEX DETERMINATION IN ANGIOSPERMS

Although the majority of flowering plants are hermaphroditic, many fall under some category of monoecy or dioecy, in which sexes are produced separately on the same (monoecy) or different (dioecy) individuals. Occurring in 6% of angiosperms, dioecy has representatives in at least 200 plant families and 900 genera with diverse taxonomic distribution (Ming et al., 2011;

Renner, 2014). Given its distribution, dioecy is expected to have arisen independently between several hundred and several thousand times and in many ancestrally hermaphroditic or monoecious lineages (Charlesworth, 2002, Renner, 2014). The specific genetic mechanisms of sex determination (SD) are therefore likely to vary greatly between species. Several basic sex determination systems have been documented in plants; environmental, cytoplasmic, and chromosomal SD. These systems are often distinguishable by sex segregation patterns, with chromosomal SD being the primary system where ratios approach 1M:1F, as is the case of

Actinidia spp. (Testolin et al, 1995).

Pathways to Dioecy

Most dioecious species are hypothesized to have developed via a monoecy-dioecy or a gynodioecy-dioecy pathway (Renner, 2014), culminating in the evolution ofso-called sex chromosomes (Figure 2). These pathways are supported by not only the rarity of androdioecy but also the prevalence of taxa containing both monoecious and dioecious species or gynodioecious and dioecious species. Indeed, while 210 genera contain both monoecious and dioecious members, and 59 genera contain both gynodioecious and dioecious members, very few species are androdioecious (Renner, 2014). These pathways ultimately determine the initial architecture

16 of the sex chromosomes, leading to three possible SD systems: XY, the most prevalent system

(Ming et al, 2011), in which males are the heterogametic sex; ZW, in which females are the heterogametic sex; and XA, in which sex is determined by the X:autosome ratio.

In the gynodioecy-dioecy pathway, proposed by Charlesworth and Charlesworth (1978) and highlighted in Figure 2, at least two reciprocal genetic changes are required for the development of separate sexes. First, a mutation causing male-sterility converts hermaphroditism to gynodioecy. In some gynodioecious populations, male-sterility is cytoplasmically (or gene- cytoplasmically) determined, as in Lobelia spp. (Delph and Montgomery, 2014). However, the transition to full dioecy with chromosomal SD also requires nuclear male sterility (mst). This mutation may be dominant or recessive, with the latter being significantly more likely due to the rarity of gain-of-function and dominant loss-of-function mutations. However, dominant mst mutations have been documented in gynodioecious species, such as in Fragaria vesca ssp. bracteata (Tennessen et al, 2013), leading to a ZW system. For an XY system to develop, a dominant female-sterility (fst) mutation linked to the male-fertility locus also must occur. While unlinked recessive mutations are more likely to occur, they cannot invade the population; instead, hermaphroditic individuals will increase in frequency and revert the population to gynodioecy, while a recessive fst locus linked to the mst locus will result in sterility instead of males. Once a dominant fst allele linked to the male-fertility locus occurs, it allows a 1:1 production of males and females as it invades the population, with a low occurrence of hermaphrodites and sterile individuals, depending on the tightness of the linkage.

17

Figure 2 Development of Chromosomal Sex Systems from Hermaphroditism 1a) A male sterility mutation leads to gynodioecy or 1b) a female sterility mutation leads to androdioecy. 2a) A second recessive mutation inducing reciprocal sterility is unstable; hermaphrodites accumulate. 2b) From gynodioecy, a dominant female sterility mutation is retained if linked to the male fertility locus, creating an XY system of SD.3) Recombination suppression leads to male and female specific chromosomes. 4) Expansion of non-coding DNA content in the Y chromosome leads to cytological heteromorphism. 5) Y chromosome degenerates and 6) is ultimately lost, establishing the XA system of SD.

18

The reciprocal pathway, androdioecy-dioecy, though relatively less studied, is hypothesized to occur similarly to create a ZW system (Ming et al, 2011). The prevalence of gynodioecy over androdioecy as an intermediary, evidenced by the prevalence of the XY system and the rarity of androdioecy (Ainsworth, 2000; Ming et al, 2011; Renner, 2014), can possibly be explained by the female reproductive advantage: with often at least a two-fold fertility advantage over hermaphrodites (Spigler and Ashman, 2011), females can readily proliferate in a population. For androdioecy to become established, males must either double pollen output or propagate vegetatively (Charlesworth and Charlesworth, 1978). The monoecy-dioecy pathway involves a similar derivation of sexes, which develop either through andromonoecy or gynomonoecy. Once the sex determination system is established, mutations are thought to occur to alter the ratio of unisexual flowers to create entirely unisexual individuals (Renner and

Ricklefs, 1995).

The fixation of sterility alleles can be followed by an accumulation of sexually- antagonistic genes, large inversions, and translocations, which suppress recombination. In some cases, this can lead to the degeneration of noncoding regions of the Y chromosome and the development of heteromorphic sex chromosomes, as seen in Silene latifolia (Ishii et al, 2013). In some cases, the loss of the Y chromosome may also lead to an X:autosome ratio sex determination system, as seen in Humulus and Rumex. However, many plant sex chromosomes are homomorphic (Table 2).

19

Breeding Dioecious Crop Plants

Despite the rarity of dioecy, many important crop plants fall under this category (Table

2). Breeders and growers manage dioecious species in a number of ways, typically depending on the system of sex determination involved. In some cases, nearly entirely female populations can be developed through vegetative propagation, as in kiwifruit (retaining some male pollenizers to induce fruit-set). However, genetic uniformity poses other problems, such as increased disease susceptibility, which can be compounded by the accumulation of pathogens in long-standing crops. This is seen in banana, a vegetatively propagated crop heavily affected by black leaf streak disease (Isaza et al, 2016). Sometimes, genetically diverse single sex populations can be created through crossing due to the presence of some hermaphroditic individuals. For example, in genotypic males of asparagus, the vegetative tissues of the ovary remain, and functionally hermaphroditic flowers have been known to occur on otherwise male plants (Bracale et al.,

1991). These hermaphroditic flowers on males can result in female (m/m), male (M/m), or supermale (M/M) progeny (Gebler et al.2007). Supermales will yield all male (M/m) plants when crossed with a female, promising the increased vegetative vigor (Moon, 1976) of both all- male populations and hybrid populations. Ultimately, both breeders and growers must base population management on the specific system of SD in the target population, and breeders need to consider these strategies when developing new cultivars.

20

Table 2 The sex determination system of some important dioecious crop plants. Adapted from Ming et al, 2011 and Grewal and Goyat, 2015, with other sources (Ainsworth, 2000; Renganayaki et al, 2005; Li et al, 2010; Jangra et al, 2014; Kafkas et al, 2015; Chen et al, 2016).

Higher Taxa Species Common Sex System Sex Chromosomes Sex Marker Name Dioscoreales Dioscoreaceae Dioscorea tokoro Yam active-Y homomorphic yes Asparagales Asparagaceae Asparagus Asparagus active-Y homomorphic yes officinalis Arecales Arecaceae Calamus Rattan unknown No evidence yes simplicifolius Phoenix Date Palm active-Y homomorphic yes dactylifera arachnifera Texas unknown no evidence yes Bluegrass Cucurbitales Cucurbitaceae Coccinia grandis Ivy Gourd active-Y heteromorphic yes Momordica dioica Spiny Gourd active-Y unknown yes Trichosanthes Pointed Gourd active-Y heteromorphic yes dioica Malphighiales Salicaceae Salix spp. Willow active-W unknown yes or active-Y Populus spp. Poplar active-W homomorphic yes or active-Y Rosales Cannabaceae Cannabis Sativa Hemp active-Y heteromorphic yes Humulus lupulus Hop X:A heteromorphic yes H. japonicus Hop X:A heteromorphic yes Sapindales Anacardiaceae Pistacia vera Pistachio active-W homomorphic yes Brassicales Caricaceae Carica papaya Papaya active-Y homomorphic yes Vasconcellea spp Mountain active-Y homomorphic no Papaya Caryophyllales Polygonaceae Rumex spp. Sorrel X:A heteromorphic yes active-Y Simmondsiaceae Simmondsia Jojoba active-Y homomorphic yes chinensis Amaranthaceae Spinacia oleracea Spinach active-Y homomorphic no Actinidia deliciosa Kiwifruit active-Y homomorphic yes A. chinensis Golden active-Y homomorphic yes Kiwifruit Ebenaceae Diospyros lotus Persimmon active-Y homomorphic yes

21

In many cases, the undesired sex must be culled when it becomes known, with some being retained as non-productive pollinizers. This can take many years in perennial plants, amplifying the amount of resources needed to produce a primarily female crop. Papaya produces both dioecious and gynodioecious varieties in which females and hermaphrodites are the productive sexes and hermaphrodites have preferable fruit morphology (Fitch et al, 2005).

Similarly to asparagus, papaya sex is monogenically determined; females are homozygous (mm) and both males and hermaphrodites are heterozygous (M/m and Mh/m, respectively), while

M/M, M/Mh, and Mh/Mh genotypes are lethal (Na et al, 2012). The crop is propagated both clonally and by seed (Fitch et al, 2005; Koehler et al, 2013; Saxena et al, 2016), with the latter method being most common due to its affordability (Koehler et al, 2013; Saxena et al, 2016). It is therefore necessary for growers to uproot females and most males from the field once the sexes are evident, which takes five to eight months (Parasnis et al, 2000; Koehler et al, 2013), wasting grower resources and preventing strategic planting. Several PCR sex-markers have been developed and are in use today for both breeding and commercial production (Grewal and Goyat,

2015).

The ideal solution to dioecy is the development of productive hermaphroditic cultivars, such as in grape, whose wild relatives are dioecious (Charlesworth, 2015), However, the complicated nature of sex mechanisms in plants makes this a difficult prospect. Hermaphrodites resulting from recombination in dioecious species can be clonally propagated, allowing the establishment of fully-productive orchards. However, breeding hermaphrodites is confounded by sex segregation in progeny; if the two linked genes mechanism of SD applies, most progeny resulting from crosses with recombined hermaphrodites will not have hermaphroditic characteristics. The degeneration of dioecy into monoecy or hermaphroditism has been

22 hypothesized in several genera, such as in Carica (VanBuren et al, 2015) and Vitis

(Charlesworth, 2015), whose wild relatives are strictly dioecious but whose cultivated taxa are subdioecious or hermaphroditic. The lack of understanding of this mechanism currently prevents strategic breeding for hermaphroditism.

Sex determination in Actinidia

Actinidia is a genus of 76 species of woody, climbing vines in the family Actinidiaceae

(order Ericales). All known Actinidia species are functionally dioecious. Morphologically,

Actinidia spp. are androdioecious (Figure 3); male plants develop staminate flowers with rudimentary ovaries while female plants develop flowers that appear hermaphroditic but produce empty pollen grains (i.e. sterile pollen), making the species cryptically dioecious. A study of cryptic dioecy in A. polygama determined the retention of sterile male structures to play a role in pollinator attraction (Kawagoe and Suzuki, 2003).

Figure 3 Floral morphology of female (left) and male (right) kiwiberry.

23

It is unclear how dioecy evolved in Actinidia. Hermaphroditism, monoecy, gynodioecy, androdioecy, strict dioecy, and polygamodioecy have all been reported in the Ericales (Renner,

2014). However, species of the two other genera in Actinidiaceae are either cryptically dioecious

() or primarily hermaphroditic (), indicating evolution from hermaphroditism rather than monoecy. This is further supported by the rare occurrence of both sterile and hermaphroditic flowers in Actinidia spp., with the latter in the form of fruiting males with an average of half the number of ovules per carpel as females (McNeilage, 1991). The segregation pattern of sexes, approaching 1M:1F in open pollinated families and 3M:1F in the progeny of selfed fruiting males, indicates an XY sex system with males as the heterogametic sex (Testolin et al, 1995). Diploids (A. chinensis) are simply XY, while tetraploids (A. arguta) are XXXY and hexaploids (A. deliciosa) are XXXXXY (Testolin et al, 1995). A. deliciosa is hypothesized to be an allohexaploid resulting from the hybridization of A. chinensis and a tetraploid member of the genus (Ferguson and Huang, 2007). Both flower development and the rare presence of hermaphroditic and sterile flowers indicates that sex is controlled by two separate, albeit tightly linked, genes affecting ovary abortion (O) and pollen fertility (P), as outlined in Table 3. This hypothesized system of control has been further supported by the identification of two tightly linked sex loci in A. chinensis (Fraser et al, 2009). Some minor modifying genes may explain the occasional feminization of males, but overall sex is effectively monofactorial across all ploidy levels (Testolin et al, 1995).

24

Table 3 Genotypic and phenotypic descriptions of sex classes in diploid Actinidia spp.

Sex Male Female Hermaphrodite Sterile Genotype op/OP op/op op/Op op/Op (X/Y)

Ovary aborts Ovary develops Ovary develops Ovary aborts Phenotype Pollen fertile Pollen sterile Pollen fertile Pollen sterile

Several PCR-based sex-linked markers have been identified in A. chinensis and are posited to lie in a sub-telomeric region of a single chromosome (Fraser et al, 2009), supporting the likelihood of suppressed recombination between the sex-determining loci. Floral buds of A. chinensis were also screened for the differential expression of genes between the sexes. Of the

15 genes identified, 11 were of known function, including the coding of an enzyme involved in pollen development that was more highly expressed in male floral buds (Kim et al, 2010).

However, none of the markers identified in A. chinensis or A. deliciosa to date amplify in other

Actinidia species. Although the controlling genes of sex could very well be the same in the fuzzy and kiwiberries, until they are known it is likely that species-specific sex-linked markers will be needed. Therefore, in order to overcome the obstacles present in breeding the dioecious kiwiberries, sex markers currently must be independently identified.

25

CHAPTER III. DEVELOPMENT OF PCR MARKERS FOR SEX IDENTIFICATION IN

KIWIBERRY

Introduction

Although the genus Actinidia consists of about 76 species (Ferguson and Huang, 2007) native to a wide range of habitats across eastern , nearly all of the more than $500M (USD) global kiwifruit trade (FAO, 2013) is based on cultivars of two closely related and recently domesticated temperate species: A. deliciosa and, increasingly, A. chinensis (Ferguson, 2013).

While cold-hardier members of the genus, namely A. arguta and A. kolomikta, have long been recognized for their economic potential, particularly at higher elevations and in more northern latitudes (Goodale, 1981), these species (collectively referred to as kiwiberries) have received relatively little attention by formal breeding programs in the United States since their initial introduction into the country as ornamentals in the 1870's (Ferguson and Seal, 2008). On one hand, this lack of investment is surprising, given the proven commercial viability of extant selections (Ferguson and Huang, 2007) and the numerous desirable traits of their small, grape- sized fruits, including impressive nutritional profiles (Nishiyama et al., 2004; Crowhurst et al.,

2008; Huang et al., 2013), exceptional flavor (Ferguson, 2013), and hairless, edible skins. On the other hand, a kiwiberry breeding program is a resource-intensive proposition, particularly for a novel crop with no current commercial base. Not only do the vigorous plants require substantial physical support (e.g. trellising), intensive vine management (training and multiple prunings per year), and significant physical space (>150 ft2 per plant under commercial production systems); they also require up to five years before reaching reproductive maturity after planting (Ferguson and Bollard, 1990).

26

Complicating matters further is the fact that, although most angiosperms are hermaphroditic, all Actinidia species are functionally dioecious (Seal & McNeilage, 1988). This trait presents a fundamental inefficiency to breeding programs, particularly in light of the moderate male bias observed in some randomly segregating populations (D. Jackson, pers. comm.; R. Guthrie, pers. comm.; this study). With neither sexually dimorphic traits at the seedling stage nor cytologically-evident sex chromosomes within Actinidia spp. (Ming et al,

2011), the sex of any given kiwiberry plant is discernible only once it reaches reproductive maturity. Sex-linked molecular markers have been developed successfully in other dioecious genera, e.g. Asparagus (Reamon-Buettner & Jung, 2000), Carica (Deputy et al., 2002), and

Cannabis (Mandolino et al., 1999), and enhance breeding efficiency by allowing an increase in effective population size, given resource constraints. Similarly in A. chinensis, a sub-telomeric sex-associated locus was discovered, with tightly linked male (SMY) and female (SMX) associated loci (Fraser et al., 2009), though the causative genes were not identified and the markers do not show sex-linkage within either kiwiberry species A. arguta or A. kolomikta.

While the specific mechanisms underlying sex determination in kiwiberry are unknown, it has long been observed that male plants within the genus possess staminate flowers with rudimentary ovaries, while female plants appear hermaphroditic but are male-sterile (Ferguson,

2013). Due to their evident disomic segregation across various ploidy levels (Testolin et al.,

1999), these traits are thought to be controlled by two tightly linked genes controlling ovary abortion (O) and pollen fertility (P), with op/OP individuals developing as male and op/op individuals as female (Testolin et al,, 1999). This hypothesized genetic model finds support in the rare occurrences of both sterile (op/Op) and hermaphroditic (op/oP) individuals (Ferguson,

27

2013) , as well as the observed linkage between the sex-associated SMX and SMY regions identified in A. chinensis (Fraser et al., 2009).

As no marker-assisted protocols are currently available to support public breeding efforts in the relatively understudied kiwiberry species A. arguta and A. kolomikta, this study was undertaken to identify sex-linked loci in both species and convert them into cost-effective PCR- based assays. In light of the recent ban on importing additional Actinidia germplasm into the US due to concerns over pv actinidiae (USDA, 2010), the target germplasm for this study consists of all available accessions in the nursery trade and the USDA's National

Plant Germplasm System (NPGS), hereafter referred to as the US kiwiberry collection.

Materials and Methods

Plant material

The plant materials used for the initial genotyping-by-sequencing (GBS) association analysis consisted of 60 accessions of A. arguta and 21 accessions of A. kolomikta of known gender from the US kiwiberry collection (see Supplementary Material A). Specifically, the A. arguta population has 41 females, 18 males and one accession genotype with unknown gender and out of 21 A. kolomikta accessions, 16 are female and 5 are males. Independent segregating populations of both species were used for subsequent marker validation: two populations of A. arguta and A. kolomikta from the University of Minnesota Landscape Arboretum Horticultural

Research Center consisted of 7/10 and 11/10 males/females, respectively, and are hereby collectively referred to as the Minnesota Populations. An A. arguta population of 10 males and

11 females from Kiwi Korners Farm is hereby referred to as the Pennsylvania Population. The

28 female parents (A. arguta, Ogden Point; A. kolomikta, Nahodka) of the Minnesota populations were included in the analysis.

DNA isolation, library preparation, and sequencing

Genomic DNA was extracted using a modified CTAB protocol (Murray and Thompson,

1980). Specifically, ~100 mg of fresh, healthy leaf tissue was lyophilized and ground to a powder in a Retsch mill with three 3 mm tungsten carbide beads as the grinding agent. The powder was incubated in 750 ul of 65ºC DNA extraction buffer (CTAB, 1% polyvinylpyrritate,

2% 2B-mercaptoethanol) for 45 min, with occasional mixing. 750 ul of chloroform:isoamyl alcohol (24:1 v/v) were added and mixed by gentle inversion. The mixture was centrifuged for

10 min at 10,000 rpm and the aqueous phase recovered. 10% volume of 65ºC CTAB/NaCl solution was added, followed by a second chloroform:isoamyl step. Nucleic acids were precipitated with an equal volume of 95% isopropanol and centrifuged for 10 min at 13,000 rpm.

The recovered pellet was washed twice with 700 ul of 70% ethanol, air dried, and resuspended in

50 ul of 10 mM Tris. Prior to library preparation, the isolated gDNA was purified using Zymo

DCC-10 columns (Zymo #D4011).

A multiplexed GBS library was prepared according to the two enzyme (PstI-MspI) protocol described by Poland et al. (2012). Using 6-10 bp barcodes from that protocol, the 60 accessions from A. arguta and the 21 accessions from A. kolomikta were multiplexed within a single library and sequenced using the Illumina 2500 HiSeq machine at the Hubbard Center for

Genome Studies, University of New Hampshire (http://http://hcgs.unh.edu/). FASTQ files of the

150 bp paired-end sequence data were generated using CASAVA 1.8.3, and all parsed, high- quality, PE reads are available in the NCBI Sequence Read Archive (Table S1). Detailed

29 information about the 81 accessions sampled from both species such genotype name, NPGS repository IDs, reported and observed gender information, the number of high quality paired-end reads, the number of SNPs called for each genotype and the SRA NCBI submission ID can be found on Table S1.

Genotyping

The CASAVA-processed raw sequence data from both species were submitted to version

2.0 (SNP + indel functionality) of the GBS-SNP-Calling Reference Optional Pipeline (GBS-

SNP-CROP; Melo et al., 2016) for analysis. The pipeline calls Trimmomatic (Bolger et al.,

2014) to trim reads based on a sequence of three contiguous bases with an average Phred score Q

≤ 30 and to remove reads shorter than 36 bp. High-quality reads are then demultiplexed into read pairs, by genotype. The most read-abundant female genotype from each species (cv. 'ORUS 2-

16' for A. arguta and cv. 'Matonaya' for A. kolomikta) was then chosen to build a GBS-specific, reduced-representation reference (Mock Reference) to enable GBS read mapping and facilitate

SNP and indel discovery. After alignment to the mock reference using BWA-men (Li and

Durbin, 2009), only reads that mapped in proper pairs with no supplementary or duplicate alignments were retained by SAMTools (Li et al., 2009) for further SNP and indel calling. As the two species evaluated have different ploidy levels, we followed the recommendations of the

GBS-SNP-CROP parameters in step 7 to call SNPs in diploid (A. kolomikta) and in tetraploid (A. arguta) species.

By performing a search of both SNPs and indels showing homozygosityin females and heterozygosity in males, we identified candidate markers associated with gender. Markers homozygous in all female accessions and heterozygous across >95% of the male accessions were

30 selected as candidate markers. For those markers, the FASTA entry (Cluster/Centroid) from the mock reference was examined to understand the sequence context of the putative male/female polymorphism and guide PCR-based primer design.

PCR marker development and validation

Primers for polymorphic amplicons putatively associated with sex were designed using

Primer3 (Rozen & Skaletsky, 2000) and initially tested on ten accessions of known gender (see

Table S1). PCRs were conducted with a total reaction volume of 20 ul (0.25 mM of each primer,

100 µM of each dNTP, 0.75 U Taq DNA Polymerase, 10x standard Taq buffer [NEB N0273], and 50-100 ng of template DNA). Cycling conditions consisted of 5 minutes at 94°C, 32 cycles of 30 s at 94°C, 30 s at the primer annealing temperature, and 15 s at 68°C, followed by a final 5 minute elongation step. Amplified products were separated on a 2% TBE/ethidium bromide agarose gel for 60 min at 75 V and imaged with UV transillumination. Following successful amplification on the initial set of 10 accessions, promising markers were used to screen the entire

US kiwiberry collection, as well as the independent segregating populations. Subsequent to marker validation, a high-throughput DNA extraction method described by Slotta et al., 2008 was adapted to screen large populations of A. arguta described here. Specifically, thirteen independently segregating A. arguta populations were scored for gender.

Results

Bioinformatics analysis and primer design

Using only the single most read-abundant genotype to build the species-specific mock reference, the GBS-SNP-CROP pipeline called a total of 32,222 (mean depth D = 60.24) and

31

12,486 (D = 57.06) variants for A. arguta (60 accessions) and A. kolomikta (21 accessions), respectively. Out of the total variants called for A. arguta, 30,468 were SNPs and 1,754 were indels, while 12,067 SNPs and 419 indels were called for A. kolomikta. These sets of variants were called using 286.3 and 96.9 million PE high quality reads for A. arguta and A. kolomikta, respectively. These two species exhibited similar values of overall loci heterozygosity, homozygosity, and missing data, with the intraspecific averages being 27.6% (Hetero), 60.4%

(Homo), and 11.8% (NA) for A. arguta and 35.2% (Hetero), 55.2% (Homo), and 9.3% (NA) for

A. kolomikta (Table 1).

32

Table 4 Summary of bioinformatics analysis. The analyzed genotypes are recorded in Table S1.

Total Polymorphic Sex-linked Species Genotypesa PE readsb Clustersc Clustersd Clusterse

A. arguta 60 286,335,086 173,164 11,164 3

A. kolomikta 21 96,975,354 115,680 6,719 5

Sex- Var. Var. Hetero Homo linked Depthi NA (%)l typef calledg (%)j (%)k Var.h SNPs 30,468 2 59.99 27.82 60.17 11.99 Indels 1,754 1 64.58 24.42 65.15 10.41 A. arguta Total 32,222 3 ------Mean -- -- 60.24 27.63 60.44 11.93 SNPs 12,067 8 56.52 35.16 55.44 9.38 Indels 419 2 66.43 38.82 52.00 9.17 A. kolomikta Total 12,486 10 ------Mean -- -- 57.06 35.26 55.37 9.35 a The number of accessions sampled from each species. b The total number of Paired-End reads, considering all genotypes sampled. c Total number of non-redundant consensus sequences (clusters) identified via clustering to represent the GBS fragment space used to build the reference sequence, i.e., the number of FASTA entries in the mock reference genome. d The total number of polymorphic clusters. e The number of clusters associated with gender segregation. f The type of variants called. g The total number of variants called by the GBS-SNP-CROP pipeline. h Sex associated variants. i Average read depth for all variants called across the entire population. j Percentage of heterozygous genotype calls. k Percentage of homozygous genotype calls. l Percentage of missing cells (i.e. no genotype call for a given SNP*accession combination) in the final SNP genotype matrix.

33

Association screens yielded three candidate sex-linked clusters in A. arguta and five in A. kolomikta (hereafter noted with the prefix aC and kC, respectively) (Tables 4 and 5), several of which contained multiple polymorphisms. Sex-linked sequences were prioritized for marker development based on practicality in the lab; efficiency, reproducibility, and ease of scoring were all considered. Specifically, allele-specific primers were preferred over CAPS markers in order to avoid restriction digestion, an extra step liable to failure; length polymorphisms were preferred over SNPs due to co-dominant expression while maintaining a single signal generation step (PCR); and the molecular weight of amplicons was selected with regard to gel resolution.

Three sex-linked polymorphic sequences could not be immediately developed into PCR markers due to polymorphism locations that restricted primer design, although the variant positions of most of the sex-linked clusters are greater than 20 bp from either 3’ or 5’ boundary of the reference Clusters (Table 5).

Table 5 Details of putatively sex-linked clusters. Numbers in brackets represent the numbers of variants identified within the cluster. An asterisk (*) indicates the polymorphisms around which primer pairs were designed.

Species Cluster Length Variant Variant Female Male Female Male IDa (bp) Type Position Homob Heteroc Genotype Genotype 70626 128 SNP 115 40 15 C/C C/T 36306 157 SNP* 64 40 13 T/T T/C A. arguta A/+10 123485 131 Indel* 94 36 12 A/A GGTTTACCTT 693 43 SNP 32 16 5 T/T T/G SNP* 42 16 5 T/T T/C 12625 (2) 141 SNP* 43 16 5 G/G G/T SNP 6 16 4 G/G G/A A. 1273 (3) 203 SNP 17 16 4 T/T T/C kolomikta Indel 189 16 4 T/T T/-2TG SNP* 63 16 4 C/C C/T 48400 (2) 221 SNP* 153 16 4 C/C C/T SNP 57 16 3 A/A A/T 72369 (2) 270 Indel* 154 16 3 G/G G/+3ACC a The FASTA entry (Cluster) # from the sequence in which polymorphisms were identified b Number of females scored as homozygotes for the locus c Number of males scored as heterozygotes for the locus

34

Primer pairs were designed for two polymorphic sequences within A. arguta and three within A. kolomikta. Where necessary, primer pairs were designed to amplify male-specific sequences in order to avoid introducing a second, costly restriction digestion step. In instances of a SNP, a primer was designed such that the final nucleotide on the 3’ end matched only the male variant and the antepenultimate nucleotide mismatched in both male and female variants, severely inhibiting nucleotide extension on female variants. Two usable variants were indels; in cluster aC123485, a co-dominant primer pair was designed around a 10bp Indel, yielding two amplicons in males and one in females, whereas in the cluster kC72369, the 3’ end of a primer was aligned to the 3 bp male insertion to create a male-dominant marker.

Marker Validation and Loci Characteristics

Primer pairs were tested on a set of five male and five female accessions (indicated in

Table S1) of the corresponding species in order to validate the association prior to confirmation across the larger population. Following this analysis, primer pairs designed within clusters aC36306, aC123485, kC12625 and kC72369 were tested within the remaining US kiwiberry accessions of known gender (Table S1). The successful A. arguta pairs (aC36306 and aC123485) were further tested in two independent half-sib populations, while the successful A. kolomikta pairs (kC12625 and kC72369) were tested in a single independent half-sib population (Table 6).

Ultimately, a single primer pair in each species displayed perfect sex linkage across all lines: aC36306 and kC72369.

35

Table 6: Summary of PCR scores for lines of known gender. A subscripted R indicates recombinants (i.e. FR is a phenotypic female that amplifies a male band). An X indicates an unscorable line.

Marker US Kiwiberry Pennsylvania Species Minnesota Population %a ID Collection Population M F MR FR X M F MR FR X M F MR FR X aC36306 18 41 0 0 0 7 10 0 0 0 10 11 0 0 0 100 A. arguta aC123485 16 41 2 0 0 5 8 2 0 2 9 10 0 0 2 92 A. kC12625 ------5 5 3 6 2 ------59 kolomikta kC72369 5 16 0 0 0 11 10 0 0 0 ------100

a Percentage of successful scores

The primer pair for aC36306 (Table 7) amplifies a 106 bp band in A. arguta males with

none to various alternate amplicons in females. When the aC36306 primers are multiplexed with

a second universal primer nested between them, males are indicated by both an 81 bp and a 106

bp band, while females have a single 81 bp band (Figure 4a). The primer pair for kC72369

(Table 7) amplifies a 161 bp band in A. kolomikta males and a series of alternative loci in

females, with the brightest female band typically of a slightly higher molecular weight than the

male band (Figure 4b). Given the internal control included in the A. arguta marker and the

consistent non-target amplification in A. kolomikta females, both markers are effectively co-

dominant, reducing inaccurate scoring due to DNA extraction or PCR failures. Neither marker

exhibits sex-linkage in the other species; the male locus amplifies in both genders. The A. arguta

marker was used to screen thirteen F1 seedling populations (Table 8). Some crosses showed the

previously observed male-bias, although the progeny sex ratio overall approached 1:1.

36

Table 7 Primer and cluster sequences for the two most successful markers.

Species Primer Primer Sequencea Annealing Band Marker Type Tmb Size aC36306_M_Fd3 TTTGGCAAAAACCACATGAC Male- 106 bp A. arguta aC36306_R1 AAAAGGGGGTTCGAACTTTG 52°C Dominant aC36306_cF2 TGTTGGATGCCATTAAGTCG 81 bp Control

Cluster 36306 Paired Male and Female Sequences c

♂ 1 GTTAGCCAAAAAAATGGTAATATGTTACATTGGTACAATATCATTTTGGC ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 51 AAAAACCACATCACCTTTCTGTTGGATGCCATTAAGTCGAAGCCGTCCAT ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙T∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 101 TGGTTGAAACAAGTATTCCCACCCTTGGATCAAAGTTCGAACCCCCTTTT ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 151 AGGTCCG ♀ ∙∙∙∙∙∙∙

kC72369_MiF1 TAGCCAAGGTAATGCCTCCA Male- A. kolomikta 52°C 161 bp kC72369_MiR1 CAGGAATATGTTCAAGAGTTGGT Dominant

Cluster 72369 Paired Male and Female Sequences c

♂ 1 GCTTATGTTACCAAGCTAGCCAAGGTAATGCCTCCACCAACATTTTCTTT ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 51 ATATTTTCACGTCTTAGTTTGATTATTTTTGTACAGAGACTTGACTCCGA ♀ ∙∙∙∙∙∙A∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 101 CCTTTAATTATTTTATTGGATATTCAAAAGTGTCCTCTGCATTTTCAGCT ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 151 GTAGACCAACTCTTGAACATATTCCTGGTCAAAATTATTCAATTTCATAT ♀ ∙∙∙∙---∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 201 GCAATTCAATAGAAAATTACAAAGTTTTTAACTCAATGAGAGAGATTGTG ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙ ♂ 251 TAGATTACTTTTTATTCCTGTCT ♀ ∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙

aUnderlined nucleotides are male specific. bOptimal annealing temperatures were determined through gradient trials. cThe underlined sequences indicate primer positions.

37

Figure 4 Agarose gel electrophoresis of PCR products of a. A. arguta marker aC36306 with an 81 bp internal control and b. A. kolomikta marker kC72369. M = Male; F = Female, Lad = 50 bp Gene Ruler; and NTC = non-template control.

38

Table 8 Sex-marker scores in thirteen A. arguta F1 populations. X indicates unscorable lines due to extraction or PCR failure or ambiguity.

POPULATION MALES FEMALES X %MALE D. OAKS BG × RUSSIA #114 19 11 0 63.3 UCD 3-1 × CACT 160 #111 47 17 0 73.4 D. OAKS BG × CB MTN. 3 5 5 0 50.0 TATYANA × 74-32 23 16 9 59.0 OGDEN PT × CACT 105 #107 74 54 3 57.8 OGDEN PT × C-5 88 74 15 54.3 UCD 3-1 × C-5 19 22 5 46.3 D. OAKS BG × C-5 56 65 15 46.3 CB MTN. 4 × 74-32 61 53 34 53.5 D. OAKS BG × CACT 105 #107 68 60 14 53.1 DACT 218 × CACT 105 #107 20 15 10 57.1 OGDEN PT × CB MTN. 3 68 75 9 47.6 OGDEN PT × CACT 160 #111 65 51 0 56.0 TOTAL 613 518 114 54.2

When BLASTed against the A. chinensis draft genome (Huang et al., 2013), the cluster sequences of both markers align to regions in the “Unknown” pseudomolecule, with the amplified regions of aC36306 and kC72369 aligning at 97% and 92% identity, respectively.

However, neither align to a known coding region; and no significant similarities were found when BLASTed against the NCBI database.

Discussion

Nowadays, with the advent of improved technologies for genome sequencing, the challenge is the use of bioinformatic algorithms to scan the whole genome sequence to find polymorphic regions and then to design a pair of primers for these regions. Jiao et al. (2012), for example, developed 158 polymorphic SSR markers for Chinese bayberry (Myrica rubra) by scanning 323Mb of genome shotgun sequence. Here, the employment of the GBS-SNP-CROP

39 pipeline was effective in calling reliable SNPs and Indels from GBS fragments that can be screened for sex pattern inheritance. The use of a genotyping matrix from a genome complexity reduction system like GBS to identify sex-linked markers is reported here for the first time, and we have demonstrated the effectiveness of the GBS-SNP-CROP pipeline in the facilitation of reference-independent, high-throughput marker development in minor crops.

The markers discovered, developed, and validated in this study are in line with the Y- active sex inheritance system in Actinidia, regardless of ploidy, already observed in A. chinensis and A. deliciosa (Testolin et al., 1995, 1999; Gill et al., 1998; Fraser et al., 2009). The tightness of linkage of these markers indicates that the majority of progeny can be confidently scored.

Recombination between the ovary abortion and pollen fertility genes, identified by sterility and hermaphroditism, will result in erroneous gender scores. These conditions have been observed in populations at rates of 0-4% (Testolin et al., 1999) and will score as either male or female, depending on whether the marker is linked to the O or P gene of the male-linked locus. This could be determined through testing these markers on hermaphroditic and sterile lines, but was not explored here due to a lack of such individuals. Nonetheless, the effects of occasional erroneous scores are largely outweighed by the benefits of marker-assisted selection in kiwiberry.

The clarity of markers depends not only on the inherent properties of the sequences involved, but also on the DNA extraction method and PCR cycling conditions. Although the extraction method used was primarily sufficient for downstream applications, PCR inhibitor concentrations varied batch-to-batch. While this was typically remedied through the reduction of

DNA used in the PCR reaction and an increase in cycle number, a large-scale screening process

40 benefits from accuracy after a single test. Young leaf tissue is especially important in the methods presented here, as well as speed and care during the extraction process.

We also have demonstrated here the viability of the GBS-SNP-CROP pipeline (Melo et al

2016a) as a tool for identifying polymorphic sites suitable for association analysis and subsequent PCR marker development specifically in a minor crop with no reference and otherwise limited genomic resources. In recent screens, these markers have shown to be a useful tool in increasing the commercial population size, i.e. allow a high proportion of female plants

(fruit productive vines) compared with a few pollen donor males, and ultimately reducing the monetary, temporal, and spatial costs spent on ultimately undesirable plants.

41

CONCLUSION AND FUTURE DIRECTIONS

While sex-markers for two kiwiberry species are very beneficial to breeding efforts, there are several steps that can be taken to improve their efficient use. These steps are in regard to the preparatory procedures, the markers themselves, and the future genetic resources for a kiwiberry breeding program. Also, finding connections between other Actinidia species could expose new possibilities in resource sharing.

Regardless of a marker’s accuracy, there are many instances that may prevent correct genotypic characterization. This includes reaction failures or cross-contamination during tissue collection or during the DNA extraction process, made more likely during high-throughput screens. Although some of this is dependent on the performance of the handler, there are some complications that can be inherent in a wet-lab procedure. High-throughput gDNA extraction and screening protocols are susceptible to cross-contamination at multiple stages, and therefore require operating procedures that ensure adequate sample handling.

Although the A. kolomikta marker described amplifies alternate loci without the presence of a male allele, both it and the A. arguta marker described are male-specific and do not indicate heterozygosity. Although the alternate amplification in kC12625 and the internal control added to aC36306 help avoid mis-scoring due to extraction or PCR failures, it would be beneficial to identify a co-dominant marker for both species. Unfortunately, the length polymorphisms identified were not as tightly linked as the SNP markers ultimately used. One possible solution might be to multiplex the male-specific primer with a female-specific primer containing excess, non-homologous 5’ nucleotides in order to introduce a pseudo length polymorphism. In this case, females would appear homozygous at the sex marker locus and males would appear heterozygous, making the marker effectively co-dominant. Also, a female-specific primer would

42 out-compete the male-specific primer at female loci, preventing mis-priming. Nonetheless, the single-primer pair male-specific markers described may ultimately be more amenable to multiplexing with markers for other traits by reducing electrophoretic co-segregation and primer- dimer formation during PCR.

Future studies aiming to identify markers for other traits of interest in both species are important next steps in kiwiberry breeding. Genes associated with fruit color, development and ripening, nutrient metabolism, disease resistance, and quinic acid content, and allergens have all been explored in Actinidia, providing ample groundwork for marker development of these agronomic traits. The creation of a genetic map in both kiwiberries is a possible next step. Although the scarcity of kiwiberry hermaphrodites makes this process more difficult, Fraser et al (2009) demonstrated the viability of separate male and female linkage maps in the genus. Genetic maps could lead to QTL analyses and genomic comparisons between the kiwiberries and with A. chinensis, for which genetic maps and a draft genome already exist. This could unlock fundamental information regarding the sex loci of the genus as well as the loci of economically important traits.

43

LIST OF REFERENCES

Ahmad, M. (2000). Molecular marker-assisted selection of HMW glutenin alleles related to wheat bread quality by PCR-generated DNA markers. Theoretical and Applied Genetics, 101(5-6), 892–896. http://doi.org/10.1007/s001220051558

Ahmad, R., Potter, D., & Southwick, S. M. (2004). Genotyping of peach and nectarine cultivars with SSR and SRAP molecular markers. Journal of the American Society for Horticultural Science, 129(2), 204–210.

Ainsworth, C. (2000). Boys and girls come out to play: the molecular biology of dioecious plants. Annals of Botany, 86, 211–221. http://doi.org/10.1006/anbo.2000.1201

Arango Isaza, R. E., Diaz-Trujillo, C., Dhillon, B., Aerts, A., Carlier, J., Crane, C. F., … Kema, G. H. J. (2016). Combating a global threat to a clonal crop: banana black sigatoka pathogen Pseudocercospora fijiensis (synonym Mycosphaerella fijiensis) genomes reveal clues for disease control. PLoS Genetics, 12(8). http://doi.org/10.1371/journal.pgen.1005876

Barloy, D., Lemoine, J., Abelard, P., Tanguy, A. M., Rivoal, R., & Jahier, J. (2007). Marker- assisted pyramiding of two cereal cyst nematode resistance genes from Aegilops variabilis in wheat. Molecular Breeding, 20(1), 31–40. http://doi.org/10.1007/s11032-006-9070-x

Bassil, N. V, Davis, T. M., Zhang, H., Ficklin, S., Mittmann, M., Webster, T., … Tang, H. (2015). Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo- octoploid cultivated strawberry Fragaria × ananassa. BMC Genomics, 16(1), 155. http://doi.org/10.1186/s12864-015-1310-1

Bracale, M., Caporali, E., Galli, M. G., Longo, C., Marziani-Longo, G., Rossi, G., … Tassi, F. (1991). Sex determination and differentiation in Asparagus officinalis L. Plant Science, 80(1-2), 67–77. http://doi.org/10.1016/0168-9452(91)90273-B

Charlesworth, D. (2002). Plant sex determination and sex chromosomes. Heredity, 88(2), 94– 101. http://doi.org/10.1038/sj/hdy/6800016

Charlesworth, D. (2015). Plant contributions to our understanding of sex chromosome evolution. New Phytologist, 208(1), 52–65. http://doi.org/10.1111/nph.13497

Charlesworth, D., & Charlesworth, B. (1978). Population genetics of partial male-sterility and the evolution of monoecy and dioecy. Heredity, 41(2), 137–153. http://doi.org/10.1038/hdy.1978.83

Chen, Y., Wang, T., Fang, L., Li, X., & Yin, T. (2016). Confirmation of Single-Locus Sex Determination and Female Heterogamety in Willow Based on Linkage Analysis. PloS One, 11(2), e0147671. http://doi.org/10.1371/journal.pone.0147671

44

Cipriani, G., Dibella, R., & Testolin, R. (1996). Screening RAPD primers for molecular and cultivar fingerprinting in the genus Actinidia. Euphytica, 90, 169–174. http://doi.org/10.1007/BF00023855

Clarke, W. E., Higgins, E. E., Plieske, J., Wieseke, R., Sidebottom, C., Khedikar, Y., … Parkin, I. A. P. (2016). A high-density SNP genotyping array for Brassica napus and its ancestral diploid species based on optimised selection of single-locus markers in the allotetraploid genome. Theoretical and Applied Genetics, 129(10), 1887–1899. http://doi.org/10.1007/s00122-016-2746-7

Cronn, R., Knaus, B. J., Liston, A., Maughan, P. J., Parks, M., Syring, J. V, & Udall, J. (2012). Targeted enrichment strategies for next-generation plant biology. American Journal of Botany, 99(2), 291–311. http://doi.org/10.3732/ajb.1100356

Crowhurst, R. N., Gleave, A. P., MacRae, E. A, Ampomah-Dwamena, C., Atkinson, R. G., Beuning, L. L., … Laing, W. A. (2008). Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening. BMC Genomics, 9(July), 351. http://doi.org/10.1186/1471-2164-9-351

Danilova, T. V., & Karlov, G. I. (2006). Application of inter simple sequence repeat (ISSR) polymorphism for detection of sex-specific molecular markers in hop (Humulus lupulus L.). Euphytica, 151(1), 15–21. http://doi.org/10.1007/s10681-005-9020-4

Davey, J. W., & Blaxter, M. W. (2010). RADSeq: next-generation population genetics. Briefings in Functional Genomics, 9(5-6), 416–23. http://doi.org/10.1093/bfgp/elq031

Delph, L. F., & Montgomery, B. R. (2014). The evolutionary dynamics of gynodioecy in Lobelia. International Journal of Plant Sciences, 175(4), 383–391. http://doi.org/10.1086/675572

Deputy, J. C., Ming, R., Ma, H., Liu, Z., Fitch, M. M. M., Wang, M., … Stiles, J. I. (2002). Molecular markers for sex determination in papaya (Carica papaya L.). Theoretical and Applied Genetics, 106, 107–111. http://doi.org/10.1007/s00122-002-0995-0

Doust, J. L., & Doust, L. L. (1988). Plant Reproductive Ecology. New York: Oxford University Press.

Eibach, R., Zyprian, E., Welter, L., & Töpfer, R. (2007). The use of molekular markers for pyramiding resistance genes in grapevine breeding. Vitis, 46(2), 120–124.

FAOSTAT Statistics Database. (2013). Food and Agriculture Organization of the United Nations. Available: http://faostat3.fao.org.

Ferguson, A. R. (2013). Kiwifruit: the wild and the cultivated plants. Advances in Food and Nutrition Research, 68, 15–32. http://doi.org/10.1016/B978-0-12-394294-4.00002-X

45

Ferguson, A. R., & Huang, H. (2007). Genetic resources of kiwifruit: domestication and breeding. Horticultural Reviews, 33, http://doi.org/10.1002/9780470168011.ch1

Fitch, M. M. M., Moore, P. H., Leong, T. C. W., Akashi, L. A. Y., Yeh, A. K. F., White, S. A., … Poland, L. J. (2005). Clonally propagated and seed-derived papaya orchards: I. Plant production and field growth. Horticultural Science, 40(5), 1283–1290.

Fraser, L. G., Harvey, C. F., Crowhurst, R. N., & De Silva, H. N. (2004). EST-derived microsatellites from Actinidia species and their potential for mapping. Theoretical and Applied Genetics, 108(6), 1010–1016. http://doi.org/10.1007/s00122-003-1517-4

Fraser, L. G., Tsang, G. K., Datson, P. M., De Silva, H. N., Harvey, C. F., Gill, G. P., … McNeilage, M. a. (2009). A gene-rich linkage map in the dioecious species (kiwifruit) reveals putative X/Y sex-determining chromosomes. BMC Genomics, 10, 102. http://doi.org/10.1186/1471-2164-10-102

Gebhardt, C. (2016). The historical role of species from the Solanaceae plant family in genetic research. Theoretical and Applied Genetics. http://doi.org/10.1007/s00122-016-2804-1

Gebler, P., Wolko, Ł., & Knaflewski, M. (2007). Identification of molecular markers for selection of supermale (YY) asparagus plants. Journal of applied genetics, 48(2), 129-131.

Gill, G. P., Harvey, C. F., Gardner, R. C., & Fraser, L. G. (1998). Development of sex-linked PCR markers for gender identification in Actinidia. Theoretical and Applied Genetics, 97(3), 439–445. http://doi.org/10.1007/s001220050914

Grewal, A., & Goyat, S. (2015). Marker assisted sex differentiation in dioecious plants. Journal of Pharmacy Research, 9(8), 531–549.

Guajardo, V., Solís, S., Sagredo, B., Gainza, F., Muñoz, C., Gasic, K., & Hinrichsen, P. (2015). Construction of high density sweet cherry (Prunus avium L.) linkage maps using microsatellite markers and SNPs detected by genotyping-by-sequencing (GBS). PLoS ONE, 10(5). http://doi.org/10.1371/journal.pone.0127750

Harvey, C. F., Gill, G. P., Fraser, L. G., & McNeilage, M. A. (1997). Sex determination in Actinidia. 1. Sex-linked markers and progeny sex ratio in diploid A. chinensis. Sexual Plant Reproduction, 10(3), 149–154. http://doi.org/10.1007/s004970050082

Hormaza, J. I., Dollo, L., & Polito, V. S. (1994). Identification of a RAPD marker linked to sex determination in Pistacia vera using bulked segregant analysis. Theoretical and Applied Genetics, 89(1), 9–13. http://doi.org/10.1007/BF00226975

Huamán, Z., Ortiz, R., Zhang, D., & Rodríguez, F. (2000). Isozyme analysis of entire and core collections of Solanum tuberosum subsp. andigena potato cultivars. Crop Science, 40(1), 273–276.

46

Huang, S., Ding, J., Deng, D., Tang, W., Sun, H., Liu, D., … Liu, Y. (2013). Draft genome of the kiwifruit Actinidia chinensis. Nature Communications, 4(May), 2640. http://doi.org/10.1038/ncomms3640

Huang, X., Wei, X., Sang, T., Zhao, Q., Feng, Q., Zhao, Y., … Han, B. (2010). Genome-wide association studies of 14 agronomic traits in rice landraces. Nature Genetics, 42(11), 961– 967. http://doi.org/10.1038/ng.695

Inoti, S. K., Thagana, W. M., & Dodson, R. (2015). Sex determination of young nursery Jojoba (Simmondsia chinensis L .) plants using morphological traits in semi-arid areas of Voi, Kenya. Journal of Biology, Agriculture and Healthcare, 5(16), 113–124.

Ishii, K., Nishiyama, R., Shibata, F., Kazama, Y., Abe, T., & Kawano, S. (2013). Rapid degeneration of noncoding DNA regions surrounding SlAP3X/Y after recombination suppression in the dioecious plant Silene latifolia. G3: Genes, Genomes, Genetics, 3, 2121– 2130. http://doi.org/10.1534/g3.113.008599

Iwata, H., Minamikawa, M. F., Kajiya-Kanegae, H., Ishimori, M., & Hayashi, T. (2016). Genomics-assisted breeding in fruit trees. Breeding Science, 66(1), 100–15. http://doi.org/10.1270/jsbbs.66.100

Jangra, S., Kharb, P., Mitra, C., & Uppal, S. (2014). Early diagnosis of sex in jojoba, Simmondsia chinensis (Link) schneider by sequence characterized amplified region marker. Proceedings of the National Academy of Sciences India Section B - Biological Sciences, 84(2), 251–255. http://doi.org/10.1007/s40011-013-0226-2

Jiang, G.-L. (2013). Molecular Markers and Marker-Assisted Breeding in Plants. In Plant Breeding from Laboratories to Fields. InTech. http://doi.org/10.5772/52583

Kafkas, S., Khodaeiaminjan, M., Güney, M., & Kafkas, E. (2015). Identification of sex-linked SNP markers using RAD sequencing suggests ZW/ZZ sex determination in Pistacia vera L. BMC Genomics, 16(1), 1–11. http://doi.org/10.1186/s12864-015-1326-6

Kanno, A., Kubota, S., & Ishino, K. (2014). Conversion of a male-specific RAPD marker into an STS marker in Asparagus officinalis L. Euphytica, 197(1), 39–46. http://doi.org/10.1007/s10681-013-1048-2

Kawagoe, T., & Suzuki, N. (2004). Cryptic dioecy in : a Test of the Pollinator Attraction Hypothesis. Canadian Journal of Botany, 82, 214–218. http://doi.org/10.1139/b03-150

Kim, H. B., Jun, S.-S., Choe, S., Cho, J. Y., Choi, S.-B., & Kim, S.-C. (2010). Identification of differentially expressed genes from male and female flowers of kiwifruit. African Journal of Biotechnology, 9(40), 6684–6694. http://doi.org/10.5897/AJB09.2016

47

Kleinhofs, A., Kilian, A., Saghai Maroof, M. A., Biyashev, R. M., Hayes, P., Chen, F. Q., … Steffenson, B. J. (1993). A molecular, isozyme and morphological map of the barley (Hordeum vulgare) genome. Theoretical and Applied Genetics, 86(6), 705–712. http://doi.org/10.1007/BF00222660

Koehler, A. D., Carvalho, C. R., & Abreu, I. S. (2013). Somatic embryogenesis from leaf explants of hermaphrodite Carica papaya : A new approach for clonal propagation. African Journal of Biotechnology, 12(18), 2386–2391. http://doi.org/10.5897/AJB12.1939

Kumar, P., Gupta, V. K., Misra, A. K., Modi, D. R., & Pandey, B. K. (2009). Potential of Molecular Markers in Plant Biotechnology. Plant Biotechnology, 2(4), 141–162.

Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics. 2009; 25:1754-1760.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer J, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25 (16): 2078-2079.

Li, M., Yang, H., Li, F. et al. (2010) A male-specific SCAR marker in Calamus simplicifolius, a dioecious rattan species endemic to . Molecular Breeding, 25, 549. doi:10.1007/s11032-009-9349-9

Liu, H., Niu, Y., Gonzalez-Portilla, P. J., Zhou, H., Wang, L., Zuo, T., … Pan, G. (2015). An ultra-high-density map as a community resource for discerning the genetic basis of quantitative traits in maize. BMC Genomics, 16(1), 1078. http://doi.org/10.1186/s12864- 015-2242-5

MaCarthur, J. W. (1926). Linkage studies with the tomato. Genetics, 11.

Mammadov, J., Aggarwal, R., Buyyarapu, R., & Kumpatla, S. (2012). SNP markers and their impact on plant breeding. International Journal of Plant Genomics. http://doi.org/10.1155/2012/728398

Mandolino, G., Carboni, A., Forapani, S., Faeti, V., & Ranalli, P. (1999). Identification of DNA markers linked to the male sex in dioecious hemp (Cannabis sativa L.). Theoretical and Applied Genetics, 98(1), 86–92. http://doi.org/10.1007/s001220051043

Mcneilage, M., & Considine, J. (1989). Chromosome studies in some Actinidia taxa and implications for breeding. Journal of Botany, 27(1), 71–81. http://doi.org/10.1080/0028825X.1989.10410145

Melo, A. T. O., Bartaula, R., Hale, I. (2016) GBS-SNP-CROP: a reference-optional pipeline for SNP discovery and plant germplasm characterization using variable length, paired-end genotyping-by- sequencing data. BMC Bioinformatics. doi: 10.1186/s12859-016-0879-y

48

Messina, R., Testolin, R., Morgante, M., Vegetale, P., Udine, U., & Udine, I.-. (1991). Isozymes for cultivar identification in kiwifruit. Horticultural Science, 26(7), 899–902.

Ming, R., Bendahmane, A., & Renner, S. S. (2011). Sex Chromosomes in Land Plants. Annual Review of Plant Biology, 62(1), 485–514. http://doi.org/10.1146/annurev-arplant-042110- 103914

Moczulski, M., & Salmanowicz, B. P. (2003). Multiplex PCR identification of wheat HMW glutenin subunit genes by allele-specific markers. Journal of Applied Genetics, 44(4), 459– 471.

Moon, D. M. (1976). Yield potential of Asparagus officinalis L. New Zealand journal of experimental agriculture, 4(4), 435-438.

Murray, M. G., & Thompson, W. F. (1980). Rapid isolation of high molecular weight plant DNA. Nucleic Acids Research, 8(19), 4321–4326. http://doi.org/10.1093/nar/8.19.4321

Na, J.-K., Wang, J., Murray, J. E., Gschwend, A. R., Zhang, W., Yu, Q., … Ming, R. (2012). Construction of physical maps for the sex-specific regions of papaya sex chromosomes. BMC Genomics, 13(1), 176. http://doi.org/10.1186/1471-2164-13-176

Newton, C. R., Graham, A., Heptinstall, L. E., Powell, S. J., Summers, C., Kalsheker, N., … Markham, A. F. (1989). Analysis of any point mutation in DNA. The amplification refractory mutation system (ARMS). Nucleic Acids Research, 17(7), 2503–2516.

Nishiyama, I., Yamashita, Y., Yamanaka, M., Shimohashi, A., Fukuda, T., & Oota, T. (2004). Varietal difference in vitamin C content in the fruit of kiwifruit and other Actinidia species. Journal of Agricultural and Food Chemistry, 52(17), 5472–5475. http://doi.org/10.1021/jf049398z

Palombi, M. a., & Damiano, C. (2002). Comparison between RAPD and SSR molecular markers in detecting genetic variation in kiwifruit (Actinidia deliciosa A. Chev). Plant Cell Reports, 20(11), 1061–1066. http://doi.org/10.1007/s00299-001-0430-z

Parasnis, A. S., Gupta, V. S., Tamhankar, S. A., & Ranjekar, P. K. (2000). A highly reliable sex diagnostic PCR assay for mass screening of papaya seedlings. Molecular Breeding, 6(3), 337–344. http://doi.org/10.1023/A:1009678807507

Payne, P. I. (1987). Genetics of Wheat Storage Proteins and the Effect of Allelic Variation on Bread-Making Quality. Annual Review of Plant Physiology, 38(1), 141–153. http://doi.org/10.1146/annurev.pp.38.060187.001041

Poland J, Brown PJ, Sorrells ME, Jannink JL (2012) Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. PLoS One. doi: 10.1371/journal.pone.0032253

49

Polley, A., Ganal, M. W., & Seigner, E. (1997). Identification of sex in hop ( Humulus lupulus ) using molecular markers. Genome, 40(3), 357–361. http://doi.org/10.1139/g97-048

Rauscher, G., & Simko, I. (2013). Development of genomic SSR markers for fingerprinting lettuce (Lactuca sativa L.) cultivars and mapping genes. BMC Plant Biology, 13(1), 11. http://doi.org/10.1186/1471-2229-13-11

Reamon-Buettner, S. M., & Jung, C. (2000). AFLP-derived STS markers for the identification of sex in Asparagus officinalis L. Theoretical and Applied Genetics, 100(3-4), 432–438. http://doi.org/10.1007/s001220050056

Renganayaki, K., Jessup, R. W., Burson, B. L., Hussey, M. a., & Read, J. C. (2005). Identification of male-specific AFLP markers in dioecious Texas bluegrass. Crop Science, 45(6), 2529–2539. http://doi.org/10.2135/cropsci2005.0144

Renner, S. S. (2014). The relative and absolute frequencies of angiosperm sexual systems: Dioecy, monoecy, gynodioecy, and an updated online database. American Journal of Botany, 101(10), 1588–1596. http://doi.org/10.3732/ajb.1400196

Renner, S. S., & Ricklefs, R. E. (1995). Dioecy and its correlates in the flowering plants. American Journal of Botany, 82(5), 596–606. http://doi.org/10.2307/2445418

Rozen, S., & Skaletsky, H.J. (2000). Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-386

Saghai Maroof, M. A., Jeong, S. C., Gunduz, I., Tucker, D. M., Buss, G. R., & Tolin, S. A. (2008). Pyramiding of soybean mosaic virus resistance genes by marker-assisted selection. Crop Science, 48(2), 517–526. http://doi.org/10.2135/cropsci2007.08.0479

Sax, K. (1923). The association of size differences with seed-coat pattern and pigmentation in Phaseolus vulgaris. Genetics, 8, 522–560.

Saxena, S., Singh, V. K., & Verma, S. (2016). PCR mediated detection of sex and PaLCuV infection in papaya - A review. Journal of Applied Horticulture, 18(1), 80–84.

Semagn, K., Bjornstad, A., & Ndjiondjop, M. N. (2006). An overview of molecular marker methods for plants. African Journal of Biotechnology, 5(25), 2540–2568. http://doi.org/10.5897/AJB2006.000-5110

Shirkot, P., Sharma, D. R., & Mohapatra, T. (2002). Molecular identification of sex in Actinidia deliciosa var. deliciosa by RAPD markers. Scientia Horticulturae, 94(1-2), 33–39. http://doi.org/10.1016/S0304-4238(01)00357-0

50

Shu, Y., Li, Y., Zhu, Z., Bai, X., Cai, H., Ji, W., … Zhu, Y. (2011). SNPs discovery and CAPS marker conversion in soybean. Molecular Biology Reports, 38(3), 1841–1846. http://doi.org/10.1007/s11033-010-0300-2

Slotta, T. A. B., Brady, L., & Chao, S. (2008). High throughput tissue preparation for large-scale genotyping experiments. Molecular Ecology Resources, 8(1), 83–87. http://doi.org/10.1111/j.1471-8286.2007.01907.x

Spigler, R. B., & Ashman, T. L. (2012). Gynodioecy to dioecy: Are we there yet? Annals of Botany, 109(3), 531–543. http://doi.org/10.1093/aob/mcr170

Stavrakakis, M., & Loukas, M. (1983). The between- and within-grape-cultivars genetic variation. Scientia Horticulturae, 19(3-4), 321–334. http://doi.org/10.1016/0304- 4238(83)90080-8

Sunnucks, P. (2000). Efficient genetic markers for population biology. Trends in Ecology and Evolution. http://doi.org/10.1016/S0169-5347(00)01825-5

Tennessen, J. A., Govindarajulu, R., Liston, A., & Ashman, T.-L. (2013). Targeted sequence capture provides insight into genome structure and genetics of male sterility in a gynodioecious diploid strawberry, Fragaria vesca ssp. bracteata (Rosaceae). G3 (Bethesda, Md.), 3(8), 1341–51. http://doi.org/10.1534/g3.113.006288

Testolin, R., Cipriani, G., & Costa, G. (1995). Sex segregation ratio and gender expression in the genus Actinidia. Sexual Plant Reproduction, 8, 129–132.

Testolin, R., Huang, W. G., Lain, O., Messina, R., Vecchione, A., & Cipriani, G. (2001). A kiwifruit (Actinidia spp.) linkage map based on microsatellites and integrated with AFLP markers. Theoretical and Applied Genetics, 103(1), 30–36. http://doi.org/10.1007/s00122- 001-0555-z

USDA APHIS PPQ. Federal Order for Pseudomonas syringae pv. actinidiae, bacterial canker of kiwifruit. 2010 Nov 10 [cited 16 November 2016]. In: USDA APHIS Federal Import Quarantine Order. Available: http://nationalplantboard.org/wp- content/uploads/docs/spro/spro_bck_2010_11_10.pdf.

Vanburen, R., Zeng, F., Chen, C., Zhang, J., Wai, C. M., Han, J., … Ming, R. (2015). Origin and domestication of papaya Y h chromosome. Genome Research, 524–533. http://doi.org/10.1101/gr.183905.114.9

Varshney, R. K., Thiel, T., Sretenovic-Rajicic, T., Baum, M., Valkoun, J., Guo, P., … Graner, A. (2008). Identification and validation of a core set of informative genic SSR and SNP markers for assaying functional diversity in barley. Molecular Breeding, 22(1), 1–13. http://doi.org/10.1007/s11032-007-9151-5

51

Wu, J., Li, L. T., Li, M., Khan, M. A., Li, X. G., Chen, H., … Zhang, S. L. (2014). High-density genetic linkage map construction and identification of fruit-related QTLs in pear using SNP and SSR markers. Journal of Experimental Botany, 65(20), 5771–5781. http://doi.org/10.1093/jxb/eru311

Xu, K., Deb, R., & Mackill, D. J. (2004). A Microsatellite Marker and a Codominant PCR-Based Marker for Marker-Assisted Selection of Submergence Tolerance in Rice. Crop Science, 44(1), 248–253. http://doi.org/10.2135/cropsci2004.0248

Yakubov, B., Barazani, O., & Golan-Goldhirsh, a. (2005). Combination of SCAR primers and Touchdown-PCR for sex identification in Pistacia vera L. Scientia Horticulturae, 103(4), 473–478. http://doi.org/10.1016/j.scienta.2004.06.008

Zargar, S. M., Raatz, B., Sonah, H., Muslimanazir, Bhat, J. a., Dar, Z. A., … Rakwal, R. (2015). Recent advances in molecular marker techniques: Insight into QTL mapping, GWAS and genomic selection in plants. Journal of Crop Science and Biotechnology, 18(5), 293–308. http://doi.org/10.1007/s12892-015-0037-5

52

SUPPLEMENTARY MATERIALS

Table S1 List of the 60 accessions of A. arguta and 21 accessions of A. kolomikta used for screening the gender-associated markers. The Minnesota and Pennsylvania individuals listed were used in laboratory screening but not bioinformatics analysis. The genotypes marked with asterisks were used in initial laboratory PCR-based test analyses, while the boxed genotypes had the greatest number of PE reads and were chosen to build the mock reference. Genotyping results for the tested markers are included.

A. arguta US Kiwiberry Collection Gender Gender aC aC N PI CACT DACT PE reads SRA number Genotype name reported observed 36306 123485 1 Chang Bai Mountain 3 F F F F 617110 50 212 11,533,362 SRR3234099 2 ORUS 2-16 F F F F 667891 241 249 12,638,534 SRR3234100 3 ORUS 1-3 M M M M 667903 254 244 3,857,518 SRR3234102 4 Chang Bai Mountain 4 F M M M 617110 50 212 8,735,374 SRR3234104 5 Optiz2 M M M M 637803 163 241 5,665,492 SRR3234105 6 Optiz1* M M M M 637802 162 242 6,587,636 SRR3234106 7 Frenchman’s Bay F F F F 617162 116 226 6,368,934 SRR3234107 8 ORUS 2-17* F M M M 667895 245 252 2,219,262 SRR3234108 9 Flower Cloud* M M M M ------1,606,094 SRR3234109 10 Chang Bai Mountain 5 F F F F 617110 50 212 1,855,156 SRR3234111 11 Tatyana F F F F ------1,469,580 SRR3234112 12 Cornell TBF F F F F ------3,747,072 SRR3234113 13 211-B F F F F 637809 170 271 5,959,902 SRR3234114 14 ORUS 1-4 F F F F 667890 239 322 4,645,838 SRR3234116 15 Chang Bai Mountain 2 F F F F 617110 50 212 3,433,118 SRR3234117 16 HVSC-115* F F F F 641101 186 231 2,840,748 SRR3234118 17 Issai Small Fruit Variant F F F F 617116 58 134 3,151,572 SRR3234120 18 74-32 M M M M 617113 54 127 5,488,740 SRR3234121 19 Red Princess F F F F 617118 60 274 3,386,808 SRR3234122 20 Issai F F F F 617106 46 233 4,525,900 SRR3234123 21 Hardy Red F F F F 617107 47 230 4,646,988 SRR3234124 22 OGW M M M M ------4,940,840 SRR3234125 23 ORUS 1-5 F F F F 667899 250 245 4,455,038 SRR3234127 24 ORUS 1-8* F M M M 667904 255 247 2,096,570 SRR3234128 25 ORUS 1-6 F F F F 667889 238 246 11,782,074 SRR3234130

53 CONTINUED ON NEXT PAGE

Gender Gender aC aC SRA N Genotype name PI CACT DACT PE reads reported observed 36306 123485 number 26 ORUS 2-7* F F F F 667901 245 252 2,135,604 SRR3234132 27 ORUS 3-3 F F F F 667892 242 254 8,156,828 SRR3234133 28 HVSC-117 -- M M M 641102 188 232 3,209,702 SRR3234134 29 Chang Bai Mountain 1 F F F F 617110 50 212 397,712 SRR3234135 30 Ogden Point* F F F F 617140 90 240 2,749,792 SRR3234136 31 DACT-214 F F F F 637812 174 214 5,322,144 SRR3234139 32 Dumbarton Oaks* F F F F 617135 84 225 6,049,356 SRR3234143 33 Geneva 3 F F F F -- -- 37 6,276,020 SRR3234144 34 Jumbo F F F F 617108 48 234 5,544,624 SRR3234146 35 Michigan State TBF F F F F 617136 86 239 5,521,514 SRR3234149 36 ORUS 2-1 F M M M 667905 256 248 8,998,044 SRR3234151 37 Smith 2 Male M M M M ------5,664,902 SRR3234152 38 211-A M M M F 637808 169 272 6,401,008 SRR3234153 39 127-40 (UNH42) F F F F 617142 92 206 1,087,034 SRR3234154 40 74-46* M M M M -- -- 129 4,387,350 SRR3234155 41 West* F F F F 617121 66 257 6,337,356 SRR3234156 42 DACT 123 F F F F 617152 104 123 6,626,818 SRR3234158 43 Andrey M M M M ------4,494,836 SRR3234159 44 Chico F F F F 667888 114 223 4,743,214 SRR3234161 45 DACT-216 F F F F 637811 172 216 3,793,736 SRR3234163 46 DACT-260 F F F F 617168 139 260 8,100,576 SRR3234165 47 Early Cordifolia F F F F ------4,634,780 SRR3234167 48 119-40 F F F F 617156 108 202 3,857,758 SRR3234174 49 74-55 (UNH106) F F F F 617114 56 132 3,337,274 SRR3234177 50 Ananasnaya F F F F 617104 44 220 3,368,520 SRR3234178 51 125-40 F -- F F 617157 109 204 3,554,890 SRR3234179 52 DACT-218 F F F F 637810 171 218 5,227,524 SRR3234183 53 Geneva 1 F F F F 617101 30 227 1,651,662 SRR3234186 54 New Zealand F F F F 617125 70 334 965,764 SRR3234191 55 ORUS 2-3 F F F F 667900 251 333 4,351,174 SRR3234192 56 74-52 -- M M M ------3,936,772 SRR3234193 57 #74 Female F F F F 637804 164 326 4,478,434 SRR3234194 58 127-40 (UNH57) M M M M 617163 120 205 4,746,724 SRR3234196 59 74-49 (UNH3) F F F F 617103 43 208 4,739,038 SRR3234197

60 DACT-213 F M M F 637813 175 213 4,723,552 SRR3234199 5

4 CONTINUED ON NEXT PAGE

Minnesota Population Gender Gender aC Ac SRA N PI CACT DACT PE reads Ogden Point Seedlings reported observed 36306 123485 number 1 MINN A - parent F -- F F 143 ------2 MINN B M -- M M 119 ------3 MINN C F -- F F 120 ------4 MINN D F -- F F 121 ------5 MINN E M -- M F 122 ------6 MINN G F -- F X 124 ------7 MINN H M -- M M 125 ------8 MINN I F -- F F 126 ------9 MINN J M -- M F 127 ------10 MINN K M -- M M 128 ------11 MINN L F -- F F 129 ------12 MINN M F -- F F 130 ------13 MINN N F -- F F 131 ------14 MINN O F -- F F 132 ------15 MINN P F -- F F 133 ------16 MINN R M -- M M 135 ------17 MINN S M -- M M 136 ------Gender Gender aC aC SRA N Pennsylvania Population PI CACT DACT PE reads reported observed 36306 123485 number 1 PENN A M -- M M ------2 PENN B F -- F F ------3 PENN C F -- F F ------4 PENN D M -- M M ------5 PENN E M -- M M ------6 PENN F M -- M M ------7 PENN G M -- M M ------8 PENN H F -- F F ------9 PENN I F -- F F ------10 PENN J F -- F F ------11 PENN K M -- M M ------12 PENN L F -- F F ------13 PENN M M -- M M ------14 PENN N F -- F F ------15 PENN O M -- M M ------

16 PENN P M -- M M ------5

5 17 PENN Q F -- F F ------CONTINUED ON NEXT PAGE

Gender Gender aC aC SRA N Pennsylvania Population PI CACT DACT PE reads reported observed 36306 123485 number 18 PENN R F -- F X ------19 PENN S F -- F F ------20 PENN T M -- M X ------21 PENN U F -- F F ------A. kolomikta US Kiwiberry Collection Gender Gender kC Kc SRA N PI CACT DACT PE reads Genotype Name reported observed 12625 72369 number 1 Krupnopladnaya F M M -- 617122 67 282 3,596,006 SRR3234071 2 Dr. Szymanowski M F F -- 617166 137 280 5,907,658 SRR3234072 3 Urozhainaya F F F -- 641096 129 295 3,639,332 SRR3234073 4 Aromatnaya F F F -- 617143 93 277 4,701,056 SRR3234074 5 Nahodka F F F -- 617120 65 287 2,343,654 SRR3234075 6 Arctic Beauty* F M M -- 617144 94 276 4,346,014 SRR3234076 7 Pavlovskaya -- F F -- 617148 99 -- 3,545,306 SRR3234077 8 Red Beauty -- F F ------6,578,532 SRR3234078 9 Sentyabraskaya M F F -- -- 128 51 4,772,602 SRR3234079 10 A/O* F F F -- -- 95 -- 7,989,260 SRR3234080 11 Krupnopladnaya -- F F -- 617138 88 283 3,956,698 SRR3234082 12 Matonaya -- F F -- 617147 97 286 9,769,392 SRR3234083 13 Oluyckos* F F F -- 641095 127 289 4,791,980 SRR3234084 14 Pasha* F M M ------1,289,652 SRR3234086 15 Sentyabraskaya F F F -- 617149 100 293 4,277,928 SRR3234087 16 Klara Zeitkin* M F F -- 667980 -- 328 3,464,698 SRR3234091 17 Krupnopladnaya F F F -- 617146 96 284 4,564,738 SRR3234092 18 Raintree* F M M -- 617126 71 292 5,763,436 SRR3234094 19 September Sun* F F F -- 667979 128 294 1,570,490 SRR3234095 20 Ananasnaya -- F F -- 667981 -- 329 4,765,536 SRR3234096 21 Leningradskaya Pozdnaya* M M M -- 617150 101 285 5,341,386 SRR3234097 Minnesota Population Gender Gender kC Kc SRA N PI CACT DACT PE reads Nahodka Seedlings reported observed 12625 72369 number 1 MnNS-1 F -- F M ------2 MnNS-3 F -- F M ------3 MnNS-4 F -- F F ------4 MnNS-5 F -- F F ------

5 5 MnNS-6 F -- F F ------CONTINUED ON NEXT PAGE

6

Minnesota Population Gender Gender kC Kc SRA N PI CACT DACT PE reads Nahodka Seedlings reported observed 12625 72369 number 6 MnNS-7 F -- F M ------7 MnNS-7.5 F -- F M ------8 MnNS-8 F -- F F ------9 MnNS-9 F -- F M ------10 MnNS-F F -- F M ------11 Nahodka - parent F -- F F ------12 MnNSM-1 M -- M M ------13 MnNSM-2 M -- M X ------14 MnNSM-3 M -- M F ------15 MnNSM-4 M -- M X ------16 MnNSM-5 M -- M M ------17 MnNSM-6 M -- M F ------18 MnNSM-7L M -- M F ------19 MnNSM-8 M -- M M ------20 MnNSM-9 M -- M M ------21 MnNSM-10 M -- M M ------

5

7