ABSTRACT

BALJINDER KAUR. Developing Thrips Tolerance and Genetic Architecture of Agronomic Traits in Cotton. (Under the direction of Vasu Kuraparthy and Randy Wells).

Thrips (order Thysanoptera) are a significant early-season pest of cotton throughout most of the Cotton Belt. Unless controlled in a timely manner, seedling feeding by both adults and immatures results in maturity delays and yield reductions. One alternative to costly annual insecticide-dependent control of thrips would be to identify natural sources of tolerance to thrips feeding and/or damage. Field studies including 391 Gossypium hirsutum L. and 34 G. barbadense accessions were conducted for two years at the Upper Coastal Plains Research Station near Rocky Mount, NC. At 2.5, 3.5, and 4.5 weeks after planting, accessions were evaluated for visual damage (on a scale of 0-5), thrips counts on seedlings, and seedling dry weight. Based on visual damage score, five tolerant (score 0-1.5) G. barbadense accessions and five moderately tolerant (score 1.5-3) upland cotton accessions were identified. Tobacco thrips [Frankliniella fusca (Hinds)] and western flower thrips (Frankliniella occidentalis) were present in the majority of seedling samples. In 2015-16, greenhouse experiments were conducted, and height, growth rate, leaf pubescence, and leaf area were recorded on ten lines selected based on thrips response in the field. When lines were grouped into tolerant and susceptible types, leaf pubescence and relative growth rate recorded in the absence of thrips were significantly higher in tolerant accessions than in susceptible lines.

Leaf shape is an important agronomic trait known to influence yield, lint trash, boll rot resistance, and flowering rate. Studying the genetic basis of leaf shape is easier in Gossypium arboreum L., an A-genome diploid progenitor species of tetraploid cotton, than in tetraploid species. Additionally, it provides a better understanding of the orthologous loci controlling biological traits in both diploid and polyploid species. Based on lobe depth, leaf shape in cotton is predominantly classified as normal, subokra, okra, or laciniate. Laciniate leaf shape in diploids is equivalent to okra in tetraploid cotton. A bi-parental population of 135 F2 plants developed from a cross between G. arboreum accessions NC 501 and NC 505 was used to genetically map laciniate leaf shape in diploid cotton. A single incompletely dominant gene (L^L-A2) controlled the laciniate leaf shape trait in G. arboreum. Using simple-sequence repeat (SSR) and sequence-tagged site (STS) markers, the leaf shape locus (L-A2) was mapped on chromosome 2 within a region comprising nine putative genes. Gene sequences from two candidates, Cotton_A_505 and Cotton_A_507, had similarity to LMI1-like genes in Arabidopsis. The orthologous relationship between laciniate leaf shape in G. arboreum and okra leaf shape in G. hirsutum L. was confirmed by targeted mapping using candidate genes in the region. Also, the gene order in the targeted region was well conserved across the two diploid (G. arboreum and G. raimondii) genomes.

The genus Gossypium is distributed throughout the tropics and displays great morphological, physiological, and ecological diversity. However, cultivated cotton in the US was derived from limited day-neutral stocks, resulting in a narrow genetic base. Simple sequence repeat (SSR) markers were applied to 185 G. hirsutum wild accessions collected primarily from Central America to assess genetic diversity and population structure. The 122 SSR markers used to genotype the diversity panel amplified 143 loci with more than 800 alleles. Based on STRUCTURE analysis and principal component analysis (PCA), five major clusters were identified. These groups closely corresponded to the geographical regions of collection, with significant admixture between the sub-populations. Analysis of molecular variance (AMOVA) suggested that most of the genetic variation was explained by differences within groups. Additionally, pairwise kinship estimates were calculated, and core sets representing various levels of allelic richness were identified. Estimating genetic diversity and population structure will help breeding programs exploit the genetic variation and rare alleles present in wild germplasm.

© Copyright 2017 Baljinder Kaur

All Rights Reserved

Developing Thrips Tolerance and Genetic Architecture of Agronomic Traits in Cotton

by Baljinder Kaur

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Crop Science

Raleigh, North Carolina

2017

APPROVED BY:

Vasu Kuraparthy, Committee Co-Chair

Randy Wells, Committee Co-Chair

Jack Bacheler

James Holland

DEDICATION

To family and friends

BIOGRAPHY

Baljinder Kaur was born on August 27, 1988 in Gurdaspur, Punjab, India to Kuldeep Singh and Harjeet Kaur. After completing her high school education at KV Gurdaspur, she obtained her bachelor’s degree in Biotechnology (Hons.) from DAV College, Chandigarh. Baljinder completed her master’s degree in Plant Biotechnology at Punjab Agricultural University, Punjab. In Fall 2012, she came to the United States to pursue a Ph.D. degree in the Department of Crop and Soil Sciences, NC State University. She married her husband Avinav on May 14, 2016.

ACKNOWLEDGMENTS

There are many people who helped me during my stay at NC State. I might not be able to mention all the names, but your support is always appreciated.

First of all, I express my sincere gratitude to my advisor Dr. Vasu Kuraparthy for providing the opportunity to join NC State and work with a great bunch of people. His constant support, guidance, immense knowledge, enthusiasm, patience, and faith in me throughout were greatly beneficial. His passion for science and drive to ‘discover the truth’ always encouraged me and pushed me to explore new research ideas. Without his guidance and help this dissertation would not have been possible.

Besides my advisor, I would like to thank other committee members Dr. Randy Wells, Dr. Jack Bacheler, and Dr. Jim Holland for their time, insightful suggestions, comments, and encouragement during review of my research work. Also, I am grateful to Dr. Keith Edmisten for his constant support and encouragement.

My thanks and appreciation go to all lab members, fellow researchers, and graduate students for their support. I would like to give special thanks to Dr. Priyanka Tyagi for helping and guiding me at every stage of my experiments and being my ‘go-to person’. I would also like to thank my fellow co-workers, especially Ryan Andres, Linglong Zhu, Hui Fang, Kuang Zhang, and the team of undergraduates, for their help and advice on a day-to-day basis during my research work in the lab as well as in the field. The personnel at the USDA Small Grain Regional Genotyping Laboratory and Method Greenhouses, and the staff at the Central Crops and Upper Coastal Plain Research Stations, contributed immensely to genotyping and field work.

Also, I would like to acknowledge the financial support provided by the NC Cotton Producers Association, which helped in paying my fees and putting bread on the table.

Friends are ‘the new family’, especially when you are in a new land away from home. I would like to thank all my friends Amsarani, Sushila, Lucky, Priyanka, Chirag, Nehal, Jaspreet, Naren, and Dev for supporting me during my highs and lows and helping me achieve my dreams. Special thanks to Avinav for being a great friend and partner, and for motivating me the whole time. I know you are always there standing beside me.

And of course, I am highly grateful to my parents Kuldeep Singh and Harjeet Kaur and my brother Banveer for their love and support during my education and life in general.

TABLE OF CONTENTS

Chapter 1: Introduction
    References

Chapter 2: Screening Germplasm and Quantification of Components Contributing to Thrips Tolerance in Cotton
    Abstract
    Introduction
    Materials and Methods
        Plant Material
        Field Planting/Locations
        Evaluation of Thrips Tolerance
        Statistical Analysis
        Identification of Thrips Species
        Greenhouse Experiments
    Results and Discussion
        Thrips Screening at Rocky Mount
        Comparison between G. hirsutum and G. barbadense Accessions
        Performance of Lines Over Two Years (2014 and 2015)
        Evaluation of Different Scoring Techniques
        Thrips Species Identification
        Identification of Tolerant Accessions and Introgression of Thrips Tolerance to G. hirsutum
        Growth Components Contributing to Thrips Tolerance
    References

Chapter 3: Major leaf shape genes, laciniate in diploid cotton and okra in polyploid Upland Cotton, Map to an orthologous genomic region
    Abstract
    Introduction
    Materials and Methods
        Plant Material
        Phenotyping
        Molecular Genetic Mapping
        Genetic Map Construction
        Marker and Comparative Genomic Analysis
    Results
        Inheritance of Leaf Shape Trait in G. arboreum
        SSR Marker Analysis and Genetic Mapping of L^L-A2 Gene
        Comparative Analysis of the L^L-A2 Genetic Map with G. hirsutum High-Density Consensus Map
        Comparative Mapping of the L^L-A2 Genetic Map to the Diploid A- and D-Genomes
        STS Marker Development and Genomic Targeting of L^L-A2 Gene
        Annotation and Orthologous Mapping of the L^L-A2 Candidate Region using G. arboreum, G. raimondii, and G. hirsutum Genomes
        Extension of Comparative Genomic Analysis of Orthologous Leaf Shape Region in Diploid A- and D-Genome Physical Maps
    Discussion
    References

Chapter 4: Assessment of genetic diversity and population structure in the tropical landrace accessions of Gossypium hirsutum L.
    Abstract
    Introduction
    Materials and Methods
        Plant Material
        Genotyping Studies
        Scoring of SSR Markers and Assessment of Genetic Diversity
        Analysis of Population Structure
        Relative Kinship and Gene Flow Estimates
        Selection of Core Set Lines
    Results
        SSR Marker Analysis
        Genetic Diversity Analysis in the Diversity Panel
        Population Structure Analysis
        Phylogenetic and Principal Component Analysis
        Estimation of Kinship and Gene Flow in G. hirsutum Accessions
        Core Sets of G. hirsutum Wild Accessions
    Discussion
    References

Appendices
    Appendix A- List of 289 elite G. hirsutum and 34 G. barbadense accessions screened for thrips tolerance at Upper Coastal Research Station, Rocky Mount, NC

LIST OF TABLES

Chapter 2

Table 1: List of G. hirsutum and G. barbadense accessions evaluated for plant height, relative growth rate, leaf pubescence and leaf area in greenhouse
Table 2: Comparison of damage score, thrips count, and dry weight between G. hirsutum and G. barbadense accessions
Table 3: Comparison of damage score, thrips count and dry weight readings between summer 2014 and summer 2015
Table 4: Pearson Correlation Coefficients between damage score, thrips count and dry weight recorded on three scoring dates in year 2014 and 2015
Table 5: List of thrips tolerant and moderately tolerant G. hirsutum and G. barbadense accessions based on field performance in year 2014 and 2015
Table 6: List of thrips tolerant G. barbadense accessions and thrips susceptible Upland cotton accessions used to introgress thrips tolerance in G. hirsutum
Table 7: P > F for plant height, relative growth rate, leaf pubescence, and leaf area of 10 pima and upland cotton accessions grouped based on thrips response under field conditions
Table 8: Least mean squares for thrips tolerant and susceptible groups of G. hirsutum and G. barbadense for plant height, relative growth rate, pubescence and leaf area recorded on 10, 15, 20, and 25 DAP
Table 9: Mean trichome counts for leaf size groups for pubescence and leaf area recorded on 25 DAP for individual runs
Table 10: P > F for plant height, relative growth rate, leaf pubescence, and leaf area of 10 pima and upland cotton accessions grouped based on species
Table 11: Least mean squares for G. barbadense and G. hirsutum groups for plant height, relative growth rate, pubescence and leaf area recorded on 10, 15, 20, and 25 DAP

Chapter 3

Table 1: Polymorphic SSR and STS markers used for molecular mapping of L^L-A2 gene in G. arboreum
Table 2: SSR and STS markers used for orthologous mapping and genomic targeting of leaf shape gene (L^L-A2) in cotton
Table 3: Annotation and comparative genomic analysis of putative gene sequences identified in the genomic region of orthologous leaf shape locus (L) using sequence based physical maps of diploid progenitor cotton species G. arboreum and G. raimondii
Table S1: Chromosome locations of the markers and their allele sizes on each parent used for mapping the L-A2 locus

Chapter 4

Table 1: Pairwise genetic distance estimates within and between G. hirsutum groups identified by STRUCTURE analysis based on Nei et al. (1983)
Table 2: Analysis of molecular variance (AMOVA) between and within groups for G. hirsutum accessions estimated based on STRUCTURE analysis
Table 3: Pairwise FST estimates for the five groups obtained from STRUCTURE analysis of the diversity panel of landrace accessions of G. hirsutum
Table 4: List of G. hirsutum landrace accessions included in the core sets identified by simulated annealing algorithm using PowerMarker software
Table S1: List of G. hirsutum accessions used in the genetic diversity study along with their PI numbers, race and geographical location
Table S2: List of SSR markers used to genotype the diversity panel of 185 G. hirsutum landrace accessions
Table S3: A summary of the marker statistics based on POWERMARKER analysis used to genotype the diversity panel of G. hirsutum landrace accessions
Table S4: List of accessions carrying unique alleles in the diversity panel
Table S5: Proportional membership of cotton accessions to clusters as determined by model-based analysis using STRUCTURE
Table S6: Fixation indices (FIS and FST) and gene flow estimate (Nm) for each locus across the groups obtained from STRUCTURE analysis of the diversity panel of landrace accessions

LIST OF FIGURES

Chapter 2

Figure 1- Visual damage scoring scale used to evaluate the thrips response of G. hirsutum and G. barbadense accessions at Upper Coastal Plains Research Station, Rocky Mount, NC
Figure 2- Thrips tolerant, moderately tolerant, and susceptible G. hirsutum and G. barbadense accessions identified in field screening at Upper Coastal Plains Research Station, Rocky Mount, NC

Chapter 3

Figure 1- Leaf shapes of diploid cotton and tetraploid upland cotton
Figure 2- Linkage map of L^L-A2 gene on chr02 of G. arboreum and its comparative map analysis with high-density consensus map of homoeologous chr01 of the A-subgenome of upland cotton
Figure 3- Molecular mapping of L^L-A2 gene in G. arboreum and its genomic location in relation to the sequence-based physical maps of G. raimondii and G. arboreum genomes and to tetraploid genetic map of Andres et al. (2014)
Figure S1- Leaf shapes of the Gossypium arboreum parental accessions and their F1 hybrid

Chapter 4

Figure 1- Bar graph of allele frequencies and allele count in the diversity panel of 185 G. hirsutum landrace accessions
Figure 2- (a) Graph showing probability of data (Ln) for K values ranging from 2 to 12. (b) Estimating number of subpopulations using delta K values for K ranging from 2-12 using method proposed by Evanno et al. (2005)
Figure 3- Q plot showing clustering of 182 G. hirsutum landrace accessions into 5 clusters based on co-dominant genotypic data using STRUCTURE
Figure 4- Phylogenetic tree obtained from NJ analysis on 182 G. hirsutum landrace accessions
Figure 5- Two-dimensional Principal Component Analysis (PCA) of 182 G. hirsutum landrace accessions
Figure 6- Plot depicting the percent of alleles captured in core sets with different number of lines
Figure S1- Phylogenetic tree obtained by NJ analysis on a complete panel of 185 G. hirsutum accessions
Figure S2- Two-dimensional Principal Component Analysis (PCA) of the 185 member panel
Figure S3- Histogram of pairwise relative kinship estimates between 182 G. hirsutum landrace accessions

CHAPTER 1: Introduction

Evolution and History of Cotton

Cotton belongs to the genus Gossypium, which comprises more than 50 diploid (2n = 2x = 26) and tetraploid (2n = 4x = 52) species (Fryxell et al. 1992; Percival et al. 1999; Wendel et al. 2009; Wendel and Grover, 2015). The diploid species are grouped into eight genomes designated A, B, C, D, E, F, G, and K (Percival et al. 1999). These species represent a broad range of genetic diversity that spreads over vast geographic and ecological regions (Abdurakhmonov et al. 2012). There are three major lineages of diploid species, which correspond to geographical regions of evolution: Australia (C, G, and K genomes), the Americas (D genome), and Africa/Arabia (A, B, E, and F genomes) (Fryxell 1979, 1992; Percival et al. 1999). The Australian species are herbaceous perennials that exhibit a two-season growth type, i.e., during the dry season the vegetative growth dies due to heat or fire, but underground roots remain alive and start a new growth cycle in the next wet season. C, G, and K genome species have special features that are characteristic of ‘fire-adapted’ plants (Percival et al. 1999). African/Arabian species collectively comprise four of the eight genome groups of cotton. Of these, the A genome representatives G. arboreum and G. herbaceum have been studied extensively. The F genome species are cytologically distinct and well adapted to moist conditions compared to other diploids (Phillips and Strickland, 1966). Adaptation to the extremely dry conditions of eastern Africa makes E genome species valuable. However, they are genetically distinct, which limits their role in the improvement of cultivated cotton germplasm (Fryxell, 1992). American species may grow as large shrubs or small trees. D genome species have gossypol as the dominant terpenoid aldehyde in their leaves, unlike most species in the genus Gossypium, which carry a mix of terpenoids (Fryxell, 1992).

An A-genome diploid hybridized with a D-genome diploid to form the allotetraploid species (AADD) about 1-2 million years ago (MYA). Gossypium herbaceum and G. arboreum, the remaining A genome diploid species, appear to be equally related to the A genome donor of the tetraploids, whereas G. raimondii closely resembles the D genome donor (Wendel et al. 2009). Following the hybridization event, the primitive tetraploid speciated into six New World species. Gossypium hirsutum L. (AD1) and G. barbadense (AD2), which together account for more than 90 percent of world cotton production, are indigenous to Southern Mexico/Guatemala and South America, respectively (Wendel et al., 2009). The centers of diversity for the remaining four tetraploids are restricted to comparatively smaller regions. Gossypium tomentosum (AD3) is found in the Hawaiian Islands, whereas G. mustelinum (AD4) is restricted to northeastern Brazil (Meyer and Meyer, 1961; Wendel et al., 1994). Gossypium darwinii (AD5) is indigenous to the Galápagos Islands (Percival et al. 1999). The recently reported species G. ekmanianum is endemic to the Dominican Republic (Grover et al. 2015). Currently, only two of the diploid species (G. arboreum and G. herbaceum) and two of the tetraploid cottons (G. hirsutum and G. barbadense) are cultivated (Kantartzi et al. 2009).

Together, the diploid and tetraploid species in the genus Gossypium display great diversity in morphology, ecology, growth habit, and geographical range, which spreads across most tropical and subtropical regions of the world. Germplasm resources in cotton are categorized into gene pools based on their genomes and their proximity and degree of accessibility to the cultivated tetraploid cotton species (Stewart, 1995). The primary gene pool includes cultivated and wild tetraploid species. The secondary pool comprises the A, B, F, and D genomes, and the tertiary gene pool is made up of C, E, G, and K genome species. Germplasm resources in the primary pool can easily be crossed with G. hirsutum to produce a fertile F1 hybrid. The species in the secondary and tertiary pools may require manual intervention to produce a fertile hybrid, or in some cases may not hybridize at all (Stewart, 1995). Therefore, primary germplasm resources are the most commonly used in improvement efforts.

Cotton as a Field Crop

Cotton (Gossypium hirsutum L.) has one of the more complex morphological structures among the major field crops. It evolved in tropical regions, and most ancestral or wild cotton is short-day flowering (Brubaker et al. 1999). Cotton is a perennial with an indeterminate growth habit: it keeps growing even after entering the reproductive phase, with growth similar to that of a tree (Wendel and Cronn, 2003). Starting as a small, brownish seed and growing into a plant that produces fibers, the cotton plant displays complex growth habits. Its monopodial vegetative branches, sympodial fruiting branches, and sensitivity to unfavorable environmental conditions make cotton difficult to manage as a field crop (Oosterhuis, 1990). Studies suggest that if a cotton plant is continually disbudded, its growth and fruit production continue, producing five times more buds than control plants (Dale, 1959). This perennial, tree-like growth pattern runs counter to the annual crop production system that is universally used for commercial production. Understanding this balance between raising a healthy cotton plant and producing lint and seed on a timely basis is very important. Surprisingly, under favorable conditions the growth of the cotton plant is very predictable. It follows a well-defined pattern that can be expressed as the number of days needed to reach a specific stage. It usually takes 130-160 days from planting to boll maturation (Richie et al., 2007).

Another way to monitor the development rate is in terms of critical temperature, through the growing degree days (DD) concept (Oosterhuis, 1990). It is based on the understanding that there is a temperature threshold below which there is little to no development. Degree days in cotton are termed “DD60s” because of the minimum requirement of 60˚F for growth. Cotton Information (Edmisten and Collins, 2017) suggests planting cotton when the soil temperature is above this threshold value and the maximum daytime temperature for the next five days is approximately 80˚F or above.
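The degree-day concept can be illustrated with a short calculation. The sketch below assumes the commonly used averaging form, in which the heat units for a day equal the mean of the daily maximum and minimum air temperatures (˚F) minus the 60˚F base. The function names, the example temperatures, and the choice to count days below the base as zero are illustrative assumptions and are not taken from the sources cited above.

```python
# Minimal sketch of DD60 accumulation (assumed averaging method).
BASE_TEMP_F = 60.0  # growth threshold for cotton stated in the text


def dd60(t_max_f, t_min_f, base_f=BASE_TEMP_F):
    """Heat units for one day from max and min air temperature in deg F.

    Assumption: days whose mean temperature is below the base
    contribute zero rather than a negative value.
    """
    mean_temp = (t_max_f + t_min_f) / 2.0
    return max(0.0, mean_temp - base_f)


def accumulated_dd60(daily_temps):
    """Sum DD60s over an iterable of (t_max_f, t_min_f) pairs."""
    return sum(dd60(t_max, t_min) for t_max, t_min in daily_temps)


if __name__ == "__main__":
    # Hypothetical three-day temperature record (max, min) in deg F.
    week = [(80, 60), (90, 70), (55, 45)]
    print(accumulated_dd60(week))
```

Summing daily values in this way gives a running total that can be compared against the heat units a crop needs to reach a given growth stage, which is the practical use of the DD60 concept described above.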

Stages of Growth: Cotton growth can be divided into four main stages (Oosterhuis, 1990):

1. Germination, emergence and seedling establishment

2. Vegetative growth: expansion of leaf area and canopy development

3. Reproductive growth: flowering and boll development

4. Fiber maturation

Some researchers have further divided the first stage into two parts, i.e., (a) germination and (b) seedling establishment. Each growth stage in the life of the plant leads to different physiological and morphological changes and can have specific temperature and moisture requirements (Oosterhuis, 1990). For the most part, the stages overlap, and there is no clear demarcation between the growth phases. Comprehensive knowledge of these stage-specific physiological and morphological changes and growth requirements can help in growing and managing the crop efficiently and lead to higher profits.

Cotton Production in the World and the United States

Cotton is the most important natural fiber crop, accounting for ~30 percent of total world fiber use. Also, it is an emerging oilseed crop. In 2016-17, cotton was grown on over 29 million ha in more than 75 countries. The projected worldwide cotton production for 2016-17 is 105.34 million bales (USDA’s World Agricultural Production Report, January 2017).

In the United States, cotton is among the most economically important crops, along with corn, wheat, and soybean. It ranks third in domestic oil production (Oil Crops Yearbook, 2017). Most of the cotton produced in the United States comes from the Cotton Belt, which is divided into four regions. The Southeast region includes the states of Alabama, Florida, Georgia, North Carolina, South Carolina, and Virginia. Cotton is usually planted from early April to early June, and the Southeast accounts for 32 percent of the total upland cotton production in the United States. The Mid-South region spans the states of Arkansas, Louisiana, Mississippi, Missouri, and Tennessee. The Southwest region includes Kansas, Oklahoma, and Texas and produces the highest percentage (37%) of the domestic upland cotton crop. The states of Arizona, California, and New Mexico comprise the West region of the Cotton Belt (http://www.cottonusa.org/).

According to USDA’s market and trade report, the United States ranks third in production behind India and China and is the leading exporter of cotton, with 12.7 million bales in 2016/17. Domestically, U.S. textile mills consumed 3.6 million bales in 2016 (NCCA, 2016). Each year the majority of the crop (74 percent) is used for apparel, 18 percent for home furnishings, and 8 percent for industrial products. Another important product of the cotton crop is cottonseed, which accounts for two thirds of the total crop by weight. Annual cottonseed production averages 5.0 million tons, used as whole cottonseed or cottonseed meal in feed for livestock and poultry (World of Cotton, NCCA). Cottonseed oil ranks behind soybean and peanut in total domestic fat and oil production in the United States (Oil Crops Yearbook, 2016). Annually, the U.S. cotton industry accounts for more than $25 billion in products and services and provides approximately 200,000 jobs from farms to mills (Cotton and Wool Outlook 2016, USDA).

Challenges and Possibilities in Cotton

Worldwide cotton yield and production have increased over the years as a result of new high-yielding cultivars, breeding strategies, better agricultural practices, improved irrigation techniques, and efficient use of fertilizers, insecticides, and fungicides. However, an increasing population and climate change are putting enormous pressure on production systems to further increase yield from limited farmland. Cotton plants encounter a range of abiotic and biotic stresses over the production period. The following sections discuss the general challenges faced by the cotton crop and research opportunities in cotton.

Abiotic stresses such as drought, salinity, and extreme temperature fluctuations are major factors that negatively influence plant development, limit crop productivity, and lead to significant agricultural yield losses throughout the world (Barnabas et al. 2008). Drought can impact cotton production, as water-deficient conditions are negatively correlated with plant growth. Significant yield differences between irrigated and non-irrigated fields are frequently observed in arid and semi-arid regions across the US Cotton Belt (Grimes et al., 1969; Radin et al., 1992). Studies have shown that boll size, boll number, micronaire, and stages critical to cotton fiber development (fiber initiation, elongation, secondary cell wall development, and maturation) are affected by low water availability (Marani and Amirav, 1971; Pettigrew, 2004; Luo et al., 2008). High salt concentration in the soil is becoming a problem worldwide due to natural causes and agricultural practices such as irrigation (Munns and Tester, 2008). Additionally, more than half of all the irrigated land in the world is affected by salinity at some level (Arzani, 2008). Cotton is considered a salt-tolerant crop; however, high salt levels have a negative impact on plant growth and development, photosynthetic rate, fiber strength, micronaire value, maturity ratio, and maturity (Abdullah and Ahmed, 1986; Zhang et al., 2013).

High temperature is a significant problem, especially in dry-land cotton, where heat combined with water stress results in yield losses. Excessive daytime heat leads to wilting, and when this heat extends into the night, the respiration rate increases and the plant spends extra energy in the form of carbohydrates to maintain that rate. At the blooming stage, a shortage in the supply of carbohydrates needed to fill the developing bolls results in fewer seeds per boll, reduced boll size, and boll shedding (Silvertooth, 1990). Also, excessive temperatures at the growth stage of early square formation lead to pollen sterility. Although Upland cotton is less susceptible to high heat, Pima cotton is highly sensitive to such conditions, and sterile pollen results in reduced fruit set. Moreover, leaf damage due to high temperatures causes low photosynthetic rates and prematurely stops vegetative growth (Silvertooth, 1990).


Major diseases in cotton are caused mainly by fungi, nematodes, and bacteria. Fungi such as Pythium spp., Rhizoctonia solani, Phoma exigua (Ascochyta), and Fusarium spp. are primarily responsible for seedling diseases (Cotton Information, 2017). These fungi can attack at any stage between germination and emergence of the young seedling. Common symptoms include decay of seeds or seedlings before emergence, and root rot and girdling of seedlings after emergence. In North Carolina, Pythium spp. are the most common fungi associated with cotton seedlings (Cotton Information, 2017). Usually, the seed and root parts of the seedling are susceptible to Pythium spp. and Fusarium spp., whereas R. solani and P. exigua attack seedlings after emergence until they have about 4-5 true leaves.

Another important disease of cotton is boll rot, a common term for a number of diseases caused by fungal as well as bacterial species (Guthrie et al., 1994). It ranks second, after nematodes, in terms of percent damage caused. Boll rot is identified by small brown lesions that spread over the boll, leading to blackened and dried-out bolls. The primary factor favoring boll rot is moisture (Cotton Information, 2017). High relative humidity exposes plants to a range of causal organisms (Guthrie et al., 1994). In addition, bacterial blight caused by Xanthomonas campestris pv. malvacearum (Smith) and crown gall caused by Agrobacterium tumefaciens (Smith and Townsend 1907) are common bacterial diseases of cotton.

Nematodes are plant parasites that have been identified in every cotton growing state. They are worm-like creatures that puncture root cells, feed on plant nutrients, and cause tissue injury (Cotton Information, 2017). Depending on soil type and climatic area, several nematode species have been reported to infest cotton. The major species are root-knot, reniform, lance and sting nematodes. The root-knot nematode is prevalent across the cotton belt, whereas the reniform nematode is found mainly from North Carolina to Texas (Understanding Cotton Nematodes, NCC). The lance and sting nematodes are common in the southeast region. All of these species except the reniform nematode are most commonly associated with damage in coarse sandy soils. General symptoms of nematode damage include root galls, susceptibility to seedling diseases, stunted height and roots, lower yield, yellowed leaves, wilting, and plant stress. Root damage caused by nematodes limits the supply of water and nutrients to developing bolls, resulting in boll drop, water stress and susceptibility to other diseases (Cotton Inc: Managing Nematodes, 2012). Nematodes cause the highest percentage of yield losses in cotton (NCCA: Disease Database, 2014). Proposed strategies to control nematode infestations include crop rotation with corn, peanut and soybean; cultural practices such as deep tillage, disking and fallowing; growing resistant cultivars; and the use of nematicides. The most commonly used nematicide in the US was aldicarb, sold commercially as Temik 15G, initially by Union Carbide and more recently by Bayer CropScience. It was also recommended to control thrips in cotton. In 2010, EPA banned Temik 15G, requiring the stoppage of distribution and sale by 2017. Other available alternatives are imidacloprid and thiamethoxam as seed treatments, and metam-sodium (Vapam) and 1,3-dichloropropene (Telone II) as pre-plant soil treatments (NC Cooperative Extension-2016 Cotton Information).

Cotton is attacked by a number of insect pests throughout the growing season. These include cotton bollworm, plant bugs, stink bugs, aphids, thrips and spider mites. Comprehensive eradication programs were initiated to control two major cotton pests, i.e., pink bollworm and boll weevil. In 2003, the pink bollworm moth population across Phase I of the program was reduced by 90 percent by using Bt cotton and mating disruption pheromones. By 2014, pink bollworm was successfully eradicated throughout Texas, Arizona, New Mexico, and California (USDA-APHIS, 2009) and complete boll weevil eradication was achieved in all cotton producing states except for the Lower Rio Grande Valley of Texas (USDA-APHIS, 2014). Development of Bt cotton helped in controlling destructive caterpillar species such as tobacco budworm, European corn borer, fall and beet armyworm, and the cotton bollworm. Successful eradication programs and the development of Bt cotton resulted in reduced applications of lepidopteran insecticides, which in turn led to the rise of formerly secondary insects such as stink bugs and plant bugs. Plant bug is a collective term for a group of sucking pests including the tarnished plant bug (Lygus lineolaris), the western tarnished plant bug (L. hesperus), the cotton fleahopper (Pseudatomoscelis seriatus), the clouded plant bug (Neurocolpus leucopterus), and the verde plant bug (Creontiades signatus) (Cotton Incorporated, 2017). In recent years, three stink bug species, namely the green stink bug, the southern green stink bug, and the brown stink bug, have become a dominant cotton pest complex. Plant bugs preferentially feed on squares and young bolls, whereas stink bugs puncture small to medium sized bolls and feed on the soft developing seeds. The tarnished plant bug has become the primary pest of Mid-south cotton; the cotton fleahopper is a significant issue in Texas; western plant bugs are prevalent in Arizona and California; and both stink bugs and plant bugs have become significant pests in the Southeast region of the US cotton belt.

Thrips are the most economically damaging insect pests of cotton in Virginia and North Carolina. They are small, slender insects, often only a few millimeters or less in length (Bhatti, 1989). Thrips have stalk-like wings with long hairs and are often yellow, brown, or black as adults. Immature thrips are often pale yellow. Several thrips species attack seedling cotton. The most common thrips species in the Southeast are tobacco thrips, flower thrips, onion thrips, and western flower thrips. Thrips feeding results in stunted growth, reduced yield, lower boll set and delayed maturity (Smith, 1942; Gaines, 1965; Bourland et al, 1992). Insecticides such as imidacloprid and thiamethoxam are available as seed treatments, while at-planting in-furrow granular or spray applications, or post-emergence foliar applications, are also used.

Spider mites, whiteflies and cotton aphid are typically minor cotton pests in the Southeast. Of the 10 species of spider mites known to attack cotton in the United States, only the twospotted spider mite (Tetranychus urticae Koch) and the carmine spider mite [T. cinnabarinus (Boisduval)] are economically important (Cotton Incorporated, 2017). Spider mites feed by inserting a stylet-like mouthpart into plant cells and extracting soluble plant contents. Natural enemies such as predatory mites, ladybird beetles, and minute pirate, bigeyed and damsel bugs often keep spider mites below economic levels. However, hot and dry weather causes occasional outbreaks of high spider mite populations, leading to substantial yield losses (Leonard et al. 1999). Whiteflies (Homoptera: Aleyrodidae) are a pest problem primarily in Arizona and California. The common whitefly species infesting cotton are the sweet-potato whitefly [Bemisia tabaci (Gennadius)] biotypes A and B, the silverleaf whitefly [B. argentifolii (Perring and Bellows)], and the greenhouse whitefly [Trialeurodes vaporariorum (Westwood)] (Cotton Incorporated, 2017). Whitefly nymphs suck plant sap and feed on nutrients from plants. Large populations can cause wilting, stunting and leaf chlorosis. Adult and immature whiteflies excrete a sugary exudate (honeydew) on leaves and cotton bolls, resulting in ‘sticky cotton’. Under unfavorable conditions, secondary fungi grow on sticky cotton and produce a black mold. Sticky and sooty cotton causes problems in harvesting and ginning (Cotton Incorporated, 2017). Similar problems are induced by the cotton aphid [Aphis gossypii (Glover)], another sucking pest of cotton. Aphids are found in cotton fields throughout the United States but rarely lead to economic losses because natural control agents regulate aphid populations in the field (Leonard et al. 1999). Additionally, herbicide resistance resulting in both high input costs and yield loss, the relatively narrow genetic base of currently planted cultivars, high input costs, low commodity prices, and competition from synthetic fibers are some of the challenges being faced by the cotton industry.

The future of cotton relies on reducing production costs; eco-friendly, sustainable farming; and developments in plant health, fiber quality, yield, diseases, pests, herbicide tolerance, abiotic stresses and plant efficiency. Efforts to enhance the value of cotton seed will also be needed. Whole cottonseed has a high level of protein (23%) and higher fat (20%) and fiber (24%) content than available protein supplements (Cotton Incorporated, 2017). Cotton seed and cotton seed meal are competitive feed components for high production dairy cows. Unlike in corn, the primary source of energy in cottonseed is fat, which does not interfere with forage digestion and supports reproductive performance. Also, advances in molecular sciences, genomics, transgenics, phenotyping, and next generation sequencing techniques can help cotton cultivars withstand the economic pressures generated by a changing climate and market competition.

References

Abdullah, Z, Ahmad R. 1986. Salinity induced changes in the reproductive physiology of cotton plants. In: R. Ahmad and A.S. Pietro, editors, Prospects for Biosaline Research, Proc US-Pak Biosaline Res. Workshop, Karachi University, Karachi, Pakistan, p: 125–138.

Abdurakhmonov, I.Y., A. Abdukarimov, A.E. Pepper, A.A. Abdullaev, F. Kushanov, Z.Y. John, J.N. Jenkins, K. Urmonov, M. Ulloa, R.J. Kohel, and S.S. Egamberdiev. 2012. Genetic Diversity in Gossypium genus. In: M. Caliskan, editor, Genetic Diversity in Plants, ISBN: 978-953-51-0185-7, InTech, p. 313–338. doi:10.5772/2640.

Arzani, A. 2008. Improving salinity tolerance in crop plants: a biotechnological view. In Vitro Cell. Dev. Biol.-Plant 44:373. doi:10.1007/s11627-008-9157-7.

Barnabás, B., K. Jäger, and A. Fehér. 2008. The effect of drought and heat stress on reproductive processes in cereals. Plant Cell Environ. 31(1):11-38.

Bhatti J.S. 1989. The classification of Thysanoptera into families. Zoology 2(1):1-23.

Bourland, F.M., D.M. Oosterhuis, and N.P. Tugwell. 1992. Concept for monitoring the growth and development of cotton plants using main-stem node counts. J. Prod. Agric. 5:532-538.

Brubaker, C.L., F.M. Bourland, and J.F. Wendel. 1999. The origin and domestication of cotton. In: C.W. Smith, and J.T. Cothren, editors, Cotton: Origin, History, Technology, and Production. Wiley, New York, p. 3–32.

Cotton Inc. 2017. http://www.cottoninc.com/fiber/AgriculturalDisciplines/Entomology/ (Accessed March, 2017).

Cotton Information 2017. NCSU University Extension Pub. http://www.cotton.ncsu.edu/. Raleigh, NC: North Carolina State University Press.

Cotton Insects Losses Report. 2014. Mississippi State University Extension. http://www.entomology.msstate.edu/resources/cottoncrop.asp

Dale, J. E. 1959. Some effects of the continuous removal of floral buds on the growth of the cotton plant. Annals of Botany. 23(4): 636-649.

Edmisten, K., and Collins G. 2017. The cotton plant. 2017 Cotton Information. NCSU University Extension Pub. http://www.cotton.ncsu.edu/. Raleigh, NC: North Carolina State University Press. pp 5–15

Fryxell, P.A. 1979. The Natural History of Cotton Tribe. Texas A&M University Press, College Station, TX.

Fryxell, P.A. 1992. A revised taxonomic interpretation of Gossypium L. (Malvaceae). Rheedea 2:108-165.

Gaines, J.C. 1965. Cotton insects. Texas Agricultural Experiment Station Serial Bulletin. 933.

Grimes, D.W., H. Yamada, and W.L. Dickens. 1969. Functions for cotton (Gossypium hirsutum L.) production from irrigation and nitrogen fertilization variables: I. Yield and evapotranspiration. Agron. J. 61(5):769-773.

Grover, C.E., X. Zhu, K.K. Grupp, J.J. Jareczek, J.P. Gallagher, E. Szadkowski, J.G. Seijo, and J.F. Wendel. 2015. Molecular confirmation of species status for the allopolyploid cotton species, Gossypium ekmanianum Wittmack. Genet. Resour. Crop Evol. 62:103–114.

Guthrie, D., K. Whitam, B. Batson, J. Crawford, and G. Jividen. 1994. Boll Rot. Cotton Physiology Today. 5(8).

Kantartzi, S.K., M. Ulloa, E. Sacks, and J. M. Stewart. 2009. Assessing genetic diversity in Gossypium arboreum L. cultivars using genomic and EST-derived microsatellites. Genetica 136(1):141-147.

Leonard, B.R., J.B. Graves, and P.C. Ellsworth. 1999. Insect and mite pests of cotton. In: C. W. Smith and J.T. Cothren, editors, Cotton: origin, history, technology, and production. Wiley, New York, p.489-552.

Luo, H.H., J.H. Li, L. Gou, W.F. Zhang, Z.J. He, and X.J. Yang. 2008. Regulation of Under-Mulch-Drip Irrigation on Production and Distribution of Photosynthetic Assimilate and Cotton Yield under Different Soil Moisture Contents During Cotton Flowering and Boll-Setting Stage. Scientia Agricultura Sinica 7:012.

Marani, A. and A. Amirav. 1971. Effects of soil moisture stress on two varieties of upland cotton in Israel I. The coastal plain region. Experimental Agriculture 7(03):213-224.

Meyer, J.R. and V.G. Meyer. 1961. Origin and inheritance of nectariless cotton. Crop Science, 1(3), pp.167-169.

Munns, R., and M. Tester. 2008. Mechanisms of salinity tolerance. Annu. Rev. Plant Biol. 59: 651-681.

NCCA. 2014. Disease database. http://www.cotton.org/tech/pest/ (Accessed March, 2017).

NCCA. 2017. Understanding cotton nematodes. http://www.cotton.org/tech/pest/nematode/ucn (Accessed March, 2017).

NCCA. 2017. World of Cotton. http://www.cotton.org/econ/world/index.cfm (Accessed March, 2017).

Oosterhuis, D.M. 1990. Growth and development of a cotton plant. In: W.N. Miley and D.M. Oosterhuis, editors, Nitrogen Nutrition of Cotton: Practical Issues. ASA, Madison, WI. p. 1–24.

Percival, A.E., J.F. Wendel, and J.M. Stewart. 1999. Taxonomy and germplasm resources. In: C.W. Smith and J.T. Cothren, editors, Cotton: Origin, History, Technology, and Production. John Wiley and Sons, Inc., New York, NY.

Pettigrew, W.T. 2004. Physiological consequences of moisture deficit stress in cotton. Crop Science, 44(4):1265-1272.

Phillips, L.L., and M.A. Strickland. 1966. The cytology of a hybrid between Gossypium hirsutum and G. longicalyx. Can. J. Genet. Cytol. 8(1):91-95.

Radin, J.W., L.L. Reaves, J.R. Mauney and O.F. French. 1992. Yield enhancement in cotton by frequent irrigations during fruiting. Agron. J. 84(4):551-557.

Ritchie, G. L., C. W. Bednarz, P. H. Jost, and S. M. Brown. 2007. Cotton growth and development. Bulletin 1252. Cooperative Extension Service and the University of Georgia College of Agricultural and Environmental Sciences, Athens, GA, USA.

Silvertooth, J. 1990. High Temperature Effects on Cotton. Cotton Physiology Today. 1(10):4.

Smith, E.F. and C.O. Townsend. 1907. A plant tumor of bacterial origin. Science 25:671–67.

Smith, G.L. 1942. California cotton insects. Univ. of California Bull. 660.

Stewart, J.M. 1995. Potential for crop improvement with exotic germplasm and genetic engineering. Challenging the future: Proceedings of the World Cotton Research, CSIRO, p.313-327.

Thiessen, L., and G. Collins. 2017. Disease Management in Cotton. 2017 Cotton Information. NCSU University Extension Pub. http://www.cotton.ncsu.edu/. Raleigh, NC: North Carolina State University Press. pp 57–65

USDA-APHIS. 2009. Pink Bollworm Eradication. https://www.aphis.usda.gov/plant_health/plant_pest_info/cotton_pests/ (Accessed March, 2017).

USDA-APHIS. 2014. Boll Weevil Eradication. https://www.aphis.usda.gov/plant_health/plant_pest_info/cotton_pests/ (Accessed March, 2017).

USDA-ERS. 2016. Cotton and Wool Outlook. https://www.ers.usda.gov/publications/pub-details/?pubid=81591 (Accessed March, 2017).

USDA-ERS. 2016. Oil Crops Yearbook. https://www.ers.usda.gov/data-products/oil-crops-yearbook/ (Accessed March, 2017).

USDA. 2017. World Agricultural Production Report. WAP 04-17. https://apps.fas.usda.gov/psdonline/circulars/production.pdf (Accessed March, 2017)

Wendel, J., R. Rowley, and J. Stewart. 1994. Genetic diversity in and phylogenetic relationships of the Brazilian endemic cotton, Gossypium mustelinum (Malvaceae). Plant Syst. Evol. 192:49–59.

Wendel, J.F., and R.C. Cronn. 2003. Polyploidy and the evolutionary history of cotton. Advances in agronomy 78: 139-186.

Wendel, J.F., and C.E. Grover. 2015. Taxonomy and evolution of the cotton genus. In: D. Fang and R. Percy, editors, Cotton, Agronomy Monograph 57. ASA, CSSA, and SSSA, Madison, WI. doi:10.2134/agronmonogr57.2013.0020

Wendel, J.F., C. Brubaker, I. Alvarez, R. Cronn, and J.M. Stewart. 2009. Evolution and natural history of the cotton genus. In: A.H. Paterson, editor, Genetics and genomics of cotton. Springer, New York. p. 3–22.

Zhang, L., G.W. Zhang, Y.H. Wang, Z.G. Zhou, Y.L. Meng, and B.L. Chen. 2013. Effect of soil salinity on physiological characteristics of functional leaves of cotton plants. Journal of Plant Research 126: 293–304. doi: 10.1007/s10265-012-0533-3. pmid:23114969.

CHAPTER 2: Screening Germplasm and Quantification of Components Contributing to Thrips Tolerance in Cotton

Abstract

Three hundred and ninety one Gossypium hirsutum and 34 G. barbadense accessions were screened for thrips tolerance under field conditions at the Upper Coastal Plain Research Station in Rocky Mount, North Carolina in 2014 and 2015. Visual damage ratings, thrips counts, and seedling dry weights were recorded at 2.5, 3.5 and 4.5 weeks after planting. Population density and thrips arrival times varied between years. Data combined over the three damage scoring dates provided a better estimate of tolerance or susceptibility than ratings from individual dates. Five tolerant G. barbadense accessions and five moderately tolerant upland cotton accessions were identified from field evaluations. Tobacco thrips [Frankliniella fusca (Hinds)], followed by western flower thrips [Frankliniella occidentalis (Pergande)], were the dominant thrips species in the study. Greenhouse experiments were conducted in 2015-16 to determine whether plant height, growth rate, leaf pubescence, and leaf area differed significantly between tolerant and susceptible groups of G. hirsutum and G. barbadense accessions identified from the field screenings. In the absence of thrips, leaf pubescence and relative growth rate were significantly higher in tolerant accessions than in susceptible lines. There was no difference in plant height or leaf area. The results suggest that thrips tolerant plants have a possible competitive advantage in faster growth and higher trichome density, which limits thrips movement.

Introduction

Thrips (order Thysanoptera) are a major pest of cotton (Gossypium hirsutum L.) throughout most of the cotton belt, unlike most other major cotton pests, which have a more limited economic distribution. Cotton is most vulnerable to thrips damage from seedling emergence through the four true leaf stage, when seedlings are tender and slow-growing (Stewart, 2011). Once cotton reaches the five true leaf stage, the greater plant biomass and generally lower levels of migrating adult thrips result in minimal subsequent plant damage. The most common thrips species are tobacco thrips [Frankliniella fusca (Hinds)] (Newsom et al, 1953), flower thrips [Frankliniella tritici (Fitch)] (Watts, 1936), onion thrips [Thrips tabaci (Lindeman)] (Smith, 1942), and western flower thrips [Frankliniella occidentalis (Pergande)] (Bailey, 1957; Mound and Walker, 1982; Akin et al, 2011; Stewart et al, 2013).

Thrips may cause moderate to serious damage to the cotton crop. Several studies have reported negative agronomic impacts, including stunted growth and adverse effects on flowering, boll set, maturity, and yield (Smith, 1942; Gaines, 1965; Watson, 1965; Bourland et al, 1992). Stunted growth leads to delayed maturity, which makes the crop susceptible to other late season pests, such as stink bugs, plant bugs and bollworms. A late crop also translates into both the shorter days associated with a late harvest and an increased probability of frost damage (Morris, 1963; Hawkins et al, 1966; Stewart et al, 2013). According to the Cotton Insects Losses Report (2014), thrips were the third most critical pest in the US cotton belt, infesting 7,808,224 acres and causing an estimated loss of 150,740 bales. In North Carolina alone, yield losses due to thrips were 0.4% (Cotton Insects Losses Report, 2014). Finally, the cost of thrips control is high because a significant portion of this expense is an automatic “up front” expenditure; that is, the thrips insecticide is either on the seed, applied as a granular at-planting insecticide, or both.

As a tropical crop with a perennial growth habit, cotton grows slowly after emergence, leaving seedlings susceptible to thrips feeding and damage over a longer period than annual plants. Thrips adults and immatures rasp the outer cells of developing leaves and meristematic bud tissue. Common symptoms of thrips damage include brown and curled leaves (often termed “possum-eared” leaves), damaged and deformed bud tissue, and small silvery areas on newly expanding leaves. Thrips often cause greater damage under conditions that limit seedling growth, such as cool wet or hot dry conditions (Terry and Barstow, 1988; Faircloth et al, 1998). Cooler days slow cotton seedling growth and allow thrips to invade the seedlings, resulting in serious damage (Race, 1965). Under hot and/or dry conditions, seedling uptake of at-planting insecticides (granular at-planting insecticides and seed treatments) may be limited, lessening thrips mortality and favoring reproduction. Additionally, dry weather causes early drying of other host crops and weeds in neighboring fields and ditch banks, which promotes migration of thrips from these dying hosts to young cotton seedlings, resulting in higher thrips levels (Sites and Chambers, 1990). These factors, either individually or in combination, often result in significant thrips damage to the young, tender cotton seedlings. The heavy reliance on pre-plant and pre-emergence herbicides can also cause seedling damage and slow growth that is exacerbated by thrips feeding (Clarkson et al, 2014; Roberts et al, 2015). Although cotton seedlings typically recover from this herbicide damage, in the presence of economic levels of thrips, this slower growth and seedling damage can result in maturity delays and yield loss (Roberts et al, 2015).

Although still widely used, insecticides for thrips control impose some environmental risk. Additionally, tolerance and/or resistance to the chloronicotinoids (which presently account for essentially all insecticide treated seed), has resulted in widespread insecticide failures in the Mid-south and Southeast (Huseth et al, 2016). Finally, the disruptive effect of foliar insecticides used for thrips may increase the levels of cotton aphids and spider mites via the destruction of beneficial arthropods (Cole et al, 1999; Keilor and Godfrey, 2000).

Existing genotypic variation for thrips tolerance may offer an alternative to insecticidal control of thrips. New cultivars displaying both improved agronomic traits and significant thrips tolerance would be a welcome development. Screening to identify thrips tolerant accessions is the first step in this approach. Previous studies have reported differential responses to thrips attack, suggesting that natural tolerance is present in some accessions (Ballard, 1951). Several reports are available in which cultivated germplasm was screened to explore the genetic variation for thrips tolerance in cotton (Ballard, 1951; Quisenberry and Rummel, 1979; Stanton et al, 1992; Bowman and McCarthy, 1997). There are also cases of tolerance to thrips feeding, with a few cultivars demonstrating a moderate level of tolerance (Ballard, 1951; Bourland and Jones, 2005).

Several plant defense mechanisms have been proposed based on studies conducted in other crops. These include development of morphological characteristics that restrict insect attack, e.g. presence of trichomes and epicuticular wax on leaves (Ballard, 1951; Quisenberry and Rummel, 1979; Eigenbrode and Espelie, 1995; Crammer et al, 2014), activation of pathways including the jasmonic acid pathway (Abe et al, 2008; Abe et al, 2009), accelerated growth rate to overcome thrips damage (Lei et al, 2004), and development of tolerance and non-preference (Fery et al, 1991; Frei et al, 2004). However, none of these mechanisms except the correlation with pubescence (Ballard, 1951; Quisenberry and Rummel, 1979) has been studied in cotton. Also, most studies in cotton have focused on assessing the economic damage caused by thrips and their effect on agronomic traits. Very few reports are available in cotton in which the mechanisms behind tolerance have been studied.

In the current study we: (1) screened more than 250 elite US cotton cultivars, 102 introgression lines, and 34 Pima accessions under field conditions for two years in North Carolina to identify new sources of thrips tolerance, (2) identified the prevalent thrips species, and (3) investigated the association of physiological and morphological characteristics, such as plant height, relative growth rate, leaf pubescence, and leaf area, with thrips tolerance in both G. hirsutum and G. barbadense.

Materials and Methods

Plant material: A diverse set of cotton germplasm was planted during summer 2014 and 2015 at the Upper Coastal Plain Research Station in Rocky Mount, NC to screen for thrips tolerance. It included 289 lines from a panel of elite cotton cultivars collected from throughout the US cotton belt. This panel was used by Tyagi et al (2014) to carry out a genetic diversity analysis in upland cotton. Secondly, 139 F6:7 lines were developed by crossing Coastland 320, a G. barbadense parent, with three G. hirsutum lines, Georgia King, Deltapine 20, and MD51ne, at Central Crops Research Station, Clayton, NC in summer 2005. The F1s were backcrossed to the G. hirsutum parents and then all three populations were allowed to intercross in the field during several cycles of recurrent selection. Additionally, 42 photoperiod insensitive land race accessions were obtained from Dr. Todd Campbell (Research Geneticist, USDA-ARS, Florence, SC). Some lines were tested for only one year, either in summer 2014 or 2015. These lines included 90 converted race stocks (tested in 2014), 59 photoperiod sensitive landraces (tested in 2014), 87 elite cotton cultivars (tested in 2015), and 34 G. barbadense lines (tested in 2015). The G. hirsutum photoperiod sensitive land races were collected mainly from Central America between 1946 and 1989. The Pima (G. barbadense) lines screened in this study were part of the Arizona B and Arizona K collections. Of the 34 Pima accessions, thirteen were photoperiod sensitive and did not flower under long day conditions in North Carolina. Photoperiod sensitive lines were seed increased through manual self-pollination in Mexico during winter 2013. Seeds for all these lines were procured from the USDA Cotton Germplasm Collection in College Station, TX and selfed in the field to maintain homozygosity. Only those G. hirsutum accessions that were screened for two years are discussed in the results. The elite G. hirsutum cultivars and G. barbadense lines used in this study are listed in Appendix A.

Field planting/locations: All entries were randomly planted in 20-foot single-row plots with one plot per entry under non-irrigated conditions. Seeds for these lines were treated with fungicide only. No insecticides were used at planting. All other standard agronomic practices were followed (2014 Cotton Information). Lines were planted at Central Crops Research Station, Clayton, NC, the Fountain Farm at the Upper Coastal Plain Research Station in Rocky Mount, NC, or both locations.

Three hundred and six accessions were planted at Central Crops Research Station, Clayton, NC in 2014. These included 139 F6:7 lines, 38 elite cotton cultivars, 40 lines from Dr. Todd Campbell’s selection, and 89 converted race stocks. These lines were planted alongside wheat rows to ensure high thrips pressure for screening. Four wheat rows were planted after every eight rows of cotton.

Three hundred and ninety one upland cotton lines were screened in the field at the Fountain Farm, Upper Coastal Plain Research Station, Rocky Mount, NC for two consecutive years. Of the 391 lines, 289 were elite cotton cultivars and the remaining 102 were F6:7 lines of the Upland x Coastland 320 cross. Some lines were tested only once, either in 2014 or 2015. These included 90 converted race stocks (2014), 59 photoperiod sensitive landraces (2014), 87 elite cotton cultivars (2015), and 34 G. barbadense lines (2015).

Evaluation of thrips tolerance: The first parameter used to evaluate the damage caused by thrips was visual scoring of the plants in the field based on leaf damage. An average score was assigned to each plot depending on the overall leaf damage of all the plants in that plot. Scoring was performed on three dates. The first scoring was conducted at the 2-3 true leaf stage (2.5 weeks after planting), followed by the 3-4 true leaf stage (3.5 weeks after planting), and finally the 5-6 true leaf stage (4.5 weeks after planting). For a given scoring date, all plots were scored on the same day by one person. Scoring was based on a scale of 0 through 5, where a score of 0 indicates completely healthy plants and a score of 5 indicates plant death due to severe thrips damage (Faircloth et al, 2001) (Figure 1):

0 No thrips damage and completely healthy looking plants

1 Little leaf damage, with small brown spots on leaves

2 Moderate damage with some tearing and chewing of true leaves

3 Severe damage with malformed and wrinkled true leaves, injury to apical meristem

4 Very severe damage with no true leaves, only small out growths are visible

5 Dead plants
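
As an illustration of how the scale above could be applied, the following Python sketch encodes the six damage classes, averages one plot's scores over the three scoring dates, and assigns a tolerance class. The 0-1.5 (tolerant) and 1.5-3 (moderately tolerant) bands are those reported for this study; treating a mean of exactly 1.5 as tolerant, labeling means above 3 as susceptible, the example scores, and all function and variable names are assumptions made for this sketch.

```python
from statistics import mean

# Visual damage scale as listed in the text (0 = healthy, 5 = dead plants).
DAMAGE_SCALE = {
    0: "No thrips damage, completely healthy looking plants",
    1: "Little leaf damage, small brown spots on leaves",
    2: "Moderate damage, some tearing and chewing of true leaves",
    3: "Severe damage, malformed and wrinkled true leaves, injured apical meristem",
    4: "Very severe damage, no true leaves, only small outgrowths visible",
    5: "Dead plants",
}


def mean_damage_score(scores):
    """Average one plot's damage scores over the scoring dates."""
    if not scores:
        raise ValueError("at least one score is required")
    for score in scores:
        if not 0 <= score <= 5:
            raise ValueError(f"score {score} is outside the 0-5 scale")
    return mean(scores)


def tolerance_class(mean_score):
    """Map a mean damage score to a class.

    Boundary handling (<= 1.5, <= 3) and the 'susceptible' label
    are assumptions of this sketch, not taken from the study.
    """
    if mean_score <= 1.5:
        return "tolerant"
    if mean_score <= 3:
        return "moderately tolerant"
    return "susceptible"


if __name__ == "__main__":
    # Hypothetical plot scored at 2.5, 3.5 and 4.5 weeks after planting.
    plot_scores = [1, 2, 2]
    avg = mean_damage_score(plot_scores)
    print(f"mean score {avg:.2f}: {tolerance_class(avg)}")
```

Because each plot receives an average score, the function accepts fractional values as well as the integer classes.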

Second, the number of thrips was counted on cotton seedlings collected from the field and brought into the laboratory. Because this method is highly laborious, it was not performed for all lines. Twenty seven G. hirsutum lines, selected through stratified sampling to represent all visual scoring classes, were sampled in both years. Also, six G. barbadense lines were sampled for thrips in 2015. Thrips were collected using the soap wash method (Reisig and Godfrey, 2006). This method involved 1) cutting off four randomly selected seedlings per plot at ground level, 2) placing the seedlings into 1 qt. Mason jars filled with soapy water, 3) washing the thrips from the plants in the lab using a 270-mesh screen with 0.053 mm openings, 4) rinsing the thrips into a small glass vial of 70% ethanol, 5) transferring the thrips to a gridded plastic Petri dish, and 6) separately counting the adult and immature thrips under a Bausch and Lomb 7 to 30x dissecting stereo-zoom microscope (Reisig and Godfrey, 2006).

The third method used to estimate thrips damage was to measure the dry weight of the seedlings used to count thrips. Seedlings were dried at room temperature for a week and then dry weight (in grams) was recorded for the biomass of each 4-plant sample.

In the current study, a score based on leaf tissue damage was recorded for all accessions, whereas thrips count and dry weight were recorded for twenty six randomly selected G. hirsutum and seven G. barbadense accessions representing all damage scores.

Statistical analysis for field study: Damage score, thrips count and seedling dry weight for the whole sampling period were obtained by taking the mean of the values recorded over the three scoring dates. Mean values for each scoring date were also calculated for each year and over both years. Paired t-tests were conducted to test for differences in damage score, thrips count, and dry weight values recorded over the three scoring dates in the two years. Also, SAS version 9.4 software was used to calculate Pearson correlation coefficients among the three parameters (damage score, thrips count, and dry weight) used to evaluate thrips damage in the field. Pearson’s correlation coefficients were estimated for individual scoring dates as well as for mean values.
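
The calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used SAS 9.4, whereas the sketch uses the Python standard library, and the function names and the small data set are hypothetical rather than values from the field trials.

```python
from math import sqrt
from statistics import mean, stdev


def season_mean(values_by_date):
    """Mean of the values recorded on the scoring dates for one accession."""
    return mean(values_by_date)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # Hypothetical damage scores for four accessions on three scoring dates.
    scores_2014 = [[1, 2, 2], [3, 3, 4], [2, 2, 3], [0, 1, 1]]
    scores_2015 = [[1, 1, 2], [2, 3, 3], [2, 3, 3], [1, 1, 1]]
    mean_2014 = [season_mean(s) for s in scores_2014]
    mean_2015 = [season_mean(s) for s in scores_2015]
    print("r between years:", round(pearson_r(mean_2014, mean_2015), 3))
    print("paired t, df:", paired_t(mean_2014, mean_2015))
```

The t statistic would still need to be compared against a t distribution with the returned degrees of freedom to obtain a p-value, a step the statistical software performs automatically.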

Identification of thrips species: Adult thrips from four samples collected at the 2-3 true leaf stage (first scoring date) were pipetted out and used for identification of thrips species. Permanent slides were prepared using CMC-10 mounting media (Masters Company, Inc., Wooddale, IL). Thrips species were identified using keys modified from Reed et al. (2006).

Greenhouse experiments: G. barbadense and G. hirsutum accessions were screened for thrips tolerance under field conditions in North Carolina during the summers of 2014 and 2015. Based on their thrips response under field conditions, five Pima lines (3 tolerant and 2 susceptible) and five upland cotton lines (3 tolerant and 2 susceptible) were selected for additional evaluation under greenhouse conditions. The lines included in this experiment are listed in Table 1. These accessions were evaluated for plant height, relative growth rate, leaf area, and leaf pubescence. One seed was planted per round plastic cone (10 cm diameter by 12 cm depth) containing commercial potting mix (Fafard 4P potting mix, Conrad Fafard Inc., Agawam, MA, USA), making sure that all cones contained an equal weight of potting mix. Plants were fertilized with 25 mL pot⁻¹ of a 4.6 g L⁻¹ fertilizer solution (Scotts Starter Fertilizer, The Scotts Company LLC, Marysville, OH, USA) at 10 and 20 DAP (days after planting) to ensure optimum plant growth. Each cone was watered daily to bring the soil to optimum moisture. The greenhouse temperature was maintained at 35 ± 5 °C and natural lighting was supplemented for 14 hours daily with metal halide lamps (Hubbell Lighting, Inc., Greenville, SC, USA). The experimental design was a randomized complete block with four replications, and the experiment was conducted three times.

Plant height was recorded at 10, 15, 20, and 25 DAP. These intervals were selected because cotton seedlings are most susceptible to thrips up to the 4-5 true leaf stage, or 3-4 weeks after planting. SigmaPlot version 12.5 (Systat Software, San Jose, CA) was used to plot plant height for individual genotypes pooled over the three runs. Above-ground wet and dry weights were recorded at 15 and 25 DAP. Relative growth rate (RGR) was calculated using the following formula:


RGR = (ln DW2 − ln DW1) / 10 days, where DW2 and DW1 are the dry weights at 25 and 15 DAP, respectively.
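The formula can be sketched in Python as follows. The function name `relative_growth_rate` and the example dry weights (0.5 g and 2.0 g) are hypothetical; the default sampling days follow the 15 and 25 DAP harvests described above.

```python
import math


def relative_growth_rate(dw1: float, dw2: float,
                         t1: float = 15.0, t2: float = 25.0) -> float:
    """RGR = (ln DW2 - ln DW1) / (t2 - t1), in g g^-1 day^-1.

    dw1 and dw2 are dry weights (g) at t1 and t2 days after planting.
    """
    if dw1 <= 0 or dw2 <= 0:
        raise ValueError("dry weights must be positive")
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (math.log(dw2) - math.log(dw1)) / (t2 - t1)


# Hypothetical seedling: 0.5 g at 15 DAP and 2.0 g at 25 DAP
example_rgr = relative_growth_rate(0.5, 2.0)
print(round(example_rgr, 4))  # 0.1386
```

Because the calculation uses the difference of logarithms, RGR depends only on the ratio DW2/DW1, so a fourfold increase in dry weight over 10 days gives the same value regardless of seedling size.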

At 25 DAP, a leaf disc (4 mm²) was collected from the first fully expanded leaf from the top of each accession, and leaf trichomes were counted under a microscope. Leaf area was measured during the third run of the experiment. Total leaf area for each accession was measured at 25 DAP using an LI-3100 area meter (LI-COR Biosciences, P.O. Box 4425, Lincoln, NE, USA). Data were subjected to analysis of variance (Statistical Analysis Systems, version 9.4, SAS Institute Inc., SAS Campus Drive, Cary, NC, USA) appropriate for cotton accessions grouped into tolerant and susceptible types based on thrips response during field evaluations (data not included here). In the model, response (based on thrips damage score in the field) was treated as a fixed effect and run as a random effect. Where there was no interaction between response and run, the data were pooled over the three runs of the experiment. Plant height was recorded four times for each plant during the experiment, and these measurements were treated as repeated measures in the analysis. Genotypes were also grouped by species, and a second analysis of variance was carried out. The relationship between leaf pubescence and the size of the leaf sampled for trichome counts was also examined. T-tests were conducted to differentiate between least-squares means (LS-means) for thrips response (tolerant and susceptible types) and species (Pima and upland cotton). Mean trichome counts for the three leaf size groups (small, medium, and large) were also obtained and analyzed using the MEANS procedure (PROC MEANS) in SAS.


Results and Discussion

Thrips screening at Rocky Mount: Thrips damage on accessions planted at this location during the summers of 2014 and 2015 was analyzed in groups. The first group included 289 elite US cotton cultivars. The mean damage score for these accessions was 2.40 in 2014, which was significantly different from the mean score of 3.06 in 2015 (P < 0.0001). Lines displaying a range of thrips tolerance (scores of 1 to 4) were observed in both years. Group 2 comprised 102 lines developed by crossing Coastland 320 with susceptible upland cotton lines, followed by several cycles of recurrent selection. In 2014, these lines had a minimum score of 1, a maximum score of 4, and an average thrips damage score of 2.4. Similarly, in 2015 damage scores ranged from 2 to 4 with an average of 3.0. Analysis of variance suggested that these lines were not significantly different in their thrips damage scores in the field. In 2015, 34 G. barbadense lines were also screened for thrips tolerance. Damage scores ranged from 0 to 4 with an average of 1.8. Most of the lines had a damage score of 2, very few lines had a score of 4, and no line was completely dead (score 5).

Comparison between G. hirsutum and G. barbadense accessions: In 2015, lines from the two major cultivated tetraploid cotton species, G. hirsutum (33 accessions) and G. barbadense (7 accessions), were screened for thrips tolerance under field conditions. Overall, Pima lines performed better than upland cotton accessions (Table 2). Both species followed a similar pattern over the three scoring dates. However, the mean damage score was higher for G. hirsutum lines (3.05) than for Pima cotton accessions (1.48) (P = 0.0009) (Table 2). Damage score declined by the second scoring date, and by the 4-5 true leaf stage plants had started to overcome the damage. The minimum and maximum thrips damage scores were 1 and 4 for G. hirsutum and 0 and 3 for G. barbadense. A similar observation was reported by Zhang et al. (2013), who compared thrips tolerance in five tetraploid cotton species; the average rating for G. barbadense accessions (1.6) was lower than that for G. hirsutum accessions (2.3). The total number of thrips counted on seedling samples decreased as plants grew larger, and this pattern was consistent across species. Thrips numbers were higher in Pima samples than in upland cotton accessions; however, by the third scoring date, upland cotton had the higher number of thrips. The number of adult thrips increased as the season advanced in both species, and there was no significant difference between the two species (Table 2). The number of immature thrips declined as the season advanced. Initially both species had almost the same number of thrips, but Pima lines had a higher number at the second scoring date. Finally, on the third scoring date the number of thrips was higher in upland cotton lines than in Pima lines (Table 2). As expected, plant dry weight increased over time. The increase was equal for both species for the first two collection dates (Table 2). However, during the recovery phase (4-5 true leaf stage), G. barbadense lines had more dry weight than upland cotton lines (Table 2).

Performance of lines over two years (2014 and 2015): Damage score, thrips count, and dry weight values for 26 accessions were recorded for two years to confirm their thrips response. The damage score for the sampling period (mean of damage scores over the three scoring dates) was 2.62 in 2014 and 3.01 in 2015 (Table 3). There was no significant difference between years in damage score for the sampling period; however, the mean score (1.7) for the first date in 2014 was statistically different from the mean score (3.38) in 2015 (P < 0.0001). For the next two scoring dates, mean scores were similar in the two years. Total thrips count for the sampling period was not statistically different between 2014 and 2015 (167.55 and 182.08, respectively) (Table 3). However, when mean total thrips counts for each scoring date were compared, some differences were observed. Mean counts for Date 1 and Date 3 differed between the two years. The count for Date 1 was higher in 2015 than in 2014, in agreement with the higher damage scores observed in 2015 (Table 3). These results suggest that thrips arrived earlier in 2015: seedlings had significantly more damage and higher thrips counts by the first scoring date. Mean dry weight readings for all scoring dates were statistically different between 2014 and 2015. A similar trend was recorded by Cook et al. (2013), who observed differences in thrips densities over years. This suggests that environmental conditions strongly influence the response of cotton accessions to thrips attack.

Based on these observations, the second scoring date (3.5 weeks) was considered the best time for screening. On the other dates, plants were either too young (2.5 weeks) or too old (4.5 weeks) for effective scoring. However, thrips pressure can vary over years, as discussed above, and where thrips attack early or late in the season, a better strategy would be to score on multiple dates and use the mean value for the whole sampling season.

Also, correlation coefficients between damage scores and thrips counts recorded on successive scoring dates were calculated to better understand the trend over scoring time. As expected, damage scores, thrips counts, and seedling biomass for the three scoring dates were strongly correlated (Table 4).

Evaluation of different scoring techniques: Several methods for assessing thrips damage have been reported in the literature. They include scoring based on leaf damage (Ballard, 1951; Stanton et al., 1992; Faircloth et al., 2001), counting the number of thrips using either a thrips box or the seedling washing method (Leigh et al., 1984), sweeping insects with nets (Newsom et al., 1953), sticky traps (Moffit, 1964), and using chemicals to collect thrips (Race, 1965). Dry weight is an indirect measure of thrips damage, because severe tissue damage by thrips results in lower seedling biomass. In this study, we used three screening methods to evaluate thrips tolerance or susceptibility in G. hirsutum and G. barbadense accessions. Visual damage assessment on a scale is a commonly used method for scoring disease and pest tolerance. In cotton, a 0-5 scale (Faircloth et al., 2001) and a 0-7 scale (Stanton et al., 1992; Zhang et al., 2013) have been reported for evaluating thrips tolerance. The washing (or counting) method (Burris et al., 1990) is laborious and involves washing seedlings and counting the thrips. It can be used on seedling samples to quantify thrips damage and to confirm that the damage is due to thrips and not to other pests. However, counting thrips on all the samples in large field trials is very time consuming. Also, because thrips populations can be highly variable across a field, screening based solely on counts would not be useful.

Correlation coefficients were calculated to understand the relationship between these scoring parameters. In 2014, no significant correlation was observed among damage score, total thrips count, and dry weight. However, in 2015, a strong positive correlation (r = 0.57, P = 0.002) was observed between mean damage score and mean total thrips count. Also, a negative correlation (r = -0.66, P = 0.0002) was recorded between damage score and seedling dry weight. This indicates that higher leaf damage could be an indicator of higher total thrips count on the seedlings. However, this association did not hold in 2014, suggesting that factors other than thrips infestation could lead to seedling leaf damage. The negative correlation between damage score and dry weight was expected because higher leaf damage translates to lower dry matter in the seedlings. Also, in our study the presence of thrips (thrips count) on seedlings was not directly related to the damage caused on leaves. In some G. barbadense lines (NCM-14-04, NCM-14-76, and Coastland 320) there was no damage even though sufficient thrips were present on the seedlings. This suggests that these accessions might have a tolerance mechanism that helps them withstand damage or makes them less preferred by thrips.
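The reported significance levels can be checked roughly from r and the sample size. The sketch below uses the standard t-statistic for a Pearson correlation, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. It assumes the correlations were computed across the 26 accessions sampled in both years; n = 26 is an assumption, since the sample size for these tests is not stated here, and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t-statistic for testing H0: rho = 0 from Pearson r and sample size n."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


N_ACCESSIONS = 26  # assumption: accessions scored in both 2014 and 2015

# 2015 correlations reported in the text
t_score_vs_count = correlation_t_statistic(0.57, N_ACCESSIONS)
t_score_vs_weight = correlation_t_statistic(-0.66, N_ACCESSIONS)

print(f"damage score vs thrips count: t = {t_score_vs_count:.2f}")
print(f"damage score vs dry weight:   t = {t_score_vs_weight:.2f}")
```

Under this assumption the statistics are about 3.40 and -4.30 on 24 degrees of freedom. Both exceed in magnitude the two-sided 1% critical value of about 2.80, which agrees with the P-values reported above.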

Thrips species identification: Samples were mounted to screen for four thrips species that commonly infest cotton seedlings in the southeastern US: tobacco thrips, western flower thrips, onion thrips (Thrips tabaci Lindeman), and flower thrips [Frankliniella tritici (Fitch)] (Gaines, 1934; Albeldano et al., 2008). Most of the samples were infested by a single species, tobacco thrips [Frankliniella fusca (Hinds)]. Similar results were reported by Stewart et al. (2013), who studied the distribution of thrips species across the southern US Cotton Belt. Western flower thrips (Frankliniella occidentalis) followed tobacco thrips in count and was the second most dominant thrips species in the current study. Samples collected in summer 2015 contained only tobacco thrips.

Identification of tolerant accessions and introgression of thrips tolerance into G. hirsutum: Two years of field screening of 289 upland cotton accessions, 102 lines developed by recurrent selection from crosses between G. barbadense and G. hirsutum, and 34 Pima lines at the Upper Coastal Plains Research Station at Rocky Mount helped to identify ten tolerant and moderately tolerant accessions of G. hirsutum and G. barbadense (Figure 2). These accessions, classified by damage score, are listed in Table 5. Lines with a mean damage score of 0-1.5 were considered highly tolerant, and accessions with a damage score of 1.5-3 were considered moderately tolerant. As mentioned earlier, G. barbadense accessions showed a higher level of tolerance than G. hirsutum lines.
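The classification rule can be written as a small Python function. This is only a sketch: the text does not say which class a score of exactly 1.5 belongs to, so assigning it to the highly tolerant class is an assumption, as are the "susceptible" label for scores above 3 and the function name. The example scores are taken from Table 5.

```python
def classify_tolerance(mean_damage_score: float) -> str:
    """Assign a tolerance class from a mean damage score on the 0-5 scale."""
    if not 0.0 <= mean_damage_score <= 5.0:
        raise ValueError("damage score must lie between 0 and 5")
    if mean_damage_score <= 1.5:  # assumption: 1.5 itself counts as highly tolerant
        return "highly tolerant"
    if mean_damage_score <= 3.0:
        return "moderately tolerant"
    return "susceptible"  # assumption: label for scores above 3


# Mean damage scores as listed in Table 5
table5_scores = {
    "NCM-14-76": 0.3,
    "Coastland 320": 1.0,
    "DIV-27": 1.8,
    "DIV-147": 2.3,
}

for accession, score in table5_scores.items():
    print(f"{accession}: {classify_tolerance(score)}")
```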

These accessions were used to make crosses to transfer the segments carrying thrips tolerance genes to cultivated germplasm by backcrossing. Interspecific crosses were made between previously identified susceptible upland cotton lines (TM1, Acala Maxxa, FM966, and Georgia King) and the thrips-tolerant G. barbadense accession Coastland 320. In 2014, F1 hybrids were backcrossed to their respective susceptible parental lines. For the next two years, backcrosses were made using thrips-tolerant plants to generate BC3F1 plants at the Upper Coastal Plains Research Station in Rocky Mount, NC. BC2F1 plants segregated for highly tolerant and susceptible plants in each of the four upland cotton backgrounds. However, the expected 1:1 segregation ratio of tolerance to susceptibility was not observed. This suggested that thrips tolerance might be controlled by multiple genes in the interspecific introgression from the donor G. barbadense accession Coastland 320. Additionally, three newly identified tolerant G. barbadense accessions (NCM-14-1, NCM-14-40, and NCM-14-76) were crossed with the susceptible upland cotton lines (TM1, Acala Maxxa, FM966, and Georgia King) during summer 2016 at the Upper Coastal Plains Research Station in Rocky Mount, NC. The interspecific crosses generated to introgress thrips tolerance into G. hirsutum accessions are listed in Table 6. Also, three highly tolerant and two moderately susceptible G. barbadense lines (Table 1; lines 1-5) were planted in the greenhouse in winter 2015/2016. Crosses were made between these lines to obtain F1 hybrids. These F1 hybrids can be used to develop segregating mapping populations to study the inheritance of the thrips tolerance trait in G. barbadense. Development of molecular markers and mapping of the associated region will also facilitate and expedite the development of tolerant cultivars.

Earlier studies have suggested quantitative inheritance of thrips tolerance, with non-additive genetic variance prominent in measures of tolerance (Bowman and McCarty, 1997; Zhang et al., 2013). Identifying markers associated with thrips tolerance will help to diagnose tolerance more quickly and efficiently than field screening, where trials over multiple years and locations are required to confirm uniform tolerance under adequate thrips pressure. Moreover, genetic markers are abundant and relatively inexpensive to apply in most breeding programs (Powell et al., 1996).

Growth components contributing to thrips tolerance: The main effect of thrips response was analyzed for plant height, relative growth rate, leaf pubescence, and leaf area for accessions grouped into tolerant and susceptible types. Plant height and leaf area also provide an indirect measure of plant growth. The primary objective of this experiment was to determine the relationship of thrips tolerance with morphological and growth parameters such as plant height, relative growth rate, leaf pubescence, and leaf area. When accessions were separated into two groups according to their tolerance characteristics (tolerant and susceptible), the effect of response group was not significant for height or leaf area (Table 7). However, the effect of response group was significant for relative growth rate (P = 0.0398) and for leaf pubescence in each individual run (P < 0.0001, P = 0.0003, and P < 0.0001 for runs 1, 2, and 3, respectively). Relative growth rate differed between tolerant (0.138) and susceptible (0.121) lines. Leaf pubescence count was also higher for tolerant accessions (Table 8). This suggests that relative growth rate and leaf pubescence could be contributing to thrips tolerance in G. hirsutum and G. barbadense accessions.

This observation is in accordance with earlier reports. Ballard (1951) screened cotton accessions for thrips tolerance, grading them on a scale of 0 to 10 based on visual leaf damage, and suggested that natural variation for thrips tolerance was present among accessions. He reported a positive correlation between young leaf pubescence and thrips resistance in some cases. However, this correlation was based on visual observation and was not quantified statistically. A similar study was reported by Quisenberry and Rummel (1979), who evaluated thrips damage on cotton accessions using reduction in leaf area as the measurement. They proposed that resistance to thrips in cotton is associated with the pubescence/pilose marker and that plant trichomes hinder thrips access to the leaf surface. On the other hand, some studies suggest that thrips damage is more severe on highly pubescent varieties than on smooth-leaf accessions (Wardle and Simpson, 1927; Baloch et al., 1982; Zareh, 1985).

The role of trichomes in plant defense against other insect pests has been studied extensively. Leaf trichomes vary in shape (straight, spiral, hooked, branched, or unbranched) and size, and can be glandular or non-glandular (Hanley et al., 2007). Non-glandular trichomes have been reported to restrict insect pests mechanically, and the degree of control depends on their density, length, and shape (Handley et al., 2005). Levin (1973) proposed that trichome density is negatively related to the feeding, nutrition, and ovipositional behavior of insects. Dense trichomes also affect plant-insect interactions by interfering with the movement of insects and other arthropods on the plant surface and limiting their access to the leaf epidermis (Agrawal et al., 2009). In addition, glandular trichomes are reportedly associated with chemical defense and secrete secondary metabolites, including flavonoids, terpenoids, and alkaloids, that are toxic or unpalatable or that act as insect traps (Hanley et al., 2007). This mechanism has been reported in tomatoes, peppers, and potatoes against pests such as aphids, spider mites, and whitefly, and in beans against potato leafhopper (Eigenbrode and Espelie, 1995; Bonierbale et al., 1994; Simmons and Gurr, 2005).

Thrips infestations occur early in the growing season and can cause heavy damage to leaf tissues. As mentioned earlier, cotton plants are most susceptible up to the 3-4 true leaf stage. After this point, plants overcome the damage and resume production of normal leaves. In our study, relative growth rate was statistically higher in tolerant accessions than in susceptible lines. This suggests that an accelerated growth rate could be a possible mechanism for thrips tolerance in cotton: genotypes exhibiting a tolerant response produce leaves faster than thrips damage them. In previous studies, accelerated growth has been reported to play a role in the recovery process. Lei and Wilson (2004) studied thrips-infested plants and reported that plants overcame the reduction in leaf area through accelerated growth of main-stem leaves. Compared with control plants without thrips damage, recovering plants complete the expansion of smaller, damaged leaves earlier and use the time and resources to expand new leaves from upper nodes.

The leaf pubescence trait was investigated further by examining the relationship between pubescence count and the size of the leaf from which the 4 mm² disc was collected. At the time of sampling, three leaf sizes (small, medium, and large) were considered. Leaf pubescence density differed significantly among the three leaf sizes (Table 9). Larger leaves had lower pubescence counts than smaller leaves in all three runs. This indicates that as leaves expand, trichomes are either lost or their density per unit area decreases. Similar observations were reported in earlier studies, where trichome density was higher in young leaves than in mature cotton leaves (Chu et al., 2001; Grover et al., 2016).

A second analysis of variance was carried out in which accessions were grouped by species, G. barbadense and G. hirsutum, to examine whether plant height, relative growth rate, leaf pubescence, and leaf area differ statistically between upland and Pima accessions. The main effect of species was non-significant for plant height, relative growth rate, and leaf area but was significant for leaf pubescence in each run (Table 10). Mean trichome count for G. barbadense accessions (both tolerant and susceptible) was higher than that for G. hirsutum accessions (Table 11). This may be because the G. hirsutum accessions included in the study were commercial cultivars, which might have been selected against the ‘hairiness’ trait. Trichomes are reported to increase leaf trash in ginned cotton, which lowers fiber quality, and there is a general preference for smooth-leaf cultivars (Meredith et al., 1996; Wanjura et al., 1976).

Additional studies would help to better understand the role of growth rate and leaf trichomes in thrips tolerance. The number of accessions in the tolerant and susceptible groups was limited to five in this study. A more comprehensive experiment with a larger number of accessions would be more informative in defining the relationships among relative growth rate, leaf pubescence, and thrips tolerance in cotton. Few published studies address this aspect of thrips tolerance. Understanding the factors contributing to thrips tolerance would facilitate transfer of this trait into commercial cultivars. Leaf pubescence and other morphological characteristics identified to be associated with tolerant response have been sufficient to provide acceptable protection from thrips infestations. There is a need to consider other possible approaches, such as identifying R-genes/QTL involved in defense mechanisms and investigating the effects of secondary products such as tannins, lignin, and gossypol on thrips. A combination of morphological traits, secondary chemicals, and resistance genes may provide diverse and enduring tolerance to thrips.


Figures

Figure 1. Visual damage scoring scale used to evaluate the thrips response of G. hirsutum and G. barbadense accessions at Upper Coastal Plains Research Station, Rocky Mount, NC


Figure 2. (a) Thrips tolerant, moderately tolerant, and (b) susceptible G. hirsutum and G. barbadense accessions identified in field screening at Upper Coastal Plains Research Station, Rocky Mount, NC


Tables

Table 1. List of G. hirsutum and G. barbadense accessions evaluated for plant height, relative growth rate, leaf pubescence, and leaf area in the greenhouse

S. no  Species        Accession  Response
1      Pima cotton    NCM-14-1   Moderately tolerant
2      Pima cotton    NCM-14-40  Moderately tolerant
3      Pima cotton    NCM-14-76  Moderately tolerant
4      Pima cotton    NCM-14-31  Susceptible
5      Pima cotton    NCM-14-75  Susceptible
6      Upland cotton  DIV-27     Moderately tolerant
7      Upland cotton  DIV-147    Moderately tolerant
8      Upland cotton  DIV-378    Moderately tolerant
9      Upland cotton  DIV-90     Susceptible
10     Upland cotton  DIV-142    Susceptible


Table 2. Comparison of damage score, thrips count, and dry weight between G. hirsutum and G. barbadense accessions

Parameter                   G. hirsutum  G. barbadense
No. of Lines                33           7
Mean Damage Score           3.05         1.48
Min Damage Score            1.33         0.33
Max Damage Score            4.00         2.67
Mean Adult Thrips Count     23.05        25.57
Min Adult Thrips Count      7.33         17.00
Max Adult Thrips Count      51.33        48.67
Mean Immature Thrips Count  168.24       191.81
Min Immature Thrips Count   44.00        101.00
Max Immature Thrips Count   320.00       277.33
Mean Total Thrips Count     191.29       217.38
Min Total Thrips Count      51.33        118.00
Max Total Thrips Count      357.00       326.00
Mean Dry Weight (g)         1.35         1.84
Min Dry Weight (g)          0.89         1.30
Max Dry Weight (g)          2.32         2.37


Table 3. Comparison of damage score, thrips count, and dry weight readings between summer 2014 and summer 2015

                            2014 (26 lines)                 2015 (26 lines)
Parameter                   Date 1  Date 2  Date 3  Mean    Date 1  Date 2  Date 3  Mean
Mean Damage Score           1.70    3.23    2.85    2.62    2.50    3.08    3.38    3.01
Min Damage Score            1.00    2.00    1.00    1.50    1.00    1.00    2.00    1.33
Max Damage Score            3.00    4.00    4.00    4.00    4.00    4.00    4.00    4.00
Mean Adult Thrips Count     5.27    31.62   67.12   34.67   12.27   15.42   42.81   23.50
Min Adult Thrips Count      1.00    12.00   21.00   13.00   2.00    3.00    3.00    7.33
Max Adult Thrips Count      15.00   78.00   169.0   73.00   45.00   39.00   107.0   51.33
Mean Immature Thrips Count  106.9   156.7   135.0   132.9   232.5   154.8   88.46   158.6
Min Immature Thrips Count   33.00   38.00   39.00   38.33   90.00   26.00   9.00    44.00
Max Immature Thrips Count   178.0   314.0   317.0   214.6   501.0   388.0   235.0   320.0
Mean Total Thrips Count     112.15  188.35  202.15  167.55  244.77  170.19  131.27  182.08
Min Total Thrips Count      35.00   57.00   63.00   56.00   98.00   36.00   12.00   51.33
Max Total Thrips Count      184.00  332.00  454.00  287.67  524.00  406.00  342.00  357.00
Mean Dry Weight (g)         0.95    1.36    1.62    1.31    0.69    1.15    2.21    1.35
Min Dry Weight (g)          0.44    0.64    0.00    0.64    0.46    0.77    1.13    0.89
Max Dry Weight (g)          1.42    2.30    3.49    2.08    1.13    1.87    4.26    2.32


Table 4. Pearson correlation coefficients between damage score, thrips count, and dry weight recorded on three scoring dates in 2014 and 2015

                2014                  2015
(a) Damage score
                Score_1     Score_2   Score_1     Score_2
Score_2         0.60*       -         0.53*       -
Score_3         0.31        0.37      0.52*       0.44
(b) Total thrips count
                Count_1     Count_2   Count_1     Count_2
Count_2         0.59*       -         0.62**      -
Count_3         0.00        0.03      0.35        0.34
(c) Dry weight
                Weight_1    Weight_2  Weight_1    Weight_2
Weight_2        0.67**      -         0.35        -
Weight_3        -0.11       -0.10     0.18        0.76**
(d) Mean values of damage score, thrips count, and dry weight
                Mean_Score  Mean_count  Mean_Score  Mean_count
Mean_count      0.33074     -           0.57*       -
Mean_Drywt      -0.02289    -0.00211    -0.66**     -0.49502

* P < 0.01; ** P < 0.001


Table 5. List of thrips tolerant and moderately tolerant G. hirsutum and G. barbadense accessions based on field performance in 2014 and 2015

S. no  Species        Accession      Tolerance  Score
1      G. barbadense  NCM-14-1       Higher     1.3
2      G. barbadense  NCM-14-04      Higher     1.0
3      G. barbadense  NCM-14-40      Higher     1.0
4      G. barbadense  NCM-14-76      Higher     0.3
5      G. barbadense  Coastland 320  Higher     1.0
6      G. hirsutum    DIV-27         Moderate   1.8
7      G. hirsutum    DIV-29         Moderate   2.5
8      G. hirsutum    DIV-32         Moderate   3.1
9      G. hirsutum    DIV-147        Moderate   2.3
10     G. hirsutum    DIV-378        Moderate   3.0


Table 6. List of thrips tolerant G. barbadense accessions and thrips susceptible Upland cotton accessions used to introgress thrips tolerance into G. hirsutum

S. no  Tolerant parent  Susceptible parents
1      Coastland 320    TM1, Acala Maxxa, Georgia King, FM966
2      NCM-14-1         TM1, Acala Maxxa, Georgia King, FM966
3      NCM-14-40        TM1, Acala Maxxa, Georgia King, FM966
4      NCM-14-76        TM1, Acala Maxxa, Georgia King, FM966


Table 7. P > F values for plant height, relative growth rate, leaf pubescence, and leaf area of 10 Pima and upland cotton accessions grouped based on thrips response under field conditions

Effect          Height^a  Relative growth rate^a  Pubescence (Run 1)  Pubescence (Run 2)  Pubescence (Run 3)  Leaf area
Response^b      0.1125    0.0398                  <0.0001             0.0003              <0.0001             0.5203
Day             <0.0001   -                       -                   -                   -                   -
Response x Day  0.3654    -                       -                   -                   -                   -

a Data are pooled over experiments.
b Consists of a group of 6 thrips tolerant and a group of 4 susceptible cotton accessions.


Table 8. Least-squares means for thrips tolerant and susceptible groups of G. hirsutum and G. barbadense for plant height, relative growth rate, pubescence, and leaf area recorded on 10, 15, 20, and 25 DAP

Response     Height^a  Relative growth rate^a  Pubescence (Run 1)  Pubescence (Run 2)  Pubescence (Run 3)  Leaf area
Tolerant     16.98 a   0.138 a                 4.60 a              5.89 a              5.38 a              336.39 a
Susceptible  15.20 a   0.121 b                 3.07 b              4.66 b              2.79 b              324.70 a

a Data are pooled over experiments.
b Means within a parameter followed by the same letter are not significantly different according to Fisher’s LSD test at alpha ≤ 0.05.


Table 9. Mean trichome counts for leaf size groups for pubescence and leaf area recorded on 25 DAP for individual runs

Leaf size  Pubescence (Run 1)  Pubescence (Run 2)  Pubescence (Run 3)
Small      87.35               299.0               162.92
Medium     80.46               323.0               110.33
Large      33.55               57.75               19.62


Table 10. P > F values for plant height, relative growth rate, leaf pubescence, and leaf area of 10 Pima and upland cotton accessions grouped based on species

Effect         Height^a  Relative growth rate^a  Pubescence (Run 1)  Pubescence (Run 2)  Pubescence (Run 3)  Leaf area
Species^b      0.8445    0.6434                  0.0004              <0.0001             <0.0001             0.9269
Day            <0.0001   -                       -                   -                   -                   -
Species x Day  0.0294    -                       -                   -                   -                   -

a Data are pooled over experiments.
b Consists of a group of 5 G. barbadense and a group of 5 G. hirsutum cotton accessions.


Table 11. Least-squares means for G. barbadense and G. hirsutum groups for plant height, relative growth rate, pubescence, and leaf area recorded on 10, 15, 20, and 25 DAP

Species        Height^a  Relative growth rate^a  Pubescence (Run 1)  Pubescence (Run 2)  Pubescence (Run 3)  Leaf area
G. barbadense  16.25 a   0.134 a                 4.77 a              6.04 a              5.50 a              332.5 a
G. hirsutum    15.32 a   0.130 a                 3.52 b              4.55 b              3.38 b              330.9 a

a Data are pooled over experiments.
b Means within a parameter followed by the same letter are not significantly different according to Fisher’s LSD test at alpha ≤ 0.05.


References

Abe, H., J. Ohnishi, M. Narusaka, S. Seo, Y. Narusaka, S. Tsuda, and M. Kobayashi. 2008. Function of jasmonate in response and tolerance of Arabidopsis to thrip feeding. Plant Cell Physiol. 49(1): 68-80.

Abe, H., T. Shimoda, J. Ohnishi, S. Kugimiya, M. Narusaka, S. Seo, N. Yoshihiro, S. Tsuda, and M. Kobayashi. 2009. Jasmonate-dependent plant defense restricts thrips performance and preference. BMC Plant Biol. 9(1): 97.

Agrawal, A.A., M. Fishbein, R. Jetter, J.P. Salminen, J.B. Goldstein, A.E. Freitag, and J.P. Sparks. 2009. Phylogenetic ecology of leaf surface traits in the milkweeds (Asclepias spp.): chemistry, ecophysiology, and insect behavior. New Phytol. 183: 848–67. doi: 10.1111/j.1469-8137.2009.02897.x

Akin, D.S., J. Reed, K.C. Allen, J.S. Bacheler, A. Catchot, D. Cook, J. Gore, J. Greene, A. Herbert, D.L. Kerns, B.L. Leonard, G.M Lorenz III, S. Micinski, D. Reisig, P. Roberts, M. Toews, S.D. Stewart, G.E. Studebaker, and K. Tindell. 2011. Regional Survey 2009-2010: Thrips Species Composition Across the Upland Cotton Belt. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. p 838-846.

Albeldano, W.A., J.E. Slosser and M.N. Parajulee. 2008. Identification of thrips species on cotton on the Texas Rolling Plains. Southwest Entomol. 33: 43-51.

Arnold, M.D., J.K. Dever, M.N. Parajulee, S.C. Carroll, and H.D. Flippin. 2012. Simple and effective method for evaluating cotton seedlings for resistance to thrips in a greenhouse, and a thrips species composition on the Texas High Plains. Southwest Entomol. 37: 305-313.

Bailey, S.F. 1957. The thrips of California. Part I: Suborder Terebrantia. Bull. Calif. Insect Sur. 4: 143–220.

Ballard, W.W. 1951. Varietal differences in susceptibility to thrips injury in Upland cotton. Agronomy Journal. 43(1): 37-44.

Baloch, A.A., B.A. Soomro, and G.H. Mallah. 1982. Evaluation of some cotton varieties with known genetic markers for their resistance/tolerance against sucking and bollworm complex. Turkiye Bitki Korume Dergisi 6: 3–14.

Bhatti J.S. 1989. The classification of Thysanoptera into families. Zoology 2(1): 1-23.

Bonierbale, M.W., R.L. Plaisted, O. Pineda, and S.D. Tanksley. 1994. QTL analysis of trichome-mediated insect resistance in potato. Theor. Appl. Genet. 87(8): 973-987.

Bourland, F.M., and D.C. Jones. 2005. Registration of Arkot 9101 and Arkot 9108 germplasm lines of cotton. Crop Sci. 45(5): 2128-2130.

52

Bourland, F.M., D.M. Oosterhuis, and N.P. Tugwell. 1992. Concept for monitoring the growth and development of cotton plants using main-stem node counts. J. Prod. Agric. 5: 532–538. Bowman, D.T., and J.C. McCarthy, Jr. 1997. Thrips (Thysanoptera: Thripidae) tolerance in cotton: sources and heritability. J. Entomol. Sci. 32: 460-471. Burris, E., A.M. Pavloff, B.R. Leonard, J.B. Graves, and G. Church. 1990. Evaluation of two procedures for monitoring populations of early season insect pests (Thysanoptera: Thripidae and Homoptera: Aphididae) in cotton under selected management strategies. J. Econ. Entomol. 83: 1064-1068. Chu, C.C., Freeman, T.P., Buckner, J.S., Henneberry, T.J., Nelson, D.R. and Natwick, E.T., 2001. Susceptibility of upland cotton cultivars to Bemisia tabaci biotype B (Homoptera: Aleyrodidae) in relation to leaf age and trichome density. Ann. Entomol. Soc. Am. 94(5): 743-749. Clarkson, D.L, G.M. Lorenz, N.M. Taillon, A.W. Plummer, B.C. Thrash, L.R. Orellana, and M.E. Everett. 2014. The interaction of pre-emergence herbicides and insecticides seed treatments and its effects on early season cotton. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. p 778-781. Cole, J.F.G., E.D. Pilling, R. Boykin, and J.R. Ruberson. 1997. Effects of KARATE® insecticide on beneficial arthropods in Bollgard® cotton. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. p 1118-1120. Crammer, C.S., N. Singh, N. Kamal, and H.R. Pappu. 2014. Screening Onion Plant Introduction Accessions for Tolerance to Onion Thrips and Iris Yellow Spot. Hort. Science 49(10): 1253-1261. Eigenbrode, S.D., and K.E. Espelie. 1995. Effects of plant epicuticular lipids on insect herbivores. Annu. Rev. Entomo. 40(1): 171-194. Faircloth, J.C., J.R. Bradley, Jr. and J.W. Van Duyn. 1998. The impact of thrips cotton productivity: what a difference a year makes. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. 
2: 976-978. Faircloth, J.C., J.R. Bradley, J.W. Van Duyn, and R.L. Groves. 2001. Reproductive success and damage potential of tobacco thrips and Western flower thrips on cotton seedlings in a greenhouse environment. J. Agric. Urban Entomol. 18: 179-185. Fery, R.L., and J.M. Schalk. 1991. Resistance in pepper (Capsicum annuum L.) to western flower thrips [Frankliniella occidentalis (Pergande)]. Hort. Science 26(8): 1073-1074.

53

Frei, A., J.M. Bueno, J. Diaz‐Montano, H. Gu, C. Cardona, and S. Dorn. 2004. Tolerance as a mechanism of resistance to Thrips palmi in common beans. Entomol. Exp. Appl. 112(2): 73-80. Gaines, J.C. 1957. Cotton insects and their control in the United States. Annu. Rev. Entomol. 2(1): 319-338. Gaines, J.C. 1965. Cotton insects. Texas Agricultural Experiment Station Serial Bulletin. 933. Gaston, K.J., and L.A. Mound. 1993. Taxonomy, hypothesis testing and biodiversity crisis. Proc. R. Soc. Lond. B. Biol. Sci. 251: 139-142. Grover, G., B. Kaur, D. Pathak, and V. Kumar. 2016. Genetic variation for leaf trichome density and its association with sucking insect-pests incidence in Asiatic cotton. Indian J. Genet. Plant Breed. 76(3): 365-368. Handley, R., B. Ekbom, and J. Agren. 2005. Variation in trichome density and resistance against a specialist insect herbivore in natural populations of Arabidopsis thaliana. Ecol. Entomol. 30: 284–92. doi: 10.1111/j.0307-6946.2005.00699.x. Hanley, M.E., B.B. Lamont, M.M. Fairbanks, and C.M. Rafferty. 2007. Plant structural traits and their role in anti-herbivore defense. Perspect. Plant Ecol. Evol. Syst. 8(4): 157-178. Hawkins, B.S., H.A. Peacock, and T. E. Steele. 1966. Thrips injury to upland cotton (Gossypium hirsutum L.) varieties. Crop Sci. 6(3): 256-258. Huseth, A.S., T.M. Chappell, K. Langdon, S.C. Morsello, S. Martin, J.K. Greene, A. Herbert, A.L. Jacobson, F.P.F. Reay-Jones, T. Reed, D.D. Reisig, P.M. Roberts, R. Smith, and G.G. Kennedy. 2016. Frankliniella fusca resistance to neonicotinoid insecticides: an emerging challenge for cotton pest management in the Eastern United States. Pest. Manag. Sci. 72: 1934–1945. doi:10.1002/ps.4232. Keilor, K.E. and L.D. Godfrey. 2000. Effects of registered and experimental insecticides on Lygus hesperus and beneficial arthropods in California cotton. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. p. 1286-1289. Lei, T.T. and L.J. Wilson. 2004. 
Recovery of Leaf Area through Accelerated Shoot Ontogeny in Thrips‐damaged Cotton Seedlings. Ann. Bot. 94(1): 179-186. Leigh, T.F., V.L. Maggi, and L.T. Wilson. 1984. Development and use of a machine for recovery of arthropods from plant leaves. J. Econ. Entomol. 77(1): 271-276. Levin, D.A. 1973. The role of trichomes in plant defense. Q. Rev. Biol. 3-15.

54

Meredith, W.R., W.T. Pettigrew, and J.J. Heitholt. 1996. Sub-okra, semi-smoothness, and nectariless effect on cotton lint yield. Crop Science, 36(1): 22-25. Moffitt, H.R. 1964. A color preference of the western flower thrips, Frankliniella occidentalis. J. Econ. Entomol. 57: 604–605. Morris D.A. 1963. Variation in the boll maturation period of cotton. Empire Cotton Growing Reviews 40: 114-123. Morse, J.G., and M.S. Hoddle. 2006. Invasion biology of thrips. Annu. Rev. Entomol. 51: 67- 89. Mound, L.A. 1997. Biological diversity. Thrips as Crop Pests. In T. Lewis, editor, CAB International, Wallingford, UK. p. 197–215. Mound, L.A., and A.K. Walker. 1982. Terebrantia (Insecta: Thysanoptera). Fauna New Zealand. 1:1–19. Newsom, L.D., J.S. Roussel, and C.E. Smith. 1953. The tobacco thrips, its seasonal history and status as a cotton pest. Louisiana Agric. Exp. Sta. Tech. Bull. 474. Powell, W., G.C. Machray, and J. Provan. 1996. Polymorphism revealed by simple sequence repeats. Trends in Plant Sci. 1: 215–222. Quisenberry, J.E., and D.R. Rummel. 1979. Natural resistance to thrips injury in cotton as measured by differential leaf area reduction. Crop Sci. 19(6): 879-881. Race, S.R. 1965. Predicting thrips populations on seedling cotton. J. Econ. Entomol. 58(5): 1013-1014. Reed, J.T., R. Bagwell, C. Allen, E. Burris, D. Cook, B. Freeman, R. Leonard, and G. Lentz. 2006. A key to the thrips (Thysanoptera: Thripidae) on seedling cotton in the Mid-Southern United States. Mississippi Agricultural & Forestry Experiment Station Information Bulletin. 1156: 33. Reisig, D.D., L.D. Godfrey, and D.B. Marcum. 2009. Thresholds, injury, and loss relationships for thrips in Phleum pratense (Poales: Poaceae). Environ. Entomol. 38(6): 1737-1744. Roberts, P., M. Toews, S. Culpepper, A. Herbert, J. Greene, M. Marshall, T. Reed, and R. Smith. 2015. Potential Interaction of Thrips Management and Pre herbicides in Cotton. In: Proc. Beltwide Cotton Conf. 
National Cotton Council of America, Memphis, TN. p.507–512. Simmons, A.T., and G.M. Gurr. 2005. Trichomes of Lycopersicon species and their hybrids: effects on pests and natural enemies. Agric. Forest Entomol. 7(4): 265-276.

55

Sites, R.W., and W.S. Chambers. 1990. Initiation of vernal activity of Frankliniella occidentalis and Thrips tabaci on the Texas south plains. Southwest. Entomol. 15(3): 339- 344. Smith, G.L. 1942. California cotton insects. Univ. of California Bull. 660. Stanton, M.A., J.M. Stewart, and N.P. Tugwell. 1992. Evaluation of Gossypium arboreum L. germplasm for resistance to thrips. Genet. Resour. Crop Ev. 39(2): 89-95. Stewart, SD. 2011. Cotton Pests and Their Management: Thrips [Online]. University of Tennessee Extension Service W026. Available at http://www.utcrops.com/cotton/ cotton_insects/Pests/thrips.html (verified 3 July 2013). Stewart, S.D., D.S. Akin, J. Reed, J. Bacheler, A. Catchot, D. Cook, J. Gore, J. Greene, A. Herbert, R. Jackson, and D. Kerns. 2013. Survey of thrips species infesting cotton across the Southern US Cotton Belt. J Cotton Sci, 17: 263-269. Terry, I.L., and B.B. Barstow. 1988. Susceptibility of early season cotton floral bud types to thrips (Thysanoptera: Thripidae) damage. J. Econ. Entomol. 81(6): 1785-1791. Tyagi, P., M.A. Gore, D.T. Bowman, B.T. Campbell, J.A. Udall, and V. Kuraparthy. 2014. Genetic diversity and population structure in the US Upland cotton (Gossypium hirsutum L.). Theor. Appl. Genet. 127(2): 283-295. Wardle, R.A., and R. Simpson. 1927. The Biology of Thysanoptera with reference to the cotton plant. Ann. Appl. Biol. 14(4): 513-528. Watson, T.F. 1965. Influence of thrips on cotton yields in Alabama. J. Econ. Entomol. 58: 1118-1122. Watts, J.G. 1936. Study of the biology of the flower thrips [Frankliniella tritici (Fitch)] z with special reference to cotton. Bull. S. Carol. Agric. Exp. Stn. 306: 1–46. Zareh, N. 1985. Evaluation of six cotton cultivars for their resistance to thrips and leafhoppers. Iran Agricultural Research 4(2): 89-97. Zhang, J.F., H. Fang, H.P. Zhou, S.E. Hughs, and D.C. Jones. 2013. Inheritance and transfer of thrips resistance from Pima cotton to Upland cotton. J. Cotton Sci. 17: 163-169.

56

CHAPTER 3: Major Leaf Shape Genes, Laciniate in Diploid Cotton and Okra in Polyploid Upland Cotton, Map to an Orthologous Genomic Region

As adapted from: Kaur B, Andres R, Kuraparthy V (2016) Major Leaf Shape Genes, Laciniate in Diploid Cotton and Okra in Polyploid Upland Cotton, Map to an Orthologous Genomic Region. Crop Science 56(3): 1096-1105.

Abstract

Gossypium arboreum L., which produces spinnable cotton fibers, is an A-genome diploid progenitor species of tetraploid cotton. With its diploid genome, publicly available genome sequence, adapted growth, and developmental and agronomic attributes, G. arboreum could make an ideal cotton species to study the genetic basis of biological traits that are controlled by orthologous loci in diploid and polyploid species. Leaf shape is an important agronomic trait in cotton. Normal, subokra, okra, and laciniate are the predominant leaf shapes in cotton cultivars. Laciniate in diploids is phenotypically similar to okra leaf shape in tetraploids. In the present study, a population of 135 F2 plants derived from a cross between accessions NC 501 and NC 505 was used for genetic and molecular mapping of laciniate leaf shape in diploid cotton (G. arboreum). An inheritance study showed that laciniate leaf shape was controlled by a single incompletely dominant gene (L^L-A2). Molecular genetic mapping using simple-sequence repeat (SSR) markers placed the leaf shape locus L-A2 on chromosome 2. Targeted mapping using putative genes from the delineated region established that laciniate leaf shape in G. arboreum and okra leaf shape in Gossypium hirsutum L. were controlled by genes at orthologous loci. Collinearity was well conserved between the diploid A- (G. arboreum) and D- (G. raimondii Ulbr.) genomes in the targeted genomic region, narrowing the candidate region for the leaf shape locus (L-A2) to nine putative genes. Establishing the orthologous genomic region for the L loci could help use the diploid cotton resources toward map-based cloning of leaf shape genes in Gossypium.

Introduction

Cotton is the world’s most important source of natural fiber as well as one of its leading oilseed crops. Cotton belongs to the genus Gossypium, which is comprised of both tetraploid and diploid species. Currently, about 44 species of diploid cotton (2n = 2x = 26) are known to exist across the eight genomes (A–G and K), which are spread throughout the arid and semiarid regions of the tropics (Hutchinson et al., 1947; Saunders, 1961; Wendel et al., 2009; Wendel and Grover, 2015). Spinnable cotton fibers evolved only in the A-genome diploids, and both remaining A-genome species, G. arboreum and G. herbaceum L., were domesticated independently (Wendel et al., 2009). Of these A-genome species, G. arboreum is a commercially important diploid cotton mostly grown in Asia. About 1 to 2 million years ago (MYA), an A-genome diploid hybridized with a D-genome diploid to form an allopolyploid species (2n = 4x = 52, AADD) (Wendel et al., 2009). The diploid D-genome donor was most closely related to G. raimondii, while the two remaining A-genome diploids appear equally closely related to the A-genome donor (Wendel et al., 2009). The polyploid then spread throughout the tropics of the New World, spawning at least six different species, two of which (G. hirsutum and G. barbadense L.) were independently domesticated (Wendel et al., 2009).

The genus Gossypium shows a wide geographical distribution across the globe with a multitude of growth, developmental, and morphological attributes (Hutchinson et al., 1947; Wendel and Cronn, 2003; Wendel et al., 2010). Remarkable phenotypic diversity exists for leaf shape in cotton, ranging widely from overtly simple to deeply lobed leaves across both the diploids and polyploids (Hammond, 1941; Hutchinson et al., 1947; Saunders, 1961).

Although the role of leaf shape diversity in the evolution and adaptation of the Gossypium genus is not yet clearly established, leaf shape plays an important role in cotton production.

Leaf shape affects the plant and crop canopy architecture and can influence yield, biotic stress tolerance, earliness, input use efficiency, and other production characteristics in cotton.

The predominant leaf shapes in cultivated upland cotton are normal, subokra, and okra. All of these leaf shapes, along with super okra, form an allelic series and map to a single locus (L; renamed here as L-D1) in the D-subgenome of upland cotton (Jones, 1982).

Among these leaf shapes, okra leaf shape (Fig. 1) is of particular interest in cotton production (Green, 1953; reviewed in Andres et al., 2014; Jiang et al., 2000). It is also used to study the biological basis of leaf shape variation in plants (Dolan and Poethig, 1991). The okra leaf shape gene (L^O) of the D-subgenome was shown to be incompletely dominant to normal leaf shape, and the L-D1 locus was mapped using SSR markers on chromosome 15 of upland cotton (Andres et al., 2014; Jiang et al., 2000).

Genes at the homoeologous locus in the A-genome were also found to control leaf shapes such as laciniate in tetraploid cotton (Endrizzi and Stein, 1975). However, to date, mapping information for the allelic series of leaf shape at the A-subgenome locus is not available, and the orthologous relationship with the major leaf shape alleles of the D-subgenome has not been established.


Cultivated diploid cottons (G. arboreum and G. herbaceum) show wide variation for leaf shape and size (Hutchinson, 1934). In a series of crosses among and between the diploid A-genome species, Hutchinson (1934) demonstrated that five leaf shapes exist in the Asiatic cottons G. arboreum and G. herbaceum, all of which are allelomorphic: laciniate (L^L), arboreum (L), recessive broad (l), mutant broad (L^B), and mutant intermediate (L^I). However, the linkage and chromosome map locations of these genes are not available to date, and their orthologous relationship with the major leaf shape genes in tetraploid cotton is not known.

Gossypium arboreum, which is a domesticated, diploid (2n = 2x = 26, A2A2) cotton, offers a unique opportunity to study the molecular genetic basis of agronomically important biological traits. Because of its diploid nature, trait mapping is less cumbersome in G. arboreum than mapping in the tetraploid with its duplicated genome and corresponding genetic redundancy (Li et al., 2014). Gossypium arboreum shows similar growth, developmental, and agronomic attributes to tetraploid cotton (Hutchinson et al., 1947), making it a valid comparative model. Furthermore, the availability of a draft genome sequence and the presence of higher allelic diversity (Ma et al., 2008; Li et al., 2014; Lu et al., 2015) could make G. arboreum an ideal cotton species to study the genetic basis of biological traits, specifically the traits that are controlled by orthologous loci in diploid and polyploid species.

The objectives of the current study were to (i) study the inheritance of the laciniate leaf shape trait in diploid cotton, (ii) genetically map the laciniate leaf shape gene (L^L-A2) in diploid cotton, and (iii) study the orthologous relationship between the major leaf shape genes in diploid and tetraploid cottons.

Materials and Methods

Plant Material: Accession NC 505 (PI 615700, A2-0191), named Chinese Narrow Leaf, is a laciniate leaf shape line of G. arboreum from China, whereas NC 501 (PI 167905, A15.02SD), with recessive broad leaf shape (Fig. 1; Supplemental Fig. S1), was originally collected by J. R. Harlan from Turkey in 1948. Seeds of both lines were procured from the USDA Cotton Germplasm Collection in College Station, TX. Line NC 505 was crossed with NC 501 during fall 2013. A single F1 plant was selfed in the greenhouse to obtain F2 seed in spring 2014. During fall 2014, 135 F2 plants along with the two parental lines were planted in the greenhouse to record the phenotype for leaf shape and collect leaf samples for molecular genetic mapping. All plants were grown in 25.4-cm pots (Hummert International, Inc.) in the greenhouse under short-day photoperiod conditions with supplemental lighting. Day temperature was set at 88°F and night temperature was set at 70°F.

Phenotyping: Phenotypic data were recorded by visual observation 2 mo after planting. The F2 plants were scored either as normal type or okra type because the difference between the heterozygotes and okra types was occasionally ambiguous.

Molecular Genetic Mapping: Leaf tissue samples from parental lines and F2 plants were collected and ground in liquid nitrogen. DNA was extracted using a modified miniprep extraction protocol reported by Li et al. (2001). Quantity and quality of the DNA were estimated using a NanoDrop 2000 UV-Vis spectrophotometer (Thermo Fisher Scientific). Samples were diluted to a concentration of 20 ng µL−1 for polymerase chain reaction (PCR) amplification. A final volume of 6 µL per reaction was used for PCR. Each reaction included 20 ng of genomic DNA, 1× reaction buffer with 7.5 mM MgCl2, 0.24 mM dNTPs, 0.5 units of Taq DNA polymerase, 0.48 µM forward primer, 3.6 µM reverse primer, and 3.6 µM M13 primer labeled with either HEX (hexachlorofluorescein) or 6-FAM (6-carboxyfluorescein) fluorescent tags. A touchdown program was used for all primer pairs, starting with 5 min denaturation at 95°C, then 15 cycles of 94°C for 45 s, 65 to 51°C (one cycle at each degree) for 45 s, and 72°C for 1 min, followed by 25 cycles of 94°C for 45 s, 50°C for 45 s, and 72°C for 1 min, and a final extension at 72°C for 10 min. To study the size polymorphism, the PCR products from all primer pairs were run on a 3% GenePure high-resolution agarose gel (ISC BioExpress) and on an ABI 3730 capillary-based electrophoresis sequencer (Applied Biosystems). For all capillary-based gel electrophoresis, GeneScan–500 LIZ (Applied Biosystems) was used as the size standard. GeneMarker V2.6.0 (SoftGenetics, 2013) software was used to visualize and analyze the data obtained from the ABI 3730 sequencer.
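The touchdown cycling program described above can be written out step by step. The following is a minimal sketch, not the thermocycler file used in the study; the function name and step labels are illustrative assumptions, and ramp times between temperatures are ignored.

```python
# Sketch of the touchdown PCR program described in the text.
# Temperatures are in degrees Celsius, hold times in seconds.
# Function and label names are illustrative, not from the original protocol.

def touchdown_program():
    """Return the cycling program as a list of (label, temp_C, seconds) steps."""
    steps = [("initial denaturation", 95, 5 * 60)]
    # Touchdown phase: annealing drops 1 degree per cycle, from 65 down to 51.
    for anneal in range(65, 50, -1):
        steps += [("denature", 94, 45), ("anneal", anneal, 45), ("extend", 72, 60)]
    # Fixed-annealing phase: 25 cycles with annealing at 50 degrees.
    for _ in range(25):
        steps += [("denature", 94, 45), ("anneal", 50, 45), ("extend", 72, 60)]
    steps.append(("final extension", 72, 10 * 60))
    return steps


program = touchdown_program()
# Annealing temperatures of the 15 touchdown cycles, in order.
touchdown_temps = [temp for label, temp, _ in program if label == "anneal"][:15]
# Sum of programmed hold times only (ramping is not modeled).
hold_seconds = sum(seconds for _, _, seconds in program)

print(touchdown_temps)
print(f"total hold time: {hold_seconds / 60:.0f} min (ramp time excluded)")
```

Listing the steps this way makes it easy to confirm that the touchdown phase covers exactly 15 annealing temperatures, one per cycle.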

Genetic Map Construction: A chi-square test was conducted to check the goodness of fit for both the phenotypic and genotypic (marker) data. JoinMap 4.1 (Van Ooijen, 2006) was used to develop the linkage map for the leaf shape trait, with a logarithm of odds (LOD) score threshold of 10.0.


Marker and Comparative Genomic Analysis: Studies conducted in tetraploid upland cotton (G. hirsutum) mapped the leaf shape locus (L-D1) to chromosome 15 in G. hirsutum and its homoeologous chromosome 2 in G. raimondii (Andres et al., 2014). Chromosome 15 of the D-subgenome is homoeologous to chromosome 1 of the A-subgenome in upland cotton (Blenda et al., 2012). For mapping the leaf shape gene in G. arboreum, 33 SSR markers from chromosome 1 of the high-density consensus (HDC) map (Blenda et al., 2012) and 23 SSR markers from the Cotton Marker Database (http://www.cottonmarker.org/) were selected.

To find closely linked markers and for genomic targeting of the L^L-A2 gene, current and previously mapped markers (SSR and sequence-tagged site [STS]) were BLAST searched against the publicly available G. arboreum genome sequence (G. arboreum A-genome BGI-CGP v2.0 [annotation v1.0]). Sequence-tagged site markers were designed from putative genes in the tentatively identified genomic region in G. arboreum. All the primers were designed using Primer3 software (http://biotools.umassmed.edu/bioapps/primer3_www.cgi) and synthesized by Integrated DNA Technologies (Coralville, IA). An M13 tail sequence (5′-CACGACGTTGTAAAACGAC-3′) was added to the 5′ end of all the forward primers to resolve the PCR products with capillary-based gel electrophoresis (Schuelke, 2000).
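As a small illustration of the primer tailing step, the sketch below prepends the M13 tail given in the text to a forward primer. The example primer sequence and the function name are hypothetical and are not markers or tools from this study.

```python
# Sketch of adding the M13 tail to the 5' end of a forward primer.
# All sequences are written 5' to 3'.

M13_TAIL = "CACGACGTTGTAAAACGAC"  # tail sequence as given in the text


def add_m13_tail(forward_primer: str) -> str:
    """Return the forward primer with the M13 tail prepended at its 5' end."""
    primer = forward_primer.strip().upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return M13_TAIL + primer


# Hypothetical forward primer, for illustration only.
example_primer = "ATGGCTTCAGGTCAATCG"
tailed = add_m13_tail(example_primer)
print(tailed, len(tailed))
```

Only the forward primer carries the tail; the fluorescently labeled M13 primer then anneals to the tail's complement in later cycles, which is what allows one labeled primer to serve every marker (Schuelke, 2000).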

Mapped SSR and STS markers closely linked to the leaf shape trait in the current linkage map and from Andres et al. (2014) were then used to establish the orthologous relationship between the L-D1 locus of upland cotton and the L-A2 locus of diploid cotton. The candidate genomic region for leaf shape genes was established in the sequenced diploid A- and D-genomes and the A-subgenome of upland cotton using markers mapping to the orthologous gene sequences. To do this, marker sequences were BLAST searched against the sequenced G. hirsutum accession TM1 (Zhang et al., 2015), G. raimondii (DOE Joint Genome Institute: Cotton D V2.0), and G. arboreum Shixiya1 (Li et al., 2014) genomes. Further, putative genes placed in the genomic region of leaf shape in the A-genome by the BGI-CGP sequence annotation were studied for collinearity to the JGI G. raimondii genome (JGI assembly v2.0 [annotation v2.1]; Paterson et al., 2012) and BLAST searched to find the correct orthologous genes. Use of the JGI G. raimondii genome at Phytozome allowed gene function to be predicted through the protein homologs and gene ancestry tools of Phytozome. Of the three sequenced genomes used for BLAST analysis, G. hirsutum accession TM1 is a normal leaf shaped upland cotton, G. arboreum cultivar Shixiya1 is a broad-leaf cultivar (Chen et al., 2015), and G. raimondii shows simple leaf shape (Hammond, 1941; Hutchinson et al., 1947; Saunders, 1961).

Based on the collinearity map using sequenced A- and D-genome maps as bridging species, STS markers were then designed to find newer and closely linked markers to the L^L-A2 gene and to further establish the orthologous relationship between A- and D-genome leaf shape loci. Polymorphic STS markers were synthesized, amplified, and integrated into the genetic map as described above. A comparative map between chromosome 2 of the G. raimondii draft genome, chromosome 2 of the G. arboreum draft sequence, chromosome 15 of the G. hirsutum LSMapPop map (Andres et al., 2014), and the molecular genetic map of the L^L-A2 gene of the current study was constructed using the Strudel software (JHI Plant Bioinformatics) and redrawn to scale in Microsoft PowerPoint.


Results

Inheritance of Leaf Shape Trait in Gossypium arboreum: The F1 hybrid between NC 505 and NC 501 showed an intermediate phenotype compared with the parents (Fig. 1; Supplemental Fig. S1). This indicated that the leaf shape trait was incompletely dominant in the heterozygous condition.

While normal leaf shape phenotype was unambiguously determined in the F2 population, the distinction between okra type and heterozygotes was not pronounced for a few F2 individuals. Therefore, the F2 plants were scored as either okra type or normal leaf shape. Phenotypic data showed that the F2 population segregated in a ratio of 3:1 of okra to normal type (χ² = 0.062, p-value = 0.8033), confirming monogenic control of the laciniate leaf shape trait in G. arboreum.
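The goodness-of-fit test behind this result can be sketched as follows. The individual class counts are not reported in the text, so the 100:35 split used here is an assumption, back-calculated as one division of 135 plants that yields a statistic rounding to the reported χ² of 0.062; the function names are illustrative. With 1 degree of freedom the p-value has the closed form erfc(sqrt(χ²/2)), so only the standard library is needed.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# ASSUMED counts (okra type, normal type); not reported in the text.
observed = [100, 35]
chi2, expected = chi_square_gof(observed, [3, 1])
p = p_value_1df(chi2)

print(f"expected = {expected}")
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p-value this far above 0.05 means the observed segregation gives no reason to reject the 3:1 ratio expected for a single gene when heterozygotes are pooled with the okra class.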

Simple-Sequence Repeat Marker Analysis and Genetic Mapping of the L^L-A2 Gene: Chromosome 15, which carries the okra leaf shape gene of the D-subgenome (Andres et al., 2014), was reported to be homoeologous to chromosome 1 of tetraploid upland cotton and chromosome 2 of diploid D-genome cotton (Blenda et al., 2012; Li et al., 2014). Fifty-six SSR markers genetically and physically mapped on chromosome 1 in tetraploid cotton were used to assess the polymorphism between the parental accessions NC 505 and NC 501. Out of the 46 SSR markers that amplified, 18 (39.1%) were polymorphic between the two parents. All the SSR markers were codominant except for one (Table 1; Supplemental Table S1). Polymorphic markers were used for genotyping the F2 population and mapping of the leaf shape gene. Genetic analysis using JoinMap with a LOD score of 10.0 mapped the leaf shape locus (L-A2) on chromosome 2. Six SSR markers showed linkage to the L-A2 locus, mapping at varying genetic distances (Fig. 2). Linked markers of the L-A2 locus showed a genetic map length of 40.5 cM (Fig. 2).

Comparative Analysis of the L^L-A2 Genetic Map with Gossypium hirsutum High-Density Consensus Map: Two linked SSR markers (DPL526 [MON_DPL0526] and BNL1693 [CLU16]) flanking the leaf shape locus L-A2 in the current map (Fig. 2) were mapped toward the telomeric end of chromosome 1 of the upland cotton HDC map (Blenda et al., 2012; Fig. 2). DPL526 maps to 20 cM and BNL1693/CLU16 to 32 cM on the 152-cM-long chromosome 1 in the HDC map. This supported the tentative localization of the L-A2 locus to the distal region of chromosome 2 in Asiatic cotton. The SSR marker HAU2936, which maps between DPL526 and the L-A2 locus (Fig. 2), was not present on chromosome 1 of the HDC map but did map near the telomere on the homoeologous chromosome 15. The remaining three linked SSRs in Fig. 2 all mapped proximally to BNL1693 (CLU16) on chromosome 1 in the HDC map (Fig. 2). The order of the genetically mapped SSRs in the present study (Fig. 2) was identical to their order on chromosome 1 of the HDC map, with minor differences in the marker distances (Fig. 2). This suggested that no major rearrangements existed between G. arboreum chromosome 2 and G. hirsutum chromosome 1 in the genomic region of the L-A2 locus. Although most of the SSRs linked to the okra leaf shape locus (L-D1) in G. hirsutum (Andres et al., 2014) showed amplification in the G. arboreum parents, none were polymorphic (Supplemental Table S1).


Comparative Mapping of the L^L-A2 Genetic Map to the Diploid A- and D-Genomes: The six polymorphic SSRs were BLAST searched against both the G. arboreum and the G. raimondii genomes to establish the orthologous genomic region in both species. Only two of the SSRs (DPL526 and HAU2936) identified an orthologous genomic region on chromosome 2 of both G. arboreum and G. raimondii (Table 2). Three of the markers (BNL1693, NAU2095, and BNL2921) showed high sequence similarity to appropriate physical regions only on chromosome 2 of G. raimondii, while CIR18 did not have a high-scoring match on chromosome 2 in either species (Table 2). Because none of the four proximal SSR markers could be found in the G. arboreum physical sequence of chromosome 2, a physical candidate region could not be constructed in the G. arboreum genome. Thus, a genome annotation to identify candidate genes could not be completed using SSRs alone.

Sequence-Tagged Site Marker Development and Genomic Targeting of the L^L-A2 Gene: To find markers that mapped proximally to the L-A2 locus in the genetic map but could also be found in the G. arboreum physical map, 93 STS markers were designed from 22 putative genes located proximally to HAU2936 in the G. arboreum physical sequence. Of these 93 STS markers, 86 amplified in both parental lines and 7 (8.1%) were polymorphic (Supplemental Table S1). With the exception of 15-LSFM-7, all the polymorphic STS markers were codominant (Supplemental Table S1). Four of the polymorphic STS markers (15-LSFM-7 and LS-GA-13, 23, and 24) were run on the F2 population, and all four markers showed tight linkage to the L-A2 gene (Fig. 2, 3). Two flanking STS markers (LS-GA-13 and 15-LSFM-7) showed especially close linkage with the L-A2 locus, mapping the L^L-A2 gene within a 1.3-cM region on chromosome 2 of G. arboreum. With respect to the L-A2 locus, LS-GA-13 mapped 0.9 cM distally and 15-LSFM-7 mapped 0.4 cM proximally on chromosome 2 of G. arboreum. The order of the mapped STS markers in the genetic map was consistent with their physical locations in the G. arboreum genome sequence (Fig. 3; Table 2).

Annotation and Orthologous Mapping of the Leaf Shape Gene (L^L-A2) Candidate Region using Gossypium arboreum, Gossypium raimondii, and Gossypium hirsutum Genomes: BLAST analysis using genetically mapped marker sequences against the sequenced diploid A- and D-genomes and the A-subgenome of G. hirsutum showed that the order of the genetically mapped markers was similar to their physical order in each of the sequence-based maps, suggesting that collinearity is well conserved among the tetraploid A-subgenome and the diploid A- and D-genomes in the targeted candidate L-A2 leaf shape region. The collinearity of the putative orthologous gene sequences among the sequenced diploid A- and D-genomes and the A-subgenome of G. hirsutum, and their collinear map location with respect to the markers genetically linked to the leaf shape genes L^L-A2 and L^O-D1 (Fig. 3; Table 2, 3), suggest that the laciniate leaf shape locus in diploid cotton is orthologous to the okra leaf shape locus of upland cotton.

The laciniate leaf shape candidate region in the diploid A-genome spans a physical distance of ~108 kb from STS marker LS-GA-13 (located within gene Cotton_A_00499) to STS marker 15-LSFM-7 (located within gene Cotton_A_00509). Nine genes (Cotton_A_00500 through Cotton_A_00508) are predicted to lie between these two markers (Table 3). Comparative genomic analysis using these putative gene sequences showed that the gene order and sequence similarity are highly conserved in the region of the L-A2 locus among G. arboreum, G. raimondii, and the A-subgenome of G. hirsutum (Table 2, 3). The one difference between the diploid genomes is that the ortholog of G. arboreum gene Cotton_A_00502 is broken into two separate genes (Gorai.002G244500 and Gorai.002G244600) in G. raimondii (Table 3). Nevertheless, all three genes are putatively characterized as exostosins and the sequence similarity of the region between the two species is high at 88.8% (Table 3). Therefore, it is likely that this difference is the result of a difference in annotation between the two genomes rather than a large insertion or deletion. Almost all of the genes share sequence similarity above 87% between the two species (Table 3). The exceptions are Cotton_A_00503 and Cotton_A_00506, both of which are serine–threonine protein kinases (Table 3). However, both Cotton_A_00503 and Cotton_A_00506 are predicted to be considerably shorter than their D-genome orthologs. Extension of the Cotton_A_00503 and Cotton_A_00506 sequences to match that of their D-genome orthologs resulted in significantly improved percentage similarities.

Of the nine putative genes annotated in the diploid A-genome, genes Cotton_A_00505 and Cotton_A_00507 showed high homology to genes implicated in leaf shape determination in other studies (Saddic et al., 2006; Andres et al., 2014; Vlad et al., 2014; Sicard et al., 2014) (Table 3). These two candidate leaf shape genes were 48.9 kb apart and were separated by a single gene, Cotton_A_00506, encoding a serine–threonine protein kinase (Table 3). Further, genes Cotton_A_00505 and Cotton_A_00507 showed high DNA sequence similarity (94.3 and 87.4%, respectively) with their orthologs in the G. raimondii genome (Table 3). In upland cotton, the orthologs of these two genes in the D-subgenome were also identified as possible leaf shape candidates by Andres et al. (2014). A marker developed from one of the candidate genes cosegregated with the okra leaf shape locus L-D1 (Andres et al., 2014). However, preliminary attempts at developing markers based on its diploid A-genome homoeolog in the current study were not successful, as the primer pairs did not show polymorphism between the parents (Supplemental Table S1).

The genomic region delineated by the flanking STS markers (LS-GA-13 and 15-LSFM-7) in the sequenced A-genome was ~108 kb. This is equivalent to 95 kb in the homoeologous diploid D-genome sequence and 103.14 kb in the A-subgenome of G. hirsutum (Table 2, 3; Fig. 3). Based on the flanking STS markers, the physical sequence size to genetic distance ratio in the genomic region of L^L-A2 was 86.3 kb cM−1.
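The interval sizes quoted above can be approximated from the marker coordinates in Table 2. The following is a minimal sketch, assuming the interval is the gap between the inner ends of the two flanking markers; the dictionary layout and the function name inner_gap_kb are illustrative, not taken from the original analysis. Because the genetic distance between the flanking markers is not restated in this passage, the sketch only back-calculates it from the two quoted figures (~108 kb and 86.3 kb cM−1).

```python
# Flanking-marker coordinates (bp) copied from Table 2:
# (LS-GA-13 interval, 15-LSFM-7 interval) for each physical map.
FLANKING = {
    "G. hirsutum A-subgenome": ((97_874_747, 97_874_961), (97_771_156, 97_771_607)),
    "G. arboreum": ((68_525_694, 68_525_908), (68_413_593, 68_414_044)),
    "G. raimondii": ((60_892_051, 60_892_259), (60_796_820, 60_797_271)),
}


def inner_gap_kb(marker_a, marker_b):
    """Gap in kb between two non-overlapping (start, end) intervals."""
    lower, upper = sorted([marker_a, marker_b])
    return (upper[0] - lower[1]) / 1000


gaps = {genome: inner_gap_kb(*markers) for genome, markers in FLANKING.items()}

# Assumption: the 86.3 kb/cM ratio was derived from the ~108 kb interval,
# so the implied genetic distance is simply their quotient.
implied_cm = 108 / 86.3

for genome, kb in gaps.items():
    print(f"{genome}: {kb:.2f} kb")
print(f"implied genetic interval: {implied_cm:.2f} cM")
```

With this gap definition the sketch gives 103.14 kb for G. hirsutum and 94.78 kb (about 95 kb) for G. raimondii, in line with the text. For G. arboreum it gives 111.65 kb, slightly above the ~108 kb quoted, so the A-genome figure may have been measured between somewhat different endpoints.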

Extension of Comparative Genomic Analysis of Orthologous Leaf Shape Region in Diploid A- and D-Genome Physical Maps: The annotation in G. arboreum was extended beyond the L-A2 locus to cover the interval between the SSR markers Gh565 and NAU2343. These SSRs defined the L-D1 candidate region established in Andres et al. (2014) (Table 3). Neither Gh565 nor NAU2343 was polymorphic between the G. arboreum mapping population parents (Supplemental Table S1), but both had high-scoring matches in the G. arboreum physical sequence. This expanded candidate region contained genes Cotton_A_00482 through Cotton_A_00518 (Table 3). The order of the genes remains highly conserved between the two diploid species, and all gene sequences have percentage similarities of at least 80% with the two exceptions noted in the previous section (Table 3). The annotation of the G. arboreum sequence contains three genes (Cotton_A_00496–00498) that appear to lack a clear homeolog in G. raimondii. Genes Cotton_A_00496 and Cotton_A_00498 show high sequence similarity to unannotated regions of G. raimondii chromosome 2 located between the closest flanking annotated genes in G. raimondii. This indicates that these two genes possibly reflect differences in the annotation of the two genomes rather than a sizeable insertion or deletion. However, Cotton_A_00497 does not have a high-scoring match on G. raimondii chromosome 2 and therefore may be a gene unique to the diploid cotton A-genome. Gossypium arboreum also appears to carry an extra carbonic anhydrase in the candidate region, as both Cotton_A_00484 and Cotton_A_00485 are most similar to Gorai.002G246000. Nevertheless, the continued strong collinearity between the flanking genes of the L-A2 and L-D1 loci strengthens the notion that the two loci may be conditioned by orthologous genes.
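The notion of conserved gene order can be made concrete with a small check. The sketch below is illustrative only: it takes the eleven ortholog pairs between the flanking genes Cotton_A_00499 and Cotton_A_00509 (start coordinates from Table 3) and tests whether sorting them by A-genome position leaves the D-genome positions monotonic. The function name is_collinear and the monotonicity criterion are assumptions made for this example, not the method used in the study.

```python
# (Cotton_A_ gene, A-genome start bp, Gorai.002G gene, D-genome start bp),
# copied from Table 3 for the interval between the flanking genes.
PAIRS = [
    ("00499", 68_523_600, "244900", 60_889_583),
    ("00500", 68_517_140, "244800", 60_882_963),
    ("00501", 68_513_273, "244700", 60_878_611),
    ("00502", 68_493_999, "244500", 60_863_949),
    ("00503", 68_487_614, "244400", 60_856_168),
    ("00504", 68_483_664, "244300", 60_852_082),
    ("00505", 68_480_260, "244200", 60_848_207),
    ("00506", 68_442_940, "244100", 60_818_675),
    ("00507", 68_432_568, "244000", 60_816_695),
    ("00508", 68_419_352, "243900", 60_802_515),
    ("00509", 68_410_422, "243800", 60_793_342),
]


def is_collinear(pairs):
    """True if ordering by A-genome start leaves D-genome starts monotonic."""
    by_a = sorted(pairs, key=lambda p: p[1])
    d_starts = [p[3] for p in by_a]
    steps = list(zip(d_starts, d_starts[1:]))
    increasing = all(x < y for x, y in steps)
    decreasing = all(x > y for x, y in steps)
    return increasing or decreasing


print("gene order conserved:", is_collinear(PAIRS))
```

For these eleven pairs the D-genome positions rise strictly with the A-genome positions, so the check passes. It considers gene order only and says nothing about sequence similarity or annotation differences.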

The L-D1 locus maps to a position at ~60.8 Mb on the ~62.8-Mb chromosome 2 of G. raimondii, establishing that the gene is physically located close to the telomere. In G. arboreum, the L-A2 locus maps to ~68.4 Mb on chromosome 2. However, at ~100 Mb, G. arboreum chromosome 2 is much longer than the ~62.8-Mb G. raimondii chromosome 2. Therefore, unlike its D-genome ortholog, the L-A2 locus does not appear to be telomeric in the A-genome diploid cotton.
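The contrast between the two chromosomes is easier to see as relative positions. The following sketch uses only the approximate values given above (locus position and chromosome length in Mb); the function names are hypothetical.

```python
def relative_position(locus_mb, chromosome_mb):
    """Locus position as a fraction of chromosome length."""
    return locus_mb / chromosome_mb


def distance_to_end_mb(locus_mb, chromosome_mb):
    """Distance in Mb from the locus to the far end of the chromosome."""
    return chromosome_mb - locus_mb


# Approximate values quoted in the text (Mb).
LOCI = {
    "L-D1 (G. raimondii chr 2)": (60.8, 62.8),
    "L-A2 (G. arboreum chr 2)": (68.4, 100.0),
}

for name, (locus, length) in LOCI.items():
    frac = relative_position(locus, length)
    tail = distance_to_end_mb(locus, length)
    print(f"{name}: {frac:.1%} along the chromosome, {tail:.1f} Mb from the end")
```

On these figures L-D1 lies about 97% of the way along its chromosome, roughly 2 Mb from the end, whereas L-A2 lies about 68% of the way along, roughly 32 Mb from the end.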

Discussion

In a series of crosses among and between the diploid A-genome species G. arboreum and G. herbaceum, Hutchinson (1934) demonstrated five leaf shapes in Asiatic or diploid cotton, all of which are allelomorphic: laciniate (L^L), arboreum (L), recessive broad (l), mutant broad (L^B), and mutant intermediate (L^I). Only laciniate, which is phenotypically similar to okra, was transferred to G. hirsutum, by Dr. C. Rhyne ca. 1960 (Endrizzi and Stein, 1975; Jones, 1982). The laciniate locus was placed on chromosome 1 of the tetraploid A-subgenome in cytogenetic work using monosomes (White and Endrizzi, 1965). Since okra and laciniate alleles have similar effects on leaf shape, they were considered to be genes at duplicate loci in the two genomes (White and Endrizzi, 1965). This served as the basis for establishing that chromosome 1 and chromosome 15 are homeologous chromosomes in tetraploid cotton (White and Endrizzi, 1965). There is no mention of any of the other three leaf shape alleles being transferred to G. hirsutum from Asiatic cotton. However, since they are considered an allelic series in Asiatic cotton, it is assumed that they could also make up an allelic series at chromosome 1, the A-genome homoeologue, in G. hirsutum (Jones, 1982; Meredith, 1984). Some G. hirsutum lines still exist today that purportedly carry the laciniate allele. However, the term laciniate is occasionally used interchangeably with okra and super okra in the literature. Therefore, which of these lines, if any, truly carry the laciniate allele is unknown. In the current study, two of the leaf shapes described by Hutchinson (1934) were investigated for their genetics and relationship with the okra leaf shape of tetraploid upland cotton. Genetic analysis showed that the laciniate leaf shape gene (L^L-A2) of G. arboreum showed incomplete dominance similar to the okra leaf shape of tetraploid upland cotton (Andres et al., 2014). Molecular genetic mapping of the laciniate leaf shape in diploid cotton indicated that the L^L-A2 gene mapped to chromosome 2 of diploid cotton, and that the L-A2 locus was orthologous to the okra leaf shape locus (L-D1) of upland cotton (Fig. 3; Table 2).

Fine mapping and chromosome walking toward the target gene is difficult in polyploids because of the genetic redundancy of the homeoalleles in duplicated genomes. In many cases, the progenitor or related diploid species with smaller genomes have been sequenced ahead of their agriculturally important polyploids. Such wild and progenitor diploid species of polyploid crop plants have been used successfully for fine mapping and map-based cloning of biological traits. Examples include the vernalization genes VRN1 and VRN2 in wheat using Triticum monococcum L. (Yan et al., 2003, 2004); the disease resistance genes Lr10, Pm3b, and Sr35 of wheat using T. monococcum (Feuillet et al., 2003; Yahiaoui et al., 2004; Saintenac et al., 2013); the Lr21 disease resistance gene of wheat using Aegilops tauschii Coss. (Huang et al., 2003); and the late blight resistance gene RB in potato using the wild diploid potato species Solanum bulbocastanum Dunal (Song et al., 2003). Although the diploid D-genome species G. raimondii is sequenced, genetic analysis is not a feasible option in this species because its accessions are photoperiod-sensitive wild perennials with narrow morphological diversity. Wide phenotypic diversity for leaf shape exists among the 14 extant wild diploid D-genome species (Hutchinson, 1934; Fryxell, 1979; Ulloa, 2014). However, developing interspecific crosses and segregating mapping populations in the diploid D-genome species could be a cumbersome process as a result of gametic and sporophytic incompatibility, sterility in hybrids, hybrid breakdown, segregation distortion, etc. (Saunders, 1961; He and Liang, 1989). In addition, genetic analysis in these wild diploid D-genome species would not be feasible because most of the accessions are photoperiod-sensitive perennials that show poor seed germination. Thus, genomic targeting and orthologous mapping of the L^L-A2 gene in the adapted, photoperiod-insensitive, and cultivated diploid A-genome species offers an ideal opportunity to fine map and clone the orthologous leaf shape locus (L) in cotton.

The orthologous relationship between the laciniate gene L^L-A2 of diploid cotton and the okra leaf shape gene (L^O-D1) of upland cotton was established by using a combination of genetic mapping and comparative mapping with the sequenced A- and D-genomes (Fig. 3; Table 2). This showed the utility of the physical maps, especially the sequence-based maps of the progenitor diploid species G. arboreum and G. raimondii, in improving the genetic mapping efficiency in polyploid cotton. An ideal orthologous map would involve cross-validating the orthologous flanking markers in the two segregating mapping populations in diploid and polyploid cottons. However, microsatellite markers are mostly genome specific and, as such, cannot be used for orthologous mapping between homoeologous chromosomes (Roder et al., 1998). Therefore, the sequenced diploid A- and D-genomes were used as bridging species in the shuttle mapping while establishing the orthologous relationship between laciniate of diploid and okra of tetraploid cottons.

The okra leaf shape locus was previously targeted to a genomic region of 337 kb containing 34 putative genes in the diploid D-genome of cotton (Andres et al., 2014). In the present study, by establishing the orthologous relationship between okra leaf shape (L^O-D1) of upland cotton and laciniate leaf shape (L^L-A2) of G. arboreum, the candidate genomic region of the leaf shape locus was narrowed to a ~108-kb region in the diploid A-genome, which contains nine putative gene sequences (Fig. 3; Table 3). Its orthologous collinear region is equivalent to 95 kb in the diploid D-genome and 103.14 kb in the G. hirsutum A-subgenome sequences (Table 2, 3; Fig. 3). The physical to genetic map ratio within the delineated A-genome region was estimated to be 86.3 kb cM−1. This ratio is less than the genomic average of 276 to 352 kb cM−1 estimated previously in G. hirsutum (Xu et al., 2008). A smaller physical size to genetic distance ratio further supports the previous observation by Andres et al. (2014) that the leaf shape locus (L-D1) is localized toward the distal region of the chromosome, a region characterized by its propensity for high recombination and gene density (Rong et al., 2004; Li et al., 2014; Wang et al., 2013; Werner et al., 1992; Gill et al., 1996). Thus, the efficiency of genetic analyses of agronomic traits in upland cotton can be improved by using the diploid species mapping and genomic resources.
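To put the local ratio in perspective, the short sketch below compares 86.3 kb cM−1 with the genome-wide range of 276 to 352 kb cM−1 and converts each value to recombination per megabase. It uses only the figures quoted above; the function names are illustrative.

```python
LOCAL_KB_PER_CM = 86.3                  # delineated A-genome region
GENOME_AVG_KB_PER_CM = (276.0, 352.0)   # G. hirsutum genomic average (range)


def fold_difference(local, reference):
    """How many times larger the reference kb/cM ratio is than the local one."""
    return reference / local


def cm_per_mb(kb_per_cm):
    """Convert a physical-to-genetic ratio (kb/cM) to recombination rate (cM/Mb)."""
    return 1000.0 / kb_per_cm


folds = [fold_difference(LOCAL_KB_PER_CM, avg) for avg in GENOME_AVG_KB_PER_CM]
print(f"genomic average is {folds[0]:.1f} to {folds[1]:.1f} times the local ratio")
print(f"local recombination rate: {cm_per_mb(LOCAL_KB_PER_CM):.1f} cM/Mb")
```

On these numbers the genomic average is roughly 3.2 to 4.1 times the local ratio, which corresponds to about 11.6 cM per Mb in the delineated region.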

Since the major leaf shape loci (L) have been mapped to homeologous regions in both cotton genomes using orthologous gene sequences in the sequenced A- and D-genomes (Fig. 3; Table 2), it is likely that the same gene is responsible for leaf shape in both genomes. However, most genes in the region of interest appear to have a gene of similar function either in tandem or in close proximity (Table 3), and it remains possible that leaf shape in the two genomes may be influenced by either or both genes of such a related pair. Of particular interest among the nine genes are two putative genes (Cotton_A_00505 and Cotton_A_00507) that code for HD-Zip transcription factor proteins. These HD-Zip transcription factors were previously implicated in leaf morphological differences (Saddic et al., 2006; Vlad et al., 2014; Sicard et al., 2014). An STS marker developed based on one of these genes showed cosegregation with leaf shape phenotype in 236 F2 plants in a previous study by Andres et al. (2014). Markers from its A-genome homoeologue Cotton_A_00507 did not show polymorphism between the parents used in the current study. Expanded efforts are currently underway to fine map the leaf shape locus L-A2 and test the role of the two candidate genes in leaf shape variation at the L locus of cotton.

Figures

Figure 1: Leaf shape phenotypes of diploid cotton (Gossypium arboreum) and tetraploid upland cotton (G. hirsutum). Top: (left) normal broad-shaped leaf of the accession NC 501, (right) laciniate leaf of the breeding line NC 505. Bottom: (left) normal leaf of accession NC11-2100, (right) okra leaf of the breeding line NC05AZ21.

Figure 2. Linkage map of the L^L-A2 gene on chromosome 2 of Gossypium arboreum and its comparative map analysis with the high-density consensus map of homoeologous chromosome 1 of the A-subgenome of upland cotton. In both maps, genetic distance in centimorgans is on the left with marker names on the right, and the tops of the maps are oriented toward the telomere.

Figure 3. Molecular mapping of the L^L-A2 gene in Gossypium arboreum and its genomic location in relation to the sequence-based physical maps of G. raimondii (2n = 2x = 26, DD) and G. arboreum (2n = 2x = 26, AA) genomes and to the tetraploid genetic map of Andres et al. (2014).

Figure S1. Leaf shape phenotypes of the G. arboreum parental accessions and their F1 hybrid at approximately 65 days after germination

Tables

Table 1. Polymorphic simple-sequence repeat (SSR) and sequence-tagged site (STS) markers used for molecular mapping of the L^L-A2 gene in diploid cotton Gossypium arboreum.

| Marker name | Marker type | Forward sequence 5'-3' | Reverse sequence 5'-3' | Allele size NC 501 (bp) | Allele size NC 505 (bp) |
|---|---|---|---|---|---|
| MON_DPL0526 | SSR | GTTCTTGGTCATGCTGGTAAGAAA | TAGCCATATCCACCTTAGCAGATT | 176 | 173 |
| HAU2936 | SSR | TGCGGGGACCAGAAAGAGAGT | TTTGTCCTGGCCACCCAAGG | 289 | 283 |
| LS-GA-24 | STS | GCAACCCATTTTCATTCCAC | TCCCTCTCATCCTCTGCAAT | 221.2 | 215.2, 225.3 |
| LS-GA-23 | STS | GTGGCACTTCACCCATTTTT | ATTCCATCAAACACGGCAAT | 225.9 | 223.6 |
| LS-GA-13 | STS | AAGGATGGTACCGGGGTAAG | TGTGGCCATCTGCTAAATCA | 227.3 | 226.1 |
| 15-LSFM-7 | STS | TCATATAGATATCGTTTTTGACTTCCT | TCCATGATTCCCAAAGACAAG | ~480 | Absent |
| BNL1693 | SSR | CCCTTGGGAATAGCAGGTG | CATGTGTCTCCGTGTGTGTGTG | 249 | 251 |
| NAU2095 | SSR | GGGACACAAACAAAACACAC | GGAACTTGAGAACTTGAAGG | 194 | 200 |
| BNL2921 | SSR | CGAGAGATTTTAAAGGGAAACA | GGGAGTGGTCTGATGGAAAA | 193 | 243 |
| CIR18 | SSR | TCAACTATCAGTCCAAT | AAAGAGACCCACAAG | 195.5, 208 | 208 |

Table 2. Simple-sequence repeat (SSR) and sequence-tagged site (STS) markers used for orthologous mapping and genomic targeting of the leaf shape gene (L^L-A2) in cotton. The SSR markers showing <90% similarity were considered to have no homologues in the physical maps of the A-subgenome of Gossypium hirsutum (Zhang et al., 2015), diploid A-genome (Li et al., 2014), and diploid D-genome (DOE Joint Genome Institute: Cotton D V2.0).

| Marker name | Marker type | Physical position in Chr.A01 NBI G. hirsutum (bp) | Physical position in Chr.02 BGI.v2 G. arboreum (bp) | Physical position in Chr.02 JGI G. raimondii (bp) | Gene no. in G. arboreum (Cotton_A_) | Gene no. in G. raimondii (Gorai.002G24) | Similarity between A and D gene (%) |
|---|---|---|---|---|---|---|---|
| DPL526 | SSR | 98,936,617–98,936,725 | 69,515,284–69,515,709 | 61,837,914–61,838,339 | n/a | n/a | n/a |
| HAU2936 | SSR | 98,513,877–98,512,979 | 69,124,435–69,124,705 | 61,400,237–61,400,507 | n/a | n/a | n/a |
| LS-GA-24 | STS | 98,067,274–98,067,372 | 68,723,133–68,723,335 | 61,010,898–61,011,100 | 00484 | 6000 | 79.9 |
| LS-GA-23 | STS | 98,027,438–98,027,643 | 68,665,556–68,665,761 | 60,989,413–60,989,618 | 00488 | 5700 | 89.6 |
| LS-GA-13† | STS | 97,874,747–97,874,961 | 68,525,694–68,525,908 | 60,892,051–60,892,259 | 00499 | 4900 | 89.6 |
| 15-LSFM-7† | STS | 97,771,156–97,771,607 | 68,413,593–68,414,044 | 60,796,820–60,797,271 | 00509 | 3800 | 96.6 |
| BNL1693 | SSR | 96,240,192–96,240,148 | Chr. 13 and 11 | 59,691,657–59,691,887 | n/a | n/a | n/a |
| NAU2095 | SSR | 95,794,416–95,794,196 | Chr. 13 | 59,354,418–59,354,593 | n/a | n/a | n/a |
| BNL2921 | SSR | 40,133,071–40,133,251 | Chr. 7 and 8 | 27,353,767–27,353,935 | n/a | n/a | n/a |
| CIR18 | SSR | Chr. A13 | Chr. 7 and 13 | Chr. 13 | n/a | n/a | n/a |

† Markers are flanking markers to the laciniate leaf shape gene in both genetic and physical maps.

Table 3. Annotation and comparative genomic analysis of putative gene sequences identified in the genomic region of the orthologous leaf shape locus (L) using sequence-based physical maps of the diploid progenitor cotton species Gossypium arboreum (BGI-CGP assembly v2.0 [annotation v1.0]; Li et al., 2014) and G. raimondii (JGI assembly v2.0 [annotation v2.1]; Paterson et al., 2012).

| A-genome gene (Cotton_A_00) | A-genome physical position, chromosome 2 (bp) | D-genome gene (Gorai.002G) | D-genome physical position, chromosome 2 (bp) | Similarity between A and D sequences (%) | Putative function |
|---|---|---|---|---|---|
| Gh565 | 68,745,887..68,746,114 | Gh565 | 61,031,975..61,032,199 | 87.5 | SSR marker |
| 482 | 68,741,409..68,745,808 | 246200 | 61,027,658..61,032,210 | 82.6 | Pectate lyase |
| 483 | 68,728,321..68,730,997 | 246100 | 61,015,799..61,018,395 | 87.7 | Carbonic anhydrase |
| 484 | 68,721,745..68,724,118 | 246000 | 61,009,256..61,011,927 | 79.9 | Carbonic anhydrase |
| 485 | 68,708,231..68,710,581 | 246000 | 61,009,256..61,011,927 | 78.9 | Carbonic anhydrase |
| 486 | 68,701,883..68,703,112 | 245900 | 61,004,759..61,006,420 | 97.2 | Aquaporin transporter |
| 487 | 68,669,424..68,670,317 | 245800 | 60,992,779..60,993,648 | 95.2 | Unknown |
| 488 | 68,664,168..68,666,437 | 245700 | 60,987,773..60,990,298 | 89.6 | Ankyrin repeat |
| 489 | 68,657,366..68,662,624 | 245600 | 60,979,680..60,986,732 | 97.5 | Kinase |
| 490 | 68,615,778..68,623,408 | 245500 | 60,950,898..60,959,619 | 95.5 | PH-D finger |
| 491 | 68,591,320..68,599,029 | 245400 | 60,934,232..60,942,403 | 92.4 | PH-D finger |
| 492 | 68,572,895..68,576,363 | 245300 | 60,918,534..60,922,736 | 95.3 | Aspartyl protease |
| 493 | 68,569,096..68,572,179 | 245200 | 60,915,361..60,918,548 | 91.3 | Isomerase |
| 494 | 68,563,933..68,565,933 | 245100 | 60,909,449..60,912,281 | 96.4 | Hydrolase |
| 495 | 68,556,349..68,563,129 | 245000 | 60,902,520..60,910,968 | 81.2 | Ubiquitin transferase |
| 496 | 68,552,461..68,552,923 | n/a | 60,899,991..60,900,421 | 81.5 | Unknown |
| 497 | 68,550,904..68,552,090 | n/a | No match on Chr02 | n/a | Unknown |
| 498 | 68,536,247..68,539,114 | n/a | 60,889,803..60,892,670 | 87.8 | Epimerase |
| 499† | 68,523,600..68,526,467 | 244900 | 60,889,583..60,893,220 | 89.6 | Epimerase |
| 500 | 68,517,140..68,519,951 | 244800 | 60,882,963..60,886,378 | 91 | Synthetase |
| 501 | 68,513,273..68,515,930 | 244700 | 60,878,611..60,881,268 | 98 | Pentatricopeptide repeat |
| 502 | 68,493,999..68,502,968 | 244500 | 60,863,949..60,866,523 | 25.8 | Exostosin |
| 503 | 68,487,614..68,491,649 | 244400 | 60,856,168..60,863,202 | 60.3 | Ser–Thr protein kinase |
| 504 | 68,483,664..68,485,162 | 244300 | 60,852,082..60,854,077 | 96.6 | Ribosomal protein L24e |
| 505 | 68,480,260..68,481,502 | 244200 | 60,848,207..60,849,955 | 94.3 | HD-Zip transcription factor |
| 506 | 68,442,940..68,444,771 | 244100 | 60,818,675..60,821,101 | 69.2 | Ser–Thr protein kinase |
| 507 | 68,432,568..68,433,521 | 244000 | 60,816,695..60,817,565 | 87.4 | HD-Zip transcription factor |
| 508 | 68,419,352..68,420,719 | 243900 | 60,802,515..60,804,472 | 93.6 | Hypoxia response |
| 509† | 68,410,422..68,415,291 | 243800 | 60,793,342..60,798,916 | 96.6 | Pyruvate kinase |
| 510 | 68,400,894..68,405,445 | 243700 | 60,784,258..60,789,349 | 93.1 | Pyruvate kinase |
| 511 | 68,389,173..68,391,283 | 243600 | 60,775,425..60,777,957 | 98.6 | Ser–Thr protein kinase |
| 512 | 68,377,517..68,379,223 | 243500 | 60,763,514..60,767,171 | 97.8 | Unknown |
| 513 | 68,351,915..68,356,673 | 243400 | 60,739,751..60,745,494 | 88.6 | Glycosyl transferase |
| 514 | 68,346,525..68,348,978 | 243300 | 60,734,924..60,737,571 | 97.8 | Pentatricopeptide repeat |
| 515 | 68,345,531..68,346,214 | 243200 | 60,733,892..60,734,923 | 98.8 | Redoxin |
| 516 | 68,327,789..68,331,092 | 243100 | 60,725,536..60,729,527 | 94.5 | Enolase |
| 517 | 68,281,323..68,284,053 | 243000 | 60,721,265..60,724,366 | 79.8 | Unknown |
| 518 | 68,274,602..68,280,195 | 242900 | 60,700,190..60,718,341 | 23.3 | Ergosterol biosynthesis |
| NAU2343 | 68,283,937..68,284,089 | NAU2343 | 60,699,180..60,699,332 | 89.2 | SSR marker |

† Putative genes used for developing genetically mapped flanking STS markers LS-GA-13 and 15-LSFM-7.

Table S1. Chromosome locations of the markers and their allele sizes on each parent used for mapping the L-A2 locus. SSR markers were based on the HDC map of Blenda et al. (2012). STS markers were designed from putative sequences based on the physical maps of G. arboreum (BGI-CGP assembly v2.0 (annotation v1.0), Li et al., 2014) and G. raimondii (JGI assembly v2.0 (annotation v2.1), Paterson et al., 2012).

| Marker type | Marker name | Physical/genetic map location | Allele size NC501 (bp) | Allele size NC505 (bp) | Polymorphism | Genotyping platform |
|---|---|---|---|---|---|---|
| STS | LS-GA-1 | Cotton_A_00507 | 219.6 | 219.7 | | ABI 3730 Sequencer |
| STS | LS-GA-2 | Cotton_A_00507 | 228.3 | 228.2 | | ABI 3730 Sequencer |
| STS | LS-GA-3 | Cotton_A_00507 | 209.4 | 209 | | ABI 3730 Sequencer |
| STS | LS-GA-4 | Cotton_A_00506 | 217.5 | 217 | | ABI 3730 Sequencer |
| STS | LS-GA-5 | Cotton_A_00506 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-6 | Cotton_A_00506 | 218.9 | 218.9 | | ABI 3730 Sequencer |
| STS | LS-GA-7 | Cotton_A_00505 | 223.6 | 223.8 | | ABI 3730 Sequencer |
| STS | LS-GA-8 | Cotton_A_00505 | 212 | 212 | | ABI 3730 Sequencer |
| STS | LS-GA-9 | Cotton_A_00505 | 218.1 | 217.9 | | ABI 3730 Sequencer |
| STS | LS-GA-10 | Cotton_A_00499 | 216 | 216 | | ABI 3730 Sequencer |
| STS | LS-GA-11 | Cotton_A_00499 | 215.3 | 215.5 | | ABI 3730 Sequencer |
| STS | LS-GA-12 | Cotton_A_00499 | multiple | multiple | | ABI 3730 Sequencer |
| STS | LS-GA-13 | Cotton_A_00499 | 227.3 | 226.1 | Co-dominant | ABI 3730 Sequencer |
| STS | LS-GA-14 | Cotton_A_00492 | 222.2 | 222.1 | | ABI 3730 Sequencer |
| STS | LS-GA-15 | Cotton_A_00492 | 222.4 | 222.4 | | ABI 3730 Sequencer |
| STS | LS-GA-16 | Cotton_A_00492 | 219.1 | 219.1 | | ABI 3730 Sequencer |
| STS | LS-GA-17 | Cotton_A_00492 | 220.7 | 220.5 | | ABI 3730 Sequencer |
| STS | LS-GA-18 | Cotton_A_00491 | 223.2 | 223.2 | | ABI 3730 Sequencer |
| STS | LS-GA-19 | Cotton_A_00491 | 215 | 215.2 | | ABI 3730 Sequencer |
| STS | LS-GA-20 | Cotton_A_00491 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-21 | Cotton_A_00491 | 219 | 219.1 | | ABI 3730 Sequencer |
| STS | LS-GA-22 | Cotton_A_00488 | 215.4 | 215.3 | | ABI 3730 Sequencer |
| STS | LS-GA-23 | Cotton_A_00488 | 225.9 | 223.6 | Co-dominant | ABI 3730 Sequencer |
| STS | LS-GA-24 | Cotton_A_00484 | 221.2 | 215.2, 225.3 | Co-dominant | ABI 3730 Sequencer |
| STS | LS-GA-25 | Cotton_A_00484 | 207.7 | 207.7 | | ABI 3730 Sequencer |
| STS | LS-GA-26 | Cotton_A_00484 | 212.8 | 213.6 | | ABI 3730 Sequencer |
| STS | LS-GA-27 | Cotton_A_00510 | 225 | 224.7 | | ABI 3730 Sequencer |
| STS | LS-GA-28 | Cotton_A_00510 | 215 | 214.9 | | ABI 3730 Sequencer |
| STS | LS-GA-29 | Cotton_A_00513 | 214.7 | 214.4 | | ABI 3730 Sequencer |
| STS | LS-GA-30 | Cotton_A_00513 | 211.1 | 211 | | ABI 3730 Sequencer |
| STS | LS-GA-31 | Cotton_A_00513 | 226.2 | 226.1 | | ABI 3730 Sequencer |
| STS | LS-GA-32 | Cotton_A_00513 | No amplification | No amplification | | ABI 3730 Sequencer |
| STS | LS-GA-33 | Cotton_A_00517 | 215 | 215.1 | | ABI 3730 Sequencer |
| STS | LS-GA-34 | Cotton_A_00517 | 224.4 | 224.3 | | ABI 3730 Sequencer |
| STS | LS-GA-35 | Cotton_A_00524 | 219.7 | 219.8 | | ABI 3730 Sequencer |
| STS | LS-GA-36 | Cotton_A_00524 | 221.1 | 221.3 | | ABI 3730 Sequencer |
| STS | LS-GA-37 | Cotton_A_00524 | 217.4 | 217.4 | | ABI 3730 Sequencer |
| STS | LS-GA-38 | Cotton_A_00528 | 216.7 | 216.7 | | ABI 3730 Sequencer |
| STS | LS-GA-39 | Cotton_A_00528 | 220.6 | 220.5 | | ABI 3730 Sequencer |
| STS | LS-GA-40 | Cotton_A_00528 | 209.7 | 209.6 | | ABI 3730 Sequencer |
| STS | LS-GA-41 | Cotton_A_00508 | 220 | 220 | | ABI 3730 Sequencer |
| STS | LS-GA-42 | Cotton_A_00508 | 226 | 226 | | ABI 3730 Sequencer |
| STS | LS-GA-43 | Cotton_A_00509 | 220 | 220 | | ABI 3730 Sequencer |
| STS | LS-GA-44 | Cotton_A_00509 | 225 | 225 | | ABI 3730 Sequencer |
| STS | LS-GA-45 | Cotton_A_00511 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-46 | Cotton_A_00511 | 217 | 217 | | ABI 3730 Sequencer |
| STS | LS-GA-47 | Cotton_A_00511 | 217 | 217 | | ABI 3730 Sequencer |
| STS | LS-GA-48 | Cotton_A_00514 | 217 | 217 | | ABI 3730 Sequencer |
| STS | LS-GA-49 | Cotton_A_00514 | 230 | 230 | | ABI 3730 Sequencer |
| STS | LS-GA-50 | Cotton_A_00514 | 223 | 223 | | ABI 3730 Sequencer |
| STS | LS-GA-51 | Cotton_A_00515 | 218 | 218 | | ABI 3730 Sequencer |
| STS | LS-GA-52 | Cotton_A_00515 | 217 | 217 | | ABI 3730 Sequencer |
| STS | LS-GA-53 | Cotton_A_00516 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-54 | Cotton_A_00516 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-55 | Cotton_A_00518 | 219 | 219 | | ABI 3730 Sequencer |
| STS | LS-GA-56 | Cotton_A_00519 | 234 | 234 | | ABI 3730 Sequencer |
| STS | LS-GA-57 | Cotton_A_00519 | 234 | 234 | | ABI 3730 Sequencer |
| STS | LS-GA-58 | Cotton_A_00520 | 218 | 218 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-144 | Cotton_A_00505/Gorai.002G244200 | 237 | 237 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-147 | Cotton_A_00505/Gorai.002G244200 | 165 | 165 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-178 | Cotton_A_00505/Gorai.002G244200 | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-179 | Cotton_A_00505/Gorai.002G244200 | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-187 | Cotton_A_00505/Gorai.002G244200 | 177 | 177 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-170 | Cotton_A_00505/Gorai.002G244200 | 255 | 275 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-171 | Cotton_A_00505/Gorai.002G244200 | 224 | 224 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-172 | Cotton_A_00505/Gorai.002G244200 | 216 | 216 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-173 | Cotton_A_00505/Gorai.002G244200 | 235 | 235 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-188 | Cotton_A_00505/Gorai.002G244200 | 227 | 227 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-148 | Cotton_A_00505/Gorai.002G244200 | 253 | 251 | Co-dominant | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-149 | Cotton_A_00505/Gorai.002G244200 | 243 | 242 | Co-dominant | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-192 | Cotton_A_00505/Gorai.002G244200 | 170 | 164 | Co-dominant | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-191 | Cotton_A_00505/Gorai.002G244200 | 123 | 123 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-193 | Cotton_A_00505/Gorai.002G244200 | 262 | 262 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-190 | Cotton_A_00505/Gorai.002G244200 | 207 | 207 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-174 | Cotton_A_00505/Gorai.002G244200 | 226 | 226 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-175 | Cotton_A_00505/Gorai.002G244200 | 248 | 248 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-194 | Cotton_A_00505/Gorai.002G244200 | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-176 | Cotton_A_00507/Gorai.002G244000 | 219 | 219 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-151 | Cotton_A_00507/Gorai.002G244000 | 220 | 220 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-150 | Cotton_A_00507/Gorai.002G244000 | 224 | 224 | | ABI 3730 Sequencer |
| STS-candidate genes | 150F-177R | Cotton_A_00507/Gorai.002G244000 | 378 | 378 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-177R | Cotton_A_00507/Gorai.002G244000 | 187 | 187 | | ABI 3730 Sequencer |
| STS-candidate genes | 13-LS-195 | Cotton_A_00507/Gorai.002G244000 | 180 | 180 | | ABI 3730 Sequencer |
| STS-candidate genes | 15-LSFM-1 | Cotton_A_00503/Gorai.002G244400 | No amplification | No amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-2 | Cotton_A_00503/Gorai.002G244400 | No clear amplification | No clear amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-3 | Cotton_A_00503/Gorai.002G244400 | No amplification | No amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-4 | Cotton_A_00503/Gorai.002G244400 | No amplification | No amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-5 | Cotton_A_00508/Gorai.002G243900 | ~480 | ~480 | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-6 | Cotton_A_00508/Gorai.002G243900 | No clear amplification | No clear amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-7 | Cotton_A_00509/Gorai.002G243800 | ~470 | Absent | Dominant | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-8 | Cotton_A_00509/Gorai.002G243800 | ~500 | ~500 | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-9 | Cotton_A_00513/Gorai.002G243400 | No amplification | No amplification | | 1% Agarose Gel |
| STS-candidate genes | 15-LSFM-10 | Cotton_A_00513/Gorai.002G243400 | No amplification | No amplification | | 1% Agarose Gel |
| SSR-chm1 HDC | CIR9 | Blenda's map | 248 | 236 | Co-dominant (not a good marker) | ABI 3730 Sequencer |
| SSR-chm1 HDC | CIR199 | Blenda's map | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| SSR-chm1 HDC | CIR94 | Blenda's map | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL2921 | Blenda's map | 193 | 243 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | CIR18 | Blenda's map | 195.5, 208 | 208 | Dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL3778 | Blenda's map | 147 | 145 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL1350 | Blenda's map | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL3090 | Blenda's map | 259 | 239 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL3888 | Blenda's map | 209 | 256 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL3848 | Blenda's map | No clear amplification | 206 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | CIR114 | Blenda's map | 367 | 365 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | TMB1439 | Blenda's map | ~850 | ~900 | Co-dominant | 1% Agarose Gel |
| SSR-chm1 HDC | MUCS164 | Blenda's map | ~600 | ~600 | | 1% Agarose Gel |
| SSR-chm1 HDC | CIR199 | Blenda's map | 118, 122 | 118 | Dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | MUSB564 | Blenda's map | ~400 | ~400 | | 1% Agarose Gel |
| SSR-chm1 HDC | NAU3690 | Blenda's map | 148 | 148 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU2474 | Blenda's map | 205 | 205 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU2469 | Blenda's map | 316 | 316 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | MON_DPL526 | Blenda's map | 176 | 173 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU2095 | Blenda's map | 194 | 200 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL1693 | Blenda's map | 249 | 251 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | BNL3886 | Blenda's map | 214, 216 | 214, 216 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | Gh641 | Blenda's map | 94 | 94 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU3433 | Blenda's map | 222 | 221 | Co-dominant | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU2741 | Blenda's map | 260 | 260 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU5411 | Blenda's map | 265 | 265 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU1417 | Blenda's map | 163 | 163 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU4095 | Blenda's map | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU3018 | Blenda's map | 343 | 343 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | NAU3254 | Blenda's map | 296 | 296 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | MON_DPL736 | Blenda's map | 154 | 154 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | MON_CGR5282 | Blenda's map | 167 | 167 | | ABI 3730 Sequencer |
| SSR-chm1 HDC | MON_DPL236 | Blenda's map | 174 | 174 | | ABI 3730 Sequencer |
| SSR-physical map | MGHES59 | HDC map Chrm 15/55,983,628 | 180 | 180 | | ABI 3730 Sequencer |
| SSR-physical map | NAU2814 | HDC map Chrm 15/70,613,819 | 256 | 271 | Co-dominant | ABI 3730 Sequencer |
| SSR-physical map | JESPR152 | HDC map Chrm 15/68,250,518 | 130 | No clear amplification | | ABI 3730 Sequencer |
| SSR-physical map | Gh565 | HDC map Chrm 15/68,746,094 | 119, 129 | No clear amplification | | ABI 3730 Sequencer |
| SSR-physical map | BNL2440 | HDC map Chrm 15/69,303,024 | 204 | 204 | | ABI 3730 Sequencer |
| SSR-physical map | MUCS152 | HDC map Chrm 15/69,661,807 | 193 | 193 | | ABI 3730 Sequencer |
| SSR-physical map | NAU2343 | HDC map Chrm 15/68,284,179 | 265 | 265 | | ABI 3730 Sequencer |
| SSR-physical map | DPL0318 | HDC map Chrm 15/68,486,297 | 218 | 218 | | ABI 3730 Sequencer |
| SSR-physical map | MGHES32 | HDC map Chrm 15/66,255,712 | 186 | 189 | Co-dominant | ABI 3730 Sequencer |
| SSR-physical map | HAU3132 | HDC map Chrm 15/68,992,809 | 188 | 188 | | ABI 3730 Sequencer |
| SSR-physical map | HAU2936 | HDC map Chrm 15/69,124,435 | 289 | 283 | Co-dominant | ABI 3730 Sequencer |
| SSR-physical map | HAU080 | HDC map Chrm 15/69,387,420 | 240, 248 | 240, 248 | | ABI 3730 Sequencer |
| SSR-physical map | NAU5138 | HDC map Chrm 15/70,237,075 | 203 | 203 | | ABI 3730 Sequencer |
| SSR-physical map | NAU2437 | HDC map Chrm 15/70,415,347 | 245 | 253 | Co-dominant | ABI 3730 Sequencer |
| SSR-physical map | MUCS164 | HDC map Chrm 15/70,238,145 | No amplification | No amplification | | ABI 3730 Sequencer |
| SSR-physical map | CIR9 | HDC map Chrm 15/70,262,063 | 248 | 235 | Co-dominant | ABI 3730 Sequencer |
| SSR-physical map | TMB1910 | HDC map Chrm 15/70,365,233 | 223 | 223 | | ABI 3730 Sequencer |
| SSR-physical map | MGHES42 | HDC map Chrm 15/66,293,397 | 219 | 219 | | ABI 3730 Sequencer |
| SSR-physical map | NAU3815 | HDC map Chrm 15/66,293,754 | 189 | 189 | | ABI 3730 Sequencer |
| SSR-physical map | NAU5302 | HDC map Chrm 15/66,293,254 | 250 | 250 | | ABI 3730 Sequencer |
| SSR-physical map | HAU2398 | HDC map Chrm 15/66,255,579 | 201 | 201 | | ABI 3730 Sequencer |
| SSR-physical map | MON_CGR5326 | HDC map Chrm 15/66,255,781 | No clear amplification | No clear amplification | | ABI 3730 Sequencer |
| SSR-physical map | MON_CGR5902 | HDC map Chrm 15/66,255,668 | No clear amplification | No clear amplification | | ABI 3730 Sequencer |

References

Andres, R.J., D.T. Bowman, B. Kaur, and V. Kuraparthy. 2014. Mapping and genomic targeting of the major leaf shape gene (L) in upland cotton (Gossypium hirsutum L.). Theor. Appl. Genet. 127:167–177. doi:10.1007/s00122-013-2208-4

Blenda, A., D.D. Fang, J.F. Rami, O. Garsmeur, F. Luo, and J.M. Lacape. 2012. A high-density consensus genetic map of tetraploid cotton that integrates multiple component maps through molecular marker redundancy check. PLoS ONE 7:e45739. doi:10.1371/journal.pone.0045739

Chen, Y., Y. Wang, T. Zhao, J. Yang, S. Feng, W. Nazeer, T. Zhang, and B. Zhou. 2015. A new synthetic amphiploid (AADDAA) between Gossypium hirsutum and G. arboreum lays the foundation for transferring resistances to verticillium and drought. PLoS ONE 10:e0128981. doi:10.1371/journal.pone.0128981

Dolan, L., and R.S. Poethig. 1991. Genetic analysis of leaf development in cotton. Development 1:39–46.

Endrizzi, J.E., and R. Stein. 1975. Association of two marker loci with chromosome 1 in cotton. J. Hered. 66:75–78.

Feuillet, C., S. Travella, N. Stein, L. Albar, A. Nublat, and B. Keller. 2003. Map-based isolation of the leaf rust disease resistance gene Lr10 from the hexaploid wheat (Triticum aestivum L.) genome. Proc. Natl. Acad. Sci. USA 100:15253–15258. doi:10.1073/pnas.2435133100

Fryxell, P.A. 1979. The natural history of the cotton tribe. Texas A&M Univ. Press, College Station, TX.

Gill, K.S., B.S. Gill, T.R. Endo, and T. Taylor. 1996. Identification and high-density mapping of gene-rich regions in chromosome group 1 of wheat. Genetics 144:1883–1891.

Green, J.M. 1953. Sub-okra, a new leaf shape in upland cotton. J. Hered. 44:229–232.

Hammond, D. 1941. The expression of genes for leaf shape in Gossypium hirsutum L. and Gossypium arboreum L. II. The expression of genes for leaf shape in Gossypium hirsutum L. Am. J. Bot. 28:138–150. doi:10.2307/2436937

He, J.X., and Z.L. Liang. 1989. Embryological studies on interspecific cross between Gossypium arboream L. and G. davidsonii Kellog. Acta Genet. Sin. (2006) 16:256–262.

Huang, L., S.A. Brooks, W. Li, J.P. Fellers, H.N. Trick, and B.S. Gill. 2003. Map-based cloning of leaf rust resistance gene Lr21 from the large and polyploid genome of bread wheat. Genetics 164:655–664.

Hutchinson, J.B. 1934. The genetics of cotton. J. Genet. 28:437–513. doi:10.1007/BF02981765

Hutchinson, J.B., R.A. Silow, and S.G. Stephens. 1947. The evolution of Gossypium. Oxford Univ. Press, London.

Jiang, C.X., R.J. Wright, S.S. Woo, T.A. DelMonte, and A.H. Paterson. 2000. QTL analysis of leaf morphology in tetraploid Gossypium (cotton). Theor. Appl. Genet. 100:409–418. doi:10.1007/s001220050054

Jones, J.E. 1982. The present state of the art and science of cotton breeding for leaf-morphological types. Proc. Beltwide Cotton Production Res. Conf. The National Cotton Council of America. Memphis, TN. p. 93–99.

Li, F., G. Fan, K. Wang, F. Sun, Y. Yuan, G. Song, Q. Li, Z. Ma, C. Lu, C. Zou, W. Chen, X. Liang, H. Shang, W. Liu, C. Shi, G. Xiao, C. Gou, W. Ye, X. Xu, X. Zhang, H. Wei, Z. Li, G. Zhang, J. Wang, K. Liu, R.J. Kohel, R.G. Percy, J.Z. Yu, Y.X. Zhu, J. Wang, and S. Yu. 2014. Genome sequence of the cultivated cotton Gossypium arboreum. Nat. Genet. 46:567–572. doi:10.1038/ng.2987

Li, H., J. Luo, J.K. Hemphill, and J.T. Wang. 2001. A rapid and high yielding DNA miniprep for cotton (Gossypium spp.). Plant Mol. Biol. Rep. 19:183. doi:10.1007/BF02772162

Lu, C., C. Zou, Y. Zhang, D. Yu, H. Cheng, P. Jiang, W. Yang, Q. Wang, X. Feng, M.A. Prosper, X. Guo, and G. Song. 2015. Development of chromosome-specific markers with high polymorphism for allotetraploid cotton based on genome-wide characterization of simple sequence repeats in diploid cottons (Gossypium arboreum L. and Gossypium raimondii Ulbrich). BMC Genomics 16:55. doi:10.1186/s12864-015-1265-2

Ma, X.X., B.L. Zhou, Y.H. Lu, W.Z. Guo, and T.Z. Zhang. 2008. Simple sequence repeat genetic linkage maps of A-genome diploid cotton (Gossypium arboreum). J. Integr. Plant Biol. 50:491–502. doi:10.1111/j.1744-7909.2008.00636.x

Meredith, W.R. 1984. Influence of leaf morphology on lint yield of cotton-enhancement by the sub okra trait. Crop Sci. 24:855–857. doi:10.2135/cropsci1984.0011183X002400050007x

Paterson, A.H., J.F. Wendel, H. Gundlach, H. Guo, J. Jenkins, D. Jin, D. Llewellyn, K.C. Showmaker, S. Shu, J. Udall, M. Yoo, R. Byers, W. Chen, A. Doron-Faigenboim, M.V. Duke, L. Gong, J. Grimwood, C. Grover, K. Grupp, G. Hu, T. Lee, J. Li, L. Lin, T. Liu, B.S. Marler, J.T. Page, A.W. Roberts, E. Romanel, W.S. Sanders, E. Szadkowski, X. Tan, H. Tang, C. Xu, J. Wang, Z. Wang, D. Zhang, L. Zhang, H. Ashrafi, F. Bedon, J.E. Bowers, C.L. Brubaker, P.W. Chee, S. Das, A.R. Gingle, C.H. Haigler, D. Harker, L.V. Hoffmann, R. Hovav, D.C. Jones, C. Lemke, S. Mansoor, M. Rahman, L.N. Rainville, A. Rambani, U.K. Reddy, J. Rong, Y. Saranga, B.E. Scheffler, J.A. Scheffler, D.M. Stelly, B.A. Triplett, A.V. Deynze, M.F.S. Vaslin, V.N. Waghmare, S.A. Walford, R.J. Wright, E.A. Zaki, T. Zhang, E.S. Dennis, K.F.X. Mayer, D.G. Peterson, D.S. Rokhsar, X. Wang, and J. Schmutz. 2012. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492:423–427. doi:10.1038/nature11798

Roder, M.S., V. Korzun, K. Wendehake, J. Plaschke, M.H. Tixier, P. Leroy, and M.W. Ganal. 1998. A microsatellite map of wheat. Genetics 149:2007–2023.

Rong, J., C. Abbey, J.E. Bowers, C.L. Brubaker, C. Chang, P.W. Chee, T.A. Delmonte, X. Ding, J.J. Garza, B.S. Marler, C. Park, G.J. Pierce, K.M. Rainey, V.K. Rastogi, S.R. Schulze, N.L. Trolinder, J.F. Wendel, T.A. Wilkins, T.D. Williams-Coplin, R.A. Wing, R.J. Wright, X. Zhao, L. Zhu, and A.H. Paterson. 2004. A 3347-locus genetic recombination map of sequence-tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics 166:389–417. doi:10.1534/genetics.166.1.389

Saddic, L.A., B. Huvermann, S. Bezhani, Y. Su, C.M. Winter, C.S. Kwon, R.P. Collum, and D. Wagner. 2006. The LEAFY target LMI1 is a meristem identity regulator and acts together with LEAFY to regulate expression of CAULIFLOWER. Development 133:1673–1682. doi:10.1242/dev.02331

Saintenac, C., W. Zhang, A. Salcedo, M.N. Rouse, H.N. Trick, E. Akhunov, and J. Dubcovsky. 2013. Identification of wheat gene Sr35 that confers resistance to Ug99 stem rust race group. Science 341:783–786. doi:10.1126/science.1239022

Saunders, J.H. 1961. The wild species of Gossypium. Oxford Univ. Press, London.

Schuelke, M. 2000. An economic method for the fluorescent labeling of PCR fragments. Nat. Biotechnol. 18:233–234. doi:10.1038/72708

Sicard, A., A. Thamm, C. Marona, Y.W. Lee, V. Wahl, J.R. Stinchcombe, S.L. Wright, C. Kappel, and M. Lenhard. 2014. Repeated evolutionary changes of leaf morphology caused by mutations to a homeobox gene. Curr. Biol. 24:1880–1886. doi:10.1016/j.cub.2014.06.061

SoftGenetics. 2013. GeneMarker genotyping software. Release 2.6.0. SoftGenetics LLC, State College, PA.

Song, J., J.M. Bradeen, S.K. Naess, J.A. Raasch, S.M. Wielgus, G.T. Haberlach, J. Liu, H. Kuang, S. Austin-Phillips, C.R. Buell, J.P. Helgeson, and J. Jiang. 2003. Gene RB cloned from Solanum bulbocastanum confers broad spectrum resistance to potato late blight. Proc. Natl. Acad. Sci. USA 100:9128–9133. doi:10.1073/pnas.1533501100

Ulloa, M. 2014. The diploid D genome cottons (Gossypium spp.) of the New World. In: I. Abdurakhmonov, editor, World cotton germplasm resources. InTech, Rijeka, Croatia.

Van Ooijen, J.W. 2006. JoinMap 4, software for the calculation of genetic linkage maps in experimental populations. Kyazma B.V., Wageningen, Netherlands.

Vlad, D., D. Kierzkowski, M.I. Rast, F. Vuolo, R. Dello Ioio, C. Galinha, X. Gan, M. Hajheidari, A. Hay, R.S. Smith, P. Huijser, C.D. Bailey, and M. Tsiantis. 2014. Leaf shape evolution through duplication, regulatory diversification, and loss of a homeobox gene. Science 343:780–783. doi:10.1126/science.1248384

Wang, Z., D. Zhang, X. Wang, X. Tan, H. Guo, and A.H. Paterson. 2013. A whole genome DNA marker map for cotton based on the D-genome sequence of Gossypium raimondii L. G3: Genes, Genomes, Genet. 3:1759–1767.

Wendel, J.F., C. Brubaker, I. Alvarez, R. Cronn, and J.M. Stewart. 2009. Evolution and natural history of the cotton genus. In: A.H. Paterson, editor, Genetics and genomics of cotton. Springer, New York. p. 3–22.

Wendel, J.F., C. Brubaker, and T. Seelanan. 2010. The origin and evolution of Gossypium. In: J.M. Stewart, D.M. Oosterhuis, J.J. Heitholt, and J.R. Mauney, editors, Physiology of cotton. Springer, Netherlands. p. 1–18.

Wendel, J.F., and R.C. Cronn. 2003. Polyploidy and the evolutionary history of cotton. Adv. Agron. 78:139–186. doi:10.1016/S0065-2113(02)78004-8

Wendel, J.F., and C.E. Grover. 2015. Taxonomy and evolution of the cotton genus. In: D. Fang and R. Percy, editors, Cotton, Agronomy Monograph 57. ASA, CSSA, and SSSA, Madison, WI. doi:10.2134/agronmonogr57.2013.0020

Werner, J.E., T.R. Endo, and B.S. Gill. 1992. Toward a cytogenetically based physical map of the wheat genome. Proc. Natl. Acad. Sci. USA 89:11307–11311. doi:10.1073/pnas.89.23.11307

White, T.G., and J.E. Endrizzi. 1965. Tests for the association of marker loci with chromosomes in Gossypium hirsutum by the use of aneuploids. Genetics 51:605–612.

Xu, Z., R.J. Kohel, G. Song, J. Cho, J. Yu, S. Yu, J. Tomkins, and J.Z. Yu. 2008. An integrated genetic and physical map of homoeologous chromosomes 12 and 26 in upland cotton (G. hirsutum L.). BMC Genomics 9:108. doi:10.1186/1471-2164-9-108

Yahiaoui, N., P. Srichumpa, R. Dudler, and B. Keller. 2004. Genome analysis at different ploidy levels allows cloning of the powdery mildew resistance gene Pm3b from hexaploid wheat. Plant J. 37:528–538. doi:10.1046/j.1365-313X.2003.01977.x

Yan, L., A. Loukoianov, A. Blechl, G. Tranquilli, W. Ramakrishna, P. SanMiguel, J.L. Bennetzen, V. Echenique, and J. Dubcovsky. 2004. The wheat VRN2 gene is a flowering repressor down-regulated by vernalization. Science 303:1640–1644. doi:10.1126/science.1094305

Yan, L., A. Loukoianov, G. Tranquilli, M. Helguera, T. Fahima, and J. Dubcovsky. 2003. Positional cloning of wheat vernalization gene VRN1. Proc. Natl. Acad. Sci. USA 100:6263–6268. doi:10.1073/pnas.0937399100

Zhang, T., Y. Hu, W. Jiang, L. Fang, X. Guan, J. Chen, et al. 2015. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat. Biotechnol. 33:531–537. doi:10.1038/nbt.3207

CHAPTER 4: Assessment of genetic diversity and population structure in the tropical landrace accessions of Gossypium hirsutum L.

Abstract

In this study, genetic diversity and population structure were assessed in a set of 185 Gossypium hirsutum L. landrace accessions, collected mainly from Central America during the mid-1900s, using genome-wide simple sequence repeat (SSR) markers. Genotyping the diversity panel using 122 SSRs detected 143 marker loci. A total of 819 alleles were identified across the 143 marker loci, and 23.3% of these were unique alleles observed in only one accession. The average genetic distance between accessions was 0.36, suggesting a high level of genetic variation in the tropical landrace germplasm of cotton. Using Bayesian model-based structure analysis, five major sub-groups were identified, which roughly corresponded to the geographical origins of the accessions. Substantial admixture was observed, as accessions from different geographical locations were grouped together. Results from phylogenetic analysis, Principal Component Analysis (PCA), and Analysis of Molecular Variance (AMOVA) supported the clustering based on STRUCTURE analysis. Pairwise kinship estimates suggested that most of the accessions were unrelated. Finally, core sets representing various levels of allelic richness were identified using POWERMARKER. Assessing genetic diversity and population structure and identifying core sets in the landraces will facilitate the utilization of unexploited tropical genetic diversity in developing improved cotton cultivars.

Introduction

Cotton is the world’s most important natural fiber crop. Cottonseed oil ranks behind soybean and peanuts in total domestic fat and oil production in the United States (Oil Crops Yearbook, 2016). Cotton, a member of the genus Gossypium, displays wide geographical distribution and remarkable phenotypic variation. The genus Gossypium comprises more than 50 species, which are distributed throughout the tropical and subtropical regions of Asia, Africa, the Americas, and Australia. These species have evolved at the diploid (2n = 2x = 26) and tetraploid (2n = 4x = 52) levels and are classified into nine genomes with the designations AD, A, B, C, D, E, F, G, and K (Wendel et al., 1992; Percival et al., 1999). Approximately 1.5 million years ago, hybridization between the diploid species G. herbaceum L. (A1 genome) and G. raimondii Ulb. (D5 genome), followed by diploidization, led to the formation of six allotetraploid cotton species (Wendel et al., 1992). Of these, two allotetraploid species, G. hirsutum (Upland cotton) and G. barbadense (Pima cotton or Sea Island cotton), are cultivated and contribute more than 97% of world cotton production. Three species, G. mustelinum Miers ex Watt, G. darwinii Watt, and G. tomentosum Nutt. ex Seem., are indigenous to Northeast Brazil, the Galapagos Islands, and the Hawaiian Islands, respectively. They are wild species and are not grown for commercial production (Wendel and Percy, 1990; Wendel et al., 1994; Hawkins et al., 2005). Gossypium ekmanianum Wittmack, reported recently, is endemic to the Dominican Republic (Grover et al., 2015). Gossypium arboreum L. and G. herbaceum are the only cultivated diploid species; they are grown mainly in Southern Asia and contribute less than 3% of the world's cotton (Zhang et al., 2008). The Mexico–Guatemala region is considered the center of origin and diversity for G. hirsutum (Hutchinson et al., 1947; Brubaker et al., 1999). Wild accessions and landraces endemic to this region are photoperiod sensitive and flower under short day length conditions (Abdurakhmonov et al., 2008).

The Gossypium genus shows spectacular diversity in plant growth habit, color, leaf shape, maturity, flowering time, resistance to biotic and environmental stresses, fiber length, and fiber quality (Brubaker et al., 1999; Abdurakhmonov et al., 2008). However, Upland cotton (G. hirsutum) has a narrow genetic base because of polyploidy and domestication bottlenecks (Wendel et al., 1992). Most of the cultivated germplasm in the United States has been derived from a limited stock of lines, which were converted to photoperiod-insensitive lines so that they could flower under the long summer days of North America (Smith et al., 1999). In addition to the initial bottleneck due to domestication and selection, further breeding efforts involved stringent selection from this small set of lines. Each breeding program has shaped germplasm adapted to the local climate, placing emphasis on specific traits (Wendel et al., 1992; Brubaker et al., 1999). This limited genetic base could seriously constrain continued genetic gain. Further, germplasm with a narrow genetic base is vulnerable to climate change and to rapidly evolving pests and diseases.

To improve and maintain genetic gain in cotton breeding programs and to achieve sustainable crop production, it is important to maintain genetic diversity in the cultivated germplasm (Van Esbroeck and Bowman, 1998). Studies suggest that the level of diversity is higher in landraces and wild accessions collected from Central America (Abdurakhmonov et al., 2008). Breeding programs can utilize the natural diversity of wild and unadapted germplasm collected from the centers of diversity (Wallace et al., 2009; Campbell et al., 2010). Wild relatives are hardier and contain genes for high levels of tolerance to biotic and abiotic stresses (Percival et al., 1999; McCarty and Percy, 2001). However, the photoperiod sensitivity of these lines has been a barrier to introgressing beneficial alleles from tropical germplasm into cultivated Upland cotton germplasm (Abdurakhmonov et al., 2008; Wallace et al., 2009). With advances in molecular biology and the availability of a range of molecular markers, gene introgression from wild germplasm can be streamlined to make transfers more efficient and effective in cotton breeding. In addition to providing useful genes, the introduction of new variability will ensure a broader genetic base among elite cultivar germplasm and hence provide better protection from epidemics. Assessing genetic diversity and population structure is an important first step toward utilizing the useful variability present in wild accessions and landraces (Carvalho et al., 2004; Warburton et al., 2006).

There are numerous ways to assess genetic diversity, including pedigree information, morphological markers, biochemical or isozyme markers, and DNA-based markers. Apart from the region of original collection, pedigree information is not available for wild and landrace accessions, and it is generally assumed that these lines are unrelated. Such assumptions can lead to overestimation of diversity (Bowman et al., 1996). Morphological and biochemical markers are limited in number and less polymorphic. DNA-based markers overcome these limitations. A wide range of genetic markers is available today for crop plant genotyping (Van Becelaere et al., 2005; Multani and Lyon, 1995; Rahman et al., 2008; Abdalla et al., 2001). However, microsatellites, or simple sequence repeats (SSRs), are the markers of choice for genetic diversity studies because of their abundance, co-dominance, high polymorphism index, and relatively low cost of application in most breeding programs (Powell et al., 1996).

Several reports are available in which markers have been used to estimate genetic diversity in cotton. Most of these include cultivated lines or accessions from a specific breeding program (Campbell et al., 2009; Zhang et al., 2005; Kalivas et al., 2011; Bertini et al., 2006). Very few studies have estimated genetic diversity in diverse collections of Upland cotton, landraces, and wild germplasm collected from the centers of diversity (Lacape et al., 2007; Abdurakhmonov et al., 2008; Fang et al., 2013; Hinze et al., 2016). The current study was designed to assess diversity and population structure in a sub-set of lines forming part of a large landrace collection from Central America using capillary-based gel electrophoresis, which has higher resolution in identifying informative marker alleles. The objectives of the current study were to: 1) assess genetic diversity in tropical landraces of G. hirsutum; 2) analyze the population structure and kinship in the diversity panel; and 3) identify genetically diverse core sets of lines to help the utilization of the tropical gene pool in cotton breeding.

Materials and Methods

Plant Material: For this study, a sub-set of 185 tropical wild collections and landrace accessions of G. hirsutum was randomly selected from the cotton germplasm collection. These accessions are mostly photoperiod sensitive and were collected between 1946 and 1989. The main collectors of this germplasm were TR Richmond, JO Ware, CW Manning, S Stephens, and A Percival (GRIN, http://www.ars-grin.gov/). The majority of accessions were from Central America, and a few were reported to have been collected from Uzbekistan, the Philippines, and Sudan. As most of these accessions are native to tropical conditions, they do not flower under long-day conditions during summers in the United States. Because of their photoperiod sensitivity, their integration into cultivated U.S. germplasm has been restricted, and limited information is available about their pedigree and classification. A summary of the accessions, their origin, and classification is given in Table S1. Seeds of these accessions were obtained from the US National Cotton Germplasm Collection, USDA-ARS, College Station, TX, USA. The accessions were selfed for two consecutive seasons in Mexico during the winters of 2010-2011 and 2011-2012 to reduce residual heterozygosity.

Genotyping Studies: Single plants were grown in the greenhouse during the winter of 2013. Leaf tissue was used for DNA isolation. DNA was extracted using a modified small-scale miniprep extraction protocol reported by Li et al. (2001). A NanoDrop 2000 UV–Vis spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) was used to estimate DNA quantity and quality. Samples were diluted to a concentration of 20 ng/µl for the PCR amplification reactions. One hundred thirty-five SSR markers obtained from CottonGen (http://www.cottongen.org) were used to study the diversity among these accessions. Markers were selected to ensure uniform distribution across the genome, covering all of the chromosomes. Table S2 lists the SSR markers along with their repeat sequences and chromosome locations. A final volume of 6 µl per reaction was used for the polymerase chain reactions (PCR). Each reaction included 20 ng of genomic DNA, 1X reaction buffer with 7.5 mM MgCl2, 0.24 mM dNTPs, 0.5 units of Taq DNA polymerase, 0.48 µM forward primer, 3.6 µM reverse primer, and 3.6 µM M13 primer labeled with either HEX (hexachlorofluorescein) or 6-FAM (6-carboxyfluorescein) fluorescent tags. A touchdown program was used to amplify all the primers, starting with a 5 min denaturation at 95°C, then 15 cycles of 94°C for 45 s, 65°C for 45 s, and 72°C for 1 min, followed by 25 cycles of 94°C for 45 s, 50°C for 45 s, and 72°C for 1 min, and a final extension at 72°C for 10 min. To study size polymorphism, PCR products were run on an ABI 3730 capillary-based electrophoresis sequencer (Applied Biosystems, Carlsbad, CA, USA) using GeneScan–500 LIZ (Applied Biosystems, Carlsbad, CA, USA) as the size standard. GeneMarker V1.91 (SoftGenetics, State College, PA) software was used to visualize and analyze the data obtained from the ABI 3730 sequencer.

One hundred eighty-five accessions were genotyped using 135 SSRs and used to carry out phylogenetic and PCA analyses. From this panel, three accessions were identified as G. barbadense accessions and removed from the study. Hence, the population structure analysis, AMOVA, PCA, and kinship analysis utilized the remaining 182 accessions.

Scoring of SSR markers and assessment of genetic diversity: During allele scoring, it was observed that some markers amplified two loci. This was expected because Upland cotton is an allotetraploid. To distinguish heterozygous genotypes from markers amplifying at two loci, a criterion similar to that of Tyagi et al. (2014) was used: if two alleles were present and one of them was monomorphic across all samples, it was considered a separate locus.
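
The scoring criterion described above can be illustrated with a short sketch. It assumes that the calls for one primer pair are stored as a set of fragment sizes per sample; the function name `split_loci`, the data layout, and the fragment sizes are hypothetical and are not taken from Tyagi et al. (2014) or from the scoring software.

```python
def split_loci(calls):
    """Set aside fragments fixed across all samples as a separate locus.

    calls: dict mapping sample name -> set of fragment sizes (bp) scored
    for one SSR primer pair (hypothetical data layout).
    Returns (fixed, remaining): the fragments present in every sample, and
    the per-sample calls once those fragments are removed.
    """
    if not calls:
        return set(), {}
    fixed = set.intersection(*calls.values())
    two_fragments_seen = any(len(frags) >= 2 for frags in calls.values())
    if not fixed or not two_fragments_seen:
        # No fragment is shared by every sample, or no sample shows two
        # fragments: treat the marker as a single locus.
        return set(), {s: set(f) for s, f in calls.items()}
    remaining = {s: frags - fixed for s, frags in calls.items()}
    return fixed, remaining


# Hypothetical calls: the 180 bp fragment occurs in every sample, so it is
# scored as its own (monomorphic) locus; 200/204 bp remain as alleles.
calls = {"acc1": {180, 200}, "acc2": {180, 204}, "acc3": {180, 200}}
fixed, remaining = split_loci(calls)
```

Under this rule a sample carrying the fixed fragment plus one other fragment is read as homozygous at two loci rather than heterozygous at one, which is consistent with the low heterozygosity expected after two generations of selfing.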

Initial statistical analysis of the genotypic data was carried out using POWERMARKER software version 3.25 (Liu and Muse, 2005). The number of alleles, gene diversity, heterozygosity, and Polymorphism Information Content (PIC) values were calculated for each marker. Heterozygosity is the proportion of heterozygous individuals in the population. The PIC value indicates the information potential of a marker: a marker with a PIC value of 1 can differentiate every line, whereas a monomorphic marker has a PIC value of 0. Allele frequencies, the number of unique alleles, and allele counts were obtained using POWERMARKER. The distribution of allele frequencies was plotted using SigmaPlot version 12.5 (Systat Software, San Jose, CA).
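
For illustration, the per-marker statistics named above can be computed as in the following sketch. It uses the uncorrected textbook estimators (gene diversity as 1 minus the sum of squared allele frequencies, and the standard PIC formula of Botstein et al., 1980); POWERMARKER may apply sample-size corrections that are not reproduced here, and the function name `marker_stats` and the genotypes are hypothetical.

```python
from collections import Counter


def marker_stats(genotypes):
    """Summary statistics for one locus.

    genotypes: list of (allele1, allele2) tuples, with None for a missing
    call (hypothetical data layout).
    """
    called = [g for g in genotypes if g is not None]
    counts = Counter(allele for g in called for allele in g)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    # Gene diversity: probability that two randomly drawn alleles differ.
    gene_diversity = 1 - sum(p * p for p in freqs)
    # PIC: gene diversity minus the sum over allele pairs of 2 * p_i^2 * p_j^2.
    pic = gene_diversity - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    heterozygosity = sum(1 for a, b in called if a != b) / len(called)
    return {
        "n_alleles": len(counts),
        "major_allele_freq": max(freqs),
        "gene_diversity": gene_diversity,
        "pic": pic,
        "heterozygosity": heterozygosity,
    }


# Hypothetical genotypes (fragment sizes in bp); the last call is missing.
stats = marker_stats([(100, 100), (100, 100), (104, 104), (100, 104), None])
```

A locus with a single allele gives a gene diversity and PIC of 0, matching the description of a monomorphic marker above.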

Pairwise genetic distances between accessions were calculated with the method of Nei et al. (1983) using POWERMARKER software version 3.25. The distance matrix was used to construct a dendrogram using the neighbor-joining option in POWERMARKER. The dendrogram file obtained from POWERMARKER was visualized and edited using DENDROSCOPE version 3.2.2.
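
As a minimal sketch of the distance measure, the D_A distance of Nei et al. (1983) between two accessions is 1 minus the average, over loci, of the sum over alleles of the square root of the product of the two allele frequencies. The code below applies this to individual genotypes, where within-individual allele frequencies are 0, 0.5, or 1. The function names and genotypes are hypothetical, and skipping loci with missing calls is an assumption that may differ from POWERMARKER's handling.

```python
from collections import Counter
from math import sqrt


def allele_freqs(genotype):
    """Within-individual allele frequencies for one locus, e.g. (100, 104)."""
    counts = Counter(genotype)
    return {allele: c / len(genotype) for allele, c in counts.items()}


def nei_da(ind_x, ind_y):
    """D_A distance between two individuals.

    ind_x, ind_y: lists of per-locus genotype tuples in the same locus
    order, with None for a missing call (hypothetical data layout).
    """
    shared_sum, n_loci = 0.0, 0
    for gx, gy in zip(ind_x, ind_y):
        if gx is None or gy is None:
            continue  # assumption: loci with missing data are skipped
        fx, fy = allele_freqs(gx), allele_freqs(gy)
        shared_sum += sum(sqrt(fx[a] * fy[a]) for a in fx if a in fy)
        n_loci += 1
    if n_loci == 0:
        raise ValueError("no loci scored in both individuals")
    return 1 - shared_sum / n_loci


# Hypothetical two-locus genotypes: identical at locus 1, different at locus 2.
acc_a = [(100, 100), (150, 150)]
acc_b = [(100, 100), (154, 154)]
distance = nei_da(acc_a, acc_b)
```

With this measure, a distance of 0 means the two accessions carry identical genotypes at every scored locus, and a distance of 1 means they share no alleles.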

For further genetic analysis and to study the pattern of population structuring, Principal Component Analysis (PCA) was conducted using co-dominant data. Three R packages, adegenet, ggplot2, and splitstackshape, were used to construct PCA plots in R software version 3.2.3 (R core team, 2013). Eigenvalues from the first four dimensions were used to calculate the variation explained by the first two dimensions. Analysis of Molecular Variance (AMOVA) was carried out on dominant data using Arlequin version 3.5 software (Excoffier and Lischer, 2010).
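
One reading of the eigenvalue calculation above is that the variation attributed to the first two axes is expressed relative to the sum of the four retained eigenvalues. The sketch below shows that arithmetic; the function name and the eigenvalues are hypothetical, and the choice of denominator is an assumption based on the wording of the text.

```python
def variance_explained(eigenvalues, n_axes=2):
    """Percentage of the retained variation carried by the first n_axes.

    eigenvalues: eigenvalues of the retained PCA dimensions, in
    decreasing order (here, the first four dimensions).
    """
    total = sum(eigenvalues)
    return 100 * sum(eigenvalues[:n_axes]) / total


# Hypothetical eigenvalues for four retained dimensions.
pct = variance_explained([8.0, 4.0, 2.5, 1.5])
```

Because the denominator covers only the retained dimensions, a percentage obtained this way is larger than the share of total variance across all principal components.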

Analysis of population structure: The population structure of the 182 G. hirsutum accessions was analyzed using STRUCTURE software version 2.3.4 (Pritchard et al., 2000). This software uses a model-based Bayesian method to identify the number of sub-populations, or groups, and to assign individuals to these groups. An admixture model was selected to calculate the number of sub-populations, or K value. This model assumes that sub-populations did not evolve in complete isolation from each other and that some lines can have parts of their genome derived from ancestors belonging to a different sub-population. It is also assumed that allele frequencies are correlated among the sub-populations. Initially, 10 runs for each value of K ranging from 2 to 12 were conducted, with a burn-in length and a number of replications of 10,000 each. The number of sub-populations (K value) was estimated by plotting the distribution of ΔK, an ad hoc statistic based on the rate of change in the log probability of data between successive K values (Evanno et al., 2005). The value of ΔK was calculated as the mean of the absolute values of the second-order difference in likelihood between successive K values, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K). The highest ΔK value in the plot is taken to identify the uppermost hierarchical level of structure. ΔK plots were made using the Structure Harvester website (Dent and Bridgett, 2012). Accessions with a membership probability greater than 60% were assigned to subgroups; accessions with a membership probability less than 60% were assigned to a mixed group.
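
The ΔK calculation can be sketched as follows. Because replicate runs at different K values are not paired, this sketch takes the absolute second-order difference of the mean log probabilities and divides it by the standard deviation across replicates at K; this is one common reading of the statistic and is an assumption here, as are the function name `delta_k` and the log-probability values.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Delta K for each interior K.

    lnp: dict mapping K -> list of Ln P(D) values, one per replicate run
    (hypothetical data layout). Returns a dict mapping K -> Delta K.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # undefined at the smallest and largest K tested
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # undefined when all replicates give the same value
        second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
        result[k] = abs(second_diff) / sd
    return result


# Hypothetical Ln P(D) values for three replicate runs at K = 2 to 5.
lnp = {
    2: [-1000, -1002, -998],
    3: [-900, -901, -899],
    4: [-890, -891, -889],
    5: [-885, -886, -884],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
```

In this made-up example the log probability rises sharply from K = 2 to K = 3 and then levels off, so ΔK peaks at K = 3. Note that ΔK cannot be evaluated at the smallest or largest K tested.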

Relative kinship and gene flow estimates: Pairwise kinship estimates, computed as a correlation coefficient between allelic states as proposed by J. Nason (Loiselle et al., 1995), were calculated using the software SPAGeDi version 1.4c (Hardy and Vekemans, 2002). The kinship matrix compared the probability of identity by descent among all pairs of the 182 cotton accessions genotyped using 113 markers. Coefficients calculated from genetic markers estimate ratios of differences of probabilities of identity in state between two specific individuals and random individuals from the sample (Vekemans and Hardy, 2004). The kinship estimates were averaged across loci to obtain the kinship estimate between two individuals (Loiselle et al., 1995). All negative kinship values were set to zero (Hardy and Vekemans, 2002); such values indicate that the individuals are less related than random individuals. FIS and FST values and gene flow estimates were calculated using the software Genepop version 4.1 (Rousset, 2008). FST coefficients were estimated with a "weighted" analysis of variance, with a single measure for all samples, as explained by Weir and Cockerham (1984). Gene flow estimates (Nm) were calculated using the private allele method proposed by Slatkin (1985).
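
To illustrate the idea of kinship as a correlation between allelic states, the sketch below computes a simplified ratio: the summed products of the deviations of two individuals' allele frequencies from the reference frequencies, divided by the summed reference variances, followed by the truncation of negative values to zero described above. This is not the SPAGeDi implementation; the published estimator includes a sampling-bias correction that is omitted here, and the function names, data layout, and frequencies are hypothetical.

```python
def kinship(ind_i, ind_j, ref_freqs):
    """Simplified correlation-type kinship between two individuals.

    ind_i, ind_j: lists of per-locus genotype tuples (None if missing).
    ref_freqs: list of dicts, one per locus, mapping allele -> frequency
    in the reference sample (hypothetical data layout).
    """
    numerator = denominator = 0.0
    for gi, gj, freqs in zip(ind_i, ind_j, ref_freqs):
        if gi is None or gj is None:
            continue  # assumption: loci with missing data are skipped
        for allele, p in freqs.items():
            p_i = gi.count(allele) / len(gi)
            p_j = gj.count(allele) / len(gj)
            numerator += (p_i - p) * (p_j - p)
            denominator += p * (1 - p)
    if denominator == 0:
        raise ValueError("no polymorphic loci scored in both individuals")
    return numerator / denominator


def truncate_negative(value):
    """Set negative kinship estimates to zero, as done in this study."""
    return max(0.0, value)


# Hypothetical single locus with two equally frequent alleles.
ref = [{100: 0.5, 104: 0.5}]
same = truncate_negative(kinship([(100, 100)], [(100, 100)], ref))
different = truncate_negative(kinship([(100, 100)], [(104, 104)], ref))
```

Two individuals sharing alleles more often than expected from the reference frequencies obtain a positive value, whereas pairs sharing fewer alleles than random pairs obtain a negative value that is then set to zero.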

Selection of core set: A simulated annealing algorithm in POWERMARKER software was used to select core sets based on allele richness from the 182 G. hirsutum landrace accessions. The accuracy and efficiency of the algorithm depend on three parameters: the number of evaluations (R), the cooling coefficient (ρ), and the initial temperature (T0). Increasing the number of evaluations for each annealing schedule (R) yields more accurate results but increases the run time. Higher T0 values help to explore the potential solutions effectively, but if the value is too large it can slow down the process. Similarly, larger values of ρ provide slower but more accurate runs. In this study, R was set to 1000, the cooling coefficient ρ was set to 0.95, and T0 was 1. Core sets of sizes ranging from k = 10 to k = 80 were evaluated, with five accessions added at each incremental step.
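
The roles of R, ρ, and T0 can be seen in a generic simulated annealing sketch for maximizing allele richness. This is not POWERMARKER's implementation: the swap move, the stopping temperature `t_min`, the fixed random seed, and all names and data below are assumptions made for illustration.

```python
import math
import random


def allele_richness(core, alleles):
    """Number of distinct alleles carried by the accessions in core."""
    captured = set()
    for accession in core:
        captured |= alleles[accession]
    return len(captured)


def select_core(alleles, k, R=1000, rho=0.95, T0=1.0, t_min=0.01, seed=0):
    """Pick k accessions maximizing allele richness by simulated annealing.

    alleles: dict mapping accession -> set of allele labels.
    R: evaluations per temperature; rho: cooling coefficient;
    T0: initial temperature; t_min: assumed stopping temperature.
    """
    names = sorted(alleles)
    if not 0 < k < len(names):
        raise ValueError("k must be between 1 and the number of accessions - 1")
    rng = random.Random(seed)
    core = set(rng.sample(names, k))
    score = allele_richness(core, alleles)
    best, best_score = set(core), score
    temperature = T0
    while temperature > t_min:
        for _ in range(R):
            # Propose swapping one accession in the core for one outside it.
            removed = rng.choice(sorted(core))
            added = rng.choice([n for n in names if n not in core])
            candidate = (core - {removed}) | {added}
            cand_score = allele_richness(candidate, alleles)
            delta = cand_score - score
            # Always accept improvements; accept losses with probability
            # exp(delta / T), which shrinks as the temperature falls.
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                core, score = candidate, cand_score
                if score > best_score:
                    best, best_score = set(core), score
        temperature *= rho  # geometric cooling
    return best, best_score


# Hypothetical allele sets for six accessions at two markers.
alleles = {
    "A": {"m1:100", "m2:200"},
    "B": {"m1:100", "m2:200"},
    "C": {"m1:104", "m2:200"},
    "D": {"m1:100", "m2:206"},
    "E": {"m1:108", "m2:210"},
    "F": {"m1:100", "m2:200"},
}
core, richness = select_core(alleles, k=3, R=50)
```

A larger R or a ρ closer to 1 increases the number of candidate core sets examined before the temperature reaches the stopping value, which is the trade-off between accuracy and run time described above.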

Results

SSR marker analysis: Genetic diversity in the landrace accessions of G. hirsutum was evaluated using 135 SSR primer pairs (Table S2) distributed across the genome. Of these, 13 markers could not be scored with confidence and were excluded, leaving 122 SSR primer pairs for the final analyses. Some of the SSR primers amplified more than one locus, an observation similar to previous reports in cotton (Fang et al., 2013; Tyagi et al., 2014). Because cotton has an allopolyploid genome, markers can amplify in both subgenomes (Fang et al., 2013; Tyagi et al., 2014). Twenty-six of the 122 markers (21.3%) amplified at multiple loci, and each locus was treated as a separate marker. Five loci were monomorphic and were removed from the analysis. Finally, genotypic data from 143 polymorphic loci were used to assess genetic diversity.

A total of 819 alleles were observed across 143 loci among 183 accessions, with an average of 5.5 alleles per SSR locus. The average major allele frequency was 0.74, with minimum and maximum values of 0.18 and 0.99, respectively. PIC values ranged from 0.01 to 0.88, with an average of 0.33. Average heterozygosity (H) was 1.95 percent, suggesting that the accessions in the diversity panel are highly homozygous. Table S3 summarizes these marker statistics.

Out of the total of 819 alleles, 191 were unique alleles (alleles found in only one accession), identified in 72 of the 183 accessions used in the study (Table S4). Three accessions from Mexico, TX-347, TX-604, and TX-612, had more unique alleles than other accessions, with 26, 22, and 16 unique alleles, respectively. Of the remaining 69 accessions, 31 were from Mexico, 21 from Guatemala, and 17 from other parts of Central America and other countries (Table S4). Based on the classification into races, 42 accessions belonged to Latifolium, one was Morrilli, and 29 were not classified into any specific race. This suggests that tropical accessions of Upland cotton, especially those from the center of origin of tetraploid cotton, are an excellent source of novel alleles for sampling diversity and broadening the genetic base of Upland cotton. Histograms of allele frequency suggested that most of the alleles had a very low frequency (Fig. 1).
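
The definition of a unique allele used above (an allele found in only one accession) can be expressed as a short sketch. The function name, accession labels, and allele data are hypothetical and serve only to show the counting rule.

```python
from collections import defaultdict


def unique_alleles(alleles):
    """Alleles carried by exactly one accession, grouped by that accession.

    alleles: dict mapping accession -> set of (locus, allele) pairs
    (hypothetical data layout).
    """
    carriers = defaultdict(set)
    for accession, pairs in alleles.items():
        for pair in pairs:
            carriers[pair].add(accession)
    per_accession = defaultdict(list)
    for pair, accessions in carriers.items():
        if len(accessions) == 1:
            per_accession[next(iter(accessions))].append(pair)
    return {acc: sorted(pairs) for acc, pairs in per_accession.items()}


# Hypothetical allele calls for three accessions.
panel = {
    "acc1": {("m1", 100), ("m2", 200)},
    "acc2": {("m1", 100), ("m2", 204)},
    "acc3": {("m1", 108), ("m2", 200), ("m3", 150)},
}
unique = unique_alleles(panel)
n_unique = sum(len(pairs) for pairs in unique.values())
```

Accessions absent from the result carry no unique alleles; in this made-up panel that is the case for acc1.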

Genetic diversity analysis in the diversity panel: Genetic distances were obtained from the genotypic data using POWERMARKER software. The genetic distance in the complete panel ranged from 0.00 to 0.92 with an average of 0.36, suggesting that the panel consists of very diverse accessions. The genetic distance between TX-238 and TX-1131 was 0, indicating either that they are duplicate accessions submitted under different names in the germplasm collection, that the markers used in the study could not distinguish between them, or that human error occurred in allele scoring. The three accessions TX-347, TX-604, and TX-612 were closely related to one another, but their genetic distance to the other accessions in the panel was the highest (>0.80) (Fig. S1). A neighbor-joining (NJ) tree was constructed using this distance matrix (Fig. S1). Three major clusters were observed in the NJ tree. The largest cluster was further divided into three sub-clusters, leading to a total of five clusters (Fig. S1).

PCA was conducted to further study the genetic relationships among the G. hirsutum germplasm accessions. The first two axes of the PCA accounted for 65.3% of the variation, and the accessions were grouped into two major clusters (Fig. S2). As in the NJ tree, three accessions (TX-347, TX-604, and TX-612) formed a distinct cluster (Fig. S2). Interestingly, these three accessions showed the largest genetic distance from other accessions and had a high number of unique alleles. Further, these three outlying G. hirsutum accessions formed a separate cluster with G. barbadense accessions, suggesting that they may be mislabeled in the germplasm collection as G. hirsutum wild collections (data not shown). These three accessions were removed from the dataset in subsequent analyses.

Population structure analysis: STRUCTURE software was used to analyze population structure in the diversity panel of 182 accessions. The number of clusters could not be identified from the plot of Ln probability of data against K (Fig. 2a), so the ΔK plot was used to identify the number of sub-populations (Fig. 2b). Based on the admixture model, 154 accessions were placed into clusters using a 60% membership probability as the threshold. Twenty-eight accessions had mixed parentage and were not assigned to any group (Table S5). Five major clusters were identified in the landrace accessions of G. hirsutum (Fig. 3). These groups roughly corresponded to the geographical locations where the accessions were collected. Most of the accessions from Guatemala formed one cluster, and Mexican accessions were spread across two different sub-groups. However, there was significant admixture, and in some cases accessions from different countries were grouped together (Table S5).
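The ΔK criterion of Evanno et al. (2005) used above can be illustrated with a minimal sketch. It assumes the common implementation in which the second-order difference is taken on the mean Ln P(D) per K and divided by the standard deviation across replicate runs at that K. The Ln P(D) values below are hypothetical and are not taken from this study.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Return {K: delta K} for every K that has neighbours K-1 and K+1.

    ln_prob maps K to a list of Ln P(D) values, one per replicate run.
    """
    result = {}
    for k in sorted(ln_prob):
        if k - 1 not in ln_prob or k + 1 not in ln_prob:
            continue  # delta K is undefined at the ends of the K range
        spread = stdev(ln_prob[k])
        if spread == 0:
            continue  # avoid dividing by zero when replicates are identical
        # Second-order rate of change of the mean Ln P(D) around K.
        second_diff = (mean(ln_prob[k + 1]) - 2 * mean(ln_prob[k])
                       + mean(ln_prob[k - 1]))
        result[k] = abs(second_diff) / spread
    return result


# Hypothetical Ln P(D) values for three replicate runs per K.
runs = {
    2: [-5000.0, -5010.0, -4990.0],
    3: [-4500.0, -4505.0, -4495.0],
    4: [-4480.0, -4470.0, -4490.0],
    5: [-4475.0, -4465.0, -4485.0],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In this toy input the likelihood rises sharply up to K = 3 and then flattens, so ΔK peaks at K = 3 even though Ln P(D) keeps increasing slowly, which is the situation where a plain Ln P(D) plot is hard to read.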

Group 1 had seven accessions, of which four were from Mexico and one each from the United States, Belize, and Guatemala (indicated in red in Fig. 3; Table S5). It was surprising that this group included accessions from different geographic regions. However, accessions in Group 1 had membership probabilities of less than 0.01 for the remaining groups. Group 2 comprised 43 accessions, the majority from Guatemala, with three from Mexico and one from Haiti (indicated in green in Fig. 3). Group 3 and Group 5 consisted of 69 and 16 accessions, respectively, most of which were collected from Mexico (indicated in blue and pink, respectively, in Fig. 3). Group 3 formed the largest cluster; in addition to 43 accessions from Mexico, it included eight accessions from Guatemala, three each from the United States and Uzbekistan, two each from Cote D'Ivoire, Paraguay, and the Philippines, and one each from Malta, Brazil, Colombia, Trinidad and Tobago, Sudan, and Ethiopia. Group 5 included 15 accessions from Mexico and one from Mozambique. Lastly, Group 4 comprised 19 accessions, with nine collected from Guatemala, six from Mexico, two from Puerto Rico, and one each from El Salvador and Martinique (indicated in yellow in Fig. 3), suggesting that strong genetic admixture was present among the accessions.
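The 60% membership rule used to place accessions into groups can be sketched as follows. The accession names and membership coefficients are hypothetical, and treating a coefficient exactly equal to the threshold as assigned is an assumption, since the text does not specify how ties at 0.60 were handled.

```python
def assign_groups(q_matrix, threshold=0.60):
    """Assign each accession to its highest-membership cluster.

    q_matrix maps an accession name to a list of membership coefficients,
    one per cluster. Accessions whose largest coefficient is below the
    threshold are labelled "Mixed".
    """
    assignments = {}
    for accession, memberships in q_matrix.items():
        best = max(range(len(memberships)), key=lambda i: memberships[i])
        if memberships[best] >= threshold:
            assignments[accession] = f"Group{best + 1}"
        else:
            assignments[accession] = "Mixed"
    return assignments


# Hypothetical membership coefficients for K = 5 clusters.
q = {
    "acc_A": [0.97, 0.01, 0.01, 0.005, 0.005],
    "acc_B": [0.05, 0.62, 0.20, 0.08, 0.05],
    "acc_C": [0.10, 0.30, 0.35, 0.15, 0.10],
}
groups = assign_groups(q)
print(groups)
```

Here acc_C has no coefficient of at least 0.60 and is therefore left unassigned, which corresponds to the accessions reported as having mixed parentage.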

Phylogenetic and principal component analysis: A phylogenetic tree constructed using neighbor-joining analysis of the 182 G. hirsutum accessions formed four major clusters. To compare the STRUCTURE groups with the tree-based clusters, the dendrogram was colored by STRUCTURE group (Fig. 4). Overall, there was good agreement between the two estimates. Group 1 (red), Group 2 (green), and Group 5 (pink) from STRUCTURE formed well-defined clusters in the phylogenetic tree. However, Group 3 (blue) and Group 4 (yellow) were spread across clusters in the dendrogram.

Genetic distances between groups obtained from STRUCTURE analysis suggested that Group 1 was genetically distinct from all other groups (Table 1). This supported the results from membership probabilities obtained from STRUCTURE (Table S5).
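For readers unfamiliar with the distance reported in Table 1, the following is a minimal sketch of the D_A distance of Nei et al. (1983), D_A = 1 - (1/r) Σ_loci Σ_alleles sqrt(x_i y_i), where x_i and y_i are the frequencies of allele i in the two groups and r is the number of loci. The allele names and frequencies are hypothetical, and the software used in the study may apply additional corrections that are not reproduced here.

```python
from math import sqrt


def nei_da(freqs_x, freqs_y):
    """Nei et al. (1983) D_A distance between two groups.

    freqs_x and freqs_y are lists with one dict per locus, each mapping
    an allele label to its frequency in the group.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("both groups need the same non-empty set of loci")
    total = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Only alleles present in both groups contribute to the sum.
        shared = set(locus_x) & set(locus_y)
        total += sum(sqrt(locus_x[a] * locus_y[a]) for a in shared)
    return 1.0 - total / len(freqs_x)


# Hypothetical two-locus example: locus 1 identical, locus 2 fully distinct.
group_a = [{"a1": 0.5, "a2": 0.5}, {"b1": 1.0}]
group_b = [{"a1": 0.5, "a2": 0.5}, {"b2": 1.0}]
print(nei_da(group_a, group_b))
```

The distance is 0 when two groups have identical allele frequencies at every locus and 1 when they share no alleles, so the between-group values near 0.6 for Group 1 in Table 1 indicate limited allele sharing with the other groups.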

A PCA plot was constructed for the diversity panel, and accessions were colored to show the STRUCTURE groups (Fig. 5). Consistent with the genetic distances in Table 1, Group 1 (red) was separated from the other accessions. All other groups except Group 4 formed tight and clear clusters in the PCA. The first two PCA axes explained 55.2% of the total genetic variation between G. hirsutum accessions. Analysis of molecular variance (AMOVA) revealed that differences between the groups obtained from STRUCTURE analysis were highly significant, with 33.3% of the total variation contributed by between-group variance. However, most of the variation (66.7%) was attributed to diversity between individuals within a group (Table 2). The FST value for the whole population (0.33) was highly significant at P < 0.0001. Pairwise FST values showed that accessions in Group 1 were genetically farthest from all other groups (Table 3). Further, high genetic differentiation was observed among the remaining four groups, with all pairwise FST values significant at P < 0.0001 (Table 3).
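The percentages quoted from the AMOVA follow directly from the variance components in Table 2. The short check below assumes the usual two-level partition, in which the among-group share Va / (Va + Vb) is the fixation index for the whole population; the function name is illustrative only.

```python
def variance_shares(among, within):
    """Return the among-group and within-group shares of total variance."""
    total = among + within
    return among / total, within / total


# Variance components reported in Table 2 (Va = 20.03, Vb = 40.04).
among_share, within_share = variance_shares(20.03, 40.04)
print(f"among groups: {among_share:.4f}, within groups: {within_share:.4f}")
```

The among-group share rounds to 0.33, matching the overall FST reported above, and the two shares correspond to the 33.34% and 66.66% listed in Table 2.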


Estimation of kinship and gene flow in G. hirsutum accessions: Pairwise kinship estimates based on 113 informative molecular markers showed that the majority of accession pairs (62.05%) had a kinship value of zero. This suggested that most accessions are not related to each other. Further, 87.80% of kinship estimates ranged from 0 to 0.15, and about 94.3% of the estimates were less than 0.25, indicating a low degree of genetic relatedness between the accessions (Fig. S3). Gene flow estimates (Nm) and distributions of FST and FIS for all the markers are presented in Table S6. Eighteen loci had FST values ranging from 0.5 to 0.8, suggesting that these markers contributed most of the divergence among the five groups identified using STRUCTURE. These markers were distributed randomly across the chromosomes. Of the 18 markers, 9 have been reported to be linked with QTL identified in earlier studies. BNL3008 was found to be associated with root-knot nematode resistance (Ulloa et al., 2016); BNL3379 (Bolek et al., 2005), CIR97 (Wang et al., 2014), CIR218 (Zhang et al., 2015), and BNL3031 (Li et al., 2013) with Verticillium wilt resistance; BNL1404 (Mei et al., 2013) and BNL256 (Kantartzi and Stewart, 2008) with lint percentage; BNL1693 with a leaf shape gene (Lacape et al., 2013); and BNL852 with node of first fruiting branch (Guo et al., 2009). It is possible that some of these traits are distinguishing factors between the sub-populations; however, sufficient information is not available to confirm this idea. Most of the markers were fixed in the panel, as 103 loci had FIS estimates greater than 0.9. Higher Nm estimates indicate increased gene flow, and loci with low Nm estimates are limited to specific groups in the panel. For marker BNL3778, the Nm value was 17.75, which indicates that among all markers this locus was the most migratory across the groups, followed by BNL3627 (12.25), BNL358 (8.67), and BNL3993 (7.56). The higher rates of migration observed for such markers could be due to their location in the genome. It is possible that these markers are located in regions that are shared by most of the individuals and were preserved during evolution. These regions could carry genes important for survival, reproduction, defense response, and basic cell functions such as replication, transcription, and translation. Nm estimates for 82 marker loci were less than 1, which suggested that these loci carry private alleles, i.e., alleles that are found only in a single sub-population (Slatkin, 1985). The mean frequency of private alleles was 0.11 per group.
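The text does not state which estimator produced the Nm values in Table S6. As an illustration only, the sketch below assumes Wright's island-model approximation, Nm ≈ (1 - FST) / (4 FST), which is one widely used way of relating the two statistics; the private-allele method of Slatkin (1985) cited above is a different estimator and is not reproduced here.

```python
def nm_from_fst(fst):
    """Approximate number of migrants per generation under the island model."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be in the interval (0, 1]")
    return (1 - fst) / (4 * fst)


def fst_from_nm(nm):
    """Inverse relation: FST implied by a given Nm under the same model."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1 / (4 * nm + 1)


# Under this assumption, Nm = 1 corresponds to FST = 0.2.
print(nm_from_fst(0.2))
# FST values of 0.5 and 0.8 imply Nm well below 1.
print(nm_from_fst(0.5), nm_from_fst(0.8))
# The highest reported Nm (17.75) implies a very small FST.
print(fst_from_nm(17.75))
```

Under this approximation, loci with Nm below 1 are those with FST above 0.2, so low-Nm loci and strongly differentiated loci are two views of the same pattern.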

Core sets of G. hirsutum accessions: Core sets were generated from the genotypic information available for the 182 G. hirsutum accessions using POWERMARKER software. Fig. 6 shows the percentage of the total allele number represented by core sets ranging in size from 10 to 80 accessions. The core set with 10 accessions captured 51.77% of the total number of alleles, and the largest core set, with 80 accessions, captured 76.92% (Fig. 6). The percentage of total alleles captured increased only gradually as core set size increased, which is consistent with high genetic diversity in the panel. Table 4 provides the complete list of accessions grouped into different core sets along with the percentage of alleles each core set represents.
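The idea behind selecting a core set by simulated annealing can be sketched as below. This is a generic illustration, not the POWERMARKER implementation: the swap move, cooling schedule, parameter defaults, and toy genotypes are all assumptions made for the example. The objective is the number of distinct alleles captured by a subset of fixed size.

```python
import math
import random


def allele_count(genotypes, subset):
    """Number of distinct (marker, allele) pairs present in the subset."""
    captured = set()
    for accession in subset:
        captured |= genotypes[accession]
    return len(captured)


def anneal_core_set(genotypes, size, steps=2000, start_temp=1.0,
                    cooling=0.995, seed=0):
    """Search for a subset of `size` accessions capturing the most alleles."""
    rng = random.Random(seed)
    names = sorted(genotypes)
    current = set(rng.sample(names, size))
    current_score = allele_count(genotypes, current)
    best, best_score = set(current), current_score
    temp = start_temp
    for _ in range(steps):
        outside = [n for n in names if n not in current]
        if not outside:
            break  # the subset already contains every accession
        # Propose swapping one member for one non-member.
        drop = rng.choice(sorted(current))
        add = rng.choice(outside)
        candidate = (current - {drop}) | {add}
        score = allele_count(genotypes, candidate)
        delta = score - current_score
        # Always accept improvements; accept losses with a probability
        # that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = set(candidate), score
        temp = max(temp * cooling, 1e-6)
    return best, best_score


# Hypothetical genotypes: each accession is a set of (marker, allele) pairs.
toy = {
    "acc1": {("m1", 100), ("m2", 200)},
    "acc2": {("m1", 100), ("m2", 200)},
    "acc3": {("m1", 102), ("m2", 200)},
    "acc4": {("m1", 100), ("m2", 204)},
    "acc5": {("m1", 104), ("m2", 206)},
}
core, captured = anneal_core_set(toy, size=3)
total = allele_count(toy, toy)
print(sorted(core), captured, f"{100 * captured / total:.1f}%")
```

Dividing the captured allele count by the total for the full panel gives the percentage reported for each core set; for example, 424 of the 819 alleles in the panel corresponds to the 51.77% captured by the 10-accession set.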

Discussion

In the current study, genetic diversity and population structure were studied in a diversity panel of landrace accessions of G. hirsutum collected from tropical regions during the mid to late 1900s. Significant genetic diversity was observed between these accessions based on SSR markers. Using a Bayesian model-based approach, accessions could be divided into five major clusters, which roughly corresponded to their geographical locations. Overall, there was good agreement between the clusters obtained from STRUCTURE and phylogenetic analysis. Core sets that represent different levels of allelic diversity were identified, which could be used to systematically utilize the tropical gene pool for broadening the genetic base of photoperiod-insensitive elite cotton cultivars.

Of the 122 markers used to assess the genetic diversity, 26 markers (21.1%) amplified multiple loci. This is expected because G. hirsutum is an allotetraploid, and SSRs often exist as homoeoalleles that amplify across both the A and D genomes. Using the same set of SSR markers, Tyagi et al. (2014) observed that 17.5% of the SSR markers amplified two loci in a panel of 381 upland cotton cultivars. The slightly higher number of SSRs amplifying multiple loci in the current study could be due to higher genetic diversity allowing homoeoalleles to be detected in the current diversity panel. In the current study, a total of 819 alleles were amplified among 183 accessions. Relative to the size of the panel, this is high compared to the 1115 alleles amplified in 831 G. hirsutum wild accessions reported by Hinze et al. (2016).

The number of alleles amplified by a single locus ranged from 2 to 28, with an average of 5.5 alleles per locus. Similar results were reported by Liu et al. (2000) in a set of 97 G. hirsutum converted race stocks, with an average of five alleles per locus. Zhang et al. (2011) reported an average of 5.08 alleles per locus in a panel of 57 Chinese cultivars. A few other reports indicated a higher number of alleles amplified per marker, for example an average of 5.6 alleles per locus in a panel of landraces from G. hirsutum, G. barbadense, G. darwinii, and G. tomentosum (Lacape et al., 2007), and six alleles per marker in a panel of diploid and tetraploid cotton species (Ulloa et al., 2013). The higher numbers of alleles per locus in these studies could be due to the highly diverse, inter-specific germplasm used. A relatively lower number of alleles per marker (two to four) has been reported in studies conducted using elite day-length-insensitive upland cotton cultivars (Fang et al., 2013; Tyagi et al., 2014; Zhao et al., 2015; Bertini et al., 2006; Abdurakhmonov et al., 2008).

In this study, the average PIC value for the polymorphic loci was 0.33, which is comparable to the estimate based on G. hirsutum race stocks (Liu et al., 2000). Previous studies in cotton reported PIC values from a minimum of 0.122 (Abdurakhmonov et al., 2008) to a maximum of 0.80 (Zhang et al., 2011). This wide variation was due to differences in the allele detection methods and the germplasm used in the respective studies. The proportion of unique alleles (23.3%) observed in the current panel is much higher than the 3% reported by Abdurakhmonov et al. (2008) in 287 exotic G. hirsutum accessions, whereas Tyagi et al. (2014) identified 21.5% unique alleles in a panel of elite cotton germplasm. This could be due to the higher overall genetic diversity in the current panel. The higher PIC values were attributed to the improved efficiency of capillary platforms in resolving marker alleles. This latter observation on the efficiency of capillary-based genotyping platforms was also reported by Tyagi et al. (2014).
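To make the PIC comparisons above concrete, the sketch below computes the polymorphism information content of a single locus from its allele frequencies using the standard definition of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The frequencies are hypothetical, and the software used in the study may differ in details such as the handling of missing data.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    freqs is a list of allele frequencies that sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross_term += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1 - homozygosity - cross_term


# Hypothetical loci: two equally frequent alleles vs. four equally frequent.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
# A skewed locus carries much less information.
print(pic([0.9, 0.1]))
```

A locus with two equally frequent alleles reaches the biallelic maximum of 0.375, so an average PIC of 0.33 across loci with several alleles each implies that many loci have strongly skewed allele frequencies.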

Phylogenetic analysis based on the Nei et al. (1983) coefficient suggested that high genetic diversity is present in the germplasm panel. It is comparable to the average GS estimated in G. hirsutum wild-type germplasm studied by Hinze et al. (2016) and much higher than the GD of 0.195 reported for North American cultivars by Tyagi et al. (2014) (Fig. S1). The higher GD values observed in the current study compared to previous work on Mexican (0.07) and African cultivars (0.08) by Abdurakhmonov et al. (2008) further indicate the improved efficiency of estimating GD using the capillary platforms employed to resolve the marker alleles in the current study.

Five major clusters within G. hirsutum accessions were obtained using STRUCTURE software (Fig. 2b). The clusters roughly corresponded to the geographical locations of the accessions. However, there was significant sub-structuring within accessions collected from the same country, and it was difficult to explain the clustering solely based on geographic origin. For example, accessions collected from Mexico and Guatemala were grouped into two clusters for each region. This sub-grouping indicates that distinct sub-populations are present within accessions from Mexico and Guatemala. A possible reason for this clustering among accessions from the same country is that accessions were not collected randomly across the region but were collected in groups from smaller areas. Accessions collected from isolated areas could be genetically differentiated enough to form separate clusters (Loveless et al., 1984). Further, Mexico is considered the center of origin for cotton (Brubaker et al., 1999). Hence, higher genetic diversity is expected in the accessions collected from this geographical region. A separate cluster (Group 1) comprising accessions from Mexico, Guatemala, the United States, and Belize was also observed. Results based on genetic distances, PCA, and AMOVA suggested that Group 1 was genetically distinct from the other groups, which may indicate that these accessions belong to a different race or sub-species of G. hirsutum. However, the phylogenetic classification of most of these accessions has not been documented in the literature, so with limited information it was not possible to confirm this hypothesis. It is interesting to note the clustering of accessions from distant countries such as Uzbekistan, Sudan, and the Philippines into the same group (Table S5). The biological reason for this clustering was not clear. It is possible that these accessions were introduced from Central America to different parts of the world by early travelers. Alternatively, these accessions may have been mislabeled, or seed may have been mixed during maintenance. This discrepancy between available pedigree information and molecular marker-based relationships was also observed for cultivated Upland cotton germplasm (Fang et al., 2013; Tyagi et al., 2014).

Genetic differentiation between STRUCTURE groups was confirmed by AMOVA. Thirty-three percent of the total variation was explained by population structuring between G. hirsutum accessions (Table 2). Pairwise FST values ranged from 0.25 to 0.61, which is much higher than the range reported in G. hirsutum cultivars by Tyagi et al. (2014).

The kinship estimate was zero for 62.05% of accession pairs, which is higher than the percentage observed by Zhao et al. (2014) (53.67%) in a panel of 158 cotton cultivars. Similar observations were made in maize, where 50% of inbreds showed zero kinship values when an inbred panel was genotyped with SNP markers (Yan et al., 2009). The higher percentage of zero kinship coefficients also indicates the broad genetic base of the genotypes used in the current study.

In this study, the Nm estimate for more than 80 markers was <1, suggesting that these loci carry private alleles for the STRUCTURE groups identified. The presence of a high number of private alleles further supports the broad genetic base of the current panel. Private alleles are used as indicators of gene flow between populations (Barton and Slatkin, 1986), and their estimation has applications in conservation genetics (Kalinowski et al., 2004).

Wild and landrace accessions of cotton from tropical regions are excellent sources of biotic and abiotic stress tolerance and of novel alleles for fiber quality and yield (Roark and Quisenberry, 1977; Quisenberry et al., 1981; Niles and Feaster, 1984; Jenkins, 1986; McCarty and Jenkins, 1992). A large collection of wild and primitive landrace accessions is available at the US National Cotton Germplasm Collection, USDA-ARS, College Station, TX, USA (Percival, 1987). However, evaluating and developing mapping populations using the entire collection is not a feasible option for most public breeding programs. Further, most of the wild and landrace accessions are photoperiod sensitive, which makes their utilization in Upland cotton breeding tedious and time consuming. Breeders have developed methods to use this valuable resource by developing day-length-neutral race stocks (McCarty and Jenkins, 1992; Percival et al., 1999). However, it was unknown to what extent the genetic diversity is captured in the converted race stocks, and it is not possible to develop converted race stocks for all the tropical accessions, some of which might be duplicates. Studying the genetic diversity in the larger panel and sampling the diversity in the form of core sets is an economically viable and time-efficient approach to systematically utilize the genetic diversity in crop plants (Liu et al., 2003; Kuroda et al., 2009; Tyagi et al., 2014). Core sets are subsets obtained from a bigger set of lines or populations that represent most of the genetic diversity of the complete panel (Frankel and Brown, 1984). The information obtained from core sets is beneficial for cultivar improvement in a number of ways. First, a subset of the landrace collection can be screened and evaluated for agronomically important traits. Second, breeders can use selected accessions as parents for new crosses or introgress the beneficial genes into cultivated germplasm. This panel can also be used in association studies to identify major genes or QTL controlling important traits. Core sets identified based on genetic markers are available in a number of field crops, such as elite Upland cotton (Tyagi et al., 2014), Pima cotton (Xu et al., 2006), corn (Liu et al., 2003), and soybean (Kuroda et al., 2009; Priolli et al., 2013), and in fruit crops such as watermelon (Zhang et al., 2011). Identification of core sets in the current study could complement the ongoing efforts to systematically utilize tropical cotton genetic diversity for broadening the cotton genetic base.

Acknowledgements: We thank NC Cotton Growers Association Inc. and Cotton Incorporated for funding this research through an assistantship to Ms. Baljinder Kaur. We also thank Mr. Linglong Zhu and Dr. Hui Fang for their technical help and assistance at various stages of this research. We appreciate the excellent technical help by Jared Smith and Sharon Williamson with the sequencer-based genotyping work. We thank Drs. Richard Percy and James Frelichowski of the USDA-National Cotton Germplasm Collection for supplying the G. hirsutum accessions.


Figures

Figure 1. Bar graph of allele frequencies and allele count in the diversity panel of 185 G. hirsutum landrace accessions.


Figure 2. (a) Graph showing Ln probability of data for K values ranging from 2 to 12. (b) Estimation of the number of subpopulations using delta K values for K ranging from 2 to 12, following the method proposed by Evanno et al. (2005).


Figure 3. Q plot showing clustering of 182 G. hirsutum landrace accessions into 5 clusters based on co-dominant genotypic data using STRUCTURE. A vertical bar represents each accession. The colored sections in a bar indicate the membership coefficient of the accession in different clusters. Identified subgroups are: Group 1 (red color), Group 2 (green color), Group 3 (blue color), Group 4 (yellow color) and Group 5 (pink color).


Figure 4. Phylogenetic tree obtained from NJ analysis on 182 G. hirsutum landrace accessions. Spikes are colored based on groups identified from STRUCTURE analysis. Groups identified are Group 1 (red color), Group 2 (green color), Group 3 (blue color), Group 4 (yellow color), and Group 5 (pink color). Accessions with mixed ancestry are indicated in black.


Figure 5. Two-dimensional Principal Component Analysis (PCA) of 182 G. hirsutum landrace accessions. Colors correspond to the sub-groups identified from STRUCTURE analysis. ‘Mixed’ indicates accessions that were not placed in any group using the 60% membership threshold.


Figure 6. Plot depicting the percent of alleles captured in core sets with different number of accessions. Core set sizes range from 10 to 80 accessions.


Figure S1. Phylogenetic tree obtained by NJ analysis on the complete panel of 185 G. hirsutum accessions. The three longest spikes represent accessions with a higher number of unique alleles.


Figure S2. Two-dimensional Principal Component Analysis (PCA) of the 185 member panel. Three accessions forming a separate cluster are possibly mislabeled G. barbadense accessions.


Figure S3. Histogram of pairwise relative kinship estimates between 182 G. hirsutum landrace accessions.


Tables

Table 1. Pairwise genetic distance estimates within and between G. hirsutum groups identified by STRUCTURE analysis, calculated based on Nei et al. (1983)

         Group1   Group2   Group3   Group4   Group5
Group1   0.134
Group2   0.581    0.272
Group3   0.591    0.443    0.335
Group4   0.615    0.481    0.509    0.416
Group5   0.609    0.464    0.467    0.513    0.265


Table 2. Analysis of molecular variance (AMOVA) between and within groups for G. hirsutum accessions estimated based on STRUCTURE analysis

Source of variation   df    Sum of squares   Variance components   Percentage of variation
Among groups          4     4436.58          20.03*** Va           33.34
Within groups         303   12131.84         40.04*** Vb           66.66
Total                 307   16568.42         60.07

***Significant at P < 0.0001


Table 3. Pairwise FST estimates for the five groups obtained from STRUCTURE analysis of the diversity panel of landrace accessions of G. hirsutum

         Group1   Group2   Group3   Group4   Group5
Group2   0.553
Group3   0.502    0.291
Group4   0.463    0.273    0.250
Group5   0.614    0.387    0.308    0.292


Table 4. List of G. hirsutum landrace accessions included in the core sets identified by the simulated annealing algorithm using PowerMarker software. These core sets were obtained from a diversity panel of 182 accessions.

Set size   No. of alleles   Allele %   Accessions in core sets of reduced panel size

10   424   51.8   TX-0002, TX-0026, TX-0081, TX-0119, TX-0138, TX-0180, TX-0209, TX-0469, TX-1005, TX-1324.

20   505   61.7   TX-0002, TX-0026, TX-0030, TX-0039, TX-0046, TX-0065, TX-0066, TX-0081, TX-0119, TX-0138, TX-0140, TX-0209, TX-0329, TX-0469, TX-0570, TX-0725, TX-0763, TX-1324, TX-1556, TX-2410.

30   547   66.8   TX-0002, TX-0016, TX-0020, TX-0026, TX-0030, TX-0039, TX-0046, TX-0050, TX-0060, TX-0065, TX-0066, TX-0081, TX-0119, TX-0138, TX-0140, TX-0182, TX-0197, TX-0209, TX-0307, TX-0329, TX-0338, TX-0469, TX-0570, TX-0620, TX-0763, TX-1167, TX-1168, TX-1324, TX-1556, TX-2410.

40   569   69.4   TX-0002, TX-0065, TX-0066, TX-0048, TX-0106, TX-0182, TX-0320, TX-0343, TX-0060, TX-0770, TX-0931, TX-1053, TX-0062, TX-1166, TX-1167, TX-1168, TX-1324, TX-2106, TX-0067, TX-2410, TX-0007, TX-0329, TX-0119, TX-0140, TX-0154, TX-0158, TX-0164, TX-0138, TX-0171, TX-0209, TX-0244, TX-0338, TX-0570, TX-0620, TX-0725, TX-0763, TX-0016, TX-0026, TX-0039, TX-0046.

50   587   71.7   TX-0002, TX-0016, TX-0019, TX-0026, TX-0030, TX-0039, TX-0046, TX-0048, TX-0058, TX-0060, TX-0061, TX-0062, TX-0065, TX-0066, TX-0068, TX-0081, TX-0106, TX-0112, TX-0117, TX-0119, TX-0138, TX-0140, TX-0151, TX-0152, TX-0154, TX-0158, TX-0182, TX-0197, TX-0209, TX-0221, TX-0236, TX-0320, TX-0329, TX-0338, TX-0343, TX-0570, TX-0620, TX-0634, TX-0636, TX-0641, TX-0695, TX-0763, TX-0770, TX-0931, TX-1053, TX-1119, TX-1167, TX-1324, TX-2408, TX-2410.

60   610   74.5   TX-0002, TX-0007, TX-0016, TX-0019, TX-0020, TX-0026, TX-0030, TX-0039, TX-0040, TX-0046, TX-0048, TX-0048, TX-0050, TX-0058, TX-0060, TX-0061, TX-0062, TX-0065, TX-0066, TX-0067, TX-0068, TX-0081, TX-0106, TX-0112, TX-0119, TX-0138, TX-0140, TX-0151, TX-0154, TX-0158, TX-0173, TX-0174, TX-0182, TX-0197, TX-0209, TX-0236, TX-0244, TX-0320, TX-0338, TX-0343, TX-0401, TX-0466, TX-0570, TX-0620, TX-0634, TX-0636, TX-0725, TX-0763, TX-0770, TX-0931, TX-1003, TX-1005, TX-1053, TX-1121, TX-1148, TX-1167, TX-1168, TX-1324, TX-1464, TX-2410.

70   620   75.7   TX-0002, TX-0007, TX-0016, TX-0019, TX-0020, TX-0026, TX-0027, TX-0029, TX-0030, TX-0034, TX-0037, TX-0040, TX-0046, TX-0048, TX-0053, TX-0058, TX-0061, TX-0062, TX-0065, TX-0066, TX-0067, TX-0068, TX-0081, TX-0104, TX-0106, TX-0106, TX-0112, TX-0117, TX-0119, TX-0121, TX-0138, TX-0140, TX-0151, TX-0156, TX-0158, TX-0173, TX-0180, TX-0182, TX-0197, TX-0209, TX-0214, TX-0227, TX-0236, TX-0244, TX-0320, TX-0329, TX-0338, TX-0343, TX-0401, TX-0466, TX-0469, TX-0570, TX-0620, TX-0634, TX-0636, TX-0725, TX-0763, TX-0770, TX-0790, TX-0931, TX-1003, TX-1005, TX-1053, TX-1121, TX-1166, TX-1167, TX-1168, TX-1324, TX-1556, TX-2410.

80   630   76.9   TX-0002, TX-0007, TX-0016, TX-0019, TX-0020, TX-0022, TX-0026, TX-0027, TX-0029, TX-0030, TX-0034, TX-0037, TX-0039, TX-0040, TX-0046, TX-0048, TX-0048, TX-0050, TX-0053, TX-0058, TX-0060, TX-0061, TX-0062, TX-0065, TX-0066, TX-0067, TX-0068, TX-0076, TX-0081, TX-0104, TX-0106, TX-0112, TX-0117, TX-0119, TX-0138, TX-0140, TX-0151, TX-0152, TX-0156, TX-0158, TX-0173, TX-0180, TX-0182, TX-0197, TX-0209, TX-0215, TX-0236, TX-0244, TX-0307, TX-0320, TX-0329, TX-0338, TX-0343, TX-0401, TX-0464, TX-0466, TX-0469, TX-0570, TX-0620, TX-0633, TX-0634, TX-0636, TX-0725, TX-0738, TX-0763, TX-0770, TX-0790, TX-0931, TX-1003, TX-1005, TX-1053, TX-1121, TX-1148, TX-1167, TX-1168, TX-1324, TX-1464, TX-1801, TX-2289, TX-2410.


Table S1. List of G. hirsutum accessions used in the genetic diversity study along with their PI numbers, race and geographical location Project PI Geographical location of Accession Race Id Number accession collected PP1 TX-2 PI 153982 Latifolium Guerrero, Mexico PP2 TX-29 PI 154040 Punctatum Mexico PP3 TX-31 PI 154045 Latifolium Chipas, Mexico PP4 TX-32 PI 154046 Not Classified Mexico PP5 TX-33 PI 154047 Latifolium Chipas, Mexico PP6 TX-36 PI 154050 Latifolium Chipas, Mexico PP7 TX-40 PI 549150 Latifolium Chipas, Mexico PP8 TX-41 PI 549151 Latifolium Chipas, Mexico PP9 TX-43 PI 154054 Latifolium Chipas, Mexico PP10 TX-45 PI 154056 Punctatum Chipas, Mexico PP11 TX-48 PI 154061 Latifolium Mexico PP12 TX-50 PI 154068 Latifolium Chipas, Mexico PP13 TX-53 PI 154080 Latifolium Chipas, Mexico PP14 TX-57 PI 154090 Latifolium Chipas, Mexico PP15 TX-60 PI 154093 Latifolium Chipas, Mexico PP16 TX-61 PI 154094 Latifolium Chipas, Mexico PP17 TX-62 PI 154096 Latifolium Chipas, Mexico PP18 TX-63 PI 154099 Latifolium Chipas, Mexico PP19 TX-67 PI 154103 Latifolium Chipas, Mexico PP20 TX-68 PI 153960 Latifolium Guatemala PP21 TX-72 PI 153966 Latifolium Guatemala PP22 TX-76 PI 549139 Latifolium Guatemala PP23 TX-87 PI 153975 Latifolium Guatemala PP24 TX-91 PI 549144 Latifolium Guatemala PP25 TX-96 PI 163665 Latifolium Jutiapa, Guatemala PP26 TX-100 PI 163629 Latifolium Jutiapa, Guatemala PP27 TX-106 PI 163712 Latifolium Chiquimula, Guatemala PP28 TX-113 PI 163704 Latifolium Chiquimula, Guatemala PP29 TX-119 PI 163645 Latifolium Jutiapa, Guatemala

140

Table S1 continued. PP30 TX-121 PI 163667 Latifolium Jalapa, Guatemala PP31 TX-140 PI 163614 Latifolium Jutiapa, Guatemala PP32 TX-149 PI 163609 Not Classified Santa Rosa, Guatemala PP33 TX-151 PI 163633 Not Classified Jutiapa, Guatemala PP34 TX-154 PI 163660 Latifolium Jutiapa, Guatemala PP35 TX-155 PI 163688 Latifolium Jalapa, Guatemala PP36 TX-156 PI 163678 Latifolium Jalapa, Guatemala PP37 TX-158 PI 163714 Latifolium Chiquimula, Guatemala PP38 TX-164 PI 163694 Latifolium Chiquimula, Guatemala PP39 TX-168 PI 163634 Latifolium Jutiapa, Guatemala PP40 TX-170 PI 163691 Latifolium Chiquimula, Guatemala PP41 TX-171 PI 165305 Morrilli Oaxaca, Mexico PP42 TX-178 PI 163688 Latifolium Chiquimula, Guatemala PP43 TX-180 PI 163742 Latifolium Santa Rosa, Guatemala PP44 TX-197 PI 163648 Latifolium Jutiapa, Guatemala PP45 TX-209 PI 163711 Latifolium Chiquimula, Guatemala PP46 TX-215 PI 163637 Latifolium Jutiapa, Guatemala PP47 TX-219 PI 163671 Latifolium Jalapa, Guatemala PP48 TX-221 PI 163706 Latifolium Chiquimula, Guatemala PP49 TX-226 PI 165369 Latifolium Guerrero, Mexico PP50 TX-228 PI 163672 Latifolium Jalapa, Guatemala PP51 TX-237 PI 163657 Latifolium Jutiapa, Guatemala PP52 TX-238 PI 163674 Latifolium Jalapa, Guatemala PP53 TX-241 PI 163733 Latifolium Baja verapez, Guatemala PP54 TX-243 PI 165324 Latifolium Oaxaca, Mexico PP55 TX-244 PI 165341 Latifolium Oaxaca, Mexico PP56 TX-245 PI 165358 Latifolium Guerrero, Mexico PP57 TX-247 PI 163631 Latifolium Jutiapa, Guatemala PP58 TX-326 PI 165326 Not Classified Guerrero, Mexico PP59 TX-338 PI 165361 Not Classified Guerrero, Mexico PP60 TX-570 PI 224178 Kapas parao Sudan

141

Table S1 continued. PP61 TX-612 PI 154023 Not Classified Mexico PP62 TX-620 PI 154062 Not Classified Mexico PP63 TX-633 PI 158458 Not Classified Guatemala PP64 TX-634 PI 158459 Not Classified Guatemala PP65 TX-636 PI 158461 Not Classified Guatemala PP66 TX-641 PI 158485 Not Classified Guatemala PP67 TX-725 PI 265159 Not Classified Belize PP68 TX-763 PI 201599 Not Classified San luis, Mexico PP69 TX-764 PI 201600 Not Classified San luis, Mexico PP70 TX-790 PI 267179 Not Classified Belize PP71 TX-1149 PI 529966 Not Classified Texas Race Collection PP72 TX-0016 PI 154018 Latifolium Mexico PP73 TX-0017 PI 154022 Latifolium Mexico PP74 TX-0018 PI 154026 Richmondii Chipas, Mexico PP75 TX-0019 PI 549146 Richmondii Chipas, Mexico PP76 TX-0020 PI 154028 Latifolium Chipas, Mexico PP77 TX-0022 PI 154029 Latifolium Chipas, Mexico PP78 TX-0025 PI 154035 Punctatum Chipas, Mexico PP79 TX-0026 PI 154036 Punctatum Chipas, Mexico PP80 TX-0027 PI 154037 Punctatum Chipas, Mexico PP81 TX-0028 PI 154038 Punctatum Mexico PP83 TX-0030 PI 154043 Latifolium Chipas, Mexico PP87 TX-0034 PI 154048 Latifolium Chipas, Mexico PP89 TX-0037 PI 154051 Latifolium Chipas, Mexico PP90 TX-0039 PI 154052 Latifolium Chipas, Mexico PP93 TX-0046 PI 154057 Latifolium Chipas, Mexico PP96 TX-0052 PI 154079 Latifolium Chipas, Mexico PP98 TX-0055 PI 154087 Latifolium Chipas, Mexico PP100 TX-0058 PI 154091 Latifolium Chipas, Mexico PP104 TX-0065 PI 154101 Latifolium Chipas, Mexico PP105 TX-0066 PI 154102 Latifolium Chipas, Mexico

142

Table S1 continued. PP108 TX-0079 PI 153970 Latifolium Guatemala PP109 TX-0080 PI 549141 Latifolium Guatemala PP110 TX-0083 PI 153972 Latifolium Guatemala PP112 TX-0088 PI 153976 Latifolium Guatemala PP113 TX-0090 PI 153980 Latifolium Guatemala PP116 TX-0105 PI 163699 Latifolium Chiquimula, Guatemala PP117 TX-0106 PI 163712 Latifolium Chiquimula, Guatemala PP118 TX-0112 PI 163690 Latifolium Chiquimula, Guatemala PP121 TX-0152 PI 163646 Not Classified Jutiapa, Guatemala PP124 TX-0160 PI 165346 Latifolium Oaxaca, Mexico PP125 TX-0162 PI 163615 Latifolium Jutiapa, Guatemala PP126 TX-0167 PI 163610 Latifolium Jutiapa, Guatemala PP129 TX-0182 PI 165333 Latifolium Guerrero, Mexico PP130 TX-0199 PI 163662 Latifolium Jutiapa, Guatemala PP131 TX-0200 PI 163670 Latifolium Jalapa, Guatemala PP132 TX-0204 PI 165338 Latifolium Mexico PP133 TX-0214 PI 163626 Latifolium Jutiapa, Guatemala PP136 TX-0227 PI 163622 Latifolium Jutiapa, Guatemala PP137 TX-0235 PI 163638 Latifolium Jutiapa, Guatemala PP138 TX-0236 PI 163650 Latifolium Jutiapa, Guatemala PP140 TX-0267 PI 165263 Not Classified Oaxaca, Mexico PP141 TX-0307 PI 165390 Not Classified Guerrero, Mexico PP142 TX-0320 PI 165385 Not Classified Guerrero, Mexico PP143 TX-0343 PI 165328 Not Classified Guerrero, Mexico PP144 TX-0347 PI 165389 Not Classified Guerrero, Mexico PP145 TX-0399 PI 529810 HOPI Arizona, US PP146 TX-0401 PI 529812 HOPI Arizona, US PP147 TX-0404 PI 529813 HOPI Arizona, US PP148 TX-0464 PI 154017 Not Classified Chipas, Mexico PP149 TX-0465 PI 549147 Latifolium Chipas, Mexico PP150 TX-0466 PI 154031 Latifolium Chipas, Mexico

143

Table S1 continued. PP151 TX-0469 PI 154075 Latifolium Chipas, Mexico PP152 TX-0604 PI 153994 Not Classified Mexico PP153 TX-0610 PI 154020 Not Classified Mexico PP154 TX-0624 PI 154104 Not Classified Chipas, Mexico PP155 TX-0695 PI 265137 Not Classified Cortes, Honduras PP156 TX-0711 PI 165353 Not Classified Guerrero, Mexico PP157 TX-0717 PI 165365 Latifolium Guerrero, Mexico PP158 TX-0738 PI 173332 Not Classified Mexico PP159 TX-0770 PI 201606 Not Classified Chipas, Mexico PP160 TX-0773 PI 224702 Not Classified Chipas, Mexico PP161 TX-0775 PI 224704 Not Classified Chipas, Mexico PP162 TX-0832 PI 529831 Marie Galante Trinidad and Tobago PP163 TX-0878 PI 529853 Marie Galante Puerto Rico PP164 TX-0931 PI 274464 Not Classified Uzbekistan PP165 TX-0940 PI 529869 Not Classified Columbia PP166 TX-0953 PI 529874 Not Classified Oaxaca, Mexico PP167 TX-1003 PI 529889 Not Classified Luzon, Phillippines PP168 TX-1005 PI 529891 Punctatum Zambezia, Mozambique PP169 TX-1053 PI 529908 Not Classified Chad PP170 TX-1103 PI 529937 Punctatum Chipas, Mexico PP171 TX-1105 PI 529939 Punctatum Chipas, Mexico PP172 TX-1119 PI 341875 Not Classified Campeche, Mexico PP173 TX-1121 PI 529954 Not Classified Texas, US PP174 TX-1125 PI 529955 Not Classified Luzon, Phillippines PP175 TX-1131 PI 529961 Not Classified Veracruz, Mexico PP176 TX-1148 PI 273895 Not Classified Kefa, Ethopia PP177 TX-1166 PI 304771 Not Classified Huchuetenango, Guatemala PP178 TX-1167 PI 304773 Not Classified Chipas, Mexico PP179 TX-1168 PI 304774 Not Classified Chipas, Mexico PP180 TX-1324 PI 403931 Not Classified Cote D'Ivoire PP181 TX-1325 PI 403932 Not Classified Cote D'Ivoire

PP182 TX-1464 PI 530144 Not Classified Texas, US

PP183 TX-1534 PI 530165 Not Classified Martinique

PP184 TX-1556 PI 530187 Not Classified Dominica

PP185 TX-1600 PI 530231 Not Classified Haiti

PP186 TX-1630 PI 530261 Not Classified Guadeloupe

PP187 TX-1801 PI 530432 Not Classified Martinique

PP188 TX-2077 PI 501484 Not Classified Mexico

PP189 TX-2106 PI 501513 Not Classified Mexico

PP190 TX-2289 PI 501469 Not Classified Puerto Rico

PP191 TX-2367 PI 607682 Not Classified Paraguay

PP192 TX-2368 PI 607683 Not Classified Paraguay

PP193 TX-2387 PI 529673 Not Classified Uzbekistan

PP194 TX-2388 PI 529675 Not Classified Uzbekistan

PP195 TX-2408 PI 607714 Not Classified Brazil

PP196 TX-2410 PI 607716 Not Classified Malta

PP197 TX-2517 PI 607821 Not Classified Sonora, Mexico

PP198 TX-7 PI 153992 Latifolium Mexico

PP199 TX-30 PI 154043 Latifolium Chiapas, Mexico

PP200 TX-64 PI 154100 Latifolium Chiapas, Mexico

PP201 TX-77 PI 153969 Latifolium Guatemala

PP202 TX-78 PI 549140 Latifolium Guatemala

PP203 TX-81 PI 549142 Latifolium Guatemala

PP204 TX-101 PI 163643 Latifolium Jutiapa, Guatemala

PP205 TX-104 PI 163676 Latifolium Jalapa, Guatemala

PP206 TX-117 PI 165318 Latifolium Oaxaca, Mexico

PP207 TX-124 PI 163625 Latifolium Jutiapa, Guatemala

PP208 TX-173 PI 163623 Latifolium Jutiapa, Guatemala

PP209 TX-174 PI 163647 Latifolium Jutiapa, Guatemala

PP210 TX-175 PI 163661 Latifolium Jutiapa, Guatemala

PP211 TX-212 PI 165313 Latifolium Oaxaca, Mexico

PP212 TX-239 PI 163693 Latifolium Chiquimula, Guatemala

PP213 TX-329 PI 165360 Latifolium Guerrero, Mexico

Table S2. List of SSR markers used to genotype the diversity panel of 185 G. hirsutum landrace accessions

S.No Primer name Chromosome mapping location$ Repeat motif

1 BNL1017 AD_chr08, AD_chr16 (CA)14

2 BNL1030 AD_chr05, AD_chr09, AD_chr23 (GT)16, (CA)13

3 BNL1034 AD_chr11, AD_chr17, AD_chr21 (CT)16

4 BNL1045 AD_chr12, AD_chr22, AD_chr26 (AG)16, (CA)10

5 BNL1047 AD_chr25 (CA)12

6 BNL1059 AD_chr14 (CA)16, (CA)11

7 BNL1061 AD_chr22, AD_chr25 (CA)12, (GT)11

8 BNL1064 AD_chr06, AD_chr26 (CA)15, (GT)13

9 BNL1066 AD_chr11 (GT)10+(GA)9

10 BNL1079 AD_chr18 (CA)11, (GT)11

11 BNL1122 AD_chr16 (AG)16

12 BNL1145 AD_chr02, AD_chr20 (GA)12

13 BNL1153 AD_chr06, AD_chr25 (GA)11+(GT)7

14 BNL1160 AD_chr10 (AG)10+G+(GA)3

15 BNL1161 AD_chr09, AD_chr23, AD_chr10 (AG)24

16 BNL1162 AD_chr09 (GA)14

17 BNL119 AD_chr20 (AG)10

18 BNL1231 AD_chr11, AD_chr21, AD_chr25 (AG)15

19 BNL1350 AD_chr01, AD_chr15 (CA)8(GA)16

20 BNL1395 AD_chr07, AD_chr16 (AT)11+(AG)10

21 BNL1404 AD_chr11, AD_chr25 (AG)2+(TG)+(AG)3+T+(GA)11

22 BNL1408 AD_chr05, AD_chr11 (AG)17

23 BNL1414 AD_chr09, AD_chr23 (AG)16

24 BNL1417 AD_chr25 (AG)15

25 BNL1421 AD_chr13, AD_chr18 (AG)29, (AG)14

26 BNL1423 AD_chr09 (AG)12

27 BNL1434 AD_chr02 (AG)13

28 BNL1438 AD_chr13 (AG)13

29 BNL1440 AD_chr05, AD_chr06, AD_chr25 (AG)15

30 BNL1495 AD_chr13, AD_chr18 (AG)14

31 BNL1513 AD_chr24 (GA)17

32 BNL1551 AD_chr05, AD_chr16, AD_chr21 (AG)22

33 BNL1597 AD_chr07, AD_chr16 (GA)13

34 BNL1604 AD_chr07, AD_chr16 (AG)25

35 BNL1605 AD_chr12 (AG)25

36 BNL1646 AD_chr08, AD_chr24 (AG)20

37 BNL1665 AD_chr09, AD_chr10, AD_chr20 (AG)16

38 BNL1666 AD_chr07, AD_chr15 (AG)14

39 BNL1667 AD_chr02, AD_chr14, AD_chr15 (AG)19

40 BNL1672 AD_chr09, AD_chr23 (AG)14

41 BNL1673 AD_chr12, AD_chr22 (AG)24

42 BNL1693 AD_chr01, AD_chr15, AD_chr21 (CT)12+(CA)9

43 BNL1694 AD_chr16 (AG)19, (TC)19

44 BNL1721 AD_chr18 (AG)17

45 BNL226 AD_chr03, AD_chr14 (GA)16

46 BNL2440 AD_chr01, AD_chr15 (AT)11+(AG)18

47 BNL2471 AD_chr06, AD_chr17, AD_chr18 (AG)12

48 BNL2495 AD_chr26 (AG)14, (TC)14

49 BNL2499 AD_chr23, AD_chr24 (GA)14, (CT)15

50 BNL252 AD_chr24 (CT)21

51 BNL2553 AD_chr20 (GA)10

52 BNL256 AD_chr09, AD_chr10 (GA)17

53 BNL2564 AD_chr01, AD_chr15 (AG)16

54 BNL2572 AD_chr04, AD_chr10 (GA)23

55 BNL2646 AD_chr07, AD_chr09, AD_chr15 (GA)3+G+A2+(AG)4+(GA)4, (TC)4+(CT)17

56 BNL2655 AD_chr24 (CT)14

57 BNL2667 AD_chr13, AD_chr18 (GA)21

58 BNL2812 AD_chr11, AD_chr21 (AAT)8

59 BNL285 AD_chr19 (GA)12, (GA)3+GC+(GA)12+A+(AG)2

60 BNL2882 AD_chr03, AD_chr14 (GA)12

61 BNL2895 AD_chr11, AD_chr21 (GA)10

62 BNL2960 AD_chr10 (GA)10

63 BNL2967 AD_chr12 (GA)17, (TC)17

64 BNL3008 AD_chr16 (GA)13

65 BNL3029 AD_chr05, AD_chr19 (AG)12

66 BNL3031 AD_chr09, AD_chr23 (AG)27

67 BNL3034 AD_chr01, AD_chr14 (AG)12

68 BNL3084 AD_chr08, AD_chr24 (GA)12

69 BNL3379 AD_chr12, AD_chr20 (GA)14, (CT)14

70 BNL3418 AD_chr11, AD_chr21 (AC)13

71 BNL3441 AD_chr03 (AC)18, (AT)2(AC)18(AT)4

72 BNL3449 AD_chr11, AD_chr21 (CA)12, (CT)6TA(CA)12

73 BNL3452 AD_chr05, AD_chr10, AD_chr19 (CA)13

74 BNL3474 AD_chr08, AD_chr24 (CA)16

75 BNL3479 AD_chr13, AD_chr18 (AC)15, (TC)6T(AC)15G(CA)2

76 BNL3482 AD_chr20, AD_chr26 (AC)12

77 BNL3502 AD_chr14 (AC)12+(AT)2

78 BNL3510 AD_chr12, AD_chr26 (AC)15 + (TC)20

79 BNL3511 AD_chr23 (AC)11

80 BNL3537 AD_chr12, AD_chr26 (AC)11

81 BNL358 AD_chr22 (AG)12

82 BNL3590 AD_chr02, AD_chr17 (CA)20

83 BNL3594 AD_chr06, AD_chr25 (TC)37

84 BNL3599 AD_chr12, AD_chr26 (TC)15

85 BNL3627 AD_chr03, AD_chr08, AD_chr24 (TC)17

86 BNL3649 AD_chr11, AD_chr21 (TC)20

87 BNL3650 AD_chr06 (TC)15+(TA)6

88 BNL3778 AD_chr01 (GT)11

89 BNL3800 AD_chr08, AD_chr24 (GT)3+A+T+(TG)3+(TA)2+(TG)21

90 BNL3835 AD_chr04, AD_chr12 (TG)18

91 BNL3888 AD_chr14, AD_chr01 (TG)15

92 BNL3955 AD_chr05, AD_chr17, AD_chr22 (CA)12, (GT)13

93 BNL3976 AD_chr05, AD_chr21 (TC)17

94 BNL3987 AD_chr06 (CT)14

95 BNL3993 AD_chr10, AD_chr13, AD_chr20 (TC)16, (TC)15

96 BNL4007 AD_chr13 (TG)11

97 BNL4029 AD_chr13, AD_chr18 (TG)12, (AC)10

98 BNL4041 AD_chr12 (AT)5+(GT)14

99 BNL4049 AD_chr04, AD_chr09, AD_chr22 (AC)11

100 BNL409 CH13_08 (GT)12

101 BNL569 AD_chr13, AD_chr18 (AG)20

102 BNL580 AD_chr16 (CT)7+(GT)+(CT)14

103 BNL686 AD_chr23, AD_chr09 (GA)22

104 BNL786 AD_chr15 (AG)14

105 BNL827 AD_chr25 (CA)19

106 BNL830 AD_chr15 (AC)10

107 BNL834 AD_chr17 (CA)13

108 BNL836 AD_chr07, AD_chr11 (TG)11

109 BNL852 AD_chr05, AD_chr19 (CA)13

110 BNL946 AD_chr10, AD_chr20 (GA)14

111 CIR009 AD_chr01, AD_chr15 (TG)6(N)1(TATG)6

112 CIR018 AD_chr01, AD_chr10 (TG)13

113 CIR030 AD_chr03, AD_chr14 (C)8(TC)6(CA)8

114 CIR097 AD_chr14 (GT)7+(GA)7

115 CIR110 AD_chr15 (AC)7+(N)9+(A)10

116 CIR119 AD_chr08, AD_chr24 (GT)8

117 CIR143 AD_chr15, AD_chr26 (AC)7(TA)5

118 CIR171 AD_chr09, AD_chr10, AD_chr20 (TG)16+(AG)6

119 CIR181 AD_chr03, AD_chr14 (TG)7

120 CIR187 AD_chr10, AD_chr20 (CA)8

121 CIR202 AD_chr03, AD_chr12 (AC)9

122 CIR218 AD_chr04, AD_chr22 (GT)9

123 CIR228 AD_chr03, AD_chr14 (TG)12(N)3(TGTA)11

124 CIR253 AD_chr05 (TC)15(N)8(AC)5(N)7(CA)8

125 CIR272 AD_chr12, AD_chr26 (CA)8

126 CIR289 AD_chr24 (TG)7

127 CIR372 AD_chr10 (GT)12

128 CIR376 AD_chr02, AD_chr05, AD_chr08 (CA)15

129 CIR381 AD_chr02, AD_chr04, AD_chr14 (AC)7

130 GH354 AD_chr19 (AGA)17

131 GH459 AD_chr19 (TCT)13

132 TMB1295 AD_chr19 (GA)22

133 TMB1489 AD_chr19 (GA)13

134 TMB1645 AD_chr19 (GA)36+(GA)12

135 TMB1750 AD_chr05 (GA)12+(TAA)6

$ based on CottonGen (http://www.cottongen.org) database

Table S3. Summary statistics, calculated with PowerMarker, for the SSR markers used to genotype the diversity panel of G. hirsutum landrace accessions. The letters ‘a’ and ‘b’ in a marker name differentiate between loci amplified by a single SSR primer pair.

Marker Major Allele Frequency No. of Alleles Gene Diversity Heterozygosity PIC

BNL3029_a 0.929 4.0 0.134 0.0 0.129

BNL3029_b 0.982 4.0 0.035 0.0 0.034

BNL1434 0.497 6.0 0.585 0.0 0.500

BNL1667_a 0.919 7.0 0.154 0.016 0.152

BNL1667_b 0.381 12.0 0.746 0.0 0.710

BNL226_a 0.601 6.0 0.599 0.016 0.569

BNL226_b 0.945 4.0 0.105 0.011 0.102

BNL1059 0.876 4.0 0.220 0.0 0.200

BNL2572 0.308 12.0 0.829 0.005 0.810

BNL3800 0.219 14.0 0.870 0.0 0.858

BNL852 0.686 8.0 0.494 0.005 0.460

BNL1440 0.713 8.0 0.458 0.011 0.425

BNL1693 0.865 8.0 0.248 0.0 0.242

BNL2440_a 0.951 4.0 0.094 0.0 0.092

BNL2440_b 0.265 10.0 0.805 0.0 0.778

BNL2564 0.789 5.0 0.363 0.011 0.345

BNL1145 0.978 3.0 0.042 0.0 0.042

BNL1064 0.668 5.0 0.489 0.005 0.430

BNL1665 0.685 9.0 0.503 0.011 0.476

BNL830 0.765 2.0 0.360 0.005 0.295

BNL3888 0.984 3.0 0.032 0.0 0.032

BNL2882 0.897 5.0 0.191 0.011 0.184

BNL1160 0.962 3.0 0.073 0.0 0.072

BNL3034 0.551 5.0 0.522 0.011 0.417

BNL1597 0.465 6.0 0.616 0.0 0.540

BNL1034 0.443 9.0 0.646 0.005 0.582

BNL3008 0.751 4.0 0.407 0.0 0.375

BNL2646 0.757 7.0 0.391 0.011 0.348

BNL1721 0.449 9.0 0.708 0.0 0.666

BNL3955 0.573 6.0 0.599 0.0 0.550

BNL1423 0.581 4.0 0.520 0.005 0.424

BNL119 0.546 9.0 0.609 0.005 0.553

BNL1551 0.465 9.0 0.684 0.00 0.635

BNL2553 0.886 2.0 0.201 0.0 0.181

BNL2471 0.768 8.0 0.371 0.0 0.323

BNL1079 0.978 3.0 0.042 0.0 0.042

BNL1673 0.870 10.0 0.240 0.0 0.235

BNL3441 0.868 6.0 0.240 0.005 0.228

BNL1414 0.811 4.0 0.321 0.011 0.292

BNL1231 0.616 6.0 0.530 0.011 0.460

BNL1694_a 0.904 2.0 0.173 0.005 0.158

BNL1694_b 0.497 9.0 0.686 0.011 0.651

BNL1513 0.616 6.0 0.540 0.0 0.479

BNL3993 0.895 7.0 0.197 0.005 0.194

BNL1061_a 1.0 1.0 0.0 0.0 0.0

BNL1061_b 0.978 2.0 0.043 0.0 0.042

BNL3452 0.951 5.0 0.094 0.0 0.092

BNL1066 0.583 6.0 0.543 0.0 0.463

BNL3627 0.916 7.0 0.159 0.005 0.156

BNL1417 0.786 4.0 0.346 0.005 0.302

BNL3511 0.913 5.0 0.161 0.0 0.154

CIR171_a 0.707 4.0 0.428 0.017 0.356

CIR171_b 0.802 5.0 0.335 0.033 0.308

BNL4007 0.573 3.0 0.539 0.0 0.451

BNL1045_a 1.0 1.0 0.0 0.0 0.0

BNL1045_b 0.984 2.0 0.032 0.0 0.031

BNL2960 0.598 7.0 0.572 0.056 0.520

BNL3479 0.571 3.0 0.503 0.022 0.392

BNL3976_a 0.978 2.0 0.043 0.0 0.042

BNL3976_b 0.762 4.0 0.372 0.016 0.317

CIR289 0.984 2.0 0.032 0.0 0.031

BNL3510 0.692 9.0 0.487 0.0 0.454

BNL2499 0.808 8.0 0.334 0.011 0.318

BNL2812_a 0.745 5.0 0.411 0.022 0.373

BNL2812_b 0.474 5.0 0.678 0.0 0.628

BNL3778 0.919 4.0 0.152 0.011 0.147

BNL3649 0.557 8.0 0.607 0.016 0.552

BNL3474_a 0.754 5.0 0.387 0.005 0.337

BNL3474_b 0.984 2.0 0.032 0.0 0.031

BNL3449_a 0.754 6.0 0.409 0.005 0.384

BNL3449_b 0.754 2.0 0.371 0.005 0.302

BNL2967 0.735 6.0 0.433 0.0101 0.404

BNL1438 0.978 3.0 0.042 0.0 0.042

BNL3418 0.840 4.0 0.276 0.005 0.252

BNL3590_a 0.686 5.0 0.492 0.022 0.457

BNL3590_b 0.765 4.0 0.367 0.005 0.311

BNL3599 0.568 7.0 0.605 0.011 0.555

BNL3650 0.603 7.0 0.590 0.0 0.556

BNL3379 0.551 5.0 0.557 0.011 0.472

CIR143 0.967 2.0 0.063 0.0 0.061

CIR272_a 0.968 3.0 0.063 0.0 0.062

CIR272_b 0.719 3.0 0.413 0.0 0.340

CIR372_a 0.624 4.0 0.488 0.081 0.393

CIR372_b 0.988 3.0 0.024 0.0 0.024

BNL1646 0.765 6.0 0.377 0.005 0.334

TMB1750_a 0.967 2.0 0.063 0.011 0.061

TMB1750_b 0.978 2.0 0.042 0.011 0.041

BNL2667 0.476 17.0 0.712 0.005 0.683

TMB1489 0.978 5.0 0.043 0.0 0.042

BNL2655 0.727 9.0 0.450 0.011 0.427

BNL252 0.708 6.0 0.432 0.011 0.366

BNL358 0.757 12.0 0.413 0.016 0.396

BNL2495 0.670 5.0 0.479 0.0 0.414

BNL3031 0.513 13.0 0.607 0.005 0.538

BNL2895 0.765 7.0 0.386 0.005 0.351

BNL3537 0.605 5.0 0.559 0.659 0.504

CIR030 0.551 4.0 0.509 0.0 0.395

BNL409 0.940 3.0 0.114 0.0 0.110

CIR181 0.984 3.0 0.032 0.0 0.032

BNL580_a 0.627 3.0 0.491 0.005 0.400

BNL580_b 0.560 6.0 0.569 0.0 0.494

BNL256 0.913 5.0 0.163 0.0 0.159

BNL786 0.713 7.0 0.464 0.0 0.435

BNL1666_a 0.902 2.0 0.176 0.011 0.161

BNL1666_b 0.758 7.0 0.393 0.022 0.357

BNL1395 0.495 5.0 0.639 0.011 0.578

BNL827 0.773 5.0 0.378 0.005 0.349

BNL3482 0.408 8.0 0.710 0.005 0.663

BNL3835 0.432 10.0 0.658 0.016 0.597

BNL285 0.586 4.0 0.572 0.016 0.511

BNL834 0.581 3.0 0.450 0.0 0.390

TMB1645 0.179 28.0 0.894 0.0 0.885

BNL1404 0.881 4.0 0.217 0.0 0.207

CIR218_a 0.921 5.0 0.147 0.005 0.141

CIR218_b 0.473 10.0 0.636 1.0 0.57

BNL946 0.651 4.0 0.465 0.005 0.371

GH459 0.639 5.0 0.484 0.005 0.395

CIR009 0.967 3.0 0.065 0.0 0.064

BNL1153_a 1.0 1.0 0.0 0.0 0.0

BNL1153_b 0.957 3.0 0.084 0.0 0.082

BNL1161_a 1.0 1.0 0.0 0.0 0.0

BNL1161_b 0.617 9.0 0.584 0.011 0.557

BNL3594_b 0.478 10.0 0.713 0.005 0.684

BNL3987 0.764 7.0 0.395 0.005 0.369

BNL836 0.889 2.0 0.197 0.005 0.178

BNL1162 0.708 6.0 0.465 0.0 0.431

BNL1017 0.984 2.0 0.032 0.0 0.031

BNL4041 0.735 4.0 0.403 0.0 0.341

BNL1604_a 0.602 8.0 0.607 0.0 0.583

BNL1604_b 0.608 8.0 0.581 0.027 0.544

BNL1047 0.957 3.0 0.084 0.0 0.082

CIR119_a 1.0 1.0 0.0 0.0 0.0

CIR119_b 0.924 2.0 0.140 0.0 0.130

CIR97_a 0.984 2.0 0.032 0.0 0.032

CIR97_b 0.535 2.0 0.497 0.0 0.379

CIR110 0.876 3.0 0.225 0.0 0.213

BNL1672 0.440 12.0 0.713 0.200 0.673

BNL1122_a 0.520 4.0 0.607 0.052 0.538

BNL1122_b 0.413 5.0 0.674 0.064 0.615

CIR187_a 0.946 3.0 0.104 0.054 0.100

CIR187_b 0.994 2.0 0.011 0.0 0.011

BNL1030 0.611 4.0 0.497 0.011 0.400

BNL1350 0.93 5.0 0.134 0.0 0.131

BNL1408 0.837 6.0 0.286 0.0 0.267

BNL1421 0.322 14.0 0.824 0.016 0.806

BNL3084 0.978 3.0 0.042 0.0 0.042

BNL4029 0.788 5.0 0.356 0.0 0.328

GH354 0.799 9.0 0.352 0.005 0.341

Mean 0.74 5.53 0.36 0.02 0.33
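
The gene diversity and PIC columns of Table S3 can be related to allele frequencies with a short sketch. It assumes the standard definitions (gene diversity as expected heterozygosity, 1 - sum of squared allele frequencies, and PIC as defined by Botstein et al. 1980); the function names are illustrative and are not taken from PowerMarker itself. The example uses BNL830, which has two alleles, so the second allele frequency is inferred as 1 - 0.765.

```python
from typing import Sequence


def gene_diversity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return gene_diversity(freqs) - cross


# BNL830: major allele frequency 0.765 and two alleles in Table S3;
# the minor allele frequency (0.235) is inferred, not reported in the table.
bnl830 = [0.765, 0.235]
print(round(gene_diversity(bnl830), 3), round(pic(bnl830), 3))
```

For this two-allele locus the sketch gives values that round to the 0.360 and 0.295 reported for BNL830. Loci with more than two alleles cannot be recomputed from the table alone, because only the major allele frequency is listed.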

Table S4. List of accessions carrying unique alleles in the diversity panel

Sr. No Project Id Accession No. of unique alleles Geographical location of the accession collected

1 PP1 TX-2 1 Guerrero, Mexico

2 PP3 TX-31 2 Chiapas, Mexico

3 PP4 TX-32 3 Mexico

4 PP5 TX-33 3 Chiapas, Mexico

5 PP8 TX-41 2 Chiapas, Mexico

6 PP12 TX-50 1 Chiapas, Mexico

7 PP13 TX-53 2 Chiapas, Mexico

8 PP14 TX-57 1 Chiapas, Mexico

9 PP15 TX-60 1 Chiapas, Mexico

10 PP16 TX-61 1 Chiapas, Mexico

11 PP20 TX-68 1 Guatemala

12 PP21 TX-72 1 Guatemala

13 PP26 TX-100 1 Jutiapa, Guatemala

14 PP27 TX-106 2 Chiquimula, Guatemala

15 PP32 TX-149 1 Santa Rosa, Guatemala

16 PP33 TX-151 2 Jutiapa, Guatemala

17 PP37 TX-158 1 Chiquimula, Guatemala

18 PP39 TX-168 1 Jutiapa, Guatemala

19 PP40 TX-170 1 Chiquimula, Guatemala

20 PP41 TX-171 1 Oaxaca, Mexico

21 PP45 TX-209 1 Chiquimula, Guatemala

22 PP49 TX-226 2 Guerrero, Mexico

23 PP54 TX-243 2 Oaxaca, Mexico

24 PP57 TX-247 1 Jutiapa, Guatemala

25 PP58 TX-326 1 Guerrero, Mexico

26 PP60 TX-570 2 Sudan

27 PP61 TX-612 16 Mexico

28 PP70 TX-790 2 Belize

29 PP71 TX-1149 1 Texas Race Collection

30 PP73 TX-0017 7 Mexico

31 PP76 TX-0020 1 Chiapas, Mexico

32 PP83 TX-0030 1 Chiapas, Mexico

33 PP87 TX-0034 1 Chiapas, Mexico

34 PP89 TX-0037 1 Chiapas, Mexico

35 PP104 TX-0065 2 Chiapas, Mexico

36 PP108 TX-0079 1 Guatemala

37 PP109 TX-0080 1 Guatemala

38 PP112 TX-0088 1 Guatemala

39 PP118 TX-0112 1 Chiquimula, Guatemala

40 PP124 TX-0160 1 Oaxaca, Mexico

41 PP125 TX-0162 1 Jutiapa, Guatemala

42 PP137 TX-0235 1 El Salvador

43 PP144 TX-0347 26 Guerrero, Mexico

44 PP145 TX-0399 1 Arizona, US

45 PP147 TX-0404 3 Arizona, US

46 PP150 TX-0466 2 Chiapas, Mexico

47 PP152 TX-0604 22 Mexico

48 PP153 TX-0610 1 Mexico

49 PP156 TX-0711 1 Guerrero, Mexico

50 PP157 TX-0717 2 Guerrero, Mexico

51 PP159 TX-0770 3 Chiapas, Mexico

52 PP165 TX-0940 1 Colombia

53 PP166 TX-0953 8 Oaxaca, Mexico

54 PP167 TX-1003 1 Luzon, Philippines

55 PP177 TX-1166 1 Huehuetenango, Guatemala

56 PP181 TX-1325 1 Cote D'Ivoire

57 PP183 TX-1534 4 Martinique

58 PP184 TX-1556 3 Dominica

59 PP185 TX-1600 1 Haiti

60 PP186 TX-1630 3 Guadeloupe

61 PP187 TX-1801 3 Martinique

62 PP189 TX-2106 1 Mexico

63 PP190 TX-2289 3 Puerto Rico

64 PP192 TX-2368 2 Paraguay

65 PP195 TX-2408 3 Brazil

66 PP198 TX-7 1 Mexico

67 PP200 TX-64 1 Chiapas, Mexico

68 PP202 TX-78 9 Guatemala

69 PP204 TX-101 1 Jutiapa, Guatemala

70 PP207 TX-124 1 Jutiapa, Guatemala

71 PP210 TX-175 1 Jutiapa, Guatemala

72 PP211 TX-212 4 Oaxaca, Mexico

Table S5. Proportional membership of cotton accessions to clusters as determined by model-based analysis using STRUCTURE. Lines were assigned to a group based on membership probability higher than 0.60.

Project ID Line/Accession Location Group1 Group2 Group3 Group4 Group5 Group Assigned

PP29 TX-119 Jutiapa, Guatemala 0.998 0.001 0 0 0 Group1

PP67 TX-725 Belize 0.998 0.001 0 0 0.001 Group1

PP70 TX-790 Belize 0.862 0.1 0.003 0.001 0.035 Group1

PP147 TX-0404 Arizona, US 0.961 0.001 0.012 0.003 0.023 Group1

PP172 TX-1119 Campeche, Mexico 0.998 0 0 0 0 Group1

PP188 TX-2077 Mexico 0.998 0 0 0 0.001 Group1

PP189 TX-2106 Mexico 0.998 0 0.001 0 0 Group1

PP4 TX-32 Mexico 0.003 0.993 0.001 0.001 0.002 Group2

PP25 TX-96 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP26 TX-100 Jutiapa, Guatemala 0.001 0.898 0.003 0.001 0.097 Group2

PP27 TX-106 Chiquimula, Guatemala 0.002 0.994 0.002 0.002 0.001 Group2

PP28 TX-113 Chiquimula, Guatemala 0.002 0.837 0.003 0.156 0.002 Group2

PP30 TX-121 Jalapa, Guatemala 0.001 0.996 0.001 0.001 0.001 Group2

PP32 TX-149 Santa Rosa, Guatemala 0.001 0.672 0.277 0.013 0.038 Group2

PP33 TX-151 Jutiapa, Guatemala 0.014 0.918 0.004 0.061 0.001 Group2

PP35 TX-155 Jalapa, Guatemala 0.001 0.996 0.001 0.001 0.001 Group2

PP36 TX-156 Jalapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP37 TX-158 Chiquimula, Guatemala 0.001 0.988 0.001 0.006 0.003 Group2

PP39 TX-168 Jutiapa, Guatemala 0.001 0.885 0.002 0.001 0.111 Group2

PP40 TX-170 Chiquimula, Guatemala 0.003 0.722 0.007 0.267 0.001 Group2

PP42 TX-178 Chiquimula, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP44 TX-197 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP45 TX-209 Chiquimula, Guatemala 0.001 0.901 0.096 0.002 0.001 Group2

PP46 TX-215 Jutiapa, Guatemala 0.002 0.989 0.006 0.001 0.001 Group2

PP47 TX-219 Jalapa, Guatemala 0.002 0.994 0.002 0.001 0.001 Group2

PP48 TX-221 Chiquimula, Guatemala 0.001 0.996 0.001 0.001 0.001 Group2

PP51 TX-237 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP53 TX-241 Baja Verapaz, Guatemala 0.001 0.918 0.078 0.002 0.001 Group2

PP64 TX-634 Guatemala 0.001 0.746 0.249 0.002 0.002 Group2

PP65 TX-636 Guatemala 0.001 0.766 0.226 0.002 0.005 Group2

PP66 TX-641 Guatemala 0.001 0.974 0.004 0.001 0.02 Group2

PP87 TX-0034 Chiapas, Mexico 0.001 0.982 0.011 0.001 0.005 Group2

PP110 TX-0083 Guatemala 0.002 0.918 0.012 0.001 0.066 Group2

PP112 TX-0088 Guatemala 0.001 0.996 0.002 0.001 0.001 Group2

PP113 TX-0090 Guatemala 0.001 0.978 0.002 0.019 0.001 Group2

PP116 TX-0105 Chiquimula, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP117 TX-0106 Chiquimula, Guatemala 0.001 0.996 0.001 0.001 0.001 Group2

PP130 TX-0199 Jutiapa, Guatemala 0.001 0.985 0.002 0.012 0.001 Group2

PP131 TX-0200 Jalapa, Guatemala 0.001 0.991 0.006 0.001 0.001 Group2

PP133 TX-0214 Jutiapa, Guatemala 0.001 0.856 0.056 0.082 0.006 Group2

PP136 TX-0227 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP138 TX-0236 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP154 TX-0624 Chiapas, Mexico 0.001 0.977 0.003 0.001 0.018 Group2

PP185 TX-1600 Haiti 0.001 0.997 0.001 0.001 0.001 Group2

PP203 TX-81 Guatemala 0.001 0.741 0.002 0.254 0.002 Group2

PP205 TX-104 Jalapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP207 TX-124 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP208 TX-173 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP209 TX-174 Jutiapa, Guatemala 0.001 0.997 0.001 0.001 0.001 Group2

PP210 TX-175 Jutiapa, Guatemala 0.001 0.995 0.002 0.001 0.002 Group2

PP1 TX-2 Guerrero, Mexico 0.001 0.004 0.728 0.203 0.065 Group3

PP3 TX-31 Chiapas, Mexico 0.003 0.018 0.976 0.002 0.002 Group3

PP6 TX-36 Chiapas, Mexico 0.015 0.023 0.959 0.001 0.003 Group3

PP7 TX-40 Chiapas, Mexico 0.001 0.002 0.992 0.001 0.005 Group3

PP8 TX-41 Chiapas, Mexico 0.001 0.003 0.926 0.001 0.069 Group3

PP9 TX-43 Chiapas, Mexico 0.001 0.003 0.991 0.002 0.004 Group3

PP11 TX-48 Mexico 0.001 0.003 0.993 0.001 0.002 Group3

PP12 TX-50 Chiapas, Mexico 0.001 0.004 0.988 0.003 0.004 Group3

PP13 TX-53 Chiapas, Mexico 0.001 0.273 0.674 0.005 0.047 Group3

PP14 TX-57 Chiapas, Mexico 0.008 0.22 0.695 0.011 0.066 Group3

PP15 TX-60 Chiapas, Mexico 0.002 0.168 0.827 0.002 0.002 Group3

PP16 TX-61 Chiapas, Mexico 0.001 0.202 0.759 0.002 0.035 Group3

PP17 TX-62 Chiapas, Mexico 0.001 0.002 0.986 0.009 0.003 Group3

PP18 TX-63 Chiapas, Mexico 0.004 0.005 0.94 0.032 0.02 Group3

PP19 TX-67 Chiapas, Mexico 0.002 0.007 0.987 0.001 0.002 Group3

PP31 TX-140 Jutiapa, Guatemala 0.011 0.256 0.656 0.005 0.072 Group3

PP34 TX-154 Jutiapa, Guatemala 0.002 0.3 0.696 0.001 0.002 Group3

PP38 TX-164 Chiquimula, Guatemala 0.007 0.001 0.991 0.001 0.001 Group3

PP43 TX-180 Santa Rosa, Guatemala 0.001 0.002 0.995 0.001 0.001 Group3

PP49 TX-226 Guerrero, Mexico 0.001 0.001 0.995 0.001 0.002 Group3

PP52 TX-238 Jalapa, Guatemala 0.002 0.001 0.995 0.002 0.001 Group3

PP54 TX-243 Oaxaca, Mexico 0.025 0.001 0.971 0.002 0.002 Group3

PP55 TX-244 Oaxaca, Mexico 0.001 0.001 0.996 0.001 0.001 Group3

PP56 TX-245 Guerrero, Mexico 0.002 0.005 0.992 0.001 0.001 Group3

PP60 TX-570 Sudan 0.001 0.01 0.927 0.046 0.016 Group3

PP62 TX-620 Mexico 0.001 0.001 0.994 0.001 0.003 Group3

PP68 TX-763 San Luis, Mexico 0.004 0.003 0.75 0.239 0.004 Group3

PP69 TX-764 San Luis, Mexico 0.004 0.003 0.751 0.238 0.004 Group3

PP72 TX-0016 Mexico 0.001 0.217 0.728 0.044 0.011 Group3

PP76 TX-0020 Chiapas, Mexico 0.002 0.001 0.93 0.001 0.066 Group3

PP77 TX-0022 Chiapas, Mexico 0.032 0.002 0.683 0.002 0.281 Group3

PP83 TX-0030 Chiapas, Mexico 0.001 0.06 0.931 0.001 0.006 Group3

PP89 TX-0037 Chiapas, Mexico 0.003 0.006 0.988 0.001 0.002 Group3

PP90 TX-0039 Chiapas, Mexico 0.003 0.001 0.994 0.001 0.001 Group3

PP93 TX-0046 Chiapas, Mexico 0.001 0.132 0.863 0.001 0.002 Group3

PP96 TX-0052 Chiapas, Mexico 0.002 0.276 0.603 0.001 0.118 Group3

PP105 TX-0066 Chiapas, Mexico 0.001 0.291 0.702 0.001 0.004 Group3

PP121 TX-0152 Jutiapa, Guatemala 0.001 0.02 0.96 0.006 0.013 Group3

PP124 TX-0160 Oaxaca, Mexico 0.001 0.002 0.992 0.001 0.004 Group3

PP126 TX-0167 Jutiapa, Guatemala 0.001 0.001 0.993 0.004 0.001 Group3

PP129 TX-0182 Guerrero, Mexico 0.026 0.003 0.962 0.006 0.003 Group3

PP132 TX-0204 Mexico 0.001 0.001 0.996 0.001 0.001 Group3

PP140 TX-0267 Oaxaca, Mexico 0.009 0.04 0.943 0.005 0.003 Group3

PP141 TX-0307 Guerrero, Mexico 0.004 0.014 0.976 0.004 0.001 Group3

PP146 TX-0401 Arizona, US 0.003 0.004 0.856 0.079 0.057 Group3

PP156 TX-0711 Guerrero, Mexico 0.001 0.001 0.99 0.001 0.007 Group3

PP157 TX-0717 Guerrero, Mexico 0.001 0.001 0.996 0.001 0.001 Group3

PP158 TX-0738 Mexico 0.003 0.002 0.98 0.015 0.001 Group3

PP161 TX-0775 Chiapas, Mexico 0.001 0.001 0.997 0.001 0.001 Group3

PP162 TX-0832 Trinidad and Tobago 0.012 0.001 0.657 0.205 0.125 Group3

PP164 TX-0931 Uzbekistan 0.001 0.001 0.998 0.001 0 Group3

PP165 TX-0940 Colombia 0.387 0.001 0.61 0.001 0.001 Group3

PP167 TX-1003 Luzon, Philippines 0.01 0.081 0.905 0.002 0.002 Group3

PP173 TX-1121 Texas, US 0.001 0.004 0.905 0.004 0.086 Group3

PP174 TX-1125 Luzon, Philippines 0.001 0.006 0.962 0.004 0.027 Group3

PP175 TX-1131 Veracruz, Mexico 0.002 0.001 0.995 0.002 0.001 Group3

PP176 TX-1148 Kefa, Ethiopia 0.265 0.001 0.732 0.001 0.001 Group3

PP180 TX-1324 Cote D'Ivoire 0.001 0.001 0.996 0.001 0.001 Group3

PP181 TX-1325 Cote D'Ivoire 0.001 0.001 0.997 0.001 0.001 Group3

PP182 TX-1464 Texas, US 0.001 0.001 0.997 0.001 0.001 Group3

PP191 TX-2367 Paraguay 0.003 0.004 0.99 0.001 0.002 Group3

PP192 TX-2368 Paraguay 0.001 0.001 0.997 0.001 0.001 Group3

PP193 TX-2387 Uzbekistan 0.002 0.001 0.984 0.003 0.01 Group3

PP194 TX-2388 Uzbekistan 0.002 0.001 0.994 0.001 0.001 Group3

PP195 TX-2408 Brazil 0.027 0.013 0.927 0.004 0.029 Group3

PP196 TX-2410 Malta 0.001 0.002 0.862 0.002 0.133 Group3

PP200 TX-64 Chiapas, Mexico 0.026 0.128 0.835 0.004 0.007 Group3

PP204 TX-101 Jutiapa, Guatemala 0.001 0.001 0.995 0.001 0.001 Group3

PP206 TX-117 Oaxaca, Mexico 0.001 0.288 0.708 0.002 0.001 Group3

PP20 TX-68 Guatemala 0.001 0.004 0.002 0.991 0.002 Group4

PP22 TX-76 Guatemala 0.001 0.014 0.002 0.982 0.002 Group4

PP23 TX-87 Guatemala 0.004 0.009 0.377 0.609 0.001 Group4

PP24 TX-91 Guatemala 0.001 0.102 0.001 0.896 0.001 Group4

PP50 TX-228 Jalapa, Guatemala 0.001 0.005 0.002 0.991 0.002 Group4

PP57 TX-247 Jutiapa, Guatemala 0.001 0.002 0.006 0.991 0.001 Group4

PP58 TX-326 Guerrero, Mexico 0.001 0.001 0.002 0.98 0.016 Group4

PP63 TX-633 Guatemala 0.001 0.053 0.002 0.943 0.001 Group4

PP137 TX-0235 El Salvador 0.001 0.001 0.002 0.99 0.007 Group4

PP142 TX-0320 Guerrero, Mexico 0.001 0.001 0.002 0.988 0.008 Group4

PP143 TX-0343 Guerrero, Mexico 0.001 0.002 0.253 0.717 0.028 Group4

PP163 TX-0878 Puerto Rico 0.049 0.115 0.002 0.83 0.004 Group4

PP177 TX-1166 Huehuetenango, Guatemala 0.001 0.004 0.001 0.993 0.001 Group4

PP187 TX-1801 Martinique 0.001 0.202 0.203 0.592 0.001 Group4

PP190 TX-2289 Puerto Rico 0.006 0.001 0.2 0.791 0.001 Group4

PP198 TX-7 Mexico 0.001 0.004 0.003 0.991 0.001 Group4

PP202 TX-78 Guatemala 0.004 0.001 0.002 0.992 0.002 Group4

PP211 TX-212 Oaxaca, Mexico 0.017 0.002 0.02 0.769 0.193 Group4

PP213 TX-329 Guerrero, Mexico 0.001 0.001 0.002 0.994 0.002 Group4

PP2 TX-29 Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP74 TX-0018 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP75 TX-0019 Chiapas, Mexico 0.002 0.002 0.226 0.005 0.766 Group5

PP78 TX-0025 Chiapas, Mexico 0.001 0.001 0.002 0.001 0.995 Group5

PP79 TX-0026 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP81 TX-0028 Mexico 0.001 0.018 0.002 0.001 0.978 Group5

PP148 TX-0464 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP149 TX-0465 Chiapas, Mexico 0.001 0.005 0.271 0.013 0.71 Group5

PP150 TX-0466 Chiapas, Mexico 0.001 0.085 0.14 0.007 0.767 Group5

PP151 TX-0469 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP160 TX-0773 Chiapas, Mexico 0.001 0.018 0.384 0.003 0.594 Group5

PP168 TX-1005 Zambezia, Mozambique 0.001 0.001 0.001 0.004 0.992 Group5

PP170 TX-1103 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP171 TX-1105 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.997 Group5

PP178 TX-1167 Chiapas, Mexico 0.001 0.001 0.001 0.001 0.996 Group5

PP179 TX-1168 Chiapas, Mexico 0.001 0.078 0.165 0.002 0.754 Group5

PP5 TX-33 Chiapas, Mexico 0.001 0.016 0.478 0.001 0.504 mixed

PP10 TX-45 Chiapas, Mexico 0.001 0.01 0.553 0.003 0.433 mixed

PP21 TX-72 Guatemala 0.001 0.468 0.528 0.002 0.002 mixed

PP41 TX-171 Oaxaca, Mexico 0.002 0.002 0.243 0.51 0.243 mixed

PP59 TX-338 Guerrero, Mexico 0.001 0.002 0.503 0.493 0.001 mixed

PP71 TX-1149 Texas Race Collection 0.429 0.001 0.568 0.001 0.001 mixed

PP73 TX-0017 Mexico 0.072 0.034 0.287 0.508 0.1 mixed

PP80 TX-0027 Chiapas, Mexico 0.002 0.067 0.508 0.002 0.422 mixed

PP98 TX-0055 Chiapas, Mexico 0.001 0.314 0.52 0.159 0.006 mixed

PP100 TX-0058 Chiapas, Mexico 0.001 0.294 0.507 0.001 0.197 mixed

PP104 TX-0065 Chiapas, Mexico 0.002 0.289 0.508 0.002 0.2 mixed

PP108 TX-0079 Guatemala 0.163 0.029 0.449 0.112 0.248 mixed

PP109 TX-0080 Guatemala 0.001 0.537 0.002 0.458 0.001 mixed

PP118 TX-0112 Chiquimula, Guatemala 0.001 0.397 0.524 0.077 0.001 mixed

PP125 TX-0162 Jutiapa, Guatemala 0.001 0.005 0.47 0.351 0.173 mixed

PP145 TX-0399 Arizona, US 0.004 0.029 0.396 0.49 0.08 mixed

PP153 TX-0610 Mexico 0.001 0.516 0.478 0.001 0.004 mixed

PP155 TX-0695 Cortes, Honduras 0.001 0.003 0.461 0.002 0.533 mixed

PP159 TX-0770 Chiapas, Mexico 0.002 0.265 0.143 0.587 0.003 mixed

PP166 TX-0953 Oaxaca, Mexico 0.002 0.259 0.329 0.406 0.003 mixed

PP169 TX-1053 Chad 0.467 0.003 0.52 0.002 0.008 mixed

PP183 TX-1534 Martinique 0.002 0.007 0.458 0.532 0.001 mixed

PP184 TX-1556 Dominica 0.001 0.001 0.515 0.483 0.001 mixed

PP186 TX-1630 Guadeloupe 0.002 0.001 0.42 0.575 0.002 mixed

PP197 TX-2517 Sonora, Mexico 0.1 0.114 0.492 0.29 0.005 mixed

PP199 TX-30 Chiapas, Mexico 0.001 0.007 0.511 0.002 0.479 mixed

PP201 TX-77 Guatemala 0.001 0.547 0.001 0.449 0.001 mixed

PP212 TX-239 Chiquimula, Guatemala 0.001 0.406 0.501 0.001 0.091 mixed
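
The assignment rule stated in the caption of Table S5 can be sketched as follows. This is a minimal illustration that assumes a strict cutoff of 0.60 on the largest membership coefficient; the function name and the default threshold argument are hypothetical. A few rows in the table (for example PP187 at 0.592 and PP160 at 0.594) are assigned to a group slightly below that cutoff, so the rule actually applied may have differed in rounding or threshold.

```python
from typing import Sequence


def assign_group(memberships: Sequence[float], threshold: float = 0.60) -> str:
    """Return 'GroupK' for the cluster with the highest membership
    coefficient if it exceeds the threshold, otherwise 'mixed'."""
    best = max(range(len(memberships)), key=lambda k: memberships[k])
    if memberships[best] > threshold:
        return f"Group{best + 1}"
    return "mixed"


# Membership coefficients copied from Table S5.
print(assign_group([0.998, 0.001, 0, 0, 0]))              # PP29
print(assign_group([0.001, 0.016, 0.478, 0.001, 0.504]))  # PP5
```

With these inputs PP29 falls in Group1 and PP5 is labelled mixed, in agreement with the table.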

Table S6. Fixation indices (FIS and FST) and gene flow estimate (Nm) for each locus across the groups obtained from STRUCTURE analysis of the diversity panel of landrace accessions. Letters ‘a’ and ‘b’ in the marker name differentiate between loci amplified by a single SSR primer pair.

Locus Fwc(IS) Fwc(ST) Nm*

BNL1666_a 0.737 0.801 0.062

BNL3008 1.000 0.759 0.079

BNL3379 0.963 0.753 0.082

BNL2564 0.9461 0.738 0.089

BNL1404 1.000 0.715 0.100

CIR119_b 1.000 0.705 0.105

BNL256 1.000 0.701 0.106

CIR97_b 1.000 0.675 0.121

GH459 1.000 0.627 0.148

BNL1666_b 0.890 0.601 0.166

BNL4029 1.000 0.581 0.180

BNL1693 1.000 0.567 0.191

CIR218_a 0.921 0.551 0.204

BNL3031 1.000 0.533 0.219

BNL2967 0.974 0.533 0.219

BNL852 0.976 0.522 0.229

BNL1646 1.000 0.518 0.232

BNL1064 1.000 0.501 0.249

BNL3479 0.955 0.450 0.250

BNL836 0.943 0.497 0.253

BNL3835 0.966 0.496 0.253

BNL1030 1.000 0.495 0.255

BNL3987 1.000 0.485 0.266

CIR272_b 1.000 0.483 0.267

BNL1066 1.000 0.481 0.269

BNL1597 1.000 0.478 0.273

BNL1034 0.983 0.464 0.289

BNL3418 0.961 0.461 0.293

CIR110 1.000 0.432 0.329

BNL1694_a 1.000 0.432 0.329

BNL580_a 1.000 0.430 0.331

BNL580_b 1.000 0.428 0.335

BNL1417 0.971 0.425 0.339

BNL3449_a 1.000 0.424 0.340

BNL3599 0.984 0.411 0.357

CIR171_a 0.938 0.397 0.380

CIR372_a 0.783 0.395 0.383

BNL252 0.956 0.385 0.400

BNL3955 1.000 0.380 0.407

BNL1672 0.654 0.371 0.425

BNL1231 0.964 0.370 0.426

BNL3594_b 0.987 0.361 0.442

BNL1665 0.966 0.361 0.443

BNL1604_a 1.000 0.360 0.445

BNL2655 0.957 0.358 0.447

BNL2471 1.000 0.353 0.457

BNL1395 0.986 0.342 0.481

BNL2895 1.000 0.341 0.483

BNL2495 1.000 0.337 0.492

BNL1162 1.000 0.335 0.496

BNL1423 1.000 0.333 0.501

GH354 0.974 0.321 0.529

BNL285 0.953 0.310 0.556

CIR171_b 0.951 0.306 0.567

BNL226_a 0.972 0.306 0.568

BNL4007 1.000 0.305 0.569

BNL119 0.986 0.304 0.572

BNL2812_b 1.000 0.299 0.586

BNL1122_a 0.911 0.295 0.596

BNL834 1.000 0.295 0.597

BNL3441 1.000 0.293 0.603

BNL3976_b 0.924 0.286 0.625

BNL3650 1.000 0.285 0.627

BNL3649 0.986 0.285 0.628

BNL786 1.000 0.283 0.633

BNL2960 0.871 0.276 0.655

BNL1414 0.975 0.266 0.691

BNL946 0.982 0.265 0.694

BNL1161_b 0.986 0.262 0.706

BNL3034 0.968 0.259 0.715

BNL3590_a 0.968 0.256 0.727

BNL2646 0.953 0.251 0.744

BNL3029_a 1.000 0.248 0.757

BNL830 0.976 0.236 0.809

BNL1434 1.000 0.234 0.817

BNL1513 1.000 0.230 0.837

BNL1122_b 0.890 0.229 0.842

BNL1604_b 0.972 0.225 0.859

BNL2440_b 1.000 0.217 0.903

BNL1721 1.000 0.211 0.936

BNL3800 1.000 0.200 0.997

BNL1408 1.000 0.199 1.00

BNL1551 1.000 0.195 1.03

BNL3537 -0.387 0.190 1.07

BNL1667_b 1.000 0.189 1.07

BNL1421 0.973 0.185 1.10

BNL1440 0.982 0.181 1.13

BNL3449_b 0.981 0.176 1.17

BNL3474_a 0.978 0.175 1.18

BNL2499 0.953 0.168 1.24

BNL3510 1.000 0.162 1.30

BNL4041 1.000 0.160 1.31

BNL3590_b 0.977 0.159 1.32

BNL1694_b 0.978 0.157 1.34

BNL2572 0.991 0.144 1.48

BNL2812_a 0.926 0.135 1.59

BNL2667 0.990 0.129 1.68

BNL2553 1.000 0.126 1.73

CIR030 1.000 0.125 1.75

TMB1645 1.000 0.118 1.86

BNL2882 0.088 0.117 1.88

BNL3511 1.000 0.100 2.24

BNL3482 0.990 0.094 2.40

BNL1673 1.000 0.093 2.44

CIR218_b -0.681 0.071 3.29

BNL827 0.980 0.066 3.55

BNL1667_a 0.937 0.046 5.16

BNL1059 1.000 0.033 7.23

BNL1350 1.000 0.033 7.37

BNL3993 0.960 0.032 7.56

BNL358 0.983 0.028 8.68

BNL3627 1.000 0.020 12.25

BNL3778 0.869 0.014 17.74

*Nm = Gene flow estimated from 0.25(1-FST)/FST
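
As a worked check of the footnote formula, the sketch below recomputes Nm from the FST values printed in Table S6. The function name is illustrative. Because the table reports FST rounded to three decimals, recomputed values can differ from the printed Nm in the last digit for some loci.

```python
def gene_flow(fst: float) -> float:
    """Gene flow estimate Nm = 0.25 * (1 - FST) / FST."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1] for Nm to be defined")
    return 0.25 * (1.0 - fst) / fst


# FST values copied from Table S6.
print(round(gene_flow(0.801), 3))  # BNL1666_a
print(round(gene_flow(0.705), 3))  # CIR119_b
```

Both recomputed values agree with the Nm column for these two loci (0.062 and 0.105).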

References

Abdalla, A.M., O.U.K. Reddy, K.M. El-Zik, and A.E. Pepper. 2001. Genetic diversity and relationships of diploid and tetraploid cottons revealed using AFLP. Theor. Appl. Genet. 102:222–229.

Abdurakhmonov, I.Y., R.J. Kohel, J.Z. Yu, A.E. Pepper, A.A. Abdullaev, F.N. Kushanov, L.B. Salakhutdinov, Z.T. Buriev, S. Saha, B.E. Scheffler, J.N. Jenkins, and A. Abdukarimov. 2008. Molecular diversity and association mapping of fiber quality traits in exotic G. hirsutum L. germplasm. Genomics 92:478–487.

Barton, N.H., and M. Slatkin. 1986. A quasi-equilibrium theory of the distribution of rare alleles in a subdivided population. Heredity 56:409–415.

Bertini, C.H.C.D., I. Schuster, T. Sediyama, E.G. Barros, and M.A. Moreira. 2006. Characterization and genetic diversity analysis of cotton cultivars using microsatellites. Genet. Mol. Biol. 29:321–329.

Bolek, Y., K.M. El-Zik, A.E. Pepper, A.A. Bell, C.W. Magill, P.M. Thaxton, and O.U.K. Reddy. 2005. Mapping of verticillium wilt resistance genes in cotton. Plant Sci. 168(6):1581–1590.

Bowman, D.T., O.L. May, and D.S. Calhoun. 1996. Genetic base of upland cotton cultivars released between 1970 and 1990. Crop Sci. 36:577–581.

Brubaker, C.L., F.M. Bourland, and J.F. Wendel. 1999. The origin and domestication of cotton. In: C.W. Smith, and J.T. Cothren, editors, Cotton: Origin, History, Technology, and Production. Wiley, New York, p. 3–32.

Campbell, B.T., V.E. Williams, and W. Park. 2009. Using molecular markers and field performance data to characterize the Pee Dee cotton germplasm resources. Euphytica 169:285–301.

Campbell, B.T., S. Saha, R. Percy, J. Frelichowski, J.N. Jenkins, W. Park, C.D. Mayee, V. Gotmare, D. Dessauw, M. Giband, X. Du, et al. 2010. Status of the global cotton germplasm resources. Crop Sci. 50:1161-1179.

Carvalho, V.P., C.F. Ruas, J.M. Ferreira, R.M. Moreira, and P.M. Ruas. 2004. Genetic diversity among maize (Zea mays L.) landraces assessed by RAPD markers. Genet. Mol. Biol. 27:228-236.

Dent, A.E., and M.V. Bridgett. 2012. STRUCTURE HARVESTER: A website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv. Genet. Resour. 4:359–361.


Evanno, G., S. Regnaut, and J. Goudet. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol. Ecol. 14:2611–2620.

Excoffier, L., and H.E.L. Lischer. 2010. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 10:564–567.

Fang, D.D., L.L. Hinze, R.G. Percy, P. Li, D. Deng, and G. Thyssen. 2013. A microsatellite-based genome-wide analysis of genetic diversity and linkage disequilibrium in Upland cotton (Gossypium hirsutum L.) cultivars from major cotton-growing countries. Euphytica 191:391–401.

Frankel, O.H., and A.H.D. Brown. 1984. Plant genetic resources today: a critical appraisal. In: J.H.W. Holden, and J.T. Williams, editors, Crop genetic resources: conservation & evaluation. London: George Allen and Unwin, p. 249-257.

Grover, C.E., X. Zhu, K.K. Grupp, J.J. Jareczek, J.P. Gallagher, E. Szadkowski, J.G. Seijo, and J.F. Wendel. 2015. Molecular confirmation of species status for the allopolyploid cotton species, Gossypium ekmanianum Wittmack. Genet. Resour. Crop Evol. 62:103–114.

Guo, Y., J.C. McCarty, J.N. Jenkins, C. An, and S. Saha. 2009. Genetic detection of node of first fruiting branch in crosses of a cultivar with two exotic accessions of upland cotton. Euphytica 166(3):317–329.

Hardy, O.J., and X. Vekemans. 2002. SPAGeDi: A versatile computer program to analyse spatial genetic structure at the individual or population levels. Mol. Ecol. Notes 2:618-620.

Hawkins, J.S., J. Pleasants, and J.F. Wendel. 2005. Identification of AFLP markers that discriminate between cultivated cotton and the Hawaiian island endemic, Gossypium tomentosum Nuttall ex Seeman. Genet. Resour. Crop Evol. 52:1069-1078.

Hinze, L.L., E. Gazave, M.A. Gore, D.D. Fang, B.E. Scheffler, J.Z. Yu, D.C. Jones, J. Frelichowski, and R.G. Percy. 2016. Genetic diversity of the two commercial tetraploid cotton species in the Gossypium Diversity Reference Set. J. Hered. 107(3):274–286.

Hutchinson, J.B., R.A. Silow, and S.G. Stephens. 1947. The evolution of Gossypium. Oxford University Press, London.

Jenkins, J.N. 1986. Host Plant Resistance: Advances in Cotton. In: Proc. Beltwide Cotton Conf. National Cotton Council of America, Memphis, TN. p. 34–41.

Kalinowski, S.T. 2004. Counting alleles with rarefaction: Private alleles and hierarchical sampling designs. Conserv. Genet. 5:539-543.


Kalivas, A., F. Xanthopoulos, O. Kehagia, and A.S. Tsaftaris. 2011. Agronomic characterization, genetic diversity and association analysis of cotton cultivars using simple sequence repeat molecular markers. Genet. Mol. Res. 10:208–217.

Kantartzi, S.K., and J.M. Stewart. 2008. Association analysis of fibre traits in Gossypium arboreum accessions. Plant Breed. 127(2):173–179.

Kuroda, Y., N. Tomooka, A. Kaga, S.M.S.W. Wanigadeva, and D.A. Vaughan. 2009. Genetic diversity of wild soybean (Glycine soja Sieb. et Zucc.) and Japanese cultivated soybeans [G. max (L.) Merr.] based on microsatellite (SSR) analysis and the selection of a core collection. Genet. Resour. Crop Evol. 56:1045–1055.

Lacape, J.M., D. Dessauw, M. Rajab, J.L. Noyer, and B. Hau. 2007. Microsatellite diversity in tetraploid Gossypium germplasm: assembling a highly informative genotyping set of cotton SSRs. Mol. Breeding 19:45–58.

Lacape, J.M., G. Gawrysiak, T.V. Cao, C. Viot, D. Llewellyn, S. Liu, J. Jacobs, D. Becker, P.A. Vianna Barroso, H. De Assuncao, O. Palai, S. Georges, J. Jean, and M. Giband. 2013. Mapping QTLs for traits related to phenology, morphology, and yield components in an inter-specific Gossypium hirsutum x G. barbadense cotton RIL population. Field Crops Res. 144:256–267.

Li, H., J. Luo, J.K. Hemphill, and J.T. Wang. 2001. A rapid and high yielding DNA miniprep for cotton (Gossypium spp.). Plant Mol. Biol. Rep. 19:183a-e.

Li, C.Q., G.S. Liu, H.H. Zhao, L.J. Wang, X.F. Zhang, Y. Liu, W.Y. Zhou, L.L. Yang, P.B. Li, and Q.L. Wang. 2013. Marker-assisted selection of Verticillium wilt resistance in progeny populations of upland cotton derived from mass selection-mass crossing. Euphytica 191(3):469-480.

Liu, S., R.G. Cantrell, J.C. McCarty, and J.M. Stewart. 2000. Simple sequence repeat-based assessment of genetic diversity in cotton race stock accessions. Crop Sci. 40:1459–1469.

Liu, K., M. Goodman, S. Muse, J.S. Smith, E. Buckler, and J. Doebley. 2003. Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites. Genetics 165:2117-2128.

Liu, K.J., and S.V. Muse. 2005. PowerMarker: An integrated analysis environment for genetic marker analysis. Bioinformatics 21:2128–2129.

Loiselle, B.A., V.L. Sork, J. Nason, and C. Graham. 1995. Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). Am. J. Bot. 82:1420–1425.

Loveless, M.D., and J.L. Hamrick. 1984. Ecological determinants of genetic structure in plant populations. Ann. Rev. Ecol. Syst. 15:65-95.


Mei, H., X. Zhu, and T. Zhang. 2013. Favorable QTL alleles for yield and its components identified by association mapping in Chinese Upland cotton cultivars. PLoS One 8(12):e82193.

McCarty, J.C. Jr., and J.N. Jenkins. 1992. Cotton Germplasm characteristics of 79 Day-neutral Primitive Race Accessions. Mississippi Agricultural and Forestry Experiment Station Technical Bulletin 184.

McCarty, J.C., and R.G. Percy. 2001. Genes from exotic germplasm and their use in cultivar improvement in Gossypium hirsutum L. and G. barbadense L. In: J.N. Jenkins, and S. Saha, editors, Genetic improvement of cotton—Emerging technologies, Sci Publ, Enfield, NH, p. 65–80.

Multani, D.S., and B.R. Lyon. 1995. Genetic fingerprinting of Australian cotton cultivars with RAPD markers. Genome 38:1005–1008.

Nei, M., F. Tajima, and Y. Tateno. 1983. Accuracy of estimated phylogenetic trees from molecular data. J. Mol. Evol. 19:153–170.

Niles, G.A., and C.V. Feaster. 1984. Breeding. In: R.J. Kohel, and C.F. Lewis, editors, Cotton. American Society of Agronomy, Madison, WI, p. 201–231.

Percival, A.E. 1987. The national collection of Gossypium germplasm. USDA, Department of Agriculture, Southern Cooperative Series Bulletin No. 321, College Station, TX.

Percival, A.E., J.F. Wendel, and J.M. Stewart. 1999. Taxonomy and germplasm resources. In: C.W. Smith, and J.T. Cothren, editors, Cotton: Origin, History, Technology, and Production. Wiley, New York, p. 33–64.

Powell, W., G.C. Machray, and J. Provan. 1996. Polymorphism revealed by simple sequence repeats. Trends in Plant Sci. 1:215–222.

Priolli, R.H.G., P.T. Wysmierski, C.P.D. Cunha, J.B. Pinheiro, and N.A. Vello. 2013. Genetic structure and a selected core set of Brazilian soybean cultivars. Genet. Mol. Biol. 36:382-390.

Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945–959.

Quisenberry, J.E., W.R. Jordan, B.A. Roark, and D.W. Fryrear. 1981. Exotic cottons as genetic sources for drought resistance. Crop Sci. 21:889–895.

R Core Team. 2013. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/


Rahman, M., T. Yasmin, N. Tabbasam, I. Ullah, M. Asif, and Y. Zafar. 2008. Studying the extent of genetic diversity among Gossypium arboreum L. genotypes/cultivars using DNA fingerprinting. Genet. Resour. Crop Evol. 55:331–339.

Roark, B., and J.E. Quisenberry. 1977. Evaluation of cotton germplasm for drought resistance. Proc. Beltwide Cotton Prod. Res. Conf. 1977:49-50.

Rousset, F. 2008. Genepop’007: A complete re-implementation of the genepop software for Windows and Linux. Mol. Ecol. Resour. 8:103–106.

Slatkin, M. 1985. Gene flow in natural populations. Annu. Rev. Ecol. Syst. 16:393–430.

Smith, C.W., R.G. Cantrell, H.S. Moser, and S.R. Oakley. 1999. History of cultivar development in the United States. In: C.W. Smith, and J.T. Cothren, editors, Cotton: Origin, History, Technology, and Production. Wiley, New York, p. 99–171.

Tyagi, P., M.A. Gore, D.T. Bowman, B.T. Campbell, J.A. Udall, and V. Kuraparthy. 2014. Genetic diversity and population structure in the US Upland cotton (Gossypium hirsutum L.). Theor. Appl. Genet. 127(2):283-295.

Ulloa, M., I.Y. Abdurakhmonov, C. Perez-M, R. Percy, and J.M. Stewart. 2013. Genetic diversity and population structure of cotton (Gossypium spp.) of the New World assessed by SSR markers. Botany 91:251-259.

Ulloa, M., C. Wang, S. Saha, R.B. Hutmacher, D.M. Stelly, J.N. Jenkins, J. Burke, and P.A. Roberts. 2016. Analysis of root-knot nematode and fusarium wilt disease resistance in cotton (Gossypium spp.) using chromosome substitution lines from two alien species. Genetica 144(2):167-179.

US Department of Agriculture. 2016. Oil Crops Yearbook (89002). USDA Economics, Statistics, and Market Information System. (March 2016; http://usda.mannlib.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1290)

Van Becelaere, G., E.L. Lubbers, A.H. Paterson, and P.W. Chee. 2005. Pedigree vs. DNA marker-based genetic similarity estimates in cotton. Crop Sci. 45:2281–2287.

Van Esbroeck, G.A., and D.T. Bowman. 1998. Cotton germplasm diversity and its importance to cultivar development. J. Cotton Sci. 2:121–129.

Vekemans, X., and O.J. Hardy. 2004. New insights from fine‐scale spatial genetic structure analyses in plant populations. Mol. Ecol. 13:921-935.

Wallace, T.P., D.T. Bowman, B.T. Campbell, P. Chee, O.A. Gutierrez, R.J. Kohel, J.C. McCarty, G.O. Myers, R.G. Percy, A.F. Robinson, W. Smith, D.M. Stelly, J. Stewart, P.M. Thaxton, R.M. Ulloa, and D.B. Weaver. 2009. Status of USA cotton germplasm collection and crop vulnerability. Genet. Resour. Crop Evol. 56:507–532.

Wang, P., Z. Ning, L. Lin, H. Chen, H. Mei, J. Zhao, B. Liu, X. Zhang, W. Guo, and T. Zhang. 2014. Genetic dissection of tetraploid cotton resistant to Verticillium wilt using interspecific chromosome segment introgression lines. The Crop J. 2:278–288.

Warburton, M.L., J. Crossa, J. Franco, M. Kazi, R. Trethowan, S. Rajaram, W. Pfeiffer, P. Zhang, S. Dreisigacker, and M. Van Ginkel. 2006. Bringing wild relatives back into the family: recovering genetic diversity in CIMMYT improved wheat germplasm. Euphytica 149:289-301.

Weir, B.S., and C.C. Cockerham. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358–1370.

Wendel, J., C. Brubaker, and A. Percival. 1992. Genetic diversity in Gossypium hirsutum and the origin of upland cotton. Am. J. Bot. 79:1291–1310.

Wendel, J., and R. Percy. 1990. Allozyme diversity and introgression in the Galapagos Islands endemic Gossypium darwinii and its relationship to continental G. barbadense. Biochem. Syst. Ecol. 18:517–528.

Wendel, J., R. Rowley, and J. Stewart. 1994. Genetic diversity in and phylogenetic relationships of the Brazilian endemic cotton, Gossypium mustelinum (Malvaceae). Plant Syst. Evol. 192:49–59.

Xu, H., Y. Mei, J. Hu, J. Zhu, and P. Gong. 2006. Sampling a core collection of Island cotton (Gossypium barbadense L.) based on the genotypic values of fiber traits. Genet. Resour. Crop Evol. 53:515–521.

Yan, J., T. Shah, M.L. Warburton, E.S. Buckler, M.D. McMullen, and J. Crouch. 2009. Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS One 4:e8451. doi:10.1371/journal.pone.0008451.

Zhang, J., Y. Lu, R. Cantrell, and E. Hughs. 2005. Molecular marker diversity and field performance in commercial cotton cultivars evaluated in the southwestern USA. Crop Sci. 45:1483–1490.

Zhang, H.B., Y. Li, B. Wang, and P.W. Chee. 2008. Recent advances in cotton genomics. Int. J. Plant Genomics 2008:742304.

Zhang, Y., X.F. Wang, Z.K. Li, G.Y. Zhang, and Z.Y. Ma. 2011. Assessing genetic diversity of cotton cultivars using genomic and newly developed expressed sequence tag-derived microsatellite markers. Genet. Mol. Res. 10:1462–1470.


Zhang, J., J. Yu, W. Pei, X. Li, J. Said, M. Song, and S. Sanogo. 2015. Genetic analysis of Verticillium wilt resistance in a backcross inbred line population and a meta-analysis of quantitative trait loci for disease resistance in cotton. BMC Genomics 16(1):577.

Zhao, Y., H. Wang, W. Chen, and Y. Li. 2014. Genetic structure, linkage disequilibrium and association mapping of Verticillium wilt resistance in elite cotton (Gossypium hirsutum L.) germplasm population. PLoS One 9(1):e86308. doi:10.1371/journal.pone.0086308.

Zhao, Y., H. Wang, W. Chen, Y. Li, H. Gong, X. Sang, F. Huo, and F. Zeng. 2015. Genetic diversity and population structure of elite cotton (Gossypium hirsutum L.) germplasm revealed by SSR markers. Plant Syst. Evol. 301(1):327–336.


APPENDICES

Appendix A: List of 289 elite G. hirsutum and 34 G. barbadense accessions screened for thrips tolerance at the Upper Coastal Plains Research Station, Rocky Mount, NC

S. No Germplasm Gossypium species PI/SA numbers
1 ACALA #111, ROGERS G. hirsutum PI 528816
2 ACALA 1064(New Mexico) G. hirsutum PI 528606
3 ACALA 1517 WILT G. hirsutum PI 528758
4 ACALA 1517-70 G. hirsutum PI 529290
5 ACALA 1517-75 G. hirsutum PI 529542
6 ACALA 1517D G. hirsutum PI 529243
7 ACALA 29 G. hirsutum PI 529427
8 ACALA 4-42 G. hirsutum PI 529116
9 ACALA 44WR G. hirsutum SA-1049
10 ACALA 5 G. hirsutum PI 529169
11 ACALA 51 G. hirsutum PI 529428
12 ACALA 5675 G. hirsutum PI 528674
13 ACALA 8 G. hirsutum PI 528967
14 ACALA GLANDLESS 8160 G. hirsutum PI 529536
15 ACALA MAXXA G. hirsutum PI 540885
16 ACALA NAKED SEED G. hirsutum PI 528609
17 ACALA NUNN'S #5-37 G. hirsutum PI 528603
18 ACALA SJ-2 G. hirsutum PI 606810
19 ACALA SJ-3 G. hirsutum PI 529537
20 ACALA SJ-4 G. hirsutum PI 529538
21 ACALA YOUNG'S G. hirsutum PI 528755
22 ACALA, MESSILLA VALLEY 898 G. hirsutum PI 529112
23 ACALA, N.M.8893 G. hirsutum PI 529147
24 AK-DJURA GREEN LINT G. hirsutum PI 528929
25 AK-DJURA HIGG BROWN G. hirsutum PI 529165
26 ALLEN 33 G. hirsutum PI 529318
27 ALL-IN-ONE G. hirsutum PI 529018
28 AMBASSADOR G. hirsutum PI 528973
29 ARK-1 G. hirsutum PI 529547
30 ARKANSAS 10 G. hirsutum PI 528912
31 ARKANSAS 12 G. hirsutum PI 528914
32 ARKOT 8102 G. hirsutum PI 595852
33 ARKOT 8606 G. hirsutum PI 628634

34 ARKOT 8918 G. hirsutum PI 628638
35 AUBURN 56 G. hirsutum PI 529215
36 BJAGL NECT G. hirsutum PI 529382
37 BLCABPD86S-1-90 G. hirsutum PI 603008
38 BLIGHT MASTER G. hirsutum PI 529202
39 BOB SHAW 1 G. hirsutum PI 528662
40 BOBDEL G. hirsutum PI 528669
41 BRONCO 360 G. hirsutum PI 601663
42 C5HUG2BES-2-87 G. hirsutum PI 595762
43 CA17 G. hirsutum PI 529322
44 CA23 G. hirsutum PI 529323
45 CA30 G. hirsutum PI 529324
46 CABCSV506S-1-94 G. hirsutum PI 634320
47 CABD3CABCH-1-89 G. hirsutum PI 603002
48 CABD3SHP3S-1-90 G. hirsutum PI 603007
49 CAHUGLBBCS-1-88 G. hirsutum PI 603005
50 CARTER'S LONG STAPLE G. hirsutum PI 528722
51 CASCOT L-7 G. hirsutum PI 607181
52 CD3HCABCUH-1-89 G. hirsutum PI 603003
53 CD3HCAHUGH-2-88 G. hirsutum PI 603000
54 CD3HCHULBH-1-88 G. hirsutum PI 603001
55 CD3HHARCIH-1-88 G. hirsutum PI 602999
56 CLEVEWILT 6 NAKED SEED G. hirsutum PI 528611
57 COKER 100 WILT G. hirsutum PI 528761
58 COKER 100A (WR) G. hirsutum PI 529216
59 COKER 139 G. hirsutum PI 601389
60 COKER 201 G. hirsutum PI 529247
61 COKER 310 G. hirsutum PI 529249
62 COKER 312 G. hirsutum PI 529278
63 COKER 3131 G. hirsutum PI 529531
64 COKER 315 G. hirsutum PI 529530
65 COKER 5110 G. hirsutum PI 529279
66 COKER'S CLEVEWILT 3 G. hirsutum PI 528617
67 COKER'S DELTATYPE WEBBER #7 G. hirsutum PI 528620
68 COKER'S DELTATYPE WEBBER #9 G. hirsutum PI 528619
69 COKER'S WILDS #2 G. hirsutum PI 528626
70 COKER'S WILDS #4 G. hirsutum PI 528625
71 COKER'S WILDS #9 G. hirsutum PI 528624

72 COLUMBIA G. hirsutum PI 528743
73 COOK-307-6 G. hirsutum PI 528997
74 CS-8608 G. hirsutum PI 513390
75 CS-8609 G. hirsutum PI 513391
76 CS-8610 G. hirsutum PI 513392
77 CS-8611 G. hirsutum PI 513393
78 CUP LEAF G. hirsutum PI 529014
79 D2 SMOOTH MUTANT G. hirsutum PI 529170
80 DEL CERRO G. hirsutum PI 529358
81 DELCOT 277 G. hirsutum PI 529258
82 DELFOS 6102 G. hirsutum PI 528958
83 DELFOS 9169 (ORIGINAL) G. hirsutum PI 528655
84 DELTA QUEEN G. hirsutum PI 529220
85 DELTAPINE 14 G. hirsutum PI 528970
86 DELTAPINE 15 G. hirsutum SA-0462
87 DELTAPINE 16 G. hirsutum PI 529251
88 DELTAPINE 20 G. hirsutum PI 529567
89 DELTAPINE 45 G. hirsutum SA-3607.01
90 DELTAPINE 50 G. hirsutum PI 529566
91 DELTAPINE 51 G. hirsutum SA-3138
92 DELTAPINE 61 G. hirsutum PI 607174
93 DELTAPINE 90 G. hirsutum PI 529529
94 DELTAPINE A G. hirsutum PI 528767
95 DELTAPINE PREMA G. hirsutum SA 1669
96 DELTAPINE SMOOTH LEAF G. hirsutum PI 529219
97 DELTATYPE WEBBER G. hirsutum PI 528717
98 DELTATYPE WEBBER #4 G. hirsutum PI 528628
99 DELTATYPE WEBBER (253-1) T142-8 G. hirsutum PI 528844
100 DELTATYPE WEBBER 2139 G. hirsutum PI 528598
101 DES 119 G. hirsutum PI 606809
102 DES 24 G. hirsutum PI 529522
103 DES 56 G. hirsutum PI 529520
104 DIXIE 14-5-2 G. hirsutum PI 528629
105 DIXIE KING G. hirsutum PI 529021
106 DIXIE TRIUMPH G. hirsutum PI 528956
107 DUNN 1047 G. hirsutum PI 601196
108 DUNN 325 G. hirsutum PI 601199
109 DURANGO G. hirsutum PI 529057

110 EARLISTAPLE 7 G. hirsutum PI 529570
111 EARLY FLOFF G. hirsutum PI 529047
112 EMPIRE G. hirsutum PI 529179
113 EMPIRE GL2 GL2 G. hirsutum SA-1113
114 EMPIRE WR G. hirsutum SA-1158
115 EWINGS LONG STAPLE x TIDEWATER G. hirsutum PI 528726
116 EXPRESS 121 G. hirsutum PI 528977
117 EXPRESS 432 G. hirsutum PI 528702
118 FJA G. hirsutum PI 529572
119 FLORIDA GREEN SEED G. hirsutum PI 528694
120 FM966 G. hirsutum PI 619097
121 FOX 4 G. hirsutum PI 529225
122 FREGO NANKEEN G. hirsutum PI 528937
123 FREGO UPLAND CR.DW.MEA G. hirsutum PI 528934
124 FTA G. hirsutum PI 529573
125 GA 161 G. hirsutum PI 612959
126 GERMAINS ACALA GC-356 G. hirsutum PI 601474
127 GOLDEN CROWN G. hirsutum PI 529015
128 GP 1005 G. hirsutum PI 600961
129 GP 3755 G. hirsutum PI 607178
130 GP 3774 G. hirsutum PI 607177
131 GP 5479 G. hirsutum PI 600894
132 GREEN G. hirsutum PI 601708
133 GREEN BROWN 7(NANKEEN) G. hirsutum PI 528784
134 GREEN LINT G. hirsutum SA-3142
135 GREGG G. hirsutum PI 529094
136 GREGG 35 G. hirsutum PI 529189
137 GSA 74 G. hirsutum PI 600741
138 GSA71 G. hirsutum PI 529577
139 GSC 25 G. hirsutum PI 601109
140 GSC 27 G. hirsutum PI 601351
141 GSC 30 G. hirsutum PI 601484
142 GUMBO G. hirsutum PI 529578
143 H1330 G. hirsutum PI 583875
144 HALF AND HALF G. hirsutum PI 528964
145 HARTSVILLE G. hirsutum PI 528741
146 HARTSVILLE #5 G. hirsutum PI 528632

147 HGPICG14QH-1-94 G. hirsutum PI 634321
148 HOPI ACALA G. hirsutum PI 529244
149 HOPI MOENCOPI G. hirsutum PI 528635
150 JL-1-S(MS) G. hirsutum PI 529180
151 LA 304 (LA RN 910) G. hirsutum PI 63663
152 LA 306 (LA RN 4-4) G. hirsutum PI 630665
153 LA 322 (LA RN 910) G. hirsutum PI 630666
154 LA 333 (LA RN 1032) G. hirsutum PI 630667
155 LA 887 G. hirsutum PI 547084
156 La.850082FN G. hirsutum PI 572268
157 LAMBRIGHT 2020A G. hirsutum PI 592517
158 LANKART G. hirsutum PI 601147
159 LANKART 311 G. hirsutum PI 601392
160 LANKART 511 G. hirsutum PI 601302
161 LANKART 57 G. hirsutum PI 528822
162 LANKART 611 G. hirsutum SA-1006
163 LANKART LX571 G. hirsutum PI 606808
164 LBBCABCHUS-1-87 G. hirsutum PI 595760
165 LBBCDBOAKH-1-90 G. hirsutum PI 603004
166 LIGHTNING EXPRESS G. hirsutum PI 528978
167 LOCKET 4789 G. hirsutum PI 529188
168 LOCKETT 88 G. hirsutum SA-1017
169 LONE STAR G. hirsutum PI 528636
170 M.U.8B UA 7-44 G. hirsutum PI 528560
171 M240 G. hirsutum *
172 M4 G. hirsutum PI 529123
173 MACHA 700 (J. GANNAWAY) G. hirsutum PI 607220
174 MCNAIR 220 G. hirsutum PI 529525
175 MCNAIR 235 G. hirsutum PI 529526
176 MD51NE G. hirsutum PI 566941
177 MEBANE G. hirsutum PI 528985
178 MISCOT 8006 G. hirsutum PI 564681
179 MO-DEL G. hirsutum PI 529257
180 MULTIPLE MARKER G. hirsutum PI 528950
181 N/A G. hirsutum PI 61390
182 NC 88-90 G. hirsutum PI 583374
183 NC 88-91 G. hirsutum PI 583375
184 NC 88-95 G. hirsutum PI 583376

185 NEW BOYKIN G. hirsutum PI 528984
186 NORTHERN STAR G. hirsutum PI 528814
187 PAYMASTER 101 G. hirsutum SA-1021
188 PAYMASTER 101A G. hirsutum PI 529206
189 PAYMASTER 111 G. hirsutum PI 529259
190 PAYMASTER 145 G. hirsutum PI 529602
191 PAYMASTER 18 G. hirsutum PI 529600
192 PAYMASTER 54 G. hirsutum PI 528820
193 PAYMASTER HS200 G. hirsutum PI 542974
194 PAYMASTER HS26 G. hirsutum PI 606814
195 PD 0111 G. hirsutum PI 529612
196 PD 0113 G. hirsutum PI 529613
197 PD 0259 G. hirsutum PI 529614
198 PD 0695 G. hirsutum PI 529615
199 PD 1 G. hirsutum PI 606805
200 PD 2 G. hirsutum PI 606806
201 PD 2164 G. hirsutum PI 529617
202 PD 3246 G. hirsutum PI 529619
203 PD 4381 G. hirsutum PI 529621
204 PD 6208 (FORMALLY PD-3) G. hirsutum PI 511353
205 PD 781 G. hirsutum PI 533643
206 PD 785 G. hirsutum PI 533644
207 PD 804 G. hirsutum PI 533645
208 PD 8619 G. hirsutum PI 529625
209 PD 9232 G. hirsutum PI 529627
210 PD 93009 G. hirsutum PI 591419
211 PD 93019 G. hirsutum PI 591420
212 PD 93021 G. hirsutum PI 591421
213 PD 93030 G. hirsutum PI 591422
214 PD 93034 G. hirsutum PI 591423
215 PD 93043 G. hirsutum PI 591424
216 PD 9364 G. hirsutum PI 529629
217 PD-3-14 G. hirsutum PI 591417
218 PIEDMONT CLEVELAND G. hirsutum SA-0366
219 PSC 355 G. hirsutum PI 612974
220 Pyramid G. hirsutum *
221 QUAPAW G. hirsutum PI 607169
222 REX 713 G. hirsutum PI 529583

223 REX SL G. hirsutum PI 529226
224 RILCOT G. hirsutum PI 529096
225 ROGERS LG-10 G. hirsutum PI 600731
226 ROWDEN G. hirsutum SA-0300
227 ROWDEN 41B, TPSA G. hirsutum PI 528818
228 Rugose indore G. hirsutum PI 528510
229 SA 2327 G. hirsutum PI 607194
230 SA 2332 G. hirsutum PI 607199
231 SA 2413 G. hirsutum PI 607237
232 SA 2418 G. hirsutum PI 607242
233 SA 2423 G. hirsutum PI 607247
234 SC-1 G. hirsutum PI 529598
235 SEALAND #1 (G.B. X G.H.) G. hirsutum PI 528871
236 SEALAND #2 (G.B. X G.H.) G. hirsutum PI 528872
237 SEALAND #7 WHITE FLOWER G. hirsutum PI 528874
238 SEALAND 391 (G.B. X G.H) G. hirsutum PI 528727
239 SEALAND 472 (G.B. X G.H.) G. hirsutum PI 528729
240 SEALAND 542 (G.B. X G.H.) G. hirsutum PI 528730
241 SEALAND 883 (G.B. X G.H.) G. hirsutum PI 528875
242 SG 747 G. hirsutum *
243 SMALL LEAF G. hirsutum PI 529396
244 SOUTHLAND 400 G. hirsutum PI 540872
245 SOUTHLAND M1 G. hirsutum PI 601652
246 SPEARS UPLAND EARLY LONG G. hirsutum PI 529043
247 SPNXCHGLBH-1-94 G. hirsutum PI 634326
248 SPNXHQBPIS-1-94 G. hirsutum PI 634327
249 STARDEL G. hirsutum PI 529040
250 STONEVILLE 112 G. hirsutum PI 529631
251 STONEVILLE 2 G. hirsutum PI 528751
252 STONEVILLE 20 G. hirsutum PI 528671
253 STONEVILLE 213 G. hirsutum PI 529229
254 STONEVILLE 2B G. hirsutum SA-0308
255 STONEVILLE 506 G. hirsutum PI 529523
256 STONEVILLE 5A G. hirsutum PI 528658
257 STONEVILLE 7A G. hirsutum PI 529228
258 STONEVILLE 825 G. hirsutum PI 529524
259 STV 474 G. hirsutum *
260 TAMCOT CAB-CS G. hirsutum PI 564768

261 TAMCOT CAMD-E G. hirsutum PI 529633
262 TAMCOT CD3H G. hirsutum PI 513381
263 TAMCOT GCNH G. hirsutum PI 564769
264 TAMCOT SP-21S G. hirsutum PI 529635
265 TAMCOT SP-37H G. hirsutum PI 529638
266 TAMCOT SPHINX G. hirsutum PI 592801
267 TASHKENT 1 G. hirsutum PI 529447
268 TEJAS G. hirsutum PI 591047
269 TERRA C-30 G. hirsutum PI 601224
270 TERRA C-40 G. hirsutum PI 601225
271 TH 458(TRIPLE HYBRID) G. hirsutum PI 529434
272 TIDELAND T.P.S.A.NO.1 G. hirsutum PI 529087
273 TIDELAND, TPSA-69 G. hirsutum PI 529194
274 TIDEWATER (SEABROOKS)(G.B. INTO G.H.) G. hirsutum PI 528642
275 TM1 G. hirsutum PI 607172
276 TOOLE G. hirsutum PI 528963
277 TRICE G. hirsutum PI 528955
278 TRIUMPH G. hirsutum SA-0852
279 Virescen yellow G. hirsutum PI 528447
280 VIRESCENT NANKEEN G. hirsutum PI 528936
281 WANNAMAKER CLEVELAND G. hirsutum SA-0296
282 WESTBURN 70 G. hirsutum PI 529503
283 WESTBURN M G. hirsutum PI 529504
284 WESTERN STORMPROOF G. hirsutum PI 529088
285 WIELDS CLEVELAND G. hirsutum PI 528960
286 WILDS G. hirsutum SA-0884
287 WILDS 34-4(411), T82-2 G. hirsutum PI 528842
288 WILDS 34-4(411), T85-2 G. hirsutum PI 528843
289 WILDS 5 G. hirsutum PI 528979
290 GB-0067-1 G. barbadense PI 528325
291 GB-0074-3 G. barbadense PI 528330
292 GB-0077-3 G. barbadense PI 203568
293 GB-0085-2 G. barbadense PI 205064
294 GB-0219-2 G. barbadense PI 528379
295 GB-0228-2 G. barbadense PI 528388
296 GB-0236-3 G. barbadense PI 528393
297 GB-0303-3 G. barbadense PI 528167

298 GB-0309-2 G. barbadense PI 528157
299 GB-0362-1 G. barbadense PI 361153
300 GB-0369-1 G. barbadense PI 528216
301 GB-0399-1 G. barbadense PI 528244
302 GB-0420-2 G. barbadense PI 528265
303 GB-0604-3 G. barbadense PI 407501
304 GB-0618-1 G. barbadense PI 608061
305 GB-1388-2 G. barbadense PI 360156
306 3-79-2 G. barbadense *
307 BF19-2 G. barbadense *
308 DPHTO-1 G. barbadense PI 603259
309 GPD52-1(white) G. barbadense *
310 GPD52-1(yellow) G. barbadense *
311 GB-0352 G. barbadense PI 528200
312 GB-0081 G. barbadense PI 203572
313 GB-0379 G. barbadense PI 528226
314 GB-0380 G. barbadense PI 528227
315 GB-0382 G. barbadense PI 528228
316 GB-0432 G. barbadense PI 528275
317 GB-0448 G. barbadense PI 266218
318 GB-0449 G. barbadense PI 266219
319 GB-0458 G. barbadense PI 528294
320 GB-0459 G. barbadense PI 528295
321 GB-0460 G. barbadense PI 528296
322 GB-0427 G. barbadense PI 528272
323 GB-0657 G. barbadense PI 608094
