Template B v4.1 (beta): Created by L. Threet 11/15/19

Genetic structure of the host , fasciculata, and its microbial partners

By

TITLE PAGE Mahboubeh Hosseinalizadeh Nobarinezhad

Approved by:

Gary N. Ervin (Major Professor) Lisa Wallace Mark E. Welch Heather Jordan Justin A. Thornton (Graduate Coordinator) Rick Travis (Dean, College of Arts & Sciences)

A Dissertation Submitted to the Faculty of Mississippi State University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Biological Sciences in the Department of Biological Sciences

Mississippi State, Mississippi

August 2020

Copyright by COPYRIGHT PAGE Mahboubeh Hosseinalizadeh Nobarinezhad

2020

Name: Mahboubeh Hosseinalizadeh Nobarinezhad ABSTRACT Date of Degree: August 7, 2020

Institution: Mississippi State University

Major Field: Biological Sciences

Major Professor: Gary E. Ervin

Title of Study: Genetic structure of the host plant, , and its microbial partners

Pages in Study: 136

Candidate for Degree of Doctor of Philosophy

Local factors have the potential to generate genetic structure within if populations respond differently to varying environmental conditions across their geographic range. In this project, spatial genetic structure was examined in the legume, Chamaecrista fasciculata, its symbiotic nitrogen fixing bacteria, and rhizosphere microbiomes. In the first chapter, the aim was to test for genetic structure among populations of C. fasciculata in the Southeast and to evaluate whether phenotypic variation in leaf pubescence is associated with genetic divergence among populations, which would be consistent with local adaptation. My results did not detect a significant association between genetic structure and phenotypic variation in leaf pubescence, but the role of environmental variables in generating the observed patterns of spatial genetic variation in C. fasciculata was demonstrated. In the second project, I analyzed genetic structure within a single population to test for the presence of fine-scale genetic structure of the host plant and to determine if genetic structure of symbiotic nitrogen-fixing rhizobia is influenced by host . Neighboring plants are expected to be more genetically similar than distant plants. If this expectation is supported and genotype x genotype interactions are important in this system, then

I anticipated that spatially close host plants would show more genetically similar microbiota in

their nodules and rhizospheric soil than distant host plants. The results indicated fine-scale genetic structure for both host plants and nodulating rhizobia, suggesting that the both organisms are influenced by similar mechanisms structuring genetic diversity or shared habitat preferences.

In the third project, I characterized fine-scale structure and diversity of microbial communities of the rhizosphere of host plants within a single population. The results revealed significant differences in bacterial and fungal communities among host plants and a significant association between genetic distance of both microbial communities and spatial and plant genetic distances.

These data confirm the importance of plant genotype and physical distance in shaping the genetic structure and diversity of bacterial and fungal communities of the rhizosphere. Overall, this study enhances our understanding of the degree to which intraspecific genetic variation in plants influences the diversity and structure of soil microbial communities.

DEDICATION

This dissertation is dedicated to my loving parents, who guided me to where I am today, for their

love, endless support and encouragement.

iv

ACKNOWLEDGEMENTS

I would like to acknowledge my advisor, Dr. Lisa Wallace, who played a huge role in my academic accomplishments; Without her guidance, expertise and patience, I could have never reached this current level of success. Secondly, my committee members, Dr. Gary Ervin, Dr.

Mark Welch and Dr. Heather Jordan, who has provided patient advice and guidance throughout the research process. My special thanks of gratitude go to Dr. Gary Ervin for taking the responsibility of being my major advisor at Mississippi State University and taking care of all official duties related to me at Mississippi State university after I moved to Old Dominion

University in the middle of my doctorate education.

I am thankful to Dr. Susan Singer for permission to use Chamaecrista fasciculata transcriptome data set to develop the microsatellite markers. I would also like to acknowledge the financial support received from different sources that made this research work possible including Biology Faculty Fund research awards, Zernickow Fund for Field Research, and teaching assistantship support provided by the Department of Biological Sciences at Mississippi

State, and the Graduate Student Research Award received from the Botanical Society of

America, and also Old Dominion University for assistantship funding from the J. Robert Stiffler endowment throughout the last three years of doctorate education.

I am also very grateful to Dr. Ronald Aitig and my caring lab mates in the Wallace and

Welch labs including Eranga Wettewa, Dr. Chathurani Ranathunge, Bobby Coltharp and Dr.

Giuliano Colosimo. Thank you all for your unwavering support. Last but not least, I am grateful

v

to my beloved and kindhearted family for their unconditional love and continual support over the last six years.

vi

TABLE OF CONTENTS

DEDICATION ...... iv

ACKNOWLEDGEMENTS ...... v

LIST OF TABLES ...... ix

LIST OF FIGURES ...... xi

CHAPTER

I. INTRODUCTION ...... 1

Genetic structure and its association with phenotypic variation ...... 1 Legume rhizobia symbiosis ...... 2 Fine-scale genetic structure and spatial autocorrelation in host plant and its associated microbiome ...... 3 Study system ...... 5 References ...... 9

II. GENETIC STRUCTURE AND PHENOTYPIC DIVERSITY IN CHAMAECRISTA FASCICULATA () AT THE SOUTHERN EXTENT OF ITS RANGE ....15

Introduction ...... 15 Materials and Methods ...... 19 Data collection ...... 19 Data Analysis ...... 21 Results ...... 28 Discussion ...... 39 Genetic and phenotypic variation in Chamaecrista fasciculata ...... 39 Genetic differentiation and gene flow ...... 42 Conclusions ...... 44 References ...... 46

III. FINE-SCALE PATTERNS OF GENETIC STRUCTURE IN THE HOST PLANT CHAMAECRISTA FASCICULATA (FABACEAE) AND ITS NODULATING RHIZOBIA SYMBIONTS...... 54

Introduction ...... 54 Materials and Methods ...... 58 vii

Plant and rhizobia collection ...... 58 Data analysis ...... 64 Results ...... 68 Discussion ...... 76 Conclusions ...... 85 References ...... 87

IV. FINE-SCALE PATTERNS OF GENETIC STRUCTURE IN CHAMAECRISTA FASCICULATA (FABACEAE) AND ITS RHIZOSPHERE MICROBIAL COMMUNITIES ...... 95

Introduction ...... 95 Materials and Methods ...... 99 Data collection ...... 99 Soil Sequence Data Processing and Analyses ...... 103 Results ...... 105 Discussion ...... 111 Conclusions ...... 124 References ...... 126

viii

LIST OF TABLES

Table 2.1 Genetic diversity, leaflet pubescence form, GPS coordinates and voucher ID in studied populations of Chamaecrista fasciculata...... 24

Table 2.2 Hierarchical structure of genetic variation in Chamaecrista fasciculata populations determined by an analysis of molecular variance...... 29

Table 2.3 Results from ANOVA conducted on genetic diversity measures across pubescent types in Chamaecrista fasciculata...... 32

Table 2.4 Post hoc analyses demonstrating significant differences (p < 0.05) in genetic diversity among pubescent, mixed, and glabrous populations of Chamaecrista fasciculata...... 33

Table 2.5 Results of linear regressions between genetic diversity and leaflet pubescence in populations of Chamaecrista fasciculata...... 36

Table 2.6 Importance of temperature and precipitation in explaining the observed genetic structure of Chamaecrista fasciculata populations. Posterior probabilities were generated using GESTE. The model with the highest probability is shown in bold...... 36

Table 3.1 Characteristics of 14 microsatellite loci used to evaluate genetic structure in Chamaecrista fasciculata...... 59

Table 3.2 Genetic diversity of the host plant, Chamaecrista fasciculata...... 69

Table 3.3 Fine-scale genetic structure in Chamecrista fasciculata host plants and nodulating Bradyrhizobium elkanii in the studied plots...... 70

Table 3.4 Genetic diversity of nodulating Bradyrhizobium elkanii in the truA gene across the studied plots...... 73

Table 3.5 Stepwise regression assigning plant genetic and geographic distance as predictors of nodulating Bradyrhizobium elkanii genetic distances...... 73

Table 4.1 Genetic diversity of the host plant, Chamaecrista fasciculata, across three plots...... 106

Table 4.2 Evidence of fine-scale genetic structure in host plants...... 107 ix

Table 4.3 Kruskal Wallis tests for Shannon and evenness indexes within and between plots...... 109

Table 4.4 Stepwise regression assigning plant genetic and spatial distances as predictors of rhizosphere microbial weighted UniFrac genetic distances...... 109

Table 4.5 Tested pH for rhizospheric soils for each plot...... 109

x

LIST OF FIGURES

Figure 2.1 Sampling locations of collected populations of Chamaecrista fasciculata. The pie charts show assignment to each of the two genetic clusters for each population based on STRUCTURE analysis. Black color represents cluster 1 and white color represents cluster 2. Numbers represent the population ID, and G, M and P stands for glabrous, mixed and pubescent, respectively...... 30

Figure 2.2 Bivariate plot of genetic diversity measures for glabrous, pubescent and mixed populations. A) Number of alleles (Na) and number of effective alleles (Ne). B) Observed and expected heterozygosity (Ho and He)...... 34

Figure 2.3 Results from Bayesian analysis in STRUCTURE. A) DeltaK values based on the method of Evanno et al. (2005) for all values of K that were tested. B) Assignment probability of each sample into each of the two clusters. Black color represents cluster 1 and white color represents cluster 2...... 37

Figure 2.4 Bivariate plot of pairwise genetic distances (FST/1-FST) vs. ln of geographical distances. A Mantel test indicated a non-significant correlation between the matrices (r = 0.08, P = 0.08)...... 38

Figure 3.1 Diagram of one sampling plot. Blue points represent the sampling point at distance intervals of 0, 1, 2, 5, 10 and 25 meters from the previous point. At each distance interval, two plants (red points) each were sampled in opposite directions 0.5-meter perpendicular to a central point (blue dot), for a total of 24 plants sampled per plot...... 60

Figure 3.2 A principle coordinates analysis (PCoA) of sampled plants based on microsatellite variation in all three plots. Orange circles = plot 1, black circles = plot 2, blue circles = plot 3. First and the second components account for 8.77% and 6.88% of variation in the dataset, respectively...... 71

Figure 3.3 Ritland’s kinship coefficients for pairs of plants plotted against the logarithm of geographic distance between pairs and the estimated regression line (regression slope bF = -0.01214, p = 0.000). Significant autocorrelation at a distance class is indicated by a star...... 74

xi

Figure 3.4 Ritland’s kinship coefficients for pairs of nodulating Bradyrhizobium plotted against the logarithm of geographic distance separating members of the pairs. A significant autocorrelation was only found for distance class 286. regression slope bF = -0.0031, p = 0.041...... 75

Figure 3.5 Phylogenetic tree of nodualting Bradyrhizobium based on truA sequences. The first and second numbers of each tip indicate plot number (1, 2, or 3) and distance interval (0, 1, 2, 5, 10, or 25). Values on the branches are posterior probability (PP). Support of less than 0.95 PP is not shown...... 77

Figure 3.6 Abundance of families and genera of Rhizobiales found in rhizosphere samples of Chamaecrista fasciculata in the current study...... 79

Figure 3.7 Pairwise plant genetic distances based on microsatellite alleles compared to pairwise geographic distance between sampled plants (r = 0.243, p = 0.001)...... 80

Figure 3.8 Pairwise nodulating Bradyrhizobium genetic distances based on truA sequences compared to pairwise geographic distance between sampled nodules (r = 0, p = 0.99)...... 82

Figure 3.9 Pairwise plant genetic distances based on microsatellite alleles compared to pairwise nodulating Bradyrhizobium genetic distances based on truA sequences (r = -0.087, p = 0.261)...... 84

Figure 4.1 Ritland’s kinship coefficients for pairs of plants plotted against the logarithm of geographic distance separating each pair of plants and the estimated regression line. Significant autocorrelations are indicated by stars. Regression slope bF = -0.01214, p < 0.0001...... 107

Figure 4.2 Plot of pairwise plant genetic distances compared to pairwise geographic distance (Mantel test, r = 0.243, p = 0.001)...... 110

Figure 4.3 Taxonomic classification of microbial sequences obtained from rhizospheric samples at the phylum level for (A) bacterial communities and (B) fungal communities...... 113

Figure 4.4 Boxplots of the number of observed OTUs and Shannon index for rhizosphere samples collected in each plot (spotsite). The first number in each sample name indicates the plot number, the secnd number represents the specific distance at which the sample was collected and the third number (1) represent samples collected from opposite direction in a specific distance. (A- B) bacterial communities and (C-D) fungal communities. A & C observed OTUs, B & D Shannon index...... 116

xii

Figure 4.5 Plot of the first two components from a principal coordinates analysis showing beta diversity among rhizosphere communities in each plot. The first number in each sample name indicates the plot number and the second number represents the specific distance at which the sample was collected for (A) bacterial communities and (B) fungal communities. The first two axes account for approximately 41.7% of the total variance (p = 0.001) for bacterial communities and approximately 30.1% of the total variance (p = 0.001) for fungal communities...... 123

xiii

CHAPTER I

INTRODUCTION

Genetic structure and its association with phenotypic variation

Genetic structure describes the total genetic diversity and its distribution within and among a set of populations. It is shaped by many factors, including life history, population size, geographical or environmental barriers, gene flow, selection and population crashes or bottlenecks (Charlesworth 2009; Slatkin 1987; Wright 1932). The concept of genetic structure was theorized from observations that plant populations have nonrandom spatial distributions

(Allard 1975; Brown et al. 1978, Linhart et al. 1981). Within a plant species, environmental heterogeneity has the potential to influence the distribution of genetic variation among populations (Antonovics et al. 1971; Linhart and Grant 1996; Mitton 1997). Geographic and physical barriers often constrain the movements of individuals and thereby impose a degree of genetic subdivision or population genetic structure on most species (Slatkin 1987).

Phenotypic variation due to underlying heritable genetic variation is a fundamental prerequisite for evolution by natural selection, which affects the genetic structure of a population indirectly via the contribution of phenotypes (Lewontin 1970). It is generally accepted that phenotypic diversity within a population or species is produced by differences in alleles and differences in environmental inputs that modify gene expression (Hallgrimsson and Hall 2005;

Charlesworth and Charlesworth 2010). In plants, one set of processes leading to population genetic structure could be the joint effects of and seed dispersal (Zanella et al. 2011) such

1

that, due to limited dispersal events, phenotypic differentiation between populations could occur.

Campbell and Dooley (1992) believe that most species should exhibit genetic structure, and in some populations, significant genetic structure over small distances results from restricted seed dispersal. Beside the impacts of pollen and seed dispersal, exogenous variables such as those related to climate and soil act on plant communities and may stimulate phenotypic differentiation among populations (Sultan 1987). Factors such as nutrient availability, climate, history, herbivory and plant–soil feedback are known to shape plant communities and drive phenotypic variation in those communities (Grime 2001; Klironomos 2002).

Legume rhizobia symbiosis

Bacteria that are capable of nitrogen fixation are called rhizobia. Legumes (Fabaceae) have evolved mutualisms with these bacteria and house them in anaerobic root nodules. Rhizobia appear to have diverged well before the existence of legumes and probably before the appearance of angiosperms (Turner and Young 2001). Therefore, nodulation capacity is thought to have been acquired after bacterial divergence (Hirsch et al. 2001). Rhizobia include the genera

Rhizobium, Bradyrhizobium, Mesorhizobium, and Sinorhizobium (Denison and Kiers 2004). It has been suggested that the success and diversification of legumes (est. 20,000 species; Cronk et al. 2006) may have been influenced by their symbioses with rhizobia (Martínez-Romero and

Caballero-Mellado 1996; Sprent 2009). Legume-rhizobium symbioses often exhibit a high degree of specificity (Kouchi et al. 2010) that is determined by molecular signals facilitating communication between rhizobia and their legume hosts (Yang et al. 2010). Like taxonomic diversity, functional variation of symbioses with rhizobia has been demonstrated in numerous studies. For example, Heath et al. (2010) found that plant fitness varied among combinations of

Medicago truncatula, the host, and Sinorhizobium meliloti, the symbiont grown in Parker (1999) 2

has also revealed that variation in plant fitness among rhizobia strains is geographically structured. The study of variation in functional traits and plant fitness, which can be interpreted as local adaptation to specific symbionts in the soil, has been targeted in some experiments.

Provorov et al. (2012) showed that adaptive evolution of rhizobia is associated mainly with their interactions with local host plants. In another study, Mandri et al. (2011) identified a correlation between higher nodulation and phosphatase activities and inoculation of host plants with local rhizobia. Therefore, evaluating genetic structure of nitrogen-fixing bacterial symbionts in association with geographic distance and genetic structure of the host plant will be quite informative on legume rhizobia symbiosis.

Fine-scale genetic structure and spatial autocorrelation in host plant and its associated microbiome

Many ecological theories and models acknowledge that elements that are close to one another in space or time are more likely to be influenced by the same generating processes (Franklin and

Mills 2003). Spatially close communities are more similar than expected by chance, and this similarity decays with distance (Ettema and Wardle 2002). Spatial autocorrelation is a consequence of dispersal limitation. Fine-scale genetic structure (SGS) within populations, defined as nonrandom patterns of genetic affinity over relatively small spatial scales (Smouse et al. 2008), not only impacts the evolutionary potential of a species, but it also indicates the role of ecological factors that shape populations (Campbell and Dooley 1992; Loiselle et al. 1995). Some reasons have been suggested for the presence of fine-scale genetic structure in plants, such as non-random fertilization with respect to physical distance among parents (Fenster and Sork 1988), and rare long-distance gene flow (Thomson and Thomson 1989). There are many ecological factors that can affect genetic structure of populations; for example, outcrossing enforces pollen movement,

3

increasing the probability of long-distance gene flow, which prevents population differentiation for neutral alleles (e.g. Slatkin and Maruyama 1975; Slatkin 1976). Insect-pollination compared to wind pollination has the potential to enhance genetic structure across the landscape and consequently increases fine-scale genetic structure (e.g. Augspurger 1980; Frankie and Haber

1983). Limited seed movement in gravity-dispersed species reduces effective population size and promotes interpopulation genetic structure (e.g. Bullock and Primack 1977; Beattie 1978). Annual life cycle increases susceptibility to drift due to bottleneck effects and as a result promotes local subdivision and fine scale genetic structure (e.g. Levin and Wilson 1978; Brown 1979).

Many factors, such as physical and chemical properties of soil, plant genotype and stage of plant development, directly or indirectly influence soil microbial community composition

(Lundberg et al. 2012; Marques et al. 2014; Schreiter et al. 2014; Qiao et al. 2017). Plants have a particularly close association with microbes belowground because root exudates known as rhizodeposits are important sources of carbon and other nutrients that heterotrophic microbes require (Hartmann et al. 2009). It has been suggested that close association between plants and microbes may be the result of active selection by plants for certain soil microbes or shared habitat preferences by plants and microbes (Bever et al. 2010) because spatially close biological communities are more similar than expected by chance and this similarity decays with distance

(Ettema and Wardle 2002). Limited dispersal ability of bacterial cells in the soil due to passive means is expected to maintain structure if there is not extensive soil disturbance. Therefore, it is most likely that dispersal limitation does influence patterns of microbial distribution (O’Malley

2007; Rout and Callaway 2012), and with host influence on rhizobia abundance, spatial autocorrelation of soil rhizobia communities is expected (Ettema and Wardle, 2002; Fierer and

Ladau 2012).

4

Nodulating rhizobia, which begin as free-living soil microbes, also experience variable selection as a result of changes in host genetic identity, resource availability, temperature, moisture and the soil biotic community. Both interspecific (Vuong et al. 2017; Pahua et al. 2018) and intraspecific (Heath and Tiffin 2007; Rangin et al. 2008; Crook et al. 2012) genetic variation of hosts influence selection on nodulating rhizobia. Even variation at a single host gene can have dramatic effects on strain occupancy and fitness in nodules (Kim et al. 2015). For plant hosts containing heavy seeds that are primarily dispersed by gravity, seedlings are expected to be near maternal plants. As a result, within a natural population, host plants and rhizobia symbionts are expected to develop an isolation by distance pattern. The presence of significant genotype x genotype interactions between plants and their symbionts is expected to reinforce this pattern at fine spatial scales.

Study system

Chamaecrista fasciculata (Michx.) Greene is an annual sub-erect native legume

(Fabaceae) that is widely distributed in the eastern U.S. It exhibits intraspecific morphological variation and wide ecological tolerance of different soil types and habitats (Irwin and Barneby

1982). Some taxonomists have named distinct infraspecific taxa based on morphological and ecological variation. This species exhibits a high degree of phenotypic variation in some characters, especially stem and leaf pubescence (Isely 1975). Geographic variation in these characters suggests that there may be local adaptation associated with these traits. Based on many observations, leaf pubescence has been reported across latitudes, although in accordance with Isley’s (1975) and Pullen’s (1963) observations and personal observations, the frequency of this character increases towards coastal areas of the Deep South. Variation in the shape of extra- floral nectaries and flower and anther color have also been detected within and among 5

populations (Pullen 1963; personal observation). As a result of the phenotypic variation detected across its range, the for the species is somewhat fluid. It is unclear what causes this variation across the species specifically, but Pullen’s (1963) work suggested it was not due to phenotypic plasticity. A combination of abiotic factors such as temperature, precipitation, soil factors, and pH, as well as biotic factors like insect herbivores and pathogens may be involved in driving the observed phenotypic variation.

Because of its ability to harbor nitrogen-fixing rhizobia, C. fasciculata is an important species in many natural ecosystems. Plant hosts that are genetically differentiated have the potential to select different sets of symbionts available in the soil in different locations, bringing about genotype x genotype interactions that influence the fitness of both partners. This suggests that the genetic and the community structure of symbionts are likely subject to selective pressures of the plant host and that they may be locally adapted to their hosts. Thus, finding genetic differentiation among host plant populations with high phenotypic variation could suggest some degree of local adaptation to abiotic and biotic factors.

The studies comprising this dissertation focused on understanding genetic and geographic structure in the host plant, C. fasciculata and symbiotic microbes of the root nodules and rhizsosphere. Bradyrhizobium is the only known genus of nodualting symbionts for this species

(Parker 1999; Parker and Kennedy 2006; Koppell and Parker 2012; Dorman and Wallace 2015).

In the first study, I considered the phenotypic variation in the leaf pubescence of C. fasciculata and tested the hypothesis that this phenotypic variation reflects an underlying population genetic structure using a set of microsatellite loci and populations across Mississippi and Alabama. The results from this study aid in understanding whether populations are evolving in geographic blocks connected by repeated gene flow or in relative isolation. Although genetic variation has

6

been characterized in this species previously (e.g. Fenster 1991, 2000; Fenster et al. 2004;

Etterson 2004; Ericson and Fenster 2006; Abdala-Roberts and Marquis 2007; Mannouris and

Byers 2013; Stanton-Geddes et al. 2013; Bueno et al. 2019), no genetic studies have focused on populations in the southernmost part of the distribution where novel phenotypic variation may exist. Thus, the results of this study have implications for understanding how genetic variation is structured in C. fasciculata at the southern edge of its distribution, as well as infraspecific classification of distinct lineages.

In the next study, I examined genetic variation at a finer scale (i.e., within a single population) and sought to determine if plant hosts and their symbionts show congruent patterns of genetic structure. I expected that plants in close physical proximity would be more genetically similar than those more distant from one another due to having gravity dispersed seeds. Dorman and Wallace (2019) previously demonstrated that Bradyrhizobium are phylogenetically diverse within a sample site yet do not show strong genetic structure across large geographic distances, suggesting that the scale at which successful symbioses are established may be very fine and determined by plant x rhizobia genotypic interactions, rather than soil traits. Thus, I also expected that nodulating rhizobia would exhibit a spatial scale of genetic structure that is concordant with their host plants due to genotype x genotype interactions, limited dispersal of bacteria in the soil, and enhancement of a suitable symbiont pool from continual host plant presence. Studying how genetic structure of nodulating bacteria might be affected by local host genetic diversity and environmental heterogeneity represents a critical step towards a general perception of plant–microbial mutualisms.

In the last study, I expanded my view of within-population structure by characterizing diversity of rhizosphere microbiomes associated with C. fasciculata. Three hypotheses were

7

tested in this chapter: host plants exhibit an isolation by distance pattern because of limited seed dispersal, rhizosphere microbial communities vary in association with plant genotypic variation, and microbial communities exhibit fine-scale structure that decays with distance. This study extends investigation of plant-microbe interactions to those bacteria and fungi living in proximity to the roots and evaluates their diversity across host plants. If genotypic variation of host plants is important in plant-microbe interactions and bacteria are dispersal-limited in the soil, then it is expected that plants with more similar genotypes will have more similar microbial communities and that microbial communities will become less similar with increasing physical distance. This study provides a better understanding of how plant genotype can influence soil microbial community structure at the finest spatial scales of interaction as by analyzing microbial community diversity and structure, I develop insight about the ecology, evolution and physiology of these interactions, which are key to plant health.

Collectively, this dissertation tests that genetic structure in C. fasciculata exists at multiple spatial scales and influences its phenotypic diversity and interactions with endophytic and free-living microbial communities. Studies of genotype x genotype interactions with nodulating symbionts and soil microbial communities help to understand the presence of local adaptation in plants and microbes. Furthermore, characterization of the finest spatial scales in interactions between legumes and rhizobia and identifying how genetic structure of microbial communities might be influenced by plant host diversity and environmental heterogeneity represents a critical step towards a deeper understanding of plant–microbial interactions.

8

References

Abdala-Roberts, L., and R. J. Marquis. 2007. Test of local adaptation to biotic interactions and soil abiotic conditions in the tended Chamaecrista fasciculata (Fabaceae). Oecologia 154: 315-326. Allard, R. W. 1975. The mating system and microevolution. Genetics 79: 115– 126. Antonovics, J., A. D. Bradshaw, and R. G. Turner. 1971. Heavy metal tolerance in plants. Advances in Ecological Research 7: 1–85. Augspurger, C. K. 1 980. Mass-flowering of a tropical shrub (Hybanthus prunifolius): Influence on pollinator attraction and movement. Evolution 34: 475-88. Beattie, A. 1978. Plant-animal interactions affecting gene flow in Viola. In the pollination of flowers by insects. ed. A. J. Richards, pp. 1 5 1-64. London: Academic Bever, J. D., I. A. Dickie, E. Facelli, J. M. Facelli, J. Klironomos, M. Moora, M. C. Rillig, W. D. Stock, M. Tibbett, and M. Zobel. 2010. Rooting theories of plant community ecology in microbial interactions. Trends in Ecology and Evolution 25: 468–478. Bueno, E., T. Kisha, S. L. Maki, et al. 2019. Genetic diversity of Chamaecrista fasciculata (Fabaceae) from the USDA germplasm collection. BMC Research Notes 12: 117. Brown, A. H. D. 1 979. Enzyme polymorphism in plant populations. Theoretical Population Biology 15: 1-42. Brown, A., E. Nevo, and D. Zohary. 1978. Association of alleles at esterase loci in wild barley, Hordeum spontaneum. Nature 268: 430-431. Bullock, S. H., R. B. Primack. 1977. Comparative experimental study of seed dispersal on animals. Ecology 58: 68 1- 86. Campbell, D. R., and J. L. Dooley. 1992. The spatial scale of genetic differentiation in a hummingbird-pollinated plant: comparison with models of isolation by distance. The American Naturalist 139: 735-748. Charlesworth, B. 2009. Fundamental concepts in genetics: Effective population size and patterns of molecular evolution and variation. Nature Reviews Genetics 10: 195-205. Charlesworth, B. and D. Charlesworth. 2010. Elements of evolutionary genetics. Roberts and company, Richboro, Pennsylvania, USA. Cronk, Q., I. Ojeda, and R. T. Pennington. 2006. Legume comparative genomics: progress in phylogenetics and phylogenomics. Current Opinion in Plant Biology 2: 99-103.

9

Crook, M. B, D. P. Lindsay, M. B. Biggs, J. S. Bentley, J. C. Price, S. C. Clement, M. J. Clement, S. R. Long, J. S. Griffitts. 2012. Rhizobial plasmids that cause impaired symbiotic nitrogen fixation and enhanced host invasion. Molecular Plant–Microbe Interactions 25: 1026–1033. Denison, R. F., and E. T Kiers. 2004. Lifestyle alternatives for rhizobia: mutualism, parasitism, and forgoing symbiosis. FEMS Microbiology Letters 237: 187– 193. Dorman, H. and L. E. Wallace. 2019. Diversity of nitrogen-fixing symbionts of Chamaecrista fasciculata (Partridge ) across variable soils. Southeastern Naturalist 80: 147-164. Erickson, D. L., and C. B. Fenster. 2006. Intraspecific hybridization and the recovery of fitness in the native legume Chamaecrista fasciculata. Evolution 60: 225– 233. Etterson, J. 2004. Evolutionary potential of Chamaecrista fasciculata in relation to climate change. I. Clinal patterns of selection along an environmental gradient in the Great Plains. Evolution 58: 1446–1458. Ettema, C. H. and D. A. Wardle. 2002. Spatial soil ecology. Trends in Ecology and Evolution 17:177–183. Fenster, C. B. and V. L. Sork. 1988. Effect of crossing distance and male parent on in vivo pollen tube growth in Chamaecrista (=) fasciculata. American Journal of Botany 75: 1898-1903. Fierer, N., and J. Ladau. 2012. Predicting microbial distributions in space and time. Nature Methods 9: 549-551. Fenster, C. B. 1991. Gene flow in Chamaecrista fasciculata (Leguminosae). I. Gene dispersal. Evolution. 45: 398–409. Fenster, C. B., W. S. Armbruster, P. Wilson, J. D. Thomson, M. R. Dudash. 2004. Pollination syndromes and floral specialization. Annual Review of Ecology, Evolution, and Systematics 35: 375–403. Frankie. G. W. and W. A. Haber. 1 983. Why move among mass-flowering neotropical trees. Molecular Biology and Evolution 17: 360- 72. Franklin, R. B., A. L. Mills. 2003 Multi-scale variation in spatial heterogeneity for microbial community structure in an eastern Virginia agricultural field. FEMS Microbiology Ecology 44: 335-346. Grime, J. P. 2001. Plant strategies, vegetation processes, and ecosystem properties, 2nd edn. Chichester: Wiley. Hallgrimsson, B. and B. K. Hall. 2005. Variation: a central concept in biology. Elsevier Academic Press, Cambridge, Massachusetts, USA.

10

Hartmann, A., M. Schmid, D. Van Tuinen, G. Berg. 2009. Plant-driven selection of microbes. Plant Soil 321: 235–257. Heath, K. D., and P. Tiffin. 2007. Context dependence in the coevolution of plant and rhizobial mutualists. Proceedings of the Royal Society of London B: Biological Sciences 274: 1905–1912. Heath, K. D., A. J. Stock, J. R. Stinchcombe. 2010. Mutualism variation in the nodulation response to nitrate. Journal of Evolutionary Biology 23: 1–7. Hirsch, A. M., Lum, M. R., and Downie, J. A. 2001. What makes the rhizobia-legume symbiosis so special? Plant Physiology 127: 1–9. Irwin, HS. and RC. Barneby. 1982. The American Cassinae: A synoptical revision of Leguminosae tribe Cassieae subtribe Cassinae in the New World. Memoirs of The New York Botanical Garden 35:1–918. Isely, D. 1975. Leguminosae of the United States: II. Subfamily . Memoirs of the New York Botanical Garden 25: 1-228. Kim, M., Y. Chen, J. Xi, C. Waters, R. Chen, D. Wang. 2015. An antimicrobial peptide essential for bacterial survival in the nitrogen-fixing symbiosis. Proceedings of the National Academy of Sciences, USA 112: 201500123. Klironomos, J. N. 2002. Feedback with soil biota contributes to plant rarity and invasiveness in communities. Nature 417: 67–70. Koppell, J. H., and M. A. Parker. 2012. Phylogenetic clustering of Bradyrhizobium symbionts on legumes indigenous to North America. Microbiology 158: 2050–2059. Kouchi, H., H. Imaizumi-Anraku, M. Hayashi, T. Hakoyama, T. Nakagawa, Y. Umehara, N. Suganuma, and M. Kawaguchi. 2010. How many in a pod? Legume genes responsible for mutualistic symbioses underground. Plant Cell Physiology 51: 1381- 1397. Levin, D. A., and J. B. Wilson. 1978. The genetic implications of ecological adaptations in plants. Evolution 70: 75-100. Lewontin, R. C. 1970. The units of selection. Annual Review of Ecology and Systematics 1: 1-18. Linhart, Y. B., J. B. Mitton, K. B. Sturgeon, and M. L. Davis. 1981. Genetic variation in space and time in a population of Ponderosa pine. Heredity 46: 407–426. Linhart, Y. B., and M. C. Grant. 1996. Evolutionary significance of local genetic differentiation in plants. Annual Review of Ecology, Evolution, and Systematics 27: 237-277. Loiselle, B. A., V. L. Sork, J. Nason, et al. 1995. Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). American Journal of Botany 82: 1420-1425.

11

Lundberg. D. S, S. L. Lebeis, S. H. Paredes, S. Yourstone, J. Gehring, S. Malfatti, et al. 2012. Defining the core Arabidopsis thaliana root microbiome. Nature 488: 86–90. Mandri, B., J. J. Drevon, A. Bargaz, K. Oufdou, M. Faghire, C. Plassard, H. Payre, and C. Ghoulam. 2011. Interactions between common bean (Phaseolus vulgaris) genotypes and rhizobia strains isolated from Moroccan soils for growth, phosphatase and phytase activities under phosphorus deficiency conditions. Journal of Plant Nutrition 34: 127- 139. Mannouris, C., and D. L. Byers. 2013. The impact of habitat fragmentation on fitness-related traits in a native prairie plant, Chamaecrista fasciculata (Fabaceae). Biological Journal of Linnaean Society 108: 55–67. Martínez-Romero, E., and J. Caballero-Mellado. 1996. Rhizobium phylogenies and bacterial genetic diversity. Critical Reviews in Plant Sciences 15: 113–140. Marques J. M., T. F. da Silva, R. E. Vollu, A. F. Blank, G. C. Ding, L. Seldin, et al. 2014. Plant age and genotype affect the bacterial community composition in the tuber rhizosphere of field-grown sweet potato plants. FEMS Microbiology Ecology 88: 424–435. Mitton, J. B. 1997. Selection in natural populations. Oxford University Press, Oxford. United Kingdom. O’Malley, MA. 2007. The nineteenth century roots of ‘everything is everywhere. Nature Reviews Microbiology 5: 647–651. Pahua, V. J., P. J. N. Stokes, A. C. Hollowell, J. U. Regus, K. A. Gano-Cohen, C. E. Wendlandt, K. W. Quides, J. Y. Lyu, J. L. Sachs. 2018. Fitness variation among host species and the paradox of ineffective rhizobia. Journal of Evolutionary Biology 31: 599– 610. Parker, M. 1999. Mutualism in metapopulations of legumes and rhizobia. The American Naturalist 153: 48-60. Parker, M., and D. A. Kennedy. 2006. Diversity and relationships of Bradyrhizobium from legumes native to eastern North America. Canadian Journal of Microbiology 52: 1148- 1157. Provorov, N. A., E. E. Andronov, O. P. Onishchuk, O. N. Kurchak, and E. P. Chizhevskaya. 2012. Genetic structure of the introduced and local populations of Rhizobium leguminosarum in plant-soil systems. Microbiology 81: 224–232. Pullen, T. M. 1963. The Cassia fasciculata complex (Leguminosae) in the United States. Ph.D. dissertation, The University of Georgia, Georgia, USA. Qiao, Q., F. Wang, J. Zhang, Y. Chen, C. Zhang, G. Liu, et al. 2017. The variation in the rhizosphere microbiome of cotton with soil type, genotype and developmental stage. Scientific Reports 7: 3940-3952.

12

Rangin, C., B. Brunel, J. C. Cleyet-Marel, M. M. Perrineau, G. Bena. 2008. Effects of Medicago truncatula genetic diversity, rhizobial competition, and strain effectiveness on the diversity of a natural Sinorhizobium species community. Applied and Environmental Microbiology 74: 5653–5661. Rout, M. E. and R. M. Callaway. 2012. Interactions between exotic invasive plants and soil microbes in the rhizosphere suggest that everything is not everywhere. Annals of Botany 110: 213-222. Schreiter S., G. C. Ding, H. Heuer, G. Neumann, M. Sandmann, R. Grosch, et al. 2014. Effect of the soil type on the microbiome in the rhizosphere of field‐grown lettuce. Frontiers in Microbiology 5: 144-160. Slatkin, M. 1976. The rate of spread of an advantageous allele in a subdivided population. In Population Genetics and Ecology, ed. S. Karlin, E. Nevo, pp. 767-80. New York: Academic Press. Slatkin, M. 1987. Gene flow and the geographic structure of natural populations. Science 236: 787-792. Slatkin, M., T. Maruyama. 1975. The influence of gene flow on genetic distance. The American Naturalist 109: 597-601. Smouse, P. E, R. Peakall, and E. Gonzales. 2008. A heterogeneity test for fine-scale genetic structure. Molecular Ecology 17: 3389–3400. Sprent, J. I. 2009. Legume Nodulation: A Global Perspective. Wiley–Blackwell, Oxford, UK. 200 pp. Stanton-Geddes, J., T. Paape, B. Epstein, R. Briskine, J. Yoder, J. Mudge, A. K. Bharti, A. D. Farmer, P. Zhou, R. Denny, et al. 2013. Candidate genes and genetic architecture of symbiotic and agronomic traits revealed by whole-genome, sequence-based association genetics in Medicago truncatula. PLoS One 8: e65688. Sultan, S. E. 1987. Evolutionary implications of phenotypic plasticity in plants. Evolutionary Biology 21: 127–178. Thomson, J. D. and B. A. Thomson. 1989. Dispersal of Erythronium grandiflorum pollen by : implications for gene flow and reproductive success. Evolution 43: 657– 661. Turner, S. L., and Young, J. P. W. 2001. The glutamine synthetases of rhizobia: phylogenetics and evolutionary implications. Molecular Biology and Evolution 17: 309–319. Vuong, H. B, P. H. Thrall, and L. G. Barrett. 2017. Host species and environmental variation can influence rhizobial community composition. Journal of Ecology 105: 540– 548. Wright, S. 1932. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Ecology 1: 356—366. 13

Yang, S., F. Tang, M. Gao, H. B. Krishnan, and H. Zhu. 2010. R gene-controlled host specificity in the legume–rhizobia symbiosis. Proceedings of the National Academy of Sciences, USA 107: 18735-18740. Zanella, C. M., M. Bruxel, G. M. Paggi, M. G. Goetze, M. V. Buttow, F. W. Cidade, and F. Bered. 2011. Genetic structure and phenotypic variations of the medicinal tetraploid species Bromelia antiacantha (Bromeliaceae). American Journal of Botany 98: 1511– 1519.

14

CHAPTER II

GENETIC STRUCTURE AND PHENOTYPIC DIVERSITY IN CHAMAECRISTA

FASCICULATA (FABACEAE) AT THE SOUTHERN EXTENT OF ITS RANGE

Introduction

The concept of genetic structure, or the nonrandom distribution of alleles and genes in space and time (Loveless and Hamrick 1984), was theorized from observations that plant populations have nonrandom spatial distributions (Allard 1975; Brown et al. 1978, Linhart et al.

1981). Many factors are known to affect genetic diversity and population genetic structure

(Loveless and Hamrick 1984), among which environmental heterogeneity and differential selection pressures are known to be the most important (Bradshaw 1972; Hedrick et al. 1976;

Spieth 1979). Within a plant species, environmental heterogeneity has the potential to influence the distribution of genetic variation among populations (Antonovics et al. 1971; Linhart and

Grant 1996; Mitton 1997). Environmental heterogeneity can create genetic variation through several evolutionary processes including adaptation, differential gene exchange and genetic drift or founder effects. Adaptation caused by natural selection can result in microgeographical variation (Linhart and Grant 1996). Differential gene exchange can be influenced by variation in flowering phenology among local habitats or ecological barriers among populations along an elevational gradient (Mitton et al. 1980). Through genetic drift or founder effects, founder populations colonize different sites, but gene flow is not sufficient to homogenize differences

(e.g., see Husband and Barrett 1996; Antonovics et al. 1997).

15

Factors such as microclimate and habitat variation have long been known to act on plants to produce genetic and phenotypic differentiation among populations. By influencing variation in biotic and abiotic elements (e.g. climate, resources and physical structure) as well as interactions with different species (e.g. resource competition, predation, mutualism and various forms of interspecific interference), these factors can result in habitat zonation (e.g., Schluter 2001; Ogden and Thorpe 2002). Phenotypic variation due to underlying heritable genetic variation is a fundamental prerequisite for evolution by natural selection (Lewontin 1970), and it is generally accepted that intraspecific phenotypic diversity results from variation in alleles and differences in gene expression (Hallgrimsson and Hall 2005; Charlesworth and Charlesworth 2010). It has been suggested that plant species with wide geographic distributions often display considerable phenotypic variation, which can reflect underlying genetic variation (Jain 1979; Karron 1987;

Hamrick and Godt 1990). However, when observed morphological differences among populations -either immediately or within a few generations- do not correspond with genetic differentiation, then phenotypic plasticity is suspected (e.g., James 1983; Losos et al. 1997;

Trussell and Etter 2001).

Genetic and phenotypic variation are also influenced by patterns of dispersal, migration, and gene flow events; these factors are known to be the most important forces producing geographic patterns in the distribution of genetic variation and can lead to spatially auto- correlated patterns (Myers and Giller 1988; Brown and Briggs 1991). Genetic structure within and among populations is generated by limits to parent–offspring dispersal distances (Wright

1943). The spatial genetic correlations are relatively large for pairs of adjacent or nearby individuals and generally decrease as the distance among individuals and populations increases

(Sokal and Wartenberg 1983). Dispersal, the movement of seeds or diaspores away from the

16

parent plant, is a key life-history stage in plants, and persistence, migration and seedling recruitment are all affected by seed dispersal distances (Howe and Smallwood 1982; Hyatt et al.

2003; Levin et al. 2003). Therefore, when dispersal between populations or subgroups is geographically restricted, the local accumulation of genetic differences produces a pattern influenced by distance (Slatkin 1993).

Environmental factors such as temperature, duration of growing season, and day length often vary with latitude, and native plant populations frequently exhibit latitudinal clines in traits as a result of adaptive responses to this variation. Although rarely examined, genetically based latitudinal clines in traits related to growth, phenology and life history have been identified in populations of several plant species (Weber and Schmid 1998; Kollmann and Bañuelos 2004;

Maron et al. 2004, 2007; Friedman et al. 2008; Montague et al. 2008). For example, in West

Africa, genetic diversity in sweet potato (Ipomoea batatas L.) distributed along a climatic gradient was associated with morphological differentiation, mainly the shape of the leaves and the color of the stem or root (Glato et al. 2017). Thus, it is expected that plant species distributed along a latitudinal gradient display genetic and phenotypic variation correlated with latitude.

Chamaecrista fasciculata, is a self-compatible, mostly outcrossing annual sub-erect native legume that is widely distributed in the eastern U.S. from Minnesota to Mississippi and from the east coast of the U.S. to New Mexico (USDA NRCS 2020). This species produces inflorescences of yellow flowers, which attract bees often as pollinators for the plant. It is an important species in many ecosystems as it provides cover, , and pollen for animals

(USDA NRCS 2020). This species exhibits intraspecific phenotypic variation in stem and leaflet pubescence, anther color, growth habit, and stem architecture, which may be geographically structured (Pullen 1963; Weakley 2015). For example, the frequency of leaflet pubescence

17

appears to increase toward coastal areas in lower latitudes in the southern U.S. (Isely 1975;

Nobarinezhad and Wallace personal observation). In this study, I employed microsatellite data to examine genetic diversity and structure among populations of C. fasciculata in Mississippi and

Alabama and to test for a genetic association with one phenotypically variable trait, leaflet pubescence. Previous study has suggested it is a genetically controlled trait (Wallace et al. in press). I studied this specific trait because the frequency of the trait is observed to decrease drastically toward higher latitudes, and I sought to investigate the association between genotype of the plant and its pubescence form. I focus on the southeast distribution of this species where phenotypic variation may be greatest, yet there have been few studies conducted on populations in this part of the species’ distribution. I hypothesize that there is genetic structure associated with phenotypic variation in leaflet pubescence among populations of C. fasciculata such that phenotypically distinct populations will also be genetically distinct. I also hypothesize that population genetic structure will follow an isolation by distance (IBD) pattern because of a low propensity for gene dispersal due to gravity-dispersed seeds and pollination. A previous study of pollen dispersal distance by honeybees and bumble-bees, the most important known pollinators for C. fasciculata, revealed that most of the pollen from a source plant is deposited on immediate neighbors (Cresswell et al. 1995). Under IBD, distance predicts differentiation as a result of dispersal limitation and drift, irrespective of environmental differences (Wright 1943).

Because leaflet pubescence appears to co-vary with latitude, I also hypothesized that there is latitudinal variation in genetic divergence of populations. This may occur because gene flow is strongest among populations in similar environments and selection pressures to a greater extent than predicted under IBD. Such an environment-driven pattern of gene flow is known as

"isolation by environment" or "isolation by ecology”. A previous study found clinal genetic

18

variation in populations from Minnesota to southern Oklahoma, which was associated with fitness traits, including slower reproductive development, greater number of leaves, and thicker leaves being favored in the most southern locations (Etterson and Shaw 2001).Our study tests for similarity of genetic patterns in southern edge populations compared to patterns observed in other areas of the species distribution and tests for an association between genetic and phenotypic divergence, which has received little attention thus far.

Materials and Methods

Data collection

In total, 38 populations of C. fasciculata across Mississippi and southern Alabama,

United States were sampled (Table 2.1, Figure 2.1). Since this species is not clonal and seeds are gravity dispersed from the maternal plant, individuals at each collection site were haphazardly sampled at least 2 meters apart to reduce the likelihood of sampling related individuals. GPS coordinates were recorded for each sampling site, and voucher specimens have been deposited in the Mississippi State University Herbarium (MISSA) (Table 2.1). Numbers and phenotypes of plants sampled from each population are given in Table 2.1. After collection, 2-8 pubescent pressed specimens per population and two leaflets per specimen from pubescent and mixed populations were scored for pubescence. The pressed plants were used to measure trichome density in the following manner. Two leaves from the middle and top part of each plant were randomly selected and then two leaflets from the middle part of each selected leaf were removed and observed under stereo microscope MEIJI Techno EMZ-13TR with 7x magnification to identify presence or absence of trichomes. Subsequently, photos were taken using a Canon

PowerShot G9X camera connected to the stereo microscope. At 7x magnification, three photos were taken per leaflet from bottom, middle and upper part of the leaflet. These photos were 19

uploaded into the software ImageJ 1.x (Schneider, 2012), presenting a 6000×4000-pixel ara of the leaflet. Consequently, the number of trichomes were counted manually in the respective photos from each of the bottom, middle and upper part of the leaflet in a 4×4 in2 leaf area and an average value was reported as the number of trichome per leaflet. Later, an average value from all studied leaflets per individual and finally a total average of trichomes in all studied individuals per population were reported (Table 2.1). Each population was assigned to one of three groups, glabrous, mixed, or pubescent.

Leaf samples to be used in DNA extraction were kept on ice in the field and then stored in silica gel or at -80°C prior to DNA extraction. Genomic DNA was extracted using a CTAB method (Dellaporta, 1983, lightly modified by Schnable lab described at file:///H:/First%20project-

First%20draft/Papers/For%20the%20references/Dellaporta_DNA_Extraction.2014.08.20.pdf) or the SYNERGY 2.0 Plant DNA extraction kit with modification (OPS Diagnostics, Lebanon,

New Jersey). Each sampled plant was genotyped at 14 microsatellite loci using a multiplexed genotyping approach to quantify genetic structure. The microsatellite loci were identified by

SiRoKo software (Kofler et al., 2012) using a C. fasciculata transcriptome data set (Singer et al.,

2009). Perfect repeats that were at least 15 base pairs long were targeted in the search. Primers were designed for 51 microsatellites and tested in a sample of 18 individuals. A total of 14 polymorphic loci, all with tri repeat type, were used in this study (Table 3.1). These loci were optimized in a multiplex PCR using the Kapa Multiplex PCR kit and fluorescent labeled primers

(Table 3.1) to permit automated genotyping. For each sample, three multiplex amplification reactions with five, four, and five loci per reaction respectively, were performed in a final volume of 10 μl in the presence of 10 ng of template DNA, 100 µmole of each of the reverse and

20

tagged fluorescent label primers and 10 µmole of tagged forward primer using a KAPA 2G Fast

Multiplex PCR kit (Kapa Bio-systems, Wilmington, Massachusetts). The tag in the 5’ forward primer matched the sequence of the fluorescent labeled primer (Culley et al., 2013; Table 3.1).

The thermal cycler program used to amplify loci included 3 min at 95°C, 30 cycles of 15 s at

95°C, 30 s at 60°C, and 30 s at 72°C, and a final extension step of 1 min at 72°C. Amplified products were genotyped at the Arizona State University DNA Lab with LIZ 600 size standard, and individual alleles were sized using GeneMarker software (SoftGenetics, State College,

Pennsylvania).

Data Analysis

The number of unique multilocus genotypes for each population was calculated by Poppr package (Kamvar et al. 2014) using RStudio statistical software v. 1.1.456 (RStudio Team 2012).

The possibility of null alleles in each locus and population was checked using the program

FreeNA (Chapuis and Estoup 2007). The clonal individuals were removed from the data set and detected nulls were corrected in respective loci and populations and later proceeded with genetic analyses (see Results). Genetic diversity within populations was assessed as number of alleles per locus (Na), percent of polymorphic loci (%P), observed heterozygosity (HO), and expected heterozygosity (HE) using GenAlEx version 6.503 (Peakall and Smouse 2012). For each locus, departure from Hardy–Weinberg expectations was tested through permutations of alleles among individuals and statistical significance was assessed using a p-value of 0.00009, which is adjusted using the sequential Bonferroni correction for multiple comparisons (Holm, 1979).

Genotypic linkage disequilibrium was measured for each pair of loci and tested through Fisher's exact test using GENEPOP version 3.2 (Raymond and Rousset 1995) and applying a Bonferroni- type corrected p-value of 0.0005. An analysis of molecular variance (AMOVA) (Excoffier et al. 21

1992) was conducted using GenAlEx version 6.503 (Peakall and Smouse 2012), which allows the hierarchical partitioning of genetic variance into three components: among populations, among individuals, and within individuals. Statistical significance of AMOVA was assessed by

9999 permutations.

Genetic diversity measures, including number of alleles (Na), number of effective alleles

(Ne), and observed and expected heterozygosity (Ho and He) were analyzed among pubescent, glabrous and mixed populations through ANOVA and least square difference (LSD) post hoc analyses. Regression analyses were used to study association between genetic diversity measures and leaflet pubescence. These analyses were conducted using IBM SPSS Statistics v.25 (IBM

Corp., 2017). To conduct these analyses, numerical values of 1, 2 and 3 were assigned to each of the glabrous, mixed and pubescent populations, respectively.

Isolation-by-distance (IBD), which is defined as a decrease in the genetic similarity among populations as the geographic distance between them increases, was investigated by comparing pairwise population genetic and geographic distances in a Mantel test (Mantel, 1967).

A matrix of pairwise FST among populations (MSU Repository) was created using option 6 sub- option 7 in GENEPOP version 3.2 (Raymond and Rousset 1995; Rousset 2008) and a matrix of pairwise geographic distances between populations was calculated using GenAlEx and a modification of the haversine formula (Sinnott, 1984) based on GPS coordinates of the sampled sites (MSU Repository). A Mantel test was performed using GENEPOP version 3.2 (Raymond and Rousset 1995; Rousset 2008) option 6 sub-option 9, which allows analysis of isolation by distance as described in Rousset (1997). Geographic distances were log transformed, pairwise F- statistics were converted to FST/(1-FST), minimum geographic distance was set to 1.0, and 1000

22

permutations were used to assess significance of a positive correlation indicative of isolation by distance.

The Bayesian statistical framework provided by the program STRUCTURE version 2.3.4

(Pritchard et al. 2000) was used to analyze the geographic structure of the populations. I used the

‘admixture model’ with ‘correlated allele frequencies’. I also used the sampling location of the individuals as a prior. A ‘burn-in’ period of 50000 MCMC replicates and a sampling period of

100000 replicates were set; these values were selected based on many initial testes to identify the best values that allow for maximum convergence of the runs. Twenty iterations of each K value, from one to ten, were performed (previous test runs indicated 10 as the best pre-set for k). In this way, multiple posterior probability values (log likelihood (lnL) values) for each K were generated, and the most likely K was evaluated using STRUCTURE HARVESTER (Earl and

VonHoldt 2012) by the DelataK-method following Evanno et al. (2005).

To identify whether geographic structure varies significantly under influence of annual temperature and precipitation, extracted from the WorldClim dataset (Fick and Hijmans 2017) for each sampled population, the hierarchical Bayesian method of Foll and Gaggiotti (2006) implemented in the program GESTE version 2 was used. By default settings, the reversible jump

MCMC method, and 10 pilot runs of a length of 5000 as burn-in prior to drawing samples from a chain of 50000 in length, separated by a thinning interval of 50 were used. All combinations of temperature and precipitation with each other and constant were considered, and models were evaluated using estimates of posterior probability, the 95% highest probability density interval

(HPDI).

23

Table 2.1 Genetic diversity, leaflet pubescence form, GPS coordinates and voucher ID in studied populations of Chamaecrista fasciculata.

a b Pop G/M/P N A P HO HE FIS GPS Coordinates Temperature Precipitation Voucher ID

(C) (mm)

1 G 24 5.21 100 0.59 0.59 0 -88.71022, 33.95617 16.8 1400 MISSA034317

2 G 25 5.71 100 0.5 0.56 0.13 -88.73812, 33.5108 16.7 1407 MISSA033001

3 G 11 4.71 92.86 0.53 0.56 0.02 -88.84644, 33.90426 16.4 1420 MISSA033003

4 G 11 3.86 92.86 0.39 0.54 0.27 -88.71072, 33.4773 16.8 1405 MISSA033004

5 G 18 4.79 100 0.57 0.61 0.04 -89.28017, 33.57761 16.5 1444 MISSA033005

6 G 24 5.5 100 0.62 0.63 0.03 -89.3508, 33.69892 16.5 1424 MISSA033006

7 G 11 4.64 100 0.55 0.58 0.01 -89.42863, 33.78039 16.5 1419 MISSA033007

8 G 24 6.29 100 0.56 0.59 0.04 -89.60973, 33.92705 16.4 1434 MISSA033008

9 G 33 6.07 100 0.51 0.61 0.15 -90.02042, 33.10603 17.1 1410 MISSA033011

10 G 23 4.93 100 0.51 0.58 0.08 -90.22324, 33.28662 17.5 1404 MISSA033013

24

Table 2.1 (continued)

a b Pop G/M/P N A P HO HE FIS GPS Coordinates Temperature Precipitation Voucher ID

(C) (mm)

11 G 11 4.71 100 0.54 0.6 0.09 -90.1377, 33.47292 17.4 1385 MISSA033015

12 P/37.7 11 3.71 100 0.45 0.46 -0.02 -88.68717, 31.8459 17.9 1473 MISSA033017

13 P/33.2 12 2.57 100 0.4 0.4 -0.01 -89.03673, 31.70537 18.1 1478 MISSA033018

14 P/32 10 3.86 100 0.58 0.56 -0.05 -89.29206, 31.94829 17.7 1489 MISSA033020

15 P/27.7 19 3.93 100 0.47 0.5 0.02 -89.40509, 32.02475 17.7 1486 MISSA033021

16 G 12 4.14 92.86 0.42 0.53 0.19 -89.40965, 32.25506 17.4 1507 MISSA033022

17 G 18 3.93 100 0.4 0.49 0.18 -88.78952, 33.27292 16.9 1398 MISSA033023

18 M/28.7 24 4.86 100 0.45 0.52 0.1 -89.062631, 33.5021 16.4 1441 MISSA034318

19 G 24 5.57 100 0.53 0.6 0.1 -89.511705, 33.3597 16.5 1458 MISSA034320

20 G 24 5.14 100 0.52 0.58 0.08 -89.520465, 3.38993 16.5 1460 MISSA034334

21 G 24 5.93 100 0.63 0.67 0.05 -90.0156, 33.782089 16.8 1405 MISSA034322

25

Table 2.1 (continued)

a b Pop G/M/P N A P HO HE FIS GPS Coordinates Temperature Precipitation Voucher ID

(C) (mm)

22 G 23 4.79 100 0.46 0.55 0.17 -89.531702, 3.38991 16.5 1458 MISSA034319

23 G 23 4.79 100 0.55 0.63 0.11 -90.058515, 3.77829 16.9 1404 MISSA034321

24 G 23 3.5 100 0.39 0.45 0.09 -88.81306, 3.476965 16.9 1411 MISSA034335

25 M/27.7 24 4.57 100 0.27 0.5 0.39 -88.94664, 30.81571 18.8 1657 MISSA034340

26 M/36.3 24 3.43 78.57 0.21 0.35 0.44 -88.91466, 30.77216 19.0 1652 MISSA034342

27 M/13.7 24 3.79 92.86 0.38 0.46 0.2 -87.50659, 31.5451 18.3 1504 MISSA034339

28 M/18.1 24 5.43 100 0.5 0.53 0.09 -88.31848, 32.18422 17.6 1510 MISSA034328

29 P/31 12 2.29 64.29 0.09 0.19 0.41 -89.19276, 31.02661 18.6 1617 MISSA034341

30 P/22.3 24 4.07 100 0.46 0.52 0.09 -89.19105, 31.05436 18.7 1603 MISSA034338

31 M/30.8 24 4.36 100 0.54 0.54 -0.03 -89.18211, 31.02901 18.5 1622 MISSA034336

32 M/22.4 24 4.79 100 0.49 0.53 0.15 -88.99221, 30.84834 18.9 1652 MISSA034337

26

Table 2.1 (continued)

a b Pop G/M/P N A P HO HE FIS GPS Coordinates Temperature Precipitation Voucher ID

(C) (mm)

33 M/29.5 24 3.07 85.71 0.34 0.39 0.13 -88.3374, 30.49338 19.5 1656 MISSA034323

34 P/28.7 24 3.21 85.71 0.27 0.45 0.33 -87.88067, 30.74051 19.3 1680 MISSA034326

35 P/19.5 24 2.07 57.14 0.11 0.21 0.35 -87.45387, 31.50992 18.0 1533 MISSA034324

36 P/23.7 24 4.07 78.57 0.38 0.44 0.11 -87.95872, 31.73604 18.0 1539 MISSA034327

37 G 18 1.64 35.71 0.18 0.19 0.01 -88.93438, 30.59455 19.3 1654 MISSA034325

Mean 20.4 4.32 93.44 0.44 0.51 0.12

Notes. N = number of individuals sampled; A = mean number of alleles per locus; P = percentage of polymorphic loci; HO = observed heterozygosity; HE = expected heterozygosity; FIS = inbreeding coefficient. G = Glabrous; P = Pubescent; M = Mixed, both glabrous and pubescent individuals present. a Numbers represent the mean number of trichomes in leaf area for pubescent individuals with the diameter of 4×4 inch. Glabrous individuals had no trichomes. b Vouchers are deposited at the Mississippi State University Herbarium (MISSA).

27

Results

Clonal genotypes were detected in four populations (i.e., two in population 9, two in population 29, six in population 37 and twelve in population 38). Prior to subsequent analyses, duplicate genotypes were removed from populations 9, 29 and 37, while population 38 was removed from the data set. Possible null alleles were detected at six loci and in 7 of the 37 populations (i.e., Cf6895 in populations 34 and 37, Cf10002 in population 37, Cf9980 in population 35, Cf5782 in population 25, Cf3119 in population 25 and Cf8757 in population 26).

I accounted for null alleles at these loci when the predicted null allele frequency was greater than

0.2, as suggested by Dakin and Avise (2004) as a cut-off for null alleles that can influence analyses of genetic structure. The number of alleles per locus for each population ranged from

1.64 to 6.29, with a mean value of 4.32 (Table 2.1). The percentage of polymorphic loci in 26 populations was 100 and it ranged 35-92 for the remaining populations. The mean %P across all populations was 93.44. Observed heterozygosity ranged from 0.09 to 0.63 and expected heterozygosity ranged from 0.19 to 0.67 (Table 2.1). Excluding four populations with negative values, the inbreeding coefficient (FIS) was consistently positive, indicating heterozygote deficiency across these populations (Table 2.1). After applying Bonferroni correction, 61 out of

518 (11.77%) locus by locus comparisons showed significant deviation from Hardy-Weinberg

Equilibrium (HWE). The global test of deviation (heterozygote deficiency) at the population level showed a p-value of 0.0 for all loci across all populations. After applying Bonferroni correction, 43 out of 3367 (1.27%) tests of linkage disequilibrium were significant, in different populations and involving different pairs of loci. Significant levels of genetic differentiation were detected among the sampled populations (FST = 0.193; P = 0.001). AMOVA indicated that

28

variation is partitioned as 68% within individuals, 19% among populations and 13% among individuals (Table 2.2).

Table 2.2 Hierarchical structure of genetic variation in Chamaecrista fasciculata populations determined by an analysis of molecular variance.

Level d.f. Sum of Variance Percent FST P-value

squares component variation

Among 36 1470.833 0.896 19% 0.193 0.001

Populations

Among 719 3121.931 0.609 13%

Individuals

Within Individuals 756 274 2361.500937.000 3.1243.420 68% 77%

29

Figure 2.1 Sampling locations of collected populations of Chamaecrista fasciculata. The pie charts show assignment to each of the two genetic clusters for each population based on STRUCTURE analysis. Black color represents cluster 1 and white color represents cluster 2. Numbers represent the population ID, and G, M and P stands for glabrous, mixed and pubescent, respectively. 30

It has been observed that pubescent leaflet is not a trait exclusively restricted to the southernmost populations, but the frequency of the trait largely drops out by moving farther north. ANOVA showed that there are significant differences among pubescent, glabrous and mixed populations for all of the tested genetic diversity measures including number of alleles

(F2,34 = 7.57, P = 0.002), number of effective alleles (F2,34 = 10.02, P = 0.000), observed heterozygosity (F2,34 = 4.65, P = 0.016) and expected heterozygosity (F2,34 = 6.32, P = 0.005)

(Table 2.3). LSD post hoc tests confirmed that significant differences exist between glabrous and pubescent populations for all tested genetic measures. While in case of Ne, there is also significant differentiation between glabrous and mixed populations (Table 2.4). Presence of private alleles was also investigated in each of the three leaf phenotype groups, but no private alleles were detected. Regression analyses indicated a significant association between all the genetic diversity measures i.e. Na, Ne, Ho and He and the pubescence form of population of origin

(Table 2.5, Figure 2.2 A & B).

31

Table 2.3 Results from ANOVA conducted on genetic diversity measures across pubescent types in Chamaecrista fasciculata.

Genetic Sum of df Mean F p-value

Diversity Squares Square

Na Between Groups 13.674 2 6.837 7.575 0.002

Within Groups 30.688 34 0.903

Total 44.362 36

Ne Between Groups 4.396 2 2.198 10.027 0.000

Within Groups 7.452 34 0.219

Total 11.848 36

Ho Between Groups 0.143 2 0.071 4.652 0.016

Within Groups 0.523 34 0.015

Total 0.666 36

He Between Groups 0.134 2 0.067 6.321 0.005

Within Groups 0.360 34 0.011

Total 0.494 36

32

Table 2.4 Post hoc analyses demonstrating significant differences (p < 0.05) in genetic diversity among pubescent, mixed, and glabrous populations of Chamaecrista fasciculata.

Genetic diversity Trait (i) Trait (j) Mean difference P-value

measure

Na G P 1.48361 0.000

Ne G M 0.51525 0.013

G P 0.79733 0.000

Ho G P 0.14083 0.008

He G P 0.14256 0.002

Notes. Na = number of alleles, Ne = effective alleles, Ho = observed heterozygosity, He = expected heterozygosity.

33

Figure 2.2 Bivariate plot of genetic diversity measures for glabrous, pubescent and mixed populations. A) Number of alleles (Na) and number of effective alleles (Ne). B) Observed and expected heterozygosity (Ho and He).

34

A Mantel test showed a non-significant correlation between population genetic and geographic distances (r = 0.08, P = 0.08; Figure 2.4). Bayesian analysis in STRUCTURE suggested the presence of two distinct genetic clusters (K = 2) based on the highest delta K value

(Figure 2.3A). The proportion of membership of each population to each of the two genetic clusters is shown in Fig. 3B. While populations 25 and 26 had mixed membership of individuals for two clusters, populations 29, 35 and 37 were grouped into cluster 2 and the rest of the populations grouped into cluster 1. These clusters do not correspond to leaflet pubescent type or geography. For example, populations 25 and 26 were composed of a mixture of glabrous and pubescent individuals. Cluster 2 includes pubescent populations (i.e., 29 and 35) and a glabrous population (i.e., 35) (Figures 2.1 & 2.3). Cluster 2 populations showed the lowest values for all genetic measures, and the highest FIS values. No specific geographic pattern was detected among populations included in cluster 1, and they were dispersed throughout the collection area (Figure

2.1). Out of 32 populations included in cluster 1, 19 were glabrous, seven were pubescent and six were mixed. The annual temperature and precipitation ranged 16.38 – 19.47 C and 1385 – 1680 mm, respectively. Analyses in GESTE indicated that precipitation did not have a significant effect on population genetic differentiation while a model including temperature plus constant showed the highest posterior probability (0.909) (Table 2.6).

35

Table 2.5 Results of linear regressions between genetic diversity and leaflet pubescence in populations of Chamaecrista fasciculata.

Genetic Standardized SE t P-value R-square diversity Coefficient measure

Na -0.548 0.941 -3.879 0.00 0.301

Ne -0.603 0.463 -4.476 0.00 0.364

Ho -0.455 0.122 -3.022 0.005 0.207

He -0.52 0.101 -3.6 0.001 0.27

Notes. Na = number of alleles, Ne = effective alleles, Ho = observed heterozygosity, He = expected heterozygosity.

Table 2.6 Importance of temperature and precipitation in explaining the observed genetic structure of Chamaecrista fasciculata populations. Posterior probabilities were generated using GESTE. The model with the highest probability is shown in bold.

Factor(s) Included Posterior probability

Constant 0

Constant, Precipitation 0

Constant, Temperature 0.909

Constant, Temperature, Precipitation 0.09

Constant, Temperature, Precipitation, Temperature*Precipitation 0

36

Figure 2.3 Results from Bayesian analysis in STRUCTURE. A) DeltaK values based on the method of Evanno et al. (2005) for all values of K that were tested. B) Assignment probability of each sample into each of the two clusters. Black color represents cluster 1 and white color represents cluster 2.

37

Figure 2.4 Bivariate plot of pairwise genetic distances (FST/1-FST) vs. ln of geographical distances. A Mantel test indicated a non-significant correlation between the matrices (r = 0.08, P = 0.08).

38

Discussion

Genetic and phenotypic variation in Chamaecrista fasciculata

Widespread and regional taxa show significantly higher within‐population diversity than narrow and endemic species in genetic studies based on microsatellites and other DNA fragment markers (Nybom 2004). Thus, because C. fasciculata is a widespread species, it is expected to contain high levels of genetic diversity. The current survey of populations of C. fasciculata distributed across its southeastern edge revealed genetic variability among populations.

Consistent with a primarily outcrossing breeding system, 71% of observed genetic variation occurs within individuals. Most of studied loci were 100% polymorphic in most studied populations (Table 2.1). Though within-population genetic diversity was variable among most populations (mean Ho ranging from 0.09 to 0.63), the mean value was lower than mean Ho derived by STMS (sequence tagged microsatellite sites) for other widespread species (0.57), or gravity-dispersal species (0.5) (Nybom 2004). However, the measured value for mean Ho in this study (0.44) was higher than for other annual species (0.18) (Nybom 2004). The average within- population diversity (H') measured for two related species in Brazil, i.e. Chamaecrista mucronata (Spreng.) and Chamaecrista semaphora (H.S. Irwin and Barneby), using RAPD markers, revealed values of 0.299 and 0.124, respectively, in those species (da Silva et al. 2007).

Allozyme-derived estimates of Ho and He for C. fasciculata determined from 12 populations distributed in an area of 20 by 8 meters were 0.256 and 0.282, respectively, in another study

(Fenster et al. 2003). Although these studies used different populations and different genetic markers, they show that C. fasciculata is genetically diverse and potentially adapted to different environmental conditions, which is likely key to its widespread distribution.

39

Phenotypic variation, unless driven by genetically programmed plasticity, is generated by genetic diversity, a fundamental source of biodiversity (Hughes et al. 2008). When phenotypic variance results from natural selection, it reflects both adaption to local environmental characteristics and genetic diversity (Liang et al. 2005; Wang et al. 2009). Leaf pubescence has been reported for C. fasciculata from coastal and non-coastal areas at its southern edge, although our observations are consistent with Pullen (1963) and Isely (1975) who suggested that the trait frequency increases towards the Gulf Coast. All pubescent populations sampled in our study were distributed at the lowest latitudes sampled, and mixed populations were located primarily at lower latitudes (Figure 2.1). With the observed phenotypic variation, we hypothesized that there should be genetic differentiation associated with leaf pubescence in this species. Our analyses demonstrate that there are significant differences in genetic diversity between pubescent and glabrous populations. In the case of Ne, glabrous populations were also significantly different from mixed populations. Regression analysis also revealed a significant association between leaf pubescence and genetic measures such that higher genetic diversity was observed in glabrous populations, with gradually decreased diversity observed in mixed and pubescent populations

(Table 2.5, Figures 2.2 A & B).

Despite these findings, our STRUCTURE analysis did not distinguish pubescent from glabrous or mixed populations (Figure 2. 3). Phenotypic variation in C. fasciculata has often been noted at local scales (e.g., Pullen 1963; Weakley 2015) and found to occur among geographically disparate populations (e.g., Galloway and Fenster 2000; Henson et al. 2013).

Even though there is no support for a significant association between phenotype and genotype in this study, this does not preclude the original hypothesis of association between genotype and phenotype. Markers directly involved in trichome development may show evidence of

40

geographic structure that would not be detected in a microsatellite-based study such as this.

Other studies of C. fasciculata suggest that pubescence is a genetically controlled trait as it is maintained in a common garden (Wallace et al. in press). Pullen (1963) also suggested that high levels of phenotypic variation observed in this species, except for root and stem branching that could be easily modified under cultivation, was not due to phenotypic plasticity. Etterson (2004), by studying this species along a latitudinal gradient, showed that plants with slower reproductive development, more leaves, and thicker leaves were favored in the most southern garden. Frazee and Marquis (1994) who quantified floral trait variation in an Illinois prairie population of C. fasciculata reported that variation in environmental factors explained a significant portion of the naturally occurring variation in corolla width, ovule number, ovule size, and anther length.

Further, they determined significant effects of leaf herbivory, variable soil nutrient and water content on floral trait variation. The significant effect of soil source on flower production is also shown by Abdala-Roberts and Marquis (2007). It has also been shown that soil water retention ability and fertility can affect extrafloral nectary production in this species (Chen 2003). Overall, there is spatial variation in the occurrence and selection of phenotypic traits in this species, which seems to be correlated with environmental variability. Many plants accumulate UV- absorbing compounds such as flavonols in trichomes which further protect the underlying photosynthetic tissues from damaging UV-A and UV-B radiation (Liakopoulos et al. 2006; Yan et al. 2012). Given that in adverse environments, trichomes are beneficial because they influence the water balance, protect photosynthesis and play a role in thermoregulation and tolerating heat

(Hauser 2014), the pubescent nature of plants growing at lower latitudes, where the temperature is the highest (Table 2.1), might reflect the protective nature of trichomes to increase reflectance and reduce the heat load.

41

Genetic differentiation and gene flow

Contrary to our expectation of an isolation by distance pattern of genetic structure due to gravity-dispersed seeds and bee pollination, I did not find a significant correlation between geographic and genetic distance. Isolation by distance results in increasing genotypic differences among populations with increasing geographic distance due to low distances over which gene flow occurs (Wright 1978). However, it is difficult to test for this effect because distance is often confounded by barriers, either current or historical. The absence of IBD in a heavy seeded, bee- pollinated plant may result not only from long-distance dispersal of seeds or pollen, but also a continuous distribution and higher rate of genetic exchange between populations (Travadon et al.

2011). A continuous distribution can have a large influence on overall levels of genetic diversity

(Criscione et al. 2005) by buffering the effects of genetic drift, which is one known driver of

IBD. For those widespread species that do not experience strong effects of genetic drift during their life-cycle, which may be due to constant niche availability, ability of seeds to survive between growing seasons, or ability to passively migrate and germinate at long distances from maternal plants, high levels of within-population genetic diversity can be maintained (Dorken and Eckert 2001). I did find high levels of diversity across populations in the study area, which would be consistent with a buffering effect against genetic drift. This study demonstrates that in these cases, the IBD model does not seem appropriate to infer dispersal distances of C. fasciculata populations.

Though populations in this study do not seem to be differentiated based strictly on distances separating them, an effect of temperature on levels and patterns of genetic diversity was detected, consistent with our third hypothesis. I found that populations at the highest latitudes sampled where temperature was relatively lower (Table 2.1) typically showed higher

42

levels of genetic diversity than populations at the lowest latitudes sampled with higher temperature. GESTE results showed a strong correlation between temperature and genetic diversity. Latitudinal variation, which has direct relation with temperature has previously been observed in other morphological traits in this species, including abundance of leaves and development of reproductive organs, by Etterson (2004). These results are in line with previous studies showing that clinal patterns for phenology were supported by observed patterns of genetic differentiation among populations in this species (Etterson and Shaw 2001; Etterson

2004). Though this species has a high outcrossing rate of 80% (Fenster, 1991, 1995), one potential reason for observed lower genetic diversity at lower latitudes, aside from less incidence of gene flow at this area, might be related to more frequent selfing at lower latitudes, which also can be concluded from heterozygosity deficiency observed in this study. Bueno et al. (2019) also found a decrease in genetic diversity of germplasm collections with decrease in latitude in this species using AFLP markers; germplasm accessions collected from the Central U.S. had higher mean genetic diversity (H) values (0.239) than southern U.S. populations (0.14).

Because the lowest latitude populations sampled are at the species’ range boundary on the coast they constitute edge populations, and therefore might experience more rapid cycles of extinction, recolonization and associated founder events or severe populations bottlenecks than those in less extreme upper latitude environments (Szczecińska et al. 2016). In accordance with the central marginal model (Lesica and Allendorf 1995) these peripheral populations at the edge of the species distribution may be expected to have lower genetic diversity and higher genetic differentiation than central populations (Tollefsrud et al. 2009; Pandey and Rajora 2012). Bueno et al. (2019) believes that the current genetic diversity occurring in northern populations of C. fasciculata is a result of post-glacial advances of populations northward while admixing with

43

gene pools from different refugia; and differentiation among southern populations is likely because of differences associated with restricted gene flow on either side of the Appalachian mountain chain and the Ozark mountains. Because mountain chains impeded gene flow between northern and southern populations and as result of that differentiation between these two gene pools gradually occurred after glaciation.

Considering the fact that the southeastern United States harbors an unusually large number of endemic plant taxa, which may reflect an abundance of distinct refugia in this region during Pleistocene glacial maxima (Trapnell et al. 2007), the studied populations might be relicts of a continuous distribution throughout the southeastern United States during glacial maxima.

Lower latitude populations might have been independently founded more recently by propagules from upper latitudes while losing a portion of genetic diversity. Supporting evidence for this is the lack of genetic differentiation between lower and upper latitude populations while genetic diversity diminishes from higher latitudes toward lower latitudes. Another explanation for lower genetic variation at lower latitudes relates to the lack of proper rhizobia in soil and consequently a limitation in suitable establishment of recruits and seed production of the host plants. Stanton-

Geddes and Anderson (2011) showed that the availability and quality of mutualists beyond a species’ range edge may limit range expansion. It is possible that the physical characteristics of the soil in coastal areas do not favor rhizobia proliferation, which serve as symbionts of the studied species; this hypothesis needs to be further examined in future studies by sampling soil from different latitudes while investigating genetic diversity across latitudes.

Conclusions

There are few genetic studies on C. fasciculata to date with no reference to accompanying phenotypic variation. Here, I showed that in Mississippi and coastal areas of 44

Alabama, genetic variation in C. fasciculata populations is largely distributed within individuals and that differences between populations reflect factors that covary with latitude. Accordingly, co-located populations are likely to be more similar than those more distant. Although leaflet pubescence was not found to be directly associated with the genetic markers used in this study, the significant association between the genetic diversity measures and the pubescence form of the population of origin suggests a link between this phenotype and genetic variation in this species. It is possible that by studying loci directly involved in trichome development evidence of geographic structure will be found in this species. Nevertheless, I found significant population genetic differentiation, occasionally, even at short distances. Therefore, I conclude that most of this genetic variation reflects genetic processes such as more genetic exchange among connected populations through continues distribution, frequent out-crossing at higher latitudes but relatively higher incidence of selfing at lower latitudes. Overall, our study presents the role that contemporary factors, including geographic distance and temperature, can play in generating patterns of spatial genetic variation in C. fasciculata. Better understanding the factors that shape genetic variation in this species contribute to comprehending how genetic variation is distributed, the probability of local adaptation of phenotypically diverse forms, and the maintenance of genetically variable populations.

45

References

Abdala-Roberts, L., and R. J. Marquis. 2007. Test of local adaptation to biotic interactions and soil abiotic conditions in the ant tended Chamaecrista fasciculata (Fabaceae). Oecologia 154: 315-326.

Allard, R. W. 1975. The mating system and microevolution. Genetics 79: 115– 126.

Antonovics, J., A. D. Bradshaw, and R. G. Turner. 1971. Heavy metal tolerance in plants. Advances in Ecological Research 7: 1–85.

Antonovics, J., P. H. Thrall, and A. M. Jarosz. 1997. Genetics and the spatial ecology of species interactions: the Silene– Ustilago System. In D. Tilman and P. Kareiva [eds.], Spatial ecology. The role of space in population dynamics and interspecific interactions, 158– 180. Princeton University Press, Princeton, New Jersey, USA.

Bradshaw, A. D. 1972. Some evolutionary consequences of being a plant. In T. Dobzhansky, M. K. Hecht, and W. C. Steere [eds.], Evolutionary Biology, 5:25—47. Appleton-Century- Crofts, New York, New York, USA.

Brown, A., E. Nevo, and D. Zohary. 1978. Association of alleles at esterase loci in wild barley, Hordeum spontaneum. Nature 268: 430-431.

Brown, A. H. D., and J. D. Briggs. 1991. Sampling strategies for genetic variation in ex situ collections of endangered plant species. In D. A. Falk and K. E. Holsinger [eds.], Genetics and conservation of rare plants, 99-119. Oxford University Press, Oxford, United Kingdom.

Bueno, E., T. Kisha, S. L. Maki, et al. 2019. Genetic diversity of Chamaecrista fasciculata (Fabaceae) from the USDA germplasm collection. BMC Research Notes 12: 117.

Chapuis, M. P., and A. Estoup. 2007. Microsatellite null alleles and estimation of population differentiation. Molecular Biology and Evolution 24: 621-631.

Charlesworth, B. and D. Charlesworth. 2010. Elements of evolutionary genetics. Roberts and company, Richboro, Pennsylvania, USA.

Chen, G. 2003. Impacts of florivores, , and soil moisture on seed production in partridge pea (Chamaecrista fasciculata), a native annual plant. M.S. Thesis, University of Missouri, St. Louis, USA.

Cresswell, J. E., A. P. Bassom, S. A. Bell, S. J. Collins, and T. B. Kelly. 1995. Predicted pollen dispersal by honeybees and bumblebees foraging on oil‐seed rape: a comparison of three models. Functional Ecology 6: 829– 841.

Criscione, C. D., R. Poulin, and M. S. Blouin. 2005. Molecular ecology of parasites: elucidating ecological and microevolutionary processes. Molecular Ecology 14: 2247–57. 46

Culley, T. M., T. I. Stamper, R. L. Stokes, J. R. Brzyski, N. A. Hardiman, M. R. Klooster, and B. J. Merritt. 2013. An efficient technique for primer development and application that integrates fluorescent labeling and multiplex PCR. Applications in Plant Sciences 1: 1300027.

Dakin, E. E., and J. C. Avise. 2004. Microsatellite null alleles in parentage analysis. Heredity 93: 504-509. da Silva, R. M., G. W. Fernandes, and M. B. Lovato. 2007. Genetic variation in two Chamaecrista species (Leguminosae), one endangered and narrowly distributed and another widespread in the Serra do Espinhaço, Brazil. Canadian Journal of Botany 85: 629–636.

Dellaporta, S. L., J. Wood and J. B. Hicks. 1983. A plant DNA mini preparation: Version II. Plant Molecular Biology Reporter 1:19-21.

Dorken, M. E., and C. G. Eckert. 2001. Severely reduced sexual reproduction in northern populations of a clonal plant, Decodon verticillatus (Lythraceae). Journal of Ecology 89: 339-350.

Earl Dent, A., and B. M. VonHoldt. 2012. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method, version 2.3.4. Conservation Genetics Resources 4: 359-361.

Etterson, J. R., and R. G. Shaw. 2001. Constraint to adaptive evolution in response to global warming. Science 294: 151 – 154.

Etterson, J. 2004. Evolutionary potential of Chamaecrista fasciculata in relation to climate change. I. Clinal patterns of selection along an environmental gradient in the Great Plains. Evolution 58: 1446–1458.

Evanno, G., S. Regnaut, and J. Goudet. 2005. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Molecular Ecology 14: 2611–2620.

Excoffier, L., P. E. Smouse, and J. M. Quattro. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–91.

Fenster, C. B. 1991. Gene flow in Chamaecrista fasciculata (Leguminosae) I. Gene dispersal. Evolution 45: 398–409.

Fenster, C. B. 1995. Mirror image flowers and their effect on outcrossing rate in Chamaecrista fasciculata (Leguminosae). American Journal of Botany 82: 46–50.

Fenster, C. B., X. Vekemans, and O. J. Hardy. 2003. Quantifying gene flow from spatial genetic structure data in a metapopulation of Chamaecrista fasciculata (Leguminosae). Evolution 57: 995–1007. 47

Fick, S. E. and R. J. Hijmans, 2017. WorldClim 2: new 1km spatial resolution climate surfaces for global land areas. International Journal of Climatology 37: 4302-4315.

Foll, M., and O. E. Gaggiotti. 2006. Identifying the environmental factors that determine the genetic structure of populations. Genetics 174: 875– 891.

Frazee, J. E., and R. J. Marquis. 1994. Environmental contribution to floral trait variation in Chamaecrista fasciculata (Fabaceae: Caesalpinoideae). American Journal of Botany 81: 206–215.

Friedman, J. H., J. E. Roelle, J. F. Gaskin, A. E. Pepper, and J. R. Manhart. 2008. Latitudinal variation in cold hardiness in introduced Tamarix and native Populus. Evolutionary Applications 1:598–607.

Galloway, L. F. and C. B. Fenster. 2000. Population differentiation in an annual legume: local adaptation. Evolution 54: 1173-1181.

Glato, K., A. Aidam, N. A. Kane, D. Bassirou, M. Couderc, L. Zekraoui, et al. 2017. Structure of sweet potato (Ipomoea batatas) diversity in West Africa covaries with a climatic gradient. PLoS One 12: e0177697.

Hallgrimsson, B. and B. K. Hall. 2005. Variation: a central concept in biology. Elsevier Academic Press, Cambridge, Massachusetts, USA.

Hamrick, J. L. and M. J. W. Godt. 1989. Allozyme diversity in plant species. In A. H. D. Brown, M. T. Clegg, A. L. Kahler and B. S. Weir [eds.], Plant population genetics, breeding and germplasm resources, 43-63. Sinauer Associates Inc., Sunderland, Massachusetts, USA.

Hauser, M. T. 2014. Molecular basis of natural variation and environmental control of trichome patterning. Frontiers in plant science 320: 1-7.

Hedrick, P. W., M. E. Ginevan, and E. P. Ewing. 1976. Genetic polymorphism in heterogeneous environments. Annual Review of Ecology, Evolution, and Systematics 7: 1-32.

Henson, T. M., W. Cory, and M. T. Rutter. 2013. Extensive variation in cadmium tolerance and accumulation among populations of Chamaecrista fasciculata. PLoS One 8:e63200.

Holm, S. 1979. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6: 65–70.

Howe, H. F., and J. Smallwood. 1982. Ecology of seed dispersal. Annual Review of Ecology and Systematics 13: 201-228.

Hughes, A. R., B. D. Inouye, M. T. J. Johnson, N. Underwood, and M. Vellend. 2008. Ecological consequences of genetic diversity. Ecology Letters 11: 609-623.

48

Husband, B. C., and S. C. H. Barrett. 1996. A metapopulation perspective in plant population biology. Journal of Ecology 84: 461–469.

Hyatt, L. A., M. S. Rosenberg, T. G. Howard, G. Bole, W. Fang, J. Anastasia, K. Brown, R. Grella, K. Hinman, J. P. Kurdziel, and J. Gurevitch. 2003. The distance dependence prediction of the Janzen-Connell hypothesis: a meta-analysis. Oikos 103: 590-602.

IBM Corp. 2017. IBM SPSS Statistics for Winodows, version 25. IBM Corp., Armonk, New York, USA.

Isely, D. 1975. Leguminosae of the United States: II. Subfamily Caesalpinioideae. Memoirs of the New York Botanical Garden 25: 1-228.

Jain, S. 1979. Adaptive strategies: polymorphism, plasticity, and homeostasis. In O. T. Solbrig, S. Jain, G. B. Johnson, and P. H. Raven [eds.], Topics in Plant Population Biology, 160- 187. Columbia University Press, New York, USA.

James, F.C. 1983. Environmental component of morphological differentiation in birds. Science 221: 184–187.

Kamvar Z. N., J. F. Tabima, and N. J. Grünwald. 2014. Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peer J 2: e281.

Karron, J. D. 1987. A comparison of levels of genetic polymorphism and self‐compatibility in geographically restricted and widespread plant congeners. Evolutionary Ecology 1: 47– 58.

Kofler, R., A. J. Betancourt, and C. Schlötterer. 2012. Sequencing of pooled DNA samples (Pool-Seq) uncovers complex dynamics of transposable element insertions in Drosophila melanogaster. PLoS Genetics 8: e1002487.

Kollmann, J., and M. J. Banuelos. 2004. Latitudinal trends in growth and phenology of the invasive alien plant Impatiens glandulifera (Balsaminaceae). Diversity and Distributions 10: 377– 385.

Lesica, P., and F. W. Allendorf. 1995 When are peripheral populations valuable for conservation? Conservation Biology 9: 753–760.

Levin, S. A., H. C. Muller- Landau, R. Nathan, and J. Chave. 2003. The ecology and evolution of seed dispersal: a theoretical perspective. Annual Review of Ecology, Evolution, and Systematics 34: 575-604.

Lewontin, R. C. 1970. The units of selection. Annual Review of Ecology and Systematics 1: 1-18.

49

Liakopoulos, G., S., Stavrianakou, and G. Karabourniotis. 2006. Trichome layers versus dehaired lamina of Olea europaea leaves: differences in flavonoid distribution, UV-absorbing capacity, and wax yield. Environmental and Experimental Botany 55: 294–304. doi: 10.1016/j.envexpbot.2004.11.008

Liang, S., X. Rong, L. Sai, J. Chen, Q. X. Chang, and X. X. Cai. 2005. Phenotypic variation of seed traits of Haloxylon ammodendron and its affecting factors. Biochemical Systematics and Ecology 60: 81–87.

Linhart, Y. B., and M. C. Grant. 1996. Evolutionary significance of local genetic differentiation in plants. Annual Review of Ecology, Evolution, and Systematics 27: 237-277.

Linhart, Y. B., J. B. Mitton, K. B. Sturgeon, and M. L. Davis. 1981. Genetic variation in space and time in a population of Ponderosa pine. Heredity 46: 407–426.

Losos, J. B., K. I. Warheit and T. W. Schoener.1997. Adaptive differentiation following experimental island colonization in Anolis lizards. Nature 387: 70–73.

Loveless, M. D. and J. L. Hamrick. 1984. Ecological determinants of genetic structure in plant populations. Annual Review of Ecology and Systematics 15: 65-95.

Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27: 209–220.

Maron, J. L., M. Vila, R. Bommarco, S. Elmendorf, and P. Beardsley. 2004. Rapid evolution of an invasive plant. Ecological Monographs 74: 261–280.

Maron, J. L., S. Elmendorf, and M. Vila. 2007. Contrasting plant physiological adaptation to climate in the native and introduced range. Evolution 61: 1912–1924.

Mitton, J. B. 1997. Selection in natural populations. Oxford University Press, Oxford. United Kingdom.

Mitton, J. B., K. B. Sturgeon, and M. L. Davis. 1980. Genetic differentiation in ponderosa pine along a steep elevational transect. Silvae Genetica 29: 100-103.

Montague, J. L., S. C. H. Barrett, and C. G. Eckert. 2008. Re-establishment of clinal variation in flowering time among introduced populations of purple loosestrife (Lythrum salicaria, Lythraceae). Journal of Evolutionary Biology 21: 234–45.

Myers, A. A. and P. S. Giller. 1988. Analytical Biogeography. Chapman and Hall, London, United Kingdom.

Nybom, H. 2004. Comparison of different nuclear DNA markers for estimating intraspecific genetic diversity in plants. Molecular Ecology 13: 1143-1155.

50

Ogden, R., and R. S. Thorpe. 2002. Molecular evidence for ecological speciation in tropical habitats. Proceedings of the National Academy of Sciences, USA 99: 13612–13615.

Pandey, M., and O. P. Rajora. 2012. Genetic diversity and differentiation of core vs. peripheral populations of eastern white cedar, Thuja occidentalis (Cupressaceae). American Journal of Botany 99: 690–699.

Peakall, R., and P. E. Smouse. 2012. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28: 2537-2539.

Pritchard, J. K., M. Stephens, and P. Donnelly. 2000 Inference of population structure using multilocus genotype data. Genetics 155: 945–959.

Pullen, T. M. 1963. The Cassia fasciculata complex (Leguminosae) in the United States. Ph.D. dissertation, The University of Georgia, Georgia, USA.

Raymond, M., and F. Rousset. 1995. GENEPOP: population genetics software for exact tests and ecumenicism, version 1.2. Journal of Heredity 86: 248-249.

Rousset, F. 1997. Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145: 1219-1228.

Rousset, F. 2008. Genepop'007: a complete reimplementation of the Genepop software for Windows and Linux. Molecular Ecology Resources 8: 103-106.

RStudio Team. 2012. RStudio: Integrated development environment for R, version 1.1.456. Boston, Massachussete, USA, website: http://www.rstudio.com [accessed 10 January 2020].

Schluter, D. 2001. Ecology and the origin of species. Trends in Ecology and Evolution 16: 372– 380.

Schneider, C., W. Rasband, and K. Eliceiri. 2012. NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9: 671–67.

Singer, S., J. Doyle, G. May, S. Cannon, S. Maki, and D. Illut. 2009. Exploring Chamaecrista fasciculata genomics data [online]. Website: http://serc.carleton.edu/exploring_genomics/chamaecrista/chamaecrista_tr.html [accessed 10 Jannuary 2020]

Sinnott, R. W. 1984. Virtues of the Haversine. Sky Telescope 68: 158.

Slatkin, M. 1993 Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47: 264–279.

Sokal, R. R. and D. Wartenberg. 1983. A test of spatial autocorrelation using an isolation-by- distance model. Genetics 105: 219-237.

51

Spieth, P. T. 1979. Environmental heterogeneity: A problem of contradictory selection pressures, gene flow, and local polymorphism. The American Naturalist 113: 247— 60.

Stanton-Geddes, J., and C. G. Anderson. 2011. Does a facultative mutualism limit species range expansion? Oecologia 167: 149-155.

Szczecińska, M., G. Sramko, K. Wołosz, and J. Sawicki. 2016. Genetic diversity and population structure of the rare and endangered plant species Pulsatilla patens (l.) mill in east central Europe. PloS One 11: e0151730.

Tollefsrud, M. M., J. H. Sønstebø, C. Brochmann, Ø. Johnsen, T. Skrøppa, and G. G. Vendramin. 2009. Combined analysis of nuclear and mitochondrial markers provide new insight into the genetic structure of North European Picea abies. Heredity 102: 549–562.

Trapnell, D. W., J. P. Schmidt, P. F. Quintana‐Ascencio and J. L. Hamrick. 2007. Genetic insights into the biogeography of the southeastern North American endemic, Ceratiola ericoides (Empetraceae). Journal of Heredity 98: 587–593.

Travadon, R., I. Sache, C. Dutech, A. Stachowiak, B. Marquer, and L. Bousset. 2011. Absence of isolation by distance patterns at the regional scale in the fungal plant pathogen Leptosphaeria maculans. Fungal Biology 115: 649– 659.

Trussell, G. C. and R. J. Etter. 2001. Integrating genetic and environmental forces that shape the evolution of geographic variation. Genetica 112–113: 321–337.

USDA, NRCS. 2020. The PLANTS Database (http://plants.usda.gov, 10 January 2020). National Plant Data Team, Greensboro, NC 27401-4901 USA.

Wallace, L. E., M. H. Nobarinezhad, and R. Coltharp. 2020. Phenotypic variation of partridge pea, Chamaecrista fasciculata, persists in a common garden. Castanea 85:1.

Wang, B. F., J. B. Zhang, X. H. Yang, and Z. P. Jiang. 2009. Relationship of seed characters and seeding growth traits of Haloxylon ammodendron from different provenances with geographical and climatic factors. Journal of Plant Resources and Environment 18: 28– 35.

Weakley, A. S. 2015. Flora of the southern and mid-Atlantic states, working draft of 21 May 2015. University of North Carolina Herbarium, North Carolina Botanical Garden, Chapel Hill, North Carolina, USA. Website: http://www.herbarium.unc.edu/flora.htm. [accessed 10 January 2020]

Weber, E., and B. Schmid. 1998. Latitudinal population differentiation in two species of Solidago (Asteraceae) introduced into Europe. American Journal of Botany 85: 1110– 1121

Wright, S. 1943. Isolation by distance. Genetics 28: 114–138.

52

Wright, S. 1978. Evolution and the genetics of populations, vol. 4, variability within and among populations. University of Chicago Press, Chicago, Illinois, USA.

Yan, A., J. Pan, L. An, Y. Gan, and H. Feng. 2012. The responses of trichome mutants to enhanced ultraviolet-B radiation in Arabidopsis thaliana. Journal of Photochemistry and Photobiology B: Biology 113: 29–35. doi: 10.1016/j.jphotobiol.2012.04.011

53

CHAPTER III

FINE-SCALE PATTERNS OF GENETIC STRUCTURE IN THE HOST PLANT

CHAMAECRISTA FASCICULATA (FABACEAE) AND ITS

NODULATING RHIZOBIA SYMBIONTS.

Introduction

Several studies have recognized plants as one of the most important factors shaping soil microbial community structure (Berg and Smalla 2009; Diouf et al. 2010; Ladygina and Hedlund

2010; McLaren and Turkington 2011). It has been suggested that this may be the result of active selection by plants for certain soil microbes or shared habitat preferences by plants and microbes

(Bever et al. 2010) because spatially close biological communities are more similar than expected by chance and this similarity decays with distance (Ettema and Wardle 2002). It has also been demonstrated that differences in microbial community composition are significantly correlated with phylogenetic distance of their plant hosts (Bouffaud et al. 2014). Plant species that develop specific associations with soil microbes have important implications for coexistence in natural communities (Bever et al. 2010), and recent evidence suggests that microbial communities are influenced by both dispersal limitation and host plants (O’Malley 2007).

Emmet et al. (2014), by studying plant phylogeny and life history of summer annuals in an agricultural field, indicated that rhizosphere beta-diversity was positively correlated with phylogenetic distance between plant species, but not genetic distance within a plant species.

Plants within a closely related taxonomic group are likely to share traits, including amount and

54

availability of rhizodeposits and root defensive strategies, and these are key factors shaping soil microbial communities (Zhou et al. 2017). Thus, plant hosts with similar genotypes may have similar soil microbial communities (Ehrenfeld et al. 2005; Hardoim et al. 2008; Berg and Smalla

2009) which may indicate the presence of genotype x genotype interactions in these situations.

While most studies have focused on differences in soil microbial communities associated with different plant species, a few studies have addressed these differences within a single species and found that microbial community structure and composition can also vary according to intraspecific genotypic differences of host plants (Aira et al. 2007; Berg and Smalla 2009;

Zancarini et al. 2012; Marques et al. 2014). These studies have demonstrated that diversity in community structure of the rhizosphere microbiome can be partially explained by the genotype of their plant hosts. For example, Bulgarelli et al. (2015) demonstrated that the host genotype accounts for approximately 5.7% of the variance in the Hordeum vulgare L. rhizosphere microbiome composition. Therefore, with dispersal-limited bacterial and fungal communities and host influence on microbial variety and abundance, spatial autocorrelation of rhizosphere communities is expected (Ettema and Wardle 2002; Fierer and Ladau 2012). However, the underlying mechanisms by which the plants drive the rhizosphere microbiome are not well understood. Thus, variation in the rhizosphere microbiome across diverse plant species and natural systems should be subjected to more research.

Nodulating rhizobia, which begin as free-living soil microbes, also experience variable selection as a result of changes in host genetic identity, resource availability, temperature, moisture and the soil biotic community. Both interspecific (Vuong et al. 2017; Pahua et al. 2018) and intraspecific (Heath and Tiffin 2007; Rangin et al. 2008; Crook et al. 2012) genetic variations of hosts influence selection on nodulating rhizobia. Even variation at a single host

55

gene can have dramatic effects on strain occupancy and fitness in nodules (Kim et al., 2015) and on the frequency of microbial genera in the rhizosphere (Zgadzaj et al. 2016). The plant effects often reciprocally depend on the rhizobial lineage and can be a result of genetic variation in a single rhizobial gene (Wang et al. 2018). Numerous studies have documented that legume– rhizobial symbioses show a high level of specificity, occurring at both species and genotypic levels (Perret et al. 2000; Wang et al. 2012). It is also shown that the bacterial diversity in total nodules of a host plant is significantly lower than that in the corresponding rhizosphere (Lu et al.

2017), which confirms the presence of selection by plant hosts to associate with certain types of rhizobia in the soil. Thus, it is expected that the presence of significant genotype x genotype interactions between plants and their symbionts would reinforce this pattern at fine spatial scales.

Therefore, analyzing the genetic structures of host plants and their symbionts is helpful to better understand the functional spatial scale of such interactions.

Chamaecrista fasciculata (Michx.) Greene (Partridge Pea), an annual native legume

(Fabaceae) is widely distributed in the eastern U.S. The species exhibits intraspecific morphological variation and wide ecological tolerance of different soil types and habitats (Irwin and Barneby 1982). As a species capable of harboring nitrogen-fixing rhizobia, it is an important species in many natural ecosystems because it provides nitrogen for other plants, as well as cover, nectar, and pollen for animals. It has also been of interest in agricultural systems, for example in crop rotation to enhance soil nitrogen (Reeves 1994) and to manage root-knot nematodes (Rodríguez-Kábana et al. 1995). Given its annual habitat, herbaceous growth form, and phylogenetic position as a nodulating form outside of the Papilionoid clade, there is growing interest in developing Partridge Pea as a model for studies of legume evolution (Singer et al.

2009). Bradyrhizobium is the only genus of rhizobia that is symbiotic with C. fasciculata, but

56

many different genetic variants of Bradyrhizobium are able to nodulate in C. fasciculata (Parker

1999; Parker and Kennedy 2006; Andrews and Andrews 2017; Dorman and Wallace 2019).

The findings of Dorman and Wallace (2019) indicated that in Mississippi

Bradyrhizobium are phylogenetically diverse within a site but do not have strong genetic structure across large geographic distances. This led them to suggest that the scale at which successful symbioses are established may be very fine and determined by plant x rhizobia genotypic interactions, rather than soil traits. In this study, I tested for the presence of similar patterns of fine-scale genetic structure between C. fasciculata host plants and their nodulating

Bradyrhizobium symbionts. I expected that plants in close physical proximity would be more genetically similar than those more distant from one another due to having gravity dispersed seeds. I also expected that nodulating rhizobia would exhibit a spatial scale of genetic structure that is concordant with their host plants due to genotype x genotype interactions, limited dispersal of bacteria in the soil, and enhancement of a suitable symbiont pool from continual host plant presence. Lastly, I also characterized members of order Rhizobiales, which includes

Bradyrhizobium, from plant rhizospheres to test the hypothesis that the bacterial strains in the nodules are a subset of available bacteria in the soil. This hypothesis is based on the assumption that plant hosts select symbionts from among the most abundant suitable rhizobia present in the soil surrounding their roots. Previous studies have surveyed genetic diversity and distributions of rhizobia communities in soil and nodules of different plant hosts (Leite et al. 2016; Zgadzaj et al.

2016; Jiang et al. 2017; Igolkina et al. 2018), but this study covers the subject at a different spatial scale. This fine-scale study directly tests for an association between genotypic diversity of host plants and their Bradyrhizobium symbionts and provides understanding of the degree to which host plants choose certain microbial genotypes from the soil. Additionally, examination of

57

fine-scale genetic structure of multiple interacting species has rarely been conducted. The proposed study enhances our understanding of the role of bacterial symbionts in local adaptation of plants and the degree to which plant hosts act as a predictor to determine genetic structure of soil microbes.

Materials and Methods

Plant and rhizobia collection

In Oktibbeha Co, Mississippi, June 2017, 70 plants were sampled in three consecutive linear plots starting at 33.35939, -88.86587, with 200 meters distance between the first and the second plots, 55 meters distance between the second and third plots, and with a total length of about 400 meters for all three studied plots. Within each plot, plants were sampled at distance intervals of 0, 1, 2, 5, 10 and 25 meters from the previous point. At each distance interval, two plants each were sampled in opposite directions 0.5 meter perpendicular to a central point, for a total of 24 plants sampled per plot (Figure 3.1). Plot two included 22 plants as two plants could not be identified at one of the points at distance 10 m. The two sampled plants at each point were within ca. 12 cm of one another (Figure 3.1). Whole plants, including roots, were carefully excavated from the soil. Plants were stored separately in plastic bags on ice in the field. Plants were kept at 4ºC until DNA could be extracted. All nodules were removed from the roots and stored at 4ºC until they were plated on growth medium.

58

Table 3.1 Characteristics of 14 microsatellite loci used to evaluate genetic structure in Chamaecrista fasciculata.

Locus* Forward and Reverse Sequence (5’- Multiple Fluoresce Allele Repeat Type 3’) x group nt label** size

Cf1394 F: GAAAAGGCGTCACCAACACC 1 NED 336-399 (AGA)8 R: CGTCCATGGCTGCTACTGC

Cf1749 F: TTGGGGGATGACAAAAGTGG 4 VIC 200-236 (AAG)7 4 R: CCTCAAAATCAAAAGATTGAAA CG Cf3118 F: 1 PET 200-239 (CCA)6 CCTCAAAATCAAAAGATTGAAA CG R: GGTGAAGGCGAAGAAACAGG Cf3411 F: GACGGCAAAGAATCCAAAGG 3 NED 295-319 (CCG)7 R: TCAGTGGATCTGCTTTCTCTCC Cf4935 F: 4 PET 192-225 (AAC)5… AGGAAGTGTTGATTCTGCAACC (AAC)7 R: AGCCCCTTCACACTCAGTCC

Cf5782 F: CTTCCTCAGGGTCACAGAACC 3 NED 189-213 (CTT)6 R: AAAATCCGAGAGCCATGACG

Cf6822 F: 1 PET 209-218 (CCA)6 CCACTACTATCCCTATCAACAAC AGC R: CGTTGAGCATCCACATCAGG Cf6895 F: TTCACGAGGACCCAGTAGGG 1 FAM 203-245 (CAT)6 R: AGAAGGCGAGACCAGAGAGC

Cf7140 F: 4 FAM 185-206 (TAG)8 GAGAAGGGAGTGGTCCTAATGG R: TGAGAGGCATTTGAGTCTTGC Cf8757 F: AGTAGCACCACACCCTCACG 4 FAM 379-433 (ATC)6 R: TTCCTCCAATCCCCTTTTCC

Cf9980 F: GCTGCTCTGGGAATATCACG 1 NED 205-352 (GAA)7 R: CTGCGTAGCCACTTCACTCG

Cf1000 F: AGAGAGTGCCCAGGTGAAGG 1 VIC 219-246 (TGG)9 2 R: GATCCTCGTCGCTCATAGGG

Cf2095 F: 3 FAM 246-300 (ATG)9 6 ATTACCAAGAGTTGGAAAATAT CG R: CCACCCATTCCAGAGTGTCC 59

Table 3.1 (continued)

Locus* Forward and Reverse Sequence (5’- Multiplex Fluorescent Allele size Repeat Type 3’) group label**

Cf4487 F: 4 NED 190-217 (TCT)12 CGAGGAGCCTCTTCTTCAGG R: CTGGGCTCATGTTTCTGAGG * The locus names correspond to the sequence names in the transcriptome file version_1.fasta available at https://serc.carleton.edu/exploring_genomics/chamaecrista/variation_.html. **Tag names and sequences (5’ – 3’) are as following: NED=M13A, TGTAAAACGACGGCCAGT; VIC= T7term, CTAGTTATTGCTCAGCGGT; PET= M13B, CACTGCTTAGAGCGATGC; FAM = M13(-21), TGTAAAACGACGGCCAGT

Figure 3.1 Diagram of one sampling plot. Blue points represent the sampling point at distance intervals of 0, 1, 2, 5, 10 and 25 meters from the previous point. At each distance interval, two plants (red points) each were sampled in opposite directions 0.5- meter perpendicular to a central point (blue dot), for a total of 24 plants sampled per plot.

Leaf samples to be used in DNA extraction were kept on ice in the field and then stored in silica gel or at -80ºC prior to DNA extraction. Genomic DNA was extracted using a CTAB method (Dellaporta 1983; lightly modified by Schnable lab described at file:///H:/First%20project-

First%20draft/Papers/For%20the%20references/Dellaporta_DNA_Extraction.2014.08.20.pdf).

Each sampled plant was genotyped at 14 trinucleotide microsatellite loci (Table 3.1) using a

60

multiplexed genotyping approach to quantify genetic structure as in Chapter 2. For each sample, three multiplex amplification reactions with five, four, and five loci per reaction respectively, were performed in a final volume of 10 μl in the presence of 10 ng of template DNA, 100 µmole of each of the reverse and tagged fluorescent label primers and 10 µmole of tagged forward primer using a KAPA 2G Fast Multiplex PCR kit (Kapa Bio-systems, Wilmington,

Massachusetts). Tag sequences were derived from Culley et al. (2013), and the tag in the 5’ forward primer matched the sequence of the fluorescent labeled primer (Table 3.1). The thermal cycler program used to amplify loci included 3 min at 95°C, 30 cycles of 15 s at 95°C, 30 s at

60°C, and 30 s at 72°C, and a final extension step of 1 min at 72°C. Successful amplification of the samples was checked by agarose gel electrophoresis, and amplified products were genotyped at the Arizona State University DNA Lab with LIZ 600 size standard. Individual alleles were sized using GeneMarker software (SoftGenetics, State College, Pennsylvania).

One to two nodules per plant were surface sterilized with 1% hypochlorite for 5 minutes then washed in sterile water. Later, they were placed in 70% ethanol for 5 minutes and then were washed three times with sterile distilled water each for 1 minute. After surface sterilization, nodules were ground to release rhizobia, and this mixture was plated on solid agar MAG medium

(Castillo 2013) in petri dishes at 30ºC until colonies appeared, ca. 4-10 days after plating. One colony per sample that could be morphologically identified as Bradyrhizobium (i.e., creamy yellow color, smooth margins, medium sized, and round appearance), following Sylvester-

Bradley et al. (1988) and Fuhrmann (1990), was randomly picked and suspended in 50 μL 1X

TE buffer solution (pH = 8). I selected only one colony per sample because only a single strain of rhizobia is typically found in a nodule as it arises from infection by a single bacterial cell.

Prior to their use directly in PCR the cells were lysed by heating at 65°C for 5 minutes. TruA, a

61

housekeeping gene involved in translation and ribosomal biogenesis (Ahn et al., 2004) and that is capable of distinguishing Bradyrhizobium strains (Zhang et al. 2012; Vinuesa et al. 2005;

Dorman and Wallace 2019) was used to characterize nodulating rhizobia. Bacterial samples were amplified with truAB-F/R primers specific for Bradyrhizobium (Zhang et al. 2012) and then sequenced using Sanger sequencing. PCR was used to amplify the region in 12.5 μl volume, containing 1 μl DNA, 1X LongAmp buffer (New England Biolabs, Ipswich, MA), 1.5 U

LongAmp Taq (New England Biolabs), 0.32 mM dNTP’s, 0.4 μM forward primer, and 0.4 μM reverse primer (Integrated DNA Technologies, Coralville, IA). The thermal cycler program consisted of heating to 95°C for 5 minutes, followed by 11 cycles of 94ºC for 45 sec., 60ºC for 1 min. decreased by 1.0ºC per cycle, 72ºC for 1:00 min., 26 cycles of 94ºC for 45 sec., 50ºC for 1 min., 72ºC for 1 min., and an elongation step of 72ºC for 10 min. Amplified products were subjected to agarose gel electrophoresis to check for the presence of a single band of the expected size for truA. Successfully amplified products (one sequence per sample) were cleaned by adding 0.2x Antarctic Phosphatase buffer, 5 units of Exonuclease I, and 1.25 units of

Antarctic Phosphatase (New England Biolabs, Ipswich, MA), to 7 μl of PCR product. This mixture was heated to 37°C for 15 minutes followed by 80°C for 15 minutes. Once the samples were cleaned, cycle sequencing was conducted in 10 μl reactions using either the forward or reverse primer and Big Dye version 3.1 (Life Technologies, Carlsbad, CA). Forward and reverse primer sequences were generated for all individuals using the PCR primers. Sequenced samples were dried and sent to the Arizona State University DNA Lab for capillary electrophoresis.

Rhizobia forward and reverse sequences were edited and assembled into a consensus sequence for each sample using Sequencher version 4.7 (Gene Codes Corporation, Ann Arbor, MI).

62

Sequences were aligned using the alignment algorithm in Geneious v.10.2.5 (Biomatters, Inc.

Newark, NJ).

For bacterial community sequencing of rhizosphere samples, whole-community DNA was extracted using the FastDNA™ SPIN Kit for soil isolation (MP Biomedicals, Solon, OH). I were unsuccessful in getting truA to amplify consistently in these samples. Thus, I used the 16s rRNA gene to characterize rhizosphere rhizobia communities because this gene has been broadly used in prokaryote taxonomy and phylogenetic reconstructions and sequences are widely available across prokaryotic lineages (Lapage et al. 1992). Besides, previous studies have shown that assessment of rhizobial genotypic diversity relevant to ecologically oriented studies can be achieved by 16S rDNA sequencing (Fox et al. 1992). Using primers 319F and 806R designed for the V3 and V4 region of the 16S rRNA gene (Holm et al. 2019), amplicons were produced using a 2-step PCR method (Holm et al. 2019) and sequenced on an Illumina (San Diego, CA) instrument. Each step 1 PCR reaction contained 1× Phusion Taq master mix (New England

Biolabs, Ipswich, MA), primers 319F and 806R (0.4 µM each), 3% DMSO, and 5 ng genomic

DNA. The thermal cycler program included the following cycles; an initial denaturation at 94°C for 3 min, 20 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 30 s, elongation at

72°C for 1 min, and a final elongation step at 72°C for 7 min. Successful amplifications were tested by running a small amount using agarose gel electrophoresis. The second PCR, which added barcodes and adapters, and remaining steps in library preparation were conducted according to Holm et al. (2019) at the Microbiome Service Lab at the University of Maryland

School of Medicine. All samples were sequenced together in a single run on a MiSeq instrument

(Illumina, San Diego, CA) using paired end reads and length of 300 bp each.

63

To identify bacterial species in the rhizosphere samples bioinformatics analysis and annotation of the output data were carried out following Berlanas et al. (2019) using QIIME2

(Bolyen et al. 2019). This software provides quality filtering, picking operational taxonomic units (OTUs), taxonomic assignment, phylogenetic reconstruction, diversity analysis and graphical displays (Caporaso et al. 2010). Sequences were demultiplexed by sample at the

Microbiome Service Lab at the University of Maryland School of Medicine using a dual-barcode strategy, a mapping file linking barcode to samples, and QIIME-dependent script of split_libraries.py and split_sequence_file_on_sample_ids.py, (Holm et al. 2019). Primer sequences were removed from each read using q2-cutadapt plugin, and sequence quality control and feature table construction were conducted using DADA2 pipeline (Callahan et al. 2016).

Chimeras for combined runs were removed per the DADA2 pipeline. Amplicon sequence variants (ASVs) generated by DADA2 were taxonomically classified using the scikit-learn classifier (Pedregosa et al. 2011) trained with the SILVA v132 16S rRNA gene sequence database (Quast et al. 2012). After taxonomic classification, sequences belonging to Rhizobiales, which includes Bradyrhizobiaceae, were extracted from the dataset.

Data analysis

The plant dataset was tested for presence of unique multilocus genotypes using the Poppr package (Kamvar et al., 2014) in RStudio statistical software v. 1.1.456 (RStudio Team, 2012).

Genetic diversity of plants was assessed as mean number of alleles per locus (A), percent of polymorphic loci (P), observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient (FIS) using GenAlEx version 6.503 (Peakall and Smouse 2012). For each locus, departure from Hardy–Weinberg expectations was tested through permutations of alleles among individuals and statistical significance was assessed using a p-value of 0.001, which is 64

adjusted using the sequential Bonferroni correction for multiple comparisons (Holm, 1979). A

Principal Coordinates Analysis (PCoA) was conducted using GenAlEx version 6.503 (Peakall and Smouse 2012) to explore dissimilarities among samples in the three studied plots. I performed analysis of spatial genetic structure (SGS) for plant samples using SPAGeDi 1.5

(Hardy and Vekemans 2002). The plant microsatellite loci were analyzed in each pair of individuals using pairwise relatedness coefficients according to Ritland’s estimator (equation 5 in Ritland 1996) which has been shown to be the best estimator, especially with highly polymorphic markers (Vekemans and Hardy 2004). Seventeen distance intervals, from 1 to 384 meters, were plotted to maximize the number of pairwise comparisons. Spatial genetic structure was tested by assessing the significance of the regression slope (bF) of pairwise statistics (Fij) on ln (distance) using 9999 randomizations of the individual spatial positions and obtaining 95% confidence intervals (CIs) after bootstrapping (1000 iterations) using SPAGeDi 1.5 (Hardy and

Vekemans 2002). Spatial genetic structure was also quantified by the ‘Sp’ statistic, which estimates SGS intensity and is calculated as – bF /(1 – F(1) ), where F(1) is the mean F( ij ) between individuals belonging to the first distance interval and bF is the regression slope of F( ij ) on rij

(physical distance between samples i and j) (Vekemans and Hardy 2004). F(1) can be considered a good estimate of the kinship between pairs of neighbors, on the condition that the first distance interval contains enough pairs of individuals to obtain reasonably precise F(1) values (Vekemans and Hardy 2004). Standard errors for the estimates of the kinship coefficients per distance class were estimated using a jackknife procedure over the loci. Then, values of the average genetic relatedness statistic between pairs of individuals separated by given distance intervals were plotted using a spatial autocorrelogram.

65

The sequence of truA from each nodule was checked against GenBank sequences via

BLAST-n searches (NCBI 1988) to identify closely matched bacterial strains. Rhizobia truA sequences were aligned in Geneious v.10.2.5 (Biomatters, Inc. Newark, NJ) using the internal alignment algorithm and compared against five reference sequences of taxonomically valid

Bradyrhizobium retrieved from GenBank (NCBI 1988) in a phylogenetic tree. The reference sequences included B. canariense BTA1 (Accession: JX064276), B. elkanii USDA 76

(Accession: JX064277), B. japonicum USDA 6 (Accession: JX064272), B. liaoningense USDA

3622 (Accession: JX064278), and B. yuanmingense CCBAU 10071 (Accession: JX064271).

These five strains are informative for identifying Bradyrhzobium species which associate with plants from Fabaceae (Koppell and Parker 2012; Dorman and Wallace 2019). I used jModeltest2

(Darriba et al. 2012, Guindon and Gascuel 2003) to select HKY+G (Hasegawa et al. 1985) as the best fitting model of molecular evolution according to the BIC MrBayes v. 3.2.3 (Ronquist et al.

2012) was used for phylogenetic reconstruction in the Cipres Science Gateway (Miller et al.

2010). I conducted Markov Chain Monte Carlo (MCMC) 5 million generations, sampling every

1000 points. Prior to determining the posterior probability of the trees with the highest likelihood, 1,250 trees were discarded as burn-in. A consensus trees is reported with posterior probability indicating support for clades. Sequence diversity analysis including number of haplotypes (H), haplotype diversity (Hd), and π for each plot and the entire dataset was conducted using DnaSP v5 (Librado and Rozas, 2009).

Analysis of spatial genetic structure (SGS) of nodulating rhizobia was conducted for each plot separately and all plots combined using SPAGeDi 1.5 (Hardy and Vekemans 2002).

Rhizobia truA haplotype sequences were converted to SPAGeDi 1.5 input using haplotype codes generated by SPADS 1.0 (Dellicour and Mardulyn 2014). All settings in SPAGeDi 1.5 were the

66

same as those used in analysis of SGS for the plants. Spatial genetic structure was tested by assessing the significance of the regression slope (bF) of pairwise statistics (Fij) on ln (distance) using 9999 randomizations of the individual spatial positions and obtaining 95% confidence intervals (CIs) after bootstrapping (1000 iterations) (Hardy and Vekemans 2002), and seventeen distance intervals, from 1 to 384 meters, were plotted to maximize the number of pairwise comparisons.

Isolation-by-distance for the plant hosts and nodulating rhizobia was investigated separately by comparing pairwise population genetic and geographic distances in a Mantel test

(Mantel 1967). Pairwise genetic distances for plant individuals were calculated using GenAlEx

(Peakall and Smouse 2012) and for rhizobia truA sequences using Mega v.6 (Tamura et al.

2013). Geographic distances between collected individuals were calculated using a modification of the haversine formula (Sinnott, 1984) based on the coordinates of the sampled plants in

GenAlEx (Peakall and Smouse 2012). The Mantel test was performed using PASSaGE 2

(Rosenberg 2011), and 1,000 permutations of the datasets were used to assess significance of the correlation. A stepwise regression analysis using IBM SPSS Statictics v.25 (IBM Corp. 2017) was also performed to test whether plant genetic and physical distance can be considered as explanatory variables for rhizobia genetic distances in the nodules.

Sequences of Rhizobiales 16S rDNA that were generated from rhizosphere samples were compared against sequences in GenBank (NCBI 1988) through BLASTn searches to identify closely matching sequences. The highest scoring, as determined by e-value, sequence identified to the species level was selected as the best matching sequence. The relative proportions of each identified strain across all plots were compared to have an understanding of the diversity and abundance of potential symbionts for C. fasciculata in the rhizospheric soil.

67

Results

No clonal genotypes were detected in the plant dataset. The number of microsatellite alleles per locus ranged from 3 to 12, with a mean value of 6.78 across all plants (Table 3.2). The percentage of polymorphic loci (%P) was 100 in each of the three plots, and the observed and expected heterozygosity ranged from 0.14 to 0.71 and 0.32 to 0.84, respectively. The inbreeding coefficients (FIS) were consistently positive, indicating heterozygote deficiency across all plots

(Table 3.2). Among the three plots only plot 1 demonstrated a significant regression slope (bF) for plant kinship coefficient over distance classes (p = 0.001 for bF = -0.0138) (Table 3.3). Since the genetic diversity measures among all three plots were very similar (Table 3.2) and principle coordinate analysis did not separate individuals by plot (Figure 3.2), I combined samples from the plots for downstream analyses. When considering the combined plots, I found significant autocorrelation in plant genotypes for all distance classes except 244, 261, 344 and 384 m, which are distributed at the intermediate and largest distances (Figure 3.3). The average kinship coefficient decreased linearly with the natural logarithm of spatial distance (rij) between individuals. Significant positive values of Fij were found at short distances (<43 m), indicating that neighboring individuals had a higher genetic relatedness than random pairs of individuals.

Negative values (i.e. individuals within a distance class were less genetically similar than expected with a random distribution) of Fij occurred at larger distances (Figure 3.3). Mean values of the regression slope (bF), F(1) and Sp statistic across all plots were -0.0121, 0.045 and 0.0127, respectively (Table 3.3). The p-value for the regression slope was 0.000.

68

Table 3.2 Genetic diversity of the host plant, Chamaecrista fasciculata.

Plot N A P HO HE FIS Significant deviation from HWE

1 24 5.1 100 0.48 0.55 0.09 no loci 2 22 4.35 100 0.49 0.52 0.1 1 locus (p = 0.001) 3 24 5 100 0.48 0.56 0.13 1 locus (p = 0.001) All 70 6.78 100 0.49 0.58 0.17 3 loci (p = 0.001) N = number of individuals sampled; A = mean number of alleles per locus; P = percentage of polymorphic loci; HO = observed heterozygosity; HE = expected heterozygosity; FIS = inbreeding coefficient; HWE= Hardy–Weinberg equilibrium.

Due to the lack of mature nodules for some of the collected plants, I failed to culture rhizobia for five plants, and sequencing was unsuccessful for 12 samples of the cultured strains.

Thus, I was able to generate 53 truA sequences from nodulating rhizobia. The aligned length of the truA sequence set was 497 nucleotides (~67% coverage of truA gene in Bradyrhizobium). All sequences most closely resembled Bradyrhizobium elkanii truA sequences in BLAST-n searches in GenBank (NCBI 1988). Additionally, the phylogeny of truA sequences indicates that all recovered sequences from C. fasciculata cluster exclusively with B. elkanii and with 1.0 posterior probability support (Figure 3.5). The sequence diversity of the nodulating rhizobia was found to be similar in all three plots. A similar number of haplotypes (6-9) was found across the plots, and these haplotypes exhibited low mean genetic difference between pairs of sequences (π

= 0.03457-0.04175; Table 3.4). Across the three plots 23 distinct rhizobia truA haplotypes were found. Only plot 1 demonstrated a significant regression slope (bF) for rhizobia kinship coefficient over distance classes (bF = -0.0165, p = 0.046) (Table 3.3). When plots were combined, I found no significant autocorrelations in rhizobia genotypes for any distance classes

69

except for 286 m (Figure 3.4), but I found a significant regression slope for rhizobia kinship coefficients against distance among pairs of individuals (bF = -0.0031, p = 0.041) (Table 3.3).

Table 3.3 Fine-scale genetic structure in Chamecrista fasciculata host plants and nodulating Bradyrhizobium elkanii in the studied plots.

Plot Distance range bF P-value F1 Sp statistic (m) Plant Plot 1 1-43 -0.0138 0.001* 0.0073 0.0139 Plot 2 1-43 -0.0063 0.125 -0.0165 0.0062 Plot 3 1-43 -0.0049 0.145 -0.0068 0.0048 Plots 1-3 1-384 -0.0121 0.000* 0.0451 0.0127 Rhizobia Plot 1 1-43 -0.0165 0.046* -0.0212 0.0162 Plot 2 1-43 0.0172 0.075 -0.0583 -0.0162 Plot 3 1-43 -0.0009 0.882 -0.008 0.0009 Plots 1-3 1-384 -0.0031 0.041* 0.0038 0.0031 b = the regression slope of multilocus kinship coefficients for pairs of individuals against the logarithm of geographic distance separating members of the pair; P-value = level of significance; F1 = the mean kinship coefficient (Fij) between individuals belonging to the first distance interval; Sp statistics = – bF /(1 – F1); CI = confidence interval. Note: Star shows significant values.

70

Figure 3.2 A principle coordinates analysis (PCoA) of sampled plants based on microsatellite variation in all three plots. Orange circles = plot 1, black circles = plot 2, blue circles = plot 3. First and the second components account for 8.77% and 6.88% of variation in the dataset, respectively.

A significant relationship between plant genetic and geographic distances was found (r =

0.243, p = 0.001, Figure 3.7), but the relationship between nodulating rhizobia genetic and geographic distances was not significant (r = 0, p = 0.99, Figure 3.8), as was the relationship between plant genetic and nodulating rhizobia genetic distance (r = -0.087, p = 0.26, Figure 3.9).

Stepwise regression analysis showed that plant genetic distance, and plant genetic plus geographic distance are both able to predict genetic distance of nodulating rhizobia with the adjusted r-square of 0.012 and 0.016, respectively (p < 0.0001 for both, Table 3.5). Geographic

71

distance alone did not significantly determine nodulating rhizobia genetic distance. Even though they were significant, the combined variables of plant genetic distance and geographic distance only explained 1.6% of the observed variation in nodulating rhizobia genetic distances (Table

3.5).

The total number of unique 16S rDNA sequences of Rhizobiales present in the rhizosphere samples from host plants in all three plots was 81 sequences. The majority of the retrieved sequences are of Bradyrhizobium, including B. elkanii (30%), B. canariense (22%), and unidentified Bradyrhizobium strains (9%) (Figure 3.6). Rhizobium strains were second in abundance, making up 9% of the rhizosphere sequences.

72

Table 3.4 Genetic diversity of nodulating Bradyrhizobium elkanii in the truA gene across the studied plots.

Plot N H Hd π

1 19 8 0.86 0.04175 2 15 6 0.829 0.03457 3 19 9 0.86 0.03679 Combined (1-3) 53 6 0.730 0.03338 N = number of sequences; H = number of haplotypes; Hd = haplotype diversity; π = a measure of the average differences between pairs of sequences.

Table 3.5 Stepwise regression assigning plant genetic and geographic distance as predictors of nodulating Bradyrhizobium elkanii genetic distances.

Predictor Adjusted r square P-value

Plant genetics, constant 0.012 0.000 Plant genetics, geographic distance, constant 0.016 0.000

73

Figure 3.3 Ritland’s kinship coefficients for pairs of plants plotted against the logarithm of geographic distance between pairs and the estimated regression line (regression slope bF = -0.01214, p = 0.000). Significant autocorrelation at a distance class is indicated by a star.

74

Figure 3.4 Ritland’s kinship coefficients for pairs of nodulating Bradyrhizobium plotted against the logarithm of geographic distance separating members of the pairs. A significant autocorrelation was only found for distance class 286. regression slope bF = -0.0031, p = 0.041.

75

Discussion

In many plant species, seed dispersal has a peak distribution at or close to the maternal plant and progressively fewer seeds are dispersed at greater distances from the maternal plant

(Portnoy and Willson 1993; Willson 1993; Schupp and Fuentes 1995). The extent of SGS within plant populations depends on factors such as seed and pollen dispersal distance and effective plant density (Vekemans and Hardy 2004). According to the theory of isolation by distance, SGS arises from the interplay of limited gene flow and local genetic drift, and the rate of decrease of genetic similarity with distance is a measure of strength of SGS (Loiselle et al. 1995; Rousset

2000; Hardy 2003). The morphological characteristics of C. fasciculata, including heaviness of the seeds and bee pollination, suggest that plants should exhibit an isolation by distance pattern.

The SGS pattern detected for host plants in the studied plots was consistent with IBD, showing an increase in genetic distance between pairs of plants as they increased in physical distance from one another. Additionally, the kinship coefficients gradually decrease from very positive to very negative over the distance of sampling. Discovering statistically significant fine scale genetic structure at most of our studied distances is concordant with the existence of very restricted seed dispersal events among plants and suggests that plants are not randomly distributed even at this fine scale. Values for the Sp statistic in this study, ranging 0.0048-0.0139, are comparable to another study on this species with reported Sp statistic of 0.00746 (Fenster et al. 2003). The mean Sp statistic in other outcrossing or gravity dispersed plant species is 0.0126 and 0.0281, respectively (Vekemans and Hardy 2004), which are comparatively close to our measured Sp statistic value for C. fasciculata.

76

Figure 3.5 Phylogenetic tree of nodualting Bradyrhizobium based on truA sequences. The first and second numbers of each tip indicate plot number (1, 2, or 3) and distance interval (0, 1, 2, 5, 10, or 25). Values on the branches are posterior probability (PP). Support of less than 0.95 PP is not shown.

77

Given the presence of fine-scale structure in host plants and widely documented genotype x genotype interactions between legumes and their symbionts, I expected to find significant genetic structure in nodulating rhizobia of C. fasciculata as well. Additionally, limited dispersal ability of bacterial cells in the soil due to passive means is expected to maintain structure if there is not extensive soil disturbance. Although I did not find a significant kinship coefficient for pairs of rhizobia at most geographic distances, the mean sp statistic in rhizobia followed the same pattern observed for plant hosts. That is, I found a significant bF for rhizobia and plant hosts in plot 1 and the combined plots. This finding suggests a common mechanism that influences genetic structure of plants and rhizobia. It may be that host plants directly influence their rhizobia symbionts by selecting for certain strains or that plant hosts and their nodulating rhizobia are influenced by similar environmental factors that structure genetic diversity. In our study, I focused on only one housekeeping gene, truA, to identify the nodulating rhizobia and estimate genetic diversity of this community. Although this does not capture genome-wide diversity, it is a suitable genetic marker that was considered highly informative of phylogenetic relationships in Bradyrhizobium by Zhang et al. (2012) and Dorman and Wallace (2019). Our study revealed relatively high haplotype diversity (Hd = 0.82-0.86 across plots) in truA, which indicates an ability to discriminate among most of the rhizobia symbionts I recovered. In another study, the nucleotide diversity of Rhizobium leguminosarum biovar viciae isolates from Vicia cracca L. acquired using two housekeeping genes, glnII and recA, and one nodulation gene, nodC, revealed relatively close values of nucleotide diversity for all of the three studied genes (π

= 0.036, 0.021 and 0.025, respectively) (Van Cauwenberghe et al. 2014). This study suggests that single genes can be informative of bacterial genetic diversity.

78

Figure 3.6 Abundance of families and genera of Rhizobiales found in rhizosphere samples of Chamaecrista fasciculata in the current study.

While a literature review found no fine-scale studies of genetic structure in nodulating rhizobia or plant endophytes for comparison to our results, I have found that the number of haplotypes recovered for C. fasciculata is similar to studies of other legumes. Dorman and

Wallace (2019), who also used truA to study genetic diversity of nodulating rhizobia in C. fasciculata across Mississippi, reported 9-25 haplotypes and 0.831-0.954 for haplotype diversity at a regional scale. Our finding of six haplotypes and substantial haplotype diversity in a single site indicates that populations can harbor diverse symbionts even at the smallest spatial scales. In

79

a fine scale study (< 2 m) of diversity patterns in microbial communities associated with roots of

Bistorta vivipara L., a perennial herb, it was revealed that similarities between the structure of bacterial and fungal communities were stronger in soils than in roots (Bjornsgaard Aas et al.

2019). The root-associated fungal and bacterial communities in that study both showed significant spatial autocorrelation at distances below 40–50 cm, while soil-associated communities showed no spatial structure across the 2 m scale investigated. This result, along with ours, suggests that different host filters are responsible for structuring symbiotic endophytes of roots and nodules.

Figure 3.7 Pairwise plant genetic distances based on microsatellite alleles compared to pairwise geographic distance between sampled plants (r = 0.243, p = 0.001).

80

Host promiscuity in symbiotic associations can influence exotic legume establishment and colonization of novel ranges (Klock et al. 2015), and it has been suggested that enhanced diversity of symbionts may underlie the success of host plants in non-native habitats (e.g.,

Ndlovu et al. 2013). Hence, it may be expected that due to a broad geographic distribution of a host species it potentially harbors different rhizobia strains that enable tolerance to varying soil types and environmental conditions. Unlike this assumption and the previous study on this system, which covered a vast geographic area with different physiographic regions and found different genetic variants of Bradyrhizobium nodulating C. fasciculata (Dorman and Wallace

2019), in our 400 meter study area, the only sequences retrieved from nodules of the widespread

C. fasciculata studied here belonged to Bradyrhizobium elkanii (Figure 3.5). While our results confirm the exclusive use of this Bradyrhizobium by C. fasciculata also noted in other studies

(Koppell and Parker 2012; Parker 2015; Andrews and Andrews 2017; Santos et al. 2017;

Dorman and Wallace 2019), it was surprising to find that all host plants harbored strains of a single species, B. elkanii. Although it is possible that other rhizobia types could have existed in untested nodules, the likelihood of high diversity among these seems rare in light of my results.

Given that numerous species of Bradyrhizobium were identified in the rhizosphere, these results suggest strong selection of particular symbionts by the host plants at this site, which may indicate thatgenotype x genotype interactions are important in this system too. The genetic diversity measures of nodulating rhizobia obtained by Dorman and Wallace (2019) in their study locations nearby the location of this study (e.g., H = 9 and Hd = 0.83-0.93) are very similar to our findings

(H = 6-9 and Hd = 0.83-0.86) but they still found multiple species of nodulating rhizobia.

Furthermore, genetic diversity analyses of the host plants in this study revealed less genetic variation among plants (Table 3.2), which was much lower than variation observed in studied

81

plants from chapter 2. Less variation among host plants suggest that they are likely to prefer the same type of rhizobia to associate with, which was supported by selection of the same type of nodulating rhizobia by all studied plants. This finding also supports our hypothesis that plants establish symbioses with only a subset of available soil bacteria in their nodules, and confirms several other studies showing that some microbial species typically associate with only a few or a single plant species. This fact is well known for pathogens as well as beneficial interactions e.g. in legume-rhizobium relationship. For example, Sinorhizobium meliloti effectively colonizes

Medicago, Melilotus and Trigonella, whereas Rhizobium leguminosarum induces nodules in

Pisum vicea, Lens and Lathyrus plants (Bais et al. 2006).

Figure 3.8 Pairwise nodulating Bradyrhizobium genetic distances based on truA sequences compared to pairwise geographic distance between sampled nodules (r = 0, p = 0.99).

82

Selection of only one species by the host plants may reduce the SGS pattern of kinship in nodulating rhizobia. A non-significant Mantel test also supports this idea by showing that genetic distance of isolates is not associated with geographic distance between pairs of nodules (r = 0, p

= 0.99, Figure 3.8). Additionally, plant genetic distances are required to explain a small but significant amount of the observed variation as our stepwise regression revealed such impacts of the host genotype on the genotype of the nodulating rhizobia (Table 3.5). Despite the expectation that plant genotype would explain a greater amount of variation in nodulating bacteria, other studies have also found a lack of strong influence by plant hosts on rhizobia diversity. For example, in a study on nodulating rhizobia of Lotus japonicus (Regel) K. Larsen, plant genotype explained only 4.91% of the variation in nodulating rhizobia (Zgadzaj et al. 2016). The lack of influence of geographic distance in structuring nodulating rhizobia is in line with results from

Dorman and Wallace (2019) who were not able to find a significant correlation between geographic and bacterial genetic distances at larger spatial scales either. Their finding is consistent with other studies demonstrating that genetic structure of soil bacteria is largely independent of geographic distance (Fierer and Jackson 2006). Results from my study show that plant genetic distance explains only about 1.2% variation in diversity of nodulating rhizobia while by adding geographic distance, the new model explains 1.6% of variation in nodulating rhizobia diversity. It is likely that at small scales, host plant and the geographic distance between pairs of hosts, cumulatively, control take up of specific rhizobia in the soil to associate with the host plant.

83

Figure 3.9 Pairwise plant genetic distances based on microsatellite alleles compared to pairwise nodulating Bradyrhizobium genetic distances based on truA sequences (r = -0.087, p = 0.261).

Along with geographic distance, soil influence on bacterial communities has been investigated in several studies. Zgadzaj et al. (2016) observed that while Bradyrhizobium were the major rhizobia symbionts of cowpea, irrespective of plant genotype and soil type, the presence of other genera including Enterobacter, Chryseobacterium, Sphingobacterium was significantly correlated with the soil type and, to a lesser extent, plant genotype. However, Leite et al. (2017) indicated that for rhizobia-host interaction, the plant genotype has a higher influence on the selection of the bacterial symbiont than soil type, due to the fact that the abundance of major OTUs differed in their study under influence of plant genotype. Fernandez-Gomez et al.

(2019) proposed that the bulk soil surrounding the rhizosphere might introduce low levels of restrictions and/or promote the recruitment of a subset of bacteria that colonize the rhizosphere. 84

Although it might be helpful to analyze the rhizosphere for chemical properties to gain a comprehensive understanding of the factors influencing the microbial communities in the nodules, we postulate that in our fine scale study the soil matrix is quite homogeneous and the role of environmental influences would have been minimized given the very small study area.

Congruently, there is no general decision about the key player, which means that both factors can dominate depending on biotic and abiotic conditions (Berg and Smalla 2009), and there are several contrasting reports recognizing plant or soil type as the dominant factor (Grayston et al.

1998; Girvan et al. 2003; Nunan et al. 2005). Although soil type and pH appear to influence free- living soil bacterial communities (Fierer and Ladau 2012), Sachs et al. (2009) found that rhizobia housed in nodules of different species of wild Lotus were a subset of those in the surrounding soil, indicating a strong role for plant host to choose particular rhizobia genotypes.

Conclusions

No study to date has been conducted on genetic diversity and structure of legume rhizobia system within Chamaecrista fasciculata and at the fine scale studied here. Here I showed that there is fine-scale genetic structure in the host plant and its nodulating rhizobia, and that plant genotype along with geographic distance contribute minimally to explaining genetic distance among nodulating rhizobia. The data indicate that co-located host plant individuals are likely to be more genetically similar than those more distant probably because large seeds are maintained close to the maternal plant. The low dispersal ability of bacteria in soil and specificity between legume and rhizobia species may explain the common pattern of structure between rhizobia and their host plants. Nevertheless, the small amount of variation in nodulating rhizobia genetic distances explained by host genetic distance and geographic distance suggests that there may be other environmental factors influencing these interactions. I also found that the only rhizobia 85

strain retrieved from sampled host plant nodules belonged to Bradyrhizobium elkanii, which was also the most common form in the rhizosphere. This finding reveals a high degree of plant selection among available rhizobia in soil. To understand to what degree this strain-specific legume rhizobia symbioses can develop, further studies in particular habitats are needed.

86

References

Ahn, K. S., U. Ha, J. Jia, D. Wu, and S. Jin. 2004. The truA gene of Pseudomonas aeruginosa is required for the expression of type III secretory genes. Microbiology 150: 539-547.

Aira, M., F. Monroy, and J. Domínguez. 2007. Eisenia fetida (Oligochaeta: Lumbricidae) modifies the structure and physiological capabilities of microbial communities improving carbon mineralization during vermicomposting of pig manure. Microbial Ecollogy 54: 662–671.

Andrews, M., and M. E. Andrews. 2017. Specificity in legume–rhizobia symbioses. International Journal of Molecular Sciences 18:705.

Bais H. P., T. L. Weir, L. G. Perry, S. Gilroy, J. M. Vivanco. 2006. The role of root exudates in rhizosphere interactions with plants and other organisms. Annual Review of Plant Biology 57: 233–266.

Berg G. and K. Smalla. 2009. Plant species and soil type cooperatively shape the structure and function of microbial communities in the rhizosphere. FEMS Microbiology Ecology 68: 1–13.

Berlanas C, M. Berbegal, G. Elena, M. Laidani, J. F. Cibriain, A. Sagues, and D. Gramaje. 2019. The fungal and bacterial rhizosphere microbiome associated with grapevine rootstock genotypes in mature and young vineyards. Front Microbiology 10: 1142.

Bever, J. D., I. A. Dickie, E. Facelli, J. M. Facelli, J. Klironomos, M. Moora, M. C. Rillig, W. D. Stock, M. Tibbett, and M. Zobel. 2010. Rooting theories of plant community ecology in microbial interactions. Trends in Ecology amd Evolution 25: 468–478.

Bjornsgaard Aas A, C. J. Andrew, R. Blaalid, U. Vik, H. Kauserud, and M. L. Davey. 2019. Fine-scale diversity patterns in belowground microbial communities are consistent across kingdoms. FEMS Microbiology Ecology 95: fiz058, https://doi.org/10.1093/femsec/fiz058

Bolyen E, J. R. Rideout, M. R. Dillon, N. A. Bokulich, C. C. Abnet, et al. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37: 852–857.

Bouffaud M. L., M. A. Poirier, D. Muller, Y. Moënne-Loccoz. 2014. Root microbiome relates to plant host evolution in maize and other Poaceae. Environmental Microbiology 16: 2804– 2814.

Bulgarelli D., R. Garrido-Oter, P. C. Münch, A. Weiman, J. Dröge, Y. Pan, et al. 2015. Structure and function of the bacterial root microbiota in wild and domesticated barley. Cell Host Microbe 17: 392–403.

87

Callahan, B. J., P. J. Mc Murdie, M. J. Rosen, A. W. Han, A. J. Johnson, and S. P. Holmes. 2016. DADA2: High Resolution Sample Inference from Amplicon Data. Nature Methods.

Caporaso, J. G., J. Kuczynski, J. Stombaugh, K. Bittinger, F. D. Bushman, E. K. Costello, et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7: 335–336.

Crook, M. B, D. P. Lindsay, M. B. Biggs, J. S. Bentley, J. C. Price, S. C. Clement, M. J. Clement, S. R. Long, J. S. Griffitts. 2012. Rhizobial plasmids that cause impaired symbiotic nitrogen fixation and enhanced host invasion. Molecular Plant–Microbe Interactions 25: 1026–1033.

Darriba, D., G. L. Taboada, R. Doallo, and D. Posada. 2012. J Model Test 2: More models, new heuristics, and parallel computing. Nature Methods 9:772.

Dellaporta, S. L., J. Wood and J. B. Hicks. 1983. A plant DNA mini preparation: Version II. Plant Molecular Biology Reporter 1: 19-21.

Dellicour S, and P. Mardulyn. 2014. SPADS 1.0: a toolbox to perform spatial analyses on DNA sequence datasets. Molecular Ecology Resources 14: 647–651.

Diouf M., E. Baudoin, L. Dieng, K. Assigbetsé, A. Brauman. 2010. Legume and gramineous crop residues stimulate distinct soil bacterial populations during early decomposition stages. Canadian Journal of Soil Science 90: 289–293.

Dorman, H. and L. E. Wallace. 2019. Diversity of nitrogen-fixing symbionts of Chamaecrista fasciculata (Partridge Pea) across variable soils. Southeastern naturalist 80:147-164.

Ehrenfeld J. G., B. Ravit, K. Elgersma. 2005. Feedback in the plant–soil system. Annual Review of Environment and Resources 30: 75–115.

Emmett, B., E. B. Nelson, A. Kessler, and T. L. Bauerle. 2014. Fine-root system development and susceptibility to pathogen colonization. Planta 239: 325–340.

Ettema, C. H. and D. A. Wardle. 2002. Spatial soil ecology. Trends in Ecology and Evolution 17:177–183.

Fenster, C. B., X. Vekemans, and O.J. Hardy. 2003. Quantifying gene flow from spatial genetic structure data in a metapopulation of Chamaecrista fasciculata (Leguminosae). Evolution, 57: 995–1007.

Fernández-Gómez, B., J. Maldonado, D. Mandakovic, A. Gaete, et al. 2019. Bacterial communities associated to Chilean altiplanic native plants from the Andean grasslands soils. Scientific Reports 9:1042. doi: 10.1038/s41598-018-37776-0

Fierer, N., J. Ladau. 2012. Predicting microbial distributions in space and time. Nature Methods 9: 549-551. 88

Fierer, N., and R. B. Jackson. 2006. The diversity and biogeography of soil bacterial communities. Proceedings of the National Academy of Sciences (USA) 103: 626– 631.

Fox, G. E., J. D. Wisotzkey, and P. Jurtshuk. 1992. How close is close: 16SrRNA sequence identity may not be sufficient to guarantee species identity. International journal of systematic bacteriology 42:166–170.

Fuhrmann, J. J. 1990. Symbiotic effectiveness of indigenous soybean Bradyrhizobia as related to serological, morphological, rhizobiotoxine, and hydrogenase phenotypes. Applied and Environmental Microbiology 56: 224-229.

Girvan M. S., J. Bullimore, J. N. Pretty, A. M. Osborn, and A. S. Ball. 2003. Soil type is the primary determinant of the composition of total and active bacterial communities in arable soils. Applied and Environmental Microbiology 69: 1800–1809.

Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate method to estimate large phylogenies by maximum likelihood. Systematic Biology 52: 696–704.

Hardoim P. R., L. S. van Overbeek, J. D. van Elsas. 2008. Properties of bacterial endophytes and their proposed role in plant growth. Trends in Microbiology 16:463–471.

Hardy, O. J. 2003. Estimation of pairwise relatedness between individuals and characterization of isolation-by-distance processes using dominant genetic markers. Molecular Ecology 12: 1577e1588.

Hardy, O. J., X. Vekemans. 2002. SPAGeDi: a versatile computer program to analyze spatial genetic structure at the individual or population levels. Molecular Ecology Notes 2: 618– 620.

Hasegawa, M., K. Kishino, and T. Yano. 1985. Dating the human–ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 22:160–174.

Heath, K. D., and P. Tiffin. 2007. Context dependence in the coevolution of plant and rhizobial mutualists. Proceedings of the Royal Society of London B: Biological Sciences 274: 1905–1912.

Holm. J. B., M. S. Humphrys, C. K. Robinson, M. L. Settles, S. Ott, L. Fu, H. Yang, P. Gajer, et al. 2019. Ultrahigh-throughput multiplexing and sequencing of >500-basepair amplicon regions on the Illumina HiSeq 2500 platform. mSystems 4: e00029-19.

IBM Corp. 2017. IBM SPSS Statistics for Winodows, version 25. IBM Corp., Armonk, New York, USA.

Irwin, H. S. and R. C. Barneby. 1982. The American Cassinae: A synoptical revision of Leguminosae tribe Cassieae subtribe Cassinae in the New World. Memoirs of The New York Botanical Garden 35: 1–918.

89

Igolkina A., G. A. Bazykin, E. P. Chizhevskaya, N. A. Provorov, E. E. Andronov. 2018. The Evolutionary Moulding in plant-microbial symbiosis: matching population diversity of rhizobial nodA and legume NFR5 genes. bioRxiv 285882; doi: https://doi.org/10.1101/285882

Jiang, Y., S. Li, R. Li, J. Zhang, Y. Liu, L. Lv, H. Zhu, W. Wu, W. Li. 2017. Plant cultivars imprint the rhizosphere bacterial community composition and association networks. Soil Biology and Biochemistry 109: 145e155

Kamvar Z. N., J. F. Tabima, and N. J. Grünwald. 2014. Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peer J 2: e281.

Kim, M., Y. Chen, J. Xi, C. Waters, R. Chen, D. Wang. 2015. An antimicrobial peptide essential for bacterial survival in the nitrogen-fixing symbiosis. Proceedings of the National Academy of Sciences, USA 112: 201500123.

Klock M. M., L. G. Barrett, P. H. Thrall, K. E. Harms. 2015. Host-promiscuity in symbiont associations can influence exotic legume establishment and colonization of novel ranges. Diversity and Distributions 21: 1193–1203.

Koppell, J. H., and M. A. Parker. 2012. Phylogenetic clustering of Bradyrhizobium symbionts on legumes indigenous to North America. Microbiology 158: 2050–2059.

Ladygina N., and K. Hedlund. 2010. Plant species influence microbial diversity and carbon allocation in the rhizosphere. Soil Biology and Biochemistry 42: 162–168.

Lapage S. P., P. H. Sneath, E. F. Lessel, V. B. Skerman, H. P. Seeliger, W. A. Clark. Chapter 3, Rules of Nomenclature with Rocommendations. Washington (DC): ASM Press; 1992. International Code of Nomenclature of Bacteria. Bacteriological Code 1990 Revision.

Leite, J., D. Fischer, L. F. Rouws, P. I. Fernandes-Júnior, A. Hofmann, S. Kublik, M. Schloter, G. R. Xavier, and V. Radl. 2017. Cowpea Nodules Harbor Non-rhizobial Bacterial Communities that Are Shaped by Soil Type Rather than Plant Genotype. Frontiers in plant science 7: 2064.

Librado, P., and J. Rozas. 2009. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.

Loiselle, B. A., V. L. Sork, J. Nason, et al. 1995. Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). American Journal of Botany 82:1420e1425.

Lu, J., F. Yang, S. Wang, H. Ma, J. Liang, and Y. Chen. 2017. Co-existence of Rhizobia and Diverse Non-rhizobial Bacteria in the Rhizosphere and Nodules of Dalbergia odorifera Seedlings Inoculated with Bradyrhizobium elkanii, Rhizobium multihospitium-Like and Burkholderia pyrrocinia-Like Strains. Frontiers in microbiology 8: 2255.

90

Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27: 209–220.

Marques J. M., T. F. da Silva, R. E. Vollu, A. F. Blank, G. C. Ding, et al. 2014. Plant age and genotype affect the bacterial community composition in the tuber rhizosphere of field- grown sweet potato plants. FEMS Microbiology Ecology 88: 424–435.

McLaren J. R., and R. Turkington. 2011. Plant identity influences decomposition through more than one mechanism. PLoS ONE 6: e23702.

Miller, M. A., W. Pfeiffer, and T. Schwartz. 2010. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. Pp. 1–8, In Institute of Electrical and Electronics Engineers (Eds.). Proceedings of the Gateway Computing Environments Workshop (GCE), New Orleans, LA. 115 pp.

National Center for Biotechnology Information (NCBI). 1988. National Library of Medicine (US), National Center for Biotechnology Information; Bethesda, MD. Available online at https://www.ncbi.nlm.nih.gov/. Accessed 14 March 2020.

Ndlovu J, D. M. Richardson, J. R. U. Wilson, J. J. Le Roux. 2013. Co-invasion of South African ecosystems by an Australian legume and its rhizobial symbionts. Journal of Biogeography 40: 1240–1251.

Nunan N, T. J. Daniell, B. K. Singh, A. Papert, J. W. Mc Nicol, and J. I. Prosser. 2005. Links between plant and rhizoplane bacterial communities in grassland soils, characterized using molecular techniques. Applied and Environmental Microbiology 71: 6784–6792

O’Malley, M. A. 2007. The nineteenth century roots of ‘everything is everywhere. Nature Reviews Microbiology 5:647–651.

Pahua, V. J., P. J. N. Stokes, A. C. Hollowell, J. U. Regus, K. A. Gano-Cohen, C. E. Wendlandt, K. W. Quides, J. Y. Lyu, J. L. Sachs. 2018. Fitness variation among host species and the paradox of ineffective rhizobia. Journal of Evolutionary Biology 31: 599– 610.

Parker, M. 1999. Mutualism in metapopulations of legumes and rhizobia. The American Naturalist 153: 48-60.

Parker, M. A. 2015. The spread of Bradyrhizobium lineages across host legume clades: From Abarema to Zygia. Microbial Ecology 69: 630–640.

Parker, M., and D. A. Kennedy. 2006. Diversity and relationships of Bradyrhizobium from legumes native to eastern North America. Canadian Journal of Microbiology 52: 1148- 1157.

Peakall, R., and P. E. Smouse. 2012. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28: 2537-2539.

91

Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, et al. 2011. Scikit- learn: machine learning in python. Journal of Machine Learning Research 12: 2825.

Perret X, C. Staehelin, W. J. Broughton. 2000. Molecular basis of symbiotic promiscuity. Microbiology and Molecular Biology Reviews 64: 180–201.

Portnoy, S., and M. F. Willson. 1993. Seed dispersal curves: behavior of the tail of the distribution. Evolutionary Ecology 7: 25–44.

Quast, C., E. Pruesse, P. Yilmaz, J. Gerken, T. Schweer, et al. 2012. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research 41: D590–D596.

Rangin, C., B. Brunel, J. C. Cleyet-Marel, M. M. Perrineau, G. Bena. 2008. Effects of Medicago truncatula genetic diversity, rhizobial competition, and strain effectiveness on the diversity of a natural Sinorhizobium species community. Applied and Environmental Microbiology 74: 5653–5661.

Reeves, D.W. 1994. Cover crops and rotations. Pp. 125–172, In J.L. Hatfield and B.A. Stewart (Eds.). Advances in Soil Science: Crops Residue Management. Lewis Publishers, CRC Press, Boca Raton, FL. 240 pp.

Rodríguez-Kábana, R., N. Kokalis-Burelle, D. G. Robertson, C. F. Weaver, and L. Wells. 1995. Effects of Partridge Pea–Peanut rotations on populations of Meloidogyne arenaria, incidence of Sclerotium rolfsii, and yield of peanut. Nematropica 25: 27–34.

Ronquist, F., M. Teslenko, P. van der Mark, D. L. Ayres, A. Darling, S. Höhna, B. Larget, L. Liu, M.A. Suchard, and J. P. Huelsenbeck. 2012. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology 61: 539–542.

Rosenberg, M. S., and C. D. Anderson. 2011. PASSaGE: Pattern Analysis, Spatial Statistics and Geographic Exegesis. Version 2. Methods in Ecology and Evolution 2: 229-232.

Rousset, F., 2000. Genetic differentiation between individuals. Journal of Evolutionary Biology 13: 58e62.

RStudio Team. 2012. RStudio: Integrated development environment for R, version 1.1.456. Boston, Massachussete, USA, website: http://www.rstudio.com [accessed 20 March 2020].

Sachs, J. L., S.W. Kembel, A. H. Lau, and E. L. Simms. 2009. In situ phylogenetic structure and diversity of wild Bradyrhizobium communities. Applied Environmental Microbiology 75: 4727– 4735.

92

Sachs, S. J., S. Wang, C. D. Campbell, and Edwards, A. C. 1998. Selective influence of plant species on microbial diversity in the rhizosphere. Soil Biology and Biochemistry 30: 369– 378.

Santos J. M, P. A. Casaes Alves, V. C. Silva, M. F. Kruschewsky Rhem, E. K. James, E. Gross. 2017. Diverse genotypes of Bradyrhizobium nodulate herbaceous Chamaecrista (moench) (Fabaceae, caesalpinioideae) species in Brazil. Systematic and Applied Microbiology 40: 69-79

Schupp, E. W., and M. Fuentes. 1995. Spatial patterns of seed dispersal and the unification of plant population ecology. Écoscience 2: 267– 275.

Singer, S., J. Doyle, G. May, S. Cannon, S. Maki, and D. Illut. 2009. Exploring Chamaecrista fasciculata genomics data [online]. Website: http://serc.carleton.edu/exploring_genomics/chamaecrista/chamaecrista_tr.html [accessed 10 Jannuary 2020]

Singer, S., S. Maki, A. Farmer, D. Ilut, G. May, S. Cannon, and J. Doyle. 2009a. Venturing beyond beans and peas: what can we learn from Chamaecrista? Plant Physiology 151: 1041-1047.

Sinnott, R. W. 1984. Virtues of the Haversine. Sky Telescope 68: 158.

Sylvester-Bradley R., P. Thornton, and P. Jones. 1988. Colony dimorphism in Bradyrhizobium strains. Applied and Environmental Microbiology 54: 1033-1038

Tamura, K., G. Stecher, D. Peterson, A. Filipski, and S. Kumar. 2013. MEGA6: Molecular evolutionary-genetics analysis version 6.0. Molecular Biology and Evolution 30: 2725– 2729.

Van Cauwenberghe, J., B. Verstraete, B. Lemaire, B. Lievens, J. Michiels, O. Honnay. 2014. Population structure of root nodulating Rhizobium leguminosarum in Vicia cracca populations at local to regional geographic scales. Systematic and Applied Microbiology 37: 613–621.

Vekemans, X., and O.J. Hardy. 2004. New insights from fine-scale spatial genetic structure analyses in plant populations. Molecular Ecology 13: 921e935.

Vinuesa, P., C. Silva, D. Werner, and E. Martínez-Romero. 2005. Population genetics and phylogenetic inference in bacterial molecular systematics: The roles of migration and recombination in Bradyrhizobium species cohesion and delineation. Molecular Phylogenetics and Evolution 34: 29–54.

Vuong, H. B, P. H. Thrall, and L. G. Barrett. 2017. Host species and environmental variation can influence rhizobial community composition. Journal of Ecology 105: 540– 548.

93

Wang, Q., J. Liu, H. Li, S. Yang, P. Kormoczi, A. Kereszt, H. Zhu. 2018. Nodule specific cysteine-rich peptides negatively regulate nitrogen-fixing symbiosis in a strain-specific manner in Medicago truncatula. Molecular Plant–Microbe Interactions 31: 240–248.

Wang, D., S. Yang, F. Tang, H. Zhu. 2012. Symbiosis specificity in the legume: Rhizobial mutualism. Cell Microbiology 14: 334–342.

Willson, M. F. 1993. Dispersal mode, seed shadows, and colonization patterns. Vegetatio 107/108: 261–280.

Zancarini, A., C. Mougel, A. S. Voisin, M. Prudent, C. Salon, N. Munier-Jolain. 2012. Soil nitrogen availability and plant genotype modify the nutrition strategies of M. truncatula and the associated rhizosphere microbial communities. PLoS ONE 7: e47096 10.1371/journal.pone.0047096

Zhang, Y. M., C. T. Tian, X. H. Sui, W. F. Chen, and W. X. Chen. 2012. Robust markers reflecting phylogeny and taxonomy of rhizobia. PLoS One 7: e44936. DOI:10.1371/journal. pone.0044936.

Zhang, Y. M., Y. Li, W. F. Chen, E. T. Wang, C. F. Tian, Q. Q. Li, and W. X. Chen. 2012. Biodiversity and biogeography of rhizobia associated with soybean plants grown in the North China Plain. Applied and Environmental Microbiology 77: 6331-6342.

Zhou N., S. Zhao, and C. Y. Tian. 2017. Effect of halotolerant rhizobacteria isolated from halophytes on the growth of sugar beet (Beta vulgaris L.) under salt stress. FEMS Microbiology Letters 364: fnx091.

Zgadzaj, R., R. Garrido-oter, D. Bodker, A. Koprivova, P. Schulze-lefert. 2016. symbiosis in Lotus japonicus drives the establishment of distinctive rhizosphere, root, and nodule bacterial communities. Proceedings of the National Academy of Sciences USA 113: E7996–E8005.

94

CHAPTER IV

FINE-SCALE PATTERNS OF GENETIC STRUCTURE IN CHAMAECRISTA FASCICULATA

(FABACEAE) AND ITS RHIZOSPHERE MICROBIAL COMMUNITIES

Introduction

Plant-soil microbiome interactions are complex, and the study of these relationships has been primarily focused on the pathogenicity of some microbial agents, particularly in agricultural systems (Philippot et al. 2013; Zancarini et al. 2012; Gilbert et al. 2014; Sapkota et al. 2015).

Understanding diversity in plant–microbial interactions is essential to advancing our knowledge of the mechanisms by which biodiversity regulates ecosystem processes, as well as predicting responses to environmental change, due to the intimate relationship between producers and decomposers (Hooper et al. 2000). Roots are surrounded by a narrow biologically active zone of soil known as the rhizosphere. The rhizosphere is the site where plant roots and microorganisms interact and is of great importance for plant performance, nutrient cycling, and ecosystem functioning (Singh et al. 2004). The rhizosphere has high microbial diversity, and its community structure is expected to be different than microbial communities found in bulk soil (Reinhold-

Hurek et al. 2015). Roots secrete large amounts of photosynthetically fixed carbon as exudates that contain a wide range of molecules such as carbohydrates, amino acids, organic acid ions, and vitamins (Bertin et al. 2003; Bais et al. 2006). Thus, by providing a nutrient-rich niche for microbes, plant roots commonly harbor beneficial microbes (Reinhold-Hurek et al. 2015).

95

Many factors, such as physical and chemical properties of soil, plant genotype and stage of plant development, and background microbial composition of soil, directly or indirectly influence rhizosphere microbial community composition (Lundberg et al. 2012; Marques et al.

2014; Schreiter et al. 2014; Qiao et al. 2017). Several studies believe plants are the most important factor shaping the rhizosphere microbiome, as evidenced by the fact that different plant species host specific microbial communities when grown on the same soil (e.g. Aira et al.

2010; Berendsen et al. 2012; Bazghaleh et al. 2015). Plants have a particularly close association with microbes belowground because root exudates known as rhizodeposits are important sources of carbon and other nutrients that heterotrophic microbes require (Hartmann et al. 2009). Given that plant species differ in the quantity and quality of their belowground inputs to soil (Wardle et al. 2004), the structure of their rhizosphere microbial communities can be distinctive in both composition and diversity (e.g., Costa et al. 2006; Garbeva et al. 2008). Rhizodeposits released by plants vary according to the age and development of plants (Inceoglu et al. 2010; Philippot et al. 2013; Wagner et al. 2016; Qiao et al. 2017), among species (Hacquard 2016; Lemanceau et al. 2017) and even among different genotypes of the same species (Gilbert et al 2014; Bazghaleh et al. 2015; Wagner et al. 2016; Qiao et al. 2017). It goes to reason that genetic variation among individual host plants can potentially influence microbial community structure (Schweitzer et al.

2012). For example, variation in rhizosphere microbial communities among genotypes has been demonstrated experimentally for Arabidopsis thaliana (L.) Heynh. (Lundberg et al. 2012;

Micallef et al. 2009) and has been attributed to differences among genotypes in root exudates

(Micallef et al., 2009). Similarly, in several experiments in terrestrial ecosystems dominated by

Populus trees, distinct microbial communities have been observed in bulk soil associated with different plant genotypes (Madritch and Lindroth, 2011; Schweitzer et al. 2008). It is also

96

believed that genotypic variations that result in phenotypic trait variation such as soil litter chemistry (Madritch et al. 2009; Schweitzer et al. 2004) and productivity (Lojewski et al. 2009) could even affect soil microbes. Plant genetic influence on rhizosphere microbial communities has also been reported for different genotypes of common grapevine, Vitis vinifera L. (Aira et al.

2010; Bouffaud et al. 2012; Peiffer et al. 2013; Marques et al. 2014; Jiang et al. 2017; Gallart et al. 2018). Maize hybrids are also able to select specific bacterial strains that their parents do not, thereby increasing the diversity of rhizosphere bacteria (Picard and Bosco 2005, 2006; but see

Roesch et al. 2006). However, a recent study suggests that crop cultivars might ignore beneficial soil microbes and become more dependent on fertilizer and other soil amendments than their wild relatives (Stephanie and Sachs 2020). It has also been demonstrated that differences in the microbial community are significantly correlated with phylogenetic distance of their plant hosts

(Bouffaud et al. 2014). Thus, plant hosts with similar genotypes may have similar soil microbial communities, suggesting the presence of genotype x genotype interactions in these situations.

With dispersal-limited bacterial and fungal communities and host influence on microbial diversity and abundance, spatial autocorrelation of these communities in the soil is expected

(Ettema and Wardle 2002; Fierer and Ladau 2012).

Chamaecrista fasciculata (Michx.) Greene (Partridge Pea) is an annual, sub-erect native legume that is widely distributed in the eastern U.S. It exhibits intraspecific morphological variation and wide ecological tolerance of different soil types and habitats (Irwin and Barneby

1982). Although not an agricultural crop, it is occasionally planted in agricultural areas to enhance nitrogen in soils. This species is important in natural communities as a source of pollen for bees, nectar for mutualistic ants, food for herbivores, and habitat for wildlife. Chamaecrista fasciculata forms root nodules to support symbioses with nitrogen fixing bacteria known as

97

rhizobia (Parker and Kennedy 2006). Previous studies on this system have demonstrated that

Bradyrhizobium is the only genus of rhizobia in nodules of C. fasciculata (Parker et al. 2002;

Parker 2012, 2015; Parker and Kennedy 2006, Parker and Rousteau 2014; Dorman and Wallace

2019), but no study to date has focused on diversity and structure of the rhizosphere microbiome of C. fasciculata. Furthermore, among the few studies that have investigated intraspecific plant genotypic influences on rhizosphere microbial communities, most have focused on subsets of these communities, such as plant growth-promoting bacteria (e.g. Paffetti et al. 1996; Dalmastri et al. 1999; Mazzola et al. 2004; Hartman et al. 2017), rather than the whole microbiome. Thus, relatively little is known about the relationship between intraspecific genotypic variation of plant host and their associated microbial communities outside model systems or under natural conditions (Schweitzer et al. 2011).

In this study, I characterized genetic structure of C. fasciculata host plants and their rhizosphere microbiomes. Understanding intraspecific structure of the host plant is important for making predictions about expected spatial patterns of its rhizosphere microbiome communities.

Three hypotheses were tested in the current study: 1) plant hosts exhibit an isolation by distance pattern because of limited seed dispersal, 2) rhizosphere microbial communities vary with plant genotypic variation, and 3) these communities exhibit fine-scale structure that decays with distance. If genotypic variation of host plants is important in plant-microbe interactions and bacteria are dispersal-limited in the soil, then it is expected that plants with more similar genotypes will have more similar microbial communities and that microbial communities will become less similar with increasing physical distance. Nitrogen-fixing leguminous plants, such as C. fasciculata, are key components of natural systems because these species, upon establishing rhizobial and mycorrhizal symbioses, constitute a fundamental source of nitrogen

98

for other members of their communities. By investigating the genetic impact of a leguminous species on microbial communities in the rhizosphere, I can develop a better understanding of plant microbe interactions, specifically at the finest spatial scale, where little research has been done so far.

Materials and Methods

Data collection

In June 2017, 70 plants were sampled in three consecutive rectangular plots starting at

33.35939, -88.86587 in Oktibbeha Co, Mississippi. Within each plot, plants were sampled at distance intervals of 0, 1, 2, 5, 10 and 25 meters from the previous point. At each distance, two plants each perpendicular to the central line of the transect were sampled 0.5 meter from a central point, for a total of 24 plants sampled per plot (Figure 3.1). Whole plants, including roots, were carefully excavated from the soil. Each plant and its roots were stored individually in a plastic bag on ice in the field. Plants were kept at 4°C until processed. In the lab, rhizosphere soil, attached closely to the root, was collected from each plant by shaking the root area into multiple collection tubes; these were stored at -80°C until DNA extraction. Leaf samples for

DNA extraction were stored in silica gel or at -80°C until DNA extraction.

Genomic DNA was extracted from leaves using a CTAB method (Dellaporta 1983, lightly modified by Schnable lab described at file:///H:/First%20project-

First%20draft/Papers/For%20the%20references/Dellaporta_DNA_Extraction.2014.08.20.pdf ).

The sampled plants were genotyped at 14 trinucleotide microsatellite loci (Table 3.1) to quantify genetic structure. Three multiplex amplification reactions per sample were run with five, four, and five loci per reaction, respectively. Reaction volumes were 10 μl in the presence of 10 ng of template DNA, 100 µmole of each of the reverse and tagged fluorescent label primers and 10 99

µmole of tagged forward primer using a KAPA 2G Fast Multiplex PCR kit (Kapa Bio-systems,

Wilmington, MA). Tag sequences (Culley et al. 2013) were placed at the 5’ end of each forward primer, and these tags matched the sequence of a fluorescent labeled primer (Table 3.1). The thermal cycler program used to amplify loci was 3 min at 95°C, 30 cycles of 15 s at 95°C, 30 s at

60°C, and 30 s at 72°C, and a final extension step of 1 min at 72°C. Successful amplification of fragments was checked by gel electrophoresis. Multiplexed samples were genotyped at the

Arizona State University DNA Lab with LIZ 600 size standard, and alleles were sized using

GeneMarker software (SoftGenetics, State College, PA).

For microbial community sequencing of rhizosphere soil samples, whole-community

DNA was extracted using the FastDNA™ SPIN Kit for soil isolation (MP Biomedicals, Solon,

OH). The concentration of the extracted DNA was determined using a Qubit 3 Fluorometer

(ThermoFisher Scientific, Waltham, MA). By applying 319F and 806R primer pair sequences designed for the V3 and V4 region of the 16S rRNA gene (Holm et al. 2019) for bacterial and archaeal microbes, and ITSF1/ITS2 for the fungal ITS1 gene (Sun et al. 2018), individual amplicons were produced using a 2-step PCR method as described in Holm et al. (2019). In the first step PCR of the 16s gene, each 12-microliter volume reaction contained 1× Phusion Taq master mix (New England Biolabs, Ipswich, MA), step 1 forward and reverse primers, 319F and

806R, (0.4 µM each), 3% DMSO, and 5 ng genomic DNA. The thermal cycler program included the following cycles; an initial denaturation at 94°C for 3 min, 20 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 30 s, elongation at 72°C for 1 min, and a final elongation step at

72°C for 7 min. For the first step PCR of ITS1, each 12-microliter volume reaction contained

10X Standard Taq (Mg-free) reaction buffer (New England Biolabs, Ipswich, MA), dNTP’s (2 mM), MgCl2 (2 mM), Taq polymerase (0.2 units), 3% DMSO, forward and reverse primers,

100

ITSF1/ITS2 (each 0.5 or 0.8 µM; the required concentration varied among DNA samples) and 10 ng of genomic DNA. The thermal cycler program for ITS1 included the following cycles; an initial denaturation at 94°C for 2 min, followed by 50°C for 1 min and 72°C for 1 min, 30 cycles of denaturation at 94°C for 1 min, annealing at 50°C for 1 min, elongation at 72°C for 45s, and a final elongation step at 72°C for 5 min. The successful amplification of PCR products was tested using agarose gel electrophoresis before proceeding to the next step. For both gene regions, the resultant amplicons were diluted 1:20, and 1 µl was used in the second step of PCR. The second amplification step introduced an 8-bp dual-index barcode (Holm et al. 2019) to the amplicons, as well as the flow-cell linker adaptors, using primers containing a sequence that anneals to the

Illumina sequencing primer sequence introduced in step 1. The second step PCR, library quantification, and Illumina sequencing were conducted at the Microbiome Service Lab at the

University of Maryland School of Medicine following the methods in Holm et al. (2019). The

16S and ITS1 libraries of pooled amplicons were sequenced separately using 2 x 250 bp paired ends on a MiSeq instrument (Illumina, San Diego, CA).

For analyzing pH of the soil, which has been identified as the most important factor influencing soil bacterial communities by Fierer and Jackson (2006), rhiozospheric soil from the first three distance intervals (i.e. 0, 1 and 2) was combined for each plot and rhizospheric soil from the last three distance intervals (i.e. 5, 10 and 25) was combined for each plot. The soil samples were pulverized, dried, and sent to Waypoint Analytical (Richmond, VA) soil laboratory for assessment of pH.

101

Data Analyses of Plant Microsatellite Loci

I tested for the presence of duplicate multilocus plant genotypes using Poppr package

(Kamvar et al. 2014) and RStudio statistical software v. 1.1.456 (RStudio Team, 2012). To quantify genetic diversity of plants in each plot I calculated the mean number of alleles per locus

(A), percent of polymorphic loci (P), observed heterozygosity (HO), expected heterozygosity

(HE), and inbreeding coefficient (FIS) using GenAlEx version 6.503 (Peakall and Smouse 2012).

Departure from Hardy–Weinberg expectations was tested for each locus using permutations of alleles among individuals. Statistical significance was assessed using a sequential Bonferroni corrected p-value of 0.001 for multiple comparisons (Holm 1979). A Principal Coordinates

Analysis (PCoA) was conducted in GenAlEx version 6.503 (Peakall and Smouse, 2012) to determine dispersion of samples among the three studied plots. I conducted analysis of spatial genetic structure (SGS) using SPAGeDi 1.5 (Hardy and Vekemans 2002). Microsatellite loci were analyzed in each pair of plants using pairwise relatedness coefficients according to

Ritland’s estimator (equation 5 in Ritland 1996), the best estimator for highly polymorphic markers (Vekemans and Hardy 2004). Relatedness was assessed along 17 distance intervals, 1 to

384 meters, to maximize the number of pairwise comparisons. Spatial genetic structure was tested by assessing the significance of the regression slope (bF) of pairwise statistics (Fij) on ln

(distance) using 9999 randomizations of the individual spatial positions and obtaining 95% confidence intervals (CIs) after 1000 bootstrap replicates using SPAGeDi 1.5 (Hardy and

Vekemans 2002). Spatial genetic structure was also quantified by the ‘Sp’ statistic, which estimates SGS intensity and is calculated as – bF /(1 – F(1) ), where F(1) is the mean F( ij ) between individuals belonging to the first distance interval and bF is the regression slope of F( ij

) on rij (physical distance between samples i and j) (Vekemans and Hardy 2004). Standard errors

102

of the kinship coefficients per distance class were estimated using a jackknife procedure over loci. Then, values of the average genetic relatedness statistic between pairs of individuals separated by given distance intervals were plotted using a spatial autocorrelogram. Isolation-by- distance for the plant hosts was investigated by comparing pairwise population genetic and geographic distances in a Mantel test (Mantel 1967). Pairwise genetic distances for plants were calculated using GenAlEx (Peakall and Smouse 2012) and geographic distances were calculated using a modification of the haversine formula (Sinnott, 1984) based on the coordinates of the sampled plants in GenAlEx (Peakall and Smouse 2012). The Mantel test was performed using

PASSaGE 2 (Rosenberg 2011), and 1,000 permutations of the datasets were used to assess significance of the correlation.

Soil Sequence Data Processing and Analyses

Following Berlanas et al. (2019), bioinformatics analysis and annotation of the output data were carried out using QIIME2 software (Bolyen et al. 2019), which provides quality filtering, picking operational taxonomic units (OTUs), taxonomic assignment, phylogenetic reconstruction, diversity analysis and graphical displays. After sequences were demultiplexed by sample at the University of Maryland using a dual-barcode strategy, a mapping file linking barcode to samples, and QIIME-dependent script of split_libraries.py and split_sequence_file_on_sample_ids.py, respectively (Holm et al. 2019) they were sent back to us for the rest of analyses. The primer sequences were removed from each read using q2-cutadapt plugin, and sequence quality control and feature table construction were conducted using

DADA2 pipeline. Chimeras for combined runs were removed per the DADA2 protocol.

Amplicon sequence variants (ASVs) generated by DADA2 were taxonomically classified using the scikit-learn classifier (Pedregosa et al. 2011) trained with the SILVA v132 16S rRNA gene 103

sequence database (Quast et al. 2012). For the ITS1 dataset, I processed sequence output as for the 16S rRNA gene data, with the addition of using q2-itsxpress plugin to extract only the fungal

ITS1 regions of the reads, and using the UNITE reference database (UNITE Community 2019) to assign taxonomy. The output from QIIME2, including OTU and taxonomy tables, reference sequences, and the phylogenetic tree, was used as input for the subsequent analyses of α- and β- diversity, which measures local diversity and turnover in diversity between samples, respectively. Alpha diversity was calculated for each of the 12 sample locations in each plot (i.e., sequences from the two plants sampled in the same direction from the central point at a given distance were pooled), whereas β- diversity was calculated between each pair of distances.

Biodiversity indexes and ordination analyses based on taxonomic profiles were analyzed in RStudio statistical software v. 1.1.456 (RStudio Team, 2012) using the vegan (Oksanen et al.

2018) and Phyloseq packages (McMurdie and Holmes 2013). In each data set, samples were rarified to simulate a similar number of reads per sample. A taxonomic bar plot was generated using rarified samples for each data set. Alpha diversity, or the diversity of microbes within a given rhizospheric sample, can be described by richness, or the number of different OTU’s per sample, and evenness, or the number of OTU’s in relation to the OTU abundance within a sample. Shannon Index is also a commonly used diversity index that takes into account both abundance and evenness of species present in the community. The observed OTU’s and Shannon diversity index were calculated for each sample using the Phyloseq package (McMurdie and

Holmes 2013). Kruskal-Wallis tests in the Qiime2 software (Bolyen et al. 2019) were used to determine if measures of alpha diversity differed between each pair of rhizosphere samples for all 70 studied individuals. Therefore, Shannon and evenness vectors output from QIIME2 has been used as an input to measure differences between pairs of samples. To evaluate β- diversity,

104

which quantifies dissimilarity in rhizospheric soil composition between sample points, weighted

UniFrac distance matrices (Lozupone, and Knight 2005) for each bacterial and fungal community were subjected separately to PERMANOVA (Anderson 2001) using the adonis function with 999 permutations to test for significance using the vegan package of R.

Dissimilarity of communities among the sample points was visualized using principal coordinates analysis (PCoA). Regression analyses were used to test for an association between genetic distance of microbial communities using weighted UniFrac distance matrices (Lozupone et al. 2007; Pepe-Ranney et al. 2016) and spatial geographic and plant genetic distances (as calculated above) in IBM SPSS Statistics v.25 (IBM Corp. 2017). These analyses were conducted separately for the bacterial and fungal isolates.

Results

Analyses of the host plant revealed no clonal genotypes in the plant dataset and showed that the number of alleles per locus ranged from 3 to 12, with a mean value of 6.78 for the whole set (Table 4.1). The percentage of polymorphic loci (%P) was 100, the observed heterozygosity

(HO) ranged from 0.14 to 0.71, and the expected heterozygosity (HE) ranged from 0.32 to 0.84 among the three plots. Inbreeding coefficients (FIS) were consistently positive, indicating heterozygote deficiency across all plots (Table 4.1). Since genetic diversity measures among all three plots were very similar (Table 4.1), and a principal coordinates analysis did not indicate that individuals cluster by their plot (data not shown), I combined the plots for some downstream analyses. Among all three plots only plot 1 demonstrated a significant regression slope (bF) for plants kinship coefficient over distances classes (p = 0.001 for bF = -0.0138) (Table 4.2). In the combined plot dataset, I found significant autocorrelation in plant genotypes for all distance classes except 244, 261, 344 and 384 m. The average kinship coefficient decreased linearly with 105

spatial distance (rij) between individuals. Significantly positive values of Fij were found at short distances, indicating that neighboring individuals had a higher genetic relatedness than random pairs of individuals. Negative values (i.e. individuals within a distance class were less genetically similar than expected with a random distribution) of Fij occurred at larger distances (Figure 4.1).

The mean value of the regression slope (bF), F(1) and Sp statistic were -0.01214, 0.0452 and

0.012, respectively, and the slope was significant, with p < 0.001. Plant genetic and geographic distances were significantly correlated in a Mantel test (r = 0.243, p = 0.001, Figure 4.2).

Table 4.1 Genetic diversity of the host plant, Chamaecrista fasciculata, across three plots.

Plot N A P HO HE FIS Significant deviation from HWE

1 24 5.1 100 0.48 0.55 0.09 no loci 2 22 4.35 100 0.49 0.52 0.1 1 locus (p = 0.001) 3 24 5 100 0.48 0.56 0.13 1 locus (p = 0.001) All 70 6.78 100 0.49 0.58 0.17 3 loci (p = 0.001)

N = number of individuals sampled; A = mean number of alleles per locus; P = percentage of polymorphic loci; HO = observed heterozygosity; HE = expected heterozygosity; FIS = inbreeding coefficient; HWE= Hardy–Weinberg equilibrium.

106

Figure 4.1 Ritland’s kinship coefficients for pairs of plants plotted against the logarithm of geographic distance separating each pair of plants and the estimated regression line. Significant autocorrelations are indicated by stars. Regression slope bF = - 0.01214, p < 0.0001.

Table 4.2 Evidence of fine-scale genetic structure in host plants.

Plot Distance range (m) bF P-value F1 Sp statistic Plot 1 1-43 -0.0138 0.001* 0.0073 0.0139 Plot 2 1-43 -0.0063 0.125 -0.0165 0.0062 Plot 3 1-43 -0.0049 0.145 -0.0068 0.0048 Plots 1-3 1-384 -0.0121 0.000* 0.0451 0.0127 b = the regression slope of multilocus kinship coefficients for pairs of individuals against the logarithm of geographic distance separating members of the pair; P-value = level of significance; F1 = the mean kinship coefficient (Fij) between individuals belonging to the first distance interval; Sp statistics = – bF /(1 – F1). Note: Star shows significant values.

Bacteria of Acidobacteria, Protobacteria and Bacteroidetes and fungi of Basidiomycota,

Ascomycota and Mortierellomycota were dominant in the rhizospheric soils for all plots (Figure

4.3 and MSU repository https://hdl.handle.net/11668/17896). Significant differences were not detected in any measures of alpha diversity for bacterial communities and fungal communities

107

between samples (Table 4.3, Figure 4.4, alpha diversity measures including richness and evenness and Kruskal-Willis test result are provided at the MSU Repository https://hdl.handle.net/11668/17896). Rhizosphere microbial beta diversity varied among sample points (Figure 4.5), and these values were significantly different from one another for both bacterial (PERMANOVA: false-F = 2.08, p = 0.001) and fungal data sets (PERMANOVA: false-

F = 2.71, p = 0.001). Pairwise beta diversity comparisons of sample points are provided at the

MSU Repository (https://hdl.handle.net/11668/17896). In PCoA plots, the first two components accounted for approximately 41.7% of the total variance for bacterial communities and 30.1 % of the total variance for fungal communities. Both PCoA plots show some level of separation among plots. Bacterial sequences from plot 1 are scattered in the PCoA plot while plots 2 and 3 appear more separated. In the fungal PCoA plot, sequences retrieved from plot 1 and 2 overlapped to some extent but these are separated from plot 3. Stepwise regression analyses indicated significant association (p < 0.05) between genetic distance of both bacterial and fungal rhizosphere communities and spatial and plant genetic distances (Table 4.4).

The pH results for rhizsopheric soil in all three plots are provided in table 4.5.

108

Table 4.3 Kruskal Wallis tests for Shannon and evenness indexes within and between plots.

Microbial Index Plots H* P-value community Bacterial Shannon Within 45.62 0.08 Among 2.68 0.26 Evenness Within 39.89 0.22 Among 0.49 0.78 Fungal Shannon Within 37.36 0.19 Among 8.98 0.11 Evenness Within 32.99 0.36 Among 0.87 0.64 *H = the variance of the ranks among groups

Table 4.4 Stepwise regression assigning plant genetic and spatial distances as predictors of rhizosphere microbial weighted UniFrac genetic distances.

Microbial community Predictor Adjusted r square P-value Bacterial Plant genetics, constant 0.006 0.000 Plant genetics, spatial distance, 0.022 0.000 constant Fungal Plant genetics, constant 0.002 0.028 Plant genetics, spatial distance, 0.012 0.00 constant

Table 4.5 Tested pH for rhizospheric soils for each plot.

Plot Combined soils from 0, 1 and 2 distance Combined soils from 0, 1 and 2 distance intervals intervals 1 5.9 6.8 2 6.5 7.4 3 5.3 5.3

109

Figure 4.2 Plot of pairwise plant genetic distances compared to pairwise geographic distance (Mantel test, r = 0.243, p = 0.001).

110

Discussion

Nonrandom spatial distribution of genotypes, which is defined as spatial genetic structure

(SGS), occurs in natural populations due to restricted gene flow (Vekemans and Hardy 2004).

Restricted gene flow in plant populations often results from limited seed and pollen dispersal, causing greater genetic similarity among neighboring than among more distant individuals. The fact is that most plant seeds do not travel far, often only one or a few meters away from maternal plants (Harper 1977; Howe and Smallwood 1982; Willson 1993; Cain et al.1998; Cheplick

1998), which is expected to generate a pattern of SGS. In several studies, estimated migration rates, which are based on the occurrence of the farthest individual from a source population, ranged from 0.0 to 2.5 m/yr (Matlack 1994; Brunet and von Oheimb 1998; Bossuyt et al. 1999).

Owing to heavy gravity dispersed seeds with no dispersal facilitation appendages and seasonal variation in bee pollinator service (Fenster 1991), isolation by distance (IBD) is expected in C. fasciculata. Values measured for the Sp statistic in this study, ranging 0.0048-0.0139, were significant for one out of the three studied plots and for the combined plot. These values are comparable to another study of C. fasciculata in which the sp statistic was 0.00746 (Fenster et al.

2004) and in another widespread species, Orchis purpurea Huds., with a Sp range of 0.0094-

0.0144 (Jacquemyn et al. 2006). The mean Sp statistic in other outcrossing or gravity-dispersed plant species has been reported as 0.0126 and 0.0281, respectively (Vekemans and Hardy 2004).

Results from Mantel test (r = 0.243, p = 0.001) suggest that C. fasciculata has limited seed dispersal, as indicated by the presence of IBD and fine SGS. The presence of statistically significant genetic structure at most of the studied distances indicates the existence of restricted dispersal events among plants within the plots and suggests that C. fasciculata plants are not randomly distributed at this fine scale.

111

The rhizosphere is the biologically active zone of soil where plant roots and microorganisms interact (Singh et al. 2004). These interactions involve root exudates from the plant, which can shape the structure of and enhance the activity of microbial communities (Paterson et al. 2007).

Rhizosphere influences on bacteria composition have been widely recognized in numerous plant species, such as maize (Peiffer et al. 2013), rice (Edwards et al. 2015), Arabidopsis thaliana

(Lundberg et al. 2012), alfalfa (Xiao et al. 2017), poplar (Beckers et al. 2017), and grape (Samad et al. 2017). These studies have found differing extents of rhizosphere effects on microbes, dependent on the plant species, and these effects were related to factors such as plant root physiology and root exudation profiles (Fitzpatrick et al. 2018; Sasse et al. 2018). For example, maize hybrids can select specific bacterial strains that their parents do not, thereby increasing the diversity of rhizosphere bacteria (Picard and Bosco 2005, 2006; Roesch et al. 2006). Studying the model legume Lotus japonicus (Regel) K. Larsen, Zgadzaj et al. (2016) identified a role of the legume host in selecting a broad taxonomic range of root-associated bacteria that likely contribute to plant growth and ecological performance. Given the presence of fine scale genetic structure in C. fasciculata and its expected ability to shape rhizosphere microbial communities, I expected to find a similar pattern of community structure in rhizospheric microbes.

112

Figure 4.3 Taxonomic classification of microbial sequences obtained from rhizospheric samples at the phylum level for (A) bacterial communities and (B) fungal communities.

113

Figure 4.3 (continued)

Taxonomic classification of microbial sequences obtained from rhizospheric samples at the phylum level for (A) bacterial communities and (B) fungal communities.

In our study, Acidobacteria, Protobacteria, and Bacteroidetes were the dominant bacteria found, and Basidiomycota, Ascomycota and Mortierellomycota were the dominant fungi in the C. fasciculata rhizosphere. The presence of these taxa is consistent with previous reports of legume rhizosphere microbiomes (Mendes et al. 2014; Sugiyama et al. 2014; Xiao et al. 2017; Hartman et al. 2017; Zhou et al. 2017). Metagenomics studies have revealed five main properties for

Acidobacteria, including nitrogen assimilation, carbon usage, metabolism of iron, antimicrobials, and abundance of transporters (Parsley et al. 2011; Faoro et al. 2012; Navarrete et al. 2013;

Mendes et al., 2014). The Proteobacteria are of great importance to carbon, sulfur and especially nitrogen cycling (Manz et al. 1992; Ludwig et al. 1995; Gupta 2000; Moulin et al. 2001; Spain et 114

al. 2009). Bacteria of the Bacteroidetes are prolific degraders of complex carbohydrates (Thomas et al. 2011), with wide-ranging environmental impacts (Griffiths and Gupta 2001); for example, decomposition of polysaccharides enables soluble sugars to be made available to other organisms and recycles carbon, nitrogen, and water (Thomas et al. 2011). Basidiomycota are the main decomposers of recalcitrant components of plant litter (Rayner et al. 1988), and at least 4,500 species of this phyla form mutualistic associations known as ectomycorrhizae with roots of vascular plants (Watling 1995). Ascomycota fungi are important drivers in carbon and nitrogen cycling in arid ecosystems and play roles in soil stability, plant biomass decomposition, and endophytic interactions with plants (Green et al. 2008; Crenshaw et al. 2008; Challacombe et al.

2019). Species of Mortierellomycota live as saprotrophs in soil, on decaying leaves and other organic material and decompose their remains (Blackwell 2011; Wilfried et al. 2015; Lücking et al. 2016; Zhang et al. 2019). Overall, the community assembly of the rhizospheric microbes of C. fasciculata appear to be common to plants in natural ecosystems.

While I discovered that neither richness nor evenness differed significantly between samples in bacterial and fungi datasets (Table 4.3), selection by legume species on microbial communities can be very specific (Wille et al. 2018). For instance, chickpea, lentil, and pea have different root‐associated fungal communities in general, but arbuscular mycorrhizal fungi (AMF) communities were not found to differ among the three legumes (Borrell et al., 2017). Unlike richness and evenness, both bacterial and fungal beta diversity differed significantly among rhizosphere soils, which is similar to results from a few studies that have so far tested the influence of intraspecific genetic variation of legumes on rhizosphere microbial communities.

For example, Liu et al. (2019) reported that undomesticated Glycine soja (Siebold and Zucc.) had higher diversity in the rhizosphere in comparison to domesticated soybean genotypes. The

115

comparison of the soybean rhizosphere to bulk soil in their study also revealed significantly different microbial composition. In a non-legume species, Spartina alterniflora Loisel., Zogg et al. (2018) found significant differences in bacterial community composition and diversity between bulk and rhizosphere soil, which was dependent on genetic variation of this natural saltmarsh grass. Comparing alpha diversity measures for bacterial and fungal sequences in our whole dataset revealed relatively higher bacterial diversity than fungal diversity (mean observed

OTU’s for bacterial and fungal communities was 66.88 and 72, mean Shannon index was 3.89 and 2.67, and mean evenness was 0.94 and 0.64, respectively).

Figure 4.4 Boxplots of the number of observed OTUs and Shannon index for rhizosphere samples collected in each plot (spotsite). The first number in each sample name indicates the plot number, the secnd number represents the specific distance at which the sample was collected and the third number (1) represent samples collected from opposite direction in a specific distance. (A-B) bacterial communities and (C-D) fungal communities. A & C observed OTUs, B & D Shannon index.

116

Figure 4.4 (continued)

Boxplots of the Shannon index for bacterial communities collected in each plot (spotsite). The first number in each sample name indicates the plot number, the second number represents the specific distance at which the sample was collected and the third number (1) represent samples collected from opposite direction in a specific distance.

117

Figure 4.4 (continued)

Boxplots of the number of observed OTUs for rhizosphere samples collected in each plot (spotsite). The first number in each sample name indicates the plot number, the second number represents the specific distance at which the sample was collected and the third number (1) represent samples collected from opposite direction in a specific distance. (A-B) bacterial communities and (C-D) fungal communities. A & C observed OTUs, B & D Shannon index.

118

Figure 4.4 (continued)

Boxplots of the number of observed OTUs and Shannon index for rhizosphere samples collected in each plot (spotsite). The first number in each sample name indicates the plot number, the second number represents the specific distance at which the sample was collected and the third number (1) represent samples collected from opposite direction in a specific distance. (A-B) bacterial communities and (C-D) fungal communities. A & C observed OTUs, B & D Shannon index.

In this study, the first two components of the PCoA plots based on weighted UniFrac distance matrices of the sequences accounted for approximately 41.7% and 30.1 % of the total variance for bacterial and fungal communities, respectively. In an extensive survey of the potato‐ associated microbiome, Weinert et al. (2011) showed that 9% of all detected operational taxonomic units revealed a cultivar‐dependent abundance, and significant genotypic effects

119

among 10 maize inbred lines accounted for 26% of the variation in the bacterial rhizosphere composition (Emmett et al. 2017). Peiffer et al. (2013) showed that plant genotype explains a significant amount, 19%, of the total observed bacterial diversity in the rhizosphere among 27 modern maize inbred lines, although there was weak structure among bacteria of the maize inbred lines when PC1 and PC2 were plotted for PCoA. Liu et al. (2019) revealed slight soybean genotype modifying abilities for microbiome assembly. These rhizosphere effects may be influenced by the specific profile of root exudates with a high concentration of flavonoids, which are essential components of signal exchange between legumes and symbiotic rhizobia during nodule formation (Liu and Murray 2016). The influence of root exudates was investigated by

White et al. (2015), revealing that isoflavonoids significantly alter soybean rhizosphere bacterial diversity. Altogether, these findings support the hypothesis of the sensitivity of bacterial communities to intraspecific plant genotype variation.

Aside from C. fasciculata genotypes influencing rhizosphere microbiome composition, our regression analyses for both bacterial and fungal communities revealed significant influence of distance decay on microbial community structure of rhizosphere soil. In fact, the r-square values obtained for correlation between plant and soil microbe structure for both bacterial and fungal communities were less than that for combination of the influence of both plant genotype and spatial distance on microbial communities (Table 4.4). Our result is in line with several other studies that recognized decay distance influence on soil microbiome. For example, Horner-

Devine et al. (2004) found that by using the decay of community similarity with distance over a scale of centimeters to hundreds of meters in salt marsh sediments bacteria can exhibit a taxa– area relationship and that bacterial communities located close together were more similar in composition than communities located farther apart. Cho and Tiedge (2000) studied the degree

120

of endemicity of Pseudomonas strains and found a negative correlation between the genetic similarity of isolates and geographic distance at regional scales (distances ranging from 5 m to

80 km). Franklin and Mills (2003) documented microbial distance–decay patterns at smaller scales (2.5 cm to 11 m) and reported a significant distance–decay relationship with a scale- dependent slope. These studies, as well as our results, indicate that physical distance between microbial communities is an influencing factor in the structure and diversity of rhizosphere microbial communities.

Fierer and Jackson (2006) found that the diversity of microbiomes in soils with pH 5.3-

7.4, which is the range pH that I measured for our studied plots, is not highly variable. Therefore,

I conclude that our studied rhizosphere microbiome in this small scale has not been influenced by soil pH as much as by host genotypes or other untested factors. It is believed that microbiota colonizing and associating with plants enable plants to buffer local environmental changes, with a positive influence on plant fitness (Vannier et al. 2018). Vannier et al. (2018) showed that in clonal plants microorganisms are transmitted between individuals, ensuring the availability of microbe partners for the next generation of plants and dispersal of the microbes between hosts. In the previous study on this system, Dorman and Wallace (2019) showed that geographical distance did not result in isolation and diversification of microbiota as they did not find geographical structure in nodulating rhizobia in the larger scale they studied in Mississippi.

Previous work on bacterial communities from North America and South America has shown that latitude or geographic distance did not significantly influence diversity (Fierer and Jackson

2006); however, soil pH was later shown to be the most influential factor (Lauber et al. 2009).

The effects of plant species on microbial taxa are often difficult to predict, in part to the reason that plant associations with soil microorganisms is being highly context dependent (Fierer 2017).

121

I believe that at small scales, microenvironmental heterogeneity, such as pH and moisture content, likely interact with the plant genotype to shape the spatial scaling of the rhizosphere microbiota. Microenvironmental heterogeneity is proposed to be an important diversity- generating factor in the neotropical flora; It is shown that at small (0.1-1 m) scales, neotropical rain forests exhibit high heterogeneity in numerous environmental factors such as conspecifics, litter, soil factors, topography, and mutualists and pests (Svenning 2001). Decades of research has shown that properties of surface soils including pH, organic carbon concentration, salinity, texture and available nitrogen concentration exhibit an enormous range (Fierer 2017), which is a product of the main factors that affect soil formation such as climate, macro and microorganisms, and parent material and time (Jenny 1941). Even in each soil profile, environmental conditions can vary considerably across the distinct microbial habitats found in soil, including the rhizosphere, preferential water flow paths (including cracks in the soil), and animal burrows. For example, oxygen concentrations can vary from 20% to <1% from the outside to the inside of soil aggregates that are only a few millimeters in size and as a result influence bacterial content and activity of that soil aggregate. There have been several studies into viruses and their role in the soil microbiome, and phages that target specific bacteria, including Rhizobium spp. (Kimura et al. 2008, Frampton et al. 2012), but the overall effects of viruses on the composition and activity of the soil microbiome remain poorly understood. Collectively, a combination of the abovementioned factors along with plant genotype is likely to have shaped the rhizosphere microbial communities in our study species.

122

Figure 4.5 Plot of the first two components from a principal coordinates analysis showing beta diversity among rhizosphere communities in each plot. The first number in each sample name indicates the plot number and the second number represents the specific distance at which the sample was collected for (A) bacterial communities and (B) fungal communities. The first two axes account for approximately 41.7% of the total variance (p = 0.001) for bacterial communities and approximately 30.1% of the total variance (p = 0.001) for fungal communities.

123

Figure 4.5 (continued)

Plot of the first two components from a principal coordinates analysis showing beta diversity among rhizosphere communities in each plot. The first number in each sample name indicates the plot number and the second number represents the specific distance at which the sample was collected for (A) bacterial communities and (B) fungal communities. The first two axes account for approximately 41.7% of the total variance (p = 0.001) for bacterial communities and approximately 30.1% of the total variance (p = 0.001) for fungal communities.

Conclusions

Our results contribute to a growing body of evidence that plant genetic variation and physical distance might be influential at some levels on and microbial community composition. So far, the studies that have investigated this influence have primarily focused on particularly

124

specific rhizobacterial communities such as plant growth-promoting bacteria and primarily on model legumes. I studied a non-model native legume and showed that plants in physical proximity show significantly higher kinship than those more distant from one another. Our dataalso suggests that both plant genotype and physical distance might be important in shaping the genetic structure and diversity of both bacterial and fungal communities of the rhizosphere. Although a greater amount of variation in microbial genetic distances was explained by plant genetic distance plus geographic distance, I conclude that the structure of these communities is at least partially controlled by plants at local spatial scales. These interactions may be a consequence of a long history of coevolution in natural systems. Our results, which are in line with other fine scale microbial studies suggest that large scale studies of microbial diversity might have undersampled microbial diversity, which could result in a biased picture of the spatial scaling of microbial biodiversity and the incorrect conclusion that the spatial scaling of microbial biodiversity is different from that of plant diversity. Given that undersampling could result in the observation of specifically flat or nonexistent rates of distance–decay, it increases the importance of sampling effort in describing diversity patterns.

125

References

Aira, M., F. Monroy, and J. Domínguez. 2007. Eisenia fetida (Oligochaeta: Lumbricidae) modifies the structure and physiological capabilities of microbial communities improving carbon mineralization during vermicomposting of pig manure. Microbial Ecology 54: 662–671.

Anderson, M. J. 2001. A new method for non-parametric multivariate analysis of variance. Austral Ecology 26: 32– 46.

Bais H. P., T. L. Weir, L. G. Perry, S. Gilroy, J. M. Vivanco. 2006. The role of root exudates in rhizosphere interactions with plants and other organisms. Annual Review of Plant Biology 57: 233–266.

Bazghaleh, N., C. Hamel, Y. Gan, B. Tar’an, J. D. Knight. 2015. Genotype-specific variation in the structure of root fungal communities is related to chickpea plant productivity. Applied and Environmental Microbiology 81: 2368–2377.

Beckers, B., M. Op De Beeck, N. Weyens, W. Boerjan, J. Vangronsveld. 2017. Structural variability and niche differentiation in the rhizosphere and endosphere bacterial microbiome of field-grown poplar trees. Microbiome 5:25.

Berendsen, R. L., C. M. J. Pieterse, and P. A. H. M. Bakker. 2012. The rhizosphere microbiome and plant health. Trends in Plant Science 17: 478–486.

Berlanas C, M. Berbegal, G. Elena, M. Laidani, J. F. Cibriain, A. Sagues, and D. Gramaje. 2019. The fungal and bacterial rhizosphere microbiome associated with grapevine rootstock genotypes in mature and young vineyards. Front Microbiology 10: 1142.

Bertin, C., X. Yang, and L. A. Weston. 2003. The role of root exudates and allelochemicals in the rhizosphere. Plant Soil 256: 67–83

Blackwell, M. 2011. The fungi: 1, 2, 3… 5.1 million species? American Journal of Botany 98: 426–438.

Bolyen E, J. R. Rideout, M. R. Dillon, N. A. Bokulich, C. C. Abnet, et al. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37: 852–857.

Borrell, A. N., Y. Shi, Y. Gan, L. D. Bainard, J. J. Germida, C. Hamel. 2017. Fungal diversity associated with pulses and its influence on the subsequent wheat crop in the Canadian prairies. Plant Soil 414: 13–31.

Bossuyt, B., M. Hermy, and J. Deckers. 1999. Migration of herbaceous plant species across ancient‐recent forest ecotones in central Belgium. Journal of Ecology 87: 628– 638.

126

Bouffaud, M. L., M. Kyselková, B. Gouesnard, G. Grundmann, D. Muller, Y. Moënne-Loccoz. 2012. Is diversification history of maize influencing selection of soil bacteria by roots? Molecular Ecology 21: 195–206.

Brunet, J. and G. Von Oheimb. 1998. Migration of vascular plants to secondary woodlands in southern Sweden. Journal of Ecology 86: 429– 438.

Cain, M. L., H. Damman, and A. Muir. 1998. Seed dispersal and the Holocene migration of woodland herbs. Ecological Monographs 68: 325– 347.

Challacombe, J. F., C.N. Hesse, L. M. Bramer, et al. 2019. Genomes and secretomes of Ascomycota fungi reveal diverse functions in plant biomass decomposition and pathogenesis. BMC Genomics 20: 976.

Cheplick, G. P. 1998. Seed dispersal and seedling establishment in grass populations. In G. P. Cheplick [ed.], Population biology of grasses, 84– 105. Cambridge University Press, Cambridge, UK.

Cho, J. C., J. M. Tiedje. 2000. Biogeography and degree of endemicity of fluorescent Pseudomonas strains in soil. Applied and Environmental Microbiology 66: 5448-5456

Costa, R., M. Gotz, N. Mrotzek, J. Lottmann, G. Berg, K. Smalla. 2006. Effects of site and plant species on rhizosphere community structure as revealed by molecular analysis of microbial guilds. FEMS Microbiology Ecology 56: 236-249.

Crenshaw, C., C. Lauber, R. L. Sinsabaugh, L. K. Stavely. 2008. Fungal dominance of nitrogen transformation in semi-arid grassland. Biogeochemistry 87:17–27.

Culley, T. M., T. I. Stamper, R. L. Stokes, J. R. Brzyski, N. A. Hardiman, M. R. Klooster, and B. J. Merritt. 2013. An efficient technique for primer development and application that integrates fluorescent labeling and multiplex PCR. Applications in Plant Sciences 1: 1300027.

Dalmastri, C., L. Chiarini, C. Cantale, A. Bevivino, S. Tabacchioni. 1999. Soil type and maize cultivar affect the genetic diversity of maize root-associated Burkholderia cepacia populations. Microbiol Ecology 38: 273–284.

Dellaporta, S. L., J. Wood and J. B. Hicks. 1983. A plant DNA mini preparation: Version II. Plant Molecular Biology Reporter 1: 19-21.

Dorman, H. and L. E. Wallace. 2019. Diversity of nitrogen-fixing symbionts of Chamaecrista fasciculata (Partridge Pea) across variable soils. Southeastern naturalist 80:147-164.

Edwards, J., C. Johnson, C. Santos-Medellín, E. Lurie, N. K. Podishetty, S. Bhatnagar, et al. 2015. Structure, variation, and assembly of the root-associated microbiomes of rice. Proceedings of the National Academy of Sciences 112: E911–20. 127

Emmett, B. D., N. D. Youngblut, D. H. Buckley, L. E. Drinkwater. 2017. Plant phylogeny and life history shape rhizosphere bacterial microbiome of summer annuals in an agricultural field. Frontiers in Microbiology 8: 2414.

Ettema, C. H. and D. A. Wardle. 2002. Spatial soil ecology. Trends in Ecology and Evolution 17: 177–183.

Faoro, H., A. Glogauer, G. H. Couto, E. M. de Souza, L. U. Rigo, L. M. Cruz, et al. 2012. Characterization of a new Acidobacteria-derived moderately thermostable lipase from a Brazilian Atlantic Forest soil metagenome. FEMS Microbiology Ecology 81: 386–394.

Fenster, C. B. 1991. Gene flow in Chamaecrista fasciculata (Leguminosae). I. Gene dispersal. Evolution. 45: 398–409.

Fenster, C. B., W. S. Armbruster, P. Wilson, J. D. Thomson, M. R. Dudash. 2004. Pollination syndromes and floral specialization. Annual Review of Ecology, Evolution, and Systematics 35: 375–403.

Fierer, N. 2017. Embracing the unknown: disentangling the complexities of the soil microbiome. Nature Reviews Microbiology 15: 579–590. https://doi.org/10.1038/nrmicro.2017.87

Fierer, N., and J. Ladau. 2012. Predicting microbial distributions in space and time. Nature Methods 9: 549-551.

Fierer, N., and R. Jackson. 2006. The diversity and biogeography of soil bacterial communities. Proceedings of the National Academy of Sciences, USA 103: 626-631.

Fitzpatrick, C. R., J. Copeland, P. W. Wang, D. S. Guttman, P. M. Kotanen, M. T. J. Johnson. 2018. Assembly and ecological function of the root microbiome across angiosperm plant species. Proceedings of the National Academy of Sciences 115: E1157–65.

Frampton, R. A., A. R. Pitman, and P. C. Fineran. 2012. Advances in bacteriophage-mediated control of plant pathogens. International Journal of Microbiology 2012: 326-452.

Franklin, R. B., and A. L. Mills. 2003 Multi-scale variation in spatial heterogeneity for microbial community structure in an eastern Virginia agricultural field. FEMS Microbiology Ecology 44: 335-346.

Gallart, M., K. L. Adair, J. Love, D. F. Meason, P. W. Clinton, J. Xue, et al. 2018. Host genotype and nitrogen form shape the root microbiome of Pinus radiata. Microbial Ecology 75: 419–433.

Garbeva, P. V., J. D. Van Elsas, J. A. Van Veen. 2008. Rhizosphere microbial community and its response to plant species and soil history. Plant Soil 302: 19–32. 10.1007/s11104-007- 9432-0

128

Gilbert, J. A., J. K. Jansson, R. Knight. 2014. The Earth Microbiome project: successes and aspirations. BMC Biology 12: 69.

Green, L. E., A. Porras-Alfaro, R. L. Sinsabaugh. 2008. Translocation of nitrogen and carbon integrates biotic crust and grass production in desert grassland. Journal of Ecology 96: 1076–85.

Griffiths, E., and R. S. Gupta. 2001. The use of signature sequences in different proteins to determine the relative branching order of bacterial divisions: evidence that Fibrobacter diverged at a similar time to Chlamydia and the Cytophaga-Flavobacterium-Bacteroides division. Microbiology 147: 2611–2622.

Gupta, R. S. 2000. The phylogeny of proteobacteria: relationships to other eubacterial phyla and eukaryotes. FEMS Microbiology Reviews 24: 367–402.

Hacquard, S. 2016. Disentangling the factors shaping microbiota composition across the plant holobiont. New Phytologist 209: 454-457.

Hardy, O. J., and X. Vekemans. 2002. SPAGeDi: a versatile computer program to analyze spatial genetic structure at the individual or population levels. Molecular Ecology Notes 2: 618– 620.

Harper, J. L. 1977. Population Biology of Plants. Academic Press, London.

Hartmann, A., M. Schmid, D. Van Tuinen, and G. Berg. 2009. Plant-driven selection of microbes. Plant Soil 321: 235–257.

Hartman, K., M.G. van der Heijden, V. Roussely-Provent, et al. 2017. Deciphering composition and function of the root microbiome of a legume plant. Microbiome 5: 2

Holm, S. 1979. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6: 65–70.

Holm. J. B., M. S. Humphrys, C. K. Robinson, M. L. Settles, S. Ott, L. Fu, H. Yang, P. Gajer, et al. 2019. Ultrahigh-throughput multiplexing and sequencing of >500-basepair amplicon regions on the Illumina HiSeq 2500 platform. mSystems 4: e00029-19.

Hooper, D. U., D. E. Bignell, V. K. Brown, L. Brussard, J. M. Dangerfield, D. H. Wall, et al. 2000. Interactions between aboveground and belowground biodiversity in terrestrial ecosystems: patterns, mechanisms, and feedbacks. Bioscience 50: 1049–1061. doi: 10.1641/0006-

Horner-Devine, M. C., M. Lage, J. B. Hughes, B. J. M. Bohannan. 2004. A taxa–area relationship for bacteria. Nature 432: 750–1753.

Howe, H. F., and J. Smallwood. 1982. Ecology of seed dispersal. Annual Review of Ecology and Systematics 13: 201– 228. 129

IBM Corp. 2017. IBM SPSS Statistics for Winodows, version 25. IBM Corp., Armonk, New York, USA.

Inceoglu, Ö., J. Falcão Salles, L. van Overbeek, J. D. van Elsas. 2010. Effects of plant genotype and growth stage on the betaproteobacterial communities associated with different potato cultivars in two fields. Applied and Environmental Microbiology 76: 3675–3684.

Irwin, H. S. and R. C. Barneby. 1982. The American Cassinae: A synoptical revision of Leguminosae tribe Cassieae subtribe Cassinae in the New World. Memoirs of The New York Botanical Garden 35: 1–918.

Jacquemyn, H., R. Brys, K. Vandepitte, O. Honnay, I. Roldán‐Ruiz. 2006. Fine‐scale genetic structure of life history stages in the food‐deceptive orchid Orchis purpurea. Molecular Ecology 15: 2801– 2808.

Jenny, H. 1941. Factors of soil formation. A system of quantitative pedology. New York, NY, USA: McGraw-Hill. 281 p.

Jiang, Y., S. Li, R. Li, J. Zhang, Y. Liu, L. Lv, et al. 2017. Plant cultivars imprint the rhizosphere bacterial community composition and association networks. Soil Biology and Biochemistry 109: 145–155.

Kamvar, Z. N., J. F. Tabima, and N. J. Grünwald. 2014. Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peer J 2: e281.

Kimura, M., Z. J. Jia, N. Nakayama, and S. Asakawa. 2008. Ecology of viruses in soils: past, present, and future perspectives. Soil Science and Plant Nutrition 54: 1–32.

Lauber, C. L., M. Hamady, R. Knight, and N. Fierer. 2009. Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Applied and Environmental Microbiology 75: 5111–5120.

Lemanceau, P., M. Barret, S. Mazurier, S. Mondy, B. Pivato, T. Fort, et al. 2017. Plant communication with associated microbiota in the spermosphere, rhizosphere and phyllosphere, in How Plants Communicate with Their Biotic Environment ed. Beards G. (London: Academic Press Ltd.) 101–133.

Liu, C. W., and J. D. Murray. 2016. The Role of Flavonoids in Nodulation Host-Range Specificity: An Update. Plants (Basel, Switzerland) 53: 33.

Liu, F., T. Hewezi, S. L. Lebeis, et al. 2019. Soil indigenous microbiome and plant genotypes cooperatively modify soybean rhizosphere microbiome assembly. BMC Microbiolgy 19: 201.

Lojewski, N. R., D. G. Fischer, J. K. Bailey, J. A. Schweitzer, T. G. Whitham, and S. C. Hart, 2009. Genetic basis of aboveground productivity in two native Populus species and their hybrids. Tree Physiology 29: 1133–1142. 130

Lozupone, C., M. Hamady, and R. Knight. 2006. UniFrac – an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics 7: 371.

Lozupone, C. A., M. Hamady, S. T. Kelley, R. Knight. 2007. Quantitative and qualitative β diversity measures lead to different insights into factors that structure microbial communities. Applied and Environmental Microbiology 73: 1576–1585.

Lucking, R., B. P. Hodkinson, and S. D. Leavitt. 2016. The classification of lichenized fungi in the Ascomycota and Basidiomycota – Approaching one thousand genera. Bryologist 119: 361–416.

Ludwig, W., R. Rossellomora, R. Aznar, et al. 1995. Comparative sequence analysis of 23S ribosomal-RNA from proteobacteria. Systematic and Applied Microbiology 18: 164–188.

Lundberg. D. S, S. L. Lebeis, S. H. Paredes, S. Yourstone, J. Gehring, S. Malfatti, et al. 2012. Defining the core Arabidopsis thaliana root microbiome. Nature 488: 86–90.

Madritch, M. D., S.L. Greene, and R.L. Lindroth. 2009. Genetic mosaics of ecosystem functioning across aspen-dominated landscapes. Oecologia 160: 119–127.

Madritch, M. D., and R. L. Lindroth. 2011. Soil microbial communities adapt to genetic variation in leaf litter inputs. Oikos 120: 1696–1704.

Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27: 209–220.

Manz, W., R. Amann, W. Ludwig, M. Wagner, and K.-H. Schleifer. 1992. Phylogenetic oligonucleotide probes for the major subclasses of proteobacteria: Problems and solutions. Systematic and Applied Microbiology 15: 593–600.

Marques J. M., T. F. da Silva, R. E. Vollu, A. F. Blank, G. C. Ding, et al. 2014. Plant age and genotype affect the bacterial community composition in the tuber rhizosphere of field- grown sweet potato plants. FEMS Microbiology Ecology 88: 424–435.

Matlack, G. R. 1994. Plant species migration in a mixed‐history forest landscape in eastern North America. Ecology 75: 1491– 1502.

Mazzola, M., D. L. Funnell, and J. M. Raaijmakers. 2004. Wheat cultivar-specific selection of 2,4-diacetylphloroglucinol-producing fluorescent Pseudomonas species from resident soil populations. Microbial Ecology 48: 338–348.

McMurdie, P. J., and S. Holmes. 2013. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8: e61217.

Mendes, L. W., E. E. Kuramae, A. A. Navarrete, J. A. van Veen, and S. M. Tsai. 2014. Taxonomical and functional microbial community selection in soybean rhizosphere. The ISME Journal 8: 1577–87. 131

Micallef, S. A., S. Channer, M. P. Shiaris, A. Colón-Carmona. 2009. Plant age and genotype impact the progression of bacterial community succession in the Arabidopsis rhizosphere. Plant Signaling and Behavior 4: 777–780.

Moulin, L., A. Munive, B. Dreyfus, and C. Boivin-Masson. 2001. Nodulation of legumes by members of the betasubclass of proteobacteria. Nature 411: 948–950.

Navarrete, A. A., E. E. Kuramae, M. de Hollander, A. S. Pijl, J. A. van Veen, and S. M. Tsai. 2013. Acidobacterial community responses to agricultural management of soybean in Amazon forest soils. FEMS Microbiology Ecology 83: 607–621.

Oksanen, J., F. G. Blanchet, M. Friendly, R. Kindt, P. Legendre, D. McGlinn, P. R. Minchin, et al. 2018. vegan: Community Ecology Package. R package version 2.5-2https://CRAN.R- project.org/package=vegan

Pepe-Ranney, C., A. N. Campbell, C. N. Koechli, et al. 2016. Unearthing the ecology of soil microorganisms using a high resolution DNASIP approach to explore cellulose and xylose metabolism in soil. Frontiers in Microbiology 7: 703.

Qiao, Q., F. Wang, J. Zhang, Y. Chen, C. Zhang, G. Liu, et al. 2017. The variation in the rhizosphere microbiome of cotton with soil type, genotype and developmental stage. Scientific Reports 7: 3940-3952.

Parker, M. A. 2012. Legumes select symbiosis-island–sequence variants in Bradyrhizobium. Molecular Ecology 21: 1769–17778.

Parker, M. A. 2015. The spread of Bradyrhizobium lineages across host legume clades: From Abarema to Zygia. Microbial Ecology 69: 630–640.

Parker, M. A., and D. A. Kennedy. 2006. Diversity and relationships of Bradyrhizobium from legumes native to eastern North America. Canadian Journal of Microbiology 52: 1148– 1157.

Parker, M. A., and A. Rousteau. 2014. Mosaic origins of Bradyrhizobium legume symbionts on the Caribbean island of Guadeloupe. Molecular Phylogenetics and Evolution 77: 110– 115.

Parker, M. A., B. Lafay, J. J. Burdon, and P. van Berkum. 2002. Conflicting phylogeographic patterns in rRNA and nifD indicate regionally restricted gene transfer in Bradyrhizobium. Microbiology 148: 2557–2565.

Parsley, L. C., J. Linneman, A. M. Goode, K. Becklund, I. George, R. M. Goodman, et al. 2011. Polyketide synthase pathways identified from a metagenomic library are derived from soil Acidobacteria. FEMS Microbiology Ecology 78: 176–187.

Paterson, E., T. Gebbing, C. Abel, A. Sim, G. 2007. Telfer Rhizodeposition shapes rhizosphere microbial community structure in organic soil. New Phytologist 173: 600-610 132

Peakall, R., and P. E. Smouse. 2012. GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28: 2537-2539.

Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, et al. 2011. Scikit- learn: machine learning in python. Journal of Machine Learning Research 12: 2825.

Peiffer, J. A., A. Spor, O. Koren, Z. Jin, S. G. Tringe, J. L. Dangl, et al. 2013. Diversity and heritability of the maize rhizosphere microbiome under field conditions. Proceedings of the National Academy of Sciences of the United States of America 110: 6548–6553.

Philippot, L., J. M. Raaijmakers, P. Lemanceau, and W. H. van der Putten. 2013. Going back to the roots: the microbial ecology of the rhizosphere. Nature Reviews Microbiology 11: 789–799.

Picard, C., and M. Bosco. 2005. Maize heterosis affects the structure and dynamics of indigenous rhizospheric auxins‐producing Pseudomonas populations. FEMS Microbiology Ecology 53: 349– 357.

Picard, C., M. Bosco. 2006. Heterozygosis drives maize hybrids to select elite 2,4- diacethylphloroglucinol-producing Pseudomonas strains among resident soil populations. FEMS Microbiology Ecology 58: 193–204.

Pérez-Jaramillo, J. E., R. Mendes, J. M. Raaijmakers. 2016. Impact of plant domestication on rhizosphere microbiome assembly and functions. Plant Molecular Biology 90: 635–644.

Porter, S. S., and J. L. Sachs. 2020. Agriculture and the disruption of plant–microbial symbiosis. Trends in Ecology and Evolution, DOI: 10.1016/j.tree.2020.01.006

Rayner, A. D. M., and L. Boddy. 1988. Fungal decomposition of wood, its biology and ecology. John Wiley and Sons, Ltd., Chichester, United Kingdom.

Rosenberg, M. S., and C. D. Anderson. 2011. PASSaGE: Pattern Analysis, Spatial Statistics and Geographic Exegesis. Version 2. Methods in Ecology and Evolution 2: 229-232.

Reinhold-Hurek, B., W. Bünger, C. S. Burbano, M. Sabale, and T. Hurek. 2015. Roots shaping their microbiome: global hotspots for microbial activity. Annual Review of Phytopathology 53: 403–424.

Roesch, L. F. W., F. L. Olivares, M. L. M. Pereira Passaglia, P. A. Selbach, E. L. Saccol de Sa, and F.A. Oliveira de Camargo. 2006. Characterization of diazotrophic bacteria associated with maize: effect of plant genotype, ontogeny and nitrogen-supply. World Journal of Microbiology and Biotechnology 22: 967–974.

Quast, C., E. Pruesse, P. Yilmaz, J. Gerken, T. Schweer, et al. 2012. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research 41: D590–D596.

133

RStudio Team. 2012. RStudio: Integrated development environment for R, version 1.1.456. Boston, Massachussete, USA, website: http://www.rstudio.com [accessed 20 March 2020].

Samad, A., F. Trognitz, S. Compant, L. Antonielli, A. Sessitsch. 2017. Shared and host-specific microbiome diversity and functioning of grapevine and accompanying weed plants. Environmental Microbiology 19: 1407–24.

Sapkota, R., K. Knorr, L. N. Jørgensen, K. A. O’Hanlon, M. Nicolaisen, 2015. Host genotype is an important determinant of the cereal phyllosphere mycobiome. New Phytologist 207: 1134–1144.

Sasse, J., E. Martinoia, and T. Northen. 2018. Feed your friends: do plant exudates shape the root microbiome? Trends in Plant Science 23: 25–41.

Schreiter, S., G. C. Ding, H. Heuer, G. Neumann, M. Sandmann, R. Grosch, et al. 2014. Effect of the soil type on the microbiome in the rhizosphere of field‐grown lettuce. Frontiers in Microbiology 5: 144-160.

Schweitzer, J. A., J. K. Bailey, D. G. Fischer, C. J. LeRoy, E. V. Lonsdorf, T. G. Whitham, and S. C. Hart. 2008. Plant–soil–microorganism interactions: Heritable relationship between plant genotype and associated soil microorganisms. Ecology 89: 773–781.

Schweitzer, J. A., J. K., Bailey, B.J. Rehill, G.D. Martinsen, S.C. Hart, R.L. Lindroth, et al. 2004. Genetically based trait in a dominant tree affects ecosystem processes. Ecology Letters 7: 127–134.

Schweitzer, J. A., D. G. Fischer, B. J. Rehill, S. C. Wooley, S. A. Woolbright, R. L. Lindroth, et al. 2011. Forest gene diversity is correlated with the composition and function of soil microbial communities. Population Ecology 53: 35–46.

Schweitzer, J. A., M. D. Madritch, E. Felker-Quinn, J. K. Bailey. 2012. From genes to ecosystems: how plant genetics links above- and below-ground processes. In: Wall D (ed) Soil ecology and ecosystem services. Oxford University Press, Oxford, pp 82–98

Singer, S., J. Doyle, G. May, S. Cannon, S. Maki, and D. Illut. 2009. Exploring Chamaecrista fasciculata genomics data [online]. Website: http://serc.carleton.edu/exploring_genomics/chamaecrista/chamaecrista_tr.html [accessed 10 Jannuary 2020]

Singh, B. K., P. Milard, A. S. Whitely, J. C. Murrell. 2004. Unravelling rhizosphere-microbial interactions: opportunities and limitations. Trends Microbiol1 2: 386–393.

Spain, A., L. Krumholz, and M. Elshahed. 2009. Abundance, composition, diversity and novelty of soil Proteobacteria. The ISME Journal 3: 992–1000.

134

Sugiyama, A., Y. Ueda, T. Zushi, H. Takase, K. Yazaki. 2014. Changes in the bacterial community of soybean rhizospheres during growth in the field. PLoS One 9: e100709.

Sun, X. G. Lyu, Y. Luan, Z. Zhao, H. Yang, D. Su. 2018. Analyses of microbial community of naturally homemade soybean pastes in liaoning province of China by illumina miseq sequencing. Food Research International 111: 50-57.

Svenning, J. 2001. On the role of microenvironmental heterogeneity in the ecology and diversification of neotropical rain-forest palms (Arecaceae). The Botanical Review 67: 1– 53.

Thomas, F., J. H. Hehemann, E. Rebuffet, M. Czjzek, G. Michel. 2011. Environmental and gut bacteroidetes: the food connection. Frontiers in Microbiology 2: 93.

UNITE Community. 2019. UNITE QIIME release for Fungi 2. Version 18.11.2018. UNITE Community. https://doi.org/10.15156/BIO/786349

Vannier, N., C. Mony, A. K. Bittebiere, S. Michon-Coudouel, M. Biget, and P. Vandenkoornhuyse, 2018. A microorganisms' journey between plant generations. Microbiome 6: 79.

Vekemans, X., and O. J. Hardy. 2004. New insights from fine-scale spatial genetic structure analyses in plant populations. Molecular Ecology 13: 921e935.

Vettori, C. D., G. Paffetti, G. Pietramellara, E. Stotzky, B. Gallori. 1996. Amplification of bacterial DNA bound on clay minerals by the random amplified polymorphic DNA (RAPD) technique. FEMS Microbiology Ecology 20: 251–260.

Wagner, M. R., D. S. Lundberg, T. G. del Rio, S. G. Tringe, J. L. Dangl, T. Mitchell‐Olds. 2016. Host genotype and age shape the leaf and root microbiomes of a wild perennial plant. Nature Communications 7: 12151.

Wardle, D. A., R. D. Bardgett, J. N. Klironomos, H. Setala, W. H. Van der Putten, and D. H. Wall. 2004. Ecological linkages between aboveground and belowground biota. Science 304: 1629–1633.

Watling, R. 1995. Assessment of fungal diversity—macromycetes, the problems. Canadian Journal of Botany 73: S15-S24.

Weinert, N., Y. Piceno, G. C. Ding, R. Meincke, H. Heuer, G. Berg, M. Schloter, G. Andersen, K. Smalla. 2011. PhyloChip hybridization uncovered an enormous bacterial diversity in the rhizosphere of different potato cultivars: many common and few cultivar-dependent taxa. FEMS Microbiology Ecology 75: 497–506.

White, L. J., K. Jothibasu, R. N. Reese, V.S. Brözel, and S. Subramanian. 2015. Spatio Temporal Influence of Isoflavonoids on Bacterial Diversity in the Soybean Rhizosphere. Molecular Plant-Microbe Interactions 28: 22-29. 135

Wilfried, T. et al. 2015. Conserving the functional and phylogenetic trees of life of European tetrapods. Philosophical Transactions of the Royal Society B Biological Sciences 370: 20140005–20140005.

Wille, L., M. M. Messmer, B. Studer, and P. Hohmann. 2018. Insights to plant–microbe interactions provide opportunities to improve resistance breeding against root diseases in grain legumes. Plant, Cell and Environment 42: 20–40.

Willson, M. F. 1993. Dispersal mode, seed shadows, and colonization patterns. Vegetatio 107/108: 261– 280.

Xiao, ., W. Chen, L. Zong, J. Yang, S. Jiao, Y. Lin, et al. 2017. Two cultivated legume plants reveal the enrichment process of the microbiome in the rhizocompartments. Molecular Ecology 26: 1641–51.

Zancarini A., Mougel C., Voisin A.-S., Prudent M., Salon C., Munier-Jolain N. (2012). Soil nitrogen availab Zancarini, A., C. Mougel, A. S. Voisin, M. Prudent, C. Salon, N. Munier-Jolain. 2012. Soil nitrogen availability and plant genotype modify the nutrition strategies of M. truncatula and the associated rhizosphere microbial communities. PLoS ONE 7: e47096.

Zgadzaj, R., R. Garrido-oter, D. Bodker, A. Koprivova, P. Schulze-lefert. 2016. Root nodule symbiosis in Lotus japonicus drives the establishment of distinctive rhizosphere, root, and nodule bacterial communities. Proceedings of the National Academy of Sciences USA 113: E7996–E8005.

Zhang, T., Z. Wang, X. Lv, et al. 2019. High-throughput sequencing reveals the diversity and community structure of rhizosphere fungi of Ferula Sinkiangensis at different soil depths. Scientific Reports 9: 6558.

Zhou, N., S. Zhao, C. Y. Tian. 2017. Effect of halotolerant rhizobacteria isolated from halophytes on the growth of sugar beet (Beta vulgaris L.) under salt stress. FEMS Microbiology Letters 364: fnx091. 10.1093/femsle/fnx091

Zogg, G. P., S. E. Travis, and D. A. Brazeau. 2018. Strong associations between plant genotypes and bacterial communities in a natural salt marsh. Ecology and Evolution 8: 4721–4730.

136