bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
1 Horizontal gene transfer of a unique nif island drives convergent evolution of free-living
2 N2-fixing Bradyrhizobium 3
4 Jinjin Tao^, Sishuo Wang^, Tianhua Liao, Haiwei Luo*
5
6 Simon F. S. Li Marine Science Laboratory, School of Life Sciences and State Key Laboratory of
7 Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong SAR 8 9 ^These authors contribute equally to this work. 10
11 *Corresponding author:
12 Haiwei Luo 13 School of Life Sciences, The Chinese University of Hong Kong 14 Shatin, Hong Kong SAR 15 Phone: (+852) 39436121 16 E-mail: [email protected] 17 18 Running Title: Free-living Bradyrhizobium evolution
19 Keywords: free-living Bradyrhizobium, nitrogen fixation, lifestyle, HGT
1
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
20 Summary
21 The alphaproteobacterial genus Bradyrhizobium has been best known as N2-fixing members that
22 nodulate legumes, supported by the nif and nod gene clusters. Recent environmental surveys
23 show that Bradyrhizobium represents one of the most abundant free-living bacterial lineages in
24 the world’s soils. However, our understanding of Bradyrhizobium comes largely from symbiotic
25 members, biasing the current knowledge of their ecology and evolution. Here, we report the
26 genomes of 88 Bradyrhizobium strains derived from diverse soil samples, including both
27 nif-carrying and non-nif-carrying free-living (nod free) members. Phylogenomic analyses of
28 these and 252 publicly available Bradyrhizobium genomes indicate that nif-carrying free-living
29 members independently evolved from symbiotic ancestors (carrying both nif and nod) multiple
30 times. Intriguingly, the nif phylogeny shows that all nif-carrying free-living members comprise a
31 cluster which branches off earlier than most symbiotic lineages. These results indicate that
32 horizontal gene transfer (HGT) promotes nif expansion among the free-living Bradyrhizobium
33 and that the free-living nif cluster represents a more ancestral version compared to that in
34 symbiotic lineages. Further evidence for this rampant HGT is that the nif in free-living members
35 consistently co-locate with several important genes involved in coping with oxygen tension
36 which are missing from symbiotic members, and that while in free-living Bradyrhizobium nif and
37 the co-locating genes show a highly conserved gene order, they each have distinct genomic
38 context. Given the dominance of Bradyrhizobium in world’s soils, our findings have implications
2
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
39 for global nitrogen cycles and agricultural research.
40 Introduction
41 Members of the slow-growing genus Bradyrhizobium constitute an important group of
42 rhizobia [1, 2]. They symbiotically fix N2 with diverse legume tribes and are the predominant
43 symbionts of a wide range of nodulating legumes [3, 4]. Recent studies, however, have provided
44 accumulating evidence for the existence of close relatives of rhizobia including Bradyrhizobium
45 that do not carry nod genes which are essential for the establishment of nodulation, and thus are
46 not able to nodulate legumes [5]. Increasingly, Bradyrhizobium strains without nod have been
47 found to be particularly abundant in soils [6-8]. VanInsberghe et al [8] showed that
48 non-symbiotic Bradyrhizobium was the dominant bacterial lineage in North America coniferous
49 soils. A global atlas of the dominant soil-dwelling bacteria based upon 16S rRNA gene amplicon
50 sequencing indicated that the genus Bradyrhizobium was the most abundant bacterial lineage in
51 soils across the world [6]. A recent study suggested that ancestral lineages of Bradyrhizobium
52 might adapt to a free-living lifestyle, highlighting the importance of the free-living lifestyle in
53 the evolution of Bradyrhizobium [9].
54 Although the majority of N2 fixation in terrestrial ecosystems is generally thought to be
55 performed by symbiotic rhizobia [10-12], free-living N2-fixing bacteria may be important
56 contributors to the nitrogen budgets in a number of environments, for example soil ecosystems
57 that lack leguminous plants [13]. Free-living N2 fixation also occurs in understudied ecosystems
3
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
58 such as deep soil and canopy soil [14, 15], which may lead to the underestimation of its global
59 rates. Rhizobia are generally thought to be capable of N2 fixation only in nodules [16, 17]. The
60 exceptions are found in some members of Bradyrhizobium and Azorhizobium which were shown
61 to be able to fix N2 in the free-living state [18, 19]. Such cases in Bradyrhizobium were mostly
62 explored in photosynthetic members [18, 19], but have recently been found in other strains. For
63 instance, Bradyrhizobium sp. AT1, a non-symbiotic strain isolated from the root of sweet potato
64 was reported to fix N2 with a rate comparable to the photosynthetic strain Bradyrhizobium sp.
65 ORS278 under the molecular oxygen (O2) concentrations at 1-5% (for both) in the free-living
66 state [20]. Another example is Bradyrhizobium sp. DOA9, which harbors a nif cluster on its
67 chromosome and another on a symbiotic plasmid: while both could participate in N2 fixation
68 during symbiosis, only the chromosomal one participated in N2 fixation in the free-living state
69 [21].
70 Compared with the nodules, the condition of soils is harsh for N2 fixation for several
71 reasons. First, in contrast to being at nanomolar levels in nodules, the O2 concentration in soils
72 could be up to atmospheric levels [22], which is highly detrimental to N2 fixation, a process that
73 requires an anaerobic microenvironment [23]. Second, unlike symbiotic rhizobia, free-living
74 rhizobia lack readily available organic matter provided by the host plants in exchange for fixed
75 nitrogen. Last, free-living soil bacteria are frequently exposed to stresses like droughts, high
4
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
76 osmolarity, and high temperature. It is therefore imperative to investigate the strategies that
77 free-living N2-fixing Bradyrhizobium may use to deal with the harsh condition in soils.
78 Despite the potential ecological significance, free-living Bradyrhizobium, particularly
79 those able to fix N2, have been poorly characterized. To this end, we isolated 88 Bradyrhizobium
80 and five related strains from soils of soybean cropland, artificial park, forest, and grassland at
81 several geographic locations in China, and had their genomes sequenced. By building a
82 phylogenomic tree and reconstructing the ancestral lifestyles with additional 252 genomes where
83 42 are free-living Bradyrhizobium strains deposited in public databases, we revealed complex
84 lifestyle shifts along the evolutionary history of Bradyrhizobium. Notably, we showed that
85 horizontal gene transfer (HGT) of a unique nif island among free-living members may have
86 facilitated their transitions from symbiotic lifestyle and adaptation to soil habitats.
87
88 Methods and Materials
89 Detailed methods are described in Supplementary Text S1. In brief, we collected soil
90 samples from five different sites in China (Fig. S1A) that cover several ecosystem types: soybean
91 cropland (Heihe and Lvliang), artificial park (Shenzhen), undeveloped forest (Lanzhou), and
92 grassland (Hefei) (see Fig. S1A and Data Set S1 for details). Five different media were applied to
93 retrieve Bradyrhizobium from the samples. We obtained 93 isolates that likely resembled
94 Bradyrhizobium, and had their genomes sequenced with the Hiseq X platform (Illumina),
5
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
95 followed by genome assembly and gene identification using SPAdes v3.10.1 [24] and Prokka
96 v1.12 [25], respectively. Completeness of these genomes was calculated using CheckM v1.0.7
97 [26] with default parameters.
98 The phylogenomic tree of Bradyrhizobium was constructed using IQ-Tree v1.6.2 [27]
99 with the 93 strains isolated in the current study and 252 genomes deposited in the NCBI
100 Genbank database based on amino acid sequences of 123 shared single-copy genes identified by
101 OrthoFinder v2.2.7 [28]. Bradyrhizobium members were divided into three lifestyle types based
102 on the presence or absence of nod and nif genes (Data Sets S1 and S2). Strains that possess
103 nifBDEHKN and nodABCIJ genes (at most one missing gene allowed; the same thereafter) were
104 classified as symbiotic (Sym), strains possessing only nifBDEHKN genes were classified as
105 free-living N2-fixing (FLnif), and those lacking both nod and nif genes were classified as
106 free-living non-N2-fixing (FLnonnif). The lifestyle of ancestral nodes in the phylogenomic tree was
107 inferred using Mesquite v3.61 [29] by the maximum parsimony reconstruction method, which
108 infers the ancestral lifestyles by minimizing the number of steps of lifestyle change along the
109 phylogenomic tree. Similar to a recent study [9], we did not apply the methodologically complex
110 maximum likelihood method because the frequent lifestyle transitions across short branches in
111 the Bradyrhizobium phylogeny likely leads to overestimates of the transition rate and hence
112 inaccurate reconstruction of ancestral lifestyles.
6
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
113 Metadata of 4,958 sequence files (runs) were retrieved from the NCBI Sequence Read
114 Archive (SRA) using the term “nifH metagenome” (last accessed in December 2020). Raw data
115 were downloaded using the SRA Toolkit (https://github.com/ncbi/sra-tools) and were further
116 processed with QIIME2 [30]. We applied evolutionary placement methods to assign the short
117 sequence reads to the four phylogenetically distinguishable types of nif clusters (i.e., FL, PB,
118 Sym and SymBasal; see Fig. 2) using PaPaRa v2.5 [31] and EPA-ng v0.3.8 [32]. The normalized
119 abundance of each type of Bradyrhizobium nifH genes was calculated as the number of reads
120 assigned to the corresponding nifH type divided by the total number of reads in the amplicon
121 sequencing data set assigned to Bradyrhizobium.
122
123 Results and discussion
124 Newly sequenced free-living strains expand ecological diversity of Bradyrhizobium
125 We isolated and sequenced the genomes of 93 Bradyrhizobium strains initially identified
126 by 16S rRNA gene (five were later shown to belong to Afipia based on phylogenomic
127 construction; see below) isolated from diverse types of soils, i.e., lateritic red earths, black soils,
128 brown coniferous forest soils, cinnamon soils, and yellow-brown earths in different sites of
129 China (Figure S1). Among these isolates, 81 were non-symbiotic strains, as evidenced by the
130 lack of the nodulation-determining nod genes, four of which carried nif genes. The remaining
131 twelve were potential symbiotic strains that were likely released from decaying nodules: 11 of
7
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
132 them carried nod genes and the other one phylogenetically clustered with photosynthetic
133 members which use a nod-independent strategy for nodulation [33] (Fig. 1); all of them were
134 isolated from the rhizosphere or bulk soil of leguminous plants (Data Set S1). Together with the
135 215 publicly available Bradyrhizobium genomes, our data set included 167 symbiotic (Sym), 22
136 free-living N2-fixing (FLnif), and 119 free-living non-N2-fixing (FLnonnif) strains (Data Sets S1
137 and S2).
138 Despite a significantly expanded data set, all the Bradyrhizobium isolates fell into seven
139 phylogenetic supergroups (Fig. 1) named in a recent study [2]. These were the Soil 1, Soil 2, B,
140 jicamae, B. elkanii, Kakadu, Photosynthetic, and B. japonicum supergroups (Fig. 1). Among the
141 93 newly isolated strains, 88 contributed to all these supergroups except for the Soil 2 and
142 Kakadu supergroups, and the remaining five strains were related to the genus Afipia in the
143 outgroup. The newly isolated strains were phylogenetically nested within those that have
144 publicly available genomes, except for the 13 strains in the B. jicamae supergroup which formed
145 a sister clade to the previously sequenced strains of that supergroup (Fig. 1). The phylogenetic
146 distribution of our newly sequenced genomes indicates a high diversity of soil-dwelling
147 Bradyrhizobium.
148 Our ancestral lifestyle reconstruction analysis inferred that the last common ancestor
149 (LCA) of Bradyrhizobium adapted to a free-living lifestyle, consistent with our prior study based
150 on a very limited taxon sampling [9] (Fig. 1). This was evident by the pattern that a vast majority
8
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
151 of the outgroup lineages as well as all members of the basal Bradyrhizobium clade (Soil 1 and
152 Soil 2 supergroups) were non-symbiotic lineages. The symbiotic lifestyle independently
153 originated eight times based on the sampled lineages, including three major origins: one in the B.
154 jicamae supergroup (FLnonnif-Sym1), one in the B. elkanii supergroup (FLnonnif-Sym3), and
155 another one at the LCA of the Kakadu, Photosynthetic, and B. japonicum supergroups
156 (FLnonnif-Sym4). These transitions occurred in relatively deep phylogenetic positions, suggesting
157 that they represent evolutionarily ancient events. Free-living lifestyles originated from symbiotic
158 strains 23 times: 13 of these events gave rise to non-N2-fixing strains and 10 gave rise to
159 N2-fixing members (Fig. 1). Transitions from symbiotic to free-living lifestyle (both Sym-FLnif
160 and Sym-FLnonnif) likely occurred recently as indicated by their shallow phylogenetic positions
161 (Fig. 1). Hence, the loss of the ability to nodulate legume plants occurred more frequently than
162 the reverse process in Bradyrhizobium, consistent with a prior study based on the analysis of
163 phenotypic markers [7]. This pattern remained when we combined FLnif and FLnonnif as a single
164 free-living lifestyle (Fig. S2). Furthermore, the free-living N2-fixing lifestyle originated from
165 free-living non-N2-fixing ancestors four times (Fig. 1). The infrequency of this type of lifestyle
166 transition does not necessarily mean that this type of transition is rare, as it may result from a
167 potential bacterial cultivation bias against those carrying the nif cluster under aerobic conditions
168 with readily available N sources.
169
9
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
170 HGT of the nif island drives the expansion of the free-living N2-fixing lifestyle
171 Free-living N2-fixing Bradyrhizobium is of much interest, because these Bradyrhizobium
172 members may play a previously unrecognized role in the N cycle in soils, especially given the
173 reportedly high abundance of Bradyrhizobium in soils [6]. We asked where the N2-fixing (nif)
174 genes in free-living Bradyrhizobium come from and whether the nif genes differ between
175 free-living and symbiotic members. Since most free-living strains evolved from their symbiotic
176 ancestors, one possibility is that free-living N2-fixing strains inherited the same version of nif
177 genes from their nodulating ancestors. Alternatively, free-living N2-fixing Bradyrhizobium might
178 have lost the entire symbiosis island which includes both nod and nif clusters among other genes,
179 and instead recruited a new set of nif genes from an external donor by HGT. To test these
180 competing hypotheses, we constructed the phylogenetic tree of the nif cluster based on the
181 concatenated alignment of the genes nifABDEHKNVX for those from the Bradyrhizobium and
182 several other rhizobia including Azorhizobium, Rhizobium, Sinorhizobium and Mesorhizobium
183 from Alphaproteobacteria, and Burkholderia and Cupriavidus from Betaproteobacteria (see
184 Materials and methods).
185 The nif genes of all Bradyrhizobium formed a monophyletic group (Fig. S3), indicating a
186 single origin of nif in this genus. Within the Bradyrhizobium clade, the nif gene tree (Fig. 2)
187 displayed substantial topological incongruence with the species tree (Fig. 1), suggesting different
188 evolutionary histories between nif genes and the core genome. In the Bradyrhizobium part of the
10
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
189 nif gene tree, the Photosynthetic Bradyrhizobium supergroup together with a strain from the B.
190 jicamae supergroup (SZCCT0283) branched first, followed by a few strains of the B. japonicum
191 supergroup (Fig. 2). The latter consisted of two clades, one exclusively constituted by symbiotic
192 strains (SymBasal; Fig. 2), some of which were early-split lineages of the B. japonicum
193 supergroup (Fig. 1), and the other mostly composed of free-living N2-fixing members (Cluster
194 FL) (Fig. 2). Nif genes of the B. jicamae and B. elkanii supergroups, which together with B.
195 japonicum comprised the three largest clades in the nif gene tree (Fig. 2), were nested within the
196 B. japonicum supergroup, suggesting that they were horizontally transferred from the latter.
197 Despite much inconsistency between the species and nif trees, but because the basal positions in
198 the nif phylogeny mostly included members from Photosynthetic, Kakadu, and B. japonicum
199 supergroups, it is likely that the nif cluster in Bradyrhizobium was first acquired by the LCA of
200 the Photosynthetic, Kakadu and B. japonicum supergroups. This is consistent with the ancestral
201 lifestyle reconstruction shown in Fig. 1 which inferred a symbiotic ancestor of the corresponding
202 ancestral node (FLnonnif-Sym4).
203 Strikingly, despite a scattered distribution in the species phylogeny (Fig. 1), 17
204 free-living N2-fixing strains grouped on the nif phylogeny (Cluster FL in Fig. 2). Also grouping
205 in Cluster FL were the nif genes from three symbiotic strains (DOA9, CCBAU 53363 and p9-20).
206 These three strains encoded another nif cluster located in the symbiotic plasmid (DOA9) or
207 symbiosis island on the chromosome (CCBAU 53363 and p9-20), which grouped with other
11
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
208 symbiotic strains in the nif phylogeny (Fig. 2). Clearly, the grouping of the nif genes belonging
209 to Cluster FL was the result of HGT. This point was further evidenced by the conserved genomic
210 context around nif genes in Cluster FL (see the next section). Moreover, the geographic locations
211 of isolates belonging to Cluster FL spans three continents (North America, South America and
212 Asia) (Fig. 1B). Remarkably, as introduced above, two strains (Bradyrhizobium sp. AT1 and
213 Bradyrhizobium sp. DOA9) whose nif genes from Cluster FL in the nif phylogeny were
214 previously shown to perform N2 fixation under micro-aerobic conditions (marked by a box
215 around the strain name in Fig. 2) [20, 21]. The above evidence together indicates that nif genes
216 from Cluster FL, which were mainly comprised by free-living Bradyrhizobium, are likely to be a
217 special group of nif genes that specifically take part in free-living N2 fixation.
218
219 A unique nif island likely contributes to the oxygen tolerance of free-living N2-fixing
220 Bradyrhizobium
221 We sought more evidence for HGT of the nif cluster by analyzing genes surrounding it.
222 This led to the identification of a ~50 kb genomic region containing nif genes (nif island)
223 conserved among all members of Cluster FL (Fig. 3A). The regions flanking the nif island were
224 not conserved across most free-living strains (Fig. S4), providing further evidence for HGT of
225 the nif island. Note that four strains of Cluster FL (BM-T, Y-H1, R2.2-H and W) shared highly
226 conserved genomic contexts (Fig. S4). These strains were closely related and comprised a
12
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
227 monophyletic group in the phylogenomic tree (marked by a box around the strain names in Fig.
228 S5), indicating that the shared genomic context of their nif islands was inherited from their LCA.
229 Additionally, the conserved genomic context of the nif island among strains from the
230 Photosynthetic Bradyrhizobium supergroup (Fig. S4) was consistent with their vertical descent
231 shown in the species phylogeny (Fig. 1). We further compared this unique nif island with the nif
232 clusters of other strains in this genus (Fig. 3).
233 The upstream boundary of the nif island in Cluster FL Bradyrhizobium (Fig. 3A) was
234 marked by nifA, which codes for a transcriptional activator required for the expression of nif
235 operons [34]. Adjacent to nifA was a suf cluster, which is responsible for synthesizing and
236 inserting Fe-S clusters into nitrogenase [35]. The core genes encoding the nitrogenase were
237 located downstream of the suf cluster, and were mainly separated into two operons, nifDKENX
238 and nifHQ. FixABCX which encode a membrane complex involved in electron transfer to
239 nitrogenase were located downstream of the nif genes [36]. Extensively studied in rhizobia, the
240 role of fixABCX in N2 fixation has been suggested for free-living diazotrophs [37]. At the
241 downstream boundary of the nif island was a mod cluster encoding molybdate transporter.
242 In general, the nif island conserved in free-living Bradyrhizobium was similar to that in
243 photosynthetic strains (Fig. 3A), as also noted in previous studies [38, 39], and that in a newly
244 sequenced soil-dwelling strain (SZCCT0283) from the B. jicamae supergroup (Fig. 3A),
245 consistent with the evolutionary relatedness of these nif islands (Fig. 2). A few differences were
13
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
246 that those from Cluster PB carried an additional copy of nifH but fewer genes in the mod operon
247 compared with those from Cluster FL (Fig. 3). Further, the early-split positions of Cluster FL and
248 PB nif genes in the phylogeny (Fig. 2) suggest an ancient evolutionary origin. Because nif genes
249 in the symbiosis island of most symbiotic strains branched later than those from the nif island
250 (Fig. 2), presumably, the nif island found in photosynthetic and free-living Bradyrhizobium
251 represent the ancestral status of symbiotic Bradyrhizobium, and it was later expanded to a
252 symbiosis island by integrating nod as well as other genes in non-photosynthetic symbiotic
253 Bradyrhizobium.
254 The nif clusters of the nif island conserved in Cluster PB and FL (Fig. 3A) shared many
255 genes with those on the symbiosis island (Fig. 3B), such as nifDKENBH and fixBCX, agreeing
256 with their essential roles in N2 fixation. However, they also displayed notable differences.
257 Apparently, the gene arrangement of the free-living nif island was more compact and more
258 conserved across different strains (Fig. 3). In contrast, in the nif cluster from the symbiosis island,
259 several genes involved in N2 fixation like nifA and fixA are often linked to nod genes and
260 separated from the nif cluster [40]. The nif island in free-living and photosynthetic members
261 additionally carried a nifV gene, which encodes homocitrate synthase to synthesize homocitrate,
262 a component of the Fe-Mo cofactor of nitrogenase [41]. The vast majority of symbiotic rhizobia
263 (not limited to Bradyrhizobium) do not harbor nifV [41], which is known to be compensated for
264 by the homocitrate provided by their legume host during the symbiotic state [42]. Some studies
14
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
265 attributed the ability of free-living N2 fixation to nifV, because this gene was detected in the
266 genomes of the few rhizobial strains able to perform N2 fixation in the free-living state [41, 43,
267 44]. Here, we found that nifV was present in around half of Bradyrhizobium genomes, not
268 restricted to the free-living and photosynthetic strains (Fig. 2). Phylogenetic analysis suggests
269 that nifV was presumably present at the LCA of Bradyrhizobium, but lost within the B.
270 japonicum supergroup (Node 1 in Fig. 2; Fig. S6). Further, early studies showed that
271 Bradyrhizobium sp. CB756, a strain carrying nifV on the symbiosis island (Fig. 3B), fixed N2 at a
272 rate more than a magnitude lower than the nif island-carrying photosynthetic Bradyrhizobium
273 (strains ORS310 and ORS322) and Azorhizobium caulinodans ORS571 in agar culture under air
274 or O2 concentrations resembling soil environments [19, 45]. This hints that the absence of nifV
275 might not be the only reason accounting for the decreased efficiency of free-living N2 fixation of
276 certain rhizobia strains, at least at higher O2 concentrations.
277 We further identified that several genes involved in O2 tolerance and stress response were
278 also specific to the nif island of free-living and photosynthetic members (Fig. 3), including glbO,
279 hspQ, and the mod operon. Specifically, glbO encodes a 17-kDa group-II truncated hemoglobin
280 that binds O2 with high affinity [46, 47]. The expression of truncated hemoglobin genes is
281 induced upon exposure to oxidative stress generated by H2O2 and hypoxia [48]. The product of
282 glbO might therefore play an important role in protecting nitrogenase from O2 inactivation,
283 endowing the free-living Bradyrhizobium with higher tolerance to O2. HspQ encodes a
15
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
284 chaperone protein, which combats the detrimental effects on proteins caused by stressors such as
285 increased temperatures, oxidative stress, and heavy metals [49]. The mod operon transports
286 extracellular molybdate with high affinity [50]. We also observed copy number differences
287 between the free-living and symbiotic members, such as the nifZ which has a function in the
288 maturation of the Fe-Mo protein nitrogenase [51], but whether the additional copies contribute to
289 adaptation to the free-living lifestyle is still unknown. In summary, we predict that in addition to
290 nifV, the genes involved in oxygen tolerance and stress response in the nif island (e.g., glbO,
291 hspQ and mod) may play a role in the adaptation to free-living lifestyle. Intriguingly, the nif
292 island from Azorhizobium, which is able to perform free-living N2 fixation, shared some of the
293 above genes like nifV, glbO and mod, and a generally similar gene arrangement with the
294 free-living nif island in Bradyrhizobium (Fig. 3C).
295 Note that, for simplicity, we defined free-living strains based on the absence of nod in the
296 current study. Four free-living N2-fixing strains, namely B. mercantei SEMIA 6399, B.
297 yuanmingense P10 130, B. liaoningense CCBAU 83689 and B. mercantei SEMIA 6399, were
298 originally isolated from nodules (Data Set S2). Interestingly, except the last isolate whose nif
299 genes were found in Cluster FL, the nif genes of the other three were embedded in symbiotic
300 lineages in the nif phylogeny (Fig. 2), indicating that the nif genes were inherited from
301 nodulating ancestors. These three strains were also found to carry genes encoding components of
302 T3SS and the effectors it injects, as evident by the genes gained or lost shown in Fig. S7C
16
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
303 (Sym-FLnonnif transitions 2, 5 and 6; Text S2), suggesting that they are likely nodule-inhabiting
304 strains that do not nodulate but with the potential to interact with plants. This is in contrast to
305 other free-living N2-fixing strains where symbiosis genes were commonly lost during lifestyle
306 transition (Sym-FLnonnif transitions 1, 3, 4 and 7-10 in Fig. S7C; Text S2). In general, the gene
307 arrangement of nif genes and surrounding regions for these strains were similar to those of
308 symbiotic strains, particularly when compared with their closest relatives, but the degree of
309 similarity varied: while P10 130 showed a very high similarity to its closest nod-carrying relative
310 CCGE-LA001 in the genomic context of the nif genes, SEMIA 6399 showed a limited similarity
311 to its closest nod-carrying relative NAS96.2, likely due to a recent insertion in SEMIA 6399 of
312 several genes between nifZ and nifH (Fig. 3B). We did not analyze CCBAU 83689 as the contig
313 carrying nif genes was too short. The results indicate the different origins of the free-living
314 N2-fixing Bradyrhizobium: while some were derived from their symbiotic ancestors, most
315 analyzed strains of this lifestyle originated via HGT of the free-living nif island (Fig. 3). It would
316 be interesting to conduct experiments to assess the detailed functions of the genes on the nif
317 island under O2 concentrations resembling those of soil environments in future.
318
319 Widespread distribution of the unique free-living nif island in the environment
320 We further asked how prevalent the nif island specific to free-living Bradyrhizobium is in
321 the environments and how it compares to the nif genes of symbiotic members. We therefore
17
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
322 assessed the normalized abundances of nifH, the most commonly used marker gene to identify
323 N2-fixing microbes [52], that potentially correspond to free-living (Cluster FL in Fig. 2),
324 photosynthetic (Cluster PB in Fig. 2), and symbiotic Bradyrhizobium (Sym and SymBasal in Fig.
325 2) in 4,958 amplicon sequencing data sets (see Materials and methods). We divided these data
326 sets into four categories according to the environments where the samples were collected: soils
327 (e.g., bulk soil, rhizosphere and freshwater lake sediment), plants (root and phyllosphere
328 samples), marine (including seawater, marine sediment, coral and mangrove samples), and others
329 (e.g., bioreactor and unidentified samples) (Data Set S3). Amplicon reads were assigned to
330 different types of Bradyrhizobium nifH according to their phylogenetic placements (see
331 Materials and methods).
332 Amplicon analysis showed that Bradyrhizobium constitute an important group of
333 diazotrophic communities in different habitats, and the relative abundance of Bradyrhizobium
334 was 7.52±18.57%, 6.67±12.75%, 17.39±18.32, 14.54±18.85% in marine, soil, plant, and other
335 environmental types, respectively (Data Set S3). The nifH that resembles those belonging to
336 Cluster FL displayed the highest normalized abundance in samples from marine, soil, and
337 habitats classified as “others” (Fig. 4; p < 0.001 for all comparisons, paired
338 Wilcoxon-Mann-Whitney test). Specifically, Cluster FL accounted for 56±47%, 46±38%, and 51
339 ±39% (mean±standard deviation) of reads that were assigned to Bradyrhizobium in these three
340 types of habitats (marine, soil, and others), respectively. In particular, for marine samples, nearly
18
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
341 all Bradyrhizobium nifH were assigned to Cluster FL (Data Set S3). For the data sets of plant
342 samples, while symbiotic nifH displayed the highest abundance (Fig. 4), nifH assigned to Cluster
343 FL also showed a considerably high normalized abundance in plants (29±39%), and was even
344 the dominant ecotype in diazotrophic communities associated with some Asteraceae species and
345 the roots of several perennial grass species (Data Set S3). Hence, though the presence of
346 free-living Bradyrhizobium-specific nifH does not necessarily indicate the occurrence of a
347 complete nif island, it is tempting to suggest that Bradyrhizobium members carrying the
348 free-living nif island i) were distributed in a variety of environments (Fig. S8), and ii) were in
349 general more abundant than that of symbiotic strains in the genus in most habitats except plants,
350 hinting at a previously unrecognized role of their nif island in the free-living lifestyle.
351 Nevertheless, it should be noted that Cluster FL nifH also displayed large variations in regard to
352 the normalized abundance between different diazotrophic communities of the same type of
353 habitat despite the general trend described above.
354
355 Caveats and concluding remarks
356 Adding the 93 soil-dwelling strains greatly expanded our knowledge of the ecology and
357 evolution of Bradyrhizobium. However, because of the bias towards symbiotic strains in
358 previous genomics research and the difficulties in cultivating those dwelling soil environments,
359 the free-living strains analyzed here may underestimate the real diversity of wild diazotrophic
19
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
360 Bradyrhizobium that adapt to a free-living lifestyle. For example, prior studies have shown that
361 Bradyrhizobium is the dominant genus in the diazotrophic communities in many soil ecosystems,
362 most of which have not been sampled for Bradyrhizobium cultivation however [53, 54]. Of the
363 limited types of soil ecosystems sampled here, the Bradyrhizobium strains we collected are not
364 necessarily representative of the wild populations in these ecosystems owing to the potential
365 cultivation bias. For example, our cultivation process was under aerobic condition and the
366 cultivation medium was rich in fixed nitrogen, conditions disfavoring the isolation of
367 diazotrophic members. Therefore, it is not clear whether important free-living diazotrophic
368 lineages occupying distinct phylogenetic positions are missing and how this complements the
369 current conclusion we reached. Another caveat to some of our findings is that some of our
370 analyses built on an over-simplified classification of Bradyrhizobium lifestyles, which was based
371 on the presence/absence of certain genes like nif and nod. In fact, isolates with a symbiosis island
372 could be ineffective (i.e., those form nodules but fix little N2 within nodules) [55]. Likewise,
373 whether the free-living strains possessing the nif island could perform N2 fixation, and if they
374 could, the N2 fixation efficiency under different O2 concentrations, remains to be studied.
375 In spite of the above caveats, our study reveals an interesting pattern of convergent
376 evolution of free-living N2-fixing Bradyrhizobium from their symbiotic ancestors. Though the
377 nested phylogenetic positions within symbiotic lineages (Fig. 1) might leave an impression that
378 free-living N2-fixing members could have inherited the nif genes from their symbiotic ancestors,
20
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
379 we provided compelling evidence that this is unlikely the case. Rather, it is the HGT of a
380 conserved nif island, which presumably represents the ancestral state of all types of nif found in
381 the Bradyrhizobium, that drives the independent transitions from symbiotic to free-living
382 N2-fixing lineages. This serves as a classic example of convergent shifts in lifestyle driven by
383 HGT of certain genes in bacteria, which, although mostly explored in pathogens or symbionts in
384 prior studies [56-58], may play a prominent role in free-living bacteria. It further hints that the
385 symbiosis island common to the vast majority of symbiotic Bradyrhizobium members could be
386 derived from the ancestral nif island potentially by recruiting symbiosis genes. This may be
387 different from the traditional view that the symbiosis island, including nif, nod and T3SS among
388 other genes, was acquired in its entirety to enable ancestral interactions between Bradyrhizobium
389 and legumes [9, 59]. Given the global dominance of Bradyrhizobium in the soil microbiota, even
390 a small proportion of them being diazotrophic members could potentially bring a considerable
391 amount of fixed nitrogen to the bulk soils and other nitrogen-limited terrestrial ecosystems. Our
392 results therefore have important implications for understanding terrestrial nitrogen cycles.
393 Because one of the major aims of modern agricultural research is to transfer the N2 fixation
394 ability to non-legume crops [60, 61] , the capacity of certain free-living Bradyrhizobium to fix N2
395 and their association with non-leguminous plants may imply potential application in agriculture.
396
397 Acknowledgements
21
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
398 We thank Xiaoyuan Feng for help with genome assembly. We appreciate Shuang Wang
399 and Jie Liu for providing soil samples and Jianjun Xu for assisting in sample collection. We are
400 grateful to Xingqin Lin for her helpful suggestions in medium design. We also thank Qin Li, Yan
401 Li, Xiaojun Wang, Xiao Chu, Hao Zhang and Danli Luo for their helpful discussion. This work
402 was supported by the Hong Kong Research Grants Council Area of Excellence Scheme
403 (AoE/M-403/16).
404
405 Data and codes availability
406 The genomic sequences of the 88 sequenced Bradyrhizobium and five Afipia strains have
407 been submitted to the NCBI databases GenBank (accessions: PRJNA698083), which will be
408 made publicly available upon the acceptance of the manuscript. The Python [62] and Ruby [63]
409 codes used for phylogenomic and comparative genomics analyses are deposited in the online
410 repository https://github.com/luolab-cuhk/Bradyrhizobium-Nif-HGT.
411
412 Competing interests
413 The authors declare no competing interests in relation to this work.
414
22
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
415 References
416 1. Ormeno-Orrillo E, Martinez-Romero E. A genomotaxonomy view of the Bradyrhizobium
417 genus. Frontiers in microbiology 2019; 10: 1334.
418 2. Avontuur JR, Palmer M, Beukes CW, Chan WY, Coetzee MPA, Blom J, et al.
419 Genome-informed Bradyrhizobium taxonomy: where to from here? Systematic and
420 Applied Microbiology 2019; 42: 427-439.
421 3. Andrews M, Andrews ME. Specificity in legume-rhizobia symbioses. International
422 journal of molecular sciences 2017; 18: 705.
423 4. Parker MA. The spread of Bradyrhizobium lineages across host legume clades: from
424 Abarema to Zygia. Microb Ecol 2015; 69: 630-640.
425 5. Garrido-Oter R, Nakano RT, Dombrowski N, Ma KW, McHardy AC, Schulze-Lefert P,
426 et al. Modular Traits of the Rhizobiales Root Microbiota and Their Evolutionary
427 Relationship with Symbiotic Rhizobia. Cell Host & Microbe 2018; 24: 155-167.
428 6. Delgado-Baquerizo M, Oliverio AM, Brewer TE, Benavent-Gonzalez A, Eldridge DJ,
429 Bardgett RD, et al. A global atlas of the dominant bacteria found in soil. Science 2018;
430 359: 320-+.
431 7. Hollowell AC, Regus JU, Gano KA, Bantay R, Centeno D, Pham J, et al. Epidemic
432 Spread of Symbiotic and Non-Symbiotic Bradyrhizobium Genotypes Across California.
433 Microbial Ecology 2016; 71: 700-710.
23
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
434 8. VanInsberghe D, Maas KR, Cardenas E, Strachan CR, Hallam SJ, Mohn WW.
435 Non-symbiotic Bradyrhizobium ecotypes dominate North American forest soils. ISME J
436 2015; 9: 2435-2441.
437 9. Wang S, Meade A, Lam H, Luo H. Evolutionary Timeline and Genomic Plasticity
438 Underlying the Lifestyle Diversity in Rhizobiales. Msystems 2020: 5(4).
439 10. Peoples MB, Craswell ET. Biological Nitrogen-Fixation - Investments, Expectations and
440 Actual Contributions to Agriculture. Plant and Soil 1992; 141: 13-39.
441 11. Cleveland CC, Townsend AR, Schimel DS, Fisher H, Howarth RW, Hedin LO, et al.
442 Global patterns of terrestrial biological nitrogen (N-2) fixation in natural ecosystems.
443 Global Biogeochemical Cycles 1999; 13: 623-645.
444 12. Herridge DF, Peoples MB, Boddey RM. Global inputs of biological nitrogen fixation in
445 agricultural systems. Plant and Soil 2008; 311: 1-18.
446 13. Vitousek PM, Menge DN, Reed SC, Cleveland CC. Biological nitrogen fixation: rates,
447 patterns and ecological controls in terrestrial ecosystems. Philos Trans R Soc Lond B Biol
448 Sci 2013; 368: 20130119.
449 14. Reed SC, Cleveland CC, Townsend AR. Relationships among phosphorus, molybdenum
450 and free-living nitrogen fixation in tropical rain forests: results from observational and
451 experimental analyses. Biogeochemistry 2013; 114: 135-147.
24
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
452 15. Matson AL, Corre MD, Burneo JI, Veldkamp E. Free-living nitrogen fixation responds to
453 elevated nutrient inputs in tropical montane forest floor and canopy soils of southern
454 Ecuador. Biogeochemistry 2015; 122: 281-294.
455 16. Lee KB, De Backer P, Aono T, Liu CT, Suzuki S, Suzuki T, et al. The genome of the
456 versatile nitrogen fixer Azorhizobium caulinodans ORS571. BMC genomics 2008; 9:
457 1-14.
458 17. Dreyfus B, Garcia JL, Gillis M. Characterization of Azorhizobium-Caulinodans Gen-Nov,
459 Sp-Nov, a Stem-Nodulating Nitrogen-Fixing Bacterium Isolated from Sesbania-Rostrata.
460 International Journal of Systematic Bacteriology 1988; 38: 89-98.
461 18. Dreyfus BL, Elmerich C, Dommergues YR. Free-Living Rhizobium Strain Able to Grow
462 on N-2 as the Sole Nitrogen-Source. Applied and Environmental Microbiology 1983; 45:
463 711-713.
464 19. Alazard D. Nitrogen-Fixation in Pure Culture by Rhizobia Isolated from Stem Nodules of
465 Tropical Aeschynomene Species. Fems Microbiology Letters 1990; 68: 177-182.
466 20. Terakado-Tonooka J, Fujihara S, Ohwaki Y. Possible contribution of Bradyrhizobium on
467 nitrogen fixation in sweet potatoes. Plant and Soil 2013; 367: 639-650.
468 21. Wongdee J, Boonkerd N, Teaumroong N, Tittabutr P, Giraud E. Regulation of nitrogen
469 fixation in Bradyrhizobium sp. strain DOA9 involves two distinct NifA regulatory
25
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
470 proteins that are functionally redundant during symbiosis but not during free-living
471 growth. Frontiers in microbiology 2018; 9: 1644.
472 22. Hunt S, Layzell DB. Gas-Exchange of Legume Nodules and the Regulation of
473 Nitrogenase Activity. Annual Review of Plant Physiology and Plant Molecular Biology
474 1993; 44: 483-511.
475 23. Gallon JR. Reconciling the Incompatible - N-2 Fixation and O-2. New Phytologist 1992;
476 122: 571-609.
477 24. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes:
478 a new genome assembly algorithm and its applications to single-cell sequencing. J
479 Comput Biol 2012; 19: 455-477.
480 25. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 2014; 30:
481 2068-2069.
482 26. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the
483 quality of microbial genomes recovered from isolates, single cells, and metagenomes.
484 Genome Res 2015; 25: 1043-1055.
485 27. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective
486 stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol
487 2015; 32: 268-274.
26
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
488 28. Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative
489 genomics. Genome biology 2019; 20: 1-14.
490 29. Maddison WP, Maddison DR (2019). Mesquite: A modular system for evolutionary
491 analysis. Version 3.61. 2019.
492 30. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al.
493 QIIME allows analysis of high-throughput community sequencing data. Nature Methods
494 2010; 7: 335-336.
495 31. Berger SA, Stamatakis A. Aligning short reads to reference alignments and trees.
496 Bioinformatics 2011; 27: 2068-2075.
497 32. Barbera P, Kozlov AM, Czech L, Morel B, Darriba D, Flouri T, et al. EPA-ng: Massively
498 Parallel Evolutionary Placement of Genetic Sequences. Systematic Biology 2019; 68:
499 365-369.
500 33. Giraud E, Moulin L, Vallenet D, Barbe V, Cytryn E, Avarre JC, et al. Legumes
501 symbioses: Absence of Nod genes in photosynthetic bradyrhizobia. Science 2007; 316:
502 1307-1312.
503 34. Dixon R, Kahn D. Genetic regulation of biological nitrogen fixation. Nature Reviews
504 Microbiology 2004; 2: 621-631.
505 35. Outten FW, Djaman O, Storz G. A suf operon requirement for Fe-S cluster assembly
506 during iron starvation in Escherichia coli. Molecular Microbiology 2004; 52: 861-872.
27
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
507 36. Edgren T, Nordlund S. The fixABCX genes in Rhodospirillum rubrum encode a putative
508 membrane complex participating in electron transfer to nitrogenase. Journal of
509 Bacteriology 2004; 186: 2052-2060.
510 37. Ledbetter RN, Costas AMG, Lubner CE, Mulder DW, Tokmina-Lukaszewska M, Artz
511 JH, et al. The Electron Bifurcating FixABCX Protein Complex from Azotobacter
512 vinelandii: Generation of Low-Potential Reducing Equivalents for Nitrogenase Catalysis.
513 Biochemistry 2017; 56: 4177-4190.
514 38. Okubo T, Piromyou P, Tittabutr P, Teaumroong N, Minamisawa K. Origin and Evolution
515 of Nitrogen Fixation Genes on Symbiosis Islands and Plasmid in Bradyrhizobium.
516 Microbes and Environments 2016; 31: 260-267.
517 39. Okubo T, Tsukui T, Maita H, Okamoto S, Oshima K, Fujisawa T, et al. Complete
518 Genome Sequence of Bradyrhizobium sp S23321: Insights into Symbiosis Evolution in
519 Soil Oligotrophs. Microbes and Environments 2012; 27: 306-315.
520 40. Lamb JW, Hennecke H. In Bradyrhizobium-Japonicum the Common Nodulation Genes,
521 Nodabc, Are Linked to Nifa and Fixa. Molecular & General Genetics 1986; 202:
522 512-517.
523 41. Nouwen N, Arrighi JF, Cartieaux F, Chaintreuil C, Gully D, Klopp C, et al. The role of
524 rhizobial (NifV) and plant (FEN1) homocitrate synthases in
28
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
525 Aeschynomene/photosynthetic Bradyrhizobium symbiosis. Scientific reports 2017; 7:
526 1-10.
527 42. Hakoyama T, Niimi K, Watanabe H, Tabata R, Matsubara J, Sato S, et al. Host plant
528 genome overcomes the lack of a bacterial gene for symbiotic nitrogen fixation. Nature
529 2009; 462: 514-517.
530 43. Agarwal AK, Keister DL. Physiology of Ex-Planta Nitrogenase Activity in
531 Rhizobium-Japonicum. Applied and Environmental Microbiology 1983; 45: 1592-1601.
532 44. Terpolilli JJ, Hood GA, Poole PS. What Determines the Efficiency of N-2-Fixing
533 Rhizobium-Legume Symbioses? Advances in Microbial Physiology, Vol 60 2012; 60:
534 325-389.
535 45. Bender GL, Plazinski J, Rolfe BG. Asymbiotic Acetylene-Reduction by a Fast-Growing
536 Cowpea Rhizobium Strain with Nitrogenase Structural Genes Located on a Symbiotic
537 Plasmid. Applied and Environmental Microbiology 1986; 51: 868-871.
538 46. Liu C, He Y, Chang Z. Truncated hemoglobin o of Mycobacterium tuberculosis: the
539 oligomeric state change and the interaction with membrane components. Biochemical
540 and Biophysical Research Communications 2004; 316: 1163-1172.
541 47. Pathania R, Navani NK, Rajomohan G, Dikshit KL. Mycobacterium tuberculosis
542 hemoglobin HbO associates with membranes and stimulates cellular respiration of
543 recombinant Escherichia coli. Journal of Biological Chemistry 2002; 277: 15293-15302.
29
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
544 48. Ascenzi P, De Marinis E, Coletta M, Visca P. H(2)O(2) and (center dot)NO scavenging
545 by Mycobacterium leprae truncated hemoglobin O. Biochemical and Biophysical
546 Research Communications 2008; 373: 197-201.
547 49. Shimuta T, Nakano K, Yamaguchi Y, Ozaki S, Fujimitsu K, Matsunaga C, et al. Novel
548 heat shock protein HspQ stimulates the degradation of mutant DnaA protein in
549 Escherichia coli. Genes to Cells 2004; 9: 1151-1166.
550 50. Grunden AM, Shanmugam KT. Molybdate transport and regulation in bacteria. Archives
551 of Microbiology 1997; 168: 345-354.
552 51. Jimenez-Vicente E, Yang ZY, del Campo JSM, Cash VL, Seefeldt LC, Dean DR. The
553 NifZ accessory protein has an equivalent function in maturation of both nitrogenase
554 MoFe protein P-clusters. Journal of Biological Chemistry 2019; 294: 6204-6213.
555 52. Gaby JC, Buckley DH. A global census of nitrogenase diversity. Environmental
556 Microbiology 2011; 13: 1790-1799.
557 53. Han LL, Wang Q, Shen JP, Di HJ, Wang JT, Wei WX, et al. Multiple factors drive the
558 abundance and diversity of the diazotrophic community in typical farmland soils of China.
559 FEMS microbiology ecology 2019; 95: fiz113.
560 54. Meng H, Zhou Z, Wu R, Wang Y, Gu J. Diazotrophic microbial community and
561 abundance in acidic subtropical natural and re-vegetated forest soils revealed by
30
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
562 high-throughput sequencing of nifH gene. Applied Microbiology and Biotechnology 2019;
563 103: 995-1005.
564 55. Gano-Cohen KA, Wendlandt CE, Stokes PJ, Blanton MA, Quides KW, Zomorrodian A,
565 et al. Interspecific conflict and the evolution of ineffective rhizobia. Ecology Letters 2019;
566 22: 914-924.
567 56. Remigi P, Zhu J, Young JPW, Masson-Boivin C. Symbiosis within Symbiosis: Evolving
568 Nitrogen-Fixing Legume Symbionts. Trends in Microbiology 2016; 24: 63-75.
569 57. Novick RP, Ram G. The Floating (Pathogenicity) Island: A Genomic Dessert. Trends in
570 Genetics 2016; 32: 114-126.
571 58. Gal-Mor O, Finlay BB. Pathogenicity islands: a molecular toolbox for bacterial virulence.
572 Cellular Microbiology 2006; 8: 1707-1719.
573 59. Provorov NA, Andronov EE. Evolution of root nodule bacteria: Reconstruction of the
574 speciation processes resulting from genomic rearrangements in a symbiotic system.
575 Microbiology 2016; 85: 131-139.
576 60. Soumare A, Diedhiou AG, Thuita M, Hafidi M, Ouhdouch Y, Gopalakrishnan S, et al.
577 Exploiting Biological Nitrogen Fixation: A Route Towards a Sustainable Agriculture.
578 Plants-Basel 2020; 9: 1011.
579 61. Santi C, Bogusz D, Franche C. Biological nitrogen fixation in non-legume plants. Annals
580 of Botany 2013; 111: 743-767.
31
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
581 62. Cock PJA, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely
582 available Python tools for computational molecular biology and bioinformatics.
583 Bioinformatics 2009; 25: 1422-1423.
584 63. Goto N, Prins P, Nakao M, Bonnal R, Aerts J, Katayama T. BioRuby: bioinformatics
585 software for the Ruby programming language. Bioinformatics 2010; 26: 2617-2619.
586 64. Zulkower V, Rosser S. DNA Features Viewer: a sequence annotation formatting and
587 plotting library for Python. Bioinformatics 2020; 36: 4350-4352.
588
589
32
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
590 Legends
591 Figure 1. The maximum-likelihood phylogenomic tree of Bradyrhizobium and inferred lifestyle
592 evolutionary history. Strains from Xanthobacteraceae were used as an outgroup. Ancestral
593 lifestyles were inferred using the parsimony method in Mesquite. Branches in black, green, red
594 and gray represent free-living non-N2-fixing, free-living N2-fixing, symbiotic and uncertain
595 lifestyles, respectively. The color strips in the outer layer indicates different supergroups. Red
596 and green solid circles denote the presence of nod and nif genes, respectively. Black stars
597 adjacent to the tips of the phylogeny indicate strains isolated and sequenced in the present study.
598 We inferred 8, 13, 10 and 4 times FLnonnif-Sym, Sym-FLnonnif, Sym-FLnif and FLnonnif-FLnif
599 transitions respectively (noted with arrows). Purple circles on the nodes indicate ultrafast
600 bootstrap values higher than or equal to 95% calculated by IQ-Tree. The red and green triangles
601 on the outer layer indicate symbiotic members of the basal lineages of the B. japonicum
602 supergroup and free-living N2-fixing strains, respectively.
603
604 Figure 2. Nif gene phylogeny of Bradyrhizobium. The phylogeny was constructed using the
605 concatenated alignments of nifABDEHKNVX. Nif genes from Rhizobium, Sinorhizobium,
606 Mesorhizobium, Azorhizobium and Beta-rhizobia (rhizobia from Betaproteobacteria) are used as
607 the outgroup. The color strips on the outer layer indicates different supergroups, and the
608 background colors of the species name represent different lifestyles. The purple circles on the
33
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
609 nodes indicate ultrafast bootstrap values higher than or equal to 95% calculated by IQ-Tree. The
610 three symbiotic strains (also mentioned in Fig. 1) with nif genes on both the free-living nif island
611 and symbiosis island are indicated by black triangles. The tips with a black box around denote
612 the strains from Cluster FL with experimental evidence for their N2 fixation in the free-living
613 state. Note that three symbiotic strains, namely DOA9, p9-20 and CCBAU 53363, carried a nif
614 cluster similar to the one on the free-living nif island (NI) in addition to another one on the
615 symbiosis island (SI). The geographic locations of free-living N2-fixing strains are displayed in
616 Fig. S1B.
617
618 Figure 3. The comparison of the gene arrangement of the nif gene cluster located in the nif island
619 and in the symbiosis island. Gene functions are distinguished by different colors. The
620 visualization of gene arrangement is performed with dna-features-viewer v3.0.3 [64]. The
621 structures of the gene arrangement classified the nif clusters into three types: those belonging to
622 the Cluster FL or Cluster PB (A), those found in the symbiosis island in most symbiotic strains
623 or in FLnif strains that cluster with symbiotic strains (B), and those from outgroup species (C). In
624 panel B, two out of the three free-living strains whose phylogenetic positions are nested within
625 symbiotic strains, labeled as “FLnifwithinSym”, are shown together with the symbiotic strains
626 closest to them in the nif tree (see also Fig. 2): SEMIA 6399 vs. NAS96.2, and CCGE-LA001 vs.
34
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
627 P10 130 (the other one, strain CCBAU 83689, is not shown as the assembly of the contig where
628 the nif island is located is incomplete).
629
630 Figure 4. Normalized abundance of free-living and symbiotic nifH of Bradyrhizobium in
631 amplicon sequencing samples collected from different types of environments. The values shown
632 as percentage in the y-axis in the violin plots denote the normalized abundance (see Materials
633 and methods) of different types of nifH. The width of each curve represents the approximate
634 frequency of data points in the corresponding region. The P-value is derived from a paired
635 Wilcoxon-Mann-Whitney test (two-sided).
636
35
Isolated and sequenced in this study bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Colorstrip > Supergroups Soil 1 supergroup Soil 2 supergroup Outgroup Bradyrhizobium jicamae supergroup 1 Bradyrhizobium elkanii supergroup Kakadu supergroup 1 1 Photosynthetic supergroup Bradyrhizobium japonicum supergroup
Dots > Gene clusters SEMIA 6399 nifBDEHKN 2 nodABCIJ 2 8 Branch color > Lifestyles 3 Symbiotic (Sym) 7 2
Free-living N2-fixing (FLnif) Free-living non-N2-fixing (FLnonnif) 4 Uncertain Bradyrhizobium S23321 4 Arrow > Lifestyle transitions 6 3 Ec3.3
nonnif 4 BR 10245 FL -Sym (8 times) BR 10247 Cp5.3 Sym-FLnonnif (13 times) 3 5 6 Sym-FLnif (10 times) 13 8 7 SZCCT0346 FLnonnif-FLnif (4 times) 9 5 SZCCT0007 12 4 p9-20 BK707
6 11 7
5 9 SZCCT0231 10 3 8
10 1 DOA9 P10 130 2
CCH5-F6
CCBAU 83689 R2.2-H W BM-T Y-H1 CCBAU 53363 UNPA324
22
TSA1 UNPF46 AT1 39S1MB
cf659 Colorstrip > Supergroups
CCBAU 53363 (NI) Soil 1 supergroup
SZCCT0007
SZCCT0346
SZCCT0231 p9-20 (NI) Soil 2 supergroup DOA9 (NI)
BK707 UNPA324
Bradyrhizobium jicamae supergroup 39S1MB 1TA S23321 cf659
UNPF46 TSA1 CCH5-F6 22 BM-T BR 446 Y-H1 R2.2-H W BR3351 Tv2a-2 DOA1 Ai1a-2 Bradyrhizobium elkanii supergroup BR 10247 PAC68 ARR65 BR 10245 Ec3.3 WSM2254 WSM3983 STM 3809 NAS96.2 bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint SEMIA 6399 Kakadu supergroup Cp5.3 P10 130 (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprintORS375 in perpetuity. It is made CCGE-LA001 available under aCC-BY-NC-NDAzorh 4.0Azorh International license. ORS285 PAC 48 Photosynthetic supergroup ORS278 USDA 3259 izobium izobium SZCCT0094 USDA 3254 R5 BTAi1 3458 Bradyrhizobium japonicum supergroup doeber caulino BR3262 STM 3843 BLY3-8 SZCCT0283 S58 einerae dansUFLA ORS571 BLY6-1 CI-1B Outgroup CI-41S Semia 938 Background > Lifestyle 1-100 SZCCT0424 C9 Symbiotic NBRC 14791 Sym Basal USDA 76 Free-living N2-fixing USDA 94 Cluster FL CCBAU 05737 CCBAU 43297 Cluster PB 587 Dots > Genes BR29 TnphoA33 CCBAU 15615 UFLA 03-321 nifV SZCCT0284 UFLA03-84 SEMIA 690 USDA 4 SEMIA 6208 SZCCT0046 WSM2793 Strains have two nif clusters OO99 Rc2d CCBAU 15544 SA-3 Rp7b CCBAU 15635 BR3267 USDA 124 p9-20 (SI) Genomes displayed in Figure 3 CCBAU 43298 CCBAU 10071 CCBAU 25435 Sym CCBAU 25021 CCBAU 35157 Is-34 CCBAU 05623 USDA 38 WSM1743 J5 Bradyrhizobium CCBAU 51781 NBRC 14783 CCBAU 51770 FN1 CCBAU 51778 CCBAU 15618 LMG 28791 USDA 6 Ghvi E109 Rc3b ERR11 SEMIA 5079 Node 1 AC87j1 USDA 110 Is-1 INPA54B CNPSo 3426 110spc4 XF7 CNPSo 3424 Y21 LVM 105 CNPSo 3448 LMG 26795 CCBAU 41267 CB756 SEMIA 5080 USDA 3456 USDA 122 USDA 3384 SZCCT0233 WSM2783 Leo121 CCBAU 15517 Leo170 CCBAU 83623 LMTR 13 SZCCT0285 LMTR 3 CCBAUUSDA 15354 123 WSM1741 LMTR 21 SZCCT0133 CCBAU 23086 LmjM6 SZCCT0293 LmjM3 SZCCT0342 RST89 RST91 CCBAU 83689 DOA9 plasmid USDA 135 CCBAU 05525 Ro19 CCBAU 51649 SZCCT0287 NK6 CCBAU 51670 CCBAU 53424 SZCCT0340 CCBAU 53426 SZCCT0341 Gha CCBAU 53363 (SI) UBMA197 RP6 WSM1417 L2 WSM4349 UBMA192 UBMA182 UBMA171 CCBAU 53390 UBMA195 UBMA183 UBMA181 CCBAU 51757 SUTN9-2 WSM1253 UBMA122 UBMAN05 UBMA510
Th.b2 174MSW
AS23.2
NAS80.1 CCBAU 51787 SEMIA 6148 CCNWSX0360 BR 10303 UBMA051 UBMA050 UBMA060 UBMA052 UBMA061 nifA sufBCDS nifZ nif(H)DKENX hspQ nifT nifB nifZ glbO nifHQ nifV nifW fixABCX modECBAD bolA nifZ iscA nifZ paaD grxD ahpD modE (A) glbO fabG sufE nifX hspQ K13819 nifT iscA nifZ fdx E3.1.1.45 hndC cheY nifW fixX PRDX2_4 S23321 nifA sufB sufC sufD sufS nifH nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA 0 6,000 12,000 18,000 24,000 30,000 bolA 36,000 42,000 48,000 nifZ iscA nifZ paaD grxD fixX sufE nifX hspQ hcaC K13819 nifT iscA nifZ fdx glbO E3.1.1.45 hndC cheY nifW modE DOA9 (NI) fabG nifA sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA modD TC.DME fabG 0 6,000 12,000 18,000 24,000 30,000 36,000 42,000 48,000 nifZ iscA nifZ paaD bolA hylB fixX sufB sufE nifX hspQ hcaC K13819 nifT iscA nifZ fdx glbO E3.1.1.45 grxD rssB nifW modE CCBAU 53363 (NI) fabG nifA sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA modD ABC.SP.S 0 6,000 12,000 18,000 24,000 30,000 36,000 42,000 48,000 sufE iscA nifZ paaD bolA hylB fixX Cluster FL nifZ narL nifX hspQ K13819 nifT iscA nifZ fdx glbO E3.1.1.45 grxD rssB nifW modE SZCCT0007 sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA modD hyuA 0 6,000 12,000 18,000 24,000 30,000 hylB 36,000 42,000 48,000 nifZ iscA nifZ paaD bolA fixX glbO sufE nifX hcaC K13819 nifT iscA nifZ fdx E3.1.1.45 grxD algB nifW modE p9-20 (NI) sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA crt 0 6,000 12,000 18,000 24,000 30,000 hylB 36,000 42,000 48,000 nifZ iscA nifZ paaD bolA fixX glbO sufE nifX hspQ K13819 nifT iscA nifZ fdx E3.1.1.45 grxD devR nifW modE AT1 fabG nifA sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per nifH nifQ irr nifV fixA fixB fixC modB modA ABC.SP.S 0 6,000 12,000 18,000 24,000 30,000 36,000 42,000 48,000 bolA bfd iscA nifZ paaD grxD ahpD nifX hspQ K13819 nifT iscA nifZ fdx hcaC glbO E3.1.1.45 hndC cheY nifW fixX PRDX2_4 mhuD SZCCT0283 nifH nifD nifK nifE nifN TST MAO VCP iscS nifB per yghU glpE nifH nifQ irr nifV cysE fixA fixB fixC modC modB modA modD IBA57 0 6,000 12,000 18,000 24,000 30,000 36,000 42,000 48,000 54,000 grxD fixJ iscA nifZ paaD hndC cheY fixX modC sufE nifX hspQ K13819 nifT iscA nifZ fdx glbO E3.1.1.45 bolA arsC1 irr nifW modE modD ORS278 fabG nifA sufB sufC sufD sufS nifH nifD nifK nifE nifN TST iscS nifB per K07487 nifH nifQ nifV cysE fixA fixB fixC livK 0 6,000 12,000 18,000 24,000 30,000 36,000 grxD 42,000 48,000 54,000 fixJ iscA nifZ paaD hndC cheY grxD modE ahpD Cluster PB sufE nifX hspQ K13819 nifT iscA nifZ fdx glbO E3.1.1.45 bolA arsC1 irr nifW fixX modC PRDX2_4 ORS285 fabG nifA sufB sufC sufD sufS nifH nifD nifK nifE nifN TST iscS nifB per nifH nifQ nifV cysE fixA fixB fixC modD livK 0 6,000 12,000 18,000 24,000 30,000 36,000 42,000 48,000 54,000 hya,hyp sufBCDS nifDKENX nifBnifTnifZ nifHQ nifV nifW fixBCX (B) nifZ parE1_3_4 hyaC hypA hypC nifX iscA nifT nifZ parD1_3_4 nifW fixX ahpD SEMIA 6399 hyaA hyaB hyaC hyaD hyaF hupV hypB hypF hypD hypE ald sufB sufC sufD sufS nifD nifK nifE nifN VCP iscS nifB per entS opuC opuA moeB ompW nifH nifQ nifV fixB fixC 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000 72,000 fixJ nifZ fixX ahpD modC bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. ItdctR is made hypC hypA nifX iscA nifT nifZ ompW nifW K07482 PRDX2_4 FLnif within Symavailable under aCC-BY-NC-ND 4.0 International license. P10 130 fixL dctA hyaA hyaB hyaC hyaD hyaF hupV hypB hypD hypE ald sufB nifD nifK nifE nifN iscS nifB nifH nifQ nifV fixB fixC 0 8,000 16,000 24,000 32,000 40,000 0 2,500 5,000 7,500 10,000 12,500 iscA K07497 E3.1.1.11 nifX K13819 nifT nifQ nifW fixX DOA1 parA K00666 E6.4.1.4B bccA glmS E3.2.1.15 fabG nifH nifD nifK nifE nifN nifB nifV fixB fixC omp31 0 8,000 16,000 E3.2.1.15 24,000 32,000 40,000 iscA 48,000 56,000irr 64,000 Sym Basal parA K07497 E3.1.1.11 K07484 nifX K13819 nifZ nifQ nifW fixX BR 446 K00666 E6.4.1.4B bccA gumF glmS tgnB fabG nifH nifD nifK nifE nifN iscS nifB nifV fixB fixC 0 8,000 16,000 24,000 nifZ 32,000 40,000 48,000 56,000 64,000 nifX iscA nifT nifZ fixX PRDX2_4 fer E1.17.4.1B NAS96.2 ald sufB sufC sufD sufS nifD nifK nifE nifN davA iscS nifB per ompW nifH nifQ nifV fixB fixC ahpD ispA CYP117A fabG CYP114 CYP112A metE modC 0 8,000 16,000 24,000 32,000 40,000 nifZ 48,000 56,000 64,000 fixJ hypC hyaF hypA hypC nifX iscA nifT nifZ hylB nifW fixX PRDX2_4 TC.DME CCGE-LA001 dctR fixL dctA hyaA hyaB hyaC hyaD hupV hypB hypF hypD hypE ald sufB nifD nifK nifE nifN iscS nifB ompW nifH nifQ nifV fixB fixC 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000
fixJ hypC hyaF hypA hypC sufS nifX iscA nifT nifZ hndC arsC1 nifW fixX PRDX2_4 p9-20 (SI) fixJ fixL dctA modB hyaA hyaB hyaC hyaD hupV hypB hypF hypD hypE ald nifD nifK nifE nifN iscS nifB ompW nifH nifQ nifV fixB fixC ahpD modC Sym 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000 fixJ hypC hypA hypC nifX iscA nifT nifZ hcaC hndC nifW fixX PRDX2_4 CB756 fixJ fixL dctA modB K07148 hyaA hyaB hyaC hyaD hyaF hupV hypB hypF hypD hypE ald sufB sufC sufD sufS nifD nifK nifE nifN iscS nifB per ompW nifH nifQ nifV fixB fixC ahpD 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000 nifT iscA K07497 virD4 ompR nifB fixX E3.1.30.1 ABC.MS.P K07497 DOA9 plasmid fabG nifD nifK ltrA bchE E3.2.1.15 envZ omp31 K07493 K07485 nifH fixA fixB fixC mcp ABC.MS.S K07486 K07485 VCP ybjD 0 8,000 TSTA3 16,000 K07483 24,000 32,000 nifT 40,000 48,000 56,000 64,000 ABC.SN.P TSTA3 E3.1.1.11 nifX iscA nifZ nifW fixX E3.1.30.1 ABC.SN.A korD CCBAU 53363 (SI) nodA nodB nodC ubiG E2.1.3.- nodI nodJ asm21 gmd nifA E3.2.1.15 K07486 K07497 fabG nifD nifK nifE nifN nifB nifH nifQ fixA fixB fixC panB panC ppaX smf ntaA atsK ABC.SN.S frdA 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000
modE ahpD iscA nifZ paaD bolA cheY (C) PRDX2_4 sufE K07006 phnG phnO hcaC K13819 nifT iscA nifZ fdx glbO E3.1.1.45 grxD arsC1 nifW fixX Azorhizobium caulinodans phnE modD nifA sufS sufD sufC sufB nifH nifD nifK nifE nifN cysE nifV phnF phnH phnI phnJ phnK phnL phnM phnN TST MAO VCP iscS nifB per cysJ yghU glpE nifH nifQ irr fixA fixB fixC ORS 571 0 2,500 5,000 7,500 10,000 12,500 15,000 17,500 20,000 0 5,000 10,000 15,000 20,000 25,000 30,000 35,000 40,000 45,000 nifT nifZ iscA Outgroup iscA nifZ nifX nifW fixX K06995 E3.2.1.123 Rhodopseudomonas palustris fabG nifA nifB nifH nifD nifK nifE nifN nifQ K13819 iscS nifV fixA fixB fixC metA TC.PST K16150 exoP exoQ tagA UXE ictB degP bmpA nupC nupB nupA degP otsA ppk2 fabI E2.3.1.8 ackA mucR paaI DX-1 0 8,000 16,000 24,000 32,000 40,000 48,000 56,000 64,000 bioRxiv preprint doi: https://doi.org/10.1101/2021.02.03.429501; this version posted February 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
Marine Soil Plant Others *** *** *** *** *** *** *** *** ns *** *** ***
100
80
60
40
Normalized abundance (%) 20
0
Cluster FL Cluster PB Sym Cluster SymBasalCluster FL Cluster PB Sym Cluster SymBasalCluster FL Cluster PB Sym Cluster SymBasalCluster FL Cluster PB Sym Cluster SymBasal