bioRxiv preprint doi: https://doi.org/10.1101/2020.10.16.342204; this version posted October 16, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

1 Phylogenetic analysis of the salinipostin g-butyrolactone gene cluster uncovers

2 new potential for bacterial signaling-molecule diversity

3

4 Kaitlin E. Creamer (a), Yuta Kudo (a), Bradley S. Moore (b, c), Paul R. Jensen (a)#

5

6 a Center for Marine Biotechnology and Biomedicine, Scripps Institution of

7 Oceanography, University of California San Diego, La Jolla, California, USA

8 b Center for Oceans and Human Health, Scripps Institution of Oceanography, University

9 of California San Diego, La Jolla, California, USA

10 c Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California

11 San Diego, La Jolla, California, USA

12

13 Running Head: Phylogenetic analysis of the salinipostin gene cluster

14

15 #Address correspondence to Paul R. Jensen, [email protected].

16

17 Keywords salinipostin, g-butyrolactones, biosynthetic gene clusters, Salinispora,

18 bacterial signaling molecules, actinomycetes, HGT

19 Abstract

20 Bacteria communicate by small-molecule chemicals that facilitate intra- and

21 interspecies interactions. These extracellular signaling molecules mediate diverse processes

22 including virulence, bioluminescence, biofilm formation, motility, and specialized

23 metabolism. The signaling molecules produced by members of the phylum

24 Actinobacteria generally comprise g-butyrolactones, g-butenolides, and furans.

25 The best known actinomycete g-butyrolactone is A-factor, which triggers specialized

26 metabolism and morphological differentiation in the genus Streptomyces. Salinipostins

27 A-K are unique g-butyrolactone molecules with rare phosphotriester moieties that were

28 recently characterized from the marine actinomycete genus Salinispora. The production

29 of these compounds has been linked to the 9-gene biosynthetic gene cluster spt. Critical

30 to salinipostin assembly is the g-butyrolactone synthase encoded by spt9. Here, we

31 report the global distribution of spt9 among sequenced bacterial genomes, revealing a

32 surprising diversity of gene homologs across 12 bacterial phyla, the majority of which

33 are not known to produce g-butyrolactones. Further analyses uncovered a large group

34 of spt-like gene clusters outside of the genus Salinispora, suggesting the production of

35 new salinipostin-like diversity. These gene clusters show evidence of horizontal transfer

36 between many bacterial taxa and location-specific homologous recombination

37 among Salinispora strains. The results suggest that g-butyrolactone production may be

38 more widespread than previously recognized. The identification of new g-butyrolactone

39 biosynthetic gene clusters is the first step towards understanding the regulatory roles of

40 the encoded small molecules in Actinobacteria.

41

42 Importance

43 Signaling molecules orchestrate a wide variety of bacterial behaviors. Among

44 Actinobacteria, g-butyrolactones mediate morphological changes and regulate

45 specialized metabolism. Despite their importance, few g-butyrolactones have been

46 linked to their cognate biosynthetic gene clusters. A new series of g-butyrolactones

47 called the salinipostins was recently identified from the marine actinomycete genus

48 Salinispora and linked to the spt biosynthetic gene cluster. Here we report the detection

49 of spt-like gene clusters in diverse bacterial families not known for the production of this

50 class of compounds. This finding expands the taxonomic range of bacteria that may

51 employ this class of compounds and provides opportunities to discover new compounds

52 associated with chemical communication.

53

54 Introduction

55 Bacteria use chemical signaling molecules to regulate gene expression in a

56 population-dependent manner. This process, known as quorum sensing, controls group behaviors

57 including swarming, bioluminescence, virulence, biofilm formation, cell competence,

58 DNA uptake, public good production, and specialized metabolism. In many Gram-

59 negative bacteria, quorum sensing is mediated by acyl-homoserine lactone (AHL)

60 autoinducers (AIs) and their cognate receptors (1). In some Gram-positive bacteria,

61 autoinducing peptides (AIPs) and their respective transmembrane two-component

62 histidine sensor kinases control similar group behaviors (2). Among Actinobacteria, g-

63 butyrolactone signaling molecules regulate morphological development and specialized

64 metabolite production. Given the importance of Actinobacteria for the production of

65 antibiotics and other useful compounds, the discovery of new signaling molecules could

66 facilitate the discovery of new natural products from the large number of “cryptic” gene

67 clusters detected in actinomycete genome sequences.

68

69 To date, the types of signaling molecules known to be produced by Actinobacteria

70 include g-butyrolactones (3–25), g-butenolides (26–30), furans (31, 32), PI factor (33),

71 and N-methylphenylalanyl-dehydrobutyrine diketopiperazine (34) (Figure 1). Most of

72 these were discovered from members of the genus Streptomyces. Sometimes referred

73 to as actinobacterial “hormones”, signaling molecules are commonly produced in low

74 amounts and have proven difficult to isolate and characterize. Many of these molecules

75 not only induce the production of specialized metabolites, but also regulate bacterial

76 morphogenesis and control complex regulatory systems (15). The first bacterial

77 signaling molecule discovered was A-factor (autoregulatory factor, 2-isocapryloyl-3R-

78 hydroxymethyl-g-butyrolactone) from the actinomycete Streptomyces griseus. It was

79 shown to trigger sporulation and the production of the antibiotic streptomycin (3). A-

80 factor biosynthesis requires a g-butyrolactone synthase and a reductase encoded by the

81 genes afsA and bprA, respectively (15), and its elucidation revitalized the search to link

82 signaling molecules to their biosynthetic genes (16, 32, 35–41). Most of the

83 biosynthetically characterized g-butyrolactones, g-butenolides, and furans have been

84 linked to afsA gene homologs via sequence similarity, biochemical verification, or A-

85 factor receptor binding assays. However, many afsA gene homologs observed in

86 Streptomyces, Kitasatospora, and Amycolatopsis genomes have yet to be linked to a

87 small molecule (42–51). Likewise, the Acl series of g-butyrolactones reported from S.

88 coelicolor has not been linked to an AfsA homolog (3, 14, 18) (Figure 1).

89

90 While most A-factor-like molecules have been identified from Streptomyces strains, it

91 remains possible that other actinobacterial taxa produce related signaling molecules.

92 This includes the obligate marine actinomycete genus Salinispora, which is comprised

93 of nine species: Salinispora tropica, S. arenicola, S. oceanensis, S. mooreana, S.

94 cortesiana, S. fenicalii, S. goodfellowii, S. vitiensis, and S. pacifica (52, 53) isolated from

95 marine sediments (54–57), seaweeds (55), and sponges (58, 59). The genus

96 Salinispora has proven to be a prolific source of specialized metabolites (60) including

97 the proteasome inhibitor salinosporamide A (61), which is currently in phase III clinical

98 trials as an anticancer agent. Whole-genome sequencing of 118 Salinispora strains

99 revealed 176 distinct biosynthetic gene clusters (BGCs), of which only 25 had been

100 linked to their products (62). In a subsequent study, a majority of Salinispora BGCs

101 were shown to be transcriptionally active under standard cultivation conditions,

102 suggesting that many of their small molecule products were being missed using

103 traditional detection and isolation techniques (63). Given that little is known about the

104 regulation of specialized metabolism in this genus, it remains possible that quorum

105 sensing signaling molecules may play a role in the regulation of BGC expression.

106

107 Recently, a series of compounds known as salinipostins A-K with rare bicyclic

108 phosphotriesters were identified from a Salinispora sp. RLUS08-036-SPS-B (64). While

109 these compounds were identified based on anti-malarial activity against Plasmodium

110 falciparum, the g-butyrolactone part of the salinipostin structure is reminiscent of

111 Streptomyces A-factor (64, 65). Salinipostin biosynthesis was linked to the spt gene

112 cluster via a knockout of the afsA homolog spt9 in S. tropica CNB-440, which resulted in

113 the elimination of salinipostin production (63). Subsequently, eight volatile bicyclic

114 lactones, salinilactones A-H, were isolated and characterized from S. arenicola CNS-

115 205 (66, 67). Two g-butyrolactones Sal-GBL1 and Sal-GBL2 were also recently

116 characterized from multiple Salinispora spp. strains (68). The Sal-GBLs, salinilactones

117 A-H, and salinipostins A-K all share a bicyclic lactone motif and are proposed to

118 originate from the same spt BGC (63, 66–68) (Supplementary Figure S1).

119

120 In this study, we set out to determine the distribution of spt9 butyrolactone synthase

121 homologs and the diversity of BGCs in which they reside, across all sequenced

122 bacteria. We uncover that salinipostin-like BGCs are widely distributed outside of the

123 genus Salinispora and exhibit gene rearrangements and some unusual gene fusions

124 relevant to g-butyrolactone biosynthesis. Finally, the evolutionary history of the

125 salinipostin BGC indicates it was horizontally transferred between two Salinispora

126 species at a location where they are known to co-occur.

127

128 Materials and Methods

129 Identification and distribution of Spt9 homologs

130 The predicted Pfam function of spt9 was identified using the NCBI Conserved Domain

131 Database prediction tool (69). AnnoTree (70) was used to determine the taxonomic

132 distribution of the spt9/afsA Pfam03756 “A-factor biosynthesis hotdog domain-containing

133 protein”. Protein-protein BLASTp (2.6.0+) (71) was used to query the S.

134 tropica CNB-440 319-amino acid Spt9 protein sequence against the JGI IMG/MER

135 sequence database (publicly available genomic sequence data integrated with JGI

136 sequence data, all_img_core 2019) using an E-value cutoff of 1E-5 and at least 25%

137 identity to identify the top 500 Spt9 homologs. To evaluate the gene neighborhood of

138 the top matches, all but 33 Salinispora Spt9 homologs were removed, and

139 representative regions 20 kb upstream and downstream of the Spt9 homologs were drawn;

140 sequences were grouped into Actinobacterial family or Gammaproteobacterial class.

141 Also included were 22 previously characterized afsA gene homologs including nine

142 linked to the production of 34 g-butyrolactone molecules; two linked to the production of

143 three g-butenolide molecules; and one linked to the production of five furan molecules.
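
The homolog selection described above can be illustrated with a minimal sketch. It assumes the BLASTp results are available in the standard 12-column tabular format (`-outfmt 6`), that the E-value cutoff is inclusive, and that the "top" hits are ranked by bit score; none of these details are stated in the text. The example rows and the function name `filter_hits` are hypothetical.

```python
import csv
import io

# Hypothetical BLASTp tabular output (-outfmt 6); these rows are invented for illustration.
EXAMPLE_HITS = (
    "Spt9_CNB440\thomolog_A\t62.5\t310\t100\t3\t1\t310\t5\t314\t1e-120\t410\n"
    "Spt9_CNB440\thomolog_B\t24.0\t300\t200\t8\t4\t303\t2\t301\t1e-10\t65.1\n"
    "Spt9_CNB440\thomolog_C\t31.2\t295\t190\t6\t10\t304\t8\t302\t3e-04\t48.9\n"
    "Spt9_CNB440\thomolog_D\t27.9\t305\t205\t7\t2\t306\t1\t305\t2e-22\t98.6\n"
)


def filter_hits(handle, max_evalue=1e-5, min_identity=25.0, top_n=500):
    """Keep hits passing both thresholds, ranked by bit score (assumed ranking)."""
    kept = []
    for row in csv.reader(handle, delimiter="\t"):
        subject = row[1]
        identity = float(row[2])    # column 3: percent identity
        evalue = float(row[10])     # column 11: E-value
        bitscore = float(row[11])   # column 12: bit score
        if evalue <= max_evalue and identity >= min_identity:
            kept.append((subject, identity, evalue, bitscore))
    kept.sort(key=lambda hit: hit[3], reverse=True)
    return kept[:top_n]


hits = filter_hits(io.StringIO(EXAMPLE_HITS))
for subject, identity, evalue, bitscore in hits:
    print(subject, identity, evalue, bitscore)
```

In this toy input, homolog_B is excluded by the identity threshold and homolog_C by the E-value cutoff, leaving homolog_A and homolog_D.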

144

145 To identify Spt9 homologs within the genus Salinispora, protein-protein BLASTp was

146 used to search all publicly accessible Salinispora genomes with an E-value cutoff of 1E-

147 5. PCR was used to confirm the integrity of the split spt BGC in S. arenicola CNS-296

148 and the presence of a hypothetical gene in S. pacifica CNS-143. PCR was performed

149 by aliquoting 90 ng of gDNA into the PCR mixture consisting of 2X Phusion Green Hot

150 Start II High-Fidelity PCR Master Mix (1.5 mM MgCl2, 200 µM each dNTP, 0.4 U

151 Phusion enzyme; Thermo Scientific), 3% DMSO, and 0.5 µM of each forward and

152 reverse primer (primer pair A: 6F [5’-ATCGAACGTGTCATCGAATGGC-3’], 6dntransR

153 [5’-CGTAGCCGAGGAAAGAAGCATC-3’]; primer pair B: 6F, 6dntrans-IGR_R

154 [5’-TCGTTCATCAGAGGTCCCCTTC-3’]; primer pair C: 6F, 7R

155 [5’-GATCAGATAGAGCATGGCGAGC-3’]). PCR conditions were as follows: primer pair A

156 (6F, 6dntransR), 30 s of initial denaturation at 98°C, followed by 30 cycles of

157 denaturation at 98°C for 5 s, annealing at 66°C for 20 s, and extension at 72°C for 30-

158 50 s, followed by a final extension for 7 min at 72°C; primer pair B (6F, 6dntrans-IGR_R)

159 same as previous but with annealing at 65.6°C for 20 s, and extension at 72°C for 69 s;

160 primer pair C (6F, 7R), same as previous but with annealing at 65.7-66°C for 20 s, and

161 extension at 72°C for 35-50 s, followed by a final extension for 5-7 min at 72°C. The

162 resulting products were visualized on a 0.8% agarose gel run in 1X TAE at 95-97V for

163 30-60 min; excised, purified, and Sanger sequenced in forward and reverse directions

164 (Eton Bioscience, San Diego, CA), trimmed, and mapped to their respective genomes in

165 Geneious v8.1.9 (72).

166

167 Identification of salinipostin-like BGCs

168 ClusterScout (73) searches were performed to identify salinipostin-like BGCs in other

169 sequenced genomes using the following Pfam functions: spt1 Pfam00391, Pfam01326;

170 spt2 Pfam00501; spt4 Pfam00550; spt5 Pfam07993; spt6 Pfam00334; spt7 Pfam01040;

171 spt8 Pfam00296; spt9 Pfam03756. It should be noted that antiSMASH v4 and v5 (74,

172 75) do not fully identify spt1-2 in the salinipostin butyrolactone BGC, thus other methods

173 were used to find spt-like BGCs. Independent searches were run with a minimum

174 requirement of either 3, 4, or 5 Pfam matches, a maximum distance of <10,000bp

175 between each Pfam match, and a minimum distance of 1bp from the scaffold edge. The

176 boundaries of each match were extended by a maximum of 10,000bp to help identify full

177 biosynthetic operons. For some searches, the spt9/afsA Pfam was defined as essential.

178 MultiGeneBlast (76) was also used to query the contiguous S. tropica CNB-440

179 salinipostin spt1-9 gene cluster against the NCBI GenBank Bacteria BCT subdivision

180 database. Finally, the STRING v11 database (77) was queried using Spt1-9 to identify

181 significant protein-protein interactions, gene neighborhoods, and gene co-occurrences

182 within 5,090 organisms. Biosynthetic clusters retrieved from each ClusterScout,

183 MultiGeneBlast, and STRING search were manually inspected for spt Pfams and gene

184 organization.
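
The search criteria described above (a minimum number of Pfam matches, a maximum distance between matches, and an optional essential Pfam) can be sketched as a simple grouping procedure. This is a simplified approximation written for illustration and is not the ClusterScout implementation. It assumes that hits are given as coordinates on a single scaffold, that the minimum refers to distinct Pfams, and that distance is measured from the end of one match to the start of the next. The scaffold-edge and boundary-extension settings are omitted, and the example coordinates are invented.

```python
from typing import List, Optional, Tuple

# Pfam identifiers listed in the text for spt1, spt2, and spt4 through spt9.
SPT_PFAMS = {
    "Pfam00391", "Pfam01326", "Pfam00501", "Pfam00550", "Pfam07993",
    "Pfam00334", "Pfam01040", "Pfam00296", "Pfam03756",
}

Hit = Tuple[int, int, str]  # (start, end, Pfam identifier) on a single scaffold


def find_candidate_clusters(
    hits: List[Hit],
    min_matches: int = 3,
    max_gap: int = 10_000,
    essential: Optional[str] = None,
) -> List[List[Hit]]:
    """Group spt-related Pfam hits separated by less than max_gap bp."""
    relevant = sorted(hit for hit in hits if hit[2] in SPT_PFAMS)
    groups: List[List[Hit]] = []
    current: List[Hit] = []
    current_end = 0
    for hit in relevant:
        # Start a new group when the gap to the previous matches is too large.
        if current and hit[0] - current_end >= max_gap:
            groups.append(current)
            current = []
        current.append(hit)
        current_end = hit[1] if len(current) == 1 else max(current_end, hit[1])
    if current:
        groups.append(current)

    clusters = []
    for group in groups:
        pfams = {hit[2] for hit in group}
        # Assumption: the minimum counts distinct Pfams, not individual hits.
        if len(pfams) >= min_matches and (essential is None or essential in pfams):
            clusters.append(group)
    return clusters


# Hypothetical hits on one scaffold; coordinates are invented for illustration.
EXAMPLE_SCAFFOLD = [
    (1_000, 2_000, "Pfam00391"), (2_500, 3_500, "Pfam07993"), (4_000, 5_000, "Pfam03756"),
    (6_000, 7_000, "Pfam00005"),  # unrelated domain, ignored
    (30_000, 31_000, "Pfam03756"), (32_000, 33_000, "Pfam00550"),
    (50_000, 51_000, "Pfam00501"), (52_000, 53_000, "Pfam00334"), (54_000, 55_000, "Pfam01040"),
]

any_pfams = find_candidate_clusters(EXAMPLE_SCAFFOLD)
with_spt9 = find_candidate_clusters(EXAMPLE_SCAFFOLD, essential="Pfam03756")
print(len(any_pfams), len(with_spt9))
```

Requiring the spt9/afsA Pfam as essential removes the third group in this toy scaffold, which mirrors how the essential setting narrows a search to loci that contain a butyrolactone synthase homolog.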

185

186 Phylogenetic distribution of Spt9 homologs and salinipostin-like BGCs

187 A maximum likelihood amino acid phylogeny was generated from the top 403 Spt9

188 homologs and an additional 22 experimentally characterized Spt9/AfsA proteins. The

189 sequences were aligned with MUSCLE (78) within the Mesquite system for phylogenetic

190 computing (79) and analyzed using ProtTest 3.4.2 (80) to determine an amino acid

191 model for tree calculations. RAxML (81) was used to create a tree using ML + rapid

192 bootstraps with 500 replicates. A second phylogeny was generated for the Spt9

193 homologs observed in salinipostin-like BGCs using the same parameters. The

194 topologies of these trees and branch support were confirmed using PhyML (82) with

195 SMS Smart Model Selection (AIC model selection; BIONJ tree searching, NNI tree

196 improvement, and an aLRT SH-like fast likelihood method) (83).

197

198 To test if the Salinispora salinipostin BGC was acquired as an intact gene cluster, Spt1-

199 9 proteins from 116 Salinispora genomes were aligned with MUSCLE (78) and PhyML

200 (82) was used to calculate a phylogenetic tree for each protein with automatic SMS Smart

201 Model Selection (AIC model selection; BIONJ tree searching, NNI tree improvement,

202 and an aLRT SH-like fast likelihood-based method) (83). The nine Spt1-9 protein trees

203 were compared for congruency with a concatenated Salinispora species tree created

204 using the following 11 single-copy protein sequences: DnaA, GyrB1, GyrB2, PyrH,

205 RecA, Pgi, TrpB, AtpD, SucC, RpoB, and TopA as previously reported (84). Spt1-9

206 protein sequences were also concatenated (3,758 total amino acid characters), aligned

207 with MUSCLE (78), and a maximum likelihood tree calculated using SMS and PhyML

208 with the previously described parameters.
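
The concatenation step can be sketched as follows. This is a minimal illustration that assumes the Spt1-9 sequences are joined in gene order for each strain and that strains lacking any of the nine proteins are excluded; the text does not state how incomplete clusters were handled. The strain names and toy sequences are hypothetical.

```python
# Fixed gene order used for concatenation (Spt1 through Spt9).
PROTEIN_ORDER = [f"Spt{i}" for i in range(1, 10)]


def concatenate_proteins(strain_proteins):
    """Join Spt1-9 sequences per strain; skip strains missing any protein (assumption)."""
    concatenated = {}
    for strain, proteins in strain_proteins.items():
        if all(name in proteins for name in PROTEIN_ORDER):
            concatenated[strain] = "".join(proteins[name] for name in PROTEIN_ORDER)
    return concatenated


def to_fasta(records):
    """Format a {name: sequence} mapping as FASTA text for a downstream aligner."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


# Toy input: strain_A carries all nine proteins, strain_B lacks Spt8.
toy = {
    "strain_A": {f"Spt{i}": "M" + "A" * i for i in range(1, 10)},
    "strain_B": {f"Spt{i}": "M" + "G" * i for i in range(1, 10) if i != 8},
}

joined = concatenate_proteins(toy)
print(to_fasta(joined))
```

The resulting FASTA text could then be passed to an aligner such as MUSCLE before tree inference.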

209

210 FigTree (85) and the Interactive Tree of Life (iTOL v4) (86) were used to visualize

211 phylogenetic trees and Actinobacterial families were assigned using a recently

212 proposed phylogeny (87). Fused genes consisting of functional domains from two Spt

213 proteins were identified using Geneious v8.1.9 (72).

214

215 Results

216 Taxonomic distribution of Spt9 homologs

217 The g-butyrolactone synthase AfsA is critical in the biosynthesis of the Streptomyces

218 signaling molecule A-factor (15). The identification of the afsA homolog spt9 in the

219 salinipostin (spt) BGC and its role in catalyzing the g-butyrolactone ring formation in

220 salinipostin biosynthesis led us to explore the distribution of spt9/afsA homologs among

221 sequenced bacterial genomes. We first confirmed that Spt9 contains two AfsA-like hot-

222 dog fold superfamily domains that are distantly related to the FabA and FabZ β-

223 hydroxyacyl-ACP dehydratases associated with fatty-acid biosynthesis in Escherichia

224 coli (88). Using AnnoTree (70), 1,230 spt9/afsA Pfam03756 hits were identified out of

225 the 27,000 reference genomes in the Genome Taxonomy Database (Figure 2).

226 Surprisingly, these were distributed among 12 phyla, the majority of which are not

227 known for the production of g-butyrolactone signaling molecules. The phylum

228 Actinobacteria contained 74% of the hits, Proteobacteria had 21% of the hits, and the

229 rest were scattered across 10 additional phyla. Noticeably, of the 3,579 Actinobacteria

230 in the reference Genome Taxonomy Database, 25% (911) contained the AfsA

231 Pfam03756 compared to only 3% (256) of the 8,882 Proteobacteria.
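
The percentages reported above follow directly from the stated counts, as the short check below shows. It uses only the numbers given in the text (1,230 total hits; 911 of 3,579 Actinobacteria; 256 of 8,882 Proteobacteria); the variable and function names are illustrative.

```python
TOTAL_HITS = 1230
HITS = {"Actinobacteria": 911, "Proteobacteria": 256}
GENOMES = {"Actinobacteria": 3579, "Proteobacteria": 8882}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


summary = {}
for phylum, n_hits in HITS.items():
    summary[phylum] = (
        round(percent(n_hits, TOTAL_HITS)),       # share of all Pfam03756 hits
        round(percent(n_hits, GENOMES[phylum])),  # share of genomes in that phylum
    )
    print(phylum, summary[phylum])

# Hits left for the 10 additional phyla.
other_hits = TOTAL_HITS - sum(HITS.values())
print("other phyla:", other_hits, round(percent(other_hits, TOTAL_HITS)))
```

The check reproduces the 74% and 21% shares of hits and the 25% and 3% within-phylum prevalences, and leaves 63 hits (about 5%) for the remaining 10 phyla.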

232

233 Spt9 phylogeny and gene environment

234 To assess the detailed phylogenetic relationships of the Spt9 homologs, we conducted

235 a BLASTp search to identify the top related Spt9 proteins across the ~70,000+ bacterial

236 genomes in the 2019 JGI IMG Blast database (89). The top 500 homologs shared at

237 least 25% amino acid identity with Spt9, of which the top 403 were further analyzed

238 after removing duplicate Salinispora Spt9 homologs. We additionally identified 22 AfsA

239 homologs that have been experimentally linked to the production of a diverse array of g-

240 butyrolactones, g-butenolides, and furans (Figure 1). A maximum likelihood phylogeny

241 generated using these Spt9/AfsA homologs was incongruent with the established

242 taxonomic relationships of the strains in which the sequences were detected (Figure 3,

243 Supplementary Figure S2). One prominent example includes the Salinispora Spt9

244 sequences, which are sister to a homolog in Streptomyces phaeofaciens. These

245 sequences fall within a larger clade comprised of diverse members of the

246 Gammaproteobacteria (Citrobacter koseri and Dickeya sp.) and Actinobacteria

247 (Cellulomonas cellasea and Rhodococcus sp.) as opposed to grouping with other

248 members of the family Micromonosporaceae. The 22 AfsA homologs that have

249 been linked to the production of g-butyrolactones, g-butenolides, and furans are

250 restricted to one large clade in the phylogeny and distinct from the northern end of the

251 tree, which contains the Salinispora Spt9 sequences (Figure 3).

252

253 The large number of Spt9/AfsA homologs illustrates considerable potential for the

254 discovery of new g-butyrolactone synthase mediated chemical diversity (Figure 3). New

255 biosynthetic routes are supported by the diverse gene environments in which these

256 Spt9/AfsA homologs are observed. For example, antiSMASH 5 (75) analyses reveal

257 that some Spt9 homologs are close to genes encoding ketosynthase and thiolase

258 enzymes, suggesting they are part of larger PKS gene clusters. Among Salinispora

259 strains, six Spt9 homologs were observed outside of the spt BGC. Two of these were

260 observed in Salinispora oceanensis strains CNT-124 and CNT-584, each of which

261 contain two Spt9 homologs in a type II PKS BGC in addition to the spt BGC. The other

262 four were observed in Salinispora fenicalii strains CNT-569 and CNR-942, which lack

263 the spt BGC but instead have a Spt9 homolog both in a type II PKS BGC and a

264 butyrolactone non-ribosomal peptide synthetase (NRPS) BGC.

265

266 Despite incongruence with the species phylogeny, the gene environments surrounding

267 some Spt9/AfsA homologs show evidence of gene conservation. For example, in the

268 clades near the characterized Spt9/AfsA homologs, many Streptomyces and

269 Kitasatospora species showed a general conservation of a bprA gene homolog (dark

270 pink) next to spt9 homologs as required for A-factor biosynthesis in Streptomyces

271 griseus (Figure 3). The three Spt9 homologs that are closest to the Salinispora Spt9

272 sequence (observed in Streptomyces phaeofaciens, Dickeya sp. and Citrobacter koseri)

273 contain a spt4 acyl carrier protein homolog (pale blue). Below the Salinispora Spt9

274 clade, conservation of two 3-oxoacyl (acyl-carrier-protein) synthases (light green), an

275 acyl carrier protein (light blue, spt4 homologs), a hydrolase (pale yellow), and a 3-

276 oxoacyl-(acyl-carrier-protein) reductase (light blue) is observed across taxonomically

277 diverse Streptomycetaceae, Nocardiopsaceae, Nocardiaceae, Thermomonosporaceae,

278 Pseudonocardiaceae, and Streptosporangiaceae strains. At the bottom of the tree, the

279 Gammaproteobacterial outgroup Pseudomonas sp. RIT357 shares conserved genes

280 around the Spt9 homolog with other diverse families of bacteria including other

281 Gammaproteobacteria, Nocardiaceae, Streptomycetaceae, and Pseudonocardiaceae

282 with a putative hydrolase of the HAD superfamily (light yellow), a cytochrome P450

283 (light teal), and a MFS (major facilitator superfamily) protein transporter (light yellow).

284 The presence of a few conserved genes near Spt9 homologs yet the lack of fully

285 syntenic Spt9 gene neighborhoods suggests there are some functional similarities

286 involving Spt9 homologs across diverse bacterial families. The observation of diverse

287 gene environments flanking Spt9 homologs illustrates the enormous potential for

288 g-butyrolactone, g-butenolide, and furan production among bacteria not known to produce

289 these molecules through potentially new biosynthetic routes.

290

291 Targeted search for spt-like BGCs

292 A number of Spt9 gene neighborhoods outside of Salinispora caught our attention due

293 to similarities with the salinipostin BGC (Figure 3). To search more thoroughly for

294 spt-like BGCs among sequenced genomes, we used

295 ClusterScout (73), MultiGeneBlast (76) and STRING v11 (77). These efforts led to the

296 identification of 91 spt-like BGCs spanning six actinomycete families within the genera

297 Nocardia, Gordonia, Tsukamurella, Mycobacterium, Dietzia, Streptomyces,

298 Kitasatospora, Rhodococcus, and Kutzneria (Figure 4). All of these BGCs possess spt1,

299 spt5, spt6, spt7, and spt9 homologs with spt9 towards the 3’ end of the cluster as seen

300 in Salinispora. Notably, none of the spt-like BGCs contain the flavin-dependent

301 oxidoreductase spt8, whose role is unknown in salinipostin biosynthesis. None of these

302 spt-like BGCs have been linked to the small molecules they encode.

303

304 A maximum likelihood phylogeny of the Spt9 homologs observed in spt-like BGCs

305 clearly delineates them from the 11 AfsA homologs linked to g-butyrolactone, g-

306 butenolide, and furan biosynthesis (Figure 4). Compared with the nine gene salinipostin

307 BGC identified in Salinispora (Supplementary Figure S1), we observed gene

308 reorganizations and fusions in other bacteria (Figure 4). Most notably, spt2 and spt3 are

309 fused across the large clade bracketed by Nocardia and Dietzia timorensis, as well as

310 two Kutzneria species and five Streptomyces species at the most southern part of the

311 tree. This fusion is conserved across most BGCs except those observed in Salinispora,

312 four Streptomyces species, and Rhodococcus rhodnii NRRL B-16535. A second

313 example of gene fusion is observed between spt6 and spt9 in Dietzia timorensis.

314 Alignment of the spt2, spt3, spt6, and spt9 fused and individual genes reveals

315 maintenance of the functional domains (Supplementary Figure S3). These gene fusions,

316 also known as Rosetta gene fusions (90, 91), suggest a functional interaction between

317 the encoded proteins in the biosynthesis of salinipostin-like g-butyrolactone molecules.

318 The gene fusions could have arisen from the single-domain individual spt2, spt3, spt6,

319 and spt9 genes in the Salinispora spt BGC and thus suggest some evolutionary

320 selective advantage for these specific co-localized biosynthetic genes to become fused.

321

322 Many of the spt-like BGCs differed in gene order compared to that observed in

323 Salinispora while others contained extra genes in the cluster including a nitroreductase

324 (dark pink). It is noteworthy that spt8, a flavin-dependent oxidoreductase

325 (Supplementary Figure S1) was unique to the Salinispora spt BGC; however, no

326 definitive function of spt8 has been suggested in the proposed biosynthesis of the

327 salinipostins (68) or salinilactones (66, 67). Similarly, some Streptomyces spt-like BGCs

328 did not contain spt2 (AMP-ligase) and spt4 (acyl carrier protein) homologs, which are

329 proposed to help load and carry the R2 aliphatic sidechain during salinipostin

330 biosynthesis, respectively (68). These observations suggest additional structural

331 diversity remains to be discovered. Furthermore, two BGCs found in Kitasatospora

332 cheerisanensis and Frankia sp. contained spt9 and spt2 homologs next to a type I

333 polyketide synthase (PKS), suggesting a potential role in the regulation of the PKS BGC

334 as observed in methylenomycin biosynthesis (31). Overall, we observed that some of

335 the 91 spt-like BGCs occur in different genomic locations within the same genus, which

336 could support BGC migration or horizontal gene transfer. However, there is also

337 evidence of vertical inheritance based on gene conservation in some strains. To

338 investigate this further, we focused on the genus Salinispora, where BGC migration and

339 transfer events have been previously reported (62).

340

341 The spt BGC in the genus Salinispora

342 The salinipostin BGC (spt1-9) is highly conserved within the genus Salinispora (62).

343 Notably, only five strains have been identified via genome mining that lack the spt BGC

344 and these belong to the recently described species S. fenicalii, S. goodfellowii, and S.

345 vitiensis (Supplementary Figure S4A). At the species level, the spt BGC is commonly

346 observed in the same genomic environment (Supplementary Figure S5) within previously

347 defined genomic islands (62, 92). For example, the spt BGC is located in Genomic Island

348 (GI) 20 in S. tropica (62) with notable conservation of upstream and downstream regions.

349 Similarly, positional conservation of the spt BGC in Genomic Island 15 (62) is observed

350 in most S. arenicola and S. pacifica strains.

351

352 Variations in the spt BGC are observed in three Salinispora strains (Supplementary Figure S5). In the

353 case of S. arenicola CNS-296, spt1-6 and spt7-9 are split onto different contigs and

354 flanked by transposases. Targeted PCR amplifications of spt6 and the neighboring

355 transposase confirmed that the BGC is indeed split (Supplementary Figure S6A).

356 Attempts to amplify a region between spt7 and the downstream hypothetical gene

357 resulted in multiple PCR products likely due to multiple copies of the hypothetical gene in

358 the S. arenicola CNS-296 genome and were thus uninformative. However, primers

359 spanning spt6-7 resulted in a product that was 3.2kb larger than what was observed from

360 a contiguous BGC in S. arenicola CNQ-884 (Supplementary Figure S6B). Sequences

361 obtained from the ends of this amplicon mapped poorly to spt6 (67% identity) and better

362 to spt7 (99% identity), providing additional support for an insertion between spt6-7 in the

363 S. arenicola CNS-296 spt BGC. It remains to be determined if S. arenicola CNS-296

364 produces salinipostins. In contrast, the detection of spt genes on multiple contigs in S.

365 arenicola CNT-088 and S. pacifica CNS-143 is likely due to poor genome assembly (93).

366 We also investigated the hypothetical gene annotated between spt6-7 in S. pacifica CNS-

367 143. A comparison of PCR products spanning spt6-7 in this strain with S. arenicola CNQ-

368 884, where the spt BGC is contiguous, yielded amplicons of the same size and with the

369 same conserved spt6-7 domains (Supplementary Figure S6B). This indicates that the

370 extra hypothetical gene called in S. pacifica CNS-143 is likely an error and provides

371 further support for conservation of the spt BGC in the genus Salinispora.
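The identity values quoted above (67% to spt6, 99% to spt7) compare amplicon end sequences with reference genes. The text does not state which tool produced them, so the following is only a minimal sketch of how percent identity can be computed from two already-aligned sequences. The function name `percent_identity` and the toy sequences are illustrative assumptions, not data or methods from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up sequences (hypothetical; not sequences from the study).
read_end = "ACGTACGTAC"
reference = "ACGTTCGTAC"
print(round(percent_identity(read_end, reference), 1))
```

Alignment tools differ in how they treat gaps and alignment length when reporting identity, so values from this sketch need not match those reported by any particular program.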

372

373 While conservation of the spt BGC within Salinispora supports vertical inheritance, an

374 Spt9 phylogeny (Supplementary Figures S4B, S5) reveals incongruencies with the

375 established Salinispora species phylogeny (52, 62) (Supplementary Figure S4A) that

376 are consistent with recombination events. In one example, Spt9 sequences from S.

377 vitiensis CNS-055 and S. cortesiana CNY-202 occur within the S. oceanensis clade as

378 opposed to outside of it (Supplementary Figures S4, S5). A more pronounced example

379 is the placement of S. tropica Spt9 sequences within the larger S. arenicola clade,

380 suggesting the former acquired its sequence from the latter. This recombination event

381 appears to have involved the entire BGC since the remaining S. tropica Spt1-8

382 sequences share a similar phylogeny (Figure S4B). As predicted, a concatenated

383 phylogeny of Spt1-9 (Figure 5A) is incongruent with the Salinispora species phylogeny

384 (Supplementary Figure S4A) and supports a horizontal exchange of the BGC between

385 S. arenicola and S. tropica. Mapping the geographic origin of the strains onto the tree

386 reveals that all twelve S. tropica and the six most closely related S. arenicola sequences

387 originated from ocean sediments collected in the Bahamas and the Yucatán (Figure

388 5B). A closer examination reveals that the two species co-occur at four of the five

389 collection sites (Figure S7). This geographic proximity would provide opportunities for

390 BGC horizontal gene transfer and homologous recombination to occur.
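The co-occurrence statement above (two species sharing four of five collection sites) amounts to a set intersection over per-species site lists. A minimal sketch follows, assuming a simple mapping from species to site labels. The site labels `site_1` to `site_5` and the helper `shared_sites` are placeholders invented for illustration; the actual sites are those shown in Figure S7.

```python
def shared_sites(sites_by_species, species_a, species_b):
    """Return (sites where both species occur, sites where either occurs)."""
    a = set(sites_by_species[species_a])
    b = set(sites_by_species[species_b])
    return sorted(a & b), sorted(a | b)


# Placeholder site labels (hypothetical), arranged to mirror the
# "four of five sites" pattern described in the text.
sites_by_species = {
    "S. tropica": ["site_1", "site_2", "site_3", "site_4"],
    "S. arenicola": ["site_1", "site_2", "site_3", "site_4", "site_5"],
}

both, either = shared_sites(sites_by_species, "S. tropica", "S. arenicola")
print(f"co-occurring at {len(both)} of {len(either)} sites")
```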

391

392 Discussion

393 Specialized metabolites that function as signaling molecules regulate important

394 functional traits in bacteria. However, only a small number of bacterial signaling

395 molecules have been identified to date. This may be because they are small in size,

396 generally produced in low yields, and often lack activity in the bioassays commonly

397 used to guide small molecule discovery. Gamma-butyrolactones represent an important

398 class of signaling molecules produced by Actinobacteria (Figure 1). The salinipostins,

399 salinilactones, Sal-GBL1, and Sal-GBL2 were recently reported from the marine

400 actinomycete genus Salinispora (64, 66–68) and bear structural similarities to previously

401 characterized actinomycete g-butyrolactones. Linkage between the biosynthesis of

402 these compounds and the g-butyrolactone synthase Spt9 in the salinipostin spt BGC led

403 us to more broadly explore the potential for signaling molecule production by assessing

404 the distribution of this protein among sequenced bacterial genomes. Surprisingly, we

405 detected Spt9 homologs across 12 diverse bacterial phyla, many of which are not

406 known to produce g-butyrolactones (Figure 2). Despite the unexpectedly wide

407 distribution of Spt9 homologs, only 285 of the ~25,500 currently described microbial

408 natural products in The Natural Products Atlas (94) contain a g-butyrolactone moiety. Of

409 these, only 14 have been linked to their respective BGCs in the Minimum Information

410 about a Biosynthetic Gene cluster (MIBiG) repository (95) and only four of these,

411 including the salinipostins in Salinispora, A-factor and SCB1-3 in Streptomyces

412 coelicolor A3(2), and lactonamycin in Streptomyces rishiriensis, contain an Spt9

413 homolog. Thus, opportunities remain to identify the products of Spt9-containing BGCs

414 and to establish formal links between these compounds and their biosynthetic origins.

415 Our results suggest that the production of g-butyrolactones and related compounds may

416 be more common than previously recognized.
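The counts quoted in this paragraph can be put on a common scale with simple proportions. The sketch below uses only the numbers given in the text (285 of ~25,500 compounds, 14 linked to BGCs, 4 with an Spt9 homolog). The total is approximate, so the first percentage is approximate as well. The variable and function names are illustrative.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts quoted in the paragraph above; the total is approximate (~25,500).
total_products = 25_500
gbl_products = 285
linked_to_bgc = 14
with_spt9_homolog = 4

summary = {
    "gbl_share_of_atlas": round(percent(gbl_products, total_products), 2),
    "gbl_linked_to_bgc": round(percent(linked_to_bgc, gbl_products), 2),
    "linked_with_spt9": round(percent(with_spt9_homolog, linked_to_bgc), 2),
}
for name, value in summary.items():
    print(f"{name}: {value}%")
```

On these figures, g-butyrolactone-containing compounds make up roughly 1.1% of the catalogued products, about 4.9% of those are linked to a BGC, and about 29% of the linked BGCs contain an Spt9 homolog.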

417

418 A phylogenetic tree of the top 403 Spt9 homologs, including 22 AfsA homologs from

419 experimentally characterized BGCs, showed that the associated g-butyrolactone, g-

420 butenolide, and furan signaling molecules are restricted to a clade that is distinct from

421 the majority of uncharacterized Spt9 sequences (Figure 3). The Spt9 tree also showed

422 major incongruencies with recognized Actinobacterial and Gammaproteobacterial

423 classification, suggesting extensive horizontal gene transfer. The genomic environments

424 around the Spt9 homologs were diverse, suggesting the potential production of

425 considerable chemical diversity. It remains to be seen if all of these Spt9 homologs

426 catalyze g-butyrolactone synthase-like reactions, especially when they are distantly

427 related to experimentally characterized AfsA homologs. A next step towards establishing

428 their functionality is heterologous expression of these Spt9 homologs with various

429 precursor substrates to test whether they perform the canonical AfsA condensation

430 reaction that assembles a fatty acid ester (b-ketoacyl-DHAP ester) intermediate

431 (15).

432

433 Surprisingly, we discovered a large clade of Spt9 homologs in the genus Nocardia that

434 shared similar operon structure with the Salinispora spt BGC (starred, Figure 3). To

435 date, no small molecules isolated from Nocardia spp. have been linked to these BGCs.

436 A number of differences distinguish the Salinispora spt BGC from the 91 spt-like BGCs

437 observed both in Nocardia and other genera (Figure 4). Of particular note is the

438 absence of spt8 (a flavin-dependent oxidoreductase) outside of the genus Salinispora

439 and the different gene organization, with spt7 occurring after spt9, in many of the spt-

440 like BGCs. Several spt-like BGCs also have an additional nitroreductase gene (Figure

441 4) between spt9 and spt7, suggesting the production of a g-butyrolactone with a reduced

442 nitrogen or nitro functional group. Other spt-like BGCs lack the AMP-ligase spt2 and the

443 acyl carrier protein spt4, suggesting the products may lack the extended aliphatic

444 sidechain observed in the salinipostins. All of these variations suggest the production of

445 new chemical diversity and represent potential examples of BGC diversification arising

446 in response to different selective pressures.
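The differences described in this paragraph (genes absent, genes added, and spt7 positioned after spt9) can be expressed as a comparison of ordered gene lists against the spt1-9 reference. The following is a minimal sketch under that assumption. The example cluster is hypothetical and only combines features mentioned in the text; it does not represent any specific genome in Figure 4.

```python
REFERENCE_SPT = [f"spt{i}" for i in range(1, 10)]  # spt1 ... spt9


def compare_to_reference(cluster, reference=REFERENCE_SPT):
    """Summarize how an ordered gene list differs from the reference cluster."""
    missing = [gene for gene in reference if gene not in cluster]
    extra = [gene for gene in cluster if gene not in reference]
    spt7_after_spt9 = (
        "spt7" in cluster
        and "spt9" in cluster
        and cluster.index("spt7") > cluster.index("spt9")
    )
    return {"missing": missing, "extra": extra, "spt7_after_spt9": spt7_after_spt9}


# Hypothetical spt-like cluster: no spt8, a nitroreductase between
# spt9 and spt7, and spt7 located after spt9.
example_cluster = [
    "spt1", "spt2", "spt3", "spt4", "spt5", "spt6",
    "spt9", "nitroreductase", "spt7",
]
print(compare_to_reference(example_cluster))
```

Real comparisons would first require assigning homologs to each spt gene, which this sketch takes as given.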

447

448 Also of note are the spt2-spt3 and spt6-spt9 gene fusions observed in the genera

449 Nocardia, Gordonia, Tsukamurella, Mycobacterium, Dietzia, and Streptomyces. Both

450 pairs of fused genes appear functional based on the maintenance of conserved

451 functional domains (Supplementary Figure S3). Dietzia timorensis is the only strain with

452 both spt2-spt3 and spt6-spt9 fusions and is sister to the large clade containing the other

453 gene fusions. Protein fusions in BGCs can arise when the clustered genes are likely co-

454 transcribed and co-translated; these fusion products can provide evidence of functional

455 interaction and perhaps even a selective advantage over individual proteins (90, 91,

456 96). The gene fusions observed in the spt-like BGCs appear to represent the evolution

457 of more complex, multifunctional proteins in these strains. The spt2-spt3 and spt6-spt9

458 gene fusions are similar to recently described multi-domain enzyme fusions in the

459 desferrioxamine (des) BGC, where they are hypothesized to contribute to chemical

460 diversification (96). Two additional fusions involving an AfsA/Spt9 homolog were

461 observed within trans-AT PKS modules associated with gladiofungin and gladiostatin

462 biosynthesis, where AfsA functions both for unprecedented offloading and butenolide

463 formation (97, 98). Identifying unusual gene fusions such as these represents an

464 exciting new avenue for genome-mining driven natural product discovery (96, 99).

465

466 Conservation of the spt BGC in the 116 Salinispora genomes examined here suggests it

467 was present in a common ancestor of the genus. Yet, the Salinispora Spt9 phylogeny is

468 incongruent with the established species phylogeny (Figure S4). These incongruencies

469 are likely due to homologous recombination and horizontal gene transfer (HGT) events,

470 the most apparent of which is represented by the clustering of the S. tropica Spt9

471 sequences within the S. arenicola clade. All twelve S. tropica Spt1-9 sequences share

472 the same evolutionary history, providing evidence that the HGT and recombination

473 event affected the entire BGC (Figure 5A). Co-localization of S. tropica and S. arenicola

474 in Bahamian and Yucatán sediments provides spatial opportunities for these exchange

475 events to occur (Figure 5B, Figure S7). Horizontal gene transfer and homologous

476 recombination have been identified as important avenues of BGC transfer and

477 diversification, especially in Actinobacteria (100). It remains unknown if the S. arenicola

478 spt locus that appears to have replaced the ancestral version or been acquired de novo

479 in S. tropica provides a selective advantage or affects the compounds produced. This

480 homologous recombination and HGT event acting on an entire BGC adds to growing

481 evidence of gene cluster transfer between both closely and distantly related bacteria, as

482 seen in the granaticin, coronafacoyl phytotoxins, tunicamycin, foxicin,

483 antimycin, streptomycin, and bicyclomycin BGCs (100). BGC exchange is well

484 documented in the genus Salinispora and has been linked to gene gain, loss,

485 duplication, and divergence in lineage-specific patterns (62, 84, 101). Overall,

486 acquisition of the spt BGC by S. tropica highlights the importance of understanding the

487 functional roles of its products and the effects of evolutionary exchange events such as

488 these on population and species-level dynamics.
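The incongruence described above, with S. tropica sequences clustering inside the S. arenicola clade in the gene tree but not in the species tree, can be checked programmatically on a rooted tree. The sketch below assumes rooted trees written as nested tuples with leaves labeled `species|strain`. The toy trees, strain labels, and function names are all hypothetical and are not the trees in Figure 5A or Figure S4.

```python
def leaves(node):
    """Return all leaf labels under a node (a leaf is a string)."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out


def smallest_clade(node, targets):
    """Leaf set of the smallest subtree containing all targets, or None."""
    node_leaves = set(leaves(node))
    if not targets <= node_leaves:
        return None
    if not isinstance(node, str):
        for child in node:
            found = smallest_clade(child, targets)
            if found is not None:
                return found
    return node_leaves


def is_nested_within(tree, inner, outer):
    """True if 'inner' is monophyletic and falls inside the smallest 'outer' clade."""
    all_leaves = leaves(tree)
    inner_set = {leaf for leaf in all_leaves if leaf.split("|")[0] == inner}
    outer_set = {leaf for leaf in all_leaves if leaf.split("|")[0] == outer}
    if not inner_set or not outer_set:
        raise ValueError("both species must be present in the tree")
    inner_monophyletic = smallest_clade(tree, inner_set) == inner_set
    return inner_monophyletic and inner_set <= smallest_clade(tree, outer_set)


# Toy rooted trees (hypothetical topologies and strain labels).
species_tree = ((("arenicola|a1", "arenicola|a2"),
                 ("tropica|t1", "tropica|t2")), "pacifica|p1")
gene_tree = ((("arenicola|a1", ("tropica|t1", "tropica|t2")),
              "arenicola|a2"), "pacifica|p1")

print(is_nested_within(species_tree, "tropica", "arenicola"))
print(is_nested_within(gene_tree, "tropica", "arenicola"))
```

A pattern where the check fails on the species tree but succeeds on the gene tree is the kind of signal interpreted here as transfer from S. arenicola to S. tropica. The sketch ignores branch support and rooting uncertainty, both of which matter in practice.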

489

490 The spt BGC has been linked to the production of the salinipostins A-K (63, 64,

491 68), Sal-GBL1 and Sal-GBL2 (68), and the salinilactones A-H (66, 67), which share

492 structural similarities to the A-factor family of g-butyrolactone signaling molecules.

493 Additionally, S. arenicola and S. pacifica produce two other AHLs that have yet to be

494 linked to their biosynthetic genes (102). AHLs are the most common class of

495 autoinducer signaling molecules produced by Gram-negative bacteria; thus, Salinispora

496 appears to employ both g-butyrolactone and homoserine lactone signaling molecules.

497 While further studies are needed to understand the ecological functions of the spt

498 products, there is evidence that lactone signaling molecules affect microbial community

499 organization and function (103) and can also elicit specialized metabolite production

500 (40, 104, 105). Thus, perhaps the small molecule spt products regulate the expression

501 of other biosynthetic pathways in Salinispora. In support of this, spt9 genes were

502 detected within Salinispora PKS and NRPS gene clusters. This is reminiscent of the

503 methylenomycin BGC in Streptomyces coelicolor A3(2), which encodes the production

504 of methylenomycin furan (MMF) signaling molecules that induce the production of the

505 methylenomycin antibiotic (106). Additionally, we identified an spt-like BGC neighboring

506 the recently identified cyphomycin PKS BGC in a Brazilian Streptomyces sp. ISID311

507 isolated from the fungus-growing ant Cyphomyrmex sp. (107). None of the genes in this

508 spt-like BGC have been linked to cyphomycin biosynthesis (107), which suggests their

509 small molecule products may instead have a regulatory role. Little is known about the

510 role of signaling compounds in regulating actinomycete specialized metabolism; thus,

511 the recently reported total synthesis of salinipostin (108) and the identification of

512 molecules from orphan spt-like BGCs could support future experiments designed to

513 explore these roles.

514

515 Our results reveal unexplored biosynthetic potential related to g-butyrolactone signaling

516 molecules. The key biosynthetic protein sequence, Spt9, is broadly distributed among

517 diverse bacteria and observed in a wide range of gene environments, suggesting the

518 potential for unrealized chemical diversity. Experimentally characterized g-butyrolactone,

519 g-butenolide, and furan BGCs are largely restricted to the genus Streptomyces, yet spt-

520 like BGCs are observed among bacterial genera that are not widely recognized for the

521 production of signaling molecules. Evidence of gene fusions and gene gain/loss in the

522 newly described spt-like BGCs suggests that new chemical diversity awaits discovery

523 within this unusual class of compounds.

524

525 Data Availability

526 All sequences analyzed in this paper were retrieved from publicly accessible databases

527 including JGI IMG/MER and NCBI; all sequence accession information is included in the

528 Supplementary Dataset S1. PCR sequences produced as part of this work can be

529 accessed at GenBank Accession ####, listed in Table S1. Additionally, all phylogenetic

530 tree alignment files for each tree figure are posted on the Open Science Framework

531 page for this project, accessible at:

532 https://osf.io/4g3mn/?view_only=6c4bdfd0afc74407bf0e6e56678a5c62

533

534 Acknowledgements

535 This research was supported by the National Institutes of Health (5R01GM085770 to

536 P.R.J. and B.S.M.), the National Science Foundation Graduate Research Fellowship

537 Program (Grant No. DGE-1650112 to K.E.C.), and the Japan Society for Promotion of

538 Science (JSPS Overseas Research Fellowship to Y.K.). We thank our colleagues Dr.

539 Leesa J Klau and Dr. Henrique R. Machado for advice on analyses and figure design.

540

541 References

542

543 1. Papenfort K, Bassler BL. 2016. Quorum sensing signal–response systems in

544 Gram-negative bacteria. Nat Rev Microbiol 14:576–588.

545 2. Novick RP, Geisinger E. 2008. Quorum Sensing in Staphylococci. Annu Rev

546 Genet 42:541–564.

547 3. Khokhlov AS, Tovarova II, Borisova LN, Pliner SA, Shevchenko LN, Kornitskaia

548 EI, Ivkina NS, Rapoport IA. 1967. A-faktor, obespechivaiushchii biosintez

549 streptomitsina mutantnym shtammom Actinomyces streptomycini. Dokl Akad

550 Nauk SSSR 177:232–235.

551 4. Ando N, Matsumori N, Sakuda S, Beppu T, Horinouchi S. 1997. Involvement of

552 AfsA in A-factor Biosynthesis as a Key Enzyme. J Antibiot (Tokyo) 50:847–852.

553 5. Lee YJ, Kitani S, Nihira T. 2010. Null mutation analysis of an afsA-family gene,

554 barX, that is involved in biosynthesis of the γ-butyrolactone autoregulator in

555 Streptomyces virginiae. Microbiology 156:206–210.

556 6. Sato K, Nihira T, Sakuda S, Yanagimoto M, Yamada Y. 1989. Isolation and

557 Structure of a New Butyrolactone Autoregulator from Streptomyces sp. FRI-5. J

558 Ferment Bioeng 68:170–173.

559 7. Hashimoto K, Nihira T, Sakuda S, Yamada Y. 1992. IM-2, a Butyrolactone

560 Autoregulator, Induces Production of Several Nucleoside Antibiotics in

561 Streptomyces sp. FRI-5. J Ferment Bioeng 73:449–455.

562 8. Kitani S, Yamada Y, Nihira T. 2001. Gene Replacement Analysis of the

563 Butyrolactone Autoregulator Receptor (FarA) Reveals that FarA Acts as a Novel

564 Regulator in Secondary Metabolism of Streptomyces lavendulae FRI-5. J

565 Bacteriol 183:4357–4363.

566 9. Kitani S, Iida A, Izumi T aki, Maeda A, Yamada Y, Nihira T. 2008. Identification of

567 genes involved in the butyrolactone autoregulator cascade that modulates

568 secondary metabolism in Streptomyces lavendulae FRI-5. Gene 425:9–16.

569 10. Waki M, Nihira T, Yamada Y. 1997. Cloning and Characterization of the Gene

570 (farA) Encoding the Receptor for an Extracellular Regulatory Factor (IM-2) from

571 Streptomyces sp. Strain FRI-5. J Bacteriol 179:5131–5137.

572 11. Gräfe U, Schade W, Eritt I, Fleck WF, Radics L. 1982. A new inducer of

573 anthracycline biosynthesis from Streptomyces viridochromogenes. J Antibiot

574 (Tokyo) 35:1722–1723.

575 12. Gräfe U, Reinhardt G, Schade W, Eritt I, Fleck WF, Radics L. 1983. Interspecific

576 Inducers of Cytodifferentiation and Anthracycline Biosynthesis from Streptomyces

577 bikinensis and S. cyaneofuscatus. Biotechnol Lett 5:591–596.

578 13. Zou Z, Du D, Zhang Y, Zhang J, Niu G, Tan H. 2014. A γ-butyrolactone-sensing

579 activator/repressor, JadR3, controls a regulatory mini-network for jadomycin

580 biosynthesis. Mol Microbiol 94:490–505.

581 14. Joo H-S, Yang Y-H, Lee C-S, Kim J-H, Kim B-G. 2007. Fragmentation study on

582 butanolides with tandem mass spectrometry and its application for the screening

583 of ScbR-captured quorum sensing molecules in Streptomyces coelicolor A3(2).

584 Rapid Commun Mass Spectrom 21:764–770.

585 15. Kato J-Y, Funa N, Watanabe H, Ohnishi Y, Horinouchi S. 2007. Biosynthesis of γ-

586 butyrolactone autoregulators that switch on secondary metabolism and

587 morphological development in Streptomyces. Proc Natl Acad Sci 104:2378–2383.

588 16. Efremenkova OV. 2016. A-Factor-Like Autoregulators. Russ J Bioorganic Chem

589 42:457–472.

590 17. Ceniceros A, Dijkhuizen L, Petrusma M. 2017. Molecular characterization of a

591 Rhodococcus jostii RHA1 γ-butyrolactone(-like) signalling molecule and its main

592 biosynthesis gene gblA. Sci Rep 7:1–13.

593 18. Onoprienko VV, Anisova LN, Blinova IN, Efremenkova OV, Koz’min YP, Khokhlov

594 AS. 1983. Bioregulators of Streptomyces coelicolor A3(2), p. 111–112. In VII

595 Sovetsko-indiĭskiĭ simpozium po khimii prirodnykh soedineniĭ. Tbilisi.

596 19. Takano E, Nihira T, Hara Y, Jones JJ, Gershater CJL, Yamada Y, Bibb M. 2000.

597 Purification and Structural Determination of SCB1, a Gamma-Butyrolactone That

598 Elicits Antibiotic Production in Streptomyces coelicolor A3(2). J Biol Chem

599 275:11010–11016.

600 20. Hsiao NH, Söding J, Linke D, Lange C, Hertweck C, Wohlleben W, Takano E.

601 2007. ScbA from Streptomyces coelicolor A3(2) has homology to fatty acid

602 synthases and is able to synthesize γ-butyrolactones. Microbiology 153:1394–

603 1404.

604 21. Hsiao NH, Nakayama S, Merlo ME, de Vries M, Bunet R, Kitani S, Nihira T,

605 Takano E. 2009. Analysis of Two Additional Signaling Molecules in Streptomyces

606 coelicolor and the Development of a Butyrolactone-Specific Reporter System.

607 Chem Biol 16:951–960.

608 22. Sidda JD, Poon V, Song L, Wang W, Yang K, Corre C. 2016. Overproduction and

609 identification of butyrolactones SCB1-8 in the antibiotic production superhost:

610 Streptomyces M1152. Org Biomol Chem 14:6390–6393.

611 23. Yamada Y, Sugamura K, Kondo K, Yanagimoto M, Okada H. 1987. The structure

612 of inducing factors for virginiamycin production in Streptomyces virginiae. J

613 Antibiot (Tokyo) 40:496–504.

614 24. Kondo K, Higuchi Y, Sakuda S, Nihira T, Yamada Y. 1989. New Virginiae

615 Butanolides From Streptomyces virginiae. J Antibiot (Tokyo) 42:1873–1876.

616 25. Kawachi R, Akashi T, Kamitani Y, Sy A, Wangchaisoonthorn U, Nihira T, Yamada

617 Y. 2000. Identification of an AfsA homologue (BarX) from Streptomyces virginiae

618 as a pleiotropic regulator controlling autoregulator biosynthesis, virginiamycin

619 biosynthesis and virginiamycin M1 resistance. Mol Microbiol 36:302–313.

620 26. Arakawa K, Mochizuki S, Yamada K, Noma T, Kinashi H. 2007. γ-Butyrolactone

621 autoregulator-receptor system involved in lankacidin and lankamycin production

622 and morphological differentiation in Streptomyces rochei. Microbiology 153:1817–

623 1827.

624 27. Arakawa K, Tsuda N, Taniguchi A, Kinashi H. 2012. The Butenolide Signaling

625 Molecules SRB1 and SRB2 Induce Lankacidin and Lankamycin Production in

626 Streptomyces rochei. ChemBioChem 13:1447–1457.

627 28. Yamamoto S, He Y, Arakawa K, Kinashi H. 2008. γ-Butyrolactone-Dependent

628 Expression of the Streptomyces Antibiotic Regulatory Protein Gene srrY Plays a

629 Central Role in the Regulatory Cascade Leading to Lankacidin and Lankamycin

630 Production in Streptomyces rochei. J Bacteriol 190:1308–1316.

631 29. Kitani S, Miyamoto KT, Takamatsu S, Herawati E, Iguchi H, Nishitomi K, Uchida

632 M, Nagamitsu T, Omura S, Ikeda H, Nihira T. 2011. Avenolide, a Streptomyces

633 hormone controlling antibiotic production in Streptomyces avermitilis. Proc Natl

634 Acad Sci 108:16410–16415.

635 30. Zhu J, Sun D, Liu W, Chen Z, Li J, Wen Y. 2016. AvaR2, a pseudo γ-

636 butyrolactone receptor homologue from Streptomyces avermitilis, is a pleiotropic

637 repressor of avermectin and avenolide biosynthesis and cell growth. Mol Microbiol

638 102:562–578.

639 31. Corre C, Song L, O’Rourke S, Chater KF, Challis GL. 2008. 2-Alkyl-4-

640 hydroxymethylfuran-3-carboxylic acids, antibiotic production inducers discovered

641 by Streptomyces coelicolor genome mining. Proc Natl Acad Sci 105:17510–

642 17515.

643 32. Sidda JD, Corre C. 2012. Gamma-Butyrolactone and Furan Signaling Systems in

644 Streptomyces. Methods in Enzymology, 1st ed. Elsevier Inc.

645 33. Recio E, Colinas Á, Rumbero Á, Aparicio JF, Martín JF. 2004. PI Factor, a Novel

646 Type Quorum-sensing Inducer Elicits Pimaricin Production in Streptomyces

647 natalensis. J Biol Chem 279:41586–41593.

648 34. Matselyukh B, Mohammadipanah F, Laatsch H, Rohr J, Efremenkova O, Khilya V.

649 2015. N-methylphenylalanyl-dehydrobutyrine diketopiperazine, an A-factor mimic

650 that restores antibiotic biosynthesis and morphogenesis in Streptomyces

651 globisporus 1912-B2 and Streptomyces griseus 1439. J Antibiot (Tokyo) 68:9–14.

652 35. Horinouchi S. 2002. A Microbial Hormone, A-factor, as a Master Switch for

653 Morphological Differentiation and Secondary Metabolism in Streptomyces griseus.

654 Front Biosci 2045–2057.

655 36. Takano E. 2006. γ-Butyrolactones: Streptomyces signalling molecules regulating

656 antibiotic production and differentiation. Curr Opin Microbiol 9:287–294.

657 37. Nishida H, Ohnishi Y, Beppu T, Horinouchi S. 2007. Evolution of γ-butyrolactone

658 synthases and receptors in Streptomyces. Environ Microbiol 9:1986–1994.

659 38. Willey JM, Gaskell AA. 2011. Morphogenetic Signaling Molecules of the

660 Streptomycetes. Chem Rev 111:174–187.

661 39. Polkade AV, Mantri SS, Patwekar UJ, Jangid K. 2016. Quorum sensing: An

662 under-explored phenomenon in the phylum Actinobacteria. Front Microbiol 7:1–

663 13.

664 40. Niu G, Chater KF, Tian Y, Zhang J, Tan H. 2016. Specialised metabolites

665 regulating antibiotic biosynthesis in Streptomyces spp. FEMS Microbiol Rev

666 40:554–573.

667 41. Daniel-Ivad M, Pimentel-Elardo S, Nodwell JR. 2018. Control of Specialized

668 Metabolism by Signaling and Transcriptional Regulation: Opportunities for New

669 Platforms for Drug Discovery? Annu Rev Microbiol 72:25–48.

670 42. Lee KM, Lee C-K, Choi S-U, Park H-R, Kitani S, Nihira T, Hwang Y-I. 2005.

671 Cloning and in vivo functional analysis by disruption of a gene encoding the γ-

672 butyrolactone autoregulator receptor from Streptomyces natalensis. Arch

673 Microbiol 184:249–257.

674 43. Healy FG, Eaton KP, Limsirichai P, Aldrich JF, Plowman AK, King RR. 2009.

675 Characterization of γ-butyrolactone autoregulatory signaling gene homologs in the

676 angucyclinone polyketide WS5995B producer Streptomyces acidiscabies. J

677 Bacteriol 191:4786–4797.

678 44. Choi S-U, Lee C-K, Hwang Y-I, Kinoshita H, Nihira T. 2004. Cloning and

679 functional analysis by gene disruption of a gene encoding a γ-butyrolactone

680 autoregulator receptor from Kitasatospora setae. J Bacteriol 186:3423–3430.

681 45. Ichikawa N, Oguchi A, Ikeda H, Ishikawa J, Kitani S, Watanabe Y, Nakamura S,

682 Katano Y, Kishi E, Sasagawa M, Ankai A, Fukui S, Hashimoto Y, Kamata S,

683 Otoguro M, Tanikawa S, Nihira T, Horinouchi S, Ohnishi Y, Hayakawa M,

684 Kuzuyama T, Arisawa A, Nomoto F, Miura H, Takahashi Y, Fujita N. 2010.

685 Genome Sequence of Kitasatospora setae NBRC 14216T: An Evolutionary

686 Snapshot of the Family Streptomycetaceae. DNA Res 17:393–406.

687 46. Aroonsri A, Kitani S, Hashimoto J, Kosone I, Izumikawa M, Komatsu M, Fujita N,

688 Takahashi Y, Shin-ya K, Ikeda H, Nihira T. 2012. Pleiotropic Control of Secondary

689 Metabolism and Morphological Development by KsbC, a Butyrolactone

690 Autoregulator Receptor Homologue in Kitasatospora setae. Appl Environ

691 Microbiol 78:8015–8024.

692 47. Intra B, Euanorasetr J, Nihira T, Panbangred W. 2016. Characterization of a

693 gamma-butyrolactone synthetase gene homologue (stcA) involved in bafilomycin

694 production and aerial mycelium formation in Streptomyces sp. SBI034. Appl

695 Microbiol Biotechnol 100:2749–2760.

696 48. Salehi-Najafabadi Z, Barreiro C, Rodríguez-García A, Cruz A, López GE, Martín

697 JF. 2014. The gamma-butyrolactone receptors BulR1 and BulR2 of Streptomyces

698 tsukubaensis: tacrolimus (FK506) and butyrolactone synthetases production

699 control. Appl Microbiol Biotechnol 98:4919–4936.

700 49. Tan G-Y, Peng Y, Lu C, Bai L, Zhong J-J. 2015. Engineering validamycin

701 production by tandem deletion of γ-butyrolactone receptor genes in Streptomyces

702 hygroscopicus 5008. Metab Eng 28:74–81.

703 50. Choi S-U, Lee C-K, Hwang Y-I, Kinosita H, Nihira T. 2003. γ-Butyrolactone

704 autoregulators and receptor proteins in non-Streptomyces actinomycetes

705 producing commercially important secondary metabolites. Arch Microbiol

706 180:303–307.

707 51. Du Y-L, Shen X-L, Yu P, Bai L-Q, Li Y-Q. 2011. Gamma-butyrolactone regulatory

708 system of Streptomyces chattanoogensis links nutrient utilization, metabolism,

709 and development. Appl Environ Microbiol 77:8415–8426.

710 52. Millán-Aguiñaga N, Chavarria KL, Ugalde JA, Letzel A-C, Rouse GW, Jensen PR.

711 2017. Phylogenomic Insight into Salinispora (Bacteria, Actinobacteria) Species

712 Designations. Sci Rep 7:3564.

713 53. Román-Ponce B, Millán-Aguiñaga N, Guillen-Matus D, Chase AB, Ginigini JGM,

714 Soapi K, Feussner KD, Jensen PR, Trujillo ME. 2020. Six novel species of the

715 obligate marine actinobacterium Salinispora, Salinispora cortesiana sp. nov.,

716 Salinispora fenicalii sp. nov., Salinispora goodfellowii sp. nov., Salinispora

717 mooreana sp. nov., ... Int J Syst Evol Microbiol 70:4668–4682.

718 54. Mincer TJ, Jensen PR, Kauffman CA, Fenical W. 2002. Widespread and

719 Persistent Populations of a Major New Marine Actinomycete Taxon in Ocean

720 Sediments. Appl Environ Microbiol 68:5005–5011.

721 55. Jensen PR, Gontang E, Mafnas C, Mincer TJ, Fenical W. 2005. Culturable marine

722 actinomycete diversity from tropical Pacific Ocean sediments. Environ Microbiol

723 7:1039–1048.

724 56. Mincer TJ, Fenical W, Jensen PR. 2005. Culture-dependent and culture-

725 independent diversity within the obligate marine actinomycete genus Salinispora.

726 Appl Environ Microbiol 71:7019–7028.

727 57. Maldonado LA, Fenical W, Jensen PR, Kauffman CA, Mincer TJ, Ward AC, Bull

728 AT, Goodfellow M. 2005. Salinispora arenicola gen. nov., sp. nov. and Salinispora

729 tropica sp. nov., obligate marine actinomycetes belonging to the family

730 Micromonosporaceae. Int J Syst Evol Microbiol 55:1759–1766.

731 58. Kim TK, Garson MJ, Fuerst JA. 2005. Marine actinomycetes related to the

732 ‘Salinospora’ group from the Great Barrier Reef sponge Pseudoceratina clavata.

733 Environ Microbiol 7:509–518.

734 59. Vidgen ME, Hooper JNA, Fuerst JA. 2012. Diversity and distribution of the

735 bioactive actinobacterial genus Salinispora from sponges along the Great Barrier

736 Reef. Antonie Van Leeuwenhoek 101:603–618.

737 60. Jensen PR, Moore BS, Fenical W. 2015. The marine actinomycete genus

738 Salinispora: a model organism for secondary metabolite discovery. Nat Prod Rep

739 32:738–751.

740 61. Feling RH, Buchanan GO, Mincer TJ, Kauffman CA, Jensen PR, Fenical W. 2003.

741 Salinosporamide A: A highly cytotoxic proteasome inhibitor from a novel microbial

742 source, a marine bacterium of the new genus Salinospora. Angew Chemie Int

743 Ed 42:355–357.

744 62. Letzel A-C, Li J, Amos GCA, Millán-Aguiñaga N, Ginigini J, Abdelmohsen UR,

745 Gaudêncio SP, Ziemert N, Moore BS, Jensen PR. 2017. Genomic insights into

746 specialized metabolism in the marine actinomycete Salinispora. Environ Microbiol

747 19:3660–3673.

748 63. Amos GCA, Awakawa T, Tuttle RN, Letzel A-C, Kim MC, Kudo Y, Fenical W,

749 Moore BS, Jensen PR. 2017. Comparative Transcriptomics as a Guide to Natural

750 Product Discovery and Biosynthetic Gene Cluster Functionality. Proc Natl Acad Sci U S A

751 114:E11121–E11130.

752 64. Schulze CJ, Navarro G, Ebert D, DeRisi J, Linington RG. 2015. Salinipostins A-K,

753 long-chain bicyclic phosphotriesters as a potent and selective antimalarial

754 chemotype. J Org Chem 80:1312–1320.

755 65. Yoo E, Schulze CJ, Stokes BH, Onguka O, Yeo T, Mok S, Gnädig NF, Zhou Y,

756 Kurita K, Foe IT, Terrell SM, Boucher MJ, Cieplak P, Kumpornsin K, Lee MCS,

757 Linington RG, Long JZ, Uhlemann A-C, Weerapana E, Fidock DA, Bogyo M.

758 2020. The Antimalarial Natural Product Salinipostin A Identifies Essential α/β

759 Serine Hydrolases Involved in Lipid Metabolism in P. falciparum Parasites. Cell

760 Chem Biol 27:143–157.

761 66. Schlawis C, Kern S, Kudo Y, Grunenberg J, Moore B, Schulz S. 2018. Structural

762 Elucidation of Trace Components Combining GC/MS, GC/IR, DFT-Calculation

763 and Synthesis - Salinilactones, Unprecedented Bicyclic Lactones from Salinispora

764 Bacteria. Angew Chemie Int Ed 57:14921–14925.

765 67. Schlawis C, Harig T, Ehlers S, Guillen-Matus DG, Creamer KE, Jensen PR,

766 Schulz S. 2020. Extending the Salinilactone Family. ChemBioChem 21:1629–

767 1632.

768 68. Kudo Y, Awakawa T, Du Y-L, Jordan PA, Creamer KE, Jensen PR, Linington RG,

769 Ryan KS, Moore BS. 2020. Expansion of gamma-butyrolactone signaling

770 molecule biosynthesis to phosphotriester natural products. bioRxiv

771 2020.10.11.335315.

772 69. Marchler-Bauer A, Bo Y, Han L, He J, Lanczycki CJ, Lu S, Chitsaz F, Derbyshire

773 MK, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Lu F, Marchler GH, Song JS,

774 Thanki N, Wang Z, Yamashita RA, Zhang D, Zheng C, Geer LY, Bryant SH. 2017.

775 CDD/SPARCLE: Functional classification of proteins via subfamily domain

776 architectures. Nucleic Acids Res 45:D200–D203.

777 70. Mendler K, Chen H, Parks DH, Lobb B, Hug LA, Doxey AC. 2019. AnnoTree:

778 visualization and exploration of a functionally annotated microbial tree of life.

779 Nucleic Acids Res 47:4442–4448.

780 71. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ.

781 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database

782 search programs. Nucleic Acids Res 25:3389–3402.

783 72. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S,

784 Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A.

785 2012. Geneious Basic: An integrated and extendable desktop software platform

786 for the organization and analysis of sequence data. Bioinformatics 28:1647–1649.

787 73. Hadjithomas M, Chen I-MA, Chu K, Huang J, Ratner A, Palaniappan K, Andersen

788 E, Markowitz V, Kyrpides NC, Ivanova NN. 2017. IMG-ABC: New features for

789 bacterial secondary metabolism analysis and targeted biosynthetic gene cluster

790 discovery in thousands of microbial genomes. Nucleic Acids Res 45:D560–D565.

791 74. Blin K, Wolf T, Chevrette MG, Lu X, Schwalen CJ, Kautsar SA, Suarez Duran HG,

792 de los Santos ELC, Kim HU, Nave M, Dickschat JS, Mitchell DA, Shelest E,

793 Breitling R, Takano E, Lee SY, Weber T, Medema MH. 2017. antiSMASH 4.0—

794 improvements in chemistry prediction and gene cluster boundary identification.

795 Nucleic Acids Res 45:W36–W41.

796 75. Blin K, Shaw S, Steinke K, Villebro R, Ziemert N, Lee SY, Medema MH, Weber T.

797 2019. antiSMASH 5.0: updates to the secondary metabolite genome mining

798 pipeline. Nucleic Acids Res 47:W81–W87.

799 76. Medema MH, Takano E, Breitling R. 2013. Detecting Sequence Homology at the

800 Gene Cluster Level with MultiGeneBlast. Mol Biol Evol 30:1218–1223.

801 77. Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, Simonovic

802 M, Doncheva NT, Morris JH, Bork P, Jensen LJ, Von Mering C. 2019. STRING

803 v11: Protein-protein association networks with increased coverage, supporting

804 functional discovery in genome-wide experimental datasets. Nucleic Acids Res

805 47:D607–D613.

806 78. Edgar RC. 2004. MUSCLE: Multiple sequence alignment with high accuracy and

807 high throughput. Nucleic Acids Res 32:1792–1797.

808 79. Maddison WP, Maddison DR. 2018. Mesquite: a modular system for evolutionary

809 analysis. Version 3.40.

810 80. Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of

811 best-fit models of protein evolution. Bioinformatics 27:1164–1165.

812 81. Stamatakis A. 2014. RAxML version 8: A tool for phylogenetic analysis and post-

813 analysis of large phylogenies. Bioinformatics 30:1312–1313.

814 82. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. 2010.

815 New algorithms and methods to estimate maximum-likelihood phylogenies:

816 Assessing the performance of PhyML 3.0. Syst Biol 59:307–321.

817 83. Lefort V, Longueville JE, Gascuel O. 2017. SMS: Smart Model Selection in

818 PhyML. Mol Biol Evol 34:2422–2424.

819 84. Ziemert N, Lechner A, Wietz M, Millán-Aguiñaga N, Chavarria KL, Jensen PR.

820 2014. Diversity and evolution of secondary metabolism in the marine

821 actinomycete genus Salinispora. Proc Natl Acad Sci U S A 111:E1130–E1139.

822 85. Rambaut A. 2016. FigTree v1.4.3.

823 86. Letunic I, Bork P. 2019. Interactive Tree Of Life (iTOL) v4: recent updates and

824 new developments. Nucleic Acids Res 47:W256–W259.

825 87. Nouioui I, Carro L, García-López M, Meier-Kolthoff JP, Woyke T, Kyrpides NC,

826 Pukall R, Klenk H-P, Goodfellow M, Göker M. 2018. Genome-Based Taxonomic

827 Classification of the Phylum Actinobacteria. Front Microbiol 9:1–119.

828 88. Dillon SC, Bateman A. 2004. The Hotdog fold: Wrapping up a superfamily of

829 thioesterases and dehydratases. BMC Bioinformatics 5:109.

830 89. Chen I-MA, Chu K, Palaniappan K, Pillay M, Ratner A, Huang J, Huntemann M,

831 Varghese N, White JR, Seshadri R, Smirnova T, Kirton E, Jungbluth SP, Woyke

832 T, Eloe-Fadrosh EA, Ivanova NN, Kyrpides NC. 2018. IMG/M v.5.0: an integrated

833 data management and comparative analysis system for microbial genomes and

834 microbiomes. Nucleic Acids Res 47:D666–D677.

835 90. Enright AJ, Iliopoulos I, Kyrpides NC, Ouzounis CA. 1999. Protein interaction

836 maps for complete genomes based on gene fusion events. Nature 402:86–90.

837 91. Marcotte EM, Pellegrini M, Ng H-L, Rice DW, Yeates TO, Eisenberg D. 1999.

838 Detecting Protein Function and Protein-Protein Interactions from Genome

839 Sequences. Science 285:751–753.

840 92. Penn K, Jenkins C, Nett M, Udwary DW, Gontang EA, McGlinchey RP, Foster B,

841 Lapidus A, Podell S, Allen EE, Moore BS, Jensen PR. 2009. Genomic islands link

842 secondary metabolism to functional adaptation in marine Actinobacteria. ISME J

843 3:1193–1203.

844 93. Ziemert N, Podell S, Penn K, Badger JH, Allen E, Jensen PR. 2012. The natural

845 product domain seeker NaPDoS: A phylogeny based bioinformatic tool to classify

846 secondary metabolite gene diversity. PLoS One 7:e34064.

847 94. Van Santen JA, Jacob G, Singh AL, Aniebok V, Balunas MJ, Bunsko D, Neto FC,

848 Castaño-Espriu L, Chang C, Clark TN, Little JLC, Delgadillo DA, Dorrestein PC,

849 Duncan KR, Egan JM, Galey MM, Haeckl FPJ, Hua A, Hughes AH, Iskakova D,

850 Khadilkar A, Lee J-H, Lee S, Legrow N, Liu DY, Macho JM, McCaughey CS,

851 Medema MH, Neupane RP, O’Donnell TJ, Paula JS, Sanchez LM, Shaikh AF,

852 Soldatou S, Terlouw BR, Tran TA, Valentine M, Van Der Hooft JJJ, Vo DA, Wang

853 M, Wilson D, Zink KE, Linington RG. 2019. The Natural Products Atlas: An Open

854 Access Knowledge Base for Microbial Natural Products Discovery. ACS Cent Sci

855 5:1824–1833.

856 95. Kautsar SA, Blin K, Shaw S, Navarro-Muñoz JC, Terlouw BR, van der Hooft JJJ,

857 van Santen JA, Tracanna V, Suarez Duran HG, Pascal Andreu V, Selem-Mojica

858 N, Alanjary M, Robinson SL, Lund G, Epstein SC, Sisto AC, Charkoudian LK,

859 Collemare J, Linington RG, Weber T, Medema MH. 2020. MIBiG 2.0: a repository

860 for biosynthetic gene clusters of known function. Nucleic Acids Res 48:D454–

861 D458.

862 96. Chevrette MG, Gutiérrez-García K, Selem-Mojica N, Aguilar-Martínez C, Yañez-

863 Olvera A, Ramos-Aboites HE, Hoskisson PA, Barona-Gómez F. 2020.

864 Evolutionary dynamics of natural product biosynthesis in bacteria. Nat Prod Rep

865 37:566–599.

866 97. Niehs SP, Kumpfmüller J, Dose B, Little RF, Ishida K, Flórez LV, Kaltenpoth M,

867 Hertweck C. 2020. Insect-Associated Bacteria Assemble the Antifungal

868 Butenolide Gladiofungin by Non-Canonical Polyketide Chain Termination. Angew

869 Chemie Int Ed.

870 98. Nakou IT, Jenner M, Dashti Y, Romero-Canelón I, Masschelein J,

871 Mahenthiralingam E, Challis GL. 2020. Genomics-driven discovery of a novel

872 glutarimide antibiotic from Burkholderia gladioli reveals an unusual polyketide

873 synthase chain release mechanism. Angew Chemie Int Ed

874 10.1002/anie.202009007.

875 99. de Rond T, Asay JE, Moore BS. 2020. Co-Occurrence of Enzyme Domains

876 Guides the Discovery of an Oxazolone Synthetase. bioRxiv 2020.06.11.147165.

877 100. Park CJ, Smith JT, Andam CP. 2019. Horizontal Gene Transfer and Genome

878 Evolution in the Phylum Actinobacteria, p. 155–174. In Horizontal Gene Transfer.

879 101. Bruns H, Crüsemann M, Letzel A-C, Alanjary M, McInerney JO, Jensen PR,

880 Schulz S, Moore BS, Ziemert N. 2017. Function-related replacement of bacterial

881 siderophore pathways. ISME J 12:320–329.

882 102. Bose U, Ortori CA, Sarmad S, Barrett DA, Hewavitharana AK, Hodson MP,

883 Fuerst JA, Shaw PN. 2017. Production of N-acyl homoserine lactones by the

884 sponge-associated marine actinobacteria Salinispora arenicola and Salinispora

885 pacifica. FEMS Microbiol Lett 364.

886 103. McBride SG, Strickland M. 2019. Quorum sensing modulates microbial efficiency

887 by regulating bacterial investment in nutrient acquisition enzymes. Soil Biol

888 Biochem 136:107514.

889 104. Patteson JB, Lescallette AR, Li B. 2019. Discovery and Biosynthesis of

890 Azabicyclene, a Conserved Nonribosomal Peptide in Pseudomonas aeruginosa.

891 Org Lett 21:4955–4959.

892 105. Okada BK, Seyedsayamdost MR. 2017. Antibiotic dialogues: induction of silent

893 biosynthetic gene clusters by exogenous small molecules. FEMS Microbiol Rev

894 41:19–33.

895 106. Alberti F, Leng DJ, Wilkening I, Song L, Tosin M, Corre C. 2019. Triggering the

896 expression of a silent gene cluster from genetically intractable bacteria results in

897 scleric acid discovery. Chem Sci 10:453–463.

898 107. Chevrette MG, Carlson CM, Ortega HE, Thomas C, Ananiev GE, Barns KJ, Book

899 AJ, Cagnazzo J, Carlos C, Flanigan W, Grubbs KJ, Horn HA, Hoffmann FM,

900 Klassen JL, Knack JJ, Lewin GR, McDonald BR, Muller L, Melo WGP, Pinto-

901 Tomás AA, Schmitz A, Wendt-Pienkowski E, Wildman S, Zhao M, Zhang F, Bugni

902 TS, Andes DR, Pupo MT, Currie CR. 2019. The antimicrobial potential of

903 Streptomyces from insect microbiomes. Nat Commun 10:1–11.

904 108. Okamura H, Fujioka T, Mori N, Taniguchi T, Monde K, Watanabe H, Takikawa H.

905 2019. First enantioselective synthesis of salinipostin A, a marine cyclic enol-

906 phosphotriester isolated from Salinispora sp. Tetrahedron Lett 60:150917.

907

908

909 Figure Legends.

910

911 1. Actinobacterial AfsA homolog phylogeny, gene neighborhoods, and small

912 molecule signaling products. Maximum likelihood phylogeny created with

913 RAxML using an LG+I+G+F ProtTest model; branches are labeled with

914 bootstrap support (500 replicates). Colored circles indicate Actinobacterial

915 family. Gene neighborhoods are drawn 5’ to 3’ when genome sequences

916 were available and aligned by AfsA homolog (red); other genes are colored

917 by COG function. Colored boxes delineate γ-butyrolactones (green), γ-

918 butenolides (red), and furans (blue). Those mapped to the tree have been

919 experimentally linked to the gene cluster (references inset). Other

920 actinobacterial signaling molecules not associated with AfsA are also shown

921 (yellow). Bracketed numbers for each AfsA homolog and its associated

922 signaling molecule product are referred to in subsequent figures.

923

924 2. Distribution of Spt9 homologs across 27,000 bacterial genomes. Shaded taxa

925 contain Spt9/AfsA Pfam03756 A-factor biosynthesis hotdog domain-

926 containing protein homologs as determined using AnnoTree. Pie charts show

927 the proportion of genomes in that taxon with a Spt9 homolog, with the total

928 number of hits indicated. Pie chart size is proportional to the percentage of

929 Spt9/AfsA hits out of the 1,230 detected across all taxa.

930

931 3. Phylogeny and gene environments of characterized AfsA and top Spt9

932 homologs. Condensed maximum likelihood phylogeny of the top 403 Spt9

933 homologs (black) and 22 experimentally characterized AfsA homologs

934 (red) linked to known molecules. The tree was calculated with a

935 WAG+I+G+F ProtTest model with 500 replicates in RAxML; branches are

936 labeled with bootstrap support. Taxonomically coherent clades are collapsed

937 with the number of sequences indicated in parentheses. The Pseudomonas

938 sp. RIT357 AfsA homolog was used as an outgroup. Gene neighborhoods are

939 drawn 5’ to 3’ and aligned with the Spt9 homolog (red); genes are colored by

940 COG function as annotated by JGI IMG/MER. Shaded rectangles indicate

941 Actinobacterial family or Gammaproteobacterial class (see legend) with

942 circles proportional to the number of sequences in each familial clade.

943 Representative chemical structures are shown (γ-butyrolactones: salinipostin

944 A from Salinispora tropica CNB-440, A-factor from Streptomyces griseus; γ-

945 butenolide: avenolide from Streptomyces avermitilis; furan: methylenomycin

946 from Streptomyces coelicolor A3(2)) and bracketed numbers correspond to

947 the AfsA homologs and their associated signaling molecule products in Figure

948 1. Stars indicate salinipostin-like BGCs.

949

950 4. Phylogeny of Spt9 homologs within salinipostin-like BGCs. The maximum

951 likelihood phylogeny was calculated with a WAG+I+G+F ProtTest model with

952 500 replicates in RAxML; branches are labeled with bootstrap support. Gene

953 neighborhoods are drawn 5’ to 3’ and aligned with the Spt9 homolog (red).

954 The names of Spt9/AfsA homologs linked to the production of specific

955 compounds are colored in red. Gene fusions are shown with approximate

956 transition points and neighboring genes are colored by COG function as

957 annotated by JGI IMG/MER. Colored rectangles indicate Actinobacterial

958 family or Gammaproteobacterial class (see legend). Representative chemical

959 structures are shown (γ-butyrolactones: A-factor from Streptomyces griseus,

960 salinipostin A from Salinispora tropica CNB-440; γ-butenolide: SRB-1 from

961 Streptomyces rochei; furan: methylenomycin MMF-1 from Streptomyces

962 coelicolor A3(2)) and bracketed numbers correspond to the AfsA homologs

963 and associated signaling molecule products in Figure 1.

964

965 5. Concatenated Spt1-9 phylogeny. A) Colored by Salinispora species. B)

966 Colored by Salinispora strain isolation location. The maximum likelihood tree

967 was calculated in PhyML with a Smart Model Selection HIVb+G+I+F model

968 and midpoint-rooting; branches have proportional circles representing aLRT

969 branch support.

970
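The two pie-chart quantities described in the Figure 2 legend (the filled proportion and the chart size) can be illustrated with a minimal sketch. The function name `pie_chart_stats` and the example counts are hypothetical placeholders, not values from Dataset S1; only the total of 1,230 hits is taken from the legend.

```python
# Total Spt9/AfsA hits across all taxa, as stated in the Figure 2 legend.
TOTAL_HITS = 1230


def pie_chart_stats(genomes_in_taxon, genomes_with_hit, hits_in_taxon,
                    total_hits=TOTAL_HITS):
    """Return (filled fraction of the pie, percent of all hits used for sizing).

    Hypothetical helper for illustration only; it is not the authors' code.
    """
    if genomes_in_taxon <= 0 or total_hits <= 0:
        raise ValueError("genome and hit totals must be positive")
    # Proportion of genomes in the taxon that carry a homolog.
    filled_fraction = genomes_with_hit / genomes_in_taxon
    # Share of all detected hits that fall in this taxon (drives chart size).
    percent_of_hits = 100.0 * hits_in_taxon / total_hits
    return filled_fraction, percent_of_hits


# Made-up taxon: 200 genomes, 50 of them with a homolog, 123 hits in total.
filled, size_pct = pie_chart_stats(200, 50, 123)
print(f"filled fraction: {filled:.2f}; share of all hits: {size_pct:.1f}%")
```

A taxon can have more hits than genomes with a hit when a genome carries several homologs, which is why the two counts are passed separately.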

971 Supplemental Figure Legends.

972

973 1. S1. Organization and functional annotation of the Salinispora salinipostin spt

974 gene cluster. Pfam characterizations for the nine spt genes (1-9) are given in

975 parentheses. The structure of salinipostin A is shown.

976

977 2. S2. Expanded phylogenetic tree of the top 403 Spt9 homologs (black) and 22

978 experimentally characterized AfsA homologs (red). The RAxML maximum

979 likelihood tree was calculated with a WAG+I+G+F ProtTest model with 500

980 replicates; branches are labeled with bootstrap support. Gene neighborhoods are

981 drawn 5’ to 3’ for one representative taxon in each monophyletic clade and aligned

982 with the Spt9 homolog in red that was used to build the tree; genes are colored

983 by their COG function as annotated by JGI IMG/MER. Shaded rectangles

984 indicate Actinobacterial family or Gammaproteobacterial class (see legend).

985 Representative chemical structures are shown (γ-butyrolactones: salinipostin A

986 from Salinispora tropica CNB-440, A-factor from Streptomyces griseus; γ-

987 butenolide: avenolide from Streptomyces avermitilis; furan: methylenomycin from

988 Streptomyces coelicolor A3(2)) and bracketed numbers correspond to the AfsA

989 homologs and associated signaling molecule products in Figure 1. Stars indicate

990 salinipostin-like BGCs.

991

992 3. S3. Alignment of fused and individual Spt sequences. A) Fused Spt2-3 protein

993 sequences aligned with individual Spt2 and Spt3 sequences from Salinispora

994 tropica CNB-440 and Streptomyces sp. PRh5. Conserved regions are in green

995 on the identity graph and colored by amino acid residue. The fused Spt2-3

996 proteins have conserved residues in the Spt2 and Spt3 functional domains (as

997 predicted by the NCBI Conserved Domain Database tool, E-value cutoff 0.1). B)

998 Fused Spt6-9 protein sequence from Dietzia timorensis ID05-A0528 aligned with

999 individual Spt6 and Spt9 sequences from Mycobacterium sp. WY10,

1000 Tsukamurella sp. 1534, and Nocardia carnea NRRL B-1997. Conserved regions

1001 are in green on the identity graph and colored by amino acid residue. The fused

1002 Spt6-9 protein in Dietzia timorensis ID05-A0528 includes both functional domains

1003 and conserved sequence similarity to the individual Spt6 and Spt9 proteins (as

1004 predicted by the NCBI Conserved Domain Database tool, E-value cutoff 0.01).

1005

1006 4. S4. Salinispora species and Spt1-9 phylogenies. A) Maximum likelihood

1007 phylogenetic tree of 11 single-copy, concatenated proteins (DnaA, GyrB1,

1008 GyrB2, PyrH, RecA, Pgi, TrpB, AtpD, SucC, RpoB, and TopA) from 118

1009 Salinispora genomes as reported in Ziemert et al. 2014 (84). The PhyML tree

1010 was midpoint rooted and calculated with a Smart Model Selection AIC

1011 HIVb+G+I+F amino acid model; branches are labeled with aLRT support. Taxa

1012 lacking spt are in black and those with the BGC are colored by species as

1013 indicated in the legend. B) Individual maximum likelihood phylogenetic trees for

1014 Spt1-9 amino acid sequences from 116 Salinispora strains. The PhyML trees

1015 were midpoint-rooted and calculated with a Smart Model Selection model for each

1016 salinipostin protein: Spt1 (HIVb+G+I+F), Spt2 (Flu+G+I+F), Spt3 (HIVb+G+I+F),

1017 Spt4 (JTT+G), Spt5 (HIVb+G+I+F), Spt6 (HIVb+G+F), Spt7 (HIVb+G+I+F), Spt8

1018 (HIVw+G+I+F), Spt9 (JTT+G+I+F); branches are labeled with their aLRT support.

1019 Taxa are colored by species as indicated in the legend.

1020

1021 5. S5. Salinispora Spt9 phylogeny and gene cluster neighborhoods. Maximum

1022 likelihood tree for Spt9 amino acid sequences from 116 Salinispora strains with

1023 the spt BGC was calculated with a JTT+I+G ProtTest model with 500 replicates

1024 in RAxML; branches are labeled with bootstrap support. Spt9 homologs in non-

1025 Salinispora bacteria were used as outgroups. Brackets on the left indicate

1026 genomic island (GI) locations of the spt BGC. Taxa are colored by species (see

1027 key). Gene neighborhoods are drawn 5’ to 3’ and aligned with the Spt9 homolog

1028 (red). Neighboring genes are colored as per MultiGeneBlast only if they share

1029 homology to genes in the Salinispora tropica CNB-440 spt BGC contig; white

1030 indicates no homology to genes in the S. tropica CNB-440 spt BGC contig. Stars

1031 indicate BGCs in which variations from the canonical spt BGC were observed.

1032

1033 6. S6. Targeted PCR of Salinispora spt BGC variants. A) The spt BGC in S.

1034 arenicola CNS-296 was observed on two contigs. Primer sets A (6F, 6dntransR)

1035 and B (6F, 6dntrans-IGR_R), which span spt6 and the neighboring transposase

1036 gene, successfully amplified ~1400bp and ~3500bp products (gel image, lanes

1037 1A and 1B, respectively), from S. arenicola CNS-296 (expected product size in

1038 parentheses). These primers failed to yield a product from S. arenicola CNQ-884,

1039 which possesses a contiguous spt BGC (gel image, lanes 2A and 2B,

1040 respectively). The light blue coverage graph indicates coverage of 8 PCR product

1041 sequences mapped to the reference genome and the identity graph shows mean

1042 pairwise identity over all mapped sequences (green: 100% identity; yellow:

1043 between 30% and 99% identity; red: below 30% identity). B) Primer pair C targeting

1044 spt6-7 amplified an ~5kb product in S. arenicola CNS-296 (gel images, lane 1C).

1045 Sequencing from the 5’ and 3’ ends (900bp each) mapped back to spt6 (poorly,

1046 67% identity) and spt7 (99% identity) indicating an ~3.2kb insertion between spt6

1047 and spt7. Primer pair C amplified similarly sized contiguous spt6-7 products

1048 (~1,600bp) in both S. arenicola CNQ-884 (gel images, lane 2C) and S. pacifica

1049 CNS-143 (gel images, lane 3C). The sequences contained identical spt6-7

1050 sequence length and conserved domains indicating that the hypothetical gene

1051 between spt6 and spt7 in S. pacifica CNS-143 might be a genome assembly or

1052 annotation artifact. Coverage and identity graphs of mapped sequences are the

1053 same as described above.

1054

1055 7. S7. Co-occurrence of S. tropica and S. arenicola strains. Circles indicate

1056 geographic areas within the Bahamas and the Yucatán from which the 18

1057 Salinispora tropica and S. arenicola strains that are most closely related in the

1058 Spt1-9 phylogeny (Figure S5) were isolated. The total number of strains is

1059 indicated with pie charts showing the proportion of each species; two strains are

1060 not shown as exact collection location coordinates in the Bahamas are not known

1061 for S. tropica strains CNT-250 and CNT-261.

1062
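The color coding of the identity graphs in Figure S6 (green, yellow, red) can be written as a simple threshold rule. The sketch below is illustrative only: the function name `identity_color` is hypothetical, and treating every value from 30% up to but not including 100% as yellow is an assumption, since the legend does not say how values between 99% and 100% are drawn.

```python
def identity_color(mean_pairwise_identity):
    """Map a mean pairwise identity (in percent) to the Figure S6 color class.

    Hypothetical helper for illustration; thresholds follow the legend text.
    """
    if not 0.0 <= mean_pairwise_identity <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if mean_pairwise_identity == 100.0:
        return "green"   # 100% identity
    if mean_pairwise_identity >= 30.0:
        return "yellow"  # assumption: 30% up to, but not including, 100%
    return "red"         # below 30% identity


# Example values, including the 67% and 99% identities quoted for spt6 and spt7.
for pct in (100.0, 99.0, 67.0, 12.5):
    print(pct, identity_color(pct))
```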

1063 Supplemental Table Legends.

1064 Table S1. NCBI GenBank Accession numbers for the spt6-7 PCR products as

1065 described in Figure S6.

1066

1067

1068

1069 Supplementary Datasets.

1070 Dataset S1. List of all sequence datasets with relevant accession information used in

1071 this paper’s analyses.

1072 Tab 1: Fig1_AnnoTree_hits_genes: Number of Spt9 (pfam PF03756) genome

1073 hits identified by AnnoTree across Phyla, Class, Order, Family, Genus, and Species;

1074 the proportion of all hits and the number of genomes in each clade were used to draw

1075 and scale the pie charts of Figure 2. Also listed are the Gene ID, GTDB ID, and protein

1076 sequence of all genome hits.

1077 Tab 2: 403_spt9homolog_IMGgeneinfo: List of all 403 Spt9 homolog sequence

1078 information (including locus tag, gene product name, genome ID, genome name,

1079 Genbank accession, amino acid sequence length, scaffold ID, scaffold external

1080 accession, scaffold length, scaffold GC%, and Pfam). Corresponds to Figure 3 and

1081 Figure S2.

1082 Tab 3: 22_known_afsAhomologs: List of 22 characterized AfsA homologs

1083 belonging to the gamma-butyrolactone, furan, gamma-butenolide, and ‘other’ classes of

1084 signaling molecules; corresponds to Figure 1. Information includes gene locus tag, gene

1085 symbol/name, molecule linked to gene if known, genome strain, NCBI protein

1086 accession, JGI Gene ID, JGI genome ID, and gene product name.

1087 Tab 4: 152_spt-likeBGCs: List of all 152 Spt9 homologs identified within Spt-like

1088 BGCs and additional homologs used for building the tree in Figure 4. Gene information

1089 includes JGI gene ID, locus tag, gene product name, JGI genome ID, genome name,

1090 start coordinate, end coordinate, strand, DNA sequence length, amino acid length,

1091 scaffold ID, scaffold external accession, scaffold GC %.

1092 Tab 5: 11_MLST_Salinispora: List of 119 Salinispora species strain DnaA,

1093 GyrB1, GyrB2, Pgi, TrpB, SucC, RecA, PyrH, TopA, AtpD, RpoD JGI gene ID

1094 accessions for Figure S4-A.

1095 Tab 6: Spt1_Salinispora_116: List of all 116 Spt1 Salinispora gene sequence

1096 information for Figure 5 and S4-B (including JGI gene ID, locus tag, gene product name,

1097 JGI genome ID, genome name, start coordinate, end coordinate, strand, DNA sequence

1098 length, amino acid sequence length, scaffold ID, scaffold external accession name, and

1099 pfam).

1100 Tab 7: Spt2_Salinispora_116: List of all 116 Spt2 Salinispora gene sequence

1101 information for Figure 5 and S4-B (included information is the same as Spt1).

1102 Tab 8: Spt3_Salinispora_116: List of all 116 Spt3 Salinispora gene sequence

1103 information for Figure 5 and S4-B (including JGI gene ID, locus tag, gene product name,

1104 JGI genome ID, genome name, start coordinate, end coordinate, strand, DNA sequence

1105 length, amino acid sequence length, scaffold ID, and scaffold external accession name).

1106 Tab 9: Spt4_Salinispora_116: List of all 116 Spt4 Salinispora gene sequence

1107 information for Figure 5 and S4-B (included information is the same as Spt1).

1108 Tab 10: Spt5_Salinispora_116: List of all 116 Spt5 Salinispora gene sequence

1109 information for Figure 5 and S4-B (included information is the same as Spt1).

1110 Tab 11: Spt6_Salinispora_116: List of all 116 Spt6 Salinispora gene sequence

1111 information for Figure 5 and S4-B (included information is the same as Spt1).

1112 Tab 12: Spt7_Salinispora_116: List of all 116 Spt7 Salinispora gene sequence

1113 information for Figure 5 and S4-B (included information is the same as Spt1).

1114 Tab 13: Spt8_Salinispora_116: List of all 116 Spt8 Salinispora gene sequence

1115 information for Figure 5 and S4-B (included information is the same as Spt1).

Tab 14: Spt9_Salinispora_116: List of all 116 Spt9 Salinispora gene sequence information for Figure 5 and S4-B (included information is the same as Spt1).

Figure 1. [Image not recoverable from the text extraction. Surviving labels: chemical structures of known actinomycete signaling molecules, grouped in the legend as gamma-butyrolactones, gamma-butenolides, furans, and other, each with its source citation (PI factor, Acl-1a to Acl-2d, Gräfe's factors, SCB1 to SCB8, MDD, A-factor, SVB1, Factor I, VB-A to VB-E, SRB1, SRB2, IM-2, avenolide, MMF-1 to MMF-5, RJB, salinipostins A to K, Sal-GBL1, and Sal-GBL2); a tree of the associated genes and homologs with entries numbered [1] to [12] (including AfsA [2], ScbA [3], JadW1 [4], SrrX [6], FarX [7], BarX [8], AvaA [9], MmfL [10], GblA [11], and Spt9 [12]); a family legend (Micromonosporineae, Nocardiaceae, Pseudonocardiaceae, Streptomycetaceae); tree scale 0.5; gene cluster scale bar 3 kb.]

Figure 2. [Image not recoverable from the text extraction. Surviving labels: a tree with bacterial phylum names as tip labels (for example Actinobacteriota, Proteobacteria, Firmicutes, Cyanobacteriota, Chloroflexota, and Spirochaetota); tree scale 0.01; a legend ranging from 1% of all hits to 25% of all hits; * only 1 genome in the phylum; ** no phylum classification.]

Figure 3. [Image not recoverable from the text extraction. Surviving labels: a tree of homolog sequences, many shown as collapsed clades with sequence counts (for example Rhodococcus_sp_(61), Nocardia_sp_(44), and Frankia_sp_(11)), together with Salinispora sequences and the reference sequences numbered [1] to [12]; a family legend (Cellulomonadaceae, Corynebacteriaceae, Dietziaceae, Frankiaceae, Gammaproteobacteria, Micrococcaceae, Micromonosporaceae, Mycobacteriaceae, Nocardiaceae, Nocardiopsaceae, Pseudonocardiaceae, Streptomycetaceae, Streptosporangiaceae, Thermomonosporaceae, Tsukamurellaceae); a clade-size legend (1, 5, and 25 taxa); structures of salinipostin A (S. tropica CNB-440, Spt9) [12], A-factor (S. griseus, AfsA) [2], methylenomycin MMF-1 (S. coelicolor A3(2), MmfL) [10], and avenolide (S. avermitilis, AvaA) [9]; gene cluster scale bar 3 kb; tree scale 0.6.]

Figure 4. [Image not recoverable from the text extraction. Surviving labels: a tree containing the reference sequences numbered [1] to [12] and individual sequences from Salinispora, Nocardia, Gordonia, Tsukamurella, Mycobacterium, Dietzia, Streptomyces, Rhodococcus, Kutzneria, Kitasatospora, Frankia, Citrobacter, Dickeya, Cellulomonas, and Pseudomonas strains; a family legend (Cellulomonadaceae, Dietziaceae, Frankiaceae, Gammaproteobacteria, Micromonosporaceae, Mycobacteriaceae, Nocardiaceae, Pseudonocardiaceae, Streptomycetaceae, Tsukamurellaceae); a gene cluster diagram labeled "spt" with genes numbered 1 to 9 and a 3 kb scale bar; structures of A-factor (S. griseus) [2], SRB1 (S. rochei) [6], methylenomycin MMF-1 (S. coelicolor A3(2)) [10], and salinipostin A (S. tropica CNB-440) [12]; tree scale 0.6.]

Figure 5. [Image not recoverable from the text extraction. Surviving labels: two panels, A and B, each a tree whose tips are Salinispora strains (for example Salinispora pacifica CNY331, Salinispora oceanensis CNT138, Salinispora tropica CNB-440, and Salinispora arenicola CNS-205), with each tip carrying a suffix such as "sp1", "sp5", "sp8", or "sp10"; a species legend (Salinispora tropica, Salinispora cortesiana, Salinispora mooreana, Salinispora oceanensis, Salinispora vitiensis, Salinispora pacifica, Salinispora arenicola); a location legend (Bahamas, Yucatán, Puerto Vallarta, Sea of Cortez, Hawaii, Palmyra, Fiji, Guam, Palau, Red Sea, Madeira Islands); tree scale 0.01; branch support marker for > 0.95 aLRT.]