AGRICULTURAL RESEARCH FOUNDATION INTERIM REPORT FUNDING CYCLE 2020 – 2022

TITLE: Probiotic solutions to improve Pacific oyster larvae growth and development

RESEARCH LEADER: Ryan Mueller (PI, OSU), Carla Schubiger (CoI, OSU), Chris Langdon (CoI, OSU), MK English (Grad. Student, OSU)

COOPERATORS: Sandra Loesgen (U. of Florida), Annika Jagels (Postdoc, U. of Florida)

EXECUTIVE SUMMARY: The work completed during the first year of this ARF project focused on identifying and characterizing bacterial isolates that can be used as probiotics in oyster aquaculture to improve the growth and health of oyster larvae and juveniles and to increase their resistance to disease. The genetic and physiological mechanisms of probiotics that underlie these outcomes in oysters are currently poorly defined. Therefore, the primary goal of our research was to define the antagonistic mechanisms of several isolated oyster probiotics by performing co-culture experiments between probiotics and a known oyster pathogen (Vibrio coralliilyticus strain RE22) and by performing bioinformatic analyses of genome sequences of the probiotics. For the latter, genome sequences of probiotics were compared to sequences of known proteins and genes to identify potential mechanisms of antagonism between probiotics and pathogens. DNA sequences with matches to factors known to be involved in antagonism will be targeted in the analysis of data from subsequent transcriptomic experiments of co-cultures of probiotics and pathogens with and without oyster larvae.

Due to the challenges of performing in-person laboratory research during the COVID-19 pandemic, our focus over the first year of this project turned to remotely performing detailed bioinformatic analyses to discover the genes and gene products that may be responsible for the killing or inhibitory activity of probiotic strains against the oyster bacterial pathogen, V. coralliilyticus strain RE22. Our analyses have led us to focus on one probiotic strain in particular, which we have identified as Epibacterium mobile strain B11. Results of comparative genomic analyses confidently mark this bacterium as a close phylogenetic relative of other known antibiotic-producing strains within the same genus. Additional bioinformatic analyses have shown that the E. mobile strain B11 genome encodes genes for the production of a relatively novel antibiotic, tropodithietic acid (TDA), which is also produced by a well-studied probiotic of shellfish and finfish, Phaeobacter inhibens (Zhao et al., 2019). Additionally, we have obtained preliminary confirmation by mass spectrometry of growth-stage-specific production of TDA by strain B11. Intriguingly, analysis of the genome of E. mobile strain B11 suggests that, in addition to TDA, this bacterium may produce entirely novel secondary metabolites with antagonistic activity against other bacteria. We are currently performing experiments to confirm the production of these new molecules and to explore their killing and/or inhibitory activity against other bacteria.

Although in-person experimental work has been slowed by the pandemic, we have completed several preliminary experiments to define the parameters of our proposed co-culture experiments. We are currently setting up these co-culture experiments with the probiotic strains DM14 and B11 to determine the genes expressed in the presence of V. coralliilyticus strain RE22, with the goal of providing further confirmation of the role of these genes in secondary metabolite production and potentially identifying additional genetic mechanisms responsible for probiotic activity in these strains.

OBJECTIVES: The primary objective of our work is to investigate the interactions between probiotics, pathogens, and oysters, in order to better define how probiotics enhance growth and suppress disease during larval and juvenile oyster development.

PROCEDURES:

New Probiotic Isolation and Testing

Probiotic bacterial strains used in this project were isolated from various sources using a uniform screening method. The DM14 probiotic isolate, which was the primary isolate identified for use in the original ARF proposal, was isolated from oyster spat that had survived a mortality event at Hatfield, with the assumption that the bacteria associated with this spat may have played a protective role against disease. This spat was crushed and plated on agar plates containing LB plus seawater to isolate bacteria capable of growth on this medium. After overnight incubation at 25 °C, a random colony was picked, cultured in LB plus seawater, and stocked with glycerol at -80 °C.

The second probiotic strain, B11, which we have subsequently focused on in our research this year (see reasons below), was isolated directly from oyster tissue. This tissue was dissected from the animal, heated in a microcentrifuge tube for 10 min in a 100 °C water bath, and plated on an agar plate containing LB plus 3% NaCl (LBS). After overnight incubation at 25 °C, an off-white colony was picked, cultured in LBS, and stocked with glycerol at -80 °C.

Both of these probiotic strains, along with all other putative probiotic isolates used in our project, were screened for antagonistic activity against the oyster pathogen, V. coralliilyticus strain RE22, using a uniform assay. Frozen stocks of the probiotics and V. coralliilyticus strain RE22 were streaked on LBS plates and grown overnight at 25 °C. Single colonies were grown overnight in 5 ml LBS with shaking at 25 °C. The V. coralliilyticus strain RE22 culture was diluted to approximately 1E05 cfu/ml with LBS, and 50 µl was spread on LBS plates with glass beads. After the plates had dried slightly, 10 µl of probiotic culture was spotted onto the RE22 lawn, and the plate was incubated overnight at 25 °C. Zones of inhibition in the RE22 lawn indicated probiotic activity.

16S Sequencing and Strain Identification

We sequenced the 16S rRNA gene of each isolate to rule out the possibility that one or more were duplicates and to confirm their taxonomic identity. A phenol:chloroform DNA extraction method was used to isolate DNA from 1 ml of overnight cultures of each probiotic. We then ran PCR to amplify the 16S rRNA gene using the primer pair 8F/1513R, creating a ~1500 bp gene product (Figure 1). DNA was cleaned and then sent to the CGRB for Sanger sequencing. Sequencing generated a forward and a reverse read for each sample. We trimmed low-quality ends from these reads, as judged from the base-call intensity chromatogram, then overlaid the trimmed forward and reverse reads in the Geneious Prime application to create a consensus sequence. Some reverse reads were consistently low quality, so for those samples only the forward read was used. Finally, we analyzed each consensus sequence with the BLAST software to identify the best matches in a database of all known prokaryotic 16S ribosomal RNA sequences. Best matches to each probiotic 16S sequence are reported in Table 1. If there was more than one close match based on these parameters, the isolate was identified only to the genus level rather than to species.

Antagonism Assays

We initially selected DM14 as the probiotic to be used in our co-culture experiment based on its fast, robust growth compared to the other strains. We performed a series of preliminary experiments to prepare for the main co-culture experiment, with the goal of identifying the time point at which the ratio of probiotic to pathogen cells was greatest, in order to sample for transcriptome analysis.

To estimate cell counts for these experiments, we first established equations giving a rough relationship between OD600 (optical density at 600 nm, a common way to measure cell culture density) and cell density in CFU/ml (colony-forming units per ml); this relationship was used to identify when a shift in relative abundance occurred between the pathogen and probiotic. DM14 and RE22 were cultured separately overnight. A series of dilutions was used to provide a range of OD600 readings. These dilutions were then further diluted to a concentration low enough to be counted using a plating method. We calculated an equation for each strain relating CFU counts to OD600 readings for each culture (R² values of 0.44 and 0.41 for DM14 and RE22, respectively; Figure 2). This relationship was used in these preliminary experiments, but we ultimately decided to use a Guava flow cytometer for more accurate cell measurements in the main experiment.

The first preliminary co-culture spanned a 24-hour incubation period with sampling at 1, 2, 4, 12, and 24 hours. Briefly, cells from overnight cultures of DM14 and RE22 were washed with ASW. Using the established equations to relate OD600 to CFU/ml, the probiotic DM14 was stocked at a density of 6E04 CFU/ml and the pathogen RE22 at 3E04 CFU/ml in the co-culture. We used the selective and differential medium TCBS agar, on which only vibrios (RE22) grow, to obtain RE22 counts, and subtracted these from the combined counts on ZMB plates, on which both strains grow. RE22 appeared to decrease in abundance around 4 hours, so we chose to shorten co-culture incubation times for the next preliminary experiment.

The second preliminary co-culture was a five-hour incubation. Aliquots were removed from the culture at 1-hour intervals. These aliquots were diluted to give a countable number of cells, spread on plates, and allowed to grow overnight. Colonies were enumerated and the original concentrations back-calculated. Analysis of this time course showed a marked change in relative abundance around 4 hours, but since the incubation ended at 5 hours, we decided to lengthen the next preliminary incubation.

The third preliminary experiment had the same setup and sampling procedures, but with a 10-hour time course. Looking at the ratios of DM14 to RE22 (Figure 3), we identified 7 h as the time point at which probiotic cells in the co-culture begin to outnumber those of the pathogen, indicating a killing or inhibitory effect of the probiotic cells against the pathogen cells.
In co-culture experiments for transcriptomic studies, the 7 h time point will be used for RNA extraction to investigate differentially expressed genes of the DM14 probiotic strain that may account for this observation, in support of our hypothesis that an antagonistic factor is expressed at this time.
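The plate-count arithmetic behind these time courses can be sketched as follows. This is a minimal illustration, not the analysis script used for the report: the function names, dilution factor, plated volume, and colony counts are all hypothetical. It back-calculates CFU/ml from a plate count, subtracts the TCBS (RE22 only) count from the ZMB (both strains) count to estimate DM14, and forms the probiotic-to-pathogen ratio.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate the density of the undiluted culture from one plate count."""
    return colonies * dilution_factor / plated_volume_ml


def probiotic_to_pathogen_ratio(zmb_cfu_ml, tcbs_cfu_ml):
    """ZMB counts both strains; TCBS counts only the vibrio (RE22)."""
    if tcbs_cfu_ml <= 0:
        raise ValueError("pathogen count must be positive to form a ratio")
    probiotic_cfu_ml = zmb_cfu_ml - tcbs_cfu_ml
    return probiotic_cfu_ml / tcbs_cfu_ml


# Hypothetical plate counts: (hour, ZMB colonies, TCBS colonies),
# each from a 1:1000 dilution with 0.1 ml plated (assumed values).
example_counts = [(1, 90, 30), (4, 50, 30), (7, 120, 30)]

ratios = {
    hour: probiotic_to_pathogen_ratio(
        cfu_per_ml(zmb, 1_000, 0.1),
        cfu_per_ml(tcbs, 1_000, 0.1),
    )
    for hour, zmb, tcbs in example_counts
}

for hour, ratio in sorted(ratios.items()):
    print(f"t = {hour} h: DM14/RE22 ratio = {ratio:.2f}")
```

A ratio above 1 means probiotic cells outnumber pathogen cells at that time point, which is the quantity plotted in Figure 3.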

RNA Extractions

The goal of the proposed transcriptomic study of a co-culture between the probiotic (DM14) and pathogen (RE22) is to determine how the cells react to each other; namely, we want to identify genes regulated in DM14 that might account for the killing ability of this probiotic against the oyster pathogen V. coralliilyticus strain RE22. This procedure involves extracting mRNA, the transcripts that guide the translation of proteins. A requirement for this goal was to simulate oyster hatchery conditions with relevant cell concentrations, while having enough cells from which to extract sufficient RNA. Toward this end, we determined the minimum number of cells needed to yield enough RNA for sequence library preparation and subsequent transcriptome sequencing. An overnight culture of DM14 was used to create aliquots ranging from 1E05 to 1E08 cells per sample. We performed extractions on each aliquot using Ribozol, following the manufacturer’s protocol. RNA yield was assessed by gel electrophoresis, looking for the characteristic double banding pattern of ribosomal RNAs. Gel electrophoresis of the extracted RNA indicated that 1E07 cells total was sufficient to produce enough RNA for sequencing library generation, establishing the cell abundances needed for subsequent co-culture experiments (Figure 4). To further evaluate the quality of the extracted RNA, a NanoDrop spectrophotometer was used to determine whether any residual chemical contaminants were present in the extractions. Results showed contamination with phenol (an ingredient in Ribozol). We found that an additional Et-NaAc cleanup successfully removed the phenol while minimizing RNA loss.
Based on the results of these preliminary experiments, co-cultures with a total volume of 100 ml and bacterial concentrations of 1E05 cells/ml (1E07 cells total) will allow for sufficient high-quality RNA extraction for subsequent sequencing of transcriptomes.
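The volume and density figures above follow from simple arithmetic, sketched here with hypothetical helper names. The only inputs taken from the text are the 1E07-cell minimum, the 1E05 cells/ml density, and the 100 ml volume.

```python
def total_cells(volume_ml, cells_per_ml):
    """Total number of cells in a culture of the given volume and density."""
    return volume_ml * cells_per_ml


def min_volume_ml(required_cells, cells_per_ml):
    """Smallest culture volume that contains the required number of cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell density must be positive")
    return required_cells / cells_per_ml


# Values stated in the text: 100 ml at 1E05 cells/ml gives 1E07 cells.
print(total_cells(100, 1e5))    # cells available in the planned co-culture
print(min_volume_ml(1e7, 1e5))  # ml needed to reach the 1E07-cell minimum
```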

Genome Sequencing, Assembly, and Annotation

We used a phenol:chloroform DNA extraction method to isolate genomic DNA from 1 ml each of overnight cultures of 5 probiotics (B1, B11, D16, D19, and DM14). DNA from these samples was prepared for sequencing by generating Illumina Nextera libraries, which were sequenced at OSU’s CGRB on the Illumina MiSeq. Reads of 250 bases from each library were assembled into the respective probiotic genomes using the SPAdes genome assembler (Bankevich et al., 2012). Contiguous sequences (contigs; segments of DNA that belong to the same organism, but do not overlap, likely due to inherent sequence properties) were assembled using kmers of 127 bp. Resulting contigs under 500 bp long were omitted from further gene-level analyses. Statistics for the resulting assembled genomes are shown in Table 2. N50 (the sequence length of the shortest contig at 50% of the total genome length) and the average, median, maximum, and minimum contig lengths are all reported in bases. The genome annotation program Prokka (Seemann, 2014) was used to assign identities to all of the genes in each genome. Genes were either classified based on similarity to a known, characterized gene or designated a hypothetical protein (Table 3).
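The N50 statistic defined above can be computed as in the following sketch. It uses the common definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly length). Applying the 500 bp minimum before computing the statistic is an assumption made here for illustration; the report does not say whether Table 2 was computed before or after that filter, and the contig lengths are made up.

```python
def n50(contig_lengths, min_length=500):
    """Return the N50 of an assembly, ignoring contigs shorter than min_length.

    min_length=500 mirrors the contig-length filter mentioned in the text;
    using it for the statistic itself is an assumption.
    """
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs pass the length filter")
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bases.
example_contigs = [100, 600, 800, 1000, 2000, 400]
print(n50(example_contigs))
```

With the example lengths, the contigs of 100 and 400 bases are dropped, the remaining assembly totals 4,400 bases, and the running total first reaches 2,200 bases at the 1,000-base contig.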

Average Nucleotide Identity (ANI) Genome Comparisons

To define the phylogenetic relationships between the genome of Epibacterium mobile strain B11 and other closely related bacterial genomes, and to confirm the placement of strain B11 within the E. mobile species, we performed whole-genome ANI comparisons using the FastANI tool (Jain et al., 2018). All publicly available genomes from the Epibacterium genus were downloaded from the NCBI GenBank database (n = 59) and used to perform whole-genome nucleotide alignments for all pairwise comparisons. Alignment fraction (AF) values were determined from the ratio of matched to total mappings reported by FastANI. Heatmaps of ANI distance matrices were created in R (R Core Team, 2018) with the ‘heatmap.2’ function of the ‘gplots’ package. Genome-wide average nucleotide identity (ANI) and alignment fraction (AF) analyses have been widely used for defining the boundaries of bacterial species (Jain et al., 2018; Richter and Rosselló-Móra, 2009; Varghese et al., 2015). Based on analysis of >90,000 prokaryotic genomes, a level of 95% ANI as calculated by the FastANI program has been shown to represent an appropriate cutoff for discriminating bacterial species (Jain et al., 2018).

Reference Guided Genome Assembly of E. mobile Strain B11

The initial genome assembly of B11 sequences using the SPAdes genome assembler (Bankevich et al., 2012) produced an assembly with 92 contiguous sequences (i.e., contigs) and an N50 of 96,182 base pairs. While acceptable for a “draft genome”, this assembly is not considered complete, as most bacterial genomes will have fewer than 5 replicons (i.e., contigs) representing self-replicating circular or linear chromosomes (usually 1) and associated plasmids (between 0 and 4). Thus, we attempted to obtain a more complete assembly of the B11 genome using a closely related reference genome whose assembly was considered complete. We chose E. mobile strain F1926, as this genome was most similar to strain B11 based on our ANI analysis and its genome is fully assembled into 1 chromosome and 4 plasmids. The SPAdes assembler was used in “reference-guided” mode to allow the contigs of B11 to be aligned to the complete sequence of F1926 and ordered relative to this reference. In principle, this should allow the assembler to place contigs in the proper order and produce an assembly with fewer contigs overall and a higher N50. Surprisingly, the opposite result was obtained. Whereas the initial assembly of B11 had 92 contigs, the reference-guided assembly produced 371 contigs. The reference-guided B11 assembly was also 1.27X longer (6.1 million base pairs) than expected based on alignments to strain F1926 (4.8 million base pairs). Given this difference in size and the lower overall quality of the reference-guided assembly, the initial SPAdes B11 assembly was used for all subsequent analyses.
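For the ANI comparisons described earlier in this section, the alignment fraction and the 95% species cutoff can be derived from FastANI's tab-delimited output, which lists the query genome, reference genome, ANI value, number of matched fragment mappings, and total query fragments on each line. The sketch below is illustrative only: the genome file names and numeric values are hypothetical, and the function name is not from the report.

```python
import csv
import io

SPECIES_ANI_CUTOFF = 95.0  # percent ANI cutoff cited from Jain et al. (2018)


def parse_fastani(handle):
    """Parse FastANI tabular output and add alignment fraction and a species call."""
    results = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, reference, ani, matched, total = row[:5]
        ani = float(ani)
        results.append({
            "query": query,
            "reference": reference,
            "ani": ani,
            "af": int(matched) / int(total),  # matched / total mappings
            "same_species": ani >= SPECIES_ANI_CUTOFF,
        })
    return results


# Hypothetical FastANI output lines (not results from this project).
example_output = io.StringIO(
    "genome_a.fna\tgenome_b.fna\t98.5\t1400\t1500\n"
    "genome_a.fna\tgenome_c.fna\t82.0\t300\t1500\n"
)

for hit in parse_fastani(example_output):
    print(hit["reference"], hit["ani"], round(hit["af"], 3), hit["same_species"])
```

A table of such records for all genome pairs is the kind of input from which the ANI distance matrices and heatmaps were built in R.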

Genome Mining for Secondary Metabolite Production Genes

Genes encoded in the genomes of the five probiotic isolates examined as part of this project were annotated against the antiSMASH database (Blin et al., 2019) to predict whether any have a potential role in the production of known secondary metabolites. Secondary metabolites are chemicals naturally produced by microbes and are an important source of new antibiotics and other drugs (Newman and Cragg, 2016). All genomes were found to encode different genes potentially involved in secondary metabolite production.

The genome of E. mobile strain B11 was further analyzed since it was the only genome of the five examined to encode genes involved in antibiotic production beyond bacteriocins, which were detected in all genomes. In particular, we focused on one locus of the genome that was predicted by antiSMASH to encode genes involved in antibiotics produced by polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS). Throughout the remainder of this report, this locus and the genes encoded within it will be referred to as “Region 3.1”.

Initial exploration of the gene annotations within Region 3.1 of the E. mobile strain B11 genome suggested that this region may be a genomic island within the larger genome and may carry a set of genes with related functions involved in secondary metabolite production and defense. Genomic islands are contiguous regions of DNA that can be transferred horizontally via transduction, transformation, or conjugation and that can code for many functions often involved in symbiosis or pathogenesis. A more detailed bioinformatic analysis of the genes in this region was therefore carried out to better define the functions and conservation of the Region 3.1 genes.

Given the high ANI similarity of E. mobile strain B11 to genomes of other E. mobile strains, we investigated whether the Region 3.1 locus was conserved in its entirety and in a syntenic manner within these related genomes, under the assumption that a pattern of high conservation of gene content and order (i.e., gene synteny) amongst these genomes would indicate that the functions encoded within this region are highly conserved and crucial to the fitness of members of this species in its natural environment. Initial comparison of the E. mobile strain B11 Region 3.1 genes against all other known Epibacterium genomes was performed using BLAST to find homologs in each corresponding genome. BLAST results were evaluated for conservation of the entire content and order of the B11 Region 3.1 versus each query genome through visualization of pairwise DNA sequence comparison plots created with the ‘genoPlotR’ library of R. Additionally, an ANI comparison of the Region 3.1 locus and the corresponding matching loci from each of the E. mobile strains was performed as described above for whole genomes.

Automated annotations of genes within Region 3.1 of the E. mobile strain B11 genome were manually curated to better define the potential functions encoded. To identify distant homologs of each of the genes in this region and increase the probability of matching known functions involved in secondary metabolite production and defense, we first identified all significant matches to each gene of Region 3.1 of strain B11 using the BLAST program. The sequences of all significant hits were aligned with the Clustal Omega program (Sievers and Higgins, 2018), and Hidden Markov Models (HMMs) were built for each respective homologous gene cluster. HMMs are sequence profiles that highlight the conserved sequence patterns within sets of closely related homologs, thus amplifying the important conserved signals within the larger group of sequences. Given this property, we used each HMM and the HMMER program (Johnson et al., 2010) to search the Swiss-Prot database (Bairoch and Apweiler, 1997) of non-redundant protein sequences for significant matches to well-annotated proteins. All search results were sorted for significance, and top hits were matched to each query and summarized in an output table.

Liquid Chromatography-High Resolution Mass Spectrometry (LC-HRMS) of B11 Metabolites

Extraction of secondary metabolites from cells on agar plugs of E. mobile strain B11 followed a slightly modified micro-scale extraction method (Smedsgaard, 1997). E. mobile strain B11 was grown on Zobell agar plates to reach two separate phenotypic states discriminated by colony color. Yellow colonies were collected from plates incubated for 24 hours, while brown colonies were collected from plates incubated for 48 hours (Figure 5). The pigmentation change of the colonies likely relates to the expression of respiratory enzymes in response to active quorum sensing and to developmental change related to the growth phase of this bacterium. Since quorum sensing regulates a wide range of phenotypic traits, including secondary metabolite production, we hypothesized that if secondary metabolites are produced by strain B11, they would be differentially produced at the growth phases that correspond to the visible color change of the colonies.

For each colored colony type, replicate agar plugs (~1.3 cm²) bearing individual bacterial colonies were cut with a sterile pipette tip and transferred to a 2 mL vial, and 1 mL of 80% MeCN + 0.1% formic acid was added. In addition to E. mobile strain B11, the TDA-producing bacterium Phaeobacter inhibens strain T5 was grown and used as a positive control for all subsequent metabolite analyses (Dogs et al., 2013). Metabolites were extracted from the agar plugs in an ultrasonic bath for 30 min. Afterward, the supernatant was removed, and the extraction from the agar plugs was repeated. The supernatants were combined, centrifuged for 5 min at 15,000 x g, and diluted to the required concentration (0.01 mg/mL, with injections of 1-5 µL) for LC-HRMS analysis, which can separate the individual metabolites in a mixture and measure the mass-to-charge ratio of each. This allows for subsequent identification of specific compounds by matching mass/charge values against values of known compounds.

LC-HRMS analysis of sample extracts was carried out with an Agilent 1290 Infinity II series UPLC coupled to an Agilent 6546 QTOF mass spectrometer with an electrospray ionization (ESI) source. Chromatography was performed using a Kinetex® C18 column (50 × 2.1 mm, 2.6 μm particle, Phenomenex); the oven temperature was set to 40 °C and the sample injection volume was 5 μL. A binary gradient consisting of MeCN (eluent A) and water (eluent B) (both + 0.1% formic acid) at a constant flow rate of 500 μL/min was used. The gradient was applied as follows: 0.00 min, 10% A; 0.25 min, 10% A; 10.00 min, 95% A; 13.00 min, 95% A; 13.10 min, 10% A; 15.00 min, 10% A. Extracts were analyzed on the MS in positive and negative mode. MassHunter software (Agilent Technologies) was used for data acquisition and subsequent qualitative analysis. Mass spectra of the two variant colony types of E. mobile strain B11 (yellow and brown) and of P. inhibens strain T5 were compared for the detection of MS peaks corresponding to the known mass/charge value for the antibiotic tropodithietic acid (TDA).
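The gradient program listed above can be written as a small lookup, which makes it easy to check the mobile-phase composition at any retention time. This sketch assumes linear ramps between the listed time points, which is the usual behavior of a binary pump but is not stated in the report; the names are hypothetical.

```python
# (time in min, % eluent A = MeCN) as listed in the text.
GRADIENT = [
    (0.00, 10.0),
    (0.25, 10.0),
    (10.00, 95.0),
    (13.00, 95.0),
    (13.10, 10.0),
    (15.00, 10.0),
]


def percent_a(time_min, gradient=GRADIENT):
    """Percent eluent A at a given time, assuming linear ramps between points."""
    if not gradient[0][0] <= time_min <= gradient[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, a0), (t1, a1) in zip(gradient, gradient[1:]):
        if t0 <= time_min <= t1:
            if t1 == t0:
                return a1
            return a0 + (a1 - a0) * (time_min - t0) / (t1 - t0)


def percent_b(time_min, gradient=GRADIENT):
    """Percent eluent B (water) is the remainder of the binary mixture."""
    return 100.0 - percent_a(time_min, gradient)


for t in (0.0, 5.125, 12.0, 14.0):
    print(f"{t:6.3f} min: {percent_a(t):5.1f}% A, {percent_b(t):5.1f}% B")
```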

SIGNIFICANT ACCOMPLISHMENTS TO DATE:

Defined co-culture conditions (relative and absolute quantities of cells)

Tracking the relative abundance of each species in co-cultures of the probiotic and pathogen led to the designation of 7 hours as the sampling time point for the transcriptomics study. Additionally, RNA extraction trials revealed the minimum number of cells necessary in a co-culture for a high RNA yield that will support deep and even sequencing. Both of these findings will be combined to set up the main co-culture experiment, stocking both bacteria at a final concentration of 1E07 cells per culture replicate, with twice as many probiotic DM14 cells as pathogenic RE22 cells. Each culture will be filtered and prepared for RNA extraction after a 7-hour co-culture time course.

Assignment of Probiotic Strain B11 to the E. mobile species

BLAST analysis of the PCR product of the 16S rRNA gene indicated high similarity to other 16S genes of organisms assigned to the E. mobile species group. Further confirmation of this taxonomic designation was found when ANI comparisons were performed between all publicly available genome sequences of the Epibacterium genus. All strain genomes assigned to E. mobile clustered together at >95% ANI, which is the accepted cutoff for species designations. Within this species cluster, several smaller strain groupings were evident, with E. mobile strain B11 clustering in the largest group (top-right cluster of Figure 6). All of the E. mobile strains clustered separately from all other strains assigned to the Epibacterium genus, which were all less than 80% ANI in comparison to the E. mobile genomes. These results confirm the assignment of strain B11 to the E. mobile species. These results also confirm the relatively close relationship of E. mobile strain B11 to other known antibiotic-producing bacteria within this genus, such as E. ulvae strain U95 (Breider et al., 2019).

Genome Mining of Probiotics

Homology searches for genes involved in secondary metabolite production with the antiSMASH server revealed that the genomes of all five strains encoded bacteriocin genes. Bacteriocins are small antimicrobial peptides that are produced by diverse types of bacteria and that can self-assemble in the extracellular milieu and produce pores in the membranes of neighboring bacterial competitors. Beyond this, though, the E. mobile strain B11 genome was the only genome to encode several additional genes putatively involved in secondary metabolite production. Namely, genes within the “Region 3.1” locus of the genome matched genes that may encode a polyketide synthase (PKS) and a non-ribosomal peptide synthetase (NRPS) (Fig. 7). PKS and NRPS gene products are well known as the primary synthetic machines that bacteria use to produce antimicrobial and bactericidal secondary metabolites (Moghaddam et al., 2021). Thus, the genes encoded within the Region 3.1 locus were further investigated to find additional evidence supporting their role in the observed probiotic activity of E. mobile strain B11.

Homology analysis comparing the Region 3.1 locus of E. mobile strain B11 with other Epibacterium genomes revealed that the entire ~60 kilobase locus was highly conserved amongst all isolate genomes within the E. mobile species, but is effectively missing in related genomes that fall outside the E. mobile species group but within the Epibacterium genus (e.g., E. ulvae; Fig. 8). This finding was reinforced when examining the gene conservation and order of this region across genomes that are phylogenetically related to E. mobile strain B11 (Fig. 9). Here we found not only that all genomes within the E. mobile species encoded DNA sequences similar to Region 3.1, but also that almost all genes were highly conserved based on ANI and that the gene order (i.e., synteny) of this locus was highly retained relative to the B11 genome reference. That is, few, if any, gene substitutions or rearrangements were detected between homologous Region 3.1 loci of the E. mobile species group. Fig. 10A shows high levels of synteny, with almost all genes of Region 3.1 being conserved in presence and order between E. mobile strain B11 and E. mobile strain F1926. In contrast, the comparison between E. mobile strain B11 and P. inhibens strain T5 shows almost no conservation, with only one locus matching between the genomes (Fig. 10B). Conservation of gene sequences and of synteny is a hallmark feature of genes that encode functionally related products and that are key traits involved in the survival of a given organism. Thus, these results strongly suggest that the genes in this region are critical to the fitness of this species in natural environments, where it competes with other bacteria in mixed communities.

Manual annotation of the genes encoded in Region 3.1 of E. mobile strain B11 provides additional insight into the potential functions expressed by organisms carrying this locus and into the potential for physical transfer and recombination of the locus between organisms. One gene that stood out in this analysis was annotated as matching a “Maximins-S” antimicrobial peptide. This particular peptide is expressed in the giant fire-bellied toad, Bombina maxima, and has activity against mycoplasma bacteria, but no activity against common Gram-positive and Gram-negative bacteria or fungi. We also found genes putatively involved in the biosynthesis of 12- and 14-membered ring macrolactone antibiotics and of the antibiotic gramicidin. Interestingly, we also discovered genes that may be involved in resistance to these self-produced antibiotics. These include genes that may alter the charge of the outer membrane to prevent ionic interactions with charged antibiotics, as well as enzymes that alter the ribosome structure in a way that may prevent inactivation by antibiotics. The last notable feature of this locus is a set of genes that might be involved in the transfer and recombination of this locus between genomes. These genes occur at one end of the locus and encode a transposase, a recombinase, and two DNA replication enzymes.
The presence of these genes, combined with the strict inter-species conservation pattern described above, hints at the possibility that the entire locus can be mobilized between organisms and that it was horizontally transferred into the Epibacterium lineage at the point of speciation of E. mobile from other species within this genus. Follow-up research will be needed to confirm these hypotheses, but these preliminary results suggest that this region has a novel functional and evolutionary role in the survival and fitness of E. mobile populations.

TDA and Putative Secondary Metabolite Identification by MS and Bioinformatics

Based on the results of our initial bioinformatic analyses of the E. mobile strain B11 genome, which indicated the potential production of secondary metabolites, we sent samples of B11 metabolites to Sandra Loesgen’s laboratory at the University of Florida for extraction, detection, and preliminary identification with mass spectrometry (MS). The results of targeted MS analyses unambiguously confirmed the production of the known antibiotic tropodithietic acid (TDA) by only the brown-pigmented cells of E. mobile strain B11, on the basis of the observed isotopic pattern and the accurate mass/charge of the targeted ion peak (Fig. 10). No TDA production was detected in the yellow colonies of B11, although this may be due to a small sample size. TDA production was confirmed by comparison against the spectra from a known TDA-producing bacterium, P. inhibens strain T5 (Brinkhoff et al., 2004). Metabolite samples from brown-pigmented cells showed the same retention time for TDA that was found in T5 samples, as well as a diagnostic isotopic pattern of TDA (Fig. 10).

To verify that the mass spectrometry results agree with the genome information, we bioinformatically confirmed the presence of TDA synthesis genes in E. mobile strain B11. We created a BLAST database using 22 genes known to be involved in TDA production, then used the BLASTN program to compare the whole genome of E. mobile strain B11 to these TDA genes (Geng et al., 2008). After sorting for quality to eliminate weak matches, matches for 20 of the 22 genes involved in TDA synthesis were found in the E. mobile strain B11 genome. Notably, none were encoded within or near Region 3.1. This confirmed that B11 has the genes to synthesize TDA and that these genes are not found in the Region 3.1 locus (Fig. 11).

Additional metabolite features were discovered in the untargeted LC-HRMS spectra of the metabolites from brown-pigmented E. mobile strain B11 cells, which we are conditionally targeting for further analysis on the basis of hypotheses generated from the bioinformatic results described above. Among these are two possible peptides that appear to be produced and possibly secreted into the extracellular space around B11 cells. Production of extracellular peptides agrees with our hypothesis from the bioinformatic analyses that B11 can produce and secrete antimicrobial peptides as a defense mechanism.
The initial spectra of these peptides ("peptide 1007" and "peptide 778") provide some insight into their amino acid composition (Fig. 12), which can be further investigated by prediction and translation of gene sequences in the E. mobile genome.

Another notable feature detected in this analysis appears to be a fatty acid-related molecule that matches the chemical formula of 1-hexadecanosulfonyl-O-L-serine (Fig. 13). This molecule is unusual in bacteria, but it is known to function as a ligand of the outer membrane phospholipase A (OMPLA) in E. coli (Snijder et al., 2001). The functional role of OMPLA in bacteria in general is poorly understood, but in E. coli this enzyme is involved in the secretion of bacteriocins.

Taken together, we believe that these MS and bioinformatic results support the hypothesis that E. mobile strain B11 produces an antimicrobial peptide (AMP) under specific environmental conditions, and that this AMP may have inhibitory activity against competing bacteria, such as oyster pathogens, in natural environments. We are currently devising additional experiments, including the transcriptomic experiments described above, to continue testing this hypothesis and to explore the production of secondary metabolites in E. mobile strain B11, in order to better define the mechanisms behind the observed probiotic activity of this organism.
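The BLASTN screening step described in this section (filtering out weak matches, then counting how many reference TDA genes have at least one remaining match) can be sketched as follows. This is a minimal illustration, not the pipeline used in this project. It assumes BLASTN results in the standard 12-column tabular format (-outfmt 6), with genome contigs as queries and the reference genes as subjects. The identity, alignment length, and E-value cutoffs are hypothetical placeholders, and the contig and gene names in the example rows are invented.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def genes_with_strong_hits(tabular_text, min_pident=70.0, min_length=100,
                           max_evalue=1e-10):
    """Return the set of subject (gene) IDs with at least one hit passing all cutoffs.

    The default cutoffs are hypothetical, not the values used in the report.
    """
    found = set()
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        if (float(hit["pident"]) >= min_pident
                and int(hit["length"]) >= min_length
                and float(hit["evalue"]) <= max_evalue):
            found.add(hit["sseqid"])
    return found


# Invented example rows: two strong matches and one weak match.
example = (
    "contig_12\tgene_A\t92.5\t850\t60\t2\t1000\t1849\t1\t850\t0.0\t1200\n"
    "contig_12\tgene_B\t88.1\t640\t70\t3\t2000\t2639\t1\t640\t1e-150\t700\n"
    "contig_40\tgene_C\t65.0\t45\t15\t1\t10\t54\t5\t49\t0.5\t30\n"
)
strong = genes_with_strong_hits(example)
print(f"{len(strong)} reference genes matched: {sorted(strong)}")
```

Collecting subject IDs in a set means a gene is counted once even if several contigs match it, which is the quantity reported above (genes matched out of the total in the database).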

FIGURES AND TABLES

Figure 1. Gel electrophoresis of the PCR-amplified 16S rRNA gene of 5 probiotic strains. The band for B11 was weaker than the other four, but NanoDrop DNA quantification revealed that there was enough DNA to sequence (data not shown). Image is cropped to show samples adjacent to each other.

Figure 2. Lines of best fit for OD600 and CFU/ml of probiotic DM14 (left) and pathogen RE22 (right). Counts are from culture replicates. Lines represent linear regressions of the data, and the shaded area is the 95% confidence interval of each regression.
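As a minimal sketch of the kind of calibration shown in Figure 2, the following fits an ordinary least-squares line relating OD600 to CFU/ml and uses it to estimate cell density from a new optical density reading. The data values are invented for illustration and are not measurements from this project. The sketch also assumes untransformed values on both axes and does not compute the confidence band.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented calibration points (OD600 reading, plate count in CFU/ml).
od600 = [0.1, 0.2, 0.4, 0.8]
cfu_per_ml = [1.0e7, 2.0e7, 4.0e7, 8.0e7]

slope, intercept = fit_line(od600, cfu_per_ml)


def predict_cfu(od):
    """Estimate CFU/ml from an OD600 reading using the fitted line."""
    return slope * od + intercept


print(f"Estimated CFU/ml at OD600 = 0.5: {predict_cfu(0.5):.3g}")
```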

Figure 3. Ratio of probiotic DM14 to pathogen RE22 over a 10-hour co-culture. The co-culture is initially stocked with 2x more probiotic than pathogen. Declines in probiotic counts occur around t=4 h, when the ratio falls below 1. Recovery of the probiotic relative to pathogen counts begins around t=6 h. The shaded region represents a 95% confidence interval around the average trendline (blue) of 3 culture replicates. A dashed horizontal line has been added at y=1 to indicate a ratio of 1:1.

Figure 4. Gel electrophoresis to determine the cell quantity needed for sufficient RNA extraction. Samples were run in duplicate. A ladder in the leftmost lane shows sample band size. Strong bands in the first four sample lanes (from cell cultures with 1E08 and 1E07 cells total) indicate sufficient amounts of RNA, while the weaker bands of RNA extracted from 1E06 total cells indicate amounts below those needed for transcriptome sequencing. Samples containing 1E05 cells did not yield enough RNA to see on the gel.

Figure 5. Pigment change in colonies possibly associated with quorum sensing and development. Brown colonies are seen in the dense growth of the T-streak (upper left and right), and yellow colonies sparsely populate the less dense bottom left section of the plate. Agar plugs with colonies of each phenotype were analyzed separately by high-resolution mass spectrometry.

Figure 6. Hierarchical Clustering of Whole Genome Sequences Based on ANI Relationships. Genomes that are more closely related are represented by more intense yellow colors at the corresponding intersecting cells of the clustering matrix. The clustering matrix represents an identity matrix with the genomes ordered in the same manner along both axes. The diagonal of the matrix represents self:self comparisons, while off-diagonal cells represent pairwise comparisons between genomes. All E. mobile genomes including strain B11 are highly related with >95% ANI.

Figure 7. antiSMASH Genome Analysis Results. The genome sequence of E. mobile strain B11 was determined to have nine separate loci with significant matches to known genes involved in secondary metabolite production or regulation.

Figure 8. Hierarchical Clustering of Region 3.1 Homologous Loci Based on ANI Relationships. Loci that are more closely related are represented by more intense yellow colors at the corresponding intersecting cells of the clustering matrix. The clustering matrix represents an identity matrix with the genomes ordered in the same manner along both axes. The diagonal of the matrix represents self:self comparisons, while off-diagonal cells represent pairwise comparisons between genomes. Note that the E. ulvae genome, which is outside the E. mobile species group, is the only genome that lacks a matching locus with high ANI to all other genomes.

Figure 9. Genome Comparison of Homologous Loci to E. mobile strain B11 Region 3.1. (A) Comparison of E. mobile strain B11 Region 3.1 (top) against a homologous and highly conserved locus found in E. mobile strain F1926 (bottom). Red bars represent highly conserved regions between the two genomes. Gene boundaries in each locus are designated by blue arrows. Highly conserved gene content and synteny are represented by straight vertical red bars, indicating both the presence of homologous genes in each strain and the conserved order of all genes. This plot is representative of all genome comparisons made with isolates from within the E. mobile species group. (B) Comparison plot of E. mobile strain B11 Region 3.1 (top) against a poorly conserved locus found in P. inhibens strain T5 (bottom). Only one significant conserved match was detected between the two loci. It is represented by the twisted blue bar, which indicates a potential locus inversion in strain T5 relative to the reference strain B11.

Figure 10. Mass Spectra and Structure of the Antibiotic Tropodithietic Acid (TDA). The chemical structure of TDA is shown in the top left of the figure. An ion peak corresponding to TDA was detected in metabolite extracts of P. inhibens strain T5 at 2.745 minutes (top right chromatogram). A similar peak was detected in metabolite extracts from brown-pigmented E. mobile strain B11 cells at 2.702 minutes (second chromatogram). No corresponding TDA peak was detected from extracts of yellow-pigmented E. mobile strain B11 cells. The bottom MS spectrum shows the characteristic isotopic pattern for TDA (spectrum for P. inhibens strain T5 shown; brown-pigmented E. mobile strain B11 showed a similar, but less intense, pattern).

Figure 11. TDA Biosynthetic Genes in the E. mobile strain B11 Genome. Genes with significant matches to known TDA biosynthetic genes are found in a single operon and contig in the B11 genome.

Figure 12. MS Spectra of Putative Peptide Compounds Produced by Brown-pigmented E. mobile strain B11. Peptides "1007" and "778" were identified from LC-HRMS analysis of metabolite extracts from E. mobile strain B11 cultures. Several potential diagnostic amino acid subunits are detected in the respective MS/MS spectra of each peptide.

Figure 13. MS Spectra of Unknown Compounds. Three non-polar peaks were detected in the MS analysis of B11 metabolites (7.201, 7.609, and 8.018 minutes; top graph). These peaks all appear to be derivatives or variants of the same molecule. Each of these ions was predicted to have a similar chemical formula, of which C19H39NO5S might correspond to the known compound 1-hexadecanosulfonyl-O-L-serine.
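Matching an accurate mass to a candidate chemical formula, as done for the features in Figure 13, relies on the monoisotopic mass of that formula. The sketch below computes it for C19H39NO5S from standard monoisotopic atomic masses. It is an illustration only: the formula parser handles simple formulas without parentheses, and the protonated ion [M+H]+ is shown as an assumed example because the ionization mode is not stated here.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}
PROTON_MASS = 1.00727647  # Da, used for the assumed [M+H]+ ion


def parse_formula(formula):
    """Parse a simple formula such as 'C19H39NO5S' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic atomic masses for every atom in the formula."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in parse_formula(formula).items())


neutral = monoisotopic_mass("C19H39NO5S")
protonated = neutral + PROTON_MASS  # assumption: positive-mode [M+H]+
print(f"neutral mass: {neutral:.4f} Da, [M+H]+ m/z: {protonated:.4f}")
```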

Table 1. Taxonomy of Probiotic Strains Based on 16S rRNA Gene Sequences.

Probiotic   Taxonomy¹
B1          Vibrio anguillarum
B11         Epibacterium mobile
DM14        Pseudoalteromonas sp.
D16         Pseudoalteromonas sp.
D19         Pseudoalteromonas sp.

¹ Assignment is to genus or species level based on confidence of best matches.

Table 2. Genome assembly statistics of probiotic genomes.

Probiotic  Contigs  Total genome  N50     Average contig  Median       Maximum      Minimum
strain     (#)      length (Mbp)  (bp)    length (bp)     length (bp)  length (bp)  length (bp)
B1         180      3.86          63,800  21,400          6,400        187,000      500
B11        92       4.72          96,200  51,300          26,200       414,500      600
DM14       148      4.05          51,400  27,400          17,000       171,000      600
D16        266      3.89          26,000  14,500          9,400        95,000       500
D19        225      4.42          42,400  19,600          10,400       221,000      500
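For readers unfamiliar with the N50 statistic reported in Table 2, the following sketch shows one common way to compute it: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly length. The contig lengths in the example are invented and are unrelated to the assemblies above.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length


# Invented contig lengths (bp); total = 300, so the halfway point is 150.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print("N50:", n50(example_lengths))
```

A higher N50 with fewer contigs, as seen for B11 in Table 2, generally indicates a more contiguous assembly.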

Table 3. Gene content of the five probiotic genomes.

Probiotic  Total genes  Known Functional  Unknown Functional
                        Annotations       Annotations¹
B1         3477         2482              995
B11        4571         2728              1843
DM14       3698         2388              1310
D16        3522         2346              1176
D19        3969         2583              1386

¹ Annotated as hypothetical proteins.
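The gene counts in the table above can be checked for internal consistency and summarized as the fraction of genes without a known function. The short sketch below uses the values exactly as tabulated; the variable names are illustrative.

```python
# (total genes, known functional annotations, unknown functional annotations)
# copied from the gene content table above.
GENE_CONTENT = {
    "B1": (3477, 2482, 995),
    "B11": (4571, 2728, 1843),
    "DM14": (3698, 2388, 1310),
    "D16": (3522, 2346, 1176),
    "D19": (3969, 2583, 1386),
}

unknown_fraction = {}
for strain, (total, known, unknown) in GENE_CONTENT.items():
    # Known and unknown annotations should add up to the total gene count.
    if known + unknown != total:
        raise ValueError(f"inconsistent counts for {strain}")
    unknown_fraction[strain] = unknown / total

for strain, fraction in unknown_fraction.items():
    print(f"{strain}: {fraction:.1%} of genes annotated as hypothetical proteins")
```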

REFERENCES

Bairoch, A., and Apweiler, R. (1997). The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 25, 31–36.
Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., et al. (2012). SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J. Comput. Biol. 19, 455–477.
Blin, K., Shaw, S., Steinke, K., Villebro, R., Ziemert, N., Lee, S.Y., Medema, M.H., and Weber, T. (2019). antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucleic Acids Res. 47, W81–W87.
Breider, S., Sehar, S., Berger, M., Thomas, T., Brinkhoff, T., and Egan, S. (2019). Genome sequence of Epibacterium ulvae strain DSM 24752T, an indigoidine-producing, macroalga-associated member of the marine Roseobacter group. Environ. Microbiome 14, 4.
Brinkhoff, T., Bach, G., Heidorn, T., Liang, L., Schlingloff, A., and Simon, M. (2004). Antibiotic Production by a Roseobacter Clade-Affiliated Species from the German Wadden Sea and Its Antagonistic Effects on Indigenous Isolates. Appl. Environ. Microbiol. 70, 2560–2565.
Dogs, M., Voget, S., Teshima, H., Petersen, J., Davenport, K., Dalingault, H., Chen, A., Pati, A., Ivanova, N., Goodwin, L.A., et al. (2013). Genome sequence of Phaeobacter inhibens type strain (T5T), a secondary metabolite producing representative of the marine Roseobacter clade, and emendation of the species description of Phaeobacter inhibens. Stand. Genomic Sci. 9, 334–350.
Geng, H., Bruhn, J.B., Nielsen, K.F., Gram, L., and Belas, R. (2008). Genetic Dissection of Tropodithietic Acid Biosynthesis by Marine Roseobacters. Appl. Environ. Microbiol. 74, 1535–1545.
Jain, C., Rodriguez-R, L.M., Phillippy, A.M., Konstantinidis, K.T., and Aluru, S. (2018). High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat. Commun. 9, 5114.
Johnson, L.S., Eddy, S.R., and Portugaly, E. (2010). Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics 11, 431.
Moghaddam, J.A., Jautzus, T., Alanjary, M., and Beemelmanns, C. (2021). Recent highlights of biosynthetic studies on marine natural products. Org. Biomol. Chem. 19, 123–140.
Newman, D.J., and Cragg, G.M. (2016). Natural Products as Sources of New Drugs from 1981 to 2014. J. Nat. Prod. 79, 629–661.
R Core Team (2018). R: A Language and Environment for Statistical Computing (Vienna, Austria: R Foundation for Statistical Computing).
Richter, M., and Rosselló-Móra, R. (2009). Shifting the genomic gold standard for the prokaryotic species definition. Proc. Natl. Acad. Sci. 106, 19126–19131.
Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069.
Sievers, F., and Higgins, D.G. (2018). Clustal Omega for making accurate alignments of many protein sequences. Protein Sci. 27, 135–145.
Smedsgaard, J. (1997). Micro-scale extraction procedure for standardized screening of fungal metabolite production in cultures. J. Chromatogr. A 760, 264–270.
Snijder, H.J., Kingma, R.L., Kalk, K.H., Dekker, N., Egmond, M.R., and Dijkstra, B.W. (2001). Structural investigations of calcium binding and its role in activity and activation of outer membrane phospholipase A from Escherichia coli. J. Mol. Biol. 309, 477–489.
Varghese, N.J., Mukherjee, S., Ivanova, N., Konstantinidis, K.T., Mavrommatis, K., Kyrpides, N.C., and Pati, A. (2015). Microbial species delineation using whole genome sequences. Nucleic Acids Res. 43, 6761–6771.
Zhao, W., Yuan, T., Piva, C., Spinard, E.J., Schuttert, C.W., Rowley, D.C., and Nelson, D.R. (2019). The Probiotic Bacterium Phaeobacter inhibens Downregulates Virulence Factor Transcription in the Shellfish Pathogen Vibrio coralliilyticus by N-Acyl Homoserine Lactone Production. Appl. Environ. Microbiol. 85.

ADDITIONAL FUNDING RECEIVED DURING PROJECT TERM: None

FUTURE FUNDING POSSIBILITIES: NOAA Saltonstall-Kennedy, Oregon Sea Grant Program Development Grants, Oregon Sea Grant SEED-LEAF Competition, USDA NIFA Aquaculture calls, OSU Microbiology Department John Fryer Fellowship
